
sthalles/SimCLR

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

2,214 stars · 458 forks

Top Related Projects

  • MoCo (4,714 stars) - PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
  • simclr (4,047 stars) - SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners
  • Lightly (2,918 stars) - A python library for self-supervised learning on images.
  • CMC (1,298 stars) - [arXiv 2019] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis
  • pytorch-metric-learning - The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

Quick Overview

SimCLR is a PyTorch implementation of the paper "A Simple Framework for Contrastive Learning of Visual Representations" by Chen et al. It provides a self-supervised learning approach for training visual representations without labeled data, using contrastive learning techniques.
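
Under the hood, SimCLR creates two augmented views of each image in a batch and trains the encoder so the two views of the same image agree, while every other image in the batch serves as a negative. The objective is the NT-Xent (normalized temperature-scaled cross-entropy) loss; the following is a minimal, self-contained sketch of it, illustrative code rather than this repository's exact implementation:

import torch
import torch.nn.functional as F

def nt_xent(z_i, z_j, temperature=0.5):
    """z_i, z_j: [N, D] projections of two augmented views of the same N images."""
    z = F.normalize(torch.cat([z_i, z_j], dim=0), dim=1)  # [2N, D] unit vectors
    sim = z @ z.t() / temperature                         # pairwise cosine similarities
    sim.fill_diagonal_(float('-inf'))                     # an example is never its own negative
    n = z_i.size(0)
    # The positive for row k is its other augmented view, n positions away.
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)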

Pros

  • Implements a state-of-the-art self-supervised learning method for computer vision tasks
  • Provides a flexible and modular codebase for easy experimentation and customization
  • Includes pre-trained models and evaluation scripts for quick testing and benchmarking
  • Supports distributed training for faster experimentation on multiple GPUs

Cons

  • Requires significant computational resources for training on large datasets
  • Limited documentation and examples for advanced usage scenarios
  • May require deep understanding of contrastive learning concepts for optimal use
  • Dependency on specific versions of PyTorch and other libraries may cause compatibility issues

Code Examples

  1. Loading a pre-trained SimCLR model (this assumes a Lightning-style checkpoint API; treat the import path as illustrative):
from simclr import SimCLR

model = SimCLR.load_from_checkpoint('path/to/checkpoint.ckpt')
model.eval()
  2. Extracting features from an image:
import torch
from torchvision import transforms
from PIL import Image

transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
])

image = Image.open('path/to/image.jpg')
input_tensor = transform(image).unsqueeze(0)

with torch.no_grad():
    features = model.encoder(input_tensor)
  3. Training a SimCLR model (the constructor and Trainer arguments are illustrative; see run.py for this repository's actual entry point):
from simclr import SimCLR
from pytorch_lightning import Trainer

model = SimCLR(num_samples=50000, batch_size=256, num_epochs=100)
trainer = Trainer(gpus=1, max_epochs=100)
trainer.fit(model)
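
  4. Creating the two augmented views that SimCLR contrasts. This sketch follows the paper's default augmentations; the crop size, jitter strengths, and blur kernel are assumptions, not necessarily this repository's values. It reuses the image from example 2:

from torchvision import transforms

# Paper-style pipeline: random crop, flip, color distortion, grayscale, blur.
augment = transforms.Compose([
    transforms.RandomResizedCrop(96),
    transforms.RandomHorizontalFlip(),
    transforms.RandomApply([transforms.ColorJitter(0.8, 0.8, 0.8, 0.2)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.GaussianBlur(kernel_size=9),
    transforms.ToTensor(),
])

# Two independently augmented views of one image form a positive pair.
x_i, x_j = augment(image), augment(image)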

Getting Started

To get started with SimCLR:

  1. Clone the repository:

    git clone https://github.com/sthalles/SimCLR.git
    cd SimCLR
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Run the training script:

    python run.py -data ./datasets --dataset-name stl10 --epochs 100
    
  4. Evaluate the learned features with the linear evaluation protocol. The
     repository provides a Colab notebook for this (see Feature Evaluation
     below) rather than a standalone eval.py script.
    

Competitor Comparisons

MoCo (4,714 stars)

PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722

Pros of MoCo

  • More memory-efficient, allowing for larger batch sizes
  • Supports a wider range of architectures and tasks
  • Better performance on some downstream tasks, especially object detection

Cons of MoCo

  • Slightly more complex implementation due to the momentum encoder
  • May require more careful hyperparameter tuning
  • Potentially slower training due to the queue mechanism

Code Comparison

MoCo:

# Momentum update
self._momentum_update_key_encoder()

# Compute key features
with torch.no_grad():
    k = self.encoder_k(im_k)
    k = nn.functional.normalize(k, dim=1)

SimCLR:

# Compute features for both augmented views
z_i = self.encoder(x_i)
z_j = self.encoder(x_j)

# Normalize embeddings
z_i = F.normalize(z_i, dim=1)
z_j = F.normalize(z_j, dim=1)

Both MoCo and SimCLR are self-supervised learning frameworks for visual representation learning. MoCo uses a momentum encoder and a queue of negative samples, while SimCLR relies on large batch sizes and in-batch negatives. The code snippets highlight the key differences in their approaches to feature computation and normalization.
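
For reference, the momentum update that distinguishes MoCo is an exponential moving average of the query encoder's weights onto the key encoder. A minimal sketch (hypothetical helper, not MoCo's exact code):

import torch

@torch.no_grad()
def momentum_update(encoder_q, encoder_k, m=0.999):
    # The key encoder slowly tracks the query encoder instead of receiving gradients.
    for p_q, p_k in zip(encoder_q.parameters(), encoder_k.parameters()):
        p_k.data.mul_(m).add_(p_q.data, alpha=1.0 - m)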

simclr (4,047 stars)

SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners

Pros of simclr

  • Official implementation by Google Research, likely more aligned with the original paper
  • Supports distributed training across multiple GPUs and TPUs
  • More comprehensive documentation and examples

Cons of simclr

  • More complex codebase, potentially harder to understand and modify
  • Requires TensorFlow 2.x, which may not be preferred by all users
  • Larger repository size with additional dependencies

Code Comparison

SimCLR:

class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder(num_classes=projection_dim)

simclr:

class SimCLRModel(tf.keras.Model):
  def __init__(self, num_classes, width_multiplier, resnet_depth):
    super(SimCLRModel, self).__init__()
    self.resnet_model = resnet.resnet(
        resnet_depth=resnet_depth,
        width_multiplier=width_multiplier,
        cifar_stem=False)

The sthalles/SimCLR implementation is more concise and uses PyTorch, while google-research/simclr uses TensorFlow and exposes additional architecture parameters. SimCLR is easier to understand at a glance, but simclr offers more flexibility in model architecture.

Lightly (2,918 stars)

A python library for self-supervised learning on images.

Pros of Lightly

  • More comprehensive self-supervised learning framework with multiple methods
  • Actively maintained with regular updates and new features
  • Includes tools for data curation and dataset management

Cons of Lightly

  • Steeper learning curve due to more complex API
  • Requires more setup and configuration compared to SimCLR
  • May have higher computational requirements for some features

Code Comparison

SimCLR:

model = SimCLR(encoder, projection_dim, n_features)
criterion = NT_Xent(batch_size, temperature, world_size)
optimizer = LARS(model.parameters(), lr=0.3, weight_decay=1e-6)

Lightly:

model = SimCLRModel(backbone)
criterion = NTXentLoss()
optimizer = LARS(model.parameters(), lr=0.1, weight_decay=1e-6)
collate_fn = SimCLRCollateFunction()

Both implementations follow similar patterns, but Lightly offers more flexibility in model configuration and data handling. SimCLR provides a more straightforward implementation focused specifically on the SimCLR method, while Lightly's approach is more modular and extensible.

CMC (1,298 stars)

[arXiv 2019] "Contrastive Multiview Coding", also contains implementations for MoCo and InstDis

Pros of CMC

  • Implements Contrastive Multiview Coding (CMC), which can handle multiple views of data
  • Provides more flexibility in terms of data augmentation strategies
  • Includes additional loss functions and architectures beyond SimCLR

Cons of CMC

  • Less actively maintained compared to SimCLR
  • Documentation is not as comprehensive
  • May require more setup and configuration for certain use cases

Code Comparison

SimCLR:

class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim=128):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder(num_classes=projection_dim)

CMC:

class CMC(nn.Module):
    def __init__(self, base_encoder, n_views=2, feat_dim=128):
        super(CMC, self).__init__()
        self.encoder = nn.ModuleList([base_encoder() for _ in range(n_views)])
        self.head = nn.ModuleList([nn.Linear(feat_dim, feat_dim) for _ in range(n_views)])

The main difference in the code is that CMC supports multiple views by using ModuleList for both the encoder and projection head, while SimCLR uses a single encoder with a specified projection dimension.

pytorch-metric-learning

The easiest way to use deep metric learning in your application. Modular, flexible, and extensible. Written in PyTorch.

Pros of pytorch-metric-learning

  • More comprehensive library with a wider range of metric learning techniques
  • Actively maintained with regular updates and improvements
  • Extensive documentation and examples for various use cases

Cons of pytorch-metric-learning

  • Steeper learning curve due to its broader scope
  • May have more overhead for simple tasks compared to SimCLR's focused approach

Code Comparison

SimCLR:

class SimCLR(nn.Module):
    def __init__(self, base_encoder, projection_dim=128):
        super(SimCLR, self).__init__()
        self.encoder = base_encoder(pretrained=False)
        self.projector = nn.Sequential(
            nn.Linear(self.encoder.fc.in_features, projection_dim),
            nn.ReLU(),
            nn.Linear(projection_dim, projection_dim)
        )

pytorch-metric-learning:

from pytorch_metric_learning import losses, miners, distances

loss_func = losses.NTXentLoss(temperature=0.07)
mining_func = miners.MultiSimilarityMiner()
distance = distances.CosineSimilarity()

loss = loss_func(embeddings, labels, indices_tuple=mining_func(embeddings, labels))

The SimCLR code focuses on the model architecture, while pytorch-metric-learning provides a more modular approach with separate components for loss functions, mining strategies, and distance metrics.


README

PyTorch SimCLR: A Simple Framework for Contrastive Learning of Visual Representations


Blog post with full documentation: Exploring SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

[Figure: SimCLR architecture]

See also PyTorch Implementation for BYOL - Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning.

Installation

$ conda env create --name simclr --file env.yml
$ conda activate simclr
$ python run.py

Config file

Before running SimCLR, make sure you choose the correct running configurations. You can change them by passing command-line arguments to run.py.


$ python run.py -data ./datasets --dataset-name stl10 --log-every-n-steps 100 --epochs 100 

If you want to run it on CPU (for debugging purposes) use the --disable-cuda option.

For 16-bit precision GPU training, there is no need to install NVIDIA Apex. Just pass the --fp16_precision flag and this implementation will use PyTorch's built-in automatic mixed precision (AMP) training.
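
For reference, PyTorch's built-in AMP follows the autocast/GradScaler pattern sketched below with a stand-in model; this is a generic illustration, not this repository's training loop:

import torch

model = torch.nn.Linear(128, 10).cuda()      # stand-in model for illustration
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    x = torch.randn(32, 128, device='cuda')
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():          # forward pass runs in mixed precision
        loss = model(x).square().mean()
    scaler.scale(loss).backward()            # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)                   # unscales gradients, then steps
    scaler.update()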

Feature Evaluation

Feature evaluation is done using a linear model protocol.

First, we learn features using SimCLR on the STL10 unsupervised set. Then, we train a linear classifier on top of the frozen SimCLR features. The linear model is trained on features extracted from the STL10 train set and evaluated on the STL10 test set.
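
A minimal sketch of that protocol follows; encoder and train_loader are assumed to exist, and the notebook below is the authoritative version:

import torch
import torch.nn.functional as F

# Assume `encoder` is the trained SimCLR backbone and `train_loader` yields (images, labels).
encoder.eval()
feats, labels = [], []
with torch.no_grad():                        # frozen features: no gradients through the encoder
    for x, y in train_loader:
        feats.append(encoder(x))
        labels.append(y)
feats, labels = torch.cat(feats), torch.cat(labels)

# Linear probe: a single fully connected layer trained with cross-entropy.
probe = torch.nn.Linear(feats.size(1), 10)   # 10 classes for STL10/CIFAR10
optimizer = torch.optim.Adam(probe.parameters(), lr=3e-4)
for _ in range(100):
    optimizer.zero_grad()
    loss = F.cross_entropy(probe(feats), labels)
    loss.backward()
    optimizer.step()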

Check the Open In Colab notebook for reproducibility.

Note that SimCLR benefits from longer training.

| Linear Classification | Dataset | Feature Extractor | Architecture | Feature dim. | Projection head dim. | Epochs | Top-1 % |
|---|---|---|---|---|---|---|---|
| Logistic Regression (Adam) | STL10 | SimCLR | ResNet-18 | 512 | 128 | 100 | 74.45 |
| Logistic Regression (Adam) | CIFAR10 | SimCLR | ResNet-18 | 512 | 128 | 100 | 69.82 |
| Logistic Regression (Adam) | STL10 | SimCLR | ResNet-50 | 2048 | 128 | 50 | 70.075 |