Top Related Projects
Datasets, Transforms and Models specific to Computer Vision
Models and examples built with TensorFlow
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
OpenMMLab Pre-training Toolbox and Benchmark
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Deep Learning for humans
Quick Overview
pycls is a PyTorch-based image classification codebase developed by Facebook AI Research. It provides a simple and flexible toolkit for training and evaluating state-of-the-art image classification models, with a focus on efficiency and reproducibility.
Pros
- Highly modular and extensible architecture
- Supports a wide range of popular image classification models
- Efficient implementation with multi-GPU training support
- Comprehensive documentation and examples
Cons
- Limited to image classification tasks
- Requires familiarity with PyTorch
- May have a steeper learning curve for beginners
- Not as widely adopted as some other deep learning frameworks
Code Examples
- Loading a pre-trained model:
import torch
import pycls.core.builders as builders
from pycls.core.config import cfg
# Load configuration
cfg.merge_from_file("path/to/config.yaml")
# Build the model described by the configuration
model = builders.build_model()
# Load pre-trained weights (map to CPU so this also works without a GPU)
checkpoint = torch.load("path/to/checkpoint.pyth", map_location="cpu")
model.load_state_dict(checkpoint["model_state"])
- Training a model:
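import pycls.core.trainer as trainer
from pycls.core.config import cfg
# Load the configuration that defines the model, optimizer, and data
cfg.merge_from_file("path/to/config.yaml")
cfg.freeze()
# In the upstream repository train_model() takes no arguments: it builds the
# model, optimizer, and data loaders from cfg (verify against your checkout)
trainer.train_model()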
- Evaluating a model:
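import pycls.core.trainer as trainer
from pycls.core.config import cfg
# Load the configuration and point it at the checkpoint to evaluate
cfg.merge_from_file("path/to/config.yaml")
cfg.TEST.WEIGHTS = "path/to/checkpoint.pyth"
# In the upstream repository test_model() takes no arguments and logs error
# rates such as top1_err (an error, not an accuracy); verify against your checkout
trainer.test_model()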
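Note that a key named top1_err holds an error rate, so the corresponding accuracy is 100 minus that value. The following is a minimal, dependency-free sketch of how a top-k error can be computed from per-class scores. The function name topk_error and the sample scores are hypothetical and are not part of pycls.

```python
def topk_error(scores, labels, k=1):
    """Return the top-k error in percent.

    scores: list of per-class score lists, one per example.
    labels: list of integer class indices, one per example.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    wrong = 0
    for row, label in zip(scores, labels):
        # Indices of the k highest-scoring classes for this example
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        if label not in topk:
            wrong += 1
    return 100.0 * wrong / len(labels)


# Hypothetical scores for 4 examples over 3 classes
scores = [
    [0.10, 0.70, 0.20],
    [0.80, 0.10, 0.10],
    [0.35, 0.25, 0.40],
    [0.20, 0.50, 0.30],
]
labels = [1, 0, 0, 2]

top1_err = topk_error(scores, labels, k=1)
top2_err = topk_error(scores, labels, k=2)
print(f"top-1 error: {top1_err:.2f}%  top-1 accuracy: {100 - top1_err:.2f}%")
print(f"top-2 error: {top2_err:.2f}%")
```

With these sample values, two of the four examples are misclassified at k=1 and none at k=2.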
Getting Started
- Install pycls:
git clone https://github.com/facebookresearch/pycls.git
cd pycls
pip install -e .
- Prepare your dataset and configuration file.
- Train a model:
python tools/train_net.py --cfg configs/imagenet/resnet/R-50-1x64d.yaml
- Evaluate a model:
python tools/test_net.py --cfg configs/imagenet/resnet/R-50-1x64d.yaml TEST.WEIGHTS /path/to/model/weights.pyth
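The evaluation command above passes TEST.WEIGHTS followed by a path, that is, a configuration key and its value appended after the --cfg file. As a rough illustration of how such trailing KEY VALUE pairs can override a nested configuration, here is a simplified stand-in written with plain dictionaries. It is not the configuration library pycls uses, and the function name apply_overrides and the default values are assumptions made for the example.

```python
import ast


def apply_overrides(config, opts):
    """Apply ["A.B", "value", ...] pairs to a nested dict in place."""
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, raw in zip(opts[0::2], opts[1::2]):
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            node = node[name]  # KeyError on an unknown section
        if leaf not in node:
            raise KeyError(f"unknown config key: {key}")
        try:
            value = ast.literal_eval(raw)  # "8" -> 8, "0.1" -> 0.1
        except (ValueError, SyntaxError):
            value = raw  # keep paths and other plain strings as-is
        node[leaf] = value
    return config


# Hypothetical defaults, loosely modeled on the keys used in this section
config = {"TEST": {"WEIGHTS": "", "BATCH_SIZE": 200}, "NUM_GPUS": 1}
opts = ["TEST.WEIGHTS", "/path/to/model/weights.pyth", "NUM_GPUS", "8"]
apply_overrides(config, opts)
print(config)
```

Rejecting unknown keys, as this sketch does, catches typos on the command line early instead of silently adding a new setting.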
Competitor Comparisons
Datasets, Transforms and Models specific to Computer Vision
Pros of vision
- Broader scope, covering various computer vision tasks beyond image classification
- More extensive documentation and tutorials
- Larger community and more frequent updates
Cons of vision
- Can be more complex to use for simple image classification tasks
- May have more dependencies and a larger footprint
Code comparison
vision:
import torchvision.models as models
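# pretrained=True is deprecated since torchvision 0.13; use the weights argument
resnet18 = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)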
pycls:
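from pycls.core.config import cfg
from pycls.models.resnet import ResNet
cfg.MODEL.DEPTH = 50  # pycls reads the architecture from cfg, not constructor arguments
model = ResNet()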
Summary
vision offers a more comprehensive toolkit for computer vision tasks, with better documentation and community support. However, pycls may be more straightforward for specific image classification tasks. vision's code tends to be more concise for model instantiation, while pycls provides more explicit control over model architecture.
Models and examples built with TensorFlow
Pros of models
- Broader scope, covering various ML tasks beyond computer vision
- Larger community and more frequent updates
- Official TensorFlow implementation, ensuring compatibility
Cons of models
- More complex structure, potentially harder to navigate
- Heavier resource requirements due to its comprehensive nature
- May include unnecessary components for specific tasks
Code comparison
models:
import tensorflow as tf
from official.vision.image_classification import resnet_model
model = resnet_model.resnet50(num_classes=1000)
pycls:
from pycls.core.config import cfg
from pycls.core.builders import build_model
cfg.MODEL.TYPE = "resnet"
model = build_model()
Summary
models is a comprehensive repository for various machine learning tasks, offering a wide range of models and implementations. It benefits from the large TensorFlow ecosystem and frequent updates. However, its breadth can make it more complex to use for specific tasks.
pycls focuses on computer vision tasks, particularly image classification. It offers a simpler, more streamlined approach for these specific use cases. While it may have fewer features overall, it can be easier to use for targeted computer vision projects.
The choice between the two depends on the specific requirements of your project, the desired ecosystem, and the breadth of functionality needed.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Pros of pytorch-image-models
- Larger collection of pre-trained models and architectures
- More frequent updates and active community contributions
- Extensive documentation and examples for various use cases
Cons of pytorch-image-models
- Potentially more complex to use for beginners due to its extensive features
- May have higher memory requirements for some models
Code Comparison
pycls:
from pycls.core.config import cfg
from pycls.core.builders import build_model
# Load config and build model
cfg.merge_from_file("path/to/config.yaml")
model = build_model()
pytorch-image-models:
import timm
# Load pre-trained model
model = timm.create_model('resnet50', pretrained=True)
pytorch-image-models offers a more straightforward approach to loading pre-trained models, while pycls provides a configuration-based setup that may be more flexible for custom architectures.
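To make the idea of a configuration-based setup concrete, the sketch below shows one common way to implement it: a registry that maps a string in the configuration to a constructor. It is an illustration only. The classes, the MODEL_REGISTRY name, and the dictionary-based cfg are hypothetical stand-ins and do not reproduce the pycls implementation.

```python
# Hypothetical stand-ins for model classes; real ones would be nn.Module subclasses
class ResNet:
    def __init__(self, depth):
        self.depth = depth


class RegNet:
    def __init__(self, depth):
        self.depth = depth


# Registry mapping a configuration string to a constructor
MODEL_REGISTRY = {"resnet": ResNet, "regnet": RegNet}


def build_model(cfg):
    """Build the model named by cfg["MODEL"]["TYPE"]."""
    model_type = cfg["MODEL"]["TYPE"]
    if model_type not in MODEL_REGISTRY:
        raise ValueError(f"unsupported model type: {model_type!r}")
    return MODEL_REGISTRY[model_type](depth=cfg["MODEL"]["DEPTH"])


cfg = {"MODEL": {"TYPE": "resnet", "DEPTH": 50}}
model = build_model(cfg)
print(type(model).__name__, model.depth)
```

With this pattern, adding a custom architecture means registering one more class and selecting it in the configuration file, without touching the training script.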
OpenMMLab Pre-training Toolbox and Benchmark
Pros of mmpretrain
- More comprehensive and feature-rich, supporting a wider range of models and tasks
- Better documentation and community support
- More frequent updates and active development
Cons of mmpretrain
- Steeper learning curve due to its complexity
- Potentially slower execution for simple tasks compared to pycls
Code Comparison
mmpretrain:
from mmpretrain import ImageClassificationInferencer
inferencer = ImageClassificationInferencer('resnet50_8xb32_in1k')
results = inferencer('demo.jpg')
print(results)
pycls:
import torch
import pycls.core.builders as builders
from pycls.core.config import cfg
cfg.merge_from_file("path/to/config.yaml")
model = builders.build_model()
inputs = torch.randn(1, 3, 224, 224)  # dummy batch: one 224x224 RGB image
outputs = model(inputs)
mmpretrain offers a higher-level API for inference, making it easier to use out-of-the-box. pycls requires more manual configuration and model setup, but provides more fine-grained control over the process.
Both repositories are valuable tools for computer vision tasks, with mmpretrain being more suitable for complex projects and pycls for simpler, more focused applications.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Pros of transformers
- Broader scope: supports text, vision, audio, and multimodal models, not only image classification
- Extensive documentation and community support
- Regular updates and new model implementations
Cons of transformers
- Steeper learning curve due to its complexity
- Potentially higher computational requirements
- May include unnecessary features for simpler projects
Code comparison
transformers:
from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
pycls:
from pycls.core.config import cfg
from pycls.core.builders import build_model
cfg.MODEL.TYPE = "resnet"
cfg.MODEL.DEPTH = 50
model = build_model()
Summary
transformers offers a comprehensive toolkit spanning text, vision, audio, and multimodal models, while pycls focuses on image classification models with a simpler interface. transformers provides more flexibility but may be overkill for basic projects, whereas pycls is more straightforward but limited in scope. Choose based on your specific needs and project complexity.
Deep Learning for humans
Pros of Keras
- More extensive documentation and larger community support
- Supports multiple backend engines (TensorFlow, JAX, and PyTorch as of Keras 3)
- Higher-level API, making it easier for beginners to get started
Cons of Keras
- Less flexibility for low-level operations compared to PyTorch-based pycls
- Potentially slower execution due to higher-level abstractions
- Limited support for dynamic computational graphs
Code Comparison
Keras:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
pycls:
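import torch
import torch.nn as nn
# Plain PyTorch module, the building block that pycls models are written in
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(784, 64)
        self.fc2 = nn.Linear(64, 10)

    def forward(self, x):
        # Return logits; softmax is typically folded into the loss
        return self.fc2(torch.relu(self.fc1(x)))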
Both frameworks allow for creating neural networks, but pycls offers more low-level control while Keras provides a more concise, high-level API. Keras is generally easier for beginners, while pycls (based on PyTorch) offers more flexibility for advanced users and researchers.
README
pycls
pycls is an image classification codebase, written in PyTorch. It was originally developed for the On Network Design Spaces for Visual Recognition project. pycls has since matured and been adopted by a number of projects at Facebook AI Research.

pycls provides a large set of baseline models across a wide range of flop regimes.
Introduction
The goal of pycls is to provide a simple and flexible codebase for image classification. It is designed to support rapid implementation and evaluation of research ideas. pycls also provides a large collection of baseline results (Model Zoo). The codebase supports efficient single-machine multi-gpu training, powered by the PyTorch distributed package, and provides implementations of standard models including ResNet, ResNeXt, EfficientNet, and RegNet.
Using pycls
Please see GETTING_STARTED for brief installation instructions and basic usage examples.
Model Zoo
We provide a large set of baseline results and pretrained models available for download in the pycls Model Zoo; including the simple, fast, and effective RegNet models that we hope can serve as solid baselines across a wide range of flop regimes.
Sweep Code
The pycls codebase now provides powerful support for studying design spaces and, more generally, population statistics of models as introduced in On Network Design Spaces for Visual Recognition and Designing Network Design Spaces. The idea is that instead of planning a single pycls job (e.g., testing a specific model configuration), one can study the behavior of an entire population of models. This allows for quite powerful and succinct experimental design, and elevates the study of individual model behavior to the study of the behavior of model populations. Please see SWEEP_INFO for details.
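One population statistic used in the design-space papers cited above is the error empirical distribution function (EDF): for a threshold e, the fraction of sampled models whose error is below e. The sketch below computes it with the standard library only. It is not the pycls sweep code, the name error_edf is hypothetical, and the error values are synthetic stand-ins rather than measured results.

```python
import random


def error_edf(errors, thresholds):
    """Empirical distribution function: fraction of models with error < e."""
    n = len(errors)
    return [sum(err < e for err in errors) / n for e in thresholds]


# Synthetic errors (percent) standing in for two sampled model populations
rng = random.Random(0)
space_a = [rng.gauss(30.0, 3.0) for _ in range(100)]
space_b = [rng.gauss(27.0, 3.0) for _ in range(100)]

thresholds = [20, 25, 30, 35, 40]
edf_a = error_edf(space_a, thresholds)
edf_b = error_edf(space_b, thresholds)
for e, fa, fb in zip(thresholds, edf_a, edf_b):
    print(f"error < {e}%: A={fa:.2f}  B={fb:.2f}")
```

Comparing the two curves at the same threshold shows which population contains a larger share of low-error models, which is the kind of question a sweep is meant to answer.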
Projects
A number of projects at FAIR have been built on top of pycls:
- On Network Design Spaces for Visual Recognition
- Exploring Randomly Wired Neural Networks for Image Recognition
- Designing Network Design Spaces
- Fast and Accurate Model Scaling
- Are Labels Necessary for Neural Architecture Search?
- PySlowFast Video Understanding Codebase
If you are using pycls in your research and would like to include your project here, please let us know or send a PR.
Citing pycls
If you find pycls helpful in your research or refer to the baseline results in the Model Zoo, please consider citing an appropriate subset of the following papers:
@InProceedings{Radosavovic2019,
title = {On Network Design Spaces for Visual Recognition},
author = {Ilija Radosavovic and Justin Johnson and Saining Xie and Wan-Yen Lo and Piotr Doll{\'a}r},
booktitle = {ICCV},
year = {2019}
}
@InProceedings{Radosavovic2020,
title = {Designing Network Design Spaces},
author = {Ilija Radosavovic and Raj Prateek Kosaraju and Ross Girshick and Kaiming He and Piotr Doll{\'a}r},
booktitle = {CVPR},
year = {2020}
}
@InProceedings{Dollar2021,
title = {Fast and Accurate Model Scaling},
author = {Piotr Doll{\'a}r and Mannat Singh and Ross Girshick},
booktitle = {CVPR},
year = {2021}
}
License
pycls is released under the MIT license. Please see the LICENSE file for more information.
Contributing
We actively welcome your pull requests! Please see CONTRIBUTING.md and CODE_OF_CONDUCT.md for more info.