
weiaicunzai/awesome-image-classification

A curated list of deep learning image classification papers and codes


Top Related Projects

  • models: Models and examples built with TensorFlow
  • vision: Datasets, Transforms and Models specific to Computer Vision
  • keras-applications: Reference implementations of popular deep learning models.
  • pytorch-image-models: The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
  • Detectron2: Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
  • Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Quick Overview

The "awesome-image-classification" repository is a curated list of image classification papers, including architectures, tricks, and implementation details. It serves as a comprehensive resource for researchers and practitioners in the field of computer vision, focusing on state-of-the-art image classification techniques.

Pros

  • Extensive collection of image classification papers and resources
  • Well-organized structure with categorization by year and topic
  • Regular updates with the latest research and implementations
  • Includes links to official implementations and third-party reproductions

Cons

  • Lacks detailed explanations or summaries of the listed papers
  • May be overwhelming for beginners due to the large volume of information
  • Does not provide direct code examples or implementations
  • Some links may become outdated over time

Code Examples

This repository is not a code library but a curated list of resources. Therefore, there are no code examples to provide.

Getting Started

As this is not a code library, there are no specific getting started instructions. However, users can navigate the repository by browsing through the README.md file, which contains the curated list of papers and resources organized by year and topic.

Competitor Comparisons

76,949

Models and examples built with TensorFlow

Pros of models

  • Comprehensive collection of official TensorFlow models and examples
  • Regularly updated with new models and features
  • Includes pre-trained models for various tasks beyond image classification

Cons of models

  • Larger and more complex repository, potentially overwhelming for beginners
  • Focused solely on TensorFlow, limiting options for other frameworks
  • May require more setup and dependencies to run specific models

Code Comparison

models:

import tensorflow as tf
from official.vision.image_classification import resnet_model

model = resnet_model.resnet50(num_classes=1000)

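awesome-image-classification (ships no code of its own; torchvision shown as a stand-in):

import torch
import torchvision.models as models

# `weights=` replaces the deprecated `pretrained=True` (torchvision >= 0.13)
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)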

Summary

models is a comprehensive repository maintained by TensorFlow, offering a wide range of official models and examples. It provides regular updates and pre-trained models for various tasks. However, it can be complex for beginners and is limited to TensorFlow.

awesome-image-classification is a curated list of image classification papers, codes, and datasets. It offers a broader overview of different approaches and frameworks but may not provide as many ready-to-use implementations as models.

15,955

Datasets, Transforms and Models specific to Computer Vision

Pros of vision

  • Comprehensive library with official PyTorch support
  • Regularly updated with new models and features
  • Extensive documentation and community support

Cons of vision

  • Larger and more complex, potentially harder for beginners
  • Focused on implementation rather than curating resources
  • May include unnecessary components for specific use cases

Code comparison

awesome-image-classification:

# No direct code implementation, primarily a curated list of resources

vision:

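import torchvision.models as models
# `weights=` replaces the deprecated `pretrained=True` (torchvision >= 0.13)
resnet18 = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
vgg16 = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)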

Summary

awesome-image-classification is a curated list of image classification resources, papers, and implementations. It's excellent for researchers and those looking for a comprehensive overview of the field.

vision is an official PyTorch library for computer vision tasks, including image classification. It provides ready-to-use models and utilities, making it ideal for developers and practitioners implementing vision solutions.

Choose awesome-image-classification for research and exploration, and vision for practical implementation and development of image classification models within the PyTorch ecosystem.

Reference implementations of popular deep learning models.

Pros of keras-applications

  • Provides pre-trained deep learning models for immediate use in Keras
  • Includes weight conversion utilities for popular model architectures
  • Offers consistent API for various models, simplifying integration

Cons of keras-applications

  • Limited to a specific set of pre-trained models
  • Focused solely on Keras implementation, less flexible for other frameworks
  • May not include the latest state-of-the-art models

Code Comparison

keras-applications:

from keras.applications import VGG16
model = VGG16(weights='imagenet', include_top=False)
features = model.predict(img_array)

awesome-image-classification:

# No direct code implementation
# Provides links to various repositories and papers
# Users need to navigate to specific implementations

awesome-image-classification is a curated list of image classification resources, including papers, implementations, and datasets. It offers a broader overview of the field but doesn't provide ready-to-use code. keras-applications, on the other hand, focuses on providing pre-trained models with a consistent API for immediate use in Keras projects.

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Pros of pytorch-image-models

  • Comprehensive collection of state-of-the-art image models implemented in PyTorch
  • Actively maintained with frequent updates and new model additions
  • Includes pre-trained weights and easy-to-use interfaces for fine-tuning and inference

Cons of pytorch-image-models

  • Focused solely on PyTorch implementations, limiting options for users of other frameworks
  • May have a steeper learning curve for beginners due to its extensive feature set
  • Less emphasis on educational resources compared to awesome-image-classification

Code Comparison

pytorch-image-models:

import timm
import torch
model = timm.create_model('resnet50', pretrained=True)
# Dummy batch: one 3-channel 224x224 image
input_tensor = torch.randn(1, 3, 224, 224)
output = model(input_tensor)

awesome-image-classification:

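import torch
from torchvision import models
# `weights=` replaces the deprecated `pretrained=True` (torchvision >= 0.13)
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
input_tensor = torch.randn(1, 3, 224, 224)  # dummy batch of one image
output = model(input_tensor)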

The code comparison shows that pytorch-image-models (timm) offers a unified interface for accessing various models. awesome-image-classification ships no code of its own, so the second snippet uses torchvision as a stand-in for the kind of standard PyTorch implementation it links to.

pytorch-image-models provides a centralized repository of implemented models with consistent APIs, making it easier to experiment with different architectures. awesome-image-classification, on the other hand, serves as a curated list of resources and implementations, offering a broader overview of the field but with less direct code integration.

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

  • Comprehensive object detection and segmentation library with state-of-the-art models
  • Modular design allows for easy customization and extension
  • Extensive documentation and active community support

Cons of Detectron2

  • Steeper learning curve due to its complexity and extensive features
  • Focused primarily on object detection and segmentation, less versatile for general image classification tasks
  • Requires more computational resources for training and inference

Code Comparison

Awesome-image-classification (PyTorch example):

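import torch
import torchvision
from torchvision import transforms
from PIL import Image
model = torchvision.models.resnet50(weights=torchvision.models.ResNet50_Weights.IMAGENET1K_V1)
transform = transforms.Compose([transforms.Resize(256), transforms.CenterCrop(224), transforms.ToTensor()])
img = Image.open("image.jpg")
img_t = transform(img)
batch_t = torch.unsqueeze(img_t, 0)  # add a batch dimension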

Detectron2 (object detection example):

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)

24,519

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

  • Provides a complete implementation of Mask R-CNN for object detection and instance segmentation
  • Includes pre-trained models and examples for easy use and adaptation
  • Offers detailed documentation and tutorials for understanding and using the codebase

Cons of Mask_RCNN

  • Focuses solely on Mask R-CNN, limiting its scope compared to the broader image classification resources
  • May require more computational resources due to the complexity of instance segmentation tasks
  • Less frequently updated compared to the curated list of resources in awesome-image-classification

Code Comparison

Mask_RCNN example usage:

import mrcnn.model as modellib
model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(COCO_MODEL_PATH, by_name=True)
results = model.detect([image], verbose=1)

awesome-image-classification doesn't provide direct code examples but offers links to various image classification implementations.


README

Awesome - Image Classification


A curated list of deep learning image classification papers and codes since 2014, inspired by awesome-object-detection, deep_learning_object_detection and awesome-deep-learning-papers.

Background

I believe image classification is a great starting point before diving into other computer vision fields, especially for beginners who know nothing about deep learning. When I started to learn computer vision, I made a lot of mistakes, and I wish someone had told me which paper to start with. Until now, there doesn't seem to have been a repository listing image classification papers the way deep_learning_object_detection does for object detection. Therefore, I decided to make a repository listing deep learning image classification papers and codes to help others. My personal advice for people who know nothing about deep learning: start with VGG, then GoogleNet, then ResNet; after that, feel free to continue reading the other listed papers or switch to other fields.

Note: I also have a repository of PyTorch implementations of some of the image classification networks; you can check it out here.

Performance Table

For simplicity, I only list the best top1 and top5 accuracy on ImageNet reported in each paper. Note that a higher accuracy does not necessarily mean one network is better than another: some networks focus on reducing model complexity rather than improving accuracy, and some papers report only single-crop results on ImageNet while others report model fusion or multi-crop results.

  • ConvNet: name of the convolution network
  • ImageNet top1 acc: best top1 accuracy on ImageNet from the Paper
  • ImageNet top5 acc: best top5 accuracy on ImageNet from the Paper
  • Published In: which conference or journal the paper was published in.
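
| ConvNet | ImageNet top1 acc | ImageNet top5 acc | Published In |
|---|---|---|---|
| Vgg | 76.3 | 93.2 | ICLR2015 |
| GoogleNet | - | 93.33 | CVPR2015 |
| PReLU-nets | - | 95.06 | ICCV2015 |
| ResNet | - | 96.43 | CVPR2015 |
| PreActResNet | 79.9 | 95.2 | CVPR2016 |
| Inceptionv3 | 82.8 | 96.42 | CVPR2016 |
| Inceptionv4 | 82.3 | 96.2 | AAAI2016 |
| Inception-ResNet-v2 | 82.4 | 96.3 | AAAI2016 |
| Inceptionv4 + Inception-ResNet-v2 | 83.5 | 96.92 | AAAI2016 |
| RiR | - | - | ICLR Workshop2016 |
| Stochastic Depth ResNet | 78.02 | - | ECCV2016 |
| WRN | 78.1 | 94.21 | BMVC2016 |
| SqueezeNet | 60.4 | 82.5 | arXiv2017 (rejected by ICLR2017) |
| GeNet | 72.13 | 90.26 | ICCV2017 |
| MetaQNN | - | - | ICLR2017 |
| PyramidNet | 80.8 | 95.3 | CVPR2017 |
| DenseNet | 79.2 | 94.71 | ECCV2017 |
| FractalNet | 75.8 | 92.61 | ICLR2017 |
| ResNext | - | 96.97 | CVPR2017 |
| IGCV1 | 73.05 | 91.08 | ICCV2017 |
| Residual Attention Network | 80.5 | 95.2 | CVPR2017 |
| Xception | 79 | 94.5 | CVPR2017 |
| MobileNet | 70.6 | - | arXiv2017 |
| PolyNet | 82.64 | 96.55 | CVPR2017 |
| DPN | 79 | 94.5 | NIPS2017 |
| Block-QNN | 77.4 | 93.54 | CVPR2018 |
| CRU-Net | 79.7 | 94.7 | IJCAI2018 |
| DLA | 75.3 | - | CVPR2018 |
| ShuffleNet | 75.3 | - | CVPR2018 |
| CondenseNet | 73.8 | 91.7 | CVPR2018 |
| NasNet | 82.7 | 96.2 | CVPR2018 |
| MobileNetV2 | 74.7 | - | CVPR2018 |
| IGCV2 | 70.07 | - | CVPR2018 |
| hier | 79.7 | 94.8 | ICLR2018 |
| PNasNet | 82.9 | 96.2 | ECCV2018 |
| AmoebaNet | 83.9 | 96.6 | AAAI2018 |
| SENet | - | 97.749 | CVPR2018 |
| ShuffleNetV2 | 81.44 | - | ECCV2018 |
| CBAM | 79.93 | 94.41 | ECCV2018 |
| IGCV3 | 72.2 | - | BMVC2018 |
| BAM | 77.56 | 93.71 | BMVC2018 |
| MnasNet | 76.13 | 92.85 | CVPR2018 |
| SKNet | 80.60 | - | CVPR2019 |
| DARTS | 73.3 | 91.3 | ICLR2019 |
| ProxylessNAS | 75.1 | 92.5 | ICLR2019 |
| MobileNetV3 | 75.2 | - | CVPR2019 |
| Res2Net | 79.2 | 94.37 | PAMI2019 |
| LIP-ResNet | 79.33 | 94.6 | ICCV2019 |
| EfficientNet | 84.3 | 97.0 | ICML2019 |
| FixResNeXt | 86.4 | 98.0 | NIPS2019 |
| BiT | 87.5 | - | ECCV2020 |
| PSConv + ResNext101 | 80.502 | 95.276 | ECCV2020 |
| NoisyStudent | 88.4 | 98.7 | CVPR2020 |
| RegNet | 79.9 | - | CVPR2020 |
| GhostNet | 75.7 | - | CVPR2020 |
| ViT | 88.55 | - | ICLR2021 |
| DeiT | 85.2 | - | ICML2021 |
| PVT | 81.7 | - | ICCV2021 |
| T2T-Vit | 83.3 | - | ICCV2021 |
| DeepVit | 80.9 | - | arXiv2021 |
| ViL | 83.7 | - | ICCV2021 |
| TNT | 83.9 | - | arXiv2021 |
| CvT | 87.7 | - | ICCV2021 |
| CViT | 84.1 | - | ICCV2021 |
| Focal-T | 84.0 | - | NIPS2021 |
| Twins | 83.7 | - | NIPS2021 |
| PVTv2 | 81.7 | - | CVM2022 |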
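
For readers new to the two accuracy columns, here is a minimal sketch of how top-k accuracy is usually computed, and of why multi-crop or model-fusion numbers are not directly comparable with single-crop ones. It is not taken from any of the listed papers: the function names `topk_accuracy` and `average_scores` and all scores are made up for illustration, and the toy data uses 3 classes with k=2 where ImageNet results use 1000 classes with k=5.

```python
from typing import List, Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        topk = sorted(range(len(row)), key=lambda c: row[c], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


def average_scores(views: Sequence[Sequence[Sequence[float]]]) -> List[List[float]]:
    """Average per-class scores over several crops (or models) per sample."""
    n_views = len(views)
    return [
        [sum(vals) / n_views for vals in zip(*rows)]
        for rows in zip(*views)  # rows: one score vector per view, same sample
    ]


# Hypothetical class scores for 3 samples from two crops of each image.
labels = [0, 1, 2]
crop_a = [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.2, 0.5, 0.3]]
crop_b = [[0.5, 0.4, 0.1], [0.2, 0.7, 0.1], [0.3, 0.3, 0.4]]

single_top1 = topk_accuracy(crop_a, labels, k=1)
single_top2 = topk_accuracy(crop_a, labels, k=2)
fused_top1 = topk_accuracy(average_scores([crop_a, crop_b]), labels, k=1)

print(f"single-crop top1: {single_top1:.3f}")  # 0.333
print(f"single-crop top2: {single_top2:.3f}")  # 1.000
print(f"two-crop top1:    {fused_top1:.3f}")   # 0.667
```

In this toy case, averaging two crops doubles the top1 accuracy of the same underlying scores, which is why a paper's evaluation protocol matters when comparing rows.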

Papers&Codes

VGG

Very Deep Convolutional Networks for Large-Scale Image Recognition. Karen Simonyan, Andrew Zisserman

GoogleNet

Going Deeper with Convolutions. Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich

PReLU-nets

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

ResNet

Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

PreActResNet

Identity Mappings in Deep Residual Networks. Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun

Inceptionv3

Rethinking the Inception Architecture for Computer Vision. Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna

Inceptionv4 && Inception-ResNetv2

Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Christian Szegedy, Sergey Ioffe, Vincent Vanhoucke, Alex Alemi

RiR

Resnet in Resnet: Generalizing Residual Architectures. Sasha Targ, Diogo Almeida, Kevin Lyman

Stochastic Depth ResNet

Deep Networks with Stochastic Depth. Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger

WRN

Wide Residual Networks. Sergey Zagoruyko, Nikos Komodakis

SqueezeNet

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size. Forrest N. Iandola, Song Han, Matthew W. Moskewicz, Khalid Ashraf, William J. Dally, Kurt Keutzer

GeNet

Genetic CNN. Lingxi Xie, Alan Yuille

MetaQNN

Designing Neural Network Architectures using Reinforcement Learning. Bowen Baker, Otkrist Gupta, Nikhil Naik, Ramesh Raskar

PyramidNet

Deep Pyramidal Residual Networks. Dongyoon Han, Jiwhan Kim, Junmo Kim

DenseNet

Densely Connected Convolutional Networks. Gao Huang, Zhuang Liu, Laurens van der Maaten, Kilian Q. Weinberger

FractalNet

FractalNet: Ultra-Deep Neural Networks without Residuals. Gustav Larsson, Michael Maire, Gregory Shakhnarovich

ResNext

Aggregated Residual Transformations for Deep Neural Networks. Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, Kaiming He

IGCV1

Interleaved Group Convolutions for Deep Neural Networks. Ting Zhang, Guo-Jun Qi, Bin Xiao, Jingdong Wang

Residual Attention Network

Residual Attention Network for Image Classification. Fei Wang, Mengqing Jiang, Chen Qian, Shuo Yang, Cheng Li, Honggang Zhang, Xiaogang Wang, Xiaoou Tang

Xception

Xception: Deep Learning with Depthwise Separable Convolutions. François Chollet

MobileNet

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, Marco Andreetto, Hartwig Adam

PolyNet

PolyNet: A Pursuit of Structural Diversity in Very Deep Networks. Xingcheng Zhang, Zhizhong Li, Chen Change Loy, Dahua Lin

DPN

Dual Path Networks. Yunpeng Chen, Jianan Li, Huaxin Xiao, Xiaojie Jin, Shuicheng Yan, Jiashi Feng

Block-QNN

Practical Block-wise Neural Network Architecture Generation. Zhao Zhong, Junjie Yan, Wei Wu, Jing Shao, Cheng-Lin Liu

CRU-Net

Sharing Residual Units Through Collective Tensor Factorization in Deep Neural Networks. Chen Yunpeng, Jin Xiaojie, Kang Bingyi, Feng Jiashi, Yan Shuicheng

DLA

Deep Layer Aggregation. Fisher Yu, Dequan Wang, Evan Shelhamer, Trevor Darrell

ShuffleNet

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices. Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun

CondenseNet

CondenseNet: An Efficient DenseNet using Learned Group Convolutions. Gao Huang, Shichen Liu, Laurens van der Maaten, Kilian Q. Weinberger

NasNet

Learning Transferable Architectures for Scalable Image Recognition. Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le

MobileNetV2

MobileNetV2: Inverted Residuals and Linear Bottlenecks. Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, Liang-Chieh Chen

IGCV2

IGCV2: Interleaved Structured Sparse Convolutional Neural Networks. Guotian Xie, Jingdong Wang, Ting Zhang, Jianhuang Lai, Richang Hong, Guo-Jun Qi

hier

Hierarchical Representations for Efficient Architecture Search. Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, Koray Kavukcuoglu

PNasNet

Progressive Neural Architecture Search. Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy

AmoebaNet

Regularized Evolution for Image Classifier Architecture Search. Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le

SENet

Squeeze-and-Excitation Networks. Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Enhua Wu

ShuffleNetV2

ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design. Ningning Ma, Xiangyu Zhang, Hai-Tao Zheng, Jian Sun

CBAM

CBAM: Convolutional Block Attention Module. Sanghyun Woo, Jongchan Park, Joon-Young Lee, In So Kweon

IGCV3

IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks. Ke Sun, Mingjie Li, Dong Liu, Jingdong Wang

BAM

BAM: Bottleneck Attention Module. Jongchan Park, Sanghyun Woo, Joon-Young Lee, In So Kweon

MNasNet

MnasNet: Platform-Aware Neural Architecture Search for Mobile. Mingxing Tan, Bo Chen, Ruoming Pang, Vijay Vasudevan, Quoc V. Le

SKNet

Selective Kernel Networks. Xiang Li, Wenhai Wang, Xiaolin Hu, Jian Yang

DARTS

DARTS: Differentiable Architecture Search. Hanxiao Liu, Karen Simonyan, Yiming Yang

ProxylessNAS

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware. Han Cai, Ligeng Zhu, Song Han

MobileNetV3

Searching for MobileNetV3. Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, Quoc V. Le, Hartwig Adam

Res2Net

Res2Net: A New Multi-scale Backbone Architecture. Shang-Hua Gao, Ming-Ming Cheng, Kai Zhao, Xin-Yu Zhang, Ming-Hsuan Yang, Philip Torr

LIP-ResNet

LIP: Local Importance-based Pooling. Ziteng Gao, Limin Wang, Gangshan Wu

EfficientNet

EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. Mingxing Tan, Quoc V. Le

FixResNeXt

Fixing the train-test resolution discrepancy. Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

BiT

Big Transfer (BiT): General Visual Representation Learning. Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Joan Puigcerver, Jessica Yung, Sylvain Gelly, Neil Houlsby

PSConv + ResNext101

PSConv: Squeezing Feature Pyramid into One Compact Poly-Scale Convolutional Layer. Duo Li, Anbang Yao, Qifeng Chen

NoisyStudent

Self-training with Noisy Student improves ImageNet classification. Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le

RegNet

Designing Network Design Spaces. Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, Piotr Dollár

GhostNet

GhostNet: More Features from Cheap Operations. Kai Han, Yunhe Wang, Qi Tian, Jianyuan Guo, Chunjing Xu, Chang Xu

ViT

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby

DeiT

Training data-efficient image transformers & distillation through attention. Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou

PVT

Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions. Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao

T2T

Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet. Li Yuan, Yunpeng Chen, Tao Wang, Weihao Yu, Yujun Shi, Zihang Jiang, Francis EH Tay, Jiashi Feng, Shuicheng Yan

DeepVit

DeepViT: Towards Deeper Vision Transformer. Daquan Zhou, Bingyi Kang, Xiaojie Jin, Linjie Yang, Xiaochen Lian, Zihang Jiang, Qibin Hou, Jiashi Feng

ViL

Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding. Pengchuan Zhang, Xiyang Dai, Jianwei Yang, Bin Xiao, Lu Yuan, Lei Zhang, Jianfeng Gao

TNT

Transformer in Transformer. Kai Han, An Xiao, Enhua Wu, Jianyuan Guo, Chunjing Xu, Yunhe Wang

CvT

CvT: Introducing Convolutions to Vision Transformers. Haiping Wu, Bin Xiao, Noel Codella, Mengchen Liu, Xiyang Dai, Lu Yuan, Lei Zhang

CViT

CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification. Chun-Fu (Richard) Chen, Quanfu Fan, Rameswar Panda

Focal-T

Focal Attention for Long-Range Interactions in Vision Transformers. Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Xiyang Dai, Bin Xiao, Lu Yuan, Jianfeng Gao

Twins

Twins: Revisiting the Design of Spatial Attention in Vision Transformers

PVTv2

Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao