Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Top Related Projects
- facebookresearch/deit: Official DeiT repository
- huggingface/pytorch-image-models: The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
- microsoft/Swin-Transformer: Official implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
- open-mmlab/mmdetection: OpenMMLab Detection Toolbox and Benchmark
- tensorflow/models: Models and examples built with TensorFlow
- facebookresearch/detectron2: Platform for object detection, segmentation and other visual recognition tasks
Quick Overview
The Efficient-AI-Backbones repository by Huawei Noah's Ark Lab contains implementations of efficient neural network architectures for various computer vision tasks. It focuses on lightweight models that can be deployed on resource-constrained devices while maintaining high accuracy.
Pros
- Offers a collection of state-of-the-art efficient neural network architectures
- Provides pre-trained models for quick deployment and fine-tuning
- Includes detailed documentation and usage instructions
- Supports multiple deep learning frameworks (PyTorch, MindSpore)
Cons
- Limited to computer vision tasks, not applicable to other domains
- May require additional optimization for specific hardware platforms
- Some architectures might be complex for beginners to understand and modify
- Regular updates and maintenance may be needed to keep up with latest advancements
Code Examples
- Loading a pre-trained GhostNet model in PyTorch:

```python
import torch
from ghostnet import ghostnet

# Load the pre-trained GhostNet model
model = ghostnet(pretrained=True)
model.eval()

# Prepare a dummy input tensor (batch of one 224x224 RGB image)
input_tensor = torch.randn(1, 3, 224, 224)

# Run inference
with torch.no_grad():
    output = model(input_tensor)
```
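To turn the raw logits into predictions, apply a softmax and take the most likely classes; a minimal sketch, assuming the default 1000-class ImageNet head:

```python
import torch.nn.functional as F

# Convert logits to probabilities and take the five most likely classes
probabilities = F.softmax(output, dim=1)
top5_prob, top5_idx = probabilities.topk(5, dim=1)
print(top5_idx[0].tolist())
print(top5_prob[0].tolist())
```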
- Creating a custom GhostNet model with a specific width multiplier:

```python
from ghostnet import ghostnet

# Create a GhostNet model with a width multiplier of 0.5
model = ghostnet(width_mult=0.5)
```
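A quick way to see the effect of the multiplier is to compare parameter counts; a sketch, assuming the width_mult argument shown above:

```python
from ghostnet import ghostnet

# Compare model sizes at different width multipliers
for width in (0.5, 1.0, 1.5):
    model = ghostnet(width_mult=width)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"width_mult={width}: {n_params / 1e6:.2f}M parameters")
```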
- Fine-tuning a pre-trained model on a custom dataset:

```python
import torch
from ghostnet import ghostnet

# Load the pre-trained model
model = ghostnet(pretrained=True)

# Replace the last fully connected layer to match the new task
num_classes = 10
model.classifier = torch.nn.Linear(model.classifier.in_features, num_classes)

# Define loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training loop (simplified; num_epochs and dataloader are defined elsewhere)
for epoch in range(num_epochs):
    for inputs, labels in dataloader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
```
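The loop above assumes a dataloader already exists; a minimal sketch of building one with torchvision (paths and hyperparameters are illustrative):

```python
import torch
from torchvision import datasets, transforms

# ImageNet-style preprocessing for 224x224 inputs
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Expects one subfolder per class, e.g. path/to/train/<class_name>/*.jpg
dataset = datasets.ImageFolder("path/to/train", transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
num_epochs = 10
```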
Getting Started
To get started with Efficient-AI-Backbones:
1. Clone the repository:

```bash
git clone https://github.com/huawei-noah/Efficient-AI-Backbones.git
cd Efficient-AI-Backbones
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Import and use a model in your Python script:

```python
from ghostnet import ghostnet

model = ghostnet(pretrained=True)
# Use the model for inference or fine-tuning
```
For more detailed instructions and examples, refer to the documentation in the repository's README file.
Competitor Comparisons
Official DeiT repository
Pros of DeiT
- Focuses on data-efficient image transformers, offering a novel distillation approach
- Provides pre-trained models and training scripts for easy reproduction of results
- Achieves competitive performance with fewer parameters and less training data
Cons of DeiT
- Limited to vision transformers, not covering other types of efficient AI backbones
- May require more computational resources for training compared to some lightweight models
Code Comparison
DeiT:

```python
class DistilledVisionTransformer(VisionTransformer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dist_token = nn.Parameter(torch.zeros(1, 1, self.embed_dim))
        num_patches = self.patch_embed.num_patches
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 2, self.embed_dim))
```
Efficient-AI-Backbones:

```python
class GhostModule(nn.Module):
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, relu=True):
        super(GhostModule, self).__init__()
        self.oup = oup
        # A few "intrinsic" channels come from a regular convolution; the
        # remaining "ghost" channels are generated by cheap depthwise ops.
        init_channels = math.ceil(oup / ratio)
        new_channels = init_channels * (ratio - 1)
        # ... (primary and cheap convolutions follow)
```
Both repositories focus on efficient AI models, but DeiT emphasizes vision transformers with a novel distillation approach, while Efficient-AI-Backbones offers a broader range of lightweight architectures for various tasks. DeiT provides more comprehensive training resources, while Efficient-AI-Backbones may be more suitable for resource-constrained environments.
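The "novel distillation approach" is DeiT's token-based distillation: a dedicated distillation token learns from a teacher's hard predictions alongside the usual class token. Below is a minimal sketch of the hard-distillation objective described in the DeiT paper (function and argument names are illustrative):

```python
import torch.nn.functional as F

def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels):
    """Average of the ground-truth loss (class token) and the
    teacher-imitation loss (distillation token)."""
    teacher_labels = teacher_logits.argmax(dim=1)  # teacher's hard predictions
    loss_cls = F.cross_entropy(cls_logits, labels)
    loss_dist = F.cross_entropy(dist_logits, teacher_labels)
    return 0.5 * (loss_cls + loss_dist)
```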
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Pros of pytorch-image-models
- Extensive collection of pre-trained models and architectures
- Active community and frequent updates
- Comprehensive documentation and examples
Cons of pytorch-image-models
- Larger repository size and potentially higher resource requirements
- May include more complex models that are not optimized for efficiency
Code Comparison
pytorch-image-models:

```python
import timm

model = timm.create_model('resnet50', pretrained=True)
output = model(input_tensor)
```
Efficient-AI-Backbones:

```python
from ghostnet import ghostnet

model = ghostnet(pretrained=True)
output = model(input_tensor)
```
Summary
pytorch-image-models offers a wide range of models and active community support, making it suitable for various image-related tasks. Efficient-AI-Backbones focuses on lightweight and efficient architectures, which may be more appropriate for resource-constrained environments or mobile applications. The code usage is similar: pytorch-image-models routes everything through the timm library's create_model factory, while Efficient-AI-Backbones exposes each architecture as its own module (for example, ghostnet).
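The two projects also overlap: timm bundles GhostNet variants of its own, so the same architecture can be pulled from either codebase. A quick check, assuming a recent timm release:

```python
import timm

# List the GhostNet variants bundled with timm
print(timm.list_models('ghostnet*'))  # e.g. ['ghostnet_050', 'ghostnet_100', 'ghostnet_130']
model = timm.create_model('ghostnet_100', pretrained=True)
```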
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Pros of Swin-Transformer
- More versatile architecture, applicable to a wider range of vision tasks
- Better performance on various benchmarks, especially for high-resolution images
- Extensive documentation and pre-trained models available
Cons of Swin-Transformer
- Higher computational complexity, potentially slower inference times
- Requires more training data to achieve optimal performance
- More complex implementation, which may be challenging for beginners
Code Comparison
Swin-Transformer:

```python
class SwinTransformer(nn.Module):
    def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes=1000,
                 embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24],
                 window_size=7, mlp_ratio=4., qkv_bias=True, qk_scale=None,
                 drop_rate=0., attn_drop_rate=0., drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm, ape=False, patch_norm=True,
                 use_checkpoint=False, **kwargs):
        super().__init__()
        # ... (implementation details)
```
Efficient-AI-Backbones:

```python
class GhostNet(nn.Module):
    def __init__(self, cfgs, num_classes=1000, width_mult=1.):
        super(GhostNet, self).__init__()
        self.cfgs = cfgs
        # ... (implementation details)
```
Both repositories offer efficient backbone architectures for computer vision tasks, but Swin-Transformer provides a more flexible and powerful approach at the cost of increased complexity and computational requirements. Efficient-AI-Backbones focuses on lightweight models optimized for mobile and edge devices.
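The complexity trade-off comes from Swin's window attention: self-attention is computed inside fixed-size local windows, keeping cost linear in image size. The partitioning step below is adapted from the official implementation (input assumed in B, H, W, C layout):

```python
def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    # Each row of the result is one (window_size x window_size) window
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)
    return windows
```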
OpenMMLab Detection Toolbox and Benchmark
Pros of mmdetection
- Comprehensive collection of object detection algorithms and models
- Extensive documentation and tutorials for easy adoption
- Active community support and frequent updates
Cons of mmdetection
- Larger codebase and potentially steeper learning curve
- May include unnecessary components for users focused solely on efficient backbones
Code Comparison
mmdetection:

```python
from mmdet.apis import inference_detector, init_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
```
Efficient-AI-Backbones:

```python
import torch
from ghostnet import ghostnet

model = ghostnet()
model.load_state_dict(torch.load('ghostnet_checkpoint.pth'))  # path is illustrative
```
The mmdetection code showcases its focus on complete detection pipelines, while Efficient-AI-Backbones emphasizes simplicity in backbone model creation and loading.
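In practice the two can be combined: mmdetection allows registering an external network as a detector backbone. A hypothetical sketch, assuming mmdetection 2.x's registry API and the ghostnet module on the Python path (a real integration would also expose multi-scale features for the FPN neck):

```python
import torch.nn as nn
from mmdet.models.builder import BACKBONES
from ghostnet import ghostnet  # from Efficient-AI-Backbones (assumed importable)

@BACKBONES.register_module()
class GhostNetBackbone(nn.Module):
    """Hypothetical wrapper exposing GhostNet to mmdetection configs."""

    def __init__(self, pretrained=False):
        super().__init__()
        self.net = ghostnet(pretrained=pretrained)

    def forward(self, x):
        # Placeholder: a real wrapper would tap the feature maps before
        # global pooling instead of returning the classifier output.
        return (self.net(x),)
```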
Models and examples built with TensorFlow
Pros of models
- Extensive collection of pre-trained models for various tasks
- Well-documented and maintained by the TensorFlow team
- Supports a wide range of deep learning applications
Cons of models
- Large repository size, potentially overwhelming for beginners
- May include unnecessary components for specific use cases
- Some models might be less optimized for efficiency
Code comparison
models:

```python
import tensorflow as tf
from official.vision.image_classification import resnet

model = resnet.resnet50(num_classes=1000)
```
Efficient-AI-Backbones:

```python
from ghostnet import ghostnet

model = ghostnet(num_classes=1000)
```
Key differences
- models focuses on a broad range of deep learning tasks and applications
- Efficient-AI-Backbones specializes in efficient network architectures
- models provides more comprehensive documentation and examples
- Efficient-AI-Backbones offers lightweight models optimized for mobile and edge devices
Use cases
- Choose models for general-purpose deep learning projects with TensorFlow
- Opt for Efficient-AI-Backbones when prioritizing model efficiency and deployment on resource-constrained devices
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- More comprehensive and feature-rich object detection framework
- Extensive documentation and community support
- Modular design allowing easy customization and extension
Cons of Detectron2
- Heavier and potentially slower for deployment in resource-constrained environments
- Steeper learning curve due to its extensive features and complexity
Code Comparison
Detectron2:

```python
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
```
Efficient-AI-Backbones:

```python
from ghostnet import ghostnet

model = ghostnet(pretrained=True)
outputs = model(image)
```
The code comparison shows that Detectron2 requires more setup and configuration, while Efficient-AI-Backbones offers a simpler API for model creation and inference. Detectron2 provides more flexibility and options, whereas Efficient-AI-Backbones focuses on lightweight and efficient models for quick deployment.
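That modularity cuts both ways: an efficient backbone could be slotted into Detectron2 through its backbone registry. A minimal sketch of the contract (the class name, channel counts, and strides are illustrative):

```python
import torch.nn as nn
from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyEfficientBackbone(Backbone):
    """Minimal example of Detectron2's backbone contract."""

    def __init__(self, cfg, input_shape):
        super().__init__()
        self.stem = nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3)

    def forward(self, x):
        # Detectron2 backbones return a dict of named feature maps
        return {"p4": self.stem(x)}

    def output_shape(self):
        # Heads use this to learn each map's channel count and stride
        return {"p4": ShapeSpec(channels=64, stride=4)}
```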
README
Efficient AI Backbones
Including GhostNet, TNT (Transformer in Transformer), AugViT, WaveMLP, and ViG, developed by Huawei Noah's Ark Lab.
News
- 2024/02/27 The ParameterNet paper is accepted by CVPR 2024.
- 2022/12/01 The code of GhostNetV2 (NeurIPS 2022 Spotlight) is released at ./ghostnetv2_pytorch.
- 2022/11/13 The code of G-Ghost RegNet (IJCV 2022) is released at ./g_ghost_pytorch.
- 2022/06/17 The code of Vision GNN (ViG, NeurIPS 2022) is released at ./vig_pytorch.
- 2022/02/06 Transformer in Transformer (TNT) is selected as one of the Most Influential NeurIPS 2021 Papers.
- 2021/09/18 The extended version of Versatile Filters is accepted by T-PAMI.
- 2021/08/30 The GhostNet paper is selected as one of the Most Influential CVPR 2020 Papers.
Model zoo
| Model | Paper | PyTorch code | MindSpore code |
|---|---|---|---|
| GhostNet | GhostNet: More Features from Cheap Operations. [CVPR 2020] | ./ghostnet_pytorch | MindSpore Model Zoo |
| GhostNetV2 | GhostNetV2: Enhance Cheap Operation with Long-Range Attention. [NeurIPS 2022 Spotlight] | ./ghostnetv2_pytorch | MindSpore Model Zoo |
| G-GhostNet | GhostNets on Heterogeneous Devices via Cheap Operations. [IJCV 2022] | ./g_ghost_pytorch | MindSpore Model Zoo |
| TinyNet | Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets. [NeurIPS 2020] | ./tinynet_pytorch | MindSpore Model Zoo |
| TNT | Transformer in Transformer. [NeurIPS 2021] | ./tnt_pytorch | MindSpore Model Zoo |
| PyramidTNT | PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture. [CVPR 2022 Workshop] | ./tnt_pytorch | MindSpore Model Zoo |
| CMT | CMT: Convolutional Neural Networks Meet Vision Transformers. [CVPR 2022] | ./cmt_pytorch | MindSpore Model Zoo |
| AugViT | Augmented Shortcuts for Vision Transformers. [NeurIPS 2021] | ./augvit_pytorch | MindSpore Model Zoo |
| SNN-MLP | Brain-inspired Multilayer Perceptron with Spiking Neurons. [CVPR 2022] | ./snnmlp_pytorch | MindSpore Model Zoo |
| WaveMLP | An Image Patch is a Wave: Quantum Inspired Vision MLP. [CVPR 2022] | ./wavemlp_pytorch | MindSpore Model Zoo |
| ViG | Vision GNN: An Image is Worth Graph of Nodes. [NeurIPS 2022] | ./vig_pytorch | - |
| LegoNet | LegoNet: Efficient Convolutional Neural Networks with Lego Filters. [ICML 2019] | ./legonet_pytorch | - |
| Versatile Filters | Learning Versatile Filters for Efficient Convolutional Neural Networks. [NeurIPS 2018] | ./versatile_filters | - |
| ParameterNet | ParameterNet: Parameters Are All You Need. [CVPR 2024] | ./parameternet_pytorch | - |