Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
Top Related Projects
- facebookresearch/deit: Official DeiT repository
- huggingface/pytorch-image-models: The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
- microsoft/Swin-Transformer: Official implementation of "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows"
- open-mmlab/mmdetection: OpenMMLab Detection Toolbox and Benchmark
- tensorflow/models: Models and examples built with TensorFlow
- facebookresearch/detectron2: Platform for object detection, segmentation and other visual recognition tasks
Quick Overview
The Efficient-AI-Backbones repository by Huawei Noah's Ark Lab contains implementations of efficient neural network architectures for various computer vision tasks. It focuses on lightweight models that can be deployed on resource-constrained devices while maintaining high accuracy.
Pros
- Offers a collection of state-of-the-art efficient neural network architectures
- Provides pre-trained models for quick deployment and fine-tuning
- Includes detailed documentation and usage instructions
- Supports multiple deep learning frameworks (PyTorch, MindSpore)
Cons
- Limited to computer vision tasks, not applicable to other domains
- May require additional optimization for specific hardware platforms
- Some architectures might be complex for beginners to understand and modify
- Regular updates and maintenance may be needed to keep up with latest advancements
Code Examples
- Loading a pre-trained GhostNet model in PyTorch:

```python
import torch
from ghostnet import ghostnet

# Load the pre-trained GhostNet model
model = ghostnet(pretrained=True)
model.eval()

# Prepare a dummy input tensor (batch of one 224x224 RGB image)
input_tensor = torch.randn(1, 3, 224, 224)

# Run inference
with torch.no_grad():
    output = model(input_tensor)
```
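To turn the raw logits into predictions, apply a softmax and take the most likely classes; a minimal sketch, assuming the default 1000-class ImageNet head:

```python
import torch.nn.functional as F

# Convert logits to probabilities and take the five most likely classes
probabilities = F.softmax(output, dim=1)
top5_prob, top5_idx = probabilities.topk(5, dim=1)
print(top5_idx[0].tolist())
print(top5_prob[0].tolist())
```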
- Creating a custom GhostNet model with a specific width multiplier:

```python
from ghostnet import ghostnet

# Create a GhostNet model with a width multiplier of 0.5
model = ghostnet(width_mult=0.5)
```
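A quick way to see the effect of the multiplier is to compare parameter counts; a sketch, assuming the width_mult argument shown above:

```python
from ghostnet import ghostnet

# Compare model sizes at different width multipliers
for width in (0.5, 1.0, 1.5):
    model = ghostnet(width_mult=width)
    n_params = sum(p.numel() for p in model.parameters())
    print(f"width_mult={width}: {n_params / 1e6:.2f}M parameters")
```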
- Fine-tuning a pre-trained model on a custom dataset:

```python
import torch
from ghostnet import ghostnet

# Load the pre-trained model
model = ghostnet(pretrained=True)

# Replace the last fully connected layer to match the new task
num_classes = 10
model.classifier = torch.nn.Linear(model.classifier.in_features, num_classes)

# Define loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training loop (simplified; num_epochs and dataloader are defined elsewhere)
for epoch in range(num_epochs):
    for inputs, labels in dataloader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
```
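The loop above assumes a dataloader already exists; a minimal sketch of building one with torchvision (paths and hyperparameters are illustrative):

```python
import torch
from torchvision import datasets, transforms

# ImageNet-style preprocessing for 224x224 inputs
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Expects one subfolder per class, e.g. path/to/train/<class_name>/*.jpg
dataset = datasets.ImageFolder("path/to/train", transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
num_epochs = 10
```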
Getting Started
To get started with Efficient-AI-Backbones:
1. Clone the repository:

```bash
git clone https://github.com/huawei-noah/Efficient-AI-Backbones.git
cd Efficient-AI-Backbones
```

2. Install dependencies:

```bash
pip install -r requirements.txt
```

3. Import and use a model in your Python script:

```python
from ghostnet import ghostnet

model = ghostnet(pretrained=True)
# Use the model for inference or fine-tuning
```
For more detailed instructions and examples, refer to the documentation in the repository's README file.
Competitor Comparisons
Official DeiT repository
Pros of DeiT
- Focuses on data-efficient image transformers, offering a novel distillation approach
- Provides pre-trained models and training scripts for easy reproduction of results
- Achieves competitive performance with fewer parameters and less training data
Cons of DeiT
- Limited to vision transformers, not covering other types of efficient AI backbones
- May require more computational resources for training compared to some lightweight models
Code Comparison
DeiT:

```python
class DistilledVisionTransformer(VisionTransformer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dist_token = nn.Parameter(torch.zeros(1, 1, self.embed_dim))
        num_patches = self.patch_embed.num_patches
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 2, self.embed_dim))
```
Efficient-AI-Backbones:

```python
class GhostModule(nn.Module):
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, relu=True):
        super(GhostModule, self).__init__()
        self.oup = oup
        # A few "intrinsic" channels come from a regular convolution; the
        # remaining "ghost" channels are generated by cheap depthwise ops.
        init_channels = math.ceil(oup / ratio)
        new_channels = init_channels * (ratio - 1)
        # ... (primary and cheap convolutions follow)
```
Both repositories focus on efficient AI models, but DeiT emphasizes vision transformers with a novel distillation approach, while Efficient-AI-Backbones offers a broader range of lightweight architectures for various tasks. DeiT provides more comprehensive training resources, while Efficient-AI-Backbones may be more suitable for resource-constrained environments.
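The "novel distillation approach" is DeiT's token-based distillation: a dedicated distillation token learns from a teacher's hard predictions alongside the usual class token. Below is a minimal sketch of the hard-distillation objective described in the DeiT paper (function and argument names are illustrative):

```python
import torch.nn.functional as F

def hard_distillation_loss(cls_logits, dist_logits, teacher_logits, labels):
    """Average of the ground-truth loss (class token) and the
    teacher-imitation loss (distillation token)."""
    teacher_labels = teacher_logits.argmax(dim=1)  # teacher's hard predictions
    loss_cls = F.cross_entropy(cls_logits, labels)
    loss_dist = F.cross_entropy(dist_logits, teacher_labels)
    return 0.5 * (loss_cls + loss_dist)
```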
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more
Pros of pytorch-image-models
- Extensive collection of pre-trained models and architectures
- Active community and frequent updates
- Comprehensive documentation and examples
Cons of pytorch-image-models
- Larger repository size and potentially higher resource requirements
- May include more complex models that are not optimized for efficiency
Code Comparison
pytorch-image-models:

```python
import timm

model = timm.create_model('resnet50', pretrained=True)
output = model(input_tensor)
```
Efficient-AI-Backbones:

```python
from ghostnet import ghostnet

model = ghostnet(pretrained=True)
output = model(input_tensor)
```
Summary
pytorch-image-models offers a wide range of models and active community support, making it suitable for various image-related tasks. Efficient-AI-Backbones focuses on lightweight and efficient architectures, which may be more appropriate for resource-constrained environments or mobile applications. The code usage is similar: pytorch-image-models routes everything through the timm library's create_model factory, while Efficient-AI-Backbones exposes each architecture as its own module (for example, ghostnet).
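The two projects also overlap: timm bundles GhostNet variants of its own, so the same architecture can be pulled from either codebase. A quick check, assuming a recent timm release:

```python
import timm

# List the GhostNet variants bundled with timm
print(timm.list_models('ghostnet*'))  # e.g. ['ghostnet_050', 'ghostnet_100', 'ghostnet_130']
model = timm.create_model('ghostnet_100', pretrained=True)
```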
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Pros of Swin-Transformer
- More versatile architecture, applicable to a wider range of vision tasks
- Better performance on various benchmarks, especially for high-resolution images
- Extensive documentation and pre-trained models available
Cons of Swin-Transformer
- Higher computational complexity, potentially slower inference times
- Requires more training data to achieve optimal performance
- More complex implementation, which may be challenging for beginners
Code Comparison
Swin-Transformer:

```python
class SwinTransformer(nn.Module):
    def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes=1000,
                 embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24],
                 window_size=7, mlp_ratio=4., qkv_bias=True, qk_scale=None,
                 drop_rate=0., attn_drop_rate=0., drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm, ape=False, patch_norm=True,
                 use_checkpoint=False, **kwargs):
        super().__init__()
        # ... (implementation details)
```
Efficient-AI-Backbones:

```python
class GhostNet(nn.Module):
    def __init__(self, cfgs, num_classes=1000, width_mult=1.):
        super(GhostNet, self).__init__()
        self.cfgs = cfgs
        # ... (implementation details)
```
Both repositories offer efficient backbone architectures for computer vision tasks, but Swin-Transformer provides a more flexible and powerful approach at the cost of increased complexity and computational requirements. Efficient-AI-Backbones focuses on lightweight models optimized for mobile and edge devices.
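The complexity trade-off comes from Swin's window attention: self-attention is computed inside fixed-size local windows, keeping cost linear in image size. The partitioning step below is adapted from the official implementation (input assumed in B, H, W, C layout):

```python
def window_partition(x, window_size):
    """Split a (B, H, W, C) feature map into non-overlapping windows."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    # Each row of the result is one (window_size x window_size) window
    windows = x.permute(0, 1, 3, 2, 4, 5).contiguous().view(-1, window_size, window_size, C)
    return windows
```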
OpenMMLab Detection Toolbox and Benchmark
Pros of mmdetection
- Comprehensive collection of object detection algorithms and models
- Extensive documentation and tutorials for easy adoption
- Active community support and frequent updates
Cons of mmdetection
- Larger codebase and potentially steeper learning curve
- May include unnecessary components for users focused solely on efficient backbones
Code Comparison
mmdetection:

```python
from mmdet.apis import inference_detector, init_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
```
Efficient-AI-Backbones:

```python
import torch
from ghostnet import ghostnet

model = ghostnet()
model.load_state_dict(torch.load('ghostnet_checkpoint.pth'))  # path is illustrative
```
The mmdetection code showcases its focus on complete detection pipelines, while Efficient-AI-Backbones emphasizes simplicity in backbone model creation and loading.
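In practice the two can be combined: mmdetection allows registering an external network as a detector backbone. A hypothetical sketch, assuming mmdetection 2.x's registry API and the ghostnet module on the Python path (a real integration would also expose multi-scale features for the FPN neck):

```python
import torch.nn as nn
from mmdet.models.builder import BACKBONES
from ghostnet import ghostnet  # from Efficient-AI-Backbones (assumed importable)

@BACKBONES.register_module()
class GhostNetBackbone(nn.Module):
    """Hypothetical wrapper exposing GhostNet to mmdetection configs."""

    def __init__(self, pretrained=False):
        super().__init__()
        self.net = ghostnet(pretrained=pretrained)

    def forward(self, x):
        # Placeholder: a real wrapper would tap the feature maps before
        # global pooling instead of returning the classifier output.
        return (self.net(x),)
```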
Models and examples built with TensorFlow
Pros of models
- Extensive collection of pre-trained models for various tasks
- Well-documented and maintained by the TensorFlow team
- Supports a wide range of deep learning applications
Cons of models
- Large repository size, potentially overwhelming for beginners
- May include unnecessary components for specific use cases
- Some models might be less optimized for efficiency
Code comparison
models:

```python
import tensorflow as tf
from official.vision.image_classification import resnet

model = resnet.resnet50(num_classes=1000)
```
Efficient-AI-Backbones:

```python
from ghostnet import ghostnet

model = ghostnet(num_classes=1000)
```
Key differences
- models focuses on a broad range of deep learning tasks and applications
- Efficient-AI-Backbones specializes in efficient network architectures
- models provides more comprehensive documentation and examples
- Efficient-AI-Backbones offers lightweight models optimized for mobile and edge devices
Use cases
- Choose models for general-purpose deep learning projects with TensorFlow
- Opt for Efficient-AI-Backbones when prioritizing model efficiency and deployment on resource-constrained devices
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- More comprehensive and feature-rich object detection framework
- Extensive documentation and community support
- Modular design allowing easy customization and extension
Cons of Detectron2
- Heavier and potentially slower for deployment in resource-constrained environments
- Steeper learning curve due to its extensive features and complexity
Code Comparison
Detectron2:

```python
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
```
Efficient-AI-Backbones:

```python
from ghostnet import ghostnet

model = ghostnet(pretrained=True)
outputs = model(image)
```
The code comparison shows that Detectron2 requires more setup and configuration, while Efficient-AI-Backbones offers a simpler API for model creation and inference. Detectron2 provides more flexibility and options, whereas Efficient-AI-Backbones focuses on lightweight and efficient models for quick deployment.
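That modularity cuts both ways: an efficient backbone could be slotted into Detectron2 through its backbone registry. A minimal sketch of the contract (the class name, channel counts, and strides are illustrative):

```python
import torch.nn as nn
from detectron2.modeling import BACKBONE_REGISTRY, Backbone, ShapeSpec

@BACKBONE_REGISTRY.register()
class ToyEfficientBackbone(Backbone):
    """Minimal example of Detectron2's backbone contract."""

    def __init__(self, cfg, input_shape):
        super().__init__()
        self.stem = nn.Conv2d(3, 64, kernel_size=7, stride=4, padding=3)

    def forward(self, x):
        # Detectron2 backbones return a dict of named feature maps
        return {"p4": self.stem(x)}

    def output_shape(self):
        # Heads use this to learn each map's channel count and stride
        return {"p4": ShapeSpec(channels=64, stride=4)}
```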
README
Efficient AI Backbones
Including GhostNet, TNT (Transformer in Transformer), AugViT, WaveMLP, and ViG, developed by Huawei Noah's Ark Lab.
News
- 2024/02/27 The ParameterNet paper is accepted by CVPR 2024.
- 2022/12/01 The code of GhostNetV2 (NeurIPS 2022 Spotlight) is released at ./ghostnetv2_pytorch.
- 2022/11/13 The code of G-Ghost RegNet (IJCV 2022) is released at ./g_ghost_pytorch.
- 2022/06/17 The code of Vision GNN (ViG, NeurIPS 2022) is released at ./vig_pytorch.
- 2022/02/06 Transformer in Transformer (TNT) is selected as one of the Most Influential NeurIPS 2021 Papers.
- 2021/09/18 The extended version of Versatile Filters is accepted by T-PAMI.
- 2021/08/30 The GhostNet paper is selected as one of the Most Influential CVPR 2020 Papers.
Model zoo
| Model | Paper | PyTorch code | MindSpore code |
|---|---|---|---|
| GhostNet | GhostNet: More Features from Cheap Operations. [CVPR 2020] | ./ghostnet_pytorch | MindSpore Model Zoo |
| GhostNetV2 | GhostNetV2: Enhance Cheap Operation with Long-Range Attention. [NeurIPS 2022 Spotlight] | ./ghostnetv2_pytorch | MindSpore Model Zoo |
| G-GhostNet | GhostNets on Heterogeneous Devices via Cheap Operations. [IJCV 2022] | ./g_ghost_pytorch | MindSpore Model Zoo |
| TinyNet | Model Rubik's Cube: Twisting Resolution, Depth and Width for TinyNets. [NeurIPS 2020] | ./tinynet_pytorch | MindSpore Model Zoo |
| TNT | Transformer in Transformer. [NeurIPS 2021] | ./tnt_pytorch | MindSpore Model Zoo |
| PyramidTNT | PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture. [CVPR 2022 Workshop] | ./tnt_pytorch | MindSpore Model Zoo |
| CMT | CMT: Convolutional Neural Networks Meet Vision Transformers. [CVPR 2022] | ./cmt_pytorch | MindSpore Model Zoo |
| AugViT | Augmented Shortcuts for Vision Transformers. [NeurIPS 2021] | ./augvit_pytorch | MindSpore Model Zoo |
| SNN-MLP | Brain-inspired Multilayer Perceptron with Spiking Neurons. [CVPR 2022] | ./snnmlp_pytorch | MindSpore Model Zoo |
| WaveMLP | An Image Patch is a Wave: Quantum Inspired Vision MLP. [CVPR 2022] | ./wavemlp_pytorch | MindSpore Model Zoo |
| ViG | Vision GNN: An Image is Worth Graph of Nodes. [NeurIPS 2022] | ./vig_pytorch | - |
| LegoNet | LegoNet: Efficient Convolutional Neural Networks with Lego Filters. [ICML 2019] | ./legonet_pytorch | - |
| Versatile Filters | Learning Versatile Filters for Efficient Convolutional Neural Networks. [NeurIPS 2018] | ./versatile_filters | - |
| ParameterNet | ParameterNet: Parameters Are All You Need. [CVPR 2024] | ./parameternet_pytorch | - |