huawei-noah/Efficient-AI-Backbones

Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.

Top Related Projects

Official DeiT repository

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

OpenMMLab Detection Toolbox and Benchmark

Models and examples built with TensorFlow

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Quick Overview

The Efficient-AI-Backbones repository by Huawei Noah's Ark Lab contains implementations of efficient neural network architectures for various computer vision tasks. It focuses on lightweight models that can be deployed on resource-constrained devices while maintaining high accuracy.

Pros

  • Offers a collection of state-of-the-art efficient neural network architectures
  • Provides pre-trained models for quick deployment and fine-tuning
  • Includes detailed documentation and usage instructions
  • Supports multiple deep learning frameworks (PyTorch, TensorFlow, MindSpore)

Cons

  • Limited to computer vision tasks, not applicable to other domains
  • May require additional optimization for specific hardware platforms
  • Some architectures might be complex for beginners to understand and modify
  • Regular updates and maintenance may be needed to keep up with latest advancements

Code Examples

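  1. Loading a pre-trained GhostNet model in PyTorch:
import torch
from ghostnet import ghostnet  # importable when run from ./ghostnet_pytorch

# Build GhostNet, then load pre-trained weights.
# Assumption: weights come from a local checkpoint; the path is a placeholder.
model = ghostnet()
model.load_state_dict(torch.load("path/to/ghostnet_checkpoint.pth", map_location="cpu"))
model.eval()

# Prepare a dummy input batch: one 3-channel 224x224 image
input_tensor = torch.randn(1, 3, 224, 224)

# Run inference without tracking gradients
with torch.no_grad():
    output = model(input_tensor)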
  2. Creating a custom GhostNet model with a specific width multiplier:
from ghostnet import ghostnet

# Create a GhostNet model with a width multiplier of 0.5.
# Assumption: the factory exposes the multiplier as `width`; check its signature.
model = ghostnet(width=0.5)
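  3. Fine-tuning a pre-trained model on a custom dataset:
import torch
from torch.utils.data import DataLoader, TensorDataset
from ghostnet import ghostnet

# Load pre-trained weights (placeholder checkpoint path)
model = ghostnet()
model.load_state_dict(torch.load("path/to/ghostnet_checkpoint.pth", map_location="cpu"))

# Replace the last fully connected layer
num_classes = 10
model.classifier = torch.nn.Linear(model.classifier.in_features, num_classes)

# Placeholder data; replace with a DataLoader over your own dataset
dataset = TensorDataset(torch.randn(8, 3, 224, 224), torch.randint(0, num_classes, (8,)))
dataloader = DataLoader(dataset, batch_size=4, shuffle=True)

# Define loss function and optimizer
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

# Training loop (simplified)
num_epochs = 5
model.train()
for epoch in range(num_epochs):
    for inputs, labels in dataloader:
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()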
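
The width multiplier in the second example scales the channel count of every layer. The sketch below shows one way such scaling is commonly done in MobileNet-style code: multiply, then round to a multiple of a divisor without dropping more than 10% below the requested width. The helper names, the divisor of 4, and the base channel list are illustrative assumptions, not taken from the repository.

```python
def make_divisible(value, divisor=4, min_value=None):
    """Round `value` to the nearest multiple of `divisor` (hypothetical helper)."""
    if min_value is None:
        min_value = divisor
    rounded = max(min_value, int(value + divisor / 2) // divisor * divisor)
    # Avoid shrinking the layer by more than 10% through rounding.
    if rounded < 0.9 * value:
        rounded += divisor
    return rounded


def scale_channels(base_channels, width):
    """Apply a width multiplier to a list of per-stage channel counts."""
    return [make_divisible(c * width) for c in base_channels]


base = [16, 24, 40, 80, 112, 160]  # illustrative stage widths, not the real config
half = scale_channels(base, 0.5)
wider = scale_channels(base, 1.3)
print(half)   # [8, 12, 20, 40, 56, 80]
print(wider)  # [20, 32, 52, 104, 144, 208]
```

Rounding to a multiple of the divisor keeps channel counts hardware-friendly, which is why a multiplier of 1.3 yields 32 rather than 31 channels for the second stage.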

Getting Started

To get started with Efficient-AI-Backbones:

  1. Clone the repository:

    git clone https://github.com/huawei-noah/Efficient-AI-Backbones.git
    cd Efficient-AI-Backbones
    
  2. Install dependencies:

    pip install -r requirements.txt
    
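  3. Import and use a model in your Python script:

    import torch
    from ghostnet import ghostnet  # run from ./ghostnet_pytorch so the module is importable

    model = ghostnet()
    # Placeholder path: point this at a downloaded checkpoint
    model.load_state_dict(torch.load("path/to/ghostnet_checkpoint.pth", map_location="cpu"))
    # Use the model for inference or fine-tuning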
    

For more detailed instructions and examples, refer to the documentation in the repository's README file.

Competitor Comparisons

Official DeiT repository

Pros of DeiT

  • Focuses on data-efficient image transformers, offering a novel distillation approach
  • Provides pre-trained models and training scripts for easy reproduction of results
  • Achieves competitive performance with fewer parameters and less training data

Cons of DeiT

  • Limited to vision transformers, not covering other types of efficient AI backbones
  • May require more computational resources for training compared to some lightweight models

Code Comparison

DeiT:

class DistilledVisionTransformer(VisionTransformer):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.dist_token = nn.Parameter(torch.zeros(1, 1, self.embed_dim))
        num_patches = self.patch_embed.num_patches
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 2, self.embed_dim))
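
The `num_patches + 2` in the positional embedding above leaves room for two extra tokens besides the image patches: the class token and the distillation token. A small worked example of the resulting sequence length follows, assuming square images and non-overlapping square patches. The function name and the 224 / 16 / 192 values are illustrative.

```python
def deit_sequence_length(img_size, patch_size, extra_tokens=2):
    """Tokens seen by the transformer: image patches + class + distillation."""
    patches_per_side = img_size // patch_size
    num_patches = patches_per_side ** 2
    return num_patches + extra_tokens


seq_len = deit_sequence_length(224, 16)
embed_dim = 192  # illustrative embedding width
pos_embed_shape = (1, seq_len, embed_dim)
print(seq_len, pos_embed_shape)  # 198 (1, 198, 192)
```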

Efficient-AI-Backbones:

class GhostModule(nn.Module):
    def __init__(self, inp, oup, kernel_size=1, ratio=2, dw_size=3, stride=1, relu=True):
        super(GhostModule, self).__init__()
        self.oup = oup
        init_channels = math.ceil(oup / ratio)
        new_channels = init_channels*(ratio-1)

Both repositories focus on efficient AI models, but DeiT emphasizes vision transformers with a novel distillation approach, while Efficient-AI-Backbones offers a broader range of lightweight architectures for various tasks. DeiT provides more comprehensive training resources, while Efficient-AI-Backbones may be more suitable for resource-constrained environments.
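
The truncated GhostModule constructor above splits the output channels into `init_channels` and `new_channels`. The sketch below works through that arithmetic and the resulting weight counts. It assumes the first group comes from an ordinary convolution and the second from a depthwise convolution with a `dw_size` kernel, that surplus channels are dropped when the two groups exceed `oup`, and that biases and normalization parameters are ignored. The function names are hypothetical.

```python
import math


def ghost_channels(oup, ratio=2):
    """Channel split used by the constructor shown above."""
    init_channels = math.ceil(oup / ratio)
    new_channels = init_channels * (ratio - 1)
    return init_channels, new_channels


def ghost_weight_count(inp, oup, kernel_size=1, ratio=2, dw_size=3):
    """Weights of the primary conv plus the assumed depthwise 'cheap' conv."""
    init_channels, new_channels = ghost_channels(oup, ratio)
    primary = inp * init_channels * kernel_size ** 2
    cheap = new_channels * dw_size ** 2  # depthwise: one filter per output channel
    return primary + cheap


def standard_conv_weight_count(inp, oup, kernel_size=1):
    """Weights of a plain convolution with the same input/output channels."""
    return inp * oup * kernel_size ** 2


print(ghost_channels(32))                           # (16, 16)
print(ghost_channels(33))                           # (17, 17): 34 channels, one surplus
print(ghost_weight_count(16, 32, kernel_size=3))    # 2448
print(standard_conv_weight_count(16, 32, kernel_size=3))  # 4608
```

Under these assumptions, a 3x3 layer with 16 input and 32 output channels needs roughly half the weights of a plain convolution.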

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more

Pros of pytorch-image-models

  • Extensive collection of pre-trained models and architectures
  • Active community and frequent updates
  • Comprehensive documentation and examples

Cons of pytorch-image-models

  • Larger repository size and potentially higher resource requirements
  • May include more complex models that are not optimized for efficiency

Code Comparison

pytorch-image-models:

import timm
model = timm.create_model('resnet50', pretrained=True)
output = model(input_tensor)

Efficient-AI-Backbones:

from ghostnet import ghostnet
model = ghostnet()
output = model(input_tensor)

Summary

pytorch-image-models offers a wide range of models and active community support, making it suitable for various image-related tasks. Efficient-AI-Backbones focuses on lightweight and efficient architectures, which may be more appropriate for resource-constrained environments or mobile applications. Usage is similar: pytorch-image-models builds models by name through the timm library, while the GhostNet code in Efficient-AI-Backbones is imported from its own module.

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".

Pros of Swin-Transformer

  • More versatile architecture, applicable to a wider range of vision tasks
  • Better performance on various benchmarks, especially for high-resolution images
  • Extensive documentation and pre-trained models available

Cons of Swin-Transformer

  • Higher computational complexity, potentially slower inference times
  • Requires more training data to achieve optimal performance
  • More complex implementation, which may be challenging for beginners

Code Comparison

Swin-Transformer:

class SwinTransformer(nn.Module):
    def __init__(self, img_size=224, patch_size=4, in_chans=3, num_classes=1000,
                 embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24],
                 window_size=7, mlp_ratio=4., qkv_bias=True, qk_scale=None,
                 drop_rate=0., attn_drop_rate=0., drop_path_rate=0.1,
                 norm_layer=nn.LayerNorm, ape=False, patch_norm=True,
                 use_checkpoint=False, **kwargs):
        super().__init__()
        # ... (implementation details)

Efficient-AI-Backbones:

class GhostNet(nn.Module):
    def __init__(self, cfgs, num_classes=1000, width=1.0, dropout=0.2):
        super(GhostNet, self).__init__()
        self.cfgs = cfgs
        # ... (implementation details)

Both repositories offer efficient backbone architectures for computer vision tasks, but Swin-Transformer provides a more flexible and powerful approach at the cost of increased complexity and computational requirements. Efficient-AI-Backbones focuses on lightweight models optimized for mobile and edge devices.
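
The default arguments in the SwinTransformer signature above imply a four-stage pyramid. The sketch below derives the per-stage token grid, channel width, window count, and per-head dimension. It assumes each stage transition halves the spatial resolution and doubles the channel width, as in the hierarchical design the paper title refers to. The function name and the dictionary layout are illustrative.

```python
def swin_stage_layout(img_size=224, patch_size=4, embed_dim=96,
                      depths=(2, 2, 6, 2), num_heads=(3, 6, 12, 24),
                      window_size=7):
    """Per-stage shapes implied by the constructor defaults (hypothetical helper)."""
    layout = []
    resolution = img_size // patch_size  # tokens per side after patch embedding
    dim = embed_dim
    for stage, (depth, heads) in enumerate(zip(depths, num_heads), start=1):
        layout.append({
            "stage": stage,
            "resolution": resolution,
            "dim": dim,
            "depth": depth,
            "windows": (resolution // window_size) ** 2,
            "head_dim": dim // heads,
        })
        # Assumed downsampling between stages: halve resolution, double channels.
        resolution //= 2
        dim *= 2
    return layout


layout = swin_stage_layout()
for row in layout:
    print(row)
```

With the defaults, the token grid shrinks from 56x56 to 7x7 while the width grows from 96 to 768, and the per-head dimension stays at 32 in every stage.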

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

  • Comprehensive collection of object detection algorithms and models
  • Extensive documentation and tutorials for easy adoption
  • Active community support and frequent updates

Cons of mmdetection

  • Larger codebase and potentially steeper learning curve
  • May include unnecessary components for users focused solely on efficient backbones

Code Comparison

mmdetection:

from mmdet.apis import inference_detector, init_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

Efficient-AI-Backbones:

import torch
from ghostnet import ghostnet

model = ghostnet(num_classes=1000)
model.load_state_dict(torch.load('path/to/ghostnet_checkpoint.pth'))  # placeholder path

The mmdetection code showcases its focus on complete detection pipelines, while Efficient-AI-Backbones emphasizes simplicity in backbone model creation and loading.

Models and examples built with TensorFlow

Pros of models

  • Extensive collection of pre-trained models for various tasks
  • Well-documented and maintained by the TensorFlow team
  • Supports a wide range of deep learning applications

Cons of models

  • Large repository size, potentially overwhelming for beginners
  • May include unnecessary components for specific use cases
  • Some models might be less optimized for efficiency

Code Comparison

models:

import tensorflow as tf
from official.vision.image_classification import resnet
model = resnet.resnet50(num_classes=1000)

Efficient-AI-Backbones:

from ghostnet import ghostnet
model = ghostnet(num_classes=1000)

Key differences

  • models focuses on a broad range of deep learning tasks and applications
  • Efficient-AI-Backbones specializes in efficient network architectures
  • models provides more comprehensive documentation and examples
  • Efficient-AI-Backbones offers lightweight models optimized for mobile and edge devices

Use cases

  • Choose models for general-purpose deep learning projects with TensorFlow
  • Opt for Efficient-AI-Backbones when prioritizing model efficiency and deployment on resource-constrained devices

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

  • More comprehensive and feature-rich object detection framework
  • Extensive documentation and community support
  • Modular design allowing easy customization and extension

Cons of Detectron2

  • Heavier and potentially slower for deployment in resource-constrained environments
  • Steeper learning curve due to its extensive features and complexity

Code Comparison

Detectron2:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

Efficient-AI-Backbones:

import torch
from ghostnet import ghostnet

model = ghostnet()
model.load_state_dict(torch.load('path/to/ghostnet_checkpoint.pth'))  # placeholder path
outputs = model(image)

The code comparison shows that Detectron2 requires more setup and configuration, while Efficient-AI-Backbones offers a simpler API for model creation and inference. Detectron2 provides more flexibility and options, whereas Efficient-AI-Backbones focuses on lightweight and efficient models for quick deployment.

README

Efficient AI Backbones

Efficient AI backbones including GhostNet, TNT (Transformer in Transformer), AugViT, WaveMLP and ViG, developed by Huawei Noah's Ark Lab.

News

2024/02/27 The ParameterNet paper was accepted by CVPR 2024.

2022/12/01 The code for GhostNetV2 (NeurIPS 2022 Spotlight) was released at ./ghostnetv2_pytorch.

2022/11/13 The code for G-Ghost RegNet (IJCV 2022) was released at ./g_ghost_pytorch.

2022/06/17 The code for Vision GNN (ViG, NeurIPS 2022) was released at ./vig_pytorch.

2022/02/06 Transformer in Transformer (TNT) was selected as one of the Most Influential NeurIPS 2021 Papers.

2021/09/18 The extended version of Versatile Filters was accepted by T-PAMI.

2021/08/30 The GhostNet paper was selected as one of the Most Influential CVPR 2020 Papers.

Model zoo

| Model | Paper | PyTorch code | MindSpore code |
|---|---|---|---|
| GhostNet | GhostNet: More Features from Cheap Operations. [CVPR 2020] | ./ghostnet_pytorch | MindSpore Model Zoo |
| GhostNetV2 | GhostNetV2: Enhance Cheap Operation with Long-Range Attention. [NeurIPS 2022 Spotlight] | ./ghostnetv2_pytorch | MindSpore Model Zoo |
| G-GhostNet | GhostNets on Heterogeneous Devices via Cheap Operations. [IJCV 2022] | ./g_ghost_pytorch | MindSpore Model Zoo |
| TinyNet | Model Rubik’s Cube: Twisting Resolution, Depth and Width for TinyNets. [NeurIPS 2020] | ./tinynet_pytorch | MindSpore Model Zoo |
| TNT | Transformer in Transformer. [NeurIPS 2021] | ./tnt_pytorch | MindSpore Model Zoo |
| PyramidTNT | PyramidTNT: Improved Transformer-in-Transformer Baselines with Pyramid Architecture. [CVPR 2022 Workshop] | ./tnt_pytorch | MindSpore Model Zoo |
| CMT | CMT: Convolutional Neural Networks Meet Vision Transformers. [CVPR 2022] | ./cmt_pytorch | MindSpore Model Zoo |
| AugViT | Augmented Shortcuts for Vision Transformers. [NeurIPS 2021] | ./augvit_pytorch | MindSpore Model Zoo |
| SNN-MLP | Brain-inspired Multilayer Perceptron with Spiking Neurons. [CVPR 2022] | ./snnmlp_pytorch | MindSpore Model Zoo |
| WaveMLP | An Image Patch is a Wave: Quantum Inspired Vision MLP. [CVPR 2022] | ./wavemlp_pytorch | MindSpore Model Zoo |
| ViG | Vision GNN: An Image is Worth Graph of Nodes. [NeurIPS 2022] | ./vig_pytorch | - |
| LegoNet | LegoNet: Efficient Convolutional Neural Networks with Lego Filters. [ICML 2019] | ./legonet_pytorch | - |
| Versatile Filters | Learning Versatile Filters for Efficient Convolutional Neural Networks. [NeurIPS 2018] | ./versatile_filters | - |
| ParameterNet | ParameterNet: Parameters Are All You Need. [CVPR 2024] | ./parameternet_pytorch | - |