Top Related Projects
Models and examples built with TensorFlow
Datasets, Transforms and Models specific to Computer Vision
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Pre-trained Deep Learning models and demos (high quality and extremely fast)
Quick Overview
The ailia-models repository is a collection of pre-trained deep learning models optimized for the ailia SDK. It provides a wide range of AI models for various tasks such as image classification, object detection, pose estimation, and more. The repository serves as a resource for developers to easily integrate AI capabilities into their applications using the ailia SDK.
Pros
- Extensive collection of pre-trained models for diverse AI tasks
- Optimized for efficient inference using the ailia SDK
- Well-documented with usage examples and model information
- Regular updates and additions of new models
Cons
- Requires the ailia SDK, which may have licensing restrictions
- Limited to models compatible with the ailia framework
- May require additional setup and dependencies for specific models
- Some models may have varying levels of performance or accuracy
Code Examples
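import ailia
import cv2
import numpy as np
# Load an image classification model (network definition first, then weights)
model = ailia.Net('resnet50.prototxt', 'resnet50.onnx')
# Read the image with OpenCV
img = cv2.imread('sample.jpg')
# Convert HWC uint8 to a 1x3x224x224 float32 batch (exact preprocessing is model-specific)
img = cv2.resize(img, (224, 224)).astype(np.float32)
input_image = np.expand_dims(img.transpose(2, 0, 1), axis=0)
# Perform inference on the image
output = model.predict(input_image)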
# Object detection using YOLOv3
detector = ailia.Detector('yolov3.onnx', 'yolov3.prototxt', 'coco.names')
img = ailia.imread('street.jpg')
boxes, scores, labels = detector.detect(img)
# Pose estimation using OpenPose
pose_estimator = ailia.PoseEstimator('openpose.onnx', 'openpose.prototxt')
img = ailia.imread('person.jpg')
poses = pose_estimator.estimate(img)
Getting Started
- Install the ailia SDK (pip3 install ailia, or follow the instructions on the ailia website)
- Clone the repository:
git clone https://github.com/axinc-ai/ailia-models.git
- Install required dependencies:
pip install -r requirements.txt
- Choose a model and follow the specific usage instructions in its directory
- Run the sample script for the chosen model:
python3 <model_name>.py
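The last two steps can be sketched in Python. This is a minimal illustration, not part of the repository: it assumes the clone is organised as one directory per task category with one sub-directory per model, and the helper names find_model_dir and list_sample_scripts are hypothetical.

```python
from pathlib import Path


def find_model_dir(repo_root, model_name):
    """Return <repo_root>/<category>/<model_name> if it exists, else None.

    Assumes a <category>/<model_name>/ layout inside the cloned repository.
    """
    for category in sorted(Path(repo_root).iterdir()):
        candidate = category / model_name
        if candidate.is_dir():
            return candidate
    return None


def list_sample_scripts(model_dir):
    """Return the names of the Python scripts directly inside a model directory."""
    return sorted(path.name for path in Path(model_dir).glob("*.py"))
```

For example, find_model_dir("ailia-models", "yolov3") would return the model directory if it is present, and list_sample_scripts on that directory shows which script to run from inside it.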
Competitor Comparisons
Models and examples built with TensorFlow
Pros of TensorFlow Models
- Extensive collection of pre-trained models and implementations
- Strong community support and regular updates
- Comprehensive documentation and tutorials
Cons of TensorFlow Models
- Large repository size, potentially overwhelming for beginners
- Primarily focused on the TensorFlow framework, limiting flexibility
- Some models may require significant computational resources
Code Comparison
TensorFlow Models:
import tensorflow as tf
from official.nlp import bert
model = bert.BertModel(config=bert.BertConfig())
ailia-models:
import ailia
model = ailia.Net("bert.prototxt", "bert.onnx")  # network definition first, then weights
Key Differences
- TensorFlow Models offers a wider range of models and research implementations
- ailia-models focuses on optimized inference for edge devices
- TensorFlow Models requires the TensorFlow framework, while ailia-models uses models in ONNX format
- ailia-models provides a simpler API for model loading and inference
- TensorFlow Models has more extensive documentation and examples
Use Cases
TensorFlow Models is ideal for:
- Research and experimentation with state-of-the-art models
- Large-scale machine learning projects
- Integration with the TensorFlow ecosystem
ailia-models is suitable for:
- Edge device deployment
- Quick prototyping with pre-optimized models
- Cross-platform compatibility using ONNX
Datasets, Transforms and Models specific to Computer Vision
Pros of vision
- Larger community and more active development
- Broader range of pre-trained models and datasets
- Tighter integration with the PyTorch ecosystem
Cons of vision
- Steeper learning curve for beginners
- Potentially higher computational requirements
- Less focus on edge/mobile deployment
Code Comparison
vision:
import torchvision.models as models
resnet18 = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)  # pretrained=True is deprecated since torchvision 0.13
resnet18.eval()
ailia-models:
import ailia
model = ailia.Net('resnet18.prototxt', 'resnet18.onnx')  # network definition first, then weights
Summary
vision offers a more comprehensive set of tools and models for computer vision tasks, benefiting from PyTorch's extensive ecosystem. It's ideal for research and large-scale applications. ailia-models, while less extensive, provides a simpler interface and focuses on edge deployment. The choice between them depends on the specific project requirements, available resources, and deployment targets.
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Pros of transformers
- Extensive library of pre-trained models for various NLP tasks
- Active community and frequent updates
- Comprehensive documentation and tutorials
Cons of transformers
- Larger file size and memory footprint
- Steeper learning curve for beginners
- May require more computational resources
Code Comparison
transformers:
from transformers import pipeline
classifier = pipeline("sentiment-analysis")
result = classifier("I love this product!")[0]
print(f"Label: {result['label']}, Score: {result['score']:.4f}")
ailia-models:
import ailia
model = ailia.Net("sentiment_analysis.prototxt", "sentiment_analysis.onnx")  # network definition first, then weights
input_data = ailia.make_input_blob("I love this product!")
output = model.predict(input_data)
print(f"Sentiment: {output[0]}")
Summary
transformers offers a wide range of pre-trained models and extensive community support, making it ideal for various NLP tasks. However, it may require more resources and have a steeper learning curve. ailia-models provides a lightweight alternative with a focus on edge devices and real-time processing, but may have a more limited selection of models and less community support. The choice between the two depends on the specific project requirements and deployment environment.
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Pros of onnxruntime
- Broader ecosystem support and integration with various ML frameworks
- Extensive optimization capabilities for different hardware platforms
- Active development and regular updates from Microsoft
Cons of onnxruntime
- Steeper learning curve for beginners compared to ailia-models
- May require more setup and configuration for specific use cases
Code Comparison
onnxruntime:
import onnxruntime as ort
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_data})
ailia-models:
import ailia
model = ailia.Net("model.prototxt", "model.onnx", env_id=ailia.ENVIRONMENT_AUTO)
output = model.predict(input_data)
Both repositories provide tools for running ONNX models, but onnxruntime offers more flexibility and optimization options, while ailia-models focuses on simplicity and ease of use. onnxruntime is better suited for large-scale deployments and performance-critical applications, whereas ailia-models may be more appropriate for quick prototyping or simpler use cases.
Pre-trained Deep Learning models and demos (high quality and extremely fast)
Pros of open_model_zoo
- Larger collection of pre-trained models across various domains
- Extensive documentation and usage examples
- Optimized for Intel hardware and OpenVINO toolkit
Cons of open_model_zoo
- Primarily focused on Intel platforms, potentially limiting portability
- May require more setup and configuration for non-Intel environments
Code Comparison
open_model_zoo:
from openvino.inference_engine import IECore
ie = IECore()
net = ie.read_network(model="model.xml", weights="model.bin")
exec_net = ie.load_network(network=net, device_name="CPU")
ailia-models:
import ailia
import numpy as np
net = ailia.Net("model.prototxt", "model.caffemodel")
# Random NCHW input, cast to float32 for inference
input_data = np.random.random((1, 3, 224, 224)).astype(np.float32)
output = net.predict(input_data)
Both repositories provide pre-trained models and inference examples, but they differ in their focus and implementation. open_model_zoo offers a wider range of models and is optimized for Intel hardware, while ailia-models provides a more straightforward API for quick integration across various platforms. The choice between them depends on the specific project requirements, target hardware, and desired level of optimization.
README
The collection of pre-trained, state-of-the-art AI models.
About ailia SDK
ailia SDK is a self-contained, cross-platform, high-speed inference SDK for AI. The ailia SDK provides a consistent C++ API across Windows, Mac, Linux, iOS, Android, Jetson, and Raspberry Pi platforms. It also supports Unity (C#), Python, Rust, Flutter (Dart), and JNI for efficient AI implementation. The ailia SDK makes extensive use of the GPU through Vulkan and Metal to enable accelerated computing.
How to use
NEW - ailia SDK can now be installed with "pip3 install ailia"!
ailia MODELS tutorial | Japanese version
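After a pip installation, one way to confirm that the package is visible to the current interpreter is to query its metadata with the standard library. This is only a sketch: the helper name installed_version is hypothetical, and the distribution name "ailia" is taken from the pip command in this section.

```python
from importlib import metadata


def installed_version(package="ailia"):
    """Return the installed version string of a pip package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("ailia is not installed in this environment")
else:
    print(f"ailia {version} is installed")
```

The reported version can then be compared against the "Supported Ailia Version" column of the model tables.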
Supported models
353 models as of September 10th, 2024
Latest update
- 2024.09.10 Add segment-anything-2 (video mode)
- 2024.08.27 Add segment-anything-2 (image mode)
- 2024.08.20 Add bert_ner_japanese
- 2024.08.16 Add latent-consistency-model-txt2img, fbcnn
- 2024.08.15 Add volo, elegant, depth_anything, drbn_skf, codeformer, dtln
- 2024.08.10 Add TripoSR, japanese-reranker-cross-encoder
- 2024.08.09 Add mahalanobis-ad, t5_base_japanese_ner
- 2024.08.08 Add sdxl-turbo, sd-turbo
- 2024.08.05 Migrate to ailia Tokenizer 1.3 from Transformers
- 2024.07.16 Add grounded_sam
- 2024.07.12 Add llava
- 2024.07.09 Add GroundingDINO
- More information in our Wiki
Action recognition
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
mars | MARS: Motion-Augmented RGB Stream for Action Recognition | Pytorch | 1.2.4 and later | EN JP | |
st-gcn | ST-GCN | Pytorch | 1.2.5 and later | EN JP | |
ax_action_recognition | Realtime-Action-Recognition | Pytorch | 1.2.7 and later | ||
va-cnn | View Adaptive Neural Networks (VA) for Skeleton-based Human Action Recognition | Pytorch | 1.2.7 and later | ||
driver-action-recognition-adas | driver-action-recognition-adas-0002 | OpenVINO | 1.2.5 and later | ||
action_clip | ActionCLIP | Pytorch | 1.2.7 and later |
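The "Supported Ailia Version" column uses the form "X and later". A small check against an installed version could look like the sketch below. The helper names parse_version and meets_requirement are hypothetical. Versions are compared as integer tuples because plain string comparison would rank "1.2.10" below "1.2.9".

```python
import re


def parse_version(text):
    """Turn '1.2.10' into (1, 2, 10) so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_requirement(installed, requirement):
    """Check an installed version against a table cell such as '1.2.7 and later'."""
    match = re.match(r"\s*(\d+(?:\.\d+)*)\s+and later\s*$", requirement)
    if match is None:
        raise ValueError(f"unrecognised requirement: {requirement!r}")
    have = parse_version(installed)
    need = parse_version(match.group(1))
    # Pad the shorter tuple with zeros so that '1.2' and '1.2.0' compare equal.
    width = max(len(have), len(need))
    have += (0,) * (width - len(have))
    need += (0,) * (width - len(need))
    return have >= need
```

For instance, meets_requirement("1.2.10", "1.2.7 and later") evaluates to True, while meets_requirement("1.2.4", "1.2.5 and later") evaluates to False.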
Anomaly detection
Model | Reference | Exported From | Supported Ailia Version | Date | Blog | |
---|---|---|---|---|---|---|
mahalanobisad | MahalanobisAD-pytorch | Pytorch | 1.2.9 and later | May 2020 | ||
spade-pytorch | Sub-Image Anomaly Detection with Deep Pyramid Correspondences | Pytorch | 1.2.6 and later | May 2020 | ||
padim | PaDiM-Anomaly-Detection-Localization-master | Pytorch | 1.2.6 and later | Nov 2020 | EN JP | |
patchcore | PatchCore_anomaly_detection | Pytorch | 1.2.6 and later | Jun 2021 |
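The pipe-separated tables in this list can be read into records for filtering or sorting. The sketch below assumes the layout shown here (a header line, a dashed separator line, then one row per model). The function name parse_table is hypothetical.

```python
def parse_table(lines):
    """Parse 'a | b | c' rows into dicts keyed by the header cells.

    lines[0] is the header, lines[1] the '---' separator, the rest are rows.
    Unnamed (empty) header columns are dropped.
    """
    header = [cell.strip() for cell in lines[0].split("|")]
    records = []
    for line in lines[2:]:
        cells = [cell.strip() for cell in line.split("|")]
        # Rows may be shorter than the header; pad the missing cells with "".
        cells += [""] * (len(header) - len(cells))
        records.append({name: value for name, value in zip(header, cells) if name})
    return records


table = [
    "Model | Reference | Exported From | Supported Ailia Version | Date | Blog | |",
    "---|---|---|---|---|---|---|",
    "mahalanobisad | MahalanobisAD-pytorch | Pytorch | 1.2.9 and later | May 2020 | ||",
    "padim | PaDiM-Anomaly-Detection-Localization-master | Pytorch | 1.2.6 and later | Nov 2020 | EN JP | |",
]
for record in parse_table(table):
    print(record["Model"], "-", record["Supported Ailia Version"])
```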
Audio processing
Audio classification
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
crnn_audio_classification | crnn-audio-classification | Pytorch | 1.2.5 and later | EN JP |
transformer-cnn-emotion-recognition | Combining Spatial and Temporal Feature Representions of Speech Emotion by Parallelizing CNNs and Transformer-Encoders | Pytorch | 1.2.5 and later | |
audioset_tagging_cnn | PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition | Pytorch | 1.2.9 and later | |
clap | CLAP | Pytorch | 1.2.6 and later | |
microsoft clap | CLAP | Pytorch | 1.2.11 and later |
Music enhancement
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
hifigan | HiFi-GAN | Pytorch | 1.2.9 and later | |
deep music enhancer | On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks | Pytorch | 1.2.6 and later |
Music generation
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
pytorch_wavenet | pytorch_wavenet | Pytorch | 1.2.14 and later |
Noise reduction
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
unet_source_separation | source_separation | Pytorch | 1.2.6 and later | EN JP |
voicefilter | VoiceFilter | Pytorch | 1.2.7 and later | EN JP |
rnnoise | rnnoise | Keras | 1.2.15 and later | |
dtln | Dual-signal Transformation LSTM Network | Tensorflow | 1.3.0 and later |
Phoneme alignment
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
narabas | narabas: Japanese phoneme forced alignment tool | Pytorch | 1.2.11 and later |
Pitch detection
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
crepe | torchcrepe | Pytorch | 1.2.10 and later | JP |
Speaker diarization
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
auto_speech | AutoSpeech: Neural Architecture Search for Speaker Recognition | Pytorch | 1.2.5 and later | EN JP |
wespeaker | WeSpeaker | Onnxruntime | 1.2.9 and later | |
pyannote-audio | Pyannote-audio | Pytorch | 1.2.15 and later | JP |
Speech to text
Model | Reference | Exported From | Supported Ailia Version | Date | Blog |
---|---|---|---|---|---|
deepspeech2 | deepspeech.pytorch | Pytorch | 1.2.2 and later | Oct 2017 | EN JP |
whisper | Whisper | Pytorch | 1.2.10 and later | Dec 2022 | JP |
reazon_speech | ReazonSpeech | Pytorch | 1.4.0 and later | Jan 2023 | |
distil-whisper | Hugging Face - Distil-Whisper | Pytorch | 1.2.16 and later | Nov 2023 | |
reazon_speech2 | ReazonSpeech2 | Pytorch | 1.4.0 and later | Feb 2024 | |
kotoba-whisper | kotoba-whisper | Pytorch | 1.2.16 and later | Apr 2024 |
Text to speech
Model | Reference | Exported From | Supported Ailia Version | Date | Blog |
---|---|---|---|---|---|
pytorch-dc-tts | Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Pytorch | 1.2.6 and later | Oct 2017 | EN JP |
tacotron2 | Tacotron2 | Pytorch | 1.2.15 and later | Feb 2018 | JP |
vall-e-x | VALL-E-X | Pytorch | 1.2.15 and later | Mar 2023 | JP |
Bert-VITS2 | Bert-VITS2 | Pytorch | 1.2.16 and later | Aug 2023 | |
gpt-sovits | GPT-SoVITS | Pytorch | 1.4.0 and later | Feb 2024 | JP |
gpt-sovits-v2 | GPT-SoVITS | Pytorch | 1.4.0 and later | Aug 2024 |
Voice activity detection
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
silero-vad | Silero VAD | Pytorch | 1.2.15 and later | JP |
Voice conversion
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
rvc | Retrieval-based-Voice-Conversion-WebUI | Pytorch | 1.2.12 and later | JP |
Background removal
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
U-2-Net | U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection | Pytorch | 1.2.2 and later | EN JP | |
u2net-portrait-matting | U^2-Net - Portrait matting | Pytorch | 1.2.7 and later | ||
u2net-human-seg | U^2-Net - human segmentation | Pytorch | 1.2.4 and later | ||
deep-image-matting | Deep Image Matting | Keras | 1.2.3 and later | EN JP | |
indexnet | Indices Matter: Learning to Index for Deep Image Matting | Pytorch | 1.2.7 and later | ||
modnet | MODNet: Trimap-Free Portrait Matting in Real Time | Pytorch | 1.2.7 and later | ||
background_matting_v2 | Real-Time High-Resolution Background Matting | Pytorch | 1.2.9 and later | ||
cascade_psp | CascadePSP | Pytorch | 1.2.9 and later | ||
rembg | Rembg | Pytorch | 1.2.4 and later | ||
dis_seg | Highly Accurate Dichotomous Image Segmentation | Pytorch | 1.2.10 and later | ||
gfm | Bridging Composite and Real: Towards End-to-end Deep Image Matting | Pytorch | 1.2.10 and later |
Crowd counting
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
crowdcount-cascaded-mtl | CNN-based Cascaded Multi-task Learning of High-level Prior and Density Estimation for Crowd Counting (Single Image Crowd Counting) | Pytorch | 1.2.1 and later | EN JP | |
c-3-framework | Crowd Counting Code Framework(C^3-Framework) | Pytorch | 1.2.5 and later |
Deep fashion
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
clothing-detection | Clothing-Detection | Pytorch | 1.2.1 and later | EN JP | |
mmfashion | MMFashion | Pytorch | 1.2.5 and later | EN JP | |
mmfashion_tryon | MMFashion virtual try-on | Pytorch | 1.2.8 and later | ||
mmfashion_retrieval | MMFashion In-Shop Clothes Retrieval | Pytorch | 1.2.5 and later | ||
fashionai-key-points-detection | A Pytorch Implementation of Cascaded Pyramid Network for FashionAI Key Points Detection | Pytorch | 1.2.5 and later | ||
person-attributes-recognition-crossroad | person-attributes-recognition-crossroad-0230 | Pytorch | 1.2.10 and later |
Depth estimation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
monodepth2 | Monocular depth estimation from a single image | Pytorch | 1.2.2 and later | ||
midas | Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer | Pytorch | 1.2.4 and later | EN JP | |
fcrn-depthprediction | Deeper Depth Prediction with Fully Convolutional Residual Networks | TensorFlow | 1.2.6 and later | ||
fast-depth | ICRA 2019 "FastDepth: Fast Monocular Depth Estimation on Embedded Systems" | Pytorch | 1.2.5 and later | ||
lap-depth | LapDepth-release | Pytorch | 1.2.9 and later | ||
hitnet | ONNX-HITNET-Stereo-Depth-estimation | Pytorch | 1.2.9 and later | ||
crestereo | ONNX-CREStereo-Depth-Estimation | Pytorch | 1.2.13 and later | ||
mobilestereonet | MobileStereoNet | Pytorch | 1.2.13 and later | ||
zoe_depth | ZoeDepth | Pytorch | 1.3.0 and later | ||
DepthAnything | DepthAnything | Pytorch | 1.2.9 and later |
Diffusion
Text to image
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
latent-diffusion-txt2img | Latent Diffusion - txt2img | Pytorch | 1.2.10 and later | ||
stable-diffusion-txt2img | Stable Diffusion | Pytorch | 1.2.14 and later | JP | |
control_net | ControlNet | Pytorch | 1.2.15 and later | ||
sd-turbo | Hugging Face - SD-Turbo | Pytorch | 1.2.16 and later | ||
sdxl-turbo | Hugging Face - SDXL-Turbo | Pytorch | 1.2.16 and later | ||
latent-consistency-models | latent-consistency-models | Pytorch | 1.2.16 and later |
Text to audio
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
riffusion | Riffusion | Pytorch | 1.2.16 and later |
Others
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
latent-diffusion-inpainting | Latent Diffusion - inpainting | Pytorch | 1.2.10 and later | ||
latent-diffusion-superresolution | Latent Diffusion - Super-resolution | Pytorch | 1.2.10 and later | ||
DA-CLIP | DA-CLIP | Pytorch | 1.2.16 and later | ||
marigold | Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Pytorch | 1.2.16 and later |
Face detection
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
yolov1-face | YOLO-Face-detection | Darknet | 1.1.0 and later | ||
yolov3-face | Face detection using keras-yolov3 | Keras | 1.2.1 and later | ||
blazeface | BlazeFace-PyTorch | Pytorch | 1.2.1 and later | EN JP | |
face-mask-detection | Face detection using keras-yolov3 | Keras | 1.2.1 and later | EN JP | |
dbface | DBFace : real-time, single-stage detector for face detection, with faster speed and higher accuracy | Pytorch | 1.2.2 and later | ||
retinaface | RetinaFace: Single-stage Dense Face Localisation in the Wild. | Pytorch | 1.2.5 and later | JP | |
anime-face-detector | Anime Face Detector | Pytorch | 1.2.6 and later | ||
face-detection-adas | face-detection-adas-0001 | OpenVINO | 1.2.5 and later | ||
mtcnn | mtcnn | Keras | 1.2.10 and later |
Face identification
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
vggface2 | VGGFace2 Dataset for Face Recognition | Caffe | 1.1.0 and later | ||
arcface | pytorch implement of arcface | Pytorch | 1.2.1 and later | EN JP | |
insightface | InsightFace: 2D and 3D Face Analysis Project | Pytorch | 1.2.5 and later | ||
cosface | Pytorch implementation of CosFace | Pytorch | 1.2.10 and later | ||
facenet_pytorch | Face Recognition Using Pytorch | Pytorch | 1.2.6 and later |
Face recognition
Age gender estimation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
face_classification | Real-time face detection and emotion/gender classification | Keras | 1.1.0 and later | ||
age-gender-recognition-retail | age-gender-recognition-retail-0013 | OpenVINO | 1.2.5 and later | EN JP | |
mivolo | MiVOLO: Multi-input Transformer for Age and Gender Estimation | Pytorch | 1.2.13 and later |
Emotion recognition
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
ferplus | FER+ | CNTK | 1.2.2 and later | ||
hsemotion | HSEmotion (High-Speed face Emotion recognition) library | Pytorch | 1.2.5 and later |
Gaze estimation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
gazeml | A deep learning framework based on Tensorflow for the training of high performance gaze estimation | TensorFlow | 1.2.0 and later | ||
mediapipe_iris | irislandmarks.pytorch | Pytorch | 1.2.2 and later | EN JP | |
ax_gaze_estimation | ax Gaze Estimation | Pytorch | 1.2.2 and later | EN JP |
Head pose estimation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
hopenet | deep-head-pose | Pytorch | 1.2.2 and later | EN JP | |
6d_repnet | 6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch) | Pytorch | 1.2.6 and later | ||
L2CS_Net | L2CS_Net | Pytorch | 1.2.9 and later |
Keypoint detection
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
facial_feature | kaggle-facial-keypoints | Pytorch | 1.2.0 and later | ||
face_alignment | 2D and 3D Face alignment library build using pytorch | Pytorch | 1.2.1 and later | EN JP | |
prnet | Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network | TensorFlow | 1.2.2 and later | ||
facemesh | facemesh.pytorch | Pytorch | 1.2.2 and later | EN JP | |
facemesh_v2 | MediaPipe Face landmark detection | Pytorch | 1.2.9 and later | JP | |
3ddfa | Towards Fast, Accurate and Stable 3D Dense Face Alignment | Pytorch | 1.2.10 and later |
Others
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
ax_facial_features | ax Facial Features | Pytorch | 1.2.5 and later | EN | |
face-anti-spoofing | Lightweight Face Anti Spoofing | Pytorch | 1.2.5 and later | EN JP |
Face restoration
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
gfpgan | GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior | Pytorch | 1.2.10 and later | JP | |
codeformer | CodeFormer: Towards Robust Blind Face Restoration with Codebook Lookup Transformer | Pytorch | 1.2.9 and later |
Face swapping
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
sber-swap | SberSwap | Pytorch | 1.2.12 and later | JP | |
facefusion | FaceFusion | ONNXRuntime | 1.2.10 and later |
Frame Interpolation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
flavr | FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation | Pytorch | 1.2.7 and later | EN JP | |
cain | Channel Attention Is All You Need for Video Frame Interpolation | Pytorch | 1.2.5 and later | ||
film | FILM: Frame Interpolation for Large Motion | Tensorflow | 1.2.10 and later | ||
rife | Real-Time Intermediate Flow Estimation for Video Frame Interpolation | Pytorch | 1.2.13 and later |
Generative adversarial networks
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
pytorch-gan | Code repo for the Pytorch GAN Zoo project (used to train this model) | Pytorch | 1.2.4 and later | ||
council-gan | Council-GAN | Pytorch | 1.2.4 and later | ||
restyle-encoder | ReStyle | Pytorch | 1.2.9 and later | ||
sam | Age Transformation Using a Style-Based Regression Model | Pytorch | 1.2.9 and later | ||
encoder4editing | Designing an Encoder for StyleGAN Image Manipulation | Pytorch | 1.2.10 and later | ||
lipgan | LipGAN | Keras | 1.2.15 and later | JP |
Hand detection
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
yolov3-hand | Hand detection branch of Face detection using keras-yolov3 | Keras | 1.2.1 and later | ||
hand_detection_pytorch | hand-detection.PyTorch | Pytorch | 1.2.2 and later | ||
blazepalm | MediaPipePyTorch | Pytorch | 1.2.5 and later |
Hand recognition
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
blazehand | MediaPipePyTorch | Pytorch | 1.2.5 and later | EN JP | |
hand3d | ColorHandPose3D network | TensorFlow | 1.2.5 and later | ||
minimal-hand | Minimal Hand | TensorFlow | 1.2.8 and later | ||
v2v-posenet | V2V-PoseNet | Pytorch | 1.2.6 and later | ||
hands_segmentation_pytorch | hands-segmentation-pytorch | Pytorch | 1.2.10 and later |
Image captioning
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
illustration2vec | Illustration2Vec | Caffe | 1.2.2 and later | ||
image_captioning_pytorch | Image Captioning pytorch | Pytorch | 1.2.5 and later | EN JP | |
blip2 | Hugging Face - BLIP-2 | Pytorch | 1.2.16 and later |
Image classification
CNN
Transformer
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
vit | Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale) | Pytorch | 1.2.7 and later | EN JP | |
swin-transformer | Swin Transformer | Pytorch | 1.2.6 and later | ||
clip | CLIP | Pytorch | 1.2.9 and later | EN JP | |
japanese-clip | Japanese-CLIP | Pytorch | 1.2.15 and later | ||
japanese-stable-clip-vit-l-16 | japanese-stable-clip-vit-l-16 | Pytorch | 1.2.11 and later |
Specific task
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
partialconv | Partial Convolution Layer for Padding and Image Inpainting | Pytorch | 1.2.0 and later | ||
weather-prediction-from-image | Weather Prediction From Image - (Warmth Of Image) | Keras | 1.2.5 and later |
Image inpainting
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
inpainting-with-partial-conv | pytorch-inpainting-with-partial-conv | PyTorch | 1.2.6 and later | EN JP | |
inpainting_gmcnn | Image Inpainting via Generative Multi-column Convolutional Neural Networks | TensorFlow | 1.2.6 and later | ||
3d-photo-inpainting | 3D Photography using Context-aware Layered Depth Inpainting | Pytorch | 1.2.7 and later | ||
deepfillv2 | Free-Form Image Inpainting with Gated Convolution | Pytorch | 1.2.9 and later |
Image manipulation
Image restoration
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
nafnet | NAFNet: Nonlinear Activation Free Network for Image Restoration | Pytorch | 1.2.10 and later |
Image segmentation
Large Language Model
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
llava | LLaVA | Pytorch | 1.2.16 and later |
Landmark classification
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
landmarks_classifier_asia | Landmarks classifier_asia_V1.1 | TensorFlow Hub | 1.2.4 and later | EN JP | |
places365 | Release of Places365-CNNs | Pytorch | 1.2.5 and later |
Line segment detection
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
mlsd | M-LSD: Towards Light-weight and Real-time Line Segment Detection | TensorFlow | 1.2.8 and later | EN JP | |
dexined | DexiNed: Dense Extreme Inception Network for Edge Detection | Pytorch | 1.2.5 and later |
Low Light Image Enhancement
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
agllnet | AGLLNet: Attention Guided Low-light Image Enhancement (IJCV 2021) | Pytorch | 1.2.9 and later | EN JP | |
drbn_skf | DRBN SKF | Pytorch | 1.2.14 and later |
Natural language processing
Bert
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert | pytorch-pretrained-bert | Pytorch | 1.2.2 and later | EN JP |
bert_maskedlm | huggingface/transformers | Pytorch | 1.2.5 and later | |
bert_question_answering | huggingface/transformers | Pytorch | 1.2.5 and later | |
bert_zero_shot_classification | huggingface/transformers | Pytorch | 1.2.5 and later |
Embedding
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
sentence_transformers_japanese | sentence transformers | Pytorch | 1.2.7 and later | JP |
multilingual-e5 | multilingual-e5-base | Pytorch | 1.2.15 and later | JP |
glucose | GLuCoSE (General Luke-based Contrastive Sentence Embedding)-base-Japanese | Pytorch | 1.2.15 and later |
Error corrector
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert_insert_punctuation | bert-japanese | Pytorch | 1.2.15 and later | |
t5_whisper_medical | error correction of medical terms using t5 | Pytorch | 1.2.13 and later | |
bertjsc | bertjsc | Pytorch | 1.2.15 and later |
Grapheme to phoneme
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
soundchoice-g2p | Hugging Face - speechbrain/soundchoice-g2p | Pytorch | 1.2.16 and later | |
g2p_en | g2p_en | Pytorch | 1.2.14 and later |
Named entity recognition
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert_ner | huggingface/transformers | Pytorch | 1.2.5 and later | |
t5_base_japanese_ner | t5-japanese | Pytorch | 1.2.13 and later | |
bert_ner_japanese | jurabi/bert-ner-japanese | Pytorch | 1.2.10 and later |
Reranker
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
cross_encoder_mmarco | jeffwan/mmarco-mMiniLMv2-L12-H384-v | Pytorch | 1.2.10 and later | JP |
japanese-reranker-cross-encoder | hotchpotch/japanese-reranker-cross-encoder-large-v1 | Pytorch | 1.2.16 and later |
Sentence generation
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
gpt2 | GPT-2 | Pytorch | 1.2.7 and later | |
rinna_gpt2 | japanese-pretrained-models | Pytorch | 1.2.7 and later |
Sentiment analysis
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert_sentiment_analysis | huggingface/transformers | Pytorch | 1.2.5 and later | |
bert_tweets_sentiment | huggingface/transformers | Pytorch | 1.2.5 and later |
Summarize
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert_sum_ext | BERTSUMEXT | Pytorch | 1.2.7 and later | |
presumm | PreSumm | Pytorch | 1.2.8 and later | |
t5_base_japanese_title_generation | t5-japanese | Pytorch | 1.2.13 and later | JP |
t5_base_summarization | t5-japanese | Pytorch | 1.2.13 and later |
Translation
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
fugumt-en-ja | Fugu-Machine Translator | Pytorch | 1.2.9 and later | JP |
fugumt-ja-en | Fugu-Machine Translator | Pytorch | 1.2.10 and later |
Network intrusion detection
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert-network-packet-flow-header-payload | bert-network-packet-flow-header-payload | Pytorch | 1.2.10 and later | |
falcon-adapter-network-packet | falcon-adapter-network-packet | Pytorch | 1.2.10 and later |
Neural Rendering
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
nerf | NeRF: Neural Radiance Fields | Tensorflow | 1.2.10 and later | EN JP | |
TripoSR | TripoSR | Pytorch | 1.2.6 and later |
NSFW detector
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
clip-based-nsfw-detector | CLIP-based-NSFW-Detector | Keras | 1.2.10 and later | JP |
Object detection
CNN
Transformer
Model | Reference | Exported From | Supported Ailia Version | Date | Blog | |
---|---|---|---|---|---|---|
glip | GLIP | Pytorch | 1.2.13 and later | Dec 2021 | ||
dab-detr | DAB-DETR | Pytorch | 1.2.12 and later | Jan 2022 | ||
detic | Detecting Twenty-thousand Classes using Image-level Supervision | Pytorch | 1.2.10 and later | Jan 2022 | EN JP | |
groundingdino | Grounding DINO | Pytorch | 1.2.16 and later | Mar 2023 | JP |
Specific target
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
mobile_object_localizer | mobile_object_localizer_v1 | TensorFlow Hub | 1.2.6 and later | EN JP | |
sku110k-densedet | SKU110K-DenseDet | Pytorch | 1.2.9 and later | EN JP | |
traffic-sign-detection | Traffic Sign Detection | Tensorflow | 1.2.10 and later | EN JP | |
footandball | FootAndBall: Integrated player and ball detector | Pytorch | 1.2.0 and later | ||
qrcode_wechatqrcode | qrcode_wechatqrcode | Caffe | 1.2.15 and later | ||
layout_parsing | unstructured-inference | Pytorch | 1.2.9 and later |
Object detection 3d
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
3d_bbox | 3D Bounding Box Estimation Using Deep Learning and Geometry | Pytorch | 1.2.6 and later | ||
3d-object-detection.pytorch | 3d-object-detection.pytorch | Pytorch | 1.2.8 and later | EN JP | |
mediapipe_objectron | MediaPipe Objectron | TensorFlow Lite | 1.2.5 and later | ||
egonet | EgoNet | Pytorch | 1.2.9 and later | ||
d4lcn | D4LCN | Pytorch | 1.2.9 and later | ||
did_m3d | DID M3D | Pytorch | 1.2.11 and later |
Object tracking
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
deepsort | Deep Sort with PyTorch | Pytorch | 1.2.3 and later | EN JP | |
person_reid_baseline_pytorch | UTS-Person-reID-Practical | Pytorch | 1.2.6 and later | ||
abd_net | Attentive but Diverse Person Re-Identification | Pytorch | 1.2.7 and later | ||
siam-mot | SiamMOT | Pytorch | 1.2.9 and later | ||
bytetrack | ByteTrack | Pytorch | 1.2.5 and later | EN JP | |
qd-3dt | Monocular Quasi-Dense 3D Object Tracking | Pytorch | 1.2.11 and later | ||
strong_sort | StrongSORT | Pytorch | 1.2.15 and later | ||
centroids-reid | On the Unreasonable Effectiveness of Centroids in Image Retrieval | Pytorch | 1.2.9 and later | ||
deepsort_vehicle | Multi-Camera Live Object Tracking | Pytorch | 1.2.9 and later |
Optical Flow Estimation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
raft | RAFT: Recurrent All Pairs Field Transforms for Optical Flow | Pytorch | 1.2.6 and later | EN JP |
Point segmentation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
pointnet_pytorch | PointNet.pytorch | Pytorch | 1.2.6 and later |
Pose estimation
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
openpose | Code repo for realtime multi-person pose estimation in CVPR'17 (Oral) | Caffe | 1.2.1 and later | ||
lightweight-human-pose-estimation | Fast and accurate human pose estimation in PyTorch. Contains implementation of "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" paper. | Pytorch | 1.2.1 and later | EN JP | |
pose_resnet | Simple Baselines for Human Pose Estimation and Tracking | Pytorch | 1.2.1 and later | EN JP | |
blazepose | MediaPipePyTorch | Pytorch | 1.2.5 and later | ||
efficientpose | Code repo for EfficientPose | TensorFlow | 1.2.6 and later | ||
movenet | Code repo for movenet | TensorFlow | 1.2.8 and later | EN JP | |
animalpose | MMPose - 2D animal pose estimation | Pytorch | 1.2.7 and later | EN JP | |
mediapipe_holistic | MediaPipe Holistic | TensorFlow | 1.2.9 and later | ||
ap-10k | AP-10K | Pytorch | 1.2.4 and later | ||
posenet | PoseNet Pytorch | Pytorch | 1.2.10 and later | ||
e2pose | E2Pose | Tensorflow | 1.2.5 and later |
Pose estimation 3d
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
lightweight-human-pose-estimation-3d | Real-time 3D multi-person pose estimation demo in PyTorch. OpenVINO backend can be used for fast inference on CPU. | Pytorch | 1.2.1 and later | ||
3d-pose-baseline | A simple baseline for 3d human pose estimation in tensorflow. Presented at ICCV 17. | TensorFlow | 1.2.3 and later | ||
pose-hg-3d | Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach | Pytorch | 1.2.6 and later | ||
blazepose-fullbody | MediaPipe | TensorFlow Lite | 1.2.5 and later | EN JP | |
3dmppe_posenet | PoseNet of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image" | Pytorch | 1.2.6 and later | ||
gast | A Graph Attention Spatio-temporal Convolutional Networks for 3D Human Pose Estimation in Video (GAST-Net) | Pytorch | 1.2.7 and later | EN JP | |
mediapipe_pose_world_landmarks | MediaPipe Pose real-world 3D coordinates | TensorFlow Lite | 1.2.10 and later |
Road detection
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
codes-for-lane-detection | Codes-for-Lane-Detection | Pytorch | 1.2.6 and later | EN JP | |
roneld | RONELD-Lane-Detection | Pytorch | 1.2.6 and later | ||
road-segmentation-adas | road-segmentation-adas-0001 | OpenVINO | 1.2.5 and later | ||
cdnet | CDNet | Pytorch | 1.2.5 and later | ||
lstr | LSTR | Pytorch | 1.2.8 and later | ||
ultra-fast-lane-detection | Ultra-Fast-Lane-Detection | Pytorch | 1.2.6 and later | ||
yolop | YOLOP | Pytorch | 1.2.6 and later | ||
hybridnets | HybridNets | Pytorch | 1.2.6 and later | ||
polylanenet | PolyLaneNet | Pytorch | 1.2.9 and later |
Rotation prediction
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
rotnet | CNNs for predicting the rotation angle of an image to correct its orientation | Keras | 1.2.1 and later |
Style transfer
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
adain | Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization | Pytorch | 1.2.1 and later | EN JP | |
psgan | PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer | Pytorch | 1.2.7 and later | ||
beauty_gan | BeautyGAN | Pytorch | 1.2.7 and later | ||
animeganv2 | PyTorch Implementation of AnimeGANv2 | Pytorch | 1.2.5 and later | ||
pix2pixHD | pix2pixHD: High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs | Pytorch | 1.2.6 and later | ||
EleGANt | EleGANt: Exquisite and Locally Editable GAN for Makeup Transfer | Pytorch | 1.2.15 and later |
Super resolution
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
srresnet | Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network | Pytorch | 1.2.0 and later | EN JP | |
edsr | Enhanced Deep Residual Networks for Single Image Super-Resolution | Pytorch | 1.2.6 and later | EN JP | |
han | Single Image Super-Resolution via a Holistic Attention Network | Pytorch | 1.2.6 and later | ||
real-esrgan | Real-ESRGAN | Pytorch | 1.2.9 and later | ||
rcan-it | Revisiting RCAN: Improved Training for Image Super-Resolution | Pytorch | 1.2.10 and later | ||
swinir | SwinIR: Image Restoration Using Swin Transformer | Pytorch | 1.2.12 and later | ||
Hat | Hat | Pytorch | 1.2.6 and later |
Text detection
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
craft_pytorch | CRAFT: Character-Region Awareness For Text detection | Pytorch | 1.2.2 and later | ||
pixel_link | Pixel-Link | TensorFlow | 1.2.6 and later | ||
east | EAST: An Efficient and Accurate Scene Text Detector | TensorFlow | 1.2.6 and later |
Text recognition
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
etl | Japanese Character Classification | Keras | 1.1.0 and later | JP | |
deep-text-recognition-benchmark | deep-text-recognition-benchmark | Pytorch | 1.2.6 and later | ||
crnn.pytorch | Convolutional Recurrent Neural Network | Pytorch | 1.2.6 and later | ||
paddleocr | PaddleOCR : Awesome multilingual OCR toolkits based on PaddlePaddle | Pytorch | 1.2.6 and later | EN JP | |
easyocr | Ready-to-use OCR with 80+ supported languages | Pytorch | 1.2.6 and later | ||
ndlocr_text_recognition | NDL OCR | Pytorch | 1.2.5 and later |
Time-Series Forecasting
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
informer2020 | Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting (AAAI'21 Best Paper) | Pytorch | 1.2.10 and later |
Vehicle recognition
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
vehicle-attributes-recognition-barrier | vehicle-attributes-recognition-barrier-0042 | OpenVINO | 1.2.5 and later | EN JP | |
vehicle-license-plate-detection-barrier | vehicle-license-plate-detection-barrier-0106 | OpenVINO | 1.2.5 and later |
Commercial model
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
acculus-pose | Acculus, Inc. | Caffe | 1.2.3 and later |
Other languages