Megvii-BaseDetection / YOLOX

YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/

Top Related Projects

  • YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
  • Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
  • Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
  • YOLOv6: a single-stage object detection framework dedicated to industrial applications.
  • OpenMMLab Detection Toolbox and Benchmark
  • YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

Quick Overview

YOLOX is a high-performance object detection framework based on YOLO (You Only Look Once). It introduces several improvements over previous YOLO versions, including decoupled head and SimOTA label assignment, making it more accurate and efficient for real-time object detection tasks.

Pros

  • Excellent performance-speed trade-off, suitable for real-time applications
  • Flexible architecture allowing easy customization and deployment
  • Supports various backbones and model sizes for different use cases
  • Well-documented and actively maintained

Cons

  • Requires significant computational resources for training
  • May be overkill for simpler object detection tasks
  • Learning curve can be steep for beginners in deep learning
  • Limited to object detection, not suitable for other computer vision tasks without modification

Code Examples

  1. Loading a pre-trained YOLOX model:
import torch
from yolox.exp import get_exp
from yolox.utils import postprocess

# Passing only exp_name selects the built-in YOLOX-s experiment (no exp file needed)
exp = get_exp(exp_name="yolox-s")
model = exp.get_model()
# The checkpoint stores the weights under the "model" key
ckpt = torch.load("yolox_s.pth", map_location="cpu")
model.load_state_dict(ckpt["model"])
model.eval()
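  2. Performing inference on an image:
import cv2
import torch
from yolox.data.data_augment import preproc

image = cv2.imread("test_image.jpg")
# Resize and pad to the experiment's test size; ratio maps boxes back to the original image
img, ratio = preproc(image, exp.test_size)
img = torch.from_numpy(img).unsqueeze(0).float()

with torch.no_grad():
    outputs = model(img)
    # Keyword names are conf_thre / nms_thre (no trailing "s")
    outputs = postprocess(outputs, exp.num_classes, conf_thre=0.7, nms_thre=0.45)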
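  3. Visualizing detection results:
from yolox.data.datasets import COCO_CLASSES
from yolox.utils import vis

if outputs[0] is not None:
    output = outputs[0].cpu()
    # Each row: x1, y1, x2, y2, objectness, class confidence, class id
    bboxes = output[:, 0:4] / ratio  # rescale boxes to the original image
    cls = output[:, 6]
    scores = output[:, 4] * output[:, 5]

    vis_res = vis(image, bboxes, scores, cls, conf=0.7, class_names=COCO_CLASSES)
    cv2.imwrite("result.jpg", vis_res)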
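
To make the row layout used in the visualization step concrete, here is a minimal pure-Python sketch that turns post-processed rows into labelled boxes. The helper name rows_to_detections and the sample values are hypothetical; the seven-column layout (x1, y1, x2, y2, objectness, class confidence, class id) is taken from the indexing in the snippet above, and dividing by ratio assumes the boxes are expressed in the resized input's coordinates.

```python
def rows_to_detections(rows, ratio, class_names, conf=0.7):
    """Convert post-processed rows into labelled boxes in original-image pixels.

    Each row is (x1, y1, x2, y2, obj_conf, cls_conf, cls_id).
    """
    detections = []
    for x1, y1, x2, y2, obj_conf, cls_conf, cls_id in rows:
        # Final score is objectness times class confidence
        score = obj_conf * cls_conf
        if score < conf:
            continue
        # Undo the resize applied during preprocessing
        box = tuple(round(v / ratio, 1) for v in (x1, y1, x2, y2))
        detections.append(
            {"label": class_names[int(cls_id)], "score": round(score, 4), "box": box}
        )
    return detections


# Hypothetical rows and class names, for illustration only
rows = [
    (64.0, 64.0, 128.0, 128.0, 0.9, 0.9, 1.0),  # score 0.81, kept
    (0.0, 0.0, 10.0, 10.0, 0.5, 0.5, 0.0),      # score 0.25, dropped
]
names = ["person", "bicycle"]

for det in rows_to_detections(rows, ratio=0.5, class_names=names):
    print(det)
```

With ratio=0.5 the kept box doubles in size, because the image was halved before inference.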

Getting Started

  1. Install YOLOX:
git clone https://github.com/Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip install -v -e .
  2. Download a pre-trained model:
wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth
  3. Run inference on an image:
import cv2
import torch
from yolox.data.data_augment import preproc
from yolox.data.datasets import COCO_CLASSES
from yolox.exp import get_exp
from yolox.utils import postprocess, vis

# Build YOLOX-s from its built-in experiment and load the downloaded weights
exp = get_exp(exp_name="yolox-s")
model = exp.get_model()
ckpt = torch.load("yolox_s.pth", map_location="cpu")
model.load_state_dict(ckpt["model"])
model.eval()

image = cv2.imread("test_image.jpg")
img, ratio = preproc(image, (640, 640))
img = torch.from_numpy(img).unsqueeze(0).float()

with torch.no_grad():
    outputs = model(img)
    outputs = postprocess(outputs, 80, conf_thre=0.7, nms_thre=0.45)

# Visualize results; boxes are divided by ratio to map them back to the original image
if outputs[0] is not None:
    output = outputs[0].cpu()
    vis_res = vis(image, output[:, 0:4] / ratio, output[:, 4] * output[:, 5], output[:, 6], conf=0.7, class_names=COCO_CLASSES)
    cv2.imwrite("result.jpg", vis_res)

Competitor Comparisons

51,450

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

  • More extensive documentation and tutorials
  • Larger community support and frequent updates
  • Easier integration with various deployment platforms

Cons of YOLOv5

  • Slightly lower accuracy compared to YOLOX in some benchmarks
  • More complex architecture, potentially leading to longer training times

Code Comparison

YOLOX example:

import os

from yolox.exp import Exp as MyExp

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]

YOLOv5 example:

import torch
from models.yolo import Model
from utils.torch_utils import intersect_dicts

model = Model(cfg='models/yolov5s.yaml')
ckpt = torch.load('yolov5s.pt', map_location='cpu')
csd = ckpt['model'].float().state_dict()
model.load_state_dict(intersect_dicts(csd, model.state_dict(), exclude=['anchor']), strict=False)

Both repositories offer powerful object detection capabilities, with YOLOX focusing on improved accuracy and YOLOv5 providing a more user-friendly experience and broader ecosystem support.
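
In the YOLOX experiment file shown in this comparison, depth and width are scaling multipliers rather than absolute sizes. The sketch below illustrates how such multipliers could translate into concrete layer sizes. It assumes the backbone derives its stem width as int(width * 64) and its base block count as max(round(depth * 3), 1), and that the head's input channels [256, 512, 1024] are scaled by width. The helper scaled_dims is hypothetical, and the multipliers for variants other than YOLOX-s are assumed from the default experiment files rather than stated on this page.

```python
# (depth, width) multipliers; only the YOLOX-s pair appears in the text above,
# the others are assumed from the default experiment files.
VARIANTS = {
    "yolox-s": (0.33, 0.50),
    "yolox-m": (0.67, 0.75),
    "yolox-l": (1.00, 1.00),
    "yolox-x": (1.33, 1.25),
}


def scaled_dims(depth, width, head_channels=(256, 512, 1024)):
    """Return illustrative layer sizes for a given (depth, width) pair."""
    return {
        "stem_channels": int(width * 64),          # assumed base width of 64
        "base_depth": max(round(depth * 3), 1),    # assumed base depth of 3 blocks
        "head_in_channels": [int(c * width) for c in head_channels],
    }


for name, (depth, width) in VARIANTS.items():
    print(name, scaled_dims(depth, width))
```

Under these assumptions YOLOX-s ends up with a 32-channel stem and head inputs of 128, 256, and 512 channels, which is why it is so much smaller than YOLOX-l.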

13,305

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Pros of yolov7

  • Higher accuracy and faster inference speed on various datasets
  • Includes additional features like instance segmentation and pose estimation
  • More active development and frequent updates

Cons of yolov7

  • More complex architecture, potentially harder to understand and modify
  • Requires more computational resources for training and inference
  • Less extensive documentation compared to YOLOX

Code Comparison

YOLOX example:

import os

from yolox.exp import Exp as MyExp

class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]

yolov7 example:

import torch
from models.yolo import Model
from utils.torch_utils import intersect_dicts

# cfg, nc, hyp, weights, and device are supplied by the surrounding training script
model = Model(cfg, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)
state_dict = torch.load(weights, map_location=device)['model']
state_dict = intersect_dicts(state_dict, model.state_dict(), exclude=['anchor'])
model.load_state_dict(state_dict, strict=False)

Both repositories provide powerful object detection frameworks, but yolov7 offers more advanced features and potentially better performance at the cost of increased complexity and resource requirements. YOLOX may be more suitable for simpler use cases or when working with limited computational resources.

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

  • Extensive library with a wide range of object detection and segmentation models
  • Well-documented and actively maintained by Facebook AI Research
  • Modular design allows for easy customization and extension

Cons of Detectron2

  • Steeper learning curve due to its comprehensive nature
  • Heavier and more resource-intensive compared to YOLOX
  • May be overkill for simpler object detection tasks

Code Comparison

YOLOX example:

from yolox.exp import get_exp
from yolox.utils import postprocess

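exp = get_exp(exp_name="yolox-s")  # select the built-in YOLOX-s experiment by name
model = exp.get_model()
outputs = model(img)  # img: preprocessed input tensor of shape (1, 3, H, W)
results = postprocess(outputs, exp.num_classes, conf_thre=0.5, nms_thre=0.45)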

Detectron2 example:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(img)

Both repositories offer powerful object detection capabilities, but YOLOX is more focused on YOLO-based models, while Detectron2 provides a broader range of algorithms and features for various computer vision tasks.

5,732

YOLOv6: a single-stage object detection framework dedicated to industrial applications.

Pros of YOLOv6

  • Higher accuracy and faster inference speed on various datasets
  • More efficient network architecture with RepOpt and Network Reparameterization
  • Better support for deployment on different hardware platforms

Cons of YOLOv6

  • Less extensive documentation compared to YOLOX
  • Fewer pre-trained models available for different tasks
  • Limited community contributions and third-party implementations

Code Comparison

YOLOv6:

from yolov6.core.evaler import Evaler
from yolov6.utils.config import Config

cfg = Config.fromfile('configs/yolov6s.py')
evaler = Evaler(cfg, img_size=640)
evaler.eval()

YOLOX:

from yolox.exp import get_exp
from yolox.utils import get_model_info

exp = get_exp('exps/default/yolox_s.py', 'yolox_s')  # exp file path (relative to the repo root), then exp name
model = exp.get_model()
print(get_model_info(model, exp.test_size))

Both repositories provide easy-to-use interfaces for model evaluation and inference. YOLOv6 uses a configuration-based approach, while YOLOX employs an experiment-based system. YOLOX offers more flexibility in customizing model architectures, while YOLOv6 focuses on optimized performance out-of-the-box.

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

  • Extensive model zoo with a wide variety of pre-trained models
  • Highly modular and flexible architecture for easy customization
  • Comprehensive documentation and tutorials

Cons of mmdetection

  • Steeper learning curve due to its complexity and extensive features
  • Potentially slower inference speed compared to YOLOX

Code Comparison

YOLOX example:

from yolox.exp import get_exp
from yolox.utils import postprocess

exp = get_exp(exp_name="yolox-s")  # select the built-in YOLOX-s experiment by name
model = exp.get_model()

mmdetection example:

from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')

Both repositories offer powerful object detection frameworks, but mmdetection provides a more comprehensive toolkit with a wider range of models and customization options. YOLOX, on the other hand, focuses on a specific architecture and may offer better performance in certain scenarios.

21,700

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Pros of darknet

  • Supports a wide range of YOLO versions (YOLOv2, YOLOv3, YOLOv4, etc.)
  • Includes pre-trained models for various tasks
  • Offers both CPU and GPU support

Cons of darknet

  • Written in C, which may be less accessible for some developers
  • Requires manual compilation and setup
  • Less modular architecture compared to YOLOX

Code Comparison

darknet:

layer make_yolo_layer(int batch, int w, int h, int n, int total, int *mask, int classes)
{
    int i;
    layer l = {0};
    l.type = YOLO;

YOLOX:

class YOLOXHead(nn.Module):
    def __init__(self, num_classes, width=1.0, in_channels=[256, 512, 1024], act="silu", depthwise=False):
        super().__init__()
        self.n_anchors = 1
        self.num_classes = num_classes

The darknet code is in C and focuses on low-level layer creation, while YOLOX uses Python and PyTorch for a more high-level, object-oriented approach. YOLOX's implementation is generally more readable and easier to modify for most developers familiar with modern deep learning frameworks.

README

Introduction

YOLOX is an anchor-free version of YOLO, with a simpler design but better performance! It aims to bridge the gap between research and industrial communities. For more details, please refer to our report on Arxiv.

This repo is the PyTorch implementation of YOLOX; there is also a MegEngine implementation.
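
To illustrate what "anchor-free" means in practice, here is a minimal sketch of decoding a single grid-cell prediction into a pixel-space box without any anchor priors. It assumes the commonly described scheme in which the predicted center offset is added to the grid index and multiplied by the stride, and the predicted log-size is exponentiated and multiplied by the stride. The function names decode_cell and to_corners are hypothetical and not part of the YOLOX API.

```python
import math


def decode_cell(raw, grid_x, grid_y, stride):
    """Decode one raw prediction (tx, ty, tw, th) into (cx, cy, w, h) in pixels.

    No anchor box is involved: position comes from the grid cell, and size is
    predicted directly (in log space, relative to the stride).
    """
    tx, ty, tw, th = raw
    cx = (tx + grid_x) * stride
    cy = (ty + grid_y) * stride
    w = math.exp(tw) * stride
    h = math.exp(th) * stride
    return cx, cy, w, h


def to_corners(cx, cy, w, h):
    """Convert a center-format box into (x1, y1, x2, y2)."""
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


# Hypothetical prediction at grid cell (3, 2) on a stride-8 feature map
box = decode_cell((0.5, 0.5, 0.0, 0.0), grid_x=3, grid_y=2, stride=8)
print(box)               # (28.0, 20.0, 8.0, 8.0)
print(to_corners(*box))  # (24.0, 16.0, 32.0, 24.0)
```

Because each cell predicts a single box directly, there are no per-dataset anchor shapes to tune, which is part of the "simpler design" mentioned above.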

Updates!!

  • 【2023/02/28】 We support an assignment visualization tool; see doc here.
  • 【2022/04/14】 We support JIT-compiled ops.
  • 【2021/08/19】 We optimized the training process, with 2x faster training and ~1% higher performance! See notes for more details.
  • 【2021/08/05】 We released the MegEngine version of YOLOX.
  • 【2021/07/28】 We fixed a fatal memory leak.
  • 【2021/07/26】 We now support MegEngine deployment.
  • 【2021/07/20】 We have released our technical report on Arxiv.

Benchmark

Standard Models.

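| Model | size | mAP val 0.5:0.95 | mAP test 0.5:0.95 | Speed V100 (ms) | Params (M) | FLOPs (G) | weights |
|---|---|---|---|---|---|---|---|
| YOLOX-s | 640 | 40.5 | 40.5 | 9.8 | 9.0 | 26.8 | github |
| YOLOX-m | 640 | 46.9 | 47.2 | 12.3 | 25.3 | 73.8 | github |
| YOLOX-l | 640 | 49.7 | 50.1 | 14.5 | 54.2 | 155.6 | github |
| YOLOX-x | 640 | 51.1 | 51.5 | 17.3 | 99.1 | 281.9 | github |
| YOLOX-Darknet53 | 640 | 47.7 | 48.0 | 11.1 | 63.7 | 185.3 | github |

Legacy models

| Model | size | mAP test 0.5:0.95 | Speed V100 (ms) | Params (M) | FLOPs (G) | weights |
|---|---|---|---|---|---|---|
| YOLOX-s | 640 | 39.6 | 9.8 | 9.0 | 26.8 | onedrive/github |
| YOLOX-m | 640 | 46.4 | 12.3 | 25.3 | 73.8 | onedrive/github |
| YOLOX-l | 640 | 50.0 | 14.5 | 54.2 | 155.6 | onedrive/github |
| YOLOX-x | 640 | 51.2 | 17.3 | 99.1 | 281.9 | onedrive/github |
| YOLOX-Darknet53 | 640 | 47.4 | 11.1 | 63.7 | 185.3 | onedrive/github |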
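
As a quick reading aid for the speed column, the sketch below converts the per-image V100 latencies listed for the standard models into approximate throughput. It assumes the latencies are per-image figures at batch size 1, so frames per second is simply 1000 divided by the latency in milliseconds; the dictionary and the fps helper are illustrative names, not part of YOLOX.

```python
# V100 latencies in milliseconds, copied from the standard models table
LATENCY_MS = {
    "YOLOX-s": 9.8,
    "YOLOX-m": 12.3,
    "YOLOX-l": 14.5,
    "YOLOX-x": 17.3,
    "YOLOX-Darknet53": 11.1,
}


def fps(latency_ms):
    """Approximate frames per second for a per-image latency in milliseconds."""
    return 1000.0 / latency_ms


for name, ms in LATENCY_MS.items():
    print(f"{name}: {ms} ms per image, about {fps(ms):.0f} FPS")
```

These numbers cover model inference only under the stated assumption; end-to-end pipelines with decoding and drawing will be slower.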

Light Models.

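| Model | size | mAP val 0.5:0.95 | Params (M) | FLOPs (G) | weights |
|---|---|---|---|---|---|
| YOLOX-Nano | 416 | 25.8 | 0.91 | 1.08 | github |
| YOLOX-Tiny | 416 | 32.8 | 5.06 | 6.45 | github |

Legacy models

| Model | size | mAP val 0.5:0.95 | Params (M) | FLOPs (G) | weights |
|---|---|---|---|---|---|
| YOLOX-Nano | 416 | 25.3 | 0.91 | 1.08 | github |
| YOLOX-Tiny | 416 | 32.8 | 5.06 | 6.45 | github |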

Quick Start

Installation

Step1. Install YOLOX from source.

git clone git@github.com:Megvii-BaseDetection/YOLOX.git
cd YOLOX
pip3 install -v -e .  # or  python3 setup.py develop

Demo

Step1. Download a pretrained model from the benchmark table.

Step2. Use either -n or -f to specify your detector's config. For example:

python tools/demo.py image -n yolox-s -c /path/to/your/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]

or

python tools/demo.py image -f exps/default/yolox_s.py -c /path/to/your/yolox_s.pth --path assets/dog.jpg --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]

Demo for video:

python tools/demo.py video -n yolox-s -c /path/to/your/yolox_s.pth --path /path/to/your/video --conf 0.25 --nms 0.45 --tsize 640 --save_result --device [cpu/gpu]
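
The --conf and --nms options in the demo commands control score filtering and non-maximum suppression. The following is a minimal pure-Python sketch of that idea, not the YOLOX implementation: it drops boxes scoring below conf, then greedily keeps the highest-scoring boxes and suppresses any box whose IoU with an already kept box exceeds nms. It ignores class labels for brevity, and the function names iou and filter_and_nms are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets, conf=0.25, nms=0.45):
    """Greedy NMS over detections given as (x1, y1, x2, y2, score)."""
    candidates = sorted(
        (d for d in dets if d[4] >= conf), key=lambda d: d[4], reverse=True
    )
    kept = []
    for det in candidates:
        # Keep a box only if it does not overlap too much with a better one
        if all(iou(det, k) <= nms for k in kept):
            kept.append(det)
    return kept


# Hypothetical detections, for illustration only
dets = [
    (0, 0, 10, 10, 0.9),
    (1, 1, 11, 11, 0.8),    # IoU with the first box is about 0.68, suppressed
    (50, 50, 60, 60, 0.6),
    (0, 0, 5, 5, 0.1),      # below the confidence threshold, dropped
]
print(filter_and_nms(dets))
```

Lowering --conf surfaces more low-confidence boxes, while raising --nms lets more overlapping boxes survive.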

Reproduce our results on COCO

Step1. Prepare COCO dataset

cd <YOLOX_HOME>
ln -s /path/to/your/COCO ./datasets/COCO

Step2. Reproduce our results on COCO by specifying -n:

python -m yolox.tools.train -n yolox-s -d 8 -b 64 --fp16 -o [--cache]
                               yolox-m
                               yolox-l
                               yolox-x

  • -d: number of GPU devices
  • -b: total batch size; the recommended value for -b is num-gpu * 8
  • --fp16: mixed precision training
  • --cache: cache images in RAM to accelerate training, which requires a large amount of system RAM
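
The relationship between -d and -b can be checked with a little arithmetic. The sketch below computes the per-GPU batch size and the recommended total batch size from the rule above. It also shows linear learning-rate scaling under the assumption that the default experiment uses a per-image base learning rate of 0.01 / 64; that value and all function names here are assumptions for illustration, so check your experiment file before relying on them.

```python
def per_gpu_batch(total_batch, num_gpus):
    """Split the total batch size (-b) evenly across devices (-d)."""
    if total_batch % num_gpus != 0:
        raise ValueError("total batch size should be divisible by the number of GPUs")
    return total_batch // num_gpus


def recommended_total_batch(num_gpus, per_gpu=8):
    """Recommended -b value: num-gpu * 8."""
    return num_gpus * per_gpu


def scaled_lr(total_batch, basic_lr_per_img=0.01 / 64):
    """Learning rate scaled linearly with the total batch size (assumed default)."""
    return basic_lr_per_img * total_batch


print(per_gpu_batch(64, 8))         # 8 images per GPU for "-d 8 -b 64"
print(recommended_total_batch(4))   # 32 for a 4-GPU machine
print(scaled_lr(64), scaled_lr(128))
```

The practical takeaway is that changing the number of GPUs usually means changing -b as well, so each device still sees about 8 images per step.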

When using -f, the above commands are equivalent to:

python -m yolox.tools.train -f exps/default/yolox_s.py -d 8 -b 64 --fp16 -o [--cache]
                               exps/default/yolox_m.py
                               exps/default/yolox_l.py
                               exps/default/yolox_x.py

Multi Machine Training

We also support multi-node training. Just add the following args:

  • --num_machines: num of your total training nodes
  • --machine_rank: specify the rank of each node

Suppose you want to train YOLOX on 2 machines, your master machine's IP is 123.123.123.123, and you use port 12312 over TCP.

On master machine, run

python tools/train.py -n yolox-s -b 128 --dist-url tcp://123.123.123.123:12312 --num_machines 2 --machine_rank 0

On the second machine, run

python tools/train.py -n yolox-s -b 128 --dist-url tcp://123.123.123.123:12312 --num_machines 2 --machine_rank 1
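
The two commands above differ only in --machine_rank, so they can be generated rather than typed by hand. The helper below is a small sketch of that idea; the function node_commands is hypothetical and only assembles command strings from the flags shown in this section. It does not launch anything.

```python
import shlex


def node_commands(model, total_batch, master_ip, port, num_machines):
    """Build one training command per node; only --machine_rank differs."""
    dist_url = f"tcp://{master_ip}:{port}"
    commands = []
    for rank in range(num_machines):
        args = [
            "python", "tools/train.py",
            "-n", model,
            "-b", str(total_batch),
            "--dist-url", dist_url,
            "--num_machines", str(num_machines),
            "--machine_rank", str(rank),
        ]
        commands.append(shlex.join(args))
    return commands


for cmd in node_commands("yolox-s", 128, "123.123.123.123", 12312, num_machines=2):
    print(cmd)
```

Rank 0 is the master machine, and every node must use the same --dist-url pointing at it.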

Logging to Weights & Biases

To log metrics, predictions and model checkpoints to W&B use the command line argument --logger wandb and use the prefix "wandb-" to specify arguments for initializing the wandb run.

python tools/train.py -n yolox-s -d 8 -b 64 --fp16 -o [--cache] --logger wandb wandb-project <project name>
                         yolox-m
                         yolox-l
                         yolox-x

An example wandb dashboard is available here

Others

See more information with the following command:

python -m yolox.tools.train --help

Evaluation

We support batch testing for fast evaluation:

python -m yolox.tools.eval -n  yolox-s -c yolox_s.pth -b 64 -d 8 --conf 0.001 [--fp16] [--fuse]
                               yolox-m
                               yolox-l
                               yolox-x

  • --fuse: fuse conv and bn
  • -d: number of GPUs used for evaluation. DEFAULT: All GPUs available will be used.
  • -b: total batch size across all GPUs

To reproduce the speed test, we use the following command:

python -m yolox.tools.eval -n  yolox-s -c yolox_s.pth -b 1 -d 1 --conf 0.001 --fp16 --fuse
                               yolox-m
                               yolox-l
                               yolox-x

Tutorials

Deployment

  1. MegEngine in C++ and Python
  2. ONNX export and ONNXRuntime
  3. TensorRT in C++ and Python
  4. ncnn in C++ and Java
  5. OpenVINO in C++ and Python
  6. Accelerate YOLOX inference with nebullvm in Python

Third-party resources

Cite YOLOX

If you use YOLOX in your research, please cite our work by using the following BibTeX entry:

@article{yolox2021,
  title={YOLOX: Exceeding YOLO Series in 2021},
  author={Ge, Zheng and Liu, Songtao and Wang, Feng and Li, Zeming and Sun, Jian},
  journal={arXiv preprint arXiv:2107.08430},
  year={2021}
}

In memory of Dr. Jian Sun

Without the guidance of Dr. Jian Sun, YOLOX would not have been released and open sourced to the community. The passing away of Dr. Sun is a huge loss to the Computer Vision field. We add this section here to express our remembrance and condolences to our captain Dr. Sun. It is hoped that every AI practitioner in the world will stick to the belief of "continuous innovation to expand cognitive boundaries, and extraordinary technology to achieve product value" and move forward all the way.
