YOLOv6
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
Top Related Projects
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
OpenMMLab Detection Toolbox and Benchmark
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)
Quick Overview
YOLOv6 is an object detection model developed by Meituan, designed for industrial applications. It offers state-of-the-art performance while maintaining high efficiency, making it suitable for various scenarios from embedded devices to large-scale server clusters.
Pros
- High performance-to-speed ratio, suitable for real-time applications
- Supports multiple deployment backends, including TensorRT (NVIDIA GPUs) and NCNN (mobile/embedded devices)
- Provides different model sizes (N/T/S/M/L) for different application needs
- Regularly updated with new features and improvements
Cons
- Limited documentation compared to some other YOLO versions
- Fewer pre-trained models available compared to more established object detection frameworks
- May require more fine-tuning for specific use cases
- Relatively new, so the community and ecosystem are still growing
Code Examples
- Loading a pre-trained YOLOv6 model:
from yolov6.utils.events import load_yaml
from yolov6.layers.common import DetectBackend
from yolov6.utils.nms import non_max_suppression
# Load model
model = DetectBackend('yolov6s.pt', device='cpu')
model.model.eval()
- Performing inference on an image:
import cv2
import torch

# Load and preprocess the image (resize to a stride-divisible size, e.g. 640x640)
img = cv2.imread('image.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (640, 640))
img = torch.from_numpy(img).permute(2, 0, 1).float().div(255.0).unsqueeze(0)

# Run inference
with torch.no_grad():
    pred = model(img)
# Filter raw predictions with non-maximum suppression
pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45, classes=None, agnostic=False, max_det=1000)
- Visualizing detection results:
import numpy as np
from yolov6.utils.plotting import plot_box_and_label

# Class names (load_yaml is imported in the first snippet)
names = load_yaml('data/coco.yaml')['names']
# Convert the tensor back to a uint8 BGR image for drawing
im0 = cv2.cvtColor((img[0].permute(1, 2, 0).numpy() * 255).astype(np.uint8), cv2.COLOR_RGB2BGR)
# Draw bounding boxes and labels
for det in pred:
    if len(det):
        for *xyxy, conf, cls in reversed(det):
            lw = max(round(sum(im0.shape) / 2 * 0.003), 2)  # line width scaled to image size
            plot_box_and_label(im0, lw, xyxy, label=f'{names[int(cls)]} {conf:.2f}')
# Display the result
cv2.imshow('YOLOv6 Detection', im0)
cv2.waitKey(0)
Getting Started
- Clone the repository:
git clone https://github.com/meituan/YOLOv6.git
cd YOLOv6
- Install dependencies:
pip install -r requirements.txt
- Download a pre-trained model:
wget https://github.com/meituan/YOLOv6/releases/download/0.3.0/yolov6s.pt
- Run inference on an image:
python tools/infer.py --weights yolov6s.pt --source path/to/image.jpg
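Optionally, sanity-check the setup from Python by loading the downloaded weights with the DetectBackend helper shown in the Code Examples above (a minimal sketch; the dummy 640x640 input and CPU device are only for a quick smoke test):
import torch
from yolov6.layers.common import DetectBackend

# Load the downloaded checkpoint and run one forward pass on a dummy input
model = DetectBackend('yolov6s.pt', device='cpu')
model.model.eval()
with torch.no_grad():
    out = model(torch.zeros(1, 3, 640, 640))
print('Forward pass OK')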
Competitor Comparisons
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Pros of YOLOv5
- More established and widely adopted in the community
- Extensive documentation and tutorials available
- Broader range of pre-trained models for various tasks
Cons of YOLOv5
- Slightly lower inference speed compared to YOLOv6
- May require more computational resources for training
Code Comparison
YOLOv5:
from yolov5 import YOLOv5
model = YOLOv5('yolov5s.pt')
results = model('image.jpg')
YOLOv6:
from yolov6.core.inferer import Inferer
inferer = Inferer(model='yolov6s.pt', device='cpu')
results = inferer.infer('image.jpg')
Both repositories offer similar ease of use for inference, with YOLOv5 having a slightly more straightforward API. YOLOv5 provides a more comprehensive ecosystem with additional tools and utilities, while YOLOv6 focuses on improved speed and efficiency.
YOLOv5 is better suited for users who prioritize community support and a wide range of pre-trained models. YOLOv6 may be preferred by those seeking cutting-edge performance and faster inference times, especially on edge devices or in real-time applications.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Pros of YOLOv7
- Higher accuracy and performance on benchmark datasets
- More extensive documentation and community support
- Includes additional features like instance segmentation
Cons of YOLOv7
- Larger model size, potentially slower inference
- More complex architecture, which may be harder to understand and modify
- Less focus on mobile and edge device deployment
Code Comparison
YOLOv7:
class YOLOv7(nn.Module):
    def __init__(self, nc=80):
        super().__init__()
        self.model = parse_model(yaml_file, ch=[3])
        self.nc = nc
YOLOv6:
class YOLOv6(nn.Module):
    def __init__(self, config, channels=3, num_classes=None, anchors=None):
        super().__init__()
        self.backbone = build_backbone(config, channels, False)
        self.neck = build_neck(config)
The code snippets show that YOLOv7 uses a YAML file for model configuration, while YOLOv6 uses a more modular approach with separate backbone and neck components. YOLOv7's implementation may be more flexible, but YOLOv6's structure could be easier to customize for specific use cases.
Both repositories offer state-of-the-art object detection capabilities, with YOLOv7 focusing on high performance and additional features, while YOLOv6 emphasizes efficiency and mobile deployment. The choice between them depends on specific project requirements and hardware constraints.
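For a concrete sense of YOLOv6's modular configuration style, here is a trimmed, hypothetical sketch of a Python config in the spirit of configs/yolov6s.py (key names are illustrative, not the exact upstream file):
# hypothetical, trimmed YOLOv6-style config; key names are illustrative
model = dict(
    type='YOLOv6s',
    depth_multiple=0.33,
    width_multiple=0.50,
    backbone=dict(type='EfficientRep'),
    neck=dict(type='RepPANNeck'),
    head=dict(type='EffiDeHead', num_classes=80),
)
The build_backbone, build_neck, and build_head calls shown above construct each component from such a config, which is what makes swapping individual parts straightforward.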
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
Pros of darknet
- More established and widely used, with a larger community and extensive documentation
- Supports a broader range of YOLO versions (YOLOv3, YOLOv4, etc.)
- Offers pre-trained models for various tasks and datasets
Cons of darknet
- Written in C, which may be less accessible for some developers compared to YOLOv6's Python implementation
- Generally slower inference speed compared to YOLOv6's optimized architecture
- Less focus on mobile and edge device deployment
Code Comparison
darknet (C):
layer make_yolo_layer(int batch, int w, int h, int n, int total, int *mask, int classes)
{
    int i;
    layer l = {0};
    l.type = YOLO;
    l.n = n;
    l.total = total;
    l.batch = batch;
    l.h = h;
    l.w = w;
    l.c = n*(classes + 4 + 1);
    l.out_w = l.w;
    l.out_h = l.h;
    l.out_c = l.c;
    l.classes = classes;
    l.cost = calloc(1, sizeof(float));
    l.biases = calloc(total*2, sizeof(float));
    if(mask) l.mask = mask;
    else{
        l.mask = calloc(n, sizeof(int));
        for(i = 0; i < n; ++i){
            l.mask[i] = i;
        }
    }
    l.bias_updates = calloc(n*2, sizeof(float));
    l.outputs = h*w*n*(classes + 4 + 1);
    l.inputs = l.outputs;
    l.truths = 90*(4 + 1);
    l.delta = calloc(batch*l.outputs, sizeof(float));
    l.output = calloc(batch*l.outputs, sizeof(float));
    l.forward = forward_yolo_layer;
    l.backward = backward_yolo_layer;
#ifdef GPU
    l.forward_gpu = forward_yolo_layer_gpu;
    l.backward_gpu = backward_yolo_layer_gpu;
    l.output_gpu = cuda_make_array(l.output, batch*l.outputs);
    l.delta_gpu = cuda_make_array(l.delta, batch*l.outputs);
#endif
    fprintf(stderr, "yolo\n");
    srand(0);
    return l;
}
YOLOv6 (Python):
class YOLOv6(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.backbone = build_backbone(config)
        self.neck = build_neck(config)
        self.head = build_head(config)

    def forward(self, x):
        x = self.backbone(x)
        x = self.neck(x)
        x = self.head(x)
        return x
OpenMMLab Detection Toolbox and Benchmark
Pros of mmdetection
- Extensive model zoo with a wide variety of pre-trained models
- Highly modular and customizable architecture
- Comprehensive documentation and active community support
Cons of mmdetection
- Steeper learning curve due to its complexity
- Potentially slower inference speed compared to YOLOv6
- Larger codebase and higher resource requirements
Code Comparison
mmdetection:
from mmdet.apis import init_detector, inference_detector
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'test.jpg')
YOLOv6:
from yolov6.utils.events import LOGGER
from yolov6.core.inferer import Inferer
model = Inferer('yolov6s.pt', device='cuda:0')
img = 'test.jpg'
result = model.infer(img, conf_thres=0.25, iou_thres=0.45, classes=None)
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- More comprehensive and flexible framework for object detection, segmentation, and other CV tasks
- Extensive documentation and community support
- Built on PyTorch, offering easier customization and integration with other PyTorch models
Cons of Detectron2
- Steeper learning curve due to its complexity and extensive features
- Generally slower inference speed compared to YOLOv6
- Requires more computational resources for training and inference
Code Comparison
Detectron2:
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
YOLOv6:
from yolov6.core.inferer import Inferer

# Illustrative usage; the exact Inferer constructor signature varies across YOLOv6 versions
inferer = Inferer(model='yolov6s.pt', device='cuda:0', img_size=640)
results = inferer.infer(img_path)
implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)
Pros of YOLOR
- Implements a unified network for object detection, offering potential performance improvements
- Provides pre-trained models for various tasks, including object detection and instance segmentation
- Supports multiple backbones and offers flexibility in model architecture
Cons of YOLOR
- Less frequent updates and maintenance compared to YOLOv6
- May have higher computational requirements due to its unified network approach
- Documentation could be more comprehensive for easier implementation and customization
Code Comparison
YOLOv6:
from yolov6.core.evaler import Evaler
from yolov6.utils.config import Config
cfg = Config.fromfile('configs/yolov6s.py')
evaler = Evaler(cfg, img_size=640)
evaler.eval()
YOLOR:
from yolor.models.models import *
from yolor.utils.datasets import *
from yolor.utils.utils import *
model = Darknet('cfg/yolor_p6.cfg', img_size=640)
model.load_state_dict(torch.load('yolor_p6.pt')['model'])
Both repositories offer YOLO-based object detection models, but YOLOv6 focuses on efficiency and deployment optimization, while YOLOR emphasizes a unified network approach. YOLOv6 provides more recent updates and extensive documentation, making it potentially easier to implement and customize for specific use cases.
README
English | 简体中文
YOLOv6
Implementation of paper:
- YOLOv6 v3.0: A Full-Scale Reloading
- YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
What's New
- [2023.09.15] Release YOLOv6-Segmentation. Performance
- [2023.04.28] Release YOLOv6Lite models for mobile and CPU. Mobile Benchmark
- [2023.03.10] Release YOLOv6-Face. Performance
- [2023.03.02] Update base models to version 3.0.
- [2023.01.06] Release P6 models and enhance the performance of P5 models. Benchmark
- [2022.11.04] Release base models to simplify the training and deployment process.
- [2022.09.06] Customized quantization methods. Quantization Tutorial
- [2022.09.05] Release M/L models and update N/T/S models with enhanced performance.
- [2022.06.23] Release N/T/S models with excellent performance.
Benchmark
Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (fps) | Speed T4 TRT FP16 b32 (fps) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|
YOLOv6-N | 640 | 37.5 | 779 | 1187 | 4.7 | 11.4 |
YOLOv6-S | 640 | 45.0 | 339 | 484 | 18.5 | 45.3 |
YOLOv6-M | 640 | 50.0 | 175 | 226 | 34.9 | 85.8 |
YOLOv6-L | 640 | 52.8 | 98 | 116 | 59.6 | 150.7 |
YOLOv6-N6 | 1280 | 44.9 | 228 | 281 | 10.4 | 49.8 |
YOLOv6-S6 | 1280 | 50.3 | 98 | 108 | 41.4 | 198.0 |
YOLOv6-M6 | 1280 | 55.2 | 47 | 55 | 79.6 | 379.5 |
YOLOv6-L6 | 1280 | 57.2 | 26 | 29 | 140.4 | 673.4 |
Table Notes
- All checkpoints are trained with self-distillation, except for the YOLOv6-N6/S6 models, which are trained for 300 epochs without distillation.
- mAP and speed results are evaluated on the COCO val2017 dataset with an input resolution of 640×640 for P5 models and 1280×1280 for P6 models.
- Speed is tested with TensorRT 7.2 on T4.
- Refer to Test speed tutorial to reproduce the speed results of YOLOv6.
- Params and FLOPs of YOLOv6 are estimated on deployed models.
Legacy models
Model | Size | mAP val 0.5:0.95 | Speed T4 TRT FP16 b1 (fps) | Speed T4 TRT FP16 b32 (fps) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|
YOLOv6-N | 640 | 35.9 (300e) / 36.3 (400e) | 802 | 1234 | 4.3 | 11.1 |
YOLOv6-T | 640 | 40.3 (300e) / 41.1 (400e) | 449 | 659 | 15.0 | 36.7 |
YOLOv6-S | 640 | 43.5 (300e) / 43.8 (400e) | 358 | 495 | 17.2 | 44.2 |
YOLOv6-M | 640 | 49.5 | 179 | 233 | 34.3 | 82.2 |
YOLOv6-L-ReLU | 640 | 51.7 | 113 | 149 | 58.5 | 144.0 |
YOLOv6-L | 640 | 52.5 | 98 | 121 | 58.5 | 144.0 |
- Speed is tested with TensorRT 7.2 on T4.
- mAP is reported for checkpoints trained for 300 epochs (300e) and 400 epochs (400e).
Quantized model
Model | Size | Precision | mAP val 0.5:0.95 | Speed T4 TRT b1 (fps) | Speed T4 TRT b32 (fps) |
---|---|---|---|---|---|
YOLOv6-N RepOpt | 640 | INT8 | 34.8 | 1114 | 1828 |
YOLOv6-N | 640 | FP16 | 35.9 | 802 | 1234 |
YOLOv6-T RepOpt | 640 | INT8 | 39.8 | 741 | 1167 |
YOLOv6-T | 640 | FP16 | 40.3 | 449 | 659 |
YOLOv6-S RepOpt | 640 | INT8 | 43.3 | 619 | 924 |
YOLOv6-S | 640 | FP16 | 43.5 | 377 | 541 |
- Speed is tested with TensorRT 8.4 on T4.
- Accuracy is measured on models trained for 300 epochs.
Mobile Benchmark
Model | Size | mAP val 0.5:0.95 | sm8350 (ms) | mt6853 (ms) | sdm660 (ms) | Params (M) | FLOPs (G) |
---|---|---|---|---|---|---|---|
YOLOv6Lite-S | 320×320 | 22.4 | 7.99 | 11.99 | 41.86 | 0.55 | 0.56 |
YOLOv6Lite-M | 320×320 | 25.1 | 9.08 | 13.27 | 47.95 | 0.79 | 0.67 |
YOLOv6Lite-L | 320×320 | 28.0 | 11.37 | 16.20 | 61.40 | 1.09 | 0.87 |
YOLOv6Lite-L | 320×192 | 25.0 | 7.02 | 9.66 | 36.13 | 1.09 | 0.52 |
YOLOv6Lite-L | 224×128 | 18.9 | 3.63 | 4.99 | 17.76 | 1.09 | 0.24 |
Table Notes
- We provide a series of models with different sizes and input aspect ratios for mobile deployment, to support flexible applications in different scenarios.
- All checkpoints are trained for 400 epochs without distillation.
- mAP and speed results are evaluated on the COCO val2017 dataset, with the input resolution given in the Size column.
- Speed is tested on MNN 2.3.0 AArch64 with 2 threads and ARMv8.2 acceleration. Inference is warmed up 10 times and then run for 100 cycles.
- Qualcomm 888 (sm8350), Dimensity 720 (mt6853), and Qualcomm 660 (sdm660) represent high-, mid-, and low-end chips respectively, and can be used as a reference for model performance on different hardware.
- Refer to Test NCNN Speed tutorial to reproduce the NCNN speed results of YOLOv6Lite.
Quick Start
Install
git clone https://github.com/meituan/YOLOv6
cd YOLOv6
pip install -r requirements.txt
Reproduce our results on COCO
Please refer to Train COCO Dataset.
Finetune on custom data
Single GPU
# P5 models
python tools/train.py --batch 32 --conf configs/yolov6s_finetune.py --data data/dataset.yaml --fuse_ab --device 0
# P6 models
python tools/train.py --batch 32 --conf configs/yolov6s6_finetune.py --data data/dataset.yaml --img 1280 --device 0
Multi GPUs (DDP mode recommended)
# P5 models
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py --batch 256 --conf configs/yolov6s_finetune.py --data data/dataset.yaml --fuse_ab --device 0,1,2,3,4,5,6,7
# P6 models
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py --batch 128 --conf configs/yolov6s6_finetune.py --data data/dataset.yaml --img 1280 --device 0,1,2,3,4,5,6,7
- fuse_ab: add an anchor-based auxiliary branch and use Anchor-Aided Training mode (not currently supported for P6 models)
- conf: select the config file that specifies the network, optimizer, and hyperparameters. We recommend using yolov6n/s/m/l_finetune.py when training on your custom dataset.
- data: prepare your dataset and specify the dataset paths in data.yaml (COCO-style directory layout with YOLO-format labels); a minimal example data.yaml is sketched after the directory tree below.
- Make sure your dataset is structured as follows:
├── coco
│   ├── annotations
│   │   ├── instances_train2017.json
│   │   └── instances_val2017.json
│   ├── images
│   │   ├── train2017
│   │   └── val2017
│   ├── labels
│   │   ├── train2017
│   │   └── val2017
│   ├── LICENSE
│   └── README.txt
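A minimal data.yaml for this layout might look like the following (a hedged sketch; field names follow the common YOLOv5/YOLOv6 convention and the class list is truncated):
# hypothetical data/dataset.yaml sketch; adjust paths and class names for your dataset
train: ../coco/images/train2017   # training images
val: ../coco/images/val2017       # validation images
nc: 80                            # number of classes
names: ['person', 'bicycle', 'car']   # class names (truncated; one entry per class)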
YOLOv6 supports different input resolution modes. For details, see How to Set the Input Size.
Resume training
If your training process is interrupted, you can resume training by running:
# single GPU training.
python tools/train.py --resume
# multi GPU training.
python -m torch.distributed.launch --nproc_per_node 8 tools/train.py --resume
The above command will automatically find the latest checkpoint in the YOLOv6 directory and resume training from it.
You can also pass a specific checkpoint path to the --resume parameter:
# replace /path/to/your/checkpoint/path with the checkpoint you want to resume from
--resume /path/to/your/checkpoint/path
This will resume from the specific checkpoint you provide.
Evaluation
Reproduce mAP on the COCO val2017 dataset at 640×640 or 1280×1280 resolution:
# P5 models
python tools/eval.py --data data/coco.yaml --batch 32 --weights yolov6s.pt --task val --reproduce_640_eval
# P6 models
python tools/eval.py --data data/coco.yaml --batch 32 --weights yolov6s6.pt --task val --reproduce_640_eval --img 1280
- verbose: set True to print the mAP of each class.
- do_coco_metric: set True / False to enable / disable the pycocotools evaluation method.
- do_pr_metric: set True / False to enable / disable printing the precision and recall metrics.
- config-file: specify a config file that defines all the eval parameters, for example: yolov6n_with_eval_params.py
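For example, a few of these options can be combined in a single run (a sketch; flag spelling can vary slightly between YOLOv6 versions):
python tools/eval.py --data data/coco.yaml --batch 32 --weights yolov6s.pt --task val --reproduce_640_eval --verbose --do_pr_metric True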
Inference
First, download a pretrained model from the YOLOv6 releases, or use your own trained model.
Second, run inference with tools/infer.py:
# P5 models
python tools/infer.py --weights yolov6s.pt --source img.jpg / imgdir / video.mp4
# P6 models
python tools/infer.py --weights yolov6s6.pt --img 1280 1280 --source img.jpg / imgdir / video.mp4
If you want to run inference on a local or web camera, you can run:
# P5 models
python tools/infer.py --weights yolov6s.pt --webcam --webcam-addr 0
# P6 models
python tools/infer.py --weights yolov6s6.pt --img 1280 1280 --webcam --webcam-addr 0
--webcam-addr can be a local camera ID or an RTSP stream address.
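For example, with an RTSP stream (the address below is just a placeholder):
python tools/infer.py --weights yolov6s.pt --webcam --webcam-addr rtsp://192.168.1.10:554/stream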
Tutorials
Third-party resources
- YOLOv6 Training with Amazon SageMaker: yolov6-sagemaker from ashwincc
- YOLOv6 NCNN Android app demo: ncnn-android-yolov6 from FeiGeChuanShu
- YOLOv6 ONNXRuntime/MNN/TNN C++: YOLOv6-ORT, YOLOv6-MNN and YOLOv6-TNN from DefTruth
- YOLOv6 TensorRT Python: yolov6-tensorrt-python from Linaom1214
- YOLOv6 web demo on Hugging Face Spaces with Gradio
- Interactive demo on DagsHub with Streamlit
- Tutorial: How to train YOLOv6 on a custom dataset
- YouTube Tutorial: How to train YOLOv6 on a custom dataset
- Blog post: YOLOv6 Object Detection – Paper Explanation and Inference
FAQ (Continuously updated)
If you have any questions, you are welcome to join our WeChat group for discussion.