yolov7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Top Related Projects
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Quick Overview
YOLOv7 is a state-of-the-art object detection model that improves upon previous YOLO versions. It offers real-time object detection with high accuracy and efficiency, making it suitable for various computer vision applications.
Pros
- Excellent performance in terms of speed and accuracy
- Supports both object detection and instance segmentation tasks
- Highly customizable and adaptable to different hardware configurations
- Well-documented with extensive training and inference examples
Cons
- Requires significant computational resources for training
- Complex architecture may be challenging for beginners to understand and modify
- Limited support for older hardware or low-resource environments
- Dependency on specific versions of libraries may cause compatibility issues
Code Examples
- Loading a pre-trained YOLOv7 model:
from models.experimental import attempt_load
model = attempt_load('yolov7.pt', map_location='cuda')  # loads the released weights onto the GPU
model.eval()  # switch to inference mode
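The repository also ships a hubconf.py, so loading through torch.hub is an alternative to working inside the cloned repo. A sketch, assuming the 'custom' entry point defined in hubconf.py and a locally downloaded yolov7.pt:
import torch
# 'custom' is the entry point in the repo's hubconf.py; yolov7.pt must exist locally
model = torch.hub.load('WongKinYiu/yolov7', 'custom', 'yolov7.pt')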
- Performing inference on an image:
import cv2
import numpy as np
import torch
from utils.datasets import letterbox
from utils.general import non_max_suppression, scale_coords
img0 = cv2.imread('inference/images/horses.jpg')  # original BGR image
img = letterbox(img0, new_shape=640)[0]  # resize and pad to 640x640
img = img.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
img = np.ascontiguousarray(img)  # required before torch.from_numpy on a reversed view
img = torch.from_numpy(img).to('cuda').float() / 255.0  # normalize to [0, 1]
pred = model(img[None])[0]  # add batch dimension and run the model
pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45)
- Visualizing detection results:
from utils.plots import plot_one_box
det = pred[0]  # detections for the first image: (x1, y1, x2, y2, conf, cls)
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], img0.shape).round()  # map boxes back to the original image
names = model.module.names if hasattr(model, 'module') else model.names  # class names
for *xyxy, conf, cls in reversed(det):
    label = f'{names[int(cls)]} {conf:.2f}'
    plot_one_box(xyxy, img0, label=label, line_thickness=3)  # plot_one_box picks a random color if none is given
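Since plot_one_box draws in place on img0, saving or displaying the annotated result is a one-liner with OpenCV:
import cv2
cv2.imwrite('result.jpg', img0)  # write the annotated image to disk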
Getting Started
1. Clone the repository:
git clone https://github.com/WongKinYiu/yolov7.git
cd yolov7
2. Install dependencies:
pip install -r requirements.txt
3. Download pre-trained weights:
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt
4. Run inference on an image:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg
Competitor Comparisons
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Pros of YOLOv5
- More established and widely adopted in the community
- Extensive documentation and tutorials available
- Easier to use and integrate into existing projects
Cons of YOLOv5
- Slightly lower accuracy compared to YOLOv7
- May have slower inference speed on certain hardware configurations
Code Comparison
YOLOv5 (via the official torch.hub interface):
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('image.jpg')
YOLOv7 (run from inside the cloned repo; there is no pip-installable yolov7 package with a comparable one-liner API):
from models.experimental import attempt_load
model = attempt_load('yolov7.pt', map_location='cpu')
# preprocessing and NMS are manual; see the inference example above
Both repositories offer similar functionality for object detection, but YOLOv7 aims to provide improved accuracy and performance. YOLOv5 benefits from a larger community and more extensive documentation, making it easier for beginners to get started. However, YOLOv7 may offer better performance in certain scenarios, especially for advanced users who require state-of-the-art accuracy.
The code comparison shows that YOLOv5 exposes a convenient torch.hub one-liner with preprocessing and NMS built in, while YOLOv7 expects you to work inside the repository and handle those steps yourself. Users familiar with one codebase should still find the other easy to pick up, since the project layouts are closely related.
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
Pros of darknet
- More established and mature project with a longer history
- Supports a wider range of YOLO versions (v2, v3, v4, etc.)
- Offers more comprehensive documentation and examples
Cons of darknet
- Less focused on the latest YOLO architectures
- May have slower inference speed compared to yolov7
- Requires more manual configuration and setup
Code Comparison
darknet:
layer make_yolo_layer(int batch, int w, int h, int n, int total, int *mask, int classes)
{
    int i;
    layer l = {0};
    l.type = YOLO;
yolov7 (the Detect head in models/yolo.py):
class Detect(nn.Module):
    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(Detect, self).__init__()
        self.nc = nc  # number of classes
The darknet implementation is in C, while yolov7 uses Python with PyTorch. yolov7 offers a more modern and flexible approach, leveraging deep learning frameworks for easier development and deployment.
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
Pros of YOLOv6
- Better performance on edge devices due to its lightweight design
- Faster inference speed, especially on mobile platforms
- More frequent updates and active development from Meituan
Cons of YOLOv6
- Less versatile compared to YOLOv7, with fewer pre-trained models
- Lower accuracy on certain datasets, particularly for larger objects
- Limited documentation and community support compared to YOLOv7
Code Comparison
YOLOv6 (simplified; the actual Inferer constructor takes more arguments than shown — see tools/infer.py in the YOLOv6 repo):
from yolov6.utils.events import LOGGER, load_yaml
from yolov6.core.inferer import Inferer
model = Inferer(weights='yolov6s.pt', device='cpu')
results = model.infer(source='image.jpg', conf_thres=0.25, iou_thres=0.45)
YOLOv7:
from models.experimental import attempt_load
from utils.general import non_max_suppression
model = attempt_load('yolov7.pt', map_location='cpu')
pred = model(img)[0]  # img must already be a preprocessed NCHW tensor
results = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45)
Both repositories offer state-of-the-art object detection capabilities, but they cater to different use cases. YOLOv6 is more suitable for edge devices and mobile applications due to its lightweight design and faster inference speed. On the other hand, YOLOv7 provides better accuracy and versatility, making it more appropriate for a wider range of applications where computational resources are less constrained.
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Pros of YOLOX
- Modular design allows for easier customization and experimentation
- Includes anchor-free detection, which can improve performance on small objects
- Offers a range of pre-trained models for different use cases
Cons of YOLOX
- May have slightly lower performance on some benchmarks compared to YOLOv7
- Less extensive documentation and community support
Code Comparison
YOLOX (an experiment config subclassing the base Exp):
import os
from yolox.exp import Exp as MyExp
class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
YOLOv7:
from models.yolo import Model
from utils.torch_utils import select_device
device = select_device('0')
model = Model('cfg/training/yolov7.yaml', ch=3, nc=80).to(device)  # model configs live under cfg/training/ in the repo
model.train()
Both repositories offer powerful object detection capabilities, but YOLOX provides a more modular approach, while YOLOv7 focuses on pushing the boundaries of performance. YOLOX's anchor-free detection can be advantageous in certain scenarios, while YOLOv7 may offer better overall performance on standard benchmarks. The choice between the two depends on specific project requirements and preferences.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- More comprehensive and flexible framework for object detection and segmentation
- Supports a wider range of models and tasks (e.g., instance segmentation, keypoint detection)
- Better documentation and community support
Cons of Detectron2
- Generally slower inference speed compared to YOLOv7
- Steeper learning curve and more complex setup process
- Requires more computational resources for training and inference
Code Comparison
Detectron2 example:
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
YOLOv7 example:
from models.experimental import attempt_load
from utils.general import non_max_suppression
model = attempt_load('yolov7.pt')
pred = model(img)[0]
pred = non_max_suppression(pred)
README
Official YOLOv7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Web Demo
- Integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo
Performance
MS COCO
Model | Test Size | AP<sup>test</sup> | AP<sub>50</sub><sup>test</sup> | AP<sub>75</sub><sup>test</sup> | batch 1 fps | batch 32 average time
--- | --- | --- | --- | --- | --- | ---
YOLOv7 | 640 | 51.4% | 69.7% | 55.9% | 161 fps | 2.8 ms
YOLOv7-X | 640 | 53.1% | 71.2% | 57.8% | 114 fps | 4.3 ms
YOLOv7-W6 | 1280 | 54.9% | 72.6% | 60.1% | 84 fps | 7.6 ms
YOLOv7-E6 | 1280 | 56.0% | 73.5% | 61.2% | 56 fps | 12.3 ms
YOLOv7-D6 | 1280 | 56.6% | 74.0% | 61.8% | 44 fps | 15.0 ms
YOLOv7-E6E | 1280 | 56.8% | 74.4% | 62.1% | 36 fps | 18.7 ms
Installation
Docker environment (recommended)
# create the docker container; you can increase the shared memory size if you have more available
nvidia-docker run --name yolov7 -it -v your_coco_path/:/coco/ -v your_code_path/:/yolov7 --shm-size=64g nvcr.io/nvidia/pytorch:21.08-py3
# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx
# pip install required packages
pip install seaborn thop
# go to code folder
cd /yolov7
Testing
Download the released weights you want to evaluate:
- yolov7.pt
- yolov7x.pt
- yolov7-w6.pt
- yolov7-e6.pt
- yolov7-d6.pt
- yolov7-e6e.pt
python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
You will get the results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.51206
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.69730
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.55521
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35247
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.55937
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66693
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.38453
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.63765
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.68772
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.53766
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.73549
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.83868
To measure accuracy, download the COCO annotations for pycocotools to ./coco/annotations/instances_val2017.json.
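As a sanity check outside test.py, the standard pycocotools evaluation loop looks roughly like this. A minimal sketch; the predictions path is hypothetical and depends on your --name setting and on running test.py with JSON output enabled:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

anno = COCO('./coco/annotations/instances_val2017.json')  # ground-truth annotations
pred = anno.loadRes('runs/test/yolov7_640_val/best_predictions.json')  # hypothetical detections file from test.py
coco_eval = COCOeval(anno, pred, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints an AP/AR table like the one shown above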
Training
Data preparation
bash scripts/get_coco.sh
- Download the MS COCO dataset images (train, val, test) and labels. If you have previously used a different version of YOLO, we strongly recommend deleting the train2017.cache and val2017.cache files and re-downloading the labels.
Single GPU training
# train p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Multiple GPU training
# train p5 models
python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch-size 128 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_aux.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 128 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Transfer learning
Pre-trained weights for fine-tuning:
- yolov7_training.pt
- yolov7x_training.pt
- yolov7-w6_training.pt
- yolov7-e6_training.pt
- yolov7-d6_training.pt
- yolov7-e6e_training.pt
Single GPU finetuning for custom dataset
# finetune p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7-custom.yaml --weights 'yolov7_training.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml
# finetune p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/custom.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6-custom.yaml --weights 'yolov7-w6_training.pt' --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml
Re-parameterization
See reparameterization.ipynb for converting trained models into their deploy form.
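The notebook transplants trained weights into a deploy-time architecture in which multi-branch blocks collapse into single convolutions. The core trick is standard RepVGG-style folding; a generic, minimal sketch of one such step (folding a BatchNorm into its preceding convolution — not the repo's exact procedure):
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    # fold BatchNorm statistics into the preceding convolution
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding,
                      dilation=conv.dilation, groups=conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # per-channel gamma / sigma
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = bn.bias.data + (bias - bn.running_mean) * scale
    return fused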
Inference
On video:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source yourvideo.mp4
On image:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg
Export
Pytorch to CoreML (and inference on MacOS/iOS)
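The repo covers this route in a dedicated notebook with the full, tested procedure. The usual shape of such a conversion (a hedged sketch, not the notebook's exact code) is to trace the model and hand the trace to coremltools:
import torch
import coremltools as ct
from models.experimental import attempt_load

model = attempt_load('yolov7.pt', map_location='cpu')
model.eval()
example = torch.zeros(1, 3, 640, 640)  # dummy NCHW input
traced = torch.jit.trace(model, example, strict=False)  # strict=False: the model returns a tuple
mlmodel = ct.convert(traced, inputs=[ct.TensorType(name='image', shape=example.shape)])
mlmodel.save('yolov7.mlmodel')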
Pytorch to ONNX with NMS (and inference)
python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
--topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --max-wh 640
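Once exported with --end2end, the ONNX graph embeds NMS, so inference reduces to a single session.run. A minimal onnxruntime sketch (the dummy input stands in for a real preprocessed image, prepared as in the detect example above):
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession('yolov7-tiny.onnx', providers=['CPUExecutionProvider'])
inp = np.random.rand(1, 3, 640, 640).astype(np.float32)  # replace with a real preprocessed image
outputs = session.run(None, {session.get_inputs()[0].name: inp})
print(outputs[0].shape)  # detections already filtered by the embedded NMS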
Pytorch to TensorRT with NMS (and inference)
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights ./yolov7-tiny.pt --grid --end2end --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16
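Loading the resulting engine in Python follows the standard TensorRT pattern (a sketch only; buffer allocation and execution-context setup are omitted and vary by TensorRT version):
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open('yolov7-tiny-nms.trt', 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
# allocate device buffers and create an execution context from `engine` to run inference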
Pytorch to TensorRT another way
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights yolov7-tiny.pt --grid --include-nms
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16
# Or use trtexec to convert ONNX to TensorRT engine
/usr/src/tensorrt/bin/trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --fp16
Tested with: Python 3.7.13, Pytorch 1.12.0+cu113
Pose estimation
See keypoint.ipynb.
Instance segmentation (with NTU)
See instance.ipynb.
Instance segmentation
YOLOv7 for instance segmentation (YOLOR + YOLOv5 + YOLACT)
Model | Test Size | AP<sup>box</sup> | AP<sub>50</sub><sup>box</sup> | AP<sub>75</sub><sup>box</sup> | AP<sup>mask</sup> | AP<sub>50</sub><sup>mask</sup> | AP<sub>75</sub><sup>mask</sup>
--- | --- | --- | --- | --- | --- | --- | ---
YOLOv7-seg | 640 | 51.4% | 69.4% | 55.8% | 41.5% | 65.5% | 43.7%
Anchor free detection head
YOLOv7 with decoupled TAL head (YOLOR + YOLOv5 + YOLOv6)
Model | Test Size | AP<sup>val</sup> | AP<sub>50</sub><sup>val</sup> | AP<sub>75</sub><sup>val</sup>
--- | --- | --- | --- | ---
YOLOv7-u6 | 640 | 52.6% | 69.7% | 57.3%
Citation
@inproceedings{wang2023yolov7,
title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2023}
}
@article{wang2023designing,
title={Designing Network Design Strategies Through Gradient Path Analysis},
author={Wang, Chien-Yao and Liao, Hong-Yuan Mark and Yeh, I-Hau},
journal={Journal of Information Science and Engineering},
year={2023}
}
Teaser
YOLOv7-semantic & YOLOv7-panoptic & YOLOv7-caption
YOLOv7-semantic & YOLOv7-detection & YOLOv7-depth (with NTUT)
YOLOv7-3d-detection & YOLOv7-lidar & YOLOv7-road (with NTUT)
Acknowledgements
- https://github.com/AlexeyAB/darknet
- https://github.com/WongKinYiu/yolor
- https://github.com/WongKinYiu/PyTorch_YOLOv4
- https://github.com/WongKinYiu/ScaledYOLOv4
- https://github.com/Megvii-BaseDetection/YOLOX
- https://github.com/ultralytics/yolov3
- https://github.com/ultralytics/yolov5
- https://github.com/DingXiaoH/RepVGG
- https://github.com/JUGGHM/OREPA_CVPR2022
- https://github.com/TexasInstruments/edgeai-yolov5/tree/yolo-pose