yolov7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Top Related Projects
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Quick Overview
YOLOv7 is a state-of-the-art object detection model that improves upon previous YOLO versions. It offers real-time object detection with high accuracy and efficiency, making it suitable for various computer vision applications.
Pros
- Excellent performance in terms of speed and accuracy
- Supports both object detection and instance segmentation tasks
- Highly customizable and adaptable to different hardware configurations
- Well-documented with extensive training and inference examples
Cons
- Requires significant computational resources for training
- Complex architecture may be challenging for beginners to understand and modify
- Limited support for older hardware or low-resource environments
- Dependency on specific versions of libraries may cause compatibility issues
Code Examples
- Loading a pre-trained YOLOv7 model:
from models.experimental import attempt_load
model = attempt_load('yolov7.pt', map_location='cuda')  # loads the released weights onto the GPU
model.eval()  # switch to inference mode
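The repository also ships a hubconf.py, so loading through torch.hub is an alternative to working inside the cloned repo. A sketch, assuming the 'custom' entry point defined in hubconf.py and a locally downloaded yolov7.pt:
import torch
# 'custom' is the entry point in the repo's hubconf.py; yolov7.pt must exist locally
model = torch.hub.load('WongKinYiu/yolov7', 'custom', 'yolov7.pt')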
- Performing inference on an image:
import cv2
import numpy as np
import torch
from utils.datasets import letterbox
from utils.general import non_max_suppression, scale_coords
img0 = cv2.imread('inference/images/horses.jpg')  # original BGR image
img = letterbox(img0, new_shape=640)[0]  # resize and pad to 640x640
img = img.transpose((2, 0, 1))[::-1]  # HWC to CHW, BGR to RGB
img = np.ascontiguousarray(img)  # required before torch.from_numpy on a reversed view
img = torch.from_numpy(img).to('cuda').float() / 255.0  # normalize to [0, 1]
pred = model(img[None])[0]  # add batch dimension and run the model
pred = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45)
- Visualizing detection results:
from utils.plots import plot_one_box
det = pred[0]  # detections for the first image: (x1, y1, x2, y2, conf, cls)
det[:, :4] = scale_coords(img.shape[2:], det[:, :4], img0.shape).round()  # map boxes back to the original image
names = model.module.names if hasattr(model, 'module') else model.names  # class names
for *xyxy, conf, cls in reversed(det):
    label = f'{names[int(cls)]} {conf:.2f}'
    plot_one_box(xyxy, img0, label=label, line_thickness=3)  # plot_one_box picks a random color if none is given
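Since plot_one_box draws in place on img0, saving or displaying the annotated result is a one-liner with OpenCV:
import cv2
cv2.imwrite('result.jpg', img0)  # write the annotated image to disk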
Getting Started
1. Clone the repository:
git clone https://github.com/WongKinYiu/yolov7.git
cd yolov7
2. Install dependencies:
pip install -r requirements.txt
3. Download pre-trained weights:
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7.pt
4. Run inference on an image:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg
Competitor Comparisons
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Pros of YOLOv5
- More established and widely adopted in the community
- Extensive documentation and tutorials available
- Easier to use and integrate into existing projects
Cons of YOLOv5
- Slightly lower accuracy compared to YOLOv7
- May have slower inference speed on certain hardware configurations
Code Comparison
YOLOv5 (via the official torch.hub interface):
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('image.jpg')
YOLOv7 (run from inside the cloned repo; there is no pip-installable yolov7 package with a comparable one-liner API):
from models.experimental import attempt_load
model = attempt_load('yolov7.pt', map_location='cpu')
# preprocessing and NMS are manual; see the inference example above
Both repositories offer similar functionality for object detection, but YOLOv7 aims to provide improved accuracy and performance. YOLOv5 benefits from a larger community and more extensive documentation, making it easier for beginners to get started. However, YOLOv7 may offer better performance in certain scenarios, especially for advanced users who require state-of-the-art accuracy.
The code comparison shows that YOLOv5 exposes a convenient torch.hub one-liner with preprocessing and NMS built in, while YOLOv7 expects you to work inside the repository and handle those steps yourself. Users familiar with one codebase should still find the other easy to pick up, since the project layouts are closely related.
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
Pros of darknet
- More established and mature project with a longer history
- Supports a wider range of YOLO versions (v2, v3, v4, etc.)
- Offers more comprehensive documentation and examples
Cons of darknet
- Less focused on the latest YOLO architectures
- May have slower inference speed compared to yolov7
- Requires more manual configuration and setup
Code Comparison
darknet:
layer make_yolo_layer(int batch, int w, int h, int n, int total, int *mask, int classes)
{
    int i;
    layer l = {0};
    l.type = YOLO;
yolov7 (the Detect head in models/yolo.py):
class Detect(nn.Module):
    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(Detect, self).__init__()
        self.nc = nc  # number of classes
The darknet implementation is in C, while yolov7 uses Python with PyTorch. yolov7 offers a more modern and flexible approach, leveraging deep learning frameworks for easier development and deployment.
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
Pros of YOLOv6
- Better performance on edge devices due to its lightweight design
- Faster inference speed, especially on mobile platforms
- More frequent updates and active development from Meituan
Cons of YOLOv6
- Less versatile compared to YOLOv7, with fewer pre-trained models
- Lower accuracy on certain datasets, particularly for larger objects
- Limited documentation and community support compared to YOLOv7
Code Comparison
YOLOv6 (simplified; the actual Inferer constructor takes more arguments than shown — see tools/infer.py in the YOLOv6 repo):
from yolov6.utils.events import LOGGER, load_yaml
from yolov6.core.inferer import Inferer
model = Inferer(weights='yolov6s.pt', device='cpu')
results = model.infer(source='image.jpg', conf_thres=0.25, iou_thres=0.45)
YOLOv7:
from models.experimental import attempt_load
from utils.general import non_max_suppression
model = attempt_load('yolov7.pt', map_location='cpu')
pred = model(img)[0]  # img must already be a preprocessed NCHW tensor
results = non_max_suppression(pred, conf_thres=0.25, iou_thres=0.45)
Both repositories offer state-of-the-art object detection capabilities, but they cater to different use cases. YOLOv6 is more suitable for edge devices and mobile applications due to its lightweight design and faster inference speed. On the other hand, YOLOv7 provides better accuracy and versatility, making it more appropriate for a wider range of applications where computational resources are less constrained.
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
Pros of YOLOX
- Modular design allows for easier customization and experimentation
- Includes anchor-free detection, which can improve performance on small objects
- Offers a range of pre-trained models for different use cases
Cons of YOLOX
- May have slightly lower performance on some benchmarks compared to YOLOv7
- Less extensive documentation and community support
Code Comparison
YOLOX (an experiment config subclassing the base Exp):
import os
from yolox.exp import Exp as MyExp
class Exp(MyExp):
    def __init__(self):
        super(Exp, self).__init__()
        self.depth = 0.33
        self.width = 0.50
        self.exp_name = os.path.split(os.path.realpath(__file__))[1].split(".")[0]
YOLOv7:
from models.yolo import Model
from utils.torch_utils import select_device
device = select_device('0')
model = Model('cfg/training/yolov7.yaml', ch=3, nc=80).to(device)  # model configs live under cfg/training/ in the repo
model.train()
Both repositories offer powerful object detection capabilities, but YOLOX provides a more modular approach, while YOLOv7 focuses on pushing the boundaries of performance. YOLOX's anchor-free detection can be advantageous in certain scenarios, while YOLOv7 may offer better overall performance on standard benchmarks. The choice between the two depends on specific project requirements and preferences.
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- More comprehensive and flexible framework for object detection and segmentation
- Supports a wider range of models and tasks (e.g., instance segmentation, keypoint detection)
- Better documentation and community support
Cons of Detectron2
- Generally slower inference speed compared to YOLOv7
- Steeper learning curve and more complex setup process
- Requires more computational resources for training and inference
Code Comparison
Detectron2 example:
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
YOLOv7 example:
from models.experimental import attempt_load
from utils.general import non_max_suppression
model = attempt_load('yolov7.pt')
pred = model(img)[0]
pred = non_max_suppression(pred)
README
Official YOLOv7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Web Demo
- Integrated into Huggingface Spaces 🤗 using Gradio. Try out the Web Demo
Performance
MS COCO
Model | Test Size | AP<sup>test</sup> | AP<sub>50</sub><sup>test</sup> | AP<sub>75</sub><sup>test</sup> | batch 1 fps | batch 32 average time
--- | --- | --- | --- | --- | --- | ---
YOLOv7 | 640 | 51.4% | 69.7% | 55.9% | 161 fps | 2.8 ms
YOLOv7-X | 640 | 53.1% | 71.2% | 57.8% | 114 fps | 4.3 ms
YOLOv7-W6 | 1280 | 54.9% | 72.6% | 60.1% | 84 fps | 7.6 ms
YOLOv7-E6 | 1280 | 56.0% | 73.5% | 61.2% | 56 fps | 12.3 ms
YOLOv7-D6 | 1280 | 56.6% | 74.0% | 61.8% | 44 fps | 15.0 ms
YOLOv7-E6E | 1280 | 56.8% | 74.4% | 62.1% | 36 fps | 18.7 ms
Installation
Docker environment (recommended)
# create the docker container; you can increase the shared memory size if you have more available
nvidia-docker run --name yolov7 -it -v your_coco_path/:/coco/ -v your_code_path/:/yolov7 --shm-size=64g nvcr.io/nvidia/pytorch:21.08-py3
# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx
# pip install required packages
pip install seaborn thop
# go to code folder
cd /yolov7
Testing
Download the released weights you want to evaluate:
- yolov7.pt
- yolov7x.pt
- yolov7-w6.pt
- yolov7-e6.pt
- yolov7-d6.pt
- yolov7-e6e.pt
python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val
You will get the results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.51206
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.69730
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.55521
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35247
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.55937
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66693
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.38453
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.63765
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.68772
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.53766
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.73549
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.83868
To measure accuracy, download the COCO annotations for pycocotools to ./coco/annotations/instances_val2017.json.
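As a sanity check outside test.py, the standard pycocotools evaluation loop looks roughly like this. A minimal sketch; the predictions path is hypothetical and depends on your --name setting and on running test.py with JSON output enabled:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

anno = COCO('./coco/annotations/instances_val2017.json')  # ground-truth annotations
pred = anno.loadRes('runs/test/yolov7_640_val/best_predictions.json')  # hypothetical detections file from test.py
coco_eval = COCOeval(anno, pred, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints an AP/AR table like the one shown above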
Training
Data preparation
bash scripts/get_coco.sh
- Download the MS COCO dataset images (train, val, test) and labels. If you have previously used a different version of YOLO, we strongly recommend deleting the train2017.cache and val2017.cache files and re-downloading the labels.
Single GPU training
# train p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Multiple GPU training
# train p5 models
python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch-size 128 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml
# train p6 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_aux.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 128 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml
Transfer learning
Pre-trained weights for fine-tuning:
- yolov7_training.pt
- yolov7x_training.pt
- yolov7-w6_training.pt
- yolov7-e6_training.pt
- yolov7-d6_training.pt
- yolov7-e6e_training.pt
Single GPU finetuning for custom dataset
# finetune p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7-custom.yaml --weights 'yolov7_training.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml
# finetune p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/custom.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6-custom.yaml --weights 'yolov7-w6_training.pt' --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml
Re-parameterization
See reparameterization.ipynb for converting trained models into their deploy form.
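The notebook transplants trained weights into a deploy-time architecture in which multi-branch blocks collapse into single convolutions. The core trick is standard RepVGG-style folding; a generic, minimal sketch of one such step (folding a BatchNorm into its preceding convolution — not the repo's exact procedure):
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    # fold BatchNorm statistics into the preceding convolution
    fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                      stride=conv.stride, padding=conv.padding,
                      dilation=conv.dilation, groups=conv.groups, bias=True)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)  # per-channel gamma / sigma
    fused.weight.data = conv.weight.data * scale.reshape(-1, 1, 1, 1)
    bias = conv.bias.data if conv.bias is not None else torch.zeros(conv.out_channels)
    fused.bias.data = bn.bias.data + (bias - bn.running_mean) * scale
    return fused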
Inference
On video:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source yourvideo.mp4
On image:
python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg
Export
Pytorch to CoreML (and inference on MacOS/iOS)
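The repo covers this route in a dedicated notebook with the full, tested procedure. The usual shape of such a conversion (a hedged sketch, not the notebook's exact code) is to trace the model and hand the trace to coremltools:
import torch
import coremltools as ct
from models.experimental import attempt_load

model = attempt_load('yolov7.pt', map_location='cpu')
model.eval()
example = torch.zeros(1, 3, 640, 640)  # dummy NCHW input
traced = torch.jit.trace(model, example, strict=False)  # strict=False: the model returns a tuple
mlmodel = ct.convert(traced, inputs=[ct.TensorType(name='image', shape=example.shape)])
mlmodel.save('yolov7.mlmodel')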
Pytorch to ONNX with NMS (and inference)
python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
--topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --max-wh 640
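Once exported with --end2end, the ONNX graph embeds NMS, so inference reduces to a single session.run. A minimal onnxruntime sketch (the dummy input stands in for a real preprocessed image, prepared as in the detect example above):
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession('yolov7-tiny.onnx', providers=['CPUExecutionProvider'])
inp = np.random.rand(1, 3, 640, 640).astype(np.float32)  # replace with a real preprocessed image
outputs = session.run(None, {session.get_inputs()[0].name: inp})
print(outputs[0].shape)  # detections already filtered by the embedded NMS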
Pytorch to TensorRT with NMS (and inference)
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights ./yolov7-tiny.pt --grid --end2end --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16
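Loading the resulting engine in Python follows the standard TensorRT pattern (a sketch only; buffer allocation and execution-context setup are omitted and vary by TensorRT version):
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open('yolov7-tiny-nms.trt', 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
# allocate device buffers and create an execution context from `engine` to run inference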
Pytorch to TensorRT another way
wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights yolov7-tiny.pt --grid --include-nms
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16
# Or use trtexec to convert ONNX to TensorRT engine
/usr/src/tensorrt/bin/trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --fp16
Tested with: Python 3.7.13, Pytorch 1.12.0+cu113
Pose estimation
See keypoint.ipynb.
Instance segmentation (with NTU)
See instance.ipynb.
Instance segmentation
YOLOv7 for instance segmentation (YOLOR + YOLOv5 + YOLACT)
Model | Test Size | AP<sup>box</sup> | AP<sub>50</sub><sup>box</sup> | AP<sub>75</sub><sup>box</sup> | AP<sup>mask</sup> | AP<sub>50</sub><sup>mask</sup> | AP<sub>75</sub><sup>mask</sup>
--- | --- | --- | --- | --- | --- | --- | ---
YOLOv7-seg | 640 | 51.4% | 69.4% | 55.8% | 41.5% | 65.5% | 43.7%
Anchor free detection head
YOLOv7 with decoupled TAL head (YOLOR + YOLOv5 + YOLOv6)
Model | Test Size | AP<sup>val</sup> | AP<sub>50</sub><sup>val</sup> | AP<sub>75</sub><sup>val</sup>
--- | --- | --- | --- | ---
YOLOv7-u6 | 640 | 52.6% | 69.7% | 57.3%
Citation
@inproceedings{wang2023yolov7,
title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2023}
}
@article{wang2023designing,
title={Designing Network Design Strategies Through Gradient Path Analysis},
author={Wang, Chien-Yao and Liao, Hong-Yuan Mark and Yeh, I-Hau},
journal={Journal of Information Science and Engineering},
year={2023}
}
Teaser
YOLOv7-semantic & YOLOv7-panoptic & YOLOv7-caption
YOLOv7-semantic & YOLOv7-detection & YOLOv7-depth (with NTUT)
YOLOv7-3d-detection & YOLOv7-lidar & YOLOv7-road (with NTUT)
Acknowledgements
- https://github.com/AlexeyAB/darknet
- https://github.com/WongKinYiu/yolor
- https://github.com/WongKinYiu/PyTorch_YOLOv4
- https://github.com/WongKinYiu/ScaledYOLOv4
- https://github.com/Megvii-BaseDetection/YOLOX
- https://github.com/ultralytics/yolov3
- https://github.com/ultralytics/yolov5
- https://github.com/DingXiaoH/RepVGG
- https://github.com/JUGGHM/OREPA_CVPR2022
- https://github.com/TexasInstruments/edgeai-yolov5/tree/yolo-pose