WongKinYiu / yolor

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks (https://arxiv.org/abs/2105.04206)

Quick Overview

The WongKinYiu/yolor repository is a PyTorch implementation of YOLOR (You Only Learn One Representation), the unified network for multiple tasks described in the paper of the same name. In this repository it is used as a real-time object detector, trained and evaluated on COCO.

Pros

  • High Accuracy: The README reports up to 58.2% AP on the COCO test set (YOLOR-D6 at a test size of 1280), ahead of the YOLOv4-P5/P6/P7 results listed in the same table.
  • Real-Time Performance: The reported batch-1 throughput ranges from 34 fps (YOLOR-D6) to 106 fps (YOLOR-CSP), which makes the smaller models suitable for real-time applications such as video surveillance and autonomous vehicles.
  • Versatility: YOLOR can be applied to a variety of object detection tasks and can be easily fine-tuned on different datasets.
  • Open-Source: The PyTorch implementation is available on GitHub, allowing for easy customization and contribution to the project.

Cons

  • Complexity: The YOLOR model is relatively complex, with a large number of parameters and a sophisticated architecture, which may make it more challenging to understand and modify.
  • Hardware Requirements: Running YOLOR efficiently may require powerful hardware, such as high-end GPUs, which can be a barrier for some users.
  • Limited Documentation: The project's documentation could be more comprehensive, making it harder for new users to get started with the library.
  • Ongoing Development: As an active project, YOLOR may undergo frequent updates and changes, which could make it more difficult to maintain a stable integration in production environments.

Code Examples

Here are a few code examples demonstrating the usage of the YOLOR PyTorch implementation:

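  1. Loading a Pre-Trained Model:
import torch
from models.models import Darknet  # run from the repository root

# Build the network from its .cfg file, then load the pre-trained weights.
# The checkpoint stores the state dict under the 'model' key.
model = Darknet('cfg/yolor_p6.cfg', 1280)
model.load_state_dict(torch.load('yolor_p6.pt', map_location='cpu')['model'])
model.eval()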
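  2. Performing Object Detection:
import cv2
import torch
from utils.datasets import letterbox
from utils.general import non_max_suppression, scale_coords

# Load an image (BGR, HWC) and letterbox it to the network input size
img0 = cv2.imread('image.jpg')
img = letterbox(img0, new_shape=1280, auto=False)[0]
# BGR -> RGB, HWC -> CHW, scale to [0, 1], add a batch dimension
img = torch.from_numpy(img[:, :, ::-1].transpose(2, 0, 1).copy()).float().div(255).unsqueeze(0)
with torch.no_grad():
    pred = model(img)[0]
pred = non_max_suppression(pred, 0.4, 0.5)  # confidence and IoU thresholds
for det in pred:  # one entry per image in the batch
    if det is not None and len(det):
        # Map boxes from the letterboxed input back to the original image
        det[:, :4] = scale_coords(img.shape[2:], det[:, :4], img0.shape).round()
    # Visualize the detected objects
    # ...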
  3. Evaluating Model Performance:
# Evaluate the pre-trained model on COCO with the repository's test script
python test.py --data data/coco.yaml --img 1280 --batch 32 --conf 0.001 --iou 0.65 --device 0 --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --name yolor_p6_val
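The `non_max_suppression(pred, 0.4, 0.5)` call in the detection example takes a confidence threshold and an IoU threshold. As a rough illustration of what those two numbers control, here is a minimal pure-Python sketch of greedy, single-class NMS. It is not the repository's implementation, which works on tensors and handles classes and batching; the names `iou` and `greedy_nms` and the sample boxes are made up for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               conf_thres: float = 0.4, iou_thres: float = 0.5) -> List[int]:
    """Return indices of the boxes that survive, highest score first."""
    # Drop low-confidence boxes, then visit the rest in descending score order.
    order = sorted((i for i, s in enumerate(scores) if s >= conf_thres),
                   key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes, one separate, one weak.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
kept = greedy_nms(boxes, scores)
```

Raising `conf_thres` discards more weak detections before suppression; raising `iou_thres` lets more overlapping boxes survive.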

Getting Started

To get started with the YOLOR PyTorch implementation, follow these steps:

  1. Clone the repository:
git clone https://github.com/WongKinYiu/yolor.git
  2. Install the required dependencies:
cd yolor
pip install -r requirements.txt
  3. Download a pre-trained YOLOR model:
wget https://github.com/WongKinYiu/yolor/releases/download/v0.1/yolor_p6.pt
  4. Run the object detection example:
python detect.py --source inference/images/horses.jpg --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --conf 0.25 --img-size 1280 --device 0

Competitor Comparisons

51,450

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

  • More extensive documentation and tutorials
  • Larger community and more frequent updates
  • Better integration with PyTorch ecosystem

Cons of YOLOv5

  • Slightly lower performance on some benchmarks
  • Less focus on novel architectural improvements

Code Comparison

YOLOv5:

from models.yolo import Model
from utils.torch_utils import select_device

device = select_device('0')
model = Model('yolov5s.yaml', ch=3, nc=80).to(device)

YOLOR:

from models.models import Darknet
from utils.torch_utils import select_device

device = select_device('0')
model = Darknet('cfg/yolor_p6.cfg', 1280).to(device)

The code structure is similar, but YOLOR uses a different configuration file format (.cfg) compared to YOLOv5's YAML-based configuration.
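To make the format difference concrete, the following is a minimal sketch of reading a Darknet-style `.cfg` file into a list of blocks. It assumes the usual layout of `[section]` headers followed by `key=value` lines with `#` comments. The function name `parse_darknet_cfg` and the sample text are hypothetical and are not taken from `cfg/yolor_p6.cfg` or from the repository's own parser.

```python
from typing import Dict, List


def parse_darknet_cfg(text: str) -> List[Dict[str, str]]:
    """Parse Darknet-style cfg text into one dict per [section]."""
    blocks: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        if line.startswith('[') and line.endswith(']'):
            # A new section starts a new block; record its type.
            blocks.append({'type': line[1:-1].strip()})
        elif '=' in line and blocks:
            key, value = line.split('=', 1)
            blocks[-1][key.strip()] = value.strip()
    return blocks


# Hypothetical sample, for illustration only.
sample = """
[net]
width=1280
height=1280

# first convolution
[convolutional]
filters=64
size=3
stride=2
activation=silu
"""
layers = parse_darknet_cfg(sample)
```

Each block keeps its values as strings; a real loader would convert them to numbers and build the matching layers in order, whereas a YAML model file can be loaded directly into nested Python structures.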

YOLOv5 offers a more standardized approach with better documentation, making it easier for beginners to get started. However, YOLOR introduces some architectural innovations that can lead to improved performance in certain scenarios.

Both projects are actively maintained and offer state-of-the-art object detection capabilities. The choice between them often depends on specific project requirements and user familiarity with each framework.

21,700

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Pros of darknet

  • More established and widely used in the computer vision community
  • Supports a broader range of YOLO versions (v2, v3, v4) and other architectures
  • Extensive documentation and community support

Cons of darknet

  • Less focus on recent YOLO improvements and optimizations
  • May have slower inference speed compared to YOLOR's optimized implementation
  • Requires more manual configuration for advanced features

Code Comparison

darknet:

layer make_yolo_layer(int batch, int w, int h, int n, int total, int *mask, int classes)
{
    int i;
    layer l = {0};
    l.type = YOLO;

YOLOR:

class YOLOR(nn.Module):
    def __init__(self, nc=80, anchors=()): 
        super(YOLOR, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor

The code snippets show the different implementation languages and approaches. darknet uses C for lower-level control, while YOLOR utilizes Python with PyTorch for easier integration and development.
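The `nc + 5` in the YOLOR snippet reflects the usual YOLO output layout: four box coordinates, one objectness score, and one score per class. A small sketch of the resulting sizes follows; the default of three anchors per detection scale is an assumption based on common YOLO configurations, not something stated above, and both function names are made up for this example.

```python
def outputs_per_anchor(nc: int) -> int:
    """4 box coordinates + 1 objectness score + nc class scores."""
    return nc + 5


def head_channels(nc: int, anchors_per_scale: int = 3) -> int:
    """Channels of a detection head's output feature map at one scale.

    anchors_per_scale=3 is an assumed, commonly used default.
    """
    return anchors_per_scale * outputs_per_anchor(nc)


coco_outputs = outputs_per_anchor(80)   # 80 COCO classes
coco_channels = head_channels(80)
```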

YOLOR focuses on recent YOLO improvements and optimizations, potentially offering better performance in certain scenarios. However, darknet provides a more comprehensive set of features and broader architecture support, making it suitable for a wider range of applications.

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

  • More comprehensive and feature-rich, supporting a wider range of computer vision tasks
  • Better documentation and community support, making it easier to use and extend
  • Modular architecture allows for easier customization and experimentation

Cons of Detectron2

  • Steeper learning curve due to its complexity and extensive features
  • Potentially slower inference time compared to YOLOR's optimized architecture
  • Requires more computational resources for training and inference

Code Comparison

YOLOR (simplified detection code):

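import torch
from models.models import Darknet

img_size = 1280  # test size listed for YOLOR-P6
model = Darknet('cfg/yolor_p6.cfg', img_size)
# The checkpoint stores the state dict under the 'model' key
model.load_state_dict(torch.load('yolor_p6.pt', map_location='cpu')['model'])
model.eval()
img = torch.zeros((1, 3, img_size, img_size))  # dummy input batch
output = model(img)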

Detectron2 (simplified detection code):

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

Both repositories focus on object detection, but Detectron2 offers a more comprehensive toolkit for various computer vision tasks. YOLOR is more specialized and optimized for real-time object detection, while Detectron2 provides a flexible framework for a broader range of applications.

24,600

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

  • Provides instance segmentation in addition to object detection
  • Well-documented and easier to understand for beginners
  • Built on the widely used Keras and TensorFlow stack

Cons of Mask_RCNN

  • Generally slower inference speed compared to YOLOR
  • May require more computational resources for training and inference
  • Less suitable for real-time applications

Code Comparison

Mask_RCNN:

import mrcnn.model as modellib
from mrcnn import utils

model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(WEIGHTS_PATH, by_name=True)
results = model.detect([image], verbose=1)

YOLOR:

from models.models import *
from utils.datasets import *
from utils.utils import *

model = Darknet(cfg, img_size)
model.load_state_dict(torch.load(weights, map_location=device)['model'])
pred = model(img.to(device))[0]

The code snippets show the basic setup and inference process for both models. Mask_RCNN uses a more straightforward approach with built-in functions, while YOLOR requires more manual setup but offers more flexibility in terms of customization.

77,006

Models and examples built with TensorFlow

Pros of TensorFlow Models

  • Comprehensive collection of models and examples across various domains
  • Backed by Google, with extensive documentation and community support
  • Supports multiple deep learning tasks beyond object detection

Cons of TensorFlow Models

  • Steeper learning curve due to the breadth of the repository
  • May require more setup and configuration for specific tasks
  • Potentially slower inference compared to YOLOR's optimized architecture

Code Comparison

YOLOR (PyTorch):

from models.models import *
from utils.utils import *

model = Darknet('cfg/yolor_p6.cfg', imgsize)
model.load_state_dict(torch.load('yolor_p6.pt')['model'])

TensorFlow Models:

import tensorflow as tf
from object_detection import model_lib_v2

model_dir = 'path/to/model'
pipeline_config = 'path/to/pipeline.config'
model_lib_v2.train_loop(pipeline_config_path=pipeline_config, model_dir=model_dir)

Summary

YOLOR focuses on a single, highly optimized object detection model, while TensorFlow Models offers a diverse range of pre-trained models and examples. YOLOR may be easier to get started with for specific object detection tasks, while TensorFlow Models provides more flexibility and options for various deep learning applications.

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

  • Comprehensive framework with support for multiple object detection algorithms
  • Extensive documentation and community support
  • Modular design allowing easy customization and extension

Cons of mmdetection

  • Steeper learning curve due to its complexity
  • Potentially slower inference time for some models compared to YOLOR

Code Comparison

YOLOR (model definition):

class YOLOR(nn.Module):
    def __init__(self, nc=80, anchors=(), ch=()):
        super(YOLOR, self).__init__()
        self.backbone = CSPDarknet(ch)
        self.head = YOLOHead(nc, anchors)

mmdetection (model configuration):

model = dict(
    type='YOLOV3',
    backbone=dict(type='Darknet', depth=53),
    neck=dict(type='YOLOV3Neck'),
    bbox_head=dict(type='YOLOV3Head', num_classes=80)
)

YOLOR focuses on a single, highly optimized architecture, while mmdetection provides a flexible framework for implementing various object detection algorithms. YOLOR may offer better performance for specific use cases, while mmdetection provides more options and easier experimentation with different models and techniques.

README

YOLOR

implementation of paper - You Only Learn One Representation: Unified Network for Multiple Tasks

Unified Network

To get the results on the table, please use this branch.

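| Model | Test Size | AP (test) | AP50 (test) | AP75 (test) | batch1 throughput | batch32 inference |
|---|---|---|---|---|---|---|
| YOLOR-CSP | 640 | 52.8% | 71.2% | 57.6% | 106 fps | 3.2 ms |
| YOLOR-CSP-X | 640 | 54.8% | 73.1% | 59.7% | 87 fps | 5.5 ms |
| YOLOR-P6 | 1280 | 55.7% | 73.3% | 61.0% | 76 fps | 8.3 ms |
| YOLOR-W6 | 1280 | 56.9% | 74.4% | 62.2% | 66 fps | 10.7 ms |
| YOLOR-E6 | 1280 | 57.6% | 75.2% | 63.0% | 45 fps | 17.1 ms |
| YOLOR-D6 | 1280 | 58.2% | 75.8% | 63.8% | 34 fps | 21.8 ms |
| YOLOv4-P5 | 896 | 51.8% | 70.3% | 56.6% | 41 fps (old) | - |
| YOLOv4-P6 | 1280 | 54.5% | 72.6% | 59.8% | 30 fps (old) | - |
| YOLOv4-P7 | 1536 | 55.5% | 73.4% | 60.8% | 16 fps (old) | - |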
  • Fix the speed bottleneck on our NFS, many thanks to NCHC, TWCC, and NARLabs support teams.
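The first table reports batch-1 speed as throughput (fps) and batch-32 speed as a time in ms, so the two columns are not directly comparable. One rough way to put them on the same scale is to convert fps into milliseconds per image, as sketched below. This assumes batch-1 throughput is simply the reciprocal of per-image latency, which ignores any pipelining or data-loading overlap; the helper name `latency_ms` is made up for this example.

```python
def latency_ms(fps: float) -> float:
    """Milliseconds per image implied by a throughput in frames per second."""
    return 1000.0 / fps


# batch1 throughput values copied from the table above
batch1_fps = {
    'YOLOR-CSP': 106,
    'YOLOR-CSP-X': 87,
    'YOLOR-P6': 76,
    'YOLOR-W6': 66,
    'YOLOR-E6': 45,
    'YOLOR-D6': 34,
}
batch1_latency = {name: round(latency_ms(fps), 1)
                  for name, fps in batch1_fps.items()}
```

For every model the implied batch-1 time is larger than the batch-32 figure in the table, which is consistent with batching amortizing per-call overhead.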
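| Model | Test Size | AP (val) | AP50 (val) | AP75 (val) | APS (val) | APM (val) | APL (val) | weights |
|---|---|---|---|---|---|---|---|---|
| YOLOv4-CSP | 640 | 49.1% | 67.7% | 53.8% | 32.1% | 54.4% | 63.2% | - |
| YOLOR-CSP | 640 | 49.2% | 67.6% | 53.7% | 32.9% | 54.4% | 63.0% | weights |
| YOLOR-CSP* | 640 | 50.0% | 68.7% | 54.3% | 34.2% | 55.1% | 64.3% | weights |
| YOLOv4-CSP-X | 640 | 50.9% | 69.3% | 55.4% | 35.3% | 55.8% | 64.8% | - |
| YOLOR-CSP-X | 640 | 51.1% | 69.6% | 55.7% | 35.7% | 56.0% | 65.2% | weights |
| YOLOR-CSP-X* | 640 | 51.5% | 69.9% | 56.1% | 35.8% | 56.8% | 66.1% | weights |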

Developing...

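| Model | Test Size | AP (test) | AP50 (test) | AP75 (test) | APS (test) | APM (test) | APL (test) |
|---|---|---|---|---|---|---|---|
| YOLOR-CSP | 640 | 51.1% | 69.6% | 55.7% | 31.7% | 55.3% | 64.7% |
| YOLOR-CSP-X | 640 | 53.0% | 71.4% | 57.9% | 33.7% | 57.1% | 66.8% |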

Train from scratch for 300 epochs...

| Model | Info | Test Size | AP |
|---|---|---|---|
| YOLOR-CSP | evolution | 640 | 48.0% |
| YOLOR-CSP | strategy | 640 | 50.0% |
| YOLOR-CSP | strategy + simOTA | 640 | 51.1% |
| YOLOR-CSP-X | strategy | 640 | 51.5% |
| YOLOR-CSP-X | strategy + simOTA | 640 | 53.0% |

Installation

Docker environment (recommended)

# create the docker container, you can change the share memory size if you have more.
nvidia-docker run --name yolor -it -v your_coco_path/:/coco/ -v your_code_path/:/yolor --shm-size=64g nvcr.io/nvidia/pytorch:20.11-py3

# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx

# pip install required packages
pip install seaborn thop

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
cd /
git clone https://github.com/JunnYu/mish-cuda
cd mish-cuda
python setup.py build install

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
cd /
git clone https://github.com/fbcotter/pytorch_wavelets
cd pytorch_wavelets
pip install .

# go to code folder
cd /yolor

Colab environment

git clone https://github.com/WongKinYiu/yolor
cd yolor

# pip install required packages
pip install -qr requirements.txt

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
git clone https://github.com/JunnYu/mish-cuda
cd mish-cuda
python setup.py build install
cd ..

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
git clone https://github.com/fbcotter/pytorch_wavelets
cd pytorch_wavelets
pip install .
cd ..

Prepare COCO dataset

cd /yolor
bash scripts/get_coco.sh

Prepare pretrained weight

cd /yolor
bash scripts/get_pretrain.sh

Testing

yolor_p6.pt

python test.py --data data/coco.yaml --img 1280 --batch 32 --conf 0.001 --iou 0.65 --device 0 --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --name yolor_p6_val

You will get the results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.52510
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70718
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.57520
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.37058
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.56878
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66102
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.39181
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.65229
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.71441
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.57755
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.75337
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.84013
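When these numbers need to be tracked across runs, it can help to turn the printed summary into structured data. The sketch below parses lines in the format shown above with a regular expression. The function name `parse_coco_summary` is hypothetical, and the pattern is written only against the layout printed here, so other output variants may need adjustments.

```python
import re
from typing import Dict, Tuple

# Matches one summary line such as:
#  Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.52510
LINE = re.compile(
    r"Average (Precision|Recall)\s+\((AP|AR)\) @\[ IoU=([\d.:]+)\s*\| "
    r"area=\s*(\w+) \| maxDets=\s*(\d+) \] = ([\d.]+)"
)


def parse_coco_summary(text: str) -> Dict[Tuple[str, str, str, int], float]:
    """Map (metric, IoU range, area, maxDets) to the reported value."""
    results: Dict[Tuple[str, str, str, int], float] = {}
    for match in LINE.finditer(text):
        _, metric, iou, area, max_dets, value = match.groups()
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results


# Three lines copied from the output above
sample = """
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.52510
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.70718
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.39181
"""
summary = parse_coco_summary(sample)
```

The headline figure is then available as `summary[('AP', '0.50:0.95', 'all', 100)]`.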

Training

Single GPU training:

python train.py --batch-size 8 --img 1280 1280 --data coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0 --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300

Multiple GPU training:

python -m torch.distributed.launch --nproc_per_node 2 --master_port 9527 train.py --batch-size 16 --img 1280 1280 --data coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0,1 --sync-bn --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300

Training schedule in the paper:

python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights '' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6 --hyp hyp.scratch.1280.yaml --epochs 300
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 tune.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights 'runs/train/yolor_p6/weights/last_298.pt' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6-tune --hyp hyp.finetune.1280.yaml --epochs 450
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train.py --batch-size 64 --img 1280 1280 --data data/coco.yaml --cfg cfg/yolor_p6.cfg --weights 'runs/train/yolor_p6-tune/weights/epoch_424.pt' --device 0,1,2,3,4,5,6,7 --sync-bn --name yolor_p6-fine --hyp hyp.finetune.1280.yaml --epochs 450
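The three setups above use batch sizes of 8, 16, and 64 on 1, 2, and 8 GPUs. Under the assumption that `--batch-size` is the total batch split evenly across processes (the usual convention in YOLOv5-style distributed training scripts, not confirmed here), each GPU sees the same load in all three cases. The helper `per_gpu_batch` below is a hypothetical name used only to check that arithmetic.

```python
def per_gpu_batch(total_batch: int, n_gpus: int) -> int:
    """Images per GPU per step, assuming an even split of the total batch."""
    if n_gpus <= 0:
        raise ValueError("n_gpus must be positive")
    if total_batch % n_gpus != 0:
        raise ValueError("total batch size must be divisible by the GPU count")
    return total_batch // n_gpus


# (total --batch-size, number of GPUs) taken from the commands above
runs = {
    'single GPU': (8, 1),
    'multiple GPU': (16, 2),
    'paper schedule': (64, 8),
}
per_gpu = {name: per_gpu_batch(batch, gpus) for name, (batch, gpus) in runs.items()}
```

If that assumption holds, scaling to a different number of GPUs would mean keeping the per-GPU value fixed and adjusting `--batch-size` and `--nproc_per_node` together.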

Inference

yolor_p6.pt

python detect.py --source inference/images/horses.jpg --cfg cfg/yolor_p6.cfg --weights yolor_p6.pt --conf 0.25 --img-size 1280 --device 0

You will get the results:

[Image: detection results for inference/images/horses.jpg]
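The `--img-size 1280` flag sets the network input size, and detectors in this family typically resize images with an aspect-preserving "letterbox" step before inference. The sketch below shows the geometry of that step and how a point in the padded input maps back to the original image. It is a simplified illustration with made-up function names; the repository's own `letterbox` helper has further options (such as padding only up to a stride multiple) that are not modeled here.

```python
from typing import Tuple


def letterbox_geometry(h: int, w: int, new_size: int = 1280
                       ) -> Tuple[float, Tuple[int, int], Tuple[float, float]]:
    """Return (scale ratio, resized (w, h), padding per side (x, y))."""
    r = min(new_size / h, new_size / w)          # keep the aspect ratio
    new_w, new_h = round(w * r), round(h * r)    # size after resizing
    pad_x = (new_size - new_w) / 2               # padding left and right
    pad_y = (new_size - new_h) / 2               # padding top and bottom
    return r, (new_w, new_h), (pad_x, pad_y)


def unletterbox_point(x: float, y: float, r: float,
                      pad: Tuple[float, float]) -> Tuple[float, float]:
    """Map a point from the padded input back to original image pixels."""
    return (x - pad[0]) / r, (y - pad[1]) / r


# A 1280x720 (width x height) frame placed in a 1280x1280 input
ratio, size, pad = letterbox_geometry(h=720, w=1280)
center = unletterbox_point(640, 640, ratio, pad)
```

Box coordinates predicted on the padded input go through the same inverse mapping before they are drawn on the original image.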

Citation

@article{wang2023you,
  title={You Only Learn One Representation: Unified Network for Multiple Tasks},
  author={Wang, Chien-Yao and Yeh, I-Hau and Liao, Hong-Yuan Mark},
  journal={Journal of Information Science and Engineering},
  year={2023}
}
