Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

26,363

5,443

331

Top Related Projects

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

mmdetection

31,487

OpenMMLab Detection Toolbox and Benchmark

models

77,618

Models and examples built with TensorFlow

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

darknet

22,101

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Quick Overview

Detectron is an open-source object detection and segmentation platform developed by Facebook AI Research (FAIR). It implements state-of-the-art object detection algorithms, including Mask R-CNN, RetinaNet, and Faster R-CNN, providing a flexible and efficient framework for research and production use cases in computer vision.

Pros

High performance and accuracy in object detection and instance segmentation tasks
Modular design allowing easy customization and extension of models
Comprehensive documentation and examples for various use cases
Pre-trained models available for quick deployment and fine-tuning

Cons

Steep learning curve for beginners in computer vision and deep learning
Requires significant computational resources for training and inference
Limited support for older hardware and operating systems
Built on Caffe2 rather than PyTorch or TensorFlow, and deprecated in favor of its PyTorch rewrite, detectron2

Code Examples

Loading a pre-trained model (the examples in this section use the API of detectron2, the PyTorch successor to Detectron):

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
cfg.MODEL.WEIGHTS = "path/to/model_weights.pth"
predictor = DefaultPredictor(cfg)
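
The `merge_from_file` call above overlays values from a YAML file on top of a set of defaults. The idea can be sketched with plain dictionaries. This is only an illustration of the override behavior, not detectron2's actual implementation; `merge_config` and the default values are hypothetical names chosen for the example.

```python
import copy


def merge_config(defaults, overrides):
    """Return a new dict: `defaults` with `overrides` applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are nested sections: merge them key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Leaf value (or new key): the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical default values, chosen only for illustration.
defaults = {"MODEL": {"WEIGHTS": "", "DEVICE": "cuda"}, "SOLVER": {"BASE_LR": 0.001}}
# Values as they might look after parsing a YAML config file.
overrides = {"MODEL": {"WEIGHTS": "path/to/model_weights.pth"}}

cfg_dict = merge_config(defaults, overrides)
print(cfg_dict["MODEL"])
```

Keys that the override file does not mention keep their default values, which is why a short YAML file is enough to describe a full experiment.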

Performing inference on an image:

import cv2
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog

image = cv2.imread("path/to/image.jpg")
outputs = predictor(image)

v = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imshow("Result", out.get_image()[:, :, ::-1])
cv2.waitKey(0)
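
The `outputs["instances"]` object returned by the predictor holds per-detection boxes, scores, and class ids. A common follow-up step is to keep only confident detections. The sketch below shows that step on stand-in data: plain dictionaries are used instead of the real output object, and `filter_by_score` is a hypothetical helper, so the block runs without detectron2 installed.

```python
def filter_by_score(detections, threshold=0.5):
    """Keep detections whose confidence score is at least `threshold`."""
    return [d for d in detections if d["score"] >= threshold]


# Stand-in data: each dict mimics one predicted instance,
# with the box given as (x1, y1, x2, y2) pixel coordinates.
detections = [
    {"box": (10, 20, 110, 220), "score": 0.92, "class_id": 0},
    {"box": (15, 25, 100, 200), "score": 0.31, "class_id": 0},
    {"box": (300, 40, 380, 160), "score": 0.77, "class_id": 2},
]

kept = filter_by_score(detections, threshold=0.5)
for d in kept:
    print(d["class_id"], d["score"], d["box"])
```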

Training a custom model:

from detectron2.engine import DefaultTrainer

cfg.DATASETS.TRAIN = ("custom_dataset_train",)
cfg.DATASETS.TEST = ("custom_dataset_val",)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
cfg.SOLVER.MAX_ITER = 1000

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
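
`IMS_PER_BATCH` and `BASE_LR` are usually chosen together. One common heuristic is the linear scaling rule described in "Accurate, Large Minibatch SGD" (listed in the References): when the batch size changes by a factor k, scale the learning rate by k as well. The sketch below assumes a hypothetical reference schedule of 0.02 at 16 images per batch; `scale_learning_rate` is an illustrative helper, not part of any library.

```python
def scale_learning_rate(reference_lr, reference_batch, new_batch):
    """Scale the learning rate in proportion to the batch size."""
    if reference_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return reference_lr * new_batch / reference_batch


# Hypothetical reference schedule: learning rate 0.02 at 16 images per batch.
lr = scale_learning_rate(reference_lr=0.02, reference_batch=16, new_batch=2)
print(lr)
```

The rule is only a starting point; small datasets and fine-tuning runs often use a lower rate than it suggests.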

Getting Started

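Install detectron2 from a prebuilt wheel (the index URL below targets CUDA 10.1 and PyTorch 1.7; pick the index that matches your environment):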

pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html

Import and set up configuration:

from detectron2.config import get_cfg
from detectron2 import model_zoo

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")

Create a predictor and run inference:

from detectron2.engine import DefaultPredictor
import cv2

predictor = DefaultPredictor(cfg)
image = cv2.imread("input_image.jpg")
outputs = predictor(image)
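
Predicted boxes are typically compared with each other or with ground truth using intersection over union (IoU). The following sketch computes IoU for two boxes, assuming the (x1, y1, x2, y2) corner format; `box_iou` is an illustrative helper written in plain Python, not a library function.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x2 > x1 and y2 > y1."""
    # Corners of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))
```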

Competitor Comparisons

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of detectron2

Written in PyTorch, offering better flexibility and ease of use
Improved performance and speed compared to its predecessor
More modular architecture, making it easier to extend and customize

Cons of detectron2

Steeper learning curve due to architectural changes
Some legacy features from Detectron may not be available

Code comparison

Detectron (Caffe2):

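# Sketch based on Detectron's tools/train_net.py; details may vary by version.
from caffe2.python import workspace
from detectron.core.config import assert_and_infer_cfg, merge_cfg_from_file
import detectron.utils.c2 as c2_utils
import detectron.utils.train

c2_utils.import_detectron_ops()  # register Detectron's custom Caffe2 operators
workspace.GlobalInit(["caffe2", "--caffe2_log_level=0"])
merge_cfg_from_file("config.yaml")  # overlay the YAML file on the global cfg
assert_and_infer_cfg()              # validate and finalize the configuration
checkpoints = detectron.utils.train.train_model()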

detectron2 (PyTorch):

from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file("config.yaml")
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()

Both frameworks are driven by a YAML configuration file, but Detectron runs on Caffe2, while detectron2 is a PyTorch rewrite whose trainer class offers more flexibility in configuration and training.

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

Easier to use and more beginner-friendly
Built on top of Keras and TensorFlow, which are widely used and well-documented
Includes pre-trained models for quick start and transfer learning

Cons of Mask_RCNN

Generally slower in training and inference compared to Detectron
Less flexibility in model architecture and hyperparameter tuning
Smaller community and fewer updates compared to Detectron

Code Comparison

Mask_RCNN:

import mrcnn.model as modellib

model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(WEIGHTS_PATH, by_name=True)
results = model.detect([image], verbose=1)

detectron2 (PyTorch successor to Detectron):

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(MODEL_ZOO_CONFIG_PATH)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

Both repositories provide powerful tools for object detection and instance segmentation. Mask_RCNN is more accessible for beginners and those familiar with Keras/TensorFlow, while Detectron offers better performance and flexibility for advanced users and researchers.

mmdetection

31,487

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

More comprehensive and up-to-date model zoo with a wider range of object detection algorithms
Modular design allowing for easier customization and extension of components
Better documentation and community support

Cons of mmdetection

Steeper learning curve due to its more complex architecture
Potentially slower inference speed for some models compared to Detectron

Code Comparison

mmdetection:

from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'test.jpg')

detectron2 (PyTorch successor to Detectron):

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/model_final_b275ba.pkl"
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

models

77,618

Models and examples built with TensorFlow

Pros of TensorFlow Models

Broader scope, covering various ML tasks beyond computer vision
Extensive documentation and tutorials for beginners
Active community with frequent updates and contributions

Cons of TensorFlow Models

Less specialized for object detection compared to Detectron
May require more setup and configuration for specific tasks
Potentially steeper learning curve due to its broad scope

Code Comparison

detectron2 (PyTorch successor to Detectron):

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
predictor = DefaultPredictor(cfg)

TensorFlow Models:

import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

model = tf.saved_model.load('path/to/saved_model')
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS)

Both repositories offer powerful tools for computer vision tasks, but they differ in their focus and implementation. Detectron is more specialized for object detection and instance segmentation, while TensorFlow Models covers a broader range of machine learning tasks. The choice between them depends on the specific requirements of your project and your familiarity with the respective frameworks.

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

Faster inference speed and real-time object detection capabilities
Easier to use and deploy, with a more user-friendly interface
More extensive documentation and community support

Cons of YOLOv5

Generally lower accuracy compared to Detectron, especially for small objects
Less flexibility in terms of model architecture and customization options

Code Comparison

YOLOv5:

import torch

# Load a pretrained YOLOv5s model through PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('image.jpg')
results.show()

detectron2 (PyTorch successor to Detectron):

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

YOLOv5 offers a more straightforward API for quick implementation, while Detectron provides more detailed configuration options for advanced users.

darknet

22,101

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Pros of darknet

Lightweight and fast, with minimal dependencies
Supports both CPU and GPU computation
Includes pre-trained models for various tasks

Cons of darknet

Less extensive documentation and community support
Fewer built-in features and tools for model analysis

Code comparison

Darknet (C):

layer make_convolutional_layer(int batch, int h, int w, int c, int n, int size, int stride, int padding, ACTIVATION activation, int batch_normalize, int binary, int xnor)
{
    layer l = {0};
    l.type = CONVOLUTIONAL;
    // ... (additional initialization)
    return l;
}

detectron2 (Python, illustrative PyTorch wrapper rather than library source):

import torch.nn as nn

class Conv2d(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, dilation, groups, bias)

    def forward(self, x):
        return self.conv(x)

The comparison shows that darknet is implemented in C at a low level, while detectron2 builds on Python and PyTorch for a higher-level abstraction (the original Detectron uses Python with Caffe2). This reflects the different design philosophies and target users of the projects.

README

Detectron is deprecated. Please see detectron2, a ground-up rewrite of Detectron in PyTorch.

Detectron

Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 deep learning framework.

At FAIR, Detectron has enabled numerous research projects, including: Feature Pyramid Networks for Object Detection, Mask R-CNN, Detecting and Recognizing Human-Object Interactions, Focal Loss for Dense Object Detection, Non-local Neural Networks, Learning to Segment Every Thing, Data Distillation: Towards Omni-Supervised Learning, DensePose: Dense Human Pose Estimation In The Wild, and Group Normalization.

Example Mask R-CNN output.

Introduction

The goal of Detectron is to provide a high-quality, high-performance codebase for object detection research. It is designed to be flexible in order to support rapid implementation and evaluation of novel research. Detectron includes implementations of the following object detection algorithms:

Mask R-CNN -- Marr Prize at ICCV 2017
RetinaNet -- Best Student Paper Award at ICCV 2017
Faster R-CNN
RPN
Fast R-CNN
R-FCN

using the following backbone network architectures:

ResNeXt{50,101,152}
ResNet{50,101,152}
Feature Pyramid Networks (with ResNet/ResNeXt)
VGG16

Additional backbone architectures may be easily implemented. For more details about these models, please see References below.
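
To make one of the listed algorithms concrete: RetinaNet is trained with the focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), which down-weights examples the model already classifies well. The sketch below evaluates it for a single binary prediction in plain Python. It is an illustration, not Detectron's implementation; the defaults alpha = 0.25 and gamma = 2 follow the values reported in the Focal Loss paper.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for one prediction.

    p: predicted probability of the positive class, in (0, 1).
    y: ground-truth label, 1 for positive or 0 for negative.
    """
    p_t = p if y == 1 else 1.0 - p              # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(p=0.95, y=1)  # confident and correct
hard = focal_loss(p=0.10, y=1)  # confident and wrong
print(easy, hard)
```

With gamma = 0 and alpha = 1 the expression reduces to ordinary cross-entropy on positive examples, which is a quick way to sanity-check an implementation.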

Update

4/2018: Support Group Normalization - see GN/README.md

License

Detectron is released under the Apache 2.0 license. See the NOTICE file for additional details.

Citing Detectron

If you use Detectron in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@misc{Detectron2018,
  author =       {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
                  Piotr Doll\'{a}r and Kaiming He},
  title =        {Detectron},
  howpublished = {\url{https://github.com/facebookresearch/detectron}},
  year =         {2018}
}

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the Detectron Model Zoo.

Installation

Please find installation instructions for Caffe2 and Detectron in INSTALL.md.

Quick Start: Using Detectron

After installation, please see GETTING_STARTED.md for brief tutorials covering inference and training with Detectron.

Getting Help

To start, please check the troubleshooting section of our installation instructions as well as our FAQ. If you couldn't find help there, try searching our GitHub issues. We intend the issues page to be a forum in which the community collectively troubleshoots problems.

If bugs are found, we appreciate pull requests (including adding Q&A's to FAQ.md and improving our installation instructions and troubleshooting documents). Please see CONTRIBUTING.md for more information about contributing to Detectron.

References

Data Distillation: Towards Omni-Supervised Learning. Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He. Tech report, arXiv, Dec. 2017.
Learning to Segment Every Thing. Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick. Tech report, arXiv, Nov. 2017.
Non-Local Neural Networks. Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. Tech report, arXiv, Nov. 2017.
Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. IEEE International Conference on Computer Vision (ICCV), 2017.
Focal Loss for Dense Object Detection. Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. IEEE International Conference on Computer Vision (ICCV), 2017.
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Tech report, arXiv, June 2017.
Detecting and Recognizing Human-Object Interactions. Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He. Tech report, arXiv, Apr. 2017.
Feature Pyramid Networks for Object Detection. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
Aggregated Residual Transformations for Deep Neural Networks. Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
R-FCN: Object Detection via Region-based Fully Convolutional Networks. Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. Conference on Neural Information Processing Systems (NIPS), 2016.
Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Conference on Neural Information Processing Systems (NIPS), 2015.
Fast R-CNN. Ross Girshick. IEEE International Conference on Computer Vision (ICCV), 2015.
