Detectron
FAIR's research platform for object detection, implementing popular algorithms like Mask R-CNN and RetinaNet.
Top Related Projects
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
OpenMMLab Detection Toolbox and Benchmark
Models and examples built with TensorFlow
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
Quick Overview
Detectron is an open-source object detection and segmentation platform developed by Facebook AI Research (FAIR). It implements object detection algorithms including Mask R-CNN, RetinaNet, and Faster R-CNN, and is written in Python on top of the Caffe2 deep learning framework. The project is now deprecated in favor of detectron2, its ground-up rewrite in PyTorch.
Pros
- High performance and accuracy in object detection and instance segmentation tasks
- Modular design allowing easy customization and extension of models
- Comprehensive documentation and examples for various use cases
- Pre-trained models available for quick deployment and fine-tuning
Cons
- Steep learning curve for beginners in computer vision and deep learning
- Requires significant computational resources for training and inference
- Limited support for older hardware and operating systems
- Built on Caffe2 rather than PyTorch or TensorFlow, and deprecated in favor of detectron2
Code Examples
- Loading a pre-trained model (this and the following examples use the detectron2 API, the PyTorch rewrite of Detectron):
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
cfg.MODEL.WEIGHTS = "path/to/model_weights.pth"
predictor = DefaultPredictor(cfg)
- Performing inference on an image:
import cv2
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
image = cv2.imread("path/to/image.jpg")
outputs = predictor(image)
v = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imshow("Result", out.get_image()[:, :, ::-1])
cv2.waitKey(0)
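- Training a custom model:
import os
from detectron2.engine import DefaultTrainer
# Both dataset names must already be registered in DatasetCatalog.
cfg.DATASETS.TRAIN = ("custom_dataset_train",)
cfg.DATASETS.TEST = ("custom_dataset_val",)
cfg.SOLVER.IMS_PER_BATCH = 2  # images per training batch
cfg.SOLVER.BASE_LR = 0.00025  # learning rate
cfg.SOLVER.MAX_ITER = 1000  # total training iterations
os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)  # checkpoints and logs are written here
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)  # start from cfg.MODEL.WEIGHTS
trainer.train()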
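The batch size and learning rate in the training example are usually adjusted together. One common heuristic is the linear scaling rule described in "Accurate, Large Minibatch SGD" (listed in the References): when the batch size changes by a factor k, the learning rate changes by the same factor. The sketch below illustrates that arithmetic only. The helper `scale_lr` and the reference point of 16 images per batch at a learning rate of 0.02 are assumptions chosen for illustration, not part of the Detectron or detectron2 API.

```python
def scale_lr(ref_lr: float, ref_batch: int, new_batch: int) -> float:
    """Linear scaling rule: the learning rate grows in proportion to batch size."""
    if ref_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    return ref_lr * new_batch / ref_batch


# Assumed reference point: 16 images per batch at a learning rate of 0.02.
lr_for_small_batch = scale_lr(ref_lr=0.02, ref_batch=16, new_batch=2)
print(lr_for_small_batch)
```

The result is a starting point rather than a guarantee. Fine-tuning on a small dataset often works better with a lower rate than the rule suggests.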
Getting Started
- Install Detectron2 (this wheel index targets CUDA 10.1 and PyTorch 1.7; use the index that matches your environment):
pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.7/index.html
- Import and set up configuration:
from detectron2.config import get_cfg
from detectron2 import model_zoo
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
- Create a predictor and run inference:
from detectron2.engine import DefaultPredictor
import cv2
predictor = DefaultPredictor(cfg)
image = cv2.imread("input_image.jpg")
outputs = predictor(image)
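The predictor returns a dictionary whose "instances" entry holds the predicted boxes, scores, and class ids for the image. A typical next step is to keep only the confident detections. The sketch below shows that filtering on plain Python data so that it runs without detectron2 installed. The `Detection` class and the `filter_by_score` helper are hypothetical stand-ins, not detectron2 types.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    """Hypothetical stand-in for one predicted instance."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    score: float                            # confidence in [0, 1]
    class_id: int


def filter_by_score(dets: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Keep detections at or above the threshold, highest score first."""
    kept = [d for d in dets if d.score >= threshold]
    return sorted(kept, key=lambda d: d.score, reverse=True)


detections = [
    Detection((10, 20, 110, 220), 0.92, 0),
    Detection((30, 40, 80, 90), 0.31, 2),
    Detection((200, 50, 260, 140), 0.77, 0),
]
confident = filter_by_score(detections, threshold=0.5)
print([d.score for d in confident])
```

In detectron2 itself, a similar cutoff is usually applied by setting `cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST` before the predictor is created.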
Competitor Comparisons
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of detectron2
- Written in PyTorch, offering better flexibility and ease of use
- Improved performance and speed compared to its predecessor
- More modular architecture, making it easier to extend and customize
Cons of detectron2
- Steeper learning curve due to architectural changes
- Some legacy features from Detectron may not be available
Code comparison
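Detectron (Caffe2):
from detectron.core.config import assert_and_infer_cfg, cfg, merge_cfg_from_file
from detectron.utils.train import train_model
merge_cfg_from_file("config.yaml")  # overrides the global cfg in place
assert_and_infer_cfg()  # validate and freeze the config
checkpoints = train_model()  # returns the saved checkpoint paths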
detectron2 (PyTorch):
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
cfg = get_cfg()
cfg.merge_from_file("config.yaml")
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
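Both follow the same pattern: merge a YAML file into a default config, then start training. The difference lies underneath, where Detectron runs on Caffe2 and detectron2 on PyTorch, with detectron2 exposing more hooks for customizing configuration and training.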
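In both snippets a YAML file is merged over a tree of default settings, so a config file only needs to list the values it changes. The following is a minimal sketch of that override pattern using nested dictionaries. The function `merge_overrides` and the sample keys and values are illustrative assumptions and do not reproduce either library's config code.

```python
import copy
from typing import Any, Dict


def merge_overrides(defaults: Dict[str, Any], overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of defaults with overrides applied recursively.

    Unknown keys raise KeyError so that typos in a config file surface early.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            raise KeyError(f"unknown config key: {key}")
        if isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = merge_overrides(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical defaults and overrides, shaped like a solver/model config tree.
defaults = {"SOLVER": {"BASE_LR": 0.02, "MAX_ITER": 90000}, "MODEL": {"WEIGHTS": ""}}
overrides = {"SOLVER": {"BASE_LR": 0.00025, "MAX_ITER": 1000}}
merged_cfg = merge_overrides(defaults, overrides)
print(merged_cfg["SOLVER"], merged_cfg["MODEL"])
```

Rejecting unknown keys is a design choice in this sketch: it trades flexibility for early detection of misspelled option names.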
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
Pros of Mask_RCNN
- Easier to use and more beginner-friendly
- Built on top of Keras and TensorFlow, which are widely used and well-documented
- Includes pre-trained models for quick start and transfer learning
Cons of Mask_RCNN
- Generally slower in training and inference compared to Detectron
- Less flexibility in model architecture and hyperparameter tuning
- Smaller community and fewer updates compared to Detectron
Code Comparison
Mask_RCNN:
import mrcnn.model as modellib
model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(WEIGHTS_PATH, by_name=True)
results = model.detect([image], verbose=1)
detectron2 (PyTorch successor to Detectron):
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file(MODEL_ZOO_CONFIG_PATH)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Both repositories provide powerful tools for object detection and instance segmentation. Mask_RCNN is more accessible for beginners and those familiar with Keras/TensorFlow, while Detectron offers better performance and flexibility for advanced users and researchers.
OpenMMLab Detection Toolbox and Benchmark
Pros of mmdetection
- More comprehensive and up-to-date model zoo with a wider range of object detection algorithms
- Modular design allowing for easier customization and extension of components
- Better documentation and community support
Cons of mmdetection
- Steeper learning curve due to its more complex architecture
- Potentially slower inference speed for some models compared to Detectron
Code Comparison
mmdetection:
from mmdet.apis import init_detector, inference_detector
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'test.jpg')
detectron2 (PyTorch successor to Detectron):
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file("configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/model_final_b275ba.pkl"
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Models and examples built with TensorFlow
Pros of TensorFlow Models
- Broader scope, covering various ML tasks beyond computer vision
- Extensive documentation and tutorials for beginners
- Active community with frequent updates and contributions
Cons of TensorFlow Models
- Less specialized for object detection compared to Detectron
- May require more setup and configuration for specific tasks
- Potentially steeper learning curve due to its broad scope
Code Comparison
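detectron2 (PyTorch):
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)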
TensorFlow Models:
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
model = tf.saved_model.load('path/to/saved_model')
category_index = label_map_util.create_category_index_from_labelmap(PATH_TO_LABELS)
Both repositories offer powerful tools for computer vision tasks, but they differ in their focus and implementation. Detectron is more specialized for object detection and instance segmentation, while TensorFlow Models covers a broader range of machine learning tasks. The choice between them depends on the specific requirements of your project and your familiarity with the respective frameworks.
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Pros of YOLOv5
- Faster inference speed and real-time object detection capabilities
- Easier to use and deploy, with a more user-friendly interface
- More extensive documentation and community support
Cons of YOLOv5
- Generally lower accuracy compared to Detectron, especially for small objects
- Less flexibility in terms of model architecture and customization options
Code Comparison
YOLOv5:
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # downloads the repo code and weights
results = model('image.jpg')
results.show()
detectron2 (PyTorch successor to Detectron):
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
YOLOv5 offers a more straightforward API for quick implementation, while Detectron provides more detailed configuration options for advanced users.
YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)
Pros of darknet
- Lightweight and fast, with minimal dependencies
- Supports both CPU and GPU computation
- Includes pre-trained models for various tasks
Cons of darknet
- Less extensive documentation and community support
- Fewer built-in features and tools for model analysis
Code comparison
Darknet (C):
layer make_convolutional_layer(int batch, int h, int w, int c, int n, int size, int stride, int padding, ACTIVATION activation, int batch_normalize, int binary, int xnor)
{
    layer l = {0};
    l.type = CONVOLUTIONAL;
    // ... (additional initialization)
    return l;
}
detectron2 (Python, simplified wrapper for illustration):
import torch.nn as nn

class Conv2d(nn.Module):
    """Thin wrapper around torch.nn.Conv2d."""

    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, dilation, groups, bias)

    def forward(self, x):
        return self.conv(x)
The comparison shows that darknet implements layers directly in C, while detectron2 composes them in Python on top of PyTorch (the original Detectron does the same on top of Caffe2). This reflects the different design philosophies and target users of the projects.
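Whichever language wraps it, a convolutional layer's output size follows from the same arithmetic on input size, kernel size, stride, and padding. The sketch below spells out that formula, ignoring dilation. The helper `conv_output_size` is hypothetical and is not a function from darknet or detectron2.

```python
def conv_output_size(size_in: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution: floor((in + 2*pad - kernel) / stride) + 1."""
    if stride <= 0:
        raise ValueError("stride must be positive")
    return (size_in + 2 * padding - kernel) // stride + 1


# A 3x3 kernel with padding 1 and stride 1 preserves the input size.
same = conv_output_size(416, kernel=3, stride=1, padding=1)
# The same kernel with stride 2 halves it.
halved = conv_output_size(416, kernel=3, stride=2, padding=1)
print(same, halved)
```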
README
Detectron is deprecated. Please see detectron2, a ground-up rewrite of Detectron in PyTorch.
Detectron
Detectron is Facebook AI Research's software system that implements state-of-the-art object detection algorithms, including Mask R-CNN. It is written in Python and powered by the Caffe2 deep learning framework.
At FAIR, Detectron has enabled numerous research projects, including: Feature Pyramid Networks for Object Detection, Mask R-CNN, Detecting and Recognizing Human-Object Interactions, Focal Loss for Dense Object Detection, Non-local Neural Networks, Learning to Segment Every Thing, Data Distillation: Towards Omni-Supervised Learning, DensePose: Dense Human Pose Estimation In The Wild, and Group Normalization.
Example Mask R-CNN output.
Introduction
The goal of Detectron is to provide a high-quality, high-performance codebase for object detection research. It is designed to be flexible in order to support rapid implementation and evaluation of novel research. Detectron includes implementations of the following object detection algorithms:
- Mask R-CNN -- Marr Prize at ICCV 2017
- RetinaNet -- Best Student Paper Award at ICCV 2017
- Faster R-CNN
- RPN
- Fast R-CNN
- R-FCN
using the following backbone network architectures:
- ResNeXt{50,101,152}
- ResNet{50,101,152}
- Feature Pyramid Networks (with ResNet/ResNeXt)
- VGG16
Additional backbone architectures may be easily implemented. For more details about these models, please see References below.
Update
- 4/2018: Support Group Normalization - see GN/README.md
License
Detectron is released under the Apache 2.0 license. See the NOTICE file for additional details.
Citing Detectron
If you use Detectron in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
@misc{Detectron2018,
  author =       {Ross Girshick and Ilija Radosavovic and Georgia Gkioxari and
                  Piotr Doll\'{a}r and Kaiming He},
  title =        {Detectron},
  howpublished = {\url{https://github.com/facebookresearch/detectron}},
  year =         {2018}
}
Model Zoo and Baselines
We provide a large set of baseline results and trained models available for download in the Detectron Model Zoo.
Installation
Please find installation instructions for Caffe2 and Detectron in INSTALL.md.
Quick Start: Using Detectron
After installation, please see GETTING_STARTED.md for brief tutorials covering inference and training with Detectron.
Getting Help
To start, please check the troubleshooting section of our installation instructions as well as our FAQ. If you couldn't find help there, try searching our GitHub issues. We intend the issues page to be a forum in which the community collectively troubleshoots problems.
If bugs are found, we appreciate pull requests (including adding Q&A's to FAQ.md and improving our installation instructions and troubleshooting documents). Please see CONTRIBUTING.md for more information about contributing to Detectron.
References
- Data Distillation: Towards Omni-Supervised Learning. Ilija Radosavovic, Piotr Dollár, Ross Girshick, Georgia Gkioxari, and Kaiming He. Tech report, arXiv, Dec. 2017.
- Learning to Segment Every Thing. Ronghang Hu, Piotr Dollár, Kaiming He, Trevor Darrell, and Ross Girshick. Tech report, arXiv, Nov. 2017.
- Non-Local Neural Networks. Xiaolong Wang, Ross Girshick, Abhinav Gupta, and Kaiming He. Tech report, arXiv, Nov. 2017.
- Mask R-CNN. Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. IEEE International Conference on Computer Vision (ICCV), 2017.
- Focal Loss for Dense Object Detection. Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. IEEE International Conference on Computer Vision (ICCV), 2017.
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross Girshick, Pieter Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, and Kaiming He. Tech report, arXiv, June 2017.
- Detecting and Recognizing Human-Object Interactions. Georgia Gkioxari, Ross Girshick, Piotr Dollár, and Kaiming He. Tech report, arXiv, Apr. 2017.
- Feature Pyramid Networks for Object Detection. Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- Aggregated Residual Transformations for Deep Neural Networks. Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
- R-FCN: Object Detection via Region-based Fully Convolutional Networks. Jifeng Dai, Yi Li, Kaiming He, and Jian Sun. Conference on Neural Information Processing Systems (NIPS), 2016.
- Deep Residual Learning for Image Recognition. Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Conference on Neural Information Processing Systems (NIPS), 2015.
- Fast R-CNN. Ross Girshick. IEEE International Conference on Computer Vision (ICCV), 2015.