detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

32,239

7,711

569

Top Related Projects

Mask2Former

2,905

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

mmdetection

31,487

OpenMMLab Detection Toolbox and Benchmark

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

models

77,618

Models and examples built with TensorFlow

detr

14,567

End-to-End Object Detection with Transformers

Quick Overview

Detectron2 is an open-source object detection and segmentation library developed by Facebook AI Research. It provides state-of-the-art detection and segmentation algorithms implemented in PyTorch, offering a flexible and efficient platform for computer vision research and applications.

Pros

High performance and efficiency, with fast training and inference times
Modular design allowing easy customization and extension of models
Comprehensive set of pre-trained models and datasets
Extensive documentation and community support

Cons

Steep learning curve for beginners in computer vision
Requires significant computational resources for training large models
Limited support for deployment on mobile and edge devices
Some advanced features may require in-depth understanding of the codebase

Code Examples

Loading a pre-trained model and running inference:

import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
cfg.MODEL.WEIGHTS = "path/to/model_weights.pth"
predictor = DefaultPredictor(cfg)

# Run inference on an image; DefaultPredictor expects the format in
# cfg.INPUT.FORMAT (BGR by default), which is what cv2.imread returns
image = cv2.imread("path/to/image.jpg")
outputs = predictor(image)
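
For detection and instance segmentation models, the predictor's `outputs` dictionary holds an `Instances` object under the `"instances"` key. The sketch below shows one way to turn it into plain Python records. `summarize_instances`, `class_names` and `score_threshold` are hypothetical names introduced here for illustration; `pred_boxes`, `scores` and `pred_classes` are the standard prediction fields.

```python
def summarize_instances(instances, class_names, score_threshold=0.5):
    """Convert Detectron2-style predictions into a list of plain dicts.

    `instances` is expected to expose `pred_boxes.tensor` (N x 4, XYXY order),
    `scores` (N) and `pred_classes` (N), as `outputs["instances"].to("cpu")` does.
    """
    boxes = instances.pred_boxes.tensor.tolist()
    scores = instances.scores.tolist()
    classes = instances.pred_classes.tolist()

    records = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < score_threshold:
            continue  # drop low-confidence detections
        records.append({"label": class_names[cls], "score": score, "box_xyxy": box})
    return records
```

With the predictor above, a call might look like `summarize_instances(outputs["instances"].to("cpu"), MetadataCatalog.get(cfg.DATASETS.TRAIN[0]).thing_classes)`, assuming the dataset metadata defines `thing_classes`.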

Training a custom model:

from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
cfg.DATASETS.TRAIN = ("custom_dataset_train",)
cfg.DATASETS.TEST = ("custom_dataset_val",)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
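
The training snippet refers to datasets by name, so `"custom_dataset_train"` has to be registered before the trainer is built. The following is a minimal sketch of such a registration, assuming a single-class dataset with boxes in absolute `(x1, y1, x2, y2)` form. The file path, the class name `"widget"` and the `load_custom_dataset` function are hypothetical; the guarded import only lets the sketch run where Detectron2 is not installed.

```python
try:
    from detectron2.data import DatasetCatalog, MetadataCatalog
    from detectron2.structures import BoxMode
    XYXY_ABS = BoxMode.XYXY_ABS
except ImportError:  # lets the sketch run without Detectron2 installed
    DatasetCatalog = MetadataCatalog = None
    XYXY_ABS = 0  # integer value of BoxMode.XYXY_ABS

CLASS_NAMES = ["widget"]  # hypothetical single-class dataset


def load_custom_dataset():
    """Return records in Detectron2's standard dataset-dict format."""
    # Hypothetical annotation source: (file, height, width, XYXY boxes).
    raw = [("images/0001.jpg", 480, 640, [[10.0, 20.0, 110.0, 220.0]])]
    records = []
    for image_id, (file_name, height, width, boxes) in enumerate(raw):
        annotations = [
            {"bbox": box, "bbox_mode": XYXY_ABS, "category_id": 0}
            for box in boxes
        ]
        records.append({
            "file_name": file_name,
            "image_id": image_id,
            "height": height,
            "width": width,
            "annotations": annotations,
        })
    return records


# Register the loader under the name used in cfg.DATASETS.TRAIN.
if DatasetCatalog is not None and "custom_dataset_train" not in DatasetCatalog.list():
    DatasetCatalog.register("custom_dataset_train", load_custom_dataset)
    MetadataCatalog.get("custom_dataset_train").thing_classes = CLASS_NAMES
```

A real loader would read annotations from disk instead of the hard-coded `raw` list, and the validation split would be registered the same way under its own name.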

Visualizing predictions:

import cv2
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer

outputs = predictor(image)
# Visualizer expects RGB, so reverse the channel order of the BGR image
v = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
# Convert back to BGR before displaying with OpenCV
cv2.imshow("Predictions", out.get_image()[:, :, ::-1])
cv2.waitKey(0)

Getting Started

To get started with Detectron2:

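Install Detectron2 (the pre-built wheel below targets CUDA 11.3 and PyTorch 1.10; adjust the URL to match your environment):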

pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html

Import and use Detectron2 in your Python script:

import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# Use Detectron2 functionalities

For more detailed instructions and examples, refer to the official Detectron2 documentation and tutorials on the GitHub repository.
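
The `-f` URL in the install command encodes a CUDA tag (`cu113`) and a PyTorch version (`torch1.10`). The sketch below assembles that URL for another environment. `detectron2_wheel_index` and `pip_install_command` are hypothetical helpers, and the sketch assumes the same URL layout applies to the requested combination. Pre-built wheels are published only for some combinations, so a URL built this way may not resolve; building from source is then the fallback.

```python
def detectron2_wheel_index(cuda_tag, torch_version):
    """Build the wheel index URL for a CUDA tag and a PyTorch version string."""
    base = "https://dl.fbaipublicfiles.com/detectron2/wheels"
    # Keep only major.minor, dropping any patch number and local suffix (e.g. "+cu113").
    major_minor = ".".join(torch_version.split("+")[0].split(".")[:2])
    return f"{base}/{cuda_tag}/torch{major_minor}/index.html"


def pip_install_command(cuda_tag, torch_version):
    """Return the pip command line that points at the matching wheel index."""
    return f"pip install detectron2 -f {detectron2_wheel_index(cuda_tag, torch_version)}"


if __name__ == "__main__":
    print(pip_install_command("cu113", "1.10.2+cu113"))
```

In practice the two arguments could come from `torch.version.cuda` and `torch.__version__` of the installed PyTorch build.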

Competitor Comparisons

Mask2Former

2,905

Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"

Pros of Mask2Former

More advanced architecture for instance segmentation and panoptic segmentation tasks
Reports improved results on segmentation benchmarks compared to the baseline models shipped with Detectron2
Unified approach for different segmentation tasks (instance, semantic, panoptic)

Cons of Mask2Former

Potentially higher computational requirements due to more complex architecture
May have a steeper learning curve for implementation and fine-tuning
Less extensive documentation and community support compared to Detectron2

Code Comparison

Detectron2:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

Mask2Former:

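from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects.deeplab import add_deeplab_config
from mask2former import add_maskformer2_config

cfg = get_cfg()
# Register the extra config keys before merging a Mask2Former config file
add_deeplab_config(cfg)
add_maskformer2_config(cfg)
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)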

Mask2Former is built on top of Detectron2: it reuses Detectron2's configuration system and predictor, and adds its own configuration keys and model components for instance, semantic, and panoptic segmentation.
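
The order of calls in the Mask2Former snippet matters: the project's config helper runs before `merge_from_file` because a yacs-style config rejects keys it does not already know. The sketch below imitates that behavior with a small stand-in class. `StrictConfig`, its methods and the `MODEL.PROJECT_HEAD.NUM_QUERIES` key are hypothetical and exist only to illustrate the pattern; they are not part of either library.

```python
class StrictConfig:
    """Tiny stand-in for a yacs-style config that rejects unknown keys."""

    def __init__(self, defaults):
        self._values = dict(defaults)

    def register(self, key, default):
        # Plays the role of an add_*_config(cfg) helper: declare a new key.
        self._values.setdefault(key, default)

    def merge(self, overrides):
        # Plays the role of merge_from_file: only known keys may be overridden.
        for key, value in overrides.items():
            if key not in self._values:
                raise KeyError(f"Non-existent config key: {key}")
            self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


# Hypothetical contents of a project YAML file, flattened to dotted keys.
project_yaml = {
    "MODEL.WEIGHTS": "path/to/model_weights.pth",
    "MODEL.PROJECT_HEAD.NUM_QUERIES": 100,
}

cfg = StrictConfig({"MODEL.WEIGHTS": ""})
try:
    cfg.merge(project_yaml)  # merging before registering the project key fails
    merged_without_register = True
except KeyError:
    merged_without_register = False

cfg.register("MODEL.PROJECT_HEAD.NUM_QUERIES", 0)  # declare the project key first
cfg.merge(project_yaml)  # now the same merge succeeds
```

The same reasoning applies to any project that extends the base config: declare the new keys, then merge the file that sets them.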

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

Easier to set up and use, especially for beginners
Built on top of Keras and TensorFlow, which are widely used and well-documented
Includes pre-trained models on the COCO dataset

Cons of Mask_RCNN

Less flexible and customizable compared to Detectron2
Slower inference speed and higher memory usage
Limited to Mask R-CNN architecture, while Detectron2 supports multiple models

Code Comparison

Mask_RCNN:

import mrcnn.model as modellib

model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(WEIGHTS_PATH, by_name=True)
results = model.detect([image], verbose=1)

Detectron2:

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(MODEL_ZOO_CONFIG_PATH)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

Both repositories provide powerful tools for instance segmentation, but Detectron2 offers more flexibility and better performance at the cost of a steeper learning curve.
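
One practical difference when moving results between the two libraries is the box layout: Mask_RCNN's `detect` returns boxes under `"rois"` as `(y1, x1, y2, x2)`, while Detectron2's `pred_boxes` use `(x1, y1, x2, y2)`. The sketch below reorders the coordinates on plain Python lists; `rois_to_xyxy` and `xyxy_to_rois` are hypothetical helper names.

```python
def rois_to_xyxy(rois):
    """Reorder Mask_RCNN-style (y1, x1, y2, x2) boxes to (x1, y1, x2, y2)."""
    return [[x1, y1, x2, y2] for (y1, x1, y2, x2) in rois]


def xyxy_to_rois(boxes):
    """Reorder (x1, y1, x2, y2) boxes to Mask_RCNN-style (y1, x1, y2, x2)."""
    return [[y1, x1, y2, x2] for (x1, y1, x2, y2) in boxes]


if __name__ == "__main__":
    # A box spanning x in [10, 110] and y in [20, 220].
    print(rois_to_xyxy([[20, 10, 220, 110]]))
```

Array or tensor inputs would first need converting with `.tolist()`, or the same reordering could be done with column indexing.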

mmdetection

31,487

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

More comprehensive model zoo with a wider variety of pre-trained models
Flexible and modular design, allowing easier customization and extension
Better documentation and tutorials for beginners

Cons of mmdetection

Slightly steeper learning curve due to its more complex architecture
May have slower inference speed for some models compared to Detectron2

Code Comparison

mmdetection:

from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'test.jpg')

Detectron2:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/model_final_b275ba.pkl"
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

Both libraries offer powerful object detection capabilities, with mmdetection providing more flexibility and a larger model zoo, while Detectron2 may be simpler to use for beginners and potentially faster for some models.

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

Faster training and inference times
Simpler architecture, easier to understand and modify
Better out-of-the-box performance on small object detection

Cons of YOLOv5

Less flexible for custom tasks compared to Detectron2
Narrower range of pre-trained model architectures than the Detectron2 model zoo
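Focused mainly on object detection (later releases add classification and instance segmentation), while Detectron2 also covers tasks such as panoptic segmentation and DensePose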

Code Comparison

YOLOv5:

import torch

# Load a pre-trained YOLOv5s model through PyTorch Hub
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
results = model("image.jpg")

Detectron2:

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

YOLOv5 offers a more straightforward API for quick implementation, while Detectron2 provides greater flexibility and customization options. YOLOv5 is ideal for rapid prototyping and deployment of object detection models, especially when dealing with small objects or requiring fast inference times. Detectron2, on the other hand, excels in research environments and complex computer vision tasks beyond object detection.

models

77,618

Models and examples built with TensorFlow

Pros of TensorFlow Models

Broader scope, covering various ML tasks beyond computer vision
Extensive documentation and tutorials for beginners
Large community support and frequent updates

Cons of TensorFlow Models

Less focused on object detection compared to Detectron2
May require more setup and configuration for specific tasks
Performance can be slower for some computer vision tasks

Code Comparison

Detectron2 (PyTorch):

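from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

config_name = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(config_name))
# Use the matching pre-trained weights from the model zoo
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)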

TensorFlow Models:

import tensorflow as tf

# Load a SavedModel exported with the TensorFlow Object Detection API
detect_fn = tf.saved_model.load(path_to_model)

# The exported model expects a batched tensor of shape [1, H, W, 3]
input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]
detections = detect_fn(input_tensor)

Both repositories offer powerful tools for computer vision tasks, with Detectron2 being more specialized in object detection and instance segmentation, while TensorFlow Models provides a broader range of machine learning capabilities.

detr

14,567

End-to-End Object Detection with Transformers

Pros of DETR

Simpler architecture with end-to-end training
Better performance on large objects and crowded scenes
More flexible and extensible to other tasks like panoptic segmentation

Cons of DETR

Slower inference time compared to Detectron2
May struggle with small objects
Requires longer training time to converge

Code Comparison

DETR:

class DETR(nn.Module):
    def __init__(self, num_classes, hidden_dim, nheads,
                 num_encoder_layers, num_decoder_layers):
        super().__init__()
        self.transformer = Transformer(
            d_model=hidden_dim,
            dropout=0.1,
            nhead=nheads,
            dim_feedforward=2048,
            num_encoder_layers=num_encoder_layers,
            num_decoder_layers=num_decoder_layers,
            normalize_before=False,
            return_intermediate_dec=True,
        )

Detectron2:

class GeneralizedRCNN(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.backbone = build_backbone(cfg)
        self.proposal_generator = build_proposal_generator(cfg, self.backbone.output_shape())
        self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape())
        self.vis_period = cfg.VIS_PERIOD
        self.input_format = cfg.INPUT.FORMAT

README

Detectron2 is Facebook AI Research's next-generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor to Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications at Facebook.

Learn More about Detectron2

Includes new capabilities such as panoptic segmentation, DensePose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, ViTDet, MViTv2, etc.
Used as a library to support building research projects on top of it.
Models can be exported to TorchScript format or Caffe2 format for deployment.
It trains much faster.

See our blog post for more demos, and this interview to learn more about the stories behind detectron2.

Installation

See installation instructions.

Getting Started

See Getting Started with Detectron2, and the Colab Notebook to learn about basic usage.

Learn more in our documentation, and see projects/ for some projects that are built on top of detectron2.

Model Zoo and Baselines

We provide a large set of baseline results and trained models available for download in the Detectron2 Model Zoo.

License

Detectron2 is released under the Apache 2.0 license.

Citing Detectron2

If you use Detectron2 in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}
