detectron2
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Top Related Projects
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
OpenMMLab Detection Toolbox and Benchmark
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Models and examples built with TensorFlow
End-to-End Object Detection with Transformers
Quick Overview
Detectron2 is an open-source object detection and segmentation library developed by Facebook AI Research. It provides state-of-the-art detection and segmentation algorithms implemented in PyTorch, offering a flexible and efficient platform for computer vision research and applications.
Pros
- High performance and efficiency, with fast training and inference times
- Modular design allowing easy customization and extension of models
- Comprehensive set of pre-trained models and datasets
- Extensive documentation and community support
Cons
- Steep learning curve for beginners in computer vision
- Requires significant computational resources for training large models
- Limited support for deployment on mobile and edge devices
- Some advanced features may require in-depth understanding of the codebase
Code Examples
- Loading a pre-trained model and running inference:
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
cfg.MODEL.WEIGHTS = "path/to/model_weights.pth"
predictor = DefaultPredictor(cfg)
# Run inference on a BGR image, as loaded by OpenCV
image = cv2.imread("path/to/image.jpg")
outputs = predictor(image)
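The `outputs` dictionary returned above holds an `Instances` object under the `"instances"` key, with fields such as `pred_boxes`, `scores`, and `pred_classes`. The sketch below shows one possible way to turn those fields into plain Python records. `summarize_instances` is a hypothetical helper, not part of Detectron2, and it only assumes that each field exposes `.tolist()`, as PyTorch tensors do.

```python
def summarize_instances(instances, class_names=None, min_score=0.5):
    """Convert detection fields into a list of plain dicts.

    Hypothetical helper: `instances` is expected to look like
    outputs["instances"].to("cpu"), i.e. to expose `pred_boxes.tensor`,
    `scores`, and `pred_classes`, each with a `.tolist()` method.
    """
    boxes = instances.pred_boxes.tensor.tolist()  # [[x1, y1, x2, y2], ...]
    scores = instances.scores.tolist()
    classes = instances.pred_classes.tolist()

    records = []
    for box, score, cls in zip(boxes, scores, classes):
        if score < min_score:
            continue  # drop low-confidence detections
        label = class_names[cls] if class_names else str(cls)
        records.append({"label": label, "score": score, "box": box})
    return records
```

With a real predictor, a call might look like `summarize_instances(outputs["instances"].to("cpu"))`. Passing `class_names` is optional and assumes you have a list of category names for the dataset the model was trained on.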
- Training a custom model:
from detectron2.engine import DefaultTrainer
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
cfg.DATASETS.TRAIN = ("custom_dataset_train",)
cfg.DATASETS.TEST = ("custom_dataset_val",)
cfg.SOLVER.IMS_PER_BATCH = 2
cfg.SOLVER.BASE_LR = 0.00025
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
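The training snippet refers to `custom_dataset_train`, which has to be registered before `DefaultTrainer` can load it. The following is a minimal sketch of one way to do that. The helper names, file path, and sample box are hypothetical; the record layout is meant to follow Detectron2's standard dataset dicts, so check the documentation for your version before relying on it.

```python
def build_records(samples):
    """Turn (file_name, height, width, boxes) tuples into dataset dicts.

    `boxes` is a list of (x1, y1, x2, y2, category_id) in absolute pixels.
    """
    records = []
    for image_id, (file_name, height, width, boxes) in enumerate(samples):
        annotations = [
            {"bbox": [x1, y1, x2, y2], "category_id": category_id}
            for (x1, y1, x2, y2, category_id) in boxes
        ]
        records.append({
            "file_name": file_name,
            "image_id": image_id,
            "height": height,
            "width": width,
            "annotations": annotations,
        })
    return records


def register_custom_dataset(name, records, class_names):
    """Register `records` under `name` so cfg.DATASETS.TRAIN can refer to it."""
    # Imported lazily so build_records() works without Detectron2 installed.
    from detectron2.data import DatasetCatalog, MetadataCatalog
    from detectron2.structures import BoxMode

    for record in records:
        for annotation in record["annotations"]:
            # Boxes above are absolute (x1, y1, x2, y2) coordinates.
            annotation["bbox_mode"] = BoxMode.XYXY_ABS

    DatasetCatalog.register(name, lambda: records)
    MetadataCatalog.get(name).set(thing_classes=class_names)


# Hypothetical sample data: one image with a single box of class 0.
train_records = build_records(
    [("images/0001.jpg", 480, 640, [(10, 20, 200, 220, 0)])]
)
```

Calling `register_custom_dataset("custom_dataset_train", train_records, ["widget"])` before constructing the trainer would make the name resolvable. For R-CNN style models, `cfg.MODEL.ROI_HEADS.NUM_CLASSES` should also match the number of classes.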
- Visualizing predictions:
import cv2
from detectron2.data import MetadataCatalog
from detectron2.utils.visualizer import Visualizer
outputs = predictor(image)
# Visualizer expects RGB, so reverse the channels of the BGR image
v = Visualizer(image[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
# Convert back to BGR for display with OpenCV
cv2.imshow("Predictions", out.get_image()[:, :, ::-1])
cv2.waitKey(0)
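The `[:, :, ::-1]` slices in the visualization snippet reverse the channel axis. Assuming `image` was loaded with OpenCV (BGR order), the first slice hands `Visualizer` an RGB image and the second converts the rendered result back to BGR for `cv2.imshow`. A small NumPy illustration on a made-up two-pixel array:

```python
import numpy as np

# A 1x2 "image" in BGR channel order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the first and third channels (BGR <-> RGB).
rgb = bgr[:, :, ::-1]

# Applying the same slice again restores the original ordering.
assert np.array_equal(rgb[:, :, ::-1], bgr)
```

The slice returns a view rather than a copy, so it costs almost nothing; some OpenCV functions need a contiguous array, in which case `np.ascontiguousarray(rgb)` is one option.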
Getting Started
To get started with Detectron2:
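- Install Detectron2 (the prebuilt wheel below targets CUDA 11.3 and PyTorch 1.10; pick the index that matches your environment, or build from source):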
pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu113/torch1.10/index.html
- Import and use Detectron2 in your Python script:
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()
# Use Detectron2 functionalities
- For more detailed instructions and examples, refer to the official Detectron2 documentation and tutorials on the GitHub repository.
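The wheel URL in the install step encodes a CUDA tag (`cu113`) and a PyTorch version (`1.10`). The sketch below rebuilds that URL from its two parts. `wheel_index_url` is a hypothetical helper, and the URL pattern is inferred only from the single URL shown above, so a given combination is not guaranteed to be published.

```python
def wheel_index_url(cuda_tag, torch_version):
    """Build a wheel index URL following the pattern of the install command.

    Hypothetical helper: the pattern is an assumption based on one known URL.
    """
    base = "https://dl.fbaipublicfiles.com/detectron2/wheels"
    return f"{base}/{cuda_tag}/torch{torch_version}/index.html"


# Reproduces the URL used in the install command above.
print(wheel_index_url("cu113", "1.10"))
```

In an environment with PyTorch installed, `torch.__version__` and `torch.version.cuda` report the values to match against. If no prebuilt wheel fits, the official installation instructions describe building from source.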
Competitor Comparisons
Code release for "Masked-attention Mask Transformer for Universal Image Segmentation"
Pros of Mask2Former
- More advanced architecture for instance segmentation and panoptic segmentation tasks
- Improved performance on various benchmarks compared to Detectron2
- Unified approach for different segmentation tasks (instance, semantic, panoptic)
Cons of Mask2Former
- Potentially higher computational requirements due to more complex architecture
- May have a steeper learning curve for implementation and fine-tuning
- Less extensive documentation and community support compared to Detectron2
Code Comparison
Detectron2:
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Mask2Former:
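from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.projects.deeplab import add_deeplab_config
from mask2former import add_maskformer2_config
cfg = get_cfg()
# Mask2Former extends the default config with its own keys
add_deeplab_config(cfg)
add_maskformer2_config(cfg)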
cfg.merge_from_file("path/to/config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Mask2Former is built on top of Detectron2, so the two snippets share the same config and predictor workflow; Mask2Former adds its own configuration keys and model code for advanced segmentation tasks.
Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
Pros of Mask_RCNN
- Easier to set up and use, especially for beginners
- Built on top of Keras and TensorFlow, which are widely used and well-documented
- Includes pre-trained models on the COCO dataset
Cons of Mask_RCNN
- Less flexible and customizable compared to Detectron2
- Slower inference speed and higher memory usage
- Limited to Mask R-CNN architecture, while Detectron2 supports multiple models
Code Comparison
Mask_RCNN:
import mrcnn.model as modellib
model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(WEIGHTS_PATH, by_name=True)
results = model.detect([image], verbose=1)
Detectron2:
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file(MODEL_ZOO_CONFIG_PATH)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Both repositories provide powerful tools for instance segmentation, but Detectron2 offers more flexibility and better performance at the cost of a steeper learning curve.
OpenMMLab Detection Toolbox and Benchmark
Pros of mmdetection
- More comprehensive model zoo with a wider variety of pre-trained models
- Flexible and modular design, allowing easier customization and extension
- Better documentation and tutorials for beginners
Cons of mmdetection
- Slightly steeper learning curve due to its more complex architecture
- May have slower inference speed for some models compared to Detectron2
Code Comparison
mmdetection:
from mmdet.apis import init_detector, inference_detector
config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'test.jpg')
Detectron2:
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file("configs/COCO-Detection/faster_rcnn_R_50_FPN_1x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-Detection/faster_rcnn_R_50_FPN_1x/137257794/model_final_b275ba.pkl"
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
Both libraries offer powerful object detection capabilities, with mmdetection providing more flexibility and a larger model zoo, while Detectron2 may be simpler to use for beginners and potentially faster for some models.
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Pros of YOLOv5
- Faster training and inference times
- Simpler architecture, easier to understand and modify
- Better out-of-the-box performance on small object detection
Cons of YOLOv5
- Less flexible for custom tasks compared to Detectron2
- Smaller community and fewer pre-trained models
- Focused mainly on object detection (later releases add classification and instance segmentation), while Detectron2 covers a wider range of tasks
Code Comparison
YOLOv5:
import torch
# Load the small pretrained model through PyTorch Hub
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
results = model("image.jpg")
Detectron2:
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file("config.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)
YOLOv5 offers a more straightforward API for quick implementation, while Detectron2 provides greater flexibility and customization options. YOLOv5 is ideal for rapid prototyping and deployment of object detection models, especially when dealing with small objects or requiring fast inference times. Detectron2, on the other hand, excels in research environments and complex computer vision tasks beyond object detection.
Models and examples built with TensorFlow
Pros of TensorFlow Models
- Broader scope, covering various ML tasks beyond computer vision
- Extensive documentation and tutorials for beginners
- Large community support and frequent updates
Cons of TensorFlow Models
- Less focused on object detection compared to Detectron2
- May require more setup and configuration for specific tasks
- Performance can be slower for some computer vision tasks
Code Comparison
Detectron2 (PyTorch):
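from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
# Use the trained weights that correspond to the same model zoo config
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(image)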
TensorFlow Models:
import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils
detect_fn = tf.saved_model.load(path_to_model)
# Exported detection models take a batched image tensor
input_tensor = tf.convert_to_tensor(image)[tf.newaxis, ...]
detections = detect_fn(input_tensor)
Both repositories offer powerful tools for computer vision tasks, with Detectron2 being more specialized in object detection and instance segmentation, while TensorFlow Models provides a broader range of machine learning capabilities.
End-to-End Object Detection with Transformers
Pros of DETR
- Simpler architecture with end-to-end training
- Better performance on large objects and crowded scenes
- More flexible and extensible to other tasks like panoptic segmentation
Cons of DETR
- Slower inference time compared to Detectron2
- May struggle with small objects
- Requires longer training time to converge
Code Comparison
DETR:
class DETR(nn.Module):
    def __init__(self, num_classes, hidden_dim, nheads,
                 num_encoder_layers, num_decoder_layers):
        super().__init__()
        self.transformer = Transformer(
            d_model=hidden_dim,
            dropout=0.1,
            nhead=nheads,
            dim_feedforward=2048,
            num_encoder_layers=num_encoder_layers,
            num_decoder_layers=num_decoder_layers,
            normalize_before=False,
            return_intermediate_dec=True,
        )
Detectron2:
class GeneralizedRCNN(nn.Module):
    def __init__(self, cfg):
        super().__init__()
        self.backbone = build_backbone(cfg)
        self.proposal_generator = build_proposal_generator(cfg, self.backbone.output_shape())
        self.roi_heads = build_roi_heads(cfg, self.backbone.output_shape())
        self.vis_period = cfg.VIS_PERIOD
        self.input_format = cfg.INPUT.FORMAT
README
Detectron2 is Facebook AI Research's next generation library that provides state-of-the-art detection and segmentation algorithms. It is the successor of Detectron and maskrcnn-benchmark. It supports a number of computer vision research projects and production applications in Facebook.
Learn More about Detectron2
Explain Like I'm 5: Detectron2 | Using Machine Learning with Detectron2 |
---|---|
What's New
- Includes new capabilities such as panoptic segmentation, Densepose, Cascade R-CNN, rotated bounding boxes, PointRend, DeepLab, ViTDet, MViTv2 etc.
- Used as a library to support building research projects on top of it.
- Models can be exported to TorchScript format or Caffe2 format for deployment.
- It trains much faster.
See our blog post to see more demos and learn about detectron2.
Installation
See installation instructions.
Getting Started
See Getting Started with Detectron2, and the Colab Notebook to learn about basic usage.
Learn more at our documentation. And see projects/ for some projects that are built on top of detectron2.
Model Zoo and Baselines
We provide a large set of baseline results and trained models available for download in the Detectron2 Model Zoo.
License
Detectron2 is released under the Apache 2.0 license.
Citing Detectron2
If you use Detectron2 in your research or wish to refer to the baseline results published in the Model Zoo, please use the following BibTeX entry.
@misc{wu2019detectron2,
author = {Yuxin Wu and Alexander Kirillov and Francisco Massa and
Wan-Yen Lo and Ross Girshick},
title = {Detectron2},
howpublished = {\url{https://github.com/facebookresearch/detectron2}},
year = {2019}
}