
facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.


Top Related Projects

  • Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow
  • mmdetection: OpenMMLab Detection Toolbox and Benchmark
  • models: Models and examples built with TensorFlow
  • Detectron2: a platform for object detection, segmentation and other visual recognition tasks
  • py-faster-rcnn: Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version
  • DeepLearningExamples: State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure

Quick Overview

Maskrcnn-benchmark is a fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch. It's designed to be flexible and extensible, allowing for easy implementation of novel object detection models. The project is maintained by Facebook AI Research.

Pros

  • High performance and efficiency, optimized for both training and inference
  • Modular design, making it easy to add new models and datasets
  • Comprehensive documentation and examples for various use cases
  • Developed by Facebook AI Research, with pre-trained models for most reference configurations

Cons

  • Steep learning curve for beginners in computer vision and deep learning
  • Requires significant computational resources for training large models
  • Limited to PyTorch framework, which may not suit all users' preferences
  • Some advanced features may require in-depth understanding of the codebase
  • Deprecated in favor of Detectron2, so it no longer receives updates

Code Examples

  1. Loading a pre-trained model:
import torch
from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.modeling.detector import build_detection_model
from maskrcnn_benchmark.utils.checkpoint import DetectronCheckpointer

config_file = "path/to/config/file.yaml"
cfg.merge_from_file(config_file)
model = build_detection_model(cfg)
checkpointer = DetectronCheckpointer(cfg, model, save_dir="output")
_ = checkpointer.load(cfg.MODEL.WEIGHT)

# switch to evaluation mode and move to the configured device before inference
device = torch.device(cfg.MODEL.DEVICE)
model.to(device)
model.eval()
  2. Performing inference on an image:
import torch
from PIL import Image
from torchvision.transforms import functional as F
from maskrcnn_benchmark.structures.image_list import to_image_list

# note: for best results the image should be resized and normalized the same
# way as in demo/predictor.py; this minimal version feeds the raw tensor
image = Image.open("path/to/image.jpg")
image = F.to_tensor(image).unsqueeze(0).to(device)
image_list = to_image_list(image, cfg.DATALOADER.SIZE_DIVISIBILITY)

with torch.no_grad():
    predictions = model(image_list)
  3. Visualizing predictions (using the COCODemo helper from the demo/ folder, which draws the detections onto the image):
import cv2
import matplotlib.pyplot as plt
from predictor import COCODemo  # lives in demo/predictor.py

coco_demo = COCODemo(cfg, confidence_threshold=0.7)
image = cv2.imread("path/to/image.jpg")
result = coco_demo.run_on_opencv_image(image)  # returns the image with overlays drawn

plt.imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
plt.axis('off')
plt.tight_layout()
plt.show()

Getting Started

  1. Clone the repository:

    git clone https://github.com/facebookresearch/maskrcnn-benchmark.git
    
  2. Install dependencies:

    cd maskrcnn-benchmark
    # follow INSTALL.md for the required conda/pip packages
    
  3. Build and install maskrcnn-benchmark:

    python setup.py build develop
    
  4. Run a demo from the demo folder:

    cd demo
    python webcam.py --min-image-size 800
    

Competitor Comparisons

Mask_RCNN: Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

  • Easier to use and more beginner-friendly
  • Better documentation and tutorials
  • Supports both TensorFlow and Keras backends

Cons of Mask_RCNN

  • Generally slower performance
  • Less flexibility for advanced users
  • No longer under active development (though maskrcnn-benchmark itself has since been deprecated in favor of Detectron2)

Code Comparison

Mask_RCNN:

import mrcnn.model as modellib

model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(COCO_MODEL_PATH, by_name=True)
results = model.detect([image], verbose=1)

maskrcnn-benchmark:

from maskrcnn_benchmark.config import cfg
from predictor import COCODemo  # from the demo/ folder

cfg.merge_from_file("configs/e2e_mask_rcnn_R_50_FPN_1x.yaml")
coco_demo = COCODemo(cfg, confidence_threshold=0.7, masks_per_dim=2)
predictions = coco_demo.run_on_opencv_image(image)

Both repositories implement Mask R-CNN for object detection and instance segmentation. Mask_RCNN is more user-friendly and well-documented, making it a good choice for beginners or those who prefer a simpler API. However, maskrcnn-benchmark offers better performance and more flexibility for advanced users, making it suitable for research and production environments.

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

  • More comprehensive and flexible framework, supporting a wider range of object detection algorithms
  • Better documentation and community support
  • Modular design allowing easier customization and extension

Cons of mmdetection

  • Steeper learning curve due to its complexity and extensive features
  • Potentially slower inference speed for simpler models compared to maskrcnn-benchmark

Code Comparison

mmdetection:

from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, 'test.jpg')

maskrcnn-benchmark:

from maskrcnn_benchmark.config import cfg
from predictor import COCODemo

config_file = "configs/e2e_faster_rcnn_R_50_FPN_1x.yaml"
cfg.merge_from_file(config_file)
coco_demo = COCODemo(cfg, confidence_threshold=0.7, masks_per_dim=2)
predictions = coco_demo.run_on_opencv_image(image)
models: Models and examples built with TensorFlow

Pros of models

  • Broader scope: Covers a wide range of machine learning models and applications
  • Extensive documentation and community support
  • Regular updates and maintenance from Google's TensorFlow team

Cons of models

  • Can be overwhelming due to its large size and diverse content
  • May require more setup and configuration for specific tasks
  • Performance might not be optimized for certain computer vision tasks

Code Comparison

maskrcnn-benchmark:

from maskrcnn_benchmark.config import cfg
from predictor import COCODemo

config_file = "path/to/config/file.yaml"
cfg.merge_from_file(config_file)
coco_demo = COCODemo(cfg, confidence_threshold=0.7)

models:

import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

model = tf.saved_model.load('path/to/saved_model')
category_index = label_map_util.create_category_index_from_labelmap('path/to/labelmap.pbtxt')

Key differences

  • maskrcnn-benchmark focuses specifically on instance segmentation and object detection
  • models provides a more general-purpose framework for various machine learning tasks
  • maskrcnn-benchmark may offer better performance for specific computer vision tasks
  • models offers greater flexibility and a wider range of pre-trained models

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

  • More comprehensive and flexible framework for object detection and segmentation
  • Supports a wider range of models and tasks
  • Better performance and faster training times

Cons of Detectron2

  • Steeper learning curve due to increased complexity
  • May be overkill for simpler projects or beginners
  • Requires more computational resources

Code Comparison

maskrcnn-benchmark:

from maskrcnn_benchmark.config import cfg
from predictor import COCODemo

config_file = "path/to/config/file.yaml"
cfg.merge_from_file(config_file)
coco_demo = COCODemo(cfg, confidence_threshold=0.7)

Detectron2:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config/file.yaml")
predictor = DefaultPredictor(cfg)

Both repositories provide powerful tools for object detection and instance segmentation. Maskrcnn-benchmark is more focused and easier to use for specific tasks, while Detectron2 offers a broader range of features and flexibility. The code comparison shows that Detectron2 has a slightly more streamlined API, but both follow similar patterns for configuration and model initialization.

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

Pros of py-faster-rcnn

  • Simpler implementation, making it easier to understand and modify for beginners
  • The official Python implementation of Faster R-CNN from the original authors
  • Includes pre-trained models for quick start and benchmarking

Cons of py-faster-rcnn

  • Older codebase with less frequent updates
  • Limited to Faster R-CNN architecture, lacking support for more recent object detection models
  • Uses older deep learning frameworks (Caffe), which may have limited compatibility with modern hardware

Code Comparison

py-faster-rcnn:

import _init_paths
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms

maskrcnn-benchmark:

from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.modeling.detector import build_detection_model
from maskrcnn_benchmark.utils.checkpoint import DetectronCheckpointer

The code snippets show the difference in import structure and naming conventions between the two projects. maskrcnn-benchmark uses a more modular and organized approach, reflecting its more recent development and broader scope.

DeepLearningExamples: State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

Pros of DeepLearningExamples

  • Broader scope: Covers various deep learning tasks and models
  • More up-to-date: Regularly maintained with recent NVIDIA optimizations
  • Extensive documentation: Includes detailed setup guides and performance benchmarks

Cons of DeepLearningExamples

  • Less focused: Not specialized for instance segmentation like maskrcnn-benchmark
  • Higher complexity: Requires more setup and configuration for specific tasks
  • Steeper learning curve: May be overwhelming for users focused solely on Mask R-CNN

Code Comparison

maskrcnn-benchmark:

from maskrcnn_benchmark.config import cfg
from maskrcnn_benchmark.modeling.detector import build_detection_model

model = build_detection_model(cfg)

DeepLearningExamples:

# module paths here are illustrative; the actual entry points vary by example
from model.mask_rcnn import MaskRCNN
from model.backbone import resnet_fpn_backbone

backbone = resnet_fpn_backbone(50, True)
model = MaskRCNN(backbone, num_classes=81)

Both repositories provide implementations of Mask R-CNN, but DeepLearningExamples offers a more modular approach with separate backbone and model definitions. maskrcnn-benchmark uses a configuration-based model building, which can be more convenient for quick setups but potentially less flexible for customization.


README

Faster R-CNN and Mask R-CNN in PyTorch 1.0

maskrcnn-benchmark has been deprecated. Please see detectron2, which includes implementations for all models in maskrcnn-benchmark.

This project aims at providing the necessary building blocks for easily creating detection and segmentation models using PyTorch 1.0.


Highlights

  • PyTorch 1.0: RPN, Faster R-CNN and Mask R-CNN implementations that match or exceed Detectron accuracies
  • Very fast: up to 2x faster than Detectron and 30% faster than mmdetection during training. See MODEL_ZOO.md for more details.
  • Memory efficient: uses roughly 500MB less GPU memory than mmdetection during training
  • Multi-GPU training and inference
  • Mixed precision training: trains faster with less GPU memory on NVIDIA tensor cores.
  • Batched inference: can perform inference using multiple images per batch per GPU
  • CPU support for inference: can run on CPU at inference time. See our webcam demo for an example
  • Provides pre-trained models for almost all reference Mask R-CNN and Faster R-CNN configurations with 1x schedule.

Webcam and Jupyter notebook demo

We provide a simple webcam demo that illustrates how you can use maskrcnn_benchmark for inference:

cd demo
# by default, it runs on the GPU
# for best results, use min-image-size 800
python webcam.py --min-image-size 800
# can also run it on the CPU
python webcam.py --min-image-size 300 MODEL.DEVICE cpu
# or change the model that you want to use
python webcam.py --config-file ../configs/caffe2/e2e_mask_rcnn_R_101_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu
# in order to see the probability heatmaps, pass --show-mask-heatmaps
python webcam.py --min-image-size 300 --show-mask-heatmaps MODEL.DEVICE cpu
# for the keypoint demo
python webcam.py --config-file ../configs/caffe2/e2e_keypoint_rcnn_R_50_FPN_1x_caffe2.yaml --min-image-size 300 MODEL.DEVICE cpu

A notebook with the demo can be found in demo/Mask_R-CNN_demo.ipynb.

Installation

Check INSTALL.md for installation instructions.

Model Zoo and Baselines

Pre-trained models, baselines and comparison with Detectron and mmdetection can be found in MODEL_ZOO.md

Inference in a few lines

We provide a helper class to simplify writing inference pipelines using pre-trained models. Here is how we would do it. Run this from the demo folder:

from maskrcnn_benchmark.config import cfg
from predictor import COCODemo

config_file = "../configs/caffe2/e2e_mask_rcnn_R_50_FPN_1x_caffe2.yaml"

# update the config options with the config file
cfg.merge_from_file(config_file)
# manual override some options
cfg.merge_from_list(["MODEL.DEVICE", "cpu"])

coco_demo = COCODemo(
    cfg,
    min_image_size=800,
    confidence_threshold=0.7,
)
# load image and then run prediction
image = ...
predictions = coco_demo.run_on_opencv_image(image)

Perform training on COCO dataset

For the following examples to work, you need to first install maskrcnn_benchmark.

You will also need to download the COCO dataset. We recommend symlinking the path to the COCO dataset to datasets/ as follows.

We use the minival and valminusminival sets from Detectron.

# symlink the coco dataset
cd ~/github/maskrcnn-benchmark
mkdir -p datasets/coco
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2014 datasets/coco/train2014
ln -s /path_to_coco_dataset/test2014 datasets/coco/test2014
ln -s /path_to_coco_dataset/val2014 datasets/coco/val2014
# or use COCO 2017 version
ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/test2017 datasets/coco/test2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017

# for pascal voc dataset:
ln -s /path_to_VOCdevkit_dir datasets/voc

P.S. COCO_2017_train = COCO_2014_train + valminusminival, COCO_2017_val = minival

You can also configure your own paths to the datasets. For that, all you need to do is to modify maskrcnn_benchmark/config/paths_catalog.py to point to the location where your dataset is stored. You can also create a new paths_catalog.py file which implements the same two classes, and pass it as a config argument PATHS_CATALOG during training.
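As a rough sketch (assuming the same structure as the existing paths_catalog.py; field names and the factory class may differ in your version), a custom entry could look like:

import os

class DatasetCatalog(object):
    DATA_DIR = "datasets"
    DATASETS = {
        "my_dataset_train": {
            "img_dir": "my_dataset/images",
            "ann_file": "my_dataset/annotations/train.json",
        },
    }

    @staticmethod
    def get(name):
        if "my_dataset" in name:
            attrs = DatasetCatalog.DATASETS[name]
            args = dict(
                root=os.path.join(DatasetCatalog.DATA_DIR, attrs["img_dir"]),
                ann_file=os.path.join(DatasetCatalog.DATA_DIR, attrs["ann_file"]),
            )
            # COCODataset is the factory used for COCO-style annotations
            return dict(factory="COCODataset", args=args)
        raise RuntimeError("Dataset not available: {}".format(name))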

Single GPU training

Most of the configuration files that we provide assume that we are running on 8 GPUs. In order to be able to run it on fewer GPUs, there are a few possibilities:

1. Run the following without modifications

python /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "/path/to/config/file.yaml"

This should work out of the box and is very similar to what we would do for multi-GPU training. The drawback is that it will use much more GPU memory: the configuration files set a global batch size that is divided over the number of GPUs, so with a single GPU the batch size for that GPU will be 8x larger, which might lead to out-of-memory errors.

If you have a lot of memory available, this is the easiest solution.

2. Modify the cfg parameters

If you experience out-of-memory errors, you can reduce the global batch size. But this means that you'll also need to change the learning rate, the number of iterations and the learning rate schedule.

Here is an example for Mask R-CNN R-50 FPN with the 1x schedule:

python tools/train_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" SOLVER.IMS_PER_BATCH 2 SOLVER.BASE_LR 0.0025 SOLVER.MAX_ITER 720000 SOLVER.STEPS "(480000, 640000)" TEST.IMS_PER_BATCH 1 MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000

This follows the scheduling rules from Detectron. Note that we have multiplied the number of iterations by 8 (as well as the learning-rate schedule steps), and we have divided the learning rate by 8.

We also changed the batch size during testing, but that is generally not necessary because testing requires much less memory than training.

Furthermore, we set MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000 because, in the default training setup, proposals are selected per batch rather than per image. The value is calculated as 1000 x images-per-GPU: here we have 2 images per GPU, so we set it to 1000 x 2 = 2000; with 8 images per GPU, the value should be 8000. Note that this does not apply if MODEL.RPN.FPN_POST_NMS_PER_BATCH is set to False during training. See #672 for more details.
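The arithmetic above is just the linear scaling rule; a small helper (hypothetical, not part of the codebase) makes it explicit:

def scale_schedule(ims_per_batch, images_per_gpu,
                   base_lr=0.02, base_iters=90000,
                   base_steps=(60000, 80000), base_ims_per_batch=16):
    # scale the default 8-GPU (16-image) schedule to a smaller global batch
    factor = base_ims_per_batch // ims_per_batch
    return {
        "SOLVER.IMS_PER_BATCH": ims_per_batch,
        "SOLVER.BASE_LR": base_lr / factor,                           # 0.0025 for factor 8
        "SOLVER.MAX_ITER": base_iters * factor,                       # 720000
        "SOLVER.STEPS": tuple(s * factor for s in base_steps),        # (480000, 640000)
        "MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN": 1000 * images_per_gpu,  # 2000 for 2 imgs/GPU
    }

print(scale_schedule(ims_per_batch=2, images_per_gpu=2))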

Multi-GPU training

We internally use torch.distributed.launch to launch multi-GPU training. This utility function from PyTorch spawns as many Python processes as the number of GPUs we want to use, and each Python process uses only a single GPU.

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000

Note that MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN should be set following the rule described in the Single-GPU training section.
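For example, with the default configs (2 images per GPU), the concrete value is 1000 x 2 = 2000:

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN 2000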

Mixed precision training

We currently use APEX to add Automatic Mixed Precision support. To enable, just do Single-GPU or Multi-GPU training and set DTYPE "float16".

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/train_net.py --config-file "path/to/config/file.yaml" MODEL.RPN.FPN_POST_NMS_TOP_N_TRAIN images_per_gpu x 1000 DTYPE "float16"

If you want more verbose logging, set AMP_VERBOSE True. See Mixed Precision Training guide for more details.

Evaluation

You can test your model directly on a single GPU or on multiple GPUs. Here is an example for Mask R-CNN R-50 FPN with the 1x schedule on 8 GPUs:

export NGPUS=8
python -m torch.distributed.launch --nproc_per_node=$NGPUS /path_to_maskrcnn_benchmark/tools/test_net.py --config-file "configs/e2e_mask_rcnn_R_50_FPN_1x.yaml" TEST.IMS_PER_BATCH 16

To calculate mAP for each class, you can simply modify a few lines in coco_eval.py. See #524 for more details.
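The gist of such a modification (a sketch, assuming a pycocotools COCOeval object named coco_eval as used inside coco_eval.py):

import numpy as np

# after evaluate() / accumulate() / summarize(), the precision array has shape
# [num_iou_thresholds, num_recall_levels, num_categories, num_area_ranges, num_max_dets]
precisions = coco_eval.eval["precision"]
for idx, cat_id in enumerate(coco_eval.params.catIds):
    p = precisions[:, :, idx, 0, -1]  # area="all", highest maxDets setting
    valid = p[p > -1]                 # -1 marks entries with no data
    ap = valid.mean() if valid.size else float("nan")
    print("category {}: AP = {:.3f}".format(cat_id, ap))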

Abstractions

For more information on some of the main abstractions in our implementation, see ABSTRACTIONS.md.

Adding your own dataset

This implementation supports COCO-style datasets out of the box. Adding support for training on a new dataset can be done as follows:

import torch
from maskrcnn_benchmark.structures.bounding_box import BoxList

class MyDataset(object):
    def __init__(self, transforms=None):
        # initialize as you would normally; passing the train/test-time
        # transforms in at construction is one common pattern
        self.transforms = transforms

    def __getitem__(self, idx):
        # load the image as a PIL Image
        image = ...

        # load the bounding boxes as a list of lists of boxes
        # in this case, for illustrative purposes, we use
        # x1, y1, x2, y2 order.
        boxes = [[0, 0, 10, 10], [10, 20, 50, 50]]
        # and labels
        labels = torch.tensor([10, 20])

        # create a BoxList from the boxes
        boxlist = BoxList(boxes, image.size, mode="xyxy")
        # add the labels to the boxlist
        boxlist.add_field("labels", labels)

        if self.transforms:
            image, boxlist = self.transforms(image, boxlist)

        # return the image, the boxlist and the idx in your dataset
        return image, boxlist, idx

    def get_img_info(self, idx):
        # get img_height and img_width. This is used if
        # we want to split the batches according to the aspect ratio
        # of the image, as it can be more efficient than loading the
        # image from disk
        return {"height": img_height, "width": img_width}

That's it. You can also add extra fields to the boxlist, such as segmentation masks (using structures.segmentation_mask.SegmentationMask), or even your own instance type.
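For instance, attaching masks could look like the following sketch (the SegmentationMask constructor signature has changed between versions, so treat the mode argument as an assumption):

from maskrcnn_benchmark.structures.segmentation_mask import SegmentationMask

# one list of polygons per instance; each polygon is a flat [x1, y1, x2, y2, ...] list
polygons = [
    [[0, 0, 10, 0, 10, 10, 0, 10]],
    [[10, 20, 50, 20, 50, 50, 10, 50]],
]
masks = SegmentationMask(polygons, image.size, mode="poly")
boxlist.add_field("masks", masks)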

For a full example of how the COCODataset is implemented, check maskrcnn_benchmark/data/datasets/coco.py.

Once you have created your dataset, it needs to be added in a couple of places: maskrcnn_benchmark/data/datasets/__init__.py (add it to __all__) and maskrcnn_benchmark/config/paths_catalog.py (register its paths in DatasetCatalog, as sketched earlier).

Testing

While the aforementioned example should work for training, we leverage the COCO API (cocoApi) for computing accuracies during testing, so test datasets currently need to follow its format.

To enable your dataset for testing, add a corresponding if statement in maskrcnn_benchmark/data/datasets/evaluation/__init__.py:

if isinstance(dataset, datasets.MyDataset):
    return coco_evaluation(**args)

Finetuning from Detectron weights on custom datasets

Create a script tools/trim_detectron_model.py that strips the incompatible weights from a Detectron checkpoint. You can decide which keys to remove and which to keep by modifying the script.

Then you can simply point MODEL.WEIGHT in the config file to the converted model path.
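A minimal sketch of what such a script can do (the key names to drop are assumptions; in practice they are the class-count-dependent heads such as the box classifier and mask predictor):

import torch

# heads whose shapes depend on the number of classes; adjust to your model
KEYS_TO_REMOVE = ("cls_score", "bbox_pred", "mask_fcn_logits")

checkpoint = torch.load("detectron_model.pth", map_location="cpu")
state_dict = checkpoint.get("model", checkpoint)
trimmed = {k: v for k, v in state_dict.items()
           if not any(key in k for key in KEYS_TO_REMOVE)}
torch.save({"model": trimmed}, "trimmed_detectron_model.pth")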

For further information, please refer to #15.

Troubleshooting

If you have issues running or compiling this code, we have compiled a list of common issues in TROUBLESHOOTING.md. If your issue is not present there, please feel free to open a new issue.

Citations

Please consider citing this project in your publications if it helps your research. The following is a BibTeX reference. The BibTeX entry requires the url LaTeX package.

@misc{massa2018mrcnn,
  author = {Massa, Francisco and Girshick, Ross},
  title = {{maskrcnn-benchmark: Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch}},
  year = {2018},
  howpublished = {\url{https://github.com/facebookresearch/maskrcnn-benchmark}},
  note = {Accessed: [Insert date here]}
}

Projects using maskrcnn-benchmark

License

maskrcnn-benchmark is released under the MIT license. See LICENSE for additional details.