py-faster-rcnn

Faster R-CNN (Python implementation) -- see https://github.com/ShaoqingRen/faster_rcnn for the official MATLAB version

Top Related Projects

Detectron

26,363

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

models

77,618

Models and examples built with TensorFlow

mmdetection

31,487

OpenMMLab Detection Toolbox and Benchmark

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Quick Overview

py-faster-rcnn is a Python implementation of Faster R-CNN, a two-stage object detection algorithm that generates region proposals with a Region Proposal Network (RPN) and classifies them with a Fast R-CNN detector. It provides a Caffe-based framework for training and testing detection models. The repository includes scripts for downloading pre-trained models and tools for working with datasets such as PASCAL VOC and COCO.

Pros

Implements a powerful and efficient object detection algorithm
Includes pre-trained models for quick start and benchmarking
Supports multiple popular datasets out of the box
Provides a modular architecture for easy customization and experimentation

Cons

Requires significant computational resources for training
Limited documentation and tutorials for beginners
Depends on older versions of libraries, which may cause compatibility issues
Not actively maintained, with the last update several years ago

Code Examples

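Running detection on a demo image (the network itself is loaded separately with caffe.Net):

import _init_paths  # in tools/; puts the bundled Caffe and lib/ on sys.path
from fast_rcnn.config import cfg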
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms
from utils.timer import Timer
import matplotlib.pyplot as plt
import numpy as np
import scipy.io as sio
import caffe, os, sys, cv2
import argparse

NETS = {'vgg16': ('VGG16',
                  'VGG16_faster_rcnn_final.caffemodel'),
        'zf': ('ZF',
                  'ZF_faster_rcnn_final.caffemodel')}

def demo(net, image_name):
    """Detect object classes in an image; proposals come from the network's RPN."""

    # Load the demo image
    im_file = os.path.join(cfg.DATA_DIR, 'demo', image_name)
    im = cv2.imread(im_file)

    # Detect all object classes and regress object bounds
    timer = Timer()
    timer.tic()
    scores, boxes = im_detect(net, im)
    timer.toc()
    print('Detection took {:.3f}s for {:d} object proposals'.format(timer.total_time, boxes.shape[0]))
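
The nms import above is used after im_detect to prune overlapping boxes for each class. As a rough illustration of what such a step does, here is a pure-Python sketch of greedy non-maximum suppression. The function name greedy_nms is hypothetical, and the inclusive-pixel (+1) area convention is an assumption; the repository's own Cython and GPU implementations may differ in detail.

```python
def greedy_nms(dets, thresh):
    """Greedy non-maximum suppression (illustrative sketch).

    dets: list of [x1, y1, x2, y2, score] rows.
    Returns the indices of the rows to keep, highest score first.
    """
    # Box areas, assuming inclusive pixel coordinates (hence the +1).
    areas = [(x2 - x1 + 1) * (y2 - y1 + 1) for x1, y1, x2, y2, _ in dets]
    # Candidate indices sorted by descending score.
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)

    keep = []
    while order:
        i = order.pop(0)          # best remaining box
        keep.append(i)
        remaining = []
        for j in order:
            # Width and height of the intersection of box i and box j.
            w = max(0.0, min(dets[i][2], dets[j][2]) - max(dets[i][0], dets[j][0]) + 1)
            h = max(0.0, min(dets[i][3], dets[j][3]) - max(dets[i][1], dets[j][1]) + 1)
            inter = w * h
            iou = inter / (areas[i] + areas[j] - inter)
            # Keep box j as a candidate only if it does not overlap box i too much.
            if iou <= thresh:
                remaining.append(j)
        order = remaining
    return keep


dets = [
    [10, 10, 50, 50, 0.90],
    [12, 12, 52, 52, 0.80],      # overlaps the first box heavily
    [100, 100, 150, 150, 0.70],  # separate object
]
kept = greedy_nms(dets, thresh=0.3)
print(kept)
```

The surviving rows have the same five-column layout that vis_detections expects, with the score in the last column.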

Visualizing detection results:

def vis_detections(im, class_name, dets, thresh=0.5):
    """Draw detected bounding boxes."""
    inds = np.where(dets[:, -1] >= thresh)[0]
    if len(inds) == 0:
        return

    im = im[:, :, (2, 1, 0)]
    fig, ax = plt.subplots(figsize=(12, 12))
    ax.imshow(im, aspect='equal')
    for i in inds:
        bbox = dets[i, :4]
        score = dets[i, -1]

        ax.add_patch(
            plt.Rectangle((bbox[0], bbox[1]),
                          bbox[2] - bbox[0],
                          bbox[3] - bbox[1], fill=False,
                          edgecolor='red', linewidth=3.5)
            )
        ax.text(bbox[0], bbox[1] - 2,
                '{:s} {:.3f}'.format(class_name, score),
                bbox=dict(facecolor='blue', alpha=0.5),
                fontsize=14, color='white')

    ax.set_title(('{} detections with '
                  'p({} | box) >= {:.1f}').format(class_name, class_name,
                                                  thresh),
                  fontsize=14)
    plt.axis('off')
    plt.tight_layout()
    plt.draw()

Getting Started

Clone the repository:

git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git

Build Cython modules (FRCN_ROOT is the directory you cloned the repository into):
```
cd $FRCN_ROOT/lib
make
```

Download pre-trained models:

./data/scripts/fetch_faster_rcnn_models.sh

Run the demo:
```
./tools/demo.py
```

Note: Make

Competitor Comparisons

Detectron

26,363

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Pros of Detectron

More comprehensive, supporting a wider range of object detection algorithms
Better performance and faster training times on large datasets
Actively maintained with regular updates and improvements

Cons of Detectron

Steeper learning curve due to its complexity and extensive features
Requires more computational resources for training and inference
Less suitable for quick prototyping or smaller projects

Code Comparison

py-faster-rcnn:

from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms

scores, boxes = im_detect(net, im)
cls_boxes = boxes[:, 4*cls:4*(cls + 1)]

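Detectron (this snippet uses the API of its successor, Detectron2):

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

config_path = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(config_path))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_path)  # load the trained detector weights
predictor = DefaultPredictor(cfg)
outputs = predictor(image)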

Both repositories focus on object detection, but Detectron offers a more modern and flexible approach with its modular design and extensive configuration options. py-faster-rcnn is simpler and more straightforward, making it easier to understand and modify for specific use cases. However, Detectron's broader feature set and active development make it a more powerful choice for advanced applications and research.
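
The slice boxes[:, 4*cls:4*(cls + 1)] in the py-faster-rcnn snippet works because im_detect returns one score per class and four box coordinates per class for every proposal. The sketch below shows that layout with plain Python lists standing in for the NumPy arrays; the helper name detections_for_class and the toy values are made up for illustration.

```python
def detections_for_class(scores, boxes, cls):
    """Build [x1, y1, x2, y2, score] rows for one class index.

    scores: R rows of K class scores.
    boxes:  R rows of 4 * K coordinates (four per class, in class order).
    """
    dets = []
    for score_row, box_row in zip(scores, boxes):
        # Columns 4*cls .. 4*cls+3 hold this class's box for the proposal.
        x1, y1, x2, y2 = box_row[4 * cls:4 * (cls + 1)]
        dets.append([x1, y1, x2, y2, score_row[cls]])
    return dets


# Toy data: R = 2 proposals, K = 3 classes (index 0 is background).
scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
]
boxes = [
    [0, 0, 0, 0,   10, 20, 30, 40,   11, 21, 31, 41],
    [0, 0, 0, 0,   50, 60, 70, 80,   51, 61, 71, 81],
]
dets = detections_for_class(scores, boxes, cls=1)
print(dets)
```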

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

Implements instance segmentation in addition to object detection
Built on more modern deep learning frameworks (TensorFlow, Keras)
Actively maintained with regular updates and improvements

Cons of Mask_RCNN

Higher computational requirements due to additional segmentation task
Potentially more complex to set up and use for simpler object detection tasks

Code Comparison

Mask_RCNN (using Keras):

model = modellib.MaskRCNN(mode="training", config=config, model_dir=MODEL_DIR)
model.load_weights(COCO_WEIGHTS_PATH, by_name=True, exclude=["mrcnn_class_logits", "mrcnn_bbox_fc", "mrcnn_bbox", "mrcnn_mask"])
model.train(dataset_train, dataset_val, learning_rate=config.LEARNING_RATE, epochs=30, layers='heads')

py-faster-rcnn (using Caffe):

solver = caffe.SGDSolver(solver_prototxt)
solver.net.copy_from(pretrained_model)
solver.step(80000)

The Mask_RCNN code showcases a more intuitive API with Keras, while py-faster-rcnn uses the older Caffe framework. Mask_RCNN provides more flexibility in model configuration and training process, whereas py-faster-rcnn follows a more rigid structure typical of Caffe-based implementations.

models

77,618

Models and examples built with TensorFlow

Pros of models

Broader scope: Covers a wide range of machine learning models and applications
Active development: Regularly updated with new features and improvements
TensorFlow integration: Seamlessly works with the TensorFlow ecosystem

Cons of models

Complexity: May be overwhelming for users focused solely on object detection
Learning curve: Requires familiarity with TensorFlow and its conventions
Resource intensive: Can be more demanding in terms of computational resources

Code comparison

models:

import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as viz_utils

model = tf.saved_model.load('path/to/saved_model')
detect_fn = model.signatures['serving_default']

py-faster-rcnn:

import _init_paths
import caffe
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms

net = caffe.Net(prototxt, caffemodel, caffe.TEST)

The models repository offers a more modern, TensorFlow-based approach, while py-faster-rcnn uses an older Caffe-based implementation. models provides a more comprehensive toolkit for various machine learning tasks, whereas py-faster-rcnn is specifically focused on the Faster R-CNN object detection algorithm.

mmdetection

31,487

OpenMMLab Detection Toolbox and Benchmark

Pros of mmdetection

Supports a wider range of object detection algorithms and models
More actively maintained with frequent updates and contributions
Provides comprehensive documentation and tutorials

Cons of mmdetection

Steeper learning curve due to its extensive features and configurations
Requires more computational resources for training and inference

Code Comparison

py-faster-rcnn:

from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms

scores, boxes = im_detect(net, im)
cls_boxes = boxes[:, 4*cls:4*(cls + 1)]

mmdetection:

from mmdet.apis import init_detector, inference_detector

config_file = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth'
model = init_detector(config_file, checkpoint_file, device='cuda:0')
result = inference_detector(model, img)

mmdetection offers a more modular and flexible approach, allowing easier customization and integration of various detection algorithms. py-faster-rcnn focuses specifically on the Faster R-CNN algorithm, providing a more straightforward implementation for that particular method.

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

Faster inference speed and real-time object detection capabilities
More user-friendly with better documentation and community support
Supports a wider range of deployment options (e.g., mobile, edge devices)

Cons of YOLOv5

May have slightly lower accuracy compared to Faster R-CNN in some scenarios
Requires more data augmentation and careful hyperparameter tuning

Code Comparison

YOLOv5:

import torch

# Load the small YOLOv5 model through PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('image.jpg')
results.show()

py-faster-rcnn:

import _init_paths
from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms
scores, boxes = im_detect(net, im)

YOLOv5 offers a more straightforward API for object detection tasks, while py-faster-rcnn requires more setup and configuration. YOLOv5's simplicity makes it easier to integrate into projects, but py-faster-rcnn may offer more flexibility for advanced users.

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

More comprehensive and up-to-date implementation of object detection algorithms
Better performance and faster training times
Modular design allowing easier customization and extension

Cons of Detectron2

Steeper learning curve due to more complex architecture
Requires more computational resources for training and inference
Less suitable for simpler projects or quick prototyping

Code Comparison

py-faster-rcnn:

from fast_rcnn.config import cfg
from fast_rcnn.test import im_detect
from fast_rcnn.nms_wrapper import nms

scores, boxes = im_detect(net, im)
cls_boxes = boxes[:, 4*cls:4*(cls + 1)]

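Detectron2:

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

config_path = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(config_path))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_path)  # load the trained detector weights
predictor = DefaultPredictor(cfg)
outputs = predictor(image)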

The code comparison shows that Detectron2 has a more streamlined API for configuration and prediction, while py-faster-rcnn requires more manual setup. Detectron2's approach is more user-friendly and consistent across different models.

README

py-faster-rcnn has been deprecated. Please see Detectron, which includes an implementation of Mask R-CNN.

Disclaimer

The official Faster R-CNN code (written in MATLAB) is available at https://github.com/ShaoqingRen/faster_rcnn. If your goal is to reproduce the results in our NIPS 2015 paper, please use the official code.

This repository contains a Python reimplementation of the MATLAB code. This Python implementation is built on a fork of Fast R-CNN. There are slight differences between the two implementations. In particular, this Python port

is ~10% slower at test-time, because some operations execute on the CPU in Python layers (e.g., 220ms / image vs. 200ms / image for VGG16)
gives similar, but not exactly the same, mAP as the MATLAB version
is not compatible with models trained using the MATLAB code due to the minor implementation differences
includes approximate joint training that is 1.5x faster than alternating optimization (for VGG16) -- see these slides for more information
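
To make the "~10% slower" figure concrete, the short calculation below reworks the two per-image timings quoted in the list above. The variable names are illustrative only, and the numbers are the ones stated here, not new measurements.

```python
matlab_ms = 200.0   # per-image test time quoted for the MATLAB code (VGG16)
python_ms = 220.0   # per-image test time quoted for this Python port (VGG16)

# Relative slowdown of the Python port at test time.
slowdown = python_ms / matlab_ms - 1.0

# The same timings expressed as images per second.
matlab_fps = 1000.0 / matlab_ms
python_fps = 1000.0 / python_ms

print("slowdown: {:.0%}".format(slowdown))
print("images/s: MATLAB {:.2f}, Python {:.2f}".format(matlab_fps, python_fps))
```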

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

By Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun (Microsoft Research)

This Python implementation contains contributions from Sean Bell (Cornell) written during an MSR internship.

Please see the official README.md for more details.

Faster R-CNN was initially described in an arXiv tech report and was subsequently published in NIPS 2015.

License

Faster R-CNN is released under the MIT License (refer to the LICENSE file for details).

Citing Faster R-CNN

If you find Faster R-CNN useful in your research, please consider citing:

@inproceedings{renNIPS15fasterrcnn,
    Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
    Title = {Faster {R-CNN}: Towards Real-Time Object Detection
             with Region Proposal Networks},
    Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
    Year = {2015}
}

Contents

Requirements: software
Requirements: hardware
Basic installation
Demo
Beyond the demo: training and testing
Usage

Requirements: software

NOTE: If you are having issues compiling and you are using a recent version of CUDA/cuDNN, please consult this issue for a workaround.

Requirements for Caffe and pycaffe (see: Caffe installation instructions)

Note: Caffe must be built with support for Python layers!

# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# Unrelatedly, it's also recommended that you use CUDNN
USE_CUDNN := 1

You can download my Makefile.config for reference.
Python packages you might not have: cython, python-opencv, easydict
[Optional] MATLAB is required for official PASCAL VOC evaluation only. The code now includes unofficial Python evaluation code.
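
Before building Caffe, it can help to confirm that the Python-layer flag really is uncommented in Makefile.config. The following is a minimal sketch, not part of the repository: the helper name enabled_flags is hypothetical, and it only understands simple `NAME := value` lines.

```python
import re


def enabled_flags(makefile_text):
    """Return {NAME: value} for uncommented `NAME := value` lines."""
    flags = {}
    for line in makefile_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and lines that are commented out.
        if not stripped or stripped.startswith("#"):
            continue
        match = re.match(
            r"^([A-Za-z_][A-Za-z0-9_]*)\s*:=\s*(.*?)\s*(?:#.*)?$", stripped
        )
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


# Sample text; in practice, read the contents of your own Makefile.config.
sample = """\
# In your Makefile.config, make sure to have this line uncommented
WITH_PYTHON_LAYER := 1
# USE_CUDNN := 1
"""

flags = enabled_flags(sample)
python_layers_ok = flags.get("WITH_PYTHON_LAYER") == "1"
cudnn_ok = flags.get("USE_CUDNN") == "1"
print("Python layers enabled:", python_layers_ok)
print("cuDNN enabled:", cudnn_ok)
```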

Requirements: hardware

For training smaller networks (ZF, VGG_CNN_M_1024) a good GPU (e.g., Titan, K20, K40, ...) with at least 3G of memory suffices
For training Fast R-CNN with VGG16, you'll need a K40 (~11G of memory)
For training the end-to-end version of Faster R-CNN with VGG16, 3G of GPU memory is sufficient (using CUDNN)

Installation (sufficient for the demo)

Clone the Faster R-CNN repository

# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/py-faster-rcnn.git

We'll call the directory that you cloned Faster R-CNN into FRCN_ROOT

Ignore notes 1 and 2 if you followed step 1 above.

Note 1: If you didn't clone Faster R-CNN with the --recursive flag, then you'll need to manually clone the caffe-fast-rcnn submodule:
```
git submodule update --init --recursive
```
Note 2: The caffe-fast-rcnn submodule needs to be on the faster-rcnn branch (or equivalent detached state). This will happen automatically if you followed step 1 instructions.

Build the Cython modules
```
cd $FRCN_ROOT/lib
make
```

Build Caffe and pycaffe

cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
#   http://caffe.berkeleyvision.org/installation.html

# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe

Download pre-computed Faster R-CNN detectors
```
cd $FRCN_ROOT
./data/scripts/fetch_faster_rcnn_models.sh
```
This will populate the $FRCN_ROOT/data folder with faster_rcnn_models. See data/README.md for details. These models were trained on VOC 2007 trainval.

Demo

After successfully completing basic installation, you'll be ready to run the demo.

To run the demo

cd $FRCN_ROOT
./tools/demo.py

The demo performs detection using a VGG16 network trained for detection on PASCAL VOC 2007.

Beyond the demo: installation for training and testing models

Download the training, validation, test data and VOCdevkit

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar

Extract all of these tars into one directory named VOCdevkit

tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar

It should have this basic structure

$VOCdevkit/                           # development kit
$VOCdevkit/VOCcode/                   # VOC utility code
$VOCdevkit/VOC2007                    # image sets, annotations, etc.
# ... and several other directories ...
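
A quick way to confirm that the tars were extracted into the expected layout is to check for the two subdirectories named above. This is a small sketch under that assumption; the function name missing_voc_dirs is hypothetical, and the temporary directory merely stands in for a real VOCdevkit path.

```python
import tempfile
from pathlib import Path

# Subdirectories listed in the structure above.
EXPECTED_DIRS = ("VOCcode", "VOC2007")


def missing_voc_dirs(devkit_root):
    """Return the expected subdirectories that are absent under devkit_root."""
    root = Path(devkit_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


# Demonstration on a throwaway directory that contains only VOCcode.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "VOCcode").mkdir()
    missing = missing_voc_dirs(tmp)

print("missing:", missing)
```

An empty list means both directories are present and the symlink step can proceed.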

Create symlinks for the PASCAL VOC dataset
```
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
```
Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
[Optional] follow similar steps to get PASCAL VOC 2010 and 2012
[Optional] If you want to use COCO, please see some notes under data/README.md
Follow the next sections to download pre-trained ImageNet models

Download pre-trained ImageNet models

Pre-trained ImageNet models can be downloaded for the two networks described in the paper: ZF and VGG16.

cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh

VGG16 comes from the Caffe Model Zoo, but is provided here for your convenience. ZF was trained at MSRA.

Usage

To train and test a Faster R-CNN detector using the alternating optimization algorithm from our NIPS 2015 paper, use experiments/scripts/faster_rcnn_alt_opt.sh. Output is written underneath $FRCN_ROOT/output.

cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

("alt opt" refers to the alternating optimization training algorithm described in the NIPS paper.)

To train and test a Faster R-CNN detector using the approximate joint training method, use experiments/scripts/faster_rcnn_end2end.sh. Output is written underneath $FRCN_ROOT/output.

cd $FRCN_ROOT
./experiments/scripts/faster_rcnn_end2end.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701

This method trains the RPN module jointly with the Fast R-CNN network, rather than alternating between training the two. It results in faster (~ 1.5x speedup) training times and similar detection accuracy. See these slides for more details.
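
Both scripts accept `--set` followed by alternating keys and values, as in `--set EXP_DIR seed_rng1701 RNG_SEED 1701`. The sketch below shows one plausible way such a flat list can be turned into typed overrides; the function name parse_set_options is hypothetical, and the repository's own handling in fast_rcnn.config may differ (for example, in how it validates keys and types).

```python
from ast import literal_eval


def parse_set_options(pairs):
    """Turn ['KEY', 'value', 'KEY2', 'value2', ...] into a dict of overrides."""
    if len(pairs) % 2 != 0:
        raise ValueError("--set expects KEY VALUE pairs")
    options = {}
    for key, raw in zip(pairs[0::2], pairs[1::2]):
        try:
            # Interpret numbers, booleans, lists, etc. as Python literals.
            value = literal_eval(raw)
        except (ValueError, SyntaxError):
            # Anything that is not a literal stays a plain string.
            value = raw
        options[key] = value
    return options


overrides = parse_set_options(["EXP_DIR", "seed_rng1701", "RNG_SEED", "1701"])
print(overrides)
```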

Artifacts generated by the scripts in tools are written to the output directory ($FRCN_ROOT/output).

Trained Fast R-CNN networks are saved under:

output/<experiment directory>/<dataset name>/

Test outputs are saved under:

output/<experiment directory>/<dataset name>/<network snapshot name>/
