MobileNet-SSD

Caffe implementation of Google MobileNet SSD detection network, with pretrained weights on VOC0712 and mAP=0.727.

Top Related Projects

models

77,497

Models and examples built with TensorFlow

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

darknet

22,006

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Mask_RCNN

25,093

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

yolov7

13,768

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Quick Overview

MobileNet-SSD is a GitHub repository that implements a lightweight object detection model combining MobileNet and Single Shot Detector (SSD) architectures. It's designed for efficient real-time object detection on mobile and embedded devices, offering a balance between speed and accuracy.

Pros

Lightweight and efficient, suitable for mobile and embedded devices
Provides good accuracy (mAP=0.727 on VOC0712) while maintaining real-time performance
Implements a popular and well-established object detection architecture
Includes pre-trained models for quick deployment

Cons

Limited documentation and usage instructions
Not actively maintained (last update was several years ago)
May not include the latest improvements in object detection techniques
Lacks extensive examples and use cases

Code Examples

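import cv2

# Load the MobileNet-SSD model
net = cv2.dnn.readNetFromCaffe('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel')

# Read the input image ('input.jpg' is a placeholder path) and prepare the blob:
# resized to 300x300, mean 127.5 subtracted, scaled by 0.007843
image = cv2.imread('input.jpg')
blob = cv2.dnn.blobFromImage(image, 0.007843, (300, 300), 127.5)

# Set the input and perform forward pass
net.setInput(blob)
detections = net.forward()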

This code snippet demonstrates how to load the MobileNet-SSD model and perform object detection on an input image.
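The scale factor 0.007843 is approximately 1/127.5, so together with the mean of 127.5 it maps 8-bit pixel values from [0, 255] to roughly [-1, 1]. The sketch below reproduces that arithmetic in plain Python for a single pixel value. `normalize_pixel` is a hypothetical helper written for illustration; it is not part of OpenCV or of the repository.

```python
# Illustrative only: mirrors the per-pixel arithmetic that
# cv2.dnn.blobFromImage applies, scalefactor * (pixel - mean).
SCALE = 0.007843  # approximately 1 / 127.5
MEAN = 127.5


def normalize_pixel(value, scale=SCALE, mean=MEAN):
    """Map an 8-bit pixel value to the network's input range."""
    return scale * (value - mean)


if __name__ == "__main__":
    for pixel in (0, 127.5, 255):
        print(pixel, "->", round(normalize_pixel(pixel), 4))
```

Feeding the network unnormalized pixels (for example, omitting the mean) would likely degrade detections, since the weights were trained on inputs in this range.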

import numpy as np

# Image size, used to scale the normalized box coordinates to pixels
(h, w) = image.shape[:2]

# Loop over detections and draw bounding boxes
for i in range(detections.shape[2]):
    confidence = detections[0, 0, i, 2]
    if confidence > 0.2:
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        cv2.rectangle(image, (startX, startY), (endX, endY), (0, 255, 0), 2)

This example shows how to process the detection results and draw bounding boxes around detected objects.
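The loop above uses only the confidence and box coordinates, but each detection row also carries a class index. The following sketch decodes rows into labelled pixel boxes using plain Python lists. It assumes each row is laid out as `[image_id, class_id, confidence, x1, y1, x2, y2]` with normalized coordinates, and that class indices follow the conventional 21-entry PASCAL VOC ordering; `decode_detections` and `VOC_CLASSES` are illustrative names, not part of the repository.

```python
# The 21 PASCAL VOC labels (background + 20 object classes), in the
# order conventionally used with VOC-trained SSD models (assumed here).
VOC_CLASSES = [
    "background", "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow", "diningtable", "dog", "horse",
    "motorbike", "person", "pottedplant", "sheep", "sofa", "train",
    "tvmonitor",
]


def decode_detections(rows, width, height, threshold=0.2):
    """Convert raw detection rows to (label, confidence, box) tuples.

    Each row is assumed to be
    [image_id, class_id, confidence, x1, y1, x2, y2]
    with coordinates normalized to the range [0, 1].
    """
    results = []
    for _, class_id, confidence, x1, y1, x2, y2 in rows:
        if confidence <= threshold:
            continue
        # Scale to pixels and clamp to the image bounds.
        box = (
            max(0, int(x1 * width)),
            max(0, int(y1 * height)),
            min(width - 1, int(x2 * width)),
            min(height - 1, int(y2 * height)),
        )
        results.append((VOC_CLASSES[int(class_id)], float(confidence), box))
    return results
```

With the OpenCV output from the earlier snippet, the rows would be `detections[0, 0].tolist()`.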

Getting Started

Clone the repository:

git clone https://github.com/chuanqi305/MobileNet-SSD.git

Download the pre-trained model files:
- MobileNetSSD_deploy.caffemodel
- MobileNetSSD_deploy.prototxt
Install dependencies:
```
pip install opencv-python numpy
```
Use the code examples provided above to implement object detection in your Python script.
Run your script with an input image to perform object detection using MobileNet-SSD.

Competitor Comparisons

models

77,497

Models and examples built with TensorFlow

Pros of TensorFlow Models

Comprehensive collection of models and implementations
Regularly updated with new research and state-of-the-art models
Extensive documentation and community support

Cons of TensorFlow Models

Larger repository size, potentially overwhelming for beginners
May require more setup and configuration for specific use cases

Code Comparison

MobileNet-SSD:

import caffe
net = caffe.Net('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel', caffe.TEST)

TensorFlow Models:

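import tensorflow as tf
from object_detection.utils import label_map_util
from object_detection.utils import visualization_utils as vis_util

# Path to a frozen inference graph (placeholder; point it at your exported model)
PATH_TO_FROZEN_GRAPH = 'frozen_inference_graph.pb'

# TF1-style graph loading; the tf.compat.v1 alias also works under TensorFlow 2.x
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(PATH_TO_FROZEN_GRAPH, 'rb') as fid:
        serialized_graph = fid.read()
        od_graph_def.ParseFromString(serialized_graph)
        tf.import_graph_def(od_graph_def, name='')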

The MobileNet-SSD repository focuses specifically on the MobileNet-SSD model implementation in Caffe, while TensorFlow Models provides a broader range of models and tools for various tasks. TensorFlow Models offers more flexibility and options but may require more setup. MobileNet-SSD is more straightforward for its specific use case but has limited scope compared to the extensive TensorFlow Models repository.

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

Higher accuracy and faster inference speed
More flexible architecture with support for various backbones
Extensive documentation and active community support

Cons of YOLOv5

Larger model size, potentially less suitable for mobile devices
More complex training process and hyperparameter tuning
Requires more computational resources for training

Code Comparison

MobileNet-SSD:

import caffe

net = caffe.Net('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel', caffe.TEST)

YOLOv5:

from models.yolo import Model
from utils.torch_utils import select_device

device = select_device('0')
model = Model('yolov5s.yaml', ch=3, nc=80).to(device)

The code snippets show the initialization process for both models. MobileNet-SSD uses Caffe, while YOLOv5 uses PyTorch, reflecting their different underlying frameworks.

darknet

22,006

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Pros of darknet

More comprehensive: Supports multiple object detection algorithms (YOLO, Tiny-YOLO, etc.)
Active development: Regularly updated with new features and improvements
Extensive documentation: Detailed guides and examples for various use cases

Cons of darknet

Higher complexity: Steeper learning curve for beginners
Resource-intensive: Requires more computational power for training and inference
Larger codebase: May be harder to customize or integrate into existing projects

Code Comparison

MobileNet-SSD:

import caffe
net = caffe.Net('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel', caffe.TEST)

darknet:

network *net = load_network("yolov3.cfg", "yolov3.weights", 0);

MobileNet-SSD focuses on a single, lightweight model for mobile devices, while darknet offers a broader range of object detection algorithms. MobileNet-SSD is simpler to use and deploy, but darknet provides more flexibility and advanced features for various applications. The code comparison shows the difference in initialization, with MobileNet-SSD using Caffe and darknet using its custom framework.

Mask_RCNN

25,093

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

Provides instance segmentation in addition to object detection
Generally achieves higher accuracy on complex scenes
Supports transfer learning and fine-tuning on custom datasets

Cons of Mask_RCNN

Slower inference speed, less suitable for real-time applications
Higher computational requirements and larger model size
More complex to implement and train

Code Comparison

MobileNet-SSD:

import caffe
net = caffe.Net('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel', caffe.TEST)

Mask_RCNN:

import mrcnn.model as modellib
model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
model.load_weights(COCO_MODEL_PATH, by_name=True)

MobileNet-SSD focuses on lightweight, efficient object detection, making it suitable for mobile and embedded devices. It sacrifices some accuracy for speed and smaller model size. Mask_RCNN, on the other hand, offers more comprehensive instance segmentation capabilities and higher accuracy but at the cost of increased computational requirements and slower inference speed. The choice between the two depends on the specific use case, balancing factors like accuracy, speed, and available computational resources.

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

More comprehensive and flexible object detection framework
Supports a wider range of models and tasks (e.g., instance segmentation, keypoint detection)
Active development and maintenance by Facebook AI Research

Cons of Detectron2

Higher computational requirements and complexity
Steeper learning curve for beginners
May be overkill for simpler object detection tasks

Code Comparison

MobileNet-SSD:

import caffe

net = caffe.Net('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel', caffe.TEST)

Detectron2:

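from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg

config_name = "COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(config_name))
# Load the matching pretrained weights from the model zoo
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)  # image: BGR array, e.g. from cv2.imread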

MobileNet-SSD is simpler and more lightweight, focusing specifically on mobile-friendly object detection. Detectron2 offers a more comprehensive framework with greater flexibility and features, but at the cost of increased complexity and resource requirements.

yolov7

13,768

Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors

Pros of YOLOv7

Higher accuracy and faster inference speed on various datasets
More flexible architecture with support for different backbones and head designs
Better performance on small object detection

Cons of YOLOv7

Larger model size, requiring more computational resources
More complex training process and hyperparameter tuning
May be overkill for simpler detection tasks or resource-constrained environments

Code Comparison

MobileNet-SSD:

import caffe
net = caffe.Net('MobileNetSSD_deploy.prototxt', 'MobileNetSSD_deploy.caffemodel', caffe.TEST)

YOLOv7:

from models.yolo import Model
model = Model(cfg='cfg/yolov7.yaml', ch=3, nc=num_classes)

Summary

YOLOv7 offers superior performance and flexibility compared to MobileNet-SSD, making it suitable for advanced object detection tasks. However, MobileNet-SSD remains a viable option for lightweight applications or devices with limited resources. The choice between the two depends on the specific requirements of the project, balancing accuracy, speed, and resource constraints.

README

MobileNet-SSD

A Caffe implementation of the MobileNet-SSD detection network, with pretrained weights on VOC0712 and mAP=0.727.

Network	mAP (%)	Download (training weights)	Download (deploy weights)
MobileNet-SSD	72.7	train	deploy

Run

Download SSD source code and compile (follow the SSD README).
Download the pretrained deploy weights from the link above.
Put all the files in SSD_HOME/examples/
Run demo.py to show the detection result.
Optionally, run merge_bn.py to generate a model without batch-normalization (BN) layers; it will be much faster.
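A no-BN model can be faster because a batch-normalization layer that follows a convolution is a fixed affine transform at inference time, so it can be folded into the convolution's weights and bias. The sketch below shows the general folding arithmetic for a single output channel in plain Python. It illustrates the idea only and is not the repository's merge_bn.py; the function names and the `eps` default are assumptions.

```python
import math


def fold_batchnorm(weight, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold BN statistics into one convolution output channel.

    weight: list of kernel weights for the channel
    bias:   conv bias for the channel (0.0 if the conv has none)
    Returns (folded_weight, folded_bias).
    """
    factor = gamma / math.sqrt(var + eps)
    folded_weight = [w * factor for w in weight]
    folded_bias = (bias - mean) * factor + beta
    return folded_weight, folded_bias


def conv(inputs, weight, bias):
    """Dot product of one receptive field with one channel's kernel."""
    return sum(x * w for x, w in zip(inputs, weight)) + bias


def batchnorm(y, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch normalization of a single activation."""
    return gamma * (y - mean) / math.sqrt(var + eps) + beta
```

For any input, `conv(x, *fold_batchnorm(...))` should match `batchnorm(conv(x, weight, bias), ...)` up to floating-point rounding, which is why the merged model gives the same detections with fewer layers to evaluate.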

Create LMDB for your own dataset

Place the Images directory and the Labels directory into the same directory. (Each image in the Images folder should have exactly one label file with the same name in the Labels folder.)
cd create_lmdb/code
Modify the labelmap.prototxt file according to your classes.
Modify the paths and directories in create_list.sh and create_data.sh as described in the comments in those files.
Run bash create_list.sh, which creates trainval.txt, test.txt and test_name_size.txt.
Run bash create_data.sh, which generates the LMDB in the Dataset directory.
Delete trainval.txt, test.txt and test_name_size.txt before creating the next LMDB.
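Because every image needs a label file with the same base name, it can help to check the pairing before running the scripts. The following is a minimal sketch of such a check; `find_unpaired` is a hypothetical helper, and it compares base names only, since the label file extension is not stated here.

```python
import os


def find_unpaired(image_files, label_files):
    """Return (images_without_label, labels_without_image).

    Both results are sorted lists of base names (extension removed).
    """
    image_stems = {os.path.splitext(os.path.basename(f))[0] for f in image_files}
    label_stems = {os.path.splitext(os.path.basename(f))[0] for f in label_files}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

A typical call would be `find_unpaired(os.listdir("Images"), os.listdir("Labels"))`; two empty lists mean every image has a matching label.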

LMDB Creation part is taken from https://github.com/jinfagang/kitti-ssd

Train your own dataset

Convert your own dataset to an LMDB database (follow the SSD README), and create symlinks to it in the current directory.

ln -s PATH_TO_YOUR_TRAIN_LMDB trainval_lmdb
ln -s PATH_TO_YOUR_TEST_LMDB test_lmdb

Create the labelmap.prototxt file and put it into the current directory.
Use gen_model.sh to generate your own training prototxt.
Download the training weights from the link above and run train.sh. After about 30000 iterations, the loss should be between 1.5 and 2.5.
Run test.sh to evaluate the result.
Run merge_bn.py to generate your own no-bn caffemodel if necessary.

python merge_bn.py --model example/MobileNetSSD_deploy.prototxt --weights snapshot/mobilenet_iter_xxxxxx.caffemodel
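For the labelmap.prototxt step, the file can be generated from a list of class names rather than written by hand. The sketch below assumes the `item { name, label, display_name }` layout used by the label maps in the Caffe SSD examples, with label 0 reserved for background; compare its output against an existing labelmap before relying on it. `make_labelmap` is an illustrative name.

```python
def make_labelmap(class_names):
    """Build labelmap.prototxt text; label 0 is reserved for background."""
    entries = [("none_of_the_above", 0, "background")]
    entries += [(name, index, name) for index, name in enumerate(class_names, start=1)]
    blocks = []
    for name, label, display_name in entries:
        blocks.append(
            "item {\n"
            f'  name: "{name}"\n'
            f"  label: {label}\n"
            f'  display_name: "{display_name}"\n'
            "}\n"
        )
    return "".join(blocks)


if __name__ == "__main__":
    # Example class names are placeholders for your own dataset.
    print(make_labelmap(["car", "person"]))
```

The number of entries (including background) has to agree with the class count used when generating the training prototxt.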

About some details

There are 2 primary differences between this model and MobileNet-SSD in TensorFlow:

The ReLU6 layer is replaced by ReLU.
For the conv11_mbox_prior layer, the anchors are [(0.2, 1.0), (0.2, 2.0), (0.2, 0.5)] vs. TensorFlow's [(0.1, 1.0), (0.2, 2.0), (0.2, 0.5)].
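Both differences are easy to see numerically. The sketch below assumes each anchor tuple is (scale, aspect ratio), the usual SSD parameterization in which the width is scale × sqrt(ratio) and the height is scale / sqrt(ratio), both as fractions of the image size. The helper names are illustrative.

```python
import math


def relu(x):
    """Standard ReLU, as used in this Caffe model."""
    return max(0.0, x)


def relu6(x):
    """ReLU capped at 6, as used in the TensorFlow MobileNet."""
    return min(max(0.0, x), 6.0)


def anchor_size(scale, aspect_ratio):
    """Width and height of an anchor, as fractions of the image size."""
    root = math.sqrt(aspect_ratio)
    return scale * root, scale / root


if __name__ == "__main__":
    print(relu(8.0), relu6(8.0))
    for name, anchors in (
        ("caffe", [(0.2, 1.0), (0.2, 2.0), (0.2, 0.5)]),
        ("tensorflow", [(0.1, 1.0), (0.2, 2.0), (0.2, 0.5)]),
    ):
        print(name, [anchor_size(s, r) for s, r in anchors])
```

Under this reading, the only anchor that differs is the square one: it covers 20% of the image side in this model versus 10% in the TensorFlow version.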

Reproduce the result

I trained this model from a MobileNet classifier (caffemodel and prototxt) converted from TensorFlow. I first trained the model on MS-COCO and then fine-tuned it on VOC0712. Without MS-COCO pretraining, it only reaches mAP=0.68.

Mobile Platform

You can run it on Android with my other project, rscnn.
