mmsegmentation

OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Top Related Projects

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

models

77,618

Models and examples built with TensorFlow

vision

17,046

Datasets, Transforms and Models specific to Computer Vision

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

transformers

146,142

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

computervision-recipes

9,732

Best Practices, code samples, and documentation for Computer Vision.

Quick Overview

MMSegmentation is an open-source semantic segmentation toolbox based on PyTorch. It is a part of the OpenMMLab project and provides a comprehensive set of tools for training, testing, and deploying various semantic segmentation models. The library supports a wide range of popular architectures and datasets, making it a versatile choice for researchers and practitioners in computer vision.

Pros

Extensive model zoo with state-of-the-art architectures
Modular design allowing easy customization and extension
Comprehensive documentation and tutorials
Active community support and regular updates

Cons

Steep learning curve for beginners
Requires significant computational resources for training large models
Limited support for real-time inference on edge devices
Some advanced features may require in-depth knowledge of PyTorch

Code Examples

Loading a pre-trained model and performing inference (MMSegmentation 0.x API; in 1.x the equivalents are init_model and inference_model):

from mmseg.apis import inference_segmentor, init_segmentor

config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo/demo.png')

Training a custom model:

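from mmcv import Config
from mmseg.apis import train_segmentor
from mmseg.datasets import build_dataset
from mmseg.models import build_segmentor

# Load the experiment config (0.x API) and set the runtime fields the trainer reads.
# The work_dir path below is only an example.
cfg = Config.fromfile('configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py')
cfg.work_dir = './work_dirs/pspnet'
cfg.gpu_ids = [0]
cfg.seed = 0
cfg.device = 'cuda'

# Build dataset
datasets = [build_dataset(cfg.data.train)]

# Build the model and attach the class names for logging and visualization
model = build_segmentor(cfg.model)
model.CLASSES = datasets[0].CLASSES

# Train the model
train_segmentor(model, datasets, cfg, distributed=False, validate=True)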

Evaluating a model on a dataset:

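from mmcv import Config
from mmcv.parallel import MMDataParallel
from mmseg.apis import init_segmentor, single_gpu_test
from mmseg.datasets import build_dataloader, build_dataset

cfg = Config.fromfile('configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py')
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

# Build the test dataset and a non-shuffled loader over it
dataset = build_dataset(cfg.data.test)
data_loader = build_dataloader(dataset, samples_per_gpu=1, workers_per_gpu=2, dist=False, shuffle=False)

# Load the trained weights, then wrap the model for single-GPU testing
model = init_segmentor(cfg, checkpoint_file, device='cuda:0')
model = MMDataParallel(model, device_ids=[0])
outputs = single_gpu_test(model, data_loader, show=False)
print(dataset.evaluate(outputs, metric='mIoU'))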
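Evaluation of semantic segmentation is usually reported as mean Intersection over Union (mIoU). The following is a minimal pure-Python sketch of that metric, not the library's implementation: the function name mean_iou, the nested-list label maps, and the ignore value 255 are assumptions chosen for illustration.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Return (per-class IoU list, mean IoU) for two equally sized label maps."""
    intersect = [0] * num_classes
    pred_area = [0] * num_classes
    target_area = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_index:
                continue  # unlabeled pixels are excluded from the metric
            pred_area[p] += 1
            target_area[t] += 1
            if p == t:
                intersect[p] += 1
    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersect[c]
        # A class absent from both maps has no defined IoU
        ious.append(intersect[c] / union if union else None)
    valid = [v for v in ious if v is not None]
    return ious, (sum(valid) / len(valid) if valid else float('nan'))


# Toy 2x3 label maps; the last ground-truth pixel is unlabeled
pred = [[0, 0, 1],
        [1, 1, 2]]
target = [[0, 1, 1],
          [1, 1, 255]]
ious, miou = mean_iou(pred, target, num_classes=3)
print(ious, miou)
```

Averaging only over classes that occur keeps a class missing from a small test set from dragging the mean toward zero.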

Getting Started

Install MMSegmentation:

pip install openmim
mim install mmengine
mim install mmcv
mim install mmsegmentation

Download a configuration file and checkpoint:

mim download mmsegmentation --config pspnet_r50-d8_4xb2-40k_cityscapes-512x1024 --dest .

Run inference on an image (1.x API, matching the mmengine-based install above):

from mmseg.apis import inference_model, init_model

config_file = 'pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
img = 'demo.png'

# Build the model from the config and load the checkpoint weights
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, img)
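A semantic segmentation prediction is, in essence, a 2D map of class indices, one per pixel. As a rough sketch of how such a map might be summarized, the snippet below computes the fraction of the image covered by each class. It assumes the prediction has already been converted to a nested list; the exact accessor on result depends on the MMSegmentation version and is not shown. The helper name class_coverage and the toy data are hypothetical.

```python
from collections import Counter


def class_coverage(label_map, class_names):
    """Return {class name: fraction of pixels} for a 2D map of class indices."""
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts.values())
    return {class_names[idx]: n / total for idx, n in sorted(counts.items())}


# Toy 2x4 prediction using the first three Cityscapes classes
label_map = [[0, 0, 2, 2],
             [0, 1, 2, 2]]
class_names = ['road', 'sidewalk', 'building']
coverage = class_coverage(label_map, class_names)
print(coverage)
```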

Competitor Comparisons

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

More comprehensive object detection and instance segmentation capabilities
Faster training and inference times due to optimized CUDA implementations
Extensive pre-trained model zoo for various tasks

Cons of Detectron2

Steeper learning curve for beginners due to its complexity
Less focus on semantic segmentation compared to MMSegmentation
Requires more computational resources for training and inference

Code Comparison

MMSegmentation:

from mmseg.apis import inference_segmentor, init_segmentor

config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo/demo.png')

Detectron2:

import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
predictor = DefaultPredictor(cfg)
img = cv2.imread("demo/demo.png")  # DefaultPredictor expects a BGR image array by default
outputs = predictor(img)

models

77,618

Models and examples built with TensorFlow

Pros of models

Broader scope, covering various ML tasks beyond segmentation
Official TensorFlow repository with extensive documentation
Large community and ecosystem support

Cons of models

Less specialized for segmentation tasks
Potentially more complex to navigate for specific use cases
May require more setup and configuration for segmentation projects

Code comparison

mmsegmentation:

from mmseg.apis import inference_segmentor, init_segmentor

config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo/demo.png')

models:

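import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import Conv2D, UpSampling2D

num_classes = 19  # e.g. the 19 Cityscapes evaluation classes

# ImageNet-pretrained ResNet-50 backbone with a small convolutional segmentation head
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(None, None, 3))
x = Conv2D(256, (3, 3), padding='same', activation='relu')(base_model.output)
x = UpSampling2D((2, 2))(x)
outputs = Conv2D(num_classes, (1, 1), activation='softmax')(x)
model = tf.keras.Model(inputs=base_model.input, outputs=outputs)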

vision

17,046

Datasets, Transforms and Models specific to Computer Vision

Pros of vision

Broader scope, covering various computer vision tasks beyond segmentation
Tighter integration with PyTorch ecosystem and core functionality
More extensive documentation and tutorials for beginners

Cons of vision

Less specialized for segmentation tasks compared to mmsegmentation
Fewer pre-trained models specifically for semantic segmentation
May require more custom code for advanced segmentation workflows

Code Comparison

mmsegmentation example:

from mmseg.apis import inference_segmentor, init_segmentor

config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo.png')

vision example:

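import torch
from PIL import Image
from torchvision.models.segmentation import FCN_ResNet50_Weights, fcn_resnet50

# `pretrained=True` is deprecated since torchvision 0.13; pass `weights` instead
weights = FCN_ResNet50_Weights.DEFAULT
model = fcn_resnet50(weights=weights)
model.eval()

input_image = Image.open('demo.png').convert('RGB')
preprocess = weights.transforms()  # resizing and normalization matching the pretrained weights
input_batch = preprocess(input_image).unsqueeze(0)

# Inference only, so skip gradient tracking
with torch.no_grad():
    output = model(input_batch)['out'][0]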

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

Simpler architecture and easier to use for object detection tasks
Faster inference speed, especially on edge devices
More extensive documentation and community support

Cons of YOLOv5

Focused on object detection (instance segmentation and classification were added in v7.0), so less versatile than MMSegmentation for semantic segmentation
May have lower accuracy for complex scenes compared to more advanced models
Less flexibility in terms of model customization and experimentation

Code Comparison

YOLOv5:

import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('image.jpg')
results.print()

MMSegmentation:

from mmseg.apis import inference_segmentor, init_segmentor
config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'test.jpg')

YOLOv5 focuses on simplicity and speed for object detection, while MMSegmentation offers a more comprehensive framework for various segmentation tasks with greater flexibility and customization options.
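The difference in output between the two tools can be made concrete: a detector returns boxes, while a segmentor returns a per-pixel label map, from which a box can be derived but not the other way round. The sketch below illustrates this with a hypothetical helper, mask_to_box, on a toy nested-list label map. It is an illustration only and is not part of either library.

```python
def mask_to_box(label_map, class_id):
    """Return (x_min, y_min, x_max, y_max) covering class_id, or None if absent."""
    xs, ys = [], []
    for y, row in enumerate(label_map):
        for x, label in enumerate(row):
            if label == class_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


# Toy 3x4 label map: class 1 occupies an L-shaped region
label_map = [[0, 0, 0, 0],
             [0, 1, 1, 0],
             [0, 1, 0, 0]]
box = mask_to_box(label_map, class_id=1)
print(box)
```

The box keeps only the extent of the region; the L shape itself is lost, which is the information a segmentation model preserves.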

transformers

146,142

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Pros of transformers

Broader scope, covering text, vision, audio, and multimodal tasks beyond segmentation
Larger community and more frequent updates
Extensive documentation and tutorials

Cons of transformers

Less specialized for image segmentation tasks
Potentially steeper learning curve for segmentation-specific applications
May require more customization for optimal segmentation performance

Code comparison

mmsegmentation:

from mmseg.apis import inference_segmentor, init_segmentor

config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo/demo.png')

transformers:

from transformers import SegformerForSemanticSegmentation, SegformerImageProcessor
from PIL import Image

# SegformerImageProcessor replaces the deprecated SegformerFeatureExtractor
image_processor = SegformerImageProcessor.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
model = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")

image = Image.open("path/to/image.jpg")
inputs = image_processor(images=image, return_tensors="pt")
outputs = model(**inputs)

computervision-recipes

9,732

Best Practices, code samples, and documentation for Computer Vision.

Pros of computervision-recipes

Broader scope covering multiple computer vision tasks
Includes Jupyter notebooks for easy experimentation and learning
Provides end-to-end examples for various scenarios

Cons of computervision-recipes

Less specialized in semantic segmentation compared to mmsegmentation
May have fewer state-of-the-art models for segmentation tasks
Potentially slower development cycle for cutting-edge segmentation techniques

Code Comparison

mmsegmentation:

from mmseg.apis import inference_segmentor, init_segmentor

config_file = 'configs/pspnet/pspnet_r50-d8_512x1024_40k_cityscapes.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'demo/demo.png')

computervision-recipes:

from azureml.contrib.services.aml_request import rawhttp
from azureml.contrib.services.aml_response import AMLResponse
from PIL import Image
import numpy as np

@rawhttp
def run(request):
    image = Image.open(request.files["image"].stream)
    # Perform segmentation; `model` is assumed to be a global loaded elsewhere in the scoring script
    result = model.predict(np.array(image))
    return AMLResponse(result.tolist(), 200)

README

Documentation: https://mmsegmentation.readthedocs.io/en/latest/

Introduction

MMSegmentation is an open source semantic segmentation toolbox based on PyTorch. It is a part of the OpenMMLab project.

The main branch works with PyTorch 1.6+.

Introducing MMSegmentation v1.0.0

We are thrilled to announce the official release of MMSegmentation's latest version! For this new release, the main branch serves as the primary branch, while the development branch is dev-1.x. The stable branch for the previous release remains as the 0.x branch. Please note that the master branch will only be maintained for a limited time before being removed. We encourage you to be mindful of branch selection and updates during use. Thank you for your unwavering support and enthusiasm, and let's work together to make MMSegmentation even more robust and powerful!

MMSegmentation v1.x brings remarkable improvements over the 0.x release, offering a more flexible and feature-packed experience. To utilize the new features in v1.x, we kindly invite you to consult our detailed migration guide, which will help you seamlessly transition your projects. Your support is invaluable, and we eagerly await your feedback!

Major features

Unified Benchmark

We provide a unified benchmark toolbox for various semantic segmentation methods.

Modular Design

We decompose the semantic segmentation framework into different components, so a customized framework can be constructed by combining different modules.

Support of multiple methods out of the box

The toolbox directly supports popular and contemporary semantic segmentation frameworks, e.g. PSPNet, DeepLabV3, PSANet, DeepLabV3+, etc.

High efficiency

The training speed is faster than or comparable to other codebases.
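The modular design described above is commonly realized with a registry: each component registers itself under a name, and a config dictionary with a type key selects which one to build. The following is a toy sketch of that idea, not the actual OpenMMLab registry. Every class here (Registry, ToyBackbone, ToyHead, ToySegmentor) is hypothetical.

```python
class Registry:
    """Maps a class name to the class so components can be chosen from a config."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)  # copy so the caller's config is not mutated
        cls = self._modules[cfg.pop('type')]
        return cls(**cfg)


BACKBONES = Registry('backbone')
HEADS = Registry('head')


@BACKBONES.register_module
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@HEADS.register_module
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


class ToySegmentor:
    """Combines independently configured components into one model object."""

    def __init__(self, backbone, decode_head):
        self.backbone = BACKBONES.build(backbone)
        self.decode_head = HEADS.build(decode_head)


# Swapping a component only requires changing its `type` entry in the config
model_cfg = dict(
    backbone=dict(type='ToyBackbone', depth=50),
    decode_head=dict(type='ToyHead', num_classes=19),
)
segmentor = ToySegmentor(**model_cfg)
print(type(segmentor.backbone).__name__, segmentor.decode_head.num_classes)
```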

What's New

v1.2.0 was released on 10/12/2023. From 1.1.0 to 1.2.0, we have added or updated the following features:

Highlights

Support for the open-vocabulary semantic segmentation algorithm SAN
Support for the monocular depth estimation task; please refer to VPD and Adabins for more details
New projects: the open-vocabulary semantic segmentation algorithm CAT-Seg and the real-time semantic segmentation algorithm PP-MobileSeg

Installation

Please refer to get_started.md for installation and dataset_prepare.md for dataset preparation.

Get Started

Please see Overview for the general introduction of MMSegmentation.

Please see the user guides for the basic usage of MMSegmentation. There are also advanced tutorials for an in-depth understanding of the mmseg design and implementation.

A Colab tutorial is also provided. You may preview the notebook here or directly run on Colab.

To migrate from MMSegmentation 0.x, please refer to migration.

Tutorial

MMSegmentation Tutorials

Get Started

MMSeg overview
MMSeg Installation
FAQ

MMSeg Basic Tutorial

Tutorial 1: Learn about Configs
Tutorial 2: Prepare datasets
Tutorial 3: Inference with existing models
Tutorial 4: Train and test with existing models
Tutorial 5: Model deployment
Deploy mmsegmentation on Jetson platform
Useful Tools
Feature Map Visualization
Visualization

MMSeg Detail Tutorial

MMSeg Dataset
MMSeg Models
MMSeg Dataset Structures
MMSeg Data Transforms
MMSeg Dataflow
MMSeg Training Engine
MMSeg Evaluation

MMSeg Development Tutorial

Add New Datasets
Add New Metrics
Add New Modules
Add New Data Transforms
Customize Runtime Settings
Training Tricks
Contribute code to MMSeg
Contribute a standard dataset in projects
NPU (HUAWEI Ascend)
0.x → 1.x migration
0.x → 1.x package

Benchmark and model zoo

Results and models are available in the model zoo.

Overview

Supported backbones

ResNet (CVPR'2016), ResNeXt (CVPR'2017), HRNet (CVPR'2019), ResNeSt (ArXiv'2020), MobileNetV2 (CVPR'2018), MobileNetV3 (ICCV'2019), Vision Transformer (ICLR'2021), Swin Transformer (ICCV'2021), Twins (NeurIPS'2021), BEiT (ICLR'2022), ConvNeXt (CVPR'2022), MAE (CVPR'2022), PoolFormer (CVPR'2022), SegNeXt (NeurIPS'2022)

Supported methods

SAN (CVPR'2023), VPD (ICCV'2023), DDRNet (T-ITS'2022), PIDNet (ArXiv'2022), Mask2Former (CVPR'2022), MaskFormer (NeurIPS'2021), K-Net (NeurIPS'2021), SegFormer (NeurIPS'2021), Segmenter (ICCV'2021), DPT (ArXiv'2021), SETR (CVPR'2021), STDC (CVPR'2021), BiSeNetV2 (IJCV'2021), CGNet (TIP'2020), PointRend (CVPR'2020), DNLNet (ECCV'2020), OCRNet (ECCV'2020), ISANet (ArXiv'2019/IJCV'2021), Fast-SCNN (ArXiv'2019), FastFCN (ArXiv'2019), GCNet (ICCVW'2019/TPAMI'2020), ANN (ICCV'2019), EMANet (ICCV'2019), CCNet (ICCV'2019), DMNet (ICCV'2019), Semantic FPN (CVPR'2019), DANet (CVPR'2019), APCNet (CVPR'2019), NonLocal Net (CVPR'2018), EncNet (CVPR'2018), DeepLabV3+ (CVPR'2018), UPerNet (ECCV'2018), ICNet (ECCV'2018), PSANet (ECCV'2018), BiSeNetV1 (ECCV'2018), DeepLabV3 (ArXiv'2017), PSPNet (CVPR'2017), ERFNet (T-ITS'2017), UNet (MICCAI'2016/Nat. Methods'2019), FCN (CVPR'2015/TPAMI'2017)

Supported Head

ANN_Head, APC_Head, ASPP_Head, CC_Head, DA_Head, DDR_Head, DM_Head, DNL_Head, DPT_HEAD, EMA_Head, ENC_Head, FCN_Head, FPN_Head, GC_Head, LightHam_Head, ISA_Head, Knet_Head, LRASPP_Head, mask2former_Head, maskformer_Head, NL_Head, OCR_Head, PID_Head, point_Head, PSA_Head, PSP_Head, SAN_Head, segformer_Head, segmenter_mask_Head, SepASPP_Head, SepFCN_Head, SETRMLAHead_Head, SETRUP_Head, STDC_Head, Uper_Head, VPDDepth_Head

Supported datasets

Cityscapes, PASCAL VOC, ADE20K, Pascal Context, COCO-Stuff 10k, COCO-Stuff 164k, CHASE_DB1, DRIVE, HRF, STARE, Dark Zurich, Nighttime Driving, LoveDA, Potsdam, Vaihingen, iSAID, Mapillary Vistas, LEVIR-CD, BDD100K, NYU, HSIDrive20

Other

Supported loss: boundary_loss, cross_entropy_loss, dice_loss, focal_loss, huasdorff_distance_loss, kldiv_loss, lovasz_loss, ohem_cross_entropy_loss, silog_loss, tversky_loss
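To give a feel for what one of the listed losses measures, here is a minimal sketch of a soft Dice loss for a single class, written in plain Python over flattened pixel lists. It illustrates the general formula under the assumption of an additive smoothing term. The toolbox's own dice_loss may differ in details such as smoothing, exponents, and class weighting.

```python
def dice_loss(probs, target, smooth=1.0):
    """Soft Dice loss for one class: 1 - (2 * |P ∩ T| + s) / (|P| + |T| + s)."""
    intersection = sum(p * t for p, t in zip(probs, target))
    denom = sum(probs) + sum(target)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


# Predicted foreground probabilities and binary ground truth for four pixels
probs = [0.9, 0.8, 0.1, 0.2]
target = [1, 1, 0, 0]
loss = dice_loss(probs, target)
perfect = dice_loss([1.0, 1.0, 0.0, 0.0], target)
print(loss, perfect)
```

Because the loss is driven by region overlap instead of per-pixel error, it is less sensitive to class imbalance than plain cross-entropy, which is one reason it is popular for small foreground structures.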

Please refer to FAQ for frequently asked questions.

Projects

Here are some implementations of SOTA models and solutions built on MMSegmentation, which are supported and maintained by community users. These projects demonstrate the best practices based on MMSegmentation for research and product development. We welcome and appreciate all the contributions to OpenMMLab ecosystem.

Contributing

We appreciate all contributions to improve MMSegmentation. Please refer to CONTRIBUTING.md for the contributing guideline.

Acknowledgement

MMSegmentation is an open source project that welcomes any contribution and feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible, standardized toolkit for reimplementing existing methods and developing new semantic segmentation methods.

Citation

If you find this project useful in your research, please consider citing:

@misc{mmseg2020,
    title={{MMSegmentation}: OpenMMLab Semantic Segmentation Toolbox and Benchmark},
    author={MMSegmentation Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmsegmentation}},
    year={2020}
}

License

This project is released under the Apache 2.0 license.

OpenMMLab Family

MMEngine: OpenMMLab foundational library for training deep learning models.
MMCV: OpenMMLab foundational library for computer vision.
MMPreTrain: OpenMMLab pre-training toolbox and benchmark.
MMagic: OpenMMLab Advanced, Generative and Intelligent Creation toolbox.
MMDetection: OpenMMLab detection toolbox and benchmark.
MMYOLO: OpenMMLab YOLO series toolbox and benchmark.
MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
MMRotate: OpenMMLab rotated object detection toolbox and benchmark.
MMTracking: OpenMMLab video perception toolbox and benchmark.
MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
MMOCR: OpenMMLab text detection, recognition, and understanding toolbox.
MMPose: OpenMMLab pose estimation toolbox and benchmark.
MMHuman3D: OpenMMLab 3D human parametric model toolbox and benchmark.
MMFewShot: OpenMMLab fewshot learning toolbox and benchmark.
MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
MMFlow: OpenMMLab optical flow toolbox and benchmark.
MMDeploy: OpenMMLab Model Deployment Framework.
MMRazor: OpenMMLab model compression toolbox and benchmark.
MIM: MIM installs OpenMMLab packages.
Playground: A central hub for gathering and showcasing amazing projects built upon OpenMMLab.
