Top Related Projects
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Models and examples built with TensorFlow
Datasets, Transforms and Models specific to Computer Vision
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Best Practices, code samples, and documentation for Computer Vision.
Quick Overview
MMSegmentation is an open-source semantic segmentation toolbox based on PyTorch. It is a part of the OpenMMLab project and provides a comprehensive set of tools for training, testing, and deploying various semantic segmentation models. The library supports a wide range of popular architectures and datasets, making it a versatile choice for researchers and practitioners in computer vision.
Pros
- Extensive model zoo with state-of-the-art architectures
- Modular design allowing easy customization and extension
- Comprehensive documentation and tutorials
- Active community support and regular updates
Cons
- Steep learning curve for beginners
- Requires significant computational resources for training large models
- Limited support for real-time inference on edge devices
- Some advanced features may require in-depth knowledge of PyTorch
Code Examples
- Loading a pre-trained model and performing inference:
from mmseg.apis import inference_model, init_model
# MMSegmentation 1.x API; paths are relative to the repository root
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo/demo.png')
- Training a custom model:
from mmengine.config import Config
from mmengine.runner import Runner
# Load the config that defines the model, datasets, and training schedule
cfg = Config.fromfile('configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py')
# Directory for logs and checkpoints (example path)
cfg.work_dir = './work_dirs/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024'
# Build the runner from the config and train the model
runner = Runner.from_cfg(cfg)
runner.train()
- Evaluating a model on a dataset:
from mmengine.config import Config
from mmengine.runner import Runner
cfg = Config.fromfile('configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py')
# Directory for evaluation logs (example path)
cfg.work_dir = './work_dirs/pspnet_eval'
# Checkpoint to evaluate
cfg.load_from = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
# Build the runner and run the test loop defined in the config
runner = Runner.from_cfg(cfg)
metrics = runner.test()
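Semantic segmentation results are commonly summarized as mean Intersection over Union (mIoU). The following is a minimal plain-Python sketch of that metric, not MMSegmentation's own implementation: the function names, the flattened-list inputs, and the `ignore_index=255` default are assumptions made for illustration.

```python
def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Count (ground truth, prediction) pairs; rows are ground truth, columns are predictions."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels labeled with the ignore index do not count
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Return (per-class IoU list, mean IoU over classes that occur at least once)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        ious.append(tp / union if union else None)
    valid = [v for v in ious if v is not None]
    miou = sum(valid) / len(valid) if valid else 0.0
    return ious, miou


# Flattened toy prediction and ground truth for a 3-class problem
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
ious, miou = mean_iou(confusion_matrix(pred, target, num_classes=3))
print(ious, miou)
```

Accumulating one confusion matrix over the whole dataset and computing IoU at the end weights every pixel equally, which is the usual convention for dataset-level mIoU.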
Getting Started
- Install MMSegmentation:
pip install openmim
mim install mmengine
mim install mmcv
mim install mmsegmentation
- Download a configuration file and checkpoint:
mim download mmsegmentation --config pspnet_r50-d8_4xb2-40k_cityscapes-512x1024 --dest .
- Run inference on an image:
from mmseg.apis import inference_model, init_model
# Files downloaded by the previous step into the current directory
config_file = 'pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
img = 'demo.png'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, img)
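The prediction from the inference step is a per-pixel map of class indices. As a rough sketch of what can be done with such a map, the snippet below colorizes it and counts pixels per class. It assumes the prediction has already been converted to a nested list of integers (how that is extracted from the result object depends on the MMSegmentation version and is not shown), and the three-entry `PALETTE` is a hypothetical example.

```python
# Hypothetical palette: class index -> RGB color. Real datasets define their own.
PALETTE = {0: (128, 64, 128), 1: (244, 35, 232), 2: (70, 70, 70)}


def colorize(mask, palette, default=(0, 0, 0)):
    """Map a 2-D grid of class indices to a grid of RGB tuples."""
    return [[palette.get(label, default) for label in row] for row in mask]


def class_pixel_counts(mask):
    """Count how many pixels were assigned to each class index."""
    counts = {}
    for row in mask:
        for label in row:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Toy 2x3 prediction standing in for a real segmentation map
mask = [[0, 0, 1],
        [0, 2, 2]]
colored = colorize(mask, PALETTE)
counts = class_pixel_counts(mask)
print(colored[0][2], counts)
```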
Competitor Comparisons
Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
Pros of Detectron2
- More comprehensive object detection and instance segmentation capabilities
- Faster training and inference times due to optimized CUDA implementations
- Extensive pre-trained model zoo for various tasks
Cons of Detectron2
- Steeper learning curve for beginners due to its complexity
- Less focus on semantic segmentation compared to MMSegmentation
- Requires more computational resources for training and inference
Code Comparison
MMSegmentation:
from mmseg.apis import inference_model, init_model
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo/demo.png')
Detectron2:
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
cfg = get_cfg()
cfg.merge_from_file("configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
predictor = DefaultPredictor(cfg)
# DefaultPredictor expects a BGR image array, as returned by cv2.imread
img = cv2.imread('demo/demo.png')
outputs = predictor(img)
Models and examples built with TensorFlow
Pros of models
- Broader scope, covering various ML tasks beyond segmentation
- Official TensorFlow repository with extensive documentation
- Large community and ecosystem support
Cons of models
- Less specialized for segmentation tasks
- Potentially more complex to navigate for specific use cases
- May require more setup and configuration for segmentation projects
Code Comparison
mmsegmentation:
from mmseg.apis import inference_model, init_model
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo/demo.png')
models:
import tensorflow as tf
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.layers import Conv2D, UpSampling2D
num_classes = 19  # example value, e.g. the 19 Cityscapes classes
# ImageNet-pretrained backbone without the classification head
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(None, None, 3))
x = Conv2D(256, (3, 3), padding='same', activation='relu')(base_model.output)
x = UpSampling2D((2, 2))(x)
outputs = Conv2D(num_classes, (1, 1), activation='softmax')(x)
model = tf.keras.Model(inputs=base_model.input, outputs=outputs)
Datasets, Transforms and Models specific to Computer Vision
Pros of vision
- Broader scope, covering various computer vision tasks beyond segmentation
- Tighter integration with PyTorch ecosystem and core functionality
- More extensive documentation and tutorials for beginners
Cons of vision
- Less specialized for segmentation tasks compared to mmsegmentation
- Fewer pre-trained models specifically for semantic segmentation
- May require more custom code for advanced segmentation workflows
Code Comparison
mmsegmentation example:
from mmseg.apis import inference_model, init_model
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo.png')
vision example:
import torch
from PIL import Image
from torchvision.models.segmentation import FCN_ResNet50_Weights, fcn_resnet50
# Load pretrained weights together with their matching preprocessing
weights = FCN_ResNet50_Weights.DEFAULT
model = fcn_resnet50(weights=weights)
model.eval()
preprocess = weights.transforms()
input_image = Image.open('demo.png').convert('RGB')
input_batch = preprocess(input_image).unsqueeze(0)
# Disable gradient tracking for inference
with torch.no_grad():
    output = model(input_batch)['out'][0]
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
Pros of YOLOv5
- Simpler architecture and easier to use for object detection tasks
- Faster inference speed, especially on edge devices
- More extensive documentation and community support
Cons of YOLOv5
- Focused primarily on object detection, so less versatile than MMSegmentation for segmentation work
- May have lower accuracy for complex scenes compared to more advanced models
- Less flexibility in terms of model customization and experimentation
Code Comparison
YOLOv5:
import torch
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('image.jpg')
results.print()
MMSegmentation:
from mmseg.apis import inference_model, init_model
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'test.jpg')
YOLOv5 focuses on simplicity and speed for object detection, while MMSegmentation offers a more comprehensive framework for various segmentation tasks with greater flexibility and customization options.
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Pros of transformers
- Broader scope, covering NLP, vision, and audio tasks beyond segmentation
- Larger community and more frequent updates
- Extensive documentation and tutorials
Cons of transformers
- Less specialized for image segmentation tasks
- Potentially steeper learning curve for segmentation-specific applications
- May require more customization for optimal segmentation performance
Code Comparison
mmsegmentation:
from mmseg.apis import inference_model, init_model
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo/demo.png')
transformers:
from transformers import SegformerImageProcessor, SegformerForSemanticSegmentation
from PIL import Image
# SegformerImageProcessor replaces the deprecated SegformerFeatureExtractor
image_processor = SegformerImageProcessor.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
model = SegformerForSemanticSegmentation.from_pretrained("nvidia/segformer-b0-finetuned-ade-512-512")
image = Image.open("path/to/image.jpg")
inputs = image_processor(images=image, return_tensors="pt")
outputs = model(**inputs)
Best Practices, code samples, and documentation for Computer Vision.
Pros of computervision-recipes
- Broader scope covering multiple computer vision tasks
- Includes Jupyter notebooks for easy experimentation and learning
- Provides end-to-end examples for various scenarios
Cons of computervision-recipes
- Less specialized in semantic segmentation compared to mmsegmentation
- May have fewer state-of-the-art models for segmentation tasks
- Potentially slower development cycle for cutting-edge segmentation techniques
Code Comparison
mmsegmentation:
from mmseg.apis import inference_model, init_model
config_file = 'configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'checkpoints/pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'
model = init_model(config_file, checkpoint_file, device='cuda:0')
result = inference_model(model, 'demo/demo.png')
computervision-recipes:
from azureml.contrib.services.aml_request import rawhttp
from azureml.contrib.services.aml_response import AMLResponse
from PIL import Image
import numpy as np
@rawhttp
def run(request):
    # `model` is assumed to be loaded beforehand, e.g. in the scoring script's init()
    image = Image.open(request.files["image"].stream)
    # Perform segmentation using a pre-trained model
    result = model.predict(np.array(image))
    return AMLResponse(result.tolist(), 200)
README
Documentation: https://mmsegmentation.readthedocs.io/en/latest/
Introduction
MMSegmentation is an open source semantic segmentation toolbox based on PyTorch. It is a part of the OpenMMLab project.
The main branch works with PyTorch 1.6+.
Introducing MMSegmentation v1.0.0
We are thrilled to announce the official release of MMSegmentation's latest version! For this new release, the main branch serves as the primary branch, while the development branch is dev-1.x. The stable branch for the previous release remains as the 0.x branch. Please note that the master branch will only be maintained for a limited time before being removed. We encourage you to be mindful of branch selection and updates during use. Thank you for your unwavering support and enthusiasm, and let's work together to make MMSegmentation even more robust and powerful!
MMSegmentation v1.x brings remarkable improvements over the 0.x release, offering a more flexible and feature-packed experience. To utilize the new features in v1.x, we kindly invite you to consult our detailed migration guide, which will help you seamlessly transition your projects. Your support is invaluable, and we eagerly await your feedback!
Major features
- Unified Benchmark: We provide a unified benchmark toolbox for various semantic segmentation methods.
- Modular Design: We decompose the semantic segmentation framework into different components, and one can easily construct a customized semantic segmentation framework by combining different modules.
- Support of multiple methods out of the box: The toolbox directly supports popular and contemporary semantic segmentation frameworks, e.g. PSPNet, DeepLabV3, PSANet, DeepLabV3+, etc.
- High efficiency: The training speed is faster than or comparable to other codebases.
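The modular design described above, where a framework is assembled by combining components, can be illustrated with a small registry pattern. This is a simplified sketch in plain Python and not the actual OpenMMLab registry: `Registry`, `ToyBackbone`, `ToyHead`, and `ToySegmentor` are hypothetical names invented for this example.

```python
class Registry:
    """Minimal name -> class lookup used to build components from config dicts."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self, cls):
        # Used as a decorator: store the class under its own name
        self._modules[cls.__name__] = cls
        return cls

    def build(self, cfg):
        cfg = dict(cfg)  # copy so the caller's config is not mutated
        cls = self._modules[cfg.pop('type')]
        return cls(**cfg)


BACKBONES = Registry('backbone')
HEADS = Registry('head')


@BACKBONES.register_module
class ToyBackbone:
    def __init__(self, depth):
        self.depth = depth


@HEADS.register_module
class ToyHead:
    def __init__(self, num_classes):
        self.num_classes = num_classes


class ToySegmentor:
    """Combines independently registered parts into one model."""

    def __init__(self, backbone, decode_head):
        self.backbone = BACKBONES.build(backbone)
        self.decode_head = HEADS.build(decode_head)


# Swapping a component only requires changing the 'type' entry in the config
model_cfg = dict(
    backbone=dict(type='ToyBackbone', depth=50),
    decode_head=dict(type='ToyHead', num_classes=19),
)
model = ToySegmentor(**model_cfg)
print(type(model.backbone).__name__, model.decode_head.num_classes)
```

Keeping construction behind a config dictionary is what lets a new backbone or head be added without touching the code that assembles the model.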
What's New
v1.2.0 was released on 10/12/2023. From 1.1.0 to 1.2.0, we have added or updated the following features:
Highlights
- Support for the open-vocabulary semantic segmentation algorithm SAN.
- Support for the monocular depth estimation task; please refer to VPD and Adabins for more details.
- New projects: the open-vocabulary semantic segmentation algorithm CAT-Seg and the real-time semantic segmentation algorithm PP-MobileSeg.
Installation
Please refer to get_started.md for installation and dataset_prepare.md for dataset preparation.
Get Started
Please see Overview for the general introduction of MMSegmentation.
Please see user guides for the basic usage of MMSegmentation. There are also advanced tutorials for an in-depth understanding of the mmseg design and implementation.
A Colab tutorial is also provided. You may preview the notebook here or directly run on Colab.
To migrate from MMSegmentation 0.x, please refer to migration.
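One visible change when moving from 0.x to 1.x is that the inference helpers in `mmseg.apis` were renamed: `init_segmentor` became `init_model` and `inference_segmentor` became `inference_model`. The sketch below is a hypothetical helper (not part of MMSegmentation) that rewrites only these two names in a source string; other API changes, such as the config-driven training entry point, are outside its scope, and the migration guide remains the authoritative reference.

```python
import re

# 0.x name -> 1.x name for the inference helpers in mmseg.apis
RENAMES = {
    'init_segmentor': 'init_model',
    'inference_segmentor': 'inference_model',
}

# Match whole identifiers only, so longer names containing these are untouched
_PATTERN = re.compile(r'\b(' + '|'.join(map(re.escape, RENAMES)) + r')\b')


def migrate_inference_calls(source):
    """Replace the 0.x inference helper names in a source string with their 1.x names."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], source)


old = (
    "from mmseg.apis import inference_segmentor, init_segmentor\n"
    "model = init_segmentor(cfg, ckpt)\n"
)
new = migrate_inference_calls(old)
print(new)
```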
Tutorial
Get Started | MMSeg Basic Tutorial | MMSeg Detail Tutorial | MMSeg Development Tutorial
Benchmark and model zoo
Results and models are available in the model zoo.
Please refer to FAQ for frequently asked questions.
Projects
Here are some implementations of SOTA models and solutions built on MMSegmentation, which are supported and maintained by community users. These projects demonstrate the best practices based on MMSegmentation for research and product development. We welcome and appreciate all the contributions to OpenMMLab ecosystem.
Contributing
We appreciate all contributions to improve MMSegmentation. Please refer to CONTRIBUTING.md for the contributing guideline.
Acknowledgement
MMSegmentation is an open source project that welcomes any contribution and feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible and standardized toolkit for reimplementing existing methods and developing new semantic segmentation methods.
Citation
If you find this project useful in your research, please consider citing:
@misc{mmseg2020,
title={{MMSegmentation}: OpenMMLab Semantic Segmentation Toolbox and Benchmark},
author={MMSegmentation Contributors},
howpublished = {\url{https://github.com/open-mmlab/mmsegmentation}},
year={2020}
}
License
This project is released under the Apache 2.0 license.
OpenMMLab Family
- MMEngine: OpenMMLab foundational library for training deep learning models.
- MMCV: OpenMMLab foundational library for computer vision.
- MMPreTrain: OpenMMLab pre-training toolbox and benchmark.
- MMagic: OpenMMLab Advanced, Generative and Intelligent Creation toolbox.
- MMDetection: OpenMMLab detection toolbox and benchmark.
- MMYOLO: OpenMMLab YOLO series toolbox and benchmark.
- MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
- MMRotate: OpenMMLab rotated object detection toolbox and benchmark.
- MMTracking: OpenMMLab video perception toolbox and benchmark.
- MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
- MMOCR: OpenMMLab text detection, recognition, and understanding toolbox.
- MMPose: OpenMMLab pose estimation toolbox and benchmark.
- MMHuman3D: OpenMMLab 3D human parametric model toolbox and benchmark.
- MMFewShot: OpenMMLab fewshot learning toolbox and benchmark.
- MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
- MMFlow: OpenMMLab optical flow toolbox and benchmark.
- MMDeploy: OpenMMLab Model Deployment Framework.
- MMRazor: OpenMMLab model compression toolbox and benchmark.
- MIM: MIM installs OpenMMLab packages.
- Playground: A central hub for gathering and showcasing amazing projects built upon OpenMMLab.