
open-mmlab / mmpose

OpenMMLab Pose Estimation Toolbox and Benchmark.


Top Related Projects

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.


OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Pretrained models for TensorFlow.js

Google Research

The project is an official implementation of the ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208).

Quick Overview

MMPose is an open-source toolbox for pose estimation tasks based on PyTorch. It is a part of the OpenMMLab project and provides a comprehensive set of tools for various pose estimation tasks, including 2D and 3D human pose estimation, animal pose estimation, and face landmark detection.

Pros

  • Extensive collection of state-of-the-art algorithms and models
  • Modular design allowing easy customization and extension
  • Comprehensive documentation and tutorials
  • Active community and regular updates

Cons

  • Steep learning curve for beginners
  • Requires significant computational resources for training large models
  • Limited support for real-time applications

Code Examples

  1. Loading a pre-trained model and performing inference:
from mmpose.apis import init_pose_model, inference_top_down_pose_model

config_file = 'configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_256x192.py'
checkpoint_file = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_256x192-b9e0b3ab_20200708.pth'

# Initialize the model (use device='cpu' if no GPU is available)
model = init_pose_model(config_file, checkpoint_file, device='cuda:0')

# Perform inference on an image
image = 'demo/demo.jpg'
results, _ = inference_top_down_pose_model(model, image)
  2. Visualizing the results:
import mmcv
from mmpose.apis import vis_pose_result

# vis_pose_result returns the rendered image as an array
vis_image = vis_pose_result(model, image, results)
mmcv.imwrite(vis_image, 'vis_result.jpg')
  3. Training a custom model:
from mmcv import Config
from mmpose.apis import train_model
from mmpose.models import build_posenet
from mmpose.datasets import build_dataset

# Load the training configuration
cfg = Config.fromfile(config_file)

# Build the model
model = build_posenet(cfg.model)

# Build the dataset
datasets = [build_dataset(cfg.data.train)]

# Train the model
train_model(model, datasets, cfg, distributed=False, validate=True)
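The "top_down" in inference_top_down_pose_model refers to one of the two main paradigms MMPose supports: a top-down pipeline detects person boxes first and then estimates one pose per box, while a bottom-up pipeline localizes all keypoints jointly and groups them into people. The stub functions below are a minimal, self-contained sketch of that control flow; they are placeholders for illustration, not real MMPose calls:

```python
def detect_people(image):
    # Placeholder person detector: returns bounding boxes (x, y, w, h).
    return [(10, 10, 50, 120), (80, 15, 45, 110)]

def estimate_single_pose(image, box):
    # Placeholder single-person estimator: one keypoint at the box center.
    x, y, w, h = box
    return [(x + w // 2, y + h // 2)]

def top_down_pipeline(image):
    """Detect people first, then estimate a pose for each detection."""
    return [estimate_single_pose(image, box) for box in detect_people(image)]

def bottom_up_pipeline(image):
    """Localize all keypoints jointly, then group them into people."""
    keypoints = [(35, 70), (102, 70)]      # all keypoints found in the image
    return [[kp] for kp in keypoints]      # trivial grouping: one per person

poses = top_down_pipeline(image=None)
print(len(poses))  # one pose per detected person -> 2
```

Top-down methods are typically more accurate per person but scale with the number of detections; bottom-up methods run once per image, which is why they are favored for real-time multi-person settings.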

Getting Started

  1. Install MMPose:
pip install openmim
mim install mmpose
  2. Specify a config file and a checkpoint URL (the checkpoint is downloaded automatically on first use):
config_file = 'configs/body/2d_kpt_sview_rgb_img/topdown_heatmap/coco/hrnet_w48_coco_256x192.py'
checkpoint_file = 'https://download.openmmlab.com/mmpose/top_down/hrnet/hrnet_w48_coco_256x192-b9e0b3ab_20200708.pth'
  3. Perform inference:
from mmpose.apis import init_pose_model, inference_top_down_pose_model
model = init_pose_model(config_file, checkpoint_file, device='cuda:0')
results, _ = inference_top_down_pose_model(model, 'demo/demo.jpg')

Competitor Comparisons

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

  • Broader scope, covering object detection, segmentation, and more
  • Extensive documentation and tutorials
  • Highly modular and customizable architecture

Cons of Detectron2

  • Steeper learning curve for beginners
  • Less focused on pose estimation specifically
  • Requires more setup and configuration for pose-related tasks

Code Comparison

MMPose example:

from mmpose.apis import inference_top_down_pose_model, init_pose_model

pose_model = init_pose_model(pose_config, pose_checkpoint)
pose_results, _ = inference_top_down_pose_model(pose_model, image)

Detectron2 example:

from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml"))
predictor = DefaultPredictor(cfg)
outputs = predictor(image)

MMPose is more specialized for pose estimation tasks, offering a simpler API for common use cases. Detectron2 provides a more general-purpose framework, requiring additional setup but offering greater flexibility for various computer vision tasks.


OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation

Pros of OpenPose

  • Real-time performance on CPU and GPU
  • Supports multi-person pose estimation
  • Includes hand, face, and foot keypoint detection

Cons of OpenPose

  • Less flexible architecture compared to MMPose
  • Limited support for recent pose estimation models
  • Fewer pre-trained models available

Code Comparison

OpenPose:

#include <openpose/pose/poseExtractor.hpp>

auto poseExtractor = op::PoseExtractorCaffe::getInstance(poseModel, netInputSize, outputSize, keypointScaleMode, num_gpu_start);
poseExtractor->forwardPass(netInputArray, imageSize, scaleInputToNetInputs);

MMPose:

from mmpose.apis import init_pose_model, inference_top_down_pose_model

pose_model = init_pose_model(config, checkpoint)
pose_results, _ = inference_top_down_pose_model(pose_model, image, person_results)

OpenPose offers real-time performance and multi-person pose estimation out of the box, making it suitable for applications requiring immediate results. However, MMPose provides a more flexible architecture, supports a wider range of recent pose estimation models, and offers more pre-trained models. MMPose's Python-based implementation also makes it easier to integrate into existing machine learning pipelines.

Pretrained models for TensorFlow.js

Pros of tfjs-models

  • Runs in web browsers and Node.js environments
  • Offers a wide range of pre-trained models beyond pose estimation
  • Easier integration with JavaScript and web applications

Cons of tfjs-models

  • Limited to TensorFlow.js ecosystem
  • May have lower performance compared to native implementations
  • Fewer pose estimation models and less flexibility in model customization

Code Comparison

mmpose:

from mmpose.apis import inference_top_down_pose_model, init_pose_model

pose_model = init_pose_model(config, checkpoint)
pose_results, _ = inference_top_down_pose_model(pose_model, image)

tfjs-models:

import * as poseDetection from '@tensorflow-models/pose-detection';

const detector = await poseDetection.createDetector(poseDetection.SupportedModels.MoveNet);
const poses = await detector.estimatePoses(image);

Key Differences

  • mmpose offers more comprehensive pose estimation capabilities and research-oriented features
  • tfjs-models provides easier integration with web technologies and broader model selection
  • mmpose is Python-based, while tfjs-models is JavaScript-based
  • mmpose has more advanced customization options for pose estimation models
  • tfjs-models is more suitable for client-side applications and rapid prototyping

Google Research

Pros of google-research

  • Broader scope, covering various research areas beyond pose estimation
  • Regularly updated with cutting-edge research from Google's teams
  • Extensive documentation and explanations for each project

Cons of google-research

  • Less focused on pose estimation specifically
  • May be more challenging to navigate due to its diverse content
  • Some projects might be less production-ready compared to mmpose

Code comparison

mmpose:

from mmpose.apis import inference_top_down_pose_model, init_pose_model

pose_model = init_pose_model(pose_config, pose_checkpoint)
pose_results, _ = inference_top_down_pose_model(pose_model, image)

google-research (PoseNet example):

import tensorflow as tf
import posenet

model = posenet.load_model(101)
output = model(tf.convert_to_tensor(image))
keypoints = output['heatmapScores']

Summary

While google-research offers a wide range of research projects and frequent updates, mmpose provides a more focused and potentially more production-ready solution for pose estimation tasks. The choice between the two depends on whether you need a specialized pose estimation tool or want to explore various research areas.

The project is an official implementation of the ECCV 2018 paper "Simple Baselines for Human Pose Estimation and Tracking" (https://arxiv.org/abs/1804.06208).

Pros of human-pose-estimation.pytorch

  • Simpler and more lightweight implementation
  • Easier to understand and modify for beginners
  • Focuses specifically on the SimpleBaseline method

Cons of human-pose-estimation.pytorch

  • Limited to a single pose estimation method
  • Less actively maintained and updated
  • Fewer pre-trained models and datasets supported

Code Comparison

mmpose:

from mmpose.apis import inference_top_down_pose_model, init_pose_model

pose_model = init_pose_model(config, checkpoint)
pose_results, _ = inference_top_down_pose_model(pose_model, image)

human-pose-estimation.pytorch:

from pose_estimation import get_pose_net
from pose_utils import get_final_preds

model = get_pose_net()
heatmaps = model(image)
preds = get_final_preds(heatmaps)

mmpose offers a more comprehensive and modular approach, while human-pose-estimation.pytorch provides a simpler, more direct implementation. mmpose's API is designed for flexibility across multiple models and datasets, whereas human-pose-estimation.pytorch focuses on a single method with a straightforward interface.
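The get_final_preds step above reflects a pattern shared by SimpleBaseline, HRNet, and other heatmap-based methods: each keypoint is decoded from the peak of its predicted heatmap. Below is a minimal NumPy sketch of that decoding idea, with toy shapes chosen for illustration; it is not the actual implementation from either repository:

```python
import numpy as np

def decode_heatmaps(heatmaps):
    """Decode (num_joints, H, W) heatmaps into (x, y, score) per joint
    by taking the location and value of each heatmap's maximum."""
    num_joints, h, w = heatmaps.shape
    flat = heatmaps.reshape(num_joints, -1)
    idx = flat.argmax(axis=1)       # flat index of the peak for each joint
    scores = flat.max(axis=1)       # peak value doubles as a confidence score
    xs, ys = idx % w, idx // w      # convert the flat index back to 2D coords
    return np.stack([xs, ys, scores], axis=1).astype(np.float32)

# Toy heatmap: a single joint whose peak sits at (x=3, y=2)
hm = np.zeros((1, 4, 5), dtype=np.float32)
hm[0, 2, 3] = 0.9
preds = decode_heatmaps(hm)  # joint at (3, 2) with score 0.9
```

Real implementations refine the integer argmax with sub-pixel adjustments and map the coordinates back from heatmap space to the original image, but the core lookup is the same.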


README

Introduction

English | 简体中文

MMPose is an open-source toolbox for pose estimation based on PyTorch. It is a part of the OpenMMLab project.

The main branch works with PyTorch 1.8+.

https://user-images.githubusercontent.com/15977946/124654387-0fd3c500-ded1-11eb-84f6-24eeddbf4d91.mp4


Major Features
  • Support diverse tasks

    We support a wide spectrum of mainstream pose analysis tasks in the current research community, including 2D multi-person human pose estimation, 2D hand pose estimation, 2D face landmark detection, 133-keypoint whole-body human pose estimation, 3D human mesh recovery, fashion landmark detection, and animal pose estimation. See Demo for more information.

  • Higher efficiency and higher accuracy

    MMPose implements multiple state-of-the-art (SOTA) deep learning models, including both top-down & bottom-up approaches. We achieve faster training speed and higher accuracy than other popular codebases, such as HRNet. See benchmark.md for more information.

  • Support for various datasets

    The toolbox directly supports multiple popular and representative datasets, such as COCO, AIC, MPII, MPII-TRB, and OCHuman. See dataset_zoo for more information.

  • Well designed, tested and documented

    We decompose MMPose into different components, so one can easily construct a customized pose estimation framework by combining different modules. We provide detailed documentation and an API reference, as well as unit tests.
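As a rough illustration of this modular design, OpenMMLab-style configs describe a model as nested dictionaries of swappable components, so a variant only overrides the parts that change. The fields below are simplified for illustration and do not form a complete working MMPose config:

```python
# Simplified, illustrative config fragment in the OpenMMLab style:
# replacing the backbone or head dict swaps that component of the model.
model = dict(
    type='TopdownPoseEstimator',
    backbone=dict(type='HRNet'),
    head=dict(type='HeatmapHead', in_channels=32, out_channels=17),
)

# A customized variant overrides only the component that changes.
resnet_variant = dict(model, backbone=dict(type='ResNet', depth=50))
print(resnet_variant['backbone']['type'])  # ResNet
```

In the real framework, registries map each `type` string to a class and build the module tree from the config, which is what makes components interchangeable without touching the training code.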

What's New

  • Release RTMW3D, a real-time model for 3D whole-body pose estimation.

  • Release RTMO, a state-of-the-art real-time method for multi-person pose estimation.


  • Release RTMW models in various sizes ranging from RTMW-m to RTMW-x. The input sizes include 256x192 and 384x288. This provides flexibility to select the right model for different speed and accuracy requirements.

  • Support inference of PoseAnything. Web demo is available here.

  • Support for new datasets:

  • Welcome to the MMPose project. Here you can discover the latest features and algorithms in MMPose and quickly share your ideas and code implementations with the community. Adding new features to MMPose is now smoother:

    • A simple and fast way to add new algorithms, features, and applications to MMPose.
    • A more flexible code structure and style, fewer restrictions, and a shorter code-review process.
    • Use the capabilities of MMPose in independent projects without being constrained by the code framework.
    • Newly added projects include:
    • Start your journey as an MMPose contributor with a simple example project, and let's build a better MMPose together!

  • January 4, 2024: MMPose v1.3.0 has been officially released, with major updates including:

    • Support for new datasets: ExLPose, H3WB
    • Release of new RTMPose series models: RTMO, RTMW
    • Support for new algorithm PoseAnything
    • Enhanced Inferencer with optional progress bar and improved affinity for one-stage methods

    Please check the complete release notes for more details on the updates brought by MMPose v1.3.0!

0.x / 1.x Migration

MMPose v1.0.0 is a major update that includes many API and config file changes. Currently, some of the algorithms have been migrated to v1.0.0, and the remaining ones will follow in subsequent versions. We track the migration progress in this Roadmap.

If your algorithm has not been migrated, you can continue to use the 0.x branch and old documentation.

Installation

Please refer to installation.md for more detailed installation and dataset preparation.

Getting Started

We provide a series of tutorials on the basic usage of MMPose for new users:

  1. For the basic usage of MMPose:

  2. For developers who wish to develop based on MMPose:

  3. For researchers and developers who are willing to contribute to MMPose:

  4. For some common issues, we provide a FAQ list:

Model Zoo

Results and models are available in the README.md of each method's config directory. A summary can be found in the Model Zoo page.

Supported algorithms:
Supported techniques:
Supported datasets:
Supported backbones:

Model Request

We will keep up with the latest progress of the community, and support more popular algorithms and frameworks. If you have any feature requests, please feel free to leave a comment in MMPose Roadmap.

Contributing

We appreciate all contributions to improve MMPose. Please refer to CONTRIBUTING.md for the contributing guideline.

Acknowledgement

MMPose is an open-source project contributed to by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedback. We hope the toolbox and benchmark serve the growing research community by providing a flexible toolkit for reimplementing existing methods and developing new models.

Citation

If you find this project useful in your research, please consider citing:

@misc{mmpose2020,
    title={OpenMMLab Pose Estimation Toolbox and Benchmark},
    author={MMPose Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmpose}},
    year={2020}
}

License

This project is released under the Apache 2.0 license.

Projects in OpenMMLab

  • MMEngine: OpenMMLab foundational library for training deep learning models.
  • MMCV: OpenMMLab foundational library for computer vision.
  • MMPreTrain: OpenMMLab pre-training toolbox and benchmark.
  • MMagic: OpenMMLab Advanced, Generative and Intelligent Creation toolbox.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMRotate: OpenMMLab rotated object detection toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMOCR: OpenMMLab text detection, recognition, and understanding toolbox.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMHuman3D: OpenMMLab 3D human parametric model toolbox and benchmark.
  • MMFewShot: OpenMMLab fewshot learning toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
  • MMFlow: OpenMMLab optical flow toolbox and benchmark.
  • MMDeploy: OpenMMLab Model Deployment Framework.
  • MMRazor: OpenMMLab model compression toolbox and benchmark.
  • MIM: MIM installs OpenMMLab packages.
  • Playground: A central hub for gathering and showcasing amazing projects built upon OpenMMLab.