openpose

OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation


Top Related Projects

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

tfjs-models

14,585

Pretrained models for TensorFlow.js

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Detectron

26,363

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

darknet

22,101

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Quick Overview

OpenPose is an open-source real-time multi-person keypoint detection library for body, face, hands, and foot estimation. Developed by the Perceptual Computing Lab at Carnegie Mellon University, it's the first real-time system to jointly detect human body, hand, facial, and foot keypoints on single images.

Pros

High accuracy in multi-person pose estimation
Real-time performance on GPU; CPU-only builds are supported but run considerably slower
Supports 2D multi-person and 3D single-person keypoint detection
Extensive documentation and community support

Cons

Computationally intensive, requiring powerful hardware for real-time performance
Limited to keypoint detection, not full body segmentation
May struggle with occluded or partially visible body parts
3D reconstruction requires camera calibration (a calibration toolbox is included)

Code Examples

Basic usage for body pose estimation:

import cv2
from openpose import pyopenpose as op

# Configure OpenPose (model_folder must point at the downloaded models)
params = dict()
params["model_folder"] = "../models/"
opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

# Read image and process it
imageToProcess = cv2.imread("image.jpg")
if imageToProcess is None:
    raise FileNotFoundError("Could not read image.jpg")
datum = op.Datum()
datum.cvInputData = imageToProcess
opWrapper.emplaceAndPop(op.VectorDatum([datum]))

# Display result: poseKeypoints holds (x, y, confidence) per keypoint
print("Body keypoints: \n" + str(datum.poseKeypoints))
cv2.imshow("OpenPose Result", datum.cvOutputData)
cv2.waitKey(0)
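
The printed `datum.poseKeypoints` is an array with one row per detected person and, for each body keypoint, an `(x, y, confidence)` triple. As a minimal sketch of post-processing that array, the helper below keeps only keypoints above a confidence threshold. The function name `confident_keypoints`, the threshold value, and the sample data are illustrative assumptions, not part of the OpenPose API; plain nested lists stand in for the NumPy array so the snippet runs without OpenPose installed.

```python
def confident_keypoints(pose_keypoints, threshold=0.3):
    """Return, per person, a list of (index, x, y) for keypoints whose
    confidence is at least `threshold`."""
    people = []
    for person in pose_keypoints:
        kept = [
            (index, x, y)
            for index, (x, y, confidence) in enumerate(person)
            if confidence >= threshold
        ]
        people.append(kept)
    return people


# Hypothetical data: one person, three keypoints as (x, y, confidence).
sample = [[(120.0, 80.0, 0.91), (0.0, 0.0, 0.0), (130.5, 140.2, 0.45)]]
filtered = confident_keypoints(sample)
```

With real output, `datum.poseKeypoints.tolist()` would be a suitable input, after checking that any people were detected.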

Estimating hand keypoints:

# Add hand detection to parameters (reuses params and imageToProcess from the basic example)
params["hand"] = True

# Configure and start OpenPose
opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

# Process image
datum = op.Datum()
datum.cvInputData = imageToProcess
opWrapper.emplaceAndPop(op.VectorDatum([datum]))

# Display hand keypoints
print("Left hand keypoints: \n" + str(datum.handKeypoints[0]))
print("Right hand keypoints: \n" + str(datum.handKeypoints[1]))

Face keypoint detection:

# Enable face keypoint detection (reuses params and imageToProcess from the basic example)
params["face"] = True

# Configure and start OpenPose
opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

# Process image
datum = op.Datum()
datum.cvInputData = imageToProcess
opWrapper.emplaceAndPop(op.VectorDatum([datum]))

# Display face keypoints
print("Face keypoints: \n" + str(datum.faceKeypoints))
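
The three snippets above differ only in the flags set on `params`. A small helper can make that explicit. This is a sketch, and `build_params` is a hypothetical name rather than an OpenPose function; it uses only the `model_folder`, `hand`, and `face` keys shown above.

```python
def build_params(model_folder="../models/", hand=False, face=False):
    """Build the parameter dictionary passed to opWrapper.configure()."""
    params = {"model_folder": model_folder}
    # Only add optional detectors when requested, mirroring the examples above.
    if hand:
        params["hand"] = True
    if face:
        params["face"] = True
    return params


body_only = build_params()
whole_body = build_params(hand=True, face=True)
```

The resulting dictionary can be handed to `opWrapper.configure(...)` exactly as in the earlier examples.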

Getting Started

Clone the OpenPose repository:

git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose.git

Install dependencies (Ubuntu):

sudo apt-get install cmake gcc g++
sudo apt-get install libopencv-dev

Build OpenPose:

cd openpose
mkdir build && cd build
cmake ..
make -j`nproc`

Run the demo from the repository root, since the default model path is relative to it:
```
cd ..
./build/examples/openpose/openpose.bin
```

For more detailed instructions, refer to the official documentation in the repository.

Competitor Comparisons

detectron2

32,239

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.

Pros of Detectron2

More comprehensive object detection and segmentation capabilities
Modular architecture allowing easier customization and extension
Better performance on large-scale datasets and complex scenes

Cons of Detectron2

Steeper learning curve due to more complex architecture
Higher computational requirements for training and inference
Less specialized for human pose estimation compared to OpenPose

Code Comparison

OpenPose example:

from openpose import pyopenpose as op
params = dict()
opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

Detectron2 example:

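from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

config_name = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(config_name))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)  # pretrained weights
predictor = DefaultPredictor(cfg)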

Both libraries offer powerful computer vision capabilities, but Detectron2 provides a more versatile toolkit for various object detection and segmentation tasks, while OpenPose specializes in human pose estimation with potentially easier setup for that specific use case.

tfjs-models

14,585

Pretrained models for TensorFlow.js

Pros of tfjs-models

Browser-based: Runs directly in web browsers without additional dependencies
Diverse models: Offers a variety of pre-trained models beyond pose estimation
JavaScript ecosystem: Integrates easily with web applications and frameworks

Cons of tfjs-models

Performance: Generally slower than native implementations like OpenPose
Limited customization: Less flexibility for advanced users to modify core algorithms
Dependency on TensorFlow.js: Requires loading additional libraries

Code Comparison

OpenPose (C++):

// opWrapper is a configured and started op::Wrapper
auto datumProcessed = opWrapper.emplaceAndPop(imageToProcess);
if (datumProcessed != nullptr)
    cv::imshow("OpenPose", datumProcessed->at(0)->cvOutputData);

tfjs-models (JavaScript):

const net = await posenet.load();
const pose = await net.estimateSinglePose(imageElement);
drawKeypoints(pose.keypoints, 0.6, ctx); // drawing helper from the PoseNet demo code, not part of the library

Both repositories provide pose estimation capabilities, but OpenPose offers a more comprehensive and performant solution for desktop and server environments, while tfjs-models excels in web-based applications with its easy integration and diverse model offerings.

yolov5

54,362

YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite

Pros of YOLOv5

Faster inference speed and real-time object detection capabilities
More versatile, supporting a wider range of object detection tasks
Easier to deploy and integrate into various applications

Cons of YOLOv5

Less specialized for human pose estimation compared to OpenPose
May require more data and fine-tuning for specific pose-related tasks
Potentially lower accuracy for complex human pose scenarios

Code Comparison

OpenPose example:

import cv2
from openpose import pyopenpose as op

params = dict()
params["model_folder"] = "../models/"
opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()

datum = op.Datum()
imageToProcess = cv2.imread("image.jpg")
datum.cvInputData = imageToProcess
opWrapper.emplaceAndPop(op.VectorDatum([datum]))

YOLOv5 example:

import torch

model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
img = 'https://ultralytics.com/images/zidane.jpg'
results = model(img)
results.print()
results.save()

Mask_RCNN

25,251

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Pros of Mask_RCNN

Provides instance segmentation in addition to object detection
Supports a wider range of object classes and datasets
More flexible architecture for various computer vision tasks

Cons of Mask_RCNN

Generally slower inference time compared to OpenPose
Requires more computational resources for training and inference
Less specialized for human pose estimation tasks

Code Comparison

Mask_RCNN example:

import mrcnn.model as modellib

model = modellib.MaskRCNN(mode="inference", config=config, model_dir=MODEL_DIR)
results = model.detect([image], verbose=1)

OpenPose example:

from openpose import pyopenpose as op

opWrapper = op.WrapperPython()
opWrapper.configure(params)
opWrapper.start()
datum = op.Datum()
opWrapper.emplaceAndPop(op.VectorDatum([datum]))

Both repositories offer powerful computer vision capabilities, but they focus on different aspects. Mask_RCNN is more versatile for general object detection and instance segmentation, while OpenPose specializes in human pose estimation. The choice between them depends on the specific requirements of your project and the available computational resources.

Detectron

26,363

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.

Pros of Detectron

Broader scope: Supports multiple computer vision tasks beyond pose estimation
More flexible architecture: Modular design allows easier customization
Better documentation and community support

Cons of Detectron

Higher computational requirements
Steeper learning curve for beginners
Less specialized for real-time pose estimation

Code Comparison

OpenPose example:

#include <openpose/headers.hpp>

op::Wrapper opWrapper{op::ThreadManagerMode::Asynchronous};
opWrapper.configure(op::WrapperStructPose{});
opWrapper.start();

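Detectron example (shown with the API of its successor, Detectron2; the original Detectron is built on Caffe2):

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

config_name = "COCO-Keypoints/keypoint_rcnn_R_50_FPN_3x.yaml"
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(config_name))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
predictor = DefaultPredictor(cfg)
outputs = predictor(image)  # image: BGR array, e.g. from cv2.imread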

Both repositories offer powerful tools for computer vision tasks, with OpenPose specializing in real-time pose estimation and Detectron providing a more versatile framework for various detection and segmentation tasks. The choice between them depends on the specific requirements of your project and your familiarity with the respective ecosystems.

darknet

22,101

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )

Pros of darknet

Supports a wider range of object detection models (YOLO, Tiny-YOLO, etc.)
Generally faster inference times for object detection tasks
More flexible for various computer vision tasks beyond pose estimation

Cons of darknet

Less specialized for human pose estimation compared to openpose
May require more setup and configuration for specific use cases
Documentation can be less comprehensive for certain features

Code Comparison

openpose:

auto opWrapper = op::Wrapper{op::ThreadManagerMode::Asynchronous};
opWrapper.configure(wrapperStructPose);
opWrapper.start();

// Process and display image
auto datumProcessed = opWrapper.emplaceAndPop(imageToProcess);
if (datumProcessed != nullptr)
    cv::imshow("OpenPose", datumProcessed->at(0)->cvOutputData);

darknet:

network *net = load_network("cfg/yolov3.cfg", "yolov3.weights", 0);
image im = load_image("data/dog.jpg", 0, 0, net->w, net->h);
float *X = im.data;
network_predict(net, X);
int nboxes = 0;
detection *dets = get_network_boxes(net, im.w, im.h, 0.5, 0.5, 0, 1, &nboxes);

Both repositories offer powerful computer vision capabilities, with openpose specializing in human pose estimation and darknet providing a broader range of object detection models. The choice between them depends on the specific requirements of your project.


README


OpenPose is the first real-time multi-person system to jointly detect human body, hand, facial, and foot keypoints (135 keypoints in total) on single images.

It is authored by Ginés Hidalgo, Zhe Cao, Tomas Simon, Shih-En Wei, Yaadhav Raaj, Hanbyul Joo, and Yaser Sheikh. It is maintained by Ginés Hidalgo and Yaadhav Raaj. OpenPose would not be possible without the CMU Panoptic Studio dataset. We would also like to thank all the people who have helped OpenPose in any way.

Results
Features
Related Work
Installation
Quick Start Overview
Send Us Feedback!
Citation
License

Results

Whole-body (Body, Foot, Face, and Hands) 2D Pose Estimation

Testing OpenPose: (Left) Crazy Uptown Funk flashmob in Sydney video sequence. (Center and right) Authors Ginés Hidalgo and Tomas Simon testing face and hands.

Whole-body 3D Pose Reconstruction and Estimation

Tianyi Zhao testing the OpenPose 3D Module.

Unity Plugin

Runtime Analysis

We show an inference time comparison between the three available pose estimation libraries (same hardware and conditions): OpenPose, Alpha-Pose (fast PyTorch version), and Mask R-CNN. The OpenPose runtime is constant, while the runtime of Alpha-Pose and Mask R-CNN grows linearly with the number of people. More details here.
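
To make the scaling claim concrete, here is a toy cost model. It is only an illustration: the millisecond figures are made-up placeholders, not measurements from the OpenPose benchmark, and the function names are hypothetical. The model assumes a bottom-up method such as OpenPose runs one network pass for the whole image, while a top-down method runs a person detector followed by one pose pass per detected person.

```python
def bottom_up_runtime_ms(num_people, network_ms=45.0):
    """One pass over the whole image; cost does not depend on num_people."""
    return network_ms


def top_down_runtime_ms(num_people, detector_ms=30.0, per_person_ms=10.0):
    """Person detection plus one pose-estimation pass per detected person."""
    return detector_ms + num_people * per_person_ms


# Placeholder timings only; rows are (people, bottom-up ms, top-down ms).
table = [
    (n, bottom_up_runtime_ms(n), top_down_runtime_ms(n))
    for n in (1, 5, 20)
]
```

Under these assumed constants the top-down method is faster for a single person but falls behind as the crowd grows, which is the qualitative shape the comparison describes.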

Features

Main Functionality:

2D real-time multi-person keypoint detection:
- 15, 18 or 25-keypoint body/foot keypoint estimation, including 6 foot keypoints. Runtime invariant to number of detected people.
- 2x21-keypoint hand keypoint estimation. Runtime depends on number of detected people. See OpenPose Training for a runtime invariant alternative.
- 70-keypoint face keypoint estimation. Runtime depends on number of detected people. See OpenPose Training for a runtime invariant alternative.
3D real-time single-person keypoint detection:
- 3D triangulation from multiple single views.
- Synchronization of Flir cameras handled.
- Compatible with Flir/Point Grey cameras.
Calibration toolbox: Estimation of distortion, intrinsic, and extrinsic camera parameters.
Single-person tracking for further speedup or visual smoothing.
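
As a rough illustration of the triangulation idea in the 3D item above, the sketch below recovers a 3D point from two camera rays using the midpoint method. This is a simplified stand-in chosen because it needs only the standard library; it is not OpenPose's own triangulation code, and all names and camera positions are hypothetical. Each ray is assumed to be a camera center plus the direction obtained by back-projecting a 2D keypoint with that camera's calibration.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(center1, direction1, center2, direction2):
    """Return the point halfway between the closest points of two 3D rays."""
    w0 = [p - q for p, q in zip(center1, center2)]
    a = dot(direction1, direction1)
    b = dot(direction1, direction2)
    c = dot(direction2, direction2)
    d = dot(direction1, w0)
    e = dot(direction2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the point cannot be triangulated")
    s = (b * e - c * d) / denom  # parameter along ray 1
    t = (a * e - b * d) / denom  # parameter along ray 2
    point1 = [p + s * v for p, v in zip(center1, direction1)]
    point2 = [p + t * v for p, v in zip(center2, direction2)]
    return tuple((p + q) / 2.0 for p, q in zip(point1, point2))


# Hypothetical setup: two cameras 4 units apart, both seeing the point (1, 2, 5).
point = triangulate_midpoint((0, 0, 0), (1, 2, 5), (4, 0, 0), (-3, 2, 5))
```

With noisy detections the two rays no longer intersect, and the midpoint serves as a simple estimate; more views and a least-squares formulation would normally give a more stable result.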

Input: Image, video, webcam, Flir/Point Grey, IP camera, and support to add your own custom input source (e.g., depth camera).

Output: Basic image + keypoint display/saving (PNG, JPG, AVI, ...), keypoint saving (JSON, XML, YML, ...), keypoints as array class, and support to add your own custom output code (e.g., some fancy UI).

OS: Ubuntu (20, 18, 16, 14), Windows (10, 8), Mac OSX, Nvidia TX2.

Hardware compatibility: CUDA (Nvidia GPU), OpenCL (AMD GPU), and non-GPU (CPU-only) versions.

Usage Alternatives:

Command-line demo for built-in functionality.
C++ API and Python API for custom functionality. E.g., adding your custom inputs, pre-processing, post-processing, and output steps.

For further details, check the major released features and release notes docs.

Related Work

OpenPose training code
OpenPose foot dataset
OpenPose Unity Plugin
OpenPose papers published in IEEE TPAMI and CVPR. Cite them in your publications if OpenPose helps your research! (Links and more details in the Citation section below).

Installation

If you want to use OpenPose without installing or writing any code, simply download and use the latest Windows portable version of OpenPose!

Otherwise, you could build OpenPose from source. See the installation doc for all the alternatives.

Quick Start Overview

Simply use the OpenPose Demo from your favorite command-line tool (e.g., Windows PowerShell or Ubuntu Terminal). E.g., the Ubuntu command below runs OpenPose on your webcam and displays the body keypoints, while the Windows command runs it on the sample video from the repository:

# Ubuntu
./build/examples/openpose/openpose.bin

:: Windows - Portable Demo
bin\OpenPoseDemo.exe --video examples\media\video.avi

You can also add any of the available flags in any order. E.g., the following example runs on a video (--video {PATH}), enables face (--face) and hands (--hand), and saves the output keypoints on JSON files on disk (--write_json {PATH}).

# Ubuntu
./build/examples/openpose/openpose.bin --video examples/media/video.avi --face --hand --write_json output_json_folder/

:: Windows - Portable Demo
bin\OpenPoseDemo.exe --video examples\media\video.avi --face --hand --write_json output_json_folder/
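
The files written by `--write_json` can be post-processed with any JSON library. The sketch below assumes the layout used by recent OpenPose releases, where each entry of `people` stores body keypoints as a flat `pose_keypoints_2d` list of `x, y, confidence` values; check a file from your own run before relying on these key names. The function name and the inline sample are illustrative.

```python
import json


def read_pose_keypoints(json_text):
    """Return, per person, a list of (x, y, confidence) body keypoints."""
    frame = json.loads(json_text)
    people = []
    for person in frame.get("people", []):
        flat = person.get("pose_keypoints_2d", [])
        # Regroup the flat [x0, y0, c0, x1, y1, c1, ...] list into triples.
        triples = [tuple(flat[i:i + 3]) for i in range(0, len(flat) - 2, 3)]
        people.append(triples)
    return people


# Tiny hand-written sample with one person and two keypoints.
sample = '{"people": [{"pose_keypoints_2d": [10.0, 20.0, 0.9, 30.0, 40.0, 0.8]}]}'
keypoints = read_pose_keypoints(sample)
```

For a real run, the text of each file in the output folder would be passed to `read_pose_keypoints` in place of the inline sample.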

Optionally, you can also extend OpenPose's functionality from its Python and C++ APIs. After installing OpenPose, check its official doc for a quick overview of all the alternatives and tutorials.

Send Us Feedback!

Our library is open source for research purposes, and we want to improve it! So let us know (create a new GitHub issue or pull request, email us, etc.) if you...

Find/fix any bug (in functionality or speed) or know how to speed up or improve any part of OpenPose.
Want to add/show some cool functionality/demo/project made on top of OpenPose. We can add your project link to our Community-based Projects section or even integrate it with OpenPose!

Citation

Please cite these papers in your publications if OpenPose helps your research. All of OpenPose is based on OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, while the hand and face detectors also use Hand Keypoint Detection in Single Images using Multiview Bootstrapping (the face detector was trained using the same procedure as the hand detector).

@article{8765346,
  author = {Z. {Cao} and G. {Hidalgo Martinez} and T. {Simon} and S. {Wei} and Y. A. {Sheikh}},
  journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
  title = {OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  year = {2019}
}

@inproceedings{simon2017hand,
  author = {Tomas Simon and Hanbyul Joo and Iain Matthews and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Hand Keypoint Detection in Single Images using Multiview Bootstrapping},
  year = {2017}
}

@inproceedings{cao2017realtime,
  author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  year = {2017}
}

@inproceedings{wei2016cpm,
  author = {Shih-En Wei and Varun Ramakrishna and Takeo Kanade and Yaser Sheikh},
  booktitle = {CVPR},
  title = {Convolutional pose machines},
  year = {2016}
}

Paper links:

OpenPose: Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields:
- IEEE TPAMI
- ArXiv
Hand Keypoint Detection in Single Images using Multiview Bootstrapping
Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields
Convolutional Pose Machines

License

OpenPose is available free of charge for non-commercial use, and may be redistributed under these conditions. Please see the license for further details. Interested in a commercial license? Check this FlintBox link. For commercial queries, use the Contact section from the FlintBox link and also send a copy of that message to Yaser Sheikh.
