rpg_svo_pro_open

No description available

1,491

406

1,491

View on GitHub

Top Related Projects

ORB_SLAM2

9,851

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

VINS-Mono

5,380

A Robust and Versatile Monocular Visual-Inertial State Estimator

ORB_SLAM3

7,407

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Quick Overview

RPG SVO Pro Open is an advanced visual odometry system for monocular and stereo cameras. It's designed for real-time pose estimation in robotics and augmented reality applications, offering high accuracy and robustness in challenging environments.

Pros

High performance and accuracy in pose estimation
Supports both monocular and stereo camera setups
Robust operation in challenging environments (e.g., low texture, dynamic scenes)
Open-source implementation with ROS integration

Cons

Requires careful parameter tuning for optimal performance
May struggle in environments with extreme lighting changes
Limited documentation for advanced usage scenarios
Steep learning curve for newcomers to visual odometry

Code Examples

// Initialize SVO
svo::FrameHandlerMono* vo = new svo::FrameHandlerMono(cam);
vo->start();

// Process a new image
cv::Mat img;
vo->addImage(img, 0.01);

// Get the current pose
Sophus::SE3 T_WS = vo->lastFrame()->T_world_cam();

// Configure SVO for stereo setup
svo::BaseOptions options;
options.camera_type = svo::CameraType::Stereo;
svo::FrameHandlerStereo* vo = new svo::FrameHandlerStereo(options);

// Process stereo image pair
cv::Mat left_img, right_img;
vo->addStereoImages(left_img, right_img, 0.01);

// Enable loop closure
svo::LoopClosingOptions lc_options;
lc_options.use_loop_closing = true;
vo->setLoopClosingOptions(lc_options);

// Get optimized trajectory after loop closure
std::vector<Sophus::SE3> optimized_poses = vo->getAllPoses();

Getting Started

Clone the repository:

git clone https://github.com/uzh-rpg/rpg_svo_pro_open.git

Install dependencies:

sudo apt-get install ros-melodic-opencv3 ros-melodic-eigen-conversions

Build the project:
```
catkin build svo_ros
```
Run SVO with a sample dataset:
```
roslaunch svo_ros run_from_bag.launch
```

For more detailed instructions, refer to the README.md file in the repository.

Competitor Comparisons

ORB_SLAM2

9,851

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

Pros of ORB_SLAM2

More robust loop closure and relocalization capabilities
Better performance in feature-rich environments
Supports monocular, stereo, and RGB-D cameras

Cons of ORB_SLAM2

Higher computational requirements
Less suitable for resource-constrained platforms
May struggle in low-texture environments

Code Comparison

ORB_SLAM2 (feature extraction):

void Frame::ExtractORB(int flag, const cv::Mat &im)
{
    if(flag==0)
        (*mpORBextractorLeft)(im,cv::Mat(),mvKeys,mDescriptors);
    else
        (*mpORBextractorRight)(im,cv::Mat(),mvKeysRight,mDescriptorsRight);
}

rpg_svo_pro_open (feature detection):

void detectFeatures(
    const cv::Mat& img,
    const cv::Mat& mask,
    Corners& corners,
    Features& features)
{
  detector_->detect(img, mask, corners, features);
}

Both repositories implement visual odometry and SLAM techniques, but they use different approaches. ORB_SLAM2 relies on ORB features for tracking and mapping, while rpg_svo_pro_open uses a direct method with sparse image alignment. The code snippets show the difference in feature extraction methods, with ORB_SLAM2 using ORB features and rpg_svo_pro_open using a more generic detector approach.

lsd_slam

2,671

LSD-SLAM

Pros of LSD-SLAM

Operates on direct image intensities, allowing for semi-dense mapping
Capable of large-scale, consistent mapping with loop closure
Works well in environments with little texture or few features

Cons of LSD-SLAM

Generally slower than feature-based methods like SVO
May struggle in environments with rapid motion or significant lighting changes
Requires more computational resources, especially for large-scale mapping

Code Comparison

LSD-SLAM (C++):

void SlamSystem::trackFrame(Frame* newFrame)
{
    // ... (tracking logic)
    SE3TrackerResult* trackingResult = tracker->trackFrame(
        currentKeyFrame, newFrame, SE3TrackerResult());
    // ... (update pose and map)
}

SVO (C++):

void FrameHandlerMono::addImage(const cv::Mat& img, const double timestamp)
{
    // ... (frame processing)
    FramePtr new_frame(new Frame(cam_, img, timestamp));
    tracking_quality_ = trackFrame(new_frame);
    // ... (update keyframes and map)
}

Both systems implement frame tracking, but LSD-SLAM operates on direct image intensities, while SVO uses feature-based tracking. SVO's implementation tends to be more efficient, contributing to its faster performance.

dso

2,372

Direct Sparse Odometry

Pros of DSO

More accurate in low-texture environments due to direct photometric error minimization
Better handling of rolling shutter cameras
Supports multi-camera setups

Cons of DSO

Higher computational requirements
Less robust to rapid camera motions
Limited to monocular setups (no stereo support)

Code Comparison

DSO (photometric error calculation):

float DSO::calculatePhotometricError(const Frame* frame, const Point* point) {
    Vec2f proj = project(point->pos, frame->pose);
    float intensity = interpolate(frame->image, proj);
    return (intensity - point->idepth) * (intensity - point->idepth);
}

SVO (feature detection and matching):

void SVO::detectAndMatchFeatures(const Frame& frame) {
    std::vector<cv::KeyPoint> keypoints;
    cv::Mat descriptors;
    feature_detector_->detect(frame.img, keypoints);
    descriptor_extractor_->compute(frame.img, keypoints, descriptors);
    matcher_->match(descriptors, last_frame_.descriptors, matches);
}

The code snippets highlight the different approaches: DSO focuses on direct photometric error minimization, while SVO uses feature detection and matching. This reflects their core algorithmic differences, with DSO being a direct method and SVO being a feature-based method.

VINS-Mono

5,380

A Robust and Versatile Monocular Visual-Inertial State Estimator

Pros of VINS-Mono

Integrates visual-inertial odometry with loop closure for improved accuracy
Supports monocular camera setups, making it suitable for a wider range of hardware configurations
Provides a more comprehensive solution for simultaneous localization and mapping (SLAM)

Cons of VINS-Mono

Generally slower in processing speed compared to SVO Pro
May require more computational resources due to its comprehensive nature
Less suitable for scenarios requiring extremely low latency

Code Comparison

VINS-Mono (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        TicToc t_c;
        clahe->apply(_img, img);
        ROS_DEBUG("CLAHE costs: %fms", t_c.toc());
    }
    else
        img = _img;

SVO Pro (feature detection):

void DetectorBase::detect(
    const Frame& frame,
    const cv::Mat& mask,
    const size_t max_n_features,
    Keypoints* keypoints_new,
    Scores* scores_new,
    Levels* levels_new,
    Gradients* gradients_new,
    FeatureTypes* types_new)
{
  CHECK_NOTNULL(keypoints_new);
  CHECK_NOTNULL(scores_new);
  CHECK_NOTNULL(levels_new);

Both repositories offer robust visual odometry solutions, with VINS-Mono providing a more comprehensive SLAM approach, while SVO Pro focuses on speed and efficiency.

ORB_SLAM3

7,407

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Pros of ORB_SLAM3

More comprehensive SLAM system with support for monocular, stereo, and RGB-D cameras
Includes loop closing and relocalization capabilities
Active development with frequent updates and improvements

Cons of ORB_SLAM3

Higher computational complexity, potentially slower in real-time applications
Steeper learning curve due to more complex architecture
May struggle in environments with limited features or textures

Code Comparison

ORB_SLAM3:

void ORBextractor::ComputeKeyPointsOctTree(vector<vector<KeyPoint>>& allKeypoints)
{
    allKeypoints.resize(nlevels);
    const float W = 30;
    for (int level = 0; level < nlevels; ++level)
    {
        const int minBorderX = EDGE_THRESHOLD-3;
        const int minBorderY = minBorderX;
        const int maxBorderX = mvImagePyramid[level].cols-EDGE_THRESHOLD+3;
        const int maxBorderY = mvImagePyramid[level].rows-EDGE_THRESHOLD+3;

rpg_svo_pro_open:

void FeatureDetector::detectFeatures(
    const cv::Mat& img,
    const cv::Mat& mask,
    const size_t max_n_features,
    Features& features)
{
  cv::Mat img_8u;
  if(img.type() != CV_8UC1)
  {
    img.convertTo(img_8u, CV_8UC1);
  }
  else
  {
    img_8u = img;
  }

Convert designs to code with AI

Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.

Try Visual Copilot

README

rpg_svo_pro

This repo includes SVO Pro which is the newest version of Semi-direct Visual Odometry (SVO) developed over the past few years at the Robotics and Perception Group (RPG). SVO was born as a fast and versatile visual front-end as described in the SVO paper (TRO-17). Since then, different extensions have been integrated through various research and industrial projects. SVO Pro features the support of different camera models, active exposure control, a sliding window based backend, and global bundle adjustment with loop closure.

In summary, this repository offers the following functionalities:

Visual-odometry: The most recent version of SVO that supports perspective and fisheye/catadioptric cameras in monocular or stereo setup. It also includes active exposure control.
Visual-inertial odometry: SVO fronted + visual-inertial sliding window optimization backend (modified from OKVIS)
Visual-inertial SLAM: SVO frontend + visual-inertial sliding window optimization backend + globally bundle adjusted map (using iSAM2). The global map is updated in real-time, thanks to iSAM2, and used for localization at frame-rate.
Visual-inertial SLAM with loop closure: Loop closures, via DBoW2, are integrated in the global bundle adjustment. Pose graph optimization is also included as a lightweight replacement of the global bundle adjustment.

An example of the visual-inertial SLAM pipeline on EuRoC dataset is below (green points - sliding window; blue points - iSAM2 map):

SVO Pro and its extensions have been used to support various projects at RPG, such as our recent work on multiple camera SLAM, voxel map for visual SLAM and the tight-coupling of global positional measurements into VIO. We hope that the efforts we made can facilitate the research and applications of SLAM and spatial perception.

License

The code is licensed under GPLv3. For commercial use, please contact sdavide [at] ifi [dot] uzh [dot] ch.

The visual-inertial backend is modified from OKVIS, and the license is retained at the beginning of the related files.

Credits

If you use the code in the academic context, please cite:

Christian Forster, Matia Pizzoli, Davide Scaramuzza. SVO: Fast Semi-Direct Monocular Visual Odometry. ICRA, 2014. bibtex
Christian Forster, Zichao Zhang, Michael Gassner, Manuel Werlberger, Davide Scaramuzza. SVO: Semi-Direct Visual Odometry for Monocular and Multi-Camera Systems. TRO, 2017. bibtex

Additionally, please cite the following papers for the specific extensions you make use of:

Fisheye/catadioptric camera extension: Zichao Zhang, Henri Rebecq, Christian Forster, Davide Scaramuzza. Benefit of Large Field-of-View Cameras for Visual Odometry. ICRA, 2016. bibtex
Brightness/exposure compensation: Zichao Zhang, Christian Forster, Davide Scaramuzza. Active Exposure Control for Robust Visual Odometry in HDR Environments. ICRA, 2017. bibtex
Ceres-based optimization backend: Stefan Leutenegger, Simon Lynen, Michael Bosse, Roland Siegwart, Paul Timothy Furgale. Keyframe-based visualâinertial odometry using nonlinear optimization. IJRR, 2015. bibtex
Global map powered by iSAM2: Michael Kaess, Hordur Johannsson, Richard Roberts, Viorela Ila, John Leonard, Frank Dellaert. iSAM2: Incremental Smoothing and Mapping Using the Bayes Tree. IJRR, 2012. bibtex
Loop closure: Dorian GÃ¡lvez-LÃ³pez and Juan D. TardÃ³s. Bags of Binary Words for Fast Place Recognition in Image Sequences. TRO, 2012. bibtex

Our recent publications that use SVO Pro are:

Multiple camera SLAM: Juichung Kuo, Manasi Muglikar, Zichao Zhang, Davide Scaramuzza. Redesigning SLAM for Arbitrary Multi-Camera Systems. ICRA, 2020. bibtex
Voxel map for visual SLAM: Manasi Muglikar, Zichao Zhang, Davide Scaramuzza. Voxel Map for Visual SLAM. ICRA, 2020. bibtex
Tight-coupling of global positional measurements into VIO: Giovanni Cioffi, Davide Scaramuzza. Tightly-coupled Fusion of Global Positional Measurements in Optimization-based Visual-Inertial Odometry. IROS, 2020. bibtex

Install

The code has been tested on

Ubuntu 18.04 with ROS Melodic
Ubuntu 20.04 with ROS Noetic

Install dependences

Install catkin tools and vcstools if you haven't done so before. Depending on your operating system, run

# For Ubuntu 18.04 + Melodic
sudo apt-get install python-catkin-tools python-vcstool

# For Ubuntu 20.04 + Noetic
sudo apt-get install python3-catkin-tools python3-vcstool python3-osrf-pycommon

Install system dependencies and dependencies for Ceres Solver

# system dep.
sudo apt-get install libglew-dev libopencv-dev libyaml-cpp-dev 
# Ceres dep.
sudo apt-get install libblas-dev liblapack-dev libsuitesparse-dev

Clone and compile

Create a workspace and clone the code (ROS-DISTRO=melodic/noetic):

mkdir svo_ws && cd svo_ws
# see below for the reason for specifying the eigen path
catkin config --init --mkdirs --extend /opt/ros/<ROS-DISTRO> --cmake-args -DCMAKE_BUILD_TYPE=Release -DEIGEN3_INCLUDE_DIR=/usr/include/eigen3
cd src
git clone git@github.com:uzh-rpg/rpg_svo_pro_open.git
vcs-import < ./rpg_svo_pro_open/dependencies.yaml
touch minkindr/minkindr_python/CATKIN_IGNORE
# vocabulary for place recognition
cd rpg_svo_pro_open/svo_online_loopclosing/vocabularies && ./download_voc.sh
cd ../../..

There are two types of builds that you can proceed from here

Build without the global map (front-end + sliding window back-end + loop closure/pose graph)
```
catkin build
```

Build with the global map using iSAM2 (all functionalities)

First, enable the global map feature

rm rpg_svo_pro_open/svo_global_map/CATKIN_IGNORE

and in svo_cmake/cmake/Modules/SvoSetup.cmake

SET(USE_GLOBAL_MAP TRUE)

Second, clone GTSAM

git clone --branch 4.0.3 git@github.com:borglab/gtsam.git

and modify GTSAM compilation flags a bit:

# 1. gtsam/CMakelists.txt: use system Eigen
-option(GTSAM_USE_SYSTEM_EIGEN "Find and use system-installed Eigen. If 'off', use the one bundled with GTSAM" OFF)
+option(GTSAM_USE_SYSTEM_EIGEN "Find and use system-installed Eigen. If 'off', use the one bundled with GTSAM" ON)
# 2. gtsam/cmake/GtsamBuildTypes: disable avx instruction set
# below the line `list_append_cache(GTSAM_COMPILE_OPTIONS_PUBLIC "-march=native")`
list_append_cache(GTSAM_COMPILE_OPTIONS_PUBLIC "-mno-avx")

Using the same version of Eigen helps avoid memory issues. Disabling avx instruction set also helps with some segment faults in our experience (this can be however OS and hardware dependent).

And finally build the whole workspace

# building GTSAM may take a while
catkin build

Instructions

Troubleshooting

Weird building issues after some tinkering. It is recommend to
- clean your workspace (catkin clean --all at the workspace root) and rebuild your workspace (catkin build)
- or catkin build --force-cmake
after your have made changes to CMake files (CMakeLists.txt or *.cmake) to make sure the changes take effect.

Longer explanation
Catkin tools can detect changes in CMake files and re-build affected files only. But since we are working with a multi-package project, some changes may not be detected as desired. For example, changing the building flags in `svo_cmake/cmake/Modules/SvoSetup.cmake` will affect all the packages but the re-compiling may not be done automatically (since the files in each package are not changed). Also, we need to keep the linking (e.g., library version) and compiling flags consistent across different packages. Therefore, unless you are familiar with how the compilation works out, it is the safest to re-build the whole workspace. `catkin build --force-cmake` should also work in most cases.
Compiling/linking error related to OpenCV: find find_package(OpenCV REQUIRED) in the CMakeLists.txt files in each package (in rpg_common, svo_ros, svo_direct, vikit/vikit_common and svo_online_loopclosing) and replace it with
```
# Ubuntu 18.04 + Melodic
find_package(OpenCV 3 REQUIRED)
# Ubuntu 20.04 + Noetic
find_package(OpenCV 4 REQUIRED)
```
Longer explanation
First, ROS is built against OpenCV 3 on Ubuntu 18.04 and OpenCV 4 on Ubuntu 20.04. It is desired to keep the OpenCV version linked in SVO consistent with the ROS one, since in `svo_ros` we need to link everything with ROS. Second, The original `CMakeLists.txt` files will work fine if you only have the default OpenCV installed. But if you have some customized version of OpenCV installed (e.g., from source), it is recommended to explicitly specify the version of OpenCV that should be used (=the version ROS uses) as mentione above.
Visualization issues with the PointCloud2: Using Points to visualize PointCloud2 in RVIZ seems to be problematic in Ubuntu 20.04. We use other visualization types instead of Points per default. However, it is good to be aware of this if you want to customize the visualization.
Pipeline crashes with loop closure enabled: If the pipeline crashes calling svo::loadVoc(), did you forgot to download the vocabulary files as mentioned above?
```
cd rpg_svo_pro_open/svo_online_loopclosing/vocabularies && ./download_voc.sh
```
Inconsistent Eigen versions during compilation: The same Eigen should be used across the whole project (which should be system Eigen, since we are also using ROS). Check whether eigen_catkin and gtsam find the same version of Eigen:
```
# for eigen_catkin
catkin build eigen_catkin --force-cmake --verbose
# for gtsam
catkin build gtsam --force-cmake --verbose
```
Longer explanation
One common pitfall of using Eigen in your projects is have different libraries compiled against different Eigen versions. For SVO, eigen_catkin (https://github.com/ethz-asl/eigen_catkin) is used to keep the Eigen version same, which should be the system one (under /usr/include) on 18.04 and 20.04. For GTSAM, system Eigen is found via a cumstomized cmake file (https://github.com/borglab/gtsam/blob/develop/cmake/FindEigen3.cmake#L66). It searches for `/usr/local/include` first, which may contain Eigen versions that are manually installed. Therefore, we explicitly specifies `EIGEN_INCLUDE_PATH` when configuring the workspace to force GTSAM to find system Eigen. If you still encounter inconsistent Eigen versions, the first thing to check is whether different versions of Eigen are still used.

Acknowledgement

Thanks to Simon Klenk, Manasi Muglikar, Giovanni Cioffi and Javier Hidalgo-CarriÃ³ for their valuable help and comments for the open source code.

The work is made possible thanks to the efforts of many contributors from RPG. Apart from the authors listed in the above papers, Titus Cieslewski and Henri Rebecq made significant contributions to the visual front-end. Jeffrey Delmerico made great efforts to apply SVO on different real robots, which in turn helped improve the pipeline. Many PhD and master students and lab engineers have also contributed to the code.

The Ceres-based optimization back-end is based on code developed at Zurich-eye, a spin-off from RPG. Jonathan Huber is the main contributor that integrated the back-end with SVO. Kunal Shrivastava (now CEO of SUIND) developed the loop closure module during his semester project and internship at RPG. The integration of the iSAM2-based global map was developed by Zichao Zhang.

We would like to thank our collaborators at Prophesee for pointing out several bugs in the visual front-end. Part of the code was developed during a funded project with Huawei.

Top Related Projects

Convert designs to code with AI

Introducing Visual Copilot: A new AI model to turn Figma designs to high quality code using your components.

Try Visual Copilot