VINS-Fusion

An optimization-based multi-sensor state estimator


Top Related Projects

VINS-Mono

5,380

A Robust and Versatile Monocular Visual-Inertial State Estimator

ORB_SLAM2

9,851

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

ORB_SLAM3

7,407

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Kimera-VIO

1,694

Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation.

Quick Overview

VINS-Fusion is an open-source, optimization-based state estimator for visual-inertial navigation. It fuses camera and IMU measurements to estimate pose for aerial and ground robots, and it supports three sensor configurations: mono camera + IMU, stereo cameras + IMU, and stereo cameras only.

Pros

Supports multiple sensor configurations (mono camera + IMU, stereo cameras + IMU, stereo cameras only)
Provides real-time performance on various platforms
Includes loop closure for improved accuracy and drift reduction
Open-source with active community support

Cons

Requires careful calibration for optimal performance
May struggle in environments with limited visual features
Can be computationally intensive for resource-constrained platforms
Learning curve for new users to understand and configure the system

Code Examples

Initializing the VINS-Fusion system:

#include "estimator/estimator.h"

Estimator estimator;
estimator.setParameter();

Processing IMU data:

void processIMU(double t, double dt, const Vector3d &linear_acceleration, const Vector3d &angular_velocity)
{
    // t: timestamp of this IMU sample (s); dt: time since the previous sample (s)
    estimator.processIMU(t, dt, linear_acceleration, angular_velocity);
}
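
The dt argument is the time elapsed since the previous IMU sample. As a minimal sketch of how such steps could be derived from raw timestamps, the helper below uses a hypothetical ImuSample struct and a hypothetical computeDts function; neither is part of VINS-Fusion.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one IMU sample (not a VINS-Fusion type).
struct ImuSample {
    double t;       // timestamp in seconds
    double acc[3];  // linear acceleration, m/s^2
    double gyr[3];  // angular velocity, rad/s
};

// Returns one integration step per consecutive pair of samples:
// dts[i] = samples[i + 1].t - samples[i].t.
std::vector<double> computeDts(const std::vector<ImuSample> &samples)
{
    std::vector<double> dts;
    for (std::size_t i = 1; i < samples.size(); ++i) {
        const double dt = samples[i].t - samples[i - 1].t;
        assert(dt > 0.0 && "IMU timestamps must be strictly increasing");
        dts.push_back(dt);
    }
    return dts;
}
```

Each returned step would then be passed along with the matching acceleration and angular velocity.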

Processing image data:

void processImage(const map<int, vector<pair<int, Eigen::Matrix<double, 7, 1>>>> &image, double timestamp)
{
    estimator.processImage(image, timestamp);
}
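
The nested map type is easier to read once unpacked. The sketch below mirrors it with standard containers only, using std::array<double, 7> as a stand-in for Eigen::Matrix<double, 7, 1>. The field order [x, y, z, u, v, vx, vy] and the helper names addObservation and countStereoFeatures are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// One observation, assumed layout: [x, y, z, u, v, vx, vy]
//   x, y, z : point on the normalized image plane
//   u, v    : pixel coordinates
//   vx, vy  : feature velocity on the normalized plane
using Observation = std::array<double, 7>;

// feature id -> list of (camera id, observation)
using FeatureFrame = std::map<int, std::vector<std::pair<int, Observation>>>;

// Appends one observation of a feature as seen by a given camera.
void addObservation(FeatureFrame &frame, int feature_id, int camera_id,
                    const Observation &obs)
{
    assert(camera_id >= 0);
    frame[feature_id].emplace_back(camera_id, obs);
}

// Counts features observed by both camera 0 and camera 1.
int countStereoFeatures(const FeatureFrame &frame)
{
    int count = 0;
    for (const auto &entry : frame) {
        bool seen0 = false;
        bool seen1 = false;
        for (const auto &cam_obs : entry.second) {
            if (cam_obs.first == 0) seen0 = true;
            if (cam_obs.first == 1) seen1 = true;
        }
        if (seen0 && seen1) ++count;
    }
    return count;
}
```

With a monocular setup every feature would carry a single entry for camera 0, so countStereoFeatures would return 0.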

Getting Started

Clone the repository into a catkin workspace:

cd ~/catkin_ws/src
git clone https://github.com/HKUST-Aerial-Robotics/VINS-Fusion.git

Build the project:

cd ../
catkin_make
source ~/catkin_ws/devel/setup.bash

Run the example (EuRoC, monocular camera + IMU):

rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml
rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

Replace YOUR_DATASET_FOLDER with the folder that holds your dataset, and run each command in its own terminal.

Competitor Comparisons

VINS-Mono

5,380

A Robust and Versatile Monocular Visual-Inertial State Estimator

Pros of VINS-Mono

Simpler implementation, easier to understand and modify
Lower computational requirements, suitable for resource-constrained systems
Faster processing time for real-time applications

Cons of VINS-Mono

Limited to monocular camera setups, reducing accuracy in some scenarios
Less robust in challenging environments with limited visual features
Lacks multi-sensor fusion capabilities for improved state estimation

Code Comparison

VINS-Mono (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        clahe->apply(_img, img);
    }
    else
        img = _img;
    // ... (additional processing)
}

VINS-Fusion (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        clahe->apply(_img, img);
    }
    else
        img = _img;
    // ... (additional processing)
}

The two snippets are identical up to the point shown (optional CLAHE histogram equalization before tracking), which highlights the code VINS-Fusion shares with VINS-Mono.

ORB_SLAM2

9,851

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

Pros of ORB_SLAM2

Lightweight and efficient, suitable for real-time applications on CPUs
Robust loop closing and relocalization capabilities
Well-documented and widely adopted in the robotics community

Cons of ORB_SLAM2

Vision-only SLAM, lacking multi-sensor fusion capabilities
May struggle in feature-poor environments or with rapid camera motion
No built-in support for IMU integration or GPS data

Code Comparison

ORB_SLAM2 (feature extraction):

void Frame::ExtractORB(int flag, const cv::Mat &im)
{
    if(flag==0)
        (*mpORBextractorLeft)(im,cv::Mat(),mvKeys,mDescriptors);
    else
        (*mpORBextractorRight)(im,cv::Mat(),mvKeysRight,mDescriptorsRight);
}

VINS-Fusion (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    frame_cnt++;
    cv::remap(_img, img, undist_map1_, undist_map2_, CV_INTER_LINEAR);
    // ... (additional processing)
}

Both repositories implement visual SLAM systems, but VINS-Fusion adds multi-sensor fusion, including IMU integration and a toy example of GPS fusion. ORB_SLAM2 is a purely visual SLAM system with robust loop closing and relocalization, while VINS-Fusion targets self-localization for autonomous platforms such as drones and cars.

rovio

1,189

Pros of ROVIO

Lightweight and computationally efficient, suitable for resource-constrained platforms
Robust to rapid motions and dynamic environments
Supports multi-camera setups for improved accuracy and robustness

Cons of ROVIO

Limited to visual-inertial odometry, lacking loop closure and global optimization
May struggle with feature-poor environments or low-texture scenes
Less extensive documentation and community support compared to VINS-Fusion

Code Comparison

ROVIO (C++):

rovio::RovioFilter<rovio::FilterState> rovioFilter(
    rovio::makeImgCovMat(1e-4),
    rovio::makePoseMeasCovMat(1e-6, 1e-6, 1e-6, 1e-4, 1e-4, 1e-4)
);

VINS-Fusion (C++):

estimator.setParameter();
f_manager.setRic(Ric);
ProjectionTwoFrameOneCamFactor::sqrt_info = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();
ProjectionTwoFrameTwoCamFactor::sqrt_info = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();
ProjectionOneFrameTwoCamFactor::sqrt_info = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();

Both repositories use C++ and provide similar functionality for visual-inertial odometry. ROVIO focuses on efficiency and robustness, while VINS-Fusion offers a more comprehensive SLAM solution with additional features like loop closure and multi-sensor fusion.
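
The sqrt_info assignments in the VINS-Fusion snippet scale each 2D reprojection residual by FOCAL_LENGTH / 1.5. One way to read this, offered here as an interpretation rather than the authors' stated rationale, is that a residual on the normalized image plane is converted to pixels and then divided by an assumed noise of 1.5 pixels. The sketch below shows that scaling with a hypothetical whitenResidual helper.

```cpp
#include <array>
#include <cassert>

// Scales a 2D residual expressed on the normalized image plane by
// focal_length / pixel_sigma, so that a residual of pixel_sigma pixels
// maps to a whitened value of 1. Helper name and parameters are illustrative.
std::array<double, 2> whitenResidual(const std::array<double, 2> &residual,
                                     double focal_length,
                                     double pixel_sigma)
{
    assert(focal_length > 0.0);
    assert(pixel_sigma > 0.0);
    const double scale = focal_length / pixel_sigma;
    std::array<double, 2> out = {{residual[0] * scale, residual[1] * scale}};
    return out;
}
```

For example, with an assumed focal length of 460 pixels, a 1.5-pixel error corresponds to 1.5 / 460 on the normalized plane and whitens to roughly 1.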

rpg_svo

2,161

Semi-direct Visual Odometry

Pros of rpg_svo

Lightweight and computationally efficient, suitable for resource-constrained platforms
Fast initialization and recovery from tracking failures
Runs with a single monocular camera, keeping the sensor setup simple

Cons of rpg_svo

Less accurate in complex environments compared to VINS-Fusion
Lacks multi-sensor fusion capabilities (e.g., IMU integration)
May struggle with rapid camera motions or feature-poor scenes

Code Comparison

rpg_svo:

FrameHandlerMono::FrameHandlerMono(vk::AbstractCamera* cam) :
  FrameHandlerBase(),
  cam_(cam),
  reprojector_(cam_, map_),
  depth_filter_(NULL)
{
  initialize();
}

VINS-Fusion:

Estimator::Estimator()
{
    clearState();
    failure_occur = 0;
    sum_of_back = 0;
    sum_of_front = 0;
    frame_count = 0;
}

The code snippets show the initialization of core classes in both projects. rpg_svo focuses on camera setup and depth filtering, while VINS-Fusion initializes state variables for estimation and tracking.

ORB_SLAM3

7,407

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Pros of ORB_SLAM3

Supports monocular, stereo, and RGB-D cameras, as well as visual-inertial odometry
Includes loop closing and relocalization capabilities
Offers real-time performance on standard CPUs

Cons of ORB_SLAM3

Requires careful parameter tuning for optimal performance
May struggle in environments with limited visual features
Less robust in dynamic scenes compared to VINS-Fusion

Code Comparison

ORB_SLAM3:

// Feature extraction
ORBextractor* mpORBextractorLeft;
ORBextractor* mpORBextractorRight;

// Frame object creation
Frame mCurrentFrame(imLeft, imRight, timeStamp, mpORBextractorLeft, mpORBextractorRight, mpORBVocabulary, mK, mDistCoef, mbf, mThDepth);

VINS-Fusion:

// Feature tracking
void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    frame_cnt++;
    cv::remap(_img, img, undist_map1_, undist_map2_, CV_INTER_LINEAR);
    // ... (additional processing)
}

Both systems extract and track visual features. ORB_SLAM3 is built on ORB features, as the ORB extractors and frame object in its snippet show, while the VINS-Fusion snippet shows image preprocessing ahead of feature tracking.

Kimera-VIO

1,694

Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation.

Pros of Kimera-VIO

Includes a 3D mesh reconstruction module, offering a more complete visual-inertial mapping solution
Provides a modular architecture, allowing easier integration of custom components
Offers both stereo and monocular VIO options, providing more flexibility for different sensor setups

Cons of Kimera-VIO

Generally requires more computational resources due to its additional features and complexity
Has a steeper learning curve for new users compared to VINS-Fusion
Less extensive documentation and community support compared to VINS-Fusion

Code Comparison

VINS-Fusion (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        TicToc t_c;
        clahe->apply(_img, img);
        ROS_DEBUG("CLAHE costs: %fms", t_c.toc());
    }
    else
        img = _img;
    // ... (additional processing)
}

Kimera-VIO (feature tracking):

void StereoVisionFrontEnd::featureTracking(
    const Frame& cur_frame,
    const Frame& ref_frame,
    const TrackingStatusMask& ref_mask,
    const gtsam::Pose3& body_pose_cam_ref,
    const gtsam::Pose3& body_pose_cam_cur,
    TrackingStatusMask* status_mask) {
  CHECK_NOTNULL(status_mask);
  const CameraParams& cam_params = tracker_params_.camera_params;
  // ... (additional processing)
}

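Both snippets are front-end entry points for feature tracking. The VINS-Fusion function works directly on the raw image (with optional CLAHE equalization), while the Kimera-VIO function takes Frame objects and gtsam::Pose3 camera poses, which gives it a more modular, object-oriented interface.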


README

VINS-Fusion is an optimization-based multi-sensor state estimator, which achieves accurate self-localization for autonomous applications (drones, cars, and AR/VR). VINS-Fusion is an extension of VINS-Mono, which supports multiple visual-inertial sensor types (mono camera + IMU, stereo cameras + IMU, even stereo cameras only). We also show a toy example of fusing VINS with GPS. Features:

multiple sensors support (stereo cameras / mono camera+IMU / stereo cameras+IMU)
online spatial calibration (transformation between camera and IMU)
online temporal calibration (time offset between camera and IMU)
visual loop closure
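
To illustrate what the temporal calibration feature implies for data association, here is a minimal sketch that shifts image timestamps by an estimated offset td before picking the IMU samples that lie between two frames. The function name imuTimesForFrame and the sign convention (corrected image time = raw image time + td) are assumptions, not taken from the VINS-Fusion sources.

```cpp
#include <cassert>
#include <vector>

// Returns the IMU timestamps in the interval
// (prev_image_t + td, cur_image_t + td], i.e. the samples to integrate
// between two consecutive camera frames after time-offset correction.
// Assumed convention: corrected image time = raw image time + td.
std::vector<double> imuTimesForFrame(const std::vector<double> &imu_times,
                                     double prev_image_t,
                                     double cur_image_t,
                                     double td)
{
    assert(cur_image_t > prev_image_t);
    const double t0 = prev_image_t + td;
    const double t1 = cur_image_t + td;
    std::vector<double> selected;
    for (double t : imu_times) {
        if (t > t0 && t <= t1) {
            selected.push_back(t);
        }
    }
    return selected;
}
```

A non-zero td changes which IMU samples are paired with each frame, which is why estimating it online matters when the camera and IMU clocks are not hardware-synchronized.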

We are the top open-sourced stereo algorithm on KITTI Odometry Benchmark (12.Jan.2019).

Authors: Tong Qin, Shaozu Cao, Jie Pan, Peiliang Li, and Shaojie Shen from the Aerial Robotics Group, HKUST

Related Papers: (the papers do not exactly match the code)

Online Temporal Calibration for Monocular Visual-Inertial Systems, Tong Qin, Shaojie Shen, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS, 2018), best student paper award
VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator, Tong Qin, Peiliang Li, Shaojie Shen, IEEE Transactions on Robotics

If you use VINS-Fusion for your academic research, please cite our related papers.

1. Prerequisites

1.1 Ubuntu and ROS

Ubuntu 64-bit 16.04 or 18.04 with ROS Kinetic or Melodic. See the ROS installation instructions.

1.2 Ceres Solver

Follow the Ceres Solver installation instructions.

2. Build VINS-Fusion

Clone the repository and catkin_make:

    cd ~/catkin_ws/src
    git clone https://github.com/HKUST-Aerial-Robotics/VINS-Fusion.git
    cd ../
    catkin_make
    source ~/catkin_ws/devel/setup.bash

(If this step fails, try another computer with a clean system, or reinstall Ubuntu and ROS.)

3. EuRoC Example

Download the EuRoC MAV Dataset to YOUR_DATASET_FOLDER. Taking MH_01 as an example, you can run VINS-Fusion with three sensor types (monocular camera + IMU, stereo cameras + IMU, and stereo cameras). Open four terminals and run VINS odometry, visual loop closure (optional), rviz, and bag playback, respectively. The green path is VIO odometry; the red path is odometry under visual loop closure.

3.1 Monocular camera + IMU

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml 
    rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

3.2 Stereo cameras + IMU

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_imu_config.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_imu_config.yaml 
    rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

3.3 Stereo cameras

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_config.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_config.yaml 
    rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

4. KITTI Example

4.1 KITTI Odometry (Stereo)

Download the KITTI Odometry dataset to YOUR_DATASET_FOLDER. Taking sequence 00 as an example, open two terminals and run vins and rviz, respectively. (We evaluated odometry on the KITTI benchmark without the loop closure function.)

    roslaunch vins vins_rviz.launch
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml
    rosrun vins kitti_odom_test ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml YOUR_DATASET_FOLDER/sequences/00/

4.2 KITTI GPS Fusion (Stereo + GPS)

Download the KITTI raw dataset to YOUR_DATASET_FOLDER. Taking 2011_10_03_drive_0027_sync as an example, open three terminals and run vins, global fusion, and rviz, respectively. The green path is VIO odometry; the blue path is odometry under GPS global fusion.

    roslaunch vins vins_rviz.launch
    rosrun vins kitti_gps_test ~/catkin_ws/src/VINS-Fusion/config/kitti_raw/kitti_10_03_config.yaml YOUR_DATASET_FOLDER/2011_10_03_drive_0027_sync/ 
    rosrun global_fusion global_fusion_node
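
Fusing GPS with local odometry requires GPS fixes in a local metric frame. The README credits GeographicLib for this; the sketch below swaps in a simpler equirectangular approximation, valid only near the origin, purely to show the idea. The names LocalXY and gpsToLocal are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// East/north offset in meters from a reference point.
struct LocalXY {
    double east;
    double north;
};

// Equirectangular approximation of a local tangent-plane conversion.
// This is a simplification, not the GeographicLib conversion.
LocalXY gpsToLocal(double lat_deg, double lon_deg,
                   double lat0_deg, double lon0_deg)
{
    assert(std::fabs(lat0_deg) < 90.0);
    const double kEarthRadius = 6371000.0;  // mean Earth radius, m
    const double kDegToRad = 3.14159265358979323846 / 180.0;
    const double d_lat = (lat_deg - lat0_deg) * kDegToRad;
    const double d_lon = (lon_deg - lon0_deg) * kDegToRad;
    LocalXY out;
    out.east = kEarthRadius * d_lon * std::cos(lat0_deg * kDegToRad);
    out.north = kEarthRadius * d_lat;
    return out;
}
```

The first GPS fix would typically serve as the origin (lat0_deg, lon0_deg), so the trajectory starts at (0, 0) in the local frame.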

5. VINS-Fusion on car demonstration

Download the car bag to YOUR_DATASET_FOLDER. Open four terminals and run VINS odometry, visual loop closure (optional), rviz, and bag playback, respectively. The green path is VIO odometry; the red path is odometry under visual loop closure.

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/vi_car/vi_car.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/vi_car/vi_car.yaml 
    rosbag play YOUR_DATASET_FOLDER/car.bag

6. Run with your devices

VIO is not only a software algorithm; it relies heavily on hardware quality. For beginners, we recommend running VIO with professional equipment that has global shutter cameras and hardware synchronization.

6.1 Configuration file

Write a config file for your device. You can use the EuRoC and KITTI config files as examples.
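
When writing a config for a new device, it can help to check first that the sensor combination is one of the three listed under Features. The sketch below encodes that rule; the SensorConfig struct and its field names are hypothetical and do not claim to match the keys of the YAML config files.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of the sensor-related settings in a config file.
struct SensorConfig {
    bool use_imu;    // true if an IMU is fused
    int num_of_cam;  // number of cameras (1 or 2)
};

// Supported combinations per the README: mono camera + IMU,
// stereo cameras + IMU, and stereo cameras only.
bool isSupported(const SensorConfig &cfg)
{
    assert(cfg.num_of_cam >= 0);
    if (cfg.num_of_cam == 2) return true;         // stereo, with or without IMU
    if (cfg.num_of_cam == 1) return cfg.use_imu;  // mono requires an IMU
    return false;
}

// Human-readable name of the configuration, or "unsupported".
std::string describe(const SensorConfig &cfg)
{
    if (!isSupported(cfg)) return "unsupported";
    if (cfg.num_of_cam == 1) return "mono camera + IMU";
    return cfg.use_imu ? "stereo cameras + IMU" : "stereo cameras";
}
```

A single camera without an IMU is rejected because that combination is not among the supported sensor types.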

6.2 Camera calibration

VINS-Fusion supports several camera models (pinhole, mei, equidistant). You can use the camera_models package to calibrate your cameras. We put some example data under /camera_models/calibrationdata to show how to calibrate.

cd ~/catkin_ws/src/VINS-Fusion/camera_models/camera_calib_example/
rosrun camera_models Calibrations -w 12 -h 8 -s 80 -i calibrationdata --camera-model pinhole
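
For orientation, calibrating with the pinhole model yields intrinsics that map a 3D point in the camera frame to a pixel. The sketch below shows the distortion-free form of that projection; the struct names and the numeric values in the usage note are illustrative assumptions, not output of the calibration tool.

```cpp
#include <cassert>

// Pinhole intrinsics: focal lengths and principal point, in pixels.
struct PinholeIntrinsics {
    double fx;
    double fy;
    double cx;
    double cy;
};

struct Pixel {
    double u;
    double v;
};

// Projects a point (x, y, z) in the camera frame to pixel coordinates,
// ignoring lens distortion. The point must lie in front of the camera.
Pixel project(const PinholeIntrinsics &cam, double x, double y, double z)
{
    assert(z > 0.0);
    Pixel p;
    p.u = cam.fx * (x / z) + cam.cx;
    p.v = cam.fy * (y / z) + cam.cy;
    return p;
}
```

With fx = fy = 460, cx = 320, and cy = 240, a point on the optical axis projects to the principal point (320, 240). A real calibration also estimates distortion coefficients, which this sketch omits.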

7. Docker Support

To simplify the build process, we provide Docker support. The Docker environment acts as a sandbox, which makes the code environment-independent. To run with Docker, first make sure ROS and Docker are installed on your machine. Then add your account to the docker group with sudo usermod -aG docker $YOUR_USER_NAME. If you get a Permission denied error, relaunch the terminal or log out and log back in. Then type:

cd ~/catkin_ws/src/VINS-Fusion/docker
make build

Note that the Docker build may take a while, depending on your network and machine. After VINS-Fusion is built successfully, you can run the VINS estimator with the script run.sh. The script takes several flags and arguments: -k means KITTI, -l enables loop fusion, and -g enables global fusion. Run ./run.sh -h for usage details. Here are some examples:

# EuRoC Monocular camera + IMU
./run.sh ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml

# EuRoC Stereo cameras + IMU with loop fusion
./run.sh -l ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_imu_config.yaml

# KITTI Odometry (Stereo)
./run.sh -k ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml YOUR_DATASET_FOLDER/sequences/00/

# KITTI Odometry (Stereo) with loop fusion
./run.sh -kl ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml YOUR_DATASET_FOLDER/sequences/00/

# KITTI GPS Fusion (Stereo + GPS)
./run.sh -kg ~/catkin_ws/src/VINS-Fusion/config/kitti_raw/kitti_10_03_config.yaml YOUR_DATASET_FOLDER/2011_10_03_drive_0027_sync/

In the EuRoC cases, you need to open another terminal and play your bag file. If you modify the code, simply re-run ./run.sh with the proper arguments after your changes.

8. Acknowledgements

We use Ceres Solver for non-linear optimization, DBoW2 for loop detection, a generic camera model, and GeographicLib.

9. License

The source code is released under the GPLv3 license.

We are still working on improving the code reliability. For any technical issues, please contact Tong Qin <qintonguavATgmail.com>.

For commercial inquiries, please contact Shaojie Shen <eeshaojieATust.hk>.
