HKUST-Aerial-Robotics / VINS-Fusion

An optimization-based multi-sensor state estimator

Top Related Projects

  • VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator
  • ORB_SLAM2: Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities
  • ROVIO
  • rpg_svo: Semi-direct Visual Odometry
  • ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
  • Kimera-VIO: Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation

Quick Overview

VINS-Fusion is an open-source state estimation system for visual-inertial navigation. It combines visual and inertial measurements to provide robust and accurate pose estimation for various robotic applications, including aerial and ground robots. The system supports multiple sensor configurations and can work with both monocular and stereo cameras.

Pros

  • Supports multiple sensor configurations (monocular camera + IMU, stereo cameras + IMU, stereo cameras only)
  • Provides real-time performance on various platforms
  • Includes loop closure for improved accuracy and drift reduction
  • Open-source with active community support

Cons

  • Requires careful calibration for optimal performance
  • May struggle in environments with limited visual features
  • Can be computationally intensive for resource-constrained platforms
  • Learning curve for new users to understand and configure the system

Code Examples

  1. Initializing the VINS-Fusion system:
#include "estimator/estimator.h"

Estimator estimator;
estimator.setParameter();
  2. Processing IMU data:
void processIMU(double dt, const Vector3d &linear_acceleration, const Vector3d &angular_velocity)
{
    estimator.processIMU(dt, linear_acceleration, angular_velocity);
}
  3. Processing image data:
void processImage(const map<int, vector<pair<int, Eigen::Matrix<double, 7, 1>>>> &image, double timestamp)
{
    estimator.processImage(image, timestamp);
}
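
The nested container type taken by processImage is easier to read once it is unpacked. The following is a minimal sketch, not code from the repository. It assumes that the map key is a feature id, that each pair holds a camera id, and that the 7-vector is laid out as [x, y, z, u, v, vx, vy] (normalized coordinates, pixel coordinates, pixel velocity). The names Observation, FeatureFrame, addObservation, and countMultiViewFeatures are hypothetical, and std::array stands in for Eigen::Matrix<double, 7, 1> so that the example has no dependencies.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Stand-in for Eigen::Matrix<double, 7, 1>.
// Assumed layout: [x, y, z, u, v, vx, vy].
using Observation = std::array<double, 7>;

// Assumed meaning: feature id -> list of (camera id, observation).
using FeatureFrame = std::map<int, std::vector<std::pair<int, Observation>>>;

// Hypothetical helper: record one observation of a feature by one camera.
// (x, y) is taken to lie on the normalized image plane, so z is fixed to 1.
void addObservation(FeatureFrame &frame, int feature_id, int camera_id,
                    double x, double y, double u, double v,
                    double vx, double vy)
{
    assert(camera_id >= 0);
    frame[feature_id].emplace_back(camera_id,
                                   Observation{x, y, 1.0, u, v, vx, vy});
}

// Hypothetical helper: count features that have more than one observation,
// for example features seen by both cameras of a stereo pair.
std::size_t countMultiViewFeatures(const FeatureFrame &frame)
{
    std::size_t count = 0;
    for (const auto &entry : frame)
        if (entry.second.size() > 1)
            ++count;
    return count;
}
```

With a monocular setup every feature would carry a single observation, so countMultiViewFeatures would return zero under these assumptions.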

Getting Started

  1. Clone the repository into the src folder of a catkin workspace:

    cd ~/catkin_ws/src
    git clone https://github.com/HKUST-Aerial-Robotics/VINS-Fusion.git

  2. Build the project with catkin_make and source the workspace:

    cd ../
    catkin_make
    source ~/catkin_ws/devel/setup.bash

  3. Run the estimator with a configuration file:

    rosrun vins vins_node PATH_TO_CONFIG_FILE

Replace PATH_TO_CONFIG_FILE with the configuration file for your sensor setup, for example one of the EuRoC files under config/euroc, and play the matching bag file in another terminal.

Competitor Comparisons

VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator

Pros of VINS-Mono

  • Simpler implementation, easier to understand and modify
  • Lower computational requirements, suitable for resource-constrained systems
  • Faster processing time for real-time applications

Cons of VINS-Mono

  • Limited to monocular camera setups, reducing accuracy in some scenarios
  • Less robust in challenging environments with limited visual features
  • Lacks multi-sensor fusion capabilities for improved state estimation

Code Comparison

VINS-Mono (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        clahe->apply(_img, img);
    }
    else
        img = _img;

    // ... (remainder of the function omitted)
}

VINS-Fusion (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        clahe->apply(_img, img);
    }
    else
        img = _img;

    // ... (remainder of the function omitted)
}

The two snippets are identical, which reflects the codebase shared between the two projects.

ORB_SLAM2: Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

Pros of ORB_SLAM2

  • Lightweight and efficient, suitable for real-time applications on CPUs
  • Robust loop closing and relocalization capabilities
  • Well-documented and widely adopted in the robotics community

Cons of ORB_SLAM2

  • Vision-only (monocular, stereo, or RGB-D cameras), with no multi-sensor fusion capabilities
  • May struggle in feature-poor environments or with rapid camera motion
  • No built-in support for IMU integration or GPS data

Code Comparison

ORB_SLAM2 (feature extraction):

void Frame::ExtractORB(int flag, const cv::Mat &im)
{
    if(flag==0)
        (*mpORBextractorLeft)(im,cv::Mat(),mvKeys,mDescriptors);
    else
        (*mpORBextractorRight)(im,cv::Mat(),mvKeysRight,mDescriptorsRight);
}

VINS-Fusion (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    frame_cnt++;
    cv::remap(_img, img, undist_map1_, undist_map2_, CV_INTER_LINEAR);
    // ... (additional processing)
}

Both repositories implement visual SLAM systems, but VINS-Fusion also fuses IMU measurements and, as a toy example, GPS. ORB_SLAM2 is a purely visual SLAM system with robust loop closing and relocalization, while VINS-Fusion targets multi-sensor state estimation for drones, cars, and AR/VR.

ROVIO

Pros of ROVIO

  • Lightweight and computationally efficient, suitable for resource-constrained platforms
  • Robust to rapid motions and dynamic environments
  • Supports multi-camera setups for improved accuracy and robustness

Cons of ROVIO

  • Limited to visual-inertial odometry, lacking loop closure and global optimization
  • May struggle with feature-poor environments or low-texture scenes
  • Less extensive documentation and community support compared to VINS-Fusion

Code Comparison

ROVIO (C++):

rovio::RovioFilter<rovio::FilterState> rovioFilter(
    rovio::makeImgCovMat(1e-4),
    rovio::makePoseMeasCovMat(1e-6, 1e-6, 1e-6, 1e-4, 1e-4, 1e-4)
);

VINS-Fusion (C++):

estimator.setParameter();
f_manager.setRic(Ric);
ProjectionTwoFrameOneCamFactor::sqrt_info = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();
ProjectionTwoFrameTwoCamFactor::sqrt_info = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();
ProjectionOneFrameTwoCamFactor::sqrt_info = FOCAL_LENGTH / 1.5 * Matrix2d::Identity();

Both repositories use C++ and provide similar functionality for visual-inertial odometry. ROVIO focuses on efficiency and robustness, while VINS-Fusion offers a more comprehensive SLAM solution with additional features like loop closure and multi-sensor fusion.

rpg_svo: Semi-direct Visual Odometry

Pros of rpg_svo

  • Lightweight and computationally efficient, suitable for resource-constrained platforms
  • Fast initialization and recovery from tracking failures
  • Supports both monocular and stereo camera setups

Cons of rpg_svo

  • Less accurate in complex environments compared to VINS-Fusion
  • Lacks multi-sensor fusion capabilities (e.g., IMU integration)
  • May struggle with rapid camera motions or feature-poor scenes

Code Comparison

rpg_svo:

FrameHandlerMono::FrameHandlerMono(vk::AbstractCamera* cam) :
  FrameHandlerBase(),
  cam_(cam),
  reprojector_(cam_, map_),
  depth_filter_(NULL)
{
  initialize();
}

VINS-Fusion:

Estimator::Estimator()
{
    clearState();
    failure_occur = 0;
    sum_of_back = 0;
    sum_of_front = 0;
    frame_count = 0;
}

The code snippets show the initialization of core classes in both projects. rpg_svo focuses on camera setup and depth filtering, while VINS-Fusion initializes state variables for estimation and tracking.

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Pros of ORB_SLAM3

  • Supports monocular, stereo, and RGB-D cameras, as well as visual-inertial odometry
  • Includes loop closing and relocalization capabilities
  • Offers real-time performance on standard CPUs

Cons of ORB_SLAM3

  • Requires careful parameter tuning for optimal performance
  • May struggle in environments with limited visual features
  • Less robust in dynamic scenes compared to VINS-Fusion

Code Comparison

ORB_SLAM3:

// Feature extraction
ORBextractor* mpORBextractorLeft;
ORBextractor* mpORBextractorRight;

// Frame object creation
Frame mCurrentFrame(imLeft, imRight, timeStamp, mpORBextractorLeft, mpORBextractorRight, mpORBVocabulary, mK, mDistCoef, mbf, mThDepth);

VINS-Fusion:

// Feature tracking
void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    frame_cnt++;
    cv::remap(_img, img, undist_map1_, undist_map2_, CV_INTER_LINEAR);
    // ... (remainder of the function omitted)
}

Both systems rely on image features: ORB_SLAM3 extracts and matches ORB descriptors, while VINS-Fusion tracks features from frame to frame. The ORB_SLAM3 snippet shows the ORB extractors being passed to a new Frame object, while the VINS-Fusion snippet shows only the start of its feature-tracking routine.

Kimera-VIO: Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation.

Pros of Kimera-VIO

  • Includes a 3D mesh reconstruction module, offering a more complete visual-inertial mapping solution
  • Provides a modular architecture, allowing easier integration of custom components
  • Offers both stereo and monocular VIO options, providing more flexibility for different sensor setups

Cons of Kimera-VIO

  • Generally requires more computational resources due to its additional features and complexity
  • Has a steeper learning curve for new users compared to VINS-Fusion
  • Less extensive documentation and community support compared to VINS-Fusion

Code Comparison

VINS-Fusion (feature tracking):

void FeatureTracker::readImage(const cv::Mat &_img, double _cur_time)
{
    cv::Mat img;
    TicToc t_r;
    cur_time = _cur_time;

    if (EQUALIZE)
    {
        cv::Ptr<cv::CLAHE> clahe = cv::createCLAHE(3.0, cv::Size(8, 8));
        TicToc t_c;
        clahe->apply(_img, img);
        ROS_DEBUG("CLAHE costs: %fms", t_c.toc());
    }
    else
        img = _img;

    // ... (remainder of the function omitted)
}

Kimera-VIO (feature tracking):

void StereoVisionFrontEnd::featureTracking(
    const Frame& cur_frame,
    const Frame& ref_frame,
    const TrackingStatusMask& ref_mask,
    const gtsam::Pose3& body_pose_cam_ref,
    const gtsam::Pose3& body_pose_cam_cur,
    TrackingStatusMask* status_mask) {
  CHECK_NOTNULL(status_mask);
  const CameraParams& cam_params = tracker_params_.camera_params;
  // ... (remainder of the function omitted)
}

Both codebases use similar approaches for feature tracking, but Kimera-VIO's implementation appears more modular and object-oriented.

README

VINS-Fusion

An optimization-based multi-sensor state estimator

VINS-Fusion is an optimization-based multi-sensor state estimator, which achieves accurate self-localization for autonomous applications (drones, cars, and AR/VR). VINS-Fusion is an extension of VINS-Mono, which supports multiple visual-inertial sensor types (mono camera + IMU, stereo cameras + IMU, even stereo cameras only). We also show a toy example of fusing VINS with GPS.

Features:

  • multiple sensors support (stereo cameras / mono camera+IMU / stereo cameras+IMU)
  • online spatial calibration (transformation between camera and IMU)
  • online temporal calibration (time offset between camera and IMU)
  • visual loop closure
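
To make the temporal calibration item concrete: once a time offset td between the camera and the IMU is known, image timestamps can be shifted onto the IMU clock before the IMU samples between two frames are gathered. The following is a minimal sketch under stated assumptions, not the estimator's code. It assumes the sign convention "image clock + td = IMU clock", treats td as given rather than estimated online, and uses a hypothetical function name, imuTimesBetweenImages.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: return the IMU timestamps (seconds, IMU clock) that
// fall in the half-open interval (t_prev_image + td, t_cur_image + td].
// Assumed convention: image clock + td = IMU clock.
std::vector<double> imuTimesBetweenImages(const std::vector<double> &imu_times,
                                          double t_prev_image,
                                          double t_cur_image,
                                          double td)
{
    assert(t_prev_image <= t_cur_image);
    const double start = t_prev_image + td;  // previous frame on the IMU clock
    const double end = t_cur_image + td;     // current frame on the IMU clock

    std::vector<double> selected;
    for (double t : imu_times)
        if (t > start && t <= end)
            selected.push_back(t);
    return selected;
}
```

With IMU samples at 0.0, 0.25, 0.5, 0.75, and 1.0 s and images stamped 0.0 s and 0.5 s, an offset of 0.25 s moves the selected samples from {0.25, 0.5} to {0.5, 0.75}, which shows how an uncorrected offset would pair each frame with the wrong inertial data.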

As of 12 January 2019, we are the top open-source stereo algorithm on the KITTI Odometry Benchmark.

Authors: Tong Qin, Shaozu Cao, Jie Pan, Peiliang Li, and Shaojie Shen from the Aerial Robotics Group, HKUST

Related Papers: (the papers do not exactly match the code)

  • Online Temporal Calibration for Monocular Visual-Inertial Systems, Tong Qin, Shaojie Shen, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2018, best student paper award

  • VINS-Mono: A Robust and Versatile Monocular Visual-Inertial State Estimator, Tong Qin, Peiliang Li, Shaojie Shen, IEEE Transactions on Robotics

If you use VINS-Fusion for your academic research, please cite our related papers.

1. Prerequisites

1.1 Ubuntu and ROS

Ubuntu 64-bit 16.04 or 18.04 with ROS Kinetic or Melodic. See the ROS installation instructions.

1.2 Ceres Solver

Follow the Ceres installation instructions.

2. Build VINS-Fusion

Clone the repository and catkin_make:

    cd ~/catkin_ws/src
    git clone https://github.com/HKUST-Aerial-Robotics/VINS-Fusion.git
    cd ../
    catkin_make
    source ~/catkin_ws/devel/setup.bash

(If this step fails, try another computer with a clean system, or reinstall Ubuntu and ROS.)

3. EuRoC Example

Download the EuRoC MAV Dataset to YOUR_DATASET_FOLDER. Taking MH_01 as an example, you can run VINS-Fusion with three sensor configurations (monocular camera + IMU, stereo cameras + IMU, and stereo cameras only). Open four terminals and run, respectively, VINS odometry, visual loop closure (optional), rviz, and playback of the bag file. The green path is VIO odometry; the red path is odometry with visual loop closure.

3.1 Monocular camera + IMU

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml 
    rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

3.2 Stereo cameras + IMU

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_imu_config.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_imu_config.yaml 
    rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

3.3 Stereo cameras

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_config.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_config.yaml 
    rosbag play YOUR_DATASET_FOLDER/MH_01_easy.bag

4. KITTI Example

4.1 KITTI Odometry (Stereo)

Download the KITTI Odometry dataset to YOUR_DATASET_FOLDER. Taking sequence 00 as an example, open two terminals and run VINS and rviz, respectively. (We evaluated odometry on the KITTI benchmark without the loop closure function.)

    roslaunch vins vins_rviz.launch
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml
    rosrun vins kitti_odom_test ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml YOUR_DATASET_FOLDER/sequences/00/ 

4.2 KITTI GPS Fusion (Stereo + GPS)

Download the KITTI raw dataset to YOUR_DATASET_FOLDER. Taking 2011_10_03_drive_0027_sync as an example, open three terminals and run VINS, global fusion, and rviz, respectively. The green path is VIO odometry; the blue path is odometry with GPS global fusion.

    roslaunch vins vins_rviz.launch
    rosrun vins kitti_gps_test ~/catkin_ws/src/VINS-Fusion/config/kitti_raw/kitti_10_03_config.yaml YOUR_DATASET_FOLDER/2011_10_03_drive_0027_sync/ 
    rosrun global_fusion global_fusion_node
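
GPS fixes arrive as latitude and longitude, whereas odometry lives in a local metric frame, so any fusion step first has to express both in common coordinates. The following is a minimal sketch of that conversion, not the global_fusion code. It uses a flat-earth (equirectangular) approximation on a sphere with the WGS-84 equatorial radius, which is only reasonable over short distances; a geodesy library such as GeographicLib handles this properly. The names LocalXY and gpsToLocal are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: offset from the reference point, in metres.
struct LocalXY
{
    double east;
    double north;
};

// Hypothetical helper: convert a GPS fix to east/north offsets relative to a
// reference fix (lat0, lon0), using a flat-earth approximation.
LocalXY gpsToLocal(double lat_deg, double lon_deg,
                   double lat0_deg, double lon0_deg)
{
    assert(std::fabs(lat0_deg) < 90.0);  // the approximation breaks at the poles

    const double kEarthRadius = 6378137.0;              // WGS-84 equatorial radius, metres
    const double kDegToRad = std::acos(-1.0) / 180.0;   // pi / 180

    const double d_lat = (lat_deg - lat0_deg) * kDegToRad;
    const double d_lon = (lon_deg - lon0_deg) * kDegToRad;

    // Meridians converge towards the poles, so east-west distance is
    // scaled by the cosine of the reference latitude.
    return LocalXY{kEarthRadius * d_lon * std::cos(lat0_deg * kDegToRad),
                   kEarthRadius * d_lat};
}
```

Taking the first GPS fix as the reference point puts the origin of the local frame at the start of the trajectory, which is a common choice when aligning GPS with an odometry path.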

5. VINS-Fusion on car demonstration

Download the car bag to YOUR_DATASET_FOLDER. Open four terminals and run, respectively, VINS odometry, visual loop closure (optional), rviz, and playback of the bag file. The green path is VIO odometry; the red path is odometry with visual loop closure.

    roslaunch vins vins_rviz.launch
    rosrun vins vins_node ~/catkin_ws/src/VINS-Fusion/config/vi_car/vi_car.yaml 
    (optional) rosrun loop_fusion loop_fusion_node ~/catkin_ws/src/VINS-Fusion/config/vi_car/vi_car.yaml 
    rosbag play YOUR_DATASET_FOLDER/car.bag

6. Run with your devices

VIO is not only a software algorithm; it relies heavily on hardware quality. For beginners, we recommend running VIO with professional equipment that has global shutter cameras and hardware synchronization.

6.1 Configuration file

Write a config file for your device. You can use the EuRoC and KITTI config files as examples.

6.2 Camera calibration

VINS-Fusion supports several camera models (pinhole, mei, equidistant). You can use the camera_models package to calibrate your cameras. We put some example data under /camera_models/calibrationdata to show how to calibrate.

cd ~/catkin_ws/src/VINS-Fusion/camera_models/camera_calib_example/
rosrun camera_models Calibrations -w 12 -h 8 -s 80 -i calibrationdata --camera-model pinhole
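
For orientation, the pinhole model named in the command above maps a 3D point in the camera frame to pixel coordinates using four intrinsic parameters, which are what calibration estimates. The following is a minimal sketch of the ideal pinhole projection without lens distortion. It is not the camera_models implementation, and the names PinholeIntrinsics and projectPinhole are hypothetical.

```cpp
#include <array>
#include <cassert>

// Hypothetical container for the four pinhole intrinsics, in pixels.
struct PinholeIntrinsics
{
    double fx;  // focal length along the image x axis
    double fy;  // focal length along the image y axis
    double cx;  // principal point, x
    double cy;  // principal point, y
};

// Hypothetical helper: project a point (X, Y, Z) given in the camera frame
// to pixel coordinates (u, v). Lens distortion is ignored.
std::array<double, 2> projectPinhole(const PinholeIntrinsics &K,
                                     double X, double Y, double Z)
{
    assert(Z > 0.0);  // the point must lie in front of the camera
    const double u = K.fx * X / Z + K.cx;
    const double v = K.fy * Y / Z + K.cy;
    return std::array<double, 2>{{u, v}};
}
```

A real calibration also estimates distortion coefficients, and the mei and equidistant models replace this projection with ones suited to wide-angle and fisheye lenses.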

7. Docker Support

To further simplify the build process, we provide Docker support. A Docker environment works like a sandbox, which makes the code environment-independent. To run with Docker, first make sure ROS and Docker are installed on your machine. Then add your account to the docker group with sudo usermod -aG docker $YOUR_USER_NAME. If you get a Permission denied error, relaunch the terminal or log out and log back in. Then type:

cd ~/catkin_ws/src/VINS-Fusion/docker
make build

Note that the Docker build may take a while, depending on your network and machine. After VINS-Fusion has been built successfully, you can run the VINS estimator with the script run.sh. The script accepts several flags and arguments: -k means KITTI, -l enables loop fusion, and -g enables global fusion. Run ./run.sh -h for usage details. Here are some examples:

# EuRoC monocular camera + IMU
./run.sh ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_mono_imu_config.yaml

# EuRoC stereo cameras + IMU with loop fusion
./run.sh -l ~/catkin_ws/src/VINS-Fusion/config/euroc/euroc_stereo_imu_config.yaml

# KITTI Odometry (Stereo)
./run.sh -k ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml YOUR_DATASET_FOLDER/sequences/00/

# KITTI Odometry (Stereo) with loop fusion
./run.sh -kl ~/catkin_ws/src/VINS-Fusion/config/kitti_odom/kitti_config00-02.yaml YOUR_DATASET_FOLDER/sequences/00/

# KITTI GPS Fusion (Stereo + GPS)
./run.sh -kg ~/catkin_ws/src/VINS-Fusion/config/kitti_raw/kitti_10_03_config.yaml YOUR_DATASET_FOLDER/2011_10_03_drive_0027_sync/

In the EuRoC cases, you need to open another terminal and play your bag file. If you modify the code, simply re-run ./run.sh with the proper arguments after your changes.

8. Acknowledgements

We use Ceres Solver for non-linear optimization, DBoW2 for loop detection, a generic camera model, and GeographicLib.

9. License

The source code is released under the GPLv3 license.

We are still working on improving the code reliability. For any technical issues, please contact Tong Qin <qintonguavATgmail.com>.

For commercial inquiries, please contact Shaojie Shen <eeshaojieATust.hk>.