
Kimera-VIO

Visual Inertial Odometry with SLAM capabilities and 3D Mesh generation.


Top Related Projects

A Robust and Versatile Monocular Visual-Inertial State Estimator

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping

An optimization-based multi-sensor state estimator

Quick Overview

Kimera-VIO is an open-source Visual Inertial Odometry (VIO) library developed by the MIT SPARK Lab. It provides a robust, efficient solution for real-time pose estimation and mapping from visual and inertial sensors, making it suitable for robotics and augmented-reality applications.

Pros

  • High accuracy and robustness in challenging environments
  • Real-time performance on CPU
  • Modular design allowing easy integration and customization
  • Extensive documentation and examples

Cons

  • Steep learning curve for beginners
  • Limited support for non-ROS environments
  • Requires careful calibration for optimal performance
  • May struggle in environments with limited visual features

Code Examples

  1. Initializing Kimera-VIO:
#include <kimera-vio/pipeline/Pipeline.h>

gtsam::Pose3 initial_W_Body = gtsam::Pose3::identity();
VioBackEndParams vio_params;
ImuParams imu_params;
VioPipeline vio_pipeline(vio_params, imu_params, initial_W_Body);
  2. Processing a frame:
cv::Mat left_img, right_img;
double timestamp;
// Assume left_img, right_img, and timestamp are populated
VioNavState updated_state = vio_pipeline.spinOnce(left_img, right_img, timestamp);
  3. Accessing the 3D map:
const LandmarkMap& landmarks = vio_pipeline.getLandmarkMap();
for (const auto& landmark : landmarks) {
    gtsam::Point3 position = landmark.second;
    // Process landmark position
}

Getting Started

  1. Clone the repository:

    git clone https://github.com/MIT-SPARK/Kimera-VIO.git
    
  2. Install dependencies:

    cd Kimera-VIO
    ./scripts/install_deps.sh
    
  3. Build the project:

    mkdir build && cd build
    cmake ..
    make -j4
    
  4. Run the example:

    ./build/stereoVIOEuroc -p $(pwd)/params/euroc/euroc_flags.yaml -d PATH_TO_EUROC_DATASET
    

Competitor Comparisons

A Robust and Versatile Monocular Visual-Inertial State Estimator

Pros of VINS-Mono

  • More mature and widely adopted in the robotics community
  • Supports loop closure for improved accuracy in long trajectories
  • Provides a more comprehensive sensor fusion framework

Cons of VINS-Mono

  • Less optimized for real-time performance on resource-constrained platforms
  • Lacks some advanced features, such as multi-session mapping and 3D mesh generation

Code Comparison

VINS-Mono (initialization):

void Estimator::processIMU(double dt, const Vector3d &linear_acceleration, const Vector3d &angular_velocity)
{
    if (!first_imu)
    {
        first_imu = true;
        acc_0 = linear_acceleration;
        gyr_0 = angular_velocity;
    }
    // ... (additional processing)
}

Kimera-VIO (initialization):

void VioBackEnd::initializeStateAndSetPriors(
    const VioNavState& vio_nav_state_initial_seed) {
  CHECK(state_) << "State is not initialized.";
  state_->setNavState(vio_nav_state_initial_seed);
  // ... (additional initialization)
}

Both repositories provide robust visual-inertial odometry solutions, but Kimera-VIO focuses more on real-time performance and advanced mapping features, while VINS-Mono offers a more established and comprehensive framework for sensor fusion and loop closure.

Real-Time SLAM for Monocular, Stereo and RGB-D Cameras, with Loop Detection and Relocalization Capabilities

Pros of ORB_SLAM2

  • Well-established and widely used in the SLAM community
  • Robust loop closure and relocalization capabilities
  • Supports monocular, stereo, and RGB-D cameras

Cons of ORB_SLAM2

  • Limited to visual information, no IMU integration
  • May struggle in feature-poor environments
  • Less frequent updates and maintenance

Code Comparison

ORB_SLAM2 (C++):

// Feature extraction
ORBextractor* mpORBextractor;
mpORBextractor = new ORBextractor(nFeatures,fScaleFactor,nLevels,iniThFAST,minThFAST);

// Tracking
mCurrentFrame = Frame(mImGray,timestamp,mpORBextractor,mpORBVocabulary,mK,mDistCoef,mbf,mThDepth);

Kimera-VIO (C++):

// VIO pipeline
VioBackEnd vio_backend(FLAGS_params_folder + "/BackendParams.yaml");
VioFrontEnd vio_frontend(FLAGS_params_folder + "/FrontendParams.yaml");

// Run VIO
vio_pipeline.spinOnce(stereo_frame, imu_data);

ORB_SLAM2 focuses on visual feature extraction and tracking, while Kimera-VIO integrates visual and inertial data in a tightly-coupled VIO pipeline. Kimera-VIO's code structure reflects its more comprehensive approach to sensor fusion and state estimation.

ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM

Pros of ORB_SLAM3

  • More versatile, supporting monocular, stereo, and RGB-D cameras, with or without an IMU
  • Better loop closure and relocalization capabilities
  • More mature and widely adopted in the SLAM community

Cons of ORB_SLAM3

  • Slower processing speed compared to Kimera-VIO
  • Less focus on real-time performance in resource-constrained environments
  • No 3D mesh generation or metric-semantic mapping, unlike the broader Kimera ecosystem

Code Comparison

ORB_SLAM3:

void System::TrackMonocular(const cv::Mat &im, const double &timestamp, const vector<IMU::Point>& vImuMeas)
{
    if(mSensor!=MONOCULAR && mSensor!=IMU_MONOCULAR)
    {
        cerr << "ERROR: you called TrackMonocular but input sensor was not set to Monocular nor Monocular-Inertial." << endl;
        exit(-1);
    }
    // ...
}

Kimera-VIO:

void Pipeline::spinOnce(const StereoImuSyncPacket& stereo_imu_sync_packet) {
  CHECK(frontend_);
  CHECK(backend_);
  CHECK(mesher_);
  CHECK(visualizer_);
  // ...
}

The code snippets show different approaches to handling sensor input and pipeline execution. ORB_SLAM3 focuses on specific sensor configurations, while Kimera-VIO uses a more modular approach with separate components for frontend, backend, meshing, and visualization.

LIO-SAM: Tightly-coupled Lidar Inertial Odometry via Smoothing and Mapping

Pros of LIO-SAM

  • Utilizes LiDAR-inertial odometry, providing robust performance in environments with limited visual features
  • Implements loop closure for improved mapping accuracy and drift correction
  • Offers real-time performance on most modern computers without requiring a GPU

Cons of LIO-SAM

  • Limited to LiDAR and IMU sensors, lacking support for visual data integration
  • May struggle in environments with limited geometric features or rapid motion

Code Comparison

Kimera-VIO (C++):

gtsam::NavState propagateIMUState(const gtsam::NavState& state_k,
                                  const ImuAccGyr& imu_accgyr,
                                  const double delta_t) {
  // IMU propagation implementation
}

LIO-SAM (C++):

void imuHandler(const sensor_msgs::Imu::ConstPtr& imuMsg) {
    // IMU data processing and integration
}

Both projects are written in C++, but Kimera-VIO focuses on visual-inertial odometry while LIO-SAM emphasizes LiDAR-inertial odometry. Kimera-VIO's snippet shows IMU state propagation using GTSAM, while LIO-SAM's processes IMU data inside a ROS callback.

An optimization-based multi-sensor state estimator

Pros of VINS-Fusion

  • Supports multi-sensor fusion (stereo cameras, IMU, GPS)
  • Provides loop closure for improved accuracy
  • Offers both loosely and tightly coupled sensor fusion

Cons of VINS-Fusion

  • Less optimized for real-time performance compared to Kimera-VIO
  • May require more computational resources
  • Limited support for semantic information integration

Code Comparison

VINS-Fusion (initialization):

void System::ProcessImage(const map<int, vector<pair<int, Eigen::Matrix<double, 7, 1>>>> &image, const double header)
{
    if (init_feature_num_ < 20)
    {
        if (feature_manager.AddFeatureCheckParallax(frame_count, image, td))
            init_feature_num_++;
        else
            return;
    }
    // ... (additional initialization code)
}

Kimera-VIO (initialization):

bool VioBackEnd::initializeStateAndSetPriors(
    const gtsam::Pose3& initial_W_Pose_B,
    const gtsam::Vector3& initial_W_Vel_B,
    const ImuBias& initial_imu_bias) {
  initial_state_ = VioNavState(initial_W_Pose_B, initial_W_Vel_B, initial_imu_bias);
  return true;
}

Both repositories provide visual-inertial odometry solutions, but VINS-Fusion offers more flexibility in sensor fusion and loop closure, while Kimera-VIO focuses on real-time performance and semantic information integration.

ROVIO (Robust Visual Inertial Odometry)

Pros of ROVIO

  • Lightweight and computationally efficient, suitable for resource-constrained platforms
  • Robust performance in challenging environments with rapid motions
  • Well-documented and extensively tested in real-world scenarios

Cons of ROVIO

  • Limited to monocular visual-inertial odometry, lacking multi-sensor fusion capabilities
  • Does not provide dense 3D reconstruction or mapping functionalities
  • May struggle with scale estimation in certain scenarios

Code Comparison

ROVIO:

rovio::RovioNode<rovio::FilterState> rovio_node;
rovio_node.makeTest();
rovio_node.makeFilterTest();

Kimera-VIO:

VioParams vio_params;
VioBackEnd vio_backend(vio_params);
VioFrontEnd vio_frontend(vio_params);

ROVIO focuses on a tightly-coupled EKF-based approach, while Kimera-VIO implements a modular pipeline with separate front-end and back-end components. Kimera-VIO offers more extensive functionality, including multi-sensor fusion and 3D reconstruction, but may require more computational resources compared to ROVIO's lightweight design.


README

Kimera-VIO: Open-Source Visual Inertial Odometry

For evaluation plots, check our Jenkins server.

Authors: Antoni Rosinol, Yun Chang, Marcus Abate, Nathan Hughes, Sandro Berchier, Luca Carlone

What is Kimera-VIO?

Kimera-VIO is a Visual Inertial Odometry pipeline for accurate State Estimation from Stereo + IMU data. It can optionally use Mono + IMU data instead of stereo cameras.

Publications

We kindly ask you to cite our papers if you find this library useful:

@InProceedings{Rosinol20icra-Kimera,
  title = {Kimera: an Open-Source Library for Real-Time Metric-Semantic Localization and Mapping},
  author = {Rosinol, Antoni and Abate, Marcus and Chang, Yun and Carlone, Luca},
  year = {2020},
  booktitle = {IEEE Intl. Conf. on Robotics and Automation (ICRA)},
  url = {https://github.com/MIT-SPARK/Kimera},
  pdf = {https://arxiv.org/pdf/1910.02490.pdf}
}
@article{Rosinol21arxiv-Kimera,
  title = {Kimera: from {SLAM} to Spatial Perception with {3D} Dynamic Scene Graphs},
  author = {Rosinol, Antoni and Violette, Andrew and Abate, Marcus and Hughes, Nathan and Chang, Yun and Shi, Jingnan and Gupta, Arjun and Carlone, Luca},
  year = {2021},
  journal = {arXiv preprint arXiv:2101.06894},
  pdf = {https://arxiv.org/pdf/2101.06894.pdf}
}

Related Publications

Backend optimization is based on:

  • C. Forster, L. Carlone, F. Dellaert, and D. Scaramuzza. On-Manifold Preintegration for Real-Time Visual-Inertial Odometry. IEEE Transactions on Robotics, 2017.

Alternatively, the Regular VIO Backend, using structural regularities, is described in this paper:

  • A. Rosinol, T. Sattler, M. Pollefeys, and L. Carlone. Incremental Visual-Inertial 3D Mesh Generation with Structural Regularities. IEEE Intl. Conf. on Robotics and Automation (ICRA), 2019.

Demo

1. Installation

Tested on Ubuntu 20.04.

Prerequisites

Note: if you want to avoid building all dependencies yourself, we provide a docker image that will install them for you. Check installation instructions in docs/kimera_vio_install.md.

Note 2: if you use ROS, then Kimera-VIO-ROS can install all dependencies and Kimera inside a catkin workspace.

Installation Instructions

Find how to install Kimera-VIO and its dependencies here: Installation instructions.

2. Usage

General tips

The LoopClosureDetector (and PGO) module is disabled by default. If you wish to run the pipeline with loop-closure detection enabled, set the use_lcd flag to true. For the example script, this is done by passing -lcd at the command line:

./scripts/stereoVIOEUROC.bash -lcd

To log output, set the log_output flag to true. For the script, this is done with the -log command-line argument. By default, log files are saved in output_logs.

To run the pipeline in sequential mode (one thread only), set parallel_run to false. In the example script, this is done with the -s command-line argument.

i. Euroc Dataset

Download Euroc's dataset

Datasets MH_04 and V2_03 have a different number of left and right frames. We suggest using our version of Euroc instead, available here.

  • Unzip the dataset to your preferred directory, for example, in ~/Euroc/V1_01_easy:
mkdir -p ~/Euroc/V1_01_easy
unzip -o ~/Downloads/V1_01_easy.zip -d ~/Euroc/V1_01_easy

Yamelize Euroc's dataset

Add %YAML:1.0 at the top of each .yaml file inside Euroc. You can do this manually or run the yamelize.bash script by indicating where the dataset is (it is assumed below to be in ~/path/to/euroc):

You don't need to yamelize the dataset if you download our version here.

cd Kimera-VIO
bash ./scripts/euroc/yamelize.bash -p ~/path/to/euroc
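In essence, the yamelize step just prepends the directive to each .yaml file. A self-contained sketch of that operation on a single file (the helper name is hypothetical, not part of the actual script):

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper mirroring what yamelize.bash does per file:
// prepend "%YAML:1.0" unless the file already starts with it.
bool YamelizeFile(const std::string& path) {
  std::ifstream in(path);
  if (!in) return false;
  std::stringstream buffer;
  buffer << in.rdbuf();
  const std::string contents = buffer.str();
  in.close();
  // Already yamelized: nothing to do.
  if (contents.rfind("%YAML:1.0", 0) == 0) return true;
  std::ofstream out(path, std::ios::trunc);
  out << "%YAML:1.0\n" << contents;
  return static_cast<bool>(out);
}
```

The real script applies this to every .yaml file under the dataset directory.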

Run Kimera-VIO in Euroc's dataset

Using a bash script bundling all command-line options and gflags:

cd Kimera-VIO
bash ./scripts/stereoVIOEuroc.bash -p "PATH_TO_DATASET/V1_01_easy"

Alternatively, you can use the executable in the build folder directly: ./build/stereoVIOEuroc. In that case, check the script ./scripts/stereoVIOEuroc.bash to understand which parameters it expects, or see the parameters section below.

Kimera can also run in monocular mode. For Euroc, this means processing only the left image. To use it, point to the parameters in params/EurocMono; the bash script has a PARAMS_PATH variable that can be set to these parameters instead.

ii. Using ROS wrapper

We provide a ROS wrapper of Kimera-VIO that you can find at: https://github.com/MIT-SPARK/Kimera-VIO-ROS.

This library can be cloned into a catkin workspace and built alongside the ROS wrapper.

iii. Evaluation and Debugging

For more information on tools for debugging and evaluating the pipeline, see our documentation.

iv. Unit Testing

We use gtest for unit testing. To run the unit tests: build the code, navigate inside the build folder and run testKimeraVIO:

cd build
./testKimeraVIO

A useful flag is ./testKimeraVIO --gtest_filter=foo to run only the tests you are interested in (wildcard patterns are also valid).

Alternatively, you can run rosrun kimera_vio run_gtest.py from anywhere on your system if you've built Kimera-VIO through ROS and sourced the workspace containing Kimera-VIO. This script passes all arguments to testKimeraVIO, so you should feel free to use whatever flags you would normally use.

3. Parameters

Kimera-VIO accepts two independent sources of parameters:

  • YAML files: contain the parameters for the Backend and Frontend.
  • gflags: contain the parameters for everything else.

To get help on what each gflag parameter does, just run the executable with the --help flag: ./build/stereoVIOEuroc --help. You should get a list of gflags similar to the ones here.

  • Optionally, you can try the VIO using structural regularities, as in our ICRA 2019 paper, by specifying the option -r: ./stereoVIOEuroc.bash -p "PATH_TO_DATASET/V1_01_easy" -r

OpenCV's 3D visualization also has some shortcuts for interaction: check tips for usage

Camera parameters can be described using the pinhole model or the omni model. The omni model is based on the OCamCalib toolbox described in this paper. A tutorial for generating the calibration can be found here.

The Omni camera model requires these additional parameters:

omni_affine
omni_distortion_center

The distortion polynomial is stored in the distortion_coefficients field. The Matlab toolbox outputs only 4 coefficients, while Kimera expects 5; set the second coefficient to zero. For example, if your OCamCalib output is [1, 2, 3, 4], set distortion_coefficients to [1, 0, 2, 3, 4] in the camera parameters file.
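The padding rule above can be sketched as a small helper (the function name is hypothetical, not part of Kimera's API):

```cpp
#include <array>
#include <vector>

// Hypothetical helper (not part of Kimera's API): pad the 4 polynomial
// coefficients produced by OCamCalib to the 5-coefficient form Kimera
// expects, inserting a zero as the second coefficient.
std::vector<double> OcamToKimeraCoeffs(const std::array<double, 4>& c) {
  return {c[0], 0.0, c[1], c[2], c[3]};
}
```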

The inverse polynomial for projection is not required. In the omni camera case, the intrinsics field represents the intrinsics of an ideal pinhole model of the fisheye camera, which is primarily used when instantiating GTSAM calibrations, as these are currently only implemented for pinhole cameras. Leaving it blank is sufficient: the code will generate an ideal model based on the image size. You may also supply your own ideal pinhole intrinsics, and they will be used instead. In the pinhole case, these values must be consistent with the camera parameters (focal lengths and image center).

4. Contribution guidelines

We strongly encourage you to submit issues, feedback and potential improvements. We follow the branch, open PR, review, and merge workflow.

To contribute to this repo, ensure your commits pass the linter pre-commit checks. To enable these checks, you will need to install the linter. We also provide a .clang-format file with the style rules the repo uses, so you can use clang-format to reformat your code.

Also, check tips for development and our developer guide.

5. FAQ

Issues

If you have problems building or running the pipeline and/or issues with dependencies, you might find useful information in our FAQ or in the issue tracker.

How to interpret console output

I0512 21:05:55.136549 21233 Pipeline.cpp:449] Statistics
-----------                                  #	Log Hz	{avg     +- std    }	[min,max]
Data Provider [ms]                      	    0	
Display [ms]                            	  146	36.5421	{8.28082 +- 2.40370}	[3,213]
VioBackend [ms]                         	   73	19.4868	{15.2192 +- 9.75712}	[0,39]
VioFrontend Frame Rate [ms]             	  222	59.3276	{5.77027 +- 1.51571}	[3,12]
VioFrontend Keyframe Rate [ms]          	   73	19.6235	{31.4110 +- 7.29504}	[24,62]
VioFrontend [ms]                        	  295	77.9727	{12.1593 +- 10.7279}	[3,62]
Visualizer [ms]                         	   73	19.4639	{3.82192 +- 0.805234}	[2,7]
backend_input_queue Size [#]            	   73	18.3878	{1.00000 +- 0.00000}	[1,1]
data_provider_left_frame_queue Size (#) 	  663	165.202	{182.265 +- 14.5110}	[1,359]
data_provider_right_frame_queue Size (#)	  663	165.084	{182.029 +- 14.5150}	[1,359]
display_input_queue Size [#]            	  146	36.5428	{1.68493 +- 0.00000}	[1,12]
stereo_frontend_input_queue Size [#]    	  301	75.3519	{4.84718 +- 0.219043}	[1,5]
visualizer_backend_queue Size [#]       	   73	18.3208	{1.00000 +- 0.00000}	[1,1]
visualizer_frontend_queue Size [#]      	  295	73.9984	{4.21695 +- 1.24381}	[1,7]
  • #: number of samples taken.
  • Log Hz: average number of samples taken per second, in Hz.
  • avg: average of the logged value, in the same unit as the logged quantity.
  • std: standard deviation of the logged value.
  • [min,max]: minimum and maximum values the logged value took.

Two main things are logged: the time each pipeline module takes to run (e.g. VioBackend, Visualizer) and the size of the queues between pipeline modules (e.g. backend_input_queue).

For example:

VioBackend [ms]                         	   73	19.4868	{15.2192 +- 9.75712}	[0,39]

This shows that the Backend runtime was sampled 73 times at a rate of 19.48 Hz (which accounts for both the time the Backend waits for input and the time it takes to process it). On average, the Backend takes 15.21 ms to consume one input, with a standard deviation of 9.75 ms; the fastest run took 0 ms and the slowest so far took 39 ms.

For the queues, for example:

stereo_frontend_input_queue Size [#]    	  301	75.3519	{4.84718 +- 0.219043}	[1,5]

This shows that the Frontend input queue was sampled 301 times at a rate of 75.35 Hz. It stores an average of 4.84 elements, with a standard deviation of 0.21 elements; its minimum size was 1 element and its maximum 5 elements.
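The per-quantity numbers in the table are plain sample statistics. A minimal sketch of how avg, std, min, and max could be computed from logged samples (names are hypothetical and do not correspond to Kimera's actual logging code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of the statistics reported per logged quantity: sample count,
// mean, (population) standard deviation, and min/max.
struct SampleStats {
  std::size_t count = 0;
  double avg = 0.0, stddev = 0.0, min = 0.0, max = 0.0;
};

SampleStats ComputeStats(const std::vector<double>& samples) {
  SampleStats s;
  s.count = samples.size();
  if (samples.empty()) return s;
  for (double v : samples) s.avg += v;
  s.avg /= static_cast<double>(s.count);
  for (double v : samples) s.stddev += (v - s.avg) * (v - s.avg);
  s.stddev = std::sqrt(s.stddev / static_cast<double>(s.count));
  s.min = *std::min_element(samples.begin(), samples.end());
  s.max = *std::max_element(samples.begin(), samples.end());
  return s;
}
```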

6. Chart

vio_chart

overall_chart

7. BSD License

Kimera-VIO is open source under the BSD license, see the LICENSE.BSD file.