kaolin

A PyTorch Library for Accelerating 3D Deep Learning Research

4,817

582

Top Related Projects

pytorch3d

9,337

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

nerfies

1,845

This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.

multinerf

3,739

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF

nvdiffrec

2,217

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

mesh_to_sdf

1,133

Calculate signed distance fields for arbitrary meshes

Quick Overview

Kaolin is a PyTorch library for accelerating 3D deep learning research. It provides efficient implementations of differentiable 3D modules for use in deep learning systems. Kaolin aims to make 3D deep learning more accessible and accelerate progress in the field.

Pros

Comprehensive set of 3D deep learning tools and primitives
Efficient GPU-accelerated implementations
Seamless integration with PyTorch ecosystem
Active development and support from NVIDIA

Cons

Steep learning curve for beginners in 3D deep learning
Limited documentation for some advanced features
Requires powerful GPU for optimal performance
Some features may be unstable or experimental

Code Examples

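Loading a 3D mesh and saving it as a 3D checkpoint for visualization:

import kaolin as kal

mesh = kal.io.obj.import_mesh('path/to/model.obj')
# Write the mesh to a USD checkpoint that Kaolin's visualization tools can display.
# 'logs/' is an illustrative output directory.
timelapse = kal.visualize.Timelapse('logs/')
timelapse.add_mesh_batch(category='input', vertices_list=[mesh.vertices], faces_list=[mesh.faces])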

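Performing differentiable rendering (function names follow Kaolin's DIB-R tutorial; check them against your installed version):

import math
import torch
import kaolin as kal

# Random single-mesh batch on the GPU (DIB-R rasterization requires CUDA).
vertices = torch.rand(1, 100, 3, device='cuda')
faces = torch.randint(0, 100, (200, 3), device='cuda')
colors = torch.rand(1, 200, 3, 3, device='cuda')  # one RGB value per face corner
# Illustrative camera: placed at z=3, looking at the center of the unit cube, y up.
eye, at, up = (torch.tensor([v], device='cuda') for v in ([0.5, 0.5, 3.0], [0.5, 0.5, 0.5], [0.0, 1.0, 0.0]))
transform = kal.render.camera.generate_transformation_matrix(eye, at, up)
proj = kal.render.camera.generate_perspective_projection(math.pi / 4).cuda()
verts_cam, verts_img, normals = kal.render.mesh.prepare_vertices(vertices, faces, proj, camera_transform=transform)
image, soft_mask, face_idx = kal.render.mesh.dibr_rasterization(256, 256, verts_cam[..., -1], verts_img, colors, normals[..., -1])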

Voxelizing a point cloud:

import torch
import kaolin as kal

# Kaolin conversions operate on batches: shape (batch_size, num_points, 3).
points = torch.rand(1, 1000, 3)
voxels = kal.ops.conversions.pointclouds_to_voxelgrids(points, 32)  # 32 voxels per axis
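
To make the conversion above concrete, here is a minimal pure-Python sketch of what voxelizing a point cloud means: each point is normalized into the cloud's bounding cube and marks the grid cell it falls into. This is not Kaolin's implementation; the function name `voxelize_points` and the choice of bounding-box normalization are assumptions made for illustration.

```python
def voxelize_points(points, resolution):
    """Return a resolution^3 occupancy grid (nested lists of 0/1) for 3D points."""
    # Bounding box of the cloud, used to normalize coordinates to [0, 1].
    mins = [min(p[axis] for p in points) for axis in range(3)]
    maxs = [max(p[axis] for p in points) for axis in range(3)]
    # One uniform scale keeps the aspect ratio; fall back to 1.0 for a zero extent.
    scale = max(maxs[axis] - mins[axis] for axis in range(3)) or 1.0

    grid = [[[0] * resolution for _ in range(resolution)] for _ in range(resolution)]
    for p in points:
        cell = []
        for axis in range(3):
            t = (p[axis] - mins[axis]) / scale
            # Clamp so points on the upper boundary land in the last cell.
            cell.append(min(int(t * resolution), resolution - 1))
        i, j, k = cell
        grid[i][j][k] = 1
    return grid


points = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (0.5, 0.5, 0.5)]
grid = voxelize_points(points, resolution=4)
occupied = sum(cell for plane in grid for row in plane for cell in row)
print(occupied)
```

A GPU library performs the same binning with batched tensor operations rather than Python loops, which is where the speedup comes from.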

Getting Started

To get started with Kaolin, follow these steps:

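Install Kaolin from NVIDIA's prebuilt wheels, replacing TORCH_VERSION and CUDA_VERSION with your installed torch and CUDA versions:
```
pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-{TORCH_VERSION}_cu{CUDA_VERSION}.html
```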
Import Kaolin in your Python script:
```
import kaolin as kal
```

Load a 3D model and perform operations:

mesh = kal.io.obj.import_mesh('path/to/model.obj')
vertices = mesh.vertices.unsqueeze(0)  # add a batch dimension: (1, num_vertices, 3)
faces = mesh.faces

# Gather the three corner positions of every face: (1, num_faces, 3, 3)
face_vertices = kal.ops.mesh.index_vertices_by_faces(vertices, faces)
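
As a rough illustration of what indexing vertices by faces produces, the sketch below does the same gather on plain Python lists. The helper name `gather_face_vertices` is hypothetical and the code is unbatched; it only mirrors the idea, not Kaolin's tensor implementation.

```python
def gather_face_vertices(vertices, faces):
    """For each face (i, j, k), return the positions of its three corners."""
    return [[vertices[index] for index in face] for face in faces]


# A tetrahedron corner: four vertices and two of its triangular faces.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
faces = [(0, 1, 2), (0, 2, 3)]

face_vertices = gather_face_vertices(vertices, faces)
for face, corners in zip(faces, face_vertices):
    print(face, corners)
```

The result has one entry per face and three positions per entry, which is the layout that per-face operations such as normal computation or rasterization typically consume.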

For more detailed information and tutorials, refer to the official Kaolin documentation.

Competitor Comparisons

pytorch3d

9,337

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Pros of PyTorch3D

More comprehensive documentation and tutorials
Broader range of 3D vision tasks supported
Better integration with PyTorch ecosystem

Cons of PyTorch3D

Steeper learning curve for beginners
Less focus on real-time rendering capabilities

Code Comparison

PyTorch3D:

from pytorch3d.structures import Meshes
from pytorch3d.renderer import MeshRenderer, MeshRasterizer, SoftPhongShader

renderer = MeshRenderer(
    rasterizer=MeshRasterizer(),
    shader=SoftPhongShader()
)

Kaolin:

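import kaolin as kal

# camera_proj, camera_transform, face_features, height and width are assumed to be defined.
verts_cam, verts_img, normals = kal.render.mesh.prepare_vertices(
    vertices, faces, camera_proj, camera_transform=camera_transform)
image, soft_mask, face_idx = kal.render.mesh.dibr_rasterization(
    height, width, verts_cam[..., -1], verts_img, face_features, normals[..., -1])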

Both libraries offer differentiable rendering. PyTorch3D wraps rasterization and shading in composable renderer classes, while Kaolin exposes lower-level functions that the caller assembles into a pipeline, which gives more control at the cost of more setup code.

nerfies

1,845

This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.

Pros of Nerfies

Focuses specifically on dynamic scene reconstruction and novel view synthesis
Provides a complete implementation of the Nerfies paper, including training and evaluation scripts
Offers a user-friendly web interface for visualizing results

Cons of Nerfies

Limited to a specific use case (dynamic scene reconstruction)
Requires more computational resources for training and inference
Less versatile compared to Kaolin's broader 3D deep learning toolkit

Code Comparison

Nerfies (Python):

config = configs.get_config()
model = models.NerfModel(config)
loss = model(batch)

Kaolin (Python):

import kaolin

mesh = kaolin.io.obj.import_mesh('model.obj')
# Batch the vertices, then gather per-face corner positions as input for rasterization.
face_vertices = kaolin.ops.mesh.index_vertices_by_faces(mesh.vertices.unsqueeze(0), mesh.faces)

Summary

Nerfies is a specialized repository for dynamic scene reconstruction, offering a complete implementation of the Nerfies paper with user-friendly visualization tools. Kaolin, on the other hand, is a more comprehensive 3D deep learning library that provides a wide range of tools and utilities for various 3D-related tasks. While Nerfies excels in its specific use case, Kaolin offers greater versatility and broader applicability in 3D deep learning projects.

multinerf

3,739

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF

Pros of multinerf

Focuses specifically on neural radiance fields (NeRF) for 3D scene reconstruction
Implements advanced NeRF techniques like mip-NeRF 360 for improved rendering quality
Provides pre-trained models and datasets for easy experimentation

Cons of multinerf

Limited to NeRF-based techniques, less versatile than Kaolin's broader 3D deep learning toolkit
May require more computational resources for training and rendering
Less extensive documentation compared to Kaolin's comprehensive guides

Code Comparison

multinerf:

config = config_flags.DEFINE_config_file('config', None, 'Path to the config file.')
FLAGS = flags.FLAGS
render_poses = generate_spiral_path(...)
render_fn = jax.pmap(...)

Kaolin:

import kaolin as kal
mesh = kal.io.obj.import_mesh('model.obj')
vertices = mesh.vertices.unsqueeze(0).cuda()  # add a batch dimension and move to the GPU
faces = mesh.faces.cuda()
voxels = kal.ops.conversions.trianglemeshes_to_voxelgrids(vertices, faces, 32)  # 32 voxels per axis

Both repositories offer powerful tools for 3D graphics and deep learning, but they serve different purposes. multinerf specializes in neural radiance fields for scene reconstruction, while Kaolin provides a more comprehensive toolkit for various 3D deep learning tasks.

nvdiffrec

2,217

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

Pros of nvdiffrec

Focused on differentiable rendering and material reconstruction
Provides advanced techniques for inverse rendering problems
Includes pre-trained models and datasets for quick experimentation

Cons of nvdiffrec

More specialized and narrower in scope compared to Kaolin
Less comprehensive documentation and tutorials
Smaller community and fewer contributors

Code Comparison

nvdiffrec:

# nvdiffrec is run through its training script with a JSON config rather than imported as a package
python train.py --config configs/bob.json

Kaolin:

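import kaolin as kal
mesh = kal.io.obj.import_mesh('model.obj')
# Interpolate per-face UVs with the DIB-R rasterizer. The camera-space inputs
# (face_vertices_z, face_vertices_image, face_normals_z), face_uvs and texture are assumed prepared.
uv_map, soft_mask, face_idx = kal.render.mesh.dibr_rasterization(
    height, width, face_vertices_z, face_vertices_image, face_uvs, face_normals_z)
# Sample the texture at the interpolated UV coordinates.
image = kal.render.mesh.texture_mapping(uv_map, texture, mode='bilinear')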

Both libraries offer rendering capabilities, but nvdiffrec focuses on differentiable rendering and material reconstruction, while Kaolin provides a broader set of 3D deep learning tools. nvdiffrec is more specialized for inverse rendering problems, while Kaolin offers a more comprehensive suite of 3D-related functionalities.

meshrcnn

1,154

code for Mesh R-CNN, ICCV 2019

Pros of Mesh R-CNN

Focuses specifically on 3D object reconstruction from 2D images
Integrates with Detectron2, leveraging its powerful object detection capabilities
Provides end-to-end training for mesh prediction tasks

Cons of Mesh R-CNN

Limited to mesh reconstruction tasks, less versatile than Kaolin
Requires more setup and dependencies due to Detectron2 integration
Less active development and community support compared to Kaolin

Code Comparison

Mesh R-CNN:

from detectron2.config import get_cfg
from meshrcnn import add_meshrcnn_config
cfg = get_cfg()
add_meshrcnn_config(cfg)
cfg.merge_from_file("meshrcnn_config.yaml")

Kaolin:

import kaolin as kal
from kaolin.ops.mesh import check_sign
# import_mesh returns a mesh object, so read the fields instead of unpacking it.
mesh = kal.io.obj.import_mesh('model.obj')
vertices, faces = mesh.vertices, mesh.faces

Both libraries offer tools for 3D mesh manipulation, but Kaolin provides a more comprehensive set of utilities for various 3D tasks, while Mesh R-CNN specializes in reconstructing 3D meshes from 2D images using deep learning techniques. Kaolin's broader scope makes it more suitable for general 3D deep learning projects, whereas Mesh R-CNN excels in specific image-to-mesh reconstruction scenarios.

mesh_to_sdf

1,133

Calculate signed distance fields for arbitrary meshes

Pros of mesh_to_sdf

Focused specifically on mesh-to-SDF conversion, making it more lightweight and easier to use for this specific task
Provides a simple command-line interface for quick conversions
Supports multiple output formats, including NumPy arrays and VDB files

Cons of mesh_to_sdf

Limited functionality compared to Kaolin's broader set of 3D deep learning tools
Less active development and community support
May not integrate as seamlessly with other deep learning frameworks

Code Comparison

mesh_to_sdf:

import mesh_to_sdf
import trimesh

mesh = trimesh.load('model.obj')
# Returns both the sampled points and their signed distances.
points, sdf = mesh_to_sdf.sample_sdf_near_surface(mesh, number_of_points=250000)

Kaolin:

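import kaolin as kal
import torch

mesh = kal.io.obj.import_mesh('model.obj')
vertices, faces = mesh.vertices.unsqueeze(0).cuda(), mesh.faces.cuda()
points = torch.rand(1, 250000, 3, device='cuda') * 2 - 1  # query points in [-1, 1]^3
face_vertices = kal.ops.mesh.index_vertices_by_faces(vertices, faces)
# point_to_mesh_distance returns squared distances; the sign comes from an inside/outside test.
sq_dist, _, _ = kal.metrics.trianglemesh.point_to_mesh_distance(points, face_vertices)
inside = kal.ops.mesh.check_sign(vertices, faces, points)  # assumes a watertight mesh
sdf = torch.where(inside, -sq_dist.sqrt(), sq_dist.sqrt())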

Both libraries offer mesh processing capabilities, but Kaolin provides a more comprehensive set of tools for 3D deep learning tasks, while mesh_to_sdf focuses specifically on SDF conversion.
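
For readers new to signed distance fields, the following sketch shows the sign convention on a shape where the answer is known in closed form. It uses an analytic sphere instead of a mesh, and the function name `sphere_sdf` is an assumption for illustration; neither library is involved. The convention shown (negative inside, zero on the surface, positive outside) is the common one.

```python
import math


def sphere_sdf(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance from a point to a sphere's surface."""
    # Distance to the center minus the radius: negative inside, positive outside.
    return math.dist(point, center) - radius


inside = sphere_sdf((0.0, 0.0, 0.0))      # center of the unit sphere
on_surface = sphere_sdf((1.0, 0.0, 0.0))  # exactly on the surface
outside = sphere_sdf((0.0, 3.0, 4.0))     # 5 units from the center

print(inside, on_surface, outside)
```

For a triangle mesh there is no closed form, so the unsigned distance to the nearest triangle is computed first and the sign is decided by a separate inside/outside test.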

README

Kaolin: A Pytorch Library for Accelerating 3D Deep Learning Research

Overview

NVIDIA Kaolin library provides a PyTorch API for working with a variety of 3D representations and includes a growing collection of GPU-optimized operations such as modular differentiable rendering, fast conversions between representations, data loading, 3D checkpoints, differentiable camera API, differentiable lighting with spherical harmonics and spherical gaussians, powerful quadtree acceleration structure called Structured Point Clouds, interactive 3D visualizer for jupyter notebooks, convenient batched mesh container and more. Visit the Kaolin Library Documentation to get started!

Note that Kaolin library is part of the larger NVIDIA Kaolin effort for 3D deep learning.

Installation and Getting Started

Starting with v0.12.0, Kaolin supports installation with wheels:

# Replace TORCH_VERSION and CUDA_VERSION with your torch / cuda versions
pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-{TORCH_VERSION}_cu{CUDA_VERSION}.html

For example, to install kaolin 0.17.0 over torch 2.0.1 and cuda 11.8:

pip install kaolin==0.17.0 -f https://nvidia-kaolin.s3.us-east-2.amazonaws.com/torch-2.0.1_cu118.html
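
The substitution above can be sketched as a small helper that builds the command from version strings. The function names `kaolin_wheel_index` and `pip_command` are hypothetical, and the rule for the CUDA tag (drop the dot, so 11.8 becomes cu118) is inferred from the example above rather than from any documented specification.

```python
def kaolin_wheel_index(torch_version, cuda_version):
    """Build the wheel index URL for a given torch / CUDA version pair."""
    # Assumption: the CUDA tag is the version with the dot removed, e.g. "11.8" -> "cu118".
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return (
        "https://nvidia-kaolin.s3.us-east-2.amazonaws.com/"
        f"torch-{torch_version}_{cuda_tag}.html"
    )


def pip_command(kaolin_version, torch_version, cuda_version):
    """Assemble the full pip install command as a string."""
    index_url = kaolin_wheel_index(torch_version, cuda_version)
    return f"pip install kaolin=={kaolin_version} -f {index_url}"


command = pip_command("0.17.0", "2.0.1", "11.8")
print(command)
```

The helper only formats strings; whether a wheel actually exists for a given version combination still has to be checked against the index page.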

About the Latest Release (0.17.0)

In this version we added the sample_points_in_volume function, used for "densifying" Gaussian splats, which can be used to improve physics simulation.

We further improved physics training and simulation using NVIDIA Warp on some of our functions. We also added support for transmittance in the GLTF loader.


[Figure: side-by-side comparison, without densifier and with densifier]

Check our updated tutorials, and see the change logs for details.

Contributing

Please review our contribution guidelines.

External Projects using Kaolin

NVIDIA Kaolin Wisp:
- Use Camera API, Structured Point Clouds and its rendering capabilities
gradSim: Differentiable simulation for system identification and visuomotor control:
- Use DIB-R rasterizer, obj loader and timelapse
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer:
- Use Kaolin's DIB-R rasterizer, camera functions and Timelapse for 3D checkpoints.
Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Surfaces:
- Use SPC conversions and ray-tracing, yielding 30x memory and 3x training time reduction.
Learning Deformable Tetrahedral Meshes for 3D Reconstruction:
- Use Kaolin's DefTet volumetric renderer, tetrahedral losses, camera_functions, mesh operators and conversions, ShapeNet dataset, point_to_mesh_distance and sided_distance.
Text2Mesh:
- Use Kaolin's rendering functions, camera functions, and obj and off importers.
Flexible Isosurface Extraction for Gradient-Based Mesh Optimization (FlexiCubes):
- Use Flexicube class, obj loader, turntable visualizer
SATR:
- Use Kaolin's rendering functions, lighting functions, camera functions, and obj/off importers.

Licenses

Most of Kaolin's repository is under Apache v2.0 license, except under kaolin/non_commercial which is under NSCL license restricted to non commercial usage for research and evaluation purposes.

Default kaolin import includes Apache-licensed components:

import kaolin

The non-commercial components need to be explicitly imported as:

import kaolin.non_commercial

Update

FlexiCubes is now available under the Apache v2.0 license; the old version is maintained for backward compatibility.

Citation

If you are using Kaolin library for your research, please cite:

@software{KaolinLibrary,
      author = {Fuji Tsang, Clement and Shugrina, Maria and Lafleche, Jean Francois and Perel, Or and Loop, Charles and Takikawa, Towaki and Modi, Vismay and Zook, Alexander and Wang, Jiehan and Chen, Wenzheng and Shen, Tianchang and Gao, Jun and Jatavallabhula, Krishna Murthy and Smith, Edward and Rozantsev, Artem and Fidler, Sanja and State, Gavriel and Gorski, Jason and Xiang, Tommy and Li, Jianing and Li, Michael and Lebaredian, Rev},
      title = {Kaolin: A Pytorch Library for Accelerating 3D Deep Learning Research},
      date = {2024-11-20},
      version = {0.17.0},
      url={\url{https://github.com/NVIDIAGameWorks/kaolin}}
}

Contributors

Current Team:

Technical Lead: Clement Fuji Tsang
Manager: Maria (Masha) Shugrina
Charles Loop
Vismay Modi
Or Perel
Alexander Zook

Other Major Contributors:

Wenzheng Chen
Sanja Fidler
Jun Gao
Jason Gorski
Jean-Francois Lafleche
Rev Lebaredian
Jianing Li
Michael Li
Krishna Murthy Jatavallabhula
Artem Rozantsev
Tianchang (Frank) Shen
Edward Smith
Gavriel State
Towaki Takikawa
Jiehan Wang
Tommy Xiang
