
NVlabs / nvdiffrec

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".


Top Related Projects

  • pytorch3d - PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
  • nerfies - This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.
  • multinerf - A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF
  • DPT - Dense Prediction Transformers

Quick Overview

NVlabs/nvdiffrec is an open-source project for differentiable rendering and reconstruction of 3D objects from 2D images. It uses PyTorch to implement a differentiable rendering pipeline, allowing for the optimization of 3D geometry, materials, and lighting from 2D observations.
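
To make the pattern concrete, here is a minimal, self-contained sketch of differentiable-rendering-style optimization in plain PyTorch. It is a toy stand-in, not nvdiffrec's actual API: a Lambertian "renderer" shades a fixed normal map with a learnable albedo and light direction, and gradients flow from an image loss back to those parameters.

import torch

# Toy differentiable "renderer": Lambertian shading of a fixed normal map.
# Illustrates the optimization pattern only; this is not nvdiffrec code.
normals = torch.nn.functional.normalize(torch.randn(64, 64, 3), dim=-1)
target = torch.rand(64, 64, 3)  # stand-in for an observed 2D image

albedo = torch.rand(3, requires_grad=True)                     # learnable color
light_dir = torch.tensor([0.0, 0.0, 1.0], requires_grad=True)  # learnable light

optimizer = torch.optim.Adam([albedo, light_dir], lr=1e-2)
for _ in range(200):
    optimizer.zero_grad()
    l = torch.nn.functional.normalize(light_dir, dim=0)
    shading = (normals @ l).clamp(min=0.0).unsqueeze(-1)  # (64, 64, 1)
    rendered = shading * albedo                           # (64, 64, 3)
    loss = torch.nn.functional.mse_loss(rendered, target)
    loss.backward()  # gradients flow through the shading model
    optimizer.step()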

Pros

  • Enables 3D reconstruction from 2D images without the need for 3D supervision
  • Supports various material models, including diffuse, specular, and microfacet BRDFs (see the BRDF sketch after this list)
  • Utilizes GPU acceleration for efficient rendering and optimization
  • Provides a flexible framework for experimenting with different rendering techniques and loss functions
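
As a sketch of the microfacet point above, below is a generic textbook Cook-Torrance GGX BRDF written in PyTorch. The formulation (GGX distribution, Smith shadowing, Schlick Fresnel) is standard, but the function itself is an illustration, not nvdiffrec's own material code.

import math
import torch

def ggx_brdf(n, v, l, albedo, roughness, f0=0.04):
    """Textbook Cook-Torrance GGX BRDF (illustrative sketch).

    n, v, l: unit normal, view, and light directions as (..., 3) tensors."""
    h = torch.nn.functional.normalize(v + l, dim=-1)
    nl = (n * l).sum(-1, keepdim=True).clamp(min=1e-4)
    nv = (n * v).sum(-1, keepdim=True).clamp(min=1e-4)
    nh = (n * h).sum(-1, keepdim=True).clamp(min=0.0)
    vh = (v * h).sum(-1, keepdim=True).clamp(min=0.0)

    a2 = roughness ** 4                                        # alpha = roughness^2
    d = a2 / (math.pi * (nh ** 2 * (a2 - 1.0) + 1.0) ** 2)     # GGX distribution
    k = (roughness + 1.0) ** 2 / 8.0                           # Schlick-GGX term
    g = (nl / (nl * (1 - k) + k)) * (nv / (nv * (1 - k) + k))  # Smith shadowing
    f = f0 + (1.0 - f0) * (1.0 - vh) ** 5                      # Schlick Fresnel

    return albedo / math.pi + d * g * f / (4.0 * nl * nv)      # diffuse + specular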

Cons

  • Requires significant computational resources, especially for complex scenes or high-resolution images
  • May struggle with highly complex or occluded geometries
  • Learning curve can be steep for users unfamiliar with differentiable rendering concepts
  • Limited documentation and examples for advanced use cases

Code Examples

Note: nvdiffrec is research code driven by train.py and JSON config files rather than an installable Python package; the snippets below sketch its workflow through a hypothetical high-level API.

  1. Loading and rendering a 3D mesh:
import nvdiffrec

# Load mesh and create a differentiable renderer
mesh = nvdiffrec.load_mesh("path/to/mesh.obj")
renderer = nvdiffrec.create_renderer(resolution=(512, 512))

# Render the mesh
image = renderer.render(mesh)
  2. Optimizing material properties:
import torch

# `renderer` and `mesh` come from the previous snippet; `target_image` is a
# reference photo tensor. Create an optimizer for the material parameters.
material_params = mesh.material.parameters()
optimizer = torch.optim.Adam(material_params, lr=0.01)

# Optimization loop
for _ in range(100):
    optimizer.zero_grad()
    rendered_image = renderer.render(mesh)
    loss = torch.nn.functional.mse_loss(rendered_image, target_image)
    loss.backward()
    optimizer.step()
  3. Reconstructing 3D geometry from images:
# Create a deformable mesh for optimization (hypothetical helper)
deformable_mesh = nvdiffrec.create_deformable_mesh(initial_mesh)

# Optimization loop for geometry and material; the optimizer is assumed to
# cover the mesh's vertex offsets, and compute_reconstruction_loss stands in
# for a user-supplied image loss
for _ in range(1000):
    optimizer.zero_grad()
    rendered_image = renderer.render(deformable_mesh)
    loss = compute_reconstruction_loss(rendered_image, target_image)
    loss.backward()
    optimizer.step()

Getting Started

To get started with NVlabs/nvdiffrec:

  1. Clone the repository:

    git clone https://github.com/NVlabs/nvdiffrec.git
    cd nvdiffrec
    
  2. Install dependencies (see the One time setup section of the README below for the tested conda/pip commands):

    pip install -r requirements.txt
    
  3. Run a simple example (again using the hypothetical API sketched above):

    import nvdiffrec
    
    # Load a sample mesh and create a renderer
    mesh = nvdiffrec.load_mesh("samples/sphere.obj")
    renderer = nvdiffrec.create_renderer()
    
    # Render the mesh and display the result
    image = renderer.render(mesh)
    nvdiffrec.display_image(image)
    

For more detailed instructions and advanced usage, refer to the project's documentation and examples in the repository.

Competitor Comparisons

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Pros of pytorch3d

  • More comprehensive 3D computer vision library with a broader range of functionalities
  • Better integration with PyTorch ecosystem and GPU acceleration
  • Extensive documentation and community support

Cons of pytorch3d

  • Steeper learning curve due to its extensive feature set
  • May be overkill for simpler 3D rendering tasks
  • Less focused on differentiable rendering compared to nvdiffrec

Code Comparison

pytorch3d

import torch
from pytorch3d.structures import Meshes
from pytorch3d.renderer import TexturesVertex

# Create a mesh with texture
verts = torch.rand(4, 3)
faces = torch.tensor([[0, 1, 2], [0, 2, 3]])
textures = torch.ones_like(verts)
mesh = Meshes(verts=[verts], faces=[faces], textures=TexturesVertex(verts_features=[textures]))

nvdiffrec

import torch
import nvdiffrec

# Create a mesh with texture (hypothetical nvdiffrec API, for comparison only)
verts = torch.rand(4, 3)
faces = torch.tensor([[0, 1, 2], [0, 2, 3]])
uv = torch.rand(4, 2)
material = nvdiffrec.Material({'kd': torch.ones(3)})
mesh = nvdiffrec.Mesh(verts, faces, uv, material)

This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.

Pros of nerfies

  • Focuses on dynamic scenes and non-rigid objects, allowing for more versatile applications
  • Utilizes a deformation-based approach, enabling better handling of complex movements
  • Provides a method for creating 3D animations from casual video captures

Cons of nerfies

  • May require more computational resources due to the complexity of handling dynamic scenes
  • Potentially less accurate for static objects compared to specialized static reconstruction methods
  • Limited to scenes captured with a handheld camera, which may restrict some use cases

Code Comparison

Both snippets below are schematic sketches of each repo's workflow, not their actual entry points.

nvdiffrec:

import nvdiffrec
scene = nvdiffrec.Scene(...)
optimizer = nvdiffrec.Optimizer(...)
result = optimizer.optimize(scene)

nerfies:

import nerfies
model = nerfies.NerfiesModel(...)
trainer = nerfies.Trainer(model, ...)
trainer.train()

Both repositories focus on 3D reconstruction, but nvdiffrec specializes in static scenes with high-quality material recovery, while nerfies excels in handling dynamic scenes and non-rigid objects. nvdiffrec may offer better results for static objects, while nerfies provides more flexibility for capturing and reconstructing moving subjects.

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF

Pros of multinerf

  • Supports multi-view synthesis and novel view generation
  • Implements advanced NeRF variants like Mip-NeRF 360 and Ref-NeRF
  • Provides a unified framework for various NeRF-based techniques

Cons of multinerf

  • Potentially higher computational requirements due to complex models
  • May require more extensive datasets for optimal performance

Code comparison

Both snippets below are simplified sketches of each repo's training entry point.

multinerf:

config = config_flags.DEFINE_config_file('config', None, 'File path to the config file.')
FLAGS = flags.FLAGS
flags.mark_flags_as_required(['config'])

def main(unused_argv):
  render_fn = train_utils.setup_model(config)
  train_utils.train_loop(config, render_fn)

nvdiffrec:

parser = argparse.ArgumentParser(description='nvdiffrec')
parser.add_argument('--config', type=str, default=None, help='Config file')
args = parser.parse_args()

def main():
    trainer = Trainer(args)
    trainer.train()

Both repositories focus on neural rendering techniques, but multinerf specializes in NeRF-based methods, while nvdiffrec offers a more general approach to differentiable rendering. multinerf provides a unified framework for various NeRF variants, potentially offering more flexibility in multi-view synthesis. However, nvdiffrec may be more suitable for tasks requiring explicit geometry reconstruction and material estimation.


Dense Prediction Transformers

Pros of DPT

  • Focuses on depth estimation and monocular depth prediction
  • Utilizes transformers for improved performance in vision tasks
  • Provides pre-trained models for various depth estimation scenarios

Cons of DPT

  • Limited to depth estimation tasks, less versatile than nvdiffrec
  • May require more computational resources due to transformer architecture
  • Less emphasis on material reconstruction and rendering

Code Comparison

The DPT snippet below is close to that repo's actual model interface; the nvdiffrec snippet again uses the hypothetical API from above.

DPT:

from dpt.models import DPTDepthModel
model = DPTDepthModel(
    path=path_to_pretrained_model,
    backbone="vitb_rn50_384",
    non_negative=True,
)
depth = model(image)

nvdiffrec:

import nvdiffrec
renderer = nvdiffrec.Renderer(resolution=(512, 512))
material = nvdiffrec.Material(basecolor_tex=texture)
mesh = nvdiffrec.Mesh(vertices, faces, material=material)
image = renderer.render(mesh, camera)

DPT is specialized for depth estimation using transformers, offering pre-trained models for various scenarios. nvdiffrec, on the other hand, provides a more comprehensive framework for 3D reconstruction and rendering, including material properties. DPT may be more suitable for projects focused solely on depth estimation, while nvdiffrec offers greater flexibility for 3D reconstruction and rendering tasks.


README

nvdiffrec

Teaser image

Joint optimization of topology, materials and lighting from multi-view image observations as described in the paper Extracting Triangular 3D Models, Materials, and Lighting From Images.

For differentiable marching tetrahedrons, we have adapted code from NVIDIA's Kaolin: A Pytorch Library for Accelerating 3D Deep Learning Research.
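
For intuition, here is the simplest per-tetrahedron case of marching tetrahedra, sketched in PyTorch: when exactly one SDF sample differs in sign from the other three, the surface crosses the three edges incident to that vertex, and linear interpolation places a triangle. This is an illustration only; the full algorithm (including the two-vs-two sign case, which emits a quad) lives in the adapted Kaolin code.

import torch

def tet_one_inside(verts, sdf, i):
    """Extract the triangle for a tet whose vertex i has opposite SDF sign
    to the other three vertices (illustrative sketch, not the repo's code).

    verts: (4, 3) tet corner positions, sdf: (4,) signed distance values."""
    tri = []
    for j in (j for j in range(4) if j != i):
        # Zero crossing along edge (i, j): sdf[i] + t * (sdf[j] - sdf[i]) = 0
        t = sdf[i] / (sdf[i] - sdf[j])
        tri.append(verts[i] + t * (verts[j] - verts[i]))
    return torch.stack(tri)  # (3, 3) triangle on the zero isosurface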

News

  • 2023-10-20 : We added a version of the renderutils library written in slangpy to leverage the autodiff capabilities of slang instead of CUDA extensions with manually crafted forward and backward passes. This simplifies the code substantially, with the same runtime performance as before. This version is available in the slang branch of this repo.

  • 2023-09-15 : We added support for the FlexiCubes isosurfacing technique. Please see the config configs/bob_flexi.json for a usage example, and refer to the FlexiCubes documentation for details.

Citation

@inproceedings{Munkberg_2022_CVPR,
    author    = {Munkberg, Jacob and Hasselgren, Jon and Shen, Tianchang and Gao, Jun and Chen, Wenzheng 
                    and Evans, Alex and M\"uller, Thomas and Fidler, Sanja},
    title     = "{Extracting Triangular 3D Models, Materials, and Lighting From Images}",
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {8280-8290}
}

Licenses

Copyright © 2022, NVIDIA Corporation. All rights reserved.

This work is made available under the Nvidia Source Code License.

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing.

Installation

Requires Python 3.6+, VS2019+, Cuda 11.3+ and PyTorch 1.10+

Tested in Anaconda3 with Python 3.9 and PyTorch 1.10

One time setup (Windows)

Install the Cuda toolkit (required to build the PyTorch extensions). We support Cuda 11.3 and above. Pick the appropriate version of PyTorch compatible with the installed Cuda toolkit. Below is an example with Cuda 11.6.

conda create -n dmodel python=3.9
activate dmodel
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
pip install ninja imageio PyOpenGL glfw xatlas gdown
pip install git+https://github.com/NVlabs/nvdiffrast/
pip install --global-option="--no-networks" git+https://github.com/NVlabs/tiny-cuda-nn#subdirectory=bindings/torch
imageio_download_bin freeimage

Every new command prompt

activate dmodel

Examples

Our approach is designed for high-end NVIDIA GPUs with large amounts of memory. To run on mid-range GPUs, reduce the batch size parameter in the .json files, as sketched below.
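
A minimal sketch of that tweak, assuming the field is named "batch" as in the shipped example configs such as configs/bob.json:

import json

# Load a shipped config and lower its batch size to fit a mid-range GPU.
with open("configs/bob.json") as f:
    cfg = json.load(f)

cfg["batch"] = 2  # smaller batches trade training time for GPU memory
with open("configs/bob_small.json", "w") as f:
    json.dump(cfg, f, indent=4)

The reduced config can then be used as usual: python train.py --config configs/bob_small.json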

Simple genus 1 reconstruction example:

python train.py --config configs/bob.json

Visualize training progress (only supported on Windows):

python train.py --config configs/bob.json --display-interval 20

Multi-GPU example using PyTorch DDP (Linux only; experimental, as all results in the paper were generated using a single GPU):

torchrun --nproc_per_node=4 train.py --config configs/bob.json

Below, we show the starting point and the final result, with reference images to the right.

[Figure: initial guess (left) and our result (right)]

The results will be stored in the out folder. The Spot and Bob models were created and released into the public domain by Keenan Crane.

Included examples

  • spot.json - Extracts a 3D model (geometry, materials, and lighting) of the Spot model from image observations.
  • spot_fixlight.json - Same as above but assuming known environment lighting.
  • spot_metal.json - Example of joint learning of materials and high-frequency environment lighting to showcase the split-sum approximation (see the sketch after this list).
  • bob.json - Simple example of a genus 1 model.
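
The split-sum approximation referenced by spot_metal.json factors the specular integral into a prefiltered environment lookup times a BRDF integration term. A minimal sketch, assuming the prefiltered environment value and the 2D BRDF LUT have already been computed (names and shapes are illustrative, not nvdiffrec's actual code):

import torch

def split_sum_specular(prefiltered, brdf_lut, n_dot_v, roughness, f0):
    """Split-sum specular shading (Karis 2013), illustrative sketch.

    prefiltered: (3,) radiance fetched from the prefiltered environment map
                 at the reflection direction and roughness-matched mip level
    brdf_lut:    (256, 256, 2) scale/bias table indexed by (n_dot_v, roughness)
    f0:          (3,) specular reflectance at normal incidence"""
    iu = (n_dot_v.clamp(0.0, 1.0) * 255).long()    # nearest-neighbor fetch;
    iv = (roughness.clamp(0.0, 1.0) * 255).long()  # real code would interpolate
    scale, bias = brdf_lut[iu, iv, 0], brdf_lut[iu, iv, 1]
    return prefiltered * (f0 * scale + bias)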

Datasets

We additionally include configs (nerf_*.json, nerd_*.json) to reproduce the main results of the paper. We rely on third party datasets, which are courtesy of their respective authors. Please note that individual licenses apply to each dataset. To automatically download and pre-process all datasets, run the download_datasets.py script:

activate dmodel
cd data
python download_datasets.py

Below follows more information and instructions on how to manually install the datasets (in case the automated script fails).

NeRF synthetic dataset

Our view interpolation results use the synthetic dataset from the original NeRF paper. To manually install it, download the NeRF synthetic dataset archive and unzip it into the nvdiffrec/data folder. This is required for running any of the nerf_*.json configs.

NeRD dataset

We use datasets from the NeRD paper, which features real-world photogrammetry and inaccurate (manually annotated) segmentation masks. Clone the NeRD datasets using git and rescale them to 512 x 512 pixels resolution using the script scale_images.py. This is required for running any of the nerd_*.json configs.

activate dmodel
cd nvdiffrec/data/nerd
git clone https://github.com/vork/ethiopianHead.git
git clone https://github.com/vork/moldGoldCape.git
python scale_images.py

Server usage (through Docker)

  • Build the docker image:

cd docker
./make_image.sh nvdiffrec:v1

  • Start an interactive docker container:

docker run --gpus device=0 -it --rm -v /raid:/raid nvdiffrec:v1 bash

  • Detached docker:

docker run --gpus device=1 -d -v /raid:/raid -w=[path to the code] nvdiffrec:v1 python train.py --config configs/bob.json