NVlabs/nvdiffrec

Official code for the CVPR 2022 (oral) paper "Extracting Triangular 3D Models, Materials, and Lighting From Images".

Top Related Projects

  • pytorch3d: PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
  • nerfies: This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.
  • multinerf: A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF
  • DPT: Dense Prediction Transformers
Quick Overview

NVlabs/nvdiffrec is an open-source project for differentiable rendering and reconstruction of 3D objects from 2D images. It uses PyTorch to implement a differentiable rendering pipeline, allowing for the optimization of 3D geometry, materials, and lighting from 2D observations.
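
To make the idea of optimizing scene parameters from 2D observations concrete, here is a minimal sketch in plain Python. It is a toy, not nvdiffrec's pipeline: the Lambertian `render` function, the `fit_albedo` helper, and the hand-derived gradient are all assumptions chosen for illustration, whereas the real project relies on PyTorch autograd and a full rasterization-based renderer.

```python
# Toy analysis-by-synthesis loop: recover a single diffuse albedo value
# from observed pixel intensities. All names here are hypothetical.

def render(albedo, cosines):
    """Shade each pixel as albedo * max(n.l, 0): Lambertian, unit light intensity."""
    return [albedo * max(c, 0.0) for c in cosines]


def fit_albedo(target, cosines, steps=200, lr=0.5):
    """Gradient descent on the mean squared error between rendering and target."""
    albedo = 0.1  # initial guess
    n = len(target)
    for _ in range(steps):
        image = render(albedo, cosines)
        # d(MSE)/d(albedo) = (2 / n) * sum((image - target) * d(image)/d(albedo))
        grad = sum(
            2.0 * (p - t) * max(c, 0.0)
            for p, t, c in zip(image, target, cosines)
        ) / n
        albedo -= lr * grad
    return albedo


cosines = [0.2, 0.5, 0.9, 1.0]   # n.l per pixel (assumed known)
target = render(0.8, cosines)    # "photograph" produced with the true albedo 0.8
estimate = fit_albedo(target, cosines)
print(round(estimate, 4))
```

The same structure (render, compare with the target, follow the gradient) carries over when the unknowns are mesh vertices, textures, and an environment map instead of one scalar.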

Pros

  • Enables 3D reconstruction from 2D images without the need for 3D supervision
  • Supports various material models, including diffuse, specular, and microfacet BRDFs
  • Utilizes GPU acceleration for efficient rendering and optimization
  • Provides a flexible framework for experimenting with different rendering techniques and loss functions

Cons

  • Requires significant computational resources, especially for complex scenes or high-resolution images
  • May struggle with highly complex or occluded geometries
  • Learning curve can be steep for users unfamiliar with differentiable rendering concepts
  • Limited documentation and examples for advanced use cases

Code Examples

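The snippets below are schematic pseudocode: the README drives the project through train.py and JSON configs, so the nvdiffrec.* names used here are illustrative stand-ins rather than a documented Python API.

  1. Loading and rendering a 3D mesh:
import nvdiffrec  # hypothetical package name

# Load mesh and create a differentiable renderer
mesh = nvdiffrec.load_mesh("path/to/mesh.obj")
renderer = nvdiffrec.create_renderer(resolution=(512, 512))

# Render the mesh
image = renderer.render(mesh)
  2. Optimizing material properties (continues from example 1):
import torch

# target_image is assumed to be a reference image tensor
# with the same shape as the rendering.

# Create an optimizer for material parameters
material_params = mesh.material.parameters()
optimizer = torch.optim.Adam(material_params, lr=0.01)

# Optimization loop
for _ in range(100):
    optimizer.zero_grad()
    rendered_image = renderer.render(mesh)
    loss = torch.nn.functional.mse_loss(rendered_image, target_image)
    loss.backward()
    optimizer.step()
  3. Reconstructing 3D geometry from images (continues from example 2):
# Create a deformable mesh for optimization; initial_mesh is a starting shape
deformable_mesh = nvdiffrec.create_deformable_mesh(initial_mesh)
optimizer = torch.optim.Adam(deformable_mesh.parameters(), lr=0.01)

# Optimization loop for geometry and material
for _ in range(1000):
    optimizer.zero_grad()
    rendered_image = renderer.render(deformable_mesh)
    # compute_reconstruction_loss is a hypothetical helper, e.g. an image-space MSE
    loss = compute_reconstruction_loss(rendered_image, target_image)
    loss.backward()
    optimizer.step()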

Getting Started

To get started with NVlabs/nvdiffrec:

  1. Clone the repository:

    git clone https://github.com/NVlabs/nvdiffrec.git
    cd nvdiffrec
    
  2. Install dependencies by following the Installation section of the README below (CUDA toolkit, a conda environment with PyTorch, nvdiffrast, and tiny-cuda-nn).

  3. Run a simple example:

    python train.py --config configs/bob.json

For more detailed instructions and advanced usage, refer to the project's documentation and examples in the repository.

Competitor Comparisons

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Pros of pytorch3d

  • More comprehensive 3D computer vision library with a broader range of functionalities
  • Better integration with PyTorch ecosystem and GPU acceleration
  • Extensive documentation and community support

Cons of pytorch3d

  • Steeper learning curve due to its extensive feature set
  • May be overkill for simpler 3D rendering tasks
  • Less focused on differentiable rendering compared to nvdiffrec

Code Comparison

pytorch3d

import torch
from pytorch3d.structures import Meshes
from pytorch3d.renderer import TexturesVertex

# Create a two-triangle mesh with per-vertex colors
verts = torch.rand(4, 3)                      # (V, 3) vertex positions
faces = torch.tensor([[0, 1, 2], [0, 2, 3]])  # (F, 3) vertex indices per triangle
colors = torch.ones_like(verts)               # (V, 3) white per-vertex RGB
mesh = Meshes(verts=[verts], faces=[faces], textures=TexturesVertex(verts_features=[colors]))

nvdiffrec

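# Schematic only: nvdiffrec.Material and nvdiffrec.Mesh are illustrative names,
# not a documented API (the README runs the project through train.py).
import torch
import nvdiffrec

# Create a mesh with texture coordinates and a constant diffuse material
verts = torch.rand(4, 3)
faces = torch.tensor([[0, 1, 2], [0, 2, 3]])
uv = torch.rand(4, 2)
material = nvdiffrec.Material({'kd': torch.ones(3)})
mesh = nvdiffrec.Mesh(verts, faces, uv, material)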

This is the code for Deformable Neural Radiance Fields, a.k.a. Nerfies.

Pros of nerfies

  • Focuses on dynamic scenes and non-rigid objects, allowing for more versatile applications
  • Utilizes a deformation-based approach, enabling better handling of complex movements
  • Provides a method for creating 3D animations from casual video captures

Cons of nerfies

  • May require more computational resources due to the complexity of handling dynamic scenes
  • Potentially less accurate for static objects compared to specialized static reconstruction methods
  • Limited to scenes captured with a handheld camera, which may restrict some use cases

Code Comparison

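nvdiffrec (schematic; class names are illustrative, not a documented API):

import nvdiffrec
scene = nvdiffrec.Scene(...)
optimizer = nvdiffrec.Optimizer(...)
result = optimizer.optimize(scene)

nerfies (schematic; class names are illustrative, not a documented API):

import nerfies
model = nerfies.NerfiesModel(...)
trainer = nerfies.Trainer(model, ...)
trainer.train()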

Both repositories focus on 3D reconstruction, but nvdiffrec specializes in static scenes with high-quality material recovery, while nerfies excels in handling dynamic scenes and non-rigid objects. nvdiffrec may offer better results for static objects, while nerfies provides more flexibility for capturing and reconstructing moving subjects.

A Code Release for Mip-NeRF 360, Ref-NeRF, and RawNeRF

Pros of multinerf

  • Supports multi-view synthesis and novel view generation
  • Implements advanced NeRF variants like Mip-NeRF 360 and Ref-NeRF
  • Provides a unified framework for various NeRF-based techniques

Cons of multinerf

  • Potentially higher computational requirements due to complex models
  • May require more extensive datasets for optimal performance

Code comparison

multinerf (schematic; train_utils stands in for the project's own training helpers):

from absl import flags
from ml_collections import config_flags

config = config_flags.DEFINE_config_file('config', None, 'File path to the config file.')
FLAGS = flags.FLAGS
flags.mark_flags_as_required(['config'])

def main(unused_argv):
  render_fn = train_utils.setup_model(config)
  train_utils.train_loop(config, render_fn)

nvdiffrec (schematic; Trainer is an illustrative name):

import argparse

parser = argparse.ArgumentParser(description='nvdiffrec')
parser.add_argument('--config', type=str, default=None, help='Config file')
args = parser.parse_args()

def main():
    trainer = Trainer(args)
    trainer.train()

Both repositories focus on neural rendering techniques, but multinerf specializes in NeRF-based methods, while nvdiffrec offers a more general approach to differentiable rendering. multinerf provides a unified framework for various NeRF variants, potentially offering more flexibility in multi-view synthesis. However, nvdiffrec may be more suitable for tasks requiring explicit geometry reconstruction and material estimation.

Dense Prediction Transformers

Pros of DPT

  • Focuses on depth estimation and monocular depth prediction
  • Utilizes transformers for improved performance in vision tasks
  • Provides pre-trained models for various depth estimation scenarios

Cons of DPT

  • Limited to depth estimation tasks, less versatile than nvdiffrec
  • May require more computational resources due to transformer architecture
  • Less emphasis on material reconstruction and rendering

Code Comparison

DPT:

from dpt.models import DPTDepthModel
model = DPTDepthModel(
    path=path_to_pretrained_model,
    backbone="vitb_rn50_384",
    non_negative=True,
)
depth = model(image)

nvdiffrec (schematic; class names are illustrative, not a documented API):

import nvdiffrec
renderer = nvdiffrec.Renderer(resolution=(512, 512))
material = nvdiffrec.Material(basecolor_tex=texture)
mesh = nvdiffrec.Mesh(vertices, faces, material=material)
image = renderer.render(mesh, camera)

DPT is specialized for depth estimation using transformers, offering pre-trained models for various scenarios. nvdiffrec, on the other hand, provides a more comprehensive framework for 3D reconstruction and rendering, including material properties. DPT may be more suitable for projects focused solely on depth estimation, while nvdiffrec offers greater flexibility for 3D reconstruction and rendering tasks.

README

nvdiffrec

Teaser image

Joint optimization of topology, materials and lighting from multi-view image observations as described in the paper Extracting Triangular 3D Models, Materials, and Lighting From Images.

For differentiable marching tetrahedra, we have adapted code from NVIDIA's Kaolin: A Pytorch Library for Accelerating 3D Deep Learning Research.
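
As a rough illustration of why marching tetrahedra lends itself to differentiation, the sketch below shows the core per-edge step in plain Python: a surface vertex is placed where the linearly interpolated signed distance crosses zero, so its position varies smoothly with the signed distance values at the edge endpoints. The function names are hypothetical and the code is not taken from Kaolin or this repository.

```python
# Core step of marching tetrahedra on a single tetrahedron (illustrative sketch).

TET_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]


def crossing_edges(sdf):
    """Return the edges whose endpoints have signed distances of opposite sign."""
    return [(i, j) for i, j in TET_EDGES if sdf[i] * sdf[j] < 0]


def interpolate_zero_crossing(p_a, p_b, s_a, s_b):
    """Point on the edge (p_a, p_b) where the linearly interpolated SDF is zero."""
    if s_a * s_b >= 0:
        raise ValueError("edge does not cross the surface")
    t = s_a / (s_a - s_b)  # fraction of the way from p_a to p_b
    return tuple(a + t * (b - a) for a, b in zip(p_a, p_b))


# One tetrahedron with vertex 0 inside the surface and the other three outside.
verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
sdf = [-0.5, 0.5, 0.5, 0.5]

edges = crossing_edges(sdf)
triangle = [interpolate_zero_crossing(verts[i], verts[j], sdf[i], sdf[j]) for i, j in edges]
print(edges)
print(triangle)
```

With one vertex inside and three outside, three edges cross the surface and the interpolated points form a single triangle; changing any signed distance value slides the corresponding vertices along their edges.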

News

  • 2023-10-20 : We added a version of the renderutils library written in slangpy to leverage the autodiff capabilities of slang instead of CUDA extensions with manually crafted forward and backward passes. This simplifies the code substantially, with the same runtime performance as before. This version is available in the slang branch of this repo.

  • 2023-09-15 : We added support for the FlexiCubes isosurfacing technique. Please see the config configs/bob_flexi.json for a usage example, and refer to the FlexiCubes documentation for details.

Citation

@inproceedings{Munkberg_2022_CVPR,
    author    = {Munkberg, Jacob and Hasselgren, Jon and Shen, Tianchang and Gao, Jun and Chen, Wenzheng 
                    and Evans, Alex and M\"uller, Thomas and Fidler, Sanja},
    title     = "{Extracting Triangular 3D Models, Materials, and Lighting From Images}",
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {8280-8290}
}

Licenses

Copyright © 2022, NVIDIA Corporation. All rights reserved.

This work is made available under the Nvidia Source Code License.

For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing.

Installation

Requires Python 3.6+, VS2019+, CUDA 11.3+ and PyTorch 1.10+

Tested in Anaconda3 with Python 3.9 and PyTorch 1.10

One-time setup (Windows)

Install the CUDA toolkit (required to build the PyTorch extensions). We support CUDA 11.3 and above. Pick the appropriate version of PyTorch compatible with the installed CUDA toolkit. Below is an example with CUDA 11.6:

conda create -n dmodel python=3.9
activate dmodel
conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge
pip install ninja imageio PyOpenGL glfw xatlas gdown
pip install git+https://github.com/NVlabs/nvdiffrast/
pip install --global-option="--no-networks" git+https://github.com/NVlabs/tiny-cuda-nn#subdirectory=bindings/torch
imageio_download_bin freeimage

Every new command prompt

activate dmodel

Examples

Our approach is designed for high-end NVIDIA GPUs with large amounts of memory. To run on mid-range GPUs, reduce the batch size parameter in the .json files.
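
One way to lower the batch size without editing a shipped config by hand is to write a modified copy. The sketch below assumes the batch size is stored under a top-level key named `batch`; that key name, the helper `write_reduced_batch_config`, and the stand-in config contents are assumptions, so check the actual .json file before relying on it.

```python
import json
import tempfile
from pathlib import Path


def write_reduced_batch_config(src, dst, divisor=2, key="batch"):
    """Copy a JSON config to dst with the batch size divided by `divisor` (minimum 1)."""
    config = json.loads(Path(src).read_text())
    config[key] = max(1, int(config[key]) // divisor)
    Path(dst).write_text(json.dumps(config, indent=4))
    return config[key]


# Demonstration on a stand-in config written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "example.json"
    dst = Path(tmp) / "example_small.json"
    src.write_text(json.dumps({"batch": 8, "iter": 1000}))
    new_batch = write_reduced_batch_config(src, dst)
    reloaded = json.loads(dst.read_text())

print(new_batch)
```

The copy can then be passed to train.py through the --config option in place of the original file.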

Simple genus 1 reconstruction example:

python train.py --config configs/bob.json
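
For readers wondering how a `--config` file and command-line options can coexist, the following is a minimal sketch of one common pattern: parse the command line first, then let the values in the JSON file override the parsed defaults. It is not verified against train.py; the `parse_flags` helper, the `batch` key, and the default of 0 for `--display-interval` are assumptions.

```python
import argparse
import json
import tempfile
from pathlib import Path


def parse_flags(argv):
    """Parse command-line options, then overlay values from the JSON config if given."""
    parser = argparse.ArgumentParser(description="config-driven training sketch")
    parser.add_argument("--config", type=str, default=None, help="Config file")
    parser.add_argument("--display-interval", type=int, default=0)
    flags = parser.parse_args(argv)
    if flags.config is not None:
        for key, value in json.loads(Path(flags.config).read_text()).items():
            setattr(flags, key, value)  # config entries win over defaults
    return flags


# Demonstration with a stand-in config file.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "example.json"
    config_path.write_text(json.dumps({"batch": 4}))
    flags = parse_flags(["--config", str(config_path)])

print(flags.batch, flags.display_interval)
```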

Visualize training progress (only supported on Windows):

python train.py --config configs/bob.json --display-interval 20

Multi-GPU example using PyTorch DDP (Linux only; experimental: all results in the paper were generated using a single GPU):

torchrun --nproc_per_node=4 train.py --config configs/bob.json
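
torchrun starts one process per GPU and passes each process its position through the RANK, LOCAL_RANK, and WORLD_SIZE environment variables. The sketch below shows how a script could read them and fall back to single-process defaults; the helper name `distributed_context` is hypothetical and this is not a description of what train.py does internally.

```python
import os


def distributed_context(environ=None):
    """Read the process layout that torchrun exports, defaulting to a single process."""
    env = os.environ if environ is None else environ
    world_size = int(env.get("WORLD_SIZE", "1"))
    return {
        "rank": int(env.get("RANK", "0")),              # global process index
        "local_rank": int(env.get("LOCAL_RANK", "0")),  # index on this machine, usually the GPU id
        "world_size": world_size,                       # total number of processes
        "multi_gpu": world_size > 1,
    }


single = distributed_context({})
worker = distributed_context({"RANK": "3", "LOCAL_RANK": "3", "WORLD_SIZE": "4"})
print(single["multi_gpu"], worker["multi_gpu"])
```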

Below, we show the starting point and the final result. References to the right.

[Figure: initial guess and our result]

The results will be stored in the out folder. The Spot and Bob models were created and released into the public domain by Keenan Crane.

Included examples

  • spot.json - Extracts a 3D model of the Spot model: geometry, materials, and lighting from image observations.
  • spot_fixlight.json - Same as above but assuming known environment lighting.
  • spot_metal.json - Example of joint learning of materials and high frequency environment lighting to showcase split-sum.
  • bob.json - Simple example of a genus 1 model.
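
The split-sum approximation mentioned for spot_metal.json can be summarized as follows. This is a sketch of the general formulation from the real-time rendering literature, not an excerpt from this repository's code; the symbols are assumptions: $L_i$ is incident radiance from the environment map, $f$ the BRDF, $D$ the normal distribution function of the specular lobe, $n$ the surface normal, and $\omega_i$, $\omega_o$ the incident and outgoing directions.

```latex
L(\omega_o) \;\approx\;
\int_{\Omega} f(\omega_i, \omega_o)\,(\omega_i \cdot n)\, d\omega_i
\;\cdot\;
\int_{\Omega} L_i(\omega_i)\, D(\omega_i, \omega_o)\,(\omega_i \cdot n)\, d\omega_i
```

The first factor depends only on the material and viewing angle and can be precomputed; the second is the environment light filtered by the specular lobe, which is the part that lets high-frequency lighting be learned jointly with the materials.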

Datasets

We additionally include configs (nerf_*.json, nerd_*.json) to reproduce the main results of the paper. We rely on third party datasets, which are courtesy of their respective authors. Please note that individual licenses apply to each dataset. To automatically download and pre-process all datasets, run the download_datasets.py script:

activate dmodel
cd data
python download_datasets.py

Below follows more information and instructions on how to manually install the datasets (in case the automated script fails).

NeRF synthetic dataset: Our view interpolation results use the synthetic dataset from the original NeRF paper. To manually install it, download the NeRF synthetic dataset archive and unzip it into the nvdiffrec/data folder. This is required for running any of the nerf_*.json configs.

NeRD dataset: We use datasets from the NeRD paper, which feature real-world photogrammetry and inaccurate (manually annotated) segmentation masks. Clone the NeRD datasets using git and rescale them to a resolution of 512 x 512 pixels using the script scale_images.py. This is required for running any of the nerd_*.json configs.

activate dmodel
cd nvdiffrec/data/nerd
git clone https://github.com/vork/ethiopianHead.git
git clone https://github.com/vork/moldGoldCape.git
python scale_images.py

Server usage (through Docker)

  • Build the Docker image:
cd docker
./make_image.sh nvdiffrec:v1
  • Start an interactive Docker container:
docker run --gpus device=0 -it --rm -v /raid:/raid nvdiffrec:v1 bash
  • Run a detached Docker container:
docker run --gpus device=1 -d -v /raid:/raid -w=[path to the code] nvdiffrec:v1 python train.py --config configs/bob.json