graphics

TensorFlow Graphics: Differentiable Graphics Layers for TensorFlow

Quick Overview

TensorFlow Graphics is an open-source library that extends TensorFlow's capabilities to handle 3D computer graphics and geometry processing tasks. It provides a set of differentiable graphics layers and utilities that can be seamlessly integrated into deep learning models, enabling researchers and developers to incorporate 3D operations into their machine learning pipelines.

Pros

Seamless integration with TensorFlow ecosystem
Differentiable graphics operations for end-to-end learning
Supports various 3D tasks like rendering, mesh processing, and camera operations
Well-documented with examples and tutorials

Cons

Steep learning curve for those unfamiliar with 3D graphics concepts
Limited compared to specialized graphics libraries
Dependency on TensorFlow may restrict usage in other ML frameworks
Still in active development, so some features may be experimental

Code Examples

Rendering a 3D mesh:

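import tensorflow as tf
import tensorflow_graphics as tfg

# A single triangle: three vertices and one face indexing into them.
vertices = tf.constant([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=tf.float32)
triangles = tf.constant([[0, 1, 2]], dtype=tf.int32)

# NOTE: illustrative call. The README further down lists a differentiable
# rasterizer as upcoming, so check the API documentation for the actual entry
# point and arguments before relying on this name or signature.
rendered = tfg.rendering.mesh.rasterize(vertices, triangles, image_size=(256, 256))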

Applying perspective projection:

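import tensorflow as tf
# The perspective camera model lives under rendering.camera,
# not under geometry.transformation.
from tensorflow_graphics.rendering.camera import perspective

# Two 3D points in camera space, shape [2, 3].
points_3d = tf.constant([[1, 2, 3], [4, 5, 6]], dtype=tf.float32)
# Focal lengths and principal point, shape [2]; broadcast over both points.
focal = tf.constant([1.0, 1.0])
principal_point = tf.constant([0.0, 0.0])

# Projected 2D points, shape [2, 2].
points_2d = perspective.project(points_3d, focal, principal_point)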
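To make the arithmetic behind a perspective projection concrete, here is a minimal pure-Python sketch of the standard pinhole model. The helper name `project_pinhole` is hypothetical and this is not the library's implementation; it only assumes that each coordinate is divided by depth, scaled by the focal length, and shifted by the principal point.

```python
def project_pinhole(point_3d, focal, principal_point):
    """Project one camera-space 3D point onto the image plane (pinhole model)."""
    x, y, z = point_3d
    if z == 0:
        raise ValueError("point lies on the camera plane (z == 0)")
    fx, fy = focal
    cx, cy = principal_point
    # Perspective divide by depth, scale by focal length, shift by principal point.
    return (fx * x / z + cx, fy * y / z + cy)


# Same inputs as the example above.
points_3d = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
points_2d = [
    project_pinhole(p, focal=(1.0, 1.0), principal_point=(0.0, 0.0))
    for p in points_3d
]
```

With a unit focal length and a principal point at the origin, each point is simply divided by its depth, so the first point lands at (1/3, 2/3).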

Computing mesh normals:

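import tensorflow as tf
# `normals` is a module, not a function; import it and call vertex_normals.
from tensorflow_graphics.geometry.representation.mesh import normals as mesh_normals

# Four vertices and three triangles sharing vertex 0.
vertices = tf.constant([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], dtype=tf.float32)
triangles = tf.constant([[0, 1, 2], [0, 2, 3], [0, 3, 1]], dtype=tf.int32)

# One normal per vertex, shape [4, 3].
normals = mesh_normals.vertex_normals(vertices, triangles)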
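As a rough illustration of the geometry involved, the sketch below computes per-face normals in plain Python using the cross product of two triangle edges. The helper `face_normal` is hypothetical and not part of TensorFlow Graphics. The sign of each normal depends on the winding order of the triangle, which is an assumption to check against whatever convention your mesh uses.

```python
import math


def face_normal(v0, v1, v2):
    """Unit normal of triangle (v0, v1, v2) via the cross product of two edges."""
    e1 = [b - a for a, b in zip(v0, v1)]
    e2 = [b - a for a, b in zip(v0, v2)]
    n = [
        e1[1] * e2[2] - e1[2] * e2[1],
        e1[2] * e2[0] - e1[0] * e2[2],
        e1[0] * e2[1] - e1[1] * e2[0],
    ]
    length = math.sqrt(sum(c * c for c in n))
    if length == 0:
        raise ValueError("degenerate triangle has no normal")
    return [c / length for c in n]


# Same mesh as the example above.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
triangles = [(0, 1, 2), (0, 2, 3), (0, 3, 1)]
face_normals = [face_normal(*(vertices[i] for i in tri)) for tri in triangles]
```

A per-vertex normal can then be obtained by combining and renormalizing the normals of the faces that touch the vertex.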

Getting Started

To get started with TensorFlow Graphics:

Install the library:

pip install tensorflow-graphics

Import the library in your Python script:

import tensorflow as tf
import tensorflow_graphics as tfg

Use the desired modules and functions in your code:

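# Example: render a simple triangle.
# NOTE: the rasterize call is illustrative; check the API documentation for
# the actual rasterization entry point and arguments.
vertices = tf.constant([[0, 0, 0], [1, 0, 0], [0, 1, 0]], dtype=tf.float32)
triangles = tf.constant([[0, 1, 2]], dtype=tf.int32)
rendered = tfg.rendering.mesh.rasterize(vertices, triangles, image_size=(256, 256))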

For more detailed examples and tutorials, refer to the official documentation and examples in the GitHub repository.

Competitor Comparisons

jax

32,985

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Pros of JAX

More flexible and composable, allowing for easier customization of models and algorithms
Better support for automatic differentiation and vectorization, leading to improved performance
Simpler API and more pythonic syntax, making it easier to learn and use

Cons of JAX

No built-in graphics layers, so fewer ready-made components for graphics tasks than TensorFlow Graphics
Less comprehensive documentation and tutorials for graphics-specific tasks
May require more low-level implementation for certain graphics operations

Code Comparison

JAX example:

import jax.numpy as jnp
from jax import grad, jit

def loss(params, x, y):
    return jnp.mean((params[0] * x + params[1] - y) ** 2)

grad_loss = jit(grad(loss))

TensorFlow Graphics example:

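import tensorflow as tf
import tensorflow_graphics as tfg

# NOTE: the rasterize call is illustrative; check the API documentation for
# the actual rasterization entry point and arguments.
def render_mesh(vertices, triangles, camera):
    return tfg.rendering.mesh.rasterize(vertices, triangles, camera)

# Image-space loss between a rendering and a target image of the same shape.
def image_loss(vertices, triangles, camera, target):
    rendered = render_mesh(vertices, triangles, camera)
    return tf.reduce_mean(tf.square(rendered - target))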

Both libraries support differentiable computation, but they sit at different levels: JAX is a general-purpose library for composable numerical transformations, so graphics operations must be built on top of it, while TensorFlow Graphics offers a specialized set of graphics-specific functions and operations.

pytorch3d

9,337

PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Pros of PyTorch3D

Built on PyTorch, offering seamless integration with PyTorch ecosystem
More extensive documentation and tutorials
Faster development cycle and more frequent updates

Cons of PyTorch3D

Less mature compared to TensorFlow Graphics
Smaller community and fewer third-party extensions
Limited support for non-PyTorch workflows

Code Comparison

PyTorch3D:

import torch
from pytorch3d.structures import Meshes

# Four random vertices and two triangles indexing into them.
verts = torch.randn(4, 3)
faces = torch.tensor([[0, 1, 2], [1, 2, 3]])
# Meshes takes lists of per-mesh tensors, so a single mesh is a batch of one.
mesh = Meshes(verts=[verts], faces=[faces])

TensorFlow Graphics:

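import tensorflow as tf
from tensorflow_graphics.geometry.representation.mesh import normals as mesh_normals

vertices = tf.random.uniform((4, 3))
triangles = tf.constant([[0, 1, 2], [1, 2, 3]], dtype=tf.int32)
# No mesh container is assumed here: the vertex and index tensors are passed
# directly to the functions that need them, for example vertex normals.
vertex_normals = mesh_normals.vertex_normals(vertices, triangles)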

Both libraries provide similar functionality for 3D graphics operations, but with syntax and implementation differences reflecting their respective frameworks.

google-research

36,128

Google Research

Pros of google-research

Broader scope, covering various research areas beyond graphics
More frequent updates and contributions from Google researchers
Larger community and potential for collaboration across different fields

Cons of google-research

Less focused on graphics-specific applications
May require more effort to find relevant graphics-related content
Potentially steeper learning curve due to diverse topics

Code Comparison

graphics:

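import tensorflow as tf
from tensorflow_graphics.geometry.representation.mesh import normals as mesh_normals

vertices = tf.constant([[0., 0., 0.], [1., 0., 0.], [1., 1., 0.]])
triangles = tf.constant([[0, 1, 2]])
# `normals` is a module; vertex_normals returns one normal per vertex.
normals = mesh_normals.vertex_normals(vertices, triangles)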

google-research:

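import tensorflow as tf
# NOTE: illustrative only. google-research is a collection of independent
# project directories, so the import path and constructor below are assumptions.
from google_research import vision_transformer as vit

model = vit.VisionTransformer(
    num_classes=1000,
    patches=(16, 16),
    transformer=dict(mlp_dim=3072, num_heads=12, num_layers=12)
)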

Summary

graphics focuses specifically on computer graphics tasks using TensorFlow, while google-research covers a wide range of research topics. graphics provides more specialized tools for graphics-related tasks, but google-research offers a broader scope and more frequent updates. The choice between the two depends on the specific needs of the project and the desired level of specialization in graphics applications.

kaolin

4,817

A PyTorch Library for Accelerating 3D Deep Learning Research

Pros of Kaolin

Specialized for 3D deep learning tasks with a focus on computer graphics and vision
Includes a wider range of 3D data structures and operations (e.g., meshes, voxels, point clouds)
Better integration with PyTorch ecosystem and NVIDIA hardware acceleration

Cons of Kaolin

Less mature and smaller community compared to TensorFlow Graphics
More limited documentation and tutorials
Narrower scope, primarily focused on 3D tasks rather than general computer graphics

Code Comparison

Kaolin example (mesh loading and rendering):

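import kaolin as kal
import torch

mesh = kal.io.obj.import_mesh('model.obj')
# NOTE: the renderer class name and call below are illustrative; see the
# Kaolin documentation for the rendering API of your installed version.
renderer = kal.render.mesh.dibr.DIBRenderer(height=512, width=512)
image = renderer(mesh.vertices, mesh.faces)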

TensorFlow Graphics example (3D rotation):

import numpy as np
import tensorflow as tf
import tensorflow_graphics as tfg

# Euler angles in radians, shape [1, 3]: a quarter turn about the z axis.
angles = tf.constant([[0., 0., np.pi/2]])
matrix = tfg.geometry.transformation.rotation_matrix_3d.from_euler(angles)
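For intuition about what such a rotation matrix does, here is a minimal pure-Python sketch restricted to a rotation about the z axis. The helpers `rotation_z` and `apply` are hypothetical names, not library functions, and the sketch deliberately avoids making any claim about the axis ordering used for full three-angle Euler rotations.

```python
import math


def rotation_z(theta):
    """3x3 rotation matrix for a counter-clockwise rotation of theta radians about z."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        [c, -s, 0.0],
        [s, c, 0.0],
        [0.0, 0.0, 1.0],
    ]


def apply(matrix, vector):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# A quarter turn about z maps the x axis onto the y axis.
rotated = apply(rotation_z(math.pi / 2), [1.0, 0.0, 0.0])
```

Up to floating-point rounding, `rotated` is the unit vector along y.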

Both libraries offer powerful tools for 3D graphics and deep learning, but Kaolin is more specialized for 3D tasks and PyTorch integration, while TensorFlow Graphics provides a broader range of computer graphics functionalities within the TensorFlow ecosystem.

README

TensorFlow Graphics

The last few years have seen a rise in novel differentiable graphics layers which can be inserted in neural network architectures. From spatial transformers to differentiable graphics renderers, these new layers leverage the knowledge acquired over years of computer vision and graphics research to build new and more efficient network architectures. Explicitly modeling geometric priors and constraints into neural networks opens up the door to architectures that can be trained robustly, efficiently, and more importantly, in a self-supervised fashion.

Overview

At a high level, a computer graphics pipeline requires a representation of 3D objects and their absolute positioning in the scene, a description of the material they are made of, lights and a camera. This scene description is then interpreted by a renderer to generate a synthetic rendering.

In comparison, a computer vision system would start from an image and try to infer the parameters of the scene. This allows the prediction of which objects are in the scene, what materials they are made of, and their three-dimensional position and orientation.

Training machine learning systems capable of solving these complex 3D vision tasks most often requires large quantities of data. As labelling data is a costly and complex process, it is important to have mechanisms to design machine learning models that can comprehend the three-dimensional world while being trained without much supervision. Combining computer vision and computer graphics techniques provides a unique opportunity to leverage the vast amounts of readily available unlabelled data. This can, for instance, be achieved using analysis by synthesis, where the vision system extracts the scene parameters and the graphics system renders back an image based on them. If the rendering matches the original image, the vision system has accurately extracted the scene parameters. In this setup, computer vision and computer graphics go hand in hand, forming a single machine learning system similar to an autoencoder, which can be trained in a self-supervised manner.
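The analysis-by-synthesis loop can be sketched on a toy problem. The example below is an illustration under strong assumptions, not the library's method: the "scene parameter" is a single focal length, the "renderer" is a pinhole projection, and the gradient is written out by hand where a framework such as TensorFlow would supply it through automatic differentiation. All names (`render`, `loss_and_grad`, `true_focal`) are hypothetical.

```python
def render(points_3d, focal):
    """Graphics step: project 3D points to 2D using a single focal length."""
    return [(focal * x / z, focal * y / z) for x, y, z in points_3d]


def loss_and_grad(points_3d, observed, focal):
    """Mean squared error between rendering and observation, and d(loss)/d(focal)."""
    rendered = render(points_3d, focal)
    n = 2 * len(points_3d)
    loss, grad = 0.0, 0.0
    for (x, y, z), (rx, ry), (ox, oy) in zip(points_3d, rendered, observed):
        # Each rendered coordinate is focal * u, so its derivative w.r.t. focal is u.
        for r, o, u in ((rx, ox, x / z), (ry, oy, y / z)):
            loss += (r - o) ** 2 / n
            grad += 2.0 * (r - o) * u / n
    return loss, grad


# The "original image": observations produced with the unknown true focal length.
points_3d = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (-1.0, 0.5, 2.0)]
true_focal = 2.0
observed = render(points_3d, true_focal)

# Analysis step: start from a wrong guess and descend until the rendering matches.
focal = 0.5
learning_rate = 0.5
for _ in range(200):
    loss, grad = loss_and_grad(points_3d, observed, focal)
    focal -= learning_rate * grad
```

No labels are needed: the only supervision is the requirement that the re-rendered points match the observed ones, which is what makes the setup self-supervised.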

TensorFlow Graphics is being developed to help tackle these types of challenges. To do so, it provides a set of differentiable graphics and geometry layers (e.g. cameras, reflectance models, spatial transformations, mesh convolutions) and 3D viewer functionalities (e.g. 3D TensorBoard) that can be used to train and debug your machine learning models of choice.

Installing TensorFlow Graphics

See the install documentation for instructions on how to install TensorFlow Graphics.

API Documentation

You can find the API documentation here.

Compatibility

TensorFlow Graphics is fully compatible with the latest stable release of TensorFlow, tf-nightly, and tf-nightly-2.0-preview. All the functions are compatible with graph and eager execution.

Debugging

TensorFlow Graphics relies heavily on L2-normalized tensors and on the inputs to specific functions being in a pre-defined range. Checking for all of this takes cycles, and hence is not activated by default. It is recommended to turn these checks on for a couple of epochs of training to make sure that everything behaves as expected. This page provides the instructions to enable these checks.
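To illustrate the kind of check being described, here is a minimal pure-Python sketch of an opt-in normalization test. It is not the library's debugging mechanism: the function `check_normalized`, its `enabled` switch, and the tolerance `eps` are all hypothetical, chosen only to show a check that is skipped by default and can be turned on temporarily.

```python
import math


def l2_norm(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in vector))


def check_normalized(vector, enabled=False, eps=1e-6):
    """Return `vector` unchanged; when `enabled`, raise if it is not unit length."""
    if enabled and abs(l2_norm(vector) - 1.0) > eps:
        raise ValueError(
            f"expected an L2-normalized vector, got norm {l2_norm(vector):.6f}"
        )
    return vector


# With the check enabled, a unit-length vector passes through untouched.
axis = check_normalized([0.6, 0.8, 0.0], enabled=True)
```

Leaving `enabled` at its default skips the norm computation entirely, which mirrors the trade-off above: the check costs cycles, so it is worth paying only while validating a new training setup.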

Colab tutorials

To help you get started with some of the functionalities provided by TF Graphics, some Colab notebooks are available below, roughly ordered by difficulty. These Colabs touch upon a large range of topics including object pose estimation, interpolation, object materials, lighting, non-rigid surface deformation, spherical harmonics, and mesh convolutions.

NOTE: The tutorials are maintained carefully. However, they are not considered part of the API and can change at any time without warning. It is not advised to write code that takes a dependency on them.

Beginner

Object pose estimation
Camera intrinsics optimization

Intermediate

B-spline and slerp interpolation
Reflectance
Non-rigid surface deformation

Advanced

Spherical harmonics rendering
Environment map optimization
Semantic mesh segmentation

TensorBoard 3D

Visual debugging is a great way to assess whether an experiment is going in the right direction. To this end, TensorFlow Graphics comes with a TensorBoard plugin to interactively visualize 3D meshes and point clouds. This demo shows how to use the plugin. Follow these instructions to install and configure TensorBoard 3D. Note that TensorBoard 3D is currently not compatible with eager execution nor TensorFlow 2.

Coming next...

Among many things, we are hoping to release resamplers, additional 3D convolution and pooling operators, and a differentiable rasterizer!

Additional Information

You may use this software under the Apache 2.0 License.

Community

As part of TensorFlow, we're committed to fostering an open and welcoming environment.

Stack Overflow: Ask or answer technical questions.
GitHub: Report bugs or make feature requests.
TensorFlow Blog: Stay up to date on content from the TensorFlow team and best articles from the community.
YouTube Channel: Follow TensorFlow shows.

References

If you use TensorFlow Graphics in your research, please reference it as:

@inproceedings{TensorflowGraphicsIO2019,
   author = {Oztireli, Cengiz and Valentin, Julien and Keskin, Cem and Pidlypenskyi, Pavel and Makadia, Ameesh and Sud, Avneesh and Bouaziz, Sofien},
   title = {TensorFlow Graphics: Computer Graphics Meets Deep Learning},
   year = {2019}
}

Contact

Want to reach out? E-mail us at tf-graphics-contact@google.com!

Contributors - in alphabetical order

Sofien Bouaziz (sofien@google.com)
Jay Busch
Forrester Cole
Ambrus Csaszar
Boyang Deng
Ariel Gordon
Christian Häne
Cem Keskin
Ameesh Makadia
Cengiz Öztireli
Rohit Pandey
Pavel Pidlypenskyi
Stefan Popov
Konstantinos Rematas
Omar Sanseviero
Aviv Segal
Avneesh Sud
Andrea Tagliasacchi
Anastasia Tkach
Julien Valentin
He Wang
Yinda Zhang
