kubric
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Top Related Projects
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
NVIDIA's Deep Imagination Team's PyTorch Library
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
code for Mesh R-CNN, ICCV 2019
Quick Overview
Kubric is an open-source Python framework for creating synthetic datasets and rendering high-quality images and videos. It provides a flexible and efficient way to generate large-scale, diverse datasets for computer vision tasks, including object detection, tracking, and 3D reconstruction.
Pros
- Highly customizable and extensible for various computer vision tasks
- Supports both static image and video generation
- Integrates with popular 3D rendering engines like Blender
- Offers a wide range of built-in assets and scene generation capabilities
Cons
- Steep learning curve for users unfamiliar with 3D rendering concepts
- Limited documentation and examples for advanced use cases
- Requires significant computational resources for large-scale dataset generation
- Dependency on external rendering engines may introduce compatibility issues
Code Examples
- Creating a simple scene with a cube (rendering goes through the Blender backend; Scene itself has no render() method):
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
renderer = Blender(scene)
frames = renderer.render()  # dict of per-frame arrays: rgba, depth, segmentation, ...
kb.write_image_dict(frames, "output")
- Adding a light source to the scene:
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene += kb.PointLight(name="light", position=(5, 5, 5), color=kb.Color(1, 1, 1), intensity=1000)
scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
frames = Blender(scene).render()
- Generating a video with a moving cube (there is no kb.Keyframe class; positions are keyframed via keyframe_insert):
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256), frame_start=0, frame_end=60)
cube = kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene += cube

# set the position at three frames and record a keyframe each time
for frame, position in [(0, (0, 0, 0)), (30, (2, 2, 2)), (60, (0, 0, 0))]:
    cube.position = position
    cube.keyframe_insert("position", frame)

scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
frames = Blender(scene).render()
kb.write_image_dict(frames, "output")
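Between keyframes, the renderer interpolates the cube's position per frame. As a kubric-free illustration in plain Python (the helper name interpolate_keyframes is hypothetical, and linear interpolation is used for simplicity, whereas Blender's default is Bezier):

```python
def interpolate_keyframes(keyframes, frame):
    """Interpolate a keyframed 3-vector attribute at `frame`.

    `keyframes` is a sorted list of (frame, (x, y, z)) pairs, mirroring
    the position keys set on the cube in the example above.
    """
    # clamp outside the keyframed range
    if frame <= keyframes[0][0]:
        return keyframes[0][1]
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1]
    # find the surrounding pair of keys and blend linearly
    for (f0, p0), (f1, p1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)
            return tuple(a + t * (b - a) for a, b in zip(p0, p1))

keys = [(0, (0, 0, 0)), (30, (2, 2, 2)), (60, (0, 0, 0))]
print(interpolate_keyframes(keys, 15))  # → (1.0, 1.0, 1.0)
```

At frame 15 the cube is halfway between the keys at frames 0 and 30, and past frame 60 it stays clamped at the last key.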
Getting Started
To get started with Kubric, follow these steps:
- Install Kubric and its dependencies (the project itself recommends its Docker image, since the Blender backend can be tricky to set up):
pip install kubric
- Create a new Python script and import Kubric:
import kubric as kb
- Set up a basic scene and render it:
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
frames = Blender(scene).render()
- Explore the Kubric documentation and examples to learn more about advanced features and customization options.
Competitor Comparisons
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
Pros of PyTorch3D
- Seamless integration with PyTorch ecosystem
- Extensive documentation and tutorials
- Wider range of 3D vision tasks (rendering, reconstruction, etc.)
Cons of PyTorch3D
- Steeper learning curve for beginners
- Less focus on synthetic data generation
- Limited built-in support for complex scene composition
Code Comparison
PyTorch3D example (rendering a mesh):
from pytorch3d.renderer import MeshRenderer, MeshRasterizer, SoftPhongShader

renderer = MeshRenderer(
    rasterizer=MeshRasterizer(),
    shader=SoftPhongShader()
)
images = renderer(meshes, cameras=cameras, lights=lights)
Kubric example (creating and simulating a scene; there is no kb.simulate function, simulation goes through the PyBullet backend):
import kubric as kb
from kubric.simulator.pybullet import PyBullet

scene = kb.Scene(resolution=(512, 512), frame_end=20)
scene += kb.Cube(name="cube", scale=1, position=(0, 0, 0))
simulator = PyBullet(scene)
simulator.run()
PyTorch3D focuses on providing tools for 3D deep learning tasks, while Kubric specializes in synthetic data generation for computer vision. PyTorch3D offers more flexibility for various 3D vision tasks, but Kubric excels in creating complex, physically-based synthetic scenes for training data.
NVIDIA's Deep Imagination Team's PyTorch Library
Pros of imaginaire
- Focuses on image and video synthesis tasks, offering a wide range of models for various applications
- Provides pre-trained models and easy-to-use inference scripts for quick implementation
- Built on PyTorch, fitting naturally into existing PyTorch workflows
Cons of imaginaire
- Limited to 2D image and video tasks, lacking 3D rendering capabilities
- Requires more computational resources due to complex neural network architectures
- Less suitable for generating large-scale synthetic datasets compared to Kubric
Code Comparison
Kubric (Python):
import kubric as kb
scene = kb.Scene(resolution=(256, 256))
cube = kb.Cube(name="cube", scale=(1, 1, 1))
scene.add(cube)
imaginaire (Python):
from imaginaire.utils.distributed import init_dist
from imaginaire.trainers import BaseTrainer
trainer = BaseTrainer(cfg)
trainer.train()
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
Pros of ml-hypersim
- Focuses on photorealistic indoor scene generation
- Provides high-quality, physically-based rendered images
- Includes a diverse set of indoor environments and objects
Cons of ml-hypersim
- Limited to indoor scenes, less versatile than Kubric
- Requires more computational resources due to photorealistic rendering
- Steeper learning curve for users unfamiliar with 3D modeling software
Code Comparison
ml-hypersim (illustrative pseudocode; the repository is organized around dataset-generation scripts rather than this kind of scene API):
import hypersim as hs
scene = hs.Scene()
scene.load_from_file("indoor_scene.json")
renderer = hs.Renderer()
image = renderer.render(scene)
Kubric:
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube")
renderer = Blender(scene)
frames = renderer.render()
Both libraries provide scene creation and rendering capabilities, but ml-hypersim is more focused on photorealistic indoor scenes, while Kubric offers a more general-purpose approach to 3D scene generation and rendering.
code for Mesh R-CNN, ICCV 2019
Pros of Mesh R-CNN
- Focuses on 3D object reconstruction from single RGB images
- Provides end-to-end training for both instance segmentation and 3D shape prediction
- Includes a novel mesh refinement branch for improved 3D shape quality
Cons of Mesh R-CNN
- Limited to static image processing, unlike Kubric's video generation capabilities
- Requires pre-existing 3D shape priors, which may limit its applicability to novel objects
- More computationally intensive due to complex 3D reconstruction process
Code Comparison
Mesh R-CNN (PyTorch):
class MeshRCNN(nn.Module):
def __init__(self, cfg):
super(MeshRCNN, self).__init__()
self.backbone = build_backbone(cfg)
self.rpn = build_rpn(cfg, self.backbone.out_channels)
self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)
Kubric (Python):
import kubric as kb
from kubric.simulator.pybullet import PyBullet

scene = kb.Scene(resolution=(512, 512), frame_end=20)
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
PyBullet(scene).run()
The code snippets highlight the different focus areas of the two projects: Mesh R-CNN on 3D reconstruction from images, and Kubric on 3D scene generation and simulation.
README
Kubric
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Motivation and design
We need better data for training and evaluating machine learning systems, especially in the context of unsupervised multi-object video understanding. Current systems succeed on toy datasets, but fail on real-world data. Progress could be greatly accelerated if we had the ability to create suitable datasets of varying complexity on demand. Kubric is mainly built on top of pybullet (for physics simulation) and Blender (for rendering); however, the code is kept modular to potentially support different rendering backends.
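The modular-backend idea can be sketched in a few lines: scene construction depends only on an interface, not on Blender itself. This is an illustrative design sketch, not kubric code; the names RenderBackend, FakeBlender, and make_frames are hypothetical:

```python
from typing import Protocol


class RenderBackend(Protocol):
    """Anything that can turn a scene description into frames."""
    def render(self, scene: dict) -> list:
        ...


class FakeBlender:
    """Stand-in for a real rendering backend, for illustration only."""
    def render(self, scene: dict) -> list:
        # produce one placeholder "frame" per frame index
        return [f"frame {i}" for i in range(scene["frame_end"] + 1)]


def make_frames(scene: dict, backend: RenderBackend) -> list:
    # the pipeline only calls the interface, so backends are swappable
    return backend.render(scene)


scene = {"resolution": (256, 256), "frame_end": 2}
print(make_frames(scene, FakeBlender()))  # → ['frame 0', 'frame 1', 'frame 2']
```

Swapping in a different backend then requires no change to the pipeline code, which is the property the modular design aims for.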
Getting started
For instructions, please refer to https://kubric.readthedocs.io
Assuming you have Docker installed, you can run the hello-world example by executing:
git clone https://github.com/google-research/kubric.git
cd kubric
docker pull kubricdockerhub/kubruntu
docker run --rm --interactive \
--user $(id -u):$(id -g) \
--volume "$(pwd):/kubric" \
kubricdockerhub/kubruntu \
/usr/bin/python3 examples/helloworld.py
ls output
Kubric uses Blender 2.93, so if you want to open the generated *.blend scene file for interactive inspection (i.e. without needing to render the scene), please make sure you have the matching Blender version installed.
Requirements
- A pipeline for conveniently generating video data.
- Physics simulation for automatically generating physical interactions between multiple objects.
- Good control over the complexity of the generated data, so that we can evaluate individual aspects such as variability of objects and textures.
- Realism: ideally, the ability to span the entire complexity range from CLEVR all the way to real-world video such as YouTube8M. This is clearly not feasible, but we would like to get as close as possible.
- Access to rich ground-truth information about the objects in a scene for the purpose of evaluation (e.g. object segmentations and properties).
- Control over the train/test split to evaluate compositionality and systematic generalization (for example on held-out combinations of features or objects).
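The last requirement can be made concrete with a small sketch: enumerate all attribute combinations and reserve specific pairings for the test split. This is illustrative plain Python, not kubric API; the attribute lists and held-out pairs are made up:

```python
from itertools import product

shapes = ["cube", "sphere", "cylinder"]
colors = ["red", "green", "blue"]

# hold out specific shape/color pairings for the test split: every shape
# and every color still appears during training, just never in these
# combinations, so the test probes compositional generalization
held_out = {("cube", "blue"), ("sphere", "red")}

all_combos = set(product(shapes, colors))
train = sorted(all_combos - held_out)
test = sorted(held_out)

print(len(train), len(test))  # → 7 2
```

A generator built this way can emit training scenes only from `train` and evaluation scenes only from `test`, which is exactly the kind of split control the requirement asks for.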
Challenges and datasets
Generally, we store datasets for the challenges in this Google Cloud Bucket. More specifically, these challenges are dataset contributions of the Kubric CVPR'22 paper:
- MOVi: Multi-Object Video
- Texture-Structure in NeRF
- Optical Flow
- Pre-training Visual Representations
- Robust NeRF
- Multi-View Object Matting
- Complex BRDFs
- Single View Reconstruction
- Video Based Reconstruction
- Point Tracking
Pointers to additional datasets/workers:
- ToyBox (from Neural Semantic Fields)
- MultiShapeNet (from Scene Representation Transformer)
- SyntheticTrio (from Controllable Neural Radiance Fields)
Bibtex
@inproceedings{greff2021kubric,
title = {Kubric: a scalable dataset generator},
author = {Klaus Greff and Francois Belletti and Lucas Beyer and Carl Doersch and
Yilun Du and Daniel Duckworth and David J Fleet and Dan Gnanapragasam and
Florian Golemo and Charles Herrmann and Thomas Kipf and Abhijit Kundu and
Dmitry Lagun and Issam Laradji and Hsueh-Ti (Derek) Liu and Henning Meyer and
Yishu Miao and Derek Nowrouzezahrai and Cengiz Oztireli and Etienne Pot and
Noha Radwan and Daniel Rebain and Sara Sabour and Mehdi S. M. Sajjadi and Matan Sela and
Vincent Sitzmann and Austin Stone and Deqing Sun and Suhani Vora and Ziyu Wang and
Tianhao Wu and Kwang Moo Yi and Fangcheng Zhong and Andrea Tagliasacchi},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022},
}
Disclaimer
This is not an official Google Product