kubric
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Top Related Projects
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
NVIDIA's Deep Imagination Team's PyTorch Library
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
code for Mesh R-CNN, ICCV 2019
Quick Overview
Kubric is an open-source Python framework for creating synthetic datasets and rendering high-quality images and videos. It provides a flexible and efficient way to generate large-scale, diverse datasets for computer vision tasks, including object detection, tracking, and 3D reconstruction.
Pros
- Highly customizable and extensible for various computer vision tasks
- Supports both static image and video generation
- Integrates with popular 3D rendering engines like Blender
- Offers a wide range of built-in assets and scene generation capabilities
Cons
- Steep learning curve for users unfamiliar with 3D rendering concepts
- Limited documentation and examples for advanced use cases
- Requires significant computational resources for large-scale dataset generation
- Dependency on external rendering engines may introduce compatibility issues
Code Examples
- Creating a simple scene with a cube (rendering goes through the Blender backend; Scene itself has no render() method):
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
renderer = Blender(scene)
frames = renderer.render()  # dict of per-frame arrays: rgba, depth, segmentation, ...
kb.write_image_dict(frames, "output")
- Adding a light source to the scene:
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene += kb.PointLight(name="light", position=(5, 5, 5), color=kb.Color(1, 1, 1), intensity=1000)
scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
frames = Blender(scene).render()
- Generating a video with a moving cube (there is no kb.Keyframe class; positions are keyframed via keyframe_insert):
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256), frame_start=0, frame_end=60)
cube = kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene += cube

# set the position at three frames and record a keyframe each time
for frame, position in [(0, (0, 0, 0)), (30, (2, 2, 2)), (60, (0, 0, 0))]:
    cube.position = position
    cube.keyframe_insert("position", frame)

scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
frames = Blender(scene).render()
kb.write_image_dict(frames, "output")
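Between keyframes, the renderer interpolates the cube's position per frame. As a kubric-free illustration in plain Python (the helper name interpolate_keyframes is hypothetical, and linear interpolation is used for simplicity, whereas Blender's default is Bezier):

```python
def interpolate_keyframes(keyframes, frame):
    """Interpolate a keyframed 3-vector attribute at `frame`.

    `keyframes` is a sorted list of (frame, (x, y, z)) pairs, mirroring
    the position keys set on the cube in the example above.
    """
    # clamp outside the keyframed range
    if frame <= keyframes[0][0]:
        return keyframes[0][1]
    if frame >= keyframes[-1][0]:
        return keyframes[-1][1]
    # find the surrounding pair of keys and blend linearly
    for (f0, p0), (f1, p1) in zip(keyframes, keyframes[1:]):
        if f0 <= frame <= f1:
            t = (frame - f0) / (f1 - f0)
            return tuple(a + t * (b - a) for a, b in zip(p0, p1))

keys = [(0, (0, 0, 0)), (30, (2, 2, 2)), (60, (0, 0, 0))]
print(interpolate_keyframes(keys, 15))  # → (1.0, 1.0, 1.0)
```

At frame 15 the cube is halfway between the keys at frames 0 and 30, and past frame 60 it stays clamped at the last key.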
Getting Started
To get started with Kubric, follow these steps:
- Install Kubric and its dependencies (the project itself recommends its Docker image, since the Blender backend can be tricky to set up):
pip install kubric
- Create a new Python script and import Kubric:
import kubric as kb
- Set up a basic scene and render it:
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
scene.camera = kb.PerspectiveCamera(name="camera", position=(5, 5, 5), look_at=(0, 0, 0))
frames = Blender(scene).render()
- Explore the Kubric documentation and examples to learn more about advanced features and customization options.
Competitor Comparisons
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data
Pros of PyTorch3D
- Seamless integration with PyTorch ecosystem
- Extensive documentation and tutorials
- Wider range of 3D vision tasks (rendering, reconstruction, etc.)
Cons of PyTorch3D
- Steeper learning curve for beginners
- Less focus on synthetic data generation
- Limited built-in support for complex scene composition
Code Comparison
PyTorch3D example (rendering a mesh):
from pytorch3d.renderer import MeshRenderer, MeshRasterizer, SoftPhongShader

renderer = MeshRenderer(
    rasterizer=MeshRasterizer(),
    shader=SoftPhongShader()
)
images = renderer(meshes, cameras=cameras, lights=lights)
Kubric example (creating and simulating a scene; there is no kb.simulate function, simulation goes through the PyBullet backend):
import kubric as kb
from kubric.simulator.pybullet import PyBullet

scene = kb.Scene(resolution=(512, 512), frame_end=20)
scene += kb.Cube(name="cube", scale=1, position=(0, 0, 0))
simulator = PyBullet(scene)
simulator.run()
PyTorch3D focuses on providing tools for 3D deep learning tasks, while Kubric specializes in synthetic data generation for computer vision. PyTorch3D offers more flexibility for various 3D vision tasks, but Kubric excels in creating complex, physically-based synthetic scenes for training data.
NVIDIA's Deep Imagination Team's PyTorch Library
Pros of imaginaire
- Focuses on image and video synthesis tasks, offering a wide range of models for various applications
- Provides pre-trained models and easy-to-use inference scripts for quick implementation
- Built on PyTorch, fitting naturally into existing PyTorch workflows
Cons of imaginaire
- Limited to 2D image and video tasks, lacking 3D rendering capabilities
- Requires more computational resources due to complex neural network architectures
- Less suitable for generating large-scale synthetic datasets compared to Kubric
Code Comparison
Kubric (Python):
import kubric as kb
scene = kb.Scene(resolution=(256, 256))
cube = kb.Cube(name="cube", scale=(1, 1, 1))
scene.add(cube)
imaginaire (Python):
from imaginaire.utils.distributed import init_dist
from imaginaire.trainers import BaseTrainer
trainer = BaseTrainer(cfg)
trainer.train()
Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding
Pros of ml-hypersim
- Focuses on photorealistic indoor scene generation
- Provides high-quality, physically-based rendered images
- Includes a diverse set of indoor environments and objects
Cons of ml-hypersim
- Limited to indoor scenes, less versatile than Kubric
- Requires more computational resources due to photorealistic rendering
- Steeper learning curve for users unfamiliar with 3D modeling software
Code Comparison
ml-hypersim (illustrative pseudocode; the repository is organized around dataset-generation scripts rather than this kind of scene API):
import hypersim as hs
scene = hs.Scene()
scene.load_from_file("indoor_scene.json")
renderer = hs.Renderer()
image = renderer.render(scene)
Kubric:
import kubric as kb
from kubric.renderer.blender import Blender

scene = kb.Scene(resolution=(256, 256))
scene += kb.Cube(name="cube")
renderer = Blender(scene)
frames = renderer.render()
Both libraries provide scene creation and rendering capabilities, but ml-hypersim is more focused on photorealistic indoor scenes, while Kubric offers a more general-purpose approach to 3D scene generation and rendering.
code for Mesh R-CNN, ICCV 2019
Pros of Mesh R-CNN
- Focuses on 3D object reconstruction from single RGB images
- Provides end-to-end training for both instance segmentation and 3D shape prediction
- Includes a novel mesh refinement branch for improved 3D shape quality
Cons of Mesh R-CNN
- Limited to static image processing, unlike Kubric's video generation capabilities
- Requires pre-existing 3D shape priors, which may limit its applicability to novel objects
- More computationally intensive due to complex 3D reconstruction process
Code Comparison
Mesh R-CNN (PyTorch):
class MeshRCNN(nn.Module):
def __init__(self, cfg):
super(MeshRCNN, self).__init__()
self.backbone = build_backbone(cfg)
self.rpn = build_rpn(cfg, self.backbone.out_channels)
self.roi_heads = build_roi_heads(cfg, self.backbone.out_channels)
Kubric (Python):
import kubric as kb
from kubric.simulator.pybullet import PyBullet

scene = kb.Scene(resolution=(512, 512), frame_end=20)
scene += kb.Cube(name="cube", scale=(1, 1, 1), position=(0, 0, 0))
PyBullet(scene).run()
The code snippets highlight the different focus areas of the two projects: Mesh R-CNN on 3D reconstruction from images, and Kubric on 3D scene generation and simulation.
README
Kubric
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Motivation and design
We need better data for training and evaluating machine learning systems, especially in the context of unsupervised multi-object video understanding. Current systems succeed on toy datasets, but fail on real-world data. Progress could be greatly accelerated if we had the ability to create suitable datasets of varying complexity on demand. Kubric is mainly built on top of pybullet (for physics simulation) and Blender (for rendering); however, the code is kept modular to potentially support different rendering backends.
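The modular-backend idea can be sketched in a few lines: scene construction depends only on an interface, not on Blender itself. This is an illustrative design sketch, not kubric code; the names RenderBackend, FakeBlender, and make_frames are hypothetical:

```python
from typing import Protocol


class RenderBackend(Protocol):
    """Anything that can turn a scene description into frames."""
    def render(self, scene: dict) -> list:
        ...


class FakeBlender:
    """Stand-in for a real rendering backend, for illustration only."""
    def render(self, scene: dict) -> list:
        # produce one placeholder "frame" per frame index
        return [f"frame {i}" for i in range(scene["frame_end"] + 1)]


def make_frames(scene: dict, backend: RenderBackend) -> list:
    # the pipeline only calls the interface, so backends are swappable
    return backend.render(scene)


scene = {"resolution": (256, 256), "frame_end": 2}
print(make_frames(scene, FakeBlender()))  # → ['frame 0', 'frame 1', 'frame 2']
```

Swapping in a different backend then requires no change to the pipeline code, which is the property the modular design aims for.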
Getting started
For instructions, please refer to https://kubric.readthedocs.io
Assuming you have Docker installed, you can run the hello-world example by executing:
git clone https://github.com/google-research/kubric.git
cd kubric
docker pull kubricdockerhub/kubruntu
docker run --rm --interactive \
--user $(id -u):$(id -g) \
--volume "$(pwd):/kubric" \
kubricdockerhub/kubruntu \
/usr/bin/python3 examples/helloworld.py
ls output
Kubric uses Blender 2.93, so if you want to open the generated *.blend scene file for interactive inspection (i.e. without needing to render the scene), please make sure you have the matching Blender version installed.
Requirements
- A pipeline for conveniently generating video data.
- Physics simulation for automatically generating physical interactions between multiple objects.
- Good control over the complexity of the generated data, so that we can evaluate individual aspects such as variability of objects and textures.
- Realism: ideally, the ability to span the entire complexity range from CLEVR all the way to real-world video such as YouTube8M. This is clearly not feasible, but we would like to get as close as possible.
- Access to rich ground-truth information about the objects in a scene for the purpose of evaluation (e.g. object segmentations and properties).
- Control over the train/test split to evaluate compositionality and systematic generalization (for example on held-out combinations of features or objects).
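The last requirement can be made concrete with a small sketch: enumerate all attribute combinations and reserve specific pairings for the test split. This is illustrative plain Python, not kubric API; the attribute lists and held-out pairs are made up:

```python
from itertools import product

shapes = ["cube", "sphere", "cylinder"]
colors = ["red", "green", "blue"]

# hold out specific shape/color pairings for the test split: every shape
# and every color still appears during training, just never in these
# combinations, so the test probes compositional generalization
held_out = {("cube", "blue"), ("sphere", "red")}

all_combos = set(product(shapes, colors))
train = sorted(all_combos - held_out)
test = sorted(held_out)

print(len(train), len(test))  # → 7 2
```

A generator built this way can emit training scenes only from `train` and evaluation scenes only from `test`, which is exactly the kind of split control the requirement asks for.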
Challenges and datasets
Generally, we store datasets for the challenges in this Google Cloud Bucket. More specifically, these challenges are dataset contributions of the Kubric CVPR'22 paper:
- MOVi: Multi-Object Video
- Texture-Structure in NeRF
- Optical Flow
- Pre-training Visual Representations
- Robust NeRF
- Multi-View Object Matting
- Complex BRDFs
- Single View Reconstruction
- Video Based Reconstruction
- Point Tracking
Pointers to additional datasets/workers:
- ToyBox (from Neural Semantic Fields)
- MultiShapeNet (from Scene Representation Transformer)
- SyntheticTrio (from Controllable Neural Radiance Fields)
Bibtex
@inproceedings{greff2021kubric,
title = {Kubric: a scalable dataset generator},
author = {Klaus Greff and Francois Belletti and Lucas Beyer and Carl Doersch and
Yilun Du and Daniel Duckworth and David J Fleet and Dan Gnanapragasam and
Florian Golemo and Charles Herrmann and Thomas Kipf and Abhijit Kundu and
Dmitry Lagun and Issam Laradji and Hsueh-Ti (Derek) Liu and Henning Meyer and
Yishu Miao and Derek Nowrouzezahrai and Cengiz Oztireli and Etienne Pot and
Noha Radwan and Daniel Rebain and Sara Sabour and Mehdi S. M. Sajjadi and Matan Sela and
Vincent Sitzmann and Austin Stone and Deqing Sun and Suhani Vora and Ziyu Wang and
Tianhao Wu and Kwang Moo Yi and Fangcheng Zhong and Andrea Tagliasacchi},
booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2022},
}
Disclaimer
This is not an official Google Product