glow
Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
Top Related Projects
tensor2tensor: Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
DeepSpeed: A deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Quick Overview
Glow is OpenAI's official implementation of the paper "Glow: Generative Flow with Invertible 1x1 Convolutions". It provides TensorFlow code for training flow-based generative models that give exact log-likelihoods and an invertible mapping between images and latent variables, together with scripts for reproducing the paper's quantitative and qualitative results, pretrained models, and an interactive CelebA-HQ demo.
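As a rough illustration of the paper's central building block (not code from this repository), an invertible 1x1 convolution mixes channels with a square weight matrix, and the log-determinant of that matrix enters the exact log-likelihood:
import numpy as np
# Sketch of an invertible 1x1 convolution on an H x W x C activation.
# A 1x1 convolution with a square C x C matrix is a per-pixel matrix multiply
# along the channel axis; it contributes H * W * log|det(weight)| to the
# model's log-likelihood and is exactly invertible.
H, W, C = 8, 8, 4
x = np.random.randn(H, W, C)
weight = np.linalg.qr(np.random.randn(C, C))[0]  # random rotation, as in the paper's initialization
y = x @ weight.T                                 # forward pass
logdet = H * W * np.log(np.abs(np.linalg.det(weight)))
x_rec = y @ np.linalg.inv(weight).T              # inverse pass recovers the input exactly
assert np.allclose(x, x_rec)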
Pros
- Reproduces the paper's quantitative likelihood results and qualitative samples on CIFAR-10, ImageNet, LSUN, and CelebA-HQ
- Flow-based architecture gives exact log-likelihoods and an invertible mapping between images and latents
- Scales from a single GPU to multi-GPU, multi-node training via Horovod and (Open)MPI
- Ships a pretrained CelebA-HQ model and an interactive demo for latent manipulation
Cons
- Archived: the code is provided as-is and no updates are expected
- Built on TensorFlow 1.x and an old Horovod release, which complicates installation in modern environments
- Large-scale experiments need very large datasets (up to 700GB for LSUN) and many GPUs (40 for CelebA-HQ 256x256)
- Documentation is limited to the README and the flags of train.py
Code Examples
- Quick single-GPU test run with a small model:
CUDA_VISIBLE_DEVICES=0 python train.py --depth 1
- Default multi-GPU training with MPI and Horovod:
mpiexec -n 8 python train.py
- Reproducing the CIFAR-10 quantitative result:
mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8
Getting Started
To get started with Glow:
- Clone the repository and install the Python dependencies:
git clone https://github.com/openai/glow.git
cd glow
pip install -r requirements.txt
- Install Horovod and (Open)MPI for multi-GPU training; see the setup instructions on the Horovod GitHub page.
- Run a small single-GPU training job to check the setup:
CUDA_VISIBLE_DEVICES=0 python train.py --depth 1
- For the larger experiments, download the preprocessed datasets described in the README below and point the --data_dir flag at the extracted folder.
Competitor Comparisons
tensor2tensor: Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Pros of tensor2tensor
- Broader scope, covering a wide range of machine learning tasks and models
- More active development and larger community support
- Extensive documentation and tutorials for easier adoption
Cons of tensor2tensor
- Steeper learning curve due to its comprehensive nature
- Heavier, more general framework with more moving parts than Glow's small, single-purpose codebase
- May require more computational resources for some tasks
Code comparison
tensor2tensor:
from tensor2tensor import problems
from tensor2tensor.utils import trainer_lib
problem = problems.problem("image_mnist")
hparams = trainer_lib.create_hparams("transformer_base")
experiment = trainer_lib.create_experiment(...)
Glow (the official repo is TensorFlow-based and driven by a single training script rather than an importable package):
CUDA_VISIBLE_DEVICES=0 python train.py --depth 1
The code snippets demonstrate the different approaches:
- tensor2tensor uses a problem-model-trainer structure
- Glow exposes flow-based generative training through a TensorFlow train.py script and its flags
Both repositories offer powerful tools for machine learning tasks, but tensor2tensor provides a more comprehensive framework while Glow specializes in flow-based generative models.
PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Wider adoption and larger community support
- More flexible and dynamic computational graph
- Extensive ecosystem of tools and libraries
Cons of PyTorch
- Steeper learning curve for beginners
- Slightly slower execution compared to static graph frameworks
Code Comparison
PyTorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = x + y
print(z)
Glow (built on TensorFlow 1.x, so it uses the static-graph style shown here):
import tensorflow as tf  # TensorFlow 1.x
x = tf.placeholder(tf.float32, shape=[3])
y = tf.placeholder(tf.float32, shape=[3])
z = x + y
with tf.Session() as sess:
    print(sess.run(z, {x: [1, 2, 3], y: [4, 5, 6]}))
PyTorch offers a more Pythonic and intuitive approach, while Glow is built on TensorFlow 1.x and therefore works with a static computation graph. PyTorch's dynamic computation graph allows for easier debugging and more natural Python integration, whereas a static graph can provide optimization benefits for certain use cases.
JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Pros of JAX
- More flexible and general-purpose, supporting a wider range of machine learning tasks
- Better performance on GPUs and TPUs through XLA compilation
- Active development with frequent updates and growing ecosystem
Cons of JAX
- Steeper learning curve, especially for those familiar with NumPy/PyTorch
- Less focus on specific generative models compared to Glow
- Smaller community and fewer pre-trained models available
Code Comparison
Glow (illustrative sketch; the official openai/glow code is TensorFlow-based and is run through train.py rather than imported as a package):
# Hypothetical PyTorch-style flow API, shown only to illustrate the z / log-det pattern
import torch
from glow import Glow
model = Glow(in_channel=3, n_flow=32, n_block=3)
x = torch.randn(16, 3, 64, 64)
z, logdet = model(x)
JAX:
import jax
def model(params, x):
    # Define your model architecture here
    return x
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (16, 3, 64, 64))
output = jax.jit(model)(None, x)
Summary
JAX offers more flexibility and performance for general machine learning tasks, while Glow is specifically designed for generative models. JAX has a steeper learning curve but provides better optimization capabilities. Glow may be easier to use for specific generative tasks but is less versatile overall.
Horovod: Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
Pros of Horovod
- Designed specifically for distributed deep learning, offering better scalability across multiple GPUs and nodes
- Supports multiple deep learning frameworks (TensorFlow, PyTorch, MXNet, Keras)
- Easier to integrate into existing deep learning workflows
Cons of Horovod
- More focused on distributed training, less versatile for general machine learning tasks
- Requires additional setup and configuration for distributed environments
- May have a steeper learning curve for users not familiar with distributed computing concepts
Code Comparison
Glow example (Glow itself relies on Horovod and (Open)MPI for multi-GPU training):
mpiexec -n 8 python train.py
Horovod example (distributed TensorFlow training):
import tensorflow as tf
import horovod.tensorflow as hvd
hvd.init()
config = tf.ConfigProto()
config.gpu_options.visible_device_list = str(hvd.local_rank())
optimizer = hvd.DistributedOptimizer(tf.train.AdamOptimizer(0.001))  # gradients averaged across workers
fairseq: Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Pros of fairseq
- Broader focus on sequence modeling tasks, including machine translation, text summarization, and language modeling
- More extensive documentation and tutorials, making it easier for newcomers to get started
- Active community with frequent updates and contributions
Cons of fairseq
- Potentially steeper learning curve due to its broader scope and more complex architecture
- May require more computational resources for training and inference than Glow's smaller, single-purpose codebase
Code Comparison
fairseq example (PyTorch-based):
from fairseq.models.transformer import TransformerModel
model = TransformerModel.from_pretrained('/path/to/model', 'checkpoint.pt')
translated = model.translate('Hello world!')
Glow example (TensorFlow-based; an image model trained and sampled through train.py rather than a text API):
# Reproduce the CelebA-HQ 256x256 qualitative results; the demo/ folder then
# provides encoding of images to latents and attribute manipulation
mpiexec -n 40 python train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5
Summary
fairseq offers a more comprehensive toolkit for various sequence modeling tasks, with better documentation and community support, but it may be more complex and resource-intensive than Glow. Glow focuses specifically on flow-based generative models of images, offering a more streamlined path for those tasks. The code examples also highlight the different frameworks (PyTorch vs. TensorFlow) and domains: fairseq encodes and translates text, while Glow is trained on images and explored through its demo tools.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- Focuses on optimizing large-scale model training and inference
- Offers a wider range of optimization techniques, including ZeRO, pipeline parallelism, and 3D parallelism
- Actively maintained with frequent updates and extensive documentation
Cons of DeepSpeed
- Steeper learning curve due to its comprehensive feature set
- May introduce additional complexity for smaller projects or simpler models
- Requires more configuration and tuning to achieve optimal performance
Code Comparison
DeepSpeed:
import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
model=model,
model_parameters=params)
Glow (TensorFlow-based; distributed training is handled by Horovod and launched with mpiexec rather than configured in code):
mpiexec -n 8 python train.py
Summary
DeepSpeed offers more advanced optimization techniques for large-scale models, while Glow focuses on generative flow models. DeepSpeed is more actively maintained and provides extensive documentation, but it may be overkill for smaller projects. Glow is simpler to run but has a narrower scope and is archived, so it no longer receives updates.
README
Status: Archive (code is provided as-is, no updates expected)
Glow
Code for reproducing results in "Glow: Generative Flow with Invertible 1x1 Convolutions"
To use the pretrained CelebA-HQ model, make your own manipulation vectors, and run our interactive demo, check the demo folder.
Requirements
- Tensorflow (tested with v1.8.0)
- Horovod (tested with v0.13.8) and (Open)MPI
Run
pip install -r requirements.txt
To set up (Open)MPI, check the instructions on the Horovod GitHub page.
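As a quick check that the pinned stack is importable and that Horovod can initialize (a minimal sketch; run it under mpiexec to see multiple ranks):
import tensorflow as tf
import horovod.tensorflow as hvd
# Print the library version and this process's Horovod rank.
print("tensorflow", tf.__version__)  # tested with 1.8.0, per the requirements above
hvd.init()
print("horovod rank", hvd.rank(), "of", hvd.size())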
Download datasets
For small-scale experiments, use MNIST/CIFAR-10 (downloaded directly by train.py using Keras).
For larger-scale experiments, the datasets used are hosted at https://openaipublic.azureedge.net/glow-demo/data/{dataset_name}-tfr.tar. The dataset names are listed below; we note the exact preprocessing/downsampling method for each so that likelihoods can be compared correctly.
Quantitative results
- imagenet-oord: 20GB. Unconditional ImageNet 32x32 and 64x64, as described in PixelRNN/RealNVP papers (we downloaded this processed version).
- lsun_realnvp: 140GB. LSUN 96x96. Random 64x64 crops taken at processing time, as described in RealNVP.
Qualitative results
- celeba: 4GB. CelebA-HQ 256x256 dataset, as described in Progressive Growing of GANs. For the 1024x1024 version (120GB), use celeba-full-tfr.tar when downloading.
- imagenet: 20GB. ImageNet 32x32 and 64x64 with class labels. Centre cropped, area downsampled.
- lsun: 700GB. LSUN 256x256. Centre cropped, area downsampled.
To download and extract celeba, for example, run
wget https://openaipublic.azureedge.net/glow-demo/data/celeba-tfr.tar
tar -xvf celeba-tfr.tar
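To sanity-check a downloaded archive before training, you can count the records in the extracted shards. This is only a sketch: the glob pattern below is a guess, and the actual parsing code lives in data_loaders/get_data.py.
import glob
import tensorflow as tf  # TensorFlow 1.x, as in the requirements
# Count records across the extracted shards (adjust the pattern to match the
# layout of the extracted folder).
paths = glob.glob("celeba-tfr/*/*.tfrecords")
total = sum(1 for p in paths for _ in tf.python_io.tf_record_iterator(p))
print(len(paths), "shards,", total, "records")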
Change hps.data_dir in train.py to point to the extracted folder (or use the --data_dir flag when you run train.py).
For lsun, since the download can be quite big, you can instead follow the instructions in data_loaders/generate_tfr/lsun.py to generate the tfr file directly from LSUN images. church_outdoor will be the smallest category.
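If you generate your own tfr files, the general TFRecord pattern looks like the sketch below. Note that the feature names here are placeholders; data_loaders/generate_tfr/lsun.py defines the exact format that train.py expects.
import numpy as np
import tensorflow as tf  # TensorFlow 1.x
# Generic sketch of writing images into a TFRecord shard (placeholder schema).
def write_shard(images, path):
    with tf.python_io.TFRecordWriter(path) as writer:
        for img in images:  # img: uint8 array of shape (H, W, 3)
            feats = {
                "shape": tf.train.Feature(int64_list=tf.train.Int64List(value=img.shape)),
                "data": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img.tobytes()])),
            }
            ex = tf.train.Example(features=tf.train.Features(feature=feats))
            writer.write(ex.SerializeToString())
write_shard([np.zeros((64, 64, 3), np.uint8)], "example.tfrecords")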
Simple Train with 1 GPU
Run with a small depth to test:
CUDA_VISIBLE_DEVICES=0 python train.py --depth 1
Train with multiple GPUs using MPI and Horovod
Run the default training script with 8 GPUs:
mpiexec -n 8 python train.py
Ablation experiments
mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation [0/1/2] --flow_coupling [0/1] --seed [0/1/2] --learntop --lr 0.001
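The bracketed values indicate one choice per run. To launch the whole grid, a small helper like the sketch below (flags taken from the command above) prints one command per combination:
import itertools
base = ("mpiexec -n 8 python train.py --problem cifar10 --image_size 32 "
        "--n_level 3 --depth 32 --learntop --lr 0.001")
# One run per (flow_permutation, flow_coupling, seed) combination.
for perm, coupling, seed in itertools.product([0, 1, 2], [0, 1], [0, 1, 2]):
    print(f"{base} --flow_permutation {perm} --flow_coupling {coupling} --seed {seed}")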
Pretrained models, logs and samples
wget https://openaipublic.azureedge.net/glow-demo/logs/abl-[reverse/shuffle/1x1]-[add/aff].tar
CIFAR-10 Quantitative result
mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8
ImageNet 32x32 Quantitative result
mpiexec -n 8 python train.py --problem imagenet-oord --image_size 32 --n_level 3 --depth 48 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8
ImageNet 64x64 Quantitative result
mpiexec -n 8 python train.py --problem imagenet-oord --image_size 64 --n_level 4 --depth 48 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8
LSUN 64x64 Quantitative result
mpiexec -n 8 python train.py --problem lsun_realnvp --category [bedroom/church_outdoor/tower] --image_size 64 --n_level 3 --depth 48 --flow_permutation 2 --flow_coupling 1 --seed 0 --learntop --lr 0.001 --n_bits_x 8
Pretrained models, logs and samples
wget https://openaipublic.azureedge.net/glow-demo/logs/lsun-rnvp-[bdr/crh/twr].tar
CelebA-HQ 256x256 Qualitative result
mpiexec -n 40 python train.py --problem celeba --image_size 256 --n_level 6 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5
LSUN 96x96 and 128x128 Qualitative result
mpiexec -n 40 python train.py --problem lsun --category [bedroom/church_outdoor/tower] --image_size [96/128] --n_level 5 --depth 64 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5
Logs and samples
wget https://openaipublic.azureedge.net/glow-demo/logs/lsun-bdr-[96/128].tar
Conditional CIFAR-10 Qualitative result
mpiexec -n 8 python train.py --problem cifar10 --image_size 32 --n_level 3 --depth 32 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --ycond --weight_y=0.01
Conditional ImageNet 32x32 Qualitative result
mpiexec -n 8 python train.py --problem imagenet --image_size 32 --n_level 3 --depth 48 --flow_permutation 2 --flow_coupling 0 --seed 0 --learntop --lr 0.001 --n_bits_x 5 --ycond --weight_y=0.01