hojonathanho/diffusion

Denoising Diffusion Probabilistic Models

Top Related Projects

A latent text-to-image diffusion model

Denoising Diffusion Implicit Models

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Karras et al. (2022) diffusion models for PyTorch

Quick Overview

The hojonathanho/diffusion repository is the authors' TensorFlow implementation of denoising diffusion probabilistic models (DDPMs) for image generation, released alongside the paper by Ho, Jain, and Abbeel. It provides scripts for training and sampling from diffusion models, which generate high-quality images by learning to reverse a gradual noising process.
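
To make the noising process concrete, here is a minimal, framework-free sketch of the forward process from the DDPM paper, assuming the linear schedule with beta running from 1e-4 to 0.02 over 1000 steps. The function names `linear_betas`, `cumulative_alphas`, and `q_sample` are illustrative and are not taken from the repository.

```python
import math
import random

def linear_betas(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out

def q_sample(x0, t, alpha_bars, rng=random):
    """Draw x_t from N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [signal * x + sigma * n for x, n in zip(x0, noise)]
    return xt, noise

alpha_bars = cumulative_alphas(linear_betas())
x0 = [0.5, -0.25, 1.0]  # a toy "image" with three pixels in [-1, 1]
xt, noise = q_sample(x0, t=999, alpha_bars=alpha_bars)
```

At the last timestep almost none of the original signal remains, so `xt` is close to pure Gaussian noise. Training teaches a network to predict `noise` from `xt` and `t`.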

Pros

  • Implements state-of-the-art diffusion models for image generation
  • Provides a flexible and extensible codebase for experimenting with diffusion models
  • Links to pre-trained models and samples, and includes training and evaluation scripts
  • Supports various image datasets and model architectures

Cons

  • Limited documentation and explanations of the underlying concepts
  • Training is resource-intensive; the authors ran their experiments on Google Cloud TPU v3-8
  • Lacks extensive hyperparameter tuning guidelines
  • Not actively maintained (last commit was over a year ago at the time of writing)

Code Examples

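  1. Loading a pre-trained model and generating samples (a PyTorch-style sketch; the repository's own code is TensorFlow, and the `models` and `diffusion` modules used here are illustrative):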
import torch
from models import UNet
from diffusion import GaussianDiffusion

model = UNet(
    dim=64,
    dim_mults=(1, 2, 4, 8)
)
diffusion = GaussianDiffusion(
    model,
    image_size=32,
    timesteps=1000,
    loss_type='l1'
)

ckpt = torch.load('path/to/checkpoint.pt')
diffusion.load_state_dict(ckpt['model'])

samples = diffusion.sample(batch_size=16)
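  2. Training a new diffusion model:
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Scale CIFAR-10 images to [-1, 1]
transform = transforms.Compose([
    transforms.Resize(32),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

dataset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)

# `diffusion` is the GaussianDiffusion instance built in the previous example;
# the learning rate is an assumed value
optimizer = torch.optim.Adam(diffusion.parameters(), lr=2e-4)

for epoch in range(100):
    for images, _ in dataloader:  # CIFAR10 yields (image, label); labels are unused
        loss = diffusion(images)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()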
  3. Customizing the noise schedule:
from diffusion import GaussianDiffusion, cosine_beta_schedule

custom_betas = cosine_beta_schedule(timesteps=1000)
diffusion = GaussianDiffusion(
    model,
    image_size=32,
    timesteps=1000,
    loss_type='l1',
    betas=custom_betas
)
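
`cosine_beta_schedule` is imported but not defined in the snippet above. A minimal sketch, assuming it follows the cosine schedule of Nichol and Dhariwal (2021) with offset s = 0.008 and betas clipped at 0.999, could look like this. It returns a plain Python list; a real implementation would return a tensor in whatever framework the model uses.

```python
import math

def cosine_beta_schedule(timesteps, s=0.008, max_beta=0.999):
    """Betas derived from alpha_bar(t) = cos^2(((t/T + s) / (1 + s)) * pi/2)."""
    def alpha_bar(step):
        return math.cos((step / timesteps + s) / (1 + s) * math.pi / 2) ** 2

    betas = []
    for i in range(timesteps):
        # beta_t = 1 - alpha_bar(t) / alpha_bar(t - 1), clipped so the final
        # steps do not reach a variance of exactly 1
        beta = 1 - alpha_bar(i + 1) / alpha_bar(i)
        betas.append(min(beta, max_beta))
    return betas

custom_betas = cosine_beta_schedule(timesteps=1000)
```

Compared with a linear schedule, the cosine schedule adds noise more slowly at the start, so intermediate timesteps keep more of the image.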

Getting Started

  1. Clone the repository:

    git clone https://github.com/hojonathanho/diffusion.git
    cd diffusion
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Train a model or use a pre-trained one:

    from models import UNet
    from diffusion import GaussianDiffusion
    
    model = UNet(dim=64, dim_mults=(1, 2, 4, 8))
    diffusion = GaussianDiffusion(model, image_size=32, timesteps=1000)
    
    # Train the model or load a pre-trained checkpoint
    # Generate samples
    samples = diffusion.sample(batch_size=16)
    

Competitor Comparisons

A latent text-to-image diffusion model

Pros of stable-diffusion

  • More advanced and widely adopted in the AI image generation community
  • Offers pre-trained models and easier integration with various applications
  • Extensive documentation and community support

Cons of stable-diffusion

  • Larger model size and higher computational requirements
  • More complex architecture, potentially harder for beginners to understand and modify

Code Comparison

diffusion:

def p_sample(self, x, t, clip_denoised=True):
    out = self.p_mean_variance(x, t, clip_denoised=clip_denoised)
    noise = torch.randn_like(x)
    nonzero_mask = (t != 0).float().view(-1, *([1] * (len(x.shape) - 1)))
    return out["mean"] + nonzero_mask * torch.exp(0.5 * out["log_variance"]) * noise

stable-diffusion:

@torch.no_grad()
def p_sample_plms(self, x, c, t, index, repeat_noise=False, use_original_steps=False, quantize_denoised=False,
                  temperature=1., noise_dropout=0., score_corrector=None, corrector_kwargs=None,
                  unconditional_guidance_scale=1., unconditional_conditioning=None, old_eps=None, t_next=None):
    b, *_, device = *x.shape, x.device
    e_t = self.model.apply_model(x, t, c)
    if unconditional_conditioning is None or unconditional_guidance_scale == 1.:
        e_t = e_t
    else:
        e_t = e_t * unconditional_guidance_scale + (1. - unconditional_guidance_scale) * self.model.apply_model(x, t, unconditional_conditioning)
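
The last line of the stable-diffusion excerpt is the classifier-free guidance combination. A small, framework-free sketch (the function names and toy numbers are illustrative) shows that it is the same as extrapolating from the unconditional prediction toward the conditional one:

```python
def guided_eps(eps_cond, eps_uncond, scale):
    """Combine predictions as in the excerpt: scale * cond + (1 - scale) * uncond."""
    return [scale * c + (1.0 - scale) * u for c, u in zip(eps_cond, eps_uncond)]

def guided_eps_extrapolated(eps_cond, eps_uncond, scale):
    """Equivalent form: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for c, u in zip(eps_cond, eps_uncond)]

eps_cond = [0.2, -0.4, 1.0]    # toy conditional noise prediction
eps_uncond = [0.0, -0.1, 0.5]  # toy unconditional noise prediction
a = guided_eps(eps_cond, eps_uncond, scale=7.5)
b = guided_eps_extrapolated(eps_cond, eps_uncond, scale=7.5)
```

With a scale of 1 the result equals the conditional prediction, which is why the excerpt skips the second model call in that case.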

Pros of guided-diffusion

  • More comprehensive documentation and examples
  • Supports a wider range of diffusion models and techniques
  • Actively maintained with regular updates and contributions

Cons of guided-diffusion

  • Higher computational requirements due to more complex models
  • Steeper learning curve for beginners
  • Less focus on specific image generation tasks

Code Comparison

guided-diffusion:

def create_model(
    image_size,
    num_channels,
    num_res_blocks,
    channel_mult="",
    learn_sigma=False,
    class_cond=False,
    use_checkpoint=False,
    attention_resolutions="16",
    num_heads=1,
    num_head_channels=-1,
    num_heads_upsample=-1,
    use_scale_shift_norm=False,
    dropout=0,
    resblock_updown=False,
    use_fp16=False,
    use_new_attention_order=False,
):
    # ... (implementation details)

diffusion:

def create_model(
    image_size,
    num_channels,
    num_res_blocks,
    channel_mult="",
    learn_sigma=False,
    class_cond=False,
    use_checkpoint=False,
    attention_resolutions="16",
    num_heads=1,
    num_head_channels=-1,
    num_heads_upsample=-1,
    use_scale_shift_norm=False,
    dropout=0,
):
    # ... (implementation details)

The code comparison shows that guided-diffusion's create_model exposes three options absent from the diffusion snippet (resblock_updown, use_fp16, and use_new_attention_order); the remaining parameters are identical.

Denoising Diffusion Implicit Models

Pros of ddim

  • Implements Denoising Diffusion Implicit Models (DDIM), offering faster sampling
  • Provides a more comprehensive set of pre-trained models
  • Includes additional utilities for image manipulation and processing

Cons of ddim

  • Less active development and maintenance compared to diffusion
  • Fewer options for customizing the diffusion process
  • Limited documentation on advanced usage and model fine-tuning

Code Comparison

diffusion:

def p_sample_loop(model, shape, noise=None, clip_denoised=True, denoised_fn=None,
                  model_kwargs=None, device=None, progress=False):
    final = None
    for sample in p_sample_loop_progressive(
        model,
        shape,
        noise=noise,
        clip_denoised=clip_denoised,
        denoised_fn=denoised_fn,
        model_kwargs=model_kwargs,
        device=device,
        progress=progress,
    ):
        final = sample
    return final["sample"]

ddim:

def p_sample_ddim(model, x, t, clip_denoised=True, denoised_fn=None, model_kwargs=None):
    out = p_mean_variance(
        model,
        x,
        t,
        clip_denoised=clip_denoised,
        denoised_fn=denoised_fn,
        model_kwargs=model_kwargs,
    )
    noise = th.randn_like(x)
    nonzero_mask = (t != 0).float().view(-1, *([1] * (len(x.shape) - 1)))
    sample = out["mean"] + nonzero_mask * th.exp(0.5 * out["log_variance"]) * noise
    return {"sample": sample, "pred_xstart": out["pred_xstart"]}
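
The excerpt labelled ddim above draws fresh Gaussian noise at every step, which is the ancestral (DDPM-style) update. For contrast, here is a minimal sketch of the deterministic DDIM update (eta = 0) described by Song et al., written in plain Python on lists. The function name `ddim_step` and its arguments are illustrative and are not taken from either repository.

```python
import math

def ddim_step(x_t, eps, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t            : current noisy sample (list of floats)
    eps            : the model's noise prediction for x_t
    alpha_bar_t    : cumulative product of (1 - beta) at the current step
    alpha_bar_prev : the same quantity at the step being jumped to
    """
    # Estimate the clean sample implied by the noise prediction
    pred_x0 = [(x - math.sqrt(1 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
               for x, e in zip(x_t, eps)]
    # Re-noise that estimate to the earlier noise level, reusing eps (no new randomness)
    x_prev = [math.sqrt(alpha_bar_prev) * x0 + math.sqrt(1 - alpha_bar_prev) * e
              for x0, e in zip(pred_x0, eps)]
    return x_prev, pred_x0

# Sanity check with a perfect noise prediction: x_t is built from known x0 and eps
x0, eps = [0.5, -1.0], [0.3, -0.7]
abar_t, abar_prev = 0.25, 1.0
x_t = [math.sqrt(abar_t) * a + math.sqrt(1 - abar_t) * e for a, e in zip(x0, eps)]
x_prev, pred_x0 = ddim_step(x_t, eps, abar_t, abar_prev)
```

Because no new noise is injected, the update can jump across many timesteps at once, which is where DDIM's faster sampling comes from.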

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Pros of denoising-diffusion-pytorch

  • More comprehensive implementation with additional features like conditional generation and image inpainting
  • Better documentation and code organization, making it easier for users to understand and modify
  • Active development and maintenance, with regular updates and improvements

Cons of denoising-diffusion-pytorch

  • Higher complexity, which may be overwhelming for beginners or those seeking a simpler implementation
  • Potentially slower execution due to additional features and abstractions

Code Comparison

diffusion:

def p_sample(self, x, t, clip_denoised=True):
    t_batch = torch.full((x.shape[0],), t, device=x.device, dtype=torch.long)
    noise_level = extract(self.sqrt_alphas_cumprod, t_batch, x.shape)
    x_recon = self.predict_start_from_noise(x, t=t_batch, noise=self.denoise_fn(x, t_batch))
    if clip_denoised:
        x_recon.clamp_(-1., 1.)

denoising-diffusion-pytorch:

@torch.no_grad()
def p_sample(self, x, t, t_index):
    betas_t = extract(self.betas, t, x.shape)
    sqrt_one_minus_alphas_cumprod_t = extract(self.sqrt_one_minus_alphas_cumprod, t, x.shape)
    sqrt_recip_alphas_t = extract(self.sqrt_recip_alphas, t, x.shape)
    model_mean = sqrt_recip_alphas_t * (x - betas_t * self.model(x, t) / sqrt_one_minus_alphas_cumprod_t)
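
In the notation of the DDPM paper, the `model_mean` line above is the mean of the reverse-process step. This reading assumes that `sqrt_recip_alphas` holds one over the square root of alpha_t, as its name suggests, and that `self.model(x, t)` returns the predicted noise:

```latex
\mu_\theta(x_t, t) = \frac{1}{\sqrt{\alpha_t}}
  \left( x_t - \frac{\beta_t}{\sqrt{1 - \bar{\alpha}_t}} \, \epsilon_\theta(x_t, t) \right),
\qquad \alpha_t = 1 - \beta_t, \quad \bar{\alpha}_t = \prod_{s=1}^{t} \alpha_s
```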

Karras et al. (2022) diffusion models for PyTorch

Pros of k-diffusion

  • More flexible and customizable implementation
  • Supports a wider range of sampling methods
  • Better documentation and examples for usage

Cons of k-diffusion

  • Less optimized for performance in some cases
  • May require more setup and configuration
  • Potentially steeper learning curve for beginners

Code Comparison

k-diffusion:

model = diffusion.get_model(config)
x = torch.randn(1, 3, 64, 64)
samples = diffusion.sample(model, x, steps=1000)

diffusion:

model = create_model(config)
noise = torch.randn(1, 3, 64, 64)
samples = p_sample_loop(model, noise, num_timesteps=1000)

Both repositories provide implementations of diffusion models, but k-diffusion offers more flexibility and customization options, while diffusion may be simpler to use out of the box. In both snippets, model creation and sampling are separate calls: k-diffusion passes the initial noise tensor and a step count to a sample function, while diffusion passes them to p_sample_loop.

README

Denoising Diffusion Probabilistic Models

Jonathan Ho, Ajay Jain, Pieter Abbeel

Paper: https://arxiv.org/abs/2006.11239

Website: https://hojonathanho.github.io/diffusion

Samples generated by our model

Experiments run on Google Cloud TPU v3-8. Requires TensorFlow 1.15 and Python 3.5, and these dependencies for CPU instances (see requirements.txt):

pip3 install fire
pip3 install scipy
pip3 install pillow
pip3 install tensorflow-probability==0.8
pip3 install tensorflow-gan==0.0.0.dev0
pip3 install tensorflow-datasets==2.1.0

The training and evaluation scripts are in the scripts/ subdirectory. The commands to run training and evaluation are in comments at the top of the scripts. Data is stored in GCS buckets. The scripts are written to assume that the bucket names are of the form gs://mybucketprefix-us-central1; i.e. some prefix followed by the region. The prefix should be passed into the scripts using the --bucket_name_prefix flag.
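
As a small illustration of the naming convention described above, the following sketch builds the bucket URL from a prefix and a region. The helper `bucket_url` is hypothetical and not part of the repository; the scripts themselves only take the prefix via `--bucket_name_prefix`.

```python
def bucket_url(prefix, region):
    """Join a bucket prefix and a GCS region into gs://<prefix>-<region>."""
    if not prefix or not region:
        raise ValueError("both prefix and region are required")
    return "gs://{}-{}".format(prefix, region)

url = bucket_url("mybucketprefix", "us-central1")
print(url)  # gs://mybucketprefix-us-central1
```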

Models and samples can be found at: https://www.dropbox.com/sh/pm6tn31da21yrx4/AABWKZnBzIROmDjGxpB6vn6Ja

Citation

If you find our work relevant to your research, please cite:

@article{ho2020denoising,
    title={Denoising Diffusion Probabilistic Models},
    author={Jonathan Ho and Ajay Jain and Pieter Abbeel},
    year={2020},
    journal={arXiv preprint arxiv:2006.11239}
}