ermongroup/ddim

Denoising Diffusion Implicit Models


Top Related Projects

  • stable-diffusion: A latent text-to-image diffusion model
  • diffusion: Denoising Diffusion Probabilistic Models
  • denoising-diffusion-pytorch: Implementation of Denoising Diffusion Probabilistic Model in Pytorch
  • k-diffusion: Karras et al. (2022) diffusion models for PyTorch

Quick Overview

The ermongroup/ddim repository is an implementation of Denoising Diffusion Implicit Models (DDIMs), a class of non-Markovian diffusion processes that generate high-quality samples faster than their Markovian counterparts. This project provides a PyTorch implementation of DDIMs, along with training and sampling scripts for various datasets.
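
To make the speed-up concrete, the heart of DDIM sampling is a single generalized update that can jump across timesteps. The sketch below is a minimal illustration of that update, not the repository's actual API; the function and argument names are assumptions.

import torch

def ddim_step(eps_model, x_t, t, t_prev, alpha_bar, eta=0.0):
    # Minimal sketch of one generalized DDIM update x_t -> x_{t_prev}.
    # eps_model predicts the noise; alpha_bar holds the cumulative
    # products of (1 - beta). eta = 0 gives the deterministic DDIM
    # update, eta = 1 recovers a DDPM-like stochastic update.
    a_t, a_prev = alpha_bar[t], alpha_bar[t_prev]
    t_batch = torch.full((x_t.shape[0],), t, device=x_t.device)
    eps = eps_model(x_t, t_batch)

    # Predict x_0 from the current sample and the noise estimate
    x0_pred = (x_t - (1 - a_t).sqrt() * eps) / a_t.sqrt()

    # Variance scale controlled by eta
    sigma = eta * ((1 - a_prev) / (1 - a_t)).sqrt() * (1 - a_t / a_prev).sqrt()

    noise = sigma * torch.randn_like(x_t) if eta > 0 else 0.0
    return a_prev.sqrt() * x0_pred + (1 - a_prev - sigma ** 2).sqrt() * eps + noise

Because t_prev does not have to be t - 1, the same update can traverse a short subsequence of timesteps, which is where the sampling speed-up over DDPM comes from.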

Pros

  • Faster sampling compared to traditional diffusion models
  • High-quality image generation results
  • Flexible architecture that can be applied to various datasets
  • Well-documented codebase with clear instructions

Cons

  • Requires significant computational resources for training
  • Limited to image generation tasks
  • May require fine-tuning for specific use cases
  • Relatively complex mathematical foundation, which may be challenging for beginners

Code Examples

  1. Loading a pre-trained DDIM model (module paths and constructor arguments are illustrative and may differ from the actual repository):

import torch
from models.diffusion import Model

model = Model(args)
model.load_state_dict(torch.load('path/to/checkpoint.pth'))
model.eval()

  2. Generating samples with the DDIM sampler (illustrative API):

from sampling import ddim_sampling

samples = ddim_sampling(model, x_T, alphas, betas, T, eta)

  3. Training a DDIM model (illustrative API):

from train import train

train(args, model, train_loader, optimizer, scheduler)

Getting Started

To get started with DDIM:

  1. Clone the repository:

    git clone https://github.com/ermongroup/ddim.git
    cd ddim
    
  2. Install dependencies:

    pip install -r requirements.txt
    
  3. Train a model or use a pre-trained one:

    from models.diffusion import Model
    from train import train
    
    model = Model(args)
    train(args, model, train_loader, optimizer, scheduler)
    
  4. Generate samples:

    from sampling import ddim_sampling
    
    samples = ddim_sampling(model, x_T, alphas, betas, T, eta)
    

Competitor Comparisons

Pros of guided-diffusion

  • More comprehensive implementation with additional features like classifier guidance
  • Better documentation and code organization
  • Supports a wider range of diffusion models and sampling techniques

Cons of guided-diffusion

  • Higher computational requirements due to more complex models
  • Steeper learning curve for newcomers to diffusion models
  • Less focused on specific DDIM implementation

Code Comparison

guided-diffusion:

def p_sample_loop(
    model,
    shape,
    noise=None,
    clip_denoised=True,
    denoised_fn=None,
    model_kwargs=None,
    device=None,
    progress=False,
):
    # ... (implementation details)

ddim:

def p_sample_ddim(model, x, t, clip_denoised=True, denoised_fn=None, model_kwargs=None):
    out = p_mean_variance(
        model,
        x,
        t,
        clip_denoised=clip_denoised,
        denoised_fn=denoised_fn,
        model_kwargs=model_kwargs,
    )
    # ... (implementation details)

The guided-diffusion repository offers a more flexible sampling loop with additional parameters, while ddim provides a more focused implementation of the DDIM sampling process. guided-diffusion's approach allows for easier integration of various sampling techniques, but ddim's implementation is more straightforward for those specifically interested in DDIM.

A latent text-to-image diffusion model

Pros of stable-diffusion

  • More advanced and versatile image generation capabilities
  • Larger community and active development
  • Supports text-to-image generation

Cons of stable-diffusion

  • Higher computational requirements
  • More complex architecture and implementation
  • Larger model size, requiring more storage and memory

Code Comparison

DDIM:

def p_sample(self, x, t, clip_denoised=True):
    out = self.p_mean_variance(x, t)
    noise = torch.randn_like(x)
    nonzero_mask = (t != 0).float().view(-1, *([1] * (len(x.shape) - 1)))
    return out["mean"] + nonzero_mask * torch.exp(0.5 * out["log_variance"]) * noise

stable-diffusion:

@torch.no_grad()
def p_sample_plms(self, x, c, t, index, repeat_noise=False, use_original_steps=False, quantize_denoised=False,
                  temperature=1., noise_dropout=0., score_corrector=None, corrector_kwargs=None,
                  unconditional_guidance_scale=1., unconditional_conditioning=None, old_eps=None, t_next=None):
    b, *_, device = *x.shape, x.device

The code snippets show that stable-diffusion has a more complex sampling function with additional parameters and features compared to DDIM's simpler implementation.

Denoising Diffusion Probabilistic Models

Pros of diffusion

  • More comprehensive implementation of diffusion models
  • Includes additional features like conditional sampling and various noise schedules
  • Better documentation and code organization

Cons of diffusion

  • Larger codebase, potentially more complex to understand and modify
  • May require more computational resources due to additional features
  • Less focused on specific applications compared to ddim

Code Comparison

diffusion:

def p_sample(self, model, x, t, clip_denoised=True, denoised_fn=None, cond_fn=None, model_kwargs=None):
    out = self.p_mean_variance(
        model,
        x,
        t,
        clip_denoised=clip_denoised,
        denoised_fn=denoised_fn,
        model_kwargs=model_kwargs,
    )

ddim:

def p_sample(self, model, x, t, t_index, betas):
    e_t = model(x, t)
    alphas = 1.0 - betas
    alphas_cumprod = torch.cumprod(alphas, dim=0)
    x_recon = self._predict_xstart_from_eps(x, t_index, e_t)

The code snippets show different approaches to sampling in the diffusion process, with diffusion offering more flexibility and options, while ddim focuses on a specific implementation.

Implementation of Denoising Diffusion Probabilistic Model in Pytorch

Pros of denoising-diffusion-pytorch

  • More comprehensive implementation with additional features like conditional generation and image inpainting
  • Better documentation and code organization, making it easier for users to understand and modify
  • Active development and regular updates, incorporating latest research and improvements

Cons of denoising-diffusion-pytorch

  • Higher complexity, which may be overwhelming for beginners or those seeking a simpler implementation
  • Potentially slower inference time due to the additional features and flexibility

Code Comparison

ddim:

def p_sample(self, x, t, clip_denoised=True):
    out = self.p_mean_variance(x, t)
    noise = torch.randn_like(x)
    nonzero_mask = (t != 0).float().view(-1, *([1] * (len(x.shape) - 1)))
    return out["mean"] + nonzero_mask * torch.exp(0.5 * out["log_variance"]) * noise

denoising-diffusion-pytorch:

@torch.no_grad()
def p_sample(self, x, t: int, x_self_cond = None, clip_denoised = True):
    b, *_, device = *x.shape, x.device
    model_mean, _, model_log_variance = self.p_mean_variance(x = x, t = t, x_self_cond = x_self_cond)
    noise = torch.randn_like(x) if t > 0 else 0.
    pred_img = model_mean + (0.5 * model_log_variance).exp() * noise
    return pred_img, x_self_cond

Karras et al. (2022) diffusion models for PyTorch

Pros of k-diffusion

  • More flexible sampling algorithms, including advanced methods like DPM-Solver
  • Better support for conditional generation and guidance techniques
  • More active development and community support

Cons of k-diffusion

  • Steeper learning curve due to more complex architecture
  • Less focus on theoretical foundations compared to DDIM
  • May require more computational resources for some advanced sampling methods

Code Comparison

k-diffusion:

model = diffusion.DiffusionModel(...)
x = torch.randn(...)
samples = diffusion.sample(model, x, steps=20, eta=0.0)

DDIM:

model = DDIM(...)
x_T = torch.randn(...)
samples = model.sample(x_T, num_steps=20)

Both repositories provide implementations of diffusion models, but k-diffusion offers a wider range of sampling algorithms and more flexibility in model architecture. DDIM focuses on a specific sampling technique (Denoising Diffusion Implicit Models) and provides a more straightforward implementation. k-diffusion is generally more suitable for advanced users and researchers exploring various diffusion model variants, while DDIM may be preferable for those specifically interested in the DDIM algorithm or seeking a simpler starting point.


README

Denoising Diffusion Implicit Models (DDIM)

Jiaming Song, Chenlin Meng and Stefano Ermon, Stanford

Implements sampling from an implicit model that is trained with the same procedure as Denoising Diffusion Probabilistic Models, but requires much less time and compute to sample from.

Integration with 🤗 Diffusers library

DDIM is now also available in 🧨 Diffusers and accessible via the DDIMPipeline. Diffusers allows you to test DDIM in PyTorch in just a couple of lines of code.

You can install diffusers as follows:

pip install diffusers torch accelerate

And then try out the model with just a couple lines of code:

from diffusers import DDIMPipeline

model_id = "google/ddpm-cifar10-32"

# load model and scheduler
ddim = DDIMPipeline.from_pretrained(model_id)

# run pipeline in inference (sample random noise and denoise)
image = ddim(num_inference_steps=50).images[0]

# save image
image.save("ddim_generated_image.png")

More DDPM/DDIM models compatible with the DDIM pipeline can be found directly on the Hub.

To better understand the DDIM scheduler, you can check out this introductory Google Colab.
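
As a minimal sketch of the scheduler-swap pattern (assuming a recent diffusers version), you can plug the DDIM scheduler into an existing DDPM pipeline and sample with far fewer steps:

from diffusers import DDPMPipeline, DDIMScheduler

# Load a DDPM pipeline and swap its scheduler for DDIM
pipe = DDPMPipeline.from_pretrained("google/ddpm-cifar10-32")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

# 50 DDIM steps instead of the full DDPM schedule
image = pipe(num_inference_steps=50).images[0]
image.save("ddim_from_ddpm.png")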

The DDIM scheduler can also be used with more powerful diffusion models such as Stable Diffusion.

You simply need to accept the license on the Hub, login with huggingface-cli login and install transformers:

pip install transformers

Then you can run:

from diffusers import StableDiffusionPipeline, DDIMScheduler

ddim = DDIMScheduler.from_config("runwayml/stable-diffusion-v1-5", subfolder="scheduler")
pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", scheduler=ddim)

image = pipeline("An astronaut riding a horse.").images[0]

image.save("astronaut_riding_a_horse.png")

Running the Experiments

The code has been tested on PyTorch 1.6.

Train a model

Training is exactly the same as DDPM with the following:

python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --ni

Sampling from the model

Sampling from the generalized model for FID evaluation

python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --doc {MODEL_NAME} --sample --fid --timesteps {STEPS} --eta {ETA} --ni

where

  • ETA controls the scale of the variance (0 is DDIM, and 1 is one type of DDPM).
  • STEPS controls how many timesteps are used in the process.
  • MODEL_NAME finds the pre-trained checkpoint according to its inferred path.
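
For example, a CIFAR-10 run with 50 DDIM sampling steps might look like this (the config, experiment path and model name are placeholders to adapt to your setup):

python main.py --config cifar10.yml --exp logs/cifar10 --doc cifar10_model --sample --fid --timesteps 50 --eta 0 --ni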

If you want to use the DDPM pretrained model:

python main.py --config {DATASET}.yml --exp {PROJECT_PATH} --use_pretrained --sample --fid --timesteps {STEPS} --eta {ETA} --ni

The --use_pretrained option will automatically load the model according to the dataset.

We provide a CelebA 64x64 model here, and use the DDPM version for CIFAR10 and LSUN.

If you want to use the version with the larger variance in DDPM: use the --sample_type ddpm_noisy option.

Sampling from the model for image interpolation

Use the --interpolation option instead of --fid.

Sampling from the sequence of images that lead to the sample

Use the --sequence option instead.

The above two cases contain some hard-coded lines specific to producing the image, so modify them according to your needs.

References and Acknowledgements

@article{song2020denoising,
  title={Denoising Diffusion Implicit Models},
  author={Song, Jiaming and Meng, Chenlin and Ermon, Stefano},
  journal={arXiv:2010.02502},
  year={2020},
  month={October},
  abbr={Preprint},
  url={https://arxiv.org/abs/2010.02502}
}

This implementation is based on / inspired by: