junyanz / pytorch-CycleGAN-and-pix2pix

Image-to-Image Translation in PyTorch

Top Related Projects

  • pix2pix: Image-to-image translation with conditional adversarial nets
  • PyTorch-GAN: PyTorch implementations of Generative Adversarial Networks.
  • SPADE: Semantic Image Synthesis with SPADE
  • pix2pixHD: Synthesizing and manipulating 2048x1024 images with conditional GANs
  • zi2zi: Learning Chinese Character style with conditional GAN

Quick Overview

The junyanz/pytorch-CycleGAN-and-pix2pix repository is a PyTorch implementation of image-to-image translation models, specifically CycleGAN and pix2pix. These models can be used for various tasks such as style transfer, object transfiguration, season transfer, and photo enhancement. The repository provides a comprehensive framework for training and testing these models on custom datasets.

Pros

  • Implements both CycleGAN (unpaired image translation) and pix2pix (paired image translation) in a single repository
  • Provides pre-trained models for various tasks, making it easy to get started with image translation
  • Includes detailed documentation and examples for training, testing, and using the models
  • Supports custom datasets and various data preprocessing options

Cons

  • Requires significant computational resources for training, especially for high-resolution images
  • May produce artifacts or unrealistic results in some cases, particularly with complex transformations
  • Limited to 2D image-to-image translation tasks, not suitable for other types of data or 3D applications
  • Requires some understanding of deep learning and GANs to effectively use and modify the models

Code Examples

  1. Loading a pre-trained CycleGAN model:
from models import create_model
from options.test_options import TestOptions

opt = TestOptions().parse()
model = create_model(opt)
model.setup(opt)
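  2. Translating an image using CycleGAN:
import torch
import torchvision.transforms as transforms
from PIL import Image
from util.util import tensor2im, save_image

# Resize and scale pixel values to [-1, 1], the range the generator expects
transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
])

img = Image.open('input.jpg').convert('RGB')  # force three channels
input_tensor = transform(img).unsqueeze(0).to(model.device)  # add a batch dimension
with torch.no_grad():  # inference only, no gradients needed
    output = model.netG_A(input_tensor)  # netG_A is available when the model was created with --model cycle_gan
output_image = tensor2im(output)  # returns a NumPy array, not a PIL image
save_image(output_image, 'output.jpg')
  3. Training a pix2pix model: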
from options.train_options import TrainOptions
from data import create_dataset
from models import create_model

opt = TrainOptions().parse()
dataset = create_dataset(opt)
model = create_model(opt)
model.setup(opt)

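for epoch in range(opt.epoch_count, opt.n_epochs + opt.n_epochs_decay + 1):
    for i, data in enumerate(dataset):
        model.set_input(data)          # unpack a batch and move it to the device
        model.optimize_parameters()    # forward pass, losses, and weight update
    model.save_networks('latest')      # checkpoint once per epoch
    model.update_learning_rate()       # step the schedulers so the decay phase takes effect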
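
The epoch range in the training loop reflects a two-phase schedule: `opt.n_epochs` epochs at the initial learning rate, followed by `opt.n_epochs_decay` epochs of decay. The following is a minimal sketch of what a linear decay over the second phase could look like. The function name `linear_lr_factor` and the default values are illustrative assumptions, not code taken from the repository.

```python
def linear_lr_factor(epoch, n_epochs=100, n_epochs_decay=100, epoch_count=1):
    """Multiplier applied to the initial learning rate at scheduler step `epoch`.

    Assumed policy (for illustration only): keep the rate constant during the
    first phase, then decay it linearly towards zero over the decay phase.
    """
    # Number of steps taken past the end of the constant phase (never negative).
    overshoot = max(0, epoch + epoch_count - n_epochs)
    return 1.0 - overshoot / float(n_epochs_decay + 1)


initial_lr = 0.0002  # hypothetical starting learning rate
schedule = {step: initial_lr * linear_lr_factor(step) for step in (0, 99, 150, 199)}
```

Under these assumptions the multiplier stays at 1.0 through step 99 and shrinks to 1/101 of the initial rate by step 199.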

Getting Started

  1. Clone the repository:
git clone https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
cd pytorch-CycleGAN-and-pix2pix
  2. Install dependencies:
pip install -r requirements.txt
  3. Download a pre-trained model and the matching test dataset:
bash ./scripts/download_cyclegan_model.sh horse2zebra
bash ./datasets/download_cyclegan_dataset.sh horse2zebra
  4. Test the model:
python test.py --dataroot ./datasets/horse2zebra/testA --name horse2zebra_pretrained --model test --no_dropout

Competitor Comparisons

Image-to-image translation with conditional adversarial nets

Pros of pix2pix

  • Original implementation of the pix2pix algorithm
  • Simpler codebase, easier to understand for beginners
  • Includes pre-trained models for quick testing

Cons of pix2pix

  • Implemented in Lua/Torch, which has a smaller community than PyTorch
  • Limited to pix2pix only, doesn't include CycleGAN or other related models
  • Less actively maintained, with fewer recent updates

Code Comparison

pix2pix (Torch):

local ndf = opt.ndf
netD = nn.Sequential()
netD:add(nn.SpatialConvolution(input_nc+output_nc, ndf, 4, 4, 2, 2, 1, 1))
netD:add(nn.LeakyReLU(0.2, true))

pytorch-CycleGAN-and-pix2pix (PyTorch):

def define_D(input_nc, ndf, netD, n_layers_D=3, norm='batch', use_sigmoid=False, init_type='normal', init_gain=0.02, gpu_ids=[]):
    net = None
    norm_layer = get_norm_layer(norm_type=norm)
    net = NLayerDiscriminator(input_nc, ndf, n_layers_D, norm_layer=norm_layer, use_sigmoid=use_sigmoid)

The pytorch-CycleGAN-and-pix2pix repository offers a more modern implementation in PyTorch, includes both pix2pix and CycleGAN models, and is actively maintained. It provides more flexibility and options for network architectures, making it suitable for a wider range of image-to-image translation tasks.
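
The `n_layers_D=3` default in `define_D` controls how deep the patch-based discriminator is, and therefore how large an image patch each output unit judges. As a rough illustration, the sketch below computes that receptive field. The layer layout (`n_layers` stride-2 convolutions followed by two stride-1 convolutions, all with 4x4 kernels) is an assumption based on the commonly described PatchGAN design, and both function names are hypothetical.

```python
def receptive_field(layers):
    """Receptive field, in input pixels, of a single output unit.

    `layers` is a list of (kernel_size, stride) pairs ordered from the
    input to the output of a stack of convolutions.
    """
    rf = 1
    # Walk backwards from the output, growing the field layer by layer.
    for kernel_size, stride in reversed(layers):
        rf = (rf - 1) * stride + kernel_size
    return rf


def patch_discriminator_layers(n_layers=3, kernel_size=4):
    """Assumed layout: n_layers stride-2 convs, then two stride-1 convs."""
    return [(kernel_size, 2)] * n_layers + [(kernel_size, 1)] * 2


print(receptive_field(patch_discriminator_layers(3)))  # 70
```

With three stride-2 layers each output unit sees a 70x70 patch; reducing `n_layers` to 1 shrinks the patch to 16x16 under the same assumptions.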

PyTorch implementations of Generative Adversarial Networks.

Pros of PyTorch-GAN

  • Implements a wider variety of GAN architectures (>30 models)
  • Provides a more comprehensive overview of different GAN types
  • Offers a unified structure for easier comparison between models

Cons of PyTorch-GAN

  • Less focused on specific applications like image-to-image translation
  • May have less optimized implementations for certain models
  • Lacks some advanced features present in CycleGAN-and-pix2pix

Code Comparison

PyTorch-GAN (DCGAN example):

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()
        self.init_size = opt.img_size // 4
        self.l1 = nn.Sequential(nn.Linear(opt.latent_dim, 128 * self.init_size ** 2))
        self.conv_blocks = nn.Sequential(
            nn.BatchNorm2d(128),
            nn.Upsample(scale_factor=2),
            nn.Conv2d(128, 128, 3, stride=1, padding=1),
            # ... (remaining layers omitted)
        )

CycleGAN-and-pix2pix (Generator example):

class ResnetGenerator(nn.Module):
    def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
        assert(n_blocks >= 0)
        super(ResnetGenerator, self).__init__()
        self.input_nc = input_nc
        self.output_nc = output_nc
        self.ngf = ngf
        if type(norm_layer) == functools.partial:
            use_bias = norm_layer.func == nn.InstanceNorm2d

Semantic Image Synthesis with SPADE

Pros of SPADE

  • Advanced semantic image synthesis with better control over spatial layouts
  • Improved quality and realism of generated images, especially for complex scenes
  • More flexible architecture that can handle various input types (e.g., segmentation masks, sketches)

Cons of SPADE

  • More complex implementation and potentially higher computational requirements
  • Less versatile in terms of image-to-image translation tasks compared to CycleGAN
  • May require more extensive training data for optimal results

Code Comparison

SPADE (config file snippet):

self.opt.semantic_nc = opt.label_nc + \
    (1 if opt.contain_dontcare_label else 0) + \
    (0 if opt.no_instance else 1)
self.spade = SPADE(opt.semantic_nc, opt.ngf, opt.norm_G)

CycleGAN (model definition snippet):

def define_G(input_nc, output_nc, ngf, netG, norm='batch', use_dropout=False, init_type='normal', init_gain=0.02, gpu_ids=[]):
    net = None
    norm_layer = get_norm_layer(norm_type=norm)

Both repositories provide implementations of advanced image generation techniques, but SPADE focuses more on semantic image synthesis, while CycleGAN offers a broader range of image-to-image translation capabilities. SPADE generally produces higher quality results for complex scenes, while CycleGAN is more versatile and easier to implement for various translation tasks.
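
The SPADE snippet in this comparison derives the number of semantic input channels from three options. A small worked example may make the arithmetic easier to follow. The function name and the sample values are hypothetical; only the formula mirrors the snippet.

```python
def semantic_channels(label_nc, contain_dontcare_label, no_instance):
    """Channel count of the semantic input, following the formula in the snippet."""
    dontcare = 1 if contain_dontcare_label else 0  # extra channel for "don't care" pixels
    instance = 0 if no_instance else 1             # extra channel for the instance map
    return label_nc + dontcare + instance


# Hypothetical settings: 35 label classes, a don't-care label, instance maps enabled.
print(semantic_channels(35, True, False))  # 37
```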

Synthesizing and manipulating 2048x1024 images with conditional GANs

Pros of pix2pixHD

  • Higher resolution output (up to 2048x1024)
  • Multi-scale generator and discriminator architecture for improved results
  • Instance-level feature embedding for finer details

Cons of pix2pixHD

  • More complex implementation, potentially harder to understand and modify
  • Requires more computational resources due to higher resolution and advanced architecture
  • Limited to paired image-to-image translation, unlike CycleGAN, which can also learn from unpaired data

Code Comparison

pix2pixHD:

class GlobalGenerator(nn.Module):
    def __init__(self, input_nc, output_nc, ngf=64, n_downsampling=3, n_blocks=9, norm_layer=nn.BatchNorm2d, 
                 padding_type='reflect'):
        super(GlobalGenerator, self).__init__()        
        activation = nn.ReLU(True)        

pytorch-CycleGAN-and-pix2pix:

class ResnetGenerator(nn.Module):
    def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
        assert(n_blocks >= 0)
        super(ResnetGenerator, self).__init__()
        self.input_nc = input_nc
        self.output_nc = output_nc
        self.ngf = ngf

The code snippets show differences in generator architecture, with pix2pixHD using a more complex GlobalGenerator compared to the ResnetGenerator in pytorch-CycleGAN-and-pix2pix.
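
One visible difference in the signatures is the `n_downsampling=3` default of `GlobalGenerator`. To give a feel for what that parameter means for feature-map sizes, here is a minimal sketch. It assumes each downsampling step is a 3x3 convolution with stride 2 and padding 1, which is an illustrative assumption rather than something stated in the snippets, and the function names are hypothetical.

```python
def conv_output_size(size, kernel_size=3, stride=2, padding=1):
    """Spatial size after one convolution (standard floor formula)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def bottleneck_size(height, width, n_downsampling):
    """Feature-map size after n_downsampling assumed stride-2 convolutions."""
    for _ in range(n_downsampling):
        height = conv_output_size(height)
        width = conv_output_size(width)
    return height, width


print(bottleneck_size(1024, 2048, 3))  # (128, 256)
```

Under these assumptions a 2048x1024 input reaches the residual blocks at 256x128, which is what keeps high-resolution processing tractable.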

Learning Chinese Character style with conditional GAN

Pros of zi2zi

  • Specialized for Chinese character generation and style transfer
  • Includes pre-trained models for immediate use
  • Offers a user-friendly interface for generating characters

Cons of zi2zi

  • Limited to Chinese character domain, less versatile than CycleGAN-and-pix2pix
  • Smaller community and fewer updates compared to CycleGAN-and-pix2pix
  • Less extensive documentation and examples

Code Comparison

zi2zi (model definition):

class UNet(nn.Module):
    def __init__(self, input_nc, output_nc, ngf=64):
        super(UNet, self).__init__()
        # ... (UNet architecture implementation)

CycleGAN-and-pix2pix (model definition):

class ResnetGenerator(nn.Module):
    def __init__(self, input_nc, output_nc, ngf=64, norm_layer=nn.BatchNorm2d, use_dropout=False, n_blocks=6, padding_type='reflect'):
        super(ResnetGenerator, self).__init__()
        # ... (ResNet-based generator implementation)

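The README below lists zi2zi as a TensorFlow implementation of pix2pix, so the zi2zi snippet above should be read as a PyTorch-style paraphrase of its UNet generator rather than a verbatim excerpt. zi2zi focuses on a UNet architecture for character generation, while CycleGAN-and-pix2pix uses a more general ResNet-based generator for various image-to-image translation tasks.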

README




CycleGAN and pix2pix in PyTorch

New: Please check out the img2img-turbo repo, which includes both pix2pix-turbo and CycleGAN-Turbo. Our new one-step image-to-image translation methods can support both paired and unpaired training and produce better results by leveraging the pre-trained StableDiffusion-Turbo model. The inference time for a 512x512 image is 0.29 sec on an A6000 and 0.11 sec on an A100.

Please check out contrastive-unpaired-translation (CUT), our new unpaired image-to-image translation model that enables fast and memory-efficient training.

We provide PyTorch implementations for both unpaired and paired image-to-image translation.

The code was written by Jun-Yan Zhu and Taesung Park, and supported by Tongzhou Wang.

This PyTorch implementation produces results comparable to or better than our original Torch software. If you would like to reproduce the same results as in the papers, check out the original CycleGAN Torch and pix2pix Torch code in Lua/Torch.

Note: The current software works well with PyTorch 1.4. Check out the older branch that supports PyTorch 0.1-0.3.

You may find useful information in training/test tips and frequently asked questions. To implement custom models and datasets, check out our templates. To help users better understand and adapt our codebase, we provide an overview of the code structure of this repository.

CycleGAN: Project | Paper | Torch | Tensorflow Core Tutorial | PyTorch Colab

Pix2pix: Project | Paper | Torch | Tensorflow Core Tutorial | PyTorch Colab

EdgesCats Demo | pix2pix-tensorflow | by Christopher Hesse

If you use this code for your research, please cite:

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks.
Jun-Yan Zhu*, Taesung Park*, Phillip Isola, Alexei A. Efros. In ICCV 2017. (* equal contributions) [Bibtex]

Image-to-Image Translation with Conditional Adversarial Networks.
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros. In CVPR 2017. [Bibtex]

Talks and Course

pix2pix slides: keynote | pdf, CycleGAN slides: pptx | pdf

CycleGAN course assignment code and handout designed by Prof. Roger Grosse for CSC321 "Intro to Neural Networks and Machine Learning" at University of Toronto. Please contact the instructor if you would like to adopt it in your course.

Colab Notebook

TensorFlow Core CycleGAN Tutorial: Google Colab | Code

TensorFlow Core pix2pix Tutorial: Google Colab | Code

PyTorch Colab notebook: CycleGAN and pix2pix

ZeroCostDL4Mic Colab notebook: CycleGAN and pix2pix

Other implementations

CycleGAN

[Tensorflow] (by Harry Yang), [Tensorflow] (by Archit Rathore), [Tensorflow] (by Van Huy), [Tensorflow] (by Xiaowei Hu), [Tensorflow2] (by Zhenliang He), [TensorLayer1.0] (by luoxier), [TensorLayer2.0] (by zsdonghao), [Chainer] (by Yanghua Jin), [Minimal PyTorch] (by yunjey), [Mxnet] (by Ldpe2G), [lasagne/Keras] (by tjwei), [Keras] (by Simon Karlsson), [OneFlow] (by Ldpe2G)

pix2pix

[Tensorflow] (by Christopher Hesse), [Tensorflow] (by Eyyüb Sariu), [Tensorflow (face2face)] (by Dat Tran), [Tensorflow (film)] (by Arthur Juliani), [Tensorflow (zi2zi)] (by Yuchen Tian), [Chainer] (by mattya), [tf/torch/keras/lasagne] (by tjwei), [Pytorch] (by taey16)

Prerequisites

  • Linux or macOS
  • Python 3
  • CPU or NVIDIA GPU + CUDA CuDNN

Getting Started

Installation

  • Clone this repo:
git clone https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
cd pytorch-CycleGAN-and-pix2pix
  • Install PyTorch 0.4+ and other dependencies (e.g., torchvision, visdom and dominate).
    • For pip users, please type the command pip install -r requirements.txt.
    • For Conda users, you can create a new Conda environment using conda env create -f environment.yml.
    • For Docker users, we provide the pre-built Docker image and Dockerfile. Please refer to our Docker page.
    • For Repl users, please click Run on Repl.it.

CycleGAN train/test

  • Download a CycleGAN dataset (e.g. maps):
bash ./datasets/download_cyclegan_dataset.sh maps
  • To view training results and loss plots, run python -m visdom.server and click the URL http://localhost:8097.
  • To log training progress and test images to the W&B dashboard, set the --use_wandb flag when running the train and test scripts.
  • Train a model:
#!./scripts/train_cyclegan.sh
python train.py --dataroot ./datasets/maps --name maps_cyclegan --model cycle_gan

To see more intermediate results, check out ./checkpoints/maps_cyclegan/web/index.html.

  • Test the model:
#!./scripts/test_cyclegan.sh
python test.py --dataroot ./datasets/maps --name maps_cyclegan --model cycle_gan
  • The test results will be saved to an HTML file here: ./results/maps_cyclegan/test_latest/index.html.

pix2pix train/test

  • Download a pix2pix dataset (e.g. facades):
bash ./datasets/download_pix2pix_dataset.sh facades
  • To view training results and loss plots, run python -m visdom.server and click the URL http://localhost:8097.
  • To log training progress and test images to the W&B dashboard, set the --use_wandb flag when running the train and test scripts.
  • Train a model:
#!./scripts/train_pix2pix.sh
python train.py --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --direction BtoA

To see more intermediate results, check out ./checkpoints/facades_pix2pix/web/index.html.

  • Test the model (bash ./scripts/test_pix2pix.sh):
#!./scripts/test_pix2pix.sh
python test.py --dataroot ./datasets/facades --name facades_pix2pix --model pix2pix --direction BtoA
  • The test results will be saved to an HTML file here: ./results/facades_pix2pix/test_latest/index.html. You can find more scripts in the scripts directory.
  • To train and test pix2pix-based colorization models, please add --model colorization and --dataset_mode colorization. See our training tips for more details.
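
The output locations mentioned in the train/test steps follow a regular pattern, which can be handy when scripting around the tools. The sketch below rebuilds those paths from an experiment name. The helper names and the `{phase}_{epoch}` folder pattern are assumptions inferred from the example path `./results/facades_pix2pix/test_latest/index.html`, not an API of the repository.

```python
import posixpath


def training_web_page(name, checkpoints_dir="./checkpoints"):
    """HTML page with intermediate training results for experiment `name`."""
    return posixpath.join(checkpoints_dir, name, "web", "index.html")


def results_page(name, phase="test", epoch="latest", results_dir="./results"):
    """HTML page with test results; the folder pattern is an assumption."""
    return posixpath.join(results_dir, name, f"{phase}_{epoch}", "index.html")


print(training_web_page("facades_pix2pix"))
print(results_page("facades_pix2pix"))
```

Both helpers only build strings; they do not check that the files exist.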

Apply a pre-trained model (CycleGAN)

  • You can download a pretrained model (e.g. horse2zebra) with the following script:
bash ./scripts/download_cyclegan_model.sh horse2zebra
  • The pretrained model is saved at ./checkpoints/{name}_pretrained/latest_net_G.pth. Check here for all the available CycleGAN models.
  • To test the model, you also need to download the horse2zebra dataset:
bash ./datasets/download_cyclegan_dataset.sh horse2zebra
  • Then generate the results using
python test.py --dataroot datasets/horse2zebra/testA --name horse2zebra_pretrained --model test --no_dropout
  • The option --model test is used for generating results of CycleGAN for one side only. This option will automatically set --dataset_mode single, which only loads the images from one set. In contrast, using --model cycle_gan requires loading and generating results in both directions, which is sometimes unnecessary. The results will be saved at ./results/. Use --results_dir {directory_path_to_save_result} to specify the results directory.

  • For pix2pix and your own models, you need to explicitly specify --netG, --norm, --no_dropout to match the generator architecture of the trained model. See this FAQ for more details.
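
When the one-directional test run described above has to be launched for several datasets, it can help to assemble the command line in code. The following is a minimal sketch that only uses flags mentioned in this section; the function `build_test_command` is a hypothetical helper, not part of the repository.

```python
import shlex


def build_test_command(dataroot, name, model="test", no_dropout=True, results_dir=None):
    """Assemble the argument list for a test.py run (hypothetical helper)."""
    args = ["python", "test.py", "--dataroot", dataroot, "--name", name, "--model", model]
    if no_dropout:
        args.append("--no_dropout")
    if results_dir is not None:
        # Optional: override the default ./results/ output directory.
        args += ["--results_dir", results_dir]
    return args


cmd = build_test_command("datasets/horse2zebra/testA", "horse2zebra_pretrained")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.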

Apply a pre-trained model (pix2pix)

Download a pre-trained model with ./scripts/download_pix2pix_model.sh.

  • Check here for all the available pix2pix models. For example, if you would like to download label2photo model on the Facades dataset,
bash ./scripts/download_pix2pix_model.sh facades_label2photo
  • Download the pix2pix facades datasets:
bash ./datasets/download_pix2pix_dataset.sh facades
  • Then generate the results using
python test.py --dataroot ./datasets/facades/ --direction BtoA --model pix2pix --name facades_label2photo_pretrained
  • Note that we specified --direction BtoA as Facades dataset's A to B direction is photos to labels.

  • If you would like to apply a pre-trained model to a collection of input images (rather than image pairs), please use --model test option. See ./scripts/test_single.sh for how to apply a model to Facade label maps (stored in the directory facades/testB).

  • See a list of currently available models at ./scripts/download_pix2pix_model.sh
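
The --direction BtoA option used above decides which side of each image pair is treated as the input. As a rough illustration, the sketch below returns crop boxes for the input and target halves. It assumes that a paired image stores A in the left half and B in the right half, which is not stated in this section, and the function name is hypothetical.

```python
def aligned_crop_boxes(width, height, direction="AtoB"):
    """Return (input_box, target_box) as (left, upper, right, lower) tuples.

    Assumption: A occupies the left half of the paired image, B the right half.
    """
    if direction not in ("AtoB", "BtoA"):
        raise ValueError("direction must be 'AtoB' or 'BtoA'")
    half = width // 2
    box_a = (0, 0, half, height)
    box_b = (half, 0, width, height)
    # With BtoA, the B half becomes the input and the A half the target.
    return (box_a, box_b) if direction == "AtoB" else (box_b, box_a)


input_box, target_box = aligned_crop_boxes(512, 256, direction="BtoA")
```

The boxes use the same (left, upper, right, lower) convention as PIL's `Image.crop`, so they can be passed to it directly.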

Docker

We provide the pre-built Docker image and Dockerfile that can run this code repo. See docker.

Datasets

Download pix2pix/CycleGAN datasets and create your own datasets.

Training/Test Tips

Best practice for training and testing your models.

Frequently Asked Questions

Before you post a new question, please first look at the above Q & A and existing GitHub issues.

Custom Model and Dataset

If you plan to implement custom models and datasets for your new applications, we provide a dataset template and a model template as a starting point.

Code structure

To help users better understand and use our code, we briefly overview the functionality and implementation of each package and each module.

Pull Request

You are always welcome to contribute to this repository by sending a pull request. Please run flake8 --ignore E501 . and python ./scripts/test_before_push.py before you commit the code. Please also update the code structure overview accordingly if you add or remove files.

Citation

If you use this code for your research, please cite our papers.

@inproceedings{CycleGAN2017,
  title={Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author={Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle={Computer Vision (ICCV), 2017 IEEE International Conference on},
  year={2017}
}


@inproceedings{isola2017image,
  title={Image-to-Image Translation with Conditional Adversarial Networks},
  author={Isola, Phillip and Zhu, Jun-Yan and Zhou, Tinghui and Efros, Alexei A},
  booktitle={Computer Vision and Pattern Recognition (CVPR), 2017 IEEE Conference on},
  year={2017}
}

Other Languages

Spanish

Related Projects

contrastive-unpaired-translation (CUT)
CycleGAN-Torch | pix2pix-Torch | pix2pixHD | BicycleGAN | vid2vid | SPADE/GauGAN
iGAN | GAN Dissection | GAN Paint

Cat Paper Collection

If you love cats, and love reading cool graphics, vision, and learning papers, please check out the Cat Paper Collection.

Acknowledgments

Our code is inspired by pytorch-DCGAN.