AdaIN-style
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
Top Related Projects
- neural-style: Torch implementation of the neural style algorithm
- deep-photo-styletransfer: Code and data for the paper "Deep Photo Style Transfer" (https://arxiv.org/abs/1703.07511)
- FastPhotoStyle: Style transfer, deep learning, feature transform
- fast-style-transfer: TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Quick Overview
AdaIN-style is the official Torch (Lua) implementation of the paper "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization" by Huang and Belongie (ICCV 2017). The project provides a method for fast arbitrary style transfer, allowing users to apply the style of one image to the content of another in real time.
Pros
- Fast and real-time style transfer capabilities
- Supports arbitrary styles without retraining
- Includes pre-trained models for immediate use
- Well-documented with clear instructions for usage
Cons
- Written in Torch (Lua) rather than a more widely used framework such as PyTorch or TensorFlow
- May require significant computational resources for training
- Results can sometimes be inconsistent depending on the style-content pair
- Lacks some advanced features found in more recent style transfer methods
Code Examples
The official repository is written in Torch (Lua); the Python snippets below sketch the equivalent operations in the style of community PyTorch ports and are not verbatim repository code. An end-to-end sketch follows the fragments.
- Loading pre-trained models:
from model import net  # encoder/decoder definitions (assumed PyTorch-port layout)
decoder = net.decoder
vgg = net.vgg
decoder.eval()  # inference mode; pre-trained weights are assumed to be loaded
vgg.eval()
- Performing style transfer:
from function import adaptive_instance_normalization as adain
from function import coral  # coral() supports the color-preservation option
content_feat = vgg(content)  # VGG encoder features of the content image
style_feat = vgg(style)      # VGG encoder features of the style image
output = adain(content_feat, style_feat)
output = decoder(output)
- Adjusting style strength:
def style_transfer(vgg, decoder, content, style, alpha=1.0):
    assert (0.0 <= alpha <= 1.0)
    content_f = vgg(content)
    style_f = vgg(style)
    feat = adain(content_f, style_f)
    feat = feat * alpha + content_f * (1 - alpha)
    return decoder(feat)
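- End-to-end sketch:
The following hypothetical script ties the fragments above together. The module layout (model.net), weight loading, and image paths are assumptions for illustration rather than the repository's exact API; only the overall flow (encode both images, apply AdaIN, optionally blend with alpha, decode, save) reflects the method.
import torch
from PIL import Image
from torchvision import transforms
from torchvision.utils import save_image

from model import net                                    # assumed module layout
from function import adaptive_instance_normalization as adain

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# vgg is assumed to be an encoder truncated at relu4_1 with pre-trained weights
# already loaded; decoder mirrors it and maps features back to an image.
vgg = net.vgg.to(device).eval()
decoder = net.decoder.to(device).eval()

to_tensor = transforms.ToTensor()
content = to_tensor(Image.open('input/content/cornell.jpg')).unsqueeze(0).to(device)
style = to_tensor(Image.open('input/style/woman_with_hat_matisse.jpg')).unsqueeze(0).to(device)

alpha = 1.0                                              # style strength in [0, 1]
with torch.no_grad():
    content_f, style_f = vgg(content), vgg(style)
    feat = adain(content_f, style_f)
    feat = alpha * feat + (1 - alpha) * content_f
    output = decoder(feat)

save_image(output.clamp(0, 1), 'output/result.jpg')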
Getting Started
- Clone the repository:
git clone https://github.com/xunhuang1995/AdaIN-style.git
cd AdaIN-style
- Install Torch (torch7); optionally install CUDA, cuDNN, cunn, and cudnn.torch for GPU support (see the Dependencies section in the README below).
- Download the pre-trained models:
bash models/download_models.sh
- Run style transfer:
th test.lua -content input/content/cornell.jpg -style input/style/woman_with_hat_matisse.jpg
Competitor Comparisons
Torch implementation of neural style algorithm
Pros of neural-style
- Offers more control over style transfer parameters
- Produces high-quality results with fine-grained details
- Supports multiple style images for blended effects
Cons of neural-style
- Slower processing time, especially for high-resolution images
- Requires more computational resources
- Less suitable for real-time applications or video processing
Code Comparison
neural-style:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers or {21}
local style_layers = params.style_layers or {2, 7, 12, 17, 22}
AdaIN-style:
content = torch.from_numpy(content).to(device)
style = torch.from_numpy(style).to(device)
with torch.no_grad():
    output = style_transfer(vgg, decoder, content, style, alpha)
neural-style is written in Lua/Torch and allows detailed configuration of which content and style layers drive the optimization. The original AdaIN-style repository is also Torch-based (the snippet above follows its PyTorch ports), but stylization is a single feed-forward pass, so it is generally much faster and better suited to real-time applications, though its results can be less detailed than neural-style's optimization-based output.
Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
Pros of deep-photo-styletransfer
- Produces more photorealistic results, preserving the structure and details of the original image
- Offers better control over the style transfer process, allowing for fine-tuning of specific areas
- Includes a segmentation-aware loss function for improved content preservation
Cons of deep-photo-styletransfer
- Slower processing time compared to AdaIN-style
- Requires more computational resources and setup
- Less versatile for real-time applications or quick style transfers
Code Comparison
deep-photo-styletransfer:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_seg = image.load(params.content_seg, 1)
local style_seg = image.load(params.style_seg, 1)
AdaIN-style (illustrative, following its PyTorch ports):
content_f = vgg(content)
style_f = vgg(style)
feat = adain(content_f, style_f)
output = decoder(feat * alpha + content_f * (1 - alpha))
deep-photo-styletransfer loads the content and style images together with their segmentation maps, while AdaIN-style needs only the content and style images plus a single alpha parameter that controls style strength; it has no notion of region segmentation.
Style transfer, deep learning, feature transform
Pros of FastPhotoStyle
- Faster processing time for style transfer
- Better preservation of content structure and details
- Supports high-resolution images
Cons of FastPhotoStyle
- More complex implementation and setup
- Requires more computational resources
- Limited style flexibility compared to AdaIN-style
Code Comparison
AdaIN-style:
def calc_mean_std(feat, eps=1e-5):
    size = feat.size()
    assert (len(size) == 4)
    N, C = size[:2]
    feat_var = feat.view(N, C, -1).var(dim=2) + eps
    feat_std = feat_var.sqrt().view(N, C, 1, 1)
    feat_mean = feat.view(N, C, -1).mean(dim=2).view(N, C, 1, 1)
    return feat_mean, feat_std
FastPhotoStyle:
def wct_core(cont_feat, styl_feat, weight=1, registers=None):
    cont_c, cont_h, cont_w = cont_feat.size(0), cont_feat.size(1), cont_feat.size(2)
    cont_feat_view = cont_feat.view(cont_c, -1)
    cont_feat_mean = torch.mean(cont_feat_view, 1)
    cont_feat_var = torch.var(cont_feat_view, 1)
    cont_feat_std = cont_feat_var.sqrt()
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Pros of fast-style-transfer
- Faster inference time for real-time applications
- Supports video stylization
- Well-documented with clear instructions for usage
Cons of fast-style-transfer
- Limited to pre-trained styles; requires retraining for new styles
- Less flexibility in adjusting style strength or characteristics
- May produce lower quality results for certain types of images
Code Comparison
fast-style-transfer:
stylizer = stylize.StyleTransferNetwork(checkpoint)
stylized_image = stylizer.predict(content_image)
AdaIN-style:
style_model = net.Net(encoder, decoder)
content_feat = encoder(content_image)
style_feat = encoder(style_image)
stylized = decoder(adain(content_feat, style_feat))
The main difference in code usage is that fast-style-transfer uses a pre-trained network for a specific style, while AdaIN-style allows for arbitrary style transfer by encoding both content and style images at runtime.
AdaIN-style offers more flexibility in style application and can adapt to new styles without retraining, but may require more computational resources during inference. fast-style-transfer, on the other hand, provides quicker stylization but is limited to styles it has been trained on.
README
AdaIN-style
This repository contains the code (in Torch) for the paper:
Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
Xun Huang,
Serge Belongie
ICCV 2017 (Oral)
This paper proposes the first real-time style transfer algorithm that can transfer arbitrary new styles, in contrast to a single style or 32 styles. Our algorithm runs at 15 FPS with 512x512 images on a Pascal Titan X. This is around 720x speedup compared with the original algorithm of Gatys et al., without sacrificing any flexibility. We accomplish this with a novel adaptive instance normalization (AdaIN) layer, which is similar to instance normalization but with affine parameters adaptively computed from the feature representations of an arbitrary style image.
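Concretely, the AdaIN layer aligns the channel-wise mean and standard deviation of the content features with those of the style features. The official code implements this as a Torch layer; below is a minimal, self-contained PyTorch sketch of the same operation, for reference only.
import torch

def adain(content_feat, style_feat, eps=1e-5):
    # content_feat, style_feat: (N, C, H, W) feature maps from the VGG encoder
    n, c = content_feat.shape[:2]
    c_flat = content_feat.view(n, c, -1)
    s_flat = style_feat.view(n, c, -1)
    c_mean, c_std = c_flat.mean(-1, keepdim=True), c_flat.std(-1, keepdim=True) + eps
    s_mean, s_std = s_flat.mean(-1, keepdim=True), s_flat.std(-1, keepdim=True) + eps
    # normalize the content statistics, then scale/shift with the style statistics
    normalized = (c_flat - c_mean) / c_std
    return (normalized * s_std + s_mean).view_as(content_feat)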
Examples
(Example stylization results are shown as images in the repository.)
Dependencies
- torch7
Optionally:
- CUDA and cuDNN
- cunn
- cudnn.torch
- ffmpeg (for video)
Download
bash models/download_models.sh
This command will download a pre-trained decoder as well as a modified VGG-19 network. Our style transfer network consists of the first few layers of VGG, an AdaIN layer, and the provided decoder.
Usage
Basic usage
Use -content and -style to provide the paths to the content and style image, for example:
th test.lua -content input/content/cornell.jpg -style input/style/woman_with_hat_matisse.jpg
You can also run the code on directories of content and style images using -contentDir and -styleDir. It will save every possible combination of content and styles to the output directory.
th test.lua -contentDir input/content -styleDir input/style
Some other options:
- -crop: Center crop both content and style images beforehand.
- -contentSize: New (minimum) size for the content image. Keeps the original size if set to 0.
- -styleSize: New (minimum) size for the style image. Keeps the original size if set to 0.
To see all available options, type:
th test.lua -help
Content-style trade-off
Use -alpha to adjust the degree of stylization. It should be a value between 0 and 1 (default 1). Example usage:
th test.lua -content input/content/chicago.jpg -style input/style/asheville.jpg -alpha 0.5 -crop
By changing -alpha, you should be able to reproduce the trade-off results shown in the paper.
Transfer style but not color
Add -preserveColor to preserve the color of the content image. Example usage:
th test.lua -content input/content/newyork.jpg -style input/style/brushstrokes.jpg -contentSize 0 -styleSize 0 -preserveColor
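For intuition, color preservation first matches the color statistics of the style image to those of the content image, so the style contributes texture and brush strokes but keeps the content palette. Below is a simplified sketch assuming per-channel mean/std matching; the actual -preserveColor option uses a fuller CORAL-style (covariance) alignment, cf. the coral helper imported in the code examples above.
import torch

def match_color_to_content(style_img, content_img, eps=1e-5):
    # style_img, content_img: (3, H, W) tensors in [0, 1]
    # Shift and scale each RGB channel of the style image so its mean and std
    # match the content image. (Simplified; the repository aligns the full
    # channel covariance instead of per-channel statistics.)
    s = style_img.view(3, -1)
    c = content_img.view(3, -1)
    s_mean, s_std = s.mean(1, keepdim=True), s.std(1, keepdim=True) + eps
    c_mean, c_std = c.mean(1, keepdim=True), c.std(1, keepdim=True) + eps
    matched = (s - s_mean) / s_std * c_std + c_mean
    return matched.clamp(0, 1).view_as(style_img)

# the recolored style image is then fed to the usual style-transfer pipeline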
Style interpolation
It is possible to interpolate between several styles using -styleInterpWeights, which controls the relative weight of each style. Note that you also need to provide the same number of style images, separated by commas. Example usage:
th test.lua -content input/content/avril.jpg \
-style input/style/picasso_self_portrait.jpg,input/style/impronte_d_artista.jpg,input/style/trial.jpg,input/style/antimonocromatismo.jpg \
-styleInterpWeights 1,1,1,1 -crop
You should be able to reproduce the results shown in our paper by changing -styleInterpWeights.
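Under the hood, style interpolation is a weighted average of the AdaIN feature maps produced for each style, computed before decoding. A hedged PyTorch sketch, reusing the vgg, decoder, and adain names from the earlier examples (not the repository's Lua API):
import torch

def interpolate_styles(vgg, decoder, content, styles, weights):
    # styles: list of style image tensors; weights: relative weights, e.g. [1, 1, 1, 1]
    w = torch.tensor(weights, dtype=torch.float32)
    w = w / w.sum()                                  # normalize the relative weights
    with torch.no_grad():
        content_f = vgg(content)
        feat = torch.zeros_like(content_f)
        for weight, style in zip(w, styles):
            feat = feat + weight * adain(content_f, vgg(style))
        return decoder(feat)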
Spatial control
Use -mask to provide the path to a binary foreground mask. You can transfer the foreground and background of the content image to different styles. Note that you also need to provide two style images, separated by a comma; the first is applied to the foreground and the second to the background. Example usage:
th test.lua -content input/content/blonde_girl.jpg -style input/style/woman_in_peasant_dress_cropped.jpg,input/style/mondrian_cropped.jpg \
-mask input/mask/mask.png -contentSize 0 -styleSize 0
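Conceptually, spatial control runs AdaIN once per style and stitches the two stylized feature maps together with the mask (downsampled to the feature resolution) before decoding. A rough PyTorch sketch, again with the helper names assumed from the earlier examples:
import torch
import torch.nn.functional as F

def masked_style_transfer(vgg, decoder, content, style_fg, style_bg, mask):
    # mask: (1, 1, H, W) float tensor, 1.0 = foreground, 0.0 = background
    with torch.no_grad():
        content_f = vgg(content)
        feat_fg = adain(content_f, vgg(style_fg))   # foreground style
        feat_bg = adain(content_f, vgg(style_bg))   # background style
        # bring the mask down to the spatial resolution of the feature maps
        m = F.interpolate(mask, size=content_f.shape[2:], mode='nearest')
        feat = m * feat_fg + (1 - m) * feat_bg
        return decoder(feat)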
Video Stylization
Use styVid.sh to process videos; it drives testVid.lua with commands of the form:
th testVid.lua -contentDir videoprocessing/${filename} -style ${styleimage} -outputDir videoprocessing/${filename}-${stylename}
This generates one mp4 for each image present in style-dir-path. Other video formats are also supported. To change other parameters such as alpha, edit line 53 of styVid.sh. An example video with some results can be seen on YouTube.
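In essence, the video pipeline extracts frames, stylizes each frame with the image pipeline, and re-encodes the result. The sketch below only illustrates those steps: stylize_image is a hypothetical wrapper around the single-image pipeline, and the paths and frame rate are placeholders for what styVid.sh handles automatically.
import glob
import os
import subprocess

os.makedirs('frames', exist_ok=True)
os.makedirs('stylized', exist_ok=True)

# 1. split the video into frames (requires ffmpeg on the PATH)
subprocess.run(['ffmpeg', '-i', 'input.mp4', 'frames/%05d.png'], check=True)

# 2. stylize every frame with the same style image; stylize_image is a
#    hypothetical wrapper around the single-image pipeline shown earlier
for frame in sorted(glob.glob('frames/*.png')):
    stylize_image(content_path=frame,
                  style_path='input/style/asheville.jpg',
                  output_path=frame.replace('frames', 'stylized'))

# 3. re-encode the stylized frames into a video
subprocess.run(['ffmpeg', '-framerate', '24', '-i', 'stylized/%05d.png',
                '-c:v', 'libx264', '-pix_fmt', 'yuv420p', 'output.mp4'], check=True)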
Training
- Download MSCOCO images and Wikiart images.
- Run the following command to start training with the default hyperparameters (a condensed sketch of the training objective follows below):
th train.lua -contentDir COCO_TRAIN_DIR -styleDir WIKIART_TRAIN_DIR
Replace COCO_TRAIN_DIR with the path to the COCO training images and WIKIART_TRAIN_DIR with the path to the Wikiart training images. The default hyperparameters are the same as the ones used to train decoder-content-similar.t7. To reproduce the results from decoder.t7, add -styleWeight 1e-1.
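For reference, training updates only the decoder: the content loss compares the re-encoded output with the AdaIN target features, and the style loss matches the channel-wise means and standard deviations of several VGG layers between the output and the style image, weighted by -styleWeight. A condensed PyTorch sketch of that objective, reusing adain and the calc_mean_std helper shown in the FastPhotoStyle comparison; the layer split and default weight are placeholders, not the repository's exact values.
import torch
import torch.nn.functional as F

def adain_loss(enc_slices, decoder, content, style, style_weight=1.0):
    # enc_slices: VGG encoder split into slices ending at relu1_1 ... relu4_1
    def encode_all(x):
        feats = []
        for layer in enc_slices:
            x = layer(x)
            feats.append(x)
        return feats

    style_feats = encode_all(style)
    content_feat = encode_all(content)[-1]
    target = adain(content_feat, style_feats[-1])   # AdaIN output t is the content target
    output = decoder(target)
    output_feats = encode_all(output)

    # content loss: features of the decoded image should match the AdaIN target
    loss_c = F.mse_loss(output_feats[-1], target)

    # style loss: match mean/std statistics at each encoder layer
    loss_s = 0.0
    for of, sf in zip(output_feats, style_feats):
        o_mean, o_std = calc_mean_std(of)
        s_mean, s_std = calc_mean_std(sf)
        loss_s = loss_s + F.mse_loss(o_mean, s_mean) + F.mse_loss(o_std, s_std)

    return loss_c + style_weight * loss_s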
Citation
If you find this code useful for your research, please cite the paper:
@inproceedings{huang2017adain,
title={Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization},
author={Huang, Xun and Belongie, Serge},
booktitle={ICCV},
year={2017}
}
Acknowledgement
This project is inspired by many existing style transfer methods and their open-source implementations, including:
- Image Style Transfer Using Convolutional Neural Networks, Gatys et al. [code (by Johnson)]
- Perceptual Losses for Real-Time Style Transfer and Super-Resolution, Johnson et al. [code]
- Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis, Ulyanov et al. [code]
- A Learned Representation For Artistic Style, Dumoulin et al. [code]
- Fast Patch-based Style Transfer of Arbitrary Style, Chen and Schmidt [code]
- Controlling Perceptual Factors in Neural Style Transfer, Gatys et al. [code]
Contact
If you have any questions or suggestions about the paper, feel free to reach me (xh258@cornell.edu).