deep-photo-styletransfer
Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
Top Related Projects
Torch implementation of neural style algorithm
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Style transfer, deep learning, feature transform
The source code of 'Visual Attribute Transfer through Deep Image Analogy'.
Image-to-Image Translation in PyTorch
An implementation of "A Neural Algorithm of Artistic Style" by L. Gatys, A. Ecker, and M. Bethge. http://arxiv.org/abs/1508.06576.
Quick Overview
The deep-photo-styletransfer repository is an implementation of the paper "Deep Photo Style Transfer" by Luan et al. It provides a method for transferring the style of a reference image to a content image while preserving the photorealism of the content. This project aims to produce high-quality style transfers for photographs, maintaining the original image's structure and details.
Pros
- Produces highly realistic and visually appealing style transfers
- Preserves the content image's structure and details better than traditional style transfer methods
- Includes pre-trained models for easier implementation
- Offers flexibility in adjusting style transfer parameters
Cons
- Requires significant computational resources and time for processing
- Limited to photorealistic style transfers; it may not work well with abstract or non-photographic styles
- Dependency on specific versions of libraries and CUDA, which may cause compatibility issues
- Lacks recent updates or maintenance (last updated in 2017)
Code Examples
-- Example 1: Loading content and style images
require 'image'  -- Torch image-loading package

local content_image = image.load('examples/input/in3.png', 3)  -- load as 3-channel RGB
local style_image = image.load('examples/style/tar3.png', 3)
This code loads the content and style images (sample files shipped in examples/) for processing.
-- Example 2: Setting up optimization parameters
local num_iterations = 1000  -- number of optimization steps
local content_weight = 10    -- weight of the content-preservation term
local style_weight = 1       -- weight of the style-matching term
Here, we set up basic parameters for the style transfer optimization; the repository's command-line scripts expose comparable weights and iteration counts as flags.
-- Example 3: Running the style transfer
local output = styletransfer(content_image, style_image, content_weight, style_weight, num_iterations)
image.save('output.png', output)
This snippet performs the transfer and saves the output image. Note that styletransfer is an illustrative wrapper, not a function exported by the repository, which is driven from the command line (see Getting Started).
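For intuition, here is a minimal sketch of the L-BFGS loop such a wrapper would run, in the spirit of Justin Johnson's neural-style, on which this Torch implementation is based (net, content_losses, and style_losses are assumed to be set up as in that code; this is not the repository's actual code):
require 'optim'

local img = content_image:clone()  -- optimize the image itself, starting from the content
local y = net:forward(img)
local dy = y.new(#y):zero()        -- dummy top gradient; real gradients come from the loss modules

local function feval(x)
  net:forward(x)
  local grad = net:updateGradInput(x, dy)
  local loss = 0
  for _, mod in ipairs(content_losses) do loss = loss + mod.loss end
  for _, mod in ipairs(style_losses) do loss = loss + mod.loss end
  return loss, grad:view(grad:nElement())
end

optim.lbfgs(feval, img, {maxIter = num_iterations, verbose = true})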
Getting Started
- Clone the repository:
git clone https://github.com/luanfujun/deep-photo-styletransfer.git
cd deep-photo-styletransfer
- Install dependencies:
  - CUDA 8.0
  - Torch7
  - cuDNN 5.1
  - Torch packages: image, nn, optim (typically installed through LuaRocks; see the note after this list)
- Download the VGG-19 model:
sh models/download_models.sh
- Run the style transfer:
th neuralstyle_seg.lua -content_image <content_image> -style_image <style_image> -content_seg <content_seg> -style_seg <style_seg> -index <index> -serial <out_dir>
Replace <content_image>, <style_image>, <content_seg>, <style_seg>, <index>, and <out_dir> with appropriate values for your use case.
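As noted in the dependencies step, the Torch-side packages are normally installed through LuaRocks once Torch7 itself is set up (assuming a standard Torch distribution; these are the stock package names):
luarocks install image
luarocks install nn
luarocks install optim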
Competitor Comparisons
Torch implementation of neural style algorithm
Pros of neural-style
- More versatile, capable of applying various artistic styles beyond photo-realistic transfers
- Simpler implementation, making it easier to understand and modify
- Wider community adoption and support
Cons of neural-style
- Less effective at preserving photorealistic details in the output
- May produce more artifacts and unrealistic textures in the stylized image
- Slower processing time for high-resolution images
Code Comparison
neural-style:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers or {21}
local style_layers = params.style_layers or {2, 7, 12, 17, 22}
deep-photo-styletransfer:
content_image = imread(content_image_path);
style_image = imread(style_image_path);
content_seg = imread(content_seg_path);
style_seg = imread(style_seg_path);
The code snippets show that neural-style is written in Lua and focuses on specifying content and style layers, while the deep-photo-styletransfer excerpt comes from its MATLAB preprocessing stage and loads segmentation maps for more precise control over the style transfer (the transfer itself also runs in Lua/Torch; MATLAB is used to prepare the matting Laplacian).
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Pros of fast-style-transfer
- Significantly faster processing time, allowing for real-time style transfer
- Supports video style transfer
- Easier to set up and use, with fewer dependencies
Cons of fast-style-transfer
- Generally lower quality results compared to deep-photo-styletransfer
- Less control over the style transfer process
- Limited to pre-trained styles, while deep-photo-styletransfer allows for more customization
Code Comparison
deep-photo-styletransfer:
net = loadjson('models/vgg19.json');
net = vl_simplenn_tidy(net);
content = imread('input/in1.png');
style = imread('style/tar1.png');
fast-style-transfer:
import tensorflow as tf
from style import stylize
content_image = tf.io.read_file('input/content.jpg')
stylized_image = stylize(content_image, 'models/wave.ckpt')
The code snippets show that deep-photo-styletransfer uses MATLAB and requires more setup, while fast-style-transfer uses Python and TensorFlow, offering a more streamlined approach. deep-photo-styletransfer loads both content and style images separately, allowing for more flexibility, whereas fast-style-transfer uses pre-trained style models for quicker processing.
Style transfer, deep learning, feature transform
Pros of FastPhotoStyle
- Significantly faster processing time, enabling real-time style transfer
- Improved preservation of content structure and details
- Support for high-resolution images
Cons of FastPhotoStyle
- May produce less artistic results compared to Deep Photo Style Transfer
- Requires more computational resources (GPU) for optimal performance
Code Comparison
Deep Photo Style Transfer:
def wct_core(cont_feat, styl_feat):
    cFSize = cont_feat.size()
    c_mean = torch.mean(cont_feat, 1)
    c_mean = c_mean.unsqueeze(1).expand_as(cont_feat)
    cont_feat = cont_feat - c_mean
FastPhotoStyle:
def wct_core(content, style, alpha=1.0):
    c_c, c_h, c_w = content.size(0), content.size(1), content.size(2)
    content = content.view(c_c, -1)
    c_mean = torch.mean(content, 1, keepdim=True)
    content = content - c_mean
Both snippets center the content features, the first step of a Whitening and Coloring Transform (WCT); FastPhotoStyle's implementation is optimized for speed and efficiency. Note that WCT is the heart of FastPhotoStyle's feed-forward approach, whereas deep-photo-styletransfer's published method is optimization-based, regularized by a matting Laplacian rather than driven by WCT.
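For context, after centering, WCT decorrelates the content features with the inverse square root of their covariance before recoloring them with the style's statistics. A minimal Torch sketch of that whitening half, assuming f is an already-centered C x (H*W) feature matrix (illustrative, taken from neither repository):
local function whiten(f)
  local cov = torch.mm(f, f:t()):div(f:size(2) - 1)  -- C x C feature covariance
  local e, V = torch.symeig(cov, 'V')                -- eigendecomposition of cov
  local d = e:clamp(1e-8, math.huge):pow(-0.5)       -- D^(-1/2), guarded against zeros
  return V * torch.diag(d) * V:t() * f               -- decorrelated (whitened) features
end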
The source code of 'Visual Attribute Transfer through Deep Image Analogy'.
Pros of Deep-Image-Analogy
- Focuses on semantic-aware image analogy, allowing for more precise style transfer between semantically similar regions
- Supports bidirectional style transfer, enabling both A→B and B→A transformations
- Utilizes a coarse-to-fine approach, resulting in more detailed and refined outputs
Cons of Deep-Image-Analogy
- May require more computational resources due to its complex architecture and bidirectional processing
- Limited to transferring styles between images with similar semantic content, potentially reducing versatility
- Might produce less artistic or abstract results compared to deep-photo-styletransfer
Code Comparison
Deep-Image-Analogy:
[nn_field_A2B, nn_field_B2A] = NNF_Search(A_feats, B_feats, params);
[A_prime, B_prime] = Reconstruct(A, B, nn_field_A2B, nn_field_B2A);
deep-photo-styletransfer:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local output_image = stylize(content_image, style_image, params)
The code snippets highlight the different approaches: Deep-Image-Analogy focuses on bidirectional nearest neighbor field search and reconstruction, while deep-photo-styletransfer uses a more straightforward stylization process.
Image-to-Image Translation in PyTorch
Pros of pytorch-CycleGAN-and-pix2pix
- Supports multiple image-to-image translation tasks, including style transfer, object transfiguration, and season transfer
- Implements both CycleGAN and pix2pix models, offering more flexibility for different use cases
- Built with PyTorch, providing easier customization and integration with modern deep learning workflows
Cons of pytorch-CycleGAN-and-pix2pix
- May produce less photorealistic results for style transfer compared to deep-photo-styletransfer
- Requires paired datasets for pix2pix, which can be more challenging to obtain than single style images
Code Comparison
deep-photo-styletransfer (Lua/Torch):
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers
local style_layers = params.style_layers
pytorch-CycleGAN-and-pix2pix (Python/PyTorch):
class CycleGANModel(BaseModel):
    def __init__(self, opt):
        BaseModel.__init__(self, opt)
        self.loss_names = ['D_A', 'G_A', 'cycle_A', 'idt_A', 'D_B', 'G_B', 'cycle_B', 'idt_B']
        self.visual_names = ['real_A', 'fake_B', 'rec_A', 'real_B', 'fake_A', 'rec_B']
An implementation of "A Neural Algorithm of Artistic Style" by L. Gatys, A. Ecker, and M. Bethge. http://arxiv.org/abs/1508.06576.
Pros of style-transfer
- Simpler implementation, making it easier to understand and modify
- Faster processing time for style transfer
- More flexible in terms of input image types and styles
Cons of style-transfer
- Less photorealistic results compared to deep-photo-styletransfer
- May produce artifacts or distortions in the output image
- Limited control over specific style transfer parameters
Code Comparison
deep-photo-styletransfer:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers
local style_layers = params.style_layers
style-transfer:
content_image = scipy.misc.imread(args.content_image)
style_image = scipy.misc.imread(args.style_image)
content_layer = 'conv4_2'
style_layers = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
The code snippets show differences in image loading and layer selection. deep-photo-styletransfer uses Lua and allows for more customizable layer selection, while style-transfer uses Python and has a simpler, predefined layer structure.
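For context, the style layers listed above are compared through Gram matrices of their feature maps in Gatys-style methods, which both of these projects descend from. A minimal Torch sketch, assuming feat is a C x H x W feature map (illustrative, taken from neither repository):
local function gram(feat)
  local C = feat:size(1)
  local f = feat:view(C, -1)                    -- flatten to C x (H*W)
  return torch.mm(f, f:t()):div(f:nElement())   -- channel co-activation statistics
end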
README
deep-photo-styletransfer
Code and data for paper "Deep Photo Style Transfer"
Disclaimer
This software is published for academic and non-commercial use only.
Setup
This code is based on Torch. It has been tested on Ubuntu 14.04 LTS.
Dependencies: Torch7 and the packages listed under Getting Started above.
CUDA backend: CUDA and cuDNN.
Download VGG-19:
sh models/download_models.sh
Compile cuda_utils.cu (adjust PREFIX and NVCC_PREFIX in the makefile for your machine):
make clean && make
Usage
Quick start
To generate all results (in examples/) using the provided scripts, simply run
run('gen_laplacian/gen_laplacian.m')
in Matlab or Octave, and then
python gen_all.py
in Python. The final output will be in examples/final_results/.
Basic usage
- Given input and style images with semantic segmentation masks, put them in examples/ respectively. They will have the following filename form: examples/input/in<id>.png, examples/style/tar<id>.png, and examples/segmentation/in<id>.png, examples/segmentation/tar<id>.png;
- Compute the matting Laplacian matrix using gen_laplacian/gen_laplacian.m in Matlab. The output matrix will have the following filename form: gen_laplacian/Input_Laplacian_3x3_1e-7_CSR<id>.mat;
Note: Please make sure that the content image resolution is consistent between the matting Laplacian computation in Matlab and the style transfer in Torch, otherwise the result won't be correct.
- Run the following script to generate the segmented intermediate result:
th neuralstyle_seg.lua -content_image <input> -style_image <style> -content_seg <inputMask> -style_seg <styleMask> -index <id> -serial <intermediate_folder>
- Run the following script to generate the final result (a filled-in example invocation follows this list):
th deepmatting_seg.lua -content_image <input> -style_image <style> -content_seg <inputMask> -style_seg <styleMask> -index <id> -init_image <intermediate_folder/out<id>_t_1000.png> -serial <final_folder> -f_radius 15 -f_edge 0.01
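For example, with the sample pair used in the code examples above (<id> = 3; the output folder names here are illustrative):
th neuralstyle_seg.lua -content_image examples/input/in3.png -style_image examples/style/tar3.png -content_seg examples/segmentation/in3.png -style_seg examples/segmentation/tar3.png -index 3 -serial examples/tmp_results
th deepmatting_seg.lua -content_image examples/input/in3.png -style_image examples/style/tar3.png -content_seg examples/segmentation/in3.png -style_seg examples/segmentation/tar3.png -index 3 -init_image examples/tmp_results/out3_t_1000.png -serial examples/final_results -f_radius 15 -f_edge 0.01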
You can pass -backend cudnn and -cudnn_autotune to both Lua scripts (steps 3 and 4) to potentially improve speed and memory usage. libcudnn.so must be in your LD_LIBRARY_PATH. This requires cudnn.torch.
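For instance, step 3 with the cuDNN backend enabled becomes:
th neuralstyle_seg.lua -content_image <input> -style_image <style> -content_seg <inputMask> -style_seg <styleMask> -index <id> -serial <intermediate_folder> -backend cudnn -cudnn_autotune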
Image segmentation
Note: In the main paper we generate all comparison results using automatic scene segmentation algorithm modified from DilatedNet. Manual segmentation enables more diverse tasks hence we provide the masks in examples/segmentation/
.
The mask colors we used (you could add more colors in the ExtractMask function in the two *.lua files):
| Color variable | RGB Value | Hex Value |
|---|---|---|
| blue | 0 0 255 | 0000ff |
| green | 0 255 0 | 00ff00 |
| black | 0 0 0 | 000000 |
| white | 255 255 255 | ffffff |
| red | 255 0 0 | ff0000 |
| yellow | 255 255 0 | ffff00 |
| grey | 128 128 128 | 808080 |
| lightblue | 0 255 255 | 00ffff |
| purple | 255 0 255 | ff00ff |
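The repository's actual ExtractMask lives in the two Lua scripts; as an illustration of how such a color-keyed mask can be pulled out of a segmentation image, here is a minimal hypothetical Torch sketch (extract_color_mask is not the repository's function):
require 'image'
require 'torch'

-- Build a binary H x W mask of the pixels whose color matches `color`
-- ({r, g, b} in 0..255); `seg` is a 3 x H x W image in [0,1] from image.load.
local function extract_color_mask(seg, color)
  local mask = torch.ByteTensor(seg:size(2), seg:size(3)):fill(1)
  for c = 1, 3 do
    mask:cmul(torch.lt(torch.abs(seg[c] - color[c] / 255), 0.1))
  end
  return mask:double()
end

-- e.g. the "blue" region of the sample segmentation mask:
local seg = image.load('examples/segmentation/in3.png', 3)
local blue_mask = extract_color_mask(seg, {0, 0, 255})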
Here are some automatic and manual tools for creating a segmentation mask for a photo image:
Automatic:
- MIT Scene Parsing
- SuperParsing
- Nonparametric Scene Parsing
- Berkeley Contour Detection and Image Segmentation Resources
- CRF-RNN for Semantic Image Segmentation
- Selective Search
- DeepLab-TensorFlow
Manual:
- Photoshop Quick Selection Tool
- GIMP Selection Tool
- GIMP G'MIC Interactive Foreground Extraction tool
Examples
Here are some results from our algorithm (from left to right: input, style, and our output; see the images in the repository README).
Acknowledgement
- Our Torch implementation is based on Justin Johnson's code;
- We use Anat Levin's Matlab code to compute the matting Laplacian matrix.
Citation
If you find this work useful for your research, please cite:
@article{luan2017deep,
title={Deep Photo Style Transfer},
author={Luan, Fujun and Paris, Sylvain and Shechtman, Eli and Bala, Kavita},
journal={arXiv preprint arXiv:1703.07511},
year={2017}
}
Contact
Feel free to contact me if there are any questions (Fujun Luan, fl356@cornell.edu).