deep-photo-styletransfer
Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
Top Related Projects
Torch implementation of neural style algorithm
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Style transfer, deep learning, feature transform
The source code of 'Visual Attribute Transfer through Deep Image Analogy'.
Image-to-Image Translation in PyTorch
An implementation of "A Neural Algorithm of Artistic Style" by L. Gatys, A. Ecker, and M. Bethge. http://arxiv.org/abs/1508.06576.
Quick Overview
The deep-photo-styletransfer repository is an implementation of the paper "Deep Photo Style Transfer" by Luan et al. It provides a method for transferring the style of a reference image to a content image while preserving the photorealism of the content. This project aims to produce high-quality style transfers for photographs, maintaining the original image's structure and details.
Pros
- Produces highly realistic and visually appealing style transfers
- Preserves the content image's structure and details better than traditional style transfer methods
- Includes pre-trained models for easier implementation
- Offers flexibility in adjusting style transfer parameters
Cons
- Requires significant computational resources and time for processing
- Limited to photorealistic style transfers; it may not work well with abstract or non-photographic styles
- Dependency on specific versions of libraries and CUDA, which may cause compatibility issues
- Lacks recent updates or maintenance (last updated in 2017)
Code Examples
-- Example 1: Loading content and style images
require 'image'  -- Torch image-loading package

local content_image = image.load('examples/input/in3.png', 3)  -- load as 3-channel RGB
local style_image = image.load('examples/style/tar3.png', 3)
This code loads the content and style images (sample files shipped in examples/) for processing.
-- Example 2: Setting up optimization parameters
local num_iterations = 1000  -- number of optimization steps
local content_weight = 10    -- weight of the content-preservation term
local style_weight = 1       -- weight of the style-matching term
Here, we set up basic parameters for the style transfer optimization; the repository's command-line scripts expose comparable weights and iteration counts as flags.
-- Example 3: Running the style transfer
local output = styletransfer(content_image, style_image, content_weight, style_weight, num_iterations)
image.save('output.png', output)
This snippet performs the transfer and saves the output image. Note that styletransfer is an illustrative wrapper, not a function exported by the repository, which is driven from the command line (see Getting Started).
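For intuition, here is a minimal sketch of the L-BFGS loop such a wrapper would run, in the spirit of Justin Johnson's neural-style, on which this Torch implementation is based (net, content_losses, and style_losses are assumed to be set up as in that code; this is not the repository's actual code):
require 'optim'

local img = content_image:clone()  -- optimize the image itself, starting from the content
local y = net:forward(img)
local dy = y.new(#y):zero()        -- dummy top gradient; real gradients come from the loss modules

local function feval(x)
  net:forward(x)
  local grad = net:updateGradInput(x, dy)
  local loss = 0
  for _, mod in ipairs(content_losses) do loss = loss + mod.loss end
  for _, mod in ipairs(style_losses) do loss = loss + mod.loss end
  return loss, grad:view(grad:nElement())
end

optim.lbfgs(feval, img, {maxIter = num_iterations, verbose = true})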
Getting Started
- Clone the repository:
git clone https://github.com/luanfujun/deep-photo-styletransfer.git
cd deep-photo-styletransfer
- Install dependencies:
  - CUDA 8.0
  - Torch7
  - cuDNN 5.1
  - Torch packages: image, nn, optim (typically installed through LuaRocks; see the note after this list)
- Download the VGG-19 model:
sh models/download_models.sh
- Run the style transfer:
th neuralstyle_seg.lua -content_image <content_image> -style_image <style_image> -content_seg <content_seg> -style_seg <style_seg> -index <index> -serial <out_dir>
Replace <content_image>, <style_image>, <content_seg>, <style_seg>, <index>, and <out_dir> with appropriate values for your use case.
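As noted in the dependencies step, the Torch-side packages are normally installed through LuaRocks once Torch7 itself is set up (assuming a standard Torch distribution; these are the stock package names):
luarocks install image
luarocks install nn
luarocks install optim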
Competitor Comparisons
Torch implementation of neural style algorithm
Pros of neural-style
- More versatile, capable of applying various artistic styles beyond photo-realistic transfers
- Simpler implementation, making it easier to understand and modify
- Wider community adoption and support
Cons of neural-style
- Less effective at preserving photorealistic details in the output
- May produce more artifacts and unrealistic textures in the stylized image
- Slower processing time for high-resolution images
Code Comparison
neural-style:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers or {21}
local style_layers = params.style_layers or {2, 7, 12, 17, 22}
deep-photo-styletransfer:
content_image = imread(content_image_path);
style_image = imread(style_image_path);
content_seg = imread(content_seg_path);
style_seg = imread(style_seg_path);
The code snippets show that neural-style is written in Lua and focuses on specifying content and style layers, while the deep-photo-styletransfer excerpt comes from its MATLAB preprocessing stage and loads segmentation maps for more precise control over the style transfer (the transfer itself also runs in Lua/Torch; MATLAB is used to prepare the matting Laplacian).
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Pros of fast-style-transfer
- Significantly faster processing time, allowing for real-time style transfer
- Supports video style transfer
- Easier to set up and use, with fewer dependencies
Cons of fast-style-transfer
- Generally lower quality results compared to deep-photo-styletransfer
- Less control over the style transfer process
- Limited to pre-trained styles, while deep-photo-styletransfer allows for more customization
Code Comparison
deep-photo-styletransfer:
net = loadjson('models/vgg19.json');
net = vl_simplenn_tidy(net);
content = imread('input/in1.png');
style = imread('style/tar1.png');
fast-style-transfer:
import tensorflow as tf
from style import stylize
content_image = tf.io.read_file('input/content.jpg')
stylized_image = stylize(content_image, 'models/wave.ckpt')
The code snippets show that deep-photo-styletransfer uses MATLAB and requires more setup, while fast-style-transfer uses Python and TensorFlow, offering a more streamlined approach. deep-photo-styletransfer loads both content and style images separately, allowing for more flexibility, whereas fast-style-transfer uses pre-trained style models for quicker processing.
Style transfer, deep learning, feature transform
Pros of FastPhotoStyle
- Significantly faster processing time, enabling real-time style transfer
- Improved preservation of content structure and details
- Support for high-resolution images
Cons of FastPhotoStyle
- May produce less artistic results compared to Deep Photo Style Transfer
- Requires more computational resources (GPU) for optimal performance
Code Comparison
Deep Photo Style Transfer:
def wct_core(cont_feat, styl_feat):
    cFSize = cont_feat.size()
    c_mean = torch.mean(cont_feat, 1)
    c_mean = c_mean.unsqueeze(1).expand_as(cont_feat)
    cont_feat = cont_feat - c_mean
FastPhotoStyle:
def wct_core(content, style, alpha=1.0):
    c_c, c_h, c_w = content.size(0), content.size(1), content.size(2)
    content = content.view(c_c, -1)
    c_mean = torch.mean(content, 1, keepdim=True)
    content = content - c_mean
Both snippets center the content features, the first step of a Whitening and Coloring Transform (WCT); FastPhotoStyle's implementation is optimized for speed and efficiency. Note that WCT is the heart of FastPhotoStyle's feed-forward approach, whereas deep-photo-styletransfer's published method is optimization-based, regularized by a matting Laplacian rather than driven by WCT.
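For context, after centering, WCT decorrelates the content features with the inverse square root of their covariance before recoloring them with the style's statistics. A minimal Torch sketch of that whitening half, assuming f is an already-centered C x (H*W) feature matrix (illustrative, taken from neither repository):
local function whiten(f)
  local cov = torch.mm(f, f:t()):div(f:size(2) - 1)  -- C x C feature covariance
  local e, V = torch.symeig(cov, 'V')                -- eigendecomposition of cov
  local d = e:clamp(1e-8, math.huge):pow(-0.5)       -- D^(-1/2), guarded against zeros
  return V * torch.diag(d) * V:t() * f               -- decorrelated (whitened) features
end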
The source code of 'Visual Attribute Transfer through Deep Image Analogy'.
Pros of Deep-Image-Analogy
- Focuses on semantic-aware image analogy, allowing for more precise style transfer between semantically similar regions
- Supports bidirectional style transfer, enabling both A→B and B→A transformations
- Utilizes a coarse-to-fine approach, resulting in more detailed and refined outputs
Cons of Deep-Image-Analogy
- May require more computational resources due to its complex architecture and bidirectional processing
- Limited to transferring styles between images with similar semantic content, potentially reducing versatility
- Might produce less artistic or abstract results compared to deep-photo-styletransfer
Code Comparison
Deep-Image-Analogy:
[nn_field_A2B, nn_field_B2A] = NNF_Search(A_feats, B_feats, params);
[A_prime, B_prime] = Reconstruct(A, B, nn_field_A2B, nn_field_B2A);
deep-photo-styletransfer:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local output_image = stylize(content_image, style_image, params)
The code snippets highlight the different approaches: Deep-Image-Analogy focuses on bidirectional nearest neighbor field search and reconstruction, while deep-photo-styletransfer uses a more straightforward stylization process.
Image-to-Image Translation in PyTorch
Pros of pytorch-CycleGAN-and-pix2pix
- Supports multiple image-to-image translation tasks, including style transfer, object transfiguration, and season transfer
- Implements both CycleGAN and pix2pix models, offering more flexibility for different use cases
- Built with PyTorch, providing easier customization and integration with modern deep learning workflows
Cons of pytorch-CycleGAN-and-pix2pix
- May produce less photorealistic results for style transfer compared to deep-photo-styletransfer
- Requires paired datasets for pix2pix, which can be more challenging to obtain than single style images
Code Comparison
deep-photo-styletransfer (Lua/Torch):
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers
local style_layers = params.style_layers
pytorch-CycleGAN-and-pix2pix (Python/PyTorch):
class CycleGANModel(BaseModel):
    def __init__(self, opt):
        BaseModel.__init__(self, opt)
        self.loss_names = ['D_A', 'G_A', 'cycle_A', 'idt_A', 'D_B', 'G_B', 'cycle_B', 'idt_B']
        self.visual_names = ['real_A', 'fake_B', 'rec_A', 'real_B', 'fake_A', 'rec_B']
An implementation of "A Neural Algorithm of Artistic Style" by L. Gatys, A. Ecker, and M. Bethge. http://arxiv.org/abs/1508.06576.
Pros of style-transfer
- Simpler implementation, making it easier to understand and modify
- Faster processing time for style transfer
- More flexible in terms of input image types and styles
Cons of style-transfer
- Less photorealistic results compared to deep-photo-styletransfer
- May produce artifacts or distortions in the output image
- Limited control over specific style transfer parameters
Code Comparison
deep-photo-styletransfer:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers
local style_layers = params.style_layers
style-transfer:
content_image = scipy.misc.imread(args.content_image)
style_image = scipy.misc.imread(args.style_image)
content_layer = 'conv4_2'
style_layers = ['conv1_1', 'conv2_1', 'conv3_1', 'conv4_1', 'conv5_1']
The code snippets show differences in image loading and layer selection. deep-photo-styletransfer uses Lua and allows for more customizable layer selection, while style-transfer uses Python and has a simpler, predefined layer structure.
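For context, the style layers listed above are compared through Gram matrices of their feature maps in Gatys-style methods, which both of these projects descend from. A minimal Torch sketch, assuming feat is a C x H x W feature map (illustrative, taken from neither repository):
local function gram(feat)
  local C = feat:size(1)
  local f = feat:view(C, -1)                    -- flatten to C x (H*W)
  return torch.mm(f, f:t()):div(f:nElement())   -- channel co-activation statistics
end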
README
deep-photo-styletransfer
Code and data for paper "Deep Photo Style Transfer"
Disclaimer
This software is published for academic and non-commercial use only.
Setup
This code is based on Torch. It has been tested on Ubuntu 14.04 LTS.
Dependencies: Torch7 and the packages listed under Getting Started above.
CUDA backend: CUDA and cuDNN.
Download VGG-19:
sh models/download_models.sh
Compile cuda_utils.cu (adjust PREFIX and NVCC_PREFIX in the makefile for your machine):
make clean && make
Usage
Quick start
To generate all results (in examples/) using the provided scripts, simply run
run('gen_laplacian/gen_laplacian.m')
in Matlab or Octave, and then
python gen_all.py
in Python. The final output will be in examples/final_results/.
Basic usage
- Given input and style images with semantic segmentation masks, put them in examples/ respectively. They will have the following filename form: examples/input/in<id>.png, examples/style/tar<id>.png, and examples/segmentation/in<id>.png, examples/segmentation/tar<id>.png;
- Compute the matting Laplacian matrix using gen_laplacian/gen_laplacian.m in Matlab. The output matrix will have the following filename form: gen_laplacian/Input_Laplacian_3x3_1e-7_CSR<id>.mat;
Note: Please make sure that the content image resolution is consistent between the matting Laplacian computation in Matlab and the style transfer in Torch, otherwise the result won't be correct.
- Run the following script to generate the segmented intermediate result:
th neuralstyle_seg.lua -content_image <input> -style_image <style> -content_seg <inputMask> -style_seg <styleMask> -index <id> -serial <intermediate_folder>
- Run the following script to generate the final result (a filled-in example invocation follows this list):
th deepmatting_seg.lua -content_image <input> -style_image <style> -content_seg <inputMask> -style_seg <styleMask> -index <id> -init_image <intermediate_folder/out<id>_t_1000.png> -serial <final_folder> -f_radius 15 -f_edge 0.01
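For example, with the sample pair used in the code examples above (<id> = 3; the output folder names here are illustrative):
th neuralstyle_seg.lua -content_image examples/input/in3.png -style_image examples/style/tar3.png -content_seg examples/segmentation/in3.png -style_seg examples/segmentation/tar3.png -index 3 -serial examples/tmp_results
th deepmatting_seg.lua -content_image examples/input/in3.png -style_image examples/style/tar3.png -content_seg examples/segmentation/in3.png -style_seg examples/segmentation/tar3.png -index 3 -init_image examples/tmp_results/out3_t_1000.png -serial examples/final_results -f_radius 15 -f_edge 0.01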
You can pass -backend cudnn and -cudnn_autotune to both Lua scripts (steps 3 and 4) to potentially improve speed and memory usage. libcudnn.so must be in your LD_LIBRARY_PATH. This requires cudnn.torch.
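For instance, step 3 with the cuDNN backend enabled becomes:
th neuralstyle_seg.lua -content_image <input> -style_image <style> -content_seg <inputMask> -style_seg <styleMask> -index <id> -serial <intermediate_folder> -backend cudnn -cudnn_autotune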
Image segmentation
Note: In the main paper we generate all comparison results using automatic scene segmentation algorithm modified from DilatedNet. Manual segmentation enables more diverse tasks hence we provide the masks in examples/segmentation/
.
The mask colors we used (you could add more colors in the ExtractMask function in the two *.lua files):
| Color variable | RGB Value | Hex Value |
|---|---|---|
| blue | 0 0 255 | 0000ff |
| green | 0 255 0 | 00ff00 |
| black | 0 0 0 | 000000 |
| white | 255 255 255 | ffffff |
| red | 255 0 0 | ff0000 |
| yellow | 255 255 0 | ffff00 |
| grey | 128 128 128 | 808080 |
| lightblue | 0 255 255 | 00ffff |
| purple | 255 0 255 | ff00ff |
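The repository's actual ExtractMask lives in the two Lua scripts; as an illustration of how such a color-keyed mask can be pulled out of a segmentation image, here is a minimal hypothetical Torch sketch (extract_color_mask is not the repository's function):
require 'image'
require 'torch'

-- Build a binary H x W mask of the pixels whose color matches `color`
-- ({r, g, b} in 0..255); `seg` is a 3 x H x W image in [0,1] from image.load.
local function extract_color_mask(seg, color)
  local mask = torch.ByteTensor(seg:size(2), seg:size(3)):fill(1)
  for c = 1, 3 do
    mask:cmul(torch.lt(torch.abs(seg[c] - color[c] / 255), 0.1))
  end
  return mask:double()
end

-- e.g. the "blue" region of the sample segmentation mask:
local seg = image.load('examples/segmentation/in3.png', 3)
local blue_mask = extract_color_mask(seg, {0, 0, 255})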
Here are some automatic and manual tools for creating a segmentation mask for a photo image:
Automatic:
- MIT Scene Parsing
- SuperParsing
- Nonparametric Scene Parsing
- Berkeley Contour Detection and Image Segmentation Resources
- CRF-RNN for Semantic Image Segmentation
- Selective Search
- DeepLab-TensorFlow
Manual:
- Photoshop Quick Selection Tool
- GIMP Selection Tool
- GIMP G'MIC Interactive Foreground Extraction tool
Examples
Here are some results from our algorithm (from left to right: input, style, and our output; see the images in the repository README).
Acknowledgement
- Our Torch implementation is based on Justin Johnson's code;
- We use Anat Levin's Matlab code to compute the matting Laplacian matrix.
Citation
If you find this work useful for your research, please cite:
@article{luan2017deep,
title={Deep Photo Style Transfer},
author={Luan, Fujun and Paris, Sylvain and Shechtman, Eli and Bala, Kavita},
journal={arXiv preprint arXiv:1703.07511},
year={2017}
}
Contact
Feel free to contact me if there are any questions (Fujun Luan, fl356@cornell.edu).