Deep-Image-Analogy
The source code of 'Visual Attribute Transfer through Deep Image Analogy'.
Top Related Projects
Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
Style transfer, deep learning, feature transform
An implementation of "A Neural Algorithm of Artistic Style" by L. Gatys, A. Ecker, and M. Bethge. http://arxiv.org/abs/1508.06576.
Torch implementation of neural style algorithm
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Quick Overview
Deep Image Analogy is a project that explores visual attribute transfer across images using deep neural networks. It aims to establish semantically meaningful dense correspondences between two input images, allowing for the transfer of visual attributes from one image to another while preserving structure and style.
Pros
- Enables high-quality visual attribute transfer between images
- Preserves structural integrity and style of the target image
- Utilizes deep neural networks for improved semantic understanding
- Applicable to various image editing and manipulation tasks
Cons
- Requires significant computational resources
- Limited documentation and user guides
- May struggle with complex or highly dissimilar image pairs
- Not actively maintained (last update was in 2017)
Code Examples
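Note: the official implementation is C++ with CUDA (see the README further down). The snippets below are illustrative Python pseudocode against a hypothetical DeepImageAnalogy wrapper; load_image, save_image, and combine_results are assumed helpers, not an API shipped by the repository.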
# Load input images
source_img = load_image('source.jpg')
target_img = load_image('target.jpg')
# Initialize Deep Image Analogy model
dia_model = DeepImageAnalogy()
# Perform visual attribute transfer
result_img = dia_model.transfer_attributes(source_img, target_img)
# Save the result
save_image(result_img, 'result.jpg')
# Fine-tune the transfer process
dia_model.set_parameters(
    num_iterations=100,
    patch_size=3,
    gpu_id=0
)
result_img = dia_model.transfer_attributes(source_img, target_img)
# Perform bidirectional transfer
forward_result = dia_model.transfer_attributes(source_img, target_img)
backward_result = dia_model.transfer_attributes(target_img, source_img)
# Combine results
final_result = combine_results(forward_result, backward_result)
Getting Started
- Clone the repository:
  git clone https://github.com/msracver/Deep-Image-Analogy.git
  cd Deep-Image-Analogy
- Install dependencies:
  pip install -r requirements.txt
- Compile the CUDA kernels:
  cd source
  make
- Run the example:
  python demo.py --source path/to/source.jpg --target path/to/target.jpg
Note: This project requires a CUDA-capable GPU and a matching CUDA toolkit installation. The official build instructions (Windows, Visual Studio, Caffe) appear in the README section below.
Competitor Comparisons
Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
Pros of deep-photo-styletransfer
- Focuses specifically on photorealistic style transfer
- Includes a segmentation-aware loss for improved results
- Provides pre-trained models for easier implementation
Cons of deep-photo-styletransfer
- Limited to style transfer applications
- Requires more computational resources for high-quality results
- Less versatile in terms of image manipulation capabilities
Code Comparison
Deep-Image-Analogy:
[nn_field_X2Y, nn_field_Y2X] = NNF_Search(X, Y, param);
X_NN = warp(Y, nn_field_X2Y);
Y_NN = warp(X, nn_field_Y2X);
deep-photo-styletransfer:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local opt_img = optim.lbfgs(feval, img, optim_state)
The Deep-Image-Analogy code focuses on nearest neighbor field search and warping, while deep-photo-styletransfer loads content and style images, then optimizes the output using LBFGS. Deep-Image-Analogy offers more flexibility for various image analogy tasks, whereas deep-photo-styletransfer is tailored specifically for photorealistic style transfer.
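To make the warp step concrete, here is a minimal NumPy sketch of warping an image by a nearest-neighbor field. This is illustrative code, not taken from either repository; it assumes nn_field stores, for each output pixel, integer (row, col) coordinates into Y:

import numpy as np

def warp(Y, nn_field):
    # Y: (H, W, C) image to sample from.
    # nn_field: (H, W, 2) integer matches; nn_field[i, j] = (row, col) in Y.
    rows = nn_field[..., 0]
    cols = nn_field[..., 1]
    return Y[rows, cols]  # advanced indexing gathers the matched pixels

Calling it as warp(Y, nn_field_X2Y) mirrors the MATLAB-style snippet above.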
Style transfer, deep learning, feature transform
Pros of FastPhotoStyle
- Faster processing time for style transfer
- Supports high-resolution images
- Provides more photorealistic results
Cons of FastPhotoStyle
- Limited to photorealistic style transfer
- Requires more computational resources
- Less flexibility in artistic style manipulation
Code Comparison
FastPhotoStyle:
from photo_wct import PhotoWCT
p_wct = PhotoWCT()
p_wct.load_state_dict(torch.load('models/photo_wct.pth'))
p_wct.cuda(0)
Deep-Image-Analogy:
net = caffe.Net('models/VGG19/VGG_ILSVRC_19_layers_deploy.prototxt', ...
'models/VGG19/VGG_ILSVRC_19_layers.caffemodel', 'test');
[img_A, img_B] = deep_image_analogy(net, A, B, params);
FastPhotoStyle focuses on photorealistic style transfer using PyTorch, while Deep-Image-Analogy offers more versatile image analogies built on Caffe (the official implementation is C++/CUDA). FastPhotoStyle is generally faster and supports higher resolutions, but Deep-Image-Analogy provides more flexibility in artistic style manipulation. The choice between the two depends on the specific use case and desired output style.
An implementation of "A Neural Algorithm of Artistic Style" by L. Gatys, A. Ecker, and M. Bethge. http://arxiv.org/abs/1508.06576.
Pros of style-transfer
- Simpler implementation, making it easier to understand and modify
- Faster execution time for style transfer tasks
- More flexible in terms of input image requirements
Cons of style-transfer
- Less precise in preserving content details during style transfer
- Limited ability to handle complex style patterns
- May produce less visually appealing results for certain image combinations
Code Comparison
style-transfer:
def style_transfer(content_image, style_image, iterations=1000):
    content_features = extract_features(content_image)
    style_features = extract_features(style_image)
    target = content_image.clone()
    for i in range(iterations):
        optimize_image(target, content_features, style_features)
    return target
Deep-Image-Analogy:
def deep_image_analogy(A, Ap, B):
    features_A = extract_features(A)
    features_Ap = extract_features(Ap)
    features_B = extract_features(B)
    for level in reversed(range(num_levels)):
        nn_field = compute_nearest_neighbors(features_A[level], features_B[level])
        features_B[level] = reconstruct_features(features_Ap[level], nn_field)
    return reconstruct_image(features_B)
The code snippets illustrate the different approaches: style-transfer uses an iterative optimization process, while Deep-Image-Analogy employs a multi-level feature matching and reconstruction technique.
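To give a rough idea of what compute_nearest_neighbors does, a brute-force NumPy version might look like the following. This is an illustrative sketch only; real implementations match patches rather than single feature vectors and use PatchMatch-style search for speed:

import numpy as np

def compute_nearest_neighbors(feat_A, feat_B):
    # feat_A, feat_B: (H, W, C) feature maps of the same shape.
    # Returns an (H, W, 2) field mapping each position in A to its
    # closest feature vector in B under squared L2 distance.
    H, W, C = feat_A.shape
    A = feat_A.reshape(-1, C)
    B = feat_B.reshape(-1, C)
    # Pairwise squared distances: ||a||^2 - 2 a.b + ||b||^2
    d = (A**2).sum(1)[:, None] - 2 * A @ B.T + (B**2).sum(1)[None, :]
    idx = d.argmin(axis=1)  # best match in B for every position in A
    return np.stack(np.unravel_index(idx, (H, W)), axis=-1).reshape(H, W, 2)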
Torch implementation of neural style algorithm
Pros of neural-style
- More widely adopted and actively maintained
- Supports a broader range of style transfer techniques
- Extensive documentation and examples available
Cons of neural-style
- Slower processing time for high-resolution images
- Requires more computational resources
- Less precise in preserving structural details of the content image
Code Comparison
neural-style:
local content_image = image.load(params.content_image, 3)
local style_image = image.load(params.style_image, 3)
local content_layers = params.content_layers or {21}
local style_layers = params.style_layers or {2,7,12,17,22}
Deep-Image-Analogy:
cv::Mat content_img = cv::imread(content_path);
cv::Mat style_img = cv::imread(style_path);
DeepAnalogy deep_analogy(content_img, style_img);
deep_analogy.run();
The code snippets show that neural-style uses Lua and focuses on layer selection, while Deep-Image-Analogy uses C++ and provides a more straightforward API for image processing.
TensorFlow CNN for fast style transfer ⚡🖥🎨🖼
Pros of Fast-Style-Transfer
- Faster processing time for real-time style transfer
- Supports video style transfer
- Easier to set up and use with pre-trained models
Cons of Fast-Style-Transfer
- Limited to predefined styles, less flexibility
- May produce less detailed or accurate results
- Requires separate training for each new style
Code Comparison
Deep-Image-Analogy:
def compute_feature_statistics(features):
    mean = np.mean(features, axis=(0, 2, 3), keepdims=True)
    std = np.std(features, axis=(0, 2, 3), keepdims=True)
    return mean, std
Fast-Style-Transfer:
def _conv_layer(net, num_filters, filter_size, strides, relu=True):
    weights_init = _conv_init_vars(net, num_filters, filter_size)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d(net, weights_init, strides_shape, padding='SAME')
    net = _instance_norm(net)
    if relu:
        net = tf.nn.relu(net)
    return net
The code snippets show different approaches: Deep-Image-Analogy focuses on feature statistics, while Fast-Style-Transfer implements convolutional layers for neural style transfer.
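For context, statistics like these (means and standard deviations over the batch and spatial axes of NCHW feature maps) are typically used to normalize features before matching. A hypothetical follow-up using the helper above, not code from the repository:

def normalize_features(features, eps=1e-5):
    # Channel-wise whitening with the mean/std helper shown above.
    mean, std = compute_feature_statistics(features)  # each has shape (1, C, 1, 1)
    return (features - mean) / (std + eps)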
README
Deep Image Analogy
The major contributors of this repository include Jing Liao, Yuan Yao, Lu Yuan, Gang Hua and Sing Bing Kang at Microsoft Research.
Introduction
Deep Image Analogy is a technique to find semantically-meaningful dense correspondences between two input images. It adapts the notion of image analogy with features extracted from a Deep Convolutional Neural Network.
Deep Image Analogy was initially described in a SIGGRAPH 2017 paper.
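At a high level, the method matches deep features coarse-to-fine: VGG-19 features of the two images are matched with a PatchMatch-style search at the coarsest level, and the resulting nearest-neighbor field warm-starts the next, finer level. The Python sketch below is a simplified illustration of that flow (the actual implementation is C++/CUDA, maintains bidirectional fields, and blends features across levels; every function name here is hypothetical):

def deep_image_analogy(A, BP, extract_features, num_levels=5):
    feats_A = extract_features(A)     # feature pyramid, index 0 = coarsest
    feats_BP = extract_features(BP)
    nn_field = None
    for level in range(num_levels):
        if nn_field is not None:
            nn_field = upsample(nn_field)  # warm-start the finer level
        nn_field = patchmatch(feats_A[level], feats_BP[level], init=nn_field)
    return warp(BP, nn_field)              # reconstruct A's analogy from BP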
Disclaimer
This is the official C++/CUDA implementation of Deep Image Analogy. It is worth noting that:
- The code is based on Caffe.
- The code has only been tested on Windows 10 and Windows Server 2012 R2, with CUDA 8 or 7.5.
- The code has only been tested on a few Nvidia GPUs: Titan X, Titan Z, K40, and GTX 770.
- The input image size is limited; it should generally not be larger than 700x500 if you use 1.0 for the ratio parameter.
License
© Microsoft, 2017. Licensed under an MIT license.
Citation
If you find Deep Image Analogy (including deep patchmatch) helpful for your research, please consider citing:
@article{Liao:2017:VAT:3072959.3073683,
author = {Liao, Jing and Yao, Yuan and Yuan, Lu and Hua, Gang and Kang, Sing Bing},
title = {Visual Attribute Transfer Through Deep Image Analogy},
journal = {ACM Trans. Graph.},
issue_date = {July 2017},
volume = {36},
number = {4},
month = jul,
year = {2017},
issn = {0730-0301},
pages = {120:1--120:15},
articleno = {120},
numpages = {15},
url = {http://doi.acm.org/10.1145/3072959.3073683},
doi = {10.1145/3072959.3073683},
acmid = {3073683},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {deep matching, image analogy, transfer},
}
Application
Photo to Style
One major application of our code is to transfer the style from a painting to a photo.
Style to Style
It can also swap the styles between two artworks.
Style to Photo
The most challenging application is converting a sketch or a painting to a photo.
Photo to Photo
It can do color transfer between two photos, such as generating time-lapse sequences.
Getting Started
Prerequisite
- Windows 7/8/10 (Linux and macOS users, please check the linux branch.)
- CUDA 8 or 7.5
- Visual Studio 2013
Build
- Build Caffe first. Just follow the tutorial here.
- Edit deep_image_analogy.vcxproj under windows/deep_image_analogy so that the CUDA version in it matches yours.
- Open the Caffe solution and add the deep_image_analogy project.
- Build the deep_image_analogy project.
Download models
You need to download the VGG-19 model before running a demo. Go to the windows/deep_image_analogy/models/vgg19/ folder and download the required model files.
Demo
Open main.cpp
in windows/deep_image_analogy/source/
to see how to run a demo. You need to set several parameters which have been mentioned in the paper. To be more specific, you need to set
- path_model, where the VGG-19 model is.
- path_A, the input image A.
- path_BP, the input image BP.
- path_output, the output path.
- GPU Number, GPU ID you want to run this experiment.
- Ratio, the ratio to resize the inputs before sending them into the network.
- Blend Weight, the level of weights in blending process.
- Flag of WLS Filter, if you are trying to do photo style transfer, we recommend to switch this on to keep the structure of original photo.
Direct Run
We also provide a pre-built executable in the windows/deep_image_analogy/exe/ folder; don't hesitate to try it.
To run deep_image_analogy.exe, use a command line such as:
deep_image_analogy.exe ../models/ ../demo/content.png ../demo/style.png ../demo/output/ 0 0.5 2 0
which means:
- path_model = ../models/
- path_A = ../demo/content.png
- path_BP = ../demo/style.png
- path_output = ../demo/output/
- GPU Number = 0
- Ratio = 0.5
- Blend Weight = 2
- Flag of WLS Filter = 0 (0: WLS filter disabled, 1: WLS filter enabled; only required for the photo-to-photo case)
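If you script many runs, a small wrapper can assemble this command line. The following Python sketch is a hypothetical convenience (argument order exactly as documented above), not part of the repository:

import subprocess

def run_deep_image_analogy(exe, model_dir, img_a, img_bp, out_dir,
                           gpu=0, ratio=0.5, blend_weight=2, wls=0):
    # Invoke the pre-built executable with its documented positional arguments.
    cmd = [exe, model_dir, img_a, img_bp, out_dir,
           str(gpu), str(ratio), str(blend_weight), str(wls)]
    subprocess.run(cmd, check=True)

# Mirrors the example command above:
run_deep_image_analogy("deep_image_analogy.exe", "../models/",
                       "../demo/content.png", "../demo/style.png",
                       "../demo/output/")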
Tips
- We often test images of size 600x400 and 448x448.
- We set ratio to 1.0 by default. For face (portrait) cases in particular, we find that ratio = 0.5 often improves the results.
- Blend Weight controls the appearance of the result. To make the result look more like the original content photo, increase it; to make it more faithful to the style, reduce it.
- For the four applications, our settings are usually (but not always) as follows (also captured in the snippet after this list):
  - Photo to Style: blend weight = 3, ratio = 0.5 for faces and ratio = 1 for other cases.
  - Style to Style: blend weight = 3, ratio = 1.
  - Style to Photo: blend weight = 2, ratio = 0.5.
  - Photo to Photo: blend weight = 3, ratio = 1.
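For scripting, the recommended defaults above could be captured in a small lookup table. This is a hypothetical convenience, not part of the repository:

# (blend_weight, ratio) presets per application, per the tips above.
DIA_PRESETS = {
    "photo_to_style": {"blend_weight": 3, "ratio": 1.0},  # use ratio 0.5 for faces
    "style_to_style": {"blend_weight": 3, "ratio": 1.0},
    "style_to_photo": {"blend_weight": 2, "ratio": 0.5},
    "photo_to_photo": {"blend_weight": 3, "ratio": 1.0},
}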
Acknowledgments
Our code makes use of Eigen, PatchMatch, CudaLBFGS, and Caffe. We also thank the authors of our image and style examples, but we do not own their copyrights.