Top Related Projects
- fast-neural-style: Feedforward style transfer
- neural-style: Neural style in TensorFlow! 🎨
- deep-photo-styletransfer: Code and data for the paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
- FastPhotoStyle: Style transfer, deep learning, feature transform
- Magenta: Music and Art Generation with Machine Intelligence
- pytorch-CycleGAN-and-pix2pix: Image-to-Image Translation in PyTorch
Quick Overview
Fast Style Transfer is a TensorFlow implementation of fast style transfer, allowing users to apply artistic styles to images and videos quickly. It uses a feed-forward neural network to generate stylized images in real-time, making it suitable for both image and video processing.
Pros
- Fast processing: Can generate stylized images in real-time
- Supports both image and video style transfer
- Pre-trained models available for various artistic styles
- Easy to use with clear documentation and examples
Cons
- Requires specific versions of TensorFlow and other dependencies
- Limited to pre-trained styles unless users train their own models
- May produce less accurate results compared to slower optimization-based methods
- Resource-intensive for training new styles
Code Examples
- Stylizing an image (illustrative; assumes style.py exposes a stylize helper):
from style import stylize
content_image = 'path/to/content/image.jpg'
output_path = 'path/to/output/image.jpg'
checkpoint = 'path/to/style/checkpoint.ckpt'
stylize(content_image, output_path, checkpoint)
- Stylizing a video (illustrative; assumes style.py exposes a stylize_video helper):
from style import stylize_video
input_video = 'path/to/input/video.mp4'
output_video = 'path/to/output/video.mp4'
checkpoint = 'path/to/style/checkpoint.ckpt'
stylize_video(input_video, output_video, checkpoint)
- Evaluating a trained model:
from evaluate import ffwd_to_img
in_path = 'path/to/test/image.jpg'
out_path = 'path/to/output/image.jpg'
checkpoint_dir = 'path/to/checkpoint/dir'
device = '/gpu:0'
ffwd_to_img(in_path, out_path, checkpoint_dir, device)
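Building on the ffwd_to_img call above, a small helper can stylize every image in a folder with one checkpoint. This is a minimal sketch: stylize_folder is a hypothetical convenience wrapper, not part of the repository.
import os
from evaluate import ffwd_to_img

def stylize_folder(in_dir, out_dir, checkpoint_dir, device='/gpu:0'):
    # Hypothetical helper: applies the trained model to each image file in in_dir.
    os.makedirs(out_dir, exist_ok=True)
    for name in sorted(os.listdir(in_dir)):
        if name.lower().endswith(('.jpg', '.jpeg', '.png')):
            ffwd_to_img(os.path.join(in_dir, name),
                        os.path.join(out_dir, name),
                        checkpoint_dir, device)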
Getting Started
- Clone the repository:
git clone https://github.com/lengstrom/fast-style-transfer.git
cd fast-style-transfer
- Install dependencies:
pip install -r requirements.txt
- Download pre-trained models:
sh models/download_style_transfer_models.sh
- Stylize an image with a downloaded checkpoint (evaluate.py applies a trained model; see the Documentation section below):
python evaluate.py --checkpoint models/udnie.ckpt --in-path path/to/content/image.jpg --out-path path/to/output/image.jpg
Competitor Comparisons
Feedforward style transfer
Pros of fast-neural-style
- Supports both training and inference, allowing users to create custom style models
- Includes pre-trained models for quick style transfer without training
- Offers multi-GPU training for faster model creation
Cons of fast-neural-style
- Less actively maintained, with the last update in 2018
- Requires more setup and dependencies compared to fast-style-transfer
- Limited documentation and examples for advanced usage
Code Comparison
fast-neural-style:
local cmd = torch.CmdLine()
cmd:option('-style_image', 'examples/inputs/seated-nude.jpg', 'Path to style image')
cmd:option('-content_image', 'examples/inputs/tubingen.jpg', 'Path to content image')
cmd:option('-image_size', 512, 'Maximum height / width of generated image')
fast-style-transfer:
parser.add_argument('--checkpoint-dir', type=str,
dest='checkpoint_dir', help='dir to save checkpoint in',
metavar='CHECKPOINT_DIR', required=True)
parser.add_argument('--style', type=str,
dest='style', help='style image path',
metavar='STYLE', required=True)
Both repositories focus on fast neural style transfer, but fast-neural-style offers more flexibility in training custom models, while fast-style-transfer provides a simpler, more user-friendly approach for quick style transfer using pre-trained models.
Neural style in TensorFlow! 🎨
Pros of neural-style
- Offers more flexibility in style transfer, allowing for custom adjustments
- Produces higher quality results, especially for complex styles
- Supports a wider range of input image sizes
Cons of neural-style
- Significantly slower processing time compared to fast-style-transfer
- Requires more computational resources and GPU power
- Less suitable for real-time applications or batch processing
Code Comparison
neural-style:
parser.add_argument('--content-weight', type=float, default=5e0)
parser.add_argument('--style-weight', type=float, default=1e2)
parser.add_argument('--tv-weight', type=float, default=1e-3)
parser.add_argument('--learning-rate', type=float, default=1e0)
parser.add_argument('--max-iterations', type=int, default=1000)
fast-style-transfer:
parser.add_argument('--checkpoint-dir', type=str,
dest='checkpoint_dir', help='dir to save checkpoint in')
parser.add_argument('--style', type=str,
dest='style', help='style image path')
parser.add_argument('--train-path', type=str,
dest='train_path', help='path to training images folder')
The code snippets show that neural-style offers more fine-grained control over the style transfer process, while fast-style-transfer focuses on simplicity and ease of use for training and applying pre-trained models.
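To make the contrast concrete, the neural-style flags above map directly onto a weighted optimization objective. The sketch below only shows how the weights combine; content_loss, style_loss, and tv_loss stand in for the actual VGG-based loss terms and are assumed inputs, not code from either repository.
def total_loss(content_loss, style_loss, tv_loss,
               content_weight=5e0, style_weight=1e2, tv_weight=1e-3):
    # Defaults mirror the argparse defaults shown above. neural-style minimizes
    # this objective over the pixels of the output image itself, which is why it
    # is slower than a single feed-forward pass through a trained network.
    return (content_weight * content_loss
            + style_weight * style_loss
            + tv_weight * tv_loss)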
Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
Pros of deep-photo-styletransfer
- Produces more photorealistic results, preserving the structure of the original photo
- Offers better color preservation and transfer from the style image
- Includes a technique for maintaining local affine color transformations
Cons of deep-photo-styletransfer
- Slower processing time compared to fast-style-transfer
- Requires more computational resources and setup
- Less suitable for real-time or video applications
Code Comparison
deep-photo-styletransfer:
def wct_core(cont_feat, styl_feat, weight=1, eps=1e-5):
cont_c, cont_h, cont_w = cont_feat.size()
cont_feat_view = cont_feat.view(cont_c, -1)
cont_feat_mean = torch.mean(cont_feat_view, 1)
cont_feat_var = torch.var(cont_feat_view, 1)
fast-style-transfer:
def _conv_layer(net, num_filters, filter_size, strides, relu=True):
weights_init = _conv_init_vars(net.get_shape()[-1], num_filters, filter_size)
strides_shape = [1, strides, strides, 1]
net = tf.nn.conv2d(net, weights_init, strides_shape, padding='SAME')
net = _instance_norm(net)
The code snippets show different approaches: the deep-photo-styletransfer excerpt is PyTorch-style code centered on whitening and coloring transforms of feature statistics, while fast-style-transfer uses TensorFlow and builds its transformation network from convolutional layers with instance normalization.
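For readers unfamiliar with the whitening-and-coloring idea behind wct_core, here is a self-contained NumPy sketch of the underlying math (illustrative only, not code from either repository): content features are decorrelated, then recolored with the style features' channel statistics.
import numpy as np

def wct_numpy(content_feat, style_feat, eps=1e-5):
    # content_feat, style_feat: (channels, positions) arrays of flattened feature maps.
    c_mean = content_feat.mean(axis=1, keepdims=True)
    c = content_feat - c_mean
    c_cov = c @ c.T / (c.shape[1] - 1) + eps * np.eye(c.shape[0])
    vals_c, vecs_c = np.linalg.eigh(c_cov)
    # Whitening: remove the content features' channel correlations.
    whitened = vecs_c @ np.diag(vals_c ** -0.5) @ vecs_c.T @ c
    s_mean = style_feat.mean(axis=1, keepdims=True)
    s = style_feat - s_mean
    s_cov = s @ s.T / (s.shape[1] - 1) + eps * np.eye(s.shape[0])
    vals_s, vecs_s = np.linalg.eigh(s_cov)
    # Coloring: impose the style features' channel correlations and mean.
    return vecs_s @ np.diag(vals_s ** 0.5) @ vecs_s.T @ whitened + s_mean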
Style transfer, deep learning, feature transform
Pros of FastPhotoStyle
- Focuses on photorealistic style transfer, preserving content structure better
- Utilizes semantic segmentation for improved style application
- Offers both photo-to-photo and painting-to-photo style transfer
Cons of FastPhotoStyle
- Requires more computational resources due to complex architecture
- Less versatile for non-photorealistic style transfer tasks
- Steeper learning curve for implementation and customization
Code Comparison
FastPhotoStyle:
from photo_wct import PhotoWCT
p_wct = PhotoWCT()
p_wct.load_state_dict(torch.load('models/photo_wct.pth'))
stylized = p_wct.transfer(content, style, csF=1.0)
fast-style-transfer:
from style import stylize
stylized = stylize(content_image, model_path)
Key Differences
- FastPhotoStyle aims for photorealistic results, while fast-style-transfer is more general-purpose
- FastPhotoStyle uses a more complex architecture with semantic segmentation
- fast-style-transfer is simpler to implement but less flexible for fine-tuning
- FastPhotoStyle may produce higher quality results for photo-to-photo transfers
- fast-style-transfer is better suited for artistic style transfer tasks
Magenta: Music and Art Generation with Machine Intelligence
Pros of Magenta
- Broader scope: Covers various AI music and art generation tasks beyond style transfer
- Active development: More frequent updates and contributions from Google researchers
- Extensive documentation and tutorials for users and developers
Cons of Magenta
- Steeper learning curve due to its broader scope and more complex architecture
- Potentially slower performance for specific tasks like style transfer
- Requires more computational resources for some advanced features
Code Comparison
Fast-Style-Transfer:
stylized_image = style_transfer(content_image, style_image)
Magenta:
model = magenta.models.image_stylization.image_stylization_transform.Transform()
stylized_image = model.stylize(content_image, style_image)
Summary
Fast-Style-Transfer is more focused and potentially easier to use for quick style transfer tasks. Magenta offers a wider range of AI-powered creative tools but may require more setup and learning. Fast-Style-Transfer might be preferable for simple style transfer projects, while Magenta is better suited for more complex, multi-faceted AI art and music generation tasks.
Image-to-Image Translation in PyTorch
Pros of pytorch-CycleGAN-and-pix2pix
- Supports multiple image-to-image translation tasks beyond style transfer
- Implements more advanced GAN-based techniques (CycleGAN, pix2pix)
- Provides a flexible PyTorch framework for easier customization
Cons of pytorch-CycleGAN-and-pix2pix
- Requires more computational resources and training time
- May have slower inference speed compared to fast-style-transfer
- More complex to set up and use for beginners
Code Comparison
fast-style-transfer:
stylized_image = style_transfer(content_image, style_image)
pytorch-CycleGAN-and-pix2pix:
model = create_model(opt)
model.setup(opt)
model.eval()
stylized_image = model.netG(content_image)
The fast-style-transfer code is simpler and more straightforward for style transfer tasks. The pytorch-CycleGAN-and-pix2pix code requires more setup but offers greater flexibility for various image-to-image translation tasks.
Both repositories provide valuable tools for image manipulation, with fast-style-transfer focusing on quick and efficient style transfer, while pytorch-CycleGAN-and-pix2pix offers a broader range of image translation capabilities at the cost of increased complexity.
README
Fast Style Transfer in TensorFlow
Add styles from famous paintings to any photo in a fraction of a second! You can even style videos!
It takes 100ms on a 2015 Titan X to style the MIT Stata Center (1024×680) like Udnie, by Francis Picabia.
Our implementation is based off of a combination of Gatys' A Neural Algorithm of Artistic Style, Johnson's Perceptual Losses for Real-Time Style Transfer and Super-Resolution, and Ulyanov's Instance Normalization.
Sponsorship
Please consider sponsoring my work on this project!
License
Copyright (c) 2016 Logan Engstrom. Contact me for commercial use (or rather any use that is not academic research) (email: engstrom at my university's domain dot edu). Free for research use, as long as proper attribution is given and this copyright notice is retained.
Video Stylization
Here we transformed every frame in a video, then combined the results. Click to go to the full demo on YouTube! The style here is Udnie, as above.
See how to generate these videos here!
Image Stylization
We added styles from various paintings to a photo of Chicago. Click on thumbnails to see full applied style images.
Implementation Details
Our implementation uses TensorFlow to train a fast style transfer network. We use roughly the same transformation network as described in Johnson, except that batch normalization is replaced with Ulyanov's instance normalization, and the scaling/offset of the output `tanh` layer is slightly different. We use a loss function close to the one described in Gatys, using VGG19 instead of VGG16 and typically using "shallower" layers than in Johnson's implementation (e.g. we use `relu1_1` rather than `relu1_2`). Empirically, this results in larger scale style features in transformations.
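For reference, instance normalization differs from batch normalization only in which axes the statistics are taken over: each image and each channel is normalized by its own spatial mean and variance. A minimal sketch in TF 2-style Python (an assumption; the repository itself targets an older TensorFlow API and its own helper, not this function):
import tensorflow as tf

def instance_norm(x, scale, shift, eps=1e-3):
    # x: NHWC feature maps; scale and shift are learned per-channel parameters.
    # Statistics are computed per image and per channel, over spatial axes only.
    mean, variance = tf.nn.moments(x, axes=[1, 2], keepdims=True)
    return scale * (x - mean) / tf.sqrt(variance + eps) + shift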
Virtual Environment Setup (Anaconda) - Windows/Linux
Tested on
Spec | Value
---|---
Operating System | Windows 10 Home
GPU | Nvidia RTX 2080 Ti
CUDA Version | 11.0
Driver Version | 445.75
Step 1: Install Anaconda
https://docs.anaconda.com/anaconda/install/
Step 2: Build a virtual environment
Run the following commands in sequence in Anaconda Prompt:
conda create -n tf-gpu tensorflow-gpu=2.1.0
conda activate tf-gpu
conda install jupyterlab
jupyter lab
Run the following command in the notebook, or install the package with conda:
!pip install moviepy==1.0.2
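Before moving on, it can help to confirm the environment actually sees the GPU. A quick check using the TF 2.x API, run in the same notebook:
import tensorflow as tf
print(tf.__version__)                          # expect 2.1.0 in this environment
print(tf.config.list_physical_devices('GPU'))  # should list at least one GPU device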
Then follow the documentation below to use fast-style-transfer.
Documentation
Training Style Transfer Networks
Use `style.py` to train a new style transfer network. Run `python style.py` to view all the possible parameters. Training takes 4-6 hours on a Maxwell Titan X. More detailed documentation here. Before you run this, you should run `setup.sh`. Example usage:
python style.py --style path/to/style/img.jpg \
--checkpoint-dir checkpoint/path \
--test path/to/test/img.jpg \
--test-dir path/to/test/dir \
--content-weight 1.5e1 \
--checkpoint-iterations 1000 \
--batch-size 20
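As the Implementation Details section notes, the training objective follows Gatys: `--content-weight` above balances a VGG feature-reconstruction term against a style term built from Gram matrices of VGG activations. A minimal sketch of that Gram-matrix style term (illustrative only, not the repository's exact implementation):
import tensorflow as tf

def gram_matrix(features):
    # features: NHWC activations from a VGG layer such as relu1_1.
    b, h, w, c = features.shape
    flat = tf.reshape(features, [b, h * w, c])
    return tf.matmul(flat, flat, transpose_a=True) / tf.cast(h * w * c, tf.float32)

def style_layer_loss(generated, style):
    # Squared difference between Gram matrices, averaged over entries.
    return tf.reduce_mean(tf.square(gram_matrix(generated) - gram_matrix(style)))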
Evaluating Style Transfer Networks
Use `evaluate.py` to evaluate a style transfer network. Run `python evaluate.py` to view all the possible parameters. Evaluation takes 100 ms per frame (when batch size is 1) on a Maxwell Titan X, and several seconds per frame on a CPU. More detailed documentation here. Models for evaluation are located here. Example usage:
python evaluate.py --checkpoint path/to/style/model.ckpt \
--in-path dir/of/test/imgs/ \
--out-path dir/for/results/
Stylizing Video
Use `transform_video.py` to transfer style into a video. Run `python transform_video.py` to view all the possible parameters. Requires `ffmpeg`. More detailed documentation here. Example usage:
python transform_video.py --in-path path/to/input/vid.mp4 \
--checkpoint path/to/style/model.ckpt \
--out-path out/video.mp4 \
--device /gpu:0 \
--batch-size 4
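transform_video.py drives this end to end, but the core idea is per-frame stylization followed by re-encoding. A rough sketch using the moviepy package installed earlier (illustrative only; stylize_frame is a hypothetical function that maps one RGB frame array to its stylized version):
from moviepy.editor import VideoFileClip

def stylize_clip(in_path, out_path, stylize_frame):
    clip = VideoFileClip(in_path)
    styled = clip.fl_image(stylize_frame)  # apply stylize_frame to every frame
    styled.write_videofile(out_path, audio=False)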
Requirements
You will need the following to run the above:
- TensorFlow 0.11.0
- Python 2.7.9, Pillow 3.4.2, scipy 0.18.1, numpy 1.11.2
- If you want to train (and don't want to wait for 4 months):
- A decent GPU
- All the required NVIDIA software to run TF on a GPU (cuda, etc)
- ffmpeg 3.1.3 if you want to stylize video
Citation
@misc{engstrom2016faststyletransfer,
author = {Logan Engstrom},
title = {Fast Style Transfer},
year = {2016},
howpublished = {\url{https://github.com/lengstrom/fast-style-transfer/}},
note = {commit xxxxxxx}
}
Attributions/Thanks
- This project could not have happened without the advice (and GPU access) given by Anish Athalye.
- The project also borrowed some code from Anish's Neural Style
- Some readme/docs formatting was borrowed from Justin Johnson's Fast Neural Style
- The image of the Stata Center at the very beginning of the README was taken by Juan Paulo
Related Work
- Michael Ramos ported this network to use CoreML on iOS