AnimeGANv2
[Open Source]. The improved version of AnimeGAN. Landscape photos/videos to anime
Top Related Projects
- Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
- This repository contains the source code for the paper First Order Motion Model for Image Animation.
- Bringing Old Photos Back to Life (CVPR 2020 oral).
- ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.
Quick Overview
AnimeGANv2 is an improved version of AnimeGAN for fast photo animation. It's a deep learning model that can transform real-world photos into anime-style images. The project aims to provide a high-quality, efficient solution for anime-style image generation.
Pros
- High-quality anime-style image generation
- Fast processing speed compared to previous versions
- Supports various input image sizes and formats
- Provides pre-trained models for easy use
Cons
- Requires significant computational resources for training
- Limited customization options for output style
- May struggle with complex scenes or unusual input images
- Dependency on specific versions of TensorFlow and other libraries
Code Examples
The snippets below are illustrative sketches, not an API shipped with the repository: AnimeGANv2 itself is driven by scripts (see Usage further down), so treat the AnimeGANv2 module and its load_model / process_image helpers as a hypothetical convenience wrapper.
- Loading and using a pre-trained model:
from AnimeGANv2 import load_model, process_image
# Load the pre-trained model
model = load_model('generator_Hayao_weight')
# Process an image
input_image = 'path/to/input/image.jpg'
output_image = process_image(model, input_image)
output_image.save('path/to/output/image.jpg')
- Batch processing multiple images:
import os
from AnimeGANv2 import load_model, process_image
model = load_model('generator_Paprika_weight')
input_dir = 'path/to/input/directory'
output_dir = 'path/to/output/directory'
os.makedirs(output_dir, exist_ok=True)  # ensure the output directory exists

for filename in os.listdir(input_dir):
    if filename.endswith(('.jpg', '.png')):
        input_path = os.path.join(input_dir, filename)
        output_path = os.path.join(output_dir, f'anime_{filename}')
        output_image = process_image(model, input_path)
        output_image.save(output_path)
- Adjusting output style intensity:
from AnimeGANv2 import load_model, process_image_with_intensity
model = load_model('generator_Shinkai_weight')
input_image = 'path/to/input/image.jpg'
intensity = 0.7 # Adjust between 0 and 1
output_image = process_image_with_intensity(model, input_image, intensity)
output_image.save('path/to/output/image.jpg')
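The repository itself exposes no intensity parameter, so process_image_with_intensity is, like the rest of this wrapper, hypothetical. One way such a knob could be implemented is to alpha-blend the stylized output with the original photo; a minimal sketch using OpenCV (function name and paths are ours, not the repo's):
import cv2

def blend_with_original(original_path, anime_path, intensity=0.7):
    # intensity=1.0 keeps the full anime effect; 0.0 returns the photo.
    original = cv2.imread(original_path)
    anime = cv2.imread(anime_path)
    # The generator may change the resolution, so match sizes first.
    anime = cv2.resize(anime, (original.shape[1], original.shape[0]))
    return cv2.addWeighted(anime, intensity, original, 1.0 - intensity, 0)

cv2.imwrite('blended.jpg', blend_with_original('photo.jpg', 'anime.jpg', 0.7))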
Getting Started
1. Clone the repository:
git clone https://github.com/TachibanaYoshino/AnimeGANv2.git
cd AnimeGANv2
2. Install dependencies:
pip install -r requirements.txt
3. Download pre-trained models from the releases page and place them in the checkpoint directory.
4. Use the provided scripts or integrate the code examples above into your project to start generating anime-style images from photos.
Competitor Comparisons
Pros of GPEN
- Focuses on face restoration and enhancement, offering more specialized results for facial images
- Provides pre-trained models for different resolutions, allowing flexibility in output quality
- Includes a user-friendly GUI for easier use by non-technical users
Cons of GPEN
- Limited to face-specific applications, unlike AnimeGANv2's broader anime-style transformation
- Requires more computational resources due to its complex architecture
- Less customizable for artistic style transfer compared to AnimeGANv2
Code Comparison
Both snippets below are schematic sketches of each project's intended usage rather than verbatim APIs:
GPEN:
from gpen import FaceRestoration
model = FaceRestoration(model_path='path/to/model.pth')
restored_face = model.process(input_image)
AnimeGANv2:
from AnimeGANv2 import AnimeGANv2
model = AnimeGANv2(checkpoint='path/to/checkpoint')
anime_image = model.convert(input_image)
Both repositories provide easy-to-use interfaces for their respective tasks. GPEN focuses on face restoration with a single process() method, while AnimeGANv2 offers a convert() method for anime-style transformation. GPEN's code structure is oriented towards face-specific processing, whereas AnimeGANv2 is designed for general image conversion to anime style.
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Pros of Real-ESRGAN
- More versatile, capable of enhancing both real-world images and anime-style content
- Offers better performance on a wider range of image types, including heavily compressed or low-quality images
- Provides pre-trained models for various use cases, making it easier to get started
Cons of Real-ESRGAN
- Requires more computational resources due to its more complex architecture
- May introduce artifacts or over-smoothing in some cases, especially with extreme upscaling factors
Code Comparison
Real-ESRGAN:
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23,
                num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth',
                         model=model, tile=0, tile_pad=10, pre_pad=0)
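A follow-up usage line for the Real-ESRGAN side (enhance is the upsampler's actual entry point; the file paths are placeholders):
import cv2

img = cv2.imread('inputs/photo.jpg', cv2.IMREAD_COLOR)
output, _ = upsampler.enhance(img, outscale=4)  # returns (image, img_mode)
cv2.imwrite('results/photo_x4.jpg', output)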
AnimeGANv2:
from net import generator
import tensorflow as tf

checkpoint_dir = './checkpoint/generator_Hayao_weight'
# Sketch of checkpoint restoration; test.py is the repo's canonical
# inference entry point.
G = generator.Generator()  # build the generator network
ckpt = tf.train.Checkpoint(generator=G)
ckpt.restore(tf.train.latest_checkpoint(checkpoint_dir)).expect_partial()
Both repositories focus on image enhancement, but Real-ESRGAN is more versatile and generally performs better on a wider range of images. AnimeGANv2 is specifically designed for anime-style image generation and may produce better results for that particular use case.
Pros of ailab
- More comprehensive AI lab with multiple projects and tools
- Larger community and corporate backing from Bilibili
- Broader scope beyond just anime-style image generation
Cons of ailab
- Less focused on anime-specific image generation
- May have a steeper learning curve due to broader scope
- Documentation might be more complex or less tailored to anime enthusiasts
Code Comparison
AnimeGANv2:
import cv2
import numpy as np
from utils import preprocessing  # repo helper; import path may vary

def load_test_data(image_path, size=[256, 256]):
    img = cv2.imread(image_path).astype(np.float32)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = preprocessing(img, size)
    img = np.expand_dims(img, axis=0)
    return img
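The preprocessing call above is a repo helper. As a rough stand-in, it resizes the frame and normalizes pixels to [-1, 1], the input range this model family expects; a minimal sketch, not the repo's exact implementation:
import cv2

def preprocessing(img, size):
    # Resize to (width, height) and map [0, 255] floats to [-1, 1].
    img = cv2.resize(img, (size[0], size[1]))
    return img / 127.5 - 1.0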
ailab:
import numpy as np
from PIL import Image

def preprocess_image(image_path, target_size=(224, 224)):
    img = Image.open(image_path)
    img = img.resize(target_size)
    img_array = np.array(img) / 255.0
    img_array = np.expand_dims(img_array, axis=0)
    return img_array
Both repositories provide image preprocessing functions, but AnimeGANv2 is more tailored for anime-style conversion, while ailab's function is more general-purpose for various AI tasks.
This repository contains the source code for the paper First Order Motion Model for Image Animation
Pros of first-order-model
- Focuses on motion transfer and animation, allowing for more dynamic results
- Supports a wider range of input types, including videos and images
- Has a more extensive documentation and usage examples
Cons of first-order-model
- Requires more computational resources and training time
- May produce less consistent results across different input types
- Has a steeper learning curve for implementation and fine-tuning
Code Comparison
first-order-model:
from demo import load_checkpoints

generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
                                          checkpoint_path='vox-cpk.pth.tar')
AnimeGANv2:
# PyTorch port (bryandlee/animegan2-pytorch); the original repo is TensorFlow
import torch
from model import Generator  # import path per the port's layout

device = 'cuda' if torch.cuda.is_available() else 'cpu'
G = Generator().to(device)
G.load_state_dict(torch.load('weights/generator.pth', map_location=device))
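Continuing the AnimeGANv2 side, a hedged inference sketch, assuming the port's generator takes and returns NCHW float tensors in [-1, 1] (the usual convention for this model family):
import numpy as np
from PIL import Image

G.eval()
img = np.asarray(Image.open('photo.jpg').convert('RGB'), dtype=np.float32)
x = torch.from_numpy(img / 127.5 - 1.0).permute(2, 0, 1).unsqueeze(0).to(device)
with torch.no_grad():
    y = G(x)[0].permute(1, 2, 0).cpu().numpy()
# Map [-1, 1] back to uint8 and save.
Image.fromarray(((y + 1.0) * 127.5).clip(0, 255).astype(np.uint8)).save('anime.jpg')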
The first-order-model code demonstrates loading checkpoints for both the generator and keypoint detector, while AnimeGANv2 focuses on loading the generator weights. This reflects the different approaches and complexities of the two projects.
Bringing Old Photos Back to Life (CVPR 2020 oral)
Pros of Bringing-Old-Photos-Back-to-Life
- Focuses on restoring and enhancing old, damaged photos
- Provides a complete pipeline for photo restoration, including face enhancement
- Offers pre-trained models for immediate use
Cons of Bringing-Old-Photos-Back-to-Life
- More complex setup and dependencies compared to AnimeGANv2
- Limited to photo restoration, not suitable for stylization or animation
Code Comparison
AnimeGANv2:
import tensorflow as tf  # TensorFlow 1.x API (tf.layers, tf.contrib)

def conv2d(inputs, filters, kernel_size=3, strides=1, padding='SAME', name='conv'):
    return tf.layers.conv2d(inputs=inputs, filters=filters, kernel_size=kernel_size,
                            strides=strides, padding=padding, name=name,
                            kernel_initializer=tf.contrib.layers.xavier_initializer())
Bringing-Old-Photos-Back-to-Life:
# Keras-style sketch of the repo's convolutional block
from tensorflow.keras.layers import Conv2D, BatchNormalization, LeakyReLU

def conv_block(input, num_filters, kernel_size, stride, padding='same', use_bias=True):
    x = Conv2D(num_filters, kernel_size=kernel_size, strides=stride, padding=padding,
               use_bias=use_bias)(input)
    x = BatchNormalization()(x)
    x = LeakyReLU(alpha=0.2)(x)
    return x
Both projects use convolutional layers, but Bringing-Old-Photos-Back-to-Life includes batch normalization and LeakyReLU activation, which may contribute to better stability and performance in photo restoration tasks.
AnimeGANv2 is more focused on stylization and anime-style image generation, while Bringing-Old-Photos-Back-to-Life specializes in photo restoration and enhancement. The choice between them depends on the specific use case and desired output.
ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.
Pros of ESRGAN
- More versatile, capable of enhancing various types of images beyond anime
- Provides pre-trained models for different use cases
- Extensive documentation and examples for implementation
Cons of ESRGAN
- Requires more computational resources for training and inference
- May produce artifacts in some cases, especially with extreme upscaling
Code Comparison
AnimeGANv2:
import cv2
import numpy as np

def load_test_data(image_path, size):
    img = cv2.imread(image_path).astype(np.float32)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = preprocessing(img, size)  # repo helper: resize + normalize
    return img
ESRGAN:
import numpy as np
import torch

def preprocess(img):
    img = np.transpose(img, (2, 0, 1))  # HWC -> CHW
    img = torch.from_numpy(img).float()
    img = img.unsqueeze(0)  # add batch dimension
    return img
Both repositories focus on image enhancement, but AnimeGANv2 specializes in transforming real-world images into anime-style artwork, while ESRGAN is designed for general image super-resolution. AnimeGANv2 may be more suitable for specific anime-related projects, whereas ESRGAN offers broader applicability across various image types and resolutions.
AnimeGANv2
The improved version of AnimeGAN.
Project Page | Landscape photos / videos to anime
News
- (2022.08.03) Added the AnimeGANv2 Colab: 🖼️ Photos | 🎞️ Videos
- (2021.12.25) AnimeGANv3 has been released. :christmas_tree:
- (2021.02.21) The PyTorch version of AnimeGANv2 has been released. Thanks to @bryandlee for his contribution.
- (2020.12.25) AnimeGANv3 will be released along with its paper in the spring of 2021.
Focus:
Anime style | Film | Picture Number | Quality | Download Style Dataset |
---|---|---|---|---|
Miyazaki Hayao | The Wind Rises | 1752 | 1080p | Link |
Makoto Shinkai | Your Name & Weathering with you | 1445 | BD | |
Kon Satoshi | Paprika | 1284 | BDRip | |
The improvement directions of AnimeGANv2 mainly include the following 4 points:
1. Solve the problem of high-frequency artifacts in the generated image.
2. Make the model easy to train, so that the effects in the paper can be reproduced directly.
3. Further reduce the number of parameters of the generator network (generator size: 8.17 MB); the lite version has an even smaller generator model.
4. Use new, high-quality style data, taken from BD (Blu-ray) movies as much as possible.
AnimeGAN can be accessed from here.
Requirements
- python 3.6
- tensorflow-gpu 1.15.0 (GPU 2080Ti, cuda 10.0.130, cudnn 7.6.0)
- opencv
- tqdm
- numpy
- glob
- argparse
- onnxruntime (required only to run the ONNX model)
Usage
1. Inference
python test.py --checkpoint_dir checkpoint/generator_Hayao_weight --test_dir dataset/test/HR_photo --save_dir Hayao/HR_photo
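To drive the same script from Python instead of the shell, a thin wrapper could look like this (the stylize helper is ours; arguments mirror the command above):
import subprocess

def stylize(test_dir, style='Hayao', save_dir='output'):
    # Shell out to the repo's test.py with the CLI flags shown above.
    subprocess.run([
        'python', 'test.py',
        '--checkpoint_dir', f'checkpoint/generator_{style}_weight',
        '--test_dir', test_dir,
        '--save_dir', save_dir,
    ], check=True)

stylize('dataset/test/HR_photo', style='Hayao', save_dir='Hayao/HR_photo')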
2. Convert video to anime
python video2anime.py --video video/input/お花見.mp4 --checkpoint_dir checkpoint/generator_Hayao_weight --output video/output
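Conceptually, video2anime.py decodes the clip, pushes each frame through the generator, and re-encodes the result. A simplified sketch of that loop, with stylize_frame standing in for the per-frame model call:
import cv2

def convert_video(in_path, out_path, stylize_frame):
    cap = cv2.VideoCapture(in_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(out_path, cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(stylize_frame(frame))  # model inference per frame
    cap.release()
    out.release()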
3. Train
1. Download vgg19
2. Download Train/Val Photo dataset
3. Do edge_smooth (see the sketch after this list)
python edge_smooth.py --dataset Hayao --img_size 256
4. Train
python train.py --dataset Hayao --epoch 101 --init_epoch 10
5. Extract the weights of the generator
python get_generator_ckpt.py --checkpoint_dir ../checkpoint/AnimeGANv2_Shinkai_lsgan_300_300_1_2_10_1 --style_name Shinkai
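For context, edge_smooth.py prepares edge-smoothed copies of the style images, a step inherited from CartoonGAN: the discriminator treats them as negatives, pushing the generator toward crisp edges. A minimal sketch of the idea (the repo's script handles batching and its own kernel choices):
import cv2
import numpy as np

def edge_smooth(img, kernel_size=5):
    # Find edges, dilate them into a band, and Gaussian-blur only that band.
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    band = cv2.dilate(edges, np.ones((kernel_size, kernel_size), np.uint8))
    blurred = cv2.GaussianBlur(img, (kernel_size, kernel_size), 0)
    result = img.copy()
    result[band != 0] = blurred[band != 0]
    return result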
Results
:heart_eyes: Photo to Paprika Style
:heart_eyes: Photo to Hayao Style
:heart_eyes: Photo to Shinkai Style
License
This repo is made freely available to academic and non-academic entities for non-commercial purposes such as academic research, teaching, and scientific publications. Permission is granted to use AnimeGANv2 provided that you agree to my license terms. For commercial use, please contact us via email to obtain an authorization letter.
Author
Xin Chen