White-box-Cartoonization

Official tensorflow implementation for CVPR2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”

3,984

743


Top Related Projects

RepVGG

3,382

RepVGG: Making VGG-style ConvNets Great Again

GFPGAN

36,861

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Real-ESRGAN

30,708

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

stable-diffusion

71,028

A latent text-to-image diffusion model

JoJoGAN

1,428

Official PyTorch repo for JoJoGAN: One Shot Face Stylization

AnimeGANv2

5,290

[Open Source]. The improved version of AnimeGAN. Landscape photos/videos to anime

Quick Overview

White-box-Cartoonization is a GitHub repository that implements a novel white-box cartoon style transfer algorithm. This project aims to transform real-world images and videos into cartoon-style renderings using a learning-based approach. The method provides interpretable results and allows for adjustable cartoonization effects.

Pros

High-quality cartoonization results with preserved details and structures
Supports both image and video cartoonization
Provides a TensorFlow implementation for easy integration and experimentation
Includes pre-trained models for quick testing and deployment

Cons

Requires significant computational resources for training and inference
Limited customization options for fine-tuning the cartoonization style
Dependency on specific versions of TensorFlow and other libraries
Lack of extensive documentation for advanced usage and modifications

Code Examples

Loading and preprocessing an image:

import cv2
import numpy as np

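def load_image(path):
    img = cv2.imread(path)  # BGR uint8 array, or None if the file cannot be read
    if img is None:
        raise FileNotFoundError(f"Could not read image: {path}")
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    # Scale pixel values from [0, 255] to [-1, 1]
    img = img.astype(np.float32) / 127.5 - 1
    return img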

Performing cartoonization on an image:

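import tensorflow as tf

def cartoonize(model, img):
    # Add a batch dimension: (H, W, 3) -> (1, H, W, 3)
    input_image = np.expand_dims(img, axis=0)
    output = model.signatures['serving_default'](tf.constant(input_image))
    cartoon = output['output_1'].numpy()
    # Map [-1, 1] back to [0, 255]; clip first so the uint8 cast cannot wrap around
    cartoon = np.clip((cartoon[0] + 1) * 127.5, 0, 255)
    cartoon = cartoon.astype(np.uint8)
    return cartoon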

Saving the cartoonized image:

def save_image(img, path):
    cv2.imwrite(path, cv2.cvtColor(img, cv2.COLOR_RGB2BGR))
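
The scaling in load_image and its inverse in cartoonize can be checked in isolation, without a model. The sketch below is only an illustration: the helper names `to_model_range` and `to_uint8` are hypothetical, and rounding before the cast is a choice made here rather than something the repository prescribes.

```python
import numpy as np

def to_model_range(img_uint8):
    """Map uint8 pixels in [0, 255] to float32 values in [-1, 1]."""
    return img_uint8.astype(np.float32) / 127.5 - 1.0

def to_uint8(img_float):
    """Map values in [-1, 1] back to uint8, clipping anything out of range."""
    scaled = (img_float + 1.0) * 127.5
    # Round, then clip, so values slightly outside [-1, 1] cannot wrap around.
    return np.clip(np.rint(scaled), 0, 255).astype(np.uint8)

# A tiny synthetic 2x2 RGB image standing in for a decoded photo.
pixels = np.array([[[0, 128, 255], [10, 20, 30]],
                   [[200, 100, 50], [255, 255, 255]]], dtype=np.uint8)

normalized = to_model_range(pixels)
restored = to_uint8(normalized)
```

Clipping matters because a network output can overshoot the [-1, 1] range slightly, and an unclipped cast to uint8 would turn a too-bright pixel into a dark one.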

Getting Started

Clone the repository:

git clone https://github.com/SystemErrorWang/White-box-Cartoonization.git
cd White-box-Cartoonization

Install dependencies:
```
pip install -r requirements.txt
```
Download pre-trained models from the provided link in the repository's README.

Put your images in test_code/test_images, then run the cartoonization script (results are written to test_code/cartoonized_images):

cd test_code
python cartoonize.py

Competitor Comparisons

RepVGG

3,382

RepVGG: Making VGG-style ConvNets Great Again

Pros of RepVGG

Focuses on efficient and scalable neural network architecture for image classification
Offers better inference speed and accuracy trade-off compared to many existing models
Provides a simple and flexible design that can be easily adapted to various tasks

Cons of RepVGG

Limited to image classification tasks, unlike White-box-Cartoonization's focus on image stylization
May require more computational resources for training compared to White-box-Cartoonization
Less visually appealing output for end-users, as it doesn't produce stylized images

Code Comparison

RepVGG:

class RepVGGBlock(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1, dilation=1, groups=1, padding_mode='zeros', deploy=False):
        super(RepVGGBlock, self).__init__()
        # ... (implementation details)

White-box-Cartoonization:

class CartoonizeNetwork(nn.Module):
    def __init__(self):
        super(CartoonizeNetwork, self).__init__()
        # ... (implementation details)

RepVGG is implemented in PyTorch. The official White-box-Cartoonization code is written in TensorFlow, so the PyTorch-style class above is only an illustrative stand-in, not code from the repository. RepVGG focuses on efficient convolutional blocks, while White-box-Cartoonization implements a generator network for image stylization.

GFPGAN

36,861

GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.

Pros of GFPGAN

Focuses on face restoration and enhancement, providing more detailed and realistic results for facial features
Utilizes a pre-trained model, allowing for faster processing and easier implementation
Supports both image and video processing, offering more versatility in applications

Cons of GFPGAN

Limited to face restoration, not suitable for full-body or non-facial image stylization
May produce less stylized or artistic results compared to White-box-Cartoonization
Requires more computational resources due to its complex neural network architecture

Code Comparison

GFPGAN:

from gfpgan import GFPGANer

restorer = GFPGANer(model_path='experiments/pretrained_models/GFPGANv1.3.pth', upscale=2)
restored_img, _ = restorer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True)

White-box-Cartoonization:

import os

from cartoonize import WB_Cartoonize

cartoonizer = WB_Cartoonize(os.path.abspath("saved_models/"), gpu=1)
cartoon_image = cartoonizer.infer(img)

Both repositories provide easy-to-use interfaces for their respective tasks. GFPGAN focuses on face restoration with a pre-trained model, while White-box-Cartoonization offers a more general image stylization approach. The choice between them depends on the specific use case and desired output style.

Real-ESRGAN

30,708

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

Pros of Real-ESRGAN

Focuses on image super-resolution and enhancement, providing high-quality upscaling
Offers better performance in restoring details and textures in low-quality images
Supports both anime and real-world photo processing

Cons of Real-ESRGAN

Limited to image enhancement and upscaling, not designed for stylization or cartoonization
May require more computational resources due to its complex architecture

Code Comparison

Real-ESRGAN:

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
netscale = 4
model_path = 'experiments/pretrained_models/RealESRGAN_x4plus.pth'

White-box-Cartoonization:

guided_filter = GuidedFilter(r=1, eps=5e-3)
cartoon_generator = CartoonGenerator()
cartoon_generator.load_state_dict(torch.load('pretrained_model.pth'))

Both repositories ship pre-trained models and inference code. Real-ESRGAN enhances image quality, while White-box-Cartoonization transforms images into cartoon-style renderings. The snippets show how each model is initialized. Note that the White-box-Cartoonization snippet is written in PyTorch style, whereas the official implementation is TensorFlow.

stable-diffusion

71,028

A latent text-to-image diffusion model

Pros of stable-diffusion

More versatile, capable of generating a wide range of image styles and content
Utilizes advanced machine learning techniques for high-quality image generation
Supports text-to-image generation, allowing for creative and customizable outputs

Cons of stable-diffusion

Requires more computational resources and longer processing times
May produce less consistent results compared to White-box-Cartoonization
More complex to set up and use, especially for beginners

Code Comparison

White-box-Cartoonization:

output = cartoonize(input_image)

stable-diffusion:

prompt = "A cartoon-style image of a cat"
image = pipe(prompt).images[0]

White-box-Cartoonization focuses on a specific task (cartoonization) with a simpler API, while stable-diffusion offers more flexibility but requires more detailed input and configuration.

Both projects have their strengths: White-box-Cartoonization excels in its specific task of cartoonization, while stable-diffusion provides a more versatile platform for various image generation and manipulation tasks. The choice between them depends on the specific requirements of the project and the desired level of control over the output.

JoJoGAN

1,428

Official PyTorch repo for JoJoGAN: One Shot Face Stylization

Pros of JoJoGAN

Focuses on one-shot face stylization, where a single reference image defines the target style
Utilizes a GAN-based approach for more flexible and diverse outputs
Allows for fine-tuning on custom styles with limited data

Cons of JoJoGAN

Limited to face stylization, unlike White-box-Cartoonization's full-image approach
Requires more computational resources due to the GAN architecture
May produce less consistent results across different input images

Code Comparison

White-box-Cartoonization:

def cartoonize(img_path):
    input_photo = tf.io.read_file(img_path)
    input_photo = tf.image.decode_jpeg(input_photo, channels=3)
    input_photo = tf.image.resize(input_photo, [256, 256])
    input_photo = input_photo / 127.5 - 1
    # `network` is assumed to be a loaded generator that expects a batch dimension
    output = network(tf.expand_dims(input_photo, axis=0))
    return output[0]

JoJoGAN:

def stylize(img, model):
    img = transform(img).unsqueeze(0).to(device)
    with torch.no_grad():
        out = model(img)
    out = out.squeeze(0).permute(1, 2, 0).cpu().numpy()
    out = (out * 255).astype(np.uint8)
    return out

Both repositories provide image stylization capabilities, but they differ in approach and focus. White-box-Cartoonization offers a general cartoonization method for entire images, while JoJoGAN specializes in one-shot face stylization. The code snippets show the different frameworks and preprocessing steps used in each project.

AnimeGANv2

5,290

[Open Source]. The improved version of AnimeGAN. Landscape photos/videos to anime

Pros of AnimeGANv2

Produces higher quality anime-style images with more vibrant colors and sharper details
Offers multiple pre-trained models for different anime styles
Includes a comprehensive training pipeline for custom datasets

Cons of AnimeGANv2

Requires more computational resources for inference and training
Less flexibility in controlling the cartoonization process compared to White-box-Cartoonization
Limited documentation and examples for customization

Code Comparison

White-box-Cartoonization:

output = cartoonize(input_image)

AnimeGANv2:

face_painter = AnimeGANv2(pretrained_model='generator_Hayao_weight.pt')
output = face_painter.inference(input_image)

White-box-Cartoonization uses a simpler function call, while AnimeGANv2 requires initializing a model object before inference. AnimeGANv2 allows for easy switching between different pre-trained models, offering more style options.

Both repositories provide Python-based implementations and support various input formats. White-box-Cartoonization focuses on a general cartoonization approach, while AnimeGANv2 specifically targets anime-style image generation. White-box-Cartoonization offers more interpretability and control over the transformation process, making it suitable for research and experimentation. AnimeGANv2, on the other hand, excels in producing high-quality anime-style images with less user intervention.

README

[CVPR2020]Learning to Cartoonize Using White-box Cartoon Representations

TensorFlow implementation for the CVPR 2020 paper “Learning to Cartoonize Using White-box Cartoon Representations”.
An improved method for facial images is now available:
https://github.com/SystemErrorWang/FacialCartoonization

Use cases

Scenery

Food

Indoor Scenes

People

More Images Are Shown In The Supplementary Materials

Online demo

Some kind people made an online demo for this project
Demo link: https://cartoonize-lkqov62dia-de.a.run.app/cartoonize
Code: https://github.com/experience-ml/cartoonize
Sample Demo: https://www.youtube.com/watch?v=GqduSLcmhto&feature=emb_title

Prerequisites

Training code: Linux or Windows
NVIDIA GPU + CUDA CuDNN for performance
Inference code: Linux, Windows and MacOS

How To Use

Installation

We assume you already have an NVIDIA GPU with CUDA and CuDNN installed
Install tensorflow-gpu; we tested 1.12.0 and 1.13.0rc0
Install scikit-image==0.14.5; other versions may cause problems
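
A quick way to compare an environment against the versions listed above is sketched below. It is not part of the repository: the helper names `installed_version` and `environment_report` are hypothetical, and it assumes the packages were installed under the distribution names `tensorflow-gpu` and `scikit-image`. A version outside the tested set is unverified, not necessarily broken.

```python
from importlib import metadata

# Versions the installation notes report as tested or required.
TESTED_TENSORFLOW = ("1.12.0", "1.13.0rc0")
PINNED_SCIKIT_IMAGE = "0.14.5"

def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def environment_report(tf_version, skimage_version):
    """Return a list of warnings for versions that differ from the listed ones."""
    warnings = []
    if tf_version not in TESTED_TENSORFLOW:
        warnings.append(
            f"tensorflow-gpu {tf_version} is not a tested version {TESTED_TENSORFLOW}"
        )
    if skimage_version != PINNED_SCIKIT_IMAGE:
        warnings.append(
            f"scikit-image {skimage_version} differs from {PINNED_SCIKIT_IMAGE}"
        )
    return warnings

if __name__ == "__main__":
    report = environment_report(
        installed_version("tensorflow-gpu"), installed_version("scikit-image")
    )
    for line in report:
        print("warning:", line)
```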

Inference with Pre-trained Model

Store test images in /test_code/test_images
Run /test_code/cartoonize.py
Results will be saved in /test_code/cartoonized_images
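
The folder-in, folder-out convention above can be sketched as a small helper that pairs each input image with its output path. This is an illustration under stated assumptions, not the script's actual code: `plan_outputs` and the accepted extension set are hypothetical.

```python
from pathlib import Path

# Assumed set of accepted image extensions; the real script may differ.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def plan_outputs(input_dir, output_dir):
    """Pair every image file in input_dir with a same-named path in output_dir."""
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    pairs = []
    for src in sorted(input_dir.iterdir()):
        # Skip subfolders and non-image files such as notes or hidden files.
        if src.is_file() and src.suffix.lower() in IMAGE_SUFFIXES:
            pairs.append((src, output_dir / src.name))
    return pairs
```

Calling `plan_outputs("test_code/test_images", "test_code/cartoonized_images")` would return the (source, destination) pairs a batch run needs to process.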

Train

Place your training data in corresponding folders in /dataset
Run pretrain.py, results will be saved in /pretrain folder
Run train.py, results will be saved in /train_cartoon folder
The code was cleaned up from a production environment and has not been tested
There may be minor problems, but they should be easy to resolve
The pretrained VGG_19 model can be found at the following URL: https://drive.google.com/file/d/1j0jDENjdwxCDb36meP6-u5xDBzmKBOjJ/view?usp=sharing

Datasets

Due to copyright issues, we cannot provide cartoon images used for training
However, these training datasets are easy to prepare
Scenery images are collected from Shinkai Makoto, Miyazaki Hayao and Hosoda Mamoru films
Split films into frames, then randomly crop and resize them to 256x256
Portrait images are from Kyoto Animation and PA Works
We use this repo (https://github.com/nagadomi/lbpcascade_animeface) to detect facial areas
Manual data cleaning will greatly increase the quality of both datasets
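
The random crop and resize step could look roughly like the sketch below. It is a minimal illustration, not the authors' preprocessing code: the function name `random_crop_resize` is hypothetical, the crop is assumed to be a square as large as the frame's shorter side, and nearest-neighbour resizing is used only to keep the example dependency-light (an image library's resize would normally give smoother results).

```python
import numpy as np

def random_crop_resize(frame, size=256, rng=None):
    """Take a random square crop from an HxWx3 frame and resize it to size x size."""
    if rng is None:
        rng = np.random.default_rng()
    height, width = frame.shape[:2]
    side = min(height, width)  # assumed crop size: the shorter frame side
    top = int(rng.integers(0, height - side + 1))
    left = int(rng.integers(0, width - side + 1))
    crop = frame[top:top + side, left:left + side]
    # Nearest-neighbour resize: pick the source row/column for each output pixel.
    idx = np.arange(size) * side // size
    return crop[idx][:, idx]

# Synthetic 360x640 frame standing in for a decoded film frame.
frame = np.zeros((360, 640, 3), dtype=np.uint8)
patch = random_crop_resize(frame, rng=np.random.default_rng(0))
```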

Acknowledgement

We are grateful for the help from Lvmin Zhang and Style2Paints Research

License

Licensed under CC BY-NC-SA 4.0 (https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode).
Commercial application is prohibited; please retain this license if you clone this repo

Citation

If you use this code for your research, please cite our paper:

@InProceedings{Wang_2020_CVPR,
    author = {Wang, Xinrui and Yu, Jinze},
    title = {Learning to Cartoonize Using White-Box Cartoon Representations},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2020}
}

Chinese Community
