Real-ESRGAN
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
Top Related Projects
- 🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows.
- waifu2x converter, ncnn version; runs fast on Intel / AMD / Nvidia / Apple-silicon GPUs with Vulkan
- Video, image and GIF upscaling (super-resolution) and video frame interpolation, achieved with Waifu2x, Real-ESRGAN, Real-CUGAN, RTX Video Super Resolution VSR, SRMD, RealSR, Anime4K, RIFE, IFRNet, CAIN, DAIN, and ACNet.
- GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
- SwinIR: Image Restoration Using Swin Transformer (official repository)
Quick Overview
Real-ESRGAN is an open-source project that enhances the quality of images and videos using AI-powered super-resolution techniques. It builds upon the ESRGAN model, offering improved performance and practical applications for real-world image restoration tasks.
Pros
- Produces high-quality, realistic image upscaling results
- Supports both image and video enhancement
- Offers pre-trained models for easy use
- Includes a user-friendly GUI application for non-technical users
Cons
- Requires significant computational resources for training and inference
- May introduce artifacts in some cases, especially with extreme upscaling
- Limited customization options for non-expert users
- Dependency on specific deep learning frameworks may limit portability
Code Examples
- Basic image upscaling (a minimal sketch using the official RealESRGANer helper; the weight path is an assumption):
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
import cv2
import torch

# Build the x4 RRDB backbone and wrap it in the RealESRGANer helper
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth', model=model, device=device)

# enhance() takes a BGR numpy array and returns (output, img_mode)
img = cv2.imread('input.jpg', cv2.IMREAD_UNCHANGED)
output, _ = upsampler.enhance(img, outscale=4)
cv2.imwrite('output.jpg', output)
- Video enhancement (a frame-by-frame sketch; paths and codec are assumptions):
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
import cv2
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth', model=model, device=device)

# Derive the output geometry and frame rate from the input instead of hard-coding them
input_video = cv2.VideoCapture('input.mp4')
fps = input_video.get(cv2.CAP_PROP_FPS)
width = int(input_video.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(input_video.get(cv2.CAP_PROP_FRAME_HEIGHT))
output_video = cv2.VideoWriter('output.mp4', cv2.VideoWriter_fourcc(*'mp4v'), fps, (width * 4, height * 4))

while True:
    ret, frame = input_video.read()
    if not ret:
        break
    enhanced_frame, _ = upsampler.enhance(frame, outscale=4)  # returns (output, img_mode)
    output_video.write(enhanced_frame)

input_video.release()
output_video.release()
- Batch processing of images (same setup; reads with OpenCV and writes enhanced copies):
import os
import cv2
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth', model=model, device=device)

input_dir = 'input_images'
output_dir = 'output_images'
os.makedirs(output_dir, exist_ok=True)  # ensure the output folder exists

for filename in os.listdir(input_dir):
    if filename.lower().endswith(('.png', '.jpg', '.jpeg')):
        img = cv2.imread(os.path.join(input_dir, filename), cv2.IMREAD_UNCHANGED)
        output, _ = upsampler.enhance(img, outscale=4)
        cv2.imwrite(os.path.join(output_dir, f'enhanced_{filename}'), output)
Getting Started
1. Install the required dependencies:
pip install realesrgan torch opencv-python
2. Download the pre-trained weights from the project's GitHub repository.
3. Use the following code to enhance an image (the same hedged RealESRGANer sketch as above):
import cv2
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='path/to/RealESRGAN_x4plus.pth', model=model, device=device)
output, _ = upsampler.enhance(cv2.imread('input.jpg', cv2.IMREAD_UNCHANGED), outscale=4)
cv2.imwrite('output.jpg', output)
4. For more advanced usage and options, refer to the project's documentation on GitHub.
Competitor Comparisons
🆙 Upscayl - #1 Free and Open Source AI Image Upscaler for Linux, MacOS and Windows.
Pros of Upscayl
- User-friendly GUI application for easy image upscaling
- Cross-platform support (Windows, macOS, Linux)
- Integrates multiple AI upscaling models, including Real-ESRGAN
Cons of Upscayl
- Limited to pre-trained models, less flexibility for custom training
- Slower processing compared to command-line implementations
- Requires more system resources due to GUI overhead
Code Comparison
Real-ESRGAN (Python):
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth', model=model)
upsampled_image, _ = upsampler.enhance(input_image)  # input_image: BGR numpy array
Upscayl (JavaScript/Electron; an illustrative sketch, the exact API may differ):
const { upscale } = require('@upscayl/core');
const result = await upscale(inputPath, outputPath, {
model: 'realesrgan-x4plus',
scale: 4
});
Both repositories focus on image upscaling using AI models, with Real-ESRGAN being the underlying technology and Upscayl providing a user-friendly interface. Real-ESRGAN offers more flexibility for developers and researchers, while Upscayl caters to end-users seeking a simple upscaling solution without coding knowledge.
waifu2x converter ncnn version, runs fast on intel / amd / nvidia / apple-silicon GPU with vulkan
Pros of waifu2x-ncnn-vulkan
- Faster processing speed due to Vulkan GPU acceleration
- Smaller memory footprint, suitable for devices with limited resources
- Supports more input and output formats (PNG, JPG, WebP)
Cons of waifu2x-ncnn-vulkan
- Native models target 2x upscaling (larger factors require repeated passes), while Real-ESRGAN offers native 4x models
- Less effective at handling complex textures and details
- Primarily designed for anime-style images, may not perform as well on real-world photos
Code Comparison
waifu2x-ncnn-vulkan:
int waifu2x(const cv::Mat& inimage, cv::Mat& outimage, int noise, int scale, int tilesize_x, int tilesize_y, int prepadding, int gpu_id)
{
    ncnn::VulkanDevice* vkdev = ncnn::get_gpu_device(gpu_id);
    // ... (implementation details)
}
Real-ESRGAN:
def inference(model, img):
    # img: a CHW tensor scaled to [0, 1]; `device` is defined as in the examples above
    img = img.unsqueeze(0).to(device)
    with torch.no_grad():
        output = model(img)
    return output.squeeze().float().cpu().clamp_(0, 1).numpy()
The code snippets show that waifu2x-ncnn-vulkan uses C++ with Vulkan for GPU acceleration, while Real-ESRGAN utilizes Python with PyTorch for its implementation.
Video, Image and GIF upscale/enlarge(Super-Resolution) and Video frame interpolation. Achieved with Waifu2x, Real-ESRGAN, Real-CUGAN, RTX Video Super Resolution VSR, SRMD, RealSR, Anime4K, RIFE, IFRNet, CAIN, DAIN, and ACNet.
Pros of Waifu2x-Extension-GUI
- User-friendly graphical interface for easy operation
- Supports multiple AI models, including Waifu2x, Real-ESRGAN, and others
- Batch processing capabilities for multiple images or videos
Cons of Waifu2x-Extension-GUI
- May have slower processing speed compared to Real-ESRGAN
- Requires more system resources due to the GUI and multiple model support
- Less flexibility for advanced users or integration into other workflows
Code Comparison
While a direct code comparison is not particularly relevant due to the different nature of these projects (Real-ESRGAN being a model implementation and Waifu2x-Extension-GUI being a GUI wrapper), we can look at how they might be used:
Real-ESRGAN:
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='weights/RealESRGAN_x4plus.pth', model=model)
output, _ = upsampler.enhance(img)  # img: BGR numpy array, e.g. from cv2.imread
Waifu2x-Extension-GUI:
# No direct Python usage; it's a GUI application
# Users interact with the interface to select files and settings
Real-ESRGAN is more suitable for developers and researchers who want to integrate the model into their own projects, while Waifu2x-Extension-GUI is designed for end-users who prefer a simple, graphical interface for image upscaling tasks.
GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
Pros of GFPGAN
- Specialized in face restoration and enhancement
- Incorporates facial component dictionaries for improved detail reconstruction
- Offers better preservation of facial features and identity
Cons of GFPGAN
- Limited to face-specific applications, less versatile for general image upscaling
- May introduce artifacts in non-facial areas of images
- Potentially higher computational requirements due to facial component analysis
Code Comparison
GFPGAN:
from gfpgan import GFPGANer
restorer = GFPGANer(model_path='experiments/pretrained_models/GFPGANv1.3.pth', upscale=2)
# enhance() returns (cropped_faces, restored_faces, restored_img)
_, _, restored_img = restorer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True)
Real-ESRGAN:
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
upsampler = RealESRGANer(scale=4, model_path='experiments/pretrained_models/RealESRGAN_x4plus.pth', model=model)
output, _ = upsampler.enhance(img, outscale=3.5)
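The two libraries also compose: GFPGANer accepts a bg_upsampler, so Real-ESRGAN can upscale the background while GFPGAN restores the faces (roughly what the inference script's --face_enhance flag wires up). A minimal sketch, reusing the weight paths above:
from basicsr.archs.rrdbnet_arch import RRDBNet
from gfpgan import GFPGANer
from realesrgan import RealESRGANer

# Real-ESRGAN upscales the non-face regions; GFPGAN restores the faces
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
bg_upsampler = RealESRGANer(scale=4, model_path='experiments/pretrained_models/RealESRGAN_x4plus.pth', model=model)
restorer = GFPGANer(model_path='experiments/pretrained_models/GFPGANv1.3.pth', upscale=2,
                    bg_upsampler=bg_upsampler)
_, _, restored_img = restorer.enhance(img, has_aligned=False, only_center_face=False, paste_back=True)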
SwinIR: Image Restoration Using Swin Transformer (official repository)
Pros of SwinIR
- Utilizes the Swin Transformer architecture, which can capture long-range dependencies more effectively
- Offers a wider range of image restoration tasks, including denoising and JPEG compression artifact removal
- Provides pre-trained models for various tasks and scales
Cons of SwinIR
- Generally slower inference time compared to Real-ESRGAN
- May require more computational resources due to the transformer-based architecture
- Less focus on real-world image super-resolution scenarios
Code Comparison
SwinIR:
from models.network_swinir import SwinIR
model = SwinIR(upscale=4, in_chans=3, img_size=64, window_size=8,
               img_range=1., depths=[6, 6, 6, 6], embed_dim=60, num_heads=[6, 6, 6, 6],
               mlp_ratio=2, upsampler='pixelshuffledirect', resi_connection='1conv')
Real-ESRGAN:
from basicsr.archs.rrdbnet_arch import RRDBNet
model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=4)
The code snippets show the model initialization for both projects. SwinIR uses a transformer-based architecture, while Real-ESRGAN employs a CNN-based approach with RRDB blocks.
README
👀Demos | 🚩Updates | ⚡Usage | 🏰Model Zoo | 🔧Install | 💻Train | ❓FAQ | 🎨Contribution
🔥 AnimeVideo-v3 model (small model for anime videos). Please see [anime video models] and [comparisons]
🔥 RealESRGAN_x4plus_anime_6B for anime images (anime illustration model). Please see [anime_model]
- 💥 Update the online Replicate demo:
- Online Colab demo for Real-ESRGAN: | Online Colab demo for Real-ESRGAN (anime videos):
- Portable Windows / Linux / MacOS executable files for Intel/AMD/Nvidia GPU. You can find more information here. The ncnn implementation is in Real-ESRGAN-ncnn-vulkan
Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
We extend the powerful ESRGAN to a practical restoration application (namely, Real-ESRGAN), which is trained with pure synthetic data.
🌌 Thanks for your valuable feedback/suggestions. All the feedback is updated in feedback.md.
If Real-ESRGAN is helpful, please help to ⭐ this repo or recommend it to your friends 😊
Other recommended projects:
▶️ GFPGAN: A practical algorithm for real-world face restoration
▶️ BasicSR: An open-source image and video restoration toolbox
▶️ facexlib: A collection that provides useful face-related functions
▶️ HandyView: A PyQt5-based image viewer that is handy for viewing and comparison
▶️ HandyFigure: Open source of paper figures
📖 Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data
[Paper] [YouTube Video] [Bilibili explanation] [Poster] [PPT slides]
Xintao Wang, Liangbin Xie, Chao Dong, Ying Shan
Tencent ARC Lab; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
🚩 Updates
- ✅ Add the realesr-general-x4v3 model - a tiny model for general scenes. It also supports the -dn option to balance the noise (avoiding over-smooth results); -dn is short for denoising strength (see the example command after this list).
- ✅ Update the RealESRGAN AnimeVideo-v3 model. Please see anime video models and comparisons for more details.
- ✅ Add small models for anime videos. More details are in anime video models.
- ✅ Add the ncnn implementation Real-ESRGAN-ncnn-vulkan.
- ✅ Add RealESRGAN_x4plus_anime_6B.pth, which is optimized for anime images with much smaller model size. More details and comparisons with waifu2x are in anime_model.md
- ✅ Support finetuning on your own data or paired data (i.e., finetuning ESRGAN). See here
- ✅ Integrate GFPGAN to support face enhancement.
- ✅ Integrated to Huggingface Spaces with Gradio. See Gradio Web Demo. Thanks @AK391
- ✅ Support arbitrary scale with --outscale (it actually further resizes outputs with LANCZOS4). Add the RealESRGAN_x2plus.pth model.
- ✅ The inference code supports: 1) tile options; 2) images with alpha channel; 3) gray images; 4) 16-bit images.
- ✅ The training codes have been released. A detailed guide can be found in Training.md.
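As a concrete illustration of the -dn option mentioned above, a hedged example invocation (the short flag appears to map to --denoise_strength in inference_realesrgan.py; 0 keeps the noise, 1 denoises strongly):
# assumes realesr-general-x4v3.pth has been downloaded into weights/
python inference_realesrgan.py -n realesr-general-x4v3 -i inputs -dn 0.5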
👀 Demos Videos
Bilibili
YouTube
🔧 Dependencies and Installation
- Python >= 3.7 (Recommend to use Anaconda or Miniconda)
- PyTorch >= 1.7
Installation
1. Clone repo
git clone https://github.com/xinntao/Real-ESRGAN.git
cd Real-ESRGAN
2. Install dependent packages
# Install basicsr - https://github.com/xinntao/BasicSR
# We use BasicSR for both training and inference
pip install basicsr
# facexlib and gfpgan are for face enhancement
pip install facexlib
pip install gfpgan
pip install -r requirements.txt
python setup.py develop
⚡ Quick Inference
There are usually three ways to run inference with Real-ESRGAN.
Online inference
- You can try it on our website: ARC Demo (now only supports RealESRGAN_x4plus_anime_6B)
- Colab Demo for Real-ESRGAN | Colab Demo for Real-ESRGAN (anime videos).
Portable executable files (NCNN)
You can download Windows / Linux / MacOS executable files for Intel/AMD/Nvidia GPU.
This executable file is portable and includes all the binaries and models required. No CUDA or PyTorch environment is needed.
You can simply run the following command (a Windows example; more information is in the README.md of each executable file):
./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n model_name
We have provided five models:
- realesrgan-x4plus (default)
- realesrnet-x4plus
- realesrgan-x4plus-anime (optimized for anime images, small model size)
- realesr-animevideov3 (animation videos)
- realesr-general-x4v3 (a tiny general model; supports the -dn denoise-strength option)
You can use the -n argument for other models, for example: ./realesrgan-ncnn-vulkan.exe -i input.jpg -o output.png -n realesrnet-x4plus
Usage of portable executable files
- Please refer to Real-ESRGAN-ncnn-vulkan for more details.
- Note that it does not support all the functions (such as outscale) of the python script inference_realesrgan.py.
Usage: realesrgan-ncnn-vulkan.exe -i infile -o outfile [options]...
-h show this help
-i input-path input image path (jpg/png/webp) or directory
-o output-path output image path (jpg/png/webp) or directory
-s scale upscale ratio (can be 2, 3, 4. default=4)
-t tile-size tile size (>=32/0=auto, default=0) can be 0,0,0 for multi-gpu
-m model-path folder path to the pre-trained models. default=models
-n model-name model name (default=realesr-animevideov3, can be realesr-animevideov3 | realesrgan-x4plus | realesrgan-x4plus-anime | realesrnet-x4plus)
-g gpu-id gpu device to use (default=auto) can be 0,1,2 for multi-gpu
-j load:proc:save thread count for load/proc/save (default=1:2:2) can be 1:2,2,2:2 for multi-gpu
-x enable tta mode
-f format output image format (jpg/png/webp, default=ext/png)
-v verbose output
Note that it may introduce block inconsistency (and also generate slightly different results from the PyTorch implementation), because this executable file first crops the input image into several tiles, processes them separately, and finally stitches them together.
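The tile-and-stitch behaviour it describes looks roughly like the following Python sketch (a simplification, not the executable's actual C++ code; the real implementation pads each tile and crops the overlap to reduce the seams mentioned above):
import numpy as np

def upscale_tiled(img, upscale_fn, tile=256, scale=4):
    """Tile img, upscale each tile independently, and stitch the results.

    upscale_fn is any function mapping an (h, w, c) tile to an
    (h*scale, w*scale, c) array. Upscaling tiles without seeing their
    neighbours is exactly what can cause block inconsistency at seams.
    """
    h, w, c = img.shape
    out = np.zeros((h * scale, w * scale, c), dtype=img.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = img[y:y + tile, x:x + tile]
            ph, pw = patch.shape[:2]
            out[y * scale:(y + ph) * scale, x * scale:(x + pw) * scale] = upscale_fn(patch)
    return out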
Python script
Usage of python script
- You can use the X4 model for arbitrary output sizes with the --outscale argument. The program performs a further cheap resize operation after the Real-ESRGAN output.
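Conceptually, the model always produces its native 4x output and the extra factor comes from a cheap interpolation pass. A sketch of that final step, assuming OpenCV's LANCZOS4 filter (the updates above name LANCZOS4 as the resize method):
import cv2

def resize_to_outscale(native_output, input_shape, outscale=3.5):
    # native_output: the model's 4x result; bring it to the requested --outscale factor
    h, w = input_shape[:2]
    return cv2.resize(native_output, (int(w * outscale), int(h * outscale)),
                      interpolation=cv2.INTER_LANCZOS4)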
Usage: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile -o outfile [options]...
A common command: python inference_realesrgan.py -n RealESRGAN_x4plus -i infile --outscale 3.5 --face_enhance
-h show this help
-i --input Input image or folder. Default: inputs
-o --output Output folder. Default: results
-n --model_name Model name. Default: RealESRGAN_x4plus
-s, --outscale The final upsampling scale of the image. Default: 4
--suffix Suffix of the restored image. Default: out
-t, --tile Tile size, 0 for no tile during testing. Default: 0
--face_enhance Whether to use GFPGAN to enhance face. Default: False
--fp32 Use fp32 precision during inference. Default: fp16 (half precision).
--ext Image extension. Options: auto | jpg | png, auto means using the same extension as inputs. Default: auto
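A couple of hedged example invocations composed from the flags above:
# tile a large image to fit limited GPU memory, keeping fp32 precision
python inference_realesrgan.py -n RealESRGAN_x4plus -i input.jpg -t 256 --fp32
# 2x output from the x4 model, with GFPGAN face enhancement
python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --outscale 2 --face_enhance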
Inference general images
Download pre-trained models: RealESRGAN_x4plus.pth
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth -P weights
Inference!
python inference_realesrgan.py -n RealESRGAN_x4plus -i inputs --face_enhance
Results are in the results folder.
Inference anime images
Pre-trained models: RealESRGAN_x4plus_anime_6B
More details and comparisons with waifu2x are in anime_model.md
# download model
wget https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.2.4/RealESRGAN_x4plus_anime_6B.pth -P weights
# inference
python inference_realesrgan.py -n RealESRGAN_x4plus_anime_6B -i inputs
Results are in the results folder.
BibTeX
@InProceedings{wang2021realesrgan,
    author    = {Xintao Wang and Liangbin Xie and Chao Dong and Ying Shan},
    title     = {Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
    booktitle = {International Conference on Computer Vision Workshops (ICCVW)},
    date      = {2021}
}
📧 Contact
If you have any question, please email xintao.wang@outlook.com or xintaowang@tencent.com.
𧩠Projects that use Real-ESRGAN
If you develop/use Real-ESRGAN in your projects, you are welcome to let me know.
- NCNN-Android: RealSR-NCNN-Android by tumuyan
- VapourSynth: vs-realesrgan by HolyWu
- NCNN: Real-ESRGAN-ncnn-vulkan
GUI
- Waifu2x-Extension-GUI by AaronFeng753
- Squirrel-RIFE by Justin62628
- Real-GUI by scifx
- Real-ESRGAN_GUI by net2cn
- Real-ESRGAN-EGUI by WGzeyu
- anime_upscaler by shangar21
- Upscayl by Nayam Amarshe and TGS963
🤗 Acknowledgement
Thanks to all the contributors.
- AK391: Integrate RealESRGAN to Huggingface Spaces with Gradio. See Gradio Web Demo.
- Asiimoviet: Translate the README.md to Chinese.
- 2ji3150: Thanks for the detailed and valuable feedback/suggestions.
- Jared-02: Translate the Training.md to Chinese.