
cszn/KAIR

Image Restoration Toolbox (PyTorch). Training and testing codes for DPIR, USRNet, DnCNN, FFDNet, SRMD, DPSR, BSRGAN, SwinIR


Top Related Projects

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

4,526

SwinIR: Image Restoration Using Swin Transformer (official repository)

6,086

ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.

Tensorflow 2.x based implementation of EDSR, WDSR and SRGAN for single image super-resolution

Quick Overview

KAIR is a PyTorch toolbox for image restoration and enhancement with deep learning. It provides training and testing code for models such as DnCNN, FFDNet, SRMD, DPSR, USRNet, BSRGAN, and SwinIR, covering tasks such as denoising, super-resolution, and deblurring.

Pros

  • Extensive collection of pre-trained models for various image restoration tasks
  • Well-organized codebase with modular architecture for easy customization
  • Implemented in PyTorch, with several pre-trained models converted from their original MatConvNet versions
  • Includes data preparation scripts and utility functions for dataset handling

Cons

  • Limited documentation for some advanced features and customizations
  • Requires significant computational resources for training large models
  • Some older models may not be actively maintained or updated
  • Steep learning curve for users new to deep learning in image processing

Code Examples

  1. Loading a pre-trained denoising model:
import torch
from models.network_dncnn import DnCNN
from utils import utils_image as util

# Assumption: dncnn3.pth is a grayscale DnCNN with 20 layers; check main_test_dncnn.py for the exact settings
model = DnCNN(in_nc=1, out_nc=1, nc=64, nb=20, act_mode='R')
model.load_state_dict(torch.load('model_zoo/dncnn3.pth', map_location='cpu'), strict=True)
model.eval()
  2. Performing image denoising:
noisy_img = util.imread_uint('noisy_image.png', n_channels=1)  # H x W x 1, uint8
noisy_img = util.uint2tensor4(noisy_img)  # 1 x 1 x H x W, float in [0, 1]

with torch.no_grad():
    denoised_img = model(noisy_img)

denoised_img = util.tensor2uint(denoised_img)
util.imsave(denoised_img, 'denoised_image.png')
  3. Training a super-resolution model:
import torch
from models.network_msrresnet import MSRResNet1
from data.dataset_sr import DatasetSR
from torch.utils.data import DataLoader

model = MSRResNet1(in_nc=3, out_nc=3, nc=64, nb=16, upscale=4)
# Assumption: DatasetSR takes an options dict like the "datasets" entries of the JSON files in options/
train_set = DatasetSR({'phase': 'train', 'n_channels': 3, 'scale': 4, 'H_size': 96,
                       'dataroot_H': 'trainsets/trainH', 'dataroot_L': None})
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = torch.nn.L1Loss()

for epoch in range(100):
    for data in train_loader:
        img_L, img_H = data['L'], data['H']  # low- and high-resolution patches
        optimizer.zero_grad()
        img_E = model(img_L)
        loss = criterion(img_E, img_H)
        loss.backward()
        optimizer.step()
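
To check the output of the denoising or super-resolution examples above against a ground-truth image, a peak signal-to-noise ratio (PSNR) can be computed. The sketch below is a standalone, pure-Python illustration for 8-bit pixel values; the function name `psnr_uint8` is hypothetical, and this is not KAIR's own evaluation code.

```python
import math
from typing import Sequence


def psnr_uint8(reference: Sequence[int], estimate: Sequence[int]) -> float:
    """PSNR in dB between two equally sized sequences of 8-bit pixel values."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and have the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    # 255 is the peak value of an 8-bit image.
    return 20 * math.log10(255.0 / math.sqrt(mse))


clean = [0, 64, 128, 255]     # flattened ground-truth pixels
restored = [1, 63, 129, 254]  # flattened model output, each value off by one
print(round(psnr_uint8(clean, restored), 2))
```

For real images, the same formula would normally be applied to the flattened arrays of the restored and reference images.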

Getting Started

  1. Clone the repository:

    git clone https://github.com/cszn/KAIR.git
    cd KAIR
    
  2. Install dependencies:

    pip install -r requirement.txt
    
  3. Download pre-trained models:

    python main_download_pretrained_models.py
    
  4. Run a demo:

    python main_test_dncnn.py
    

Competitor Comparisons

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.

Pros of Real-ESRGAN

  • Focuses specifically on real-world image super-resolution
  • Implements a more advanced degradation model for training
  • Provides pre-trained models for immediate use

Cons of Real-ESRGAN

  • Limited to super-resolution tasks
  • Less comprehensive in terms of image restoration techniques
  • Fewer options for customization and experimentation

Code Comparison

Real-ESRGAN:

from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
upsampler = RealESRGANer(model_path='weights/RealESRGAN_x4plus.pth', model=model, scale=4)

KAIR:

import torch
from models.network_unet import UNetRes
from utils import utils_image as util

model = UNetRes(in_nc=1, out_nc=1, nc=[64, 128, 256, 512], nb=4, act_mode='R', downsample_mode="strideconv", upsample_mode="convtranspose")
model.eval()
img_L = util.imread_uint('input.png', n_channels=1)
img_L = util.uint2tensor4(img_L)  # the network expects a 4-D float tensor, not a uint8 array
with torch.no_grad():
    img_E = model(img_L)

Summary

Real-ESRGAN excels in real-world super-resolution tasks with ready-to-use models, while KAIR offers a broader range of image restoration techniques and greater flexibility for researchers. Real-ESRGAN is more user-friendly for specific super-resolution applications, whereas KAIR provides a comprehensive toolkit for various image processing tasks and experimentation.


SwinIR: Image Restoration Using Swin Transformer (official repository)

Pros of SwinIR

  • Utilizes the Swin Transformer architecture, which can capture long-range dependencies more effectively
  • Achieves state-of-the-art performance on various image restoration tasks
  • Provides pre-trained models for different applications (e.g., image denoising, super-resolution)

Cons of SwinIR

  • More complex architecture, potentially requiring more computational resources
  • Limited to specific image restoration tasks compared to KAIR's broader scope
  • May have a steeper learning curve for implementation and customization

Code Comparison

SwinIR:

from models.network_swinir import SwinIR
model = SwinIR(upscale=4, in_chans=3, img_size=64, window_size=8,
               img_range=1., depths=[6, 6, 6, 6], embed_dim=60, num_heads=[6, 6, 6, 6],
               mlp_ratio=2, upsampler='pixelshuffledirect', resi_connection='1conv')

KAIR:

from models.network_unet import UNetRes
model = UNetRes(in_nc=3, out_nc=3, nc=[64, 128, 256, 512], nb=4, act_mode='R',
                downsample_mode='strideconv', upsample_mode='convtranspose')

Both repositories offer powerful image restoration solutions. SwinIR focuses on a single transformer-based architecture, while KAIR covers a wider set of models, including CNN-based networks such as DnCNN, FFDNet, and USRNet alongside SwinIR itself.


ECCV18 Workshops - Enhanced SRGAN. Champion PIRM Challenge on Perceptual Super-Resolution. The training codes are in BasicSR.

Pros of ESRGAN

  • Focused specifically on super-resolution tasks, providing a more specialized solution
  • Includes pre-trained models for quick implementation and testing
  • Offers a perceptual loss function for improved visual quality

Cons of ESRGAN

  • Limited to super-resolution tasks, while KAIR supports multiple image restoration tasks
  • Less active development and updates compared to KAIR
  • Fewer options for customization and experimentation

Code Comparison

ESRGAN:

from models.archs.arch_util import initialize_weights
from models.archs.rrdb_net import RRDBNet

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32)
initialize_weights(model, scale=0.1)

KAIR:

from models.network_unet import UNetRes as net
from models.select_network import init_weights  # assumed location of KAIR's weight-initialization helper

model = net(in_nc=3, out_nc=3, nc=[64, 128, 256, 512], nb=4, act_mode='R', downsample_mode='strideconv', upsample_mode='convtranspose')
init_weights(model, init_type='orthogonal')

Both repositories offer implementations of deep learning models for image processing tasks. ESRGAN is more specialized for super-resolution, while KAIR provides a broader range of image restoration capabilities. ESRGAN may be easier to use for specific super-resolution tasks, but KAIR offers more flexibility and ongoing development for various image processing applications.

Tensorflow 2.x based implementation of EDSR, WDSR and SRGAN for single image super-resolution

Pros of super-resolution

  • Focuses specifically on super-resolution tasks, making it more specialized and potentially easier to use for this specific application
  • Implements multiple state-of-the-art super-resolution models, providing a variety of options for users
  • Includes pre-trained models, allowing for quick implementation and testing

Cons of super-resolution

  • Less comprehensive than KAIR, which covers a broader range of image restoration tasks
  • May have fewer active contributors and updates compared to KAIR
  • Documentation might be less extensive, potentially making it harder for new users to get started

Code Comparison

KAIR example:

from models.network_unet import UNetRes as net
model = net(in_nc=3, out_nc=3, nc=[64, 128, 256, 512], nb=4, act_mode='R', downsample_mode='strideconv', upsample_mode='convtranspose')

super-resolution example:

from model import resolve_single
from model.srgan import generator
model = generator()
sr_image = resolve_single(model, lr_image)

Both repositories provide implementations of image enhancement models, but KAIR offers a more comprehensive toolkit for various image restoration tasks, while super-resolution focuses specifically on super-resolution techniques.


README

Training and testing codes for USRNet, DnCNN, FFDNet, SRMD, DPSR, MSRResNet, ESRGAN, BSRGAN, SwinIR, VRT, RVRT


Kai Zhang

Computer Vision Lab, ETH Zurich, Switzerland


The following results are obtained by our SCUNet with purely synthetic training data! We did not use the paired noisy/clean data by DND and SIDD during training!

[Figure: Real-World Image (x4) | BSRGAN, ICCV2021 | Real-ESRGAN | SwinIR (ours)]
  • News (2021-08-31): We upload the training code of BSRGAN.

  • News (2021-08-24): We upload the BSRGAN degradation model.

  • News (2021-08-22): Support multi-feature-layer VGG perceptual loss and UNet discriminator.

  • News (2021-08-18): We upload the extended BSRGAN degradation model. It is slightly different from our published version.

  • News (2021-06-03): Add testing codes of GPEN (CVPR21) for face image enhancement: main_test_face_enhancement.py

import logging
from utils.utils_modelsummary import get_model_activation, get_model_flops

logger = logging.getLogger(__name__)  # any configured logger works here
# `model` is the network being profiled (a torch.nn.Module built beforehand)
input_dim = (3, 256, 256)  # set the input dimension
activations, num_conv2d = get_model_activation(model, input_dim)
logger.info('{:>16s} : {:<.4f} [M]'.format('#Activations', activations/10**6))
logger.info('{:>16s} : {:<d}'.format('#Conv2d', num_conv2d))
flops = get_model_flops(model, input_dim, False)
logger.info('{:>16s} : {:<.4f} [G]'.format('FLOPs', flops/10**9))
num_parameters = sum(map(lambda x: x.numel(), model.parameters()))
logger.info('{:>16s} : {:<.4f} [M]'.format('#Params', num_parameters/10**6))
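
As a rough cross-check of the `#Params` figure logged above, the parameter count of a plain convolutional stack can be worked out by hand. The sketch below is illustrative only: it assumes a DnCNN-like layout (17 layers of 3x3 convolutions, 64 feature channels, grayscale input and output, biases, no batch normalization), and the helper names are hypothetical rather than part of KAIR.

```python
def conv2d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Number of learnable parameters in one 2-D convolution layer."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def plain_cnn_params(in_ch: int, out_ch: int, feat: int, layers: int, kernel: int = 3) -> int:
    """Parameters of a conv stack: in_ch -> feat, (layers - 2) x feat -> feat, feat -> out_ch."""
    if layers < 2:
        raise ValueError("need at least a head and a tail layer")
    head = conv2d_params(in_ch, feat, kernel)
    body = (layers - 2) * conv2d_params(feat, feat, kernel)
    tail = conv2d_params(feat, out_ch, kernel)
    return head + body + tail


# Assumed DnCNN-like configuration, used only for illustration.
total = plain_cnn_params(in_ch=1, out_ch=1, feat=64, layers=17)
print('{:>16s} : {:<.4f} [M]'.format('#Params', total / 10**6))
```

If the hand count and the logged value disagree, the usual causes are extra layers such as batch normalization or a different number of blocks.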

Clone repo

git clone https://github.com/cszn/KAIR.git
cd KAIR
pip install -r requirement.txt

Training

You should modify the json file from options first, for example, setting "gpu_ids": [0,1,2,3] if 4 GPUs are used, setting "dataroot_H": "trainsets/trainH" if path of the high quality dataset is trainsets/trainH.
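
The two settings named above can also be applied programmatically. This is a minimal sketch that assumes the option file has already been parsed into a dictionary with `gpu_ids` at the top level and `dataroot_H` under `datasets` -> `train`; that nesting and the helper name `set_training_options` are assumptions, so compare them against the JSON file you are editing.

```python
import copy
import json
from typing import Any, Dict, List


def set_training_options(opt: Dict[str, Any], gpu_ids: List[int], dataroot_h: str) -> Dict[str, Any]:
    """Return a copy of `opt` with the GPU list and high-quality data path updated."""
    updated = copy.deepcopy(opt)  # leave the caller's dictionary untouched
    updated["gpu_ids"] = list(gpu_ids)
    # Assumed layout: dataset paths live under datasets -> train.
    updated.setdefault("datasets", {}).setdefault("train", {})["dataroot_H"] = dataroot_h
    return updated


# Stand-in for an already parsed options file (hypothetical minimal content).
base_opt = {"gpu_ids": [0], "datasets": {"train": {"dataroot_H": None}}}
opt = set_training_options(base_opt, gpu_ids=[0, 1, 2, 3], dataroot_h="trainsets/trainH")
print(json.dumps(opt, indent=2))
```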

  • Training with DataParallel - PSNR
python main_train_psnr.py --opt options/train_msrresnet_psnr.json
  • Training with DataParallel - GAN
python main_train_gan.py --opt options/train_msrresnet_gan.json
  • Training with DistributedDataParallel - PSNR - 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_psnr.py --opt options/train_msrresnet_psnr.json  --dist True
  • Training with DistributedDataParallel - PSNR - 8 GPUs
python -m torch.distributed.launch --nproc_per_node=8 --master_port=1234 main_train_psnr.py --opt options/train_msrresnet_psnr.json  --dist True
  • Training with DistributedDataParallel - GAN - 4 GPUs
python -m torch.distributed.launch --nproc_per_node=4 --master_port=1234 main_train_gan.py --opt options/train_msrresnet_gan.json  --dist True
  • Training with DistributedDataParallel - GAN - 8 GPUs
python -m torch.distributed.launch --nproc_per_node=8 --master_port=1234 main_train_gan.py --opt options/train_msrresnet_gan.json  --dist True
  • Kill distributed training processes of main_train_gan.py
kill $(ps aux | grep main_train_gan.py | grep -v grep | awk '{print $2}')
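
The DistributedDataParallel commands above differ only in the script, the option file, and the number of GPUs. The sketch below assembles the same command line as an argument list; the helper name `build_ddp_command` is hypothetical, and nothing is executed unless the list is passed to something like `subprocess.run`.

```python
import shlex
from typing import List


def build_ddp_command(script: str, opt_path: str, n_gpus: int, master_port: int = 1234) -> List[str]:
    """Build the torch.distributed.launch command line used in the list above."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={n_gpus}",
        f"--master_port={master_port}",
        script,
        "--opt", opt_path,
        "--dist", "True",
    ]


cmd = build_ddp_command("main_train_psnr.py", "options/train_msrresnet_psnr.json", n_gpus=4)
print(shlex.join(cmd))
```

Keeping the command as a list avoids shell-quoting problems if an option path ever contains spaces.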

Method          Original Link
DnCNN           https://github.com/cszn/DnCNN
FDnCNN          https://github.com/cszn/DnCNN
FFDNet          https://github.com/cszn/FFDNet
SRMD            https://github.com/cszn/SRMD
DPSR-SRResNet   https://github.com/cszn/DPSR
SRResNet        https://github.com/xinntao/BasicSR
ESRGAN          https://github.com/xinntao/ESRGAN
RRDB            https://github.com/xinntao/ESRGAN
IMDN            https://github.com/Zheng222/IMDN
USRNet          https://github.com/cszn/USRNet
DRUNet          https://github.com/cszn/DPIR
DPIR            https://github.com/cszn/DPIR
BSRGAN          https://github.com/cszn/BSRGAN
SwinIR          https://github.com/JingyunLiang/SwinIR
VRT             https://github.com/JingyunLiang/VRT
DiffPIR         https://github.com/yuanzhi-zhu/DiffPIR

Network architectures

  • FFDNet

  • SRMD

  • SRResNet, SRGAN, RRDB, ESRGAN

  • IMDN


Testing

Method                        model_zoo
main_test_dncnn.py            dncnn_15.pth, dncnn_25.pth, dncnn_50.pth, dncnn_gray_blind.pth, dncnn_color_blind.pth, dncnn3.pth
main_test_ircnn_denoiser.py   ircnn_gray.pth, ircnn_color.pth
main_test_fdncnn.py           fdncnn_gray.pth, fdncnn_color.pth, fdncnn_gray_clip.pth, fdncnn_color_clip.pth
main_test_ffdnet.py           ffdnet_gray.pth, ffdnet_color.pth, ffdnet_gray_clip.pth, ffdnet_color_clip.pth
main_test_srmd.py             srmdnf_x2.pth, srmdnf_x3.pth, srmdnf_x4.pth, srmd_x2.pth, srmd_x3.pth, srmd_x4.pth
(The above models are converted from MatConvNet.)
main_test_dpsr.py             dpsr_x2.pth, dpsr_x3.pth, dpsr_x4.pth, dpsr_x4_gan.pth
main_test_msrresnet.py        msrresnet_x4_psnr.pth, msrresnet_x4_gan.pth
main_test_rrdb.py             rrdb_x4_psnr.pth, rrdb_x4_esrgan.pth
main_test_imdn.py             imdn_x4.pth
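
Before running one of the test scripts above, it can help to confirm that the matching checkpoints are present. This sketch copies a few rows of the table into a dictionary and reports which files are missing from a local `model_zoo` directory; the helper name `missing_checkpoints` and the default directory are assumptions.

```python
from pathlib import Path
from typing import Dict, List

# A few rows copied from the table above: test script -> expected checkpoint files.
CHECKPOINTS: Dict[str, List[str]] = {
    "main_test_dncnn.py": ["dncnn_15.pth", "dncnn_25.pth", "dncnn_50.pth",
                           "dncnn_gray_blind.pth", "dncnn_color_blind.pth", "dncnn3.pth"],
    "main_test_ircnn_denoiser.py": ["ircnn_gray.pth", "ircnn_color.pth"],
    "main_test_imdn.py": ["imdn_x4.pth"],
}


def missing_checkpoints(script: str, model_dir: str = "model_zoo") -> List[str]:
    """Return the checkpoint names for `script` that are not found in `model_dir`."""
    root = Path(model_dir)
    return [name for name in CHECKPOINTS[script] if not (root / name).is_file()]


# Lists whatever is absent from ./model_zoo for the IMDN test script.
print(missing_checkpoints("main_test_imdn.py"))
```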

model_zoo

trainsets

testsets

References

@inproceedings{zhu2023denoising, % DiffPIR
title={Denoising Diffusion Models for Plug-and-Play Image Restoration},
author={Yuanzhi Zhu and Kai Zhang and Jingyun Liang and Jiezhang Cao and Bihan Wen and Radu Timofte and Luc Van Gool},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition Workshops},
year={2023}
}
@article{liang2022vrt,
title={VRT: A Video Restoration Transformer},
author={Liang, Jingyun and Cao, Jiezhang and Fan, Yuchen and Zhang, Kai and Ranjan, Rakesh and Li, Yawei and Timofte, Radu and Van Gool, Luc},
journal={arXiv preprint arXiv:2201.12288},
year={2022}
}
@inproceedings{liang2021swinir,
title={SwinIR: Image Restoration Using Swin Transformer},
author={Liang, Jingyun and Cao, Jiezhang and Sun, Guolei and Zhang, Kai and Van Gool, Luc and Timofte, Radu},
booktitle={IEEE International Conference on Computer Vision Workshops},
pages={1833--1844},
year={2021}
}
@inproceedings{zhang2021designing,
title={Designing a Practical Degradation Model for Deep Blind Image Super-Resolution},
author={Zhang, Kai and Liang, Jingyun and Van Gool, Luc and Timofte, Radu},
booktitle={IEEE International Conference on Computer Vision},
pages={4791--4800},
year={2021}
}
@article{zhang2021plug, % DPIR & DRUNet & IRCNN
  title={Plug-and-Play Image Restoration with Deep Denoiser Prior},
  author={Zhang, Kai and Li, Yawei and Zuo, Wangmeng and Zhang, Lei and Van Gool, Luc and Timofte, Radu},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021}
}
@inproceedings{zhang2020aim, % efficientSR_challenge
  title={AIM 2020 Challenge on Efficient Super-Resolution: Methods and Results},
  author={Kai Zhang and Martin Danelljan and Yawei Li and Radu Timofte and others},
  booktitle={European Conference on Computer Vision Workshops},
  year={2020}
}
@inproceedings{zhang2020deep, % USRNet
  title={Deep unfolding network for image super-resolution},
  author={Zhang, Kai and Van Gool, Luc and Timofte, Radu},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3217--3226},
  year={2020}
}
@article{zhang2017beyond, % DnCNN
  title={Beyond a gaussian denoiser: Residual learning of deep cnn for image denoising},
  author={Zhang, Kai and Zuo, Wangmeng and Chen, Yunjin and Meng, Deyu and Zhang, Lei},
  journal={IEEE Transactions on Image Processing},
  volume={26},
  number={7},
  pages={3142--3155},
  year={2017}
}
@inproceedings{zhang2017learning, % IRCNN
title={Learning deep CNN denoiser prior for image restoration},
author={Zhang, Kai and Zuo, Wangmeng and Gu, Shuhang and Zhang, Lei},
booktitle={IEEE conference on computer vision and pattern recognition},
pages={3929--3938},
year={2017}
}
@article{zhang2018ffdnet, % FFDNet, FDnCNN
  title={FFDNet: Toward a fast and flexible solution for CNN-based image denoising},
  author={Zhang, Kai and Zuo, Wangmeng and Zhang, Lei},
  journal={IEEE Transactions on Image Processing},
  volume={27},
  number={9},
  pages={4608--4622},
  year={2018}
}
@inproceedings{zhang2018learning, % SRMD
  title={Learning a single convolutional super-resolution network for multiple degradations},
  author={Zhang, Kai and Zuo, Wangmeng and Zhang, Lei},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  pages={3262--3271},
  year={2018}
}
@inproceedings{zhang2019deep, % DPSR
  title={Deep Plug-and-Play Super-Resolution for Arbitrary Blur Kernels},
  author={Zhang, Kai and Zuo, Wangmeng and Zhang, Lei},
  booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1671--1681},
  year={2019}
}
@InProceedings{wang2018esrgan, % ESRGAN, MSRResNet
    author = {Wang, Xintao and Yu, Ke and Wu, Shixiang and Gu, Jinjin and Liu, Yihao and Dong, Chao and Qiao, Yu and Loy, Chen Change},
    title = {ESRGAN: Enhanced super-resolution generative adversarial networks},
    booktitle = {The European Conference on Computer Vision Workshops (ECCVW)},
    month = {September},
    year = {2018}
}
@inproceedings{hui2019lightweight, % IMDN
  title={Lightweight Image Super-Resolution with Information Multi-distillation Network},
  author={Hui, Zheng and Gao, Xinbo and Yang, Yunchu and Wang, Xiumei},
  booktitle={Proceedings of the 27th ACM International Conference on Multimedia (ACM MM)},
  pages={2024--2032},
  year={2019}
}
@inproceedings{zhang2019aim, % IMDN
  title={AIM 2019 Challenge on Constrained Super-Resolution: Methods and Results},
  author={Kai Zhang and Shuhang Gu and Radu Timofte and others},
  booktitle={IEEE International Conference on Computer Vision Workshops},
  year={2019}
}
@inproceedings{yang2021gan,
    title={GAN Prior Embedded Network for Blind Face Restoration in the Wild},
    author={Tao Yang and Peiran Ren and Xuansong Xie and Lei Zhang},
    booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
    year={2021}
}