
emilwallner / Coloring-greyscale-images

Coloring black and white images with deep learning.


Top Related Projects

  • iGAN: Interactive Image Generation via Generative Adversarial Networks
  • neural-style: Torch implementation of neural style algorithm
  • pix2pix-tensorflow: Tensorflow port of Image-to-Image Translation with Conditional Adversarial Nets https://phillipi.github.io/pix2pix/
  • pix2pix: Image-to-image translation with conditional adversarial nets
  • deep-photo-styletransfer: Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511
  • pix2pixHD: Synthesizing and manipulating 2048x1024 images with conditional GANs

Quick Overview

The emilwallner/Coloring-greyscale-images repository uses deep learning to colorize grayscale images. It walks through a series of increasingly complex Keras models (Alpha, Beta, Full, and GAN versions) and provides pre-trained weights for the GAN version, turning black and white photos into plausible color images.

Pros

  • Automated Colorization: The project allows users to easily colorize grayscale images without the need for manual editing or color selection.
  • Pre-trained Model: The repository includes a pre-trained model, which can be used out-of-the-box, making it accessible to users without requiring extensive machine learning expertise.
  • Flexible Usage: The project can be integrated into various applications and workflows, as it provides a simple API for colorizing images.
  • Impressive Results: The colorization results produced by the model are generally of high quality, with realistic and natural-looking colors.

Cons

  • Limited Customization: The pre-trained model may not always produce the exact desired color palette, and there is limited ability to fine-tune the colorization process.
  • Computational Requirements: Colorizing images can be computationally intensive, especially for larger images, which may limit its use on resource-constrained devices.
  • Potential Bias: As with any machine learning model, the colorization results may reflect biases present in the training data, which could lead to inaccuracies or inconsistencies in certain cases.
  • Dependency on External Libraries: The project relies on several external libraries, which may require additional setup and configuration, potentially increasing the complexity for some users.

Code Examples

The project ships as Jupyter notebooks and training scripts rather than a packaged library, so the snippets below are illustrative: they sketch how a wrapped colorization API could look, and the Colorizer class and module paths shown should be treated as hypothetical.

from coloring_greyscale_images.colorizer import Colorizer
from PIL import Image

# Load the pre-trained model
colorizer = Colorizer()

# Colorize a grayscale image
grayscale_image = Image.open('grayscale_image.jpg')
colored_image = colorizer.colorize(grayscale_image)
colored_image.save('colored_image.jpg')

This code demonstrates how to load the pre-trained model and use it to colorize a grayscale image.

import numpy as np
from coloring_greyscale_images.colorizer import Colorizer

# Load the pre-trained model
colorizer = Colorizer()

# Colorize a numpy array representing a grayscale image
grayscale_image = np.random.rand(224, 224, 1)
colored_image = colorizer.colorize_numpy(grayscale_image)

This code shows how to colorize a grayscale image represented as a NumPy array.

from coloring_greyscale_images.colorizer import Colorizer
from PIL import Image

# Load the pre-trained model
colorizer = Colorizer()

# Colorize multiple grayscale images
grayscale_images = [Image.open('grayscale_image1.jpg'), Image.open('grayscale_image2.jpg')]
colored_images = colorizer.colorize_batch(grayscale_images)

for i, colored_image in enumerate(colored_images):
    colored_image.save(f'colored_image_{i+1}.jpg')

This code demonstrates how to colorize a batch of grayscale images using the pre-trained model.

Getting Started

To get started with the Coloring-greyscale-images project, follow these steps:

  1. Clone the repository:
git clone https://github.com/emilwallner/Coloring-greyscale-images.git
  2. Install the required dependencies:
cd Coloring-greyscale-images
pip install -r requirements.txt
  3. Download the pre-trained model:
python download_model.py
  4. Use the Colorizer class to colorize grayscale images:
from coloring_greyscale_images.colorizer import Colorizer
from PIL import Image

# Load the pre-trained model
colorizer = Colorizer()

# Colorize a grayscale image
grayscale_image = Image.open('grayscale_image.jpg')
colored_image = colorizer.colorize(grayscale_image)
colored_image.save('colored_image.jpg')

Competitor Comparisons


iGAN: Interactive Image Generation via Generative Adversarial Networks

Pros of iGAN

  • Offers interactive image generation and manipulation
  • Supports multiple GAN models for diverse applications
  • Provides a user-friendly interface for real-time image editing

Cons of iGAN

  • More complex setup and dependencies
  • Requires more computational resources
  • Less focused on a specific task compared to Coloring-greyscale-images

Code Comparison

iGAN:

def initialize_model(self):
    self.model = GANModel()
    self.model.load_weights('pretrained_weights.h5')
    self.model.compile(optimizer='adam', loss='binary_crossentropy')

Coloring-greyscale-images:

def build_model():
    model = Sequential()
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(256, 256, 1)))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2))
    model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
    model.compile(optimizer='rmsprop', loss='mse')
    return model

The iGAN code snippet shows the initialization of a pre-trained GAN model, while the Coloring-greyscale-images code demonstrates the construction of a simpler convolutional neural network specifically designed for colorization tasks. iGAN's approach is more flexible but potentially more complex, whereas Coloring-greyscale-images focuses on a specific task with a streamlined architecture.

neural-style: Torch implementation of neural style algorithm

Pros of neural-style

  • Focuses on artistic style transfer, allowing for more creative and diverse outputs
  • Supports both image and video style transfer
  • Has a larger community and more extensive documentation

Cons of neural-style

  • Requires more computational resources and longer processing times
  • Less specialized for colorization tasks, which may result in less accurate color reproduction
  • More complex setup and usage compared to Coloring-greyscale-images

Code Comparison

neural-style:

local cmd = torch.CmdLine()
cmd:option('-style_image', 'examples/inputs/seated-nude.jpg', 'Style target image')
cmd:option('-content_image', 'examples/inputs/tubingen.jpg', 'Content target image')
cmd:option('-output_image', 'out.png', 'Output image')

Coloring-greyscale-images:

def create_combined_model(model):
    model_input = Input(shape=(256, 256, 1))
    model_output = model(model_input)
    model_output = Activation('sigmoid')(model_output)
    return Model(inputs=model_input, outputs=model_output)

The code snippets show different approaches: neural-style uses Torch and focuses on style transfer parameters, while Coloring-greyscale-images uses TensorFlow/Keras and emphasizes model creation for colorization.

pix2pix-tensorflow: Tensorflow port of Image-to-Image Translation with Conditional Adversarial Nets https://phillipi.github.io/pix2pix/

Pros of pix2pix-tensorflow

  • More versatile, supporting various image-to-image translation tasks beyond colorization
  • Implements the full pix2pix architecture, allowing for more complex transformations
  • Better documentation and examples for different use cases

Cons of pix2pix-tensorflow

  • More complex to set up and use for beginners
  • Requires more computational resources due to its comprehensive architecture
  • Less focused on the specific task of colorizing grayscale images

Code Comparison

Coloring-greyscale-images:

def create_conv_layer(filters, **kwargs):
    return Conv2D(filters, (3, 3), activation='relu', padding='same', **kwargs)

model = Sequential()
model.add(create_conv_layer(64, input_shape=(256, 256, 1)))
model.add(create_conv_layer(128))

pix2pix-tensorflow:

def create_generator(generator_inputs, generator_outputs_channels):
    layers = []
    # encoder_1: [batch, 256, 256, in_channels] => [batch, 128, 128, ngf]
    with tf.variable_scope("encoder_1"):
        output = conv(generator_inputs, a.ngf, stride=2)
        layers.append(output)

Both repositories focus on image transformation tasks, but pix2pix-tensorflow offers a more comprehensive solution for various image-to-image translation problems. Coloring-greyscale-images is more specialized for colorization, making it potentially easier to use for that specific task. The code snippets show that pix2pix-tensorflow uses a more complex architecture with separate encoder and decoder components, while Coloring-greyscale-images uses a simpler sequential model structure.


pix2pix: Image-to-image translation with conditional adversarial nets

Pros of pix2pix

  • More versatile, capable of various image-to-image translation tasks beyond just colorization
  • Implements a conditional GAN architecture, potentially producing more realistic results
  • Offers both PyTorch and TensorFlow implementations

Cons of pix2pix

  • More complex to set up and use, requiring deeper understanding of GANs
  • May require more computational resources due to its more sophisticated architecture
  • Less focused on the specific task of colorization compared to Coloring-greyscale-images

Code Comparison

Coloring-greyscale-images (using Keras):

model = Sequential()
model.add(InputLayer(input_shape=(256, 256, 1)))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))

pix2pix (using PyTorch):

class UnetGenerator(nn.Module):
    def __init__(self, input_nc, output_nc, num_downs, ngf=64):
        super(UnetGenerator, self).__init__()
        unet_block = UnetSkipConnectionBlock(ngf * 8, ngf * 8, input_nc=None, submodule=None, innermost=True)
        for i in range(num_downs - 5):
            unet_block = UnetSkipConnectionBlock(ngf * 8, ngf * 8, input_nc=None, submodule=unet_block)

deep-photo-styletransfer: Code and data for paper "Deep Photo Style Transfer": https://arxiv.org/abs/1703.07511

Pros of deep-photo-styletransfer

  • Focuses on transferring photorealistic styles between images
  • Preserves the structure and details of the original photo
  • Utilizes a semantic segmentation mask for improved results

Cons of deep-photo-styletransfer

  • Requires more complex input (style image and content image)
  • May have longer processing times due to advanced algorithms
  • Limited to style transfer, not specifically designed for colorization

Code Comparison

Coloring-greyscale-images:

def create_conv_layer(input, filters, kernel_size, strides, activation=None):
    return Conv2D(filters, kernel_size, strides=strides, activation=activation, padding='same')(input)

deep-photo-styletransfer:

function styleTransfer(content, style, output)
    local contentImage = image.load(content, 3)
    local styleImage = image.load(style, 3)
    -- Additional processing and neural network operations
end

Key Differences

  • Coloring-greyscale-images focuses on colorizing black and white images
  • deep-photo-styletransfer aims to transfer styles between color photos
  • Coloring-greyscale-images uses a CNN-based approach
  • deep-photo-styletransfer employs a combination of CNN and optimization techniques
  • Coloring-greyscale-images is implemented in Python/Keras
  • deep-photo-styletransfer is implemented in Lua/Torch

Both projects demonstrate advanced image processing techniques but serve different purposes in the field of computer vision and deep learning.

pix2pixHD: Synthesizing and manipulating 2048x1024 images with conditional GANs

Pros of pix2pixHD

  • Supports high-resolution image generation (up to 2048x1024 pixels)
  • Utilizes multi-scale generator and discriminator architectures for improved results
  • Incorporates instance-level feature embedding for enhanced detail preservation

Cons of pix2pixHD

  • Requires more computational resources due to its complex architecture
  • May have a steeper learning curve for implementation and fine-tuning
  • Primarily focused on image-to-image translation tasks, not specifically optimized for colorization

Code Comparison

Coloring-greyscale-images:

def create_conv_layer(input_layer, filter_size, feature_maps, stride=1):
    return Conv2D(feature_maps, (filter_size, filter_size), strides=stride, padding='same')(input_layer)

pix2pixHD:

def define_G(input_nc, output_nc, ngf, n_downsample_global=3, n_blocks_global=9, n_local_enhancers=1, n_blocks_local=3, norm='instance', gpu_ids=[]):
    norm_layer = get_norm_layer(norm_type=norm)
    netG = GlobalGenerator(input_nc, output_nc, ngf, n_downsample_global, n_blocks_global, norm_layer)

The code snippets show different approaches to defining network architectures. Coloring-greyscale-images uses a simpler convolutional layer creation, while pix2pixHD employs a more complex generator definition with multiple parameters for fine-tuning the network structure.


README

Coloring Black and White Images with Neural Networks


A detailed tutorial covering the code in this repository: Coloring Black and White photos with Neural Networks

👉 Try the Palette API to test the latest advancements in AI colorization.

The network is built in four parts and gradually becomes more complex. The first part is the bare minimum to understand the core parts of the network. It's built to color one image. Once I have something to experiment with, I find it easier to add the remaining 80% of the network.

For the second stage, the Beta version, I start automating the training flow. In the full version, I add features from a pre-trained classifier. The GAN version is not covered in the tutorial. It's an experimental version using some of the emerging best practices in image colorization.

🍿 Featured by Google >>>

Note: The display images below are cherry-picked. A large majority of the images are mostly black and white or are lightly colored in brown. A narrow and simple dataset often creates better results.

Installation

pip install keras tensorflow pillow h5py jupyter scikit-image
git clone https://github.com/emilwallner/Coloring-greyscale-images
cd Coloring-greyscale-images/
jupyter notebook

Open the desired notebook (files ending in '.ipynb'). To run the model, go to the menu and click Cell > Run all.

For the GAN version, enter the GAN-version folder, and run:

python3 colorize_base.py

Pre-trained weights: Download the pre-trained weights for the GAN-version here. Create a folder called 'resources' and put it inside of Coloring-greyscale-images/GAN-version/. It's trained on contemporary photography with different objects but not a lot of people.

Alpha Version

This is a great starting point to get a hang of the moving pieces: how an image is transformed into RGB pixel values and later translated into LAB pixel values, changing the color space. It also builds a core intuition for how the network learns, comparing the input with the output and adjusting the weights.

In this version, you will see a result in a few minutes. Once you have trained the network, try coloring an image it was not trained on. This will build an intuition for the purpose of the later versions.
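
As a minimal sketch of that color-space step (assuming scikit-image from the install above; the file names are placeholders), the L channel becomes the network input and the a/b channels become the target:

import numpy as np
from skimage.color import rgb2lab, lab2rgb
from skimage.io import imread, imsave

# Load an RGB image and scale pixel values to [0, 1]
image = imread('example.jpg') / 255.0

# Convert to LAB: L is lightness (0..100), a and b carry the color (-128..127)
lab = rgb2lab(image)
X = lab[:, :, 0] / 100.0      # grayscale input for the network
Y = lab[:, :, 1:] / 128.0     # color channels the network learns to predict

# To view a result, recombine an L channel with predicted a/b values
# (here the true a/b channels stand in for a prediction)
canvas = np.zeros(lab.shape)
canvas[:, :, 0] = lab[:, :, 0]
canvas[:, :, 1:] = Y * 128.0
imsave('recombined.png', (lab2rgb(canvas) * 255).astype(np.uint8))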

Beta Version

The network in the beta version is very similar to the alpha version. The difference is that we use more than one image to train the network. I'd recommend running top/htop and nvidia-smi to see how different batch sizes affect your computer's memory.

For this model, I'd go with this cropped celebrity dataset or Nvidia's StyleGAN dataset. Because the images are very similar, the network can learn basic colorization despite its simple architecture. To get a feel for the limits of this network, you can try it on this dataset of diverse images from Unsplash. If you are on a laptop, I'd run it for a day. If you are using a GPU, train it at least 6 - 12h.
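
A hedged sketch of the batching step this version adds (assuming Keras' ImageDataGenerator; the folder name is a placeholder): RGB batches are streamed from disk and converted to LAB on the fly, so the L channel feeds the network and a/b is the target.

import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from skimage.color import rgb2lab

datagen = ImageDataGenerator(rescale=1.0 / 255)

def lab_batches(folder, batch_size=20, size=(256, 256)):
    """Yield (L, ab) training pairs from a folder of RGB images."""
    for rgb_batch in datagen.flow_from_directory(folder, target_size=size,
                                                 batch_size=batch_size, class_mode=None):
        lab = rgb2lab(rgb_batch)
        X = lab[:, :, :, 0][..., np.newaxis] / 100.0   # lightness input
        Y = lab[:, :, :, 1:] / 128.0                   # color target
        yield X, Y

# model.fit_generator(lab_batches('Train/'), steps_per_epoch=50, epochs=10)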

Full Version

The full version adds information from a pre-trained classifier. You can think of the information as 20% nature, 30% humans, 30% sky, and 20% brick buildings. It then learns to combine that information with the black and white photo. It gives the network more confidence to color the image. Otherwise, it tends to default to the safest color, brown.
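
A minimal sketch of that fusion idea (assuming Keras; the layer sizes are illustrative rather than the exact ones used here): the classifier's embedding vector is tiled across the spatial grid and concatenated with the encoder's feature maps before decoding.

from keras.layers import Input, Conv2D, RepeatVector, Reshape, concatenate, UpSampling2D
from keras.models import Model

encoder_input = Input(shape=(256, 256, 1))   # L channel
embed_input = Input(shape=(1000,))           # pre-trained classifier embedding

# Encoder: downsample the grayscale image to a 32x32 feature map
x = Conv2D(64, (3, 3), activation='relu', padding='same', strides=2)(encoder_input)
x = Conv2D(128, (3, 3), activation='relu', padding='same', strides=2)(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', strides=2)(x)

# Fusion: tile the embedding over the 32x32 grid and concatenate with the features
fusion = RepeatVector(32 * 32)(embed_input)
fusion = Reshape((32, 32, 1000))(fusion)
fusion = concatenate([x, fusion], axis=-1)
fusion = Conv2D(256, (1, 1), activation='relu', padding='same')(fusion)

# Decoder: upsample back to full resolution and predict the a/b channels
d = Conv2D(128, (3, 3), activation='relu', padding='same')(fusion)
d = UpSampling2D((2, 2))(d)
d = Conv2D(64, (3, 3), activation='relu', padding='same')(d)
d = UpSampling2D((2, 2))(d)
d = Conv2D(2, (3, 3), activation='tanh', padding='same')(d)
d = UpSampling2D((2, 2))(d)

model = Model(inputs=[encoder_input, embed_input], outputs=d)
model.compile(optimizer='rmsprop', loss='mse')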

The model comes from an elegant insight by Baldassarre and his team.

In the article, I use the Unsplash dataset, but in retrospect, I'd choose five to ten categories in the Imagenet dataset. You can also go with Nvidia's StyleGAN dataset or create a dataset from Pixabay categories. You'll start getting some results after about 12 - 24 hours on a GPU.

GAN Version

The GAN version uses Generative Adversarial Networks to make the coloring more consistent and vibrant. However, the network is an order of magnitude more complex and requires more computing power to work with. Many of the techniques in this network are inspired by the brilliant work of Jason Antic and his DeOldify coloring network.

In brief, the generator comes from the pix2pix model, the discriminators and loss function from the pix2pixHD model, and a few optimizations from the Self-Attention GAN. If you want to experiment with this approach, I'd recommend starting with Erik Linder-Norén's excellent pix2pix implementation.
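
A heavily simplified sketch of that adversarial setup (assuming Keras; the real GAN version is much larger): a PatchGAN-style discriminator judges (grayscale, color) pairs, and the generator is trained through a combined model with an adversarial term plus an L1 term, as in pix2pix.

from keras.layers import Input, Conv2D, Conv2DTranspose, LeakyReLU, concatenate
from keras.models import Model
from keras.optimizers import Adam

def build_generator():
    """Tiny encoder-decoder that maps the L channel to a/b channels."""
    l_in = Input(shape=(128, 128, 1))
    e1 = Conv2D(64, (4, 4), strides=2, padding='same', activation='relu')(l_in)
    e2 = Conv2D(128, (4, 4), strides=2, padding='same', activation='relu')(e1)
    d1 = Conv2DTranspose(64, (4, 4), strides=2, padding='same', activation='relu')(e2)
    d1 = concatenate([d1, e1])   # one skip connection, U-Net style
    ab = Conv2DTranspose(2, (4, 4), strides=2, padding='same', activation='tanh')(d1)
    return Model(l_in, ab)

def build_discriminator():
    """PatchGAN-style critic on (L, ab) pairs."""
    l_in = Input(shape=(128, 128, 1))
    ab_in = Input(shape=(128, 128, 2))
    x = concatenate([l_in, ab_in])
    for filters in (64, 128, 256):
        x = Conv2D(filters, (4, 4), strides=2, padding='same')(x)
        x = LeakyReLU(0.2)(x)
    validity = Conv2D(1, (4, 4), padding='same', activation='sigmoid')(x)
    return Model([l_in, ab_in], validity)

discriminator = build_discriminator()
discriminator.compile(optimizer=Adam(0.0002, 0.5), loss='binary_crossentropy')

generator = build_generator()

# Combined model: discriminator frozen, generator trained on adversarial + L1 loss
discriminator.trainable = False
l_input = Input(shape=(128, 128, 1))
fake_ab = generator(l_input)
validity = discriminator([l_input, fake_ab])
combined = Model(l_input, [validity, fake_ab])
combined.compile(optimizer=Adam(0.0002, 0.5),
                 loss=['binary_crossentropy', 'mae'],
                 loss_weights=[1, 100])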

Implementation details:

  • With a 16GB GPU, you can fit 150 images that are 128x128 and 25 images that are 256x256.
  • Learning improved an order of magnitude faster on the 128x128 images compared to the 256x256 images.
  • I'd recommend experimenting with pre-trained U-nets (One of the secrets in Jason's model)
  • Test different normalizations. I prefer spectral normalization, but I've also added instance normalization.
  • The network uses 3 discriminators for different image resolutions, based on the pix2pixHD paper. However, this might be overkill, so I'd try it with one. (A minimal sketch of the multi-scale setup follows this list.)
  • Nvidia's StyleGAN model has shown some incredible images. It might be worth experimenting with some of the best practices they developed. The same goes for the Large Scale GAN paper.
  • I've added the pix2pixHD generator, but it requires more compute to converge.
  • The image generator has some memory problems. Perhaps go with the original generator in Keras or find something equivalent.
  • If you want to build your own dataset, I've included a few scraping and cleaning scripts in 'download_and_clean_data_scripts'. You can build the datasets based on keywords from Yahoo's 100M images or Pixabay.
  • I've implemented it for multi-GPU; however, all the models are copied to each GPU. This increases the batch size, which improves the result, but only marginally increases images/sec. I'd recommend specifying which GPU each model is loaded on, to avoid merging the weights for each batch.
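
A minimal sketch of the multi-scale discriminator setup referenced above (assuming Keras; sizes are illustrative): the same discriminator architecture is applied to the image at full, half, and quarter resolution, and the three losses are combined. In practice each discriminator would also see the grayscale input paired with the colored output.

from keras.layers import Input, Conv2D, LeakyReLU, AveragePooling2D
from keras.models import Model

def build_discriminator(size):
    """One PatchGAN-style discriminator for a given input resolution."""
    image = Input(shape=(size, size, 3))
    x = image
    for filters in (64, 128, 256):
        x = Conv2D(filters, (4, 4), strides=2, padding='same')(x)
        x = LeakyReLU(0.2)(x)
    validity = Conv2D(1, (4, 4), padding='same', activation='sigmoid')(x)
    return Model(image, validity)

# Three discriminators at full, half, and quarter resolution (pix2pixHD idea)
full = Input(shape=(256, 256, 3))
half = AveragePooling2D(pool_size=2)(full)
quarter = AveragePooling2D(pool_size=2)(half)

d_full = build_discriminator(256)(full)
d_half = build_discriminator(128)(half)
d_quarter = build_discriminator(64)(quarter)

multi_scale = Model(full, [d_full, d_half, d_quarter])
multi_scale.compile(optimizer='adam',
                    loss=['binary_crossentropy'] * 3,
                    loss_weights=[1, 1, 1])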

Run the code on FloydHub

Run on FloydHub

Click this button to open a Workspace on FloydHub where you will find the same environment and dataset used for the Full version.

Acknowledgments

  • Thanks to IBM for donating computing power through their PowerAI platform
  • The full model is a reproduction of Baldassarre et al., 2017. Code / Paper
  • The advanced model is inspired by the pix2pix, pix2pixHD, SA-GAN, and DeOldify models.