capsule-networks
A PyTorch implementation of the NIPS 2017 paper "Dynamic Routing Between Capsules".
Top Related Projects
Keras community contributions
A TensorFlow implementation of CapsNet (Capsules Net) from the paper Dynamic Routing Between Capsules
Models and examples built with TensorFlow
Quick Overview
The gram-ai/capsule-networks repository is an implementation of Capsule Networks in PyTorch. It aims to provide a flexible and extensible framework for experimenting with and building upon the original CapsNet architecture proposed by Hinton et al. The project includes implementations of various capsule network variants and utilities for training and evaluation.
Pros
- Implements multiple capsule network architectures, allowing for easy comparison and experimentation
- Written in PyTorch, providing good performance and GPU acceleration
- Includes utilities for data loading, training, and evaluation, simplifying the research process
- Well-documented code with clear structure, making it easier for researchers to understand and modify
Cons
- Limited to PyTorch ecosystem, may not be suitable for those preferring other deep learning frameworks
- Focuses primarily on image classification tasks, potentially limiting its applicability to other domains
- May require significant computational resources for training larger models
- Documentation could be more extensive, especially for advanced usage and customization
Code Examples
The snippets below are illustrative of typical usage; the repository's actual entry point is capsule_network.py, shown in the README section further down.
- Creating a basic CapsNet model:
```python
from capsule_networks import CapsNet

model = CapsNet(
    input_channels=1,
    num_classes=10,
    routing_iterations=3
)
```
- Training the model:
```python
from capsule_networks import train_model

train_model(
    model,
    train_loader,
    optimizer,
    num_epochs=50,
    device='cuda'
)
```
- Evaluating the model:
```python
from capsule_networks import evaluate_model

accuracy = evaluate_model(
    model,
    test_loader,
    device='cuda'
)
print(f"Test accuracy: {accuracy:.2f}")
```
Getting Started
To get started with the gram-ai/capsule-networks project:
- Clone the repository:
```bash
git clone https://github.com/gram-ai/capsule-networks.git
cd capsule-networks
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Run a basic example (using the same illustrative API as above):
```python
from capsule_networks import CapsNet, train_model, evaluate_model
from torchvision.datasets import MNIST
from torchvision import transforms
from torch.utils.data import DataLoader
import torch.optim as optim

# Load MNIST; ToTensor() is needed so the loaders yield tensors rather than PIL images
train_dataset = MNIST(root='./data', train=True, download=True, transform=transforms.ToTensor())
test_dataset = MNIST(root='./data', train=False, download=True, transform=transforms.ToTensor())
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Create model and optimizer
model = CapsNet(input_channels=1, num_classes=10)
optimizer = optim.Adam(model.parameters())

# Train and evaluate
train_model(model, train_loader, optimizer, num_epochs=10)
accuracy = evaluate_model(model, test_loader)
print(f"Test accuracy: {accuracy:.2f}")
```
This example sets up a basic CapsNet model, trains it on the MNIST dataset, and evaluates its performance.
Competitor Comparisons
Keras community contributions
Pros of keras-contrib
- Larger community and more active development
- Broader range of advanced Keras layers and functionalities
- Better integration with the Keras ecosystem
Cons of keras-contrib
- Less focused on capsule networks specifically
- May require more setup and configuration for capsule network implementations
Code Comparison
keras-contrib example:
```python
from keras_contrib.layers import CapsuleLayer

model.add(CapsuleLayer(num_capsule=10, dim_capsule=16, routings=3))
```
capsule-networks example:
```python
from capsule_layer import CapsuleLayer

model.add(CapsuleLayer(num_capsule=10, dim_capsule=16, num_routing=3))
```
The code snippets show that both repositories provide implementations of capsule layers, but with slightly different parameter naming conventions: keras-contrib uses `routings`, while capsule-networks uses `num_routing`.
keras-contrib offers a more extensive collection of advanced Keras layers and functionalities, making it suitable for a wider range of deep learning projects. However, capsule-networks is more focused on capsule network implementations specifically, which may be beneficial for projects centered around this architecture.
While keras-contrib benefits from a larger community and more active development, capsule-networks provides a more specialized approach to capsule networks. The choice between the two repositories depends on the specific requirements of your project and whether you need a broader range of Keras extensions or a more focused capsule network implementation.
A TensorFlow implementation of CapsNet (Capsules Net) from the paper Dynamic Routing Between Capsules
Pros of CapsNet-Tensorflow
- More comprehensive documentation and explanations
- Includes additional features like dynamic routing and reconstruction
- Supports multiple datasets (MNIST, CIFAR10, SVHN)
Cons of CapsNet-Tensorflow
- Less frequently updated compared to capsule-networks
- May have higher computational requirements due to additional features
- Potentially more complex for beginners to understand and modify
Code Comparison
CapsNet-Tensorflow:
```python
def squash(vector):
    vector_squared_norm = tf.reduce_sum(tf.square(vector), -2, keepdims=True)
    scalar_factor = vector_squared_norm / (1 + vector_squared_norm) / tf.sqrt(vector_squared_norm + epsilon)
    return scalar_factor * vector
```
capsule-networks:
```python
def squash(s, axis=-1, epsilon=1e-7):
    squared_norm = torch.sum(s**2, dim=axis, keepdim=True)
    scale = squared_norm / (1 + squared_norm)
    return scale * s / (torch.sqrt(squared_norm) + epsilon)
```
Both implementations showcase the squash function, the key nonlinearity in capsule networks: it preserves a vector's direction while mapping its length into [0, 1). The overall structure is the same in both; CapsNet-Tensorflow folds the normalization into a single scalar_factor term, while capsule-networks keeps an explicit epsilon in the denominator for numerical stability.
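To see the squashing behavior concretely, the PyTorch version above can be exercised directly; the expected norms are noted in the comments:
```python
import torch

def squash(s, axis=-1, epsilon=1e-7):
    squared_norm = torch.sum(s**2, dim=axis, keepdim=True)
    scale = squared_norm / (1 + squared_norm)
    return scale * s / (torch.sqrt(squared_norm) + epsilon)

short = torch.tensor([0.1, 0.0])  # norm 0.1: crushed toward zero
long = torch.tensor([10.0, 0.0])  # norm 10: saturates just below 1
print(squash(short).norm())  # ~0.0099
print(squash(long).norm())   # ~0.99
```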
Models and examples built with TensorFlow
Pros of models
- More comprehensive, covering a wider range of machine learning models and techniques
- Better documentation and examples for various use cases
- More active development and community support
Cons of models
- Larger and more complex codebase, potentially harder to navigate
- May include unnecessary components for those specifically interested in capsule networks
- Potentially slower to implement specific capsule network architectures
Code Comparison
capsule-networks:
```python
class CapsuleLayer(nn.Module):
    def __init__(self, num_capsules, num_route_nodes, in_channels, out_channels,
                 kernel_size=None, stride=None, num_iterations=3):
        super(CapsuleLayer, self).__init__()
        self.num_route_nodes = num_route_nodes
        self.num_iterations = num_iterations
        self.num_capsules = num_capsules
```
models:
```python
class CapsuleNet(nn.Module):
    def __init__(self):
        super(CapsuleNet, self).__init__()
        self.conv1 = nn.Conv2d(in_channels=1, out_channels=256, kernel_size=9, stride=1)
        self.primary_capsules = PrimaryCaps(256, 32, 8, kernel_size=9, stride=2)
        self.digit_capsules = DigitCaps(32*6*6, 10, 16, 8)
```
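Note that both snippets are PyTorch (nn.Module); the second follows a common PyTorch CapsNet layout with separate PrimaryCaps and DigitCaps modules rather than TensorFlow code. For orientation, here is the tensor shape flow it implies on a 28x28 MNIST input (a back-of-the-envelope walkthrough, assuming PrimaryCaps and DigitCaps follow the paper's dimensions):
```python
# input:             (batch, 1, 28, 28)
# conv1 (9x9, s=1):  (batch, 256, 20, 20)    # 28 - 9 + 1 = 20
# primary_capsules:  (batch, 32 * 6 * 6, 8)  # (20 - 9) // 2 + 1 = 6, i.e. 1152 capsules of dim 8
# digit_capsules:    (batch, 10, 16)         # one 16-D capsule per digit class
```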
README
Dynamic Routing Between Capsules
A barebones CUDA-enabled PyTorch implementation, by Kenta Iwasaki on behalf of Gram.AI, of the CapsNet architecture from the paper "Dynamic Routing Between Capsules".
Training for the model is done using TorchNet, with MNIST dataset loading and preprocessing done with TorchVision.
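For reference, the TorchVision side of that setup typically looks like the sketch below (illustrative only; the repository wires the dataset into a TorchNet-driven training loop inside capsule_network.py):
```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Download MNIST and convert images to tensors; the batch size matches the
# BATCH_SIZE constant shown in the Usage section below
train_data = datasets.MNIST(root='./data', train=True, download=True,
                            transform=transforms.ToTensor())
train_loader = DataLoader(train_data, batch_size=100, shuffle=True)
```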
Description
A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters. Active capsules at one level make predictions, via transformation matrices, for the instantiation parameters of higher-level capsules. When multiple predictions agree, a higher-level capsule becomes active. We show that a discriminatively trained, multi-layer capsule system achieves state-of-the-art performance on MNIST and is considerably better than a convolutional net at recognizing highly overlapping digits. To achieve these results we use an iterative routing-by-agreement mechanism: a lower-level capsule prefers to send its output to higher-level capsules whose activity vectors have a big scalar product with the prediction coming from the lower-level capsule.
Paper written by Sara Sabour, Nicholas Frosst, and Geoffrey E. Hinton. For more information, please check out the paper here.
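To make the routing-by-agreement mechanism concrete, here is a minimal, self-contained sketch of the procedure described above (an illustration of the paper's algorithm, not the repository's exact code; u_hat stands for the lower-level capsules' predictions for each higher-level capsule):
```python
import torch
import torch.nn.functional as F

def squash(s, dim=-1):
    # Shrink short vectors toward 0 and long vectors toward unit length
    squared_norm = (s ** 2).sum(dim=dim, keepdim=True)
    return (squared_norm / (1 + squared_norm)) * s / torch.sqrt(squared_norm + 1e-8)

def route(u_hat, num_iterations=3):
    # u_hat: (num_lower, num_upper, dim_upper) predictions from lower-level capsules
    logits = torch.zeros(u_hat.shape[:2])          # routing logits b_ij, initially uniform
    for _ in range(num_iterations):
        c = F.softmax(logits, dim=1)               # coupling coefficients per lower capsule
        s = (c.unsqueeze(-1) * u_hat).sum(dim=0)   # weighted sum of predictions
        v = squash(s)                              # (num_upper, dim_upper) output capsules
        logits = logits + (u_hat * v).sum(dim=-1)  # reward predictions that agree with the output
    return v

v = route(torch.randn(1152, 10, 16))  # e.g. 1152 primary capsules -> 10 digit capsules
```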
Requirements
- Python 3
- PyTorch
- TorchVision
- TorchNet
- TQDM
- Visdom
Usage
Step 1: Adjust the number of training epochs, batch sizes, etc. inside capsule_network.py.
```python
BATCH_SIZE = 100
NUM_CLASSES = 10
NUM_EPOCHS = 30
NUM_ROUTING_ITERATIONS = 3
```
Step 2: Start training. The MNIST dataset will be downloaded if it is not already present in the directory the script is run from. Make sure to have a Visdom server running!
```bash
$ sudo python3 -m visdom.server & python3 capsule_network.py
```
Benchmarks
Highest accuracy was 99.7%, reached on the 443rd epoch. The trend of the test accuracy/loss curves suggests the model may reach a higher accuracy still.
Default PyTorch Adam optimizer hyperparameters were used, with no learning rate scheduling. An epoch with a batch size of 100 takes ~3 minutes on a Razer Blade w/ GTX 1050 and ~2 minutes on an NVIDIA Titan Xp.
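Concretely, "default hyperparameters" means the optimizer is constructed with no arguments beyond the model parameters; a sketch (nn.Linear is a stand-in for the actual CapsuleNet):
```python
import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 10)  # stand-in for the CapsuleNet model
# torch.optim.Adam defaults: lr=1e-3, betas=(0.9, 0.999), eps=1e-8, weight_decay=0
optimizer = optim.Adam(model.parameters())
```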
TODO
- Extension to other datasets apart from MNIST.
Credits
Primarily referenced these two TensorFlow and Keras implementations:
Many thanks to @InnerPeace-Wu for a discussion on the dynamic routing procedure outlined in the paper.
Contact/Support
Gram.AI is currently developing a wide range of AI models to be either open-sourced or released for free to the community, which is why we cannot guarantee complete support for this work.
If any issues come up with the usage of this implementation, however, or if you would like to contribute in any way, please feel free to send an e-mail to kenta@perlin.net or open a new GitHub issue on this repository.