CapsNet-Keras

A Keras implementation of CapsNet from the NIPS 2017 paper "Dynamic Routing Between Capsules". Current test error = 0.34%.

2,461

651


Top Related Projects

tensorflow

190,523

An Open Source Machine Learning Framework for Everyone

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

deep-learning-models

7,345

Keras code and weights files for popular deep learning models.

keras-applications

2,005

Reference implementations of popular deep learning models.

Quick Overview

XifengGuo/CapsNet-Keras is a GitHub repository that implements Capsule Networks (CapsNet) in Keras. It follows the architecture proposed by Sabour, Frosst, and Hinton in the paper "Dynamic Routing Between Capsules," so researchers and developers can experiment with and build on it.

Pros

Implements CapsNet architecture in a popular deep learning framework (Keras)
Includes training and test examples for MNIST, plus a download link for a pre-trained model
Well-documented code with clear explanations of the CapsNet architecture
Provides both training and evaluation scripts for easy experimentation

Cons

Documented only for MNIST
May require updates to work with the latest versions of Keras and TensorFlow
Lacks implementations for more complex datasets or real-world applications
Performance may be slower compared to optimized implementations in other frameworks

Code Examples

Defining the CapsNet model (abridged: the reconstruction branch and the label input used during training are omitted):

def CapsNet(input_shape, n_class, routings):
    x = layers.Input(shape=input_shape)
    conv1 = layers.Conv2D(filters=256, kernel_size=9, strides=1, padding='valid', activation='relu', name='conv1')(x)
    primarycaps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')
    digitcaps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings, name='digitcaps')(primarycaps)
    out_caps = Length(name='capsnet')(digitcaps)
    model = models.Model(inputs=x, outputs=out_caps)
    return model
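
The kernel sizes and strides in the definition above fix the number of primary capsules. As a rough check, the arithmetic can be sketched as follows, assuming a 28x28 MNIST input and assuming that PrimaryCap reshapes its convolution output into 8-dimensional capsules as described in the paper. The helper `conv_output_size` is a hypothetical name used only for illustration.

```python
def conv_output_size(size, kernel_size, stride):
    """Spatial output size of a 'valid' (unpadded) convolution."""
    return (size - kernel_size) // stride + 1


side = 28                                          # assumed MNIST input: 28x28x1
conv1_side = conv_output_size(side, 9, 1)          # conv1: 9x9 kernel, stride 1
primary_side = conv_output_size(conv1_side, 9, 2)  # PrimaryCap: 9x9 kernel, stride 2

n_channels = 32  # capsule channels, each holding 8-dimensional capsules
num_primary_capsules = primary_side * primary_side * n_channels

print(conv1_side, primary_side, num_primary_capsules)
```

Under these assumptions the digit capsule layer routes from 1152 input capsules to `n_class` output capsules of dimension 16.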

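Training the CapsNet model (this uses the full training model, which takes the labels as a second input and returns the reconstruction as a second output):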

model = CapsNet(input_shape=x_train.shape[1:], n_class=len(np.unique(y_train)), routings=3)
model.compile(optimizer=optimizers.Adam(lr=args.lr), loss=[margin_loss, 'mse'], loss_weights=[1., args.lam_recon])
model.fit([x_train, y_train], [y_train, x_train], batch_size=args.batch_size, epochs=args.epochs,
          validation_data=[[x_test, y_test], [y_test, x_test]], callbacks=[log, tb, checkpoint, lr_decay])
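
The `margin_loss` passed to `compile` above is the classification loss from the paper. A minimal NumPy sketch of that formula is given below; it is an illustration of the paper's equation with its published constants (0.9, 0.1, 0.5), not a copy of the repository's Keras function, and the parameter names `m_plus`, `m_minus`, and `lam` are chosen here for readability.

```python
import numpy as np


def margin_loss(y_true, y_pred, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss from "Dynamic Routing Between Capsules".

    y_true: one-hot labels, shape [batch, n_class]
    y_pred: capsule lengths in [0, 1], shape [batch, n_class]
    """
    # Penalise a present class whose capsule is shorter than m_plus.
    present = y_true * np.maximum(0.0, m_plus - y_pred) ** 2
    # Penalise an absent class whose capsule is longer than m_minus.
    absent = lam * (1.0 - y_true) * np.maximum(0.0, y_pred - m_minus) ** 2
    # Sum over classes, average over the batch.
    return float(np.mean(np.sum(present + absent, axis=1)))


y_true = np.array([[1.0, 0.0]])
print(margin_loss(y_true, np.array([[0.9, 0.1]])))  # confident and correct
print(margin_loss(y_true, np.array([[0.5, 0.5]])))  # undecided
```

The down-weighting factor `lam` keeps the loss from absent classes from shrinking all capsule lengths early in training.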

Evaluating the trained model:

y_pred, x_recon = model.predict([x_test, y_test], batch_size=100)
print('-'*50)
print('Test acc:', np.sum(np.argmax(y_pred, 1) == np.argmax(y_test, 1))/y_test.shape[0])
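
The accuracy expression above compares the arg-max of the predicted capsule lengths with the arg-max of the one-hot labels. A small self-contained sketch of the same computation, reported as a test error percentage, might look like this. The arrays are made-up toy values and `error_rate_percent` is a hypothetical helper name.

```python
import numpy as np


def error_rate_percent(y_true_onehot, y_scores):
    """Percentage of samples whose arg-max prediction differs from the label."""
    predicted = np.argmax(y_scores, axis=1)
    actual = np.argmax(y_true_onehot, axis=1)
    accuracy = np.sum(predicted == actual) / y_true_onehot.shape[0]
    return 100.0 * (1.0 - accuracy)


# Toy data: 4 samples, 3 classes; the last sample is misclassified.
y_true_toy = np.eye(3)[[0, 1, 2, 1]]
y_scores_toy = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.9],
    [0.6, 0.3, 0.1],
])
print(error_rate_percent(y_true_toy, y_scores_toy))
```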

Getting Started

Clone the repository:

git clone https://github.com/XifengGuo/CapsNet-Keras.git
cd CapsNet-Keras

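Install dependencies (the README pins Keras 2.0.7 with TensorFlow 1.2; for TensorFlow>=2.0, check out the tf2.2 branch and install TensorFlow 2.2 instead):
```
pip install tensorflow-gpu==1.2
pip install keras==2.0.7
```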
Train the model on MNIST:
```
python capsulenet.py
```

Evaluate the model:

python capsulenet.py -t -w result/trained_model.h5

Competitor Comparisons

keras

63,156

Deep Learning for humans

Pros of Keras

Comprehensive deep learning framework with a wide range of built-in layers and models
Large community support and extensive documentation
Seamless integration with TensorFlow backend

Cons of Keras

Less specialized for capsule networks compared to CapsNet-Keras
May require more custom code to implement specific capsule network architectures

Code Comparison

CapsNet-Keras:

def CapsNet(input_shape, n_class, routings):
    x = layers.Input(shape=input_shape)
    conv1 = layers.Conv2D(256, 9, activation='relu')(x)
    primary_caps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')
    digit_caps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings)(primary_caps)
    out_caps = Length(name='capsnet')(digit_caps)
    return models.Model(inputs=x, outputs=out_caps)

Keras:

def build_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)

tensorflow

190,523

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

Comprehensive deep learning framework with extensive functionality
Large community support and extensive documentation
Supports multiple programming languages and platforms

Cons of TensorFlow

Steeper learning curve for beginners
Can be more complex to set up and configure

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

TensorFlow (an equivalent function written against the TensorFlow API, not part of TensorFlow itself):

def squash(vectors, axis=-1):
    squared_norm = tf.reduce_sum(tf.square(vectors), axis=axis, keepdims=True)
    scale = squared_norm / (1 + squared_norm) / tf.sqrt(squared_norm + tf.keras.backend.epsilon())
    return scale * vectors

Summary

CapsNet-Keras is a specific implementation of Capsule Networks using Keras, while TensorFlow is a more general-purpose machine learning framework. TensorFlow offers greater flexibility and a wider range of applications but may be more challenging for beginners. CapsNet-Keras provides a focused implementation of Capsule Networks, which can be easier to understand and use for this specific task.

pytorch

91,080

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of pytorch

Broader scope and functionality as a full deep learning framework
Larger community and more extensive documentation
More flexible and dynamic computational graph

Cons of pytorch

Steeper learning curve for beginners
Potentially more complex setup and configuration
May be overkill for simple projects like implementing CapsNet

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

pytorch:

def squash(input_tensor, dim=-1, epsilon=1e-7):
    squared_norm = (input_tensor ** 2).sum(dim=dim, keepdim=True)
    scale = squared_norm / (1 + squared_norm)
    return scale * input_tensor / torch.sqrt(squared_norm + epsilon)

The two snippets compute the same squash function: one with PyTorch tensor operations, the other with Keras backend functions. The PyTorch snippet is an equivalent user-level function rather than part of the PyTorch library.

deep-learning-models

7,345

Keras code and weights files for popular deep learning models.

Pros of deep-learning-models

Offers a wide variety of pre-implemented deep learning models
Maintained by François Chollet, the creator of Keras
Includes popular architectures like ResNet, VGG, and Inception

Cons of deep-learning-models

Focuses on general deep learning models, not specifically on capsule networks
May require more customization for specific capsule network implementations
Less specialized documentation for capsule network concepts

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

deep-learning-models:

def identity_block(input_tensor, kernel_size, filters, stage, block):
    filters1, filters2, filters3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

The code snippets show the difference in focus between the two repositories. CapsNet-Keras implements specific capsule network functions, while deep-learning-models provides building blocks for various architectures.

keras-applications

2,005

Reference implementations of popular deep learning models.

Pros of keras-applications

Comprehensive collection of pre-trained deep learning models
Official Keras repository, ensuring compatibility and regular updates
Extensive documentation and community support

Cons of keras-applications

Focused on general-purpose models, not specialized architectures like CapsNet
May require more computational resources for some applications
Less suitable for research on novel network architectures

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

keras-applications:

def preprocess_input(x, data_format=None):
    return imagenet_utils.preprocess_input(x, data_format=data_format, mode='caffe')

The CapsNet-Keras code snippet shows a custom squash function for capsule networks, while the keras-applications code demonstrates a standard preprocessing function for image input. This highlights the difference in focus between the two repositories, with CapsNet-Keras implementing specialized capsule network operations and keras-applications providing general-purpose utilities for common deep learning models.

keras-contrib

1,580

Keras community contributions

Pros of keras-contrib

Broader scope: Includes various experimental Keras layers, not limited to CapsNet
Community extension: Hosted under the keras-team organization as a collection of community contributions
Active community: More contributors and frequent updates

Cons of keras-contrib

Less focused: Not specialized for CapsNet implementation
Complexity: May be overwhelming for users specifically interested in CapsNet

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

keras-contrib:

def squash(x, axis=-1):
    s_squared_norm = K.sum(K.square(x), axis, keepdims=True) + K.epsilon()
    scale = K.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
    return scale * x

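Both implementations provide a squash function for capsule layers, but they are not identical. CapsNet-Keras scales a vector with squared norm s to length s / (1 + s), as in the paper, while keras-contrib uses 0.5 in place of 1, giving length s / (0.5 + s), and adds epsilon to s before taking the square root.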
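
To make the difference between the two squash variants concrete, here is a NumPy transcription of both snippets applied to a unit-length vector. The functions mirror the code shown above; the value of `EPS` assumes the Keras default for `K.epsilon()` (1e-7), and the function names are chosen here for illustration.

```python
import numpy as np

EPS = 1e-7  # assumed default value of K.epsilon()


def squash_capsnet_keras(x, axis=-1):
    s = np.sum(np.square(x), axis=axis, keepdims=True)
    scale = s / (1 + s) / np.sqrt(s + EPS)
    return scale * x


def squash_keras_contrib(x, axis=-1):
    s = np.sum(np.square(x), axis=axis, keepdims=True) + EPS
    scale = np.sqrt(s) / (0.5 + s)
    return scale * x


x = np.array([[0.6, 0.8]])  # a vector of length 1
norm_a = np.linalg.norm(squash_capsnet_keras(x))
norm_b = np.linalg.norm(squash_keras_contrib(x))
print(norm_a, norm_b)
```

For an input of length 1, the first variant yields a length of about 1/2 and the second about 2/3, so the keras-contrib version saturates toward 1 more quickly.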


README

CapsNet-Keras

A Keras implementation of CapsNet (the tf2.2 branch supports TensorFlow 2) from the paper:
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules. NIPS 2017
The current average test error = 0.34% and best test error = 0.30%.

Differences from the paper:

- We use learning rate decay with decay factor = 0.9 and step = 1 epoch, while the paper does not give the detailed parameters (or perhaps does not use decay at all).
- We only report test errors after 50 epochs of training. In the paper, I suppose they trained for 1250 epochs according to Figure A.1? That sounds crazy, so maybe I misunderstood.
- We use MSE (mean squared error) as the reconstruction loss, with coefficient lam_recon = 0.0005 * 784 = 0.392. This should be equivalent to using SSE (sum of squared errors) with lam_recon = 0.0005 as in the paper.
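
Two of these points can be checked numerically. The sketch below uses random arrays as stand-ins for flattened 28x28 images to show that 0.392 * MSE equals 0.0005 * SSE, and it writes out the per-epoch decay schedule. The initial learning rate of 0.001 is an assumption for illustration, and `decayed_lr` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((4, 784))        # stand-in for 4 flattened 28x28 images
x_recon = rng.random((4, 784))  # stand-in for their reconstructions

sq_err = (x - x_recon) ** 2
sse = sq_err.sum(axis=1)        # sum of squared errors per image
mse = sq_err.mean(axis=1)       # mean squared error per image (sse / 784)

weighted_mse = 0.0005 * 784 * mse  # lam_recon = 0.392 with MSE
weighted_sse = 0.0005 * sse        # lam_recon = 0.0005 with SSE
print(np.allclose(weighted_mse, weighted_sse))


def decayed_lr(initial_lr, epoch, decay=0.9):
    """Learning rate after `epoch` epochs with a decay step of 1 epoch."""
    return initial_lr * decay ** epoch


print([decayed_lr(0.001, e) for e in range(3)])
```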

Warning

Please use Keras==2.0.7 with the TensorFlow==1.2 backend, or the K.batch_dot function may not work correctly.

If you use TensorFlow>=2.0, check out the tf2.2 branch instead.

Usage

Step 1. Clone this repository.

git clone https://github.com/XifengGuo/CapsNet-Keras.git capsnet-keras
cd capsnet-keras
git checkout tf2.2 # only if you use TensorFlow>=2.0

Step 2. Install Keras==2.0.7 with TensorFlow==1.2 backend.

pip install tensorflow-gpu==1.2
pip install keras==2.0.7

Or install TensorFlow>=2.0:

pip install tensorflow==2.2

Step 3. Train a CapsNet on MNIST

Training with default settings:

python capsulenet.py

For more detailed usage, run:

python capsulenet.py -h

Step 4. Test a pre-trained CapsNet model

Suppose you have trained a model using the above command; the trained model is saved to result/trained_model.h5. Run the following command to get test results.

python capsulenet.py -t -w result/trained_model.h5

It outputs the test accuracy and shows the reconstructed images. The test data is the same as the validation data. To test on new data, adapt the code as needed.

You can also download a model I trained from https://pan.baidu.com/s/1sldqQo1 or https://drive.google.com/open?id=1A7pRxH7iWzYZekzr-O0nrwqdUUpUpkik

Step 5. Train on multiple GPUs

This requires Keras>=2.0.9. After updating Keras:

python capsulenet-multi-gpu.py --gpus 2

It automatically trains on multiple GPUs for 50 epochs and then reports performance on the test dataset. No validation accuracy is reported during training.

Results

Test Errors

CapsNet classification test error on MNIST. The average and standard deviation are computed over 3 trials. The results can be reproduced with the following commands.

python capsulenet.py --routings 1 --lam_recon 0.0    #CapsNet-v1   
python capsulenet.py --routings 1 --lam_recon 0.392  #CapsNet-v2
python capsulenet.py --routings 3 --lam_recon 0.0    #CapsNet-v3 
python capsulenet.py --routings 3 --lam_recon 0.392  #CapsNet-v4

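| Method | Routing iterations | Reconstruction | MNIST test error, % (std) | Paper test error, % (std) |
|---|---|---|---|---|
| Baseline | -- | -- | -- | 0.39 |
| CapsNet-v1 | 1 | no | 0.39 (0.024) | 0.34 (0.032) |
| CapsNet-v2 | 1 | yes | 0.36 (0.009) | 0.29 (0.011) |
| CapsNet-v3 | 3 | no | 0.40 (0.016) | 0.35 (0.036) |
| CapsNet-v4 | 3 | yes | 0.34 (0.016) | 0.25 (0.005) |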

[Figure: losses and accuracies]

Training Speed

About 100s / epoch on a single GTX 1070 GPU.
About 80s / epoch on a single GTX 1080Ti GPU.
About 55s / epoch on two GTX 1080Ti GPUs using capsulenet-multi-gpu.py.

Reconstruction result

The result of CapsNet-v4 by launching

python capsulenet.py -t -w result/trained_model.h5

The digits in the top 5 rows are real images from MNIST, and the digits in the bottom rows are the corresponding reconstructions.

Manipulate latent code

python capsulenet.py -t --digit 5 -w result/trained_model.h5

For each digit, the ith row corresponds to the ith dimension of the capsule, and columns from left to right correspond to adding [-0.25, -0.2, -0.15, -0.1, -0.05, 0, 0.05, 0.1, 0.15, 0.2, 0.25] to the value of one dimension of the capsule.

As we can see, each dimension captures some characteristic of a digit. The same dimension of different digit capsules may represent different characteristics, because different digits are reconstructed from different feature vectors (digit capsules). These vectors are mutually independent during reconstruction.
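
The perturbation grid described above can be sketched in NumPy as follows. This only builds the perturbed capsule vectors; feeding them to the decoder is left out. The random 16-dimensional vector is a hypothetical stand-in for a real digit capsule, and `perturb_capsule` is a name chosen for illustration, not a function from the repository.

```python
import numpy as np


def perturb_capsule(capsule, offsets):
    """Return a grid of shape [dim, n_offsets, dim].

    Row i holds copies of `capsule` in which only dimension i is shifted,
    one copy per offset (columns run from the most negative to the most
    positive offset).
    """
    dim = capsule.shape[0]
    grid = np.tile(capsule, (dim, len(offsets), 1))
    for i in range(dim):
        grid[i, :, i] += offsets
    return grid


offsets = np.linspace(-0.25, 0.25, 11)              # -0.25, -0.2, ..., 0.25
capsule = np.random.default_rng(0).normal(size=16)  # stand-in digit capsule
grid = perturb_capsule(capsule, offsets)
print(grid.shape)
```

Each of the 16 x 11 vectors in the grid would then be decoded into one image, giving the row and column layout described above.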

Other Implementations

PyTorch:
TensorFlow:
- naturomics/CapsNet-Tensorflow
  I referred to some functions in this repository.
- InnerPeace-Wu/CapsNet-tensorflow
- chrislybaer/capsules-tensorflow
MXNet:
- AaronLeong/CapsNet_Mxnet
Chainer:
- soskek/dynamic_routing_between_capsules
Matlab:
- yechengxi/LightCapsNet
