XifengGuo/CapsNet-Keras

A Keras implementation of CapsNet in NIPS2017 paper "Dynamic Routing Between Capsules". Now test error = 0.34%.

Quick Overview

XifengGuo/CapsNet-Keras is a GitHub repository that implements Capsule Networks (CapsNet) in the Keras deep learning framework. It follows the architecture proposed by Sabour, Frosst, and Hinton in the paper "Dynamic Routing Between Capsules," so researchers and developers can experiment with and build on it.

Pros

  • Implements CapsNet architecture in a popular deep learning framework (Keras)
  • Offers a downloadable pre-trained model and a ready-to-run example for MNIST
  • Well-documented code with clear explanations of the CapsNet architecture
  • Provides both training and evaluation scripts for easy experimentation

Cons

  • Limited to a single dataset (the README covers MNIST only)
  • May require updates to work with the latest versions of Keras and TensorFlow
  • Lacks implementations for more complex datasets or real-world applications
  • Performance may be slower compared to optimized implementations in other frameworks

Code Examples

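  1. Defining the CapsNet model (a simplified view: only the classification branch is shown, without the reconstruction decoder):

    from keras import layers, models
    from capsulelayers import CapsuleLayer, PrimaryCap, Length  # defined in this repository

    def CapsNet(input_shape, n_class, routings):
        x = layers.Input(shape=input_shape)
        # Layer 1: a conventional convolution layer
        conv1 = layers.Conv2D(filters=256, kernel_size=9, strides=1, padding='valid', activation='relu', name='conv1')(x)
        # Layer 2: convolution reshaped into 8-D primary capsules and squashed
        primarycaps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')
        # Layer 3: one 16-D capsule per class, computed with dynamic routing
        digitcaps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings, name='digitcaps')(primarycaps)
        # Layer 4: the length of each capsule serves as the class score
        out_caps = Length(name='capsnet')(digitcaps)
        model = models.Model(inputs=x, outputs=out_caps)
        return model

  2. Training the CapsNet model (the two-input, two-output call assumes the full model that also includes the reconstruction decoder):

    # args, margin_loss, and the callbacks (log, tb, checkpoint, lr_decay) come from capsulenet.py
    model = CapsNet(input_shape=x_train.shape[1:], n_class=len(np.unique(y_train)), routings=3)
    model.compile(optimizer=optimizers.Adam(lr=args.lr), loss=[margin_loss, 'mse'], loss_weights=[1., args.lam_recon])
    model.fit([x_train, y_train], [y_train, x_train], batch_size=args.batch_size, epochs=args.epochs,
              validation_data=[[x_test, y_test], [y_test, x_test]], callbacks=[log, tb, checkpoint, lr_decay])

  3. Evaluating the trained model:

    y_pred, x_recon = model.predict([x_test, y_test], batch_size=100)
    print('-'*50)
    # y_test is one-hot encoded, so compare the predicted and true class indices
    print('Test acc:', np.sum(np.argmax(y_pred, 1) == np.argmax(y_test, 1))/y_test.shape[0])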
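
The training call above passes a `margin_loss` function that the snippet does not define. The following is a minimal NumPy sketch of the margin loss as given in the paper, not the repository's Keras code. The constants `m_plus=0.9`, `m_minus=0.1` and `lam=0.5` are the paper's values, and the example inputs are made up for illustration.

```python
import numpy as np

def margin_loss(y_true, y_pred, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Margin loss from the paper, summed over classes and averaged over the batch.

    y_true: one-hot labels, shape (batch, n_class)
    y_pred: capsule lengths in [0, 1], shape (batch, n_class)
    """
    # Penalise the true class when its capsule is shorter than m_plus.
    present = y_true * np.maximum(0.0, m_plus - y_pred) ** 2
    # Penalise other classes (down-weighted by lam) when longer than m_minus.
    absent = lam * (1.0 - y_true) * np.maximum(0.0, y_pred - m_minus) ** 2
    return float(np.mean(np.sum(present + absent, axis=1)))

# Illustrative inputs (hypothetical, not taken from the repository).
y_true = np.array([[1.0, 0.0, 0.0]])
confident = np.array([[0.95, 0.05, 0.05]])  # correct and inside both margins
wrong = np.array([[0.05, 0.95, 0.05]])      # long capsule for the wrong class

loss_confident = margin_loss(y_true, confident)
loss_wrong = margin_loss(y_true, wrong)
```

A prediction that sits inside both margins contributes no loss, so a capsule length of 0.9 or more for the true class is treated as "good enough" and is not pushed further.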

Getting Started

  1. Clone the repository:

    git clone https://github.com/XifengGuo/CapsNet-Keras.git
    cd CapsNet-Keras
    
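  2. Install dependencies (the README pins Keras 2.0.7 with TensorFlow 1.2; for TensorFlow>=2.0, use branch tf2.2 and install tensorflow==2.2 instead):

    pip install tensorflow-gpu==1.2
    pip install keras==2.0.7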
    
  3. Train the model on MNIST:

    python capsulenet.py
    
  4. Evaluate the model:

    python capsulenet.py -t -w result/trained_model.h5
    

Competitor Comparisons

62,199

Deep Learning for humans

Pros of Keras

  • Comprehensive deep learning framework with a wide range of built-in layers and models
  • Large community support and extensive documentation
  • Seamless integration with TensorFlow backend

Cons of Keras

  • Less specialized for capsule networks compared to CapsNet-Keras
  • May require more custom code to implement specific capsule network architectures

Code Comparison

CapsNet-Keras:

def CapsNet(input_shape, n_class, routings):
    x = layers.Input(shape=input_shape)
    conv1 = layers.Conv2D(256, 9, activation='relu')(x)
    primary_caps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')
    digit_caps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings)(primary_caps)
    out_caps = Length(name='capsnet')(digit_caps)
    return models.Model(inputs=x, outputs=out_caps)

Keras:

def build_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)

186,879

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Comprehensive deep learning framework with extensive functionality
  • Large community support and extensive documentation
  • Supports multiple programming languages and platforms

Cons of TensorFlow

  • Steeper learning curve for beginners
  • Can be more complex to set up and configure

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

TensorFlow:

def squash(vectors, axis=-1):
    squared_norm = tf.reduce_sum(tf.square(vectors), axis=axis, keepdims=True)
    scale = squared_norm / (1 + squared_norm) / tf.sqrt(squared_norm + tf.keras.backend.epsilon())
    return scale * vectors

Summary

CapsNet-Keras is a specific implementation of Capsule Networks using Keras, while TensorFlow is a more general-purpose machine learning framework. TensorFlow offers greater flexibility and a wider range of applications but may be more challenging for beginners. CapsNet-Keras provides a focused implementation of Capsule Networks, which can be easier to understand and use for this specific task.

85,015

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • Broader scope and functionality as a full deep learning framework
  • Larger community and more extensive documentation
  • More flexible and dynamic computational graph

Cons of PyTorch

  • Steeper learning curve for beginners
  • Potentially more complex setup and configuration
  • May be overkill for simple projects like implementing CapsNet

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

PyTorch:

def squash(input_tensor, dim=-1, epsilon=1e-7):
    squared_norm = (input_tensor ** 2).sum(dim=dim, keepdim=True)
    scale = squared_norm / (1 + squared_norm)
    return scale * input_tensor / torch.sqrt(squared_norm + epsilon)

The snippets implement the same squash function: the PyTorch version uses native tensor operations, while CapsNet-Keras uses Keras backend functions. The PyTorch version also takes epsilon as an explicit argument instead of reading it from K.epsilon().
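
To see what squash does to actual numbers, here is a NumPy port of the same formula. It is a sketch for illustration rather than code from either project. The default `eps=1e-7` is assumed to match the Keras default returned by `K.epsilon()`.

```python
import numpy as np

def squash(vectors, axis=-1, eps=1e-7):
    """Shrink short vectors towards 0 and long vectors towards unit length."""
    squared_norm = np.sum(np.square(vectors), axis=axis, keepdims=True)
    scale = squared_norm / (1.0 + squared_norm) / np.sqrt(squared_norm + eps)
    return scale * vectors

short = squash(np.array([0.1, 0.0]))    # input length 0.1
long = squash(np.array([100.0, 0.0]))   # input length 100

short_len = float(np.linalg.norm(short))  # close to 0
long_len = float(np.linalg.norm(long))    # just below 1
```

The direction of each vector is unchanged; only its length is mapped into the interval [0, 1), which is what lets a capsule's length act as a probability-like score.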

Keras code and weights files for popular deep learning models.

Pros of deep-learning-models

  • Offers a wide variety of pre-implemented deep learning models
  • Maintained by François Chollet, the creator of Keras
  • Includes popular architectures like ResNet, VGG, and Inception

Cons of deep-learning-models

  • Focuses on general deep learning models, not specifically on capsule networks
  • May require more customization for specific capsule network implementations
  • Less specialized documentation for capsule network concepts

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

deep-learning-models:

def identity_block(input_tensor, kernel_size, filters, stage, block):
    filters1, filters2, filters3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

The code snippets show the difference in focus between the two repositories. CapsNet-Keras implements specific capsule network functions, while deep-learning-models provides building blocks for various architectures.

Reference implementations of popular deep learning models.

Pros of keras-applications

  • Comprehensive collection of pre-trained deep learning models
  • Official Keras repository, ensuring compatibility and regular updates
  • Extensive documentation and community support

Cons of keras-applications

  • Focused on general-purpose models, not specialized architectures like CapsNet
  • May require more computational resources for some applications
  • Less suitable for research on novel network architectures

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

keras-applications:

def preprocess_input(x, data_format=None):
    return imagenet_utils.preprocess_input(x, data_format=data_format, mode='caffe')

The CapsNet-Keras code snippet shows a custom squash function for capsule networks, while the keras-applications code demonstrates a standard preprocessing function for image input. This highlights the difference in focus between the two repositories, with CapsNet-Keras implementing specialized capsule network operations and keras-applications providing general-purpose utilities for common deep learning models.

Keras community contributions

Pros of keras-contrib

  • Broader scope: Includes various experimental Keras layers, not limited to CapsNet
  • Official extension: Maintained by the Keras team, ensuring compatibility and support
  • Active community: More contributors and frequent updates

Cons of keras-contrib

  • Less focused: Not specialized for CapsNet implementation
  • Complexity: May be overwhelming for users specifically interested in CapsNet

Code Comparison

CapsNet-Keras:

def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors

keras-contrib:

def squash(x, axis=-1):
    s_squared_norm = K.sum(K.square(x), axis, keepdims=True) + K.epsilon()
    scale = K.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
    return scale * x

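Both snippets define a squash function, but they are not the same formula. With n the squared norm, CapsNet-Keras scales the vector by n / (1 + n) / sqrt(n + epsilon), as in the paper, while the keras-contrib snippet adds epsilon to n up front and scales by sqrt(n) / (0.5 + n).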

README

CapsNet-Keras

A Keras (branch tf2.2 supports TensorFlow 2) implementation of CapsNet in the paper:
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules. NIPS 2017
The current average test error = 0.34% and best test error = 0.30%.

Differences with the paper:

  • We use learning rate decay with decay factor = 0.9 and step = 1 epoch,
    while the paper did not give the detailed parameters (or they didn't use it?).
  • We only report the test errors after 50 epochs of training.
    In the paper, I suppose they trained for 1250 epochs according to Figure A.1? Sounds crazy, maybe I misunderstood.
  • We use MSE (mean squared error) as the reconstruction loss, and the coefficient for the loss is lam_recon=0.0005*784=0.392.
    This should be equivalent to using SSE (sum squared error) and lam_recon=0.0005 as in the paper.
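
Two of the points above are simple arithmetic, sketched here in NumPy under stated assumptions. The initial learning rate of 0.001 and the random "images" are placeholders chosen for illustration. The decay factor 0.9, the 784 pixels and both coefficients come from the list above.

```python
import numpy as np

def decayed_lr(initial_lr, epoch, decay=0.9):
    """Learning rate after `epoch` epochs with a per-epoch decay factor."""
    return initial_lr * decay ** epoch

initial_lr = 0.001  # placeholder starting value, an assumption
schedule = [decayed_lr(initial_lr, epoch) for epoch in range(3)]

# Reconstruction loss: MSE weighted by 0.392 versus SSE weighted by 0.0005.
rng = np.random.default_rng(0)
x = rng.random((4, 784))        # 4 fake flattened 28x28 images
x_recon = rng.random((4, 784))  # 4 fake reconstructions
squared_error = (x - x_recon) ** 2

mse_term = 0.392 * squared_error.mean(axis=1)   # mean over the 784 pixels
sse_term = 0.0005 * squared_error.sum(axis=1)   # sum over the 784 pixels
```

Since the mean is the sum divided by 784, and 0.0005 * 784 = 0.392, the two weighted terms agree for every image.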

Warning

Please use Keras==2.0.7 with the TensorFlow==1.2 backend; otherwise the K.batch_dot function may not work correctly.

However, if you use TensorFlow>=2.0, check out branch tf2.2.

Usage

Step 1. Clone this repository locally.

git clone https://github.com/XifengGuo/CapsNet-Keras.git capsnet-keras
cd capsnet-keras
git checkout tf2.2 # Only if using TensorFlow>=2.0

Step 2. Install Keras==2.0.7 with the TensorFlow==1.2 backend.

pip install tensorflow-gpu==1.2
pip install keras==2.0.7

Or install TensorFlow>=2.0:

pip install tensorflow==2.2

Step 3. Train a CapsNet on MNIST

Training with default settings:

python capsulenet.py

For more detailed usage, run:

python capsulenet.py -h

Step 4. Test a pre-trained CapsNet model

If you have trained a model with the command above, the trained model is saved to result/trained_model.h5. Run the following command to get the test results.

python capsulenet.py -t -w result/trained_model.h5

It outputs the test accuracy and shows the reconstructed images. The test data is the same as the validation data. To test on new data, change the code as needed.

You can also just download a model I trained from https://pan.baidu.com/s/1sldqQo1 or https://drive.google.com/open?id=1A7pRxH7iWzYZekzr-O0nrwqdUUpUpkik

Step 5. Train on multiple GPUs

This requires Keras>=2.0.9. After updating Keras:

python capsulenet-multi-gpu.py --gpus 2

It will automatically train on multiple GPUs for 50 epochs and then output the performance on the test dataset. No validation accuracy is reported during training.

Results

Test Errors

CapsNet classification test error on MNIST. The average and standard deviation (in parentheses) are computed over 3 trials. The results can be reproduced by launching the following commands.

python capsulenet.py --routings 1 --lam_recon 0.0    #CapsNet-v1   
python capsulenet.py --routings 1 --lam_recon 0.392  #CapsNet-v2
python capsulenet.py --routings 3 --lam_recon 0.0    #CapsNet-v3 
python capsulenet.py --routings 3 --lam_recon 0.392  #CapsNet-v4
Method       Routing   Reconstruction   MNIST (%)      Paper
Baseline     --        --               --             0.39
CapsNet-v1   1         no               0.39 (0.024)   0.34 (0.032)
CapsNet-v2   1         yes              0.36 (0.009)   0.29 (0.011)
CapsNet-v3   3         no               0.40 (0.016)   0.35 (0.036)
CapsNet-v4   3         yes              0.34 (0.016)   0.25 (0.005)

Losses and accuracies:

Training Speed

About 100s / epoch on a single GTX 1070 GPU.
About 80s / epoch on a single GTX 1080Ti GPU.
About 55s / epoch on two GTX 1080Ti GPUs using capsulenet-multi-gpu.py.

Reconstruction result

The result of CapsNet-v4 by launching

python capsulenet.py -t -w result/trained_model.h5

The digits in the top 5 rows are real images from MNIST, and the digits in the bottom rows are the corresponding reconstructed images.

Manipulate latent code

python capsulenet.py -t --digit 5 -w result/trained_model.h5 

For each digit, the ith row corresponds to the ith dimension of the capsule, and columns from left to right correspond to adding [-0.25, -0.2, -0.15, -0.1, -0.05, 0, 0.05, 0.1, 0.15, 0.2, 0.25] to the value of one dimension of the capsule.

As we can see, each dimension has captured some characteristic of a digit. The same dimension of different digit capsules may represent different characteristics, because different digits are reconstructed from different feature vectors (digit capsules). These vectors are mutually independent during reconstruction.
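
The grid described above can be sketched in NumPy as follows. This only builds the perturbed capsule vectors; in the actual script each vector would then be passed through the decoder to produce an image. The function name `perturb_capsule` and the random 16-D capsule are hypothetical stand-ins. The 11 offsets are the ones listed above.

```python
import numpy as np

# The 11 offsets from -0.25 to 0.25 in steps of 0.05.
offsets = np.linspace(-0.25, 0.25, 11)

def perturb_capsule(capsule, offsets):
    """Build a (dim, n_offsets, dim) grid of perturbed capsules.

    grid[i, j] is the capsule with offsets[j] added to dimension i only.
    """
    dim = capsule.shape[0]
    grid = np.tile(capsule, (dim, len(offsets), 1))
    for i in range(dim):
        grid[i, :, i] += offsets
    return grid

# Hypothetical digit capsule; a real one would come from the trained model.
capsule = np.random.default_rng(0).normal(scale=0.1, size=16)
grid = perturb_capsule(capsule, offsets)  # 16 rows x 11 columns of 16-D vectors
```

Row i of the grid varies only dimension i, and the middle column (offset 0) reproduces the original capsule. This is why each row of the figure isolates the effect of a single dimension.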

Other Implementations