CapsNet-Keras
A Keras implementation of CapsNet from the NIPS 2017 paper "Dynamic Routing Between Capsules". Current average test error: 0.34%.
Top Related Projects
Deep Learning for humans
An Open Source Machine Learning Framework for Everyone
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Keras code and weights files for popular deep learning models.
Reference implementations of popular deep learning models.
Keras community contributions
Quick Overview
XifengGuo/CapsNet-Keras is a GitHub repository that implements Capsule Networks (CapsNet) using the Keras deep learning framework. It provides a Keras implementation of the CapsNet architecture proposed by Hinton et al. in the paper "Dynamic Routing Between Capsules," allowing researchers and developers to experiment with and build upon this novel neural network architecture.
Pros
- Implements CapsNet architecture in a popular deep learning framework (Keras)
- Includes pre-trained models and examples for MNIST and Fashion-MNIST datasets
- Well-documented code with clear explanations of the CapsNet architecture
- Provides both training and evaluation scripts for easy experimentation
Cons
- Limited to specific datasets (MNIST and Fashion-MNIST)
- May require updates to work with the latest versions of Keras and TensorFlow
- Lacks implementations for more complex datasets or real-world applications
- Performance may be slower compared to optimized implementations in other frameworks
Code Examples
- Defining the CapsNet model:
def CapsNet(input_shape, n_class, routings):
    x = layers.Input(shape=input_shape)
    conv1 = layers.Conv2D(filters=256, kernel_size=9, strides=1, padding='valid', activation='relu', name='conv1')(x)
    primarycaps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')
    digitcaps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings, name='digitcaps')(primarycaps)
    out_caps = Length(name='capsnet')(digitcaps)
    # Simplified: the repository's full version also builds a decoder branch that
    # reconstructs the input; the training example below assumes that two-output model.
    model = models.Model(inputs=x, outputs=out_caps)
    return model
- Training the CapsNet model (the two losses target the classification and reconstruction outputs of the repo's two-output training model; margin_loss is sketched after these examples):
model = CapsNet(input_shape=x_train.shape[1:], n_class=len(np.unique(y_train)), routings=3)
model.compile(optimizer=optimizers.Adam(lr=args.lr), loss=[margin_loss, 'mse'], loss_weights=[1., args.lam_recon])
model.fit([x_train, y_train], [y_train, x_train], batch_size=args.batch_size, epochs=args.epochs,
          validation_data=[[x_test, y_test], [y_test, x_test]], callbacks=[log, tb, checkpoint, lr_decay])
- Evaluating the trained model:
y_pred, x_recon = model.predict([x_test, y_test], batch_size=100)
print('-'*50)
print('Test acc:', np.sum(np.argmax(y_pred, 1) == np.argmax(y_test, 1))/y_test.shape[0])
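The training example compiles with margin_loss, the capsule classification loss from the paper (m+ = 0.9, m- = 0.1, lambda = 0.5). A minimal sketch matching the definition used in the repository's capsulenet.py:
from keras import backend as K

def margin_loss(y_true, y_pred):
    # y_pred holds the lengths of the digit capsules; y_true is one-hot.
    # L_k = T_k * max(0, m+ - ||v_k||)^2 + lambda * (1 - T_k) * max(0, ||v_k|| - m-)^2
    L = y_true * K.square(K.maximum(0., 0.9 - y_pred)) + \
        0.5 * (1 - y_true) * K.square(K.maximum(0., y_pred - 0.1))
    return K.mean(K.sum(L, 1))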
Getting Started
- Clone the repository:
git clone https://github.com/XifengGuo/CapsNet-Keras.git
cd CapsNet-Keras
- Install dependencies:
pip install -r requirements.txt
- Train the model on MNIST:
python capsulenet.py
- Evaluate the model:
python capsulenet.py -t -w result/trained_model.h5
Competitor Comparisons
Deep Learning for humans
Pros of Keras
- Comprehensive deep learning framework with a wide range of built-in layers and models
- Large community support and extensive documentation
- Seamless integration with TensorFlow backend
Cons of Keras
- Less specialized for capsule networks compared to CapsNet-Keras
- May require more custom code to implement specific capsule network architectures
Code Comparison
CapsNet-Keras:
def CapsNet(input_shape, n_class, routings):
    x = layers.Input(shape=input_shape)
    conv1 = layers.Conv2D(256, 9, activation='relu')(x)
    primary_caps = PrimaryCap(conv1, dim_capsule=8, n_channels=32, kernel_size=9, strides=2, padding='valid')
    digit_caps = CapsuleLayer(num_capsule=n_class, dim_capsule=16, routings=routings)(primary_caps)
    out_caps = Length(name='capsnet')(digit_caps)
Keras:
def build_model(input_shape, num_classes):
    inputs = keras.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, activation="relu")(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Flatten()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return keras.Model(inputs, outputs)
An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Comprehensive deep learning framework with extensive functionality
- Large community support and extensive documentation
- Supports multiple programming languages and platforms
Cons of TensorFlow
- Steeper learning curve for beginners
- Can be more complex to set up and configure
Code Comparison
CapsNet-Keras:
def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors
TensorFlow:
def squash(vectors, axis=-1):
    squared_norm = tf.reduce_sum(tf.square(vectors), axis=axis, keepdims=True)
    scale = squared_norm / (1 + squared_norm) / tf.sqrt(squared_norm + tf.keras.backend.epsilon())
    return scale * vectors
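Both snippets implement the same nonlinearity: squash rescales a vector's length into (0, 1) while preserving its direction, so long vectors end up with length close to 1. A quick NumPy check for intuition (illustrative only, not taken from either repository):
import numpy as np

v = np.array([3.0, 4.0])                   # norm = 5
s = np.dot(v, v)                           # squared norm = 25
squashed = (s / (1 + s)) * v / np.sqrt(s)  # scale by s/(1+s), then normalize
print(np.linalg.norm(squashed))            # ~0.96: close to 1 for long vectors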
Summary
CapsNet-Keras is a specific implementation of Capsule Networks using Keras, while TensorFlow is a more general-purpose machine learning framework. TensorFlow offers greater flexibility and a wider range of applications but may be more challenging for beginners. CapsNet-Keras provides a focused implementation of Capsule Networks, which can be easier to understand and use for this specific task.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Broader scope and functionality as a full deep learning framework
- Larger community and more extensive documentation
- More flexible and dynamic computational graph
Cons of PyTorch
- Steeper learning curve for beginners
- Potentially more complex setup and configuration
- May be overkill for simple projects like implementing CapsNet
Code Comparison
CapsNet-Keras:
def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors
PyTorch:
def squash(input_tensor, dim=-1, epsilon=1e-7):
    squared_norm = (input_tensor ** 2).sum(dim=dim, keepdim=True)
    scale = squared_norm / (1 + squared_norm)
    return scale * input_tensor / torch.sqrt(squared_norm + epsilon)
The code snippets show similar implementations of the squash function: PyTorch uses native tensor operations, while CapsNet-Keras uses Keras backend functions. The PyTorch version is slightly more concise.
Keras code and weights files for popular deep learning models.
Pros of deep-learning-models
- Offers a wide variety of pre-implemented deep learning models
- Maintained by François Chollet, the creator of Keras
- Includes popular architectures like ResNet, VGG, and Inception
Cons of deep-learning-models
- Focuses on general deep learning models, not specifically on capsule networks
- May require more customization for specific capsule network implementations
- Less specialized documentation for capsule network concepts
Code Comparison
CapsNet-Keras:
def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors
deep-learning-models:
def identity_block(input_tensor, kernel_size, filters, stage, block):
    filters1, filters2, filters3 = filters
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'
The code snippets show the difference in focus between the two repositories. CapsNet-Keras implements specific capsule network functions, while deep-learning-models provides building blocks for various architectures.
Reference implementations of popular deep learning models.
Pros of keras-applications
- Comprehensive collection of pre-trained deep learning models
- Official Keras repository, ensuring compatibility and regular updates
- Extensive documentation and community support
Cons of keras-applications
- Focused on general-purpose models, not specialized architectures like CapsNet
- May require more computational resources for some applications
- Less suitable for research on novel network architectures
Code Comparison
CapsNet-Keras:
def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors
keras-applications:
def preprocess_input(x, data_format=None):
    return imagenet_utils.preprocess_input(x, data_format=data_format, mode='caffe')
The CapsNet-Keras code snippet shows a custom squash function for capsule networks, while the keras-applications code demonstrates a standard preprocessing function for image input. This highlights the difference in focus between the two repositories, with CapsNet-Keras implementing specialized capsule network operations and keras-applications providing general-purpose utilities for common deep learning models.
Keras community contributions
Pros of keras-contrib
- Broader scope: Includes various experimental Keras layers, not limited to CapsNet
- Official extension: Maintained by the Keras team, ensuring compatibility and support
- Active community: More contributors and frequent updates
Cons of keras-contrib
- Less focused: Not specialized for CapsNet implementation
- Complexity: May be overwhelming for users specifically interested in CapsNet
Code Comparison
CapsNet-Keras:
def squash(vectors, axis=-1):
    s_squared_norm = K.sum(K.square(vectors), axis, keepdims=True)
    scale = s_squared_norm / (1 + s_squared_norm) / K.sqrt(s_squared_norm + K.epsilon())
    return scale * vectors
keras-contrib:
def squash(x, axis=-1):
    s_squared_norm = K.sum(K.square(x), axis, keepdims=True) + K.epsilon()
    scale = K.sqrt(s_squared_norm) / (0.5 + s_squared_norm)
    return scale * x
Both implementations provide a squash function for CapsNet, but keras-contrib's version is slightly different in its calculation approach.
README
CapsNet-Keras
A Keras implementation of CapsNet (branch tf2.2 supports TensorFlow 2) from the paper:
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules. NIPS 2017
The current average test error is 0.34% and the best test error is 0.30%.
Differences with the paper:
- We use learning rate decay with decay factor = 0.9 and step = 1 epoch, while the paper did not give these parameters (or perhaps did not use decay at all); see the callback sketch after this list.
- We only report test errors after 50 epochs of training. In the paper, they appear to have trained for 1250 epochs, judging from Figure A.1; that sounds excessive, so perhaps I misunderstood.
- We use MSE (mean squared error) as the reconstruction loss, with coefficient lam_recon = 0.0005 * 784 = 0.392. Since MNIST images have 28 * 28 = 784 pixels, MSE = SSE / 784, so this should be equivalent to using SSE (sum squared error) with lam_recon = 0.0005 as in the paper.
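A minimal sketch of this schedule as a Keras callback (the initial learning rate 0.001 is an assumed default, not taken from the paper; the repository exposes it as a command-line argument):
from keras import callbacks

# Exponential decay: lr(epoch) = lr0 * 0.9**epoch, i.e. decay factor 0.9 per epoch.
lr_decay = callbacks.LearningRateScheduler(schedule=lambda epoch: 0.001 * (0.9 ** epoch))
This is the lr_decay callback passed to model.fit in the training example above.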
Warning
Please use Keras==2.0.7 with a TensorFlow==1.2 backend, or the K.batch_dot function may not work correctly. If you use TensorFlow>=2.0, check out branch tf2.2 instead.
Usage
Step 1. Clone this repository to local.
git clone https://github.com/XifengGuo/CapsNet-Keras.git capsnet-keras
cd capsnet-keras
git checkout tf2.2 # Only if using TensorFlow>=2.0
Step 2. Install Keras==2.0.7 with TensorFlow==1.2 backend.
pip install tensorflow-gpu==1.2
pip install keras==2.0.7
or install TensorFlow>=2.0 (for the tf2.2 branch):
pip install tensorflow==2.2
Step 3. Train a CapsNet on MNIST
Training with default settings:
python capsulenet.py
For more detailed usage, run:
python capsulenet.py -h
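Judging from the args.* references in the code examples above and the commands in the Results section, a customized run presumably looks like the following (flag names inferred, values illustrative; confirm with python capsulenet.py -h):
python capsulenet.py --epochs 50 --batch_size 100 --lr 0.001 --lam_recon 0.392 --routings 3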
Step 4. Test a pre-trained CapsNet model
Suppose you have trained a model using the above command; the trained model will be saved to result/trained_model.h5. Now launch the following command to get test results:
$ python capsulenet.py -t -w result/trained_model.h5
It will output the testing accuracy and show the reconstructed images. The testing data is the same as the validation data. Testing on new data is straightforward; just change the code as needed.
You can also just download a model I trained from https://pan.baidu.com/s/1sldqQo1 or https://drive.google.com/open?id=1A7pRxH7iWzYZekzr-O0nrwqdUUpUpkik
Step 5. Train on multiple GPUs
This requires Keras>=2.0.9. After updating Keras:
python capsulenet-multi-gpu.py --gpus 2
It will automatically train on multiple GPUs for 50 epochs and then report the performance on the test dataset. During training, no validation accuracy is reported.
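capsulenet-multi-gpu.py presumably relies on Keras' multi_gpu_model utility, which was added in Keras 2.0.9 (hence the version requirement). A hedged sketch of the core idea:
from keras.utils import multi_gpu_model

# Wrap the single-GPU training model so each batch is split across the GPUs;
# `model` stands for the two-output training model built by CapsNet(...) above.
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer='adam', loss=[margin_loss, 'mse'], loss_weights=[1., 0.392])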
Results
Test Errors
CapsNet classification test error on MNIST. Averages and standard deviations are computed over 3 trials. The results can be reproduced by launching the following commands.
python capsulenet.py --routings 1 --lam_recon 0.0 #CapsNet-v1
python capsulenet.py --routings 1 --lam_recon 0.392 #CapsNet-v2
python capsulenet.py --routings 3 --lam_recon 0.0 #CapsNet-v3
python capsulenet.py --routings 3 --lam_recon 0.392 #CapsNet-v4
Method | Routing | Reconstruction | MNIST test error, this repo (%) | Paper (%) |
---|---|---|---|---|
Baseline | -- | -- | -- | 0.39 |
CapsNet-v1 | 1 | no | 0.39 (0.024) | 0.34 (0.032) |
CapsNet-v2 | 1 | yes | 0.36 (0.009) | 0.29 (0.011) |
CapsNet-v3 | 3 | no | 0.40 (0.016) | 0.35 (0.036) |
CapsNet-v4 | 3 | yes | 0.34 (0.016) | 0.25 (0.005) |
Losses and accuracies:
Training Speed
- About 100s / epoch on a single GTX 1070 GPU.
- About 80s / epoch on a single GTX 1080Ti GPU.
- About 55s / epoch on two GTX 1080Ti GPUs, using capsulenet-multi-gpu.py.
Reconstruction result
The result of CapsNet-v4, obtained by launching:
python capsulenet.py -t -w result/trained_model.h5
Digits in the top 5 rows are real images from MNIST; digits at the bottom are the corresponding reconstructed images.
Manipulate latent code
python capsulenet.py -t --digit 5 -w result/trained_model.h5
For each digit, the ith row corresponds to the ith dimension of the capsule, and columns from left to right correspond to adding [-0.25, -0.2, -0.15, -0.1, -0.05, 0, 0.05, 0.1, 0.15, 0.2, 0.25] to the value of that dimension.
As we can see, each dimension has captured some characteristic of a digit. The same dimension of different digit capsules may represent different characteristics, because different digits are reconstructed from different feature vectors (digit capsules), which are mutually independent during reconstruction.
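A hedged sketch of the perturbation loop behind this figure, assuming a manipulate_model that maps (image, one-hot label, noise) to a reconstruction, as in the repository's capsulenet.py:
import numpy as np

# Perturb one dimension of the 16-D digit capsule at a time and decode.
noise = np.zeros([1, 10, 16])
for dim in range(16):
    for r in np.linspace(-0.25, 0.25, 11):
        tmp = np.copy(noise)
        tmp[0, :, dim] = r
        x_recon = manipulate_model.predict([x, y, tmp])  # x, y: one test image and its one-hot label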
Other Implementations
- PyTorch:
- TensorFlow:
  - naturomics/CapsNet-Tensorflow (I referred to some functions in this repository.)
  - InnerPeace-Wu/CapsNet-tensorflow
  - chrislybaer/capsules-tensorflow
- MXNet:
- Chainer:
- Matlab: