d2l-pytorch
This project reproduces the book Dive into Deep Learning (https://d2l.ai/), adapting the code from MXNet to PyTorch.
Top Related Projects
- transformers: 🤗 State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
- pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- tensorflow: An Open Source Machine Learning Framework for Everyone
- keras: Deep Learning for humans
- scikit-learn: machine learning in Python
- DeepSpeed: a deep learning optimization library that makes distributed training and inference easy, efficient, and effective
Quick Overview
The dsgiitr/d2l-pytorch repository is a PyTorch implementation of the "Dive into Deep Learning" (D2L) book. It provides a comprehensive collection of Jupyter notebooks that cover various deep learning concepts and techniques, with code examples and explanations in PyTorch.
Pros
- Comprehensive coverage of deep learning topics
- Hands-on approach with interactive Jupyter notebooks
- Well-structured content following the D2L book
- Includes both from-scratch and concise, high-level-API implementations of key models
Cons
- May require some prior knowledge of Python and machine learning
- Some advanced topics might be challenging for beginners
- Occasional discrepancies between the book content and the PyTorch implementation
- Limited community support compared to the official MXNet version
Code Examples
- Loading and preprocessing data:
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
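As a quick sanity check (a minimal sketch, not taken from the repo), one batch can be pulled from the loader to confirm the expected shapes:

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([64, 1, 28, 28])
print(labels.shape)  # torch.Size([64])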
- Defining a simple neural network:
import torch.nn as nn
class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        self.fc1 = nn.Linear(784, 128)  # 28x28 images flattened to 784 features
        self.fc2 = nn.Linear(128, 10)   # 10 output classes
        self.relu = nn.ReLU()

    def forward(self, x):
        x = x.view(-1, 784)  # flatten each image in the batch
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x
- Training loop:
def train(model, train_loader, criterion, optimizer, device):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
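A minimal sketch of how these pieces fit together; the device selection, loss, optimizer, and epoch count below are illustrative assumptions rather than code from the repo:

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = SimpleNet().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # assumed learning rate

for epoch in range(5):  # assumed number of epochs
    train(model, train_loader, criterion, optimizer, device)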
Getting Started
To get started with the dsgiitr/d2l-pytorch repository:
1. Clone the repository:

   git clone https://github.com/dsgiitr/d2l-pytorch.git
   cd d2l-pytorch

2. Install the required dependencies:

   pip install -r requirements.txt

3. Launch Jupyter Notebook:

   jupyter notebook

4. Navigate to the desired chapter and open the corresponding notebook to start learning and experimenting with PyTorch implementations of deep learning concepts.
Competitor Comparisons
🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Pros of transformers
- Extensive library of pre-trained models for various NLP tasks
- Active community and frequent updates
- Seamless integration with popular deep learning frameworks
Cons of transformers
- Steeper learning curve for beginners
- Larger codebase and potentially higher computational requirements
- Less focus on educational content and explanations
Code comparison
transformers:
from transformers import BertTokenizer, BertModel
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased')
inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
outputs = model(**inputs)
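The returned outputs object exposes, among other fields, outputs.last_hidden_state, the final-layer hidden state for each input token.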
d2l-pytorch:
import torch
from d2l import torch as d2l

# X, y, and loss are assumed to be defined elsewhere
# (feature/label tensors and a loss function such as nn.MSELoss()).
net = d2l.LinearRegression(2)
batch_size, lr, num_epochs = 10, 0.03, 3
train_iter = d2l.load_array((X, y), batch_size)
d2l.train_ch3(net, train_iter, loss, num_epochs, lr)
The transformers code demonstrates loading a pre-trained BERT model, while d2l-pytorch focuses on implementing and training a simple linear regression model from scratch.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of pytorch
- Official PyTorch repository with comprehensive documentation and support
- Larger community and more frequent updates
- Broader scope, covering the entire PyTorch ecosystem
Cons of pytorch
- Steeper learning curve for beginners
- Less focused on educational content and examples
- May be overwhelming for those specifically interested in deep learning
Code comparison
d2l-pytorch:
def train_epoch(net, train_iter, loss, updater):
    # Accumulator and accuracy are helper utilities from the d2l library.
    metric = Accumulator(3)  # running sums: loss, correct predictions, examples
    for X, y in train_iter:
        y_hat = net(X)
        l = loss(y_hat, y)
        updater.zero_grad()
        l.backward()
        updater.step()
        metric.add(float(l) * len(y), accuracy(y_hat, y), len(y))
    return metric[0] / metric[2], metric[1] / metric[2]  # avg loss, accuracy
pytorch:
def train(model, train_loader, optimizer, criterion):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
The d2l-pytorch code is more educational and includes metrics, while the pytorch example is more concise and focused on the core training loop.
An Open Source Machine Learning Framework for Everyone
Pros of tensorflow
- Larger community and ecosystem with more resources and tools
- Better support for production deployment and serving models
- More comprehensive documentation and official tutorials
Cons of tensorflow
- Steeper learning curve, especially for beginners
- Less Pythonic syntax compared to PyTorch
- Slower development cycle and less flexibility for research
Code comparison
d2l-pytorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.cat((x, y))
tensorflow:
import tensorflow as tf
x = tf.constant([1, 2, 3])
y = tf.constant([4, 5, 6])
z = tf.concat([x, y], axis=0)
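Both snippets produce the same length-6 result, [1, 2, 3, 4, 5, 6]; the main syntactic difference is that tf.concat requires an explicit axis argument.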
Summary
tensorflow is a more established framework with better production support, while d2l-pytorch offers a more beginner-friendly approach to deep learning with PyTorch. The code syntax differs slightly, with PyTorch generally being more intuitive for Python developers. tensorflow may be preferred for large-scale projects and deployment, while d2l-pytorch could be better suited for research and experimentation.
Deep Learning for humans
Pros of Keras
- More mature and widely adopted framework with extensive documentation
- Supports multiple backend engines (TensorFlow, Theano, CNTK)
- Offers a higher-level API for faster prototyping and experimentation
Cons of Keras
- Less flexible for low-level operations compared to PyTorch
- May have slightly slower performance in some scenarios
- Limited support for dynamic computational graphs
Code Comparison
Keras:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
d2l-pytorch:
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(784, 64),
            nn.ReLU(),
            nn.Linear(64, 10)
        )

    def forward(self, x):  # the forward pass was missing; added for a runnable module
        x = self.flatten(x)
        return self.linear_relu_stack(x)
The Keras example demonstrates its simplicity and high-level API, while the d2l-pytorch code showcases PyTorch's more explicit approach to defining neural network architectures.
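As a quick check of the PyTorch module above (a sketch; the batch size and input shape are assumptions), a dummy batch can be passed through it:

import torch

x = torch.randn(32, 1, 28, 28)  # dummy batch of 32 single-channel 28x28 images
logits = MLP()(x)                # flattened to (32, 784) inside forward
print(logits.shape)              # torch.Size([32, 10])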
scikit-learn: machine learning in Python
Pros of scikit-learn
- Comprehensive machine learning library with a wide range of algorithms and tools
- Well-established, mature project with extensive documentation and community support
- Designed for general-purpose machine learning tasks across various domains
Cons of scikit-learn
- Limited support for deep learning and neural networks
- Not optimized for GPU acceleration or distributed computing
- Less focus on cutting-edge research and state-of-the-art models
Code Comparison
scikit-learn:
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
X, y = make_classification(n_samples=1000, n_features=4)
clf = RandomForestClassifier()
clf.fit(X, y)
d2l-pytorch:
import torch
from torch import nn
net = nn.Sequential(nn.Linear(4, 10), nn.ReLU(), nn.Linear(10, 1))
loss = nn.BCEWithLogitsLoss()
trainer = torch.optim.SGD(net.parameters(), lr=0.1)
The code examples highlight the difference in focus between the two repositories. scikit-learn provides a high-level API for traditional machine learning algorithms, while d2l-pytorch offers a more flexible approach for deep learning using PyTorch.
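For completeness, a minimal sketch of a training loop built from the pieces above; the synthetic data and epoch count are assumptions for illustration:

import torch
from torch import nn

X = torch.randn(1000, 4)                       # synthetic features
y = (X.sum(dim=1, keepdim=True) > 0).float()   # synthetic binary labels, shape (1000, 1)

net = nn.Sequential(nn.Linear(4, 10), nn.ReLU(), nn.Linear(10, 1))
loss = nn.BCEWithLogitsLoss()
trainer = torch.optim.SGD(net.parameters(), lr=0.1)

for epoch in range(10):  # assumed epoch count
    trainer.zero_grad()
    l = loss(net(X), y)
    l.backward()
    trainer.step()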
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Pros of DeepSpeed
- Highly optimized for large-scale distributed training
- Supports advanced techniques like ZeRO optimizer and pipeline parallelism
- Integrates seamlessly with popular frameworks like PyTorch and Hugging Face
Cons of DeepSpeed
- Steeper learning curve for beginners
- Primarily focused on performance optimization rather than educational content
- May be overkill for smaller projects or single-GPU training
Code Comparison
DeepSpeed:
import deepspeed
model_engine, optimizer, _, _ = deepspeed.initialize(args=args,
                                                     model=model,
                                                     model_parameters=params)

for step, batch in enumerate(data_loader):
    loss = model_engine(batch)
    model_engine.backward(loss)
    model_engine.step()
d2l-pytorch:
from d2l import torch as d2l
trainer = d2l.Trainer(max_epochs=10, num_gpus=1)
model = MyModel()  # MyModel is a placeholder for a d2l-style model class
trainer.fit(model, train_iter, test_iter)
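Note that deepspeed.initialize also expects a DeepSpeed configuration (typically a JSON config supplied through the args namespace or a config parameter), which the snippet above omits.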
Summary
DeepSpeed excels in large-scale distributed training scenarios, offering advanced optimization techniques. However, it may be more complex for beginners. d2l-pytorch, on the other hand, focuses on educational content and ease of use, making it more suitable for learning and smaller projects. The code comparison highlights DeepSpeed's explicit optimization setup versus d2l-pytorch's more abstracted training approach.
README
UPDATE: Please see the original repo for the complete PyTorch port. We no longer maintain this repo.
This project is adapted from the original Dive into Deep Learning book by Aston Zhang, Zachary C. Lipton, Mu Li, Alex J. Smola, and all the community contributors. GitHub of the original book: https://github.com/d2l-ai/d2l-en. We have made an effort to modify the book and convert the MXNet code snippets into PyTorch.
Note: Some .ipynb notebooks may not render perfectly on GitHub. We suggest cloning the repo or using nbviewer to view the notebooks.
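For example, the repo's notebooks can be browsed at a URL of the form https://nbviewer.org/github/dsgiitr/d2l-pytorch/ (the URL pattern assumed here follows nbviewer's usual GitHub scheme).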
Chapters
- Ch02 Installation
- Ch03 Introduction
- Ch04 The Preliminaries: A Crashcourse
- Ch05 Linear Neural Networks
- Ch06 Multilayer Perceptrons
  - 6.1 Multilayer Perceptron
  - 6.2 Implementation of Multilayer Perceptron from Scratch
  - 6.3 Concise Implementation of Multilayer Perceptron
  - 6.4 Model Selection Underfitting and Overfitting
  - 6.5 Weight Decay
  - 6.6 Dropout
  - 6.7 Forward Propagation Backward Propagation and Computational Graphs
  - 6.8 Numerical Stability and Initialization
  - 6.9 Considering the Environment
  - 6.10 Predicting House Prices on Kaggle
- Ch07 Deep Learning Computation
  - 7.1 Layers and Blocks
  - 7.2 Parameter Management
  - 7.3 Deferred Initialization
  - 7.4 Custom Layers
  - 7.5 File I/O
  - 7.6 GPUs
- Ch08 Convolutional Neural Networks
- Ch09 Modern Convolutional Networks
- Ch10 Recurrent Neural Networks
  - 10.1 Sequence Models
  - 10.2 Language Models
  - 10.3 Recurrent Neural Networks
  - 10.4 Text Preprocessing
  - 10.5 Implementation of Recurrent Neural Networks from Scratch
  - 10.6 Concise Implementation of Recurrent Neural Networks
  - 10.7 Backpropagation Through Time
  - 10.8 Gated Recurrent Units (GRU)
  - 10.9 Long Short Term Memory (LSTM)
  - 10.10 Deep Recurrent Neural Networks
  - 10.11 Bidirectional Recurrent Neural Networks
  - 10.12 Machine Translation and DataSets
  - 10.13 Encoder-Decoder Architecture
  - 10.14 Sequence to Sequence
  - 10.15 Beam Search
- Ch11 Attention Mechanism
  - 11.1 Attention Mechanism
  - 11.2 Sequence to Sequence with Attention Mechanism
  - 11.3 Transformer
- Ch12 Optimization Algorithms
  - 12.1 Optimization and Deep Learning
  - 12.2 Convexity
  - 12.3 Gradient Descent
  - 12.4 Stochastic Gradient Descent
  - 12.5 Mini-batch Stochastic Gradient Descent
  - 12.6 Momentum
  - 12.7 Adagrad
  - 12.8 RMSProp
  - 12.9 Adadelta
  - 12.10 Adam
- Ch14 Computer Vision
  - 14.1 Image Augmentation
  - 14.2 Fine Tuning
  - 14.3 Object Detection and Bounding Boxes
  - 14.4 Anchor Boxes
  - 14.5 Multiscale Object Detection
  - 14.6 Object Detection Data Set (Pikachu)
  - 14.7 Single Shot Multibox Detection (SSD)
  - 14.8 Region-based CNNs (R-CNNs)
  - 14.9 Semantic Segmentation and Data Sets
  - 14.10 Transposed Convolution
  - 14.11 Fully Convolutional Networks (FCN)
  - 14.12 Neural Style Transfer
  - 14.13 Image Classification (CIFAR-10) on Kaggle
  - 14.14 Dog Breed Identification (ImageNet Dogs) on Kaggle
Contributing
- Please feel free to open a Pull Request to contribute a notebook in PyTorch for the remaining chapters. Before starting work on a notebook, open an issue with the notebook's name so that we can assign it to you (if no one has been assigned already).
- Strictly follow the naming conventions for the IPython Notebooks and the subsections.
- If you think any section needs more or better explanation, use the issue tracker to open an issue and let us know. We'll get back to you as soon as possible.
- Find some code that needs improvement and submit a pull request.
- Find a reference that we missed and submit a pull request.
- Try not to submit huge pull requests, since they are hard to review and incorporate; several smaller ones are better.
Support
If you like this repo and find it useful, please consider starring it (⭐) so that it can reach a broader audience.
References
[1] Dive into Deep Learning, the original book (GitHub repo: https://github.com/d2l-ai/d2l-en)
[2] Deep Learning - The Straight Dope
[3] PyTorch - MXNet Cheatsheet
Cite
If you use this work or code for your research, please cite the original book with the following BibTeX entry:
@book{zhang2020dive,
    title={Dive into Deep Learning},
    author={Aston Zhang and Zachary C. Lipton and Mu Li and Alexander J. Smola},
    note={\url{https://d2l.ai}},
    year={2020}
}