deep-learning-roadmap
:satellite: All You Need to Know About Deep Learning - A kick-starter
Top Related Projects
- tensorflow: An Open Source Machine Learning Framework for Everyone
- keras: Deep Learning for humans
- pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- scikit-learn: machine learning in Python
- ML-For-Beginners: 12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
- jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Quick Overview
The "deep-learning-roadmap" repository by instillai is a comprehensive guide for learning deep learning. It provides a structured path for beginners and intermediate learners to understand and master various concepts in deep learning, including neural networks, computer vision, and natural language processing.
Pros
- Offers a well-organized learning path for deep learning enthusiasts
- Includes a wide range of topics, from basics to advanced concepts
- Provides links to high-quality resources, including papers, tutorials, and courses
- Regularly updated with new content and resources
Cons
- May be overwhelming for absolute beginners due to the vast amount of information
- Lacks hands-on coding examples or projects
- Some linked resources may become outdated over time
- Primarily focuses on theoretical concepts rather than practical implementation
As this is not a code library, we'll skip the code examples and getting-started sections.
Competitor Comparisons
An Open Source Machine Learning Framework for Everyone
Pros of tensorflow
- Comprehensive deep learning framework with extensive functionality
- Large community support and extensive documentation
- Backed by Google, ensuring ongoing development and updates
Cons of tensorflow
- Steeper learning curve for beginners
- More complex setup and configuration
- Larger codebase, which can be overwhelming for simple projects
Code comparison
deep-learning-roadmap:
# Deep Learning Roadmap
1. Introduction to Deep Learning
2. Neural Networks Basics
3. Convolutional Neural Networks (CNNs)
4. Recurrent Neural Networks (RNNs)
...
tensorflow:
import tensorflow as tf
# Create a simple neural network
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(784,)),
    tf.keras.layers.Dense(10, activation='softmax')
])
The deep-learning-roadmap repository provides a structured learning path for deep learning concepts, while tensorflow offers a powerful framework for implementing deep learning models. deep-learning-roadmap is more suitable for beginners looking to understand the fundamentals, whereas tensorflow is ideal for practitioners who want to build and deploy complex models. The code comparison shows the difference in focus: deep-learning-roadmap presents a markdown outline of topics, while tensorflow demonstrates actual model implementation.
Deep Learning for humans
Pros of Keras
- Comprehensive deep learning library with extensive documentation and examples
- Actively maintained by a large community and backed by Google
- Provides high-level APIs for easy model building and training
Cons of Keras
- Focuses on implementation rather than learning deep learning concepts
- May not cover all cutting-edge techniques or research areas
- Requires prior knowledge of deep learning fundamentals
Code Comparison
deep-learning-roadmap:
# Deep Learning Roadmap
1. Linear Algebra
2. Probability and Statistics
3. Machine Learning Basics
4. Neural Networks
5. Convolutional Neural Networks
Keras:
from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(64, activation='relu', input_shape=(784,)),
    Dense(10, activation='softmax')
])
The deep-learning-roadmap repository provides a structured learning path for deep learning concepts, while Keras offers a practical implementation framework. deep-learning-roadmap is better suited for beginners looking to understand the theoretical foundations, whereas Keras is ideal for those ready to build and deploy models. The code comparison illustrates this difference, with deep-learning-roadmap presenting a conceptual outline and Keras demonstrating actual model creation.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of pytorch
- Comprehensive deep learning framework with extensive functionality
- Large, active community providing support and contributions
- Well-documented with extensive tutorials and examples
Cons of pytorch
- Steeper learning curve for beginners
- Larger codebase, potentially overwhelming for those new to deep learning
- Focuses on implementation rather than providing a learning roadmap
Code comparison
deep-learning-roadmap:
## Deep Learning Roadmap
1. Linear Algebra
2. Probability and Statistics
3. Python Programming
4. Machine Learning Basics
...
pytorch:
import torch
# Define a simple neural network
class Net(torch.nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc = torch.nn.Linear(10, 5)

    def forward(self, x):
        return self.fc(x)
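A minimal usage sketch for the Net module above (the batch size and tensor shapes are illustrative only):

net = Net()
x = torch.randn(3, 10)   # batch of 3 samples with 10 features each
out = net(x)             # forward pass produces a tensor of shape (3, 5)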
Summary
deep-learning-roadmap is a curated list of resources for learning deep learning, providing a structured path for beginners. It offers a clear roadmap but lacks hands-on implementation.
pytorch is a powerful deep learning framework used for building and training neural networks. It provides comprehensive tools for implementation but may be overwhelming for beginners seeking a learning path.
The choice between these repositories depends on the user's goals: learning concepts (deep-learning-roadmap) or practical implementation (pytorch).
scikit-learn: machine learning in Python
Pros of scikit-learn
- Comprehensive machine learning library with a wide range of algorithms and tools
- Well-documented and actively maintained by a large community
- Seamless integration with other scientific Python libraries (NumPy, SciPy, Pandas)
Cons of scikit-learn
- Focuses primarily on traditional machine learning, with limited deep learning capabilities
- May not be as suitable for large-scale, complex deep learning projects
- Steeper learning curve for beginners compared to deep-learning-roadmap's educational approach
Code Comparison
deep-learning-roadmap:
# No specific code examples provided, as it's primarily an educational resource
scikit-learn:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
clf = RandomForestClassifier()
clf.fit(X_train, y_train)
Summary
scikit-learn is a robust machine learning library offering a wide range of algorithms and tools, while deep-learning-roadmap serves as an educational resource for deep learning concepts. scikit-learn is better suited for traditional machine learning tasks and integration with other scientific Python libraries, whereas deep-learning-roadmap provides a structured learning path for those specifically interested in deep learning.
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
Pros of ML-For-Beginners
- Comprehensive curriculum covering various ML topics
- Includes hands-on projects and quizzes for practical learning
- Well-structured lessons with clear learning objectives
Cons of ML-For-Beginners
- Focuses on general ML concepts, less emphasis on deep learning
- May not cover advanced topics in as much depth
- Primarily uses Python, limiting exposure to other languages
Code Comparison
ML-For-Beginners:
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
deep-learning-roadmap:
# imports assumed; the original snippet omitted them
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([
    Dense(64, activation='relu', input_shape=(input_dim,)),
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
The ML-For-Beginners repository provides a structured approach to learning machine learning concepts, suitable for beginners. It offers a broad overview of ML topics with practical exercises. On the other hand, the deep-learning-roadmap repository focuses more specifically on deep learning techniques and may be better suited for those looking to specialize in this area. The code examples reflect this difference, with ML-For-Beginners using scikit-learn for traditional ML algorithms, while deep-learning-roadmap demonstrates neural network construction using frameworks like TensorFlow or Keras.
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Pros of jax
- Focuses on high-performance machine learning research
- Provides a powerful, flexible framework for numerical computing
- Supports automatic differentiation and GPU/TPU acceleration
Cons of jax
- Steeper learning curve for beginners
- Less comprehensive for general deep learning concepts
- Requires more advanced programming skills
Code comparison
deep-learning-roadmap:
# No specific code examples provided
# Focuses on educational resources and learning paths
jax:
import jax.numpy as jnp
from jax import grad, jit, vmap
def predict(params, inputs):
    return jnp.dot(inputs, params)

@jit
def loss(params, inputs, targets):
    preds = predict(params, inputs)
    return jnp.mean((preds - targets)**2)
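For illustration, gradients of the loss above come from a single jax transformation; a hypothetical call with toy shapes might look like:

params = jnp.zeros(3)
inputs = jnp.ones((4, 3))
targets = jnp.zeros(4)
grads = grad(loss)(params, inputs, targets)  # gradient of the loss with respect to params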
Summary
deep-learning-roadmap is an educational resource providing a structured learning path for deep learning concepts, suitable for beginners and intermediate learners. jax, on the other hand, is a high-performance numerical computing library focused on advanced machine learning research and applications. While deep-learning-roadmap offers a broader overview of deep learning topics, jax provides powerful tools for implementing and optimizing complex machine learning models.
README
###################################################
Deep Learning - All You Need to Know
###################################################
.. image:: https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat
   :target: https://github.com/osforscience/deep-learning-all-you-need/pulls

.. image:: https://badges.frapsoft.com/os/v2/open-source.png?v=103
   :target: https://github.com/ellerbrock/open-source-badge/

.. image:: https://img.shields.io/pypi/l/ansicolortags.svg
   :target: https://github.com/osforscience/deep-learning-all-you-need/blob/master/LICENSE

.. image:: https://img.shields.io/twitter/follow/machinemindset.svg?label=Follow&style=social
   :target: https://twitter.com/machinemindset
##########################################################################
Sponsorship
##########################################################################

To support the maintenance and upgrading of this project, please kindly consider `sponsoring the project developer <https://github.com/sponsors/astorfi/dashboard>`_.

Any level of support is a greatly appreciated contribution. :heart:
###################################################
Download Free Python Machine Learning Book
###################################################

###################################################
Slack Group
###################################################
##################
Table of Contents
##################

.. contents::
  :local:
  :depth: 4
.. image:: _img/mainpage/logo.gif
############
Introduction
############

The purpose of this project is to give developers and researchers a shortcut to finding useful resources about deep learning.
============
Motivation
============

There are different motivations for this open source project.
What's the point of this open source project?
----------------------------------------------

There are other repositories similar to this one that are very comprehensive and useful, and honestly they made us wonder whether this repository is necessary at all!

The point of this repository is that the resources are targeted. The resources are organized so that users can easily find what they are looking for. We divided the resources into a large number of categories, which may feel overwhelming at first; however, once you know what you are looking for, it becomes very easy to find the most relevant material. And if you do not yet know what to look for, the general resources are a good place to start.
############
Papers
############

.. image:: _img/mainpage/article.jpeg

This chapter collects published papers in deep learning.
====================
Models
====================

Convolutional Networks
----------------------

.. image:: _img/mainpage/convolutional.png

- Imagenet classification with deep convolutional neural networks: [`Paper <http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks>`_] [`Code <https://github.com/dontfollowmeimcrazy/imagenet>`_]

  .. image:: _img/mainpage/star_5.png

- Convolutional Neural Networks for Sentence Classification: [`Paper <https://arxiv.org/abs/1408.5882>`_] [`Code <https://github.com/yoonkim/CNN_sentence>`_]

  .. image:: _img/mainpage/star_4.png

- Large-scale Video Classification with Convolutional Neural Networks: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2014/html/Karpathy_Large-scale_Video_Classification_2014_CVPR_paper.html>`_] [`Project Page <https://cs.stanford.edu/people/karpathy/deepvideo/>`_]

  .. image:: _img/mainpage/star_4.png

- Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html>`_]

  .. image:: _img/mainpage/star_5.png

- Deep convolutional neural networks for LVCSR: [`Paper <https://ieeexplore.ieee.org/abstract/document/6639347/&hl=zh-CN&sa=T&oi=gsb&ct=res&cd=0&ei=KknXWYbGFMbFjwSsyICADQ&scisig=AAGBfm2F0Zlu0ciUwadzshNNm80IQQhuhA>`_]

  .. image:: _img/mainpage/star_3.png

- Face recognition: a convolutional neural-network approach: [`Paper <https://ieeexplore.ieee.org/abstract/document/554195/>`_]

  .. image:: _img/mainpage/star_5.png
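As a rough illustration of the model family these papers study (not code taken from any of them), a minimal convolutional classifier in Keras might look like the sketch below; the layer sizes are placeholders:

.. code-block:: python

    import tensorflow as tf

    # Minimal CNN for 28x28 grayscale images; sizes are illustrative only
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])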
Recurrent Networks
------------------

.. image:: _img/mainpage/Recurrent_neural_network_unfold.svg

- An empirical exploration of recurrent network architectures: [`Paper <http://proceedings.mlr.press/v37/jozefowicz15.pdf?utm_campaign=Revue%20newsletter&utm_medium=Newsletter&utm_source=revue>`_] [`Code <https://github.com/debajyotidatta/RecurrentArchitectures>`_]

  .. image:: _img/mainpage/star_4.png

- LSTM: A search space odyssey: [`Paper <https://ieeexplore.ieee.org/abstract/document/7508408/>`_] [`Code <https://github.com/fomorians/lstm-odyssey>`_]

  .. image:: _img/mainpage/star_3.png

- On the difficulty of training recurrent neural networks: [`Paper <http://proceedings.mlr.press/v28/pascanu13.pdf>`_] [`Code <https://github.com/pascanur/trainingRNNs>`_]

  .. image:: _img/mainpage/star_5.png

- Learning to forget: Continual prediction with LSTM: [`Paper <http://digital-library.theiet.org/content/conferences/10.1049/cp_19991218>`_]

  .. image:: _img/mainpage/star_5.png
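For orientation only (again, not code from the papers above), a recurrent layer can be instantiated in a few lines; the sizes below are arbitrary:

.. code-block:: python

    import torch

    # An LSTM over batches of sequences with 10-dimensional inputs
    lstm = torch.nn.LSTM(input_size=10, hidden_size=20, batch_first=True)
    x = torch.randn(4, 7, 10)      # 4 sequences of length 7
    output, (h_n, c_n) = lstm(x)   # output has shape (4, 7, 20)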
Autoencoders
------------

.. image:: _img/mainpage/Autoencoder_structure.png

- Extracting and composing robust features with denoising autoencoders: [`Paper <https://dl.acm.org/citation.cfm?id=1390294>`_]

  .. image:: _img/mainpage/star_5.png

- Stacked Denoising Autoencoders: Learning Useful Representations in a Deep Network with a Local Denoising Criterion: [`Paper <http://www.jmlr.org/papers/v11/vincent10a.html>`_] [`Code <https://github.com/rajarsheem/libsdae-autoencoder-tensorflow>`_]

  .. image:: _img/mainpage/star_5.png

- Adversarial Autoencoders: [`Paper <https://arxiv.org/abs/1511.05644>`_] [`Code <https://github.com/conan7882/adversarial-autoencoders>`_]

  .. image:: _img/mainpage/star_3.png

- Autoencoders, Unsupervised Learning, and Deep Architectures: [`Paper <http://proceedings.mlr.press/v27/baldi12a/baldi12a.pdf>`_]

  .. image:: _img/mainpage/star_4.png

- Reducing the Dimensionality of Data with Neural Networks: [`Paper <http://science.sciencemag.org/content/313/5786/504>`_] [`Code <https://github.com/jordn/autoencoder>`_]

  .. image:: _img/mainpage/star_5.png
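As a minimal sketch of the idea behind these papers (dimensions assumed for illustration), an autoencoder is just an encoder and a decoder trained to reconstruct their input:

.. code-block:: python

    import torch

    # Fully connected autoencoder for flattened 28x28 images
    encoder = torch.nn.Sequential(torch.nn.Linear(784, 64), torch.nn.ReLU())
    decoder = torch.nn.Sequential(torch.nn.Linear(64, 784), torch.nn.Sigmoid())

    x = torch.rand(16, 784)                # a batch of flattened images
    reconstruction = decoder(encoder(x))   # trained by minimising reconstruction error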
Generative Models
-----------------

.. image:: _img/mainpage/generative.png

- Exploiting generative models in discriminative classifiers: [`Paper <http://papers.nips.cc/paper/1520-exploiting-generative-models-in-discriminative-classifiers.pdf>`_]

  .. image:: _img/mainpage/star_4.png

- Semi-supervised Learning with Deep Generative Models: [`Paper <http://papers.nips.cc/paper/5352-semi-supervised-learning-with-deep-generative-models>`_] [`Code <https://github.com/wohlert/semi-supervised-pytorch>`_]

  .. image:: _img/mainpage/star_4.png

- Generative Adversarial Nets: [`Paper <http://papers.nips.cc/paper/5423-generative-adversarial-nets>`_] [`Code <https://github.com/goodfeli/adversarial>`_]

  .. image:: _img/mainpage/star_5.png

- Generalized Denoising Auto-Encoders as Generative Models: [`Paper <http://papers.nips.cc/paper/5023-generalized-denoising-auto-encoders-as-generative-models>`_]

  .. image:: _img/mainpage/star_5.png

- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks: [`Paper <https://arxiv.org/abs/1511.06434>`_] [`Code <https://github.com/carpedm20/DCGAN-tensorflow>`_]

  .. image:: _img/mainpage/star_5.png
Probabilistic Models
--------------------

- Stochastic Backpropagation and Approximate Inference in Deep Generative Models: [`Paper <https://arxiv.org/abs/1401.4082>`_]

  .. image:: _img/mainpage/star_4.png

- Probabilistic models of cognition: exploring representations and inductive biases: [`Paper <https://www.sciencedirect.com/science/article/pii/S1364661310001129>`_]

  .. image:: _img/mainpage/star_5.png

- On deep generative models with applications to recognition: [`Paper <https://ieeexplore.ieee.org/abstract/document/5995710/>`_]

  .. image:: _img/mainpage/star_5.png
====================
Core
====================

Optimization
------------

- Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift: [`Paper <https://arxiv.org/abs/1502.03167>`_]

  .. image:: _img/mainpage/star_5.png

- Dropout: A Simple Way to Prevent Neural Networks from Overfitting: [`Paper <http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf?utm_content=buffer79b43&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer>`_]

  .. image:: _img/mainpage/star_5.png

- Training Very Deep Networks: [`Paper <http://papers.nips.cc/paper/5850-training-very-deep-networks>`_]

  .. image:: _img/mainpage/star_4.png

- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification: [`Paper <https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/He_Delving_Deep_into_ICCV_2015_paper.pdf>`_]

  .. image:: _img/mainpage/star_5.png

- Large Scale Distributed Deep Networks: [`Paper <http://papers.nips.cc/paper/4687-large-scale-distributed-deep-networks>`_]

  .. image:: _img/mainpage/star_5.png
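Two of the techniques listed above, batch normalization and dropout, are available as ready-made layers in most frameworks; a small Keras sketch (layer sizes are placeholders):

.. code-block:: python

    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, input_shape=(784,)),
        tf.keras.layers.BatchNormalization(),   # Ioffe & Szegedy, 2015
        tf.keras.layers.Activation('relu'),
        tf.keras.layers.Dropout(0.5),           # Srivastava et al., 2014
        tf.keras.layers.Dense(10, activation='softmax'),
    ])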
Representation Learning
-----------------------

- Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks: [`Paper <https://arxiv.org/abs/1511.06434>`_] [`Code <https://github.com/Newmu/dcgan_code>`_]

  .. image:: _img/mainpage/star_5.png

- Representation Learning: A Review and New Perspectives: [`Paper <https://ieeexplore.ieee.org/abstract/document/6472238/>`_]

  .. image:: _img/mainpage/star_4.png

- InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets: [`Paper <http://papers.nips.cc/paper/6399-infogan-interpretable-representation>`_] [`Code <https://github.com/openai/InfoGAN>`_]

  .. image:: _img/mainpage/star_3.png
Understanding and Transfer Learning
-----------------------------------

- Learning and Transferring Mid-Level Image Representations using Convolutional Neural Networks: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2014/html/Oquab_Learning_and_Transferring_2014_CVPR_paper.html>`_]

  .. image:: _img/mainpage/star_5.png

- Distilling the Knowledge in a Neural Network: [`Paper <https://arxiv.org/abs/1503.02531>`_]

  .. image:: _img/mainpage/star_4.png

- DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition: [`Paper <http://proceedings.mlr.press/v32/donahue14.pdf>`_]

  .. image:: _img/mainpage/star_5.png

- How transferable are features in deep neural networks?: [`Paper <http://papers.nips.cc/paper/5347-how-transferable-are-features-in-deep-n%E2%80%A6>`_] [`Code <https://github.com/yosinski/convnet_transfer>`_]

  .. image:: _img/mainpage/star_5.png
Reinforcement Learning
----------------------

- Human-level control through deep reinforcement learning: [`Paper <https://www.nature.com/articles/nature14236/>`_] [`Code <https://github.com/devsisters/DQN-tensorflow>`_]

  .. image:: _img/mainpage/star_5.png

- Playing Atari with Deep Reinforcement Learning: [`Paper <https://arxiv.org/abs/1312.5602>`_] [`Code <https://github.com/carpedm20/deep-rl-tensorflow>`_]

  .. image:: _img/mainpage/star_3.png

- Continuous control with deep reinforcement learning: [`Paper <https://arxiv.org/abs/1509.02971>`_] [`Code <https://github.com/stevenpjg/ddpg-aigym>`_]

  .. image:: _img/mainpage/star_4.png

- Deep Reinforcement Learning with Double Q-Learning: [`Paper <http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/download/12389/11847>`_] [`Code <https://github.com/carpedm20/deep-rl-tensorflow>`_]

  .. image:: _img/mainpage/star_3.png

- Dueling Network Architectures for Deep Reinforcement Learning: [`Paper <https://arxiv.org/abs/1511.06581>`_] [`Code <https://github.com/yoosan/deeprl>`_]

  .. image:: _img/mainpage/star_3.png
====================
Applications
====================

Image Recognition
-----------------

- Deep Residual Learning for Image Recognition: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/He_Deep_Residual_Learning_CVPR_2016_paper.html>`_] [`Code <https://github.com/gcr/torch-residual-networks>`_]

  .. image:: _img/mainpage/star_5.png

- Very Deep Convolutional Networks for Large-Scale Image Recognition: [`Paper <https://arxiv.org/abs/1409.1556>`_]

  .. image:: _img/mainpage/star_5.png

- Multi-column Deep Neural Networks for Image Classification: [`Paper <https://arxiv.org/abs/1202.2745>`_]

  .. image:: _img/mainpage/star_4.png

- DeepID3: Face Recognition with Very Deep Neural Networks: [`Paper <https://arxiv.org/abs/1502.00873>`_]

  .. image:: _img/mainpage/star_4.png

- Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps: [`Paper <https://arxiv.org/abs/1312.6034>`_] [`Code <https://github.com/artvandelay/Deep_Inside_Convolutional_Networks>`_]

  .. image:: _img/mainpage/star_3.png

- Deep Image: Scaling up Image Recognition: [`Paper <https://arxiv.org/vc/arxiv/papers/1501/1501.02876v1.pdf>`_]

  .. image:: _img/mainpage/star_4.png

- Long-Term Recurrent Convolutional Networks for Visual Recognition and Description: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.html>`_] [`Code <https://github.com/JaggerYoung/LRCN-for-Activity-Recognition>`_]

  .. image:: _img/mainpage/star_5.png

- 3D Convolutional Neural Networks for Cross Audio-Visual Matching Recognition: [`Paper <https://ieeexplore.ieee.org/document/8063416>`_] [`Code <https://github.com/astorfi/lip-reading-deeplearning>`_]

  .. image:: _img/mainpage/star_4.png
Object Recognition
------------------

- ImageNet Classification with Deep Convolutional Neural Networks: [`Paper <http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks>`_]

  .. image:: _img/mainpage/star_5.png

- Learning Deep Features for Scene Recognition using Places Database: [`Paper <http://papers.nips.cc/paper/5349-learning-deep-features>`_]

  .. image:: _img/mainpage/star_3.png

- Scalable Object Detection using Deep Neural Networks: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2014/html/Erhan_Scalable_Object_Detection_2014_CVPR_paper.html>`_]

  .. image:: _img/mainpage/star_4.png

- Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks: [`Paper <http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks>`_] [`Code <https://github.com/rbgirshick/py-faster-rcnn>`_]

  .. image:: _img/mainpage/star_4.png

- OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks: [`Paper <https://arxiv.org/abs/1312.6229>`_] [`Code <https://github.com/sermanet/OverFeat>`_]

  .. image:: _img/mainpage/star_5.png

- CNN Features Off-the-Shelf: An Astounding Baseline for Recognition: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_workshops_2014/W15/html/Razavian_CNN_Features_Off-the-Shelf_2014_CVPR_paper.html>`_]

  .. image:: _img/mainpage/star_3.png

- What is the best multi-stage architecture for object recognition?: [`Paper <https://ieeexplore.ieee.org/abstract/document/5459469/>`_]

  .. image:: _img/mainpage/star_2.png
Action Recognition
------------------

- Long-Term Recurrent Convolutional Networks for Visual Recognition and Description: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Donahue_Long-Term_Recurrent_Convolutional_2015_CVPR_paper.html>`_]

  .. image:: _img/mainpage/star_5.png

- Learning Spatiotemporal Features With 3D Convolutional Networks: [`Paper <https://www.cv-foundation.org/openaccess/content_iccv_2015/html/Tran_Learning_Spatiotemporal_Features_ICCV_2015_paper.html>`_] [`Code <https://github.com/DavideA/c3d-pytorch>`_]

  .. image:: _img/mainpage/star_5.png

- Describing Videos by Exploiting Temporal Structure: [`Paper <https://www.cv-foundation.org/openaccess/content_iccv_2015/html/Yao_Describing_Videos_by_ICCV_2015_paper.html>`_] [`Code <https://github.com/tsenghungchen/SA-tensorflow>`_]

  .. image:: _img/mainpage/star_3.png

- Convolutional Two-Stream Network Fusion for Video Action Recognition: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2016/html/Feichtenhofer_Convolutional_Two-Stream_Network_CVPR_2016_paper.html>`_] [`Code <https://github.com/feichtenhofer/twostreamfusion>`_]

  .. image:: _img/mainpage/star_4.png

- Temporal segment networks: Towards good practices for deep action recognition: [`Paper <https://link.springer.com/chapter/10.1007/978-3-319-46484-8_2>`_] [`Code <https://github.com/yjxiong/temporal-segment-networks>`_]

  .. image:: _img/mainpage/star_3.png
Caption Generation
------------------

- Show, Attend and Tell: Neural Image Caption Generation with Visual Attention: [`Paper <http://proceedings.mlr.press/v37/xuc15.pdf>`_] [`Code <https://github.com/yunjey/show-attend-and-tell>`_]

  .. image:: _img/mainpage/star_5.png

- Mind's Eye: A Recurrent Visual Representation for Image Caption Generation: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Chen_Minds_Eye_A_2015_CVPR_paper.html>`_]

  .. image:: _img/mainpage/star_2.png

- Generative Adversarial Text to Image Synthesis: [`Paper <http://proceedings.mlr.press/v48/reed16.pdf>`_] [`Code <https://github.com/zsdonghao/text-to-image>`_]

  .. image:: _img/mainpage/star_3.png

- Deep Visual-Semantic Alignments for Generating Image Descriptions: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Karpathy_Deep_Visual-Semantic_Alignments_2015_CVPR_paper.html>`_] [`Code <https://github.com/jonkuo/Deep-Learning-Image-Captioning>`_]

  .. image:: _img/mainpage/star_4.png

- Show and Tell: A Neural Image Caption Generator: [`Paper <https://www.cv-foundation.org/openaccess/content_cvpr_2015/html/Vinyals_Show_and_Tell_2015_CVPR_paper.html>`_] [`Code <https://github.com/DeepRNN/image_captioning>`_]

  .. image:: _img/mainpage/star_5.png
Natural Language Processing
---------------------------

- Distributed Representations of Words and Phrases and their Compositionality: [`Paper <http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf>`_] [`Code <https://code.google.com/archive/p/word2vec/>`_]

  .. image:: _img/mainpage/star_5.png

- Efficient Estimation of Word Representations in Vector Space: [`Paper <https://arxiv.org/pdf/1301.3781.pdf>`_] [`Code <https://code.google.com/archive/p/word2vec/>`_]

  .. image:: _img/mainpage/star_4.png

- Sequence to Sequence Learning with Neural Networks: [`Paper <https://arxiv.org/pdf/1409.3215.pdf>`_] [`Code <https://github.com/farizrahman4u/seq2seq>`_]

  .. image:: _img/mainpage/star_5.png

- Neural Machine Translation by Jointly Learning to Align and Translate: [`Paper <https://arxiv.org/pdf/1409.0473.pdf>`_] [`Code <https://github.com/tensorflow/nmt>`_]

  .. image:: _img/mainpage/star_4.png

- Get To The Point: Summarization with Pointer-Generator Networks: [`Paper <https://arxiv.org/abs/1704.04368>`_] [`Code <https://github.com/abisee/pointer-generator>`_]

  .. image:: _img/mainpage/star_3.png

- Attention Is All You Need: [`Paper <https://arxiv.org/abs/1706.03762>`_] [`Code <https://github.com/jadore801120/attention-is-all-you-need-pytorch>`_]

  .. image:: _img/mainpage/star_4.png

- Convolutional Neural Networks for Sentence Classification: [`Paper <https://arxiv.org/abs/1408.5882>`_] [`Code <https://github.com/yoonkim/CNN_sentence>`_]

  .. image:: _img/mainpage/star_4.png
Speech Technology
-----------------

- Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups: [`Paper <https://ieeexplore.ieee.org/abstract/document/6296526/>`_]

  .. image:: _img/mainpage/star_5.png

- Towards End-to-End Speech Recognition with Recurrent Neural Networks: [`Paper <http://proceedings.mlr.press/v32/graves14.pdf>`_]

  .. image:: _img/mainpage/star_3.png

- Speech recognition with deep recurrent neural networks: [`Paper <https://ieeexplore.ieee.org/abstract/document/6638947/>`_]

  .. image:: _img/mainpage/star_4.png

- Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition: [`Paper <https://arxiv.org/abs/1507.06947>`_]

  .. image:: _img/mainpage/star_3.png

- Deep Speech 2: End-to-End Speech Recognition in English and Mandarin: [`Paper <http://proceedings.mlr.press/v48/amodei16.html>`_] [`Code <https://github.com/PaddlePaddle/DeepSpeech>`_]

  .. image:: _img/mainpage/star_4.png

- A novel scheme for speaker recognition using a phonetically-aware deep neural network: [`Paper <https://ieeexplore.ieee.org/abstract/document/6853887/>`_]

  .. image:: _img/mainpage/star_3.png

- Text-Independent Speaker Verification Using 3D Convolutional Neural Networks: [`Paper <https://arxiv.org/abs/1705.09422>`_] [`Code <https://github.com/astorfi/3D-convolutional-speaker-recognition>`_]

  .. image:: _img/mainpage/star_4.png
############
Datasets
############

====================
Image
====================

General
-------

- MNIST Handwritten digits: [`Link <http://yann.lecun.com/exdb/mnist/>`_]
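If you prefer not to download the files manually, most frameworks ship a loader for MNIST; for example, assuming TensorFlow is installed:

.. code-block:: python

    import tensorflow as tf

    # Downloads and caches MNIST on first use
    (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
    print(x_train.shape)  # (60000, 28, 28)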
Face
----

- Face Recognition Technology (FERET): The goal of the FERET program was to develop automatic face recognition capabilities that could be employed to assist security, intelligence, and law enforcement personnel in the performance of their duties. [`Link <https://www.nist.gov/programs-projects/face-recognition-technology-feret>`_]
- The CMU Pose, Illumination, and Expression (PIE) Database of Human Faces: a database of 41,368 images of 68 people, collected between October and December 2000. [`Link <https://www.ri.cmu.edu/publications/the-cmu-pose-illumination-and-expression-pie-database-of-human-faces/>`_]
- YouTube Faces DB: contains 3,425 videos of 1,595 different people, all downloaded from YouTube, with an average of 2.15 videos per subject. [`Link <https://www.cs.tau.ac.il/~wolf/ytfaces/>`_]
- Grammatical Facial Expressions Data Set: developed to assist the automated analysis of facial expressions. [`Link <https://archive.ics.uci.edu/ml/datasets/Grammatical+Facial+Expressions>`_]
- FaceScrub: a dataset with over 100,000 face images of 530 people. [`Link <http://vintage.winklerbros.net/facescrub.html>`_]
- IMDB-WIKI: 500k+ face images with age and gender labels. [`Link <https://data.vision.ee.ethz.ch/cvl/rrothe/imdb-wiki/>`_]
- FDDB: Face Detection Data Set and Benchmark. [`Link <http://vis-www.cs.umass.edu/fddb/>`_]
Object Recognition
------------------

- COCO: Microsoft COCO: Common Objects in Context. [`Link <http://cocodataset.org/#home>`_]
- ImageNet: the famous ImageNet dataset. [`Link <http://www.image-net.org/>`_]
- Open Images Dataset: a dataset of ~9 million images that have been annotated with image-level labels and object bounding boxes. [`Link <https://storage.googleapis.com/openimages/web/index.html>`_]
- Caltech-256 Object Category Dataset: a large dataset for object classification. [`Link <https://authors.library.caltech.edu/7694/>`_]
- Pascal VOC dataset: a large dataset for classification tasks. [`Link <http://host.robots.ox.ac.uk/pascal/VOC/>`_]
- CIFAR-10 / CIFAR-100: CIFAR-10 consists of 60,000 32x32 colour images in 10 classes; CIFAR-100 is similar but has 100 classes containing 600 images each. [`Link <https://www.cs.toronto.edu/~kriz/cifar.html>`_]
Action Recognition
------------------

- HMDB: a large human motion database. [`Link <http://serre-lab.clps.brown.edu/resource/hmdb-a-large-human-motion-database/>`_]
- MHAD: Berkeley Multimodal Human Action Database. [`Link <http://tele-immersion.citris-uc.org/berkeley_mhad>`_]
- UCF101 - Action Recognition Data Set: an action recognition data set of realistic action videos, collected from YouTube, with 101 action categories; it is an extension of the UCF50 data set, which has 50 action categories. [`Link <http://crcv.ucf.edu/data/UCF101.php>`_]
- THUMOS Dataset: a large dataset for action classification. [`Link <http://crcv.ucf.edu/data/THUMOS.php>`_]
- ActivityNet: a large-scale video benchmark for human activity understanding. [`Link <http://activity-net.org/>`_]
======================================
Text and Natural Language Processing
======================================

General
-------

- 1 Billion Word Language Model Benchmark: the purpose of the project is to make available a standard training and test setup for language modeling experiments. [`Link <http://www.statmt.org/lm-benchmark/>`_]
- Common Crawl: the Common Crawl corpus contains petabytes of data collected over the last 7 years. It contains raw web page data, extracted metadata and text extractions. [`Link <http://commoncrawl.org/the-data/get-started/>`_]
- Yelp Open Dataset: a subset of Yelp's businesses, reviews, and user data for use in personal, educational, and academic purposes. [`Link <https://www.yelp.com/dataset>`_]
Text Classification
-------------------

- 20 Newsgroups: the 20 Newsgroups data set is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups. [`Link <http://qwone.com/~jason/20Newsgroups/>`_]
- Broadcast News: the 1996 Broadcast News Speech Corpus contains a total of 104 hours of broadcasts from ABC, CNN and CSPAN television networks and NPR and PRI radio networks with corresponding transcripts. [`Link <https://catalog.ldc.upenn.edu/LDC97S44>`_]
- The WikiText long term dependency language modeling dataset: a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. [`Link <https://einstein.ai/research/the-wikitext-long-term-dependency-language-modeling-dataset>`_]
Question Answering
------------------

- Question Answering Corpus: by DeepMind and Oxford; two corpora of roughly a million news stories with associated queries from the CNN and Daily Mail websites. [`Link <https://github.com/deepmind/rc-data>`_]
- Stanford Question Answering Dataset (SQuAD): consists of questions posed by crowdworkers on a set of Wikipedia articles. [`Link <https://rajpurkar.github.io/SQuAD-explorer/>`_]
- Amazon question/answer data: contains question and answer data from Amazon, totaling around 1.4 million answered questions. [`Link <http://jmcauley.ucsd.edu/data/amazon/qa/>`_]
Sentiment Analysis
------------------

- Multi-Domain Sentiment Dataset: contains product reviews taken from Amazon.com from many product types (domains). [`Link <http://www.cs.jhu.edu/~mdredze/datasets/sentiment/>`_]
- Stanford Sentiment Treebank Dataset: the first corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language. [`Link <https://nlp.stanford.edu/sentiment/>`_]
- Large Movie Review Dataset: a dataset for binary sentiment classification. [`Link <http://ai.stanford.edu/~amaas/data/sentiment/>`_]
Machine Translation
-------------------

- Aligned Hansards of the 36th Parliament of Canada: contains 1.3 million pairs of aligned text chunks. [`Link <https://www.isi.edu/natural-language/download/hansard/>`_]
- Europarl: A Parallel Corpus for Statistical Machine Translation: a dataset extracted from the proceedings of the European Parliament. [`Link <http://www.statmt.org/europarl/>`_]
Summarization
-------------

- Legal Case Reports Data Set: a textual corpus of 4,000 legal cases for automatic summarization and citation analysis. [`Link <https://archive.ics.uci.edu/ml/datasets/Legal+Case+Reports>`_]
======================================
Speech Technology
======================================

- TIMIT Acoustic-Phonetic Continuous Speech Corpus: the TIMIT corpus of read speech is designed to provide speech data for acoustic-phonetic studies and for the development and evaluation of automatic speech recognition systems. [`Link <https://catalog.ldc.upenn.edu/ldc93s1>`_]
- LibriSpeech: a corpus of approximately 1000 hours of 16kHz read English speech, prepared by Vassil Panayotov with the assistance of Daniel Povey. [`Link <http://www.openslr.org/12/>`_]
- VoxCeleb: a large scale audio-visual dataset. [`Link <http://www.robots.ox.ac.uk/~vgg/data/voxceleb/>`_]
- NIST Speaker Recognition: [`Link <https://www.nist.gov/itl/iad/mig/speaker-recognition>`_]
############
Courses
############

.. image:: _img/mainpage/online.png

- Machine Learning by Stanford on Coursera: [`Link <https://www.coursera.org/learn/machine-learning>`_]
- Neural Networks and Deep Learning Specialization on Coursera: [`Link <https://www.coursera.org/learn/neural-networks-deep-learning>`_]
- Intro to Deep Learning by Google: [`Link <https://www.udacity.com/course/deep-learning--ud730>`_]
- Introduction to Deep Learning by CMU: [`Link <http://deeplearning.cs.cmu.edu/>`_]
- NVIDIA Deep Learning Institute by NVIDIA: [`Link <https://www.nvidia.com/en-us/deep-learning-ai/education/>`_]
- Convolutional Neural Networks for Visual Recognition by Stanford: [`Link <http://cs231n.stanford.edu/>`_]
- Deep Learning for Natural Language Processing by Stanford: [`Link <http://cs224d.stanford.edu/>`_]
- Deep Learning by fast.ai: [`Link <http://www.fast.ai/>`_]
- Course on Deep Learning for Visual Computing by IITKGP: [`Link <https://www.youtube.com/playlist?list=PLuv3GM6-gsE1Biyakccxb3FAn4wBLyfWf>`_]
############
Books
############

.. image:: _img/mainpage/books.jpg

- Deep Learning by Ian Goodfellow: [`Link <http://www.deeplearningbook.org/>`_]
- Neural Networks and Deep Learning: [`Link <http://neuralnetworksanddeeplearning.com/>`_]
- Deep Learning with Python: [`Link <https://www.amazon.com/Deep-Learning-Python-Francois-Chollet/dp/1617294438/ref=as_li_ss_tl?s=books&ie=UTF8&qid=1519989624&sr=1-4&keywords=deep+learning+with+python&linkCode=sl1&tag=trndingcom-20&linkId=ec7663329fdb7ace60f39c762e999683>`_]
- Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems: [`Link <https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1491962291/ref=as_li_ss_tl?ie=UTF8&qid=1519989725&sr=1-2-ent&linkCode=sl1&tag=trndingcom-20&linkId=71938c9398940c7b0a811dc1cfef7cc3>`_]
############
Blogs
############

.. image:: _img/mainpage/Blogger_icon.png

- Colah's blog: [`Link <http://colah.github.io/>`_]
- Andrej Karpathy blog: [`Link <http://karpathy.github.io/>`_]
- The Spectator: Shakir's Machine Learning Blog: [`Link <http://blog.shakirm.com/>`_]
- WILDML: [`Link <http://www.wildml.com/about/>`_]
- Distill: more a journal than a blog, since it has a peer-review process and only accepted articles are published. [`Link <https://distill.pub/>`_]
- BAIR: Berkeley Artificial Intelligence Research blog: [`Link <http://bair.berkeley.edu/blog/>`_]
- Sebastian Ruder's blog: [`Link <http://ruder.io/>`_]
- inFERENCe: [`Link <https://www.inference.vc/page/2/>`_]
- i am trask: A Machine Learning Craftsmanship Blog: [`Link <http://iamtrask.github.io>`_]
############
Tutorials
############

.. image:: _img/mainpage/tutorial.png

- Deep Learning Tutorials: [`Link <http://deeplearning.net/tutorial/>`_]
- Deep Learning for NLP with Pytorch by Pytorch: [`Link <https://pytorch.org/tutorials/beginner/deep_learning_nlp_tutorial.html>`_]
- Deep Learning for Natural Language Processing: Tutorials with Jupyter Notebooks by Jon Krohn: [`Link <https://insights.untapt.com/deep-learning-for-natural-language-processing-tutorials-with-jupyter-notebooks-ad67f336ce3f>`_]
############
Frameworks
############

- TensorFlow: [`Link <https://www.tensorflow.org/>`_]
- PyTorch: [`Link <https://pytorch.org/>`_]
- CNTK: [`Link <https://docs.microsoft.com/en-us/cognitive-toolkit/>`_]
- MatConvNet: [`Link <http://www.vlfeat.org/matconvnet/>`_]
- Keras: [`Link <https://keras.io/>`_]
- Caffe: [`Link <http://caffe.berkeleyvision.org/>`_]
- Theano: [`Link <http://www.deeplearning.net/software/theano/>`_]
- cuDNN: [`Link <https://developer.nvidia.com/cudnn>`_]
- Torch: [`Link <https://github.com/torch/torch7>`_]
- Deeplearning4j: [`Link <https://deeplearning4j.org/>`_]
############
Contributing
############

For typos and other minor fixes, please do not create a pull request; instead, report them in an issue or email the repository owner. Please note that we have a code of conduct; please follow it in all your interactions with the project.

========================
Pull Request Process
========================

Please keep the following criteria in mind to help us review your contribution:

- A pull request is mainly expected to be a link suggestion.
- Please make sure your suggested resources are not obsolete or broken.
- Ensure any install or build dependencies are removed before the end of the layer when doing a build and creating a pull request.
- Add comments with details of changes to the interface; this includes new environment variables, exposed ports, useful file locations, and container parameters.
- You may merge the pull request once you have the sign-off of at least one other developer; if you do not have permission to do that, you may ask the owner to merge it for you once all checks have passed.
========================
Final Note
========================

We look forward to your kind feedback. Please help us improve this open source project and make our work better. To contribute, please create a pull request and we will review it promptly. Once again, we appreciate your kind feedback and support.