kjw0612/awesome-rnn

Recurrent Neural Network - A curated list of resources dedicated to RNN

Quick Overview

The kjw0612/awesome-rnn repository is a curated list of resources related to Recurrent Neural Networks (RNNs). It serves as a comprehensive collection of papers, tutorials, and implementations focusing on RNNs and their applications in various domains of machine learning and artificial intelligence.

Pros

  • Extensive collection of RNN-related resources in one place
  • Well-organized structure with categorized sections for easy navigation
  • Covers foundational work (e.g., LSTM, 1997) through papers from 2016
  • Includes both theoretical papers and practical implementations

Cons

  • May be overwhelming for beginners due to the large amount of information
  • The README states that the project is not actively maintained, so some links may be outdated
  • Lacks detailed explanations or summaries for each resource
  • Limited to RNNs and closely related topics, not covering other neural network architectures

Code Examples

This repository is not a code library but a curated list of resources. Therefore, there are no code examples to provide.

Getting Started

As this is not a code library, there are no installation steps. Instead, browse the README's sections and follow the links that interest you. The top-level sections are:

  1. Codes
  2. Theory (Lectures, Books / Thesis, Architecture Variants, Surveys)
  3. Applications
  4. Datasets
  5. Blogs
  6. Online Demos

To get started, simply visit the repository at https://github.com/kjw0612/awesome-rnn and explore the resources that align with your interests and level of expertise in RNNs.
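
For readers who prefer to skim the list offline, a short script can print the section outline of a README. The following is a minimal sketch, assuming a local Markdown copy of the README with `#`-style headings; the `outline` helper and the sample text are hypothetical and not part of the repository.

```python
import re

# Hypothetical helper: list the headings of a Markdown document as (level, title) pairs.
HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def outline(markdown_text):
    """Return (level, title) for every '#'-style heading, skipping fenced code."""
    headings = []
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence  # ignore '#' comments inside code fences
            continue
        match = None if in_fence else HEADING.match(line)
        if match:
            headings.append((len(match.group(1)), match.group(2)))
    return headings


# Made-up sample text standing in for a downloaded README.
sample = (
    "# Awesome Recurrent Neural Networks\n"
    "## Codes\n"
    "## Theory\n"
    "### Lectures\n"
    "## Applications\n"
)

for level, title in outline(sample):
    print("  " * (level - 1) + title)
```

Indenting each title by its heading level gives a quick view of how the list is nested before opening any individual link.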

Competitor Comparisons

11,562

Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Pros of char-rnn

  • Focused implementation of character-level RNNs for text generation
  • Includes training and sampling scripts for quick experimentation
  • Well-documented codebase with clear explanations of the architecture

Cons of char-rnn

  • Limited to character-level models, less versatile than awesome-rnn
  • Lacks comprehensive resources and references found in awesome-rnn
  • May be outdated compared to more recent RNN implementations

Code Comparison

char-rnn (Lua, simplified sketch; the repository itself builds its LSTM with nngraph):

local model = nn.Sequential()
model:add(nn.LSTM(input_size, rnn_size, 1, true))
model:add(nn.Linear(rnn_size, output_size))
model:add(nn.LogSoftMax())

awesome-rnn (Python - example from a linked resource):

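import torch
import torch.nn as nn

class RNN(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(RNN, self).__init__()
        self.hidden_size = hidden_size
        self.i2h = nn.Linear(input_size + hidden_size, hidden_size)
        self.i2o = nn.Linear(input_size + hidden_size, output_size)
        self.softmax = nn.LogSoftmax(dim=1)

    def forward(self, x, hidden):  # forward pass added here as an assumed completion
        combined = torch.cat((x, hidden), dim=1)  # join input and previous state
        return self.softmax(self.i2o(combined)), self.i2h(combined)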

Note: awesome-rnn is a curated list of RNN resources, so the code example is from one of its linked projects.
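
To make concrete what a cell with `i2h` and `i2o` layers typically computes, here is a minimal pure-Python sketch of one recurrent step: the input and previous hidden state are concatenated, one linear map produces the next hidden state, and another produces log-softmax output scores. The weights, sizes, and the `rnn_step` helper are hypothetical illustrations, not code from awesome-rnn or any linked project. This sketch applies no nonlinearity to the hidden state; many RNN formulations pass it through tanh.

```python
import math


def linear(weights, bias, vec):
    """Compute weights @ vec + bias for a list-of-rows weight matrix."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def log_softmax(scores):
    """Numerically stable log-softmax over a list of scores."""
    top = max(scores)
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_norm for s in scores]


def rnn_step(x, hidden, i2h, i2o):
    """One recurrent step: both results are computed from [x ; hidden]."""
    combined = x + hidden  # list concatenation, mirroring input_size + hidden_size
    new_hidden = linear(i2h[0], i2h[1], combined)
    output = log_softmax(linear(i2o[0], i2o[1], combined))
    return output, new_hidden


# Hypothetical toy sizes: input_size=2, hidden_size=2, output_size=3.
i2h = ([[0.5, -0.5, 0.1, 0.0], [0.0, 0.3, 0.0, 0.2]], [0.0, 0.0])
i2o = ([[0.1, 0.2, 0.3, 0.4], [0.0, 0.0, 0.0, 0.0], [-0.1, 0.1, -0.1, 0.1]], [0.0, 0.0, 0.0])

hidden = [0.0, 0.0]  # initial hidden state
for x in [[1.0, 0.0], [0.0, 1.0]]:  # a two-step input sequence
    output, hidden = rnn_step(x, hidden, i2h, i2o)

print(output, hidden)
```

The loop shows the defining property of a recurrent network: the hidden state returned by one step is fed back in as an input to the next.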

76,949

Models and examples built with TensorFlow

Pros of models

  • Comprehensive collection of official TensorFlow models and examples
  • Actively maintained by Google and the TensorFlow community
  • Covers a wide range of machine learning tasks beyond just RNNs

Cons of models

  • Larger and more complex repository, potentially overwhelming for beginners
  • Focuses on TensorFlow-specific implementations, limiting flexibility for other frameworks

Code Comparison

models:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(10, activation='softmax')
])

awesome-rnn:

# No direct code examples provided in the repository
# Focuses on curating links to external resources and papers

Summary

models is a comprehensive repository maintained by Google, offering a wide range of TensorFlow models and examples for various machine learning tasks. It provides up-to-date, production-ready implementations but may be overwhelming for beginners and is limited to TensorFlow.

awesome-rnn is a curated list of resources, papers, and links related to Recurrent Neural Networks. It doesn't provide direct code implementations but serves as a valuable reference for researchers and practitioners interested in RNNs across different frameworks and applications.

Choose models for practical TensorFlow implementations, or awesome-rnn for a broader overview of RNN research and resources.

TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

Pros of TensorFlow-Examples

  • Focuses specifically on TensorFlow, providing practical examples for this popular deep learning framework
  • Includes a wide range of examples covering various machine learning tasks and model architectures
  • Regularly updated with new examples and improvements

Cons of TensorFlow-Examples

  • Limited to TensorFlow implementations, not covering other frameworks or theoretical aspects
  • May not provide as comprehensive a list of resources for RNNs specifically
  • Lacks curated lists of papers, books, and other external resources

Code Comparison

TensorFlow-Examples (LSTM example, TensorFlow 1.x API):

lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(num_hidden, forget_bias=1.0)
outputs, states = tf.nn.dynamic_rnn(lstm_cell, x, dtype=tf.float32)

awesome-rnn (no code, as it's a curated list):

# No direct code examples provided
# Instead, links to various RNN implementations and resources

Summary

TensorFlow-Examples offers practical, hands-on examples for TensorFlow users, while awesome-rnn provides a curated list of resources for RNN research and implementation across various frameworks. TensorFlow-Examples is more focused on code implementation, while awesome-rnn offers a broader overview of RNN-related materials.

PyTorch Tutorial for Deep Learning Researchers

Pros of pytorch-tutorial

  • Focuses specifically on PyTorch, providing hands-on tutorials and examples
  • Covers a wide range of deep learning topics beyond just RNNs
  • Regularly updated with newer PyTorch features and best practices

Cons of pytorch-tutorial

  • Less comprehensive in RNN-specific resources compared to awesome-rnn
  • May not include as many academic papers or theoretical materials
  • Primarily code-focused, with less emphasis on explanatory text or research

Code Comparison

pytorch-tutorial (LSTM example):

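import torch.nn as nn

class LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, num_classes):
        super(LSTM, self).__init__()
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.lstm = nn.LSTM(input_size, hidden_size, num_layers, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):  # assumed completion: classify from the last time step
        out, _ = self.lstm(x)  # initial hidden and cell states default to zeros
        return self.fc(out[:, -1, :])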

awesome-rnn (no direct code examples, as it's a curated list of resources)

Summary

pytorch-tutorial is a practical, code-focused resource specifically for PyTorch users, covering various deep learning topics. awesome-rnn is a comprehensive list of RNN-related resources, including papers, tutorials, and implementations across different frameworks. Choose pytorch-tutorial for hands-on PyTorch learning, or awesome-rnn for a broader overview of RNN research and implementations.

61,580

Deep Learning for humans

Pros of Keras

  • Full-featured deep learning library with comprehensive documentation
  • Supports multiple backend engines (TensorFlow, JAX, and PyTorch as of Keras 3; earlier multi-backend releases supported TensorFlow, Theano, and CNTK)
  • Active development and large community support

Cons of Keras

  • More complex setup and learning curve for beginners
  • Less focused on RNNs specifically compared to awesome-rnn

Code Comparison

Keras example (RNN implementation):

from keras.models import Sequential
from keras.layers import SimpleRNN

model = Sequential()
model.add(SimpleRNN(32, input_shape=(None, 10)))
model.compile(optimizer='adam', loss='mse')

awesome-rnn doesn't provide direct code examples, as it's a curated list of RNN resources rather than a library.

Key Differences

  • Keras is a full deep learning framework, while awesome-rnn is a curated list of RNN resources
  • Keras offers hands-on implementation, whereas awesome-rnn provides links to various RNN-related materials
  • Keras has a broader scope covering various neural network architectures, while awesome-rnn focuses specifically on RNNs

Use Cases

  • Keras: Ideal for developing and deploying deep learning models, including RNNs
  • awesome-rnn: Best for researchers and learners seeking RNN-specific resources and papers

README

Awesome Recurrent Neural Networks

A curated list of resources dedicated to recurrent neural networks (closely related to deep learning).

Maintainers - Myungsub Choi, Taeksoo Kim, Jiwon Kim

We have pages for other topics: awesome-deep-vision, awesome-random-forest

Contributing

Please feel free to send pull requests, email Myungsub Choi (cms6539@gmail.com), or join our chats to add links.

The project is not actively maintained.

Join the chat at https://gitter.im/kjw0612/awesome-rnn

Table of Contents

Codes

Theory

Lectures

Books / Thesis

Architecture Variants

Structure

  • Bi-directional RNN [Paper]
    • Mike Schuster and Kuldip K. Paliwal, Bidirectional Recurrent Neural Networks, Trans. on Signal Processing 1997
  • Multi-dimensional RNN [Paper]
    • Alex Graves, Santiago Fernandez, and Jurgen Schmidhuber, Multi-Dimensional Recurrent Neural Networks, ICANN 2007
  • GFRNN [Paper-arXiv] [Paper-ICML] [Supplementary]
    • Junyoung Chung, Caglar Gulcehre, Kyunghyun Cho, Yoshua Bengio, Gated Feedback Recurrent Neural Networks, arXiv:1502.02367 / ICML 2015
  • Tree-Structured RNNs
    • Kai Sheng Tai, Richard Socher, and Christopher D. Manning, Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, arXiv:1503.00075 / ACL 2015 [Paper]
    • Samuel R. Bowman, Christopher D. Manning, and Christopher Potts, Tree-structured composition in neural networks without tree-structured architectures, arXiv:1506.04834 [Paper]
  • Grid LSTM [Paper] [Code]
    • Nal Kalchbrenner, Ivo Danihelka, and Alex Graves, Grid Long Short-Term Memory, arXiv:1507.01526
  • Segmental RNN [Paper]
    • Lingpeng Kong, Chris Dyer, Noah Smith, "Segmental Recurrent Neural Networks", ICLR 2016.
  • Seq2seq for Sets [Paper]
    • Oriol Vinyals, Samy Bengio, Manjunath Kudlur, "Order Matters: Sequence to sequence for sets", ICLR 2016.
  • Hierarchical Recurrent Neural Networks [Paper]
    • Junyoung Chung, Sungjin Ahn, Yoshua Bengio, "Hierarchical Multiscale Recurrent Neural Networks", arXiv:1609.01704

Memory

  • LSTM [Paper]
    • Sepp Hochreiter and Jurgen Schmidhuber, Long Short-Term Memory, Neural Computation 1997
  • GRU (Gated Recurrent Unit) [Paper]
    • Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, arXiv:1406.1078 / EMNLP 2014
  • NTM [Paper]
    • A. Graves, G. Wayne, and I. Danihelka, Neural Turing Machines, arXiv preprint arXiv:1410.5401
  • Neural GPU [Paper]
    • Łukasz Kaiser and Ilya Sutskever, Neural GPUs Learn Algorithms, arXiv:1511.08228 / ICML 2016 (under review)
  • Memory Network [Paper]
    • Jason Weston, Sumit Chopra, Antoine Bordes, Memory Networks, arXiv:1410.3916
  • Pointer Network [Paper]
    • Oriol Vinyals, Meire Fortunato, and Navdeep Jaitly, Pointer Networks, arXiv:1506.03134 / NIPS 2015
  • Deep Attention Recurrent Q-Network [Paper]
    • Ivan Sorokin, Alexey Seleznev, Mikhail Pavlov, Aleksandr Fedorov, Anastasiia Ignateva, Deep Attention Recurrent Q-Network, arXiv:1512.01693
  • Dynamic Memory Networks [Paper]
    • Ankit Kumar, Ozan Irsoy, Peter Ondruska, Mohit Iyyer, James Bradbury, Ishaan Gulrajani, Victor Zhong, Romain Paulus, Richard Socher, "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing", arXiv:1506.07285

Surveys

Applications

Natural Language Processing

Language Modeling

  • Tomas Mikolov, Martin Karafiat, Lukas Burget, Jan "Honza" Cernocky, Sanjeev Khudanpur, Recurrent Neural Network based Language Model, Interspeech 2010 [Paper]
  • Tomas Mikolov, Stefan Kombrink, Lukas Burget, Jan "Honza" Cernocky, Sanjeev Khudanpur, Extensions of Recurrent Neural Network Language Model, ICASSP 2011 [Paper]
  • Stefan Kombrink, Tomas Mikolov, Martin Karafiat, Lukas Burget, Recurrent Neural Network based Language Modeling in Meeting Recognition, Interspeech 2011 [Paper]
  • Jiwei Li, Minh-Thang Luong, and Dan Jurafsky, A Hierarchical Neural Autoencoder for Paragraphs and Documents, ACL 2015 [Paper], [Code]
  • Ryan Kiros, Yukun Zhu, Ruslan Salakhutdinov, and Richard S. Zemel, Skip-Thought Vectors, arXiv:1506.06726 / NIPS 2015 [Paper]
  • Yoon Kim, Yacine Jernite, David Sontag, and Alexander M. Rush, Character-Aware Neural Language Models, arXiv:1508.06615 [Paper]
  • Xingxing Zhang, Liang Lu, and Mirella Lapata, Tree Recurrent Neural Networks with Application to Language Modeling, arXiv:1511.00060 [Paper]
  • Felix Hill, Antoine Bordes, Sumit Chopra, and Jason Weston, The Goldilocks Principle: Reading children's books with explicit memory representations, arXiv:1511.0230 [Paper]

Speech Recognition

  • Geoffrey Hinton, Li Deng, Dong Yu, George E. Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, and Brian Kingsbury, Deep Neural Networks for Acoustic Modeling in Speech Recognition, IEEE Signal Processing Magazine 2012 [Paper]
  • Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton, Speech Recognition with Deep Recurrent Neural Networks, arXiv:1303.5778 / ICASSP 2013 [Paper]
  • Jan Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio, Attention-Based Models for Speech Recognition, arXiv:1506.07503 / NIPS 2015 [Paper]
  • Haşim Sak, Andrew Senior, Kanishka Rao, and Françoise Beaufays, Fast and Accurate Recurrent Neural Network Acoustic Models for Speech Recognition, arXiv:1507.06947 [Paper]

Machine Translation

  • Oxford [Paper]
    • Nal Kalchbrenner and Phil Blunsom, Recurrent Continuous Translation Models, EMNLP 2013
  • Univ. Montreal
    • Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation, arXiv:1406.1078 / EMNLP 2014 [Paper]
    • Kyunghyun Cho, Bart van Merrienboer, Dzmitry Bahdanau, and Yoshua Bengio, On the Properties of Neural Machine Translation: Encoder-Decoder Approaches, SSST-8 2014 [Paper]
    • Jean Pouget-Abadie, Dzmitry Bahdanau, Bart van Merrienboer, Kyunghyun Cho, and Yoshua Bengio, Overcoming the Curse of Sentence Length for Neural Machine Translation using Automatic Segmentation, SSST-8 2014
    • Dzmitry Bahdanau, KyungHyun Cho, and Yoshua Bengio, Neural Machine Translation by Jointly Learning to Align and Translate, arXiv:1409.0473 / ICLR 2015 [Paper]
    • Sebastien Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio, On using very large target vocabulary for neural machine translation, arXiv:1412.2007 / ACL 2015 [Paper]
  • Univ. Montreal + Middle East Tech. Univ. + Univ. Maine [Paper]
    • Caglar Gulcehre, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, and Yoshua Bengio, On Using Monolingual Corpora in Neural Machine Translation, arXiv:1503.03535
  • Google [Paper]
    • Ilya Sutskever, Oriol Vinyals, and Quoc V. Le, Sequence to Sequence Learning with Neural Networks, arXiv:1409.3215 / NIPS 2014
  • Google + NYU [Paper]
    • Minh-Thang Luong, Ilya Sutskever, Quoc V. Le, Oriol Vinyals, and Wojciech Zaremba, Addressing the Rare Word Problem in Neural Machine Translation, arXiv:1410.8206 / ACL 2015
  • ICT + Huawei [Paper]
    • Fandong Meng, Zhengdong Lu, Zhaopeng Tu, Hang Li, and Qun Liu, A Deep Memory-based Architecture for Sequence-to-Sequence Learning, arXiv:1506.06442
  • Stanford [Paper]
    • Minh-Thang Luong, Hieu Pham, and Christopher D. Manning, Effective Approaches to Attention-based Neural Machine Translation, arXiv:1508.04025
  • Middle East Tech. Univ. + NYU + Univ. Montreal [Paper]
    • Orhan Firat, Kyunghyun Cho, and Yoshua Bengio, Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism, arXiv:1601.01073

Conversation Modeling

  • Lifeng Shang, Zhengdong Lu, and Hang Li, Neural Responding Machine for Short-Text Conversation, arXiv:1503.02364 / ACL 2015 [Paper]
  • Oriol Vinyals and Quoc V. Le, A Neural Conversational Model, arXiv:1506.05869 [Paper]
  • Ryan Lowe, Nissan Pow, Iulian V. Serban, and Joelle Pineau, The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems, arXiv:1506.08909 [Paper]
  • Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, and Jason Weston, Evaluating Prerequisite Qualities for Learning End-to-End Dialog Systems, arXiv:1511.06931 [Paper]
  • Jason Weston, Dialog-based Language Learning, arXiv:1604.06045, [Paper]
  • Antoine Bordes and Jason Weston, Learning End-to-End Goal-Oriented Dialog, arXiv:1605.07683 [Paper]

Question Answering

  • FAIR
    • Jason Weston, Antoine Bordes, Sumit Chopra, Tomas Mikolov, and Alexander M. Rush, Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:1502.05698 [Web] [Paper]
    • Antoine Bordes, Nicolas Usunier, Sumit Chopra, and Jason Weston, Simple Question answering with Memory Networks, arXiv:1506.02075 [Paper]
    • Felix Hill, Antoine Bordes, Sumit Chopra, Jason Weston, "The Goldilocks Principle: Reading Children's Books with Explicit Memory Representations", ICLR 2016 [Paper]
  • DeepMind + Oxford [Paper]
    • Karl M. Hermann, Tomas Kocisky, Edward Grefenstette, Lasse Espeholt, Will Kay, Mustafa Suleyman, and Phil Blunsom, Teaching Machines to Read and Comprehend, arXiv:1506.03340 / NIPS 2015
  • MetaMind [Paper]
    • Ankit Kumar, Ozan Irsoy, Jonathan Su, James Bradbury, Robert English, Brian Pierce, Peter Ondruska, Mohit Iyyer, Ishaan Gulrajani, and Richard Socher, Ask Me Anything: Dynamic Memory Networks for Natural Language Processing, arXiv:1506.07285

Computer Vision

Object Recognition

  • Pedro Pinheiro and Ronan Collobert, Recurrent Convolutional Neural Networks for Scene Labeling, ICML 2014 [Paper]
  • Ming Liang and Xiaolin Hu, Recurrent Convolutional Neural Network for Object Recognition, CVPR 2015 [Paper]
  • Wonmin Byeon, Thomas Breuel, Federico Raue, and Marcus Liwicki, Scene Labeling with LSTM Recurrent Neural Networks, CVPR 2015 [Paper]
  • Mircea Serban Pavel, Hannes Schulz, and Sven Behnke, Recurrent Convolutional Neural Networks for Object-Class Segmentation of RGB-D Video, IJCNN 2015 [Paper]
  • Shuai Zheng, Sadeep Jayasumana, Bernardino Romera-Paredes, Vibhav Vineet, Zhizhong Su, Dalong Du, Chang Huang, and Philip H. S. Torr, Conditional Random Fields as Recurrent Neural Networks, arXiv:1502.03240 [Paper]
  • Xiaodan Liang, Xiaohui Shen, Donglai Xiang, Jiashi Feng, Liang Lin, and Shuicheng Yan, Semantic Object Parsing with Local-Global Long Short-Term Memory, arXiv:1511.04510 [Paper]
  • Sean Bell, C. Lawrence Zitnick, Kavita Bala, and Ross Girshick, Inside-Outside Net: Detecting Objects in Context with Skip Pooling and Recurrent Neural Networks, arXiv:1512.04143 / ICCV 2015 workshop [Paper]

Visual Tracking

  • Quan Gan, Qipeng Guo, Zheng Zhang, and Kyunghyun Cho, First Step toward Model-Free, Anonymous Object Tracking with Recurrent Neural Networks, arXiv:1511.06425 [Paper]

Image Generation

  • Karol Gregor, Ivo Danihelka, Alex Graves, Danilo J. Rezende, and Daan Wierstra, DRAW: A Recurrent Neural Network for Image Generation, ICML 2015 [Paper]
  • Angeliki Lazaridou, Dat T. Nguyen, R. Bernardi, and M. Baroni, Unveiling the Dreams of Word Embeddings: Towards Language-Driven Image Generation, arXiv:1506.03500 [Paper]
  • Lucas Theis and Matthias Bethge, Generative Image Modeling Using Spatial LSTMs, arXiv:1506.03478 / NIPS 2015 [Paper]
  • Aaron van den Oord, Nal Kalchbrenner, and Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv:1601.06759 [Paper]

Video Analysis

  • Univ. Toronto [Paper]
    • Nitish Srivastava, Elman Mansimov, Ruslan Salakhutdinov, Unsupervised Learning of Video Representations using LSTMs, arXiv:1502.04681 / ICML 2015
  • Univ. Cambridge [Paper]
    • Viorica Patraucean, Ankur Handa, Roberto Cipolla, Spatio-temporal video autoencoder with differentiable memory, arXiv:1511.06309

Multimodal (CV + NLP)

Image Captioning

  • UCLA + Baidu [Web] [Paper-arXiv1], [Paper-arXiv2]
    • Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, and Alan L. Yuille, Explain Images with Multimodal Recurrent Neural Networks, arXiv:1410.1090
    • Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, and Alan L. Yuille, Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN), arXiv:1412.6632 / ICLR 2015
  • Univ. Toronto [Paper] [Web demo]
    • Ryan Kiros, Ruslan Salakhutdinov, and Richard S. Zemel, Unifying Visual-Semantic Embeddings with Multimodal Neural Language Models, arXiv:1411.2539 / TACL 2015
  • Berkeley [Web] [Paper]
    • Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell, Long-term Recurrent Convolutional Networks for Visual Recognition and Description, arXiv:1411.4389 / CVPR 2015
  • Google [Paper]
    • Oriol Vinyals, Alexander Toshev, Samy Bengio, and Dumitru Erhan, Show and Tell: A Neural Image Caption Generator, arXiv:1411.4555 / CVPR 2015
  • Stanford [Web] [Paper]
    • Andrej Karpathy and Li Fei-Fei, Deep Visual-Semantic Alignments for Generating Image Descriptions, CVPR 2015
  • Microsoft [Paper]
    • Hao Fang, Saurabh Gupta, Forrest Iandola, Rupesh Srivastava, Li Deng, Piotr Dollar, Jianfeng Gao, Xiaodong He, Margaret Mitchell, John C. Platt, Lawrence Zitnick, and Geoffrey Zweig, From Captions to Visual Concepts and Back, arXiv:1411.4952 / CVPR 2015
  • CMU + Microsoft [Paper-arXiv], [Paper-CVPR]
    • Xinlei Chen, and C. Lawrence Zitnick, Learning a Recurrent Visual Representation for Image Caption Generation
    • Xinlei Chen, and C. Lawrence Zitnick, Mind’s Eye: A Recurrent Visual Representation for Image Caption Generation, CVPR 2015
  • Univ. Montreal + Univ. Toronto [Web] [Paper]
    • Kelvin Xu, Jimmy Lei Ba, Ryan Kiros, Kyunghyun Cho, Aaron Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua Bengio, Show, Attend, and Tell: Neural Image Caption Generation with Visual Attention, arXiv:1502.03044 / ICML 2015
  • Idiap + EPFL + Facebook [Paper]
    • Remi Lebret, Pedro O. Pinheiro, and Ronan Collobert, Phrase-based Image Captioning, arXiv:1502.03671 / ICML 2015
  • UCLA + Baidu [Paper]
    • Junhua Mao, Wei Xu, Yi Yang, Jiang Wang, Zhiheng Huang, and Alan L. Yuille, Learning like a Child: Fast Novel Visual Concept Learning from Sentence Descriptions of Images, arXiv:1504.06692
  • MS + Berkeley
    • Jacob Devlin, Saurabh Gupta, Ross Girshick, Margaret Mitchell, and C. Lawrence Zitnick, Exploring Nearest Neighbor Approaches for Image Captioning, arXiv:1505.04467 (Note: technically not RNN) [Paper]
    • Jacob Devlin, Hao Cheng, Hao Fang, Saurabh Gupta, Li Deng, Xiaodong He, Geoffrey Zweig, and Margaret Mitchell, Language Models for Image Captioning: The Quirks and What Works, arXiv:1505.01809 [Paper]
  • Adelaide [Paper]
    • Qi Wu, Chunhua Shen, Anton van den Hengel, Lingqiao Liu, and Anthony Dick, Image Captioning with an Intermediate Attributes Layer, arXiv:1506.01144
  • Tilburg [Paper]
    • Grzegorz Chrupala, Akos Kadar, and Afra Alishahi, Learning language through pictures, arXiv:1506.03694
  • Univ. Montreal [Paper]
    • Kyunghyun Cho, Aaron Courville, and Yoshua Bengio, Describing Multimedia Content using Attention-based Encoder-Decoder Networks, arXiv:1507.01053
  • Cornell [Paper]
    • Jack Hessel, Nicolas Savva, and Michael J. Wilber, Image Representations and New Domains in Neural Image Captioning, arXiv:1508.02091

Video Captioning

  • Berkeley [Web] [Paper]
    • Jeff Donahue, Lisa Anne Hendricks, Sergio Guadarrama, Marcus Rohrbach, Subhashini Venugopalan, Kate Saenko, and Trevor Darrell, Long-term Recurrent Convolutional Networks for Visual Recognition and Description, arXiv:1411.4389 / CVPR 2015
  • UT Austin + UML + Berkeley [Paper]
    • Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, and Kate Saenko, Translating Videos to Natural Language Using Deep Recurrent Neural Networks, arXiv:1412.4729
  • Microsoft [Paper]
    • Yingwei Pan, Tao Mei, Ting Yao, Houqiang Li, and Yong Rui, Joint Modeling Embedding and Translation to Bridge Video and Language, arXiv:1505.01861
  • UT Austin + Berkeley + UML [Paper]
    • Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, and Kate Saenko, Sequence to Sequence--Video to Text, arXiv:1505.00487
  • Univ. Montreal + Univ. Sherbrooke [Paper]
    • Li Yao, Atousa Torabi, Kyunghyun Cho, Nicolas Ballas, Christopher Pal, Hugo Larochelle, and Aaron Courville, Describing Videos by Exploiting Temporal Structure, arXiv:1502.08029
  • MPI + Berkeley [Paper]
    • Anna Rohrbach, Marcus Rohrbach, and Bernt Schiele, The Long-Short Story of Movie Description, arXiv:1506.01698
  • Univ. Toronto + MIT [Paper]
    • Yukun Zhu, Ryan Kiros, Richard Zemel, Ruslan Salakhutdinov, Raquel Urtasun, Antonio Torralba, and Sanja Fidler, Aligning Books and Movies: Towards Story-like Visual Explanations by Watching Movies and Reading Books, arXiv:1506.06724
  • Univ. Montreal [Paper]
    • Kyunghyun Cho, Aaron Courville, and Yoshua Bengio, Describing Multimedia Content using Attention-based Encoder-Decoder Networks, arXiv:1507.01053
  • Zhejiang Univ. + UTS [Paper]
    • Pingbo Pan, Zhongwen Xu, Yi Yang, Fei Wu, Yueting Zhuang, Hierarchical Recurrent Neural Encoder for Video Representation with Application to Captioning, arXiv:1511.03476
  • Univ. Montreal + NYU + IBM [Paper]
    • Li Yao, Nicolas Ballas, Kyunghyun Cho, John R. Smith, and Yoshua Bengio, Empirical performance upper bounds for image and video captioning, arXiv:1511.04590

Visual Question Answering

  • Virginia Tech. + MSR [Web] [Paper]
    • Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh, VQA: Visual Question Answering, arXiv:1505.00468 / CVPR 2015 SUNw: Scene Understanding workshop
  • MPI + Berkeley [Web] [Paper]
    • Mateusz Malinowski, Marcus Rohrbach, and Mario Fritz, Ask Your Neurons: A Neural-based Approach to Answering Questions about Images, arXiv:1505.01121
  • Univ. Toronto [Paper] [Dataset]
    • Mengye Ren, Ryan Kiros, and Richard Zemel, Exploring Models and Data for Image Question Answering, arXiv:1505.02074 / ICML 2015 deep learning workshop
  • Baidu + UCLA [Paper] [Dataset]
    • Haoyuan Gao, Junhua Mao, Jie Zhou, Zhiheng Huang, Lei Wang, and Wei Xu, Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering, arXiv:1505.05612 / NIPS 2015
  • SNU + NAVER [Paper]
    • Jin-Hwa Kim, Sang-Woo Lee, Dong-Hyun Kwak, Min-Oh Heo, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang, Multimodal Residual Learning for Visual QA, arXiv:1606.01455
  • UC Berkeley + Sony [Paper]
    • Akira Fukui, Dong Huk Park, Daylen Yang, Anna Rohrbach, Trevor Darrell, and Marcus Rohrbach, Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding, arXiv:1606.01847
  • Postech [Paper]
    • Hyeonwoo Noh and Bohyung Han, Training Recurrent Answering Units with Joint Loss Minimization for VQA, arXiv:1606.03647
  • SNU + NAVER [Paper]
    • Jin-Hwa Kim, Kyoung Woon On, Jeonghee Kim, Jung-Woo Ha, Byoung-Tak Zhang, Hadamard Product for Low-rank Bilinear Pooling, arXiv:1610.04325
  • Video QA
    • CMU + UTS [Paper]
      • Linchao Zhu, Zhongwen Xu, Yi Yang, Alexander G. Hauptmann, Uncovering Temporal Context for Video Question and Answering, arXiv:1511.04670
    • KIT + MIT + Univ. Toronto [Paper] [Dataset]
      • Makarand Tapaswi, Yukun Zhu, Rainer Stiefelhagen, Antonio Torralba, Raquel Urtasun, Sanja Fidler, MovieQA: Understanding Stories in Movies through Question-Answering, arXiv:1512.02902

Turing Machines

  • A. Graves, G. Wayne, and I. Danihelka, Neural Turing Machines, arXiv preprint arXiv:1410.5401 [Paper]
  • Jason Weston, Sumit Chopra, Antoine Bordes, Memory Networks, arXiv:1410.3916 [Paper]
  • Armand Joulin and Tomas Mikolov, Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets, arXiv:1503.01007 / NIPS 2015 [Paper]
  • Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, and Rob Fergus, End-To-End Memory Networks, arXiv:1503.08895 / NIPS 2015 [Paper]
  • Wojciech Zaremba and Ilya Sutskever, Reinforcement Learning Neural Turing Machines, arXiv:1505.00521 [Paper]
  • Baolin Peng and Kaisheng Yao, Recurrent Neural Networks with External Memory for Language Understanding, arXiv:1506.00195 [Paper]
  • Fandong Meng, Zhengdong Lu, Zhaopeng Tu, Hang Li, and Qun Liu, A Deep Memory-based Architecture for Sequence-to-Sequence Learning, arXiv:1506.06442 [Paper]
  • Arvind Neelakantan, Quoc V. Le, and Ilya Sutskever, Neural Programmer: Inducing Latent Programs with Gradient Descent, arXiv:1511.04834 [Paper]
  • Scott Reed and Nando de Freitas, Neural Programmer-Interpreters, arXiv:1511.06279 [Paper]
  • Karol Kurach, Marcin Andrychowicz, and Ilya Sutskever, Neural Random-Access Machines, arXiv:1511.06392 [Paper]
  • Łukasz Kaiser and Ilya Sutskever, Neural GPUs Learn Algorithms, arXiv:1511.08228 [Paper]
  • Ethan Caballero, Skip-Thought Memory Networks, arXiv:1511.06420 [Paper]
  • Wojciech Zaremba, Tomas Mikolov, Armand Joulin, and Rob Fergus, Learning Simple Algorithms from Examples, arXiv:1511.07275 [Paper]

Robotics

  • Hongyuan Mei, Mohit Bansal, and Matthew R. Walter, Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, arXiv:1506.04089 [Paper]
  • Marvin Zhang, Sergey Levine, Zoe McCarthy, Chelsea Finn, and Pieter Abbeel, Policy Learning with Continuous Memory States for Partially Observed Robotic Control, arXiv:1507.01273 [Paper]

Other

  • Alex Graves, Generating Sequences With Recurrent Neural Networks, arXiv:1308.0850 [Paper]
  • Volodymyr Mnih, Nicolas Heess, Alex Graves, and Koray Kavukcuoglu, Recurrent Models of Visual Attention, NIPS 2014 / arXiv:1406.6247 [Paper]
  • Wojciech Zaremba and Ilya Sutskever, Learning to Execute, arXiv:1410.4615 [Paper] [Code]
  • Samy Bengio, Oriol Vinyals, Navdeep Jaitly, and Noam Shazeer, Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks, arXiv:1506.03099 / NIPS 2015 [Paper]
  • Bing Shuai, Zhen Zuo, Gang Wang, and Bing Wang, DAG-Recurrent Neural Networks For Scene Labeling, arXiv:1509.00552 [Paper]
  • Soren Kaae Sonderby, Casper Kaae Sonderby, Lars Maaloe, and Ole Winther, Recurrent Spatial Transformer Networks, arXiv:1509.05329 [Paper]
  • Cesar Laurent, Gabriel Pereyra, Philemon Brakel, Ying Zhang, and Yoshua Bengio, Batch Normalized Recurrent Neural Networks, arXiv:1510.01378 [Paper]
  • Jiwon Kim, Jung Kwon Lee, Kyoung Mu Lee, Deeply-Recursive Convolutional Network for Image Super-Resolution, arXiv:1511.04491 [Paper]
  • Quan Gan, Qipeng Guo, Zheng Zhang, and Kyunghyun Cho, First Step toward Model-Free, Anonymous Object Tracking with Recurrent Neural Networks, arXiv:1511.06425 [Paper]
  • Francesco Visin, Kyle Kastner, Aaron Courville, Yoshua Bengio, Matteo Matteucci, and Kyunghyun Cho, ReSeg: A Recurrent Neural Network for Object Segmentation, arXiv:1511.07053 [Paper]
  • Juergen Schmidhuber, On Learning to Think: Algorithmic Information Theory for Novel Combinations of Reinforcement Learning Controllers and Recurrent Neural World Models, arXiv:1511.09249 [Paper]

Datasets

Blogs

Online Demos

  • Alex Graves, handwriting generation [link]
  • Ink Poster: Handwritten post-it notes [link]
  • LSTMVis: Visual Analysis for Recurrent Neural Networks [link]