
sherjilozair/char-rnn-tensorflow

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow


Top Related Projects

  • tensorflow/models (77,006 stars): Models and examples built with TensorFlow
  • char-rnn (11,562 stars): Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch
  • TensorFlow-Examples: TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

Quick Overview

The sherjilozair/char-rnn-tensorflow repository is a TensorFlow implementation of a character-level recurrent neural network (RNN) for text generation. It allows users to train a model on a corpus of text and then generate new text that mimics the style and content of the original text.

Pros

  • Flexible and Customizable: The project provides a modular and configurable architecture, allowing users to easily experiment with different model configurations, hyperparameters, and training data.
  • Comprehensive Documentation: The repository includes detailed documentation, including instructions for training, generating text, and evaluating the model's performance.
  • Active Development: The project is actively maintained, with regular updates and bug fixes.
  • Supports Multiple Datasets: The project can be used with a variety of text datasets, including books, articles, and scripts.

Cons

  • Limited to Character-level Modeling: The project is focused on character-level text generation, which may not be suitable for all use cases that require higher-level semantic understanding.
  • Computational Complexity: Training large-scale character-level RNNs can be computationally intensive and may require significant hardware resources, especially for longer sequences.
  • Potential for Biased or Offensive Output: As with any text generation model, the output of the char-rnn-tensorflow model may reflect biases or offensive content present in the training data.
  • Lack of Automatic Evaluation Metrics: The project does not provide built-in support for automatic evaluation of the generated text, requiring users to manually assess the quality and coherence of the output.

Code Examples

The repository itself is driven by the command-line scripts train.py and sample.py rather than an importable Python package, so the snippets below are illustrative sketches of the training, sampling, and evaluation workflow rather than verbatim repository code:

  1. Training the Model:
import tensorflow as tf
from char_rnn_tensorflow.model import CharRNN

model = CharRNN(
    num_classes=len(vocab),
    batch_size=64,
    num_steps=50,
    lstm_size=128,
    num_layers=2,
    learning_rate=0.002,
)

model.train(
    data_loader,
    num_epochs=50,
    save_every=1000,
    log_every=10,
    sample_every=1000,
    checkpoint_dir='./checkpoints'
)

This code sets up a CharRNN model and trains it on the provided data, saving checkpoints and generating sample text at regular intervals.

  2. Generating Text:
import tensorflow as tf
from char_rnn_tensorflow.model import CharRNN

model = CharRNN(
    num_classes=len(vocab),
    batch_size=1,
    num_steps=1,
    lstm_size=128,
    num_layers=2,
    sampling=True,
)

model.load('./checkpoints/model.ckpt')
print(model.sample(100, vocab, prime='The '))

This code loads a pre-trained CharRNN model and uses it to generate 100 characters of text, primed with the phrase "The ".
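
A common refinement when sampling is to apply a temperature to the predicted distribution before drawing the next character: low temperatures make the output more conservative, high temperatures more adventurous. The repository's sample.py does not necessarily expose such a knob, so the helper below is a hypothetical sketch that operates on a probability vector p over the vocabulary:

import numpy as np

def sample_with_temperature(p, temperature=1.0):
    """Draw a character index from probability vector p after
    reshaping the distribution with the given temperature."""
    logits = np.log(p + 1e-10) / temperature   # move to (scaled) log space
    probs = np.exp(logits - np.max(logits))    # re-exponentiate, numerically stable
    probs /= probs.sum()                       # renormalize to a valid distribution
    return np.random.choice(len(probs), p=probs)

# temperature < 1.0 approaches greedy decoding; > 1.0 increases diversity.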

  3. Evaluating the Model:
import tensorflow as tf
from char_rnn_tensorflow.model import CharRNN

model = CharRNN(
    num_classes=len(vocab),
    batch_size=64,
    num_steps=50,
    lstm_size=128,
    num_layers=2,
)

model.load('./checkpoints/model.ckpt')
perplexity = model.perplexity(data_loader)
print(f'Perplexity: {perplexity:.2f}')

This code loads a pre-trained CharRNN model and evaluates its performance on the provided data using perplexity as the metric.
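
Perplexity is simply the exponential of the average per-character cross-entropy (negative log-likelihood), so even without a built-in perplexity() method it can be computed from the losses the model already reports. A minimal sketch, assuming you can collect the per-character losses (in nats) on a held-out set:

import numpy as np

def perplexity_from_losses(char_losses):
    """Perplexity = exp(mean per-character negative log-likelihood)."""
    return float(np.exp(np.mean(char_losses)))

# Example: an average loss of 1.2 nats per character gives a perplexity of about 3.3.
print(perplexity_from_losses([1.1, 1.2, 1.3]))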

Getting Started

To get started with the sherjilozair/char-rnn-tensorflow project, follow these steps:

  1. Clone the repository:
git clone https://github.com/sherjilozair/char-rnn-tensorflow.git
  2. Install the required dependencies:
cd char-rnn-tensorflow
pip install -r requirements.txt
  3. Prepare your training data:
    • The project expects a single plain text file named input.txt inside a directory that you pass to train.py via --data_dir (see the Datasets section of the README below); a quick corpus sanity check is sketched after this list.
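
The repository's own data loader builds the character vocabulary from input.txt automatically, but a quick sanity check of the corpus before training can save a wasted run. A minimal, hypothetical helper (not part of the repository) that reports the corpus size and the character set it implies:

from collections import Counter
from pathlib import Path

def inspect_corpus(path="data/sherlock/input.txt"):
    """Report basic statistics about a training corpus:
    its length and the character vocabulary it implies."""
    text = Path(path).read_text(encoding="utf-8")
    vocab = Counter(text)
    print(f"{len(text):,} characters, {len(vocab)} unique")
    # The most frequent characters dominate what the model sees.
    print("most common:", vocab.most_common(10))

inspect_corpus()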

Competitor Comparisons

tensorflow/models (77,006 stars): Models and examples built with TensorFlow

Pros of tensorflow/models

  • Comprehensive collection of state-of-the-art models and examples for various machine learning tasks
  • Active community with frequent updates and contributions
  • Extensive documentation and tutorials for each model

Cons of tensorflow/models

  • Larger codebase and more complex to navigate compared to a single-purpose project like char-rnn-tensorflow
  • May require more setup and configuration to get a specific model running

Code Comparison

tensorflow/models (Transformer model):

def create_masks(inp, tar):
    # Encoder padding mask
    enc_padding_mask = create_padding_mask(inp)

    # Used in the 2nd attention block in the decoder.
    # This padding mask is used to mask the encoder outputs.
    dec_padding_mask = create_padding_mask(inp)

    # Used in the 1st attention block in the decoder.
    # It is used to pad and mask future tokens in the input received by
    # the decoder.
    look_ahead_mask = create_look_ahead_mask(tf.shape(tar)[1])
    dec_target_padding_mask = create_padding_mask(tar)
    combined_mask = tf.maximum(dec_target_padding_mask, look_ahead_mask)

    return enc_padding_mask, combined_mask, dec_padding_mask

sherjilozair/char-rnn-tensorflow (character-level RNN sampling, simplified):

import numpy as np

def sample(self, n=200, prime='The '):
    """
    Generate n characters of sample text from the model,
    seeded with the priming string.
    """
    states = self.sess.run(self.initial_state)
    # Warm up the hidden state on all but the last priming character.
    for c in prime[:-1]:
        x = np.array([[self.char_to_ix[c]]])
        states = self.sess.run(self.final_state,
                               {self.x: x, self.initial_state: states})
    txt = prime
    c = prime[-1]
    for _ in range(n):
        x = np.array([[self.char_to_ix[c]]])
        feed = {self.x: x, self.initial_state: states}
        preds, states = self.sess.run([self.proba, self.final_state], feed)
        p = preds[0, -1]
        # Sample the next character index from the predicted distribution.
        ix = np.random.choice(len(p), p=p)
        c = self.ix_to_char[ix]
        txt += c
    return txt

char-rnn (11,562 stars): Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch

Pros of char-rnn

  • The original implementation by Andrej Karpathy, written in Lua on the Torch framework, and the project that inspired char-rnn-tensorflow
  • Provides a more comprehensive set of features, including support for different RNN cell types (LSTM, GRU, and vanilla RNN) and optimization methods
  • Includes more detailed documentation and examples

Cons of char-rnn

  • Requires a Lua/Torch installation, which involves more setup and configuration than the Python/TensorFlow stack used by char-rnn-tensorflow
  • May be more complex for beginners to understand and use, especially those coming from Python
  • Torch (the Lua framework it is built on) is no longer actively developed, so its surrounding ecosystem is smaller than TensorFlow's

Code Comparison

Here's a brief comparison of how training a character-level language model is set up in each project. Both snippets are simplified sketches rather than verbatim repository code; in practice both projects are driven from the command line (python train.py here, and th train.lua for char-rnn):

char-rnn-tensorflow:

model = CharRNN(
    num_classes=len(vocab),
    batch_size=batch_size,
    num_steps=num_steps,
    lstm_size=lstm_size,
    num_layers=num_layers,
    learning_rate=learning_rate)
model.train(data_loader.train_data, data_loader.valid_data)

char-rnn:

local opt = {
  data_dir = 'data/tinyshakespeare',
  rnn_size = 128,
  num_layers = 2,
  batch_size = 50,
  seq_length = 50,
  num_epochs = 50,
  learning_rate = 2e-3,
  decay_rate = 0.97,
  dropout = 0.5,
}

local model = require 'model'(opt)
model:train()

The char-rnn-tensorflow snippet is more concise and straightforward, while the char-rnn configuration is more verbose but exposes more options (dropout, learning-rate decay, and so on) for flexibility and customization.

TensorFlow-Examples: TensorFlow Tutorial and Examples for Beginners (support TF v1 & v2)

Pros of TensorFlow-Examples

  • Covers a wide range of TensorFlow examples, from basic to advanced, making it a comprehensive resource for learning and experimentation.
  • Provides clear and well-documented code, making it easier for beginners to understand and follow along.
  • Includes examples for various machine learning tasks, such as classification, regression, and generative models.

Cons of TensorFlow-Examples

  • May not be as focused or specialized as char-rnn-tensorflow, which is dedicated to a specific task (character-level language modeling).
  • The examples may not be as up-to-date with the latest TensorFlow versions and best practices compared to a more focused project.
  • The project may not have the same level of active maintenance and community support as a more specialized repository.

Code Comparison

Here's a brief comparison of the code structure between the two repositories:

char-rnn-tensorflow

def build_rnn_graph(num_classes, num_seqs=50, num_steps=50, rnn_size=128):
    """
    Builds the computation graph for the character-level language model.
    """
    # ...
    cell = tf.nn.rnn_cell.BasicLSTMCell(rnn_size)
    initial_state = cell.zero_state(num_seqs, tf.float32)
    # ...

TensorFlow-Examples

def conv_net(x_dict, n_classes, dropout, reuse, is_training):
    """
    Convolution Neural Network Model.
    """
    # Create the network
    with tf.variable_scope('ConvNet', reuse=reuse):
        # Convolution Layer
        conv1 = tf.layers.conv2d(x_dict['images'], 32, 5, activation=tf.nn.relu)
        # Max Pooling (down-sampling)
        conv1 = tf.layers.max_pooling2d(conv1, 2, 2)
    # ...

The char-rnn-tensorflow code focuses on building a character-level language model using an LSTM-based recurrent neural network, while the TensorFlow-Examples code demonstrates a convolutional neural network for image classification.


README

char-rnn-tensorflow

Join the chat at https://gitter.im/char-rnn-tensorflow/Lobby

Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow.

Inspired from Andrej Karpathy's char-rnn.

Requirements

TensorFlow and Python; the remaining Python dependencies are listed in requirements.txt (installed in the Getting Started steps above).

Basic Usage

To train with default parameters on the tinyshakespeare corpus, run python train.py. To access all the parameters use python train.py --help.

To sample from a checkpointed model, run python sample.py. Sampling while training is still in progress (to check the latest checkpoint) works only on the CPU or on a second GPU. To force CPU mode, use export CUDA_VISIBLE_DEVICES="" and unset CUDA_VISIBLE_DEVICES afterward (on Windows: set CUDA_VISIBLE_DEVICES="" and set CUDA_VISIBLE_DEVICES=, respectively).
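
If you would rather not touch the shell environment, the same effect can be achieved from inside a Python process by hiding the GPUs before TensorFlow is imported; this is a generic TensorFlow technique rather than something specific to this repository:

import os

# Hide every GPU from this process so sampling runs on the CPU.
# The variable must be set before TensorFlow is imported anywhere.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf  # imported afterwards, so it sees no GPUs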

To continue training after an interruption, or to train for more epochs, run python train.py --init_from=save

Datasets

You can use any plain text file as input. For example you could download The complete Sherlock Holmes as such:

cd data
mkdir sherlock
cd sherlock
wget https://sherlock-holm.es/stories/plain-text/cnus.txt
mv cnus.txt input.txt

Then start training from the top-level directory with python train.py --data_dir=./data/sherlock/

A quick tip to concatenate many small disparate .txt files into one large training file: ls *.txt | xargs -L 1 cat >> input.txt.
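
On Windows, or whenever shell globbing is inconvenient, a few lines of Python do the same concatenation (a hypothetical helper, not part of the repository):

from pathlib import Path

# Concatenate every .txt file in the current directory into input.txt,
# skipping input.txt itself and separating files with a newline.
parts = [p.read_text(encoding="utf-8")
         for p in sorted(Path(".").glob("*.txt")) if p.name != "input.txt"]
Path("input.txt").write_text("\n".join(parts), encoding="utf-8")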

Tuning

Tuning your models is kind of a "dark art" at this point. In general:

  1. Start with as much clean input.txt as possible e.g. 50MiB
  2. Start by establishing a baseline using the default settings.
  3. Use tensorboard to compare all of your runs visually to aid in experimenting (a small sweep sketch follows this list).
  4. Tweak --rnn_size up somewhat from 128 if you have a lot of input data.
  5. Tweak --num_layers from 2 to 3 but no higher unless you have experience.
  6. Tweak --seq_length up from 50 based on the length of a valid input string (e.g. names are <= 12 characters, sentences may be up to 64 characters, etc.). An LSTM cell can "remember" over spans longer than this sequence length, but the effect falls off for longer character distances.
  7. Finally once you've done all that, only then would I suggest adding some dropout. Start with --output_keep_prob 0.8 and maybe end up with both --input_keep_prob 0.8 --output_keep_prob 0.5 only after exhausting all the above values.
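
To compare several of these settings side by side in Tensorboard (step 3 above), the runs can be scripted as a small sweep over the command-line flags. The sketch below assumes train.py also accepts --save_dir and --log_dir flags so each run keeps its own checkpoints and logs; verify the exact flag names with python train.py --help:

import subprocess

# Launch a few runs with different model sizes so Tensorboard can compare them.
# --save_dir and --log_dir are assumptions; check `python train.py --help`.
for rnn_size in (128, 256):
    run = f"rnn{rnn_size}"
    subprocess.run(
        ["python", "train.py",
         "--data_dir", "./data/sherlock",
         "--rnn_size", str(rnn_size),
         "--num_layers", "2",
         "--save_dir", f"./save/{run}",
         "--log_dir", f"./logs/{run}"],
        check=True,
    )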

Tensorboard

To visualize training progress, model graphs, and internal state histograms: fire up Tensorboard and point it at your log_dir. E.g.:

$ tensorboard --logdir=./logs/

Then open a browser to http://localhost:6006 or the correct IP/Port specified.

Roadmap

  • Add explanatory comments
  • Expose more command-line arguments
  • Compare accuracy and performance with char-rnn
  • More Tensorboard instrumentation

Contributing

Please feel free to:

  • Leave feedback in the issues
  • Open a Pull Request
  • Join the Gitter chat
  • Share your success stories and data sets!