
keras-rl

Deep Reinforcement Learning for Keras.

Top Related Projects

  • OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
  • Stable Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms
  • TF-Agents: a reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning
  • PFRL: a PyTorch-based deep reinforcement learning library
  • garage: a toolkit for reproducible reinforcement learning research

Quick Overview

keras-rl is a deep reinforcement learning library that extends Keras, a popular deep learning framework. It provides implementations of various reinforcement learning algorithms, allowing users to easily train and evaluate RL agents on different environments. The library is designed to be modular and extensible, making it suitable for both research and practical applications.

Pros

  • Easy integration with Keras and TensorFlow
  • Implements a wide range of popular RL algorithms
  • Supports both discrete and continuous action spaces
  • Highly customizable and extensible architecture

Cons

  • Limited documentation and examples
  • Not actively maintained (last update was in 2019)
  • May not be compatible with the latest versions of Keras and TensorFlow (see the version-pinning note after this list)
  • Lacks some advanced RL techniques and algorithms
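
If you run into incompatibilities with recent TensorFlow releases, one workaround, assuming you are using the original keras-rl rather than the keras-rl2 fork, is to pin an older standalone Keras and a 1.x TensorFlow. The exact versions below are illustrative rather than officially documented:

pip install "tensorflow<2.0" "keras>=2.0.7,<2.4" keras-rl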

Code Examples

  1. Creating and training a DQN agent:
from keras.optimizers import Adam

from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

model = create_model(num_actions)  # any Keras model mapping observations to Q-values
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=num_actions, memory=memory, nb_steps_warmup=10,
               target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)  # env is a Gym environment
  2. Defining a custom environment:
from rl.core import Env

class CustomEnv(Env):
    def step(self, action):
        # Implement environment dynamics
        return next_state, reward, done, {}

    def reset(self):
        # Reset environment to initial state
        return initial_state

    def render(self, mode='human'):
        # Implement rendering logic
        pass
  3. Using a pre-trained agent:
from rl.agents import DQNAgent

# Rebuild the agent around the same model architecture, then load previously saved weights.
dqn = DQNAgent(model=model, nb_actions=num_actions, memory=memory, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.load_weights('dqn_weights.h5f')
dqn.test(env, nb_episodes=5, visualize=True)
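
The weights file loaded above would come from an earlier training run; saving after fit() follows the pattern used in keras-rl's bundled examples:

# Persist the learned weights so they can be reloaded later.
dqn.save_weights('dqn_weights.h5f', overwrite=True)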

Getting Started

To get started with keras-rl, follow these steps:

  1. Install the library:
pip install keras-rl
  2. Import required modules:
import gym
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam

from rl.agents.dqn import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
  3. Create an environment and model:
env = gym.make('CartPole-v0')
model = Sequential([
    Flatten(input_shape=(1,) + env.observation_space.shape),
    Dense(16),
    Activation('relu'),
    Dense(16),
    Activation('relu'),
    Dense(16),
    Activation('relu'),
    Dense(env.action_space.n),
    Activation('linear')
])
  4. Create and train an agent:
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=env.action_space.n, memory=memory, nb_steps_warmup=10,
               target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
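
  5. Evaluate the trained agent (optional; this mirrors the test step in the repository's dqn_cartpole example):
# Run a few evaluation episodes and render them to the screen.
dqn.test(env, nb_episodes=5, visualize=True)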

Competitor Comparisons

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Pros of baselines

  • More comprehensive set of RL algorithms implemented
  • Better documentation and examples
  • Active development and maintenance by OpenAI

Cons of baselines

  • Steeper learning curve for beginners
  • Less focus on integration with Keras

Code Comparison

baselines:

import gym
from baselines import deepq
from baselines.common.atari_wrappers import wrap_deepmind

env = wrap_deepmind(gym.make("PongNoFrameskip-v4"))
model = deepq.learn(env, network='cnn', total_timesteps=100000)

keras-rl:

from rl.agents.dqn import DQNAgent
from rl.policy import EpsGreedyQPolicy
from rl.memory import SequentialMemory

model = create_model(input_shape, nb_actions)  # user-defined Keras model builder
memory = SequentialMemory(limit=50000, window_length=1)
policy = EpsGreedyQPolicy()
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, policy=policy)

Both repositories provide implementations of reinforcement learning algorithms, but baselines offers a wider range of algorithms and more extensive documentation. However, keras-rl may be more suitable for those already familiar with Keras and looking for a simpler integration. The code examples demonstrate the different approaches to implementing a DQN agent in each library.

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Pros of stable-baselines

  • More comprehensive and actively maintained library
  • Supports a wider range of algorithms and environments
  • Better documentation and community support

Cons of stable-baselines

  • Steeper learning curve for beginners
  • Requires more computational resources for some algorithms

Code Comparison

stable-baselines (shown here with the stable-baselines3 API):

import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv

env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)

keras-rl:

from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam

from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))

memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
               target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=2)

The code comparison shows that stable-baselines offers a more concise and straightforward implementation, while keras-rl requires more manual setup and configuration.

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Pros of TF-Agents

  • More comprehensive and actively maintained
  • Better integration with TensorFlow ecosystem
  • Supports both TensorFlow and TensorFlow Probability

Cons of TF-Agents

  • Steeper learning curve for beginners
  • More complex setup and configuration
  • Potentially slower development for simple projects

Code Comparison

keras-rl:

from keras.optimizers import Adam

from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
               target_model_update=1e-2, policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])

TF-Agents:

import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.networks import q_network
from tf_agents.replay_buffers import tf_uniform_replay_buffer

q_net = q_network.QNetwork(train_env.observation_spec(), train_env.action_spec())
agent = dqn_agent.DqnAgent(train_env.time_step_spec(), train_env.action_spec(), q_network=q_net,
                           optimizer=tf.compat.v1.train.AdamOptimizer(learning_rate=1e-3))

PFRL: a PyTorch-based deep reinforcement learning library

Pros of pfrl

  • More comprehensive and up-to-date implementation of reinforcement learning algorithms
  • Better support for distributed training and parallel environments
  • More extensive documentation and examples

Cons of pfrl

  • Steeper learning curve due to its more complex architecture
  • Less integration with Keras, which may be a drawback for Keras users
  • Requires more setup and configuration compared to keras-rl

Code Comparison

keras-rl example:

from keras.optimizers import Adam

from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

model = create_model()  # user-defined Keras model
memory = SequentialMemory(limit=50000, window_length=1)
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)

pfrl example:

import torch
import pfrl

q_func = create_q_function()  # user-defined PyTorch Q-network
optimizer = torch.optim.Adam(q_func.parameters(), lr=1e-3)
replay_buffer = pfrl.replay_buffers.ReplayBuffer(capacity=10**6)
explorer = pfrl.explorers.LinearDecayEpsilonGreedy(start_epsilon=1.0, end_epsilon=0.1, decay_steps=50000,
                                                   random_action_func=env.action_space.sample)
agent = pfrl.agents.DoubleDQN(q_func, optimizer, replay_buffer, gamma=0.99, explorer=explorer)
pfrl.experiments.train_agent_with_evaluation(agent, env, steps=50000, eval_n_steps=None, eval_n_episodes=10,
                                             eval_interval=5000, outdir='results')

A toolkit for reproducible reinforcement learning research.

Pros of garage

  • More comprehensive and flexible framework for RL research
  • Supports a wider range of algorithms and environments
  • Better documentation and examples for advanced users

Cons of garage

  • Steeper learning curve for beginners
  • Less focus on simplicity and ease of use
  • Requires more setup and configuration

Code Comparison

keras-rl:

from keras.optimizers import Adam

from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory

dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
               target_model_update=1e-2, policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])

garage:

from garage import wrap_experiment
from garage.tf.algos import TRPO
from garage.tf.policies import GaussianMLPPolicy

@wrap_experiment
def trpo_cartpole(ctxt=None):
    # env and baseline are assumed to be constructed earlier in the experiment
    policy = GaussianMLPPolicy(env_spec=env.spec, hidden_sizes=(32, 32))
    algo = TRPO(env=env, policy=policy, baseline=baseline, max_path_length=100)

README

Deep Reinforcement Learning for Keras

What is it?

keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras.

Furthermore, keras-rl works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy.

Of course you can extend keras-rl according to your own needs. You can use built-in Keras callbacks and metrics or define your own. What's more, it is easy to implement your own environments and even algorithms by extending a few simple abstract classes. Documentation is available online.
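
One such extension point is rl.core.Processor, which sits between the agent and the environment and lets you transform observations, rewards, and actions. A minimal sketch (the casting and clipping choices here are illustrative, not prescribed by the library):

import numpy as np
from rl.core import Processor

class ClippedRewardProcessor(Processor):
    # Hypothetical processor: cast observations to float32 and clip rewards to [-1, 1]
    # before they reach the agent.
    def process_observation(self, observation):
        return np.asarray(observation, dtype='float32')

    def process_reward(self, reward):
        return np.clip(reward, -1.0, 1.0)

# Pass the processor to any agent, e.g. DQNAgent(..., processor=ClippedRewardProcessor()).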

What is included?

As of today, the following algorithms have been implemented:

  • Deep Q Learning (DQN) [1], [2]
  • Double DQN [3]
  • Deep Deterministic Policy Gradient (DDPG) [4]
  • Continuous DQN (CDQN or NAF) [6]
  • Cross-Entropy Method (CEM) [7], [8]
  • Dueling network DQN (Dueling DQN) [9]
  • Deep SARSA [10]
  • Asynchronous Advantage Actor-Critic (A3C) [5]
  • Proximal Policy Optimization Algorithms (PPO) [11]

You can find more information on each agent in the doc.
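
The continuous-action agents follow the same workflow as DQN but take separate actor and critic networks. A minimal DDPG sketch, loosely modeled on the repository's ddpg_pendulum example (env, nb_actions, and the actor, critic, and action_input models are assumed to be defined beforehand):

from keras.optimizers import Adam

from rl.agents import DDPGAgent
from rl.memory import SequentialMemory
from rl.random import OrnsteinUhlenbeckProcess

# Exploration noise for continuous actions and a standard replay memory.
memory = SequentialMemory(limit=100000, window_length=1)
random_process = OrnsteinUhlenbeckProcess(size=nb_actions, theta=.15, mu=0., sigma=.3)

agent = DDPGAgent(nb_actions=nb_actions, actor=actor, critic=critic, critic_action_input=action_input,
                  memory=memory, nb_steps_warmup_critic=100, nb_steps_warmup_actor=100,
                  random_process=random_process, gamma=.99, target_model_update=1e-3)
agent.compile(Adam(lr=1e-3, clipnorm=1.), metrics=['mae'])
agent.fit(env, nb_steps=50000, visualize=False, verbose=1)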

Installation

  • Install Keras-RL from PyPI (recommended):
pip install keras-rl
  • Install from Github source:
git clone https://github.com/keras-rl/keras-rl.git
cd keras-rl
python setup.py install

Examples

If you want to run the examples, you'll also have to install OpenAI Gym (pip install gym) and h5py (pip install h5py).

For the Atari example you will also need:

  • Pillow: pip install Pillow
  • gym[atari]: Atari module for gym. Use pip install gym[atari]

Once you have installed everything, you can try out a simple example:

python examples/dqn_cartpole.py

This is a very simple example and it should converge relatively quickly, so it's a great way to get started! It also visualizes the game during training, so you can watch it learn. How cool is that?

Some sample weights are available on keras-rl-weights.

If you have questions or problems, please file an issue or, even better, fix the problem yourself and submit a pull request!

External Projects

You're using Keras-RL on a project? Open a PR and share it!

Visualizing Training Metrics

To see graphs of your training progress and compare across runs, run pip install wandb and add the WandbLogger callback to your agent's fit() call:

from rl.callbacks import WandbLogger

...

agent.fit(env, nb_steps=50000, callbacks=[WandbLogger()])

For more info and options, see the W&B docs.

Citing

If you use keras-rl in your research, you can cite it as follows:

@misc{plappert2016kerasrl,
    author = {Matthias Plappert},
    title = {keras-rl},
    year = {2016},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/keras-rl/keras-rl}},
}

References

  1. Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
  2. Human-level control through deep reinforcement learning, Mnih et al., 2015
  3. Deep Reinforcement Learning with Double Q-learning, van Hasselt et al., 2015
  4. Continuous control with deep reinforcement learning, Lillicrap et al., 2015
  5. Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016
  6. Continuous Deep Q-Learning with Model-based Acceleration, Gu et al., 2016
  7. Learning Tetris Using the Noisy Cross-Entropy Method, Szita et al., 2006
  8. Deep Reinforcement Learning (MLSS lecture notes), Schulman, 2016
  9. Dueling Network Architectures for Deep Reinforcement Learning, Wang et al., 2016
  10. Reinforcement learning: An introduction, Sutton and Barto, 2011
  11. Proximal Policy Optimization Algorithms, Schulman et al., 2017