Top Related Projects
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Stable Baselines: a fork of OpenAI Baselines with implementations of reinforcement learning algorithms
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
PFRL: a PyTorch-based deep reinforcement learning library
garage: a toolkit for reproducible reinforcement learning research.
Quick Overview
keras-rl is a deep reinforcement learning library that extends Keras, a popular deep learning framework. It provides implementations of various reinforcement learning algorithms, allowing users to easily train and evaluate RL agents on different environments. The library is designed to be modular and extensible, making it suitable for both research and practical applications.
Pros
- Easy integration with Keras and TensorFlow
- Implements a wide range of popular RL algorithms
- Supports both discrete and continuous action spaces
- Highly customizable and extensible architecture
Cons
- Limited documentation and examples
- Not actively maintained (last update was in 2019)
- May not be compatible with the latest versions of Keras and TensorFlow
- Lacks some advanced RL techniques and algorithms
Code Examples
- Creating and training a DQN agent:
from keras.optimizers import Adam
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
# create_model, num_actions, and env are assumed to be user-defined
# (see the Getting Started section below for a concrete model and environment)
model = create_model(num_actions)
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=num_actions, memory=memory, nb_steps_warmup=10,
               target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
- Defining a custom environment:
from rl.core import Env

class CustomEnv(Env):
    def step(self, action):
        # Implement environment dynamics
        return next_state, reward, done, {}

    def reset(self):
        # Reset environment to initial state
        return initial_state

    def render(self, mode='human'):
        # Implement rendering logic
        pass
- Using a pre-trained agent:
from rl.agents import DQNAgent
# Rebuild the agent with the same model and configuration used for training,
# then restore the saved weights (keras-rl saves and loads weights, not whole agents)
dqn = DQNAgent(model=model, nb_actions=num_actions, memory=memory, policy=policy)
dqn.compile(Adam(lr=1e-3))
dqn.load_weights('dqn_weights.h5f')
dqn.test(env, nb_episodes=5, visualize=True)
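The weights file used above can be produced after training by calling the agent's save_weights() method (the file name is just an example):
dqn.save_weights('dqn_weights.h5f', overwrite=True)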
Getting Started
To get started with keras-rl, follow these steps:
- Install the library:
pip install keras-rl
- Import required modules:
import gym
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
- Create an environment and model:
env = gym.make('CartPole-v0')
model = Sequential([
Flatten(input_shape=(1,) + env.observation_space.shape),
Dense(16),
Activation('relu'),
Dense(16),
Activation('relu'),
Dense(16),
Activation('relu'),
Dense(env.action_space.n),
Activation('linear')
])
- Create and train an agent:
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=env.action_space.n, memory=memory, nb_steps_warmup=10,
target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
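- Evaluate and save the trained agent (test() and save_weights() are part of the keras-rl agent API; the weights file name below is just an example):
dqn.save_weights('dqn_cartpole_weights.h5f', overwrite=True)
dqn.test(env, nb_episodes=5, visualize=True)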
Competitor Comparisons
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
Pros of baselines
- More comprehensive set of RL algorithms implemented
- Better documentation and examples
- Active development and maintenance by OpenAI
Cons of baselines
- Steeper learning curve for beginners
- Less focus on integration with Keras
Code Comparison
baselines:
import gym
from baselines import deepq
from baselines.common.atari_wrappers import wrap_deepmind
env = wrap_deepmind(gym.make("PongNoFrameskip-v4"))
model = deepq.learn(env, network='cnn', total_timesteps=100000)
keras-rl:
from rl.agents.dqn import DQNAgent
from rl.policy import EpsGreedyQPolicy
from rl.memory import SequentialMemory
model = create_model(input_shape, nb_actions)  # user-defined helper returning a Keras model
memory = SequentialMemory(limit=50000, window_length=1)
policy = EpsGreedyQPolicy()
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, policy=policy)
Both repositories provide implementations of reinforcement learning algorithms, but baselines offers a wider range of algorithms and more extensive documentation. However, keras-rl may be more suitable for those already familiar with Keras and looking for a simpler integration. The code examples demonstrate the different approaches to implementing a DQN agent in each library.
Stable Baselines: a fork of OpenAI Baselines with implementations of reinforcement learning algorithms
Pros of stable-baselines
- More comprehensive and actively maintained library
- Supports a wider range of algorithms and environments
- Better documentation and community support
Cons of stable-baselines
- Steeper learning curve for beginners
- Requires more computational resources for some algorithms
Code Comparison
stable-baselines:
import gym
from stable_baselines3 import PPO
from stable_baselines3.common.vec_env import DummyVecEnv
env = DummyVecEnv([lambda: gym.make("CartPole-v1")])
model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10000)
keras-rl:
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten
from keras.optimizers import Adam
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
# env and nb_actions are assumed to be defined, e.g. as in the Getting Started section
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))
memory = SequentialMemory(limit=50000, window_length=1)
policy = BoltzmannQPolicy()
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
target_model_update=1e-2, policy=policy)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=2)
The code comparison shows that stable-baselines offers a more concise and straightforward implementation, while keras-rl requires more manual setup and configuration.
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Pros of TF-Agents
- More comprehensive and actively maintained
- Better integration with TensorFlow ecosystem
- Supports both TensorFlow and TensorFlow Probability
Cons of TF-Agents
- Steeper learning curve for beginners
- More complex setup and configuration
- Potentially slower development for simple projects
Code Comparison
keras-rl:
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
target_model_update=1e-2, policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
TF-Agents:
import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.networks import q_network
from tf_agents.replay_buffers import tf_uniform_replay_buffer
q_net = q_network.QNetwork(train_env.observation_spec(), train_env.action_spec())
agent = dqn_agent.DqnAgent(train_env.time_step_spec(), train_env.action_spec(), q_network=q_net,
optimizer=tf.compat.v1.train.AdamOptimizer(learning_rate=1e-3))
PFRL: a PyTorch-based deep reinforcement learning library
Pros of pfrl
- More comprehensive and up-to-date implementation of reinforcement learning algorithms
- Better support for distributed training and parallel environments
- More extensive documentation and examples
Cons of pfrl
- Steeper learning curve due to its more complex architecture
- Less integration with Keras, which may be a drawback for Keras users
- Requires more setup and configuration compared to keras-rl
Code Comparison
keras-rl example:
from keras.optimizers import Adam
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
model = create_model()  # user-defined Keras model
memory = SequentialMemory(limit=50000, window_length=1)
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
pfrl example:
import torch
import pfrl
q_func = create_q_function()  # user-defined PyTorch Q-network
optimizer = torch.optim.Adam(q_func.parameters(), lr=1e-3)
replay_buffer = pfrl.replay_buffers.ReplayBuffer(capacity=10**6)
explorer = pfrl.explorers.LinearDecayEpsilonGreedy(start_epsilon=1.0, end_epsilon=0.1, decay_steps=50000,
                                                   random_action_func=env.action_space.sample)
agent = pfrl.agents.DoubleDQN(q_func, optimizer, replay_buffer, gamma=0.99, explorer=explorer)
pfrl.experiments.train_agent_with_evaluation(agent, env, steps=50000, eval_n_steps=None, eval_n_episodes=10,
                                             eval_interval=5000, outdir='result')
garage: a toolkit for reproducible reinforcement learning research.
Pros of garage
- More comprehensive and flexible framework for RL research
- Supports a wider range of algorithms and environments
- Better documentation and examples for advanced users
Cons of garage
- Steeper learning curve for beginners
- Less focus on simplicity and ease of use
- Requires more setup and configuration
Code Comparison
keras-rl:
from rl.agents import DQNAgent
from rl.policy import BoltzmannQPolicy
from rl.memory import SequentialMemory
dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, nb_steps_warmup=10,
target_model_update=1e-2, policy=BoltzmannQPolicy())
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
garage:
from garage import wrap_experiment
from garage.tf.algos import TRPO
from garage.tf.policies import GaussianMLPPolicy

@wrap_experiment
def trpo_cartpole(ctxt=None):
    # env and baseline are assumed to be constructed inside the experiment
    policy = GaussianMLPPolicy(env_spec=env.spec, hidden_sizes=(32, 32))
    algo = TRPO(env=env, policy=policy, baseline=baseline, max_path_length=100)
README
Deep Reinforcement Learning for Keras
What is it?
keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras.
Furthermore, keras-rl works with OpenAI Gym out of the box. This means that evaluating and playing around with different algorithms is easy.
Of course you can extend keras-rl according to your own needs. You can use built-in Keras callbacks and metrics or define your own. It is also easy to implement your own environments and even algorithms by extending some simple abstract classes. Documentation is available online.
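As one example of those abstract classes, a custom Processor from rl.core can transform observations and rewards before they reach the agent. The following is a minimal sketch; the class name and clipping values are illustrative, and the agent is assumed to be built as in the examples above:
import numpy as np
from rl.core import Processor

class ClippingProcessor(Processor):
    def process_observation(self, observation):
        # Cast observations to a compact dtype before they are stored in memory
        return np.asarray(observation, dtype=np.float32)

    def process_reward(self, reward):
        # Clip rewards to [-1, 1], as is commonly done for Atari-style training
        return np.clip(reward, -1., 1.)

# The processor is passed to the agent at construction time:
# dqn = DQNAgent(model=model, nb_actions=nb_actions, memory=memory, policy=policy,
#                processor=ClippingProcessor())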
What is included?
As of today, the following algorithms have been implemented:
- Deep Q Learning (DQN) [1], [2]
- Double DQN [3]
- Deep Deterministic Policy Gradient (DDPG) [4]
- Continuous DQN (CDQN or NAF) [6]
- Cross-Entropy Method (CEM) [7], [8]
- Dueling network DQN (Dueling DQN) [9]
- Deep SARSA [10]
- Asynchronous Advantage Actor-Critic (A3C) [5]
- Proximal Policy Optimization Algorithms (PPO) [11]
You can find more information on each agent in the doc.
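For continuous action spaces, agents such as DDPG follow the same pattern as the DQN examples above. Here is a minimal sketch based on the keras-rl DDPG agent; the actor and critic are assumed to be user-defined Keras models, critic_action_input is the critic's action input tensor, and nb_actions is the dimensionality of the action space:
from keras.optimizers import Adam
from rl.agents import DDPGAgent
from rl.memory import SequentialMemory
from rl.random import OrnsteinUhlenbeckProcess

# actor, critic, critic_action_input, nb_actions, and env are assumed to be defined by the user
memory = SequentialMemory(limit=100000, window_length=1)
random_process = OrnsteinUhlenbeckProcess(size=nb_actions, theta=.15, mu=0., sigma=.3)
agent = DDPGAgent(nb_actions=nb_actions, actor=actor, critic=critic,
                  critic_action_input=critic_action_input, memory=memory,
                  nb_steps_warmup_critic=100, nb_steps_warmup_actor=100,
                  random_process=random_process, gamma=.99, target_model_update=1e-3)
agent.compile(Adam(lr=1e-3, clipnorm=1.), metrics=['mae'])
agent.fit(env, nb_steps=50000, visualize=False, verbose=1)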
Installation
- Install Keras-RL from PyPI (recommended):
pip install keras-rl
- Install from Github source:
git clone https://github.com/keras-rl/keras-rl.git
cd keras-rl
python setup.py install
Examples
If you want to run the examples, you'll also have to install:
- gym by OpenAI: Installation instruction
- h5py: simply run
pip install h5py
For the Atari example you will also need:
- Pillow:
pip install Pillow
- gym[atari]: Atari module for gym. Use
pip install gym[atari]
Once you have installed everything, you can try out a simple example:
python examples/dqn_cartpole.py
This is a very simple example and it should converge relatively quickly, so it's a great way to get started! It also visualizes the game during training, so you can watch it learn. How cool is that?
Some sample weights are available on keras-rl-weights.
If you have questions or problems, please file an issue or, even better, fix the problem yourself and submit a pull request!
External Projects
You're using Keras-RL on a project? Open a PR and share it!
Visualizing Training Metrics
To see graphs of your training progress and compare across runs, run pip install wandb and add the WandbLogger callback to your agent's fit() call:
from rl.callbacks import WandbLogger
...
agent.fit(env, nb_steps=50000, callbacks=[WandbLogger()])
For more info and options, see the W&B docs.
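keras-rl also ships its own callbacks in rl.callbacks. For example, FileLogger and ModelIntervalCheckpoint can be combined with (or used instead of) WandbLogger to write a JSON training log and periodic weight checkpoints; the file names below are just examples:
from rl.callbacks import FileLogger, ModelIntervalCheckpoint

callbacks = [
    ModelIntervalCheckpoint('dqn_weights_{step}.h5f', interval=10000),  # periodic weight checkpoints
    FileLogger('dqn_log.json', interval=100),  # JSON training log
]
agent.fit(env, nb_steps=50000, callbacks=callbacks)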
Citing
If you use keras-rl in your research, you can cite it as follows:
@misc{plappert2016kerasrl,
author = {Matthias Plappert},
title = {keras-rl},
year = {2016},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/keras-rl/keras-rl}},
}
References
1. Playing Atari with Deep Reinforcement Learning, Mnih et al., 2013
2. Human-level control through deep reinforcement learning, Mnih et al., 2015
3. Deep Reinforcement Learning with Double Q-learning, van Hasselt et al., 2015
4. Continuous control with deep reinforcement learning, Lillicrap et al., 2015
5. Asynchronous Methods for Deep Reinforcement Learning, Mnih et al., 2016
6. Continuous Deep Q-Learning with Model-based Acceleration, Gu et al., 2016
7. Learning Tetris Using the Noisy Cross-Entropy Method, Szita et al., 2006
8. Deep Reinforcement Learning (MLSS lecture notes), Schulman, 2016
9. Dueling Network Architectures for Deep Reinforcement Learning, Wang et al., 2016
10. Reinforcement learning: An introduction, Sutton and Barto, 2011
11. Proximal Policy Optimization Algorithms, Schulman et al., 2017