
google-deepmind / acme

A library of reinforcement learning components and agents


Top Related Projects

15,630 stars

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

2,774 stars

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

2,194 stars

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

32,953 stars

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

1,178 stars

PFRL: a PyTorch-based deep reinforcement learning library

Quick Overview

Acme is an open-source research framework for reinforcement learning developed by DeepMind. It provides a collection of building blocks for RL agents and algorithms, designed to be flexible, scalable, and easy to use. Acme supports both single-process and distributed training, making it suitable for a wide range of research applications.

Pros

  • Modular design allows for easy customization and experimentation
  • Supports both TensorFlow and JAX backends
  • Includes implementations of popular RL algorithms (e.g., DQN, PPO, SAC)
  • Designed for scalability, from single-process to distributed training

Cons

  • Steeper learning curve compared to some other RL libraries
  • Documentation could be more comprehensive for beginners
  • Limited support for environments outside of OpenAI Gym and DeepMind's dm_control
  • Requires familiarity with TensorFlow or JAX
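The TensorFlow/JAX split mentioned above shows up directly in the package layout: each backend has its own agents subpackage, and you pick one by import path. A minimal illustration (module paths only; agent construction is covered in the examples below):

# The backend is selected by the import path; both subpackages expose agent implementations.
from acme.agents.tf import dqn    # TensorFlow/Sonnet-based agents
from acme.agents.jax import ppo   # JAX-based agents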

Code Examples

  1. Creating a DQN agent:
import acme
import gym
import sonnet as snt

from acme import specs
from acme import wrappers
from acme.agents.tf import dqn

# Create an environment (wrapped so it exposes the dm_env interface Acme expects)
# and get its spec.
environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v1')))
environment_spec = specs.make_environment_spec(environment)

# A small Sonnet MLP Q-network sized to the action spec.
network = snt.Sequential([
    snt.Flatten(),
    snt.nets.MLP([64, 64, environment_spec.actions.num_values]),
])

# Create a DQN agent.
agent = dqn.DQN(
    environment_spec=environment_spec,
    network=network,
    batch_size=256,
    samples_per_insert=2.0,
    min_replay_size=1000,
    max_replay_size=1000000,
    learning_rate=1e-3,
)
  2. Running a training loop:
# Create a loop to train the agent
loop = acme.EnvironmentLoop(environment, agent)

# Run the training loop
loop.run(num_episodes=1000)
  3. Using a JAX-based agent:
from acme.agents.jax import ppo

# Create a PPO agent using the JAX backend. Constructor and helper names vary a
# little between Acme versions; see acme/agents/jax/ppo for the current API.
networks = ppo.make_ppo_networks(environment_spec)
agent = ppo.PPO(environment_spec, networks)
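The JAX-based agent plugs into the same EnvironmentLoop used in the second example, since agents for both backends implement Acme's actor interface; for instance:

# The loop is backend-agnostic: it only needs a dm_env environment and an Acme actor.
loop = acme.EnvironmentLoop(environment, agent)
loop.run(num_episodes=1000)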

Getting Started

To get started with Acme, follow these steps:

  1. Install Acme and its dependencies:

    pip install dm-acme[jax,tf,envs]
    
  2. Import the necessary modules:

    import acme
    import gym
    import sonnet as snt
    from acme import specs
    from acme import wrappers
    from acme.agents.tf import dqn
    
  3. Create an environment and agent:

    environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v1')))
    environment_spec = specs.make_environment_spec(environment)
    network = snt.Sequential([snt.Flatten(),
                              snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
    agent = dqn.DQN(environment_spec=environment_spec, network=network)
    
  4. Run a training loop:

    loop = acme.EnvironmentLoop(environment, agent)
    loop.run(num_episodes=100)
    

This basic example sets up a DQN agent to train on the CartPole environment. You can customize the agent, environment, and training parameters to suit your specific research needs.
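If you want to watch learning progress episode by episode, recent Acme versions also let you step the loop manually; the sketch below assumes the loop built in step 4 and treats the exact result keys as version-dependent:

    # run_episode() returns per-episode logging data (e.g. episode return and length);
    # the exact keys depend on the Acme version.
    for _ in range(10):
        result = loop.run_episode()
        print(result)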

Competitor Comparisons

15,630 stars

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Pros of Baselines

  • Simpler and more straightforward implementation, making it easier for beginners to understand and use
  • Focuses on classic RL algorithms, providing a solid foundation for learning and experimentation
  • Includes a wider range of environments out-of-the-box, particularly Atari games

Cons of Baselines

  • Less actively maintained, with fewer recent updates compared to Acme
  • Limited support for more advanced or cutting-edge RL techniques
  • Less modular architecture, making it harder to extend or customize algorithms

Code Comparison

Baselines (example of running DQN):

from baselines import deepq
from baselines.common.atari_wrappers import make_atari, wrap_deepmind

# Standard Atari preprocessing, then let deepq.learn build a CNN Q-network.
env = wrap_deepmind(make_atari('PongNoFrameskip-v4'), frame_stack=True)
model = deepq.learn(env, network='cnn', print_freq=10, total_timesteps=100000)

Acme (example of running DQN):

import gym
import sonnet as snt

from acme import environment_loop, specs, wrappers
from acme.agents.tf import dqn

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
# For pixel observations you would use acme.tf.networks.DQNAtariNetwork instead.
network = snt.Sequential([snt.Flatten(),
                          snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
agent = dqn.DQN(environment_spec, network)
loop = environment_loop.EnvironmentLoop(environment, agent)
loop.run(num_episodes=10)

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Pros of Stable-Baselines

  • Easier to use and more beginner-friendly
  • Provides pre-trained models and ready-to-use algorithms
  • Better documentation and tutorials for quick start

Cons of Stable-Baselines

  • Less flexible and customizable compared to Acme
  • Fewer advanced features and research-oriented tools
  • Limited support for distributed training and multi-agent scenarios

Code Comparison

Stable-Baselines:

from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10000)

Acme:

import gym
import sonnet as snt
from acme import environment_loop, specs, wrappers
from acme.agents.tf import dqn

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v1')))
environment_spec = specs.make_environment_spec(environment)
network = snt.Sequential([snt.Flatten(),
                          snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
agent = dqn.DQN(environment_spec=environment_spec, network=network)
loop = environment_loop.EnvironmentLoop(environment, agent)
loop.run(num_episodes=10)

Stable-Baselines offers a more straightforward API for common use cases, while Acme provides a modular framework for more complex and customized reinforcement learning experiments. Acme's design allows for greater flexibility in research settings, but may require more setup and understanding of RL concepts.
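To make the "straightforward API" point concrete, evaluating the Stable-Baselines3 model trained above takes only a couple of lines using its built-in helper:

from stable_baselines3.common.evaluation import evaluate_policy

# Average return over 10 evaluation episodes of the model trained above.
mean_reward, std_reward = evaluate_policy(model, model.get_env(), n_eval_episodes=10)
print(f"mean reward: {mean_reward:.1f} +/- {std_reward:.1f}")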

2,774 stars

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Pros of TF-Agents

  • Tighter integration with TensorFlow ecosystem
  • More extensive documentation and tutorials
  • Broader community support and contributions

Cons of TF-Agents

  • Less flexible for non-TensorFlow backends
  • Potentially steeper learning curve for beginners
  • Slower development cycle compared to Acme

Code Comparison

Acme (using JAX):

from acme import specs
from acme.agents.jax import dqn

environment = ...  # any dm_env.Environment
environment_spec = specs.make_environment_spec(environment)
network = ...      # a Q-network matching the spec; see acme/agents/jax/dqn
agent = dqn.DQN(environment_spec, network)

TF-Agents:

import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import tf_py_environment
from tf_agents.networks import q_network

py_env = ...  # any PyEnvironment
tf_env = tf_py_environment.TFPyEnvironment(py_env)
# DqnAgent also needs a Q-network and an optimizer.
q_net = q_network.QNetwork(tf_env.observation_spec(), tf_env.action_spec())
agent = dqn_agent.DqnAgent(tf_env.time_step_spec(), tf_env.action_spec(),
                           q_network=q_net,
                           optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3))

Both libraries offer similar functionality for implementing reinforcement learning agents, but with different syntax and underlying frameworks. Acme focuses on flexibility and research-oriented design, while TF-Agents provides a more structured approach within the TensorFlow ecosystem.
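As a sketch of TF-Agents' more structured approach, data collection is usually wired up explicitly with a replay buffer and a driver (names as in current tf_agents releases; details can vary by version), building on the agent and tf_env defined above:

from tf_agents.drivers import dynamic_step_driver
from tf_agents.replay_buffers import tf_uniform_replay_buffer

# Store collected transitions and drive the environment with the agent's collect policy.
replay_buffer = tf_uniform_replay_buffer.TFUniformReplayBuffer(
    agent.collect_data_spec, batch_size=tf_env.batch_size, max_length=10000)
driver = dynamic_step_driver.DynamicStepDriver(
    tf_env, agent.collect_policy,
    observers=[replay_buffer.add_batch], num_steps=1)
driver.run()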

2,194 stars

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Pros of RL

  • Simpler and more lightweight, focusing on core RL algorithms
  • Tighter integration with PyTorch ecosystem
  • More accessible for beginners and researchers

Cons of RL

  • Less comprehensive feature set compared to Acme
  • Fewer advanced algorithms and environments
  • Less active development and maintenance

Code Comparison

RL:

from torchrl.envs import GymEnv

# TorchRL composes training from primitives (collectors, replay buffers, losses);
# environment setup and a random-policy rollout are one-liners.
env = GymEnv('CartPole-v1')
rollout = env.rollout(max_steps=10)

Acme:

import gym
import sonnet as snt
from acme import specs, wrappers
from acme.agents.tf import dqn

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
agent = dqn.DQN(
    environment_spec=environment_spec,
    network=snt.Sequential([snt.Flatten(),
                            snt.nets.MLP([64, 64, environment_spec.actions.num_values])]),
    batch_size=32,
    samples_per_insert=2.0,
    min_replay_size=100)

The RL code is more concise and straightforward, while Acme offers more granular control and configuration options. RL is better suited for quick prototyping and experimentation, whereas Acme provides a more robust framework for complex research and production environments.

32,953 stars

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Pros of Ray

  • More versatile, supporting distributed computing beyond just reinforcement learning
  • Larger community and ecosystem, with more integrations and third-party libraries
  • Better suited for production environments and scaling to large clusters

Cons of Ray

  • Steeper learning curve due to its broader scope and more complex API
  • Less specialized for reinforcement learning tasks compared to Acme
  • Potentially more overhead for simple, single-machine RL experiments

Code Comparison

Acme (RL-focused):

# Imports as in the earlier Acme examples (acme, gym, specs, wrappers, dqn).
environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
network = ...  # a Q-network matching the spec, e.g. a small Sonnet MLP
agent = dqn.DQN(environment_spec, network)
loop = acme.EnvironmentLoop(environment, agent)
loop.run(num_episodes=10)

Ray (distributed computing):

import ray

ray.init()

@ray.remote
def train_model(data):
    model = Model()          # placeholder model class
    for batch in data:
        model.train(batch)
    return model

# Train one model per data shard in parallel across the Ray cluster.
results = ray.get([train_model.remote(data_shard) for data_shard in data_shards])

Ray offers a more general-purpose distributed computing framework, while Acme provides a more specialized toolkit for reinforcement learning research. Ray's flexibility comes at the cost of increased complexity, whereas Acme offers a more streamlined experience for RL-specific tasks.
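For reinforcement learning specifically, Ray's RLlib library sits on top of the same distributed runtime; a minimal sketch of training PPO with it (method names from recent Ray releases; the result dictionary's keys vary by version):

from ray.rllib.algorithms.ppo import PPOConfig

# Configure and build a PPO trainer for CartPole, then run a few training iterations.
algo = PPOConfig().environment("CartPole-v1").build()
for _ in range(3):
    result = algo.train()
    print(result.get("episode_reward_mean"))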

1,178 stars

PFRL: a PyTorch-based deep reinforcement learning library

Pros of PFRL

  • More focused on practical implementations of reinforcement learning algorithms
  • Provides a wider range of pre-implemented RL algorithms out-of-the-box
  • Better documentation and examples for quick start and implementation

Cons of PFRL

  • Less flexible architecture compared to Acme's modular design
  • Smaller community and fewer contributions from external developers
  • Limited support for distributed training and multi-agent scenarios

Code Comparison

PFRL example (PPO implementation):

import gym
import torch
from torch import nn
import pfrl

env = pfrl.wrappers.CastObservationToFloat32(gym.make('CartPole-v0'))
obs_size = env.observation_space.shape[0]
n_actions = env.action_space.n

# PPO in PFRL expects a model returning (action distribution, state value).
policy = nn.Sequential(
    nn.Linear(obs_size, 64), nn.Tanh(),
    nn.Linear(64, n_actions),
    pfrl.policies.SoftmaxCategoricalHead())
value_function = nn.Sequential(
    nn.Linear(obs_size, 64), nn.Tanh(),
    nn.Linear(64, 1))
model = pfrl.nn.Branched(policy, value_function)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
agent = pfrl.agents.PPO(model, opt, gpu=-1)  # gpu=-1 runs on CPU

Acme example (PPO implementation):

import gym
from acme import specs, wrappers
from acme.agents.jax import ppo

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
# Constructor names vary slightly across Acme versions; see acme/agents/jax/ppo.
networks = ppo.make_ppo_networks(environment_spec)
agent = ppo.PPO(environment_spec, networks)

Both libraries offer implementations of popular RL algorithms, but PFRL provides a more straightforward approach for quick implementation, while Acme offers a more modular and flexible architecture for advanced users and researchers.
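PFRL agents are driven with an explicit act/observe loop (or with the helpers in pfrl.experiments); a minimal interaction sketch using the env and agent defined above:

# Standard PFRL interaction loop: act, step the environment, let the agent observe.
obs = env.reset()
for _ in range(1000):
    action = agent.act(obs)
    obs, reward, done, _ = env.step(action)
    agent.observe(obs, reward, done, reset=False)
    if done:
        obs = env.reset()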


README

Acme: a research framework for reinforcement learning


Acme is a library of reinforcement learning (RL) building blocks that strives to expose simple, efficient, and readable agents. These agents first and foremost serve as reference implementations and as strong baselines for algorithm performance. However, the baseline agents exposed by Acme should also provide enough flexibility and simplicity that they can be used as a starting point for novel research. Finally, the building blocks of Acme are designed so that the agents can be run at multiple scales (e.g. single-stream vs. distributed agents).

Getting started

The quickest way to get started is to take a look at the detailed working code examples found in the examples subdirectory. These show how to instantiate a number of different agents and run them within a variety of environments. See the quickstart notebook for an even quicker dive into using a single agent. Even more detail on the internal construction of an agent can be found inside our tutorial notebook. Finally, a full description of Acme and its underlying components can be found in the documentation. More background information and details behind the design decisions can be found in our technical report.

NOTE: Acme is first and foremost a framework for RL research written by researchers, for researchers. We use it for our own work on a daily basis. So with that in mind, while we will make every attempt to keep everything in good working order, things may break occasionally. But if so we will make our best effort to fix them as quickly as possible!

Installation

To get up and running quickly just follow the steps below:

  1. While you can install Acme in your standard python environment, we strongly recommend using a Python virtual environment to manage your dependencies. This should help to avoid version conflicts and just generally make the installation process easier.

    python3 -m venv acme
    source acme/bin/activate
    pip install --upgrade pip setuptools wheel
    
  2. While the core dm-acme library can be pip installed directly, the set of dependencies included for installation is minimal. In particular, to run any of the included agents you will also need either JAX or TensorFlow depending on the agent. As a result we recommend installing these components as well, i.e.

    pip install dm-acme[jax,tf]
    
  3. Finally, to install a few example environments (including gym, dm_control, and bsuite):

    pip install dm-acme[envs]
    
  4. Installing from GitHub: if you're interested in running the bleeding-edge version of Acme, you can do so by cloning the Acme GitHub repository and then executing the following command from the main directory (where setup.py is located):

    pip install .[jax,tf,testing,envs]
    
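After installation, a quick import check confirms that the core library and the optional agent dependencies are visible (the agents.jax import only works if the [jax] extra was installed):

    # Minimal sanity check; the agent subpackages require the corresponding extras.
    import acme
    from acme import specs, wrappers
    from acme.agents.jax import ppo  # needs dm-acme[jax]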

Citing Acme

If you use Acme in your work, please cite the updated accompanying technical report:

@article{hoffman2020acme,
    title={Acme: A Research Framework for Distributed Reinforcement Learning},
    author={
        Matthew W. Hoffman and Bobak Shahriari and John Aslanides and
        Gabriel Barth-Maron and Nikola Momchev and Danila Sinopalnikov and
        Piotr Sta\'nczyk and Sabela Ramos and Anton Raichuk and
        Damien Vincent and L\'eonard Hussenot and Robert Dadashi and
        Gabriel Dulac-Arnold and Manu Orsini and Alexis Jacq and
        Johan Ferret and Nino Vieillard and Seyed Kamyar Seyed Ghasemipour and
        Sertan Girgin and Olivier Pietquin and Feryal Behbahani and
        Tamara Norman and Abbas Abdolmaleki and Albin Cassirer and
        Fan Yang and Kate Baumli and Sarah Henderson and Abe Friesen and
        Ruba Haroun and Alex Novikov and Sergio G\'omez Colmenarejo and
        Serkan Cabi and Caglar Gulcehre and Tom Le Paine and
        Srivatsan Srinivasan and Andrew Cowie and Ziyu Wang and Bilal Piot and
        Nando de Freitas
    },
    year={2020},
    journal={arXiv preprint arXiv:2006.00979},
    url={https://arxiv.org/abs/2006.00979},
}