google-deepmind/acme

A library of reinforcement learning components and agents

Top Related Projects

15,725

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

2,788

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

2,268

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

34,860

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

1,186

PFRL: a PyTorch-based deep reinforcement learning library

Quick Overview

Acme is an open-source research framework for reinforcement learning developed by DeepMind. It provides a collection of building blocks for RL agents and algorithms, designed to be flexible, scalable, and easy to use. Acme supports both single-process and distributed training, making it suitable for a wide range of research applications.

Pros

  • Modular design allows for easy customization and experimentation
  • Supports both TensorFlow and JAX backends
  • Includes implementations of popular RL algorithms (e.g., DQN, PPO, SAC)
  • Designed for scalability, from single-process to distributed training

Cons

  • Steeper learning curve compared to some other RL libraries
  • Documentation could be more comprehensive for beginners
  • Environments must expose the dm_env interface: OpenAI Gym environments need a wrapper, and other environment APIs need a custom adapter
  • Requires familiarity with TensorFlow or JAX

Code Examples

  1. Creating a DQN agent:
import acme
import gym
import sonnet as snt
from acme import specs
from acme import wrappers
from acme.agents.tf import dqn

# Wrap a Gym environment so it exposes the dm_env interface that Acme expects
environment = wrappers.GymWrapper(gym.make('CartPole-v1'))
environment = wrappers.SinglePrecisionWrapper(environment)
environment_spec = specs.make_environment_spec(environment)

# Build a small MLP Q-network with one output per discrete action
network = snt.Sequential([
    snt.Flatten(),
    snt.nets.MLP([64, 64, environment_spec.actions.num_values]),
])

# Create a DQN agent
agent = dqn.DQN(
    environment_spec=environment_spec,
    network=network,
    batch_size=256,
    samples_per_insert=2.0,
    min_replay_size=1000,
    max_replay_size=1000000,
    learning_rate=1e-3,
)
  2. Running a training loop:
# Create a loop that connects the environment and the agent
loop = acme.EnvironmentLoop(environment, agent)

# Run the training loop
loop.run(num_episodes=1000)
  3. Using a JAX-based agent:
from acme.agents.jax import ppo

# JAX agents are assembled from a config, networks, and a builder rather than
# a single agent class; exact field names may vary between Acme releases
config = ppo.PPOConfig(batch_size=256, learning_rate=3e-4)
networks = ppo.make_networks(environment_spec)
builder = ppo.PPOBuilder(config)
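
As a rough guide to how batch_size, samples_per_insert, and min_replay_size interact in the DQN example above, the sketch below estimates how many learner updates become available after a given number of replay inserts. It assumes that one update consumes batch_size samples and that sampling only starts once min_replay_size items are stored. The function name approx_learner_steps is hypothetical, and the real replay rate limiter allows some slack around this ratio, so treat the result as an approximation and not as Acme's exact behaviour.

```python
def approx_learner_steps(num_inserts, batch_size, samples_per_insert, min_replay_size):
    """Estimates learner updates allowed after `num_inserts` replay inserts.

    Hypothetical helper: it ignores the tolerance that a real rate limiter
    permits around the target sample-to-insert ratio.
    """
    if num_inserts <= min_replay_size:
        return 0  # Sampling has not started yet.
    # Each insert beyond the minimum size "pays for" samples_per_insert samples.
    samples_allowed = (num_inserts - min_replay_size) * samples_per_insert
    # One learner update consumes a full batch of samples.
    return int(samples_allowed // batch_size)


steps = approx_learner_steps(
    num_inserts=10_000,
    batch_size=256,
    samples_per_insert=2.0,
    min_replay_size=1_000,
)
print(steps)
```

Under these assumptions, raising samples_per_insert makes the learner reuse each transition more often, while raising batch_size reduces the number of updates per environment step.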

Getting Started

To get started with Acme, follow these steps:

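  1. Install Acme together with its TensorFlow, JAX, and example-environment extras:

    pip install dm-acme[jax,tf,envs]
    
  2. Import the necessary modules:

    import acme
    import gym
    import sonnet as snt
    from acme import specs
    from acme import wrappers
    from acme.agents.tf import dqn
    
  3. Create an environment and agent:

    environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v1')))
    environment_spec = specs.make_environment_spec(environment)
    network = snt.Sequential([snt.Flatten(), snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
    agent = dqn.DQN(environment_spec=environment_spec, network=network)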
    
  4. Run a training loop:

    loop = acme.EnvironmentLoop(environment, agent)
    loop.run(num_episodes=100)
    

This basic example sets up a DQN agent to train on the CartPole environment. You can customize the agent, environment, and training parameters to suit your specific research needs.
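
To make the role of the training loop more concrete, here is a minimal, dependency-free sketch of what an environment loop does conceptually: it alternates between asking the agent for an action, stepping the environment, and letting the agent observe the result. The classes CoinGuessEnvironment and RandomActor and the function run_loop are hypothetical stand-ins written for this illustration; they are not Acme's classes, and Acme's own interfaces pass dm_env timesteps instead of plain tuples.

```python
import random


class CoinGuessEnvironment:
    """Toy episodic environment: guess a coin flip at every step."""

    def __init__(self, episode_length=10, seed=0):
        self._episode_length = episode_length
        self._rng = random.Random(seed)
        self._step_count = 0

    def reset(self):
        self._step_count = 0
        return 0  # Dummy observation; this toy task has no state.

    def step(self, action):
        self._step_count += 1
        reward = 1.0 if action == self._rng.randint(0, 1) else 0.0
        done = self._step_count >= self._episode_length
        return 0, reward, done


class RandomActor:
    """Stand-in for an agent: acts at random and never learns."""

    def __init__(self, seed=1):
        self._rng = random.Random(seed)
        self.num_observed = 0

    def select_action(self, observation):
        return self._rng.randint(0, 1)

    def observe(self, action, reward, done):
        self.num_observed += 1  # A learning agent would store the transition.

    def update(self):
        pass  # A learning agent would take a learning step here.


def run_loop(environment, actor, num_episodes):
    """Runs whole episodes and returns the return of each episode."""
    episode_returns = []
    for _ in range(num_episodes):
        observation = environment.reset()
        done, episode_return = False, 0.0
        while not done:
            action = actor.select_action(observation)
            observation, reward, done = environment.step(action)
            actor.observe(action, reward, done)
            actor.update()
            episode_return += reward
        episode_returns.append(episode_return)
    return episode_returns


environment = CoinGuessEnvironment(episode_length=10)
actor = RandomActor()
episode_returns = run_loop(environment, actor, num_episodes=5)
print(episode_returns)
```

Keeping the loop separate from the agent is what lets the same agent code be reused with different environments or run lengths.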

Competitor Comparisons

15,725

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Pros of Baselines

  • Simpler and more straightforward implementation, making it easier for beginners to understand and use
  • Focuses on classic RL algorithms, providing a solid foundation for learning and experimentation
  • Includes a wider range of environments out-of-the-box, particularly Atari games

Cons of Baselines

  • Less actively maintained, with fewer recent updates compared to Acme
  • Limited support for more advanced or cutting-edge RL techniques
  • Less modular architecture, making it harder to extend or customize algorithms

Code Comparison

Baselines (example of running DQN):

from baselines import deepq
from baselines.common.atari_wrappers import make_atari

env = make_atari('PongNoFrameskip-v4')
model = deepq.learn(env, "mlp", print_freq=10, total_timesteps=100000)

Acme (example of running DQN):

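import gym
import sonnet as snt
from acme import environment_loop
from acme import specs
from acme import wrappers
from acme.agents.tf import dqn

# Gym environments must be wrapped to expose the dm_env interface
environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
# CartPole observations are small vectors, so use an MLP rather than the Atari conv net
network = snt.Sequential([snt.Flatten(), snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
agent = dqn.DQN(environment_spec, network)
loop = environment_loop.EnvironmentLoop(environment, agent)
loop.run(num_episodes=10)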

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Pros of Stable-Baselines

  • Easier to use and more beginner-friendly
  • Provides pre-trained models and ready-to-use algorithms
  • Better documentation and tutorials for quick start

Cons of Stable-Baselines

  • Less flexible and customizable compared to Acme
  • Fewer advanced features and research-oriented tools
  • Limited support for distributed training and multi-agent scenarios

Code Comparison

Stable-Baselines:

from stable_baselines3 import PPO

model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10000)

Acme:

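import gym
import sonnet as snt
from acme import environment_loop
from acme import specs
from acme import wrappers
from acme.agents.tf import dqn

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v1')))
environment_spec = specs.make_environment_spec(environment)
# DQN needs an explicit Q-network; a small MLP suffices for CartPole
network = snt.Sequential([snt.Flatten(), snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
agent = dqn.DQN(environment_spec=environment_spec, network=network)
loop = environment_loop.EnvironmentLoop(environment, agent)
loop.run(num_episodes=10)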

Stable-Baselines offers a more straightforward API for common use cases, while Acme provides a modular framework for more complex and customized reinforcement learning experiments. Acme's design allows for greater flexibility in research settings, but may require more setup and understanding of RL concepts.

2,788

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Pros of TF-Agents

  • Tighter integration with TensorFlow ecosystem
  • More extensive documentation and tutorials
  • Broader community support and contributions

Cons of TF-Agents

  • Less flexible for non-TensorFlow backends
  • Potentially steeper learning curve for beginners
  • Slower development cycle compared to Acme

Code Comparison

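Acme (using the TensorFlow DQN agent):

import acme
from acme import specs
from acme.agents.tf import dqn

environment = ...
network = ...  # a Sonnet module that maps observations to Q-values
environment_spec = specs.make_environment_spec(environment)
agent = dqn.DQN(environment_spec, network)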

TF-Agents:

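import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import tf_py_environment
from tf_agents.networks import q_network

py_env = ...
tf_env = tf_py_environment.TFPyEnvironment(py_env)
# DqnAgent requires a Q-network and an optimizer in addition to the specs
q_net = q_network.QNetwork(tf_env.observation_spec(), tf_env.action_spec())
agent = dqn_agent.DqnAgent(
    tf_env.time_step_spec(),
    tf_env.action_spec(),
    q_network=q_net,
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3))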

Both libraries offer similar functionality for implementing reinforcement learning agents, but with different syntax and underlying frameworks. Acme focuses on flexibility and research-oriented design, while TF-Agents provides a more structured approach within the TensorFlow ecosystem.

2,268

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Pros of RL

  • Simpler and more lightweight, focusing on core RL algorithms
  • Tighter integration with PyTorch ecosystem
  • More accessible for beginners and researchers

Cons of RL

  • Less comprehensive feature set compared to Acme
  • Fewer advanced algorithms and environments
  • Less active development and maintenance

Code Comparison

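RL (the snippet below follows the Stable-Baselines3 DQN interface rather than TorchRL's own API):

import gym
from stable_baselines3 import DQN

env = gym.make('CartPole-v0')
model = DQN('MlpPolicy', env, verbose=1)
model.learn(total_timesteps=10000)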

Acme:

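import gym
import sonnet as snt
from acme import specs, wrappers
from acme.agents.tf import dqn

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
network = snt.Sequential([snt.Flatten(), snt.nets.MLP([64, 64, environment_spec.actions.num_values])])
agent = dqn.DQN(
    environment_spec=environment_spec,
    network=network,
    batch_size=32,
    samples_per_insert=2.0,
    min_replay_size=100)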

The RL code is more concise and straightforward, while Acme offers more granular control and configuration options. RL is better suited for quick prototyping and experimentation, whereas Acme provides a more robust framework for complex research and production environments.

34,860

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Pros of Ray

  • More versatile, supporting distributed computing beyond just reinforcement learning
  • Larger community and ecosystem, with more integrations and third-party libraries
  • Better suited for production environments and scaling to large clusters

Cons of Ray

  • Steeper learning curve due to its broader scope and more complex API
  • Less specialized for reinforcement learning tasks compared to Acme
  • Potentially more overhead for simple, single-machine RL experiments

Code Comparison

Acme (RL-focused):

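# Assumes: import acme, gym; from acme import specs, wrappers; from acme.agents.tf import dqn
environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
agent = dqn.DQN(environment_spec, network)  # `network`: a Sonnet Q-network defined elsewhere
loop = acme.EnvironmentLoop(environment, agent)
loop.run(num_episodes=10)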

Ray (distributed computing):

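import ray

ray.init()

# `Model` and `data_shards` are placeholders for user-defined training code and data
@ray.remote
def train_model(data):
    model = Model()
    for batch in data:
        model.train(batch)
    return model

# Launch one remote task per shard and wait for all of them to finish
results = ray.get([train_model.remote(data_shard) for data_shard in data_shards])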

Ray offers a more general-purpose distributed computing framework, while Acme provides a more specialized toolkit for reinforcement learning research. Ray's flexibility comes at the cost of increased complexity, whereas Acme offers a more streamlined experience for RL-specific tasks.

1,186

PFRL: a PyTorch-based deep reinforcement learning library

Pros of PFRL

  • More focused on practical implementations of reinforcement learning algorithms
  • Provides a wider range of pre-implemented RL algorithms out-of-the-box
  • Better documentation and examples for quick start and implementation

Cons of PFRL

  • Less flexible architecture compared to Acme's modular design
  • Smaller community and fewer contributions from external developers
  • Limited support for distributed training and multi-agent scenarios

Code Comparison

PFRL example (PPO implementation):

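import pfrl
import torch
from torch import nn

obs_size = 4   # CartPole observation size
n_actions = 2  # CartPole has two discrete actions

# PPO in PFRL takes a model that returns (action distribution, state value)
model = pfrl.nn.Branched(
    nn.Sequential(nn.Linear(obs_size, 64), nn.Tanh(), nn.Linear(64, n_actions),
                  pfrl.policies.SoftmaxCategoricalHead()),
    nn.Sequential(nn.Linear(obs_size, 64), nn.Tanh(), nn.Linear(64, 1)),
)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
agent = pfrl.agents.PPO(model, opt, gpu=-1)  # gpu=-1 runs on the CPU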

Acme example (PPO implementation):

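import gym
from acme import specs
from acme import wrappers
from acme.agents.jax import ppo

environment = wrappers.SinglePrecisionWrapper(wrappers.GymWrapper(gym.make('CartPole-v0')))
environment_spec = specs.make_environment_spec(environment)
# PPO is a JAX agent, assembled from a config, networks, and a builder;
# exact field names may vary between Acme releases
networks = ppo.make_networks(environment_spec)
builder = ppo.PPOBuilder(ppo.PPOConfig())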

Both libraries offer implementations of popular RL algorithms, but PFRL provides a more straightforward approach for quick implementation, while Acme offers a more modular and flexible architecture for advanced users and researchers.

README

Acme: a research framework for reinforcement learning

Acme is a library of reinforcement learning (RL) building blocks that strives to expose simple, efficient, and readable agents. These agents serve first and foremost as reference implementations and as strong baselines for algorithm performance. However, the baseline agents exposed by Acme should also provide enough flexibility and simplicity that they can be used as a starting block for novel research. Finally, the building blocks of Acme are designed in such a way that the agents can be run at multiple scales (e.g. single-stream vs. distributed agents).
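
To illustrate what running at multiple scales means in practice, the sketch below shows the general pattern behind a distributed agent using only the Python standard library: several actors generate experience in parallel and a single learner consumes it from a shared buffer. This is a conceptual illustration, not Acme's implementation; the actor and learner functions are hypothetical, and a plain thread-safe queue stands in for a real replay service.

```python
import queue
import threading


def actor(actor_id, num_steps, replay):
    """Generates experience and writes it to the shared buffer."""
    for step in range(num_steps):
        replay.put((actor_id, step))  # Stand-in for an environment transition.


def learner(replay, num_items):
    """Consumes experience; a real learner would update parameters here."""
    consumed = []
    for _ in range(num_items):
        consumed.append(replay.get())
    return consumed


replay = queue.Queue()
num_actors, steps_per_actor = 4, 25

# Start several actors that feed one buffer in parallel.
threads = [
    threading.Thread(target=actor, args=(actor_id, steps_per_actor, replay))
    for actor_id in range(num_actors)
]
for thread in threads:
    thread.start()

# The learner blocks until it has consumed every item the actors produce.
consumed = learner(replay, num_actors * steps_per_actor)

for thread in threads:
    thread.join()

print(len(consumed))
```

Setting num_actors to 1 gives the single-stream case with the same actor and learner code, which is the property the paragraph above describes.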

Getting started

The quickest way to get started is to take a look at the detailed working code examples found in the examples subdirectory. These show how to instantiate a number of different agents and run them within a variety of environments. See the quickstart notebook for an even quicker dive into using a single agent. Even more detail on the internal construction of an agent can be found inside our tutorial notebook. Finally, a full description of Acme and its underlying components can be found by referring to the documentation. More background information and details behind the design decisions can be found in our technical report.

NOTE: Acme is first and foremost a framework for RL research written by researchers, for researchers. We use it for our own work on a daily basis. So with that in mind, while we will make every attempt to keep everything in good working order, things may break occasionally. But if so we will make our best effort to fix them as quickly as possible!

Installation

To get up and running quickly just follow the steps below:

  1. While you can install Acme in your standard python environment, we strongly recommend using a Python virtual environment to manage your dependencies. This should help to avoid version conflicts and just generally make the installation process easier.

    python3 -m venv acme
    source acme/bin/activate
    pip install --upgrade pip setuptools wheel
    
  2. While the core dm-acme library can be pip installed directly, the set of dependencies included for installation is minimal. In particular, to run any of the included agents you will also need either JAX or TensorFlow depending on the agent. As a result we recommend installing these components as well, i.e.

    pip install dm-acme[jax,tf]
    
  3. Finally, to install a few example environments (including gym, dm_control, and bsuite):

    pip install dm-acme[envs]
    
  4. Installing from GitHub: if you're interested in running the bleeding-edge version of Acme, you can do so by cloning the Acme GitHub repository and then executing the following command from the main directory (where setup.py is located):

    pip install .[jax,tf,testing,envs]
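
After installing, a quick way to see which optional components ended up in the environment is to check whether their top-level modules can be found. The snippet below is a small sketch using only the standard library; the helper name is_installed is hypothetical, and the module names are assumed to match the packages pulled in by the extras above.

```python
import importlib.util


def is_installed(module_name):
    """Returns True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Module names assumed to correspond to the core library and its extras.
OPTIONAL_MODULES = ("acme", "jax", "tensorflow", "gym", "dm_control", "bsuite")

status = {name: is_installed(name) for name in OPTIONAL_MODULES}
for name, found in status.items():
    print(f"{name:12s} {'found' if found else 'missing'}")
```

A module reported as missing usually means that the corresponding extra was not included in the pip install command.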
    

Citing Acme

If you use Acme in your work, please cite the updated accompanying technical report:

@article{hoffman2020acme,
    title={Acme: A Research Framework for Distributed Reinforcement Learning},
    author={
        Matthew W. Hoffman and Bobak Shahriari and John Aslanides and
        Gabriel Barth-Maron and Nikola Momchev and Danila Sinopalnikov and
        Piotr Sta\'nczyk and Sabela Ramos and Anton Raichuk and
        Damien Vincent and L\'eonard Hussenot and Robert Dadashi and
        Gabriel Dulac-Arnold and Manu Orsini and Alexis Jacq and
        Johan Ferret and Nino Vieillard and Seyed Kamyar Seyed Ghasemipour and
        Sertan Girgin and Olivier Pietquin and Feryal Behbahani and
        Tamara Norman and Abbas Abdolmaleki and Albin Cassirer and
        Fan Yang and Kate Baumli and Sarah Henderson and Abe Friesen and
        Ruba Haroun and Alex Novikov and Sergio G\'omez Colmenarejo and
        Serkan Cabi and Caglar Gulcehre and Tom Le Paine and
        Srivatsan Srinivasan and Andrew Cowie and Ziyu Wang and Bilal Piot and
        Nando de Freitas
    },
    year={2020},
    journal={arXiv preprint arXiv:2006.00979},
    url={https://arxiv.org/abs/2006.00979},
}