
google/dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.


Top Related Projects

15,725

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

2,788

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

3,485

A library of reinforcement learning components and agents

1,272

2,268

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Quick Overview

Dopamine is an open-source research framework from Google for fast prototyping of reinforcement learning (RL) algorithms. It aims to make RL research more accessible and reproducible by providing a small, flexible, and well-tested codebase for implementing and evaluating RL agents.

Pros

  • Easy to use and extend, with a modular design that allows for quick experimentation
  • Implements several popular RL algorithms out of the box (e.g., DQN, Rainbow, C51)
  • Provides JAX agents (actively maintained) and TensorFlow agents (legacy), with Gin-config for flexible configuration
  • Provides tools for visualization and analysis of agent performance

Cons

  • Centered on value-based agents for discrete-action benchmarks such as Atari; continuous control (Mujoco environments, SAC, PPO) is supported but covers fewer algorithms than larger libraries
  • Deliberately small set of algorithms, so many policy-gradient and model-based methods are not included
  • May have a steeper learning curve for those unfamiliar with JAX, TensorFlow, or Gin-config
  • Documentation could be more comprehensive for advanced usage scenarios

Code Examples

  1. Creating a DQN agent:
from dopamine.discrete_domains import atari_lib
from dopamine.jax.agents.dqn import dqn_agent

# Build a preprocessed Atari environment (84x84 grayscale frames).
environment = atari_lib.create_atari_environment(
    game_name='Pong', sticky_actions=True)

# JAX DQN agent; observation shape and dtype default to the Nature DQN settings.
agent = dqn_agent.JaxDQNAgent(
    num_actions=environment.action_space.n,
    stack_size=4)
  2. Running a single training episode:
max_steps_per_episode = 27000  # cap on steps per episode; adjust as needed

initial_observation = environment.reset()
action = agent.begin_episode(initial_observation)

for _ in range(max_steps_per_episode):
    observation, reward, done, _ = environment.step(action)
    if done:
        # The final reward is passed to end_episode, not to step.
        break
    action = agent.step(reward, observation)

agent.end_episode(reward)
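  3. Evaluating agent performance:
from dopamine.discrete_domains import iteration_statistics

# IterationStatistics collects named values in lists (statistics.data_lists).
statistics = iteration_statistics.IterationStatistics()
for _ in range(num_eval_episodes):
    # Run one evaluation episode here and store its total reward
    # in episode_reward (placeholder, not defined in this snippet).
    statistics.append({'eval_episode_returns': episode_reward})

returns = statistics.data_lists['eval_episode_returns']
average_reward = sum(returns) / len(returns)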
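
The snippets above all rely on the same agent protocol: begin_episode returns the first action, step receives a reward and an observation and returns the next action, and end_episode receives the final reward. The sketch below exercises that protocol with stand-ins so it runs without Dopamine or an Atari emulator. CoinFlipEnv, RandomAgent, and run_one_episode are hypothetical names invented for this illustration, not part of the library.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: reward 1.0 when the action matches a coin flip."""

    def __init__(self, episode_length=10, seed=0):
        self._rng = random.Random(seed)
        self._episode_length = episode_length
        self._t = 0

    def reset(self):
        self._t = 0
        return 0  # dummy observation

    def step(self, action):
        self._t += 1
        reward = 1.0 if action == self._rng.randint(0, 1) else 0.0
        done = self._t >= self._episode_length
        return 0, reward, done, {}


class RandomAgent:
    """Hypothetical agent exposing begin_episode / step / end_episode."""

    def __init__(self, num_actions, seed=0):
        self._rng = random.Random(seed)
        self._num_actions = num_actions

    def begin_episode(self, observation):
        return self._rng.randrange(self._num_actions)

    def step(self, reward, observation):
        # A learning agent would store the transition and train here.
        return self._rng.randrange(self._num_actions)

    def end_episode(self, reward):
        pass


def run_one_episode(agent, environment, max_steps_per_episode=100):
    """Runs one episode and returns (episode_return, number_of_steps)."""
    action = agent.begin_episode(environment.reset())
    episode_return, steps = 0.0, 0
    while True:
        observation, reward, done, _ = environment.step(action)
        episode_return += reward
        steps += 1
        if done or steps >= max_steps_per_episode:
            break
        action = agent.step(reward, observation)
    agent.end_episode(reward)
    return episode_return, steps


# Average return over five short evaluation episodes.
returns = [run_one_episode(RandomAgent(2, seed=i), CoinFlipEnv(seed=i))[0]
           for i in range(5)]
average_return = sum(returns) / len(returns)
```

Swapping the stand-ins for a real Dopamine agent and environment should leave run_one_episode unchanged, since it only touches the three protocol methods plus reset and step.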

Getting Started

To get started with Dopamine:

  1. Install Dopamine:
pip install dopamine-rl
  2. Train a DQN agent on Atari with the bundled gin configuration (run this from the root of a source checkout so the config path resolves):
from dopamine.discrete_domains import run_experiment
import gin

# The gin file selects the agent, the game, and the training schedule.
gin.parse_config_file('dopamine/agents/dqn/configs/dqn.gin')

# create_runner builds a Runner for the agent named in the gin config.
runner = run_experiment.create_runner('/tmp/dopamine_runs')

runner.run_experiment()

This will train a DQN agent on the default Atari environment (Pong) using the configurations specified in the gin file.
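
To make the role of the gin schedule settings more concrete, here is a simplified sketch of the loop a runner performs: each iteration runs a training phase and then an evaluation phase, and each phase keeps starting episodes until a minimum number of environment steps has been reached. The parameter names num_iterations, training_steps, and evaluation_steps follow the settings used in the gin file. The functions run_phase and run_experiment_sketch and the stand-in episode callables are assumptions made for illustration, and checkpointing and logging are left out.

```python
def run_phase(min_steps, run_episode):
    """Runs whole episodes until at least min_steps environment steps were taken.

    run_episode is a callable returning (episode_return, episode_steps).
    Returns (average_return, number_of_episodes).
    """
    total_steps, total_return, num_episodes = 0, 0.0, 0
    while total_steps < min_steps:
        episode_return, episode_steps = run_episode()
        total_steps += episode_steps
        total_return += episode_return
        num_episodes += 1
    return total_return / num_episodes, num_episodes


def run_experiment_sketch(num_iterations, training_steps, evaluation_steps,
                          train_episode, eval_episode):
    """Alternates a training phase and an evaluation phase per iteration."""
    history = []
    for iteration in range(num_iterations):
        train_average, _ = run_phase(training_steps, train_episode)
        eval_average, _ = run_phase(evaluation_steps, eval_episode)
        history.append({
            'iteration': iteration,
            'train_average_return': train_average,
            'eval_average_return': eval_average,
        })
    return history


# Stand-in episodes: fixed return and a fixed length of 30 steps.
history = run_experiment_sketch(
    num_iterations=2, training_steps=100, evaluation_steps=50,
    train_episode=lambda: (1.0, 30),
    eval_episode=lambda: (2.0, 30))
```

Because a phase only stops between episodes, it can overshoot its step budget by up to one episode.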

Competitor Comparisons

15,725

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Pros of Baselines

  • Wider range of algorithms implemented, including PPO, TRPO, and DDPG
  • More extensive documentation and examples for various environments
  • Active community support and regular updates

Cons of Baselines

  • Steeper learning curve for beginners
  • Less focus on visualization tools compared to Dopamine
  • Some implementations may be less optimized for performance

Code Comparison

Baselines (PPO training):

import gym
from baselines.common.vec_env.dummy_vec_env import DummyVecEnv
from baselines.ppo2 import ppo2

# Baselines expects a vectorized environment.
env = DummyVecEnv([lambda: gym.make('CartPole-v1')])

# ppo2.learn builds the policy from the network name and trains it with PPO.
model = ppo2.learn(
    network='mlp', env=env, total_timesteps=10000,
    nminibatches=4, noptepochs=4)

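Dopamine (DQN with the legacy TensorFlow agent):

from dopamine.agents.dqn import dqn_agent
from dopamine.discrete_domains import run_experiment

def create_agent(sess, environment, summary_writer=None):
    # The runner calls this factory with a TF session and the environment.
    return dqn_agent.DQNAgent(
        sess,
        num_actions=environment.action_space.n,
        summary_writer=summary_writer)

# base_dir holds checkpoints and logs; the game and the training schedule
# are expected to come from a gin config.
runner = run_experiment.Runner(base_dir, create_agent)
runner.run_experiment()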

A fork of OpenAI Baselines, implementations of reinforcement learning algorithms

Pros of Stable-Baselines

  • Wider range of algorithms implemented, including PPO, SAC, and DDPG
  • More extensive documentation and tutorials for beginners
  • Active community and frequent updates

Cons of Stable-Baselines

  • Less focus on research-oriented features compared to Dopamine
  • May have slightly higher computational overhead due to its comprehensive nature

Code Comparison

Dopamine (creating an agent):

from dopamine.agents.dqn import dqn_agent
from dopamine.discrete_domains import atari_lib

agent = dqn_agent.DQNAgent(
    sess,  # the legacy TensorFlow agent needs a TF session
    num_actions=environment.action_space.n,
    observation_shape=atari_lib.NATURE_DQN_OBSERVATION_SHAPE,
    observation_dtype=atari_lib.NATURE_DQN_DTYPE,
    stack_size=atari_lib.NATURE_DQN_STACK_SIZE,
    network=atari_lib.NatureDQNNetwork)

Stable-Baselines3 (creating and training an agent):

from stable_baselines3 import DQN

model = DQN("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10000)

Both libraries offer easy-to-use interfaces for reinforcement learning, but Stable-Baselines provides a more streamlined API for quick experimentation, while Dopamine offers more flexibility for custom implementations and research-oriented projects.

2,788

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Pros of TensorFlow Agents

  • More comprehensive and feature-rich, offering a wider range of RL algorithms
  • Better integration with the TensorFlow ecosystem
  • More active development and community support

Cons of TensorFlow Agents

  • Steeper learning curve due to its complexity
  • Potentially slower execution compared to Dopamine's focused approach

Code Comparison

Dopamine (simple DQN implementation):

agent = dqn_agent.DQNAgent(
    sess,  # the legacy TensorFlow agent needs a TF session
    num_actions=num_actions,
    observation_shape=observation_shape,
    observation_dtype=tf.float32,
    stack_size=stack_size,
    network=atari_lib.NatureDQNNetwork)

TensorFlow Agents (DQN implementation):

agent = dqn_agent.DqnAgent(
    time_step_spec,
    action_spec,
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)

Both repositories focus on reinforcement learning, but TensorFlow Agents offers a more comprehensive toolkit with broader algorithm support. Dopamine, on the other hand, provides a simpler, more focused approach to RL research. TensorFlow Agents integrates better with the TensorFlow ecosystem, while Dopamine may be easier to get started with for beginners. The code comparison shows that TensorFlow Agents requires more setup but offers more flexibility in configuration.

3,485

A library of reinforcement learning components and agents

Pros of Acme

  • More comprehensive and flexible framework for RL research
  • Supports a wider range of algorithms and environments
  • Better suited for distributed and large-scale experiments

Cons of Acme

  • Steeper learning curve due to increased complexity
  • Potentially overkill for simpler RL tasks
  • Less focus on visualization tools compared to Dopamine

Code Comparison

Dopamine (agent creation):

agent = rainbow_agent.RainbowAgent(
    sess,  # the legacy TensorFlow agent needs a TF session
    # Observation shape and dtype default to the Atari (Nature DQN) settings.
    num_actions=environment.action_space.n)

Acme (agent creation, illustrative; the exact constructor depends on the Acme version):

agent = sac.SACAgent(
    environment_spec=environment_spec,
    policy_network=policy_network,
    critic_network=critic_network,
    target_entropy=target_entropy)

Both repositories provide frameworks for reinforcement learning research, but Acme offers a more extensive and flexible approach. Dopamine focuses on simplicity and reproducibility, making it easier for beginners to get started. Acme, on the other hand, provides a broader range of tools and algorithms, making it more suitable for advanced research and large-scale experiments. The code comparison shows that Acme requires more setup but offers greater customization, while Dopamine provides a more straightforward approach to agent creation.

1,272

Pros of rlax

  • More flexible and modular design, allowing for easier customization of RL algorithms
  • Better integration with JAX, enabling efficient GPU/TPU acceleration
  • Broader range of RL algorithms and components available

Cons of rlax

  • Steeper learning curve due to its more low-level nature
  • Less comprehensive documentation and tutorials compared to Dopamine
  • Requires more boilerplate code to set up complete RL experiments

Code Comparison

rlax example:

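import jax
import jax.numpy as jnp
import rlax

def loss_fn(prediction, target):
    # rlax.l2_loss is elementwise, so reduce to a scalar before differentiating.
    return jnp.mean(rlax.l2_loss(prediction, target))

# Gradient of the loss with respect to the first argument (prediction).
grad_fn = jax.grad(loss_fn)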

Dopamine example:

import dopamine.jax.agents.dqn.dqn_agent as dqn_agent

agent = dqn_agent.JaxDQNAgent(
    num_actions=4,
    observation_shape=(84, 84),  # shape of a single frame; stacking is separate
    stack_size=4
)

rlax offers more granular control over RL components, while Dopamine provides higher-level abstractions for complete agents. rlax's integration with JAX allows for easy gradient computation, whereas Dopamine focuses on providing ready-to-use agent implementations.
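
As an illustration of what such a granular component computes, the sketch below writes out the one-step Q-learning temporal-difference error for a single transition in plain Python. It is not a call into rlax. The function name and argument names are chosen here for illustration, loosely following the (q_tm1, a_tm1, r_t, discount_t, q_t) naming convention common in that style of library.

```python
def q_learning_td_error(q_tm1, a_tm1, r_t, discount_t, q_t):
    """One-step Q-learning TD error for a single transition.

    q_tm1: action values at the previous state.
    a_tm1: index of the action taken in the previous state.
    r_t: reward received after taking that action.
    discount_t: discount factor (use 0.0 when the episode has ended).
    q_t: action values at the next state.
    """
    # Bootstrapped target: reward plus discounted best next-state value.
    target = r_t + discount_t * max(q_t)
    return target - q_tm1[a_tm1]


td_error = q_learning_td_error(
    q_tm1=[1.0, 2.0], a_tm1=1, r_t=0.5, discount_t=0.9, q_t=[3.0, 1.0])
# target = 0.5 + 0.9 * 3.0 = 3.2, so td_error is approximately 1.2
```

A complete agent such as the one Dopamine provides wraps a computation like this together with a network, a replay buffer, and an optimizer.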

2,268

A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Pros of RL

  • Built on PyTorch, offering more flexibility and easier integration with other deep learning projects
  • Supports a wider range of RL algorithms, including policy gradient methods and model-based RL
  • More active community and frequent updates

Cons of RL

  • Less focus on reproducibility compared to Dopamine
  • May have a steeper learning curve for beginners due to its broader scope

Code Comparison

RL (PyTorch):

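import torch
from torch import nn
from torch.distributions import Categorical

class Policy(nn.Module):
    def __init__(self):
        super().__init__()
        self.affine1 = nn.Linear(4, 128)      # 4 observation features
        self.action_head = nn.Linear(128, 2)  # 2 discrete actions

    def forward(self, x):
        # Return a categorical distribution over the actions.
        x = torch.relu(self.affine1(x))
        return Categorical(logits=self.action_head(x))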

Dopamine (TensorFlow):

from dopamine.agents.dqn import dqn_agent

def create_agent(sess, environment, summary_writer=None):
  # Factory called by the runner; builds the legacy TensorFlow DQN agent.
  return dqn_agent.DQNAgent(
      sess,
      num_actions=environment.action_space.n,
      summary_writer=summary_writer)

The snippets operate at different levels: the RL example defines a policy network by hand as a PyTorch module, whereas the Dopamine example is a small factory function that returns a ready-made DQN agent.


README

Dopamine

Getting Started | Docs | Baseline Results | Changelist



Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research).

Our design principles are:

  • Easy experimentation: Make it easy for new users to run benchmark experiments.
  • Flexible development: Make it easy for new users to try out research ideas.
  • Compact and reliable: Provide implementations for a few, battle-tested algorithms.
  • Reproducible: Facilitate reproducibility in results. In particular, our setup follows the recommendations given by Machado et al. (2018).
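
One of the recommendations in Machado et al. (2018) is to evaluate with sticky actions: with some probability (0.25 in that paper), the environment repeats the previously executed action instead of the one the agent requested, which injects stochasticity into otherwise deterministic Atari games. In practice this is handled inside the emulator. The wrapper below is only an illustrative sketch of the idea, and StickyActions and RecordingEnv are hypothetical names that do not exist in Dopamine.

```python
import random


class StickyActions:
    """Illustrative wrapper: repeats the previous action with probability `stickiness`."""

    def __init__(self, environment, stickiness=0.25, seed=0):
        self._environment = environment
        self._stickiness = stickiness
        self._rng = random.Random(seed)
        self._previous_action = 0  # assumption: action 0 is a no-op

    def reset(self):
        self._previous_action = 0
        return self._environment.reset()

    def step(self, action):
        # With probability `stickiness`, ignore the requested action and
        # repeat the last executed one.
        if self._rng.random() < self._stickiness:
            action = self._previous_action
        self._previous_action = action
        return self._environment.step(action)


class RecordingEnv:
    """Hypothetical stand-in environment that records the actions it executes."""

    def __init__(self):
        self.executed = []

    def reset(self):
        self.executed = []
        return 0

    def step(self, action):
        self.executed.append(action)
        return 0, 0.0, False, {}


env = RecordingEnv()
sticky = StickyActions(env, stickiness=0.25, seed=0)
sticky.reset()
for requested in [1, 2, 3, 1, 2, 3, 1, 2]:
    sticky.step(requested)
# env.executed now holds the actions that were actually applied; some
# entries may repeat the preceding action instead of the requested one.
```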

Dopamine supports the following agents, implemented with jax:

  • DQN
  • C51
  • Rainbow
  • IQN
  • SAC
  • PPO

For more information on the available agents, see the docs.

Many of these agents also have a tensorflow (legacy) implementation, though newly added agents are likely to be jax-only.

This is not an official Google product.

Getting Started

We provide docker containers for using Dopamine. Instructions can be found here.

Alternatively, Dopamine can be installed from source (preferred) or installed with pip. For either of these methods, continue reading at prerequisites.

Prerequisites

Dopamine supports Atari environments and Mujoco environments. Install the environments you intend to use before you install Dopamine:

Atari

  1. The Atari environments should now come packaged with ale_py.
  2. You may need to manually run some steps to properly install baselines; see the instructions.

Mujoco

  1. Install Mujoco and get a license here.
  2. Run pip install mujoco-py (we recommend using a virtual environment).

Installing from Source

The most common way to use Dopamine is to install it from source and modify the source code directly:

git clone https://github.com/google/dopamine

After cloning, install dependencies:

pip install -r dopamine/requirements.txt

Dopamine supports tensorflow (legacy) and jax (actively maintained) agents. View the TensorFlow documentation for more information on installing tensorflow.

Note: We recommend using a virtual environment when working with Dopamine.

Installing with Pip

Note: We strongly recommend installing from source for most users.

Installing with pip is simple, but Dopamine is designed to be modified directly. We recommend installing from source for writing your own experiments.

pip install dopamine-rl

Running tests

You can test whether the installation was successful by running the following from the dopamine root directory.

export PYTHONPATH=$PYTHONPATH:$PWD
python -m tests.dopamine.atari_init_test

Next Steps

View the docs for more information on training agents.

We supply baselines for each Dopamine agent.

We also provide a set of Colaboratory notebooks which demonstrate how to use Dopamine.

References

Bellemare et al., The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 2013.

Machado et al., Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents. Journal of Artificial Intelligence Research, 2018.

Hessel et al., Rainbow: Combining Improvements in Deep Reinforcement Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 2018.

Mnih et al., Human-level Control through Deep Reinforcement Learning. Nature, 2015.

Schaul et al., Prioritized Experience Replay. Proceedings of the International Conference on Learning Representations, 2016.

Haarnoja et al., Soft Actor-Critic Algorithms and Applications. arXiv preprint arXiv:1812.05905, 2018.

Schulman et al., Proximal Policy Optimization Algorithms. arXiv preprint arXiv:1707.06347, 2017.

Giving credit

If you use Dopamine in your work, we ask that you cite our white paper. Here is an example BibTeX entry:

@article{castro18dopamine,
  author    = {Pablo Samuel Castro and
               Subhodeep Moitra and
               Carles Gelada and
               Saurabh Kumar and
               Marc G. Bellemare},
  title     = {Dopamine: {A} {R}esearch {F}ramework for {D}eep {R}einforcement {L}earning},
  year      = {2018},
  url       = {http://arxiv.org/abs/1812.06110},
  archivePrefix = {arXiv}
}