Top Related Projects
A toolkit for developing and comparing reinforcement learning algorithms.
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
Quick Overview
The rlworkgroup/garage repository is a framework for developing and evaluating reinforcement learning (RL) algorithms. It provides a modular and extensible architecture for building RL agents and environments, making it easier to experiment with different RL techniques and compare their performance.
Pros
- Modular and Extensible: The framework is designed to be highly modular, allowing users to easily swap out different components (e.g., environments, policies, optimizers) to experiment with new RL approaches.
- Comprehensive Documentation: The project has extensive documentation, including detailed tutorials, API references, and examples, making it easier for new users to get started.
- Active Development and Community: The project is actively maintained by the RLWorkGroup, a community of RL researchers and practitioners, and has a growing user base.
- Supports Multiple RL Algorithms: The framework supports a wide range of RL algorithms, including deep Q-learning, policy gradients, and actor-critic methods, among others (see the import sketch after this list).
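As a rough illustration of that breadth, the imports below pull one algorithm from each family out of garage's PyTorch subpackage. Treat the exact class names and paths as an assumption to check against your installed release:
from garage.torch.algos import DQN  # value-based (deep Q-learning)
from garage.torch.algos import VPG  # policy-gradient (REINFORCE)
from garage.torch.algos import SAC  # actor-critic (soft actor-critic)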
Cons
- Steep Learning Curve: The framework can have a steep learning curve, especially for users new to RL and the project's architecture.
- Limited Support for Specific Domains: While the framework is designed to be general-purpose, it may not provide the best support for certain specialized RL domains or applications.
- Potential Performance Issues: Depending on the complexity of the RL tasks and the hardware available, the framework may experience performance issues, especially when running large-scale experiments.
- Dependency on External Libraries: The framework relies on several external libraries (e.g., TensorFlow, PyTorch), which can introduce additional complexity and potential compatibility issues.
Code Examples
Here are a few code examples demonstrating the usage of the rlworkgroup/garage framework:
- Creating a Simple RL Environment:
from garage.envs import GymEnv
env = GymEnv('CartPole-v1')
obs = env.reset()
action = env.action_space.sample()
next_obs, reward, done, info = env.step(action)
This code wraps a Gym environment using the garage.envs.GymEnv class and demonstrates how to interact with it.
- Defining a Policy Network:
import torch

from garage.torch.policies import GaussianMLPPolicy

policy = GaussianMLPPolicy(
    env_spec=env.spec,
    hidden_sizes=[32, 32],
    hidden_nonlinearity=torch.tanh,
    output_nonlinearity=None,
)
This code defines a Gaussian multilayer perceptron (MLP) policy network using the garage.torch.policies.GaussianMLPPolicy class.
- Training an RL Agent:
from garage import wrap_experiment
from garage.torch.algos import PPO
from garage.torch.value_functions import GaussianMLPValueFunction
from garage.trainer import Trainer

@wrap_experiment
def ppo_example(ctxt=None):
    trainer = Trainer(ctxt)
    value_function = GaussianMLPValueFunction(env_spec=env.spec)
    algo = PPO(
        env_spec=env.spec,
        policy=policy,
        value_function=value_function,
        discount=0.99,
        gae_lambda=0.95,
        policy_ent_coeff=0.0,
        lr_clip_range=0.2,
    )
    trainer.setup(algo, env)
    trainer.train(n_epochs=100, batch_size=4096)
This code sets up a Proximal Policy Optimization (PPO) algorithm inside a wrap_experiment function and uses garage's trainer to run the training loop.
Getting Started
To get started with the rlworkgroup/garage framework, follow these steps:
- Install the Dependencies: Ensure that you have Python 3.6 or higher installed, and then install the required dependencies using pip:
pip install garage[all]
- Create a New Environment: Create a new Python file (e.g., main.py) and import the necessary modules from the garage library:
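A minimal sketch of such a file, assembled from the snippets above (the exact module paths are an assumption and can vary between garage releases):
# main.py -- starting point for a garage experiment
import torch

from garage import wrap_experiment
from garage.envs import GymEnv
from garage.torch.algos import PPO
from garage.torch.policies import GaussianMLPPolicy
From here, construct the environment, policy, algorithm, and trainer as shown in the Code Examples section above.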
Competitor Comparisons
A toolkit for developing and comparing reinforcement learning algorithms.
Pros of Gym
- Widely adopted and supported by the RL community
- Extensive collection of pre-built environments
- Simple and intuitive API for environment interaction
Cons of Gym
- Limited built-in support for advanced RL algorithms
- Lacks integrated visualization tools for training progress
- Fewer options for customizing environment parameters
Code Comparison
Gym:
import gym

env = gym.make('CartPole-v1')
observation = env.reset()
for _ in range(1000):
    action = env.action_space.sample()
    observation, reward, done, info = env.step(action)
Garage:
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment import LocalRunner
from garage.sampler import RaySampler
from garage.tf.algos import PPO
from garage.tf.policies import GaussianMLPPolicy

@wrap_experiment
def ppo_cartpole(ctxt=None):
    env = GymEnv('CartPole-v1')
    policy = GaussianMLPPolicy(env.spec)
    sampler = RaySampler(agents=policy, envs=env)
    algo = PPO(env_spec=env.spec, policy=policy, sampler=sampler)
    runner = LocalRunner(ctxt)
    runner.setup(algo, env)
    runner.train(n_epochs=100, batch_size=4000)
A fork of OpenAI Baselines, implementations of reinforcement learning algorithms
Pros of Stable-baselines
- More user-friendly and easier to get started with
- Better documentation and examples
- Wider range of implemented algorithms
Cons of Stable-baselines
- Less flexible for custom environments and modifications
- Fewer options for advanced users and researchers
Code Comparison
Stable-baselines:
from stable_baselines import PPO2

model = PPO2("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10000)
Garage:
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment import LocalTFRunner
from garage.tf.algos import PPO
from garage.tf.policies import GaussianMLPPolicy

@wrap_experiment
def ppo_cartpole(ctxt=None):
    with LocalTFRunner(ctxt) as trainer:
        env = GymEnv('CartPole-v1')
        policy = GaussianMLPPolicy(env_spec=env.spec)
        algo = PPO(env_spec=env.spec, policy=policy)
        trainer.setup(algo, env)
        trainer.train(n_epochs=100, batch_size=4000)
TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.
Pros of TensorFlow Agents
- Built on TensorFlow, offering seamless integration with TensorFlow ecosystem
- Extensive documentation and tutorials for easier onboarding
- Regular updates and active maintenance by Google's team
Cons of TensorFlow Agents
- Limited to TensorFlow backend, less flexible than Garage's multi-framework support
- Steeper learning curve for those not familiar with TensorFlow
- Fewer algorithm implementations compared to Garage's diverse collection
Code Comparison
Garage example:
import gym

from garage import wrap_experiment
from garage.envs import normalize
from garage.tf.algos import PPO
from garage.tf.baselines import GaussianMLPBaseline
from garage.tf.envs import TfEnv
from garage.experiment import LocalTFRunner
from garage.tf.policies import GaussianMLPPolicy

@wrap_experiment
def ppo_garage(ctxt=None):
    with LocalTFRunner(ctxt) as runner:
        env = TfEnv(normalize(gym.make('HalfCheetah-v2')))
        policy = GaussianMLPPolicy(env.spec)
        baseline = GaussianMLPBaseline(env.spec)
        algo = PPO(env_spec=env.spec, policy=policy, baseline=baseline)
        runner.setup(algo, env)
        runner.train(n_epochs=100, batch_size=4000)
TensorFlow Agents example:
import tensorflow as tf
from tf_agents.agents.ppo import ppo_agent
from tf_agents.environments import suite_gym
from tf_agents.networks import actor_distribution_network, value_network

env = suite_gym.load('HalfCheetah-v2')
actor_net = actor_distribution_network.ActorDistributionNetwork(
    env.observation_spec(), env.action_spec())
value_net = value_network.ValueNetwork(env.observation_spec())
agent = ppo_agent.PPOAgent(
    env.time_step_spec(), env.action_spec(),
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    actor_net=actor_net,
    value_net=value_net)
PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.
Pros of stable-baselines3
- More active development and frequent updates
- Better documentation and tutorials for beginners
- Wider range of implemented algorithms
Cons of stable-baselines3
- Less flexibility for customization and experimentation
- Fewer options for parallel training and distributed computing
Code Comparison
stable-baselines3:
from stable_baselines3 import PPO
model = PPO("MlpPolicy", "CartPole-v1", verbose=1)
model.learn(total_timesteps=10000)
garage:
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment import LocalTFRunner
from garage.tf.algos import PPO
from garage.tf.policies import GaussianMLPPolicy

@wrap_experiment
def ppo_cartpole(ctxt=None):
    with LocalTFRunner(ctxt) as trainer:
        env = GymEnv('CartPole-v1')
        policy = GaussianMLPPolicy(env_spec=env.spec)
        algo = PPO(env_spec=env.spec, policy=policy)
        trainer.setup(algo, env)
        trainer.train(n_epochs=100, batch_size=4000)
Both libraries offer implementations of popular RL algorithms, but stable-baselines3 provides a more straightforward API for quick experimentation, while garage offers more flexibility for advanced users and researchers. The code comparison shows that stable-baselines3 requires less boilerplate code to get started, making it more accessible for beginners.
Softlearning is a reinforcement learning framework for training maximum entropy policies in continuous domains. Includes the official implementation of the Soft Actor-Critic algorithm.
Pros of softlearning
- Focuses on soft actor-critic (SAC) algorithms, providing specialized implementations
- Includes pre-trained models and benchmarks for easy comparison
- Offers integration with RLlib for distributed training
Cons of softlearning
- Less versatile than garage, primarily centered on SAC variants
- Smaller community and fewer contributors compared to garage
- Limited documentation and examples for beginners
Code Comparison
softlearning:
from softlearning.environments.utils import get_environment
from softlearning.algorithms.sac import SAC
env = get_environment('Ant-v2')
algorithm = SAC(env=env, Q_lr=3e-4, policy_lr=3e-4)
algorithm.train()
garage:
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment import LocalRunner
from garage.torch.algos import SAC

@wrap_experiment
def sac_ant(ctxt=None):
    env = GymEnv('Ant-v2')
    algo = SAC(env_spec=env.spec, qf_lr=3e-4, policy_lr=3e-4)
    runner = LocalRunner(ctxt)
    runner.setup(algo, env)
    runner.train(n_epochs=500, batch_size=256)
An API standard for single-agent reinforcement learning environments, with popular reference environments and related utilities (formerly Gym)
Pros of Gymnasium
- More active development and community support
- Wider range of pre-built environments for various RL tasks
- Better documentation and tutorials for beginners
Cons of Gymnasium
- Less focus on modular, composable RL components
- Fewer built-in algorithms and policy implementations
- Limited support for distributed training
Code Comparison
Gymnasium:
import gymnasium as gym

env = gym.make("CartPole-v1")
observation, info = env.reset(seed=42)
for _ in range(1000):
    action = env.action_space.sample()
    observation, reward, terminated, truncated, info = env.step(action)
Garage:
from garage import wrap_experiment
from garage.envs import GymEnv
from garage.experiment import LocalRunner

@wrap_experiment
def my_experiment(ctxt=None):
    env = GymEnv('CartPole-v1')
    runner = LocalRunner(ctxt)
    runner.setup(algo, env)
    runner.train(n_epochs=100, batch_size=4000)
The code comparison shows that Gymnasium focuses on simplicity and ease of use, while Garage provides a more structured approach with experiment management and runner classes. Gymnasium's API is more straightforward for beginners, while Garage offers more flexibility for complex RL setups.
README
garage
garage is a toolkit for developing and evaluating reinforcement learning algorithms, and an accompanying library of state-of-the-art implementations built using that toolkit.
The toolkit provides a wide range of modular tools for implementing RL algorithms, including:
- Composable neural network models
- Replay buffers
- High-performance samplers
- An expressive experiment definition interface
- Tools for reproducibility (e.g. set a global random seed which all components respect; see the sketch after this list)
- Logging to many outputs, including TensorBoard
- Reliable experiment checkpointing and resuming
- Environment interfaces for many popular benchmark suites
- Support for running garage in diverse environments, including always up-to-date Docker containers
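A minimal sketch of the reproducibility hook mentioned above, assuming the set_seed helper exported by garage.experiment.deterministic (the name used in the packaged examples):
from garage import wrap_experiment
from garage.experiment.deterministic import set_seed

@wrap_experiment
def my_experiment(ctxt=None, seed=1):
    # Seed the global random number generators before any component
    # (environment, policy, sampler, algorithm) is constructed.
    set_seed(seed)
    # ... build env, policy, algo, and trainer here as usual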
See the latest documentation for getting started instructions and detailed APIs.
Installation
pip install --user garage
Examples
Starting from version v2020.10.0, garage comes packaged with examples. To get a list of examples, run:
garage examples
You can also run garage examples --help, or visit the documentation for even more details.
Join the Community
Join the garage-announce mailing list for infrequent updates (<1/mo.) on the status of the project and new releases.
Need some help? Want to ask whether garage is right for your project? Have a question which is not quite a bug and not quite a feature request?
Join the community Slack by filling out this Google Form.
Algorithms
The table below summarizes the algorithms available in garage.
Algorithm | Framework(s) |
---|---|
CEM | numpy |
CMA-ES | numpy |
REINFORCE (a.k.a. VPG) | PyTorch, TensorFlow |
DDPG | PyTorch, TensorFlow |
DQN | PyTorch, TensorFlow |
DDQN | PyTorch, TensorFlow |
ERWR | TensorFlow |
NPO | TensorFlow |
PPO | PyTorch, TensorFlow |
REPS | TensorFlow |
TD3 | PyTorch, TensorFlow |
TNPG | TensorFlow |
TRPO | PyTorch, TensorFlow |
MAML | PyTorch |
RL2 | TensorFlow |
PEARL | PyTorch |
SAC | PyTorch |
MTSAC | PyTorch |
MTPPO | PyTorch, TensorFlow |
MTTRPO | PyTorch, TensorFlow |
Task Embedding | TensorFlow |
Behavioral Cloning | PyTorch |
Supported Tools and Frameworks
garage requires Python 3.6+. If you need Python 3.5 support, the last garage release to support Python 3.5 was v2020.06.
The package is tested on Ubuntu 18.04. It is also known to run on Ubuntu 16.04, 18.04, and 20.04, and recent versions of macOS using Homebrew. Windows users can install garage via WSL, or by making use of the Docker containers.
We currently support PyTorch and TensorFlow for implementing the neural network portions of RL algorithms, and additions of new framework support are always welcome. PyTorch modules can be found in the package garage.torch, and TensorFlow modules can be found in the package garage.tf. Algorithms which do not require neural networks are found in the package garage.np.
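As an illustration of that layout, the imports below (a sketch; the exact exports depend on the release you have installed) pull implementations from each subpackage:
from garage.np.algos import CEM                  # NumPy-only algorithms
from garage.tf.algos import PPO as TFPPO         # TensorFlow implementations
from garage.torch.algos import PPO as TorchPPO   # PyTorch implementations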
The package is available for download on PyPI, and we ensure that it installs successfully into environments defined using conda, Pipenv, and virtualenv.
Testing
The most important feature of garage is its comprehensive automated unit test and benchmarking suite, which helps ensure that the algorithms and modules in garage maintain state-of-the-art performance as the software changes.
Our testing strategy has three pillars:
- Automation: We use continuous integration to test all modules and algorithms in garage before adding any change. The full installation and test suite is also run nightly, to detect regressions.
- Acceptance Testing: Any commit which might change the performance of an algorithm is subjected to comprehensive benchmarks on the relevant algorithms before it is merged.
- Benchmarks and Monitoring: We benchmark the full suite of algorithms against their relevant benchmarks and widely-used implementations regularly, to detect regressions and improvements we may have missed.
Supported Releases
Release | Last date of support |
---|---|
v2021.03 | May 31st, 2021 |
Maintenance releases have a stable API and dependency tree, and receive bug fixes and critical improvements but not new features. We currently support each release for a window of 2 months.
Citing garage
If you use garage for academic research, please cite the repository using the following BibTeX entry. You should update the commit field with the commit or release tag your publication uses.
@misc{garage,
  author = {The garage contributors},
  title = {Garage: A toolkit for reproducible reinforcement learning research},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/rlworkgroup/garage}},
  commit = {be070842071f736eb24f28e4b902a9f144f5c97b}
}
Credits
The earliest code for garage was adopted from a predecessor project called rllab. The garage project is grateful for the contributions of the original rllab authors, and hopes to continue advancing the state of reproducibility in RL research in the same spirit. garage has previously been supported by the Amazon Research Award "Watch, Practice, Learn, Do: Unsupervised Learning of Robust and Composable Robot Motion Skills by Fusing Expert Demonstrations with Robot Experience."