Top Related Projects
ml-agents: The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
Habitat-lab: A modular high-level library to train embodied AI agents across a variety of tasks and environments.
Gym: A toolkit for developing and comparing reinforcement learning algorithms.
DeepMind Lab: A customisable 3D platform for agent-based AI research.
AirSim: Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research.
Quick Overview
AI2-THOR is an open-source interactive 3D environment for AI research, developed by the Allen Institute for AI. It provides a platform for training and testing embodied AI agents in realistic, physics-enabled indoor scenes, focusing on household tasks and navigation.
Pros
- Highly realistic and interactive 3D environments
- Supports a wide range of embodied AI tasks, including navigation, manipulation, and visual question answering
- Offers both Unity-based and headless simulation options for flexibility in research setups
- Regularly updated with new features and improvements
Cons
- Steep learning curve for researchers new to 3D environments or Unity
- Resource-intensive, requiring significant computational power for complex simulations
- Limited to indoor household environments, which may not suit all research needs
- Some users report occasional stability issues or bugs
Code Examples
- Initializing the environment and moving the agent:
import ai2thor.controller
# Creating the Controller launches the simulator, so no separate start() call is needed
controller = ai2thor.controller.Controller()
# Move the agent forward
event = controller.step(action="MoveAhead")
- Interacting with objects in the scene:
# Pick up an object
event = controller.step(
    action="PickupObject",
    objectId="Apple|+03.08|+00.90|-00.84"
)
# Open a drawer
event = controller.step(
    action="OpenObject",
    objectId="Drawer|+01.16|+00.57|-01.34"
)
- Capturing observations from the environment:
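# Get the current RGB frame
frame = event.frame
# Get the depth map (None unless the Controller was created with renderDepthImage=True)
depth_map = event.depth_frame
# Get instance segmentation (None unless the Controller was created with renderInstanceSegmentation=True)
instance_seg = event.instance_segmentation_frame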
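The object IDs in the interaction example above appear to follow a `Type|x|y|z` pattern. The sketch below splits such an ID into its parts, assuming exactly that four-field form; `parse_object_id` and `ParsedObjectId` are hypothetical helper names, not part of the ai2thor package, and IDs with extra fields are rejected rather than guessed at.

```python
from typing import NamedTuple


class ParsedObjectId(NamedTuple):
    """Hypothetical container for the fields of an object ID."""
    object_type: str
    x: float
    y: float
    z: float


def parse_object_id(object_id: str) -> ParsedObjectId:
    """Split an ID such as 'Apple|+03.08|+00.90|-00.84' into type and coordinates."""
    parts = object_id.split("|")
    if len(parts) != 4:
        # Assumption: only the four-field form shown above is handled here.
        raise ValueError(f"unexpected object ID format: {object_id!r}")
    object_type, x, y, z = parts
    return ParsedObjectId(object_type, float(x), float(y), float(z))


apple = parse_object_id("Apple|+03.08|+00.90|-00.84")
print(apple.object_type, apple.x, apple.y, apple.z)
```

In practice it is usually safer to read the type and position from the object metadata than to rely on the ID string; the parser is mainly useful for quick logging or filtering.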
Getting Started
To get started with AI2-THOR, follow these steps:
- Install the library:
pip install ai2thor
- Create a simple script to initialize and interact with the environment:
import ai2thor.controller
# Creating the Controller launches the simulator, so no separate start() call is needed
controller = ai2thor.controller.Controller()
# Initialize in a kitchen scene
event = controller.reset(scene="FloorPlan1")
# Move the agent and print its position
event = controller.step(action="MoveAhead")
print(event.metadata["agent"]["position"])
# Close the environment
controller.stop()
This basic script initializes the environment, moves the agent, and prints its position. From here, you can explore more complex interactions and tasks within the AI2-THOR environment.
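To check how far a step actually moved the agent, the position printed above can be compared before and after the action. The sketch below assumes the position is a dictionary with `x`, `y`, and `z` keys; the helper name `displacement` and the sample coordinates are made up for illustration and do not come from a real run.

```python
import math


def displacement(before: dict, after: dict) -> float:
    """Euclidean distance between two position dicts with 'x', 'y', 'z' keys."""
    return math.sqrt(sum((after[k] - before[k]) ** 2 for k in ("x", "y", "z")))


# Hypothetical positions, standing in for event.metadata["agent"]["position"]
# captured before and after a MoveAhead step.
start = {"x": -1.0, "y": 0.9, "z": 1.0}
end = {"x": -1.0, "y": 0.9, "z": 1.25}

moved = displacement(start, end)
print(f"agent moved {moved:.2f} scene units")
```

A displacement of zero after a movement action is a quick hint that the agent was blocked.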
Competitor Comparisons
The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
Pros of ml-agents
- More general-purpose, supporting a wider range of AI/ML applications in Unity
- Larger community and more extensive documentation
- Integrates seamlessly with Unity's built-in features and asset pipeline
Cons of ml-agents
- Steeper learning curve for non-Unity developers
- Less focused on embodied AI and robotics simulations
- Requires more setup and configuration for specific use cases
Code Comparison
ml-agents:
public class RollerAgent : Agent
{
public override void OnEpisodeBegin()
{
// Reset agent position and state
}
}
ai2thor:
from ai2thor.controller import Controller
controller = Controller()
event = controller.step(action="MoveAhead")
Summary
ml-agents is a more versatile toolkit for developing AI in Unity, with broader applications and a larger community. However, it may require more effort to set up for specific scenarios. ai2thor, on the other hand, is more focused on embodied AI and robotics simulations, offering a simpler API for these use cases but with less flexibility for general Unity development.
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
Pros of Habitat-lab
- Supports a wider range of 3D datasets and environments
- More flexible and customizable for various embodied AI tasks
- Better performance and scalability for large-scale simulations
Cons of Habitat-lab
- Steeper learning curve and more complex setup process
- Less photorealistic environments compared to AI2-THOR
- Fewer built-in interactive objects and physics simulations
Code Comparison
AI2-THOR example:
from ai2thor.controller import Controller
controller = Controller()
event = controller.step(action="MoveAhead")
Habitat-lab example:
import habitat
# config is assumed to be a Habitat configuration object loaded elsewhere
env = habitat.Env(config=config)
observations = env.reset()
action = {"action": "MOVE_FORWARD"}
observations = env.step(action)
Both frameworks provide similar functionality for agent movement and interaction, but Habitat-lab's API is more flexible and allows for more customization in defining actions and observations.
A toolkit for developing and comparing reinforcement learning algorithms.
Pros of Gym
- Broader scope, supporting a wide range of environments beyond just robotics and vision
- Larger community and ecosystem, with many third-party environments available
- Simpler setup and installation process
Cons of Gym
- Less realistic visual environments compared to AI2-THOR
- Limited built-in support for complex, interactive 3D environments
- Fewer options for physics-based interactions and object manipulations
Code Comparison
AI2-THOR example:
from ai2thor.controller import Controller
controller = Controller()
event = controller.step(action="MoveAhead")
rgb = event.frame
Gym example:
import gym
env = gym.make("CartPole-v1")
env.reset()
observation, reward, done, info = env.step(env.action_space.sample())  # gym < 0.26 API; newer versions return five values
Both libraries use a similar step-based approach for agent interactions, but AI2-THOR provides more detailed visual and physics-based information in its events, while Gym focuses on simpler numerical observations and rewards.
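Because both libraries are step-based, an AI2-THOR-style controller can be wrapped to return a Gym-style tuple. The following is only a sketch of that idea: `StepAdapter` and `FakeController` are hypothetical names, the reward and termination rules are supplied by the caller, and the fake controller exists solely so the example runs without launching the simulator.

```python
from types import SimpleNamespace


class StepAdapter:
    """Wrap any object with step(action=...) and return (obs, reward, done, info)."""

    def __init__(self, controller, reward_fn, done_fn):
        self.controller = controller
        self.reward_fn = reward_fn  # maps an event to a float reward
        self.done_fn = done_fn      # maps an event to a bool

    def step(self, action):
        event = self.controller.step(action=action)
        return event.frame, self.reward_fn(event), self.done_fn(event), event.metadata


class FakeController:
    """Hypothetical stand-in that mimics the shape of a returned event."""

    def step(self, action):
        metadata = {"lastAction": action, "lastActionSuccess": True}
        return SimpleNamespace(frame=[[0, 0, 0]], metadata=metadata)


adapter = StepAdapter(
    FakeController(),
    reward_fn=lambda e: 1.0 if e.metadata["lastActionSuccess"] else 0.0,
    done_fn=lambda e: False,
)
observation, reward, done, info = adapter.step("MoveAhead")
print(reward, done, info["lastAction"])
```

Keeping the reward and termination logic as injected functions leaves the wrapper task-agnostic, which matches the way AI2-THOR itself reports state without defining rewards.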
A customisable 3D platform for agent-based AI research
Pros of DeepMind Lab
- More extensive documentation and tutorials
- Supports a wider range of research tasks and environments
- Better integration with machine learning frameworks like TensorFlow
Cons of DeepMind Lab
- Steeper learning curve for beginners
- Less realistic visuals compared to AI2-THOR
- Limited to first-person perspective interactions
Code Comparison
AI2-THOR example:
from ai2thor.controller import Controller
controller = Controller()
event = controller.step(action="MoveAhead")
DeepMind Lab example:
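import deepmind_lab
import numpy as np
env = deepmind_lab.Lab("seekavoid_arena_01", ["RGB_INTERLEAVED"])
env.reset()
reward = env.step(np.array([0, 0, 0, 1, 0, 0, 0], dtype=np.intc))  # step() returns the reward
obs = env.observations()  # dict holding the requested RGB_INTERLEAVED frame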
AI2-THOR focuses on realistic home environments with object interactions, while DeepMind Lab provides more abstract, game-like environments for reinforcement learning research. AI2-THOR offers a higher level of visual fidelity and physics simulation, making it suitable for tasks involving object manipulation and navigation in realistic settings. DeepMind Lab, on the other hand, excels in providing a diverse range of customizable environments for testing and developing AI agents in various scenarios.
Open source simulator for autonomous vehicles built on Unreal Engine / Unity, from Microsoft AI & Research
Pros of AirSim
- More realistic physics simulation, especially for drones and vehicles
- Supports multiple environments (indoor, outdoor, urban)
- Offers APIs for various programming languages (C++, Python, C#, Java)
Cons of AirSim
- Steeper learning curve and more complex setup
- Built primarily on Unreal Engine, which can be resource-intensive
- Less focused on household environments and object interactions
Code Comparison
AirSim (Python):
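import airsim
# Connect to the simulator and take API control before issuing flight commands
client = airsim.MultirotorClient()
client.confirmConnection()
client.enableApiControl(True)
client.armDisarm(True)
client.takeoffAsync().join()
client.moveToPositionAsync(-10, 10, -10, 5).join()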
AI2-THOR (Python):
from ai2thor.controller import Controller
controller = Controller()
event = controller.step(action="MoveAhead")
event = controller.step(action="RotateRight")
Summary
AirSim excels in realistic physics simulations for drones and vehicles, offering diverse environments and multi-language support. However, it has a steeper learning curve and requires more computational resources. AI2-THOR focuses on household environments and object interactions, with a simpler setup process but less realistic physics. The code examples demonstrate the different approaches: AirSim uses more specific movement commands, while AI2-THOR employs higher-level actions for navigation and interaction.
README
A Near Photo-Realistic Interactable Framework for Embodied AI Agents
Environments
iTHOR | ManipulaTHOR | RoboTHOR |
---|---|---|
A high-level interaction framework that facilitates research in embodied common sense reasoning. | A mid-level interaction framework that facilitates visual manipulation of objects using a robotic arm. | A framework that facilitates Sim2Real research with a collection of simulated scene counterparts in the physical world. |
Features
Scenes. 200+ custom built high-quality scenes. The scenes can be explored on our demo page. We are working on rapidly expanding the number of available scenes and domain randomization within each scene.
Objects. 2600+ custom designed household objects across 100+ object types. Each object is heavily annotated, which allows for near-realistic physics interaction.
Agent Types. Multi-agent support, a custom built LoCoBot agent, a Kinova 3 inspired robotic manipulation agent, and a drone agent.
Actions. 200+ actions that facilitate research in a wide range of interaction and navigation based embodied AI tasks.
Images. First-class support for many image modalities and camera adjustments. Some modalities include ego-centric RGB images, instance segmentation, semantic segmentation, depth frames, normals frames, top-down frames, orthographic projections, and third-person camera frames. Users can also easily change camera properties, such as the size of the images and field of view.
Metadata. After each step in the environment, there is a large amount of sensory data available about the state of the environment. This information can be used to build highly complex custom reward functions.
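As an illustration of a metadata-driven reward, the sketch below scores a "pick up the target object" task. It assumes the metadata dictionary carries an `objects` list whose entries have `objectType`, `visible`, and `isPickedUp` fields; the function name `pickup_reward`, the reward values, and the sample metadata are all made up for the example.

```python
def pickup_reward(metadata: dict, target_type: str, step_penalty: float = -0.01) -> float:
    """Return 1.0 if the target is held, 0.1 if it is merely visible, else a small penalty."""
    targets = [o for o in metadata.get("objects", []) if o.get("objectType") == target_type]
    if any(o.get("isPickedUp") for o in targets):
        return 1.0
    if any(o.get("visible") for o in targets):
        return 0.1
    return step_penalty


# Hypothetical metadata, shaped like what event.metadata is assumed to contain.
sample_metadata = {
    "objects": [
        {"objectType": "Apple", "visible": True, "isPickedUp": False},
        {"objectType": "Mug", "visible": False, "isPickedUp": False},
    ]
}

print(pickup_reward(sample_metadata, "Apple"))
print(pickup_reward(sample_metadata, "Mug"))
```

The same pattern extends to other state fields reported in the metadata, so more elaborate shaping terms can be added without touching the simulator.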
Latest Announcements
Date | Announcement |
---|---|
5/2021 | RandomizeMaterials is now supported! It enables a massive amount of realistic looking domain randomization within each scene. Try it out on the demo! |
4/2021 | We are excited to release ManipulaTHOR, an environment within the AI2-THOR framework that facilitates visual manipulation of objects using a robotic arm. Please see the full 3.0.0 release notes here. |
4/2021 | RandomizeLighting is now supported! It includes many tunable parameters to allow for vast control over its effects. Try it out on the demo! |
2/2021 | We are excited to host the AI2-THOR Rearrangement Challenge, RoboTHOR ObjectNav Challenge, and ALFRED Challenge, held in conjunction with the Embodied AI Workshop at CVPR 2021. |
2/2021 | AI2-THOR v2.7.0 announces several massive speedups to AI2-THOR! Read more about it here. |
6/2020 | We've released AI2-THOR Docker, a mini-framework to simplify running AI2-THOR in Docker. |
4/2020 | Version 2.4.0 update of the framework is here. All sim objects that aren't explicitly part of the environmental structure are now moveable with physics interactions. New object types have been added, and many new actions have been added. Please see the full 2.4.0 release notes here. |
2/2020 | AI2-THOR now includes two frameworks: iTHOR and RoboTHOR. iTHOR includes interactive objects and scenes and RoboTHOR consists of simulated scenes and their corresponding real world counterparts. |
9/2019 | Version 2.1.0 update of the framework has been added. New object types have been added. New Initialization actions have been added. Segmentation image generation has been improved in all scenes. |
6/2019 | Version 2.0 update of the AI2-THOR framework is now live! We have over quadrupled our action and object states, adding new actions that allow visually distinct state changes such as broken screens on electronics, shattered windows, breakable dishware, liquid fillable containers, cleanable dishware, messy and made beds and more! Along with these new state changes, objects have more physical properties like Temperature, Mass, and Salient Materials that are all reported back in object metadata. To combine all of these new properties and actions, new context sensitive interactions can now automatically change object states. This includes interactions like placing a dirty bowl under running sink water to clean it, placing a mug in a coffee machine to automatically fill it with coffee, putting out a lit candle by placing it in water, or placing an object over an active stove burner or in the fridge to change its temperature. Please see the full 2.0 release notes here to view details on all the changes and new features. |
Installation
With Google Colab
AI2-THOR Colab can be used to run AI2-THOR freely in the cloud with Google Colab. Running AI2-THOR in Google Colab makes it extremely easy to explore functionality without having to set AI2-THOR up locally.
With pip
pip install ai2thor
With conda
conda install -c conda-forge ai2thor
With Docker
AI2-THOR Docker can be used, which adds the configuration for running an X server to be used by Unity 3D to render scenes.
Minimal Example
Once you've installed AI2-THOR, you can verify that everything is working correctly by running the following minimal example:
from ai2thor.controller import Controller
controller = Controller(scene="FloorPlan10")
event = controller.step(action="RotateRight")
metadata = event.metadata
print(event, event.metadata.keys())
Requirements
Component | Requirement |
---|---|
OS | Mac OS X 10.9+, Ubuntu 14.04+ |
Graphics Card | DX9 (shader model 3.0) or DX11 with feature level 9.3 capabilities. |
CPU | SSE2 instruction set support. |
Python | Versions 3.5+ |
Linux | X server with GLX module enabled |
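Part of the table above can be checked from Python before installing. The sketch below covers only the Python version and the operating system family; it does not verify OS release, GPU capabilities, CPU features, or the X server. The function name `meets_basic_requirements` is hypothetical, and treating `platform.system()` values `"Darwin"` and `"Linux"` as the supported families is an assumption based on the OS row.

```python
import platform
import sys


def meets_basic_requirements(version_info=sys.version_info, system=None) -> bool:
    """Check Python 3.5+ and a macOS or Linux host; other rows of the table are not checked."""
    system = system or platform.system()
    return tuple(version_info[:2]) >= (3, 5) and system in ("Darwin", "Linux")


if meets_basic_requirements():
    print("Python version and OS family look compatible")
else:
    print("Python version or OS family does not match the requirements table")
```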
Support
Questions. If you have any questions on AI2-THOR, please ask them on our GitHub Discussions Page.
Issues. If you encounter any issues while using AI2-THOR, please open an Issue on GitHub.
Learn more
Section | Description |
---|---|
Demo | Interact and play with AI2-THOR live in the browser. |
iTHOR Documentation | Documentation for the iTHOR environment. |
ManipulaTHOR Documentation | Documentation for the ManipulaTHOR environment. |
RoboTHOR Documentation | Documentation for the RoboTHOR environment. |
AI2-THOR Colab | A way to run AI2-THOR freely on the cloud using Google Colab. |
AllenAct | An Embodied AI Framework built at AI2 that provides first-class support for AI2-THOR. |
AI2-THOR Unity Development | A (sparse) collection of notes that may be useful if editing on the AI2-THOR backend. |
AI2-THOR WebGL Development | Documentation on packaging AI2-THOR for the web, which might be useful for annotation based tasks. |
Citation
If you use AI2-THOR or iTHOR scenes, please cite the original AI2-THOR paper:
@article{ai2thor,
author={Eric Kolve and Roozbeh Mottaghi and Winson Han and
Eli VanderBilt and Luca Weihs and Alvaro Herrasti and
Daniel Gordon and Yuke Zhu and Abhinav Gupta and
Ali Farhadi},
title={{AI2-THOR: An Interactive 3D Environment for Visual AI}},
journal={arXiv},
year={2017}
}
If you use ProcTHOR or procedurally generated scenes, please cite the following paper:
@inproceedings{procthor,
author={Matt Deitke and Eli VanderBilt and Alvaro Herrasti and
Luca Weihs and Jordi Salvador and Kiana Ehsani and
Winson Han and Eric Kolve and Ali Farhadi and
Aniruddha Kembhavi and Roozbeh Mottaghi},
title={{ProcTHOR: Large-Scale Embodied AI Using Procedural Generation}},
booktitle={NeurIPS},
year={2022},
note={Outstanding Paper Award}
}
If you use the ManipulaTHOR agent, please cite the following paper:
@inproceedings{manipulathor,
title={{ManipulaTHOR: A Framework for Visual Object Manipulation}},
author={Kiana Ehsani and Winson Han and Alvaro Herrasti and
Eli VanderBilt and Luca Weihs and Eric Kolve and
Aniruddha Kembhavi and Roozbeh Mottaghi},
booktitle={CVPR},
year={2021}
}
If you use RoboTHOR scenes, please cite the following paper:
@inproceedings{robothor,
author={Matt Deitke and Winson Han and Alvaro Herrasti and
Aniruddha Kembhavi and Eric Kolve and Roozbeh Mottaghi and
Jordi Salvador and Dustin Schwenk and Eli VanderBilt and
Matthew Wallingford and Luca Weihs and Mark Yatskar and
Ali Farhadi},
title={{RoboTHOR: An Open Simulation-to-Real Embodied AI Platform}},
booktitle={CVPR},
year={2020}
}
Our Team
AI2-THOR is an open-source project built by the PRIOR team at the Allen Institute for AI (AI2). AI2 is a non-profit institute with the mission to contribute to humanity through high-impact AI research and engineering.