Top Related Projects
- JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
- PyTorch: Tensors and dynamic neural networks in Python with strong GPU acceleration
- TensorFlow: An open source machine learning framework for everyone
- ONNX Runtime: Cross-platform, high performance ML inferencing and training accelerator
- TVM: Open deep learning compiler stack for CPU, GPU, and specialized accelerators
Quick Overview
IREE (Intermediate Representation Execution Environment) is an open-source compiler and runtime infrastructure for executing machine learning models on a variety of hardware platforms. It aims to provide a unified approach to compiling and deploying ML models across different architectures, including CPUs, GPUs, and specialized accelerators.
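To make that concrete, here is a minimal sketch of the compile-and-run flow with the Python APIs, modeled on IREE's hello-world samples (package names and flags here reflect recent releases and may differ in yours):

import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

# A tiny MLIR program: elementwise multiply of two 4-element tensors.
SIMPLE_MUL_MLIR = """
func.func @simple_mul(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %0 = arith.mulf %a, %b : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Compile to an IREE VM flatbuffer for the portable CPU backend.
vmfb = ireec.compile_str(SIMPLE_MUL_MLIR, target_backends=["llvm-cpu"])

# Load it with the local CPU driver and invoke the exported function.
config = ireert.Config("local-task")
ctx = ireert.SystemContext(config=config)
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, vmfb))
result = ctx.modules.module.simple_mul(
    np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32),
    np.array([5.0, 6.0, 7.0, 8.0], dtype=np.float32),
)
print(result.to_host())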
Pros
- Cross-platform support for various hardware targets
- Optimized performance through advanced compilation techniques
- Importer and integration paths for popular ML frameworks such as TensorFlow and PyTorch
- Active development and community support
Cons
- Relatively new project, still evolving and stabilizing
- Limited documentation and examples for some advanced use cases
- Steeper learning curve compared to some other ML deployment solutions
- May require additional setup and configuration for certain hardware targets
Code Examples
- Compiling a TensorFlow model to IREE (using the TensorFlow importer from the iree-tools-tf package; exact APIs vary between releases):

import tensorflow as tf
from iree.compiler import tf as tfc  # provided by the iree-tools-tf package

# Wrap a simple Keras model in a tf.Module with a concrete input signature
# so the importer knows which function to export.
class PredictModule(tf.Module):
    def __init__(self):
        super().__init__()
        self.model = tf.keras.Sequential([
            tf.keras.layers.Dense(10, input_shape=(5,), activation='relu'),
            tf.keras.layers.Dense(1, activation='sigmoid'),
        ])

    @tf.function(input_signature=[tf.TensorSpec([1, 5], tf.float32)])
    def predict(self, x):
        return self.model(x)

# Compile directly to an IREE VM flatbuffer targeting Vulkan.
compiled_module = tfc.compile_module(
    PredictModule(),
    exported_names=['predict'],
    target_backends=['vulkan-spirv'],
)
- Running an IREE-compiled module:

import numpy as np
import iree.runtime as ireert

# Create an IREE runtime context for the Vulkan driver (runtime driver
# names differ from compile-time backend names).
config = ireert.Config("vulkan")
ctx = ireert.SystemContext(config=config)

# Load the compiled module and invoke the exported function.
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, compiled_module))
f = ctx.modules.module.predict
result = f(np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32))
print(result)
- Using IREE with PyTorch (a sketch using the iree-turbine ahead-of-time export path; package and API names vary between releases):

import numpy as np
import torch
import iree.runtime as ireert
import iree.turbine.aot as aot  # provided by the iree-turbine package

# Define a simple PyTorch model.
class SimpleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(5, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))

model = SimpleModel()

# Export the model and compile it to an IREE VM flatbuffer for the CPU backend.
exported = aot.export(model, torch.randn(1, 5))
compiled_binary = exported.compile(save_to=None)

# Run it with the local CPU driver ("local-task"; "llvm-cpu" is a
# compile-time backend name, not a runtime driver).
config = ireert.Config("local-task")
vm_module = ireert.load_vm_module(
    ireert.VmModule.wrap_buffer(config.vm_instance, compiled_binary.map_memory()),
    config,
)
result = vm_module.main(np.random.rand(1, 5).astype(np.float32))
print(result)
Getting Started
To get started with IREE, follow these steps:
- Install IREE and its dependencies:
pip install iree-base-compiler iree-base-runtime
(These packages were previously published as iree-compiler and iree-runtime.)
- Import IREE in your Python script:
import iree.compiler as ireec
import iree.runtime as ireert
- Compile your ML model to IREE format and run it with the IREE runtime, as shown in the code examples above; a compact end-to-end sketch follows below.
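For a compact end-to-end picture, the ahead-of-time flow can also write a deployable .vmfb artifact that is loaded separately. A sketch, where model.mlir, model.vmfb, and the predict entry point are placeholders:

import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

# Ahead of time: compile an MLIR file (exported from your frontend) to a
# deployable VM flatbuffer. "model.mlir" is a placeholder path.
ireec.compile_file(
    "model.mlir",
    target_backends=["llvm-cpu"],
    output_file="model.vmfb",
)

# At deployment: load the artifact and call an exported function
# ("predict" is a placeholder for whatever the module exports).
module = ireert.load_vm_flatbuffer_file("model.vmfb", driver="local-task")
result = module.predict(np.zeros((1, 5), dtype=np.float32))
print(result)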
For more detailed instructions and advanced usage, refer to the official IREE documentation and examples in the GitHub repository.
Competitor Comparisons
JAX
Pros of JAX
- Widely adopted in the machine learning community, especially for research
- Offers automatic differentiation and GPU/TPU acceleration out of the box
- Provides a more flexible and Pythonic programming model
Cons of JAX
- Steeper learning curve for users not familiar with functional programming concepts
- Limited support for dynamic shapes and control flow compared to IREE
- Smaller ecosystem of tools and integrations outside of the core library
Code Comparison
JAX example:
import jax.numpy as jnp
from jax import grad, jit
def f(x):
return jnp.sum(jnp.sin(x))
grad_f = jit(grad(f))
IREE example (IREE compiles MLIR emitted by a frontend such as JAX via StableHLO, rather than tracing Python; the hand-written MLIR below is illustrative):

import iree.compiler as ireec

SIN_MLIR = """
func.func @f(%x: tensor<4xf32>) -> tensor<4xf32> {
  %0 = math.sin %x : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""
compiled_f = ireec.compile_str(SIN_MLIR, target_backends=["llvm-cpu"])
PyTorch
Pros of PyTorch
- Widely adopted and supported by a large community
- Extensive documentation and tutorials available
- Flexible and intuitive API for deep learning research
Cons of PyTorch
- Larger memory footprint and slower inference compared to specialized runtimes
- Less optimized for mobile and edge devices
- Limited support for specialized hardware accelerators
Code Comparison
PyTorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.matmul(x, y)
IREE:

import numpy as np
import iree.runtime as ireert

# Load a module compiled ahead of time (see the sketch after this
# comparison) that exports a "matmul" function.
module = ireert.load_vm_flatbuffer_file("module.vmfb", driver="vulkan")
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
y = np.array([4.0, 5.0, 6.0], dtype=np.float32)
z = module.matmul(x, y)
IREE focuses on efficient execution across various hardware targets, while PyTorch provides a more general-purpose deep learning framework. IREE's code involves more setup for hardware-specific optimization, whereas PyTorch's API is more straightforward for common deep learning tasks.
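For completeness, the module.vmfb loaded above must be produced ahead of time by the IREE compiler. A sketch of that step follows; the hand-written MLIR is an illustrative elementwise stand-in (real matmuls are normally lowered from a frontend), and backend names can vary by release:

import iree.compiler as ireec

# Illustrative MLIR exporting a "matmul" entry point; an elementwise
# multiply stands in for a real matrix multiply here.
MLIR = """
func.func @matmul(%a: tensor<3xf32>, %b: tensor<3xf32>) -> tensor<3xf32> {
  %0 = arith.mulf %a, %b : tensor<3xf32>
  return %0 : tensor<3xf32>
}
"""

# Compile for the Vulkan backend used by the runtime example above.
with open("module.vmfb", "wb") as f:
    f.write(ireec.compile_str(MLIR, target_backends=["vulkan-spirv"]))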
TensorFlow
Pros of TensorFlow
- Extensive ecosystem with wide industry adoption and support
- Comprehensive documentation and large community for troubleshooting
- Flexible architecture supporting various platforms and devices
Cons of TensorFlow
- Steeper learning curve for beginners
- Can be resource-intensive and slower for certain operations
- Complex setup process for some environments
Code Comparison
IREE example:

import numpy as np
import iree.runtime as ireert

# Load a precompiled module and call its exported "predict" function.
module = ireert.load_vm_flatbuffer_file("model.vmfb", driver="local-task")
result = module.predict(np.array([[1.0, 2.0, 3.0]], dtype=np.float32))
TensorFlow example:
import tensorflow as tf
model = tf.keras.models.load_model("model.h5")
result = model.predict(tf.constant([[1.0, 2.0, 3.0]]))
Both examples demonstrate loading and running a pre-trained model, but IREE uses a more low-level approach with its runtime system, while TensorFlow provides a higher-level API through Keras.
ONNX Runtime
Pros of ONNX Runtime
- Wider industry adoption and support
- Extensive compatibility with various ML frameworks
- Robust performance optimizations for different hardware
Cons of ONNX Runtime
- Larger codebase and potentially more complex setup
- Less focus on embedded and mobile deployments
Code Comparison
IREE example (C API, abbreviated):

// Abbreviated: real code also creates the instance/context and checks statuses.
iree_vm_module_t* module = NULL;
iree_vm_bytecode_module_create(instance, bytecode_span, iree_allocator_null(),
                               iree_allocator_system(), &module);
iree_vm_invoke(context, function, IREE_VM_INVOCATION_FLAG_NONE,
               /*policy=*/NULL, inputs, outputs, iree_allocator_system());
ONNX Runtime example (C++ API):
Ort::Session session(env, model_path, session_options);
auto input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_data, input_size, input_shape.data(), input_shape.size());
auto output_tensors = session.Run(Ort::RunOptions{nullptr}, input_names, &input_tensor, 1, output_names, 1);
Both projects aim to provide efficient runtime environments for machine learning models, but IREE focuses more on compiler techniques and embedded systems, while ONNX Runtime emphasizes broad compatibility and enterprise-scale deployments.
TVM
Pros of TVM
- Broader ecosystem support with multiple frontends and backends
- More mature project with a larger community and extensive documentation
- Flexible and customizable for various hardware targets
Cons of TVM
- Steeper learning curve due to its complexity and extensive features
- Potentially slower compilation times for some use cases
- May require more manual optimization for specific hardware targets
Code Comparison
IREE example (using MLIR):
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
  %0 = stablehlo.multiply %arg0, %arg1 : tensor<4xf32>
  return %0 : tensor<4xf32>
}
TVM example (using Relay):
import tvm
from tvm import relay
x = relay.var("x", shape=(4,), dtype="float32")
y = relay.var("y", shape=(4,), dtype="float32")
z = relay.multiply(x, y)
func = relay.Function([x, y], z)
Both examples demonstrate a simple element-wise multiplication operation, but IREE uses MLIR dialect while TVM uses its Relay IR. TVM's approach is more Python-centric, while IREE's MLIR representation is closer to the underlying hardware abstraction.
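To make the retargeting contrast concrete, the same MLIR source can be compiled for different hardware simply by changing the backend list; a sketch with the IREE Python API, assuming a release that accepts the stablehlo input type:

import iree.compiler as ireec

SIMPLE_MUL = """
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
  %0 = stablehlo.multiply %arg0, %arg1 : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# The same source retargets to CPU or GPU by swapping the backend flag.
cpu_vmfb = ireec.compile_str(SIMPLE_MUL, input_type="stablehlo",
                             target_backends=["llvm-cpu"])
gpu_vmfb = ireec.compile_str(SIMPLE_MUL, input_type="stablehlo",
                             target_backends=["vulkan-spirv"])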
README
IREE: Intermediate Representation Execution Environment
IREE (Intermediate Representation Execution Environment, pronounced as "eerie") is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
Project news
Project status
Release status
Release notes are published on GitHub releases.
| Package | Release status |
| --- | --- |
| GitHub release (stable) | |
| GitHub release (nightly) | |
| Python iree-base-compiler | |
| Python iree-base-runtime | |
Build status
Nightly build status
| Operating system | Build status |
| --- | --- |
| Linux | |
| macOS | |
| Windows | |
For the full list of workflows see https://iree.dev/developers/general/github-actions/.
Communication channels
- GitHub issues: Feature requests, bugs, and other work tracking
- IREE Discord server: Daily development discussions with the core team and collaborators
- (New) iree-announce email list: Announcements
- (New) iree-technical-discussion email list: General and low-priority discussion
- (Legacy) iree-discuss email list: Announcements, general and low-priority discussion
Related project channels
- MLIR topic within LLVM Discourse: IREE is built on and heavily relies on MLIR, and it comes up in some MLIR discussions. Useful if you are also interested in MLIR evolution.
Architecture overview
See our website for more information.
Presentations and talks
Community meeting recordings: IREE YouTube channel
| Date | Title | Recording | Slides |
| --- | --- | --- | --- |
| 2021-06-09 | IREE Runtime Design Tech Talk | recording | slides |
| 2020-08-20 | IREE CodeGen (MLIR Open Design Meeting) | recording | slides |
| 2020-03-18 | Interactive HAL IR Walkthrough | recording | |
| 2020-01-31 | End-to-end MLIR Workflow in IREE (MLIR Open Design Meeting) | recording | slides |
License
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.