Top Related Projects
- JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
- PyTorch: Tensors and dynamic neural networks in Python with strong GPU acceleration
- TensorFlow: An open source machine learning framework for everyone
- ONNX Runtime: Cross-platform, high performance ML inferencing and training accelerator
- TVM: Open deep learning compiler stack for CPU, GPU, and specialized accelerators
Quick Overview
IREE (Intermediate Representation Execution Environment) is an open-source compiler and runtime infrastructure for executing machine learning models on a variety of hardware platforms. It aims to provide a unified approach to compiling and deploying ML models across different architectures, including CPUs, GPUs, and specialized accelerators.
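To make that concrete, here is a minimal sketch of the compile-and-run flow with the Python APIs, modeled on IREE's hello-world samples (package names and flags here reflect recent releases and may differ in yours):

import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

# A tiny MLIR program: elementwise multiply of two 4-element tensors.
SIMPLE_MUL_MLIR = """
func.func @simple_mul(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %0 = arith.mulf %a, %b : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# Compile to an IREE VM flatbuffer for the portable CPU backend.
vmfb = ireec.compile_str(SIMPLE_MUL_MLIR, target_backends=["llvm-cpu"])

# Load it with the local CPU driver and invoke the exported function.
config = ireert.Config("local-task")
ctx = ireert.SystemContext(config=config)
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, vmfb))
result = ctx.modules.module.simple_mul(
    np.array([1.0, 2.0, 3.0, 4.0], dtype=np.float32),
    np.array([5.0, 6.0, 7.0, 8.0], dtype=np.float32),
)
print(result.to_host())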
Pros
- Cross-platform support for various hardware targets
- Optimized performance through advanced compilation techniques
- Importer and integration paths for popular ML frameworks such as TensorFlow and PyTorch
- Active development and community support
Cons
- Relatively new project, still evolving and stabilizing
- Limited documentation and examples for some advanced use cases
- Steeper learning curve compared to some other ML deployment solutions
- May require additional setup and configuration for certain hardware targets
Code Examples
- Compiling a TensorFlow model to IREE (using the TensorFlow importer from the iree-tools-tf package; exact APIs vary between releases):

import tensorflow as tf
from iree.compiler import tf as tfc  # provided by the iree-tools-tf package

# Wrap a simple Keras model in a tf.Module with a concrete input signature
# so the importer knows which function to export.
class PredictModule(tf.Module):
    def __init__(self):
        super().__init__()
        self.model = tf.keras.Sequential([
            tf.keras.layers.Dense(10, input_shape=(5,), activation='relu'),
            tf.keras.layers.Dense(1, activation='sigmoid'),
        ])

    @tf.function(input_signature=[tf.TensorSpec([1, 5], tf.float32)])
    def predict(self, x):
        return self.model(x)

# Compile directly to an IREE VM flatbuffer targeting Vulkan.
compiled_module = tfc.compile_module(
    PredictModule(),
    exported_names=['predict'],
    target_backends=['vulkan-spirv'],
)
- Running an IREE-compiled module:

import numpy as np
import iree.runtime as ireert

# Create an IREE runtime context for the Vulkan driver (runtime driver
# names differ from compile-time backend names).
config = ireert.Config("vulkan")
ctx = ireert.SystemContext(config=config)

# Load the compiled module and invoke the exported function.
ctx.add_vm_module(ireert.VmModule.copy_buffer(ctx.instance, compiled_module))
f = ctx.modules.module.predict
result = f(np.array([[1.0, 2.0, 3.0, 4.0, 5.0]], dtype=np.float32))
print(result)
- Using IREE with PyTorch (a sketch using the iree-turbine ahead-of-time export path; package and API names vary between releases):

import numpy as np
import torch
import iree.runtime as ireert
import iree.turbine.aot as aot  # provided by the iree-turbine package

# Define a simple PyTorch model.
class SimpleModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(5, 1)

    def forward(self, x):
        return torch.sigmoid(self.linear(x))

model = SimpleModel()

# Export the model and compile it to an IREE VM flatbuffer for the CPU backend.
exported = aot.export(model, torch.randn(1, 5))
compiled_binary = exported.compile(save_to=None)

# Run it with the local CPU driver ("local-task"; "llvm-cpu" is a
# compile-time backend name, not a runtime driver).
config = ireert.Config("local-task")
vm_module = ireert.load_vm_module(
    ireert.VmModule.wrap_buffer(config.vm_instance, compiled_binary.map_memory()),
    config,
)
result = vm_module.main(np.random.rand(1, 5).astype(np.float32))
print(result)
Getting Started
To get started with IREE, follow these steps:
- Install IREE and its dependencies:
pip install iree-base-compiler iree-base-runtime
(These packages were previously published as iree-compiler and iree-runtime.)
- Import IREE in your Python script:
import iree.compiler as ireec
import iree.runtime as ireert
- Compile your ML model to IREE format and run it with the IREE runtime, as shown in the code examples above; a compact end-to-end sketch follows below.
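For a compact end-to-end picture, the ahead-of-time flow can also write a deployable .vmfb artifact that is loaded separately. A sketch, where model.mlir, model.vmfb, and the predict entry point are placeholders:

import numpy as np
import iree.compiler as ireec
import iree.runtime as ireert

# Ahead of time: compile an MLIR file (exported from your frontend) to a
# deployable VM flatbuffer. "model.mlir" is a placeholder path.
ireec.compile_file(
    "model.mlir",
    target_backends=["llvm-cpu"],
    output_file="model.vmfb",
)

# At deployment: load the artifact and call an exported function
# ("predict" is a placeholder for whatever the module exports).
module = ireert.load_vm_flatbuffer_file("model.vmfb", driver="local-task")
result = module.predict(np.zeros((1, 5), dtype=np.float32))
print(result)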
For more detailed instructions and advanced usage, refer to the official IREE documentation and examples in the GitHub repository.
Competitor Comparisons
JAX
Pros of JAX
- Widely adopted in the machine learning community, especially for research
- Offers automatic differentiation and GPU/TPU acceleration out of the box
- Provides a more flexible and Pythonic programming model
Cons of JAX
- Steeper learning curve for users not familiar with functional programming concepts
- Limited support for dynamic shapes and control flow compared to IREE
- Smaller ecosystem of tools and integrations outside of the core library
Code Comparison
JAX example:
import jax.numpy as jnp
from jax import grad, jit
def f(x):
return jnp.sum(jnp.sin(x))
grad_f = jit(grad(f))
IREE example (IREE compiles MLIR emitted by a frontend such as JAX via StableHLO, rather than tracing Python; the hand-written MLIR below is illustrative):

import iree.compiler as ireec

SIN_MLIR = """
func.func @f(%x: tensor<4xf32>) -> tensor<4xf32> {
  %0 = math.sin %x : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""
compiled_f = ireec.compile_str(SIN_MLIR, target_backends=["llvm-cpu"])
PyTorch
Pros of PyTorch
- Widely adopted and supported by a large community
- Extensive documentation and tutorials available
- Flexible and intuitive API for deep learning research
Cons of PyTorch
- Larger memory footprint and slower inference compared to specialized runtimes
- Less optimized for mobile and edge devices
- Limited support for specialized hardware accelerators
Code Comparison
PyTorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.matmul(x, y)
IREE:

import numpy as np
import iree.runtime as ireert

# Load a module compiled ahead of time (see the sketch after this
# comparison) that exports a "matmul" function.
module = ireert.load_vm_flatbuffer_file("module.vmfb", driver="vulkan")
x = np.array([1.0, 2.0, 3.0], dtype=np.float32)
y = np.array([4.0, 5.0, 6.0], dtype=np.float32)
z = module.matmul(x, y)
IREE focuses on efficient execution across various hardware targets, while PyTorch provides a more general-purpose deep learning framework. IREE's code involves more setup for hardware-specific optimization, whereas PyTorch's API is more straightforward for common deep learning tasks.
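For completeness, the module.vmfb loaded above must be produced ahead of time by the IREE compiler. A sketch of that step follows; the hand-written MLIR is an illustrative elementwise stand-in (real matmuls are normally lowered from a frontend), and backend names can vary by release:

import iree.compiler as ireec

# Illustrative MLIR exporting a "matmul" entry point; an elementwise
# multiply stands in for a real matrix multiply here.
MLIR = """
func.func @matmul(%a: tensor<3xf32>, %b: tensor<3xf32>) -> tensor<3xf32> {
  %0 = arith.mulf %a, %b : tensor<3xf32>
  return %0 : tensor<3xf32>
}
"""

# Compile for the Vulkan backend used by the runtime example above.
with open("module.vmfb", "wb") as f:
    f.write(ireec.compile_str(MLIR, target_backends=["vulkan-spirv"]))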
TensorFlow
Pros of TensorFlow
- Extensive ecosystem with wide industry adoption and support
- Comprehensive documentation and large community for troubleshooting
- Flexible architecture supporting various platforms and devices
Cons of TensorFlow
- Steeper learning curve for beginners
- Can be resource-intensive and slower for certain operations
- Complex setup process for some environments
Code Comparison
IREE example:

import numpy as np
import iree.runtime as ireert

# Load a precompiled module and call its exported "predict" function.
module = ireert.load_vm_flatbuffer_file("model.vmfb", driver="local-task")
result = module.predict(np.array([[1.0, 2.0, 3.0]], dtype=np.float32))
TensorFlow example:
import tensorflow as tf
model = tf.keras.models.load_model("model.h5")
result = model.predict(tf.constant([[1.0, 2.0, 3.0]]))
Both examples demonstrate loading and running a pre-trained model, but IREE uses a more low-level approach with its runtime system, while TensorFlow provides a higher-level API through Keras.
ONNX Runtime
Pros of ONNX Runtime
- Wider industry adoption and support
- Extensive compatibility with various ML frameworks
- Robust performance optimizations for different hardware
Cons of ONNX Runtime
- Larger codebase and potentially more complex setup
- Less focus on embedded and mobile deployments
Code Comparison
IREE example (C API, abbreviated):

// Abbreviated: real code also creates the instance/context and checks statuses.
iree_vm_module_t* module = NULL;
iree_vm_bytecode_module_create(instance, bytecode_span, iree_allocator_null(),
                               iree_allocator_system(), &module);
iree_vm_invoke(context, function, IREE_VM_INVOCATION_FLAG_NONE,
               /*policy=*/NULL, inputs, outputs, iree_allocator_system());
ONNX Runtime example (C++ API):
Ort::Session session(env, model_path, session_options);
auto input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_data, input_size, input_shape.data(), input_shape.size());
auto output_tensors = session.Run(Ort::RunOptions{nullptr}, input_names, &input_tensor, 1, output_names, 1);
Both projects aim to provide efficient runtime environments for machine learning models, but IREE focuses more on compiler techniques and embedded systems, while ONNX Runtime emphasizes broad compatibility and enterprise-scale deployments.
TVM
Pros of TVM
- Broader ecosystem support with multiple frontends and backends
- More mature project with a larger community and extensive documentation
- Flexible and customizable for various hardware targets
Cons of TVM
- Steeper learning curve due to its complexity and extensive features
- Potentially slower compilation times for some use cases
- May require more manual optimization for specific hardware targets
Code Comparison
IREE example (using MLIR):
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
  %0 = stablehlo.multiply %arg0, %arg1 : tensor<4xf32>
  return %0 : tensor<4xf32>
}
TVM example (using Relay):
import tvm
from tvm import relay
x = relay.var("x", shape=(4,), dtype="float32")
y = relay.var("y", shape=(4,), dtype="float32")
z = relay.multiply(x, y)
func = relay.Function([x, y], z)
Both examples demonstrate a simple element-wise multiplication operation, but IREE uses MLIR dialect while TVM uses its Relay IR. TVM's approach is more Python-centric, while IREE's MLIR representation is closer to the underlying hardware abstraction.
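To make the retargeting contrast concrete, the same MLIR source can be compiled for different hardware simply by changing the backend list; a sketch with the IREE Python API, assuming a release that accepts the stablehlo input type:

import iree.compiler as ireec

SIMPLE_MUL = """
func.func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
  %0 = stablehlo.multiply %arg0, %arg1 : tensor<4xf32>
  return %0 : tensor<4xf32>
}
"""

# The same source retargets to CPU or GPU by swapping the backend flag.
cpu_vmfb = ireec.compile_str(SIMPLE_MUL, input_type="stablehlo",
                             target_backends=["llvm-cpu"])
gpu_vmfb = ireec.compile_str(SIMPLE_MUL, input_type="stablehlo",
                             target_backends=["vulkan-spirv"])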
README
IREE: Intermediate Representation Execution Environment
IREE (Intermediate Representation Execution Environment, pronounced as "eerie") is an MLIR-based end-to-end compiler and runtime that lowers Machine Learning (ML) models to a unified IR that scales up to meet the needs of the datacenter and down to satisfy the constraints and special considerations of mobile and edge deployments.
See our website for project details, user guides, and instructions on building from source.
Project news
Project status
Release status
Release notes are published on GitHub releases.
| Package | Release status |
| --- | --- |
| GitHub release (stable) | |
| GitHub release (nightly) | |
| Python iree-base-compiler | |
| Python iree-base-runtime | |
Build status
Nightly build status
| Operating system | Build status |
| --- | --- |
| Linux | |
| macOS | |
| Windows | |
For the full list of workflows see https://iree.dev/developers/general/github-actions/.
Communication channels
- GitHub issues: Feature requests, bugs, and other work tracking
- IREE Discord server: Daily development discussions with the core team and collaborators
- (New) iree-announce email list: Announcements
- (New) iree-technical-discussion email list: General and low-priority discussion
- (Legacy) iree-discuss email list: Announcements, general and low-priority discussion
Related project channels
- MLIR topic within LLVM Discourse: IREE is built on and heavily relies on MLIR, and it comes up in some MLIR discussions. Useful if you are also interested in MLIR evolution.
Architecture overview
See our website for more information.
Presentations and talks
Community meeting recordings: IREE YouTube channel
| Date | Title | Recording | Slides |
| --- | --- | --- | --- |
| 2021-06-09 | IREE Runtime Design Tech Talk | recording | slides |
| 2020-08-20 | IREE CodeGen (MLIR Open Design Meeting) | recording | slides |
| 2020-03-18 | Interactive HAL IR Walkthrough | recording | |
| 2020-01-31 | End-to-end MLIR Workflow in IREE (MLIR Open Design Meeting) | recording | slides |
License
IREE is licensed under the terms of the Apache 2.0 License with LLVM Exceptions. See LICENSE for more information.