onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Top Related Projects
- TensorFlow: An Open Source Machine Learning Framework for Everyone
- PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- Core ML Tools: supporting tools for Core ML model conversion, editing, and validation
- JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
- TVM: Open deep learning compiler stack for CPU, GPU, and specialized accelerators
- OpenBLAS: an optimized BLAS library based on GotoBLAS2 1.13 BSD version
Quick Overview
ONNX Runtime is a cross-platform, high-performance machine learning inference and training accelerator. It's designed to optimize and accelerate machine learning models across various hardware platforms and operating systems, supporting models from popular frameworks like PyTorch, TensorFlow, and scikit-learn.
Pros
- Improved performance and reduced inference time for machine learning models
- Wide compatibility with various ML frameworks and hardware platforms
- Automatic optimization of models for specific hardware
- Supports both CPU and GPU acceleration (see the configuration sketch below)
Cons
- Learning curve for integration into existing ML pipelines
- Limited support for some specialized or custom operations
- Potential compatibility issues with older model versions
- May require model conversion for some frameworks
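The hardware-related points above come down to two knobs: execution providers, which select the backend in priority order, and the graph optimization level. A minimal configuration sketch ("model.onnx" is a placeholder path; the provider names are the standard identifiers):

import onnxruntime as ort

# Graph optimizations (constant folding, node fusion, etc.) are applied
# automatically; SessionOptions exposes the optimization level.
opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# Execution providers pick the hardware backend, tried in priority order.
session = ort.InferenceSession(
    "model.onnx",
    sess_options=opts,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)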
Code Examples
- Loading and running an ONNX model:
import onnxruntime as ort
import numpy as np
# Load the ONNX model
session = ort.InferenceSession("model.onnx")
# Prepare input data
input_name = session.get_inputs()[0].name
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
# Run inference (session.run returns a list of output arrays)
outputs = session.run(None, {input_name: input_data})
- Converting a PyTorch model to ONNX:
import torch
import torch.nn as nn
class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 5)

    def forward(self, x):
        return self.fc(x)
model = SimpleModel()
dummy_input = torch.randn(1, 10)
torch.onnx.export(model, dummy_input, "simple_model.onnx")
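In practice it helps to name the graph inputs and outputs and to mark the batch dimension as dynamic. A hedged variant of the export call above (the names "input"/"output" and the dynamic batch axis are illustrative choices, not requirements):

torch.onnx.export(
    model,
    dummy_input,
    "simple_model.onnx",
    input_names=["input"],    # illustrative names
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}},  # allow variable batch size
)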
- Quantizing an ONNX model:
from onnxruntime.quantization import quantize_dynamic, QuantType

# quantize_dynamic reads the model from disk and writes the quantized
# model directly, so no separate onnx.load/onnx.save step is needed
quantize_dynamic(
    "model.onnx",
    "quantized_model.onnx",
    weight_type=QuantType.QUInt8,
)
Getting Started
To get started with ONNX Runtime, follow these steps:
- Install ONNX Runtime:
pip install onnxruntime
- Load and run an ONNX model:
import onnxruntime as ort
import numpy as np
session = ort.InferenceSession("path/to/your/model.onnx")
input_name = session.get_inputs()[0].name
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
output = session.run(None, {input_name: input_data})
- For GPU acceleration, install the GPU version:
pip install onnxruntime-gpu
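After installing, you can check which execution providers your build actually exposes:

import onnxruntime as ort

# Lists the providers compiled into this build; with onnxruntime-gpu
# installed this typically includes 'CUDAExecutionProvider'
print(ort.get_available_providers())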
Competitor Comparisons
TensorFlow: An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Larger ecosystem with more tools, libraries, and community support
- Better support for distributed and large-scale machine learning
- More comprehensive documentation and tutorials
Cons of TensorFlow
- Steeper learning curve, especially for beginners
- Slower execution speed for some operations compared to ONNX Runtime
- Larger file size and memory footprint
Code Comparison
TensorFlow:
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
ONNX Runtime:
import onnxruntime as ort
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_data})
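The two snippets meet through a converter such as the separate tf2onnx package. A minimal sketch, assuming tf2onnx is installed and that the Keras model above has a known input shape (the (None, 32) signature here is a hypothetical example):

import tensorflow as tf
import tf2onnx

# Convert the Keras model to ONNX; opset 13 is an arbitrary, widely supported choice
spec = (tf.TensorSpec((None, 32), tf.float32, name="input"),)  # hypothetical shape
tf2onnx.convert.from_keras(model, input_signature=spec, opset=13,
                           output_path="model.onnx")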
Both repositories provide powerful frameworks for machine learning and deep learning. TensorFlow offers a more comprehensive ecosystem with extensive tools and libraries, making it suitable for complex projects and research. However, it comes with a steeper learning curve and potentially slower execution for some operations.
ONNX Runtime, on the other hand, focuses on providing a lightweight and efficient inference engine for various machine learning models. It offers faster execution speed for certain operations and easier deployment across different platforms, but may have a smaller ecosystem compared to TensorFlow.
PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- More flexible and dynamic computational graph, allowing for easier debugging and experimentation
- Extensive ecosystem with a wide range of pre-trained models and libraries
- Strong community support and frequent updates
Cons of PyTorch
- Generally slower inference speed compared to ONNX Runtime
- Larger model file sizes, which can be a concern for deployment on edge devices
- Steeper learning curve for beginners due to its dynamic nature
Code Comparison
PyTorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.add(x, y)
ONNX Runtime:
import onnxruntime as ort
import numpy as np

# Input dtypes must match the model's declared input types (float32 here)
x = np.array([1, 2, 3], dtype=np.float32)
y = np.array([4, 5, 6], dtype=np.float32)
sess = ort.InferenceSession("model.onnx")
z = sess.run(None, {"input1": x, "input2": y})[0]
The code examples demonstrate the difference in approach between PyTorch's dynamic computation and ONNX Runtime's static graph execution. PyTorch allows for more intuitive tensor operations, while ONNX Runtime requires a pre-defined model and explicit input/output handling.
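For completeness, the "model.onnx" assumed above could itself be produced from PyTorch. A minimal sketch that exports an addition graph with the input names used in the snippet:

import torch

class Add(torch.nn.Module):
    def forward(self, a, b):
        return a + b

torch.onnx.export(
    Add(),
    (torch.randn(3), torch.randn(3)),
    "model.onnx",
    input_names=["input1", "input2"],
    output_names=["sum"],
)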
Core ML Tools: supporting tools for Core ML model conversion, editing, and validation
Pros of Core ML Tools
- Specifically designed for Apple platforms, offering seamless integration with iOS, macOS, and other Apple devices
- Provides tools for converting models from various frameworks (TensorFlow, Keras, scikit-learn) to Core ML format
- Supports on-device machine learning, optimizing for performance and privacy on Apple hardware
Cons of Core ML Tools
- Limited to Apple ecosystem, lacking cross-platform support
- Fewer supported model types and operations compared to ONNX Runtime
- Smaller community and ecosystem compared to the more widely-used ONNX format
Code Comparison
Core ML Tools conversion example:
import coremltools as ct
keras_model = ... # Your Keras model
coreml_model = ct.convert(keras_model)
coreml_model.save("model.mlmodel")
ONNX Runtime inference example:
import onnxruntime as ort
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_data})
Both libraries serve different purposes: Core ML Tools focuses on model conversion for Apple platforms, while ONNX Runtime is a cross-platform inference engine. Choose based on your target platform and specific requirements.
JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Pros of JAX
- Offers automatic differentiation and GPU/TPU acceleration
- Provides a more flexible and customizable framework for machine learning research
- Supports functional programming paradigms, enabling easier composition of operations
Cons of JAX
- Steeper learning curve compared to ONNX Runtime
- Less optimized for production deployment and inference
- Smaller ecosystem and fewer pre-built models available
Code Comparison
ONNX Runtime example:
import onnxruntime as ort
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_data})
JAX example:
import jax.numpy as jnp
from jax import grad, jit
def loss_fn(x):
    return jnp.sum(x**2)
grad_fn = jit(grad(loss_fn))
result = grad_fn(jnp.array([1.0, 2.0, 3.0]))
TVM: Open deep learning compiler stack for CPU, GPU, and specialized accelerators
Pros of TVM
- More flexible and customizable for different hardware targets
- Supports a wider range of deep learning frameworks
- Offers advanced graph-level optimizations
Cons of TVM
- Steeper learning curve and more complex to use
- Less mature and stable compared to ONNX Runtime
- Smaller community and ecosystem
Code Comparison
TVM example:
import tvm
from tvm import relay
# Define a simple network
data = relay.var("data", relay.TensorType((1, 3, 224, 224), "float32"))
# A concrete weight type lets type inference and compilation succeed
weight = relay.var("weight", relay.TensorType((16, 3, 3, 3), "float32"))
conv2d = relay.nn.conv2d(data, weight)
func = relay.Function([data, weight], conv2d)
mod = tvm.IRModule.from_expr(func)
# Compile the network for CPU
target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target)
ONNX Runtime example:
import onnxruntime as ort
import numpy as np
# Load pre-trained ONNX model
session = ort.InferenceSession("model.onnx")
# Prepare input data
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)
# Run inference ("input" must match the model's declared input name)
output = session.run(None, {"input": input_data})
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
Pros of OpenBLAS
- Highly optimized linear algebra operations for various CPU architectures
- Lightweight and focused on BLAS (Basic Linear Algebra Subprograms) functionality
- Open-source with a strong community and long-standing reputation in scientific computing
Cons of OpenBLAS
- Limited to CPU operations, lacking GPU support unlike ONNX Runtime
- Narrower scope, focusing primarily on linear algebra operations rather than a full machine learning inference framework
- May require more manual integration and optimization for complex ML workflows
Code Comparison
OpenBLAS (C):
#include <stdio.h>
#include <cblas.h>

int main(void) {
    double x[] = {1, 2, 3, 4};
    double y[] = {5, 6, 7, 8};
    cblas_daxpy(4, 2.0, x, 1, y, 1);  /* y = 2.0 * x + y */
    printf("%.1f %.1f %.1f %.1f\n", y[0], y[1], y[2], y[3]);
}
ONNX Runtime (Python):
import onnxruntime as ort
import numpy as np
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: np.array([1, 2, 3, 4]).astype(np.float32)})
While OpenBLAS excels in optimized linear algebra operations, ONNX Runtime provides a more comprehensive solution for machine learning inference across various hardware platforms. OpenBLAS is ideal for projects requiring high-performance linear algebra, while ONNX Runtime is better suited for end-to-end ML deployment and inference tasks.
README
ONNX Runtime is a cross-platform inference and training machine-learning accelerator.
ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →
ONNX Runtime training can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. Learn more →
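The "one-line addition" refers to wrapping an existing model with ORTModule. A minimal sketch, assuming the onnxruntime-training package is installed:

from onnxruntime.training import ORTModule

model = ORTModule(model)  # the one-line change; the training loop stays the same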
Get Started & Resources
- General Information: onnxruntime.ai
- Usage documentation and tutorials: onnxruntime.ai/docs
- YouTube video tutorials: youtube.com/@ONNXRuntime
- Companion sample repositories:
  - ONNX Runtime Inferencing: microsoft/onnxruntime-inference-examples
  - ONNX Runtime Training: microsoft/onnxruntime-training-examples
Builtin Pipeline Status
Inference and training build pipelines cover Windows, Linux, Mac, Android, iOS, Web, and other platforms.
Third-party Pipeline Status
Third-party inference and training pipelines cover Linux.
Data/Telemetry
Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.
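On the Python side, telemetry can also be turned off programmatically; a short sketch using the opt-out call exposed by the package:

import onnxruntime as ort

# Opt out of telemetry collection for this process
ort.disable_telemetry_events()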
Contributions and Feedback
We welcome contributions! Please see the contribution guidelines.
For feature requests or bug reports, please file a GitHub Issue.
For general discussion or questions, please use GitHub Discussions.
Code of Conduct
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
License
This project is licensed under the MIT License.