microsoft/onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

14,417
2,896
14,417
2,809

Top Related Projects

  • TensorFlow (186,879): An Open Source Machine Learning Framework for Everyone
  • PyTorch (85,015): Tensors and Dynamic neural networks in Python with strong GPU acceleration
  • Core ML Tools: Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
  • JAX (30,218): Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
  • TVM (11,694): Open deep learning compiler stack for cpu, gpu and specialized accelerators
  • OpenBLAS: OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

Quick Overview

ONNX Runtime is a cross-platform, high-performance machine learning inference and training accelerator. It's designed to optimize and accelerate machine learning models across various hardware platforms and operating systems, supporting models from popular frameworks like PyTorch, TensorFlow, and scikit-learn.

Pros

  • Improved performance and reduced inference time for machine learning models
  • Wide compatibility with various ML frameworks and hardware platforms
  • Automatic optimization of models for specific hardware
  • Supports both CPU and GPU acceleration
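
As a rough illustration of the CPU/GPU point above, the sketch below picks execution providers in priority order. The helper names `choose_providers` and `make_session` are hypothetical; `get_available_providers` and the `providers` argument of `InferenceSession` are part of the ONNX Runtime Python API. The CUDA provider is assumed to be present only when a GPU build such as `onnxruntime-gpu` is installed.

```python
def choose_providers(available, preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the preferred providers that are actually available, in priority order."""
    chosen = [name for name in preferred if name in available]
    # Fall back to the CPU provider if none of the preferred ones are present.
    return chosen or ["CPUExecutionProvider"]


def make_session(model_path):
    """Hypothetical helper: open a session on the best available provider."""
    import onnxruntime as ort  # imported lazily so choose_providers has no dependencies

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

Listing the CPU provider last keeps a fallback in place when the GPU provider cannot be used.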

Cons

  • Learning curve for integration into existing ML pipelines
  • Limited support for some specialized or custom operations
  • Potential compatibility issues with older model versions
  • May require model conversion for some frameworks

Code Examples

  1. Loading and running an ONNX model:
import onnxruntime as ort
import numpy as np

# Load the ONNX model
session = ort.InferenceSession("model.onnx")

# Prepare input data
input_name = session.get_inputs()[0].name
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)

# Run inference
output = session.run(None, {input_name: input_data})
  2. Converting a PyTorch model to ONNX:
import torch
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super(SimpleModel, self).__init__()
        self.fc = nn.Linear(10, 5)

    def forward(self, x):
        return self.fc(x)

model = SimpleModel()
dummy_input = torch.randn(1, 10)

torch.onnx.export(model, dummy_input, "simple_model.onnx")
  3. Quantizing an ONNX model:
from onnxruntime.quantization import QuantType, quantize_dynamic

# Quantize the weights to unsigned 8-bit integers. quantize_dynamic reads the
# model from the input path and writes the quantized model to the output path.
quantize_dynamic("model.onnx", "quantized_model.onnx", weight_type=QuantType.QUInt8)
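
To make the idea behind 8-bit weight quantization more concrete, here is a minimal NumPy sketch of affine (scale and zero-point) quantization to unsigned 8-bit integers. It illustrates the general technique only and is not ONNX Runtime's implementation; the function names are hypothetical.

```python
import numpy as np


def quantize_uint8(weights):
    """Map float weights onto 0..255 with an affine scale and zero point."""
    # Include 0.0 in the range so that zero is exactly representable.
    w_min = min(float(weights.min()), 0.0)
    w_max = max(float(weights.max()), 0.0)
    scale = (w_max - w_min) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = int(round(-w_min / scale))
    q = np.clip(np.round(weights / scale) + zero_point, 0, 255).astype(np.uint8)
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Recover approximate float values from the quantized representation."""
    return (q.astype(np.float32) - zero_point) * scale


weights = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zero_point = quantize_uint8(weights)
restored = dequantize(q, scale, zero_point)
max_error = float(np.abs(restored - weights).max())  # bounded by roughly one scale step
```

Each weight is stored in one byte instead of four, at the cost of a rounding error on the order of `scale`.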

Getting Started

To get started with ONNX Runtime, follow these steps:

  1. Install ONNX Runtime:
pip install onnxruntime
  2. Load and run an ONNX model:
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("path/to/your/model.onnx")
input_name = session.get_inputs()[0].name
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
output = session.run(None, {input_name: input_data})
  3. For GPU acceleration, install the GPU version:
pip install onnxruntime-gpu
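
The hard-coded input shape in step 2 only suits models that expect a 1×3×224×224 tensor. A more general sketch, assuming a model with a single float32 input, reads the shape from the session and substitutes 1 for any dynamic dimension (which ONNX Runtime reports as a string name or None). The helper names are hypothetical.

```python
import numpy as np


def concrete_shape(shape, default=1):
    """Replace symbolic or unknown dimensions (strings or None) with a fixed size."""
    return tuple(dim if isinstance(dim, int) else default for dim in shape)


def random_input_for(session):
    """Hypothetical helper: build a random float32 feed for a single-input model."""
    meta = session.get_inputs()[0]
    data = np.random.standard_normal(concrete_shape(meta.shape)).astype(np.float32)
    return {meta.name: data}
```

With a loaded session, `session.run(None, random_input_for(session))` would then serve as a quick smoke test.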

Competitor Comparisons

186,879

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Larger ecosystem with more tools, libraries, and community support
  • Better support for distributed and large-scale machine learning
  • More comprehensive documentation and tutorials

Cons of TensorFlow

  • Steeper learning curve, especially for beginners
  • Slower execution speed for some operations compared to ONNX Runtime
  • Larger file size and memory footprint

Code Comparison

TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

ONNX Runtime:

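import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
# Placeholder input; the shape must match what model.onnx expects
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
output = session.run(None, {input_name: input_data})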

Both repositories provide powerful frameworks for machine learning and deep learning. TensorFlow offers a more comprehensive ecosystem with extensive tools and libraries, making it suitable for complex projects and research. However, it comes with a steeper learning curve and potentially slower execution for some operations.

ONNX Runtime, on the other hand, focuses on providing a lightweight and efficient inference engine for various machine learning models. It offers faster execution speed for certain operations and easier deployment across different platforms, but may have a smaller ecosystem compared to TensorFlow.
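
Speed claims like these depend heavily on the model, hardware, and configuration, so they are best checked by measurement. The following is a minimal timing sketch using only the standard library. `measure_latency` is a hypothetical helper, and the usage mentioned in the comment assumes a `session`, `input_name`, and `input_data` like those in the ONNX Runtime snippet above.

```python
import statistics
import time


def measure_latency(fn, warmup=3, runs=20):
    """Call fn repeatedly and return (median, worst) latency in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (lazy initialisation, caches)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


# Stand-in workload so the sketch runs without a model; with ONNX Runtime one
# would pass, for example, lambda: session.run(None, {input_name: input_data}).
median_ms, worst_ms = measure_latency(lambda: sum(range(10_000)))
```

Reporting the median alongside the worst case makes occasional slow runs visible without letting them dominate the result.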

85,015

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • More flexible and dynamic computational graph, allowing for easier debugging and experimentation
  • Extensive ecosystem with a wide range of pre-trained models and libraries
  • Strong community support and frequent updates

Cons of PyTorch

  • Generally slower inference speed compared to ONNX Runtime
  • Larger model file sizes, which can be a concern for deployment on edge devices
  • Steeper learning curve for beginners due to its dynamic nature

Code Comparison

PyTorch:

import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.add(x, y)

ONNX Runtime:

import onnxruntime as ort
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
# Assumes model.onnx adds two integer inputs named "input1" and "input2"
sess = ort.InferenceSession("model.onnx")
z = sess.run(None, {"input1": x, "input2": y})[0]

The code examples demonstrate the difference in approach between PyTorch's dynamic computation and ONNX Runtime's static graph execution. PyTorch allows for more intuitive tensor operations, while ONNX Runtime requires a pre-defined model and explicit input/output handling.
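
To illustrate the explicit input handling mentioned above: ONNX Runtime reports each input's element type as a string such as `tensor(float)`, and the arrays passed to `run` need a matching dtype (a plain `np.array([1, 2, 3])` is an integer array, for example). A small sketch of casting inputs before inference follows. `ONNX_TO_NUMPY` and `build_feed` are hypothetical names, and the mapping covers only a few common types.

```python
import numpy as np

# Partial mapping from ONNX Runtime type strings to NumPy dtypes (not exhaustive).
ONNX_TO_NUMPY = {
    "tensor(float)": np.float32,
    "tensor(double)": np.float64,
    "tensor(int64)": np.int64,
    "tensor(int32)": np.int32,
    "tensor(bool)": np.bool_,
}


def build_feed(input_metas, arrays):
    """Cast each array to the dtype its input expects and key it by input name.

    input_metas: objects with .name and .type, e.g. the list from session.get_inputs()
    arrays: dict mapping input name to an array-like value
    """
    feed = {}
    for meta in input_metas:
        if meta.name not in arrays:
            raise KeyError(f"missing input: {meta.name}")
        dtype = ONNX_TO_NUMPY.get(meta.type)
        if dtype is None:
            raise TypeError(f"unsupported input type: {meta.type}")
        feed[meta.name] = np.asarray(arrays[meta.name], dtype=dtype)
    return feed
```

With a real session this would be called as `sess.run(None, build_feed(sess.get_inputs(), {"input1": x, "input2": y}))`.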

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.

Pros of Core ML Tools

  • Specifically designed for Apple platforms, offering seamless integration with iOS, macOS, and other Apple devices
  • Provides tools for converting models from various frameworks (TensorFlow, Keras, scikit-learn) to Core ML format
  • Supports on-device machine learning, optimizing for performance and privacy on Apple hardware

Cons of Core ML Tools

  • Limited to Apple ecosystem, lacking cross-platform support
  • Fewer supported model types and operations compared to ONNX Runtime
  • Smaller community and ecosystem compared to the more widely-used ONNX format

Code Comparison

Core ML Tools conversion example:

import coremltools as ct

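keras_model = ...  # Your Keras model
# Request the neural network format explicitly so the result can be saved as .mlmodel
coreml_model = ct.convert(keras_model, convert_to="neuralnetwork")
coreml_model.save("model.mlmodel")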

ONNX Runtime inference example:

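import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
# Placeholder input; the shape must match what model.onnx expects
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
output = session.run(None, {input_name: input_data})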

Both libraries serve different purposes: Core ML Tools focuses on model conversion for Apple platforms, while ONNX Runtime is a cross-platform inference engine. Choose based on your target platform and specific requirements.

30,218

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Pros of JAX

  • Offers automatic differentiation and GPU/TPU acceleration
  • Provides a more flexible and customizable framework for machine learning research
  • Supports functional programming paradigms, enabling easier composition of operations

Cons of JAX

  • Steeper learning curve compared to ONNX Runtime
  • Less optimized for production deployment and inference
  • Smaller ecosystem and fewer pre-built models available

Code Comparison

ONNX Runtime example:

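import numpy as np
import onnxruntime as ort
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
# Placeholder input; the shape must match what model.onnx expects
input_data = np.random.randn(1, 3, 224, 224).astype(np.float32)
output = session.run(None, {input_name: input_data})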

JAX example:

import jax.numpy as jnp
from jax import grad, jit
def loss_fn(x):
    return jnp.sum(x**2)
grad_fn = jit(grad(loss_fn))
result = grad_fn(jnp.array([1.0, 2.0, 3.0]))

11,694

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Pros of TVM

  • More flexible and customizable for different hardware targets
  • Supports a wider range of deep learning frameworks
  • Offers advanced graph-level optimizations

Cons of TVM

  • Steeper learning curve and more complex to use
  • Less mature and stable compared to ONNX Runtime
  • Smaller community and ecosystem

Code Comparison

TVM example:

import tvm
from tvm import relay

# Define a simple network
data = relay.var("data", relay.TensorType((1, 3, 224, 224), "float32"))
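# The kernel shape is illustrative (an assumption); conv2d needs a typed weight here
weight = relay.var("weight", relay.TensorType((16, 3, 3, 3), "float32"))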
conv2d = relay.nn.conv2d(data, weight)
func = relay.Function([data, weight], conv2d)

# Compile the network
target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(func, target)

ONNX Runtime example:

import onnxruntime as ort
import numpy as np

# Load pre-trained ONNX model
session = ort.InferenceSession("model.onnx")

# Prepare input data
input_data = np.random.rand(1, 3, 224, 224).astype(np.float32)

# Run inference
output = session.run(None, {"input": input_data})

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.

Pros of OpenBLAS

  • Highly optimized linear algebra operations for various CPU architectures
  • Lightweight and focused on BLAS (Basic Linear Algebra Subprograms) functionality
  • Open-source with a strong community and long-standing reputation in scientific computing

Cons of OpenBLAS

  • Limited to CPU operations, lacking GPU support unlike ONNX Runtime
  • Narrower scope, focusing primarily on linear algebra operations rather than a full machine learning inference framework
  • May require more manual integration and optimization for complex ML workflows

Code Comparison

OpenBLAS (C):

#include <cblas.h>

int main(void) {
    double x[] = {1, 2, 3, 4};
    double y[] = {5, 6, 7, 8};
    /* y = 2.0 * x + y */
    cblas_daxpy(4, 2.0, x, 1, y, 1);
    return 0;
}

ONNX Runtime (Python):

import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: np.array([1, 2, 3, 4]).astype(np.float32)})

While OpenBLAS excels in optimized linear algebra operations, ONNX Runtime provides a more comprehensive solution for machine learning inference across various hardware platforms. OpenBLAS is ideal for projects requiring high-performance linear algebra, while ONNX Runtime is better suited for end-to-end ML deployment and inference tasks.
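
For readers less familiar with BLAS naming, the `cblas_daxpy` call in the OpenBLAS snippet computes y ← αx + y in double precision. The same arithmetic as a NumPy sketch, for comparison only:

```python
import numpy as np

alpha = 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([5.0, 6.0, 7.0, 8.0])

# daxpy updates y in place: y = alpha * x + y
y += alpha * x
print(y)  # [ 7. 10. 13. 16.]
```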

README

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →

ONNX Runtime training can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. Learn more →

Get Started & Resources

Builtin Pipeline Status

Build status badges for the inference and training pipelines cover Windows, Linux, Mac, Android, iOS, Web, and other targets.

This project is tested with BrowserStack.

Third-party Pipeline Status

A build status badge is published for Linux.

Releases

The current release and past releases can be found here: https://github.com/microsoft/onnxruntime/releases.

For details on the upcoming release, including release dates, announcements, features, and guidance on submitting feature requests, please visit the release roadmap: https://onnxruntime.ai/roadmap.

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

This project is licensed under the MIT License.
