Top Related Projects
Tensors and Dynamic neural networks in Python with strong GPU acceleration
An Open Source Machine Learning Framework for Everyone
Open standard for machine learning interoperability
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
mlpack: a fast, header-only C++ machine learning library
Quick Overview
Apache TVM is an open-source machine learning compiler framework for CPUs, GPUs, and machine learning accelerators. It aims to enable machine learning engineers to optimize and run computations efficiently on various hardware backends, including mobile devices, embedded systems, and cloud platforms.
Pros
- Supports multiple hardware targets and deep learning frameworks
- Provides automatic optimization and tuning capabilities
- Offers a flexible and extensible architecture for custom optimizations
- Enables efficient deployment of machine learning models on diverse platforms
Cons
- Steep learning curve for beginners
- Documentation can be complex and sometimes outdated
- Limited support for certain specialized hardware accelerators
- Requires expertise in both machine learning and hardware optimization
Code Examples
- Defining and compiling a simple computation:
import tvm
from tvm import te
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute(A.shape, lambda i: A[i] * 2, name="B")
s = te.create_schedule(B.op)
f = tvm.build(s, [A, B], "llvm", name="double")
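For intuition, `te.compute` can be read as "build an output of the given shape by evaluating a function at every index". The plain-Python sketch below mimics that reading for the one-dimensional case. `compute_1d` is a hypothetical helper written for this illustration, not part of TVM, and it evaluates eagerly, whereas TVM only records the expression and generates code later.

```python
def compute_1d(length, fcompute):
    """Evaluate fcompute(i) for every index i in range(length)."""
    return [fcompute(i) for i in range(length)]


# Plays the role of the placeholder A, here with concrete values.
A = [1.0, 2.5, -3.0]

# Same index expression as the TVM example: B[i] = A[i] * 2
B = compute_1d(len(A), lambda i: A[i] * 2)
print(B)  # [2.0, 5.0, -6.0]
```

The point of the analogy is that the lambda describes one output element at a time, which leaves the loop structure free to be chosen separately.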
- Optimizing a convolution operation:
import tvm
from tvm import te, topi, auto_scheduler
@auto_scheduler.register_workload
def conv2d(N, H, W, CO, CI, KH, KW, stride, padding):
    data = te.placeholder((N, CI, H, W), name="data")
    kernel = te.placeholder((CO, CI, KH, KW), name="kernel")
    conv = topi.nn.conv2d_nchw(data, kernel, stride, padding, dilation=1, out_dtype="float32")
    return [data, kernel, conv]
target = tvm.target.Target("cuda")
task = auto_scheduler.SearchTask(func=conv2d, args=(1, 224, 224, 64, 3, 7, 7, 2, 3), target=target)
tune_option = auto_scheduler.TuningOptions(
    num_measure_trials=200,
    measure_callbacks=[auto_scheduler.RecordToFile("conv2d.json")],
    verbose=2,
)
# Run the search, then load the best schedule found from the log file
task.tune(tune_option)
sch, args = task.apply_best("conv2d.json")
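Conceptually, tuning is a measure-and-select loop: generate candidate implementations, time each one, and keep the fastest. The toy sketch below applies that idea to a single knob (the block size of a chunked sum) in plain Python. It is only an analogy: `blocked_sum`, `tune`, and the candidate list are hypothetical, and the real auto-scheduler explores a far larger space guided by a learned cost model.

```python
import time


def blocked_sum(values, block):
    """Sum `values` in chunks of `block` elements; the result does not depend on `block`."""
    total = 0
    for start in range(0, len(values), block):
        total += sum(values[start:start + block])
    return total


def tune(values, candidates, repeats=3):
    """Time every candidate block size and return (best_block, timings)."""
    timings = {}
    for block in candidates:
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            blocked_sum(values, block)
            best = min(best, time.perf_counter() - t0)
        timings[block] = best  # keep the fastest of the repeats
    best_block = min(timings, key=timings.get)
    return best_block, timings


data = list(range(10_000))
candidates = [1, 16, 256, 4096]
best_block, timings = tune(data, candidates)
print("fastest block size in this run:", best_block)
```

Which block size wins varies by machine and run, which is why tuners record measurements to a log (as `RecordToFile` does in the example above) instead of hard-coding a choice.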
- Deploying a pre-trained model:
import tvm
from tvm import relay
from tvm.contrib import graph_executor
import tflite
tflite_model_file = "mobilenet_v1_1.0_224_quant.tflite"
tflite_model_buf = open(tflite_model_file, "rb").read()
tflite_model = tflite.Model.GetRootAsModel(tflite_model_buf, 0)
input_tensor = "input"
input_shape = (1, 224, 224, 3)
input_dtype = "uint8"
mod, params = relay.frontend.from_tflite(
    tflite_model,
    shape_dict={input_tensor: input_shape},
    dtype_dict={input_tensor: input_dtype},
)
target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
dev = tvm.device(str(target), 0)
# Wrap the compiled library in a graph executor module for inference
module = graph_executor.GraphModule(lib["default"](dev))
Getting Started
To get started with Apache TVM:
- Install TVM:
git clone --recursive https://github.com/apache/tvm tvm
cd tvm
mkdir build
cp cmake/config.cmake build
cd build
cmake ..
make -j4
- Set up Python environment:
export TVM_HOME=/path/to/tvm
export PYTHONPATH=$TVM_HOME/python:${PYTHONPATH}
- Run a simple example:
import tvm
from tvm import te
A = te.placeholder((10,), name="A")
B = te.compute(A.
Competitor Comparisons
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- More user-friendly and intuitive API for deep learning tasks
- Extensive ecosystem with pre-trained models and libraries
- Dynamic computational graphs for flexible model development
Cons of PyTorch
- Less optimized for deployment on edge devices and mobile platforms
- Limited support for specialized hardware accelerators compared to TVM
- Steeper learning curve for low-level optimizations and custom operators
Code Comparison
PyTorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = x * y  # element-wise product, the same computation as the TVM snippet
TVM:
import tvm
from tvm import te
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute(A.shape, lambda i: A[i] * B[i])
PyTorch focuses on high-level tensor operations and automatic differentiation, making it easier for researchers and developers to build and train neural networks. TVM, on the other hand, provides a lower-level approach with more control over hardware-specific optimizations and compilation for various targets.
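One way to picture that extra control: TVM separates what is computed from how the loops are arranged. The plain-Python sketch below computes the same element-wise product twice, once with a single loop and once with the loop split into an outer and an inner loop, which is the kind of rearrangement a schedule expresses. The function names and the split factor are hypothetical and only illustrate the idea; no TVM API is used.

```python
def multiply_naive(a, b):
    """Reference loop nest: one loop over all elements."""
    out = [0] * len(a)
    for i in range(len(a)):
        out[i] = a[i] * b[i]
    return out


def multiply_split(a, b, factor=4):
    """Same computation with the loop split into outer/inner parts."""
    n = len(a)
    out = [0] * n
    outer_extent = (n + factor - 1) // factor  # ceil(n / factor)
    for io in range(outer_extent):
        for ii in range(factor):
            i = io * factor + ii
            if i < n:  # guard the tail when n is not a multiple of factor
                out[i] = a[i] * b[i]
    return out


a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
b = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
assert multiply_naive(a, b) == multiply_split(a, b, factor=4)
print(multiply_split(a, b))  # [10, 18, 24, 28, 30, 30, 28, 24, 18, 10]
```

Both functions return identical results; only the iteration order differs, and that order is what a compiler can tune per hardware target.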
An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Larger ecosystem and community support
- More comprehensive documentation and tutorials
- Wider range of pre-trained models and tools
Cons of TensorFlow
- Steeper learning curve for beginners
- Less flexibility for low-level optimizations
- Heavier resource requirements
Code Comparison
TensorFlow:
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
TVM:
import tvm
from tvm import relay
def simple_net(data):
    dense1 = relay.nn.dense(data, relay.var("dense1_weight"))
    relu1 = relay.nn.relu(dense1)
    dense2 = relay.nn.dense(relu1, relay.var("dense2_weight"))
    return relay.nn.softmax(dense2)
TensorFlow provides a higher-level API for model creation, while TVM offers more low-level control for optimization. TVM focuses on optimizing and deploying models across various hardware platforms, whereas TensorFlow is a more comprehensive framework for building and training machine learning models.
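For reference, the computation the Relay snippet describes is dense, ReLU, dense, softmax. Below is a plain-Python sketch of that forward pass for a single input vector. It assumes weights are stored as `(units, in_features)`, so each dense layer computes `x · Wᵀ` (the convention `relay.nn.dense` uses), and it omits biases as the Relay snippet does. The tiny weight values are made up for illustration.

```python
import math


def dense(x, weight):
    """x: list of in_features values; weight: one row of in_features values per output unit."""
    return [sum(xi * wi for xi, wi in zip(x, row)) for row in weight]


def relu(x):
    return [max(0.0, v) for v in x]


def softmax(x):
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]


def simple_net(x, w1, w2):
    return softmax(dense(relu(dense(x, w1)), w2))


x = [1.0, -2.0]
w1 = [[0.5, 0.0], [0.0, 0.5], [1.0, 1.0]]   # 3 hidden units, 2 inputs
w2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]     # 2 outputs, 3 hidden units
probs = simple_net(x, w1, w2)
print(probs)
```

A compiler such as TVM starts from a graph of exactly these operators and decides how to fuse and lower them for the target hardware.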
Open standard for machine learning interoperability
Pros of ONNX
- Widely adopted standard for neural network exchange
- Supports a broader range of frameworks and tools
- Simpler model representation and easier to understand
Cons of ONNX
- Limited runtime optimization capabilities
- Less focus on end-to-end deployment and hardware-specific optimizations
- Narrower scope, primarily for model exchange rather than compilation
Code Comparison
ONNX model definition:
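import onnx
from onnx import TensorProto, helper
# Declare the graph input and output tensors (float32, shape 1x10)
X = helper.make_tensor_value_info('X', TensorProto.FLOAT, [1, 10])
Y = helper.make_tensor_value_info('Y', TensorProto.FLOAT, [1, 10])
node = helper.make_node('Relu', inputs=['X'], outputs=['Y'])
graph = helper.make_graph([node], 'test', [X], [Y])
model = helper.make_model(graph)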
TVM model definition and compilation:
import tvm
from tvm import relay
x = relay.var('x', shape=(1, 10))
y = relay.nn.relu(x)
func = relay.Function([x], y)
mod = tvm.IRModule.from_expr(func)
target = tvm.target.Target('llvm')
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target)
ONNX focuses on model representation and interoperability, while TVM provides a more comprehensive approach to model optimization and deployment across various hardware targets. TVM offers more advanced compilation techniques and runtime optimizations, making it better suited for performance-critical applications and specialized hardware.
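The distinction can be sketched in plain Python: a model format describes the graph as data, and a separate step turns that description into something executable. The node dictionary layout, the `OPS` table, and `compile_graph` below are hypothetical and far simpler than either ONNX's protobuf schema or TVM's compilation pipeline.

```python
# Operator implementations, keyed by the op name used in the graph description.
OPS = {
    "Relu": lambda xs: [max(0.0, v) for v in xs],
    "Neg": lambda xs: [-v for v in xs],
}

# The "model file": pure data, no executable code.
graph = {
    "inputs": ["X"],
    "outputs": ["Y"],
    "nodes": [
        {"op": "Neg", "input": "X", "output": "T"},
        {"op": "Relu", "input": "T", "output": "Y"},
    ],
}


def compile_graph(graph):
    """Resolve every op once and return a callable that runs the nodes in order."""
    steps = [(OPS[n["op"]], n["input"], n["output"]) for n in graph["nodes"]]

    def run(**feeds):
        env = dict(feeds)
        for fn, src, dst in steps:
            env[dst] = fn(env[src])
        return [env[name] for name in graph["outputs"]]

    return run


run = compile_graph(graph)
print(run(X=[-1.0, 2.0, -3.0]))  # [[1.0, 0.0, 3.0]]
```

In this picture, ONNX corresponds to the `graph` dictionary, while a compiler like TVM corresponds to the step that consumes it and produces optimized code.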
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Pros of ONNX Runtime
- Broader hardware support and optimizations for various devices
- Easier integration with existing ML frameworks and tools
- More extensive documentation and community support
Cons of ONNX Runtime
- Less flexibility for custom operators and optimizations
- Limited support for certain advanced deep learning models
- Potentially higher memory usage for some workloads
Code Comparison
ONNX Runtime example:
import onnxruntime as ort
session = ort.InferenceSession("model.onnx")
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: input_data})
TVM example:
import tvm
from tvm import relay
mod, params = relay.frontend.from_onnx(onnx_model)
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target, params=params)
Both ONNX Runtime and TVM are powerful frameworks for optimizing and deploying machine learning models. ONNX Runtime excels in ease of use and broad hardware support, while TVM offers more flexibility for advanced optimizations and custom operators. The choice between the two depends on specific project requirements, target hardware, and the level of customization needed.
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
Pros of JAX
- Seamless integration with NumPy and automatic differentiation
- Efficient compilation to XLA for GPU and TPU acceleration
- Strong support for functional programming paradigms
Cons of JAX
- Steeper learning curve for users not familiar with functional programming
- Limited support for dynamic shapes and control flow compared to TVM
- Smaller ecosystem and fewer pre-built models than TVM
Code Comparison
JAX example:
import jax.numpy as jnp
from jax import grad, jit
def f(x):
    return jnp.sum(jnp.sin(x))
grad_f = jit(grad(f))
TVM example:
import tvm
from tvm import te
n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.compute(A.shape, lambda i: tvm.tir.sin(A[i]), name="B")
s = te.create_schedule(B.op)
Both frameworks offer powerful capabilities for optimizing and accelerating numerical computations, but they approach the problem from different angles. JAX focuses on providing a NumPy-like interface with automatic differentiation and XLA compilation, while TVM offers a more flexible approach to tensor expressions and scheduling optimizations across various hardware targets.
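As a sanity check on what the JAX snippet computes: the gradient of `sum(sin(x))` with respect to each `x[i]` is `cos(x[i])`. The plain-Python sketch below confirms this numerically with central finite differences. It uses neither JAX nor TVM, and the helper name and step size are arbitrary choices for illustration.

```python
import math


def f(x):
    """Plain-Python counterpart of jnp.sum(jnp.sin(x))."""
    return sum(math.sin(v) for v in x)


def numerical_grad(func, x, eps=1e-6):
    """Approximate d func / d x[i] with central differences."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((func(up) - func(down)) / (2 * eps))
    return grads


x = [0.0, 0.5, 1.0, 2.0]
approx = numerical_grad(f, x)
exact = [math.cos(v) for v in x]

# The two agree to within the finite-difference error.
for a, e in zip(approx, exact):
    assert abs(a - e) < 1e-6
```

JAX derives the exact gradient automatically; the TVM tensor-expression snippet, by contrast, only describes the forward `sin` computation and its schedule.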
mlpack: a fast, header-only C++ machine learning library
Pros of mlpack
- Focuses on scalable machine learning algorithms, offering a wide range of ML techniques
- Provides bindings for multiple languages, including Python, Julia, and R
- Emphasizes ease of use and fast prototyping for ML applications
Cons of mlpack
- Less suitable for deep learning and neural network optimization compared to TVM
- Smaller community and ecosystem than TVM, which is backed by the Apache Software Foundation
- Limited support for hardware-specific optimizations and cross-platform deployment
Code Comparison
mlpack (C++):
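#include <mlpack/core.hpp>
#include <mlpack/methods/linear_regression/linear_regression.hpp>
arma::mat X, X_test;       // predictors, one column per observation
arma::rowvec y;            // responses, one per observation
mlpack::regression::LinearRegression lr(X, y);
arma::rowvec predictions;  // Predict() fills a row vector
lr.Predict(X_test, predictions);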
TVM (Python):
import tvm
from tvm import relay
data = relay.var("data", shape=(1, 3, 224, 224))
weight = relay.var("weight")
conv2d = relay.nn.conv2d(data, weight)
func = relay.Function([data, weight], conv2d)
Both libraries offer different approaches to machine learning tasks. mlpack focuses on traditional ML algorithms with a C++ core, while TVM specializes in deep learning optimizations and cross-platform deployment.
README
Open Deep Learning Compiler Stack
Documentation | Contributors | Community | Release Notes
Apache TVM is a compiler stack for deep learning systems. It is designed to close the gap between productivity-focused deep learning frameworks and performance- and efficiency-focused hardware backends. TVM works with deep learning frameworks to provide end-to-end compilation to different backends.
License
TVM is licensed under the Apache-2.0 license.
Getting Started
Check out the TVM Documentation site for installation instructions, tutorials, examples, and more. The Getting Started with TVM tutorial is a great place to start.
Contribute to TVM
TVM adopts the Apache committer model. We aim to create an open-source project that is maintained and owned by the community. Check out the Contributor Guide.
Acknowledgement
We learned a lot from the following projects when building TVM.
- Halide: Parts of TVM's TIR and arithmetic simplification module originate from Halide. We also learned from and adapted parts of Halide's lowering pipeline.
- Loopy: the use of integer set analysis and its loop transformation primitives.
- Theano: the design inspiration for the symbolic scan operator for recurrence.