Top Related Projects
- TensorComprehensions: A domain-specific language to express machine learning workloads.
- MLIR: "Multi-Level Intermediate Representation" compiler infrastructure.
- IREE: A retargetable MLIR-based machine learning compiler and runtime toolkit.
- PlaidML: A framework for making deep learning work everywhere.
- TVM: An open deep learning compiler stack for CPUs, GPUs, and specialized accelerators.
Quick Overview
Glow is a machine learning compiler and execution engine for hardware accelerators, developed by Facebook (now Meta). It's designed to optimize and run neural networks on various hardware platforms, focusing on low-latency inference and efficient memory usage.
Pros
- Supports multiple hardware targets, including CPUs, GPUs, and specialized AI accelerators
- Optimizes neural network models for improved performance and reduced memory footprint
- Integrates well with PyTorch, allowing seamless conversion of PyTorch models
- Provides a flexible graph transformation framework for custom optimizations
Cons
- Relatively complex setup and learning curve for new users
- Limited documentation and examples compared to more mainstream frameworks
- Primarily focused on inference, with less emphasis on training capabilities
- May require frequent updates to keep up with rapidly evolving hardware accelerators
Code Examples
- Loading and running a PyTorch model with Glow (the torch_glow calls below are illustrative; exact names may differ between versions):

import torch
from torch_glow import enable_glow_fusion  # assumed API; may vary by version

@enable_glow_fusion()
def run_model(model, inputs):
    return model(*inputs)

# Load your TorchScript model
model = torch.jit.load("path/to/your/model.pt")

# Prepare example inputs matching the model's expected shapes
example_inputs = (torch.randn(1, 3, 224, 224),)

# Run the model using Glow
output = run_model(model, example_inputs)
- Compiling a model for a specific backend (again, treat the class and function names as illustrative):

import torch
import torch_glow
from torch_glow import CompilationSpec, InputSpec  # assumed API; may vary by version

# Define input specifications (name, shape, and dtype of each model input)
input_specs = [
    InputSpec("input", [1, 3, 224, 224], torch.float32),
]

# Create a compilation specification
comp_spec = CompilationSpec()
comp_spec.set_input_specs(input_specs)

# Compile the model for a specific backend (e.g., "CPU");
# model is a loaded TorchScript module (see the previous example)
compiled_model = torch_glow.compile(model, comp_spec, backend="CPU")
- Performing quantization-aware training with Glow (a sketch only: Glow is primarily an inference engine, so training support is limited):

import torch
import torch.nn as nn
from torch_glow import enable_glow_fusion  # assumed API; may vary by version

# Loss function used by the training step below
criterion = nn.CrossEntropyLoss()

@enable_glow_fusion()
def quantized_training_step(model, inputs, labels, optimizer):
    optimizer.zero_grad()
    outputs = model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
    return loss

# Your quantization-aware training loop; model, optimizer, dataloader,
# and num_epochs are assumed to be defined elsewhere
for epoch in range(num_epochs):
    for inputs, labels in dataloader:
        loss = quantized_training_step(model, inputs, labels, optimizer)
Getting Started
To get started with Glow:
- Install PyTorch and Glow (note that a prebuilt torch-glow package may not be available for every platform; if pip cannot find one, torch_glow must be built from the Glow source tree):
pip install torch torchvision
pip install torch-glow
- Import the necessary modules:
import torch
import torch_glow
- Enable Glow fusion for your PyTorch model:
from torch_glow import enable_glow_fusion

@enable_glow_fusion()
def run_model(model, inputs):
    return model(*inputs)

# Use run_model() to execute your PyTorch model with Glow optimizations
Competitor Comparisons
TensorComprehensions: A domain-specific language to express machine learning workloads.
Pros of TensorComprehensions
- Focuses on automatic code generation for specific tensor operations
- Provides a domain-specific language for expressing computations
- Integrates with existing deep learning frameworks like PyTorch
Cons of TensorComprehensions
- More specialized and narrower in scope compared to Glow
- Less mature and potentially less stable
- May require more manual intervention for complex operations
Code Comparison
TensorComprehensions:
def matmul(float(M,K) A, float(K,N) B) -> (C) {
    C(m,n) +=! A(m,k) * B(k,n)
}
Glow:
// F is a glow::Function, and A and B are input nodes created elsewhere;
// a self-contained version follows the comparison below.
Node *matmul = F->createMatMul("matmul", A, B);
SaveNode *result = F->createSave("result", matmul);
TensorComprehensions uses a custom DSL to define tensor operations, while Glow employs a more traditional C++ API for creating computational graphs. TensorComprehensions focuses on generating optimized code for specific operations, whereas Glow provides a broader compiler infrastructure for neural network models.
Both projects aim to improve performance in deep learning applications, but they approach the problem from different angles. TensorComprehensions is more suited for developers who need fine-grained control over specific tensor operations, while Glow offers a more comprehensive solution for compiling and optimizing entire neural network models.
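For reference, here is a self-contained version of the Glow snippet above. This is a minimal sketch using Glow's C++ graph API; the matrix shapes are chosen arbitrarily, and API details may differ slightly between Glow versions.

#include "glow/ExecutionEngine/ExecutionEngine.h"
#include "glow/Graph/Graph.h"

using namespace glow;

int main() {
  ExecutionEngine EE; // defaults to the interpreter backend
  auto &mod = EE.getModule();
  Function *F = mod.createFunction("matmul");

  // Declare the two matrix inputs: A is M x K, B is K x N.
  auto *A = mod.createPlaceholder(ElemKind::FloatTy, {8, 16}, "A", false);
  auto *B = mod.createPlaceholder(ElemKind::FloatTy, {16, 4}, "B", false);

  // Build C = A * B and mark the result as an output of the graph.
  auto *matmul = F->createMatMul("matmul", A, B);
  F->createSave("result", matmul);

  // Compile the graph for inference.
  EE.compile(CompilationMode::Infer);
  return 0;
}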
"Multi-Level Intermediate Representation" Compiler Infrastructure
Pros of MLIR
- More comprehensive and flexible intermediate representation (IR) system
- Broader scope, supporting multiple frontends and backends beyond just machine learning
- Active development and support from Google and the wider community
Cons of MLIR
- Steeper learning curve due to its more complex architecture
- Less mature ecosystem compared to Glow's focus on PyTorch integration
- May be overkill for projects solely focused on machine learning optimization
Code Comparison
MLIR example:
func @simple_mul(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
  %0 = "tf.Mul"(%arg0, %arg1) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>
  return %0 : tensor<4xf32>
}
Glow example:
// mod is a glow::Module created elsewhere (e.g. from an ExecutionEngine)
Function *F = mod.createFunction("simple_mul");
auto *input1 = mod.createPlaceholder(ElemKind::FloatTy, {4}, "input1", false);
auto *input2 = mod.createPlaceholder(ElemKind::FloatTy, {4}, "input2", false);
auto *mul = F->createMul("mul", input1, input2);
F->createSave("save", mul);
Summary
MLIR offers a more versatile and powerful IR system with broader applications, while Glow provides a more focused solution for PyTorch-based machine learning optimization. MLIR's flexibility comes at the cost of increased complexity, whereas Glow offers a simpler approach for specific use cases.
IREE: A retargetable MLIR-based machine learning compiler and runtime toolkit.
Pros of IREE
- Broader target support, including mobile and embedded devices
- More active development and community engagement
- Flexible multi-backend architecture for various hardware targets
Cons of IREE
- Steeper learning curve due to its complexity
- Less mature ecosystem compared to Glow
- Potentially higher overhead for simpler deployment scenarios
Code Comparison
IREE example:
import iree.compiler as ireec
import numpy as np
module = ireec.compile_str("""
func @add(%a: tensor<4xf32>, %b: tensor<4xf32>) -> tensor<4xf32> {
  %0 = mhlo.add %a, %b : tensor<4xf32>
  return %0 : tensor<4xf32>
}
""", target_backends=["vulkan-spirv"])
Glow example:
#include "glow/ExecutionEngine/ExecutionEngine.h"
#include "glow/Graph/Graph.h"
#include "glow/Support/Support.h"
glow::PlaceholderBindings bindings;
glow::ExecutionEngine EE;
auto &mod = EE.getModule();
auto *F = mod.createFunction("main");
Both IREE and Glow aim to optimize and deploy machine learning models, but they differ in their approach and target platforms. IREE offers more flexibility for various hardware targets, while Glow is more tightly integrated with the PyTorch ecosystem. The choice between them depends on specific project requirements and deployment scenarios.
PlaidML: A framework for making deep learning work everywhere.
Pros of PlaidML
- Supports a wider range of hardware, including GPUs from NVIDIA, AMD, and Intel
- Offers automatic kernel generation for various backends
- Provides a more flexible approach to defining custom operations
Cons of PlaidML
- Smaller community and ecosystem compared to Glow
- Less optimized for specific hardware targets like mobile devices
- Fewer pre-trained models and examples available
Code Comparison
PlaidML example:
import plaidml.keras
plaidml.keras.install_backend()
from keras.models import Sequential
from keras.layers import Dense
model = Sequential([
    Dense(32, input_shape=(16,), activation='relu'),
    Dense(10, activation='softmax')
])
Glow example:
#include "glow/ExecutionEngine/ExecutionEngine.h"
#include "glow/Graph/Graph.h"
#include "glow/IR/IR.h"
glow::ExecutionEngine EE;
auto &mod = EE.getModule();
auto *F = mod.createFunction("main");
Both PlaidML and Glow aim to provide efficient deep learning frameworks, but they have different focuses and strengths. PlaidML offers broader hardware support and flexibility, while Glow is more optimized for specific targets and has a larger ecosystem due to its association with PyTorch.
TVM: An open deep learning compiler stack for CPUs, GPUs, and specialized accelerators.
Pros of TVM
- Broader hardware support, including CPUs, GPUs, and specialized AI accelerators
- More flexible and customizable compilation pipeline
- Active community and frequent updates
Cons of TVM
- Steeper learning curve due to its complexity
- Potentially slower compilation times for simpler models
Code Comparison
TVM example:
import tvm
from tvm import relay

def example_network():
    data = relay.var("data", shape=(1, 3, 224, 224))
    weight = relay.var("weight")
    conv = relay.nn.conv2d(data, weight)
    return relay.Function([data, weight], conv)
Glow example:
#include "glow/Graph/Graph.h"
void exampleNetwork(glow::Module &mod) {
auto *F = mod.createFunction("main");
auto *input = mod.createPlaceholder(ElemKind::FloatTy, {1, 3, 224, 224}, "input", false);
auto *filter = mod.createConstant(ElemKind::FloatTy, {16, 3, 3, 3}, "filter");
auto *conv = F->createConv("conv", input, filter, 16, 3, 1, 1, 1);
}
README
Glow is a machine learning compiler and execution engine for hardware accelerators. It is designed to be used as a backend for high-level machine learning frameworks. The compiler is designed to allow state-of-the-art compiler optimizations and code generation of neural network graphs. This library is in active development. The project plan is described in the GitHub issues section and in the Roadmap wiki page.
Partners
Contributions to Glow are welcomed and encouraged! Glow is developed in collaboration with a number of industry partners.
How does it work?
Glow lowers a traditional neural network dataflow graph into a two-phase strongly-typed intermediate representation (IR). The high-level IR allows the optimizer to perform domain-specific optimizations. The lower-level instruction-based address-only IR allows the compiler to perform memory-related optimizations, such as instruction scheduling, static memory allocation and copy elimination. At the lowest level, the optimizer performs machine-specific code generation to take advantage of specialized hardware features. Glow features a lowering phase which enables the compiler to support a high number of input operators as well as a large number of hardware targets by eliminating the need to implement all operators on all targets. The lowering phase is designed to reduce the input space and allow new hardware backends to focus on a small number of linear algebra primitives. The design philosophy is described in an arXiv paper.
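A minimal sketch of this flow using Glow's C++ API (exact API details may vary between Glow versions): the graph built below is what the high-level IR optimizes, and the compile step performs lowering, memory allocation, and code generation.

#include "glow/ExecutionEngine/ExecutionEngine.h"
#include "glow/Graph/Graph.h"
#include "glow/Graph/PlaceholderBindings.h"

using namespace glow;

int main() {
  ExecutionEngine EE; // defaults to the interpreter backend
  auto &mod = EE.getModule();
  Function *F = mod.createFunction("main");

  // Build a tiny dataflow graph: out = relu(input).
  auto *input = mod.createPlaceholder(ElemKind::FloatTy, {4}, "input", false);
  auto *relu = F->createRELU("relu", input);
  auto *save = F->createSave("save", relu);

  // Graph optimization, lowering to the instruction-based IR, memory
  // allocation, and code generation all happen in compile().
  EE.compile(CompilationMode::Infer);

  // Bind concrete tensors to the placeholders and run.
  PlaceholderBindings bindings;
  bindings.allocate(input)->getHandle() = {-1.0f, 2.0f, -3.0f, 4.0f};
  bindings.allocate(save->getPlaceholder());
  EE.run(bindings);
  return 0;
}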
Getting Started
System Requirements
Glow builds and runs on macOS and Linux. The software depends on a modern C++ compiler that supports C++11, on CMake, LLVM (>=7.0), glog, protocol buffers, and libpng.
Get Glow!
git clone git@github.com:pytorch/glow.git # or: git clone https://github.com/pytorch/glow.git
cd glow
Submodules
Glow depends on a few submodules: googletest, onnx, and a library for FP16 conversions.
To get them, from the glow directory, run:
git submodule update --init --recursive
Source dependencies
Glow depends on fmt, which must be built from source:
git clone https://github.com/fmtlib/fmt
mkdir fmt/build
cd fmt/build
cmake ..
make
sudo make install
macOS
Install the required dependencies using either Homebrew or MacPorts. If using Homebrew, run:
brew install cmake graphviz libpng ninja protobuf wget glog autopep8 llvm \
boost double-conversion gflags jemalloc libevent lz4 openssl pkg-config \
snappy xz
If using MacPorts, run:
port install cmake graphviz libpng ninja protobuf-cpp wget google-glog \
boost double-conversion gflags jemalloc libevent lz4 openssl snappy xz
# Choose version >= 7
export LLVM_VERSION=7
port install llvm-$LLVM_VERSION.0
Note that LLVM is installed in a non-default location to avoid conflicts with the system's LLVM: Homebrew usually installs LLVM in /usr/local/opt/llvm/, whereas MacPorts installs it in /opt/local/libexec/llvm-$LLVM_VERSION.0/. This means that CMake will need to be told where to find LLVM when building; instructions on that can be found here.
Finally, create a symbolic link to the Homebrew- or MacPorts-installed clang-* tools so that the utils/format.sh script is able to find them later on. For a Homebrew-managed installation, run:
ln -s "/usr/local/opt/llvm/bin/clang-format" "/usr/local/bin/clang-format"
ln -s "/usr/local/opt/llvm/bin/clang-tidy" "/usr/local/bin/clang-tidy"
For MacPorts, run:
ln -s "/opt/local/libexec/llvm-$LLVM_VERSION.0/bin/clang-format" "/usr/local/bin/clang-format"
ln -s "/opt/local/libexec/llvm-$LLVM_VERSION.0/bin/clang-tidy" "/usr/local/bin/clang-tidy"
Note: Starting with macOS Mojave, Xcode's command line tools changed the header layout. In order for Glow to build on Mojave, you might need to install macOS_SDK_headers_for_macOS_10.14.pkg, located in /Library/Developer/CommandLineTools/Packages/. For macOS Catalina you might need to explicitly specify SDKROOT: export SDKROOT="/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk"
Ubuntu
[The following instructions have been tested on Ubuntu 16.04 and 18.04]
In order to build Glow on Ubuntu it is necessary to install a few packages. The following command should install the required dependencies:
sudo apt-get install clang clang-8 cmake graphviz libpng-dev \
libprotobuf-dev llvm-8 llvm-8-dev ninja-build protobuf-compiler wget \
opencl-headers libgoogle-glog-dev libboost-all-dev \
libdouble-conversion-dev libevent-dev libssl-dev libgflags-dev \
libjemalloc-dev libpthread-stubs0-dev liblz4-dev libzstd-dev libbz2-dev \
libsodium-dev libfmt-dev
[Note: Ubuntu 16.04 and 18.04 ship with llvm-6 and need to be upgraded before building Glow. Building Glow on Ubuntu 16.04 with llvm-7 fails because the llvm-7 package for Xenial uses an older C++ ABI; building Glow on Ubuntu 18.04 with llvm-7 has been tested and is successful.]
It may be desirable to use update-alternatives to manage the version of clang/clang++:
sudo update-alternatives --install /usr/bin/clang clang \
/usr/lib/llvm-8/bin/clang 50
sudo update-alternatives --install /usr/bin/clang++ clang++ \
/usr/lib/llvm-8/bin/clang++ 50
Glow uses the system default C/C++ compiler (/usr/bin/c++), and so you may also want to switch your default C/C++ compiler to clang:
sudo update-alternatives --config cc
# Select the option corresponding to /usr/bin/clang ...
sudo update-alternatives --config c++
# Select the option corresponding to /usr/bin/clang++ ...
Glow should build just fine with gcc (e.g. gcc 5.4), but we mostly use clang and are more attentive to compatibility with clang.
Finally, in order to support the ONNX net serialization format, Glow requires protobuf >= 2.6.1, but the above command may install an older version on older Ubuntu releases (e.g. 14.04). If this is the case, we suggest looking at utils/install_protobuf.sh to install a newer version from source.
For details on installing OpenCL on Ubuntu please see these instructions.
Configure and Build
To build the compiler, create a build directory and run cmake on the source directory. It's a good idea to build two configurations (Release and Debug) because some programs take a really long time to run in Debug mode. It's also a good idea to build the project outside of the source directory.
mkdir build_Debug
cd build_Debug
cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug ../glow
ninja all
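The matching Release configuration, which you will want for benchmarks and the example programs below, is identical apart from the build type:
# from the directory that contains the glow checkout
mkdir build_Release
cd build_Release
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ../glow
ninja all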
It's possible to configure and build the compiler with any CMake generator, such as GNU Makefiles, Ninja, and Xcode.
For platform-specific build instructions and advanced options, such as building with AddressSanitizer, refer to this guide: Building the Compiler.
If you're running macOS v10.14 (Mojave) and ninja all fails because it can't find headers (e.g. string.h), run the following command to fix it, and try again. More information is available here under "Command Line Tools".
open /Library/Developer/CommandLineTools/Packages/macOS_SDK_headers_for_macOS_10.14.pkg
For macOS v10.15 (Catalina) you might need to explicitly specify SDKROOT:
export SDKROOT="/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk"
Building with dependencies (LLVM)
By default, Glow will use a system-provided LLVM. Note that Glow requires LLVM 7.0 or later. If you have LLVM installed in a non-default location (for example, if you installed it using Homebrew on macOS), you need to tell CMake where to find it using -DLLVM_DIR. For example, if LLVM were installed in /usr/local/opt:
cmake -G Ninja ../glow \
-DCMAKE_BUILD_TYPE=Debug \
-DLLVM_DIR=/usr/local/opt/llvm/lib/cmake/llvm
If LLVM is not available on your system you'll need to build it manually. Run the script utils/build_llvm.sh to clone, build, and install LLVM in a local directory. You will then need to configure Glow with -DLLVM_DIR to tell the build system where to find LLVM in that local directory (e.g. -DLLVM_DIR=/path/to/llvm_install/lib/cmake/llvm if using build_llvm.sh).
Testing and Running
Unit tests
The project has a few unit tests in the tests/unittests subdirectory. To run all of them, simply run ninja test.
C++ API examples
A few test programs that use Glow's C++ API are found under the examples/ subdirectory. The mnist, cifar10, fr2en and ptb programs train and run digit recognition, image classification, French-to-English translation, and language modeling benchmarks, respectively.
To run these programs, build Glow in Release mode, then run the following command to download the cifar10, mnist and ptb datasets.
python ../glow/utils/download_datasets_and_models.py --all-datasets
Now run the examples. Note that the databases should be in the current working directory.
./bin/mnist
./bin/cifar10
./bin/fr2en
./bin/ptb
./bin/char-rnn
If everything goes well you should see:
- mnist: pictures from the MNIST digits database
- cifar10: image classifications that steadily improve
- fr2en: an interactive French-to-English translator
- ptb: decreasing perplexity on the dataset as the network trains
- char-rnn: random text generated based on some document
Note that the default build mode is Debug, which means that the compiler itself is easy to debug because the binary contains debug info and lots of assertions, and optimizations are disabled. It also means that the compiler and runtime are very slow, and execution can be hundreds of times slower than in a Release build. If you wish to benchmark the compiler, run long benchmarks, or release the product, you should compile the compiler in Release mode. Check the main CMake file for more details.
More details on testing and running Glow can be found in: Testing the Glow Compiler.
Ahead-of-time Compilation
Glow can be used to compile neural networks into object files containing native
code. We provide resnet50 (both quantized and non-quantized versions) as an
example of this capability in examples/bundles/resnet50
. See Creating
Standalone Executable Bundles for more detail.
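A typical invocation of Glow's model-compiler tool for producing a bundle looks like the following; the flags shown are the commonly documented ones, and the exact options depend on your model format and Glow version:
./bin/model-compiler -backend=CPU \
    -model=path/to/resnet50.onnx \
    -emit-bundle=output_bundle_dir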
Contributing
To get started contributing, please refer to the project's contributing guides (see CONTRIBUTING.md in the repository).
Communication
- Forums: discuss implementations, research, etc.: https://discuss.pytorch.org/c/glow. Make sure to label topics with the "glow" category.
- GitHub issues: bug reports, feature requests, install issues, RFCs, thoughts, etc.
License
Glow is licensed under the Apache 2.0 License.