XiaoMi/mace

MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

Top Related Projects

  • TensorFlow: An Open Source Machine Learning Framework for Everyone
  • PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
  • MNN: a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
  • ncnn: a high-performance neural network inference framework optimized for the mobile platform
  • Turi Create: simplifies the development of custom machine learning models
  • Tengine: a lite, high-performance, modular inference engine for embedded devices

Quick Overview

MACE (Mobile AI Compute Engine) is an open-source deep learning inference framework optimized for mobile heterogeneous computing platforms. It provides a unified interface for deploying deep learning models on various mobile devices, including CPU, GPU, and DSP, with high performance and low latency.

Pros

  • Cross-platform support for various mobile devices and processors
  • High performance and low latency optimizations for mobile inference
  • Easy-to-use API for model deployment and inference
  • Support for models from popular frameworks and formats such as TensorFlow, Caffe, and ONNX

Cons

  • Limited support for newer deep learning models and architectures
  • Requires model conversion and optimization, which can be complex for some users
  • Documentation could be more comprehensive and up-to-date
  • Smaller community compared to some other mobile inference frameworks

Code Examples

  1. Loading a model and performing inference:
import mace
from mace.proto import mace_pb2

# Load the model
model = mace_pb2.NetDef()
with open('mobilenet_v1.pb', 'rb') as f:
    model.ParseFromString(f.read())

# Create an engine
engine = mace.CreateMaceEngineFromProto(model, device_type=mace.DeviceType.GPU)

# Perform inference (input_data: a preprocessed input tensor, e.g. a 1x224x224x3 float array)
output = engine.Run(input_data)
  2. Converting a TensorFlow model to MACE format:
from mace.python.tools.converter_tool import base_converter

converter = base_converter.ConverterUtil()
converter.convert(model_file_path='mobilenet_v1.pb',
                  output_node_names=['MobilenetV1/Predictions/Reshape_1'],
                  input_node_names=['input'],
                  input_shapes=[[1, 224, 224, 3]],
                  output_dir='mace_model')
  3. Setting up a custom operator:
#include "mace/core/operator.h"

namespace mace {

class MyCustomOp : public Operation {
 public:
  explicit MyCustomOp(OpConstructContext *context)
      : Operation(context) {}

  MaceStatus Run(OpContext *context) override {
    // Implement custom operation logic here
    return MaceStatus::MACE_SUCCESS;
  }

  static void RegisterOp(OpRegistryBase *op_registry) {
    MACE_REGISTER_OP(op_registry, "MyCustomOp", MyCustomOp,
                     DeviceType::CPU, float);
  }
};

}  // namespace mace

Getting Started

  1. Install MACE:
git clone https://github.com/XiaoMi/mace.git
cd mace
pip install -e .
  2. Convert your model:
from mace.python.tools.converter_tool import base_converter

converter = base_converter.ConverterUtil()
converter.convert(model_file_path='your_model.pb',
                  output_node_names=['output_node'],
                  input_node_names=['input_node'],
                  input_shapes=[[1, 224, 224, 3]],
                  output_dir='mace_model')
  3. Use the converted model in your application:
import mace
from mace.proto import mace_pb2

model = mace_pb2.NetDef()
with open('mace_model/model.pb', 'rb') as f:
    model.ParseFromString(f.read())

engine = mace.CreateMaceEngineFromProto(model, device_type=mace.DeviceType.GPU)
# input_data: preprocessed input matching the shape used during conversion
output = engine.Run(input_data)
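
The Python snippets above sketch the workflow at a high level. On devices, inference is normally driven through MACE's C++ API. The following is a minimal, illustrative sketch of that flow, assuming the CreateMaceEngineFromProto signature shown later on this page (mace/public/mace.h); the node names, shapes, and model buffers are placeholders, and exact signatures vary across MACE releases.

#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>

#include "mace/public/mace.h"

int RunInference() {
  // Target the GPU runtime; CPU or HEXAGON could be selected the same way.
  mace::MaceEngineConfig config(mace::DeviceType::GPU);

  // Serialized graph and weights produced by the converter (loading omitted).
  std::vector<unsigned char> model_graph;    // NetDef bytes
  std::vector<unsigned char> model_weights;  // weight blob

  std::shared_ptr<mace::MaceEngine> engine;
  mace::MaceStatus status = mace::CreateMaceEngineFromProto(
      model_graph.data(), model_graph.size(),
      model_weights.data(), model_weights.size(),
      {"input_node"}, {"output_node"}, config, &engine);
  if (status != mace::MaceStatus::MACE_SUCCESS) return -1;

  // Inputs and outputs are maps from node name to MaceTensor.
  std::vector<int64_t> input_shape{1, 224, 224, 3};  // placeholder shape
  std::vector<int64_t> output_shape{1, 1001};        // placeholder shape
  auto input_data = std::shared_ptr<float>(new float[1 * 224 * 224 * 3],
                                           std::default_delete<float[]>());
  auto output_data = std::shared_ptr<float>(new float[1 * 1001],
                                            std::default_delete<float[]>());
  std::map<std::string, mace::MaceTensor> inputs{
      {"input_node", mace::MaceTensor(input_shape, input_data)}};
  std::map<std::string, mace::MaceTensor> outputs{
      {"output_node", mace::MaceTensor(output_shape, output_data)}};

  engine->Run(inputs, &outputs);
  return 0;
}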

Competitor Comparisons

TensorFlow

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Larger ecosystem and community support
  • More comprehensive documentation and tutorials
  • Wider range of supported platforms and devices

Cons of TensorFlow

  • Steeper learning curve for beginners
  • Larger codebase and installation size
  • Can be slower for mobile and edge devices

Code Comparison

MACE:

#include "mace/public/mace.h"

MaceStatus CreateMaceEngineFromProto(
    const std::vector<unsigned char> &model_pb,
    const std::string &device,
    const std::vector<std::string> &input_nodes,
    const std::vector<std::string> &output_nodes,
    const MaceEngineConfig &config,
    std::shared_ptr<MaceEngine> *engine);

TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

MACE is designed specifically for mobile and embedded devices, offering optimized performance for these platforms. TensorFlow provides a more versatile and extensive framework for various machine learning tasks across different platforms. MACE's API is more C++-oriented, while TensorFlow offers a high-level Python API for easier model creation and training.

PyTorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • Larger community and ecosystem, with more resources and third-party libraries
  • More flexible and dynamic computational graph, allowing for easier debugging
  • Better support for research and prototyping in deep learning

Cons of PyTorch

  • Generally slower inference speed compared to MACE
  • Larger model size and memory footprint
  • Less optimized for mobile and edge devices

Code Comparison

PyTorch example:

import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.add(x, y)

MACE example:

#include "mace/public/mace.h"

// Inputs and outputs are maps from node name to MaceTensor
std::map<std::string, mace::MaceTensor> inputs;
std::map<std::string, mace::MaceTensor> outputs;
mace::MaceEngine engine(config);  // config: a MaceEngineConfig prepared beforehand
engine.Run(inputs, &outputs);

The PyTorch example demonstrates its intuitive tensor operations, while the MACE example shows its focus on efficient model deployment and inference.

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Pros of MNN

  • Wider platform support, including iOS, Android, Windows, Linux, and macOS
  • More comprehensive documentation and examples
  • Supports a broader range of deep learning frameworks, including TensorFlow, PyTorch, and ONNX

Cons of MNN

  • Slightly more complex setup process
  • Less focus on mobile-specific optimizations compared to MACE
  • Larger codebase, which may lead to longer compilation times

Code Comparison

MNN example:

auto net = std::shared_ptr<MNN::Interpreter>(MNN::Interpreter::createFromFile(modelPath));
net->createSession(config);
auto input = net->getSessionInput(nullptr, "input");
auto output = net->getSessionOutput(nullptr, "output");

MACE example:

mace::MaceEngine engine(config);
engine.Init(model_data, model_data_size, input_buffers, output_buffers);
engine.Run(input_data, &output_data);

Both libraries offer straightforward APIs for model inference, but MNN's approach is more object-oriented, while MACE uses a more procedural style. MNN provides more granular control over the session and tensors, which may be beneficial for complex use cases.

ncnn

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Pros of ncnn

  • Lighter weight and more suitable for mobile and embedded devices
  • Supports a wider range of platforms, including Android, iOS, Windows, Linux, and more
  • Has a larger community and more frequent updates

Cons of ncnn

  • Less focus on model optimization compared to MACE
  • May require more manual work for model conversion and optimization

Code Comparison

MACE example (C++):

#include "mace/public/mace.h"

MaceStatus status;
MaceEngineConfig config;
std::shared_ptr<mace::MaceEngine> engine;
status = CreateMaceEngineFromProto(model_graph_proto,
                                   model_graph_proto_size,
                                   model_weights_data,
                                   model_weights_data_size,
                                   input_nodes,
                                   output_nodes,
                                   config,
                                   &engine);

ncnn example (C++):

#include "net.h"

ncnn::Net net;
net.load_param("model.param");
net.load_model("model.bin");

ncnn::Mat in(w, h, 3);
ncnn::Mat out;
ncnn::Extractor ex = net.create_extractor();
ex.input("data", in);
ex.extract("output", out);

Both libraries offer efficient inference for deep learning models on mobile and embedded devices. MACE focuses more on model optimization and provides a higher-level API, while ncnn is lighter weight and supports a broader range of platforms. The choice between them depends on specific project requirements and target devices.

Turicreate

Turi Create simplifies the development of custom machine learning models.

Pros of Turicreate

  • Broader scope: Supports a wide range of machine learning tasks, including image classification, object detection, and recommender systems
  • User-friendly: Provides high-level APIs and tools for easy model creation and deployment
  • Cross-platform: Works on macOS, Linux, and Windows

Cons of Turicreate

  • Less optimized for mobile: Not specifically designed for mobile deployment like MACE
  • Larger footprint: Generally requires more resources and has a larger codebase

Code Comparison

MACE (C++):

MaceEngine engine;
MaceStatus status = CreateMaceEngineFromProto(model_graph_proto,
                                              model_weights_data,
                                              input_nodes,
                                              output_nodes,
                                              device_type,
                                              &engine);

Turicreate (Python):

import turicreate as tc

model = tc.image_classifier.create(train_data, target='label', model='resnet-50')
predictions = model.predict(test_data)

Summary

MACE focuses on efficient mobile deployment of deep learning models, while Turicreate offers a more comprehensive suite of machine learning tools with a user-friendly interface. MACE is better suited for mobile-specific optimizations, whereas Turicreate provides a broader range of ML capabilities across multiple platforms.

Tengine

Tengine is a lite, high performance, modular inference engine for embedded device

Pros of Tengine

  • Broader hardware support, including ARM, RISC-V, and x86
  • More flexible model conversion tools
  • Better support for quantization and model compression

Cons of Tengine

  • Less optimized for mobile devices compared to MACE
  • Smaller community and fewer contributors
  • Less comprehensive documentation and examples

Code Comparison

Tengine example:

int tengine_init(void);
int create_input_node(graph_t graph, const char* node_name, int data_type, int layout, int n, int c, int h, int w);
int create_graph(graph_t graph, const char* model_name, const char* model_format);

MACE example:

mace::MaceStatus CreateMaceEngineFromProto(const std::vector<unsigned char>& model_pb,
                                           const std::string& model_data_file,
                                           const std::vector<std::string>& input_nodes,
                                           const std::vector<std::string>& output_nodes,
                                           const DeviceType device_type,
                                           std::shared_ptr<mace::MaceEngine>* engine);

Both libraries provide APIs for creating and running neural network models, but Tengine's API is more C-style, while MACE uses a more modern C++ approach. Tengine's API appears to be more granular, allowing for more fine-grained control over graph creation and node management.

README

MACE


Documentation | FAQ | Release Notes | Roadmap | MACE Model Zoo | Demo | Join Us | 中文

Mobile AI Compute Engine (or MACE for short) is a deep learning inference framework optimized for mobile heterogeneous computing on Android, iOS, Linux and Windows devices. The design focuses on the following targets:

  • Performance
    • The runtime is optimized with NEON, OpenCL and Hexagon, and the Winograd algorithm is used to speed up convolution operations. Initialization is also optimized to be faster.
  • Power consumption
    • Chip-dependent power options such as big.LITTLE scheduling and Adreno GPU hints are exposed as advanced APIs (see the configuration sketch after this list).
  • Responsiveness
    • Guaranteeing UI responsiveness is sometimes obligatory while a model is running. Mechanisms such as automatically breaking OpenCL kernels into small units are introduced to allow better preemption by the UI rendering task.
  • Memory usage and library footprint
    • Graph-level memory allocation optimization and buffer reuse are supported. The core library keeps external dependencies to a minimum to keep the library footprint small.
  • Model protection
    • Model protection has been the highest priority since the beginning of the design. Techniques such as converting models to C++ code and literal obfuscation are used.
  • Platform coverage
    • Good coverage of recent Qualcomm, MediaTek, Pinecone and other ARM-based chips. The CPU runtime supports Android, iOS and Linux.
  • Rich model format support
    • TensorFlow, Caffe and ONNX model formats are supported.
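
The power and responsiveness options above are surfaced through the engine configuration. Below is a minimal sketch, assuming the MaceEngineConfig helpers described in the MACE documentation (SetCPUThreadPolicy, SetGPUHints); exact enum and method names may differ between MACE versions, so treat this as illustrative rather than definitive.

#include "mace/public/mace.h"

// Illustrative only: tune CPU affinity and GPU hints before creating the engine.
void ConfigureEngine() {
  // Target the GPU runtime.
  mace::MaceEngineConfig config(mace::DeviceType::GPU);

  // big.LITTLE scheduling hint: prefer the big cores for CPU-side work.
  config.SetCPUThreadPolicy(/*num_threads_hint=*/4,
                            mace::CPUAffinityPolicy::AFFINITY_BIG_ONLY);

  // Adreno GPU hints: trade raw GPU throughput against UI responsiveness.
  config.SetGPUHints(mace::GPUPerfHint::PERF_NORMAL,
                     mace::GPUPriorityHint::PRIORITY_LOW);

  // The config would then be passed to CreateMaceEngineFromProto,
  // as in the engine-creation examples earlier on this page.
}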

Getting Started

Performance

MACE Model Zoo contains several common neural networks and models, which are built daily against a list of mobile phones. The benchmark results can be found on the CI result page (choose the latest passed pipeline and click the release step to see the benchmark results). For comparisons with other frameworks, take a look at the MobileAIBench project.

Communication

  • GitHub issues: bug reports, usage issues, feature requests
  • Slack: mace-users.slack.com
  • QQ group: 756046893

Contributing

Any kind of contribution is welcome. For bug reports and feature requests, please open an issue without hesitation. For code contributions, it is strongly suggested to open an issue for discussion first. For more details, please refer to the contribution guide.

License

Apache License 2.0.

Acknowledgement

MACE depends on several open source projects located in the third_party directory. In particular, we learned a lot from these projects during development.

Finally, we also thank the Qualcomm, Pinecone and MediaTek engineering teams for their help.