MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Top Related Projects

  • TensorFlow: An Open Source Machine Learning Framework for Everyone
  • PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
  • ONNX: Open standard for machine learning interoperability
  • Core ML Tools: supporting tools for Core ML model conversion, editing, and validation
  • ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
  • ncnn: a high-performance neural network inference framework optimized for the mobile platform

Quick Overview

MNN (Mobile Neural Network) is a lightweight deep learning framework developed by Alibaba. It's designed for efficient inference on mobile devices and embedded systems, supporting various neural network architectures and optimized for cross-platform performance.

Pros

  • High performance and low memory footprint, ideal for mobile and embedded devices
  • Cross-platform support (iOS, Android, Linux, Windows, macOS)
  • Converts models from multiple deep learning frameworks (TensorFlow, Caffe, ONNX, TorchScript)
  • Provides quantization and model compression techniques for further optimization

Cons

  • Limited documentation and examples compared to more established frameworks
  • Smaller community and ecosystem compared to TensorFlow or PyTorch
  • Primarily focused on inference, not training
  • May require additional effort to integrate with custom or less common model architectures

Code Examples

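  1. Loading and running a model:
#include <memory>
#include <MNN/Interpreter.hpp>

// Load the model file; check the returned pointer before use.
auto interpreter = std::shared_ptr<MNN::Interpreter>(MNN::Interpreter::createFromFile("model.mnn"));
MNN::ScheduleConfig config;  // default scheduling configuration
auto session = interpreter->createSession(config);
interpreter->runSession(session);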
  2. Accessing input and output tensors:
auto input = interpreter->getSessionInput(session, nullptr);
auto output = interpreter->getSessionOutput(session, nullptr);

// Fill input tensor with data
float* inputData = input->host<float>();
// ... fill inputData with your input

interpreter->runSession(session);

// Access output data
float* outputData = output->host<float>();
// ... use outputData for further processing
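  3. Quantizing a model:
// NOTE: illustrative only. The header, types, and functions used here
// (MNN/Converter.hpp, MNN::Converter::load/quantize/save, MNN::QuantizeParams)
// are not confirmed against MNN's public headers. The README lists
// MNN-Converter and MNN-Compress as the tools for conversion and compression,
// so check the official documentation before relying on this snippet.
#include <MNN/Interpreter.hpp>
#include <MNN/Converter.hpp>

std::unique_ptr<MNN::NetT> net = MNN::Converter::load("model.mnn");
MNN::QuantizeParams params;
params.quant_bits = 8;  // target 8-bit weights
MNN::Converter::quantize(net.get(), params);
MNN::Converter::save(net.get(), "quantized_model.mnn");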
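
To make concrete what 8-bit quantization does to a model's weights, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a textbook scheme chosen for illustration, not MNN's actual algorithm, and the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale for the whole tensor (assumption: per-tensor, symmetric).
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# Rounding to the nearest step bounds the error by half a step (scale / 2).
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, max_error)
```

Each weight is stored as one byte plus a single shared scale, which is where the size reduction comes from; the price is a rounding error of at most half a quantization step per weight.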

Getting Started

  1. Clone the repository:

    git clone https://github.com/alibaba/MNN.git
    
  2. Build MNN:

    cd MNN
    ./schema/generate.sh
    ./tools/script/get_model.sh
    mkdir build && cd build
    cmake .. && make -j4
    
  3. Run the example:

    ./build/express/temp/testModel.out models/mobilenet_v1.mnn
    

This will run a simple inference using a pre-trained MobileNet model. For more detailed instructions and advanced usage, refer to the official MNN documentation.

Competitor Comparisons

TensorFlow

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Larger ecosystem and community support
  • More comprehensive documentation and tutorials
  • Wider range of pre-trained models and tools

Cons of TensorFlow

  • Heavier and more resource-intensive
  • Steeper learning curve for beginners
  • Slower inference speed on mobile devices

Code Comparison

TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

MNN:

#include <MNN/Interpreter.hpp>

auto net = std::shared_ptr<MNN::Interpreter>(MNN::Interpreter::createFromFile("model.mnn"));
MNN::ScheduleConfig config;  // default scheduling configuration
auto session = net->createSession(config);
net->runSession(session);

The code snippets demonstrate the basic model creation and execution in both frameworks. TensorFlow uses a high-level Python API, while MNN employs a C++ interface for model loading and inference.

TensorFlow offers a more intuitive and flexible approach to building models, especially for researchers and data scientists. MNN, on the other hand, focuses on efficient model deployment and execution, particularly on mobile and embedded devices.

PyTorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • Larger community and ecosystem, with more resources and third-party libraries
  • More flexible and dynamic computational graph, allowing for easier debugging
  • Better support for research and prototyping in deep learning

Cons of PyTorch

  • Generally slower inference speed compared to MNN
  • Larger model size and memory footprint
  • Less optimized for mobile and edge devices

Code Comparison

PyTorch example:

import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.add(x, y)

MNN example:

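#include <MNN/Interpreter.hpp>

auto net = std::shared_ptr<MNN::Interpreter>(MNN::Interpreter::createFromFile("model.mnn"));
MNN::ScheduleConfig config;  // default scheduling configuration
auto session = net->createSession(config);
net->runSession(session);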

The PyTorch example demonstrates its Python-based API and dynamic tensor operations, while the MNN example shows its C++ interface and focus on model inference. PyTorch offers a more intuitive and flexible approach for model development, whereas MNN is designed for efficient deployment and inference on various platforms, especially mobile devices.

ONNX

Open standard for machine learning interoperability

Pros of ONNX

  • Wider industry adoption and support from major AI/ML frameworks
  • More comprehensive model representation, supporting a broader range of operations
  • Better interoperability between different deep learning frameworks

Cons of ONNX

  • Can be more complex to use and implement
  • Larger file sizes for model representations
  • May have slower inference speed compared to MNN's optimized runtime

Code Comparison

MNN example:

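// Assumes the constant-initialized _Conv overload from MNN/expr/NeuralNetWorkOp.hpp:
// weight and bias fill values, input, {in, out} channels, kernel size, padding mode.
auto input = _Input({1, 3, 224, 224}, NC4HW4);
auto conv = _Conv(0.0f, 0.0f, input, {3, 16}, {3, 3}, SAME);
auto output = _Softmax(conv);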

ONNX example:

from onnx import helper, TensorProto

input = helper.make_tensor_value_info('input', TensorProto.FLOAT, [1, 3, 224, 224])
conv = helper.make_node('Conv', ['input', 'weight', 'bias'], ['conv_output'])
output = helper.make_node('Softmax', ['conv_output'], ['output'])

Both examples show basic model construction, but ONNX requires more verbose code to define nodes and tensors. MNN's API is more concise and intuitive for direct model building. However, ONNX's verbosity allows for more detailed control over model structure and attributes.
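
The `NC4HW4` argument in the MNN snippet above names a packed tensor layout in which channels are grouped in blocks of four, so that four channel values of the same pixel sit next to each other in memory. The sketch below repacks a flat NCHW buffer into that layout in plain Python. It illustrates the index arithmetic only; the helper name `nchw_to_nc4hw4` is hypothetical and this is not MNN's own conversion code.

```python
def nchw_to_nc4hw4(data, n, c, h, w):
    """Repack a flat NCHW list into NC4HW4: shape [N, ceil(C/4), H, W, 4], zero-padded."""
    c4 = (c + 3) // 4  # number of 4-channel blocks
    out = [0.0] * (n * c4 * h * w * 4)
    for ni in range(n):
        for ci in range(c):
            for hi in range(h):
                for wi in range(w):
                    src = ((ni * c + ci) * h + hi) * w + wi
                    # Block index ci // 4 is an outer dimension; ci % 4 is innermost.
                    dst = (((ni * c4 + ci // 4) * h + hi) * w + wi) * 4 + ci % 4
                    out[dst] = data[src]
    return out


# One image, 3 channels, 1x2 pixels:
# channel 0 = [1, 2], channel 1 = [10, 20], channel 2 = [100, 200]
flat = [1.0, 2.0, 10.0, 20.0, 100.0, 200.0]
packed = nchw_to_nc4hw4(flat, n=1, c=3, h=1, w=2)
print(packed)
```

With three channels, the fourth slot of each pixel is padding. The padding costs some memory but lets SIMD code load four channels of one pixel with a single vector load.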

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.

Pros of Core ML Tools

  • Seamless integration with Apple's ecosystem and iOS/macOS devices
  • Supports a wide range of popular machine learning frameworks (TensorFlow, PyTorch, scikit-learn)
  • Extensive documentation and active community support

Cons of Core ML Tools

  • Limited to Apple platforms, reducing cross-platform compatibility
  • May require more computational resources for model conversion and optimization
  • Less flexibility for custom optimizations compared to MNN

Code Comparison

Core ML Tools:

import coremltools as ct

model = ct.convert('model.h5', 
                   source='keras',
                   convert_to='mlprogram',
                   compute_units='ALL')
model.save('converted_model.mlpackage')

MNN:

#include <MNN/Interpreter.hpp>

auto net = std::shared_ptr<MNN::Interpreter>(MNN::Interpreter::createFromFile("model.mnn"));
net->setSessionMode(MNN::Interpreter::Session_Release);
MNN::ScheduleConfig config;  // default scheduling configuration
auto session = net->createSession(config);
net->runSession(session);

Core ML Tools focuses on converting models to Apple's Core ML format, while MNN provides a lightweight inference engine for direct model execution across multiple platforms. Core ML Tools offers easier integration with Apple devices, but MNN provides more flexibility and cross-platform support.

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Pros of ONNX Runtime

  • Broader ecosystem support and compatibility with various ML frameworks
  • More extensive documentation and community resources
  • Better performance optimization for a wider range of hardware platforms

Cons of ONNX Runtime

  • Larger binary size and potentially higher memory footprint
  • Steeper learning curve for beginners due to more complex API

Code Comparison

MNN example:

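auto interpreter = std::shared_ptr<Interpreter>(Interpreter::createFromFile(modelPath));
ScheduleConfig config;  // default scheduling configuration
auto session = interpreter->createSession(config);
interpreter->runSession(session);
auto output = interpreter->getSessionOutput(session, nullptr);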

ONNX Runtime example:

Ort::Session session(env, model_path, session_options);
std::vector<float> input_tensor_values(input_tensor_size);
Ort::Value input_tensor = Ort::Value::CreateTensor<float>(memory_info, input_tensor_values.data(), input_tensor_size, input_node_dims.data(), 4);
auto output_tensors = session.Run(Ort::RunOptions{nullptr}, input_node_names.data(), &input_tensor, 1, output_node_names.data(), 1);

Both MNN and ONNX Runtime are powerful inference engines for deploying machine learning models. MNN, developed by Alibaba, focuses on mobile and embedded devices, offering a lightweight solution with fast inference speeds. ONNX Runtime, created by Microsoft, provides a more versatile platform supporting a wider range of hardware and frameworks. While MNN excels in mobile scenarios, ONNX Runtime offers broader compatibility and optimization options across various deployment environments.

ncnn is a high-performance neural network inference framework optimized for the mobile platform

Pros of ncnn

  • Smaller binary size and lower memory footprint
  • Better support for quantization and fixed-point operations
  • More extensive documentation and community support

Cons of ncnn

  • Less support for newer neural network architectures
  • Slower inference speed on some devices compared to MNN
  • Fewer GPU backend options (GPU support is Vulkan-based) compared to MNN's Metal / OpenCL / Vulkan / CUDA backends

Code Comparison

MNN example:

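// Assumes the constant-initialized _Conv overload from MNN/expr/NeuralNetWorkOp.hpp:
// weight and bias fill values, input, {in, out} channels, kernel size, padding mode.
auto input = _Input({1, 3, 224, 224}, NC4HW4);
auto output = _Conv(0.0f, 0.0f, input, {3, 16}, {3, 3}, VALID);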

ncnn example:

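ncnn::Net net;
net.load_param("model.param");
net.load_model("model.bin");
ncnn::Mat in(224, 224, 3);  // width, height, channels
ncnn::Mat out;
// Blob names ("input", "output") depend on the model's .param file.
ncnn::Extractor ex = net.create_extractor();
ex.input("input", in);
ex.extract("output", out);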

Both libraries offer concise APIs for model inference, but MNN's API is more declarative and allows for easier model construction, while ncnn focuses on loading pre-trained models and performing inference.

README

MNN

Chinese version

MNN Homepage

Intro

MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. At present, MNN has been integrated into more than 30 apps of Alibaba Inc, such as Taobao, Tmall, Youku, DingTalk, Xianyu, etc., covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product searching by image, interactive marketing, equity distribution, security risk control. In addition, MNN is also used on embedded devices, such as IoT.

MNN-LLM is a large language model runtime solution built on the MNN engine. Its mission is to deploy LLMs locally on everyone's platforms (mobile phone / PC / IoT). It supports popular large language models such as Qianwen, Baichuan, Zhipu, LLAMA, and others. See the MNN-LLM user guide.

MNN-Diffusion is a Stable Diffusion model runtime solution built on the MNN engine. Its mission is to deploy Stable Diffusion models locally on everyone's platforms. See the MNN-Diffusion user guide.

[Figure: MNN architecture]

Inside Alibaba, MNN works as the basic module of the compute container in the Walle System, the first end-to-end, general-purpose, and large-scale production system for device-cloud collaborative machine learning, which has been published in the top system conference OSDI’22. The key design principles of MNN and the extensive benchmark testing results (vs. TensorFlow, TensorFlow Lite, PyTorch, PyTorch Mobile, TVM) can be found in the OSDI paper. The scripts and instructions for benchmark testing are put in the path “/benchmark”. If MNN or the design of Walle helps your research or production use, please cite our OSDI paper as follows:

@inproceedings {proc:osdi22:walle,
    author = {Chengfei Lv and Chaoyue Niu and Renjie Gu and Xiaotang Jiang and Zhaode Wang and Bin Liu and Ziqi Wu and Qiulin Yao and Congyu Huang and Panos Huang and Tao Huang and Hui Shu and Jinde Song and Bin Zou and Peng Lan and Guohuan Xu and Fei Wu and Shaojie Tang and Fan Wu and Guihai Chen},
    title = {Walle: An {End-to-End}, {General-Purpose}, and {Large-Scale} Production System for {Device-Cloud} Collaborative Machine Learning},
    booktitle = {16th USENIX Symposium on Operating Systems Design and Implementation (OSDI 22)},
    year = {2022},
    isbn = {978-1-939133-28-1},
    address = {Carlsbad, CA},
    pages = {249--265},
    url = {https://www.usenix.org/conference/osdi22/presentation/lv},
    publisher = {USENIX Association},
    month = jul,
}

Documentation and Workbench

MNN's documentation is hosted on Read the Docs.

You can also follow docs/README to build the HTML documentation locally.

MNN Workbench can be downloaded from MNN's homepage. It provides pretrained models, visualized training tools, and one-click deployment of models to devices.

Key Features

Lightweight

  • Optimized for devices, with no dependencies; can be easily deployed to mobile devices and a variety of embedded devices.
  • iOS platform: the static library with all options enabled for armv7+arm64 is about 12MB; linked executables grow by about 2MB.
  • Android platform: the core .so size is about 800KB (armv7a, c++_shared).
  • Using MNN_BUILD_MINI can reduce package size by about 25%, at the cost of requiring a fixed model input size.
  • Supports FP16 / Int8 quantization, which can reduce model size by 50%-70%.
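
As a rough illustration of where the FP16 end of that size range comes from, the sketch below packs the same weights as 32-bit and 16-bit floats using Python's `struct` module and compares the byte counts. It covers weight storage only, under the assumption that weights dominate the file; it is not a description of MNN's file format.

```python
import struct

weights = [0.1, -0.25, 3.5, 1e-3]
count = len(weights)

# "f" is IEEE 754 single precision (4 bytes); "e" is half precision (2 bytes).
fp32_bytes = struct.pack(f"<{count}f", *weights)
fp16_bytes = struct.pack(f"<{count}e", *weights)

size_ratio = len(fp16_bytes) / len(fp32_bytes)

# Round-trip through FP16 to see how much precision is lost.
restored = struct.unpack(f"<{count}e", fp16_bytes)
max_rel_error = max(abs(a - b) / abs(a) for a, b in zip(weights, restored))
print(size_ratio, max_rel_error)
```

Half precision keeps about three significant decimal digits, which is often enough for inference weights. Int8 storage shrinks the weights further, at the cost of needing a scale factor and a larger rounding error.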

Versatility

  • Supports TensorFlow, Caffe, ONNX, and TorchScript models, and common neural network types such as CNN, RNN, GAN, and Transformer.
  • Supports models with multiple inputs or outputs, every kind of dimension format, dynamic inputs, and control flow.
  • Supports nearly all OPs used in common AI models. The converter supports 178 TensorFlow OPs, 52 Caffe OPs, 163 TorchScript OPs, and 158 ONNX OPs.
  • Supports iOS 8.0+, Android 4.3+, and embedded devices with a POSIX interface.
  • Supports hybrid computing on multiple devices. Currently supports CPU and GPU.

High performance

  • Implements core computing with a large amount of optimized assembly code to make full use of ARM / x64 CPUs.
  • Uses Metal / OpenCL / Vulkan for GPU inference on mobile.
  • Uses CUDA and Tensor Cores for better performance on NVIDIA GPUs.
  • Convolution and transposed convolution algorithms are efficient and stable. The Winograd convolution algorithm is widely used to speed up symmetric convolutions such as 3x3, 4x4, 5x5, 6x6, and 7x7.
  • About 2x speedup on the newer ARM v8.2 architecture thanks to FP16 half-precision support, and about 2.5x speedup when using sdot on ARM v8.2 and VNNI.
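
The idea behind Winograd convolution can be shown with its smallest textbook case, F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. The sketch below is a one-dimensional illustration in plain Python, not MNN's implementation, which works on 2D tiles in optimized native code.

```python
def direct_conv(d, g):
    """Two outputs of a 3-tap filter over 4 inputs: 6 multiplications."""
    return [d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
            d[1] * g[0] + d[2] * g[1] + d[3] * g[2]]


def winograd_f23(d, g):
    """Same two outputs with 4 input-dependent multiplications."""
    # Filter transform: depends only on g, so it can be precomputed once.
    u1 = g[0]
    u2 = (g[0] + g[1] + g[2]) / 2
    u3 = (g[0] - g[1] + g[2]) / 2
    u4 = g[2]
    # Four multiplications on transformed inputs.
    m1 = (d[0] - d[2]) * u1
    m2 = (d[1] + d[2]) * u2
    m3 = (d[2] - d[1]) * u3
    m4 = (d[1] - d[3]) * u4
    # Output transform: additions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


d = [1.0, 2.0, 3.0, 4.0]
g = [0.5, -1.0, 2.0]
print(direct_conv(d, g), winograd_f23(d, g))
```

Because the filter transform is computed once per model rather than once per input, the per-input cost drops from 6 to 4 multiplications here, and the saving grows when the same trick is applied to 2D tiles.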

Ease of use

  • Supports using MNN's OPs for numerical computation, similar to numpy.
  • Provides a lightweight, OpenCV-like image processing module that is only about 100KB.
  • Supports building and training models on PC / mobile.
  • The MNN Python API helps ML engineers easily use MNN to infer, train, and process images, without dipping their toes in C++ code.

The architectures and precisions MNN supports are shown below:

  • S: Supported and works well, deeply optimized; recommended
  • A: Supported and works well; can be used
  • B: Supported but has bugs or is not optimized; not recommended
  • C: Not supported
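
| Architecture | Backend        | Normal | FP16        | BF16        | Int8 |
|--------------|----------------|--------|-------------|-------------|------|
| CPU          | Native         | B      | C           | B           | B    |
| CPU          | x86/x64-SSE4.1 | A      | B           | B           | A    |
| CPU          | x86/x64-AVX2   | S      | B           | B           | A    |
| CPU          | x86/x64-AVX512 | S      | B           | B           | S    |
| CPU          | ARMv7a         | S      | S (ARMv8.2) | S           | S    |
| CPU          | ARMv8          | S      | S (ARMv8.2) | S (ARMv8.6) | S    |
| GPU          | OpenCL         | A      | S           | C           | C    |
| GPU          | Vulkan         | A      | A           | C           | C    |
| GPU          | Metal          | A      | S           | C           | C    |
| GPU          | CUDA           | A      | S           | C           | C    |
| NPU          | CoreML         | B      | B           | C           | C    |
| NPU          | HIAI           | B      | C           | C           | B    |
| NPU          | NNAPI          | B      | B           | C           | C    |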
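
The grades above can be treated as a small lookup table when choosing a backend for a given precision. The sketch below encodes the matrix as a Python dictionary and filters it by grade. The names `MATRIX` and `backends_for` are hypothetical, and the hardware conditions attached to some ARM entries (ARMv8.2 for FP16, ARMv8.6 for BF16) are reduced to comments.

```python
GRADE_ORDER = "SABC"  # best to worst, as defined in the legend
PRECISIONS = ("Normal", "FP16", "BF16", "Int8")

# One grade letter per precision, in PRECISIONS order.
MATRIX = {
    "Native": "BCBB",
    "x86/x64-SSE4.1": "ABBA",
    "x86/x64-AVX2": "SBBA",
    "x86/x64-AVX512": "SBBS",
    "ARMv7a": "SSSS",  # FP16 grade requires ARMv8.2
    "ARMv8": "SSSS",   # FP16 requires ARMv8.2, BF16 requires ARMv8.6
    "OpenCL": "ASCC",
    "Vulkan": "AACC",
    "Metal": "ASCC",
    "CUDA": "ASCC",
    "CoreML": "BBCC",
    "HIAI": "BCCB",
    "NNAPI": "BBCC",
}


def backends_for(precision, worst_acceptable="A"):
    """Return backends whose grade for `precision` is at least `worst_acceptable`."""
    col = PRECISIONS.index(precision)
    limit = GRADE_ORDER.index(worst_acceptable)
    return [name for name, grades in MATRIX.items()
            if GRADE_ORDER.index(grades[col]) <= limit]


print(backends_for("Int8", worst_acceptable="S"))
print(backends_for("FP16", worst_acceptable="S"))
```

For example, asking for grade S at Int8 leaves only the AVX512 and ARM CPU backends, while grade S at FP16 also admits several GPU backends.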

Tools

Based on MNN (a tensor compute engine), we provide a series of tools for inference, training, and general-purpose computation.

  • MNN-Converter: Converts models from other frameworks, such as TensorFlow (Lite), Caffe, ONNX, and TorchScript, to MNN models for inference, and performs graph optimization to reduce computation.
  • MNN-Compress: Compresses models to reduce size and improve performance / speed.
  • MNN-Express: Supports models with control flow and uses MNN's OPs for general-purpose computing.
  • MNN-CV: An OpenCV-like library built on MNN, and therefore much more lightweight.
  • MNN-Train: Supports training MNN models.
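
As an example of the kind of graph optimization an inference converter can apply, the sketch below folds a batch-normalization step into the preceding convolution so that one operation replaces two at inference time. This is a common technique shown on a single scalar channel; the README does not say which optimizations MNN-Converter performs, so treat it as an assumption-laden illustration with hypothetical function names.

```python
import math


def conv_then_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Unfused reference: a 1x1 'convolution' followed by batch normalization."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fold_batchnorm(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the batch-norm parameters into the convolution weight and bias."""
    s = gamma / math.sqrt(var + eps)
    return weight * s, (bias - mean) * s + beta


params = dict(weight=2.0, bias=0.5, gamma=1.5, beta=-0.2, mean=0.3, var=4.0)
folded_weight, folded_bias = fold_batchnorm(**params)

x = 3.0
unfused = conv_then_bn(x, **params)
fused = folded_weight * x + folded_bias  # one multiply-add instead of two stages
print(unfused, fused)
```

The fold is valid because batch-norm statistics are fixed constants at inference time, so the two affine steps compose into one.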

How to Discuss and Get Help From the MNN Community

The group discussions are predominantly in Chinese, but English speakers are welcome and will be helped.

Dingtalk discussion groups:

Group #1 (Full): 23329087

Group #2 (Full): 23350225

Group #3: join via the QR code shown in the MNN repository README.

Historical Paper

A preliminary version of MNN, a mobile inference engine focused on manual optimization, was also published at MLSys 2020. Please cite the paper if MNN previously helped your research:

@inproceedings{alibaba2020mnn,
  author = {Jiang, Xiaotang and Wang, Huan and Chen, Yiliu and Wu, Ziqi and Wang, Lichuan and Zou, Bin and Yang, Yafeng and Cui, Zongyang and Cai, Yu and Yu, Tianhang and Lv, Chengfei and Wu, Zhihua},
  title = {MNN: A Universal and Efficient Inference Engine},
  booktitle = {MLSys},
  year = {2020}
}

License

Apache 2.0

Acknowledgement

MNN participants: Taobao Technology Department, Search Engineering Team, DAMO Team, Youku and other Alibaba Group employees.

MNN refers to the following projects: