
OAID / Tengine

Tengine is a lightweight, high-performance, modular inference engine for embedded devices.


Top Related Projects

  • MNN: a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba
  • ncnn: a high-performance neural network inference framework optimized for the mobile platform
  • MACE: a deep learning inference framework optimized for mobile heterogeneous computing platforms
  • ComputeLibrary: a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies
  • TVM: an open deep learning compiler stack for CPUs, GPUs, and specialized accelerators
  • TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab

Quick Overview

Tengine is a high-performance, open-source, and production-ready deep learning inference engine. It is designed to be efficient, flexible, and easy to use, making it a popular choice for deploying deep learning models in various applications, including edge computing, mobile devices, and cloud-based services.

Pros

  • High Performance: Tengine is optimized for both CPU and GPU, delivering efficient inference with low latency and high throughput.
  • Flexibility: Tengine supports a wide range of deep learning models, including popular frameworks like TensorFlow, PyTorch, and ONNX.
  • Ease of Use: Tengine provides a user-friendly API and comprehensive documentation, making it easy for developers to integrate and deploy deep learning models.
  • Cross-Platform Compatibility: Tengine can run on a variety of platforms, including Linux, Android, and iOS, making it a versatile choice for diverse deployment environments.

Cons

  • Limited Community Support: Compared to some other deep learning inference engines, Tengine may have a smaller community and fewer resources available for troubleshooting and support.
  • Fewer Pre-Trained Models: While Tengine supports a wide range of models, it may not have as many pre-trained models available as some other popular deep learning frameworks.
  • Potential Compatibility Issues: As with any cross-platform software, there may be occasional compatibility issues or challenges when deploying Tengine on specific hardware or software configurations.
  • Ongoing Maintenance: As with any active project, Tengine requires ongoing maintenance and updates to keep up with the rapidly evolving deep learning landscape.

Code Examples

Here are a few code examples demonstrating the usage of Tengine:

  1. Loading and Executing a TensorFlow Model:

#include <iostream>
#include "tengine_c_api.h"

int main(int argc, char* argv[]) {
    init_tengine();

    // Load the model through the TensorFlow serializer.
    graph_t graph = create_graph(nullptr, "tensorflow", "model.pb");

    // In a real application, set the input tensor shape and buffer here.
    prerun_graph(graph);
    run_graph(graph, 1);

    postrun_graph(graph);
    destroy_graph(graph);
    release_tengine();
    return 0;
}

This code demonstrates how to load a TensorFlow model, create a Tengine graph, and execute the model using the Tengine C API.

  2. Executing an ONNX Model:

#include <iostream>
#include "tengine_c_api.h"

int main(int argc, char* argv[]) {
    init_tengine();

    // Load the model through the ONNX serializer.
    graph_t graph = create_graph(nullptr, "onnx", "model.onnx");

    // In a real application, set the input tensor shape and buffer here.
    prerun_graph(graph);
    run_graph(graph, 1);

    postrun_graph(graph);
    destroy_graph(graph);
    release_tengine();
    return 0;
}

This code shows how to load an ONNX model, create a Tengine graph, and execute the model using the Tengine C API.

  3. Performing Inference on a PyTorch Model:

import torch
# Note: this wrapper module is shown for illustration; check the Tengine Python
# bindings for the actual import path and class name, which may differ by release.
from tengine.pytorch.wrapper import TengineWrapper

# Load the PyTorch model
model = torch.load("model.pth")

# Create a Tengine wrapper for the PyTorch model
tengine_model = TengineWrapper(model)

# Run inference using the Tengine wrapper (input_data is a prepared input tensor)
output = tengine_model(input_data)

This Python code illustrates how a Tengine wrapper could be used to perform inference on a PyTorch model; the exact Python API may vary between Tengine releases.

Getting Started

To get started with Tengine, follow these steps:

  1. Install Tengine: Depending on your platform, you can install Tengine using the provided packages or by building it from source. Refer to the Tengine installation guide for detailed instructions.

  2. Load and Execute a Model: Once Tengine is installed, you can load and execute deep learning models using the Tengine C API or the provided language-specific wrappers (e.g., Python, C++). The code examples above demonstrate the basic usage.

  3. Optimize Model Performance: Tengine provides various optimization techniques, such as quantization and graph optimization, to improve the inference performance of your models. Refer to the [Tengine documentation](https://github.com/OAID/Tengine) for details; a minimal sketch of tuning the runtime options is shown below.
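
The snippet below sketches one way the runtime could be tuned before inference, assuming a model already converted to Tengine's tmfile format. The model file name, thread count, and header path are placeholders; the struct options fields and prerun_graph_multithread() follow the pattern used in the Tengine Lite examples, so check your version's C API header for the exact names.

#include "tengine/c_api.h"   // older releases use "tengine_c_api.h"

int main(void) {
    init_tengine();

    // "model.tmfile" is a placeholder for a model converted with the convert tool.
    graph_t graph = create_graph(NULL, "tengine", "model.tmfile");

    // Runtime options: thread count, CPU cluster, and compute precision.
    struct options opt;
    opt.num_thread = 4;                  // placeholder thread count
    opt.cluster = TENGINE_CLUSTER_ALL;   // schedule work across all CPU clusters
    opt.precision = TENGINE_MODE_FP32;   // e.g. TENGINE_MODE_UINT8 for a quantized model
    opt.affinity = 0;                    // default CPU affinity mask

    // prerun_graph_multithread() takes the place of prerun_graph() when options are supplied.
    prerun_graph_multithread(graph, opt);
    run_graph(graph, 1);

    postrun_graph(graph);
    destroy_graph(graph);
    release_tengine();
    return 0;
}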

Competitor Comparisons


MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Pros of MNN

  • More extensive platform support, including iOS, Android, Windows, and Linux
  • Better performance optimization, especially on mobile devices
  • More active development and frequent updates

Cons of MNN

  • Steeper learning curve due to more complex architecture
  • Less focus on embedded systems compared to Tengine
  • Larger binary size, which may be a concern for some embedded applications

Code Comparison

MNN example:

auto input = _Input({1, 3, 224, 224}, NC4HW4);
auto conv = _Conv(3, 16, {3, 3}, VALID);
auto output = conv(input);

Tengine example:

struct tensor* input = get_graph_input_tensor(graph, 0, 0);
struct tensor* conv = create_tensor(graph, "conv", TENGINE_DT_FP32);
struct node* conv_node = create_node(graph, "conv", "CONV");

Both libraries offer C/C++ APIs for model deployment, but MNN's API tends to be more high-level and object-oriented, while Tengine's API is more C-style and lower-level. MNN generally provides more abstraction and ease of use for complex models, while Tengine offers finer control over low-level operations, which can be beneficial for embedded systems.


ncnn is a high-performance neural network inference framework optimized for the mobile platform

Pros of ncnn

  • Wider platform support, including mobile and embedded devices
  • More optimized for mobile and edge computing scenarios
  • Larger community and more frequent updates

Cons of ncnn

  • Steeper learning curve due to more complex API
  • Less focus on high-level abstractions, requiring more low-level implementation

Code Comparison

ncnn:

ncnn::Net net;
net.load_param("model.param");
net.load_model("model.bin");

ncnn::Mat in = ncnn::Mat::from_pixels(image_data, ncnn::Mat::PIXEL_BGR, w, h);
ncnn::Extractor ex = net.create_extractor();
ex.input("input", in);   // blob names depend on the model
ncnn::Mat out;
ex.extract("output", out);

Tengine:

graph_t graph = create_graph(NULL, "tengine", model_file);
tensor_t input_tensor = get_graph_input_tensor(graph, 0, 0);
set_tensor_shape(input_tensor, dims, 4);
set_tensor_buffer(input_tensor, input_data, input_size);
prerun_graph(graph);
run_graph(graph, 1);

Both libraries offer efficient inference for deep learning models, but ncnn is more focused on mobile and embedded scenarios, while Tengine provides a simpler API with a focus on ease of use. ncnn has a larger community and more frequent updates, but may require more low-level implementation. Tengine offers a more straightforward approach but with potentially less optimization for specific platforms.


MACE is a deep learning inference framework optimized for mobile heterogeneous computing platforms.

Pros of mace

  • More extensive documentation and examples
  • Better support for mobile platforms, especially Android
  • Active community and regular updates

Cons of mace

  • Steeper learning curve for beginners
  • Limited support for some less common neural network operations

Code comparison

mace:

MaceEngine mace_engine(device_type);
mace_engine.Init(net_def, input_nodes, output_nodes, device_context);
mace_engine.Run(inputs, &outputs);

Tengine:

graph_t graph = create_graph(NULL, "tengine", model_file);
prerun_graph(graph);
run_graph(graph, 1);

Summary

mace offers better mobile support and documentation, while Tengine provides a simpler API and easier integration for embedded systems. mace's code is more object-oriented, while Tengine uses a C-style API. Both projects aim to optimize deep learning models for edge devices, but cater to slightly different use cases and developer preferences.

The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.

Pros of ComputeLibrary

  • Extensive support for ARM-based architectures, optimized for ARM CPUs and GPUs
  • Comprehensive documentation and examples for various use cases
  • Active development and regular updates from ARM

Cons of ComputeLibrary

  • Limited cross-platform support compared to Tengine
  • Steeper learning curve for developers not familiar with ARM architectures
  • Larger codebase and potentially higher resource requirements

Code Comparison

Tengine example (inference):

graph_t graph = create_graph(NULL, "tengine", model_file);
tensor_t input_tensor = get_graph_input_tensor(graph, 0, 0);
set_tensor_shape(input_tensor, dims, 4);
prerun_graph(graph);
run_graph(graph, 1);

ComputeLibrary example (convolution):

NEConvolutionLayer conv;
conv.configure(&src, &weights, &biases, &dst, conv_info, weights_info);
conv.run();

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Pros of TVM

  • Broader ecosystem support with multiple frontends (e.g., PyTorch, TensorFlow, ONNX)
  • More extensive documentation and community resources
  • Advanced optimization techniques, including AutoTVM and AutoScheduler

Cons of TVM

  • Steeper learning curve due to its complexity and extensive features
  • Potentially higher resource requirements for compilation and optimization

Code Comparison

TVM example (tensor addition):

import tvm
from tvm import te

n = te.var("n")
A = te.placeholder((n,), name="A")
B = te.placeholder((n,), name="B")
C = te.compute(A.shape, lambda i: A[i] + B[i], name="C")

Tengine example (tensor addition):

struct tensor* A = get_graph_tensor(graph, "input1");
struct tensor* B = get_graph_tensor(graph, "input2");
struct tensor* C = get_graph_tensor(graph, "output");
float* a_data = (float*)get_tensor_buffer(A);
float* b_data = (float*)get_tensor_buffer(B);
float* c_data = (float*)get_tensor_buffer(C);

Both TVM and Tengine aim to optimize deep learning models for various hardware platforms. TVM offers a more comprehensive solution with advanced features and broader ecosystem support, while Tengine focuses on lightweight deployment for embedded systems. TVM's code tends to be more high-level and abstract, whereas Tengine's code is closer to traditional C programming.


TNN: developed by Tencent Youtu Lab and Guangying Lab, a uniform deep learning inference framework for mobile, desktop, and server. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the extensibility and high performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome to collaborate with us and make TNN a better framework.

Pros of TNN

  • More extensive model support, including popular architectures like ResNet, MobileNet, and YOLO
  • Better optimization for mobile and embedded devices, with specific support for ARM and OpenCL
  • More active development and frequent updates

Cons of TNN

  • Steeper learning curve due to more complex architecture
  • Less focus on lightweight deployment for IoT devices
  • Potentially higher resource requirements for some use cases

Code Comparison

TNN example (C++):

auto net = std::make_shared<TNN::TNN>();
TNN::Status status = net->Init(proto, model);
auto instance = net->CreateInst(network_config, status);
instance->Forward();

Tengine example (C):

graph_t graph = create_graph(NULL, "tengine", model_file);
prerun_graph(graph);
run_graph(graph, 1);

Summary

TNN offers more extensive model support and optimization for mobile devices, while Tengine focuses on lightweight deployment for IoT. TNN has a more complex architecture but provides better performance on mobile platforms. Tengine is simpler to use and more suitable for resource-constrained environments. The code examples demonstrate that TNN uses a C++ interface with object-oriented design, while Tengine employs a C-style API with a more procedural approach.


README

简体中文 | English

Tengine


Introduction

Tengine is developed and maintained primarily by OPEN AI LAB. The project targets the fast, efficient deployment of deep learning neural network models on embedded devices. To support cross-platform deployment across a wide range of AIoT applications, the core modules are written in C, and the framework is aggressively trimmed down to fit the limited resources of embedded devices. Tengine also adopts a fully decoupled front-end/back-end design, which makes it easy to port to and deploy on heterogeneous compute units such as CPUs, GPUs, and NPUs, and lowers evaluation and migration costs.

Tengine's core code consists of four modules:

  • device: the NN operator back-end module, with reference implementations provided for CPU, GPU, and NPU;
  • scheduler: the core component of the framework, covering NNIR, the computation graph, hardware resources, and the scheduling and execution of the model parser;
  • operator: the NN operator front-end module, which handles operator registration and initialization;
  • serializer: the model parser, which parses network model parameters stored in the tmfile format (a minimal end-to-end sketch using the C API follows this list).
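
The sketch below shows, under stated assumptions, how these modules appear from the user's side: the serializer parses a tmfile, the scheduler pre-runs and runs the computation graph, and a device back end executes the operators. The file name, input shape, and header path are placeholders; the calls mirror the Tengine C API snippets shown earlier on this page.

#include <stdio.h>
#include <stdlib.h>
#include "tengine/c_api.h"   // older releases use "tengine_c_api.h"

int main(void) {
    init_tengine();

    // serializer: parse a tmfile model ("model.tmfile" is a placeholder).
    graph_t graph = create_graph(NULL, "tengine", "model.tmfile");

    // Describe and attach the input buffer (1x3x224x224 is a placeholder shape).
    int dims[4] = {1, 3, 224, 224};
    int input_size = 1 * 3 * 224 * 224 * sizeof(float);
    float* input_data = (float*)malloc(input_size);

    tensor_t input_tensor = get_graph_input_tensor(graph, 0, 0);
    set_tensor_shape(input_tensor, dims, 4);
    set_tensor_buffer(input_tensor, input_data, input_size);

    // scheduler + device: allocate resources, then execute the operators.
    prerun_graph(graph);
    run_graph(graph, 1);

    // Read back the first output value.
    tensor_t output_tensor = get_graph_output_tensor(graph, 0, 0);
    float* output_data = (float*)get_tensor_buffer(output_tensor);
    printf("first output value: %f\n", output_data[0]);

    postrun_graph(graph);
    destroy_graph(graph);
    free(input_data);
    release_tengine();
    return 0;
}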

Architecture Overview

[Figure: Tengine architecture diagram]

Quick Start

Build

Examples

  • examples provides basic classification and detection use cases, and is continuously updated based on issue requests.
  • Package (apt) install provides apt-get command-line installation and trial on Ubuntu systems; x86 and A311D hardware are currently supported.

Model Zoo

Model Convert Tool

  • Pre-built version: a pre-compiled model convert tool for Ubuntu 18.04;
  • Online version: implemented with WebAssembly (models are converted locally in the browser and are never uploaded);
  • Build from source: building on a server or a PC is recommended, using the following commands:
    mkdir build && cd build
    cmake -DTENGINE_BUILD_CONVERT_TOOL=ON ..
    make -j`nproc`
    

Quantization Tool

  • Build from source: the quantization tool's source code has been open-sourced; uint8 and int8 are supported.

Benchmark

  • Benchmark: a basic tool for evaluating network inference speed; updates from the community are welcome.

NPU Plugin

  • TIM-VX: a usage guide for VeriSilicon NPUs (a minimal usage sketch follows below).
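
As a rough sketch of how a graph could be bound to the TIM-VX NPU back end, the snippet below follows the pattern used in the Tengine TIM-VX examples; the device name string, model file name, and header path are assumptions that may vary by Tengine version.

#include "tengine/c_api.h"   // older releases use "tengine_c_api.h"

int main(void) {
    init_tengine();

    // Create an empty context and attach the TIM-VX NPU device to it.
    context_t timvx_context = create_context("timvx", 1);
    add_context_device(timvx_context, "TIMVX");

    // Graphs created with this context run on the NPU where possible
    // ("model_uint8.tmfile" is a placeholder for a quantized tmfile model).
    graph_t graph = create_graph(timvx_context, "tengine", "model_uint8.tmfile");

    prerun_graph(graph);
    run_graph(graph, 1);

    postrun_graph(graph);
    destroy_graph(graph);
    destroy_context(timvx_context);
    release_tengine();
    return 0;
}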

AutoKernel Plugin

  • AutoKernel is an easy-to-use, low-barrier automatic operator optimization tool; the AutoKernel Plugin deploys automatically optimized operators into Tengine with one click.

Container

  • SuperEdge: uses the SuperEdge open-source container management system for edge computing to provide a more convenient way to manage services;
  • How to use Tengine with SuperEdge: container usage guide;
  • Video Capture user manual: a guide to generating the files the demo depends on.

Roadmap

Acknowledgements

Tengine Lite references and borrows from the following projects:

License

Clarification

  • [Online reporting] The online reporting feature mainly collects information about how Tengine is used; this information is used to optimize and iterate on Tengine and does not affect any normal functionality. It is enabled by default. To disable it, change the following option in the top-level CMakeLists.txt: OPTION (TENGINE_ONLINE_REPORT "online report" OFF)

FAQ

Technical Discussion