DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.

Top Related Projects

  • TensorFlow: An Open Source Machine Learning Framework for Everyone
  • PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
  • ONNX: Open standard for machine learning interoperability
  • Core ML Tools: Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.
  • JAX: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
  • TensorRT: NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Quick Overview

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. It provides low-level APIs for executing machine learning primitives on GPUs and other hardware accelerators, enabling developers to integrate machine learning into their applications with optimal performance.

Pros

  • Hardware acceleration for machine learning tasks on DirectX 12 compatible devices
  • Seamless integration with DirectX 12 graphics pipelines
  • Supports a wide range of machine learning operations and primitives
  • Cross-platform compatibility (Windows and Xbox)

Cons

  • Limited to DirectX 12 compatible hardware
  • Steeper learning curve compared to higher-level machine learning frameworks
  • Less extensive documentation and community support compared to more popular ML libraries
  • Primarily focused on inference rather than training

Code Examples

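  1. Creating a DirectML device:
// Requires <d3d12.h>, <DirectML.h>, and <wrl/client.h> (for Microsoft::WRL::ComPtr).
// Create a Direct3D 12 device on the default adapter.
ComPtr<ID3D12Device> d3d12Device;
D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&d3d12Device));

// Create the DirectML device on top of it. Both calls return an HRESULT that should be checked.
ComPtr<IDMLDevice> dmlDevice;
DMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dmlDevice));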
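  2. Executing a simple addition operation:
ComPtr<IDMLOperator> addOperator;
ComPtr<IDMLCompiledOperator> compiledOperator;

// Describe the element-wise add: OutputTensor = ATensor + BTensor.
// The three DML_TENSOR_DESC values are assumed to be filled in elsewhere.
DML_ELEMENT_WISE_ADD_OPERATOR_DESC addDesc = {};
addDesc.ATensor = &inputTensor1Desc;
addDesc.BTensor = &inputTensor2Desc;
addDesc.OutputTensor = &outputTensorDesc;

// CreateOperator takes the generic DML_OPERATOR_DESC wrapper (operator type + description).
DML_OPERATOR_DESC addOperatorDesc = { DML_OPERATOR_ELEMENT_WISE_ADD, &addDesc };
dmlDevice->CreateOperator(&addOperatorDesc, IID_PPV_ARGS(&addOperator));
dmlDevice->CompileOperator(addOperator.Get(), DML_EXECUTION_FLAG_NONE, IID_PPV_ARGS(&compiledOperator));

// Record execution. The compiled operator must first be initialized through an
// IDMLOperatorInitializer, and bindingTable (an IDMLBindingTable) must have its
// input and output buffers bound; both steps are omitted here.
dmlCommandRecorder->RecordDispatch(commandList.Get(), compiledOperator.Get(), bindingTable.Get());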
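  3. Creating a convolution operator:
// Describe a 2D forward convolution. Tensor descriptions and the stride,
// dilation, and padding arrays are assumed to be defined elsewhere.
DML_CONVOLUTION_OPERATOR_DESC convDesc = {};
convDesc.InputTensor = &inputTensorDesc;
convDesc.FilterTensor = &filterTensorDesc;
convDesc.OutputTensor = &outputTensorDesc;
convDesc.Mode = DML_CONVOLUTION_MODE_CROSS_CORRELATION;
convDesc.Direction = DML_CONVOLUTION_DIRECTION_FORWARD;
convDesc.DimensionCount = 2;  // number of spatial dimensions; each array below holds this many UINTs
convDesc.Strides = strides;
convDesc.Dilations = dilations;
convDesc.StartPadding = startPadding;
convDesc.EndPadding = endPadding;
convDesc.OutputPadding = outputPadding;
convDesc.GroupCount = 1;

// Wrap the description in the generic DML_OPERATOR_DESC before creating the operator.
DML_OPERATOR_DESC convOperatorDesc = { DML_OPERATOR_CONVOLUTION, &convDesc };
ComPtr<IDMLOperator> convOperator;
dmlDevice->CreateOperator(&convOperatorDesc, IID_PPV_ARGS(&convOperator));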
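
The examples above pass tensor descriptions (inputTensorDesc, filterTensorDesc, and so on) without showing how their backing buffers are sized. As a rough illustration, written in Python rather than C++, the sketch below computes the minimum buffer size implied by a tensor's sizes and strides: the index of the last element plus one, times the element size, rounded up to a multiple of 4 bytes. This mirrors the calculation described for DML_BUFFER_TENSOR_DESC.TotalTensorSizeInBytes in the DirectML documentation. The function names are hypothetical and are not part of DirectML.

```python
from typing import List, Optional, Sequence


def packed_strides(sizes: Sequence[int]) -> List[int]:
    """Strides (in elements) for a densely packed, row-major tensor."""
    strides = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return strides


def min_buffer_size_bytes(
    sizes: Sequence[int],
    element_size: int,
    strides: Optional[Sequence[int]] = None,
) -> int:
    """Minimum buffer size in bytes, rounded up to a multiple of 4."""
    if strides is None:
        strides = packed_strides(sizes)
    # Offset (in elements) of the last addressable element.
    last_index = sum((n - 1) * s for n, s in zip(sizes, strides))
    size = (last_index + 1) * element_size
    return (size + 3) & ~3


# A float32 tensor of shape [1, 3, 224, 224].
print(min_buffer_size_bytes([1, 3, 224, 224], element_size=4))
```

The rounding step only matters for element types narrower than 4 bytes, such as a float16 tensor with an odd element count.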

Getting Started

  1. Install the DirectML NuGet package:

    nuget install Microsoft.AI.DirectML
    
  2. Include the DirectML header in your C++ project:

    #include <DirectML.h>
    
  3. Initialize the DirectML device:

    ComPtr<ID3D12Device> d3d12Device;
    D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0, IID_PPV_ARGS(&d3d12Device));
    
    ComPtr<IDMLDevice> dmlDevice;
    DMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dmlDevice));
    
  4. Create and execute operators as needed for your machine learning tasks.

Competitor Comparisons

TensorFlow

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Larger community and ecosystem, with more resources and third-party libraries
  • Supports a wider range of platforms and hardware accelerators
  • More comprehensive documentation and tutorials

Cons of TensorFlow

  • Steeper learning curve for beginners
  • Can be slower for certain operations compared to DirectML
  • Larger file size and memory footprint

Code Comparison

TensorFlow:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

TensorFlow with DirectML:

# Assumes the tensorflow-directml package (TensorFlow 1.15 with a DirectML backend)
# is installed in place of stock TensorFlow; the model code itself is unchanged.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

The model code is identical in both cases. DirectML is not called directly here: it acts as a TensorFlow backend, so the difference lies in which TensorFlow package is installed rather than in an explicit device-selection call.

PyTorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • Widely adopted in the research community with extensive ecosystem
  • Supports dynamic computational graphs for flexible model development
  • Offers a more Pythonic and intuitive API

Cons of PyTorch

  • Generally slower performance on Windows compared to DirectML
  • Less optimized for DirectX-based hardware acceleration
  • No built-in DirectX 12 backend; DirectML acceleration requires the separate torch-directml plugin

Code Comparison

PyTorch:

import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.matmul(x, y)

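PyTorch with DirectML:

import torch
import torch_directml  # PyTorch plugin from the torch-directml package

dml = torch_directml.device()  # default DirectML device
x = torch.tensor([1.0, 2.0, 3.0]).to(dml)
y = torch.tensor([4.0, 5.0, 6.0]).to(dml)
z = torch.matmul(x, y)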

Summary

PyTorch is a popular deep learning framework with a rich ecosystem and flexible design, making it well suited to research and complex model development. DirectML is a lower-level library focused on hardware acceleration across DirectX 12 devices; through the torch-directml plugin it serves as a PyTorch backend rather than a replacement for it. PyTorch offers the higher-level, more approachable API, while DirectML targets optimization on Microsoft platforms, making it a good choice for Windows-centric development and deployment.

ONNX

Open standard for machine learning interoperability

Pros of ONNX

  • Broader ecosystem support and compatibility across multiple frameworks
  • More extensive model zoo and pre-trained models available
  • Active community-driven development with frequent updates

Cons of ONNX

  • Steeper learning curve for beginners
  • May require additional tools for optimization and deployment
  • Less integrated with DirectX and Windows-specific hardware acceleration

Code Comparison

ONNX example:

import onnx
model = onnx.load("model.onnx")
onnx.checker.check_model(model)
print(onnx.helper.printable_graph(model.graph))

DirectML example:

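ComPtr<IDMLDevice> dmlDevice;
DMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dmlDevice));

// The initializer takes an array of compiled operators (from IDMLDevice::CompileOperator).
IDMLCompiledOperator* compiledOperators[] = { compiledOperator.Get() };
ComPtr<IDMLOperatorInitializer> initializer;
dmlDevice->CreateOperatorInitializer(1, compiledOperators, IID_PPV_ARGS(&initializer));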

ONNX focuses on model representation and interoperability, while DirectML provides low-level GPU acceleration for machine learning operations. ONNX is more versatile across platforms, whereas DirectML is optimized for Windows and DirectX-compatible hardware. ONNX has a larger community and more extensive tooling, but DirectML offers tighter integration with Microsoft's ecosystem and potentially better performance on supported devices.

Core ML Tools

Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.

Pros of Core ML Tools

  • Specifically designed for iOS and macOS, providing seamless integration with Apple's ecosystem
  • Supports a wide range of popular machine learning frameworks, including TensorFlow, PyTorch, and scikit-learn
  • Offers comprehensive tools for model conversion, optimization, and deployment on Apple devices

Cons of Core ML Tools

  • Limited to Apple platforms, lacking cross-platform support
  • May require more manual optimization for performance on non-Apple hardware
  • Smaller community and ecosystem compared to DirectML

Code Comparison

Core ML Tools:

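import coremltools as ct

# Convert a saved tf.keras model; 'neuralnetwork' produces the .mlmodel format.
model = ct.convert('model.h5', source='tensorflow', convert_to='neuralnetwork')
model.save('converted_model.mlmodel')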

DirectML:

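ComPtr<IDMLDevice> device;
DMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&device));

// The initializer takes an array of compiled operators (from IDMLDevice::CompileOperator).
IDMLCompiledOperator* compiledOperators[] = { compiledOperator.Get() };
ComPtr<IDMLOperatorInitializer> initializer;
device->CreateOperatorInitializer(1, compiledOperators, IID_PPV_ARGS(&initializer));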

Note: The code snippets demonstrate basic usage and may not be directly comparable due to the different nature and purposes of the libraries.

JAX

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Pros of JAX

  • More flexible and general-purpose, supporting a wider range of machine learning tasks
  • Better support for automatic differentiation and GPU/TPU acceleration
  • Larger and more active community, with frequent updates and contributions

Cons of JAX

  • Steeper learning curve, especially for those not familiar with NumPy
  • Less optimized for DirectX-specific hardware and scenarios
  • May have higher overhead for simple operations compared to DirectML

Code Comparison

JAX example:

import jax.numpy as jnp
from jax import grad, jit

def f(x):
    return jnp.sum(jnp.sin(x))

grad_f = jit(grad(f))

DirectML example:

DML_TENSOR_DESC inputDesc = {};
// ... (initialize tensor description)
DML_ELEMENT_WISE_SIN_OPERATOR_DESC sinDesc = {};
sinDesc.InputTensor = &inputDesc;
// ... (create and execute operator)

The JAX example showcases its simplicity in defining and differentiating functions, while the DirectML example demonstrates lower-level control over tensor operations. JAX provides a more Pythonic interface, whereas DirectML offers fine-grained control for DirectX-specific optimizations.

TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

Pros of TensorRT

  • Highly optimized for NVIDIA GPUs, offering superior performance on supported hardware
  • Extensive support for various deep learning frameworks and models
  • Robust quantization and precision calibration tools for model optimization

Cons of TensorRT

  • Limited to NVIDIA hardware, lacking cross-platform support
  • Steeper learning curve and more complex setup process
  • Less frequent updates and potentially slower bug fixes

Code Comparison

TensorRT:

IBuilder* builder = createInferBuilder(gLogger);
INetworkDefinition* network = builder->createNetworkV2(1U << static_cast<uint32_t>(NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
IOptimizationProfile* profile = builder->createOptimizationProfile();

DirectML:

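ComPtr<IDMLDevice> dmlDevice;
DMLCreateDevice(d3d12Device.Get(), DML_CREATE_DEVICE_FLAG_NONE, IID_PPV_ARGS(&dmlDevice));

// The initializer takes an array of compiled operators (from IDMLDevice::CompileOperator).
IDMLCompiledOperator* compiledOperators[] = { compiledOperator.Get() };
ComPtr<IDMLOperatorInitializer> initializer;
dmlDevice->CreateOperatorInitializer(1, compiledOperators, IID_PPV_ARGS(&initializer));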

Both libraries provide APIs for creating and optimizing deep learning models, but TensorRT focuses on NVIDIA GPUs, while DirectML offers a more hardware-agnostic approach.

README

DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.

When used standalone, the DirectML API is a low-level DirectX 12 library and is suitable for high-performance, low-latency applications such as frameworks, games, and other real-time applications. The seamless interoperability of DirectML with Direct3D 12, together with its low overhead and conformance across hardware, makes DirectML ideal for accelerating machine learning when both high performance and reliable, predictable results across hardware are critical.

More information about DirectML can be found in Introduction to DirectML.

Visit the DirectX Landing Page for more resources for DirectX developers.

Getting Started with DirectML

DirectML is distributed as a system component of Windows 10 and is available in Windows 10, version 1903 (10.0; Build 18362) and newer.

Starting with DirectML version 1.4.0, DirectML is also available as a standalone redistributable package (see Microsoft.AI.DirectML), which is useful for applications that wish to use a fixed version of DirectML, or when running on older versions of Windows 10.
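
As a rough illustration of the version requirement above, the following Python sketch checks whether a Windows version string meets the minimum build for the in-box DirectML component. It assumes the version string has the `major.minor.build` form that `platform.version()` reports on Windows; the function name and constant are hypothetical.

```python
import platform

# Windows 10, version 1903 (10.0; Build 18362), taken from the requirement above.
MIN_DIRECTML_VERSION = (10, 0, 18362)


def has_inbox_directml(system: str, version: str) -> bool:
    """Return True if the OS version is new enough to ship DirectML in-box."""
    if system != "Windows":
        return False
    parts = version.split(".")[:3]
    if len(parts) != 3:
        return False
    try:
        parsed = tuple(int(p) for p in parts)
    except ValueError:
        return False
    return parsed >= MIN_DIRECTML_VERSION


print(has_inbox_directml(platform.system(), platform.version()))
```

On systems that fail this check, the standalone Microsoft.AI.DirectML redistributable described above is the alternative.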

Hardware requirements

DirectML requires a DirectX 12 capable device. Almost all commercially-available graphics cards released in the last several years support DirectX 12. Examples of compatible hardware include:

  • AMD GCN 1st Gen (Radeon HD 7000 series) and above
  • Intel Haswell (4th-gen Core) HD Integrated Graphics and above
  • NVIDIA Kepler (GTX 600 series) and above
  • Qualcomm Adreno 600 and above

For application developers

DirectML exposes a native C++ DirectX 12 API. The header and library (DirectML.h/DirectML.lib) are available as part of the redistributable NuGet package, and are also included in the Windows 10 SDK version 10.0.18362 or newer.

For users, data scientists, and researchers

DirectML is built-in as a backend to several frameworks such as Windows ML, ONNX Runtime, and TensorFlow.

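See the following sections for more information:

  • Windows ML on DirectML
  • ONNX Runtime on DirectML
  • PyTorch with DirectML
  • TensorFlow with DirectML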

DirectML Samples

DirectML C++ sample code is available under Samples.

  • HelloDirectML: A minimal "hello world" application that executes a single DirectML operator.
  • DirectMLNpuInference: A sample that showcases how to utilize NPU hardware with DirectML.
  • DirectMLSuperResolution: A sample that uses DirectML to execute a basic super-resolution model to upscale video from 540p to 1080p in real time.
  • yolov4: YOLOv4 is an object detection model capable of recognizing up to 80 different classes of objects in an image. This sample contains a complete end-to-end implementation of the model using DirectML, and is able to run in real time on a user-provided video stream.

DirectML Python sample code is available under Python/samples. The samples require PyDirectML, an open source Python projection library for DirectML, which can be built and installed to a Python executing environment from Python/src. Refer to the Python/README.md file for more details.

DxDispatch Tool

DxDispatch is a simple command-line executable for launching DirectX 12 compute programs (including DirectML operators) without writing all the C++ boilerplate.

Windows ML on DirectML

Windows ML (WinML) is a high-performance, reliable API for deploying hardware-accelerated ML inferences on Windows devices. DirectML provides the GPU backend for Windows ML.

DirectML acceleration can be enabled in Windows ML using the LearningModelDevice with any one of the DirectX DeviceKinds.

For more information, see Get Started with Windows ML.

ONNX Runtime on DirectML

ONNX Runtime is a cross-platform inferencing and training accelerator compatible with many popular ML/DNN frameworks, including PyTorch, TensorFlow/Keras, scikit-learn, and more.

DirectML is available as an optional execution provider for ONNX Runtime that provides hardware acceleration when running on Windows 10.

For more information about getting started, see Using the DirectML execution provider.
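
A minimal sketch of selecting the DirectML execution provider from Python, assuming an ONNX Runtime build that includes DirectML (for example the onnxruntime-directml package). The helper names and the model path passed to `create_session` are hypothetical; the provider-selection helper is kept free of dependencies so it can be exercised without ONNX Runtime installed.

```python
from typing import List, Sequence

DML_PROVIDER = "DmlExecutionProvider"
CPU_PROVIDER = "CPUExecutionProvider"


def choose_providers(available: Sequence[str]) -> List[str]:
    """Prefer DirectML when it is reported as available; always keep CPU as fallback."""
    providers = [DML_PROVIDER] if DML_PROVIDER in available else []
    providers.append(CPU_PROVIDER)
    return providers


def create_session(model_path: str):
    """Open an inference session, using DirectML when the installed build offers it."""
    import onnxruntime as ort  # imported lazily so choose_providers has no dependencies

    options = ort.SessionOptions()
    # The DirectML execution provider documentation asks for memory pattern
    # optimization to be disabled and for sequential execution.
    options.enable_mem_pattern = False
    options.execution_mode = ort.ExecutionMode.ORT_SEQUENTIAL
    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, sess_options=options, providers=providers)


print(choose_providers(["DmlExecutionProvider", "CPUExecutionProvider"]))
```

Listing the CPU provider last lets the same code run on machines without a DirectX 12 device.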

PyTorch with DirectML

PyTorch with DirectML enables training and inference of complex machine learning models on a wide range of DirectX 12-compatible hardware. This is done through torch-directml, a plugin for PyTorch.

PyTorch with DirectML is supported on both the latest versions of Windows and the Windows Subsystem for Linux, and is available for download as a PyPI package. For more information about getting started with torch-directml, see our Windows or WSL 2 guidance on Microsoft Learn.
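
As a sketch of how the plugin is typically used, the snippet below picks a DirectML device when the torch-directml package is installed and falls back to the CPU otherwise. The helper names are hypothetical; `torch_directml.device()` is the plugin's call for obtaining the default DirectML device.

```python
import importlib.util


def directml_available() -> bool:
    """True if the torch_directml plugin module can be imported."""
    return importlib.util.find_spec("torch_directml") is not None


def select_device():
    """Return a DirectML device if the plugin is present, otherwise the CPU device."""
    import torch

    if directml_available():
        import torch_directml

        return torch_directml.device()
    return torch.device("cpu")


def add_on_device():
    """Add two small tensors on the selected device and copy the result back to the CPU."""
    import torch

    device = select_device()
    x = torch.tensor([1.0, 2.0, 3.0]).to(device)
    y = torch.tensor([4.0, 5.0, 6.0]).to(device)
    return (x + y).to("cpu")


print("torch-directml installed:", directml_available())
```

Calling `add_on_device()` requires PyTorch itself; the availability check does not.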

TensorFlow with DirectML

TensorFlow is a popular open source platform for machine learning and is a leading framework for training of machine learning models.

DirectML acceleration for TensorFlow 1.15 is currently available in Public Preview. TensorFlow on DirectML enables training and inference of complex machine learning models on a wide range of DirectX 12-compatible hardware.

TensorFlow on DirectML is supported on both the latest versions of Windows 10 and the Windows Subsystem for Linux, and is available for download as a PyPI package. For more information about getting started, see GPU accelerated ML training (docs.microsoft.com).
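
One way to confirm that TensorFlow can see a DirectML device is to list the local devices and filter by device type. The sketch below assumes the TensorFlow 1.15-based tensorflow-directml package and assumes that it reports DirectML devices with the device type "DML"; the helper names are hypothetical, and the filtering step is shown on stand-in objects so it runs without TensorFlow installed.

```python
from types import SimpleNamespace
from typing import Iterable, List


def directml_device_names(devices: Iterable) -> List[str]:
    """Names of devices whose device_type is "DML" (assumed DirectML device type)."""
    return [d.name for d in devices if getattr(d, "device_type", None) == "DML"]


def list_directml_devices() -> List[str]:
    """Query TensorFlow for local devices; requires a TensorFlow installation."""
    from tensorflow.python.client import device_lib

    return directml_device_names(device_lib.list_local_devices())


# Stand-in device records with the same attributes the filter reads.
sample = [
    SimpleNamespace(name="/device:CPU:0", device_type="CPU"),
    SimpleNamespace(name="/device:DML:0", device_type="DML"),
]
print(directml_device_names(sample))
```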

Feedback

We look forward to hearing from you!

External Links

Documentation

DirectML programming guide
DirectML API reference

More information

Introducing DirectML (Game Developers Conference '19)
Accelerating GPU Inferencing with DirectML and DirectX 12 (SIGGRAPH '18)
Windows AI: hardware-accelerated ML on Windows devices (Microsoft Build '20)
Gaming with Windows ML (DirectX Developer Blog)
DirectML at GDC 2019 (DirectX Developer Blog)
DirectX ❤ Linux (DirectX Developer Blog)

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.