intel/compute-runtime

Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver

Top Related Projects

  • DirectML - a high-performance, hardware-accelerated DirectX 12 library for machine learning, providing GPU acceleration for common ML tasks on all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm
  • HIP - C++ Heterogeneous-Compute Interface for Portability
  • cuda-samples - samples for CUDA developers demonstrating features in the CUDA Toolkit
  • TensorFlow - an open source machine learning framework for everyone
  • PyTorch - tensors and dynamic neural networks in Python with strong GPU acceleration

Quick Overview

The intel/compute-runtime repository is an open-source project that provides the Intel Graphics Compute Runtime for OpenCL and oneAPI Level Zero. It enables developers to leverage Intel's integrated and discrete GPUs for general-purpose computing tasks, supporting a wide range of Intel processors.

Pros

  • Supports both OpenCL and oneAPI Level Zero, providing flexibility for developers
  • Optimized for Intel GPUs, offering high performance for compatible hardware
  • Regularly updated with new features and improvements
  • Open-source nature allows for community contributions and customizations

Cons

  • Limited to Intel GPUs, not compatible with other manufacturers' hardware
  • May require specific driver versions, which can lead to compatibility issues
  • Learning curve for developers new to GPU computing or Intel's ecosystem
  • Performance may vary depending on the specific Intel GPU model

Getting Started

To get started with the Intel Graphics Compute Runtime:

  1. Ensure you have a compatible Intel GPU and a supported Linux distribution.
  2. Install the necessary dependencies:
sudo apt-get update
sudo apt-get install ocl-icd-libopencl1 opencl-headers clinfo
  3. Build and install the runtime from source (prebuilt packages are also available; see Installation Options below):
git clone https://github.com/intel/compute-runtime.git
cd compute-runtime
mkdir build && cd build
cmake ..
make
sudo make install
  4. Verify the installation:
clinfo

This should display information about the available OpenCL platforms and devices, including your Intel GPU.
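
For a programmatic check, the following minimal OpenCL host program (an illustrative sketch, not part of the repository; it assumes the OpenCL headers and ICD loader are installed and is built with, e.g., gcc check.c -lOpenCL) lists the platforms the ICD loader can see:

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    // Ask the ICD loader how many OpenCL platforms are visible.
    cl_uint num_platforms = 0;
    if (clGetPlatformIDs(0, NULL, &num_platforms) != CL_SUCCESS || num_platforms == 0) {
        fprintf(stderr, "No OpenCL platforms found\n");
        return 1;
    }
    if (num_platforms > 16) num_platforms = 16;
    cl_platform_id platforms[16];
    clGetPlatformIDs(num_platforms, platforms, NULL);

    for (cl_uint i = 0; i < num_platforms; ++i) {
        // The Intel runtime typically reports a name such as
        // "Intel(R) OpenCL Graphics".
        char name[256];
        clGetPlatformInfo(platforms[i], CL_PLATFORM_NAME, sizeof(name), name, NULL);
        printf("Platform %u: %s\n", i, name);
    }
    return 0;
}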

Competitor Comparisons

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. DirectML provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.

Pros of DirectML

  • Broader hardware support across multiple vendors (not limited to Intel)
  • Designed for machine learning workloads, potentially offering better performance for AI/ML tasks
  • Integration with DirectX ecosystem for graphics and compute

Cons of DirectML

  • Windows-centric, less cross-platform support
  • May have a steeper learning curve for developers not familiar with DirectX

Code Comparison

DirectML (simplified example):

dml::Expression input = dml::InputTensor(graph, 0, inputDesc);
dml::Expression weights = dml::InputTensor(graph, 1, weightsDesc);
dml::Expression output = dml::Convolution(input, weights);

Compute Runtime (OpenCL-based):

cl_kernel kernel = clCreateKernel(program, "convolution", NULL);
clSetKernelArg(kernel, 0, sizeof(cl_mem), &inputBuffer);
clSetKernelArg(kernel, 1, sizeof(cl_mem), &weightsBuffer);
clEnqueueNDRangeKernel(queue, kernel, work_dim, NULL, global_work_size, local_work_size, 0, NULL, NULL);
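
The host snippet above assumes a kernel named "convolution" has already been compiled into program, and for brevity it omits the output buffer and size arguments a real kernel would need. A naive OpenCL C sketch of such a kernel (the 1D layout and argument list are assumptions for illustration):

__kernel void convolution(__global const float* input,
                          __global const float* weights,
                          __global float* output,
                          const int input_size,
                          const int filter_size) {
    // Each work-item computes one output element of a direct 1D convolution.
    int i = get_global_id(0);
    int out_size = input_size - filter_size + 1;
    if (i >= out_size) return;

    float acc = 0.0f;
    for (int k = 0; k < filter_size; ++k)
        acc += input[i + k] * weights[k];
    output[i] = acc;
}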

Both repositories aim to provide efficient compute capabilities, but they target different ecosystems and use cases. DirectML focuses on machine learning acceleration within the Microsoft ecosystem, while Compute Runtime provides a more general-purpose compute solution for Intel hardware using OpenCL.

HIP: C++ Heterogeneous-Compute Interface for Portability

Pros of HIP

  • Open-source and vendor-neutral, supporting multiple GPU architectures
  • Easier porting of CUDA code to run on AMD GPUs
  • Active community development and frequent updates

Cons of HIP

  • No support for Intel GPUs: HIP targets AMD hardware (and NVIDIA GPUs via its CUDA backend), whereas compute-runtime targets Intel graphics
  • No access to Intel-specific hardware optimizations
  • Steeper learning curve for developers new to GPU programming

Code Comparison

HIP:

#include <hip/hip_runtime.h>

__global__ void vectorAdd(float *a, float *b, float *c, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}

compute-runtime (via SYCL, which runs on top of it):

#include <sycl/sycl.hpp>

// Note: a, b, and c must point to device-accessible (e.g. USM) memory.
void vectorAdd(sycl::queue& q, float* a, float* b, float* c, int n) {
    q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        c[i] = a[i] + b[i];
    });
}

The HIP code uses CUDA-like syntax, while the SYCL code takes a more abstracted approach. HIP's syntax may be more familiar to CUDA developers, but SYCL, which runs on Level Zero or OpenCL through compute-runtime, offers better portability across hardware vendors.
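
For context, here is a hedged usage sketch for the SYCL version above (assuming a SYCL 2020 compiler such as DPC++ and unified shared memory support on the device; sizes and values are illustrative):

#include <sycl/sycl.hpp>

int main() {
    sycl::queue q{sycl::gpu_selector_v};
    const int n = 1024;

    // Shared USM allocations are accessible from both host and device.
    float* a = sycl::malloc_shared<float>(n, q);
    float* b = sycl::malloc_shared<float>(n, q);
    float* c = sycl::malloc_shared<float>(n, q);
    for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }

    // Launch the same kernel shape as above and wait for completion.
    q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        c[i] = a[i] + b[i];
    }).wait();

    sycl::free(a, q);
    sycl::free(b, q);
    sycl::free(c, q);
    return 0;
}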

Samples for CUDA developers demonstrating features in the CUDA Toolkit

Pros of cuda-samples

  • Extensive collection of CUDA code examples covering various GPU computing topics
  • Well-documented samples with detailed explanations and performance tips
  • Regular updates to support the latest CUDA features and best practices

Cons of cuda-samples

  • Limited to NVIDIA GPUs and CUDA framework
  • Samples may require specific NVIDIA hardware or driver versions

Code Comparison

compute-runtime (OpenCL):

cl_int status;
cl_platform_id platform;
status = clGetPlatformIDs(1, &platform, NULL);
cl_device_id device;
status = clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

cuda-samples (CUDA):

int deviceCount;
cudaError_t error = cudaGetDeviceCount(&deviceCount);
if (error != cudaSuccess) {
    printf("Error: %s\n", cudaGetErrorString(error));
    exit(EXIT_FAILURE);
}

The compute-runtime repository focuses on Intel's OpenCL implementation for their GPUs, while cuda-samples provides CUDA examples for NVIDIA GPUs. compute-runtime is more of a runtime implementation, whereas cuda-samples is a collection of educational examples. The code snippets show the different APIs used for device initialization in OpenCL and CUDA.

An Open Source Machine Learning Framework for Everyone

Pros of TensorFlow

  • Broader ecosystem and community support
  • More extensive documentation and learning resources
  • Supports a wider range of hardware platforms

Cons of TensorFlow

  • Larger codebase and potentially steeper learning curve
  • May have higher overhead for simple tasks
  • Less optimized for Intel-specific hardware

Code Comparison

TensorFlow example:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

Compute Runtime example:

#include <CL/cl.h>

cl_platform_id platform;
cl_device_id device;
clGetPlatformIDs(1, &platform, NULL);
clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

Summary

TensorFlow is a more versatile and widely-used machine learning framework with extensive community support. Compute Runtime, on the other hand, is specifically designed for Intel hardware and may offer better performance optimization for Intel GPUs. TensorFlow provides higher-level abstractions for machine learning tasks, while Compute Runtime offers lower-level control over compute operations on Intel devices.

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Pros of PyTorch

  • Broader machine learning framework with extensive ecosystem
  • More active community and frequent updates
  • Supports dynamic computational graphs for flexible model development

Cons of PyTorch

  • Larger codebase and potentially steeper learning curve
  • May have higher resource requirements for basic operations
  • Less focused on specific hardware optimizations

Code Comparison

PyTorch example (tensor creation and basic operation):

import torch

x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = x + y
print(z)

compute-runtime example (OpenCL kernel execution):

cl_int err;
cl_kernel kernel = clCreateKernel(program, "vector_add", &err);
clSetKernelArg(kernel, 0, sizeof(cl_mem), &buffer_A);
clSetKernelArg(kernel, 1, sizeof(cl_mem), &buffer_B);
clSetKernelArg(kernel, 2, sizeof(cl_mem), &buffer_C);
clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, &local_size, 0, NULL, NULL);

Summary

PyTorch is a comprehensive machine learning framework with a large community and ecosystem, while compute-runtime focuses on low-level GPU compute capabilities for Intel hardware. PyTorch offers more flexibility and ease of use for general machine learning tasks, but compute-runtime may provide better performance for specific Intel-based applications.

README

Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver

Introduction

The Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver is an open source project providing compute API support (Level Zero, OpenCL) for Intel graphics hardware architectures (HD Graphics, Xe).
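
As a concrete illustration of the Level Zero side, here is a minimal host sketch (an illustrative example, not taken from this repository; it requires the Level Zero loader and headers and is linked with -lze_loader):

#include <stdio.h>
#include <level_zero/ze_api.h>

int main(void) {
    // Initialize the Level Zero driver stack (GPU devices only).
    if (zeInit(ZE_INIT_FLAG_GPU_ONLY) != ZE_RESULT_SUCCESS) {
        fprintf(stderr, "zeInit failed\n");
        return 1;
    }

    // Count the driver handles exposed by installed implementations,
    // such as this compute runtime.
    uint32_t driver_count = 0;
    zeDriverGet(&driver_count, NULL);
    printf("Level Zero drivers found: %u\n", driver_count);
    return 0;
}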

What is NEO?

NEO is the shorthand name for Compute Runtime contained within this repository. It is also a development mindset that we adopted when we first started the implementation effort for OpenCL.

The project evolved beyond a single API and NEO no longer implies a specific API. When talking about a specific API, we will mention it by name (e.g. Level Zero, OpenCL).

License

The Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver is distributed under the MIT License.

You may obtain a copy of the License at: https://opensource.org/licenses/MIT

Supported Platforms

Platform | OpenCL | Level Zero
Intel Core Processors with Gen8 graphics devices (formerly Broadwell) | 3.0 | -
Intel Core Processors with Gen9 graphics devices (formerly Skylake, Kaby Lake, Coffee Lake) | 3.0 | Y
Intel Atom Processors with Gen9 graphics devices (formerly Apollo Lake, Gemini Lake) | 3.0 | -
Intel Core Processors with Gen11 graphics devices (formerly Ice Lake) | 3.0 | Y
Intel Atom Processors with Gen11 graphics devices (formerly Elkhart Lake) | 3.0 | -
Intel Core Processors with Gen12 graphics devices (formerly Tiger Lake, Rocket Lake, Alder Lake) | 3.0 | Y

Release cadence

The release cadence changed from weekly to monthly in late 2022.

  • At the beginning of each calendar month, we identify a well-tested driver version from the previous month as a release candidate for our monthly release.
  • We create a release branch and apply selected fixes for significant issues.
  • The branch naming convention is releases/yy.ww (yy - year, ww - work week of release candidate).
  • The builds are tagged using the following format: yy.ww.bbbbb.hh (yy - year, ww - work week, bbbbb - incremental build number from the master branch, hh - incremental commit number on release branch).
  • We publish and document a monthly release from the tip of that branch.
  • During subsequent weeks of a given month, we continue to cherry-pick fixes to that branch and may publish a hotfix release.
  • Quality level of the driver (per platform) will be provided in the Release Notes.

Installation Options

To allow NEO to access the GPU device, make sure the user has permissions for the /dev/dri/renderD* files (on many distributions this means membership in the render or video group).

Via system package manager

NEO is available for installation on a variety of Linux distributions and can be installed via the distro's package manager.

For example on Ubuntu* 22.04:

apt-get install intel-opencl-icd

Manual download

.deb packages for Ubuntu are provided, along with installation instructions and Release Notes, on the release page.

Linking applications

Directly linking to the runtime library is not supported; applications should link against the Level Zero loader (libze_loader) for Level Zero, or the OpenCL ICD loader (libOpenCL) for OpenCL.

Dependencies

  • GmmLib - https://github.com/intel/gmmlib
  • Intel Graphics Compiler (IGC) - https://github.com/intel/intel-graphics-compiler

In addition, to enable performance counters support, the following packages are needed:

  • Intel Metrics Discovery - https://github.com/intel/metrics-discovery
  • Intel Metrics Library for MDAPI - https://github.com/intel/metrics-library

How to provide feedback

Please submit an issue using the native github.com interface.

How to contribute

Create a pull request on github.com with your patch. Make sure your change builds cleanly and passes ULTs (unit-level tests). A maintainer will contact you if there are questions or concerns. See the contribution guidelines for more details.

See also

Level Zero specific

OpenCL specific

(*) Other names and brands may be claimed as property of others.