Top Related Projects
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
HIP: C++ Heterogeneous-Compute Interface for Portability
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
An Open Source Machine Learning Framework for Everyone
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Quick Overview
oneAPI-samples is a GitHub repository of code samples and tutorials for the Intel oneAPI Toolkits. The samples show how to use oneAPI components such as DPC++, oneMKL, and oneDNN to build high-performance applications that target CPUs, GPUs, and FPGAs.
Pros
- Comprehensive collection of samples covering various oneAPI components
- Well-documented examples with detailed explanations and comments
- Supports multiple programming languages (C++, Python, Fortran)
- Regularly updated to reflect the latest oneAPI features and best practices
Cons
- Requires Intel hardware or emulation for full functionality
- Some samples may be complex for beginners
- Limited coverage of certain advanced topics
- May require additional setup and dependencies for specific examples
Code Examples
- Simple vector addition using DPC++:
#include <sycl/sycl.hpp>
#include <array>
#include <iostream>

constexpr int array_size = 10000;

int main() {
    std::array<int, array_size> a, b, c;
    for (int i = 0; i < array_size; i++) {
        a[i] = i;
        b[i] = array_size - i;  // so a[i] + b[i] == array_size for every i
    }

    sycl::queue q;  // default device selection
    sycl::buffer<int, 1> a_buf(a.data(), sycl::range<1>(array_size));
    sycl::buffer<int, 1> b_buf(b.data(), sycl::range<1>(array_size));
    sycl::buffer<int, 1> c_buf(c.data(), sycl::range<1>(array_size));

    q.submit([&](sycl::handler& h) {
        sycl::accessor a_acc(a_buf, h, sycl::read_only);
        sycl::accessor b_acc(b_buf, h, sycl::read_only);
        sycl::accessor c_acc(c_buf, h, sycl::write_only);
        h.parallel_for<class vector_add>(sycl::range<1>(array_size),
                                         [=](sycl::id<1> i) {
            c_acc[i] = a_acc[i] + b_acc[i];
        });
    });

    // Creating a host accessor waits for the kernel to finish.
    sycl::host_accessor c_acc(c_buf, sycl::read_only);
    for (int i = 0; i < array_size; i++) {
        if (c_acc[i] != array_size) {
            std::cout << "Error: Incorrect result" << std::endl;
            return 1;
        }
    }
    std::cout << "Success!" << std::endl;
    return 0;
}
- Using oneMKL for matrix multiplication:
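#include <sycl/sycl.hpp>
#include <oneapi/mkl.hpp>
#include <algorithm>
#include <iostream>

int main() {
    sycl::queue q;
    const int m = 2000, n = 1000, k = 1000;
    // The USM gemm overload needs device-accessible memory, so allocate shared USM.
    float* A = sycl::malloc_shared<float>(m * k, q);
    float* B = sycl::malloc_shared<float>(k * n, q);
    float* C = sycl::malloc_shared<float>(m * n, q);
    std::fill(A, A + m * k, 1.0f);
    std::fill(B, B + k * n, 2.0f);
    std::fill(C, C + m * n, 0.0f);
    // Column-major C = 1.0 * A * B + 0.0 * C, with lda = m, ldb = k, ldc = m.
    oneapi::mkl::blas::column_major::gemm(q, oneapi::mkl::transpose::nontrans,
        oneapi::mkl::transpose::nontrans, m, n, k, 1.0f, A, m, B, k, 0.0f, C, m);
    q.wait();
    std::cout << "Matrix multiplication completed." << std::endl;

    sycl::free(A, q);
    sycl::free(B, q);
    sycl::free(C, q);
    return 0;
}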
- Using oneDNN for convolution:
#include <dnnl.hpp>
#include <iostream>
#include <vector>
Competitor Comparisons
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
Pros of LLVM
- More comprehensive and mature codebase for compiler infrastructure
- Broader community support and contributions
- Wider range of supported languages and architectures
Cons of LLVM
- Steeper learning curve for newcomers
- Larger codebase, potentially more complex to navigate
- May require more setup and configuration for specific use cases
Code Comparison
LLVM (C++):
#include "llvm/IR/LLVMContext.h"
#include "llvm/IR/Module.h"
#include "llvm/IR/IRBuilder.h"
using namespace llvm;
LLVMContext Context;
Module *M = new Module("MyModule", Context);
IRBuilder<> Builder(Context);
oneAPI-samples (DPC++):
#include <sycl/sycl.hpp>
#include <array>
#include <iostream>
using namespace sycl;
queue q;
std::array<int, 10> data;
Summary
LLVM is a more extensive and versatile compiler infrastructure project, while oneAPI-samples focuses on providing examples and tutorials for Intel's oneAPI toolkit. LLVM offers broader language and architecture support but may be more challenging for beginners. oneAPI-samples is more specialized for Intel hardware and provides easier-to-follow examples for those working with oneAPI, but it has a narrower scope compared to LLVM.
Intel® Graphics Compute Runtime for oneAPI Level Zero and OpenCL™ Driver
Pros of compute-runtime
- Focuses on low-level GPU runtime implementation
- Provides direct access to Intel GPU hardware capabilities
- Suitable for advanced developers and system integrators
Cons of compute-runtime
- Steeper learning curve for beginners
- Less comprehensive documentation and examples
- Narrower scope, primarily targeting Intel GPUs
Code Comparison
compute-runtime (OpenCL kernel):
__kernel void vector_add(__global const int *A, __global const int *B, __global int *C) {
    int i = get_global_id(0);
    C[i] = A[i] + B[i];
}
oneAPI-samples (DPC++ kernel):
h.parallel_for(range<1>(N), [=](id<1> i) {
    C[i] = A[i] + B[i];
});
Summary
compute-runtime is a lower-level implementation focusing on Intel GPU runtimes, while oneAPI-samples provides a broader set of examples and tutorials for the oneAPI ecosystem. compute-runtime offers more direct hardware access but requires more expertise, whereas oneAPI-samples is more beginner-friendly and covers a wider range of Intel hardware. The code comparison shows the difference between OpenCL and DPC++ implementations, with oneAPI-samples using a higher-level abstraction.
HIP: C++ Heterogeneous-Compute Interface for Portability
Pros of HIP
- Open-source and vendor-neutral, supporting both AMD and NVIDIA GPUs
- Easier migration path from CUDA to HIP for existing CUDA applications
- Lightweight runtime library with minimal overhead
Cons of HIP
- Smaller ecosystem and community compared to oneAPI
- Limited support for non-GPU accelerators (e.g., FPGAs, specialized AI hardware)
- Less comprehensive toolset for heterogeneous computing
Code Comparison
HIP code example:
#include <hip/hip_runtime.h>
__global__ void vectorAdd(float *a, float *b, float *c, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) c[i] = a[i] + b[i];
}
oneAPI code example:
#include <sycl/sycl.hpp>
// a, b, and c must be device-accessible (for example, USM allocations).
void vectorAdd(sycl::queue &q, float *a, float *b, float *c, int n) {
    q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
        c[i] = a[i] + b[i];
    }).wait();  // block until the kernel has finished
}
The HIP example uses CUDA-like syntax with explicit kernel definition, while the oneAPI example uses SYCL with a more abstracted parallel_for construct. HIP provides a familiar environment for CUDA developers, whereas oneAPI offers a higher-level abstraction for heterogeneous computing across various accelerators.
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
Pros of cuda-samples
- More extensive collection of samples covering a wider range of CUDA features and applications
- Longer history and more mature codebase, reflecting CUDA's established position in GPU computing
- Includes advanced topics like multi-GPU programming and CUDA-specific optimizations
Cons of cuda-samples
- Limited to NVIDIA GPUs, lacking cross-platform compatibility
- Steeper learning curve for beginners due to CUDA's lower-level nature
- Less focus on modern C++ features and programming paradigms
Code Comparison
cuda-samples (CUDA):
__global__ void vectorAdd(const float *A, const float *B, float *C, int numElements) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < numElements) {
        C[i] = A[i] + B[i];
    }
}
oneAPI-samples (DPC++):
q.parallel_for(sycl::range<1>(numElements), [=](sycl::id<1> i) {
    C[i] = A[i] + B[i];
});
The CUDA sample uses explicit kernel definition and thread indexing, while the oneAPI sample uses a higher-level abstraction with parallel_for and lambda functions, showcasing the difference in programming models between CUDA and oneAPI.
An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Widely adopted and supported by a large community
- Comprehensive ecosystem with tools like TensorBoard and TensorFlow Serving
- Supports multiple programming languages (Python, JavaScript, C++)
Cons of TensorFlow
- Steeper learning curve for beginners
- Can be slower for prototyping compared to other frameworks
- Large library size and potential overhead for simpler projects
Code Comparison
TensorFlow:
import tensorflow as tf
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
oneAPI-samples:
#include <sycl/sycl.hpp>
#include <array>
#include <iostream>
constexpr size_t array_size = 10000;
Summary
TensorFlow is a comprehensive machine learning framework with a large ecosystem and community support. It offers more features and flexibility but may have a steeper learning curve. oneAPI-samples, on the other hand, focuses on demonstrating Intel's oneAPI toolkit capabilities across various hardware accelerators. It provides examples for heterogeneous computing but is more specialized compared to TensorFlow's general-purpose machine learning capabilities.
Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Larger community and ecosystem, with more resources and third-party libraries
- More flexible and dynamic computational graph, allowing for easier debugging
- Broader application range, including computer vision, NLP, and reinforcement learning
Cons of PyTorch
- Steeper learning curve for beginners compared to oneAPI-samples
- Less optimized for Intel hardware, potentially slower on Intel CPUs and GPUs
- More complex setup and installation process
Code Comparison
PyTorch example:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = torch.add(x, y)
oneAPI-samples example:
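#include <sycl/sycl.hpp>
using namespace sycl;
int main() {
    constexpr size_t N = 1024;  // illustrative size; the original snippet leaves N undefined
    queue q;
    int* data = malloc_shared<int>(N, q);
    q.parallel_for(range<1>(N), [=](id<1> i) {
        data[i] = i;
    }).wait();
    sycl::free(data, q);  // release the shared allocation
    return 0;
}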
The PyTorch example demonstrates simple tensor operations, while the oneAPI-samples code shows parallel processing using SYCL. PyTorch offers a more intuitive API for machine learning tasks, whereas oneAPI-samples provides lower-level control for heterogeneous computing across various Intel architectures.
README
oneAPI Samples
The oneAPI-samples repository contains samples for the Intel® oneAPI Toolkits.
The contents of the default branch in this repository are meant to be used with the most recent released version of the Intel® oneAPI Toolkits.
Find oneAPI Samples
You can find samples by browsing the oneAPI Samples Catalog. Using the catalog you can search on the sample titles or descriptions.
You can refine your browsing or searching through filtering on the following:
- Expertise (Getting Started, Tutorial, etc.)
- Programming language (C++, Python, or Fortran)
- Target device (CPU, GPU, and FPGA)
Get the oneAPI Samples
Clone the repository by entering the following command:
git clone https://github.com/oneapi-src/oneAPI-samples.git
Alternatively, you can download a zip file containing the primary branch of the repository.
- Click the Code button.
- Select Download ZIP from the menu options.
- After downloading the file, unzip the repository contents.
Get Earlier Versions of the oneAPI Samples
If you need samples for an earlier version of any of the Intel® oneAPI Toolkits, then use a tagged version of the repository that corresponds with the toolkit version.
Clone an earlier version of the repository using Git by entering a command similar to the following:
git clone -b <tag> https://github.com/oneapi-src/oneAPI-samples.git
where <tag> is the GitHub tag corresponding to the toolkit version number, like 2024.2.0.
Alternatively, you can download a zip file containing a specific tagged version of the repository.
- Select the appropriate tag.
- Click the Code button.
- Select Download ZIP from the menu options.
- After downloading the file, unzip the repository contents.
Getting Started with oneAPI Samples
The best oneAPI sample to start with depends on what you are trying to learn or the types of problems you are trying to solve.
If you want to learn about... | Start with... |
---|---|
the basics of writing, compiling, and building programs for CPUs, GPUs, or FPGAs | Simple Add or Vector Add samples (You can use these samples as starter projects by removing unwanted elements and adding your code and build requirements.) |
the basics of using artificial intelligence | Getting Started Samples for AI Tools |
the basics of image rendering workloads and ray tracing | Getting Started Samples for Intel® oneAPI Rendering Toolkit (Render Kit) |
how to modify or create build files for SYCL-compliant projects | Vector Add sample |
Note: The README.md included with each sample provides build instructions for all supported operating systems. For samples run in Jupyter Notebooks, you might need to install or configure additional frameworks or package managers if you do not already have them on your system.
Using Integrated Development Environments (IDE)
If you prefer to use an Integrated Development Environment (IDE) with these samples, you can download Visual Studio Code for use on Windows*, Linux*, and macOS*.
Repository Structure
The oneAPI-samples repository is organized by high-level categories.
Platform Validation
Ubuntu 22.04
Intel(R) Xeon(R) Platinum 8468V
Intel(R) Data Center GPU Max 1100
OpenCL Driver: Intel(R) OpenCL, Intel(R) Xeon(R) Platinum 8468V OpenCL 3.0 (Build 0) [2024.18.7.0.11_160000]
Level Zero Driver: Intel(R) Level-Zero, Intel(R) Data Center GPU Max 1100 1.3 [1.3.28202]
oneAPI package version:
‐ Intel oneAPI HPC Toolkit Build Version: 2025.0.0.825
Windows 11
11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz
Intel(R) Iris(R) Xe Graphics OpenCL 3.0 NEO
OpenCL Driver: Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz OpenCL 3.0 (Build 0) [2024.18.9.0.28_160000]
Level Zero Driver: Intel(R) oneAPI Unified Runtime over Level-Zero, Intel(R) Iris(R) Xe Graphics 12.0.0 [1.3.27193]
oneAPI package version:
‐ Intel oneAPI HPC Toolkit Build Version: 2025.0.0.822
Known Issues and Limitations
Windows
- If you are using Microsoft Visual Studio* 2019, you must use Microsoft Visual Studio 2019 version 16.4.0 or newer.
- Windows support for the FPGA code samples is limited to the FPGA emulator and optimization reports. Only Linux supports FPGA hardware compilation. See any FPGA code sample README.md for more details.
- If you encounter Error MSB6003 The specified task executable ... could not be run... when building a sample program, it might be due to the length of the directory path. Move the build directory to a location with a shorter path. Build the sample in the new location.
Additional Resources for Code Samples
A curated list of samples from oneAPI based projects, libraries, and tools. In addition, the most exciting samples from other AI projects that are not necessarily based on oneAPI are also listed here to provide you with the latest and valuable resources for augmenting your productivity.
- OpenVINO™ notebooks: A collection of ready-to-run Jupyter notebooks for learning and experimenting with the OpenVINO™ Toolkit, an open-source AI toolkit that makes it easier to write once, deploy anywhere. The notebooks introduce OpenVINO basics and teach developers how to leverage the API for optimized deep learning inference.
- Intel® Gaudi® Tutorials: Tutorials with step-by-step instructions for running PyTorch and PyTorch Lightning models on the Intel Gaudi AI Processor for training and inferencing, from beginner level to advanced users.
- Powered-by-Intel Leaderboard: This leaderboard celebrates and increases the discoverability of models developed on Intel hardware by the AI developer community. We provide developers with sample code and resources (developer programs) to deploy (inference) AI PC, Intel® Xeon® Scalable processors, Intel® Gaudi® processors, Intel® Arc™ GPUs, and Intel® Data Center GPUs.
- Intel® AI Reference Models: This repository contains links to pre-trained models, sample scripts, best practices, and step-by-step tutorials for many popular open-source machine learning models optimized by Intel to run on Intel® Xeon® Scalable processors and Intel® Data Center GPUs.
- awesome-oneapi: A community sourced list of awesome oneAPI and SYCL projects for solutions across a wide range of industry segments.
- Generative AI Examples: A collection of GenAI examples such as ChatQnA, Copilot, which illustrate the pipeline capabilities of the Open Platform for Enterprise AI (OPEA) project. OPEA is an ecosystem orchestration framework to integrate performant GenAI technologies & workflows leading to quicker GenAI adoption and business value.
Licenses
Code samples are licensed under the MIT license. See License.txt for details.
Third-party program licenses can be found here: third-party-programs.txt.
Notices and Disclaimers
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.