Top Related Projects
- ArrayFire: a general purpose GPU library.
- HIP: C++ Heterogeneous-Compute Interface for Portability
- PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
- TensorFlow: An Open Source Machine Learning Framework for Everyone
Quick Overview
Boost.Compute is a C++ GPU computing library based on OpenCL. It provides a high-level, STL-like interface for GPU-accelerated computing, making it easier for developers to harness the power of GPUs for parallel processing tasks.
Pros
- Seamless integration with C++ and the STL, allowing for familiar programming patterns
- Cross-platform support, working on various GPU architectures
- High-level abstractions that simplify GPU programming
- Extensive documentation and examples
Cons
- Requires OpenCL support, which may not be available on all systems
- Performance may not always match hand-optimized OpenCL code
- Learning curve for developers new to GPU programming concepts
- Limited support for newer GPU features compared to vendor-specific libraries
Code Examples
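All of the snippets below assume a device, context, and command queue have already been created, using the same setup described in the Getting Started section:
#include <boost/compute/core.hpp>
namespace compute = boost::compute;
// select the default compute device and create a context and queue for it
compute::device device = compute::system::default_device();
compute::context context(device);
compute::command_queue queue(context, device);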
- Vector addition:
#include <boost/compute/algorithm/transform.hpp>
#include <boost/compute/container/vector.hpp>
namespace compute = boost::compute;
// device vectors; fill a and b with data (e.g. via compute::copy) before use
compute::vector<float> a(1000, context);
compute::vector<float> b(1000, context);
compute::vector<float> c(1000, context);
// c[i] = a[i] + b[i], computed on the device
compute::transform(
    a.begin(), a.end(),
    b.begin(),
    c.begin(),
    compute::plus<float>(),
    queue
);
- Parallel reduction:
#include <boost/compute/algorithm/reduce.hpp>
// reduce() writes its result through an output iterator (a plain host
// pointer works for a single value); compute::accumulate() provides the
// STL-style form that takes an initial value and returns the result
float sum = 0.0f;
compute::reduce(
    vector.begin(),
    vector.end(),
    &sum,
    compute::plus<float>(),
    queue
);
- Custom kernel:
#include <boost/compute/kernel.hpp>
// OpenCL C source for a kernel that doubles each input element
const char source[] = BOOST_COMPUTE_STRINGIZE_SOURCE(
    __kernel void custom_kernel(__global float* input, __global float* output)
    {
        const uint i = get_global_id(0);
        output[i] = input[i] * 2.0f;
    }
);
// compile the kernel and bind the buffers of two previously created
// compute::vector<float> containers named input and output
compute::kernel kernel = compute::kernel::create_with_source(source, "custom_kernel", context);
kernel.set_arg(0, input.get_buffer());
kernel.set_arg(1, output.get_buffer());
// launch one work-item per element
queue.enqueue_1d_range_kernel(kernel, 0, input.size(), 0);
Getting Started
1. Install Boost.Compute:
git clone https://github.com/boostorg/compute.git
cd compute
mkdir build && cd build
cmake ..
make install
(Boost.Compute is header-only, so alternatively you can simply add its include directory to your compiler's include path.)
2. Include the necessary headers in your C++ file:
#include <boost/compute/core.hpp>
#include <boost/compute/algorithm/copy.hpp>
#include <boost/compute/container/vector.hpp>
3. Initialize the Boost.Compute environment:
namespace compute = boost::compute;
compute::device device = compute::system::default_device();
compute::context context(device);
compute::command_queue queue(context, device);
4. Start using Boost.Compute in your code! A minimal example follows.
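For instance, here is a minimal sketch that round-trips data through the device, using only the headers from step 2 and the setup from step 3:
#include <vector>
#include <boost/compute/core.hpp>
#include <boost/compute/algorithm/copy.hpp>
#include <boost/compute/container/vector.hpp>
namespace compute = boost::compute;
int main()
{
    compute::device device = compute::system::default_device();
    compute::context context(device);
    compute::command_queue queue(context, device);
    // host data
    std::vector<int> host = {1, 2, 3, 4};
    // copy to the device and back again
    compute::vector<int> dev(host.size(), context);
    compute::copy(host.begin(), host.end(), dev.begin(), queue);
    compute::copy(dev.begin(), dev.end(), host.begin(), queue);
    return 0;
}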
Competitor Comparisons
ArrayFire: a general purpose GPU library.
Pros of ArrayFire
- Supports multiple backends (CUDA, OpenCL, CPU) for greater flexibility
- Extensive library of pre-built functions for various domains (linear algebra, signal processing, etc.)
- Active development and regular updates
Cons of ArrayFire
- Larger codebase and potentially steeper learning curve
- May have higher memory usage due to its comprehensive feature set
Code Comparison
ArrayFire:
af::array A = af::randu(5, 5);
af::array B = af::constant(1, 5, 5);
af::array C = af::matmul(A, B);
Compute:
boost::compute::vector<float> A(25, context);
boost::compute::vector<float> B(25, context);
boost::compute::vector<float> C(25, context);
// element-wise product; unlike ArrayFire, Boost.Compute has no built-in
// matmul, so a true matrix product would require a custom kernel
boost::compute::transform(A.begin(), A.end(), B.begin(), C.begin(),
                          boost::compute::multiplies<float>(), queue);
Key Differences
- ArrayFire provides higher-level abstractions and built-in functions
- Compute offers more fine-grained control over GPU operations
- ArrayFire's syntax is more concise for complex operations
- Compute integrates well with other Boost libraries
Use Cases
- ArrayFire: Rapid prototyping, scientific computing, machine learning
- Compute: Performance-critical applications, custom GPU algorithms, Boost ecosystem integration
HIP: C++ Heterogeneous-Compute Interface for Portability
Pros of HIP
- Broader hardware support, including AMD GPUs and potentially other accelerators
- More active development and community support
- Closer to native CUDA syntax, potentially easier for CUDA developers to adopt
Cons of HIP
- Less mature and stable compared to Boost.Compute
- May require more frequent updates to keep up with hardware changes
- Potentially more complex setup and configuration process
Code Comparison
Boost.Compute example:
compute::vector<int> vec(1000);
compute::fill(vec.begin(), vec.end(), 42);
compute::sort(vec.begin(), vec.end());
HIP example:
int* d_vec;
hipMalloc(&d_vec, 1000 * sizeof(int));
// fill_kernel and sort_kernel are user-written kernels (a sketch of
// fill_kernel follows below); HIP itself ships no built-in fill or sort
hipLaunchKernelGGL(fill_kernel, dim3(32), dim3(32), 0, 0, d_vec, 42, 1000);
hipLaunchKernelGGL(sort_kernel, dim3(32), dim3(32), 0, 0, d_vec, 1000);
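For context, fill_kernel might look like the following sketch (the kernel name and signature are assumptions chosen to match the launch above, not part of HIP):
#include <hip/hip_runtime.h>
// hypothetical kernel: writes value into each of the n ints in data
__global__ void fill_kernel(int* data, int value, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        data[i] = value;
}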
Both libraries aim to simplify GPU programming, but HIP focuses on providing a unified interface for CUDA and AMD GPUs, while Boost.Compute offers a more abstracted C++ approach. HIP's syntax is closer to CUDA, which may be familiar to many developers, while Boost.Compute provides a higher-level interface that integrates well with the C++ Standard Library.
PyTorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Pros of PyTorch
- Extensive deep learning framework with a large ecosystem and community support
- Dynamic computational graphs for flexible model development
- Seamless integration with Python and NumPy
Cons of PyTorch
- Steeper learning curve for beginners compared to Compute
- Higher resource requirements for installation and usage
- Less focus on general-purpose GPU computing
Code Comparison
PyTorch:
import torch
x = torch.tensor([1, 2, 3])
y = torch.tensor([4, 5, 6])
z = x + y
print(z)
Compute:
#include <boost/compute.hpp>
namespace compute = boost::compute;
// these calls use the system default command queue when none is passed
compute::vector<int> x = {1, 2, 3};
compute::vector<int> y = {4, 5, 6};
compute::vector<int> z(3);
compute::transform(x.begin(), x.end(), y.begin(), z.begin(), compute::plus<int>());
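To mirror PyTorch's print(z), the device result can be copied back to the host, e.g. (a sketch reusing x, y, and z from above):
#include <iostream>
#include <vector>
// copy the device result into a host vector and print it
std::vector<int> host_z(3);
compute::copy(z.begin(), z.end(), host_z.begin());
for (int v : host_z)
    std::cout << v << " ";  // prints: 5 7 9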
PyTorch offers a more concise and Python-friendly syntax, while Compute provides lower-level control and integration with C++ ecosystems. PyTorch is primarily designed for deep learning tasks, whereas Compute focuses on general-purpose GPU computing across various domains.
TensorFlow: An Open Source Machine Learning Framework for Everyone
Pros of TensorFlow
- Extensive ecosystem with high-level APIs and tools for machine learning
- Strong support for distributed computing and GPU acceleration
- Large community and extensive documentation
Cons of TensorFlow
- Steeper learning curve for beginners
- Larger footprint and slower compilation times
- Less flexible for general-purpose GPU computing tasks
Code Comparison
TensorFlow (Python):
import tensorflow as tf
x = tf.constant([[1], [2], [3], [4]], dtype=tf.float32)
y = tf.constant([[0], [-1], [-2], [-3]], dtype=tf.float32)
linear_model = tf.keras.layers.Dense(units=1)
y_pred = linear_model(x)
loss = tf.keras.losses.MeanSquaredError()(y, y_pred)
Boost.Compute (C++):
#include <boost/compute.hpp>
namespace compute = boost::compute;
// assumes a command queue set up as in the Getting Started steps
compute::vector<float> x = {1, 2, 3, 4};
compute::vector<float> y = {0, -1, -2, -3};
// y[i] = x[i] - y[i], computed in place on the device
compute::transform(x.begin(), x.end(), y.begin(), y.begin(),
                   compute::minus<float>(), queue);
TensorFlow is more suited for complex machine learning tasks, while Boost.Compute provides a lower-level interface for general-purpose GPU computing. TensorFlow offers a higher level of abstraction, making it easier to implement complex models, but Boost.Compute allows for more fine-grained control over GPU operations.
README
Boost.Compute
Boost.Compute is a GPU/parallel-computing library for C++ based on OpenCL.
The core library is a thin C++ wrapper over the OpenCL API and provides access to compute devices, contexts, command queues and memory buffers.
On top of the core library is a generic, STL-like interface providing common algorithms (e.g. transform(), accumulate(), sort()) along with common containers (e.g. vector<T>, flat_set<T>). It also features a number of extensions including parallel-computing algorithms (e.g. exclusive_scan(), scatter(), reduce()) and a number of fancy iterators (e.g. transform_iterator<>, permutation_iterator<>, zip_iterator<>).
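As a small illustration of the core layer, here is a sketch that allocates a raw device buffer and writes host data into it (the buffer size and contents are arbitrary):
#include <boost/compute/core.hpp>
namespace compute = boost::compute;
int main()
{
    // core layer: direct access to device, context, queue, and buffers
    compute::device gpu = compute::system::default_device();
    compute::context ctx(gpu);
    compute::command_queue queue(ctx, gpu);
    // allocate a raw device buffer holding 1024 floats
    compute::buffer buf(ctx, 1024 * sizeof(float));
    // copy host data into the buffer through the command queue
    float host[1024] = {0};
    queue.enqueue_write_buffer(buf, 0, sizeof(host), host);
    return 0;
}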
The full documentation is available at http://boostorg.github.io/compute/.
Example
The following example shows how to sort a vector of floats on the GPU:
#include <vector>
#include <algorithm>
#include <cstdlib> // for rand()
#include <boost/compute.hpp>
namespace compute = boost::compute;
int main()
{
    // get the default compute device
    compute::device gpu = compute::system::default_device();
    // create a compute context and command queue
    compute::context ctx(gpu);
    compute::command_queue queue(ctx, gpu);
    // generate random numbers on the host
    std::vector<float> host_vector(1000000);
    std::generate(host_vector.begin(), host_vector.end(), rand);
    // create vector on the device
    compute::vector<float> device_vector(1000000, ctx);
    // copy data to the device
    compute::copy(
        host_vector.begin(), host_vector.end(), device_vector.begin(), queue
    );
    // sort data on the device
    compute::sort(
        device_vector.begin(), device_vector.end(), queue
    );
    // copy data back to the host
    compute::copy(
        device_vector.begin(), device_vector.end(), host_vector.begin(), queue
    );
    return 0;
}
Boost.Compute is a header-only library, so no linking is required. The example above can be compiled with:
g++ -I/path/to/compute/include sort.cpp -lOpenCL
More examples can be found in the tutorial and under the examples directory.
Support
Questions about the library (both usage and development) can be posted to the mailing list.
Bugs and feature requests can be reported through the issue tracker.
Also feel free to send me an email with any problems, questions, or feedback.
Help Wanted
The Boost.Compute project is currently looking for additional developers with interest in parallel computing.
Please send an email to Kyle Lutz (kyle.r.lutz@gmail.com) for more information.