Top Related Projects
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
This repo contains the DirectX Graphics samples that demonstrate how to build graphics intensive applications on Windows.
Quick Overview
NVIDIA/CUDALibrarySamples is a GitHub repository containing sample code demonstrating the usage of various CUDA libraries. It provides developers with practical examples and best practices for leveraging NVIDIA's GPU-accelerated libraries in their applications, covering areas such as linear algebra, signal processing, and machine learning.
Pros
- Comprehensive collection of samples for multiple CUDA libraries
- Well-documented code with explanations and comments
- Regularly updated to reflect the latest CUDA library versions
- Serves as a valuable learning resource for GPU programming
Cons
- Requires NVIDIA GPU hardware for execution
- Some samples may be complex for beginners
- Limited to CUDA-specific implementations, not applicable to other GPU platforms
- May require significant computational resources for certain examples
Code Examples
- cuBLAS matrix multiplication:
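#include <cuda_runtime.h>
#include <cublas_v2.h>

int main() {
    cublasHandle_t handle;
    cublasCreate(&handle);

    // Device buffers for column-major A (m x k), B (k x n), and C (m x n).
    // Contents are uninitialized here; a real program copies input data in first.
    float *d_A, *d_B, *d_C;
    int m = 1000, n = 1000, k = 1000;
    cudaMalloc(&d_A, m * k * sizeof(float));
    cudaMalloc(&d_B, k * n * sizeof(float));
    cudaMalloc(&d_C, m * n * sizeof(float));

    // C = alpha * A * B + beta * C; the leading dimensions are the row counts.
    float alpha = 1.0f, beta = 0.0f;
    cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha, d_A, m, d_B, k, &beta, d_C, m);

    // Release device memory, then the cuBLAS handle.
    cudaFree(d_A);
    cudaFree(d_B);
    cudaFree(d_C);
    cublasDestroy(handle);
    return 0;
}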
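When checking the output of the cublasSgemm call above, a small host-side reference can help. The sketch below is not taken from the repository; the function name sgemm_ref_colmajor is hypothetical. It assumes no transposes (CUBLAS_OP_N for both operands) and the column-major layout that cuBLAS uses, which is why the leading dimension of A and C is the row count m.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference: C = alpha * A * B + beta * C,
// no transposes, all matrices stored column-major.
// A is m x k (leading dimension lda), B is k x n (ldb), C is m x n (ldc).
void sgemm_ref_colmajor(int m, int n, int k, float alpha,
                        const std::vector<float>& A, int lda,
                        const std::vector<float>& B, int ldb,
                        float beta, std::vector<float>& C, int ldc) {
    assert(lda >= m && ldb >= k && ldc >= m);
    for (int col = 0; col < n; ++col) {
        for (int row = 0; row < m; ++row) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p) {
                // Element (row, p) of A lives at A[row + p * lda].
                const std::size_t a_idx = row + static_cast<std::size_t>(p) * lda;
                const std::size_t b_idx = p + static_cast<std::size_t>(col) * ldb;
                acc += A[a_idx] * B[b_idx];
            }
            const std::size_t c_idx = row + static_cast<std::size_t>(col) * ldc;
            // Following the BLAS convention, C is not read when beta is zero.
            C[c_idx] = (beta == 0.0f) ? alpha * acc : alpha * acc + beta * C[c_idx];
        }
    }
}
```

Comparing a few entries of the GPU result (copied back with cudaMemcpy) against this reference, within a floating-point tolerance, is usually enough to catch a swapped dimension or a wrong leading dimension.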
- cuFFT 1D FFT:
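#include <cuda_runtime.h>
#include <cufft.h>

int main() {
    cufftHandle plan;
    cufftComplex *d_input, *d_output;
    int n = 1024;
    // Device buffers; contents are uninitialized until input data is copied in.
    cudaMalloc(&d_input, n * sizeof(cufftComplex));
    cudaMalloc(&d_output, n * sizeof(cufftComplex));

    // One batch of an n-point complex-to-complex transform, executed out of place.
    cufftPlan1d(&plan, n, CUFFT_C2C, 1);
    cufftExecC2C(plan, d_input, d_output, CUFFT_FORWARD);

    cufftDestroy(plan);
    cudaFree(d_input);
    cudaFree(d_output);
    return 0;
}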
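To sanity-check a forward C2C transform like the one above on a small input, a naive host-side DFT is often sufficient. The sketch below is illustrative only; dft_forward_ref is a hypothetical name, not a cuFFT function. It uses the negative exponent sign and applies no 1/n scaling, which matches the unnormalized forward transform that cuFFT computes.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical O(n^2) host reference for an out-of-place forward DFT:
// out[k] = sum over j of in[j] * exp(-2*pi*i*j*k/n), with no normalization.
void dft_forward_ref(const std::vector<std::complex<float>>& in,
                     std::vector<std::complex<float>>& out) {
    assert(in.size() == out.size());
    const std::size_t n = in.size();
    const double two_pi = 6.283185307179586;
    for (std::size_t k = 0; k < n; ++k) {
        // Accumulate in double precision to keep the reference accurate.
        std::complex<double> sum(0.0, 0.0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reducing j*k modulo n keeps the angle small without changing the result.
            const double angle = -two_pi * static_cast<double>((j * k) % n) / static_cast<double>(n);
            const std::complex<double> twiddle(std::cos(angle), std::sin(angle));
            sum += std::complex<double>(in[j].real(), in[j].imag()) * twiddle;
        }
        out[k] = std::complex<float>(static_cast<float>(sum.real()),
                                     static_cast<float>(sum.imag()));
    }
}
```

Because the transform is unnormalized, running a forward and then an inverse transform scales the data by n; dividing by n afterwards recovers the original signal.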
- cuDNN convolution:
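#include <cudnn.h>

int main() {
    cudnnHandle_t cudnn;
    cudnnCreate(&cudnn);

    cudnnTensorDescriptor_t input_descriptor;
    cudnnFilterDescriptor_t kernel_descriptor;
    cudnnConvolutionDescriptor_t convolution_descriptor;
    cudnnTensorDescriptor_t output_descriptor;
    cudnnCreateTensorDescriptor(&input_descriptor);
    cudnnCreateFilterDescriptor(&kernel_descriptor);
    cudnnCreateConvolutionDescriptor(&convolution_descriptor);
    cudnnCreateTensorDescriptor(&output_descriptor);

    // Set up descriptors...
    // The elided setup is also assumed to allocate d_input, d_kernel, d_output,
    // and d_workspace, and to set workspace_size for the chosen algorithm.

    // output = alpha * conv(input, kernel) + beta * output
    float alpha = 1.0f, beta = 0.0f;
    cudnnConvolutionForward(cudnn, &alpha, input_descriptor, d_input, kernel_descriptor, d_kernel,
                            convolution_descriptor, CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM,
                            d_workspace, workspace_size, &beta, output_descriptor, d_output);

    // Release the descriptors, then the handle.
    cudnnDestroyTensorDescriptor(input_descriptor);
    cudnnDestroyTensorDescriptor(output_descriptor);
    cudnnDestroyFilterDescriptor(kernel_descriptor);
    cudnnDestroyConvolutionDescriptor(convolution_descriptor);
    cudnnDestroy(cudnn);
    return 0;
}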
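The descriptor setup elided above has to give the output descriptor dimensions that agree with the input, filter, padding, stride, and dilation. As a rough sketch of that arithmetic along one spatial dimension, using the formula documented for cudnnGetConvolution2dForwardOutputDim (the helper name conv_output_dim is hypothetical):

```cpp
#include <cassert>

// Hypothetical helper: output extent along one spatial dimension of a
// forward convolution, given input extent, zero padding on each side,
// filter extent, stride, and dilation.
int conv_output_dim(int input, int pad, int filter, int stride, int dilation) {
    assert(filter > 0 && stride > 0 && dilation > 0);
    // A dilated filter spans dilation * (filter - 1) + 1 input elements.
    const int effective_filter = dilation * (filter - 1) + 1;
    assert(input + 2 * pad >= effective_filter);
    return 1 + (input + 2 * pad - effective_filter) / stride;
}
```

For example, a 3-wide filter with padding 1 and stride 1 keeps a 32-wide input at 32, while a 7-wide filter with padding 3 and stride 2 reduces a 224-wide input to 112. In real code, querying cuDNN for the output dimensions is preferable to recomputing them by hand.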
Getting Started
- Clone the repository:
  git clone https://github.com/NVIDIA/CUDALibrarySamples.git
- Install the CUDA Toolkit and required libraries (cuBLAS, cuFFT, cuDNN, etc.).
- Navigate to a specific sample directory:
  cd CUDALibrarySamples/cuBLAS/Level-1
- Build and run the sample.
Competitor Comparisons
Samples for CUDA Developers which demonstrates features in CUDA Toolkit
Pros of cuda-samples
- More comprehensive coverage of CUDA features and techniques
- Includes a wider range of application domains and use cases
- Better suited for beginners learning CUDA programming
Cons of cuda-samples
- Larger repository size, potentially overwhelming for some users
- Some samples may be outdated or less relevant for modern CUDA development
- Less focused on specific CUDA libraries compared to CUDALibrarySamples
Code Comparison
CUDALibrarySamples (cuBLAS example):
cublasHandle_t handle;
cublasCreate(&handle);
cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N, m, n, k, &alpha, d_A, m, d_B, k, &beta, d_C, m);
cublasDestroy(handle);
cuda-samples (vector addition example):
__global__ void vectorAdd(const float *A, const float *B, float *C, int numElements) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < numElements) {
        C[i] = A[i] + B[i];
    }
}
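The bounds check in the kernel matters because the grid is normally rounded up to a whole number of blocks, so the last block can contain threads with no element to process. A host-only sketch of that sizing and index mapping follows; the names blocks_for and vector_add_emulated are hypothetical, and the nested loops merely stand in for blockIdx.x and threadIdx.x rather than running anything on a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks needed so that blocks * threads_per_block >= num_elements.
int blocks_for(int num_elements, int threads_per_block) {
    assert(num_elements >= 0 && threads_per_block > 0);
    return (num_elements + threads_per_block - 1) / threads_per_block;
}

// Host-side emulation of the launch. Returns how many emulated threads
// fell past the end of the arrays and were skipped by the guard.
int vector_add_emulated(const std::vector<float>& A, const std::vector<float>& B,
                        std::vector<float>& C, int threads_per_block) {
    assert(A.size() == B.size() && A.size() == C.size());
    const int num_elements = static_cast<int>(A.size());
    const int blocks = blocks_for(num_elements, threads_per_block);
    int skipped = 0;
    for (int block = 0; block < blocks; ++block) {
        for (int thread = 0; thread < threads_per_block; ++thread) {
            // Same mapping as blockDim.x * blockIdx.x + threadIdx.x in the kernel.
            const int i = threads_per_block * block + thread;
            if (i < num_elements) {
                C[static_cast<std::size_t>(i)] = A[static_cast<std::size_t>(i)] + B[static_cast<std::size_t>(i)];
            } else {
                ++skipped;  // Without the guard this would be an out-of-bounds write.
            }
        }
    }
    return skipped;
}
```

With 1000 elements and 256 threads per block, four blocks are launched and 24 threads do no work; on a GPU the corresponding launch would be vectorAdd<<<blocks, threadsPerBlock>>>(d_A, d_B, d_C, numElements).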
CUDALibrarySamples focuses on demonstrating the usage of specific CUDA libraries, while cuda-samples provides a broader range of CUDA programming examples and techniques.
This repo contains the DirectX Graphics samples that demonstrate how to build graphics intensive applications on Windows.
Pros of DirectX-Graphics-Samples
- More comprehensive documentation and tutorials for beginners
- Broader range of graphics techniques and effects demonstrated
- Better integration with Windows ecosystem and tools
Cons of DirectX-Graphics-Samples
- Limited to DirectX and Windows platforms
- Potentially steeper learning curve for those new to graphics programming
- Less focus on high-performance computing compared to CUDA samples
Code Comparison
DirectX-Graphics-Samples (D3D12HelloTriangle.cpp):
ComPtr<ID3D12Resource> m_vertexBuffer;
D3D12_VERTEX_BUFFER_VIEW m_vertexBufferView;
const UINT vertexBufferSize = sizeof(triangleVertices);
// Named locals avoid taking the address of a temporary, which conforming compilers reject.
const CD3DX12_HEAP_PROPERTIES heapProps(D3D12_HEAP_TYPE_UPLOAD);
const CD3DX12_RESOURCE_DESC bufferDesc = CD3DX12_RESOURCE_DESC::Buffer(vertexBufferSize);
ThrowIfFailed(m_device->CreateCommittedResource(
    &heapProps,
    D3D12_HEAP_FLAG_NONE,
    &bufferDesc,
    D3D12_RESOURCE_STATE_GENERIC_READ,
    nullptr,
    IID_PPV_ARGS(&m_vertexBuffer)));
CUDA runtime host setup (in the style of vectorAdd.cu from cuda-samples):
float *h_A, *h_B, *h_C;
float *d_A, *d_B, *d_C;
size_t size = N * sizeof(float);
cudaMalloc((void **)&d_A, size);
cudaMalloc((void **)&d_B, size);
cudaMalloc((void **)&d_C, size);
cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);
cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);
README
CUDA Library Samples
The CUDA Library Samples repository contains various examples that demonstrate the use of GPU-accelerated libraries in CUDA. These libraries enable high-performance computing in a wide range of applications, including math operations, image processing, signal processing, linear algebra, and compression. The samples included cover:
- Math and Image Processing Libraries
- cuBLAS (Basic Linear Algebra Subprograms)
- cuTENSOR (Tensor Linear Algebra)
- cuSPARSE (Sparse Matrix Operations)
- cuSOLVER (Dense and Sparse Solvers)
- cuFFT (Fast Fourier Transform)
- cuRAND (Random Number Generation)
- NPP (Image and Video Processing)
- nvJPEG (JPEG Encode/Decode)
- nvCOMP (Data Compression)
- and more...
About
The CUDA Library Samples are provided by NVIDIA Corporation as Open Source software, released under the 3-clause "New" BSD license. These examples showcase how to leverage GPU-accelerated libraries for efficient computation across various fields.
For more information on the available libraries and their uses, visit GPU Accelerated Libraries.
Library Examples
Explore the examples of each CUDA library included in this repository:
- cuBLAS - GPU-accelerated basic linear algebra (BLAS) library
- cuBLASLt - Lightweight BLAS library
- cuBLASMp - Multi-process BLAS library
- cuBLASDx - Device-side BLAS extensions
- cuDSS - GPU-accelerated linear solvers
- cuFFT - Fast Fourier Transforms
- cuFFTMp - Multi-process FFT
- cuFFTDx - Device-side FFT extensions
- cuRAND - Random number generation
- cuSOLVER - Dense and sparse direct solvers
- cuSOLVERMp - Multi-process solvers
- cuSPARSE - BLAS for sparse matrices
- cuSPARSELt - Lightweight BLAS for sparse matrices
- cuTENSOR - Tensor linear algebra library
- cuTENSORMg - Multi-GPU tensor linear algebra
- NPP - GPU-accelerated image, video, and signal processing functions
- NPP+ - C++ extensions for NPP
- nvJPEG - High-performance JPEG encode/decode
- nvJPEG2000 - JPEG2000 encoding/decoding
- nvTIFF - TIFF encoding/decoding
- nvCOMP - Data compression and decompression
Each sample provides a practical use case for how to apply these libraries in real-world scenarios, showcasing the power and flexibility of CUDA for a wide variety of computational needs.
Additional Resources
For more information and documentation on CUDA libraries, please visit:
License
The CUDA Library Samples are distributed under the 3-clause "New" BSD license. For more details, refer to the license terms below:
Copyright
Copyright (c) 2022-2024 NVIDIA CORPORATION AND AFFILIATES. All rights reserved.
Redistribution and use in source and binary forms, with or without modification, are permitted
provided that the following conditions are met:
* Redistributions of source code must retain the above copyright notice, this list of
conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of
conditions and the following disclaimer in the documentation and/or other materials
provided with the distribution.
* Neither the name of the NVIDIA CORPORATION nor the names of its contributors may be used
to endorse or promote products derived from this software without specific prior written
permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL NVIDIA CORPORATION BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;
OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.