Samples for CUDA developers that demonstrate features in the CUDA Toolkit

warpAggregatedAtomicsCG - Warp Aggregated Atomics using Cooperative Groups

Description

This sample demonstrates how to use Cooperative Groups (CG) to perform warp-aggregated atomics on single and multiple counters, a useful technique for improving performance when many threads atomically add to one or more counters.
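
The single-counter case can be sketched as follows. This is a minimal illustration of the technique, not the code shipped in warpAggregatedAtomicsCG.cu: the names atomicAggInc, filterPositive, and countPositive are chosen here for illustration, error checking is omitted for brevity, and the multiple-counter case is not shown. Threads that are active together elect a leader, the leader issues one atomicAdd for the whole group, and each thread derives its own slot from the broadcast base value.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>

#include <cstdio>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical helper: reserves one slot per calling thread while issuing
// a single atomicAdd per group of currently active (coalesced) threads.
__device__ int atomicAggInc(int *counter)
{
    cg::coalesced_group active = cg::coalesced_threads();
    int base = 0;
    if (active.thread_rank() == 0) {
        // The leader reserves slots for every thread in the group.
        base = atomicAdd(counter, static_cast<int>(active.size()));
    }
    // Broadcast the leader's base value, then offset by this thread's rank.
    return active.shfl(base, 0) + active.thread_rank();
}

// Copies the positive elements of src into dst (in no particular order)
// and counts them in *count.
__global__ void filterPositive(int *dst, int *count, const int *src, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n && src[i] > 0) {
        dst[atomicAggInc(count)] = src[i];
    }
}

// Host wrapper (illustrative): returns how many elements passed the filter.
int countPositive(const int *h_src, int n)
{
    int *d_src = nullptr, *d_dst = nullptr, *d_count = nullptr;
    cudaMalloc(&d_src, n * sizeof(int));
    cudaMalloc(&d_dst, n * sizeof(int));
    cudaMalloc(&d_count, sizeof(int));
    cudaMemcpy(d_src, h_src, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_count, 0, sizeof(int));

    const int threads = 256;
    const int blocks  = (n + threads - 1) / threads;
    filterPositive<<<blocks, threads>>>(d_dst, d_count, d_src, n);

    int h_count = 0;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_count, d_count, sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(d_src);
    cudaFree(d_dst);
    cudaFree(d_count);
    return h_count;
}

int main()
{
    const int n = 1 << 20;
    std::vector<int> h_src(n);
    for (int i = 0; i < n; ++i) {
        h_src[i] = (i % 3 == 0) ? 1 : -1; // every third element is positive
    }
    const int expected = (n + 2) / 3;
    const int got      = countPositive(h_src.data(), n);
    std::printf("%s: counted %d, expected %d\n", got == expected ? "PASSED" : "FAILED", got, expected);
    return got == expected ? 0 : 1;
}
```

Because only the group leader touches the counter, contention on it drops from one atomic per passing thread to one atomic per group of passing threads, which is where the performance benefit described above comes from.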

Key Concepts

Cooperative Groups, Atomic Intrinsics

Supported SM Architectures

Supported OSes

Linux, Windows

Supported CPU Architecture

x86_64, armv7l, aarch64

CUDA APIs involved

CUDA Runtime API

cudaMemcpy, cudaFree, cudaDeviceGetAttribute, cudaMemset, cudaMalloc

Prerequisites

Download and install the CUDA Toolkit for your platform.

References (for more details)