hipCUB Device Sum Example
Description
This simple program showcases the usage of hipcub::DeviceReduce::Sum().
Application flow
- Host-side data is instantiated in a std::vector<int>.
- Device-side storage is allocated using hipMalloc.
- Data is copied from host to device using hipMemcpy.
- hipCUB needs some temporary memory on the device, which the caller must allocate. A first call to hipcub::DeviceReduce::Sum is made with d_temp_storage set to null; in that case no reduction is performed and the size in bytes of the temporary storage needed is stored in temp_storage_bytes.
- temp_storage_bytes is used to allocate device memory in d_temp_storage using hipMalloc.
- Finally, a second call to hipcub::DeviceReduce::Sum is made that computes the sum.
- The result is transferred from device to host.
- All device-side memory is freed using hipFree.
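The two-call pattern in the steps above (query the size with a null pointer, allocate, then call again to do the work) can be sketched on the host alone. This is only an illustration of the calling convention, not hipCUB code: host_reduce_sum and two_phase_sum are hypothetical names, the function runs on the CPU, and the one-int scratch requirement is an assumption made for the sketch. std::malloc and std::free stand in for hipMalloc and hipFree.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <vector>

// Hypothetical host-only stand-in that mimics the two-phase calling
// convention described above. It is NOT the hipCUB implementation.
inline void host_reduce_sum(void*        temp_storage,
                            std::size_t& temp_storage_bytes,
                            const int*   in,
                            int*         out,
                            int          num_items)
{
    // Assumed scratch requirement for this sketch: a single int accumulator.
    const std::size_t required = sizeof(int);

    if(temp_storage == nullptr)
    {
        // Phase 1: size query only. No reduction is performed.
        temp_storage_bytes = required;
        return;
    }

    // Phase 2: the caller must have supplied enough scratch space.
    assert(temp_storage_bytes >= required);
    int* accumulator = static_cast<int*>(temp_storage);
    *accumulator     = 0;
    for(int i = 0; i < num_items; ++i)
    {
        *accumulator += in[i];
    }
    *out = *accumulator;
}

// Drives both phases in the same order as the application flow.
inline int two_phase_sum(const std::vector<int>& input)
{
    void*       temp_storage       = nullptr;
    std::size_t temp_storage_bytes = 0;
    int         result             = 0;
    const int   num_items          = static_cast<int>(input.size());

    // First call: temp_storage is null, so only the size is reported.
    host_reduce_sum(temp_storage, temp_storage_bytes, input.data(), &result, num_items);

    // Allocate the reported number of bytes (hipMalloc in the real example).
    temp_storage = std::malloc(temp_storage_bytes);
    if(temp_storage == nullptr)
    {
        throw std::bad_alloc();
    }

    // Second call: performs the reduction.
    host_reduce_sum(temp_storage, temp_storage_bytes, input.data(), &result, num_items);

    // Release the scratch space (hipFree in the real example).
    std::free(temp_storage);
    return result;
}
```

For instance, two_phase_sum({1, 2, 3, 4}) returns 10. Querying the size first lets the caller own the allocation, so the same buffer can be reused across several reductions instead of being allocated inside every call.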
Key APIs and Concepts
- This example uses the device-level API provided by hipCUB. It performs global, device-wide operations on the GPU (in this case a sum reduction using hipcub::DeviceReduce::Sum).
Demonstrated API Calls
hipCUB
- hipcub::DeviceReduce::Sum
HIP runtime
- hipGetErrorString
- hipMalloc
- hipMemcpy
- hipFree