This repository was archived by the owner on Mar 21, 2024. It is now read-only.

thrust::reduce_by_key error for 2^31 elements #1609

Closed
@seunghwak

Description

Tested environment
GA102 (48 GB GPU memory)
CUDA 11.4
Thrust 1.15.0

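The code below fails inside the 'reduce_by_key' call with a memory allocation failure: it tries to allocate 18446744073694639872 bytes (about 16 EiB), which cannot succeed because my system has only 48 GB of GPU memory. The requested size is just below 2^64, i.e. a small negative number reinterpreted as an unsigned size, so this looks like a bug caused by 32-bit signed integer overflow when the input has 2^31 elements.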

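// Headers needed by the snippet; the __device__ lambda requires nvcc --extended-lambda.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/iterator/constant_iterator.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>
#include <thrust/tabulate.h>

cudaDeviceSynchronize();
std::cout << "test thrust reduce_by_key START" << std::endl;
constexpr size_t num_elements = std::size_t{1} << 31;  // 2^31, one past INT32_MAX
constexpr int32_t max_key = 8;
thrust::device_vector<int32_t> int_values(num_elements);
// Fill with keys 0..max_key-1, then sort so that equal keys are contiguous.
thrust::tabulate(thrust::device, int_values.begin(), int_values.end(), [max_key]__device__(auto i) {
  return static_cast<int32_t>(i % max_key);
});
thrust::sort(thrust::device, int_values.begin(), int_values.end());
thrust::device_vector<int32_t> keys(max_key);
thrust::device_vector<size_t> values(max_key);
// Count the occurrences of each key; the allocation failure is raised inside this call.
thrust::reduce_by_key(thrust::device, int_values.begin(), int_values.end(), thrust::make_constant_iterator(size_t{1}), keys.data(), values.data());
cudaDeviceSynchronize();
std::cout << "test thrust reduce_by_key END" << std::endl;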
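To make the overflow suspicion concrete, here is a minimal host-only sketch of the arithmetic. It is not Thrust's internal code, and the helper names (narrow_to_int32, as_unsigned_bytes, as_signed_bytes) are hypothetical, introduced only for illustration. It shows that 2^31 does not fit in a 32-bit signed integer, and that the reported allocation size equals -14911744 when read as a signed 64-bit value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: narrow a 64-bit element count to a 32-bit signed
// integer, as a 32-bit offset type would. The result is modular since C++20
// and implementation-defined (wrapping on mainstream compilers) before that.
constexpr std::int32_t narrow_to_int32(std::uint64_t n) {
  return static_cast<std::int32_t>(n);
}

// Hypothetical helper: what an allocator taking an unsigned 64-bit size
// receives when it is handed a negative signed byte count.
constexpr std::uint64_t as_unsigned_bytes(std::int64_t signed_bytes) {
  return static_cast<std::uint64_t>(signed_bytes);
}

// Hypothetical helper: read an unsigned 64-bit byte count back as signed.
constexpr std::int64_t as_signed_bytes(std::uint64_t bytes) {
  return static_cast<std::int64_t>(bytes);
}

// Checks the two observations; assumes a wrapping narrowing conversion.
inline void check_reported_size() {
  // 2^31 elements is one past INT32_MAX, so the count wraps to INT32_MIN.
  assert(narrow_to_int32(std::uint64_t{1} << 31) == INT32_MIN);
  // The size from the error message is a small negative number in disguise.
  assert(as_signed_bytes(18446744073694639872ULL) == -14911744);
}
```

Calling check_reported_size() passes on compilers where the narrowing conversion wraps. Which internal quantity actually overflows inside reduce_by_key is not established by this sketch; it only shows that the reported size is consistent with a negative value being passed to the allocator.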
