Commit 74e9bc8

Develop stream 2024-11-07 (#486)

* Workaround compiler issue for inclusive/exclusive scan with FixedVector items
* Added missing explicit qualification of pow
* Ensure that {cr}begin works with types that pull in namespace std via ADL
* Various and sundry fixes for Thrust's CPP backends.
* Expose thrust's contiguous iterator unwrap helpers
* Several improvements to zip_iterator/zip_function
* Fix thrust::optional<T&>::emplace()

  Where optional<T> inherits optional<T>::construct via a series of classes, optional<T&> does not. This means that optional<T&>::emplace() was broken and called into a member function that did not exist. This replaces the functionality to make optional<T&>::emplace() change the stored reference to the new one. Note that it does _not_ emplace the referee, as this would lead to questionable behavior when the optional holds nullopt.

  This was revealed by a change in LLVM, see llvm/llvm-project#90152 and ROCm/rocThrust#404.

* Fix issues that came up with building cuDF with main
* Bump version to 2.5.0
* Add test for CUDA backend to make sure that thrust::counting_iterator is supported from cuda::std
* Remove redundant thrust dialect conditional
* Move visibility attributes to cccl
* Fix logic for including libcudacxx headers
* add find_first_of to HIPSTDPAR
* fix memory free on error
* fix review comments
* fix rocprim::find_first_of usage
* fix copy_if to work with large data types
* fix tests
* fix copyright date
* add another operator== for large_data add large_data only to copy tests
* add casts to make code clearer
* Added search and find_end to hipstdpar
* Fix compile warning
* Add missing CHANGELOG entry for CCCL 2.5.0 update
* Add rocPRIM's adjacent_find to hipstdpar
* Consistently use thrust::equal_to as default operator
* add change log
* Add search_n support
* fix windows build failed
* Add overload under std namespace
* add typename before
* add typename before
* Work in progress
* Work in progress
* Apply 1 suggestion(s) to 1 file(s)

  Co-authored-by: Milo Lurati <[email protected]>

* fix function name mistakes
* Set c++ version to 17 and create warning
* Fix no_discard warning c++17
* Build for both c++ 14 and 17
* Fix tuple test for c++17
* Add new documentation for tuple.h

---------

Co-authored-by: Beatriz Navidad Vilches <[email protected]>
Co-authored-by: Robin Voetter <[email protected]>
Co-authored-by: Bence Parajdi <[email protected]>
Co-authored-by: Cenxuan Tian <[email protected]>
Co-authored-by: Milo Lurati <[email protected]>

[ROCm/rocThrust commit: d8ee6ee]
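The `optional<T&>::emplace()` fix described in the commit message is easier to follow with a toy type. The sketch below uses a hypothetical `optional_ref` class (an assumption for illustration, not Thrust's implementation) that stores a pointer. Its `emplace()` rebinds the pointer and never assigns through it, which is why the operation stays well defined when the optional is empty.

```cpp
#include <cassert>

// Hypothetical stand-in for an optional that holds a reference.
// This is NOT thrust::optional<T&>; it only illustrates rebinding semantics.
template <typename T>
class optional_ref
{
public:
  optional_ref() = default;                       // empty, like nullopt
  explicit optional_ref(T& ref) : ptr_(&ref) {}   // bound to ref

  bool has_value() const { return ptr_ != nullptr; }

  T& value() const
  {
    assert(ptr_ != nullptr);
    return *ptr_;
  }

  // Rebind to a new referee. The previous referee is left untouched,
  // so nothing is dereferenced when the optional is currently empty.
  T& emplace(T& ref)
  {
    ptr_ = &ref;
    return *ptr_;
  }

private:
  T* ptr_ = nullptr;
};

// Returns true when emplace() rebinds without modifying the old referee.
inline bool emplace_rebinds()
{
  int a = 1;
  int b = 2;
  optional_ref<int> opt(a);
  opt.emplace(b);
  return a == 1 && &opt.value() == &b;
}
```

Calling `emplace_rebinds()` checks both halves of the behaviour: the old referee keeps its value, and the optional now refers to the new object.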
1 parent b3ac8ec commit 74e9bc8

150 files changed, +2206 -1645 lines changed

projects/rocthrust/.gitlab-ci.yml

Lines changed: 7 additions & 4 deletions
@@ -122,6 +122,7 @@ copyright-date:
  -D CMAKE_C_COMPILER_LAUNCHER=phc_sccache_c
  -D CMAKE_CXX_COMPILER_LAUNCHER=phc_sccache_cxx
  -D CMAKE_CUDA_COMPILER_LAUNCHER=phc_sccache_cuda
+ -D CMAKE_CXX_STANDARD="$BUILD_VERSION"
  -S $CI_PROJECT_DIR
  -B $CI_PROJECT_DIR/build
  - cmake --build $CI_PROJECT_DIR/build
@@ -147,6 +148,7 @@ build:cmake-latest:
  matrix:
  - BUILD_TYPE: Release
  BUILD_TARGET: [BENCHMARKS, TEST, EXAMPLES]
+ BUILD_VERSION: [14, 17]

  build:cmake-minimum:
  stage: build
@@ -157,6 +159,7 @@ build:cmake-minimum:
  matrix:
  - BUILD_TYPE: Release
  BUILD_TARGET: [BENCHMARKS, TEST, EXAMPLES]
+ BUILD_VERSION: 14

  build:package:
  stage: build
@@ -221,6 +224,7 @@ build:windows:
  -D CMAKE_CXX_FLAGS=-Wno-deprecated-declarations
  -D CMAKE_CXX_COMPILER:FILEPATH="${env:HIP_PATH}/bin/clang++.exe"
  -D CMAKE_INSTALL_PREFIX:PATH="$CI_PROJECT_DIR/build/install"
+ -D CMAKE_CXX_STANDARD=14
  -D CMAKE_PREFIX_PATH:PATH="$ROCPRIM_DIR/build/install;${env:HIP_PATH}" *>&1
  - \& cmake --build "$CI_PROJECT_DIR/build" *>&1
  artifacts:
@@ -277,6 +281,7 @@ test:
  matrix:
  - BUILD_TYPE: Release
  BUILD_TARGET: TEST
+ BUILD_VERSION: 14
  script:
  - cd $CI_PROJECT_DIR/build
  - cmake
@@ -353,7 +358,7 @@ build:cuda-and-omp:
  tags:
  - build
  variables:
- CCCL_GIT_BRANCH: v2.4.0
+ CCCL_GIT_BRANCH: v2.5.0
  CCCL_DIR: ${CI_PROJECT_DIR}/cccl
  needs: []
  script:
@@ -362,13 +367,11 @@ build:cuda-and-omp:
  - rm -R $CCCL_DIR/thrust/thrust
  - cp -r $CI_PROJECT_DIR/thrust $CCCL_DIR/thrust
  # Build tests and examples from CCCL Thrust
- # CCCL 2.4.0 breaks compilation of tests. Compile examples only until we
- # match v2.5.0.
  - cmake
  -G Ninja
  -D CMAKE_BUILD_TYPE=Release
  -D CMAKE_CUDA_ARCHITECTURES="$GPU_TARGETS"
- -D THRUST_ENABLE_TESTING=OFF
+ -D THRUST_ENABLE_TESTING=ON
  -D THRUST_ENABLE_EXAMPLES=ON
  -D THRUST_ENABLE_BENCHMARKS=OFF
  -D THRUST_ENABLE_MULTICONFIG=ON

projects/rocthrust/CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -3,6 +3,12 @@
  Documentation for rocThrust available at
  [https://rocm.docs.amd.com/projects/rocThrust/en/latest/](https://rocm.docs.amd.com/projects/rocThrust/en/latest/).

+ ## (Unreleased) rocThrust 3.x.x for ROCm 6.x
+
+ ### Changes
+
+ * Changed the C++ version from 14 to 17. C++14 will be deprecated in the next major release.
+
  ## (Unreleased) rocThrust 3.3.0 for ROCm 6.4

  ### Added
@@ -11,10 +17,16 @@ Documentation for rocThrust available at
  * Added smoke test options, which runs a subset of the unit tests and ensures that less than 2gb of VRAM will be used. Use `python rtest.py [--emulation|-e|--test|-t]=smoke` to run these tests.
  * Added `--emulation` option for `rtest.py`
  * Merged changes from upstream CCCL/thrust 2.4.0
+ * Merged changes from upstream CCCL/thrust 2.5.0
+ * Added `find_first_of` to HIPSTDPAR
+ * Added `search` and `find_end` to HIPSTDPAR
+ * Added `search_n` to HIPSTDPAR
+ * Updated HIPSTDPAR's `adjacent_find` to use rocPRIM's implementation

  ### Changed
  * `--test|-t` is no longer a required flag for `rtest.py`. Instead, the user can use either `--emulation|-e` or `--test|-t`, but not both.
  * Split the contents of HIPSTDPAR's forwarding header into several implementation headers.
+ * Fixed `copy_if` to work with large data types (512 bytes)

  ## (Unreleased) rocThrust 3.2.0 for ROCm 6.3
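For readers unfamiliar with the algorithms named in the new CHANGELOG entries, here is a minimal sketch of what each one returns. It uses the sequential standard-library overloads so that it builds anywhere. The assumption, not confirmed by this diff, is that HIPSTDPAR forwards the overloads taking a parallel execution policy to the GPU while keeping the same result semantics. The `search_indices` struct and `run_searches` function are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical result holder: the index each algorithm reports.
struct search_indices
{
  std::ptrdiff_t first_of;
  std::ptrdiff_t search;
  std::ptrdiff_t find_end;
  std::ptrdiff_t search_n;
  std::ptrdiff_t adjacent_find;
};

inline search_indices run_searches()
{
  const std::vector<int> haystack{1, 2, 3, 3, 3, 2, 3, 4};
  const std::vector<int> needle{2, 3};
  const std::vector<int> candidates{4, 3};
  assert(needle.size() <= haystack.size());

  const auto first = haystack.begin();
  const auto last  = haystack.end();

  search_indices r{};
  // First element equal to ANY element of candidates.
  r.first_of = std::distance(
    first, std::find_first_of(first, last, candidates.begin(), candidates.end()));
  // First occurrence of the whole needle subsequence.
  r.search = std::distance(first, std::search(first, last, needle.begin(), needle.end()));
  // Last occurrence of the whole needle subsequence.
  r.find_end = std::distance(first, std::find_end(first, last, needle.begin(), needle.end()));
  // First run of three consecutive elements equal to 3.
  r.search_n = std::distance(first, std::search_n(first, last, 3, 3));
  // First position whose element equals its successor.
  r.adjacent_find = std::distance(first, std::adjacent_find(first, last));
  return r;
}
```

For the haystack `{1, 2, 3, 3, 3, 2, 3, 4}`, `search` and `find_end` differ (indices 1 and 5) because the needle `{2, 3}` occurs twice, while `find_first_of`, `search_n` and `adjacent_find` all land on index 2.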

projects/rocthrust/CMakeLists.txt

Lines changed: 9 additions & 1 deletion
@@ -107,7 +107,9 @@ if (NOT THRUST_HOST_SYSTEM IN_LIST THRUST_HOST_SYSTEM_OPTIONS)
  endif ()

  # Set CXX flags
- set(CMAKE_CXX_STANDARD 14)
+ if (NOT DEFINED CMAKE_CXX_STANDARD)
+ set(CMAKE_CXX_STANDARD 17)
+ endif()
  set(CMAKE_CXX_STANDARD_REQUIRED ON)
  set(CMAKE_CXX_EXTENSIONS OFF)

@@ -117,6 +119,12 @@ else()
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Wextra -Werror")
  endif()

+ if (CMAKE_CXX_STANDARD EQUAL 14)
+ message(WARNING "C++14 will be deprecated in the next major release")
+ elseif(NOT CMAKE_CXX_STANDARD EQUAL 17)
+ message(FATAL_ERROR "Only C++14 and C++17 are supported")
+ endif()
+
  if (WIN32)
  add_compile_options(-xhip)
  add_compile_definitions(THRUST_IGNORE_DEPRECATED_CPP_DIALECT)
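The CMake logic above accepts only C++14 (with a warning) and C++17. As a rough sketch of the same rule from the C++ side, the helpers below restate the check and map the `__cplusplus` macro to a dialect year, which can help confirm what `-D CMAKE_CXX_STANDARD=...` actually produced. All names here are hypothetical and not part of rocThrust.

```cpp
// Hypothetical helpers for illustration; they are not part of rocThrust.

// Maps a __cplusplus value to the year of the corresponding standard.
constexpr int cpp_dialect_year(long cplusplus)
{
  return cplusplus >= 202002L ? 2020
       : cplusplus >= 201703L ? 2017
       : cplusplus >= 201402L ? 2014
       : cplusplus >= 201103L ? 2011
                              : 2003;
}

// Mirrors the CMake check: 14 and 17 pass, anything else is a fatal error.
constexpr bool is_supported_standard(int cmake_cxx_standard)
{
  return cmake_cxx_standard == 14 || cmake_cxx_standard == 17;
}

// C++14 is still accepted, but it triggers the deprecation warning.
constexpr bool is_deprecated_standard(int cmake_cxx_standard)
{
  return cmake_cxx_standard == 14;
}

// Dialect of this translation unit. Note that MSVC reports 199711L for
// __cplusplus unless /Zc:__cplusplus is passed.
constexpr int current_dialect_year = cpp_dialect_year(__cplusplus);
```

Because the `set(CMAKE_CXX_STANDARD 17)` default is now guarded by `NOT DEFINED`, a value passed on the command line (as the CI matrix does with `BUILD_VERSION`) takes precedence over it.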

projects/rocthrust/benchmarks/bench_utils/generation_utils.hpp

Lines changed: 7 additions & 1 deletion
@@ -371,7 +371,13 @@ namespace detail
  rocrand_set_seed(gen, seed.get());
  rocrand_generate_uniform_double(gen, d_distribution, num_items);

- hipDeviceSynchronize();
+ hipError_t error = hipDeviceSynchronize();
+ if(error != hipSuccess)
+ {
+ std::cout << "HIP error: " << hipGetErrorString(error) << " file: " << __FILE__
+ << " line: " << __LINE__ << std::endl;
+ exit(error);
+ }

  return d_distribution;
  }

projects/rocthrust/examples/arbitrary_transformation.cu

Lines changed: 5 additions & 3 deletions
@@ -5,7 +5,7 @@

  #include <thrust/detail/config.h>

- #if THRUST_CPP_DIALECT >= 2011 && !defined(THRUST_LEGACY_GCC)
+ #if !defined(THRUST_LEGACY_GCC)
  #include <thrust/zip_function.h>
  #endif // >= C++11

@@ -54,7 +54,8 @@ struct arbitrary_functor1
  }
  };

- #if THRUST_CPP_DIALECT >= 2011 && !defined(THRUST_LEGACY_GCC)
+
+ #if !defined(THRUST_LEGACY_GCC)
  struct arbitrary_functor2
  {
  __host__ __device__
@@ -91,8 +92,9 @@ int main(void)
  for(int i = 0; i < 5; i++)
  std::cout << A[i] << " + " << B[i] << " * " << C[i] << " = " << D1[i] << std::endl;

+
  // apply the transformation using zip_function
- #if THRUST_CPP_DIALECT >= 2011 && !defined(THRUST_LEGACY_GCC)
+ #if !defined(THRUST_LEGACY_GCC)
  thrust::device_vector<float> D2(5);
  thrust::for_each(thrust::make_zip_iterator(thrust::make_tuple(A.begin(), B.begin(), C.begin(), D2.begin())),
  thrust::make_zip_iterator(thrust::make_tuple(A.end(), B.end(), C.end(), D2.end())),

projects/rocthrust/examples/cuda/async_reduce.cu

Lines changed: 0 additions & 5 deletions
@@ -4,9 +4,7 @@
  #include <thrust/system/cuda/execution_policy.h>
  #include <cassert>

- #if THRUST_CPP_DIALECT >= 2011
  #include <future>
- #endif

  // This example demonstrates two ways to achieve algorithm invocations that are asynchronous with
  // the calling thread.
@@ -53,9 +51,7 @@ int main()
  // reset the result
  result[0] = 0;

- #if THRUST_CPP_DIALECT >= 2011
  // method 2: use std::async to create asynchrony
-
  // copy all the algorithm parameters
  auto begin = data.begin();
  auto end = data.end();
@@ -71,7 +67,6 @@ int main()

  // wait on the result and check that it is correct
  assert(future_result.get() == n);
- #endif

  return 0;
  }
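The "method 2" pattern in this example (copy the algorithm parameters, launch through `std::async`, then wait on the future) can be sketched without a GPU. In the version below `std::accumulate` stands in for `thrust::reduce`, and `async_sum` is a hypothetical name, so treat it as an illustration of the pattern and not as the example's actual code.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <vector>

// Sums `data` on another thread. std::accumulate is a stand-in for
// thrust::reduce so that the sketch builds without Thrust or a device.
inline int async_sum(const std::vector<int>& data)
{
  // Copy all the algorithm parameters so the lambda captures them by value.
  auto begin = data.begin();
  auto end   = data.end();
  int  init  = 0;

  // std::launch::async requests execution on a separate thread.
  std::future<int> future_result =
    std::async(std::launch::async, [=] { return std::accumulate(begin, end, init); });

  // Wait on the result. `data` outlives the task because get() blocks here.
  assert(future_result.valid());
  return future_result.get();
}
```

With a vector of `n` ones, `async_sum` returns `n`, which matches the `assert(future_result.get() == n)` check in the example. On some toolchains the program must be linked with thread support (for example `-pthread`).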

projects/rocthrust/examples/cuda/global_device_vector.cu

Lines changed: 0 additions & 10 deletions
@@ -21,7 +21,6 @@ typedef thrust::system::cuda::detail::cuda_memory_resource<
  thrust::cuda::pointer<void>
  > device_ignore_shutdown_memory_resource;

- #if THRUST_CPP_DIALECT >= 2011
  template <typename T>
  using device_ignore_shutdown_allocator =
  thrust::mr::stateless_resource_allocator<
@@ -30,15 +29,6 @@ typedef thrust::system::cuda::detail::cuda_memory_resource<
  >;

  thrust::device_vector<double, device_ignore_shutdown_allocator<double>> d;
- #else
- thrust::device_vector<
- double,
- thrust::mr::stateless_resource_allocator<
- double,
- thrust::device_ptr_memory_resource<device_ignore_shutdown_memory_resource>
- >
- > d;
- #endif

  int main() {
  d.resize(25);

projects/rocthrust/examples/padded_grid_reduction.cu

Lines changed: 5 additions & 4 deletions
@@ -1,16 +1,17 @@
- #include <thrust/transform_reduce.h>
- #include <thrust/functional.h>
  #include <thrust/device_vector.h>
+ #include <thrust/extrema.h>
+ #include <thrust/functional.h>
  #include <thrust/host_vector.h>
  #include <thrust/iterator/constant_iterator.h>
  #include <thrust/iterator/zip_iterator.h>
  #include <thrust/random.h>
- #include <thrust/extrema.h>
+ #include <thrust/transform_reduce.h>
+
  #include <cmath>
  #include <iomanip>
- #include <float.h>

  #include "include/host_device.h"
+ #include <float.h>

  // This example computes the minimum and maximum values
  // over a padded grid. The padded values are not considered

projects/rocthrust/examples/remove_points2d.cu

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
  #include <thrust/host_vector.h>
- #include <thrust/remove.h>
  #include <thrust/random.h>
+ #include <thrust/remove.h>

  #include "include/host_device.h"

projects/rocthrust/examples/repeated_range.cu

Lines changed: 6 additions & 5 deletions
@@ -1,10 +1,11 @@
+ #include <thrust/copy.h>
+ #include <thrust/device_vector.h>
+ #include <thrust/fill.h>
+ #include <thrust/functional.h>
  #include <thrust/iterator/counting_iterator.h>
- #include <thrust/iterator/transform_iterator.h>
  #include <thrust/iterator/permutation_iterator.h>
- #include <thrust/functional.h>
- #include <thrust/fill.h>
- #include <thrust/device_vector.h>
- #include <thrust/copy.h>
+ #include <thrust/iterator/transform_iterator.h>
+
  #include <iostream>

  #include "include/host_device.h"

projects/rocthrust/examples/saxpy.cu

Lines changed: 4 additions & 3 deletions
@@ -1,10 +1,11 @@
- #include <thrust/transform.h>
  #include <thrust/device_vector.h>
- #include <thrust/host_vector.h>
  #include <thrust/functional.h>
+ #include <thrust/host_vector.h>
+ #include <thrust/transform.h>
+
+ #include <algorithm>
  #include <iostream>
  #include <iterator>
- #include <algorithm>

  #include "include/host_device.h"

projects/rocthrust/examples/scan_by_key.cu

Lines changed: 2 additions & 1 deletion
@@ -1,6 +1,7 @@
- #include <thrust/device_vector.h>
  #include <thrust/copy.h>
+ #include <thrust/device_vector.h>
  #include <thrust/scan.h>
+
  #include <iostream>

  #include "include/host_device.h"

projects/rocthrust/examples/sorting_aos_vs_soa.cu

Lines changed: 3 additions & 3 deletions
@@ -1,11 +1,11 @@
- #include <thrust/host_vector.h>
  #include <thrust/device_vector.h>
- #include <thrust/sort.h>
+ #include <thrust/host_vector.h>
  #include <thrust/random.h>
- #include <assert.h>
+ #include <thrust/sort.h>

  #include "include/host_device.h"
  #include "include/timer.h"
+ #include <assert.h>

  // This examples compares sorting performance using Array of Structures (AoS)
  // and Structure of Arrays (SoA) data layout. Legacy applications will often

projects/rocthrust/examples/stream_compaction.cu

Lines changed: 3 additions & 2 deletions
@@ -1,8 +1,9 @@
- #include <thrust/device_vector.h>
- #include <thrust/sequence.h>
  #include <thrust/copy.h>
  #include <thrust/count.h>
+ #include <thrust/device_vector.h>
  #include <thrust/remove.h>
+ #include <thrust/sequence.h>
+
  #include <iostream>
  #include <iterator>
  #include <string>

projects/rocthrust/examples/strided_range.cu

Lines changed: 6 additions & 5 deletions
@@ -1,10 +1,11 @@
+ #include <thrust/copy.h>
+ #include <thrust/device_vector.h>
+ #include <thrust/fill.h>
+ #include <thrust/functional.h>
  #include <thrust/iterator/counting_iterator.h>
- #include <thrust/iterator/transform_iterator.h>
  #include <thrust/iterator/permutation_iterator.h>
- #include <thrust/functional.h>
- #include <thrust/fill.h>
- #include <thrust/device_vector.h>
- #include <thrust/copy.h>
+ #include <thrust/iterator/transform_iterator.h>
+
  #include <iostream>

  #include "include/host_device.h"

projects/rocthrust/examples/summary_statistics.cu

Lines changed: 3 additions & 2 deletions
@@ -1,8 +1,9 @@
  #include <thrust/device_vector.h>
+ #include <thrust/extrema.h>
+ #include <thrust/functional.h>
  #include <thrust/host_vector.h>
  #include <thrust/transform_reduce.h>
- #include <thrust/functional.h>
- #include <thrust/extrema.h>
+
  #include <cmath>
  #include <limits>
  #include <iostream>

projects/rocthrust/examples/tiled_range.cu

Lines changed: 6 additions & 5 deletions
@@ -1,10 +1,11 @@
+ #include <thrust/copy.h>
+ #include <thrust/device_vector.h>
+ #include <thrust/fill.h>
+ #include <thrust/functional.h>
  #include <thrust/iterator/counting_iterator.h>
- #include <thrust/iterator/transform_iterator.h>
  #include <thrust/iterator/permutation_iterator.h>
- #include <thrust/functional.h>
- #include <thrust/fill.h>
- #include <thrust/device_vector.h>
- #include <thrust/copy.h>
+ #include <thrust/iterator/transform_iterator.h>
+
  #include <iostream>

  #include "include/host_device.h"

projects/rocthrust/examples/transform_iterator.cu

Lines changed: 4 additions & 3 deletions
@@ -1,8 +1,9 @@
- #include <thrust/iterator/transform_iterator.h>
- #include <thrust/iterator/counting_iterator.h>
  #include <thrust/device_vector.h>
- #include <thrust/reduce.h>
  #include <thrust/functional.h>
+ #include <thrust/iterator/counting_iterator.h>
+ #include <thrust/iterator/transform_iterator.h>
+ #include <thrust/reduce.h>
+
  #include <iostream>
  #include <iterator>
  #include <string>

projects/rocthrust/examples/uninitialized_vector.cu

Lines changed: 1 addition & 3 deletions
@@ -31,9 +31,7 @@ template<typename T>
  __host__
  ~uninitialized_allocator() {}

- #if THRUST_CPP_DIALECT >= 2011
- uninitialized_allocator & operator=(const uninitialized_allocator &) = default;
- #endif
+ uninitialized_allocator & operator=(const uninitialized_allocator &) = default;

  // for correctness, you should also redefine rebind when you inherit
  // from an allocator type; this way, if the allocator is rebound somewhere,

projects/rocthrust/examples/word_count.cu

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
  #include <thrust/device_vector.h>
- #include <thrust/reduce.h>
  #include <thrust/functional.h>
  #include <thrust/inner_product.h>
+ #include <thrust/reduce.h>

  #include <iostream>
