Building an OpenMP offloading hello world fails to link when optimization is turned on:
#include <omp.h>
#include <stdio.h>

int main(int argc, char** argv){
  printf("Devices: %i\n", omp_get_num_devices());
  int a[10]={0};
  for (int i=0; i<= omp_get_num_devices(); i++)
#pragma omp target device(i) map(tofrom: a[:10])
  {
    printf("Hello from device %i, is_initial_device=%i\n", i, omp_is_initial_device());
    a[i]++;
  }
  printf("%i, %i, %i, %i\n", a[0], a[1], a[2], a[3]);
  return 0;
}
I build it like this:
clang -fopenmp -fopenmp-targets=nvptx64 omp_hello_device.c -O3
The link step fails with:
nvlink error : Size doesn't match for '__omp_rtl_device_environment' in '/tmp/omp_hello_device-1f1427-nvptx64-nvidia-cuda-sm_70-1a1851.cubin', first specified in '/tmp/a-d3fd98.cubin'
nvlink fatal : merge_elf failed
clang: error: fatbinary command failed with exit code 1 (use -v to see invocation)
clang-linker-wrapper: error: 'clang' failed
clang: error: linker command failed with exit code 1 (use -v to see invocation)
I tried a fairly recent build from main (f7c2e5f) and a release 17 build (6009708); both show this issue. I also tried several of our systems with different GPUs and different CUDA versions (11.6, 11.8, 12.1.1). The failure reproduces consistently with clang 17 and newer.
A clang 16.0.6 build compiles and links the program successfully at every optimization level.
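To narrow down which optimization levels trigger the link failure with a given compiler, a small loop like the one below could be used. This is only a sketch: the function name `try_opt_levels`, the output names, and the log file names are made up for illustration, and the `CC`/`SRC` variables are assumed conventions. The clang flags are the ones from the build command above.

```shell
#!/bin/sh
# Hypothetical helper: build the reproducer at each optimization level
# with the given compiler and report which levels link successfully.
#   $1 = compiler executable (e.g. clang), $2 = source file
try_opt_levels() {
  cc="$1"
  src="$2"
  for opt in -O0 -O1 -O2 -O3; do
    # Keep the full compiler/linker output in a per-level log for inspection.
    if "$cc" -fopenmp -fopenmp-targets=nvptx64 "$src" "$opt" \
         -o "hello$opt.out" > "build$opt.log" 2>&1; then
      echo "$opt: ok"
    else
      echo "$opt: FAILED (see build$opt.log)"
    fi
  done
}

# Defaults assume clang is on PATH and the source file from this report.
try_opt_levels "${CC:-clang}" "${SRC:-omp_hello_device.c}"
```

Running it once per compiler, for example with `CC` pointing at a clang 16 binary and then at a clang 17 binary, gives a side-by-side view of which levels fail for each version.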
@jhuber6 have you seen something like this before?