[OpenMP] Linking OpenMP target offloading fails with optimization enabled #84028

Open
@jprotze

Building an OpenMP hello world program fails at link time when optimization is enabled:

#include <omp.h>
#include <stdio.h>
int main(int argc, char** argv){
  printf("Devices: %i\n", omp_get_num_devices());
  int a[10]={0};
  for (int i=0; i<= omp_get_num_devices(); i++)
  #pragma omp target device(i) map(tofrom: a[:10])
  {
        printf("Hello from device %i, is_initial_device=%i\n", i, omp_is_initial_device());
        a[i]++;
  }
  printf("%i, %i, %i, %i\n", a[0], a[1], a[2], a[3]);
  return 0;
}

Build command:

clang -fopenmp -fopenmp-targets=nvptx64 omp_hello_device.c -O3

This fails with:

nvlink error   : Size doesn't match for '__omp_rtl_device_environment' in '/tmp/omp_hello_device-1f1427-nvptx64-nvidia-cuda-sm_70-1a1851.cubin', first specified in '/tmp/a-d3fd98.cubin'
nvlink fatal   : merge_elf failed
clang: error: fatbinary command failed with exit code 1 (use -v to see invocation)
clang-linker-wrapper: error: 'clang' failed
clang: error: linker command failed with exit code 1 (use -v to see invocation)

I tried a fairly recent build from main (f7c2e5f) and a release 17 build (6009708); both show this issue. I also tried several of our systems with different GPUs and different CUDA versions (11.6/11.8/12.1.1). The result is consistent with clang 17 or newer.
A clang/16.0.6 build links successfully at every optimization level.
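
To narrow down which optimization levels are affected, the build can be swept across -O0 through -O3. This is only a rough sketch: the `CC` and `SRC` variables, the `sweep_opt_levels` helper, and the output file names are assumptions made for illustration. Only the clang flags come from the command above.

```shell
#!/bin/sh
# Sweep optimization levels for the reproducer and report which ones link.
# CC and SRC are assumptions: override them to match the local setup.
CC="${CC:-clang}"
SRC="${SRC:-omp_hello_device.c}"

sweep_opt_levels() {
  for opt in -O0 -O1 -O2 -O3; do
    # Same flags as the failing command, with only the -O level varied.
    if "$CC" -fopenmp -fopenmp-targets=nvptx64 "$SRC" "$opt" \
         -o "hello$opt" >/dev/null 2>&1; then
      echo "$opt: ok"
    else
      echo "$opt: FAILED"
    fi
  done
}

sweep_opt_levels
```

Pointing `CC` at a different installation (for example a clang 16 build next to a clang 17 build) should make it easy to compare the two compilers with the same source file.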

@jhuber6 have you seen something like this before?
