[OMPT][Trunk] Offloading to x86_64 misses some OMPT target callbacks #64487

Closed
@Thyre

Description

LLVM trunk has added initial support for OMPT callbacks for target directives. During testing, I noticed that these callbacks are also dispatched when offloading to the host. While that's fine, the callbacks ompt_callback_device_initialize, ompt_callback_device_load, ompt_callback_device_unload and ompt_callback_device_finalize are not dispatched in that case.

The callback ompt_callback_device_unload is not implemented as far as I know, but I would expect the others to show up.

Not dispatching ompt_callback_device_initialize violates the OpenMP specification, which states [Link]:

The OpenMP implementation invokes this callback after OpenMP is initialized for the device but before execution of any OpenMP construct is started on the device.

NVHPC 23.7 also dispatches ompt_callback_target without ompt_callback_device_initialize. However, in their case one can identify host execution by comparing device_num with the value returned by ompt_get_num_devices: for host execution, device_num is either negative or above that value. In the case of LLVM, offloading to the host seems to initialize four devices, which are handled just like GPU devices. Therefore, we get normal device numbers. We can verify this by running llvm-omp-device-info:

$ llvm-omp-device-info
Device (0):
    Device Type    Generic-elf-64bit

Device (1):
    Device Type    Generic-elf-64bit

Device (2):
    Device Type    Generic-elf-64bit

Device (3):
    Device Type    Generic-elf-64bit

Device (4):
    CUDA Driver Version              12020
    CUDA OpenMP Device Number        0
    Device Name                      NVIDIA GeForce MX550
    Global Memory Size               94779004878848 bytes
    Number of Multiprocessors        16
    Concurrent Copy and Execution    Yes
    Total Constant Memory            65536 bytes
    Max Shared Memory per Block      49152 bytes
    Registers per Block              65536
    Warp Size                        32
    Maximum Threads per Block        1024
    Maximum Block Dimensions         
        x                            1024
        y                            1024
        z                            64
    Maximum Grid Dimensions          
        x                            2147483647
        y                            65535
        z                            65535
    Maximum Memory Pitch             2147483647 bytes
    Texture Alignment                512 bytes
    Clock Rate                       1320000 kHz
    Execution Timeout                No
    Integrated Device                No
    Can Map Host Memory              Yes
    Compute Mode                     Default
    Concurrent Kernels               Yes
    ECC Enabled                      No
    Memory Clock Rate                6001000 kHz
    Memory Bus Width                 64 bits
    L2 Cache Size                    524288 bytes
    Max Threads Per SMP              1024
    Async Engines                    3
    Unified Addressing               Yes
    Managed Memory                   Yes
    Concurrent Managed Memory        Yes
    Preemption Supported             Yes
    Cooperative Launch               Yes
    Multi-Device Boars               No
    Compute Capabilities             sm_75
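
The NVHPC-style host check mentioned above can be sketched as a small helper. This is only an illustration: device_num_is_host is a hypothetical name, not part of OMPT, and treating every number outside 0..num_devices-1 as "host" is my assumption. In a real tool, num_devices would come from ompt_get_num_devices.

```c
#include <stdbool.h>

/* Hypothetical helper (not an OMPT function): returns true if device_num
 * lies outside the valid device range 0..num_devices-1.
 * Assumption: such numbers denote host execution, as observed with NVHPC. */
bool device_num_is_host(int device_num, int num_devices) {
    return device_num < 0 || device_num >= num_devices;
}
```

With the LLVM x86_64 configuration shown above, a target region reports a device_num in the valid range, so a check like this cannot tell host offloading apart from GPU offloading.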

Reproducer

I used one of the AOMP smoke tests, veccopy-ompt-target-emi, to reproduce this issue.

When compiling and running the code with the offload target nvptx64, the following output can be seen:

$ clang --version
clang version 18.0.0 (https://github.com/llvm/llvm-project.git 52ac71f92d38f75df5cb88e9c090ac5fd5a71548)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/software/software/LLVM/git/bin
$ clang -fopenmp -fopenmp-targets=nvptx64 test.c   
$ ./a.out
Callback Init: device_num=0 type=sm_75 device=0x5593d65d44c0 lookup=0x7fef929e0480 doc=(nil)
Callback Load: device_num:0 filename:(null) host_adddr:0x5593d5e8e758 device_addr:(nil) bytes:758224
Callback Target EMI: kind=1 endpoint=1 device_num=0 task_data=0x5593d6591540 (0x0) target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) code=0x5593d5e8c9e2
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000002) src=0x7fff98ef29d0 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000002) src=0x7fff98ef29d0 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000003) src=0x7fff98ef29d0 src_device_num=1 dest=0x7fef60600000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000003) src=0x7fff98ef29d0 src_device_num=1 dest=0x7fef60600000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000004) src=0x7fff98ef1a30 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000004) src=0x7fff98ef1a30 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000005) src=0x7fff98ef1a30 src_device_num=1 dest=0x7fef60601000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000005) src=0x7fff98ef1a30 src_device_num=1 dest=0x7fef60601000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback Submit EMI: endpoint=1  req_num_teams=1 target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7a0 (0x0)
  Callback Submit EMI: endpoint=2  req_num_teams=1 target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7a0 (0x0)
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000006) src=0x7fef60601000 src_device_num=0 dest=0x7fff98ef1a30 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000006) src=0x7fef60601000 src_device_num=0 dest=0x7fff98ef1a30 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000007) src=0x7fef60600000 src_device_num=0 dest=0x7fff98ef29d0 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000007) src=0x7fef60600000 src_device_num=0 dest=0x7fff98ef29d0 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000008) src=0x7fef60601000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000008) src=0x7fef60601000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000009) src=0x7fef60600000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) host_op_id=0x7fef9282a7c0 (0x8000000000000009) src=0x7fef60600000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
Callback Target EMI: kind=1 endpoint=2 device_num=0 task_data=0x5593d6591540 (0x0) target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x8000000000000001) code=0x5593d5e8c9e2
Callback Target EMI: kind=1 endpoint=1 device_num=0 task_data=0x5593d6591540 (0x0) target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) code=0x5593d5e8cbfe
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000b) src=0x7fff98ef29d0 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000b) src=0x7fff98ef29d0 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000c) src=0x7fff98ef29d0 src_device_num=1 dest=0x7fef60601000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000c) src=0x7fff98ef29d0 src_device_num=1 dest=0x7fef60601000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000d) src=0x7fff98ef1a30 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000d) src=0x7fff98ef1a30 src_device_num=1 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fef928eb383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000e) src=0x7fff98ef1a30 src_device_num=1 dest=0x7fef60600000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000e) src=0x7fff98ef1a30 src_device_num=1 dest=0x7fef60600000 dest_device_num=0 bytes=4000 code=0x7fef928eb2fe
  Callback Submit EMI: endpoint=1  req_num_teams=0 target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7a0 (0x0)
  Callback Submit EMI: endpoint=2  req_num_teams=0 target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7a0 (0x0)
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000f) src=0x7fef60600000 src_device_num=0 dest=0x7fff98ef1a30 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x800000000000000f) src=0x7fef60600000 src_device_num=0 dest=0x7fff98ef1a30 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x8000000000000010) src=0x7fef60601000 src_device_num=0 dest=0x7fff98ef29d0 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x8000000000000010) src=0x7fef60601000 src_device_num=0 dest=0x7fff98ef29d0 dest_device_num=1 bytes=4000 code=0x7fef928f457f
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x8000000000000011) src=0x7fef60600000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x8000000000000011) src=0x7fef60600000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x8000000000000012) src=0x7fef60601000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) host_op_id=0x7fef9282a7c0 (0x8000000000000012) src=0x7fef60601000 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fef928ec73a
Callback Target EMI: kind=1 endpoint=2 device_num=0 task_data=0x5593d6591540 (0x0) target_task_data=0x5593d65ba818 (0x0) target_data=0x7fef9282a7a8 (0x800000000000000a) code=0x5593d5e8cbfe
Success
Callback Fini: device_num=0

Replacing nvptx64 with x86_64, one can see the following:

$ clang --version
clang version 18.0.0 (https://github.com/llvm/llvm-project.git 52ac71f92d38f75df5cb88e9c090ac5fd5a71548)
Target: x86_64-unknown-linux-gnu
Thread model: posix
$ clang -fopenmp -fopenmp-targets=x86_64 test.c   
$ ./a.out
Callback Target EMI: kind=1 endpoint=1 device_num=0 task_data=0x55a83389d540 (0x0) target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) code=0x55a8336389e2
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000002) src=0x7fffc2fe7820 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000002) src=0x7fffc2fe7820 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000003) src=0x7fffc2fe7820 src_device_num=4 dest=0x55a8338f82d0 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000003) src=0x7fffc2fe7820 src_device_num=4 dest=0x55a8338f82d0 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000004) src=0x7fffc2fe6880 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000004) src=0x7fffc2fe6880 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000005) src=0x7fffc2fe6880 src_device_num=4 dest=0x55a833902680 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000005) src=0x7fffc2fe6880 src_device_num=4 dest=0x55a833902680 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback Submit EMI: endpoint=1  req_num_teams=1 target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287a0 (0x0)
  Callback Submit EMI: endpoint=2  req_num_teams=1 target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287a0 (0x0)
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000006) src=0x55a833902680 src_device_num=0 dest=0x7fffc2fe6880 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000006) src=0x55a833902680 src_device_num=0 dest=0x7fffc2fe6880 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000007) src=0x55a8338f82d0 src_device_num=0 dest=0x7fffc2fe7820 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000007) src=0x55a8338f82d0 src_device_num=0 dest=0x7fffc2fe7820 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000008) src=0x55a833902680 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000008) src=0x55a833902680 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000009) src=0x55a8338f82d0 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) host_op_id=0x7fa2632287c0 (0x8000000000000009) src=0x55a8338f82d0 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
Callback Target EMI: kind=1 endpoint=2 device_num=0 task_data=0x55a83389d540 (0x0) target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x8000000000000001) code=0x55a8336389e2
Callback Target EMI: kind=1 endpoint=1 device_num=0 task_data=0x55a83389d540 (0x0) target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) code=0x55a833638bfe
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000b) src=0x7fffc2fe7820 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000b) src=0x7fffc2fe7820 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000c) src=0x7fffc2fe7820 src_device_num=4 dest=0x55a833902680 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000c) src=0x7fffc2fe7820 src_device_num=4 dest=0x55a833902680 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback DataOp EMI: endpoint=1 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000d) src=0x7fffc2fe6880 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=2 optype=1 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000d) src=0x7fffc2fe6880 src_device_num=4 dest=(nil) dest_device_num=0 bytes=4000 code=0x7fa26336f383
  Callback DataOp EMI: endpoint=1 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000e) src=0x7fffc2fe6880 src_device_num=4 dest=0x55a8338f82d0 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback DataOp EMI: endpoint=2 optype=2 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000e) src=0x7fffc2fe6880 src_device_num=4 dest=0x55a8338f82d0 dest_device_num=0 bytes=4000 code=0x7fa26336f2fe
  Callback Submit EMI: endpoint=1  req_num_teams=0 target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287a0 (0x0)
  Callback Submit EMI: endpoint=2  req_num_teams=0 target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287a0 (0x0)
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000f) src=0x55a8338f82d0 src_device_num=0 dest=0x7fffc2fe6880 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x800000000000000f) src=0x55a8338f82d0 src_device_num=0 dest=0x7fffc2fe6880 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=1 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x8000000000000010) src=0x55a833902680 src_device_num=0 dest=0x7fffc2fe7820 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=2 optype=3 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x8000000000000010) src=0x55a833902680 src_device_num=0 dest=0x7fffc2fe7820 dest_device_num=4 bytes=4000 code=0x7fa26337857f
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x8000000000000011) src=0x55a8338f82d0 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x8000000000000011) src=0x55a8338f82d0 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
  Callback DataOp EMI: endpoint=1 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x8000000000000012) src=0x55a833902680 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
  Callback DataOp EMI: endpoint=2 optype=4 target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) host_op_id=0x7fa2632287c0 (0x8000000000000012) src=0x55a833902680 src_device_num=0 dest=(nil) dest_device_num=-1 bytes=0 code=0x7fa26337073a
Callback Target EMI: kind=1 endpoint=2 device_num=0 task_data=0x55a83389d540 (0x0) target_task_data=0x55a8338c6818 (0x0) target_data=0x7fa2632287a8 (0x800000000000000a) code=0x55a833638bfe
Success

Notice that the Init, Load and Fini callbacks are missing from the output.
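
To compare the two runs mechanically, one could scan the captured output for the device lifecycle lines. The following is a minimal sketch, assuming the log uses the "Callback Init:", "Callback Load:" and "Callback Fini:" prefixes seen in the nvptx64 output; count_missing_lifecycle_callbacks is a hypothetical helper, not part of the smoke test.

```c
#include <stddef.h>
#include <string.h>

/* Prefixes printed by the test's device lifecycle callbacks,
 * taken from the nvptx64 output above. */
static const char *const lifecycle_prefixes[] = {
    "Callback Init:", "Callback Load:", "Callback Fini:"
};

/* Hypothetical helper: returns how many of the lifecycle prefixes
 * do not occur anywhere in the captured log text. */
size_t count_missing_lifecycle_callbacks(const char *log) {
    size_t missing = 0;
    size_t n = sizeof lifecycle_prefixes / sizeof lifecycle_prefixes[0];
    for (size_t i = 0; i < n; ++i) {
        if (strstr(log, lifecycle_prefixes[i]) == NULL) {
            ++missing;
        }
    }
    return missing;
}
```

Applied to the two logs above, this would report no missing prefixes for the nvptx64 run and all three missing for the x86_64 run.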

Side question:

I noticed that omp_get_num_devices() returns four when offloading to x86_64. What's the reasoning behind that? llvm-omp-device-info also shows four devices with the type Generic-elf-64bit on a system with Ubuntu 22.04 LTS and an Intel Core i7-1260P.
