[CUDA][HIP] comparison of device functions not allowed in host function #105825

Open
@yxsamliu


HIP/CUDA compiles host and device code separately. The instructions for host functions are generated during host compilation, when the compiler has no access to device function addresses. What host code sees for a device function is only a placeholder address, not the real device function address. If that placeholder address is stored in a variable, passed to a kernel, and called there, it will not work. To prevent this misuse, clang forbids the use of device functions in host functions.

However, if device functions are only compared with each other, it should be fine, e.g.

__device__ float dfn1(float) { return 1; }
__device__ float dfn2(float) { return 2; }

template<float (*OP)(float)>
__global__ void some_kernel(float *x, float y) {
  *x = OP(y) + 10 * y;
}

template<float (*OP)(float)>
void run_kernel(float *x) {
  constexpr float param = (OP == &dfn1) ? 1 : 0;
  some_kernel<OP><<<1,1>>>(x, param);
}

void run(float *x) {
  run_kernel<dfn1>(x);
}

However, clang currently diagnoses the above code (https://godbolt.org/z/d4aW1oor4), whereas nvcc allows it (https://godbolt.org/z/YWYKeTr67).

Basically, nvcc diagnoses only calls to device functions in host functions and allows other uses, while clang diagnoses any ODR-use of device functions in host functions.

I think we may want to be consistent with nvcc regarding use of device functions in host functions.

@Artem-B
