X86 rejects inlining of target-feature-wise compatible functions #67054

Closed
@kazutakahirata

This is an offshoot from #65205. Consider:

#include <immintrin.h>

__attribute__((target("avx512bw")))
static inline __m512i MM512_MASK_ADD_EPI8(__m512i src,
                                          __mmask64 k,
                                          __m512i a,
                                          __m512i b) {
    __asm__("vpaddb\t%3, %2, %0 %{%1%}" : "+v"(src) : "Yk"(k), "v"(a), "v"(b));
    return src;
}

__attribute__((target("avx512bw,avx512dq")))
__m512i G(__m512i src, __mmask64 k, __m512i a, __m512i b) {
    return MM512_MASK_ADD_EPI8(src, k, a, b);
}

The credit for the testcase goes to @kalcutter.

clang refuses to inline MM512_MASK_ADD_EPI8 into G, so the call survives at -O2:

$ clang -O2 -S target_feature.cc -o /dev/stdout | grep call
        callq   _ZL19MM512_MASK_ADD_EPI8Dv8_xyS_S_

We should be able to inline the callee into the caller: the caller's target features (avx512bw and avx512dq) are a superset of the callee's (avx512bw), so every instruction the callee is allowed to use is also available to the caller.
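
To make the superset argument concrete, here is a minimal sketch of the check, assuming target features are given as plain comma-separated names exactly as they appear in the `target` attributes above. The helper names `hasFeature` and `calleeFeaturesAreSubset` are hypothetical and exist only for illustration. This is not LLVM's actual inline-compatibility logic, and it ignores implied features, `no-` prefixes and `arch=` entries. It needs C++17.

```cpp
#include <string_view>

// Hypothetical helper: true if `feature` is a whole entry of the
// comma-separated feature list `list`.
constexpr bool hasFeature(std::string_view list, std::string_view feature) {
    while (!list.empty()) {
        const auto comma = list.find(',');
        if (list.substr(0, comma) == feature) return true;
        if (comma == std::string_view::npos) break;
        list.remove_prefix(comma + 1);
    }
    return false;
}

// Hypothetical helper: true if every feature the callee may use is also
// enabled for the caller, i.e. the caller's set is a superset.
constexpr bool calleeFeaturesAreSubset(std::string_view caller,
                                       std::string_view callee) {
    while (!callee.empty()) {
        const auto comma = callee.find(',');
        if (!hasFeature(caller, callee.substr(0, comma))) return false;
        if (comma == std::string_view::npos) break;
        callee.remove_prefix(comma + 1);
    }
    return true;
}

// G (caller) vs. MM512_MASK_ADD_EPI8 (callee) from the testcase above.
static_assert(calleeFeaturesAreSubset("avx512bw,avx512dq", "avx512bw"));
// The reverse direction must be rejected: the callee would need avx512dq.
static_assert(!calleeFeaturesAreSubset("avx512bw", "avx512bw,avx512dq"));
```

Under this simplified model the testcase passes the check, which is the sense in which the two functions are "target-feature-wise compatible".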
