[X86] Improve `__bf16` code generation #134859

antoniofrighetto · 2025-04-08T14:16:28Z

`bf16` is a typedef'd `short` type introduced with AVX-512_BF16 and should be able to leverage the SSE/AVX registers used for `f16`.

Fixes: #134222.

antoniofrighetto · 2025-04-08T14:19:50Z

llvm/lib/Target/X86/X86InstrAVX512.td

+let Predicates = [HasAVX512, HasBF16] in {
+  def : Pat<(f16 (bitconvert (bf16 FR16X:$src))), (f16 FR16X:$src)>;
+  def : Pat<(bf16 (bitconvert (f16 FR16X:$src))), (bf16 FR16X:$src)>;
+}


@phoebewang Could we reuse FR16X for bf16 as well here? Changing the FR16X definition to include bf16:

def FR16X : RegisterClass<"X86", [f16, bf16], 16, (add FR32X)> {let Size = 32;}

This seems to lead to a bunch of compilation issues due to ambiguity in instruction patterns :( Should we create a new class?

Yes, we can use FR16X for bf16. We use it this way for different vector types of the same size. It just requires updating the affected patterns with explicit type casts.
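
For readers following the two `bitconvert` patterns in the diff above: they treat a conversion between `f16` and `bf16` as a pure reinterpretation of the same 16 bits in the same register. The sketch below, in plain C++ with hypothetical helper names (it is not LLVM code), illustrates why only the type changes: the bit pattern stays put, but each format assigns it a different value, which is also why patterns over a shared register class need the explicit type.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helpers for illustration only; not part of LLVM.

// Read 16 bits as bfloat16: 1 sign, 8 exponent, 7 mantissa bits.
// A bf16 value is the upper half of the matching IEEE binary32 value.
inline float decode_bits_as_bf16(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// Read 16 bits as IEEE binary16: 1 sign, 5 exponent, 10 mantissa bits.
inline float decode_bits_as_f16(std::uint16_t bits) {
  const int sign = bits >> 15;
  const int exponent = (bits >> 10) & 0x1F;
  const int fraction = bits & 0x3FF;
  float magnitude;
  if (exponent == 0) {
    // Zero or subnormal: fraction * 2^-24.
    magnitude = std::ldexp(static_cast<float>(fraction), -24);
  } else if (exponent == 0x1F) {
    magnitude = fraction ? std::numeric_limits<float>::quiet_NaN()
                         : std::numeric_limits<float>::infinity();
  } else {
    // Normal: (1024 + fraction) * 2^(exponent - 25).
    magnitude = std::ldexp(static_cast<float>(fraction + 1024), exponent - 25);
  }
  return sign ? -magnitude : magnitude;
}

// Same bits, two readings: 0x3F80 is 1.0 as bf16 but 1.875 as f16.
inline void self_check_bit_reinterpretation() {
  assert(decode_bits_as_bf16(0x3F80) == 1.0f);
  assert(decode_bits_as_f16(0x3F80) == 1.875f);
}
```

Calling `self_check_bit_reinterpretation()` exercises both decoders on the same input.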

phoebewang · 2025-04-08T14:48:37Z

Thanks for the fix! We intentionally did not assign __bf16 a register class because 1) there is not a single scalar BF16 instruction supported on x86 so far; 2) we can leverage the target-independent expansion to lower BF16 operations.

It is not a big problem though. We support FP16 without any of the scalar features enabled too. But there is more work to do than the current change. The FP16 type enabling patch would be a good example.

OTOH, I'm not sure whether this is too big a hammer for the issue here. Would it be better to explore some small change on the FastISel side for it?

antoniofrighetto · 2025-04-26T17:28:29Z

@phoebewang Thanks for the suggestion (and sorry for getting back only now), agree we should be using the already-existing FR16X register class, updated affected patterns (first fixup commit)!

In the second fixup commit, I took the chance to add a custom hook for extension/truncation: while fptrunc is already handled correctly (https://llvm.godbolt.org/z/K87Tor8qP), I don't think that is the case for fpext (redundant shift of bits + call to __truncsfbf2, which should just be a call to __extendbfsf2), though I believe this should be orthogonal to the current miscompilation (might as well add it in a different patch if needed).

I tried to restrict the changes to FastISel only; however, even so, there were some other patterns to be added. There are some test failures, mostly related to missing instruction selection for some nodes, e.g., bf16 = ConstantFP<APFloat(0)> (I think we miss some move pattern here; this comes from the fact that a v32f16 = BUILD_VECTOR node of 32 ConstantFP:f16<APFloat(0)> elements should have been correctly replaced with a single v32f16 = X86ISD::VBROADCAST ConstantFP:bf16<APFloat(0)>), but if possible it'd be nice to confirm this is the correct direction first.
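
As background for the fpext/fptrunc point above, here is a minimal C++ sketch of what the two conversions mean at the bit level. The helper names are hypothetical, and this is neither compiler-rt's `__extendbfsf2`/`__truncsfbf2` nor the code the backend emits; round-to-nearest-even for the truncation is an assumption of this sketch. It shows that extending bf16 to f32 is lossless and amounts to placing the 16 bits in the upper half of a 32-bit word, while only the truncation needs rounding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; not LLVM or compiler-rt APIs.

// bf16 -> f32 (extension): exact, the low 16 bits of the result are zero.
inline float extend_bf16_bits(std::uint16_t bits) {
  std::uint32_t wide = static_cast<std::uint32_t>(bits) << 16;
  float out;
  std::memcpy(&out, &wide, sizeof(out));
  return out;
}

// f32 -> bf16 (truncation): assumes round-to-nearest-even; NaNs are quieted.
inline std::uint16_t truncate_to_bf16_bits(float value) {
  std::uint32_t wide;
  std::memcpy(&wide, &value, sizeof(wide));
  if ((wide & 0x7FFFFFFFu) > 0x7F800000u) {
    // NaN: keep the sign and upper payload, force the quiet bit.
    return static_cast<std::uint16_t>((wide >> 16) | 0x0040u);
  }
  // Add 0x7FFF, plus 1 if the kept low bit is odd, so ties round to even.
  const std::uint32_t bias = 0x7FFFu + ((wide >> 16) & 1u);
  return static_cast<std::uint16_t>((wide + bias) >> 16);
}

// Extending and then truncating returns the original bits for non-NaN input.
inline void self_check_round_trip() {
  assert(extend_bf16_bits(0xC040) == -3.0f);
  assert(truncate_to_bf16_bits(extend_bf16_bits(0xC040)) == 0xC040);
}
```

Because the extension is exact, a truncation call on the fpext path does no useful work, which is consistent with the redundancy described in the comment.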

[X86] Add support for __bf16 to f16 conversion

7f2a112

`bf16` is a typedef short type introduced in AVX-512_BF16 and should be able to leverage SSE/AVX registers used for `f16`. Fixes: llvm#134222.

antoniofrighetto added 2 commits April 26, 2025 19:21

!fixup update affected patterns w/ explicit cast, update calling conventions

22f3976

!fixup custom hooks for trunc/ext, add missing instruction selection patterns

b56a6ba

antoniofrighetto changed the title ~~[X86] Add support for __bf16 to f16 conversion~~ [X86] Improve __bf16 code generation Apr 26, 2025
