[AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point branches. #124028

kmitropoulou · 2025-01-23T00:00:46Z

[NFC] Use GCNPat instead of Pat.
[AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point branches.

llvmbot · 2025-01-23T00:01:17Z

@llvm/pr-subscribers-backend-amdgpu

Author: Konstantina Mitropoulou (kmitropoulou)

Changes

[NFC] Use GCNPat instead of Pat.
[AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point branches.

Full diff: https://github.com/llvm/llvm-project/pull/124028.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPU.td (+3)
(modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+8-7)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPU.td b/llvm/lib/Target/AMDGPU/AMDGPU.td
index 7ad6720b8001af..6439149d801f6b 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPU.td
+++ b/llvm/lib/Target/AMDGPU/AMDGPU.td
@@ -2498,6 +2498,9 @@ def HasNotMADIntraFwdBug : Predicate<"!Subtarget->hasMADIntraFwdBug()">;
 def HasSALUFloatInsts : Predicate<"Subtarget->hasSALUFloatInsts()">,
   AssemblerPredicate<(all_of FeatureSALUFloatInsts)>;
 
+def NotHasSALUFloatInsts : Predicate<"!Subtarget->hasSALUFloatInsts()">,
+  AssemblerPredicate<(all_of (not FeatureSALUFloatInsts))>;
+
 def HasPseudoScalarTrans : Predicate<"Subtarget->hasPseudoScalarTrans()">,
   AssemblerPredicate<(all_of FeaturePseudoScalarTrans)>;
 
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 40a20fa9cb15ea..ae3d20bcfb3aaf 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -1053,39 +1053,40 @@ def : GCNPat<
   (SI_ELSE $src, $target)
 >;
 
-def : Pat <
+def : GCNPat <
   (int_amdgcn_kill i1:$src),
   (SI_KILL_I1_PSEUDO SCSrc_i1:$src, 0)
 >;
 
-def : Pat <
+def : GCNPat <
   (int_amdgcn_kill (i1 (not i1:$src))),
   (SI_KILL_I1_PSEUDO SCSrc_i1:$src, -1)
 >;
 
-def : Pat <
+let SubtargetPredicate = NotHasSALUFloatInsts in
+def : GCNPat <
   (int_amdgcn_kill (i1 (setcc f32:$src, InlineImmFP32:$imm, cond:$cond))),
   (SI_KILL_F32_COND_IMM_PSEUDO VSrc_b32:$src, (bitcast_fpimm_to_i32 $imm), (cond_as_i32imm $cond))
 >;
 
-def : Pat <
+def : GCNPat <
   (int_amdgcn_wqm_demote i1:$src),
   (SI_DEMOTE_I1 SCSrc_i1:$src, 0)
 >;
 
-def : Pat <
+def : GCNPat <
   (int_amdgcn_wqm_demote (i1 (not i1:$src))),
   (SI_DEMOTE_I1 SCSrc_i1:$src, -1)
 >;
 
   // TODO: we could add more variants for other types of conditionals
 
-def : Pat <
+def : GCNPat <
   (i64 (int_amdgcn_icmp i1:$src, (i1 0), (i32 33))),
   (COPY $src) // Return the SGPRs representing i1 src
 >;
 
-def : Pat <
+def : GCNPat <
   (i32 (int_amdgcn_icmp i1:$src, (i1 0), (i32 33))),
   (COPY $src) // Return the SGPRs representing i1 src
 >;

jayfoad · 2025-01-23T10:38:20Z

Patch looks OK but can you come up with a test case? Maybe use llvm-reduce to reduce a pipeline that shows the problem?

As a follow up, we should probably rename SI_KILL_I1_PSEUDO and SI_KILL_F32_COND_IMM_PSEUDO, because the important different is not whether the comparison is integer or floating point. It is whether the comparison instruction is SALU or VALU (i.e. whether the result is in scc or vcc).

arsenm · 2025-01-24T04:30:54Z

llvm/lib/Target/AMDGPU/AMDGPU.td

@@ -2498,6 +2498,9 @@ def HasNotMADIntraFwdBug : Predicate<"!Subtarget->hasMADIntraFwdBug()">;
 def HasSALUFloatInsts : Predicate<"Subtarget->hasSALUFloatInsts()">,
  AssemblerPredicate<(all_of FeatureSALUFloatInsts)>;

+def NotHasSALUFloatInsts : Predicate<"!Subtarget->hasSALUFloatInsts()">,
+  AssemblerPredicate<(all_of (not FeatureSALUFloatInsts))>;


do we really not have none_of?

We really don't. Looks like there are plenty of places where it could be used in existing .td files.

jayfoad

Thanks for the test. It can be simplified as noted inline.

LGTM apart from that.

jayfoad · 2025-01-28T08:53:57Z

llvm/test/CodeGen/AMDGPU/set_kill_i1_for_floation_point_comparison.ll

+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
+target triple = "amdgcn--amdpal"


Remove this

jayfoad · 2025-01-28T08:54:19Z

llvm/test/CodeGen/AMDGPU/set_kill_i1_for_floation_point_comparison.ll

+
+; Function Attrs: nocallback nofree nounwind
+declare void @llvm.amdgcn.kill(i1) #0
+
+; Function Attrs: nocallback nofree nosync nounwind speculatable willreturn memory(none)
+declare noundef i64 @llvm.amdgcn.s.getpc() #1
+
+; Function Attrs: nocallback nofree nosync nounwind willreturn memory(none)
+declare i32 @llvm.amdgcn.s.buffer.load.i32(<4 x i32>, i32, i32 immarg) #2
+
+attributes #0 = { nocallback nofree nounwind }
+attributes #1 = { nocallback nofree nosync nounwind speculatable willreturn memory(none) }
+attributes #2 = { nocallback nofree nosync nounwind willreturn memory(none) }
+


Remove this

jayfoad · 2025-01-28T08:55:04Z

llvm/test/CodeGen/AMDGPU/set_kill_i1_for_floation_point_comparison.ll

+  ; CHECK-NEXT: bb.2 (%ir-block.5):
+  ; CHECK-NEXT:   S_ENDPGM 0
+.entry:
+  %0 = call i32 @llvm.amdgcn.s.buffer.load.i32(<4 x i32> zeroinitializer, i32 0, i32 0)


Use named values instead of %0 etc. You can do this automatically by running the test through opt -passes=instnamer.

jayfoad · 2025-01-29T11:42:22Z

llvm/test/CodeGen/AMDGPU/set_kill_i1_for_floation_point_comparison.ll

@@ -0,0 +1,42 @@
+; NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py UTC_ARGS: --version 5
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1200 -verify-machineinstrs -stop-after=amdgpu-isel < %s 2>&1 | FileCheck %s
+target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"


Remove this

jayfoad

LGTM with datalayout removed

…nches.

jayfoad

Thanks

llvmbot added the backend:AMDGPU label Jan 23, 2025

kmitropoulou changed the title ~~set i1 kill~~ [AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point branches. Jan 23, 2025

kmitropoulou requested review from jayfoad and dstutt January 23, 2025 00:02

jayfoad requested review from perlfu and piotrAMD January 23, 2025 10:36

kmitropoulou force-pushed the set_i1_kill branch from 5eadc3d to 6fd8c15 Compare January 24, 2025 04:16

arsenm reviewed Jan 24, 2025

View reviewed changes

kmitropoulou force-pushed the set_i1_kill branch from 6fd8c15 to 2ae284a Compare January 24, 2025 20:25

jayfoad reviewed Jan 28, 2025

View reviewed changes

[NFC] Use GCNPat instead of Pat.

b725200

kmitropoulou force-pushed the set_i1_kill branch from 2ae284a to 6954d3c Compare January 28, 2025 23:00

jayfoad reviewed Jan 29, 2025

View reviewed changes

jayfoad approved these changes Jan 29, 2025

View reviewed changes

Konstantina Mitropoulou added 2 commits January 29, 2025 08:11

Add a new test with SI_KILL_F32_COND_IMM_PSEUDO

3f19ceb

[AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point bra…

26ef9ed

…nches.

kmitropoulou force-pushed the set_i1_kill branch from 6954d3c to 26ef9ed Compare January 29, 2025 16:14

jayfoad approved these changes Jan 29, 2025

View reviewed changes

kmitropoulou merged commit 9adc99b into llvm:main Jan 29, 2025
5 of 7 checks passed

		target datalayout = "e-p:64:64-p1:64:64-p2:32:32-p3:32:32-p4:64:64-p5:32:32-p6:32:32-p7:160:256:256:32-p8:128:128-p9:192:256:256:32-i64:64-v16:16-v24:32-v32:32-v48:64-v96:128-v192:256-v256:256-v512:512-v1024:1024-v2048:2048-n32:64-S32-A5-G1-ni:7:8:9"
		target triple = "amdgcn--amdpal"

[AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point branches. #124028

[AMDGPU] Always emit SI_KILL_I1_PSEUDO for uniform floating point branches. #124028

Uh oh!

Conversation

kmitropoulou commented Jan 23, 2025

Uh oh!

llvmbot commented Jan 23, 2025

Uh oh!

jayfoad commented Jan 23, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jayfoad Jan 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jayfoad left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jayfoad left a comment

Choose a reason for hiding this comment

Uh oh!

jayfoad left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

jayfoad Jan 24, 2025 •

edited

Loading