[AMDGPU][True16][CodeGen] update VGPRimm for t16 #131021

Merged: 1 commit into llvm:main on Mar 20, 2025

Conversation

broxigarchen (Contributor) commented Mar 12, 2025

Added a bitcast_fpimm_to_i16 SDNodeXForm and updated the VGPRImm pattern for the t16 flow.

This change follows the pattern from the 32-bit case.
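
For intuition, here is a minimal standalone C++ sketch of what the bitcast amounts to (illustration only, not the LLVM APFloat code; compiler support for _Float16 is assumed): the fp16 constant's bit pattern is simply reinterpreted as a 16-bit integer immediate.

#include <bit>
#include <cstdint>
#include <cstdio>

// Illustration only: bitcast_fpimm_to_i16 reinterprets an fp16 constant's
// bits as an i16 immediate (the real transform calls
// N->getValueAPF().bitcastToAPInt() and wraps it in a target constant).
int main() {
  _Float16 h = 1.0;                            // fp16 constant
  uint16_t bits = std::bit_cast<uint16_t>(h);  // 0x3C00 for 1.0
  std::printf("f16 1.0 encodes as 0x%04X\n", (unsigned)bits);
  return 0;
}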

@broxigarchen broxigarchen changed the title fix vgprimm for t16 [AMDGPU][True16][CodeGen] update VGPRimm for t16 Mar 13, 2025
@broxigarchen broxigarchen marked this pull request as ready for review March 14, 2025 02:36
@broxigarchen broxigarchen requested review from arsenm and Sisyph March 14, 2025 02:36
@broxigarchen broxigarchen requested a review from kosarev March 14, 2025 02:36
broxigarchen (Contributor, Author) commented Mar 14, 2025

The CI error is not related.

It seems we don't have a test case for this in the current code base, but it triggered an error in a downstream PR with a true16 pattern. I will try to get a test up.

llvmbot (Member) commented Mar 14, 2025

@llvm/pr-subscribers-backend-amdgpu

Author: Brox Chen (broxigarchen)

Changes

Added a bitcast_fpimm_to_i16 and updated the VGPRImm pattern for the t16 flow.

This follows the pattern for the 32-bit case.


Full diff: https://github.com/llvm/llvm-project/pull/131021.diff

2 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/SIInstrInfo.td (+5)
  • (modified) llvm/lib/Target/AMDGPU/SIInstructions.td (+1-1)
diff --git a/llvm/lib/Target/AMDGPU/SIInstrInfo.td b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
index 403c657c64053..23a7f508dcda2 100644
--- a/llvm/lib/Target/AMDGPU/SIInstrInfo.td
+++ b/llvm/lib/Target/AMDGPU/SIInstrInfo.td
@@ -846,6 +846,11 @@ def cond_as_i32imm: SDNodeXForm<cond, [{
 }]>;
 
 // Copied from the AArch64 backend:
+def bitcast_fpimm_to_i16 : SDNodeXForm<fpimm, [{
+return CurDAG->getTargetConstant(
+  N->getValueAPF().bitcastToAPInt().getZExtValue(), SDLoc(N), MVT::i16);
+}]>;
+
 def bitcast_fpimm_to_i32 : SDNodeXForm<fpimm, [{
 return CurDAG->getTargetConstant(
   N->getValueAPF().bitcastToAPInt().getZExtValue(), SDLoc(N), MVT::i32);
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index de77401eb0137..37ce5d548dfed 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -2283,7 +2283,7 @@ let True16Predicate = UseRealTrue16Insts in {
   foreach vt = [f16, bf16] in {
     def : GCNPat <
       (VGPRImm<(vt fpimm)>:$imm),
-      (V_MOV_B16_t16_e64 0, $imm, 0)
+      (V_MOV_B16_t16_e64 0, (vt (bitcast_fpimm_to_i16 $imm)), 0)
     >;
   }
 }

Inline review comment on llvm/lib/Target/AMDGPU/SIInstructions.td:
-      (V_MOV_B16_t16_e64 0, $imm, 0)
+      (V_MOV_B16_t16_e64 0, (vt (bitcast_fpimm_to_i16 $imm)), 0)
Contributor commented:
The type of immediate nodes in the selected DAG doesn't really matter; they all turn into a sign-extended int64_t in the end anyway. We could change all of these to just use i64 in the first place.
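
A rough standalone illustration of that point (hypothetical values, not LLVM code): once selection is done, the immediate lives in a 64-bit operand field, so the 16-bit bit pattern is carried as a plain integer no matter which MVT the pattern chose.

#include <cstdint>
#include <cstdio>

// Illustration only: a 16-bit immediate such as 0xBC00 (the fp16 bit pattern
// for -1.0) ends up as an int64_t operand value, whether one thinks of it as
// sign-extended or zero-extended.
int main() {
  uint16_t raw = 0xBC00;
  int64_t signExtended = (int64_t)(int16_t)raw;  // -17408
  int64_t zeroExtended = (int64_t)raw;           //  48128
  std::printf("sign-extended: %lld, zero-extended: %lld\n",
              (long long)signExtended, (long long)zeroExtended);
  return 0;
}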

broxigarchen (Contributor, Author) replied Mar 20, 2025:

Double-checked in the downstream branch; I think you are right.

Still merging this patch, as it makes the pattern more consistent.

broxigarchen merged commit 541d6c3 into llvm:main on Mar 20, 2025
12 of 14 checks passed