-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[AMDGPU] Set Size to 4 for V_MOV_B64_PSEUDO and S_MOV_B64_IMM_PSEUDO #70376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
These are not fixed size instructions, so immediate size shall be added separately. A minimal opcode size 4 since the inception of the V_MOV_B64 instruction. A real instruction can be as small as 4 bytes in case of inline immediate. Otherwise it is NFCI.
@llvm/pr-subscribers-backend-amdgpu Author: Stanislav Mekhanoshin (rampitec) ChangesThese are not fixed size instructions, so immediate size shall be added separately. A minimal opcode size 4 since the inception of the V_MOV_B64 instruction. A real instruction can be as small as 4 bytes in case of inline immediate. Otherwise it is NFCI. Full diff: https://github.com/llvm/llvm-project/pull/70376.diff 1 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/SIInstructions.td b/llvm/lib/Target/AMDGPU/SIInstructions.td
index 567f1b812c1808c..bab74c170ab399d 100644
--- a/llvm/lib/Target/AMDGPU/SIInstructions.td
+++ b/llvm/lib/Target/AMDGPU/SIInstructions.td
@@ -132,7 +132,7 @@ def V_MOV_B64_PSEUDO : VPseudoInstSI <(outs VReg_64:$vdst),
let isAsCheapAsAMove = 1;
let isMoveImm = 1;
let SchedRW = [Write64Bit];
- let Size = 16; // Needs maximum 2 v_mov_b32 instructions 8 byte long each.
+ let Size = 4;
let UseNamedOperandTable = 1;
}
@@ -149,7 +149,7 @@ def S_MOV_B64_IMM_PSEUDO : SPseudoInstSI <(outs SReg_64:$sdst),
let isAsCheapAsAMove = 1;
let isMoveImm = 1;
let SchedRW = [WriteSALU, Write64Bit];
- let Size = 16; // Needs maximum 2 s_mov_b32 instructions 8 byte long each.
+ let Size = 4;
let Uses = [];
let UseNamedOperandTable = 1;
}
|
@@ -132,7 +132,7 @@ def V_MOV_B64_PSEUDO : VPseudoInstSI <(outs VReg_64:$vdst), | |||
let isAsCheapAsAMove = 1; | |||
let isMoveImm = 1; | |||
let SchedRW = [Write64Bit]; | |||
let Size = 16; // Needs maximum 2 v_mov_b32 instructions 8 byte long each. | |||
let Size = 4; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the exact meaning of Size?
In Target.td it says "// Size of encoded instruction", but can we make that more specific that? Is it minimum size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat strange for a pseudo which is never encoded as is, but SIInstrInfo::getInstSizeInBytes() uses it as a baseline size. If it is not a fixed size instruction (and it is not) it will then add literal size to the base. The 16 we have now already count the literal (second time). Given that is a pseudo for expansion I'd prefer to use minimal possible size. 16 was written when we had no V_MOV_B64 and it was always expanded into a pair of V_MOV_B32, but even in that case it shall be 8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the exact meaning of Size?
Good question. I think TargetInstrInfo::getInstSizeInBytes probably has to return a maximum size, since it is used by (for example) the BranchRelaxation pass. The exact meaning of the TableGen Size field is probably target-dependent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like getInstSizeInBytes will return the true size, neither min nor max. And I agree TableGen Size an intermediate target defined value. We should Size it consistently on the AMDGPU target. I think it is minimum size. There does not appear to be a natural place to document it, but perhaps as a comment on SIInstrInfo::getInstSizeInBytes. Please consider that a nit, otherwise the patch looks good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the time we run BranchRelaxation we already have no these post-RA expandable pseudos. I.e. it should really return a correct size, but only at certain point of lowering.
These are not fixed size instructions, so immediate size shall be added separately. A minimal opcode size 4 since the inception of the V_MOV_B64 instruction. A real instruction can be as small as 4 bytes in case of inline immediate. Otherwise it is NFCI.