-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[AMDGPU] Rename getNumVGPRBlocks. NFC #84161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Rename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's using the encoding granule. This is used to program the hardware. In practice, the hardware will use the alloc granule instead, so this patch also adds a new helper, getAllocatedNumVGPRBlocks, which can be useful when driving heuristics.
@llvm/pr-subscribers-backend-amdgpu Author: Diana Picus (rovka) ChangesRename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's using the encoding granule. This is used to program the hardware. In practice, the hardware will use the alloc granule instead, so this patch also adds a new helper, getAllocatedNumVGPRBlocks, which can be useful when driving heuristics. Full diff: https://github.com/llvm/llvm-project/pull/84161.diff 4 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 37a36b26b947c6..d9970a200804ae 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -868,8 +868,8 @@ void AMDGPUAsmPrinter::getSIProgramInfo(SIProgramInfo &ProgInfo,
ProgInfo.SGPRBlocks = IsaInfo::getNumSGPRBlocks(
&STM, ProgInfo.NumSGPRsForWavesPerEU);
- ProgInfo.VGPRBlocks = IsaInfo::getNumVGPRBlocks(
- &STM, ProgInfo.NumVGPRsForWavesPerEU);
+ ProgInfo.VGPRBlocks =
+ IsaInfo::getEncodedNumVGPRBlocks(&STM, ProgInfo.NumVGPRsForWavesPerEU);
const SIModeRegisterDefaults Mode = MFI->getMode();
diff --git a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
index 10999d846e3bb2..b42c1acbd305c3 100644
--- a/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
+++ b/llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
@@ -5376,8 +5376,8 @@ bool AMDGPUAsmParser::calculateGPRBlocks(
NumSGPRs = IsaInfo::FIXED_NUM_SGPRS_FOR_INIT_BUG;
}
- VGPRBlocks =
- IsaInfo::getNumVGPRBlocks(&getSTI(), NumVGPRs, EnableWavefrontSize32);
+ VGPRBlocks = IsaInfo::getEncodedNumVGPRBlocks(&getSTI(), NumVGPRs,
+ EnableWavefrontSize32);
SGPRBlocks = IsaInfo::getNumSGPRBlocks(&getSTI(), NumSGPRs);
return false;
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
index 63285c06edaf2c..f0dc01644b85da 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
@@ -1158,14 +1158,23 @@ unsigned getMaxNumVGPRs(const MCSubtargetInfo *STI, unsigned WavesPerEU) {
return std::min(MaxNumVGPRs, AddressableNumVGPRs);
}
-unsigned getNumVGPRBlocks(const MCSubtargetInfo *STI, unsigned NumVGPRs,
- std::optional<bool> EnableWavefrontSize32) {
- NumVGPRs = alignTo(std::max(1u, NumVGPRs),
- getVGPREncodingGranule(STI, EnableWavefrontSize32));
- // VGPRBlocks is actual number of VGPR blocks minus 1.
- return NumVGPRs / getVGPREncodingGranule(STI, EnableWavefrontSize32) - 1;
+static unsigned getNumBlocks(unsigned NumVGPRs, unsigned Granule) {
+ return divideCeil(std::max(1u, NumVGPRs), Granule);
}
+unsigned getEncodedNumVGPRBlocks(const MCSubtargetInfo *STI, unsigned NumVGPRs,
+ std::optional<bool> EnableWavefrontSize32) {
+ return getNumBlocks(NumVGPRs,
+ getVGPREncodingGranule(STI, EnableWavefrontSize32)) -
+ 1;
+}
+
+unsigned getAllocatedNumVGPRBlocks(const MCSubtargetInfo *STI,
+ unsigned NumVGPRs,
+ std::optional<bool> EnableWavefrontSize32) {
+ return getNumBlocks(NumVGPRs,
+ getVGPRAllocGranule(STI, EnableWavefrontSize32));
+}
} // end namespace IsaInfo
void initDefaultAMDKernelCodeT(amd_kernel_code_t &Header,
diff --git a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
index 9fcb4caca30b01..d827ef3827e2a0 100644
--- a/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
+++ b/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h
@@ -316,13 +316,20 @@ unsigned getNumWavesPerEUWithNumVGPRs(const MCSubtargetInfo *STI,
unsigned NumVGPRs);
/// \returns Number of VGPR blocks needed for given subtarget \p STI when
-/// \p NumVGPRs are used.
+/// \p NumVGPRs are used. We actually return the number of blocks -1, since
+/// that's what we encode.
///
/// For subtargets which support it, \p EnableWavefrontSize32 should match the
/// ENABLE_WAVEFRONT_SIZE32 kernel descriptor field.
-unsigned
-getNumVGPRBlocks(const MCSubtargetInfo *STI, unsigned NumSGPRs,
- std::optional<bool> EnableWavefrontSize32 = std::nullopt);
+unsigned getEncodedNumVGPRBlocks(
+ const MCSubtargetInfo *STI, unsigned NumVGPRs,
+ std::optional<bool> EnableWavefrontSize32 = std::nullopt);
+
+/// \returns Number of VGPR blocks that need to be allocated for the given
+/// subtarget \p STI when \p NumVGPRs are used.
+unsigned getAllocatedNumVGPRBlocks(
+ const MCSubtargetInfo *STI, unsigned NumVGPRs,
+ std::optional<bool> EnableWavefrontSize32 = std::nullopt);
} // end namespace IsaInfo
|
Makes sense to me, but then I suggested it, so hopefully someone else will take a look too. It might also make sense to refactor |
NFC? |
getVGPREncodingGranule(STI, EnableWavefrontSize32)); | ||
// VGPRBlocks is actual number of VGPR blocks minus 1. | ||
return NumVGPRs / getVGPREncodingGranule(STI, EnableWavefrontSize32) - 1; | ||
static unsigned getNumBlocks(unsigned NumVGPRs, unsigned Granule) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add something about VGPRs here? getNumBlocks sounds too vague.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that you mention it, there's nothing VGPR specific here, since we're passing in both the number of registers and the granule. In fact, I should probably use this for getNumSGPRBlocks
too. Maybe getNumRegisterBlocks
would be best?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought so too, it is not VGPR specific. Maybe even something like getGranulatedNumRegisterBlocks. Plus rename the first argument, it is not VGPR number either, just register number.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Rename getNumVGPRBlocks to getEncodedNumVGPRBlocks, to clarify that it's using the encoding granule. This is used to program the hardware. In practice, the hardware will use the alloc granule instead, so this patch also adds a new helper, getAllocatedNumVGPRBlocks, which can be useful when driving heuristics.