-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[LV] Fix '-1U' bits for smallest type in getSmallestAndWidestTypes #135783
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[LV] Fix '-1U' bits for smallest type in getSmallestAndWidestTypes #135783
Conversation
For loops without loads/stores, where the smallest/widest types are calculated from the reduction, the smallest type returned is always -1U and it actually returns the smallest type as the widest type. This PR fixes the calculation.
@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-transforms Author: Sander de Smalen (sdesmalen-arm) ChangesFor loops without loads/stores, where the smallest/widest types are calculated from the reduction, the smallest type returned is always -1U and it actually returns the smallest type as the widest type. This PR fixes the calculation. This follows from #132190 (comment) Full diff: https://github.com/llvm/llvm-project/pull/135783.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index af94dc01c8c5c..c21b6c4c88929 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -4798,17 +4798,16 @@ LoopVectorizationCostModel::getSmallestAndWidestTypes() {
// if there are no loads/stores in the loop. In this case, check through the
// reduction variables to determine the maximum width.
if (ElementTypesInLoop.empty() && !Legal->getReductionVars().empty()) {
- // Reset MaxWidth so that we can find the smallest type used by recurrences
- // in the loop.
- MaxWidth = -1U;
for (const auto &PhiDescriptorPair : Legal->getReductionVars()) {
const RecurrenceDescriptor &RdxDesc = PhiDescriptorPair.second;
// When finding the min width used by the recurrence we need to account
// for casts on the input operands of the recurrence.
- MaxWidth = std::min<unsigned>(
- MaxWidth, std::min<unsigned>(
+ MinWidth = std::min<unsigned>(
+ MinWidth, std::min<unsigned>(
RdxDesc.getMinWidthCastToRecurrenceTypeInBits(),
RdxDesc.getRecurrenceType()->getScalarSizeInBits()));
+ MaxWidth = std::max<unsigned>(
+ MaxWidth, RdxDesc.getRecurrenceType()->getScalarSizeInBits());
}
} else {
for (Type *T : ElementTypesInLoop) {
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll b/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
index 269562fa70549..b34ba5e38811a 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
@@ -37,7 +37,7 @@ for.end:
; chosen. The following 3 cases check different combinations of widths.
; CHECK-LABEL: Checking a loop in 'no_loads_stores_32'
-; CHECK: The Smallest and Widest types: 4294967295 / 32 bits
+; CHECK: The Smallest and Widest types: 32 / 64 bits
; CHECK: Selecting VF: 4
define double @no_loads_stores_32(i32 %n) {
@@ -60,7 +60,7 @@ for.end:
}
; CHECK-LABEL: Checking a loop in 'no_loads_stores_16'
-; CHECK: The Smallest and Widest types: 4294967295 / 16 bits
+; CHECK: The Smallest and Widest types: 16 / 64 bits
; CHECK: Selecting VF: 8
define double @no_loads_stores_16() {
@@ -82,7 +82,7 @@ for.end:
}
; CHECK-LABEL: Checking a loop in 'no_loads_stores_8'
-; CHECK: The Smallest and Widest types: 4294967295 / 8 bits
+; CHECK: The Smallest and Widest types: 8 / 32 bits
; CHECK: Selecting VF: 16
define float @no_loads_stores_8() {
|
RdxDesc.getMinWidthCastToRecurrenceTypeInBits(), | ||
RdxDesc.getRecurrenceType()->getScalarSizeInBits())); | ||
MaxWidth = std::max<unsigned>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change does also change how MaxWidth is calculated, previously MaxWidth was what MinWidth
is now?
Granted, it seems to make sense to change this as well, although I am not sure if this was done intentionally when support for in-loop reductions originally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change does also change how MaxWidth is calculated, previously MaxWidth was what MinWidth is now?
Yes, that's right.
I am not sure if this was done intentionally when support for in-loop reductions originally
I wonder if MaxWidth got confused with the resulting maximum vector width (/factor) at some point. The original patch that added this was https://reviews.llvm.org/D113973.
The tests added in that patch contain two different types, particularly for testing this purpose, although the CHECK lines (e.g. CHECK: The Smallest and Widest types: 4294967295 / 16 bits
) seemed wrong, but after this change it is what I would expect it to be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it seems like the new behavior matches the expectations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks
RdxDesc.getMinWidthCastToRecurrenceTypeInBits(), | ||
RdxDesc.getRecurrenceType()->getScalarSizeInBits())); | ||
MaxWidth = std::max<unsigned>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it seems like the new behavior matches the expectations.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/14674 Here is the relevant piece of the build log for the reference
|
For loops without loads/stores, where the smallest/widest types are calculated from the reduction, the smallest type returned is always -1U and it actually returns the smallest type as the widest type. This PR fixes the calculation.
This follows from #132190 (comment)