Skip to content

[LV] Fix '-1U' bits for smallest type in getSmallestAndWidestTypes #135783

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

sdesmalen-arm
Copy link
Collaborator

For loops without loads/stores, where the smallest/widest types are calculated from the reduction, the smallest type returned is always -1U and it actually returns the smallest type as the widest type. This PR fixes the calculation.

This follows from #132190 (comment)

For loops without loads/stores, where the smallest/widest types are calculated
from the reduction, the smallest type returned is always -1U and it actually
returns the smallest type as the widest type. This PR fixes the calculation.
@llvmbot
Copy link
Member

llvmbot commented Apr 15, 2025

@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Sander de Smalen (sdesmalen-arm)

Changes

For loops without loads/stores, where the smallest/widest types are calculated from the reduction, the smallest type returned is always -1U and it actually returns the smallest type as the widest type. This PR fixes the calculation.

This follows from #132190 (comment)


Full diff: https://github.com/llvm/llvm-project/pull/135783.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+4-5)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll (+3-3)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index af94dc01c8c5c..c21b6c4c88929 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -4798,17 +4798,16 @@ LoopVectorizationCostModel::getSmallestAndWidestTypes() {
   // if there are no loads/stores in the loop. In this case, check through the
   // reduction variables to determine the maximum width.
   if (ElementTypesInLoop.empty() && !Legal->getReductionVars().empty()) {
-    // Reset MaxWidth so that we can find the smallest type used by recurrences
-    // in the loop.
-    MaxWidth = -1U;
     for (const auto &PhiDescriptorPair : Legal->getReductionVars()) {
       const RecurrenceDescriptor &RdxDesc = PhiDescriptorPair.second;
       // When finding the min width used by the recurrence we need to account
       // for casts on the input operands of the recurrence.
-      MaxWidth = std::min<unsigned>(
-          MaxWidth, std::min<unsigned>(
+      MinWidth = std::min<unsigned>(
+          MinWidth, std::min<unsigned>(
                         RdxDesc.getMinWidthCastToRecurrenceTypeInBits(),
                         RdxDesc.getRecurrenceType()->getScalarSizeInBits()));
+      MaxWidth = std::max<unsigned>(
+          MaxWidth, RdxDesc.getRecurrenceType()->getScalarSizeInBits());
     }
   } else {
     for (Type *T : ElementTypesInLoop) {
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll b/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
index 269562fa70549..b34ba5e38811a 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/smallest-and-widest-types.ll
@@ -37,7 +37,7 @@ for.end:
 ; chosen. The following 3 cases check different combinations of widths.
 
 ; CHECK-LABEL: Checking a loop in 'no_loads_stores_32'
-; CHECK: The Smallest and Widest types: 4294967295 / 32 bits
+; CHECK: The Smallest and Widest types: 32 / 64 bits
 ; CHECK: Selecting VF: 4
 
 define double @no_loads_stores_32(i32 %n) {
@@ -60,7 +60,7 @@ for.end:
 }
 
 ; CHECK-LABEL: Checking a loop in 'no_loads_stores_16'
-; CHECK: The Smallest and Widest types: 4294967295 / 16 bits
+; CHECK: The Smallest and Widest types: 16 / 64 bits
 ; CHECK: Selecting VF: 8
 
 define double @no_loads_stores_16() {
@@ -82,7 +82,7 @@ for.end:
 }
 
 ; CHECK-LABEL: Checking a loop in 'no_loads_stores_8'
-; CHECK: The Smallest and Widest types: 4294967295 / 8 bits
+; CHECK: The Smallest and Widest types: 8 / 32 bits
 ; CHECK: Selecting VF: 16
 
 define float @no_loads_stores_8() {

RdxDesc.getMinWidthCastToRecurrenceTypeInBits(),
RdxDesc.getRecurrenceType()->getScalarSizeInBits()));
MaxWidth = std::max<unsigned>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does also change how MaxWidth is calculated, previously MaxWidth was what MinWidth is now?

Granted, it seems to make sense to change this as well, although I am not sure if this was done intentionally when support for in-loop reductions originally

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change does also change how MaxWidth is calculated, previously MaxWidth was what MinWidth is now?

Yes, that's right.

I am not sure if this was done intentionally when support for in-loop reductions originally

I wonder if MaxWidth got confused with the resulting maximum vector width (/factor) at some point. The original patch that added this was https://reviews.llvm.org/D113973.

The tests added in that patch contain two different types, particularly for testing this purpose, although the CHECK lines (e.g. CHECK: The Smallest and Widest types: 4294967295 / 16 bits) seemed wrong, but after this change it is what I would expect it to be.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it seems like the new behavior matches the expectations.

Copy link
Collaborator

@SamTebbs33 SamTebbs33 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you.

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks

RdxDesc.getMinWidthCastToRecurrenceTypeInBits(),
RdxDesc.getRecurrenceType()->getScalarSizeInBits()));
MaxWidth = std::max<unsigned>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it seems like the new behavior matches the expectations.

@sdesmalen-arm sdesmalen-arm merged commit f9c01b5 into llvm:main Apr 17, 2025
14 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Apr 17, 2025

LLVM Buildbot has detected a new failure on builder arc-builder running on arc-worker while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/3/builds/14674

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: 1200 seconds without output running [b'ninja', b'check-all'], attempting to kill
...
..............................................................................................................................................................
----------------------------------------------------------------------
Ran 158 tests in 2.494s

OK
10.643 [31/18/30] Linking CXX executable unittests/Debuginfod/DebuginfodTests
11.126 [30/18/31] Linking CXX executable unittests/DebugInfo/PDB/DebugInfoPDBTests
11.746 [29/18/32] Linking CXX executable unittests/ExecutionEngine/JITLink/JITLinkTests
11.894 [28/18/33] Linking CXX executable unittests/ExecutionEngine/ExecutionEngineTests
12.283 [27/18/34] Linking CXX executable unittests/InterfaceStub/InterfaceStubTests
command timed out: 1200 seconds without output running [b'ninja', b'check-all'], attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=1213.276590

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants