[AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote #93466

Pierre-vh · 2024-05-27T12:45:25Z

When I rewrote this, I made a mistake in the control flow. I thought we could just stop promoting if an alloca is too big to vectorize, but we can't. Other allocas in the list may be promotable and fit within the budget;

Fixes SWDEV-455343

When I rewrote this, I made a mistake in the control flow. I thought we could just stop promoting if an alloca is too big to vectorize, but we can't. Other allocas in the list may be promotable and fit within the budget; Fixes SWDEV-455343

llvmbot · 2024-05-27T12:45:54Z

@llvm/pr-subscribers-backend-amdgpu

Author: Pierre van Houtryve (Pierre-vh)

Changes

When I rewrote this, I made a mistake in the control flow. I thought we could just stop promoting if an alloca is too big to vectorize, but we can't. Other allocas in the list may be promotable and fit within the budget;

Fixes SWDEV-455343

Full diff: https://github.com/llvm/llvm-project/pull/93466.diff

2 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp (+16-14)
(added) llvm/test/CodeGen/AMDGPU/promote-alloca-budget-exhausted.ll (+41)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
index c0846b123d187..fbda8f973db99 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
@@ -333,22 +333,24 @@ bool AMDGPUPromoteAllocaImpl::run(Function &F, bool PromoteToLDS) {
   bool Changed = false;
   for (AllocaInst *AI : Allocas) {
     const unsigned AllocaCost = DL->getTypeSizeInBits(AI->getAllocatedType());
-    if (AllocaCost > VectorizationBudget) {
-      LLVM_DEBUG(dbgs() << "  Alloca too big for vectorization: " << *AI
-                        << "\n");
-      return Changed;
+    // First, check if we have enough budget to vectorize this alloca.
+    if (AllocaCost <= VectorizationBudget) {
+      // If we do, attempt vectorization, otherwise, fall through and try
+      // promoting to LDS instead.
+      if (tryPromoteAllocaToVector(*AI)) {
+        Changed = true;
+        assert((VectorizationBudget - AllocaCost) < VectorizationBudget &&
+               "Underflow!");
+        VectorizationBudget -= AllocaCost;
+        LLVM_DEBUG(dbgs() << "  Remaining vectorization budget:"
+                          << VectorizationBudget << "\n");
+        continue;
+      }
+    } else {
+      LLVM_DEBUG(dbgs() << "Alloca too big for vectorization (size:" << AllocaCost << ", budget:" << VectorizationBudget << "): " << *AI << "\n");
     }
 
-    if (tryPromoteAllocaToVector(*AI)) {
-      Changed = true;
-      assert((VectorizationBudget - AllocaCost) < VectorizationBudget &&
-             "Underflow!");
-      VectorizationBudget -= AllocaCost;
-      LLVM_DEBUG(dbgs() << "  Remaining vectorization budget:"
-                        << VectorizationBudget << "\n");
-      if (VectorizationBudget == 0)
-        break;
-    } else if (PromoteToLDS && tryPromoteAllocaToLDS(*AI, SufficientLDS))
+    if (PromoteToLDS && tryPromoteAllocaToLDS(*AI, SufficientLDS))
       Changed = true;
   }
 
diff --git a/llvm/test/CodeGen/AMDGPU/promote-alloca-budget-exhausted.ll b/llvm/test/CodeGen/AMDGPU/promote-alloca-budget-exhausted.ll
new file mode 100644
index 0000000000000..e13ab421dfdb7
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/promote-alloca-budget-exhausted.ll
@@ -0,0 +1,41 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-unknown-amdhsa -mcpu=kaveri -amdgpu-promote-alloca-to-vector-limit=128 -passes=amdgpu-promote-alloca-to-vector %s -o - | FileCheck %s
+
+; Check that when we see an alloca that's too big to vectorize given the remaining budget,
+; we don't give up and we keep looking for other allocas to vectorize.
+
+define amdgpu_kernel void @simple_users_scores() {
+; CHECK-LABEL: define amdgpu_kernel void @simple_users_scores(
+; CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[MANYUSERS:%.*]] = alloca [64 x i64], align 4, addrspace(5)
+; CHECK-NEXT:    [[MANYUSERS_1:%.*]] = getelementptr i8, ptr addrspace(5) [[MANYUSERS]], i64 2
+; CHECK-NEXT:    [[V0:%.*]] = load i8, ptr addrspace(5) [[MANYUSERS_1]], align 1
+; CHECK-NEXT:    [[V0_EXT:%.*]] = zext i8 [[V0]] to i64
+; CHECK-NEXT:    store i64 [[V0_EXT]], ptr addrspace(5) [[MANYUSERS_1]], align 8
+; CHECK-NEXT:    [[MANYUSERS_2:%.*]] = getelementptr i8, ptr addrspace(5) [[MANYUSERS]], i64 1
+; CHECK-NEXT:    [[V1:%.*]] = load i8, ptr addrspace(5) [[MANYUSERS_2]], align 1
+; CHECK-NEXT:    [[V1_EXT:%.*]] = zext i8 [[V0]] to i64
+; CHECK-NEXT:    store i64 [[V1_EXT]], ptr addrspace(5) [[MANYUSERS_2]], align 8
+; CHECK-NEXT:    ret void
+;
+entry:
+  ; should get a score of 1
+  %simpleuser = alloca [4 x i64], align 4, addrspace(5)
+  ; should get a score of 4 and be visited first.
+  %manyusers = alloca [64 x i64], align 4, addrspace(5)
+
+  store i64 42, ptr addrspace(5) %simpleuser
+
+  %manyusers.1 = getelementptr i8, ptr addrspace(5) %manyusers, i64 2
+  %v0 = load i8, ptr addrspace(5)  %manyusers.1
+  %v0.ext = zext i8 %v0 to i64
+  store i64 %v0.ext, ptr addrspace(5) %manyusers.1
+
+  %manyusers.2 = getelementptr i8, ptr addrspace(5) %manyusers, i64 1
+  %v1 = load i8, ptr addrspace(5)  %manyusers.2
+  %v1.ext = zext i8 %v0 to i64
+  store i64 %v1.ext, ptr addrspace(5) %manyusers.2
+
+  ret void
+}

Pierre-vh · 2024-05-27T12:46:07Z

llvm/test/CodeGen/AMDGPU/promote-alloca-budget-exhausted.ll

+; Check that when we see an alloca that's too big to vectorize given the remaining budget,
+; we don't give up and we keep looking for other allocas to vectorize.
+
+define amdgpu_kernel void @simple_users_scores() {


Before this patch, %simpleuser wouldn't be promoted because %manyusers is seen first, but it's too big and we'd simply stop.

github-actions · 2024-05-27T12:48:06Z

✅ With the latest revision this PR passed the C/C++ code formatter.

Reverts: 0e73bbd [AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote (llvm#93466) Change-Id: I1d86ab5653aa184da3062366f0f55d26f56f6db2

… to promote (llvm#93466) When I rewrote this, I made a mistake in the control flow. I thought we could just stop promoting if an alloca is too big to vectorize, but we can't. Other allocas in the list may be promotable and fit within the budget. Fixes SWDEV-455343 Fixes SWDEV-464683 Change-Id: Iedeabc3ee1c91500da13503097d2029f2838c8ce

Pierre-vh requested review from jayfoad and arsenm May 27, 2024 12:45

llvmbot added the backend:AMDGPU label May 27, 2024

Pierre-vh commented May 27, 2024

View reviewed changes

clang-format

4b43b2d

arsenm approved these changes May 27, 2024

View reviewed changes

Pierre-vh merged commit 0e73bbd into llvm:main May 28, 2024
7 checks passed

Pierre-vh deleted the fix-promotealloca-bug branch May 28, 2024 06:05

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request May 29, 2024

merge main into amd-staging

fab307f

Reverts: 0e73bbd [AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote (llvm#93466) Change-Id: I1d86ab5653aa184da3062366f0f55d26f56f6db2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote #93466

[AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote #93466

Uh oh!

Pierre-vh commented May 27, 2024

Uh oh!

llvmbot commented May 27, 2024

Uh oh!

Pierre-vh May 27, 2024

Uh oh!

github-actions bot commented May 27, 2024 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

[AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote #93466

[AMDGPU][PromoteAlloca] Don't stop when an alloca is too big to promote #93466

Uh oh!

Conversation

Pierre-vh commented May 27, 2024

Uh oh!

llvmbot commented May 27, 2024

Uh oh!

Pierre-vh May 27, 2024

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented May 27, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions bot commented May 27, 2024 •

edited

Loading