AMDGPU: Add instruction flags when lowering ctor/dtor #111652

arsenm · 2024-10-09T09:17:21Z

These should be well behaved address computations.

arsenm · 2024-10-09T09:17:31Z

AMDGPU: Add instruction flags when lowering ctor/dtor #111652 👈
AMDGPU: Use pointer types more consistently #111651
main

This stack of pull requests is managed by Graphite.

llvmbot · 2024-10-09T09:18:23Z

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

These should be well behaved address computations.

Full diff: https://github.com/llvm/llvm-project/pull/111652.diff

4 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp (+7-3)
(modified) llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll (+2-2)
(modified) llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll (+2-2)
(modified) llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll (+2-2)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
index ea11002bb6a5fa..a774ad53b5bede 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
@@ -112,9 +112,13 @@ static void createInitOrFiniCalls(Function &F, bool IsCtor) {
     Type *Int64Ty = IntegerType::getInt64Ty(C);
     auto *EndPtr = IRB.CreatePtrToInt(End, Int64Ty);
     auto *BeginPtr = IRB.CreatePtrToInt(Begin, Int64Ty);
-    auto *ByteSize = IRB.CreateSub(EndPtr, BeginPtr);
-    auto *Size = IRB.CreateAShr(ByteSize, ConstantInt::get(Int64Ty, 3));
-    auto *Offset = IRB.CreateSub(Size, ConstantInt::get(Int64Ty, 1));
+    auto *ByteSize = IRB.CreateSub(EndPtr, BeginPtr, "", /*HasNUW=*/true,
+                                   /*HasNSW=*/true);
+    auto *Size = IRB.CreateAShr(ByteSize, ConstantInt::get(Int64Ty, 3), "",
+                                /*isExact=*/true);
+    auto *Offset =
+        IRB.CreateSub(Size, ConstantInt::get(Int64Ty, 1), "", /*HasNUW=*/true,
+                      /*HasNSW=*/true);
     Start = IRB.CreateInBoundsGEP(
         PtrArrayTy, Begin,
         ArrayRef<Value *>({ConstantInt::get(Int64Ty, 0), Offset}));
diff --git a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
index a87e07cb57e05e..968871af2d059a 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
@@ -64,8 +64,8 @@ define void @bar() addrspace(1) {
 ; CHECK-LABEL: define weak_odr amdgpu_kernel void @amdgcn.device.fini(
 ; CHECK-SAME: ) #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = ashr i64 sub (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP0:%.*]] = ashr exact i64 sub nuw nsw (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
+; CHECK-NEXT:    [[TMP1:%.*]] = sub nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [0 x ptr addrspace(1)], ptr addrspace(1) @__fini_array_start, i64 0, i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp uge ptr addrspace(1) [[TMP2]], @__fini_array_start
 ; CHECK-NEXT:    br i1 [[TMP3]], label [[WHILE_ENTRY:%.*]], label [[WHILE_END:%.*]]
diff --git a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
index a423b320db559d..98497a64e3204c 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
@@ -79,8 +79,8 @@ define internal void @bar() {
 ; CHECK-LABEL: define weak_odr amdgpu_kernel void @amdgcn.device.fini(
 ; CHECK-SAME: ) #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = ashr i64 sub (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP0:%.*]] = ashr exact i64 sub nuw nsw (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
+; CHECK-NEXT:    [[TMP1:%.*]] = sub nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [0 x ptr addrspace(1)], ptr addrspace(1) @__fini_array_start, i64 0, i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp uge ptr addrspace(1) [[TMP2]], @__fini_array_start
 ; CHECK-NEXT:    br i1 [[TMP3]], label [[WHILE_ENTRY:%.*]], label [[WHILE_END:%.*]]
diff --git a/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll b/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll
index 309ecb17e79ed1..a137f31c7aeeca 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll
@@ -71,8 +71,8 @@ define internal void @bar.5() {
 ; CHECK-LABEL: define weak_odr amdgpu_kernel void @amdgcn.device.fini(
 ; CHECK-SAME: ) #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = ashr i64 sub (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP0:%.*]] = ashr exact i64 sub nuw nsw (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
+; CHECK-NEXT:    [[TMP1:%.*]] = sub nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [0 x ptr addrspace(1)], ptr addrspace(1) @__fini_array_start, i64 0, i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp uge ptr addrspace(1) [[TMP2]], @__fini_array_start
 ; CHECK-NEXT:    br i1 [[TMP3]], label [[WHILE_ENTRY:%.*]], label [[WHILE_END:%.*]]


arsenm · 2024-10-09T13:28:58Z

llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp

-    auto *Offset = IRB.CreateSub(Size, ConstantInt::get(Int64Ty, 1));
+    auto *ByteSize = IRB.CreateSub(EndPtr, BeginPtr, "", /*HasNUW=*/true,
+                                   /*HasNSW=*/true);
+    auto *Size = IRB.CreateAShr(ByteSize, ConstantInt::get(Int64Ty, 3), "",


Why is this ashr actually? I assume end > start always?

jhuber6 · 2024-10-09

I think it's because I typed the GEP, so I had to do arr[size / sizeof(void *)] or something; I don't remember exactly.
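
As a rough illustration of the exchange above, the arithmetic the pass emits can be modelled in plain C++. This is a sketch, not LLVM code: the function names and sample addresses are hypothetical, and the 8-byte entry size is an assumption read off the shift by 3. Under the conditions the new flags assert (end is not below begin, the byte size is a multiple of 8, and the array has at least one entry), an arithmetic shift, a logical shift, and a division by the entry size should all give the same index.

```cpp
#include <cstdint>

// Hypothetical stand-in for the pass's arithmetic; not LLVM code.
// Assumption: array entries are 8 bytes, as the shift by 3 implies.
constexpr uint64_t EntrySize = 8;

// Index of the last entry, computed the way the lowered IR does:
// (End - Begin) >> 3, then minus 1.
// Preconditions (what nuw/nsw/exact assert): End >= Begin, the byte
// size is a multiple of 8, and there is at least one entry.
constexpr int64_t lastEntryIndex(uint64_t Begin, uint64_t End) {
  // For a non-negative byte size, this arithmetic shift gives the
  // same result as a logical shift.
  return (static_cast<int64_t>(End - Begin) >> 3) - 1;
}

// The same index written as the division mentioned in the reply:
// arr[size / sizeof(void *)], minus one to land on the last entry.
constexpr int64_t lastEntryIndexByDivision(uint64_t Begin, uint64_t End) {
  return static_cast<int64_t>((End - Begin) / EntrySize) - 1;
}
```

For a three-entry array spanning 0x1000 to 0x1018, both functions return 2, the index of the last entry. If any precondition is violated, the flagged IR instructions produce poison, which this model does not attempt to represent.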

arsenm · 2024-10-09T14:01:48Z

Merge activity

Oct 9, 10:01 AM EDT: A user started a stack merge that includes this pull request via Graphite.
Oct 9, 10:03 AM EDT: A user merged this pull request with Graphite.

arsenm mentioned this pull request Oct 9, 2024

AMDGPU: Use pointer types more consistently #111651

Merged

arsenm added the backend:AMDGPU label Oct 9, 2024 — with Graphite App

arsenm requested a review from jhuber6 October 9, 2024 09:18

arsenm marked this pull request as ready for review October 9, 2024 09:20

jhuber6 approved these changes Oct 9, 2024


Base automatically changed from users/arsenm/amdgpu-more-precise-pointer-types to main October 9, 2024 13:23

AMDGPU: Add instruction flags when lowering ctor/dtor

20bc27e

These should be well behaved address computations.

arsenm force-pushed the users/arsenm/amdgpu-add-flags-ctor-dtor-lowering branch from 61f32fc to 20bc27e Compare October 9, 2024 13:25


arsenm merged commit e85fcb7 into main Oct 9, 2024
5 of 8 checks passed

arsenm deleted the users/arsenm/amdgpu-add-flags-ctor-dtor-lowering branch October 9, 2024 14:03
