AMDGPU: Add instruction flags when lowering ctor/dtor #111652

Merged
merged 1 commit into from
Oct 9, 2024

Conversation

arsenm (Contributor) commented Oct 9, 2024

These should be well-behaved address computations, so the subtractions are marked nuw/nsw and the shift is marked exact.
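
As a rough illustration of what the added flags assert (a sketch, not LLVM code): `nuw` and `nsw` mark a subtraction as not wrapping in the unsigned and signed sense, and `exact` marks a shift as discarding no set bits; when such an assertion is violated, the IR result is poison. The helper `lastEntryIndex` and its parameter names are hypothetical, and addresses are modeled as plain 64-bit integers.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of LLVM. Mirrors ByteSize/Size/Offset from
// the patch and returns false whenever a flag's precondition is violated
// (the point at which the flagged IR would produce poison).
bool lastEntryIndex(std::uint64_t begin, std::uint64_t end,
                    std::int64_t &offset) {
  const std::int64_t kMin = std::numeric_limits<std::int64_t>::min();
  const std::int64_t kMax = std::numeric_limits<std::int64_t>::max();

  // sub nuw: the unsigned difference must not wrap.
  if (end < begin)
    return false;

  // sub nsw: the signed difference must not overflow.
  const std::int64_t sBegin = static_cast<std::int64_t>(begin);
  const std::int64_t sEnd = static_cast<std::int64_t>(end);
  if ((sBegin > 0 && sEnd < kMin + sBegin) ||
      (sBegin < 0 && sEnd > kMax + sBegin))
    return false;
  const std::int64_t byteSize = sEnd - sBegin;

  // ashr exact by 3: the three low bits shifted out must be zero.
  if ((byteSize & 7) != 0)
    return false;
  const std::int64_t size = byteSize / 8; // same value as the exact shift

  // sub nuw by 1: size must be at least 1 as an unsigned value. The nsw
  // half cannot fail here because size already fits in 61 bits.
  if (size == 0)
    return false;

  offset = size - 1;
  return true;
}
```

With `begin = 0x1000` and `end = 0x1018` (three 8-byte entries) the helper yields offset 2; an empty range (`end == begin`) is rejected by the last check.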

arsenm (Contributor Author) commented Oct 9, 2024

This stack of pull requests is managed by Graphite.

llvmbot (Member) commented Oct 9, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

These should be well behaved address computations.


Full diff: https://github.com/llvm/llvm-project/pull/111652.diff

4 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp (+7-3)
  • (modified) llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll (+2-2)
  • (modified) llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll (+2-2)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp b/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
index ea11002bb6a5fa..a774ad53b5bede 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUCtorDtorLowering.cpp
@@ -112,9 +112,13 @@ static void createInitOrFiniCalls(Function &F, bool IsCtor) {
     Type *Int64Ty = IntegerType::getInt64Ty(C);
     auto *EndPtr = IRB.CreatePtrToInt(End, Int64Ty);
     auto *BeginPtr = IRB.CreatePtrToInt(Begin, Int64Ty);
-    auto *ByteSize = IRB.CreateSub(EndPtr, BeginPtr);
-    auto *Size = IRB.CreateAShr(ByteSize, ConstantInt::get(Int64Ty, 3));
-    auto *Offset = IRB.CreateSub(Size, ConstantInt::get(Int64Ty, 1));
+    auto *ByteSize = IRB.CreateSub(EndPtr, BeginPtr, "", /*HasNUW=*/true,
+                                   /*HasNSW=*/true);
+    auto *Size = IRB.CreateAShr(ByteSize, ConstantInt::get(Int64Ty, 3), "",
+                                /*isExact=*/true);
+    auto *Offset =
+        IRB.CreateSub(Size, ConstantInt::get(Int64Ty, 1), "", /*HasNUW=*/true,
+                      /*HasNSW=*/true);
     Start = IRB.CreateInBoundsGEP(
         PtrArrayTy, Begin,
         ArrayRef<Value *>({ConstantInt::get(Int64Ty, 0), Offset}));
diff --git a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
index a87e07cb57e05e..968871af2d059a 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor-constexpr-alias.ll
@@ -64,8 +64,8 @@ define void @bar() addrspace(1) {
 ; CHECK-LABEL: define weak_odr amdgpu_kernel void @amdgcn.device.fini(
 ; CHECK-SAME: ) #[[ATTR2:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = ashr i64 sub (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP0:%.*]] = ashr exact i64 sub nuw nsw (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
+; CHECK-NEXT:    [[TMP1:%.*]] = sub nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [0 x ptr addrspace(1)], ptr addrspace(1) @__fini_array_start, i64 0, i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp uge ptr addrspace(1) [[TMP2]], @__fini_array_start
 ; CHECK-NEXT:    br i1 [[TMP3]], label [[WHILE_ENTRY:%.*]], label [[WHILE_END:%.*]]
diff --git a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
index a423b320db559d..98497a64e3204c 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-ctor-dtor.ll
@@ -79,8 +79,8 @@ define internal void @bar() {
 ; CHECK-LABEL: define weak_odr amdgpu_kernel void @amdgcn.device.fini(
 ; CHECK-SAME: ) #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = ashr i64 sub (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP0:%.*]] = ashr exact i64 sub nuw nsw (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
+; CHECK-NEXT:    [[TMP1:%.*]] = sub nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [0 x ptr addrspace(1)], ptr addrspace(1) @__fini_array_start, i64 0, i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp uge ptr addrspace(1) [[TMP2]], @__fini_array_start
 ; CHECK-NEXT:    br i1 [[TMP3]], label [[WHILE_ENTRY:%.*]], label [[WHILE_END:%.*]]
diff --git a/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll b/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll
index 309ecb17e79ed1..a137f31c7aeeca 100644
--- a/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll
+++ b/llvm/test/CodeGen/AMDGPU/lower-multiple-ctor-dtor.ll
@@ -71,8 +71,8 @@ define internal void @bar.5() {
 ; CHECK-LABEL: define weak_odr amdgpu_kernel void @amdgcn.device.fini(
 ; CHECK-SAME: ) #[[ATTR1:[0-9]+]] {
 ; CHECK-NEXT:  entry:
-; CHECK-NEXT:    [[TMP0:%.*]] = ashr i64 sub (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
-; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP0:%.*]] = ashr exact i64 sub nuw nsw (i64 ptrtoint (ptr addrspace(1) @__fini_array_end to i64), i64 ptrtoint (ptr addrspace(1) @__fini_array_start to i64)), 3
+; CHECK-NEXT:    [[TMP1:%.*]] = sub nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [0 x ptr addrspace(1)], ptr addrspace(1) @__fini_array_start, i64 0, i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp uge ptr addrspace(1) [[TMP2]], @__fini_array_start
 ; CHECK-NEXT:    br i1 [[TMP3]], label [[WHILE_ENTRY:%.*]], label [[WHILE_END:%.*]]
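
The CHECK lines compute a pointer to the last `__fini_array` entry and compare it against `__fini_array_start` before entering a loop whose body is not shown in these hunks. A minimal sketch of that shape, assuming the loop calls each entry and steps backward until it passes the start (the names `runInReverse`, `callLog`, `demoOrder`, and the sample callbacks are hypothetical):

```cpp
#include <cstddef>
#include <vector>

using Callback = void (*)();

// Hypothetical log so the call order can be observed.
static std::vector<int> callLog;
static void first() { callLog.push_back(1); }
static void second() { callLog.push_back(2); }
static void third() { callLog.push_back(3); }

// Calls every entry in [begin, end) from the last one down to the first.
// An index is used instead of a decrementing pointer so that the loop never
// forms a pointer before `begin`.
void runInReverse(const Callback *begin, const Callback *end) {
  const std::ptrdiff_t count = end - begin;        // plays the role of Size
  for (std::ptrdiff_t i = count - 1; i >= 0; --i)  // starts at Size - 1
    begin[i]();
}

// Runs three sample callbacks and returns the observed order.
std::vector<int> demoOrder() {
  callLog.clear();
  const Callback table[] = {first, second, third};
  runInReverse(table, table + 3);
  return callLog;
}
```

`demoOrder()` returns `{3, 2, 1}`, that is, the entries run from last to first.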

@arsenm arsenm marked this pull request as ready for review October 9, 2024 09:20
Base automatically changed from users/arsenm/amdgpu-more-precise-pointer-types to main October 9, 2024 13:23
@arsenm arsenm force-pushed the users/arsenm/amdgpu-add-flags-ctor-dtor-lowering branch from 61f32fc to 20bc27e Compare October 9, 2024 13:25
-    auto *Offset = IRB.CreateSub(Size, ConstantInt::get(Int64Ty, 1));
+    auto *ByteSize = IRB.CreateSub(EndPtr, BeginPtr, "", /*HasNUW=*/true,
+                                   /*HasNSW=*/true);
+    auto *Size = IRB.CreateAShr(ByteSize, ConstantInt::get(Int64Ty, 3), "",
Contributor Author (arsenm):

Why is this an ashr, actually? I assume end > start always?

Contributor:

I think it's because I typed the GEP so I had to do arr[size / sizeof(void *)] or something, don't remember exactly.
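
The reply refers to indexing through a typed GEP: the index operand counts array elements rather than bytes, so the byte size has to be divided by the element size first (the shift by 3 corresponds to 8-byte pointers). A small sketch of that relationship, using a hypothetical 8-byte `Entry` type in place of a pointer so it behaves the same on any host; the function names are likewise illustrative only:

```cpp
#include <cstddef>
#include <cstdint>

using Entry = std::uint64_t; // hypothetical stand-in for an 8-byte pointer

// Typed indexing: the index is an element count, hence the division.
// Assumes byteSize is a nonzero multiple of sizeof(Entry).
const Entry *lastByIndex(const Entry *arr, std::size_t byteSize) {
  return &arr[byteSize / sizeof(Entry) - 1];
}

// Untyped addressing: the same entry reached with a raw byte offset.
const Entry *lastByByteOffset(const Entry *arr, std::size_t byteSize) {
  const unsigned char *bytes = reinterpret_cast<const unsigned char *>(arr);
  return reinterpret_cast<const Entry *>(bytes + byteSize - sizeof(Entry));
}

// Returns true when both forms name the same element of a 4-entry table.
bool sameLastEntry() {
  static const Entry table[4] = {10, 20, 30, 40};
  return lastByIndex(table, sizeof(table)) ==
             lastByByteOffset(table, sizeof(table)) &&
         *lastByIndex(table, sizeof(table)) == 40;
}
```

Both helpers land on the final entry; only the typed form needs the element count, which is what the division (or the equivalent exact shift) provides.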

arsenm (Contributor Author) commented Oct 9, 2024

Merge activity

  • Oct 9, 10:01 AM EDT: A user started a stack merge that includes this pull request via Graphite.
  • Oct 9, 10:03 AM EDT: A user merged this pull request with Graphite.

@arsenm arsenm merged commit e85fcb7 into main Oct 9, 2024
5 of 8 checks passed
@arsenm arsenm deleted the users/arsenm/amdgpu-add-flags-ctor-dtor-lowering branch October 9, 2024 14:03
3 participants