Skip to content

[DSE] Consider the aliasing through global variable while checking clobber #120044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jan 14, 2025

Conversation

haopliu
Copy link
Contributor

@haopliu haopliu commented Dec 16, 2024

While update the read clobber check for the "initializes" attr, we checked the aliasing among arguments, but didn't consider the aliasing through global variable. It causes problems in this example:

int g_var = 123;

void update(int* ptr) {
  *ptr = g_var;

void foo() {
  g_var = 0;
  bar(&g_var);
}

We mistakenly removed g_var = 0; as a dead store.

Fix the issue by requiring the CallBase only access argmem or inaccessiblemem.

@llvmbot
Copy link
Member

llvmbot commented Dec 16, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Haopeng Liu (haopliu)

Changes

While update the read clobber check for the "initializes" attr, we checked the aliasing among arguments, but didn't consider the aliasing through global variable. It causes problems in this example:

int g_var = 123;

void update(int* ptr) {
  *ptr = g_var;

void foo() {
  g_var = 0;
  bar(g_var);
}

We mistakenly removed g_var = 0; as a dead store.

Fix the issue by requiring the CallBase only access argmem or inaccessiblemem.


Full diff: https://github.com/llvm/llvm-project/pull/120044.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp (+8-3)
  • (modified) llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll (+34)
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 4799640089fa9a..19f6bba583941b 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1665,11 +1665,16 @@ struct DSEState {
       // original MD. Stop walk.
       // If KillingDef is a CallInst with "initializes" attribute, the reads in
       // the callee would be dominated by initializations, so it should be safe.
+      // Note that in `getInitializesArgMemLoc`, we only check the aliasing
+      // among arguments. If aliasing through global variables, we consider it
+      // as read clobber.
       bool IsKillingDefFromInitAttr = false;
       if (IsInitializesAttrMemLoc) {
-        if (KillingI == UseInst &&
-            KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr))
-          IsKillingDefFromInitAttr = true;
+        if (auto *CB = dyn_cast<CallBase>(UseInst))
+          IsKillingDefFromInitAttr =
+              KillingI == UseInst &&
+              KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr) &&
+              CB->onlyAccessesInaccessibleMemOrArgMem();
       }
 
       if (isReadClobber(MaybeDeadLoc, UseInst) && !IsKillingDefFromInitAttr) {
diff --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
index d93da9b6612b05..9c040d4d5bb380 100644
--- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
@@ -2,12 +2,22 @@
 ; RUN: opt < %s -passes=dse -enable-dse-initializes-attr-improvement -S | FileCheck %s
 
 declare void @p1_write_only(ptr nocapture noundef writeonly initializes((0, 2)) dead_on_unwind)
+
 declare void @p1_write_then_read(ptr nocapture noundef initializes((0, 2)) dead_on_unwind)
+  memory(argmem: readwrite, inaccessiblemem: readwrite)
+
 declare void @p1_clobber(ptr nocapture noundef)
+
 declare void @p2_same_range(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2)) dead_on_unwind)
+  memory(argmem: readwrite, inaccessiblemem: readwrite)
+
 declare void @p2_no_init(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef dead_on_unwind)
+
 declare void @p2_no_dead_on_unwind(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2)))
+  memory(argmem: readwrite, inaccessiblemem: readwrite)
+
 declare void @p2_no_dead_on_unwind_but_nounwind(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2))) nounwind
+  memory(argmem: readwrite, inaccessiblemem: readwrite)
 
 ; Function Attrs: mustprogress nounwind uwtable
 define i16 @p1_write_only_caller() {
@@ -215,8 +225,12 @@ define i16 @p2_no_dead_on_unwind_but_nounwind_alias_caller() {
 }
 
 declare void @llvm.memset.p0.i64(ptr nocapture, i8, i64, i1) nounwind
+
 declare void @large_p1(ptr nocapture noundef initializes((0, 200))) nounwind
+  memory(argmem: readwrite, inaccessiblemem: readwrite)
+
 declare void @large_p2(ptr nocapture noundef initializes((0, 200)), ptr nocapture noundef initializes((0, 100))) nounwind
+  memory(argmem: readwrite, inaccessiblemem: readwrite)
 
 ; Function Attrs: mustprogress nounwind uwtable
 define i16 @large_p1_caller() {
@@ -299,3 +313,23 @@ define i16 @large_p2_may_or_partial_alias_caller2(ptr %base1, ptr %base2) {
   ret i16 %l
 }
 
+@g = global i16 123, align 2
+
+declare void @read_global(ptr nocapture noundef initializes((0, 2)))
+  memory(read, argmem: write, inaccessiblemem: none) nounwind
+
+define i16 @global_var_alias() {
+; CHECK-LABEL: @global_var_alias(
+; CHECK-NEXT:    store i32 0, ptr @g, align 4
+; CHECK-NEXT:    [[G_ADDR:%.*]] = getelementptr i32, ptr @g, i64 1
+; CHECK-NEXT:    call void @read_global(ptr [[G_ADDR]])
+; CHECK-NEXT:    [[L:%.*]] = load i16, ptr @g, align 2
+; CHECK-NEXT:    ret i16 [[L]]
+;
+  store i32 0, ptr @g, align 4
+  %g_addr = getelementptr i32, ptr @g, i64 1
+  call void @read_global(ptr %g_addr)
+  %l = load i16, ptr @g
+  ret i16 %l
+}
+

@dtcxzyw dtcxzyw changed the title Consider the aliasing through global variable while checking clobber [DSE] Consider the aliasing through global variable while checking clobber Dec 16, 2024
IsKillingDefFromInitAttr =
KillingI == UseInst &&
KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr) &&
CB->onlyAccessesInaccessibleMemOrArgMem();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make more sense to handle this inside getInitializesArgMemLoc?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this check to inside getInitializesArgMemLoc. Thanks!

Copy link
Contributor

@aeubanks aeubanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's unfortunate to require onlyAccessesInaccessibleMemOrArgMem() for all of the cases, it'll probably prevent a lot of valid DSE. can we be more precise on when we need onlyAccessesInaccessibleMemOrArgMem()? if the pointer hasn't escaped yet then we don't need this restriction

; CHECK-NEXT: [[L:%.*]] = load i16, ptr @g, align 2
; CHECK-NEXT: ret i16 [[L]]
;
store i32 0, ptr @g, align 4
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this storing an i32 value to an i16 global? and the gep is out of bounds?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, fixed the mistake. Thanks.

@haopliu
Copy link
Contributor Author

haopliu commented Dec 21, 2024

I think it's unfortunate to require onlyAccessesInaccessibleMemOrArgMem() for all of the cases, it'll probably prevent a lot of valid DSE. can we be more precise on when we need onlyAccessesInaccessibleMemOrArgMem()? if the pointer hasn't escaped yet then we don't need this restriction

Updated the check to require either the argument is from Alloca that doesn't escape or CB->onlyAccessesInaccessibleMemOrArgMem. Otherwise, discard the "initializes" info.

PTAL, thanks.

@nikic
Copy link
Contributor

nikic commented Dec 21, 2024

The check for alloca should be a check for isIdentifiedFunctionLocal and your ValidFromAlloca function should be PointerMayBeCaptured -- though it probably makes sense to query it via EarliestEscapeAnalysis, which is a bit more precise and cached.

@haopliu
Copy link
Contributor Author

haopliu commented Dec 22, 2024

Thanks! Updated the check to use isIdentifiedFunctionLocal and EarliestEscapeAnalysis::isNotCapturedBefore.

// Return true if "Arg" is function local and isn't captured before "CB" or
// if "Arg" is GEP whose base pointer is function local and isn't captured
// before "CB".
bool IsFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
bool IsFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB,
bool isFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

EarliestEscapeAnalysis &EA) {
if (isIdentifiedFunctionLocal(Arg))
return EA.isNotCapturedBefore(Arg, CB, /*OrAt*/ true);
const auto *GEP = dyn_cast<GetElementPtrInst>(Arg);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the result of getUnderlyingObject instead, it will look through multiple GEPs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, done. Thanks!

;
store i16 0, ptr @g, align 4
%g_addr = getelementptr i8, ptr @g, i64 1
call void @read_global(ptr %g_addr)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like this case would still go out of bounds for the i16 global?
%g_addr = 1 byte from the start
then @read_global initializes 2 bytes from there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooops, removed the GEP and directly read/write @g. Thanks.

Copy link
Contributor

@jvoung jvoung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM w.r.t. my comments

Nice recommendations for isIdentifiedFunctionLocal( and isNotCapturedBefore =)

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// either it's function local and isn't captured before or the "CB" only
// accesses arg or inaccessible mem.
if (!Inits.empty() && !isFuncLocalAndNotCaptured(CurArg, CB, EA) &&
!CB->onlyAccessesInaccessibleMemOrArgMem())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd check onlyAccessesInaccessibleMemOrArgMem before isFuncLocalAndNotCaptured as it's cheaper.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for the reminder! Done.

@@ -39,6 +39,26 @@ define i16 @p1_write_then_read_caller() {
ret i16 %l
}

declare void @fn_capture(ptr noundef)
; Function Attrs: mustprogress nounwind uwtable
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop inaccurate Function Attrs comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -39,6 +39,26 @@ define i16 @p1_write_then_read_caller() {
ret i16 %l
}

declare void @fn_capture(ptr noundef)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drop noundef, it's not relevant here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@haopliu haopliu merged commit fed817a into llvm:main Jan 14, 2025
8 checks passed
haopliu added a commit that referenced this pull request Jan 23, 2025
(Retry) enable the initializes improvement in DSE.

Initially enabled in #119116.

Fix the aliasing issue through global variables in
#120044.

The compile-time comparison of this enabling (no meaningful diff):
https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
github-actions bot pushed a commit to arm/arm-toolchain that referenced this pull request Jan 24, 2025
(Retry) enable the initializes improvement in DSE.

Initially enabled in llvm/llvm-project#119116.

Fix the aliasing issue through global variables in
llvm/llvm-project#120044.

The compile-time comparison of this enabling (no meaningful diff):
https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants