-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[DSE] Consider the aliasing through global variable while checking clobber #120044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms Author: Haopeng Liu (haopliu) ChangesWhile update the read clobber check for the "initializes" attr, we checked the aliasing among arguments, but didn't consider the aliasing through global variable. It causes problems in this example:
We mistakenly removed Fix the issue by requiring the CallBase only access argmem or inaccessiblemem. Full diff: https://github.com/llvm/llvm-project/pull/120044.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
index 4799640089fa9a..19f6bba583941b 100644
--- a/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
+++ b/llvm/lib/Transforms/Scalar/DeadStoreElimination.cpp
@@ -1665,11 +1665,16 @@ struct DSEState {
// original MD. Stop walk.
// If KillingDef is a CallInst with "initializes" attribute, the reads in
// the callee would be dominated by initializations, so it should be safe.
+ // Note that in `getInitializesArgMemLoc`, we only check the aliasing
+ // among arguments. If aliasing through global variables, we consider it
+ // as read clobber.
bool IsKillingDefFromInitAttr = false;
if (IsInitializesAttrMemLoc) {
- if (KillingI == UseInst &&
- KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr))
- IsKillingDefFromInitAttr = true;
+ if (auto *CB = dyn_cast<CallBase>(UseInst))
+ IsKillingDefFromInitAttr =
+ KillingI == UseInst &&
+ KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr) &&
+ CB->onlyAccessesInaccessibleMemOrArgMem();
}
if (isReadClobber(MaybeDeadLoc, UseInst) && !IsKillingDefFromInitAttr) {
diff --git a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
index d93da9b6612b05..9c040d4d5bb380 100644
--- a/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
+++ b/llvm/test/Transforms/DeadStoreElimination/inter-procedural.ll
@@ -2,12 +2,22 @@
; RUN: opt < %s -passes=dse -enable-dse-initializes-attr-improvement -S | FileCheck %s
declare void @p1_write_only(ptr nocapture noundef writeonly initializes((0, 2)) dead_on_unwind)
+
declare void @p1_write_then_read(ptr nocapture noundef initializes((0, 2)) dead_on_unwind)
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @p1_clobber(ptr nocapture noundef)
+
declare void @p2_same_range(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2)) dead_on_unwind)
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @p2_no_init(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef dead_on_unwind)
+
declare void @p2_no_dead_on_unwind(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2)))
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @p2_no_dead_on_unwind_but_nounwind(ptr nocapture noundef initializes((0, 2)) dead_on_unwind, ptr nocapture noundef initializes((0, 2))) nounwind
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
; Function Attrs: mustprogress nounwind uwtable
define i16 @p1_write_only_caller() {
@@ -215,8 +225,12 @@ define i16 @p2_no_dead_on_unwind_but_nounwind_alias_caller() {
}
declare void @llvm.memset.p0.i64(ptr nocapture, i8, i64, i1) nounwind
+
declare void @large_p1(ptr nocapture noundef initializes((0, 200))) nounwind
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
+
declare void @large_p2(ptr nocapture noundef initializes((0, 200)), ptr nocapture noundef initializes((0, 100))) nounwind
+ memory(argmem: readwrite, inaccessiblemem: readwrite)
; Function Attrs: mustprogress nounwind uwtable
define i16 @large_p1_caller() {
@@ -299,3 +313,23 @@ define i16 @large_p2_may_or_partial_alias_caller2(ptr %base1, ptr %base2) {
ret i16 %l
}
+@g = global i16 123, align 2
+
+declare void @read_global(ptr nocapture noundef initializes((0, 2)))
+ memory(read, argmem: write, inaccessiblemem: none) nounwind
+
+define i16 @global_var_alias() {
+; CHECK-LABEL: @global_var_alias(
+; CHECK-NEXT: store i32 0, ptr @g, align 4
+; CHECK-NEXT: [[G_ADDR:%.*]] = getelementptr i32, ptr @g, i64 1
+; CHECK-NEXT: call void @read_global(ptr [[G_ADDR]])
+; CHECK-NEXT: [[L:%.*]] = load i16, ptr @g, align 2
+; CHECK-NEXT: ret i16 [[L]]
+;
+ store i32 0, ptr @g, align 4
+ %g_addr = getelementptr i32, ptr @g, i64 1
+ call void @read_global(ptr %g_addr)
+ %l = load i16, ptr @g
+ ret i16 %l
+}
+
|
IsKillingDefFromInitAttr = | ||
KillingI == UseInst && | ||
KillingUndObj == getUnderlyingObject(MaybeDeadLoc.Ptr) && | ||
CB->onlyAccessesInaccessibleMemOrArgMem(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make more sense to handle this inside getInitializesArgMemLoc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved this check to inside getInitializesArgMemLoc
. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's unfortunate to require onlyAccessesInaccessibleMemOrArgMem()
for all of the cases, it'll probably prevent a lot of valid DSE. can we be more precise on when we need onlyAccessesInaccessibleMemOrArgMem()
? if the pointer hasn't escaped yet then we don't need this restriction
; CHECK-NEXT: [[L:%.*]] = load i16, ptr @g, align 2 | ||
; CHECK-NEXT: ret i16 [[L]] | ||
; | ||
store i32 0, ptr @g, align 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isn't this storing an i32 value to an i16 global? and the gep is out of bounds?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, fixed the mistake. Thanks.
Updated the check to require either the argument is from Alloca that doesn't escape or PTAL, thanks. |
The check for alloca should be a check for isIdentifiedFunctionLocal and your ValidFromAlloca function should be PointerMayBeCaptured -- though it probably makes sense to query it via EarliestEscapeAnalysis, which is a bit more precise and cached. |
Thanks! Updated the check to use |
// Return true if "Arg" is function local and isn't captured before "CB" or | ||
// if "Arg" is GEP whose base pointer is function local and isn't captured | ||
// before "CB". | ||
bool IsFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool IsFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB, | |
bool isFuncLocalAndNotCaptured(Value *Arg, const CallBase *CB, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
EarliestEscapeAnalysis &EA) { | ||
if (isIdentifiedFunctionLocal(Arg)) | ||
return EA.isNotCapturedBefore(Arg, CB, /*OrAt*/ true); | ||
const auto *GEP = dyn_cast<GetElementPtrInst>(Arg); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check the result of getUnderlyingObject instead, it will look through multiple GEPs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, done. Thanks!
; | ||
store i16 0, ptr @g, align 4 | ||
%g_addr = getelementptr i8, ptr @g, i64 1 | ||
call void @read_global(ptr %g_addr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like this case would still go out of bounds for the i16 global?
%g_addr = 1 byte from the start
then @read_global initializes 2 bytes from there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooops, removed the GEP and directly read/write @g. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM w.r.t. my comments
Nice recommendations for isIdentifiedFunctionLocal( and isNotCapturedBefore =)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
// either it's function local and isn't captured before or the "CB" only | ||
// accesses arg or inaccessible mem. | ||
if (!Inits.empty() && !isFuncLocalAndNotCaptured(CurArg, CB, EA) && | ||
!CB->onlyAccessesInaccessibleMemOrArgMem()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd check onlyAccessesInaccessibleMemOrArgMem before isFuncLocalAndNotCaptured as it's cheaper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the reminder! Done.
@@ -39,6 +39,26 @@ define i16 @p1_write_then_read_caller() { | |||
ret i16 %l | |||
} | |||
|
|||
declare void @fn_capture(ptr noundef) | |||
; Function Attrs: mustprogress nounwind uwtable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop inaccurate Function Attrs comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -39,6 +39,26 @@ define i16 @p1_write_then_read_caller() { | |||
ret i16 %l | |||
} | |||
|
|||
declare void @fn_capture(ptr noundef) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Drop noundef, it's not relevant here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
(Retry) enable the initializes improvement in DSE. Initially enabled in #119116. Fix the aliasing issue through global variables in #120044. The compile-time comparison of this enabling (no meaningful diff): https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
(Retry) enable the initializes improvement in DSE. Initially enabled in llvm/llvm-project#119116. Fix the aliasing issue through global variables in llvm/llvm-project#120044. The compile-time comparison of this enabling (no meaningful diff): https://llvm-compile-time-tracker.com/compare.php?from=b46fcb9fa32f24660b1b8858d5c4cbdb76ef9d8b&to=33dc817b81f7898c87b052d1ddfd3d6e6f5b5dbd&stat=instructions%3Au
While update the read clobber check for the "initializes" attr, we checked the aliasing among arguments, but didn't consider the aliasing through global variable. It causes problems in this example:
We mistakenly removed
g_var = 0;
as a dead store.Fix the issue by requiring the CallBase only access argmem or inaccessiblemem.