-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[LAA] Make Ptr argument optional in isNoWrap. #127410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-analysis Author: Florian Hahn (fhahn) ChangesUpdate isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed. Full diff: https://github.com/llvm/llvm-project/pull/127410.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/LoopAccessAnalysis.cpp b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
index 7d6dbd51a404d..63bb79f061bea 100644
--- a/llvm/lib/Analysis/LoopAccessAnalysis.cpp
+++ b/llvm/lib/Analysis/LoopAccessAnalysis.cpp
@@ -799,8 +799,13 @@ getStrideFromAddRec(const SCEVAddRecExpr *AR, const Loop *Lp, Type *AccessTy,
Value *Ptr, PredicatedScalarEvolution &PSE) {
// The access function must stride over the innermost loop.
if (Lp != AR->getLoop()) {
- LLVM_DEBUG(dbgs() << "LAA: Bad stride - Not striding over innermost loop "
- << *Ptr << " SCEV: " << *AR << "\n");
+ LLVM_DEBUG({
+ dbgs() << "LAA: Bad stride - Not striding over innermost loop ";
+ if (Ptr)
+ dbgs() << *Ptr << " ";
+
+ dbgs() << "SCEV: " << *AR << "\n";
+ });
return std::nullopt;
}
@@ -810,8 +815,12 @@ getStrideFromAddRec(const SCEVAddRecExpr *AR, const Loop *Lp, Type *AccessTy,
// Calculate the pointer stride and check if it is constant.
const SCEVConstant *C = dyn_cast<SCEVConstant>(Step);
if (!C) {
- LLVM_DEBUG(dbgs() << "LAA: Bad stride - Not a constant strided " << *Ptr
- << " SCEV: " << *AR << "\n");
+ LLVM_DEBUG({
+ dbgs() << "LAA: Bad stride - Not a constant strided ";
+ if (Ptr)
+ dbgs() << *Ptr << " ";
+ dbgs() << "SCEV: " << *AR << "\n";
+ });
return std::nullopt;
}
@@ -838,7 +847,8 @@ getStrideFromAddRec(const SCEVAddRecExpr *AR, const Loop *Lp, Type *AccessTy,
static bool isNoWrapAddRec(Value *Ptr, const SCEVAddRecExpr *AR,
PredicatedScalarEvolution &PSE, const Loop *L);
-/// Check whether a pointer address cannot wrap.
+/// Check whether a pointer address cannot wrap. If \p Ptr is not nullptr, use
+/// informating from the IR pointer value to determine no-wrap.
static bool isNoWrap(PredicatedScalarEvolution &PSE, const SCEVAddRecExpr *AR,
Value *Ptr, Type *AccessTy, const Loop *L, bool Assume,
std::optional<int64_t> Stride = std::nullopt) {
@@ -852,7 +862,7 @@ static bool isNoWrap(PredicatedScalarEvolution &PSE, const SCEVAddRecExpr *AR,
// location will be larger than half the pointer index type space. In that
// case, the GEP would be poison and any memory access dependent on it would
// be immediate UB when executed.
- if (auto *GEP = dyn_cast<GetElementPtrInst>(Ptr);
+ if (auto *GEP = dyn_cast_if_present<GetElementPtrInst>(Ptr);
GEP && GEP->hasNoUnsignedSignedWrap())
return true;
@@ -868,6 +878,9 @@ static bool isNoWrap(PredicatedScalarEvolution &PSE, const SCEVAddRecExpr *AR,
return true;
}
+ if (!Ptr)
+ return false;
+
if (Assume) {
PSE.setNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW);
LLVM_DEBUG(dbgs() << "LAA: Pointer may wrap:\n"
@@ -1135,13 +1148,10 @@ bool AccessAnalysis::createCheckForAccess(RuntimePointerChecking &RtCheck,
// When we run after a failing dependency check we have to make sure
// we don't have wrapping pointers.
- if (ShouldCheckWrap) {
- // Skip wrap checking when translating pointers.
- if (TranslatedPtrs.size() > 1)
- return false;
-
- if (!isNoWrap(PSE, AR, Ptr, AccessTy, TheLoop, Assume))
- return false;
+ if (ShouldCheckWrap &&
+ !isNoWrap(PSE, AR, TranslatedPtrs.size() == 1 ? Ptr : nullptr, AccessTy,
+ TheLoop, Assume)) {
+ return false;
}
}
@@ -1454,6 +1464,9 @@ static bool isNoWrapAddRec(Value *Ptr, const SCEVAddRecExpr *AR,
if (AR->getNoWrapFlags(SCEV::NoWrapMask))
return true;
+ if (!Ptr)
+ return false;
+
if (PSE.hasNoOverflow(Ptr, SCEVWrapPredicate::IncrementNUSW))
return true;
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll b/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll
index 5e9dc7f2b91cc..38b7389ae9083 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/retry-runtime-checks-after-dependence-analysis-forked-pointers.ll
@@ -83,10 +83,52 @@ exit:
define void @dependency_check_and_runtime_checks_needed_select_of_ptr_add_recs(ptr %a, ptr %b, ptr %c, i64 %offset, i64 %n) {
; CHECK-LABEL: 'dependency_check_and_runtime_checks_needed_select_of_ptr_add_recs'
; CHECK-NEXT: loop:
-; CHECK-NEXT: Report: cannot check memory dependencies at runtime
+; CHECK-NEXT: Memory dependences are safe with run-time checks
; CHECK-NEXT: Dependences:
; CHECK-NEXT: Run-time memory checks:
+; CHECK-NEXT: Check 0:
+; CHECK-NEXT: Comparing group ([[GRP5:0x[0-9a-f]+]]):
+; CHECK-NEXT: %gep.a.iv = getelementptr inbounds float, ptr %a, i64 %iv
+; CHECK-NEXT: Against group ([[GRP6:0x[0-9a-f]+]]):
+; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.b, ptr %gep.c
+; CHECK-NEXT: Check 1:
+; CHECK-NEXT: Comparing group ([[GRP5]]):
+; CHECK-NEXT: %gep.a.iv = getelementptr inbounds float, ptr %a, i64 %iv
+; CHECK-NEXT: Against group ([[GRP7:0x[0-9a-f]+]]):
+; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.b, ptr %gep.c
+; CHECK-NEXT: Check 2:
+; CHECK-NEXT: Comparing group ([[GRP5]]):
+; CHECK-NEXT: %gep.a.iv = getelementptr inbounds float, ptr %a, i64 %iv
+; CHECK-NEXT: Against group ([[GRP8:0x[0-9a-f]+]]):
+; CHECK-NEXT: %gep.a.iv.off = getelementptr inbounds float, ptr %a, i64 %iv.offset
+; CHECK-NEXT: Check 3:
+; CHECK-NEXT: Comparing group ([[GRP6]]):
+; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.b, ptr %gep.c
+; CHECK-NEXT: Against group ([[GRP7]]):
+; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.b, ptr %gep.c
+; CHECK-NEXT: Check 4:
+; CHECK-NEXT: Comparing group ([[GRP6]]):
+; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.b, ptr %gep.c
+; CHECK-NEXT: Against group ([[GRP8]]):
+; CHECK-NEXT: %gep.a.iv.off = getelementptr inbounds float, ptr %a, i64 %iv.offset
+; CHECK-NEXT: Check 5:
+; CHECK-NEXT: Comparing group ([[GRP7]]):
+; CHECK-NEXT: %select = select i1 %cmp, ptr %gep.b, ptr %gep.c
+; CHECK-NEXT: Against group ([[GRP8]]):
+; CHECK-NEXT: %gep.a.iv.off = getelementptr inbounds float, ptr %a, i64 %iv.offset
; CHECK-NEXT: Grouped accesses:
+; CHECK-NEXT: Group [[GRP5]]:
+; CHECK-NEXT: (Low: %a High: ((4 * %n) + %a))
+; CHECK-NEXT: Member: {%a,+,4}<nuw><%loop>
+; CHECK-NEXT: Group [[GRP6]]:
+; CHECK-NEXT: (Low: %b High: ((4 * %n) + %b))
+; CHECK-NEXT: Member: {%b,+,4}<%loop>
+; CHECK-NEXT: Group [[GRP7]]:
+; CHECK-NEXT: (Low: %c High: ((4 * %n) + %c))
+; CHECK-NEXT: Member: {%c,+,4}<%loop>
+; CHECK-NEXT: Group [[GRP8]]:
+; CHECK-NEXT: (Low: ((4 * %offset) + %a) High: ((4 * %offset) + (4 * %n) + %a))
+; CHECK-NEXT: Member: {((4 * %offset) + %a),+,4}<%loop>
; CHECK-EMPTY:
; CHECK-NEXT: Non vectorizable stores to invariant address were not found in loop.
; CHECK-NEXT: SCEV assumptions:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks reasonable, although I'm quite unhappy with the existing code: some of the comments can be addressed in follow-ups/pre-commits.
if (!Ptr) | ||
return false; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmpf, if Ptr is nullptr, Assume is ignored: perhaps it makes sense to add a comment about this in the function's doc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If AR is stripped from isNoWrapAddRec (actually isNoWrapGEP), the check for it in this function can be moved after the Ptr check.
if (AR->getNoWrapFlags(SCEV::NoWrapMask)) | ||
return true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder off-hand if this is ever hit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, the API isn't very clear, as AR is only ever used here, and what's actually checked is the Ptr, not AR. Perhaps the function should be renamed to isNoWrapGEP, and the AR check can be moved to the caller, assuming that it's useful?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good to me
I propose #127479, over which you can rebase for a clean patch. |
171135f
to
706e5f6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose #127479, over which you can rebase for a clean patch.
Thanks, updated on top of the patch
return true; | ||
|
||
// An nusw getelementptr that is an AddRec cannot wrap. If it would wrap, | ||
// the distance between the previously accessed location and the wrapped | ||
// location will be larger than half the pointer index type space. In that | ||
// case, the GEP would be poison and any memory access dependent on it would | ||
// be immediate UB when executed. | ||
if (auto *GEP = dyn_cast<GetElementPtrInst>(Ptr); | ||
if (auto *GEP = dyn_cast_if_present<GetElementPtrInst>(Ptr); | ||
GEP && GEP->hasNoUnsignedSignedWrap()) | ||
return true; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stride check could be moved to the top and then we could have an early return false if Ptr
is nullptr, but most of the checks above are simpler/cheaper than the stride check
Update isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed.
Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). I might be missing something, but at least for the test it doesn't seem correct to ignore wrapping pointers. If the pointer wraps, then we swap start and end of the runtime check. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on llvm#127410 to avoid more regressions.
if (!isNoWrap(PSE, AR, Ptr, AccessTy, TheLoop, Assume)) | ||
return false; | ||
if (ShouldCheckWrap && | ||
!isNoWrap(PSE, AR, TranslatedPtrs.size() == 1 ? Ptr : nullptr, AccessTy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A direct translation would be
!isNoWrap(PSE, AR, TranslatedPtrs.size() == 1 ? Ptr : nullptr, AccessTy, | |
!isNoWrap(PSE, AR, TranslatedPtrs.size() <= 1 ? Ptr : nullptr, AccessTy, |
why exactly 1?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is exactly one, it means that we have translated the IR pointer directly to a corresponding SCEV, and we can use the IR value in the analysis.
There could be more translated pointers, e.g. if the IR pointer is something like %p = select %c, ptr %a, ptr %b
. There's logic to look through things like select/phi and it may return multiple SCEVs, one for each possible option. For the select mentioned earlier, TranslatedPointers may contain 2 entries, one with the SCEV for %a and one with the SCEV for %b.
In that case we do not try to find IR expressions corresponding to the SCEVs, and most of the logic looking at the IR value would not be safe, as the IR value may never be executed (i.e. we cannot assume it being poison would cause UB)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is was specifically thinking about the case TranslatedPtrs.empty()
. Before the patch, isNoWrap
would have been skipped. Now it is executed with Ptr==nullptr
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see. It should never be empty. I added an assert, as it would be incorrect if it would be empty, meaning we couldn't translate the input pointer.
If it is empty, then we should never enter the enclosing loop. I'd prefer to leave it checking == 1
as this is the only case where it should be safe to use the pointer. WDYT?
@@ -83,10 +83,52 @@ exit: | |||
define void @dependency_check_and_runtime_checks_needed_select_of_ptr_add_recs(ptr %a, ptr %b, ptr %c, i64 %offset, i64 %n) { | |||
; CHECK-LABEL: 'dependency_check_and_runtime_checks_needed_select_of_ptr_add_recs' | |||
; CHECK-NEXT: loop: | |||
; CHECK-NEXT: Report: cannot check memory dependencies at runtime | |||
; CHECK-NEXT: Memory dependences are safe with run-time checks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain where the behavioral change comes from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
%select = select i1 %cmp, ptr %gep.b, ptr %gep.c
will be translated to 2 AddRecs, one for %gep.b
, and one for %gep.c
. Before the patch, we would bail out if we need to check dependencies and there are multiple TranslatedPtrs for a source pointer.
With the patch, we now use can determine no-wrap for both of the AddRecs, without needing to look at the IR pointers.
769b3ca
to
3968414
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good to go from my end. Perhaps also wait for @Meinersbur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for the explainations
Update isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed. PR: llvm/llvm-project#127410
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/59/builds/13013 Here is the relevant piece of the build log for the reference
|
Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). I might be missing something, but at least for the test it doesn't seem correct to ignore wrapping pointers. If the pointer wraps, then we swap start and end of the runtime check. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on llvm#127410 to avoid more regressions.
Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). I might be missing something, but at least for the test it doesn't seem correct to ignore wrapping pointers. If the pointer wraps, then we swap start and end of the runtime check. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on llvm#127410 to avoid more regressions.
Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). If the pointer wraps, then we swap start and end of the runtime check, leading to incorrect checks. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on #127410 to avoid more regressions. PR: #127543
…cks. (#127543) Currently we only check if the pointers involved in runtime checks do not wrap if we need to perform dependency checks. If that's not the case, we generate runtime checks, even if the pointers may wrap (see test/Analysis/LoopAccessAnalysis/runtime-checks-may-wrap.ll). If the pointer wraps, then we swap start and end of the runtime check, leading to incorrect checks. An Alive2 proof of what the runtime checks are checking conceptually (on i4 to have it complete in reasonable time) showing the incorrect result should be https://alive2.llvm.org/ce/z/KsHzn8 Depends on llvm/llvm-project#127410 to avoid more regressions. PR: llvm/llvm-project#127543
Update isNoWrap to make the IR Ptr argument optional. This allows using isNoWrap when dealing with things like pointer-selects, where a select is translated to multiple pointer SCEV expressions, but there is no IR value that can be used. We don't try to retrieve pointer values for the pointer SCEVs and using info from the IR would not be safe. For example, we cannot use inbounds, because the pointer may never be accessed.