-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[LoopInterchange] Handle LE and GE correctly #124901
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
LoopInterchange have converted `DVEntry::LE` and `DVEntry::GE` in direction vectors to '<' and '>' respectively. This handling is incorrect because the information about the '=' it lost. This leads to miscompilation in some cases. To resolve this issue, convert them to '*' instead. Resolve llvm#123920
@llvm/pr-subscribers-llvm-transforms Author: Ryotaro Kasuga (kasuga-fj) ChangesLoopInterchange have converted Resolve #123920 Full diff: https://github.com/llvm/llvm-project/pull/124901.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index 38fc682698c53e..ca125d2c0c490c 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -160,11 +160,16 @@ static bool populateDependencyMatrix(CharMatrix &DepMatrix, unsigned Level,
unsigned Levels = D->getLevels();
char Direction;
for (unsigned II = 1; II <= Levels; ++II) {
+ // `DVEntry::LE` is converted to `*`. This is because `LE` means `<`
+ // or `=`, for which we don't have an equivalent representation, so
+ // that the conservative approximation is necessary. The same goes for
+ // `DVEntry::GE`.
+ // TODO: Use of fine-grained expressions allows for more accurate
+ // analysis.
unsigned Dir = D->getDirection(II);
- if (Dir == Dependence::DVEntry::LT || Dir == Dependence::DVEntry::LE)
+ if (Dir == Dependence::DVEntry::LT)
Direction = '<';
- else if (Dir == Dependence::DVEntry::GT ||
- Dir == Dependence::DVEntry::GE)
+ else if (Dir == Dependence::DVEntry::GT)
Direction = '>';
else if (Dir == Dependence::DVEntry::EQ)
Direction = '=';
diff --git a/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll b/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
new file mode 100644
index 00000000000000..c17d78f7cfce68
--- /dev/null
+++ b/llvm/test/Transforms/LoopInterchange/outer-dependency-lte.ll
@@ -0,0 +1,75 @@
+; RUN: opt < %s -passes=loop-interchange -pass-remarks-missed='loop-interchange' -pass-remarks-output=%t \
+; RUN: -verify-dom-info -verify-loop-info -verify-loop-lcssa
+; RUN: FileCheck --input-file=%t %s
+
+;; The original code:
+;;
+;; #define N 4
+;; int a[N*N][N*N][N*N];
+;; void f() {
+;; for (int i = 0; i < N; i++)
+;; for (int j = 1; j < 2*N; j++)
+;; for (int k = 1; k < 2*N; k++)
+;; a[2*i][k+1][j-1] -= a[i+N-1][k][j];
+;; }
+;;
+;; The entry of the direction vector for the outermost loop is `DVEntry::LE`.
+;; We need to treat this as `*`, not `<`. See issue #123920 for details.
+
+; CHECK: --- !Missed
+; CHECK-NEXT: Pass: loop-interchange
+; CHECK-NEXT: Name: Dependence
+; CHECK-NEXT: Function: f
+; CHECK: --- !Missed
+; CHECK-NEXT: Pass: loop-interchange
+; CHECK-NEXT: Name: Dependence
+; CHECK-NEXT: Function: f
+
+@a = dso_local global [16 x [16 x [16 x i32]]] zeroinitializer, align 4
+
+define dso_local void @f() {
+entry:
+ br label %for.cond1.preheader
+
+for.cond1.preheader:
+ %i.039 = phi i32 [ 0, %entry ], [ %inc26, %for.cond.cleanup3 ]
+ %sub = add nuw nsw i32 %i.039, 3
+ %idxprom = zext nneg i32 %sub to i64
+ %mul = shl nuw nsw i32 %i.039, 1
+ %idxprom13 = zext nneg i32 %mul to i64
+ br label %for.cond5.preheader
+
+for.cond.cleanup:
+ ret void
+
+for.cond5.preheader:
+ %j.038 = phi i32 [ 1, %for.cond1.preheader ], [ %inc23, %for.cond.cleanup7 ]
+ %idxprom11 = zext nneg i32 %j.038 to i64
+ %sub18 = add nsw i32 %j.038, -1
+ %idxprom19 = sext i32 %sub18 to i64
+ br label %for.body8
+
+for.cond.cleanup3:
+ %inc26 = add nuw nsw i32 %i.039, 1
+ %cmp = icmp samesign ult i32 %i.039, 3
+ br i1 %cmp, label %for.cond1.preheader, label %for.cond.cleanup
+
+for.cond.cleanup7:
+ %inc23 = add nuw nsw i32 %j.038, 1
+ %cmp2 = icmp samesign ult i32 %j.038, 7
+ br i1 %cmp2, label %for.cond5.preheader, label %for.cond.cleanup3
+
+for.body8:
+ %k.037 = phi i32 [ 1, %for.cond5.preheader ], [ %add15, %for.body8 ]
+ %idxprom9 = zext nneg i32 %k.037 to i64
+ %arrayidx12 = getelementptr inbounds nuw [16 x [16 x [16 x i32]]], ptr @a, i64 0, i64 %idxprom, i64 %idxprom9, i64 %idxprom11
+ %0 = load i32, ptr %arrayidx12, align 4
+ %add15 = add nuw nsw i32 %k.037, 1
+ %idxprom16 = zext nneg i32 %add15 to i64
+ %arrayidx20 = getelementptr inbounds [16 x [16 x [16 x i32]]], ptr @a, i64 0, i64 %idxprom13, i64 %idxprom16, i64 %idxprom19
+ %1 = load i32, ptr %arrayidx20, align 4
+ %sub21 = sub nsw i32 %1, %0
+ store i32 %sub21, ptr %arrayidx20, align 4
+ %cmp6 = icmp samesign ult i32 %k.037, 7
+ br i1 %cmp6, label %for.body8, label %for.cond.cleanup7
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this direction, correctness should be the first priority.
Also, as you mentioned, this does not regress many cases.
LGTM
Thanks for your review! |
This is a work in progress patch to enable loop-interchange by default and is a continuation of the RFC: https://discourse.llvm.org/t/enabling-loop-interchange/82589 Basically, we promised to fix any compile-time and correctness issues in the different components involved here (loop-interchange and dependence analaysis.) before discussing enabling interchange by default. We think are close to complete this; I would like to explain where we are and wanted to check if there are any thoughts or concerns. A quick overview of the correctness and compile-time improvements that we have made include: Correctness: - [LoopInterchange] Remove 'S' Scalar Dependencies (llvm#119345) - [LoopInterchange] Fix overflow in cost calculation (llvm#111807) - [LoopInterchange] Handle LE and GE correctly (PR llvm#124901) @kasuga-fj - [DA] disambiguate evolution of base addresses (llvm#116628) Compile-times: - [LoopInterchange] Constrain number of load/stores in a loop (llvm#118973) - [LoopInterchange] Bail out early if minimum loop nest is not met (llvm#115128) - [LoopInterchange] Hoist isComputableLoopNest() in the control flow (llvm#124247) And in terms of remaining work, we think we are very close to fixing these depenence analysis issues: - [DA] do not handle array accesses of different offsets (llvm#123436) - [DA] Dependence analysis does not handle array accesses of different sizes (llvm#116630) - [DA] use NSW arithmetic llvm#116632 The compile-time increase with a geomean increase of 0.19% looks good (after committing llvm#124247), I think: stage1-O3: Benchmark kimwitu++ +0.10% sqlite3 +0.14% consumer-typeset +0.07% Bullet +0.06% tramp3d-v4 +0.21% mafft +0.39% ClamAVi +0.06% lencod +0.61% SPASS +0.17% 7zip +0.08% geomean +0.19% See also: http://llvm-compile-time-tracker.com/compare.php?from=19a7fe03b4f58c4f73ea91d5e63bc4c6e61f987b&to=b24f1367d68ee675ea93ecda4939208c6b68ae4b&stat=instructions%3Au We might want to look into lencod to see if we can improve more, but not sure it is strictly necessary.
LoopInterchange have converted
DVEntry::LE
andDVEntry::GE
in direction vectors to '<' and '>' respectively. This handling is incorrect because the information about the '=' it lost. This leads to miscompilation in some cases. To resolve this issue, convert them to '*' instead.Resolve #123920