Skip to content

[LoopInterchange] Relax the legality check to accept more patterns #118267

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

kasuga-fj
Copy link
Contributor

@kasuga-fj kasuga-fj commented Dec 2, 2024

When proving the legality of exchanging two loops, it doesn't need to check the elements of the direction vectors associated with the loops outside of the two target loops. Before this patch, the legality check looked at all elements of a direction vector to calculate the lexicographically order of the vector, which may reject some legal exchanges. For example, if a direction vector is [* < =], it is safe to swap the last two loops because the corresponding subsequence of the vector ([< =]) is lexicographically positive for both before and after the exchange. However, the its order is unknown if we don't drop the prefix since the first element is *. This patch improves the logic of legality check to ignore such unrelated prefixes of direction vectors.

Copy link

graphite-app bot commented Dec 2, 2024

Your org has enabled the Graphite merge queue for merging into main

Add the label “FP Bundles” to the PR and Graphite will automatically add it to the merge queue when it’s ready to merge.

You must have a Graphite account and log in to Graphite in order to use the merge queue. Sign up using this link.

@llvmbot
Copy link
Member

llvmbot commented Dec 2, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Ryotaro Kasuga (kasuga-fj)

Changes

We lose opportunities to interchange loops because the current legality check is stricter than necessary. This patch relaxes the restriction and increases the number of acceptable patterns. Here is a motivating example.

for (int nl=0;nl&lt;100;nl++) {
  for (int i=0;i&lt;256;i++) {
    for (int j=1;j&lt;256;j++)
      aa[j][i] = aa[j-1][i] + bb[j][i];
  }
  dummy(aa, bb);
}

This patch allows us to interchange the two innermost loops. Note, however, that the current implementation interchanges these loops twice so that they end up going back in the original order.

Related issue: #71519


Full diff: https://github.com/llvm/llvm-project/pull/118267.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopInterchange.cpp (+30-9)
  • (added) llvm/test/Transforms/LoopInterchange/direction-vector-legality-negative.ll (+74)
  • (added) llvm/test/Transforms/LoopInterchange/direction-vector-legality-none.ll (+54)
  • (added) llvm/test/Transforms/LoopInterchange/direction-vector-legality-opposite.ll (+53)
diff --git a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
index a0c0080c0bda1c..29a53a258f4ca1 100644
--- a/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopInterchange.cpp
@@ -64,6 +64,15 @@ using LoopVector = SmallVector<Loop *, 8>;
 // TODO: Check if we can use a sparse matrix here.
 using CharMatrix = std::vector<std::vector<char>>;
 
+// Classify direction vectors according to the leftmost non-"=" direction. "S"
+// and "I" are treated the same as "=".
+enum class DirectionVectorOrder {
+  Zero,     ///< The direction vector consists only of "=", "S", and "I".
+  Positive, ///< The leftmost non-"=" direction is "<".
+  Negative, ///< The leftmost non-"=" direction is ">".
+  All,      ///< The leftmost non-"=" direction is "*".
+};
+
 } // end anonymous namespace
 
 // Maximum number of dependencies that can be handled in the dependency matrix.
@@ -185,15 +194,25 @@ static void interChangeDependencies(CharMatrix &DepMatrix, unsigned FromIndx,
 // After interchanging, check if the direction vector is valid.
 // [Theorem] A permutation of the loops in a perfect nest is legal if and only
 // if the direction matrix, after the same permutation is applied to its
-// columns, has no ">" direction as the leftmost non-"=" direction in any row.
-static bool isLexicographicallyPositive(std::vector<char> &DV) {
+// columns, each row of it satisfies either the following conditions.
+//
+// - The row consists only of "=", "S", and "I".
+// - The leftmost direction that is not "=", "S" and "I" in the row is
+//   "<" or ">", and it does not change before and after the permutation is
+//   applied.
+static DirectionVectorOrder
+calcDirectionVectorOrder(const std::vector<char> &DV) {
   for (unsigned char Direction : DV) {
-    if (Direction == '<')
-      return true;
-    if (Direction == '>' || Direction == '*')
-      return false;
+    switch (Direction) {
+    case '<':
+      return DirectionVectorOrder::Positive;
+    case '>':
+      return DirectionVectorOrder::Negative;
+    case '*':
+      return DirectionVectorOrder::All;
+    }
   }
-  return true;
+  return DirectionVectorOrder::Zero;
 }
 
 // Checks if it is legal to interchange 2 loops.
@@ -207,10 +226,12 @@ static bool isLegalToInterChangeLoops(CharMatrix &DepMatrix,
     // Create temporary DepVector check its lexicographical order
     // before and after swapping OuterLoop vs InnerLoop
     Cur = DepMatrix[Row];
-    if (!isLexicographicallyPositive(Cur))
+    auto OrderBefore = calcDirectionVectorOrder(Cur);
+    if (OrderBefore == DirectionVectorOrder::All)
       return false;
     std::swap(Cur[InnerLoopId], Cur[OuterLoopId]);
-    if (!isLexicographicallyPositive(Cur))
+    auto OrderAfter = calcDirectionVectorOrder(Cur);
+    if (OrderBefore != OrderAfter)
       return false;
   }
   return true;
diff --git a/llvm/test/Transforms/LoopInterchange/direction-vector-legality-negative.ll b/llvm/test/Transforms/LoopInterchange/direction-vector-legality-negative.ll
new file mode 100644
index 00000000000000..b81a2c96c9cb72
--- /dev/null
+++ b/llvm/test/Transforms/LoopInterchange/direction-vector-legality-negative.ll
@@ -0,0 +1,74 @@
+; REQUIRES: asserts
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info \
+; RUN:     -S -debug 2>&1 | FileCheck %s
+
+@aa = dso_local global [256 x [256 x float]] zeroinitializer, align 64
+@bb = dso_local global [256 x [256 x float]] zeroinitializer, align 64
+
+declare i32 @dummy(ptr noundef, ptr noundef)
+
+;;  for (int nl=0;nl<100;++nl) {
+;;    for (int i=0;i<256;++i) {
+;;      for (int j=1;j<256;++j)
+;;        aa[j][i] = aa[j-1][i] + bb[j][i];
+;;    }
+;;    dummy(aa, bb);
+;;  }
+;;
+;; The direction vector of `aa` is [S = >]. We can swap the innermost two
+;; loops, The direction vector after interchanging will be [S > =].
+
+; CHECK: Dependency matrix before interchange:
+; CHECK-NEXT: S = >
+; CHECK-NEXT: S = =
+; CHECK-NEXT: Processing InnerLoopId = 2 and OuterLoopId = 1
+; CHECK-NEXT: Checking if loops are tightly nested
+; CHECK-NEXT: Checking instructions in Loop header and Loop latch
+; CHECK-NEXT: Loops are perfectly nested
+; CHECK-NEXT: Loops are legal to interchange
+; CHECK: Dependency matrix after interchange:
+; CHECK-NEXT: S > =
+; CHECK-NEXT: S = =
+
+define void @f() {
+entry:
+  br label %for.cond1.preheader
+
+; Loop:
+for.cond1.preheader:                              ; preds = %entry, %for.cond.cleanup3
+  %nl.036 = phi i32 [ 0, %entry ], [ %inc23, %for.cond.cleanup3 ]
+  br label %for.cond5.preheader
+
+for.cond.cleanup3:                                ; preds = %for.cond.cleanup7
+  %call = tail call i32 @dummy(ptr noundef nonnull @aa, ptr noundef nonnull @bb)
+  %inc23 = add nuw nsw i32 %nl.036, 1
+  %exitcond43 = icmp ne i32 %inc23, 100
+  br i1 %exitcond43, label %for.cond1.preheader, label %for.cond.cleanup
+
+for.cond.cleanup7:                                ; preds = %for.body8
+  %indvars.iv.next40 = add nuw nsw i64 %indvars.iv39, 1
+  %exitcond42 = icmp ne i64 %indvars.iv.next40, 256
+  br i1 %exitcond42, label %for.cond5.preheader, label %for.cond.cleanup3
+
+for.body8:                                        ; preds = %for.cond5.preheader, %for.body8
+  %indvars.iv = phi i64 [ 1, %for.cond5.preheader ], [ %indvars.iv.next, %for.body8 ]
+  %0 = add nsw i64 %indvars.iv, -1
+  %arrayidx10 = getelementptr inbounds [256 x [256 x float]], ptr @aa, i64 0, i64 %0, i64 %indvars.iv39
+  %1 = load float, ptr %arrayidx10, align 4
+  %arrayidx14 = getelementptr inbounds [256 x [256 x float]], ptr @bb, i64 0, i64 %indvars.iv, i64 %indvars.iv39
+  %2 = load float, ptr %arrayidx14, align 4
+  %add = fadd fast float %2, %1
+  %arrayidx18 = getelementptr inbounds [256 x [256 x float]], ptr @aa, i64 0, i64 %indvars.iv, i64 %indvars.iv39
+  store float %add, ptr %arrayidx18, align 4
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond = icmp ne i64 %indvars.iv.next, 256
+  br i1 %exitcond, label %for.body8, label %for.cond.cleanup7
+
+for.cond5.preheader:                              ; preds = %for.cond1.preheader, %for.cond.cleanup7
+  %indvars.iv39 = phi i64 [ 0, %for.cond1.preheader ], [ %indvars.iv.next40, %for.cond.cleanup7 ]
+  br label %for.body8
+
+; Exit blocks
+for.cond.cleanup:                                 ; preds = %for.cond.cleanup3
+  ret void
+}
diff --git a/llvm/test/Transforms/LoopInterchange/direction-vector-legality-none.ll b/llvm/test/Transforms/LoopInterchange/direction-vector-legality-none.ll
new file mode 100644
index 00000000000000..6b131a74148a08
--- /dev/null
+++ b/llvm/test/Transforms/LoopInterchange/direction-vector-legality-none.ll
@@ -0,0 +1,54 @@
+; REQUIRES: asserts
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info \
+; RUN:     -S -debug 2>&1 | FileCheck %s
+
+@aa = dso_local global [256 x [256 x float]] zeroinitializer, align 64
+@bb = dso_local global [256 x [256 x float]] zeroinitializer, align 64
+
+;;  for (int i=0;i<256;++i)
+;;    for (int j=1;j<256;++j)
+;;      aa[j][i] = aa[j-1][255-i] + bb[j][i];
+;;
+;; The direciton vector of `aa` is [* =]. We cannot interchange the loops
+;; because we must handle a `*` dependence conservatively.
+
+; CHECK: Dependency matrix before interchange:
+; CHECK-NEXT: * >
+; CHECK-NEXT: Processing InnerLoopId = 1 and OuterLoopId = 0
+; CHECK-NEXT: Failed interchange InnerLoopId = 1 and OuterLoopId = 0 due to dependence
+; CHECK-NEXT: Not interchanging loops. Cannot prove legality.
+
+define void @f() {
+; Preheader:
+entry:
+  br label %for.cond1.preheader
+
+; Loop:
+for.cond1.preheader:                              ; preds = %entry, %for.cond.cleanup3
+  %indvars.iv31 = phi i64 [ 0, %entry ], [ %indvars.iv.next32, %for.cond.cleanup3 ]
+  %0 = sub nuw nsw i64 255, %indvars.iv31
+  br label %for.body4
+
+for.cond.cleanup3:                                ; preds = %for.body4
+  %indvars.iv.next32 = add nuw nsw i64 %indvars.iv31, 1
+  %exitcond35 = icmp ne i64 %indvars.iv.next32, 256
+  br i1 %exitcond35, label %for.cond1.preheader, label %for.cond.cleanup
+
+for.body4:                                        ; preds = %for.cond1.preheader, %for.body4
+  %indvars.iv = phi i64 [ 1, %for.cond1.preheader ], [ %indvars.iv.next, %for.body4 ]
+  %1 = add nsw i64 %indvars.iv, -1
+  %arrayidx7 = getelementptr inbounds [256 x [256 x float]], ptr @aa, i64 0, i64 %1, i64 %0
+  %2 = load float, ptr %arrayidx7, align 4
+  %arrayidx11 = getelementptr inbounds [256 x [256 x float]], ptr @bb, i64 0, i64 %indvars.iv, i64 %indvars.iv31
+  %3 = load float, ptr %arrayidx11, align 4
+  %add = fadd fast float %3, %2
+  %arrayidx15 = getelementptr inbounds [256 x [256 x float]], ptr @aa, i64 0, i64 %indvars.iv, i64 %indvars.iv31
+  store float %add, ptr %arrayidx15, align 4
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond = icmp ne i64 %indvars.iv.next, 256
+  br i1 %exitcond, label %for.body4, label %for.cond.cleanup3
+
+; Exit blocks
+for.cond.cleanup:                                 ; preds = %for.cond.cleanup3
+  ret void
+}
diff --git a/llvm/test/Transforms/LoopInterchange/direction-vector-legality-opposite.ll b/llvm/test/Transforms/LoopInterchange/direction-vector-legality-opposite.ll
new file mode 100644
index 00000000000000..978b21e195101f
--- /dev/null
+++ b/llvm/test/Transforms/LoopInterchange/direction-vector-legality-opposite.ll
@@ -0,0 +1,53 @@
+; REQUIRES: asserts
+; RUN: opt < %s -passes=loop-interchange -cache-line-size=64 -verify-dom-info -verify-loop-info \
+; RUN:     -S -debug 2>&1 | FileCheck %s
+
+@aa = dso_local global [256 x [256 x float]] zeroinitializer, align 64
+@bb = dso_local global [256 x [256 x float]] zeroinitializer, align 64
+
+;;  for (int i=0;i<255;++i)
+;;    for (int j=1;j<256;++j)
+;;      aa[j][i] = aa[j-1][i+1] + bb[j][i];
+;;
+;; The direciton vector of `aa` is [< >]. We cannot interchange the loops
+;; because the read/write order for `aa` cannot be changed.
+
+; CHECK: Dependency matrix before interchange:
+; CHECK-NEXT: < >
+; CHECK-NEXT: Processing InnerLoopId = 1 and OuterLoopId = 0
+; CHECK-NEXT: Failed interchange InnerLoopId = 1 and OuterLoopId = 0 due to dependence
+; CHECK-NEXT: Not interchanging loops. Cannot prove legality.
+
+define void @f() {
+; Preheader:
+entry:
+  br label %for.cond1.preheader
+
+; Loop:
+for.cond1.preheader:                              ; preds = %entry, %for.cond.cleanup3
+  %indvars.iv31 = phi i64 [ 0, %entry ], [ %indvars.iv.next32, %for.cond.cleanup3 ]
+  %indvars.iv.next32 = add nuw nsw i64 %indvars.iv31, 1
+  br label %for.body4
+
+for.cond.cleanup3:                                ; preds = %for.body4
+  %exitcond34 = icmp ne i64 %indvars.iv.next32, 255
+  br i1 %exitcond34, label %for.cond1.preheader, label %for.cond.cleanup
+
+for.body4:                                        ; preds = %for.cond1.preheader, %for.body4
+  %indvars.iv = phi i64 [ 1, %for.cond1.preheader ], [ %indvars.iv.next, %for.body4 ]
+  %0 = add nsw i64 %indvars.iv, -1
+  %arrayidx6 = getelementptr inbounds [256 x [256 x float]], ptr @aa, i64 0, i64 %0, i64 %indvars.iv.next32
+  %1 = load float, ptr %arrayidx6, align 4
+  %arrayidx10 = getelementptr inbounds [256 x [256 x float]], ptr @bb, i64 0, i64 %indvars.iv, i64 %indvars.iv31
+  %2 = load float, ptr %arrayidx10, align 4
+  %add11 = fadd fast float %2, %1
+  %arrayidx15 = getelementptr inbounds [256 x [256 x float]], ptr @aa, i64 0, i64 %indvars.iv, i64 %indvars.iv31
+  store float %add11, ptr %arrayidx15, align 4
+  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
+  %exitcond = icmp ne i64 %indvars.iv.next, 256
+  br i1 %exitcond, label %for.body4, label %for.cond.cleanup3
+
+; Exit blocks
+for.cond.cleanup:                                 ; preds = %for.cond.cleanup3
+  ret void
+}

@kasuga-fj
Copy link
Contributor Author

kasuga-fj commented Dec 2, 2024

I have investigated what happens in the case of #71519 when this patch is applied, and found that there is a problem with LoopInterchangeProfitability::isProfitable. In the case of #71519, the following things happen:

  1. First, the loop cost (a value returned by CacheCost::getLoopCost) is the same for the loop of i and j.
  2. In the first step, isProfitablePerLoopCacheAnalysis returns std::nullopt, then isProfitablePerInstrOrderCost returns true, and finally these loops are interchanged.
  3. Because LoopInterchange uses a bubble-sort fashion algorithm, it tries to interchange these loops again.
  4. This time isProfitablePerLoopCacheAnalysis returns true because InnerIndex < OuterIndex holds.

I think we should modify LoopInterchangeProfitability and avoid to interchange the same pair of loops more than once.

@sjoerdmeijer
Copy link
Collaborator

I think we should modify LoopInterchangeProfitability and avoid to interchange the same pair of loops more than once.

I completely agree with this and that sounds like a bug in the cost-model.

@kasuga-fj
Copy link
Contributor Author

I think the easiest way is to check the depth of each of the two loops in the original code. If the depth of OuterLoopId is greater than InnerLoopId, it means that these loops have already been interchanged. However, it seems that the profitability checks have a "hierarchy". That is, if respecting the current implementation, the decision made by isProfitablePerLoopCacheAnalysis should take precedence over that of isProfitablePerIntsrOrderCost. So it might not be that simple.
What do you think?

@kasuga-fj
Copy link
Contributor Author

Ping.

Copy link
Contributor

@madhur13490 madhur13490 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please note, in #119345, there are discussions about removing "S" dependence.

@kasuga-fj
Copy link
Contributor Author

kasuga-fj commented Jan 10, 2025

Hi @madhur13490 , thanks for your comments!

Please note, in #119345, there are discussions about removing "S" dependence.

Yes, I am aware of the discussion. My implementation doesn't depend on the existence of the S dependency, but comments and tests need to be changed after it is removed. This is not correct. The loop in #71519 cannot be vectorized if the S is replaced with *.

By the way, I now feel that the normalize invocation of dependencies in the populateDependencyMatrix is no longer necessary. What do you think?
https://github.com/llvm/llvm-project/blob/eb71802c422cc337966d25fab726faf9074eb0aa/llvm/lib/Transforms/Scalar/LoopInterchange.cpp#L138-L139

@madhur13490
Copy link
Contributor

Hi @madhur13490 , thanks for your comments!

Please note, in #119345, there are discussions about removing "S" dependence.

Yes, I am aware of the discussion. My implementation doesn't depend on the existence of the `S' dependency, but comments and tests need to be changed after it is removed.

By the way, I now feel that the normalize invocation of dependencies in the populateDependencyMatrix is no longer necessary. What do you think?

https://github.com/llvm/llvm-project/blob/eb71802c422cc337966d25fab726faf9074eb0aa/llvm/lib/Transforms/Scalar/LoopInterchange.cpp#L138-L139

I don't fully understand why normalize becomes unnecessary after this patch. Can you please elaborate?
CC @CongzheUalberta

@kasuga-fj
Copy link
Contributor Author

I don't fully understand why normalize becomes unnecessary after this patch. Can you please elaborate?

I believe the normalization is called because isLegalToInterChangeLoops assumes non-negative direction vectors. If the logic of this patch is correct, I think this assumption becomes unnecessary.

I also realized that the removal of S affects the logic of my change (I was mistaken in my previous comment). I will fix the code.

@kasuga-fj
Copy link
Contributor Author

Perhaps #123920 must be resolved first.

@kasuga-fj kasuga-fj force-pushed the issue-71519-relax-legality branch from 6395986 to 5ff4eb2 Compare January 29, 2025 11:52
Copy link
Contributor Author

@kasuga-fj kasuga-fj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @sjoerdmeijer @madhur13490 .

This patch addresses issues with the current legality check being too conservative. I think correctness is more important than this to enable interchange by default, so I won't merge this before it is achieved. However, it would be appreciated if you would remember that this patch improves some of the S dependency, which were eliminated in #119345. (I'm not treating the S as special in this patch, but rather changing the handling of *). I would also like to know if you have any comments at this time.

Thanks.

Comment on lines 288 to 293
// TODO: There are cases where the interchange is legal but rejected. At least
// the following patterns are legal:
// - If both Dep[OuterLoopId] and Dep[InnerLoopId] are '=', the interchange is
// legal regardless of any other elements.
// - If the loops are adjacent to each other and at least one of them is '=',
// the interchange is legal even if the other is '*'.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to separate the PR, so I left it as a TODO. As I tested on my local, addressing these items makes some tests pass that are marked as XFAIL in #119345 .

; CHECK-NEXT: Name: Dependence
; CHECK-NEXT: Name: UnsupportedPHIOuter
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What was changed in #119345 has been restored.

kasuga-fj added a commit that referenced this pull request Mar 21, 2025
LoopInterchange uses the bubble-sort fashion algorithm to sort the loops
in a loop nest. For two loops `A` and `B`, it calls `isProfitable(A, B)`
to determine whether these loops should be exchanged. The result of
`isProfitable(A, B)` should be conservative, that is, it should return
true only when we are sure exchanging `A` and `B` will improve
performance. If it's not conservative enough, it's possible that a
subsequent `isProfitable(B, A)` will also return true, in which case
LoopInterchange will undo its previous transformation. To avoid such
cases, `isProfitable(B, A)` must not return true if `isProfitable(A, B)`
returned true in the past. However, the current implementation can be in
such a situation. This patch resolves it by correcting the handling of
two loops that have the same cache cost.

This resolve the problem mentioned in
#118267 (comment).
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Mar 21, 2025
…ion (#127473)

LoopInterchange uses the bubble-sort fashion algorithm to sort the loops
in a loop nest. For two loops `A` and `B`, it calls `isProfitable(A, B)`
to determine whether these loops should be exchanged. The result of
`isProfitable(A, B)` should be conservative, that is, it should return
true only when we are sure exchanging `A` and `B` will improve
performance. If it's not conservative enough, it's possible that a
subsequent `isProfitable(B, A)` will also return true, in which case
LoopInterchange will undo its previous transformation. To avoid such
cases, `isProfitable(B, A)` must not return true if `isProfitable(A, B)`
returned true in the past. However, the current implementation can be in
such a situation. This patch resolves it by correcting the handling of
two loops that have the same cache cost.

This resolve the problem mentioned in
llvm/llvm-project#118267 (comment).
@kasuga-fj
Copy link
Contributor Author

@Meinersbur Hi, let me ask a question. Should I bring this kind of topic to the LoopOpt WG meeting? This is because, looking at past discussion logs, it seems that the correctness of the legality check has often been discussed on the LoopWG call. Or is a text discussion on GitHub sufficient?

@Meinersbur
Copy link
Member

Meinersbur commented Apr 30, 2025 via email

@kasuga-fj
Copy link
Contributor Author

Thanks, I will attend the next meeting.

@kasuga-fj kasuga-fj changed the base branch from main to users/kasuga-fj/loop-interchange-legality-prefix May 9, 2025 12:40
@kasuga-fj kasuga-fj force-pushed the issue-71519-relax-legality branch from 808e9aa to 41c24f3 Compare May 9, 2025 12:41
@kasuga-fj
Copy link
Contributor Author

@Meinersbur Could you please take a look?

I have submitted another PR #139254 as well, on which this PR depends, to address the following problems: I first tried to simply replace the function isLexicographicallyPositive(std::vector<char> &DV) to take an additional argument like StartAt to ignore the prefix, as discussed in the last meeting. However, it made several tests to fail, e.g. a case where a direction vector is [< > =] and we would like to swap the last two loops. It is safe because the first element guarantees that the whole vector is lexicographically positive, but just changing the legality check to drop the unrelated prefix rejected such a case because it makes the direction vector to be [> =], which is negative.

The original issue #71519 isn't resolved by this patch because a problem similar to the above exists (I believe this can be resolved by improving the handling of negative direction). For now, I'd like to add a logic to ignore the irrelevant prefix of direction vectors, and will create another PR to address #71519.

Thanks.

Copy link
Member

@Meinersbur Meinersbur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you.

@kasuga-fj
Copy link
Contributor Author

Thanks for your review.

kasuga-fj added a commit that referenced this pull request May 13, 2025
…arantee dependency (NFC) (#139254)

The legality check in LoopInterchange allows two loops to be exchanged
if all direction vectors are lexicographically positive (or zero) for
both before and after the swap. The current implementation performs this
routine naively. However, if a direction vector is lexicographically
positive due to an element corresponding to a loop that is outside the
given two loops (i.e., if there is an element `<` before the loops we
are trying to interchange), then obviously it is also positive after
exchanging them. For example, for a direction vector `[< < >]`, swapping
the last two elements doesn't make it lexicographically negative because
the first element is `<`.
This patch adds a code to skip legality check if surrounding loops
already guarantee that the direction vector is lexicographically
positive. Note that this is only a small improvement on its own, but
it's necessary to relax the legality check I'm working on.

Split off from #118267

---------

Co-authored-by: Michael Kruse <[email protected]>
@kasuga-fj kasuga-fj deleted the branch llvm:users/kasuga-fj/loop-interchange-legality-prefix May 13, 2025 01:19
@kasuga-fj kasuga-fj closed this May 13, 2025
@kasuga-fj
Copy link
Contributor Author

I misunderstood GitHub behavior and cannot reopen this PR... Submitted another PR #139690 with same contents.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants