Skip to content

[VPlan] Retain exit conditions and edges in initial VPlan (NFC). #137709

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Apr 28, 2025

Update initial VPlan construction to include exit conditions and edges.

For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence any IR BBs (#128420).

This includes updates in a few places to use
replaceSuccessor/replacePredecessor to preserve the order of predecessors and successors, to reduce the need of fixing up phi operand orderings. This unfortunately required making them public, not sure if there's a

@llvmbot
Copy link
Member

llvmbot commented Apr 28, 2025

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-vectorizers

Author: Florian Hahn (fhahn)

Changes

Update initial VPlan construction to include exit conditions and edges.

For now, all early exits are disconnected before forming the regions, but a follow-up will update uncountable exit handling to also happen here. This is required to enable VPlan predication and remove the dependence any IR BBs (#128420).

This includes updates in a few places to use
replaceSuccessor/replacePredecessor to preserve the order of predecessors and successors, to reduce the need of fixing up phi operand orderings. This unfortunately required making them public, not sure if there's a


Full diff: https://github.com/llvm/llvm-project/pull/137709.diff

4 Files Affected:

  • (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+11-1)
  • (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+1-2)
  • (modified) llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp (+38-40)
  • (modified) llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll (+5-1)
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index 4684378687ef6..bd043ae5e06e8 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -9167,6 +9167,7 @@ void LoopVectorizationPlanner::buildVPlansWithVPRecipes(ElementCount MinVF,
 // loop.
 static void addCanonicalIVRecipes(VPlan &Plan, Type *IdxTy, bool HasNUW,
                                   DebugLoc DL) {
+  using namespace VPlanPatternMatch;
   Value *StartIdx = ConstantInt::get(IdxTy, 0);
   auto *StartV = Plan.getOrAddLiveIn(StartIdx);
 
@@ -9176,7 +9177,16 @@ static void addCanonicalIVRecipes(VPlan &Plan, Type *IdxTy, bool HasNUW,
   VPBasicBlock *Header = TopRegion->getEntryBasicBlock();
   Header->insert(CanonicalIVPHI, Header->begin());
 
-  VPBuilder Builder(TopRegion->getExitingBasicBlock());
+  VPBasicBlock *LatchVPBB = TopRegion->getExitingBasicBlock();
+  // We are about to replace the branch to exit the region. Remove the original
+  // BranchOnCond, if there is any.
+  // TODO: Move canonical IV and BranchOnCount introduction to initial skeleton
+  // creation.
+  if (!LatchVPBB->empty() &&
+      match(&LatchVPBB->back(), m_BranchOnCond(m_VPValue())))
+    LatchVPBB->getTerminator()->eraseFromParent();
+
+  VPBuilder Builder(LatchVPBB);
   // Add a VPInstruction to increment the scalar canonical IV by VF * UF.
   auto *CanonicalIVIncrement = Builder.createOverflowingOp(
       Instruction::Add, {CanonicalIVPHI, &Plan.getVFxUF()}, {HasNUW, false}, DL,
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index afad73bcd3501..5d81bcdc4a21e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -117,6 +117,7 @@ class VPBlockBase {
     Predecessors.erase(Pos);
   }
 
+public:
   /// Remove \p Successor from the successors of this block.
   void removeSuccessor(VPBlockBase *Successor) {
     auto Pos = find(Successors, Successor);
@@ -129,8 +130,6 @@ class VPBlockBase {
   void replacePredecessor(VPBlockBase *Old, VPBlockBase *New) {
     auto I = find(Predecessors, Old);
     assert(I != Predecessors.end());
-    assert(Old->getParent() == New->getParent() &&
-           "replaced predecessor must have the same parent");
     *I = New;
   }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
index 58d63933fb724..af198bbf57df6 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanConstruction.cpp
@@ -112,6 +112,9 @@ VPBasicBlock *PlainCFGBuilder::getOrCreateVPBB(BasicBlock *BB) {
     return VPBB;
   }
 
+  if (!TheLoop->contains(BB))
+    return Plan->getExitBlock(BB);
+
   // Create new VPBB.
   StringRef Name = BB->getName();
   LLVM_DEBUG(dbgs() << "Creating VPBasicBlock for " << Name << "\n");
@@ -145,14 +148,6 @@ bool PlainCFGBuilder::isExternalDef(Value *Val) {
     // Instruction definition is in outermost loop PH.
     return false;
 
-  // Check whether Instruction definition is in a loop exit.
-  SmallVector<BasicBlock *> ExitBlocks;
-  TheLoop->getExitBlocks(ExitBlocks);
-  if (is_contained(ExitBlocks, InstParent)) {
-    // Instruction definition is in outermost loop exit.
-    return false;
-  }
-
   // Check whether Instruction definition is in loop body.
   return !TheLoop->contains(Inst);
 }
@@ -201,11 +196,6 @@ void PlainCFGBuilder::createVPInstructionsForVPBB(VPBasicBlock *VPBB,
            "Instruction shouldn't have been visited.");
 
     if (auto *Br = dyn_cast<BranchInst>(Inst)) {
-      if (TheLoop->getLoopLatch() == BB ||
-          any_of(successors(BB),
-                 [this](BasicBlock *Succ) { return !TheLoop->contains(Succ); }))
-        continue;
-
       // Conditional branch instruction are represented using BranchOnCond
       // recipes.
       if (Br->isConditional()) {
@@ -295,7 +285,6 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG(
   for (BasicBlock *BB : RPO) {
     // Create or retrieve the VPBasicBlock for this BB.
     VPBasicBlock *VPBB = getOrCreateVPBB(BB);
-    Loop *LoopForBB = LI->getLoopFor(BB);
     // Set VPBB predecessors in the same order as they are in the incoming BB.
     setVPBBPredsFromBB(VPBB, BB);
 
@@ -326,24 +315,12 @@ std::unique_ptr<VPlan> PlainCFGBuilder::buildPlainCFG(
     BasicBlock *IRSucc1 = BI->getSuccessor(1);
     VPBasicBlock *Successor0 = getOrCreateVPBB(IRSucc0);
     VPBasicBlock *Successor1 = getOrCreateVPBB(IRSucc1);
-
-    // Don't connect any blocks outside the current loop except the latches for
-    // inner loops.
-    // TODO: Also connect exit blocks during initial VPlan construction.
-    if (LoopForBB == TheLoop || BB != LoopForBB->getLoopLatch()) {
-      if (!LoopForBB->contains(IRSucc0)) {
-        VPBB->setOneSuccessor(Successor1);
-        continue;
-      }
-      if (!LoopForBB->contains(IRSucc1)) {
-        VPBB->setOneSuccessor(Successor0);
-        continue;
-      }
-    }
-
     VPBB->setTwoSuccessors(Successor0, Successor1);
   }
 
+  for (auto *EB : Plan->getExitBlocks())
+    setVPBBPredsFromBB(EB, EB->getIRBasicBlock());
+
   // 2. The whole CFG has been built at this point so all the input Values must
   // have a VPlan counterpart. Fix VPlan header phi by adding their
   // corresponding VPlan operands.
@@ -446,19 +423,21 @@ static void createLoopRegion(VPlan &Plan, VPBlockBase *HeaderVPB) {
   VPBlockBase *Succ = LatchVPBB->getSingleSuccessor();
   assert(LatchVPBB->getNumSuccessors() <= 1 &&
          "Latch has more than one successor");
-  if (Succ)
-    VPBlockUtils::disconnectBlocks(LatchVPBB, Succ);
+  LatchVPBB->removeSuccessor(Succ);
 
   auto *R = Plan.createVPRegionBlock(HeaderVPB, LatchVPBB, "",
                                      false /*isReplicator*/);
   // All VPBB's reachable shallowly from HeaderVPB belong to top level loop,
   // because VPlan is expected to end at top level latch disconnected above.
+  SmallPtrSet<VPBlockBase *, 2> ExitBlocks(Plan.getExitBlocks().begin(),
+                                           Plan.getExitBlocks().end());
   for (VPBlockBase *VPBB : vp_depth_first_shallow(HeaderVPB))
-    VPBB->setParent(R);
+    if (!ExitBlocks.contains(VPBB))
+      VPBB->setParent(R);
 
   VPBlockUtils::insertBlockAfter(R, PreheaderVPBB);
-  if (Succ)
-    VPBlockUtils::connectBlocks(R, Succ);
+  R->setOneSuccessor(Succ);
+  Succ->replacePredecessor(LatchVPBB, R);
 }
 
 void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
@@ -476,8 +455,29 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
   VPBlockUtils::insertBlockAfter(VecPreheader, Plan.getEntry());
 
   VPBasicBlock *MiddleVPBB = Plan.createVPBasicBlock("middle.block");
-  VPBlockUtils::connectBlocks(LatchVPB, MiddleVPBB);
-  LatchVPB->swapSuccessors();
+  VPBlockBase *LatchExitVPB = LatchVPB->getNumSuccessors() == 2
+                                  ? LatchVPB->getSuccessors()[0]
+                                  : nullptr;
+  if (LatchExitVPB) {
+    LatchVPB->getSuccessors()[0] = MiddleVPBB;
+    MiddleVPBB->setPredecessors({LatchVPB});
+    MiddleVPBB->setSuccessors({LatchExitVPB});
+    LatchExitVPB->replacePredecessor(LatchVPB, MiddleVPBB);
+  } else {
+    VPBlockUtils::connectBlocks(LatchVPB, MiddleVPBB);
+    LatchVPB->swapSuccessors();
+  }
+
+  // Disconnect all edges between exit blocks other than from the latch.
+  // TODO: Uncountable exit blocks should be handled here.
+  for (VPBlockBase *EB : to_vector(Plan.getExitBlocks())) {
+    for (VPBlockBase *Pred : to_vector(EB->getPredecessors())) {
+      if (Pred == MiddleVPBB)
+        continue;
+      cast<VPBasicBlock>(Pred)->getTerminator()->eraseFromParent();
+      VPBlockUtils::disconnectBlocks(Pred, EB);
+    }
+  }
 
   // Create SCEV and VPValue for the trip count.
   // We use the symbolic max backedge-taken-count, which works also when
@@ -503,8 +503,9 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
   //    Thus if tail is to be folded, we know we don't need to run the
   //    remainder and we can set the condition to true.
   // 3) Otherwise, construct a runtime check.
-
   if (!RequiresScalarEpilogueCheck) {
+    if (LatchExitVPB)
+      VPBlockUtils::disconnectBlocks(MiddleVPBB, LatchExitVPB);
     VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
     // The exit blocks are unreachable, remove their recipes to make sure no
     // users remain that may pessimize transforms.
@@ -516,9 +517,6 @@ void VPlanTransforms::prepareForVectorization(VPlan &Plan, Type *InductionTy,
   }
 
   // The connection order corresponds to the operands of the conditional branch.
-  BasicBlock *IRExitBlock = TheLoop->getUniqueLatchExitBlock();
-  auto *VPExitBlock = Plan.getExitBlock(IRExitBlock);
-  VPBlockUtils::connectBlocks(MiddleVPBB, VPExitBlock);
   VPBlockUtils::connectBlocks(MiddleVPBB, ScalarPH);
 
   auto *ScalarLatchTerm = TheLoop->getLoopLatch()->getTerminator();
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll b/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll
index 91a5ea6b7fe36..fe845ae74cbee 100644
--- a/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll
+++ b/llvm/test/Transforms/LoopVectorize/vplan-printing-outer-loop.ll
@@ -31,7 +31,11 @@ define void @foo(i64 %n) {
 ; CHECK-NEXT: outer.latch:
 ; CHECK-NEXT:   EMIT ir<%outer.iv.next> = add ir<%outer.iv>, ir<1>
 ; CHECK-NEXT:   EMIT ir<%outer.ec> = icmp ir<%outer.iv.next>, ir<8>
-; CHECK-NEXT: Successor(s): outer.header
+; CHECK-NEXT:   EMIT branch-on-cond ir<%outer.ec>
+; CHECK-NEXT: Successor(s): ir-bb<exit>, outer.header
+; CHECK-EMPTY:
+; CHECK-NEXT: ir-bb<exit>:
+; CHECK-NEXT: No successors
 ; CHECK-NEXT: }
 entry:
   br label %outer.header

Update initial VPlan construction to include exit conditions and
edges.

For now, all early exits are disconnected before forming the regions,
but a follow-up will update uncountable exit handling to also happen
here. This is required to enable VPlan predication and remove the
dependence any IR BBs (llvm#128420).

This includes updates in a few places to use
replaceSuccessor/replacePredecessor to preserve the order of predecessors
and successors, to reduce the need of fixing up phi operand orderings.
This unfortunately required making them public, not sure if there's a
@fhahn fhahn force-pushed the vplan-retain-exits branch from c368b5d to e347f11 Compare May 3, 2025 12:08
fhahn added a commit to fhahn/llvm-project that referenced this pull request May 3, 2025
Move early-exit handling up front to original VPlan construction, before
introducing early exits.

This builds on llvm#137709, which
adds exiting edges to the original VPlan, instead of adding exit blocks
later.

This retains the exit conditions early, and means we can handle early
exits before forming regions, without the reliance on VPRecipeBuilder.

Once we retain all exits initially, handling early exits before region
construction ensures the regions are valid; otherwise we would leave
edges exiting the region from elsewhere than the latch.

Removing the reliance on VPRecipeBuilder removes the dependence on
mapping IR BBs to VPBBs and unblocks predication as VPlan transform:
llvm#128420.

Depends on llvm#137709.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants