-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Reland: [MLIR][Transforms] Fix Mem2Reg removal order to respect dominance #68767
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ance Reverts the revert commit and fixes the weak ordering requirement of `llvm::sort`. Original commit message: This commit fixes a bug in the Mem2Reg operation erasure order. Replacing the topological order with a dominance based order ensures that no operation is removed before all its uses have been replaced. Additionally, the reliance on the `DenseMap` key order was eliminated by switching to a `MapVector`, that gives a deterministic iteration order. Example: ``` %ptr = alloca ... ... %val0 = %load %ptr ... // LOAD0 store %val0 %ptr ... %val1 = load %ptr ... // LOAD1 ```` When promoting the slot backing %ptr, it can happen that the LOAD0 was cleaned before LOAD1. This results in all uses of LOAD0 being replaced by its reaching definition, before LOAD1's result is replaced by LOAD0's result. The subsequent erasure of LOAD0 can thus not succeed, as it has remaining usages.
The original PR ( #68687) cannot be reopened, that's why this new one was created. |
@llvm/pr-subscribers-mlir-llvm @llvm/pr-subscribers-mlir Author: Christian Ulmann (Dinistro) Changes…ance Reverts the revert commit and fixes the weak ordering requirement of Original commit message: This commit fixes a bug in the Mem2Reg operation erasure order. Replacing the topological order with a dominance based order ensures that no operation is removed before all its uses have been replaced. Additionally, the reliance on the Example:
When promoting the slot backing %ptr, it can happen that the LOAD0 was cleaned before LOAD1. This results in all uses of LOAD0 being replaced by its reaching definition, before LOAD1's result is replaced by LOAD0's result. The subsequent erasure of LOAD0 can thus not succeed, as it has remaining usages. Full diff: https://github.com/llvm/llvm-project/pull/68767.diff 2 Files Affected:
diff --git a/mlir/lib/Transforms/Mem2Reg.cpp b/mlir/lib/Transforms/Mem2Reg.cpp
index 65de25dd2f32663..3132d5b2f82a6a0 100644
--- a/mlir/lib/Transforms/Mem2Reg.cpp
+++ b/mlir/lib/Transforms/Mem2Reg.cpp
@@ -96,6 +96,9 @@ using namespace mlir;
namespace {
+using BlockingUsesMap =
+ llvm::MapVector<Operation *, SmallPtrSet<OpOperand *, 4>>;
+
/// Information computed during promotion analysis used to perform actual
/// promotion.
struct MemorySlotPromotionInfo {
@@ -106,7 +109,7 @@ struct MemorySlotPromotionInfo {
/// its uses, it is because the defining ops of the blocking uses requested
/// it. The defining ops therefore must also have blocking uses or be the
/// starting point of the bloccking uses.
- DenseMap<Operation *, SmallPtrSet<OpOperand *, 4>> userToBlockingUses;
+ BlockingUsesMap userToBlockingUses;
};
/// Computes information for basic slot promotion. This will check that direct
@@ -129,8 +132,7 @@ class MemorySlotPromotionAnalyzer {
/// uses (typically, removing its users because it will delete itself to
/// resolve its own blocking uses). This will fail if one of the transitive
/// users cannot remove a requested use, and should prevent promotion.
- LogicalResult computeBlockingUses(
- DenseMap<Operation *, SmallPtrSet<OpOperand *, 4>> &userToBlockingUses);
+ LogicalResult computeBlockingUses(BlockingUsesMap &userToBlockingUses);
/// Computes in which blocks the value stored in the slot is actually used,
/// meaning blocks leading to a load. This method uses `definingBlocks`, the
@@ -233,7 +235,7 @@ Value MemorySlotPromoter::getLazyDefaultValue() {
}
LogicalResult MemorySlotPromotionAnalyzer::computeBlockingUses(
- DenseMap<Operation *, SmallPtrSet<OpOperand *, 4>> &userToBlockingUses) {
+ BlockingUsesMap &userToBlockingUses) {
// The promotion of an operation may require the promotion of further
// operations (typically, removing operations that use an operation that must
// delete itself). We thus need to start from the use of the slot pointer and
@@ -243,7 +245,7 @@ LogicalResult MemorySlotPromotionAnalyzer::computeBlockingUses(
// use it.
for (OpOperand &use : slot.ptr.getUses()) {
SmallPtrSet<OpOperand *, 4> &blockingUses =
- userToBlockingUses.getOrInsertDefault(use.getOwner());
+ userToBlockingUses[use.getOwner()];
blockingUses.insert(&use);
}
@@ -281,7 +283,7 @@ LogicalResult MemorySlotPromotionAnalyzer::computeBlockingUses(
assert(llvm::is_contained(user->getResults(), blockingUse->get()));
SmallPtrSetImpl<OpOperand *> &newUserBlockingUseSet =
- userToBlockingUses.getOrInsertDefault(blockingUse->getOwner());
+ userToBlockingUses[blockingUse->getOwner()];
newUserBlockingUseSet.insert(blockingUse);
}
}
@@ -516,14 +518,21 @@ void MemorySlotPromoter::computeReachingDefInRegion(Region *region,
}
void MemorySlotPromoter::removeBlockingUses() {
- llvm::SetVector<Operation *> usersToRemoveUses;
- for (auto &user : llvm::make_first_range(info.userToBlockingUses))
- usersToRemoveUses.insert(user);
- SetVector<Operation *> sortedUsersToRemoveUses =
- mlir::topologicalSort(usersToRemoveUses);
+ llvm::SmallVector<Operation *> usersToRemoveUses(
+ llvm::make_first_range(info.userToBlockingUses));
+
+ // The uses need to be traversed in *reverse dominance* order to ensure that
+ // transitive replacements are performed correctly.
+ // NOTE: The order can be non-deterministic, due to a pointer comparision, but
+ // this has no effect on the result of the pattern. This is necessary to get a
+ // strict weak order relation.
+ llvm::sort(usersToRemoveUses, [&](Operation *lhs, Operation *rhs) {
+ return dominance.properlyDominates(rhs, lhs) ||
+ (!dominance.properlyDominates(lhs, rhs) && rhs < lhs);
+ });
llvm::SmallVector<Operation *> toErase;
- for (Operation *toPromote : llvm::reverse(sortedUsersToRemoveUses)) {
+ for (Operation *toPromote : usersToRemoveUses) {
if (auto toPromoteMemOp = dyn_cast<PromotableMemOpInterface>(toPromote)) {
Value reachingDef = reachingDefs.lookup(toPromoteMemOp);
// If no reaching definition is known, this use is outside the reach of
diff --git a/mlir/test/Dialect/LLVMIR/mem2reg.mlir b/mlir/test/Dialect/LLVMIR/mem2reg.mlir
index 30ba459d07a49f3..32e3fed7e5485df 100644
--- a/mlir/test/Dialect/LLVMIR/mem2reg.mlir
+++ b/mlir/test/Dialect/LLVMIR/mem2reg.mlir
@@ -683,3 +683,16 @@ llvm.func @no_inner_alloca_promotion(%arg: i64) -> i64 {
// CHECK: llvm.return %[[RES]] : i64
llvm.return %2 : i64
}
+
+// -----
+
+// CHECK-LABEL: @transitive_reaching_def
+llvm.func @transitive_reaching_def() -> !llvm.ptr {
+ %0 = llvm.mlir.constant(1 : i32) : i32
+ // CHECK-NOT: alloca
+ %1 = llvm.alloca %0 x !llvm.ptr {alignment = 8 : i64} : (i32) -> !llvm.ptr
+ %2 = llvm.load %1 {alignment = 8 : i64} : !llvm.ptr -> !llvm.ptr
+ llvm.store %2, %1 {alignment = 8 : i64} : !llvm.ptr, !llvm.ptr
+ %3 = llvm.load %1 {alignment = 8 : i64} : !llvm.ptr -> !llvm.ptr
+ llvm.return %3 : !llvm.ptr
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
hi! I see a test failure in mem2reg.mlir.test |
Did you see this on the relanded commit or on the original version? |
for reland |
Please be careful about the title wrapping (both authors and reviewer). |
I reverted the commit again, thanks for the heads up. |
Reverts the revert commit and fixes the weak ordering requirement of
llvm::sort
.Original commit message:
This commit fixes a bug in the Mem2Reg operation erasure order. Replacing the topological order with a dominance based order ensures that no operation is removed before all its uses have been replaced. Additionally, the reliance on the
DenseMap
key order was eliminated by switching to aMapVector
, that gives a deterministic iteration order.Example:
When promoting the slot backing %ptr, it can happen that the LOAD0 was cleaned before LOAD1. This results in all uses of LOAD0 being replaced by its reaching definition, before LOAD1's result is replaced by LOAD0's result. The subsequent erasure of LOAD0 can thus not succeed, as it has remaining usages.