[MLIR][Tensor] Canonicalize fully covering slice insertions into tensors with unit prefixes #92912
Conversation
If the destination tensor of the insertion of a slice has the same number of elements as the slice, but with a shape that only differs by a prefix of unit-sized dimensions, and if the insertion happens at zero offsets, with unit strides, and with a size matching the size of the destination, the insertion covers all elements of the destination. The result of such an insertion is equivalent to the slice, with its shape expanded to the type of the destination.

Example:

```mlir
%0 = tensor.insert_slice %slice into
       %x[0, 0, 0, 0, 0][1, 1, 1, 16, 32][1, 1, 1, 1, 1] :
       tensor<16x32xf32> into tensor<1x1x1x16x32xf32>
```

folds into:

```mlir
%0 = tensor.expand_shape %slice [[0, 1, 2, 3], [4]] :
       tensor<16x32xf32> into tensor<1x1x1x16x32xf32>
```

This commit adds a canonicalization pattern for `InsertSliceOp` implementing this rewrite.
@llvm/pr-subscribers-mlir-tensor @llvm/pr-subscribers-mlir

Author: Andi Drebes (andidr)

Full diff: https://github.com/llvm/llvm-project/pull/92912.diff (2 files affected):
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index 8545c7b9af8f7..52d7005470232 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -2835,6 +2835,91 @@ struct InsertSliceOpSourceCastInserter final
return success();
}
};
+
+/// If the destination tensor of the insertion of a slice has the same
+/// number of elements as the slice, but with a shape that only
+/// differs by a prefix of unit-sized dimensions, and if the insertion
+/// happens at zero offsets, unit strides and with a size matching the
+/// size of the destination, the insertion covers all elements of the
+/// destination. The result of such an insertion is equivalent to the
+/// slice, with its shape expanded to the type of the destination.
+///
+/// Example:
+/// ```mlir
+/// %0 = tensor.insert_slice %slice into
+/// %x[0, 0, 0, 0, 0][1, 1, 1, 16, 32][1, 1, 1, 1, 1] :
+/// tensor<16x32xf32> into tensor<1x1x1x16x32xf32>
+/// ```
+///
+/// folds into:
+///
+/// ```mlir
+/// %0 = tensor.expand_shape %slice[[0,1,2,3], [4]] :
+/// tensor<16x32xf32> into tensor<1x1x1x16x32xf32>
+/// ```
+struct InsertSliceOpFullRewriteCanonicalizer final
+ : public OpRewritePattern<InsertSliceOp> {
+ using OpRewritePattern<InsertSliceOp>::OpRewritePattern;
+ LogicalResult matchAndRewrite(InsertSliceOp insertSliceOp,
+ PatternRewriter &rewriter) const override {
+ RankedTensorType sourceType = insertSliceOp.getSourceType();
+ RankedTensorType resultType = insertSliceOp.getType();
+
+ if (sourceType != resultType && sourceType.hasStaticShape() &&
+ resultType.hasStaticShape() &&
+ isSameSizedSuffixShape(resultType.getShape(), sourceType.getShape()) &&
+ succeeded(foldIdentityOffsetSizeAndStrideOpInterface(insertSliceOp,
+ resultType))) {
+ SmallVector<ReassociationIndices> reassocIndices;
+
+ // Number of leading dimensions with unit size that are not
+ // shared with the source type
+ size_t unitPrefixLength =
+ resultType.getShape().size() - sourceType.getShape().size();
+
+      // Compose mapping of the leading unit-sized dimensions and the
+      // first common dimension to the first dimension of the source
+      // tensor
+ ReassociationIndices unitPrefixExpansion;
+
+ size_t dim;
+ for (dim = 0; dim < unitPrefixLength; dim++)
+ unitPrefixExpansion.push_back(dim);
+
+ unitPrefixExpansion.push_back(unitPrefixLength);
+ reassocIndices.push_back(unitPrefixExpansion);
+
+ // Map remaining common dimensions of the source to the target
+ for (dim = dim + 1; dim < resultType.getShape().size(); dim++) {
+ reassocIndices.push_back({static_cast<int64_t>(dim)});
+ }
+
+ rewriter.replaceOpWithNewOp<ExpandShapeOp>(
+ insertSliceOp, insertSliceOp.getType(), insertSliceOp.getSource(),
+ reassocIndices);
+
+ return mlir::success();
+ }
+
+ return mlir::failure();
+ }
+
+private:
+ /// Checks if `suffix` is a suffix of `shape` and all preceding
+ /// elements in `shape` are ones.
+ static bool isSameSizedSuffixShape(ArrayRef<int64_t> shape,
+ ArrayRef<int64_t> suffix) {
+ if (shape.size() >= suffix.size()) {
+ ArrayRef<int64_t> prefix = shape.take_front(shape.size() - suffix.size());
+ ArrayRef<int64_t> remainder = shape.take_back(suffix.size());
+
+ return llvm::all_of(prefix, [](int64_t d) { return d == 1; }) &&
+ remainder == suffix;
+ }
+
+ return false;
+ }
+};
} // namespace
llvm::SmallBitVector InsertSliceOp::getDroppedDims() {
@@ -2845,7 +2930,8 @@ void InsertSliceOp::getCanonicalizationPatterns(RewritePatternSet &results,
MLIRContext *context) {
results.add<InsertSliceOpConstantArgumentFolder<InsertSliceOp>,
InsertSliceOpCastFolder<InsertSliceOp>,
- InsertSliceOpSourceCastInserter<InsertSliceOp>>(context);
+ InsertSliceOpSourceCastInserter<InsertSliceOp>,
+ InsertSliceOpFullRewriteCanonicalizer>(context);
}
Value mlir::tensor::createCanonicalRankReducingInsertSliceOp(OpBuilder &b,
diff --git a/mlir/test/Dialect/Tensor/canonicalize.mlir b/mlir/test/Dialect/Tensor/canonicalize.mlir
index 914e5e8b8c4b8..8e66ef9f89c74 100644
--- a/mlir/test/Dialect/Tensor/canonicalize.mlir
+++ b/mlir/test/Dialect/Tensor/canonicalize.mlir
@@ -542,6 +542,18 @@ func.func @trivial_insert_slice(%arg0 : tensor<4x6x16x32xi8>, %arg1 : tensor<4x6
// -----
+// CHECK-LABEL: func @trivial_insert_slice_unit_prefix
+// CHECK-SAME: %[[ARG0:.[a-z0-9A-Z_]+]]: tensor<4x6x16x32xi8>
+// CHECK-NOT: tensor.insert_slice
+// CHECK: %[[EXPANDED:.[a-z0-9A-Z_]+]] = tensor.expand_shape %[[ARG0]] {{\[\[0, 1, 2, 3\], \[4\], \[5\], \[6\]\]}} output_shape {{\[1, 1, 1, 4, 6, 16, 32\]}} : tensor<4x6x16x32xi8> into tensor<1x1x1x4x6x16x32xi8>
+// CHECK: return %[[EXPANDED]] : tensor<1x1x1x4x6x16x32xi8>
+func.func @trivial_insert_slice_unit_prefix(%arg0 : tensor<4x6x16x32xi8>, %arg1 : tensor<1x1x1x4x6x16x32xi8>) -> tensor<1x1x1x4x6x16x32xi8> {
+ %0 = tensor.insert_slice %arg0 into %arg1[0, 0, 0, 0, 0, 0, 0] [1, 1, 1, 4, 6, 16, 32] [1, 1, 1, 1, 1, 1, 1] : tensor<4x6x16x32xi8> into tensor<1x1x1x4x6x16x32xi8>
+ return %0 : tensor<1x1x1x4x6x16x32xi8>
+}
+
+// -----
+
// CHECK-LABEL: func @empty_insert_slice
// CHECK-SAME: %[[ARG0:.[a-z0-9A-Z_]+]]: tensor<0x2xi8>
// CHECK-SAME: %[[ARG1:.[a-z0-9A-Z_]+]]: tensor<3x3xi8>
Adding @matthias-springer to the conversation, who has contributed several canonicalizers for the tensor dialect.
This shouldn't be a canonicalizer. This might not be a good thing to do always. For example, bufferization would not be able to handle the expand_shape properly (without doing this analysis in reverse, essentially undoing this canonicalization). Please add a separate entry point for this pattern.
This does not seem like an argument against canonicalization to me, unless you'd advocate for the opposite canonicalization?
I think we have had the same discussion many times. Canonicalizations should be narrow. Changing one op to a completely different op does not seem like a canonicalization to me.
Let’s not have the discussion repeatedly, then, and keep it grounded in first principles, please? “Narrow” does not mean anything to me! Back to first principles: canonicalization is about finding a single form for two equivalent programs. If you had an expand_shape in the input IR, would you do the opposite transformation?
There might not be a single form. The two equivalent programs are equivalent, and there is nothing to choose between them without additional context that only exists in the use case.

No. IMO that is an opinionated decision. It should be left to the downstream users to pick what works for their context.
Sorry, but I don't see any first principles coming from your statement here: in particular, when it comes to a very simple case like this one, which involves only one op before and after the canonicalization, we ought to be able to select one! Going back to this pattern: what makes it more difficult to match
Unfortunately, this isn't how MLIR canonicalization is designed: you either canonicalize, or you don't. Otherwise, feel free to propose a redesign of the system, but this is what we have!
This is exactly my point. There is no argument to pick one over the other. So what's stopping me from writing the reverse and add that to a canonicalization?
This option is supremely unergonomic and really cannot be used apart from simple use cases
These kinds of statements are unhelpful and do not lead to an outcome. The above discussion is not helping the author land his change. I would suggest just adding a new entry point.
At the moment nothing; with this patch, it would be the fact that infinite loops are explicitly prohibited in our doc :)
I mean your whole approach isn't constructive here: I'm urging you to either come back to build some basic principles upon which we can make progress, or abstain from blocking canonicalization without being able to provide stronger reasoning.
I see this as avoidance in the face of unjustified obstruction of being able to make progress on canonicalization: so I strongly object.
Since you ask for a basic principle, my answer is that you have to prove something is a canonical representation. Just an arbitrary choice is not justification enough. You have to prove that this is the "always better" representation to have it in canonicalization.
You're providing a vague statement: this isn't a set of principles upon which we can build anything right now. For example my principles for canonicalization:
Sorry, but I can't follow you: are you saying that the normal standard for canonicalization in MLIR is that we have to prove everything? I don't believe we have ever held any canonicalization to this standard (please prove to me that canonicalizing constants to the right is canonical...). You should probably re-read https://sunfishcode.github.io/blog/2018/10/22/Canonicalization.html which is the underlying reference in our documentation: https://mlir.llvm.org/docs/Canonicalization/
I am pretty sure we don't have a "first to invent" policy on patches implementing a disputed design. And some things in these algebras are quite sensitive, with non-obvious tradeoffs. I tire of these education sessions in rhetoric.

Bottom line: if I look at the commit history and project affiliations of the author and the person rebutting, I'm willing to believe they both have an intuition and feel for the algebra here. And it is completely reasonable to call for a pause to assess: this is a volunteer project, and detailed arguments for something with as much nuance as this dialect are definitely work and may not be able to be synthesized out of thin air (or in a fly-by patch review) or in a patch thread like this. But we're not talking about that.

So -1 from me: this patch shouldn't land without consensus. It may have merit, but it is a very specific thing, not at all in the category of "fold right" as for constants. It is obvious that there are two people with presumed expertise and contribution history to this dialect who disagree. Happens all the time. The traditional thing we do in this community is to pause and pop that up to the list. That is not blocking. That is just how we work.

I've read this patch and the thread and I still don't know why the author wanted it -- I presume they have good reasons -- or why Mahesh doesn't (who I expect also has a lot of experience on this exact point). Instead I've got an escalation based on the form of the argument, and I don't find that salient for understanding or contributing. I'd recommend raising this on the list, discussing experiences with these forms, and seeing whether we're close or far on consensus. For something like this, principles are something we often arrive at through interrogation and experience. Just calling for them in some kind of summary fashion is not helpful.
Marking request changes because there appears to be a lack of consensus and I think we would benefit from the author and Mahesh having a discussion on discourse about what the best thing to do here is. I also know that key contributors are off for a long holiday weekend and we should be patient/wait for feedback on disputed points of design.
One more data point: We've had similar discussions for other
Some of these could have been canonicalization patterns. The problem is that some transformations are hard-coded to specific IR patterns. E.g.:
This is not ideal, and those transformations should be generalized. But that's non-trivial and a lot of work. Especially generalizing the bufferization could easily take a few months. I basically stopped adding new canonicalization patterns, apart from very simple canonicalizations like turning dynamic sizes into static sizes. There is no good way for users to opt out of certain canonicalizations. But they can always opt into applying additional patterns if they want to.
I was going to mention that. We use them and we run canonicalizations in between passes to avoid surprises, but this particular (type of) canonicalization would be the ones causing surprise. I'm not against a canonical form, far from it, but the points raised here regarding the infrastructure to actually have a canonical form that we can actually use are very pertinent. We have similar issues in LLVM, where what's considered "canonical" is not absolute to the IR or the target, but relative to the actual pass running on it: what patterns do they recognise, what matchers are available, what you really mean by redundant information.
On the original document, please refer to the
It is perfectly fine (and actually, I would say, desirable) to have "intuition" from experience; however, it does not free oneself from being able to articulate some reasoning, which has been entirely absent from the discussion here. In particular, the lack of first-principles reasoning (including trying to anchor some of it in "experience" and trying to make a "feel for the algebra" more concrete in some way) is quite alarming to me.
Minor correction: this thread is full of rebuttals from three different contributors in good standing to this dialect, but they are not passing your bar for listening and letting the conversation transpire. I jumped in here as a circuit breaker to the form the discussion was taking, because I receive strong feedback from the community that when it turns this way, it is incredibly damaging, and that outcome is the opposite of what we all want. I've been guilty of this as many times as anyone -- just trying to make us better.

Once we got out of rapid-fire mode, I got the sentiment in back channels that it seems past time to write down the canonicalization approach for this dialect. There is a logic to it, but it comes out of history and precedence (and not everyone can produce a fully formed, first-principles argument in one go on a topic like this). Mahesh's original response was, in effect, extending this common law to provide guidance. If the discussion had continued at that point, rather than diverting into a more rhetorical form, the other contributors would have had time to see it and likely come to a similar consensus that writing down more formal guidelines is needed. The points and insights you are bringing up would enrich such a thing, but are really hard to litigate on a very low-level point like this specific one.

The one thing I would ask: this is a volunteer project, and engaging to produce a well-reasoned approach to something like this is real work. It takes time and can't always be formulated on demand in an adversarial, debate-like forum on a single issue in real time. The choice to step back and consider is something that the contributors in good standing to the component have to be able to make, and that is what I'm intervening to make sure can happen here.
I don't parse your sentence: are you saying that, in this thread, someone rebutted one or multiple of these three statements and I've been blind to it?
I'm sorry you don't parse my statement, but I think that just reinforces that this topic needs to be upleveled. People are rebutting your conclusions. They may not be agreeing with you, but they are struggling with the form of the argument as you are approaching it.

To be blunt, you have a keen mind for rigorous logic. This is an asset because it lets you see inconsistencies at the outset. But not everyone can debate in that way in real time on a deeply nested issue. My experience is that most people need a minute, a zoom out, and a helping hand to get to the best outcomes. A well-developed ability to identify inconsistencies like yours might be a bit less alarming when you see less structured reasoning, if you focus more on guiding people to eventual consistency of their rationale rather than requiring it at each step. At least, that's a lesson I had to learn in order to be able to integrate broader perspectives.

Sorry if this is off topic. I do get very strong feedback on the impact when we go into laser-focus, rapid-fire mode on this stuff, and I'm trying to make us better. I apologize in advance if any of it doesn't land right, and am willing to discuss more in person or on an upleveled thread.