[mlir] Add apply_patterns.linalg.pad_vectorization TD Op #112504
Conversation
@llvm/pr-subscribers-mlir-linalg @llvm/pr-subscribers-mlir

Author: Andrzej Warzyński (banach-space)

Changes

This PR simply wraps `populatePadOpVectorizationPatterns` into a new Transform Dialect Op: `apply_patterns.linalg.pad_vectorization`.

This change makes it possible to run (and test) the corresponding patterns without `transform.structured.vectorize_children_and_apply_patterns`.

Note that the Op above only supports non-masked vectorisation (i.e. when the inputs are static), so, effectively, only fixed-width vectorisation (as opposed to scalable vectorisation). As such, this change is required to construct vectorization pipelines for tensor.pad targeting scalable vectors.

To test the new Op and the corresponding patterns, I added "vectorization-pad-patterns.mlir" - most tests have been extracted from "vectorization-with-patterns.mlir".

As a side note, I feel that we should move `GenericPadOpVectorizationPattern` out of `populatePadOpVectorizationPatterns` as that's a "lower tensor.pad" rather than a "vectorize tensor.pad" pattern. I am leaving that as a TODO.

Patch is 22.25 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/112504.diff

5 Files Affected:
diff --git a/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td b/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
index 98b9151381222d..a12d4b43223e47 100644
--- a/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
+++ b/mlir/include/mlir/Dialect/Linalg/TransformOps/LinalgTransformOps.td
@@ -84,6 +84,23 @@ def ApplyFoldAddIntoDestPatternsOp : Op<Transform_Dialect,
let assemblyFormat = "attr-dict";
}
+def ApplyPadVectorizationPatternsOp : Op<Transform_Dialect,
+ "apply_patterns.linalg.pad_vectorization",
+ [DeclareOpInterfaceMethods<PatternDescriptorOpInterface>]> {
+ let description = [{
+ Apply patterns that take tensor.pad and rewrites it as
+ vector.transfer_read/vector.transfer_write Ops.
+
+ These patterns will either fold tensor.pad with an existing
+ vector.transfer_read or vector.transfer_write producer/consumers (requires
+ other surrounding Ops to be already vectorised) or rewrite it, together
+ with tensor.insert_slice consumer, as a vector.transfer_read +
+ vector.transfer_write pair.
+ }];
+
+ let assemblyFormat = "attr-dict";
+}
+
//===----------------------------------------------------------------------===//
// BufferizeToAllocationOp
//===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp b/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
index ad72b5d7beccde..55e9f88a0046f2 100644
--- a/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
+++ b/mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp
@@ -253,6 +253,11 @@ void transform::ApplyFoldAddIntoDestPatternsOp::populatePatterns(
linalg::populateFoldAddIntoDestPatterns(patterns);
}
+void transform::ApplyPadVectorizationPatternsOp::populatePatterns(
+ RewritePatternSet &patterns) {
+ linalg::populatePadOpVectorizationPatterns(patterns);
+}
+
//===----------------------------------------------------------------------===//
// BufferizeToAllocationOp
//===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 09c6b2683b4388..1f5f500a6245f6 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -2285,7 +2285,7 @@ static SmallVector<Value> ofrToIndexValues(RewriterBase &rewriter, Location loc,
return result;
}
-/// Rewrite a tensor::PadOp into a sequence of EmptyOp, FillOp and
+/// Rewrite a tensor::PadOp into a sequence of EmptyOp, FillOp/GenerateOp and
/// InsertSliceOp. For now, only constant padding values are supported.
/// If there is enough static type information, TransferReadOps and
/// TransferWriteOps may be generated instead of InsertSliceOps.
diff --git a/mlir/test/Dialect/Linalg/vectorization-pad-patterns.mlir b/mlir/test/Dialect/Linalg/vectorization-pad-patterns.mlir
new file mode 100644
index 00000000000000..76cd7d7b0721f1
--- /dev/null
+++ b/mlir/test/Dialect/Linalg/vectorization-pad-patterns.mlir
@@ -0,0 +1,272 @@
+// RUN: mlir-opt %s -transform-interpreter -split-input-file | FileCheck %s
+
+///----------------------------------------------------------------------------------------
+/// [Pattern: PadOpVectorizationWithTransferReadPattern]
+///----------------------------------------------------------------------------------------
+// CHECK-LABEL: func @pad_and_transfer_read
+// CHECK-SAME: %[[ARG0:.*]]: tensor<5x6xf32>
+// CHECK-NOT: tensor.pad
+// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
+// CHECK-DAG: %[[C5:.*]] = arith.constant 5.0
+// CHECK: %[[RESULT:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] : tensor<5x6xf32>, vector<7x9xf32>
+// CHECK: return %[[RESULT]]
+func.func @pad_and_transfer_read(%arg0: tensor<5x6xf32>) -> vector<7x9xf32> {
+ %c0 = arith.constant 0 : index
+ %c5 = arith.constant 5.0 : f32
+ %c6 = arith.constant 6.0 : f32
+ %0 = tensor.pad %arg0 low[0, 0] high[5, 7] {
+ ^bb0(%arg1: index, %arg2: index):
+ tensor.yield %c5 : f32
+ } : tensor<5x6xf32> to tensor<10x13xf32>
+ %1 = vector.transfer_read %0[%c0, %c0], %c6
+ : tensor<10x13xf32>, vector<7x9xf32>
+ return %1 : vector<7x9xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
+
+// -----
+
+///----------------------------------------------------------------------------------------
+/// [Pattern: PadOpVectorizationWithTransferWritePattern]
+///----------------------------------------------------------------------------------------
+func.func private @make_vector() -> vector<7x9xf32>
+
+// CHECK-LABEL: func @pad_and_transfer_write_static
+// CHECK-SAME: %[[ARG0:.*]]: tensor<5x6xf32>
+// CHECK-NOT: tensor.pad
+// CHECK: %[[C0:.*]] = arith.constant 0 : index
+// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
+// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[ARG0]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<5x6xf32>
+// CHECK: return %[[RESULT]]
+func.func @pad_and_transfer_write_static(
+ %arg0: tensor<5x6xf32>) -> tensor<5x6xf32> {
+ %c0 = arith.constant 0 : index
+ %c5 = arith.constant 5.0 : f32
+ %0 = tensor.pad %arg0 low[0, 0] high[5, 7] {
+ ^bb0(%arg2: index, %arg3: index):
+ tensor.yield %c5 : f32
+ } : tensor<5x6xf32> to tensor<10x13xf32>
+ %1 = call @make_vector() : () -> vector<7x9xf32>
+ %2 = vector.transfer_write %1, %0[%c0, %c0]
+ : vector<7x9xf32>, tensor<10x13xf32>
+ %3 = tensor.extract_slice %2[0, 0] [5, 6] [1, 1] : tensor<10x13xf32> to tensor<5x6xf32>
+ return %3 : tensor<5x6xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
+
+// -----
+
+func.func private @make_vector() -> vector<7x9xf32>
+
+// CHECK-LABEL: func @pad_and_transfer_write_dynamic_static
+// CHECK-SAME: %[[ARG0:.*]]: tensor<?x?xf32>, %[[SIZE:.*]]: index, %[[PADDING:.*]]: index
+// CHECK-NOT: tensor.pad
+// CHECK: %[[C0:.*]] = arith.constant 0 : index
+// CHECK: %[[SUB:.*]] = tensor.extract_slice %[[ARG0]][0, 0] [%[[SIZE]], 6] [1, 1] : tensor<?x?xf32> to tensor<?x6xf32>
+// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
+// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[SUB]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<?x6xf32>
+// CHECK: return %[[RESULT]]
+func.func @pad_and_transfer_write_dynamic_static(
+ %arg0: tensor<?x?xf32>, %size: index, %padding: index) -> tensor<?x6xf32> {
+ %c0 = arith.constant 0 : index
+ %c5 = arith.constant 5.0 : f32
+ %s = tensor.extract_slice %arg0[0, 0] [%size, 6] [1, 1]
+ : tensor<?x?xf32> to tensor<?x6xf32>
+ %0 = tensor.pad %s low[0, 0] high[%padding, 7] {
+ ^bb0(%arg2: index, %arg3: index):
+ tensor.yield %c5 : f32
+ } : tensor<?x6xf32> to tensor<?x13xf32>
+ %1 = call @make_vector() : () -> vector<7x9xf32>
+ %2 = vector.transfer_write %1, %0[%c0, %c0]
+ : vector<7x9xf32>, tensor<?x13xf32>
+ %3 = tensor.extract_slice %2[0, 0] [%size, 6] [1, 1] : tensor<?x13xf32> to tensor<?x6xf32>
+ return %3 : tensor<?x6xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
+
+
+// -----
+
+///----------------------------------------------------------------------------------------
+/// [Pattern: PadOpVectorizationWithInsertSlicePattern]
+///----------------------------------------------------------------------------------------
+
+func.func private @make_vector() -> tensor<12x13xf32>
+
+// CHECK-LABEL: func @pad_and_insert_slice_source
+// CHECK-SAME: %[[ARG0:.*]]: tensor<5x6xf32>
+// CHECK-NOT: tensor.pad
+// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
+// CHECK-DAG: %[[C5:.*]] = arith.constant 5.0
+// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> tensor<12x13xf32>
+// CHECK: %[[READ:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] : tensor<5x6xf32>, vector<7x9xf32>
+// CHECK: %[[WRITE:.*]] = vector.transfer_write %[[READ]], %[[VEC0]][%[[C0]], %[[C0]]] {in_bounds = [true, true]} : vector<7x9xf32>, tensor<12x13xf32>
+// CHECK: return %[[WRITE]]
+func.func @pad_and_insert_slice_source(
+ %arg0: tensor<5x6xf32>) -> tensor<12x13xf32> {
+ %c0 = arith.constant 0 : index
+ %c5 = arith.constant 5.0 : f32
+ %0 = tensor.pad %arg0 low[0, 0] high[2, 3] {
+ ^bb0(%arg2: index, %arg3: index):
+ tensor.yield %c5 : f32
+ } : tensor<5x6xf32> to tensor<7x9xf32>
+ %1 = call @make_vector() : () -> tensor<12x13xf32>
+ %r = tensor.insert_slice %0 into %1[0, 0][7, 9][1, 1] : tensor<7x9xf32> into tensor<12x13xf32>
+ return %r : tensor<12x13xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
+
+
+// -----
+
+///----------------------------------------------------------------------------------------
+/// tensor::PadOp -> tensor::EmptyOp + linalg::FillOp/tensor::GenerateOp + tensor::InsertSliceOp
+/// [Pattern: GenericPadOpVectorizationPattern]
+///----------------------------------------------------------------------------------------
+
+func.func private @make_vector() -> tensor<12x13xf32>
+
+// Same as @pad_and_insert_slice_dest in vectorization-with-patterns.mlir, but
+// CHECK-LABEL: func.func @pad_and_insert_slice_dest(
+// CHECK-SAME: %[[ARG_0:.*]]: tensor<1x5x6xf32>) -> tensor<1x12x13xf32> {
+// CHECK-NOT: tensor.pad
+// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
+// CHECK-DAG: %[[PAD:.*]] = arith.constant 5.000000e+00 : f32
+// CHECK: %[[EMPTY:.*]] = tensor.empty() : tensor<1x12x13xf32>
+// CHECK: %[[FILL:.*]] = linalg.fill ins(%[[PAD]] : f32) outs(%[[EMPTY]] : tensor<1x12x13xf32>) -> tensor<1x12x13xf32>
+// CHECK: %[[READ:.*]] = vector.transfer_read %[[ARG_0]]{{\[}}%[[C0]], %[[C0]], %[[C0]]], %[[PAD]] {in_bounds = [true, true, true]} : tensor<1x5x6xf32>, vector<1x5x6xf32>
+// CHECK: %[[WRITE:.*]] = vector.transfer_write %[[READ]], %[[FILL]]{{\[}}%[[C0]], %[[C0]], %[[C0]]] {in_bounds = [true, true, true]} : vector<1x5x6xf32>, tensor<1x12x13xf32>
+// CHECK: %[[VEC:.*]] = call @make_vector() : () -> tensor<12x13xf32>
+// CHECK: %[[RES:.*]] = tensor.insert_slice %[[VEC]] into %[[WRITE]][0, 0, 0] [1, 12, 13] [1, 1, 1] : tensor<12x13xf32> into tensor<1x12x13xf32>
+// CHECK: return %[[RES]] : tensor<1x12x13xf32>
+
+func.func @pad_and_insert_slice_dest(
+ %arg0: tensor<1x5x6xf32>) -> tensor<1x12x13xf32> {
+ %c5 = arith.constant 5.0 : f32
+ %0 = tensor.pad %arg0 low[0, 0, 0] high[0, 7, 7] {
+ ^bb0(%arg2: index, %arg3: index, %arg4: index):
+ tensor.yield %c5 : f32
+ } : tensor<1x5x6xf32> to tensor<1x12x13xf32>
+ %1 = call @make_vector() : () -> tensor<12x13xf32>
+ %r = tensor.insert_slice %1 into %0[0, 0, 0][1, 12, 13][1, 1, 1] : tensor<12x13xf32> into tensor<1x12x13xf32>
+ return %r : tensor<1x12x13xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
+
+// -----
+func.func private @make_vector() -> vector<7x9xf32>
+
+// Variant of @pad_and_transfer_write_static
+
+// CHECK-LABEL: func @pad_and_transfer_write_static_non_zero_low_pad
+// CHECK-NOT: tensor.pad
+// CHECK: linalg.fill
+func.func @pad_and_transfer_write_static_non_zero_low_pad(
+ %arg0: tensor<5x6xf32>) -> tensor<5x6xf32> {
+ %c0 = arith.constant 0 : index
+ %c5 = arith.constant 5.0 : f32
+ %0 = tensor.pad %arg0 low[0, 1] high[5, 6] {
+ ^bb0(%arg2: index, %arg3: index):
+ tensor.yield %c5 : f32
+ } : tensor<5x6xf32> to tensor<10x13xf32>
+ %1 = call @make_vector() : () -> vector<7x9xf32>
+ %2 = vector.transfer_write %1, %0[%c0, %c0]
+ : vector<7x9xf32>, tensor<10x13xf32>
+ %3 = tensor.extract_slice %2[0, 0] [5, 6] [1, 1] : tensor<10x13xf32> to tensor<5x6xf32>
+ return %3 : tensor<5x6xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
+
+// -----
+func.func private @make_vector() -> vector<7x9xf32>
+
+// Variant of @pad_and_transfer_write_static
+
+// CHECK-LABEL: func @pad_and_transfer_write_static_non_zero_offset
+// CHECK-NOT: tensor.pad
+// CHECK: linalg.fill
+func.func @pad_and_transfer_write_static_non_zero_offset(
+ %arg0: tensor<5x6xf32>) -> tensor<5x6xf32> {
+ %c0 = arith.constant 0 : index
+ %c5 = arith.constant 5.0 : f32
+ %0 = tensor.pad %arg0 low[0, 1] high[5, 6] {
+ ^bb0(%arg2: index, %arg3: index):
+ tensor.yield %c5 : f32
+ } : tensor<5x6xf32> to tensor<10x13xf32>
+ %1 = call @make_vector() : () -> vector<7x9xf32>
+ %2 = vector.transfer_write %1, %0[%c0, %c0]
+ : vector<7x9xf32>, tensor<10x13xf32>
+ %3 = tensor.extract_slice %2[0, 1] [5, 6] [1, 1] : tensor<10x13xf32> to tensor<5x6xf32>
+ return %3 : tensor<5x6xf32>
+}
+
+module attributes {transform.with_named_sequence} {
+ transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
+ %func_op = transform.structured.match ops{["func.func"]} in %arg1 : (!transform.any_op) -> !transform.op<"func.func">
+
+ transform.apply_patterns to %func_op {
+ transform.apply_patterns.linalg.pad_vectorization
+ } : !transform.op<"func.func">
+ transform.yield
+ }
+}
diff --git a/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir b/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
index 1c6a786bfa436d..189507d97d6dc2 100644
--- a/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
+++ b/mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir
@@ -935,149 +935,6 @@ module attributes {transform.with_named_sequence} {
}
}
-// -----
-
-// CHECK-LABEL: func @pad_and_transfer_read
-// CHECK-SAME: %[[ARG0:.*]]: tensor<5x6xf32>
-// CHECK-NOT: tensor.pad
-// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
-// CHECK-DAG: %[[C5:.*]] = arith.constant 5.0
-// CHECK: %[[RESULT:.*]] = vector.transfer_read %[[ARG0]][%[[C0]], %[[C0]]], %[[C5]] : tensor<5x6xf32>, vector<7x9xf32>
-// CHECK: return %[[RESULT]]
-func.func @pad_and_transfer_read(%arg0: tensor<5x6xf32>) -> vector<7x9xf32> {
- %c0 = arith.constant 0 : index
- %c5 = arith.constant 5.0 : f32
- %c6 = arith.constant 6.0 : f32
- %0 = tensor.pad %arg0 low[0, 0] high[5, 7] {
- ^bb0(%arg1: index, %arg2: index):
- tensor.yield %c5 : f32
- } : tensor<5x6xf32> to tensor<10x13xf32>
- %1 = vector.transfer_read %0[%c0, %c0], %c6
- : tensor<10x13xf32>, vector<7x9xf32>
- return %1 : vector<7x9xf32>
-}
-
-
-module attributes {transform.with_named_sequence} {
- transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
- %0 = transform.structured.match ops{["tensor.pad"]} in %arg1 : (!transform.any_op) -> !transform.any_op
- %1 = transform.get_parent_op %0 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
- %2 = transform.structured.vectorize_children_and_apply_patterns %1 { vectorize_padding } : (!transform.any_op) -> !transform.any_op
- transform.yield
- }
-}
-
-// -----
-
-func.func private @make_vector() -> vector<7x9xf32>
-
-// CHECK-LABEL: func @pad_and_transfer_write_static
-// CHECK-SAME: %[[ARG0:.*]]: tensor<5x6xf32>
-// CHECK-NOT: tensor.pad
-// CHECK: %[[C0:.*]] = arith.constant 0 : index
-// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
-// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[ARG0]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<5x6xf32>
-// CHECK: return %[[RESULT]]
-func.func @pad_and_transfer_write_static(
- %arg0: tensor<5x6xf32>) -> tensor<5x6xf32> {
- %c0 = arith.constant 0 : index
- %c5 = arith.constant 5.0 : f32
- %0 = tensor.pad %arg0 low[0, 0] high[5, 7] {
- ^bb0(%arg2: index, %arg3: index):
- tensor.yield %c5 : f32
- } : tensor<5x6xf32> to tensor<10x13xf32>
- %1 = call @make_vector() : () -> vector<7x9xf32>
- %2 = vector.transfer_write %1, %0[%c0, %c0]
- : vector<7x9xf32>, tensor<10x13xf32>
- %3 = tensor.extract_slice %2[0, 0] [5, 6] [1, 1] : tensor<10x13xf32> to tensor<5x6xf32>
- return %3 : tensor<5x6xf32>
-}
-
-module attributes {transform.with_named_sequence} {
- transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
- %3 = transform.structured.match ops{["tensor.pad"]} in %arg1 : (!transform.any_op) -> !transform.any_op
- %4 = transform.get_parent_op %3 {isolated_from_above} : (!transform.any_op) -> !transform.any_op
- %5 = transform.structured.vectorize_children_and_apply_patterns %4 { vectorize_padding } : (!transform.any_op) -> !transform.any_op
- transform.yield
- }
-}
-
-
-// -----
-
-func.func private @make_vector() -> vector<7x9xf32>
-
-// CHECK-LABEL: func @pad_and_transfer_write_dynamic_static
-// CHECK-SAME: %[[ARG0:.*]]: tensor<?x?xf32>, %[[SIZE:.*]]: index, %[[PADDING:.*]]: index
-// CHECK-NOT: tensor.pad
-// CHECK: %[[C0:.*]] = arith.constant 0 : index
-// CHECK: %[[SUB:.*]] = tensor.extract_slice %[[ARG0]][0, 0] [%[[SIZE]], 6] [1, 1] : tensor<?x?xf32> to tensor<?x6xf32>
-// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
-// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[SUB]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<?x6xf32>
-// CHECK: return %[[RESULT]]
-func.func @pad_and_transfer_write_dynamic_static(
- %arg0: tensor<?x?xf32>, %size: index, %padding: index) -> tensor<?x6xf32> {
- %c0 = arith.constant 0 : index
- %c5 = arith.constant 5.0 : f32
- %s = tensor.extract_slice %arg0[0, 0] [%size, 6] [1, 1]
- : tensor<?x?xf32> to tensor<?x6xf32>
- %0 = tensor.pad %s low[0, 0] high[%padding, 7] {
- ^bb0(%arg2: index, %arg3: index):
- tensor.yield %c5 : f32
- } : tensor<?x6xf32> to tensor<?x13xf32>
- %1 = call @make_vector() : () -> vector<7x9xf32>
- %2 = vector.transfer_write %1, %0[%c0, %c0]
- : vector<7x9xf32>, tensor<?x13xf32>
- %3 = tensor.extract_slice %2[0, 0] [%size, 6] [1, 1] : tensor<?x13xf32> to tensor<?x6xf32>
- return %3 : tensor<?x6xf32>
-}
-
-module attributes {transform.with_named_...
[truncated]
    let description = [{
      Apply patterns that take tensor.pad and rewrites it as
      vector.transfer_read/vector.transfer_write Ops.
The sentences seem broken, as two independent thoughts are packed together. My attempt to refine (maybe lost in translation):

// These patterns will either fold tensor.pad with an existing vector.transfer_read
// or vector.transfer_write. Or rewrite it together with tensor.insert_slice as a
// vector.transfer_read + vector.transfer_write pair.

Then explain about the producer/consumer point.
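For reference, the fold being described can be sketched from the `@pad_and_transfer_read` test above (shapes and constants are taken verbatim from that test):

```mlir
// Before: tensor.pad whose result feeds a vector.transfer_read (the "consumer").
%0 = tensor.pad %arg0 low[0, 0] high[5, 7] {
^bb0(%i: index, %j: index):
  tensor.yield %c5 : f32
} : tensor<5x6xf32> to tensor<10x13xf32>
%1 = vector.transfer_read %0[%c0, %c0], %c6 : tensor<10x13xf32>, vector<7x9xf32>

// After: the pad folds into the read; the yielded pad value %c5 becomes the
// read's padding operand, and the read targets the un-padded source directly.
%1 = vector.transfer_read %arg0[%c0, %c0], %c5 : tensor<5x6xf32>, vector<7x9xf32>
```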
Thank you for your suggestion - let me adapt it a bit. Hopefully with our joint efforts things will be clearer :)

I will work under the assumption that we want to remove GenericPadOpVectorizationPattern from this list - it doesn't really belong here - it's not a "vectorization" pattern per se. I am already in the process of refactoring it: #111349.

One step at a time though!
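For context, a sketch of what `GenericPadOpVectorizationPattern` produces, taken from the CHECK lines of `@pad_and_insert_slice_dest` above - a decomposition of `tensor.pad` rather than a vectorisation of it:

```mlir
// tensor.pad %arg0 low[0, 0, 0] high[0, 7, 7]
//     : tensor<1x5x6xf32> to tensor<1x12x13xf32>
// becomes:
%empty = tensor.empty() : tensor<1x12x13xf32>
%fill = linalg.fill ins(%pad : f32) outs(%empty : tensor<1x12x13xf32>) -> tensor<1x12x13xf32>
%read = vector.transfer_read %arg0[%c0, %c0, %c0], %pad {in_bounds = [true, true, true]}
    : tensor<1x5x6xf32>, vector<1x5x6xf32>
%write = vector.transfer_write %read, %fill[%c0, %c0, %c0] {in_bounds = [true, true, true]}
    : vector<1x5x6xf32>, tensor<1x12x13xf32>
```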
@@ -2285,7 +2285,7 @@ static SmallVector<Value> ofrToIndexValues(RewriterBase &rewriter, Location loc,
   return result;
 }

-/// Rewrite a tensor::PadOp into a sequence of EmptyOp, FillOp and
+/// Rewrite a tensor::PadOp into a sequence of EmptyOp, FillOp/GenerateOp and
as a sequence of [ FillOp, GenerateOp, InsertSlice ].
Let me revert this change altogether, I will be refactoring this in #111349 and later.
// CHECK: %[[VEC0:.*]] = call @make_vector() : () -> vector<7x9xf32>
// CHECK: %[[RESULT:.*]] = vector.transfer_write %[[VEC0]], %[[SUB]][%[[C0]], %[[C0]]] : vector<7x9xf32>, tensor<?x6xf32>
// CHECK: return %[[RESULT]]
func.func @pad_and_transfer_write_dynamic_static(
What is 'dynamic_static'? I guess you mean one part is dynamic and another is static?
I dunno :) This is just a copy of the original test and it wasn't me who wrote it.

Looking at the example, I assume this is referring to the "low" and "high" fields in the tensor.pad Op, which is the key thing in this test. Let me update these names accordingly.
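For reference, the mix in question, verbatim from the test above: the first "high" pad amount is dynamic (`%padding`) while the second is static (`7`), and the source likewise has one dynamic and one static dim:

```mlir
%0 = tensor.pad %s low[0, 0] high[%padding, 7] {
^bb0(%arg2: index, %arg3: index):
  tensor.yield %c5 : f32
} : tensor<?x6xf32> to tensor<?x13xf32>
```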
Address comment from Javed (force-pushed from f6d55c3 to c9efeda)
Ping @javedabsar1 :)
LGTM. This is wrapping populatePatterns and porting tests, i.e. mostly NFC.
@@ -2712,6 +2712,9 @@ struct PadOpVectorizationWithInsertSlicePattern

 void mlir::linalg::populatePadOpVectorizationPatterns(
     RewritePatternSet &patterns, PatternBenefit baseBenefit) {
+  // TODO: The following pattern implements "decomposition" and
nit. formatting seems wrong. clang-format?
This is meant to match the code below rather than the indented arguments above :)
… patterns

* Incorporate suggestions from Hanhan
* Add a negative test to document when vectorization of tensor.insert_slice might fail
* Update `@pad_and_insert_slice_dest` that was added in llvm#112504 (this change means that _all_ qualifying `tensor.insert_slice` Ops are vectorized).
* Added more tests to demonstrate other cases (e.g. default vs non-default pad value).
This PR simply wraps `populatePadOpVectorizationPatterns` into a new Transform Dialect Op: `apply_patterns.linalg.pad_vectorization`.

This change makes it possible to run (and test) the corresponding patterns without `transform.structured.vectorize_children_and_apply_patterns`.

Note that the Op above only supports non-masked vectorisation (i.e. when the inputs are static), so, effectively, only fixed-width vectorisation (as opposed to scalable vectorisation). As such, this change is required to construct vectorization pipelines for tensor.pad targeting scalable vectors.

To test the new Op and the corresponding patterns, I added "vectorization-pad-patterns.mlir" - most tests have been extracted from "vectorization-with-patterns.mlir".

As a side note, I feel that we should move `GenericPadOpVectorizationPattern` out of `populatePadOpVectorizationPatterns` as that's a "lower tensor.pad" rather than a "vectorize tensor.pad" pattern. I am leaving that as a TODO.
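A minimal sketch of the new Op in use, condensed from the tests added in this PR (the payload is any function containing a suitable `tensor.pad`):

```mlir
// RUN: mlir-opt %s -transform-interpreter

module attributes {transform.with_named_sequence} {
  transform.named_sequence @__transform_main(%arg1: !transform.any_op {transform.readonly}) {
    %func_op = transform.structured.match ops{["func.func"]} in %arg1
        : (!transform.any_op) -> !transform.op<"func.func">
    transform.apply_patterns to %func_op {
      transform.apply_patterns.linalg.pad_vectorization
    } : !transform.op<"func.func">
    transform.yield
  }
}
```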