[mlir][Vector] Add `vector.to_elements` op #141457

dcaballe · 2025-05-26T06:32:18Z

This PR introduces the vector.to_elements op, which decomposes a vector into its scalar elements. This operation is symmetrical to the existing vector.from_elements.

Examples:

    // Decompose a 0-D vector.
    %0 = vector.to_elements %v0 : vector<f32>
    // %0 = %v0[0]

    // Decompose a 1-D vector.
    %0:2 = vector.to_elements %v1 : vector<2xf32>
    // %0#0 = %v1[0]
    // %0#1 = %v1[1]

    // Decompose a 2-D.
    %0:6 = vector.to_elements %v2 : vector<2x3xf32>
    // %0#0 = %v2[0, 0]
    // %0#1 = %v2[0, 1]
    // %0#2 = %v2[0, 2]
    // %0#3 = %v2[1, 0]
    // %0#4 = %v2[1, 1]
    // %0#5 = %v2[1, 2]
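The result numbering in the examples above follows row-major order: the last vector dimension varies fastest. As a rough illustration only (plain Python over nested lists, not MLIR; the helper name `to_elements` here is a hypothetical stand-in for the op), the mapping from result number to vector index can be sketched as:

```python
from itertools import product

def to_elements(shape, vec):
    """Return (index, value) pairs of a nested-list 'vector' in row-major order."""
    results = []
    # product() iterates the last dimension fastest, i.e. row-major order.
    for idx in product(*(range(d) for d in shape)):
        value = vec
        for i in idx:
            value = value[i]
        results.append((idx, value))
    return results

v2 = [[10.0, 11.0, 12.0], [13.0, 14.0, 15.0]]
for n, (idx, value) in enumerate(to_elements((2, 3), v2)):
    print(f"%0#{n} = %v2{list(idx)} -> {value}")
```

For an empty shape (the 0-D case) the loop runs exactly once with an empty index, which matches the single result of the 0-D example.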

This op is aimed at reducing code size when modeling "structured" vector extractions and simplifying canonicalizations of large sequences of vector.extract and vector.insert ops into vector.shuffle and other sophisticated ops that can re-arrange vector elements.

More related PRs to come!


llvmbot · 2025-05-26T06:32:54Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-vector

Author: Diego Caballero (dcaballe)

Changes

This PR introduces the vector.to_elements op, which decomposes a vector into its scalar elements. This operation is symmetrical to the existing vector.from_elements.

Examples:

    // Decompose a 0-D vector.
    %0 = vector.to_elements %v0 : vector<f32>
    // %0 = %v0[0]

    // Decompose a 1-D vector.
    %0:2 = vector.to_elements %v1 : vector<2xf32>
    // %0#0 = %v1[0]
    // %0#1 = %v1[1]

    // Decompose a 2-D.
    %0:6 = vector.to_elements %v2 : vector<2x3xf32>
    // %0#0 = %v2[0, 0]
    // %0#1 = %v2[0, 1]
    // %0#2 = %v2[0, 2]
    // %0#3 = %v2[1, 0]
    // %0#4 = %v2[1, 1]
    // %0#5 = %v2[1, 2]

This op is aimed at reducing code size when modeling "structured" vector extractions and simplifying canonicalizations of large sequences of vector.extract and vector.insert ops into vector.shuffle and other sophisticated ops that can re-arrange vector elements.

More related PRs to come!

Full diff: https://github.com/llvm/llvm-project/pull/141457.diff

3 Files Affected:

(modified) mlir/include/mlir/Dialect/Vector/IR/VectorOps.td (+67-12)
(modified) mlir/test/Dialect/Vector/invalid.mlir (+19-3)
(modified) mlir/test/Dialect/Vector/ops.mlir (+19)

diff --git a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
index 5e8421ed67d66..3da47d8e612e2 100644
--- a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
+++ b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
@@ -789,6 +789,57 @@ def Vector_FMAOp :
   }];
 }
 
+def Vector_ToElementsOp : Vector_Op<"to_elements", [
+    Pure,
+    TypesMatchWith<"operand element type matches result types",
+                   "input", "elements", "SmallVector<Type>("
+                   "::llvm::cast<VectorType>($_self).getNumElements(), "
+                   "::llvm::cast<VectorType>($_self).getElementType())">]> {
+  let summary = "operation that decomposes a vector into all its scalar elements";
+  let description = [{
+    This operation decomposes all the scalar elements from a vector. The
+    decomposed scalar elements are returned in row-major order. The number of
+    scalar results must match the number of elements in the input vector type.
+    All the result elements have the same result type, which must match the
+    element type of the input vector. Scalable vectors are not supported.
+
+    Examples:
+
+    ```mlir
+    // Decompose a 0-D vector.
+    %0 = vector.to_elements %v0 : vector<f32>
+    // %0 = %v0[0]
+
+    // Decompose a 1-D vector.
+    %0:2 = vector.to_elements %v1 : vector<2xf32>
+    // %0#0 = %v1[0]
+    // %0#1 = %v1[1]
+
+    // Decompose a 2-D.
+    %0:6 = vector.to_elements %v2 : vector<2x3xf32>
+    // %0#0 = %v2[0, 0]
+    // %0#1 = %v2[0, 1]
+    // %0#2 = %v2[0, 2]
+    // %0#3 = %v2[1, 0]
+    // %0#4 = %v2[1, 1]
+    // %0#5 = %v2[1, 2]
+
+    // Decompose a 3-D vector.
+    %0:6 = vector.to_elements %v3 : vector<3x1x2xf32>
+    // %0#0 = %v3[0, 0, 0]
+    // %0#1 = %v3[0, 0, 1]
+    // %0#2 = %v3[1, 0, 0]
+    // %0#3 = %v3[1, 0, 1]
+    // %0#4 = %v3[2, 0, 0]
+    // %0#5 = %v3[2, 0, 1]
+    ```
+  }];
+
+  let arguments = (ins AnyVectorOfAnyRank:$input);
+  let results = (outs Variadic<AnyType>:$elements);
+  let assemblyFormat = "$input attr-dict `:` type($input)";
+}
+
 def Vector_FromElementsOp : Vector_Op<"from_elements", [
     Pure,
     TypesMatchWith<"operand types match result element type",
@@ -798,26 +849,30 @@ def Vector_FromElementsOp : Vector_Op<"from_elements", [
   let summary = "operation that defines a vector from scalar elements";
   let description = [{
     This operation defines a vector from one or multiple scalar elements. The
-    number of elements must match the number of elements in the result type.
-    All elements must have the same type, which must match the element type of
-    the result vector type.
-
-    `elements` are a flattened version of the result vector in row-major order.
+    scalar elements are arranged in row-major within the vector. The number of
+    elements must match the number of elements in the result type. All elements
+    must have the same type, which must match the element type of the result
+    vector type. Scalable vectors are not supported.
 
-    Example:
+    Examples:
 
     ```mlir
-    // %f1
+    // Define a 0-D vector.
     %0 = vector.from_elements %f1 : vector<f32>
-    // [%f1, %f2]
+    // [%f1]
+
+    // Define a 1-D vector.
     %1 = vector.from_elements %f1, %f2 : vector<2xf32>
-    // [[%f1, %f2, %f3], [%f4, %f5, %f6]]
+    // [%f1, %f2]
+
+    // Define a 2-D vector.
     %2 = vector.from_elements %f1, %f2, %f3, %f4, %f5, %f6 : vector<2x3xf32>
-    // [[[%f1, %f2]], [[%f3, %f4]], [[%f5, %f6]]]
+    // [[%f1, %f2, %f3], [%f4, %f5, %f6]]
+
+    // Define a 3-D vector.
     %3 = vector.from_elements %f1, %f2, %f3, %f4, %f5, %f6 : vector<3x1x2xf32>
+    // [[[%f1, %f2]], [[%f3, %f4]], [[%f5, %f6]]]
     ```
-
-    Note, scalable vectors are not supported.
   }];
 
   let arguments = (ins Variadic<AnyType>:$elements);
diff --git a/mlir/test/Dialect/Vector/invalid.mlir b/mlir/test/Dialect/Vector/invalid.mlir
index 04810ed52584f..70a7274182442 100644
--- a/mlir/test/Dialect/Vector/invalid.mlir
+++ b/mlir/test/Dialect/Vector/invalid.mlir
@@ -1896,7 +1896,24 @@ func.func @deinterleave_scalable_rank_fail(%vec : vector<2x[4]xf32>) {
 
 // -----
 
-func.func @invalid_from_elements(%a: f32) {
+func.func @to_elements_wrong_num_results(%a: vector<1x1x2xf32>) {
+  // expected-error @+1 {{operation defines 2 results but was provided 4 to bind}}
+  %0:4 = vector.to_elements %a : vector<1x1x2xf32>
+  return
+}
+
+// -----
+
+func.func @to_elements_wrong_result_type(%a: vector<2xf32>) -> i32 {
+  // expected-error @+3 {{use of value '%0' expects different type than prior uses: 'i32'}}
+  // expected-note @+1 {{prior use here}}
+  %0:2 = vector.to_elements %a : vector<2xf32>
+  return %0#0 : i32
+}
+
+// -----
+
+func.func @from_elements_wrong_num_operands(%a: f32) {
   // expected-error @+1 {{'vector.from_elements' number of operands and types do not match: got 1 operands and 2 types}}
   vector.from_elements %a : vector<2xf32>
   return
@@ -1905,12 +1922,11 @@ func.func @invalid_from_elements(%a: f32) {
 // -----
 
 // expected-note @+1 {{prior use here}}
-func.func @invalid_from_elements(%a: f32, %b: i32) {
+func.func @from_elements_wrong_operand_type(%a: f32, %b: i32) {
   // expected-error @+1 {{use of value '%b' expects different type than prior uses: 'f32' vs 'i32'}}
   vector.from_elements %a, %b : vector<2xf32>
   return
 }
-
 // -----
 
 func.func @invalid_from_elements_scalable(%a: f32, %b: i32) {
diff --git a/mlir/test/Dialect/Vector/ops.mlir b/mlir/test/Dialect/Vector/ops.mlir
index f3220aed4360c..7cfe4e89d6e2f 100644
--- a/mlir/test/Dialect/Vector/ops.mlir
+++ b/mlir/test/Dialect/Vector/ops.mlir
@@ -1175,6 +1175,25 @@ func.func @deinterleave_nd_scalable(%arg:vector<2x3x4x[6]xf32>) -> (vector<2x3x4
   return %0, %1 : vector<2x3x4x[3]xf32>, vector<2x3x4x[3]xf32>
 }
 
+// CHECK-LABEL: func @to_elements(
+//  CHECK-SAME:     %[[A_VEC:.*]]: vector<f32>, %[[B_VEC:.*]]: vector<4xf32>,
+//  CHECK-SAME:     %[[C_VEC:.*]]: vector<1xf32>, %[[D_VEC:.*]]: vector<2x2xf32>)
+func.func @to_elements(%a_vec : vector<f32>, %b_vec : vector<4xf32>, %c_vec : vector<1xf32>,  %d_vec : vector<2x2xf32>)
+                   -> (f32, f32, f32, f32, f32, f32, f32, f32, f32, f32) {
+  // CHECK: %[[A_ELEMS:.*]] = vector.to_elements %[[A_VEC]] : vector<f32>
+  %0 = vector.to_elements %a_vec : vector<f32>
+  // CHECK: %[[B_ELEMS:.*]]:4 = vector.to_elements %[[B_VEC]] : vector<4xf32>
+  %1:4 = vector.to_elements %b_vec : vector<4xf32>
+  // CHECK: %[[C_ELEMS:.*]] = vector.to_elements %[[C_VEC]] : vector<1xf32>
+  %2 = vector.to_elements %c_vec : vector<1xf32>
+  // CHECK: %[[D_ELEMS:.*]]:4 = vector.to_elements %[[D_VEC]] : vector<2x2xf32>
+  %3:4 = vector.to_elements %d_vec : vector<2x2xf32>
+  //      CHECK: return %[[A_ELEMS]], %[[B_ELEMS]]#0, %[[B_ELEMS]]#1, %[[B_ELEMS]]#2,
+  // CHECK-SAME:   %[[B_ELEMS]]#3, %[[C_ELEMS]], %[[D_ELEMS]]#0, %[[D_ELEMS]]#1,
+  // CHECK-SAME:   %[[D_ELEMS]]#2, %[[D_ELEMS]]#3
+  return %0, %1#0, %1#1, %1#2, %1#3, %2, %3#0, %3#1, %3#2, %3#3 : f32, f32, f32, f32, f32, f32, f32, f32, f32, f32
+}
+
 // CHECK-LABEL: func @from_elements(
 //  CHECK-SAME:     %[[a:.*]]: f32, %[[b:.*]]: f32)
 func.func @from_elements(%a: f32, %b: f32) -> (vector<f32>, vector<1xf32>, vector<1x2xf32>, vector<2x2xf32>) {

banach-space · 2025-05-26T14:18:45Z

Thanks, Diego! I’ve left a few minor comments inline. I also have one broader question.

> This op is aimed at reducing code size when modeling "structured" vector extractions and simplifying canonicalizations of large sequences of vector.extract and vector.insert ops into vector.shuffle and other sophisticated ops that can re-arrange vector elements.

That mostly makes sense, but I’m curious about the practical next steps and the overall direction we’re aiming for. Specifically:

If to_elements + from_elements are intended to simplify existing logic, would that require significant refactoring? Do you anticipate any larger changes stemming from this?

Basically, I want to make sure we avoid "dangling" ops 😅 - currently, vector.from_elements isn't widely used.

banach-space

Thanks!

banach-space · 2025-05-26T14:03:18Z

mlir/include/mlir/Dialect/Vector/IR/VectorOps.td

+    TypesMatchWith<"operand element type matches result types",
+                   "input", "elements", "SmallVector<Type>("
+                   "::llvm::cast<VectorType>($_self).getNumElements(), "
+                   "::llvm::cast<VectorType>($_self).getElementType())">]> {


Could this be shared with vector.from_elements?
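To make the question concrete: both constraints boil down to "a list of `getNumElements()` copies of the element type", compared against the results for to_elements and against the operands for from_elements. A minimal Python sketch of that shared computation follows; the function names are hypothetical and types are modeled as plain strings, so this is not the TableGen or C++ that would actually be shared.

```python
def expected_scalar_types(shape, element_type):
    """Model of SmallVector<Type>(numElements, elementType) for a fixed-size vector."""
    num_elements = 1
    for dim in shape:
        num_elements *= dim  # an empty shape (0-D vector) yields 1 element
    return [element_type] * num_elements

def verify_to_elements(shape, element_type, result_types):
    # to_elements: the result types must match the expected scalar list.
    return list(result_types) == expected_scalar_types(shape, element_type)

def verify_from_elements(shape, element_type, operand_types):
    # from_elements: the operand types must match the same list.
    return list(operand_types) == expected_scalar_types(shape, element_type)

print(verify_to_elements((2, 3), "f32", ["f32"] * 6))
print(verify_from_elements((2,), "f32", ["f32", "i32"]))
```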

banach-space · 2025-05-26T14:12:05Z

mlir/include/mlir/Dialect/Vector/IR/VectorOps.td

+    ```
+  }];
+
+  let arguments = (ins AnyVectorOfAnyRank:$input);


$input is not very descriptive and quite uncommon in Vector. Instead, IMHO, we should re-use one of the existing names to maintain consistency. My suggestion would be:

$source for to_elements,

$dest for from_elements.

Why $source? Basically, I looked at other Vector Ops that take one argument:

https://mlir.llvm.org/docs/Dialects/Vector/#vectorbroadcast-vectorbroadcastop

https://mlir.llvm.org/docs/Dialects/Vector/#vectorbitcast-vectorbitcastop

Why $dest? It naturally complements $source.

Groverkss · 2025-05-26T14:38:17Z

mlir/include/mlir/Dialect/Vector/IR/VectorOps.td

+    This operation decomposes all the scalar elements from a vector. The
+    decomposed scalar elements are returned in row-major order. The number of
+    scalar results must match the number of elements in the input vector type.
+    All the result elements have the same result type, which must match the
+    element type of the input vector. Scalable vectors are not supported.


Is it important that it decomposes into all elements? This op could be really useful for unrolling a dimension if we could do it dimwise. Something like:

%0:16 = vector.to_elements %v : vector<16x4xf32> -> vector<4xf32>

This should have the exact same semantics as vector.extract, just doing multiple extracts at once.

I would much rather have this form of the operation, it is much closer to vector.extract and works for N-D vectors much better.
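For readers comparing the two forms, here is a small Python sketch of the dim-wise variant proposed in this comment. It is purely hypothetical (this PR does not implement it, and `to_subvectors` is an illustrative name): the leading dimension is unrolled into sub-vectors, and flattening those sub-vectors gives the same sequence the scalar form returns.

```python
def to_subvectors(vec):
    """Hypothetical dim-wise decomposition: one sub-vector per leading index."""
    return [list(row) for row in vec]

def flatten(rows):
    """Concatenate sub-vectors; for 2-D input this is the row-major scalar order."""
    return [x for row in rows for x in row]

# Model of a vector<16x4xf32> whose element at [i, j] is 4*i + j.
v = [[4 * i + j for j in range(4)] for i in range(16)]
rows = to_subvectors(v)
print(len(rows), rows[1])
```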


newling

While I think that overall we should be deprecating/simplifying as per the suggestions in this doc, I think this particular new op is a sensible one to have alongside from_elements. Thanks!

newling · 2025-05-27T15:07:17Z

mlir/include/mlir/Dialect/Vector/IR/VectorOps.td

+    ```
+  }];
+
+  let arguments = (ins AnyVectorOfAnyRank:$input);


Suggested change

-  let arguments = (ins AnyVectorOfAnyRank:$input);
+  let arguments = (ins AnyFixedVectorOfAnyRank:$input);

perhaps?

newling · 2025-05-27T15:18:12Z

mlir/include/mlir/Dialect/Vector/IR/VectorOps.td

+    This operation decomposes all the scalar elements from a vector. The
+    decomposed scalar elements are returned in row-major order. The number of
+    scalar results must match the number of elements in the input vector type.
+    All the result elements have the same result type, which must match the
+    element type of the input vector. Scalable vectors are not supported.


I think that keeping the symmetry with from_elements is valuable. I'm not sure I follow the suggestion, but is it doing something that chaining extract / extract_strided_slice / shape_cast / to_elements cannot achieve?

dcaballe · 2025-05-27T18:27:28Z

Thanks for the feedback! Let me elaborate a bit more on the value that this op (and also the existing vector.from_elements) brings:

1. Major code size reduction: vector.to_elements leverages MLIR's multi-result ops to replace long sequences of vector.extract operations with a single vector.to_elements op. For instance, imagine extracting all elements from a vector<1024xf32>. Instead of generating 1,024 separate vector.extract ops, we would only need one vector.to_elements. This is a major improvement in code size, which benefits any pass that traverses the IR.
2. Reduced compilation time/transformation complexity: vector.to_elements and vector.from_elements streamline transformations and analyses involving large sequences of vector extractions and insertions. They inherently encode the source vector and the extraction order of all its elements (no indices!) in one operation. To give you a concrete example, LLVM's InstCombine spends a lot of time untangling chains of extractelement and insertelement instructions to turn them into shufflevector instructions. This involves grouping extractions by source vector, analyzing and sorting their indices, matching them with insertion indices, and so on. With vector.to_elements and vector.from_elements, much of this complexity is gone. For example:
   - We can determine if a vector.to_elements feeding a vector.from_elements is redundant by just doing llvm::equals(toElements.getResults(), fromElements.getOperands())
   - We can determine if a vector.to_elements feeding a vector.from_elements can be turned into a vector.shuffle by just checking that all the vector.from_elements operands come from the same vector.to_elements (this is just a simple case).
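The two checks above can be modeled outside of MLIR. The following Python sketch is only an illustration under simplifying assumptions: SSA results are represented as (producer, index) pairs, all names are hypothetical, and a real pattern would also have to compare the source and result vector types.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Result:
    """One scalar result of a to_elements op: producing op and result number."""
    producer: str
    index: int

def to_elements(name, num_elements):
    """Model the results of a to_elements op named `name`."""
    return [Result(name, i) for i in range(num_elements)]

def is_redundant_round_trip(to_results, from_operands):
    """True if from_elements consumes exactly the to_elements results, in order.
    Assumption: the vector types on both sides are identical."""
    return list(to_results) == list(from_operands)

def shuffle_mask(from_operands):
    """Return a permutation mask if every operand comes from one to_elements op,
    otherwise None (the simple single-source case only)."""
    producers = {op.producer for op in from_operands}
    if len(producers) != 1:
        return None
    return [op.index for op in from_operands]

a = to_elements("a", 4)
b = to_elements("b", 4)
print(is_redundant_round_trip(a, a))           # True
print(shuffle_mask([a[3], a[2], a[1], a[0]]))  # [3, 2, 1, 0]
print(shuffle_mask([a[0], b[1]]))              # None
```

Neither check sorts or groups indices: the order is already carried by the result numbers.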

> would that require significant refactoring? Do you anticipate any larger changes stemming from this?

I don’t anticipate major changes other than replacing loops creating vector.extract for every element of a vector with a single vector.to_elements (and symmetrical replacement for vector.from_elements).
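To give a feel for the replacement described here, the sketch below generates the textual IR for both approaches and counts the ops. It is a string-level illustration only: the helper names and SSA value names are made up, and the vector.extract line assumes the `: f32 from vector<...>` assembly form.

```python
def emit_extracts(vec, num_elements, elem_type="f32"):
    """One vector.extract per element (the loop-based approach)."""
    vec_type = f"vector<{num_elements}x{elem_type}>"
    return [
        f"%e{i} = vector.extract {vec}[{i}] : {elem_type} from {vec_type}"
        for i in range(num_elements)
    ]

def emit_to_elements(vec, num_elements, elem_type="f32"):
    """A single multi-result vector.to_elements op."""
    vec_type = f"vector<{num_elements}x{elem_type}>"
    return [f"%e:{num_elements} = vector.to_elements {vec} : {vec_type}"]

before = emit_extracts("%v", 1024)
after = emit_to_elements("%v", 1024)
print(len(before), "ops vs", len(after), "op")
print(after[0])
```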

> Basically, I want to make sure we avoid "dangling" ops 😅 - currently, vector.from_elements isn't widely used.

Yeah, mostly because we needed the symmetrical op that this PR introduces, plus the corresponding canonicalization patterns and lowering, which should come after this.

> Is it important that it decomposes into all elements? This op could be really useful for unrolling a dimension if we could do it dimwise. Something like:
> %0:16 = vector.to_elements %v : vector<16x4xf32> -> vector<4xf32>

As I mentioned above, it's important that all the elements are decomposed to offer an implicit and trivial extraction order that doesn't have to be analyzed. However, I think decomposing into sub-vectors is a natural follow-up that would be very helpful for unrolling, yes! I suggest, though, that we approach this incrementally by first having all the pieces in place for the simple scalar cases before enabling more complex cases. Does that sound reasonable?

banach-space · 2025-05-29T15:33:16Z

Thanks for elaborating, Diego!

This makes sense to me, the benefits of including these Ops in Vector are clear. Both code-size and compilation-complexity reduction will benefit multiple users.

Overall, this looks good to me, modulo the ongoing discussions.

dcaballe requested a review from kuhar as a code owner May 26, 2025 06:32

llvmbot added mlir:vectorops mlir mlir:vector labels May 26, 2025

dcaballe requested review from banach-space and newling May 26, 2025 06:32

banach-space reviewed May 26, 2025


Groverkss requested changes May 26, 2025


newling reviewed May 27, 2025
