[mlir][vector] Update the folder for vector.{insert|extract} #136579

banach-space · 2025-04-21T16:53:46Z

This is a minor follow-up to #135498. It ensures that operations like
the following are not treated as out-of-bounds accesses and can be
folded correctly (*):

  %c_neg_1 = arith.constant -1 : index
  %0 = vector.insert %value_to_store, %dest[%c_neg_1] : vector<5xf32> into vector<4x5xf32>
  %1 = vector.extract %src[%c_neg_1, 0] : f32 from vector<4x5xf32>

In addition to adding tests for the case above, this PR also relocates
the tests from #135498 to be alongside existing tests for the
vector.{insert|extract} folder, and reformats them to follow:

https://mlir.llvm.org/getting_started/TestingGuide/

For example:

The "no_fold" prefix is now used to label negative tests.
Redundant check lines have been removed (e.g., CHECK: vector.insert
is sufficient to verify that folding did not occur).

(*) As per https://mlir.llvm.org/docs/Dialects/Vector/#vectorinsert-vectorinsertop, these are poison values.

This is a minor follow-up to llvm#135498. It ensures that operations like the following are not treated as out-of-bounds accesses and can be folded correctly: ```mlir %c_neg_1 = arith.constant -1 : index %0 = vector.insert %value_to_store, %dest[%c_neg_1] : vector<5xf32> into vector<4x5xf32>O` %1 = vector.extract %src[%c_neg_1, 0] : f32 from vector<4x5xf32> ``` In addition to adding tests for the case above, this PR also relocates the tests from llvm#135498 to be alongside existing tests for the `vector.{insert|extract}` folder, and reformats them to follow: * https://mlir.llvm.org/getting_started/TestingGuide/ For example: * The "no_fold" prefix is now used to label negative tests. * Redundant check lines have been removed (e.g., CHECK: vector.insert is sufficient to verify that folding did not occur).

llvmbot · 2025-04-21T16:54:20Z

@llvm/pr-subscribers-mlir-vector

@llvm/pr-subscribers-mlir

Author: Andrzej Warzyński (banach-space)

Changes

This is a minor follow-up to #135498. It ensures that operations like
the following are not treated as out-of-bounds accesses and can be
folded correctly:

  %c_neg_1 = arith.constant -1 : index
  %0 = vector.insert %value_to_store, %dest[%c_neg_1] : vector&lt;5xf32&gt; into vector&lt;4x5xf32&gt;O`
  %1 = vector.extract %src[%c_neg_1, 0] : f32 from vector&lt;4x5xf32&gt;

In addition to adding tests for the case above, this PR also relocates
the tests from #135498 to be alongside existing tests for the
vector.{insert|extract} folder, and reformats them to follow:

https://mlir.llvm.org/getting_started/TestingGuide/

For example:

The "no_fold" prefix is now used to label negative tests.
Redundant check lines have been removed (e.g., CHECK: vector.insert
is sufficient to verify that folding did not occur).

Full diff: https://github.com/llvm/llvm-project/pull/136579.diff

2 Files Affected:

(modified) mlir/lib/Dialect/Vector/IR/VectorOps.cpp (+3-2)
(modified) mlir/test/Dialect/Vector/canonicalize.mlir (+55-38)

diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 368259b38b153..df2709dc44ef6 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -2019,8 +2019,9 @@ static Value extractInsertFoldConstantOp(OpType op, AdaptorType adaptor,
     Value position = dynamicPosition[index++];
     if (auto attr = mlir::dyn_cast_if_present<IntegerAttr>(positionAttr)) {
       int64_t value = attr.getInt();
-      // Do not fold if the value is out of bounds.
-      if (value >= 0 && value < vectorShape[i]) {
+      // Do not fold if the value is out of bounds (-1 signifies a poison
+      // value rather than OOB index).
+      if (value >= -1 && value < vectorShape[i]) {
         staticPosition[i] = attr.getInt();
         opChange = true;
         continue;
diff --git a/mlir/test/Dialect/Vector/canonicalize.mlir b/mlir/test/Dialect/Vector/canonicalize.mlir
index 2d365ac2b4287..29a11f47481c8 100644
--- a/mlir/test/Dialect/Vector/canonicalize.mlir
+++ b/mlir/test/Dialect/Vector/canonicalize.mlir
@@ -165,6 +165,33 @@ func.func @extract_scalar_poison_idx(%a: vector<4x5xf32>) -> f32 {
   return %0 : f32
 }
 
+// -----
+
+// Similar to the test above, but the index is not a static constant.
+
+// CHECK-LABEL: @extract_scalar_poison_idx_non_cst
+func.func @extract_scalar_poison_idx_non_cst(%a: vector<4x5xf32>) -> f32 {
+  // CHECK-NEXT: %[[UB:.*]] = ub.poison : f32
+  //  CHECK-NOT: vector.extract
+  // CHECK-NEXT: return %[[UB]] : f32
+  %c_neg_1 = arith.constant -1 : index
+  %0 = vector.extract %a[%c_neg_1, 0] : f32 from vector<4x5xf32>
+  return %0 : f32
+}
+
+// -----
+
+// Similar to test above, but now the index is out-of-bounds.
+
+// CHECK-LABEL: @no_fold_extract_scalar_oob_idx
+func.func @no_fold_extract_scalar_oob_idx(%a: vector<4x5xf32>) -> f32 {
+  //  CHECK: vector.extract
+  %c_neg_2 = arith.constant -2 : index
+  %0 = vector.extract %a[%c_neg_2, 0] : f32 from vector<4x5xf32>
+  return %0 : f32
+}
+
+
 // -----
 
 // CHECK-LABEL: @extract_vector_poison_idx
@@ -3062,6 +3089,34 @@ func.func @insert_vector_poison_idx(%a: vector<4x5xf32>, %b: vector<5xf32>)
 
 // -----
 
+// Similar to the test above, but the index is not a static constant.
+
+// CHECK-LABEL: @insert_vector_poison_idx_non_cst
+func.func @insert_vector_poison_idx_non_cst(%a: vector<4x5xf32>, %b: vector<5xf32>)
+    -> vector<4x5xf32> {
+  // CHECK-NEXT: %[[UB:.*]] = ub.poison : vector<4x5xf32>
+  //  CHECK-NOT: vector.insert
+  // CHECK-NEXT: return %[[UB]] : vector<4x5xf32>
+  %c_neg_1 = arith.constant -1 : index
+  %0 = vector.insert %b, %a[%c_neg_1] : vector<5xf32> into vector<4x5xf32>
+  return %0 : vector<4x5xf32>
+}
+
+// -----
+
+// Similar to test above, but now the index is out-of-bounds.
+
+// CHECK-LABEL: @no_fold_insert_scalar_idx_oob
+func.func @no_fold_insert_scalar_idx_oob(%a: vector<4x5xf32>, %b: vector<5xf32>)
+    -> vector<4x5xf32> {
+  //  CHECK: vector.insert
+  %c_neg_2 = arith.constant -2 : index
+  %0 = vector.insert %b, %a[%c_neg_2] : vector<5xf32> into vector<4x5xf32>
+  return %0 : vector<4x5xf32>
+}
+
+// -----
+
 // CHECK-LABEL: @insert_multiple_poison_idx
 func.func @insert_multiple_poison_idx(%a: vector<4x5x8xf32>, %b: vector<8xf32>)
     -> vector<4x5x8xf32> {
@@ -3311,41 +3366,3 @@ func.func @fold_insert_constant_indices(%arg : vector<4x1xi32>) -> vector<4x1xi3
   %res = vector.insert %1, %arg[%0, %0] : i32 into vector<4x1xi32>
   return %res : vector<4x1xi32>
 }
-
-// -----
-
-// Check that out of bounds indices are not folded for vector.insert.
-
-// CHECK-LABEL: @fold_insert_oob
-//  CHECK-SAME:   %[[ARG:.*]]: vector<4x1x2xi32>) -> vector<4x1x2xi32> {
-//       CHECK:   %[[OOB1:.*]] = arith.constant -2 : index
-//       CHECK:   %[[OOB2:.*]] = arith.constant 2 : index
-//       CHECK:   %[[VAL:.*]] = arith.constant 1 : i32
-//       CHECK:   %[[RES:.*]] = vector.insert %[[VAL]], %[[ARG]] [0, %[[OOB1]], %[[OOB2]]] : i32 into vector<4x1x2xi32>
-//       CHECK:   return %[[RES]] : vector<4x1x2xi32>
-func.func @fold_insert_oob(%arg : vector<4x1x2xi32>) -> vector<4x1x2xi32> {
-  %c0 = arith.constant 0 : index
-  %c-2 = arith.constant -2 : index
-  %c2 = arith.constant 2 : index
-  %c1 = arith.constant 1 : i32
-  %res = vector.insert %c1, %arg[%c0, %c-2, %c2] : i32 into vector<4x1x2xi32>
-  return %res : vector<4x1x2xi32>
-}
-
-// -----
-
-// Check that out of bounds indices are not folded for vector.extract.
-
-// CHECK-LABEL: @fold_extract_oob
-//  CHECK-SAME:   %[[ARG:.*]]: vector<4x1x2xi32>) -> i32 {
-//       CHECK:   %[[OOB1:.*]] = arith.constant -2 : index
-//       CHECK:   %[[OOB2:.*]] = arith.constant 2 : index
-//       CHECK:   %[[RES:.*]] = vector.extract %[[ARG]][0, %[[OOB1]], %[[OOB2]]] : i32 from vector<4x1x2xi32>
-//       CHECK:   return %[[RES]] : i32
-func.func @fold_extract_oob(%arg : vector<4x1x2xi32>) -> i32 {
-  %c0 = arith.constant 0 : index
-  %c-2 = arith.constant -2 : index
-  %c2 = arith.constant 2 : index
-  %res = vector.extract %arg[%c0, %c-2, %c2] : i32 from vector<4x1x2xi32>
-  return %res : i32
-}

math-fehr · 2025-04-21T19:00:02Z

Thanks for fixing my tests, I'll make sure to follow the testing guide next time!

About the canonicalization, my understanding is that this -1 should be considered as an
OOB access, as only -1 constants (as in -1 attributes) are representing poison accesses (if I understand correctly). This change would mean that OOB accesses are not folded to poison,
unless the OOB access in -1, which feels weird to me.

I thought that the reason -1 was used was that it is not possible to have an #ub.poison attribute in the constant index array in vector.insert. And that this was to fold the following:

%poison = ub.poison : index
%0 = vector.insert %value_to_store, %dest[%poison] : vector<5xf32> into vector<4x5xf32>

What do you think? Or please tell me if I'm misunderstanding something

banach-space · 2025-04-21T19:19:09Z

Thanks for fixing my tests, I'll make sure to follow the testing guide next time!

No worries at all - the testing guide is still quite new and unfamiliar to most contributors. Rome wasn’t built in a day :)

my understanding is that this -1 should be considered as an OOB access, as only -1 constants (as in -1 attributes) are representing poison accesses (if I understand correctly).

Why do you think we should treat -1 (as an attribute) and %c_neg_1 (an SSA value that's a compile-time constant equal to -1) differently? Is there documentation I'm missing, or another rationale?

To me, they both represent the same value at compile time, so I would expect them to be treated consistently, regardless of the underlying representation.

math-fehr · 2025-04-21T20:11:11Z

Why do you think we should treat -1 (as an attribute) and %c_neg_1 (an SSA value that's a compile-time constant equal to -1) differently? Is there documentation I'm missing, or another rationale?

That's probably where I'm misunderstanding something.
I thought that -1 (as an attribute) corresponded to a %poison = ub.poison : index value.
Since only constants that are out of bounds are accetped in vector.insert anyway, it would be fine (but weird I agree) to have a different meaning than a %neg_1 = arith.constant -1 : index constant.

But if the meaning is for -1 to actually represent -1, and that the semantics of vector.insert OOB accesses are only defined for -1, then that's fine for me. I guess my main issue is that we define only the semantics for -1 and not for other OOB values. I guess things would be clearer if we decide what is the meaning of any OOB access, wether we decide it is UB or poison.

banach-space · 2025-04-22T11:44:24Z

I thought that -1 (as an attribute) corresponded to a %poison = ub.poison : index value.

Ah, I see how you interpreted that - thanks for clarifying! That also made me realise we actually have three (rather than two) relevant options:

  // Option 1   
  %1 = vector.extract %src[-1, 0] : f32 from vector<4x5xf32>
  
  // Option 2
  %c_neg_1 = arith.constant -1 : inde
  %2 = vector.extract %src[%c_neg_1, 0] : f32 from vector<4x5xf32>
  
  // Option 3  
  %poison = ub.poison : index
  %3 = vector.extract %src[%poison, 0] : f32 from vector<4x5xf32>

Your PR takes care of Option 1, and I’m currently addressing Option 2. That still leaves Option 3 open.
Now, (extra emphasis by me):

it would be fine (but weird I agree) to have a different meaning than a %neg_1 = arith.constant -1 : index constant.

Indeed, we should strive for consistency unless there's a strong reason not to. From the Vector dialect docs:

The value -1 represents a poison index, which specifies that the resulting vector is poison.

To me, -1 refers to a compile-time constant — and that should include both attribute and SSA representations (e.g. %neg_1 = arith.constant -1 : index). That’s the rationale for treating Option 1 and Option 2 consistently.

Also from the docs:

The result is undefined if any index is out-of-bounds.

So we're already making a special exception for -1, and I agree it would be helpful to update the documentation to make this exception clearer.

I guess things would be clearer if we decide what is the meaning of any OOB access, wether we decide it is UB or poison.

Yes, good point — and actually @dcaballe has voiced support for treating any OOB access as poison (see this comment):

I don't see a problem with allowing more OOB indices. This was initially constrained to test the waters. I would canonicalize OOB to -1, though, to avoid having arbitrary odd numbers in the IR.

That would also make the semantics more consistent with LLVM’s own behavior (LangRef):

If idx exceeds the length of val for a fixed-length vector, the result is a poison value

NEXT STEPS

I would land this as is - to me this change is consistent with the existing docs. As for other OOB indices (other than -1) and using %poison = ub.poison : index, I'd move that to a separate PR. WDYT? Oh, and thanks for the comments and the discussion 🙏🏻

math-fehr · 2025-04-22T14:00:22Z

Yes, good point — and actually @dcaballe has voiced support for treating any OOB access as poison (see #134516 (comment)):

Just mentionning this here so we don't forget, but if I recall @kuhar had a preference to immediate UB (#134516 (comment)).

And yes, I agree with you on all points, let's merge this first and then let's see the rest!

Also, just a quick question, why do we fold the -1 instead of directly folding the operation to ub.poison?

banach-space · 2025-04-24T09:42:57Z

Created #137147 to continue the discussion on OOB and the follow-up work. Thanks for the discussion so far, @math-fehr ! 🙏🏻

…6579) This is a minor follow-up to llvm#135498. It ensures that operations like the following are not treated as out-of-bounds accesses and can be folded correctly (*): ```mlir %c_neg_1 = arith.constant -1 : index %0 = vector.insert %value_to_store, %dest[%c_neg_1] : vector<5xf32> into vector<4x5xf32> %1 = vector.extract %src[%c_neg_1, 0] : f32 from vector<4x5xf32> ``` In addition to adding tests for the case above, this PR also relocates the tests from llvm#135498 to be alongside existing tests for the `vector.{insert|extract}` folder, and reformats them to follow: * https://mlir.llvm.org/getting_started/TestingGuide/ For example: * The "no_fold" prefix is now used to label negative tests. * Redundant check lines have been removed (e.g., CHECK: vector.insert is sufficient to verify that folding did not occur). (*) As per https://mlir.llvm.org/docs/Dialects/Vector/#vectorinsert-vectorinsertop, these are poison values.

banach-space requested review from dcaballe, nicolasvasilache and Groverkss as code owners April 21, 2025 16:53

llvmbot added mlir:vectorops mlir mlir:vector labels Apr 21, 2025

banach-space requested review from kuhar and math-fehr April 21, 2025 16:54

kuhar approved these changes Apr 21, 2025

View reviewed changes

math-fehr approved these changes Apr 22, 2025

View reviewed changes

banach-space mentioned this pull request Apr 24, 2025

[mlir][vector] Update how OOB accesses are handled by vector.extract/vector.insert #137147

Open

banach-space merged commit ebceb73 into llvm:main Apr 24, 2025
15 checks passed

banach-space deleted the andrzej/vector/insert_extract_folder_tweak branch April 24, 2025 09:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][vector] Update the folder for vector.{insert|extract} #136579

[mlir][vector] Update the folder for vector.{insert|extract} #136579

banach-space commented Apr 21, 2025 •

edited

Loading

Uh oh!

llvmbot commented Apr 21, 2025 •

edited

Loading

Uh oh!

math-fehr commented Apr 21, 2025

Uh oh!

banach-space commented Apr 21, 2025

Uh oh!

math-fehr commented Apr 21, 2025

Uh oh!

banach-space commented Apr 22, 2025

Uh oh!

math-fehr commented Apr 22, 2025

Uh oh!

banach-space commented Apr 24, 2025

Uh oh!

Uh oh!

Uh oh!

[mlir][vector] Update the folder for vector.{insert|extract} #136579

[mlir][vector] Update the folder for vector.{insert|extract} #136579

Conversation

banach-space commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Apr 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

math-fehr commented Apr 21, 2025

Uh oh!

banach-space commented Apr 21, 2025

Uh oh!

math-fehr commented Apr 21, 2025

Uh oh!

banach-space commented Apr 22, 2025

Uh oh!

math-fehr commented Apr 22, 2025

Uh oh!

banach-space commented Apr 24, 2025

Uh oh!

Uh oh!

Uh oh!

banach-space commented Apr 21, 2025 •

edited

Loading

llvmbot commented Apr 21, 2025 •

edited

Loading