Skip to content

[TTI] getTypeBasedIntrinsicInstrCost - add basic handling for strided load/store intrinsics #125223

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 31, 2025

Conversation

RKSimon
Copy link
Collaborator

@RKSimon RKSimon commented Jan 31, 2025

As noted on #124499 - this is currently missing for type-only analysis and was falling back to scalarization for fixed vectors (and failing entirely for scalable vectors)

… load/store intrinsics

As noted on llvm#124499 - this is currently missing for type-only analysis and was falling back to scalarization for fixed vectors (and failing entirely for scalable vectors)
@llvmbot
Copy link
Member

llvmbot commented Jan 31, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Simon Pilgrim (RKSimon)

Changes

As noted on #124499 - this is currently missing for type-only analysis and was falling back to scalarization for fixed vectors (and failing entirely for scalable vectors)


Patch is 20.04 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/125223.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+14)
  • (modified) llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll (+44-44)
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 596db392392131..9571bd9330de6c 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -2204,6 +2204,20 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
       return thisT()->getMaskedMemoryOpCost(Instruction::Load, Ty, TyAlign, 0,
                                             CostKind);
     }
+    case Intrinsic::experimental_vp_strided_store: {
+      auto *Ty = cast<VectorType>(ICA.getArgTypes()[0]);
+      Align Alignment = thisT()->DL.getABITypeAlign(Ty->getElementType());
+      return thisT()->getStridedMemoryOpCost(
+          Instruction::Store, Ty, /*Ptr=*/nullptr, /*VariableMask=*/true,
+          Alignment, CostKind, ICA.getInst());
+    }
+    case Intrinsic::experimental_vp_strided_load: {
+      auto *Ty = cast<VectorType>(ICA.getReturnType());
+      Align Alignment = thisT()->DL.getABITypeAlign(Ty->getElementType());
+      return thisT()->getStridedMemoryOpCost(
+          Instruction::Load, Ty, /*Ptr=*/nullptr, /*VariableMask=*/true,
+          Alignment, CostKind, ICA.getInst());
+    }
     case Intrinsic::vector_reduce_add:
     case Intrinsic::vector_reduce_mul:
     case Intrinsic::vector_reduce_and:
diff --git a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
index 0245a0f7ee6cbc..7c5197b283fbab 100644
--- a/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
+++ b/llvm/test/Analysis/CostModel/RISCV/vp-intrinsics.ll
@@ -1482,30 +1482,30 @@ define void @strided_load() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPEBASED-LABEL: 'strided_load'
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 13 for instruction: %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 25 for instruction: %ti1_4 = call <4 x i1> @llvm.experimental.vp.strided.load.v4i1.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 49 for instruction: %ti1_8 = call <8 x i1> @llvm.experimental.vp.strided.load.v8i1.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 97 for instruction: %ti1_16 = call <16 x i1> @llvm.experimental.vp.strided.load.v16i1.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %t0 = call <2 x i8> @llvm.experimental.vp.strided.load.v2i8.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 23 for instruction: %t2 = call <4 x i8> @llvm.experimental.vp.strided.load.v4i8.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 47 for instruction: %t4 = call <8 x i8> @llvm.experimental.vp.strided.load.v8i8.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 95 for instruction: %t6 = call <16 x i8> @llvm.experimental.vp.strided.load.v16i8.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %t8.a = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: %t10.a = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: %t13.a = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: %t15.a = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 11 for instruction: %t8 = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: %t10 = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: %t13 = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: %t15 = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t17 = call <vscale x 2 x i8> @llvm.experimental.vp.strided.load.nxv2i8.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t19 = call <vscale x 4 x i8> @llvm.experimental.vp.strided.load.nxv4i8.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t21 = call <vscale x 8 x i8> @llvm.experimental.vp.strided.load.nxv8i8.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t23 = call <vscale x 16 x i8> @llvm.experimental.vp.strided.load.nxv16i8.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t25 = call <vscale x 2 x i64> @llvm.experimental.vp.strided.load.nxv2i64.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t27 = call <vscale x 4 x i64> @llvm.experimental.vp.strided.load.nxv4i64.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t29 = call <vscale x 8 x i64> @llvm.experimental.vp.strided.load.nxv8i64.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: %t31 = call <vscale x 16 x i64> @llvm.experimental.vp.strided.load.nxv16i64.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %ti1_4 = call <4 x i1> @llvm.experimental.vp.strided.load.v4i1.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 64 for instruction: %ti1_8 = call <8 x i1> @llvm.experimental.vp.strided.load.v8i1.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 128 for instruction: %ti1_16 = call <16 x i1> @llvm.experimental.vp.strided.load.v16i1.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %t0 = call <2 x i8> @llvm.experimental.vp.strided.load.v2i8.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t2 = call <4 x i8> @llvm.experimental.vp.strided.load.v4i8.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t4 = call <8 x i8> @llvm.experimental.vp.strided.load.v8i8.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t6 = call <16 x i8> @llvm.experimental.vp.strided.load.v16i8.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %t8.a = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t10.a = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t13.a = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t15.a = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %t8 = call <2 x i64> @llvm.experimental.vp.strided.load.v2i64.p0.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t10 = call <4 x i64> @llvm.experimental.vp.strided.load.v4i64.p0.i64(ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t13 = call <8 x i64> @llvm.experimental.vp.strided.load.v8i64.p0.i64(ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t15 = call <16 x i64> @llvm.experimental.vp.strided.load.v16i64.p0.i64(ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t17 = call <vscale x 2 x i8> @llvm.experimental.vp.strided.load.nxv2i8.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t19 = call <vscale x 4 x i8> @llvm.experimental.vp.strided.load.nxv4i8.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t21 = call <vscale x 8 x i8> @llvm.experimental.vp.strided.load.nxv8i8.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %t23 = call <vscale x 16 x i8> @llvm.experimental.vp.strided.load.nxv16i8.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %t25 = call <vscale x 2 x i64> @llvm.experimental.vp.strided.load.nxv2i64.p0.i64(ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %t27 = call <vscale x 4 x i64> @llvm.experimental.vp.strided.load.nxv4i64.p0.i64(ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %t29 = call <vscale x 8 x i64> @llvm.experimental.vp.strided.load.nxv8i64.p0.i64(ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: %t31 = call <vscale x 16 x i64> @llvm.experimental.vp.strided.load.nxv16i64.p0.i64(ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
 ; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   %ti1_2 = call <2 x i1> @llvm.experimental.vp.strided.load.v2i1.i64(ptr undef, i64 undef, <2 x i1> undef, i32 undef)
@@ -1560,26 +1560,26 @@ define void @strided_store() {
 ; CHECK-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
 ; TYPEBASED-LABEL: 'strided_store'
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i8.p0.i64(<2 x i8> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: call void @llvm.experimental.vp.strided.store.v4i8.p0.i64(<4 x i8> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: call void @llvm.experimental.vp.strided.store.v8i8.p0.i64(<8 x i8> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: call void @llvm.experimental.vp.strided.store.v16i8.p0.i64(<16 x i8> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 26 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 54 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 110 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv2i8.p0.i64(<vscale x 2 x i8> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv4i8.p0.i64(<vscale x 4 x i8> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv8i8.p0.i64(<vscale x 8 x i8> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv16i8.p0.i64(<vscale x 16 x i8> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv2i64.p0.i64(<vscale x 2 x i64> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv4i64.p0.i64(<vscale x 4 x i64> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv8i64.p0.i64(<vscale x 8 x i64> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
-; TYPEBASED-NEXT:  Cost Model: Invalid cost for instruction: call void @llvm.experimental.vp.strided.store.nxv16i64.p0.i64(<vscale x 16 x i64> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.experimental.vp.strided.store.v2i8.p0.i64(<2 x i8> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.v4i8.p0.i64(<4 x i8> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.v8i8.p0.i64(<8 x i8> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.v16i8.p0.i64(<16 x i8> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.experimental.vp.strided.store.v2i64.p0.i64(<2 x i64> undef, ptr align 8 undef, i64 undef, <2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.v4i64.p0.i64(<4 x i64> undef, ptr align 8 undef, i64 undef, <4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.v8i64.p0.i64(<8 x i64> undef, ptr align 8 undef, i64 undef, <8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.v16i64.p0.i64(<16 x i64> undef, ptr align 8 undef, i64 undef, <16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.nxv2i8.p0.i64(<vscale x 2 x i8> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.nxv4i8.p0.i64(<vscale x 4 x i8> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.nxv8i8.p0.i64(<vscale x 8 x i8> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vp.strided.store.nxv16i8.p0.i64(<vscale x 16 x i8> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: call void @llvm.experimental.vp.strided.store.nxv2i64.p0.i64(<vscale x 2 x i64> undef, ptr undef, i64 undef, <vscale x 2 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: call void @llvm.experimental.vp.strided.store.nxv4i64.p0.i64(<vscale x 4 x i64> undef, ptr undef, i64 undef, <vscale x 4 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: call void @llvm.experimental.vp.strided.store.nxv8i64.p0.i64(<vscale x 8 x i64> undef, ptr undef, i64 undef, <vscale x 8 x i1> undef, i32 undef)
+; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.experimental.vp.strided.store.nxv16i64.p0.i64(<vscale x 16 x i64> undef, ptr undef, i64 undef, <vscale x 16 x i1> undef, i32 undef)
 ; TYPEBASED-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
 ;
   call void @llvm.experimental.vp.strided.store.v2i8.i64(<2 x i8> undef, ptr undef, ...
[truncated]

Copy link
Collaborator

@preames preames left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@RKSimon RKSimon merged commit 48a66e9 into llvm:main Jan 31, 2025
10 checks passed
@RKSimon RKSimon deleted the costmodel-strided-memory branch January 31, 2025 16:06
RKSimon added a commit to RKSimon/llvm-project that referenced this pull request Jan 31, 2025
…BASED + TYPEBASED test coverage

Inspired by llvm#125223 - helps identify when the cost models are relying on arg data (or failures in getTypeBasedIntrinsicInstrCost)
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder ml-opt-devrel-x86-64 running on ml-opt-devrel-x86-64-b1 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/175/builds/12482

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/ml-opt-devrel-x86-64-b1/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/ml-opt-devrel-x86-64-b1/build/bin/FileCheck /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/ml-opt-devrel-x86-64-b1/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
/b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/ml-opt-devrel-x86-64-b1/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

RKSimon added a commit that referenced this pull request Jan 31, 2025
…ling for strided load/store intrinsics (#125223)"

Investigating build bot failures (I think due to some other recent reversions).
@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder clang-debian-cpp20 running on clang-debian-cpp20 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/108/builds/8802

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/FileCheck /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/FileCheck /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /vol/worker/clang-debian-cpp20/clang-debian-cpp20/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
/vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /vol/worker/clang-debian-cpp20/clang-debian-cpp20/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder clang-x86_64-debian-fast running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/56/builds/17644

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /b/1/clang-x86_64-debian-fast/llvm.obj/bin/FileCheck /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/1/clang-x86_64-debian-fast/llvm.src/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder llvm-x86_64-debian-dylib running on gribozavr4 while building llvm at step 7 "test-build-unified-tree-check-llvm".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/60/builds/18518

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-llvm) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/1/llvm-x86_64-debian-dylib/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /b/1/llvm-x86_64-debian-dylib/build/bin/FileCheck /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/1/llvm-x86_64-debian-dylib/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder llvm-clang-x86_64-expensive-checks-debian running on gribozavr4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/16/builds/13053

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bin/FileCheck /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Jan 31, 2025

LLVM Buildbot has detected a new failure on builder lld-x86_64-ubuntu-fast running on as-builder-4 while building llvm at step 6 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/33/builds/10634

Here is the relevant piece of the build log for the reference
Step 6 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/build/bin/FileCheck /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /home/buildbot/worker/as-builder-4/ramdisk/lld-x86_64/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Feb 1, 2025

LLVM Buildbot has detected a new failure on builder premerge-monolithic-linux running on premerge-linux-1 while building llvm at step 7 "test-build-unified-tree-check-all".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/153/builds/21568

Here is the relevant piece of the build log for the reference
Step 7 (test-build-unified-tree-check-all) failure: test (failure)
******************** TEST 'LLVM :: Transforms/SLPVectorizer/RISCV/complex-loads.ll' FAILED ********************
Exit Code: 1

Command Output (stderr):
--
RUN: at line 2: /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu < /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll --passes=slp-vectorizer -mattr=+v -slp-threshold=-20 | /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
+ /build/buildbot/premerge-monolithic-linux/build/bin/opt -S -mtriple riscv64-unknown-linux-gnu --passes=slp-vectorizer -mattr=+v -slp-threshold=-20
+ /build/buildbot/premerge-monolithic-linux/build/bin/FileCheck /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll
/build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll:31:15: error: CHECK-NEXT: expected string not found in input
; CHECK-NEXT: [[ARRAYIDX8_2:%.*]] = getelementptr i8, ptr [[ADD_PTR_1]], i64 1
              ^
<stdin>:29:58: note: scanning from here
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:29:58: note: with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
 %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4
                                                         ^
<stdin>:66:12: note: possible intended match here
 %arrayidx5.3 = getelementptr i8, ptr null, i64 4
           ^

Input file: <stdin>
Check file: /build/buildbot/premerge-monolithic-linux/llvm-project/llvm/test/Transforms/SLPVectorizer/RISCV/complex-loads.ll

-dump-input=help explains the following input dump.

Input was:
<<<<<<
           .
           .
           .
          24:  %3 = load i8, ptr %arrayidx32.1, align 1 
          25:  %conv33.1 = zext i8 %3 to i32 
          26:  %add.ptr.1 = getelementptr i8, ptr %add.ptr, i64 %idx.ext 
          27:  %add.ptr64.1 = getelementptr i8, ptr %add.ptr64, i64 %idx.ext63 
          28:  %arrayidx3.2 = getelementptr i8, ptr %add.ptr.1, i64 4 
          29:  %arrayidx5.2 = getelementptr i8, ptr %add.ptr64.1, i64 4 
next:31'0                                                              X error: no match found
next:31'1                                                                with "ADD_PTR_1" equal to "%add\\.ptr\\.1"
          30:  %4 = load <4 x i8>, ptr %add.ptr.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          31:  %5 = shufflevector <4 x i8> %4, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          32:  %6 = zext <2 x i8> %5 to <2 x i32> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          33:  %7 = load <4 x i8>, ptr %add.ptr64.1, align 1 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          34:  %8 = shufflevector <4 x i8> %7, <4 x i8> poison, <2 x i32> <i32 0, i32 2> 
next:31'0     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
           .
...

RKSimon added a commit that referenced this pull request Feb 1, 2025
… load/store intrinsics (#125223) (REAPPLIED)

As noted on #124499 - this is currently missing for type-only analysis and was falling back to scalarization for fixed vectors (and failing entirely for scalable vectors)
RKSimon added a commit to RKSimon/llvm-project that referenced this pull request Feb 1, 2025
…BASED + TYPEBASED test coverage

Inspired by llvm#125223 - helps identify when the cost models are relying on arg data (or failures in getTypeBasedIntrinsicInstrCost)
RKSimon added a commit that referenced this pull request Feb 1, 2025
…BASED + TYPEBASED test coverage (#125245)

Inspired by #125223 - helps identify when the cost models are relying on arg data (or failures in getTypeBasedIntrinsicInstrCost)
RKSimon added a commit that referenced this pull request Feb 3, 2025
…e cost analysis. (#124129) (REAPPLIED)

We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.)

Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can.

Noticed while having yet another attempt at #63980

Reapplied cleanup now that #125223 and #124984 have landed.
Icohedron pushed a commit to Icohedron/llvm-project that referenced this pull request Feb 11, 2025
…e cost analysis. (llvm#124129) (REAPPLIED)

We were only constructing the IntrinsicCostAttributes with the arg type info, and not the args themselves, preventing more detailed cost analysis (constant / uniform args etc.)

Just pass the whole IntrinsicInst to the constructor and let it resolve everything it can.

Noticed while having yet another attempt at llvm#63980

Reapplied cleanup now that llvm#125223 and llvm#124984 have landed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants