[LV][EVL] Attach a new metadata on EVL vectorized loops #131000
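@llvm/pr-subscribers-vectorizers @llvm/pr-subscribers-llvm-transforms
Author: Min-Yih Hsu (mshockwave)
Changes
This patch attaches a new metadata, `llvm.loop.isvectorized.withevl`, on loops vectorized with explicit vector length (EVL). This will help other optimizations further down the pipeline that focus on EVL-vectorized loops, like the (renovated) EVLIndVarSimplifyPass, for which I will post a patch shortly.
This approach is much safer than, say, IR pattern matching to figure out whether a loop is EVL-vectorized.
Patch is 81.67 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/131000.diff
11 Files Affected: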
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index e595347d62bf5..7ef06957d5322 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -1013,6 +1013,32 @@ void VPlan::execute(VPTransformState *State) {
     Value *Val = State->get(PhiR->getBackedgeValue(), NeedsScalar);
     cast<PHINode>(Phi)->addIncoming(Val, VectorLatchBB);
   }
+
+  // Check if it's EVL-vectorized and mark the corresponding metadata.
+  // Note that we could have done this during the codegen of
+  // ExplicitVectorLength, but the enclosing vector loop was not in a good shape
+  // for us to attach the metadata.
+  bool IsEVLVectorized = llvm::any_of(*Header, [](const VPRecipeBase &Recipe) {
+    // Looking for the ExplicitVectorLength VPInstruction.
+    if (const auto *VI = dyn_cast<VPInstruction>(&Recipe))
+      return VI->getOpcode() == VPInstruction::ExplicitVectorLength;
+    return false;
+  });
+  if (IsEVLVectorized) {
+    // VPTransformState::CurrentParentLoop has already been reset
+    // at this moment.
+    Loop *L = State->LI->getLoopFor(VectorLatchBB);
+    assert(L);
+    LLVMContext &Context = State->Builder.getContext();
+    MDNode *LoopID = L->getLoopID();
+    auto *IsEVLVectorizedMD = MDNode::get(
+        Context,
+        {MDString::get(Context, "llvm.loop.isvectorized.withevl"),
+         ConstantAsMetadata::get(ConstantInt::get(Context, APInt(32, 1)))});
+    MDNode *NewLoopID = makePostTransformationMetadata(Context, LoopID, {},
+                                                       {IsEVLVectorizedMD});
+    L->setLoopID(NewLoopID);
+  }
}
InstructionCost VPlan::cost(ElementCount VF, VPCostContext &Ctx) {
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/truncate-to-minimal-bitwidth-evl-crash.ll b/llvm/test/Transforms/LoopVectorize/RISCV/truncate-to-minimal-bitwidth-evl-crash.ll
index ba7158eb02d90..7a28574740348 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/truncate-to-minimal-bitwidth-evl-crash.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/truncate-to-minimal-bitwidth-evl-crash.ll
@@ -53,7 +53,7 @@ define void @truncate_to_minimal_bitwidths_widen_cast_recipe(ptr %src) {
; CHECK-NEXT: store i8 [[CONV36]], ptr null, align 1
; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 1
; CHECK-NEXT: [[EC:%.*]] = icmp eq i64 [[IV]], 1
-; CHECK-NEXT: br i1 [[EC]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK-NEXT: br i1 [[EC]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP4:![0-9]+]]
; CHECK: [[EXIT]]:
; CHECK-NEXT: ret void
;
@@ -77,8 +77,9 @@ exit: ; preds = %loop
ret void
}
;.
-; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
-; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
-; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
-; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META2]], [[META1]]}
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]], [[META3:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.isvectorized.withevl", i32 1}
+; CHECK: [[META2]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META3]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META3]], [[META2]]}
;.
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/type-info-cache-evl-crash.ll b/llvm/test/Transforms/LoopVectorize/RISCV/type-info-cache-evl-crash.ll
index c95414db18bef..8543964968c5a 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/type-info-cache-evl-crash.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/type-info-cache-evl-crash.ll
@@ -80,7 +80,7 @@ define void @type_info_cache_clobber(ptr %dstv, ptr %src, i64 %wide.trip.count)
; CHECK-NEXT: store i16 [[CONV36]], ptr null, align 2
; CHECK-NEXT: [[IV_NEXT]] = add i64 [[IV]], 1
; CHECK-NEXT: [[EC:%.*]] = icmp eq i64 [[IV]], [[WIDE_TRIP_COUNT]]
-; CHECK-NEXT: br i1 [[EC]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP8:![0-9]+]]
+; CHECK-NEXT: br i1 [[EC]], label %[[EXIT]], label %[[LOOP]], !llvm.loop [[LOOP9:![0-9]+]]
; CHECK: [[EXIT]]:
; CHECK-NEXT: ret void
;
@@ -114,8 +114,9 @@ exit:
; CHECK: [[META2]] = distinct !{[[META2]], !"LVerDomain"}
; CHECK: [[META3]] = !{[[META4:![0-9]+]]}
; CHECK: [[META4]] = distinct !{[[META4]], [[META2]]}
-; CHECK: [[LOOP5]] = distinct !{[[LOOP5]], [[META6:![0-9]+]], [[META7:![0-9]+]]}
-; CHECK: [[META6]] = !{!"llvm.loop.isvectorized", i32 1}
-; CHECK: [[META7]] = !{!"llvm.loop.unroll.runtime.disable"}
-; CHECK: [[LOOP8]] = distinct !{[[LOOP8]], [[META6]]}
+; CHECK: [[LOOP5]] = distinct !{[[LOOP5]], [[META6:![0-9]+]], [[META7:![0-9]+]], [[META8:![0-9]+]]}
+; CHECK: [[META6]] = !{!"llvm.loop.isvectorized.withevl", i32 1}
+; CHECK: [[META7]] = !{!"llvm.loop.isvectorized", i32 1}
+; CHECK: [[META8]] = !{!"llvm.loop.unroll.runtime.disable"}
+; CHECK: [[LOOP9]] = distinct !{[[LOOP9]], [[META7]]}
;.
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-bin-unary-ops-args.ll b/llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-bin-unary-ops-args.ll
index e7181f7f30c77..d962258662728 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-bin-unary-ops-args.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-bin-unary-ops-args.ll
@@ -65,7 +65,7 @@ define void @test_and(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP3:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP4:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -144,7 +144,7 @@ define void @test_or(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP5:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -159,7 +159,7 @@ define void @test_or(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP5:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP6:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -238,7 +238,7 @@ define void @test_xor(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP7:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -253,7 +253,7 @@ define void @test_xor(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP7:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP8:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -332,7 +332,7 @@ define void @test_shl(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP8:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP9:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -347,7 +347,7 @@ define void @test_shl(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP9:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP10:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -426,7 +426,7 @@ define void @test_lshr(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP10:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP11:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -441,7 +441,7 @@ define void @test_lshr(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP11:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP12:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -520,7 +520,7 @@ define void @test_ashr(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP12:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP13:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -535,7 +535,7 @@ define void @test_ashr(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP13:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP14:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -614,7 +614,7 @@ define void @test_add(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP14:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP15:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -629,7 +629,7 @@ define void @test_add(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP15:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP16:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -708,7 +708,7 @@ define void @test_sub(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP16:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP17:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -723,7 +723,7 @@ define void @test_sub(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP17:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP18:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -802,7 +802,7 @@ define void @test_mul(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP18:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP19:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -817,7 +817,7 @@ define void @test_mul(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP19:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP20:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -896,7 +896,7 @@ define void @test_sdiv(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP20:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP21:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -911,7 +911,7 @@ define void @test_sdiv(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP21:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP22:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -990,7 +990,7 @@ define void @test_udiv(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP22:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP23:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -1005,7 +1005,7 @@ define void @test_udiv(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP23:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP24:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -1084,7 +1084,7 @@ define void @test_srem(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], [[TMP9]]
; IF-EVL-NEXT: [[TMP19:%.*]] = icmp eq i64 [[INDEX_NEXT]], [[N_VEC]]
-; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP24:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[TMP19]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP25:![0-9]+]]
; IF-EVL: [[MIDDLE_BLOCK]]:
; IF-EVL-NEXT: br i1 true, label %[[FINISH_LOOPEXIT:.*]], label %[[SCALAR_PH]]
; IF-EVL: [[SCALAR_PH]]:
@@ -1099,7 +1099,7 @@ define void @test_srem(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[B]], i64 [[LEN]]
; IF-EVL-NEXT: store i8 [[TMP]], ptr [[ARRAYIDX1]], align 1
; IF-EVL-NEXT: [[DOTNOT:%.*]] = icmp eq i64 [[DEC]], 100
-; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP25:![0-9]+]]
+; IF-EVL-NEXT: br i1 [[DOTNOT]], label %[[FINISH_LOOPEXIT]], label %[[LOOP]], !llvm.loop [[LOOP26:![0-9]+]]
; IF-EVL: [[FINISH_LOOPEXIT]]:
; IF-EVL-NEXT: ret void
;
@@ -1178,7 +1178,7 @@ define void @test_urem(ptr nocapture %a, ptr nocapture readonly %b) {
; IF-EVL-NEXT: [[INDEX_EVL_NEXT]] = add nuw i64 [[TMP18]], [[EVL_BASED_IV]]
; IF-EVL-NEXT: [[INDEX_NEXT]] = add nuw i64...
[truncated]
// Note that we could have done this during the codegen of
// ExplicitVectorLength, but the enclosing vector loop was not in a good shape
// for us to attach the metadata.
bool IsEVLVectorized = llvm::any_of(*Header, [](const VPRecipeBase &Recipe) {
- bool IsEVLVectorized = llvm::any_of(*Header, [](const VPRecipeBase &Recipe) {
+ bool IsEVLVectorized = any_of(*Header, [](const VPRecipeBase &Recipe) {
Fixed.
if (IsEVLVectorized) {
  // VPTransformState::CurrentParentLoop has already been reset
  // at this moment.
  Loop *L = State->LI->getLoopFor(VectorLatchBB);
  assert(L);
  LLVMContext &Context = State->Builder.getContext();
  MDNode *LoopID = L->getLoopID();
  auto *IsEVLVectorizedMD = MDNode::get(
      Context,
      {MDString::get(Context, "llvm.loop.isvectorized.withevl"),
       ConstantAsMetadata::get(ConstantInt::get(Context, APInt(32, 1)))});
  MDNode *NewLoopID = makePostTransformationMetadata(Context, LoopID, {},
                                                     {IsEVLVectorizedMD});
  L->setLoopID(NewLoopID);
Can we do this at the same place we set the other loop metadata?
Of course. I moved it to `LoopVectorizeHints`, the same place where we set the `isvectorized` metadata. Because of that, we now check whether a VPlan is EVL-vectorized in `LoopVectorizationPlanner::executePlan` (instead of `VPlan::execute`).
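The check mentioned above boils down to scanning the header's recipes for the ExplicitVectorLength opcode. A minimal self-contained sketch of that scan follows; `Opcode` and `Recipe` are hypothetical stand-ins, not the real `VPInstruction` / `VPRecipeBase` classes, and the sketch uses `std::any_of` where the patch uses LLVM's range-based `any_of`.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-ins for VPlan recipes; the real classes carry far more
// state. Only the opcode matters for this check.
enum class Opcode { Add, WidenLoad, WidenStore, ExplicitVectorLength };

struct Recipe {
  Opcode Op;
};

// Returns true if any recipe in the header computes the explicit vector
// length, which is how the patch decides that a plan is EVL-vectorized.
bool isEVLVectorized(const std::vector<Recipe> &Header) {
  return std::any_of(Header.begin(), Header.end(), [](const Recipe &R) {
    return R.Op == Opcode::ExplicitVectorLength;
  });
}
```

An empty header, or one without the opcode, is reported as not EVL-vectorized.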
enum TailFoldingKind {
  TFK_Unspecified = -1,
  /// Tail folding with explicit vector length intrinsics.
  TFK_EVL = 0
};
Does that mean it would also be possible to set the tail-folding style with the hints?
Might not be a good idea to open a two-way street.
My original suggestion was mostly meant to move it to this spot in LoopVectorize.cpp:
// 2.6. Maintain Loop Hints
// Keep all loop hints from the original loop on the vector loop (we'll
> Does that mean it would also be possible to set the tail-folding style with the hints?
> Might not be a good idea to open a two-way street.
Currently LV doesn't use this metadata to make any decision about the tail-folding style, but I agree it does give that impression.
While I can move the logic to LoopVectorize.cpp as you pointed out, I'm afraid the raw metadata form, `llvm.loop.isvectorized.tailfoldingstyle = 0`, is not really self-descriptive, especially when accessing it (that's one of the reasons why I wrapped it with `LoopVectorizeHints::isEVLVectorized`). IMHO, there are two potential ways to move forward:
- Use a string value to indicate the tail-folding style in the metadata: `llvm.loop.isvectorized.tailfoldingstyle = "evl"`. This is what Luke originally suggested.
- Use boolean-valued metadata so that it's more intuitive to access. This was my original approach.
Of course, I'm open to other suggestions. What do you think @lukel97 @fhahn?
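To make the trade-off between the two options easier to see, here is a minimal sketch of how a consumer would query each encoding. It assumes a hypothetical model of loop properties as name/value pairs (`LoopProps`, `findProp`); it is not the LLVM `MDNode` API, and the helper names are made up for illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical model of a loop's properties: ordered (name, value) pairs whose
// value is either an integer or a string. Not the real MDNode representation.
using MDValue = std::variant<int, std::string>;
using LoopProps = std::vector<std::pair<std::string, MDValue>>;

// Look up a property by name; returns nullptr when it is absent.
const MDValue *findProp(const LoopProps &Props, const std::string &Name) {
  for (const auto &Entry : Props)
    if (Entry.first == Name)
      return &Entry.second;
  return nullptr;
}

// Boolean encoding: a dedicated key whose integer value is non-zero.
bool isEVLVectorizedBool(const LoopProps &Props) {
  const MDValue *V = findProp(Props, "llvm.loop.isvectorized.withevl");
  const int *I = V ? std::get_if<int>(V) : nullptr;
  return I && *I != 0;
}

// String encoding: one key names the tail-folding style, so the reader
// compares against "evl" instead of interpreting a bare number.
bool isEVLVectorizedString(const LoopProps &Props) {
  const MDValue *V = findProp(Props, "llvm.loop.isvectorized.tailfoldingstyle");
  const std::string *S = V ? std::get_if<std::string>(V) : nullptr;
  return S && *S == "evl";
}
```

In this model the string form leaves room for other styles under the same key, while the boolean form needs one key per style; whether that matters in practice is a design judgment for the reviewers.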
I'm settling on option (1): use a string value to indicate the tail-folding style in the metadata, `llvm.loop.isvectorized.tailfoldingstyle = "evl"`, as Luke originally suggested.
Also, I'm now setting the metadata in LoopVectorize.cpp (instead of LoopVectorizeHints), as @fhahn suggested.
0ffcd14 to 727c02a
And set the metadata in LoopVectorizationHints.
And use the string value to represent tail-folding style (again).
727c02a to 54d01d8
The merge conflicts are resolved now.
// Check if it's EVL-vectorized and mark the corresponding metadata.
bool IsEVLVectorized =
    llvm::any_of(*HeaderVPBB, [](const VPRecipeBase &Recipe) {
Just for future reference, `llvm::` should not be needed here.
This patch attaches a new metadata, `llvm.loop.isvectorized.withevl`, on loops vectorized with explicit vector length. This will help other optimizations further down the pipeline that focus on EVL-vectorized loops, like the (renovated) EVLIndVarSimplifyPass, for which I will post a patch shortly.
This approach is much safer than, say, IR pattern matching to figure out whether a loop is EVL-vectorized.
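As a rough illustration of what attaching the property involves, the sketch below models a loop ID as an ordered list of property names and rebuilds it with one extra entry. This mirrors the shape of the `makePostTransformationMetadata(Context, LoopID, {}, {IsEVLVectorizedMD})` call in the patch, but `LoopIDModel`, `startsWith`, and `makePostTransformationID` are hypothetical names for this example and the real helper operates on `MDNode`s.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: a loop ID is an ordered list of property names.
// Real loop IDs are distinct, self-referential MDNodes.
using LoopIDModel = std::vector<std::string>;

// Returns true if Name begins with Prefix.
bool startsWith(const std::string &Name, const std::string &Prefix) {
  return Name.compare(0, Prefix.size(), Prefix) == 0;
}

// Build a new loop ID: keep every original property that matches none of
// RemovePrefixes, then append the properties listed in AddAttrs.
LoopIDModel makePostTransformationID(
    const LoopIDModel &Orig, const std::vector<std::string> &RemovePrefixes,
    const std::vector<std::string> &AddAttrs) {
  LoopIDModel Result;
  for (const std::string &Prop : Orig) {
    bool Drop = std::any_of(
        RemovePrefixes.begin(), RemovePrefixes.end(),
        [&](const std::string &P) { return startsWith(Prop, P); });
    if (!Drop)
      Result.push_back(Prop);
  }
  Result.insert(Result.end(), AddAttrs.begin(), AddAttrs.end());
  return Result;
}
```

With an empty removal list, as in the patch, every existing property survives and the new one is added alongside them.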