[LV] Add test cases for reverse accesses involving irregular types. nfc #135139
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-backend-risc-v

Author: Mel Chen (Mel-Chen)

Changes: Add a test with an irregular type to ensure vector types are not generated.

Patch is 44.20 KiB, truncated to 20.00 KiB below; full version: https://github.com/llvm/llvm-project/pull/135139.diff

1 Files Affected:
diff --git a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll
index f01aaa04606d9..31a774f8d8525 100644
--- a/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll
+++ b/llvm/test/Transforms/LoopVectorize/RISCV/riscv-vector-reverse-output.ll
@@ -429,6 +429,631 @@ exit:
ret void
}
+define void @vector_reverse_irregular_type(ptr noalias %A, ptr noalias %B) {
+; RV64-LABEL: define void @vector_reverse_irregular_type(
+; RV64-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]]) #[[ATTR0]] {
+; RV64-NEXT: [[ENTRY:.*]]:
+; RV64-NEXT: br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; RV64: [[VECTOR_PH]]:
+; RV64-NEXT: br label %[[VECTOR_BODY:.*]]
+; RV64: [[VECTOR_BODY]]:
+; RV64-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; RV64-NEXT: [[OFFSET_IDX:%.*]] = sub i64 1023, [[INDEX]]
+; RV64-NEXT: [[TMP0:%.*]] = add i64 [[OFFSET_IDX]], 0
+; RV64-NEXT: [[TMP1:%.*]] = add i64 [[OFFSET_IDX]], -1
+; RV64-NEXT: [[TMP2:%.*]] = add i64 [[OFFSET_IDX]], -2
+; RV64-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -3
+; RV64-NEXT: [[TMP4:%.*]] = add i64 [[OFFSET_IDX]], -4
+; RV64-NEXT: [[TMP5:%.*]] = add i64 [[OFFSET_IDX]], -5
+; RV64-NEXT: [[TMP6:%.*]] = add i64 [[OFFSET_IDX]], -6
+; RV64-NEXT: [[TMP7:%.*]] = add i64 [[OFFSET_IDX]], -7
+; RV64-NEXT: [[TMP8:%.*]] = add i64 [[OFFSET_IDX]], -8
+; RV64-NEXT: [[TMP9:%.*]] = add i64 [[OFFSET_IDX]], -9
+; RV64-NEXT: [[TMP10:%.*]] = add i64 [[OFFSET_IDX]], -10
+; RV64-NEXT: [[TMP11:%.*]] = add i64 [[OFFSET_IDX]], -11
+; RV64-NEXT: [[TMP12:%.*]] = add i64 [[OFFSET_IDX]], -12
+; RV64-NEXT: [[TMP13:%.*]] = add i64 [[OFFSET_IDX]], -13
+; RV64-NEXT: [[TMP14:%.*]] = add i64 [[OFFSET_IDX]], -14
+; RV64-NEXT: [[TMP15:%.*]] = add i64 [[OFFSET_IDX]], -15
+; RV64-NEXT: [[TMP16:%.*]] = add nsw i64 [[TMP0]], -1
+; RV64-NEXT: [[TMP17:%.*]] = add nsw i64 [[TMP1]], -1
+; RV64-NEXT: [[TMP18:%.*]] = add nsw i64 [[TMP2]], -1
+; RV64-NEXT: [[TMP19:%.*]] = add nsw i64 [[TMP3]], -1
+; RV64-NEXT: [[TMP20:%.*]] = add nsw i64 [[TMP4]], -1
+; RV64-NEXT: [[TMP21:%.*]] = add nsw i64 [[TMP5]], -1
+; RV64-NEXT: [[TMP22:%.*]] = add nsw i64 [[TMP6]], -1
+; RV64-NEXT: [[TMP23:%.*]] = add nsw i64 [[TMP7]], -1
+; RV64-NEXT: [[TMP24:%.*]] = add nsw i64 [[TMP8]], -1
+; RV64-NEXT: [[TMP25:%.*]] = add nsw i64 [[TMP9]], -1
+; RV64-NEXT: [[TMP26:%.*]] = add nsw i64 [[TMP10]], -1
+; RV64-NEXT: [[TMP27:%.*]] = add nsw i64 [[TMP11]], -1
+; RV64-NEXT: [[TMP28:%.*]] = add nsw i64 [[TMP12]], -1
+; RV64-NEXT: [[TMP29:%.*]] = add nsw i64 [[TMP13]], -1
+; RV64-NEXT: [[TMP30:%.*]] = add nsw i64 [[TMP14]], -1
+; RV64-NEXT: [[TMP31:%.*]] = add nsw i64 [[TMP15]], -1
+; RV64-NEXT: [[TMP32:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP16]]
+; RV64-NEXT: [[TMP33:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP17]]
+; RV64-NEXT: [[TMP34:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP18]]
+; RV64-NEXT: [[TMP35:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP19]]
+; RV64-NEXT: [[TMP36:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP20]]
+; RV64-NEXT: [[TMP37:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP21]]
+; RV64-NEXT: [[TMP38:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP22]]
+; RV64-NEXT: [[TMP39:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP23]]
+; RV64-NEXT: [[TMP40:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP24]]
+; RV64-NEXT: [[TMP41:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP25]]
+; RV64-NEXT: [[TMP42:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP26]]
+; RV64-NEXT: [[TMP43:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP27]]
+; RV64-NEXT: [[TMP44:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP28]]
+; RV64-NEXT: [[TMP45:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP29]]
+; RV64-NEXT: [[TMP46:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP30]]
+; RV64-NEXT: [[TMP47:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP31]]
+; RV64-NEXT: [[TMP48:%.*]] = load i7, ptr [[TMP32]], align 1
+; RV64-NEXT: [[TMP49:%.*]] = load i7, ptr [[TMP33]], align 1
+; RV64-NEXT: [[TMP50:%.*]] = load i7, ptr [[TMP34]], align 1
+; RV64-NEXT: [[TMP51:%.*]] = load i7, ptr [[TMP35]], align 1
+; RV64-NEXT: [[TMP52:%.*]] = load i7, ptr [[TMP36]], align 1
+; RV64-NEXT: [[TMP53:%.*]] = load i7, ptr [[TMP37]], align 1
+; RV64-NEXT: [[TMP54:%.*]] = load i7, ptr [[TMP38]], align 1
+; RV64-NEXT: [[TMP55:%.*]] = load i7, ptr [[TMP39]], align 1
+; RV64-NEXT: [[TMP56:%.*]] = load i7, ptr [[TMP40]], align 1
+; RV64-NEXT: [[TMP57:%.*]] = load i7, ptr [[TMP41]], align 1
+; RV64-NEXT: [[TMP58:%.*]] = load i7, ptr [[TMP42]], align 1
+; RV64-NEXT: [[TMP59:%.*]] = load i7, ptr [[TMP43]], align 1
+; RV64-NEXT: [[TMP60:%.*]] = load i7, ptr [[TMP44]], align 1
+; RV64-NEXT: [[TMP61:%.*]] = load i7, ptr [[TMP45]], align 1
+; RV64-NEXT: [[TMP62:%.*]] = load i7, ptr [[TMP46]], align 1
+; RV64-NEXT: [[TMP63:%.*]] = load i7, ptr [[TMP47]], align 1
+; RV64-NEXT: [[TMP64:%.*]] = insertelement <16 x i7> poison, i7 [[TMP48]], i32 0
+; RV64-NEXT: [[TMP65:%.*]] = insertelement <16 x i7> [[TMP64]], i7 [[TMP49]], i32 1
+; RV64-NEXT: [[TMP66:%.*]] = insertelement <16 x i7> [[TMP65]], i7 [[TMP50]], i32 2
+; RV64-NEXT: [[TMP67:%.*]] = insertelement <16 x i7> [[TMP66]], i7 [[TMP51]], i32 3
+; RV64-NEXT: [[TMP68:%.*]] = insertelement <16 x i7> [[TMP67]], i7 [[TMP52]], i32 4
+; RV64-NEXT: [[TMP69:%.*]] = insertelement <16 x i7> [[TMP68]], i7 [[TMP53]], i32 5
+; RV64-NEXT: [[TMP70:%.*]] = insertelement <16 x i7> [[TMP69]], i7 [[TMP54]], i32 6
+; RV64-NEXT: [[TMP71:%.*]] = insertelement <16 x i7> [[TMP70]], i7 [[TMP55]], i32 7
+; RV64-NEXT: [[TMP72:%.*]] = insertelement <16 x i7> [[TMP71]], i7 [[TMP56]], i32 8
+; RV64-NEXT: [[TMP73:%.*]] = insertelement <16 x i7> [[TMP72]], i7 [[TMP57]], i32 9
+; RV64-NEXT: [[TMP74:%.*]] = insertelement <16 x i7> [[TMP73]], i7 [[TMP58]], i32 10
+; RV64-NEXT: [[TMP75:%.*]] = insertelement <16 x i7> [[TMP74]], i7 [[TMP59]], i32 11
+; RV64-NEXT: [[TMP76:%.*]] = insertelement <16 x i7> [[TMP75]], i7 [[TMP60]], i32 12
+; RV64-NEXT: [[TMP77:%.*]] = insertelement <16 x i7> [[TMP76]], i7 [[TMP61]], i32 13
+; RV64-NEXT: [[TMP78:%.*]] = insertelement <16 x i7> [[TMP77]], i7 [[TMP62]], i32 14
+; RV64-NEXT: [[TMP79:%.*]] = insertelement <16 x i7> [[TMP78]], i7 [[TMP63]], i32 15
+; RV64-NEXT: [[TMP80:%.*]] = add <16 x i7> [[TMP79]], splat (i7 1)
+; RV64-NEXT: [[TMP81:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP16]]
+; RV64-NEXT: [[TMP82:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP17]]
+; RV64-NEXT: [[TMP83:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP18]]
+; RV64-NEXT: [[TMP84:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP19]]
+; RV64-NEXT: [[TMP85:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP20]]
+; RV64-NEXT: [[TMP86:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP21]]
+; RV64-NEXT: [[TMP87:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP22]]
+; RV64-NEXT: [[TMP88:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP23]]
+; RV64-NEXT: [[TMP89:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP24]]
+; RV64-NEXT: [[TMP90:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP25]]
+; RV64-NEXT: [[TMP91:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP26]]
+; RV64-NEXT: [[TMP92:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP27]]
+; RV64-NEXT: [[TMP93:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP28]]
+; RV64-NEXT: [[TMP94:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP29]]
+; RV64-NEXT: [[TMP95:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP30]]
+; RV64-NEXT: [[TMP96:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP31]]
+; RV64-NEXT: [[TMP97:%.*]] = extractelement <16 x i7> [[TMP80]], i32 0
+; RV64-NEXT: store i7 [[TMP97]], ptr [[TMP81]], align 1
+; RV64-NEXT: [[TMP98:%.*]] = extractelement <16 x i7> [[TMP80]], i32 1
+; RV64-NEXT: store i7 [[TMP98]], ptr [[TMP82]], align 1
+; RV64-NEXT: [[TMP99:%.*]] = extractelement <16 x i7> [[TMP80]], i32 2
+; RV64-NEXT: store i7 [[TMP99]], ptr [[TMP83]], align 1
+; RV64-NEXT: [[TMP100:%.*]] = extractelement <16 x i7> [[TMP80]], i32 3
+; RV64-NEXT: store i7 [[TMP100]], ptr [[TMP84]], align 1
+; RV64-NEXT: [[TMP101:%.*]] = extractelement <16 x i7> [[TMP80]], i32 4
+; RV64-NEXT: store i7 [[TMP101]], ptr [[TMP85]], align 1
+; RV64-NEXT: [[TMP102:%.*]] = extractelement <16 x i7> [[TMP80]], i32 5
+; RV64-NEXT: store i7 [[TMP102]], ptr [[TMP86]], align 1
+; RV64-NEXT: [[TMP103:%.*]] = extractelement <16 x i7> [[TMP80]], i32 6
+; RV64-NEXT: store i7 [[TMP103]], ptr [[TMP87]], align 1
+; RV64-NEXT: [[TMP104:%.*]] = extractelement <16 x i7> [[TMP80]], i32 7
+; RV64-NEXT: store i7 [[TMP104]], ptr [[TMP88]], align 1
+; RV64-NEXT: [[TMP105:%.*]] = extractelement <16 x i7> [[TMP80]], i32 8
+; RV64-NEXT: store i7 [[TMP105]], ptr [[TMP89]], align 1
+; RV64-NEXT: [[TMP106:%.*]] = extractelement <16 x i7> [[TMP80]], i32 9
+; RV64-NEXT: store i7 [[TMP106]], ptr [[TMP90]], align 1
+; RV64-NEXT: [[TMP107:%.*]] = extractelement <16 x i7> [[TMP80]], i32 10
+; RV64-NEXT: store i7 [[TMP107]], ptr [[TMP91]], align 1
+; RV64-NEXT: [[TMP108:%.*]] = extractelement <16 x i7> [[TMP80]], i32 11
+; RV64-NEXT: store i7 [[TMP108]], ptr [[TMP92]], align 1
+; RV64-NEXT: [[TMP109:%.*]] = extractelement <16 x i7> [[TMP80]], i32 12
+; RV64-NEXT: store i7 [[TMP109]], ptr [[TMP93]], align 1
+; RV64-NEXT: [[TMP110:%.*]] = extractelement <16 x i7> [[TMP80]], i32 13
+; RV64-NEXT: store i7 [[TMP110]], ptr [[TMP94]], align 1
+; RV64-NEXT: [[TMP111:%.*]] = extractelement <16 x i7> [[TMP80]], i32 14
+; RV64-NEXT: store i7 [[TMP111]], ptr [[TMP95]], align 1
+; RV64-NEXT: [[TMP112:%.*]] = extractelement <16 x i7> [[TMP80]], i32 15
+; RV64-NEXT: store i7 [[TMP112]], ptr [[TMP96]], align 1
+; RV64-NEXT: [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 16
+; RV64-NEXT: [[TMP113:%.*]] = icmp eq i64 [[INDEX_NEXT]], 1008
+; RV64-NEXT: br i1 [[TMP113]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP6:![0-9]+]]
+; RV64: [[MIDDLE_BLOCK]]:
+; RV64-NEXT: br i1 false, label %[[EXIT:.*]], label %[[SCALAR_PH]]
+; RV64: [[SCALAR_PH]]:
+; RV64-NEXT: [[BC_RESUME_VAL:%.*]] = phi i64 [ 15, %[[MIDDLE_BLOCK]] ], [ 1023, %[[ENTRY]] ]
+; RV64-NEXT: br label %[[FOR_BODY:.*]]
+; RV64: [[FOR_BODY]]:
+; RV64-NEXT: [[DEC_IV:%.*]] = phi i64 [ [[BC_RESUME_VAL]], %[[SCALAR_PH]] ], [ [[IV_NEXT:%.*]], %[[FOR_BODY]] ]
+; RV64-NEXT: [[IV_NEXT]] = add nsw i64 [[DEC_IV]], -1
+; RV64-NEXT: [[ARRAYIDX_B:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[IV_NEXT]]
+; RV64-NEXT: [[TMP114:%.*]] = load i7, ptr [[ARRAYIDX_B]], align 1
+; RV64-NEXT: [[ADD:%.*]] = add i7 [[TMP114]], 1
+; RV64-NEXT: [[ARRAYIDX_A:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[IV_NEXT]]
+; RV64-NEXT: store i7 [[ADD]], ptr [[ARRAYIDX_A]], align 1
+; RV64-NEXT: [[CMP:%.*]] = icmp ugt i64 [[DEC_IV]], 1
+; RV64-NEXT: br i1 [[CMP]], label %[[FOR_BODY]], label %[[EXIT]], !llvm.loop [[LOOP7:![0-9]+]]
+; RV64: [[EXIT]]:
+; RV64-NEXT: ret void
+;
+; RV32-LABEL: define void @vector_reverse_irregular_type(
+; RV32-SAME: ptr noalias [[A:%.*]], ptr noalias [[B:%.*]]) #[[ATTR0]] {
+; RV32-NEXT: [[ENTRY:.*]]:
+; RV32-NEXT: br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
+; RV32: [[VECTOR_PH]]:
+; RV32-NEXT: br label %[[VECTOR_BODY:.*]]
+; RV32: [[VECTOR_BODY]]:
+; RV32-NEXT: [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; RV32-NEXT: [[OFFSET_IDX:%.*]] = sub i64 1023, [[INDEX]]
+; RV32-NEXT: [[TMP0:%.*]] = add i64 [[OFFSET_IDX]], 0
+; RV32-NEXT: [[TMP1:%.*]] = add i64 [[OFFSET_IDX]], -1
+; RV32-NEXT: [[TMP2:%.*]] = add i64 [[OFFSET_IDX]], -2
+; RV32-NEXT: [[TMP3:%.*]] = add i64 [[OFFSET_IDX]], -3
+; RV32-NEXT: [[TMP4:%.*]] = add i64 [[OFFSET_IDX]], -4
+; RV32-NEXT: [[TMP5:%.*]] = add i64 [[OFFSET_IDX]], -5
+; RV32-NEXT: [[TMP6:%.*]] = add i64 [[OFFSET_IDX]], -6
+; RV32-NEXT: [[TMP7:%.*]] = add i64 [[OFFSET_IDX]], -7
+; RV32-NEXT: [[TMP8:%.*]] = add i64 [[OFFSET_IDX]], -8
+; RV32-NEXT: [[TMP9:%.*]] = add i64 [[OFFSET_IDX]], -9
+; RV32-NEXT: [[TMP10:%.*]] = add i64 [[OFFSET_IDX]], -10
+; RV32-NEXT: [[TMP11:%.*]] = add i64 [[OFFSET_IDX]], -11
+; RV32-NEXT: [[TMP12:%.*]] = add i64 [[OFFSET_IDX]], -12
+; RV32-NEXT: [[TMP13:%.*]] = add i64 [[OFFSET_IDX]], -13
+; RV32-NEXT: [[TMP14:%.*]] = add i64 [[OFFSET_IDX]], -14
+; RV32-NEXT: [[TMP15:%.*]] = add i64 [[OFFSET_IDX]], -15
+; RV32-NEXT: [[TMP16:%.*]] = add nsw i64 [[TMP0]], -1
+; RV32-NEXT: [[TMP17:%.*]] = add nsw i64 [[TMP1]], -1
+; RV32-NEXT: [[TMP18:%.*]] = add nsw i64 [[TMP2]], -1
+; RV32-NEXT: [[TMP19:%.*]] = add nsw i64 [[TMP3]], -1
+; RV32-NEXT: [[TMP20:%.*]] = add nsw i64 [[TMP4]], -1
+; RV32-NEXT: [[TMP21:%.*]] = add nsw i64 [[TMP5]], -1
+; RV32-NEXT: [[TMP22:%.*]] = add nsw i64 [[TMP6]], -1
+; RV32-NEXT: [[TMP23:%.*]] = add nsw i64 [[TMP7]], -1
+; RV32-NEXT: [[TMP24:%.*]] = add nsw i64 [[TMP8]], -1
+; RV32-NEXT: [[TMP25:%.*]] = add nsw i64 [[TMP9]], -1
+; RV32-NEXT: [[TMP26:%.*]] = add nsw i64 [[TMP10]], -1
+; RV32-NEXT: [[TMP27:%.*]] = add nsw i64 [[TMP11]], -1
+; RV32-NEXT: [[TMP28:%.*]] = add nsw i64 [[TMP12]], -1
+; RV32-NEXT: [[TMP29:%.*]] = add nsw i64 [[TMP13]], -1
+; RV32-NEXT: [[TMP30:%.*]] = add nsw i64 [[TMP14]], -1
+; RV32-NEXT: [[TMP31:%.*]] = add nsw i64 [[TMP15]], -1
+; RV32-NEXT: [[TMP32:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP16]]
+; RV32-NEXT: [[TMP33:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP17]]
+; RV32-NEXT: [[TMP34:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP18]]
+; RV32-NEXT: [[TMP35:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP19]]
+; RV32-NEXT: [[TMP36:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP20]]
+; RV32-NEXT: [[TMP37:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP21]]
+; RV32-NEXT: [[TMP38:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP22]]
+; RV32-NEXT: [[TMP39:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP23]]
+; RV32-NEXT: [[TMP40:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP24]]
+; RV32-NEXT: [[TMP41:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP25]]
+; RV32-NEXT: [[TMP42:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP26]]
+; RV32-NEXT: [[TMP43:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP27]]
+; RV32-NEXT: [[TMP44:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP28]]
+; RV32-NEXT: [[TMP45:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP29]]
+; RV32-NEXT: [[TMP46:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP30]]
+; RV32-NEXT: [[TMP47:%.*]] = getelementptr inbounds i7, ptr [[B]], i64 [[TMP31]]
+; RV32-NEXT: [[TMP48:%.*]] = load i7, ptr [[TMP32]], align 1
+; RV32-NEXT: [[TMP49:%.*]] = load i7, ptr [[TMP33]], align 1
+; RV32-NEXT: [[TMP50:%.*]] = load i7, ptr [[TMP34]], align 1
+; RV32-NEXT: [[TMP51:%.*]] = load i7, ptr [[TMP35]], align 1
+; RV32-NEXT: [[TMP52:%.*]] = load i7, ptr [[TMP36]], align 1
+; RV32-NEXT: [[TMP53:%.*]] = load i7, ptr [[TMP37]], align 1
+; RV32-NEXT: [[TMP54:%.*]] = load i7, ptr [[TMP38]], align 1
+; RV32-NEXT: [[TMP55:%.*]] = load i7, ptr [[TMP39]], align 1
+; RV32-NEXT: [[TMP56:%.*]] = load i7, ptr [[TMP40]], align 1
+; RV32-NEXT: [[TMP57:%.*]] = load i7, ptr [[TMP41]], align 1
+; RV32-NEXT: [[TMP58:%.*]] = load i7, ptr [[TMP42]], align 1
+; RV32-NEXT: [[TMP59:%.*]] = load i7, ptr [[TMP43]], align 1
+; RV32-NEXT: [[TMP60:%.*]] = load i7, ptr [[TMP44]], align 1
+; RV32-NEXT: [[TMP61:%.*]] = load i7, ptr [[TMP45]], align 1
+; RV32-NEXT: [[TMP62:%.*]] = load i7, ptr [[TMP46]], align 1
+; RV32-NEXT: [[TMP63:%.*]] = load i7, ptr [[TMP47]], align 1
+; RV32-NEXT: [[TMP64:%.*]] = insertelement <16 x i7> poison, i7 [[TMP48]], i32 0
+; RV32-NEXT: [[TMP65:%.*]] = insertelement <16 x i7> [[TMP64]], i7 [[TMP49]], i32 1
+; RV32-NEXT: [[TMP66:%.*]] = insertelement <16 x i7> [[TMP65]], i7 [[TMP50]], i32 2
+; RV32-NEXT: [[TMP67:%.*]] = insertelement <16 x i7> [[TMP66]], i7 [[TMP51]], i32 3
+; RV32-NEXT: [[TMP68:%.*]] = insertelement <16 x i7> [[TMP67]], i7 [[TMP52]], i32 4
+; RV32-NEXT: [[TMP69:%.*]] = insertelement <16 x i7> [[TMP68]], i7 [[TMP53]], i32 5
+; RV32-NEXT: [[TMP70:%.*]] = insertelement <16 x i7> [[TMP69]], i7 [[TMP54]], i32 6
+; RV32-NEXT: [[TMP71:%.*]] = insertelement <16 x i7> [[TMP70]], i7 [[TMP55]], i32 7
+; RV32-NEXT: [[TMP72:%.*]] = insertelement <16 x i7> [[TMP71]], i7 [[TMP56]], i32 8
+; RV32-NEXT: [[TMP73:%.*]] = insertelement <16 x i7> [[TMP72]], i7 [[TMP57]], i32 9
+; RV32-NEXT: [[TMP74:%.*]] = insertelement <16 x i7> [[TMP73]], i7 [[TMP58]], i32 10
+; RV32-NEXT: [[TMP75:%.*]] = insertelement <16 x i7> [[TMP74]], i7 [[TMP59]], i32 11
+; RV32-NEXT: [[TMP76:%.*]] = insertelement <16 x i7> [[TMP75]], i7 [[TMP60]], i32 12
+; RV32-NEXT: [[TMP77:%.*]] = insertelement <16 x i7> [[TMP76]], i7 [[TMP61]], i32 13
+; RV32-NEXT: [[TMP78:%.*]] = insertelement <16 x i7> [[TMP77]], i7 [[TMP62]], i32 14
+; RV32-NEXT: [[TMP79:%.*]] = insertelement <16 x i7> [[TMP78]], i7 [[TMP63]], i32 15
+; RV32-NEXT: [[TMP80:%.*]] = add <16 x i7> [[TMP79]], splat (i7 1)
+; RV32-NEXT: [[TMP81:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP16]]
+; RV32-NEXT: [[TMP82:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP17]]
+; RV32-NEXT: [[TMP83:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP18]]
+; RV32-NEXT: [[TMP84:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP19]]
+; RV32-NEXT: [[TMP85:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP20]]
+; RV32-NEXT: [[TMP86:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP21]]
+; RV32-NEXT: [[TMP87:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP22]]
+; RV32-NEXT: [[TMP88:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP23]]
+; RV32-NEXT: [[TMP89:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP24]]
+; RV32-NEXT: [[TMP90:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP25]]
+; RV32-NEXT: [[TMP91:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP26]]
+; RV32-NEXT: [[TMP92:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP27]]
+; RV32-NEXT: [[TMP93:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP28]]
+; RV32-NEXT: [[TMP94:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP29]]
+; RV32-NEXT: [[TMP95:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP30]]
+; RV32-NEXT: [[TMP96:%.*]] = getelementptr inbounds i7, ptr [[A]], i64 [[TMP31]]
+; RV32-NEXT: [[TMP97:%.*]] = extractelement <16 x i7> [[TMP80]], i32 0
+; RV32-NEXT: store i7 [[TMP97]], ptr [[TMP81]], align 1
+; RV32-NEXT: [[TMP98:%.*]] = extractelement <16 x i7> [[TMP80]], i32 1
+; RV32-NEXT: store i7 [[TMP98]], ptr [[TMP82]], align 1
+; RV32-NEXT: [[TMP99:%.*]] = extractelement <16 x i7> [[TMP80]], i32 2
+; RV32-NEXT: store i7 [[TMP99]], ptr [[TMP83]], align 1
+; RV32-NEXT: [[TMP100:%.*]] = extractelement <16 x i7> [[TMP80]], i32 3
+; RV32-NEXT: store i7 [[TMP100]], ptr [[TMP84]], align 1
+; RV32-NEXT: [[TMP101:%.*]] = extractelement <16 x i7> [[TMP80]], i32 4
+; RV32-NEXT: store i7 [[TMP101]], ptr [[TMP85]], align 1
+; RV32-NEXT: [[TMP102:%.*]] = extractelement <16 x i7> [[TMP80]], i32 5
+; RV32-NEXT: store i7 [[TMP102]], ptr [[TMP86]], align 1
+; RV32-NEXT: [[TMP103:%.*]] = extractelement <16 x i7> [[TMP80]], i32 6
+; RV32-NEXT: store i7 [[TMP103]], ptr [[TMP87]], align 1
+; RV32-NEXT: [[TMP104:%.*]] = extractelement <16 x i7> [[TMP80]], i32 7
+; RV32-NEXT: store i7 [[TMP104]], ptr [[TMP88]], align 1
+; RV32-NEXT: [[TMP105:%.*]] = extractelement <16 x i7> [[TMP80]], i32 8
+; RV32-NEXT: store i7 [[TMP105]], ptr [[TMP89...
[truncated]
; RV64-NEXT: [[TMP77:%.*]] = insertelement <16 x i7> [[TMP76]], i7 [[TMP61]], i32 13
; RV64-NEXT: [[TMP78:%.*]] = insertelement <16 x i7> [[TMP77]], i7 [[TMP62]], i32 14
; RV64-NEXT: [[TMP79:%.*]] = insertelement <16 x i7> [[TMP78]], i7 [[TMP63]], i32 15
; RV64-NEXT: [[TMP80:%.*]] = add <16 x i7> [[TMP79]], splat (i7 1)
Do you know why we have completely ignored the loop attribute that requests llvm.loop.vectorize.width=4? It's a shame the test is generating a VF of 16 because that leads to an excessive number of CHECK lines, whereas I think a lower VF would still defend the same code paths.
The reason is that scalarized load/store currently only supports fixed VFs, but !llvm.loop !0 requests a user VF of vscale x 4.
The current strategy for handling such cases is to let the compiler freely choose the VF:
// Only clamp if the UserVF is not scalable. If the UserVF is scalable, it
// is better to ignore the hint and let the compiler choose a suitable VF.
As a result, the actual VF may differ from the specified width. This behavior doesn't quite match the intended semantics described in the comment for ScalableForceKind::SK_PreferScalable:
/// Vectorize loops using scalable vectors or fixed-width vectors, but favor
/// scalable vectors when the cost-model is inconclusive. This is the
/// default when the scalable.enable hint is enabled through a pragma.
SK_PreferScalable = 1
Should we consider updating the behavior so that when a user-specified scalable VF can't be used, we fall back to a fixed VF instead? That would better align with the definition of SK_PreferScalable. cc @fhahn
But for now, since this is just an NFC test patch, I added a new hint with a fixed VF of 4: 8f6bd75
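For reference, the kind of fixed-width hint described above can be expressed as loop metadata along these lines (a minimal sketch with a hypothetical branch, not the exact metadata from the patch):

```llvm
; Hypothetical loop latch carrying vectorization hints. This metadata
; requests a fixed VF of 4 and disables scalable vectorization, in
; contrast to a hint that asks for a scalable VF of vscale x 4.
br i1 %cmp, label %for.body, label %exit, !llvm.loop !0

!0 = distinct !{!0, !1, !2}
!1 = !{!"llvm.loop.vectorize.width", i32 4}
!2 = !{!"llvm.loop.vectorize.scalable.enable", i1 false}
```

With scalable.enable set to false, the vectorizer treats the width as a fixed VF request rather than a scalable one, which avoids the fallback behavior discussed above.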
Test case LGTM, thanks
…fc (llvm#135139) Add a test with irregular type to ensure the vector load/store instructions are not generated.