[VPlan] Support fast in FMF in VPRecipeWithIRFlags. #130880
In the current FastMathFlags implementation, we need to explicitly set the flags to fast. Otherwise it will show all the sub-flags in the FMF.
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-vectorizers
Author: Elvis Wang (ElvisWang123)
Changes
In the current FastMathFlags implementation, we need to explicitly set the flags to fast. Otherwise it will show all the sub-flags in the FMF.
Split from #113903.
Full diff: https://github.com/llvm/llvm-project/pull/130880.diff
4 Files Affected:
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index b1288c42b20f2..2aba1331a6259 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -614,6 +614,7 @@ class VPRecipeWithIRFlags : public VPSingleDefRecipe {
char AllowReciprocal : 1;
char AllowContract : 1;
char ApproxFunc : 1;
+ char Fast : 1;
FastMathFlagsTy(const FastMathFlags &FMF);
};
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index d154d54c37862..062c65cf2595c 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -360,6 +360,7 @@ FastMathFlags VPRecipeWithIRFlags::getFastMathFlags() const {
assert(OpType == OperationType::FPMathOp &&
"recipe doesn't have fast math flags");
FastMathFlags Res;
+ Res.setFast(FMFs.Fast);
Res.setAllowReassoc(FMFs.AllowReassoc);
Res.setNoNaNs(FMFs.NoNaNs);
Res.setNoInfs(FMFs.NoInfs);
@@ -1393,6 +1394,7 @@ VPRecipeWithIRFlags::FastMathFlagsTy::FastMathFlagsTy(
AllowReciprocal = FMF.allowReciprocal();
AllowContract = FMF.allowContract();
ApproxFunc = FMF.approxFunc();
+ Fast = FMF.isFast();
}
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
diff --git a/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll b/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
index a119707bed120..89bb495045e41 100644
--- a/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/widen-call-with-intrinsic-or-libfunc.ll
@@ -26,7 +26,7 @@ target triple = "arm64-apple-ios"
; CHECK-NEXT: vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.src>
; CHECK-NEXT: WIDEN ir<%l> = load vp<[[VEC_PTR]]>
; CHECK-NEXT: WIDEN-CAST ir<%conv> = fpext ir<%l> to double
-; CHECK-NEXT: WIDEN-CALL ir<%s> = call reassoc nnan ninf nsz arcp contract afn @llvm.sin.f64(ir<%conv>) (using library function: __simd_sin_v2f64)
+; CHECK-NEXT: WIDEN-CALL ir<%s> = call fast @llvm.sin.f64(ir<%conv>) (using library function: __simd_sin_v2f64)
; CHECK-NEXT: REPLICATE ir<%gep.dst> = getelementptr inbounds ir<%dst>, vp<[[STEPS]]>
; CHECK-NEXT: REPLICATE store ir<%s>, ir<%gep.dst>
; CHECK-NEXT: EMIT vp<[[CAN_IV_NEXT:%.+]]> = add nuw vp<[[CAN_IV]]>, vp<[[VFxUF]]>
@@ -72,7 +72,7 @@ target triple = "arm64-apple-ios"
; CHECK-NEXT: vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.src>
; CHECK-NEXT: WIDEN ir<%l> = load vp<[[VEC_PTR]]>
; CHECK-NEXT: WIDEN-CAST ir<%conv> = fpext ir<%l> to double
-; CHECK-NEXT: WIDEN-INTRINSIC ir<%s> = call reassoc nnan ninf nsz arcp contract afn llvm.sin(ir<%conv>)
+; CHECK-NEXT: WIDEN-INTRINSIC ir<%s> = call fast llvm.sin(ir<%conv>)
; CHECK-NEXT: REPLICATE ir<%gep.dst> = getelementptr inbounds ir<%dst>, vp<[[STEPS]]>
; CHECK-NEXT: REPLICATE store ir<%s>, ir<%gep.dst>
; CHECK-NEXT: EMIT vp<[[CAN_IV_NEXT:%.+]]> = add nuw vp<[[CAN_IV]]>, vp<[[VFxUF]]>
diff --git a/llvm/test/Transforms/LoopVectorize/vplan-printing.ll b/llvm/test/Transforms/LoopVectorize/vplan-printing.ll
index 00d8de67a3b40..207cb8b4a0d30 100644
--- a/llvm/test/Transforms/LoopVectorize/vplan-printing.ll
+++ b/llvm/test/Transforms/LoopVectorize/vplan-printing.ll
@@ -800,7 +800,7 @@ define void @print_fast_math_flags(i64 %n, ptr noalias %y, ptr noalias %x, ptr %
; CHECK-NEXT: vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.y>
; CHECK-NEXT: WIDEN ir<%lv> = load vp<[[VEC_PTR]]>
; CHECK-NEXT: WIDEN ir<%add> = fadd nnan ir<%lv>, ir<1.000000e+00>
-; CHECK-NEXT: WIDEN ir<%mul> = fmul reassoc nnan ninf nsz arcp contract afn ir<%add>, ir<2.000000e+00>
+; CHECK-NEXT: WIDEN ir<%mul> = fmul fast ir<%add>, ir<2.000000e+00>
; CHECK-NEXT: WIDEN ir<%div> = fdiv reassoc nsz contract ir<%mul>, ir<2.000000e+00>
; CHECK-NEXT: CLONE ir<%gep.x> = getelementptr inbounds ir<%x>, vp<[[STEPS]]>
; CHECK-NEXT: vp<[[VEC_PTR:%.+]]> = vector-pointer ir<%gep.x>
@@ -1224,8 +1224,8 @@ define void @print_select_with_fastmath_flags(ptr noalias %a, ptr noalias %b, pt
; CHECK-NEXT: vp<[[PTR2:%.+]]> = vector-pointer ir<[[GEP2]]>
; CHECK-NEXT: WIDEN ir<[[LD2:%.+]]> = load vp<[[PTR2]]>
; CHECK-NEXT: WIDEN ir<[[FCMP:%.+]]> = fcmp ogt ir<[[LD1]]>, ir<[[LD2]]>
-; CHECK-NEXT: WIDEN ir<[[FADD:%.+]]> = fadd reassoc nnan ninf nsz arcp contract afn ir<[[LD1]]>, ir<1.000000e+01>
-; CHECK-NEXT: WIDEN-SELECT ir<[[SELECT:%.+]]> = select reassoc nnan ninf nsz arcp contract afn ir<[[FCMP]]>, ir<[[FADD]]>, ir<[[LD2]]>
+; CHECK-NEXT: WIDEN ir<[[FADD:%.+]]> = fadd fast ir<[[LD1]]>, ir<1.000000e+01>
+; CHECK-NEXT: WIDEN-SELECT ir<[[SELECT:%.+]]> = select fast ir<[[FCMP]]>, ir<[[FADD]]>, ir<[[LD2]]>
; CHECK-NEXT: CLONE ir<[[GEP3:%.+]]> = getelementptr inbounds nuw ir<%a>, vp<[[ST]]>
; CHECK-NEXT: vp<[[PTR3:%.+]]> = vector-pointer ir<[[GEP3]]>
; CHECK-NEXT: WIDEN store vp<[[PTR3]]>, ir<[[SELECT]]>
I think this is a bit strange because we end up adding a state that doesn't exist in FastMathFlags. I see that FastMathFlagsTy was used instead of FastMathFlags directly to keep the union tightly packed, according to https://reviews.llvm.org/D149079#inline-1447781, i.e. 1 byte instead of the 4 bytes for FastMathFlags. But after fd66195, that union has an unsigned int, so it's going to be 4 bytes anyway. So maybe we should just replace FastMathFlagsTy with FastMathFlags directly?
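The size argument in the comment above can be sketched in isolation. The snippet below is a minimal illustration, not the VPlan code: `PackedFlags`, `FlagsOnly`, `FlagsWithUnsigned`, and `AllFlags` are hypothetical names, and bit-field layout is implementation-defined, so the 1-byte and 4-byte figures are what common targets produce rather than a guarantee.

```cpp
#include <cstddef>

// Hypothetical stand-in for a one-bit-per-flag struct such as FastMathFlagsTy.
// Eight 1-bit char fields usually fit in a single byte.
struct PackedFlags {
  char AllowReassoc : 1;
  char NoNaNs : 1;
  char NoInfs : 1;
  char NoSignedZeros : 1;
  char AllowReciprocal : 1;
  char AllowContract : 1;
  char ApproxFunc : 1;
  char Fast : 1;
};

// A union whose members are all byte-sized stays small.
union FlagsOnly {
  PackedFlags FMFs;
  unsigned char OtherFlags; // assumption: some other 1-byte flag group
};

// Once any member is an unsigned int, the union is at least that large,
// so packing the flag struct into one byte no longer saves space.
union FlagsWithUnsigned {
  PackedFlags FMFs;
  unsigned AllFlags; // assumption: models the unsigned member mentioned above
};

// A union is at least as large as its largest member.
static_assert(sizeof(FlagsWithUnsigned) >= sizeof(unsigned),
              "union must be able to hold its unsigned member");

// Helpers so the sizes can be inspected or printed by a caller.
constexpr std::size_t sizeOfFlagsOnly() { return sizeof(FlagsOnly); }
constexpr std::size_t sizeOfFlagsWithUnsigned() {
  return sizeof(FlagsWithUnsigned);
}
```

On typical 64-bit Linux and macOS toolchains, `sizeOfFlagsOnly()` is expected to be 1 and `sizeOfFlagsWithUnsigned()` to be 4, which is the reasoning behind suggesting a plain FastMathFlags member instead.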
Thanks for your comments.
In the current FastMathFlags implementation, we need to explicitly set the flags to fast. Otherwise it will show all the sub-flags in the FMF.
This patch is almost NFC, because having all the sub-flags (reassoc, nnan, ninf, nsz, arcp, contract, afn) set is equivalent to fast.
Split from #130881, #113903.
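To illustrate why the printed output changes while the semantics stay the same, here is a minimal self-contained sketch. `ModelFMF` and `roundTrip` are hypothetical stand-ins, not LLVM's FastMathFlags or the VPlan code. The sketch assumes, as the description implies, that fast is tracked as a dedicated all-ones state, so setting the seven sub-flags one by one does not reach it.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of a fast-math flag set; not LLVM's actual class.
class ModelFMF {
  unsigned Flags = 0;

public:
  enum : unsigned {
    Reassoc = 1u << 0,
    NNan = 1u << 1,
    NInf = 1u << 2,
    NSZ = 1u << 3,
    Arcp = 1u << 4,
    Contract = 1u << 5,
    Afn = 1u << 6
  };

  // Assumption: "fast" is a distinct all-ones state, not just 0x7F.
  void setFast(bool B = true) { Flags = B ? ~0u : 0u; }
  bool isFast() const { return Flags == ~0u; }

  // Set or clear exactly one sub-flag, leaving all other bits untouched.
  void setFlag(unsigned Bit, bool B) {
    assert(Bit != 0 && (Bit & (Bit - 1)) == 0 && "expected a single bit");
    Flags = (Flags & ~Bit) | (B ? Bit : 0u);
  }
  bool hasFlag(unsigned Bit) const { return (Flags & Bit) != 0; }

  // Print "fast" for the all-ones state, otherwise list the sub-flags.
  std::string str() const {
    if (isFast())
      return "fast";
    std::string S;
    auto Add = [&](unsigned Bit, const char *Name) {
      if (!hasFlag(Bit))
        return;
      if (!S.empty())
        S += ' ';
      S += Name;
    };
    Add(Reassoc, "reassoc");
    Add(NNan, "nnan");
    Add(NInf, "ninf");
    Add(NSZ, "nsz");
    Add(Arcp, "arcp");
    Add(Contract, "contract");
    Add(Afn, "afn");
    return S;
  }
};

// Mirrors the shape of getFastMathFlags() in the diff: restore the stored
// Fast bit first, then re-apply every sub-flag (all enabled here).
std::string roundTrip(bool StoredFast) {
  ModelFMF Res;
  Res.setFast(StoredFast);
  for (unsigned Bit = ModelFMF::Reassoc; Bit <= ModelFMF::Afn; Bit <<= 1)
    Res.setFlag(Bit, true);
  return Res.str();
}
```

Under these assumptions, `roundTrip(false)` yields the long sub-flag list seen in the old CHECK lines, while `roundTrip(true)` yields `fast`, matching the updated tests.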