-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[flang] Add -f[no-]slp-vectorize flags #132801
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add -f[no-]slp-vectorize to the flang driver. Add corresponding -fvectorize-slp to the flang frontend. Signed-off-by: Kajetan Puchalski <[email protected]>
@llvm/pr-subscribers-flang-driver @llvm/pr-subscribers-clang-driver Author: Kajetan Puchalski (mrkajetanp) ChangesAdd -f[no-]slp-vectorize to the flang driver. Full diff: https://github.com/llvm/llvm-project/pull/132801.diff 6 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 59a57c83c6b89..a2cdf9f26fcdf 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4032,11 +4032,14 @@ def : Flag<["-"], "ftree-vectorize">, Alias<fvectorize>;
def : Flag<["-"], "fno-tree-vectorize">, Alias<fno_vectorize>;
}
+let Visibility = [ClangOption, FlangOption] in {
def fslp_vectorize : Flag<["-"], "fslp-vectorize">, Group<f_Group>,
HelpText<"Enable the superword-level parallelism vectorization passes">;
def fno_slp_vectorize : Flag<["-"], "fno-slp-vectorize">, Group<f_Group>;
def : Flag<["-"], "ftree-slp-vectorize">, Alias<fslp_vectorize>;
def : Flag<["-"], "fno-tree-slp-vectorize">, Alias<fno_slp_vectorize>;
+}
+
def Wlarge_by_value_copy_def : Flag<["-"], "Wlarge-by-value-copy">,
HelpText<"Warn if a function definition returns or accepts an object larger "
"in bytes than a given value">, Flags<[HelpHidden]>;
@@ -7384,6 +7387,9 @@ def mlink_bitcode_file
def vectorize_loops : Flag<["-"], "vectorize-loops">,
HelpText<"Run the Loop vectorization passes">,
MarshallingInfoFlag<CodeGenOpts<"VectorizeLoop">>;
+def vectorize_slp : Flag<["-"], "vectorize-slp">,
+ HelpText<"Run the SLP vectorization passes">,
+ MarshallingInfoFlag<CodeGenOpts<"VectorizeSLP">>;
} // let Visibility = [CC1Option, FC1Option]
let Visibility = [CC1Option] in {
@@ -7499,9 +7505,6 @@ defm link_builtin_bitcode_postopt: BoolMOption<"link-builtin-bitcode-postopt",
PosFlag<SetTrue, [], [ClangOption], "Link builtin bitcodes after the "
"optimization pipeline">,
NegFlag<SetFalse, [], [ClangOption]>>;
-def vectorize_slp : Flag<["-"], "vectorize-slp">,
- HelpText<"Run the SLP vectorization passes">,
- MarshallingInfoFlag<CodeGenOpts<"VectorizeSLP">>;
def linker_option : Joined<["--"], "linker-option=">,
HelpText<"Add linker option">,
MarshallingInfoStringVector<CodeGenOpts<"LinkerOptions">>;
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index 5dbc5cbe77d0a..bb89432d3e58c 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -161,6 +161,14 @@ void Flang::addCodegenOptions(const ArgList &Args,
options::OPT_fno_vectorize, enableVec))
CmdArgs.push_back("-vectorize-loops");
+ // -fslp-vectorize is enabled based on the optimization level selected.
+ bool EnableSLPVec = shouldEnableVectorizerAtOLevel(Args, true);
+ OptSpecifier SLPVectAliasOption =
+ EnableSLPVec ? options::OPT_O_Group : options::OPT_fslp_vectorize;
+ if (Args.hasFlag(options::OPT_fslp_vectorize, SLPVectAliasOption,
+ options::OPT_fno_slp_vectorize, EnableSLPVec))
+ CmdArgs.push_back("-vectorize-slp");
+
if (shouldLoopVersion(Args))
CmdArgs.push_back("-fversion-loops-for-stride");
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index 44cb5a2cdd497..5d6af4271d4f6 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -32,6 +32,7 @@ CODEGENOPT(PrepareForThinLTO , 1, 0) ///< Set when -flto=thin is enabled on the
///< compile step.
CODEGENOPT(StackArrays, 1, 0) ///< -fstack-arrays (enable the stack-arrays pass)
CODEGENOPT(VectorizeLoop, 1, 0) ///< Enable loop vectorization.
+CODEGENOPT(VectorizeSLP, 1, 0) ///< Enable SLP vectorization.
CODEGENOPT(LoopVersioning, 1, 0) ///< Enable loop versioning.
CODEGENOPT(UnrollLoops, 1, 0) ///< Enable loop unrolling
CODEGENOPT(AliasAnalysis, 1, 0) ///< Enable alias analysis pass
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index f433ec9966922..652b3d8795ab7 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -246,6 +246,9 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
if (args.getLastArg(clang::driver::options::OPT_vectorize_loops))
opts.VectorizeLoop = 1;
+ if (args.getLastArg(clang::driver::options::OPT_vectorize_slp))
+ opts.VectorizeSLP = 1;
+
if (args.hasFlag(clang::driver::options::OPT_floop_versioning,
clang::driver::options::OPT_fno_loop_versioning, false))
opts.LoopVersioning = 1;
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index 46ec7550e4140..3e6e98855fe7a 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -933,6 +933,7 @@ void CodeGenAction::runOptimizationPipeline(llvm::raw_pwrite_stream &os) {
pto.LoopUnrolling = opts.UnrollLoops;
pto.LoopInterleaving = opts.UnrollLoops;
pto.LoopVectorization = opts.VectorizeLoop;
+ pto.SLPVectorization = opts.VectorizeSLP;
llvm::PassBuilder pb(targetMachine, pto, pgoOpt, &pic);
diff --git a/flang/test/Driver/slp-vectorize.f90 b/flang/test/Driver/slp-vectorize.f90
new file mode 100644
index 0000000000000..55d4be1a13c71
--- /dev/null
+++ b/flang/test/Driver/slp-vectorize.f90
@@ -0,0 +1,10 @@
+! RUN: %flang -### -S -fslp-vectorize %s 2>&1 | FileCheck -check-prefix=CHECK-SLP-VECTORIZE %s
+! RUN: %flang -### -S -fno-slp-vectorize %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SLP-VECTORIZE %s
+! RUN: %flang -### -S -O1 %s 2>&1 | FileCheck -check-prefix=CHECK-NO-SLP-VECTORIZE %s
+! RUN: %flang -### -S -O2 %s 2>&1 | FileCheck -check-prefix=CHECK-SLP-VECTORIZE %s
+! RUN: %flang -### -S -O3 %s 2>&1 | FileCheck -check-prefix=CHECK-SLP-VECTORIZE %s
+! CHECK-SLP-VECTORIZE: "-vectorize-slp"
+! CHECK-NO-SLP-VECTORIZE-NOT: "-no-vectorize-slp"
+
+program test
+end program
|
If you are soliciting reviews, you could also use the "Reviewers" box on the top right of this page |
I would, but I do not currently have the right permissions to use the box. Hence the ccs - I need someone else to do it. |
Huh. I didn't realize that one needed special permissions for that. |
flang - Expand tests for -fslp-vectorize Signed-off-by: Kajetan Puchalski <[email protected]>
6689d84
to
e78240d
Compare
a9b7ccb
to
b45704a
Compare
Co-authored-by: Tarun Prabhu <[email protected]> Signed-off-by: Kajetan Puchalski <[email protected]>
b45704a
to
3258e8e
Compare
Signed-off-by: Kajetan Puchalski <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the docstrings, this looks good. Thanks for the changes :-)
Signed-off-by: Kajetan Puchalski <[email protected]>
Done, thanks a lot for the review! :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
Is this buildbot failure related: https://lab.llvm.org/buildbot/#/builders/89/builds/19482? |
This patch might partially address the issue raised in #73180 |
Likely. We shouldn't be generating files in driver tests. For checking output, we have to pipe the output to FileCheck. |
Ah yes, sorry about that! Just a missing -o /dev/null, I'll post a fix in a moment. |
Certainly, I tracked a regression in a workload down to flang not running the slp vectorizer. Hence this patch. |
The buildbot failure fix is here: |
Add -f[no-]slp-vectorize to the flang driver.
Add corresponding -fvectorize-slp to the flang frontend.
Enable -fslp-vectorize at -O2 and higher in flang to match the current behaviour in clang.