Add debug options to clang-linker-wrapper #101008

Closed
macurtis-amd wants to merge 3 commits

Contributor

@macurtis-amd macurtis-amd commented Jul 29, 2024

Some things I found useful while debugging code generation differences between the old and new offloading drivers.
No functional changes (intended).

New options:

--lto-debug-pass-manager      Prints debug information for the new pass manager during LTO
--lto-in-process              Use in-process LTO even if linker supports LTO

Also teaches clang-linker-wrapper to forward -offload-opt=-<value> as -Wl,--mllvm=<value> to the device linker.
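
As a rough illustration of that forwarding rule (a standalone sketch using only the standard library, not the code from this PR; `matchOffloadOpt` and `forwardOffloadOpts` are hypothetical names), everything after `offload-opt=-` is treated as the value and re-emitted behind `-Wl,--mllvm=`:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the wrapper's option parsing: if Arg spells
// "offload-opt=-" with a "--" or "-" prefix, store the remainder in Value.
static bool matchOffloadOpt(const std::string &Arg, std::string &Value) {
  static const char *const Prefixes[] = {"--offload-opt=-", "-offload-opt=-"};
  for (const char *Prefix : Prefixes) {
    const std::string P(Prefix);
    if (Arg.compare(0, P.size(), P) == 0) {
      Value = Arg.substr(P.size());
      return true;
    }
  }
  return false;
}

// Builds the extra arguments a device link command would receive.
// Arguments that are not offload-opt options are ignored here.
std::vector<std::string>
forwardOffloadOpts(const std::vector<std::string> &WrapperArgs) {
  std::vector<std::string> CmdArgs;
  for (const std::string &Arg : WrapperArgs) {
    std::string Value;
    if (matchOffloadOpt(Arg, Value))
      CmdArgs.push_back("-Wl,--mllvm=" + Value);
  }
  // Each input yields at most one forwarded argument.
  assert(CmdArgs.size() <= WrapperArgs.size());
  return CmdArgs;
}
```

With the inputs used in the test added by this PR, `-offload-opt=--print-before=openmp-opt` becomes `-Wl,--mllvm=-print-before=openmp-opt`, because the first `-` after `=` belongs to the option name rather than the value.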

@llvmbot llvmbot added labels Jul 29, 2024: clang (Clang issues not falling into any other category), lld, clang:driver ('clang' and 'clang++' user-facing binaries, not 'clang-cl'), lld:ELF, LTO (link time optimization: regular/full LTO or ThinLTO)
@llvmbot
Member

llvmbot commented Jul 29, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lto

@llvm/pr-subscribers-lld-elf

Author: None (macurtis-amd)

Changes

New options:

clang-linker-wrapper
---------------------------
--lto-debug-pass-manager      Prints debug information for the new pass manager during LTO
--lto-in-process              Use in-process LTO even if linker supports LTO
--lto-print-pipeline-passes   Print a '-passes' compatible string describing the LTO pipeline (best-effort only).


lld
---------------------------
--lto-print-pipeline-passes   Print a '-passes' compatible string describing the LTO pipeline (best-effort only).
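
To illustrate what "best-effort" means here, a simplified standalone model (not the LLVM implementation; `describePipeline` and the lookup table are hypothetical) maps each pass class name to its registered pipeline name and falls back to the class name when no pipeline name is known, which is presumably why the printed string is not guaranteed to be accepted by `-passes`. The real output can also contain nested pass managers, which this flat sketch ignores.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Joins pass names with commas. KnownNames is an illustrative table from
// class name to pipeline name; unknown classes keep their class name.
std::string
describePipeline(const std::vector<std::string> &ClassNames,
                 const std::map<std::string, std::string> &KnownNames) {
  std::string Out;
  for (const std::string &ClassName : ClassNames) {
    // Pass class names are expected to be non-empty.
    assert(!ClassName.empty());
    auto It = KnownNames.find(ClassName);
    const bool Known = It != KnownNames.end() && !It->second.empty();
    const std::string &Name = Known ? It->second : ClassName;
    if (!Out.empty())
      Out += ',';
    Out += Name;
  }
  return Out;
}
```

In this model a pipeline that starts and ends with the verifier yields a string of the form `verify,...,verify`, which is the shape the new lld test checks for.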

Also teaches clang-linker-wrapper to forward -offload-opt=-<value> as -Wl,--mllvm=<value> to the device linker.


Full diff: https://github.com/llvm/llvm-project/pull/101008.diff

10 Files Affected:

  • (modified) clang/test/Driver/linker-wrapper.c (+20-4)
  • (modified) clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp (+22-4)
  • (modified) clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td (+8)
  • (modified) lld/ELF/Config.h (+1)
  • (modified) lld/ELF/Driver.cpp (+1)
  • (modified) lld/ELF/LTO.cpp (+1)
  • (modified) lld/ELF/Options.td (+2)
  • (added) lld/test/ELF/lto/print-pipeline-passes.ll (+15)
  • (modified) llvm/include/llvm/LTO/Config.h (+3)
  • (modified) llvm/lib/LTO/LTOBackend.cpp (+10)
diff --git a/clang/test/Driver/linker-wrapper.c b/clang/test/Driver/linker-wrapper.c
index 342907c1a3390..00492c0745baa 100644
--- a/clang/test/Driver/linker-wrapper.c
+++ b/clang/test/Driver/linker-wrapper.c
@@ -30,7 +30,7 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --device-debug -O0 \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=NVPTX-LINK-DEBUG
 
-// NVPTX-LINK-DEBUG: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda -march=sm_70 -O2 {{.*}}.o {{.*}}.o -g 
+// NVPTX-LINK-DEBUG: clang{{.*}} -o {{.*}}.img --target=nvptx64-nvidia-cuda -march=sm_70 -O2 {{.*}}.o {{.*}}.o -g
 
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx908 \
@@ -45,11 +45,27 @@ __attribute__((visibility("protected"), used)) int x;
 // RUN:   --image=file=%t.amdgpu.bc,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx1030 \
 // RUN:   --image=file=%t.amdgpu.bc,kind=openmp,triple=amdgcn-amd-amdhsa,arch=gfx1030
 // RUN: %clang -cc1 %s -triple x86_64-unknown-linux-gnu -emit-obj -o %t.o -fembed-offload-object=%t.out
-// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --dry-run --save-temps -O2 \
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --wrapper-verbose --dry-run --save-temps -O2 \
 // RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=AMDGPU-LTO-TEMPS
-
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --wrapper-verbose --dry-run --save-temps -O2 \
+// RUN:   --lto-in-process --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=AMDGPU-LTO-IN-PROC
+// RUN: clang-linker-wrapper --host-triple=x86_64-unknown-linux-gnu --wrapper-verbose --dry-run --save-temps -O2 \
+// RUN:   --lto-debug-pass-manager --lto-print-pipeline-passes \
+// RUN:   -offload-opt=--print-before=openmp-opt -offload-opt=--print-after=openmp-opt \
+// RUN:   --linker-path=/usr/bin/ld %t.o -o a.out 2>&1 | FileCheck %s --check-prefix=AMDGPU-LTO-OPTS
+
+// AMDGPU-LTO-TEMPS-NOT: Linking bitcode files
 // AMDGPU-LTO-TEMPS: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -O2 -Wl,--no-undefined {{.*}}.o -save-temps
 
+// AMDGPU-LTO-IN-PROC: Linking bitcode files
+// AMDGPU-LTO-IN-PROC: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -O2 -Wl,--no-undefined {{.*}}.s -save-temps
+
+// AMDGPU-LTO-OPTS: clang{{.*}} -o {{.*}}.img --target=amdgcn-amd-amdhsa -mcpu=gfx1030 -O2 -Wl,--no-undefined {{.*}}.o -save-temps -Wl,--save-temps
+// AMDGPU-LTO-OPTS-SAME: -Wl,--lto-debug-pass-manager
+// AMDGPU-LTO-OPTS-SAME: -Wl,--lto-print-pipeline-passes
+// AMDGPU-LTO-OPTS-SAME: -Wl,--mllvm=-print-before=openmp-opt
+// AMDGPU-LTO-OPTS-SAME: -Wl,--mllvm=-print-after=openmp-opt
+
 // RUN: clang-offload-packager -o %t.out \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu \
 // RUN:   --image=file=%t.elf.o,kind=openmp,triple=x86_64-unknown-linux-gnu
@@ -93,7 +109,7 @@ __attribute__((visibility("protected"), used)) int x;
 
 // CUDA: clang{{.*}} -o [[IMG_SM70:.+]] --target=nvptx64-nvidia-cuda -march=sm_70
 // CUDA: clang{{.*}} -o [[IMG_SM52:.+]] --target=nvptx64-nvidia-cuda -march=sm_52
-// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]] 
+// CUDA: fatbinary{{.*}}-64 --create {{.*}}.fatbin --image=profile=sm_70,file=[[IMG_SM70]] --image=profile=sm_52,file=[[IMG_SM52]]
 // CUDA: usr/bin/ld{{.*}} {{.*}}.openmp.image.{{.*}}.o {{.*}}.cuda.image.{{.*}}.o
 
 // RUN: clang-offload-packager -o %t.out \
diff --git a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
index 4bb021eae25a8..abaad59d25202 100644
--- a/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
+++ b/clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
@@ -573,8 +573,18 @@ Expected<StringRef> clang(ArrayRef<StringRef> InputFiles, const ArgList &Args) {
   if (SaveTemps)
     CmdArgs.push_back("-save-temps");
 
-  if (SaveTemps && linkerSupportsLTO(Args))
-    CmdArgs.push_back("-Wl,--save-temps");
+  if (linkerSupportsLTO(Args)) {
+    if (SaveTemps)
+      CmdArgs.push_back("-Wl,--save-temps");
+    if (Args.hasArg(OPT_lto_debug_pass_manager))
+      CmdArgs.push_back("-Wl,--lto-debug-pass-manager");
+    if (Args.hasArg(OPT_lto_print_pipeline_passes))
+      CmdArgs.push_back("-Wl,--lto-print-pipeline-passes");
+    for (const opt::Arg *Arg : Args.filtered(OPT_offload_opt_eq_minus)) {
+      CmdArgs.push_back(
+          Args.MakeArgString("-Wl,--mllvm=" + StringRef(Arg->getValue())));
+    }
+  }
 
   if (Args.hasArg(OPT_embed_bitcode))
     CmdArgs.push_back("-Wl,--lto-emit-llvm");
@@ -759,6 +769,9 @@ std::unique_ptr<lto::LTO> createLTO(
   // TODO: Handle remark files
   Conf.HasWholeProgramVisibility = Args.hasArg(OPT_whole_program);
 
+  Conf.DebugPassManager = Args.hasArg(OPT_lto_debug_pass_manager);
+  Conf.PrintPipelinePasses = Args.hasArg(OPT_lto_print_pipeline_passes);
+
   return std::make_unique<lto::LTO>(std::move(Conf), Backend);
 }
 
@@ -773,6 +786,8 @@ bool isValidCIdentifier(StringRef S) {
 Error linkBitcodeFiles(SmallVectorImpl<OffloadFile> &InputFiles,
                        SmallVectorImpl<StringRef> &OutputFiles,
                        const ArgList &Args) {
+  if (Verbose)
+    llvm::errs() << "Linking bitcode files\n";
   llvm::TimeTraceScope TimeScope("Link bitcode files");
   const llvm::Triple Triple(Args.getLastArgValue(OPT_triple_EQ));
   StringRef Arch = Args.getLastArgValue(OPT_arch_EQ);
@@ -1009,6 +1024,8 @@ Expected<StringRef> writeOffloadFile(const OffloadFile &File) {
 // Compile the module to an object file using the appropriate target machine for
 // the host triple.
 Expected<StringRef> compileModule(Module &M, OffloadKind Kind) {
+  if (Verbose)
+    llvm::errs() << "Compiling module\n";
   llvm::TimeTraceScope TimeScope("Compile module");
   std::string Msg;
   const Target *T = TargetRegistry::lookupTarget(M.getTargetTriple(), Msg);
@@ -1189,7 +1206,8 @@ bundleLinkedOutput(ArrayRef<OffloadingImage> Images, const ArgList &Args,
   }
 }
 
-/// Returns a new ArgList containg arguments used for the device linking phase.
+/// Returns a new ArgList containing arguments used for the device linking
+/// phase.
 DerivedArgList getLinkerArgs(ArrayRef<OffloadFile> Input,
                              const InputArgList &Args) {
   DerivedArgList DAL = DerivedArgList(DerivedArgList(Args));
@@ -1301,7 +1319,7 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
     // First link and remove all the input files containing bitcode if
     // the target linker does not support it natively.
     SmallVector<StringRef> InputFiles;
-    if (!linkerSupportsLTO(LinkerArgs))
+    if (!linkerSupportsLTO(LinkerArgs) || Args.hasArg(OPT_lto_in_process))
       if (Error Err = linkBitcodeFiles(Input, InputFiles, LinkerArgs))
         return Err;
 
diff --git a/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td b/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td
index 9c27e588fc4f5..29f2177b0b138 100644
--- a/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td
+++ b/clang/tools/clang-linker-wrapper/LinkerWrapperOpts.td
@@ -102,6 +102,14 @@ def offload_opt_eq_minus : Joined<["--", "-"], "offload-opt=-">, Flags<[HelpHidd
   HelpText<"Options passed to LLVM, not including the Clang invocation. Use "
            "'--offload-opt=--help' for a list of options.">;
 
+// Arguments for LTO
+def lto_in_process : Flag<["--"], "lto-in-process">,
+  Flags<[WrapperOnlyOption]>, HelpText<"Use in-process LTO even if linker supports LTO">;
+def lto_debug_pass_manager : Flag<["--"], "lto-debug-pass-manager">,
+  Flags<[WrapperOnlyOption]>, HelpText<"Prints debug information for the new pass manager during LTO">;
+def lto_print_pipeline_passes : Flag<["--"], "lto-print-pipeline-passes">,
+  Flags<[WrapperOnlyOption]>, HelpText<"Print a '-passes' compatible string describing the LTO pipeline (best-effort only).">;
+
 // Standard linker flags also used by the linker wrapper.
 def sysroot_EQ : Joined<["--"], "sysroot=">, HelpText<"Set the system root">;
 
diff --git a/lld/ELF/Config.h b/lld/ELF/Config.h
index 6abd929d2343d..65216a47ecbdf 100644
--- a/lld/ELF/Config.h
+++ b/lld/ELF/Config.h
@@ -260,6 +260,7 @@ struct Config {
   bool ignoreFunctionAddressEquality;
   bool ltoCSProfileGenerate;
   bool ltoPGOWarnMismatch;
+  bool ltoPrintPipelinePasses;
   bool ltoDebugPassManager;
   bool ltoEmitAsm;
   bool ltoUniqueBasicBlockSectionNames;
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 7e0a5a1937c7f..814145ae41d95 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1313,6 +1313,7 @@ static void readConfigs(opt::InputArgList &args) {
   config->ltoCSProfileFile = args.getLastArgValue(OPT_lto_cs_profile_file);
   config->ltoPGOWarnMismatch = args.hasFlag(OPT_lto_pgo_warn_mismatch,
                                             OPT_no_lto_pgo_warn_mismatch, true);
+  config->ltoPrintPipelinePasses = args.hasArg(OPT_lto_print_pipeline_passes);
   config->ltoDebugPassManager = args.hasArg(OPT_lto_debug_pass_manager);
   config->ltoEmitAsm = args.hasArg(OPT_lto_emit_asm);
   config->ltoNewPmPasses = args.getLastArgValue(OPT_lto_newpm_passes);
diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 935d0a9eab9ee..0e56ee4a86f88 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -127,6 +127,7 @@ static lto::Config createConfig() {
   c.SampleProfile = std::string(config->ltoSampleProfile);
   for (StringRef pluginFn : config->passPlugins)
     c.PassPlugins.push_back(std::string(pluginFn));
+  c.PrintPipelinePasses = config->ltoPrintPipelinePasses;
   c.DebugPassManager = config->ltoDebugPassManager;
   c.DwoDir = std::string(config->dwoDir);
 
diff --git a/lld/ELF/Options.td b/lld/ELF/Options.td
index 74733efb28ff5..49a86c316ec46 100644
--- a/lld/ELF/Options.td
+++ b/lld/ELF/Options.td
@@ -610,6 +610,8 @@ def lto: JJ<"lto=">, HelpText<"Set LTO backend">,
                  MetaVarName<"[full,thin]">;
 def lto_aa_pipeline: JJ<"lto-aa-pipeline=">,
   HelpText<"AA pipeline to run during LTO. Used in conjunction with -lto-newpm-passes">;
+def lto_print_pipeline_passes: FF<"lto-print-pipeline-passes">,
+  HelpText<"Print a '-passes' compatible string describing the LTO pipeline (best-effort only).">;
 def lto_debug_pass_manager: FF<"lto-debug-pass-manager">,
   HelpText<"Debug new pass manager">;
 def lto_emit_asm: FF<"lto-emit-asm">,
diff --git a/lld/test/ELF/lto/print-pipeline-passes.ll b/lld/test/ELF/lto/print-pipeline-passes.ll
new file mode 100644
index 0000000000000..0ff42eebba296
--- /dev/null
+++ b/lld/test/ELF/lto/print-pipeline-passes.ll
@@ -0,0 +1,15 @@
+; REQUIRES: x86
+
+; RUN: llvm-as %s -o %t.o
+; RUN: ld.lld --lto-print-pipeline-passes -o %t.out.o %t.o 2>&1 | FileCheck %s
+
+; CHECK: pipeline-passes: verify,{{.*}},verify
+
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+@llvm.compiler.used = appending global [1 x ptr] [ptr @main], section "llvm.metadata"
+
+define hidden void @main() {
+  ret void
+}
diff --git a/llvm/include/llvm/LTO/Config.h b/llvm/include/llvm/LTO/Config.h
index 482b6e55a19d3..1e3aa92537ac9 100644
--- a/llvm/include/llvm/LTO/Config.h
+++ b/llvm/include/llvm/LTO/Config.h
@@ -165,6 +165,9 @@ struct Config {
   /// Whether to emit the pass manager debuggging informations.
   bool DebugPassManager = false;
 
+  /// Print a '-passes' compatible string describing the pipeline (best-effort only).
+  bool PrintPipelinePasses = false;
+
   /// Statistics output file path.
   std::string StatsFile;
 
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index d5d642f0d25e6..949850a3c791b 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -335,6 +335,16 @@ static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM,
   if (!Conf.DisableVerify)
     MPM.addPass(VerifierPass());
 
+  if (Conf.PrintPipelinePasses) {
+    std::string PipelineStr;
+    raw_string_ostream OS(PipelineStr);
+    MPM.printPipeline(OS, [&PIC](StringRef ClassName) {
+      auto PassName = PIC.getPassNameForClassName(ClassName);
+      return PassName.empty() ? ClassName : PassName;
+    });
+    outs() << "pipeline-passes: " << PipelineStr << '\n';
+  }
+
   MPM.run(Mod, MAM);
 }
 

@llvmbot
Member

llvmbot commented Jul 29, 2024

@llvm/pr-subscribers-clang-driver

@doru1004 doru1004 requested review from jhuber6 and doru1004 July 29, 2024 13:30
Contributor

@jhuber6 jhuber6 left a comment

lld and LLVM changes need to be separate, forwarding -mllvm is #100424 but I forgot to merge it so I'll do that.

@macurtis-amd
Contributor Author

macurtis-amd commented Jul 29, 2024

> lld and LLVM changes need to be separate, forwarding -mllvm is #100424 but I forgot to merge it so I'll do that.

I've created PR #101018 for that commit. Once that has been merged, I'll rebase this PR.

Comment on lines +794 to +795
  if (Verbose)
    llvm::errs() << "Linking bitcode files\n";
Contributor

Verbose mode just wants to print information from the tools generally. I think if we want tracing we already have the time-trace scope.

Contributor Author

Time-trace seems to be focused more on profiling than just tracing. It writes JSON to a file.
I guess, as a human user, I just wanted a simple way to see when clang-linker-wrapper was running LTO in-process.
Is there a third alternative that would work for you?

Contributor

That sounds more like debugging than verbose output. We could add something for it, but in general I don't think it adds much, since this path isn't really used anymore.
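
For readers following this thread, the behaviour under discussion can be modelled roughly as below. This is a standalone sketch, not wrapper code: `LinkPlan` and `planDeviceLink` are hypothetical names, and the three booleans stand in for `linkerSupportsLTO(...)`, `--lto-in-process`, and `--wrapper-verbose`.

```cpp
#include <cassert>
#include <string>

struct LinkPlan {
  bool InProcessLTO = false; // the wrapper links the bitcode itself
  std::string Trace;         // message printed on that path, empty if none
};

// Mirrors the condition changed in linkAndWrapDeviceFiles: in-process LTO
// runs when the linker cannot do LTO or when it is explicitly requested.
LinkPlan planDeviceLink(bool LinkerSupportsLTO, bool ForceInProcess,
                        bool Verbose) {
  LinkPlan Plan;
  Plan.InProcessLTO = !LinkerSupportsLTO || ForceInProcess;
  if (Plan.InProcessLTO && Verbose)
    Plan.Trace = "Linking bitcode files\n";
  // A trace is only ever produced on the in-process path.
  assert(Plan.Trace.empty() || Plan.InProcessLTO);
  return Plan;
}
```

This matches the new test expectations: with an LTO-capable linker the message is absent unless `--lto-in-process` is passed.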

Comment on lines +589 to +591
    for (const opt::Arg *Arg : Args.filtered(OPT_offload_opt_eq_minus)) {
      CmdArgs.push_back(
          Args.MakeArgString("-Wl,--mllvm=" + StringRef(Arg->getValue())));
Contributor

This should already be handled elsewhere.

Contributor Author

I'll remove the commit that added this. I think I wrote it before 97c62b8f75 was merged.

Comment on lines +587 to +588
    if (Args.hasArg(OPT_lto_debug_pass_manager))
      CmdArgs.push_back("-Wl,--lto-debug-pass-manager");
Contributor

The --save-temps thing is kind of special. I'm thinking that for stuff like this, -Xoffload-linker --lto-debug-pass-manager is a better fit once I merge #101032.

Contributor Author

Verified that --device-linker=--lto-debug-pass-manager works.

I am going to close this PR since most of the functionality is redundant or now available by other means.
