Skip to content

[LLD] Do not combine cg_profile from obj and ordering file #121325

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Jan 5, 2025

Conversation

HaohaiWen
Copy link
Contributor

cg_profile in object is from CGProfilePass and it is often inaccurate.
While call-graph-ordering-file is provided by user. It is weird to
aggregate them together especially when call-graph-ordering-file is
accurate enough.

Add tests to track section reordering when both cg_profile section
and call-graph-ordering-file were given.
cg_profile in object is from CGProfilePass and it is often inaccurate.
While call-graph-ordering-file is provided by user. It is weird to
aggregate them together especially when call-graph-ordering-file is
accurate enough.
@HaohaiWen
Copy link
Contributor Author

Add test in #121324.

@llvmbot
Copy link
Member

llvmbot commented Dec 30, 2024

@llvm/pr-subscribers-lld-elf
@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-lld-coff

Author: Haohai Wen (HaohaiWen)

Changes

cg_profile in object is from CGProfilePass and it is often inaccurate.
While call-graph-ordering-file is provided by user. It is weird to
aggregate them together especially when call-graph-ordering-file is
accurate enough.


Full diff: https://github.com/llvm/llvm-project/pull/121325.diff

4 Files Affected:

  • (modified) lld/COFF/Driver.cpp (+3-3)
  • (modified) lld/ELF/Driver.cpp (+3-2)
  • (modified) lld/test/COFF/cgprofile-obj.s (+13-6)
  • (modified) lld/test/ELF/cgprofile-obj.s (+13-5)
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index be01ee41c9a2f2..4b9c189e21510d 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -2812,10 +2812,10 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   // Handle /call-graph-ordering-file and /call-graph-profile-sort (default on).
   if (config->callGraphProfileSort) {
     llvm::TimeTraceScope timeScope("Call graph");
-    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
+    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
       parseCallGraphFile(arg->getValue());
-    }
-    readCallGraphsFromObjectFiles(ctx);
+    else
+      readCallGraphsFromObjectFiles(ctx);
   }
 
   // Handle /print-symbol-order.
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index f573a8d3e19f3b..e8e99fa874b5d5 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3215,11 +3215,12 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
 
   // Read the callgraph now that we know what was gced or icfed
   if (ctx.arg.callGraphProfileSort != CGProfileSortKind::None) {
-    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
+    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
       if (std::optional<MemoryBufferRef> buffer =
               readFile(ctx, arg->getValue()))
         readCallGraph(ctx, *buffer);
-    readCallGraphsFromObjectFiles<ELFT>(ctx);
+    } else
+      readCallGraphsFromObjectFiles<ELFT>(ctx);
   }
 
   // Write the result to the file.
diff --git a/lld/test/COFF/cgprofile-obj.s b/lld/test/COFF/cgprofile-obj.s
index b267850c46382c..da8011724d9859 100644
--- a/lld/test/COFF/cgprofile-obj.s
+++ b/lld/test/COFF/cgprofile-obj.s
@@ -2,9 +2,12 @@
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-win32 %s -o %t
 # RUN: lld-link /subsystem:console /entry:A %t /out:%t2 /debug:symtab
-# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s
+# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s --check-prefix=CG-OBJ
 # RUN: lld-link /call-graph-profile-sort:no /subsystem:console /entry:A %t /out:%t3 /debug:symtab
 # RUN: llvm-nm --numeric-sort %t3 | FileCheck %s --check-prefix=NO-CG
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: lld-link /subsystem:console /entry:A %t /out:%t4 /debug:symtab /call-graph-ordering-file:%t.call_graph
+# RUN: llvm-nm --numeric-sort %t4 | FileCheck %s --check-prefix=CG-OBJ-OF
 
     .section    .text,"ax", one_only, D
 D:
@@ -33,13 +36,17 @@ Aa:
     .cg_profile B, C, 30
     .cg_profile C, D, 90
 
-# CHECK: 140001000 T A
-# CHECK: 140001001 T B
-# CHECK: 140001002 T C
-# CHECK: 140001003 t D
-
+# CG-OBJ: 140001000 T A
+# CG-OBJ: 140001001 T B
+# CG-OBJ: 140001002 T C
+# CG-OBJ: 140001003 t D
 
 # NO-CG: 140001000 t D
 # NO-CG: 140001001 T C
 # NO-CG: 140001002 T B
 # NO-CG: 140001003 T A
+
+# CG-OBJ-OF: 140001000 t D
+# CG-OBJ-OF: 140001001 T A
+# CG-OBJ-OF: 140001004 T C
+# CG-OBJ-OF: 140001005 T B
diff --git a/lld/test/ELF/cgprofile-obj.s b/lld/test/ELF/cgprofile-obj.s
index 0848adc5e4279a..14016658707af4 100644
--- a/lld/test/ELF/cgprofile-obj.s
+++ b/lld/test/ELF/cgprofile-obj.s
@@ -2,12 +2,15 @@
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld -e A %t.o -o %t
-# RUN: llvm-nm --no-sort %t | FileCheck %s
+# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=CG-OBJ
 # RUN: ld.lld --call-graph-profile-sort=none -e A %t.o -o %t
 # RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=NO-CG
 ## --no-call-graph-profile-sort is an alias for --call-graph-profile-sort=none.
 # RUN: ld.lld --no-call-graph-profile-sort -e A %t.o -o %t1
 # RUN: cmp %t %t1
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: ld.lld -e A %t.o -call-graph-ordering-file=%t.call_graph -o %t2
+# RUN: llvm-nm --no-sort %t2 | FileCheck %s --check-prefix=CG-OBJ-OF
 
     .section    .text.D,"ax",@progbits
 D:
@@ -36,12 +39,17 @@ Aa:
     .cg_profile B, C, 30
     .cg_profile C, D, 90
 
-# CHECK: 0000000000201123 t D
-# CHECK: 0000000000201122 T C
-# CHECK: 0000000000201121 T B
-# CHECK: 0000000000201120 T A
+# CG-OBJ: 0000000000201123 t D
+# CG-OBJ: 0000000000201122 T C
+# CG-OBJ: 0000000000201121 T B
+# CG-OBJ: 0000000000201120 T A
 
 # NO-CG: 0000000000201120 t D
 # NO-CG: 0000000000201121 T C
 # NO-CG: 0000000000201122 T B
 # NO-CG: 0000000000201123 T A
+
+# CG-OBJ-OF: 0000000000201120 t D
+# CG-OBJ-OF: 0000000000201124 T C
+# CG-OBJ-OF: 0000000000201125 T B
+# CG-OBJ-OF: 0000000000201121 T A

@llvmbot
Copy link
Member

llvmbot commented Dec 30, 2024

@llvm/pr-subscribers-lld

Author: Haohai Wen (HaohaiWen)

Changes

cg_profile in object is from CGProfilePass and it is often inaccurate.
While call-graph-ordering-file is provided by user. It is weird to
aggregate them together especially when call-graph-ordering-file is
accurate enough.


Full diff: https://github.com/llvm/llvm-project/pull/121325.diff

4 Files Affected:

  • (modified) lld/COFF/Driver.cpp (+3-3)
  • (modified) lld/ELF/Driver.cpp (+3-2)
  • (modified) lld/test/COFF/cgprofile-obj.s (+13-6)
  • (modified) lld/test/ELF/cgprofile-obj.s (+13-5)
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index be01ee41c9a2f2..4b9c189e21510d 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -2812,10 +2812,10 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
   // Handle /call-graph-ordering-file and /call-graph-profile-sort (default on).
   if (config->callGraphProfileSort) {
     llvm::TimeTraceScope timeScope("Call graph");
-    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
+    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
       parseCallGraphFile(arg->getValue());
-    }
-    readCallGraphsFromObjectFiles(ctx);
+    else
+      readCallGraphsFromObjectFiles(ctx);
   }
 
   // Handle /print-symbol-order.
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index f573a8d3e19f3b..e8e99fa874b5d5 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -3215,11 +3215,12 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &args) {
 
   // Read the callgraph now that we know what was gced or icfed
   if (ctx.arg.callGraphProfileSort != CGProfileSortKind::None) {
-    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file))
+    if (auto *arg = args.getLastArg(OPT_call_graph_ordering_file)) {
       if (std::optional<MemoryBufferRef> buffer =
               readFile(ctx, arg->getValue()))
         readCallGraph(ctx, *buffer);
-    readCallGraphsFromObjectFiles<ELFT>(ctx);
+    } else
+      readCallGraphsFromObjectFiles<ELFT>(ctx);
   }
 
   // Write the result to the file.
diff --git a/lld/test/COFF/cgprofile-obj.s b/lld/test/COFF/cgprofile-obj.s
index b267850c46382c..da8011724d9859 100644
--- a/lld/test/COFF/cgprofile-obj.s
+++ b/lld/test/COFF/cgprofile-obj.s
@@ -2,9 +2,12 @@
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-win32 %s -o %t
 # RUN: lld-link /subsystem:console /entry:A %t /out:%t2 /debug:symtab
-# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s
+# RUN: llvm-nm --numeric-sort %t2 | FileCheck %s --check-prefix=CG-OBJ
 # RUN: lld-link /call-graph-profile-sort:no /subsystem:console /entry:A %t /out:%t3 /debug:symtab
 # RUN: llvm-nm --numeric-sort %t3 | FileCheck %s --check-prefix=NO-CG
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: lld-link /subsystem:console /entry:A %t /out:%t4 /debug:symtab /call-graph-ordering-file:%t.call_graph
+# RUN: llvm-nm --numeric-sort %t4 | FileCheck %s --check-prefix=CG-OBJ-OF
 
     .section    .text,"ax", one_only, D
 D:
@@ -33,13 +36,17 @@ Aa:
     .cg_profile B, C, 30
     .cg_profile C, D, 90
 
-# CHECK: 140001000 T A
-# CHECK: 140001001 T B
-# CHECK: 140001002 T C
-# CHECK: 140001003 t D
-
+# CG-OBJ: 140001000 T A
+# CG-OBJ: 140001001 T B
+# CG-OBJ: 140001002 T C
+# CG-OBJ: 140001003 t D
 
 # NO-CG: 140001000 t D
 # NO-CG: 140001001 T C
 # NO-CG: 140001002 T B
 # NO-CG: 140001003 T A
+
+# CG-OBJ-OF: 140001000 t D
+# CG-OBJ-OF: 140001001 T A
+# CG-OBJ-OF: 140001004 T C
+# CG-OBJ-OF: 140001005 T B
diff --git a/lld/test/ELF/cgprofile-obj.s b/lld/test/ELF/cgprofile-obj.s
index 0848adc5e4279a..14016658707af4 100644
--- a/lld/test/ELF/cgprofile-obj.s
+++ b/lld/test/ELF/cgprofile-obj.s
@@ -2,12 +2,15 @@
 
 # RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux %s -o %t.o
 # RUN: ld.lld -e A %t.o -o %t
-# RUN: llvm-nm --no-sort %t | FileCheck %s
+# RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=CG-OBJ
 # RUN: ld.lld --call-graph-profile-sort=none -e A %t.o -o %t
 # RUN: llvm-nm --no-sort %t | FileCheck %s --check-prefix=NO-CG
 ## --no-call-graph-profile-sort is an alias for --call-graph-profile-sort=none.
 # RUN: ld.lld --no-call-graph-profile-sort -e A %t.o -o %t1
 # RUN: cmp %t %t1
+# RUN: echo "D A 200" > %t.call_graph
+# RUN: ld.lld -e A %t.o -call-graph-ordering-file=%t.call_graph -o %t2
+# RUN: llvm-nm --no-sort %t2 | FileCheck %s --check-prefix=CG-OBJ-OF
 
     .section    .text.D,"ax",@progbits
 D:
@@ -36,12 +39,17 @@ Aa:
     .cg_profile B, C, 30
     .cg_profile C, D, 90
 
-# CHECK: 0000000000201123 t D
-# CHECK: 0000000000201122 T C
-# CHECK: 0000000000201121 T B
-# CHECK: 0000000000201120 T A
+# CG-OBJ: 0000000000201123 t D
+# CG-OBJ: 0000000000201122 T C
+# CG-OBJ: 0000000000201121 T B
+# CG-OBJ: 0000000000201120 T A
 
 # NO-CG: 0000000000201120 t D
 # NO-CG: 0000000000201121 T C
 # NO-CG: 0000000000201122 T B
 # NO-CG: 0000000000201123 T A
+
+# CG-OBJ-OF: 0000000000201120 t D
+# CG-OBJ-OF: 0000000000201124 T C
+# CG-OBJ-OF: 0000000000201125 T B
+# CG-OBJ-OF: 0000000000201121 T A

Copy link
Member

@MaskRay MaskRay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I believe that --call-graph-ordering-file is under-utilized and the interaction wasn't well studied. Have you tried some examples where not aggregation provides larger performance win?

@HaohaiWen
Copy link
Contributor Author

HaohaiWen commented Jan 5, 2025

Makes sense. I believe that --call-graph-ordering-file is under-utilized and the interaction wasn't well studied. Have you tried some examples where not aggregation provides larger performance win?

I was trying to feed precise call graph profile to lld and found the reorder behavior was not expected due to this issue. Regarding the performance, I didn't see significant difference, but the averaged branch distance (aggregation of LBR address difference / number of LBR) decreased with this patch.

@HaohaiWen HaohaiWen merged commit 2d9d291 into llvm:main Jan 5, 2025
8 checks passed
@HaohaiWen HaohaiWen deleted the lld branch January 5, 2025 02:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants