-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[ThinLTO][TypeProf] Import local-linkage global var for mod1:func_foo-> mod2:local-var edge #100448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-transforms @llvm/pr-subscribers-lto Author: Mingming Liu (minglotus-6) ChangesGiven a ref chain This patch introduces an option for compiler user. If compiler user can guarantee that all files are compiled with full paths, they can set this option on to get Test:
Full diff: https://github.com/llvm/llvm-project/pull/100448.diff 2 Files Affected:
diff --git a/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp b/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp
index 411640fc5176d..4ef7b2da2f572 100644
--- a/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp
+++ b/compiler-rt/test/profile/Linux/instrprof-vtable-value-prof.cpp
@@ -1,3 +1,4 @@
+
// REQUIRES: lld, lld-available
// Building the instrumented binary will fail because lld doesn't support
@@ -5,7 +6,7 @@
// ld.lld: error: /lib/../lib64/Scrt1.o: ABI version 1 is not supported
// UNSUPPORTED: ppc && host-byteorder-big-endian
-// RUN: rm -rf %t && mkdir %t && cd %t
+// RUN: rm -rf %t && split-file %s %t && cd %t
// RUN: %clangxx_pgogen -fuse-ld=lld -O2 -fprofile-generate=. -mllvm -enable-vtable-value-profiling %s -o test
// RUN: env LLVM_PROFILE_FILE=test.profraw ./test
@@ -141,18 +142,19 @@
// RUN: -g -flto=thin -fwhole-program-vtables -fno-split-lto-unit -O2 \
// RUN: -mllvm -enable-vtable-value-profiling -Wl,-mllvm,-enable-vtable-value-profiling \
// RUN: -mllvm -enable-vtable-profile-use \
+// RUN: -Wl,-plugin-opt,-import-assume-unique-local \
// RUN: -Wl,-mllvm,-enable-vtable-profile-use -Rpass=pgo-icall-prom \
// RUN: -Wl,-mllvm,-print-after=pgo-icall-prom \
// RUN: -Wl,-mllvm,-filter-print-funcs=main %s 2>&1 \
// RUN: | FileCheck %s --check-prefixes=REMARK,IR --implicit-check-not="!VP"
// For the indirect call site `ptr->func`
-// REMARK: instrprof-vtable-value-prof.cpp:226:19: Promote indirect call to _ZN12_GLOBAL__N_18Derived24funcEii with count 150 out of 200, sink 1 instruction(s) and compare 1 vtable(s): {_ZTVN12_GLOBAL__N_18Derived2E}
-// REMARK: instrprof-vtable-value-prof.cpp:226:19: Promote indirect call to _ZN8Derived14funcEii with count 50 out of 50, sink 1 instruction(s) and compare 1 vtable(s): {_ZTV8Derived1}
+// REMARK: instrprof-vtable-value-prof.cpp:240:19: Promote indirect call to _ZN12_GLOBAL__N_18Derived24funcEii with count 150 out of 200, sink 1 instruction(s) and compare 1 vtable(s): {_ZTVN12_GLOBAL__N_18Derived2E}
+// REMARK: instrprof-vtable-value-prof.cpp:240:19: Promote indirect call to _ZN8Derived14funcEii with count 50 out of 50, sink 1 instruction(s) and compare 1 vtable(s): {_ZTV8Derived1}
//
// For the indirect call site `delete ptr`
-// REMARK: instrprof-vtable-value-prof.cpp:228:5: Promote indirect call to _ZN12_GLOBAL__N_18Derived2D0Ev with count 750 out of 1000, sink 2 instruction(s) and compare 1 vtable(s): {_ZTVN12_GLOBAL__N_18Derived2E}
-// REMARK: instrprof-vtable-value-prof.cpp:228:5: Promote indirect call to _ZN8Derived1D0Ev with count 250 out of 250, sink 2 instruction(s) and compare 1 vtable(s): {_ZTV8Derived1}
+// REMARK: instrprof-vtable-value-prof.cpp:242:5: Promote indirect call to _ZN12_GLOBAL__N_18Derived2D0Ev with count 750 out of 1000, sink 2 instruction(s) and compare 1 vtable(s): {_ZTVN12_GLOBAL__N_18Derived2E}
+// REMARK: instrprof-vtable-value-prof.cpp:242:5: Promote indirect call to _ZN8Derived1D0Ev with count 250 out of 250, sink 2 instruction(s) and compare 1 vtable(s): {_ZTV8Derived1}
// The IR matchers for indirect callsite `ptr->func`.
// IR-LABEL: @main
@@ -185,6 +187,7 @@
// IR: [[MERGE0]]:
// IR: [[RES2:%.*]] = phi i32 [ [[RES1]], %[[MERGE1]] ], [ [[RESBB1]], %[[BB1]] ]
+//--- lib.h
#include <stdio.h>
#include <stdlib.h>
class Base {
@@ -193,21 +196,31 @@ class Base {
virtual ~Base() {};
};
+
class Derived1 : public Base {
public:
- int func(int a, int b) override { return a * b; }
+ int func(int a, int b) override;
~Derived1() {}
};
namespace {
class Derived2 : public Base {
public:
- int func(int a, int b) override { return a * (a - b); }
+ int func(int a, int b) override;
~Derived2() {}
};
} // namespace
-__attribute__((noinline)) Base *createType(int a) {
+
+__attribute__((noinline)) Base *createType(int a);
+
+//-- lib.cpp
+
+int Derived1::func(int a, int b) { return a * b; }
+
+int Derived2::func(int a, int b) { return a * (a - b); }
+
+Base *createType(int a) {
Base *base = nullptr;
if (a % 4 == 0)
base = new Derived1();
@@ -216,6 +229,7 @@ __attribute__((noinline)) Base *createType(int a) {
return base;
}
+//--- main.cpp
int main(int argc, char **argv) {
int sum = 0;
for (int i = 0; i < 1000; i++) {
diff --git a/llvm/lib/Transforms/IPO/FunctionImport.cpp b/llvm/lib/Transforms/IPO/FunctionImport.cpp
index 038785114a0cf..839c09fa9d586 100644
--- a/llvm/lib/Transforms/IPO/FunctionImport.cpp
+++ b/llvm/lib/Transforms/IPO/FunctionImport.cpp
@@ -174,6 +174,18 @@ static cl::opt<std::string> WorkloadDefinitions(
"}"),
cl::Hidden);
+static cl::opt<bool> ImportAssumeUniqueLocal(
+ "import-assume-unique-local", cl::init(false),
+ cl::desc(
+ "By default, a local-linkage global variable won't be imported in the "
+ "edge mod1:func -> mod2:func_bar -> mod2:local-var since compiler "
+ "cannot assume mod2 is compiled with full path or local-var has a "
+ "unique GUID. "
+ "Set this option to true will help cross-module import of such "
+ "variables. But it is only safe if the compiler user specify the full "
+ "module path."),
+ cl::Hidden);
+
namespace llvm {
extern cl::opt<bool> EnableMemProfContextDisambiguation;
}
@@ -367,6 +379,8 @@ class GlobalsImporter final {
// was not distinguishing path.
auto LocalNotInModule =
[&](const GlobalValueSummary *RefSummary) -> bool {
+ if (ImportAssumeUniqueLocal)
+ return false;
return GlobalValue::isLocalLinkage(RefSummary->linkage()) &&
RefSummary->modulePath() != Summary.modulePath();
};
|
static cl::opt<bool> ImportAssumeUniqueLocal( | ||
"import-assume-unique-local", cl::init(false), | ||
cl::desc( | ||
"By default, a local-linkage global variable won't be imported in the " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can the local var be promoted to global to enable importing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As shown in the updated test, with this option on, when the local-var gets imported, it will have a global name <vtable>.llvm.<mod-hash>
.
Mentioned offline to Mingming, but this code is looking at the module of the immediate referencing function, so that would be mod2. Mingming will dig a bit to see why this isn't working. |
I think you are correct that Here is why import of
The diff in [1] gives this debugging log
[1]
|
Note the local-linkage indirect callee get imported without turning on the option in this patch. Teresa pointed me to
I'll update the PR to make the indirect references (vtables or functions) handling consistent. |
This is implemented. PTAL, thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with a couple of small suggestions below. Before you submit please update the patch description to fix the example and description about it needing to be the "same as the function that starts the import" and to instead describe the value profile case.
@@ -196,6 +207,23 @@ static std::unique_ptr<Module> loadFile(const std::string &FileName, | |||
return Result; | |||
} | |||
|
|||
static bool shouldSkipLocalInAnotherModule(const GlobalVarSummary *RefSummary, | |||
size_t NumRefs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the number of defs, not refs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
size_t NumRefs, | ||
StringRef ImporterModule) { | ||
// We can import a local from another module if all inputs are compiled | ||
// with full paths or when there is one entry in the list. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"when there is one definition"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
I updated the description to mention value profiles creates the reference edges, and that vars with one definition gets imported |
Hi @minglotus-6, your change seems to be causing around 11 test failures on at least the following bots:
Can you take a look and revert if you need time to investigate? |
Hi! Sorry for the breakage. #100508 should fix the test failure (running |
…-> mod2:local-var edge (#100448) Summary: VTable value profiling can create reference edges from `mod1:func_foo` to `mod2:local-vtable`. Indirect call profiling can create reference edges from `mod1:func_foo` to `mod2:local_func_bar`. Given a ref chain `mod1:func_foo -> mod2:local-var`,`local-var` doesn't get imported by default. Compiler checks / requires the module of 'local-var' is the same as the function that referenced it(`mod1:func_foo`). This is to prevent mis-compilation when both `mod1` and `mod2` has `local-var` of the same name, and cpp files are compiled without full path. This patch allows the import when one of the following conditions happen: 1) Introduce an option `import-assume-local-unique`. When the compiler user can guarantee that all files are compiled with full paths, they can set this option. 2) When there is one instance of value summary. Test: * A/B testing this option alone gives -0.16% statistically consistent cpu cycle reduction on one search workload (no throughput increase) * Testing it together with existing more-efficient ICP bumps the throughput increase by a margin (0.05%~0.1%) * No regressions observed. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250563
"edge mod1:func -> mod2:local-var (from value profiles) since compiler " | ||
"cannot assume mod2 is compiled with full path which gives local-var a " | ||
"program-wide unique GUID. Set this option to true will help cross " | ||
"module import of such variables. This is only safe if the compiler " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to learn more about this space. Can you elaborate on "This is only safe if the compiler user specify the full module path." How does user specify it? I looked at the test and it's not clear to me from it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does user specify it?
In a monolithic repository (https://research.google/pubs/why-google-stores-billions-of-lines-of-code-in-a-single-repository/) of source code, user specifies the full path in the compile options. In practice, bazel (https://bazel.build/) configures the build rules and guarantees full path is used. Users invoke bazel to compile an executable.
Using the folder structure below as an example [1], clang -O2 <other-options> /tmp/src1/a.cpp /tmp/src1/b.cpp /tmp/src2/a.cpp /tmp/src2/b.cpp /tmp/src2/c.cpp
gives the full path, and cd /tmp/src1; clang -O2 a.cpp b.cpp <other arguments to specify paths in src2>
doesn't.
I looked at the test and it's not clear to me from it.
The purpose of the test is to illustrate and provide coverage for profile-gen and its use, so I saved the extra work to use full path there.
[1]
/tmp/
|── src1/
│ ├── a.cpp
│ ├── b.cpp
├── src2/
│ ├── a.cpp
│ ├── b.cpp
│ └──c.cpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test program itself is correct as there are only two files and they don't have conflict names.
I'm open to update it to use the full path if that is preferred or better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, I double checked whether compiler correctness can rely on the fact that full file paths are used. My takeaway is that bazel doesn't fail the executable compilation if local paths are used (if any in a unconventional way) so I don't plan to turn on import-assume-unique-local
.
Yet still NumDefs == 1
condition (https://github.com/llvm/llvm-project/pull/100448/files#diff-e7cb370fee46f0f773f2b5429dfab36b75126d3909ae98ee87ff3d0e3f75c6e9R215) should hold true for many internal-linkage vtables as long as full paths are indeed used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha, thx for explanation.
While manual compiles can specify full file paths and build automation tools use full, unique paths in practice, it's not clear whether it's a general good practice to enforce full paths (fail a build if relative paths are used). `NumDefs == 1` condition [1] should hold true for many internal-linkage vtables as long as full paths are indeed used to salvage the marginal performance when local-linkage vtables are imported due to indirect reference. #100448 (comment) has more details. [1] https://github.com/llvm/llvm-project/pull/100448/files#diff-e7cb370fee46f0f773f2b5429dfab36b75126d3909ae98ee87ff3d0e3f75c6e9R215
VTable value profiling can create reference edges from
mod1:func_foo
tomod2:local-vtable
. Indirect call profiling can create reference edges frommod1:func_foo
tomod2:local_func_bar
.Given a ref chain
mod1:func_foo -> mod2:local-var
,local-var
doesn't get imported by default.Compiler checks / requires the module of 'local-var' is the same as the function that referenced it(
mod1:func_foo
). This is to prevent mis-compilation when bothmod1
andmod2
haslocal-var
of the same name, and cpp files are compiled without full path.This patch allows the import when one of the following conditions happen:
import-assume-local-unique
. When the compiler user can guarantee that all files are compiled with full paths, they can set this option.Test: