-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[InstrPGO] Support cold function coverage instrumentation #109837
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 2 commits
07a2cab
b3054f0
5a2848c
f07457f
66f837a
b1f0105
3c4e85b
d54fbe0
5e0d106
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,2 @@ | ||
foo:1:1 | ||
1: 1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
// Test -fprofile-generate-cold-function-coverage | ||
// RUN: %clang -O2 -fprofile-generate-cold-function-coverage=/xxx/yyy/ -fprofile-sample-accurate -fprofile-sample-use=%S/Inputs/pgo-cold-func.prof -S -emit-llvm -o - %s | FileCheck %s | ||
wlei-llvm marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
// CHECK: @__llvm_profile_filename = {{.*}} c"/xxx/yyy/default_%m.profraw\00" | ||
|
||
// CHECK: @__profc_bar | ||
// CHECK-NOT: @__profc_foo | ||
|
||
int bar(int x) { return x;} | ||
int foo(int x) { | ||
return x; | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -296,7 +296,13 @@ static cl::opt<bool> UseLoopVersioningLICM( | |
"enable-loop-versioning-licm", cl::init(false), cl::Hidden, | ||
cl::desc("Enable the experimental Loop Versioning LICM pass")); | ||
|
||
static cl::opt<std::string> InstrumentSampleColdFuncPath( | ||
"instrument-sample-cold-function-path", cl::init(""), | ||
cl::desc("File path for instrumenting sampling PGO guided cold functions"), | ||
cl::Hidden); | ||
|
||
extern cl::opt<std::string> UseCtxProfile; | ||
extern cl::opt<bool> InstrumentColdFunctionCoverage; | ||
|
||
namespace llvm { | ||
extern cl::opt<bool> EnableMemProfContextDisambiguation; | ||
|
@@ -1119,6 +1125,18 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level, | |
// removed. | ||
MPM.addPass( | ||
PGOIndirectCallPromotion(true /* IsInLTO */, true /* SamplePGO */)); | ||
|
||
if (InstrumentSampleColdFuncPath.getNumOccurrences() && | ||
Phase != ThinOrFullLTOPhase::ThinLTOPostLink) { | ||
assert(!InstrumentSampleColdFuncPath.empty() && | ||
"File path is requeired for instrumentation generation"); | ||
InstrumentColdFunctionCoverage = true; | ||
addPreInlinerPasses(MPM, Level, Phase); | ||
addPGOInstrPasses(MPM, Level, /* RunProfileGen */ true, | ||
/* IsCS */ false, /* AtomicCounterUpdate */ false, | ||
InstrumentSampleColdFuncPath, "", | ||
IntrusiveRefCntPtr<vfs::FileSystem>()); | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. After looking at this a bit closer, I'm not sure why it needs to be tied to closely to SamplePGO. Couldn't we move this out of the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ah, good point, moved them closer to other IRPGO passes. Thanks for the suggestion! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks! Now I think this should be independent of SamplePGO. Can we remove "sample" from the flag and variable names? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. do you mean the
I also added an assertion:
Seems I have to remove that if we want it to be general. (just curious) do you plan to extend it for your case? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't see how there would be any functional difference between using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I see, sounds good. My previous thought is: there could be functional difference for using the So user needs to be aware of the combination use of those flags, I was then trying to prevent the misuse, to disallow the use of But on a second thought, if we want to extend for other PGO options, say if the profile annotation and instrumentation are not in the same pipeline, it's probably hard to check this locally, we can leave it to the build system. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I wasn't aware that SamplePGO did this. This is indeed different than IRPGO. Perhaps it makes more sense to use a separate flag to control this case. Maybe something like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good point, that's more flexible, thanks! |
||
} | ||
|
||
// Try to perform OpenMP specific optimizations on the module. This is a | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
; RUN: opt < %s --passes=pgo-instr-gen -instrument-cold-function-coverage -S | FileCheck --check-prefixes=COLD %s | ||
; RUN: opt < %s --passes=pgo-instr-gen -instrument-cold-function-coverage -cold-function-coverage-max-entry-count=1 -S | FileCheck --check-prefixes=ENTRY-COUNT %s | ||
|
||
; COLD: call void @llvm.instrprof.increment(ptr @__profn_foo, i64 [[#]], i32 1, i32 0) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you plan to support There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good!(it should still work without |
||
; COLD-NOT: __profn_main | ||
|
||
; ENTRY-COUNT: call void @llvm.instrprof.increment(ptr @__profn_foo, i64 [[#]], i32 1, i32 0) | ||
; ENTRY-COUNT: call void @llvm.instrprof.increment(ptr @__profn_main, i64 [[#]], i32 1, i32 0) | ||
|
||
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128" | ||
target triple = "x86_64-unknown-linux-gnu" | ||
|
||
define void @foo() !prof !0 { | ||
entry: | ||
ret void | ||
} | ||
|
||
define i32 @main() !prof !1 { | ||
entry: | ||
ret i32 0 | ||
} | ||
|
||
!0 = !{!"function_entry_count", i64 0} | ||
!1 = !{!"function_entry_count", i64 1} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it necessary to expose this through the clang frontend?
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry if my PR description is not clear. Note that there is no use for
-fprofile-generate
and-fprofile-instr-generate
here, so a driver flag here is to configure the instr file path and make linker to link the compiler_rt.profile object files (just similar to-fprofile-[instr]-generate=
).The reason for not using
-fprofile-[instr]-generate
is because those flags conflict with-fprofile-sample-use
, see PGOOptions,ProfileFile
is a shared file path which means the two flags are actually mutually exclusive.Another option is to make
-fprofile-generate
compatible with-fprofile-samle-use
(like refactoring PGOOptions or adding another flag to configure the file path things), this seems to me they are more complicated than the current one. But I’m open to any suggestions on this.Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant, why not just use
clang ... -mllvm -instrument-cold-function-coverage
? Is this a clang - only feature (i.e. rust can't use it?) Is it just for symmetry with the current PGO flags?(This is not a pushback, mainly curious. Also the patch would be quite smaller if you didn't need to pass through the flags from clang to llvm)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for the suggestion! We also need to link runtime lib(
compiler_rt.profile.a
)but yeah, I agree it's possible to pass directly to llvm and linker without through clang. Then the full command line would be like
So yes, adding the clang driver flag is kind of to mirror current IRPGO flags, for easy maintenance purpose(given that
-fprofile-generate
doesn't work with-fprofile-sample-use
) and also to centralize the configuration for the convenience. IMO, thecompiler_rt
is probably the main blocker here, I didn't find an easy way to bundle it with a llvm flag.Appreciate any further suggestions!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would
-Wl,-lclang_rt.profile
work?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I think a frontend flag is useful. It allows us to identify incompatibilities early and provide clear error messages instead of more obscure failures down the line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would incompatibilities and ability to provide useful error messages be dependent on the flag being in the frontend? Is it that the backend can't actually reliably point the finger at the specific flag causing the conflict, so a message would be diluted to "sampling won't work with this"?
(probably a subject for a different forum tho)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, I think it should work, except it requires another linker flag: the
--u__llvm_runtime_variable
, we can configure it to linker too, but basically those instr PGO flags control addProfileRTLibs (which seems not equivalent to-Wl,-lclang_rt.profile
), I just wanted to mirror those flags so that we don't need to maintain if anything changes toaddProfileRTLibs
in the future. (Though I feel this code should be very stable, should not be a big problem, so mainly for the convenience for compiler user to use one flag instead of using/maintaining multiple flags )Overall, I agree that it's feasible to do it without clang flag. I'm not very familiar with the convention for adding driver flags, so if you think this doesn't justify it, I will drop it from this patch. Thanks for the discussion!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we know that this instrumentation mode should not be mixed with e.g. sanitizers or something else we can enforce these checks early. I don't see a particular downside to adding a frontend flag. The convenience of bundling the 2 lld flags and 3 mllvm flags into a single frontend flag seems like a good enough motivation to do so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on single driver flag for convenience.