-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[BOLT] Match functions with name similarity #95884
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BOLT] Match functions with name similarity #95884
Conversation
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
@llvm/pr-subscribers-bolt Author: shaw young (shawbyoung) ChangesMatching functions based on edit Test Plan: tbd Full diff: https://github.com/llvm/llvm-project/pull/95884.diff 1 Files Affected:
diff --git a/bolt/lib/Profile/YAMLProfileReader.cpp b/bolt/lib/Profile/YAMLProfileReader.cpp
index f25f59201f1cd..a4f8ba9a440b6 100644
--- a/bolt/lib/Profile/YAMLProfileReader.cpp
+++ b/bolt/lib/Profile/YAMLProfileReader.cpp
@@ -13,6 +13,7 @@
#include "bolt/Profile/ProfileYAMLMapping.h"
#include "bolt/Utils/Utils.h"
#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/edit_distance.h"
#include "llvm/Support/CommandLine.h"
using namespace llvm;
@@ -23,6 +24,10 @@ extern cl::opt<unsigned> Verbosity;
extern cl::OptionCategory BoltOptCategory;
extern cl::opt<bool> InferStaleProfile;
+cl::opt<unsigned> NameSimilarityFunctionMatchingThreshold(
+ "name-similarity-function-matching-threshold", cl::desc("edit distance."),
+ cl::init(0), cl::Hidden, cl::cat(BoltOptCategory));
+
static llvm::cl::opt<bool>
IgnoreHash("profile-ignore-hash",
cl::desc("ignore hash while reading function profile"),
@@ -415,6 +420,30 @@ Error YAMLProfileReader::readProfile(BinaryContext &BC) {
if (!YamlBF.Used && BF && !ProfiledFunctions.count(BF))
matchProfileToFunction(YamlBF, *BF);
+ // Uses name similarity to match functions that were not matched by name.
+ for (auto YamlBF : YamlBP.Functions) {
+ if (YamlBF.Used)
+ continue;
+
+ unsigned MinEditDistance = UINT_MAX;
+ BinaryFunction *ClosestNameBF = nullptr;
+
+ for (auto &[_, BF] : BC.getBinaryFunctions()) {
+ if (ProfiledFunctions.count(&BF))
+ continue;
+
+ unsigned BFEditDistance = BF.getOneName().edit_distance(YamlBF.Name);
+ if (BFEditDistance < MinEditDistance) {
+ MinEditDistance = BFEditDistance;
+ ClosestNameBF = &BF;
+ }
+ }
+
+ if (ClosestNameBF &&
+ MinEditDistance < opts::NameSimilarityFunctionMatchingThreshold)
+ matchProfileToFunction(YamlBF, *ClosestNameBF);
+ }
+
for (yaml::bolt::BinaryFunctionProfile &YamlBF : YamlBP.Functions)
if (!YamlBF.Used && opts::Verbosity >= 1)
errs() << "BOLT-WARNING: profile ignored for function " << YamlBF.Name
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please also note the runtime of the namespace-filtered pairwise checks? This would guide us in whether to add a BB count filtering.
Created using spr 1.3.4
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
Ran BOLT on a large binary with 4917369 functions with name similarity function matching - total execution time of rewrite passes was 2235.2883 seconds (1989.1823 wall clock) with an edit distance threshold of 20 and 2194.1708 seconds (1986.2007 wall clock) with an edit distance threshold of 10. Without name similarity matching, total execution time of rewrite passes was 520.4121 seconds (296.4178 wall clock). |
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for checking the runtime, and it's quite high. We'll need extra filtering by block count to keep runtime low - please add it as we discussed.
Functions are filtered by block size here @aaupov |
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refactor new code into a separate function. Add a comment on how the matching is done such that the interface can be understood without reading the code.
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG in general with some comments
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG with a couple of nits.
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4 [skip ci]
Created using spr 1.3.4
Created using spr 1.3.4 [skip ci]
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/151/builds/775 Here is the relevant piece of the build log for the reference:
|
A mapping - from namespace to associated binary functions - is used to match function profiles to binary based on the '--name-similarity-function-matching-threshold' flag set edit distance threshold. The flag is set to 0 (exact name matching) by default as it is expensive, requiring the processing of all BFs. Test Plan: Added name-similarity-function-matching.test. On a binary with 5M functions, rewrite passes took ~520s without the flag and ~2018s with the flag set to 20.
A mapping - from namespace to associated binary functions - is used to
match function profiles to binary based on the
'--name-similarity-function-matching-threshold' flag set edit distance
threshold. The flag is set to 0 (exact name matching) by default as it is
expensive, requiring the processing of all BFs.
Test Plan: Added name-similarity-function-matching.test. On a binary
with 5M functions, rewrite passes took ~520s without the flag and
~2018s with the flag set to 20.