-
Notifications
You must be signed in to change notification settings - Fork 13.5k
Reapply [BOLT] DataAggregator support for binaries with multiple text segments #118023
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reapply [BOLT] DataAggregator support for binaries with multiple text segments #118023
Conversation
When a binary has multiple text segments, the Size is computed as the difference of the last address of these segments from the BaseAddress. The base addresses of all text segments must be the same. Background: Larger binaries get two text segments mapped when loaded in memory. BOLT processes only the first, which is not having a correct BaseAddress, causing a wrong computation of a BinaryMMapInfo's size. Consequently, BOLT wrongly thinks that many of the samples fall outside the binary and ignores them. As a result, when used in heatmaps the output excludes all those entries and the section hotness statistics are wrong. This bug is present in both the AArch64 and x86 backends. --- This patch introduces the flag 'perf-script-events' that allows passing perf events without BOLT having to parse them using 'perf script'. The flag is used to pass a mock perf profile that has two memory mappings for a mock binary that has two text segments. The size of the mapping is updated as `parseMMapEvents` now processes all text segments. --- Example used in unit tests: From `/proc/<BINARY PID>/maps`, we have 2 text mappings, say A and B. ``` abc0000000-abc1000000 r-xp 011c0000 103:01 1573523 BINARY abc2000000-abca000000 r-xp 031d0000 103:01 1573523 BINARY ``` Size of text mappings: | Mapping | Size | | ------- | ------ | | A | ~15MB | | B | ~135MB | --- Example on a real program: ``` 2f7200000-2fabca000 r--p 00000000 bolted-binary 2fabd9000-2fe47c000 r-xp 039c9000 bolted-binary <- 1st txt segment 2fe48b000-2fe61d000 r--p 0727b000 bolted-binary 2fe62c000-2fe660000 rw-p 0740c000 bolted-binary 2fe660000-2fea4c000 rw-p 00000000 2fec00000-303dad000 r-xp 07a00000 bolted-binary <- 2nd (appears only on the bolted binary) ```
- Added to CoreTests in BUILD.gn - Hiding DataAggregator std out/err outputs
Causes issues on the CI too.
@llvm/pr-subscribers-bolt Author: Paschalis Mpeis (paschalis-mpeis) ChangesWhen a binary has multiple text segments, the Size is computed as the Introduces flag 'perf-script-events' for testing, which allows passing perf events Full diff: https://github.com/llvm/llvm-project/pull/118023.diff 5 Files Affected:
diff --git a/bolt/include/bolt/Profile/DataAggregator.h b/bolt/include/bolt/Profile/DataAggregator.h
index 2880bfd03be789..320623cfa15af1 100644
--- a/bolt/include/bolt/Profile/DataAggregator.h
+++ b/bolt/include/bolt/Profile/DataAggregator.h
@@ -170,6 +170,9 @@ class DataAggregator : public DataReader {
std::string BuildIDBinaryName;
/// Memory map info for a single file as recorded in perf.data
+ /// When a binary has multiple text segments, the Size is computed as the
+ /// difference of the last address of these segments from the BaseAddress.
+ /// The base addresses of all text segments must be the same.
struct MMapInfo {
uint64_t BaseAddress{0}; /// Base address of the mapped binary.
uint64_t MMapAddress{0}; /// Address of the executable segment.
@@ -493,6 +496,11 @@ class DataAggregator : public DataReader {
/// and return a file name matching a given \p FileBuildID.
std::optional<StringRef> getFileNameForBuildID(StringRef FileBuildID);
+ /// Get a constant reference to the parsed binary mmap entries.
+ const std::unordered_map<uint64_t, MMapInfo> &getBinaryMMapInfo() {
+ return BinaryMMapInfo;
+ }
+
friend class YAMLProfileWriter;
};
} // namespace bolt
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 697cac9fbcaa08..2b02086e3e0c99 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -95,6 +95,12 @@ cl::opt<bool> ReadPreAggregated(
"pa", cl::desc("skip perf and read data from a pre-aggregated file format"),
cl::cat(AggregatorCategory));
+cl::opt<std::string>
+ ReadPerfEvents("perf-script-events",
+ cl::desc("skip perf event collection by supplying a "
+ "perf-script output in a textual format"),
+ cl::ReallyHidden, cl::init(""), cl::cat(AggregatorCategory));
+
static cl::opt<bool>
TimeAggregator("time-aggr",
cl::desc("time BOLT aggregator"),
@@ -167,8 +173,9 @@ void DataAggregator::findPerfExecutable() {
void DataAggregator::start() {
outs() << "PERF2BOLT: Starting data aggregation job for " << Filename << "\n";
- // Don't launch perf for pre-aggregated files
- if (opts::ReadPreAggregated)
+ // Don't launch perf for pre-aggregated files or when perf input is specified
+ // by the user.
+ if (opts::ReadPreAggregated || !opts::ReadPerfEvents.empty())
return;
findPerfExecutable();
@@ -464,6 +471,13 @@ void DataAggregator::filterBinaryMMapInfo() {
int DataAggregator::prepareToParse(StringRef Name, PerfProcessInfo &Process,
PerfProcessErrorCallbackTy Callback) {
+ if (!opts::ReadPerfEvents.empty()) {
+ outs() << "PERF2BOLT: using pre-processed perf events for '" << Name
+ << "' (perf-script-events)\n";
+ ParsingBuf = opts::ReadPerfEvents;
+ return 0;
+ }
+
std::string Error;
outs() << "PERF2BOLT: waiting for perf " << Name
<< " collection to finish...\n";
@@ -2056,15 +2070,6 @@ std::error_code DataAggregator::parseMMapEvents() {
if (FileMMapInfo.first == "(deleted)")
continue;
- // Consider only the first mapping of the file for any given PID
- auto Range = GlobalMMapInfo.equal_range(FileMMapInfo.first);
- bool PIDExists = llvm::any_of(make_range(Range), [&](const auto &MI) {
- return MI.second.PID == FileMMapInfo.second.PID;
- });
-
- if (PIDExists)
- continue;
-
GlobalMMapInfo.insert(FileMMapInfo);
}
@@ -2116,12 +2121,22 @@ std::error_code DataAggregator::parseMMapEvents() {
<< " using file offset 0x" << Twine::utohexstr(MMapInfo.Offset)
<< ". Ignoring profile data for this mapping\n";
continue;
- } else {
- MMapInfo.BaseAddress = *BaseAddress;
}
+ MMapInfo.BaseAddress = *BaseAddress;
}
- BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo));
+ // Try to add MMapInfo to the map and update its size. Large binaries may
+ // span to multiple text segments, so the mapping is inserted only on the
+ // first occurrence.
+ if (!BinaryMMapInfo.insert(std::make_pair(MMapInfo.PID, MMapInfo)).second)
+ assert(MMapInfo.BaseAddress == BinaryMMapInfo[MMapInfo.PID].BaseAddress &&
+ "Base address on multiple segment mappings should match");
+
+ // Update mapping size.
+ const uint64_t EndAddress = MMapInfo.MMapAddress + MMapInfo.Size;
+ const uint64_t Size = EndAddress - BinaryMMapInfo[MMapInfo.PID].BaseAddress;
+ if (Size > BinaryMMapInfo[MMapInfo.PID].Size)
+ BinaryMMapInfo[MMapInfo.PID].Size = Size;
}
if (BinaryMMapInfo.empty()) {
diff --git a/bolt/unittests/Core/CMakeLists.txt b/bolt/unittests/Core/CMakeLists.txt
index bad7108dad0b7b..208cf6ced73585 100644
--- a/bolt/unittests/Core/CMakeLists.txt
+++ b/bolt/unittests/Core/CMakeLists.txt
@@ -8,6 +8,7 @@ set(LLVM_LINK_COMPONENTS
add_bolt_unittest(CoreTests
BinaryContext.cpp
MCPlusBuilder.cpp
+ MemoryMaps.cpp
DynoStats.cpp
DISABLE_LLVM_LINK_LLVM_DYLIB
@@ -17,6 +18,8 @@ target_link_libraries(CoreTests
PRIVATE
LLVMBOLTCore
LLVMBOLTRewrite
+ LLVMBOLTProfile
+ LLVMTestingSupport
)
foreach (tgt ${BOLT_TARGETS_TO_BUILD})
diff --git a/bolt/unittests/Core/MemoryMaps.cpp b/bolt/unittests/Core/MemoryMaps.cpp
new file mode 100644
index 00000000000000..2efaffedb9082e
--- /dev/null
+++ b/bolt/unittests/Core/MemoryMaps.cpp
@@ -0,0 +1,141 @@
+//===- bolt/unittest/Core/MemoryMaps.cpp ----------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "bolt/Core/BinaryContext.h"
+#include "bolt/Profile/DataAggregator.h"
+#include "llvm/BinaryFormat/ELF.h"
+#include "llvm/DebugInfo/DWARF/DWARFContext.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/TargetSelect.h"
+#include "llvm/Testing/Support/Error.h"
+#include "gtest/gtest.h"
+
+using namespace llvm;
+using namespace llvm::object;
+using namespace llvm::ELF;
+using namespace bolt;
+
+namespace opts {
+extern cl::opt<std::string> ReadPerfEvents;
+} // namespace opts
+
+namespace {
+
+/// Perform checks on memory map events normally captured in perf. Tests use
+/// the 'opts::ReadPerfEvents' flag to emulate these events, passing a custom
+/// 'perf script' output to DataAggregator.
+struct MemoryMapsTester : public testing::TestWithParam<Triple::ArchType> {
+ void SetUp() override {
+ initalizeLLVM();
+ prepareElf();
+ initializeBOLT();
+ }
+
+protected:
+ void initalizeLLVM() {
+ llvm::InitializeAllTargetInfos();
+ llvm::InitializeAllTargetMCs();
+ llvm::InitializeAllAsmParsers();
+ llvm::InitializeAllDisassemblers();
+ llvm::InitializeAllTargets();
+ llvm::InitializeAllAsmPrinters();
+ }
+
+ void prepareElf() {
+ memcpy(ElfBuf, "\177ELF", 4);
+ ELF64LE::Ehdr *EHdr = reinterpret_cast<typename ELF64LE::Ehdr *>(ElfBuf);
+ EHdr->e_ident[llvm::ELF::EI_CLASS] = llvm::ELF::ELFCLASS64;
+ EHdr->e_ident[llvm::ELF::EI_DATA] = llvm::ELF::ELFDATA2LSB;
+ EHdr->e_machine = GetParam() == Triple::aarch64 ? EM_AARCH64 : EM_X86_64;
+ MemoryBufferRef Source(StringRef(ElfBuf, sizeof(ElfBuf)), "ELF");
+ ObjFile = cantFail(ObjectFile::createObjectFile(Source));
+ }
+
+ void initializeBOLT() {
+ Relocation::Arch = ObjFile->makeTriple().getArch();
+ BC = cantFail(BinaryContext::createBinaryContext(
+ ObjFile->makeTriple(), ObjFile->getFileName(), nullptr, true,
+ DWARFContext::create(*ObjFile.get()), {llvm::outs(), llvm::errs()}));
+ ASSERT_FALSE(!BC);
+ }
+
+ char ElfBuf[sizeof(typename ELF64LE::Ehdr)] = {};
+ std::unique_ptr<ObjectFile> ObjFile;
+ std::unique_ptr<BinaryContext> BC;
+};
+} // namespace
+
+#ifdef X86_AVAILABLE
+
+INSTANTIATE_TEST_SUITE_P(X86, MemoryMapsTester,
+ ::testing::Values(Triple::x86_64));
+
+#endif
+
+#ifdef AARCH64_AVAILABLE
+
+INSTANTIATE_TEST_SUITE_P(AArch64, MemoryMapsTester,
+ ::testing::Values(Triple::aarch64));
+
+#endif
+
+/// Check that the correct mmap size is computed when we have multiple text
+/// segment mappings.
+TEST_P(MemoryMapsTester, ParseMultipleSegments) {
+ const int Pid = 1234;
+ StringRef Filename = "BINARY";
+ opts::ReadPerfEvents = formatv(
+ "name 0 [000] 0.000000: PERF_RECORD_MMAP2 {0}/{0}: "
+ "[0xabc0000000(0x1000000) @ 0x11c0000 103:01 1573523 0]: r-xp {1}\n"
+ "name 0 [000] 0.000000: PERF_RECORD_MMAP2 {0}/{0}: "
+ "[0xabc2000000(0x8000000) @ 0x31d0000 103:01 1573523 0]: r-xp {1}\n",
+ Pid, Filename);
+
+ BC->SegmentMapInfo[0x11da000] =
+ SegmentInfo{0x11da000, 0x10da000, 0x11ca000, 0x10da000, 0x10000, true};
+ BC->SegmentMapInfo[0x31d0000] =
+ SegmentInfo{0x31d0000, 0x51ac82c, 0x31d0000, 0x3000000, 0x200000, true};
+
+ DataAggregator DA("");
+ BC->setFilename(Filename);
+ Error Err = DA.preprocessProfile(*BC);
+
+ // Ignore errors from perf2bolt when parsing memory events later on.
+ ASSERT_THAT_ERROR(std::move(Err), Succeeded());
+
+ auto &BinaryMMapInfo = DA.getBinaryMMapInfo();
+ auto El = BinaryMMapInfo.find(Pid);
+ // Check that memory mapping is present and has the expected size.
+ ASSERT_NE(El, BinaryMMapInfo.end());
+ ASSERT_EQ(El->second.Size, static_cast<uint64_t>(0xb1d0000));
+}
+
+/// Check that DataAggregator aborts when pre-processing an input binary
+/// with multiple text segments that have different base addresses.
+TEST_P(MemoryMapsTester, MultipleSegmentsMismatchedBaseAddress) {
+ const int Pid = 1234;
+ StringRef Filename = "BINARY";
+ opts::ReadPerfEvents = formatv(
+ "name 0 [000] 0.000000: PERF_RECORD_MMAP2 {0}/{0}: "
+ "[0xabc0000000(0x1000000) @ 0x11c0000 103:01 1573523 0]: r-xp {1}\n"
+ "name 0 [000] 0.000000: PERF_RECORD_MMAP2 {0}/{0}: "
+ "[0xabc2000000(0x8000000) @ 0x31d0000 103:01 1573523 0]: r-xp {1}\n",
+ Pid, Filename);
+
+ BC->SegmentMapInfo[0x11da000] =
+ SegmentInfo{0x11da000, 0x10da000, 0x11ca000, 0x10da000, 0x10000, true};
+ // Using '0x31d0fff' FileOffset which triggers a different base address
+ // for this second text segment.
+ BC->SegmentMapInfo[0x31d0000] =
+ SegmentInfo{0x31d0000, 0x51ac82c, 0x31d0fff, 0x3000000, 0x200000, true};
+
+ DataAggregator DA("");
+ BC->setFilename(Filename);
+ ASSERT_DEBUG_DEATH({ Error Err = DA.preprocessProfile(*BC); },
+ "Base address on multiple segment mappings should match");
+}
diff --git a/llvm/utils/gn/secondary/bolt/unittests/Core/BUILD.gn b/llvm/utils/gn/secondary/bolt/unittests/Core/BUILD.gn
index 51dc24481a513b..945d31afca10f0 100644
--- a/llvm/utils/gn/secondary/bolt/unittests/Core/BUILD.gn
+++ b/llvm/utils/gn/secondary/bolt/unittests/Core/BUILD.gn
@@ -15,6 +15,7 @@ unittest("CoreTests") {
"BinaryContext.cpp",
"DynoStats.cpp",
"MCPlusBuilder.cpp",
+ "MemoryMaps.cpp",
]
defines = []
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
fbc1b8a
to
71a6c45
Compare
Reapplying:
@zmodem, this should fix the reported failure. The only change is this. An alternative could be to upgrade this assertion (line 2132) to a @maksfb, let me know if that is preferred, otherwise I will reapply the patch by EOD tomorrow. |
Verified that it passes the tests locally for me. Thanks! |
Can anyone approve this patch before I re-apply it? Thanks. |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/136/builds/1828 Here is the relevant piece of the build log for the reference
|
Adding that the reported failures of ASAN buildbot an Support\ProgramTest.cpp should be unrelated to this patch. |
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/146/builds/1740 Here is the relevant piece of the build log for the reference
|
When a binary has multiple text segments, the Size is computed as the
difference of the last address of these segments from the BaseAddress.
The base addresses of all text segments must be the same.
Introduces flag 'perf-script-events' for testing, which allows passing perf events
without BOLT having to parse them by invoking 'perf script'. The flag is used to
pass a mock perf profile that has two memory mappings for a mock binary
that has two text segments. The mapping size is updated as
parseMMapEvents
now processes all text segments.