[BOLT][heatmap] Produce zoomed-out heatmaps #140153
Created using spr 1.3.4
@llvm/pr-subscribers-bolt
Author: Amir Ayupov (aaupov)
Changes
Add an option
The option rescales an existing heatmap, so the provided bucket sizes
Suggested values to use: 4096 (default page size), 16384 (16k page),
Test Plan: updated heatmap-preagg.test
Full diff: https://github.com/llvm/llvm-project/pull/140153.diff
4 Files Affected:
diff --git a/bolt/include/bolt/Profile/Heatmap.h b/bolt/include/bolt/Profile/Heatmap.h
index 9813e7fed486d..bf3d1c91c0aa5 100644
--- a/bolt/include/bolt/Profile/Heatmap.h
+++ b/bolt/include/bolt/Profile/Heatmap.h
@@ -85,6 +85,9 @@ class Heatmap {
void printSectionHotness(raw_ostream &OS) const;
size_t size() const { return Map.size(); }
+
+ /// Increase bucket size to \p TargetSize, recomputing the heatmap.
+ bool resizeBucket(uint64_t TargetSize);
};
} // namespace bolt
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 6beb60741406e..aa681e633c0d8 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -68,6 +68,12 @@ FilterPID("pid",
cl::Optional,
cl::cat(AggregatorCategory));
+static cl::list<uint64_t>
+ HeatmapZoomOut("heatmap-zoom-out", cl::CommaSeparated,
+ cl::desc("print secondary heatmaps with given bucket sizes"),
+ cl::value_desc("bucket_size"), cl::Optional,
+ cl::cat(HeatmapCategory));
+
static cl::opt<bool>
IgnoreBuildID("ignore-build-id",
cl::desc("continue even if build-ids in input binary and perf.data mismatch"),
@@ -1365,6 +1371,15 @@ std::error_code DataAggregator::printLBRHeatMap() {
HM.printCDF(opts::HeatmapOutput + ".csv");
HM.printSectionHotness(opts::HeatmapOutput + "-section-hotness.csv");
}
+ // Provide coarse-grained heatmap if requested via --heatmap-zoom-out
+ for (const uint64_t NewBucketSize : opts::HeatmapZoomOut) {
+ if (!HM.resizeBucket(NewBucketSize))
+ break;
+ if (opts::HeatmapOutput == "-")
+ HM.print(opts::HeatmapOutput);
+ else
+ HM.print(formatv("{0}-{1}", opts::HeatmapOutput, NewBucketSize).str());
+ }
return std::error_code();
}
diff --git a/bolt/lib/Profile/Heatmap.cpp b/bolt/lib/Profile/Heatmap.cpp
index c66c2e5487613..4aaf6dc344a85 100644
--- a/bolt/lib/Profile/Heatmap.cpp
+++ b/bolt/lib/Profile/Heatmap.cpp
@@ -81,7 +81,7 @@ void Heatmap::print(raw_ostream &OS) const {
// the Address.
auto startLine = [&](uint64_t Address, bool Empty = false) {
changeColor(DefaultColor);
- const uint64_t LineAddress = Address / BytesPerLine * BytesPerLine;
+ const uint64_t LineAddress = alignTo(Address, BytesPerLine);
if (MaxAddress > 0xffffffff)
OS << format("0x%016" PRIx64 ": ", LineAddress);
@@ -364,5 +364,18 @@ void Heatmap::printSectionHotness(raw_ostream &OS) const {
OS << formatv("[unmapped], 0x0, 0x0, {0:f4}, 0, 0\n",
100.0 * UnmappedHotness / NumTotalCounts);
}
+
+bool Heatmap::resizeBucket(uint64_t TargetSize) {
+ if (TargetSize <= BucketSize)
+ return false;
+ std::map<uint64_t, uint64_t> NewMap;
+ for (const auto [Bucket, Count] : Map) {
+ const uint64_t Address = Bucket * BucketSize;
+ NewMap[Address / TargetSize] += Count;
+ }
+ Map = NewMap;
+ BucketSize = TargetSize;
+ return true;
+}
} // namespace bolt
} // namespace llvm
diff --git a/bolt/test/X86/heatmap-preagg.test b/bolt/test/X86/heatmap-preagg.test
index 306e74800a353..9539269ff0d47 100644
--- a/bolt/test/X86/heatmap-preagg.test
+++ b/bolt/test/X86/heatmap-preagg.test
@@ -3,8 +3,11 @@
RUN: yaml2obj %p/Inputs/blarge_new.yaml &> %t.exe
## Non-BOLTed input binary
RUN: llvm-bolt-heatmap %t.exe -o %t --pa -p %p/Inputs/blarge_new.preagg.txt \
-RUN: 2>&1 | FileCheck --check-prefix CHECK-HEATMAP %s
+RUN: --heatmap-zoom-out 128,1024 2>&1 | FileCheck --check-prefix CHECK-HEATMAP %s
RUN: FileCheck %s --check-prefix CHECK-SEC-HOT --input-file %t-section-hotness.csv
+RUN: FileCheck %s --check-prefix CHECK-HM-64 --input-file %t
+RUN: FileCheck %s --check-prefix CHECK-HM-128 --input-file %t-128
+RUN: FileCheck %s --check-prefix CHECK-HM-1024 --input-file %t-1024
## BOLTed input binary
RUN: llvm-bolt %t.exe -o %t.out --pa -p %p/Inputs/blarge_new.preagg.txt \
@@ -24,6 +27,15 @@ CHECK-SEC-HOT-NEXT: .plt, 0x401020, 0x4010b0, 4.7583, 66.6667, 0.0317
CHECK-SEC-HOT-NEXT: .text, 0x4010b0, 0x401c25, 78.3872, 85.1064, 0.6671
CHECK-SEC-HOT-NEXT: .fini, 0x401c28, 0x401c35, 0.0000, 0.0000, 0.0000
+# Only check start addresses – can't check colors, and FileCheck doesn't strip
+# color codes by default. Reference output:
+# HM-64: 0x00404000: ABBcccccccccccccccCCCCCCCCCccccCCCCCCCCcc....CC
+# HM-128: 0x00408000: ABCCCCCCCCCCCCCCCCCCc.CC
+# HM-1024: 0x00440000: ACC
+CHECK-HM-64: 0x00404000:
+CHECK-HM-128: 0x00408000:
+CHECK-HM-1024: 0x00440000:
+
CHECK-HEATMAP-BAT: PERF2BOLT: read 79 aggregated LBR entries
CHECK-HEATMAP-BAT: HEATMAP: invalid traces: 2
Great option! It may be worth listing this in the Heatmaps.md, alongside the suggested usage.
Suggested values to use: 4096 (default page size), 16384 (16k page),
1048576 (1MB for XL workloads).
Instead of introducing the new option, can we reuse
Without a printed warning? I understand that the size limitation is driven by implementation, but at the very least we should report that there is no expected output. Alternatively, use a different option format that encapsulates the limitations. E.g.,
Good call. I prefer the latter approach with explicit scales.
The implementation looks good to me.
A couple comments regarding the interface:
- If we are producing multiple files by default, we might have to change docs in more places.
- I find the power-of-two limitation for the scale unnecessary.
- The variance in default scale steps looks a bit unnatural (64, 4, 64 again). I would keep just 64 and 64 making the default sizes 64 bytes, 4 KB, 256 KB.
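The scale steps discussed above compound multiplicatively: each scale is applied to the previous bucket size. A minimal sketch of that arithmetic, assuming a hypothetical helper `bucketSizesFromScales` (not part of the PR):

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper, for illustration only: turn a base bucket size and a
// list of scale factors into the absolute bucket size of each heatmap.
std::vector<uint64_t>
bucketSizesFromScales(uint64_t BaseSize, const std::vector<uint64_t> &Scales) {
  std::vector<uint64_t> Sizes{BaseSize};
  // Each scale multiplies the previously computed bucket size.
  for (uint64_t Scale : Scales)
    Sizes.push_back(Sizes.back() * Scale);
  return Sizes;
}
```

With a 64-byte base, scales 64 and 64 give 64 B, 4 KB and 256 KB, while scales 64, 4 and 64 give 64 B, 4 KB, 16 KB and 1 MB.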
Makes sense. I'll update heatmap doc, anything else?
This made it impossible to enter invalid scales. While working on a custom parser, I realized that a better UX would be to allow specifying bucket sizes in natural units (e.g. 4KB, 1MB):
If you agree, I'll switch to that and print warnings in case of invalid sizes/rescales.
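For illustration, parsing sizes in natural units could look roughly like the sketch below. The function name `parseBucketSize`, the accepted suffix set, and the choice to treat every K/M form as binary (1K = 1024) are assumptions; this is not the parser from the PR.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical parser, not the PR's code. Accepts a decimal number followed
// by an optional case-insensitive suffix: B, K/KB/KiB, or M/MB/MiB.
// Assumption: all K/M forms are binary multiples. Overflow is not checked.
std::optional<uint64_t> parseBucketSize(const std::string &Text) {
  std::size_t Pos = 0;
  uint64_t Value = 0;
  // Consume the leading decimal digits.
  while (Pos < Text.size() &&
         std::isdigit(static_cast<unsigned char>(Text[Pos]))) {
    Value = Value * 10 + static_cast<uint64_t>(Text[Pos] - '0');
    ++Pos;
  }
  if (Pos == 0 || Value == 0)
    return std::nullopt; // no digits, or a zero-sized bucket

  // Lower-case whatever follows the digits and treat it as the suffix.
  std::string Suffix;
  for (; Pos < Text.size(); ++Pos)
    Suffix += static_cast<char>(
        std::tolower(static_cast<unsigned char>(Text[Pos])));

  uint64_t Multiplier = 0;
  if (Suffix.empty() || Suffix == "b")
    Multiplier = 1;
  else if (Suffix == "k" || Suffix == "kb" || Suffix == "kib")
    Multiplier = 1024;
  else if (Suffix == "m" || Suffix == "mb" || Suffix == "mib")
    Multiplier = 1024 * 1024;
  else
    return std::nullopt; // unknown suffix: the caller can print a warning

  return Value * Multiplier;
}
```

Returning `std::nullopt` instead of aborting leaves it to the caller to decide whether an invalid size is a warning or a hard error.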
It's the downstream of bucket size selection. 1MB works really well for XL binaries, to have a high-level view of the whole address space. I'd like to keep 1MB for that reason. We can add 256KB for sure. LMK if that sounds good.
Okay. Then the burden falls on the proper UI messaging and error/warning diagnostics.
If 256KB is difficult to read for large binaries, 1MB works for me.
Added usage message:
Added check for sorted values. Not checking the provided values though.
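A sorted-values check of this kind might be sketched as follows. Both function names are hypothetical; the second check (each size divides the next) goes beyond what the comment above says was added and is shown only as a possible stricter validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical check: bucket sizes must be strictly increasing, since a
// heatmap can only be rescaled to a coarser granularity.
bool isStrictlyIncreasing(const std::vector<uint64_t> &Sizes) {
  for (std::size_t I = 1; I < Sizes.size(); ++I)
    if (Sizes[I] <= Sizes[I - 1])
      return false;
  return true;
}

// Hypothetical stricter check (an assumption, not described in the thread):
// every size is a whole multiple of the one before it.
bool eachDividesNext(const std::vector<uint64_t> &Sizes) {
  for (std::size_t I = 1; I < Sizes.size(); ++I)
    if (Sizes[I - 1] == 0 || Sizes[I] % Sizes[I - 1] != 0)
      return false;
  return true;
}
```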
256KB works well for large binaries.
Thanks for addressing comments.
Add a capability to produce multiple heatmaps with given bucket sizes.
The default heatmap block size (64B) could be too fine-grained for
large binaries. Extend the option block-size to accept a list of
bucket sizes for additional heatmaps with coarser granularity. The
heatmap is simply rescaled so provided sizes should be multiples of
each other. Human-readable suffixes can be used, e.g. 4K, 16kb, 1MiB.
New defaults: 64B (base bucket size), 4KB (default page size),
256KB (for large binaries).
Test Plan: updated heatmap-preagg.test
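To illustrate why the provided sizes should be multiples of each other, here is a standalone sketch of the rescaling step. It mirrors the logic of `resizeBucket` from the diff, but operates on a plain `std::map`; the free function `rescale` and the `BucketMap` alias are stand-ins introduced for this example, and it needs C++17 for structured bindings.

```cpp
#include <cstdint>
#include <map>

// Stand-in for the heatmap state: bucket index -> sample count.
using BucketMap = std::map<uint64_t, uint64_t>;

// Hypothetical free-function version of the rescaling step: each source
// bucket's count is added to the target bucket containing its start address.
BucketMap rescale(const BucketMap &Map, uint64_t BucketSize,
                  uint64_t TargetSize) {
  BucketMap NewMap;
  for (const auto &[Bucket, Count] : Map) {
    const uint64_t Address = Bucket * BucketSize; // start of source bucket
    NewMap[Address / TargetSize] += Count;
  }
  return NewMap;
}
```

Going from 64-byte to 128-byte buckets, counts {1, 2, 3, 4} in buckets 0 to 3 merge cleanly into {3, 7}. With a target of 96 bytes, which is not a multiple of 64, source bucket 1 covers [64, 128) and straddles two target buckets, yet its whole count is attributed to target bucket 0, so the rescaled heatmap no longer reflects the original distribution.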