[BOLT][heatmap] Produce zoomed-out heatmaps #140153

aaupov · 2025-05-15T22:12:55Z

Add a capability to produce multiple heatmaps with given bucket sizes.

The default heatmap block size (64B) could be too fine-grained for
large binaries. Extend the option block-size to accept a list of
bucket sizes for additional heatmaps with coarser granularity. The
heatmap is simply rescaled so provided sizes should be multiples of
each other. Human-readable suffixes can be used, e.g. 4K, 16kb, 1MiB.

New defaults: 64B (base bucket size), 4KB (default page size),
256KB (for large binaries).
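
To make the suffix handling concrete, here is a minimal sketch of how strings such as `4K`, `16kb` or `1MiB` could be turned into byte counts. `parseBucketSize` is a hypothetical helper, not the parser from this patch, and it assumes binary units (K = 1024), which is consistent with "4KB (default page size)" above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper (not BOLT's parser): turn "64", "4K", "16kb" or "1MiB"
// into a byte count. Assumes binary units, i.e. K = 1024 and M = 1024 * 1024.
// Overflow of very large inputs is not handled in this sketch.
std::optional<uint64_t> parseBucketSize(const std::string &Text) {
  std::size_t Pos = 0;
  uint64_t Value = 0;
  while (Pos < Text.size() &&
         std::isdigit(static_cast<unsigned char>(Text[Pos]))) {
    Value = Value * 10 + static_cast<uint64_t>(Text[Pos] - '0');
    ++Pos;
  }
  if (Pos == 0)
    return std::nullopt; // no leading number

  // Compare the suffix case-insensitively.
  std::string Suffix;
  for (; Pos < Text.size(); ++Pos)
    Suffix += static_cast<char>(
        std::tolower(static_cast<unsigned char>(Text[Pos])));

  if (Suffix.empty() || Suffix == "b")
    return Value;
  if (Suffix == "k" || Suffix == "kb" || Suffix == "kib")
    return Value << 10;
  if (Suffix == "m" || Suffix == "mb" || Suffix == "mib")
    return Value << 20;
  return std::nullopt; // unknown suffix
}

// Usage example with the suffixes mentioned in the description.
inline void parseBucketSizeExample() {
  assert(parseBucketSize("4K") == uint64_t{4096});
  assert(parseBucketSize("16kb") == uint64_t{16384});
  assert(parseBucketSize("1MiB") == uint64_t{1048576});
}
```

Returning `std::optional` keeps the "invalid size" case explicit, so the caller can decide whether to warn or abort.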

Test Plan: updated heatmap-preagg.test

Created using spr 1.3.4

llvmbot · 2025-05-15T22:13:30Z

@llvm/pr-subscribers-bolt

Author: Amir Ayupov (aaupov)

Changes

Add an option --heatmap-zoom-out=bucket_size1,bucket_size2,... to
print additional heatmaps with coarser granularity. This makes it easier
to navigate the heatmap, compared to the default setting of 64B buckets
that could be too fine-grained.

The option rescales an existing heatmap, so the provided bucket sizes
should be multiples of the original bucket size (--block-size), and be
provided in ascending order. If rescaling is impossible, no heatmap is
produced.
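
The rescaling described here can be sketched outside of BOLT as follows. This is a standalone illustration modeled on `Heatmap::resizeBucket` from the diff below; the free function `rescale`, the `BucketMap` alias and the explicit multiple-of check are assumptions added for the example, not code from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

// Bucket index -> sample count, as in the Heatmap class.
using BucketMap = std::map<uint64_t, uint64_t>;

// Merge buckets of size BucketSize into coarser buckets of size TargetSize.
// Returns false and leaves the inputs untouched if TargetSize is not a
// larger multiple of the current bucket size (an assumed extra check).
bool rescale(BucketMap &Map, uint64_t &BucketSize, uint64_t TargetSize) {
  assert(BucketSize > 0 && "bucket size must be non-zero");
  if (TargetSize <= BucketSize || TargetSize % BucketSize != 0)
    return false;
  BucketMap NewMap;
  for (const auto &[Bucket, Count] : Map) {
    // Start address of the old bucket decides which new bucket it lands in.
    const uint64_t Address = Bucket * BucketSize;
    NewMap[Address / TargetSize] += Count;
  }
  Map = std::move(NewMap);
  BucketSize = TargetSize;
  return true;
}
```

Because each step starts from the already rescaled map, a size that is not a multiple of the current one would split an existing bucket across two new ones, which is why the sizes have to be given in ascending order and divide each other.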

Suggested values to use: 4096 (default page size), 16384 (16k page),
1048576 (1MB for XL workloads).

Test Plan: updated heatmap-preagg.test

Full diff: https://github.com/llvm/llvm-project/pull/140153.diff

4 Files Affected:

(modified) bolt/include/bolt/Profile/Heatmap.h (+3)
(modified) bolt/lib/Profile/DataAggregator.cpp (+15)
(modified) bolt/lib/Profile/Heatmap.cpp (+14-1)
(modified) bolt/test/X86/heatmap-preagg.test (+13-1)

diff --git a/bolt/include/bolt/Profile/Heatmap.h b/bolt/include/bolt/Profile/Heatmap.h
index 9813e7fed486d..bf3d1c91c0aa5 100644
--- a/bolt/include/bolt/Profile/Heatmap.h
+++ b/bolt/include/bolt/Profile/Heatmap.h
@@ -85,6 +85,9 @@ class Heatmap {
   void printSectionHotness(raw_ostream &OS) const;
 
   size_t size() const { return Map.size(); }
+
+  /// Increase bucket size to \p TargetSize, recomputing the heatmap.
+  bool resizeBucket(uint64_t TargetSize);
 };
 
 } // namespace bolt
diff --git a/bolt/lib/Profile/DataAggregator.cpp b/bolt/lib/Profile/DataAggregator.cpp
index 6beb60741406e..aa681e633c0d8 100644
--- a/bolt/lib/Profile/DataAggregator.cpp
+++ b/bolt/lib/Profile/DataAggregator.cpp
@@ -68,6 +68,12 @@ FilterPID("pid",
   cl::Optional,
   cl::cat(AggregatorCategory));
 
+static cl::list<uint64_t>
+    HeatmapZoomOut("heatmap-zoom-out", cl::CommaSeparated,
+                   cl::desc("print secondary heatmaps with given bucket sizes"),
+                   cl::value_desc("bucket_size"), cl::Optional,
+                   cl::cat(HeatmapCategory));
+
 static cl::opt<bool>
 IgnoreBuildID("ignore-build-id",
   cl::desc("continue even if build-ids in input binary and perf.data mismatch"),
@@ -1365,6 +1371,15 @@ std::error_code DataAggregator::printLBRHeatMap() {
     HM.printCDF(opts::HeatmapOutput + ".csv");
     HM.printSectionHotness(opts::HeatmapOutput + "-section-hotness.csv");
   }
+  // Provide coarse-grained heatmap if requested via --heatmap-zoom-out
+  for (const uint64_t NewBucketSize : opts::HeatmapZoomOut) {
+    if (!HM.resizeBucket(NewBucketSize))
+      break;
+    if (opts::HeatmapOutput == "-")
+      HM.print(opts::HeatmapOutput);
+    else
+      HM.print(formatv("{0}-{1}", opts::HeatmapOutput, NewBucketSize).str());
+  }
 
   return std::error_code();
 }
diff --git a/bolt/lib/Profile/Heatmap.cpp b/bolt/lib/Profile/Heatmap.cpp
index c66c2e5487613..4aaf6dc344a85 100644
--- a/bolt/lib/Profile/Heatmap.cpp
+++ b/bolt/lib/Profile/Heatmap.cpp
@@ -81,7 +81,7 @@ void Heatmap::print(raw_ostream &OS) const {
   // the Address.
   auto startLine = [&](uint64_t Address, bool Empty = false) {
     changeColor(DefaultColor);
-    const uint64_t LineAddress = Address / BytesPerLine * BytesPerLine;
+    const uint64_t LineAddress = alignTo(Address, BytesPerLine);
 
     if (MaxAddress > 0xffffffff)
       OS << format("0x%016" PRIx64 ": ", LineAddress);
@@ -364,5 +364,18 @@ void Heatmap::printSectionHotness(raw_ostream &OS) const {
     OS << formatv("[unmapped], 0x0, 0x0, {0:f4}, 0, 0\n",
                   100.0 * UnmappedHotness / NumTotalCounts);
 }
+
+bool Heatmap::resizeBucket(uint64_t TargetSize) {
+  if (TargetSize <= BucketSize)
+    return false;
+  std::map<uint64_t, uint64_t> NewMap;
+  for (const auto [Bucket, Count] : Map) {
+    const uint64_t Address = Bucket * BucketSize;
+    NewMap[Address / TargetSize] += Count;
+  }
+  Map = NewMap;
+  BucketSize = TargetSize;
+  return true;
+}
 } // namespace bolt
 } // namespace llvm
diff --git a/bolt/test/X86/heatmap-preagg.test b/bolt/test/X86/heatmap-preagg.test
index 306e74800a353..9539269ff0d47 100644
--- a/bolt/test/X86/heatmap-preagg.test
+++ b/bolt/test/X86/heatmap-preagg.test
@@ -3,8 +3,11 @@
 RUN: yaml2obj %p/Inputs/blarge_new.yaml &> %t.exe
 ## Non-BOLTed input binary
 RUN: llvm-bolt-heatmap %t.exe -o %t --pa -p %p/Inputs/blarge_new.preagg.txt \
-RUN:   2>&1 | FileCheck --check-prefix CHECK-HEATMAP %s
+RUN:   --heatmap-zoom-out 128,1024 2>&1 | FileCheck --check-prefix CHECK-HEATMAP %s
 RUN: FileCheck %s --check-prefix CHECK-SEC-HOT --input-file %t-section-hotness.csv
+RUN: FileCheck %s --check-prefix CHECK-HM-64 --input-file %t
+RUN: FileCheck %s --check-prefix CHECK-HM-128 --input-file %t-128
+RUN: FileCheck %s --check-prefix CHECK-HM-1024 --input-file %t-1024
 
 ## BOLTed input binary
 RUN: llvm-bolt %t.exe -o %t.out --pa -p %p/Inputs/blarge_new.preagg.txt \
@@ -24,6 +27,15 @@ CHECK-SEC-HOT-NEXT: .plt, 0x401020, 0x4010b0, 4.7583, 66.6667, 0.0317
 CHECK-SEC-HOT-NEXT: .text, 0x4010b0, 0x401c25, 78.3872, 85.1064, 0.6671
 CHECK-SEC-HOT-NEXT: .fini, 0x401c28, 0x401c35, 0.0000, 0.0000, 0.0000
 
+# Only check start addresses – can't check colors, and FileCheck doesn't strip
+# color codes by default. Reference output:
+# HM-64:   0x00404000: ABBcccccccccccccccCCCCCCCCCccccCCCCCCCCcc....CC
+# HM-128:  0x00408000: ABCCCCCCCCCCCCCCCCCCc.CC
+# HM-1024: 0x00440000: ACC
+CHECK-HM-64:   0x00404000:
+CHECK-HM-128:  0x00408000:
+CHECK-HM-1024: 0x00440000:
+
 CHECK-HEATMAP-BAT: PERF2BOLT: read 79 aggregated LBR entries
 CHECK-HEATMAP-BAT: HEATMAP: invalid traces: 2

paschalis-mpeis

Great option! It may be worth listing this in the Heatmaps.md, alongside the suggested usage.

> Suggested values to use: 4096 (default page size), 16384 (16k page),
> 1048576 (1MB for XL workloads).

maksfb · 2025-05-23T16:35:37Z

Instead of introducing the new option, can we reuse --block-size= and make it accept multiple values or a different format?

> If rescaling is impossible, no heatmap is produced.

Without a printed warning? I understand that the size limitation is driven by implementation, but at the very least we should report that there is no expected output.

Alternatively, use a different option format that encapsulates the limitations. E.g., --block-size=<initial_size>{:<scale>:<count>} or --block-size=<initial_size>{:<scale1>,<scale2>...}.

aaupov · 2025-05-23T17:12:46Z

> Instead of introducing the new option, can we reuse --block-size= and make it accept multiple values or a different format?
>
> > If rescaling is impossible, no heatmap is produced.
>
> Without a printed warning? I understand that the size limitation is driven by implementation, but at the very least we should report that there is no expected output.
>
> Alternatively, use a different option format that encapsulates the limitations. E.g., --block-size=<initial_size>{:<scale>:<count>} or --block-size=<initial_size>{:<scale1>,<scale2>...}.

Good call. I prefer the latter approach with explicit scales.
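
As an illustration of the explicit-scales format, the sketch below expands an initial size plus a list of scales into the resulting bucket sizes. It assumes each scale multiplies the previous bucket size (so scales compound); `expandScales` is a hypothetical name and the function is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical expansion of <initial_size>{:<scale1>,<scale2>...}:
// every scale enlarges the previous bucket size by that factor.
std::vector<uint64_t> expandScales(uint64_t InitialSize,
                                   const std::vector<uint64_t> &Scales) {
  assert(InitialSize > 0 && "initial bucket size must be non-zero");
  std::vector<uint64_t> Sizes{InitialSize};
  for (const uint64_t Scale : Scales) {
    assert(Scale > 1 && "each scale must enlarge the bucket");
    Sizes.push_back(Sizes.back() * Scale);
  }
  return Sizes;
}
```

Under that assumption, an initial size of 64 with scales 64 and 4 yields 64, 4096 and 16384 bytes. Sizes built this way are multiples of each other by construction, which is the limitation the format is meant to encapsulate.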

maksfb

The implementation looks good to me.

A couple comments regarding the interface:

- If we are producing multiple files by default, we might have to change docs in more places.
- I find the power-of-two limitation for the scale unnecessary.
- The variance in default scale steps looks a bit unnatural (64, 4, 64 again). I would keep just 64 and 64 making the default sizes 64 bytes, 4 KB, 256 KB.

bolt/docs/Heatmaps.md

aaupov · 2025-05-27T18:18:18Z

@maksfb:

> If we are producing multiple files by default, we might have to change docs in more places.

Makes sense. I'll update heatmap doc, anything else?

> I find the power-of-two limitation for the scale unnecessary.

This made it impossible to enter invalid scales. While working on a custom parser, I realized that a better UX would be to allow specifying bucket sizes in natural units (e.g. 4KB, 1MB):

--block-size=default_size[,size1,...]

If you agree, I'll switch to that and print warnings in case of invalid sizes/rescales.

> The variance in default scale steps looks a bit unnatural (64, 4, 64 again). I would keep just 64 and 64 making the default sizes 64 bytes, 4 KB, 256 KB.

It's the downstream of bucket size selection. 1MB works really well for XL binaries, to have a high-level view of the whole address space. I'd like to keep 1MB for that reason. We can add 256KB for sure.

LMK if that sounds good.

maksfb · 2025-05-28T02:23:16Z

> @maksfb:
>
> > If we are producing multiple files by default, we might have to change docs in more places.
>
> Makes sense. I'll update heatmap doc, anything else?

llvm-bolt-heatmap usage message if possible.

> > I find the power-of-two limitation for the scale unnecessary.
>
> This made it impossible to enter invalid scales. While working on a custom parser, I realized that a better UX would be to allow specifying bucket sizes in natural units (e.g. 4KB, 1MB):
> --block-size=default_size[,size1,...]
> If you agree, I'll switch to that and print warnings in case of invalid sizes/rescales.

Okay. Then the burden falls on the proper UI messaging and error/warning diagnostics.

> > The variance in default scale steps looks a bit unnatural (64, 4, 64 again). I would keep just 64 and 64 making the default sizes 64 bytes, 4 KB, 256 KB.
>
> It's the downstream of bucket size selection. 1MB works really well for XL binaries, to have a high-level view of the whole address space. I'd like to keep 1MB for that reason. We can add 256KB for sure.
>
> LMK if that sounds good.

If 256KB is difficult to read for large binaries, 1MB works for me.

aaupov · 2025-05-30T18:39:27Z

> > @maksfb:
> >
> > > If we are producing multiple files by default, we might have to change docs in more places.
> >
> > Makes sense. I'll update heatmap doc, anything else?
>
> llvm-bolt-heatmap usage message if possible.

Added usage message:

OVERVIEW:  BOLT Code Heatmap tool

  Produces code heatmaps using sampled profile

  Inputs:
  - Binary (supports BOLT-optimized binaries),
  - Sampled profile collected from the binary:
    - perf data or pre-aggregated profile data (instrumentation profile not supported)
    - perf data can have basic (IP) or branch-stack (LBR) samples

  Outputs:
  - Heatmaps: colored ASCII (requires a color-capable terminal or a conversion tool like `aha`)
    Multiple heatmaps are produced by default with different granularities (set by `block-size` option)
  - Section hotness: per-section samples% and utilization%
  - Cumulative distribution: working set size corresponding to a given percentile of samples

> > > I find the power-of-two limitation for the scale unnecessary.
> >
> > This made it impossible to enter invalid scales. While working on a custom parser, I realized that a better UX would be to allow specifying bucket sizes in natural units (e.g. 4KB, 1MB):
> > --block-size=default_size[,size1,...]
> > If you agree, I'll switch to that and print warnings in case of invalid sizes/rescales.
>
> Okay. Then the burden falls on the proper UI messaging and error/warning diagnostics.

Added check for sorted values. Not checking the provided values though.
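
A check along these lines could look like the sketch below. It is an illustration only: `checkBucketSizes` and its messages are hypothetical, and the multiple-of test is an assumption based on the later commit titled "enforce block sizes to be multiples".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical validation of a --block-size list. Returns an empty string
// when the list is usable, otherwise a diagnostic that could be printed as
// a warning or error.
std::string checkBucketSizes(const std::vector<uint64_t> &Sizes) {
  if (Sizes.empty())
    return "no bucket sizes given";
  if (Sizes.front() == 0)
    return "bucket size must be non-zero";
  for (std::size_t I = 1; I < Sizes.size(); ++I) {
    if (Sizes[I] <= Sizes[I - 1])
      return "bucket sizes must be in strictly ascending order";
    if (Sizes[I] % Sizes[I - 1] != 0)
      return "each bucket size must be a multiple of the previous one";
  }
  return "";
}

// Usage example: the new defaults pass, a misordered list does not.
inline void checkBucketSizesExample() {
  assert(checkBucketSizes({64, 4096, 262144}).empty());
  assert(!checkBucketSizes({64, 4096, 1024}).empty());
}
```

Reporting a message instead of silently skipping the heatmap addresses the earlier concern about missing output without a printed warning.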

> > > The variance in default scale steps looks a bit unnatural (64, 4, 64 again). I would keep just 64 and 64 making the default sizes 64 bytes, 4 KB, 256 KB.
> >
> > It's the downstream of bucket size selection. 1MB works really well for XL binaries, to have a high-level view of the whole address space. I'd like to keep 1MB for that reason. We can add 256KB for sure.
> > LMK if that sounds good.
>
> If 256KB is difficult to read for large binaries, 1MB works for me.

256KB works well for large binaries.

maksfb

Thanks for addressing comments.

[𝘀𝗽𝗿] initial version

1e1ef44

aaupov requested review from maksfb, rafaelauler, ayermolo and yota9 as code owners May 15, 2025 22:12

llvmbot added the BOLT label May 15, 2025

aaupov changed the title ~~[BOLT][heatmap] Produce zoomed-out heatmap~~ [BOLT][heatmap] Produce zoomed-out heatmaps May 15, 2025

aaupov added 2 commits May 15, 2025 15:14

drop alignTo

0bc316a

fix test

1833e09

paschalis-mpeis reviewed May 16, 2025

aaupov added 2 commits May 16, 2025 11:09

add new option to Heatmaps.md

e57835b

suggested values

5ab0498

aaupov added 2 commits May 23, 2025 15:36

Modify block-size option

d8d259f

clang-format

9f66520

maksfb reviewed May 27, 2025

bolt/docs/Heatmaps.md (outdated, resolved)

aaupov added 4 commits May 30, 2025 10:38

[𝘀𝗽𝗿] changes introduced through rebase

f2e5ce8

update block-size option

e6ecb24

update block-size option

784de2c

usage message

1d54ac1

enforce block sizes to be multiples

4b28ffe

maksfb approved these changes May 30, 2025

aaupov merged commit 5047a33 into main May 30, 2025
8 of 9 checks passed

aaupov deleted the users/aaupov/spr/boltheatmap-produce-zoomed-out-heatmap branch May 30, 2025 23:20
