Commit cb9bacf

[AArch64][BOLT] Ensure tentative code layout for cold BBs runs. (#96609)
When split functions is used, BOLT may skip tentative code layout estimation in some cases, such as:
- when there is no profile data for some blocks (i.e. cold blocks)
- when there are cold functions in lite mode
- when skip functions is used

However, when rewriting the binary we still need to compute PC-relative distances between hot and cold basic blocks. Without cold layout estimation, BOLT uses '0x0' as the address of the first cold block, leading to incorrect estimations of any PC-relative addresses.

This affects large binaries, as the relaxStub method expands more branches than necessary using the short-jump sequence, because it wrongly believes it has exceeded the branch distance boundary. This increases code size with a sequence that is both larger and slower; however, the performance regression is expected to be minimal since this only affects cold code that is actually called.

Example of such an unnecessary relaxation:

from:
```armasm
b .Ltmp1234
```
to:
```armasm
adrp x16, .Ltmp1234
add x16, x16, :lo12:.Ltmp1234
br x16
```
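
To make the range argument in the commit message concrete, here is a minimal sketch of a reachability check for an AArch64 `b` instruction, whose signed 26-bit word offset covers roughly ±128 MiB. The helper name `fitsInDirectBranch` and all addresses are hypothetical and chosen only for illustration; this is not BOLT's actual check inside relaxStub.

```cpp
#include <cassert>
#include <cstdint>

// AArch64 `b` encodes a signed 26-bit offset counted in 4-byte words, so the
// target must lie within [-128 MiB, +128 MiB - 4] of the branch itself.
constexpr int64_t BranchRange = int64_t{1} << 27; // 128 MiB in bytes

// Hypothetical helper: true when a direct `b` at SrcAddress reaches TgtAddress.
bool fitsInDirectBranch(uint64_t SrcAddress, uint64_t TgtAddress) {
  // A64 instructions are 4-byte aligned.
  assert(SrcAddress % 4 == 0 && TgtAddress % 4 == 0);
  const int64_t Offset = static_cast<int64_t>(TgtAddress - SrcAddress);
  return Offset >= -BranchRange && Offset < BranchRange;
}

// Hypothetical addresses, not taken from any real binary.
constexpr uint64_t HotBranchAddress = 0x10000000;  // branch sits at 256 MiB
constexpr uint64_t RealColdAddress = 0x10400000;   // cold block 4 MiB later
constexpr uint64_t MissingColdEstimate = 0x0;      // address used when the
                                                   // cold layout was skipped
```

With these values, `fitsInDirectBranch(HotBranchAddress, RealColdAddress)` holds, while the same branch measured against `MissingColdEstimate` appears to be 256 MiB away and would be treated as out of range, which is the situation that leads to the longer `adrp`/`add`/`br` sequence.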
1 parent 1cc5290 commit cb9bacf

2 files changed, with 41 additions and 9 deletions
bolt/lib/Passes/LongJmp.cpp

Lines changed: 14 additions & 9 deletions
@@ -324,9 +324,8 @@ uint64_t LongJmpPass::tentativeLayoutRelocColdPart(
 uint64_t LongJmpPass::tentativeLayoutRelocMode(
     const BinaryContext &BC, std::vector<BinaryFunction *> &SortedFunctions,
     uint64_t DotAddress) {
-
   // Compute hot cold frontier
-  uint32_t LastHotIndex = -1u;
+  int64_t LastHotIndex = -1u;
   uint32_t CurrentIndex = 0;
   if (opts::HotFunctionsAtEnd) {
     for (BinaryFunction *BF : SortedFunctions) {
@@ -351,19 +350,20 @@ uint64_t LongJmpPass::tentativeLayoutRelocMode(
   // Hot
   CurrentIndex = 0;
   bool ColdLayoutDone = false;
+  auto runColdLayout = [&]() {
+    DotAddress = tentativeLayoutRelocColdPart(BC, SortedFunctions, DotAddress);
+    ColdLayoutDone = true;
+    if (opts::HotFunctionsAtEnd)
+      DotAddress = alignTo(DotAddress, opts::AlignText);
+  };
   for (BinaryFunction *Func : SortedFunctions) {
     if (!BC.shouldEmit(*Func)) {
       HotAddresses[Func] = Func->getAddress();
       continue;
     }
 
-    if (!ColdLayoutDone && CurrentIndex >= LastHotIndex) {
-      DotAddress =
-          tentativeLayoutRelocColdPart(BC, SortedFunctions, DotAddress);
-      ColdLayoutDone = true;
-      if (opts::HotFunctionsAtEnd)
-        DotAddress = alignTo(DotAddress, opts::AlignText);
-    }
+    if (!ColdLayoutDone && CurrentIndex >= LastHotIndex)
+      runColdLayout();
 
     DotAddress = alignTo(DotAddress, Func->getMinAlignment());
     uint64_t Pad =
@@ -382,6 +382,11 @@ uint64_t LongJmpPass::tentativeLayoutRelocMode(
     DotAddress += Func->estimateConstantIslandSize();
     ++CurrentIndex;
   }
+
+  // Ensure that tentative code layout always runs for cold blocks.
+  if (!ColdLayoutDone)
+    runColdLayout();
+
   // BBs
   for (BinaryFunction *Func : SortedFunctions)
     tentativeBBLayout(*Func);
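
The control flow of the patched loop can be sketched in isolation as follows. This is a simplified stand-in under stated assumptions, not BOLT code: `LayoutResult`, `tentativeLayoutSketch`, `NoHotFrontier`, and the plain size vector are hypothetical names introduced for illustration, and the cold layout is reduced to adding a fixed size. Note that `-1u` is an unsigned literal, so a sentinel declared as `int64_t X = -1u;` holds 4294967295 rather than -1; when the frontier is never found, it is the check after the loop that runs the cold layout.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the pass state; not BOLT's real data structures.
struct LayoutResult {
  bool ColdLayoutDone = false;
  bool UsedFallback = false;
  uint64_t DotAddress = 0;
};

// `-1u` is unsigned, so the stored value is 4294967295, not -1.
constexpr int64_t NoHotFrontier = -1u;

// Mirrors the shape of the patched loop: run the cold layout once the hot
// frontier is reached, and run it after the loop if that never happened.
LayoutResult tentativeLayoutSketch(const std::vector<uint64_t> &FunctionSizes,
                                   int64_t LastHotIndex, uint64_t ColdSize) {
  LayoutResult R;
  auto runColdLayout = [&]() {
    R.DotAddress += ColdSize; // stand-in for tentativeLayoutRelocColdPart
    R.ColdLayoutDone = true;
  };

  uint32_t CurrentIndex = 0;
  for (uint64_t Size : FunctionSizes) {
    // CurrentIndex is converted to int64_t for this comparison.
    if (!R.ColdLayoutDone && CurrentIndex >= LastHotIndex)
      runColdLayout();
    R.DotAddress += Size;
    ++CurrentIndex;
  }

  // Fallback: the cold layout must have run by the time we return.
  if (!R.ColdLayoutDone) {
    R.UsedFallback = true;
    runColdLayout();
  }
  assert(R.ColdLayoutDone);
  return R;
}
```

Calling `tentativeLayoutSketch({16, 32}, NoHotFrontier, 8)` takes the fallback path, whereas passing `1` as the frontier index runs the cold layout inside the loop; in both cases the cold part is accounted for exactly once.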

bolt/test/AArch64/split-funcs-lite.s

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+# This test checks that tentative code layout for cold blocks always runs.
+# It commonly happens when using lite mode with split functions.
+
+# REQUIRES: system-linux, asserts
+
+# RUN: %clang %cflags -o %t %s
+# RUN: %clang %s %cflags -Wl,-q -o %t
+# RUN: link_fdata --no-lbr %s %t %t.fdata
+# RUN: llvm-bolt %t -o %t.bolt --data %t.fdata -split-functions \
+# RUN:   -debug 2>&1 | FileCheck %s
+
+  .text
+  .globl foo
+  .type foo, %function
+foo:
+.entry_bb:
+# FDATA: 1 foo #.entry_bb# 10
+  cmp x0, #0
+  b.eq .Lcold_bb1
+  ret
+.Lcold_bb1:
+  ret
+
+## Force relocation mode.
+.reloc 0, R_AARCH64_NONE
+
+# CHECK: foo{{.*}} cold tentative: {{.*}}
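
As a rough illustration of what the `CHECK` line accepts, the sketch below approximates it with `std::regex`: text outside `{{...}}` must appear literally, each `{{.*}}` is a regular expression, and the match may start anywhere on a line. The function name `matchesColdTentativeCheck` and the sample line are hypothetical; the sample is not copied from real `-debug` output, and this does not model FileCheck's whitespace handling.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Approximates `# CHECK: foo{{.*}} cold tentative: {{.*}}` as a substring
// search: literal text stays literal, each {{.*}} becomes `.*`.
bool matchesColdTentativeCheck(const std::string &Line) {
  static const std::regex Pattern("foo.* cold tentative: .*");
  return std::regex_search(Line, Pattern);
}

// Hypothetical debug line used only for illustration.
const std::string SampleLine = "foo cold tentative: 0x4000";
```

A caller could verify the behavior with `assert(matchesColdTentativeCheck(SampleLine));`. A line that lacks the ` cold tentative: ` text would not match, which is how the test detects that the cold layout step did not run for `foo`.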
