[BOLT][AArch64] Fix strict usage during ADR Relax #71377

Merged
merged 1 commit into from
Nov 10, 2023
7 changes: 5 additions & 2 deletions bolt/lib/Passes/ADRRelaxationPass.cpp
@@ -72,14 +72,17 @@ void ADRRelaxationPass::runOnFunction(BinaryFunction &BF) {

if (It != BB.begin() && BC.MIB->isNoop(*std::prev(It))) {
It = BB.eraseInstruction(std::prev(It));
} else if (opts::StrictMode && !BF.isSimple()) {
} else if (std::next(It) != BB.end() && BC.MIB->isNoop(*std::next(It))) {
BB.eraseInstruction(std::next(It));
} else if (!opts::StrictMode && !BF.isSimple()) {
Contributor

@yota9, I don't think it's safe to allow code size expansion for non-simple functions in strict mode. Do you remember why this change was needed?

Member Author

@yota9 yota9 Dec 11, 2024

Hello @maksfb. I don't really like the use of the strict option here. But before this patch the check (not added by me) was the opposite: the error was raised in strict mode. Strict mode is disabled by default, however, and by default we want the error to be raised in this case. Also, strict mode "expands" the number of functions BOLT can process, since we place more "trust" in the binary as coming from a "well-formed source". So although I feel a separate option might be a better fit here, strict was what was used before, and I decided to fix only the condition I disagreed with.

Contributor

I see. The original check was added by @treapster in https://reviews.llvm.org/D143887, and I believe the intent was to put more strict requirements on the binary (?). Anyway, if there are no objections, I'm going to remove the StrictMode check here and always issue an error if we can't update non-simple functions without changing instruction offsets.

Member Author

I don't really mind. Ideally there should be an option taking a list of functions that are allowed to be skipped when this error occurs. But I think whoever needs it can implement it :)

Contributor

This change is very confusing because it gives relaxed behavior with the strict option and strict behavior without it. I think the only way to make it make sense is to fail regardless of --strict mode, as @maksfb suggested. But you have to evaluate whether it's worth making BOLT always fail on binaries with non-simple functions without NOPs. If there's no jump table, the function is fine, so a failure may be a false positive.

Member Author

@yota9 yota9 Dec 12, 2024

Although the option's name might suggest being more "strict" about the input binary, its description says the opposite: "trust the input to be from a well-formed source". So the option expands the number of binary functions we process rather than shrinking it. It doesn't really matter, though: I only fixed how the option is used here, and the choice of this option was not the best from the beginning.
With the previous logic there was some interference with strict in another place (I don't remember where) that followed the logic correctly, which is why I had to fix it here.

Contributor

I'm the one to blame for the option name. The idea is that the input binary is built with strict requirements allowing BOLT to optimize more aggressively. The way I see it at the moment, we may get rid of the option in the future. It certainly doesn't make much sense on ARM anyway where we cannot rely on relocations to reconstruct control flow.
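
For readers following the thread above, the branch order in the hunk after this patch can be modeled as a small decision function. This is only a sketch: `AdrSite`, `Action`, and `decide` are hypothetical names for illustration and are not part of BOLT, and `GrowFunction` stands for the fall-through case in which the ADR is replaced by ADRP + ADD and the function gets one instruction longer.

```cpp
// Hypothetical model of the if/else chain in the hunk; not BOLT's API.
enum class Action { EraseNopBefore, EraseNopAfter, Fail, GrowFunction };

struct AdrSite {
  bool NopBefore; // a NOP immediately precedes the ADR
  bool NopAfter;  // a NOP immediately follows the ADR
  bool Simple;    // stands in for BF.isSimple()
};

// Behavior after this patch: the error fires only when --strict is NOT set.
constexpr Action decide(const AdrSite &S, bool StrictMode) {
  if (S.NopBefore)
    return Action::EraseNopBefore; // reuse the NOP slot, length unchanged
  if (S.NopAfter)
    return Action::EraseNopAfter;  // same, with the NOP after the ADR
  if (!StrictMode && !S.Simple)
    return Action::Fail;           // "Cannot relax adr in non-simple function"
  return Action::GrowFunction;     // code size expansion is accepted
}

// Non-simple function without an adjacent NOP: fails by default,
// but is allowed to grow under --strict.
static_assert(decide(AdrSite{false, false, false}, false) == Action::Fail, "");
static_assert(decide(AdrSite{false, false, false}, true) == Action::GrowFunction, "");
// An adjacent NOP is used regardless of the option.
static_assert(decide(AdrSite{true, false, false}, false) == Action::EraseNopBefore, "");
```

Written this way, the point raised in the review is easy to see: the option named "strict" selects the more permissive branch.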

// If the function is not simple, it may contain a jump table undetected
// by us. This jump table may use an offset from the branch instruction
// to land in the desired place. If we add new instructions, we
// invalidate this offset, so we have to rely on linker-inserted NOP to
// replace it with ADRP, and abort if it is not present.
auto L = BC.scopeLock();
errs() << formatv("BOLT-ERROR: Cannot relax adr in non-simple function "
"{0}. Can't proceed in current mode.\n",
"{0}. Use --strict option to override\n",
BF.getOneName());
PassFailed = true;
return;
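
The code comment above argues that adding an instruction can invalidate an offset used by an undetected jump table. A minimal sketch of that argument follows, assuming a simplified layout in which the table entry stores the distance from the branch to its target. `Layout`, `relaxByGrowing`, `relaxIntoNop`, and `entryStillValid` are hypothetical names, not BOLT code.

```cpp
#include <cstdint>

constexpr uint64_t InstrSize = 4; // AArch64 instructions are 4 bytes wide

// Hypothetical byte offsets inside one function.
struct Layout {
  uint64_t Branch; // offset of the indirect branch
  uint64_t Adr;    // offset of the ADR being relaxed
  uint64_t Target; // offset the jump-table entry is meant to reach
};

// ADR -> ADRP + ADD with no spare slot: everything after the ADR moves by 4.
constexpr Layout relaxByGrowing(Layout L) {
  if (L.Branch > L.Adr)
    L.Branch += InstrSize;
  if (L.Target > L.Adr)
    L.Target += InstrSize;
  return L;
}

// ADRP takes the place of an adjacent NOP: no instruction changes position.
constexpr Layout relaxIntoNop(Layout L) { return L; }

// Does an offset recorded before rewriting still land on the target?
constexpr bool entryStillValid(const Layout &L, uint64_t StoredOffset) {
  return L.Branch + StoredOffset == L.Target;
}

// Branch at 0, ADR at 8, target at 16; the table stores 16.
static_assert(entryStillValid(relaxIntoNop(Layout{0, 8, 16}), 16), "");
static_assert(!entryStillValid(relaxByGrowing(Layout{0, 8, 16}), 16), "");
```

In the growing case the target moves to offset 20 while the stored value stays 16, which is the breakage the pass guards against by requiring a NOP.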
2 changes: 1 addition & 1 deletion bolt/test/AArch64/r_aarch64_prelxx.s
@@ -12,7 +12,7 @@
// CHECKPREL-NEXT: R_AARCH64_PREL32 {{.*}} _start + 4
// CHECKPREL-NEXT: R_AARCH64_PREL64 {{.*}} _start + 8

// RUN: llvm-bolt %t.exe -o %t.bolt
// RUN: llvm-bolt %t.exe -o %t.bolt --strict
// RUN: llvm-objdump -D %t.bolt | FileCheck %s --check-prefix=CHECKPREL32

// CHECKPREL32: [[#%x,DATATABLEADDR:]] <datatable>:
31 changes: 18 additions & 13 deletions bolt/test/runtime/AArch64/adrrelaxationpass.s
@@ -1,33 +1,27 @@
# The second and third ADR instructions are non-local to functions
# and must be replaced with ADRP + ADD by BOLT
# Also since main is non-simple, we can't change it's length so we have to
# replace NOP with adrp, and if there is no nop before adr in non-simple
# Also since main and test are non-simple, we can't change it's length so we
# have to replace NOP with adrp, and if there is no nop before adr in non-simple
# function, we can't guarantee we didn't break possible jump tables, so we
# fail in strict mode
# fail in non-strict mode

# REQUIRES: system-linux

# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown \
# RUN: %s -o %t.o
# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q
# RUN: llvm-bolt %t.exe -o %t.bolt --adr-relaxation=true
# RUN: llvm-bolt %t.exe -o %t.bolt --adr-relaxation=true --strict
# RUN: llvm-objdump --no-print-imm-hex -d --disassemble-symbols=main %t.bolt | FileCheck %s
# RUN: %t.bolt
# RUN: not llvm-bolt %t.exe -o %t.bolt --adr-relaxation=true --strict \
# RUN: not llvm-bolt %t.exe -o %t.bolt --adr-relaxation=true \
# RUN: 2>&1 | FileCheck %s --check-prefix CHECK-ERROR

.data
.align 8
.global Gvar
Gvar: .xword 0x0
.global Gvar2
Gvar2: .xword 0x42

.text
.align 4
.global test
.type test, %function
test:
adr x2, Gvar
mov x0, xzr
ret
.size test, .-test
@@ -47,11 +41,22 @@ br:
.CI:
.word 0xff

.data
.align 8
.global Gvar
Gvar: .xword 0x0
.global Gvar2
Gvar2: .xword 0x42
.balign 4
jmptable:
.word 0
.word test - jmptable

# CHECK: <main>:
# CHECK-NEXT: adr x0, 0x{{[1-8a-f][0-9a-f]*}}
# CHECK-NEXT: adrp x1, 0x{{[1-8a-f][0-9a-f]*}}
# CHECK-NEXT: add x1, x1, #{{[1-8a-f][0-9a-f]*}}
# CHECK-NEXT: adrp x2, 0x{{[1-8a-f][0-9a-f]*}}
# CHECK-NEXT: add x2, x2, #{{[1-8a-f][0-9a-f]*}}
# CHECK-NEXT: adr x3, 0x{{[1-8a-f][0-9a-f]*}}
# CHECK-ERROR: BOLT-ERROR: Cannot relax adr in non-simple function main
# CHECK-ERROR: BOLT-ERROR: Cannot relax adr in non-simple function
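
The test header says non-local ADR instructions must be replaced with ADRP + ADD, and the CHECK lines expect exactly that pair. As background, here is a sketch of the address arithmetic behind the two forms. ADR encodes a signed 21-bit byte offset (about ±1 MiB), while ADRP encodes a signed 21-bit count of 4 KiB pages and ADD supplies the low 12 bits. The helper names are hypothetical, and the idea that range is what motivates the relaxation is an assumption, not something stated in the PR.

```cpp
#include <cstdint>

constexpr uint64_t PageMask = 0xfff; // low 12 bits: offset within a 4 KiB page

// ADR reaches targets within a signed 21-bit byte offset of the PC.
constexpr bool fitsAdr(uint64_t PC, uint64_t Target) {
  return static_cast<int64_t>(Target - PC) >= -(1LL << 20) &&
         static_cast<int64_t>(Target - PC) < (1LL << 20);
}

// Operands of the ADRP + ADD pair (hypothetical representation).
struct AdrpAdd {
  int64_t PageDelta; // ADRP immediate: page(Target) - page(PC)
  uint64_t Lo12;     // ADD immediate: Target's offset within its page
};

constexpr AdrpAdd split(uint64_t PC, uint64_t Target) {
  return {static_cast<int64_t>(Target >> 12) - static_cast<int64_t>(PC >> 12),
          Target & PageMask};
}

// What the pair computes at run time: page(PC) + PageDelta * 4096 + Lo12.
constexpr uint64_t materialize(uint64_t PC, AdrpAdd P) {
  return (PC & ~PageMask) + (static_cast<uint64_t>(P.PageDelta) << 12) + P.Lo12;
}

// A target roughly 2 MiB away is out of ADR range but reachable via ADRP + ADD.
static_assert(!fitsAdr(0x400100, 0x612345), "");
static_assert(split(0x400100, 0x612345).PageDelta == 0x212, "");
static_assert(split(0x400100, 0x612345).Lo12 == 0x345, "");
static_assert(materialize(0x400100, split(0x400100, 0x612345)) == 0x612345, "");
```

The pair costs one extra instruction compared with ADR, which is why the pass looks for an adjacent NOP to absorb it in functions whose length must not change.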
2 changes: 2 additions & 0 deletions bolt/test/runtime/AArch64/controlflow.s
@@ -48,6 +48,7 @@ test_cond_branch:
.global test_branch_reg
.type test_branch_reg, %function
test_branch_reg:
nop
adr x0, test_branch_zero
br x0
panic
@@ -97,6 +98,7 @@ test_call:
.global test_call_reg
.type test_call_reg, %function
test_call_reg:
nop
adr x0, test_call_foo
blr x0
panic