-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[BOLT] Add --pad-funcs-before=func:n #117924
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
19b8e54
to
dbfc1b5
Compare
@llvm/pr-subscribers-bolt Author: Peter Waller (peterwaller-arm) ChangesThis complements --pad-funcs, and by using both simultaneously, enables See also: proposed functionality to enable inserting random padding in Full diff: https://github.com/llvm/llvm-project/pull/117924.diff 2 Files Affected:
diff --git a/bolt/lib/Core/BinaryEmitter.cpp b/bolt/lib/Core/BinaryEmitter.cpp
index f34a94c5779213..be21f1f363af29 100644
--- a/bolt/lib/Core/BinaryEmitter.cpp
+++ b/bolt/lib/Core/BinaryEmitter.cpp
@@ -54,6 +54,12 @@ FunctionPadSpec("pad-funcs",
cl::Hidden,
cl::cat(BoltCategory));
+static cl::list<std::string> FunctionPadBeforeSpec(
+ "pad-funcs-before", cl::CommaSeparated,
+ cl::desc("list of functions to pad with amount of bytes"),
+ cl::value_desc("func1:pad1,func2:pad2,func3:pad3,..."), cl::Hidden,
+ cl::cat(BoltCategory));
+
static cl::opt<bool> MarkFuncs(
"mark-funcs",
cl::desc("mark function boundaries with break instruction to make "
@@ -94,6 +100,30 @@ size_t padFunction(const BinaryFunction &Function) {
return 0;
}
+size_t padFunctionBefore(const BinaryFunction &Function) {
+ static std::map<std::string, size_t> FunctionPadding;
+
+ if (FunctionPadding.empty() && !FunctionPadBeforeSpec.empty()) {
+ for (std::string &Spec : FunctionPadBeforeSpec) {
+ size_t N = Spec.find(':');
+ if (N == std::string::npos)
+ continue;
+ std::string Name = Spec.substr(0, N);
+ size_t Padding = std::stoull(Spec.substr(N + 1));
+ FunctionPadding[Name] = Padding;
+ }
+ }
+
+ for (auto &FPI : FunctionPadding) {
+ std::string Name = FPI.first;
+ size_t Padding = FPI.second;
+ if (Function.hasNameRegex(Name))
+ return Padding;
+ }
+
+ return 0;
+}
+
} // namespace opts
namespace {
@@ -319,6 +349,36 @@ bool BinaryEmitter::emitFunction(BinaryFunction &Function,
Streamer.emitCodeAlignment(Function.getAlign(), &*BC.STI);
}
+ if (size_t Padding = opts::padFunctionBefore(Function)) {
+ // Handle padFuncsBefore after the above alignment logic but before
+ // symbol addresses are decided; with the intent that the nops are
+ // not executed and the original alignment logic is preserved.
+ if (!BC.HasRelocations) {
+ errs() << "BOLT-ERROR: -pad-before-funcs is not supported in "
+ << "non-relocation mode\n";
+ exit(1);
+ }
+
+ // Preserve Function.getMinAlign().
+ if (!isAligned(Function.getMinAlign(), Padding)) {
+ errs() << "BOLT-ERROR: User-requested " << Padding
+ << " padding bytes before function " << Function
+ << " is not a multiple of the minimum function alignment ("
+ << Function.getMinAlign().value() << ").\n";
+ exit(1);
+ }
+
+ LLVM_DEBUG(dbgs() << "BOLT-DEBUG: padding function " << Function << " with "
+ << Padding << " bytes\n");
+
+ // Since the padding is not executed, it can be null bytes.
+ Streamer.emitFill(Padding, 0);
+
+ Function.setMaxSize(Function.getMaxSize() + Padding);
+ Function.setSize(Function.getSize() + Padding);
+ Function.setImageSize(Function.getImageSize() + Padding);
+ }
+
MCContext &Context = Streamer.getContext();
const MCAsmInfo *MAI = Context.getAsmInfo();
diff --git a/bolt/test/AArch64/pad-before-funcs.s b/bolt/test/AArch64/pad-before-funcs.s
new file mode 100644
index 00000000000000..3b4eae8e70fecc
--- /dev/null
+++ b/bolt/test/AArch64/pad-before-funcs.s
@@ -0,0 +1,28 @@
+# RUN: llvm-mc -filetype=obj -triple aarch64-unknown-unknown %s -o %t.o
+# RUN: %clang %cflags %t.o -o %t.exe -Wl,-q -Wl,--section-start=.text=0x4000
+# RUN: llvm-bolt %t.exe -o %t.bolt.0 --pad-funcs-before=_start:0
+# RUN: llvm-bolt %t.exe -o %t.bolt.4 --pad-funcs-before=_start:4
+# RUN: llvm-bolt %t.exe -o %t.bolt.8 --pad-funcs-before=_start:8
+
+# RUN: not llvm-bolt %t.exe -o %t.bolt.8 --pad-funcs-before=_start:1 2>&1 | FileCheck --check-prefix=CHECK-BAD-ALIGN %s
+
+# CHECK-BAD-ALIGN: User-requested 1 padding bytes before function _start(*2) is not a multiple of the minimum function alignment (4).
+
+# RUN: llvm-objdump --section=.text --disassemble %t.bolt.0 | FileCheck --check-prefix=CHECK-0 %s
+# RUN: llvm-objdump --section=.text --disassemble %t.bolt.4 | FileCheck --check-prefix=CHECK-4 %s
+# RUN: llvm-objdump --section=.text --disassemble %t.bolt.8 | FileCheck --check-prefix=CHECK-8 %s
+
+# Need a symbol reference so that relocations are emitted.
+.section .data
+test:
+ .word 0x0
+
+.section .text
+.globl _start
+
+# CHECK-0: 0000000000400000 <_start>
+# CHECK-4: 0000000000400004 <_start>
+# CHECK-8: 0000000000400008 <_start>
+_start:
+ adr x0, test
+ ret
|
Would this be more useful as an RNG controlled feature like the LLD feature? |
I am thinking that this is complementary to the seeded approach. In this case it's a surgical tool which can be used to deliberately modify a binary in a particular way with minimal incidental changes. I would also be interested in adding a seed-based approach at some point. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left the comments on the code as it is. I'll post a comment on the RFC since I'd like to understand the possible application of this feature.
0a56fcb
to
cd8f7d3
Compare
I see the CI failure. I'll ping when resolved. |
cd8f7d3
to
8413e3c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks. Please address a few nits before committing.
8413e3c
to
579da12
Compare
This complements --pad-funcs, and by using both simultaneously, enables moving a specific function through a binary without modifying any code other than the targeted function (and references to it) by doing (before+after=constant). See also: proposed functionality to enable inserting random padding in https://discourse.llvm.org/t/rfc-lld-feature-for-controlling-for-code-size-dependent-measurement-bias
579da12
to
7866f61
Compare
@@ -70,11 +74,12 @@ X86AlignBranchBoundaryHotOnly("x86-align-branch-boundary-hot-only", | |||
cl::init(true), | |||
cl::cat(BoltOptCategory)); | |||
|
|||
size_t padFunction(const BinaryFunction &Function) { | |||
size_t padFunction(const cl::list<std::string> &Spec, | |||
const BinaryFunction &Function) { | |||
static std::map<std::string, size_t> FunctionPadding; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an issue here with the static map.
If either opts::FunctionPadSpec
or opts::FunctionPadBeforeSpec
are set, the map is going to be populated with the respective spec in the first invocation of BinaryEmitter::emitFunction
. The subsequent invocations will pick up the padding from the map irrespective of whether opts::FunctionPadSpec
or opts::FunctionPadBeforeSpec
is passed as a parameter.
This breaks one of our internal tests so I'm going to revert this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for spotting and the revert. Fix is at #121918.
14dcf82 introduced a subtle bug with the static `FunctionPadding` map. If either `opts::FunctionPadSpec` or `opts::FunctionPadBeforeSpec` are set, the map is going to be populated with the respective spec in the first invocation of `BinaryEmitter::emitFunction`. The subsequent invocations will pick up the padding from the map irrespective of whether `opts::FunctionPadSpec` or `opts::FunctionPadBeforeSpec` is passed as a parameter. This breaks an internal test, hence reverting the patch.
- **Reapply "[BOLT] Add --pad-funcs-before=func:n (#117924)"** - **[BOLT] Fix --pad-funcs{,-before} state misinteraction** When --pad-funcs-before was introduced, it introduced a bug whereby the first one to get parsed could influence the other. Ensure that each has its own state and test that they don't interact in this manner by testing how the `_subsequent` symbol moves when both arguments are supplied with different padding values. Fixed by having a function (and static state) for each of before/after.
This complements --pad-funcs, and by using both simultaneously, enables
moving a specific function through the address space without modifying any code
other than the targeted function (and references to it) by doing
(before+after=constant).
See also: proposed functionality to enable inserting random padding in
https://discourse.llvm.org/t/rfc-lld-feature-for-controlling-for-code-size-dependent-measurement-bias
and #117653