-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[BOLT] Support instrumentation hook via DT_FINI_ARRAY #67348
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! These are my first impressions, I would probably return to re-reviewing this patch in a few days :)
Relocation for 64-bit absolute values. Note that this patch also implements two `Relocation` methods (`encodeValue` and `getAbs64`) that will become necessary to support instrumentation of non-PIE binaries (see llvm#67348). However, I didn't find a way to test those independent of instrumentation.
Added tests for the cases you suggested. Note that I tested on AArch64 because of the issues on x86 as explained in the description of this PR. This meant I had to update
Instead of doing this, I simply run the instrumented binary in the tests and verify the generated profile makes sense. I feel this should be enough because it implicitly verifies the fini function was hooked correctly. What do you think? |
Thanks, I would check the test a bit later. But in BOLT we are trying to avoid runtime tests where it is possible, for example because there are no aarch64 tests machines right now. I'm OK with adding both runtime and non-runtime tests, but I think runtime-only for this case is not enough, sorry :( |
With "runtime tests" you mean tests where a binary is executed? Or tests that trigger runtime instrumentation? As far as I can tell, the only way to trigger this new functionality is to instrument the binary. And the only way to instrument an AArch64 binary, is to compile and run BOLT on an AArch64 machine (the runtime library cannot currently be cross-compiled). Or am I missing something? |
I mean that tests in runtime folder are going to be executed. So ideally we would like the non-runtime test without binary execution to check the functionality. It is true that currently we're not supporting cross-compilation for runtime binary, that is why for non-runtime tests we have extra arch requirement. But we have this task in our backlog, so hopefully it would be changed and we would be able to run these tests too. |
Just to be sure: you also don't see another way to test this so we keep the current test? |
We can keep the current test I guess, but it would be better to add non-runtime.. Since currently it relies on the binary successfully exit - no, I don't. The non-runtime test should be easy to do, but it is just monotonic job, although the filecheck prefixes would be repeated across the multiple tests, so I guess it won't be too hard to find the needed symbol and check relocation(s) against it.. |
Just to understand things better: why is this a problem?
As far as I understand, the problem is not writing the test but actually running it since we currently need an AArch64 host for that. And if we have an AArch64 host, we might as well just execute the instrumented binary for testing purposes. Or am I missing something? |
No, you're right about it :)
I think you might misunderstood me. It is not a problem to have this runtime test, I only say that we prefer the non runtime tests in BOLT. We still can have this one, it is fine, but ideally since it is easy to arrange I would suggest to add non runtime test too. |
I guess I wanted to know why non-runtime tests are preferred. Is it so that we can run them once we have a way to cross-compile the runtime library? I tried adding non-runtime tests but the problem is that |
True. Also in this case you would check your changes directly, which is nice too :)
I see. Yes hardcode is not a good option. Let me think a bit about it too. A guess it would be enough to check that the address is somewhere in inserted section, but it might be tricky to implement this too. |
Hi @mtvec . Sorry for late response, didn't have time to study the problem. Anyway I came up that adding proper INIT/FINI symbols to symtab would be the most right way to deal with your test problems. Fill free to rebase your patch on top of the #68505 , I think it would eliminate your problem with creating a test. Thanks! |
Add absent start & fini symbols, currently setted by bolt for runtime libraries at DT_INIT and DT_FINI. The proper tests would be added by the llvm#67348 PR.
Thanks for that @yota9! I added non-runtime tests. |
Add absent start & fini symbols, currently setted by bolt for runtime libraries at DT_INIT and DT_FINI. The proper tests would be added by the #67348 PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the test! I've leaved couple of comments, would return to this review in a couple of days to re-review it last time :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please rebase the changes, I've committed adding symbols today.
Sure, I was postponing this to avoid comments getting messed up (which apparently happens sometime when force-pushing). Let's see if it is ok :) |
624d9be
to
ab11fbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch :)
Thanks a lot for your help reviewing this patch @yota9! @maksfb @rafaelauler @aaupov: does this look good for you as well? |
Ping. |
Overall looks good to me.
What about DT_INIT, what does the ABI say? Usually there is symmetry between INIT and FINI, INITARRAY and FINIARRAY. BTW ideally we would like to add a new entry to dynamic table and insert ourselves into DT_PREINIT_ARRAY, but this is outside the scope of this patch. We don't currently rewrite the dynamic table so we are not doing that for this reason. We need to run our code before every constructor, that's why PREINIT is more adequate. EDIT: PREINIT_ARRAY is only for binaries, they're not used for DSOs. The context on this issue is that we currently might crash with instrumentation if your binary runs constructors in a dependent lib, and that dependent lib runs instrumented code from your binary. PREINIT_ARRAY will guarantee that we run before any constructors in DSOs linked in DT_NEEDED.
If we're being pedantic, I think we need to patch the last entry of FINI_ARRAY, not the first. If you patch the first entry, you will still run destructors after your runtime lib. But I don't think we're going to crash because of this (at least for instrumentation), since instrumentation finalization code will just dump profile. No need to implement this, but we should at least have a comment documenting this behavior, so nobody is surprised in the future if they try to implement some runtime lib that expects to run its finalization code after application code. |
At least on RISC-V,
This indeed makes sense for a future enhancement.
The entries in |
BOLT currently hooks its its instrumentation finalization function via `DT_FINI`. However, this method of calling finalization routines is not supported anymore on newer ABIs like RISC-V. `DT_FINI_ARRAY` is preferred there. This patch adds support for hooking into `DT_FINI_ARRAY` instead if the binary does not have a `DT_FINI` entry. If it does, `DT_FINI` takes precedence so this patch should not change how the currently supported instrumentation targets behave. `DT_FINI_ARRAY` points to an array in memory of `DT_FINI_ARRAYSZ` bytes. It consists of pointer-length entries that contain the addresses of finalization functions. However, the addresses are only filled-in by the dynamic linker at load time using relative relocations. This makes hooking via `DT_FINI_ARRAY` a bit more complicated than via `DT_FINI`. The implementation works as follows: - While scanning the binary: find the section where `DT_FINI_ARRAY` points to, read its first dynamic relocation and use its addend to find the address of the fini function we will use to hook; - While writing the output file: overwrite the addend of the dynamic relocation with the address of the runtime library's fini function. Updating the dynamic relocation required a bit of boiler plate: since dynamic relocations are stored in a `std::multiset` which doesn't support getting mutable references to its items, functions were added to `BinarySection` to take an existing relocation and insert a new one. Note on testing: I can currently only test this on RISC-V, but its instrumentation support hasn't been upstreamed yet as it depends on this patch which I would like to review separately. I cannot get this patch to work on x86, even when forcing the linker to omit `DT_FINI`. The reason is that, even though `DT_FINI_ARRAY` exists and has valid entries, the functions do not show up in the symbols table which triggers some asserts later in BOLT. I would propose to not add tests right now but wait until RISC-V instrumentation is upstreamed which will automatically test this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
BOLT currently hooks its its instrumentation finalization function via `DT_FINI`. However, this method of calling finalization routines is not supported anymore on newer ABIs like RISC-V. `DT_FINI_ARRAY` is preferred there. This patch adds support for hooking into `DT_FINI_ARRAY` instead if the binary does not have a `DT_FINI` entry. If it does, `DT_FINI` takes precedence so this patch should not change how the currently supported instrumentation targets behave. `DT_FINI_ARRAY` points to an array in memory of `DT_FINI_ARRAYSZ` bytes. It consists of pointer-length entries that contain the addresses of finalization functions. However, the addresses are only filled-in by the dynamic linker at load time using relative relocations. This makes hooking via `DT_FINI_ARRAY` a bit more complicated than via `DT_FINI`. The implementation works as follows: - While scanning the binary: find the section where `DT_FINI_ARRAY` points to, read its first dynamic relocation and use its addend to find the address of the fini function we will use to hook; - While writing the output file: overwrite the addend of the dynamic relocation with the address of the runtime library's fini function. Updating the dynamic relocation required a bit of boiler plate: since dynamic relocations are stored in a `std::multiset` which doesn't support getting mutable references to its items, functions were added to `BinarySection` to take an existing relocation and insert a new one. Note on testing: I can currently only test this on RISC-V, but its instrumentation support hasn't been upstreamed yet as it depends on this patch which I would like to review separately. I cannot get this patch to work on x86, even when forcing the linker to omit `DT_FINI`. The reason is that, even though `DT_FINI_ARRAY` exists and has valid entries, the functions do not show up in the symbols table which triggers some asserts later in BOLT. I would propose to not add tests right now but wait until RISC-V instrumentation is upstreamed which will automatically test this patch.
e67bb1a
to
823bd8a
Compare
Please check test failure on aarch64 buildbot: https://lab.llvm.org/buildbot/#/builders/221/builds/20084 |
Woops, should be fixed by c4b096a. |
BOLT currently hooks its its instrumentation finalization function via
DT_FINI
. However, this method of calling finalization routines is not supported anymore on newer ABIs like RISC-V.DT_FINI_ARRAY
is preferred there.This patch adds support for hooking into
DT_FINI_ARRAY
instead if the binary does not have aDT_FINI
entry. If it does,DT_FINI
takes precedence so this patch should not change how the currently supported instrumentation targets behave.DT_FINI_ARRAY
points to an array in memory ofDT_FINI_ARRAYSZ
bytes. It consists of pointer-length entries that contain the addresses of finalization functions. However, the addresses are only filled-in by the dynamic linker at load time using relative relocations. This makes hooking viaDT_FINI_ARRAY
a bit more complicated than viaDT_FINI
.The implementation works as follows:
DT_FINI_ARRAY
points to, read its first dynamic relocation and use its addend to find the address of the fini function we will use to hook;Updating the dynamic relocation required a bit of boiler plate: since dynamic relocations are stored in a
std::multiset
which doesn't support getting mutable references to its items, functions were added toBinarySection
to take an existing relocation and insert a new one.Note on testing: I can currently only test this on RISC-V, but its instrumentation support hasn't been upstreamed yet as it depends on this patch which I would like to review separately. I cannot get this patch to work on x86, even when forcing the linker to omit
DT_FINI
. The reason is that, even thoughDT_FINI_ARRAY
exists and has valid entries, the functions do not show up in the symbols table which triggers some asserts later in BOLT. I would propose to not add tests right now but wait until RISC-V instrumentation is upstreamed which will automatically test this patch.