Skip to content

[BOLT] Enable hugify for AArch64 #117158

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Apr 15, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion bolt/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,7 @@ if (LLVM_INCLUDE_TESTS)
endif()

if (BOLT_ENABLE_RUNTIME)
message(STATUS "Building BOLT runtime libraries for X86")
message(STATUS "Building BOLT runtime libraries for ${CMAKE_SYSTEM_PROCESSOR}")
set(extra_args "")
if(CMAKE_SYSROOT)
list(APPEND extra_args -DCMAKE_SYSROOT=${CMAKE_SYSROOT})
Expand Down
20 changes: 16 additions & 4 deletions bolt/lib/Rewrite/RewriteInstance.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -597,8 +597,9 @@ Error RewriteInstance::discoverStorage() {

// Hugify: Additional huge page from left side due to
// weird ASLR mapping addresses (4KB aligned)
if (opts::Hugify && !BC->HasFixedLoadAddress)
if (opts::Hugify && !BC->HasFixedLoadAddress) {
NextAvailableAddress += BC->PageAlign;
}

if (!opts::UseGnuStack && !BC->IsLinuxKernel) {
// This is where the black magic happens. Creating PHDR table in a segment
Expand Down Expand Up @@ -5858,17 +5859,28 @@ void RewriteInstance::rewriteFile() {

// Write all allocatable sections - reloc-mode text is written here as well
for (BinarySection &Section : BC->allocatableSections()) {
if (!Section.isFinalized() || !Section.getOutputData())
if (!Section.isFinalized() || !Section.getOutputData()) {
LLVM_DEBUG(if (opts::Verbosity > 1) {
dbgs() << "BOLT-INFO: new section is finalized or !getOutputData, skip "
<< Section.getName() << '\n';
});
continue;
if (Section.isLinkOnly())
}
if (Section.isLinkOnly()) {
LLVM_DEBUG(if (opts::Verbosity > 1) {
dbgs() << "BOLT-INFO: new section is link only, skip "
<< Section.getName() << '\n';
});
continue;
}

if (opts::Verbosity >= 1)
BC->outs() << "BOLT: writing new section " << Section.getName()
<< "\n data at 0x"
<< Twine::utohexstr(Section.getAllocAddress()) << "\n of size "
<< Section.getOutputSize() << "\n at offset "
<< Section.getOutputFileOffset() << '\n';
<< Section.getOutputFileOffset() << " with content size "
<< Section.getOutputContents().size() << '\n';
OS.seek(Section.getOutputFileOffset());
Section.write(OS);
}
Expand Down
6 changes: 4 additions & 2 deletions bolt/runtime/common.h
Original file line number Diff line number Diff line change
Expand Up @@ -151,10 +151,12 @@ struct timespec {
uint64_t tv_nsec; /* nanoseconds */
};

#if defined(__aarch64__)
#if defined(__aarch64__) || defined(__arm64__)
#include "sys_aarch64.h"
#else
#elif defined(__x86_64__)
#include "sys_x86_64.h"
#else
#error "For AArch64/ARM64 and X86_64 only."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: "Arch is not supported" probably better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your valuable comment. Let me disagree on this one with you.

Copy link
Member

@yota9 yota9 Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, someone who would add support for RISC-V for example would absolutely change it "For AArch64/ARM64 and X86_64 and RISC-V only." as it is very convenient :)

Copy link
Contributor Author

@alekuz01 alekuz01 Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! Why not? Or it just removes this line entirely because of this:

# Determine default set of targets to build -- the intersection of                                                                                                                         
# those BOLT supports and those LLVM is targeting.                                                                                                                                         
set(BOLT_TARGETS_TO_BUILD_all "AArch64;X86;RISCV") 

But if you worry about the cognitive effort that might be hard to understand as you demonstrated in your case:

"AArch64;X86;RISCV archs are supported!"

Copy link
Member

@yota9 yota9 Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Too pi ty that you don't understand that good code doesn't have to be changed each time. Not that I worried about my cognitive efforts, but your mental health. Sorry, not interested in further discussions with such an arrogant tool like you :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do apologise for any inconvenience you have had because of this. However, Could you be more specific what exactly was so arrogant towards you based on fact that you have used words like "garbage" instead of more appropriate disagreement regarding appropriate style of error messages.

Copy link
Member

@yota9 yota9 Mar 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I wrote exactly this word because of your previous comments about my "helpful and valuable comments" with barely hidden aggression. Please, don't play a victim now. Anyway it was addressed to output, not you or your work.
I'm not payed to work with you and just not interested in further discussions, maybe someone else dare to write any comments to your review if they're ready to have such a nice conversations.

#endif

constexpr uint32_t BufSize = 10240;
Expand Down
21 changes: 16 additions & 5 deletions bolt/runtime/hugify.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,8 @@
//
//===---------------------------------------------------------------------===//

#if defined (__x86_64__) && !defined(__APPLE__)
#if defined(__x86_64__) || \
(defined(__aarch64__) || defined(__arm64__)) && !defined(__APPLE__)

#include "common.h"

Expand Down Expand Up @@ -73,8 +74,10 @@ static bool hasPagecacheTHPSupport() {
if (Res < 0)
return false;

if (!strStr(Buf, "[always]") && !strStr(Buf, "[madvise]"))
if (!strStr(Buf, "[always]") && !strStr(Buf, "[madvise]")) {
DEBUG(report("[hugify] THP support is not enabled.\n");)
return false;
}

struct KernelVersionTy {
uint32_t major;
Expand Down Expand Up @@ -167,12 +170,20 @@ extern "C" void __bolt_hugify_self_impl() {

/// This is hooking ELF's entry, it needs to save all machine state.
extern "C" __attribute((naked)) void __bolt_hugify_self() {
// clang-format off
#if defined(__x86_64__)
__asm__ __volatile__(SAVE_ALL "call __bolt_hugify_self_impl\n" RESTORE_ALL
"jmp __bolt_hugify_start_program\n" ::
:);
"jmp __bolt_hugify_start_program\n"
:::);
#elif defined(__aarch64__) || defined(__arm64__)
__asm__ __volatile__(SAVE_ALL "bl __bolt_hugify_self_impl\n" RESTORE_ALL
"adrp x16, __bolt_hugify_start_program\n"
"add x16, x16, #:lo12:__bolt_hugify_start_program\n"
"br x16\n"
:::);
#else
exit(1);
__exit(1);
#endif
// clang-format on
}
#endif
17 changes: 14 additions & 3 deletions bolt/test/runtime/X86/hugify.c → bolt/test/runtime/hugify.c
Original file line number Diff line number Diff line change
Expand Up @@ -11,17 +11,28 @@ int main(int argc, char **argv) {
REQUIRES: system-linux,bolt-runtime

RUN: %clang %cflags -no-pie %s -o %t.nopie.exe -Wl,-q
RUN: %clang %cflags -fpic -pie %s -o %t.pie.exe -Wl,-q
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I ask why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! You may ask. I hope I could respond to this comment with the following explanation:

  • llvm uses lit for testing
  • lit has its own configuration files, for example lit.local.cfg (bolt/runtime, bolt/test)
  • if you can check those files you can find there: -pie -fPIE flags. Those flags are applied to the mentioned binary by default. Except -fpic.
+ bin/clang --target=aarch64-unknown-linux-gnu -fPIE -fuse-ld=lld -Wl,--unresolved-symbols=ignore-all -Wl,--build-id=none -pie llvm-project/bolt/test/runtime/hugify.c -o tools/bolt/test/runtime/Output/hugify.c.tmp.pie.exe -Wl,-q

I think your question is about whether to use -fpicor not in this case as -pie is used by default?

+ bin/clang --target=aarch64-unknown-linux-gnu -fPIE -fuse-ld=lld -Wl,--unresolved-symbols=ignore-all -Wl,--build-id=none -pie -fpic /llvm-project/bolt/test/runtime/hugify.c -o tools/bolt/test/runtime/Output/hugify.c.tmp.pie.exe -Wl,-q 

Should we discuss what exactly -fpic would check here? Would it affect the output of nm?

RUN: %clang %cflags -fpic %s -o %t.pie.exe -Wl,-q

RUN: llvm-bolt %t.nopie.exe --lite=0 -o %t.nopie --hugify
RUN: llvm-bolt %t.pie.exe --lite=0 -o %t.pie --hugify

RUN: llvm-nm --numeric-sort --print-armap %t.nopie | \
RUN: FileCheck %s -check-prefix=CHECK-NM
RUN: %t.nopie | FileCheck %s -check-prefix=CHECK-NOPIE

CHECK-NOPIE: Hello world

RUN: llvm-nm --numeric-sort --print-armap %t.pie | \
RUN: FileCheck %s -check-prefix=CHECK-NM
RUN: %t.pie | FileCheck %s -check-prefix=CHECK-PIE

CHECK-NM: W __hot_start
CHECK-NM-NEXT: T _start
CHECK-NM: T main
CHECK-NM: W __hot_end
CHECK-NM: t __bolt_hugify_start_program
CHECK-NM-NEXT: W __bolt_runtime_start

CHECK-NOPIE: Hello world

CHECK-PIE: Hello world

*/
Original file line number Diff line number Diff line change
Expand Up @@ -5,19 +5,15 @@
*/
#include <stdio.h>

int foo(int x) {
return x + 1;
}
int foo(int x) { return x + 1; }

int fib(int x) {
if (x < 2)
return x;
return fib(x - 1) + fib(x - 2);
}

int bar(int x) {
return x - 1;
}
int bar(int x) { return x - 1; }

int main(int argc, char **argv) {
printf("fib(%d) = %d\n", argc, fib(argc));
Expand All @@ -31,14 +27,28 @@ RUN: %clang %cflags -no-pie %s -o %t.exe -Wl,-q

RUN: llvm-bolt %t.exe --relocs=1 --lite --reorder-functions=user \
RUN: --hugify --function-order=%p/Inputs/user_func_order.txt -o %t
RUN: llvm-bolt %t.exe --relocs=1 --lite --reorder-functions=user \
RUN: --function-order=%p/Inputs/user_func_order.txt -o %t.nohugify
RUN: llvm-nm --numeric-sort --print-armap %t | \
RUN: FileCheck %s -check-prefix=CHECK-NM
RUN: %t 1 2 3 | FileCheck %s -check-prefix=CHECK-OUTPUT
RUN: llvm-nm --numeric-sort --print-armap %t.nohugify | \
RUN: FileCheck %s -check-prefix=CHECK-NM-NOHUGIFY
RUN: %t.nohugify 1 2 3 | FileCheck %s -check-prefix=CHECK-OUTPUT-NOHUGIFY


CHECK-NM: W __hot_start
CHECK-NM: T main
CHECK-NM-NEXT: T fib
CHECK-NM-NEXT: W __hot_end
CHECK-NM: t __bolt_hugify_start_program
CHECK-NM-NEXT: W __bolt_runtime_start

CHECK-NM-NOHUGIFY: W __hot_start
CHECK-NM-NOHUGIFY: T main
CHECK-NM-NOHUGIFY-NEXT: T fib
CHECK-NM-NOHUGIFY-NEXT: W __hot_end

CHECK-OUTPUT: fib(4) = 3
CHECK-OUTPUT-NOHUGIFY: fib(4) = 3
*/