-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[LLD][COFF] Always locate the IAT at the beginning of the .rdata section and align its size to 4KB on ARM64EC. #107588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lld-coff @llvm/pr-subscribers-lld Author: Jacek Caban (cjacek) ChangesThis mimics the behavior of MSVC's link.exe. My guess is that the reason for this approach is to facilitate tracking runtime IAT modifications. An auxiliary IAT allows bypassing the call checker for imported function calls. It's the OS's responsibility to ensure that, if runtime patching occurs, the auxiliary IAT is reverted to enable call checking. Modifying the IAT is a form of runtime patching, and ensuring that it doesn’t share pages with other data likely helps with tracking accuracy. Although alignment alone should ensure that the IAT occupies its own pages, placing it at the beginning of the .rdata section might be an optimization. This way, padding is only needed after the IAT, not before. The auxiliary IAT seems to follow a similar idea but is positioned at the end of the .rdata section. Full diff: https://github.com/llvm/llvm-project/pull/107588.diff 2 Files Affected:
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 4a0eed4d00997e..3e218508e0c1b4 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -217,6 +217,7 @@ class Writer {
void createExportTable();
void mergeSections();
void sortECChunks();
+ void appendECImportTables();
void removeUnusedSections();
void assignAddresses();
bool isInRange(uint16_t relType, uint64_t s, uint64_t p, int margin,
@@ -753,6 +754,7 @@ void Writer::run() {
createExportTable();
mergeSections();
sortECChunks();
+ appendECImportTables();
removeUnusedSections();
finalizeAddresses();
removeEmptySections();
@@ -914,6 +916,28 @@ void Writer::addSyntheticIdata() {
add(".idata$7", idata.dllNames);
}
+void Writer::appendECImportTables() {
+ if (!isArm64EC(ctx.config.machine))
+ return;
+
+ const uint32_t rdata = IMAGE_SCN_CNT_INITIALIZED_DATA | IMAGE_SCN_MEM_READ;
+
+ // IAT is always placed at the begining of .rdata section and its size
+ // is aligned to 4KB. Insert it now, after all merges all done.
+ if (PartialSection *importAddresses = findPartialSection(".idata$5", rdata)) {
+ if (!rdataSec->chunks.empty())
+ rdataSec->chunks.front()->setAlignment(
+ std::max(0x1000u, rdataSec->chunks.front()->getAlignment()));
+ iatSize = alignTo(iatSize, 0x1000);
+
+ rdataSec->chunks.insert(rdataSec->chunks.begin(),
+ importAddresses->chunks.begin(),
+ importAddresses->chunks.end());
+ rdataSec->contribSections.insert(rdataSec->contribSections.begin(),
+ importAddresses);
+ }
+}
+
// Locate the first Chunk and size of the import directory list and the
// IAT.
void Writer::locateImportTables() {
@@ -1069,6 +1093,11 @@ void Writer::createSections() {
sortCRTSectionChunks(pSec->chunks);
}
+ // ARM64EC has special placement and alignment requirements for IAT.
+ // Delay adding its chunks to appendECImportTables.
+ if (isArm64EC(ctx.config.machine) && pSec->name == ".idata$5")
+ continue;
+
OutputSection *sec = createSection(name, outChars);
for (Chunk *c : pSec->chunks)
sec->addChunk(c);
diff --git a/lld/test/COFF/arm64ec-import.test b/lld/test/COFF/arm64ec-import.test
index 2a80a30910b5c7..b1c47d785e445b 100644
--- a/lld/test/COFF/arm64ec-import.test
+++ b/lld/test/COFF/arm64ec-import.test
@@ -20,7 +20,7 @@ RUN: llvm-readobj --coff-imports out2.dll | FileCheck --check-prefix=IMPORTS %s
IMPORTS: Import {
IMPORTS-NEXT: Name: test.dll
IMPORTS-NEXT: ImportLookupTableRVA:
-IMPORTS-NEXT: ImportAddressTableRVA: 0x2258
+IMPORTS-NEXT: ImportAddressTableRVA: 0x2000
IMPORTS-NEXT: Symbol: data (0)
IMPORTS-NEXT: Symbol: func (0)
IMPORTS-NEXT: Symbol: func2 (0)
@@ -28,7 +28,7 @@ IMPORTS-NEXT: }
IMPORTS-NEXT: Import {
IMPORTS-NEXT: Name: test2.dll
IMPORTS-NEXT: ImportLookupTableRVA:
-IMPORTS-NEXT: ImportAddressTableRVA: 0x2278
+IMPORTS-NEXT: ImportAddressTableRVA: 0x2020
IMPORTS-NEXT: Symbol: t2func (0)
IMPORTS-NEXT: }
@@ -36,12 +36,17 @@ RUN: llvm-objdump -d out.dll | FileCheck --check-prefix=DISASM %s
RUN: llvm-objdump -d out2.dll | FileCheck --check-prefix=DISASM %s
DISASM: 0000000180001000 <.text>:
-DISASM-NEXT: 180001000: ff 25 5a 12 00 00 jmpq *0x125a(%rip) # 0x180002260
+DISASM-NEXT: 180001000: ff 25 02 10 00 00 jmpq *0x1002(%rip) # 0x180002008
RUN: llvm-readobj --hex-dump=.test out.dll | FileCheck --check-prefix=TESTSEC %s
RUN: llvm-readobj --hex-dump=.test out2.dll | FileCheck --check-prefix=TESTSEC %s
-TESTSEC: 0x180004000 60220000 58220000 68220000 78220000
-TESTSEC-NEXT: 0x180004010 00100000
+TESTSEC: 0x180005000 08200000 00200000 10200000 20200000
+TESTSEC-NEXT: 0x180005010 00100000
+
+RUN: llvm-readobj --headers out.dll | FileCheck -check-prefix=HEADERS %s
+HEADERS: LoadConfigTableRVA: 0x3008
+HEADERS: IATRVA: 0x2000
+HEADERS: IATSize: 0x1000
#--- test.s
.section .test, "r"
|
One difference from MSVC's link.exe is that, if .rdata is merged into another section, MSVC places the IATs within that section, whereas this PR hardcodes the .rdata placement. While it might be possible to replicate MSVC's behavior, this would likely require calling assignOutputSectionIndices() earlier. |
…ion and align its size to 4KB on ARM64EC. This mimics the behavior of MSVC's link.exe. My guess is that the reason for this approach is to facilitate tracking runtime IAT modifications. An auxiliary IAT allows bypassing the call checker for imported function calls. It's the OS's responsibility to ensure that, if runtime patching occurs, the auxiliary IAT is reverted to enable call checking. Modifying the IAT is a form of runtime patching, and ensuring that it doesn’t share pages with other data likely helps with tracking accuracy. Although alignment alone should ensure that the IAT occupies its own pages, placing it at the beginning of the .rdata section might be an optimization. This way, padding is only needed after the IAT, not before. The auxiliary IAT seems to follow a similar idea but is positioned at the end of the .rdata section.
cf3cda1
to
c46a8cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
I had browsed the previous version of this patch in your branch before, but the version you're submitting here is slightly different - can you elaborate on where you moved it, what implications it has (previously I think you only added one set of chunks somewhere, but now you're adding things to contribSections
too) and why you changed your approach?
(I guess normally one doesn't have to answer about code changes prior to submission, but I somewhat felt that the previous version that I've seen felt more straightforward, but I guess this one is more complete/correct somehow?)
Yes, I believe the new version is more correct, though the practical difference may be minimal. The main distinction lies in how I tested this by using an x86_64 import library generated by binutils. Since such a library is unaware of the auxiliary IAT, the resulting binary isn't quite correct and is unlikely to function properly. However, it would be possible to create an ARM64EC-aware library that handles things correctly. This would require additional By the way, when such invalid import libraries are used, we may want to emit some diagnostics. This kind of mistake is not obvious and could be easily made in a MinGW environment. MSVC handles it by checking if the IAT aligns with the auxiliary IAT, and if not, it emits a warning like:
(I think this warning is somewhat misleading - it’s not just a performance issue; in my test case, it would actually lead to crashes.) As for the partial section insertion, my goal was to make the LLD state as close to the "regular" chunk addition process as possible. Since we use contributing sections when emitting PDBs, this should make that process more accurate. MSVC also emits partial sections in the map file (which is how I discovered the section names for auxiliary IATs). While LLD doesn’t do this yet, if we ever decide to, the code from this PR will be ready for it. |
Thanks! That sounds like good reasons - the less we assume about whether idata is synthesized or sourced from long import libraries, the better. |
Pushed, thanks! |
This mimics the behavior of MSVC's link.exe. My guess is that the reason for this approach is to facilitate tracking runtime IAT modifications. An auxiliary IAT allows bypassing the call checker for imported function calls. It's the OS's responsibility to ensure that, if runtime patching occurs, the auxiliary IAT is reverted to enable call checking. Modifying the IAT is a form of runtime patching, and ensuring that it doesn’t share pages with other data likely helps with tracking accuracy.
Although alignment alone should ensure that the IAT occupies its own pages, placing it at the beginning of the .rdata section might be an optimization. This way, padding is only needed after the IAT, not before. The auxiliary IAT seems to follow a similar idea but is positioned at the end of the .rdata section.