-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[LLD] [COFF] Error out if new LTO objects are pulled in after the main LTO compilation #71337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…n LTO compilation Normally, this shouldn't happen. It can happen in exceptional circumstances, if the compiled output of a bitcode object file references symbols that weren't listed as undefined in the bitcode object file itself. This can at least happen in the following cases: - A custom SEH personality is set via asm() - Compiler generated calls to builtin helper functions, such as __chkstk, or __rt_sdiv on arm Both of these produce undefined references to symbols after compiling to a regular object file, that aren't visible on the level of the IR object file. This is only an issue if the referenced symbols are provided as LTO objects themselves; loading regular object files after the LTO compilation works fine. Custom SEH personalities are rare, but one CRT startup file in mingw-w64 does this. The referenced pesonality function is usually provided via an import library, but for WinStore targets, a local dummy reimplementation in C is used, which can be an LTO object. Generated calls to builtins is very common, but the builtins aren't usually provided as LTO objects (compiler-rt's builtins explicitly pass -fno-lto when building), and many of the builtins are provided as raw .S assembly files, which don't get built as LTO objects anyway, even if built with -flto. If hitting this unusual, but possible, situation, error out cleanly with a clear message rather than crashing.
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lto Author: Martin Storsjö (mstorsjo) ChangesNormally, this shouldn't happen. It can happen in exceptional circumstances, if the compiled output of a bitcode object file references symbols that weren't listed as undefined in the bitcode object file itself. This can at least happen in the following cases:
Both of these produce undefined references to symbols after compiling to a regular object file, that aren't visible on the level of the IR object file. This is only an issue if the referenced symbols are provided as LTO objects themselves; loading regular object files after the LTO compilation works fine. Custom SEH personalities are rare, but one CRT startup file in mingw-w64 does this. The referenced pesonality function is usually provided via an import library, but for WinStore targets, a local dummy reimplementation in C is used, which can be an LTO object. Generated calls to builtins is very common, but the builtins aren't usually provided as LTO objects (compiler-rt's builtins explicitly pass -fno-lto when building), and many of the builtins are provided as raw .S assembly files, which don't get built as LTO objects anyway, even if built with -flto. If hitting this unusual, but possible, situation, error out cleanly with a clear message rather than crashing. Full diff: https://github.com/llvm/llvm-project/pull/71337.diff 4 Files Affected:
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index d9673e159769fbd..15c76461a84a92c 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -61,6 +61,10 @@ void SymbolTable::addFile(InputFile *file) {
if (auto *f = dyn_cast<ObjFile>(file)) {
ctx.objFileInstances.push_back(f);
} else if (auto *f = dyn_cast<BitcodeFile>(file)) {
+ if (ltoCompilationDone) {
+ error("LTO object file " + toString(file) + " linked in after "
+ "doing LTO compilation.");
+ }
ctx.bitcodeFileInstances.push_back(f);
} else if (auto *f = dyn_cast<ImportFile>(file)) {
ctx.importFileInstances.push_back(f);
@@ -876,6 +880,7 @@ Symbol *SymbolTable::addUndefined(StringRef name) {
}
void SymbolTable::compileBitcodeFiles() {
+ ltoCompilationDone = true;
if (ctx.bitcodeFileInstances.empty())
return;
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index 511e60d1e3a0873..fc623c2840d401b 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -133,6 +133,7 @@ class SymbolTable {
llvm::DenseMap<llvm::CachedHashStringRef, Symbol *> symMap;
std::unique_ptr<BitcodeCompiler> lto;
+ bool ltoCompilationDone = false;
COFFLinkerContext &ctx;
};
diff --git a/lld/test/COFF/lto-late-arm.ll b/lld/test/COFF/lto-late-arm.ll
new file mode 100644
index 000000000000000..0e2f148ef74c6c8
--- /dev/null
+++ b/lld/test/COFF/lto-late-arm.ll
@@ -0,0 +1,39 @@
+; REQUIRES: arm
+
+;; A bitcode file can generate undefined references to symbols that weren't
+;; listed as undefined on the bitcode file itself, when lowering produces
+;; calls to e.g. builtin helper functions. If these functions are provided
+;; as LTO bitcode, the linker would hit an unhandled state. (In practice,
+;; compiler-rt builtins are always compiled with -fno-lto, so this shouldn't
+;; happen.)
+
+; RUN: rm -rf %t.dir
+; RUN: split-file %s %t.dir
+; RUN: llvm-as %t.dir/main.ll -o %t.main.obj
+; RUN: llvm-as %t.dir/sdiv.ll -o %t.sdiv.obj
+; RUN: llvm-ar rcs %t.sdiv.lib %t.sdiv.obj
+
+; RUN: env LLD_IN_TEST=1 not lld-link /entry:entry %t.main.obj %t.sdiv.lib /out:%t.exe /subsystem:console 2>&1 | FileCheck %s
+
+; CHECK: error: LTO object file lto-late-arm.ll.tmp.sdiv.lib(lto-late-arm.ll.tmp.sdiv.obj) linked in after doing LTO compilation.
+
+;--- main.ll
+target datalayout = "e-m:w-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7-w64-windows-gnu"
+
+@num = dso_local global i32 100
+
+define dso_local arm_aapcs_vfpcc i32 @entry(i32 %param) {
+entry:
+ %0 = load i32, ptr @num
+ %div = sdiv i32 %0, %param
+ ret i32 %div
+}
+;--- sdiv.ll
+target datalayout = "e-m:w-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+target triple = "thumbv7-w64-windows-gnu"
+
+define dso_local arm_aapcs_vfpcc void @__rt_sdiv() {
+entry:
+ ret void
+}
diff --git a/lld/test/COFF/lto-late-personality.ll b/lld/test/COFF/lto-late-personality.ll
new file mode 100644
index 000000000000000..f768f65f4521e40
--- /dev/null
+++ b/lld/test/COFF/lto-late-personality.ll
@@ -0,0 +1,48 @@
+; REQUIRES: x86
+
+;; A bitcode file can generate undefined references to symbols that weren't
+;; listed as undefined on the bitcode file itself, if there's a reference to
+;; an unexpected personality routine via asm(). If the personality function
+;; is provided as LTO bitcode, the linker would hit an unhandled state.
+
+; RUN: rm -rf %t.dir
+; RUN: split-file %s %t.dir
+; RUN: llvm-as %t.dir/main.ll -o %t.main.obj
+; RUN: llvm-as %t.dir/other.ll -o %t.other.obj
+; RUN: llvm-as %t.dir/personality.ll -o %t.personality.obj
+; RUN: llvm-ar rcs %t.personality.lib %t.personality.obj
+
+; RUN: env LLD_IN_TEST=1 not lld-link /entry:entry %t.main.obj %t.other.obj %t.personality.lib /out:%t.exe /subsystem:console /opt:lldlto=0 /debug:symtab 2>&1 | FileCheck %s
+
+; CHECK: error: LTO object file lto-late-personality.ll.tmp.personality.lib(lto-late-personality.ll.tmp.personality.obj) linked in after doing LTO compilation.
+
+;--- main.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+define i32 @entry() {
+entry:
+ tail call void @other()
+ tail call void asm sideeffect ".seh_handler __C_specific_handler, @except\0A", "~{dirflag},~{fpsr},~{flags}"()
+ ret i32 0
+}
+
+declare dso_local void @other()
+
+;--- other.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+define dso_local void @other() {
+entry:
+ ret void
+}
+
+;--- personality.ll
+target datalayout = "e-m:w-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-w64-windows-gnu"
+
+define void @__C_specific_handler() {
+entry:
+ ret void
+}
|
You can test this locally with the following command:git-clang-format --diff 1d95a071d6fc43fb413a0d3f5a9d1e52a18abab0 373f2c7d9e8043f50d6be1849d3feebde44ce9d1 -- lld/COFF/SymbolTable.cpp lld/COFF/SymbolTable.h View the diff from clang-format here.diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 15c76461a..27e1bc75d 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -62,7 +62,8 @@ void SymbolTable::addFile(InputFile *file) {
ctx.objFileInstances.push_back(f);
} else if (auto *f = dyn_cast<BitcodeFile>(file)) {
if (ltoCompilationDone) {
- error("LTO object file " + toString(file) + " linked in after "
+ error("LTO object file " + toString(file) +
+ " linked in after "
"doing LTO compilation.");
}
ctx.bitcodeFileInstances.push_back(f);
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
Normally, this shouldn't happen. It can happen in exceptional circumstances, if the compiled output of a bitcode object file references symbols that weren't listed as undefined in the bitcode object file itself.
This can at least happen in the following cases:
Both of these produce undefined references to symbols after compiling to a regular object file, that aren't visible on the level of the IR object file.
This is only an issue if the referenced symbols are provided as LTO objects themselves; loading regular object files after the LTO compilation works fine.
Custom SEH personalities are rare, but one CRT startup file in mingw-w64 does this. The referenced pesonality function is usually provided via an import library, but for WinStore targets, a local dummy reimplementation in C is used, which can be an LTO object.
Generated calls to builtins is very common, but the builtins aren't usually provided as LTO objects (compiler-rt's builtins explicitly pass -fno-lto when building), and many of the builtins are provided as raw .S assembly files, which don't get built as LTO objects anyway, even if built with -flto.
If hitting this unusual, but possible, situation, error out cleanly with a clear message rather than crashing.