-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[IR] Allow uses of llvm.global_ctors
and llvm.global_dtors
#96477
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -1,16 +0,0 @@ | |||
; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be better to turn this into a positive test instead of deleting it, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for suggestion. I've added llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll since Verifier subdir does not look as a right place for such a positive test.
With PAuth enabled, signed function pointers in init/fini arrays are going to be represented with `ptrauth` constants (see llvm#96478). To support address discrimination for such signed pointers, we need to fill the storage address with `getelementptr` referencing the array (`llvm.global_ctors` or `llvm.global_dtors`) itself. Such uses of these special arrays were previously disallowed since `appendToGlobal{C|D}tors` did not update uses after construction of a new array. This patch implements such update logic. Test tools/llvm-reduce/remove-ifunc-program-addrspace.ll needs to be updated since otherwise the following assertion in `Value::doRAUW` is triggered: ``` assert(New->getType() == getType() && "replaceAllUses of value with new value of different type!"); ``` It's better not to omit `addrspace` in source IR.
219b873
to
9360bbd
Compare
@llvm/pr-subscribers-llvm-ir @llvm/pr-subscribers-llvm-transforms Author: Daniil Kovalev (kovdan01) ChangesWith PAuth enabled, signed function pointers in init/fini arrays are Such uses of these special arrays were previously disallowed since Test tools/llvm-reduce/remove-ifunc-program-addrspace.ll needs to be
It's better not to omit Full diff: https://github.com/llvm/llvm-project/pull/96477.diff 5 Files Affected:
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 0abd5f76d449c..ee17d266e5057 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -850,8 +850,6 @@ void Verifier::visitGlobalVariable(const GlobalVariable &GV) {
GV.getName() == "llvm.global_dtors")) {
Check(!GV.hasInitializer() || GV.hasAppendingLinkage(),
"invalid linkage for intrinsic global variable", &GV);
- Check(GV.materialized_use_empty(),
- "invalid uses of intrinsic global variable", &GV);
// Don't worry about emitting an error for it not being an array,
// visitGlobalValue will complain on appending non-array.
diff --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 122279160cc7e..47c5aa0cb2541 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -35,7 +35,8 @@ static void appendToGlobalArray(StringRef ArrayName, Module &M, Function *F,
// to the list.
SmallVector<Constant *, 16> CurrentCtors;
StructType *EltTy;
- if (GlobalVariable *GVCtor = M.getNamedGlobal(ArrayName)) {
+ GlobalVariable *GVCtor = M.getNamedGlobal(ArrayName);
+ if (GVCtor) {
EltTy = cast<StructType>(GVCtor->getValueType()->getArrayElementType());
if (Constant *Init = GVCtor->getInitializer()) {
unsigned n = Init->getNumOperands();
@@ -43,7 +44,6 @@ static void appendToGlobalArray(StringRef ArrayName, Module &M, Function *F,
for (unsigned i = 0; i != n; ++i)
CurrentCtors.push_back(cast<Constant>(Init->getOperand(i)));
}
- GVCtor->eraseFromParent();
} else {
EltTy = StructType::get(IRB.getInt32Ty(),
PointerType::get(FnTy, F->getAddressSpace()),
@@ -67,8 +67,14 @@ static void appendToGlobalArray(StringRef ArrayName, Module &M, Function *F,
// Create the new global variable and replace all uses of
// the old global variable with the new one.
- (void)new GlobalVariable(M, NewInit->getType(), false,
- GlobalValue::AppendingLinkage, NewInit, ArrayName);
+ auto *NewGVCtor =
+ new GlobalVariable(M, NewInit->getType(), false,
+ GlobalValue::AppendingLinkage, NewInit, ArrayName);
+ if (GVCtor) {
+ NewGVCtor->takeName(GVCtor);
+ GVCtor->replaceAllUsesWith(NewGVCtor);
+ GVCtor->eraseFromParent();
+ }
}
void llvm::appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data) {
diff --git a/llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll b/llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll
new file mode 100644
index 0000000000000..7b335ce7f692e
--- /dev/null
+++ b/llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll
@@ -0,0 +1,9 @@
+;; Run opt with asan to trigger `appendToGlobalArray` call which should update uses of `llvm.global_ctors`
+; RUN: opt -passes=asan -S %s -o - | FileCheck %s
+; CHECK: @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr ptrauth (ptr @foo, i32 0, i64 55764, ptr getelementptr inbounds ([1 x { i32, ptr, ptr }], ptr @llvm.global_ctors, i32 0, i32 0, i32 1)), ptr null }, { i32, ptr, ptr } { i32 1, ptr @asan.module_ctor, ptr @asan.module_ctor }]
+
+@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr ptrauth (ptr @foo, i32 0, i64 55764, ptr getelementptr inbounds ([1 x { i32, ptr, ptr }], ptr @llvm.global_ctors, i32 0, i32 0, i32 1)), ptr null }]
+
+define void @foo() {
+ ret void
+}
diff --git a/llvm/test/Verifier/global-ctors-dtors-uses.ll b/llvm/test/Verifier/global-ctors-dtors-uses.ll
deleted file mode 100644
index 1af4fb7ca9c0e..0000000000000
--- a/llvm/test/Verifier/global-ctors-dtors-uses.ll
+++ /dev/null
@@ -1,16 +0,0 @@
-; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
-
-; CHECK: invalid uses of intrinsic global variable
-; CHECK-NEXT: ptr @llvm.global_ctors
-@llvm.global_ctors = appending global [1 x { i32, ptr, ptr } ] [
- { i32, ptr, ptr } { i32 65535, ptr null, ptr null }
-]
-
-; CHECK: invalid uses of intrinsic global variable
-; CHECK-NEXT: ptr @llvm.global_dtors
-@llvm.global_dtors = appending global [1 x { i32, ptr, ptr } ] [
- { i32, ptr, ptr } { i32 65535, ptr null, ptr null }
-]
-
-@ctor_user = global ptr @llvm.global_ctors
-@dtor_user = global ptr @llvm.global_dtors
diff --git a/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll b/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll
index e275d61764b21..9573c55030242 100644
--- a/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll
+++ b/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll
@@ -16,7 +16,7 @@ define void @existing_ctor() addrspace(1) {
; CHECK-FINAL: [[TABLE:@[0-9]+]] = internal addrspace(2) global [6 x ptr addrspace(1)] poison, align 8
; CHECK-FINAL: @llvm.global_ctors = appending addrspace(2) global [2 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }, { i32, ptr addrspace(1), ptr } { i32 10, ptr addrspace(1) [[TABLE_CTOR:@[0-9]+]], ptr null }]
-@llvm.global_ctors = appending global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }]
+@llvm.global_ctors = appending addrspace(2) global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }]
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather not allow this, because this implies that we also have to lower such references to something meaningful, outside the pointer authentication use case.
Can you explain in more detail why these weird self-references are required?
Also, does this mean that optimizations on ctor/dtor will have to rewrite these additional GEPs when e.g. some ctors are optimized away? |
+1 |
@@ -16,7 +16,7 @@ define void @existing_ctor() addrspace(1) { | |||
; CHECK-FINAL: [[TABLE:@[0-9]+]] = internal addrspace(2) global [6 x ptr addrspace(1)] poison, align 8 | |||
|
|||
; CHECK-FINAL: @llvm.global_ctors = appending addrspace(2) global [2 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }, { i32, ptr addrspace(1), ptr } { i32 10, ptr addrspace(1) [[TABLE_CTOR:@[0-9]+]], ptr null }] | |||
@llvm.global_ctors = appending global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }] | |||
@llvm.global_ctors = appending addrspace(2) global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't just adjust the test to avoid an assert, you have to fix the assert
I agree that this is a significant drawback of the changes, but I'm not sure how we can limit allowed uses of these special arrays only to self-references when storing address discriminated signed pointers as ctors/dtors. Please let me know if you have thoughts on this.
Sure. When signing pointers, we have 4 arguments affecting the result: the raw pointer itself, the key, a constant discriminator (optional, 0 by default) and a storage address to be blended with this constant discriminator (also optional; when present, we say that such signed pointer is address discriminated). Address discrimination makes it harder to substitute a signed pointer with another one: not only constant discriminators used for signing need to be equivalent, but also storage address as well, otherwise further signature check will fail when trying to execute an authenticated branch instruction. You can treat discrimination as "salt" used for signing and checking the signature. When we want to sign ctors/dtors with address discrimination, we need to get their storage address, which is inside
But I'm not sure if it's any better, since presence of any uses, not just self-references, is disallowed now. |
@kovdan01 Can you use some special value as the discriminator to indicate to the backend that you want a self-reference? That way only the backend has to deal with it. |
In common case, we can't, since for stuff like In a particular case of ctors/dtors, I can try to implement some special handling. One thought I now have is that we can just look at llvm module flags - we already store info regarding pauth features enabled for module, which includes info whether address discrimination is enabled for ctors/dtors, so, when doing further lowering, we can manually apply address discrimination to them. @nikic Please let me know if such approach looks reasonable to you. If yes, I'll still leave this PR opened until I publish corresponding changes in #96478. |
@kovdan01 That approach sounds good to me. |
I close the PR since latest changes in #96478 make this obsolete. See comment #96478 (comment) for details. |
With PAuth enabled, signed function pointers in init/fini arrays are
going to be represented with
ptrauth
constants (see #96478). To supportaddress discrimination for such signed pointers, we need to fill the storage
address with
getelementptr
referencing the array (llvm.global_ctors
or
llvm.global_dtors
) itself.Such uses of these special arrays were previously disallowed since
appendToGlobal{C|D}tors
did not update uses after construction of a newarray. This patch implements such update logic.
Test tools/llvm-reduce/remove-ifunc-program-addrspace.ll needs to be
updated since otherwise the following assertion in
Value::doRAUW
istriggered:
It's better not to omit
addrspace
in source IR.