Skip to content

[IR] Allow uses of llvm.global_ctors and llvm.global_dtors #96477

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

kovdan01
Copy link
Contributor

@kovdan01 kovdan01 commented Jun 24, 2024

With PAuth enabled, signed function pointers in init/fini arrays are
going to be represented with ptrauth constants (see #96478). To support
address discrimination for such signed pointers, we need to fill the storage
address with getelementptr referencing the array (llvm.global_ctors
or llvm.global_dtors) itself.

Such uses of these special arrays were previously disallowed since
appendToGlobal{C|D}tors did not update uses after construction of a new
array. This patch implements such update logic.

Test tools/llvm-reduce/remove-ifunc-program-addrspace.ll needs to be
updated since otherwise the following assertion in Value::doRAUW is
triggered:

assert(New->getType() == getType() &&
       "replaceAllUses of value with new value of different type!");

It's better not to omit addrspace in source IR.

@@ -1,16 +0,0 @@
; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be better to turn this into a positive test instead of deleting it, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for suggestion. I've added llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll since Verifier subdir does not look as a right place for such a positive test.

With PAuth enabled, signed function pointers in init/fini arrays are
going to be represented with `ptrauth` constants (see llvm#96478). To support
address discrimination for such signed pointers, we need to fill the storage
address with `getelementptr` referencing the array (`llvm.global_ctors`
or `llvm.global_dtors`) itself.

Such uses of these special arrays were previously disallowed since
`appendToGlobal{C|D}tors` did not update uses after construction of a new
array. This patch implements such update logic.

Test tools/llvm-reduce/remove-ifunc-program-addrspace.ll needs to be
updated since otherwise the following assertion in `Value::doRAUW` is
triggered:

```
assert(New->getType() == getType() &&
       "replaceAllUses of value with new value of different type!");
```

It's better not to omit `addrspace` in source IR.
@kovdan01 kovdan01 force-pushed the allow-llvm-global-ctors-dtors-uses branch from 219b873 to 9360bbd Compare June 24, 2024 21:30
@kovdan01 kovdan01 requested review from arsenm, nikic and topperc June 24, 2024 21:35
@kovdan01 kovdan01 marked this pull request as ready for review June 24, 2024 21:35
@kovdan01 kovdan01 requested a review from jroelofs June 24, 2024 21:35
@llvmbot
Copy link
Member

llvmbot commented Jun 24, 2024

@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-llvm-transforms

Author: Daniil Kovalev (kovdan01)

Changes

With PAuth enabled, signed function pointers in init/fini arrays are
going to be represented with ptrauth constants (see llvm#96478). To support
address discrimination for such signed pointers, we need to fill the storage
address with getelementptr referencing the array (llvm.global_ctors
or llvm.global_dtors) itself.

Such uses of these special arrays were previously disallowed since
appendToGlobal{C|D}tors did not update uses after construction of a new
array. This patch implements such update logic.

Test tools/llvm-reduce/remove-ifunc-program-addrspace.ll needs to be
updated since otherwise the following assertion in Value::doRAUW is
triggered:

assert(New->getType() == getType() &&
       "replaceAllUses of value with new value of different type!");

It's better not to omit addrspace in source IR.


Full diff: https://github.com/llvm/llvm-project/pull/96477.diff

5 Files Affected:

  • (modified) llvm/lib/IR/Verifier.cpp (-2)
  • (modified) llvm/lib/Transforms/Utils/ModuleUtils.cpp (+10-4)
  • (added) llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll (+9)
  • (removed) llvm/test/Verifier/global-ctors-dtors-uses.ll (-16)
  • (modified) llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll (+1-1)
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 0abd5f76d449c..ee17d266e5057 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -850,8 +850,6 @@ void Verifier::visitGlobalVariable(const GlobalVariable &GV) {
                        GV.getName() == "llvm.global_dtors")) {
     Check(!GV.hasInitializer() || GV.hasAppendingLinkage(),
           "invalid linkage for intrinsic global variable", &GV);
-    Check(GV.materialized_use_empty(),
-          "invalid uses of intrinsic global variable", &GV);
 
     // Don't worry about emitting an error for it not being an array,
     // visitGlobalValue will complain on appending non-array.
diff --git a/llvm/lib/Transforms/Utils/ModuleUtils.cpp b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
index 122279160cc7e..47c5aa0cb2541 100644
--- a/llvm/lib/Transforms/Utils/ModuleUtils.cpp
+++ b/llvm/lib/Transforms/Utils/ModuleUtils.cpp
@@ -35,7 +35,8 @@ static void appendToGlobalArray(StringRef ArrayName, Module &M, Function *F,
   // to the list.
   SmallVector<Constant *, 16> CurrentCtors;
   StructType *EltTy;
-  if (GlobalVariable *GVCtor = M.getNamedGlobal(ArrayName)) {
+  GlobalVariable *GVCtor = M.getNamedGlobal(ArrayName);
+  if (GVCtor) {
     EltTy = cast<StructType>(GVCtor->getValueType()->getArrayElementType());
     if (Constant *Init = GVCtor->getInitializer()) {
       unsigned n = Init->getNumOperands();
@@ -43,7 +44,6 @@ static void appendToGlobalArray(StringRef ArrayName, Module &M, Function *F,
       for (unsigned i = 0; i != n; ++i)
         CurrentCtors.push_back(cast<Constant>(Init->getOperand(i)));
     }
-    GVCtor->eraseFromParent();
   } else {
     EltTy = StructType::get(IRB.getInt32Ty(),
                             PointerType::get(FnTy, F->getAddressSpace()),
@@ -67,8 +67,14 @@ static void appendToGlobalArray(StringRef ArrayName, Module &M, Function *F,
 
   // Create the new global variable and replace all uses of
   // the old global variable with the new one.
-  (void)new GlobalVariable(M, NewInit->getType(), false,
-                           GlobalValue::AppendingLinkage, NewInit, ArrayName);
+  auto *NewGVCtor =
+      new GlobalVariable(M, NewInit->getType(), false,
+                         GlobalValue::AppendingLinkage, NewInit, ArrayName);
+  if (GVCtor) {
+    NewGVCtor->takeName(GVCtor);
+    GVCtor->replaceAllUsesWith(NewGVCtor);
+    GVCtor->eraseFromParent();
+  }
 }
 
 void llvm::appendToGlobalCtors(Module &M, Function *F, int Priority, Constant *Data) {
diff --git a/llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll b/llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll
new file mode 100644
index 0000000000000..7b335ce7f692e
--- /dev/null
+++ b/llvm/test/CodeGen/Generic/global-ctors-dtors-uses.ll
@@ -0,0 +1,9 @@
+;; Run opt with asan to trigger `appendToGlobalArray` call which should update uses of `llvm.global_ctors`
+; RUN: opt -passes=asan -S %s -o - | FileCheck %s
+; CHECK: @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr ptrauth (ptr @foo, i32 0, i64 55764, ptr getelementptr inbounds ([1 x { i32, ptr, ptr }], ptr @llvm.global_ctors, i32 0, i32 0, i32 1)), ptr null }, { i32, ptr, ptr } { i32 1, ptr @asan.module_ctor, ptr @asan.module_ctor }]
+
+@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr ptrauth (ptr @foo, i32 0, i64 55764, ptr getelementptr inbounds ([1 x { i32, ptr, ptr }], ptr @llvm.global_ctors, i32 0, i32 0, i32 1)), ptr null }]
+
+define void @foo() {
+  ret void
+}
diff --git a/llvm/test/Verifier/global-ctors-dtors-uses.ll b/llvm/test/Verifier/global-ctors-dtors-uses.ll
deleted file mode 100644
index 1af4fb7ca9c0e..0000000000000
--- a/llvm/test/Verifier/global-ctors-dtors-uses.ll
+++ /dev/null
@@ -1,16 +0,0 @@
-; RUN: not llvm-as %s -o /dev/null 2>&1 | FileCheck %s
-
-; CHECK: invalid uses of intrinsic global variable
-; CHECK-NEXT: ptr @llvm.global_ctors
-@llvm.global_ctors = appending global [1 x { i32, ptr, ptr } ] [
-  { i32, ptr, ptr } { i32 65535, ptr null, ptr null }
-]
-
-; CHECK: invalid uses of intrinsic global variable
-; CHECK-NEXT: ptr @llvm.global_dtors
-@llvm.global_dtors = appending global [1 x { i32, ptr, ptr } ] [
-  { i32, ptr, ptr } { i32 65535, ptr null, ptr null }
-]
-
-@ctor_user = global ptr @llvm.global_ctors
-@dtor_user = global ptr @llvm.global_dtors
diff --git a/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll b/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll
index e275d61764b21..9573c55030242 100644
--- a/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll
+++ b/llvm/test/tools/llvm-reduce/remove-ifunc-program-addrspace.ll
@@ -16,7 +16,7 @@ define void @existing_ctor() addrspace(1) {
 ; CHECK-FINAL: [[TABLE:@[0-9]+]] = internal addrspace(2) global [6 x ptr addrspace(1)] poison, align 8
 
 ; CHECK-FINAL: @llvm.global_ctors = appending addrspace(2) global [2 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }, { i32, ptr addrspace(1), ptr } { i32 10, ptr addrspace(1) [[TABLE_CTOR:@[0-9]+]], ptr null }]
-@llvm.global_ctors = appending global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }]
+@llvm.global_ctors = appending addrspace(2) global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }]
 
 
 

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not allow this, because this implies that we also have to lower such references to something meaningful, outside the pointer authentication use case.

Can you explain in more detail why these weird self-references are required?

@nikic
Copy link
Contributor

nikic commented Jun 25, 2024

Also, does this mean that optimizations on ctor/dtor will have to rewrite these additional GEPs when e.g. some ctors are optimized away?

@arsenm
Copy link
Contributor

arsenm commented Jun 25, 2024

I'd rather not allow this, because this implies that we also have to lower such references to something meaningful, outside the pointer authentication use case.

+1

@@ -16,7 +16,7 @@ define void @existing_ctor() addrspace(1) {
; CHECK-FINAL: [[TABLE:@[0-9]+]] = internal addrspace(2) global [6 x ptr addrspace(1)] poison, align 8

; CHECK-FINAL: @llvm.global_ctors = appending addrspace(2) global [2 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }, { i32, ptr addrspace(1), ptr } { i32 10, ptr addrspace(1) [[TABLE_CTOR:@[0-9]+]], ptr null }]
@llvm.global_ctors = appending global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }]
@llvm.global_ctors = appending addrspace(2) global [1 x { i32, ptr addrspace(1), ptr }] [{ i32, ptr addrspace(1), ptr } { i32 0, ptr addrspace(1) @existing_ctor, ptr null }]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't just adjust the test to avoid an assert, you have to fix the assert

@kovdan01
Copy link
Contributor Author

kovdan01 commented Jun 25, 2024

@nikic

I'd rather not allow this, because this implies that we also have to lower such references to something meaningful, outside the pointer authentication use case.

I agree that this is a significant drawback of the changes, but I'm not sure how we can limit allowed uses of these special arrays only to self-references when storing address discriminated signed pointers as ctors/dtors. Please let me know if you have thoughts on this.

Can you explain in more detail why these weird self-references are required?

Sure. When signing pointers, we have 4 arguments affecting the result: the raw pointer itself, the key, a constant discriminator (optional, 0 by default) and a storage address to be blended with this constant discriminator (also optional; when present, we say that such signed pointer is address discriminated). Address discrimination makes it harder to substitute a signed pointer with another one: not only constant discriminators used for signing need to be equivalent, but also storage address as well, otherwise further signature check will fail when trying to execute an authenticated branch instruction. You can treat discrimination as "salt" used for signing and checking the signature.

When we want to sign ctors/dtors with address discrimination, we need to get their storage address, which is inside llvm.global_{c|d}tors and is obtained by getelementptr - so, it looks like that we can't avoid this use. Theoretically, we can try to somehow transform self-references to just references, by adding logic for handling smth like this:

ctor1 = ptrauth(..., getelementptr(llvm.global_ctors, ...))
llvm.global_ctors = {ctor1}

But I'm not sure if it's any better, since presence of any uses, not just self-references, is disallowed now.

@nikic
Copy link
Contributor

nikic commented Jun 25, 2024

@kovdan01 Can you use some special value as the discriminator to indicate to the backend that you want a self-reference? That way only the backend has to deal with it.

@kovdan01
Copy link
Contributor Author

@kovdan01 Can you use some special value as the discriminator to indicate to the backend that you want a self-reference? That way only the backend has to deal with it.

In common case, we can't, since for stuff like ret ptr ptrauth (ptr @g, i32 0, i64 42, ptr storage_addr) we need some actual storage address and not just a flag or a special discriminator value indicating that we need a self-reference (because here we can't say, self-reference to what).

In a particular case of ctors/dtors, I can try to implement some special handling. One thought I now have is that we can just look at llvm module flags - we already store info regarding pauth features enabled for module, which includes info whether address discrimination is enabled for ctors/dtors, so, when doing further lowering, we can manually apply address discrimination to them. @nikic Please let me know if such approach looks reasonable to you. If yes, I'll still leave this PR opened until I publish corresponding changes in #96478.

@nikic
Copy link
Contributor

nikic commented Jun 25, 2024

@kovdan01 That approach sounds good to me.

@kovdan01
Copy link
Contributor Author

I close the PR since latest changes in #96478 make this obsolete. See comment #96478 (comment) for details.

@kovdan01 kovdan01 closed this Jun 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants