Skip to content

[PAC] Implement function pointer type discrimination #96992

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jul 11, 2024

Conversation

ahatanak
Copy link
Collaborator

@ahatanak ahatanak commented Jun 28, 2024

Give users an option (-fptrauth-function-pointer-type-discrimination) to sign a function pointer using a non-zero discriminator based on the function type.

The discriminator is computed by first translating the function type to a string and then computing the hash value of the string. Two function types that are compatible in C must be translated to the same string with the exception of function types that use typedefs of anonymous structs in their return type or parameter types.

This patch doesn't have the code to resign function pointers, which is needed when a function pointer is converted to a different function type. That will be implemented in another patch.

Co-authored-by: John McCall [email protected]

Give users an option to sign a function pointer using a non-zero
discrimiantor based on the type of the destination.

Co-authored-by: John McCall <[email protected]>
@ahatanak ahatanak marked this pull request as ready for review June 28, 2024 15:04
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Jun 28, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2024

@llvm/pr-subscribers-clang-driver
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: Akira Hatanaka (ahatanak)

Changes

Give users an option to sign a function pointer using a non-zero discrimiantor based on the type of the destination.


Patch is 32.79 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/96992.diff

14 Files Affected:

  • (modified) clang/include/clang/AST/ASTContext.h (+3)
  • (modified) clang/include/clang/AST/Type.h (+3)
  • (modified) clang/include/clang/Basic/Features.def (+1)
  • (modified) clang/include/clang/Basic/LangOptions.def (+3)
  • (modified) clang/include/clang/CodeGen/CodeGenABITypes.h (+4)
  • (modified) clang/include/clang/Driver/Options.td (+2)
  • (modified) clang/lib/AST/ASTContext.cpp (+263)
  • (modified) clang/lib/CodeGen/CGExprConstant.cpp (+5)
  • (modified) clang/lib/CodeGen/CGPointerAuth.cpp (+64-4)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+4)
  • (modified) clang/lib/Driver/ToolChains/Clang.cpp (+3)
  • (modified) clang/lib/Frontend/CompilerInvocation.cpp (+9-2)
  • (added) clang/test/CodeGen/ptrauth-function-type-discriminator.c (+137)
  • (modified) clang/test/Preprocessor/ptrauth_feature.c (+27-7)
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index a99f2dc6eb3f2..8ba0c943a9c86 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -1283,6 +1283,9 @@ class ASTContext : public RefCountedBase<ASTContext> {
   uint16_t
   getPointerAuthVTablePointerDiscriminator(const CXXRecordDecl *RD);
 
+  /// Return the "other" type-specific discriminator for the given type.
+  uint16_t getPointerAuthTypeDiscriminator(QualType T);
+
   /// Apply Objective-C protocol qualifiers to the given type.
   /// \param allowOnPointerType specifies if we can apply protocol
   /// qualifiers on ObjCObjectPointerType. It can be set to true when
diff --git a/clang/include/clang/AST/Type.h b/clang/include/clang/AST/Type.h
index a98899f7f4222..3aa0f05b0ab60 100644
--- a/clang/include/clang/AST/Type.h
+++ b/clang/include/clang/AST/Type.h
@@ -2506,6 +2506,7 @@ class alignas(TypeAlignment) Type : public ExtQualsTypeCommonBase {
   bool isFunctionNoProtoType() const { return getAs<FunctionNoProtoType>(); }
   bool isFunctionProtoType() const { return getAs<FunctionProtoType>(); }
   bool isPointerType() const;
+  bool isSignableType() const;
   bool isAnyPointerType() const;   // Any C pointer or ObjC object pointer
   bool isCountAttributedType() const;
   bool isBlockPointerType() const;
@@ -8001,6 +8002,8 @@ inline bool Type::isAnyPointerType() const {
   return isPointerType() || isObjCObjectPointerType();
 }
 
+inline bool Type::isSignableType() const { return isPointerType(); }
+
 inline bool Type::isBlockPointerType() const {
   return isa<BlockPointerType>(CanonicalType);
 }
diff --git a/clang/include/clang/Basic/Features.def b/clang/include/clang/Basic/Features.def
index 53f410d3cb4bd..2f864ff1c0edf 100644
--- a/clang/include/clang/Basic/Features.def
+++ b/clang/include/clang/Basic/Features.def
@@ -110,6 +110,7 @@ FEATURE(ptrauth_vtable_pointer_address_discrimination, LangOpts.PointerAuthVTPtr
 FEATURE(ptrauth_vtable_pointer_type_discrimination, LangOpts.PointerAuthVTPtrTypeDiscrimination)
 FEATURE(ptrauth_member_function_pointer_type_discrimination, LangOpts.PointerAuthCalls)
 FEATURE(ptrauth_init_fini, LangOpts.PointerAuthInitFini)
+FEATURE(ptrauth_function_pointer_type_discrimination, LangOpts.PointerAuthFunctionTypeDiscrimination)
 EXTENSION(swiftcc,
   PP.getTargetInfo().checkCallingConvention(CC_Swift) ==
   clang::TargetInfo::CCCR_OK)
diff --git a/clang/include/clang/Basic/LangOptions.def b/clang/include/clang/Basic/LangOptions.def
index 6dd6b5614f44c..7521ab85a9b70 100644
--- a/clang/include/clang/Basic/LangOptions.def
+++ b/clang/include/clang/Basic/LangOptions.def
@@ -470,6 +470,9 @@ ENUM_LANGOPT(StrictFlexArraysLevel, StrictFlexArraysLevelKind, 2,
 
 COMPATIBLE_VALUE_LANGOPT(MaxTokens, 32, 0, "Max number of tokens per TU or 0")
 
+BENIGN_LANGOPT(PointerAuthFunctionTypeDiscrimination, 1, 0,
+               "Use type discrimination when signing function pointers")
+
 ENUM_LANGOPT(SignReturnAddressScope, SignReturnAddressScopeKind, 2, SignReturnAddressScopeKind::None,
              "Scope of return address signing")
 ENUM_LANGOPT(SignReturnAddressKey, SignReturnAddressKeyKind, 1, SignReturnAddressKeyKind::AKey,
diff --git a/clang/include/clang/CodeGen/CodeGenABITypes.h b/clang/include/clang/CodeGen/CodeGenABITypes.h
index d4822dc160082..9cbc5a8a2a3f4 100644
--- a/clang/include/clang/CodeGen/CodeGenABITypes.h
+++ b/clang/include/clang/CodeGen/CodeGenABITypes.h
@@ -108,6 +108,10 @@ unsigned getLLVMFieldNumber(CodeGenModule &CGM,
 /// Return a declaration discriminator for the given global decl.
 uint16_t getPointerAuthDeclDiscriminator(CodeGenModule &CGM, GlobalDecl GD);
 
+/// Return a type discriminator for the given function type.
+uint16_t getPointerAuthTypeDiscriminator(CodeGenModule &CGM,
+                                         QualType FunctionType);
+
 /// Given the language and code-generation options that Clang was configured
 /// with, set the default LLVM IR attributes for a function definition.
 /// The attributes set here are mostly global target-configuration and
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index dd55838dcf384..a6f960e42d23f 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -4228,6 +4228,8 @@ defm ptrauth_vtable_pointer_address_discrimination :
 defm ptrauth_vtable_pointer_type_discrimination :
   OptInCC1FFlag<"ptrauth-vtable-pointer-type-discrimination", "Enable type discrimination of vtable pointers">;
 defm ptrauth_init_fini : OptInCC1FFlag<"ptrauth-init-fini", "Enable signing of function pointers in init/fini arrays">;
+defm ptrauth_function_pointer_type_discrimination : OptInCC1FFlag<"ptrauth-function-pointer-type-discrimination",
+  "Enabling type discrimination on C function pointers">;
 }
 
 def fenable_matrix : Flag<["-"], "fenable-matrix">, Group<f_Group>,
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 84deaf5429df7..74e0ae0a58e9f 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -3140,6 +3140,269 @@ ASTContext::getPointerAuthVTablePointerDiscriminator(const CXXRecordDecl *RD) {
   return llvm::getPointerAuthStableSipHash(Str);
 }
 
+/// Encode a function type for use in the discriminator of a function pointer
+/// type. We can't use the itanium scheme for this since C has quite permissive
+/// rules for type compatibility that we need to be compatible with.
+///
+/// Formally, this function associates every function pointer type T with an
+/// encoded string E(T). Let the equivalence relation T1 ~ T2 be defined as
+/// E(T1) == E(T2). E(T) is part of the ABI of values of type T. C type
+/// compatibility requires equivalent treatment under the ABI, so
+/// CCompatible(T1, T2) must imply E(T1) == E(T2), that is, CCompatible must be
+/// a subset of ~. Crucially, however, it must be a proper subset because
+/// CCompatible is not an equivalence relation: for example, int[] is compatible
+/// with both int[1] and int[2], but the latter are not compatible with each
+/// other. Therefore this encoding function must be careful to only distinguish
+/// types if there is no third type with which they are both required to be
+/// compatible.
+static void encodeTypeForFunctionPointerAuth(ASTContext &Ctx, raw_ostream &OS,
+                                             QualType QT) {
+  // FIXME: Consider address space qualifiers.
+  const Type *T = QT.getCanonicalType().getTypePtr();
+
+  // FIXME: Consider using the C++ type mangling when we encounter a construct
+  // that is incompatible with C.
+
+  switch (T->getTypeClass()) {
+  case Type::Atomic:
+    return encodeTypeForFunctionPointerAuth(
+        Ctx, OS, cast<AtomicType>(T)->getValueType());
+
+  case Type::LValueReference:
+    OS << "R";
+    encodeTypeForFunctionPointerAuth(Ctx, OS,
+                                     cast<ReferenceType>(T)->getPointeeType());
+    return;
+  case Type::RValueReference:
+    OS << "O";
+    encodeTypeForFunctionPointerAuth(Ctx, OS,
+                                     cast<ReferenceType>(T)->getPointeeType());
+    return;
+
+  case Type::Pointer:
+    // C11 6.7.6.1p2:
+    //   For two pointer types to be compatible, both shall be identically
+    //   qualified and both shall be pointers to compatible types.
+    // FIXME: we should also consider pointee types.
+    OS << "P";
+    return;
+
+  case Type::ObjCObjectPointer:
+  case Type::BlockPointer:
+    OS << "P";
+    return;
+
+  case Type::Complex:
+    OS << "C";
+    return encodeTypeForFunctionPointerAuth(
+        Ctx, OS, cast<ComplexType>(T)->getElementType());
+
+  case Type::VariableArray:
+  case Type::ConstantArray:
+  case Type::IncompleteArray:
+  case Type::ArrayParameter:
+    // C11 6.7.6.2p6:
+    //   For two array types to be compatible, both shall have compatible
+    //   element types, and if both size specifiers are present, and are integer
+    //   constant expressions, then both size specifiers shall have the same
+    //   constant value [...]
+    //
+    // So since ElemType[N] has to be compatible ElemType[], we can't encode the
+    // width of the array.
+    OS << "A";
+    return encodeTypeForFunctionPointerAuth(
+        Ctx, OS, cast<ArrayType>(T)->getElementType());
+
+  case Type::ObjCInterface:
+  case Type::ObjCObject:
+    OS << "<objc_object>";
+    return;
+
+  case Type::Enum:
+    // C11 6.7.2.2p4:
+    //   Each enumerated type shall be compatible with char, a signed integer
+    //   type, or an unsigned integer type.
+    //
+    // So we have to treat enum types as integers.
+    OS << "i";
+    return;
+
+  case Type::FunctionNoProto:
+  case Type::FunctionProto: {
+    // C11 6.7.6.3p15:
+    //   For two function types to be compatible, both shall specify compatible
+    //   return types. Moreover, the parameter type lists, if both are present,
+    //   shall agree in the number of parameters and in the use of the ellipsis
+    //   terminator; corresponding parameters shall have compatible types.
+    //
+    // That paragraph goes on to describe how unprototyped functions are to be
+    // handled, which we ignore here. Unprototyped function pointers are hashed
+    // as though they were prototyped nullary functions since thats probably
+    // what the user meant. This behavior is non-conforming.
+    // FIXME: If we add a "custom discriminator" function type attribute we
+    // should encode functions as their discriminators.
+    OS << "F";
+    auto *FuncType = cast<FunctionType>(T);
+    encodeTypeForFunctionPointerAuth(Ctx, OS, FuncType->getReturnType());
+    if (auto *FPT = dyn_cast<FunctionProtoType>(FuncType)) {
+      for (QualType Param : FPT->param_types()) {
+        Param = Ctx.getSignatureParameterType(Param);
+        encodeTypeForFunctionPointerAuth(Ctx, OS, Param);
+      }
+      if (FPT->isVariadic())
+        OS << "z";
+    }
+    OS << "E";
+    return;
+  }
+
+  case Type::MemberPointer: {
+    OS << "M";
+    auto *MPT = T->getAs<MemberPointerType>();
+    encodeTypeForFunctionPointerAuth(Ctx, OS, QualType(MPT->getClass(), 0));
+    encodeTypeForFunctionPointerAuth(Ctx, OS, MPT->getPointeeType());
+    return;
+  }
+  case Type::ExtVector:
+  case Type::Vector:
+    OS << "Dv" << Ctx.getTypeSizeInChars(T).getQuantity();
+    break;
+
+  // Don't bother discriminating based on these types.
+  case Type::Pipe:
+  case Type::BitInt:
+  case Type::ConstantMatrix:
+    OS << "?";
+    return;
+
+  case Type::Builtin: {
+    const BuiltinType *BTy = T->getAs<BuiltinType>();
+    switch (BTy->getKind()) {
+#define SIGNED_TYPE(Id, SingletonId)                                           \
+  case BuiltinType::Id:                                                        \
+    OS << "i";                                                                 \
+    return;
+#define UNSIGNED_TYPE(Id, SingletonId)                                         \
+  case BuiltinType::Id:                                                        \
+    OS << "i";                                                                 \
+    return;
+#define PLACEHOLDER_TYPE(Id, SingletonId) case BuiltinType::Id:
+#define BUILTIN_TYPE(Id, SingletonId)
+#include "clang/AST/BuiltinTypes.def"
+      llvm_unreachable("placeholder types should not appear here.");
+
+    case BuiltinType::Half:
+      OS << "Dh";
+      return;
+    case BuiltinType::Float:
+      OS << "f";
+      return;
+    case BuiltinType::Double:
+      OS << "d";
+      return;
+    case BuiltinType::LongDouble:
+      OS << "e";
+      return;
+    case BuiltinType::Float16:
+      OS << "DF16_";
+      return;
+    case BuiltinType::Float128:
+      OS << "g";
+      return;
+
+    case BuiltinType::Void:
+      OS << "v";
+      return;
+
+    case BuiltinType::ObjCId:
+    case BuiltinType::ObjCClass:
+    case BuiltinType::ObjCSel:
+    case BuiltinType::NullPtr:
+      OS << "P";
+      return;
+
+    // Don't bother discriminating based on OpenCL types.
+    case BuiltinType::OCLSampler:
+    case BuiltinType::OCLEvent:
+    case BuiltinType::OCLClkEvent:
+    case BuiltinType::OCLQueue:
+    case BuiltinType::OCLReserveID:
+    case BuiltinType::BFloat16:
+    case BuiltinType::VectorQuad:
+    case BuiltinType::VectorPair:
+      OS << "?";
+      return;
+
+    // Don't bother discriminating based on these seldom-used types.
+    case BuiltinType::Ibm128:
+      return;
+#define IMAGE_TYPE(ImgType, Id, SingletonId, Access, Suffix)                   \
+  case BuiltinType::Id:                                                        \
+    return;
+#include "clang/Basic/OpenCLImageTypes.def"
+#define EXT_OPAQUE_TYPE(ExtType, Id, Ext)                                      \
+  case BuiltinType::Id:                                                        \
+    return;
+#include "clang/Basic/OpenCLExtensionTypes.def"
+#define SVE_TYPE(Name, Id, SingletonId)                                        \
+  case BuiltinType::Id:                                                        \
+    return;
+#include "clang/Basic/AArch64SVEACLETypes.def"
+    case BuiltinType::Dependent:
+      llvm_unreachable("should never get here");
+    case BuiltinType::AMDGPUBufferRsrc:
+    case BuiltinType::WasmExternRef:
+#define RVV_TYPE(Name, Id, SingletonId) case BuiltinType::Id:
+#include "clang/Basic/RISCVVTypes.def"
+      llvm_unreachable("not yet implemented");
+    }
+  }
+  case Type::Record: {
+    RecordDecl *RD = T->getAs<RecordType>()->getDecl();
+    IdentifierInfo *II = RD->getIdentifier();
+    if (!II)
+      if (const TypedefNameDecl *Typedef = RD->getTypedefNameForAnonDecl())
+        II = Typedef->getDeclName().getAsIdentifierInfo();
+
+    if (!II) {
+      OS << "<anonymous_record>";
+      return;
+    }
+    OS << II->getLength() << II->getName();
+    return;
+  }
+  case Type::DeducedTemplateSpecialization:
+  case Type::Auto:
+#define NON_CANONICAL_TYPE(Class, Base) case Type::Class:
+#define DEPENDENT_TYPE(Class, Base) case Type::Class:
+#define NON_CANONICAL_UNLESS_DEPENDENT_TYPE(Class, Base) case Type::Class:
+#define ABSTRACT_TYPE(Class, Base)
+#define TYPE(Class, Base)
+#include "clang/AST/TypeNodes.inc"
+    llvm_unreachable("unexpected non-canonical or dependent type!");
+    return;
+  }
+}
+
+uint16_t ASTContext::getPointerAuthTypeDiscriminator(QualType T) {
+  assert(!T->isDependentType() &&
+         "cannot compute type discriminator of a dependent type");
+
+  SmallString<256> Str;
+  llvm::raw_svector_ostream Out(Str);
+
+  if (T->isFunctionPointerType() || T->isFunctionReferenceType())
+    T = T->getPointeeType();
+
+  if (T->isFunctionType())
+    encodeTypeForFunctionPointerAuth(*this, Out, T);
+  else
+    llvm_unreachable(
+        "type discrimination of non-function type not implemented yet");
+
+  return llvm::getPointerAuthStableSipHash(Str);
+}
+
 QualType ASTContext::getObjCGCQualType(QualType T,
                                        Qualifiers::GC GCAttr) const {
   QualType CanT = getCanonicalType(T);
diff --git a/clang/lib/CodeGen/CGExprConstant.cpp b/clang/lib/CodeGen/CGExprConstant.cpp
index 1fec587b5c4c7..6d412af5cc994 100644
--- a/clang/lib/CodeGen/CGExprConstant.cpp
+++ b/clang/lib/CodeGen/CGExprConstant.cpp
@@ -2220,6 +2220,11 @@ llvm::Constant *ConstantLValueEmitter::emitPointerAuthPointer(const Expr *E) {
 
   // The assertions here are all checked by Sema.
   assert(Result.Val.isLValue());
+  auto *Base = Result.Val.getLValueBase().get<const ValueDecl *>();
+  if (auto *Decl = dyn_cast_or_null<FunctionDecl>(Base)) {
+    assert(Result.Val.getLValueOffset().isZero());
+    return CGM.getRawFunctionPointer(Decl);
+  }
   return ConstantEmitter(CGM, Emitter.CGF)
       .emitAbstract(E->getExprLoc(), Result.Val, E->getType());
 }
diff --git a/clang/lib/CodeGen/CGPointerAuth.cpp b/clang/lib/CodeGen/CGPointerAuth.cpp
index 673f6e60bfc31..33b421437c74c 100644
--- a/clang/lib/CodeGen/CGPointerAuth.cpp
+++ b/clang/lib/CodeGen/CGPointerAuth.cpp
@@ -15,6 +15,7 @@
 #include "CodeGenModule.h"
 #include "clang/CodeGen/CodeGenABITypes.h"
 #include "clang/CodeGen/ConstantInitBuilder.h"
+#include "llvm/Analysis/ValueTracking.h"
 #include "llvm/Support/SipHash.h"
 
 using namespace clang;
@@ -29,7 +30,9 @@ llvm::ConstantInt *CodeGenModule::getPointerAuthOtherDiscriminator(
     return nullptr;
 
   case PointerAuthSchema::Discrimination::Type:
-    llvm_unreachable("type discrimination not implemented yet");
+    assert(!Type.isNull() && "type not provided for type-discriminated schema");
+    return llvm::ConstantInt::get(
+        IntPtrTy, getContext().getPointerAuthTypeDiscriminator(Type));
 
   case PointerAuthSchema::Discrimination::Decl:
     assert(Decl.getDecl() &&
@@ -43,6 +46,11 @@ llvm::ConstantInt *CodeGenModule::getPointerAuthOtherDiscriminator(
   llvm_unreachable("bad discrimination kind");
 }
 
+uint16_t CodeGen::getPointerAuthTypeDiscriminator(CodeGenModule &CGM,
+                                                  QualType FunctionType) {
+  return CGM.getContext().getPointerAuthTypeDiscriminator(FunctionType);
+}
+
 uint16_t CodeGen::getPointerAuthDeclDiscriminator(CodeGenModule &CGM,
                                                   GlobalDecl Declaration) {
   return CGM.getPointerAuthDeclDiscriminator(Declaration);
@@ -71,12 +79,15 @@ CGPointerAuthInfo CodeGenModule::getFunctionPointerAuthInfo(QualType T) {
   assert(!Schema.isAddressDiscriminated() &&
          "function pointers cannot use address-specific discrimination");
 
-  assert(!Schema.hasOtherDiscrimination() &&
-         "function pointers don't support any discrimination yet");
+  llvm::Constant *Discriminator = nullptr;
+  if (T->isFunctionPointerType() || T->isFunctionReferenceType())
+    T = T->getPointeeType();
+  if (T->isFunctionType())
+    Discriminator = getPointerAuthOtherDiscriminator(Schema, GlobalDecl(), T);
 
   return CGPointerAuthInfo(Schema.getKey(), Schema.getAuthenticationMode(),
                            /*IsaPointer=*/false, /*AuthenticatesNull=*/false,
-                           /*Discriminator=*/nullptr);
+                           Discriminator);
 }
 
 llvm::Value *
@@ -114,6 +125,47 @@ CGPointerAuthInfo CodeGenFunction::EmitPointerAuthInfo(
                            Schema.authenticatesNullValues(), Discriminator);
 }
 
+/// Return the natural pointer authentication for values of the given
+/// pointee type.
+static CGPointerAuthInfo
+getPointerAuthInfoForPointeeType(CodeGenModule &CGM, QualType PointeeType) {
+  if (PointeeType.isNull())
+    return CGPointerAuthInfo();
+
+  // Function pointers use the function-pointer schema by default.
+  if (PointeeType->isFunctionType())
+    return CGM.getFunctionPointerAuthInfo(PointeeType);
+
+  // Normal data pointers never use direct pointer authentication by default.
+  return CGPointerAuthInfo();
+}
+
+CGPointerAuthInfo CodeGenModule::getPointerAuthInfoForPointeeType(QualType T) {
+  return ::getPointerAuthInfoForPointeeType(*this, T);
+}
+
+/// Return the natural pointer authentication for values of the given
+/// pointer type.
+static CGPointerAuthInfo getPointerAuthInfoForType(CodeGenModule &CGM,
+                                                   QualType PointerType) {
+  assert(PointerType->isSignableType());
+
+  // Block pointers are currently not signed.
+  if (PointerType->isBlockPointerType())
+    return CGPointerAuthInfo();
+
+  auto PointeeType = PointerType->getPointeeType();
+
+  if (PointeeType.isNull())
+    return CGPointerAuthInfo();
+
+  return ::getPointerAuthInfoForPointeeType(CGM, PointeeType);
+}
+
+CGPointerAuthInfo CodeGenModule::getPointerAuthInfoForType(QualType T) {
+  return ::getPointerAuthInfoForType(*this, T);
+}
+
 llvm::Constant *
 CodeGenModule::getConstantSignedPointer(llvm::Constant *Pointer, unsigned Key,
                                         llvm::Constant *StorageAddress,
@@ -180,6 +232,14 @@ llvm::Constant *CodeGenModule::getFunctionPointer(GlobalDecl GD,
                                                   llvm::Type *Ty) {
   const auto *FD = cast<FunctionDecl>(GD.getDecl());
   QualType FuncType = FD->getType();
+
+  // Annoyingly, K&R functions have prototypes in the clang AST, but
+  // expressions referring to them are unprototyped.
+  if (!FD->hasPrototype())
+    if (const auto *Proto = FuncType->getAs<FunctionProtoType>()...
[truncated]

Copy link
Contributor

@kovdan01 kovdan01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with minor mostly cosmetic comments. I'm OK with fixing them in a separate PR if it's too time-consuming now.

RecordDecl *RD = T->getAs<RecordType>()->getDecl();
IdentifierInfo *II = RD->getIdentifier();
if (!II)
if (const TypedefNameDecl *Typedef = RD->getTypedefNameForAnonDecl())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getTypedefNameForAnonDecl() doesn't actually have any semantic significance in C; it's a C++ thing. Not sure how much we care.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this code is needed to add more diversification in the following case:

typedef struct {
  int a;
} S;

void f(S);
void (*fp)(S) = f;

Function type discrimination is used in C++ mode too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the following two files, you have an ODR violation in C++... but you're not violating the equivalent C type compatibility rule. So this will reject code which is technically valid C.

If this is an intentional deviation from the C standard, that might be fine, but it needs an explicit comment explaining that.

typedef struct {
  int a;
} S1;
void f(S1);
typedef struct {
  int a;
} S2;
void f(S2);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, that exact code will work as usual because function symbols are not themselves signed. A call to f under one function type will work correctly even if it is defined using the other. However, you are still generally correct.

If you take the address of a function, the resulting function pointer value is signed in that context according to that context's concept of its type. If you then convert it, it will be authenticated for its old type and re-signed for its new type. There will only be a problem if the function pointer value is reinterpreted as a new type without such a conversion.

Therefore, if f is instead declared as taking a function pointer value:

void f(void (*fp)(S1));
void f(void (*fp)(S2));

A call under the first signature will sign the argument differently from how a definition under the second signature will expect it, and code that is technically valid C will fail to work.

This deviation is indeed intentional. We felt that typedefs of anonymous structs were too common of a pattern in C to simply completely fail to distinguish them, and we felt that a structural hash of the members was likely to be problematic in other ways. It is reasonable to ask that we document this, though. Akira, I suggest:

      // In C++, an immediate typedef of an anonymous struct or union
      // is considered to name it for ODR purposes, but C's specification
      // of type compatibility does not have a similar rule.  Using the typedef
      // name in function type discriminators anyway, as we do here,
      // therefore technically violates the C standard: two function pointer
      // types defined in terms of two typedef'd anonymous structs with
      // different names are formally still compatible, but we are assigning
      // them different discriminators and therefore incompatible ABIs.
      //
      // This is a relatively minor violation that significantly improves
      // discrimination in some cases and has not caused problems in
      // practice.  Regardless, it is now part of the ABI in places where
      // function type discrimination is used, and it can no longer be
      // changed except on new platforms.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this patch doesn't have the code to resign function pointers, which is needed for type conversions. That will be implemented in subsequent patches.

auto *Base = Result.Val.getLValueBase().get<const ValueDecl *>();
if (auto *Decl = dyn_cast_or_null<FunctionDecl>(Base)) {
assert(Result.Val.getLValueOffset().isZero());
return CGM.getRawFunctionPointer(Decl);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this code not part of ConstantEmitter()?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is called to return an unsigned pointer. The returned pointer is used to create a signed pointer for __builtin_ptrauth_sign_constant in emitPointerAuthSignConstant. ConstantEmitter returns a pointer that is signed with zero or a non-zero discriminator that is based on the function type etc.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, why are you special-casing functions here? This looks like emitPointerAuthPointer can return either a signed pointer or an unsigned pointer, depending on what exactly the expression is referring to, which seems bad.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The checks in SemaChecking.cpp and the assertion in emitPointerAuthPointer ensure that the expression of the first argument to __builtin_ptrauth_sign_constant evaluates to an LValue with a ValueDecl base. With that restriction, a call to emitAbstract returns a signed pointer only when the argument points to a function definition (see ConstantLValueEmitter::tryEmitBase). Since we are special-casing functions here, the call to emitAbstract returns an unsigned pointer.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just noticed that there's an unnecessary null check in the code above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I think I follow what this is doing, then... but it seems like a terrible way to structure the code; it's repeating code, and the relevant logic is split over multiple disconnected functions.

Can we add an argument to ConstantLValueEmitter() so users can request that functions are either signed or unsigned, or something like that?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean to ConstantEmitter? I'd be okay with a flag that specifically requests a raw pointer; that doesn't seem too error-prone.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revised code looks fine.

@ahatanak
Copy link
Collaborator Author

I'll wait a day and merge the PR if there are no more comments.

@kovdan01
Copy link
Contributor

@ahatanak It would be nice if you explicitly mention the flag name -fptrauth-function-pointer-type-discrimination in commit message - I think it might be useful when grepping output of git log.

@ahatanak ahatanak changed the title [clang] Implement function pointer type discrimination [arm64e] Implement function pointer type discrimination Jul 11, 2024
@ahatanak ahatanak changed the title [arm64e] Implement function pointer type discrimination [PAC] Implement function pointer type discrimination Jul 11, 2024
@ahatanak ahatanak merged commit ae18b94 into llvm:main Jul 11, 2024
7 checks passed
@ahatanak ahatanak deleted the function-type-discrimination branch July 11, 2024 16:09
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
Give users an option (-fptrauth-function-pointer-type-discrimination) to
sign a function pointer using a non-zero discriminator based on the
function type.

The discriminator is computed by first translating the function type to
a string and then computing the hash value of the string. Two function
types that are compatible in C must be translated to the same string
with the exception of function types that use typedefs of anonymous
structs in their return type or parameter types.

This patch doesn't have the code to resign function pointers, which is
needed when a function pointer is converted to a different function
type. That will be implemented in another patch.

Co-authored-by: John McCall <[email protected]>

---------

Co-authored-by: John McCall <[email protected]>
kovdan01 added a commit to kovdan01/llvm-project that referenced this pull request Jul 15, 2024
Test one feature at a time to make RUN lines shorter. See also
llvm#96992 (comment)
kovdan01 added a commit that referenced this pull request Jul 15, 2024
Test one feature at a time to make RUN lines shorter. See also
#96992 (comment)
kovdan01 added a commit that referenced this pull request Jul 19, 2024
Enhance tests introduced in #94056, #96992, #98276 and #98847 by adding
RUN and CHECK lines against linux triples.
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Test one feature at a time to make RUN lines shorter. See also
#96992 (comment)

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251767
yuxuanchen1997 pushed a commit that referenced this pull request Jul 25, 2024
Summary:
Enhance tests introduced in #94056, #96992, #98276 and #98847 by adding
RUN and CHECK lines against linux triples.

Test Plan: 

Reviewers: 

Subscribers: 

Tasks: 

Tags: 


Differential Revision: https://phabricator.intern.facebook.com/D60251378
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen IR generation bugs: mangling, exceptions, etc. clang:driver 'clang' and 'clang++' user-facing binaries. Not 'clang-cl' clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

6 participants