-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[PAC] Implement function pointer type discrimination #96992
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 3 commits
cf22a4b
1d5757a
4b1ad0b
06e4a14
d429cb4
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3140,6 +3140,269 @@ ASTContext::getPointerAuthVTablePointerDiscriminator(const CXXRecordDecl *RD) { | |
return llvm::getPointerAuthStableSipHash(Str); | ||
} | ||
|
||
/// Encode a function type for use in the discriminator of a function pointer | ||
/// type. We can't use the itanium scheme for this since C has quite permissive | ||
/// rules for type compatibility that we need to be compatible with. | ||
/// | ||
/// Formally, this function associates every function pointer type T with an | ||
/// encoded string E(T). Let the equivalence relation T1 ~ T2 be defined as | ||
/// E(T1) == E(T2). E(T) is part of the ABI of values of type T. C type | ||
/// compatibility requires equivalent treatment under the ABI, so | ||
/// CCompatible(T1, T2) must imply E(T1) == E(T2), that is, CCompatible must be | ||
/// a subset of ~. Crucially, however, it must be a proper subset because | ||
/// CCompatible is not an equivalence relation: for example, int[] is compatible | ||
/// with both int[1] and int[2], but the latter are not compatible with each | ||
/// other. Therefore this encoding function must be careful to only distinguish | ||
/// types if there is no third type with which they are both required to be | ||
/// compatible. | ||
static void encodeTypeForFunctionPointerAuth(const ASTContext &Ctx, | ||
raw_ostream &OS, QualType QT) { | ||
// FIXME: Consider address space qualifiers. | ||
const Type *T = QT.getCanonicalType().getTypePtr(); | ||
|
||
// FIXME: Consider using the C++ type mangling when we encounter a construct | ||
// that is incompatible with C. | ||
|
||
switch (T->getTypeClass()) { | ||
case Type::Atomic: | ||
return encodeTypeForFunctionPointerAuth( | ||
Ctx, OS, cast<AtomicType>(T)->getValueType()); | ||
|
||
case Type::LValueReference: | ||
OS << "R"; | ||
encodeTypeForFunctionPointerAuth(Ctx, OS, | ||
cast<ReferenceType>(T)->getPointeeType()); | ||
return; | ||
case Type::RValueReference: | ||
OS << "O"; | ||
encodeTypeForFunctionPointerAuth(Ctx, OS, | ||
cast<ReferenceType>(T)->getPointeeType()); | ||
return; | ||
|
||
case Type::Pointer: | ||
// C11 6.7.6.1p2: | ||
// For two pointer types to be compatible, both shall be identically | ||
// qualified and both shall be pointers to compatible types. | ||
// FIXME: we should also consider pointee types. | ||
OS << "P"; | ||
return; | ||
|
||
case Type::ObjCObjectPointer: | ||
case Type::BlockPointer: | ||
OS << "P"; | ||
return; | ||
|
||
case Type::Complex: | ||
OS << "C"; | ||
return encodeTypeForFunctionPointerAuth( | ||
Ctx, OS, cast<ComplexType>(T)->getElementType()); | ||
|
||
case Type::VariableArray: | ||
case Type::ConstantArray: | ||
case Type::IncompleteArray: | ||
case Type::ArrayParameter: | ||
// C11 6.7.6.2p6: | ||
// For two array types to be compatible, both shall have compatible | ||
// element types, and if both size specifiers are present, and are integer | ||
// constant expressions, then both size specifiers shall have the same | ||
// constant value [...] | ||
// | ||
// So since ElemType[N] has to be compatible ElemType[], we can't encode the | ||
// width of the array. | ||
OS << "A"; | ||
return encodeTypeForFunctionPointerAuth( | ||
Ctx, OS, cast<ArrayType>(T)->getElementType()); | ||
|
||
case Type::ObjCInterface: | ||
case Type::ObjCObject: | ||
OS << "<objc_object>"; | ||
return; | ||
|
||
case Type::Enum: | ||
// C11 6.7.2.2p4: | ||
// Each enumerated type shall be compatible with char, a signed integer | ||
// type, or an unsigned integer type. | ||
// | ||
// So we have to treat enum types as integers. | ||
OS << "i"; | ||
return; | ||
|
||
case Type::FunctionNoProto: | ||
case Type::FunctionProto: { | ||
// C11 6.7.6.3p15: | ||
// For two function types to be compatible, both shall specify compatible | ||
// return types. Moreover, the parameter type lists, if both are present, | ||
// shall agree in the number of parameters and in the use of the ellipsis | ||
// terminator; corresponding parameters shall have compatible types. | ||
// | ||
// That paragraph goes on to describe how unprototyped functions are to be | ||
// handled, which we ignore here. Unprototyped function pointers are hashed | ||
// as though they were prototyped nullary functions since thats probably | ||
// what the user meant. This behavior is non-conforming. | ||
// FIXME: If we add a "custom discriminator" function type attribute we | ||
// should encode functions as their discriminators. | ||
OS << "F"; | ||
const auto *FuncType = cast<FunctionType>(T); | ||
encodeTypeForFunctionPointerAuth(Ctx, OS, FuncType->getReturnType()); | ||
if (const auto *FPT = dyn_cast<FunctionProtoType>(FuncType)) { | ||
for (QualType Param : FPT->param_types()) { | ||
Param = Ctx.getSignatureParameterType(Param); | ||
encodeTypeForFunctionPointerAuth(Ctx, OS, Param); | ||
} | ||
if (FPT->isVariadic()) | ||
OS << "z"; | ||
} | ||
OS << "E"; | ||
return; | ||
} | ||
|
||
case Type::MemberPointer: { | ||
OS << "M"; | ||
const auto *MPT = T->getAs<MemberPointerType>(); | ||
encodeTypeForFunctionPointerAuth(Ctx, OS, QualType(MPT->getClass(), 0)); | ||
encodeTypeForFunctionPointerAuth(Ctx, OS, MPT->getPointeeType()); | ||
return; | ||
} | ||
case Type::ExtVector: | ||
case Type::Vector: | ||
OS << "Dv" << Ctx.getTypeSizeInChars(T).getQuantity(); | ||
break; | ||
|
||
// Don't bother discriminating based on these types. | ||
case Type::Pipe: | ||
case Type::BitInt: | ||
case Type::ConstantMatrix: | ||
OS << "?"; | ||
return; | ||
|
||
case Type::Builtin: { | ||
const auto *BTy = T->getAs<BuiltinType>(); | ||
switch (BTy->getKind()) { | ||
#define SIGNED_TYPE(Id, SingletonId) \ | ||
case BuiltinType::Id: \ | ||
OS << "i"; \ | ||
return; | ||
#define UNSIGNED_TYPE(Id, SingletonId) \ | ||
case BuiltinType::Id: \ | ||
OS << "i"; \ | ||
return; | ||
#define PLACEHOLDER_TYPE(Id, SingletonId) case BuiltinType::Id: | ||
#define BUILTIN_TYPE(Id, SingletonId) | ||
#include "clang/AST/BuiltinTypes.def" | ||
llvm_unreachable("placeholder types should not appear here."); | ||
|
||
case BuiltinType::Half: | ||
OS << "Dh"; | ||
return; | ||
case BuiltinType::Float: | ||
OS << "f"; | ||
return; | ||
case BuiltinType::Double: | ||
OS << "d"; | ||
return; | ||
case BuiltinType::LongDouble: | ||
OS << "e"; | ||
return; | ||
case BuiltinType::Float16: | ||
OS << "DF16_"; | ||
return; | ||
case BuiltinType::Float128: | ||
OS << "g"; | ||
return; | ||
|
||
case BuiltinType::Void: | ||
OS << "v"; | ||
return; | ||
|
||
case BuiltinType::ObjCId: | ||
case BuiltinType::ObjCClass: | ||
case BuiltinType::ObjCSel: | ||
case BuiltinType::NullPtr: | ||
OS << "P"; | ||
return; | ||
|
||
// Don't bother discriminating based on OpenCL types. | ||
case BuiltinType::OCLSampler: | ||
case BuiltinType::OCLEvent: | ||
case BuiltinType::OCLClkEvent: | ||
case BuiltinType::OCLQueue: | ||
case BuiltinType::OCLReserveID: | ||
case BuiltinType::BFloat16: | ||
case BuiltinType::VectorQuad: | ||
case BuiltinType::VectorPair: | ||
OS << "?"; | ||
return; | ||
|
||
// Don't bother discriminating based on these seldom-used types. | ||
case BuiltinType::Ibm128: | ||
return; | ||
#define IMAGE_TYPE(ImgType, Id, SingletonId, Access, Suffix) \ | ||
case BuiltinType::Id: \ | ||
return; | ||
#include "clang/Basic/OpenCLImageTypes.def" | ||
#define EXT_OPAQUE_TYPE(ExtType, Id, Ext) \ | ||
case BuiltinType::Id: \ | ||
return; | ||
#include "clang/Basic/OpenCLExtensionTypes.def" | ||
#define SVE_TYPE(Name, Id, SingletonId) \ | ||
case BuiltinType::Id: \ | ||
return; | ||
#include "clang/Basic/AArch64SVEACLETypes.def" | ||
case BuiltinType::Dependent: | ||
llvm_unreachable("should never get here"); | ||
case BuiltinType::AMDGPUBufferRsrc: | ||
case BuiltinType::WasmExternRef: | ||
#define RVV_TYPE(Name, Id, SingletonId) case BuiltinType::Id: | ||
#include "clang/Basic/RISCVVTypes.def" | ||
llvm_unreachable("not yet implemented"); | ||
} | ||
} | ||
case Type::Record: { | ||
const RecordDecl *RD = T->getAs<RecordType>()->getDecl(); | ||
const IdentifierInfo *II = RD->getIdentifier(); | ||
if (!II) | ||
if (const TypedefNameDecl *Typedef = RD->getTypedefNameForAnonDecl()) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. getTypedefNameForAnonDecl() doesn't actually have any semantic significance in C; it's a C++ thing. Not sure how much we care. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, this code is needed to add more diversification in the following case:
Function type discrimination is used in C++ mode too. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given the following two files, you have an ODR violation in C++... but you're not violating the equivalent C type compatibility rule. So this will reject code which is technically valid C. If this is an intentional deviation from the C standard, that might be fine, but it needs an explicit comment explaining that.
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In fact, that exact code will work as usual because function symbols are not themselves signed. A call to If you take the address of a function, the resulting function pointer value is signed in that context according to that context's concept of its type. If you then convert it, it will be authenticated for its old type and re-signed for its new type. There will only be a problem if the function pointer value is reinterpreted as a new type without such a conversion. Therefore, if void f(void (*fp)(S1)); void f(void (*fp)(S2)); A call under the first signature will sign the argument differently from how a definition under the second signature will expect it, and code that is technically valid C will fail to work. This deviation is indeed intentional. We felt that typedefs of anonymous structs were too common of a pattern in C to simply completely fail to distinguish them, and we felt that a structural hash of the members was likely to be problematic in other ways. It is reasonable to ask that we document this, though. Akira, I suggest:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Note that this patch doesn't have the code to resign function pointers, which is needed for type conversions. That will be implemented in subsequent patches. |
||
II = Typedef->getDeclName().getAsIdentifierInfo(); | ||
|
||
if (!II) { | ||
OS << "<anonymous_record>"; | ||
return; | ||
} | ||
OS << II->getLength() << II->getName(); | ||
return; | ||
} | ||
case Type::DeducedTemplateSpecialization: | ||
case Type::Auto: | ||
#define NON_CANONICAL_TYPE(Class, Base) case Type::Class: | ||
#define DEPENDENT_TYPE(Class, Base) case Type::Class: | ||
#define NON_CANONICAL_UNLESS_DEPENDENT_TYPE(Class, Base) case Type::Class: | ||
#define ABSTRACT_TYPE(Class, Base) | ||
#define TYPE(Class, Base) | ||
#include "clang/AST/TypeNodes.inc" | ||
llvm_unreachable("unexpected non-canonical or dependent type!"); | ||
return; | ||
} | ||
} | ||
|
||
uint16_t ASTContext::getPointerAuthTypeDiscriminator(QualType T) const { | ||
assert(!T->isDependentType() && | ||
"cannot compute type discriminator of a dependent type"); | ||
|
||
SmallString<256> Str; | ||
llvm::raw_svector_ostream Out(Str); | ||
|
||
if (T->isFunctionPointerType() || T->isFunctionReferenceType()) | ||
T = T->getPointeeType(); | ||
|
||
if (T->isFunctionType()) | ||
encodeTypeForFunctionPointerAuth(*this, Out, T); | ||
else | ||
llvm_unreachable( | ||
"type discrimination of non-function type not implemented yet"); | ||
|
||
return llvm::getPointerAuthStableSipHash(Str); | ||
} | ||
|
||
QualType ASTContext::getObjCGCQualType(QualType T, | ||
Qualifiers::GC GCAttr) const { | ||
QualType CanT = getCanonicalType(T); | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2220,6 +2220,11 @@ llvm::Constant *ConstantLValueEmitter::emitPointerAuthPointer(const Expr *E) { | |
|
||
// The assertions here are all checked by Sema. | ||
assert(Result.Val.isLValue()); | ||
const auto *Base = Result.Val.getLValueBase().get<const ValueDecl *>(); | ||
if (const auto *Decl = dyn_cast_or_null<FunctionDecl>(Base)) { | ||
ahatanak marked this conversation as resolved.
Show resolved
Hide resolved
|
||
assert(Result.Val.getLValueOffset().isZero()); | ||
return CGM.getRawFunctionPointer(Decl); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why is this code not part of ConstantEmitter()? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This function is called to return an unsigned pointer. The returned pointer is used to create a signed pointer for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In that case, why are you special-casing functions here? This looks like emitPointerAuthPointer can return either a signed pointer or an unsigned pointer, depending on what exactly the expression is referring to, which seems bad. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The checks in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just noticed that there's an unnecessary null check in the code above. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Okay, I think I follow what this is doing, then... but it seems like a terrible way to structure the code; it's repeating code, and the relevant logic is split over multiple disconnected functions. Can we add an argument to ConstantLValueEmitter() so users can request that functions are either signed or unsigned, or something like that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You mean to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Revised code looks fine. |
||
} | ||
return ConstantEmitter(CGM, Emitter.CGF) | ||
.emitAbstract(E->getExprLoc(), Result.Val, E->getType()); | ||
} | ||
|
Uh oh!
There was an error while loading. Please reload this page.