Skip to content

[clang][RISCV][Zicfilp] Emit RISCV function-signature-based CFI label in llvm::Function metadata #111661

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

mylai-mtk
Copy link
Contributor

@mylai-mtk mylai-mtk commented Oct 9, 2024

This method calculates a CFI label used in the RISC-V Zicfilp func-sig CFI scheme for a given function type/declaration. The scheme, according to psABI, encodes the label based on function signature, and the rules are modified from the Itanium C++ ABI mangling rule to allow functions (callees) that are called indirectly to have the expected label as indicated by the function pointer type seen at the call site (caller).

This method is a pre-requisite to enable RISC-V Zicfilp CFI with func-sig label scheme.

The implemented psABI scheme can be found at riscv-non-isa/riscv-elf-psabi-doc#434 .

Update: This PR is expanded to include emitting the RISC-V Zicfilp func-sig CFI label in the metadata of generated llvm::Functions. The purpose is to allow testing of the mangled function signatures.

@mylai-mtk mylai-mtk requested a review from kito-cheng October 9, 2024 11:22
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Oct 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Oct 9, 2024

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-clang

Author: Ming-Yi Lai (mylai-mtk)

Changes

This method calculates a CFI label used in the RISC-V Zicfilp func-sig CFI scheme for a given function type/declaration. The scheme, according to psABI, encodes the label based on function signature, and the rules are modified from the Itanium C++ ABI mangling rule to allow functions (callees) that are called indirectly to have the expected label as indicated by the function pointer type seen at the call site (caller).

This method is a pre-requisite to enable RISC-V Zicfilp CFI with func-sig label scheme.

The implemented psABI scheme can be found at riscv-non-isa/riscv-elf-psabi-doc#434 .


Full diff: https://github.com/llvm/llvm-project/pull/111661.diff

6 Files Affected:

  • (modified) clang/include/clang/AST/ASTContext.h (+7)
  • (modified) clang/include/clang/AST/Mangle.h (+5)
  • (modified) clang/lib/AST/ASTContext.cpp (+6)
  • (modified) clang/lib/AST/ItaniumMangle.cpp (+135-2)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+50)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+8)
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index a4d36f2eacd5d1..4812e0bac2cfc3 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -1876,6 +1876,9 @@ class ASTContext : public RefCountedBase<ASTContext> {
   /// (struct/union/class/enum) decl.
   QualType getTagDeclType(const TagDecl *Decl) const;
 
+  /// Return the type for "void *"
+  QualType getVoidPtrType() const { return VoidPtrTy; }
+
   /// Return the unique type for "size_t" (C99 7.17), defined in
   /// <stddef.h>.
   ///
@@ -1903,6 +1906,10 @@ class ASTContext : public RefCountedBase<ASTContext> {
   /// defined in <stddef.h> as defined by the target.
   QualType getWideCharType() const { return WideCharTy; }
 
+  /// Return the type of wide characters in C context, no matter whether it's C
+  /// or C++ being compiled.
+  QualType getWCharTypeInC() const;
+
   /// Return the type of "signed wchar_t".
   ///
   /// Used when in C++, as a GCC extension.
diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h
index d5f6c0f6cc67df..16cbd802177ba4 100644
--- a/clang/include/clang/AST/Mangle.h
+++ b/clang/include/clang/AST/Mangle.h
@@ -212,6 +212,11 @@ class ItaniumMangleContext : public MangleContext {
 
   virtual void mangleModuleInitializer(const Module *Module, raw_ostream &) = 0;
 
+  virtual void mangleForRISCVZicfilpFuncSigLabel(const FunctionType &FT,
+                                                 const bool IsCXXInstanceMethod,
+                                                 const bool IsCXXVirtualMethod,
+                                                 raw_ostream &) = 0;
+
   // This has to live here, otherwise the CXXNameMangler won't have access to
   // it.
   virtual DiscriminatorOverrideTy getDiscriminatorOverride() const = 0;
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 034fbbe0bc7829..ce8688a489cb43 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -6442,6 +6442,12 @@ CanQualType ASTContext::getUIntMaxType() const {
   return getFromTargetType(Target->getUIntMaxType());
 }
 
+/// Return the type of wide characters in C context, no matter whether it's C
+/// or C++ being compiled.
+QualType ASTContext::getWCharTypeInC() const {
+  return getFromTargetType(Target->getWCharType());
+}
+
 /// getSignedWCharType - Return the type of "signed wchar_t".
 /// Used when in C++, as a GCC extension.
 QualType ASTContext::getSignedWCharType() const {
diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 777cdca1a0c0d7..9eff94b0844092 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -43,6 +43,14 @@ using namespace clang;
 
 namespace {
 
+static bool mayBeCovariant(const Type &Ty) {
+  if (auto *const PT = Ty.getAs<PointerType>())
+    return PT->getPointeeType()->isStructureOrClassType();
+  if (auto *const RT = Ty.getAs<ReferenceType>())
+    return RT->getPointeeType()->isStructureOrClassType();
+  return false;
+}
+
 static bool isLocalContainerContext(const DeclContext *DC) {
   return isa<FunctionDecl>(DC) || isa<ObjCMethodDecl>(DC) || isa<BlockDecl>(DC);
 }
@@ -136,6 +144,11 @@ class ItaniumMangleContextImpl : public ItaniumMangleContext {
 
   void mangleModuleInitializer(const Module *Module, raw_ostream &) override;
 
+  void mangleForRISCVZicfilpFuncSigLabel(const FunctionType &,
+                                         const bool IsCXXInstanceMethod,
+                                         const bool IsCXXVirtualMethod,
+                                         raw_ostream &) override;
+
   bool getNextDiscriminator(const NamedDecl *ND, unsigned &disc) {
     // Lambda closure types are already numbered.
     if (isLambda(ND))
@@ -395,8 +408,10 @@ class CXXNameMangler {
   llvm::DenseMap<uintptr_t, unsigned> Substitutions;
   llvm::DenseMap<StringRef, unsigned> ModuleSubstitutions;
 
+protected:
   ASTContext &getASTContext() const { return Context.getASTContext(); }
 
+private:
   bool isCompatibleWith(LangOptions::ClangABI Ver) {
     return Context.getASTContext().getLangOpts().getClangABICompat() <= Ver;
   }
@@ -443,6 +458,8 @@ class CXXNameMangler {
     NullOut = true;
   }
 
+  virtual ~CXXNameMangler() = default;
+
   struct WithTemplateDepthOffset { unsigned Offset; };
   CXXNameMangler(ItaniumMangleContextImpl &C, raw_ostream &Out,
                  WithTemplateDepthOffset Offset)
@@ -559,9 +576,12 @@ class CXXNameMangler {
                                       StringRef Prefix = "");
   void mangleOperatorName(DeclarationName Name, unsigned Arity);
   void mangleOperatorName(OverloadedOperatorKind OO, unsigned Arity);
+
+protected:
   void mangleQualifiers(Qualifiers Quals, const DependentAddressSpaceType *DAST = nullptr);
   void mangleRefQualifier(RefQualifierKind RefQualifier);
 
+private:
   void mangleObjCMethodName(const ObjCMethodDecl *MD);
 
   // Declare manglers for every type class.
@@ -572,11 +592,24 @@ class CXXNameMangler {
 
   void mangleType(const TagType*);
   void mangleType(TemplateName);
+
+protected:
+  // Use the `Impl` scheme instead of directly virtualizing `mangleType`s since
+  // `mangleType`s are declared by tables
+  virtual void mangleTypeImpl(const BuiltinType *T);
+  virtual void mangleTypeImpl(const FunctionProtoType *T);
+  virtual void mangleTypeImpl(const FunctionNoProtoType *T);
+
+private:
   static StringRef getCallingConvQualifierName(CallingConv CC);
   void mangleExtParameterInfo(FunctionProtoType::ExtParameterInfo info);
   void mangleExtFunctionInfo(const FunctionType *T);
+
+protected:
   void mangleBareFunctionType(const FunctionProtoType *T, bool MangleReturnType,
                               const FunctionDecl *FD = nullptr);
+
+private:
   void mangleNeonVectorType(const VectorType *T);
   void mangleNeonVectorType(const DependentVectorType *T);
   void mangleAArch64NeonVectorType(const VectorType *T);
@@ -3058,7 +3091,9 @@ void CXXNameMangler::mangleCXXRecordDecl(const CXXRecordDecl *Record) {
   addSubstitution(Record);
 }
 
-void CXXNameMangler::mangleType(const BuiltinType *T) {
+void CXXNameMangler::mangleType(const BuiltinType *T) { mangleTypeImpl(T); }
+
+void CXXNameMangler::mangleTypeImpl(const BuiltinType *T) {
   //  <type>         ::= <builtin-type>
   //  <builtin-type> ::= v  # void
   //                 ::= w  # wchar_t
@@ -3563,10 +3598,14 @@ CXXNameMangler::mangleExtParameterInfo(FunctionProtoType::ExtParameterInfo PI) {
     mangleVendorQualifier("noescape");
 }
 
+void CXXNameMangler::mangleType(const FunctionProtoType *T) {
+  return mangleTypeImpl(T);
+}
+
 // <type>          ::= <function-type>
 // <function-type> ::= [<CV-qualifiers>] F [Y]
 //                      <bare-function-type> [<ref-qualifier>] E
-void CXXNameMangler::mangleType(const FunctionProtoType *T) {
+void CXXNameMangler::mangleTypeImpl(const FunctionProtoType *T) {
   mangleExtFunctionInfo(T);
 
   // Mangle CV-qualifiers, if present.  These are 'this' qualifiers,
@@ -3604,6 +3643,10 @@ void CXXNameMangler::mangleType(const FunctionProtoType *T) {
 }
 
 void CXXNameMangler::mangleType(const FunctionNoProtoType *T) {
+  return mangleTypeImpl(T);
+}
+
+void CXXNameMangler::mangleTypeImpl(const FunctionNoProtoType *T) {
   // Function types without prototypes can arise when mangling a function type
   // within an overloadable function in C. We mangle these as the absence of any
   // parameter types (not even an empty parameter list).
@@ -7074,6 +7117,85 @@ bool CXXNameMangler::shouldHaveAbiTags(ItaniumMangleContextImpl &C,
   return TrackAbiTags.AbiTagsRoot.getUsedAbiTags().size();
 }
 
+namespace {
+
+class RISCVZicfilpFuncSigLabelMangler : public CXXNameMangler {
+  bool IsTopLevelAndCXXVirtualMethod;
+
+public:
+  RISCVZicfilpFuncSigLabelMangler(ItaniumMangleContextImpl &C, raw_ostream &Out,
+                                  const bool IsCXXVirtualMethod)
+      : CXXNameMangler(C, Out),
+        IsTopLevelAndCXXVirtualMethod(/*IsTopLevel=*/true &&
+                                      IsCXXVirtualMethod) {}
+
+  void mangleTypeImpl(const BuiltinType *T) override {
+    if (T->getKind() == BuiltinType::WChar_S ||
+        T->getKind() == BuiltinType::WChar_U) {
+      const Type *const OverrideT =
+          getASTContext().getWCharTypeInC().getTypePtr();
+      assert(isa<BuiltinType>(OverrideT) &&
+             "`wchar_t' in C is expected to be defined to a built-in type");
+      T = static_cast<const BuiltinType *>(OverrideT);
+    }
+    return CXXNameMangler::mangleTypeImpl(T);
+  }
+
+  // This <function-type> is the RISC-V psABI modified version
+  // <function-type> ::= [<CV-qualifiers>] [Dx] F <bare-function-type>
+  //                     [<ref-qualifier>] E
+  void mangleTypeImpl(const FunctionProtoType *T) override {
+    // Mangle CV-qualifiers, if present.  These are 'this' qualifiers,
+    // e.g. "const" in "int (A::*)() const".
+    mangleQualifiers(T->getMethodQuals());
+
+    getStream() << 'F';
+
+    bool MangleReturnType = true;
+    if (const Type &RetT = *T->getReturnType().getTypePtr();
+        IsTopLevelAndCXXVirtualMethod && mayBeCovariant(RetT)) {
+      // Possible covariant types mangle dummy cv-unqualified `class v` as its
+      // class type
+      if (RetT.isPointerType())
+        getStream() << "P1v";
+      else if (RetT.isLValueReferenceType())
+        getStream() << "R1v";
+      else {
+        assert(RetT.isRValueReferenceType() &&
+               "Expect an r-value ref for covariant return type that is not a "
+               "pointer or an l-value ref");
+        getStream() << "O1v";
+      }
+
+      IsTopLevelAndCXXVirtualMethod = false; // Not top-level anymore
+      MangleReturnType = false;
+    }
+    mangleBareFunctionType(T, MangleReturnType);
+
+    // Mangle the ref-qualifier, if present.
+    mangleRefQualifier(T->getRefQualifier());
+
+    getStream() << 'E';
+  }
+
+  void mangleTypeImpl(const FunctionNoProtoType *T) override {
+    return CXXNameMangler::mangleTypeImpl(toFunctionProtoType(T));
+  }
+
+private:
+  const FunctionProtoType *
+  toFunctionProtoType(const FunctionNoProtoType *const T) {
+    FunctionProtoType::ExtProtoInfo EPI;
+    EPI.ExtInfo = T->getExtInfo();
+    const Type *const NewT = getASTContext()
+                                 .getFunctionType(T->getReturnType(), {}, EPI)
+                                 .getTypePtr();
+    return static_cast<const FunctionProtoType *>(NewT);
+  }
+}; // class RISCVZicfilpFuncSigLabelMangler
+
+} // anonymous namespace
+
 //
 
 /// Mangles the name of the declaration D and emits that name to the given
@@ -7412,6 +7534,17 @@ void ItaniumMangleContextImpl::mangleModuleInitializer(const Module *M,
   }
 }
 
+void ItaniumMangleContextImpl::mangleForRISCVZicfilpFuncSigLabel(
+    const FunctionType &FT, const bool IsCXXInstanceMethod,
+    const bool IsCXXVirtualMethod, raw_ostream &Out) {
+  if (IsCXXInstanceMethod)
+    // member methods uses a dummy class named `v` in place of real classes
+    Out << "M1v";
+
+  RISCVZicfilpFuncSigLabelMangler Mangler(*this, Out, IsCXXVirtualMethod);
+  Mangler.mangleType(QualType(&FT, 0));
+}
+
 ItaniumMangleContext *ItaniumMangleContext::create(ASTContext &Context,
                                                    DiagnosticsEngine &Diags,
                                                    bool IsAux) {
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 5ba098144a74e7..c7b6b56690c9ab 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2829,6 +2829,56 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
       F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+uint32_t
+CodeGenModule::calcRISCVZicfilpFuncSigLabel(const FunctionType &FT,
+                                            const bool IsCXXInstanceMethod,
+                                            const bool IsCXXVirtualMethod) {
+  std::string OutName;
+  llvm::raw_string_ostream Out(OutName);
+  MangleContext &MC = getCXXABI().getMangleContext();
+  cast<ItaniumMangleContext>(MC).mangleForRISCVZicfilpFuncSigLabel(
+      FT, IsCXXInstanceMethod, IsCXXVirtualMethod, Out);
+
+  const llvm::MD5::MD5Result MD5Result =
+      llvm::MD5::hash({(const uint8_t *)OutName.data(), OutName.size()});
+
+  // Take 20 bits of MD5 result with the approach specified by psABI
+  uint64_t MD5High = MD5Result.high();
+  uint64_t MD5Low = MD5Result.low();
+  while (MD5High && MD5Low) {
+    const uint32_t Low20Bits = MD5Low & 0xFFFFFULL;
+    if (Low20Bits)
+      return Low20Bits;
+
+    // Logical right shift MD5 result by 20 bits
+    MD5Low = (MD5High & 0xFFFFF) << 44 | MD5Low >> 20;
+    MD5High >>= 20;
+  }
+
+  return llvm::MD5Hash("RISC-V") & 0xFFFFFULL;
+}
+
+uint32_t CodeGenModule::calcRISCVZicfilpFuncSigLabel(const FunctionDecl &FD) {
+  if (FD.isMain())
+    // All main functions use `int main(int, char**)` for label calculation
+    // according to psABI spec. This value is the pre-calculated label.
+    return 0xd0639;
+
+  if (isa<CXXDestructorDecl>(FD))
+    // All destructors use `void (void*)` for label calculation according to the
+    // psABI spec. This value is the pre-calculated label.
+    return 0x639c2;
+
+  bool IsCXXInstanceMethod = false;
+  bool IsCXXVirtualMethod = false;
+  if (const auto *const MD = dyn_cast<CXXMethodDecl>(&FD)) {
+    IsCXXInstanceMethod = MD->isInstance();
+    IsCXXVirtualMethod = MD->isVirtual();
+  }
+  return calcRISCVZicfilpFuncSigLabel(*FD.getFunctionType(),
+                                      IsCXXInstanceMethod, IsCXXVirtualMethod);
+}
+
 void CodeGenModule::setKCFIType(const FunctionDecl *FD, llvm::Function *F) {
   llvm::LLVMContext &Ctx = F->getContext();
   llvm::MDBuilder MDB(Ctx);
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index c58bb88035ca8a..937b7c8a11b060 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1559,6 +1559,14 @@ class CodeGenModule : public CodeGenTypeCache {
   /// Emit KCFI type identifier constants and remove unused identifiers.
   void finalizeKCFITypes();
 
+  /// Calculate RISC-V Zicfilp func-sig scheme CFI label
+  uint32_t calcRISCVZicfilpFuncSigLabel(const FunctionType &FT,
+                                        const bool IsCXXInstanceMethod,
+                                        const bool IsCXXVirtualMethod);
+
+  /// Calculate RISC-V Zicfilp func-sig scheme CFI label
+  uint32_t calcRISCVZicfilpFuncSigLabel(const FunctionDecl &FD);
+
   /// Whether this function's return type has no side effects, and thus may
   /// be trivially discarded if it is unused.
   bool MayDropFunctionReturn(const ASTContext &Context,

@llvmbot
Copy link
Member

llvmbot commented Oct 9, 2024

@llvm/pr-subscribers-clang-codegen

Author: Ming-Yi Lai (mylai-mtk)

Changes

This method calculates a CFI label used in the RISC-V Zicfilp func-sig CFI scheme for a given function type/declaration. The scheme, according to psABI, encodes the label based on function signature, and the rules are modified from the Itanium C++ ABI mangling rule to allow functions (callees) that are called indirectly to have the expected label as indicated by the function pointer type seen at the call site (caller).

This method is a pre-requisite to enable RISC-V Zicfilp CFI with func-sig label scheme.

The implemented psABI scheme can be found at riscv-non-isa/riscv-elf-psabi-doc#434 .


Full diff: https://github.com/llvm/llvm-project/pull/111661.diff

6 Files Affected:

  • (modified) clang/include/clang/AST/ASTContext.h (+7)
  • (modified) clang/include/clang/AST/Mangle.h (+5)
  • (modified) clang/lib/AST/ASTContext.cpp (+6)
  • (modified) clang/lib/AST/ItaniumMangle.cpp (+135-2)
  • (modified) clang/lib/CodeGen/CodeGenModule.cpp (+50)
  • (modified) clang/lib/CodeGen/CodeGenModule.h (+8)
diff --git a/clang/include/clang/AST/ASTContext.h b/clang/include/clang/AST/ASTContext.h
index a4d36f2eacd5d1..4812e0bac2cfc3 100644
--- a/clang/include/clang/AST/ASTContext.h
+++ b/clang/include/clang/AST/ASTContext.h
@@ -1876,6 +1876,9 @@ class ASTContext : public RefCountedBase<ASTContext> {
   /// (struct/union/class/enum) decl.
   QualType getTagDeclType(const TagDecl *Decl) const;
 
+  /// Return the type for "void *"
+  QualType getVoidPtrType() const { return VoidPtrTy; }
+
   /// Return the unique type for "size_t" (C99 7.17), defined in
   /// <stddef.h>.
   ///
@@ -1903,6 +1906,10 @@ class ASTContext : public RefCountedBase<ASTContext> {
   /// defined in <stddef.h> as defined by the target.
   QualType getWideCharType() const { return WideCharTy; }
 
+  /// Return the type of wide characters in C context, no matter whether it's C
+  /// or C++ being compiled.
+  QualType getWCharTypeInC() const;
+
   /// Return the type of "signed wchar_t".
   ///
   /// Used when in C++, as a GCC extension.
diff --git a/clang/include/clang/AST/Mangle.h b/clang/include/clang/AST/Mangle.h
index d5f6c0f6cc67df..16cbd802177ba4 100644
--- a/clang/include/clang/AST/Mangle.h
+++ b/clang/include/clang/AST/Mangle.h
@@ -212,6 +212,11 @@ class ItaniumMangleContext : public MangleContext {
 
   virtual void mangleModuleInitializer(const Module *Module, raw_ostream &) = 0;
 
+  virtual void mangleForRISCVZicfilpFuncSigLabel(const FunctionType &FT,
+                                                 const bool IsCXXInstanceMethod,
+                                                 const bool IsCXXVirtualMethod,
+                                                 raw_ostream &) = 0;
+
   // This has to live here, otherwise the CXXNameMangler won't have access to
   // it.
   virtual DiscriminatorOverrideTy getDiscriminatorOverride() const = 0;
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 034fbbe0bc7829..ce8688a489cb43 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -6442,6 +6442,12 @@ CanQualType ASTContext::getUIntMaxType() const {
   return getFromTargetType(Target->getUIntMaxType());
 }
 
+/// Return the type of wide characters in C context, no matter whether it's C
+/// or C++ being compiled.
+QualType ASTContext::getWCharTypeInC() const {
+  return getFromTargetType(Target->getWCharType());
+}
+
 /// getSignedWCharType - Return the type of "signed wchar_t".
 /// Used when in C++, as a GCC extension.
 QualType ASTContext::getSignedWCharType() const {
diff --git a/clang/lib/AST/ItaniumMangle.cpp b/clang/lib/AST/ItaniumMangle.cpp
index 777cdca1a0c0d7..9eff94b0844092 100644
--- a/clang/lib/AST/ItaniumMangle.cpp
+++ b/clang/lib/AST/ItaniumMangle.cpp
@@ -43,6 +43,14 @@ using namespace clang;
 
 namespace {
 
+static bool mayBeCovariant(const Type &Ty) {
+  if (auto *const PT = Ty.getAs<PointerType>())
+    return PT->getPointeeType()->isStructureOrClassType();
+  if (auto *const RT = Ty.getAs<ReferenceType>())
+    return RT->getPointeeType()->isStructureOrClassType();
+  return false;
+}
+
 static bool isLocalContainerContext(const DeclContext *DC) {
   return isa<FunctionDecl>(DC) || isa<ObjCMethodDecl>(DC) || isa<BlockDecl>(DC);
 }
@@ -136,6 +144,11 @@ class ItaniumMangleContextImpl : public ItaniumMangleContext {
 
   void mangleModuleInitializer(const Module *Module, raw_ostream &) override;
 
+  void mangleForRISCVZicfilpFuncSigLabel(const FunctionType &,
+                                         const bool IsCXXInstanceMethod,
+                                         const bool IsCXXVirtualMethod,
+                                         raw_ostream &) override;
+
   bool getNextDiscriminator(const NamedDecl *ND, unsigned &disc) {
     // Lambda closure types are already numbered.
     if (isLambda(ND))
@@ -395,8 +408,10 @@ class CXXNameMangler {
   llvm::DenseMap<uintptr_t, unsigned> Substitutions;
   llvm::DenseMap<StringRef, unsigned> ModuleSubstitutions;
 
+protected:
   ASTContext &getASTContext() const { return Context.getASTContext(); }
 
+private:
   bool isCompatibleWith(LangOptions::ClangABI Ver) {
     return Context.getASTContext().getLangOpts().getClangABICompat() <= Ver;
   }
@@ -443,6 +458,8 @@ class CXXNameMangler {
     NullOut = true;
   }
 
+  virtual ~CXXNameMangler() = default;
+
   struct WithTemplateDepthOffset { unsigned Offset; };
   CXXNameMangler(ItaniumMangleContextImpl &C, raw_ostream &Out,
                  WithTemplateDepthOffset Offset)
@@ -559,9 +576,12 @@ class CXXNameMangler {
                                       StringRef Prefix = "");
   void mangleOperatorName(DeclarationName Name, unsigned Arity);
   void mangleOperatorName(OverloadedOperatorKind OO, unsigned Arity);
+
+protected:
   void mangleQualifiers(Qualifiers Quals, const DependentAddressSpaceType *DAST = nullptr);
   void mangleRefQualifier(RefQualifierKind RefQualifier);
 
+private:
   void mangleObjCMethodName(const ObjCMethodDecl *MD);
 
   // Declare manglers for every type class.
@@ -572,11 +592,24 @@ class CXXNameMangler {
 
   void mangleType(const TagType*);
   void mangleType(TemplateName);
+
+protected:
+  // Use the `Impl` scheme instead of directly virtualizing `mangleType`s since
+  // `mangleType`s are declared by tables
+  virtual void mangleTypeImpl(const BuiltinType *T);
+  virtual void mangleTypeImpl(const FunctionProtoType *T);
+  virtual void mangleTypeImpl(const FunctionNoProtoType *T);
+
+private:
   static StringRef getCallingConvQualifierName(CallingConv CC);
   void mangleExtParameterInfo(FunctionProtoType::ExtParameterInfo info);
   void mangleExtFunctionInfo(const FunctionType *T);
+
+protected:
   void mangleBareFunctionType(const FunctionProtoType *T, bool MangleReturnType,
                               const FunctionDecl *FD = nullptr);
+
+private:
   void mangleNeonVectorType(const VectorType *T);
   void mangleNeonVectorType(const DependentVectorType *T);
   void mangleAArch64NeonVectorType(const VectorType *T);
@@ -3058,7 +3091,9 @@ void CXXNameMangler::mangleCXXRecordDecl(const CXXRecordDecl *Record) {
   addSubstitution(Record);
 }
 
-void CXXNameMangler::mangleType(const BuiltinType *T) {
+void CXXNameMangler::mangleType(const BuiltinType *T) { mangleTypeImpl(T); }
+
+void CXXNameMangler::mangleTypeImpl(const BuiltinType *T) {
   //  <type>         ::= <builtin-type>
   //  <builtin-type> ::= v  # void
   //                 ::= w  # wchar_t
@@ -3563,10 +3598,14 @@ CXXNameMangler::mangleExtParameterInfo(FunctionProtoType::ExtParameterInfo PI) {
     mangleVendorQualifier("noescape");
 }
 
+void CXXNameMangler::mangleType(const FunctionProtoType *T) {
+  return mangleTypeImpl(T);
+}
+
 // <type>          ::= <function-type>
 // <function-type> ::= [<CV-qualifiers>] F [Y]
 //                      <bare-function-type> [<ref-qualifier>] E
-void CXXNameMangler::mangleType(const FunctionProtoType *T) {
+void CXXNameMangler::mangleTypeImpl(const FunctionProtoType *T) {
   mangleExtFunctionInfo(T);
 
   // Mangle CV-qualifiers, if present.  These are 'this' qualifiers,
@@ -3604,6 +3643,10 @@ void CXXNameMangler::mangleType(const FunctionProtoType *T) {
 }
 
 void CXXNameMangler::mangleType(const FunctionNoProtoType *T) {
+  return mangleTypeImpl(T);
+}
+
+void CXXNameMangler::mangleTypeImpl(const FunctionNoProtoType *T) {
   // Function types without prototypes can arise when mangling a function type
   // within an overloadable function in C. We mangle these as the absence of any
   // parameter types (not even an empty parameter list).
@@ -7074,6 +7117,85 @@ bool CXXNameMangler::shouldHaveAbiTags(ItaniumMangleContextImpl &C,
   return TrackAbiTags.AbiTagsRoot.getUsedAbiTags().size();
 }
 
+namespace {
+
+class RISCVZicfilpFuncSigLabelMangler : public CXXNameMangler {
+  bool IsTopLevelAndCXXVirtualMethod;
+
+public:
+  RISCVZicfilpFuncSigLabelMangler(ItaniumMangleContextImpl &C, raw_ostream &Out,
+                                  const bool IsCXXVirtualMethod)
+      : CXXNameMangler(C, Out),
+        IsTopLevelAndCXXVirtualMethod(/*IsTopLevel=*/true &&
+                                      IsCXXVirtualMethod) {}
+
+  void mangleTypeImpl(const BuiltinType *T) override {
+    if (T->getKind() == BuiltinType::WChar_S ||
+        T->getKind() == BuiltinType::WChar_U) {
+      const Type *const OverrideT =
+          getASTContext().getWCharTypeInC().getTypePtr();
+      assert(isa<BuiltinType>(OverrideT) &&
+             "`wchar_t' in C is expected to be defined to a built-in type");
+      T = static_cast<const BuiltinType *>(OverrideT);
+    }
+    return CXXNameMangler::mangleTypeImpl(T);
+  }
+
+  // This <function-type> is the RISC-V psABI modified version
+  // <function-type> ::= [<CV-qualifiers>] [Dx] F <bare-function-type>
+  //                     [<ref-qualifier>] E
+  void mangleTypeImpl(const FunctionProtoType *T) override {
+    // Mangle CV-qualifiers, if present.  These are 'this' qualifiers,
+    // e.g. "const" in "int (A::*)() const".
+    mangleQualifiers(T->getMethodQuals());
+
+    getStream() << 'F';
+
+    bool MangleReturnType = true;
+    if (const Type &RetT = *T->getReturnType().getTypePtr();
+        IsTopLevelAndCXXVirtualMethod && mayBeCovariant(RetT)) {
+      // Possible covariant types mangle dummy cv-unqualified `class v` as its
+      // class type
+      if (RetT.isPointerType())
+        getStream() << "P1v";
+      else if (RetT.isLValueReferenceType())
+        getStream() << "R1v";
+      else {
+        assert(RetT.isRValueReferenceType() &&
+               "Expect an r-value ref for covariant return type that is not a "
+               "pointer or an l-value ref");
+        getStream() << "O1v";
+      }
+
+      IsTopLevelAndCXXVirtualMethod = false; // Not top-level anymore
+      MangleReturnType = false;
+    }
+    mangleBareFunctionType(T, MangleReturnType);
+
+    // Mangle the ref-qualifier, if present.
+    mangleRefQualifier(T->getRefQualifier());
+
+    getStream() << 'E';
+  }
+
+  void mangleTypeImpl(const FunctionNoProtoType *T) override {
+    return CXXNameMangler::mangleTypeImpl(toFunctionProtoType(T));
+  }
+
+private:
+  const FunctionProtoType *
+  toFunctionProtoType(const FunctionNoProtoType *const T) {
+    FunctionProtoType::ExtProtoInfo EPI;
+    EPI.ExtInfo = T->getExtInfo();
+    const Type *const NewT = getASTContext()
+                                 .getFunctionType(T->getReturnType(), {}, EPI)
+                                 .getTypePtr();
+    return static_cast<const FunctionProtoType *>(NewT);
+  }
+}; // class RISCVZicfilpFuncSigLabelMangler
+
+} // anonymous namespace
+
 //
 
 /// Mangles the name of the declaration D and emits that name to the given
@@ -7412,6 +7534,17 @@ void ItaniumMangleContextImpl::mangleModuleInitializer(const Module *M,
   }
 }
 
+void ItaniumMangleContextImpl::mangleForRISCVZicfilpFuncSigLabel(
+    const FunctionType &FT, const bool IsCXXInstanceMethod,
+    const bool IsCXXVirtualMethod, raw_ostream &Out) {
+  if (IsCXXInstanceMethod)
+    // member methods uses a dummy class named `v` in place of real classes
+    Out << "M1v";
+
+  RISCVZicfilpFuncSigLabelMangler Mangler(*this, Out, IsCXXVirtualMethod);
+  Mangler.mangleType(QualType(&FT, 0));
+}
+
 ItaniumMangleContext *ItaniumMangleContext::create(ASTContext &Context,
                                                    DiagnosticsEngine &Diags,
                                                    bool IsAux) {
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 5ba098144a74e7..c7b6b56690c9ab 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -2829,6 +2829,56 @@ void CodeGenModule::CreateFunctionTypeMetadataForIcall(const FunctionDecl *FD,
       F->addTypeMetadata(0, llvm::ConstantAsMetadata::get(CrossDsoTypeId));
 }
 
+uint32_t
+CodeGenModule::calcRISCVZicfilpFuncSigLabel(const FunctionType &FT,
+                                            const bool IsCXXInstanceMethod,
+                                            const bool IsCXXVirtualMethod) {
+  std::string OutName;
+  llvm::raw_string_ostream Out(OutName);
+  MangleContext &MC = getCXXABI().getMangleContext();
+  cast<ItaniumMangleContext>(MC).mangleForRISCVZicfilpFuncSigLabel(
+      FT, IsCXXInstanceMethod, IsCXXVirtualMethod, Out);
+
+  const llvm::MD5::MD5Result MD5Result =
+      llvm::MD5::hash({(const uint8_t *)OutName.data(), OutName.size()});
+
+  // Take 20 bits of MD5 result with the approach specified by psABI
+  uint64_t MD5High = MD5Result.high();
+  uint64_t MD5Low = MD5Result.low();
+  while (MD5High && MD5Low) {
+    const uint32_t Low20Bits = MD5Low & 0xFFFFFULL;
+    if (Low20Bits)
+      return Low20Bits;
+
+    // Logical right shift MD5 result by 20 bits
+    MD5Low = (MD5High & 0xFFFFF) << 44 | MD5Low >> 20;
+    MD5High >>= 20;
+  }
+
+  return llvm::MD5Hash("RISC-V") & 0xFFFFFULL;
+}
+
+uint32_t CodeGenModule::calcRISCVZicfilpFuncSigLabel(const FunctionDecl &FD) {
+  if (FD.isMain())
+    // All main functions use `int main(int, char**)` for label calculation
+    // according to psABI spec. This value is the pre-calculated label.
+    return 0xd0639;
+
+  if (isa<CXXDestructorDecl>(FD))
+    // All destructors use `void (void*)` for label calculation according to the
+    // psABI spec. This value is the pre-calculated label.
+    return 0x639c2;
+
+  bool IsCXXInstanceMethod = false;
+  bool IsCXXVirtualMethod = false;
+  if (const auto *const MD = dyn_cast<CXXMethodDecl>(&FD)) {
+    IsCXXInstanceMethod = MD->isInstance();
+    IsCXXVirtualMethod = MD->isVirtual();
+  }
+  return calcRISCVZicfilpFuncSigLabel(*FD.getFunctionType(),
+                                      IsCXXInstanceMethod, IsCXXVirtualMethod);
+}
+
 void CodeGenModule::setKCFIType(const FunctionDecl *FD, llvm::Function *F) {
   llvm::LLVMContext &Ctx = F->getContext();
   llvm::MDBuilder MDB(Ctx);
diff --git a/clang/lib/CodeGen/CodeGenModule.h b/clang/lib/CodeGen/CodeGenModule.h
index c58bb88035ca8a..937b7c8a11b060 100644
--- a/clang/lib/CodeGen/CodeGenModule.h
+++ b/clang/lib/CodeGen/CodeGenModule.h
@@ -1559,6 +1559,14 @@ class CodeGenModule : public CodeGenTypeCache {
   /// Emit KCFI type identifier constants and remove unused identifiers.
   void finalizeKCFITypes();
 
+  /// Calculate RISC-V Zicfilp func-sig scheme CFI label
+  uint32_t calcRISCVZicfilpFuncSigLabel(const FunctionType &FT,
+                                        const bool IsCXXInstanceMethod,
+                                        const bool IsCXXVirtualMethod);
+
+  /// Calculate RISC-V Zicfilp func-sig scheme CFI label
+  uint32_t calcRISCVZicfilpFuncSigLabel(const FunctionDecl &FD);
+
   /// Whether this function's return type has no side effects, and thus may
   /// be trivially discarded if it is unused.
   bool MayDropFunctionReturn(const ASTContext &Context,

@kito-cheng kito-cheng requested a review from topperc October 9, 2024 12:33
Copy link
Member

@kito-cheng kito-cheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest either put more thing to make this PR test-able or write a unittest for this.

@mylai-mtk
Copy link
Contributor Author

Update:

  • Rebase to main
  • Address comments: Expand the scope of this PR to allow testing
  • Address other comments

@mylai-mtk mylai-mtk changed the title [clang][RISCV] Introduce CodeGenModule::calcRISCVZicfilpFuncSigLabel() [clang][RISCV] Emit RISCV function-signature-based CFI label in llvm::Function Oct 11, 2024
@mylai-mtk mylai-mtk changed the title [clang][RISCV] Emit RISCV function-signature-based CFI label in llvm::Function [clang][RISCV] Emit RISCV function-signature-based CFI label in llvm::Function metadata Oct 11, 2024
@mylai-mtk mylai-mtk force-pushed the zicfilp-calc-func-sig-label branch from 9041af3 to 0cc12aa Compare October 11, 2024 11:41
Copy link
Member

@kito-cheng kito-cheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A high level suggestion is don't hash that until MC layer, so that we can easier debug and observe that from the IR level, so that means we can drop the hash part from this PR.

const Type *const NewT = getASTContext()
.getFunctionType(T->getReturnType(), {}, EPI)
.getTypePtr();
return static_cast<const FunctionProtoType *>(NewT);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty broken if you actually have code using pre-C23 semantics. If you don't want to think about it, just require that everyone use C23, and you won't see any FunctionNoProtoType. Otherwise, you need to actually do something reasonable, like using the type of the call arguments.

}
MangleReturnType = false;
}
mangleBareFunctionType(T, MangleReturnType);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if you're restricting yourself to C23, C type compatibility rules are complicated... this probably rejects some valid code. Like, C allows the following:

void f(int(*x)[]);
void g() { f(0); }
void f(int(*x)[12]) {}

See also #96992.

@mylai-mtk
Copy link
Contributor Author

A high level suggestion is don't hash that until MC layer, so that we can easier debug and observe that from the IR level, so that means we can drop the hash part from this PR.

@kito-cheng I prefer to keep the function signature as an auxiliary metadata with the name riscv_lpad_func_sig, so future schemes and other language frontends can reuse the riscv_lpad_label metadata flow and have the opportunity of not being forced to adopt the idea of function signatures. Besides, this is in fact what we currently have in our internal branch, since a later development of the Zicfilp psABI spec introduced the lpad mapping symbol, which writes function signatures into ELF files, and thus the internal branch had already been modified to pass function signature as an extra metadata.

Regarding the testability issue, I think later I'll update this PR so clang attaches both the riscv_lpad_label and riscv_lpad_func_sig metadata to Functions, and unittests can be added specifically for the zicfilpFuncSigHash() function.

@kito-cheng
Copy link
Member

I prefer to keep the function signature as an auxiliary metadata with the name riscv_lpad_func_sig, so future schemes and other language frontends can reuse the riscv_lpad_label metadata flow and have the opportunity of not being forced to adopt the idea of function signatures.

My point is I would like to see the signature string in IR, so that sounds good to me.

@mylai-mtk mylai-mtk force-pushed the zicfilp-calc-func-sig-label branch from 0cc12aa to b3a7b9f Compare May 23, 2025 08:03
@mylai-mtk
Copy link
Contributor Author

Update: Rebased to main trunk; keep function signature in LLVM IR

@mylai-mtk
Copy link
Contributor Author

@efriedma-quic Thanks for pointing out the potential type incompatibilities regarding the current mangling scheme.

Regarding this issue, I believe there's currently no perfect solution given the complexity of C type compatibility. It looks like even in the AArch64 PtrAuth implementation you referred to (#96992), some known compatible cases still result in incompatible mangling results.

Given that the mangling scheme implemented here had already undergone extensive discussion in riscv-non-isa/riscv-elf-psabi-doc#434 , I would prefer to avoid spec-induced type compatibility issues and focus on implementation in this PR.

If you have any concrete case which you believe should result in compatible mangling results but does not due to the draft spec, please raise the issue in the RISC-V psABI thread riscv-non-isa/riscv-elf-psabi-doc#434 , so people there (me included, of course) would be able to incorporate the solution into the spec. This PR is just an implementation of the psABI spec.

For this PR, please comment on mangling issues only if it arises from mis-implementation of the draft spec. Thank you.

@efriedma-quic
Copy link
Collaborator

The issues I've raised are tied up with quality of implementation, even if the spec doesn't change.

For missing prototypes, we can statically detect calls which pass arguments to function without a prototype. Generating code we know will never work, without a diagnostic, is a bad idea.

For the type merging issue, we can't detect it completely reliably, but we can detect function definitions which aren't compatible with the corresponding function declaration, and diagnose that.

@mylai-mtk
Copy link
Contributor Author

mylai-mtk commented May 26, 2025

For missing prototypes, we can statically detect calls which pass arguments to function without a prototype. Generating code we know will never work, without a diagnostic, is a bad idea.

@efriedma-quic If what you need to see are diagnostics, I can add them in this PR. I already have this particular one on my internal branch since I need it to filter out those tests that are expected to fail due to passing arguments to unprototyped functions.

For the type merging issue, we can't detect it completely reliably, but we can detect function definitions which aren't compatible with the corresponding function declaration, and diagnose that.

I think "function definitions which aren't compatible with the corresponding function declaration" is an error and should already be diagnosed? Do you mean "functions which aren't compatible with the corresponding function pointer variable declaration"?

@mylai-mtk mylai-mtk changed the title [clang][RISCV] Emit RISCV function-signature-based CFI label in llvm::Function metadata [clang][RISCV][Zicfilp] Emit RISCV function-signature-based CFI label in llvm::Function metadata May 27, 2025
@efriedma-quic
Copy link
Collaborator

There are intentional tradeoffs here, for the ABI, which should be made carefully; in some cases, it might be the right tradeoff to reject standard-compliant code. If you've considered it, I guess I won't object. I can't find any discussion of it, though.

If we're intentionally going this route, I'd like to see appropriate diagnostics, yes.

For unprototyped signatures specifically, it's probably simpler to just require users to upgrade to -std=c23/-std=gnu23, as opposed to replicating the existing diagnostics.


I think "function definitions which aren't compatible with the corresponding function declaration" is an error and should already be diagnosed? Do you mean "functions which aren't compatible with the corresponding function pointer variable declaration"?

I meant something like my example #111661 (comment) . Diagnosing implicit function pointer casts is probably also useful.

@jrtc27
Copy link
Collaborator

jrtc27 commented May 28, 2025

With my psABI hat on, settling on an ABI that turns out to be wrong is my big concern, but you can't find those corner cases until you try it at scale (and that's speaking from experience where we've missed things in the past for CHERI). This is probably the kind of thing that needs to be explicitly marked as experimental and with no ABI forward-compatibility guarantees, but that needs to be landed so that people can start actually using it and finding where it falls down in the real world. (But that doesn't obviate proper code review, the implementation should still be proper even if the ABI isn't stable.)

Given the comparison with the AArch64 PAuth work, how much is building on that versus doing our own independent thing? That is, are we learning from existing solutions, or are we reinventing our own wheel? Is there any real-world deployment experience from that that we can make use of?

@mylai-mtk
Copy link
Contributor Author

mylai-mtk commented May 28, 2025

There are intentional tradeoffs here, for the ABI, which should be made carefully; in some cases, it might be the right tradeoff to reject standard-compliant code. If you've considered it, I guess I won't object. I can't find any discussion of it, though.

The reduction of unprototyped function types to prototyped no parameter function types is proposed by me in riscv-non-isa/riscv-elf-psabi-doc#434 (comment), which is accepted without much discussion/objection. I also proposed treating such usages as errors, but this part had not been taken into the spec.

With my psABI hat on, settling on an ABI that turns out to be wrong is my big concern, but you can't find those corner cases until you try it at scale (and that's speaking from experience where we've missed things in the past for CHERI). This is probably the kind of thing that needs to be explicitly marked as experimental and with no ABI forward-compatibility guarantees, but that needs to be landed so that people can start actually using it and finding where it falls down in the real world. (But that doesn't obviate proper code review, the implementation should still be proper even if the ABI isn't stable.)

Agree. I'm pretty sure that the mangling scheme is not perfect, especially when applied to ancient C projects. With regard to forward-compatibility management, I think we can introduce versioning to Zicfilp psABI as well? (cf. Tag_RISCV_atomic_abi in .riscv.attributes section) @kito-cheng

Given the comparison with the AArch64 PAuth work, how much is building on that versus doing our own independent thing?

We did not refer to the AArch64 PtrAuth work when we devised the Zicfilp mangling scheme, and instead, our foundation was the Itanium C++ ABI. The time at which these 2 mangling schemes appeared was close, and I did not know that AArch64 was developing a similar thing back then, so adapting the Itanium C++ ABI looked like the most practical approach.

Is there any real-world deployment experience from that that we can make use of?

Not that I know of... The AArch64 PtrAuth work is in the main trunk for only less than a year, and it may take multiple years for the feature to mature, so I would say that we can surely learn some valuable things from it if we observe it long enough, but for now, I think both schemes are in a similarly nascent status, with the AArch64 PtrAuth scheme being more explicit and controllable, but less discriminating for different type structures (which is not necessarily a bad thing).

Personally, I would surely adopt the AArch64 PtrAuth type scheme (or any other scheme, if it exists) if it can be proved that this scheme solves the possible issues we currently expect, but I just can't find any discussions and design choice explanations about the scheme, so I cannot know which cases were intentional and which were not. The scheme itself is also missing from the AArch64 PtrAuth psABI, so I would consider it to be off-spec and Clang-specific ABI, which means it'll need to go through some other lengthy processes to formalize it in terms of C/C++ before being adopted for both Clang and GCC.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:RISC-V clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category llvm:support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants