Description
In general we try to use the LLVM C API whenever we can as it's generally nice and stable. It also has the great benefit of being maintained by LLVM so it tends to never be a pain point when upgrading LLVM! Unfortunately though LLVM's C API isn't 100% comprehensive and we often need functionality above and beyond what you can do with just C.
For this custom functionality we typically use the C++ API of LLVM and compile our own shims, which in turn expose a C API. At the time of this writing all of the C++ to C shims are located in the src/rustllvm directory across three main files: ArchiveWrapper.cpp, PassWrapper.cpp, and RustWrapper.cpp. These files are all compiled via build.rs, where we basically use llvm-config to guide us in how to compile them.
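To make the "C++ API wrapped in a C API" idea concrete, here is a minimal sketch of the general shape such a shim takes. It is not the actual rustllvm code: demo::Archive is a made-up stand-in for a C++-only LLVM class, and every Demo* name is hypothetical, chosen so the example builds without LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a class that only exists in a C++ API.
namespace demo {
class Archive {
public:
  explicit Archive(std::vector<std::string> Members)
      : Names(std::move(Members)) {}
  std::size_t size() const { return Names.size(); }
  const std::string &name(std::size_t I) const { return Names[I]; }

private:
  std::vector<std::string> Names;
};
} // namespace demo

// The C side only ever sees an opaque pointer type.
typedef struct DemoOpaqueArchive *DemoArchiveRef;

// Convert between the opaque C handle and the real C++ object.
static demo::Archive *unwrap(DemoArchiveRef A) {
  return reinterpret_cast<demo::Archive *>(A);
}
static DemoArchiveRef wrap(demo::Archive *A) {
  return reinterpret_cast<DemoArchiveRef>(A);
}

// Each shim is a thin extern "C" function forwarding to the C++ object.
extern "C" DemoArchiveRef DemoOpenArchive(const char *const *Names,
                                          std::size_t Count) {
  assert(Names != nullptr || Count == 0);
  std::vector<std::string> Members;
  for (std::size_t I = 0; I < Count; ++I)
    Members.emplace_back(Names[I]);
  return wrap(new demo::Archive(std::move(Members)));
}

extern "C" std::size_t DemoArchiveNumMembers(DemoArchiveRef A) {
  return unwrap(A)->size();
}

// The returned pointer is only valid until DemoDestroyArchive is called.
extern "C" const char *DemoArchiveMemberName(DemoArchiveRef A, std::size_t I) {
  assert(I < unwrap(A)->size());
  return unwrap(A)->name(I).c_str();
}

extern "C" void DemoDestroyArchive(DemoArchiveRef A) { delete unwrap(A); }
```

The point of the sketch is that the shim itself is simple, but it compiles against whatever the C++ class looks like today, which is exactly the part that tends to change between LLVM releases.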
The downside of these shims, however, is that they're difficult for us to maintain over time. They cause problems whenever we upgrade LLVM (we have to get them compiling again as the C++ APIs change quite regularly). They also make it harder for consumers of Rust to use custom LLVM versions. For example, right now our shims compile on LLVM 5 but probably not LLVM trunk. For users that like to follow LLVM trunk, keeping up with the breakage of our shims can be quite difficult!
To help solve this problem, the ideal solution seems to be to upstream at least a big chunk of the C++ APIs that we're using. This way we can stick much more closely to LLVM's C API, which is far more stable. That makes it much easier for us to eventually upgrade LLVM, and it means users with a custom LLVM newer than the one we're using (aka LLVM trunk) don't need to worry about our shims breaking.
I'll try to keep a checklist here that we can maintain over time, which also serves as a listing of what each of the APIs does!
ArchiveWrapper.cpp
In general this is functionality for reading archive *.a files in the Rust compiler. This makes reading rlibs (which are archive files) extra speedy. The functions here are:
- LLVMRustOpenArchive
- LLVMRustDestroyArchive
- LLVMRustArchiveIteratorNew
- LLVMRustArchiveIteratorFree
- LLVMRustArchiveIteratorNext
- LLVMRustArchiveChildName
- LLVMRustArchiveChildData
- LLVMRustArchiveChildFree
- LLVMRustArchiveMemberNew
- LLVMRustArchiveMemberFree
- LLVMRustWriteArchive
These functions are basically just reading and writing archives, using iterators for reading and providing a list of structs for writing.
PassWrapper.cpp
This file is where we get into a bit more of a smorgasbord of random functions rather than a consistent theme, so I'll comment on more of them inline below.
A general theme here I've found as I wrote these down is that it's not critical that all of these are implemented. I could imagine that it would be possible to have a mode where we as rustc still compile shims sometimes (like the ones below) but many of the shims are stubbed out to not actually use LLVM at all if we're in "non-Rust-LLVM mode" (aka custom LLVM mode). In other words, we don't necessarily need to upstream 100% of these functions.
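As a rough illustration of that "stubbed out" mode, the sketch below shows one way a shim could compile to a harmless fallback when built against a custom LLVM. The DEMO_RUST_LLVM_FORK macro and the Demo* functions are hypothetical names made up for this example, and the fork-only branch uses a fake lookup rather than any real LLVM call.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical build flag: 1 when building against Rust's LLVM fork,
// 0 when building against a custom or system LLVM.
#ifndef DEMO_RUST_LLVM_FORK
#define DEMO_RUST_LLVM_FORK 0
#endif

#if DEMO_RUST_LLVM_FORK
// Stand-in for functionality that would only exist in the fork.
static bool forkOnlyFeatureLookup(const char *Feature) {
  return std::strcmp(Feature, "sse2") == 0;
}
#endif

// The exported symbol always exists, so the Rust side links either way.
extern "C" bool DemoHasFeature(const char *Feature) {
  assert(Feature != nullptr);
#if DEMO_RUST_LLVM_FORK
  return forkOnlyFeatureLookup(Feature);
#else
  // Stub: without the fork we cannot answer, so report "not available".
  (void)Feature;
  return false;
#endif
}

// Debugging helpers can degrade to a short message instead of real output.
extern "C" void DemoPrintTargetCPUs(void) {
#if DEMO_RUST_LLVM_FORK
  std::printf("Available CPUs: (would be listed here)\n");
#else
  std::printf("Target CPU help is not supported by this LLVM version.\n");
#endif
}
```

With this shape the set of exported symbols stays the same in both modes, and only the function bodies differ, so nothing on the rustc side has to change depending on which LLVM is in use.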
- LLVMInitializePasses - not entirely sure why we can't use the upstream versions. Someone more knowledgeable with LLVM may know how to replace this!
- LLVMRustFindAndCreatePass - this is how we add custom passes to a pass manager by their string name
- LLVMRustPassKind - categorizes whether a pass is a function or module pass
- LLVMRustAddPass - add a custom pass to a pass manager
- LLVMRustPassManagerBuilderPopulateThinLTOPassManager - thin wrapper around the C++ API to populate a ThinLTO pass manager
- LLVMRustHasFeature - this is actually a pretty tricky one. It has to do with #46181 ("target_feature requires embedded LLVM copy to be usable") and is, I think, the only function which actually only works with our fork. I can provide more information for this if necessary.
- LLVMRustPrintTargetCPUs - mostly just a debugging helper we could stub out in the custom LLVM case.
- LLVMRustPrintTargetFeatures - same as above
- LLVMRustCreateTargetMachine - this is one where we have to create a TargetMachineRef ourselves, which also gives us full access to all the fields; upstreaming would probably just involve exposing more field accessors and setters and such.
- LLVMRustDisposeTargetMachine - complement to the above
- LLVMRustAddAnalysisPasses - I think this is just adding "standard" passes to the pass manager IIRC, we're just trying to mirror what clang is doing here.
- LLVMRustConfigurePassManagerBuilder - just configuring some fields, again also aimed at mirroring clang.
- LLVMRustAddBuilderLibraryInfo - again, attempting to mirror clang by configuring all the fields
- LLVMRustAddLibraryInfo - mirroring clang
- LLVMRustRunFunctionPassManager - seems ripe to add upstream!
- LLVMRustSetLLVMOptions - I think this is for one-time configuration of LLVM at startup
- LLVMRustWriteOutputFile - there's a whole bunch of ways to write output files with LLVM; if we had something that just wrote it out to memory or a file that'd be good enough for us
- LLVMRustPrintModule - I'm pretty sure this is mainly just generating IR, but I'm not personally too familiar with the need for a custom class here
- LLVMRustPrintPasses - AFAIK a debugging helper, could be stubbed out with a custom LLVM
- LLVMRustAddAlwaysInlinePass - may just be missing upstream?
- LLVMRustRunRestrictionPass - I think this is part of our LTO bindings, internalizing lots of stuff
- LLVMRustMarkAllFunctionsNounwind - definitely part of our LTO bindings, for when you're compiling with -C lto and -C panic=abort
- LLVMRustSetDataLayoutFromTargetMachine - not entirely sure what this is...
- LLVMRustGetModuleDataLayout - also not entirely sure what this is...
- LLVMRustSetModulePIELevel - I think just configuring more properties
- LLVMRustThinLTOAvailable - for us just testing the LLVM version right now
- LLVMRustWriteThinBitcodeToFile - mostly just what it says on the tin
- LLVMRustThinLTOBufferCreate - same as above but in memory
- LLVMRustThinLTOBufferFree - freeing the above
- LLVMRustThinLTOBufferPtr - reading the above
- LLVMRustThinLTOBufferLen - reading the above
- LLVMRustParseBitcodeForThinLTO - mostly what it says on the tin
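The Create/Ptr/Len/Free quartet in the list above is a common way to hand an owned byte buffer across a C boundary. Below is a minimal sketch of that pattern, assuming the buffer simply copies bytes it is given. The real shims would serialize an LLVM module instead, and DemoBuffer and the Demo* functions are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical owner of the serialized bytes; opaque to the C caller.
struct DemoBuffer {
  std::string Data;
};

// Create: allocate the owner and fill it. Copying the input here stands in
// for "serialize something into memory".
extern "C" DemoBuffer *DemoBufferCreate(const char *Bytes, std::size_t Len) {
  assert(Bytes != nullptr || Len == 0);
  DemoBuffer *Buf = new DemoBuffer();
  if (Len != 0)
    Buf->Data.assign(Bytes, Len);
  return Buf;
}

// Ptr/Len: borrow the contents; valid only until DemoBufferFree is called.
extern "C" const void *DemoBufferPtr(const DemoBuffer *Buf) {
  return Buf->Data.data();
}

extern "C" std::size_t DemoBufferLen(const DemoBuffer *Buf) {
  return Buf->Data.size();
}

// Free: release the owner and the bytes together.
extern "C" void DemoBufferFree(DemoBuffer *Buf) { delete Buf; }
```

On the Rust side, a wrapper type would typically call Ptr and Len to build a byte slice and call Free in its Drop implementation, so the C++ allocation is never freed by a different allocator.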
These APIs are all related to ThinLTO and are still somewhat in flux, so there may not be a great C API just yet.
- LLVMRustCreateThinLTOData
- LLVMRustFreeThinLTOData
- LLVMRustPrepareThinLTORename
- LLVMRustPrepareThinLTOResolveWeak
- LLVMRustPrepareThinLTOInternalize
- LLVMRustPrepareThinLTOImport
RustWrapper.cpp
An even bigger smorgasbord than PassWrapper.cpp! Note that many of these functions are very old and may have actually made their way into the C API of LLVM by now, in which case that'd be awesome!
- LLVMRustCreateMemoryBufferWithContentsOfFile - this is something we can and probably should write ourselves rather than relying on LLVM
- LLVMRustGetLastError - this is a Rust-specific API for getting out an error message, I'd imagine that whenever it's set we'd have something analogous in LLVM.
- LLVMRustSetLastError - used by the C++ code to set the error that rustc will retrieve later
- LLVMRustSetNormalizedTarget - I think this is just exposing something that wasn't already there.
- LLVMRustPrintPassTimings - debugging on our end.
- LLVMRustGetNamedValue - I think this is just fun dealing with metadata
- LLVMRustGetOrInsertFunction - needed that C++ function most likely.
- LLVMRustGetOrInsertGlobal - again, probably just needed the function
- LLVMRustMetadataTypeInContext - more constructors for more types
- LLVMRustAddCallSiteAttribute - just a "fluff" thing we needed to do that wasn't possible in C IIRC
- LLVMRustAddAlignmentCallSiteAttr - same as above
- LLVMRustAddDereferenceableCallSiteAttr - same as above
- LLVMRustAddDereferenceableOrNullCallSiteAttr - same as above
- LLVMRustAddFunctionAttribute - same as above
- LLVMRustAddAlignmentAttr - same as above
- LLVMRustAddDereferenceableAttr - same as above
- LLVMRustAddDereferenceableOrNullAttr - same as above
- LLVMRustAddFunctionAttrStringValue - same as above
- LLVMRustRemoveFunctionAttributes - same as above
- LLVMRustSetHasUnsafeAlgebra - not entirely sure what this is doing...
- LLVMRustBuildAtomicLoad - I think at the time the C API didn't exist?
- LLVMRustBuildAtomicStore - same as above
- LLVMRustBuildAtomicCmpXchg - same as above
- LLVMRustBuildAtomicFence - same as above
- LLVMRustSetDebug - I think one-time configuration of LLVM
- LLVMRustInlineAsm - I think the C API didn't exist (or wasn't full-featured enough)
- LLVMRustAppendModuleInlineAsm - that function probably wasn't exposed in C
- LLVMRustVersionMinor - just exposing a constant
- LLVMRustVersionMajor - same as above
- LLVMRustDebugMetadataVersion - this and most debug functions below I think just aren't in the C API
- LLVMRustAddModuleFlag - same as above
- LLVMRustMetadataAsValue - same as above
- LLVMRustDI* - same as above (there's a whole bunch of these)
- LLVMRustWriteValueToString - IIRC this is mostly debugging
- LLVMRustLinkInExternalBitcode - used during normal LTO
- LLVMRustLinkInParsedExternalBitcode - used during normal LTO
- LLVMRustGetSectionName - not sure where this came from...
- LLVMRustArrayType - missing C API?
- LLVMRustWriteTwineToString - I think more debugging/diagnostics
- LLVMRustUnpackOptimizationDiagnostic - diagnostics
- LLVMRustUnpackInlineAsmDiagnostic - diagnostics
- LLVMRustWriteDiagnosticInfoToString - diagnostics
- LLVMRustGetDiagInfoKind - custom for us I think?
- LLVMRustGetTypeKind - missing C API?
- LLVMRustWriteDebugLocToString - debugging API I think
- LLVMRustSetInlineAsmDiagnosticHandler - dealing with inline asm diagnostics
- LLVMRustWriteSMDiagnosticToString - diagnostics
- LLVMRustBuildLandingPad - missing C API?
- LLVMRustBuildCleanupPad - same as above
- LLVMRustBuildCleanupRet - same as above
- LLVMRustBuildCatchPad - same as above
- LLVMRustBuildCatchRet - same as above
- LLVMRustBuildCatchSwitch - same as above
- LLVMRustAddHandler - same as above
- LLVMRustBuildOperandBundleDef - same as above
- LLVMRustBuildCall - same as above
- LLVMRustBuildInvoke - same as above
- LLVMRustPositionBuilderAtStart - same as above I think?
- LLVMRustSetComdat - same as above
- LLVMRustUnsetComdat - same as above
- LLVMRustGetLinkage - same as above
- LLVMRustSetLinkage - same as above
- LLVMRustConstInt128Get - same as above
- LLVMRustGetValueContext - same as above
- LLVMRustGetVisibility - same as above
- LLVMRustSetVisibility - same as above
- LLVMRustModuleBufferCreate - serializing a module to memory
- LLVMRustModuleBufferFree - freeing above
- LLVMRustModuleBufferPtr - reading above
- LLVMRustModuleBufferLen - reading above
- LLVMRustModuleCost - mostly a debugging helper
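For the LLVMRustGetLastError / LLVMRustSetLastError pair described near the top of this list, the sketch below shows the general "last error" pattern: the C++ side stashes a message, and the caller takes ownership of it later. This is an illustration under assumptions, not the real implementation. All Demo* names are hypothetical, the storage is assumed to be thread-local, and DemoParsePositive is an invented failing operation used only to exercise the pattern.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// One pending error message per thread (assumption for this sketch).
static thread_local char *DemoLastError = nullptr;

// Called by C++ shim code when something goes wrong. Replaces any
// previously stored message with a heap-allocated copy of Err.
extern "C" void DemoSetLastError(const char *Err) {
  assert(Err != nullptr);
  std::free(DemoLastError);
  std::size_t Len = std::strlen(Err) + 1;
  DemoLastError = static_cast<char *>(std::malloc(Len));
  if (DemoLastError != nullptr)
    std::memcpy(DemoLastError, Err, Len);
}

// Called by the consumer after a failure. Transfers ownership of the
// message to the caller (who must free() it) and clears the slot.
extern "C" char *DemoGetLastError(void) {
  char *Ret = DemoLastError;
  DemoLastError = nullptr;
  return Ret;
}

// Invented example of a shim that reports failure through the slot.
extern "C" bool DemoParsePositive(const char *Text, long *Out) {
  char *End = nullptr;
  long Value = std::strtol(Text, &End, 10);
  if (End == Text || *End != '\0' || Value <= 0) {
    DemoSetLastError("expected a positive integer");
    return false;
  }
  *Out = Value;
  return true;
}
```

The design keeps each shim's signature simple (a bool or a null pointer signals failure) at the cost of an extra call to fetch the message, which is presumably why an upstream C API with its own error reporting could replace it.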