[ThinLTO][Bitcode] Generate import type in bitcode #87600

Merged
merged 26 commits into from
May 22, 2024
Changes from 23 commits
78fa2e0
[ThinLTO]Record import type (declaration or definition) in GlobalValu…
mingmingl-llvm Apr 4, 2024
4b4c33f
[ThinLTO]Generate import type in bitcode writer
mingmingl-llvm Apr 4, 2024
cfb63d7
function import changes
mingmingl-llvm Apr 4, 2024
b0a4060
'git merge main' and resolve merge conflict
mingmingl-llvm Apr 11, 2024
e07d423
Merge branch 'main' into users/minglotus-6/spr/summary1
mingmingl-llvm Apr 11, 2024
175febc
rename Dec to Decl
mingmingl-llvm Apr 16, 2024
8bc4d7a
Merge branch 'main' into users/minglotus-6/spr/summary1
mingmingl-llvm Apr 16, 2024
7252e11
Merge branch 'users/minglotus-6/spr/summary1' into users/minglotus-6/…
mingmingl-llvm Apr 16, 2024
23a3fa4
simplify code
mingmingl-llvm Apr 17, 2024
a0277b4
Flag gate the import-declaration change.
mingmingl-llvm Apr 17, 2024
5edeccc
stage changes
mingmingl-llvm Apr 23, 2024
a139e4a
Add test coverage for import delcaration
mingmingl-llvm May 2, 2024
916dc96
add test coverage for function alias
mingmingl-llvm May 3, 2024
248fbfb
llvm-lto test coverage
mingmingl-llvm May 5, 2024
5e731ec
add test coverage for debugging log, and a RUN line for internalization
mingmingl-llvm May 6, 2024
c5e168e
Merge branch 'main' into users/minglotus-6/spr/summary1
mingmingl-llvm May 7, 2024
2c38b47
resolve review feedback
mingmingl-llvm May 7, 2024
3da9b8b
Merge branch 'users/minglotus-6/spr/summary1' into users/minglotus-6/…
mingmingl-llvm May 7, 2024
f1d22e1
Diffbase is updated (https://github.com/llvm/llvm-project/pull/87600/…
mingmingl-llvm May 7, 2024
86f3f44
resolve review feedback
mingmingl-llvm May 10, 2024
001a785
update this patch as the second one
mingmingl-llvm May 14, 2024
321f6aa
update stale comment and use 'DAG' for check lines
mingmingl-llvm May 14, 2024
31d9bd2
add comment for 'DecSummaries' parameter in FunctionImport.h
mingmingl-llvm May 15, 2024
ac8e9fa
run 'git merge main'
mingmingl-llvm May 20, 2024
bde377c
resolve review feedback
mingmingl-llvm May 20, 2024
779dcf3
Update regression test:
mingmingl-llvm May 21, 2024
9 changes: 6 additions & 3 deletions llvm/include/llvm/Bitcode/BitcodeWriter.h
@@ -102,7 +102,8 @@ class raw_ostream;

void writeIndex(
const ModuleSummaryIndex *Index,
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex);
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex,
const GVSummaryPtrSet *DecSummaries);
};

/// Write the specified module to the specified raw output stream.
@@ -147,10 +148,12 @@ class raw_ostream;
/// where it will be written in a new bitcode block. This is used when
/// writing the combined index file for ThinLTO. When writing a subset of the
/// index for a distributed backend, provide the \p ModuleToSummariesForIndex
/// map.
/// map. \p DecSummaries specifies the set of summaries for which the
/// corresponding value should be imported as a declaration (prototype).
void writeIndexToFile(const ModuleSummaryIndex &Index, raw_ostream &Out,
const std::map<std::string, GVSummaryMapTy>
*ModuleToSummariesForIndex = nullptr);
*ModuleToSummariesForIndex = nullptr,
const GVSummaryPtrSet *DecSummaries = nullptr);

/// If EmbedBitcode is set, save a copy of the llvm IR as data in the
/// __LLVM,__bitcode section (.llvmbc on non-MacOS).
7 changes: 7 additions & 0 deletions llvm/include/llvm/IR/ModuleSummaryIndex.h
@@ -587,6 +587,10 @@ class GlobalValueSummary {

void setImportKind(ImportKind IK) { Flags.ImportType = IK; }

GlobalValueSummary::ImportKind importType() const {
return static_cast<ImportKind>(Flags.ImportType);
}

GlobalValue::VisibilityTypes getVisibility() const {
return (GlobalValue::VisibilityTypes)Flags.Visibility;
}
@@ -1272,6 +1276,9 @@ using ModulePathStringTableTy = StringMap<ModuleHash>;
/// a particular module, and provide efficient access to their summary.
using GVSummaryMapTy = DenseMap<GlobalValue::GUID, GlobalValueSummary *>;

/// A set of global value summary pointers.
using GVSummaryPtrSet = SmallPtrSet<GlobalValueSummary *, 4>;

/// Map of a type GUID to type id string and summary (multimap used
/// in case of GUID conflicts).
using TypeIdSummaryMapTy =
5 changes: 3 additions & 2 deletions llvm/include/llvm/LTO/legacy/ThinLTOCodeGenerator.h
@@ -271,12 +271,13 @@ class ThinLTOCodeGenerator {
const lto::InputFile &File);

/**
* Compute the list of summaries needed for importing into module.
* Compute the list of summaries and the subset of declaration summaries
* needed for importing into module.
*/
void gatherImportedSummariesForModule(
Module &Module, ModuleSummaryIndex &Index,
std::map<std::string, GVSummaryMapTy> &ModuleToSummariesForIndex,
const lto::InputFile &File);
GVSummaryPtrSet &DecSummaries, const lto::InputFile &File);

/**
* Perform internalization. Index is updated to reflect linkage changes.
21 changes: 15 additions & 6 deletions llvm/include/llvm/Transforms/IPO/FunctionImport.h
@@ -31,9 +31,9 @@ class Module;
/// based on the provided summary information.
class FunctionImporter {
public:
/// Set of functions to import from a source module. Each entry is a set
/// containing all the GUIDs of all functions to import for a source module.
using FunctionsToImportTy = std::unordered_set<GlobalValue::GUID>;
/// The functions to import from a source module and their import type.
using FunctionsToImportTy =
DenseMap<GlobalValue::GUID, GlobalValueSummary::ImportKind>;

/// The different reasons selectCallee will choose not to import a
/// candidate.
@@ -99,8 +99,13 @@ class FunctionImporter {
/// index's module path string table).
using ImportMapTy = DenseMap<StringRef, FunctionsToImportTy>;

/// The set contains an entry for every global value the module exports.
using ExportSetTy = DenseSet<ValueInfo>;
/// The map contains an entry for every global value the module exports.
/// The key is ValueInfo, and the value indicates whether the definition
/// or declaration is visible to another module. If a function's definition is
/// visible to other modules, the global values this function referenced are
/// visible and shouldn't be internalized.
/// TODO: Rename to `ExportMapTy`.
using ExportSetTy = DenseMap<ValueInfo, GlobalValueSummary::ImportKind>;

/// A function of this type is used to load modules referenced by the index.
using ModuleLoaderTy =
@@ -207,11 +212,15 @@ bool convertToDeclaration(GlobalValue &GV);
/// \p ModuleToSummariesForIndex will be populated with the needed summaries
/// from each required module path. Use a std::map instead of StringMap to get
/// stable order for bitcode emission.
///
/// \p DecSummaries will be populated with the subset of summary pointers
/// that have 'declaration' import type among all summaries the module needs.
void gatherImportedSummariesForModule(
StringRef ModulePath,
const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
const FunctionImporter::ImportMapTy &ImportList,
std::map<std::string, GVSummaryMapTy> &ModuleToSummariesForIndex);
std::map<std::string, GVSummaryMapTy> &ModuleToSummariesForIndex,
GVSummaryPtrSet &DecSummaries);
Contributor

Can you add a documentation for the DecSummaries parameter here too? (similar to the ThinLTOCodeGenerator.h comment, might be useful to indicate it's a subset of ModuleToSummariesForIndex).

Contributor Author

done.
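
As a rough illustration of the relationship the reviewer asks to document (`DecSummaries` being a subset of the summaries in `ModuleToSummariesForIndex`), here is a minimal standalone sketch. It uses only the standard library; `SketchSummary`, `SummaryMap`, `ImportList`, and `gatherSummaries` are hypothetical names, and the `ImportKind` values are an assumption. It is not the code in FunctionImport.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <unordered_map>

// Assumed ordering: a definition compares lower than a declaration.
enum ImportKind : unsigned { Definition = 0, Declaration = 1 };

// Hypothetical stand-in for GlobalValueSummary.
struct SketchSummary {};

// GUID -> summary, per module (hypothetical stand-in for GVSummaryMapTy).
using SummaryMap = std::unordered_map<uint64_t, SketchSummary *>;
// Source module path -> (GUID -> import kind).
using ImportList =
    std::map<std::string, std::unordered_map<uint64_t, ImportKind>>;

// Collect every summary the importing module needs, and separately record
// the ones that should only be imported as declarations.
inline void gatherSummaries(
    const std::map<std::string, SummaryMap> &ModuleToDefined,
    const ImportList &Imports,
    std::map<std::string, SummaryMap> &ModuleToSummariesForIndex,
    std::set<SketchSummary *> &DecSummaries) {
  for (const auto &ModuleAndFunctions : Imports) {
    const std::string &ModulePath = ModuleAndFunctions.first;
    auto DefinedIt = ModuleToDefined.find(ModulePath);
    assert(DefinedIt != ModuleToDefined.end() && "import from unknown module");
    const SummaryMap &Defined = DefinedIt->second;
    SummaryMap &ForIndex = ModuleToSummariesForIndex[ModulePath];
    for (const auto &GUIDAndKind : ModuleAndFunctions.second) {
      auto It = Defined.find(GUIDAndKind.first);
      assert(It != Defined.end() && "imported GUID has no summary");
      // Every imported summary goes into the per-module map ...
      ForIndex[GUIDAndKind.first] = It->second;
      // ... and declaration-only imports are additionally remembered here.
      if (GUIDAndKind.second == Declaration)
        DecSummaries.insert(It->second);
    }
  }
}
```

Every pointer placed in `DecSummaries` is also stored in `ModuleToSummariesForIndex`, which is the subset property in question.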


/// Emit into \p OutputFilename the files module \p ModulePath will import from.
std::error_code EmitImportsFiles(
38 changes: 30 additions & 8 deletions llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -428,6 +428,11 @@ class IndexBitcodeWriter : public BitcodeWriterBase {
/// The combined index to write to bitcode.
const ModuleSummaryIndex &Index;

/// When writing combined summaries, provides the set of global value
/// summaries for which the value (function, function alias, etc) should be
/// imported as a declaration.
const GVSummaryPtrSet *DecSummaries = nullptr;

/// When writing a subset of the index for distributed backends, client
/// provides a map of modules to the corresponding GUIDs/summaries to write.
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex;
@@ -452,11 +457,16 @@ class IndexBitcodeWriter : public BitcodeWriterBase {
/// Constructs a IndexBitcodeWriter object for the given combined index,
/// writing to the provided \p Buffer. When writing a subset of the index
/// for a distributed backend, provide a \p ModuleToSummariesForIndex map.
/// If provided, \p DecSummaries specifies the set of summaries for
/// which the corresponding functions or aliased functions should be imported
/// as a declaration (but not definition) for each module.
IndexBitcodeWriter(BitstreamWriter &Stream, StringTableBuilder &StrtabBuilder,
const ModuleSummaryIndex &Index,
const GVSummaryPtrSet *DecSummaries = nullptr,
const std::map<std::string, GVSummaryMapTy>
*ModuleToSummariesForIndex = nullptr)
: BitcodeWriterBase(Stream, StrtabBuilder), Index(Index),
DecSummaries(DecSummaries),
ModuleToSummariesForIndex(ModuleToSummariesForIndex) {
// Assign unique value ids to all summaries to be written, for use
// in writing out the call graph edges. Save the mapping from GUID
@@ -1202,7 +1212,8 @@ static uint64_t getEncodedFFlags(FunctionSummary::FFlags Flags) {

// Encode the flags for GlobalValue in the summary. See getDecodedGVSummaryFlags
// in BitcodeReader.cpp.
static uint64_t getEncodedGVSummaryFlags(GlobalValueSummary::GVFlags Flags) {
static uint64_t getEncodedGVSummaryFlags(GlobalValueSummary::GVFlags Flags,
bool ImportAsDecl = false) {
uint64_t RawFlags = 0;

RawFlags |= Flags.NotEligibleToImport; // bool
@@ -1217,7 +1228,8 @@ static uint64_t getEncodedGVSummaryFlags(GlobalValueSummary::GVFlags Flags) {

RawFlags |= (Flags.Visibility << 8); // 2 bits

RawFlags |= (Flags.ImportType << 10); // 1 bit
unsigned ImportType = Flags.ImportType | ImportAsDecl;
RawFlags |= (ImportType << 10); // 1 bit

return RawFlags;
}
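
The hunk above ORs the summary's own import type with a caller-supplied `ImportAsDecl` bit before shifting the result into bit 10. A minimal standalone sketch of that step follows. `SketchFlags`, `encodeFlags`, and `decodesAsDeclaration` are hypothetical names, only the two fields whose shifts are visible in this hunk are modeled, and the meaning of the bit (0 for definition, 1 for declaration) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for GlobalValueSummary::GVFlags; only the two fields
// whose bit positions appear in the hunk above are modeled.
struct SketchFlags {
  unsigned Visibility : 2; // written to bits 8-9
  unsigned ImportType : 1; // written to bit 10 (assumed: 1 = declaration)
};

// OR the stored import type with the writer-side override, then shift the
// result into bit 10.
inline uint64_t encodeFlags(SketchFlags Flags, bool ImportAsDecl = false) {
  uint64_t RawFlags = 0;
  RawFlags |= (static_cast<uint64_t>(Flags.Visibility) << 8);
  unsigned ImportType = Flags.ImportType | static_cast<unsigned>(ImportAsDecl);
  RawFlags |= (static_cast<uint64_t>(ImportType) << 10);
  return RawFlags;
}

// Hypothetical reader-side check, used only to verify the round trip.
inline bool decodesAsDeclaration(uint64_t RawFlags) {
  return ((RawFlags >> 10) & 1) != 0;
}
```

Because the override is OR'd in, it can only turn a definition into a declaration in the written flags, never the reverse.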
@@ -4543,6 +4555,12 @@ void IndexBitcodeWriter::writeCombinedGlobalValueSummary() {
Abbv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 8));
unsigned AllocAbbrev = Stream.EmitAbbrev(std::move(Abbv));

auto shouldImportValueAsDecl = [&](GlobalValueSummary *GVS) -> bool {
if (DecSummaries == nullptr)
return false;
return DecSummaries->contains(GVS);
};

// The aliases are emitted as a post-pass, and will point to the value
// id of the aliasee. Save them in a vector for post-processing.
SmallVector<AliasSummary *, 64> Aliases;
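
The `shouldImportValueAsDecl` lambda in this hunk treats a null `DecSummaries` pointer as "no overrides". A small standalone sketch of that optional-set lookup, with `std::unordered_set` standing in for `SmallPtrSet` and the hypothetical names `SketchSummary` and `shouldImportAsDecl`:

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical stand-in for GlobalValueSummary.
struct SketchSummary {};

using SketchSummaryPtrSet = std::unordered_set<const SketchSummary *>;

// A null set means there are no declaration overrides, so every summary keeps
// whatever import type its own flags carry.
inline bool shouldImportAsDecl(const SketchSummaryPtrSet *DecSummaries,
                               const SketchSummary *S) {
  if (DecSummaries == nullptr)
    return false;
  return DecSummaries->count(S) != 0;
}
```

Passing the set by pointer with a `nullptr` default lets callers that write a full combined index skip the argument entirely.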
@@ -4653,7 +4671,8 @@ void IndexBitcodeWriter::writeCombinedGlobalValueSummary() {
NameVals.push_back(*ValueId);
assert(ModuleIdMap.count(FS->modulePath()));
NameVals.push_back(ModuleIdMap[FS->modulePath()]);
NameVals.push_back(getEncodedGVSummaryFlags(FS->flags()));
NameVals.push_back(
getEncodedGVSummaryFlags(FS->flags(), shouldImportValueAsDecl(FS)));
NameVals.push_back(FS->instCount());
NameVals.push_back(getEncodedFFlags(FS->fflags()));
NameVals.push_back(FS->entryCount());
@@ -4702,7 +4721,8 @@ void IndexBitcodeWriter::writeCombinedGlobalValueSummary() {
NameVals.push_back(AliasValueId);
assert(ModuleIdMap.count(AS->modulePath()));
NameVals.push_back(ModuleIdMap[AS->modulePath()]);
NameVals.push_back(getEncodedGVSummaryFlags(AS->flags()));
NameVals.push_back(
getEncodedGVSummaryFlags(AS->flags(), shouldImportValueAsDecl(AS)));
auto AliaseeValueId = SummaryToValueIdMap[&AS->getAliasee()];
assert(AliaseeValueId);
NameVals.push_back(AliaseeValueId);
@@ -5036,8 +5056,9 @@ void BitcodeWriter::writeModule(const Module &M,

void BitcodeWriter::writeIndex(
const ModuleSummaryIndex *Index,
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex) {
IndexBitcodeWriter IndexWriter(*Stream, StrtabBuilder, *Index,
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex,
const GVSummaryPtrSet *DecSummaries) {
IndexBitcodeWriter IndexWriter(*Stream, StrtabBuilder, *Index, DecSummaries,
ModuleToSummariesForIndex);
IndexWriter.write();
}
@@ -5090,12 +5111,13 @@ void IndexBitcodeWriter::write() {
// index for a distributed backend, provide a \p ModuleToSummariesForIndex map.
void llvm::writeIndexToFile(
const ModuleSummaryIndex &Index, raw_ostream &Out,
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex) {
const std::map<std::string, GVSummaryMapTy> *ModuleToSummariesForIndex,
const GVSummaryPtrSet *DecSummaries) {
SmallVector<char, 0> Buffer;
Buffer.reserve(256 * 1024);

BitcodeWriter Writer(Buffer);
Writer.writeIndex(&Index, ModuleToSummariesForIndex);
Writer.writeIndex(&Index, ModuleToSummariesForIndex, DecSummaries);
Writer.writeStrtab();

Out.write((char *)&Buffer.front(), Buffer.size());
42 changes: 29 additions & 13 deletions llvm/lib/LTO/LTO.cpp
@@ -121,6 +121,9 @@ void llvm::computeLTOCacheKey(
support::endian::write64le(Data, I);
Hasher.update(Data);
};
auto AddUint8 = [&](const uint8_t &I) {
Hasher.update(ArrayRef<uint8_t>((const uint8_t *)&I, 1));
};
AddString(Conf.CPU);
// FIXME: Hash more of Options. For now all clients initialize Options from
// command-line flags (which is unsupported in production), but may set
@@ -156,18 +159,23 @@ void llvm::computeLTOCacheKey(
auto ModHash = Index.getModuleHash(ModuleID);
Hasher.update(ArrayRef<uint8_t>((uint8_t *)&ModHash[0], sizeof(ModHash)));

std::vector<uint64_t> ExportsGUID;
std::vector<std::pair<uint64_t, uint8_t>> ExportsGUID;
ExportsGUID.reserve(ExportList.size());
for (const auto &VI : ExportList) {
auto GUID = VI.getGUID();
ExportsGUID.push_back(GUID);
}
for (const auto &[VI, ExportType] : ExportList)
ExportsGUID.push_back(
std::make_pair(VI.getGUID(), static_cast<uint8_t>(ExportType)));

// Sort the export list elements GUIDs.
llvm::sort(ExportsGUID);
for (uint64_t GUID : ExportsGUID) {
llvm::sort(ExportsGUID, [](const std::pair<uint64_t, uint8_t> &LHS,
const std::pair<uint64_t, uint8_t> &RHS) {
if (LHS.first != RHS.first)
return LHS.first < RHS.first;
return LHS.second < RHS.second;
});
for (auto [GUID, ExportType] : ExportsGUID) {
// The export list can impact the internalization, be conservative here
Hasher.update(ArrayRef<uint8_t>((uint8_t *)&GUID, sizeof(GUID)));
AddUint8(ExportType);
}

// Include the hash for every module we import functions from. The set of
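
The export-list part of the cache key in this hunk sorts `(GUID, export type)` pairs and then feeds both fields to the hasher, so the key is independent of map iteration order but sensitive to the export type. A minimal standalone sketch of that idea follows. `ToyHasher` is a hypothetical FNV-1a stand-in, not the hasher used by `computeLTOCacheKey`, and the GUID is serialized as explicit little-endian bytes here as a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using ExportEntry = std::pair<uint64_t, uint8_t>; // (GUID, export type)

// FNV-1a over raw bytes; a toy stand-in for the real hasher.
struct ToyHasher {
  uint64_t State = 0xcbf29ce484222325ULL;
  void update(const uint8_t *Data, std::size_t Size) {
    for (std::size_t I = 0; I < Size; ++I) {
      State ^= Data[I];
      State *= 0x100000001b3ULL;
    }
  }
};

// Hash the export list in a deterministic order. The list is taken by value
// so the caller's copy is not reordered.
inline uint64_t hashExports(std::vector<ExportEntry> Exports) {
  // Order by GUID, then by export type. std::pair's operator< already
  // compares lexicographically, so no custom comparator is needed here.
  std::sort(Exports.begin(), Exports.end());
  ToyHasher Hasher;
  for (const ExportEntry &Entry : Exports) {
    uint8_t Bytes[8];
    for (int I = 0; I < 8; ++I)
      Bytes[I] = static_cast<uint8_t>((Entry.first >> (8 * I)) & 0xff);
    Hasher.update(Bytes, sizeof(Bytes)); // the GUID
    Hasher.update(&Entry.second, 1);     // the export type, one byte
  }
  return Hasher.State;
}
```

Two lists with the same entries in a different order hash identically, while flipping one entry's export type changes the result, which is the property the cache key needs once declarations and definitions are distinguished.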
@@ -204,8 +212,10 @@ void llvm::computeLTOCacheKey(
Hasher.update(ArrayRef<uint8_t>((uint8_t *)&ModHash[0], sizeof(ModHash)));

AddUint64(Entry.getFunctions().size());
for (auto &Fn : Entry.getFunctions())
AddUint64(Fn);
for (auto &[GUID, ImportType] : Entry.getFunctions()) {
AddUint64(GUID);
AddUint8(ImportType);
}
}

// Include the hash for the resolved ODR.
@@ -275,9 +285,9 @@ void llvm::computeLTOCacheKey(
// Imported functions may introduce new uses of type identifier resolutions,
// so we need to collect their used resolutions as well.
for (const ImportModule &ImpM : ImportModulesVector)
for (auto &ImpF : ImpM.getFunctions()) {
for (auto &[GUID, UnusedImportType] : ImpM.getFunctions()) {
GlobalValueSummary *S =
Index.findSummaryInModule(ImpF, ImpM.getIdentifier());
Index.findSummaryInModule(GUID, ImpM.getIdentifier());
AddUsedThings(S);
// If this is an alias, we also care about any types/etc. that the aliasee
// may reference.
@@ -1389,15 +1399,21 @@ class lto::ThinBackendProc {
llvm::StringRef ModulePath,
const std::string &NewModulePath) {
std::map<std::string, GVSummaryMapTy> ModuleToSummariesForIndex;
GVSummaryPtrSet DeclarationSummaries;

std::error_code EC;
gatherImportedSummariesForModule(ModulePath, ModuleToDefinedGVSummaries,
ImportList, ModuleToSummariesForIndex);
ImportList, ModuleToSummariesForIndex,
DeclarationSummaries);

raw_fd_ostream OS(NewModulePath + ".thinlto.bc", EC,
sys::fs::OpenFlags::OF_None);
if (EC)
return errorCodeToError(EC);
writeIndexToFile(CombinedIndex, OS, &ModuleToSummariesForIndex);

// TODO: Serialize declaration bits to bitcode.
writeIndexToFile(CombinedIndex, OS, &ModuleToSummariesForIndex,
&DeclarationSummaries);

if (ShouldEmitImportsFiles) {
EC = EmitImportsFiles(ModulePath, NewModulePath + ".imports",
9 changes: 8 additions & 1 deletion llvm/lib/LTO/LTOBackend.cpp
@@ -721,7 +721,14 @@ bool lto::initImportList(const Module &M,
if (Summary->modulePath() == M.getModuleIdentifier())
continue;
// Add an entry to provoke importing by thinBackend.
ImportList[Summary->modulePath()].insert(GUID);
// Try emplace the entry first. If an entry with the same key already
// exists, set the value to 'std::min(existing-value, new-value)' to make
// sure a definition takes precedence over a declaration.
auto [Iter, Inserted] = ImportList[Summary->modulePath()].try_emplace(
GUID, Summary->importType());

if (!Inserted)
Iter->second = std::min(Iter->second, Summary->importType());
}
}
return true;
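
The `try_emplace` plus `std::min` pattern in this hunk keeps the stronger import kind when the same GUID is seen more than once. A minimal standalone sketch, assuming C++17 and assuming that `Definition` is ordered before `Declaration` (the comment in the hunk relies on that ordering); `FunctionsToImport` and `addImport` are hypothetical names and `std::unordered_map` stands in for `DenseMap`:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Assumed ordering: Definition < Declaration, so std::min prefers a
// definition over a declaration.
enum ImportKind : unsigned { Definition = 0, Declaration = 1 };

using FunctionsToImport = std::unordered_map<uint64_t, ImportKind>;

// Record that GUID should be imported with Kind. If GUID is already present,
// keep the smaller kind, so a definition is never downgraded to a declaration.
inline void addImport(FunctionsToImport &Imports, uint64_t GUID,
                      ImportKind Kind) {
  auto [Iter, Inserted] = Imports.try_emplace(GUID, Kind);
  if (!Inserted)
    Iter->second = std::min(Iter->second, Kind);
}
```

With this helper, adding a GUID as a declaration and later as a definition leaves it marked as a definition, and a later declaration does not undo that.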
10 changes: 7 additions & 3 deletions llvm/lib/LTO/ThinLTOCodeGenerator.cpp
@@ -766,7 +766,7 @@ void ThinLTOCodeGenerator::crossModuleImport(Module &TheModule,
void ThinLTOCodeGenerator::gatherImportedSummariesForModule(
Module &TheModule, ModuleSummaryIndex &Index,
std::map<std::string, GVSummaryMapTy> &ModuleToSummariesForIndex,
const lto::InputFile &File) {
GVSummaryPtrSet &DecSummaries, const lto::InputFile &File) {
auto ModuleCount = Index.modulePaths().size();
auto ModuleIdentifier = TheModule.getModuleIdentifier();

Expand Down Expand Up @@ -796,7 +796,7 @@ void ThinLTOCodeGenerator::gatherImportedSummariesForModule(

llvm::gatherImportedSummariesForModule(
ModuleIdentifier, ModuleToDefinedGVSummaries,
ImportLists[ModuleIdentifier], ModuleToSummariesForIndex);
ImportLists[ModuleIdentifier], ModuleToSummariesForIndex, DecSummaries);
}

/**
@@ -832,10 +832,14 @@ void ThinLTOCodeGenerator::emitImports(Module &TheModule, StringRef OutputName,
IsPrevailing(PrevailingCopy), ImportLists,
ExportLists);

// 'EmitImportsFiles' emits the list of modules from which to import, and
// the set of keys in `ModuleToSummariesForIndex` should be a superset of keys
// in `DecSummaries`, so no need to use `DecSummaries` in `EmitImportsFiles`.
GVSummaryPtrSet DecSummaries;
std::map<std::string, GVSummaryMapTy> ModuleToSummariesForIndex;
llvm::gatherImportedSummariesForModule(
ModuleIdentifier, ModuleToDefinedGVSummaries,
ImportLists[ModuleIdentifier], ModuleToSummariesForIndex);
ImportLists[ModuleIdentifier], ModuleToSummariesForIndex, DecSummaries);

std::error_code EC;
if ((EC = EmitImportsFiles(ModuleIdentifier, OutputName,