-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[lld][COFF] Remove duplicate strtab entries #141197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
String table size is too big for large binary when symbol table is enabled. Some strings in strtab is same so it can be reused.
@llvm/pr-subscribers-platform-windows @llvm/pr-subscribers-lld-coff Author: Haohai Wen (HaohaiWen) ChangesString table size is too big for large binary when symbol table is Full diff: https://github.com/llvm/llvm-project/pull/141197.diff 1 Files Affected:
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index db6133e20a037..6ec83a5f77e5a 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -285,6 +285,7 @@ class Writer {
std::unique_ptr<FileOutputBuffer> &buffer;
std::map<PartialSectionKey, PartialSection *> partialSections;
std::vector<char> strtab;
+ StringMap<size_t> strtabMap;
std::vector<llvm::object::coff_symbol16> outputSymtab;
std::vector<ECCodeMapEntry> codeMap;
IdataContents idata;
@@ -1439,10 +1440,13 @@ void Writer::assignOutputSectionIndices() {
size_t Writer::addEntryToStringTable(StringRef str) {
assert(str.size() > COFF::NameSize);
- size_t offsetOfEntry = strtab.size() + 4; // +4 for the size field
- strtab.insert(strtab.end(), str.begin(), str.end());
- strtab.push_back('\0');
- return offsetOfEntry;
+ size_t newOffsetOfEntry = strtab.size() + 4; // +4 for the size field
+ auto res = strtabMap.try_emplace(str, newOffsetOfEntry);
+ if (res.second) {
+ strtab.insert(strtab.end(), str.begin(), str.end());
+ strtab.push_back('\0');
+ }
+ return res.first->getValue();
}
std::optional<coff_symbol16> Writer::createSymbol(Defined *def) {
|
@llvm/pr-subscribers-lld Author: Haohai Wen (HaohaiWen) ChangesString table size is too big for large binary when symbol table is Full diff: https://github.com/llvm/llvm-project/pull/141197.diff 1 Files Affected:
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index db6133e20a037..6ec83a5f77e5a 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -285,6 +285,7 @@ class Writer {
std::unique_ptr<FileOutputBuffer> &buffer;
std::map<PartialSectionKey, PartialSection *> partialSections;
std::vector<char> strtab;
+ StringMap<size_t> strtabMap;
std::vector<llvm::object::coff_symbol16> outputSymtab;
std::vector<ECCodeMapEntry> codeMap;
IdataContents idata;
@@ -1439,10 +1440,13 @@ void Writer::assignOutputSectionIndices() {
size_t Writer::addEntryToStringTable(StringRef str) {
assert(str.size() > COFF::NameSize);
- size_t offsetOfEntry = strtab.size() + 4; // +4 for the size field
- strtab.insert(strtab.end(), str.begin(), str.end());
- strtab.push_back('\0');
- return offsetOfEntry;
+ size_t newOffsetOfEntry = strtab.size() + 4; // +4 for the size field
+ auto res = strtabMap.try_emplace(str, newOffsetOfEntry);
+ if (res.second) {
+ strtab.insert(strtab.end(), str.begin(), str.end());
+ strtab.push_back('\0');
+ }
+ return res.first->getValue();
}
std::optional<coff_symbol16> Writer::createSymbol(Defined *def) {
|
Test needed. In ELF we can use |
Instead of just using This was attempted before in 9ffeaaa, but I had to revert it in 4d2eda2; see the revert commit for an explanation of the issues I ran into, including a suggestion on what one could do to reapply it. Anyway, using a Do you have any numbers on how much binary size you save with this? It could be interesting to compare with the |
Thanks for reminding. |
How about using two StringTableBuilder, one for long section name and another for others? |
I don't plan on reviving it myself right now, but if you're working in this area, I would appreciate if you'd try to reapply that patch (revert the revert) and see if it can be made to work again on current git main. Then with that in place, it would be interesting to see if you see any further significant size reductions on your blender testcase. If it doesn't really reduce the binary size any much further (perhaps cases where tail merging makes any difference are rare?), then perhaps it's best to just go with the much simpler approach of your current patch here. |
String table size is too big for large binary when symbol table is
enabled. Some strings in strtab is same so it can be reused.