-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MemProf] Use remove_if to erase MapVector elements in bulk #94269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
A cycle profile showed that we were spending a lot of time invoking MapVector::erase. According to https://llvm.org/docs/ProgrammersManual.html#llvm-adt-mapvector-h, erasing elements one at a time is very inefficient for MapVector and it is better to use remove_if. This change resulted in around 7% time reduction on a large thin link.
@llvm/pr-subscribers-llvm-transforms Author: Teresa Johnson (teresajohnson) ChangesA cycle profile showed that we were spending a lot of time invoking This change resulted in around 7% time reduction on a large thin link. Full diff: https://github.com/llvm/llvm-project/pull/94269.diff 1 Files Affected:
diff --git a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
index 0512922e35ae9..d141fd9e0b495 100644
--- a/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
+++ b/llvm/lib/Transforms/IPO/MemProfContextDisambiguation.cpp
@@ -505,9 +505,6 @@ class CallsiteContextGraph {
ContextNode *getNodeForAlloc(const CallInfo &C);
ContextNode *getNodeForStackId(uint64_t StackId);
- /// Removes the node information recorded for the given call.
- void unsetNodeForInst(const CallInfo &C);
-
/// Computes the alloc type corresponding to the given context ids, by
/// unioning their recorded alloc types.
uint8_t computeAllocType(DenseSet<uint32_t> &ContextIds);
@@ -813,15 +810,6 @@ CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::getNodeForStackId(
return nullptr;
}
-template <typename DerivedCCG, typename FuncTy, typename CallTy>
-void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::unsetNodeForInst(
- const CallInfo &C) {
- AllocationCallToContextNodeMap.erase(C) ||
- NonAllocationCallToContextNodeMap.erase(C);
- assert(!AllocationCallToContextNodeMap.count(C) &&
- !NonAllocationCallToContextNodeMap.count(C));
-}
-
template <typename DerivedCCG, typename FuncTy, typename CallTy>
void CallsiteContextGraph<DerivedCCG, FuncTy, CallTy>::ContextNode::
addOrUpdateCallerEdge(ContextNode *Caller, AllocationType AllocType,
@@ -1694,11 +1682,10 @@ void CallsiteContextGraph<DerivedCCG, FuncTy,
MapVector<CallInfo, ContextNode *> TailCallToContextNodeMap;
for (auto Entry = NonAllocationCallToContextNodeMap.begin();
- Entry != NonAllocationCallToContextNodeMap.end();) {
+ Entry != NonAllocationCallToContextNodeMap.end(); Entry++) {
auto *Node = Entry->second;
assert(Node->Clones.empty());
// Check all node callees and see if in the same function.
- bool Removed = false;
auto Call = Node->Call.call();
for (auto EI = Node->CalleeEdges.begin(); EI != Node->CalleeEdges.end();) {
auto Edge = *EI;
@@ -1714,15 +1701,18 @@ void CallsiteContextGraph<DerivedCCG, FuncTy,
// Work around by setting Node to have a null call, so it gets
// skipped during cloning. Otherwise assignFunctions will assert
// because its data structures are not designed to handle this case.
- Entry = NonAllocationCallToContextNodeMap.erase(Entry);
Node->setCall(CallInfo());
- Removed = true;
break;
}
- if (!Removed)
- Entry++;
}
+ // Remove all mismatched nodes identified in the above loop from the node map
+ // (checking whether they have a null call which is set above). For a
+ // MapVector like NonAllocationCallToContextNodeMap it is much more efficient
+ // to do the removal via remove_if than by individually erasing entries above.
+ NonAllocationCallToContextNodeMap.remove_if(
+ [](const auto &it) { return !it.second->hasCall(); });
+
// Add the new nodes after the above loop so that the iteration is not
// invalidated.
for (auto &[Call, Node] : TailCallToContextNodeMap)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM modulo a minor comment about a range-based for loop.
for (auto Entry = NonAllocationCallToContextNodeMap.begin(); | ||
Entry != NonAllocationCallToContextNodeMap.end();) { | ||
Entry != NonAllocationCallToContextNodeMap.end(); Entry++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While you are here, may I suggest a range-based for loop here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good idea, we can do that now. My local branch seems to have gotten messed up, so I am going to merge this one, then make a separate NFC follow on change to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #94308
A cycle profile showed that we were spending a lot of time invoking
MapVector::erase. According to
https://llvm.org/docs/ProgrammersManual.html#llvm-adt-mapvector-h,
erasing elements one at a time is very inefficient for MapVector and it
is better to use remove_if.
This change resulted in around 7% time reduction on a large thin link.
While here remove an unused function that also invokes erase on MapVectors.