Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen #129934
Conversation
@llvm/pr-subscribers-clang

Author: None (higher-performance)

Changes

This is a regression in #87824. With this change, the use of parent maps (

Fixes #129808

Full diff: https://github.com/llvm/llvm-project/pull/129934.diff

1 Files Affected:
diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp
index e9387ec79c373..03e5236eeb1db 100644
--- a/clang/lib/AST/ParentMapContext.cpp
+++ b/clang/lib/AST/ParentMapContext.cpp
@@ -65,21 +65,78 @@ class ParentMapContext::ParentMap {
public:
ParentVector() = default;
explicit ParentVector(size_t N, const DynTypedNode &Value) {
- Items.reserve(N);
+ SortedAndUnsortedItems.reserve(N);
for (; N > 0; --N)
push_back(Value);
}
bool contains(const DynTypedNode &Value) {
- return Seen.contains(Value);
+ const auto SortBoundary = SortedAndUnsortedItems.begin() + NumSorted;
+ bool Found = std::binary_search(SortedAndUnsortedItems.begin(),
+ SortBoundary, Value);
+ Budget += llvm::bit_width(
+ static_cast<size_t>(SortBoundary - SortedAndUnsortedItems.begin()));
+ if (!Found) {
+ auto FoundIt =
+ std::find(SortBoundary, SortedAndUnsortedItems.end(), Value);
+ Budget += FoundIt - SortBoundary;
+ Found |= FoundIt != SortedAndUnsortedItems.end();
+ }
+ SortIfWorthwhile();
+ return Found;
}
void push_back(const DynTypedNode &Value) {
- if (!Value.getMemoizationData() || Seen.insert(Value).second)
- Items.push_back(Value);
+ ++Budget;
+ if (!Value.getMemoizationData() || !contains(Value)) {
+ SortedAndUnsortedItems.push_back(Value);
+ if (SortedAndUnsortedItems.back() < SortedAndUnsortedItems[NumSorted]) {
+ // Keep the minimum element in the middle to quickly tell us if
+ // merging will be necessary
+ using std::swap;
+ swap(SortedAndUnsortedItems.back(),
+ SortedAndUnsortedItems[NumSorted]);
+ }
+ }
+ VerifyInvariant();
}
- llvm::ArrayRef<DynTypedNode> view() const { return Items; }
+ llvm::ArrayRef<DynTypedNode> view() {
+ ++Budget;
+ return SortedAndUnsortedItems;
+ }
+
private:
- llvm::SmallVector<DynTypedNode, 2> Items;
- llvm::SmallDenseSet<DynTypedNode, 2> Seen;
+ void SortIfWorthwhile() {
+ VerifyInvariant();
+ auto SortBoundary = SortedAndUnsortedItems.begin() + NumSorted;
+ if (SortBoundary != SortedAndUnsortedItems.end()) {
+ const size_t NumUnsorted = SortedAndUnsortedItems.end() - SortBoundary;
+ const size_t SortingCost = NumUnsorted * llvm::bit_width(NumUnsorted);
+ const bool NeedMerge = SortBoundary != SortedAndUnsortedItems.begin();
+ // Assume that the naive implementation would copy these elements.
+ // This is just an estimate; it's OK if it's wrong.
+ const size_t MergeCost = SortedAndUnsortedItems.size() + NumUnsorted;
+ if (Budget >= (NeedMerge ? MergeCost : 0) + SortingCost) {
+ std::sort(SortBoundary, SortedAndUnsortedItems.end());
+ if (NeedMerge) {
+ std::inplace_merge(SortedAndUnsortedItems.begin(), SortBoundary,
+ SortedAndUnsortedItems.end());
+ }
+ Budget = 0;
+ NumSorted = SortedAndUnsortedItems.size();
+ }
+ }
+ }
+
+ void VerifyInvariant() const {
+ assert(
+ !(NumSorted < SortedAndUnsortedItems.size() &&
+ SortedAndUnsortedItems.back() <
+ SortedAndUnsortedItems[NumSorted]) &&
+ "the boundary item must always be the minimum of the unsorted items");
+ }
+
+ llvm::SmallVector<DynTypedNode, 2> SortedAndUnsortedItems;
+ size_t NumSorted = 0;
+ int64_t Budget = 0;
};
/// Maps from a node to its parents. This is used for nodes that have
@@ -117,7 +174,7 @@ class ParentMapContext::ParentMap {
if (I == Map.end()) {
return llvm::ArrayRef<DynTypedNode>();
}
- if (const auto *V = dyn_cast<ParentVector *>(I->second)) {
+ if (auto *V = dyn_cast<ParentVector *>(I->second)) {
return V->view();
}
return getSingleDynTypedNodeFromParentMap(I->second);
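The budget-based lazy sorting in the diff above can be sketched with standard containers. This is a hypothetical simplification, not the patch itself: `int` stands in for `DynTypedNode`, `std::vector` for `llvm::SmallVector`, and the cost model is deliberately crude. The idea is one vector split into a sorted prefix and an unsorted suffix; membership queries do a binary search plus a linear scan, and the container only sorts once enough query work has accumulated to amortize the sort-and-merge cost.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

class LazySortedVector {
public:
  bool contains(int value) {
    auto boundary = items.begin() + numSorted;
    bool found = std::binary_search(items.begin(), boundary, value);
    budget += numSorted; // crude stand-in for the patch's bit_width() cost
    if (!found)
      found = std::find(boundary, items.end(), value) != items.end();
    budget += items.size() - numSorted; // cost of the linear scan
    sortIfWorthwhile();
    return found;
  }

  void push_back(int value) {
    ++budget;
    if (!contains(value)) // deduplicate without a separate "seen" set
      items.push_back(value);
  }

  const std::vector<int> &view() const { return items; }

private:
  void sortIfWorthwhile() {
    const size_t numUnsorted = items.size() - numSorted;
    if (numUnsorted == 0)
      return;
    // Estimated cost of sorting the suffix and merging it into the prefix.
    const size_t cost = numUnsorted * 8 + items.size();
    if (budget >= static_cast<int64_t>(cost)) {
      std::sort(items.begin() + numSorted, items.end());
      std::inplace_merge(items.begin(), items.begin() + numSorted,
                         items.end());
      numSorted = items.size();
      budget = 0;
    }
  }

  std::vector<int> items;
  size_t numSorted = 0;
  int64_t budget = 0;
};
```

One refinement from the real patch is omitted here: the patch also swaps the minimum of the unsorted suffix to the boundary position so it can cheaply detect whether a merge is needed. In this sketch the linear scan always covers the whole suffix, so correctness does not depend on that invariant.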
@AaronBallman should take a look at this as well. We might take a look at `SetVector` and see if there are any learnings we can take from that. It has a set and a vector together, but might have some interesting things.

Finally... I wonder if we were to just store this as a sorted vector of `DynTypedNode` + insert order, then do a copy & sort on access of `view`.

Can we do some benchmarking to figure out which things are done the 'most often' with what values of N? It might better guide this.
clang/lib/AST/ParentMapContext.cpp
Outdated
}
void push_back(const DynTypedNode &Value) {
if (!Value.getMemoizationData() || Seen.insert(Value).second)
if (!Value.getMemoizationData() || !contains(Value)) {
It seems unfortunate to potentially re-calculate in `contains` here, only to potentially immediately re-calculate it. I wonder if there is value in special-casing `ItemsProcessed == 0 && Items.capacity() == Items.size()` to just do a linear search?

Also, I wonder if it makes sense to only use the cache when `size()` is significant enough? It would be interesting to see if there is a value of N under which the set doesn't really make sense.
I could definitely optimize these further in the ways you mention, but those are just (very small) constant-factor savings in time, at a significant cost in readability and fragility. I didn't get the feeling it's worth it, to be honest -- do you? Note that the performance graphs almost coincide with Clang 17's right now.

Edit: Fixed below. The fix didn't affect performance.
Ah sorry, I forgot to reply to your review.

I did take a look at this actually, but it looked equivalent to what I was doing before, particularly with respect to storing the expensive keys (

I don't think that works. You don't want to do a ton of work on every

Furthermore, not only is it that elements without memoization data aren't comparable, but they still need to appear in the

The implementation in this PR addresses all of those constraints without any drawbacks, I think. There's less code compared to writing a facade, and performance-wise it feels just about optimal.

Re: benchmarking -- note that I did a general benchmark both with the test case that was provided in the associated issue, as well as with an expensive TU that we had internally. It had great performance in both cases. Regarding N -- I assume N is referring to the total container sizes (per the other comment)? The thing is, I don't get the impression there's much room for improvement here. The memory usage was just high by a constant (10x) factor. Some ~5x of that almost certainly came from the data type itself (storing
I see. Yes, so the problem is that I don't have a good idea here of the uses; I did a first run here. If

Ok, I think I'm OK with all that then. BUT as I said, I'm not the best person to be reviewing this; I've pinged Aaron and hope he'll come along.
clang/lib/AST/ParentMapContext.cpp
Outdated
const size_t OldCapacity = Items.capacity();
Items.push_back(Value);
if (OldCapacity != Items.capacity()) {
This is quite terrifying.
Would we impact performance significantly if we used a map of <pointer, index>, or a vector of pointers that has the same order as the vector of values?
That wouldn't compile without significant refactoring across several files. We need the `Items` vector to be convertible to `llvm::ArrayRef<DynTypedNode>`, because that's what `view()` returns.
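The constraint being described can be illustrated with a minimal stand-in for `llvm::ArrayRef` (the `Node` type and `view` helper below are hypothetical): a non-owning view over values requires those values to be stored contiguously, so a vector of pointers or a map cannot be handed out without copying.

```cpp
#include <cstddef>
#include <vector>

struct Node { int payload; }; // stand-in for clang::DynTypedNode

// Minimal stand-in for llvm::ArrayRef<Node>: a non-owning window that
// only works if the Node objects themselves are stored contiguously.
struct NodeView {
  const Node *data;
  size_t size;
};

// This is the shape of the constraint: view() returns a window directly
// over the stored values. A std::vector<Node*> or a map would have the
// wrong element layout, forcing a copy (or an API change) on every call.
NodeView view(const std::vector<Node> &items) {
  return {items.data(), items.size()};
}
```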
Just some minor nits from me so far.

I think there may be some additional performance benefits we could eke out, but I don't think they need to be done in this patch (or done at all, it's speculative). One is that I noticed `ASTContext::getTraversalScope()` is returning a vector by copy, and that doesn't seem necessary. Another is that perhaps we want to use a Bloom filter here instead, on the assumption that the parent map will always be fairly large. But again, this is all speculation.

Hmm... would SOME duplicates be acceptable? I THINK the original patch removed the duplicates, so they were presumably OK before then? IF that is acceptable (and uses are duplicate-tolerant), a Bloom filter that can reduce the number of duplicates to sub-1% would still be a massive win, right? We could perhaps do some tuning on the size of the filter to get that reasonable. According to the intro of the Wikipedia article on Bloom filters, 10 bits per element is all that is necessary for a 1% false-positive rate, but the details of the article and references get really "LaTeX-math-stuff" to me, so I don't have a good idea of what it would take to get the false-positive rate down low. BUT since the idea is just to reduce/eliminate duplicates without high memory overhead, it seems like it would be a great solution here.
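For illustration, here is a minimal Bloom filter of the kind being discussed. This is a hypothetical sketch, not LLVM code: the sizing constants, the double-hashing scheme, and the `BloomFilter` name are all assumptions. A Bloom filter never reports a false negative, so an inserted element is always found; the price is a tunable false-positive rate, which is why it could only reduce (not eliminate) duplicates.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

class BloomFilter {
  static constexpr size_t Bits = 1 << 16; // generously sized for ~1000 keys
  static constexpr int Probes = 7;
  std::bitset<Bits> bits;

  static size_t probe(uint64_t key, int i) {
    // Double hashing: h1 + i*h2 derives all probe positions from two hashes.
    uint64_t h1 = key * 0x9E3779B97F4A7C15ull;
    uint64_t h2 = ((key ^ (key >> 33)) * 0xC2B2AE3D27D4EB4Full) | 1;
    return (h1 + i * h2) % Bits;
  }

public:
  void insert(uint64_t key) {
    for (int i = 0; i < Probes; ++i)
      bits.set(probe(key, i));
  }

  // May return true for a key that was never inserted (false positive),
  // but never returns false for a key that was inserted.
  bool mayContain(uint64_t key) const {
    for (int i = 0; i < Probes; ++i)
      if (!bits.test(probe(key, i)))
        return false;
    return true;
  }
};
```

With far more than 10 bits per element, as sized here, the false-positive rate is well below the 1% figure quoted above; shrinking `Bits` toward 10 bits per inserted key trades memory for a higher duplicate rate.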
I noticed this too! I'm actually working on validating a RAC fix for this right now, just to have it return an ArrayRef instead. Both uses of that function just do observation, so it seems reasonable enough.
I think we are a bit confused, so can someone tell me if I'm wrong.

Here are some random ideas:
@cor3ntin We need to satisfy this constraint, which is not trivial:

llvm-project/clang/lib/AST/ParentMapContext.cpp Lines 79 to 81 in c54afe5

i.e. if we stored maps or pointers then we could no longer return an
Checking in -- has everybody had a chance to review this? Can we merge?
Was @cor3ntin correct here?

Depends which part you mean, but I don't believe so -- I tried to respond to it here. Ultimately, I was saying we can't have an intrusive linked list because of that constraint. The part about "just trying" to avoid duplicates is somewhat ambiguous; the code only seems to want to avoid duplicate nodes that carry memoization data, but not other nodes. It may just be an optimization, but I'm not confident enough to say whether duplicating those will cause a problem. The part about caring about parent order is true, as I've mentioned earlier, and the part about N being small I just don't know -- it probably depends on the codebase and how many parents/ancestors a class has.
LGTM, thanks for the fix!
I just realized there is another optimization we could do in

We could avoid the duplication check in

Thoughts on this? I'm inclined to give it a try to see if it's worth the code complexity.
It sounds worth a try, and perhaps could help readability. I think a 'remove dupes' step (aka std::unique/erase) is much more readable than what is being attempted here. As far as the
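The 'remove dupes' step suggested above can be sketched as follows (a hypothetical illustration, with `int` standing in for `DynTypedNode`): append parents freely, then deduplicate in one pass with sort + unique + erase. The caveat raised elsewhere in this thread still applies: the real container must preserve insertion order and handle nodes that are not meaningfully comparable, so this shape only works where a re-sorted result is acceptable.

```cpp
#include <algorithm>
#include <vector>

// Deduplicate by sorting (groups equal elements together), then erasing
// the tail that std::unique compacts away. Note the result is sorted,
// not in original insertion order.
std::vector<int> dedupe(std::vector<int> parents) {
  std::sort(parents.begin(), parents.end());
  parents.erase(std::unique(parents.begin(), parents.end()), parents.end());
  return parents;
}
```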
Okay thanks, I'll give it a shot. It might take me a bit to get to it. If in the meantime anyone needs this PR urgently, let me know and we can probably merge this and do that in a follow-up.
✅ With the latest revision this PR passed the C/C++ code formatter.
So, this is going to sound very silly, but it seems I may have missed the obvious, and the solution may have been extremely simple:

The reason is that the original code only ever inserts into the vector whenever

llvm-project/clang/lib/AST/ParentMapContext.cpp Lines 388 to 391 in a3c248d

Specifically,

This drastically simplifies everything -- please take another look and let me know if you see any issues!
This looks reasonable, I just have one question
LGTM with the vector changed back to a SmallVector. This is a much cleaner change, thank you for spotting it!
Interesting... this ends up just being a set-vector with a different identity being stored for the set. So this ends up being ~4 pointer-sizes (IIRC you said that `DynTypedNode` is 3x the size of a pointer) per record (though the extra 1 is only for those with memoization data).

Did you run the space benchmark here from the original bug? I definitely believe that this is an improvement (changing from 6x-ptr/element to < 4x-ptr/element), and perhaps the deduplication makes up for the extra data (since it only has to remove ONE duplicate to make up for 3 elements).

AS FAR AS the `SmallVector`/`SmallDenseSet` vs `std::vector`/`std::set` question, the decision here is interesting. `SmallVector` has a pretty massive stack size in exchange for saving on an allocation (it isn't SSO-like; it is more "size-of-vector + N, so that our allocation is on-stack unless size > N"). So it kind of depends on whether we think MOST uses of this are going to be '1' element, vs '0' or '1+' (that is, for the small vector of size 1 suggested by @cor3ntin). If it is almost always going to be 1, I agree with him. If it is almost always going to be 0 or more than 1, std::vector is probably correct (as in the 0 case, std::vector is smaller, and in the 1+ case, the allocation is happening anyway, and we just have dead stack space again).
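The footprint tradeoff being weighed here can be made concrete with a hypothetical `SmallVector`-like layout (the `InlineVector` struct below is an illustration, not LLVM's actual implementation): the object carries inline space for N elements, so it is larger on the stack than a plain `std::vector` but avoids a heap allocation while it holds at most N elements.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the llvm::SmallVector<T, N> layout: the
// object embeds storage for N elements, so "small" sizes never allocate,
// at the cost of a bigger object even when the container is empty.
template <typename T, unsigned N>
struct InlineVector {
  T inlineStorage[N]; // used until the vector outgrows N elements
  T *data;            // points at inlineStorage, then at heap memory
  size_t size;
  size_t capacity;
};
```

On a typical 64-bit target, `sizeof(std::vector<void*>)` is three pointers, while an inline-storage vector is already larger at small N -- which is exactly the "dead stack space" concern for containers that usually hold zero elements.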
Originally 5x (40 bytes vs. 8 bytes), not 3x. But it might be even higher now, since that was using

Yup. With the

Yeah, I don't know what to expect from user code here, but it seems the difference is small either way -- I'd say let's just go with
Just merged this, thanks everyone for the reviews! If I hadn't gotten the pushback earlier, I might never have spotted the simplification and optimization.
/cherry-pick 8c7f0ea

Error: Command failed due to missing milestone.

/cherry-pick 8c7f0ea

/pull-request #131209
…f nodes have been seen (llvm#129934) This mitigates a regression introduced in llvm#87824. The mitigation here is to store pointers to the deduplicated AST nodes, rather than copies of the nodes themselves. This allows a pointer-optimized set to be used and saves a lot of memory because `clang::DynTypedNode` is ~5 times larger than a pointer. Fixes llvm#129808. (cherry picked from commit 8c7f0ea)
This mitigates a regression introduced in #87824.
The mitigation here is two-fold:
- We only store pointers to the vector elements, rather than copies of the elements themselves. This saves a lot of memory because the elements are ~5 times larger than pointers.
- We populate the "seen" set lazily, whenever it is queried. This further reduces memory usage and speeds up queries.
Fixes #129808.
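The two-fold mitigation above can be sketched with standard containers. This is an illustrative simplification, not the merged code: `std::vector`/`std::unordered_set` stand in for `llvm::SmallVector`/`llvm::SmallDenseSet`, and the `Node` struct's `memoizationData` field stands in for `DynTypedNode::getMemoizationData()`. The set holds only pointer-sized keys, and it is caught up lazily: nodes without a key never trigger any set work at all.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

struct Node {
  const void *memoizationData; // null => node is never deduplicated
  int payload;
};

class ParentVector {
  std::vector<Node> items;
  std::unordered_set<const void *> seen; // pointer-sized keys, not Nodes
  size_t seenUpTo = 0; // items[0..seenUpTo) already folded into `seen`

  // Lazily fold any not-yet-processed items into the "seen" set.
  void catchUp() {
    for (; seenUpTo < items.size(); ++seenUpTo)
      if (const void *key = items[seenUpTo].memoizationData)
        seen.insert(key);
  }

public:
  void push_back(const Node &n) {
    if (const void *key = n.memoizationData) {
      catchUp(); // the set is only populated when a check is needed
      if (!seen.insert(key).second)
        return; // duplicate key: drop the node
    }
    items.push_back(n); // keyless nodes are always appended
  }

  const std::vector<Node> &view() const { return items; }
};
```

Storing `const void*` keys instead of whole nodes is what recovers the memory: per the discussion above, each `DynTypedNode` copy is roughly five times the size of a pointer.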