Skip to content

release/20.x: Reduce memory usage in AST parent map generation by lazily checking if nodes have been seen (#129934) #131209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 17, 2025

Conversation

llvmbot
Copy link
Member

@llvmbot llvmbot commented Mar 13, 2025

Backport 8c7f0ea

Requested by: @higher-performance

@llvmbot
Copy link
Member Author

llvmbot commented Mar 13, 2025

@higher-performance What do you think about merging this PR to the release branch?

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Mar 13, 2025
@llvmbot
Copy link
Member Author

llvmbot commented Mar 13, 2025

@llvm/pr-subscribers-clang

Author: None (llvmbot)

Changes

Backport 8c7f0ea

Requested by: @higher-performance


Full diff: https://github.com/llvm/llvm-project/pull/131209.diff

1 Files Affected:

  • (modified) clang/lib/AST/ParentMapContext.cpp (+11-6)
diff --git a/clang/lib/AST/ParentMapContext.cpp b/clang/lib/AST/ParentMapContext.cpp
index 7ff492443031d..d8dd352c42d6b 100644
--- a/clang/lib/AST/ParentMapContext.cpp
+++ b/clang/lib/AST/ParentMapContext.cpp
@@ -12,10 +12,11 @@
 //===----------------------------------------------------------------------===//
 
 #include "clang/AST/ParentMapContext.h"
-#include "clang/AST/RecursiveASTVisitor.h"
 #include "clang/AST/Decl.h"
 #include "clang/AST/Expr.h"
+#include "clang/AST/RecursiveASTVisitor.h"
 #include "clang/AST/TemplateBase.h"
+#include "llvm/ADT/SmallPtrSet.h"
 
 using namespace clang;
 
@@ -69,17 +70,21 @@ class ParentMapContext::ParentMap {
       for (; N > 0; --N)
         push_back(Value);
     }
-    bool contains(const DynTypedNode &Value) {
-      return Seen.contains(Value);
+    bool contains(const DynTypedNode &Value) const {
+      const void *Identity = Value.getMemoizationData();
+      assert(Identity);
+      return Dedup.contains(Identity);
     }
     void push_back(const DynTypedNode &Value) {
-      if (!Value.getMemoizationData() || Seen.insert(Value).second)
+      const void *Identity = Value.getMemoizationData();
+      if (!Identity || Dedup.insert(Identity).second) {
         Items.push_back(Value);
+      }
     }
     llvm::ArrayRef<DynTypedNode> view() const { return Items; }
   private:
-    llvm::SmallVector<DynTypedNode, 2> Items;
-    llvm::SmallDenseSet<DynTypedNode, 2> Seen;
+    llvm::SmallVector<DynTypedNode, 1> Items;
+    llvm::SmallPtrSet<const void *, 2> Dedup;
   };
 
   /// Maps from a node to its parents. This is used for nodes that have

@higher-performance
Copy link
Contributor

LGTM. @erichkeane?

@erichkeane
Copy link
Collaborator

FWIW: I'm in favor of this. This is quite low risk/a pretty trivial change that has an incredibly big impact on the memory pressure of our compiler in certain situations. I think this is a good candidate to back port.

@github-project-automation github-project-automation bot moved this from Needs Triage to Needs Merge in LLVM Release Status Mar 13, 2025
…f nodes have been seen (llvm#129934)

This mitigates a regression introduced in llvm#87824.

The mitigation here is to store pointers the deduplicated AST nodes, rather than copies of the nodes themselves. This allows a pointer-optimized set to be used and saves a lot of memory because `clang::DynTypedNode` is ~5 times larger than a pointer.

Fixes llvm#129808.

(cherry picked from commit 8c7f0ea)
@tstellar tstellar merged commit 0b23d98 into llvm:release/20.x Mar 17, 2025
8 of 11 checks passed
@github-project-automation github-project-automation bot moved this from Needs Merge to Done in LLVM Release Status Mar 17, 2025
Copy link

@higher-performance (or anyone else). If you would like to add a note about this fix in the release notes (completely optional). Please reply to this comment with a one or two sentence description of the fix. When you are done, please add the release:note label to this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
Development

Successfully merging this pull request may close these issues.

5 participants