Skip to content

[NFC] [ASTMatchers] Share code of forEachArgumentWithParamType with UnsafeBufferUsage #132387

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 4, 2025

Conversation

ilya-biryukov
Copy link
Contributor

This changes exposes a low-level helper that is used to implement forEachArgumentWithParamType but can also be used without matchers, e.g. if performance is a concern.

Commit f5ee105 introduced a copy of the implementation of the forEachArgumentWithParamType matcher that was needed for optimizing performance of -Wunsafe-buffer-usage.

This change shares the code between the two so that we do not repeat ourselves and any bugfixes or changes will be picked up by both implementations in the future.

… UnsafeBufferUsage

This changes exposes a low-level helper that is also used to implement
`forEachArgumentWithParamType`, but can be used without matchers, e.g.
if performance is a concern.

Commit f5ee105 introduced a copy of the
implementation of the `forEachArgumentWithParamType` matcher that was
needed for optimizing performance of `-Wunsafe-buffer-usage`.

This change will ensure the code is shared so that we do not repeat
ourselves and any bugfixes or changes will be picked up by both
implementations in the future.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:analysis labels Mar 21, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 21, 2025

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: Ilya Biryukov (ilya-biryukov)

Changes

This changes exposes a low-level helper that is used to implement forEachArgumentWithParamType but can also be used without matchers, e.g. if performance is a concern.

Commit f5ee105 introduced a copy of the implementation of the forEachArgumentWithParamType matcher that was needed for optimizing performance of -Wunsafe-buffer-usage.

This change shares the code between the two so that we do not repeat ourselves and any bugfixes or changes will be picked up by both implementations in the future.


Full diff: https://github.com/llvm/llvm-project/pull/132387.diff

5 Files Affected:

  • (modified) clang/include/clang/ASTMatchers/ASTMatchers.h (+17-58)
  • (added) clang/include/clang/ASTMatchers/LowLevelHelpers.h (+37)
  • (modified) clang/lib/ASTMatchers/CMakeLists.txt (+1)
  • (added) clang/lib/ASTMatchers/LowLevelHelpers.cpp (+105)
  • (modified) clang/lib/Analysis/UnsafeBufferUsage.cpp (+2-93)
diff --git a/clang/include/clang/ASTMatchers/ASTMatchers.h b/clang/include/clang/ASTMatchers/ASTMatchers.h
index 738617759eb29..26a58cea49b5a 100644
--- a/clang/include/clang/ASTMatchers/ASTMatchers.h
+++ b/clang/include/clang/ASTMatchers/ASTMatchers.h
@@ -71,6 +71,7 @@
 #include "clang/AST/TypeLoc.h"
 #include "clang/ASTMatchers/ASTMatchersInternal.h"
 #include "clang/ASTMatchers/ASTMatchersMacros.h"
+#include "clang/ASTMatchers/LowLevelHelpers.h"
 #include "clang/Basic/AttrKinds.h"
 #include "clang/Basic/ExceptionSpecificationType.h"
 #include "clang/Basic/FileManager.h"
@@ -5215,68 +5216,26 @@ AST_POLYMORPHIC_MATCHER_P2(forEachArgumentWithParamType,
   // argument of the method which should not be matched against a parameter, so
   // we skip over it here.
   BoundNodesTreeBuilder Matches;
-  unsigned ArgIndex =
-      cxxOperatorCallExpr(
-          callee(cxxMethodDecl(unless(isExplicitObjectMemberFunction()))))
-              .matches(Node, Finder, &Matches)
-          ? 1
-          : 0;
-  const FunctionProtoType *FProto = nullptr;
-
-  if (const auto *Call = dyn_cast<CallExpr>(&Node)) {
-    if (const auto *Value =
-            dyn_cast_or_null<ValueDecl>(Call->getCalleeDecl())) {
-      QualType QT = Value->getType().getCanonicalType();
-
-      // This does not necessarily lead to a `FunctionProtoType`,
-      // e.g. K&R functions do not have a function prototype.
-      if (QT->isFunctionPointerType())
-        FProto = QT->getPointeeType()->getAs<FunctionProtoType>();
-
-      if (QT->isMemberFunctionPointerType()) {
-        const auto *MP = QT->getAs<MemberPointerType>();
-        assert(MP && "Must be member-pointer if its a memberfunctionpointer");
-        FProto = MP->getPointeeType()->getAs<FunctionProtoType>();
-        assert(FProto &&
-               "The call must have happened through a member function "
-               "pointer");
-      }
-    }
-  }
 
-  unsigned ParamIndex = 0;
   bool Matched = false;
-  unsigned NumArgs = Node.getNumArgs();
-  if (FProto && FProto->isVariadic())
-    NumArgs = std::min(NumArgs, FProto->getNumParams());
-
-  for (; ArgIndex < NumArgs; ++ArgIndex, ++ParamIndex) {
+  auto ProcessParamAndArg = [&](QualType ParamType, const Expr *Arg) {
     BoundNodesTreeBuilder ArgMatches(*Builder);
-    if (ArgMatcher.matches(*(Node.getArg(ArgIndex)->IgnoreParenCasts()), Finder,
-                           &ArgMatches)) {
-      BoundNodesTreeBuilder ParamMatches(ArgMatches);
+    if (!ArgMatcher.matches(*Arg, Finder, &ArgMatches))
+      return;
+    BoundNodesTreeBuilder ParamMatches(std::move(ArgMatches));
+    if (!ParamMatcher.matches(ParamType, Finder, &ParamMatches))
+      return;
+    Result.addMatch(ParamMatches);
+    Matched = true;
+    return;
+  };
+  if (auto *Call = llvm::dyn_cast<CallExpr>(&Node))
+    matchEachArgumentWithParamType(*Call, ProcessParamAndArg);
+  else if (auto *Construct = llvm::dyn_cast<CXXConstructExpr>(&Node))
+    matchEachArgumentWithParamType(*Construct, ProcessParamAndArg);
+  else
+    return false;
 
-      // This test is cheaper compared to the big matcher in the next if.
-      // Therefore, please keep this order.
-      if (FProto && FProto->getNumParams() > ParamIndex) {
-        QualType ParamType = FProto->getParamType(ParamIndex);
-        if (ParamMatcher.matches(ParamType, Finder, &ParamMatches)) {
-          Result.addMatch(ParamMatches);
-          Matched = true;
-          continue;
-        }
-      }
-      if (expr(anyOf(cxxConstructExpr(hasDeclaration(cxxConstructorDecl(
-                         hasParameter(ParamIndex, hasType(ParamMatcher))))),
-                     callExpr(callee(functionDecl(
-                         hasParameter(ParamIndex, hasType(ParamMatcher)))))))
-              .matches(Node, Finder, &ParamMatches)) {
-        Result.addMatch(ParamMatches);
-        Matched = true;
-        continue;
-      }
-    }
-  }
   *Builder = std::move(Result);
   return Matched;
 }
diff --git a/clang/include/clang/ASTMatchers/LowLevelHelpers.h b/clang/include/clang/ASTMatchers/LowLevelHelpers.h
new file mode 100644
index 0000000000000..ad1fffb5e5e01
--- /dev/null
+++ b/clang/include/clang/ASTMatchers/LowLevelHelpers.h
@@ -0,0 +1,37 @@
+//===- LowLevelHelpers.h - helpers with pure AST interface ---- *- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+// Collects a number of helpers that are used by matchers, but can be reused
+// outside of them, e.g. when corresponding matchers cannot be used due to
+// performance constraints.
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_ASTMATCHERS_LOWLEVELHELPERS_H
+#define LLVM_CLANG_ASTMATCHERS_LOWLEVELHELPERS_H
+
+#include "clang/AST/Expr.h"
+#include "clang/AST/ExprCXX.h"
+#include "clang/AST/Type.h"
+#include "llvm/ADT/STLFunctionalExtras.h"
+
+namespace clang {
+namespace ast_matchers {
+
+void matchEachArgumentWithParamType(
+    const CallExpr &Node,
+    llvm::function_ref<void(QualType /*Param*/, const Expr * /*Arg*/)>
+        OnParamAndArg);
+
+void matchEachArgumentWithParamType(
+    const CXXConstructExpr &Node,
+    llvm::function_ref<void(QualType /*Param*/, const Expr * /*Arg*/)>
+        OnParamAndArg);
+
+} // namespace ast_matchers
+} // namespace clang
+
+#endif
diff --git a/clang/lib/ASTMatchers/CMakeLists.txt b/clang/lib/ASTMatchers/CMakeLists.txt
index 30303c1e39a00..7769fd656ac06 100644
--- a/clang/lib/ASTMatchers/CMakeLists.txt
+++ b/clang/lib/ASTMatchers/CMakeLists.txt
@@ -9,6 +9,7 @@ add_clang_library(clangASTMatchers
   ASTMatchFinder.cpp
   ASTMatchersInternal.cpp
   GtestMatchers.cpp
+  LowLevelHelpers.cpp
 
   LINK_LIBS
   clangAST
diff --git a/clang/lib/ASTMatchers/LowLevelHelpers.cpp b/clang/lib/ASTMatchers/LowLevelHelpers.cpp
new file mode 100644
index 0000000000000..d051430ba9fda
--- /dev/null
+++ b/clang/lib/ASTMatchers/LowLevelHelpers.cpp
@@ -0,0 +1,105 @@
+//===- LowLevelHelpers.cpp -------------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "clang/ASTMatchers/LowLevelHelpers.h"
+#include "clang/AST/Decl.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/AST/Expr.h"
+#include "clang/AST/ExprCXX.h"
+#include <type_traits>
+
+namespace clang {
+namespace ast_matchers {
+
+static const FunctionDecl *getCallee(const CXXConstructExpr &D) {
+  return D.getConstructor();
+}
+static const FunctionDecl *getCallee(const CallExpr &D) {
+  return D.getDirectCallee();
+}
+
+template <class ExprNode>
+static void matchEachArgumentWithParamTypeImpl(
+    const ExprNode &Node,
+    llvm::function_ref<void(QualType /*Param*/, const Expr * /*Arg*/)>
+        OnParamAndArg) {
+  static_assert(std::is_same_v<CallExpr, ExprNode> ||
+                std::is_same_v<CXXConstructExpr, ExprNode>);
+  // The first argument of an overloaded member operator is the implicit object
+  // argument of the method which should not be matched against a parameter, so
+  // we skip over it here.
+  unsigned ArgIndex = 0;
+  if (const auto *CE = dyn_cast<CXXOperatorCallExpr>(&Node)) {
+    const auto *MD = dyn_cast_or_null<CXXMethodDecl>(CE->getDirectCallee());
+    if (MD && !MD->isExplicitObjectMemberFunction()) {
+      // This is an overloaded operator call.
+      // We need to skip the first argument, which is the implicit object
+      // argument of the method which should not be matched against a
+      // parameter.
+      ++ArgIndex;
+    }
+  }
+
+  const FunctionProtoType *FProto = nullptr;
+
+  if (const auto *Call = dyn_cast<CallExpr>(&Node)) {
+    if (const auto *Value =
+            dyn_cast_or_null<ValueDecl>(Call->getCalleeDecl())) {
+      QualType QT = Value->getType().getCanonicalType();
+
+      // This does not necessarily lead to a `FunctionProtoType`,
+      // e.g. K&R functions do not have a function prototype.
+      if (QT->isFunctionPointerType())
+        FProto = QT->getPointeeType()->getAs<FunctionProtoType>();
+
+      if (QT->isMemberFunctionPointerType()) {
+        const auto *MP = QT->getAs<MemberPointerType>();
+        assert(MP && "Must be member-pointer if its a memberfunctionpointer");
+        FProto = MP->getPointeeType()->getAs<FunctionProtoType>();
+        assert(FProto &&
+               "The call must have happened through a member function "
+               "pointer");
+      }
+    }
+  }
+
+  unsigned ParamIndex = 0;
+  unsigned NumArgs = Node.getNumArgs();
+  if (FProto && FProto->isVariadic())
+    NumArgs = std::min(NumArgs, FProto->getNumParams());
+
+  for (; ArgIndex < NumArgs; ++ArgIndex, ++ParamIndex) {
+    QualType ParamType;
+    if (FProto && FProto->getNumParams() > ParamIndex)
+      ParamType = FProto->getParamType(ParamIndex);
+    else if (const FunctionDecl *FD = getCallee(Node); FD && FD->getNumParams() > ParamIndex)
+      ParamType = FD->getParamDecl(ParamIndex)->getType();
+    else
+      continue;
+
+    OnParamAndArg(ParamType, Node.getArg(ArgIndex)->IgnoreParenCasts());
+  }
+}
+
+void matchEachArgumentWithParamType(
+    const CallExpr &Node,
+    llvm::function_ref<void(QualType /*Param*/, const Expr * /*Arg*/)>
+        OnParamAndArg) {
+  matchEachArgumentWithParamTypeImpl(Node, OnParamAndArg);
+}
+
+void matchEachArgumentWithParamType(
+    const CXXConstructExpr &Node,
+    llvm::function_ref<void(QualType /*Param*/, const Expr * /*Arg*/)>
+        OnParamAndArg) {
+  matchEachArgumentWithParamTypeImpl(Node, OnParamAndArg);
+}
+
+} // namespace ast_matchers
+
+} // namespace clang
diff --git a/clang/lib/Analysis/UnsafeBufferUsage.cpp b/clang/lib/Analysis/UnsafeBufferUsage.cpp
index 636cb1f031f0d..adf51c08948c6 100644
--- a/clang/lib/Analysis/UnsafeBufferUsage.cpp
+++ b/clang/lib/Analysis/UnsafeBufferUsage.cpp
@@ -20,6 +20,7 @@
 #include "clang/AST/Stmt.h"
 #include "clang/AST/StmtVisitor.h"
 #include "clang/AST/Type.h"
+#include "clang/ASTMatchers/LowLevelHelpers.h"
 #include "clang/Basic/SourceLocation.h"
 #include "clang/Lex/Lexer.h"
 #include "clang/Lex/Preprocessor.h"
@@ -300,98 +301,6 @@ static void findStmtsInUnspecifiedLvalueContext(
     OnResult(BO->getLHS());
 }
 
-/// Note: Copied and modified from ASTMatchers.
-/// Matches all arguments and their respective types for a \c CallExpr.
-/// It is very similar to \c forEachArgumentWithParam but
-/// it works on calls through function pointers as well.
-///
-/// The difference is, that function pointers do not provide access to a
-/// \c ParmVarDecl, but only the \c QualType for each argument.
-///
-/// Given
-/// \code
-///   void f(int i);
-///   int y;
-///   f(y);
-///   void (*f_ptr)(int) = f;
-///   f_ptr(y);
-/// \endcode
-/// callExpr(
-///   forEachArgumentWithParamType(
-///     declRefExpr(to(varDecl(hasName("y")))),
-///     qualType(isInteger()).bind("type)
-/// ))
-///   matches f(y) and f_ptr(y)
-/// with declRefExpr(...)
-///   matching int y
-/// and qualType(...)
-///   matching int
-static void forEachArgumentWithParamType(
-    const CallExpr &Node,
-    const llvm::function_ref<void(QualType /*Param*/, const Expr * /*Arg*/)>
-        OnParamAndArg) {
-  // The first argument of an overloaded member operator is the implicit object
-  // argument of the method which should not be matched against a parameter, so
-  // we skip over it here.
-  unsigned ArgIndex = 0;
-  if (const auto *CE = dyn_cast<CXXOperatorCallExpr>(&Node)) {
-    const auto *MD = dyn_cast_or_null<CXXMethodDecl>(CE->getDirectCallee());
-    if (MD && !MD->isExplicitObjectMemberFunction()) {
-      // This is an overloaded operator call.
-      // We need to skip the first argument, which is the implicit object
-      // argument of the method which should not be matched against a
-      // parameter.
-      ++ArgIndex;
-    }
-  }
-
-  const FunctionProtoType *FProto = nullptr;
-
-  if (const auto *Call = dyn_cast<CallExpr>(&Node)) {
-    if (const auto *Value =
-            dyn_cast_or_null<ValueDecl>(Call->getCalleeDecl())) {
-      QualType QT = Value->getType().getCanonicalType();
-
-      // This does not necessarily lead to a `FunctionProtoType`,
-      // e.g. K&R functions do not have a function prototype.
-      if (QT->isFunctionPointerType())
-        FProto = QT->getPointeeType()->getAs<FunctionProtoType>();
-
-      if (QT->isMemberFunctionPointerType()) {
-        const auto *MP = QT->getAs<MemberPointerType>();
-        assert(MP && "Must be member-pointer if its a memberfunctionpointer");
-        FProto = MP->getPointeeType()->getAs<FunctionProtoType>();
-        assert(FProto &&
-               "The call must have happened through a member function "
-               "pointer");
-      }
-    }
-  }
-
-  unsigned ParamIndex = 0;
-  unsigned NumArgs = Node.getNumArgs();
-  if (FProto && FProto->isVariadic())
-    NumArgs = std::min(NumArgs, FProto->getNumParams());
-
-  const auto GetParamType =
-      [&FProto, &Node](unsigned int ParamIndex) -> std::optional<QualType> {
-    if (FProto && FProto->getNumParams() > ParamIndex) {
-      return FProto->getParamType(ParamIndex);
-    }
-    const auto *FD = Node.getDirectCallee();
-    if (FD && FD->getNumParams() > ParamIndex) {
-      return FD->getParamDecl(ParamIndex)->getType();
-    }
-    return std::nullopt;
-  };
-
-  for (; ArgIndex < NumArgs; ++ArgIndex, ++ParamIndex) {
-    auto ParamType = GetParamType(ParamIndex);
-    if (ParamType)
-      OnParamAndArg(*ParamType, Node.getArg(ArgIndex)->IgnoreParenCasts());
-  }
-}
-
 // Finds any expression `e` such that `InnerMatcher` matches `e` and
 // `e` is in an Unspecified Pointer Context (UPC).
 static void findStmtsInUnspecifiedPointerContext(
@@ -408,7 +317,7 @@ static void findStmtsInUnspecifiedPointerContext(
     if (const auto *FnDecl = CE->getDirectCallee();
         FnDecl && FnDecl->hasAttr<UnsafeBufferUsageAttr>())
       return;
-    forEachArgumentWithParamType(
+    ast_matchers::matchEachArgumentWithParamType(
         *CE, [&InnerMatcher](QualType Type, const Expr *Arg) {
           if (Type->isAnyPointerType())
             InnerMatcher(Arg);

Copy link

github-actions bot commented Mar 21, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@ilya-biryukov
Copy link
Contributor Author

cc @ivanaivanovska FYI

@ziqingluo-90
Copy link
Contributor

Hi @ilya-biryukov, thank you for continuing to improve this!
I'm currently dealing with downstream conflicts with your previous patch. Please do not merge this PR this week, otherwise it may slow down our CI. Thanks!

@ilya-biryukov
Copy link
Contributor Author

I'm currently dealing with downstream conflicts with your previous patch. Please do not merge this PR this week, otherwise it may slow down our CI. Thanks!

Sure! There is no urgency in landing this.
Please let me know when it'd be a good time to land this.

@AaronBallman it'd be nice to get this reviewed anyway to make sure this change makes sense from the matchers side.

@ziqingluo-90
Copy link
Contributor

@ilya-biryukov Our downstream should be clear now. Feel free to merge.
This change LGTM but I will let @AaronBallman to approve.

@ilya-biryukov ilya-biryukov requested a review from usx95 April 4, 2025 09:27
@ilya-biryukov
Copy link
Contributor Author

Friendly ping @AaronBallman to take a look.
And also add @usx95 to get another pair of eyes and faster review in case Aaron is busy.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ilya-biryukov ilya-biryukov merged commit da69eb7 into llvm:main Apr 4, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants