Skip to content

Commit 2709998

Browse files
authored
[NFC][analyzer] Framework for multipart checkers (#130985)
In the static analyzer codebase we have a traditional pattern where a single checker class (and its singleton instance) acts as the implementation of several (user-facing or modeling) checkers that have shared state and logic, but have their own names and can be enabled or disabled separately. Currently these multipart checker classes all reimplement the same boilerplate logic to store the enabled/disabled state, the name and the bug types associated with the checker parts. This commit extends `CheckerBase`, `BugType` and the checker registration process to offer an easy-to-use alternative to that boilerplate (which includes the ugly lazy initialization of `mutable std::unique_ptr<BugType>`s). In this new framework the single-part checkers are internally represented as "multipart checkers with just one part" (because this way I don't need to reimplement the same logic twice) but this does not require any changes in the code of simple single-part checkers. I do not claim that these multi-part checkers are perfect from an architectural point of view; but they won't suddenly disappear after many years of existence, so we might as well introduce a clear framework for them. (Switching to e.g. 1:1 correspondence between checker classes and checker names would be a prohibitively complex change.) This PR ports `DivZeroChecker` to the new framework as a proof of concept. I'm planning to do a series of follow-up commits to port the rest of the multi-part checker.
1 parent 7dcea28 commit 2709998

File tree

5 files changed

+142
-68
lines changed

5 files changed

+142
-68
lines changed

clang/include/clang/StaticAnalyzer/Core/BugReporter/BugType.h

Lines changed: 31 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -17,6 +17,7 @@
1717
#include "clang/StaticAnalyzer/Core/BugReporter/CommonBugCategories.h"
1818
#include "clang/StaticAnalyzer/Core/Checker.h"
1919
#include <string>
20+
#include <variant>
2021

2122
namespace clang {
2223

@@ -26,36 +27,53 @@ class BugReporter;
2627

2728
class BugType {
2829
private:
29-
const CheckerNameRef CheckerName;
30+
struct CheckerPartRef {
31+
const CheckerBase *Checker;
32+
CheckerPartIdx Idx;
33+
};
34+
using CheckerNameInfo = std::variant<CheckerNameRef, CheckerPartRef>;
35+
36+
const CheckerNameInfo CheckerName;
3037
const std::string Description;
3138
const std::string Category;
32-
const CheckerBase *Checker;
3339
bool SuppressOnSink;
3440

3541
virtual void anchor();
3642

3743
public:
44+
// Straightforward constructor where the checker name is specified directly.
45+
// TODO: As far as I know all applications of this constructor involve ugly
46+
// hacks that could be avoided by switching to a different constructor.
47+
// When those are all eliminated, this constructor should be removed to
48+
// eliminate the `variant` and simplify this class.
3849
BugType(CheckerNameRef CheckerName, StringRef Desc,
3950
StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
4051
: CheckerName(CheckerName), Description(Desc), Category(Cat),
41-
Checker(nullptr), SuppressOnSink(SuppressOnSink) {}
52+
SuppressOnSink(SuppressOnSink) {}
53+
// Constructor that can be called from the constructor of a checker object.
54+
// At that point the checker name is not yet available, but we can save a
55+
// pointer to the checker and later use that to query the name.
4256
BugType(const CheckerBase *Checker, StringRef Desc,
4357
StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
44-
: CheckerName(), Description(Desc), Category(Cat), Checker(Checker),
45-
SuppressOnSink(SuppressOnSink) {}
58+
: CheckerName(CheckerPartRef{Checker, DefaultPart}), Description(Desc),
59+
Category(Cat), SuppressOnSink(SuppressOnSink) {}
60+
// Constructor that can be called from the constructor of a checker object
61+
// when it has multiple parts with separate names. We save the name and the
62+
// part index to be able to query the name of that part later.
63+
BugType(const CheckerBase *Checker, CheckerPartIdx Idx, StringRef Desc,
64+
StringRef Cat = categories::LogicError, bool SuppressOnSink = false)
65+
: CheckerName(CheckerPartRef{Checker, Idx}), Description(Desc),
66+
Category(Cat), SuppressOnSink(SuppressOnSink) {}
4667
virtual ~BugType() = default;
4768

4869
StringRef getDescription() const { return Description; }
4970
StringRef getCategory() const { return Category; }
5071
StringRef getCheckerName() const {
51-
// FIXME: This is a workaround to ensure that the correct checker name is
52-
// used. The checker names are set after the constructors are run.
53-
// In case the BugType object is initialized in the checker's ctor
54-
// the CheckerName field will be empty. To circumvent this problem we use
55-
// CheckerBase whenever it is possible.
56-
StringRef Ret = Checker ? Checker->getName() : CheckerName;
57-
assert(!Ret.empty() && "Checker name is not set properly.");
58-
return Ret;
72+
if (const auto *CNR = std::get_if<CheckerNameRef>(&CheckerName))
73+
return *CNR;
74+
75+
auto [Checker, Idx] = std::get<CheckerPartRef>(CheckerName);
76+
return Checker->getName(Idx);
5977
}
6078

6179
/// isSuppressOnSink - Returns true if bug reports associated with this bug

clang/include/clang/StaticAnalyzer/Core/Checker.h

Lines changed: 49 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -485,16 +485,60 @@ class Call {
485485
} // end eval namespace
486486

487487
class CheckerBase : public ProgramPointTag {
488-
CheckerNameRef Name;
488+
/// A single checker class (i.e. a subclass of `CheckerBase`) can implement
489+
/// multiple user-facing checkers that have separate names and can be enabled
490+
/// separately, but are backed by the same singleton checker object.
491+
SmallVector<std::optional<CheckerNameRef>, 1> RegisteredNames;
492+
489493
friend class ::clang::ento::CheckerManager;
490494

491495
public:
492-
StringRef getTagDescription() const override;
493-
CheckerNameRef getName() const;
496+
CheckerNameRef getName(CheckerPartIdx Idx = DefaultPart) const {
497+
assert(Idx < RegisteredNames.size() && "Checker part index is too large!");
498+
std::optional<CheckerNameRef> Name = RegisteredNames[Idx];
499+
assert(Name && "Requested checker part is not registered!");
500+
return *Name;
501+
}
502+
503+
bool isPartEnabled(CheckerPartIdx Idx) const {
504+
return Idx < RegisteredNames.size() && RegisteredNames[Idx].has_value();
505+
}
506+
507+
void registerCheckerPart(CheckerPartIdx Idx, CheckerNameRef Name) {
508+
// Paranoia: notice if e.g. UINT_MAX is passed as a checker part index.
509+
assert(Idx < 256 && "Checker part identifiers should be small integers.");
510+
511+
if (Idx >= RegisteredNames.size())
512+
RegisteredNames.resize(Idx + 1, std::nullopt);
513+
514+
assert(!RegisteredNames[Idx] && "Repeated registration of checker a part!");
515+
RegisteredNames[Idx] = Name;
516+
}
517+
518+
StringRef getTagDescription() const override {
519+
// This method inherited from `ProgramPointTag` has two unrelated roles:
520+
// (1) The analyzer option handling logic uses this method to query the
521+
// name of a checker.
522+
// (2) When the `ExplodedGraph` is dumped in DOT format for debugging,
523+
// this is called to attach a description to the nodes. (This happens
524+
// for all subclasses of `ProgramPointTag`, not just checkers.)
525+
// FIXME: Application (1) should be aware of multiple parts within the same
526+
// checker class instance, so it should directly use `getName` instead of
527+
// this inherited interface which cannot support a `CheckerPartIdx`.
528+
// FIXME: Ideally application (2) should return a string that describes the
529+
// whole checker class, not just one of it parts. However, this is only for
530+
// debugging, so returning the name of one part is probably good enough.
531+
for (const auto &OptName : RegisteredNames)
532+
if (OptName)
533+
return *OptName;
534+
535+
return "Unregistered checker";
536+
}
494537

495-
/// See CheckerManager::runCheckersForPrintState.
538+
/// Debug state dump callback, see CheckerManager::runCheckersForPrintState.
539+
/// Default implementation does nothing.
496540
virtual void printState(raw_ostream &Out, ProgramStateRef State,
497-
const char *NL, const char *Sep) const { }
541+
const char *NL, const char *Sep) const;
498542
};
499543

500544
/// Dump checker name to stream.

clang/include/clang/StaticAnalyzer/Core/CheckerManager.h

Lines changed: 41 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -116,6 +116,19 @@ class CheckerNameRef {
116116
operator StringRef() const { return Name; }
117117
};
118118

119+
/// A single checker class (and its singleton instance) can act as the
120+
/// implementation of several (user-facing or modeling) checker parts that
121+
/// have shared state and logic, but have their own names and can be enabled or
122+
/// disabled separately.
123+
/// Each checker class that implement multiple parts introduces its own enum
124+
/// type to assign small numerical indices (0, 1, 2 ...) to their parts. The
125+
/// type alias 'CheckerPartIdx' is conceptually the union of these enum types.
126+
using CheckerPartIdx = unsigned;
127+
128+
/// If a checker doesn't have multiple parts, then its single part is
129+
/// represented by this index.
130+
constexpr inline CheckerPartIdx DefaultPart = 0;
131+
119132
enum class ObjCMessageVisitKind {
120133
Pre,
121134
Post,
@@ -190,23 +203,38 @@ class CheckerManager {
190203
// Checker registration.
191204
//===--------------------------------------------------------------------===//
192205

193-
/// Used to register checkers.
194-
/// All arguments are automatically passed through to the checker
195-
/// constructor.
206+
/// Construct the singleton instance of a checker, register it for the
207+
/// supported callbacks and record its name with `registerCheckerPart()`.
208+
/// Arguments passed to this function are forwarded to the constructor of the
209+
/// checker.
210+
///
211+
/// If `CHECKER` has multiple parts, then the constructor call and the
212+
/// callback registration only happen within the first `registerChecker()`
213+
/// call; while the subsequent calls only enable additional parts of the
214+
/// existing checker object (while registering their names).
196215
///
197216
/// \returns a pointer to the checker object.
198-
template <typename CHECKER, typename... AT>
199-
CHECKER *registerChecker(AT &&... Args) {
217+
template <typename CHECKER, CheckerPartIdx Idx = DefaultPart, typename... AT>
218+
CHECKER *registerChecker(AT &&...Args) {
219+
// This assert could be removed but then we need to make sure that calls
220+
// registering different parts of the same checker pass the same arguments.
221+
static_assert(
222+
Idx == DefaultPart || !sizeof...(AT),
223+
"Argument forwarding isn't supported with multi-part checkers!");
224+
200225
CheckerTag Tag = getTag<CHECKER>();
201-
std::unique_ptr<CheckerBase> &Ref = CheckerTags[Tag];
202-
assert(!Ref && "Checker already registered, use getChecker!");
203226

204-
std::unique_ptr<CHECKER> Checker =
205-
std::make_unique<CHECKER>(std::forward<AT>(Args)...);
206-
Checker->Name = CurrentCheckerName;
207-
CHECKER::_register(Checker.get(), *this);
208-
Ref = std::move(Checker);
209-
return static_cast<CHECKER *>(Ref.get());
227+
std::unique_ptr<CheckerBase> &Ref = CheckerTags[Tag];
228+
if (!Ref) {
229+
std::unique_ptr<CHECKER> Checker =
230+
std::make_unique<CHECKER>(std::forward<AT>(Args)...);
231+
CHECKER::_register(Checker.get(), *this);
232+
Ref = std::move(Checker);
233+
}
234+
235+
CHECKER *Result = static_cast<CHECKER *>(Ref.get());
236+
Result->registerCheckerPart(Idx, CurrentCheckerName);
237+
return Result;
210238
}
211239

212240
template <typename CHECKER>

clang/lib/StaticAnalyzer/Checkers/DivZeroChecker.cpp

Lines changed: 19 additions & 34 deletions
Original file line numberDiff line numberDiff line change
@@ -34,10 +34,14 @@ class DivZeroChecker : public Checker<check::PreStmt<BinaryOperator>> {
3434

3535
public:
3636
/// This checker class implements several user facing checkers
37-
enum CheckKind { CK_DivideZero, CK_TaintedDivChecker, CK_NumCheckKinds };
38-
bool ChecksEnabled[CK_NumCheckKinds] = {false};
39-
CheckerNameRef CheckNames[CK_NumCheckKinds];
40-
mutable std::unique_ptr<BugType> BugTypes[CK_NumCheckKinds];
37+
enum : CheckerPartIdx {
38+
DivideZeroChecker,
39+
TaintedDivChecker,
40+
NumCheckerParts
41+
};
42+
BugType BugTypes[NumCheckerParts] = {
43+
{this, DivideZeroChecker, "Division by zero"},
44+
{this, TaintedDivChecker, "Division by zero", categories::TaintedData}};
4145

4246
void checkPreStmt(const BinaryOperator *B, CheckerContext &C) const;
4347
};
@@ -52,14 +56,11 @@ static const Expr *getDenomExpr(const ExplodedNode *N) {
5256

5357
void DivZeroChecker::reportBug(StringRef Msg, ProgramStateRef StateZero,
5458
CheckerContext &C) const {
55-
if (!ChecksEnabled[CK_DivideZero])
59+
if (!isPartEnabled(DivideZeroChecker))
5660
return;
57-
if (!BugTypes[CK_DivideZero])
58-
BugTypes[CK_DivideZero].reset(
59-
new BugType(CheckNames[CK_DivideZero], "Division by zero"));
6061
if (ExplodedNode *N = C.generateErrorNode(StateZero)) {
61-
auto R = std::make_unique<PathSensitiveBugReport>(*BugTypes[CK_DivideZero],
62-
Msg, N);
62+
auto R = std::make_unique<PathSensitiveBugReport>(
63+
BugTypes[DivideZeroChecker], Msg, N);
6364
bugreporter::trackExpressionValue(N, getDenomExpr(N), *R);
6465
C.emitReport(std::move(R));
6566
}
@@ -68,15 +69,11 @@ void DivZeroChecker::reportBug(StringRef Msg, ProgramStateRef StateZero,
6869
void DivZeroChecker::reportTaintBug(
6970
StringRef Msg, ProgramStateRef StateZero, CheckerContext &C,
7071
llvm::ArrayRef<SymbolRef> TaintedSyms) const {
71-
if (!ChecksEnabled[CK_TaintedDivChecker])
72+
if (!isPartEnabled(TaintedDivChecker))
7273
return;
73-
if (!BugTypes[CK_TaintedDivChecker])
74-
BugTypes[CK_TaintedDivChecker].reset(
75-
new BugType(CheckNames[CK_TaintedDivChecker], "Division by zero",
76-
categories::TaintedData));
7774
if (ExplodedNode *N = C.generateNonFatalErrorNode(StateZero)) {
7875
auto R = std::make_unique<PathSensitiveBugReport>(
79-
*BugTypes[CK_TaintedDivChecker], Msg, N);
76+
BugTypes[TaintedDivChecker], Msg, N);
8077
bugreporter::trackExpressionValue(N, getDenomExpr(N), *R);
8178
for (auto Sym : TaintedSyms)
8279
R->markInteresting(Sym);
@@ -129,28 +126,16 @@ void DivZeroChecker::checkPreStmt(const BinaryOperator *B,
129126
C.addTransition(stateNotZero);
130127
}
131128

132-
void ento::registerDivZeroChecker(CheckerManager &mgr) {
133-
DivZeroChecker *checker = mgr.registerChecker<DivZeroChecker>();
134-
checker->ChecksEnabled[DivZeroChecker::CK_DivideZero] = true;
135-
checker->CheckNames[DivZeroChecker::CK_DivideZero] =
136-
mgr.getCurrentCheckerName();
129+
void ento::registerDivZeroChecker(CheckerManager &Mgr) {
130+
Mgr.registerChecker<DivZeroChecker, DivZeroChecker::DivideZeroChecker>();
137131
}
138132

139-
bool ento::shouldRegisterDivZeroChecker(const CheckerManager &mgr) {
140-
return true;
141-
}
133+
bool ento::shouldRegisterDivZeroChecker(const CheckerManager &) { return true; }
142134

143-
void ento::registerTaintedDivChecker(CheckerManager &mgr) {
144-
DivZeroChecker *checker;
145-
if (!mgr.isRegisteredChecker<DivZeroChecker>())
146-
checker = mgr.registerChecker<DivZeroChecker>();
147-
else
148-
checker = mgr.getChecker<DivZeroChecker>();
149-
checker->ChecksEnabled[DivZeroChecker::CK_TaintedDivChecker] = true;
150-
checker->CheckNames[DivZeroChecker::CK_TaintedDivChecker] =
151-
mgr.getCurrentCheckerName();
135+
void ento::registerTaintedDivChecker(CheckerManager &Mgr) {
136+
Mgr.registerChecker<DivZeroChecker, DivZeroChecker::TaintedDivChecker>();
152137
}
153138

154-
bool ento::shouldRegisterTaintedDivChecker(const CheckerManager &mgr) {
139+
bool ento::shouldRegisterTaintedDivChecker(const CheckerManager &) {
155140
return true;
156141
}

clang/lib/StaticAnalyzer/Core/Checker.cpp

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,9 +18,8 @@ using namespace ento;
1818

1919
int ImplicitNullDerefEvent::Tag;
2020

21-
StringRef CheckerBase::getTagDescription() const { return getName(); }
22-
23-
CheckerNameRef CheckerBase::getName() const { return Name; }
21+
void CheckerBase::printState(raw_ostream &Out, ProgramStateRef State,
22+
const char *NL, const char *Sep) const {}
2423

2524
CheckerProgramPointTag::CheckerProgramPointTag(StringRef CheckerName,
2625
StringRef Msg)

0 commit comments

Comments
 (0)