-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[-Wcompletion-handler] Fix a non-termination issue #78380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The Called-Once dataflow analysis could never terminate as a consequence of non-monotonic update on states. States of kind Escape can override states leading to non-monotonic update. The fix uses finite state sets instead of single states to represent CFG block outputs, which grows in uni-direction. To preserve the behavior of the analyzer, a single state can be extracted from a state set. The extraction follows the idea that Escape can override error states within one block but obeys to the partial-order for join. rdar://119671856
@llvm/pr-subscribers-clang @llvm/pr-subscribers-clang-analysis Author: Ziqing Luo (ziqingluo-90) ChangesThe analysis goes to infinity on the example below:
A simplified CFG will look like this:
The analysis goes backwards so it starts from The key issue is that the My humble understand of why to design the states this way is that if Full diff: https://github.com/llvm/llvm-project/pull/78380.diff 2 Files Affected:
diff --git a/clang/lib/Analysis/CalledOnceCheck.cpp b/clang/lib/Analysis/CalledOnceCheck.cpp
index 04c5f6aa9c7450..3fab93b1d09cdf 100644
--- a/clang/lib/Analysis/CalledOnceCheck.cpp
+++ b/clang/lib/Analysis/CalledOnceCheck.cpp
@@ -163,7 +163,7 @@ class ParameterStatus {
NotVisited = 0x8, /* 1000 */
// We already reported a violation and stopped tracking calls for this
// parameter.
- Reported = 0x15, /* 1111 */
+ Reported = 0xF, /* 1111 */
LLVM_MARK_AS_BITMASK_ENUM(/* LargestValue = */ Reported)
};
@@ -215,6 +215,55 @@ class ParameterStatus {
const Expr *Call = nullptr;
};
+// Represents a set of `ParameterStatus`s collected in a single CFGBlock.
+class ParamStatusSet {
+private:
+ // The kinds of `States` spans from 0x0 to 0x15, so 16 bits are enough:
+ llvm::BitVector Set{16, false};
+ // Following the exisitng idea: if there are multiple calls of interest in one
+ // block, recording one of them is good enough.
+ const Expr *Call = nullptr;
+
+public:
+ ParamStatusSet() {}
+
+ // Add a new `ParameterStatus` to the set. Returns true iff the adding status
+ // was new to the set.
+ bool add(ParameterStatus PS) {
+ assert(PS.getKind() != ParameterStatus::Kind::NotVisited &&
+ "the status cannot be NotVisited after visiting a block");
+ if (Set.test(PS.getKind()))
+ return false;
+ Set.set(PS.getKind());
+ if (PS.seenAnyCalls())
+ Call = &PS.getCall();
+ return true;
+ }
+
+ // Get one `ParameterStatus` to represent the set of them. The idea is to
+ // take a join on them but let ESCAPE overrides error statuses.
+ ParameterStatus summaryStatus() {
+ unsigned Summary = 0x0;
+
+ for (unsigned Idx :
+ llvm::make_range(Set.set_bits_begin(), Set.set_bits_end()))
+ Summary |= Idx;
+ assert(Summary == 0x0 || Summary == 0x1 || Summary == 0x3 ||
+ Summary == 0x5 || Summary == 0x7 ||
+ Summary == 0xF && "expecting a defined value");
+
+ ParameterStatus::Kind SummaryKind = ParameterStatus::Kind(Summary);
+
+ if (SummaryKind > ParameterStatus::Kind::NON_ERROR_STATUS &&
+ Set.test(ParameterStatus::Kind::Escaped)) {
+ SummaryKind = ParameterStatus::Kind::Escaped;
+ }
+ if (ParameterStatus::seenAnyCalls(SummaryKind))
+ return {SummaryKind, Call};
+ return {SummaryKind};
+ }
+};
+
/// State aggregates statuses of all tracked parameters.
class State {
public:
@@ -274,6 +323,53 @@ class State {
private:
ParamSizedVector<ParameterStatus> ParamData;
+
+ friend class CFGBlockOutputState;
+
+ State(ParamSizedVector<ParameterStatus> ParamData) : ParamData(ParamData) {}
+
+ unsigned size() const { return ParamData.size(); }
+};
+
+// A different kind of "state" in addition to `State`. `CFGBlockOutputState`
+// are created for dealing with the non-termination issue due to `State`s are
+// not being updated monotonically at the output of each CFGBlock.
+// A `CFGBlockOutputState` is in fact a finite set of `State`s that
+// grows monotonically. One can extract a `State` from a `CFGBlockOutputState`.
+// Note that the `State`-extraction does NOT guarantee monotone but it
+// respects the existing semantics. Specifically, ESCAPE is "greater than"
+// other error states in a single path but is "less than" them at JOIN.
+//
+// `CFGBlockOutputState` will be used to terminate the fix-point computation.
+class CFGBlockOutputState {
+private:
+ ParamSizedVector<ParamStatusSet> StatusSets;
+
+public:
+ CFGBlockOutputState(unsigned Size) : StatusSets{Size} {};
+
+ // Update this `CFGBlockOutputState` with a newly computed `State`. Return
+ // true iff `CFGBlockOutputState` changed.
+ bool update(const State &NewState) {
+ bool Changed = false;
+
+ for (unsigned Idx = 0; Idx < NewState.size(); ++Idx) {
+ if (StatusSets[Idx].add(NewState.getStatusFor(Idx))) {
+ Changed = true;
+ }
+ }
+ return Changed;
+ }
+
+ // Return a `State` that represents the `CFGBlockOutputState`.
+ State getState() {
+ ParamSizedVector<ParameterStatus> ParamData{StatusSets.size()};
+
+ for (unsigned Idx = 0; Idx < ParamData.size(); ++Idx) {
+ ParamData[Idx] = StatusSets[Idx].summaryStatus();
+ }
+ return State{ParamData};
+ }
};
/// A simple class that finds DeclRefExpr in the given expression.
@@ -639,6 +735,8 @@ class CalledOnceChecker : public ConstStmtVisitor<CalledOnceChecker> {
if (size() != 0) {
States =
CFGSizedVector<State>(FunctionCFG.getNumBlockIDs(), State(size()));
+ CFGBlockOutputStates = CFGSizedVector<CFGBlockOutputState>(
+ FunctionCFG.getNumBlockIDs(), CFGBlockOutputState(size()));
}
}
@@ -1305,17 +1403,17 @@ class CalledOnceChecker : public ConstStmtVisitor<CalledOnceChecker> {
}
/// \}
- /// Assign status to the given basic block.
- ///
- /// Returns true when the stored status changed.
+ /// Update output state for the CFGBlock.
+ /// Returns true when the stored state changed.
bool assignState(const CFGBlock *BB, const State &ToAssign) {
- State &Current = getState(BB);
- if (Current == ToAssign) {
- return false;
- }
+ CFGBlockOutputState &OldOutputState =
+ CFGBlockOutputStates[BB->getBlockID()];
- Current = ToAssign;
- return true;
+ if (OldOutputState.update(ToAssign)) {
+ getState(BB) = OldOutputState.getState();
+ return true;
+ }
+ return false;
}
/// Join all incoming statuses for the given basic block.
@@ -1692,6 +1790,8 @@ class CalledOnceChecker : public ConstStmtVisitor<CalledOnceChecker> {
State CurrentState;
ParamSizedVector<const ParmVarDecl *> TrackedParams;
CFGSizedVector<State> States;
+ // The mapping from each `CFGBlock` to its `CFGBlockOutputState`:
+ CFGSizedVector<CFGBlockOutputState> CFGBlockOutputStates;
};
} // end anonymous namespace
diff --git a/clang/test/SemaObjC/warn-called-once.m b/clang/test/SemaObjC/warn-called-once.m
index dbe8dc1cf1ae17..f23e41e00ee298 100644
--- a/clang/test/SemaObjC/warn-called-once.m
+++ b/clang/test/SemaObjC/warn-called-once.m
@@ -1194,6 +1194,16 @@ - (void)test_escape_after_branch:(int)cond
escape(handler);
}
+- (void)test_termination:(int)cond
+ withCompletion:(void (^)(void))handler {
+ // The code below was able to cause non-termination but should be
+ // fixed now:
+ do {
+ escape(handler);
+ handler(); // expected-warning{{completion handler is called twice}} expected-note{{previous call is here; set to nil to indicate it cannot be called afterwards}}
+ } while (cond);
+}
+
typedef void (^DeferredBlock)(void);
static inline void DefferedCallback(DeferredBlock *inBlock) { (*inBlock)(); }
#define _DEFERCONCAT(a, b) a##b
|
Hey @ziqingluo-90, thank you for such a detailed description of the problem. It helped me to get up to speed quickly. 🙏 Considering these rules (even be it for joining multiple paths) https://github.com/llvm/llvm-project/blob/main/clang/lib/Analysis/CalledOnceCheck.cpp#L119-L145 we shouldn't override
We do consider Reported to be an error state, and overwrite it. Maybe changing this logic and excluding Reported from that condition (i.e. x.isErrorStatus() && x != Reported ) is the way to go? The reason for that is that we already found a problem on that path and the fact that it escaped before it shouldn't affect our conclusion. Another reason is more formal, Reported is supposed to be a top element of the lattice and this logic defies it.
Thanks again! |
This reverts commit e7c3a3f.
The Called-Once dataflow analysis could never terminate as a consequence of non-monotonic update on states. States of kind Escape can override states leading to non-monotonic update. This fix disallows the `Escape` state to override the `Reported` state. rdar://119671856
✅ With the latest revision this PR passed the C/C++ code formatter. |
@SavchenkoValeriy, sorry for the late response. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome! Thank you for fixing the problem and for describing it so in detail.
The Called-Once dataflow analysis could never terminate as a consequence of non-monotonic update on states. States of kind Escape can override states leading to non-monotonic update. This fix disallows the `Escape` state to override the `Reported` state. rdar://119671856 (cherry picked from commit 2a06850)
The Called-Once dataflow analysis could never terminate as a consequence of non-monotonic update on states. States of kind Escape can override states leading to non-monotonic update. This fix disallows the `Escape` state to override the `Reported` state. rdar://119671856 (cherry picked from commit 2a06850)
The analysis goes to infinity on the example below:
A simplified CFG will look like this:
The analysis goes backwards so it starts from
B2
with output stateNotCalled
. ThenB1
's output state isDefinitelyCalled
(Note thatescape
is visited but it only overrides error states). Next,B0
outputsDefinitelyCalled
as well. ThenB1
is visited again: the input is the join overDefinitelyCalled
andNotCalled
resulting inMaybeCalled
; it gets updated toReported
atf()
andEscape
later atescape(f)
. SoB0
now outputsEscape
as well. Then again, the input ofB1
is the join overEscape
andNotCalled
resulting inNotCalled
, and so on to the infinity.The key issue is that the
join
operator defines a partial order over the states, in whichEscape < any error states
. Meanwhile, the analyzer allowsEscape
to override error states during a CFG Block visit. Therefore, the output state of a CFG block can change from a "greater" state to a "smaller" state (i.e., it does not change monotonically).My humble understand of why to design the states this way is that if
Escape
happens on a single path, we do not report anything on that path for being less noisy whileEscape
on one path should not prevent the analyzer from warning the other paths.To preserve such an idea, I propose to use sets of states as CFG block outputs. The set is finite and grows in one direction, so it guarantees termination. For obtaining an input from such an output, a single state can be "extracted" from a state set. The extraction is basically a join over the set and it will let
Escape
to override error states.