-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[clang][nullability] Don't discard expression state before end of full-expression. #82611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -94,6 +94,38 @@ static llvm::BitVector findReachableBlocks(const CFG &Cfg) { | |
return BlockReachable; | ||
} | ||
|
||
static llvm::DenseSet<const CFGBlock *> | ||
buildContainsExprConsumedInDifferentBlock( | ||
const CFG &Cfg, | ||
const llvm::DenseMap<const Stmt *, const CFGBlock *> &StmtToBlock) { | ||
llvm::DenseSet<const CFGBlock *> Result; | ||
|
||
auto CheckChildExprs = [&Result, &StmtToBlock](const Stmt *S, | ||
const CFGBlock *Block) { | ||
for (const Stmt *Child : S->children()) { | ||
if (!isa<Expr>(Child)) | ||
continue; | ||
const CFGBlock *ChildBlock = StmtToBlock.lookup(Child); | ||
if (ChildBlock != Block) | ||
Result.insert(ChildBlock); | ||
} | ||
}; | ||
|
||
for (const CFGBlock *Block : Cfg) { | ||
if (Block == nullptr) | ||
continue; | ||
|
||
for (const CFGElement &Element : *Block) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Could we early terminate as soon as we found an expression that is consumed in a different block? Also, in case we do early termination, would it make sense to go backwards and start with the terminator conditions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
That doesn't work, because we insert |
||
if (auto S = Element.getAs<CFGStmt>()) | ||
CheckChildExprs(S->getStmt(), Block); | ||
|
||
if (const Stmt *TerminatorCond = Block->getTerminatorCondition()) | ||
CheckChildExprs(TerminatorCond, Block); | ||
} | ||
|
||
return Result; | ||
} | ||
|
||
llvm::Expected<ControlFlowContext> | ||
ControlFlowContext::build(const FunctionDecl &Func) { | ||
if (!Func.doesThisDeclarationHaveABody()) | ||
|
@@ -140,8 +172,12 @@ ControlFlowContext::build(const Decl &D, Stmt &S, ASTContext &C) { | |
|
||
llvm::BitVector BlockReachable = findReachableBlocks(*Cfg); | ||
|
||
llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock = | ||
buildContainsExprConsumedInDifferentBlock(*Cfg, StmtToBlock); | ||
|
||
return ControlFlowContext(D, std::move(Cfg), std::move(StmtToBlock), | ||
std::move(BlockReachable)); | ||
std::move(BlockReachable), | ||
std::move(ContainsExprConsumedInDifferentBlock)); | ||
} | ||
|
||
} // namespace dataflow | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -221,18 +221,21 @@ class PrettyStackTraceCFGElement : public llvm::PrettyStackTraceEntry { | |
// Avoids unneccesary copies of the environment. | ||
class JoinedStateBuilder { | ||
AnalysisContext &AC; | ||
Environment::ExprJoinBehavior JoinBehavior; | ||
std::vector<const TypeErasedDataflowAnalysisState *> All; | ||
std::deque<TypeErasedDataflowAnalysisState> Owned; | ||
|
||
TypeErasedDataflowAnalysisState | ||
join(const TypeErasedDataflowAnalysisState &L, | ||
const TypeErasedDataflowAnalysisState &R) { | ||
return {AC.Analysis.joinTypeErased(L.Lattice, R.Lattice), | ||
Environment::join(L.Env, R.Env, AC.Analysis)}; | ||
Environment::join(L.Env, R.Env, AC.Analysis, JoinBehavior)}; | ||
} | ||
|
||
public: | ||
JoinedStateBuilder(AnalysisContext &AC) : AC(AC) {} | ||
JoinedStateBuilder(AnalysisContext &AC, | ||
Environment::ExprJoinBehavior JoinBehavior) | ||
: AC(AC), JoinBehavior(JoinBehavior) {} | ||
|
||
void addOwned(TypeErasedDataflowAnalysisState State) { | ||
Owned.push_back(std::move(State)); | ||
|
@@ -248,12 +251,12 @@ class JoinedStateBuilder { | |
// initialize the state of each basic block differently. | ||
return {AC.Analysis.typeErasedInitialElement(), AC.InitEnv.fork()}; | ||
if (All.size() == 1) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. guard on ExprBehavior as well, like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
We would also need to return the existing state in the case
And it doesn't deal with the FIXME, as we still need some way to discard the expression state. I'm not sure that, all in all, changing the code in this way would be preferable. The upside is that we're now a bit more efficient in the case where The downside is that we're treating There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Agreed with keeping as is for reducing complexity. |
||
// Join the environment with itself so that we discard the entries from | ||
// `ExprToLoc` and `ExprToVal`. | ||
// Join the environment with itself so that we discard expression state if | ||
// desired. | ||
// FIXME: We could consider writing special-case code for this that only | ||
// does the discarding, but it's not clear if this is worth it. | ||
return {All[0]->Lattice, | ||
Environment::join(All[0]->Env, All[0]->Env, AC.Analysis)}; | ||
return {All[0]->Lattice, Environment::join(All[0]->Env, All[0]->Env, | ||
AC.Analysis, JoinBehavior)}; | ||
|
||
auto Result = join(*All[0], *All[1]); | ||
for (unsigned I = 2; I < All.size(); ++I) | ||
|
@@ -307,7 +310,22 @@ computeBlockInputState(const CFGBlock &Block, AnalysisContext &AC) { | |
} | ||
} | ||
|
||
JoinedStateBuilder Builder(AC); | ||
// If any of the predecessor blocks contains an expression consumed in a | ||
// different block, we need to keep expression state. | ||
// Note that in this case, we keep expression state for all predecessors, | ||
// rather than only those predecessors that actually contain an expression | ||
// consumed in a different block. While this is potentially suboptimal, it's | ||
// actually likely, if we have control flow within a full expression, that | ||
// all predecessors have expression state consumed in a different block. | ||
Environment::ExprJoinBehavior JoinBehavior = Environment::DiscardExprState; | ||
for (const CFGBlock *Pred : Preds) { | ||
if (Pred && AC.CFCtx.containsExprConsumedInDifferentBlock(*Pred)) { | ||
JoinBehavior = Environment::KeepExprState; | ||
break; | ||
} | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Since this is a property of the CFG, can we compute this just once per block (rather than every time we visit the block)? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried this out -- essentially, I added a method to
I would argue that On the other hand, Yes, this means we compute this information each time we visit the block, but this appears pretty insignificant compared to the work we do actually processing the block. I ran the Crubit nullability check benchmarks with the draft change I mentioned, and the speedup is too small to distinguish from measurement noise at the CPU time level, though a small significant change can be seen at the cycles level. I don't think this justifies changing the current design, which I think has better separation of concerns. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. re: EmbellishedCFG -- sure. Ok, I'm fine leaving as is, given the benchmarks. But, I would think there's a simple compromise (say for a followup patch) of just computing it locally in this file (as now), but caching it in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm still not sure this is worth it - this would add additional stored state (even if it's non-mutable), and I'm not sure this is worth it for the small gain we get. Will commit as-is, and we can discuss whether we want to create a followup patch for this. |
||
|
||
JoinedStateBuilder Builder(AC, JoinBehavior); | ||
for (const CFGBlock *Pred : Preds) { | ||
// Skip if the `Block` is unreachable or control flow cannot get past it. | ||
if (!Pred || Pred->hasNoReturnElement()) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, could be a bitset that contains the block ids. Although, since we probably expect this set to be sparse/small, maybe the current solution is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, I'm not sure which way this goes. This feature doesn't appear to be a bit performance issue one way or the other, so I'm inclined to keep it as is.