Skip to content

[clang][nullability] Don't discard expression state before end of full-expression. #82611

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 7, 2024

Conversation

martinboehme
Copy link
Contributor

@martinboehme martinboehme commented Feb 22, 2024

In #72985, I made a change to discard
expression state (ExprToLoc and ExprToVal) at the beginning of each basic
block. I did so with the claim that "we never need to access entries from these
maps outside of the current basic block", noting that there are exceptions to
this claim when control flow happens inside a full-expression (the operands of
&&, ||, and the conditional operator live in different basic blocks than the
operator itself) but that we already have a mechanism for retrieving the values
of these operands from the environment for the block they are computed in.

It turns out, however, that the operands of these operators aren't the only
expressions whose values can be accessed from a different basic block; when
control flow happens within a full-expression, that control flow can be
"interposed" between an expression and its parent. Here is an example:

void f(int*, int);
bool cond();

void target() {
  int i = 0;
  f(&i, cond() ? 1 : 0);
}

(godbolt)

In the CFG1 , note how the expression for &i is computed in block B4,
but the parent of this expression (the CallExpr) is located in block B1.
The the argument expression &i and the CallExpr are essentially "torn apart"
into different basic blocks by the conditional operator in the second argument.
In other words, the edge between the CallExpr and its argument &i straddles
the boundary between two blocks.

I used to think that this scenario -- where an edge between an expression and
one of its children straddles a block boundary -- could only happen between the
expression that triggers the control flow (&&, ||, or the conditional
operator) and its children, but the example above shows that other expressions
can be affected as well; the control flow is still triggered by &&, || or
the conditional operator, but the expressions affected lie outside these
operators.

Discarding expression state too soon is harmful. For example, an analysis that
checks the arguments of the CallExpr above would not be able to retrieve a
value for the &i argument.

This patch therefore ensures that we don't discard expression state before the
end of a full-expression. In other cases -- when the evaluation of a
full-expression is complete -- we still want to discard expression state for the
reasons explained in #72985 (avoid
performing joins on boolean values that are no longer needed, which
unnecessarily extends the flow condition; improve debuggability by removing
clutter from the expression state).

The impact on performance from this change is about a 1% slowdown in the
Crubit nullability check benchmarks:

name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     71.9µs ± 1%  71.9µs ± 2%    ~     (p=0.987 n=15+20)
BM_PointerAnalysisIntLoop          190µs ± 1%   192µs ± 2%  +1.06%  (p=0.000 n=14+16)
BM_PointerAnalysisPointerLoop      325µs ± 5%   324µs ± 4%    ~     (p=0.496 n=18+20)
BM_PointerAnalysisBranch           193µs ± 0%   192µs ± 4%    ~     (p=0.488 n=14+18)
BM_PointerAnalysisLoopAndBranch    521µs ± 1%   525µs ± 3%  +0.94%  (p=0.017 n=18+19)
BM_PointerAnalysisTwoLoops         337µs ± 1%   341µs ± 3%  +1.19%  (p=0.004 n=17+19)
BM_PointerAnalysisJoinFilePath    1.62ms ± 2%  1.64ms ± 3%  +0.92%  (p=0.021 n=20+20)
BM_PointerAnalysisCallInLoop      1.14ms ± 1%  1.15ms ± 4%    ~     (p=0.135 n=16+18)
 [B5 (ENTRY)]
   Succs (1): B4

 [B1]
   1: [B4.9] ? [B2.1] : [B3.1]
   2: [B4.4]([B4.6], [B1.1])
   Preds (2): B2 B3
   Succs (1): B0

 [B2]
   1: 1
   Preds (1): B4
   Succs (1): B1

 [B3]
   1: 0
   Preds (1): B4
   Succs (1): B1

 [B4]
   1: 0
   2: int i = 0;
   3: f
   4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int))
   5: i
   6: &[B4.5]
   7: cond
   8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void))
   9: [B4.8]()
   T: [B4.9] ? ... : ...
   Preds (1): B5
   Succs (2): B2 B3

 [B0 (EXIT)]
   Preds (1): B1

Footnotes

…l-expression.

In llvm#72985, I made a change to discard
expression state (`ExprToLoc` and `ExprToVal`) at the beginning of each basic
block. I did so with the claim that "we never need to access entries from these
maps outside of the current basic block", noting that there are exceptions to
this claim when control flow happens inside a full-expression (the operands of
`&&`, `||`, and the conditional operator live in different basic blocks than the
operator itself) but that we already have a mechanism for retrieving the values
of these operands from the environment for the block they are computed in.

It turns out, however, that the operands of these operators aren't the only
expressions whose values can be accessed from a different basic block; when
control flow happens within a full-expression, that control flow can be
"interposed" between an expression and its parent. Here is an example:

```cxx
void f(int*, int);
bool cond();

void target() {
  int i = 0;
  f(&i, cond() ? 1 : 0);
}
```

([godbolt](https://godbolt.org/z/hrbj1Mj3o))

In the CFG[^1] , note how the expression for `&i` is computed in block B4,
but the parent of this expression (the `CallExpr`) is located in block B1.
The the argument expression `&i` and the `CallExpr` are essentially "torn apart"
into different basic blocks by the conditional operator in the second argument.
In other words, the edge between the `CallExpr` and its argument `&i` straddles
the boundary between two blocks.

I used to think that this scenario -- where an edge between an expression and
one of its children straddles a block boundary -- could only happen between the
expression that triggers the control flow (`&&`, `||`, or the conditional
operator) and its children, but the example above shows that other expressions
can be affected as well; the control flow is still triggered by `&&`, `||` or
the conditional operator, but the expressions affected lie outside these
operators.

Discarding expression state too soon is harmful. For example, an analysis that
checks the arguments of the `CallExpr` above would not be able to retrieve a
value for the `&i` argument.

This patch therefore ensures that we don't discard expression state before the
end of a full-expression. In other cases -- when the evaluation of a
full-expression is complete -- we still want to discard expression state for the
reasons explained in llvm#72985 (avoid
performing joins on boolean values that are no longer needed, which
unnecessarily extends the flow condition; improve debuggability by removing
clutter from the expression state).

[^1]:
```
 [B5 (ENTRY)]
   Succs (1): B4

 [B1]
   1: [B4.9] ? [B2.1] : [B3.1]
   2: [B4.4]([B4.6], [B1.1])
   Preds (2): B2 B3
   Succs (1): B0

 [B2]
   1: 1
   Preds (1): B4
   Succs (1): B1

 [B3]
   1: 0
   Preds (1): B4
   Succs (1): B1

 [B4]
   1: 0
   2: int i = 0;
   3: f
   4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int))
   5: i
   6: &[B4.5]
   7: cond
   8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void))
   9: [B4.8]()
   T: [B4.9] ? ... : ...
   Preds (1): B5
   Succs (2): B2 B3

 [B0 (EXIT)]
   Preds (1): B1
```
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Feb 22, 2024
@martinboehme martinboehme requested review from ymand and Xazax-hun and removed request for ymand February 22, 2024 11:39
@llvmbot
Copy link
Member

llvmbot commented Feb 22, 2024

@llvm/pr-subscribers-clang

Author: None (martinboehme)

Changes

In #72985, I made a change to discard
expression state (ExprToLoc and ExprToVal) at the beginning of each basic
block. I did so with the claim that "we never need to access entries from these
maps outside of the current basic block", noting that there are exceptions to
this claim when control flow happens inside a full-expression (the operands of
&&, ||, and the conditional operator live in different basic blocks than the
operator itself) but that we already have a mechanism for retrieving the values
of these operands from the environment for the block they are computed in.

It turns out, however, that the operands of these operators aren't the only
expressions whose values can be accessed from a different basic block; when
control flow happens within a full-expression, that control flow can be
"interposed" between an expression and its parent. Here is an example:

void f(int*, int);
bool cond();

void target() {
  int i = 0;
  f(&i, cond() ? 1 : 0);
}

(godbolt)

In the CFG1 , note how the expression for &i is computed in block B4,
but the parent of this expression (the CallExpr) is located in block B1.
The the argument expression &i and the CallExpr are essentially "torn apart"
into different basic blocks by the conditional operator in the second argument.
In other words, the edge between the CallExpr and its argument &i straddles
the boundary between two blocks.

I used to think that this scenario -- where an edge between an expression and
one of its children straddles a block boundary -- could only happen between the
expression that triggers the control flow (&&, ||, or the conditional
operator) and its children, but the example above shows that other expressions
can be affected as well; the control flow is still triggered by &&, || or
the conditional operator, but the expressions affected lie outside these
operators.

Discarding expression state too soon is harmful. For example, an analysis that
checks the arguments of the CallExpr above would not be able to retrieve a
value for the &i argument.

This patch therefore ensures that we don't discard expression state before the
end of a full-expression. In other cases -- when the evaluation of a
full-expression is complete -- we still want to discard expression state for the
reasons explained in #72985 (avoid
performing joins on boolean values that are no longer needed, which
unnecessarily extends the flow condition; improve debuggability by removing
clutter from the expression state).

 [B5 (ENTRY)]
   Succs (1): B4

 [B1]
   1: [B4.9] ? [B2.1] : [B3.1]
   2: [B4.4]([B4.6], [B1.1])
   Preds (2): B2 B3
   Succs (1): B0

 [B2]
   1: 1
   Preds (1): B4
   Succs (1): B1

 [B3]
   1: 0
   Preds (1): B4
   Succs (1): B1

 [B4]
   1: 0
   2: int i = 0;
   3: f
   4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int))
   5: i
   6: &[B4.5]
   7: cond
   8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void))
   9: [B4.8]()
   T: [B4.9] ? ... : ...
   Preds (1): B5
   Succs (2): B2 B3

 [B0 (EXIT)]
   Preds (1): B1

Full diff: https://github.com/llvm/llvm-project/pull/82611.diff

7 Files Affected:

  • (modified) clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h (+21-4)
  • (modified) clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h (+10-1)
  • (modified) clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp (+35-1)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+24-4)
  • (modified) clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp (+25-7)
  • (modified) clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp (+2-1)
  • (modified) clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp (+70-34)
diff --git a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
index 405e93287a05d3..9a0a00f3c01343 100644
--- a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
+++ b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
@@ -58,19 +58,36 @@ class ControlFlowContext {
     return BlockReachable[B.getBlockID()];
   }
 
+  /// Returns whether `B` contains an expression that is consumed in a
+  /// different block than `B` (i.e. the parent of the expression is in a
+  /// different block).
+  /// This happens if there is control flow within a full-expression (triggered
+  /// by `&&`, `||`, or the conditional operator). Note that the operands of
+  /// these operators are not the only expressions that can be consumed in a
+  /// different block. For example, in the function call
+  /// `f(&i, cond() ? 1 : 0)`, `&i` is in a different block than the `CallExpr`.
+  bool containsExprConsumedInDifferentBlock(const CFGBlock &B) const {
+    return ContainsExprConsumedInDifferentBlock.contains(&B);
+  }
+
 private:
-  ControlFlowContext(const Decl &D, std::unique_ptr<CFG> Cfg,
-                     llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock,
-                     llvm::BitVector BlockReachable)
+  ControlFlowContext(
+      const Decl &D, std::unique_ptr<CFG> Cfg,
+      llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock,
+      llvm::BitVector BlockReachable,
+      llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock)
       : ContainingDecl(D), Cfg(std::move(Cfg)),
         StmtToBlock(std::move(StmtToBlock)),
-        BlockReachable(std::move(BlockReachable)) {}
+        BlockReachable(std::move(BlockReachable)),
+        ContainsExprConsumedInDifferentBlock(
+            std::move(ContainsExprConsumedInDifferentBlock)) {}
 
   /// The `Decl` containing the statement used to construct the CFG.
   const Decl &ContainingDecl;
   std::unique_ptr<CFG> Cfg;
   llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock;
   llvm::BitVector BlockReachable;
+  llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock;
 };
 
 } // namespace dataflow
diff --git a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
index b3dc940705f870..75d43f67864b56 100644
--- a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
+++ b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
@@ -240,6 +240,14 @@ class Environment {
   bool equivalentTo(const Environment &Other,
                     Environment::ValueModel &Model) const;
 
+  /// How to treat expression state (`ExprToLoc` and `ExprToVal`) in a join.
+  /// If the join happens within a full expression, expression state should be
+  /// kept; otherwise, we can discard it.
+  enum ExprJoinBehavior {
+    DiscardExprState,
+    KeepExprState,
+  };
+
   /// Joins two environments by taking the intersection of storage locations and
   /// values that are stored in them. Distinct values that are assigned to the
   /// same storage locations in `EnvA` and `EnvB` are merged using `Model`.
@@ -248,7 +256,8 @@ class Environment {
   ///
   ///  `EnvA` and `EnvB` must use the same `DataflowAnalysisContext`.
   static Environment join(const Environment &EnvA, const Environment &EnvB,
-                          Environment::ValueModel &Model);
+                          Environment::ValueModel &Model,
+                          ExprJoinBehavior ExprBehavior);
 
   /// Widens the environment point-wise, using `PrevEnv` as needed to inform the
   /// approximation.
diff --git a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
index 8aed19544be6a2..6058b557d353f6 100644
--- a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
@@ -94,6 +94,36 @@ static llvm::BitVector findReachableBlocks(const CFG &Cfg) {
   return BlockReachable;
 }
 
+static llvm::DenseSet<const CFGBlock *>
+buildContainsExprConsumedInDifferentBlock(
+    const CFG &Cfg,
+    const llvm::DenseMap<const Stmt *, const CFGBlock *> &StmtToBlock) {
+  llvm::DenseSet<const CFGBlock *> Result;
+
+  auto CheckChildExprs = [&Result, &StmtToBlock](const Stmt *S,
+                                                 const CFGBlock *Block) {
+    for (const Stmt *Child : S->children()) {
+      const CFGBlock *ChildBlock = StmtToBlock.lookup(Child);
+      if (isa<Expr>(Child) && ChildBlock != Block)
+        Result.insert(ChildBlock);
+    }
+  };
+
+  for (const CFGBlock *Block : Cfg) {
+    if (Block == nullptr)
+      continue;
+
+    for (const CFGElement &Element : *Block)
+      if (auto S = Element.getAs<CFGStmt>())
+        CheckChildExprs(S->getStmt(), Block);
+
+    if (const Stmt *TerminatorCond = Block->getTerminatorCondition())
+      CheckChildExprs(TerminatorCond, Block);
+  }
+
+  return Result;
+}
+
 llvm::Expected<ControlFlowContext>
 ControlFlowContext::build(const FunctionDecl &Func) {
   if (!Func.doesThisDeclarationHaveABody())
@@ -140,8 +170,12 @@ ControlFlowContext::build(const Decl &D, Stmt &S, ASTContext &C) {
 
   llvm::BitVector BlockReachable = findReachableBlocks(*Cfg);
 
+  llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock =
+      buildContainsExprConsumedInDifferentBlock(*Cfg, StmtToBlock);
+
   return ControlFlowContext(D, std::move(Cfg), std::move(StmtToBlock),
-                            std::move(BlockReachable));
+                            std::move(BlockReachable),
+                            std::move(ContainsExprConsumedInDifferentBlock));
 }
 
 } // namespace dataflow
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index 0cfc26ea952cda..df1c5aa245007a 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -48,6 +48,24 @@ static llvm::DenseMap<const ValueDecl *, StorageLocation *> intersectDeclToLoc(
   return Result;
 }
 
+// Performs a join on either `ExprToLoc` or `ExprToVal`.
+// The maps must be consistent in the sense that any entries for the same
+// expression must map to the same location / value. This is the case if we are
+// performing a join for control flow within a full-expression (which is the
+// only case when this function should be used).
+template <typename MapT> MapT joinExprMaps(const MapT &Map1, const MapT &Map2) {
+  MapT Result = Map1;
+
+  for (const auto &Entry : Map2) {
+    [[maybe_unused]] auto [It, Inserted] = Result.insert(Entry);
+    // If there was an existing entry, its value should be the same as for the
+    // entry we were trying to insert.
+    assert(It->second == Entry.second);
+  }
+
+  return Result;
+}
+
 // Whether to consider equivalent two values with an unknown relation.
 //
 // FIXME: this function is a hack enabling unsoundness to support
@@ -627,7 +645,8 @@ LatticeJoinEffect Environment::widen(const Environment &PrevEnv,
 }
 
 Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
-                              Environment::ValueModel &Model) {
+                              Environment::ValueModel &Model,
+                              ExprJoinBehavior ExprBehavior) {
   assert(EnvA.DACtx == EnvB.DACtx);
   assert(EnvA.ThisPointeeLoc == EnvB.ThisPointeeLoc);
   assert(EnvA.CallStack == EnvB.CallStack);
@@ -675,9 +694,10 @@ Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
   JoinedEnv.LocToVal =
       joinLocToVal(EnvA.LocToVal, EnvB.LocToVal, EnvA, EnvB, JoinedEnv, Model);
 
-  // We intentionally leave `JoinedEnv.ExprToLoc` and `JoinedEnv.ExprToVal`
-  // empty, as we never need to access entries in these maps outside of the
-  // basic block that sets them.
+  if (ExprBehavior == KeepExprState) {
+    JoinedEnv.ExprToVal = joinExprMaps(EnvA.ExprToVal, EnvB.ExprToVal);
+    JoinedEnv.ExprToLoc = joinExprMaps(EnvA.ExprToLoc, EnvB.ExprToLoc);
+  }
 
   return JoinedEnv;
 }
diff --git a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
index 4c88c46142d64d..06acc95346fa9d 100644
--- a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
+++ b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
@@ -221,6 +221,7 @@ class PrettyStackTraceCFGElement : public llvm::PrettyStackTraceEntry {
 // Avoids unneccesary copies of the environment.
 class JoinedStateBuilder {
   AnalysisContext &AC;
+  Environment::ExprJoinBehavior ExprBehavior;
   std::vector<const TypeErasedDataflowAnalysisState *> All;
   std::deque<TypeErasedDataflowAnalysisState> Owned;
 
@@ -228,11 +229,13 @@ class JoinedStateBuilder {
   join(const TypeErasedDataflowAnalysisState &L,
        const TypeErasedDataflowAnalysisState &R) {
     return {AC.Analysis.joinTypeErased(L.Lattice, R.Lattice),
-            Environment::join(L.Env, R.Env, AC.Analysis)};
+            Environment::join(L.Env, R.Env, AC.Analysis, ExprBehavior)};
   }
 
 public:
-  JoinedStateBuilder(AnalysisContext &AC) : AC(AC) {}
+  JoinedStateBuilder(AnalysisContext &AC,
+                     Environment::ExprJoinBehavior ExprBehavior)
+      : AC(AC), ExprBehavior(ExprBehavior) {}
 
   void addOwned(TypeErasedDataflowAnalysisState State) {
     Owned.push_back(std::move(State));
@@ -248,12 +251,12 @@ class JoinedStateBuilder {
       // initialize the state of each basic block differently.
       return {AC.Analysis.typeErasedInitialElement(), AC.InitEnv.fork()};
     if (All.size() == 1)
-      // Join the environment with itself so that we discard the entries from
-      // `ExprToLoc` and `ExprToVal`.
+      // Join the environment with itself so that we discard expression state if
+      // desired.
       // FIXME: We could consider writing special-case code for this that only
       // does the discarding, but it's not clear if this is worth it.
-      return {All[0]->Lattice,
-              Environment::join(All[0]->Env, All[0]->Env, AC.Analysis)};
+      return {All[0]->Lattice, Environment::join(All[0]->Env, All[0]->Env,
+                                                 AC.Analysis, ExprBehavior)};
 
     auto Result = join(*All[0], *All[1]);
     for (unsigned I = 2; I < All.size(); ++I)
@@ -307,7 +310,22 @@ computeBlockInputState(const CFGBlock &Block, AnalysisContext &AC) {
     }
   }
 
-  JoinedStateBuilder Builder(AC);
+  // If any of the predecessor blocks contains an expression consumed in a
+  // different block, we need to keep expression state.
+  // Note that in this case, we keep expression state for all predecessors,
+  // rather than only those predecessors that actually contain an expression
+  // consumed in a different block. While this is potentially suboptimal, it's
+  // actually likely, if we have control flow within a full expression, that
+  // all predecessors have expression state consumed in a different block.
+  Environment::ExprJoinBehavior ExprBehavior = Environment::DiscardExprState;
+  for (const CFGBlock *Pred : Preds) {
+    if (Pred && AC.CFCtx.containsExprConsumedInDifferentBlock(*Pred)) {
+      ExprBehavior = Environment::KeepExprState;
+      break;
+    }
+  }
+
+  JoinedStateBuilder Builder(AC, ExprBehavior);
   for (const CFGBlock *Pred : Preds) {
     // Skip if the `Block` is unreachable or control flow cannot get past it.
     if (!Pred || Pred->hasNoReturnElement())
diff --git a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
index 8799d03dfd3c58..465a8e21690c4a 100644
--- a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
@@ -190,7 +190,8 @@ TEST_F(EnvironmentTest, JoinRecords) {
     Env2.setValue(Loc, Val2);
 
     Environment::ValueModel Model;
-    Environment EnvJoined = Environment::join(Env1, Env2, Model);
+    Environment EnvJoined =
+        Environment::join(Env1, Env2, Model, Environment::DiscardExprState);
     auto *JoinedVal = cast<RecordValue>(EnvJoined.getValue(Loc));
     EXPECT_NE(JoinedVal, &Val1);
     EXPECT_NE(JoinedVal, &Val2);
diff --git a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
index 34f9b0b23719fe..9d05a0d6ca4010 100644
--- a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
@@ -244,15 +244,17 @@ TEST_F(DiscardExprStateTest, WhileStatement) {
   EXPECT_NE(NotEqOpState.Env.getValue(NotEqOp), nullptr);
 
   // In the block that calls `foo(p)`, the value for `p != nullptr` is discarded
-  // because it is not consumed by this block.
+  // because it is not consumed outside the block it is in.
   const auto &CallFooState = blockStateForStmt(BlockStates, CallFoo);
   EXPECT_EQ(CallFooState.Env.getValue(NotEqOp), nullptr);
 }
 
 TEST_F(DiscardExprStateTest, BooleanOperator) {
   std::string Code = R"(
-    bool target(bool b1, bool b2) {
-      return b1 && b2;
+    void f();
+    void target(bool b1, bool b2) {
+      if (b1 && b2)
+        f();
     }
   )";
   auto BlockStates = llvm::cantFail(runAnalysis<NoopAnalysis>(
@@ -260,46 +262,80 @@ TEST_F(DiscardExprStateTest, BooleanOperator) {
 
   const auto &AndOp =
       matchNode<BinaryOperator>(binaryOperator(hasOperatorName("&&")));
-  const auto &Return = matchNode<ReturnStmt>(returnStmt());
+  const auto &CallF =
+      matchNode<CallExpr>(callExpr(callee(functionDecl(hasName("f")))));
 
   // In the block that evaluates the LHS of the `&&` operator, the LHS is
   // associated with a value, while the right-hand side is not (unsurprisingly,
   // as it hasn't been evaluated yet).
   const auto &LHSState = blockStateForStmt(BlockStates, *AndOp.getLHS());
   auto *LHSValue = cast<BoolValue>(LHSState.Env.getValue(*AndOp.getLHS()));
-  ASSERT_NE(LHSValue, nullptr);
+  EXPECT_NE(LHSValue, nullptr);
   EXPECT_EQ(LHSState.Env.getValue(*AndOp.getRHS()), nullptr);
 
-  // In the block that evaluates the RHS, the RHS is associated with a
-  // value. The value for the LHS has been discarded as it is not consumed by
-  // this block.
+  // In the block that evaluates the RHS, both the LHS and RHS are associated
+  // with values, as they are both subexpressions of the `&&` operator, which
+  // is evaluated in a later block.
   const auto &RHSState = blockStateForStmt(BlockStates, *AndOp.getRHS());
-  EXPECT_EQ(RHSState.Env.getValue(*AndOp.getLHS()), nullptr);
-  auto *RHSValue = cast<BoolValue>(RHSState.Env.getValue(*AndOp.getRHS()));
-  ASSERT_NE(RHSValue, nullptr);
-
-  // In the block that evaluates the return statement, the expression `b1 && b2`
-  // is associated with a value (and check that it's the right one).
-  // The expressions `b1` and `b2` are _not_ associated with a value in this
-  // block, even though they are consumed by the block, because:
-  // * This block has two prececessor blocks (the one that evaluates `b1` and
-  //   the one that evaluates `b2`).
-  // * `b1` is only associated with a value in the block that evaluates `b1` but
-  //   not the block that evalutes `b2`, so the join operation discards the
-  //   value for `b1`.
-  // * `b2` is only associated with a value in the block that evaluates `b2` but
-  //   not the block that evaluates `b1`, the the join operation discards the
-  //   value for `b2`.
-  // Nevertheless, the analysis generates the correct formula for `b1 && b2`
-  // because the transfer function for the `&&` operator retrieves the values
-  // for its operands from the environments for the blocks that compute the
-  // operands, rather than from the environment for the block that contains the
-  // `&&`.
-  const auto &ReturnState = blockStateForStmt(BlockStates, Return);
-  EXPECT_EQ(ReturnState.Env.getValue(*AndOp.getLHS()), nullptr);
-  EXPECT_EQ(ReturnState.Env.getValue(*AndOp.getRHS()), nullptr);
-  EXPECT_EQ(ReturnState.Env.getValue(AndOp),
-            &ReturnState.Env.makeAnd(*LHSValue, *RHSValue));
+  EXPECT_EQ(RHSState.Env.getValue(*AndOp.getLHS()), LHSValue);
+  auto *RHSValue = RHSState.Env.get<BoolValue>(*AndOp.getRHS());
+  EXPECT_NE(RHSValue, nullptr);
+
+  // In the block that evaluates `b1 && b2`, the `&&` as well as its operands
+  // are associated with values.
+  const auto &AndOpState = blockStateForStmt(BlockStates, AndOp);
+  EXPECT_EQ(AndOpState.Env.getValue(*AndOp.getLHS()), LHSValue);
+  EXPECT_EQ(AndOpState.Env.getValue(*AndOp.getRHS()), RHSValue);
+  EXPECT_EQ(AndOpState.Env.getValue(AndOp),
+            &AndOpState.Env.makeAnd(*LHSValue, *RHSValue));
+
+  // In the block that calls `f()`, none of `b1`, `b2`, or `b1 && b2` should be
+  // associated with values.
+  const auto &CallFState = blockStateForStmt(BlockStates, CallF);
+  EXPECT_EQ(CallFState.Env.getValue(*AndOp.getLHS()), nullptr);
+  EXPECT_EQ(CallFState.Env.getValue(*AndOp.getRHS()), nullptr);
+  EXPECT_EQ(CallFState.Env.getValue(AndOp), nullptr);
+}
+
+TEST_F(DiscardExprStateTest, ConditionalOperator) {
+  std::string Code = R"(
+    void f(int*, int);
+    void g();
+    bool cond();
+
+    void target() {
+      int i = 0;
+      if (cond())
+        f(&i, cond() ? 1 : 0);
+      g();
+    }
+  )";
+  auto BlockStates = llvm::cantFail(runAnalysis<NoopAnalysis>(
+      Code, [](ASTContext &C) { return NoopAnalysis(C); }));
+
+  const auto &AddrOfI =
+      matchNode<UnaryOperator>(unaryOperator(hasOperatorName("&")));
+  const auto &CallF =
+      matchNode<CallExpr>(callExpr(callee(functionDecl(hasName("f")))));
+  const auto &CallG =
+      matchNode<CallExpr>(callExpr(callee(functionDecl(hasName("g")))));
+
+  // In the block that evaluates `&i`, it should obviously have a value.
+  const auto &AddrOfIState = blockStateForStmt(BlockStates, AddrOfI);
+  auto *AddrOfIVal = AddrOfIState.Env.get<PointerValue>(AddrOfI);
+  EXPECT_NE(AddrOfIVal, nullptr);
+
+  // Because of the conditional operator, the `f(...)` call is evaluated in a
+  // different block than `&i`, but `&i` still needs to have a value here
+  // because it's a subexpression of the call.
+  const auto &CallFState = blockStateForStmt(BlockStates, CallF);
+  EXPECT_NE(&CallFState, &AddrOfIState);
+  EXPECT_EQ(CallFState.Env.get<PointerValue>(AddrOfI), AddrOfIVal);
+
+  // In the block that calls `g()`, `&i` should no longer be associated with a
+  // value.
+  const auto &CallGState = blockStateForStmt(BlockStates, CallG);
+  EXPECT_EQ(CallGState.Env.get<PointerValue>(AddrOfI), nullptr);
 }
 
 struct NonConvergingLattice {

Footnotes

@llvmbot
Copy link
Member

llvmbot commented Feb 22, 2024

@llvm/pr-subscribers-clang-analysis

Author: None (martinboehme)

Changes

In #72985, I made a change to discard
expression state (ExprToLoc and ExprToVal) at the beginning of each basic
block. I did so with the claim that "we never need to access entries from these
maps outside of the current basic block", noting that there are exceptions to
this claim when control flow happens inside a full-expression (the operands of
&amp;&amp;, ||, and the conditional operator live in different basic blocks than the
operator itself) but that we already have a mechanism for retrieving the values
of these operands from the environment for the block they are computed in.

It turns out, however, that the operands of these operators aren't the only
expressions whose values can be accessed from a different basic block; when
control flow happens within a full-expression, that control flow can be
"interposed" between an expression and its parent. Here is an example:

void f(int*, int);
bool cond();

void target() {
  int i = 0;
  f(&amp;i, cond() ? 1 : 0);
}

(godbolt)

In the CFG1 , note how the expression for &amp;i is computed in block B4,
but the parent of this expression (the CallExpr) is located in block B1.
The the argument expression &amp;i and the CallExpr are essentially "torn apart"
into different basic blocks by the conditional operator in the second argument.
In other words, the edge between the CallExpr and its argument &amp;i straddles
the boundary between two blocks.

I used to think that this scenario -- where an edge between an expression and
one of its children straddles a block boundary -- could only happen between the
expression that triggers the control flow (&amp;&amp;, ||, or the conditional
operator) and its children, but the example above shows that other expressions
can be affected as well; the control flow is still triggered by &amp;&amp;, || or
the conditional operator, but the expressions affected lie outside these
operators.

Discarding expression state too soon is harmful. For example, an analysis that
checks the arguments of the CallExpr above would not be able to retrieve a
value for the &amp;i argument.

This patch therefore ensures that we don't discard expression state before the
end of a full-expression. In other cases -- when the evaluation of a
full-expression is complete -- we still want to discard expression state for the
reasons explained in #72985 (avoid
performing joins on boolean values that are no longer needed, which
unnecessarily extends the flow condition; improve debuggability by removing
clutter from the expression state).

 [B5 (ENTRY)]
   Succs (1): B4

 [B1]
   1: [B4.9] ? [B2.1] : [B3.1]
   2: [B4.4]([B4.6], [B1.1])
   Preds (2): B2 B3
   Succs (1): B0

 [B2]
   1: 1
   Preds (1): B4
   Succs (1): B1

 [B3]
   1: 0
   Preds (1): B4
   Succs (1): B1

 [B4]
   1: 0
   2: int i = 0;
   3: f
   4: [B4.3] (ImplicitCastExpr, FunctionToPointerDecay, void (*)(int *, int))
   5: i
   6: &amp;[B4.5]
   7: cond
   8: [B4.7] (ImplicitCastExpr, FunctionToPointerDecay, _Bool (*)(void))
   9: [B4.8]()
   T: [B4.9] ? ... : ...
   Preds (1): B5
   Succs (2): B2 B3

 [B0 (EXIT)]
   Preds (1): B1

Full diff: https://github.com/llvm/llvm-project/pull/82611.diff

7 Files Affected:

  • (modified) clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h (+21-4)
  • (modified) clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h (+10-1)
  • (modified) clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp (+35-1)
  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+24-4)
  • (modified) clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp (+25-7)
  • (modified) clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp (+2-1)
  • (modified) clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp (+70-34)
diff --git a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
index 405e93287a05d3..9a0a00f3c01343 100644
--- a/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
+++ b/clang/include/clang/Analysis/FlowSensitive/ControlFlowContext.h
@@ -58,19 +58,36 @@ class ControlFlowContext {
     return BlockReachable[B.getBlockID()];
   }
 
+  /// Returns whether `B` contains an expression that is consumed in a
+  /// different block than `B` (i.e. the parent of the expression is in a
+  /// different block).
+  /// This happens if there is control flow within a full-expression (triggered
+  /// by `&&`, `||`, or the conditional operator). Note that the operands of
+  /// these operators are not the only expressions that can be consumed in a
+  /// different block. For example, in the function call
+  /// `f(&i, cond() ? 1 : 0)`, `&i` is in a different block than the `CallExpr`.
+  bool containsExprConsumedInDifferentBlock(const CFGBlock &B) const {
+    return ContainsExprConsumedInDifferentBlock.contains(&B);
+  }
+
 private:
-  ControlFlowContext(const Decl &D, std::unique_ptr<CFG> Cfg,
-                     llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock,
-                     llvm::BitVector BlockReachable)
+  ControlFlowContext(
+      const Decl &D, std::unique_ptr<CFG> Cfg,
+      llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock,
+      llvm::BitVector BlockReachable,
+      llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock)
       : ContainingDecl(D), Cfg(std::move(Cfg)),
         StmtToBlock(std::move(StmtToBlock)),
-        BlockReachable(std::move(BlockReachable)) {}
+        BlockReachable(std::move(BlockReachable)),
+        ContainsExprConsumedInDifferentBlock(
+            std::move(ContainsExprConsumedInDifferentBlock)) {}
 
   /// The `Decl` containing the statement used to construct the CFG.
   const Decl &ContainingDecl;
   std::unique_ptr<CFG> Cfg;
   llvm::DenseMap<const Stmt *, const CFGBlock *> StmtToBlock;
   llvm::BitVector BlockReachable;
+  llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock;
 };
 
 } // namespace dataflow
diff --git a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
index b3dc940705f870..75d43f67864b56 100644
--- a/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
+++ b/clang/include/clang/Analysis/FlowSensitive/DataflowEnvironment.h
@@ -240,6 +240,14 @@ class Environment {
   bool equivalentTo(const Environment &Other,
                     Environment::ValueModel &Model) const;
 
+  /// How to treat expression state (`ExprToLoc` and `ExprToVal`) in a join.
+  /// If the join happens within a full expression, expression state should be
+  /// kept; otherwise, we can discard it.
+  enum ExprJoinBehavior {
+    DiscardExprState,
+    KeepExprState,
+  };
+
   /// Joins two environments by taking the intersection of storage locations and
   /// values that are stored in them. Distinct values that are assigned to the
   /// same storage locations in `EnvA` and `EnvB` are merged using `Model`.
@@ -248,7 +256,8 @@ class Environment {
   ///
   ///  `EnvA` and `EnvB` must use the same `DataflowAnalysisContext`.
   static Environment join(const Environment &EnvA, const Environment &EnvB,
-                          Environment::ValueModel &Model);
+                          Environment::ValueModel &Model,
+                          ExprJoinBehavior ExprBehavior);
 
   /// Widens the environment point-wise, using `PrevEnv` as needed to inform the
   /// approximation.
diff --git a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
index 8aed19544be6a2..6058b557d353f6 100644
--- a/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
+++ b/clang/lib/Analysis/FlowSensitive/ControlFlowContext.cpp
@@ -94,6 +94,36 @@ static llvm::BitVector findReachableBlocks(const CFG &Cfg) {
   return BlockReachable;
 }
 
+static llvm::DenseSet<const CFGBlock *>
+buildContainsExprConsumedInDifferentBlock(
+    const CFG &Cfg,
+    const llvm::DenseMap<const Stmt *, const CFGBlock *> &StmtToBlock) {
+  llvm::DenseSet<const CFGBlock *> Result;
+
+  auto CheckChildExprs = [&Result, &StmtToBlock](const Stmt *S,
+                                                 const CFGBlock *Block) {
+    for (const Stmt *Child : S->children()) {
+      const CFGBlock *ChildBlock = StmtToBlock.lookup(Child);
+      if (isa<Expr>(Child) && ChildBlock != Block)
+        Result.insert(ChildBlock);
+    }
+  };
+
+  for (const CFGBlock *Block : Cfg) {
+    if (Block == nullptr)
+      continue;
+
+    for (const CFGElement &Element : *Block)
+      if (auto S = Element.getAs<CFGStmt>())
+        CheckChildExprs(S->getStmt(), Block);
+
+    if (const Stmt *TerminatorCond = Block->getTerminatorCondition())
+      CheckChildExprs(TerminatorCond, Block);
+  }
+
+  return Result;
+}
+
 llvm::Expected<ControlFlowContext>
 ControlFlowContext::build(const FunctionDecl &Func) {
   if (!Func.doesThisDeclarationHaveABody())
@@ -140,8 +170,12 @@ ControlFlowContext::build(const Decl &D, Stmt &S, ASTContext &C) {
 
   llvm::BitVector BlockReachable = findReachableBlocks(*Cfg);
 
+  llvm::DenseSet<const CFGBlock *> ContainsExprConsumedInDifferentBlock =
+      buildContainsExprConsumedInDifferentBlock(*Cfg, StmtToBlock);
+
   return ControlFlowContext(D, std::move(Cfg), std::move(StmtToBlock),
-                            std::move(BlockReachable));
+                            std::move(BlockReachable),
+                            std::move(ContainsExprConsumedInDifferentBlock));
 }
 
 } // namespace dataflow
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index 0cfc26ea952cda..df1c5aa245007a 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -48,6 +48,24 @@ static llvm::DenseMap<const ValueDecl *, StorageLocation *> intersectDeclToLoc(
   return Result;
 }
 
+// Performs a join on either `ExprToLoc` or `ExprToVal`.
+// The maps must be consistent in the sense that any entries for the same
+// expression must map to the same location / value. This is the case if we are
+// performing a join for control flow within a full-expression (which is the
+// only case when this function should be used).
+template <typename MapT> MapT joinExprMaps(const MapT &Map1, const MapT &Map2) {
+  MapT Result = Map1;
+
+  for (const auto &Entry : Map2) {
+    [[maybe_unused]] auto [It, Inserted] = Result.insert(Entry);
+    // If there was an existing entry, its value should be the same as for the
+    // entry we were trying to insert.
+    assert(It->second == Entry.second);
+  }
+
+  return Result;
+}
+
 // Whether to consider equivalent two values with an unknown relation.
 //
 // FIXME: this function is a hack enabling unsoundness to support
@@ -627,7 +645,8 @@ LatticeJoinEffect Environment::widen(const Environment &PrevEnv,
 }
 
 Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
-                              Environment::ValueModel &Model) {
+                              Environment::ValueModel &Model,
+                              ExprJoinBehavior ExprBehavior) {
   assert(EnvA.DACtx == EnvB.DACtx);
   assert(EnvA.ThisPointeeLoc == EnvB.ThisPointeeLoc);
   assert(EnvA.CallStack == EnvB.CallStack);
@@ -675,9 +694,10 @@ Environment Environment::join(const Environment &EnvA, const Environment &EnvB,
   JoinedEnv.LocToVal =
       joinLocToVal(EnvA.LocToVal, EnvB.LocToVal, EnvA, EnvB, JoinedEnv, Model);
 
-  // We intentionally leave `JoinedEnv.ExprToLoc` and `JoinedEnv.ExprToVal`
-  // empty, as we never need to access entries in these maps outside of the
-  // basic block that sets them.
+  if (ExprBehavior == KeepExprState) {
+    JoinedEnv.ExprToVal = joinExprMaps(EnvA.ExprToVal, EnvB.ExprToVal);
+    JoinedEnv.ExprToLoc = joinExprMaps(EnvA.ExprToLoc, EnvB.ExprToLoc);
+  }
 
   return JoinedEnv;
 }
diff --git a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
index 4c88c46142d64d..06acc95346fa9d 100644
--- a/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
+++ b/clang/lib/Analysis/FlowSensitive/TypeErasedDataflowAnalysis.cpp
@@ -221,6 +221,7 @@ class PrettyStackTraceCFGElement : public llvm::PrettyStackTraceEntry {
 // Avoids unneccesary copies of the environment.
 class JoinedStateBuilder {
   AnalysisContext &AC;
+  Environment::ExprJoinBehavior ExprBehavior;
   std::vector<const TypeErasedDataflowAnalysisState *> All;
   std::deque<TypeErasedDataflowAnalysisState> Owned;
 
@@ -228,11 +229,13 @@ class JoinedStateBuilder {
   join(const TypeErasedDataflowAnalysisState &L,
        const TypeErasedDataflowAnalysisState &R) {
     return {AC.Analysis.joinTypeErased(L.Lattice, R.Lattice),
-            Environment::join(L.Env, R.Env, AC.Analysis)};
+            Environment::join(L.Env, R.Env, AC.Analysis, ExprBehavior)};
   }
 
 public:
-  JoinedStateBuilder(AnalysisContext &AC) : AC(AC) {}
+  JoinedStateBuilder(AnalysisContext &AC,
+                     Environment::ExprJoinBehavior ExprBehavior)
+      : AC(AC), ExprBehavior(ExprBehavior) {}
 
   void addOwned(TypeErasedDataflowAnalysisState State) {
     Owned.push_back(std::move(State));
@@ -248,12 +251,12 @@ class JoinedStateBuilder {
       // initialize the state of each basic block differently.
       return {AC.Analysis.typeErasedInitialElement(), AC.InitEnv.fork()};
     if (All.size() == 1)
-      // Join the environment with itself so that we discard the entries from
-      // `ExprToLoc` and `ExprToVal`.
+      // Join the environment with itself so that we discard expression state if
+      // desired.
       // FIXME: We could consider writing special-case code for this that only
       // does the discarding, but it's not clear if this is worth it.
-      return {All[0]->Lattice,
-              Environment::join(All[0]->Env, All[0]->Env, AC.Analysis)};
+      return {All[0]->Lattice, Environment::join(All[0]->Env, All[0]->Env,
+                                                 AC.Analysis, ExprBehavior)};
 
     auto Result = join(*All[0], *All[1]);
     for (unsigned I = 2; I < All.size(); ++I)
@@ -307,7 +310,22 @@ computeBlockInputState(const CFGBlock &Block, AnalysisContext &AC) {
     }
   }
 
-  JoinedStateBuilder Builder(AC);
+  // If any of the predecessor blocks contains an expression consumed in a
+  // different block, we need to keep expression state.
+  // Note that in this case, we keep expression state for all predecessors,
+  // rather than only those predecessors that actually contain an expression
+  // consumed in a different block. While this is potentially suboptimal, it's
+  // actually likely, if we have control flow within a full expression, that
+  // all predecessors have expression state consumed in a different block.
+  Environment::ExprJoinBehavior ExprBehavior = Environment::DiscardExprState;
+  for (const CFGBlock *Pred : Preds) {
+    if (Pred && AC.CFCtx.containsExprConsumedInDifferentBlock(*Pred)) {
+      ExprBehavior = Environment::KeepExprState;
+      break;
+    }
+  }
+
+  JoinedStateBuilder Builder(AC, ExprBehavior);
   for (const CFGBlock *Pred : Preds) {
     // Skip if the `Block` is unreachable or control flow cannot get past it.
     if (!Pred || Pred->hasNoReturnElement())
diff --git a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
index 8799d03dfd3c58..465a8e21690c4a 100644
--- a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
@@ -190,7 +190,8 @@ TEST_F(EnvironmentTest, JoinRecords) {
     Env2.setValue(Loc, Val2);
 
     Environment::ValueModel Model;
-    Environment EnvJoined = Environment::join(Env1, Env2, Model);
+    Environment EnvJoined =
+        Environment::join(Env1, Env2, Model, Environment::DiscardExprState);
     auto *JoinedVal = cast<RecordValue>(EnvJoined.getValue(Loc));
     EXPECT_NE(JoinedVal, &Val1);
     EXPECT_NE(JoinedVal, &Val2);
diff --git a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
index 34f9b0b23719fe..9d05a0d6ca4010 100644
--- a/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TypeErasedDataflowAnalysisTest.cpp
@@ -244,15 +244,17 @@ TEST_F(DiscardExprStateTest, WhileStatement) {
   EXPECT_NE(NotEqOpState.Env.getValue(NotEqOp), nullptr);
 
   // In the block that calls `foo(p)`, the value for `p != nullptr` is discarded
-  // because it is not consumed by this block.
+  // because it is not consumed outside the block it is in.
   const auto &CallFooState = blockStateForStmt(BlockStates, CallFoo);
   EXPECT_EQ(CallFooState.Env.getValue(NotEqOp), nullptr);
 }
 
 TEST_F(DiscardExprStateTest, BooleanOperator) {
   std::string Code = R"(
-    bool target(bool b1, bool b2) {
-      return b1 && b2;
+    void f();
+    void target(bool b1, bool b2) {
+      if (b1 && b2)
+        f();
     }
   )";
   auto BlockStates = llvm::cantFail(runAnalysis<NoopAnalysis>(
@@ -260,46 +262,80 @@ TEST_F(DiscardExprStateTest, BooleanOperator) {
 
   const auto &AndOp =
       matchNode<BinaryOperator>(binaryOperator(hasOperatorName("&&")));
-  const auto &Return = matchNode<ReturnStmt>(returnStmt());
+  const auto &CallF =
+      matchNode<CallExpr>(callExpr(callee(functionDecl(hasName("f")))));
 
   // In the block that evaluates the LHS of the `&&` operator, the LHS is
   // associated with a value, while the right-hand side is not (unsurprisingly,
   // as it hasn't been evaluated yet).
   const auto &LHSState = blockStateForStmt(BlockStates, *AndOp.getLHS());
   auto *LHSValue = cast<BoolValue>(LHSState.Env.getValue(*AndOp.getLHS()));
-  ASSERT_NE(LHSValue, nullptr);
+  EXPECT_NE(LHSValue, nullptr);
   EXPECT_EQ(LHSState.Env.getValue(*AndOp.getRHS()), nullptr);
 
-  // In the block that evaluates the RHS, the RHS is associated with a
-  // value. The value for the LHS has been discarded as it is not consumed by
-  // this block.
+  // In the block that evaluates the RHS, both the LHS and RHS are associated
+  // with values, as they are both subexpressions of the `&&` operator, which
+  // is evaluated in a later block.
   const auto &RHSState = blockStateForStmt(BlockStates, *AndOp.getRHS());
-  EXPECT_EQ(RHSState.Env.getValue(*AndOp.getLHS()), nullptr);
-  auto *RHSValue = cast<BoolValue>(RHSState.Env.getValue(*AndOp.getRHS()));
-  ASSERT_NE(RHSValue, nullptr);
-
-  // In the block that evaluates the return statement, the expression `b1 && b2`
-  // is associated with a value (and check that it's the right one).
-  // The expressions `b1` and `b2` are _not_ associated with a value in this
-  // block, even though they are consumed by the block, because:
-  // * This block has two prececessor blocks (the one that evaluates `b1` and
-  //   the one that evaluates `b2`).
-  // * `b1` is only associated with a value in the block that evaluates `b1` but
-  //   not the block that evalutes `b2`, so the join operation discards the
-  //   value for `b1`.
-  // * `b2` is only associated with a value in the block that evaluates `b2` but
-  //   not the block that evaluates `b1`, the the join operation discards the
-  //   value for `b2`.
-  // Nevertheless, the analysis generates the correct formula for `b1 && b2`
-  // because the transfer function for the `&&` operator retrieves the values
-  // for its operands from the environments for the blocks that compute the
-  // operands, rather than from the environment for the block that contains the
-  // `&&`.
-  const auto &ReturnState = blockStateForStmt(BlockStates, Return);
-  EXPECT_EQ(ReturnState.Env.getValue(*AndOp.getLHS()), nullptr);
-  EXPECT_EQ(ReturnState.Env.getValue(*AndOp.getRHS()), nullptr);
-  EXPECT_EQ(ReturnState.Env.getValue(AndOp),
-            &ReturnState.Env.makeAnd(*LHSValue, *RHSValue));
+  EXPECT_EQ(RHSState.Env.getValue(*AndOp.getLHS()), LHSValue);
+  auto *RHSValue = RHSState.Env.get<BoolValue>(*AndOp.getRHS());
+  EXPECT_NE(RHSValue, nullptr);
+
+  // In the block that evaluates `b1 && b2`, the `&&` as well as its operands
+  // are associated with values.
+  const auto &AndOpState = blockStateForStmt(BlockStates, AndOp);
+  EXPECT_EQ(AndOpState.Env.getValue(*AndOp.getLHS()), LHSValue);
+  EXPECT_EQ(AndOpState.Env.getValue(*AndOp.getRHS()), RHSValue);
+  EXPECT_EQ(AndOpState.Env.getValue(AndOp),
+            &AndOpState.Env.makeAnd(*LHSValue, *RHSValue));
+
+  // In the block that calls `f()`, none of `b1`, `b2`, or `b1 && b2` should be
+  // associated with values.
+  const auto &CallFState = blockStateForStmt(BlockStates, CallF);
+  EXPECT_EQ(CallFState.Env.getValue(*AndOp.getLHS()), nullptr);
+  EXPECT_EQ(CallFState.Env.getValue(*AndOp.getRHS()), nullptr);
+  EXPECT_EQ(CallFState.Env.getValue(AndOp), nullptr);
+}
+
+TEST_F(DiscardExprStateTest, ConditionalOperator) {
+  std::string Code = R"(
+    void f(int*, int);
+    void g();
+    bool cond();
+
+    void target() {
+      int i = 0;
+      if (cond())
+        f(&i, cond() ? 1 : 0);
+      g();
+    }
+  )";
+  auto BlockStates = llvm::cantFail(runAnalysis<NoopAnalysis>(
+      Code, [](ASTContext &C) { return NoopAnalysis(C); }));
+
+  const auto &AddrOfI =
+      matchNode<UnaryOperator>(unaryOperator(hasOperatorName("&")));
+  const auto &CallF =
+      matchNode<CallExpr>(callExpr(callee(functionDecl(hasName("f")))));
+  const auto &CallG =
+      matchNode<CallExpr>(callExpr(callee(functionDecl(hasName("g")))));
+
+  // In the block that evaluates `&i`, it should obviously have a value.
+  const auto &AddrOfIState = blockStateForStmt(BlockStates, AddrOfI);
+  auto *AddrOfIVal = AddrOfIState.Env.get<PointerValue>(AddrOfI);
+  EXPECT_NE(AddrOfIVal, nullptr);
+
+  // Because of the conditional operator, the `f(...)` call is evaluated in a
+  // different block than `&i`, but `&i` still needs to have a value here
+  // because it's a subexpression of the call.
+  const auto &CallFState = blockStateForStmt(BlockStates, CallF);
+  EXPECT_NE(&CallFState, &AddrOfIState);
+  EXPECT_EQ(CallFState.Env.get<PointerValue>(AddrOfI), AddrOfIVal);
+
+  // In the block that calls `g()`, `&i` should no longer be associated with a
+  // value.
+  const auto &CallGState = blockStateForStmt(BlockStates, CallG);
+  EXPECT_EQ(CallGState.Env.get<PointerValue>(AddrOfI), nullptr);
 }
 
 struct NonConvergingLattice {

Footnotes

@martinboehme martinboehme requested a review from ymand February 22, 2024 19:57
@martinboehme
Copy link
Contributor Author

Failing CI looks like an infrastructure failure.

Copy link
Collaborator

@ymand ymand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little concerned with the complexity and cost required to support this optimization. With this added cost, do you know if the analysis still faster (and less SAT timeouts?) than if we just did the simple thing of not dropping state?

@@ -248,12 +251,12 @@ class JoinedStateBuilder {
// initialize the state of each basic block differently.
return {AC.Analysis.typeErasedInitialElement(), AC.InitEnv.fork()};
if (All.size() == 1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guard on ExprBehavior as well, like if (All.size() == 1 && ExprBehavior == DiscardExprState)? Then, could you drop the FIXME and "if desired"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

guard on ExprBehavior as well, like if (All.size() == 1 && ExprBehavior == DiscardExprState)?

We would also need to return the existing state in the case All.size() == 1 && ExprBehavior == KeepExprState, right? IOW, the code would become something like this:

    if (All.size() == 1) {
      if (ExprBehavior == KeepExprState)
        return All[0];
      // Join the environment with itself so that we discard expression state.
      // FIXME: We could consider writing special-case code for this that only
      // does the discarding, but it's not clear if this is worth it.
      return {All[0]->Lattice, Environment::join(All[0]->Env, All[0]->Env,
                                                 AC.Analysis, DiscardExprState)};
    }

And it doesn't deal with the FIXME, as we still need some way to discard the expression state.

I'm not sure that, all in all, changing the code in this way would be preferable.

The upside is that we're now a bit more efficient in the case where All.size() == 1 and we want to keep expression state.

The downside is that we're treating KeepExprState specially when join() could have handled it for us. Right now, join() is the only place that needs to care about what to do for KeepExprState versus DiscardExprState, and it would be nice to keep the behavior localized in this way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with keeping as is for reducing complexity.

ExprBehavior = Environment::KeepExprState;
break;
}
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a property of the CFG, can we compute this just once per block (rather than every time we visit the block)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried this out -- essentially, I added a method to ControlFlowContext that allows querying what the ExprJoinBehavior should be for a given block. However, I'm not convinced that the result was actually better that what we have now.

ControlFlowContext is essentially an "embellished" CFG -- it contains the CFG along with additional information derived from the CFG that the CFG itself doesn't contain. (BTW, what do you think of renaming ControlFlowContext to EmbellishedCFG? I think it describes the class better, and it would leave us with one fewer Context, of which we have confusingly many.)

I would argue that exprJoinBehaviorForBlock() isn't a direct property of the CFG block -- yes, it's a property that we associate with the block, but that property only makes sense in the context of how we join input states for the block, and as such seems more closely associated with computeBlockInputState() and the Environment than with the ControlFlowContext.

On the other hand, containsExprConsumedInDifferentBlock() does express a very direct property of the block. It seems more natural to have this be the property that is exposed by ControlFlowContext, and to have computeBlockInputState() derive from this the information it needs on how to join states from the predecessor blocks.

Yes, this means we compute this information each time we visit the block, but this appears pretty insignificant compared to the work we do actually processing the block. I ran the Crubit nullability check benchmarks with the draft change I mentioned, and the speedup is too small to distinguish from measurement noise at the CPU time level, though a small significant change can be seen at the cycles level. I don't think this justifies changing the current design, which I think has better separation of concerns.

Copy link
Collaborator

@ymand ymand Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: EmbellishedCFG -- sure.

Ok, I'm fine leaving as is, given the benchmarks. But, I would think there's a simple compromise (say for a followup patch) of just computing it locally in this file (as now), but caching it in AnalysisContext (which is local to that file), since its a fixed property of the block.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not sure this is worth it - this would add additional stored state (even if it's non-mutable), and I'm not sure this is worth it for the small gain we get. Will commit as-is, and we can discuss whether we want to create a followup patch for this.

const CFGBlock *Block) {
for (const Stmt *Child : S->children()) {
const CFGBlock *ChildBlock = StmtToBlock.lookup(Child);
if (isa<Expr>(Child) && ChildBlock != Block)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: check isa<Expr> before the lookup? avoids the extra lookup when it's not an expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point -- done!

@martinboehme
Copy link
Contributor Author

martinboehme commented Mar 5, 2024

I'm a little concerned with the complexity and cost required to support this optimization. With this added cost, do you know if the analysis still faster (and less SAT timeouts?) than if we just did the simple thing of not dropping state?

Yes, significantly faster, and fewer SAT solver timeouts.

To compare against the alternative of not dropping state, I reverse-applied #72985, then ran benchmarks for the Crubit nullability check and looked at the number of SAT solver timeouts when running the nullability check on an internal codebase.

Benchmark results show that this patch ("old") performs significantly better than reverse-applied #72985 ("new"):

name                              old cpu/op   new cpu/op   delta
BM_PointerAnalysisCopyPointer     72.4µs ± 4%  71.0µs ± 4%   -1.94%  (p=0.006 n=19+19)
BM_PointerAnalysisIntLoop          191µs ± 3%   237µs ± 3%  +24.50%  (p=0.000 n=19+20)
BM_PointerAnalysisPointerLoop      326µs ± 3%   405µs ± 3%  +24.16%  (p=0.000 n=17+19)
BM_PointerAnalysisBranch           192µs ± 3%   186µs ± 4%   -3.26%  (p=0.000 n=20+20)
BM_PointerAnalysisLoopAndBranch    523µs ± 4%   690µs ± 5%  +31.95%  (p=0.000 n=20+20)
BM_PointerAnalysisTwoLoops         339µs ± 3%   445µs ± 2%  +31.49%  (p=0.000 n=20+20)
BM_PointerAnalysisJoinFilePath    1.62ms ± 3%  2.04ms ± 4%  +25.65%  (p=0.000 n=20+20)
BM_PointerAnalysisCallInLoop      1.15ms ± 4%  1.48ms ± 4%  +29.03%  (p=0.000 n=19+19)

Regarding SAT solver timeouts, on the codebase I looked at, reverse-applied #72985 results in over four times as many SAT solver timeouts as this patch. This is because reverse-applied #72985 performs joins on entries in ExprToVal that could simply have been discarded, and those joins often require calling the SAT solver.

By the way, the performance impact from this patch over current head is pretty small -- about 1% slowdown in the Crubit nullability check benchmarks. (I've added the results to the PR description.)

I'll address the code-level comments next but wanted to get this high-level comment out first.

@Xazax-hun
Copy link
Collaborator

Yes, significantly faster, and fewer SAT solver timeouts.

The Clang Static Analyzer is going one step further. It is pruning the analysis state really aggressively, it is relying on liveness analysis, and whenever the value of a variable is dead (would not be read in future program points), the corresponding symbols will be pruned from the state. For the really deep path sensitive analysis the CSA is doing, it was really beneficial performance-wise. On the other hand, this approach definitely introduced complexity. The CSA is basically doing garbage collection to figure out what part of the program state is unreachable (e.g., can we read a value through some pointers?). Moreover, actually pruning the analysis state is tricky. If we had x < y and y < z in our constraints, to correctly prune y, we would need to have a new set of constraints like x+1 < z. Simply dropping everything with y would lose information.

if (Block == nullptr)
continue;

for (const CFGElement &Element : *Block)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we early terminate as soon as we found an expression that is consumed in a different block? Also, in case we do early termination, would it make sense to go backwards and start with the terminator conditions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we early terminate as soon as we found an expression that is consumed in a different block?

That doesn't work, because we insert ChildBlock, not Block. This block might contain expressions from various other blocks, and we want to insert all of those blocks into Result, so we can't terminate the loop early.

buildContainsExprConsumedInDifferentBlock(
const CFG &Cfg,
const llvm::DenseMap<const Stmt *, const CFGBlock *> &StmtToBlock) {
llvm::DenseSet<const CFGBlock *> Result;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, could be a bitset that contains the block ids. Although, since we probably expect this set to be sparse/small, maybe the current solution is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, I'm not sure which way this goes. This feature doesn't appear to be a bit performance issue one way or the other, so I'm inclined to keep it as is.

@martinboehme
Copy link
Contributor Author

Yes, significantly faster, and fewer SAT solver timeouts.

The Clang Static Analyzer is going one step further. It is pruning the analysis state really aggressively, it is relying on liveness analysis, and whenever the value of a variable is dead (would not be read in future program points), the corresponding symbols will be pruned from the state. For the really deep path sensitive analysis the CSA is doing, it was really beneficial performance-wise. On the other hand, this approach definitely introduced complexity. The CSA is basically doing garbage collection to figure out what part of the program state is unreachable (e.g., can we read a value through some pointers?). Moreover, actually pruning the analysis state is tricky. If we had x < y and y < z in our constraints, to correctly prune y, we would need to have a new set of constraints like x+1 < z. Simply dropping everything with y would lose information.

Thanks for the input from CSA!

Yes, doing this more broadly would surely be a win -- but also much more difficult than for expression evaluation, which makes it easy to determine how long a value will be needed. Something to keep in mind for the future.

By the way, we've also been discussing dropping clauses from the flow condition when we can prove we no longer need them. Do you happen to know any references to similar work, maybe in CSA?

@martinboehme martinboehme merged commit d5aecf0 into llvm:main Mar 7, 2024
@Xazax-hun
Copy link
Collaborator

Do you happen to know any references to similar work, maybe in CSA?

CSA will also remove dead symbols from the path constraints (which is akin to the flow condition in path-sensitive analysis). But this definitely introduced some challenges, there was a recent discussion about some of the fallouts here: https://discourse.llvm.org/t/keeping-the-extent-symbol-alive/76676

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants