Skip to content

[analyzer] Harden safeguards for Z3 query times #95129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

steakhal
Copy link
Contributor

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in
at most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted.
After this patch, CSA will tackle such extreme cases as well.

steakhal added 2 commits June 11, 2024 16:54
Created using spr 1.3.4
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels Jun 11, 2024
@llvmbot
Copy link
Member

llvmbot commented Jun 11, 2024

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Balazs Benics (steakhal)

Changes

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in
at most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted.
After this patch, CSA will tackle such extreme cases as well.


Full diff: https://github.com/llvm/llvm-project/pull/95129.diff

6 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def (+13)
  • (modified) clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h (+32-8)
  • (modified) clang/lib/StaticAnalyzer/Core/BugReporter.cpp (+10-2)
  • (modified) clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp (+52-11)
  • (modified) clang/test/Analysis/analyzer-config.c (+2)
  • (modified) clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp (+100-16)
diff --git a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
index f008c9c581d95..201153c886966 100644
--- a/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
+++ b/clang/include/clang/StaticAnalyzer/Core/AnalyzerOptions.def
@@ -184,6 +184,19 @@ ANALYZER_OPTION(bool, ShouldCrosscheckWithZ3, "crosscheck-with-z3",
                 "constraint manager backend.",
                 false)
 
+ANALYZER_OPTION(
+    unsigned, Z3CrosscheckTimeoutThreshold,
+    "crosscheck-with-z3-timeout-threshold",
+    "Set a timeout for individual Z3 queries in milliseconds. "
+    "Set 0 for no timeout.", 300)
+
+ANALYZER_OPTION(
+    unsigned, Z3CrosscheckRLimitThreshold,
+    "crosscheck-with-z3-rlimit-threshold",
+    "Set the Z3 resource limit threshold. This sets a deterministic cutoff "
+    "point for Z3 queries, as longer queries usually consume more resources. "
+    "Set 0 for unlimited.", 400'000)
+
 ANALYZER_OPTION(bool, ShouldReportIssuesInMainSourceFile,
                 "report-in-main-source-file",
                 "Whether or not the diagnostic report should be always "
diff --git a/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h b/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h
index 3ec59e3037363..524337d378aef 100644
--- a/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h
+++ b/clang/include/clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h
@@ -25,8 +25,11 @@ class Z3CrosscheckVisitor final : public BugReporterVisitor {
 public:
   struct Z3Result {
     std::optional<bool> IsSAT = std::nullopt;
+    unsigned Z3QueryTimeMilliseconds = 0;
+    unsigned UsedRLimit = 0;
   };
-  explicit Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result);
+  Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result,
+                      const AnalyzerOptions &Opts);
 
   void Profile(llvm::FoldingSetNodeID &ID) const override;
 
@@ -44,21 +47,42 @@ class Z3CrosscheckVisitor final : public BugReporterVisitor {
   /// Holds the constraints in a given path.
   ConstraintMap Constraints;
   Z3Result &Result;
+  const AnalyzerOptions &Opts;
 };
 
 /// The oracle will decide if a report should be accepted or rejected based on
-/// the results of the Z3 solver.
+/// the results of the Z3 solver and the statistics of the queries of a report
+/// equivalenece class.
 class Z3CrosscheckOracle {
 public:
+  explicit Z3CrosscheckOracle(const AnalyzerOptions &Opts) : Opts(Opts) {}
   enum Z3Decision {
-    AcceptReport, // The report was SAT.
-    RejectReport, // The report was UNSAT or UNDEF.
+    AcceptReport,  // The report was SAT.
+    RejectReport,  // The report was UNSAT or UNDEF.
+    RejectEQClass, // The heuristic suggests to skip the current eqclass.
   };
 
-  /// Makes a decision for accepting or rejecting the report based on the
-  /// result of the corresponding Z3 query.
-  static Z3Decision
-  interpretQueryResult(const Z3CrosscheckVisitor::Z3Result &Query);
+  /// Updates the internal state with the new Z3Result and makes a decision how
+  /// to proceed:
+  /// - Accept the report if the Z3Result was SAT.
+  /// - Suggest dropping the report equvalence class based on the accumulated
+  ///   statistics.
+  /// - Otherwise, reject the report if the Z3Result was UNSAT or UNDEF.
+  ///
+  /// Conditions for dropping the equivalence class:
+  /// - Accumulative time spent in Z3 checks is more than 700ms in the eqclass.
+  /// - Hit the 300ms query timeout in the report eqclass.
+  /// - Hit the 400'000 rlimit in the report eqclass.
+  ///
+  /// Refer to
+  /// https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520 to
+  /// see why this heuristic was chosen.
+  Z3Decision interpretQueryResult(const Z3CrosscheckVisitor::Z3Result &Meta);
+
+private:
+  const AnalyzerOptions &Opts;
+  unsigned NumZ3QueriesDoneInEqClass = 0;
+  unsigned AccumulatedZ3QueryTimeInEqClass = 0; // ms
 };
 
 } // namespace clang::ento
diff --git a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
index c9a7fd0e035c2..e2002bfbe594a 100644
--- a/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
+++ b/clang/lib/StaticAnalyzer/Core/BugReporter.cpp
@@ -89,6 +89,9 @@ STATISTIC(MaxValidBugClassSize,
 
 STATISTIC(NumTimesReportPassesZ3, "Number of reports passed Z3");
 STATISTIC(NumTimesReportRefuted, "Number of reports refuted by Z3");
+STATISTIC(NumTimesReportEQClassAborted,
+          "Number of times a report equivalence class was aborted by the Z3 "
+          "oracle heuristic");
 STATISTIC(NumTimesReportEQClassWasExhausted,
           "Number of times all reports of an equivalence class was refuted");
 
@@ -2840,6 +2843,7 @@ generateVisitorsDiagnostics(PathSensitiveBugReport *R,
 std::optional<PathDiagnosticBuilder> PathDiagnosticBuilder::findValidReport(
     ArrayRef<PathSensitiveBugReport *> &bugReports,
     PathSensitiveBugReporter &Reporter) {
+  Z3CrosscheckOracle Z3Oracle(Reporter.getAnalyzerOptions());
 
   BugPathGetter BugGraph(&Reporter.getGraph(), bugReports);
 
@@ -2871,16 +2875,20 @@ std::optional<PathDiagnosticBuilder> PathDiagnosticBuilder::findValidReport(
         // visitor and check again
         R->clearVisitors();
         Z3CrosscheckVisitor::Z3Result CrosscheckResult;
-        R->addVisitor<Z3CrosscheckVisitor>(CrosscheckResult);
+        R->addVisitor<Z3CrosscheckVisitor>(CrosscheckResult,
+                                           Reporter.getAnalyzerOptions());
 
         // We don't overwrite the notes inserted by other visitors because the
         // refutation manager does not add any new note to the path
         generateVisitorsDiagnostics(R, BugPath->ErrorNode, BRC);
-        switch (Z3CrosscheckOracle::interpretQueryResult(CrosscheckResult)) {
+        switch (Z3Oracle.interpretQueryResult(CrosscheckResult)) {
         case Z3CrosscheckOracle::RejectReport:
           ++NumTimesReportRefuted;
           R->markInvalid("Infeasible constraints", /*Data=*/nullptr);
           continue;
+        case Z3CrosscheckOracle::RejectEQClass:
+          ++NumTimesReportEQClassAborted;
+          return {};
         case Z3CrosscheckOracle::AcceptReport:
           ++NumTimesReportPassesZ3;
           break;
diff --git a/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp b/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
index 20cd8434cbdc6..78a3af7d6082b 100644
--- a/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
+++ b/clang/lib/StaticAnalyzer/Core/Z3CrosscheckVisitor.cpp
@@ -12,26 +12,37 @@
 //===----------------------------------------------------------------------===//
 
 #include "clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h"
+#include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/BugReporter.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/SMTConv.h"
 #include "llvm/ADT/Statistic.h"
 #include "llvm/Support/SMTAPI.h"
+#include "llvm/Support/Timer.h"
 
 #define DEBUG_TYPE "Z3CrosscheckOracle"
 
 STATISTIC(NumZ3QueriesDone, "Number of Z3 queries done");
 STATISTIC(NumTimesZ3TimedOut, "Number of times Z3 query timed out");
+STATISTIC(NumTimesZ3ExhaustedRLimit,
+          "Number of times Z3 query exhausted the rlimit");
+STATISTIC(NumTimesZ3SpendsTooMuchTimeOnASingleEQClass,
+          "Number of times report equivalenece class was cut because it spent "
+          "too much time in Z3");
 
 STATISTIC(NumTimesZ3QueryAcceptsReport,
           "Number of Z3 queries accepting a report");
 STATISTIC(NumTimesZ3QueryRejectReport,
           "Number of Z3 queries rejecting a report");
+STATISTIC(NumTimesZ3QueryRejectEQClass,
+          "Number of times rejecting an report equivalenece class");
 
 using namespace clang;
 using namespace ento;
 
-Z3CrosscheckVisitor::Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result)
-    : Constraints(ConstraintMap::Factory().getEmptyMap()), Result(Result) {}
+Z3CrosscheckVisitor::Z3CrosscheckVisitor(Z3CrosscheckVisitor::Z3Result &Result,
+                                         const AnalyzerOptions &Opts)
+    : Constraints(ConstraintMap::Factory().getEmptyMap()), Result(Result),
+      Opts(Opts) {}
 
 void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
                                           const ExplodedNode *EndPathNode,
@@ -41,8 +52,12 @@ void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
 
   // Create a refutation manager
   llvm::SMTSolverRef RefutationSolver = llvm::CreateZ3Solver();
-  RefutationSolver->setBoolParam("model", true);        // Enable model finding
-  RefutationSolver->setUnsignedParam("timeout", 15000); // ms
+  if (Opts.Z3CrosscheckRLimitThreshold)
+    RefutationSolver->setUnsignedParam("rlimit",
+                                       Opts.Z3CrosscheckRLimitThreshold);
+  if (Opts.Z3CrosscheckTimeoutThreshold)
+    RefutationSolver->setUnsignedParam("timeout",
+                                       Opts.Z3CrosscheckTimeoutThreshold); // ms
 
   ASTContext &Ctx = BRC.getASTContext();
 
@@ -63,8 +78,15 @@ void Z3CrosscheckVisitor::finalizeVisitor(BugReporterContext &BRC,
   }
 
   // And check for satisfiability
+  llvm::TimeRecord Start = llvm::TimeRecord::getCurrentTime(/*Start=*/true);
   std::optional<bool> IsSAT = RefutationSolver->check();
-  Result = Z3Result{IsSAT};
+  llvm::TimeRecord Diff = llvm::TimeRecord::getCurrentTime(/*Start=*/false);
+  Diff -= Start;
+  Result = Z3Result{
+      IsSAT,
+      static_cast<unsigned>(Diff.getWallTime() * 1000),
+      RefutationSolver->getStatistics()->getUnsigned("rlimit count"),
+  };
 }
 
 void Z3CrosscheckVisitor::addConstraints(
@@ -101,18 +123,37 @@ void Z3CrosscheckVisitor::Profile(llvm::FoldingSetNodeID &ID) const {
 Z3CrosscheckOracle::Z3Decision Z3CrosscheckOracle::interpretQueryResult(
     const Z3CrosscheckVisitor::Z3Result &Query) {
   ++NumZ3QueriesDone;
+  ++NumZ3QueriesDoneInEqClass;
+  AccumulatedZ3QueryTimeInEqClass += Query.Z3QueryTimeMilliseconds;
 
-  if (!Query.IsSAT.has_value()) {
-    // For backward compatibility, let's accept the first timeout.
+  if (Query.IsSAT && Query.IsSAT.value()) {
+    ++NumTimesZ3QueryAcceptsReport;
+    return AcceptReport; // sat
+  }
+
+  // Suggest cutting the EQClass if certain heuristics trigger.
+  if (Opts.Z3CrosscheckTimeoutThreshold &&
+      Query.Z3QueryTimeMilliseconds >= Opts.Z3CrosscheckTimeoutThreshold) {
     ++NumTimesZ3TimedOut;
-    return AcceptReport;
+    ++NumTimesZ3QueryRejectEQClass;
+    return RejectEQClass;
   }
 
-  if (Query.IsSAT.value()) {
-    ++NumTimesZ3QueryAcceptsReport;
-    return AcceptReport; // sat
+  if (Opts.Z3CrosscheckRLimitThreshold &&
+      Query.UsedRLimit >= Opts.Z3CrosscheckRLimitThreshold) {
+    ++NumTimesZ3ExhaustedRLimit;
+    ++NumTimesZ3QueryRejectEQClass;
+    return RejectEQClass;
+  }
+
+  if (AccumulatedZ3QueryTimeInEqClass > 700 /*ms*/) {
+    ++NumTimesZ3SpendsTooMuchTimeOnASingleEQClass;
+    ++NumTimesZ3QueryRejectEQClass;
+    return RejectEQClass;
   }
 
+  // If no cutoff heuristics trigger, and the report is "unsat" or "undef",
+  // then reject the report.
   ++NumTimesZ3QueryRejectReport;
   return RejectReport; // unsat
 }
diff --git a/clang/test/Analysis/analyzer-config.c b/clang/test/Analysis/analyzer-config.c
index fda920fa3951e..c4e6d94182dc1 100644
--- a/clang/test/Analysis/analyzer-config.c
+++ b/clang/test/Analysis/analyzer-config.c
@@ -43,6 +43,8 @@
 // CHECK-NEXT: cplusplus.Move:WarnOn = KnownsAndLocals
 // CHECK-NEXT: cplusplus.SmartPtrModeling:ModelSmartPtrDereference = false
 // CHECK-NEXT: crosscheck-with-z3 = false
+// CHECK-NEXT: crosscheck-with-z3-rlimit-threshold = 400000
+// CHECK-NEXT: crosscheck-with-z3-timeout-threshold = 300
 // CHECK-NEXT: ctu-dir = ""
 // CHECK-NEXT: ctu-import-cpp-threshold = 8
 // CHECK-NEXT: ctu-import-threshold = 24
diff --git a/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp b/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp
index efad4dd3f03b9..bd9439e447542 100644
--- a/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp
+++ b/clang/unittests/StaticAnalyzer/Z3CrosscheckOracleTest.cpp
@@ -6,6 +6,7 @@
 //
 //===----------------------------------------------------------------------===//
 
+#include "clang/StaticAnalyzer/Core/AnalyzerOptions.h"
 #include "clang/StaticAnalyzer/Core/BugReporter/Z3CrosscheckVisitor.h"
 #include "gtest/gtest.h"
 
@@ -17,43 +18,126 @@ using Z3Decision = Z3CrosscheckOracle::Z3Decision;
 
 static constexpr Z3Decision AcceptReport = Z3Decision::AcceptReport;
 static constexpr Z3Decision RejectReport = Z3Decision::RejectReport;
+static constexpr Z3Decision RejectEQClass = Z3Decision::RejectEQClass;
 
 static constexpr std::optional<bool> SAT = true;
 static constexpr std::optional<bool> UNSAT = false;
 static constexpr std::optional<bool> UNDEF = std::nullopt;
 
+static unsigned operator""_ms(unsigned long long ms) { return ms; }
+static unsigned operator""_step(unsigned long long rlimit) { return rlimit; }
+
+static const AnalyzerOptions DefaultOpts = [] {
+  AnalyzerOptions Config;
+#define ANALYZER_OPTION_DEPENDS_ON_USER_MODE(TYPE, NAME, CMDFLAG, DESC,        \
+                                             SHALLOW_VAL, DEEP_VAL)            \
+  ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEEP_VAL)
+#define ANALYZER_OPTION(TYPE, NAME, CMDFLAG, DESC, DEFAULT_VAL)                \
+  Config.NAME = DEFAULT_VAL;
+#include "clang/StaticAnalyzer/Core/AnalyzerOptions.def"
+
+  // Remember to update the tests in this file when these values change.
+  // Also update the doc comment of `interpretQueryResult`.
+  assert(Config.Z3CrosscheckRLimitThreshold == 400'000);
+  assert(Config.Z3CrosscheckTimeoutThreshold == 300 /*ms*/);
+  // Usually, when the timeout/rlimit threshold is reached, Z3 only slightly
+  // overshoots until it realizes that it overshoot and needs to back off.
+  // Consequently, the measured timeout should be fairly close to the threshold.
+  // Same reasoning applies to the rlimit too.
+  return Config;
+}();
+
 namespace {
 
-struct Z3CrosscheckOracleTest : public testing::Test {
-  Z3Decision interpretQueryResult(const Z3Result &Result) const {
-    return Z3CrosscheckOracle::interpretQueryResult(Result);
+class Z3CrosscheckOracleTest : public testing::Test {
+public:
+  Z3Decision interpretQueryResult(const Z3Result &Result) {
+    return Oracle.interpretQueryResult(Result);
   }
+
+private:
+  Z3CrosscheckOracle Oracle = Z3CrosscheckOracle(DefaultOpts);
 };
 
 TEST_F(Z3CrosscheckOracleTest, AcceptsFirstSAT) {
-  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 25_ms, 1000_step}));
 }
 
 TEST_F(Z3CrosscheckOracleTest, AcceptsSAT) {
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 25_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenItGoesOverTime) {
+  // Even if it times out, if it is SAT, we should accept it.
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 310_ms, 1000_step}));
 }
 
-TEST_F(Z3CrosscheckOracleTest, AcceptsFirstTimeout) {
-  ASSERT_EQ(AcceptReport, interpretQueryResult({UNDEF}));
+TEST_F(Z3CrosscheckOracleTest, UNSATWhenItGoesOverTime) {
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNSAT, 310_ms, 1000_step}));
 }
 
-TEST_F(Z3CrosscheckOracleTest, AcceptsTimeout) {
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(AcceptReport, interpretQueryResult({UNDEF}));
+TEST_F(Z3CrosscheckOracleTest, RejectsTimeout) {
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNDEF, 310_ms, 1000_step}));
 }
 
 TEST_F(Z3CrosscheckOracleTest, RejectsUNSATs) {
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
-  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+}
+
+// Testing cut heuristics:
+// =======================
+
+TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfSpendsTooMuchTotalTime) {
+  // Simulate long queries, that barely doesn't trigger the timeout.
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenItSpendsTooMuchTotalTime) {
+  // Simulate long queries, that barely doesn't trigger the timeout.
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 290_ms, 1000_step}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 290_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfAttemptsManySmallQueries) {
+  // Simulate quick, but many queries: 35 quick UNSAT queries.
+  // 35*20ms = 700ms, which is equal to the 700ms threshold.
+  for (int i = 0; i < 35; ++i) {
+    ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 20_ms, 1000_step}));
+  }
+  // Do one more to trigger the heuristic.
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNSAT, 1_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenIfAttemptsManySmallQueries) {
+  // Simulate quick, but many queries: 35 quick UNSAT queries.
+  // 35*20ms = 700ms, which is equal to the 700ms threshold.
+  for (int i = 0; i < 35; ++i) {
+    ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 20_ms, 1000_step}));
+  }
+  // Do one more to trigger the heuristic, but given this was SAT, we still
+  // accept the query.
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 20_ms, 1000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfHitsRLimitTooManyTimes) {
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectEQClass, interpretQueryResult({UNDEF, 25_ms, 405'000_step}));
+}
+
+TEST_F(Z3CrosscheckOracleTest, SATWhenItHitsRLimitTooManyTimes) {
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(RejectReport, interpretQueryResult({UNSAT, 25_ms, 1000_step}));
+  ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 25_ms, 405'000_step}));
 }
 
 } // namespace

steakhal added 2 commits June 12, 2024 08:48
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@steakhal
Copy link
Contributor Author

I've tested this change on 200+ projects, and the overall effect is:

  • 0 new issues appear
  • 276 issues disappear (because we drop the report eqclass)

The three most affected checker categories (all of them are spread across usually 20+ projects):

  • null-deref-like diagnostics (74)
  • OOBv2 (70)
  • uninit-read-like (55)

This underpins the non-intrusive nature of this change.

@steakhal
Copy link
Contributor Author

Ping. @NagyDonat @Xazax-hun @haoNoQ

Copy link
Contributor

@NagyDonat NagyDonat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I vaguely recalled that I already reviewed this, but it turns out I didn't. Anyway, the change LGTM, I only added a few minor remarks. (We already discussed a high-level overview of this on Discourse.)

return RejectEQClass;
}

if (AccumulatedZ3QueryTimeInEqClass > 700 /*ms*/) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's counter-intuitive that this limit is hardcoded, while the others are not -- e.g. this will silently override timeout thresholds above 700 ms.

Obviously this is bikeshedding, because practically nobody will use these config options, but if we want to make this configurable, then we should make the config clear and user-friendly. Either introduce this limit as a third configurable value, or e.g. say that it's always twice the individual timeout (which would yield 600 ms instead of 700 ms).

}

// If no cutoff heuristics trigger, and the report is "unsat" or "undef",
// then reject the report.
++NumTimesZ3QueryRejectReport;
return RejectReport; // unsat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return RejectReport; // unsat
return RejectReport;

This doesn't add anything and is technically incorrect.

// For backward compatibility, let's accept the first timeout.
if (Query.IsSAT && Query.IsSAT.value()) {
++NumTimesZ3QueryAcceptsReport;
return AcceptReport; // sat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return AcceptReport; // sat
return AcceptReport;

This old comment doesn't add anything + the "unset" on the other branch was slightly incorrect.

ASSERT_EQ(AcceptReport, interpretQueryResult({SAT, 200_ms, 1000_step}));
}

TEST_F(Z3CrosscheckOracleTest, RejectEQClassIfHitsRLimitTooManyTimes) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name "HitsRLimitTooManyTimes" is a leftover from the earlier plan where one rlimit overrun wasn't enough to discard the eqclass. (This also applies for the next TC.)

@steakhal
Copy link
Contributor Author

Ah, it mixed the messages... Here it is:

  • Removed unused NumZ3QueriesDoneInEqClass
  • Added a new crosscheck-with-z3-eqclass-timeout-threshold config option.
  • Removed misleading // unsat comment on return statement.
  • Removed unnecessary // sat comment on return statement.
  • Adjusted the name of the two last unit-tests.

steakhal added 2 commits June 18, 2024 09:43
Created using spr 1.3.4

[skip ci]
Created using spr 1.3.4
@steakhal steakhal changed the base branch from users/steakhal/spr/main.analyzer-harden-safeguards-for-z3-query-times to main June 18, 2024 07:46
Created using spr 1.3.4
@steakhal steakhal merged commit eacc3b3 into main Jun 18, 2024
3 of 5 checks passed
@steakhal steakhal deleted the users/steakhal/spr/analyzer-harden-safeguards-for-z3-query-times branch June 18, 2024 07:48
@steakhal
Copy link
Contributor Author

Reverting this patch as requested in this comment.

steakhal added a commit that referenced this pull request Jul 1, 2024
This is exactly as originally landed in #95129,
but now the minimal Z3 version was increased to meet this change in #96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4

---

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in at
most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted. After this
patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)
lravenclaw pushed a commit to lravenclaw/llvm-project that referenced this pull request Jul 3, 2024
This is exactly as originally landed in llvm#95129,
but now the minimal Z3 version was increased to meet this change in llvm#96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4

---

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in at
most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted. After this
patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)
kbluck pushed a commit to kbluck/llvm-project that referenced this pull request Jul 6, 2024
This is exactly as originally landed in llvm#95129,
but now the minimal Z3 version was increased to meet this change in llvm#96682.

https://discourse.llvm.org/t/bump-minimal-z3-requirements-from-4-7-1-to-4-8-9/79664/4

---

This patch is a functional change.
https://discourse.llvm.org/t/analyzer-rfc-taming-z3-query-times/79520

As a result of this patch, individual Z3 queries in refutation will be
bound by 300ms. Every report equivalence class will be processed in at
most 1 second.

The heuristic should have only really marginal observable impact -
except for the cases when we had big report eqclasses with long-running
(15s) Z3 queries, where previously CSA effectively halted. After this
patch, CSA will tackle such extreme cases as well.

(cherry picked from commit eacc3b3)
Szelethus added a commit to Szelethus/codechecker that referenced this pull request Oct 21, 2024
Turn off Z3 timeouts added to the Clang Static Analyzer in
llvm/llvm-project#95129. We are now aware that
these options can often cause nondeterministic behaviour.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants