Skip to content

[clang][dataflow] Fix unsupported types always being equal #129502

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 4, 2025

Conversation

Discookie
Copy link
Contributor

Previously when the framework encountered unsupported values (such as enum classes), they were always treated as equal when comparing with ==, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.

Added a Value assignment for nullptr, to handle the special case of nullptr == nullptr.

Previously when the framework encountered unsupported values, they were always treated as equal when comparing with `==`, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.
Added a Value assignment for `nullptr`, to handle the special case of `nullptr == nullptr`.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Mar 3, 2025
@llvmbot
Copy link
Member

llvmbot commented Mar 3, 2025

@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clang

Author: Discookie (Discookie)

Changes

Previously when the framework encountered unsupported values (such as enum classes), they were always treated as equal when comparing with ==, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.

Added a Value assignment for nullptr, to handle the special case of nullptr == nullptr.


Full diff: https://github.com/llvm/llvm-project/pull/129502.diff

2 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/Transfer.cpp (+11-1)
  • (modified) clang/unittests/Analysis/FlowSensitive/TransferTest.cpp (+35)
diff --git a/clang/lib/Analysis/FlowSensitive/Transfer.cpp b/clang/lib/Analysis/FlowSensitive/Transfer.cpp
index 9c54eb16d2224..e17a16a3b75d0 100644
--- a/clang/lib/Analysis/FlowSensitive/Transfer.cpp
+++ b/clang/lib/Analysis/FlowSensitive/Transfer.cpp
@@ -60,7 +60,9 @@ static BoolValue &evaluateBooleanEquality(const Expr &LHS, const Expr &RHS,
   Value *LHSValue = Env.getValue(LHS);
   Value *RHSValue = Env.getValue(RHS);
 
-  if (LHSValue == RHSValue)
+  // When two unsupported values are compared, both are nullptr. Only supported
+  // values should evaluate to equal.
+  if (LHSValue == RHSValue && LHSValue)
     return Env.getBoolLiteralValue(true);
 
   if (auto *LHSBool = dyn_cast_or_null<BoolValue>(LHSValue))
@@ -798,6 +800,14 @@ class TransferVisitor : public ConstStmtVisitor<TransferVisitor> {
     Env.setValue(*S, Env.getIntLiteralValue(S->getValue()));
   }
 
+  // Untyped nullptr's aren't handled by NullToPointer casts, so they need to be
+  // handled separately.
+  void VisitCXXNullPtrLiteralExpr(const CXXNullPtrLiteralExpr *S) {
+    auto &NullPointerVal =
+        Env.getOrCreateNullPointerValue(S->getType()->getPointeeType());
+    Env.setValue(*S, NullPointerVal);
+  }
+
   void VisitParenExpr(const ParenExpr *S) {
     // The CFG does not contain `ParenExpr` as top-level statements in basic
     // blocks, however manual traversal to sub-expressions may encounter them.
diff --git a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
index 0f731f4532535..f52b73dbbdc57 100644
--- a/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/TransferTest.cpp
@@ -4974,6 +4974,41 @@ TEST(TransferTest, IntegerLiteralEquality) {
       });
 }
 
+TEST(TransferTest, UnsupportedValueEquality) {
+  std::string Code = R"(
+    // An explicitly unsupported type by the framework.
+    enum class EC {
+      A,
+      B
+    };
+  
+    void target() {
+      EC ec = EC::A;
+
+      bool unsupported_eq_same = (EC::A == EC::A);
+      bool unsupported_eq_other = (EC::A == EC::B);
+      bool unsupported_eq_var = (ec == EC::B);
+
+      (void)0; // [[p]]
+    }
+  )";
+  runDataflow(
+      Code,
+      [](const llvm::StringMap<DataflowAnalysisState<NoopLattice>> &Results,
+         ASTContext &ASTCtx) {
+        const Environment &Env = getEnvironmentAtAnnotation(Results, "p");
+
+        // We do not model the values of unsupported types, so this
+        // seemingly-trivial case will not be true either.
+        EXPECT_TRUE(isa<AtomicBoolValue>(
+            getValueForDecl<BoolValue>(ASTCtx, Env, "unsupported_eq_same")));
+        EXPECT_TRUE(isa<AtomicBoolValue>(
+            getValueForDecl<BoolValue>(ASTCtx, Env, "unsupported_eq_other")));
+        EXPECT_TRUE(isa<AtomicBoolValue>(
+            getValueForDecl<BoolValue>(ASTCtx, Env, "unsupported_eq_var")));
+      });
+}
+
 TEST(TransferTest, CorrelatedBranches) {
   std::string Code = R"(
     void target(bool B, bool C) {

@Discookie Discookie requested review from ymand and jvoung March 3, 2025 10:36
@Discookie Discookie merged commit 680391f into llvm:main Mar 4, 2025
15 checks passed
@jvoung
Copy link
Contributor

jvoung commented Mar 4, 2025

Hi, sorry, I should have run more tests first, but we just noticed a crash in a user of the dataflow framework (around this test https://github.com/google/crubit/blob/859520eca82d60a169fb85cdbf648c57d0a14a99/nullability/test/smart_pointers_diagnosis.cc#L57 )

I'd like to revert for now to investigate.

It may have something to do with the added VisitCXXNullPtrLiteralExpr creating more pointer values and either the consumer needs to adapt, or there is some invariant I hadn't checked.

@Discookie
Copy link
Contributor Author

Discookie commented Mar 5, 2025

Ah I see, thanks for letting me know! I'll test a bit more as well, and add more tests as well if I find something.

One of the things I could see an issue with, is that technically the nullptr literal doesn't have a pointee type, since it's a nullptr_t.
Downstream code should be able to handle a <<<NULL>>> QualType without any further code, but that could be a starting point.

@jvoung
Copy link
Contributor

jvoung commented Mar 11, 2025

Sorry for the delay!

I ran a few more tests and the downstream null analysis over a corpus of code with a small change to the patch. The small tweak (below) seems to pass the tests and your fix helps cover more code, as expected =)
(e.g., analyze both branches when there is an if (enum1 == enum2) vs only one branch)

I think the issue is that the downstream code doesn't expect to nullptr_t to have a pointer value / modeled nullability state until there is a cast (e.g., comments around https://github.com/google/crubit/blob/859520eca82d60a169fb85cdbf648c57d0a14a99/nullability/pointer_nullability_analysis.cc#L1130)

It would take more to change that assumption thoroughly.

So a suggestion is to:

  • do the original fix to evaluateBooleanEquality
  • to handle 'nullptr == nullptr' could special case that in evaluateBooleanEquality (e.g., check if LHS and RHS getType().isNullPtrType()) -- instead of creating PointerValues in VisitCXXNullPtrLiteralExpr

WDYT ?

@Discookie
Copy link
Contributor Author

Sounds good to me, thanks for investigating! Opened #131575 with the recommended change.

In the future, it would be nice indeed for nullptr literals to have some kind of Value - I also encountered this as a pain point in my own experimental checker. But with the presence of smart pointers, it indeed needs more thought than this simple fix.

@martinboehme
Copy link
Contributor

Drive-by comment.

I think the issue is that the downstream code doesn't expect to nullptr_t to have a pointer value / modeled nullability state until there is a cast (e.g., comments around https://github.com/google/crubit/blob/859520eca82d60a169fb85cdbf648c57d0a14a99/nullability/pointer_nullability_analysis.cc#L1130)

I think this is the correct way to model things, and it shouldn't be changed. nullptr_t is not a pointer type, so it should not be associated with a PointerValue.

Note that if you compare a pointer to nullptr, the nullptr is cast to the pointer type (godbolt), so the only case that needs to be treated specially is nullptr == nullptr. This might come up in macro or template code, so it does seem worth getting right.

to handle 'nullptr == nullptr' could special case that in evaluateBooleanEquality (e.g., check if LHS and RHS getType().isNullPtrType()) -- instead of creating PointerValues in VisitCXXNullPtrLiteralExpr

This is one way of doing this, and I don't really see any issues with this.

If we gave nullptr an actual Value though, that would require less special-casing. (The code you have in this PR would simply do the right thing.)

I don't think it's worth creating a new Value subclass for nullptr. Just add a method Value &getNullptrValue() const to Environment that always returns the same Value object (managed by the Arena), and use getNullptrValue() when evaluating a CXXNullPtrLiteralExpr. I think this would be a nice local fix.

@jvoung
Copy link
Contributor

jvoung commented Mar 18, 2025

Thanks, Martin, for the checking over the nullptr/nullptr_t modeling, and the suggestion for a Value &getNullptrValue() const! We can try that out and test it that as a followup cleanup for this special-casing.

jph-13 pushed a commit to jph-13/llvm-project that referenced this pull request Mar 21, 2025
Previously when the framework encountered unsupported values (such as
enum classes), they were always treated as equal when comparing with
`==`, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.

Added a Value assignment for `nullptr`, to handle the special case of
`nullptr == nullptr`.
Discookie added a commit that referenced this pull request Mar 26, 2025
…31575)

Relands #129502.

Previously when the framework encountered unsupported values (such as
enum classes), they were always treated as equal when comparing with
`==`, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.

Added handling for the special case of `nullptr == nullptr`, to preserve
the behavior of untyped `nullptr` having no value.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants