Skip to content

[clang][dataflow] Merge RecordValues with different locations correctly. #65319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 12, 2023

Conversation

martinboehme
Copy link
Contributor

Now that prvalue expressions map directly to values (see
https://reviews.llvm.org/D158977), it's no longer guaranteed that RecordValues
associated with the same expression will always have the same storage location.

In other words, D158977 invalidated the assertion in mergeDistinctValues().
The newly added test causes this assertion to fail without the other changes in
the patch.

This patch fixes the issue. However, the real fix will be to eliminate the
StorageLocation from RecordValue entirely.

@martinboehme martinboehme requested review from a team as code owners September 5, 2023 12:39
Copy link
Contributor

@kinu kinu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this LGTM (non-authoritative; but confirmed that it makes it go through with one of the real code I hit this)

MergedVal = &MergedEnv.create<RecordValue>(RecordVal1->getLoc());
else
// If the locations for the two records are different, need to create a
// completely new value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that how cryptic when this could happen, it might be helpful to have a brief comment that both could happen depending on a subtle order of CFG and therefore how merge happens.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure exactly what you mean...

Whether the locations are the same or not doesn't really depend on the order in which we process CFG blocks. Instead, locations are guaranteed to be the same if the RecordValue is stored in ExprToLoc, and they are usually (but not always) different if the RecordValue is stored in ExprToVal.

The order in which we process CFG blocks affects whether we actually see different RecordValues or not when the merge happens -- which makes merges difficult to trigger reliably from an integration test (example). But I'm not sure how relevant that is to the situation here?

Anyway, my hope is that all of this will go away relatively soon because we're now working on eliminating the RecordStorageLocation from RecordValue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, I wanted to see if some comment may help but it's a bit irrelevant here. Also given that it's going away I'm good with it, thanks!

// `RecordVal1` and `RecordVal2` may have different properties associated
// with them. Create a new `RecordValue` without any properties so that we
// soundly approximate both values. If a particular analysis needs to
// merge properties, it should do so in `DataflowAnalysis::merge()`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'If a particular analysis needs to merge...'
this part can be out of the if / else because the model's merge may do something for the else case too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see what you're saying. However, if I put this part of the comment ("If a particular analysis needs to merge properties...") above the if statement, it would read a bit strangely -- because in the case where the locations aren't the same, it's clear we need to create a new RecordValue anyway, and this will of course remove any properties.

What I'm trying to explain here is that we need to create a new RecordValue even if the location is the same, because the properties may be different -- and that therefore the analysis needs to merge properties even in this case. Does that make sense?

Open to suggestions on how I could clarify this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe expand a bit? e.g. "with the same location, but without any properties..."?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM. Yeah adding 'with the same location, ...' may clarify the situation bit, either works for me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the suggestion -- done!

@martinboehme martinboehme force-pushed the piper_export_cl_562747021 branch from 5c5723d to 28ac8d2 Compare September 11, 2023 07:24
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Sep 11, 2023
@llvmbot
Copy link
Member

llvmbot commented Sep 11, 2023

@llvm/pr-subscribers-clang

Changes

Now that prvalue expressions map directly to values (see
https://reviews.llvm.org/D158977), it's no longer guaranteed that RecordValues
associated with the same expression will always have the same storage location.

In other words, D158977 invalidated the assertion in mergeDistinctValues().
The newly added test causes this assertion to fail without the other changes in
the patch.

This patch fixes the issue. However, the real fix will be to eliminate the
StorageLocation from RecordValue entirely.

--
Full diff: https://github.com/llvm/llvm-project/pull/65319.diff

2 Files Affected:

  • (modified) clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp (+13-12)
  • (modified) clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp (+69)
diff --git a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
index f5eb469e7bb3d4e..e13f880896fc071 100644
--- a/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
+++ b/clang/lib/Analysis/FlowSensitive/DataflowEnvironment.cpp
@@ -125,18 +125,19 @@ static Value *mergeDistinctValues(QualType Type, Value &Val1,
 
   Value *MergedVal = nullptr;
   if (auto *RecordVal1 = dyn_cast(&Val1)) {
-    [[maybe_unused]] auto *RecordVal2 = cast(&Val2);
-
-    // Values to be merged are always associated with the same location in
-    // `LocToVal`. The location stored in `RecordVal` should therefore also
-    // be the same.
-    assert(&RecordVal1->getLoc() == &RecordVal2->getLoc());
-
-    // `RecordVal1` and `RecordVal2` may have different properties associated
-    // with them. Create a new `RecordValue` without any properties so that we
-    // soundly approximate both values. If a particular analysis needs to merge
-    // properties, it should do so in `DataflowAnalysis::merge()`.
-    MergedVal = &MergedEnv.create(RecordVal1->getLoc());
+    auto *RecordVal2 = cast(&Val2);
+
+    if (&RecordVal1->getLoc() == &RecordVal2->getLoc())
+      // `RecordVal1` and `RecordVal2` may have different properties associated
+      // with them. Create a new `RecordValue` with the same location but
+      // without any properties so that we soundly approximate both values. If a
+      // particular analysis needs to merge properties, it should do so in
+      // `DataflowAnalysis::merge()`.
+      MergedVal = &MergedEnv.create(RecordVal1->getLoc());
+    else
+      // If the locations for the two records are different, need to create a
+      // completely new value.
+      MergedVal = MergedEnv.createValue(Type);
   } else {
     MergedVal = MergedEnv.createValue(Type);
   }
diff --git a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
index a318d9ab7391aa1..6e37011a052c5e4 100644
--- a/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
+++ b/clang/unittests/Analysis/FlowSensitive/DataflowEnvironmentTest.cpp
@@ -96,6 +96,75 @@ TEST_F(EnvironmentTest, CreateValueRecursiveType) {
   EXPECT_THAT(PV, NotNull());
 }
 
+TEST_F(EnvironmentTest, JoinRecords) {
+  using namespace ast_matchers;
+
+  std::string Code = R"cc(
+    struct S {};
+    // Need to use the type somewhere so that the `QualType` gets created;
+    S s;
+  )cc";
+
+  auto Unit =
+      tooling::buildASTFromCodeWithArgs(Code, {"-fsyntax-only", "-std=c++11"});
+  auto &Context = Unit->getASTContext();
+
+  ASSERT_EQ(Context.getDiagnostics().getClient()->getNumErrors(), 0U);
+
+  auto Results =
+      match(qualType(hasDeclaration(recordDecl(hasName("S")))).bind("SType"),
+            Context);
+  const QualType *TyPtr = selectFirst("SType", Results);
+  ASSERT_THAT(TyPtr, NotNull());
+  QualType Ty = *TyPtr;
+  ASSERT_FALSE(Ty.isNull());
+
+  auto *ConstructExpr = CXXConstructExpr::CreateEmpty(Context, 0);
+  ConstructExpr->setType(Ty);
+  ConstructExpr->setValueKind(VK_PRValue);
+
+  // Two different `RecordValue`s with the same location are joined into a
+  // third `RecordValue` with that same location.
+  {
+    Environment Env1(DAContext);
+    auto &Val1 = *cast(Env1.createValue(Ty));
+    RecordStorageLocation &Loc = Val1.getLoc();
+    Env1.setValue(*ConstructExpr, Val1);
+
+    Environment Env2(DAContext);
+    auto &Val2 = Env2.create(Loc);
+    Env2.setValue(Loc, Val2);
+    Env2.setValue(*ConstructExpr, Val2);
+
+    Environment::ValueModel Model;
+    Environment EnvJoined = Environment::join(Env1, Env2, Model);
+    auto *JoinedVal = cast(EnvJoined.getValue(*ConstructExpr));
+    EXPECT_NE(JoinedVal, &Val1);
+    EXPECT_NE(JoinedVal, &Val2);
+    EXPECT_EQ(&JoinedVal->getLoc(), &Loc);
+  }
+
+  // Two different `RecordValue`s with different locations are joined into a
+  // third `RecordValue` with a location different from the other two.
+  {
+    Environment Env1(DAContext);
+    auto &Val1 = *cast(Env1.createValue(Ty));
+    Env1.setValue(*ConstructExpr, Val1);
+
+    Environment Env2(DAContext);
+    auto &Val2 = *cast(Env2.createValue(Ty));
+    Env2.setValue(*ConstructExpr, Val2);
+
+    Environment::ValueModel Model;
+    Environment EnvJoined = Environment::join(Env1, Env2, Model);
+    auto *JoinedVal = cast(EnvJoined.getValue(*ConstructExpr));
+    EXPECT_NE(JoinedVal, &Val1);
+    EXPECT_NE(JoinedVal, &Val2);
+    EXPECT_NE(&JoinedVal->getLoc(), &Val1.getLoc());
+    EXPECT_NE(&JoinedVal->getLoc(), &Val2.getLoc());
+  }
+}
+
 TEST_F(EnvironmentTest, InitGlobalVarsFun) {
   using namespace ast_matchers;
 

@martinboehme
Copy link
Contributor Author

(Sorry, rebased to head inadvertently -- still getting used to the new process. Will use fixup commits in the future.)

…ctly.

Now that prvalue expressions map directly to values (see
https://reviews.llvm.org/D158977), it's no longer guaranteed that `RecordValue`s
associated with the same expression will always have the same storage location.

In other words, D158977 invalidated the assertion in `mergeDistinctValues()`.
The newly added test causes this assertion to fail without the other changes in
the patch.

This patch fixes the issue. However, the real fix will be to eliminate the
`StorageLocation` from `RecordValue` entirely.
@martinboehme martinboehme force-pushed the piper_export_cl_562747021 branch from 28ac8d2 to fc70e2f Compare September 12, 2023 06:36
@martinboehme martinboehme merged commit 7f66cc7 into llvm:main Sep 12, 2023
ZijunZhaoCCK pushed a commit to ZijunZhaoCCK/llvm-project that referenced this pull request Sep 19, 2023
…ctly. (llvm#65319)

Now that prvalue expressions map directly to values (see
https://reviews.llvm.org/D158977), it's no longer guaranteed that
`RecordValue`s
associated with the same expression will always have the same storage
location.

In other words, D158977 invalidated the assertion in
`mergeDistinctValues()`.
The newly added test causes this assertion to fail without the other
changes in
the patch.

This patch fixes the issue. However, the real fix will be to eliminate
the
`StorageLocation` from `RecordValue` entirely.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants