Skip to content

[clang][analyzer] Fix a nullptr dereference when -ftime-trace is used (Reland) #139980

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
May 15, 2025

Conversation

fangyi-zhou
Copy link
Contributor

Fixes #139779.

The bug was introduced in #137355 in SymbolConjured::getStmt, when
trying to obtain a statement for a CFG initializer without an
initializer. This commit adds a null check before access.

Previous PR #139820, Revert #139936

Additional notes since previous PR:

When conjuring a symbol, sometimes there is no valid CFG element, e.g. in the file causing the crash, there is no element at all in the CFG. In these cases, the CFG element reference in the expression engine will be invalid. As a consequence, there needs to be extra checks to ensure the validity of the CFG element reference.

Fixes llvm#139779.

The bug was introduced in llvm#137355 in `SymbolConjured::getStmt`, when
trying to obtain a statement for a CFG initializer without an
initializer.  This commit adds a null check before access.
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:static analyzer labels May 14, 2025
@llvmbot
Copy link
Member

llvmbot commented May 14, 2025

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Fangyi Zhou (fangyi-zhou)

Changes

Fixes #139779.

The bug was introduced in #137355 in SymbolConjured::getStmt, when
trying to obtain a statement for a CFG initializer without an
initializer. This commit adds a null check before access.

Previous PR #139820, Revert #139936

Additional notes since previous PR:

When conjuring a symbol, sometimes there is no valid CFG element, e.g. in the file causing the crash, there is no element at all in the CFG. In these cases, the CFG element reference in the expression engine will be invalid. As a consequence, there needs to be extra checks to ensure the validity of the CFG element reference.


Full diff: https://github.com/llvm/llvm-project/pull/139980.diff

2 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h (+9-1)
  • (added) clang/test/Analysis/ftime-trace-no-init.cpp (+5)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
index 9e7c98fdded17..2a78d79b26bb8 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
@@ -101,9 +101,17 @@ class SymbolConjured : public SymbolData {
 
   // It might return null.
   const Stmt *getStmt() const {
+    if (const auto *Parent = Elem.getParent()) {
+      // Sometimes the CFG element is invalid, avoid dereferencing it.
+      if (Elem.getIndexInBlock() >= Parent->size())
+        return nullptr;
+    }
     switch (Elem->getKind()) {
     case CFGElement::Initializer:
-      return Elem->castAs<CFGInitializer>().getInitializer()->getInit();
+      if (const auto *Init = Elem->castAs<CFGInitializer>().getInitializer()) {
+        return Init->getInit();
+      }
+      return nullptr;
     case CFGElement::ScopeBegin:
       return Elem->castAs<CFGScopeBegin>().getTriggerStmt();
     case CFGElement::ScopeEnd:
diff --git a/clang/test/Analysis/ftime-trace-no-init.cpp b/clang/test/Analysis/ftime-trace-no-init.cpp
new file mode 100644
index 0000000000000..7fb289b19da78
--- /dev/null
+++ b/clang/test/Analysis/ftime-trace-no-init.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang_analyze_cc1 -analyzer-checker=core,apiModeling %s -ftime-trace=%t.raw.json -verify
+// expected-no-diagnostics
+
+// GitHub issue 139779
+struct {} a; // no-crash

@steakhal
Copy link
Contributor

I think this would still trip on the same memory issue.
The problem to me is that we read uninitialized memory. So even if you limit the bounds of the result of the read of such uninitialized memory, it would only mask and limit the effect of such read.
Have you tried running the failing case via valgrind? That's usually great for catching uninitialized reads.
msan is also great, but that needs an instrumented standard library which may be tough to set up.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.

@fangyi-zhou
Copy link
Contributor Author

I think this would still trip on the same memory issue. The problem to me is that we read uninitialized memory. So even if you limit the bounds of the result of the read of such uninitialized memory, it would only mask and limit the effect of such read. Have you tried running the failing case via valgrind? That's usually great for catching uninitialized reads. msan is also great, but that needs an instrumented standard library which may be tough to set up.

I don't think so. I have an asan build locally and it worked.

The methods getParent() and getIndexInBlock() in CFGElementRef does not access memory, since they only return the value from the ref. The actual dereference happens in -> operator, which is gated after the newly added if condition.

Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would still trip on the same memory issue. The problem to me is that we read uninitialized memory. So even if you limit the bounds of the result of the read of such uninitialized memory, it would only mask and limit the effect of such read. Have you tried running the failing case via valgrind? That's usually great for catching uninitialized reads. msan is also great, but that needs an instrumented standard library which may be tough to set up.

I don't think so. I have an asan build locally and it worked.

The methods getParent() and getIndexInBlock() in CFGElementRef does not access memory, since they only return the value from the ref. The actual dereference happens in -> operator, which is gated after the newly added if condition.

My understanding is that asan will not check if a memory is initialized or not. It checks if the access is valid - but since you guard the access we would brush this under the carpet now.

@fangyi-zhou
Copy link
Contributor Author

valgrind result:

$ valgrind ./bin/clang -cc1 -analyze -analyzer-checker=apiModeling -ftime-trace=test.json ../clang/test/Analysis/ftime-trace-no-init.cpp
==825619== Memcheck, a memory error detector
==825619== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==825619== Using Valgrind-3.25.0 and LibVEX; rerun with -h for copyright info
==825619== Command: ./bin/clang -cc1 -analyze -analyzer-checker=apiModeling -ftime-trace=test.json ../clang/test/Analysis/ftime-trace-no-init.cpp
==825619== 
==825619== 
==825619== HEAP SUMMARY:
==825619==     in use at exit: 73,728 bytes in 1 blocks
==825619==   total heap usage: 7,012 allocs, 7,011 frees, 2,733,844 bytes allocated
==825619== 
==825619== LEAK SUMMARY:
==825619==    definitely lost: 0 bytes in 0 blocks
==825619==    indirectly lost: 0 bytes in 0 blocks
==825619==      possibly lost: 0 bytes in 0 blocks
==825619==    still reachable: 73,728 bytes in 1 blocks
==825619==         suppressed: 0 bytes in 0 blocks
==825619== Rerun with --leak-check=full to see details of leaked memory
==825619== 
==825619== For lists of detected and suppressed errors, rerun with: -s
==825619== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@fangyi-zhou fangyi-zhou requested a review from steakhal May 15, 2025 13:49
Copy link
Contributor

@steakhal steakhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the nice discussion.

@steakhal
Copy link
Contributor

I'll wait for the premerge checks to show green to avoid more reverts.

@steakhal steakhal merged commit 81d48e0 into llvm:main May 15, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New --analyze crash with -ftime-trace
3 participants