Skip to content

[lldb] Remove ProcessRunLock::TrySetRunning #135455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 14, 2025

Conversation

JDevlieghere
Copy link
Member

I traced the issue reported by Caroline and Pavel in #134757 back to the call to ProcessRunLock::TrySetRunning. When that fails, we get a somewhat misleading error message:

process resume at entry point failed: Resume request failed - process still running.

This is incorrect: the problem was not that the process was in a running state, but rather that the RunLock was being held by another thread (i.e. the Statusline). TrySetRunning would return false in both cases and the call site only accounted for the former.

Besides the odd semantics, the current implementation is inherently race-y and I believe incorrect. If someone is holding the RunLock, the resume call should block, rather than give up, and with the lock held, switch the running state and report the old running state.

This patch removes ProcessRunLock::TrySetRunning and updates all callers to use ProcessRunLock::SetRunning instead. To support that, ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for consistency) now report whether the process was stopped or running respectively. Previously, both methods returned true unconditionally.

The old code has been around pretty much pretty much forever, there's nothing in the git history to indicate that this was done purposely to solve a particular issue. I've tested this on both Linux and macOS and confirmed that this solves the statusline issue.

A big thank you to Jim for reviewing my proposed solution offline and trying to poke holes in it.

I traced the issue reported by Caroline and Pavel in llvm#134757 back to the
call to ProcessRunLock::TrySetRunning. When that fails, we get a
somewhat misleading error message:

> process resume at entry point failed: Resume request failed - process still running.

This is incorrect: the problem was not that the process was in a
running state, but rather that the RunLock was being held by another
thread (i.e. the Statusline). TrySetRunning would return false in both
cases and the call site only accounted for the former.

Besides the odd semantics, the current implementation is inherently
race-y and I believe incorrect. If someone is holding the RunLock, the
resume call should block, rather than give up, and with the lock held,
switch the running state and report the old running state.

This patch removes ProcessRunLock::TrySetRunning and updates all callers
to use ProcessRunLock::SetRunning instead. To support that,
ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for
consistency) now report  whether the process was stopped or running
respectively. Previously, both methods returned true unconditionally.

The old code has been around pretty much pretty much forever, there's
nothing in the git history to indicate that this was done purposely to
solve a particular issue. I've tested this on both Linux and macOS and
confirmed that this solves the statusline issue.

A big thank you to Jim for reviewing my proposed solution offline and
trying to poke holes in it.
@llvmbot
Copy link
Member

llvmbot commented Apr 11, 2025

@llvm/pr-subscribers-lldb

Author: Jonas Devlieghere (JDevlieghere)

Changes

I traced the issue reported by Caroline and Pavel in #134757 back to the call to ProcessRunLock::TrySetRunning. When that fails, we get a somewhat misleading error message:

> process resume at entry point failed: Resume request failed - process still running.

This is incorrect: the problem was not that the process was in a running state, but rather that the RunLock was being held by another thread (i.e. the Statusline). TrySetRunning would return false in both cases and the call site only accounted for the former.

Besides the odd semantics, the current implementation is inherently race-y and I believe incorrect. If someone is holding the RunLock, the resume call should block, rather than give up, and with the lock held, switch the running state and report the old running state.

This patch removes ProcessRunLock::TrySetRunning and updates all callers to use ProcessRunLock::SetRunning instead. To support that, ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for consistency) now report whether the process was stopped or running respectively. Previously, both methods returned true unconditionally.

The old code has been around pretty much pretty much forever, there's nothing in the git history to indicate that this was done purposely to solve a particular issue. I've tested this on both Linux and macOS and confirmed that this solves the statusline issue.

A big thank you to Jim for reviewing my proposed solution offline and trying to poke holes in it.


Full diff: https://github.com/llvm/llvm-project/pull/135455.diff

4 Files Affected:

  • (modified) lldb/include/lldb/Host/ProcessRunLock.h (+6-1)
  • (modified) lldb/source/Host/common/ProcessRunLock.cpp (+4-14)
  • (modified) lldb/source/Host/windows/ProcessRunLock.cpp (+4-12)
  • (modified) lldb/source/Target/Process.cpp (+21-38)
diff --git a/lldb/include/lldb/Host/ProcessRunLock.h b/lldb/include/lldb/Host/ProcessRunLock.h
index b5b5328b4a33f..c83cab53a9a65 100644
--- a/lldb/include/lldb/Host/ProcessRunLock.h
+++ b/lldb/include/lldb/Host/ProcessRunLock.h
@@ -29,8 +29,13 @@ class ProcessRunLock {
 
   bool ReadTryLock();
   bool ReadUnlock();
+
+  /// Set the process to running. Returns true if the process was stopped.
+  /// Return true if the process was running.
   bool SetRunning();
-  bool TrySetRunning();
+
+  /// Set the process to stopped. Returns true if the process was stopped.
+  /// Returns false if the process was running.
   bool SetStopped();
 
   class ProcessRunLocker {
diff --git a/lldb/source/Host/common/ProcessRunLock.cpp b/lldb/source/Host/common/ProcessRunLock.cpp
index da59f40576978..f9bde96ae8ac9 100644
--- a/lldb/source/Host/common/ProcessRunLock.cpp
+++ b/lldb/source/Host/common/ProcessRunLock.cpp
@@ -37,21 +37,10 @@ bool ProcessRunLock::ReadUnlock() {
 
 bool ProcessRunLock::SetRunning() {
   ::pthread_rwlock_wrlock(&m_rwlock);
+  bool was_stopped = !m_running;
   m_running = true;
   ::pthread_rwlock_unlock(&m_rwlock);
-  return true;
-}
-
-bool ProcessRunLock::TrySetRunning() {
-  bool r;
-
-  if (::pthread_rwlock_trywrlock(&m_rwlock) == 0) {
-    r = !m_running;
-    m_running = true;
-    ::pthread_rwlock_unlock(&m_rwlock);
-    return r;
-  }
-  return false;
+  return was_stopped;
 }
 
 bool ProcessRunLock::SetStopped() {
@@ -60,6 +49,7 @@ bool ProcessRunLock::SetStopped() {
   ::pthread_rwlock_unlock(&m_rwlock);
   return true;
 }
-}
+
+} // namespace lldb_private
 
 #endif
diff --git a/lldb/source/Host/windows/ProcessRunLock.cpp b/lldb/source/Host/windows/ProcessRunLock.cpp
index 693641e42ed73..9f144b4c918f8 100644
--- a/lldb/source/Host/windows/ProcessRunLock.cpp
+++ b/lldb/source/Host/windows/ProcessRunLock.cpp
@@ -58,24 +58,16 @@ bool ProcessRunLock::ReadUnlock() { return ::ReadUnlock(m_rwlock); }
 
 bool ProcessRunLock::SetRunning() {
   WriteLock(m_rwlock);
+  bool was_stopped = !m_running;
   m_running = true;
   WriteUnlock(m_rwlock);
-  return true;
-}
-
-bool ProcessRunLock::TrySetRunning() {
-  if (WriteTryLock(m_rwlock)) {
-    bool was_running = m_running;
-    m_running = true;
-    WriteUnlock(m_rwlock);
-    return !was_running;
-  }
-  return false;
+  return was_stopped;
 }
 
 bool ProcessRunLock::SetStopped() {
   WriteLock(m_rwlock);
+  bool was_running = m_running;
   m_running = false;
   WriteUnlock(m_rwlock);
-  return true;
+  return was_running;
 }
diff --git a/lldb/source/Target/Process.cpp b/lldb/source/Target/Process.cpp
index a9787823b9108..cd5b8c1e8a6fa 100644
--- a/lldb/source/Target/Process.cpp
+++ b/lldb/source/Target/Process.cpp
@@ -577,9 +577,7 @@ void Process::Finalize(bool destructing) {
   // contain events that have ProcessSP values in them which can keep this
   // process around forever. These events need to be cleared out.
   m_private_state_listener_sp->Clear();
-  m_public_run_lock.TrySetRunning(); // This will do nothing if already locked
   m_public_run_lock.SetStopped();
-  m_private_run_lock.TrySetRunning(); // This will do nothing if already locked
   m_private_run_lock.SetStopped();
   m_structured_data_plugin_map.clear();
 }
@@ -1325,9 +1323,9 @@ void Process::SetPublicState(StateType new_state, bool restarted) {
 Status Process::Resume() {
   Log *log(GetLog(LLDBLog::State | LLDBLog::Process));
   LLDB_LOGF(log, "(plugin = %s) -- locking run lock", GetPluginName().data());
-  if (!m_public_run_lock.TrySetRunning()) {
-    LLDB_LOGF(log, "(plugin = %s) -- TrySetRunning failed, not resuming.",
-             GetPluginName().data());
+  if (!m_public_run_lock.SetRunning()) {
+    LLDB_LOGF(log, "(plugin = %s) -- SetRunning failed, not resuming.",
+              GetPluginName().data());
     return Status::FromErrorString(
         "Resume request failed - process still running.");
   }
@@ -1342,8 +1340,8 @@ Status Process::Resume() {
 Status Process::ResumeSynchronous(Stream *stream) {
   Log *log(GetLog(LLDBLog::State | LLDBLog::Process));
   LLDB_LOGF(log, "Process::ResumeSynchronous -- locking run lock");
-  if (!m_public_run_lock.TrySetRunning()) {
-    LLDB_LOGF(log, "Process::Resume: -- TrySetRunning failed, not resuming.");
+  if (!m_public_run_lock.SetRunning()) {
+    LLDB_LOGF(log, "Process::Resume: -- SetRunning failed, not resuming.");
     return Status::FromErrorString(
         "Resume request failed - process still running.");
   }
@@ -2718,13 +2716,8 @@ Status Process::LaunchPrivate(ProcessLaunchInfo &launch_info, StateType &state,
   SetPublicState(eStateLaunching, restarted);
   m_should_detach = false;
 
-  if (m_public_run_lock.TrySetRunning()) {
-    // Now launch using these arguments.
-    error = DoLaunch(exe_module, launch_info);
-  } else {
-    // This shouldn't happen
-    error = Status::FromErrorString("failed to acquire process run lock");
-  }
+  m_public_run_lock.SetRunning();
+  error = DoLaunch(exe_module, launch_info);
 
   if (error.Fail()) {
     if (GetID() != LLDB_INVALID_PROCESS_ID) {
@@ -2989,17 +2982,12 @@ Status Process::Attach(ProcessAttachInfo &attach_info) {
       if (wait_for_launch) {
         error = WillAttachToProcessWithName(process_name, wait_for_launch);
         if (error.Success()) {
-          if (m_public_run_lock.TrySetRunning()) {
-            m_should_detach = true;
-            const bool restarted = false;
-            SetPublicState(eStateAttaching, restarted);
-            // Now attach using these arguments.
-            error = DoAttachToProcessWithName(process_name, attach_info);
-          } else {
-            // This shouldn't happen
-            error =
-                Status::FromErrorString("failed to acquire process run lock");
-          }
+          m_public_run_lock.SetRunning();
+          m_should_detach = true;
+          const bool restarted = false;
+          SetPublicState(eStateAttaching, restarted);
+          // Now attach using these arguments.
+          error = DoAttachToProcessWithName(process_name, attach_info);
 
           if (error.Fail()) {
             if (GetID() != LLDB_INVALID_PROCESS_ID) {
@@ -3060,17 +3048,13 @@ Status Process::Attach(ProcessAttachInfo &attach_info) {
   if (attach_pid != LLDB_INVALID_PROCESS_ID) {
     error = WillAttachToProcessWithID(attach_pid);
     if (error.Success()) {
+      m_public_run_lock.SetRunning();
 
-      if (m_public_run_lock.TrySetRunning()) {
-        // Now attach using these arguments.
-        m_should_detach = true;
-        const bool restarted = false;
-        SetPublicState(eStateAttaching, restarted);
-        error = DoAttachToProcessWithID(attach_pid, attach_info);
-      } else {
-        // This shouldn't happen
-        error = Status::FromErrorString("failed to acquire process run lock");
-      }
+      // Now attach using these arguments.
+      m_should_detach = true;
+      const bool restarted = false;
+      SetPublicState(eStateAttaching, restarted);
+      error = DoAttachToProcessWithID(attach_pid, attach_info);
 
       if (error.Success()) {
         SetNextEventAction(new Process::AttachCompletionHandler(
@@ -5882,10 +5866,9 @@ void Process::ClearPreResumeAction(PreResumeActionCallback callback, void *baton
 }
 
 ProcessRunLock &Process::GetRunLock() {
-  if (m_private_state_thread.EqualsThread(Host::GetCurrentThread()))
+  if (Process::CurrentThreadIsPrivateStateThread())
     return m_private_run_lock;
-  else
-    return m_public_run_lock;
+  return m_public_run_lock;
 }
 
 bool Process::CurrentThreadIsPrivateStateThread()

@jimingham
Copy link
Collaborator

This looks good to me, but we've discussed this already so I'm a prejudiced judge. We should maybe wait on a second opinion...

Copy link
Collaborator

@labath labath left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was looking at this code as well, and I've (also) came to the conclusion that TrySetRunning is a problem.

@@ -60,6 +49,7 @@ bool ProcessRunLock::SetStopped() {
::pthread_rwlock_unlock(&m_rwlock);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also do the was_running dance.

Comment on lines 37 to 38
/// Set the process to stopped. Returns true if the process was stopped.
/// Returns false if the process was running.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Set the process to stopped. Returns true if the process was stopped.
/// Returns false if the process was running.
/// Set the process to stopped. Returns true if the process was running.
/// Returns false if the process was stopped.

(at least, that's what the windows implementation does -- and I think it makes sense)

@labath
Copy link
Collaborator

labath commented Apr 13, 2025

Thanks for looking into this.

@JDevlieghere JDevlieghere merged commit d792094 into llvm:main Apr 14, 2025
6 checks passed
@JDevlieghere JDevlieghere deleted the try-set-running branch April 14, 2025 08:09
JDevlieghere added a commit that referenced this pull request Apr 14, 2025
…text (#135458)

Make sure the process is stopped when computing the symbol context. Both
Adrian and Felipe reported a handful of crashes in GetSymbolContext
called from Statusline::Redraw on the default event thread.

Given that we're handling a StackFrameSP, it's not clear to me how that
could have gotten invalidated, but Jim points out that it doesn't make
sense to compute the symbol context for the frame when the process isn't
stopped.

Depends on  #135455
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
I traced the issue reported by Caroline and Pavel in llvm#134757 back to the
call to ProcessRunLock::TrySetRunning. When that fails, we get a
somewhat misleading error message:

> process resume at entry point failed: Resume request failed - process
still running.

This is incorrect: the problem was not that the process was in a running
state, but rather that the RunLock was being held by another thread
(i.e. the Statusline). TrySetRunning would return false in both cases
and the call site only accounted for the former.

Besides the odd semantics, the current implementation is inherently
race-y and I believe incorrect. If someone is holding the RunLock, the
resume call should block, rather than give up, and with the lock held,
switch the running state and report the old running state.

This patch removes ProcessRunLock::TrySetRunning and updates all callers
to use ProcessRunLock::SetRunning instead. To support that,
ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for
consistency) now report whether the process was stopped or running
respectively. Previously, both methods returned true unconditionally.

The old code has been around pretty much pretty much forever, there's
nothing in the git history to indicate that this was done purposely to
solve a particular issue. I've tested this on both Linux and macOS and
confirmed that this solves the statusline issue.

A big thank you to Jim for reviewing my proposed solution offline and
trying to poke holes in it.
var-const pushed a commit to ldionne/llvm-project that referenced this pull request Apr 17, 2025
…text (llvm#135458)

Make sure the process is stopped when computing the symbol context. Both
Adrian and Felipe reported a handful of crashes in GetSymbolContext
called from Statusline::Redraw on the default event thread.

Given that we're handling a StackFrameSP, it's not clear to me how that
could have gotten invalidated, but Jim points out that it doesn't make
sense to compute the symbol context for the frame when the process isn't
stopped.

Depends on  llvm#135455
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Apr 22, 2025
I traced the issue reported by Caroline and Pavel in llvm#134757 back to the
call to ProcessRunLock::TrySetRunning. When that fails, we get a
somewhat misleading error message:

> process resume at entry point failed: Resume request failed - process
still running.

This is incorrect: the problem was not that the process was in a running
state, but rather that the RunLock was being held by another thread
(i.e. the Statusline). TrySetRunning would return false in both cases
and the call site only accounted for the former.

Besides the odd semantics, the current implementation is inherently
race-y and I believe incorrect. If someone is holding the RunLock, the
resume call should block, rather than give up, and with the lock held,
switch the running state and report the old running state.

This patch removes ProcessRunLock::TrySetRunning and updates all callers
to use ProcessRunLock::SetRunning instead. To support that,
ProcessRunLock::SetRunning (and ProcessRunLock::SetStopped, for
consistency) now report whether the process was stopped or running
respectively. Previously, both methods returned true unconditionally.

The old code has been around pretty much pretty much forever, there's
nothing in the git history to indicate that this was done purposely to
solve a particular issue. I've tested this on both Linux and macOS and
confirmed that this solves the statusline issue.

A big thank you to Jim for reviewing my proposed solution offline and
trying to poke holes in it.

(cherry picked from commit d792094)
JDevlieghere added a commit to swiftlang/llvm-project that referenced this pull request Apr 23, 2025
…-d792094c26dc

[🍒 swift/release/6.2] [lldb] Remove ProcessRunLock::TrySetRunning (llvm#135455)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants