Skip to content

Unix PAL POSIX Mutex lock should check its return codes for soundness #120147

Closed
@cbiffle

Description

@cbiffle

Hi! We've run into a subtle Mutex bug on a POSIX platform, and it's led me to propose a small change to the pthread_mutex-based PAL code.

Shout-out to @papertigers for noticing this and coming up with a reduction test case.

Details and background

The current lock routine, for instance, for the generic non-Futex Unix PAL is here:

let r = libc::pthread_mutex_lock(raw(self));
debug_assert_eq!(r, 0);

Since we're in std, that debug_assert_eq will be compiled out in almost all cases. This presents a soundness problem on non-Futex POSIX platforms where a non-error-checked pthread_mutex_lock call may fail. Since the PAL is specifying a PTHREAD_MUTEX_NORMAL (a non-error-checked mutex), you might expect lock to be unable to fail.

However, POSIX platforms in the wild have interpreted pthread_mutex_lock's semantics more loosely, and provide deadlock detection on lock for non-checked mutexes. This specifically affects versions of Solaris dating back to before the open source release (the point at which my ability to follow history ends), and various Solaris-descended operating systems, including Illumos (where this was noticed).

This means that on Solaris, Illumos, and kin, the current Rust Mutex implementation is unsound, because it can be used to generate aliasing &muts:

let m1 = Mutex::new();

let mut lock1 = m1.lock().unwrap();
// On a strict POSIX compliant platform this next line will hang.
// On Solaris-derived kernels it will instead succeed.
let mut lock2 = m1.lock().unwrap();

*lock1 = 42;
*lock2 = 0;

// We're well into UB land here, making these statements
// difficult to predict:
println!("{}", *lock1);
println!("{}", *lock2);

To be clear -- my interpretation of the POSIX standard says that these platforms are violating it. We have filed a bug reflecting this against Illumos, so that with any luck some future versions will behave differently. But there are a number of systems in the wild that may never see that fix, and it'd be good (IMO) to have Mutex be correct there too. I look forward to the world where I only have to write code for correct operating systems, but we're not there yet. :-)

Proposed change

Because the correctness of the POSIX lock operation is critical for soundness in Rust, I propose that the PAL check the return code from pthread_mutex_lock, under the banner of "wouldn't you want to know if this assumption were violated?"

Concretely, I'd like to switch that check to a full-fledged assert rather than a debug_assert. I'm writing this bug to open a discussion and see what folks would want to see before this happens, or if anyone has a good alternative. For instance, would folks feel strongly about this change having a platform-specific cfg to enable it, assuming that only a named subset of POSIX platforms will ever use this interpretation of the standard? (I'd be more comfortable opting specific POSIX platforms out of the check, personally, or making it unconditional -- but let's talk!)

As far as I can tell, the posix PAL is used primarily, in terms of numbers, by Darwin. Linux and the common BSDs do not rely on it. Darwin's mutex does not behave the same way as the Solaris-descended ones, so the soundness issue does not affect Darwin. As a result I could imagine opting Darwin out of the test, if folks are concerned about a performance hit.

I'm not sure if the current platform support has any kind of shared set of "POSIX interpretation quirks" that generates features I could use instead of target_os; if that's a thing, point me at it.

Thanks in advance!

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-threadArea: `std::thread`C-bugCategory: This is a bug.I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessO-unixOperating system: Unix-likeT-libsRelevant to the library team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions