-
Notifications
You must be signed in to change notification settings - Fork 13.3k
Fix condvar.wait(distant future) return immediately on OSX #42604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Is year 3000 special in that sense that this is a maximum OS X will support? |
@nagisa OSX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an evidence of where the overflow is? The apparent source code for pthread_cond_timedwait
shells out to _pthread_cond_wait
which doesn't look like it has any overflows?
Using dtruss
indicates that psynch_cvwait
is failing with error 316. Headers don't seem to indicate what that is and related StackOverflow questions seem to also be similarly confused.
It seems that we don't know why this is failing and we're just blindly applying a workaround?
src/libstd/sync/condvar.rs
Outdated
let (g, timeout_res) = c.wait_timeout(g, Duration::from_millis(u32::MAX as u64)).unwrap(); | ||
let (g, timeout_res) = c.wait_timeout(g, Duration::from_millis(u64::MAX)).unwrap(); | ||
// spurious wakeups mean this isn't necessarily true | ||
// assert!(notified.load(Ordering::SeqCst)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can't have this assert, why have the Arc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So next person going to patch this part of the library could execute test with this line uncommented. I believe that prevous comment "spurious wakeups mean this isn't necessarily true" several lines above kept in the source for the same reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be removed in that case? It's surely easier to test this outside the standard library than inside, so this seems like it's just an extra diff for not a lot of value.
I believe, overflow is somewhere inside kernel. I couldn't found sources of latest kernel, but there's a source of 10.7.5. It has code like this:
which does not explain this particular issue, but shows that authors of xnu don't care much about integer overflows.
Or maybe it is not overflow, but an indication of parameter error. Whatever.
That's fine with me. BTW, now I think that it would probably be slightly better to wait with explicitly infinite timeout if duration is more than 1000 years, instead of rounding down to year 3000, but this version is OK too. |
No, rounding down is legal, because |
Going to update patch now to round duration down to 1000 years instead of rounding down deadline to year 3000. |
Done. |
src/libstd/sys/unix/condvar.rs
Outdated
@@ -116,7 +173,7 @@ impl Condvar { | |||
(sys_now.tv_usec * 1000) as libc::c_long; | |||
let extra = (nsec / 1_000_000_000) as libc::time_t; | |||
let nsec = nsec % 1_000_000_000; | |||
let seconds = dur.as_secs() as libc::time_t; | |||
let seconds = wrapping_cast_to_time_t(dur.as_secs()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should no longer be necessary due to the clamp above, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if time_t
is more than 32 bit (as in all modern OSes), this cast is not needed.
However, I'd keep this cast, because
- it is not performance critical
- makes code easier to read (reader does not need to check that there's no overflow)
- makes OSX/Linux versions more like to each other
src/libstd/sys/unix/condvar.rs
Outdated
// println!("$"); | ||
// } | ||
// } | ||
// ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty long comment to work with, could the program be extracted to something like a gist? Could the example program be C and not Rust?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewritten in C and moved to gist.
src/libstd/sys/unix/condvar.rs
Outdated
let max_dur = Duration::from_secs(1000 * 365 * 86400); | ||
|
||
if dur > max_dur { | ||
// Apparently `tv_sec` overflows somewhere around |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we 100% sure that this is an overflow somewhere? The fact that the kernel returns an error seems to indicate that this comment is not correct. I think we can basically just say that OSX is buggy with super long durations, so we clamp which is also allowable per the API of wait_timeout
because of spurious wakeups.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated comment to use diffent words.
src/libstd/sync/condvar.rs
Outdated
let (g, timeout_res) = c.wait_timeout(g, Duration::from_millis(u32::MAX as u64)).unwrap(); | ||
let (g, timeout_res) = c.wait_timeout(g, Duration::from_millis(u64::MAX)).unwrap(); | ||
// spurious wakeups mean this isn't necessarily true | ||
// assert!(notified.load(Ordering::SeqCst)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this be removed in that case? It's surely easier to test this outside the standard library than inside, so this seems like it's just an extra diff for not a lot of value.
Fixes issue rust-lang#37440: `pthread_cond_timedwait` on macOS Sierra seems to overflow `ts_sec` parameter and returns immediately. To work around this problem patch rounds timeout down to approximately 1000 years. Patch also fixes overflow when converting `u64` to `time_t`.
I've rewritten tests with loop until wake up is no spurious. |
@bors: r+ |
📌 Commit 0c26b59 has been approved by |
Fix condvar.wait(distant future) return immediately on OSX Fixes issue #37440: `pthread_cond_timedwait` on macOS Sierra seems to overflow `ts_sec` parameter and returns immediately. To work around this problem patch rounds timeout down to year 3000. Patch also fixes overflow when converting `u64` to `time_t`.
☀️ Test successful - status-appveyor, status-travis |
Fixes issue #37440:
pthread_cond_timedwait
on macOS Sierra seemsto overflow
ts_sec
parameter and returns immediately. To workaround this problem patch rounds timeout down to year 3000.
Patch also fixes overflow when converting
u64
totime_t
.