
Commit 986df44

auto merge of #8195 : bblum/rust/task-cleanup, r=brson
In the first commit it is obvious why some of the barriers can be changed to ```Relaxed```, but it is not as obvious for the ones I changed in ```kill.rs```. The rationale for those is documented as part of the documenting commit. Also, the last commit is a temporary hack to prevent kill signals from being received in taskgroup cleanup code, which could be fixed in a more principled way once the old runtime is gone.
2 parents: af97339 + 963d37e
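
The distinction the message leans on (a lone swap on a single flag needs only atomicity, while publishing a payload across threads needs a release/acquire pair) can be sketched in modern Rust. This is an illustration with made-up names, using today's `std::sync::atomic` rather than the `unstable::atomics` API of the time:

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn main() {
    // Case 1: lone flag. Two threads race to swap in a "claimed" marker;
    // exactly one wins. No other memory location is being published through
    // the flag, so Relaxed is enough: the hardware swap is atomic either way.
    let claimed = Arc::new(AtomicBool::new(false));
    let handles: Vec<_> = (0..2).map(|i| {
        let claimed = claimed.clone();
        thread::spawn(move || {
            if !claimed.swap(true, Ordering::Relaxed) {
                println!("thread {} won the race", i);
            }
        })
    }).collect();
    for h in handles { h.join().unwrap(); }

    // Case 2: payload handoff. The write to PAYLOAD must be ordered before
    // the flag store, and the reader's flag load before its payload read,
    // so here a Release/Acquire barrier pair is required.
    static PAYLOAD: AtomicUsize = AtomicUsize::new(0);
    static READY: AtomicBool = AtomicBool::new(false);
    let t = thread::spawn(|| {
        while !READY.load(Ordering::Acquire) {} // acquire: see the payload
        assert_eq!(PAYLOAD.load(Ordering::Relaxed), 42);
    });
    PAYLOAD.store(42, Ordering::Relaxed);
    READY.store(true, Ordering::Release); // release: publish the payload
    t.join().unwrap();
}
```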

4 files changed (+94, -29 lines)

src/libstd/rt/comm.rs (7 additions, 9 deletions)

```diff
@@ -18,7 +18,7 @@ use kinds::Send;
 use rt::sched::Scheduler;
 use rt::local::Local;
 use rt::select::{Select, SelectPort};
-use unstable::atomics::{AtomicUint, AtomicOption, Acquire, Release, SeqCst};
+use unstable::atomics::{AtomicUint, AtomicOption, Acquire, Relaxed, SeqCst};
 use unstable::sync::UnsafeAtomicRcBox;
 use util::Void;
 use comm::{GenericChan, GenericSmartChan, GenericPort, Peekable};
@@ -217,15 +217,15 @@ impl<T> Select for PortOne<T> {
             }
             STATE_ONE => {
                 // Re-record that we are the only owner of the packet.
-                // Release barrier needed in case the task gets reawoken
-                // on a different core (this is analogous to writing a
-                // payload; a barrier in enqueueing the task protects it).
+                // No barrier needed, even if the task gets reawoken
+                // on a different core -- this is analogous to writing a
+                // payload; a barrier in enqueueing the task protects it.
                 // NB(#8132). This *must* occur before the enqueue below.
                 // FIXME(#6842, #8130) This is usually only needed for the
                 // assertion in recv_ready, except in the case of select().
                 // This won't actually ever have cacheline contention, but
                 // maybe should be optimized out with a cfg(test) anyway?
-                (*self.packet()).state.store(STATE_ONE, Release);
+                (*self.packet()).state.store(STATE_ONE, Relaxed);

                 rtdebug!("rendezvous recv");
                 sched.metrics.rendezvous_recvs += 1;
@@ -300,7 +300,7 @@ impl<T> SelectPort<T> for PortOne<T> {
         unsafe {
             // See corresponding store() above in block_on for rationale.
             // FIXME(#8130) This can happen only in test builds.
-            assert!((*packet).state.load(Acquire) == STATE_ONE);
+            assert!((*packet).state.load(Relaxed) == STATE_ONE);

             let payload = (*packet).payload.take();

@@ -375,9 +375,7 @@ impl<T> Drop for PortOne<T> {
                 // receiver was killed awake. The task can't still be
                 // blocked (we are it), but we need to free the handle.
                 let recvr = BlockedTask::cast_from_uint(task_as_state);
-                // FIXME(#7554)(bblum): Make this cfg(test) dependent.
-                // in a later commit.
-                assert!(recvr.wake().is_none());
+                recvr.assert_already_awake();
             }
         }
     }
```
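
The comment's claim is that the barrier pair lives in the task-enqueue step, so the state store itself can be `Relaxed`. A minimal sketch of that shape, assuming a channel as a stand-in for the scheduler's enqueue (`STATE_ONE`, the waker, and the scheduler end are all illustrative, not the real runtime types):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;

const STATE_ONE: usize = 1;

fn main() {
    let state = Arc::new(AtomicUsize::new(0));
    let (enqueue, scheduler) = mpsc::channel::<()>();

    let s = state.clone();
    let waker = thread::spawn(move || {
        // Relaxed is fine *only because* the send() below acts like the
        // runtime's task-enqueue barrier: everything sequenced before it is
        // visible to whoever receives the "task".
        s.store(STATE_ONE, Ordering::Relaxed);
        enqueue.send(()).unwrap();
    });

    scheduler.recv().unwrap(); // "task gets rescheduled on another core"
    // Visible despite both sides using Relaxed, thanks to the channel's
    // internal synchronization; the analogue of the assertion in recv_ready.
    assert_eq!(state.load(Ordering::Relaxed), STATE_ONE);
    waker.join().unwrap();
}
```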

src/libstd/rt/kill.rs (76 additions, 12 deletions)

```diff
@@ -8,16 +8,73 @@
 // option. This file may not be copied, modified, or distributed
 // except according to those terms.

-//! Task death: asynchronous killing, linked failure, exit code propagation.
+/*!
+
+Task death: asynchronous killing, linked failure, exit code propagation.
+
+This file implements two orthogonal building-blocks for communicating failure
+between tasks. One is 'linked failure' or 'task killing', that is, a failing
+task causing other tasks to fail promptly (even those that are blocked on
+pipes or I/O). The other is 'exit code propagation', which affects the result
+observed by the parent of a task::try task that itself spawns child tasks
+(such as any #[test] function). In both cases the data structures live in
+KillHandle.
+
+I. Task killing.
+
+The model for killing involves two atomic flags, the "kill flag" and the
+"unkillable flag". Operations on the kill flag include:
+
+- In the taskgroup code (task/spawn.rs), tasks store a clone of their
+  KillHandle in their shared taskgroup. Another task in the group that fails
+  will use that handle to call kill().
+- When a task blocks, it turns its ~Task into a BlockedTask by storing the
+  transmuted ~Task pointer inside the KillHandle's kill flag. A task
+  trying to block and a task trying to kill it can simultaneously access the
+  kill flag, after which the task will get scheduled and fail (no matter who
+  wins the race). Likewise, a task trying to wake a blocked task normally and
+  a task trying to kill it can simultaneously access the flag; only one will
+  get the task to reschedule it.
+
+Operations on the unkillable flag include:
+
+- When a task becomes unkillable, it swaps on the flag to forbid any killer
+  from waking it up while it's blocked inside the unkillable section. If a
+  kill was already pending, the task fails instead of becoming unkillable.
+- When a task is done being unkillable, it restores the flag to the normal
+  running state. If a kill was received-but-blocked during the unkillable
+  section, the task fails at this later point.
+- When a task tries to kill another task, before swapping on the kill flag, it
+  first swaps on the unkillable flag, to see if it's "allowed" to wake up the
+  task. If it isn't, the killed task will receive the signal when it becomes
+  killable again. (Of course, a task trying to wake the task normally (e.g.
+  sending on a channel) does not access the unkillable flag at all.)
+
+Why do we not need acquire/release barriers on any of the kill flag swaps?
+This is because barriers establish orderings between accesses on different
+memory locations, but each kill-related operation is only a swap on a single
+location, so atomicity is all that matters. The exception is kill(), which
+does a swap on both flags in sequence. kill() needs no barriers because it
+does not matter if its two accesses are seen reordered on another CPU: if a
+killer does perform both writes, it means it saw a KILL_RUNNING in the
+unkillable flag, which means an unkillable task will see KILL_KILLED and fail
+immediately (rendering the subsequent write to the kill flag unnecessary).
+
+II. Exit code propagation.
+
+FIXME(#7544): Decide on the ultimate model for this and document it.
+
+*/

 use cast;
 use cell::Cell;
 use either::{Either, Left, Right};
 use option::{Option, Some, None};
 use prelude::*;
 use rt::task::Task;
+use task::spawn::Taskgroup;
 use to_bytes::IterBytes;
-use unstable::atomics::{AtomicUint, Acquire, SeqCst};
+use unstable::atomics::{AtomicUint, Relaxed};
 use unstable::sync::{UnsafeAtomicRcBox, LittleLock};
 use util;
```
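
A condensed, hypothetical model of the two flags described above, in modern Rust (`KillFlags`, the return types, and `main` are illustrative; only the constant names mirror the doc comment). Every operation is a lone swap or compare-and-swap on one location, which is why `Relaxed` suffices throughout:

```rust
use std::sync::atomic::{AtomicUsize, Ordering::Relaxed};

const KILL_RUNNING: usize = 0;
const KILL_KILLED: usize = 1;
const KILL_UNKILLABLE: usize = 2;

struct KillFlags {
    unkillable: AtomicUsize, // gates whether a killer may proceed
    killed: AtomicUsize,     // holds RUNNING, KILLED, or a blocked-task "ptr"
}

impl KillFlags {
    // Mirrors kill(): swap on the unkillable flag, then (if allowed) on the
    // kill flag. Per the rationale above, reordering of the two writes is
    // harmless, so no barriers are needed.
    fn kill(&self) -> Option<usize> {
        if self.unkillable.swap(KILL_KILLED, Relaxed) == KILL_RUNNING {
            match self.killed.swap(KILL_KILLED, Relaxed) {
                KILL_RUNNING | KILL_KILLED => None, // not blocked, or handled
                task_ptr => Some(task_ptr),         // got the blocked task
            }
        } else {
            None // target is unkillable; it sees the signal when it re-arms
        }
    }

    // Mirrors inhibit_kill(): CAS RUNNING -> UNKILLABLE. Seeing KILLED here
    // means a kill was already pending and the task should fail instead.
    fn inhibit_kill(&self) -> Result<(), ()> {
        self.unkillable
            .compare_exchange(KILL_RUNNING, KILL_UNKILLABLE, Relaxed, Relaxed)
            .map(|_| ())
            .map_err(|_| ())
    }
}

fn main() {
    let f = KillFlags {
        unkillable: AtomicUsize::new(KILL_RUNNING),
        killed: AtomicUsize::new(KILL_RUNNING),
    };
    assert!(f.kill().is_none());        // not blocked: nothing to punt awake
    assert!(f.inhibit_kill().is_err()); // kill already pending: must fail
}
```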

src/libstd/rt/kill.rs (continued)

```diff
@@ -95,7 +152,7 @@ impl Drop for KillFlag {
     // Letting a KillFlag with a task inside get dropped would leak the task.
     // We could free it here, but the task should get awoken by hand somehow.
     fn drop(&self) {
-        match self.load(Acquire) {
+        match self.load(Relaxed) {
             KILL_RUNNING | KILL_KILLED => { },
             _ => rtabort!("can't drop kill flag with a blocked task inside!"),
         }
@@ -124,7 +181,7 @@ impl BlockedTask {
             Unkillable(task) => Some(task),
             Killable(flag_arc) => {
                 let flag = unsafe { &mut **flag_arc.get() };
-                match flag.swap(KILL_RUNNING, SeqCst) {
+                match flag.swap(KILL_RUNNING, Relaxed) {
                     KILL_RUNNING => None, // woken from select(), perhaps
                     KILL_KILLED => None, // a killer stole it already
                     task_ptr =>
@@ -159,7 +216,7 @@ impl BlockedTask {
             let flag = &mut **flag_arc.get();
             let task_ptr = cast::transmute(task);
             // Expect flag to contain RUNNING. If KILLED, it should stay KILLED.
-            match flag.compare_and_swap(KILL_RUNNING, task_ptr, SeqCst) {
+            match flag.compare_and_swap(KILL_RUNNING, task_ptr, Relaxed) {
                 KILL_RUNNING => Right(Killable(flag_arc)),
                 KILL_KILLED => Left(revive_task_ptr(task_ptr, Some(flag_arc))),
                 x => rtabort!("can't block task! kill flag = %?", x),
@@ -257,7 +314,7 @@ impl KillHandle {
         let inner = unsafe { &mut *self.get() };
         // Expect flag to contain RUNNING. If KILLED, it should stay KILLED.
         // FIXME(#7544)(bblum): is it really necessary to prohibit double kill?
-        match inner.unkillable.compare_and_swap(KILL_RUNNING, KILL_UNKILLABLE, SeqCst) {
+        match inner.unkillable.compare_and_swap(KILL_RUNNING, KILL_UNKILLABLE, Relaxed) {
             KILL_RUNNING => { }, // normal case
             KILL_KILLED => if !already_failing { fail!(KILLED_MSG) },
             _ => rtabort!("inhibit_kill: task already unkillable"),
@@ -270,7 +327,7 @@ impl KillHandle {
         let inner = unsafe { &mut *self.get() };
         // Expect flag to contain UNKILLABLE. If KILLED, it should stay KILLED.
         // FIXME(#7544)(bblum): is it really necessary to prohibit double kill?
-        match inner.unkillable.compare_and_swap(KILL_UNKILLABLE, KILL_RUNNING, SeqCst) {
+        match inner.unkillable.compare_and_swap(KILL_UNKILLABLE, KILL_RUNNING, Relaxed) {
             KILL_UNKILLABLE => { }, // normal case
             KILL_KILLED => if !already_failing { fail!(KILLED_MSG) },
             _ => rtabort!("allow_kill: task already killable"),
@@ -281,10 +338,10 @@ impl KillHandle {
     // if it was blocked and needs punted awake. To be called by other tasks.
     pub fn kill(&mut self) -> Option<~Task> {
         let inner = unsafe { &mut *self.get() };
-        if inner.unkillable.swap(KILL_KILLED, SeqCst) == KILL_RUNNING {
+        if inner.unkillable.swap(KILL_KILLED, Relaxed) == KILL_RUNNING {
             // Got in. Allowed to try to punt the task awake.
             let flag = unsafe { &mut *inner.killed.get() };
-            match flag.swap(KILL_KILLED, SeqCst) {
+            match flag.swap(KILL_KILLED, Relaxed) {
                 // Task either not blocked or already taken care of.
                 KILL_RUNNING | KILL_KILLED => None,
                 // Got ownership of the blocked task.
@@ -306,8 +363,11 @@ impl KillHandle {
         // is unkillable with a kill signal pending.
         let inner = unsafe { &*self.get() };
         let flag = unsafe { &*inner.killed.get() };
-        // FIXME(#6598): can use relaxed ordering (i think)
-        flag.load(Acquire) == KILL_KILLED
+        // A barrier-related concern here is that a task that gets killed
+        // awake needs to see the killer's write of KILLED to this flag. This
+        // is analogous to receiving a pipe payload; the appropriate barrier
+        // should happen when enqueueing the task.
+        flag.load(Relaxed) == KILL_KILLED
     }

     pub fn notify_immediate_failure(&mut self) {
@@ -415,7 +475,7 @@ impl Death {
     }

     /// Collect failure exit codes from children and propagate them to a parent.
-    pub fn collect_failure(&mut self, mut success: bool) {
+    pub fn collect_failure(&mut self, mut success: bool, group: Option<Taskgroup>) {
         // This may run after the task has already failed, so even though the
         // task appears to need to be killed, the scheduler should not fail us
         // when we block to unwrap.
@@ -425,6 +485,10 @@ impl Death {
         rtassert!(self.unkillable == 0);
         self.unkillable = 1;

+        // FIXME(#7544): See corresponding fixme at the callsite in task.rs.
+        // NB(#8192): Doesn't work with "let _ = ..."
+        { use util; util::ignore(group); }
+
         // Step 1. Decide if we need to collect child failures synchronously.
         do self.on_exit.take_map |on_exit| {
             if success {
```

src/libstd/rt/task.rs (7 additions, 2 deletions)

```diff
@@ -212,8 +212,13 @@ impl Task {
     pub fn run(&mut self, f: &fn()) {
         rtdebug!("run called on task: %u", borrow::to_uint(self));
         self.unwinder.try(f);
-        { let _ = self.taskgroup.take(); }
-        self.death.collect_failure(!self.unwinder.unwinding);
+        // FIXME(#7544): We pass the taskgroup into death so that it can be
+        // dropped while the unkillable counter is set. This should not be
+        // necessary except for an extraneous clone() in task/spawn.rs that
+        // causes a killhandle to get dropped, which mustn't receive a kill
+        // signal since we're outside of the unwinder's try() scope.
+        // { let _ = self.taskgroup.take(); }
+        self.death.collect_failure(!self.unwinder.unwinding, self.taskgroup.take());
         self.destroy();
     }
```
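
The hack exists purely to control where a drop happens. A minimal sketch of the same pattern in modern Rust, where `std::mem::drop` plays the role of `util::ignore` (`Guard` and the printed counter are illustrative only):

```rust
struct Guard(&'static str);

impl Drop for Guard {
    fn drop(&mut self) {
        println!("dropping {}", self.0);
    }
}

fn main() {
    let group = Guard("taskgroup");
    println!("unkillable = 1"); // enter the protected region
    drop(group);                // force Drop to run here, inside the region
    println!("unkillable = 0"); // leave the protected region
    // Binding the value instead (`let _g = group;`) would postpone Drop to
    // the end of main, i.e. after the counter was cleared: too late.
}
```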

src/libstd/unstable/sync.rs (4 additions, 6 deletions)

```diff
@@ -16,7 +16,7 @@ use ptr;
 use option::*;
 use either::{Either, Left, Right};
 use task;
-use unstable::atomics::{AtomicOption,AtomicUint,Acquire,Release,SeqCst};
+use unstable::atomics::{AtomicOption,AtomicUint,Acquire,Release,Relaxed,SeqCst};
 use unstable::finally::Finally;
 use ops::Drop;
 use clone::Clone;
@@ -95,8 +95,7 @@ impl<T: Send> UnsafeAtomicRcBox<T> {
     pub fn get(&self) -> *mut T {
         unsafe {
             let mut data: ~AtomicRcBoxData<T> = cast::transmute(self.data);
-            // FIXME(#6598) Change Acquire to Relaxed.
-            assert!(data.count.load(Acquire) > 0);
+            assert!(data.count.load(Relaxed) > 0);
             let r: *mut T = data.data.get_mut_ref();
             cast::forget(data);
             return r;
@@ -107,7 +106,7 @@ impl<T: Send> UnsafeAtomicRcBox<T> {
     pub fn get_immut(&self) -> *T {
         unsafe {
             let data: ~AtomicRcBoxData<T> = cast::transmute(self.data);
-            assert!(data.count.load(Acquire) > 0); // no barrier is really needed
+            assert!(data.count.load(Relaxed) > 0);
             let r: *T = data.data.get_ref();
             cast::forget(data);
             return r;
@@ -130,8 +129,7 @@ impl<T: Send> UnsafeAtomicRcBox<T> {
         // Try to put our server end in the unwrapper slot.
         // This needs no barrier -- it's protected by the release barrier on
         // the xadd, and the acquire+release barrier in the destructor's xadd.
-        // FIXME(#6598) Change Acquire to Relaxed.
-        if data.unwrapper.fill(~(c1,p2), Acquire).is_none() {
+        if data.unwrapper.fill(~(c1,p2), Relaxed).is_none() {
             // Got in. Tell this handle's destructor not to run (we are now it).
             this.data = ptr::mut_null();
             // Drop our own reference.
```
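
The refcount assertions follow the same logic: a count loaded only as a sanity check orders nothing, because the caller's own handle already pins it above zero. A small illustrative sketch (this `RcBox` is a stand-in, not the real `UnsafeAtomicRcBox`):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

struct RcBox<T> {
    count: AtomicUsize,
    data: T,
}

impl<T> RcBox<T> {
    fn get(&self) -> &T {
        // Holding &self proves at least one reference exists, so this
        // assertion can use Relaxed; Acquire would buy nothing here.
        assert!(self.count.load(Ordering::Relaxed) > 0);
        &self.data
    }
}

fn main() {
    let b = RcBox { count: AtomicUsize::new(1), data: 42 };
    assert_eq!(*b.get(), 42);
}
```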
