Closed
Description
Our mechanism for propagating failure between tasks is not fully realized, has race conditions, and imposes complex interactions between the runtime and libcore.
The current rules are something like this:
- Children that fail cause their parents to fail
- Parents that fail do not cause their children to fail
- Children can be 'unsupervised', after which their failure does not propagate to their parents
- A child that fails after its parent has died causes its grandparent (and so on up the chain) to fail
- The main task is supervised by the kernel
- When the kernel fails, all tasks are killed
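To make the rules above concrete, here is a minimal toy model of them, written as a sketch rather than a description of the actual runtime. Every name in it (`Kernel`, `Task`, `spawn`, `unsupervise`, `exit`, `fail`) is hypothetical. It also makes two assumptions the rules leave open: a failing supervised main task brings down the kernel, and the walk past dead ancestors ignores whether those ancestors were themselves supervised.

```rust
/// Hypothetical task identifier: an index into `Kernel::tasks`.
type TaskId = usize;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Running,
    Exited, // finished normally ("dead" but not failed)
    Failed,
}

struct Task {
    parent: Option<TaskId>, // None: supervised directly by the kernel
    supervised: bool,
    state: State,
}

/// Toy stand-in for the kernel plus its task tree.
struct Kernel {
    tasks: Vec<Task>,
    failed: bool,
}

impl Kernel {
    fn new() -> Self {
        Kernel { tasks: Vec::new(), failed: false }
    }

    /// New tasks start out supervised by their parent.
    fn spawn(&mut self, parent: Option<TaskId>) -> TaskId {
        self.tasks.push(Task { parent, supervised: true, state: State::Running });
        self.tasks.len() - 1
    }

    fn unsupervise(&mut self, id: TaskId) {
        self.tasks[id].supervised = false;
    }

    fn exit(&mut self, id: TaskId) {
        if self.tasks[id].state == State::Running {
            self.tasks[id].state = State::Exited;
        }
    }

    fn state(&self, id: TaskId) -> State {
        self.tasks[id].state
    }

    /// Fail a task and propagate upward. Children are never touched here:
    /// a failing parent does not fail its children.
    fn fail(&mut self, id: TaskId) {
        if self.tasks[id].state != State::Running {
            return;
        }
        self.tasks[id].state = State::Failed;
        if !self.tasks[id].supervised {
            return; // unsupervised: failure stops here
        }
        // Skip dead ancestors; fail the nearest one still running.
        let mut cur = self.tasks[id].parent;
        while let Some(p) = cur {
            if self.tasks[p].state == State::Running {
                self.fail(p);
                return;
            }
            cur = self.tasks[p].parent;
        }
        // Assumption: running out of ancestors means the kernel fails.
        self.fail_kernel();
    }

    /// Kernel failure kills every task that is still running.
    fn fail_kernel(&mut self) {
        self.failed = true;
        for t in &mut self.tasks {
            if t.state == State::Running {
                t.state = State::Failed;
            }
        }
    }
}
```

In this model, failing a supervised grandchild whose parent has already exited fails the grandparent, while failing a parent leaves its running children untouched unless the failure reaches the kernel.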
Issues:
- Parents should probably also cause children to fail
- Task notification (by which tasks can learn the fate of other tasks) is related, but its semantics are unclear when linked failure is involved.
- There is a known race between a child killing its parent and the parent blocking on a port, in which the parent can end up yielding forever.
- Kernel-level failure, upon which all tasks fail, is extremely racy and basically broken.
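The parent-blocking race in the list above has the shape of a classic lost wakeup: the kill request lands between the parent's "have I been killed?" check and the moment it actually blocks. The sketch below shows one common way to close that window, by checking the kill flag and going to sleep under the same lock. It is an illustration only, not the runtime's code or a proposed patch: `Port`, `Inbox`, `Wake`, `send`, `kill`, and `recv` are all hypothetical names, and it assumes OS threads with a standard `Mutex` and `Condvar` in place of the runtime's tasks and ports.

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};

/// Why a blocked receiver woke up.
#[derive(Debug, PartialEq)]
enum Wake {
    Message(u32),
    Killed,
}

/// Queue and kill flag live behind one lock so they are observed together.
#[derive(Default)]
struct Inbox {
    queue: VecDeque<u32>,
    killed: bool,
}

/// Hypothetical port: a mailbox that can also deliver a kill request.
#[derive(Default)]
struct Port {
    inbox: Mutex<Inbox>,
    cond: Condvar,
}

impl Port {
    fn send(&self, value: u32) {
        self.inbox.lock().unwrap().queue.push_back(value);
        self.cond.notify_all();
    }

    /// Called on behalf of a failing child to kill the owner of the port.
    fn kill(&self) {
        self.inbox.lock().unwrap().killed = true;
        self.cond.notify_all();
    }

    /// Block until a message arrives or the owner is killed.
    /// The flag is re-checked while holding the lock on every wakeup, so a
    /// kill that arrives just before the wait cannot be missed.
    fn recv(&self) -> Wake {
        let mut inbox = self.inbox.lock().unwrap();
        loop {
            if inbox.killed {
                return Wake::Killed;
            }
            if let Some(value) = inbox.queue.pop_front() {
                return Wake::Message(value);
            }
            // Atomically releases the lock and sleeps; reacquires on wakeup.
            inbox = self.cond.wait(inbox).unwrap();
        }
    }
}
```

Whichever side runs first, `recv` returns `Wake::Killed` once `kill` has been called: either the flag is already set when the lock is taken, or the receiver is already waiting and gets the notification.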
See also: #1788 (redesign task API), #1857 (task-local data), #1078 (task notification)