Revisit linked failure mechanism #1868

Closed
@brson

Our mechanism for propagating failure between tasks is not fully realized, has race conditions, and imposes complex interactions between the runtime and libcore.

The current rules are something like this:

  • Children that fail cause their parents to fail
  • Parents that fail do not cause their children to fail
  • Children can be 'unsupervised', after which their failure does not propagate to their parents
  • If a failing child's parent is already dead, the failure propagates to the grandparent instead (and so on up the tree)
  • The main task is supervised by the kernel
  • When the kernel fails all tasks are killed
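
To make the rules above concrete, here is a minimal sketch that models them as a small supervision tree. Every name in it (`Kernel`, `Task`, `spawn`, `unsupervise`, `fail`, and so on) is hypothetical and chosen for illustration; none of it is the actual runtime or libcore API. It also assumes that a dead ancestor is simply skipped during propagation and that an unsupervised link stops propagation even when the task on it is already dead, neither of which is stated above.

```rust
// Hypothetical model of the propagation rules; not the real runtime API.
type TaskId = usize;
const MAIN: TaskId = 0;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Running,
    Exited, // finished normally; counts as "dead"
    Failed, // failed or killed; also "dead"
}

struct Task {
    parent: Option<TaskId>, // None means "supervised directly by the kernel"
    supervised: bool,       // false once the task has been unsupervised
    state: State,
}

struct Kernel {
    tasks: Vec<Task>,
    failed: bool,
}

impl Kernel {
    // The kernel starts with only the main task, which it supervises.
    fn new() -> Kernel {
        let main = Task { parent: None, supervised: true, state: State::Running };
        Kernel { tasks: vec![main], failed: false }
    }

    fn spawn(&mut self, parent: TaskId) -> TaskId {
        self.tasks.push(Task { parent: Some(parent), supervised: true, state: State::Running });
        self.tasks.len() - 1
    }

    fn unsupervise(&mut self, id: TaskId) {
        self.tasks[id].supervised = false;
    }

    fn exit(&mut self, id: TaskId) {
        if self.tasks[id].state == State::Running {
            self.tasks[id].state = State::Exited;
        }
    }

    fn state(&self, id: TaskId) -> State {
        self.tasks[id].state
    }

    // Fail a task and propagate upward only; children are never touched here.
    fn fail(&mut self, id: TaskId) {
        if self.tasks[id].state != State::Running {
            return;
        }
        self.tasks[id].state = State::Failed;
        let mut current = id;
        // Walk up the tree until an unsupervised link stops the propagation.
        while self.tasks[current].supervised {
            let parent = self.tasks[current].parent;
            match parent {
                Some(p) => {
                    // A running parent fails; a dead parent is skipped so the
                    // failure reaches the grandparent on the next iteration.
                    if self.tasks[p].state == State::Running {
                        self.tasks[p].state = State::Failed;
                    }
                    current = p;
                }
                None => {
                    // Reached the main task: the kernel fails, killing everything.
                    self.fail_kernel();
                    return;
                }
            }
        }
    }

    fn fail_kernel(&mut self) {
        self.failed = true;
        for t in self.tasks.iter_mut() {
            if t.state == State::Running {
                t.state = State::Failed;
            }
        }
    }
}

// main -> a (unsupervised) -> b -> c, then b fails.
fn demo() -> (State, State, State) {
    let mut k = Kernel::new();
    let a = k.spawn(MAIN);
    k.unsupervise(a);
    let b = k.spawn(a);
    let c = k.spawn(b);
    k.fail(b);
    (k.state(MAIN), k.state(a), k.state(c))
}
```

In this model `demo()` returns `(Running, Failed, Running)`: `b`'s failure takes down its parent `a`, stops at `a`'s unsupervised link so the main task survives, and leaves `b`'s child `c` running. One thing the sketch makes visible is that without an unsupervised link somewhere on the path, any single failure climbs to the main task and from there kills every task through the kernel.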

Issues:

  • Parents should probably also cause children to fail
  • Task notification (by which tasks can learn the fate of other tasks) is related, but its current semantics are unclear when linked failure is involved.
  • There is a known race between a child killing its parent and the parent blocking on a port, which can leave the parent yielding forever.
  • Kernel-level failure, upon which all tasks fail, is extremely racy and basically broken.
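
The issue does not spell out how the child-kills-parent race arises. As an illustration only, the sketch below shows one way such a race could happen, a classic lost wakeup, using a deterministic single-threaded model rather than real tasks. The `Parent` type and all of its methods are hypothetical, and the actual runtime bug may have a different cause.

```rust
// Hypothetical, single-threaded model of a lost wakeup; not runtime code.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ParentState {
    Running,
    Blocked, // waiting on a port
    Failed,  // unwound after being killed
}

struct Parent {
    state: ParentState,
    kill_requested: bool,
}

impl Parent {
    fn new() -> Parent {
        Parent { state: ParentState::Running, kill_requested: false }
    }

    // Parent side, step 1: check for a pending kill before blocking.
    fn check_killed(&self) -> bool {
        self.kill_requested
    }

    // Parent side, step 2: go to sleep on the port.
    fn block(&mut self) {
        self.state = ParentState::Blocked;
    }

    // Child side: record the kill and wake the parent only if it is
    // already blocked. If it is not blocked yet, no wakeup is delivered.
    fn kill(&mut self) {
        self.kill_requested = true;
        if self.state == ParentState::Blocked {
            self.state = ParentState::Failed;
        }
    }

    // Assumed fix: check and block as one indivisible step.
    fn block_unless_killed(&mut self) {
        if self.kill_requested {
            self.state = ParentState::Failed;
        } else {
            self.state = ParentState::Blocked;
        }
    }
}

// The child's kill lands between the parent's check and its block.
fn racy_interleaving() -> ParentState {
    let mut p = Parent::new();
    let killed = p.check_killed(); // parent: no kill pending
    p.kill();                      // child: sets the flag, parent not blocked yet
    if !killed {
        p.block();                 // parent: blocks on a stale check
    }
    p.state
}

// With an atomic check-and-block, the same kill is observed.
fn atomic_interleaving() -> ParentState {
    let mut p = Parent::new();
    p.kill();
    p.block_unless_killed();
    p.state
}
```

In this model `racy_interleaving()` ends with the parent `Blocked` even though a kill was requested, which corresponds to the parent yielding forever, while `atomic_interleaving()` ends with it `Failed`. A real fix would need the check and the block to happen under the same lock or an equivalent atomic transition.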

See also: #1788 (redesign task API), #1857 (task-local data), #1078 (task notification)

Labels: A-runtime (Area: std's runtime and "pre-main" init for handling backtraces, unwinds, stack overflows)