Occasional swiftc crash on Windows, "disposed a muxnote with an active thread" #844

Open
@z2oh

After upgrading our Azure CI machines to the new Azure Cobalt ARM64 processors, we started seeing frequent compiler crashes when building a large Swift project. After some investigation, the culprit appears to be a lifecycle violation in libdispatch's Windows pipe-handling code.

The crashing line: https://github.com/apple/swift-corelibs-libdispatch/blob/e85f6a0d5c9ea1f32f5013c3fa34e4fc146cd0eb/src/event/event_windows.c#L240

And the stack trace:

[Inline Frame] dispatch.dll!_dispatch_muxnote_dispose(dispatch_muxnote_s * dmn) Line 240	C
[Inline Frame] dispatch.dll!_dispatch_muxnote_release(dispatch_muxnote_s * dmn) Line 265	C
[Inline Frame] dispatch.dll!_dispatch_event_merge_pipe_handle_read(dispatch_muxnote_s * dmn, unsigned long dwBytesAvailable) Line 669	C
dispatch.dll!_dispatch_event_loop_drain(unsigned int flags) Line 915	C
dispatch.dll!_dispatch_mgr_invoke() Line 5419	C
dispatch.dll!_dispatch_mgr_thread(dispatch_lane_s * dq, dispatch_invoke_context_s * dic, <unnamed-tag> flags) Line 5447	C
[Inline Frame] dispatch.dll!_dispatch_continuation_pop_inline(dispatch_object_t dou, dispatch_invoke_context_s * dic, <unnamed-tag> flags, dispatch_queue_class_t dqu) Line 2496	C
dispatch.dll!_dispatch_root_queue_drain(dispatch_queue_global_s * dq, unsigned int pri, <unnamed-tag> flags) Line 6114	C
dispatch.dll!_dispatch_worker_thread(void * context) Line 6250	C
dispatch.dll!_dispatch_worker_thread_thunk(void * lpParameter) Line 6272	C
[External Code]	

I suspect this is not a Cobalt/ARM64-specific issue, but rather a long-standing bug that has become common on this particular line of CPUs due to some scheduling or timing change.

The interesting section is here:
https://github.com/apple/swift-corelibs-libdispatch/blob/e85f6a0d5c9ea1f32f5013c3fa34e4fc146cd0eb/src/event/event_windows.c#L667-L669

The event set here is used to synchronize with the pipe monitoring thread, which itself calls _dispatch_muxnote_retain. Perhaps a change in timing affected the typical order of operations here, although I haven't been able to prove this yet.
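
To make the suspected ordering problem concrete, here is a minimal single-threaded model of it. This is not the libdispatch code: `struct note`, `note_retain`, `note_release`, and `replay` are hypothetical names invented for illustration, and the real implementation crashes rather than returning a status. The sketch only assumes what the crash message implies, namely that disposing the object while its monitoring thread is still active is treated as an error.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a muxnote: a reference count plus a flag that
 * stays set while a monitoring thread is still using the object. */
struct note {
    int refs;
    bool thread_active;
};

enum release_result {
    NOTE_ALIVE,                       /* other references remain */
    NOTE_DISPOSED,                    /* last reference dropped, safe */
    NOTE_DISPOSED_WITH_ACTIVE_THREAD  /* last reference dropped too early */
};

static void note_retain(struct note *n) {
    assert(n->refs > 0); /* retaining an already-disposed object is a bug */
    n->refs++;
}

static enum release_result note_release(struct note *n) {
    if (--n->refs > 0) {
        return NOTE_ALIVE;
    }
    /* The model reports the violation instead of crashing. */
    return n->thread_active ? NOTE_DISPOSED_WITH_ACTIVE_THREAD : NOTE_DISPOSED;
}

/* Deterministically replays one of two interleavings between the monitoring
 * thread (which retains) and the event handler (which releases). */
static enum release_result replay(bool monitor_retains_first) {
    struct note n = { .refs = 1, .thread_active = true };
    if (monitor_retains_first) {
        note_retain(&n);         /* monitor takes its reference in time */
        return note_release(&n); /* handler drops its own reference */
    }
    return note_release(&n);     /* handler drops the last reference first */
}
```

In this model `replay(true)` leaves the object alive, while `replay(false)` disposes it with the thread still active. If the real code relies on the retain usually winning that race, a scheduling change could plausibly flip the order, but that remains a hypothesis until the logs confirm it.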

I'm trying to reproduce the crash under LIBDISPATCH_LOG to get some more information.
