Closed
I think my team and I have stumbled upon a deadlock bug in LDK. It goes like this:
1. We call `ChainMonitor::channel_monitor_updated()`.
2. `ChannelManager::get_and_clear_pending_msg_events()` eventually gets called, takes a read lock on `total_consistency_lock`, and calls `process_pending_monitor_events()`.
3. One of the pending monitor events is `MonitorEvent::Completed`, so `ChannelManager::channel_monitor_updated()` is called, which also takes a read lock on `total_consistency_lock`.
If another concurrent task tries to take a write lock between the two read locks in steps 2 and 3, a deadlock can occur, depending on the queuing policy of the OS: the writer waits for the first read lock to be released, while the second read lock queues behind the waiting writer. On my machine (macOS) I never experienced this, but on Linux machines we get random hangs. The writer is likely the `BackgroundProcessor` calling `persist_manager`, which takes a write lock on `total_consistency_lock` inside the `write()` method of `ChannelManager`.
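
To make the suspected interleaving concrete, here is a minimal single-threaded sketch. It does not use LDK or a real lock: `FairLockModel` and `replay_interleaving` are hypothetical names for a toy model that assumes a writer-preferring policy (new readers are refused while a writer is queued), which is what I believe we are hitting on Linux. It only records who would block, so it never actually hangs.

```rust
/// Toy model of a writer-preferring read-write lock (an assumption for
/// illustration; this is neither LDK's nor std's implementation).
#[derive(Default)]
struct FairLockModel {
    active_readers: usize,
    writer_active: bool,
    waiting_writers: usize,
}

impl FairLockModel {
    /// A new reader is admitted only if no writer holds or is queued for the lock.
    fn try_read(&mut self) -> bool {
        if self.writer_active || self.waiting_writers > 0 {
            return false; // a real lock would block here
        }
        self.active_readers += 1;
        true
    }

    /// A writer is admitted only when the lock is completely free; otherwise it queues.
    fn try_write(&mut self) -> bool {
        if self.writer_active || self.active_readers > 0 {
            self.waiting_writers += 1;
            return false; // a real lock would block here
        }
        self.writer_active = true;
        true
    }
}

/// Replays the interleaving described above and reports whether it ends in a deadlock.
fn replay_interleaving() -> bool {
    let mut lock = FairLockModel::default();
    // Step 2: get_and_clear_pending_msg_events() takes the first read lock.
    let outer_read = lock.try_read();
    // A concurrent task (e.g. persisting the manager) asks for the write lock and queues.
    let writer_admitted = lock.try_write();
    // Step 3: the nested channel_monitor_updated() asks for a second read lock.
    let inner_read = lock.try_read();
    // Deadlock: the outer read is held, the writer waits for it to be released,
    // and the inner read waits behind the writer on the same call stack.
    outer_read && !writer_admitted && !inner_read
}

fn main() {
    println!("deadlock in model: {}", replay_interleaving());
}
```

Under a reader-preferring policy the inner read would be admitted and the writer would simply wait, which would match never seeing the hang on macOS.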