Commit 6a030d6

Do not require that no calls are made post-disconnect_socket
The only practical way to meet this requirement is to block disconnect_socket until any pending events are fully processed, leading to this trivial deadlock:

 * Thread 1: select() woken up due to a read event
 * Thread 2: Event processing causes a disconnect_socket call to fire while the PeerManager lock is held.
 * Thread 2: disconnect_socket blocks until the read event in thread 1 completes.
 * Thread 1: bytes are read from the socket and PeerManager::read_event is called, waiting on the lock still held by thread 2.

There isn't a trivial way to address this deadlock without simply making the final read_event call return immediately, which we do here. This also implies that users can freely call event methods after disconnect_socket, but only so far as the socket descriptor is different from any later socket descriptor (ie until the file descriptor is re-used).
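The behaviour described above can be illustrated with a minimal sketch. This is not the library's code: `ToyPeerManager`, `Descriptor`, `Peer`, and `new_connection` are hypothetical stand-ins, and `read_event` here returns a byte count rather than the real `Result<bool, PeerHandleError>`. It only shows the idea of returning an error (or doing nothing) for a descriptor that is no longer known, instead of panicking.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical stand-in for a user-supplied socket descriptor.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct Descriptor(pub u64);

/// Simplified mirror of the error type that appears in the diff.
#[derive(Debug, PartialEq, Eq)]
pub struct PeerHandleError {
    pub no_connection_possible: bool,
}

/// Hypothetical per-peer state; the real struct holds far more.
#[derive(Default)]
struct Peer {
    bytes_received: usize,
}

/// Toy manager: a single lock guarding a descriptor -> peer map.
#[derive(Default)]
pub struct ToyPeerManager {
    peers: Mutex<HashMap<Descriptor, Peer>>,
}

impl ToyPeerManager {
    /// Registers a descriptor (assumed helper, not part of the diff).
    pub fn new_connection(&self, descriptor: Descriptor) {
        self.peers.lock().unwrap().insert(descriptor, Peer::default());
    }

    /// A late call for an unknown descriptor returns an error immediately
    /// instead of panicking, so the caller simply drops the connection.
    pub fn read_event(&self, descriptor: &Descriptor, data: &[u8]) -> Result<usize, PeerHandleError> {
        let mut peers = self.peers.lock().unwrap();
        match peers.get_mut(descriptor) {
            None => Err(PeerHandleError { no_connection_possible: false }),
            Some(peer) => {
                peer.bytes_received += data.len();
                Ok(peer.bytes_received)
            }
        }
    }

    /// Removing an unknown descriptor is a no-op. The returned flag (true if
    /// a peer was actually removed) exists only to make the sketch testable.
    pub fn socket_disconnected(&self, descriptor: &Descriptor) -> bool {
        self.peers.lock().unwrap().remove(descriptor).is_some()
    }
}
```

Because no call waits for another thread's in-flight event, the lock is only ever held for the duration of a single lookup, and the ordering problem in the thread list above does not arise in this toy model.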
1 parent c505f07 commit 6a030d6

1 file changed: +22 -15 lines changed

lightning/src/ln/peer_handler.rs

Lines changed: 22 additions & 15 deletions
@@ -195,11 +195,8 @@ pub trait SocketDescriptor : cmp::Eq + hash::Hash + Clone {
 	/// indicating that read events on this descriptor should resume. A resume_read of false does
 	/// *not* imply that further read events should be paused.
 	fn send_data(&mut self, data: &[u8], resume_read: bool) -> usize;
-	/// Disconnect the socket pointed to by this SocketDescriptor. Once this function returns, no
-	/// more calls to write_buffer_space_avail, read_event or socket_disconnected may be made with
-	/// this descriptor. No socket_disconnected call should be generated as a result of this call,
-	/// though races may occur whereby disconnect_socket is called after a call to
-	/// socket_disconnected but prior to socket_disconnected returning.
+	/// Disconnect the socket pointed to by this SocketDescriptor.
+	/// No [`PeerManager::socket_disconnected`] call need be generated as a result of this call.
 	fn disconnect_socket(&mut self);
 }

@@ -617,7 +614,12 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 	pub fn write_buffer_space_avail(&self, descriptor: &mut Descriptor) -> Result<(), PeerHandleError> {
 		let mut peers = self.peers.lock().unwrap();
 		match peers.peers.get_mut(descriptor) {
-			None => panic!("Descriptor for write_event is not already known to PeerManager"),
+			None => {
+				// This is most likely a simple race condition where the user found that the socket
+				// was writeable, then we told the user to `disconnect_socket()`, then they called
+				// this method. Return an error to make sure we get disconnected.
+				return Err(PeerHandleError { no_connection_possible: false });
+			},
 			Some(peer) => {
 				peer.awaiting_write_event = false;
 				self.do_attempt_write_data(descriptor, peer);
@@ -637,7 +639,6 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 	/// If Ok(true) is returned, further read_events should not be triggered until a send_data call
 	/// on this file descriptor has resume_read set (preventing DoS issues in the send buffer).
 	///
-	/// Panics if the descriptor was not previously registered in a new_*_connection event.
 	pub fn read_event(&self, peer_descriptor: &mut Descriptor, data: &[u8]) -> Result<bool, PeerHandleError> {
 		match self.do_read_event(peer_descriptor, data) {
 			Ok(res) => Ok(res),
@@ -665,7 +666,12 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 			let mut msgs_to_forward = Vec::new();
 			let mut peer_node_id = None;
 			let pause_read = match peers.peers.get_mut(peer_descriptor) {
-				None => panic!("Descriptor for read_event is not already known to PeerManager"),
+				None => {
+					// This is most likely a simple race condition where the user read some bytes
+					// from the socket, then we told the user to `disconnect_socket()`, then they
+					// called this method. Return an error to make sure we get disconnected.
+					return Err(PeerHandleError { no_connection_possible: false });
+				},
 				Some(peer) => {
 					assert!(peer.pending_read_buffer.len() > 0);
 					assert!(peer.pending_read_buffer.len() > peer.pending_read_buffer_pos);
@@ -1294,12 +1300,9 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D

 	/// Indicates that the given socket descriptor's connection is now closed.
 	///
-	/// This must only be called if the socket has been disconnected by the peer or your own
-	/// decision to disconnect it and must NOT be called in any case where other parts of this
-	/// library (eg PeerHandleError, explicit disconnect_socket calls) instruct you to disconnect
-	/// the peer.
-	///
-	/// Panics if the descriptor was not previously registered in a successful new_*_connection event.
+	/// This need only be called if the socket has been disconnected by the peer or your own
+	/// decision to disconnect it and may be skipped in any case where other parts of this library
+	/// (eg PeerHandleError, explicit disconnect_socket calls) instruct you to disconnect the peer.
 	pub fn socket_disconnected(&self, descriptor: &Descriptor) {
 		self.disconnect_event_internal(descriptor, false);
 	}
@@ -1308,7 +1311,11 @@ impl<Descriptor: SocketDescriptor, CM: Deref, RM: Deref, L: Deref> PeerManager<D
 		let mut peers = self.peers.lock().unwrap();
 		let peer_option = peers.peers.remove(descriptor);
 		match peer_option {
-			None => panic!("Descriptor for disconnect_event is not already known to PeerManager"),
+			None => {
+				// This is most likely a simple race condition where the user found that the socket
+				// was disconnected, then we told the user to `disconnect_socket()`, then they
+				// called this method. Either way we're disconnected, return.
+			},
 			Some(peer) => {
 				match peer.their_node_id {
 					Some(node_id) => {
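
The commit message notes that late event calls are only safe while the old descriptor differs from any later one, i.e. until the file descriptor is re-used. One possible way for a user to sidestep that caveat is sketched below: pair the raw fd with a per-connection counter and base equality and hashing on the counter. `FdDescriptor`, `NEXT_CONNECTION_ID`, and `connection_id` are hypothetical names chosen for illustration, not something the diff defines, and the `send_data`/`disconnect_socket` methods a real `SocketDescriptor` needs are omitted.

```rust
use std::hash::{Hash, Hasher};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical process-wide counter handing out one id per connection.
static NEXT_CONNECTION_ID: AtomicU64 = AtomicU64::new(0);

/// Hypothetical descriptor: the OS may hand the same `fd` to a later
/// connection, but `connection_id` is never repeated.
#[derive(Clone, Debug)]
pub struct FdDescriptor {
    pub fd: i32,
    pub connection_id: u64,
}

impl FdDescriptor {
    /// Wraps a raw fd and assigns it a fresh connection id.
    pub fn new(fd: i32) -> Self {
        Self {
            fd,
            connection_id: NEXT_CONNECTION_ID.fetch_add(1, Ordering::Relaxed),
        }
    }
}

// Equality and hashing ignore the fd, so a stale descriptor for a closed
// connection never compares equal to a newer connection on the same fd.
impl PartialEq for FdDescriptor {
    fn eq(&self, other: &Self) -> bool {
        self.connection_id == other.connection_id
    }
}

impl Eq for FdDescriptor {}

impl Hash for FdDescriptor {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.connection_id.hash(state);
    }
}
```

With a descriptor like this, a stale `read_event` or `socket_disconnected` call for a closed connection would fall into the new `None` arms above rather than being attributed to whichever connection later received the same fd.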
