Skip to content

Stop BackgroundProcessor's thread on drop #1007

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
80 changes: 64 additions & 16 deletions lightning-background-processor/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -41,9 +41,7 @@ use std::ops::Deref;
/// for unilateral chain closure fees are at risk.
pub struct BackgroundProcessor {
stop_thread: Arc<AtomicBool>,
/// May be used to retrieve and handle the error if `BackgroundProcessor`'s thread
/// exits due to an error while persisting.
pub thread_handle: JoinHandle<Result<(), std::io::Error>>,
thread_handle: Option<JoinHandle<Result<(), std::io::Error>>>,
}

#[cfg(not(test))]
Expand Down Expand Up @@ -84,21 +82,25 @@ ChannelManagerPersister<Signer, M, T, K, F, L> for Fun where
}

impl BackgroundProcessor {
/// Start a background thread that takes care of responsibilities enumerated in the top-level
/// documentation.
/// Start a background thread that takes care of responsibilities enumerated in the [top-level
/// documentation].
///
/// If `persist_manager` returns an error, then this thread will return said error (and
/// `start()` will need to be called again to restart the `BackgroundProcessor`). Users should
/// wait on [`thread_handle`]'s `join()` method to be able to tell if and when an error is
/// returned, or implement `persist_manager` such that an error is never returned to the
/// `BackgroundProcessor`
/// The thread runs indefinitely unless the object is dropped, [`stop`] is called, or
/// `persist_manager` returns an error. In case of an error, the error is retrieved by calling
/// either [`join`] or [`stop`].
///
/// Typically, users should either implement [`ChannelManagerPersister`] to never return an
/// error or call [`join`] and handle any error that may arise. For the latter case, the
/// `BackgroundProcessor` must be restarted by calling `start` again after handling the error.
///
/// `persist_manager` is responsible for writing out the [`ChannelManager`] to disk, and/or
/// uploading to one or more backup services. See [`ChannelManager::write`] for writing out a
/// [`ChannelManager`]. See [`FilesystemPersister::persist_manager`] for Rust-Lightning's
/// provided implementation.
///
/// [`thread_handle`]: BackgroundProcessor::thread_handle
/// [top-level documentation]: Self
/// [`join`]: Self::join
/// [`stop`]: Self::stop
/// [`ChannelManager`]: lightning::ln::channelmanager::ChannelManager
/// [`ChannelManager::write`]: lightning::ln::channelmanager::ChannelManager#impl-Writeable
/// [`FilesystemPersister::persist_manager`]: lightning_persister::FilesystemPersister::persist_manager
Expand Down Expand Up @@ -158,13 +160,53 @@ impl BackgroundProcessor {
}
}
});
Self { stop_thread: stop_thread_clone, thread_handle: handle }
Self { stop_thread: stop_thread_clone, thread_handle: Some(handle) }
}

/// Join `BackgroundProcessor`'s thread, returning any error that occurred while persisting
/// [`ChannelManager`].
///
/// # Panics
///
/// This function panics if the background thread has panicked such as while persisting or
/// handling events.
///
/// [`ChannelManager`]: lightning::ln::channelmanager::ChannelManager
pub fn join(mut self) -> Result<(), std::io::Error> {
assert!(self.thread_handle.is_some());
self.join_thread()
}

/// Stop `BackgroundProcessor`'s thread, returning any error that occurred while persisting
/// [`ChannelManager`].
///
/// # Panics
///
/// This function panics if the background thread has panicked such as while persisting or
/// handling events.
///
/// [`ChannelManager`]: lightning::ln::channelmanager::ChannelManager
pub fn stop(mut self) -> Result<(), std::io::Error> {
assert!(self.thread_handle.is_some());
self.stop_and_join_thread()
}

/// Stop `BackgroundProcessor`'s thread.
pub fn stop(self) -> Result<(), std::io::Error> {
fn stop_and_join_thread(&mut self) -> Result<(), std::io::Error> {
self.stop_thread.store(true, Ordering::Release);
self.thread_handle.join().unwrap()
self.join_thread()
}

fn join_thread(&mut self) -> Result<(), std::io::Error> {
match self.thread_handle.take() {
Some(handle) => handle.join().unwrap(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that if the background thread panics, join will return an error that will cause the unwrap to panic on this line. Might it'd be preferable to either return the panic as an error or at least document that this is the expected behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added docs about panicking to join and stop.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does feel a bit strange that we have a function that returns a Result, but instead of ever returning an Err it always panics.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so the behavior of std's join() is to return an error if the thread panics. Might it make more sense for our join method to mirror that behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does feel a bit strange that we have a function that returns a Result, but instead of ever returning an Err it always panics.

It will return an Err if the persister returns an Err. The unwrap is to remove the error handling JoinHandle provides for panics, not for the Result returned by the thread's closure.

Hmm, so the behavior of std's join() is to return an error if the thread panics. Might it make more sense for our join method to mirror that behavior?

My intention was to mirror the same behavior as stop, which would already panic in this case. The reasoning for panicking was because the panic would likely have originated from user code. So users should either write code to not panic in the case of event handling (some discussion here) or to return an error in case of persisting. If the panic is coming from RL, my assumption was we'd want to propagate it.

I'm not necessarily opposed to making it return the wrapped result, but I think we'd want stop to have the same behavior. There may be a good argument based on how a user configures Rust to handle panics. I vaguely remember @TheBlueMatt discussing it with us but I'm a bit foggy on the details.

None => Ok(()),
}
}
}

impl Drop for BackgroundProcessor {
fn drop(&mut self) {
self.stop_and_join_thread().unwrap();
}
}

Expand Down Expand Up @@ -416,7 +458,13 @@ mod tests {
let persister = |_: &_| Err(std::io::Error::new(std::io::ErrorKind::Other, "test"));
let event_handler = |_| {};
let bg_processor = BackgroundProcessor::start(persister, event_handler, nodes[0].chain_monitor.clone(), nodes[0].node.clone(), nodes[0].peer_manager.clone(), nodes[0].logger.clone());
let _ = bg_processor.thread_handle.join().unwrap().expect_err("Errored persisting manager: test");
match bg_processor.join() {
Ok(_) => panic!("Expected error persisting manager"),
Err(e) => {
assert_eq!(e.kind(), std::io::ErrorKind::Other);
assert_eq!(e.get_ref().unwrap().to_string(), "test");
},
}
}

#[test]
Expand Down