Fix stale gossip retention Heisenbug when channel updates are slow. #1953

Merged

16 changes: 12 additions & 4 deletions lightning/src/routing/gossip.rs
@@ -1541,6 +1541,14 @@ impl<L: Deref> NetworkGraph<L> where L::Target: Logger {
 		#[cfg(not(feature = "std"))]
 		let current_time_unix = None;

+		self.channel_failed_with_time(short_channel_id, is_permanent, current_time_unix)
+	}
+
+	/// Marks a channel in the graph as failed if a corresponding HTLC fail was sent.
+	/// If permanent, removes a channel from the local storage.
+	/// May cause the removal of nodes too, if this was their last channel.
+	/// If not permanent, makes channels unavailable for routing.
+	fn channel_failed_with_time(&self, short_channel_id: u64, is_permanent: bool, current_time_unix: Option<u64>) {
 		let mut channels = self.channels.write().unwrap();
 		if is_permanent {
 			if let Some(chan) = channels.remove(&short_channel_id) {
@@ -2537,18 +2545,18 @@ mod tests {

 			// Mark the channel as permanently failed. This will also remove the two nodes
 			// and all of the entries will be tracked as removed.
-			network_graph.channel_failed(short_channel_id, true);
+			network_graph.channel_failed_with_time(short_channel_id, true, Some(tracking_time));

 			// Should not remove from tracking if insufficient time has passed
 			network_graph.remove_stale_channels_and_tracking_with_time(
 				tracking_time + REMOVED_ENTRIES_TRACKING_AGE_LIMIT_SECS - 1);
-			assert_eq!(network_graph.removed_channels.lock().unwrap().len(), 1);
+			assert_eq!(network_graph.removed_channels.lock().unwrap().len(), 1, "Removed channel count ≠ 1 with tracking_time {}", tracking_time);

 			// Provide a later time so that sufficient time has passed
 			network_graph.remove_stale_channels_and_tracking_with_time(
 				tracking_time + REMOVED_ENTRIES_TRACKING_AGE_LIMIT_SECS);
-			assert!(network_graph.removed_channels.lock().unwrap().is_empty());
-			assert!(network_graph.removed_nodes.lock().unwrap().is_empty());
+			assert!(network_graph.removed_channels.lock().unwrap().is_empty(), "Unexpectedly removed channels with tracking_time {}", tracking_time);
+			assert!(network_graph.removed_nodes.lock().unwrap().is_empty(), "Unexpectedly removed nodes with tracking_time {}", tracking_time);
 		}

 		#[cfg(not(feature = "std"))]
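
The test change above replaces a wall-clock call with an injected timestamp. The sketch below shows why that can matter, under stated assumptions: `ToyGraph`, `tracked_after`, `remove_stale_tracking_with_time`, and the 60-second `TRACKING_AGE_LIMIT_SECS` are hypothetical names and values for illustration, not the code in `gossip.rs`. It models one plausible reading of the flakiness: if the removal is stamped one second later than the `tracking_time` the test captured, the entry is still too young at `tracking_time + limit` and the final emptiness assertion fails.

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical limit for this sketch only.
const TRACKING_AGE_LIMIT_SECS: u64 = 60;

/// Toy stand-in for the graph: it only records when channels were removed.
struct ToyGraph {
    removed_channels: Mutex<HashMap<u64, Option<u64>>>,
}

impl ToyGraph {
    fn new() -> Self {
        ToyGraph { removed_channels: Mutex::new(HashMap::new()) }
    }

    /// Public entry point: reads the wall clock, then delegates.
    #[allow(dead_code)]
    fn channel_failed(&self, short_channel_id: u64) {
        let now = SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .expect("system clock is before the Unix epoch")
            .as_secs();
        self.channel_failed_with_time(short_channel_id, Some(now));
    }

    /// Time-injected variant: the caller chooses the removal timestamp.
    fn channel_failed_with_time(&self, short_channel_id: u64, current_time_unix: Option<u64>) {
        self.removed_channels.lock().unwrap().insert(short_channel_id, current_time_unix);
    }

    /// Drops tracking entries whose age has reached the limit.
    /// Entries without a timestamp are dropped too (a sketch-only simplification).
    fn remove_stale_tracking_with_time(&self, current_time_unix: u64) {
        self.removed_channels.lock().unwrap().retain(|_, removed_at| match removed_at {
            Some(t) => current_time_unix.saturating_sub(*t) < TRACKING_AGE_LIMIT_SECS,
            None => false,
        });
    }

    fn tracked(&self) -> usize {
        self.removed_channels.lock().unwrap().len()
    }
}

/// Removes channel 42 at `removed_at`, prunes at `checked_at`,
/// and returns how many removals are still tracked.
fn tracked_after(removed_at: u64, checked_at: u64) -> usize {
    let graph = ToyGraph::new();
    graph.channel_failed_with_time(42, Some(removed_at));
    graph.remove_stale_tracking_with_time(checked_at);
    graph.tracked()
}

fn main() {
    let tracking_time = 1_000;
    // Injected timestamp: both boundary checks behave deterministically.
    assert_eq!(tracked_after(tracking_time, tracking_time + TRACKING_AGE_LIMIT_SECS - 1), 1);
    assert_eq!(tracked_after(tracking_time, tracking_time + TRACKING_AGE_LIMIT_SECS), 0);
    // Removal stamped one second late (as a real clock might): entry survives.
    assert_eq!(tracked_after(tracking_time + 1, tracking_time + TRACKING_AGE_LIMIT_SECS), 1);
}
```

Keeping the clock read in the public wrapper and the logic in a `_with_time` helper lets tests pin both sides of the age comparison to the same `tracking_time`, which is the pattern the diff applies.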