Ignore Duplicate Gossip Error while updating networkGraph from RGS #1764

Merged
merged 1 commit into from
Oct 19, 2022
52 changes: 51 additions & 1 deletion lightning-rapid-gossip-sync/src/processing.rs
@@ -194,7 +194,13 @@ impl<NG: Deref<Target=NetworkGraph<L>>, L: Deref> RapidGossipSync<NG, L> where L
         synthetic_update.htlc_maximum_msat = htlc_maximum_msat;
     }

-    network_graph.update_channel_unsigned(&synthetic_update)?;
+    match network_graph.update_channel_unsigned(&synthetic_update) {
+        Ok(_) => {},
+        Err(LightningError { action: ErrorAction::IgnoreDuplicateGossip, .. }) => {},
+        Err(LightningError { action: ErrorAction::IgnoreAndLog(_), .. }) => {},
Collaborator

This starts ignoring errors like "channel_update is older than two weeks old", which we probably don't want to ignore: that indicates the RGS data is bunk.

Contributor @tnull, Oct 19, 2022

Hm, but couldn't a similar argument be made for the whole PR: that the dataset is probably bunk when we receive a duplicate update in the first place? So why ignore one but not the other error indicating "something is wrong with this data"?

Collaborator

No, you can get updates from payment failures (or p2p gossip sync, if you're doing both), so it's totally reasonable to have an update out-of-band from some source other than RGS.

Contributor Author

I still think we should ignore those errors on the client side.
"There is something wrong with this server update" => "we should ignore it and not fail on the client side"
The server side should be responsible for its own correctness and should have its own checks in place.

IIUC, ignoring the update leaves the network graph functional instead of erroring out and stopping on some mistake from the server (a potentially out-of-date graph is better than errors until the server fixes the potential bug).
And given the way RGS currently works with latest_seen_timestamp and backdate_timestamp, doesn't it make sense to ignore the noisy updates?

#1785 => As of now, the RGS client doesn't accept a stale update; it ignores it.
#1784 => As of now, the RGS client ignores an update for a channel that doesn't exist and does not fail.

Collaborator

Because all of the RGS updates get applied with the same timestamps, there's nothing to apply; they'll all fail. I'd much rather fail and get that feedback than just silently continue. For example, if someone hosts their own server and it starts failing, it's best to give them errors rather than hope they have monitoring in place.

As for 1784, what am I reading wrong? The code snippet I copied into the issue seems to me to be returning an error if a channel is missing. I don't see any other code to ignore such cases.

Contributor Author

"get that feedback than just silently continue" -> Does debug logging with an outdated network graph work as feedback, or do we want to surface errors to users?

1784 -> To clarify, I meant that after this PR we are catching and ignoring "IgnoreError" as well, right? So it won't fail.

Collaborator

> "get that feedback than just silently continue" -> does debug logging with outdated network graph as feedback work? or we want to surface errors to users?

Hmm, I'd imagine we'd want to surface an error, but if we want to avoid punishing users for the server being poorly monitored, I'd be okay with debug logging too.

> 1784 -> Clarifying, i meant after this PR, we are catching and ignoring "IgnoreError" as well right? and it won't fail.

Ah, sorry for the confusion. No, that error is in lightning-rapid-gossip-sync/src/processing.rs, i.e. it is not ignored, because we generate it locally.

+        Err(LightningError { action: ErrorAction::IgnoreError, .. }) => {},
+        Err(e) => return Err(e.into()),
+    }
 }

 self.network_graph.set_last_rapid_gossip_sync_timestamp(latest_seen_timestamp);
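
The review thread on this hunk turns on which error actions the client should swallow. The following is a minimal sketch of the narrower handling discussed there, in which only duplicate gossip is skipped and every other error still aborts the sync. It assumes simplified stand-in types: `ErrorAction` and `LightningError` below only mirror the names in the diff and are not the real `lightning` crate definitions, and `apply_all` is a hypothetical helper.

```rust
// Simplified stand-ins for illustration only. They mirror the variant names
// used in the diff but are NOT the real `lightning` crate types.
#[derive(Debug, Clone, PartialEq)]
pub enum ErrorAction {
    IgnoreDuplicateGossip,
    IgnoreAndLog,
    IgnoreError,
}

#[derive(Debug, Clone, PartialEq)]
pub struct LightningError {
    pub err: String,
    pub action: ErrorAction,
}

/// Hypothetical helper: walks the per-update results in order.
/// Duplicate gossip is skipped, and its message is collected so the caller
/// can log it at debug level. Any other error aborts the whole sync.
pub fn apply_all(
    results: Vec<Result<(), LightningError>>,
) -> Result<Vec<String>, LightningError> {
    let mut skipped = Vec::new();
    for result in results {
        match result {
            Ok(()) => {}
            Err(LightningError { action: ErrorAction::IgnoreDuplicateGossip, err }) => {
                skipped.push(err);
            }
            // Everything else (including IgnoreAndLog and IgnoreError in this
            // sketch) is surfaced to the caller instead of being swallowed.
            Err(e) => return Err(e),
        }
    }
    Ok(skipped)
}
```

Returning the skipped messages rather than dropping them is one way to get the "debug logging" feedback mentioned in the thread without failing the client.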
@@ -435,6 +441,50 @@ mod tests {
         assert!(after.contains("783241506229452801"));
     }

+    #[test]
+    fn update_succeeds_when_duplicate_gossip_is_applied() {
+        let initialization_input = vec![
+            76, 68, 75, 1, 111, 226, 140, 10, 182, 241, 179, 114, 193, 166, 162, 70, 174, 99, 247,
+            79, 147, 30, 131, 101, 225, 90, 8, 156, 104, 214, 25, 0, 0, 0, 0, 0, 97, 227, 98, 218,
+            0, 0, 0, 4, 2, 22, 7, 207, 206, 25, 164, 197, 231, 230, 231, 56, 102, 61, 250, 251,
+            187, 172, 38, 46, 79, 247, 108, 44, 155, 48, 219, 238, 252, 53, 192, 6, 67, 2, 36, 125,
+            157, 176, 223, 175, 234, 116, 94, 248, 201, 225, 97, 235, 50, 47, 115, 172, 63, 136,
+            88, 216, 115, 11, 111, 217, 114, 84, 116, 124, 231, 107, 2, 158, 1, 242, 121, 152, 106,
+            204, 131, 186, 35, 93, 70, 216, 10, 237, 224, 183, 89, 95, 65, 3, 83, 185, 58, 138,
+            181, 64, 187, 103, 127, 68, 50, 2, 201, 19, 17, 138, 136, 149, 185, 226, 156, 137, 175,
+            110, 32, 237, 0, 217, 90, 31, 100, 228, 149, 46, 219, 175, 168, 77, 4, 143, 38, 128,
+            76, 97, 0, 0, 0, 2, 0, 0, 255, 8, 153, 192, 0, 2, 27, 0, 0, 0, 1, 0, 0, 255, 2, 68,
+            226, 0, 6, 11, 0, 1, 2, 3, 0, 0, 0, 4, 0, 40, 0, 0, 0, 0, 0, 0, 3, 232, 0, 0, 3, 232,
+            0, 0, 0, 1, 0, 0, 0, 0, 58, 85, 116, 216, 255, 8, 153, 192, 0, 2, 27, 0, 0, 56, 0, 0,
+            0, 0, 0, 0, 0, 1, 0, 0, 0, 100, 0, 0, 2, 224, 0, 25, 0, 0, 0, 1, 0, 0, 0, 125, 255, 2,
+            68, 226, 0, 6, 11, 0, 1, 4, 0, 0, 0, 0, 29, 129, 25, 192, 0, 5, 0, 0, 0, 0, 29, 129,
+            25, 192,
+        ];
+
+        let block_hash = genesis_block(Network::Bitcoin).block_hash();
+        let logger = TestLogger::new();
+        let network_graph = NetworkGraph::new(block_hash, &logger);
+
+        assert_eq!(network_graph.read_only().channels().len(), 0);
+
+        let rapid_sync = RapidGossipSync::new(&network_graph);
+        let initialization_result = rapid_sync.update_network_graph(&initialization_input[..]);
+        assert!(initialization_result.is_ok());
+
+        let single_direction_incremental_update_input = vec![
+            76, 68, 75, 1, 111, 226, 140, 10, 182, 241, 179, 114, 193, 166, 162, 70, 174, 99, 247,
+            79, 147, 30, 131, 101, 225, 90, 8, 156, 104, 214, 25, 0, 0, 0, 0, 0, 97, 229, 183, 167,
+            0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+            0, 0, 0, 0, 0, 0, 0, 0, 0, 255, 8, 153, 192, 0, 2, 27, 0, 0, 136, 0, 0, 0, 221, 255, 2,
+            68, 226, 0, 6, 11, 0, 1, 128,
+        ];
+        let update_result_1 = rapid_sync.update_network_graph(&single_direction_incremental_update_input[..]);
+        // Apply duplicate update
+        let update_result_2 = rapid_sync.update_network_graph(&single_direction_incremental_update_input[..]);
+        assert!(update_result_1.is_ok());
+        assert!(update_result_2.is_ok());
+    }
+
     #[test]
     fn full_update_succeeds() {
         let valid_input = vec![
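
The duplicate-update scenario that the new test exercises can be modelled without the binary fixtures. The sketch below is a toy model, not LDK code: `Graph` and `apply_snapshot` are hypothetical names, and it assumes (as the review thread describes) that every update in one RGS snapshot carries the same timestamp, so a snapshot that is not newer than the graph's current state has nothing to apply.

```rust
use std::collections::HashMap;

/// Hypothetical model: channel id -> timestamp of the last update held locally.
pub type Graph = HashMap<u64, u32>;

/// Applies a snapshot in which every update carries `snapshot_timestamp`.
/// Returns `(applied, rejected)` counts. An update is rejected when the
/// channel is unknown or when the stored timestamp is the same or newer.
pub fn apply_snapshot(
    graph: &mut Graph,
    channel_ids: &[u64],
    snapshot_timestamp: u32,
) -> (usize, usize) {
    let mut applied = 0;
    let mut rejected = 0;
    for id in channel_ids {
        if let Some(last_update) = graph.get_mut(id) {
            if *last_update < snapshot_timestamp {
                *last_update = snapshot_timestamp;
                applied += 1;
                continue;
            }
        }
        rejected += 1;
    }
    (applied, rejected)
}
```

In this model, applying the same snapshot twice applies everything the first time and rejects everything the second time, which is the situation the client has to decide whether to treat as an error.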
4 changes: 2 additions & 2 deletions lightning/src/routing/gossip.rs
@@ -1330,7 +1330,7 @@ impl<L: Deref> NetworkGraph<L> where L::Target: Logger {
 // updates to ensure you always have the latest one, only vaguely suggesting
 // that it be at least the current time.
 if node_info.last_update > msg.timestamp {
-    return Err(LightningError{err: "Update older than last processed update".to_owned(), action: ErrorAction::IgnoreAndLog(Level::Gossip)});
+    return Err(LightningError{err: "Update older than last processed update".to_owned(), action: ErrorAction::IgnoreDuplicateGossip});
 } else if node_info.last_update == msg.timestamp {
     return Err(LightningError{err: "Update had the same timestamp as last processed update".to_owned(), action: ErrorAction::IgnoreDuplicateGossip});
 }
@@ -1796,7 +1796,7 @@ impl<L: Deref> NetworkGraph<L> where L::Target: Logger {
 // pruning based on the timestamp field being more than two weeks old,
 // but only in the non-normative section.
 if existing_chan_info.last_update > msg.timestamp {
-    return Err(LightningError{err: "Update older than last processed update".to_owned(), action: ErrorAction::IgnoreAndLog(Level::Gossip)});
+    return Err(LightningError{err: "Update older than last processed update".to_owned(), action: ErrorAction::IgnoreDuplicateGossip});
 } else if existing_chan_info.last_update == msg.timestamp {
     return Err(LightningError{err: "Update had same timestamp as last processed update".to_owned(), action: ErrorAction::IgnoreDuplicateGossip});
 }
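
After the two gossip.rs hunks above, an update that is older than the stored one and an update with the same timestamp both yield `ErrorAction::IgnoreDuplicateGossip`. A minimal sketch of that classification follows, assuming stand-in types rather than the real `lightning` crate definitions. `check_timestamp` is a hypothetical free function, and the error strings are copied from the channel-update hunk.

```rust
use std::cmp::Ordering;

// Stand-in types for illustration; not the real `lightning` crate types.
#[derive(Debug, PartialEq)]
pub enum ErrorAction {
    IgnoreDuplicateGossip,
}

#[derive(Debug, PartialEq)]
pub struct LightningError {
    pub err: String,
    pub action: ErrorAction,
}

/// Hypothetical helper mirroring the timestamp comparison in the diff:
/// only a strictly newer timestamp is accepted.
pub fn check_timestamp(last_update: u32, msg_timestamp: u32) -> Result<(), LightningError> {
    match last_update.cmp(&msg_timestamp) {
        // The stored update is newer than the incoming one.
        Ordering::Greater => Err(LightningError {
            err: "Update older than last processed update".to_owned(),
            action: ErrorAction::IgnoreDuplicateGossip,
        }),
        // The incoming update has the timestamp we already processed.
        Ordering::Equal => Err(LightningError {
            err: "Update had same timestamp as last processed update".to_owned(),
            action: ErrorAction::IgnoreDuplicateGossip,
        }),
        // The incoming update is strictly newer, so it can be applied.
        Ordering::Less => Ok(()),
    }
}
```

Because both rejection cases share one action in this sketch, a caller that matches on `IgnoreDuplicateGossip` skips stale and repeated updates alike while still seeing every other kind of error.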