Skip to content

Ignore Duplicate Gossip Error while updating networkGraph from RGS #1764

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Oct 19, 2022

Conversation

G8XSU
Copy link
Contributor

@G8XSU G8XSU commented Oct 11, 2022

Fixes #1746

@G8XSU G8XSU requested a review from arik-so October 11, 2022 19:25
arik-so
arik-so previously approved these changes Oct 11, 2022
Copy link
Contributor

@arik-so arik-so left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we try to add a test?

Copy link
Contributor

@tnull tnull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, mod the possibly pending test case.

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please add test.

@G8XSU
Copy link
Contributor Author

G8XSU commented Oct 12, 2022

Yes,
The current tests are based off serialized byte-stream with custom format.
Was checking if there's a better way, I didn't find any utils to generate it. Any ideas ?

@jkczyz
Copy link
Contributor

jkczyz commented Oct 12, 2022

Yes, The current tests are based off serialized byte-stream with custom format. Was checking if there's a better way, I didn't find any utils to generate it. Any ideas ?

@arik-so generated the bytes from his server, IIUC.

https://github.com/lightningdevkit/rapid-gossip-sync-server

Not positive if anything can be reused from there. Ideally, we'd have an interface for building this data for testing instead of using byte arrays. Would make the tests more readable and modifiable, at very least.

@codecov-commenter
Copy link

codecov-commenter commented Oct 12, 2022

Codecov Report

Base: 90.80% // Head: 90.78% // Decreases project coverage by -0.01% ⚠️

Coverage data is based on head (96d04ae) compared to base (559ed20).
Patch coverage: 94.73% of modified lines in pull request are covered.

❗ Current head 96d04ae differs from pull request most recent head 51ee152. Consider uploading reports for the commit 51ee152 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1764      +/-   ##
==========================================
- Coverage   90.80%   90.78%   -0.02%     
==========================================
  Files          87       87              
  Lines       46969    46987      +18     
  Branches    46969    46987      +18     
==========================================
+ Hits        42648    42655       +7     
- Misses       4321     4332      +11     
Impacted Files Coverage Δ
lightning-rapid-gossip-sync/src/processing.rs 91.48% <94.73%> (+0.24%) ⬆️
lightning-net-tokio/src/lib.rs 76.73% <0.00%> (-0.31%) ⬇️
lightning/src/util/events.rs 38.13% <0.00%> (-0.27%) ⬇️
lightning/src/ln/peer_channel_encryptor.rs 93.38% <0.00%> (-0.25%) ⬇️
lightning/src/ln/functional_tests.rs 96.93% <0.00%> (-0.14%) ⬇️
lightning/src/chain/channelmonitor.rs 91.16% <0.00%> (-0.06%) ⬇️
lightning/src/routing/scoring.rs 96.61% <0.00%> (ø)
lightning-background-processor/src/lib.rs 96.23% <0.00%> (ø)
lightning/src/chain/onchaintx.rs 95.40% <0.00%> (+0.68%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@G8XSU G8XSU requested review from arik-so, jkczyz and tnull October 12, 2022 18:38
@G8XSU G8XSU force-pushed the rgs-ignore-error branch 2 times, most recently from 4c5bd9b to 96d04ae Compare October 12, 2022 19:49
@@ -435,6 +439,50 @@ mod tests {
assert!(after.contains("783241506229452801"));
}

#[test]
fn update_succeeds_when_duplicate_gossip_update_from_server() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with duplicate gossip

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given-when-then-should is general unit-test naming convention, we don't follow it as such.
but it was derived from that. happy to change if you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the grammar worked with when/then, sure, but in its current configuration it doesn't, and making it so would be very verbose

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively, update_succeeds_when_duplicate_gossip_is_applied

arik-so
arik-so previously approved these changes Oct 12, 2022
tnull
tnull previously approved these changes Oct 12, 2022
dunxen
dunxen previously approved these changes Oct 13, 2022
@TheBlueMatt
Copy link
Collaborator

Hmm, I think we also need to make sure gossip doesn't fail if we hit the neighboring case: return Err(LightningError{err: "Update older than last processed update".to_owned(), action: ErrorAction::IgnoreAndLog(Level::Gossip)}); We can probably just change that to IgnoreDuplicateGossip, cause that's probably what it is.

@G8XSU
Copy link
Contributor Author

G8XSU commented Oct 17, 2022

I will change that to IgnoreDuplicateGossip,
But we should still probably ignore all of "IgnoreDuplicate | IgnoreAndLog | IgnoreError" for RGS-client.

@G8XSU G8XSU dismissed stale reviews from dunxen and tnull via dadb1d3 October 17, 2022 21:02
@jkczyz jkczyz changed the title Ignore Duplicate Gossip Error while updating networkGraph from RGS #1746 Ignore Duplicate Gossip Error while updating networkGraph from RGS Oct 17, 2022
Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I'd avoid linking to PR or issues's in commit messages. Otherwise, it will get an update every time you force push.

@jkczyz
Copy link
Contributor

jkczyz commented Oct 17, 2022

FYI, I'd avoid linking to PR or issues's in commit messages. Otherwise, it will get an update every time you force push.

Also, please expand the commit message with rationale as it should be understandable independent of github.

While applying gossip info from RGS-server, number of harmless
errors might arise which should be ignored. E.g. client should not
fail if there is a duplicate gossip for same channel or duplicate
update.
@G8XSU G8XSU requested a review from jkczyz October 18, 2022 19:16
@G8XSU G8XSU added this to the 0.0.112 milestone Oct 18, 2022
@jkczyz jkczyz merged commit c06ab02 into lightningdevkit:main Oct 19, 2022
match network_graph.update_channel_unsigned(&synthetic_update) {
Ok(_) => {},
Err(LightningError { action: ErrorAction::IgnoreDuplicateGossip, .. }) => {},
Err(LightningError { action: ErrorAction::IgnoreAndLog(_), .. }) => {},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This starts ignoring errors like channel_update is older than two weeks old which we probably don't want to ignore - that indicates the RGS data is bunk.

Copy link
Contributor

@tnull tnull Oct 19, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, but couldn't a similar argument be made for the whole PR: that the dataset is probably bunk when we receive a duplicate update in the first place? So why ignore one but not the other error indicating "something is wrong with this data"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you can get updates from payment failures (or p2p gossip sync, if you're doing both), so its totally reasonable to have an update out-of-band from some source other than RGS.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think on client side we should ignore those errors.
"There is something wrong with this server update" => "we should ignore and not fail client side"
Server-side should be responsible for its own correctness and should have its own checks in place.

IIUC, ignoring the update leaves the network graph still functional instead of erroring out and stopping on some mistake from server. (potentially out-of-date is better than errors until server fixes the potential bug)
And the way rgs currently works with latest_seen_timestamp and backdate_timestamp, it makes sense to ignore the noisy updates ?

#1785 => As of now, RGS-client doesn't accept stale-update, it ignores the stale-update.
#1784 => As of now, RGS-client ignores the update for channel which doesn't exist and does not fail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because all of the RGS updates get applied with the same timestamps there's nothing to apply, they'll all fail. I'd much rather fail and get that feedback than just silently continue. For example if someone hosts their own server and it starts failing best to give them errors rather than hope they have monitoring in place.

As for 1784, what an I reading wrong? The code snippet I copied into the issue seems to be to be returning an error if a channel is missing. I don't see any other code to ignore such cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"get that feedback than just silently continue" -> does debug logging with outdated network graph as feedback work? or we want to surface errors to users?

1784 -> Clarifying, i meant after this PR, we are catching and ignoring "IgnoreError" as well right? and it won't fail.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"get that feedback than just silently continue" -> does debug logging with outdated network graph as feedback work? or we want to surface errors to users?

Hmm, I'd imagine we'd want to surface an error, but if we want to avoid punishing users for the server being poorly monitored I'd be okay with debug logging too.

1784 -> Clarifying, i meant after this PR, we are catching and ignoring "IgnoreError" as well right? and it won't fail.

Ah, sorry for the confusion, no that error is in lightning-rapid-gossip-sync/src/processing.rs, ie it is not ignored, because we generate it locally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RGS Update had same timestamp as last processed update shouldn't be an error
7 participants