Skip to content

Payment Retries in ChannelManager #1096

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

valentinewallace
Copy link
Contributor

@valentinewallace valentinewallace commented Sep 24, 2021

Seeking concept ACKs on the new pending_outbound_payments format/new enum @jkczyz @TheBlueMatt

Also left a comment on an open question about retrying payments that have been completely removed.

TODOs

  • test retrying a completely failed payment
  • check total_msat is sane on retry
  • test retrying w/ bad payment_hash, amt, etc
  • impl expiring outbound payments

@codecov
Copy link

codecov bot commented Sep 24, 2021

Codecov Report

Merging #1096 (3e6297a) into main (5811f5b) will increase coverage by 0.41%.
The diff coverage is 96.27%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1096      +/-   ##
==========================================
+ Coverage   90.73%   91.14%   +0.41%     
==========================================
  Files          65       65              
  Lines       34318    37177    +2859     
==========================================
+ Hits        31137    33884    +2747     
- Misses       3181     3293     +112     
Impacted Files Coverage Δ
lightning/src/ln/channelmanager.rs 87.28% <93.75%> (+2.35%) ⬆️
lightning/src/ln/functional_tests.rs 97.41% <98.66%> (-0.04%) ⬇️
lightning/src/ln/channel.rs 91.48% <100.00%> (+2.83%) ⬆️
lightning/src/chain/mod.rs 55.55% <0.00%> (-3.27%) ⬇️
lightning-block-sync/src/lib.rs 93.49% <0.00%> (-1.69%) ⬇️
lightning/src/ln/script.rs 93.06% <0.00%> (-0.69%) ⬇️
lightning-net-tokio/src/lib.rs 76.69% <0.00%> (-0.59%) ⬇️
lightning/src/util/ser_macros.rs 86.51% <0.00%> (-0.33%) ⬇️
lightning-block-sync/src/http.rs 93.30% <0.00%> (-0.22%) ⬇️
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5811f5b...3e6297a. Read the comment docs.

}

impl PendingOutboundPayment {
fn remove(&mut self, session_priv: &[u8; 32]) -> bool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, match the hashset api, slick, i like it.

}
} else {
// XXX they could retry with a bogus amt/hash/secret/payment_id here and we wouldn't catch it atm
// should we keep payments in pending_outbound_payments until they succeed/expire/manually removed?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I was kinda figuring we'd require users just call send_payment again, though I admit it would be kinda slick to let them just call the same method here. I guess we could do something like time out pending all-paths-failed payments after, eg, 3 blocks or so? Do you have an opinion @jkczyz?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I figured once a payment has failed (and thus ChannelManager has been cleaned up), retrying would just involve calling send_payment again as is currently done in #1059. So the retry interface would be specific to retrying a path failure for MPP.

Originally, I hoped it could be a single interface across any send/retry scenario. See conversation here: #1053 (comment). But that seemed to not be ideal.

If we could have a single call for any retry scenario, that would be nice. It mostly just prevents us from needing to switch on all_paths_failed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I was thinking we'd have separate interfaces, but maybe the memory usage is not so bad? Like, we'd have a bunch of memory used for failed payments, but we could time-box it. For high-volume scenarios it could be a real concern, but we'd clean up any memory (a) after the payment succeeds, even across a different path, or (b) after some time. Maybe we can figure out how to address the DoS later?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could parameterize send_payment further by the invoice expiry, and remove pending outbounds at min(invoice_expiry, 3 blocks) maybe?

In any case, given this week's a bit busy, I'm in favor of punting complete payment retries to follow-up

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In any case, given this week's a bit busy, I'm in favor of punting complete payment retries to follow-up

Well we either have to handle it here or in #1059, both for 0.0.102. I suppose we can punt "tightly limit expiry time" to a later release, even if we do expire here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I meant punt complete payment retries via the retry_payment API. I think #1059 already does complete payment retries, just via send_payment

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I meant punt complete payment retries via the retry_payment API. I think #1059 already does complete payment retries, just via send_payment

Right, my point was its probably ~the same amount of code to handle switching between retry_payment and send_payment in #1059 as it is to just call retry_payment instead of send_payment and not remove the payment entry until a payment succeeds in this PR :)

Copy link
Contributor Author

@valentinewallace valentinewallace Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

> not remove the payment entry until a payment succeeds in this PR :)
Hmm, so when I keep failed payments, then retry a payment with all_paths_failed, it fails due to we didn't have a corresponding inbound payment (on the receiver end). Gotta make it symmetrical (i.e. keep failed payments on receiver end as well), I suppose?
Actually, this seems like it could be an issue for #1059 as well iiuc

EDIT: disregard, should never retry if rejected_by_dest anyway

Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming looks good.

Comment on lines 5265 to 5414
(1, pending_outbound_payments, required),
(1, pending_outbound_payments_compat_2, required),
(3, pending_outbound_payments, required),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will changing the order break the format?

Copy link
Contributor Author

@valentinewallace valentinewallace Sep 28, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, if e.g. in 0.0.102 we change the hashmap serialization format that corresponds to the 1 tlv type here, 0.0.101 clients will fail to deserialize it (hope that answers)

@valentinewallace valentinewallace force-pushed the 2021-09-mpp-retries branch 4 times, most recently from e5654b8 to c4ac620 Compare September 28, 2021 01:43
@valentinewallace valentinewallace marked this pull request as ready for review September 28, 2021 22:32
Copy link
Collaborator

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

Leftover from previous PR Jeff feedback.

Useful in upcoming commits as we'll expose this to users for payment retries
Used in upcoming commits for retries
We want to reuse send_payment internal functions for retries,
so some need to now be parameterized by PaymentId to avoid
generating a new PaymentId on retry
Retrying a partial payment means send_payment_internal needs to be parameterized
by a total payment amount, else 'HTLC values do not match' errors
@valentinewallace valentinewallace force-pushed the 2021-09-mpp-retries branch 2 times, most recently from 58afd58 to 977dbf8 Compare September 28, 2021 23:55
@TheBlueMatt TheBlueMatt added this to the 0.0.102 milestone Sep 29, 2021
@valentinewallace valentinewallace force-pushed the 2021-09-mpp-retries branch 2 times, most recently from 98a225e to 8460a97 Compare September 29, 2021 23:47
Copy link
Contributor

@jkczyz jkczyz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM module some minor comments

This is because we want the ability to retry completely failed
payments.

Upcoming commits will remove these payments on timeout to prevent
DoS issues

Also test that this removal allows retrying single-path payments
@TheBlueMatt TheBlueMatt merged commit 7aa2cac into lightningdevkit:main Sep 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants