
Fill out failure codes for onion packet #157


Closed

Conversation

@yuntai (Contributor) commented Sep 6, 2018

Partial patch for #146.
Adds returning of some errors and fixes the failure message (additional parameters for UPDATE).
Moving on to work on handling returned failure codes (and looping them back to the router).

@TheBlueMatt (Collaborator) left a comment:

You can ignore the full_stack_target test failure; I'll fix it pre-merge, it should be easy to fix.

break Some(("CLTV expiry is too close", 0x1000 | 14, self.get_channel_update(chan).unwrap()));
}
/*
if chan.is_disabled() {
@TheBlueMatt (Collaborator):

Oops, I was thinking more of !is_live() in this case. Maybe just return 0x1000|20 for that?
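For concreteness, a minimal sketch of what that might look like, assuming a Channel::is_live() accessor and reusing the break Some((...)) pattern from the surrounding code; the error string is just a placeholder, not the wording in the tree:

    if !chan.is_live() {
        // channel_disabled (UPDATE|20 = 0x1000 | 20), with the current channel_update attached
        break Some(("Forwarding channel is not in a usable/live state", 0x1000 | 20, self.get_channel_update(chan).unwrap()));
    }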

break Some(("Forwarding node has tampered with the intended HTLC values or origin node has an obsolete cltv_expiry_delta", 0x1000 | 13, self.get_channel_update(chan).unwrap()));
}
let cur_height = self.latest_block_height.load(Ordering::Acquire) as u32 + 1;
if msg.cltv_expiry <= cur_height + CLTV_EXPIRY_DELTA as u32 { // expiry_too_soon
@TheBlueMatt (Collaborator):

Hmm, I don't think we need the full CLTV_EXPIRY_DELTA here. The spec doesn't actually say what to use here (as far as I can see) and appears to assume that meeting the min_final_cltv_expiry will suffice for every node on the path. I think all we really care about here is that we don't end up in a situation where we early-force-close a channel (ChannelMonitor::would_broadcast_at_height) because of a much-too-soon CLTV (implying, obviously, that no expiration is already past). This means:

  • for the inbound HTLC (aka msg.cltv_expiry), that we don't end up learning the payment preimage only within channelmonitor::CLTV_CLAIM_BUFFER of its expiry. This implies that the outbound HTLC CLTV must be at least CLTV_CLAIM_BUFFER before the inbound HTLC's (trivially true since CLTV_EXPIRY_DELTA is greater, though maybe we could add a comment/static assert indicating that this is a requirement), and that the peer is able to do the update_fulfill_htlc/commitment/revoke dance within CLTV_EXPIRY_DELTA - CLTV_CLAIM_BUFFER blocks (which should never be an issue, though, again, we should probably add a comment noting that CLTV_EXPIRY_DELTA should always be a block or three greater than CLTV_CLAIM_BUFFER).
  • for the outbound HTLC (aka outgoing_cltv_value), that we receive an update_fail_htlc and do the commitment/revoke dance prior to the HTLC's expiration minus CLTV_CLAIM_BUFFER. Now, that isn't actually possible to ensure right now, but I think we can change the check in ChannelMonitor to only force-close outbound if we're a block or three after the actual expiration, as long as CLTV_EXPIRY_DELTA is sufficiently larger than CLTV_CLAIM_BUFFER (or, I think, at least 2x CLTV_CLAIM_BUFFER, because that will give us CLTV_CLAIM_BUFFER blocks to get it confirmed, which is the meaning of CLTV_CLAIM_BUFFER).

All that to say: if we tweak ChannelMonitor::would_broadcast_at_height and ChannelMonitor::block_connected, we can probably just check that outgoing_cltv_value is at least the current height plus a few blocks and not worry about anything else, assuming the next hop will either immediately fail it (due to it being too close to their desired final_cltv_expiry, or them not being able to meet their own CLTV_EXPIRY_DELTA) or fail it later because it expires soon.
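A rough sketch of that simplified forwarding check, under the assumptions above; outgoing_cltv_value stands for the value parsed from the onion payload, and the buffer constant and error string are placeholders rather than code in the tree:

    // Only reject if the outgoing CLTV is already at/near the current height; everything
    // else is left to the next hop (or to the relaxed ChannelMonitor checks described above).
    const FEW_BLOCKS_BUFFER: u32 = 3; // placeholder for "a few blocks"
    let cur_height = self.latest_block_height.load(Ordering::Acquire) as u32 + 1;
    if outgoing_cltv_value <= cur_height + FEW_BLOCKS_BUFFER { // expiry_too_soon (UPDATE|14)
        break Some(("Outgoing CLTV value is too soon", 0x1000 | 14, self.get_channel_update(chan).unwrap()));
    }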

@yuntai (Contributor, Author), Oct 7, 2018:

> for the outbound HTLC (aka outgoing_cltv_value), that we receive an update_fail_htlc and do the commitment/revoke dance prior to the HTLC's expiration minus CLTV_CLAIM_BUFFER.

I don't clearly understand this sentence.

Did you mean the outbound HTLC needs to be force-closed if the update_fail_htlc/commitment dance has not happened prior to the corresponding incoming HTLC's expiration minus CLTV_CLAIM_BUFFER?

Ah, or did you mean the outbound HTLC needs to be force-closed if the update_fail_htlc/commitment dance has not happened prior to the HTLC's own expiration minus CLTV_CLAIM_BUFFER, because we know the downstream node will force-close the channel once they are within their CLTV_CLAIM_BUFFER prior to the HTLC's expiration, so we need to force-close and try to fetch the preimage from the chain?
(But with CLTV_EXPIRY_DELTA >> CLTV_CLAIM_BUFFER, is it sufficient to just time out at the actual expiry (+1), since we have ample time before our deadline?)

@TheBlueMatt (Collaborator):

The second, kinda. We need to force-close the channel if the upstream node hasn't done the update_fail_htlc/commitment_signed dance in enough time for us to claim the corresponding incoming HTLC. Right now we force-close if they haven't done the update_fail dance by a few blocks before the HTLC expires, but that seems like overkill; instead, I think we should change it to only force-close if they haven't done the update_fail dance by a few blocks after the HTLC expires, as long as we do it at least CLTV_CLAIM_BUFFER blocks prior to the incoming HTLC expiring, which we can guarantee by ensuring that CLTV_EXPIRY_DELTA is at least 2x as large as CLTV_CLAIM_BUFFER.
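A sketch of that relaxed rule, just to pin down the arithmetic; the function name and the grace-period constant are hypothetical, only CLTV_CLAIM_BUFFER and CLTV_EXPIRY_DELTA are existing names:

    // "A few blocks" of grace after the outbound HTLC expires before we go on chain.
    const OUTBOUND_GRACE_BLOCKS: u32 = 3; // hypothetical value

    // Outbound HTLC: only force-close once the peer still hasn't done the update_fail
    // dance a few blocks *after* the HTLC's expiry.
    fn outbound_htlc_needs_broadcast(height: u32, outbound_cltv_expiry: u32) -> bool {
        height >= outbound_cltv_expiry + OUTBOUND_GRACE_BLOCKS
    }

    // This still happens at least CLTV_CLAIM_BUFFER blocks before the corresponding
    // incoming HTLC expires as long as CLTV_EXPIRY_DELTA >= 2 * CLTV_CLAIM_BUFFER
    // (and the grace period stays below CLTV_CLAIM_BUFFER), which could be enforced
    // with a static/debug assertion:
    // debug_assert!(CLTV_EXPIRY_DELTA as u32 >= 2 * CLTV_CLAIM_BUFFER);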

else if code == 0x1000 | 13 {
res.extend_from_slice(&byte_utils::be32_to_array(msg.cltv_expiry));
}
res.extend_from_slice(&chan_update.encode_with_len()[..]);
@TheBlueMatt (Collaborator):

I don't think we should include the channel_update if we are returning the amt_to_forward or cltv_expiry.

@yuntai (Contributor, Author):

UPDATE|11, 12 and 13 all seem to include a channel_update in the RFC.
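For reference, as I read BOLT 4, each of those three failure payloads carries the offending value followed by the channel_update. A sketch in the style of the snippet quoted above; msg.amount_msat and byte_utils::be64_to_array are assumed to be available alongside be32_to_array:

    if code == 0x1000 | 11 || code == 0x1000 | 12 {
        // amount_below_minimum / fee_insufficient: [u64: htlc_msat] ...
        res.extend_from_slice(&byte_utils::be64_to_array(msg.amount_msat));
    } else if code == 0x1000 | 13 {
        // incorrect_cltv_expiry: [u32: cltv_expiry] ...
        res.extend_from_slice(&byte_utils::be32_to_array(msg.cltv_expiry));
    }
    // ...followed in all three cases by [u16: len] [channel_update]
    res.extend_from_slice(&chan_update.encode_with_len()[..]);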

@yuntai (Contributor, Author) commented Oct 9, 2018

WIP. Started implementing handling of onion failure codes in the origin node.

@yuntai force-pushed the 201809-issue146-failurenonion branch 2 times, most recently from 924ebde to 37d1dac on October 14, 2018 at 14:02
@yuntai (Contributor, Author) commented Oct 15, 2018

Still WIP, but the current patch contains some breaking changes.

  • Add channel_update_timestamp to RouteHop, storing the timestamp of the chan_update used for that hop's outbound channel when the route is calculated, so it can be compared against a new chan_update coming back with an onion failure message.
    (Maybe it'd make more sense in Route as a separate vec.)
  • Currently, the onion failure message is processed in internal_update_fail_htlc() (understandably, as we want to update the network view as soon as we know about node/channel failures), and that is where it's decided whether the payment can be retried. But, as I understand it, we don't want to fire a PaymentFailed event until the HTLC is revoked. I wanted to keep this boolean (payment_retryable) around until the RAA, but I'm split-minded on where to keep it. (I feel better about keeping HTLCSource immutable, but don't feel like adding a new hashmap.)
  • Add a PaymentData field to HTLCSource to keep the htlc_msat forwarded to the first hop and the block_height when the route was calculated, so incoming_htlc_msat and CLTV can be reconstructed forward hop-by-hop when handling onion failure messages (see the sketch after this list).
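Roughly, those two additions would have a shape like the following sketch; first_hop_htlc_msat and route_block_height are working names used here for illustration only, not the final code:

    pub struct RouteHop {
        // ...existing fields...
        /// Timestamp of the channel_update this hop's outbound channel was routed on,
        /// compared against any channel_update returned in an onion failure.
        pub channel_update_timestamp: u32,
    }

    /// Carried in HTLCSource so incoming_htlc_msat and cltv_expiry can be recomputed
    /// hop-by-hop when an onion failure message comes back.
    pub struct PaymentData {
        /// The htlc_msat we forwarded to the first hop.
        pub first_hop_htlc_msat: u64,
        /// The block height at the time the route was calculated.
        pub route_block_height: u32,
    }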

@TheBlueMatt (Collaborator) left a comment:

This is awesome work, thanks. I'll review in a bit more detail later, but in the meantime can you take the change at TheBlueMatt@2018-10-157-changes and then make the changes I left comments on?

@@ -291,6 +301,8 @@ pub struct ChannelManager {
}

const CLTV_EXPIRY_DELTA: u16 = 6 * 24 * 2; //TODO?
const CLTV_FAR_FAR_AWAY: u16 = 6 * 24 * 7; //TODO?
const FINAL_NODE_TIMEOUT: u16 = 3; //TODO?
@TheBlueMatt (Collaborator):

I don't think this needs to be a new constant; channelmonitor::CLTV_CLAIM_BUFFER + HTLC_FAIL_TIMEOUT_BLOCKS should be right, I think.

break Some(("Forwarding node has tampered with the intended HTLC values or origin node has an obsolete cltv_expiry_delta", 0x1000 | 13, self.get_channel_update(chan).unwrap()));
}
let cur_height = self.latest_block_height.load(Ordering::Acquire) as u32 + 1;
if msg.cltv_expiry <= cur_height + 3 as u32 { // expiry_too_soon
@TheBlueMatt (Collaborator):

I think what we want here is CLTV_CLAIM_BUFFER + HTLC_FAIL_TIMEOUT_BLOCKS (aka "we want to have at least HTLC_FAIL_TIMEOUT_BLOCKS to fail prior to going on chain CLTV_CLAIM_BUFFER blocks before expiration"). Probably add that text in a comment here as well :p.
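Concretely, a sketch of that check with the suggested comment, assuming both constants are in scope here (casts adjusted as needed if they are not already u32):

    // We want at least HTLC_FAIL_TIMEOUT_BLOCKS to fail the HTLC backwards before we would
    // go on chain, which ChannelMonitor does CLTV_CLAIM_BUFFER blocks before expiration.
    if msg.cltv_expiry <= cur_height + CLTV_CLAIM_BUFFER + HTLC_FAIL_TIMEOUT_BLOCKS { // expiry_too_soon
        break Some(("CLTV expiry is too close", 0x1000 | 14, self.get_channel_update(chan).unwrap()));
    }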

@yuntai force-pushed the 201809-issue146-failurenonion branch from 37d1dac to 7ec3073 on October 17, 2018 at 08:03
@yuntai (Contributor, Author) commented Oct 17, 2018

Cherry-picked the commit and fixed the cltv_expiry checks. Working on writing the test, so this will be ready for review soon.

@yuntai (Contributor, Author) commented Oct 18, 2018

Still WIP. Pushed just in case there's any feedback on changing the structures around a bit. Also moved PaymentFailed from fail_htlc_backwards_internal() to update_fail_htlc(), since it's okay for now per #209.

@TheBlueMatt (Collaborator):

Oops, no, we definitely can't move PaymentFailed to update_fail_htlc; if anything, PaymentSent needs to move the other way, out of update_fulfill_htlc. If a node sends an update_fail_htlc and then disconnects, they can still reconnect and claim the payment. If they do the same thing for update_fulfill then we just get free money, so it's not as big a deal (aside from the duplicative event).

@TheBlueMatt (Collaborator):

Sorry about the conflict due to #211; it's a trivial rebase, though.

@yuntai (Contributor, Author) commented Oct 18, 2018

Hmm, right. Then I guess I can move onion failure message processing to after the commitment_signed_dance. Actually, there's a good reason for that which I just thought of: to prevent a DoS attack on our routing table from an attacker sending bogus update_fails!

yuntai and others added 8 commits October 22, 2018 01:36
  • …to compare it against the timestamp of the channel_update coming with an onion failure message
  • Add PaymentData to HTLCSource to reconstruct incoming_htlc_msat and cltv for each node when an onion failure message is returned. These values are used to check whether an updated chan_update is valid.
  • Refactor onion failure code and handle more error cases. See new comments in code for more details.
@TheBlueMatt (Collaborator):

@yuntai, what's the status of this? This will probably conflict up and down with #217. I'm happy to wait a few days on this if you want to do some quick back-and-forth and get this one in first, but otherwise it will probably need some rebasing.

  • Add retryable and error_code fields to PaymentFailed event
  • Add RouteUpdate event a la lightningdevkit#217 containing msgs::HTLCFailChannelUpdate
  • Move msgs::HTLCFailChannelUpdate from update_fail_htlc's return value to the payload of RouteUpdate fired inside fail_htlc_backwards_internal
  • Some fixes in decode_update_add_htlc_onion
  • Move processing of onion failures to fail_htlc_backwards_internal
  • Test for process_onion_failure
@yuntai force-pushed the 201809-issue146-failurenonion branch from 77662bd to a69e5bb on October 22, 2018 at 09:41
@yuntai (Contributor, Author) commented Oct 22, 2018

I updated https://github.com/yuntai/rust-lightning/tree/201809-issue146-failurenonion, but the PR was not automatically updated.
Oh, GitHub is back, yay!

Still WIP; I need to fill out the rest of the test cases. The current version of the patch is mergeable, and I expect the remaining changes will only be inside the tests. I think I could open another PR to continue working on this once it's rebased and merged with that gigantic patch you have.

@TheBlueMatt (Collaborator) commented Oct 22, 2018

Oops, I don't think the HTLCFailChannelUpdate needs to be out-of-band; there's no requirement that it be handled in any specific order, and doing it that way is a bit less performant. I'll rebase/review/revert that bit now unless you're around and want to do it yourself.

Err, I take that back. I guess it's nice that we don't handle an update_fail_htlc onion error and take action based on it if it might get replaced with a new one on disconnect/reconnect. In any case, I'll rebase/review as-is.

@TheBlueMatt (Collaborator):

Will take as #219.
