Fill out failure codes for onion packet #157
Conversation
You can ignore the full_stack_target test failure; I'll fix it pre-merge, it should be an easy fix.
src/ln/channelmanager.rs (outdated)
break Some(("CLTV expiry is too close", 0x1000 | 14, self.get_channel_update(chan).unwrap())); | ||
} | ||
/* | ||
if chan.is_disabled() { |
Oops, I was thinking more !is_live() in this case. Maybe just return 0x1000|20 for that?
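For illustration, a rough sketch of that check, mirroring the neighboring ones (the error string is illustrative; 0x1000|20 is BOLT 4's channel_disabled):

```rust
// Sketch only, not the PR's final code: fail with channel_disabled (UPDATE|20)
// when the channel exists but is not currently usable.
if !chan.is_live() {
	break Some(("Channel is not in a usable state", 0x1000 | 20, self.get_channel_update(chan).unwrap()));
}
```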
src/ln/channelmanager.rs (outdated)
break Some(("Forwarding node has tampered with the intended HTLC values or origin node has an obsolete cltv_expiry_delta", 0x1000 | 13, self.get_channel_update(chan).unwrap())); | ||
} | ||
let cur_height = self.latest_block_height.load(Ordering::Acquire) as u32 + 1; | ||
if msg.cltv_expiry <= cur_height + CLTV_EXPIRY_DELTA as u32 { // expiry_too_soon |
Hmm, I don't think we need the full CLTV_EXPIRY_DELTA here. The spec doesn't actually say what to use here (as far as I can see) and appears to assume that meeting the min_final_cltv_expiry will suffice for every node on the path. I think all we really care about here is that we don't end up in a situation where we early-force-close a channel (ChannelMonitor::would_broadcast_at_height) because of a much-too-soon CLTV (implying, obviously, that no expiration is already past). This means:
- for the inbound HTLC (aka msg.cltv_expiry), that we don't only learn the payment preimage within the last channelmonitor::CLTV_CLAIM_BUFFER blocks before it expires. This implies that the outbound HTLC's CLTV must be at least CLTV_CLAIM_BUFFER before the inbound HTLC's (trivially true due to CLTV_EXPIRY_DELTA being greater, though maybe we could add a comment/static assert indicating that it is a requirement) and that the peer is able to do the update_fulfill_htlc/commitment/revoke dance within CLTV_EXPIRY_DELTA - CLTV_CLAIM_BUFFER blocks (which should never be an issue, though, again, we should probably add a comment noting that CLTV_EXPIRY_DELTA should always be a block or three greater than CLTV_CLAIM_BUFFER).
- for the outbound HTLC (aka outgoing_cltv_value), that we receive an update_fail_htlc and do the commitment/revoke dance prior to the HTLC's expiration minus CLTV_CLAIM_BUFFER. Now that isn't actually possible to ensure right now, but I think we can change the check in ChannelMonitor to only force-close an outbound HTLC if we're a block or three after the actual expiration, as long as CLTV_EXPIRY_DELTA is sufficiently larger than CLTV_CLAIM_BUFFER (or, I think, at least 2x CLTV_CLAIM_BUFFER, because that will give us CLTV_EXPIRY_DELTA blocks to get it confirmed, which is the meaning of CLTV_EXPIRY_DELTA).
All that to say, if we tweak ChannelMonitor::would_broadcast_at_height and ChannelMonitor::block_connected, we can probably just check that outgoing_cltv_value is at least the current height + a few blocks and not worry about anything else, assuming the next hop will either immediately fail it (due to it being too close to their desired final_cltv_expiry or to their not being able to meet their own CLTV_EXPIRY_DELTA) or they will fail it because it expires soon.
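To make that last point concrete, a rough sketch under those assumptions (that would_broadcast_at_height/block_connected are relaxed as described); the next_hop_data field access and the use of HTLC_FAIL_TIMEOUT_BLOCKS as the "few blocks" margin are assumptions, not the PR's code:

```rust
// Sketch only: minimal forward-side check on the outgoing CLTV, assuming ChannelMonitor
// no longer force-closes early. "A few blocks" is written as HTLC_FAIL_TIMEOUT_BLOCKS here.
let cur_height = self.latest_block_height.load(Ordering::Acquire) as u32 + 1;
if next_hop_data.outgoing_cltv_value <= cur_height + HTLC_FAIL_TIMEOUT_BLOCKS as u32 {
	// Fail back with expiry_too_soon (0x1000 | 14) plus our channel_update; the next hop
	// remains responsible for its own final_cltv_expiry / CLTV_EXPIRY_DELTA checks.
}
```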
"for the outbound HTLC (aka outgoing_cltv_value), that we receive an update_fail_htlc and do the commitment/revoke dance prior to the HTLC's expiration minus CLTV_CLAIM_BUFFER."
I don't clearly understand this sentence.
Did you mean the outbound HTLC needs to be force-closed if the update_fail_htlc/commitment dance is not done prior to the corresponding incoming HTLC's expiration minus CLTV_CLAIM_BUFFER?
Ah, or did you mean the outbound HTLC needs to be force-closed if the update_fail_htlc/commitment dance is not done prior to the HTLC's own expiration minus CLTV_CLAIM_BUFFER, because we know the downstream node will force-close the channel once they are within their CLTV_CLAIM_BUFFER prior to the HTLC's expiration, so we need to force-close and try to fetch the preimage from the chain?
(But with CLTV_EXPIRY_DELTA >> CLTV_CLAIM_BUFFER, it's sufficient to just time out at the actual expiry (+1) since we have ample time before our deadline?)
The second, kinda. We need to force-close the channel if the upstream node hasn't done the update_fail_htlc/commitment_signed dance in enough time for us to claim the corresponding incoming HTLC. Right now we force-close if they haven't done the update_fail dance at least a few blocks before the HTLC expires, but that seems like overkill; instead I think we should change it to only force-close if they still haven't done the update_fail dance a few blocks after the HTLC expires, as long as we do it at least CLTV_CLAIM_BUFFER blocks prior to the incoming HTLC expiring, which we can ensure by making CLTV_EXPIRY_DELTA at least 2x as large as CLTV_CLAIM_BUFFER.
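A sketch of how that invariant could be pinned down at compile time (the constant names follow this thread; the values are only illustrative, and the underflow-check pattern is an assumption, not necessarily what the crate does):

```rust
// Illustrative values only: require CLTV_EXPIRY_DELTA >= 2 * CLTV_CLAIM_BUFFER, so we can
// wait a few blocks past the outbound HTLC's expiry for the update_fail dance and still
// have CLTV_CLAIM_BUFFER blocks left to claim the corresponding inbound HTLC on chain.
const CLTV_CLAIM_BUFFER: u32 = 6;      // example value (lives in channelmonitor)
const CLTV_EXPIRY_DELTA: u32 = 6 * 12; // example value
// This constant underflows, and thus fails to compile, if the invariant is violated:
#[allow(dead_code)]
const CHECK_CLTV_EXPIRY_SANITY: u32 = CLTV_EXPIRY_DELTA - 2 * CLTV_CLAIM_BUFFER;
```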
src/ln/channelmanager.rs (outdated)
else if code == 0x1000 | 13 {
	res.extend_from_slice(&byte_utils::be32_to_array(msg.cltv_expiry));
}
res.extend_from_slice(&chan_update.encode_with_len()[..]);
I don't think we should include the channel_update if we are returning the amt_to_forward or cltv_expiry.
UPDATE|11, 12 and 13 seem to include channel_update in the RFC.
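For reference, a sketch of what the BOLT 4 layout implies for these codes, extending the snippet above (byte_utils::be64_to_array and the msg field names are assumptions about this codebase):

```rust
// Sketch: UPDATE-class failures carry a code-specific field, then a length-prefixed channel_update.
let mut res = Vec::new();
if code == 0x1000 | 11 || code == 0x1000 | 12 {
	// amount_below_minimum / fee_insufficient: echo the offending htlc_msat
	res.extend_from_slice(&byte_utils::be64_to_array(msg.amount_msat));
} else if code == 0x1000 | 13 {
	// incorrect_cltv_expiry: echo the offending cltv_expiry
	res.extend_from_slice(&byte_utils::be32_to_array(msg.cltv_expiry));
}
// UPDATE codes then append the channel_update itself:
res.extend_from_slice(&chan_update.encode_with_len()[..]);
```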
Force-pushed from 1fe8418 to e9b02dc.
WIP. Started implementing handling of onion failure codes in the origin node.
Force-pushed from 924ebde to 37d1dac.
Still WIP, but the current patch contains some breaking changes.
This is awesome work, thanks. I'll review in a bit more detail later, but in the meantime, can you take the change at TheBlueMatt@2018-10-157-changes and then make the changes I left comments on?
src/ln/channelmanager.rs (outdated)
@@ -291,6 +301,8 @@ pub struct ChannelManager {
}

const CLTV_EXPIRY_DELTA: u16 = 6 * 24 * 2; //TODO?
const CLTV_FAR_FAR_AWAY: u16 = 6 * 24 * 7; //TODO?
const FINAL_NODE_TIMEOUT: u16 = 3; //TODO?
I don't think this needs to be a new constant; channelmonitor::CLTV_CLAIM_BUFFER + HTLC_FAIL_TIMEOUT_BLOCKS should be right, I think.
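Roughly, the final-node margin then falls out of the two existing constants rather than being a separate tunable; a sketch with illustrative values (not the crate's actual numbers, and the derived-constant name is hypothetical):

```rust
// Illustrative values only:
const CLTV_CLAIM_BUFFER: u32 = 6;        // lives in channelmonitor; blocks needed to claim on chain
const HTLC_FAIL_TIMEOUT_BLOCKS: u32 = 3; // blocks we allow ourselves to fail an HTLC backwards
// Instead of a separate FINAL_NODE_TIMEOUT, use their sum as the minimum remaining CLTV:
const MIN_FINAL_CLTV_MARGIN: u32 = CLTV_CLAIM_BUFFER + HTLC_FAIL_TIMEOUT_BLOCKS;
```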
src/ln/channelmanager.rs (outdated)
break Some(("Forwarding node has tampered with the intended HTLC values or origin node has an obsolete cltv_expiry_delta", 0x1000 | 13, self.get_channel_update(chan).unwrap())); | ||
} | ||
let cur_height = self.latest_block_height.load(Ordering::Acquire) as u32 + 1; | ||
if msg.cltv_expiry <= cur_height + 3 as u32 { // expiry_too_soon |
I think what we want here is CLTV_CLAIM_BUFFER + HTLC_FAIL_TIMEOUT_BLOCKS (aka "we want to have at least HTLC_FAIL_TIMEOUT_BLOCKS to fail prior to going on chain CLTV_CLAIM_BUFFER blocks before expiration"). Probably add that text in a comment here as well :p.
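A sketch of that check, reusing the message and error-return pattern from the snippets above (where CLTV_CLAIM_BUFFER lives and the exact casts are assumptions):

```rust
// We want at least HTLC_FAIL_TIMEOUT_BLOCKS to fail the HTLC backwards before we would
// have to go on chain CLTV_CLAIM_BUFFER blocks ahead of its expiration.
if msg.cltv_expiry <= cur_height + channelmonitor::CLTV_CLAIM_BUFFER as u32 + HTLC_FAIL_TIMEOUT_BLOCKS as u32 { // expiry_too_soon
	break Some(("CLTV expiry is too close", 0x1000 | 14, self.get_channel_update(chan).unwrap()));
}
```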
Force-pushed from 37d1dac to 7ec3073.
Cherry-picked the commit and fixed the cltv_expiry checks. Working on writing the test, so this will be ready for a review soon.
Still WIP. Pushed just in case there's any feedback on changing the structure around a bit. Also moved PaymentFailed from fail_htlc_backwards_internal() to update_fail_htlc() since it's okay for now per #209.
Oops, no, we definitely can't move PaymentFailed to update_fail_htlc; if anything, PaymentSent needs to move the other way, out of update_fulfill_htlc. If a node sends an update_fail_htlc and then disconnects, they can still reconnect and claim the payment. If they do the same thing for update_fulfill then we just get free money, so it's not as big a deal (aside from the duplicative event).
Sorry about the conflict due to #211; it's a trivial rebase, though.
Hmm, right. Then I guess I can move onion failure message processing after commitment_signed_dance. |
to compare it against the timestamp of the channel_update coming with the onion failure message
Add PaymentData to HTLCSource to reconstruct incoming_htlc_msat and cltv for each node when an onion failure message is returned. These values are used to check whether an updated chan_update is valid. Refactor the onion failure code and handle more error cases.
See new comments in code for more details
Add retryable and error_code fields to the PaymentFailed event. Add a RouteUpdate event ala lightningdevkit#217 containing msgs::HTLCFailChannelUpdate. Move msgs::HTLCFailChannelUpdate from update_fail_htlc's return value to the payload of RouteUpdate fired inside fail_htlc_backwards_internal. Some fixes in decode_update_add_htlc_onion. Move processing of onion failures to fail_htlc_backwards_internal. Test for process_onion_failure.
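A rough sketch of the event shapes that commit message describes (field names and types are assumptions taken from the description, not necessarily the final API):

```rust
// Hypothetical sketch of the new/extended events:
pub enum Event {
	/// Now fired from fail_htlc_backwards_internal rather than from update_fail_htlc.
	PaymentFailed {
		payment_hash: [u8; 32],
		/// Whether the payment may be retried over a different route.
		retryable: bool,
		/// The raw BOLT 4 failure code, when the onion error could be decoded.
		error_code: Option<u16>,
	},
	/// Carries the routing-table update that used to be update_fail_htlc's return value.
	RouteUpdate {
		update: msgs::HTLCFailChannelUpdate,
	},
}
```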
Force-pushed from 77662bd to a69e5bb.
Still WIP; I need to fill out the rest of the test cases. The current version of the patch is mergeable, and I expect that the remaining changes will only be inside the tests. I think I could open another PR to continue working on it once it's rebased and merged with that gigantic patch you have.
Err, I take that back. I guess it's nice that we don't handle an update_fail_htlc onion error and take action based on it if it might get replaced with a new one on disconnect/reconnect. In any case, I'll rebase/review as-is.
Will take as #219.
Partial patch for #146
Adds the return of some more errors and fixes the failure message encoding (additional parameters for UPDATE failures).
Moving on to work on handling returned failure codes (and looping them back to the router).