Rewrite Channel resend tracking to make it much more reliable #320

TheBlueMatt · 2019-03-07T19:30:55Z

Resending revoke_and_ack and commitment_signed (+update) messages
after monitor-update-failure or disconnection has been a highly
unreliable part of our codebase for some time (as evidenced by the
number of bugs caught in the chanmon_fail_consistency fuzz target).
This is due to its rather ad-hoc nature and tracking/behavior which
consists of checking a number of different flags to try to deduce
which messages were/were not delivered and go from there. Instead,
this commit rewrites it to simply keep track of the order messages
were generated originally, as we always resend in the
originally-generated order.

I'm anticipating this will be way more robust than the old code, in
addition to its simplicity.

Based on #314, #316, and #319. I have some tests which failed prior to this change that I need to backport from chanmon_fail_consistency, as well as #319 needs some tests.

ariard

Seems good to me, but also seems easy to mess up there, so glad to have chanmon_fail_consistency (does this last one test overlapping of commitment signed, i.e issued and received by both sides before any RAA, no sure with a quick skim on it ?)

ariard · 2019-03-25T01:54:12Z

src/ln/channel.rs

+	/// When resending CS/RAA messages on channel monitor restoration or on reconnect, we always
+	/// need to ensure we resend them in the order we originally generated them. Thus, we track
+	/// that order here by always setting this to the opposite of the message we are generating (ie
+	/// when we generate a CS, we set this to RAAFirst as we should send any pending RAAs first and


I think got the idea, but maybe could be a little be clearer, something like: when we get CS_1 from B and generate CS_2 to B, if we monitor failed after CS_2, we have to ensure to finish CS_1 channel update by sending RAA_1 first so at CS_2 generation we set to opposite

ariard · 2019-03-25T01:55:05Z

src/ln/channel.rs

+	/// need to ensure we resend them in the order we originally generated them. Thus, we track
+	/// that order here by always setting this to the opposite of the message we are generating (ie
+	/// when we generate a CS, we set this to RAAFirst as we should send any pending RAAs first and
+	/// visa-versa).


nit: vice-versa. Visa, that's the concurrence

Lol, good point, the rewrite removed it, though.

ariard · 2019-03-25T02:18:28Z

src/ln/channel.rs

@@ -2142,7 +2129,7 @@ impl Channel {
 				// When the monitor updating is restored we'll call get_last_commitment_update(),
 				// which does not update state, but we're definitely now awaiting a remote revoke
 				// before we can step forward any more, so set it here.
-				self.channel_state |= ChannelState::AwaitingRemoteRevoke as u32;
+				self.send_commitment_no_status_check()?;


Hmm reason to call send_commitment_no_status_check there is to upgrade htlc status ? (instead of self.resend_order = RAACommitmentOrder::RevokeAndACKFirst ; self.channel_state |= ChannelState::AwaitingRemoteRevoke )

Yea, otherwise the pending HTLCs wont match when we resend the CS.

Resending revoke_and_ack and commitment_signed (+update) messages after monitor-update-failure or disconnection has been a highly unreliable part of our codebase for some time (as evidenced by the number of bugs caught in the chanmon_fail_consistency fuzz target). This is due to its rather ad-hoc nature and tracking/behavior which consists of checking a number of different flags to try to deduce which messages were/were not delivered and go from there. Instead, this commit rewrites it to simply keep track of the order messages were generated originally, as we always resend in the originally-generated order. I'm anticipating this will be way more robust than the old code, in addition to its simplicity.

TheBlueMatt · 2019-03-25T21:05:27Z

Note that since restarting the fuzzers on this branch I haven't found another chanmon_fail_consistency crash, with probably 2 billion iterations in AFL etc, whereas the previous code was falling over quite quickly.

ariard · 2019-03-26T00:08:34Z

Read again updated comment, fine, take it to close 0.0.9

TheBlueMatt added this to the 0.0.9 milestone Mar 7, 2019

TheBlueMatt force-pushed the 2019-03-chan-send-rewrite branch from 674fdca to 2cf9e39 Compare March 7, 2019 21:56

ariard approved these changes Mar 25, 2019

View reviewed changes

Add log_trace on completion of monitor update restoration in Chan

63ed8fe

TheBlueMatt force-pushed the 2019-03-chan-send-rewrite branch from 2cf9e39 to e6e70ab Compare March 25, 2019 20:54

TheBlueMatt added 2 commits March 25, 2019 17:03

Fix crash in chanmon_fail_consistency due to fuzz hash collisions

c2a3fc7

TheBlueMatt force-pushed the 2019-03-chan-send-rewrite branch from e6e70ab to c2a3fc7 Compare March 25, 2019 21:04

TheBlueMatt merged commit 06eddc3 into lightningdevkit:master Apr 22, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Rewrite Channel resend tracking to make it much more reliable #320

Rewrite Channel resend tracking to make it much more reliable #320

Uh oh!

TheBlueMatt commented Mar 7, 2019

Uh oh!

ariard left a comment

Uh oh!

ariard Mar 25, 2019

Uh oh!

TheBlueMatt Mar 25, 2019

Uh oh!

ariard Mar 25, 2019

Uh oh!

TheBlueMatt Mar 25, 2019

Uh oh!

ariard Mar 25, 2019

Uh oh!

TheBlueMatt Mar 25, 2019

Uh oh!

TheBlueMatt commented Mar 25, 2019

Uh oh!

ariard commented Mar 26, 2019

Uh oh!

Uh oh!

Rewrite Channel resend tracking to make it much more reliable #320

Rewrite Channel resend tracking to make it much more reliable #320

Uh oh!

Conversation

TheBlueMatt commented Mar 7, 2019

Uh oh!

ariard left a comment

Choose a reason for hiding this comment

Uh oh!

ariard Mar 25, 2019

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt Mar 25, 2019

Choose a reason for hiding this comment

Uh oh!

ariard Mar 25, 2019

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt Mar 25, 2019

Choose a reason for hiding this comment

Uh oh!

ariard Mar 25, 2019

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt Mar 25, 2019

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt commented Mar 25, 2019

Uh oh!

ariard commented Mar 26, 2019

Uh oh!

Uh oh!