Skip to content

Handle pre-startup and closed-channel monitor update completion actions #2391

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

This handles two cases that were teed up by, but not fixed in, #2167. Built on #2364 and should be roughly the last of the fixes needed for async monitor completion until 117.

@TheBlueMatt TheBlueMatt added this to the 0.0.116 milestone Jul 5, 2023
@TheBlueMatt
Copy link
Collaborator Author

Tests in the last few commits in #2169 though still need to write some more comments in one.

Comment on lines 2366 to 2367
// disconnect the peers. Note that the fuzzer originally found this issue because
// deserializing a ChannelManager in this state causes an assertion failure.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't quite clear what issue the fuzzer had found before and if it would still find it after moving the initialization of chan_0_monitor_serialized.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when the test was written we had no concept for in-flight monitor updates - we expected the user to (magically) call monitor_update_completed on the monitor update that they'd previously marked in-flight after startup. With the change here to finally actually "complete" monitor updates that completed while we were shut down, we have to do something to mark the monitor updates as actually-not-complete, which we do by loading with the old monitor.

@wpaulino
Copy link
Contributor

Needs rebase.

@TheBlueMatt TheBlueMatt force-pushed the 2023-07-all-compl-actions branch from d2b4778 to 8900ef1 Compare July 10, 2023 21:03
@TheBlueMatt
Copy link
Collaborator Author

Rebased and addressed feedback.

If a channel has been closed, there may still be some
`ChannelMonitorUpdate`(s) which are pending completion. These
in-flight updates may also be blocking another channel from letting
an update fly, e.g. for forwarded payments where the payment
preimage will be removed from the downstream channel after the
upstream channel has closed.

Luckily all the infrastructure to handle this case is already in
place - we just need to process the
`monitor_update_blocked_actions` for closed channels.
@TheBlueMatt TheBlueMatt force-pushed the 2023-07-all-compl-actions branch from 8900ef1 to 76f331e Compare July 10, 2023 21:40
Copy link
Contributor

@alecchendev alecchendev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally LGTM

@TheBlueMatt TheBlueMatt force-pushed the 2023-07-all-compl-actions branch from 76f331e to 9f5b70e Compare July 11, 2023 19:56
@codecov-commenter
Copy link

codecov-commenter commented Jul 11, 2023

Codecov Report

Patch coverage: 37.50% and project coverage change: +0.26 🎉

Comparison is base (4c342bd) 90.31% compared to head (9f5b70e) 90.58%.

❗ Your organization is not using the GitHub App Integration. As a result you may experience degraded service beginning May 15th. Please install the Github App Integration for your organization. Read more.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2391      +/-   ##
==========================================
+ Coverage   90.31%   90.58%   +0.26%     
==========================================
  Files         106      106              
  Lines       55166    57314    +2148     
  Branches    55166    57314    +2148     
==========================================
+ Hits        49824    51917    +2093     
- Misses       5342     5397      +55     
Impacted Files Coverage Δ
lightning/src/ln/channelmanager.rs 88.67% <24.24%> (+2.72%) ⬆️
lightning/src/ln/chanmon_update_fail_tests.rs 98.69% <100.00%> (+<0.01%) ⬆️

... and 4 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@@ -8518,6 +8539,16 @@ where
update: update.clone(),
});
}
if $chan_in_flight_upds.is_empty() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if any updates were actually removed before generating the background event?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only get here if there were some updates in the in_flight_monitor_updates map, which is only filled in with updates for channels with in-flight updates during serialization.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM after squash

If a `ChannelMonitorUpdate` completes being persisted, but the
`ChannelManager` isn't informed thereof (or isn't persisted) before
shutdown, on startup we may still have it listed as in-flight. When
we compare the available `ChannelMonitor` with the in-flight set,
we'll notice it completed and remove it, however this may leave
some post-update actions dangling which need to complete.

Here we handle this with a new `BackgroundEvent` indicating we need
to handle any post-update action(s) for a given channel.
In an older PR a reviewer had asked why the discarding of a channel
being blocked on another monitor update is okay if the blocked
channel has since closed. At the time, this was not actually okay -
the monitor updates in the channel weren't moved to the
`ChannelManager` on close so the whole pipeline was busted, but
with the changes in 4041f08 the
handling of channel closes with pending monitor updates is now
correct, and so is the existing code block.
@TheBlueMatt TheBlueMatt force-pushed the 2023-07-all-compl-actions branch from 9f5b70e to 550cf91 Compare July 12, 2023 20:53
@TheBlueMatt
Copy link
Collaborator Author

Squashed without change:

$ git diff-tree -U1 9f5b70ea 550cf914
$ 

@TheBlueMatt TheBlueMatt merged commit df237ba into lightningdevkit:main Jul 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants