Skip to content

Add ChannelManager support for monitor update failure in one place #213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
295 changes: 217 additions & 78 deletions src/ln/channel.rs

Large diffs are not rendered by default.

828 changes: 698 additions & 130 deletions src/ln/channelmanager.rs

Large diffs are not rendered by default.

17 changes: 17 additions & 0 deletions src/ln/channelmonitor.rs
Original file line number Diff line number Diff line change
Expand Up @@ -39,6 +39,7 @@ use std::sync::{Arc,Mutex};
use std::{hash,cmp};

/// An error enum representing a failure to persist a channel monitor update.
#[derive(Clone)]
pub enum ChannelMonitorUpdateErr {
/// Used to indicate a temporary failure (eg connection to a watchtower failed, but is expected
/// to succeed at some point in the future).
Expand All @@ -47,6 +48,22 @@ pub enum ChannelMonitorUpdateErr {
/// submitting new commitment transactions to the remote party.
/// ChannelManager::test_restore_channel_monitor can be used to retry the update(s) and restore
/// the channel to an operational state.
///
/// Note that continuing to operate when no copy of the updated ChannelMonitor could be
/// persisted is unsafe - if you failed to store the update on your own local disk you should
/// instead return PermanentFailure to force closure of the channel ASAP.
///
/// Even when a channel has been "frozen" updates to the ChannelMonitor can continue to occur
/// (eg if an inbound HTLC which we forwarded was claimed upstream resulting in us attempting
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm in case of "frozen" channel could we have a ChannelMonitor trying to claim ghost htlc outputs, and fail to do so, on remote commitment tx, this last one up-to-date without the concerned htlc output because we pass it the preimage ? Seems not to me because remote commitment tx needs commitment_signed but wonder if there is other weird cases like that.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean by "ghost htlc outputs"? The unupdated ChannelMonitor should ignore any offered HTLC outputs that it doesn't have the preimage for assuming the remote side is just gonna claim them via timeout, whereas our local/updated copy should do the claim.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't exactly the case I had in head, the one was what if we have a channel in ChannelMonitorUpdateErr::TemporaryFailure and we pass backward the preimage via claim_funds ? Harmless in fact because other node needs commitment_signed. Go ahead, I was just trying to speculate on weird on-chain consequences in case of frozen channel..

/// to claim it on this channel) and those updates must be applied wherever they can be. At
/// least one such updated ChannelMonitor must be persisted otherwise PermanentFailure should
/// be returned to get things on-chain ASAP using only the in-memory copy. Obviously updates to
/// the channel which would invalidate previous ChannelMonitors are not made when a channel has
/// been "frozen".
///
/// Note that even if updates made after TemporaryFailure succeed you must still call
/// test_restore_channel_monitor to ensure you have the latest monitor and re-enable normal
/// channel operation.
TemporaryFailure,
/// Used to indicate no further channel monitor updates will be allowed (eg we've moved on to a
/// different watchtower and cannot update with all watchtowers that were previously informed
Expand Down
30 changes: 23 additions & 7 deletions src/ln/msgs.rs
Original file line number Diff line number Diff line change
Expand Up @@ -224,7 +224,7 @@ pub struct FundingSigned {
}

/// A funding_locked message to be sent or received from a peer
#[derive(Clone)]
#[derive(Clone, PartialEq)]
pub struct FundingLocked {
pub(crate) channel_id: [u8; 32],
pub(crate) next_per_commitment_point: PublicKey,
Expand All @@ -244,7 +244,7 @@ pub struct ClosingSigned {
}

/// An update_add_htlc message to be sent or received from a peer
#[derive(Clone)]
#[derive(Clone, PartialEq)]
pub struct UpdateAddHTLC {
pub(crate) channel_id: [u8; 32],
pub(crate) htlc_id: u64,
Expand All @@ -255,23 +255,23 @@ pub struct UpdateAddHTLC {
}

/// An update_fulfill_htlc message to be sent or received from a peer
#[derive(Clone)]
#[derive(Clone, PartialEq)]
pub struct UpdateFulfillHTLC {
pub(crate) channel_id: [u8; 32],
pub(crate) htlc_id: u64,
pub(crate) payment_preimage: [u8; 32],
}

/// An update_fail_htlc message to be sent or received from a peer
#[derive(Clone)]
#[derive(Clone, PartialEq)]
pub struct UpdateFailHTLC {
pub(crate) channel_id: [u8; 32],
pub(crate) htlc_id: u64,
pub(crate) reason: OnionErrorPacket,
}

/// An update_fail_malformed_htlc message to be sent or received from a peer
#[derive(Clone)]
#[derive(Clone, PartialEq)]
pub struct UpdateFailMalformedHTLC {
pub(crate) channel_id: [u8; 32],
pub(crate) htlc_id: u64,
Expand All @@ -280,32 +280,36 @@ pub struct UpdateFailMalformedHTLC {
}

/// A commitment_signed message to be sent or received from a peer
#[derive(Clone)]
#[derive(Clone, PartialEq)]
pub struct CommitmentSigned {
pub(crate) channel_id: [u8; 32],
pub(crate) signature: Signature,
pub(crate) htlc_signatures: Vec<Signature>,
}

/// A revoke_and_ack message to be sent or received from a peer
#[derive(Clone, PartialEq)]
pub struct RevokeAndACK {
pub(crate) channel_id: [u8; 32],
pub(crate) per_commitment_secret: [u8; 32],
pub(crate) next_per_commitment_point: PublicKey,
}

/// An update_fee message to be sent or received from a peer
#[derive(PartialEq)]
pub struct UpdateFee {
pub(crate) channel_id: [u8; 32],
pub(crate) feerate_per_kw: u32,
}

#[derive(PartialEq)]
pub(crate) struct DataLossProtect {
pub(crate) your_last_per_commitment_secret: [u8; 32],
pub(crate) my_current_per_commitment_point: PublicKey,
}

/// A channel_reestablish message to be sent or received from a peer
#[derive(PartialEq)]
pub struct ChannelReestablish {
pub(crate) channel_id: [u8; 32],
pub(crate) next_local_commitment_number: u64,
Expand Down Expand Up @@ -463,6 +467,7 @@ pub struct HandleError { //TODO: rename me

/// Struct used to return values from revoke_and_ack messages, containing a bunch of commitment
/// transaction updates if they were pending.
#[derive(PartialEq)]
pub struct CommitmentUpdate {
pub(crate) update_add_htlcs: Vec<UpdateAddHTLC>,
pub(crate) update_fulfill_htlcs: Vec<UpdateFulfillHTLC>,
Expand Down Expand Up @@ -629,7 +634,18 @@ pub(crate) struct OnionPacket {
pub(crate) hmac: [u8; 32],
}

#[derive(Clone)]
impl PartialEq for OnionPacket {
fn eq(&self, other: &OnionPacket) -> bool {
for (i, j) in self.hop_data.iter().zip(other.hop_data.iter()) {
if i != j { return false; }
}
self.version == other.version &&
self.public_key == other.public_key &&
self.hmac == other.hmac
}
}

#[derive(Clone, PartialEq)]
pub(crate) struct OnionErrorPacket {
// This really should be a constant size slice, but the spec lets these things be up to 128KB?
// (TODO) We limit it in decode to much lower...
Expand Down
11 changes: 11 additions & 0 deletions src/ln/peer_handler.rs
Original file line number Diff line number Diff line change
Expand Up @@ -866,6 +866,17 @@ impl<Descriptor: SocketDescriptor> PeerManager<Descriptor> {
Self::do_attempt_write_data(&mut descriptor, peer);
continue;
},
Event::SendRevokeAndACK { ref node_id, ref msg } => {
log_trace!(self, "Handling SendRevokeAndACK event in peer_handler for node {} for channel {}",
log_pubkey!(node_id),
log_bytes!(msg.channel_id));
let (mut descriptor, peer) = get_peer_for_forwarding!(node_id, {
//TODO: Do whatever we're gonna do for handling dropped messages
});
peer.pending_outbound_buffer.push_back(peer.channel_encryptor.encrypt_message(&encode_msg!(msg, 133)));
Self::do_attempt_write_data(&mut descriptor, peer);
continue;
},
Event::SendShutdown { ref node_id, ref msg } => {
log_trace!(self, "Handling Shutdown event in peer_handler for node {} for channel {}",
log_pubkey!(node_id),
Expand Down
6 changes: 5 additions & 1 deletion src/util/errors.rs
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,10 @@ pub enum APIError {
ChannelUnavailable {
/// A human-readable error message
err: &'static str
}
},
/// An attempt to call add_update_monitor returned an Err (ie you did this!), causing the
/// attempted action to fail.
MonitorUpdateFailed,
}

impl fmt::Debug for APIError {
Expand All @@ -42,6 +45,7 @@ impl fmt::Debug for APIError {
APIError::FeeRateTooHigh {ref err, ref feerate} => write!(f, "{} feerate: {}", err, feerate),
APIError::RouteError {ref err} => f.write_str(err),
APIError::ChannelUnavailable {ref err} => f.write_str(err),
APIError::MonitorUpdateFailed => f.write_str("Client indicated a channel monitor update failed"),
}
}
}
9 changes: 9 additions & 0 deletions src/util/events.rs
Original file line number Diff line number Diff line change
Expand Up @@ -129,6 +129,15 @@ pub enum Event {
/// The update messages which should be sent. ALL messages in the struct should be sent!
updates: msgs::CommitmentUpdate,
},
/// Used to indicate that a revoke_and_ack message should be sent to the peer with the given node_id.
///
/// This event is handled by PeerManager::process_events if you are using a PeerManager.
SendRevokeAndACK {
/// The node_id of the node which should receive this message
node_id: PublicKey,
/// The message which should be sent.
msg: msgs::RevokeAndACK,
},
/// Used to indicate that a shutdown message should be sent to the peer with the given node_id.
///
/// This event is handled by PeerManager::process_events if you are using a PeerManager.
Expand Down
5 changes: 4 additions & 1 deletion src/util/test_utils.rs
Original file line number Diff line number Diff line change
Expand Up @@ -38,12 +38,14 @@ impl chaininterface::FeeEstimator for TestFeeEstimator {
pub struct TestChannelMonitor {
pub added_monitors: Mutex<Vec<(OutPoint, channelmonitor::ChannelMonitor)>>,
pub simple_monitor: Arc<channelmonitor::SimpleManyChannelMonitor<OutPoint>>,
pub update_ret: Mutex<Result<(), channelmonitor::ChannelMonitorUpdateErr>>,
}
impl TestChannelMonitor {
pub fn new(chain_monitor: Arc<chaininterface::ChainWatchInterface>, broadcaster: Arc<chaininterface::BroadcasterInterface>) -> Self {
Self {
added_monitors: Mutex::new(Vec::new()),
simple_monitor: channelmonitor::SimpleManyChannelMonitor::new(chain_monitor, broadcaster),
update_ret: Mutex::new(Ok(())),
}
}
}
Expand All @@ -57,7 +59,8 @@ impl channelmonitor::ManyChannelMonitor for TestChannelMonitor {
w.0.clear();
monitor.write_for_watchtower(&mut w).unwrap(); // This at least shouldn't crash...
self.added_monitors.lock().unwrap().push((funding_txo, monitor.clone()));
self.simple_monitor.add_update_monitor(funding_txo, monitor)
assert!(self.simple_monitor.add_update_monitor(funding_txo, monitor).is_ok());
self.update_ret.lock().unwrap().clone()
}
}

Expand Down