-
Notifications
You must be signed in to change notification settings - Fork 2.2k
htlcswitch+peer [1/2]: thread context through in preparation for passing to graph DB calls #9691
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: elle-graph
Are you sure you want to change the base?
Conversation
Important Review skippedAuto reviews are limited to specific labels. 🏷️ Labels to auto review (1)
Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
ed0f260
to
00150d6
Compare
a6321cc
to
ea67b33
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM ✅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ended up reviewing #9699 to understand the full pic. Left some questions.
if update == nil { | ||
// Fallback to the non-alias behavior. | ||
var err error | ||
update, err = l.cfg.FetchLastChannelUpdate(l.ShortChanID()) | ||
update, err = l.cfg.FetchLastChannelUpdate(ctx, l.ShortChanID()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the main purpose is to create ctx to be used in graph related calls, I don't think we need to have this ctx
popped up alone the callsite, ending up all the methods in the chain taking a never checked ctx
, given we can just create a ctx
here via l.cg.Create(context.Background())
. And if the link is stopping, that ctx
will be canceled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is dangerous to do that because what if the source of this call was not from internal (ie, from channelLink) but was actually triggered from an external system. Then if that system exists, this DB call wont be cancelled since the ctx is coming from the incorrect source
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nding up all the methods in the chain taking a never checked ctx
contexts only need to be "checked" if we are doing something that hangs/retries etc. so anywhere we currently have a select that can hang, then there we should always check it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like, for example: createFailureWithUpdate
is called by CheckHtlcForward
which is called by the switch. so not all call-sites are from within the link. It's safer to just have a ctx as a first param to methods so that we dont need to think about the origins of the call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s look at the specifics? Basically,
- If the
link.ctx
is done, does it mean theswitch.ctx
is also done? Nope because other systems can callRemoveLink
to shut down the link. - If the
switch.ctx
is done, does it mean thelink.ctx
is done? Yes, as whenswitch
shuts down, it cleans up all the links inhtlcForwarder
.
This is because the switch
can be viewed as the link
manager, and it’s responsible for starting and stopping the links.
Then if that system exists, this DB call wont be cancelled since the ctx is coming from the incorrect source
This is def correct. It was discussed before that maybe we should decouple switch
and link
. But as for now, we want to maintain the relationship, which means when switch
goes down, link
goes down.
Like, for example: createFailureWithUpdate is called by CheckHtlcForward which is called by the switch. so not all call-sites are from within the link.
Also correct. Tho given the relationship between switch
and link
, it does really matter since when switch.ctx
is done, link.ctx
is also done. I’m also exploring the possibility to make link
and private struct managed by switch
only.
It’s worth mentioning that things may be different if we took the previous approach, that each Start
takes a ctx
, which means each system’s internal context can be a child of an external one. Not saying we should go with the previous approach, just wanna mention a simple change there can make a different design choice.
It's safer to just have a ctx as a first param to methods so that we dont need to think about the origins of the call
I’ve been thinking about it these days. It may seem nice to be oblivious about the origin, but I don’t think it’s safe to do so. It gives the external caller the ability to abort actions, while some actions cannot be aborted, it’s case-specific. Good thing is we only have an internal context here, and it’s only used to abort graph-related calls, so one less thing to worry about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is valuable to have a standard pattern for things like ctx
. The standard pattern is that the caller's context is passed through for sync calls.
If the link and switch go down together anyways, then i dont see the harm in just keeping this standard pattern rather than now threading through the incorrect context.
while some actions cannot be aborted, it’s case-specific.
ok but yes it is case specific, for anything really critical like that, we should explicitly not pass in the caller's context.
@@ -3207,7 +3210,7 @@ func (l *channelLink) getAliases() []lnwire.ShortChannelID { | |||
// attachFailAliasUpdate sets the link's FailAliasUpdate function. | |||
// | |||
// Part of the scidAliasHandler interface. | |||
func (l *channelLink) attachFailAliasUpdate(closure func( | |||
func (l *channelLink) attachFailAliasUpdate(closure func(ctx context.Context, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I think we can create a ctx
via l.cg.Create(context.Background())
and pass it to the closure below so it still satisfies the signature failAliasUpdate
, think will bring way less change, and create fewer conflicts in other people's PRs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see other comment
@@ -1521,7 +1523,7 @@ func (p *Brontide) maybeSendChannelUpdates() { | |||
// to fetch the update to send to the remote peer. If the | |||
// channel is pending, and not a zero conf channel, we'll get | |||
// an error here which we'll ignore. | |||
chanUpd, err := p.cfg.FetchLastChanUpdate(scid) | |||
chanUpd, err := p.cfg.FetchLastChanUpdate(ctx, scid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same re ctx
, we can derive it from l.cg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see previous comment
Both the `htlcswitch` and `peer` systems take a `FetchLastChanUpdate` call back which underneath will call the `graphBuilder`'s `GetChannelByID` method which calls the graph DB's `FetchChannelEdgeByID` method which will eventually take a context. A couple of `context.TODO()`s are added which will be removed in later commits.
which calls the `FetchLastChannelUpdate` call back.
In this commit we also derive fresh contexts and then call the `cancel` functions of those in the `Stop` methods of: - `InterceptableSwitch` - `Switch` This removes two previously added `context.TODO`s and introduces a few more.
and introduce one more.
And remove one context.TODO
PR context
This is part of the last step required to complete #9494. We want all calls to the graph DB (and hence, ChannelGraph) to take a context since later, this context will be passed through to any remote graph RPC calls as well as any DB calls to a SQL graph backend.
This PR specifically: This is part 1 of a 2 part PR series that addresses the
htlcswitch
In this PR, we tackle the
htlcswitch
&peer
packages. This is split over 2 PRs since it is quite a lot. This PR is part 1. See part 2 for the full picture. This package makes calls to the graph'sFetchLastChannelUpdate
call. So this PR is all aimed at threading contexts through to the call-sites of this call-back. It leaves 1context.TODO()
in the main LNDserver
which will be addressed in a follow up.Branch strategy:
This series of PRs is basically a big code refactor. So once 19 is shipped, we can merge any work that has been merged into
elle-graph
and from then on we can just merge into master.