-
Notifications
You must be signed in to change notification settings - Fork 406
Remove peers_needing_send set from peer_handling #456
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
peers.peers_needing_send.insert(peer_descriptor.clone()); | ||
peer.needing_send = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheBlueMatt Could you explain why this needs to be set here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to stream out our routing db, we (obviously) dont copy the whole thing into our outbound send buff (otherwise we'd just OOM ourselves...), so the logic in do_attempt_write_data has to be hit to add messages to the send buffer if the buffer is empty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the comments definitely hint at a old world where we were trying to be really hard to avoid reentrancy in the peer_handler, and gave up cause it was only barely practical. AFAICT, its not needed at all anymore.
As you note, peer_handler is currently woefully under-tested, likely the least tested area of the code.
peers.peers_needing_send.insert(peer_descriptor.clone()); | ||
peer.needing_send = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we need to stream out our routing db, we (obviously) dont copy the whole thing into our outbound send buff (otherwise we'd just OOM ourselves...), so the logic in do_attempt_write_data has to be hit to add messages to the send buffer if the buffer is empty.
By "it" do you mean |
Right, yea, the whole thing. AFAICT (without digging into git history) I gave up on avoiding reentrancy and just did the Easy Thing, so the whole set seems to be completely useless? |
At very least removing the code doesn't cause any test failures. Could you help me understand what sort of reentrancy you were attempting to avoid? I see that #217 removed the call to |
Specifically, both read_event and write_event may call back into user code (via the Descriptor objects), as indicated in their docs. In general, reentrancy scares me cause, as a user using a library, it can be very hard to get right, and very easy to add hard-to-identify bugs, but there also isn't tooo much that can be obviously done here. |
Thanks for the clarification. My understanding then is that the warnings about reentrancy are essentially telling users to avoid calling And, if I'm reading 294ad32 correctly, the thought was that calls to What I'm having trouble understanding is how we planned on avoiding reentrancy. Was the thought to remove any transitive calls of |
Yea, I think that was the goal, at least for From a client perspective, I could totally see confusion in that you could put a lock in your SocketDescriptor and end up with a deadlock when reading messages trying to lock it just to write something? |
Using the peers_needing_send HashSet to determine which peers have any outstanding writes adds complexity without much benefit. Instead, add a needing_send field to the Peer struct to accomplish the same thing.
Variables named using the pronoun 'us' are confusing as they are overloaded for Connection and SocketDescriptor, often inconsistently. Use 'connection' and 'descriptor' instead.
eb24218
to
7a03fe0
Compare
I've been reading through lightning-net-tokio to try to get a better sense of how a user would use First, I'd like to get a better sense behind some of the design decisions that went into Rather than using Feel free to discuss with me offline and I can circle back to this PR. |
Took me a while to find this PR, but I wanted to let you know that I've readded the reentrancy requirement as part of eae0904. I found the bug when doing the unit tests and seeing that the docs still mentioned it shouldn't call I actually like the idea of removing the separate set in favor of a Peer scan with a flag to clean up a lot of the internal functions that pass it around or reference it through I'll make sure to reference this PR when it makes it out of the initial review on top of #494 so they can be closed out together. |
Thanks for the heads up @julianknutsen and for addressing the reentrancy. As noted in the PR description, I was primarily trying to understand the peer handling code. Feel free to grab 9f875f4 for your work. |
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
Motivated by lightningdevkit#456, remove the peers_needing_send set in favor of just scanning the peers during process_events() and attempting to send data for those peers that have items in their outbound queue or pending sync items to be sent out.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
`Julian Knutsen <[email protected]>` pointed out in a previous discussion that `read_event` can reenter user code despite the documentation stating explicitly that it will not. This was addressed in lightningdevkit#456 by simply codifying the reentrancy, but its somewhat simpler to just drop the `do_attempt_write_data` call. Ideally we could land most of Julian's work, but its still in need of substantial git history cleanup to get it in a reviewable state and this solves the immediate issue.
Done in #948. |
Examining the
peer_handling
module reveals that thepeers_needing_send
set either can be replaced by aneeding_send
field onPeer
(to reduce complexity) or removed entirely. The set's documentation is unclear in that it claims thatpeers_needing_send
is used bydo_read_event()
when a message is pushed onto the send buffer butdo_attempt_write_data()
is not called to avoid reentrancy.However,
do_attempt_write_data()
is called unconditionally following the read loop withindo_read_event()
. It's possible thatdo_attempt_write_data()
is not called in the case of an early return (e.g., when use of thetry_potential_handleerror!
macro sends an error message and returns). But if this is the only case, then the members ofpeers_needing_send
would be overly broad.Furthermore, removing the places where
peers_needing_send
is populated does not cause any tests to fail. Therefore, this PR is an attempt to understand the intended usage ofpeers_needing_send
, determine if it can be simplified or removed, clarify the documentation, and add any missing tests for the intended behavior.