Skip to content

Increase the timeout for RPC responses from Bitcoin Core #915

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

TheBlueMatt
Copy link
Collaborator

Early sample testing showed multiple users hitting
EWOULDBLOCK/EAGAIN waiting for an initial response from Bitcoin
Core while it was doing some long operation (eg UTXO cache
flushing). Instead of only waiting 5 seconds for each attempt, we
now wait a full two minutes, but only for the first header
response, not each byte.

@codecov
Copy link

codecov bot commented May 6, 2021

Codecov Report

Merging #915 (3d3379d) into main (7297e13) will decrease coverage by 0.03%.
The diff coverage is 100.00%.

❗ Current head 3d3379d differs from pull request most recent head 4ade6bc. Consider uploading reports for the commit 4ade6bc to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##             main     #915      +/-   ##
==========================================
- Coverage   90.51%   90.47%   -0.04%     
==========================================
  Files          59       59              
  Lines       29771    29769       -2     
==========================================
- Hits        26946    26934      -12     
- Misses       2825     2835      +10     
Impacted Files Coverage Δ
lightning-block-sync/src/http.rs 95.37% <100.00%> (-0.03%) ⬇️
lightning/src/ln/functional_tests.rs 96.86% <0.00%> (-0.17%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7297e13...4ade6bc. Read the comment docs.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty much LGTM! Nice

@TheBlueMatt TheBlueMatt force-pushed the 2021-05-bump-rpc-timeout branch from 8da44a9 to d0a738e Compare May 7, 2021 22:40
@TheBlueMatt
Copy link
Collaborator Author

Fixed the comments and also added a new commit I'd missed to enable retrying if we actually hit a TCP socket timeout.

Copy link
Contributor

@valentinewallace valentinewallace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question but either way this looks like an improvement!

let read_res = reader.read_line(&mut line);
match read_res {
Ok(bytes_read) => break bytes_read,
Err(e) if e.kind() == std::io::ErrorKind::WouldBlock => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any other errors we would want to retry on? Like Interrupted docs https://doc.rust-lang.org/std/io/enum.ErrorKind.html#variant.Interrupted seems to indicate that it can probably be retried

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think you can get an Interrupted in a read on Linux and probably OSX. I'm not really sure what it would even mean in practice - usually it would imply another thread someone sent some signal to interrupt, but in a read not so much.

All that said, maybe it makes sense to just retry everything. Another potential error we could see is a Bitcoin Core RPC queue full error, which would present as just a generic HTTP error.

@TheBlueMatt TheBlueMatt force-pushed the 2021-05-bump-rpc-timeout branch from d0a738e to 3d3379d Compare May 8, 2021 15:03
@TheBlueMatt
Copy link
Collaborator Author

Redid the first commit to always retry requests instead of checking the error source.

Early sample testing showed multiple users hitting
EWOULDBLOCK/EAGAIN waiting for an initial response from Bitcoin
Core while it was doing some long operation (eg UTXO cache
flushing). Instead of only waiting 5 seconds for each attempt, we
now wait a full two minutes, but only for the first header
response, not each byte.
@TheBlueMatt TheBlueMatt force-pushed the 2021-05-bump-rpc-timeout branch from 3d3379d to 4ade6bc Compare May 10, 2021 16:55
@TheBlueMatt
Copy link
Collaborator Author

Squashed fixup commits with no change:

$ git diff-tree -U1 3d3379de 4ade6bcb
$

@TheBlueMatt TheBlueMatt merged commit fcc0723 into lightningdevkit:main May 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants