Increase the default media chunksize to 100MB. #482
The current `DEFAULT_CHUNK_SIZE` of 512 KB leads to unnecessarily slow transfers, as in googlecolab/backend-container#1. Increasing it lowers the per-chunk overhead for large transfers. Fixes googleapis#481.
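To make the overhead argument concrete, here is a minimal sketch that counts how many chunk requests a transfer needs at each chunk size. The helper name `chunk_requests` and the 1 GiB file size are hypothetical and chosen only for illustration; the sketch also assumes binary units (512 KB = 512 * 1024 bytes).

```python
KB = 1024
MB = 1024 * KB


def chunk_requests(file_size: int, chunk_size: int) -> int:
    """Return how many chunk requests are needed to move file_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Integer ceiling division; even an empty file needs one request.
    return max(1, (file_size + chunk_size - 1) // chunk_size)


# Hypothetical 1 GiB transfer, compared at the old and proposed defaults.
file_size = 1024 * MB
for label, size in (("512 KB", 512 * KB), ("100 MB", 100 * MB)):
    print(f"{label} chunks: {chunk_requests(file_size, size)} requests")
```

Under these assumptions the smaller chunk size needs a few thousand round trips where the larger one needs about a dozen, and each round trip carries its own request latency.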
100 MB is a very big increase. resumable-media uses
So I forgot to mention the provenance of that constant -- it did not just come out of my hat: I have vague memories of @thobrla doing some amount of experimentation to come up with that constant, so I was willing to just trust it. Do we have any benchmarks or experiments we can use to get a sense for a good vs. bad chunksize? Is there something you're worried about with a large chunksize? (Note that most requests are already retried, so I think "oh internet connections can be wonky" isn't too compelling.)
Oh @dhermes did you run any experiments when picking that chunksize for resumable-media?
@craigcitro Nope, that was just copy-pasta from
yeah, p. sure wikipedia should add a citation to this bug for "poetic justice". 😁 Proposal: let's switch to the gsutil default everywhere? I'm happy to follow this up with a PR to resumable media.
FYI: I didn't do any deep performance testing to come up with the constant. Instead, I was trying to find a reasonable ceiling for this formula:
Resumable chunk size / round-trip write latency to GCS = Maximum single-stream throughput
Plugging in some numbers:
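As a rough illustration of that formula, the sketch below evaluates it for the two chunk sizes discussed in this thread. The 0.25 s round-trip latency is a hypothetical value picked for easy arithmetic, not a measurement and not the figure from the comment above; `max_throughput` is likewise an illustrative name.

```python
KB = 1024
MB = 1024 * KB


def max_throughput(chunk_size_bytes: int, round_trip_seconds: float) -> float:
    """Ceiling on single-stream throughput, in bytes per second.

    Implements: chunk size / round-trip write latency = max throughput.
    """
    if round_trip_seconds <= 0:
        raise ValueError("round_trip_seconds must be positive")
    return chunk_size_bytes / round_trip_seconds


# Assumption: a hypothetical 0.25 s round-trip write latency.
ASSUMED_LATENCY_S = 0.25

for label, size in (("512 KB", 512 * KB), ("100 MB", 100 * MB)):
    ceiling_mb_s = max_throughput(size, ASSUMED_LATENCY_S) / MB
    print(f"{label} chunks: at most {ceiling_mb_s:.1f} MB/s per stream")
```

With a fixed latency the ceiling scales linearly with chunk size, which is why a small default can cap throughput well below what the connection could otherwise sustain.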
The goal here is to lower latency for transfers, as well as having some consistency between client libraries. For more discussion, see googleapis/google-api-python-client#482.
Hi guys, do you know how to "Update the default UPLOAD_CHUNK_SIZE to 100MB" when I use Colab to mount Google Drive? Thanks a lot. Ni
@Ni-Chen Would you mind opening a new issue with your question? We're more likely to miss activity on issues that are already closed. Thanks!
@Ni-Chen if you're using
Howdy. Either the docs need to be updated, or the From
"""
...
Google App Engine has a 5MB limit on request size, so you should never set
your chunksize larger than 5MB, or to -1.
"""
Either Google App Engine has increased this limit, or something else fishy is going on?
PTAL @jonparrott -- I couldn't think of a meaningful test that didn't involve basically repeating the constant in another file, which feels silly. 😀