PYTHON-1752 bulk_write should be able to accept a generator #2262
"""Generate batches of operations, batched by type of
operation, in the order **provided**.
"""
run = None
for idx, (op_type, operation) in enumerate(self.ops):
for idx, request in enumerate(requests):
I think the goal of this ticket is to avoid inflating the whole generator upfront and only iterate requests as they are needed at the encoding step. For example:
coll.bulk_write((InsertOne({'x': 'large'*1024*1024}) for _ in range(1_000_000)))
If we inflate everything up front, as this change does, that call has to allocate all 1 million documents at once.
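To illustrate the lazy behaviour being asked for, here is a minimal sketch that pulls requests from an iterable one batch at a time instead of building the full list first. It is not PyMongo's implementation: `iter_batches` and `batch_size` are hypothetical names, and plain dicts stand in for write operations such as `InsertOne`.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(requests: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield lists of at most batch_size items, pulling lazily from requests.

    Hypothetical helper: only one batch is held in memory at a time, so a
    generator passed in is never fully materialized.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(requests)
    while True:
        # islice consumes at most batch_size items from the shared iterator.
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


if __name__ == "__main__":
    # Stand-in for a generator of write operations.
    ops = ({"x": i} for i in range(10))
    for batch in iter_batches(ops, batch_size=4):
        print(len(batch))
```

With this shape, peak memory is bounded by the batch size rather than by the total number of requests, which is the property the comment above is asking the encoding step to preserve.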
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (2)
pymongo/synchronous/bulk.py:745
- [nitpick] Consider renaming the 'generator' parameter to 'requests' for clarity, since it represents the source of write operations and to align with the naming in bulk_write.
generator: Generator[_WriteOp[_DocumentType]],
pymongo/asynchronous/bulk.py:747
- [nitpick] Consider renaming the 'generator' parameter to 'requests' to better reflect its purpose and to maintain consistency with other bulk_write method signatures.
generator: Generator[_WriteOp[_DocumentType]],
…tryable var to bulk class to match client_bulk
Notes: