Skip to content

Faster delitems #1524

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Faster delitems #1524

wants to merge 4 commits into from

Conversation

d-v-b
Copy link
Contributor

@d-v-b d-v-b commented Sep 14, 2023

fixes #1336

We first try to delete keys without checking if they exist in storage. If the storage backend treats deleting missing keys as error, and it signals such an error by raising FileNotFoundError, then keys will be filtered based on whether they exist in storage. This is much slower.

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Sep 14, 2023
@codecov
Copy link

codecov bot commented Sep 14, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (8ac8553) 99.99% compared to head (769e76e) 100.00%.
Report is 1 commits behind head on main.

❗ Current head 769e76e differs from pull request most recent head e2c0ce4. Consider uploading reports for the commit e2c0ce4 to get more accurate results

Additional details and impacted files
@@            Coverage Diff            @@
##             main     #1524    +/-   ##
=========================================
  Coverage   99.99%   100.00%            
=========================================
  Files          38        37     -1     
  Lines       14563     14740   +177     
=========================================
+ Hits        14562     14740   +178     
+ Misses          1         0     -1     
Files Coverage Δ
zarr/storage.py 100.00% <100.00%> (ø)

... and 30 files with indirect coverage changes

@martindurant
Copy link
Member

This is fine, but rm() in fsspec should really allow on_error on all calls consistently. It's worth noting that the backend that doesn't ignore FileNotFound (i.e., local) is also the one that deletes files sequentially as opposed to async.

@martindurant
Copy link
Member

is also the one that deletes files sequentially

I mean, we could try/except on individual files in this case and still not bother explicitly checking existence first.

@github-actions github-actions bot removed the needs release notes Automatically applied to PRs which haven't added release notes label Feb 13, 2024
@jhamman
Copy link
Member

jhamman commented Oct 11, 2024

I'm going to close this as stale. Folks should feel free to reopen if there is interest in continuing this work.

@jhamman jhamman closed this Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

performance optimization in delitems
3 participants