Description
Is your feature request related to a problem? Please describe.
Currently, if a zarr store has a missing chunk, every value in that chunk is treated as missing and read back as NaN. This behavior comes from upstream (zarr), but there may soon be a kwarg that lets you raise an error in these instances instead (zarr-developers/zarr-python#489). This is valuable when you want to distinguish intentional NaN data from I/O errors that prevented some chunks from being written. Here's an example of a problematic case (courtesy of @delgadom):
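# Run in IPython/Jupyter: lines starting with "!" are shell commands.
import xarray as xr
import numpy as np
# Write a 2x2 array containing one intentional NaN, with one chunk per element.
ds = xr.Dataset({
    'myarr': (('x', 'y'), [[0., np.nan], [2., 3.]]),
    'x': [0, 1],
    'y': [0, 1],
})
ds.chunk({'x': 1, 'y': 1}).to_zarr('myzarr.zarr')
print('\n\ndata read into xarray\n' + '-'*30)
print(xr.open_zarr('myzarr.zarr').compute().myarr)
print('\n\nstructure of zarr store\n' + '-'*30)
! ls -R myzarr.zarr
# Delete one chunk file to simulate a write that failed.
print('\n\nremove a chunk\n' + '-'*30 + '\nrm myzarr.zarr/myarr/1.0')
! rm myzarr.zarr/myarr/1.0
# The deleted chunk now reads back as NaN, indistinguishable from the real NaN.
print('\n\ndata read into xarray\n' + '-'*30)
print(xr.open_zarr('myzarr.zarr').compute().myarr)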
This prints:
data read into xarray
------------------------------
<xarray.DataArray 'myarr' (x: 2, y: 2)>
array([[ 0., nan],
[ 2., 3.]])
Coordinates:
* x (x) int64 0 1
* y (y) int64 0 1
structure of zarr store
------------------------------
myzarr.zarr:
myarr x y
myzarr.zarr/myarr:
0.0 0.1 1.0 1.1
myzarr.zarr/x:
0
myzarr.zarr/y:
0
remove a chunk
------------------------------
rm myzarr.zarr/myarr/1.0
data read into xarray
------------------------------
<xarray.DataArray 'myarr' (x: 2, y: 2)>
array([[ 0., nan],
[nan, 3.]])
Coordinates:
* x (x) int64 0 1
* y (y) int64 0 1
Describe the solution you'd like
I'm not sure where a kwarg to the __init__ method of a zarr Array object would come into play within open_zarr or open_dataset (once zarr-developers/zarr-python#489 is merged), but I figured I'd ask to see if anyone could point me in the right direction, and to get ready for when that zarr feature exists. Happy to file a PR once I know where to look. I couldn't figure it out with some initial browsing.
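Until such a kwarg exists, one possible workaround is to check the store for missing chunk files before reading it. The following is a minimal sketch, not part of xarray or zarr: `find_missing_chunks` is a hypothetical helper, and it assumes a zarr v2 array stored as a plain local directory, where the `.zarray` metadata file records `shape` and `chunks` and each chunk is a file named by its grid index (as in the `ls -R` output above).

```python
import itertools
import json
import math
import os


def find_missing_chunks(array_dir):
    """Return chunk keys that a zarr v2 array expects but that are absent on disk.

    Hypothetical helper: assumes `array_dir` is a local directory holding a
    single zarr v2 array (for example 'myzarr.zarr/myarr').
    """
    # .zarray is the zarr v2 array metadata file (shape, chunk shape, ...).
    with open(os.path.join(array_dir, ".zarray")) as f:
        meta = json.load(f)

    # Chunk keys join the grid indices with "." unless the metadata says otherwise.
    sep = meta.get("dimension_separator", ".")

    # Number of chunks along each dimension, rounding up for a partial last chunk.
    grid = [math.ceil(s / c) for s, c in zip(meta["shape"], meta["chunks"])]

    if grid:
        expected = [
            sep.join(str(i) for i in idx)
            for idx in itertools.product(*(range(n) for n in grid))
        ]
    else:
        # Zero-dimensional arrays store their single chunk under the key "0".
        expected = ["0"]

    return [
        key for key in expected
        if not os.path.exists(os.path.join(array_dir, key))
    ]
```

Applied to the example above after the `rm` step, `find_missing_chunks('myzarr.zarr/myarr')` should report `['1.0']`. Note that a missing key is not always an error: stores written with a setting that skips chunks consisting entirely of the fill value will legitimately omit those files, so this check only makes sense when every chunk is expected to be on disk.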