Description
Zarr version
2.15.0
Numcodecs version
0.11.0
Python Version
3.11.4
Operating System
Linux/Ubuntu
Installation
pip install zarr
Description
Hi,
I am attempting to read a consolidated Zarr store with many groups remotely from a MinIO S3 bucket. Everything works nicely with anonymous access. However, when I restrict some groups using the MinIO ACL tools, Zarr returns a zero-filled array with no error. I would expect Zarr/fsspec to raise a "permission denied" or similar error.
For example, I have the following structure, and want to deny anonymous access to group 28412.
amc_test.zarr
├── .zgroup
├── .zmetadata
├── 28412 ---> Access is denied in s3 ACL to this group and below.
│   ├── .zgroup
│   ├── data
│   │   ├── .zarray
│   │   └── 0
│   ├── error
│   │   ├── .zarray
│   │   └── 0
│   └── time
│       ├── .zarray
│       └── 0
├── 28415 ---> Still want public access to this group
│   ├── .zgroup
│   ├── data
│   │   ├── .zarray
│   │   └── 0
│   ├── error
│   │   ├── .zarray
│   │   └── 0
│   └── time
│       ├── .zarray
│       └── 0
I can set the corresponding ACL in MinIO to deny anonymous access to that particular group. But when I try to read the data, instead of getting an error, I get an array filled with zeros. I guess Zarr sees that the file is not readable and assumes that it doesn't exist, rather than raising an error.
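To illustrate what I suspect is happening, here is a minimal pure-Python sketch. It is not Zarr's actual code: `DenyingBackend`, `SwallowingStore` and `read_chunk` are hypothetical names made up for this example, and the sketch only assumes that a permission error is reported as a missing key somewhere between the filesystem and the array.

```python
class DenyingBackend(dict):
    """Hypothetical backend: raises PermissionError for keys under a denied prefix."""

    def __init__(self, data, denied_prefix):
        super().__init__(data)
        self.denied_prefix = denied_prefix

    def __getitem__(self, key):
        if key.startswith(self.denied_prefix):
            raise PermissionError(key)
        return super().__getitem__(key)


class SwallowingStore:
    """Hypothetical store wrapper that reports the listed exceptions as a missing key."""

    def __init__(self, backend, exceptions=(KeyError, PermissionError)):
        self.backend = backend
        self.exceptions = exceptions

    def __getitem__(self, key):
        try:
            return self.backend[key]
        except self.exceptions as exc:
            # The original cause (e.g. PermissionError) is hidden behind KeyError.
            raise KeyError(key) from exc


def read_chunk(store, key, size, fill_value=0):
    """Return the stored chunk, or a fill-value chunk if the key looks missing."""
    try:
        return list(store[key])
    except KeyError:
        return [fill_value] * size


backend = DenyingBackend(
    {"28415/data/0": [1, 2, 3], "28412/data/0": [4, 5, 6]},
    denied_prefix="28412/",
)

lenient = SwallowingStore(backend)                        # treats "denied" as "missing"
strict = SwallowingStore(backend, exceptions=(KeyError,))  # lets PermissionError through

public = read_chunk(lenient, "28415/data/0", size=3)  # real data
masked = read_chunk(lenient, "28412/data/0", size=3)  # silently filled with zeros
```

In this sketch, reading the denied key through `strict` propagates the `PermissionError` instead of returning fill values, which is the behaviour I would have expected.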
Steps to reproduce
Here is a simple example of how I am accessing the data from S3. The MinIO bucket is at localhost:10101. If I turn off the ACL in MinIO, this script runs without errors. With the ACL enabled, I would expect an error to be raised on the penultimate line. Instead I get a zero-filled array.
import s3fs
import zarr
import numpy as np
file_name = 'amc_test.zarr'
s3 = s3fs.S3FileSystem(
    anon=True,
    use_ssl=False,
    client_kwargs={
        "endpoint_url": "http://localhost:10101"
    },
)
store = zarr.storage.FSStore(f'mast/{file_name}', fs=s3)
handle = zarr.open_consolidated(store)
arr = handle['28415']['data'][:] # <-- This works as expected. There is no access control on this group.
assert not np.all(arr == 0)
arr = handle['28412']['data'][:] # <-- I expected an error to be thrown here.
assert not np.all(arr == 0) # <-- This assert fails.
Additional output
No response