Accessing restricted groups over s3 produces zero filled array #1504

Open
@samueljackson92


Zarr version

2.15.0

Numcodecs version

0.11.0

Python Version

3.11.4

Operating System

Linux/Ubuntu

Installation

pip install zarr

Description

Hi,

I am attempting to read a consolidated zarr file with lots of groups remotely from a minio s3 bucket. Everything works nicely with anonymous access. However, when I restrict some groups using the minio ACL tools, Zarr returns a blank, zero-filled array with no error. I would expect Zarr/fsspec to throw a "permission denied" or similar error.

For example, I have the following structure, and want to deny anonymous access to group 28412.

amc_test.zarr
├── .zgroup
├── .zmetadata
├── 28412               ---> Access is denied in s3 ACL to this group and below.
│   ├── .zgroup
│   ├── data
│   │   ├── .zarray
│   │   └── 0
│   ├── error
│   │   ├── .zarray
│   │   └── 0
│   └── time
│       ├── .zarray
│       └── 0
├── 28415              ---> Still want public access to this group
│   ├── .zgroup
│   ├── data
│   │   ├── .zarray
│   │   └── 0
│   ├── error
│   │   ├── .zarray
│   │   └── 0
│   └── time
│       ├── .zarray
│       └── 0

I can set the corresponding ACL to deny anonymous access on that particular group in minio. But when I try to read the data, instead of getting an error, I get a blank array filled with zeros. I guess Zarr sees that the file is not readable and assumes that it doesn't exist, rather than throwing an error.
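To make that guess concrete, here is a toy model of the suspected mechanism. It is only a sketch: `DeniedStore`, `LenientStore` and `read_chunk` are hypothetical names and not zarr's actual code. It assumes that the store layer reports any read failure as a missing key, and that the array layer fills missing chunks with the fill value. As far as I can tell, `zarr.storage.FSStore` has an `exceptions` argument whose default includes `PermissionError`, which would fit this picture, but I have not verified that it is the cause here.

```python
# Toy model only: these classes are hypothetical and are not zarr's implementation.

class DeniedStore:
    """Dict-like store where some keys exist but may not be read."""

    def __init__(self, data, denied):
        self._data = data
        self._denied = set(denied)

    def __getitem__(self, key):
        if key in self._denied:
            raise PermissionError(key)
        return self._data[key]  # raises KeyError if the key is absent


class LenientStore:
    """Wrapper that reports every read failure as a missing key."""

    def __init__(self, inner):
        self._inner = inner

    def __getitem__(self, key):
        try:
            return self._inner[key]
        except (KeyError, PermissionError, OSError) as exc:
            raise KeyError(key) from exc


def read_chunk(store, key, chunk_len, fill_value=0):
    """Return the chunk as a list, or fill values if the key is missing."""
    try:
        return list(store[key])
    except KeyError:
        return [fill_value] * chunk_len


backend = DeniedStore(
    {"28415/data/0": bytes([1, 2, 3]), "28412/data/0": bytes([4, 5, 6])},
    denied={"28412/data/0"},
)

print(read_chunk(LenientStore(backend), "28415/data/0", 3))  # [1, 2, 3]
print(read_chunk(LenientStore(backend), "28412/data/0", 3))  # [0, 0, 0]
```

Reading the denied key through `backend` directly, without the lenient wrapper, raises `PermissionError`, which is the behaviour I would expect from Zarr.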

Steps to reproduce

Here is a simple example of how I am accessing the data from s3. The minio bucket is at localhost:10101. If I turn off the ACL in minio then this script runs without errors. With the ACL enabled, I would expect an error to be thrown on the penultimate line. Instead I get a blank, zero-filled array.

import s3fs
import zarr
import numpy as np

file_name = 'amc_test.zarr'

s3 = s3fs.S3FileSystem(
    anon=True,
    use_ssl=False,
    client_kwargs={
        "endpoint_url": "http://localhost:10101"
    },
)

store = zarr.storage.FSStore(f'mast/{file_name}', fs=s3)
handle = zarr.open_consolidated(store)
arr = handle['28415']['data'][:]        # <-- This works as expected. There is no access control on this group.
assert not np.all(arr == 0)

arr = handle['28412']['data'][:]        # <-- I expected an error to be thrown here.
assert not np.all(arr == 0)             # <-- This assert fails.
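As a possible way to tell "denied" apart from "missing" while this behaviour stands, the chunk objects can be read directly through the filesystem instead of through Zarr. The helper below is a sketch: `probe_keys` is a hypothetical name, and it assumes the `fetch` callable raises `PermissionError` for denied objects and `FileNotFoundError` for absent ones, which is what I understand s3fs does when it translates S3 errors.

```python
def probe_keys(fetch, keys):
    """Classify each key as 'ok', 'denied' or 'missing' by reading it directly."""
    report = {}
    for key in keys:
        try:
            fetch(key)
        except FileNotFoundError:
            report[key] = "missing"
        except PermissionError:
            report[key] = "denied"
        else:
            report[key] = "ok"
    return report


# Intended usage with the script above (needs the minio server, so not run here):
# probe_keys(s3.cat_file, [
#     "mast/amc_test.zarr/28415/data/0",
#     "mast/amc_test.zarr/28412/data/0",
# ])
```

If the assumption about the exception types holds, the restricted chunk should come back as "denied" rather than "missing".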

Additional output

No response


Labels: bug
