Cached filesystem not concurrency-safe? #1107

Closed
@jwodder

Description

We have a program that uses fsspec to mount a cached HTTP filesystem as a FUSE mount; while this process runs in the background, another process traverses the FUSE mount and inspects multiple files in parallel. Unfortunately, the FUSE process keeps hitting errors of the following form:

Uncaught exception from FUSE operation open, returning errno.EINVAL.
Traceback (most recent call last):
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fuse.py", line 734, in _wrapper
    return func(*args, **kwargs) or 0
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fuse.py", line 834, in open
    fi.fh = self.operations('open', path.decode(self.encoding),
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/datalad_fuse/fuse_.py", line 75, in __call__
    return super(DataLadFUSE, self).__call__(op, self.root + path, *args)
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fuse.py", line 1076, in __call__
    return getattr(self, op)(*args)
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/datalad_fuse/fuse_.py", line 203, in open
    fsspec_file = self._adapter.open(path)
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/datalad_fuse/fsspec.py", line 243, in open
    return dsap.open(relpath, mode=mode, encoding=encoding, errors=errors)
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/datalad_fuse/fsspec.py", line 169, in open
    return self.fs.open(url, mode, **kwargs)
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fsspec/implementations/cached.py", line 406, in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fsspec/spec.py", line 1034, in open
    f = self._open(
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fsspec/implementations/cached.py", line 406, in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fsspec/implementations/cached.py", line 345, in _open
    self.save_cache()
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fsspec/implementations/cached.py", line 406, in <lambda>
    return lambda *args, **kw: getattr(type(self), item).__get__(self)(
  File "/home/dandi/cronlib/dandisets-healthstatus/venv/lib/python3.8/site-packages/fsspec/implementations/cached.py", line 161, in save_cache
    cached_files = pickle.load(f)
EOFError: Ran out of input

I suspect the cause is that the cached filesystem is not safe for concurrent access: when several operations run on the filesystem at once, one can read the cache metadata file while another is still writing it, which would explain the EOFError raised by pickle.load inside save_cache. Indeed, there seems to be no locking in the cached filesystem code.
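
To illustrate the kind of protection that appears to be missing, here is a minimal sketch of a read-modify-write cycle on a pickled metadata file that cannot expose a half-written file to readers. It is not fsspec's code: `load_metadata`, `save_metadata`, and `_cache_lock` are hypothetical names made up for this example, and the sketch assumes a POSIX filesystem where `os.replace` is atomic.

```python
import os
import pickle
import tempfile
import threading

# Hypothetical helpers, not part of fsspec. The lock serializes
# read-modify-write cycles between threads of a single process.
_cache_lock = threading.Lock()


def load_metadata(path):
    """Return the pickled dict stored at `path`, or {} if the file is missing."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        return {}


def save_metadata(path, updates):
    """Merge `updates` into the dict stored at `path` without exposing a partial file."""
    with _cache_lock:
        merged = load_metadata(path)
        merged.update(updates)
        directory = os.path.dirname(os.path.abspath(path))
        # Write to a temporary file in the same directory, then rename it
        # over the target, so readers see the old or the new file in full.
        fd, tmp = tempfile.mkstemp(dir=directory)
        try:
            with os.fdopen(fd, "wb") as f:
                pickle.dump(merged, f)
            os.replace(tmp, path)
        except BaseException:
            # The temporary file still exists if we got here; clean it up.
            os.unlink(tmp)
            raise
```

The `threading.Lock` only protects threads within one process; if several processes shared the same cache directory, a file-based lock would be needed in addition. The atomic rename is what avoids the "Ran out of input" symptom, since a reader never opens a truncated file.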
