Description
Hi there,
I'm trying to write some fsspec examples for our users so they can incorporate it into their own scripts. Only some of our users are proficient in Python, so these examples need to be very verbose and simple. Our use case is climate model simulation and archiving on a tape system. The longer-term goal is to integrate this into some form of cataloguing, but that's for later.
I'm trying to build an index of a tar file's contents and have that index reused the next time the file is opened (via the index_store argument), but that doesn't seem to be working:
import fsspec


def simple_targz_example():
    base_path = "/albedo/work/user/pgierz/SciComp/Tutorials/AWIESM_Basics/experiments/basic-001.tar.gz"
    index_path = "basic-001.tar.gz.index"  # Cache information about the contents to disk at this location
    # fo is the file_object (this is the tar file to look inside of)
    basic_001_fs = fsspec.filesystem(
        "tar",
        fo=base_path,
        index_store=index_path,
    )
    # Find all NetCDF files in "outdata".
    # NOTE(PG): Since we have a tar filesystem with the ``fo`` argument, we DO NOT need
    #           to include the base_path on the filesystem here. However, the tar file
    #           contains the base folder, so you need to include that:
    outdata_netcdf_files = basic_001_fs.glob("basic-001/outdata/**/*.nc")
    for nc_file in outdata_netcdf_files:
        print(nc_file)
I saw in the code here (filesystem_spec/fsspec/implementations/tar.py, lines 91 to 101 at commit 6b85a47) that index_store does not seem to be implemented yet. I had a try at it in #1807.
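To illustrate the behaviour I am after, here is a minimal sketch that uses only the standard library (tarfile and json), not fsspec. It is not how fsspec implements or would implement index_store; the helper name load_or_build_tar_index and the JSON layout are made up for this example. It also does not check whether the archive has changed since the index was written, so a stale index would need to be deleted by hand.

```python
import json
import os
import tarfile


def load_or_build_tar_index(tar_path, index_path):
    """Return a dict mapping member names to basic metadata.

    Hypothetical helper: if index_path already exists, the listing is read
    from it and the archive is not opened at all. Otherwise the archive is
    scanned once and the listing is written to index_path for next time.
    """
    if os.path.exists(index_path):
        with open(index_path) as f:
            return json.load(f)

    index = {}
    # "r:*" lets tarfile detect the compression (gzip, bz2, xz, or none).
    with tarfile.open(tar_path, "r:*") as tf:
        for member in tf:
            index[member.name] = {
                "size": member.size,
                # Offset of the file data within the uncompressed tar stream.
                "offset_data": member.offset_data,
                "is_file": member.isfile(),
            }

    with open(index_path, "w") as f:
        json.dump(index, f)
    return index
```

With this, listing the NetCDF files on a second run would only read the small JSON file, for example: `[name for name, info in load_or_build_tar_index(base_path, index_path).items() if info["is_file"] and name.endswith(".nc")]`.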
I might need to ask some additional questions. I hope I don't spam too much with silly things; sorry in advance.
Cheers,
Paul