BUG: read_csv stopped working with s3 file system #34519

Closed
@hellocoldworld

Description

  • [x] I have checked that this issue has not already been reported.

  • [x] I have confirmed this bug exists on the latest version of pandas.

  • [x] (optional) I have confirmed this bug exists on the master branch of pandas.
    Checked against pandas 1.1.0.dev0+1732.g2428cdda3

Problem description

read_csv in pandas 1.0.4 has stopped working with s3fs when the credentials are supplied through an S3FileSystem instance.

With pandas 1.0.3, the following snippet

import os
import pandas as pd
import s3fs

# Create the filesystem with explicit credentials; the instance is never passed to pandas.
s3fs.S3FileSystem(anon=False, key=os.environ.get("STORE_USERNAME"), secret=os.environ.get("STORE_PASSWORD"))
df = pd.read_csv(filepath_or_buffer="s3://my-private-bucket/my_dataframe.csv")
print(df.shape)

prints the correct shape, whereas with pandas 1.0.4 it raises the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/nico/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/nico/.local/lib/python3.7/site-packages/pandas/io/parsers.py", line 431, in _read
    filepath_or_buffer, encoding, compression
  File "/home/nico/.local/lib/python3.7/site-packages/pandas/io/common.py", line 212, in get_filepath_or_buffer
    filepath_or_buffer, encoding=encoding, compression=compression, mode=mode
  File "/home/nico/.local/lib/python3.7/site-packages/pandas/io/s3.py", line 52, in get_filepath_or_buffer
    file, _fs = get_file_and_filesystem(filepath_or_buffer, mode=mode)
  File "/home/nico/.local/lib/python3.7/site-packages/pandas/io/s3.py", line 42, in get_file_and_filesystem
    file = fs.open(_strip_schema(filepath_or_buffer), mode)
  File "/home/nico/.local/lib/python3.7/site-packages/fsspec/spec.py", line 775, in open
    **kwargs
  File "/home/nico/.local/lib/python3.7/site-packages/s3fs/core.py", line 378, in _open
    autocommit=autocommit, requester_pays=requester_pays)
  File "/home/nico/.local/lib/python3.7/site-packages/s3fs/core.py", line 1097, in __init__
    cache_type=cache_type)
  File "/home/nico/.local/lib/python3.7/site-packages/fsspec/spec.py", line 1065, in __init__
    self.details = fs.info(path)
  File "/home/nico/.local/lib/python3.7/site-packages/s3fs/core.py", line 530, in info
    Key=key, **version_id_kw(version_id), **self.req_kw)
  File "/home/nico/.local/lib/python3.7/site-packages/s3fs/core.py", line 200, in _call_s3
    return method(**additional_kwargs)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/client.py", line 622, in _make_api_call
    operation_model, request_dict, request_context)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/client.py", line 641, in _make_request
    return self._endpoint.make_request(operation_model, request_dict)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 102, in make_request
    return self._send_request(request_dict, operation_model)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 132, in _send_request
    request = self.create_request(request_dict, operation_model)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/endpoint.py", line 116, in create_request
    operation_name=operation_model.name)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/signers.py", line 90, in handler
    return self.sign(operation_name, request)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/signers.py", line 160, in sign
    auth.add_auth(request)
  File "/home/nico/.local/lib/python3.7/site-packages/botocore/auth.py", line 357, in add_auth
    raise NoCredentialsError
botocore.exceptions.NoCredentialsError: Unable to locate credentials
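
The traceback shows pandas opening the file through its own s3 code path, where botocore finds no credentials. One possible way to sidestep that lookup, sketched below and not confirmed as a fix by the maintainers, is to open the object on the explicitly configured filesystem and hand the open file to read_csv. The helper names read_csv_with_explicit_fs and make_s3_filesystem are hypothetical and not part of pandas or s3fs; the bucket path and environment variables are the ones from the report.

```python
import os

import pandas as pd


def read_csv_with_explicit_fs(fs, path, **read_csv_kwargs):
    """Open `path` on an already-configured filesystem and parse it as CSV.

    `fs` is any object exposing `open(path, mode)`, for example an
    `s3fs.S3FileSystem`. Hypothetical helper, not part of pandas.
    """
    # pandas receives a file object, so it never builds its own S3 client.
    with fs.open(path, "rb") as handle:
        return pd.read_csv(handle, **read_csv_kwargs)


def make_s3_filesystem():
    """Build the filesystem from the environment variables used in the report."""
    import s3fs  # imported lazily so the helper above does not require s3fs

    return s3fs.S3FileSystem(
        anon=False,
        key=os.environ.get("STORE_USERNAME"),
        secret=os.environ.get("STORE_PASSWORD"),
    )


if __name__ == "__main__":
    # The path is given without the "s3://" scheme because it is resolved
    # by the filesystem object rather than by pandas.
    df = read_csv_with_explicit_fs(
        make_s3_filesystem(), "my-private-bucket/my_dataframe.csv"
    )
    print(df.shape)
```

Passing the file object keeps credential handling in one place (the filesystem instance), at the cost of no longer using the "s3://" URL form directly in read_csv.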

Output of pd.show_versions()

using pandas 1.0.3

INSTALLED VERSIONS
------------------
commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-46-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : es_AR.UTF-8
LOCALE : es_AR.UTF-8

pandas : 1.0.3
numpy : 1.18.4
pytz : 2020.1
dateutil : 2.8.1
pip : 9.0.1
setuptools : 39.0.1
Cython : None
pytest : 5.4.0
hypothesis : None
sphinx : 1.6.7
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 0.999999999
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.0
pyxlsb : None
s3fs : 0.4.2
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None

using pandas 1.0.4

INSTALLED VERSIONS
------------------
commit : None
python : 3.7.5.final.0
python-bits : 64
OS : Linux
OS-release : 5.3.0-46-generic
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : es_AR.UTF-8
LOCALE : es_AR.UTF-8

pandas : 1.0.4
numpy : 1.18.4
pytz : 2020.1
dateutil : 2.8.1
pip : 9.0.1
setuptools : 39.0.1
Cython : None
pytest : 5.4.0
hypothesis : None
sphinx : 1.6.7
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : 0.999999999
pymysql : None
psycopg2 : None
jinja2 : 2.10
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.4.0
pyxlsb : None
s3fs : 0.4.2
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None

Labels

Bug
IO CSV: read_csv, to_csv
IO Network: Local or Cloud (AWS, GCS, etc.) IO Issues
