pandas/io/feather_format.py should call use_threads instead of nthreads to prevent breakage in pyarrow 0.11.0 #23053

Closed
@bvanderhaar

Description

Code Sample

import pandas

d = {'one': [1., 2., 3., 4.],
     'two': [4., 3., 2., 1.]}
df = pandas.DataFrame(d)
df.to_feather('example.feather')

# with pyarrow 0.10.0 this succeeds with a deprecation warning
# with pyarrow 0.11.0 this errors with a TypeError: unexpected argument 'nthreads'
df = pandas.read_feather('example.feather')

# attempt to manually set nthreads results in TypeError: unexpected argument 'nthreads'
df = pandas.read_feather('example.feather', nthreads=4)

# attempt to pass 'use_threads' results in TypeError: unexpected argument 'nthreads'
df = pandas.read_feather('example.feather', use_threads=True)

Problem description

Pandas introduced nthreads for reading feather files in issue 16359

With PyArrow 0.10.0 a deprecation warning is shown from this source: "nthreads argument is deprecated, pass use_threads instead"

With PyArrow 0.11.0, the call fails with: TypeError: read_feather() got an unexpected keyword argument 'nthreads'.

I've searched with 'pyarrow' and 'nthreads' keywords and didn't see this issue posted.

Specifically, feather_format.py line 112 should be changed to
return feather.read_dataframe(path, use_threads=True)
or the method signature should be changed to allow overriding use_threads:
return feather.read_dataframe(path, use_threads=use_threads)
I will submit a PR if the only barrier to fix is code effort.
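
A third option would be to keep the existing nthreads parameter and pick the keyword based on the installed PyArrow version, so that older releases keep working. The following is only a minimal sketch of that idea, not the actual pandas fix: parse_version and feather_thread_kwargs are hypothetical helper names, and mapping nthreads > 1 to use_threads=True is an assumption about how the two arguments could correspond.

```python
def parse_version(version):
    """Turn a version string such as '0.11.0' into a comparable tuple (0, 11, 0)."""
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # ignore suffixes such as 'dev5' or 'rc1'
            digits += ch
        parts.append(int(digits) if digits else 0)
    return tuple(parts)


def feather_thread_kwargs(pyarrow_version, nthreads=1):
    """Return the threading keyword to forward to feather.read_dataframe.

    Hypothetical helper: PyArrow 0.11.0 rejects 'nthreads' (per this issue),
    so from that version on the count is translated to 'use_threads'.
    """
    if parse_version(pyarrow_version) >= (0, 11, 0):
        # Assumption: more than one requested thread means "use threads".
        return {"use_threads": nthreads > 1}
    # Older releases still accept 'nthreads' (0.10.0 only warns about it).
    return {"nthreads": nthreads}


if __name__ == "__main__":
    print(feather_thread_kwargs("0.10.0", nthreads=4))
    print(feather_thread_kwargs("0.11.0", nthreads=4))
```

Inside read_feather the result could then be forwarded as feather.read_dataframe(path, **feather_thread_kwargs(pyarrow.__version__, nthreads)).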

Expected Output

I expect no error output upon running pandas.read_feather() with PyArrow 0.11.0

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.6.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 79 Stepping 1, GenuineIntel
byteorder: little
LC_ALL: None
LANG: None
LOCALE: None.None

pandas: 0.23.4
pytest: None
pip: 18.1
setuptools: 40.3.0
Cython: None
numpy: 1.15.1
scipy: 1.1.0
pyarrow: 0.10.0
xarray: None
IPython: 6.5.0
sphinx: None
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: 0.4.0
matplotlib: 2.2.2
openpyxl: None
xlrd: 1.1.0
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: 1.2.11
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Labels: Compat (pandas objects compatibility with NumPy or Python functions), IO Data (IO issues that don't fit into a more specific label)
