BLD/TST: add pyarrow on CI to macosx build #18714

Closed
@jreback

xref #18662 (comment)

Currently failing with pyarrow 0.7.1 and fastparquet (fp) 0.1.3 on macOS:

(pandas) bash-3.2$ pytest pandas/tests/io/test_parquet.py --tb=short
=========================================================================================== test session starts ===========================================================================================
platform darwin -- Python 3.6.1, pytest-3.2.1, py-1.4.34, pluggy-0.4.0
rootdir: /Users/jreback/pandas, inifile: setup.cfg
plugins: xdist-1.16.0, cov-2.3.1
collected 38 items                                                                                                                                                                                         

pandas/tests/io/test_parquet.py .....F............s...s...x...s.s.....

================================================================================================ FAILURES =================================================================================================
_________________________________________________________________________________________ test_cross_engine_pa_fp _________________________________________________________________________________________
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:96: in __init__
    with open_with(fn2, 'rb') as f:
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/util.py:44: in default_open
    return open(f, mode)
E   NotADirectoryError: [Errno 20] Not a directory: '/var/folders/h3/mr_r3bkj5yg0pbx9fr3tk1r00000gp/T/tmpii71wdx8/_metadata'

During handling of the above exception, another exception occurred:
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:119: in _parse_header
    fmd = read_thrift(f, parquet_thrift.FileMetaData)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/thrift_structures.py:22: in read_thrift
    obj.read(pin)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1899: in read
    _elem53.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1742: in read
    _elem33.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1656: in read
    self.meta_data.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:1487: in read
    self.statistics.read(iprot)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/parquet_thrift/parquet/ttypes.py:298: in read
    iprot.skip(ftype)
../miniconda3/envs/pandas/lib/python3.6/site-packages/thrift/protocol/TProtocol.py:208: in skip
    self.readString()
../miniconda3/envs/pandas/lib/python3.6/site-packages/thrift/protocol/TProtocol.py:184: in readString
    return binary_to_str(self.readBinary())
../miniconda3/envs/pandas/lib/python3.6/site-packages/thrift/compat.py:37: in binary_to_str
    return bin_val.decode('utf8')
E   UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 2: invalid start byte

During handling of the above exception, another exception occurred:
pandas/tests/io/test_parquet.py:186: in test_cross_engine_pa_fp
    result = read_parquet(path, engine=fp)
pandas/io/parquet.py:211: in read_parquet
    return impl.read(path, columns=columns, **kwargs)
pandas/io/parquet.py:123: in read
    return self.api.ParquetFile(path).to_pandas(columns=columns, **kwargs)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:102: in __init__
    self._parse_header(f, verify)
../miniconda3/envs/pandas/lib/python3.6/site-packages/fastparquet/api.py:122: in _parse_header
    self.fn)
E   fastparquet.util.ParquetException: Metadata parse failed: /var/folders/h3/mr_r3bkj5yg0pbx9fr3tk1r00000gp/T/tmpii71wdx8
======================================================================== 1 failed, 32 passed, 4 skipped, 1 xfailed in 2.93 seconds ========================================================================
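
The second exception in the chain is a plain UTF-8 decode failure: the traceback shows thrift's `skip` calling `readString`, which decodes the field's raw bytes with `bin_val.decode('utf8')`. A minimal sketch of just that step, using only the standard library, is below. The `payload` bytes and the `try_decode_utf8` helper are made up for illustration; only the `0xb5` byte at position 2 mirrors the log, and this says nothing about which field pyarrow actually wrote there.

```python
def try_decode_utf8(raw: bytes):
    """Return (text, None) on success, or (None, error message) on failure."""
    try:
        return raw.decode("utf8"), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# Hypothetical payload: arbitrary binary data with 0xb5 at index 2.
# 0xb5 is a UTF-8 continuation byte, so it cannot start a character.
payload = b"\x00\x00\xb5\x01"

text, error = try_decode_utf8(payload)
print(error)

# Plain ASCII bytes decode without error.
ok_text, ok_error = try_decode_utf8(b"abc")
```

If this reading of the traceback is right, any binary (non-text) value that reaches `readString` through the skip path would fail the same way, which is consistent with the failure appearing only when reading a pyarrow-written file with fastparquet.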

Labels: CI (continuous integration), Compat (pandas objects compatibility with NumPy or Python functions), IO Parquet (parquet, feather)
