CI/BUG: pyarrow read_csv deadlock #43650

Closed
@mzeitlin11

xref #43611, #43643

While investigating the Azure timeout issues, the deadlock appeared to be occurring in parser code, so pyarrow makes sense as the culprit. Tests with unusual input seem to trigger it, for example some of the parse_dates tests. A specific reproducer is this test:

pandas/tests/io/parser/common/test_ints.py::test_outside_int64_uint64_range

I can't reproduce this on current pyarrow, but Azure uses 0.17.0, and with that version I can reproduce a deadlock on macOS just by running that one test (pandas/tests/io/parser/common/test_ints.py::test_outside_int64_uint64_range). It doesn't happen consistently, but when it does, the process hangs to the point that it needs a SIGKILL to stop, which explains why pytest-timeout didn't catch it.
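
Since the hang is intermittent and in-process timeouts can't interrupt it, one way to measure how often it happens is to run the test repeatedly from a separate watchdog process that kills the child from outside. This is a rough sketch, not something that was run for this issue. The helper names `run_with_hard_timeout` and `count_hangs` are made up for illustration, and the attempt count and timeout are arbitrary. It relies on `subprocess.run(..., timeout=...)` killing the child when the timeout expires (SIGKILL on POSIX).

```python
import subprocess
import sys


def run_with_hard_timeout(cmd, timeout_s):
    """Run cmd in a child process and kill it if it exceeds timeout_s seconds.

    Returns (finished, returncode). returncode is None when the child was killed.
    Hypothetical helper, not part of pandas or pytest.
    """
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child at this point.
        return False, None
    return True, proc.returncode


def count_hangs(cmd, attempts, timeout_s):
    """Run cmd `attempts` times and count how many runs had to be killed."""
    hangs = 0
    for _ in range(attempts):
        finished, _returncode = run_with_hard_timeout(cmd, timeout_s)
        if not finished:
            hangs += 1
    return hangs
```

For the reproducer above, the command would be something like `[sys.executable, "-m", "pytest", "pandas/tests/io/parser/common/test_ints.py::test_outside_int64_uint64_range"]`, passed to `count_hangs` with, say, 20 attempts and a generous per-run timeout. A nonzero result on pyarrow 0.17.0 and zero on a current pyarrow would be consistent with the behavior described here.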

cc @lithomas1 if any thoughts here


    Labels

    Arrow (pyarrow functionality), CI (Continuous Integration), IO CSV (read_csv, to_csv), Testing (pandas testing functions or related to the test suite)
