Description
In #22225, #23192 (and now #23582), I've had a persistent ResourceWarning in the last few CI runs. I first thought it was flaky, like those warnings used to be, but this time it stayed, and I can reproduce some of it locally (not with pytest pandas/tests/io/test_parquet.py, but at least with pytest pandas/tests/io).
For example in:
- https://travis-ci.org/pandas-dev/pandas/jobs/453820311
- https://travis-ci.org/pandas-dev/pandas/jobs/453822449
- https://travis-ci.org/pandas-dev/pandas/jobs/454102793
- https://travis-ci.org/pandas-dev/pandas/jobs/454342935
- https://travis-ci.org/pandas-dev/pandas/jobs/454644088
- https://travis-ci.org/pandas-dev/pandas/jobs/454644964
- https://travis-ci.org/pandas-dev/pandas/jobs/454744932
- https://travis-ci.org/pandas-dev/pandas/jobs/454760670
- https://travis-ci.org/pandas-dev/pandas/jobs/454760916
sys:1: ResourceWarning: unclosed <socket.socket fd=16, family=AddressFamily.AF_INET, type=2050, proto=0, laddr=('0.0.0.0', 0)>
sys:1: ResourceWarning: unclosed <socket.socket fd=15, family=AddressFamily.AF_INET, type=2050, proto=0, laddr=('0.0.0.0', 0)>
and
=============================== warnings summary ===============================
pandas/core/frame.py::pandas.core.frame.DataFrame.to_parquet
/home/travis/build/pandas-dev/pandas/pandas/io/parquet.py:129: ResourceWarning: unclosed file <_io.BufferedReader name='df.parquet.gzip'>
**kwargs).to_pandas()
There's also a stderr (or stdout) warning from the parser tests surfacing somewhere:
..............................................................x...........................................................s....
........................................Skipping line 3: Expected 3 fields in line 3, saw 4
.......................................s.......................................................................................
I've narrowed one of the ResourceWarnings down to the parquet-s3 tests, but at least one other remains that I haven't been able to track down (same for the skipped line warning). I couldn't find anything by grepping for 'df.parquet.gzip' in various combinations, and I tried disabling anything related to 'gzip', S3 or _io in several trial runs in #23192, to no avail.
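One way to find where an unclosed object was created is to combine the standard-library tracemalloc and warnings modules, since a ResourceWarning carries a reference to the offending object in its source attribute. The following is only a minimal sketch, not something taken from the pandas test suite: the helper collect_resource_warnings and the leaky function are hypothetical names, and leaky merely stands in for whichever test leaves a file or socket open.

```python
import gc
import os
import tempfile
import tracemalloc
import warnings


def collect_resource_warnings(func, nframes=10):
    """Call func() and return [(message, allocation_traceback_lines), ...].

    Hypothetical helper: records every ResourceWarning raised while func
    runs and asks tracemalloc where the offending object was allocated.
    """
    started_here = not tracemalloc.is_tracing()
    if started_here:
        tracemalloc.start(nframes)  # keep up to nframes frames per allocation
    try:
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always", ResourceWarning)
            func()
            gc.collect()  # finalize objects kept alive by reference cycles
        results = []
        for w in caught:
            if not issubclass(w.category, ResourceWarning):
                continue
            tb = None
            if w.source is not None:
                tb = tracemalloc.get_object_traceback(w.source)
            results.append((str(w.message), tb.format() if tb else []))
        return results
    finally:
        if started_here:
            tracemalloc.stop()


def leaky():
    # Hypothetical stand-in for a test that forgets to close a file.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        open(path, "rb")  # never closed, so finalization emits a ResourceWarning
    finally:
        os.remove(path)


if __name__ == "__main__":
    for message, frames in collect_resource_warnings(leaky):
        print(message)
        print("\n".join(frames))
```

Starting the interpreter with -X tracemalloc has a similar effect without a helper: the default warning display then appends the allocation traceback to each ResourceWarning. Whether that output survives pytest's own warning capture is something I have not checked.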
Any help would be appreciated. I'd also be interested to hear if someone else has seen these already. The code in the PRs I linked at the top cannot reasonably be the culprit (e.g. #23582 just adds tests)...
Potentially related xref: #22934