Commit fd9b2a4

BUG: read_csv may interpret second row as index names even if index_col is False (#47397)
* BUG: read_csv may interpret second row as index names even if header is integer
* BUG: read_csv may interpret second row as index names even if index_col is False
* BUG: read_csv may interpret second row as index names even if index_col is False
1 parent 3364f9a commit fd9b2a4

3 files changed, +18 -2 lines changed

doc/source/whatsnew/v1.5.0.rst

Lines changed: 1 addition & 0 deletions
@@ -861,6 +861,7 @@ I/O
 - Bug in :func:`read_csv` not recognizing line break for ``on_bad_lines="warn"`` for ``engine="c"`` (:issue:`41710`)
 - Bug in :meth:`DataFrame.to_csv` not respecting ``float_format`` for ``Float64`` dtype (:issue:`45991`)
 - Bug in :func:`read_csv` not respecting a specified converter to index columns in all cases (:issue:`40589`)
+- Bug in :func:`read_csv` interpreting second row as :class:`Index` names even when ``index_col=False`` (:issue:`46569`)
 - Bug in :func:`read_parquet` when ``engine="pyarrow"`` which caused partial write to disk when column of unsupported datatype was passed (:issue:`44914`)
 - Bug in :func:`DataFrame.to_excel` and :class:`ExcelWriter` would raise when writing an empty DataFrame to a ``.ods`` file (:issue:`45793`)
 - Bug in :func:`read_html` where elements surrounding ``<br>`` were joined without a space between them (:issue:`29528`)

pandas/io/parsers/python_parser.py

Lines changed: 5 additions & 1 deletion
@@ -933,7 +933,11 @@ def _get_index_name(
         implicit_first_cols = len(line) - self.num_original_columns
 
         # Case 0
-        if next_line is not None and self.header is not None:
+        if (
+            next_line is not None
+            and self.header is not None
+            and index_col is not False
+        ):
             if len(next_line) == len(line) + self.num_original_columns:
                 # column and index names on diff rows
                 self.index_col = list(range(len(line)))
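
The effect of the added `index_col is not False` condition can be sketched in plain Python, without pandas. This is only an illustration of the guard shown in the hunk above, not the parser's actual code: the function name `infers_index_names_from_next_line` and the `patched` flag are hypothetical, and the input rows are taken from the new test in this commit (header `a`, then `a,b`, then `c,d,e`), assuming the inferred header is row 0 with one original column.

```python
from typing import Optional, Sequence, Union


def infers_index_names_from_next_line(
    line: Sequence[str],
    next_line: Optional[Sequence[str]],
    num_original_columns: int,
    header: Optional[int],
    index_col: Union[None, bool, int, Sequence[int]],
    patched: bool = True,
) -> bool:
    """Hypothetical stand-in for the "Case 0" guard in _get_index_name.

    Returns True when the row after the header would be treated as
    index names, i.e. column and index names sit on different rows.
    """
    if next_line is None or header is None:
        return False
    # The change in this commit: an explicit index_col=False
    # switches off index-name inference entirely.
    if patched and index_col is False:
        return False
    return len(next_line) == len(line) + num_original_columns


# Rows from the new test: header "a", first data row "a,b", second "c,d,e".
line, next_line = ["a", "b"], ["c", "d", "e"]
before = infers_index_names_from_next_line(
    line, next_line, num_original_columns=1, header=0, index_col=False, patched=False
)
after = infers_index_names_from_next_line(
    line, next_line, num_original_columns=1, header=0, index_col=False, patched=True
)
print(before, after)  # True False
```

With the old condition the row `a,b` satisfies the length check and would be consumed as index names; with the new condition `index_col=False` short-circuits the check, so the row stays in the data.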

pandas/tests/io/parser/test_python_parser_only.py

Lines changed: 12 additions & 1 deletion
@@ -466,6 +466,17 @@ def test_index_col_false_and_header_none(python_parser_only):
 0.5,0.03
 0.1,0.2,0.3,2
 """
-    result = parser.read_csv(StringIO(data), sep=",", header=None, index_col=False)
+    with tm.assert_produces_warning(ParserWarning, match="Length of header"):
+        result = parser.read_csv(StringIO(data), sep=",", header=None, index_col=False)
     expected = DataFrame({0: [0.5, 0.1], 1: [0.03, 0.2]})
     tm.assert_frame_equal(result, expected)
+
+
+def test_header_int_do_not_infer_multiindex_names_on_different_line(python_parser_only):
+    # GH#46569
+    parser = python_parser_only
+    data = StringIO("a\na,b\nc,d,e\nf,g,h")
+    with tm.assert_produces_warning(ParserWarning, match="Length of header"):
+        result = parser.read_csv(data, engine="python", index_col=False)
+    expected = DataFrame({"a": ["a", "c", "f"]})
+    tm.assert_frame_equal(result, expected)
