read_csv index_col argument accepts string, list-of-string #22276

Closed
@smcinerney

Description

  1. The read_csv index_col argument has accepted either a string or a list of strings for years, but the doc (as of 0.24 dev) has never been updated to reflect this. Current and suggested text are at the bottom.
  2. All columns used in index_col are dropped from the regular columns. The doc never says this explicitly, which causes user confusion.
  3. The doc does, however, discuss a "malformed file with delimiters at the end of each line... you might consider index_col=False". This is overly prominent for a rare defective case and should be moved somewhere less prominent, or at minimum relegated to a parenthesized footnote.
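
A minimal sketch of points 1 and 2, assuming a small made-up CSV (the column names `city`, `year`, `sales` and the data are hypothetical, chosen only for illustration). It passes index_col as a string, as an int, and as a list of strings, and checks that the index columns no longer appear among the regular columns:

```python
import io

import pandas as pd

# Hypothetical sample data, not taken from the pandas docs.
csv_text = "city,year,sales\nOslo,2020,10\nOslo,2021,12\nRome,2020,7\n"

# index_col given as a column name (string).
by_name = pd.read_csv(io.StringIO(csv_text), index_col="city")

# index_col given as a column position (int); same column as above.
by_pos = pd.read_csv(io.StringIO(csv_text), index_col=0)

# index_col given as a list of strings, which produces a MultiIndex.
multi = pd.read_csv(io.StringIO(csv_text), index_col=["city", "year"])

# The index columns are removed from the regular columns.
print(by_name.index.name, list(by_name.columns))
print(list(multi.index.names), list(multi.columns))
```

After the call, the values of `city` are only reachable through `.index` (or by calling `reset_index()`), which is the behaviour point 2 asks the doc to state.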

Current read_csv doc:

index_col : int or sequence or False, default None
Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to not use the first column as the index (row names).

Suggested read_csv doc:

index_col : int/string or sequence of int/string or False, default None
Column(s) to use as the row labels of the DataFrame, given either as a string name or as a column index.
If a sequence of int/string is given, a MultiIndex is used.
Columns used for the index (row names) are dropped from the regular columns of the resulting DataFrame. (They remain accessible via .index.)
(Note: index_col=False can be used to force pandas not to use the first column as the index, e.g. when you have a malformed file with delimiters at the end of each line.)
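
For reference, a small sketch of the malformed-file case the note refers to. The sample data is an assumption for illustration: a three-column header followed by rows that each end in a stray trailing delimiter. Depending on the pandas version, the second call may emit a ParserWarning about the header length not matching the data.

```python
import io

import pandas as pd

# Hypothetical malformed input: every data row ends with an extra comma.
data = "a,b,c\n4,apple,bat,\n8,orange,cow,\n"

# Default behaviour: the rows have one more field than the header,
# so pandas uses the first column as the index.
default = pd.read_csv(io.StringIO(data))

# index_col=False: the first column is kept as a regular column
# and a default integer index is used instead.
fixed = pd.read_csv(io.StringIO(data), index_col=False)

print(default)
print(fixed)
```

In the default call the values 4 and 8 end up as row labels and the remaining values shift one column to the left; with index_col=False they stay in column `a`.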
