Description
- The index_col argument of read_csv has accepted either a string or a list of strings for years, but the doc (as of 0.24 dev) has never been updated to reflect this. Current and suggested text are at the bottom.
- All columns used in index_col get dropped as regular columns. The doc never says this explicitly, which causes user confusion.
- The doc does, however, discuss a "malformed file with delimiters at the end of each line... you might consider index_col=False". This is overly prominent for a rare defective case and should be moved somewhere less prominent, or at minimum relegated to a parenthesized footnote.
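To illustrate the first two points, here is a minimal sketch, assuming a reasonably recent pandas. The CSV contents and variable names are made up for illustration. It shows index_col given as a string, as an int, and as a list of strings, and that the chosen columns no longer appear among the regular columns.

```python
import io

import pandas as pd

# Hypothetical sample data, invented for this illustration only.
csv_text = "city,year,sales\nOslo,2017,10\nOslo,2018,12\nBergen,2017,7\n"

# index_col given as a column name (string) or as a column position (int).
by_name = pd.read_csv(io.StringIO(csv_text), index_col="city")
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# index_col given as a list of names produces a MultiIndex.
multi = pd.read_csv(io.StringIO(csv_text), index_col=["city", "year"])

# The columns used for the index are dropped from the regular columns.
print(list(by_name.columns))  # ['year', 'sales']
print(list(multi.columns))    # ['sales']

# They remain reachable through the index.
print(by_name.index.name)
print(list(multi.index.names))
```

Accessing by_name["city"] after this would raise a KeyError, which is the surprise the suggested wording is meant to head off.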
Current read_csv doc:
index_col : int or sequence or False, default None
Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used. If you have a malformed file with delimiters at the end of each line, you might consider index_col=False to force pandas to not use the first column as the index (row names).
Suggested read_csv doc:
index_col : int/string or sequence of int/string or False, default None
Column(s) to use as the row labels of the DataFrame, either given as string name or column index.
If a sequence of int/string is given, a MultiIndex is used.
Columns used for the index (row names) are dropped from the actual columns of the input dataframe. (They are accessible via .index).
(Note: index_col=False can be used to force pandas to not use the first column as the index, e.g. when you have a malformed file with delimiters at the end of each line).
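
For reference, the rare case covered by the note can be sketched as follows. This is an illustration with invented data, assuming a reasonably recent pandas: each data row ends with a delimiter, so it has one more field than the header. Some pandas versions may emit a ParserWarning for the index_col=False call.

```python
import io

import pandas as pd

# Hypothetical malformed data: every data row ends with a trailing comma.
malformed = "a,b,c\n4,apple,bat,\n8,orange,cow,\n"

# Default behaviour: the extra field makes pandas treat the first column
# as the index, so the values shift one column to the left.
shifted = pd.read_csv(io.StringIO(malformed))

# index_col=False: no column is used as the index, and the values line up
# with the header again.
aligned = pd.read_csv(io.StringIO(malformed), index_col=False)

print(shifted)
print(aligned)
```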