Description
Code
>>> import pandas as pd
>>> df = pd.DataFrame({'x': [0,1], 'x1': [2,3]})
>>> df.to_csv('tmp.csv', index=False)
>>> pd.read_csv('tmp.csv', usecols='x')
   x
0  0
1  1
>>> pd.read_csv('tmp.csv', usecols=['x1'])
   x1
0   2
1   3
>>> pd.read_csv('tmp.csv', usecols='x1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/matt/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 709, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/matt/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 449, in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
  File "/home/matt/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 818, in __init__
    self._make_engine(self.engine)
  File "/home/matt/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1049, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "/home/matt/anaconda3/lib/python3.6/site-packages/pandas/io/parsers.py", line 1740, in __init__
    raise ValueError("Usecols do not match names.")
ValueError: Usecols do not match names.
Problem description
When using usecols to load a single column, one needs either a single-character column name or an array-like object. In the example above, pd.read_csv('tmp.csv', usecols='x') and pd.read_csv('tmp.csv', usecols=['x1']) work as expected; however, pd.read_csv('tmp.csv', usecols='x1') fails. The resulting error message, ValueError: Usecols do not match names., is not very helpful either.
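One plausible explanation, which is an assumption consistent with the behavior above rather than something confirmed from the parser source, is that a bare string is treated as an iterable of characters. The sketch below uses plain Python (no pandas) and a hypothetical helper, missing_columns, to show how that reading would accept 'x' but reject 'x1':

```python
# Column names as they appear in the header of tmp.csv.
header = ['x', 'x1']


def missing_columns(usecols, names):
    """Hypothetical helper: treat usecols as a generic iterable and
    return the requested entries that are not present in names."""
    return sorted(set(usecols) - set(names))


# A one-character string iterates to a single element that happens to match.
print(missing_columns('x', header))     # []
# A list holding the full name matches as intended.
print(missing_columns(['x1'], header))  # []
# A two-character string iterates to 'x' and '1'; '1' is not a column.
print(missing_columns('x1', header))    # ['1']
```

If this is indeed what happens, usecols='x' only works by coincidence, because the string and its single character are the same name.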
Expected Output
It would be nice if some type checking were done on usecols so that the example above does not break. At the least, the error message should be more helpful; e.g., ValueError: Usecols should be array-like.
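As a rough illustration of the kind of check being asked for, here is a minimal sketch. The function name check_usecols and the exact wording of the message are hypothetical and not part of pandas; the idea is only to reject a bare string early and point at the likely fix:

```python
def check_usecols(usecols):
    """Hypothetical validation step: reject a bare string, pass anything else through.

    None, list-likes and callables are returned unchanged.
    """
    if isinstance(usecols, str):
        raise ValueError(
            "Usecols should be array-like; got the string {!r}. "
            "Did you mean [{!r}]?".format(usecols, usecols)
        )
    return usecols


print(check_usecols(['x1']))  # ['x1']

try:
    check_usecols('x1')
except ValueError as exc:
    print(exc)
```

Until something like this exists in the parser, the same helper could be called on the user side before passing usecols to pd.read_csv.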
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.13.0-37-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 39.0.1
Cython: None
numpy: 1.14.2
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.7.1
patsy: None
dateutil: 2.7.1
pytz: 2018.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None