Description
Code Sample, a copy-pastable example if possible
In pandas 0.24 we encounter a bug where the dictionary key is not set correctly if the column name is class and we use to_dict(orient="records"). All other orientations are ok. If we use the column name klass then the records orientation is also ok. Our csv file is:
case, class
1, 1
2, 0
3, 1
4, 0
5, 1
6, 0
7, 1
8, 0
Then, we load the data with:
read_csv(fname, skipinitialspace=True).to_dict(orient="records")
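For a copy-pastable version that does not need a file on disk, the same data can be inlined. This is a minimal sketch: io.StringIO stands in for fname, and the name CSV_TEXT is only illustrative. It prints the pandas version, the parsed column labels, and the keys of the first record, so the output can be compared between 0.23.4 and 0.24.

```python
import io

import pandas as pd

# The CSV from this report, inlined so the snippet is self-contained.
CSV_TEXT = """case, class
1, 1
2, 0
3, 1
4, 0
5, 1
6, 0
7, 1
8, 0
"""

# skipinitialspace=True strips the blank after each comma, so the
# second column label is "class" rather than " class".
df = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True)
records = df.to_dict(orient="records")

print(pd.__version__)
print(list(df.columns))    # column labels as parsed
print(sorted(records[0]))  # keys of the first record
```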
Problem description
All column names should appear as keys in the records, but class has been replaced with _1 in the new version of pandas.
Expected Output
In pandas 0.23.4 the resulting dictionary is:
df:
   case  class
0     1      1
1     2      0
2     3      1
3     4      0
4     5      1
5     6      0
6     7      1
7     8      0
to_dict:
<class 'list'>: [{'case': 1, 'class': 1}, {'case': 2, 'class': 0}, {'case': 3, 'class': 1}, {'case': 4, 'class': 0}, {'case': 5, 'class': 1}, {'case': 6, 'class': 0}, {'case': 7, 'class': 1}, {'case': 8, 'class': 0}]
This is what we'd expect. However, in pandas 0.24 we get:
df:
   case  class
0     1      1
1     2      0
2     3      1
3     4      0
4     5      1
5     6      0
6     7      1
7     8      0
to_dict:
<class 'list'>: [{'case': 1, '_1': 1}, {'case': 2, '_1': 0}, {'case': 3, '_1': 1}, {'case': 4, '_1': 0}, {'case': 5, '_1': 1}, {'case': 6, '_1': 0}, {'case': 7, '_1': 1}, {'case': 8, '_1': 0}]
Note that the class column name is missing from the keys in the records. This occurs in Python 3.6 and 3.7 (https://travis-ci.org/comic/evalutils/builds/486831323).
The column name class is preserved in the dictionary with every other orientation: orient="dict", "list", "series", "split" or "index".
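Since orient="index" keeps the column names, a possible stopgap is to build the records from that orientation and drop the row labels. This is only a sketch of a workaround, not a fix, and it uses a two-row excerpt of the data above; it assumes the row order of the resulting dict matches the frame, which holds on Python 3.7+ where dicts keep insertion order.

```python
import io

import pandas as pd

# Two-row excerpt of the CSV from this report.
df = pd.read_csv(io.StringIO("case, class\n1, 1\n2, 0\n"), skipinitialspace=True)

# orient="index" maps each row label to a {column: value} dict;
# taking the values discards the labels and leaves a list of records.
records = list(df.to_dict(orient="index").values())

print(records)
```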
If we change the column name to klass, then we do not get this problem using any orientation.
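One possible explanation for the _1 key, which I have not confirmed against the pandas source: if the records orientation now builds each row through a collections.namedtuple created with rename=True, then any field name that is a Python keyword is replaced by an underscore plus its position. That would turn class (position 1) into _1 and leave klass alone. The sketch below uses only the standard library to show that renaming; the names Row and KlassRow are illustrative.

```python
from collections import namedtuple

# With rename=True, invalid field names (such as Python keywords)
# are replaced by "_<position>" instead of raising ValueError.
Row = namedtuple("Row", ["case", "class"], rename=True)
print(Row._fields)

# "klass" is a valid identifier, so it is kept unchanged.
KlassRow = namedtuple("KlassRow", ["case", "klass"], rename=True)
print(KlassRow._fields)

# Converting a row back to a dict carries the renamed field along.
record = dict(Row(1, 1)._asdict())
print(record)
```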
It is not clear to me whether #24965 solves this problem.
Output of pd.show_versions()
INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-44-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8
pandas: 0.24.0
pytest: 3.8.0
pip: 18.0
setuptools: 40.2.0
Cython: None
numpy: 1.15.1
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: 1.8.0
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
None