Column name class is ignored when using to_dict(orient="records") after upgrading to 0.24  #25071

Closed
@jmsmkn

Code Sample, a copy-pastable example if possible

In pandas 0.24 we hit a bug: when a column is named class, to_dict(orient="records") does not use that name as the dictionary key. All other orientations are fine, and if the column is named klass instead, the records orientation is fine too. Our CSV file is:

case, class
1, 1
2, 0
3, 1
4, 0
5, 1
6, 0
7, 1
8, 0

Then, we load the data with:

import pandas as pd
pd.read_csv(fname, skipinitialspace=True).to_dict(orient="records")
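
For a copy-pastable version that does not depend on a file on disk, here is a minimal sketch that feeds the same CSV text through io.StringIO. The names CSV_TEXT, df and records are only illustrative, and the keys printed at the end depend on the installed pandas version.

```python
import io

import pandas as pd

# Same data as the CSV file above, inlined so the example is self-contained.
CSV_TEXT = """case, class
1, 1
2, 0
3, 1
4, 0
5, 1
6, 0
7, 1
8, 0
"""

# skipinitialspace=True strips the space after each comma, so the second
# column is named "class" rather than " class".
df = pd.read_csv(io.StringIO(CSV_TEXT), skipinitialspace=True)
records = df.to_dict(orient="records")

print("pandas", pd.__version__)
print("columns:", list(df.columns))
print("keys of first record:", list(records[0].keys()))
```

Comparing the last two printed lines shows directly whether the installed version is affected.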

Problem description

All column names should appear in the records, but in the new version of pandas class has been replaced with _1.

Expected Output

In pandas 0.23.4 the resulting dictionary is:

df:
   case  class
0     1      1
1     2      0
2     3      1
3     4      0
4     5      1
5     6      0
6     7      1
7     8      0

to_dict:
<class 'list'>: [{'case': 1, 'class': 1}, {'case': 2, 'class': 0}, {'case': 3, 'class': 1}, {'case': 4, 'class': 0}, {'case': 5, 'class': 1}, {'case': 6, 'class': 0}, {'case': 7, 'class': 1}, {'case': 8, 'class': 0}]

This is what we'd expect. However, in pandas 0.24 we get:

df:
   case  class
0     1      1
1     2      0
2     3      1
3     4      0
4     5      1
5     6      0
6     7      1
7     8      0

to_dict:
<class 'list'>: [{'case': 1, '_1': 1}, {'case': 2, '_1': 0}, {'case': 3, '_1': 1}, {'case': 4, '_1': 0}, {'case': 5, '_1': 1}, {'case': 6, '_1': 0}, {'case': 7, '_1': 1}, {'case': 8, '_1': 0}]

Note that the class column name is missing from the keys in the records. This occurs in Python 3.6 and 3.7 (https://travis-ci.org/comic/evalutils/builds/486831323).

The column name class is preserved in the dictionary when using any of orient="dict", "list", "series", "split" or "index".
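
Since the split orientation keeps the column names, one possible workaround is to build the records from its output. This is only a sketch: records_via_split is a hypothetical helper name, not part of pandas, and it relies on orient="split" returning "columns" and "data" entries.

```python
import pandas as pd


def records_via_split(df):
    """Build a list of row dicts from the 'split' orientation.

    Hypothetical helper for illustration; it pairs each row of values
    with the original column labels instead of using orient="records".
    """
    split = df.to_dict(orient="split")
    return [dict(zip(split["columns"], row)) for row in split["data"]]


# Small frame with the problematic column name.
df = pd.DataFrame({"case": [1, 2, 3], "class": [1, 0, 1]})
records = records_via_split(df)
print(records)
```

Because the keys come straight from the column labels, a name such as class is passed through unchanged.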

If we change the column name to klass, then we do not get this problem using any orientation.
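
The difference between class and klass suggests the Python keyword is what matters. One plausible explanation, which this report does not confirm, is that the records are built from named tuples: collections.namedtuple with rename=True replaces any invalid field name with an underscore plus its position. The sketch below uses only the standard library; Row and columns are illustrative names.

```python
import keyword
from collections import namedtuple

columns = ["case", "class"]

# rename=True silently replaces invalid field names (such as keywords)
# with "_<position>" instead of raising ValueError.
Row = namedtuple("Row", columns, rename=True)

print(Row._fields)                 # ('case', '_1')
print(keyword.iskeyword("class"))  # True
print(keyword.iskeyword("klass"))  # False
```

The _1 here matches the key seen in the 0.24 output, since class is the column at position 1.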

It is not clear to me whether #24965 solves this problem.

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.5.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.0-44-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_GB.UTF-8
LOCALE: en_GB.UTF-8

pandas: 0.24.0
pytest: 3.8.0
pip: 18.0
setuptools: 40.2.0
Cython: None
numpy: 1.15.1
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: None
sphinx: 1.8.0
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
None
