df.to_dict() no longer supports symbols in column names as of v0.24.0 #25012

Closed
@vinceatbluelabs

Example code

import numpy as np
import pandas as pd

# Structured array whose field names contain dots
data = np.array([(1, 5, 'bazzle')],
                dtype=[
                    ('abc', 'i4'),
                    ('foo.bar', 'i4'),
                    ('foo.baz.bing', 'U10')
                ])
mock_dataframe = pd.DataFrame(data)
mock_dataframe.to_dict(orient='records')

Problem description

It looks like DataFrames whose column names contain non-alphanumeric characters (e.g., _ or .) now get mangled when run through .to_dict(orient='records'): in the example here, the keys 'foo.bar' and 'foo.baz.bing' come back as '_1' and '_2'. To reproduce (see code sample above), start with a DataFrame like this:

abc  foo.bar  foo.baz.bing
1    5        bazzle

Expected Output (behavior in v0.23.4)

[{'abc': 1, 'foo.bar': 5, 'foo.baz.bing': 'bazzle'}]

Full example from v0.23.4

(bluelabs-joblib-python-3.6.3) vincebroz@LM65:~/src/bluelabs-joblib-python$ pip install --upgrade 'pandas<0.24'
Collecting pandas<0.24
  Downloading https://files.pythonhosted.org/packages/78/78/50ef81a903eccc4e90e278a143c9a0530f05199f6221d2e1b21025852982/pandas-0.23.4-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (14.6MB)
    100% |████████████████████████████████| 14.7MB 1.3MB/s 
Requirement already satisfied, skipping upgrade: python-dateutil>=2.5.0 in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from pandas<0.24) (2.7.5)
Requirement already satisfied, skipping upgrade: pytz>=2011k in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from pandas<0.24) (2018.7)
Requirement already satisfied, skipping upgrade: numpy>=1.9.0 in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from pandas<0.24) (1.15.4)
Requirement already satisfied, skipping upgrade: six>=1.5 in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from python-dateutil>=2.5.0->pandas<0.24) (1.11.0)
apache-airflow 1.10.1 has requirement jinja2<2.9.0,>=2.7.3, but you'll have jinja2 2.10 which is incompatible.
apache-airflow 1.10.1 has requirement sqlalchemy<1.2.0,>=1.1.15, but you'll have sqlalchemy 1.2.14 which is incompatible.
Installing collected packages: pandas
  Found existing installation: pandas 0.24.0
    Uninstalling pandas-0.24.0:
      Successfully uninstalled pandas-0.24.0
Successfully installed pandas-0.23.4
You are using pip version 18.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(bluelabs-joblib-python-3.6.3) vincebroz@LM65:~/src/bluelabs-joblib-python$ python3
Python 3.6.3 (default, Oct 12 2018, 12:58:48) 
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.10.44.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> data = np.array([(1, 5, 'bazzle')],
...                         dtype=[
...                             ('abc', 'i4'),
...                             ('foo.bar', 'i4'),
...                             ('foo.baz.bing', 'U10')
...                         ])
>>> mock_dataframe = pd.DataFrame(data)
>>> mock_dataframe
   abc  foo.bar foo.baz.bing
0    1        5       bazzle
>>> mock_dataframe.to_dict(orient='records')
[{'abc': 1, 'foo.bar': 5, 'foo.baz.bing': 'bazzle'}]
>>> 

Actual Output

[{'abc': 1, '_1': 5, '_2': 'bazzle'}]
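
The '_1' and '_2' keys match what Python's collections.namedtuple produces when it is asked to rename invalid field names. The sketch below reproduces the same keys with the standard library alone. That pandas 0.24.0 builds its records through named tuples is an assumption, not something this report confirms, and the class name Row is purely illustrative.

```python
from collections import namedtuple

# Column names from the example; two of them are not valid Python identifiers.
columns = ['abc', 'foo.bar', 'foo.baz.bing']

# rename=True replaces each invalid field name with '_<position>'.
# 'Row' is a hypothetical name used only for this illustration.
Row = namedtuple('Row', columns, rename=True)

row = Row(1, 5, 'bazzle')
mangled = dict(row._asdict())

print(Row._fields)  # ('abc', '_1', '_2')
print(mangled)      # {'abc': 1, '_1': 5, '_2': 'bazzle'}
```

If this is the cause, any column label that is not a valid identifier (or that starts with an underscore) would be replaced in the same positional way.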

Full example from v0.24.0

(bluelabs-joblib-python-3.6.3) vincebroz@LM65:~/src/bluelabs-joblib-python$ pip install --upgrade pandas
Collecting pandas
  Downloading https://files.pythonhosted.org/packages/b3/4a/bd76e1522f9cbb038eaea01ef8d59ab1014abfe086a0cc60d938da586f10/pandas-0.24.0-cp36-cp36m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl (16.3MB)
    100% |████████████████████████████████| 16.3MB 1.2MB/s 
Requirement already satisfied, skipping upgrade: numpy>=1.12.0 in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from pandas) (1.15.4)
Requirement already satisfied, skipping upgrade: python-dateutil>=2.5.0 in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from pandas) (2.7.5)
Requirement already satisfied, skipping upgrade: pytz>=2011k in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from pandas) (2018.7)
Requirement already satisfied, skipping upgrade: six>=1.5 in /Users/vincebroz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages (from python-dateutil>=2.5.0->pandas) (1.11.0)
apache-airflow 1.10.1 has requirement jinja2<2.9.0,>=2.7.3, but you'll have jinja2 2.10 which is incompatible.
apache-airflow 1.10.1 has requirement sqlalchemy<1.2.0,>=1.1.15, but you'll have sqlalchemy 1.2.14 which is incompatible.
Installing collected packages: pandas
  Found existing installation: pandas 0.23.4
    Uninstalling pandas-0.23.4:
      Successfully uninstalled pandas-0.23.4
Successfully installed pandas-0.24.0
You are using pip version 18.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
(bluelabs-joblib-python-3.6.3) vincebroz@LM65:~/src/bluelabs-joblib-python$ python3
Python 3.6.3 (default, Oct 12 2018, 12:58:48) 
[GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.10.44.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> data = np.array([(1, 5, 'bazzle')],
...                         dtype=[
...                             ('abc', 'i4'),
...                             ('foo.bar', 'i4'),
...                             ('foo.baz.bing', 'U10')
...                         ])
>>> mock_dataframe = pd.DataFrame(data)
>>> mock_dataframe
   abc  foo.bar foo.baz.bing
0    1        5       bazzle
>>> mock_dataframe.to_dict(orient='records')
[{'abc': 1, '_1': 5, '_2': 'bazzle'}]
>>> 
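
A possible workaround while on 0.24.0 is to build the records by hand from plain tuples, which sidesteps any field renaming. This is a minimal sketch, not an official fix: the helper name records_with_original_names is hypothetical, and it relies only on DataFrame.itertuples with index=False and name=None.

```python
import numpy as np
import pandas as pd

# Same structured array as in the report.
data = np.array([(1, 5, 'bazzle')],
                dtype=[('abc', 'i4'), ('foo.bar', 'i4'), ('foo.baz.bing', 'U10')])
mock_dataframe = pd.DataFrame(data)


def records_with_original_names(df):
    """Hypothetical helper: one dict per row, keyed by the real column labels."""
    # name=None makes itertuples yield plain tuples instead of named tuples,
    # so the column labels are never converted into attribute names.
    return [dict(zip(df.columns, row))
            for row in df.itertuples(index=False, name=None)]


records = records_with_original_names(mock_dataframe)
print(records)
```

Because the keys are taken straight from df.columns, labels containing dots or other symbols are preserved as written.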

Output of pd.show_versions()

pd.show_versions()
/Users/broz/.pyenv/versions/3.6.3/envs/bluelabs-joblib-python-3.6.3/lib/python3.6/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")

INSTALLED VERSIONS

commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Darwin
OS-release: 18.2.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.24.0
pytest: None
pip: 18.1
setuptools: 28.8.0
Cython: None
numpy: 1.15.2
scipy: None
pyarrow: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml.etree: 4.2.5
bs4: 4.6.3
html5lib: 1.0.1
sqlalchemy: 1.2.12
pymysql: None
psycopg2: 2.7.5 (dt dec pq3 ext lo64)
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None
