Description
The following code worked in Pandas 0.23.4 but not in Pandas 0.24.0 (I'm on Python 3.7.2).
import pandas as pd

class Thing:
    # (Production code would also ensure a Thing instance's hash
    # and equality testing depended on name and color)

    def __init__(self, name, color):
        self.name = name
        self.color = color

    def __str__(self):
        return "<Thing %r>" % (self.name,)

thing1 = Thing('One', 'red')
thing2 = Thing('Two', 'blue')
df = pd.DataFrame({thing1: [0, 1], thing2: [2, 3]})
df.set_index([thing2])
In Pandas 0.23.4, I get the following correct result:
               <Thing 'One'>
<Thing 'Two'>
2                          0
3                          1
In Pandas 0.24.0, I get the following error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../venv/lib/python3.7/site-packages/pandas/core/frame.py", line 4153, in set_index
    raise ValueError(err_msg)
ValueError: The parameter "keys" may be a column key, one-dimensional array, or a list containing only valid column keys and one-dimensional arrays.
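A possible workaround, offered as an assumption rather than something confirmed on 0.24.0: the error message says one-dimensional arrays are accepted, so the column could be passed as a Series instead of by key. The sketch below reuses the names from the example above. Because a column passed this way is not removed from the frame, the explicit `drop` call is added here to mimic the 0.23.4 result.

```python
import pandas as pd


class Thing:
    def __init__(self, name, color):
        self.name = name
        self.color = color

    def __str__(self):
        return "<Thing %r>" % (self.name,)


thing1 = Thing('One', 'red')
thing2 = Thing('Two', 'blue')
df = pd.DataFrame({thing1: [0, 1], thing2: [2, 3]})

# Assumption: passing the column as a Series avoids the key check that
# rejects non-scalar column keys. A column passed as a Series stays in
# the frame, so it is dropped explicitly afterwards.
result = df.set_index(df[thing2]).drop(columns=[thing2])
```

This keeps `thing2` as the index name and leaves only the `thing1` column, which is what the key-based call produced before.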
After looking at Pandas 0.24.0's implementation of DataFrame.set_index (lines 4144 to 4153 in 83eb242), I noticed that is_scalar returns False for thing1 in Pandas 0.24.0:
>>> from pandas.core.dtypes.common import is_scalar
>>> is_scalar(thing1)
False
I suspect that it is incorrect to test DataFrame column keys using is_scalar.
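To illustrate the suspicion, here is a minimal sketch of a check based on hashability and column membership rather than on is_scalar. The helper `is_column_key` is hypothetical (it is not part of pandas), and the sketch imports `is_scalar` and `is_hashable` from the public `pandas.api.types` module rather than from `pandas.core.dtypes.common`. It is not meant as the actual fix, only as a demonstration that the two tests disagree for a custom hashable object.

```python
import pandas as pd
from pandas.api.types import is_hashable, is_scalar


class Thing:
    def __init__(self, name, color):
        self.name = name
        self.color = color


thing1 = Thing('One', 'red')
thing2 = Thing('Two', 'blue')
df = pd.DataFrame({thing1: [0, 1], thing2: [2, 3]})


def is_column_key(frame, key):
    """Hypothetical helper: True if key is hashable and names a column."""
    # is_hashable is checked first so that unhashable inputs such as
    # lists never reach the membership test.
    return is_hashable(key) and key in frame.columns


# is_scalar rejects the custom object even though it is a valid column key.
scalar_check = is_scalar(thing2)
key_check = is_column_key(df, thing2)
list_check = is_column_key(df, [thing2])
```

Here `scalar_check` is False while `key_check` is True, and a list such as `[thing2]` is still rejected as a single key.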
Output of pd.show_versions()
pd.show_versions() from Pandas 0.23.4
INSTALLED VERSIONS
commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.23.4
pytest: None
pip: 18.1
setuptools: 40.4.3
Cython: None
numpy: 1.16.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.1.2
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
pd.show_versions() from Pandas 0.24.0
INSTALLED VERSIONS
commit: None
python: 3.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 17.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8
pandas: 0.24.0
pytest: None
pip: 18.1
setuptools: 40.4.3
Cython: None
numpy: 1.16.0
scipy: 1.1.0
pyarrow: None
xarray: None
IPython: 7.2.0
sphinx: None
patsy: None
dateutil: 2.7.3
pytz: 2018.5
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: 1.1.2
lxml.etree: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
gcsfs: None