Skip to content

stack() does not work with subclassed DataFrame #18859

Closed
@rockg

Description

@rockg

Code Sample, a copy-pastable example if possible

In [1]: from pandas import Series, DataFrame

In [2]: class SubclassedSeries(Series):
   ...: 
   ...:     @property
   ...:     def _constructor(self):
   ...:         return SubclassedSeries
   ...: 
   ...:     @property
   ...:     def _constructor_expanddim(self):
   ...:         return SubclassedDataFrame
   ...: 
   ...: class SubclassedDataFrame(DataFrame):
   ...: 
   ...:     @property
   ...:     def _constructor(self):
   ...:         return SubclassedDataFrame
   ...: 
   ...:     @property
   ...:     def _constructor_sliced(self):
   ...:         return SubclassedSeries
   ...: 

In [3]: sdf = SubclassedDataFrame({"A": [0, 1, 2], "B": [0, 1, 2]})

In [4]: sdf.stack()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-987edeff1e47> in <module>()
----> 1 sdf.stack()

~/Software/miniconda3/envs/pandas_stack/lib/python3.6/site-packages/pandas/core/frame.py in stack(self, level, dropna)
   4506             return stack_multiple(self, level, dropna=dropna)
   4507         else:
-> 4508             return stack(self, level, dropna=dropna)
   4509 
   4510     def unstack(self, level=-1, fill_value=None):

~/Software/miniconda3/envs/pandas_stack/lib/python3.6/site-packages/pandas/core/reshape/reshape.py in stack(frame, level, dropna)
    545 
    546     klass = type(frame)._constructor_sliced
--> 547     return klass(new_values, index=new_index)
    548 
    549 

TypeError: 'property' object is not callable

Problem description

When using stack() on a DataFrame into a Series, the code incorrectly references the class rather than the instance. This works for DataFrame as _constructor_sliced is on the class, but using the paradigm defined here it does not work. Should be as simple as changing 546 to klass = frame._constructor_sliced. Seems like this was worked on in #15655 and reported in #15563 but I don't see an associated commit. What's the status @delgadom?

Expected Output

In [5]: sdf = SubclassedDataFrame({"A": [0, 1, 2], "B": [0, 1, 2]})

In [6]: sdf.stack()
Out[6]: 
0  A    0
   B    0
1  A    1
   B    1
2  A    2
   B    2
dtype: int64

Output of pd.show_versions()

[paste the output of pd.show_versions() here below this line]
INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Linux
OS-release: 3.16.0-38-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.21.1
pytest: None
pip: 9.0.1
setuptools: 36.4.0
Cython: None
numpy: 1.13.1
scipy: None
pyarrow: None
xarray: None
IPython: 6.1.0
sphinx: None
patsy: None
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

Metadata

Metadata

Assignees

No one assigned

    Labels

    Compatpandas objects compatability with Numpy or Python functionsDuplicate ReportDuplicate issue or pull requestReshapingConcat, Merge/Join, Stack/Unstack, ExplodeSubclassingSubclassing pandas objects

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions