Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
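import pandas as pd
import numpy as np

# Two-level index: level 'a' is unique per row, level 'b' is a random 'A'/'B' label.
idx = pd.MultiIndex.from_arrays([
    [str(i) for i in range(100)],
    np.random.choice(['A', 'B'], size=(100,)),
], names=['a', 'b'])

# Ten float columns, plus one string column and one bool column.
data_dict = dict((str(i), np.random.rand(100)) for i in range(10))
data_dict['string'] = [str(i) for i in range(100)]
data_dict['bool'] = np.random.choice([True, False], (100,))
data = pd.DataFrame(data_dict, index=idx)

# Raises ValueError on pandas 1.2.0 (traceback under "Problem description").
data.sem(level=1)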
Problem description
Calling data.sem(level=1) on this mixed-dtype DataFrame raises a ValueError instead of returning a result:
Traceback (most recent call last):
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-8-cd9abe134148>", line 1, in <module>
data.sem(level=1)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 10925, in sem
return NDFrame.sem(self, axis, skipna, level, ddof, numeric_only, **kwargs)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 10665, in sem
return self._stat_function_ddof(
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 10655, in _stat_function_ddof
return self._agg_by_level(
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/generic.py", line 10552, in _agg_by_level
return getattr(grouped, name)(**kwargs)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/groupby/groupby.py", line 1612, in sem
result.iloc[:, cols] = result.iloc[:, cols] / np.sqrt(
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/indexing.py", line 691, in __setitem__
iloc._setitem_with_indexer(indexer, value, self.name)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/indexing.py", line 1636, in _setitem_with_indexer
self._setitem_single_block(indexer, value, name)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/indexing.py", line 1860, in _setitem_single_block
self.obj._mgr = self.obj._mgr.setitem(indexer=indexer, value=value)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 568, in setitem
return self.apply("setitem", indexer=indexer, value=value)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/internals/managers.py", line 427, in apply
applied = getattr(b, f)(**kwargs)
File "/Users/wenjun/miniconda3/lib/python3.8/site-packages/pandas/core/internals/blocks.py", line 1025, in setitem
values[indexer] = value
ValueError: shape mismatch: value array of shape (2,12) could not be broadcast to indexing result of shape (11,2)
Expected Output
data.sem(level=1) should return the standard error of the mean for each group of index level 1 ('b') instead of raising an exception.
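For reference, here is a rough sketch of the kind of result I would expect. It assumes the non-numeric columns ('string' and 'bool') are simply left out, which is my assumption about the intended behaviour and not something I have confirmed. The seeded generator, the variable names `numeric`, `expected` and `manual`, and the use of `select_dtypes` plus an explicit `groupby` are all illustrative choices, not what `sem(level=...)` does internally:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability

idx = pd.MultiIndex.from_arrays(
    [[str(i) for i in range(100)], rng.choice(["A", "B"], size=100)],
    names=["a", "b"],
)
data_dict = {str(i): rng.random(100) for i in range(10)}
data_dict["string"] = [str(i) for i in range(100)]
data_dict["bool"] = rng.choice([True, False], size=100)
data = pd.DataFrame(data_dict, index=idx)

# Assumption: only numeric columns take part in the aggregation.
numeric = data.select_dtypes(include="number")

# Group explicitly by index level 'b' and take the standard error of the mean.
expected = numeric.groupby(level="b").sem()

# Cross-check one cell against the definition: std(ddof=1) / sqrt(n).
col = numeric.loc[numeric.index.get_level_values("b") == "A", "0"]
manual = col.std(ddof=1) / np.sqrt(len(col))
```

This avoids the exception for me and can serve as a stop-gap, but it does not tell us whether the bool column should be part of the result.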
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 3e89b4c
python : 3.8.3.final.0
python-bits : 64
OS : Darwin
OS-release : 19.6.0
Version : Darwin Kernel Version 19.6.0: Tue Nov 10 00:10:30 PST 2020; root:xnu-6153.141.10~1/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : None
LANG : None
LOCALE : zh_CN.UTF-8
pandas : 1.2.0
numpy : 1.19.1
pytz : 2020.4
dateutil : 2.8.1
pip : 20.2.2
setuptools : 49.6.0.post20200814
Cython : 0.29.21
pytest : 6.0.1
hypothesis : None
sphinx : 3.2.1
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.2
IPython : 7.17.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : 0.4.2
gcsfs : None
matplotlib : 3.2.2
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 1.0.1
pyxlsb : None
s3fs : None
scipy : 1.5.0
sqlalchemy : 1.3.18
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
numba : 0.51.2