Description
import pandas as pd
import numpy as np
levels = [np.arange(10), np.arange(100), np.arange(100)]
codes = [
    np.arange(10).repeat(10000),
    np.tile(np.arange(100).repeat(100), 10),
    np.tile(np.tile(np.arange(100), 100), 10),
]
index = pd.MultiIndex(levels=levels, codes=codes)
df = pd.DataFrame(np.random.randn(len(index), 4), index=index)
%timeit df.groupby(level=1).std()
This points to #34372 (cc @rhshadrach), but there was also an earlier slowdown.
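
For anyone comparing pandas versions outside IPython, the same measurement can be sketched with the standard-library `timeit` module. This is only a rough stand-in for `%timeit`: the helper name `build_frame` and the `repeat`/`number` values are arbitrary choices for illustration, not taken from the original report.

```python
import timeit

import numpy as np
import pandas as pd


def build_frame():
    """Build the 100,000-row frame from the report (hypothetical helper name)."""
    levels = [np.arange(10), np.arange(100), np.arange(100)]
    codes = [
        np.arange(10).repeat(10000),
        np.tile(np.arange(100).repeat(100), 10),
        np.tile(np.tile(np.arange(100), 100), 10),
    ]
    index = pd.MultiIndex(levels=levels, codes=codes)
    return pd.DataFrame(np.random.randn(len(index), 4), index=index)


df = build_frame()

# repeat=5 and number=10 are arbitrary; raise them for more stable timings.
number = 10
timings = timeit.repeat(lambda: df.groupby(level=1).std(), repeat=5, number=number)

# Report the best run, converted to milliseconds per call.
best_ms = min(timings) / number * 1e3
print(f"pandas {pd.__version__}: best of 5 = {best_ms:.2f} ms per call")
```

Running this script under each pandas version of interest gives numbers that can be compared directly, which may help narrow down where the earlier slowdown was introduced.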