Closed
Description
Closely related to #48476
As far as I can tell this only occurs when the input dtype to groupby is object.
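import numpy as np
import pandas as pd

# Object-dtype key column holding three different null-like values
df = pd.DataFrame({'a': [np.nan, pd.NA, None], 'b': [1, 2, 3]})
gb = df.groupby('a', sort=True, dropna=False)
print(gb.sum())
#      b
# a
# NaN  6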
but with sort=False:
df = pd.DataFrame({'a': [np.nan, pd.NA, None], 'b': [1, 2, 3]})
gb = df.groupby('a', sort=False, dropna=False)
print(gb.sum())
#       b
# a
# NaN   1
# <NA>  2
# None  3
I think we should prefer the sort=True behavior for now, as sort=True is the current default, but prefer the sort=False behavior in the long run.
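Until the two code paths agree, one possible way to get the same grouping under both settings is to collapse every null-like value to a single representation before calling groupby. The following is only a sketch of that idea: `normalize_nulls` is a hypothetical helper, not part of pandas, and it assumes the key column holds scalars only.

```python
import numpy as np
import pandas as pd


def normalize_nulls(values: pd.Series) -> pd.Series:
    """Hypothetical helper: map np.nan, pd.NA and None to np.nan."""
    return values.map(lambda v: np.nan if pd.isna(v) else v)


df = pd.DataFrame({'a': [np.nan, pd.NA, None], 'b': [1, 2, 3]})
print(df['a'].dtype)  # object

# Group on the normalized key with both sort settings
clean = df.assign(a=normalize_nulls(df['a']))
sums = {
    sort: clean.groupby('a', sort=sort, dropna=False)['b'].sum()
    for sort in (True, False)
}
for sort, result in sums.items():
    # number of groups and the sum of b in the first group
    print(sort, len(result), result.iloc[0])
```

With the key normalized, all three rows should land in a single null group regardless of sort, which matches the sort=True behavior shown above.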