Sum of grouped bool column has inconsistent type #7001

Closed
@jkleint

Summing a bool column after a groupby gives a bool result until there are two or more True values, at which point it becomes float64. It seems like the result should always be an (unsigned?) integer. A plain sum without a groupby always gives int64. This is with pandas 0.13.1.

import pandas as pd
pd.DataFrame([True]).groupby(lambda x: 0).sum()
      0
0  True

pd.DataFrame([True,True]).groupby(lambda x: 0).sum()
   0
0  2

pd.DataFrame([False]).groupby(lambda x: 0).sum()
       0
0  False

pd.DataFrame([False,False]).groupby(lambda x: 0).sum()
       0
0  False

pd.DataFrame([False,False,True]).groupby(lambda x: 0).sum()
      0
0  True

pd.DataFrame([False,False,True,True]).groupby(lambda x: 0).sum()
   0
0  2

pd.DataFrame([False,False]).sum()
0    0
dtype: int64
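
A possible workaround, as a minimal sketch rather than a fix: cast the bool column to an integer dtype before grouping, so the aggregation never sees bool input and the result type should not depend on how many True values a group holds. The helper name `grouped_bool_sum` and the column name `flag` are made up for illustration, and this has not been checked against 0.13.1 specifically.

```python
import pandas as pd


def grouped_bool_sum(values):
    """Sum booleans within a single group after casting them to int64.

    Hypothetical helper for illustration; not part of pandas.
    """
    frame = pd.DataFrame({"flag": values})
    # Cast before grouping so the sum operates on integers, not bools.
    as_int = frame.astype("int64")
    # Same grouping as in the report: every row lands in group 0.
    return as_int.groupby(lambda x: 0).sum()


# Try groups containing zero, one, two, and three True values.
results = {n: grouped_bool_sum([True] * n + [False]) for n in range(4)}
dtypes = {n: str(r["flag"].dtype) for n, r in results.items()}
totals = {n: int(r["flag"].iloc[0]) for n, r in results.items()}

print(dtypes)
print(totals)
```

Checking the `dtype` of each result, rather than reading the printed frame, makes the inconsistency easier to see, since a float64 column holding whole numbers can print without a decimal point.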
