Needs Triage: In aggregation, pd.Series.nunique outputs float instead of int. #42383

Closed
@qniksefat

Description

Sorry, I'm new to filing issues, so I'm not sure whether this is a bug or an enhancement request. Here it is.

When you count unique values in a groupby aggregation with pd.Series.nunique, the result comes back as float when the aggregated column is itself float.

import pandas as pd

df = pd.DataFrame({
    'a': 4 * list(range(1, 3)),            # group keys: 1, 2, 1, 2, ...
    'b': [float(x) for x in range(1, 9)],  # floats 1.0 through 8.0
})

df
   a    b
0  1  1.0
1  2  2.0
2  1  3.0
3  2  4.0
4  1  5.0
5  2  6.0
6  1  7.0
7  2  8.0

df.groupby('a').agg({'b': pd.Series.nunique}).b
a
1    4.0
2    4.0
Name: b, dtype: float64

Problem description

If b has an int dtype, the result is int, but if b is float, the result is float as well. A count of unique values is always an integer, so the result should be int regardless of the input dtype.

Expected Output

a
1    4
2    4
Name: b, dtype: int64
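
In case it helps anyone who lands here, this is a minimal sketch of two possible workarounds, not a fix for the underlying behavior. It assumes the same df as above. The variable names counts and cast_counts are just for illustration. The first option calls the groupby's own nunique method instead of passing pd.Series.nunique to agg. The second keeps the original agg call and casts the result explicitly.

```python
import pandas as pd

df = pd.DataFrame({
    'a': 4 * list(range(1, 3)),
    'b': [float(x) for x in range(1, 9)],
})

# Option 1: use the groupby's built-in nunique rather than agg with
# pd.Series.nunique; this should return an integer-typed count per group.
counts = df.groupby('a')['b'].nunique()

# Option 2: keep the original agg call and cast explicitly. This is safe
# here because a count of unique values is always a whole number.
cast_counts = df.groupby('a').agg({'b': pd.Series.nunique}).b.astype('int64')

print(counts)
print(cast_counts)
```

Option 1 seems the simpler choice when nunique is the only aggregation. Option 2 may be handy when the agg dict already mixes several functions.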
