ENH: Ability to use "normalize" (as in df['col'].value_counts(normalize=True)) with df.groupby(cols).size() #39938

Closed
@bbennett36

groupby.size() should be able to "normalize" its results and return them as proportions, the way value_counts does:

df['col'].value_counts(normalize=True) 
  
A = 0.25
B = 0.25  
C = 0.25  
D = 0.25

To get the same result from groupby.size(), normalized within each subset_product, you currently have to do the following:

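# Count rows per (subset_product, subset_close) pair.
df2 = df.groupby(['subset_product', 'subset_close']).size().reset_index(name='prod_count')
# Total count per subset_product, aligned to the rows of df2.
a = df2.groupby('subset_product')['prod_count'].transform('sum')
# Divide each pair's count by its product total to get a proportion.
df2['prod_count'] = df2['prod_count'].div(a)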

Could a normalize parameter be added to .size() to support this?
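
For reference, a minimal sketch of two shorter ways to get per-group proportions with existing pandas calls. This is not a proposed implementation of the feature. The column names come from the snippet above, while the sample values are made up purely for illustration.

```python
import pandas as pd

# Hypothetical sample data: column names follow the issue, values are invented.
df = pd.DataFrame({
    "subset_product": ["A", "A", "A", "B", "B", "B", "B"],
    "subset_close": [0, 0, 1, 0, 1, 1, 1],
})

# Option 1: keep the MultiIndex from size() and divide by the per-product total.
sizes = df.groupby(["subset_product", "subset_close"]).size()
shares = sizes / sizes.groupby(level="subset_product").transform("sum")

# Option 2: value_counts on a grouped column already accepts normalize=True.
shares_vc = (
    df.groupby("subset_product")["subset_close"]
    .value_counts(normalize=True)
    .sort_index()
)

print(shares)
print(shares_vc)
```

Both options normalize within each subset_product, so the proportions for one product sum to 1. Option 2 only applies when the last grouping key is a single column, which is the case here.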

Labels: Enhancement, Needs Info (clarification about behavior needed to assess issue)