groupby with as_index=False returns NaNs when a grouping column is categorical #13204

Closed
@toasteez

Description

Please see this Stack Overflow question for an example of the issue:

http://stackoverflow.com/questions/37279260/why-doesnt-pandas-allow-a-categorical-column-to-be-used-in-groupby?noredirect=1#comment62084780_37279260

>>> pd.__version__
'0.18.1'

# Import the pandas module
import pandas as pd

# Create an example dataframe (17 rows, all on the same date)
raw_data = {
    'Date': ['2016-05-13'] * 17,
    'Portfolio': ['A', 'A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'C', 'C'],
    'Duration': [1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3],
    'Yield': [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1],
}

df = pd.DataFrame(raw_data, columns=['Date', 'Portfolio', 'Duration', 'Yield'])

# Make 'Portfolio' categorical with a custom category order, then sort by it
df['Portfolio'] = pd.Categorical(df['Portfolio'], categories=['C', 'B', 'A'])
df = df.sort_values('Portfolio')

# Group on a plain column plus the categorical column, keeping the keys as columns
dfs = df.groupby(['Date', 'Portfolio'], as_index=False).sum()

print(dfs)

                        Date    Portfolio   Duration   Yield
Date        Portfolio               
13/05/2016  C           NaN     NaN         NaN        NaN
            B           NaN     NaN         NaN        NaN
            A           NaN     NaN         NaN        NaN
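
A possible workaround, sketched here on a smaller frame than the one above: group with the default as_index=True and then call reset_index() to move the keys back into columns. This is an assumption about what avoids the problem, not a confirmed fix for 0.18.1. The observed=True argument is also an assumption: it needs a newer pandas (0.23 or later) and is not available in the version reported here.

```python
import pandas as pd

# Smaller, illustrative frame with the same shape as the one in the report.
df = pd.DataFrame({
    "Date": ["2016-05-13"] * 6,
    "Portfolio": ["A", "A", "B", "B", "C", "C"],
    "Duration": [1, 1, 2, 2, 3, 3],
    "Yield": [0.3, 0.3, 2.0, 2.0, 1.0, 1.0],
})
df["Portfolio"] = pd.Categorical(df["Portfolio"], categories=["C", "B", "A"])

# Group with the default as_index=True, then turn the group keys back into
# ordinary columns. observed=True (pandas >= 0.23, an assumption here) keeps
# only the category combinations that actually occur in the data.
result = (
    df.groupby(["Date", "Portfolio"], observed=True)[["Duration", "Yield"]]
    .sum()
    .reset_index()
)

print(result)
```

If this behaves as expected, the result has one row per portfolio, with Date and Portfolio as regular columns and no NaN values in Duration or Yield.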
