Minimum of ordered categorical data in pandas DataFrames #25299

Closed
@Guillaume1801

Description

I have a pandas DataFrame in which one Series contains ordered Categorical data. Some values of this Series may be missing (NaN). I want to get the minimum without taking NaNs into account, but I obtained strange results...

Code:

import pandas as pd

# "a" is not listed in `categories`, so both "a" entries are stored as NaN
raw_cat = pd.Categorical(["a", "b", "c", "a"],
                         categories=["b", "c", "d"],
                         ordered=True)
s = pd.Series(raw_cat)
raw_cat.min(numeric_only=True), s.min(numeric_only=True)

Output:

('b', nan)

Expected output:

('b', 'b')

I am getting the desired output when running this code with pandas 0.23.4 but not with pandas 0.24.0 and above.
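As a possible workaround while the behavior differs between versions, the missing entries can be dropped explicitly before taking the minimum, so the result does not depend on how `min` treats NaN. This is only a minimal sketch built on the example above; the variable names `missing` and `smallest` are introduced here for illustration and are not part of the original report.

```python
import pandas as pd

# "a" is not among the declared categories, so it is stored as NaN.
raw_cat = pd.Categorical(["a", "b", "c", "a"],
                         categories=["b", "c", "d"],
                         ordered=True)
s = pd.Series(raw_cat)

# Mask of missing entries: the two "a" values.
missing = s.isna().tolist()

# Drop the missing entries explicitly, then take the minimum
# according to the declared category order b < c < d.
smallest = s.dropna().min()

print(missing, smallest)
```

Because the NaNs are removed up front, no `numeric_only` or `skipna` argument is needed in this sketch.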

Is this an issue or a misunderstanding? Thank you for your help.

Metadata

    Labels

    Categorical (Categorical Data Type), Regression (Functionality that used to work in a prior pandas version)
