pivot_table(aggfunc="count") with category column raises "ValueError: Cannot convert NA to integer" #9534

Closed
@ruoyu0088

Here is test code that returns the correct table:

import pandas as pd
data = {"C1":["A", "B", "C", "C"], "C2":["a", "a", "b", "b"], "V":[1, 2, 3, 4]}
df = pd.DataFrame(data)
df.pivot_table("V", index="C1", columns="C2", aggfunc="count")

When the C1 column is converted to category dtype, a ValueError is raised:

df2 = df.copy()
df2["C1"] = df2["C1"].astype("category")
df2.pivot_table("V", index="C1", columns="C2", aggfunc="count")

groupby also raises the same error:

df2.groupby(["C1", "C2"]).count()
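A possible call-site workaround, sketched under two assumptions: that the NA values come from category combinations that never occur in the data (which is consistent with the dropna() fix proposed in this report), and that the installed pandas accepts the `observed` keyword on groupby, which was added in releases later than the one this report was filed against. Converting C1 back to plain strings before grouping is shown as an alternative that avoids the categorical grouper entirely.

```python
import pandas as pd

data = {"C1": ["A", "B", "C", "C"], "C2": ["a", "a", "b", "b"], "V": [1, 2, 3, 4]}
df2 = pd.DataFrame(data)
df2["C1"] = df2["C1"].astype("category")

# Assumption: `observed` is available in the installed pandas.
# observed=True keeps only the (C1, C2) pairs that actually occur,
# so no all-NA rows are produced for unobserved combinations.
counts = df2.groupby(["C1", "C2"], observed=True).count()

# Alternative: group on plain strings instead of the categorical column.
plain = df2.assign(C1=df2["C1"].astype(str))
counts_plain = plain.groupby(["C1", "C2"]).count()

print(counts)
print(counts_plain)
```

Both results should contain one row per pair present in the data, with integer counts in V.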

Adding dropna() to count() in groupby.py fixes this problem:

    def count(self, axis=0):
        return self._count().dropna().astype('int64')
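To illustrate why the cast fails and why dropping NA first avoids it, here is a small standalone sketch. The `raw` Series is a hypothetical stand-in for the intermediate counts, not the actual value returned by `_count()`. It also shows `fillna(0)` as a different choice that keeps the missing entries as zero counts instead of removing them.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for intermediate counts containing an NA entry.
raw = pd.Series([1.0, np.nan, 2.0], index=["x", "y", "z"])

# Casting directly to int64 fails because NaN has no integer representation.
try:
    raw.astype("int64")
    cast_failed = False
except ValueError:
    cast_failed = True

# Dropping NA first (as in the proposed fix) removes the "y" entry.
dropped = raw.dropna().astype("int64")

# Filling with 0 instead keeps "y" as a zero count.
filled = raw.fillna(0).astype("int64")

print(cast_failed, dropped.tolist(), filled.tolist())
```

Whether unobserved combinations should be dropped or reported as zero is a design choice; the fix proposed above takes the first route.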

Labels: Bug, Categorical, Groupby, Reshaping
