
BUG: groupby _cython_agg_blocks implicitly assumes unique columns #31735

Closed
@jbrockmendel


xref #31616. The two test cases it adds both use unique columns. Editing test_agg_split_object_part_datetime so that the columns are non-unique breaks it:

import pandas as pd

df = pd.DataFrame(
    {
        "A": pd.date_range("2000", periods=4),
        "B": ["a", "b", "c", "d"],
        "C": [1, 2, 3, 4],
        "D": ["b", "c", "d", "e"],
        "E": pd.date_range("2000", periods=4),
        "F": [1, 2, 3, 4],
    }
).astype(object)
# Make the column labels non-unique: "B" now appears twice.
df.columns = ["A", "B", "B", "D", "E", "F"]

>>> result = df.groupby([0, 0, 0, 0]).min()
pandas.core.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects
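
For context, the message above is the one pandas raises when `Index.get_indexer` is called on an Index with duplicate labels. The sketch below reproduces only that message on the same column labels. It is an assumption that a lookup of this kind is what `_cython_agg_blocks` hits, since the traceback is not shown here, and `lookup_error` is a hypothetical helper written for this illustration.

```python
import pandas as pd

# Same labels as the reproducer above: "B" appears twice.
columns = pd.Index(["A", "B", "B", "D", "E", "F"])


def lookup_error(index, labels):
    """Hypothetical helper: return the exception raised by
    index.get_indexer(labels), or None if the lookup succeeds."""
    try:
        index.get_indexer(labels)
    except Exception as err:
        return err
    return None


# With duplicate labels the lookup is rejected outright.
err = lookup_error(columns, ["B"])
print(columns.is_unique)
print(type(err).__name__, err)

# With unique labels the same lookup succeeds.
unique_err = lookup_error(pd.Index(["A", "B", "C"]), ["B"])
print(unique_err)
```

If that assumption holds, any fix would need to address columns by position rather than by label, which is also what a regression test with duplicate column names should exercise.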

cc @WillAyd @TomAugspurger

Metadata

    Labels

    Bug, Groupby, Internals (related to non-user accessible pandas implementation), Needs Tests (unit test(s) needed to prevent regressions)
