Commit 6d1372e

BUG: groupby agg raising when column was selected twice (#44944)
1 parent c99cf86 commit 6d1372e

3 files changed: 20 additions and 1 deletion


doc/source/whatsnew/v1.4.0.rst

Lines changed: 1 addition & 0 deletions
@@ -795,6 +795,7 @@ Groupby/resample/rolling
 - Bug in :meth:`GroupBy.mean` failing with ``complex`` dtype (:issue:`43701`)
 - Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not calculating window bounds correctly for the first row when ``center=True`` and index is decreasing (:issue:`43927`)
 - Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` for centered datetimelike windows with uneven nanosecond (:issue:`43997`)
+- Bug in :meth:`GroupBy.mean` raising ``KeyError`` when column was selected at least twice (:issue:`44924`)
 - Bug in :meth:`GroupBy.nth` failing on ``axis=1`` (:issue:`43926`)
 - Fixed bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` not respecting right bound on centered datetime-like windows, if the index contain duplicates (:issue:`3944`)
 - Bug in :meth:`Series.rolling` and :meth:`DataFrame.rolling` when using a :class:`pandas.api.indexers.BaseIndexer` subclass that returned unequal start and end arrays would segfault instead of raising a ``ValueError`` (:issue:`44470`)

pandas/core/base.py

Lines changed: 1 addition & 1 deletion
@@ -235,7 +235,7 @@ def __getitem__(self, key):
             raise IndexError(f"Column(s) {self._selection} already selected")
 
         if isinstance(key, (list, tuple, ABCSeries, ABCIndex, np.ndarray)):
-            if len(self.obj.columns.intersection(key)) != len(key):
+            if len(self.obj.columns.intersection(key)) != len(set(key)):
                 bad_keys = list(set(key).difference(self.obj.columns))
                 raise KeyError(f"Columns not found: {str(bad_keys)[1:-1]}")
             return self._gotitem(list(key), ndim=2)
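
The effect of this one-line change can be reproduced outside of pandas internals. The sketch below is illustrative only: it assumes that `Index.intersection` returns each shared label once, which is the behaviour the fix relies on, and the variable names (`columns`, `key`, `bad_key`, and the `*_raises` flags) are hypothetical stand-ins rather than names from the commit.

```python
import pandas as pd

# Stand-ins for self.obj.columns and the key passed to __getitem__.
columns = pd.Index(["A", "B", "C"])
key = ["A", "B", "A"]  # "A" is selected twice

# Assumption: intersection() yields each shared label only once.
common = columns.intersection(key)

# Old check: unique matches compared against the raw key length.
old_check_raises = len(common) != len(key)

# New check: unique matches compared against the unique key labels.
new_check_raises = len(common) != len(set(key))

# A label that is really absent should still trip the new check.
bad_key = ["A", "Z", "A"]
still_raises = len(columns.intersection(bad_key)) != len(set(bad_key))
bad_keys = list(set(bad_key).difference(columns))

print(old_check_raises, new_check_raises, still_raises, bad_keys)
```

Under that assumption, the old comparison flags `["A", "B", "A"]` as invalid even though `bad_keys` would be empty for it, so the resulting `KeyError` would name no column. Comparing against `len(set(key))` keeps the error for labels that are actually missing.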

pandas/tests/groupby/test_indexing.py

Lines changed: 18 additions & 0 deletions
@@ -2,6 +2,7 @@
 
 import random
 
+import numpy as np
 import pytest
 
 import pandas as pd
@@ -285,3 +286,20 @@ def test_column_axis(column_group_df):
     expected = column_group_df.iloc[:, [1, 3]]
 
     tm.assert_frame_equal(result, expected)
+
+
+@pytest.mark.parametrize("func", [list, pd.Index, pd.Series, np.array])
+def test_groupby_duplicated_columns(func):
+    # GH#44924
+    df = pd.DataFrame(
+        {
+            "A": [1, 2],
+            "B": [3, 3],
+            "C": ["G", "G"],
+        }
+    )
+    result = df.groupby("C")[func(["A", "B", "A"])].mean()
+    expected = pd.DataFrame(
+        [[1.5, 3.0, 1.5]], columns=["A", "B", "A"], index=pd.Index(["G"], name="C")
+    )
+    tm.assert_frame_equal(result, expected)
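
The four containers in the parametrization appear to correspond to the list-like types tested by the `isinstance` check in `__getitem__`. The following sketch, which is not part of the commit, checks that each container reaches that branch and that `set(key)` collapses the duplicate label in every case. It uses the public classes `pd.Series` and `pd.Index` as assumed stand-ins for the internal `ABCSeries` and `ABCIndex`, and `labels`, `columns`, and `summary` are hypothetical names.

```python
import numpy as np
import pandas as pd

labels = ["A", "B", "A"]
columns = pd.Index(["A", "B", "C"])

# Public stand-ins for the types named in the isinstance check (assumption).
list_like_types = (list, tuple, pd.Series, pd.Index, np.ndarray)

summary = {}
for func in (list, pd.Index, pd.Series, np.array):
    key = func(labels)
    summary[func.__name__] = (
        isinstance(key, list_like_types),  # reaches the list-like branch
        len(set(key)),                     # unique labels in the key
        len(columns.intersection(key)),    # unique labels found in columns
        [str(label) for label in key],     # duplicates survive list(key)
    )

for name, row in summary.items():
    print(name, row)
```

In each case the two lengths agree, so the selection would proceed to `_gotitem(list(key), ndim=2)` with the duplicated label still present.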
