Commit b00749c

BUG: Groupby head/tail with axis=1 fails (#37778)
1 parent 08bf822 commit b00749c

3 files changed: +34 -2 lines changed

doc/source/whatsnew/v1.2.0.rst

Lines changed: 2 additions & 0 deletions
@@ -543,6 +543,8 @@ Groupby/resample/rolling
 - Bug in :meth:`df.groupby(..).quantile() <pandas.core.groupby.DataFrameGroupBy.quantile>` and :meth:`df.resample(..).quantile() <pandas.core.resample.Resampler.quantile>` raised ``TypeError`` when values were of type ``Timedelta`` (:issue:`29485`)
 - Bug in :meth:`Rolling.median` and :meth:`Rolling.quantile` returned wrong values for :class:`BaseIndexer` subclasses with non-monotonic starting or ending points for windows (:issue:`37153`)
 - Bug in :meth:`DataFrame.groupby` dropped ``nan`` groups from result with ``dropna=False`` when grouping over a single column (:issue:`35646`, :issue:`35542`)
+- Bug in :meth:`DataFrameGroupBy.head`, :meth:`DataFrameGroupBy.tail`, :meth:`SeriesGroupBy.head`, and :meth:`SeriesGroupBy.tail` would raise when used with ``axis=1`` (:issue:`9772`)
+
 
 Reshaping
 ^^^^^^^^^

pandas/core/groupby/groupby.py

Lines changed: 8 additions & 2 deletions
@@ -2742,7 +2742,10 @@ def head(self, n=5):
         """
         self._reset_group_selection()
         mask = self._cumcount_array() < n
-        return self._selected_obj[mask]
+        if self.axis == 0:
+            return self._selected_obj[mask]
+        else:
+            return self._selected_obj.iloc[:, mask]
 
     @Substitution(name="groupby")
     @Substitution(see_also=_common_see_also)
@@ -2776,7 +2779,10 @@ def tail(self, n=5):
         """
         self._reset_group_selection()
         mask = self._cumcount_array(ascending=False) < n
-        return self._selected_obj[mask]
+        if self.axis == 0:
+            return self._selected_obj[mask]
+        else:
+            return self._selected_obj.iloc[:, mask]
 
     def _reindex_output(
         self, output: OutputFrameOrSeries, fill_value: Scalar = np.NaN
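
The mask logic in this hunk can be sketched in plain Python. This is only an illustration, not pandas' implementation: `cumcount`, `head_mask`, `tail_mask`, and `selected_positions` are hypothetical helpers standing in for `_cumcount_array` and the boolean selection. With `axis=1` the mask has one entry per column, while a boolean mask passed to `obj[mask]` filters rows, which is presumably why the fix routes the mask through `.iloc[:, mask]`.

```python
from collections import defaultdict


def cumcount(labels, ascending=True):
    """Return each element's position within its group.

    Positions are counted from the start of the group when ascending=True
    and from the end of the group when ascending=False.
    """
    n = len(labels)
    order = range(n) if ascending else range(n - 1, -1, -1)
    seen = defaultdict(int)  # how many members of each group were visited so far
    counts = [0] * n
    for i in order:
        counts[i] = seen[labels[i]]
        seen[labels[i]] += 1
    return counts


def head_mask(labels, n):
    # Keep the first n members of every group (mirrors `cumcount < n`).
    return [c < n for c in cumcount(labels)]


def tail_mask(labels, n):
    # Keep the last n members of every group (mirrors the ascending=False case).
    return [c < n for c in cumcount(labels, ascending=False)]


def selected_positions(mask):
    # Integer positions where the mask is True, like a boolean positional lookup.
    return [i for i, keep in enumerate(mask) if keep]


# Column grouping used by the new test: A and B fall in group 0, C in group 1.
labels = [0, 0, 1]
print(selected_positions(head_mask(labels, 1)))
print(selected_positions(tail_mask(labels, 1)))
```

Because the comparison is `count < n`, a zero or negative `n` yields an all-False mask and therefore an empty selection, and an `n` at least as large as the biggest group keeps every column.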

pandas/tests/groupby/test_nth.py

Lines changed: 24 additions & 0 deletions
@@ -513,6 +513,30 @@ def test_groupby_head_tail(op, n, expected_rows, columns, as_index):
     tm.assert_frame_equal(result, expected)
 
 
+@pytest.mark.parametrize(
+    "op, n, expected_cols",
+    [
+        ("head", -1, []),
+        ("head", 0, []),
+        ("head", 1, [0, 2]),
+        ("head", 7, [0, 1, 2]),
+        ("tail", -1, []),
+        ("tail", 0, []),
+        ("tail", 1, [1, 2]),
+        ("tail", 7, [0, 1, 2]),
+    ],
+)
+def test_groupby_head_tail_axis_1(op, n, expected_cols):
+    # GH 9772
+    df = DataFrame(
+        [[1, 2, 3], [1, 4, 5], [2, 6, 7], [3, 8, 9]], columns=["A", "B", "C"]
+    )
+    g = df.groupby([0, 0, 1], axis=1)
+    expected = df.iloc[:, expected_cols]
+    result = getattr(g, op)(n)
+    tm.assert_frame_equal(result, expected)
+
+
 def test_group_selection_cache():
     # GH 12839 nth, head, and tail should return same result consistently
     df = DataFrame([[1, 2], [1, 4], [5, 6]], columns=["A", "B"])
