groupby/quantile breaks #28307

Closed
@daishi

Description

Code Sample, a copy-pastable example if possible

In [34]: pd.DataFrame({'x': [0, 1, 2], 'y': [0, 1, 2]}).groupby('x')['y'].quantile([0.5])                                                                                                                          
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-34-287422cf6b27> in <module>
----> 1 pd.DataFrame({'x': [0, 1, 2], 'y': [0, 1, 2]}).groupby('x')['y'].quantile([0.5])

~/.venv-3.7.2/p4p_backend-1ulE3AqG/lib/python3.7/site-packages/pandas/core/groupby/groupby.py in quantile(self, q, interpolation)
   1951             indices = np.concatenate(arrays)
   1952             assert len(indices) == len(result)
-> 1953             return result.take(indices)
   1954 
   1955     @Substitution(name="groupby")

~/.venv-3.7.2/p4p_backend-1ulE3AqG/lib/python3.7/site-packages/pandas/core/series.py in take(self, indices, axis, is_copy, **kwargs)
   4430 
   4431         indices = ensure_platform_int(indices)
-> 4432         new_index = self.index.take(indices)
   4433 
   4434         if is_categorical_dtype(self):

~/.venv-3.7.2/p4p_backend-1ulE3AqG/lib/python3.7/site-packages/pandas/core/indexes/multi.py in take(self, indices, axis, allow_fill, fill_value, **kwargs)
   2030             allow_fill=allow_fill,
   2031             fill_value=fill_value,
-> 2032             na_value=-1,
   2033         )
   2034         return MultiIndex(

~/.venv-3.7.2/p4p_backend-1ulE3AqG/lib/python3.7/site-packages/pandas/core/indexes/multi.py in _assert_take_fillable(self, values, indices, allow_fill, fill_value, na_value)
   2058                 taken = masked
   2059         else:
-> 2060             taken = [lab.take(indices) for lab in self.codes]
   2061         return taken
   2062 

~/.venv-3.7.2/p4p_backend-1ulE3AqG/lib/python3.7/site-packages/pandas/core/indexes/multi.py in <listcomp>(.0)
   2058                 taken = masked
   2059         else:
-> 2060             taken = [lab.take(indices) for lab in self.codes]
   2061         return taken
   2062 

IndexError: index 3 is out of bounds for size 3

Problem description

The call above raises IndexError: index 3 is out of bounds for size 3 on pandas 0.25.1, when q is passed as a list. The same call did not raise on pandas 0.24.2.
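A possible workaround, sketched here under assumptions rather than confirmed against 0.25.1: call quantile once per scalar q and stack the results. This assumes the scalar-q path is not affected by the bug. The helper name groupby_quantiles is hypothetical and is not part of pandas.

```python
import pandas as pd


def groupby_quantiles(df, by, column, qs):
    """Per-group quantiles for several q values, using one scalar call per q.

    Hypothetical helper (not a pandas API). Assumes that
    groupby(...)[column].quantile(q) with a scalar q works on the
    installed pandas version.
    """
    grouped = df.groupby(by)[column]
    # One Series per quantile, each indexed by the group key.
    per_q = [grouped.quantile(q) for q in qs]
    # Stack the Series under an outer index level that holds q,
    # then move the group key to the outer level and sort.
    stacked = pd.concat(per_q, keys=list(qs))
    return stacked.swaplevel().sort_index()


df = pd.DataFrame({'x': [0, 1, 2], 'y': [0, 1, 2]})
result = groupby_quantiles(df, 'x', 'y', [0.5])
print(result)
```

The result is a Series with a two-level index (group key, then q), which is the shape one would expect from passing a list of quantiles directly.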

Expected Output

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.7.2.final.0
python-bits      : 64
OS               : Linux
OS-release       : 5.0.0-27-generic
machine          : x86_64
processor        : x86_64
byteorder        : little
LC_ALL           : None
LANG             : en_US.UTF-8
LOCALE           : en_US.UTF-8

pandas           : 0.25.1
numpy            : 1.17.1
pytz             : 2019.2
dateutil         : 2.8.0
pip              : 19.2.3
setuptools       : 41.2.0
Cython           : None
pytest           : 5.1.2
hypothesis       : None
sphinx           : None
blosc            : None
feather          : None
xlsxwriter       : 1.2.0
lxml.etree       : 4.4.1
html5lib         : None
pymysql          : None
psycopg2         : 2.8.3 (dt dec pq3 ext lo64)
jinja2           : 2.10.1
IPython          : 7.5.0
pandas_datareader: None
bs4              : 4.8.0
bottleneck       : None
fastparquet      : None
gcsfs            : None
lxml.etree       : 4.4.1
matplotlib       : 3.1.1
numexpr          : None
odfpy            : None
openpyxl         : 2.6.3
pandas_gbq       : None
pyarrow          : None
pytables         : None
s3fs             : None
scipy            : 1.3.1
sqlalchemy       : 1.3.8
tables           : None
xarray           : None
xlrd             : 1.2.0
xlwt             : None
xlsxwriter       : 1.2.0

Labels

Needs Info: Clarification about behavior needed to assess issue