Description
Problem description
The docs for the DataFrame.groupby signature start with:
by : mapping, function, label, or list of labels
Used to determine the groups for the groupby.
... but the code assumes that lists of mappings or functions can also be passed, and this is also tested, although with limited enthusiasm:
pandas/pandas/tests/groupby/test_grouping.py, line 667 in 0370740
... and consistency (apparently that code path is used somewhere else):
pandas/pandas/tests/groupby/test_grouping.py, line 732 in 0370740
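For illustration, here is a minimal sketch of the undocumented usage in question: a list that mixes a mapping and a function, passed as `by`. The frame, the `kind` dict and the `half` function are made up for this example and are not taken from the test suite; behaviour may differ across pandas versions.

```python
import pandas as pd

# Hypothetical data, not taken from the pandas test suite.
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=["a", "b", "c", "d"])

# A mapping from index labels to group keys.
kind = {"a": "vowel", "b": "consonant", "c": "consonant", "d": "consonant"}


# A function that is called on each index label.
def half(label):
    return "first" if label < "c" else "second"


# The docstring only mentions "mapping, function, label, or list of labels",
# yet a list containing a mapping and a function is accepted as well.
result = df.groupby([kind, half])["value"].sum()
print(result)
```

Each element of the list contributes one level to the resulting index, so `result` here is indexed by (kind, half) pairs.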
Expected Output
Either we disable/deprecate the possibility of passing lists of mappings, or we document it.
I guess the latter is the desired outcome, since the code does not support the feature "by chance". Still, I wanted to double check with @pandas-dev/pandas-core because:
- it is not a killer feature, as it is really easy to pass a single lambda that does the same job as a list of mappings (and more, like applying different mappings to specific levels of the index)
- removing it would allow us to simplify the code quite a bit (e.g. #22257, "get_group(...) fails for groupby(...) based on a function", wouldn't have happened)
- it is probably not much used
Output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.3.final.0
python-bits: 64
OS: Linux
OS-release: 4.9.0-6-amd64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: it_IT.UTF-8
pandas: 0.24.0.dev0+437.g33d70efb5
pytest: 3.5.0
pip: 9.0.1
setuptools: 39.2.0
Cython: 0.28.4
numpy: 1.14.3
scipy: 0.19.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.5.6
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.0dev
tables: 3.3.0
numexpr: 2.6.1
feather: 0.3.1
matplotlib: 2.2.2.post1634.dev0+ge8120cf6d
openpyxl: 2.3.0
xlrd: 1.0.0
xlwt: 1.3.0
xlsxwriter: 0.9.6
lxml: 4.1.1
bs4: 4.5.3
html5lib: 0.999999999
sqlalchemy: 1.0.15
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: 0.2.1
gcsfs: None