Warn on duplicate names in MI?

Opening a new issue so this isn't lost.

https://github.com/pandas-dev/pandas/pull/18882 banned duplicate names in a MultiIndex. I think this is a good change, since allowing duplicates hit a lot of edge cases as soon as you tried to actually do something with the index. I want to make sure we understand every case that actually produces duplicate names in the MI though, specifically `groupby.apply`.

```python
In [1]: import pandas as pd

In [2]: pdf = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7, 8, 9],
   ...:                     'b': [4, 5, 6, 3, 2, 1, 0, 0, 0]},
   ...:                    index=[0, 1, 3, 5, 6, 8, 9, 9, 9]).set_index("a")

In [3]: pdf.groupby(pdf.index).apply(lambda x: x.b)
```

Another, more realistic example is a groupwise `drop_duplicates`:


```python
In [18]: df = pd.DataFrame({"B": [0, 0, 0, 1, 1, 1, 2, 2, 2]}, index=pd.Index([0, 1, 1, 2, 2, 2, 0, 0, 1], name='a'))

In [19]: df
Out[19]:
   B
a
0  0
1  0
1  0
2  1
2  1
2  1
0  2
0  2
1  2

In [20]: df.groupby('a').apply(pd.DataFrame.drop_duplicates)
Out[20]:
     B
a a
0 0  0
  0  2
1 1  0
  1  2
2 2  1
```

Is it possible to emit a warning in this case for now, rather than raising, in case duplicate names are more common than we thought?
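To make the ask concrete, here is a rough sketch of what a warn-instead-of-raise check could look like. This is not the code from the PR: `duplicate_level_names` and `warn_on_duplicate_names` are hypothetical helper names, the choice of `FutureWarning` is an assumption, and so is skipping `None` (unnamed) levels.

```python
import warnings
from collections import Counter


def duplicate_level_names(names):
    """Return the level names that occur more than once, in first-seen order.

    Assumption: None (an unnamed level) is never treated as a duplicate.
    """
    counts = Counter(name for name in names if name is not None)
    return [name for name, n in counts.items() if n > 1]


def warn_on_duplicate_names(names):
    """Hypothetical helper: warn, rather than raise, when level names repeat."""
    dupes = duplicate_level_names(names)
    if dupes:
        warnings.warn(
            f"Duplicate level names {dupes!r} found in MultiIndex",
            FutureWarning,  # assumed category, not taken from pandas
            stacklevel=2,
        )
    return dupes
```

For the `groupby.apply` examples above, the check would be applied to the level names of the result, e.g. `warn_on_duplicate_names(result.index.names)`, where the names would be `['a', 'a']`.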

Warn on duplicate names in MI? #19029
