BUG?: getitem with a MultiIndex returns a Series only when the lower level is ""

xref #46944 

```
import pandas as pd

# Lower level is a non-empty label: selecting "a" returns a DataFrame
columns1 = pd.MultiIndex.from_tuples([("a", "a2"), ("b", "c")])
df1 = pd.DataFrame([[1, 2]], columns=columns1)
print(df1["a"])
#    a2
# 0   1

# Lower level is "": selecting "a" returns a Series
columns2 = pd.MultiIndex.from_tuples([("a", ""), ("b", "c")])
df2 = pd.DataFrame([[1, 2]], columns=columns2)
print(df2["a"])
# 0    1
# Name: a, dtype: int64
```

The first case produces a DataFrame, whereas the second produces a Series. I don't think this is intentional. It causes `DataFrameGroupBy._selected_obj` and `DataFrameGroupBy._obj_with_exclusions` to differ, which can lead to erroneous results (#50804 is one example).

Currently, `df2.groupby("a")` is allowed whereas `df1.groupby("a")` raises. Returning a DataFrame in the second case would therefore resolve the groupby inconsistency as well.

One can do `df1.groupby(("a", "a2"))` successfully, so I don't think this change would make any operations impossible.
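
For anyone who wants to check the groupby behavior described above on their own pandas version, here is a minimal sketch. The helper `try_groupby` and the `report` dict are names made up for this illustration, not pandas API, and the exception is caught broadly because the exact exception type is not stated here and may vary between versions.

```python
import pandas as pd


def try_groupby(df, key):
    """Return "ok" if df.groupby(key).size() runs, else the exception class name.

    Hypothetical helper for this illustration only.
    """
    try:
        df.groupby(key).size()
    except Exception as exc:  # broad on purpose: the exception type may vary by version
        return type(exc).__name__
    return "ok"


# Same frames as in the example at the top of the issue
columns1 = pd.MultiIndex.from_tuples([("a", "a2"), ("b", "c")])
df1 = pd.DataFrame([[1, 2]], columns=columns1)
columns2 = pd.MultiIndex.from_tuples([("a", ""), ("b", "c")])
df2 = pd.DataFrame([[1, 2]], columns=columns2)

report = {
    "type of df1['a']": type(df1["a"]).__name__,
    "type of df2['a']": type(df2["a"]).__name__,
    "df1.groupby('a')": try_groupby(df1, "a"),
    "df2.groupby('a')": try_groupby(df2, "a"),
    "df1.groupby(('a', 'a2'))": try_groupby(df1, ("a", "a2")),
}

for label, outcome in report.items():
    print(f"{label}: {outcome}")
```

If the two `type of ...` rows differ, the getitem inconsistency is present in that version; the groupby rows show whether it carries over to grouping by the top-level key.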

cc @phofl for any thoughts
