Using .loc with MultiIndex containing np.nan unexpected behavior #43814

Open
@deponovo

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the master branch of pandas.

Reproducible Example

import pandas as pd
import numpy as np

df = pd.DataFrame(
    {
        "temp_playlist": [0, 0, 0, 0],
        "objId": ["o1", np.nan, "o1", np.nan],
        "x": [1, 2, 3, 4],
    }
)

agg_df = df.groupby(by=['temp_playlist', 'objId'], dropna=False)["x"].agg(list)
print(agg_df.loc[agg_df.index[-1]])  # raises KeyError: agg_df.index[-1] is (0, nan); expected [2, 4]

Issue Description

This issue follows up on the discussion in this Stack Overflow question.
It appears to be a bug; if it is instead the intended behavior, it should be documented.
As shown in the Reproducible Example, grouping the x data on the temp_playlist and objId columns produces a MultiIndex that contains the entry (0, nan). This entry is meaningful, and I expected to access its data with agg_df.loc[<index_entry>], as I can with any other entry taken from agg_df.index. That fails for the entry containing nan: agg_df.loc[agg_df.index[-1]] raises a KeyError. However, the lookup does work when the same entry is passed inside a list of index entries. The behavior is therefore at least inconsistent, if not an outright bug.
For more information, see the Stack Overflow question, especially this answer.
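Until label-based lookup handles this case, the nan group can be reached without passing nan to .loc at all. The sketch below is not from the linked discussion; it rebuilds agg_df from the Reproducible Example and shows two possible workarounds, positional access and a boolean mask over the index levels. The names last_group, playlist, obj_id and nan_rows are introduced here for illustration only.

```python
import numpy as np
import pandas as pd

# Same data as in the Reproducible Example.
df = pd.DataFrame(
    {
        "temp_playlist": [0, 0, 0, 0],
        "objId": ["o1", np.nan, "o1", np.nan],
        "x": [1, 2, 3, 4],
    }
)
agg_df = df.groupby(by=["temp_playlist", "objId"], dropna=False)["x"].agg(list)

# Workaround 1 (assumption: the nan group is the last row, as in the example).
# Positional access does not look up the label, so nan is never compared.
last_group = agg_df.iloc[-1]

# Workaround 2: build a boolean mask from the individual index levels and
# select the rows where temp_playlist == 0 and objId is missing.
playlist = agg_df.index.get_level_values("temp_playlist")
obj_id = agg_df.index.get_level_values("objId")
nan_rows = agg_df[(playlist == 0) & obj_id.isna()]

print(last_group)
print(nan_rows.iloc[0])
```

Both prints should show [2, 4] for this data. The mask variant does not depend on row order, so it is probably the safer choice when the position of the nan group is not known in advance.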

Expected Behavior

agg_df.loc[(0, np.nan)] should return [2, 4]

Installed Versions

python 3.8.5, pandas 1.3.1, numpy 1.20.3
