BUG: left join between df with a single index and df with a multiindex produces an inner join #34292

Open
@CuylenE

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample

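import pandas as pd

# Left frame: single-level index "i"
A = pd.DataFrame([1, 2], columns=["i"]).set_index(["i"])
# Right frame: two-level MultiIndex ("i", "ii")
B = pd.DataFrame([(1, 4), (3, 0), (1, 5)], columns=["i", "ii"]).set_index(["i", "ii"])
A.join(B, how="left")
# Empty DataFrame
# Columns: []
# Index: [(1, 4), (1, 5)]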

Problem description

Joining a DataFrame with a single-level index to a DataFrame with a MultiIndex always produces an inner join, regardless of the value of "how". In the example above, index value 2 from the left DataFrame is missing from the result, although a left join should keep it.

The same happens when joining only the indexes, without the DataFrames. The problem does not occur when both sides have a MultiIndex.
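For reference, a minimal sketch of the index-only case, reusing the frames from the code sample. It only reports which left keys are missing from the result instead of asserting a particular output, since that may differ between pandas versions. Reading the shared key from level 0 of the result is an assumption about how the joined index is laid out.

```python
import pandas as pd

A = pd.DataFrame([1, 2], columns=["i"]).set_index(["i"])
B = pd.DataFrame([(1, 4), (3, 0), (1, 5)], columns=["i", "ii"]).set_index(["i", "ii"])

# Join only the indexes: single-level Index (left) with MultiIndex (right).
joined = A.index.join(B.index, how="left")

# Assumption: level 0 of the joined index carries the shared key "i".
missing_keys = set(A.index) - set(joined.get_level_values(0))

print(joined)
print("left keys missing from the result:", missing_keys)
```

With a correct left join, the set of missing keys should be empty.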

The problem appears to be in the _join_level method of class Index, in pandas\core\indexes\base.py. The new codes it generates are either computed incorrectly or do not take the join method into account. That is as far as I have gotten.

Expected Output

Empty DataFrame
Columns: []
Index: [(1, 4), (1, 5), (2, nan)]
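
As a possible workaround, and only as a sketch rather than a fix for the join itself, the expected index can be obtained by going through merge on regular columns and rebuilding the MultiIndex afterwards. The variable name "expected" is just for illustration. Note that "ii" is upcast to float because of the NaN introduced for key 2.

```python
import pandas as pd

A = pd.DataFrame([1, 2], columns=["i"]).set_index(["i"])
B = pd.DataFrame([(1, 4), (3, 0), (1, 5)], columns=["i", "ii"]).set_index(["i", "ii"])

# Move the index levels back into columns, left-merge on the shared key,
# then rebuild the two-level index. Rows of A without a match in B are kept.
expected = (
    A.reset_index()
    .merge(B.reset_index(), on="i", how="left")
    .set_index(["i", "ii"])
)

print(expected.index.tolist())
```

The left key 2 is preserved here, paired with a missing value in level "ii".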

Output of pd.show_versions()

[paste the output of pd.show_versions() here leaving a blank line after the details tag]
