DEPR: Merging on different number of levels #34862

Closed
@dinya

Two DataFrames with MultiIndex columns that have a different number of levels (three in df1, two in df2).

import pandas as pd

df1_columns = pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1")])
df1 = pd.DataFrame([[1, 2], [10, 20]], columns=df1_columns)

df2_columns = pd.MultiIndex.from_tuples([("X0", "Y0"), ("X1", "Y1")])
df2 = pd.DataFrame([[1, 200], [10, 200]], columns=df2_columns)

Case 1

pd.merge(df1, df2, left_on=[("A0", "B0", "C0")], right_on=[("X0", "Y0")])

returns

   (A0, B0, C0)  (A1, B1, C1)  (X0, Y0)  (X1, Y1)
0             1             2         1       200
1            10            20        10       200

This is fine. The columns keep their structure from both DataFrames. They are flattened to tuples, but that can be fixed (for example, with something like this).
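As a minimal sketch of one way to turn such flat tuple columns back into a MultiIndex: pad the shorter tuples so that every column has the same number of levels. The helper name `pad_column_tuples` and the empty-string filler are assumptions made for illustration, not necessarily what the linked approach does. The frame is built directly rather than through `pd.merge` so that the snippet does not depend on how a given pandas version handles the mismatched merge.

```python
import pandas as pd


def pad_column_tuples(columns, fill=""):
    """Pad every tuple to the length of the longest one (hypothetical helper)."""
    width = max(len(col) for col in columns)
    return [tuple(col) + (fill,) * (width - len(col)) for col in columns]


# Flat tuple labels as they appear in the Case 1 result.
flat_columns = [("A0", "B0", "C0"), ("A1", "B1", "C1"), ("X0", "Y0"), ("X1", "Y1")]

# Same values as the Case 1 result, built by hand.
merged = pd.DataFrame([[1, 2, 1, 200], [10, 20, 10, 200]])

# Rebuild a regular three-level MultiIndex from the padded tuples.
merged.columns = pd.MultiIndex.from_tuples(pad_column_tuples(flat_columns))
```

The two-level labels end up with an empty string in the third level, so `merged[("X0", "Y0", "")]` selects the former `("X0", "Y0")` column.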

Case 2

pd.merge(df2, df1, left_on=[("X0", "Y0")], right_on=[("A0", "B0", "C0")])

returns

   X0   X1  A0  A1
   Y0   Y1  B0  B1
0   1  200   1   2
1  10  200  10  20

When the left DataFrame has fewer column levels than the right one, the merge result is questionable: the lowest level of the right DataFrame's columns (C0, C1) is dropped.

Is this a bug or a feature (that is, must the result have the same number of column levels as the left DataFrame)?
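One possible workaround, sketched here under assumptions and not proposed in the issue itself, is to give both frames the same number of column levels before merging, so that no level has to be dropped. The helper `pad_levels` and the empty-string filler are hypothetical choices.

```python
import pandas as pd

df1 = pd.DataFrame(
    [[1, 2], [10, 20]],
    columns=pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1")]),
)
df2 = pd.DataFrame(
    [[1, 200], [10, 200]],
    columns=pd.MultiIndex.from_tuples([("X0", "Y0"), ("X1", "Y1")]),
)


def pad_levels(df, nlevels, fill=""):
    """Return a copy of df whose column MultiIndex is padded to nlevels (hypothetical helper)."""
    missing = nlevels - df.columns.nlevels
    padded = [tuple(col) + (fill,) * missing for col in df.columns]
    out = df.copy()
    out.columns = pd.MultiIndex.from_tuples(padded)
    return out


# Pad the two-level frame to three levels, then merge as in Case 2.
df2_padded = pad_levels(df2, df1.columns.nlevels)
result = pd.merge(
    df2_padded,
    df1,
    left_on=[("X0", "Y0", "")],
    right_on=[("A0", "B0", "C0")],
)
```

With both sides at three levels, the C0 and C1 labels survive regardless of which frame is on the left.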

Notes:

  1. In both cases the same warning is emitted: UserWarning: merging between different levels can give an unintended result ....
  2. Checked with pandas 0.25.1 and 1.0.4.
  3. See notebook.
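Because the warning is identical in both cases, it does not indicate which labels will be lost. A small guard, sketched below with a hypothetical function name, makes the mismatch explicit before `pd.merge` is called.

```python
import pandas as pd


def check_same_column_levels(left, right):
    """Raise if the two frames differ in number of column levels (hypothetical guard)."""
    n_left = left.columns.nlevels
    n_right = right.columns.nlevels
    if n_left != n_right:
        raise ValueError(
            f"column levels differ: left has {n_left}, right has {n_right}"
        )


df1 = pd.DataFrame(
    [[1, 2], [10, 20]],
    columns=pd.MultiIndex.from_tuples([("A0", "B0", "C0"), ("A1", "B1", "C1")]),
)
df2 = pd.DataFrame(
    [[1, 200], [10, 200]],
    columns=pd.MultiIndex.from_tuples([("X0", "Y0"), ("X1", "Y1")]),
)

try:
    check_same_column_levels(df1, df2)
except ValueError as exc:
    print(exc)
```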

Labels: Deprecate (functionality to remove in pandas), Reshaping (concat, merge/join, stack/unstack, explode)