
Please add force_suffixes to pandas.merge() #17834

Open
@sorenwacker

Description


Code Sample, a copy-pastable example if possible

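import pandas as pd

A = pd.DataFrame({'colname': [1, 2]})
B = pd.DataFrame({'colname': [1, 2]})
C = pd.DataFrame({'colname': [1, 2]})

# First merge: 'colname' is in both frames, so both suffixes are applied.
D = pd.merge(A, B, right_index=True, left_index=True, suffixes=('_A', '_B'))

# Second merge: 'colname' is only in C now, so the '_C' suffix is not applied.
print(pd.merge(D, C, right_index=True, left_index=True, suffixes=('', '_C')))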

>   colname_A  colname_B  colname
> 0          1          1        1
> 1          2          2        2

Problem description

When using pandas.merge(), suffixes are only added to a column name if that name appears in both data frames. With more than two data frames, this leads to the situation above: the first merge adds suffixes, but when the resulting table is merged with a third table, the column name no longer appears in both frames, so no suffix is added and it has to be added manually. This is overhead, especially when merging larger numbers of data frames. A "force_suffixes" option that ensures the suffix is always added would be appreciated.
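For reference, the manual step described above can be sketched with DataFrame.add_suffix, which renames every column of the third frame before the merge. This is only a workaround under the assumption that the frames are joined on their index (as in the example); with a column-based join, add_suffix would also rename the key column. The variable name E is introduced here for illustration.

```python
import pandas as pd

A = pd.DataFrame({'colname': [1, 2]})
B = pd.DataFrame({'colname': [1, 2]})
C = pd.DataFrame({'colname': [1, 2]})

# First merge: 'colname' collides, so pandas applies both suffixes.
D = pd.merge(A, B, left_index=True, right_index=True, suffixes=('_A', '_B'))

# Second merge: rename C's columns up front, since 'colname' no longer
# collides with anything in D and merge() would leave it unsuffixed.
E = pd.merge(D, C.add_suffix('_C'), left_index=True, right_index=True)

print(E)
```

Because the rename happens before the merge, the suffixes argument is not needed in the second call.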

Expected Output

>   colname_A  colname_B  colname_C
> 0          1          1          1
> 1          2          2          2
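Until such an option exists, the expected output can be approximated for any number of frames with a small helper. The sketch below is not part of pandas: the function name merge_force_suffixes and its dict-based signature are hypothetical, and it assumes all frames are joined on their index with the default inner join.

```python
from functools import reduce

import pandas as pd


def merge_force_suffixes(frames):
    """Index-merge several frames, suffixing every column.

    `frames` maps a suffix (e.g. '_A') to a DataFrame. Hypothetical helper,
    not a pandas API.
    """
    # Suffix every column first so no name can collide during the merges.
    renamed = [df.add_suffix(suffix) for suffix, df in frames.items()]
    return reduce(
        lambda left, right: pd.merge(left, right, left_index=True, right_index=True),
        renamed,
    )


A = pd.DataFrame({'colname': [1, 2]})
B = pd.DataFrame({'colname': [1, 2]})
C = pd.DataFrame({'colname': [1, 2]})

result = merge_force_suffixes({'_A': A, '_B': B, '_C': C})
print(result)
```

Suffixing before merging keeps the helper independent of the order in which collisions happen, which is the behaviour the requested option would provide.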

Metadata

Labels: Enhancement, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
