Description
Problem description
This was going to be a request for an enhancement, but after investigation it just became a reminder that pandas has an undocumented feature.
About once every few months I end up in a tricky debugging situation where a column has apparently ceased to exist because merge suffixed it with _x or _y. I understand the use cases where suffixing is helpful, but in other cases it is unintended and harmful, and it can be hard to track down because the _x and _y columns may not matter until much later. I wanted to be able to tell merge to fail if it is about to add a suffix.
Turns out you can, but it is undocumented. Tracing the code, I eventually arrived at https://github.com/pandas-dev/pandas/blob/v0.23.3/pandas/core/internals.py#L5242
A quick check seems to indicate that merge(..., suffixes=(False, False)) causes that line to execute when non-key columns overlap, so the merge errors out, and that it has no effect when nothing overlaps. Which is nice, but I would encourage you to document it at:
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html
import pandas as pd
left = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                     'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                      'C': ['C0', 'C1', 'C2', 'C3'],
                      'B': ['D0', 'D1', 'D2', 'D3']})
right_2 = pd.DataFrame({'key': ['K0', 'K1', 'K2', 'K3'],
                        'C': ['C0', 'C1', 'C2', 'C3'],
                        'D': ['D0', 'D1', 'D2', 'D3']})
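# default suffixes: the overlapping column B comes back as B_x and B_y
result = pd.merge(left, right, on='key')
result
# should error (B is in both frames), and does
pd.merge(left, right, on='key', suffixes=(False, False))
# should not error (no overlapping non-key columns), and doesn't
pd.merge(left, right_2, on='key', suffixes=(False, False))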
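
Until this is documented, a workaround that does not depend on the undocumented code path is to check for overlapping columns explicitly before merging. The following is only a minimal sketch, assuming the join uses `on=` with the same key names on both sides (it does not handle `left_on`/`right_on` or index joins). `overlapping_columns` and `strict_merge` are hypothetical helper names, not part of pandas.

```python
def overlapping_columns(left_columns, right_columns, on=()):
    """Return labels present on both sides that are not join keys.

    Accepts any iterables of hashable column labels, e.g. df.columns.
    """
    # Allow a single key given as a string, like merge(on='key').
    keys = {on} if isinstance(on, str) else set(on)
    right_set = set(right_columns)
    # Keep the left-hand column order so the error message is readable.
    return [c for c in left_columns if c in right_set and c not in keys]


def strict_merge(left, right, on, **kwargs):
    """Merge two DataFrames on shared keys, refusing to add suffixes.

    Hypothetical helper: `left` and `right` are assumed to be pandas
    DataFrames; extra keyword arguments are passed to DataFrame.merge.
    """
    clash = overlapping_columns(left.columns, right.columns, on)
    if clash:
        raise ValueError(f"merge would suffix overlapping columns: {clash}")
    return left.merge(right, on=on, **kwargs)
```

With the frames from this issue, `strict_merge(left, right, on='key')` would raise because `B` is in both frames, while `strict_merge(left, right_2, on='key')` would go through to the normal merge. The explicit check also names the offending columns, which is the part I usually need when debugging.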