The join_axes kwarg of pd.concat is not very clearly documented (it took me several tries to get it to work), and its name is not very clear either: it is actually about restricting the axes that are not being concatenated (i.e. the ones that would normally be 'outer'-joined).
In particular, it is basically irrelevant with the deprecation of Panel, since there are no more axes (plural), only one non-concatenation axis (singular).
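To make "the non-concatenation axis" concrete, here is a minimal sketch of how the existing join kwarg already controls it when concatenating along the columns. The frame names and the variables outer and inner are only for illustration.

```python
import pandas as pd

# Two small frames whose row indexes only partly overlap.
one = pd.DataFrame([[0, 1], [2, 3]], columns=list('ab'))                     # index [0, 1]
two = pd.DataFrame([[10, 11], [12, 13]], index=[1, 2], columns=list('bc'))  # index [1, 2]

# Concatenating along axis=1 leaves the row index as the only
# non-concatenation axis; `join` decides how that index is combined.
outer = pd.concat([one, two], axis=1, join='outer')  # union of the row indexes
inner = pd.concat([one, two], axis=1, join='inner')  # intersection of the row indexes

print(outer.index.tolist())  # [0, 1, 2]
print(inner.index.tolist())  # [1]
```

Anything other than the union or the intersection (for example a 'right'-style join on two.index) is what join_axes was being used for.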
Finally, with reindex and reindex_like, it is redundant as well:
import pandas as pd

one = pd.DataFrame([[0, 1], [2, 3]], columns=list('ab'))
two = pd.DataFrame([[10, 11], [12, 13]], index=[1, 2], columns=list('bc'))

## simulating a 'right'-join for the non-concatenation axis
pd.concat([one, two], join='outer', axis=1, join_axes=two.index)  # cryptic error message!
# AssertionError: length of join_axes must not be equal to 1

## only works with list-like join_axes
pd.concat([one, two], join='outer', axis=1, join_axes=[two.index])
#      a    b   b   c
# 1  2.0  3.0  10  11
# 2  NaN  NaN  12  13

## cleaner with reindex?
pd.concat([one, two], join='outer', axis=1).reindex(two.index)
#      a    b     b     c
# 1  2.0  3.0  10.0  11.0
# 2  NaN  NaN  12.0  13.0
Note that the dtype changes because the intermediate object has NaNs in the rows that are later dropped, but this will be fixed by #21160 anyway. The only open question is whether performance would be much worse when concatenating huge Series/DataFrames before selecting a small subset of the index.
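One possible way around the performance concern, sketched here and not benchmarked, is to reindex each input to the target index before concatenating, so the large outer-joined intermediate is never built. This assumes every input should be aligned to the same index (two.index here); the names target, early and late are only for illustration.

```python
import pandas as pd

one = pd.DataFrame([[0, 1], [2, 3]], columns=list('ab'))
two = pd.DataFrame([[10, 11], [12, 13]], index=[1, 2], columns=list('bc'))

target = two.index  # the index we want on the non-concatenation axis

# Reindex first, then concatenate: each piece is already restricted to `target`.
early = pd.concat([df.reindex(target) for df in (one, two)], axis=1)

# Concatenate first, then reindex: builds the full outer join before subsetting.
late = pd.concat([one, two], join='outer', axis=1).reindex(target)

print(early)
print(late)
```

Both results have the same index and the same values. As a side effect, reindexing first leaves the columns of two as integers, because no NaNs are introduced into them along the way.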