Description
The .str.cat method is the only one in the .str accessor that takes another Series as an argument and, as such, is a bit of a special case (e.g. it had no index alignment until v0.23).
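To illustrate what index alignment means here, a minimal sketch, assuming pandas v0.23 or later (where the `join` keyword exists). The Series `s` and `t` are made up for this example.

```python
import pandas as pd

# Two Series with the same labels in a different order (illustrative data)
s = pd.Series(['a', 'b'], index=[0, 1])
t = pd.Series(['y', 'x'], index=[1, 0])

# With join='left', `t` is aligned on the index of `s` before concatenating,
# so label 0 pairs 'a' with 'x' and label 1 pairs 'b' with 'y'.
aligned = s.str.cat(t, join='left')

# Passing the raw values instead skips alignment and concatenates by position.
positional = s.str.cat(t.values)
```

Here `aligned` pairs values by label, while `positional` pairs them in order of appearance.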
It makes sense to support lists of objects which get concatenated sequentially, and lists of lists have been supported since at least v0.17; see https://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.str.cat.html
When I wrote #20347, I tried very hard to keep the signature backwards-compatible and to keep the example from the v0.17-22 docs working:
>>> Series(['a', 'b']).str.cat([['x', 'y'], ['1', '2']], sep=',')
0 a,x,1
1 b,y,2
dtype: object
However, this added a lot of complexity, and I think it should be simplified, especially in light of @TomAugspurger's comment in #21894:
As a reminder, the plan is to have no new deprecations in 0.25.x and 1.0.0. So this [v0.24] is the last round of deprecations before 1.0.
My suggestion is to modify the allowed combinations (as of v0.23) as follows:
Type of "others"                        | action | comment
---------------------------------------------------------------------------------
list-like of strings                    | keep   | as before; mimics behavior elsewhere,
                                        |        | cf.: pd.Series(range(3)) + [2,4,6]
Series                                  | keep   |
np.ndarray (1-dim)                      | keep   |
DataFrame                               | keep   | sequential concatenation
np.ndarray (2-dim)                      | keep   | sequential concatenation
list-like of                            |        |
  Series/Index/np.ndarray (1-dim)       | keep   | sequential concatenation
list-like containing list-likes (1-dim) |        |
  other than Series/Index/np.ndarray    | DEPR   | sequential concatenation
In other words, if the user wants sequential concatenation, there are many possibilities available, and list-of-lists does not have to be one of them, IMO. This would substantially simplify (post-deprecation) the code for str.cat._get_series_list, which is currently a bit complicated: https://github.com/pandas-dev/pandas/blob/v0.23.3/pandas/core/strings.py#L2089
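As a rough sketch of the alternatives that would remain, the docs example above can be reproduced without a list of lists. The variable names are illustrative only, and all inputs share the default index, so alignment plays no role.

```python
import numpy as np
import pandas as pd

s = pd.Series(['a', 'b'])

# list-like of 1-dim np.ndarray: one array per "column" to append
from_arrays = s.str.cat([np.array(['x', 'y']), np.array(['1', '2'])], sep=',')

# list-like of Series
from_series = s.str.cat([pd.Series(['x', 'y']), pd.Series(['1', '2'])], sep=',')

# DataFrame: columns are concatenated sequentially, left to right
from_frame = s.str.cat(pd.DataFrame({0: ['x', 'y'], 1: ['1', '2']}), sep=',')

# 2-dim np.ndarray: treated column by column, like the DataFrame
from_2d = s.str.cat(np.array([['x', '1'], ['y', '2']], dtype=object), sep=',')
```

All four calls should give the same values as the list-of-lists example, i.e. `a,x,1` and `b,y,2`, so a user affected by the deprecation could wrap the inner lists in `np.array` or `pd.Series`.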
Finally, for completeness, the example from the v0.17-22 docs has been removed for v0.23, but there are two examples in https://pandas.pydata.org/pandas-docs/stable/text.html#concatenating-a-series-and-many-objects-into-a-series that would fall under the deprecation I'm suggesting.