DOC: update str.cat example #23723

Merged
merged 11 commits into from
Jan 4, 2019
12 changes: 6 additions & 6 deletions doc/source/text.rst
@@ -303,23 +303,23 @@ The same alignment can be used when ``others`` is a ``DataFrame``:
Concatenating a Series and many objects into a Series
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-All one-dimensional list-likes can be combined in a list-like container (including iterators, ``dict``-views, etc.):
+Several items can be combined a list-like container (including iterators, ``dict``-views, etc.), which may contain ``Series``, ``Index``, ``PandasArray`` and ``np.ndarray``.
Member

Think it was a typo to remove the word in here

Contributor Author

fixed, thanks

Member

Also some nits around verbiage, but I think it would be easier to keep the Series, Index, etc. mentions closer to "Several items"; as is, I had to read it a few times to truly understand what was meant after the word "which".

Member

Maybe even just `Several items (ex: Series, Index, ...)`

Contributor Author

Done. I'm making an explicit list though, as only those types are allowed within list-likes


.. ipython:: python

     s
     u
-    s.str.cat([u.array,
-               u.index.astype(str).array], na_rep='-')
+    s.str.cat([u, u.array, u.to_numpy()], join='left')

-All elements must match in length to the calling ``Series`` (or ``Index``), except those having an index if ``join`` is not None:
+All elements without an index (e.g. ``PandasArray`` and ``np.ndarray``) within the passed list-like must match in length to the calling ``Series`` (or ``Index``),
Member

Might have missed this but what's the reason for bringing up PandasArray? Not really something the end user would be using directly (at least in current form)

Contributor Author

PandasArray is very user-facing:

>>> s = pd.Series(['a', 'b', 'c', 'd'])
>>> s.array
<PandasArray>
['a', 'b', 'c', 'd']
Length: 4, dtype: object

and the current example was recently changed to use .array instead of .values. I think this should be documented clearly.
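
To make the practical effect of `.array` in this example concrete, here is a minimal sketch with illustrative data (the values and index below are assumptions, not taken from the PR). It shows that an index-less array is paired with the caller by position, while a `Series` passed with an explicit `join` is aligned on its index first.

```python
import pandas as pd

s = pd.Series(["a", "b", "c", "d"])
# Illustrative data: the same letters as `s`, shuffled, with a shuffled index.
u = pd.Series(["b", "d", "a", "c"], index=[1, 3, 0, 2])

# With its index intact, `u` is aligned to the index of `s` before concatenation.
aligned = s.str.cat(u, join="left")

# `.array` carries no index, so the values are paired purely by position.
positional = s.str.cat(u.array)

print(aligned.tolist())     # ['aa', 'bb', 'cc', 'dd']
print(positional.tolist())  # ['ab', 'bd', 'ca', 'dc']
```

The same positional pairing applies to `u.to_numpy()`; the two differ in the container type they return, not in how `str.cat` lines the values up.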

Member

OK thanks. I have been somewhat on the sidelines for that conversation so I'll defer to @jreback specifically on this piece

Contributor

You don't need to mention PandasArray here; it's not very interesting, nor relevant. Just say array-likes, and remove u.array; u.to_numpy() is the correct idiom here.

Contributor Author

@jreback
There is a big difference between u.array and u.to_numpy():

>>> s = pd.Series(['a', 'b', 'c', 'd'])
>>> s.array
<PandasArray>
['a', 'b', 'c', 'd']
Length: 4, dtype: object
>>> s.to_numpy()
array(['a', 'b', 'c', 'd'], dtype=object)

I'm guessing .array will eventually replace .values-usage (e.g. to get rid of the index for .str.cat), since it is by design better suited for pandas-internal dtypes, and so the distinction above is not just an irrelevant detail IMO.

I want to show here the explicitly allowed item types to pass into a list-like, which have to pass:

nxt = others.pop(0)
[...]
if not (isinstance(nxt, (Series, Index))
        or (isinstance(nxt, np.ndarray) and nxt.ndim == 1)):
    raise ValueError(...)  # currently just a DeprecationWarning

Long story short, I want to show a list-like containing an np.ndarray, a PandasArray (to reiterate, this example was already changed by @TomAugspurger to use .array instead of .values in #23623), and a Series. (Including Index would be nice-to-have, but too complicated absent #22225).
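
For readers who want to try the quoted condition in isolation, the following is a rough sketch. `is_allowed_item` is a hypothetical helper written for this illustration, not a pandas function; it only evaluates the `isinstance` condition from the snippet above literally and says nothing about how any particular pandas release treats each type in practice.

```python
import numpy as np
import pandas as pd


def is_allowed_item(item):
    """Hypothetical helper mirroring the condition quoted above (not pandas API)."""
    return isinstance(item, (pd.Series, pd.Index)) or (
        isinstance(item, np.ndarray) and item.ndim == 1
    )


s = pd.Series(["a", "b", "c", "d"])

# Evaluate the condition for each candidate item type.
checks = {
    "Series": is_allowed_item(s),
    "Index": is_allowed_item(s.index),
    "1-D ndarray (s.to_numpy())": is_allowed_item(s.to_numpy()),
    "2-D ndarray": is_allowed_item(np.zeros((2, 2))),
    "s.array": is_allowed_item(s.array),
}
for name, ok in checks.items():
    print(f"{name}: {ok}")
```

Under this literal reading, `s.array` does not satisfy the condition because it is neither a `Series`, an `Index`, nor an `np.ndarray`.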

Contributor

@h-vetinari .to_numpy() replaces .values; .array is user-accessible, but is generally not visible to the user. It's not necessary here and is just noise.

Contributor Author

@h-vetinari h-vetinari Jan 3, 2019

@jreback

> .to_numpy() replaces .values;

As I did in the last few commits...

> .array is user-accessible, but is generally not visible to the user. It's not necessary here and is just noise.

Will remove, but pinging @TomAugspurger since he added the current u.array in this example in #23623 (although likely "just" for replacing u.values?).

+but ``Series`` and ``Index`` may have arbitrary length (as long as alignment is not disabled with ``join=None``):

.. ipython:: python

     v
-    s.str.cat([u, v], join='outer', na_rep='-')
+    s.str.cat([v, u, u.to_numpy()], join='outer', na_rep='-')
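
The length rule stated in this hunk can be checked with a small standalone sketch. The data below is illustrative and assumed for this example (the definitions of `s` and `v` fall outside the visible hunk): a `Series` of a different length is aligned on its index, whereas an `np.ndarray` of the wrong length is rejected.

```python
import numpy as np
import pandas as pd

s = pd.Series(["a", "b", "c", "d"])
# Illustrative data: `v` is longer than `s` and only partly shares its index.
v = pd.Series(["z", "a", "b", "d", "e"], index=[-1, 0, 1, 3, 4])

# A Series of a different length is fine: it is aligned on its index,
# and labels missing from `v` are filled with `na_rep`.
aligned = s.str.cat([v], join="left", na_rep="-")
print(aligned.tolist())  # ['aa', 'bb', 'c-', 'dd']

# An ndarray has no index to align on, so a length mismatch is an error.
error = None
try:
    s.str.cat([np.array(["x", "y", "z"])], join="left")
except ValueError as exc:
    error = exc
print(type(error).__name__)  # ValueError
```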

-If using ``join='right'`` on a list of ``others`` that contains different indexes,
+If using ``join='right'`` on a list-like of ``others`` that contains different indexes,
the union of these indexes will be used as the basis for the final concatenation:

.. ipython:: python