any/all reductions on boolean object-typed Series #27709

Closed
@xhochy

Description

While implementing a boolean ExtensionArray, I stumbled on the fact that boolean arrays with missing values (which can only be object-typed in pandas) have effectively undefined behaviour in pandas reductions with skipna=False.

Each of the following calls should return True according to the docstring of Series.any(skipna=False), but the result instead depends on the kind of missing value and on element order:

pd.Series([False, None]).any(skipna=False)
# None
pd.Series([None, False]).any(skipna=False)
# False
pd.Series([False, np.nan]).any(skipna=False)
# nan
pd.Series([np.nan, False]).any(skipna=False)
# nan
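The four results above are what you would get from chaining Python's `or` over the elements, since `or` returns one of its operands rather than a strict bool. The sketch below reproduces them in pure Python; it assumes this is the mechanism behind the object-dtype reduction, which I have not verified in the pandas source, and `or_reduce` is a hypothetical helper name used only for illustration.

```python
from functools import reduce
import math


def or_reduce(values):
    """Chain Python's `or` over the values, left to right.

    Hypothetical stand-in for an object-dtype `any` reduction;
    not taken from the pandas code base.
    """
    return reduce(lambda a, b: a or b, values)


nan = float("nan")

r1 = or_reduce([False, None])  # `False or None` yields the right operand
r2 = or_reduce([None, False])  # `None or False` yields the right operand
r3 = or_reduce([False, nan])   # nan is truthy, so it is the result
r4 = or_reduce([nan, False])   # nan is truthy, so `or` short-circuits on it

print(r1, r2, r3, r4)
```

If this assumption holds, the order dependence between `[False, None]` and `[None, False]` follows directly: both values are falsy, so `or` returns whichever comes last.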

Whereas when you do the same operation on float columns the behaviour is as documented:

pd.Series([np.nan, 0.]).any(skipna=False)
# True
pd.Series([0, np.nan]).any(skipna=False)
# True

As I have not found a unit test for the above-mentioned case with a boolean object column, I suspect that this is undefined behaviour rather than intended.

Three solutions come to my mind:

  1. Document this behaviour in the Series.any() docstring.
  2. Align the behaviour of pd.Series(booleans, dtype=object).any(…) with pd.Series(booleans, dtype=object).astype(float).any(…).
  3. Raise an error when calling any/all on a mixed-type boolean series.
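To make option 2 concrete, here is a minimal sketch of what aligning with the float behaviour would produce. The helper `to_float_with_nan` is hypothetical and only for illustration; it converts each element explicitly instead of relying on `astype(float)`, so the handling of `None` is spelled out. On a float Series, NaN counts as truthy when skipna=False.

```python
import numpy as np
import pandas as pd


def to_float_with_nan(values):
    """Map booleans to 0.0/1.0 and any missing value (None, NaN) to NaN.

    Hypothetical helper, not part of pandas.
    """
    return pd.Series(
        [np.nan if pd.isna(v) else float(v) for v in values],
        dtype=float,
    )


# With skipna=False the NaN is not dropped and is treated as truthy.
any_result = bool(to_float_with_nan([False, None]).any(skipna=False))
all_result = bool(to_float_with_nan([False, None]).all(skipna=False))
all_true_result = bool(to_float_with_nan([True, np.nan]).all(skipna=False))

print(any_result, all_result, all_true_result)
```

Under this option the result no longer depends on element order or on whether the missing value is None or np.nan, and it is always a plain boolean.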
