
ENH: should we support np.allclose for ExtensionArrays? #37915

Open
@arw2019


xref https://github.com/pandas-dev/pandas/pull/33435/files#r406534850

I'm looking into picking up #33435. The current problem is that np.allclose raises a TypeError when called on an ExtensionArray (EA):

In [1]: import numpy as np
   ...: import pandas as pd
   ...: 
   ...: A = pd.array([1, 2], dtype='Int64')
   ...: B = pd.array([1, 2], dtype='Int64')
   ...: np.allclose(A, B)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-1-01ae8e8fc321> in <module>
      4 A = pd.array([1, 2], dtype='Int64')
      5 B = pd.array([1, 2], dtype='Int64')
----> 6 np.allclose(A, B)

<__array_function__ internals> in allclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol, equal_nan)
   2187 
   2188     """
-> 2189     res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
   2190     return bool(res)
   2191 

<__array_function__ internals> in isclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in isclose(a, b, rtol, atol, equal_nan)
   2285     y = array(y, dtype=dt, copy=False, subok=True)
   2286 
-> 2287     xfin = isfinite(x)
   2288     yfin = isfinite(y)
   2289     if all(xfin) and all(yfin):

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

The error message says that the isfinite ufunc is not supported for the input types. The root cause is that np.isclose calls np.asanyarray on its inputs, so converting the inputs explicitly first triggers the same error:

In [14]: a = np.asanyarray(A)
    ...: b = np.asanyarray(B)
    ...: np.allclose(a, b)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-14-9c8d10e21030> in <module>
      1 a = np.asanyarray(A)
      2 b = np.asanyarray(B)
----> 3 np.allclose(a, b)

<__array_function__ internals> in allclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in allclose(a, b, rtol, atol, equal_nan)
   2187 
   2188     """
-> 2189     res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
   2190     return bool(res)
   2191 

<__array_function__ internals> in isclose(*args, **kwargs)

~/anaconda3/envs/pandas-dev/lib/python3.8/site-packages/numpy/core/numeric.py in isclose(a, b, rtol, atol, equal_nan)
   2285     y = array(y, dtype=dt, copy=False, subok=True)
   2286 
-> 2287     xfin = isfinite(x)
   2288     yfin = isfinite(y)
   2289     if all(xfin) and all(yfin):

TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according

This happens because np.asanyarray, when called on an EA input, returns an object-dtype array:

In [16]: A = pd.array([1, 2], dtype='Int64')
    ...: np.asanyarray(A)
Out[16]: array([1, 2], dtype=object)
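
To isolate the failure from pandas, here is a minimal sketch that builds an object-dtype array by hand (rather than via np.asanyarray on an EA) and checks that np.isfinite rejects it. The variable names are illustrative only.

```python
import numpy as np

# Build an object-dtype array directly, mimicking what np.asanyarray
# returned for the Int64 EA in the session above.
obj = np.array([1, 2], dtype=object)

try:
    np.isfinite(obj)
    raised = False
except TypeError:
    # isfinite has no loop for object dtype, so NumPy raises TypeError.
    raised = True

# Casting to a numeric dtype first avoids the problem.
finite_after_cast = bool(np.isfinite(obj.astype("float64")).all())

print(raised, finite_after_cast)
```

This suggests the problem lies in the object dtype of the converted array rather than in np.allclose itself.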

As far as #33435 is concerned, one solution is to cast to a NumPy array with a numeric dtype before calling np.allclose. Would we, however, want to make np.allclose work directly on the integer/floating EAs?
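
The casting workaround could look roughly like the sketch below. The helper name allclose_masked is hypothetical and not part of pandas. It assumes masked integer/floating EAs, converts them with to_numpy(dtype="float64", na_value=np.nan), and treats missing values at matching positions as equal via equal_nan=True. That last choice is an assumption for illustration, not something settled in this issue.

```python
import numpy as np
import pandas as pd


def allclose_masked(left, right, rtol=1e-05, atol=1e-08):
    """Hypothetical helper: compare two masked EAs via float64 NumPy arrays."""
    # Missing values become NaN so the result has a plain float64 dtype
    # instead of object dtype.
    left_np = left.to_numpy(dtype="float64", na_value=np.nan)
    right_np = right.to_numpy(dtype="float64", na_value=np.nan)
    # equal_nan=True: NaN (i.e. missing) in the same position counts as equal.
    return bool(
        np.allclose(left_np, right_np, rtol=rtol, atol=atol, equal_nan=True)
    )


A = pd.array([1, 2], dtype="Int64")
B = pd.array([1, 2], dtype="Int64")
C = pd.array([1, None], dtype="Int64")
D = pd.array([1, None], dtype="Int64")

print(allclose_masked(A, B))
print(allclose_masked(C, D))
print(allclose_masked(A, C))
```

One caveat with this approach: converting Int64 to float64 loses precision for integers larger than 2**53, so very large values that differ slightly may compare as close.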


Labels: Compat, Enhancement, ExtensionArray, Needs Discussion, ufuncs
