Change in wrapped ufunc handling between 1.15 -> 1.16 #12997

Open
@jorisvandenbossche

Reproducing code example:

Consider this example, where I create a pandas TimedeltaIndex (an "array-like") and call np.sum on it, which correctly sums the timedelta64[ns] data and returns a scalar:

In [1]: idx = pd.TimedeltaIndex(np.arange(10)*1e9) 

In [2]: idx
Out[2]: 
TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04',
                '00:00:05', '00:00:06', '00:00:07', '00:00:08', '00:00:09'],
               dtype='timedelta64[ns]', freq=None)

In [3]: np.sum(idx) 
Out[3]: numpy.timedelta64(45000000000,'ns')

In [4]: pd.__version__   
Out[4]: '0.23.4'

In [5]: np.__version__
Out[5]: '1.15.4'

The above is with numpy 1.15, but starting from 1.16 (and on master as well) the same code now gives an error (using the same pandas version):

In [1]: idx = pd.TimedeltaIndex(np.arange(10)*1e9)   

In [2]: np.sum(idx)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-4137fa3a65d6> in <module>
----> 1 np.sum(idx)

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py in sum(a, axis, dtype, out, keepdims, initial)
   2074 
   2075     return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
-> 2076                           initial=initial)
   2077 
   2078 

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
     84                 return reduction(axis=axis, out=out, **passkwargs)
     85 
---> 86     return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
     87 
     88 

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __array_wrap__(self, result, context)
    658         attrs = self._get_attributes_dict()
    659         attrs = self._maybe_update_attributes(attrs)
--> 660         return Index(result, **attrs)
    661 
    662     @cache_readonly

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    301                   (dtype is not None and is_timedelta64_dtype(dtype))):
    302                 from pandas.core.indexes.timedeltas import TimedeltaIndex
--> 303                 result = TimedeltaIndex(data, copy=copy, name=name, **kwargs)
    304                 if dtype is not None and _o_dtype == dtype:
    305                     return Index(result.to_pytimedelta(), dtype=_o_dtype)

~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/timedeltas.py in __new__(cls, data, unit, freq, start, end, periods, closed, dtype, copy, name, verify_integrity)
    250 
    251         # check that we are matching freqs
--> 252         if verify_integrity and len(data) > 0:
    253             if freq is not None and not freq_infer:
    254                 index = cls._simple_new(data, name=name)

TypeError: len() of unsized object

In [3]: np.__version__      
Out[3]: '1.16.1'

In [4]: pd.__version__                                     
Out[4]: '0.23.4'

The error itself comes from passing a 0-d array to the TimedeltaIndex constructor. However, something seems to have changed in how numpy handles this error: I was using the same pandas version in both cases, so the constructor will fail under the hood with numpy 1.15 as well.

We can rather easily work around this in pandas (by checking whether the result is a 0-dim array or scalar and, if so, not passing it to the class constructor in __array_wrap__; see pandas-dev/pandas#25329), but I am reporting it here to check whether this is an intentional change or an uncaught regression.
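
To make the workaround concrete, here is a minimal sketch of the idea. It is not the actual pandas patch: ToyIndex is a hypothetical stand-in for an Index-like container, and __array_wrap__ is called directly so the result does not depend on which numpy version does the wrapping.

```python
import numpy as np


class ToyIndex:
    """Hypothetical 1-d container, used only for illustration (not pandas code)."""

    def __init__(self, data):
        data = np.asarray(data)
        if data.ndim != 1:
            # Mirrors the situation in the traceback: a 0-d input is rejected.
            raise TypeError("ToyIndex requires 1-dimensional data")
        self._data = data

    def __len__(self):
        return len(self._data)

    def __array__(self, dtype=None, copy=None):
        return np.asarray(self._data, dtype=dtype)

    def __array_wrap__(self, result, context=None, return_scalar=False):
        result = np.asarray(result)
        if result.ndim == 0:
            # A reduction produced a 0-dim result: hand back the scalar
            # instead of passing it to the class constructor.
            return result[()]
        return type(self)(result)


idx = ToyIndex(np.arange(10).astype("timedelta64[s]"))

# Simulate what a wrapping step would receive after a reduction ...
total = idx.__array_wrap__(np.asarray(idx).sum())
# ... and after an elementwise operation.
doubled = idx.__array_wrap__(np.asarray(idx) * 2)
```

With this guard, the reduction result comes back as a numpy.timedelta64 scalar, while elementwise results are still re-wrapped in the container.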

Some more information on this specific case: TimedeltaIndex has no sum method implemented, so np.sum does not dispatch directly to such a method (in contrast to e.g. Series, which has a sum method). That means that np.sum goes through __array__ and __array_wrap__.
(Note: I suppose that in the future we should also fix this by adding an __array_ufunc__.)
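
The contrast with objects that do define sum can be illustrated with a small sketch. WithSum is a hypothetical class, not pandas code; it only shows that np.sum hands the reduction to the object's own sum method when one exists, so the result is never routed through __array_wrap__.

```python
import numpy as np


class WithSum:
    """Hypothetical array-like that defines its own ``sum`` method."""

    def __init__(self, data):
        self._data = np.asarray(data)

    def __array__(self, dtype=None, copy=None):
        return np.asarray(self._data, dtype=dtype)

    def sum(self, axis=None, dtype=None, out=None, **kwargs):
        # np.sum calls this method directly; the tag in the return value
        # makes it visible that this code path was taken.
        return ("WithSum.sum", int(self._data.sum()))


result = np.sum(WithSum([1, 2, 3]))
print(result)
```

An object without a sum method, like TimedeltaIndex in pandas 0.23.4, instead falls through to the ufunc.reduce call shown in the traceback above.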

Numpy/Python version information:

Both sessions ran in the same environment with Python 3.7:

In [9]: sys.version
Out[9]: '3.7.1 | packaged by conda-forge | (default, Feb 18 2019, 01:42:00) \n[GCC 7.3.0]'
