Description
Reproducing code example:
Consider this example, where I create a pandas TimedeltaIndex (an "array-like") and call np.sum
on it, which correctly sums the timedelta64[ns] data and returns a scalar:
In [1]: idx = pd.TimedeltaIndex(np.arange(10)*1e9)
In [2]: idx
Out[2]:
TimedeltaIndex(['00:00:00', '00:00:01', '00:00:02', '00:00:03', '00:00:04',
'00:00:05', '00:00:06', '00:00:07', '00:00:08', '00:00:09'],
dtype='timedelta64[ns]', freq=None)
In [3]: np.sum(idx)
Out[3]: numpy.timedelta64(45000000000,'ns')
In [4]: pd.__version__
Out[4]: '0.23.4'
In [5]: np.__version__
Out[5]: '1.15.4'
The above is with numpy 1.15, but starting from 1.16 (and on master as well) the same code now gives an error (using the same pandas version):
In [1]: idx = pd.TimedeltaIndex(np.arange(10)*1e9)
In [2]: np.sum(idx)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-2-4137fa3a65d6> in <module>
----> 1 np.sum(idx)
~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py in sum(a, axis, dtype, out, keepdims, initial)
2074
2075 return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
-> 2076 initial=initial)
2077
2078
~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapreduction(obj, ufunc, method, axis, dtype, out, **kwargs)
84 return reduction(axis=axis, out=out, **passkwargs)
85
---> 86 return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
87
88
~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __array_wrap__(self, result, context)
658 attrs = self._get_attributes_dict()
659 attrs = self._maybe_update_attributes(attrs)
--> 660 return Index(result, **attrs)
661
662 @cache_readonly
~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
301 (dtype is not None and is_timedelta64_dtype(dtype))):
302 from pandas.core.indexes.timedeltas import TimedeltaIndex
--> 303 result = TimedeltaIndex(data, copy=copy, name=name, **kwargs)
304 if dtype is not None and _o_dtype == dtype:
305 return Index(result.to_pytimedelta(), dtype=_o_dtype)
~/miniconda3/envs/numpy-dev/lib/python3.7/site-packages/pandas/core/indexes/timedeltas.py in __new__(cls, data, unit, freq, start, end, periods, closed, dtype, copy, name, verify_integrity)
250
251 # check that we are matching freqs
--> 252 if verify_integrity and len(data) > 0:
253 if freq is not None and not freq_infer:
254 index = cls._simple_new(data, name=name)
TypeError: len() of unsized object
In [3]: np.__version__
Out[3]: '1.16.1'
In [4]: pd.__version__
Out[4]: '0.23.4'
The error itself comes from passing a 0-d array to the TimedeltaIndex constructor. Since I was using the same pandas version in both cases, that constructor error must be raised under the hood with numpy 1.15 as well, so it seems that something changed in how numpy handles it.
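The len() failure at the bottom of the traceback can be reproduced with numpy alone. This is a minimal sketch, assuming the reduction result reaches the constructor as a 0-d array; the variable names are illustrative and not taken from pandas:

```python
import numpy as np

# A 0-d array holding a single timedelta64 value, standing in for the
# reduction result that ends up in the TimedeltaIndex constructor.
zero_dim = np.asarray(np.timedelta64(45, "s"))

print(zero_dim.ndim)  # a 0-d array has no dimensions, hence no length

try:
    len(zero_dim)
except TypeError as exc:
    # Same kind of failure as the `len(data) > 0` check in the traceback.
    message = str(exc)
    print(message)
```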
We can rather easily work around this in pandas by checking in __array_wrap__ whether the result is a 0-dim array or scalar, and in that case not passing it to the class constructor (see pandas-dev/pandas#25329). I am reporting it here to check whether this is an intentional change or rather an uncaught regression.
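The workaround idea can be sketched roughly as follows. SketchIndex is a hypothetical stand-in class, not the actual pandas code or the content of the linked pull request, and __array_wrap__ is called by hand here so the example does not depend on which numpy version is installed:

```python
import numpy as np


class SketchIndex:
    """Hypothetical 1-d container used only for illustration."""

    def __init__(self, data):
        data = np.asarray(data)
        if data.ndim != 1:
            raise TypeError("SketchIndex requires 1-d data")
        self.data = data

    def __array_wrap__(self, result, context=None, return_scalar=False):
        result = np.asarray(result)
        if result.ndim == 0:
            # A reduction produced a 0-d result: hand back the scalar
            # instead of passing it to the class constructor.
            return result[()]
        return SketchIndex(result)


idx = SketchIndex(np.arange(10).astype("timedelta64[s]"))

# Call the hook directly with a 0-d result and with a 1-d result.
total = idx.__array_wrap__(np.asarray(idx.data.sum()))
doubled = idx.__array_wrap__(idx.data * 2)

print(total)          # a timedelta64 scalar, not a SketchIndex
print(type(doubled))  # 1-d results are still re-wrapped
```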
Some more information on this specific case: TimedeltaIndex has no sum method implemented, so np.sum does not directly dispatch to such a method (in contrast to e.g. Series, which does have a sum method). That means that np.sum goes through __array__ and __array_wrap__.
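The two dispatch paths can be illustrated with two hypothetical array-likes (WithSum and WithoutSum are made-up names for this sketch). It only shows which path np.sum takes; whether and how __array_wrap__ is then applied to the reduction result is the version-dependent part discussed in this report, so it is left out:

```python
import numpy as np


class WithSum:
    """Hypothetical array-like that implements its own sum method."""

    def __init__(self, values):
        self.values = list(values)
        self.sum_called = False

    def sum(self, axis=None, dtype=None, out=None, **kwargs):
        # np.sum forwards to this method when it exists.
        self.sum_called = True
        return sum(self.values)


class WithoutSum:
    """Hypothetical array-like that only exposes __array__."""

    def __init__(self, values):
        self.values = list(values)
        self.array_called = False

    def __array__(self, dtype=None, copy=None):
        # Without a sum method, numpy converts the object to an ndarray
        # and reduces that instead.
        self.array_called = True
        return np.asarray(self.values, dtype=dtype)


a = WithSum(range(10))
b = WithoutSum(range(10))

total_a = np.sum(a)
total_b = np.sum(b)

print(a.sum_called, total_a)
print(b.array_called, total_b)
```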
(Note: I suppose that in the future we should also fix this by adding an __array_ufunc__.)
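As a rough idea of what such an __array_ufunc__ could look like: the class below is a hypothetical sketch, not a proposal for the actual pandas implementation. It assumes that unwrapping to a plain ndarray and re-wrapping only non-scalar results is the desired behavior, and it ignores the out keyword entirely:

```python
import numpy as np


class UfuncAware:
    """Hypothetical container that intercepts ufunc calls."""

    def __init__(self, data):
        self.data = np.asarray(data)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap our own instances so numpy operates on plain ndarrays.
        # (Assumption of this sketch: `out` is never passed.)
        arrays = [x.data if isinstance(x, UfuncAware) else x for x in inputs]
        result = getattr(ufunc, method)(*arrays, **kwargs)
        if np.ndim(result) == 0:
            # Reductions to a scalar are returned as-is, never re-wrapped.
            return result
        return UfuncAware(result)


obj = UfuncAware(np.arange(10).astype("timedelta64[s]"))

total = np.add.reduce(obj)  # reduction: comes back as a scalar
summed = np.add(obj, obj)   # elementwise: comes back re-wrapped

print(total)
print(type(summed))
```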
Numpy/Python version information:
Both sessions are in the same environment, with Python 3.7:
In [9]: sys.version
Out[9]: '3.7.1 | packaged by conda-forge | (default, Feb 18 2019, 01:42:00) \n[GCC 7.3.0]'