Regression in to_timedelta with errors="coerce" and unit #34806

Merged: jreback merged 1 commit into pandas-dev:master from the to_timedelta-unit-ignore branch on Jun 15, 2020


TomAugspurger (Contributor)

Introduced in d3f686b

In pandas 1.0.3

```python
In [2]: pd.to_timedelta([1, 2, 'error'], errors="coerce", unit="ns")
Out[2]: TimedeltaIndex(['00:00:00.000000', '00:00:00.000000', NaT], dtype='timedelta64[ns]', freq=None)
```

In master, we raise.

```pytb
In [2]: pd.to_timedelta([1, 2, 'error'], errors="coerce", unit="ns")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-a3691c044041> in <module>
----> 1 pd.to_timedelta([1, 2, 'error'], errors="coerce", unit="ns")

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/tools/timedeltas.py in to_timedelta(arg, unit, errors)
    101         arg = arg.item()
    102     elif is_list_like(arg) and getattr(arg, "ndim", 1) == 1:
--> 103         return _convert_listlike(arg, unit=unit, errors=errors)
    104     elif getattr(arg, "ndim", 1) > 1:
    105         raise TypeError(

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/tools/timedeltas.py in _convert_listlike(arg, unit, errors, name)
    140
    141     try:
--> 142         value = sequence_to_td64ns(arg, unit=unit, errors=errors, copy=False)[0]
    143     except ValueError:
    144         if errors == "ignore":

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/timedeltas.py in sequence_to_td64ns(data, copy, unit, errors)
    927     if is_object_dtype(data.dtype) or is_string_dtype(data.dtype):
    928         # no need to make a copy, need to convert if string-dtyped
--> 929         data = objects_to_td64ns(data, unit=unit, errors=errors)
    930         copy = False
    931

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/timedeltas.py in objects_to_td64ns(data, unit, errors)
   1037     values = np.array(data, dtype=np.object_, copy=False)
   1038
-> 1039     result = array_to_timedelta64(values, unit=unit, errors=errors)
   1040     return result.view("timedelta64[ns]")
   1041

pandas/_libs/tslibs/timedeltas.pyx in pandas._libs.tslibs.timedeltas.array_to_timedelta64()

ValueError: unit must not be specified if the input contains a str
```

This PR restores the 1.0.3 behavior, adds a test for `errors="ignore"`, and cleans up the `to_timedelta` docstring.
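For anyone wanting to check the restored behavior locally, here is a minimal sketch, assuming a pandas build that includes this fix. It is not the test from this PR; the variable names are illustrative, and the expectation that the default `errors="raise"` still raises `ValueError` for this input is an assumption based on the traceback above.

```python
import pandas as pd

# Mixed input: two integers (interpreted in `unit`) and one unparseable string.
values = [1, 2, "error"]

# With errors="coerce", the unparseable entry should become NaT instead of raising.
result = pd.to_timedelta(values, errors="coerce", unit="ns")
assert result.isna().tolist() == [False, False, True]
assert result[0] == pd.Timedelta(1, unit="ns")
assert result[1] == pd.Timedelta(2, unit="ns")

# Assumption: with the default errors="raise", a string combined with `unit`
# is still rejected, as in the traceback above.
try:
    pd.to_timedelta(values, unit="ns")
except ValueError as exc:
    print(f"errors='raise' still raises: {exc}")
else:
    print("errors='raise' did not raise on this pandas version")
```

`errors="ignore"` is left out of the sketch on purpose, since its handling differs across pandas versions.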

TomAugspurger added this to the 1.1 milestone Jun 15, 2020
TomAugspurger added the Regression and Timedelta labels Jun 15, 2020
jreback merged commit 62b26a7 into pandas-dev:master Jun 15, 2020
jreback (Contributor) commented Jun 15, 2020

thanks @TomAugspurger

TomAugspurger deleted the to_timedelta-unit-ignore branch June 15, 2020 20:16