Skip to content

BUG: Timedelta components no longer rounded with high precision integers #31380

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Feb 2, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion doc/source/whatsnew/v1.1.0.rst
Original file line number Diff line number Diff line change
Expand Up @@ -109,7 +109,7 @@ Datetimelike
Timedelta
^^^^^^^^^

-
- Bug in constructing a :class:`Timedelta` with a high precision integer that would round the :class:`Timedelta` components (:issue:`31354`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, I'm not sure if it was already covered by the earlier PR, but it seems like the change in precision to total_seconds() from "dubiously nanosecond-precision" to "microsecond-precision" should probably show up in the release notes. It seems like this is the only entry under Timedelta, so maybe it was done silently? Or was the other change already released?

-

Timezones
Expand Down
10 changes: 1 addition & 9 deletions pandas/_libs/tslibs/conversion.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ from pandas._libs.tslibs.util cimport (
from pandas._libs.tslibs.timedeltas cimport cast_from_unit
from pandas._libs.tslibs.timezones cimport (
is_utc, is_tzlocal, is_fixed_offset, get_utcoffset, get_dst_info,
get_timezone, maybe_get_tz, tz_compare, treat_tz_as_dateutil)
get_timezone, maybe_get_tz, tz_compare)
from pandas._libs.tslibs.timezones import UTC
from pandas._libs.tslibs.parsing import parse_datetime_string

Expand Down Expand Up @@ -341,14 +341,6 @@ cdef _TSObject convert_datetime_to_tsobject(datetime ts, object tz,
obj.tzinfo = tz
else:
obj.value = pydatetime_to_dt64(ts, &obj.dts)
# GH 24329 When datetime is ambiguous,
# pydatetime_to_dt64 doesn't take DST into account
# but with dateutil timezone, get_utcoffset does
# so we need to correct for it
if treat_tz_as_dateutil(ts.tzinfo):
if ts.tzinfo.is_ambiguous(ts):
dst_offset = ts.tzinfo.dst(ts)
obj.value += int(dst_offset.total_seconds() * 1e9)
obj.tzinfo = ts.tzinfo

if obj.tzinfo is not None and not is_utc(obj.tzinfo):
Expand Down
10 changes: 2 additions & 8 deletions pandas/_libs/tslibs/nattype.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ from cpython.object cimport (
PyObject_RichCompare,
Py_GT, Py_GE, Py_EQ, Py_NE, Py_LT, Py_LE)

from cpython.datetime cimport (datetime,
from cpython.datetime cimport (datetime, timedelta,
PyDateTime_Check, PyDelta_Check,
PyDateTime_IMPORT)

Expand Down Expand Up @@ -276,13 +276,6 @@ cdef class _NaT(datetime):
def __long__(self):
return NPY_NAT

def total_seconds(self):
"""
Total duration of timedelta in seconds (to microsecond precision).
"""
# GH#10939
return np.nan

@property
def is_leap_year(self):
return False
Expand Down Expand Up @@ -386,6 +379,7 @@ class NaTType(_NaT):
# nan methods
weekday = _make_nan_func('weekday', datetime.weekday.__doc__)
isoweekday = _make_nan_func('isoweekday', datetime.isoweekday.__doc__)
total_seconds = _make_nan_func('total_seconds', timedelta.total_seconds.__doc__)
month_name = _make_nan_func('month_name', # noqa:E128
"""
Return the month name of the Timestamp with specified locale.
Expand Down
10 changes: 1 addition & 9 deletions pandas/_libs/tslibs/timedeltas.pyx
Original file line number Diff line number Diff line change
Expand Up @@ -859,14 +859,6 @@ cdef class _Timedelta(timedelta):
"""
return self.to_timedelta64()

def total_seconds(self):
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer need this implementation since we can rely on timedelta.total_seconds

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what change for the "no longer" part?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are now dispatching the correct microseconds to datetime.timedelta here: https://github.com/pandas-dev/pandas/pull/31380/files#diff-d23555c3fb6b3ffc9f2088e09f9ec5f9R1245

we can use datetime.timedelta.total_seconds as mentioned in this #31155 (comment) since we are no longer rounding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good, thanks for explaining

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I did consider suggesting the removal of total_seconds here, but when I benchmarked it, I found that the implementation where you truncate ._value was about 2x as fast as the native timedelta.total_seconds(), if that matters at all.

"""
Total duration of timedelta in seconds (to microsecond precision).
"""
# GH 31043
# Microseconds precision to avoid confusing tzinfo.utcoffset
return (self.value - self.value % 1000) / 1e9

def view(self, dtype):
"""
Array view compatibility.
Expand Down Expand Up @@ -1250,7 +1242,7 @@ class Timedelta(_Timedelta):
return NaT

# make timedelta happy
td_base = _Timedelta.__new__(cls, microseconds=int(value) / 1000)
td_base = _Timedelta.__new__(cls, microseconds=int(value) // 1000)
td_base.value = value
td_base.is_populated = 0
return td_base
Expand Down
13 changes: 13 additions & 0 deletions pandas/tests/scalar/timedelta/test_timedelta.py
Original file line number Diff line number Diff line change
Expand Up @@ -821,3 +821,16 @@ def test_resolution_deprecated(self):
def test_truthiness(value, expected):
# https://github.com/pandas-dev/pandas/issues/21484
assert bool(value) is expected


def test_timedelta_attribute_precision():
# GH 31354
td = Timedelta(1552211999999999872, unit="ns")
result = td.days * 86400
result += td.seconds
result *= 1000000
result += td.microseconds
result *= 1000
result += td.nanoseconds
expected = td.value
assert result == expected
10 changes: 1 addition & 9 deletions pandas/tests/scalar/timestamp/test_timestamp.py
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,6 @@

import calendar
from datetime import datetime, timedelta
from distutils.version import LooseVersion
import locale
import unicodedata

Expand Down Expand Up @@ -1088,20 +1087,13 @@ def test_constructor_ambigous_dst():
assert result == expected


@pytest.mark.xfail(
LooseVersion(compat._optional._get_version(dateutil)) < LooseVersion("2.7.0"),
reason="dateutil moved to Timedelta.total_seconds() in 2.7.0",
)
@pytest.mark.parametrize("epoch", [1552211999999999872, 1552211999999999999])
def test_constructor_before_dst_switch(epoch):
# GH 31043
# Make sure that calling Timestamp constructor
# on time just before DST switch doesn't lead to
# nonexistent time or value change
# Works only with dateutil >= 2.7.0 as dateutil overrid
# pandas.Timedelta.total_seconds with
# datetime.timedelta.total_seconds before
ts = Timestamp(epoch, tz="dateutil/US/Pacific")
ts = Timestamp(epoch, tz="dateutil/America/Los_Angeles")
result = ts.tz.dst(ts)
expected = timedelta(seconds=0)
assert Timestamp(ts).value == epoch
Expand Down