BUG: to_timedelta overflows without raising in some very particular cases #17037

Closed
@cchwala

Description

Code Sample, a copy-pastable example if possible

import pandas as pd
import numpy as np

int_min = np.iinfo(np.int64).min
int_max = np.iinfo(np.int64).max

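def float_array_with_smallest_increments(initial_float, N_points_in_one_direction):
    # Build a sorted array of adjacent float64 values centred on initial_float.
    floats_upward = [initial_float, ]
    floats_downward = [initial_float, ]
    for i in range(N_points_in_one_direction):
        # np.nextafter(x, y) returns the next representable float after x towards y
        floats_upward.append(np.nextafter(floats_upward[-1], int_max))
        floats_downward.append(np.nextafter(floats_downward[-1], int_min))
    # Reverse the downward list and drop the duplicated centre value
    return np.array(floats_downward[::-1] + floats_upward[1:])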

seconds_as_floats = float_array_with_smallest_increments(int_max/1e9, 5)

for v in np.nditer(seconds_as_floats):
    print('%.20f' % v)
    
pd.to_timedelta(seconds_as_floats, unit='s')

Output:

9223372036.85476684570312500000
9223372036.85476875305175781250
9223372036.85477066040039062500
9223372036.85477256774902343750
9223372036.85477447509765625000
9223372036.85477638244628906250
9223372036.85477828979492187500
9223372036.85478019714355468750
9223372036.85478210449218750000
9223372036.85478401184082031250
9223372036.85478591918945312500

TimedeltaIndex([  '106751 days 23:47:16.854767',
                  '106751 days 23:47:16.854769',
                  '106751 days 23:47:16.854771',
                  '106751 days 23:47:16.854773',
                  '106751 days 23:47:16.854774',
                '-106752 days +00:12:43.145224',
                '-106752 days +00:12:43.145226',
                '-106752 days +00:12:43.145228',
                '-106752 days +00:12:43.145230',
                '-106752 days +00:12:43.145232',
                '-106752 days +00:12:43.145234'],
               dtype='timedelta64[ns]', freq=None)
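To help read the output above, here is a small sketch (my own illustration, not part of the original report) of the float64 spacing at this magnitude and of which neighbour of `int_max/1e9` would need a nanosecond count of 2**63 or more. It assumes the conversion multiplies seconds by 1e9 in float64 before casting to int64; that is a guess at the mechanism, not something confirmed by the pandas source.

```python
import numpy as np

int_max = np.iinfo(np.int64).max
limit_s = int_max / 1e9  # nearest float64 to 9223372036.854775807 seconds

# Gap between adjacent float64 values at this magnitude, in seconds.
step_s = np.spacing(limit_s)
print('%.20f' % step_s)

# Nanosecond count each value would need, compared with 2**63
# (one past the largest int64). The multiplication is an assumed
# stand-in for whatever to_timedelta does internally.
neighbours = np.array([np.nextafter(limit_s, 0.0), limit_s])
needs_too_many_ns = neighbours * 1e9 >= 2.0 ** 63
print(needs_too_many_ns)
```

The step is 2**-19 s, roughly 1.9 microseconds, which matches the increments in the printed floats. The first value flagged here is the same one at which the `TimedeltaIndex` above switches to negative results.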

Here is a more detailed notebook showing the problem

Problem description

If you pass floating-point values close to the overflow boundary to to_timedelta, it may silently wrap around and return an incorrect (negative) Timedelta instead of raising an OverflowError.
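Until the conversion raises on its own, a caller-side check along the following lines might serve as a workaround. This is only a sketch: `check_seconds_in_range` is a hypothetical helper, not a pandas function, and it assumes that the float64 product `seconds * 1e9` is the right quantity to bound.

```python
import numpy as np

def check_seconds_in_range(seconds):
    """Hypothetical guard: raise if any value cannot fit in timedelta64[ns]."""
    seconds = np.asarray(seconds, dtype=np.float64)
    ns = seconds * 1e9
    # float(int64 max) rounds up to 2**63, so compare against 2**63 directly.
    # -2**63 is also rejected because int64 min is reserved for NaT.
    # NaN passes through, since NaN compares False against any bound.
    bad = np.abs(ns) >= 2.0 ** 63
    if bad.any():
        raise OverflowError(
            "%d value(s) exceed the timedelta64[ns] range" % bad.sum())
    return seconds

limit_s = np.iinfo(np.int64).max / 1e9
safe = check_seconds_in_range([0.0, 1.5, np.nextafter(limit_s, 0.0)])
```

Calling `check_seconds_in_range(seconds_as_floats)` before `pd.to_timedelta(seconds_as_floats, unit='s')` would then raise for the array in the code sample instead of letting the wrapped values through.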

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 2.7.12.final.0
python-bits: 64
OS: Darwin
OS-release: 14.4.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: None.None

pandas: 0.20.1
pytest: 3.0.7
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.0
xarray: 0.9.5-11-gff2e4dd
IPython: 5.3.0
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.5.1
pytz: 2016.2
blosc: None
bottleneck: 1.2.0
tables: 3.2.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.0
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.4.1
html5lib: None
sqlalchemy: 1.0.12
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
s3fs: None
pandas_gbq: None
pandas_datareader: None
