Floating point accuracy problems in DatetimeIndex.round #14440

Closed
@eoincondron

A small, complete example of the issue

The rounding methods of DatetimeIndex (round, floor, ceil) give slightly inaccurate results when rounding to high frequencies, as this example shows:

>>> import pandas as pd
>>> pd.DatetimeIndex(['2016-10-17 12:00:00.0015']).round('ms')
DatetimeIndex(['2016-10-17 12:00:00.001999872'], dtype='datetime64[ns]', freq=None)

The problem is here in the TimelikeOps._round method:

 result = (unit * rounder(values / float(unit))).astype('i8')

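rounder(values / float(unit)) returns an array of floats holding the number of multiples of unit required. Although these floats are whole numbers, multiplying them by unit is still done in float64, which cannot represent every integer at the magnitude of a nanosecond timestamp (about 1.5e18 here, well above 2**53). The product is therefore rounded to the nearest representable float, a multiple of 256 ns at this magnitude, before it is cast to i8. Replacing it with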

 result = (unit * rounder(values / float(unit)).astype('i8'))

should fix the problem. I'm willing to open a PR for it.
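
For anyone who wants to check this without pandas, here is a minimal NumPy-only sketch of the two expressions. It assumes that values is the int64 nanosecond array, unit is the frequency in nanoseconds and rounder is np.round; the names before and after are only for illustration and are not taken from the pandas source.

```python
import numpy as np

# Nanoseconds since the epoch for the timestamp in the report.
values = np.array(["2016-10-17T12:00:00.001500000"],
                  dtype="datetime64[ns]").astype("i8")
unit = 1_000_000  # one millisecond expressed in nanoseconds

# Current order: multiply in float64, then cast the product to int64.
before = (unit * np.round(values / float(unit))).astype("i8")

# Proposed order: cast the whole-number multiples to int64 first,
# so the multiplication by unit is exact integer arithmetic.
after = unit * np.round(values / float(unit)).astype("i8")

print(before.astype("datetime64[ns]"))
print(after.astype("datetime64[ns]"))

# Gap between adjacent float64 values at this magnitude, in nanoseconds.
print(np.spacing(float(values[0])))
```

With the proposed order the result lands exactly on 12:00:00.002, while the current order is 128 ns away from it, which is half of the 256 ns float64 spacing at this magnitude.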

Output of pd.show_versions()

pandas: 0.19.0
nose: None
pip: 8.1.2
setuptools: 27.2.0
Cython: None
numpy: 1.11.1
scipy: None
statsmodels: None
xarray: None
IPython: 5.1.0
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.6.1
blosc: None
bottleneck: None
tables: 3.2.3.1
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.13
pymysql: 0.7.6.None
psycopg2: None
jinja2: 2.8
boto: None
