Skip to content

BUG: Timedelta drops decimals if precision is greater than nanoseconds #36771

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Nov 15, 2020

Conversation

phofl
Copy link
Member

@phofl phofl commented Oct 1, 2020

I am not quite sure, if we should round to nanoseconds or cut if off after the 9th decimal. I implemented a simple cut off, but would be happy to change if rounding is better.

I would also add a note to the user guide if the future behavior is clear

@phofl phofl added the Timedelta Timedelta data type label Oct 1, 2020
@phofl
Copy link
Member Author

phofl commented Oct 1, 2020

/azp run

@azure-pipelines
Copy link
Contributor

Commenter does not have sufficient privileges for PR 36771 in repo pandas-dev/pandas

@phofl
Copy link
Member Author

phofl commented Oct 1, 2020

Failure seems weird, maybe rerunning the pipeline?

@mroeschke
Copy link
Member

I think truncation is reasonable. I think it would be good to document this behavior in Timedelta and to_timedelta.

Could you also add a test that goes through to_timedelta?

@phofl
Copy link
Member Author

phofl commented Oct 11, 2020

The test ran through to_timedelta but not through Timedelta. Added this one. Also added a note to the docstrings

@phofl
Copy link
Member Author

phofl commented Oct 11, 2020

Failure unrelated

� Conflicts:
�	doc/source/whatsnew/v1.2.0.rst
@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label Nov 11, 2020
@mroeschke
Copy link
Member

Mind merge master @phofl? This LGTM after that

@phofl
Copy link
Member Author

phofl commented Nov 11, 2020

@mroeschke done

@phofl
Copy link
Member Author

phofl commented Nov 11, 2020

ci failure is from master and will be fixed with #37756

@jorisvandenbossche
Copy link
Member

@phofl I merged that PR, so you can update here as well now

@phofl
Copy link
Member Author

phofl commented Nov 11, 2020

Thanks, done

@mroeschke mroeschke removed the Stale label Nov 11, 2020
@mroeschke mroeschke added this to the 1.2 milestone Nov 11, 2020
@@ -66,6 +66,12 @@ def to_timedelta(arg, unit=None, errors="raise"):
to_datetime : Convert argument to datetime.
convert_dtypes : Convert dtypes.

Notes
-----

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no extra line here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx, changed

("8:53:08.7180000001", "8:53:08.7180000001"),
],
)
@pytest.mark.parametrize("func", ["Timedelta", "to_timedelta"])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use pd.Timedelta and pd.to_datetime instead of passing these as strings and using getattr below

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -404,6 +404,7 @@ Datetimelike
- Bug in :meth:`DatetimeArray.shift` incorrectly allowing ``fill_value`` with a mismatched timezone (:issue:`37299`)
- Bug in adding a :class:`BusinessDay` with nonzero ``offset`` to a non-scalar other (:issue:`37457`)
- Bug in :func:`to_datetime` with a read-only array incorrectly raising (:issue:`34857`)
- Bug in :class:`Timedelta` incorrectly deleted all decimals when input had a higher precision than nanoseconds (:issue:`36738`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"incorrectly truncating to sub-second portion of a string input when it has precision higher than nanoseconds"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -1143,6 +1145,9 @@ class Timedelta(_Timedelta):
Notes
-----
The ``.value`` attribute is always in ns.

If the precision is higher than nanoseconds, the precision of the duration is
truncated to nanoseconds.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarify this is for a string input

do additional digits round or just drop (and if just drop, are we sure thats the desired behavior?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropd additional digits. Was not sure either so I started with the easier implementation
#36771 (comment)

@pytest.mark.parametrize(
("input", "expected"),
[
("8:53:08.71800000001", "8:53:08.718"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a case with the trailing digit being 9 to clarify the truncating/rounding distinction

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

# GH: 36738
expected = pd.Timedelta(expected)
result = func(input)
print(result)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the print

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry forgot that, is removed

@@ -404,6 +404,7 @@ Datetimelike
- Bug in :meth:`DatetimeArray.shift` incorrectly allowing ``fill_value`` with a mismatched timezone (:issue:`37299`)
- Bug in adding a :class:`BusinessDay` with nonzero ``offset`` to a non-scalar other (:issue:`37457`)
- Bug in :func:`to_datetime` with a read-only array incorrectly raising (:issue:`34857`)
- Bug in :class:`Timedelta` incorrectly truncating to sub-second portion of a string input when it has precision higher than nanoseconds (:issue:`36738`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you move to section below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@jreback jreback merged commit d85e9a2 into pandas-dev:master Nov 15, 2020
@jreback
Copy link
Contributor

jreback commented Nov 15, 2020

thanks @phofl

@phofl phofl deleted the 36738 branch November 15, 2020 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: to_timedelta drops decimals from input if precision is greater than nanoseconds
5 participants