BUG: first() changes datetime64 data #9311

Closed
@chrisbyboston

Description

I have a dataframe that contains a datetime64 column that seems to be losing some digits after a groupby.first(). I know that under the hood these are stored at nanosecond precision, but I need them to remain equal, since I'm merging back onto the original frame later on. Is this expected behavior?

import pandas as pd
from pandas import Timestamp

data = [
 {'a': 1,
  'dateCreated': Timestamp('2011-01-20 12:50:28.593448')},
 {'a': 1,
  'dateCreated': Timestamp('2011-01-15 12:50:28.502376')},
 {'a': 1,
  'dateCreated': Timestamp('2011-01-15 12:50:28.472790')},
 {'a': 1,
  'dateCreated': Timestamp('2011-01-15 12:50:28.445286')}]

df = pd.DataFrame(data)

Output is:

In [6]: df
Out[6]:
   a                dateCreated
0  1 2011-01-20 12:50:28.593448
1  1 2011-01-15 12:50:28.502376
2  1 2011-01-15 12:50:28.472790
3  1 2011-01-15 12:50:28.445286

In [7]: df.groupby('a').first()
Out[7]:
                    dateCreated
a
1 2011-01-20 12:50:28.593447936

When I compare the datetime64 in the first row to the datetime64 after the groupby.first(), the two are not equal.
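
One possible explanation, offered as a guess rather than a confirmed diagnosis: the trailing digits `593447936` are what you get if the int64 nanosecond value is converted to float64 and back somewhere along the way. A float64 has a 53-bit significand, so for values this large it can only represent multiples of 256. The sketch below reproduces the arithmetic with the standard library only; the helper `to_epoch_ns` is a hypothetical name for illustration, not a pandas function, and nothing here shows where in `groupby.first()` such a conversion would happen.

```python
from datetime import datetime

EPOCH = datetime(1970, 1, 1)

def to_epoch_ns(dt):
    """Exact integer nanoseconds since the Unix epoch (hypothetical helper)."""
    delta = dt - EPOCH
    return (delta.days * 86400 + delta.seconds) * 10**9 + delta.microseconds * 1000

# The first row of the frame above, as an exact integer nanosecond count.
original_ns = to_epoch_ns(datetime(2011, 1, 20, 12, 50, 28, 593448))

# Round-trip through float64, which cannot hold every integer of this size.
roundtrip_ns = int(float(original_ns))

print(original_ns)                 # 1295527828593448000
print(roundtrip_ns)                # 1295527828593447936
print(original_ns - roundtrip_ns)  # 64
```

The 64 ns difference matches the gap between `12:50:28.593448` and `12:50:28.593447936` in the output above, which is why a float64 round trip looks like a plausible cause.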
