Weird Datetime Behaviour #3002

Closed
@justinvf-zz

I am seeing odd behavior: dates are being converted to datetime64[ns] timestamps in places where that seems inappropriate.

    In [272]: file = StringIO("xxyyzz20100101PIE\nxxyyzz20100101GUM\nxxyyww20090101EGG\nfoofoo20080909PIE")

    In [273]: df = pd.read_fwf(file, widths=[6,8,3], names=["person_id", "dt", "food"], parse_dates=["dt"])

    In [274]: df
    Out[274]: 
      person_id                  dt food
    0    xxyyzz 2010-01-01 00:00:00  PIE
    1    xxyyzz 2010-01-01 00:00:00  GUM
    2    xxyyww 2009-01-01 00:00:00  EGG
    3    foofoo 2008-09-09 00:00:00  PIE

Everything looks good. However:

    In [275]: df.dt.value_counts()
    Out[275]: 
    1970-01-17 00:00:00    2
    1970-01-18 00:00:00    1
    1970-01-25 08:00:00    1

    In [276]: df.dt.dtype
    Out[276]: dtype('datetime64[ns]')

The values are stored internally as epoch nanoseconds, but they usually display as the dates I want. However, everything gets messed up when we drop down to numpy land for things like value_counts. What is going on? Is this a bug, or am I making a mistake?

    In [280]: pd.__version__
    Out[280]: '0.10.1'

    In [281]: np.__version__
    Out[281]: '1.6.1'
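
For reference, here is a minimal sketch of what "stored as epoch nanoseconds" means and what value_counts would be expected to return. It assumes a recent pandas release rather than the 0.10.1 / numpy 1.6.1 combination above, and the variable names (`dates`, `raw_ns`, `roundtrip`, `counts`) are only illustrative. It does not explain the 1970 dates in the output above; it only shows the intended behavior.

```python
import pandas as pd

# The same four dates as in the fixed-width file above, built directly
# and cast explicitly to nanosecond resolution.
dates = pd.Series(
    pd.to_datetime(["2010-01-01", "2010-01-01", "2009-01-01", "2008-09-09"])
).astype("datetime64[ns]")

# The underlying storage: integer nanoseconds since 1970-01-01.
raw_ns = dates.astype("int64")

# Reading those integers back with the correct unit recovers the dates.
roundtrip = pd.to_datetime(raw_ns, unit="ns")

# value_counts should group by the real dates, not by 1970 timestamps.
counts = dates.value_counts()

print(raw_ns.tolist())
print(counts)
```

If the integers were read back with a coarser unit than nanoseconds, the results would land in 1970, which looks similar to the symptom above, though that is only a guess at the cause.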
