Skip to content

pd.to_datetime/pd.DatetimeIndex unexpected behavior #12501

Closed
@marcomayer

Description

@marcomayer

I don't know if this is a bug or not (I guess not), but I think this behavior might be very confusing to many users who like me until now might not be aware of this.

When converting a string column to Date using pd.to_datetime or pd.DatetimeIndex the way the date is interpreted might change from row to row. I experienced this today where March 1st was interpreted as January 3rd. Probably pandas does this simply on a per row set considering only one sample at a time. One way to improve this would be to first go through the whole

At least I'd suggest to implement a warning if none of the format/dayfirst parameters are used.

Here's a screenshot to clarify:

http://snag.gy/mXT7g.jpg

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8

pandas: 0.17.1
nose: 1.3.7
pip: 8.0.2
setuptools: 19.6.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
IPython: 4.1.1
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
Jinja2: 2.8

Metadata

Metadata

Assignees

No one assigned

    Labels

    DatetimeDatetime data dtypeDuplicate ReportDuplicate issue or pull request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions