Description
I don't know if this is a bug or not (I guess not), but I think this behavior might be very confusing to many users who like me until now might not be aware of this.
When converting a string column to Date using pd.to_datetime or pd.DatetimeIndex the way the date is interpreted might change from row to row. I experienced this today where March 1st was interpreted as January 3rd. Probably pandas does this simply on a per row set considering only one sample at a time. One way to improve this would be to first go through the whole
At least I'd suggest to implement a warning if none of the format/dayfirst parameters are used.
Here's a screenshot to clarify:
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 15.3.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
pandas: 0.17.1
nose: 1.3.7
pip: 8.0.2
setuptools: 19.6.2
Cython: 0.23.4
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
IPython: 4.1.1
sphinx: 1.3.5
patsy: 0.4.0
dateutil: 2.4.2
pytz: 2015.7
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.4.6
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.5.0
bs4: 4.4.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.0.11
pymysql: None
psycopg2: None
Jinja2: 2.8