Description
Hi,
I regularly run into dates that fall outside the range pandas can represent. Quite a few data sources use sentinel defaults such as "9999-12-31", which causes problems in pandas.
This is because pandas stores datetimes at nanosecond resolution, where the representable time span is quite limited.
See: http://docs.scipy.org/doc/numpy/reference/arrays.datetime.html
Code  Meaning      Time span (relative)  Time span (absolute)
s     second       +/- 2.9e11 years      [2.9e11 BC, 2.9e11 AD]
ms    millisecond  +/- 2.9e8 years       [2.9e8 BC, 2.9e8 AD]
us    microsecond  +/- 2.9e5 years       [290301 BC, 294241 AD]
ns    nanosecond   +/- 292 years         [1678 AD, 2262 AD]
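As a rough sanity check of these spans, the sketch below assumes each datetime is a signed 64-bit tick count around the epoch and uses a mean Gregorian year of 365.2425 days (both are my assumptions for the arithmetic, not taken from the NumPy page). It gives roughly 2.9e11, 2.9e8, 2.9e5 and 292 years for s, ms, us and ns.

```python
# Largest tick count a signed 64-bit integer can hold on either side of the epoch.
MAX_TICKS = 2**63 - 1

# Assumption: mean Gregorian year, good enough for an order-of-magnitude check.
SECONDS_PER_YEAR = 365.2425 * 24 * 60 * 60

# Number of ticks per second for each resolution code.
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}

# Half-width of the representable range, in years, for each resolution.
span_years = {
    unit: MAX_TICKS / ticks / SECONDS_PER_YEAR
    for unit, ticks in TICKS_PER_SECOND.items()
}

for unit, years in span_years.items():
    print(f"{unit:>2}: +/- {years:.3g} years")
```

Adding 292 years to 1970 lands at about 2262, which matches the nanosecond row.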
I first thought the unit='s' parameter of to_datetime would work (see: http://pandas.pydata.org/pandas-docs/version/0.14.0/generated/pandas.tseries.tools.to_datetime.html), but I think it only describes the unit of the input being converted to nanoseconds, and the "ns" resolution seems to be hard-coded.
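To illustrate what unit does, here is a minimal sketch, assuming a numeric epoch value as input (the sample number is hypothetical). The unit argument tells to_datetime how to interpret that number; it is not a request for a coarser storage resolution.

```python
import pandas as pd

# Hypothetical sample value: seconds elapsed since 1970-01-01.
epoch_seconds = 1_400_000_000

# unit="s" says the integer counts seconds; it does not select
# the resolution at which the result is stored.
ts = pd.to_datetime(epoch_seconds, unit="s")
print(ts)
```

So unit='s' helps when the source data is an epoch count, but it does not by itself widen the range of dates that can be stored.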
I cannot imagine the majority of use cases needing nanoseconds; even going to microseconds extends the date range to something that, in my experience, should always work. The nanosecond upper bound of 2262 AD is really limiting.
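For comparison, a small sketch using plain NumPy (not pandas) shows that microsecond resolution holds a "9999-12-31" sentinel without trouble. The sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data: one ordinary date and one sentinel default.
raw = ["2014-06-01", "9999-12-31"]

# Store the dates at microsecond resolution instead of nanoseconds.
dates_us = np.array(raw, dtype="datetime64[us]")

# The sentinel survives intact and can be compared like any other date.
is_sentinel = dates_us == np.datetime64("9999-12-31")

print(dates_us)
print(is_sentinel)
```

This only shows that the underlying NumPy type copes with such dates; it says nothing about how easily pandas could expose the same choice.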
Imho, ideally one should be able to choose the detail level. Is this just me or is this a common issue?