Description
Pandas version checks
-
I have checked that this issue has not already been reported.
-
I have confirmed this bug exists on the latest version of pandas.
-
I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
s = pd.Series(['6/30/2025','1 27 2024'])
pd.to_datetime(s,errors='raise',infer_datetime_format=True)
Issue Description
I'm confused about the interaction between errors = {raise, coerce}
and infer_datetime_format = {True, False}
.
-
From experiment 3, it tells me if I want to follow the format in first row using infer=True, the second row has a problem which causes it to be coerced to NaT.
-
I also understand that in experiment 4, because infer=False, each row is free to have it's own format not necessarily following the format inferred from first row, so it doesn't have any problem being converted, thus not NaT.
-
I don't understand why experiment 1 does not raise any error but converts row 2 properly?
(From experiment 3 i expect row 2 to be problematic when infer=True) -
I understand experiment 2 converts with no problem because infer=False allows every row to have it's own format, similar reasoning as for experiment 4.
I tried 4 experiments on the same series of s = pd.Series(['6/30/2025','1 27 2024'])
Experiment 1. errors = 'raise', infer_datetime_format=True
Input: pd.to_datetime(s,errors='raise',infer_datetime_format=True)
Output:
0 2025-06-30
1 2024-01-27
dtype: datetime64[ns]
Experiment 2. errors = 'raise', infer_datetime_format=False
Input: pd.to_datetime(s,errors='raise',infer_datetime_format=False)
Output: Same as experiment 1
Experiment 3. errors = 'coerce', infer_datetime_format=True
Input: pd.to_datetime(s,errors='coerce',infer_datetime_format=True)
Output:
0 2025-06-30
1 NaT
dtype: datetime64[ns]
Experiment 4. errors = 'coerce', infer_datetime_format=False
Input: pd.to_datetime(s,errors='coerce',infer_datetime_format=False)
Output: Same as experiment 1
Expected Behavior
I expect the same error that caused experiment 3 to produce NaT in 2nd row with '1 27 2024'
, to generate some error in experiment 1 for 2nd row too.
I'm not sure what caused the NaT in experiment 3, but reasoning from the assumption that if errors='coerce'
has found some problem, errors='raise'
should find that problem too.
I also assume the NaT in experiment 3 is caused by infer_datetime_format by comparing experiment 3 and 4.
Installed Versions
INSTALLED VERSIONS
commit : bb1f651
python : 3.8.12.final.0
python-bits : 64
OS : Darwin
OS-release : 20.4.0
Version : Darwin Kernel Version 20.4.0: Fri Mar 5 01:14:14 PST 2021; root:xnu-7195.101.1~3/RELEASE_X86_64
machine : x86_64
processor : i386
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.0
numpy : 1.21.4
pytz : 2021.3
dateutil : 2.8.2
pip : 21.1.1
setuptools : 56.0.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.4
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 7.30.1
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : 3.5.0
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.7.3
sqlalchemy : 1.4.29
tables : None
tabulate : 0.8.9
xarray : None
xlrd : 2.0.1
xlwt : None
zstandard : None