Description
Code Sample, a copy-pastable example if possible
Consider the following setup on master:
In [1]: import pandas as pd; pd.__version__
Out[1]: '0.26.0.dev0+1576.gdd0d353fb'
In [2]: dti = pd.date_range("2020", periods=3)
...: dti_tz = pd.date_range("2020", periods=3, tz="UTC")
...: tdi = pd.timedelta_range("0 days", periods=3)
...: pi = pd.period_range("2020Q1", periods=3, freq="Q")
Equality comparisons with an equivalent Categorical are incorrect for DatetimeIndex:
In [3]: dti == pd.Categorical(dti)
Out[3]: array([False, False, False])
In [4]: dti_tz == pd.Categorical(dti_tz)
Out[4]: array([False, False, False])
Equality comparisons raise for PeriodIndex:
In [5]: pi == pd.Categorical(pi)
---------------------------------------------------------------------------
ValueError: Value must be Period, string, integer, or datetime
Looks good for TimedeltaIndex:
In [6]: tdi == pd.Categorical(tdi)
Out[6]: array([ True, True, True])
The incorrect behavior above is generally consistent when replacing the index with its extension array/Series equivalent, Categorical with CategoricalIndex/Series[Categorical], and == with !=.
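The variants described above can be enumerated with a small script. This is only a sketch: `try_compare` and `survey` are hypothetical helper names introduced here, not pandas API, and the results depend on the installed pandas version, so no expected output is shown.

```python
import operator

import numpy as np
import pandas as pd

dti = pd.date_range("2020", periods=3)
dti_tz = pd.date_range("2020", periods=3, tz="UTC")
tdi = pd.timedelta_range("0 days", periods=3)
pi = pd.period_range("2020Q1", periods=3, freq="Q")


def try_compare(op, left, right):
    """Hypothetical helper: return the result as a list, or the exception name."""
    try:
        return np.asarray(op(left, right)).tolist()
    except Exception as exc:  # e.g. the ValueError shown for PeriodIndex
        return type(exc).__name__


def survey(index):
    """Hypothetical helper: compare index/array/Series against categorical forms."""
    cat = pd.Categorical(index)
    lefts = {"index": index, "array": index.array, "series": pd.Series(index)}
    rights = {
        "Categorical": cat,
        "CategoricalIndex": pd.CategoricalIndex(cat),
        "Series[Categorical]": pd.Series(cat),
    }
    out = {}
    for op in (operator.eq, operator.ne):
        for lname, left in lefts.items():
            for rname, right in rights.items():
                out[(op.__name__, lname, rname)] = try_compare(op, left, right)
    return out


results = {
    name: survey(idx)
    for name, idx in {"dti": dti, "dti_tz": dti_tz, "tdi": tdi, "pi": pi}.items()
}
for name, table in results.items():
    for key, value in table.items():
        print(name, key, value)
```

Each printed row shows the operator, the left-hand and right-hand container types, and either the element-wise result or the name of the exception raised.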
The only exception appears to be that a couple of cases work when you have a Categorical/CategoricalIndex on the LHS and an extension array on the RHS:
In [7]: pd.Categorical(dti) == dti.array
Out[7]: array([ True, True, True])
In [8]: pd.CategoricalIndex(pi) == pi.array
Out[8]: array([ True, True, True])
Though note that the above does not work for dti_tz.array.
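The tz-aware case mentioned above can be checked the same way as In [7]. A minimal sketch, assuming a hypothetical helper `reversed_compare` (not part of pandas); the outcome depends on the pandas version, so no output is asserted here.

```python
import numpy as np
import pandas as pd

dti_tz = pd.date_range("2020", periods=3, tz="UTC")


def reversed_compare(index):
    """Hypothetical helper: Categorical on the LHS, extension array on the RHS."""
    try:
        return np.asarray(pd.Categorical(index) == index.array).tolist()
    except Exception as exc:
        # Report the exception type instead of failing, so the script always runs.
        return type(exc).__name__


print(reversed_compare(dti_tz))
```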
Problem description
Equality comparisons for datetimelike arrays/indexes are largely incorrect when comparing to equivalent categoricals. There is some tie-in to #19513, but I think this specific component is pretty clear-cut.
Expected Output
I'd expect all variants of == in the examples above to return array([ True, True, True]), and all variants of != to return array([False, False, False]).
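The expectation above could be turned into a quick check along these lines. This is a sketch under assumptions: `meets_expectation` is a hypothetical name, and it only covers the index-versus-Categorical variant, not every combination listed earlier.

```python
import numpy as np
import pandas as pd


def meets_expectation(index):
    """Hypothetical check: == is all True and != is all False vs. a Categorical."""
    cat = pd.Categorical(index)
    try:
        eq = np.asarray(index == cat)
        ne = np.asarray(index != cat)
    except Exception:
        # A raised error (as reported for PeriodIndex) counts as a failure.
        return False
    right_shape = eq.shape == (len(index),) and ne.shape == (len(index),)
    return bool(right_shape and eq.all() and not ne.any())


indexes = {
    "dti": pd.date_range("2020", periods=3),
    "dti_tz": pd.date_range("2020", periods=3, tz="UTC"),
    "tdi": pd.timedelta_range("0 days", periods=3),
    "pi": pd.period_range("2020Q1", periods=3, freq="Q"),
}
report = {name: meets_expectation(idx) for name, idx in indexes.items()}
print(report)
```

A `False` entry in `report` marks an index type whose comparison against its equivalent Categorical still deviates from the expected output.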