Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
from pandas import DatetimeIndex

# Left index: 15-minute steps, millisecond resolution, UTC
l = DatetimeIndex(['2023-05-24 00:00:00+00:00', '2023-05-24 00:15:00+00:00',
                   '2023-05-24 00:30:00+00:00', '2023-05-24 00:45:00+00:00',
                   '2023-05-24 01:00:00+00:00'],
                  dtype='datetime64[ms, UTC]', name='ts', freq='15min')
# Right index: 30-minute steps over the same range, so every element is also in l
r = DatetimeIndex(['2023-05-24 00:00:00+00:00', '2023-05-24 00:30:00+00:00',
                   '2023-05-24 01:00:00+00:00'],
                  dtype='datetime64[ms, UTC]', name='ts', freq='30min')

union = r.union(l)
print(union)
# Both assertions fail on affected versions
assert len(union) == len(l)
assert all(r.union(l) == l)
Issue Description
The union of the two DatetimeIndex objects in the reproducible example is computed incorrectly. On newer pandas versions the result is:
DatetimeIndex(['2023-05-24 00:00:00+00:00', '2051-11-29 16:00:00+00:00',
'2080-06-06 08:00:00+00:00'],
dtype='datetime64[ms, UTC]', name='ts', freq='15T')
The first failing version is the one listed under "Installed Versions". The error occurs from pandas 2.1.0 onwards; pandas 1.* and 2.x up to 2.0.3 work fine. Neither the numpy version nor the Python version matters.
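A possible reading of the wrong timestamps, offered as an observation rather than a confirmed diagnosis: they are what you get if the 15-minute step, expressed in nanoseconds, is added to the start as if it were a count of milliseconds. The sketch below checks that arithmetic with the standard library only; it does not call pandas, and the names `step_ns` and `misread_step` are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

start = datetime(2023, 5, 24, tzinfo=timezone.utc)

# 15 minutes expressed in nanoseconds
step_ns = 15 * 60 * 10**9  # 900_000_000_000

# Assumption: the nanosecond count is (wrongly) treated as milliseconds
misread_step = timedelta(milliseconds=step_ns)

second = start + misread_step
third = start + 2 * misread_step

print(second.isoformat())  # 2051-11-29T16:00:00+00:00
print(third.isoformat())   # 2080-06-06T08:00:00+00:00
```

Both values match the second and third entries of the incorrect result, which suggests a unit mismatch between the index resolution (ms) and the frequency step, though the actual cause would need to be confirmed in the pandas source.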
Expected Behavior
The expected result in this case is that l is returned, since every timestamp in r is also in l.
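As a pandas-independent cross-check of that expectation, the same timestamps can be rebuilt with the standard library and combined as a plain set union. This is only a sketch of the expected values; `l_vals`, `r_vals`, and `expected` are illustrative names, and the code says nothing about the `name` or `freq` attributes of the resulting index.

```python
from datetime import datetime, timedelta, timezone

start = datetime(2023, 5, 24, tzinfo=timezone.utc)

# Same timestamps as l (15-minute steps) and r (30-minute steps)
l_vals = [start + timedelta(minutes=15 * i) for i in range(5)]
r_vals = [start + timedelta(minutes=30 * i) for i in range(3)]

# Sorted set union of both sides
expected = sorted(set(l_vals) | set(r_vals))

# r is a subset of l, so the union is exactly l
assert set(r_vals) <= set(l_vals)
assert expected == l_vals
```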
Installed Versions
INSTALLED VERSIONS
commit : ba1cccd
python : 3.10.16.final.0
python-bits : 64
OS : Linux
OS-release : 6.12.10-200.fc41.x86_64
Version : #1 SMP PREEMPT_DYNAMIC Fri Jan 17 18:05:24 UTC 2025
machine : x86_64
processor :
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 2.1.0
numpy : 1.26.4
pytz : 2024.2
dateutil : 2.9.0.post0
tzdata : 2025.1