Description
Pandas version checks
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- I have confirmed this bug exists on the main branch of pandas.
Reproducible Example
import pandas as pd
import datetime as dt
index1 = pd.date_range(start=dt.datetime(2021,10,28), periods=3, freq='1D', tz='Europe/London')
index2 = pd.date_range(start=dt.datetime(2021,10,30), periods=4, freq='1D', tz='Europe/London')
index1.union(index2)
Issue Description
result:
DatetimeIndex(['2021-10-28 00:00:00+01:00', '2021-10-29 00:00:00+01:00',
               '2021-10-30 00:00:00+01:00', '2021-10-31 00:00:00+01:00',
               '2021-10-31 23:00:00+00:00', '2021-11-01 23:00:00+00:00',
               '2021-11-02 23:00:00+00:00'],
              dtype='datetime64[ns, Europe/London]', freq='D')
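When computing the union of two daily DatetimeIndex objects, one of which spans a daylight-saving time change (here the switch from BST to GMT on 2021-10-31), the result is incorrect: the entries after the change fall at 23:00 instead of midnight, and the union has seven entries instead of the expected six.
Depending on the inputs, the wrong union omits dates from the result or adds extra ones.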
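The 23:00:00+00:00 entries in the result above have the shape you would get by stepping a fixed 24 hours across the clock change rather than one calendar day. The standard-library sketch below only illustrates that arithmetic; it is not a statement about what pandas does internally, and the variable names are made up for the example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

london = ZoneInfo("Europe/London")

# Local midnight on the day the clocks go back (still BST, +01:00).
start = datetime(2021, 10, 31, 0, 0, tzinfo=london)

# Fixed 24-hour step: do the arithmetic in UTC, then convert back.
fixed_step = (start.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(london)

# Calendar-day step: the next local midnight (now GMT, +00:00).
calendar_step = datetime(2021, 11, 1, 0, 0, tzinfo=london)

# Subtract in UTC; subtracting two datetimes that share a tzinfo
# would compare wall-clock times and hide the extra hour.
elapsed = calendar_step.astimezone(timezone.utc) - start.astimezone(timezone.utc)

print(fixed_step.isoformat())     # 2021-10-31T23:00:00+00:00
print(calendar_step.isoformat())  # 2021-11-01T00:00:00+00:00
print(elapsed)                    # 1 day, 1:00:00
```

The local day of 2021-10-31 is 25 hours long, so a 24-hour step lands one hour short of the next midnight.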
Expected Behavior
expected result:
DatetimeIndex(['2021-10-28 00:00:00+01:00', '2021-10-29 00:00:00+01:00',
               '2021-10-30 00:00:00+01:00', '2021-10-31 00:00:00+01:00',
               '2021-11-01 00:00:00+00:00', '2021-11-02 00:00:00+00:00'],
              dtype='datetime64[ns, Europe/London]', freq='D')
The correct result can be obtained by first removing the freq from one of the indexes:
index1.freq = None
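Put together with the reproducible example, the workaround could look like the sketch below. The `expected` index and the `midnight_only` check are additions for illustration, assuming that `pd.date_range` with a daily frequency and a `tz` yields local midnights, as it does for the inputs in this report.

```python
import datetime as dt

import pandas as pd

index1 = pd.date_range(start=dt.datetime(2021, 10, 28), periods=3, freq="1D", tz="Europe/London")
index2 = pd.date_range(start=dt.datetime(2021, 10, 30), periods=4, freq="1D", tz="Europe/London")

# Workaround from this report: drop the freq on one index before the union.
index1.freq = None
result = index1.union(index2)

# Illustrative check (not part of the original report): six daily
# entries from 2021-10-28 to 2021-11-02, all at local midnight.
expected = pd.date_range(start=dt.datetime(2021, 10, 28), periods=6, freq="1D", tz="Europe/London")
midnight_only = all(ts.hour == 0 for ts in result)

print(result)
print(len(result), midnight_only)
```

Note that the index produced this way would not be expected to carry `freq='D'`, since the freq was removed from one of the inputs.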
Installed Versions
INSTALLED VERSIONS
commit : bb1f651
python : 3.8.6.final.0
python-bits : 64
OS : Linux
OS-release : 5.10.25-linuxkit
Version : #1 SMP Tue Mar 23 09:27:39 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : en_US.UTF-8
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8
pandas : 1.4.0
numpy : 1.22.1
pytz : 2021.3
dateutil : 2.8.2
pip : 21.0.1
setuptools : 49.6.0.post20210108
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.7.1
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 7.20.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
fsspec : 2022.01.0
gcsfs : None
matplotlib : 3.4.2
numba : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 6.0.1
pyreadstat : None
pyxlsb : None
s3fs : 2022.01.0
scipy : 1.7.3
sqlalchemy : 1.3.23
tables : None
tabulate : None
xarray : 0.18.2
xlrd : None
xlwt : None
zstandard : None