Description
import pandas
df = pandas.DataFrame({'a': [pandas.Timestamp('20200309T120000.000000-0400'), pandas.Timestamp('20200309T130000.000000-0400')]})
df.astype({'a': pandas.api.types.DatetimeTZDtype(unit='ns', tz='UTC')})
                          a
0 2020-03-09 16:00:00+00:00
1 2020-03-09 17:00:00+00:00
df = pandas.DataFrame({'a': [pandas.Timestamp('20200309T120000.000000-0400'), pandas.Timestamp('20200309T130000.000000-0500')]})
df.astype({'a': pandas.api.types.DatetimeTZDtype(unit='ns', tz='UTC')})
ValueError: Tz-aware datetime.datetime cannot be converted to datetime64 unless utc=True
Similarly, something like df['a'].dt.tz_convert(tz='utc') also returns the same error.
Problem description
Calling pandas.DataFrame.astype on a column of timezone-aware Timestamps with mixed UTC offsets fails to convert the column to UTC.
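A likely contributing factor, offered as an assumption rather than a confirmed diagnosis: when the Timestamps share one offset, pandas infers a single tz-aware datetime dtype, but with mixed offsets the column falls back to object dtype. The sketch below only inspects the inferred dtypes. The names same_offset and mixed_offset are illustrative and not part of the original report.

```python
import pandas

# Both values use -0400, so pandas can infer one tz-aware dtype.
same_offset = pandas.Series([
    pandas.Timestamp('20200309T120000.000000-0400'),
    pandas.Timestamp('20200309T130000.000000-0400'),
])

# The values use -0400 and -0500, so no single tz-aware dtype fits.
mixed_offset = pandas.Series([
    pandas.Timestamp('20200309T120000.000000-0400'),
    pandas.Timestamp('20200309T130000.000000-0500'),
])

print(same_offset.dtype)   # a DatetimeTZDtype with a fixed -04:00 offset
print(mixed_offset.dtype)  # object
```

If the column is object dtype, astype and the .dt accessor are working on generic Python objects rather than a datetime array, which may explain why the conversion path differs from the single-offset case.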
Expected Output
I would expect this to return the same result as df.applymap(lambda x: x.tz_convert(tz='utc')):
                          a
0 2020-03-09 16:00:00+00:00
1 2020-03-09 18:00:00+00:00
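As a possible workaround until astype handles this case, the error message itself points at utc=True. The sketch below passes the column through pandas.to_datetime with utc=True, which should normalize mixed offsets to UTC. The variable name converted is illustrative, and this is a stopgap, not a statement of how astype is meant to behave.

```python
import pandas

df = pandas.DataFrame({'a': [
    pandas.Timestamp('20200309T120000.000000-0400'),
    pandas.Timestamp('20200309T130000.000000-0500'),
]})

# utc=True converts each tz-aware value to UTC before building the datetime column.
converted = pandas.to_datetime(df['a'], utc=True)

print(converted.iloc[0])  # 2020-03-09 16:00:00+00:00
print(converted.iloc[1])  # 2020-03-09 18:00:00+00:00
```

Assigning the result back with df['a'] = converted gives a tz-aware UTC column, after which df['a'].dt.tz_convert(...) can be used to move to another timezone.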
Output of pd.show_versions()
pandas : 1.0.1
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 46.0.0.post20200309
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.12.0
pandas_datareader: None
bs4 : None
bottleneck : None
fastparquet : None
gcsfs : None
lxml.etree : None
matplotlib : None
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pytables : None
pytest : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
xlsxwriter : None
numba : None