Description
Code Sample, a copy-pastable example if possible
import pandas as pd
import numpy as np
int64_info = np.iinfo("int64")
s = pd.Series([int64_info.max, None, int64_info.min], dtype=pd.Int64Dtype())
df = pd.DataFrame({"Int64": s})
df.max()
Int64 9.223372e+18
dtype: float64
Problem description
pd.Int64 data is converted to np.float64 in certain reduction operations on pd.DataFrame. This corrupts the data: float64 cannot represent every 64-bit integer exactly, and avoiding that lossy conversion is precisely what pd.Int64 is intended for.
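To make the corruption concrete, here is a minimal sketch of what happens to the boundary values from the sample above once they pass through a 64-bit float. It uses only plain Python, relying on the fact that Python's float is the same IEEE 754 double format as np.float64; the names INT64_MAX, INT64_MIN and round_tripped are illustrative and not part of pandas.

```python
# Python's float is an IEEE 754 double, the same format as np.float64,
# so this reproduces the rounding without needing pandas or numpy.
INT64_MAX = 2**63 - 1      # same value as np.iinfo("int64").max
INT64_MIN = -(2**63)       # same value as np.iinfo("int64").min

# What a float64-based reduction would hold for the maximum.
as_float = float(INT64_MAX)
round_tripped = int(as_float)

# The maximum is rounded up to 2**63, which no longer fits in int64.
print(round_tripped)               # 9223372036854775808
print(round_tripped - INT64_MAX)   # 1

# The minimum is a power of two, so it happens to survive unchanged.
print(int(float(INT64_MIN)) == INT64_MIN)   # True

# Precision loss already starts just above 2**53.
print(int(float(2**53 + 1)) == 2**53 + 1)   # False
```

The printed 9.223372e+18 in the report is therefore not just a display difference: the underlying float is 2**63, one more than the largest value an int64 can hold.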
Expected Output
df.max() should probably return a pd.Series of dtype='object' wrapping a pd.Int64 value.
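As a rough illustration of the expected shape, the following sketch reduces each column separately and collects the results in an object-dtype Series. It assumes that Series.max() on a nullable Int64 column returns the exact integer, which holds in recent pandas releases but may not in older ones; columnwise_max is a hypothetical helper, not a pandas API.

```python
import numpy as np
import pandas as pd


def columnwise_max(frame: pd.DataFrame) -> pd.Series:
    """Reduce each column on its own so no cross-column float cast occurs.

    Hypothetical helper for illustration only.
    """
    # dtype=object keeps each result as the scalar its column produced.
    return pd.Series({col: frame[col].max() for col in frame.columns}, dtype=object)


int64_info = np.iinfo("int64")
s = pd.Series([int64_info.max, None, int64_info.min], dtype=pd.Int64Dtype())
df = pd.DataFrame({"Int64": s})

result = columnwise_max(df)
print(result.dtype)
print(result["Int64"])
```

This trades speed for exactness, since it loops over columns in Python, so it is meant as a workaround sketch rather than a proposed implementation.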
Output of pd.show_versions()
pandas : 1.1.0.dev0+779.g27ad77971
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.0
pip : 19.3.1
setuptools : 42.0.2.post20191203
Cython : 0.29.14
pytest : 5.3.5
hypothesis : 5.4.1
sphinx : 2.4.0
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 2.11.1
IPython : 7.12.0
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.4.0.dev0+62.g8ac3a4c8
fastparquet : 0.3.2
gcsfs : None
matplotlib : None
numexpr : 2.7.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 0.11.1
pytables : None
pyxlsb : None
s3fs : 0.4.0
scipy : 1.4.1
sqlalchemy : None
tables : 3.6.1
tabulate : None
xarray : 0.14.1
xlrd : None
xlwt : None
numba : 0.48.0
</details>