Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas.
- (optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
import pandas as pd
import numpy as np
# Generate a 4x6 DataFrame
df = pd.DataFrame(np.arange(24).reshape(4, 6), columns=list("abcdef"))
# Make each column a different data type
df = df.astype({"a":"float16", "b":"float32", "c":"float64", "d":"int8", "e":"int16", "f":"int32"})
print(df)
# Output:
# a b c d e f
# 0 0.0 1.0 2.0 3 4 5
# 1 6.0 7.0 8.0 9 10 11
# 2 12.0 13.0 14.0 15 16 17
# 3 18.0 19.0 20.0 21 22 23
print(df.dtypes)
# Output:
# a float16
# b float32
# c float64
# d int8
# e int16
# f int32
# dtype: object
# Rolling minimum across rows
print(df.rolling(window=2, min_periods=1, axis=1).min())
# Output. Notice how the float16 and float32 columns were removed:
# c d e f
# 0 2.0 2.0 3.0 4.0
# 1 8.0 8.0 9.0 10.0
# 2 14.0 14.0 15.0 16.0
# 3 20.0 20.0 21.0 22.0
Problem description
It seems that rolling operations along rows (axis=1) incorrectly omit columns containing float16s and float32s. The same operations work as expected along columns (axis=0), however.
Expected Output
# Convert float16 and float32 columns to float64s as a workaround
df = df.astype({"a":"float64", "b":"float64"})
# Rolling minimum across rows again
print(df.rolling(window=2, min_periods=1, axis=1).min())
# Output:
# a b c d e f
# 0 0.0 0.0 1.0 2.0 3.0 4.0
# 1 6.0 6.0 7.0 8.0 9.0 10.0
# 2 12.0 12.0 13.0 14.0 15.0 16.0
# 3 18.0 18.0 19.0 20.0 21.0 22.0
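Since the same operation works along columns, another possible workaround is to transpose, roll along axis=0, and transpose back. This is only a sketch and not part of the original report; it assumes that transposing the mixed-dtype frame upcasts every value to float64, so the original narrow dtypes are lost in the result.

```python
import numpy as np
import pandas as pd

# Same 4x6 frame as in the code sample, one dtype per column
df = pd.DataFrame(np.arange(24).reshape(4, 6), columns=list("abcdef"))
df = df.astype({"a": "float16", "b": "float32", "c": "float64",
                "d": "int8", "e": "int16", "f": "int32"})

# Transpose so the original columns become rows, roll along axis=0
# (the default), then transpose back to the original orientation.
result = df.T.rolling(window=2, min_periods=1).min().T
print(result)
```

The result keeps all six columns, which avoids touching the dtypes of df itself.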
Possible Cause
A change made in #36458, specifically this line.
It seems that "float" is an alias specifically for np.float64, not np.float32 or np.float16. Changing that line to
obj = obj.select_dtypes(include="number", exclude=["timedelta"])
to include all numeric dtypes seemed to fix the issue in this case. I can open a PR if there are no objections to this approach.
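The dtype reasoning above can be checked outside of rolling. The sketch below is illustrative only: the timedelta column "g" is a hypothetical addition, included to show why the proposed selector excludes "timedelta" (np.timedelta64 sits under np.number in NumPy's scalar type hierarchy).

```python
import numpy as np
import pandas as pd

# The string "float" resolves to one NumPy dtype: float64.
float_alias = np.dtype("float")
print(float_alias)

# float16 and float32 are not subclasses of float64,
# but all three are subclasses of np.number.
narrow_is_float64 = [issubclass(t, np.float64) for t in (np.float16, np.float32)]
all_are_number = [issubclass(t, np.number)
                  for t in (np.float16, np.float32, np.float64)]

df = pd.DataFrame(np.arange(24).reshape(4, 6), columns=list("abcdef"))
df = df.astype({"a": "float16", "b": "float32", "c": "float64",
                "d": "int8", "e": "int16", "f": "int32"})
# Hypothetical extra column, not in the original report
df["g"] = pd.to_timedelta(np.arange(4), unit="s")

# Selector proposed above: every numeric dtype, minus timedeltas
selected = df.select_dtypes(include="number", exclude=["timedelta"])
print(list(selected.columns))
```

With this selector, the float16 and float32 columns are kept alongside the float64 and integer columns, and only the timedelta column is dropped.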
Output of pd.show_versions()
INSTALLED VERSIONS
commit : 2cb9652
python : 3.9.1.final.0
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.19041
machine : AMD64
processor : AMD64 Family 23 Model 8 Stepping 2, AuthenticAMD
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : English_Canada.1252
pandas : 1.2.4
numpy : 1.20.3
pytz : 2021.1
dateutil : 2.8.1
pip : 21.1.2
setuptools : 57.0.0
Cython : None
pytest : None
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : 4.6.3
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : 7.24.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.4.2
numexpr : None
odfpy : None
openpyxl : 3.0.7
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : None
sqlalchemy : None
tables : None
tabulate : None
xarray : 0.18.2
xlrd : None
xlwt : None
numba : None