Description
Code Sample, a copy-pastable example if possible
This is just one case that can go wrong, but other edge cases are possible (see listing below)
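import numpy as np
import pandas as pd

series = pd.Series([0], dtype=np.float32)
# Negative float64 bound whose magnitude is too small to be represented in float32
upper = pd.Series([-np.finfo(np.float64).tiny], dtype=np.float64)
# upper must be passed by keyword; the first positional argument of clip is lower
series.clip(upper=upper, inplace=True)
assert (series <= upper).all()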
Problem description
Series.clip, Series.clip_upper and Series.clip_lower may return wrong results if the bound arguments have a higher precision than the series (e.g. float64 vs. float32, but other combinations are possible as well) and inplace=True was given. This is because pandas cannot change the dtype of the clipped series in that case and seems to run some additional conversion. The following edge cases are possible:
- the upper bound gets larger when converted to the lower precision (this is the example shown above)
- the lower bound gets smaller when converted to the lower precision
- the upper bound is negative and so large in magnitude that it cannot be represented by the lower-precision float
- the lower bound is positive and so large that it cannot be represented by the lower-precision float
- the range between lower and upper is so tiny that no lower-precision float lies within it
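To make the edge cases concrete, here is a small NumPy-only sketch of what happens to a float64 bound when it is cast to float32. The concrete bound values are my own picks for illustration, and this only shows the plain cast, not what pandas does internally.

```python
import numpy as np

tiny64 = float(np.finfo(np.float64).tiny)

with np.errstate(over="ignore", under="ignore"):
    # Upper bound moves up: -tiny underflows to -0.0 in float32
    upper_cast = np.float32(-tiny64)
    # Lower bound moves down: +tiny underflows to 0.0 in float32
    lower_cast = np.float32(tiny64)
    # Bounds beyond the float32 range overflow to -inf / +inf
    neg_overflow = np.float32(-1e300)
    pos_overflow = np.float32(1e300)
    # A float64 range narrower than the float32 spacing near 1.0
    lo, hi = 1.0 + 2.0 ** -40, 1.0 + 2.0 ** -39
    lo_cast, hi_cast = np.float32(lo), np.float32(hi)

# Compare as Python floats so both sides are float64
print(float(upper_cast) > -tiny64)      # cast upper bound exceeds the original
print(float(lower_cast) < tiny64)       # cast lower bound is below the original
print(neg_overflow, pos_overflow)
print(float(lo_cast) < lo, float(hi_cast) < lo)  # both casts fall outside [lo, hi]
```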
Expected Output
For 1. and 2.: find the closest lower-precision float that satisfies the bound check
For 3. and 4.: return -inf and +inf, respectively
For 5.: return NaN
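A rough sketch of how the expected bounds could be computed, assuming float32 as the target dtype. The helpers narrow_upper and narrow_lower are hypothetical names of mine and not part of pandas; they cast the bound and then step to the neighbouring float with np.nextafter if the cast moved the bound in the wrong direction.

```python
import numpy as np


def narrow_upper(bound, dtype=np.float32):
    """Largest value of `dtype` that is <= bound (hypothetical helper)."""
    bound = float(bound)
    with np.errstate(over="ignore", under="ignore"):
        cast = dtype(bound)
    # Compare as Python floats so the comparison happens in float64
    if float(cast) > bound:
        cast = np.nextafter(cast, dtype(-np.inf))
    return cast


def narrow_lower(bound, dtype=np.float32):
    """Smallest value of `dtype` that is >= bound (hypothetical helper)."""
    bound = float(bound)
    with np.errstate(over="ignore", under="ignore"):
        cast = dtype(bound)
    if float(cast) < bound:
        cast = np.nextafter(cast, dtype(np.inf))
    return cast


tiny64 = float(np.finfo(np.float64).tiny)
up = narrow_upper(-tiny64)        # cases 1 and 2: closest float32 inside the bound
low = narrow_lower(tiny64)
neg_inf = narrow_upper(-1e300)    # case 3
pos_inf = narrow_lower(1e300)     # case 4
# Case 5: the narrowed bounds cross, so no float32 lies in the range
lo5 = narrow_lower(1.0 + 2.0 ** -40)
hi5 = narrow_upper(1.0 + 2.0 ** -39)
range_is_empty = float(lo5) > float(hi5)
```

With bounds narrowed like this, a clip carried out entirely in float32 would satisfy the original float64 bound check for cases 1 to 4, and crossing bounds signal case 5, where NaN would be returned.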
Output of pd.show_versions()
pandas: 0.23.1
pytest: 3.4.0
pip: 10.0.1
setuptools: 39.1.0
Cython: 0.28.2
numpy: 1.14.3
scipy: 1.0.0
pyarrow: 0.9.0
xarray: None
IPython: 6.1.0
sphinx: 1.6.7
patsy: 0.5.0
dateutil: 2.7.3
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: None
numexpr: 2.6.5
feather: None
matplotlib: 2.2.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 1.0.1
sqlalchemy: 1.2.5
pymysql: None
psycopg2: 2.7.4 (dt dec pq3 ext lo64)
jinja2: 2.8.1
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
</details>