DOC: Clarify fill_value behavior in arithmetic ops #19653

Closed
@HagaiHargil

When adding two DataFrames with df1.add(df2), one can use the fill_value parameter to fill in any NaNs that might come up. This parameter seems pretty broken:

import numpy as np
import pandas as pd

a = pd.DataFrame(np.ones((3, 2)))
b = pd.DataFrame(np.ones((4, 3)))
print(a + b)  # all good:
#      0    1   2
# 0  2.0  2.0 NaN
# 1  2.0  2.0 NaN
# 2  2.0  2.0 NaN
# 3  NaN  NaN NaN

print(a.add(b))  # all good
print(a.add(b, fill_value=0))  # broken
#      0    1    2
# 0  2.0  2.0  1.0
# 1  2.0  2.0  1.0
# 2  2.0  2.0  1.0
# 3  1.0  1.0  1.0

As you can see, it filled the NaNs with 1.0. Changing to fill_value=1 fills everything with 2.0. However, changing some of the values inside b leads to more peculiar results, and I couldn't connect the dots and find a pattern.
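For what it's worth, the numbers above look consistent with fill_value being substituted for a missing operand before the addition, rather than being written into the result afterwards. The pandas documentation describes it that way and adds that a cell missing in both frames stays missing. Here is a minimal sketch of that reading. Setting one cell of b to NaN is my own addition to the example, made only to show the "missing in both" case:

```python
import numpy as np
import pandas as pd

a = pd.DataFrame(np.ones((3, 2)))
b = pd.DataFrame(np.ones((4, 3)))
# Assumption for illustration: blank out a cell of b that a does not have either,
# so that position is missing in both operands.
b.iloc[3, 2] = np.nan

result = a.add(b, fill_value=0)

# Present only in b: a's missing value is treated as 0, giving 0 + 1.0
only_in_b = result.loc[0, 2]
# Present in both frames: ordinary addition, fill_value is not involved
in_both = result.loc[0, 0]
# Missing in both frames: the result stays NaN
in_neither = result.loc[3, 2]

# With fill_value=1 the same "only in b" cell becomes 1 + 1.0
only_in_b_fill1 = a.add(b, fill_value=1).loc[0, 2]

print(only_in_b, in_both, in_neither, only_in_b_fill1)
```

If that reading is right, the 1.0 entries in the original output are 0 + 1.0 from b, and the 2.0 entries seen with fill_value=1 are 1 + 1.0.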

This was observed on Python 3.6 on both Linux and Windows.

Thanks.

```
commit: None
python: 3.6.3.final.0
python-bits: 64
OS: Linux
OS-release: 3.10.0-514.el7.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.22.0
pytest: None
pip: 9.0.1
setuptools: 38.4.0
Cython: 0.27.3
numpy: 1.14.0
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 6.2.1
sphinx: 1.6.7
patsy: None
dateutil: 2.6.1
pytz: 2017.3
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.1.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None
```
