
Maybe wrong default axis with operators (add, sub, mul, div) between datetime-indexed df and series 1.0.0 #31487

Closed
@giuliobeseghi


Code Sample, a copy-pastable example if possible

import pandas as pd

index = pd.date_range(start='2020', periods=5)
df = pd.DataFrame([[1, 2, 3]] * 5, columns=['a', 'b', 'c'], index=index)
series = pd.Series([10, 20, 30, 40, 50], index=index)

print(df + series)
            2020-01-01 00:00:00  2020-01-02 00:00:00  2020-01-03 00:00:00  2020-01-04 00:00:00  2020-01-05 00:00:00    a    b    c
2020-01-01                  NaN                  NaN                  NaN                  NaN                  NaN  NaN  NaN  NaN
2020-01-02                  NaN                  NaN                  NaN                  NaN                  NaN  NaN  NaN  NaN
2020-01-03                  NaN                  NaN                  NaN                  NaN                  NaN  NaN  NaN  NaN
2020-01-04                  NaN                  NaN                  NaN                  NaN                  NaN  NaN  NaN  NaN
2020-01-05                  NaN                  NaN                  NaN                  NaN                  NaN  NaN  NaN  NaN

Problem description

According to the docs (https://pandas.pydata.org/pandas-docs/stable/getting_started/dsintro.html#data-alignment-and-arithmetic):

When doing an operation between DataFrame and Series, the default behavior is to align the Series index on the DataFrame columns, thus broadcasting row-wise

In the special case of working with time series data, if the DataFrame index contains dates, the broadcasting will be column-wise

It seems to me that broadcasting is now row-wise in both cases, even when the DataFrame index contains dates.
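A minimal sketch of what seems to be happening, assuming pandas 1.0 or later: the `+` operator appears to behave like `df.add(series, axis='columns')`, so the dates in the Series index are matched against the column labels `a`, `b`, `c`. No label is shared, so the result has the union of both label sets as columns and holds only NaN. The names `via_operator`, `via_method`, `n_columns`, `all_missing` and `same_shape` are only illustrative.

```python
import pandas as pd

index = pd.date_range(start='2020', periods=5)
df = pd.DataFrame([[1, 2, 3]] * 5, columns=['a', 'b', 'c'], index=index)
series = pd.Series([10, 20, 30, 40, 50], index=index)

# Operator form: the Series index (dates) is aligned on the DataFrame columns.
via_operator = df + series

# Method form with the axis spelled out; assumed to be the equivalent call.
via_method = df.add(series, axis='columns')

# 5 date labels + 3 column labels, with no overlap between them.
n_columns = via_operator.shape[1]
all_missing = bool(via_operator.isna().all().all())
same_shape = via_operator.shape == via_method.shape

print(n_columns, all_missing, same_shape)
```

Depending on the pandas version, a warning about sorting incomparable labels may be emitted when dates and strings are combined into one column index.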

Is this an expected change for pandas 1.0.0 (I hope not - I never saw any FutureWarnings about it)? If so, the docs (and the examples) must be updated.

The same happens with the operators -, /, * and %.
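For the other operators, the method forms with an explicit `axis=0` should give the column-wise result regardless of the default. The sketch below compares each method against plain NumPy broadcasting; the `ops`, `column`, `results` and `matches` names are hypothetical helpers for illustration, not part of pandas.

```python
import numpy as np
import pandas as pd

index = pd.date_range(start='2020', periods=5)
df = pd.DataFrame([[1, 2, 3]] * 5, columns=['a', 'b', 'c'], index=index)
series = pd.Series([10, 20, 30, 40, 50], index=index)

# DataFrame method name -> NumPy function used as an independent reference.
ops = {
    'sub': np.subtract,
    'mul': np.multiply,
    'div': np.true_divide,
    'mod': np.mod,
}

# Series values reshaped to (5, 1) so NumPy broadcasts them down each column.
column = series.to_numpy()[:, None]

results = {}
matches = {}
for name, reference in ops.items():
    # axis=0 aligns the Series index on the DataFrame index (the dates).
    results[name] = getattr(df, name)(series, axis=0)
    expected = reference(df.to_numpy(), column)
    matches[name] = bool(np.allclose(results[name].to_numpy(), expected))

print(matches)
```

Spelling out `axis=0` keeps the code independent of whichever default the operators use in a given pandas version.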

Expected Output

Not sure if this is the expected output anymore, but it used to be equivalent to:

df.add(series, axis=0)
             a   b   c
2020-01-01  11  12  13
2020-01-02  21  22  23
2020-01-03  31  32  33
2020-01-04  41  42  43
2020-01-05  51  52  53

Although I can't reproduce it right now, I'm pretty sure this was the behaviour up to and including pandas 0.25.3.

Output of pd.show_versions()

INSTALLED VERSIONS

commit : None
python : 3.7.6.final.0
python-bits : 64
OS : Windows
OS-release : 10
machine : AMD64
processor : Intel64 Family 6 Model 60 Stepping 3, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : None.None

pandas : 1.0.0
numpy : 1.18.1
pytz : 2019.3
dateutil : 2.8.1
pip : 20.0.2
setuptools : 45.1.0.post20200127
Cython : 0.29.14
pytest : 5.3.4
hypothesis : 4.54.2
sphinx : 2.3.1
blosc : None
feather : None
xlsxwriter : 1.2.7
lxml.etree : 4.4.2
html5lib : 1.0.1
pymysql : None
psycopg2 : None
jinja2 : 2.10.3
IPython : 7.11.1
pandas_datareader: None
bs4 : 4.8.2
bottleneck : 1.3.1
fastparquet : None
gcsfs : None
lxml.etree : 4.4.2
matplotlib : 3.1.1
numexpr : 2.7.0
odfpy : None
openpyxl : 3.0.3
pandas_gbq : None
pyarrow : None
pytables : None
pytest : 5.3.4
pyxlsb : None
s3fs : 0.4.0
scipy : 1.3.2
sqlalchemy : 1.3.13
tables : 3.6.1
tabulate : None
xarray : None
xlrd : 1.2.0
xlwt : 1.3.0
xlsxwriter : 1.2.7
numba : 0.47.0

Labels: Docs, Numeric Operations (Arithmetic, Comparison, and Logical operations)
