Idea: use df.index/df.columns names to automatically choose axis along which to broadcast #13243

Closed
@supern8ent

In writing some math code in pandas, I find it necessary to do things like

df2 = df.sub(ser, axis='columns')

instead of the shorter and more intuitive

df2 = df - ser

in order to control the axis along which the series is broadcast.

I think it would be a big syntactic improvement if pandas automatically matched the operands on the axis whose name they share and broadcast the series along the other axis.

Example:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 2), columns=['a', 'b'])
df.index.name = 'dim0'
df.columns.name = 'dim1'
df
dim1         a         b
dim0                    
0     0.755744  0.321682
1     0.915464  0.413154
2     0.647672  0.457927

subtract:

df - df['a']    # does not give the desired result
       a   b   0   1   2
dim0                    
0    NaN NaN NaN NaN NaN
1    NaN NaN NaN NaN NaN
2    NaN NaN NaN NaN NaN
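
The all-NaN frame above follows from the default alignment: with the `-` operator, pandas matches the series index against the DataFrame columns, so the row labels 0, 1, 2 find no partner among 'a' and 'b'. A minimal sketch of that behavior, reusing the example frame (random values, so only the NaN pattern and the shape are checked):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(3, 2), columns=["a", "b"])
df.index.name = "dim0"
df.columns.name = "dim1"

# The operator form aligns the Series index with the DataFrame columns.
default = df - df["a"]

# Spelling out the default axis of the flex method for comparison.
explicit = df.sub(df["a"], axis="columns")

# No column label ('a', 'b') matches a Series label (0, 1, 2),
# so the result holds the union of both label sets and every cell is NaN.
print(default.shape)
print(default.isna().all().all())
print(explicit.isna().all().all())
```
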

subtract, specifying which axis to match on (broadcasting happens on the other axis):

df.sub(df['a'], axis='index')   # gives the desired result
dim1    a         b
dim0               
0     0.0 -0.434062
1     0.0 -0.502310
2     0.0 -0.189745

I am suggesting that the "-" operator look at the index names of the two operands and match on the axis whose name is the same in both.
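
The proposed rule can be sketched in user code today. The helper below, `sub_by_name`, is hypothetical and not part of pandas; it assumes the series index name is compared against `df.index.name` and `df.columns.name`, and that the current default behavior is kept when no name matches or the match is ambiguous.

```python
import numpy as np
import pandas as pd


def sub_by_name(df: pd.DataFrame, ser: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: subtract ser from df, matching on the axis
    whose name equals ser.index.name."""
    name = ser.index.name
    matches_index = name is not None and name == df.index.name
    matches_columns = name is not None and name == df.columns.name

    if matches_index and not matches_columns:
        # Match on the rows; the series is broadcast across the columns.
        return df.sub(ser, axis="index")
    if matches_columns and not matches_index:
        # Match on the columns; the series is broadcast down the rows.
        return df.sub(ser, axis="columns")
    # Unnamed or ambiguous: fall back to the existing operator behavior.
    return df - ser


df = pd.DataFrame(np.random.rand(3, 2), columns=["a", "b"])
df.index.name = "dim0"
df.columns.name = "dim1"

# df["a"] carries the index name "dim0", so it is matched on the rows.
result = sub_by_name(df, df["a"])

# df.loc[0] carries the index name "dim1", so it is matched on the columns.
row_result = sub_by_name(df, df.loc[0])
```

Here `result` equals `df.sub(df['a'], axis='index')` from the example above, without the caller naming the axis. Falling back to `df - ser` in the unnamed case is one possible design choice that would leave existing code unaffected.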

By way of motivation, I'm doing mass spectral matching of compounds, so I could name my indices 'chemical' and 'mass'.

Labels: API Design, Indexing