API: alignment behaviour of Series as assignment value in indexing #37516

Open
@jorisvandenbossche

Follow-up to #37427.

Currently, there is an inconsistency between plain [] indexing (__setitem__) and loc in whether the RHS value of an assignment gets re-aligned.

In this example, the values of the Series are assigned as-is, ignoring its index; the only requirement is that its length matches the number of positions selected by the indexer:

import numpy as np
import pandas as pd

s = pd.Series(range(5))
idx = np.array([1, 4])

>>> s[idx] = pd.Series([10, 11])
>>> s   
0     0
1    10
2     2
3     3
4    11
dtype: int64

With loc the behaviour is different: the Series is first aligned with s (reindexed to s.index), the aligned Series is then indexed with idx, and those values are set. Label 1 therefore receives 11, and label 4, which is missing from the RHS index, receives NaN:

s = pd.Series(range(5))
idx = np.array([1, 4])

>>> s.loc[idx] = pd.Series([10, 11])
>>> s 
0     0.0
1    11.0
2     2.0
3     3.0
4     NaN
dtype: float64

What happens can be written more explicitly as:

s.loc[idx] = pd.Series([10, 11]).reindex(s.index)[idx]
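The two alignment steps can be inspected separately without performing the assignment. This is a minimal sketch rather than a description of the pandas internals; the names `rhs`, `aligned` and `to_assign` are illustrative only and do not appear in the report above.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(5))
idx = np.array([1, 4])
rhs = pd.Series([10, 11])  # default index: labels 0 and 1

# Step 1: align the RHS with the target's index.
# Labels 2, 3 and 4 do not exist in rhs, so they become NaN.
aligned = rhs.reindex(s.index)

# Step 2: select the labels that are being assigned to.
to_assign = aligned.loc[idx]

print(to_assign)  # label 1 holds 11.0, label 4 holds NaN
```

Label 1 picks up the second RHS value (11) because it matches by label, not by position, which is why 10 never appears in the loc result.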

To get the same behaviour as plain [] assignment, you need to assign an array instead of a Series:

s = pd.Series(range(5))
idx = np.array([1, 4])

>>> s.loc[idx] = pd.Series([10, 11]).values
>>> s 
0     0
1    10
2     2
3     3
4    11
dtype: int64
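Until the behaviour is made consistent, one way to avoid depending on either rule is to make the intent explicit on the RHS. The sketch below is a suggestion, not part of the original report; `by_array` and `by_labels` are illustrative names. It uses `Series.to_numpy()`, which serves the same purpose here as `.values`.

```python
import numpy as np
import pandas as pd

idx = np.array([1, 4])

# Option 1: drop the RHS index so the values are assigned by position.
by_array = pd.Series(range(5))
by_array.loc[idx] = pd.Series([10, 11]).to_numpy()

# Option 2: give the RHS the target labels, so that label alignment
# places the values exactly where positional assignment would.
by_labels = pd.Series(range(5))
by_labels.loc[idx] = pd.Series([10, 11], index=idx)

print(by_array.tolist())
print(by_labels.tolist())
```

Both options set label 1 to 10 and label 4 to 11 and leave the other values untouched.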

Metadata

Assignees

No one assigned

    Labels

    API - Consistency (Internal Consistency of API/Behavior), Bug, Indexing (Related to indexing on series/frames, not to indexes themselves)