Follow up on #37427
Currently, there is an inconsistency between `__setitem__` (plain `s[...] = value`) and `loc` in whether the RHS value of an assignment gets re-aligned.
In this example, the values of the Series are assigned as-is, ignoring the Series' index (the only requirement is that its length matches the number of positions selected by the indexer):
>>> import numpy as np
>>> import pandas as pd
>>> s = pd.Series(range(5))
>>> idx = np.array([1, 4])
>>> s[idx] = pd.Series([10, 11])
>>> s
0     0
1    10
2     2
3     3
4    11
dtype: int64
However, with `loc` we have a different behaviour: the Series is first aligned with `s`, then this realigned Series is indexed with `idx`, and those values are set:
>>> s = pd.Series(range(5))
>>> idx = np.array([1, 4])
>>> s.loc[idx] = pd.Series([10, 11])
>>> s
0     0.0
1    11.0
2     2.0
3     3.0
4     NaN
dtype: float64
What happens here can be written more explicitly as:
s.loc[idx] = pd.Series([10, 11]).reindex(s.index)[idx]
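To make the alignment step easier to follow, here is a minimal sketch that computes the intermediate values separately, without performing any assignment. The names `value`, `aligned` and `selected` are only illustrative and do not correspond to anything in the pandas internals.

```python
import numpy as np
import pandas as pd

s = pd.Series(range(5))
idx = np.array([1, 4])
value = pd.Series([10, 11])  # default index: labels 0 and 1

# Step 1: align the RHS with the target's index.
# Labels 2, 3 and 4 do not exist in `value`, so they become NaN.
aligned = value.reindex(s.index)

# Step 2: select the labels named by the indexer.
# Label 1 holds 11.0 (the second element), label 4 holds NaN.
selected = aligned.loc[idx]

print(selected)
```

This shows why 10 is dropped entirely: it sits at label 0 in `value`, which is not one of the labels in `idx`.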
To get the same behaviour as `__setitem__`, you need to assign an array instead of a Series:
>>> s = pd.Series(range(5))
>>> idx = np.array([1, 4])
>>> s.loc[idx] = pd.Series([10, 11]).values
>>> s
0     0
1    10
2     2
3     3
4    11
dtype: int64
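As a rough sketch of how to avoid depending on either behaviour, the two options below make the intent explicit. The first strips the index with `to_numpy()` so the assignment is purely positional; the second gives the RHS the target labels so that alignment produces the intended result. The variable names are illustrative, and the second option assumes `loc` aligns a Series RHS as described above.

```python
import numpy as np
import pandas as pd

idx = np.array([1, 4])
value = pd.Series([10, 11])

# Option 1: drop the index so there is nothing to align on.
s_positional = pd.Series(range(5))
s_positional.loc[idx] = value.to_numpy()

# Option 2: relabel the RHS with the target labels, so that
# alignment maps 10 -> label 1 and 11 -> label 4.
s_aligned = pd.Series(range(5))
s_aligned.loc[idx] = pd.Series(value.to_numpy(), index=idx)

print(s_positional.tolist())
print(s_aligned.tolist())
```

Both variants should end up with 10 at label 1 and 11 at label 4, regardless of how the bare-Series case is eventually resolved.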