BUG: (iloc) Comparisons of Series have different indexing semantics from DataFrame comparisons #4581

Closed
@mairas

Consider the following:

In [3]: df = pd.DataFrame(np.arange(5), columns=["foo"])

In [4]: df["foo"].iloc[:-1] != df["foo"].iloc[1:]
Out[4]:
0    True
1    True
2    True
3    True
dtype: bool

In [5]: df["foo"].iloc[1:] != df["foo"].iloc[:-1]
Out[5]:
1    True
2    True
3    True
4    True
dtype: bool

Apparently, the != (ne) operator compares the values positionally, ignoring the index labels, and then applies the index of the first operand to the result. I don't think this is the expected behaviour.
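To make the two possible semantics concrete, here is a minimal sketch that computes both explicitly. It assumes that aligning the operands with Series.align before comparing is an acceptable way to get label-based behaviour; the names left, right, label_based and positional are only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(5), columns=["foo"])
left = df["foo"].iloc[:-1]   # labels 0..3
right = df["foo"].iloc[1:]   # labels 1..4

# Label-based comparison: align both operands on the union of their
# index labels first, so that both sides carry identical labels.
left_aligned, right_aligned = left.align(right, join="outer")
label_based = left_aligned != right_aligned

# Positional comparison: drop the index and compare the raw values.
positional = left.values != right.values

print(label_based)
print(positional)
```

With explicit alignment, labels that are missing from one operand are filled with NaN, and NaN compares as unequal to everything, so those rows come out True.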

Again, the semantics are different when performing addition:

In [6]: df["foo"].iloc[:-1] + df["foo"].iloc[1:]
Out[6]:
0   NaN
1     2
2     4
3     6
4   NaN
dtype: float64

In [7]: df["foo"].values[:-1] + df["foo"].values[1:]
Out[7]: array([1, 3, 5, 7])

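Here, everything works as expected: Series addition aligns on the index labels (giving NaN where a label is missing from one operand), while the plain NumPy arrays are added positionally.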

I also tried the dataframe ne method:

In [8]: df.iloc[:-1].ne(df.iloc[1:])
Out[8]:
     foo
0   True
1  False
2  False
3  False
4   True

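This works as expected, and differently from Series comparisons: the operands are aligned on their index labels, and labels missing from one side compare as unequal. The corresponding ne method doesn't exist for Series objects, so I couldn't test that.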
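For readers on a pandas version where Series does have an ne method (an assumption; it was not available in the version used for this report), the following sketch checks whether it matches the DataFrame behaviour shown above. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(5), columns=["foo"])
left = df["foo"].iloc[:-1]   # labels 0..3
right = df["foo"].iloc[1:]   # labels 1..4

# Assumption: Series.ne exists in the installed pandas version.
# The method form aligns both operands on their index labels.
series_result = left.ne(right)

# The DataFrame equivalent from the report, reduced to its only column.
frame_result = df.iloc[:-1].ne(df.iloc[1:])["foo"]

print(series_result)
print(series_result.equals(frame_result))
```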

Labels: Bug, Error Reporting, Indexing
