Allow for near misses when doing alignment with Float64Index? #9530

Closed
@shoyer

The "exact match" requirement for index alignment works poorly with Float64Index when the index values are the result of floating point computations.

Here's an example of how the current behavior results in a poor user experience:

In [22]: s1 = pd.Series(range(3), np.arange(3.0))

In [23]: s2 = pd.Series(range(3), s1.index + 0.15 - 0.15)

In [24]: s1 + s2
Out[24]:
0     0
1   NaN
1   NaN
2     4
dtype: float64

In [25]: s1.index
Out[25]: Float64Index([0.0, 1.0, 2.0], dtype='float64')

In [26]: s2.index
Out[26]: Float64Index([0.0, 1.0, 2.0], dtype='float64')

In [27]: s1.index[1]
Out[27]: 1.0

In [28]: s2.index[1]
Out[28]: 0.99999999999999989
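
To make the mismatch above easier to see, here is a minimal sketch that compares the two indexes both exactly and within a tolerance. The `atol=1e-9` threshold is an arbitrary choice for illustration, not something pandas uses internally.

```python
import numpy as np
import pandas as pd

# Same construction as in the session above.
s1 = pd.Series(range(3), np.arange(3.0))
s2 = pd.Series(range(3), s1.index + 0.15 - 0.15)

a = s1.index.values
b = s2.index.values

# Exact comparison: any label that drifted during the arithmetic compares unequal.
exact = a == b

# Tolerant comparison: differences at the level of float64 rounding are ignored.
# atol=1e-9 is an assumed threshold chosen only for this example.
close = np.isclose(a, b, rtol=0.0, atol=1e-9)

# Largest absolute drift between corresponding labels.
max_drift = np.abs(a - b).max()

print("exact:", exact)
print("close:", close)
print("max drift:", max_drift)
```

The repr of the index rounds for display, which is why `s2.index` prints as `1.0` even though the stored label differs.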

reindex/joins with method='nearest' will help (#9258), but automatic alignment will remain broken. Ideally, we should figure out some way for alignment to ignore very small differences that are almost certainly due to the precision limits of floating point arithmetic.
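
As a rough sketch of the manual workarounds available in the meantime, the snippet below shows two options: snapping one series onto the other's labels with `reindex(method='nearest')`, and rounding both indexes before adding. It assumes a pandas version in which `reindex` accepts a `tolerance` argument; the `tol` value and the `rounded` helper are hypothetical and exist only for illustration. Neither approach changes how automatic alignment behaves.

```python
import numpy as np
import pandas as pd

s1 = pd.Series(range(3), np.arange(3.0))
s2 = pd.Series(range(3), s1.index + 0.15 - 0.15)

tol = 1e-9  # assumed tolerance, picked only for this example

# Workaround 1: snap s2 onto s1's labels. Labels farther than `tol` from any
# s2 label would become NaN instead of being matched silently.
s2_snapped = s2.reindex(s1.index, method="nearest", tolerance=tol)
total_nearest = s1 + s2_snapped


# Workaround 2: round both indexes to a fixed number of decimals before adding.
# `rounded` is a hypothetical helper, not a pandas function.
def rounded(s, decimals=9):
    out = s.copy()
    out.index = pd.Index(np.round(s.index.values, decimals))
    return out


total_rounded = rounded(s1) + rounded(s2)

print(total_nearest)
print(total_rounded)
```

Both workarounds require the caller to know in advance that the labels may have drifted, which is the gap this issue is about.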

I'm not entirely sure what the right solution here looks like, but I wanted to open this to discussion.

CC @hugadams, who I'm guessing has encountered this sort of issue when doing computations on indexes for scikit-spectra.

Labels: Enhancement, Index, Indexing
