BUG/inconsistency: IntervalIndex.get_loc gives a location for non-exact inputs #19349

Open
@topper-123

Description

This one is a bit complex to explain, but I'll do my best.

Currently IntervalIndex.get_indexer fails if the other index does not consist solely of Interval objects (there's also another bug, but let's keep it simple here).

The underlying issue is that IntervalIndex.get_indexer depends on IntervalIndex.get_loc, which is ambiguous in how it treats numeric inputs:

>>> import pandas as pd
>>> ii = pd.IntervalIndex.from_breaks([0, 1, 2, 3])
>>> ii.get_loc(pd.Interval(1, 2))
1  # ok
>>> ii.get_loc(1)  # do we mean exactly 1, or an interval that contains the number 1?
1  # ambiguous

The problem is that get_loc returns a location for both exact matches and inexact matches (i.e. when the numeric input falls inside an interval). For the purposes of get_indexer, however, this behaviour fails, because get_indexer needs get_loc to find exact matches only.
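To make the distinction concrete, here is a minimal pandas-free sketch of the two lookup semantics. Everything in it is hypothetical and for illustration only: intervals are modelled as plain (left, right) tuples, assumed to be right-closed, and loc_exact / loc_containing are made-up helper names, not pandas functions.

```python
# Hypothetical model, not pandas code: each interval is a (left, right)
# tuple, assumed right-closed, i.e. left < x <= right.
intervals = [(0, 1), (1, 2), (2, 3)]


def loc_exact(intervals, key):
    """Return the position of the entry equal to `key`, else raise KeyError."""
    for pos, interval in enumerate(intervals):
        if interval == key:
            return pos
    raise KeyError(key)


def loc_containing(intervals, point):
    """Return the position of the first interval that contains `point`."""
    for pos, (left, right) in enumerate(intervals):
        if left < point <= right:
            return pos
    raise KeyError(point)


exact_hit = loc_exact(intervals, (1, 2))        # interval key, exact match
contained_hit = loc_containing(intervals, 1.5)  # numeric key, containment match
```

Both calls yield a position, but only the first is an exact match. In this model, loc_exact(intervals, 1.5) raises KeyError, which is the behaviour get_indexer needs.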

See #19021 (comment) for further discussion.

Solution

One solution would be to add a 'strict' option to the method parameter of IntervalIndex.get_loc, so that only exact matches are returned.

This wasn't so difficult after all, and I've already opened a PR for it; see #19353.
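As a rough illustration of the 'strict' idea, the sketch below shows how a get_indexer-style function could rely on a strict lookup. It is not the code from the PR: it is a pure-Python model in which intervals are (left, right) tuples assumed to be right-closed, the function names merely mirror the pandas ones, and the exact shape of the method argument is an assumption. The only pandas convention borrowed is that get_indexer marks missing targets with -1.

```python
def get_loc(intervals, key, method=None):
    """Hypothetical lookup: exact matches always count; containment matches
    for numeric keys count only when method is not "strict"."""
    for pos, (left, right) in enumerate(intervals):
        if (left, right) == key:
            return pos  # exact match on an interval key
        if method != "strict" and not isinstance(key, tuple) and left < key <= right:
            return pos  # inexact match: the number falls inside the interval
    raise KeyError(key)


def get_indexer(intervals, targets):
    """Map each target to its position, or -1 if there is no exact match."""
    result = []
    for target in targets:
        try:
            result.append(get_loc(intervals, target, method="strict"))
        except KeyError:
            result.append(-1)
    return result


ivs = [(0, 1), (1, 2), (2, 3)]
loose = get_loc(ivs, 1.5)                           # non-strict: containment allowed
indexer = get_indexer(ivs, [(1, 2), 1.5, (5, 6)])   # strict: exact matches only
```

With the strict lookup, a mixed target list no longer produces spurious positions for bare numbers: only the (1, 2) entry is located, and the other two targets map to -1.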

Labels: API Design, Bug, Indexing (related to indexing on series/frames, not to indexes themselves), Interval (Interval data type)
