PERF: Fix performance regression in get_loc of IntervalIndex #51339
pandas/core/arrays/interval.py
Outdated
@@ -727,7 +730,7 @@ def __getitem__(
         if np.ndim(left) > 1:
             # GH#30588 multi-dimensional indexer disallowed
             raise ValueError("multi-dimensional indexing not allowed")
-        return self._shallow_copy(left, right)
+        return self._shallow_copy(left, right, verify_integrity=False)
Couldn't we also skip _ensure_simple_new_inputs, i.e. just do
return self._simple_new(left, right, dtype=self.dtype)
let's see if it works, probably not necessary
pandas/core/arrays/interval.py
Outdated
@@ -675,7 +677,8 @@ def _shallow_copy(self: IntervalArrayT, left, right) -> IntervalArrayT:
         """
         dtype = IntervalDtype(left.dtype, closed=self.closed)
         left, right, dtype = self._ensure_simple_new_inputs(left, right, dtype=dtype)
-        self._validate(left, right, dtype=dtype)
+        if verify_integrity:
so is this still necessary?
Not for this regression, but I think we should look through all usages to see if it's necessary everywhere
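For readers following the thread, here is a minimal sketch of the two options discussed above: gating validation behind a `verify_integrity` flag in `_shallow_copy`, and building the result through an unchecked `_simple_new` fast path. `ToyIntervalArray` is a hypothetical pure-Python stand-in written for illustration; it is not the pandas `IntervalArray` and its checks are simplified assumptions.

```python
from __future__ import annotations


class ToyIntervalArray:
    """Hypothetical stand-in for an interval container (not the pandas class)."""

    def __init__(self, left: list[float], right: list[float]) -> None:
        # Public constructor: always validates, which is O(n) in the data size.
        self._validate(left, right)
        self._left = list(left)
        self._right = list(right)

    @classmethod
    def _simple_new(cls, left: list[float], right: list[float]) -> ToyIntervalArray:
        # Fast path: trust the caller and skip every check.
        obj = object.__new__(cls)
        obj._left = left
        obj._right = right
        return obj

    @staticmethod
    def _validate(left: list[float], right: list[float]) -> None:
        if len(left) != len(right):
            raise ValueError("left and right must have the same length")
        for lo, hi in zip(left, right):
            if lo > hi:
                raise ValueError("left side of interval must be <= right side")

    def _shallow_copy(
        self, left: list[float], right: list[float], verify_integrity: bool = True
    ) -> ToyIntervalArray:
        # Callers that pass arbitrary new data keep the default and get validation.
        if verify_integrity:
            self._validate(left, right)
        return self._simple_new(left, right)

    def __getitem__(self, key: slice) -> ToyIntervalArray:
        # A slice of already-validated data is still valid, so the check is skipped.
        return self._shallow_copy(
            self._left[key], self._right[key], verify_integrity=False
        )

    def __len__(self) -> int:
        return len(self._left)
```

In this toy version, indexing never pays the O(n) validation cost, while `_shallow_copy` called with its default still rejects inconsistent inputs.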
pandas/core/arrays/interval.py:729: error: Argument 2 to "_simple_new" of "IntervalArray" has incompatible type "Union[Period, Timestamp, Timedelta, NaTType, DatetimeArray, TimedeltaArray, ndarray[Any, Any]]"; expected "Union[Union[DatetimeArray, TimedeltaArray], ndarray[Any, Any]]" [arg-type]
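An `arg-type` error of this shape usually means the value's static type still includes the scalar members of a union at the point of the call. The sketch below shows one common way to narrow such a union with `isinstance` before calling a stricter function. All names here (`Scalar`, `ArrayLike`, `take`, `simple_new`, `getitem`) are hypothetical, and this is not necessarily how the PR resolved the error.

```python
from __future__ import annotations

from typing import Union


class Scalar:
    """Hypothetical scalar result (stands in for Timestamp, Period, ...)."""

    def __init__(self, value: float) -> None:
        self.value = value


class ArrayLike:
    """Hypothetical array result (stands in for ndarray, DatetimeArray, ...)."""

    def __init__(self, values: list[float]) -> None:
        self.values = list(values)


def take(data: ArrayLike, key: Union[int, slice]) -> Union[Scalar, ArrayLike]:
    # Indexing may return either a scalar or an array, hence the union.
    if isinstance(key, int):
        return Scalar(data.values[key])
    return ArrayLike(data.values[key])


def simple_new(left: ArrayLike, right: ArrayLike) -> tuple[ArrayLike, ArrayLike]:
    # Stricter signature: only array-likes are accepted.
    return left, right


def getitem(left: ArrayLike, right: ArrayLike, key: Union[int, slice]):
    new_left = take(left, key)
    new_right = take(right, key)
    if isinstance(new_left, Scalar) or isinstance(new_right, Scalar):
        # Scalar branch is handled separately and returns early.
        return new_left, new_right
    # Past the early return, a type checker can treat both values as ArrayLike.
    return simple_new(new_left, new_right)
```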
LGTM pending green
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.
The regression was caused in 184e1674.
No need to call validate here
https://asv-runner.github.io/asv-collection/pandas/#indexing.IntervalIndexing.time_loc_list