REF/TST: Fix remaining DatetimeArray with DateOffset arithmetic ops #23789
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -7,7 +7,7 @@ | |
|
||
import numpy as np | ||
|
||
from pandas._libs import tslibs | ||
from pandas._libs import algos, tslibs | ||
from pandas._libs.tslibs import NaT, Timedelta, Timestamp, iNaT | ||
from pandas._libs.tslibs.fields import get_timedelta_field | ||
from pandas._libs.tslibs.timedeltas import ( | ||
|
@@ -24,7 +24,7 @@ | |
from pandas.core.dtypes.missing import isna | ||
|
||
from pandas.core import ops | ||
from pandas.core.algorithms import checked_add_with_arr | ||
from pandas.core.algorithms import checked_add_with_arr, unique1d | ||
import pandas.core.common as com | ||
|
||
from pandas.tseries.frequencies import to_offset | ||
|
@@ -162,15 +162,29 @@ def _simple_new(cls, values, freq=None, dtype=_TD_DTYPE): | |
result._freq = freq | ||
return result | ||
|
||
def __new__(cls, values, freq=None, dtype=_TD_DTYPE): | ||
def __new__(cls, values, freq=None, dtype=_TD_DTYPE, copy=False): | ||
|
||
freq, freq_infer = dtl.maybe_infer_freq(freq) | ||
|
||
values = np.array(values, copy=False) | ||
if values.dtype == np.object_: | ||
values = array_to_timedelta64(values) | ||
values, inferred_freq = sequence_to_td64ns( | ||
values, copy=copy, unit=None) | ||
if inferred_freq is not None: | ||
|
||
if freq is not None and freq != inferred_freq: | ||
raise ValueError('Inferred frequency {inferred} from passed ' | ||
|
||
'values does not conform to passed frequency ' | ||
'{passed}' | ||
.format(inferred=inferred_freq, | ||
passed=freq.freqstr)) | ||
elif freq is None: | ||
freq = inferred_freq | ||
freq_infer = False | ||
|
||
result = cls._simple_new(values, freq=freq) | ||
# check that we are matching freqs | ||
I think you could simplify a lot of these checks if also pass
I'll take a look at this.
I think you're right that we can change this from a 3-4 liner into a 1-2 liner. Since this pattern shows up in all four of the TDA/DTA/TDI/DTI constructors (actually, future tense for DTA), I'd like to change them all at once in a dedicated follow-up. |
||
if inferred_freq is None and len(result) > 0: | ||
if freq is not None and not freq_infer: | ||
cls._validate_frequency(result, freq) | ||
|
||
if freq_infer: | ||
result.freq = to_offset(result.inferred_freq) | ||
|
||
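The frequency reconciliation added to the constructor in this hunk can be sketched in isolation. This is only an illustration of the branch structure, not pandas code: `reconcile_freq` is a hypothetical helper, frequencies are modeled as plain strings rather than `DateOffset` objects, and the `freq_infer` flag is left out.

```python
def reconcile_freq(passed, inferred):
    """Hypothetical helper mirroring the branches shown in the diff.

    `passed` is the user-supplied frequency (or None); `inferred` is the
    frequency carried by the input values (or None). Plain strings stand
    in for DateOffset objects, an assumption made for illustration.
    """
    if inferred is not None:
        if passed is not None and passed != inferred:
            # Both are known and they disagree: refuse to guess.
            raise ValueError(
                "Inferred frequency {inferred} from passed values does not "
                "conform to passed frequency {passed}".format(
                    inferred=inferred, passed=passed))
        if passed is None:
            # Nothing was passed, so adopt what the values carry.
            return inferred
    # Either nothing could be inferred or the two frequencies agree.
    return passed
```

When nothing can be inferred from the input, the diff falls back to `cls._validate_frequency` on the constructed result; that step is not modeled here.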
|
@@ -227,6 +241,22 @@ def _validate_fill_value(self, fill_value): | |
"Got '{got}'.".format(got=fill_value)) | ||
return fill_value | ||
|
||
# is_monotonic_increasing, is_monotonic_decreasing, and is_unique | ||
# are needed by `frequencies.infer_freq`, which is called when accessing | ||
# the `inferred_freq` property inside the TimedeltaArray constructor | ||
|
||
@property # NB: override with cache_readonly in immutable subclasses | ||
def is_monotonic_increasing(self): | ||
Where is this used in this PR?
When calling the constructor with
I think I already expressed this previously, but I would prefer that we have this discussion for all the Arrays, not just for TimedeltaArray (or DatetimeArray), as this is not a datetime-specific attribute. If you don't want to have that discussion first, you can always make it a private attribute here, and check that in
If this is a discussion you'd like to have for EAs more generally, go ahead and open an issue for it. I would be +1 on putting these attributes in the mixin class so that they are available on all three of DTA/TDA/PA, but for now they are only needed on TDA. Special-casing inside
@jbrockmendel I would really prefer if you leave this out of this PR (I mean adding the attributes, which means special casing this in Looking at the code again, I think the easiest to do is to ensure what is passed to
Converting to an Index would break the "Array should be ignorant of Index" rule discussed elsewhere.
I'll special-case this to get this over with, but I maintain this is introducing a code smell. |
||
return algos.is_monotonic(self.asi8, timelike=True)[0] | ||
|
||
@property # NB: override with cache_readonly in immutable subclasses | ||
def is_monotonic_decreasing(self): | ||
return algos.is_monotonic(self.asi8, timelike=True)[1] | ||
|
||
@property # NB: override with cache_readonly in immutable subclasses | ||
def is_unique(self): | ||
return len(unique1d(self.asi8)) == len(self) | ||
|
||
|
||
# ---------------------------------------------------------------- | ||
# Arithmetic Methods | ||
|
||
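For readers unfamiliar with the three properties added in this hunk, here is a rough pure-Python sketch of what the non-cached versions compute over the underlying integer values. The function names are illustrative, and the NaT handling implied by `timelike=True` is not modeled.

```python
def is_monotonic(values):
    """Return (increasing, decreasing) flags for a list of integers.

    Mirrors the tuple indexing in the diff, where [0] is the increasing
    flag and [1] the decreasing flag. Equal neighbours count as monotonic
    in both directions.
    """
    pairs = list(zip(values, values[1:]))
    increasing = all(a <= b for a, b in pairs)
    decreasing = all(a >= b for a, b in pairs)
    return increasing, decreasing


def is_unique(values):
    """True when no value repeats; one full pass over the data per call."""
    return len(set(values)) == len(values)
```

Each call rescans the data, which is the cost the reviewers weigh against the cached, engine-based versions available on the immutable Index subclasses.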
|
@@ -281,7 +311,7 @@ def _add_datetimelike_scalar(self, other): | |
result = checked_add_with_arr(i8, other.value, | ||
arr_mask=self._isnan) | ||
result = self._maybe_mask_results(result) | ||
return DatetimeArrayMixin(result, tz=other.tz) | ||
return DatetimeArrayMixin(result, tz=other.tz, freq=self.freq) | ||
|
||
def _addsub_offset_array(self, other, op): | ||
# Add or subtract Array-like of DateOffset objects | ||
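One plausible reading of why `freq=self.freq` can be passed through in `_add_datetimelike_scalar`: adding a single datetime to every element shifts the whole array by the same amount, so the spacing between neighbours is unchanged. The sketch below shows this with the standard library only; it uses naive datetimes and does not model time zones or DST transitions.

```python
from datetime import datetime, timedelta

# Evenly spaced timedeltas, standing in for a TimedeltaArray with a fixed freq.
step = timedelta(hours=6)
deltas = [i * step for i in range(4)]

# Adding one scalar datetime to every element.
anchor = datetime(2018, 11, 19)
shifted = [anchor + d for d in deltas]

# The gaps between consecutive results all equal the original step.
gaps = {later - earlier for earlier, later in zip(shifted, shifted[1:])}
```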
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -227,6 +227,12 @@ def _format_native_types(self, na_rep=u'NaT', date_format=None, **kwargs): | |
# ------------------------------------------------------------------- | ||
# Wrapping TimedeltaArray | ||
|
||
# override non-caching implementations from TimedeltaArray with | ||
# _engine-based implementations that take advantage of Index immutability | ||
is_monotonic_increasing = Index.is_monotonic_increasing | ||
is_monotonic_decreasing = Index.is_monotonic_decreasing | ||
is_unique = Index.is_unique | ||
You need to time this. We have a dedicated routine in the Cython engine for this; the main reason for using it is that it is an O(n) op once the hashtable is computed, rather than a non-cached computation, which is what you did above.
Right. TimedeltaIndex uses the _engine-based implementation that is available because we know TDI is immutable. TimedeltaArray uses the naive implementation, at least for now. There's an issue to investigate caching on PeriodArray, which can be extended to TDA/DTA when the time comes.
This is not obvious at all. I would expect a comment on this. Maybe even do this in a separate PR, with testing for this. I am not sure it's relevant to the changes in this PR.
I'll add a comment. It is definitely relevant to this PR, since without implementing these in the TDA calls to |
||
|
||
__mul__ = Index.__mul__ | ||
__rmul__ = Index.__rmul__ | ||
__truediv__ = Index.__truediv__ | ||
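The trade-off debated in this thread (recompute on every access for a mutable array, cache once for an immutable index) can be illustrated with a small stand-alone sketch. The class names are hypothetical, and the standard library's `functools.cached_property` is used as a stand-in for the caching pandas does on Index objects; it is not the mechanism pandas itself uses.

```python
from functools import cached_property


class MutableValues:
    """Hypothetical stand-in for an array whose contents may change."""

    def __init__(self, values):
        self.values = list(values)
        self.scans = 0  # counts full passes, for demonstration only

    @property
    def is_unique(self):
        # Recomputed on every access, since the data could have changed.
        self.scans += 1
        return len(set(self.values)) == len(self.values)


class ImmutableValues(MutableValues):
    """Hypothetical stand-in for an immutable index built on the array."""

    @cached_property
    def is_unique(self):
        # Computed once; safe only because the values never change.
        self.scans += 1
        return len(set(self.values)) == len(self.values)


array_like = MutableValues([1, 2, 3])
index_like = ImmutableValues([1, 2, 3])
for obj in (array_like, index_like):
    obj.is_unique
    obj.is_unique
```

After two accesses each, `array_like.scans` is 2 while `index_like.scans` is 1, which is the difference the override in the diff is meant to preserve for the Index subclass.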
|