DatetimeIndex.get_slice_bound(...) raises TypeErrors for unexpected YYYY-MM-DD/datetime.date/Timestamp combinations #35690

Closed
@RhysU

Below are a handful of reproducible observations from pandas 1.0.5 regarding DatetimeIndex.get_slice_bound(...), where I'm genuinely unsure what the correct behavior should be in each case. Some of these appear related to #34077, since slice_locs(...) uses get_slice_bound(...) under the covers.

Observations inlined and expected behaviors discussed afterwards:

########################################
OBSERVATIONS 1 using a UTC DatetimeIndex
########################################

import datetime
import pandas as pd
import pandas.util.testing as put

# Generate a UTC-localized DatetimeIndex
df = put.makeTimeDataFrame()
df = df.tz_localize("utc")
index = df.index

# Show the generated DatetimeIndex, which should look like:
# DatetimeIndex(['2000-01-03 00:00:00+00:00', '2000-01-04 00:00:00+00:00',
#                ...
#                '2000-02-10 00:00:00+00:00', '2000-02-11 00:00:00+00:00'],
#               dtype='datetime64[ns, UTC]', freq='B')
print(index)

# (A) When the date is inside the DatetimeIndex, this call completes.
index.get_slice_bound(datetime.date(2000, 1, 7), kind="ix", side="left")

# (B) Notice date before start of index
# TypeError: searchsorted requires compatible dtype or scalar, not date
index.get_slice_bound(datetime.date(2000, 1, 1), kind="ix", side="left")

# (C) Notice date after end of index
# TypeError: searchsorted requires compatible dtype or scalar, not date
index.get_slice_bound(datetime.date(2020, 1, 1), kind="ix", side="left")


# (D) When the Timestamp is inside the DatetimeIndex, this call completes
index.get_slice_bound(pd.Timestamp("2000-01-07"), kind="ix", side="left")

# (E) Notice Timestamp before start of index
# TypeError: Cannot compare tz-naive and tz-aware datetime-like objects
index.get_slice_bound(pd.Timestamp("2000-01-01"), kind="ix", side="left")

# (F) Notice Timestamp after end of index
# TypeError: Cannot compare tz-naive and tz-aware datetime-like objects
index.get_slice_bound(pd.Timestamp("2020-01-01"), kind="ix", side="left")

Discussion:

  1. Above, I do not know if (A)-(C) should behave as (E)-(F) or not.
  2. Above, I suspect (D) should raise as (E)-(F) do.
  3. I can see an argument that a datetime.date, which carries no tzinfo, could be assumed to take the DatetimeIndex's tzinfo. Then (A)-(C) would not raise.
  4. I can see an argument that a Timestamp lacking tzinfo could likewise be assumed to take the DatetimeIndex's tzinfo. Then (D)-(F) would not raise.
  5. I don't expect any data-dependence, meaning that the specific YYYY-MM-DD should not affect whether a call raises.
  6. I have not tested datetime.datetime under these circumstances.
  7. I believe some of these behaviors may have changed since the 0.2-series.
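
As a possible workaround along the lines of items 3 and 4, here is a minimal sketch that coerces the label to a Timestamp carrying the index's timezone before the lookup. The helper name coerce_label is hypothetical (it is not a pandas API), the index is rebuilt with pd.date_range to mirror the one printed above, and searchsorted is used as a stand-in for get_slice_bound on the assumption that the index is sorted:

```python
import datetime

import pandas as pd


def coerce_label(label, index):
    """Hypothetical helper: convert a date/datetime/Timestamp into a
    Timestamp, localizing it to the index's timezone if it is naive."""
    ts = pd.Timestamp(label)
    if index.tz is not None and ts.tzinfo is None:
        # Assumption: a naive label is meant to be in the index's timezone.
        ts = ts.tz_localize(index.tz)
    return ts


# Assumption: 30 business days from 2000-01-03, UTC-localized,
# mirroring the index shown in the observations above.
index = pd.date_range("2000-01-03", periods=30, freq="B", tz="UTC")

inside = index.searchsorted(coerce_label(datetime.date(2000, 1, 7), index), side="left")
before = index.searchsorted(coerce_label(datetime.date(2000, 1, 1), index), side="left")
after = index.searchsorted(coerce_label(pd.Timestamp("2020-01-01"), index), side="left")

print(inside, before, after)
```

With this coercion, the in-range, before-range, and after-range labels should all resolve to a position regardless of the specific YYYY-MM-DD. Whether pandas itself ought to do this inference is the open question of this issue.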

Again, observations inlined and expected behaviors discussed afterwards:

########################################################
OBSERVATIONS 2 using a DatetimeIndex with tzinfo is None
########################################################

import datetime
import pandas as pd
import pandas.util.testing as put

# Generate a non-localized DatetimeIndex
df = put.makeTimeDataFrame()
index = df.index
assert index.tzinfo is None

# Show the generated DatetimeIndex, which should look like:
# DatetimeIndex(['2000-01-03', '2000-01-04', '2000-01-05', '2000-01-06',
#                ...
#                '2000-02-10', '2000-02-11'],
#               dtype='datetime64[ns]', freq='B')
print(index)

# (G) When the date is inside the DatetimeIndex, this call completes.
index.get_slice_bound(datetime.date(2000, 1, 7), kind="ix", side="left")

# (H) Notice date before start of index
# TypeError: searchsorted requires compatible dtype or scalar, not date
index.get_slice_bound(datetime.date(2000, 1, 1), kind="ix", side="left")

# (I) Notice date after end of index
# TypeError: searchsorted requires compatible dtype or scalar, not date
index.get_slice_bound(datetime.date(2020, 1, 1), kind="ix", side="left")


# (J) When the Timestamp is inside the DatetimeIndex, this call completes
index.get_slice_bound(pd.Timestamp("2000-01-07"), kind="ix", side="left")

# (K) Call completes for Timestamp before start of index
index.get_slice_bound(pd.Timestamp("2000-01-01"), kind="ix", side="left")

# (L) Call completes for Timestamp after end of index
index.get_slice_bound(pd.Timestamp("2020-01-01"), kind="ix", side="left")

Discussion:

  1. Above, I expect (H)-(I) to behave as (G).
  2. Above, (J)-(L) appear correct to me.
  3. I don't expect any data-dependence, meaning that the specific YYYY-MM-DD should not affect whether a call raises.
  4. I have not tested datetime.datetime under these circumstances.
  5. I believe some of these behaviors may have changed since the 0.2-series.
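
For the tz-naive case, a minimal sketch of the behavior I would expect for (G)-(I): converting each datetime.date to a pd.Timestamp by hand before the lookup. The index here is rebuilt with pd.date_range as an assumption to mirror the one printed above, and searchsorted stands in for get_slice_bound on the assumption that the index is sorted:

```python
import datetime

import pandas as pd

# Assumption: 30 business days from 2000-01-03, no timezone,
# mirroring the non-localized index shown above.
index = pd.date_range("2000-01-03", periods=30, freq="B")
assert index.tz is None

# The same three dates as cases (G), (H), and (I).
dates = [
    datetime.date(2000, 1, 7),   # inside the index
    datetime.date(2000, 1, 1),   # before the start
    datetime.date(2020, 1, 1),   # after the end
]

# Convert each date to a Timestamp explicitly, then look up its position.
positions = [int(index.searchsorted(pd.Timestamp(d), side="left")) for d in dates]
print(positions)
```

Once the dates are converted, all three lookups go through the same Timestamp path as (J)-(L), so none of them depends on whether the date falls inside the index.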

INSTALLED VERSIONS
------------------
commit           : None
python           : 3.6.10.final.0
python-bits      : 64
OS               : Linux
OS-release       : 4.14.67-ts1
machine          : x86_64
processor        : 
byteorder        : little
LC_ALL           : en_US.UTF-8
LANG             : en_US.utf8
LOCALE           : en_US.UTF-8

pandas           : 1.0.5
numpy            : 1.16.6
pytz             : 2019.2
dateutil         : 2.8.0
pip              : 19.0.3
setuptools       : 40.8.0
Cython           : 0.29.20
pytest           : 5.3.1
hypothesis       : 3.57.0
sphinx           : 1.8.5
blosc            : 1.5.1
feather          : None
xlsxwriter       : 1.0.2
lxml.etree       : 4.3.4
html5lib         : 1.0.1
pymysql          : None
psycopg2         : None
jinja2           : 2.11.1
IPython          : 7.5.0
pandas_datareader: None
bs4              : 4.9.1
bottleneck       : 1.2.1
fastparquet      : 0.3.3
gcsfs            : None
lxml.etree       : 4.3.4
matplotlib       : 3.0.3
numexpr          : 2.6.4
odfpy            : None
openpyxl         : None
pandas_gbq       : None
pyarrow          : 1.0.0
pytables         : None
pytest           : 5.3.1
pyxlsb           : None
s3fs             : 0.4.2
scipy            : 1.5.1
sqlalchemy       : 1.2.1
tables           : 3.5.2
tabulate         : 0.8.3
xarray           : 0.10.0
xlrd             : 1.1.0
xlwt             : 1.3.0
xlsxwriter       : 1.0.2
numba            : 0.50.1

Labels

Bug
Datetime: Datetime data dtype
Indexing: Related to indexing on series/frames, not to indexes themselves
Timezones: Timezone data dtype