BUG: pd.util.hash_array fails on DatetimeIndex with tz specified #41817

Closed
@TheNeuralBit

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

import pandas as pd

pd.util.hash_array(pd.DatetimeIndex(['2018-10-28 01:20:00'], tz='Europe/Berlin'))

Output:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-389898c5af02> in <module>
----> 1 pd.util.hash_array(pd.DatetimeIndex(['2018-10-28 01:20:00'], tz='Europe/Berlin'))

~/.pyenv/versions/3.8.6/envs/beam/lib/python3.8/site-packages/pandas/core/util/hashing.py in hash_array(vals, encoding, hash_key, categorize)
    255         return _hash_categorical(vals, encoding, hash_key)
    256     elif is_extension_array_dtype(dtype):
--> 257         vals, _ = vals._values_for_factorize()
    258         dtype = vals.dtype
    259 

AttributeError: 'DatetimeIndex' object has no attribute '_values_for_factorize'

Apparently datetime64[ns, Europe/Berlin] is treated as an extension array dtype, so hash_array takes the extension array branch, but the DatetimeIndex passed in has no _values_for_factorize method. I've reproduced this on pandas 1.1.4, 1.2.4, and on master (503ce50).
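
To illustrate why only the timezone-aware case reaches the failing branch, here is a small sketch that checks both dtypes with the public pandas.api.types.is_extension_array_dtype helper. The variable names are mine and only for illustration.

```python
import pandas as pd
from pandas.api.types import is_extension_array_dtype

# Same timestamp, once without and once with a timezone
naive = pd.DatetimeIndex(["2018-10-28 01:20:00"])
aware = pd.DatetimeIndex(["2018-10-28 01:20:00"], tz="Europe/Berlin")

# The tz-naive dtype is a plain NumPy dtype; the tz-aware dtype is a
# pandas extension dtype (DatetimeTZDtype)
naive_is_ea = is_extension_array_dtype(naive.dtype)
aware_is_ea = is_extension_array_dtype(aware.dtype)

print(naive.dtype, naive_is_ea)
print(aware.dtype, aware_is_ea)
```

The tz-naive index reports False and the tz-aware index reports True, which matches the branch shown in the traceback.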

Problem description

pd.util.hash_array works with other Indexes, including a timezone-naive DatetimeIndex, so it seems reasonable to expect it to work with a timezone-aware DatetimeIndex as well (or at least to raise a clearer error).

Expected Output

The output should be similar to that of a timezone-naive DatetimeIndex:

In [3]: pd.util.hash_array(pd.DatetimeIndex(['2018-10-28 01:20:00']))
Out[3]: array([3152239034440746192], dtype=uint64)
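
In the meantime, a possible workaround (my own sketch, not something confirmed by the maintainers) is to hash the underlying integer values via asi8, or to go through pd.util.hash_pandas_object, which accepts an Index. Note that hashing asi8 only covers the UTC-based instants, so the timezone itself does not contribute to the hash.

```python
import pandas as pd

aware = pd.DatetimeIndex(["2018-10-28 01:20:00"], tz="Europe/Berlin")

# asi8 exposes the timestamps as an int64 ndarray of UTC-based integers,
# which hash_array handles through its numeric path
hashed_from_ints = pd.util.hash_array(aware.asi8)

# hash_pandas_object takes the Index itself and returns a uint64 Series
hashed_from_index = pd.util.hash_pandas_object(aware)

print(hashed_from_ints.dtype, len(hashed_from_ints))
print(hashed_from_index.dtype, len(hashed_from_index))
```

Whether either of these matches what hash_array should return for the index directly is an open question for the fix.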

Labels: Bug, Needs Triage