Description
- I have checked that this issue has not already been reported.
- I have confirmed this bug exists on the latest version of pandas. (does not exist in 1.2.4, only seen in 1.3.0rc1 and master)
- (optional) I have confirmed this bug exists on the master branch of pandas. (confirmed on 0b68d87)
Code Sample, a copy-pastable example
In [1]: import pandas as pd
In [2]: pd.util.hash_array(pd.DatetimeIndex(['2018-10-28 01:20:00'], tz='Europe/Berlin'))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-2-389898c5af02> in <module>
----> 1 pd.util.hash_array(pd.DatetimeIndex(['2018-10-28 01:20:00'], tz='Europe/Berlin'))
~/working_dir/pandas/pandas/core/util/hashing.py in hash_array(vals, encoding, hash_key, categorize)
287 elif not isinstance(vals, np.ndarray):
288 # i.e. ExtensionArray
--> 289 vals, _ = vals._values_for_factorize()
290
291 return _hash_ndarray(vals, encoding, hash_key, categorize)
AttributeError: 'DatetimeIndex' object has no attribute '_values_for_factorize'
In [3]: pd.util.hash_array(pd.Index([1,2,3]))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-3-e66efa244441> in <module>
----> 1 pd.util.hash_array(pd.Index([1,2,3]))
~/working_dir/pandas/pandas/core/util/hashing.py in hash_array(vals, encoding, hash_key, categorize)
287 elif not isinstance(vals, np.ndarray):
288 # i.e. ExtensionArray
--> 289 vals, _ = vals._values_for_factorize()
290
291 return _hash_ndarray(vals, encoding, hash_key, categorize)
AttributeError: 'Int64Index' object has no attribute '_values_for_factorize'
In [4]: pd.util.hash_array(pd.RangeIndex(1,3))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-4-af0900aae979> in <module>
----> 1 pd.util.hash_array(pd.RangeIndex(1,3))
~/working_dir/pandas/pandas/core/util/hashing.py in hash_array(vals, encoding, hash_key, categorize)
287 elif not isinstance(vals, np.ndarray):
288 # i.e. ExtensionArray
--> 289 vals, _ = vals._values_for_factorize()
290
291 return _hash_ndarray(vals, encoding, hash_key, categorize)
AttributeError: 'RangeIndex' object has no attribute '_values_for_factorize'
Problem description
This issue looks similar to #41817, but that report is specific to DatetimeIndex with a tz defined, whereas this failure appears to occur for any Index instance passed to pd.util.hash_array.
Expected Output
A hash of the input index, as in pandas < 1.3.0.
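Until this is resolved, a possible workaround (a sketch only, not a statement of the intended fix) is to avoid passing the Index object itself to hash_array. The example below assumes a plain integer Index. The variable names are illustrative, and a tz-aware DatetimeIndex may need different handling because its to_numpy() result is not a simple numeric array.

```python
import numpy as np
import pandas as pd

idx = pd.Index([1, 2, 3])

# Workaround 1: give hash_array a plain NumPy array instead of the Index.
hashed_values = pd.util.hash_array(idx.to_numpy())

# Workaround 2: hash_pandas_object accepts an Index directly and
# returns a uint64 Series with one hash per element.
hashed_index = pd.util.hash_pandas_object(idx, index=False)

print(hashed_values.dtype, len(hashed_values))
print(hashed_index.dtype, len(hashed_index))
```

Both calls return one uint64 hash per element, so either can stand in where the hash of an Index was previously obtained through hash_array.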
Output of pd.show_versions()
pandas : 1.3.0rc1
numpy : 1.19.5
pytz : 2021.1
dateutil : 2.8.1
pip : 20.2.1
setuptools : 49.2.1
Cython : 0.29.22
pytest : 6.2.2
hypothesis : 6.4.0
sphinx : 3.5.1
blosc : 1.10.2
feather : None
xlsxwriter : 1.3.7
lxml.etree : 4.6.2
html5lib : 1.1
pymysql : None
psycopg2 : None
jinja2 : 2.11.3
IPython : 7.21.0
pandas_datareader: None
bs4 : 4.9.3
bottleneck : 1.3.2
fsspec : 0.8.7
fastparquet : 0.5.0
gcsfs : 0.7.2
matplotlib : 3.3.4
numexpr : 2.7.3
odfpy : None
openpyxl : 3.0.6
pandas_gbq : None
pyarrow : 3.0.0
pyxlsb : None
s3fs : 0.5.2
scipy : 1.6.1
sqlalchemy : 1.3.23
tables : 3.6.1
tabulate : 0.8.9
xarray : 0.17.0
xlrd : 2.0.1
xlwt : 1.3.0
numba : 0.52.0