hash_pandas_object fails on empty dataframe with index=False #24318

Open
@TomAugspurger

Description

In [8]: df = pd.DataFrame(index=pd.Index([1, 2, 3]))

In [9]: pd.util.hash_pandas_object(df, index=False)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-23d57ac9e66b> in <module>
----> 1 pd.util.hash_pandas_object(df, index=False)

~/sandbox/pandas/pandas/core/util/hashing.py in hash_pandas_object(obj, index, encoding, hash_key, categorize)
    112         h = _combine_hash_arrays(hashes, num_items)
    113
--> 114         h = Series(h, index=obj.index, dtype='uint64', copy=False)
    115     else:
    116         raise TypeError("Unexpected type for hashing %s" % type(obj))

~/sandbox/pandas/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    244                             'Length of passed values is {val}, '
    245                             'index implies {ind}'
--> 246                             .format(val=len(data), ind=len(index)))
    247                 except TypeError:
    248                     pass

ValueError: Length of passed values is 0, index implies 3

I'm not sure what to do here. I think the output should have the same shape and index as the input. Passing index=False means the index is not included in the hashed values, but the output should still retain the input's index.
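To make the proposed behaviour concrete, here is a minimal sketch of a wrapper that always returns a Series aligned with the input's index. The name `hash_rows_keep_index` is hypothetical and not part of pandas, and the placeholder hash value of 0 for rows with no columns is an assumption for illustration only; the issue does not say what value should be used.

```python
import numpy as np
import pandas as pd


def hash_rows_keep_index(obj, index=True, **kwargs):
    """Hypothetical wrapper (not a pandas API): hash rows and always
    return a uint64 Series that shares obj's index."""
    if isinstance(obj, pd.DataFrame) and obj.shape[1] == 0 and not index:
        # No columns, and the index is excluded, so there is nothing to hash.
        # Assumption: use 0 as a placeholder hash for every row.
        placeholder = np.zeros(len(obj), dtype="uint64")
        return pd.Series(placeholder, index=obj.index, dtype="uint64")
    # Every other case is delegated to pandas unchanged.
    return pd.util.hash_pandas_object(obj, index=index, **kwargs)


df = pd.DataFrame(index=pd.Index([1, 2, 3]))
result = hash_rows_keep_index(df, index=False)
```

With this sketch, `result` has three rows and the same index as `df`, while frames that do have columns go through `pd.util.hash_pandas_object` as before. Whether a constant placeholder is the right value for column-less rows is an open design question.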
