BUG: DataFrame doesn't roundtrip with HDFStore(..., format='table', dropna=True) #37624

Open
@arw2019

xref #37564

Example

In [3]: import numpy as np
   ...: import pandas as pd
   ...: import pandas._testing as tm
   ...: from pandas.tests.io.pytables.common import ensure_clean_path
   ...: 
   ...: df_with_missing = pd.DataFrame(
   ...:             {"col1": [0, np.nan, 2], "col2": [1, np.nan, np.nan]}
   ...:         )
   ...: df_without_missing = pd.DataFrame({"col1":[0, 2], "col2": [1, np.nan]})
   ...: 
   ...: setup_path = '/tmp/store'
   ...: with ensure_clean_path(setup_path) as path:
   ...:     df_with_missing.to_hdf(path, "df_with_missing", dropna=True, index=False, format="table")
   ...:     reloaded = pd.read_hdf(path, "df_with_missing")
   ...:     tm.assert_frame_equal(df_without_missing, reloaded)
   ...: 
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-3-77cd279acb0b> in <module>
     13     df_with_missing.to_hdf(path, "df_with_missing", dropna=True, index=False, format="table")
     14     reloaded = pd.read_hdf(path, "df_with_missing")
---> 15     tm.assert_frame_equal(df_without_missing, reloaded)
     16 

    [... skipping hidden 2 frame]

/workspaces/pandas/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()
     44 
     45 
---> 46 cpdef assert_almost_equal(a, b,
     47                           rtol=1.e-5, atol=1.e-8,
     48                           bint check_dtype=True,

/workspaces/pandas/pandas/_libs/testing.pyx in pandas._libs.testing.assert_almost_equal()
    159             msg = (f"{obj} values are different "
    160                    f"({np.round(diff * 100.0 / na, 5)} %)")
--> 161             raise_assert_detail(obj, msg, lobj, robj, index_values=index_values)
    162 
    163         return True

/workspaces/pandas/pandas/_testing.py in raise_assert_detail(obj, message, left, right, diff, index_values)
   1053         msg += f"\n[diff]: {diff}"
   1054 
-> 1055     raise AssertionError(msg)
   1056 
   1057 

AssertionError: DataFrame.index are different

DataFrame.index values are different (50.0 %)
[left]:  RangeIndex(start=0, stop=2, step=1)
[right]: Int64Index([0, 2], dtype='int64')
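The index mismatch above can be reproduced in memory without an HDF5 file. The sketch below is an illustration, not the HDFStore code path: it assumes that `dropna=True` behaves like `DataFrame.dropna(how="all")` at write time, so the surviving rows keep their original labels `0` and `2` instead of being renumbered. The names `kept`, `labels`, `renumbered` and `expected` are introduced here for illustration only. The expected frame uses float literals because `col1` is float64 once it holds a NaN; the traceback above stops at the index comparison, so this sketch says nothing about how dtypes compare after the roundtrip.

```python
import numpy as np
import pandas as pd

df_with_missing = pd.DataFrame(
    {"col1": [0, np.nan, 2], "col2": [1, np.nan, np.nan]}
)

# Assumption: stand-in for what dropna=True does on write,
# i.e. drop only the rows in which every column is NaN.
kept = df_with_missing.dropna(how="all")

# The surviving rows keep their original labels rather than 0..n-1.
labels = kept.index.tolist()

# Renumbering the rows gives a default RangeIndex again.
renumbered = kept.reset_index(drop=True)

# Expected values written as floats to match the float64 columns.
expected = pd.DataFrame({"col1": [0.0, 2.0], "col2": [1.0, np.nan]})
pd.testing.assert_frame_equal(expected, renumbered)
```

If the retained labels are the intended behavior, the comparison in the example would instead need an expected frame built with `index=[0, 2]`; which of the two is correct is the open question of this issue.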

Labels: Bug; IO HDF5 (read_hdf, HDFStore); Index (Related to the Index class or subclasses)
