BUG: HDFStore doesn't save datetime64[s] right #59004

Closed
@caballerofelipe

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

# **** This doesn't work

import pandas as pd
df = pd.DataFrame(['2001-01-01', '2002-02-02'], dtype='datetime64[s]')
print(df)
# Prints
#            0
# 0 2001-01-01
# 1 2002-02-02

print(df.dtypes)
# Prints
# 0    datetime64[s]
# dtype: object

with pd.HDFStore('deleteme.h5', mode='w') as store:
    store.put(
        'df',
        df,
    )

with pd.HDFStore('deleteme.h5', mode='r') as store:
    df_fromstore = store.get('df')
print(df_fromstore)
# Prints
#                               0
# 0 1970-01-01 00:00:00.978307200
# 1 1970-01-01 00:00:01.012608000

# Delete created file
import pathlib
pathlib.Path('deleteme.h5').unlink()

# **** However this does work

import pathlib

import pandas as pd
df = pd.DataFrame(['2001-01-01', '2002-02-02'], dtype='datetime64[ns]')
print(df)
# Prints
#            0
# 0 2001-01-01
# 1 2002-02-02

print(df.dtypes)
# Prints
# 0    datetime64[ns]
# dtype: object

with pd.HDFStore('deleteme.h5', mode='w') as store:
    store.put(
        'df',
        df,
    )

with pd.HDFStore('deleteme.h5', mode='r') as store:
    df_fromstore = store.get('df')
print(df_fromstore)
# Prints
#            0
# 0 2001-01-01
# 1 2002-02-02

# Delete created file
pathlib.Path('deleteme.h5').unlink()

Issue Description

When a DataFrame with a datetime64[s] column (second resolution, not nanosecond) is saved to an HDF5 file (.h5) and then loaded, the dates come back wrong. My guess is that the values are written with datetime64[s] resolution but interpreted as datetime64[ns] when pandas reads them back, so the stored second counts are treated as nanosecond counts. I haven't tested other datetime resolutions.
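The hypothesis above can be checked without touching HDF5 at all. The sketch below is only an illustration of the suspected unit mix-up, not the actual HDFStore code path: it takes the raw integers behind a datetime64[s] array and reinterprets them as datetime64[ns] using NumPy views. The variable names are made up for this example.

```python
import numpy as np

# The same two dates as in the reproducible example, at second resolution.
seconds = np.array(['2001-01-01', '2002-02-02'], dtype='datetime64[s]')

# Raw integers stored for each date: seconds since 1970-01-01.
raw = seconds.view('int64')

# Reinterpret the same integers as nanosecond counts (no conversion),
# which is what the hypothesis says happens on read.
misread = raw.view('datetime64[ns]')

print(raw)
print(misread)
```

The reinterpreted values land a second or so after 1970-01-01, with the same digits as the wrong output in the reproducible example, which is consistent with a missing unit conversion on read.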

Expected Behavior

Saving a datetime64[s] column to an HDF5 file and loading that file back should return the original dates.
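Until that holds, one possible workaround follows from the fact that the datetime64[ns] case in the reproducible example round-trips correctly: cast to nanoseconds before writing and cast back after reading. The helpers below are hypothetical (they are not part of pandas), assume pandas 2.x, and only handle tz-naive datetime columns. The sketch shows the casts in memory, with the HDFStore calls left out.

```python
import pandas as pd


def to_nanoseconds(df):
    """Return a copy with tz-naive datetime columns cast to ns, plus the original dtypes."""
    original = {
        col: dtype
        for col, dtype in df.dtypes.items()
        if pd.api.types.is_datetime64_dtype(dtype)
    }
    return df.astype({col: 'datetime64[ns]' for col in original}), original


def restore_units(df, original):
    """Cast the datetime columns back to the dtypes recorded by to_nanoseconds."""
    return df.astype(original)


df = pd.DataFrame(['2001-01-01', '2002-02-02'], dtype='datetime64[s]')

df_ns, original = to_nanoseconds(df)      # df_ns is what would go to store.put('df', ...)
df_back = restore_units(df_ns, original)  # apply to the frame returned by store.get('df')

print(df_ns.dtypes)
print(df_back.dtypes)
```

The original dtypes have to be kept somewhere outside the file (or re-specified by the caller), since under this approach the stored data is always nanosecond resolution. Dates outside the range representable at nanosecond resolution cannot be cast this way.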

Installed Versions

UserWarning: Setuptools is replacing distutils.
  warnings.warn("Setuptools is replacing distutils.")

INSTALLED VERSIONS
------------------
commit                : d9cdd2ee5a58015ef6f4d15c7226110c9aab8140
python                : 3.11.9.final.0
python-bits           : 64
OS                    : Darwin
OS-release            : 23.5.0
Version               : Darwin Kernel Version 23.5.0: Wed May  1 20:12:58 PDT 2024; root:xnu-10063.121.3~5/RELEASE_ARM64_T6000
machine               : arm64
processor             : arm
byteorder             : little
LC_ALL                : None
LANG                  : en_US.UTF-8
LOCALE                : en_US.UTF-8

pandas                : 2.2.2
numpy                 : 1.26.4
pytz                  : 2024.1
dateutil              : 2.9.0
setuptools            : 70.0.0
pip                   : 24.0
Cython                : 3.0.8
pytest                : 8.0.0
hypothesis            : None
sphinx                : None
blosc                 : None
feather               : None
xlsxwriter            : 3.1.9
lxml.etree            : 5.1.0
html5lib              : 1.1
pymysql               : None
psycopg2              : None
jinja2                : 3.1.3
IPython               : 8.21.0
pandas_datareader     : None
adbc-driver-postgresql: None
adbc-driver-sqlite    : None
bs4                   : 4.12.3
bottleneck            : None
dataframe-api-compat  : None
fastparquet           : None
fsspec                : None
gcsfs                 : None
matplotlib            : 3.8.2
numba                 : None
numexpr               : 2.10.0
odfpy                 : None
openpyxl              : 3.1.4
pandas_gbq            : None
pyarrow               : None
pyreadstat            : None
python-calamine       : None
pyxlsb                : None
s3fs                  : None
scipy                 : 1.12.0
sqlalchemy            : None
tables                : 3.9.2
tabulate              : None
xarray                : None
xlrd                  : None
zstandard             : None
tzdata                : 2024.1
qtpy                  : None
pyqt5                 : None

Metadata

Labels

Bug; IO HDF5 (read_hdf, HDFStore); Non-Nano (datetime64/timedelta64 with non-nanosecond resolution)
