categories in HDFStore don't filter correctly #13322

Closed
@michaelaye

Code Sample, a copy-pastable example if possible

import pandas as pd

obsids = ['ESP_012345_6789', 'ESP_987654_3210']
imgids = ['APF00006np', 'APF0001imm']
data = [4.3, 9.8]
df = pd.DataFrame(dict(obsids=obsids, imgids=imgids, data=data))
# Write a queryable table ('t' is short for 'table') with plain string columns.
df.to_hdf('testdf_no_cats.hdf', 'df', format='t', data_columns=True)
# Convert the string columns to categoricals and write a second file.
df.obsids = df.obsids.astype('category')
df.imgids = df.imgids.astype('category')
df.to_hdf('testdf_with_cats.hdf', 'df', format='t', data_columns=True)
# 'B' matches no obsid, so both queries should return an empty frame.
print("No categories:")
print(pd.read_hdf('testdf_no_cats.hdf', 'df', where='obsids=B'))
print("With categories:")
print(pd.read_hdf('testdf_with_cats.hdf', 'df', where='obsids=B'))

Actual Output:

No categories:
Empty DataFrame
Columns: [data, imgids, obsids]
Index: []
With categories:
   data      imgids           obsids
0   4.3  APF00006np  ESP_012345_6789

Expected Output

Empty DataFrame for both cases.
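
For comparison, here is a minimal sketch of the same selection done in memory, without HDFStore. It assumes that plain boolean indexing is a fair reference for what the `where` query should return. The helper name `filter_in_memory` is hypothetical and not part of pandas. With this approach the result is empty whether or not the columns are categorical:

```python
import pandas as pd


def filter_in_memory(frame, value):
    """Hypothetical helper: keep rows whose `obsids` equals `value`."""
    return frame[frame["obsids"] == value]


df = pd.DataFrame({
    "obsids": ["ESP_012345_6789", "ESP_987654_3210"],
    "imgids": ["APF00006np", "APF0001imm"],
    "data": [4.3, 9.8],
})

# Same data, with the string columns converted to categoricals.
df_cat = df.copy()
df_cat["obsids"] = df_cat["obsids"].astype("category")
df_cat["imgids"] = df_cat["imgids"].astype("category")

# "B" is not a value (or category) of obsids, so nothing should match.
plain_result = filter_in_memory(df, "B")
cat_result = filter_in_memory(df_cat, "B")

print(plain_result.empty, cat_result.empty)
```

Filtering after a full read is only a reference point (and a possible stopgap for small files); it gives up the on-disk selection that `where` is meant to provide.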

output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.5.1.final.0
python-bits: 64
OS: Darwin
OS-release: 14.5.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8

pandas: 0.13.1+5155.g9baf452
nose: None
pip: 8.1.1
setuptools: 21.2.1
Cython: 0.24
numpy: 1.11.0
scipy: 0.17.1
statsmodels: None
xarray: 0.7.2
IPython: 4.2.0
sphinx: 1.4.1
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: 4.4.1
html5lib: 0.9999999
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.8
boto: None
pandas_datareader: None
