"min_itemsize" doesn't work for MultiIndex columns in table format #12154

Open
@toobaz

import pandas as pd

# All values are strings, stored under MultiIndex columns.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
                  columns=pd.MultiIndex.from_tuples([(1, 'a'), (1, 'b'), (2, 'c')])).astype(str)
store = pd.HDFStore('/tmp/store.hdf')
store.append('test', df, min_itemsize={1: 20})

yields

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-4-cea3042a011b> in <module>()
----> 1 store.append('test', df, min_itemsize={1 : 20})

/home/pietro/nobackup/repo/pandas/pandas/io/pytables.py in append(self, key, value, format, append, columns, dropna, **kwargs)
    915         kwargs = self._validate_format(format, kwargs)
    916         self._write_to_group(key, value, append=append, dropna=dropna,
--> 917                              **kwargs)
    918 
    919     def append_to_multiple(self, d, value, selector, data_columns=None,

/home/pietro/nobackup/repo/pandas/pandas/io/pytables.py in _write_to_group(self, key, value, format, index, append, complib, encoding, **kwargs)
   1260 
   1261         # write the object
-> 1262         s.write(obj=value, append=append, complib=complib, **kwargs)
   1263 
   1264         if s.is_table and index:

/home/pietro/nobackup/repo/pandas/pandas/io/pytables.py in write(self, obj, axes, append, complib, complevel, fletcher32, min_itemsize, chunksize, expectedrows, dropna, **kwargs)
   3783         self.create_axes(axes=axes, obj=obj, validate=append,
   3784                          min_itemsize=min_itemsize,
-> 3785                          **kwargs)
   3786 
   3787         for a in self.axes:

/home/pietro/nobackup/repo/pandas/pandas/io/pytables.py in create_axes(self, axes, obj, validate, nan_rep, data_columns, min_itemsize, **kwargs)
   3466 
   3467         # validate our min_itemsize
-> 3468         self.validate_min_itemsize(min_itemsize)
   3469 
   3470         # validate our metadata

/home/pietro/nobackup/repo/pandas/pandas/io/pytables.py in validate_min_itemsize(self, min_itemsize)
   3105                 raise ValueError(
   3106                     "min_itemsize has the key [%s] which is not an axis or "
-> 3107                     "data_column" % k)
   3108 
   3109     @property

ValueError: min_itemsize has the key [1] which is not an axis or data_column

... which is technically true if "data_column" is read as "queryable column" rather than just "column of data", but it should not be a blocker, at least judging from the documentation.
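To make the failure mode concrete, here is a minimal pure-Python sketch of the kind of check the traceback points at. It is a simplified model, not the actual pandas code: the helper name `check_min_itemsize` and the `queryables` set are hypothetical, and the assumption is that with no `data_columns` declared only the index is queryable, so the top-level column label `1` is rejected.

```python
def check_min_itemsize(min_itemsize, queryables):
    """Simplified stand-in for the validation seen in the traceback.

    Hypothetical helper: rejects any dict key that is neither the
    special key "values" nor one of the queryable names.
    """
    # Non-dict values (None or a plain int) have no keys to validate.
    if not isinstance(min_itemsize, dict):
        return
    for key in min_itemsize:
        if key == "values":
            continue
        if key not in queryables:
            raise ValueError(
                "min_itemsize has the key [%s] which is not an axis or "
                "data_column" % key
            )


# Assumption: no data_columns were declared, so only the index is queryable.
queryables = {"index"}

try:
    check_min_itemsize({1: 20}, queryables)
    outcome = "accepted"
except ValueError as exc:
    outcome = str(exc)

print(outcome)
```

Under this model a key such as `1` can never pass, because a level of a MultiIndex column is not itself a queryable name. If the real check behaves the same way, passing a plain integer or the `"values"` key for `min_itemsize` may sidestep the error, at the cost of applying the size to all string values rather than to one column group.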
