UnicodeDecodeError with html.table_schema = True #16848

Open
@dhirschfeld

I have a DataFrame with an object-dtype column that holds binary data (bytes).

If display.html.table_schema is False, it displays fine: each value is shown as the repr of the byte string.

If I set display.html.table_schema = True, attempting to display the DataFrame raises a UnicodeDecodeError from the JSON conversion.

It would be good if the display also worked when using the table_schema option.
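
As a possible stopgap, the bytes could be turned into text before anything tries to JSON-encode them. The sketch below is only an illustration using the standard library: `bytes_to_text` is a hypothetical helper (not part of pandas), and choosing `errors='backslashreplace'` is an assumption about what a readable fallback should look like.

```python
import json


def bytes_to_text(value):
    """Hypothetical helper: make bytes JSON-safe, pass other values through."""
    if isinstance(value, (bytes, bytearray)):
        # Undecodable bytes such as 0x82 become a literal "\x82" escape
        # in the resulting str instead of raising UnicodeDecodeError.
        return bytes(value).decode('utf-8', errors='backslashreplace')
    return value


raw = b'\x00\x00\x00\x00\x00\x01\x82S'
values = [raw, 'plain text', 3]

# Convert every value, then encode with the standard-library JSON encoder.
safe = [bytes_to_text(v) for v in values]
encoded = json.dumps(safe)
print(encoded)
```

With pandas, the same helper could presumably be applied with `Series.map(bytes_to_text)` before displaying the data. I have not verified that against the table_schema code path.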

In [1]: pd.Series([b'\x00\x00\x00\x00\x00\x01\x82S'])
Out[1]: 
0    b'\x00\x00\x00\x00\x00\x01\x82S'
dtype: object

In [2]: pd.options.display.html.table_schema = True

In [3]: pd.Series([b'\x00\x00\x00\x00\x00\x01\x82S'])
Traceback (most recent call last):

  File "C:\Miniconda3\lib\site-packages\IPython\core\formatters.py", line 336, in __call__
    return method()

  File "C:\Miniconda3\lib\site-packages\pandas\core\generic.py", line 141, in _repr_data_resource_
    payload = json.loads(data.to_json(orient='table'),

  File "C:\Miniconda3\lib\site-packages\pandas\core\generic.py", line 1245, in to_json
    lines=lines)

  File "C:\Miniconda3\lib\site-packages\pandas\io\json\json.py", line 46, in to_json
    date_unit=date_unit, default_handler=default_handler).write()

  File "C:\Miniconda3\lib\site-packages\pandas\io\json\json.py", line 166, in write
    data = super(JSONTableWriter, self).write()

  File "C:\Miniconda3\lib\site-packages\pandas\io\json\json.py", line 90, in write
    default_handler=self.default_handler

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x82 in position 58: invalid start byte

Out[3]: 
0    b'\x00\x00\x00\x00\x00\x01\x82S'
dtype: object

In [4]: pd.get_option('display.encoding')
Out[4]: 'utf-8'
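
For reference, the same failure can be reproduced without pandas, which suggests the value is simply being decoded as UTF-8 somewhere in the JSON conversion. This is a minimal sketch of that reading, not a trace of what pandas actually does internally. The offset reported here differs from the position 58 in the traceback above, presumably because pandas reports an offset into a larger buffer (that part is a guess).

```python
raw = b'\x00\x00\x00\x00\x00\x01\x82S'

# 0x82 is a UTF-8 continuation byte, so it cannot start a character.
try:
    raw.decode('utf-8')
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(decode_error.reason)  # invalid start byte
print(decode_error.start)   # 6, the index of 0x82 within raw
print(repr(raw))            # the repr shown when table_schema is False
```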

Output of pd.show_versions()

INSTALLED VERSIONS
------------------
commit: None
python: 3.6.1.final.0
python-bits: 64
OS: Windows
OS-release: 7
machine: AMD64
processor: Intel64 Family 6 Model 62 Stepping 4, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: None.None

pandas: 0.20.2
pytest: 3.1.2
pip: 9.0.1
setuptools: 33.1.1.post20170320
Cython: 0.25.2
numpy: 1.12.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: 1.6.3
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2017.2
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: 2.5.0a1
xlrd: 1.0.0
xlwt: 1.2.0
xlsxwriter: 0.9.6
lxml: 3.8.0
bs4: 4.6.0
html5lib: 0.999999999
sqlalchemy: 1.1.11
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

    Labels

    Bug; IO HTML (read_html, to_html, Styler.apply, Styler.applymap); IO JSON (read_json, to_json, json_normalize); Unicode (Unicode strings)
