BUG: to_json with objects causing segfault #14256

Closed
@tjader

Description

Code Sample, a copy-pastable example if possible

Serializing a bson ObjectId that was created without an explicit ID works:

>>> import bson
>>> import pandas as pd
>>> pd.DataFrame({'A': [bson.objectid.ObjectId()]}).to_json()
Out[4]: '{"A":{"0":{"binary":"W\\u0e32\\u224cug\\u00fcR","generation_time":1474361586000}}}'
>>> pd.DataFrame({'A': [bson.objectid.ObjectId()], 'B': [1]}).to_json()
Out[5]: '{"A":{"0":{"binary":"W\\u0e4e\\u224cug\\u00fcS","generation_time":1474361614000}},"B":{"0":1}}'

However, if the ObjectId is created from an explicit ID, an exception is raised:

>>> pd.DataFrame({'A': [bson.objectid.ObjectId('574b4454ba8c5eb4f98a8f45')]}).to_json()
Traceback (most recent call last):
  File "/auto/energymdl2/anaconda/envs/commod_20160831/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2885, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-7-c9a20090d481>", line 1, in <module>
    pd.DataFrame({'A': [bson.objectid.ObjectId('574b4454ba8c5eb4f98a8f45')]}).to_json()
  File "/auto/energymdl2/anaconda/envs/commod_20160831/lib/python2.7/site-packages/pandas/core/generic.py", line 1056, in to_json
    default_handler=default_handler)
  File "/auto/energymdl2/anaconda/envs/commod_20160831/lib/python2.7/site-packages/pandas/io/json.py", line 36, in to_json
    date_unit=date_unit, default_handler=default_handler).write()
  File "/auto/energymdl2/anaconda/envs/commod_20160831/lib/python2.7/site-packages/pandas/io/json.py", line 79, in write
    default_handler=self.default_handler)
OverflowError: Unsupported UTF-8 sequence length when encoding string
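
One possible explanation, not confirmed in this report, is that the encoder treats the raw 12-byte `binary` attribute as UTF-8 text. The sketch below checks that idea with the standard library only. `random_id_hex` is reconstructed from the escaped "binary" string in the first successful output above, so it is an assumption. `is_valid_utf8` is a hypothetical helper name. The code is written for Python 3, while the report ran on Python 2.7.

```python
# Hex form of the bytes behind the first successful output, reconstructed
# from its escaped "binary" string (an assumption, not taken from bson).
random_id_hex = "57e0b8b2e2898c7567c3bc52"
# The explicit ID from the failing call.
explicit_id_hex = "574b4454ba8c5eb4f98a8f45"


def is_valid_utf8(hex_string):
    """Return True if the bytes spelled by hex_string decode as UTF-8."""
    try:
        bytes.fromhex(hex_string).decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


print(is_valid_utf8(random_id_hex))    # the generated ID happened to be valid UTF-8
print(is_valid_utf8(explicit_id_hex))  # the explicit ID contains a stray 0xba byte
```

If this reading is right, whether `to_json` succeeds depends on the byte values of the ID rather than on how the ObjectId was constructed.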

Worse, if the DataFrame contains any other column, the entire process dies with a segmentation fault (exit code 139):

>>> pd.DataFrame({'A': [bson.objectid.ObjectId('574b4454ba8c5eb4f98a8f45')], 'B': [1]}).to_json()
Process finished with exit code 139
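
A possible workaround, sketched here under assumptions, is to convert the ObjectId column to plain strings before calling `to_json`, so the encoder never sees the object or its raw bytes. `FakeObjectId` is a hypothetical stand-in for `bson.objectid.ObjectId` so that the sketch runs without pymongo; with the real class, `str()` likewise returns the 24-character hex string.

```python
import pandas as pd


class FakeObjectId(object):
    """Hypothetical stand-in for bson.objectid.ObjectId (not the real class)."""

    def __init__(self, hex_id):
        self.hex_id = hex_id

    def __str__(self):
        # Like the real ObjectId, str() gives the hex form of the ID.
        return self.hex_id


df = pd.DataFrame({'A': [FakeObjectId('574b4454ba8c5eb4f98a8f45')], 'B': [1]})

# Replace the object column with its string form, then serialize.
safe = df.assign(A=df['A'].map(str))
result = safe.to_json()
print(result)
```

This changes the output: the ID is written as a hex string instead of a nested object with `binary` and `generation_time` keys.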

Expected Output

output of pd.show_versions()

pandas: 0.18.1
nose: 1.3.7
pip: 8.1.2
setuptools: 26.1.1
Cython: 0.24
numpy: 1.10.4
scipy: 0.17.0
statsmodels: 0.6.1
xarray: 0.7.2
IPython: 4.1.2
sphinx: 1.3.5
patsy: 0.4.1
dateutil: 2.5.2
pytz: 2016.6.1
blosc: None
bottleneck: 1.0.0
tables: 3.2.2
numexpr: 2.5.2
matplotlib: 1.5.1
openpyxl: 2.3.2
xlrd: 0.9.4
xlwt: 1.0.0
xlsxwriter: 0.8.4
lxml: 3.6.0
bs4: 4.3.2
html5lib: 0.999
httplib2: 0.9.2
apiclient: 1.5.0
sqlalchemy: 1.0.13
pymysql: None
psycopg2: None
jinja2: 2.8
boto: 2.39.0
pandas_datareader: None

pymongo version is 3.3.0

Labels: Bug, IO JSON (read_json, to_json, json_normalize)