Closed
Description
pyarrow has a read_json
function that could be used as an alternative parser for pd.read_json
https://arrow.apache.org/docs/python/generated/pyarrow.json.read_json.html#pyarrow.json.read_json
Like we have for read_csv
and read_parquet
, I would like to propose a engine
keyword argument to allow users to pick the parsing backend engine="ujson"|"pyarrow"
One change compared to read_csv
and read_parquet
with engine="pyarrow"
I would like to propose would be to return ArrowExtensionArray
s instead of converting the result to numpy dtypes such that the pyarrow.Table
returned by read_json
still propagates pyarrow objects underneath.
Thoughts?