Skip to content

ENH: Add engine keyword to read_json to enable reading from pyarrow #48893

Closed
@mroeschke

Description

@mroeschke

pyarrow has a read_json function that could be used as an alternative parser for pd.read_json https://arrow.apache.org/docs/python/generated/pyarrow.json.read_json.html#pyarrow.json.read_json

Like we have for read_csv and read_parquet, I would like to propose a engine keyword argument to allow users to pick the parsing backend engine="ujson"|"pyarrow"

One change compared to read_csv and read_parquet with engine="pyarrow" I would like to propose would be to return ArrowExtensionArrays instead of converting the result to numpy dtypes such that the pyarrow.Table returned by read_json still propagates pyarrow objects underneath.

Thoughts?

Metadata

Metadata

Assignees

Labels

Arrowpyarrow functionalityEnhancementIO JSONread_json, to_json, json_normalize

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions