Description
Feature Type
- Adding new functionality to pandas
Problem Description
We rely heavily on the Amazon Ion file format. Currently, reading Ion files into pandas DataFrames requires workarounds.
Feature Description
It would be great to add support for Ion in pandas through read_ion and write_ion methods.
Alternative Solutions
Here is a reproducer of a workaround we use for now:
import amazon.ion.simpleion as ion
from amazon.ion.simple_types import IonPyNull
import pandas as pd
import requests

def convert_ion_nulls(value):
    # Map Ion null values to Python None so pandas treats them as missing
    return None if isinstance(value, IonPyNull) else value

# Download the sample Ion file
url = "https://huggingface.co/datasets/kestra/datasets/resolve/main/ion/employees.ion"
response = requests.get(url)
response.raise_for_status()
ion_content = response.content

# Parse every top-level Ion value in the stream, not just the first one
ion_data = ion.loads(ion_content, single_value=False)

# Convert each Ion struct to a plain dict, then replace Ion nulls
list_of_dicts = [dict(record) for record in ion_data]
list_of_dicts = [
    {k: convert_ion_nulls(v) for k, v in record.items()} for record in list_of_dicts
]

df = pd.DataFrame(list_of_dicts)
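The null conversion in the workaround only looks at top-level values. If records contain nested structs or lists, a recursive variant may be needed. The following is a minimal sketch of that idea, not part of the original workaround: `strip_nulls` and `FakeNull` are hypothetical names, the null check is passed in as a predicate so the example runs without amazon.ion installed, and it assumes nested Ion structs behave like mappings and nested Ion lists like Python lists.

```python
from collections.abc import Mapping


def strip_nulls(value, is_null):
    """Recursively replace values flagged by is_null with None."""
    if is_null(value):
        return None
    if isinstance(value, Mapping):
        # Rebuild mappings as plain dicts, cleaning each value
        return {key: strip_nulls(item, is_null) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        # Rebuild sequences as plain lists, cleaning each element
        return [strip_nulls(item, is_null) for item in value]
    return value


class FakeNull:
    """Hypothetical stand-in for IonPyNull, used only for this demo."""


records = [{"name": "Ada", "manager": FakeNull(), "tags": ["a", FakeNull()]}]
cleaned = [strip_nulls(record, lambda v: isinstance(v, FakeNull)) for record in records]
```

With the real library, the predicate would presumably be `lambda v: isinstance(v, IonPyNull)`, and `cleaned` could be passed to `pd.DataFrame` as in the workaround.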
For writing files:
import amazon.ion.simpleion as ion

# Convert the DataFrame to a list of row dictionaries
list_of_values = df.to_dict("records")

def save_as_ion(dict_or_list, file_name):
    # Open the file in binary mode for ion.dump
    with open(file_name, "wb") as f:
        ion.dump(dict_or_list, f)

save_as_ion(list_of_values, "mydata.ion")
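One possible gap on the write side: `df.to_dict("records")` keeps missing numeric values as float NaN, so they would likely be written as floats and not as nulls. The sketch below shows one way to convert them to None before dumping. `nan_to_none` is a hypothetical helper, not part of the workaround above, and it only handles top-level float NaN values (not NaT or other missing-value markers).

```python
import math


def nan_to_none(records):
    """Return copies of the record dicts with float NaN replaced by None."""
    return [
        {
            key: None if isinstance(value, float) and math.isnan(value) else value
            for key, value in record.items()
        }
        for record in records
    ]


# Sample records shaped like the output of df.to_dict("records")
sample = [{"name": "Ada", "salary": float("nan")}, {"name": "Bob", "salary": 10.5}]
prepared = nan_to_none(sample)
```

The result could then be passed to `save_as_ion` in place of `list_of_values`.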