ENH: Add support to read and write Amazon ION files #55725

Open
@anna-geller

Feature Type

  • Adding new functionality to pandas

Problem Description

We rely heavily on the Amazon Ion file format. Currently, reading Ion files into pandas DataFrames requires workarounds.

Feature Description

It would be great to add native Ion support to pandas through read_ion and write_ion methods.

Alternative Solutions

Here is a reproducer of a workaround we use for now:

import amazon.ion.simpleion as ion
from amazon.ion.simple_types import IonPyNull
import pandas as pd
import requests


def convert_ion_nulls(value):
    # Map Ion null values to None so pandas treats them as missing.
    return None if isinstance(value, IonPyNull) else value


url = "https://huggingface.co/datasets/kestra/datasets/resolve/main/ion/employees.ion"
response = requests.get(url, timeout=30)
response.raise_for_status()

# single_value=False reads the content as a stream of top-level Ion values.
ion_data = ion.loads(response.content, single_value=False)

# Convert each Ion struct into a plain dict with nulls normalized.
list_of_dicts = [
    {k: convert_ion_nulls(v) for k, v in dict(record).items()} for record in ion_data
]
df = pd.DataFrame(list_of_dicts)

For writing files:

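import amazon.ion.simpleion as ion

# df is the DataFrame built in the reading example above.
list_of_values = df.to_dict("records")


def save_as_ion(dict_or_list, file_name):
    # ion.dump writes binary Ion by default, so open the file in binary mode.
    with open(file_name, "wb") as f:
        ion.dump(dict_or_list, f)


save_as_ion(list_of_values, "mydata.ion")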
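
For discussion, here is a rough sketch of how the two workarounds above could be folded into the requested read_ion / write_ion shape. None of this exists in pandas today: the function names, signatures, and the normalize_records helper are hypothetical. The sketch assumes the file holds a stream of top-level Ion structs, and it passes sequence_as_stream=True when writing so that each row becomes its own top-level value and the output can be read back the same way. Conversion of pandas-specific value types is not addressed.

```python
from typing import Any, Callable, Dict, Iterable, List, Mapping


def normalize_records(
    values: Iterable[Mapping[str, Any]],
    is_null: Callable[[Any], bool],
) -> List[Dict[str, Any]]:
    """Turn struct-like records into plain dicts, mapping nulls to None."""
    return [
        {key: (None if is_null(value) else value) for key, value in dict(record).items()}
        for record in values
    ]


def read_ion(path):
    """Hypothetical reader: load a stream of Ion structs into a DataFrame."""
    # Imported lazily so this module loads even without the optional dependencies.
    import amazon.ion.simpleion as ion
    from amazon.ion.simple_types import IonPyNull
    import pandas as pd

    with open(path, "rb") as f:
        values = ion.load(f, single_value=False)
    return pd.DataFrame(
        normalize_records(values, lambda v: isinstance(v, IonPyNull))
    )


def write_ion(df, path):
    """Hypothetical writer: dump each DataFrame row as a top-level Ion struct."""
    import amazon.ion.simpleion as ion

    with open(path, "wb") as f:
        # sequence_as_stream=True writes the rows as a stream of values
        # instead of one enclosing Ion list, matching what read_ion expects.
        ion.dump(df.to_dict("records"), f, sequence_as_stream=True)
```

With something like this in place, usage would look like df = read_ion("employees.ion") followed by write_ion(df, "mydata.ion"). Keeping the null handling in a separate helper makes it possible to test that part without amazon.ion installed.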
