API: formalize the pandas IO API #15862

Closed
@jreback

Description

#15838 (comment)

we have fairly uniform IO routines of the form

.to_format(path, df, **kwargs) (takes DataFrame)
and
pd.read_format(path, **kwargs) (returns DataFrame)

so we should document various aspects of this:

  • contract on input path strings
  • file-like objects & is_file_like (ENH: Add file buffer validation to I/O ops #15894)
  • whether we do any encoding / compression (only on csv/json ATM)
  • various guarantees on what we are sending in (e.g. no Index, string columns which are non-duplicated, no non-string objects; see the feather/parquet impl).
  • make these more pluggable
  • perhaps allow a specification for block access / chunking.
  • additional args to accept/use: mode (for writing)
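
To make the list above concrete, here is a minimal sketch of what a pluggable reader/writer contract could look like. Everything in it is hypothetical and is not pandas API: the registry, `register_format`, `write_table`, `read_table`, `check_target` and `check_columns` are invented names, `is_file_like` is a simplified local stand-in for the check discussed in #15894, and a plain (columns, rows) pair stands in for a DataFrame. The toy csv plugin only handles file-like objects, not paths.

```python
import csv
import io
import os

# Hypothetical registry: format name -> (reader, writer). Not pandas API.
_FORMATS = {}


def is_file_like(obj):
    """Simplified stand-in: has read or write, and is iterable."""
    has_io = hasattr(obj, "read") or hasattr(obj, "write")
    return has_io and hasattr(obj, "__iter__")


def check_target(path_or_buf):
    """Contract on the first argument: a path string/PathLike or a file-like object."""
    if isinstance(path_or_buf, (str, os.PathLike)) or is_file_like(path_or_buf):
        return path_or_buf
    raise TypeError(f"expected a path or file-like object, got {type(path_or_buf).__name__}")


def check_columns(columns):
    """Guarantee given to writers: string column names, no duplicates."""
    if not all(isinstance(c, str) for c in columns):
        raise ValueError("column names must be strings")
    if len(set(columns)) != len(columns):
        raise ValueError("column names must be unique")


def register_format(name, reader, writer):
    """Plug in a new format without touching the dispatch functions."""
    _FORMATS[name] = (reader, writer)


def write_table(fmt, path_or_buf, columns, rows, **kwargs):
    check_target(path_or_buf)
    check_columns(columns)
    _, writer = _FORMATS[fmt]
    writer(path_or_buf, columns, rows, **kwargs)


def read_table(fmt, path_or_buf, **kwargs):
    check_target(path_or_buf)
    reader, _ = _FORMATS[fmt]
    return reader(path_or_buf, **kwargs)


# A toy csv plugin that works on file-like objects only.
def _csv_writer(buf, columns, rows):
    out = csv.writer(buf)
    out.writerow(columns)
    out.writerows(rows)


def _csv_reader(buf):
    data = list(csv.reader(buf))
    return data[0], data[1:]


register_format("csv", _csv_reader, _csv_writer)

buf = io.StringIO()
write_table("csv", buf, ["a", "b"], [["1", "x"], ["2", "y"]])
buf.seek(0)
columns, rows = read_table("csv", buf)
```

The point of the sketch is that validation (target type, column guarantees) lives in one place in front of the registry, so each format plugin can assume the contract already holds. Encoding, compression, chunking and a `mode` argument are left out; they would be extra keyword arguments passed through `**kwargs`.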

Metadata

Labels: API Design, Docs, IO Data
