ENH: Adding support for calamine as Excel reader engine #50395

Closed
@abielr

Feature Type

  • Adding new functionality to pandas

  • Changing existing functionality in pandas

  • Removing existing functionality in pandas

Problem Description

Reading Excel files in pandas is considerably slower than in some alternative data frame tools; for example, the readxl package in R reads Excel files much faster. The Rust calamine library can also read Excel files much faster than the engines pandas currently supports, and a Python binding for it already exists: python-calamine. I would like to request that pandas add official support for calamine, so that users can read an Excel file like this:

pd.read_excel("test.xlsx", engine="calamine")

Feature Description

The python-calamine package already implements code that enables the calamine engine in pandas; see the examples using pandas_monkeypatch() at the bottom of its GitHub README. The code that enables this is part of the python-calamine repository.

Although python-calamine already implements the features needed to use the library with pandas, I am unclear on how closely calamine's behavior matches that of the other engines pandas supports, such as openpyxl. I am hoping that bringing calamine in as an officially supported engine would let the pandas unit tests confirm consistent behavior across calamine and the other engines.
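
To make the consistency concern concrete, here is a rough sketch of the kind of cross-engine check I have in mind. It is not taken from the pandas test suite; the helper name compare_engines and its reader parameter are hypothetical, and it assumes that every engine named is installed and accepted by the reader (for "calamine" that currently means applying the python-calamine monkeypatch first).

```python
import pandas as pd


def compare_engines(path, engines=("openpyxl", "calamine"), reader=pd.read_excel, **kwargs):
    """Read `path` once per engine and report where results differ.

    Hypothetical helper for illustration only. The first engine in
    `engines` is treated as the baseline; every other engine is compared
    against it. Returns a dict mapping engine name -> mismatch message,
    which is empty when all engines agree with the baseline.
    """
    # Read the same file with each engine, passing through any extra
    # read_excel keyword arguments (sheet_name, header, ...).
    frames = {name: reader(path, engine=name, **kwargs) for name in engines}

    baseline_name, *other_names = engines
    mismatches = {}
    for name in other_names:
        try:
            # Strict comparison: values, dtypes, index and column labels.
            pd.testing.assert_frame_equal(frames[baseline_name], frames[name])
        except AssertionError as exc:
            mismatches[name] = str(exc)
    return mismatches
```

Calling compare_engines("test.xlsx") would return an empty dict if both engines produce identical frames, and otherwise the assertion message describing the first difference found for each engine that disagrees with the baseline.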

Alternative Solutions

None

Additional Context

No response
