API/DEPR: SparseArray constructor usability #43628

Open
@mzeitlin11

Currently the SparseArray constructor accepts a sparse_index argument, but the index type it expects is not publicly exposed, nor is there a publicly exposed function to make one. Because of this, I'd like to suggest deprecating sparse_index.

A related issue is that without sparse_index (or a preexisting SparseArray), there is no way to build a SparseArray through the constructor without first supplying dense data, which seems to defeat the purpose of using a SparseArray.

I'd like to replace sparse_index with two parameters, indices and length, which together give the same functionality that sparse_index provided. indices would list the array positions that hold actual values, and length would be the total length of the SparseArray. (I'm not attached to these names if people prefer others; they just match the naming in existing SparseArray code.)

This would allow constructing something sparse without first constructing a dense array, for example:

arr = SparseArray([1, 2], indices=[5, 11], length=1_000_000)

or

arr = SparseArray(1, indices=[1000, 2000, 4000, 9000], length=1_000_000)

without needing to materialize an array of length 1_000_000.
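
To make the intended semantics concrete, here is a rough pure-Python sketch of how the two calls above could be interpreted. Nothing here is pandas API: `SparseSketch` and `make_sparse` are hypothetical names used only for illustration, and the default `fill_value=0`, the scalar broadcasting rule, and the requirement that indices be strictly increasing are all assumptions rather than settled behavior.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class SparseSketch:
    """Hypothetical stand-in for the (values, indices, length) triple."""

    values: tuple
    indices: tuple
    length: int
    fill_value: object = 0  # assumption: pandas picks this from the dtype

    def __len__(self):
        return self.length

    def __getitem__(self, position):
        if not 0 <= position < self.length:
            raise IndexError(position)
        # Binary search over the stored positions only.
        slot = bisect_left(self.indices, position)
        if slot < len(self.indices) and self.indices[slot] == position:
            return self.values[slot]
        return self.fill_value


def make_sparse(data, indices, length, fill_value=0):
    """Hypothetical helper mirroring the proposed indices/length keywords."""
    indices = tuple(indices)
    if isinstance(data, (int, float)):
        # Assumed rule: a scalar is repeated once per stored position.
        values = (data,) * len(indices)
    else:
        values = tuple(data)
    if len(values) != len(indices):
        raise ValueError("data and indices must have the same length")
    if any(b <= a for a, b in zip(indices, indices[1:])):
        raise ValueError("indices must be strictly increasing")
    if indices and not (0 <= indices[0] and indices[-1] < length):
        raise ValueError("indices must lie in [0, length)")
    return SparseSketch(values, indices, length, fill_value)


arr = make_sparse([1, 2], indices=[5, 11], length=1_000_000)
ones = make_sparse(1, indices=[1000, 2000, 4000, 9000], length=1_000_000)
```

Only the stored values and their positions are ever held in memory here; the logical length of 1_000_000 is just an integer, which is the property the proposed keywords are meant to give the real constructor.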

Labels: Constructors (Series/DataFrame/Index/pd.array Constructors), Deprecate (Functionality to remove in pandas), Needs Discussion (Requires discussion from core team before further action), Sparse (Sparse Data Type)
