
BUG: COO matrices are copied and converted #43419

Open
@memeplex

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample, a copy-pastable example

Doesn't apply.

Problem description

The documentation for DataFrame.sparse.from_spmatrix() in [1] states that:

All sparse formats are supported, but matrices that are not in COOrdinate format will be converted, copying data as needed.

But the first thing the implementation does [2] is:

data = data.tocsc()

so a COO matrix is converted too. And the COO implementation of tocsc [3] does:

            indptr = np.empty(N + 1, dtype=idx_dtype)
            indices = np.empty_like(row, dtype=idx_dtype)
            data = np.empty_like(self.data, dtype=upcast(self.dtype))

            coo_tocsr(N, M, self.nnz, col, row, self.data,
                      indptr, indices, data)

so the data is copied into newly allocated arrays as well.

Given that the "native" format is CSC, not COO, it seems odd to me that the documentation singles out COO as the format that is not converted.
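A minimal sketch of how this could be checked, using only SciPy and NumPy rather than pandas itself. The matrix values and variable names are made up for illustration. It assumes that `np.shares_memory` on the `.data` arrays is a fair proxy for "copied", and that `tocsc()` on a matrix already in CSC format hands back the same object, which is what I see in the SciPy source but have not verified across versions.

```python
import numpy as np
from scipy import sparse

# Small COO matrix; the values are arbitrary illustration data.
row = np.array([0, 1, 2, 2])
col = np.array([0, 2, 1, 2])
val = np.array([1.0, 2.0, 3.0, 4.0])
coo = sparse.coo_matrix((val, (row, col)), shape=(3, 3))

# This is the call from_spmatrix makes first, applied to COO input.
csc_from_coo = coo.tocsc()
coo_copied = not np.shares_memory(coo.data, csc_from_coo.data)

# The same call on input that is already CSC.
csc = sparse.csc_matrix(coo)
csc_again = csc.tocsc()
csc_reused = np.shares_memory(csc.data, csc_again.data)

print("format after tocsc():", csc_from_coo.format)
print("COO data copied:", coo_copied)
print("CSC data reused:", csc_reused)
```

If both flags come out true, that would support the reading that CSC, not COO, is the format that goes through without conversion or copying, at least for the tocsc() step.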

Expected Output

Doesn't apply.

Output of pd.show_versions()

Referenced documentation and code is for 1.3.2.


[1] https://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html
[2] https://github.com/pandas-dev/pandas/blob/v1.3.2/pandas/core/arrays/sparse/accessor.py#L228-L283
[3] https://github.com/scipy/scipy/blob/v1.7.1/scipy/sparse/coo.py#L330-L370
