Disambiguate what copy=True means for dask #866

The current documentation for the `copy` parameter of `asarray` (https://data-apis.org/array-api/latest/API_specification/generated/array_api.asarray.html#asarray) states:

> copy (Optional[bool]) – boolean indicating whether or not to copy the input. If True, the function must always copy. If False, the function must never copy for input which supports the buffer protocol and must raise a ValueError in case a copy would be necessary. If None, the function must reuse existing memory buffer if possible and copy otherwise. Default: None.

The meaning of `copy=True` is ambiguous for dask.
There are two possible interpretations:
1. Updating the output array will never alter the input array; OR
2. The output array will never share memory with the input array. In other words, dropping every reference to the input array can release its memory even if you still hold a reference to the output array.
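
For an eager library the two interpretations coincide, which is probably why the spec never had to choose between them. A minimal NumPy sketch of both checks (the variable names are just for illustration):

```python
import numpy as np

src = np.zeros(4)
out = np.array(src, copy=True)  # eager copy into a fresh buffer

# Interpretation 1: updating the output does not alter the input.
out[0] = 1.0
input_untouched = bool((src == 0).all())

# Interpretation 2: the output does not share memory with the input.
shares_memory = np.shares_memory(src, out)

print(input_untouched, shares_memory)
```

In NumPy the first check passes and the second reports no shared memory, so a copy satisfies both definitions at once. The question in this issue is what to require when a library can satisfy the first without the second.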

In dask, updating a collection actually creates brand new graph nodes under the hood and repoints the collection to those nodes, so the original is never modified. However, the original chunks, generated e.g. by `from_array`, are still held inside the graph.
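
A rough sketch of that behaviour, assuming a dask version recent enough to support item assignment on arrays. `Array.copy()` stands in here for whatever `asarray(x, copy=True)` would return; that mapping is my assumption, not something the spec or dask defines:

```python
import numpy as np
import dask.array as da

src_np = np.zeros(4)
x = da.from_array(src_np, chunks=2)

# New collection object; no chunk data is copied at this point
# (assumed stand-in for asarray(x, copy=True)).
y = x.copy()

# Item assignment adds new graph nodes and repoints y to them.
y[0] = 1.0

x_after = x.compute()
y_after = y.compute()
print(x_after, y_after, src_np)
```

Here `x` and `src_np` keep their original values after `y` is updated, so interpretation 1 holds. The graph behind `y` still refers to the chunks created by `from_array`, though, so interpretation 2 would need an explicit deep copy of every chunk.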

I strongly prefer the first definition, as IMHO decisions around memory management should be considered low level and delegated to the individual libraries.

