Description
The current documentation for the copy parameter of asarray states:
https://data-apis.org/array-api/latest/API_specification/generated/array_api.asarray.html#asarray
copy (Optional[bool]) – boolean indicating whether or not to copy the input. If True, the function must always copy. If False, the function must never copy for input which supports the buffer protocol and must raise a ValueError in case a copy would be necessary. If None, the function must reuse existing memory buffer if possible and copy otherwise. Default: None.
The meaning of copy=True is ambiguous for dask.
There are two possible interpretations:
- updating the output array will never alter the input array; OR
- the output array will never share memory with the input array; in other words, dropping every reference to the input array can release its memory even if you still hold a reference to the output array.
In dask, updating a collection actually creates brand new graph nodes under the hood and repoints the collection to those nodes, so the original is never modified. However, the original chunks, generated e.g. by from_array, are still held inside the graph.
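To make the difference between the two interpretations concrete, here is a minimal pure-Python sketch. It is a toy model only and assumes nothing about dask's real internals: ToyCollection, from_chunk, set_item and compute are hypothetical names invented for illustration. The sketch shows a graph-backed collection where an update never touches the input (first interpretation holds) while the graph still keeps a reference to the input's buffer (second interpretation does not hold).

```python
class ToyCollection:
    """Hypothetical stand-in for a lazy, graph-backed collection (not dask's API)."""

    def __init__(self, graph, key):
        self.graph = graph  # node name -> source chunk, or (function, dependency name)
        self.key = key      # name of the node this collection currently points at

    @classmethod
    def from_chunk(cls, chunk):
        # The source chunk is stored in the graph by reference, not copied.
        return cls({"source": chunk}, "source")

    def set_item(self, index, value):
        # An "update" adds a new node and repoints the collection to it;
        # existing nodes, including the source chunk, are left untouched.
        def task(data):
            new = list(data)
            new[index] = value
            return new

        new_key = f"setitem-{len(self.graph)}"
        self.graph = {**self.graph, new_key: (task, self.key)}
        self.key = new_key

    def compute(self):
        # Evaluate the graph recursively, starting from the current node.
        def evaluate(key):
            node = self.graph[key]
            if isinstance(node, tuple):
                func, dep = node
                return func(evaluate(dep))
            return node

        return evaluate(self.key)


original = [1, 2, 3]
a = ToyCollection.from_chunk(original)
b = ToyCollection(a.graph, a.key)  # a "copy" that merely shares the graph
b.set_item(0, 99)

# Interpretation 1: updating the output never alters the input.
first_holds = a.compute() == [1, 2, 3] and b.compute() == [99, 2, 3]

# Interpretation 2: the output holds no reference to the input's memory.
second_holds = all(node is not original for node in b.graph.values())
```

In this toy model first_holds is True and second_holds is False: a cheap graph-sharing "copy" would satisfy the first definition but not the second, which is the ambiguity described above.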
I strongly prefer the first definition, as IMHO decisions around memory management should be considered low level and delegated to the individual libraries.