Skip to content

[RFC]: add broadcast_shapes to the specification #893

Open
@crusaderky

Description

@crusaderky

This RFC proposes adding an API to the specification for explicitly broadcasting a list of shapes to a single shape.

Overview

Based on array API comparison data, this API, or some variation of it, is commonly implemented across array libraries.

Currently, the Array API specification only includes broadcast_arrays and broadcast_to which both require array input. The specification lacks APIs for working directly with shapes without needing to create new array instances.

Prior Art

Proposal

This RFC proposes adding the following API to the specification:

def broadcast_shapes(*shapes: tuple[int | None, ...]) → tuple[int | None, ...]

in which one or more shapes are broadcasted together according to broadcasting rules as enumerated in the specification.

Questions

  • How to handle shapes having unknown dimensions?

    • dask.array.core.broadcast_shapes sets the output size to nan if any of the input shapes are nan on the same axis
    • ndonnx.broadcast_arrays(a, b) returns arrays with material shapes.

    Note that shape materialization can be a very expensive operation, as it requires materializing the whole graph until that point. In the case of Dask, which doesn't cache intermediate results as a deliberate memory management policy, this means computing everything at least twice.

Notes

The top-level page on broadcasting mentions on the first line, using non-prescriptive language, that broadcasting allows creating views of the inputs:

Broadcasting refers to the automatic (implicit) expansion of array dimensions to be of equal sizes without copying array data

However, no mention of sharing memory is made in broadcast_to or broadcast_arrays.

For the sake of comparison, see the verbiage in asarray(copy=False).

The problem with this ambiguity is that one can work around the lack of broadcast_shapes by calling xp.broadcast_arrays(*args)[0].shape, but there is no strong guarantee that the backend won't deep-copy the inputs.

Note that numpy.broadcast_shapes doesn't work with shapes containing None (ndonnx and hopefully in the future JAX too) or NaN (Dask; non-standard).

I suggest to either

  • Add prescriptive verbiage to broadcast_to and broadcast_arrays that the output must share memory with the input, or in other words the operation must be O(1), or
  • Add broadcast_shapes to the standard, and change the verbiage of the broadcasting high level page to "typically without copying array data"

For the time being I am adding the function to array_api_extra:

Metadata

Metadata

Assignees

No one assigned

    Labels

    Needs DiscussionNeeds further discussion.RFCRequest for comments. Feature requests and proposed changes.topic: BroadcastingArray broadcasting.

    Type

    No type

    Projects

    Status

    Stage 0

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions