How to expose API to downstream libraries? #16

Closed
@saulshanabrook

I wanted to open a discussion on how the Array API (and potentially the dataframe API) will be exposed to downstream libraries.

For example, let's say I am the author of scikit-learn. How do I get access to an "Array compatible API"? Or let's say I am a downstream user, using scikit-learn in a notebook. How can I tell it to use TensorFlow instead of NumPy?

Options

I present three options here, but I would appreciate any suggestions on further ideas:

Manual

The default option is the status quo: there is no standard way to get access to a backend that conforms to the array API.

Different downstream libraries, like scikit-learn, could introduce their own mechanisms, like a backend kwarg to functions, if they wanted to support different backends.
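As a rough illustration of the manual approach, a library function could take an optional backend keyword and fall back to a default. Everything here is hypothetical: make_index, DEFAULT_BACKEND, and the two toy backends (which stand in for real array libraries such as NumPy or CuPy) are invented for this sketch and are not part of scikit-learn or any proposal.

```python
from types import SimpleNamespace

# Toy "backends" standing in for real array libraries (assumption for this sketch).
list_backend = SimpleNamespace(name="list", arange=lambda n: list(range(n)))
tuple_backend = SimpleNamespace(name="tuple", arange=lambda n: tuple(range(n)))

# Hypothetical library-wide default, used when the caller does not choose.
DEFAULT_BACKEND = list_backend


def make_index(n, backend=None):
    """Hypothetical library function with an opt-in ``backend`` keyword."""
    xp = DEFAULT_BACKEND if backend is None else backend
    return xp.arange(n)


default_result = make_index(3)                         # built by list_backend
explicit_result = make_index(3, backend=tuple_backend)  # built by tuple_backend
```

The cost of this approach is that every library invents its own keyword and its own default, so users have to learn a different mechanism for each one.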

Local Dispatch

Another approach would be to provide access to the relevant module from instances of the array objects themselves, which is the approach taken by NEP 37.

In this case, scikit-learn would either call some x.__array_module__() method on its inputs, or we would provide an array-api Python package with a helper function like get_array_module(x), similar to the NEP.

There is an open PR in scikit-learn (scikit-learn/scikit-learn#16574) to add support for NEP 37.
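A minimal sketch of what such a helper might look like, under simplifying assumptions: the hook here takes no arguments, as written above, whereas the hook described in NEP 37 also receives the types of the participating arguments. ToyArray, toy_namespace, and index_like are hypothetical names made up for the example, and this get_array_module is not the NEP's implementation.

```python
from types import SimpleNamespace


class ToyArray:
    """Stand-in for an array type that knows which module it belongs to."""

    def __init__(self, data):
        self.data = list(data)

    def __array_module__(self):
        # Simplified hook: returns the namespace that created this object.
        return toy_namespace


# Hypothetical array namespace exposing a single function.
toy_namespace = SimpleNamespace(
    name="toy",
    arange=lambda n: ToyArray(range(n)),
)


def get_array_module(*arrays, default=None):
    """Return the single array module shared by ``arrays`` (sketch only)."""
    found = {}
    for arr in arrays:
        hook = getattr(arr, "__array_module__", None)
        if hook is not None:
            module = hook()
            found[id(module)] = module
    if len(found) > 1:
        raise TypeError("inputs come from more than one array module")
    if found:
        return next(iter(found.values()))
    if default is None:
        raise TypeError("no input provides __array_module__")
    return default


def index_like(x):
    """Library code: build a result with the same module as the input."""
    xp = get_array_module(x)
    return xp.arange(len(x.data))
```

With this shape, the library never names a backend; it only asks its inputs which module to use, so a function with no array arguments has nothing to dispatch on.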

Global Dispatch

Instead of requiring an object to inspect, we could rely on a global context to store the "active array API" and provide ways of getting and setting it. Some form of this is implemented by SciPy, with scipy.fft.set_backend, which uses uarray.

uarray is probably heavier-weight than we need, but it does illustrate the general concept. If we implemented this, I think we could use context variables, as Python's built-in decimal module does. Something like this:

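# array_api is a proposed package here, not an existing one.
from array_api import set_backend, get_backend

import cupy

def some_fn():
    # Look up whichever array module is active in the current context.
    np = get_backend()
    return np.arange(10)

# Everything called inside this block sees cupy as the active backend.
with set_backend(cupy):
    some_fn()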

The advantage of global dispatch is that you don't need to pass in an instance of some custom class in order to set the backend.

Static Typing

This is slightly tangential, but one question that comes up for me is how we could properly statically type the local or global dispatch options. It seems like what we need is something like typing.Protocol, but for modules. I raised this as a discussion point on the typing-sig mailing list.
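To make the idea concrete, here is a small sketch of a protocol that describes a module's public functions. PEP 544 has a section on modules as implementations of protocols, but how well a given type checker supports this varies, so treat the static-typing claim as an assumption to verify. SqrtNamespace and hypotenuse are hypothetical names, and the standard library's math module stands in for a real array namespace.

```python
import math
from typing import Protocol


class SqrtNamespace(Protocol):
    """Hypothetical protocol: any namespace exposing ``sqrt(x) -> float``."""

    def sqrt(self, x: float, /) -> float: ...


def hypotenuse(xp: SqrtNamespace, a: float, b: float) -> float:
    # The function is written against the protocol, not a specific module.
    return xp.sqrt(a * a + b * b)


# At runtime any object with a compatible ``sqrt`` works, including a module.
length = hypotenuse(math, 3.0, 4.0)
```

A full array API protocol would list many more functions, and the return type of get_backend() or get_array_module() could then be annotated with it.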
