Add null object, and update top-level API specification #157

Merged
merged 5 commits into from
May 18, 2023
52 changes: 49 additions & 3 deletions spec/API_specification/dataframe_api/__init__.py
@@ -3,7 +3,7 @@
"""
from __future__ import annotations

-from typing import Mapping, Sequence
+from typing import Mapping, Sequence, Any

from .column_object import *
from .dataframe_object import *
@@ -14,8 +14,9 @@

__dataframe_api_version__: str = "YYYY.MM"
"""
-String representing the version of the DataFrame API specification to which the
-conforming implementation adheres.
+String representing the version of the DataFrame API specification to which
+the conforming implementation adheres. Set to a concrete value for a stable
+implementation of the dataframe API standard.
"""

def concat(dataframes: Sequence[DataFrame]) -> DataFrame:
@@ -73,3 +74,48 @@ def dataframe_from_dict(data: Mapping[str, Column]) -> DataFrame:
    DataFrame
    """
    ...

+class null:
+    """
+    A `null` object to represent missing data.
+
+    ``null`` is a scalar, and may be used when constructing a `Column` from a
+    Python sequence with `column_from_sequence`. It does not support ``is``,
+    ``==`` or ``bool``.
+
+    Raises
+    ------
+    TypeError
+        From ``__eq__`` and from ``__bool__``.
+
+        For ``__eq__``: a missing value must not be compared for equality
+        directly. Instead, use `DataFrame.isnull` or `Column.isnull` to check
+        for presence of missing values.
+
+        For ``__bool__``: truthiness of a missing value is ambiguous.
+
+    Notes
+    -----
+    Like for Python scalars, the ``null`` object may be duck typed so it can
+    reside on (e.g.) a GPU. Hence, the builtin ``is`` keyword should not be
+    used to check if an object *is* the ``null`` object.
+
+    """
+    ...

+def isnull(value: object, /) -> bool:
+    """
+    Check if an object is a `null` scalar.
+
+    Parameters
+    ----------
+    value : object
+        Any input type is valid.
+
+    Returns
+    -------
+    bool
+        True if the input is a `null` object from the same library which
+        implements the dataframe API standard, False otherwise.
+
+    """
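
Reviewer's illustration (not part of the diff): the stubs above only specify behaviour, so the following is a minimal sketch of how a conforming library might satisfy them. The `NullType` class, the singleton instance bound to `null`, the error messages, and the `isinstance`-based check are all assumptions made for this example; the specification does not prescribe any of them.

```python
from __future__ import annotations


class NullType:
    """Hypothetical missing-value scalar for one imagined library."""

    def __eq__(self, other: object) -> bool:
        # The spec says equality comparison must raise TypeError.
        raise TypeError("null cannot be compared with ==; use isnull() instead")

    def __bool__(self) -> bool:
        # The spec says truthiness of a missing value is ambiguous.
        raise TypeError("the truth value of null is ambiguous")

    def __repr__(self) -> str:
        return "null"


# Assumption: the library exposes a single instance under the name `null`.
null = NullType()


def isnull(value: object, /) -> bool:
    """Return True only for this library's own null scalar.

    A type check is used here so that neither ``is`` nor ``==`` is needed.
    """
    return isinstance(value, NullType)


class OtherLibraryNull:
    """Stand-in for a null scalar that belongs to a different library."""


# Building a null mask over a plain Python sequence.
values = [1.5, null, 3.0]
mask = [isnull(v) for v in values]
```

In this sketch `mask` marks only the second element as missing, and `isnull(OtherLibraryNull())` is `False`, which matches the "from the same library" wording in the `Returns` section. Writing `null == null` or `bool(null)` raises `TypeError`, as the `Raises` section requires.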
15 changes: 15 additions & 0 deletions spec/API_specification/index.rst
@@ -5,6 +5,21 @@ API specification

.. currentmodule:: dataframe_api

+The API consists of dataframe, column and groupby classes, plus a small number
+of objects and functions in the top-level namespace. The latter are:
+
+.. autosummary::
+   :toctree: generated
+   :template: attribute.rst
+   :nosignatures:
+
+   __dataframe_api_version__
+   isnull
+   null
+
+The ``DataFrame``, ``Column`` and ``GroupBy`` objects have the following
+methods and attributes:
+
.. toctree::
   :maxdepth: 3
