Description
A validity mask is a missing-value representation that depends on the Column in the protocol, whereas describe_null()
is meant to describe missing values at the column level for a given dtype.
When a column has no missing values, should we still provide a validity array with 1 (valid) at every entry, or
should we raise an exception?
From my perspective, the latter is better, because a consumer can just check null_count == 0
without allocating and filling a whole array with the same value. That is how it works in the cudf DataFrame: if there are no missing values, accessing the nullmask attribute
(which holds the validity array) raises an exception with the message "Column has no null mask".
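
To make the proposal concrete, here is a minimal sketch of the "raise instead of allocating" behavior. Everything in it is hypothetical and only for illustration: SketchColumn, get_validity() and read_valid_values() are not part of the protocol or of cudf, and the exception type is an assumption. Only the null_count check and the error message come from the description above.

```python
from typing import Any, List, Optional, Sequence


class SketchColumn:
    """Hypothetical column used only to illustrate the proposal."""

    def __init__(self, values: Sequence[Any],
                 validity: Optional[Sequence[bool]] = None) -> None:
        self._values = list(values)
        # None means "no mask was ever allocated for this column".
        self._validity = None if validity is None else list(validity)

    @property
    def values(self) -> List[Any]:
        return list(self._values)

    @property
    def null_count(self) -> int:
        # Without a mask there cannot be any missing values.
        if self._validity is None:
            return 0
        return sum(1 for is_valid in self._validity if not is_valid)

    def get_validity(self) -> List[bool]:
        # Raise rather than build an all-ones array (assumed exception type).
        if self._validity is None:
            raise RuntimeError("Column has no null mask")
        return list(self._validity)


def read_valid_values(col: SketchColumn) -> List[Any]:
    """Consumer side: check null_count first and skip the mask when it is 0."""
    if col.null_count == 0:
        return col.values
    mask = col.get_validity()
    return [v for v, is_valid in zip(col.values, mask) if is_valid]
```

With this shape, a consumer that checks null_count first never touches the mask for fully valid columns, and a consumer that asks for the mask anyway gets an explicit error instead of a silently allocated array.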