Description
Right now, DatetimeArray.dtype can be either np.dtype('M8[ns]') or a DatetimeTZDtype, depending on whether the values are tz-naive or tz-aware. This means that while DatetimeArray[tz-naive] is an instance of ExtensionArray, it doesn't satisfy the minimum ExtensionArray API, which requires that array.dtype be an ExtensionDtype.
In [4]: pd.date_range('2000', periods=4)._data.dtype
Out[4]: dtype('<M8[ns]')
In [5]: pd.date_range('2000', periods=4, tz="CET")._data.dtype
Out[5]: datetime64[ns, CET]
This causes some type-unsoundness in places that are supposed to return an ExtensionArray, the two most prominent being pd.array and Series.array. As an example, the following isn't necessarily safe code:
def f(ser: Series) -> Callable:
    return ser.array.dtype.construct_array_type
That will fail for tz-naive datetime data, because its .dtype is a NumPy dtype, which has no construct_array_type attribute.
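Until the dtype is always an ExtensionDtype, callers have to guard the attribute access themselves. A minimal sketch of such a guard follows; array_type_or_none is a hypothetical helper name, not part of pandas, and it only assumes that ExtensionDtype instances expose construct_array_type while NumPy dtypes do not.

```python
from typing import Any, Optional


def array_type_or_none(array: Any) -> Optional[type]:
    """Return the array class for array.dtype, or None for a plain NumPy dtype.

    Hypothetical helper for illustration; it is not a pandas API.
    """
    # ExtensionDtype exposes construct_array_type(); np.dtype has no such attribute,
    # so getattr with a default avoids the AttributeError described above.
    construct = getattr(array.dtype, "construct_array_type", None)
    if construct is None:
        return None
    return construct()
```

Under the behavior described above, passing Series.array for tz-naive datetime data would be expected to take the None branch, while tz-aware data would go through construct_array_type. The proposal below would make this guard unnecessary.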
Proposal:
- Make a DatetimeDtype (or allow DatetimeTZDtype to have tz=None). Make the equivalent for TimedeltaArray.dtype
- Ensure that DatetimeArray.dtype and TimedeltaArray.dtype are always an ExtensionDtype
- Wrap Series.dtype and DatetimeIndex.dtype to continue returning the NumPy dtype
The last step is to avoid breaking code relying on Series[tz-naive].dtype being a NumPy dtype.
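To make the shape of the proposal concrete, here is a rough, dependency-free sketch. Everything in it is an assumption for illustration: SketchDatetimeDtype and series_level_dtype are hypothetical names, and a plain string stands in for the NumPy dtype that a real implementation would presumably return.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class SketchDatetimeDtype:
    """Hypothetical stand-in for the proposed dtype; tz=None means tz-naive."""

    tz: Optional[str] = None
    unit: str = "ns"

    @property
    def name(self) -> str:
        # Mirrors the existing tz-aware spelling, e.g. datetime64[ns, CET].
        if self.tz is None:
            return f"datetime64[{self.unit}]"
        return f"datetime64[{self.unit}, {self.tz}]"

    @property
    def numpy_name(self) -> str:
        # String placeholder for the NumPy dtype (assumption: keeps the sketch
        # free of third-party imports).
        return f"M8[{self.unit}]"


def series_level_dtype(dtype: SketchDatetimeDtype) -> Union[str, SketchDatetimeDtype]:
    """Mimic the wrapping step: tz-naive unwraps to the NumPy spelling."""
    if dtype.tz is None:
        return dtype.numpy_name
    return dtype
```

With this split, the array level always carries the dtype object, and only the Series or DatetimeIndex accessor unwraps the tz-naive case, so existing comparisons against the NumPy dtype would keep working.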