[0.24.0rc1] passing dtype='M8' to Index raises #24753

Closed
@TomAugspurger

In 0.23.4

In [1]: import pandas as pd

In [2]: pd.Index([], dtype='M8')
Out[2]: DatetimeIndex([], dtype='datetime64[ns]', freq=None)

In 0.24.0rc1

In [2]: pd.Index([], dtype='M8')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-2-09844a8ae29c> in <module>
----> 1 pd.Index([], dtype='M8')

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/base.py in __new__(cls, data, dtype, copy, name, fastpath, tupleize_cols, **kwargs)
    306             else:
    307                 result = DatetimeIndex(data, copy=copy, name=name,
--> 308                                        dtype=dtype, **kwargs)
    309                 return result
    310

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/indexes/datetimes.py in __new__(cls, data, freq, start, end, periods, tz, normalize, closed, ambiguous, dayfirst, yearfirst, dtype, copy, name, verify_integrity)
    301             data, dtype=dtype, copy=copy, tz=tz, freq=freq,
    302             dayfirst=dayfirst, yearfirst=yearfirst, ambiguous=ambiguous,
--> 303             int_as_wall_time=True)
    304
    305         subarr = cls._simple_new(dtarr, name=name,

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in _from_sequence(cls, data, dtype, copy, tz, freq, dayfirst, yearfirst, ambiguous, int_as_wall_time)
    366             data, dtype=dtype, copy=copy, tz=tz,
    367             dayfirst=dayfirst, yearfirst=yearfirst,
--> 368             ambiguous=ambiguous, int_as_wall_time=int_as_wall_time)
    369
    370         freq, freq_infer = dtl.validate_inferred_freq(freq, inferred_freq,

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in sequence_to_dt64ns(data, dtype, copy, tz, dayfirst, yearfirst, ambiguous, int_as_wall_time)
   1704     inferred_freq = None
   1705
-> 1706     dtype = _validate_dt64_dtype(dtype)
   1707
   1708     if not hasattr(data, "dtype"):

~/Envs/dask-dev/lib/python3.7/site-packages/pandas/core/arrays/datetimes.py in _validate_dt64_dtype(dtype)
   1991             raise ValueError("Unexpected value for 'dtype': '{dtype}'. "
   1992                              "Must be 'datetime64[ns]' or DatetimeTZDtype'."
-> 1993                              .format(dtype=dtype))
   1994     return dtype
   1995

ValueError: Unexpected value for 'dtype': 'datetime64'. Must be 'datetime64[ns]' or DatetimeTZDtype'.

We do want that to raise eventually, since ns precision should be specified explicitly. But should we deprecate the old behavior first?

The best place to add the deprecation is probably before we get to the arrays: in DatetimeIndex.__new__ we can check for M8 specifically, warn, and then pass M8[ns] through.

We should also check:

  • np.dtype("M8")
  • 'm8'
  • np.dtype('m8')
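
A rough sketch of that check, covering the four spellings above. The helper name `normalize_unitless_dtype` and the warning text are hypothetical, not existing pandas API. It relies only on `np.dtype` and `np.datetime_data`, which reports the unit `"generic"` when no precision was given.

```python
import warnings

import numpy as np


def normalize_unitless_dtype(dtype):
    """Hypothetical helper: warn on unit-less 'M8'/'m8' and return the ns dtype.

    Anything else (None, 'M8[ns]', 'M8[us]', non-NumPy dtypes) is returned
    unchanged so that later validation can deal with it.
    """
    if dtype is None:
        # np.dtype(None) would silently become float64, so bail out early.
        return None
    try:
        np_dtype = np.dtype(dtype)
    except (TypeError, ValueError):
        # Not something NumPy understands (e.g. a tz-aware dtype string).
        return dtype

    # kind 'M' is datetime64, kind 'm' is timedelta64.
    if np_dtype.kind in "mM" and np.datetime_data(np_dtype)[0] == "generic":
        warnings.warn(
            "Passing a datetime64/timedelta64 dtype without a precision is "
            "deprecated; specify '[ns]' explicitly.",
            FutureWarning,
            stacklevel=2,
        )
        return np.dtype(f"{np_dtype.kind}8[ns]")
    return dtype
```

Something like this could be called near the top of DatetimeIndex.__new__ (and the timedelta equivalent) so the arrays only ever see an explicit precision.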

It's less clear what we should do for something like M8[us]. In the past, we ignored the precision:

In [15]: pd.Index(list(pd.date_range('2000', periods=4).asi8), dtype='M8[us]')
Out[15]: DatetimeIndex(['2000-01-01', '2000-01-02', '2000-01-03', '2000-01-04'], dtype='datetime64[ns]', freq=None)

This should arguably raise now.
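
If we do decide to raise, a strict check might look roughly like the following. This is only a sketch: `require_ns_precision` is a hypothetical name, and the error wording is made up for illustration rather than taken from pandas.

```python
import numpy as np


def require_ns_precision(dtype):
    """Hypothetical strict check: accept only datetime64[ns] / timedelta64[ns]."""
    np_dtype = np.dtype(dtype)
    if np_dtype.kind not in "mM":
        raise TypeError(f"Expected a datetime64 or timedelta64 dtype, got {np_dtype}")

    # Compare against the ns dtype of the same kind; this rejects other
    # units such as 'M8[us]' as well as the unit-less 'M8'.
    expected = np.dtype(f"{np_dtype.kind}8[ns]")
    if np_dtype != expected:
        raise ValueError(
            f"Unexpected value for 'dtype': '{np_dtype}'. Must be '{expected}'."
        )
    return np_dtype
```

Note that this version also rejects the unit-less M8, so it would only be appropriate once any deprecation period for that spelling is over.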


Labels: Datetime, Dtype Conversions
