BUG: Improve error message when casting ExtensionDtype to datetime #37553

Closed
@arw2019

Description

We support casting int and float dtypes to DatetimeTZDtype:

In [11]: s = pd.Series(np.random.randint(10, size=3), dtype="int64") 
    ...: s.astype(pd.DatetimeTZDtype(tz="US/Pacific"))                                                  
Out[11]: 
0   1969-12-31 16:00:00.000000004-08:00
1   1969-12-31 16:00:00.000000007-08:00
2   1969-12-31 16:00:00.000000001-08:00
dtype: datetime64[ns, US/Pacific]
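
The output above is consistent with each integer being read as a count of nanoseconds since the Unix epoch in UTC and then displayed in the target timezone. A minimal sketch of that reading for the single value 4, using only `pd.Timestamp` (the variable names are illustrative):

```python
import pandas as pd

# Read the integer 4 as nanoseconds since 1970-01-01 00:00:00 UTC.
ts_utc = pd.Timestamp(4, unit="ns", tz="UTC")

# Display the same instant in US/Pacific (UTC-08:00 at that date).
ts_pacific = ts_utc.tz_convert("US/Pacific")
print(ts_pacific)
```

This should match the first row of `Out[11]` whenever the random draw is 4.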

We should be able to do the same with the corresponding nullable extension dtypes (Int/Float). Currently this raises a TypeError:

In [12]: s = pd.Series(np.random.randint(10, size=5), dtype="Int64") 
    ...: s.astype(pd.DatetimeTZDtype(tz="US/Pacific"))                                                  
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-12-469a4b536444> in <module>
      1 s = pd.Series(np.random.randint(10, size=5), dtype="Int64")
----> 2 s.astype(pd.DatetimeTZDtype(tz="US/Pacific"))

/workspaces/pandas-arw2019/pandas/core/generic.py in astype(self, dtype, copy, errors)
   5797         else:
   5798             # else, only a single dtype is given
-> 5799             new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   5800             return self._constructor(new_data).__finalize__(self, method="astype")
   5801 

/workspaces/pandas-arw2019/pandas/core/internals/managers.py in astype(self, dtype, copy, errors)
    620         self, dtype, copy: bool = False, errors: str = "raise"
    621     ) -> "BlockManager":
--> 622         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    623 
    624     def convert(

/workspaces/pandas-arw2019/pandas/core/internals/managers.py in apply(self, f, align_keys, ignore_failures, **kwargs)
    422                     applied = b.apply(f, **kwargs)
    423                 else:
--> 424                     applied = getattr(b, f)(**kwargs)
    425             except (TypeError, NotImplementedError):
    426                 if not ignore_failures:

/workspaces/pandas-arw2019/pandas/core/internals/blocks.py in astype(self, dtype, copy, errors)
    608         if self.is_extension:
    609             try:
--> 610                 values = self.values.astype(dtype)
    611             except (ValueError, TypeError):
    612                 if errors == "ignore":

/workspaces/pandas-arw2019/pandas/core/arrays/integer.py in astype(self, dtype, copy)
    471             na_value = lib.no_default
    472 
--> 473         return self.to_numpy(dtype=dtype, na_value=na_value, copy=False)
    474 
    475     def _values_for_argsort(self) -> np.ndarray:

/workspaces/pandas-arw2019/pandas/core/arrays/masked.py in to_numpy(self, dtype, copy, na_value)
    222             data[self._mask] = na_value
    223         else:
--> 224             data = self._data.astype(dtype, copy=copy)
    225         return data
    226 

TypeError: data type not understood

We should get this to work for both ExtensionArray and the masked versions.
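
Until the cast is supported directly, one possible workaround is to go through a plain NumPy `int64` array first. This is only a sketch: `int_to_datetimetz` is a hypothetical helper, not part of pandas, and it assumes the Series contains no missing values (`to_numpy(dtype="int64")` raises if `pd.NA` is present).

```python
import pandas as pd


def int_to_datetimetz(s: pd.Series, tz: str) -> pd.Series:
    """Hypothetical helper (not a pandas API): nullable ints -> tz-aware datetimes."""
    # Assumption: no pd.NA in `s`; otherwise this conversion raises.
    raw = s.to_numpy(dtype="int64")
    # Interpret the integers as nanoseconds since the epoch in UTC,
    # then convert the result to the requested timezone.
    idx = pd.to_datetime(raw, unit="ns", utc=True).tz_convert(tz)
    return pd.Series(idx, index=s.index, name=s.name)


s = pd.Series([4, 7, 1], dtype="Int64")
result = int_to_datetimetz(s, "US/Pacific")
print(result)
```

The helper mirrors what the `int64` example does, so the two paths should give the same timestamps for the same integers.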

Metadata

Assignees

No one assigned

Labels

ExtensionArray: Extending pandas with custom dtypes or arrays.
NA - MaskedArrays: Related to pd.NA and nullable extension arrays
Needs Tests: Unit test(s) needed to prevent regressions
Timezones: Timezone data dtype
good first issue