Description
xref #14145 (reviving this PR and superseding)
xref #15533
When we fill (via fillna, where, or setitem), we are inconsistent in how we handle a non-compatible type. (Note: I have not exhaustively tested other methods; I suspect they pretty much coerce to object, except for .where.) The classic example is trying to fill tz-naive data with a tz-aware value.
In current pandas (i.e. not pandas2), we generally upcast to object. This is the case for setitem and fillna, but NOT for .where, where we raise. We DO have a parameter to control this, raise_on_error=True, and it works for .where (though not for .fillna).
In [1]: s = Series([Timestamp('20130101'), pd.NaT])
In [2]: s
Out[2]:
0 2013-01-01
1 NaT
dtype: datetime64[ns]
# this is wrong
In [3]: s.fillna(Timestamp('20130101', tz='US/Eastern'))
Out[3]:
0 2012-12-31 19:00:00-05:00
1 2013-01-01 00:00:00-05:00
dtype: datetime64[ns, US/Eastern]
# current behavior
In [4]: s.fillna('foo')
Out[4]:
0 2013-01-01 00:00:00
1 foo
dtype: object
In [5]: s[1] = 'bar'
In [6]: s
Out[6]:
0 2013-01-01 00:00:00
1 bar
dtype: object
In [8]: s = Series([Timestamp('20130101'), pd.NaT])
# this is inconsistent with fillna
In [9]: s.where([True, False], Timestamp('20130101',tz='US/Eastern'), raise_on_error=False)
Out[9]:
0 2013-01-01 00:00:00
1 2013-01-01 00:00:00-05:00
dtype: object
In [10]: s.where([True, False], Timestamp('20130101',tz='US/Eastern'), raise_on_error=True)
TypeError: Could not operate [Timestamp('2013-01-01 00:00:00-0500', tz='US/Eastern')] with block values [cannot coerce a Timestamp with a tz on a naive Block]
So I think it's reasonable to make these all raise, though that would be a pretty big API change. We are going to do this for pandas2 anyhow. That said, it might be beneficial not to be this strict.
Alternatively, we can fix .where to upcast to object by default.
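To make the two options concrete, here is a minimal pure-Python sketch of a single fill path with a switchable policy. It is only an illustration of the proposal, not how pandas implements any of these methods: `fill_missing`, `is_aware`, the `errors` parameter, and the `"datetime-naive"` / `"object"` labels are all hypothetical names made up for this example, and a plain list with `None` stands in for a datetime64 Series with NaT.

```python
from datetime import datetime, timezone


def is_aware(value):
    """Return True if value is a datetime that carries a UTC offset."""
    return isinstance(value, datetime) and value.utcoffset() is not None


def fill_missing(values, fill_value, errors="coerce"):
    """Fill None entries in a list of tz-naive datetimes (hypothetical helper).

    errors="raise"  -> reject any fill value that is not a tz-naive datetime
    errors="coerce" -> accept it and report the result kind as "object"
    """
    if errors not in ("raise", "coerce"):
        raise ValueError(f"unknown errors policy: {errors!r}")

    compatible = isinstance(fill_value, datetime) and not is_aware(fill_value)
    if not compatible and errors == "raise":
        raise TypeError(
            f"cannot fill tz-naive values with incompatible value {fill_value!r}"
        )

    # Existing values are never converted; only the missing slots change.
    filled = [fill_value if v is None else v for v in values]
    kind = "datetime-naive" if compatible else "object"
    return filled, kind


naive = [datetime(2013, 1, 1), None]
aware_fill = datetime(2013, 1, 1, tzinfo=timezone.utc)

# Option 2 from above: upcast, leaving the existing naive value untouched.
filled, kind = fill_missing(naive, aware_fill, errors="coerce")

# Option 1 from above: strict, the same call is rejected.
try:
    fill_missing(naive, aware_fill, errors="raise")
    strict_raised = False
except TypeError:
    strict_raised = True
```

The point of the sketch is that fillna, where, and setitem could all route through one decision like this, so the only open question is which policy is the default.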
Thoughts?
cc @wesm @shoyer @TomAugspurger @jorisvandenbossche @sinhrks @cpcloud @ResidentMario