Closed
Description
In [5]: df = DataFrame({'A' : [1,2]})
In [6]: df['B'] = Index([True,False])
In [7]: df['C'] = Index([True,False]).values
In [8]: df
Out[8]:
A B C
0 1 True True
1 2 False False
In [9]: df.dtypes
Out[9]:
A int64
B object
C object
dtype: object
In [10]: df = DataFrame({'A' : [1,2]})
In [11]: df['B'] = Index([True,False])
# you have to force the coercion
In [12]: df['C'] = Index([True,False]).values.astype(bool)
In [13]: Index([True,False])
Out[13]: Index([True, False], dtype='object')
In [14]: Index([True,False]).values
Out[14]: array([True, False], dtype=object)
In [15]: df.dtypes
Out[15]:
A int64
B object
C bool
dtype: object
Since we don't have a boolean Index type (we should, but that is a separate issue), assigning an Index of booleans leaves the resulting column dtype as object rather than bool. This can be fixed on a special-case basis here, though it may be better handled in _sanitize_array.
But I think it might be non-performant to even check with infer_dtype. So you have to be a bit hacky about this, e.g. call is_bool_array first, then coerce only if it IS a boolean array. If it is NOT (in other words, it is a bunch of strings), then this is problematic.
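A minimal sketch of the check-then-coerce idea described above, written as user-level code and not as the actual pandas-internal fix. The helper name coerce_bool_object is hypothetical, and it uses the public pandas.api.types.infer_dtype as a stand-in for the internal is_bool_array check, so it says nothing about the performance concern raised here.

```python
import numpy as np
import pandas as pd
from pandas.api.types import infer_dtype


def coerce_bool_object(values):
    """Hypothetical helper: return a bool ndarray if every element is a
    boolean, otherwise return the values unchanged as an ndarray."""
    arr = np.asarray(values)
    # Only object arrays need inspection; other dtypes pass through as-is.
    if arr.dtype == object and infer_dtype(arr, skipna=False) == "boolean":
        return arr.astype(bool)
    # Strings, mixed values, etc. are left alone instead of being coerced.
    return arr


# Object arrays built explicitly, mirroring Index([True, False]).values above.
bools = np.array([True, False], dtype=object)
strings = np.array(["x", "y"], dtype=object)

df = pd.DataFrame({"A": [1, 2]})
df["B"] = coerce_bool_object(bools)    # coerced to bool before assignment
df["C"] = coerce_bool_object(strings)  # not boolean, so left uncoerced
print(df.dtypes)
```

Checking first avoids calling astype(bool) on non-boolean data, where every non-empty string would silently become True.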