BUG: assignment with boolean index not propagating dtype #13169

Closed
@jreback

Description

# assumes: from pandas import DataFrame, Index
In [5]: df = DataFrame({'A' : [1,2]})

In [6]: df['B'] = Index([True,False])

In [7]: df['C'] = Index([True,False]).values

In [8]: df
Out[8]: 
   A      B      C
0  1   True   True
1  2  False  False

In [9]: df.dtypes
Out[9]: 
A     int64
B    object
C    object
dtype: object

In [10]: df = DataFrame({'A' : [1,2]})

In [11]: df['B'] = Index([True,False])

# you have to force the coercion
In [12]: df['C'] = Index([True,False]).values.astype(bool)

In [13]: Index([True,False])
Out[13]: Index([True, False], dtype='object')

In [14]: Index([True,False]).values
Out[14]: array([True, False], dtype=object)

In [15]: df.dtypes
Out[15]: 
A     int64
B    object
C      bool
dtype: object

Since we don't have a boolean Index type (we should, but that's a separate issue), assigning an Index of booleans leaves the resulting column dtype as object rather than bool. This can be fixed on a special-case basis here, though it may be better handled in _sanitize_array.

But I think it might be non-performant to even check with infer_dtype. So you have to be a bit hacky about this, e.g. call is_bool_array first and coerce only if it IS a boolean array (say it's NOT, IOW it's a bunch of strings, then this is problematic).
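As a user-side illustration of the "check first, then coerce" idea, here is a minimal sketch. The helper name `coerce_bool_values` is hypothetical and not part of pandas. It uses the public `pandas.api.types.infer_dtype` rather than the internal `is_bool_array` mentioned above, so it does not address the performance concern; it only shows the logic under that assumption.

```python
import numpy as np
import pandas as pd
from pandas.api.types import infer_dtype


def coerce_bool_values(values):
    """Hypothetical helper: return a bool ndarray if every element of an
    object-dtype input is a boolean, otherwise return the array unchanged."""
    arr = np.asarray(values)
    # Only object arrays need the check; skipna=False means a missing value
    # prevents the coercion, since NaN cannot be stored in a bool array.
    if arr.dtype == object and infer_dtype(arr, skipna=False) == "boolean":
        return arr.astype(bool)
    return arr


df = pd.DataFrame({"A": [1, 2]})
# Object array of booleans: coerced before assignment.
df["B"] = coerce_bool_values(np.array([True, False], dtype=object))
# Object array of strings: left as-is.
df["C"] = coerce_bool_values(np.array(["x", "y"], dtype=object))
print(df.dtypes)
```

Whether a check like this belongs in the column-assignment path or in _sanitize_array is the open question in this issue; the sketch only covers the caller-side workaround.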


    Labels

    Bug; Closing Candidate (may be closeable, needs more eyeballs); Dtype Conversions (unexpected or buggy dtype conversions); Indexing (related to indexing on series/frames, not to indexes themselves)
