API: inconsistency in casting back to its EA dtype (try_cast_to_ea) #31108

Closed
@jorisvandenbossche

With the current logic we have in try_cast_to_ea:

try:
    result = cls_or_instance._from_sequence(obj, dtype=dtype)
except Exception:
    # We can't predict what downstream EA constructors may raise
    result = obj
return result

it's easy to get inconsistencies: the result depends critically on what _from_sequence accepts as a valid scalar.
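To make the dependence concrete, here is a minimal, pandas-free sketch. `StrictIntArray` and `LenientIntArray` are hypothetical stand-ins for an EA (they are not pandas classes), and the `try_cast_to_ea` wrapper only mirrors the shape of the snippet quoted above; its exact signature is an assumption.

```python
def try_cast_to_ea(cls_or_instance, obj, dtype=None):
    """Mirror of the quoted logic: fall back to `obj` on any error."""
    try:
        result = cls_or_instance._from_sequence(obj, dtype=dtype)
    except Exception:
        # We can't predict what downstream EA constructors may raise
        result = obj
    return result


class StrictIntArray:
    """Hypothetical EA-like container that rejects bool scalars."""

    def __init__(self, values):
        self.values = list(values)

    @classmethod
    def _from_sequence(cls, scalars, dtype=None):
        scalars = list(scalars)
        if any(isinstance(x, bool) for x in scalars):
            raise TypeError("bool values are not valid integers here")
        return cls(int(x) for x in scalars)


class LenientIntArray(StrictIntArray):
    """Hypothetical variant that silently coerces bools to 0/1."""

    @classmethod
    def _from_sequence(cls, scalars, dtype=None):
        return cls(int(x) for x in scalars)


# What an element-wise `x == y` would produce for [0, 1, 2] against 0.
comparison = [True, False, False]

strict_result = try_cast_to_ea(StrictIntArray, comparison)
lenient_result = try_cast_to_ea(LenientIntArray, comparison)

print(type(strict_result).__name__)   # list (the cast failed, input returned as-is)
print(type(lenient_result).__name__, lenient_result.values)  # LenientIntArray [1, 0, 0]
```

The same input comes back as an untouched list of booleans with the strict constructor and as an integer array with the lenient one, purely because of what `_from_sequence` is willing to accept.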

For example, we currently have:

>>> s = pd.Series([0, 1, 2], dtype="Int64") 
>>> s.combine(0, lambda x, y: x == y)      
0     True
1    False
2    False
dtype: bool

However, if the IntegerArray constructor is changed slightly to accept more boolean-like values (see #31104 for a current inconsistency in this respect; this is what happens in #30282), you can get:

>>> s = pd.Series([0, 1, 2], dtype="Int64")  
>>> s.combine(0, lambda x, y: x == y)       
0    1
1    0
2    0
dtype: Int64

So how "forgiving" the EA constructor is determines the output type.
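A quick way to see which of the two outcomes a given pandas installation produces is to inspect the dtype of the combined result. This is only a sketch: it assumes pandas is installed, and it deliberately states no expected dtype, since that is exactly the part that varies with the constructor's behavior.

```python
import pandas as pd

s = pd.Series([0, 1, 2], dtype="Int64")

# Element-wise comparison against a scalar, as in the examples above.
result = s.combine(0, lambda x, y: x == y)

# The values are logically True/False/False either way;
# the dtype shows whether they were cast back to the EA dtype.
print(result.dtype)
print(result.tolist())
```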

Here the example is with combine, but the try_cast_to_ea function is also used in a part of groupby.

Labels: API - Consistency, Bug, ExtensionArray