pd.concat of Series with int64 column and Series with int64-ExtensionArray yields int64 #21792

Closed
@xhochy

Description

Code Sample

import fletcher as fr
import pandas as pd

df_ext = pd.DataFrame({'a': fr.FletcherArray([1, 2])})
df_ext.info()
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 2 entries, 0 to 1
# Data columns (total 1 columns):
# a    2 non-null fletcher[int64]
# dtypes: fletcher[int64](1)
# memory usage: 100.0 bytes

df_normal = pd.DataFrame({'a': [3, 4]})
df_normal.info()
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 2 entries, 0 to 1
# Data columns (total 1 columns):
# a    2 non-null int64
# dtypes: int64(1)
# memory usage: 96.0 bytes

# Works
pd.concat([df_ext, df_normal]).info()
# <class 'pandas.core.frame.DataFrame'>
# Int64Index: 4 entries, 0 to 1
# Data columns (total 1 columns):
# a    4 non-null object
# dtypes: object(1)
# memory usage: 64.0+ bytes

# Unexpected: yields int64 instead of object
pd.concat([df_ext['a'], df_normal['a']]).dtype
# dtype('int64')

Problem description

This currently causes BaseReshapingTests.test_concat_mixed_dtypes to fail for ExtensionArrays that can be converted to a numeric NumPy dtype.
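A minimal sketch of the consistency being asked for, assuming only that concatenating whole frames and concatenating their columns as Series should agree on the resulting dtype. The helper name `concat_dtypes` is hypothetical, and an object-dtype column stands in for the fletcher-backed one so the snippet runs without fletcher installed; it does not reproduce the bug itself.

```python
import pandas as pd


def concat_dtypes(frames, column):
    """Return (frame_path_dtype, series_path_dtype) for `column`.

    Hypothetical helper: it concatenates once via the DataFrame path and
    once via the Series path so the two result dtypes can be compared.
    """
    frame_dtype = pd.concat(frames)[column].dtype
    series_dtype = pd.concat([f[column] for f in frames]).dtype
    return frame_dtype, series_dtype


# Stand-in for df_ext: an object-dtype column instead of fletcher[int64].
df_other = pd.DataFrame({"a": pd.Series(["x", "y"], dtype=object)})
df_normal = pd.DataFrame({"a": [3, 4]})

frame_dtype, series_dtype = concat_dtypes([df_other, df_normal], "a")
print(frame_dtype, series_dtype)
```

Passing `[df_ext, df_normal]` from the code sample to the same helper should show the mismatch reported here: object on the DataFrame path and int64 on the Series path.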

Metadata

Labels

Bug, ExtensionArray (Extending pandas with custom dtypes or arrays)
