
API/BUG: Series(floating, dtype=intlike) ignores dtype, DataFrame casts #40110

Closed
@jbrockmendel


In most contexts, Series is strict about dtype: it either returns the given dtype or raises. DataFrame is the opposite, often silently ignoring dtype (xref #24435), I think on the theory that dtype may be intended to apply to some columns but not others.

With floating data and integer dtypes, the roles are reversed: Series ignores the dtype, while DataFrame casts (truncating the values):

>>> import numpy as np
>>> import pandas as pd
>>> arr = np.random.randn(5)

>>> pd.Series(arr, dtype="int16")
0    1.002695
1    0.259332
2   -1.111468
3   -0.680714
4   -0.008943

>>> pd.DataFrame(arr, dtype="int16")
   0
0  1
1  0
2 -1
3  0
4  0

We have exactly one test that is broken if we change the latter behavior, and that is mostly by coincidence. There are other bugs (e.g. Series(bigints, dtype="int8") silently overflowing) that would be easier to fix if maintaining this behavior weren't a consideration.

Is this intentional? cc @jreback @jorisvandenbossche @TomAugspurger
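
For reference, the "return the given dtype or raise" semantics described above can be sketched outside pandas. This is only an illustration under stated assumptions: `strict_int_cast` is a hypothetical helper name, not a pandas or NumPy API, and it uses plain NumPy checks rather than anything the constructors actually do internally.

```python
import numpy as np


def strict_int_cast(values, dtype):
    """Hypothetical helper: cast to an integer dtype, or raise if the cast is lossy."""
    arr = np.asarray(values)
    target = np.dtype(dtype)
    if target.kind not in "iu":
        raise TypeError(f"expected an integer dtype, got {target}")

    if arr.dtype.kind == "f":
        # Reject NaN/inf and any value with a fractional part.
        if not np.isfinite(arr).all() or (arr != np.trunc(arr)).any():
            raise ValueError(f"cannot losslessly cast float values to {target}")
    elif arr.dtype.kind not in "iu":
        raise TypeError(f"unsupported input dtype {arr.dtype}")

    if arr.size:
        # Compare as Python scalars so the bounds check itself cannot overflow.
        info = np.iinfo(target)
        lo, hi = arr.min().item(), arr.max().item()
        if lo < info.min or hi > info.max:
            raise ValueError(f"values out of range for {target}")

    return arr.astype(target)
```

With this sketch, `strict_int_cast([1.0, 2.0, -3.0], "int16")` returns an int16 array, while fractional input such as `np.random.randn(5)` or out-of-range input such as `[1000]` with `"int8"` raises `ValueError` instead of being silently left as float, truncated, or wrapped.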


    Labels

    Bug
    Constructors: Series/DataFrame/Index/pd.array Constructors
    Deprecate: Functionality to remove in pandas
    Dtype Conversions: Unexpected or buggy dtype conversions
    Needs Discussion: Requires discussion from core team before further action
