Bug in Series constructor returning wrong missing values #43026

phofl · 2021-08-13T20:29:34Z

closes BUG: Default value for Series(dtype=int) is not pd.NA #43017
xref BUG: Inconsistent behavior of setting default value for type bool #43018 (if the reindex case is ok, then this is done too)
tests added / passed
Ensure all linting tests pass, see here for how to run them
whatsnew entry

This ensures consistency with the DataFrame constructor

doc/source/whatsnew/v1.4.0.rst

jreback · 2021-08-13T21:44:01Z

pandas/tests/series/methods/test_astype.py

+        # GH#43018
+        ser = Series(np.nan, dtype="object")
+        result = ser.astype("bool")
+        expected = Series(True, dtype="bool")


this is True?

bool(np.nan) returns True

Yes exactly.
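
For readers following the thread, here is a minimal sketch of the two scalar conversions under discussion. It uses only built-in Python, on the assumption that `np.nan` behaves like an ordinary Python float NaN for `bool()` and `int()`. The variable names are illustrative and not taken from the PR.

```python
nan = float("nan")  # assumption: stands in for np.nan, which is also a float NaN

# NaN is a nonzero float, so its truth value is True.
nan_as_bool = bool(nan)
print(nan_as_bool)

# No integer can represent NaN, so the conversion raises instead.
int_error = None
try:
    int(nan)
except ValueError as exc:
    int_error = str(exc)
print(int_error)
```

This asymmetry is why a missing value coerced to bool gives True, while the int case has no value it could coerce to.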

jreback · 2021-08-13T21:44:20Z

pandas/tests/series/test_constructors.py

+    def test_constructor_bool_dtype_missing_values(self):
+        # GH#43018
+        result = Series(index=[0], dtype="bool")
+        expected = Series(True, index=[0], dtype="bool")


this is True?

jreback · 2021-08-19T02:01:58Z

thanks @phofl


simonjayhawkins · 2022-03-25T17:20:57Z

@phofl we have had reports regarding both these cases. Is the new behavior now consistent?

import pandas as pd

print(pd.__version__)
s = pd.Series(dtype="int", index=[0])
print(s)
s2 = pd.Series(dtype="bool", index=[0])
print(s2)

1.3.5
0    0
dtype: int64
0    False
dtype: bool

1.4.1
0   NaN
dtype: float64
0    True
dtype: bool

It appears to me (and maybe others from the issues) that the changes are confusing.

We have changed the int case and now create a float array instead, on the grounds that the missing value cannot be held in an integer array, and yet for the bool case we keep the bool dtype even though we cannot represent a missing value in a boolean array?

I appreciate that int(np.nan) raises ValueError: cannot convert float NaN to integer and that bool(np.nan) is True, but the users are not specifying np.nan; they are not supplying data at all. I wonder whether we ought to revert this until we change the constructors to use nullable dtypes?
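
Until that is settled, one way for user code to avoid depending on the no-data default is to state the intent explicitly. The sketch below is an illustration rather than a recommendation from this thread. It assumes a pandas version that ships the nullable `"Int64"` and `"boolean"` extension dtypes, and the variable names are hypothetical.

```python
import pandas as pd

index = [0]

# Option 1: pass the fill value explicitly, so the result does not depend on
# which default a given pandas version chooses when no data is supplied.
int_zeros = pd.Series(0, index=index, dtype="int64")
bool_falses = pd.Series(False, index=index, dtype="bool")

# Option 2: opt in to the nullable extension dtypes, which can hold pd.NA.
nullable_int = pd.Series(index=index, dtype="Int64")
nullable_bool = pd.Series(index=index, dtype="boolean")

for ser in (int_zeros, bool_falses, nullable_int, nullable_bool):
    print(ser.dtype, ser.isna().tolist())
```

With option 1 the requested NumPy dtype is kept because a concrete value is broadcast over the index. With option 2 the dtype is kept because it can represent a missing value.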

jbrockmendel · 2022-04-01T20:45:26Z

I wonder whether we ought to revert this until we change the constructors to use nullable dtypes?

No opinion on the reversion, but I advise against the implicit assumption that constructors are going to default to nullable dtypes. Support has improved, but it's still a mess.

Bug in Series constructor returning wrong missing values

55fc8af

phofl added the Constructors, Missing-data, and Series labels Aug 13, 2021

jreback reviewed Aug 13, 2021


Update doc/source/whatsnew/v1.4.0.rst

46a622b

jreback requested changes Aug 13, 2021


jreback added this to the 1.4 milestone Aug 19, 2021

jreback approved these changes Aug 19, 2021


jreback merged commit 098661e into pandas-dev:master Aug 19, 2021

feefladder pushed a commit to feefladder/pandas that referenced this pull request Sep 7, 2021

Bug in Series constructor returning wrong missing values (pandas-dev#…

5345129

…43026)

phofl deleted the 43017 branch November 13, 2021 19:32

phofl mentioned this pull request Feb 14, 2022

BUG: Pandas Series with dtype=bool has changed default data values from False to True. #45933

Closed

3 tasks

phofl mentioned this pull request Mar 11, 2022

BUG: pd.Series created with wrong dtype #46323

Closed

3 tasks
