Bug in Series constructor returning wrong missing values #43026
# GH#43018
ser = Series(np.nan, dtype="object")
result = ser.astype("bool")
expected = Series(True, dtype="bool")
this is True?
bool(np.nan) returns True
Yes exactly.
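The truthiness claim in this thread can be checked without pandas. The sketch below is a minimal illustration, not part of the PR: it uses math.nan from the standard library as a stand-in for np.nan, on the assumption that both are ordinary Python floats holding NaN. The variable names are hypothetical.

```python
import math

# Stand-in for np.nan: both are plain Python floats holding NaN (assumption
# made so the example needs only the standard library).
nan = math.nan

# Float truthiness is defined as "value != 0", and NaN is not equal to 0,
# so bool() of NaN is True.
nan_is_truthy = bool(nan)

# NaN compares unequal to every value, including itself.
nan_equals_itself = nan == nan

print(nan_is_truthy)
print(nan_equals_itself)
```

This only shows the scalar Python rule the reviewers cite. Whether a given pandas version follows the same rule in astype("bool") is what the test in this PR pins down.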
def test_constructor_bool_dtype_missing_values(self):
    # GH#43018
    result = Series(index=[0], dtype="bool")
    expected = Series(True, index=[0], dtype="bool")
this is True?
thanks @phofl
@phofl we have had reports regarding both these cases. Is the new behavior now consistent?
import pandas as pd
print(pd.__version__)
s = pd.Series(dtype="int", index=[0])
print(s)
s2 = pd.Series(dtype="bool", index=[0])
print(s2)
It appears to me (and maybe others from the issues) that the changes are confusing. We have changed the I appreciate that
No opinion on the reversion, but I advise against the implicit assumption that constructors are going to default to nullable dtypes. Support has improved, but it's still a mess.
Series(dtype=int) is not pd.NA
#43017 bool
#43018 (if the reindex case is ok, then this is done too)
This ensures consistency with the DataFrame constructor.