String dtype: implement object-dtype based StringArray variant with NumPy semantics #58451
Changes from 3 commits
@@ -805,6 +805,16 @@ def assert_extension_array_equal(

        left_na, right_na, obj=f"{obj} NA mask", index_values=index_values
    )

    # Specifically for StringArrayNumpySemantics, validate here we have a valid array
    if isinstance(left.dtype, StringDtype) and left.dtype.storage == "python_numpy":
        assert np.all(
            [np.isnan(val) for val in left._ndarray[left_na]]  # type: ignore[attr-defined]
        ), "wrong missing value sentinels"

jorisvandenbossche marked this conversation as resolved.
This is a somewhat custom check, and we don't do anything similar for other types. However, I initially overlooked a case where we were creating string arrays with the wrong missing value sentinel, and the tests did not catch it: for EAs, two arrays with different missing value sentinels still pass as equal. Given that, I would prefer to keep this check in, at least in the short term.

    if isinstance(right.dtype, StringDtype) and right.dtype.storage == "python_numpy":
        assert np.all(
            [np.isnan(val) for val in right._ndarray[right_na]]  # type: ignore[attr-defined]
        ), "wrong missing value sentinels"

    left_valid = left[~left_na].to_numpy(dtype=object)
    right_valid = right[~right_na].to_numpy(dtype=object)
    if check_exact:
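
The concern in the review comment can be illustrated with a minimal NumPy-only sketch. It is not the pandas implementation: `is_missing`, `mask_based_equal`, and `has_nan_sentinels` are hypothetical helpers that only mimic the idea of comparing NA masks plus valid values, and then checking the sentinel separately. The sentinel check here guards with `isinstance` because `np.isnan` raises a `TypeError` when given `None`.

```python
import numpy as np


def is_missing(val):
    """Hypothetical helper: treat both None and float NaN as missing."""
    return val is None or (isinstance(val, float) and np.isnan(val))


def mask_based_equal(left, right):
    """Compare the NA masks first, then only the non-missing values."""
    left_na = np.array([is_missing(v) for v in left], dtype=bool)
    right_na = np.array([is_missing(v) for v in right], dtype=bool)
    if not np.array_equal(left_na, right_na):
        return False
    return bool(np.array_equal(left[~left_na], right[~right_na]))


def has_nan_sentinels(values):
    """Check that every missing slot holds a float NaN rather than None."""
    na = np.array([is_missing(v) for v in values], dtype=bool)
    return all(isinstance(v, float) and np.isnan(v) for v in values[na])


# Two object arrays that differ only in their missing value sentinel.
with_nan = np.array(["a", np.nan, "b"], dtype=object)
with_none = np.array(["a", None, "b"], dtype=object)

# The mask-based comparison cannot tell the two arrays apart ...
same_by_mask = mask_based_equal(with_nan, with_none)
# ... while the explicit sentinel check can.
nan_ok = has_nan_sentinels(with_nan)
none_ok = has_nan_sentinels(with_none)
```

In this sketch the mask-based comparison treats the two arrays as equal, which is why an extra sentinel assertion is needed to catch an array built with the wrong missing value.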