-
-
Notifications
You must be signed in to change notification settings - Fork 18.6k
REF: reshape.concat operate on arrays, not SingleBlockManagers #33125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the issue your change to assert_series_equal is working around?
pandas/core/arrays/numpy_.py
Outdated
@@ -436,6 +436,11 @@ def skew(self, axis=None, dtype=None, out=None, keepdims=False, skipna=True): | |||
# ------------------------------------------------------------------------ | |||
# Additional Methods | |||
|
|||
def astype(self, dtype, copy=True): | |||
if dtype is self.dtype: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are PandasDtype instances singletons? I wouldn't expect this to pass for PandasType('int') is PandasDtype('int')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a singleton, no
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this require action? AFAICT the fact that this is needed is an artifact of the way we are patching PandasArray._typ in these tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so. Otherwise the implementation is incorrect since PandasArray.astype(PandasDtype("int64"))
will raise
In [6]: arr = pd.core.arrays.PandasArray(np.array([1, 2]))
In [7]: arr.astype(pd.core.arrays.numpy_.PandasDtype("float64"))
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-7-eb4babcd3e70> in <module>
----> 1 arr.astype(pd.core.arrays.numpy_.PandasDtype("float64"))
~/sandbox/pandas/pandas/core/arrays/numpy_.py in astype(self, dtype, copy)
440 if dtype is self.dtype:
441 return self.copy() if copy else self
--> 442 return super().astype(dtype, copy=copy)
443
444 def to_numpy(
~/sandbox/pandas/pandas/core/arrays/base.py in astype(self, dtype, copy)
432 NumPy ndarray with 'dtype' for its dtype.
433 """
--> 434 return np.array(self, dtype=dtype, copy=copy)
435
436 def isna(self) -> ArrayLike:
TypeError: data type not understood
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, but that isnt any different from the status quo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the difference is that we're implementing this now, but the implementation isn't correct.
If you don't see a straightforward was to resolve this now, then feel free to open a followup issue and merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is_extension_dtype(other) and other.dtype == self.dtype
might suffice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To explain a bit more (at risk of over-explaining) a future reader of that exception is likely to be more confused about why this doesn't work than we are now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated to revert the astype
change and xfail the affected tests
Without the patched assert_series_equal, we get
I think the assert_series_equal issue has to do with the way we are patching |
Updated to just xfail more selectively and avoid patching assert_series_equal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. I think @TomAugspurger has some remaining comments
cc @TomAugspurger is there a better way to handle the assert_series_equal patch for the PandasArray tests?