-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
BUG: Fix concat of frames with extension types (no reindexed columns) #34339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 5 commits
e130060
ea33cfd
f83338a
5822694
3d32bc8
951e356
d269229
63a6570
e731802
d8faef5
2e1f1a6
94fe89d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -319,6 +319,15 @@ def _concatenate_join_units(join_units, concat_axis, copy): | |
concat_values = concat_values.copy() | ||
else: | ||
concat_values = concat_values.copy() | ||
elif any(isinstance(t, ExtensionArray) for t in to_concat): | ||
# concatting with at least one EA means we are concatting a single column | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I find it odd that the logic is now split here and concat_compat. I would strongly encourage to put this logic there. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Having this here actually ensures a clear separation of concerns between The This keeps this BlockManager-related logic inside the internals, and ensures that There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. my point is you should have a concat_ea which you call here (right near concat_compat), otherwise the logic is scatttered. the *dispatching) logic is fine, the problem is the operational logic does not belong here. (alternative is to pass anotherr arg to concat_compat to do what you are doing here). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If there is a different What do you mean exactly with the "operational logic does not belong here". What part of the code below is the "operational" logic? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. every bit you added is operational logic. pls do something like this
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this is moving L324-330. As I have said before you are mixing 2 types of logic here by adding this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. To make sure I understand you correctly. You want to see the following function be put in def concat_ea_frame(to_concat):
to_concat = [t if isinstance(t, ExtensionArray) else t[0, :] for t in to_concat]
concat_values = concat_compat(to_concat, axis=concat_axis)
if not isinstance(concat_values, ExtensionArray):
# if the result of concat is not an EA but an ndarray, reshape to
# 2D to put it a non-EA Block
concat_values = np.atleast_2d(concat_values)
return concat_values and called her. Is that correct? I can do that if it ends this discussion, but just to note: this would create a function in There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. yes this is exactly inline with concat_dateime and concat_compat. I am surprised that you think this should be anywhere else. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
i agree with Joris; internals-specific logic belongs in internals There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. i guess |
||
# the non-EA values are 2D arrays with shape (1, n) | ||
to_concat = [t if isinstance(t, ExtensionArray) else t[0, :] for t in to_concat] | ||
concat_values = concat_compat(to_concat, axis=concat_axis) | ||
if not isinstance(concat_values, ExtensionArray): | ||
# if the result of concat is not an EA but an ndarray, reshape to | ||
# 2D to put it a non-EA Block | ||
concat_values = np.atleast_2d(concat_values) | ||
else: | ||
concat_values = concat_compat(to_concat, axis=concat_axis) | ||
|
||
|
@@ -443,7 +452,7 @@ def _is_uniform_join_units(join_units: List[JoinUnit]) -> bool: | |
# cannot necessarily join | ||
return ( | ||
# all blocks need to have the same type | ||
all(isinstance(ju.block, type(join_units[0].block)) for ju in join_units) | ||
all(type(ju.block) is type(join_units[0].block) for ju in join_units) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. this change is to ensure DatetimeBlock and DatetimeTZBlock don't incorrectly take this "uniform join units" code path |
||
and # noqa | ||
# no blocks that would get missing values (can lead to type upcasts) | ||
# unless we're an extension dtype. | ||
|
Uh oh!
There was an error while loading. Please reload this page.