Skip to content

BUG: Coerce to object for mixed concat #20799

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 24, 2018

Conversation

TomAugspurger
Copy link
Contributor

@TomAugspurger TomAugspurger added this to the 0.23.0 milestone Apr 23, 2018
@TomAugspurger TomAugspurger added Reshaping Concat, Merge/Join, Stack/Unstack, Explode ExtensionArray Extending pandas with custom dtypes or arrays. labels Apr 23, 2018
@@ -173,6 +174,10 @@ def is_nonempty(x):
elif 'sparse' in typs:
return _concat_sparse(to_concat, axis=axis, typs=typs)

extensions = [is_extension_array_dtype(x) for x in to_concat]
if any(extensions) and not all(extensions):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have an actual case where the "and not all(extensions)" is hit?
Because if that is the case, I also think the np.concatenate(to_concat, axis=axis) will fail as it has the wrong dimensions to both work at the same time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhmm I suppose I don't have an example, because the all extension dtype case should never get here, right? It should be going down the is_uniform_join_units path. I'll change it, to remove the all(extensions) condition, so that it makes sense locally.

@codecov
Copy link

codecov bot commented Apr 23, 2018

Codecov Report

Merging #20799 into master will decrease coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20799      +/-   ##
==========================================
- Coverage   91.85%   91.85%   -0.01%     
==========================================
  Files         153      153              
  Lines       49310    49322      +12     
==========================================
+ Hits        45292    45303      +11     
- Misses       4018     4019       +1
Flag Coverage Δ
#multiple 90.24% <100%> (ø) ⬆️
#single 41.89% <50%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/dtypes/concat.py 99.18% <100%> (+0.01%) ⬆️
pandas/core/indexes/base.py 96.63% <0%> (-0.05%) ⬇️
pandas/core/indexes/multi.py 95.07% <0%> (+0.01%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 31e77b0...615399c. Read the comment docs.

@jorisvandenbossche jorisvandenbossche merged commit f89c25d into pandas-dev:master Apr 24, 2018
@jorisvandenbossche
Copy link
Member

Thanks!

@@ -173,6 +174,10 @@ def is_nonempty(x):
elif 'sparse' in typs:
return _concat_sparse(to_concat, axis=axis, typs=typs)

extensions = [is_extension_array_dtype(x) for x in to_concat]
if any(extensions):
to_concat = [np.atleast_2d(x.astype('object')) for x in to_concat]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm this is not correct

what about categorical? which is EA
what about DTI which is not?
do they hit this path ?

you need much more comprehensive tests here
I don’t think you need to convert to object for internal EA types (only external ones)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

both categorical and datetimes cases are already filtered out before (couple of lines above) and use the _concat_categorical and _concat_datetime special cased functions. So the only case that is left here are actual external EAs, for which the only option is converting to object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh ok

some extra comments would be helpful
also test when u have those plus extension types in a frame would be nice

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ExtensionArray Extending pandas with custom dtypes or arrays. Reshaping Concat, Merge/Join, Stack/Unstack, Explode
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: concat of ExtensionArray with other dtype fails
3 participants