BUG: Index.union() inconsistent with non-unique Indexes #36299
Merged
56 commits
b8f27e2 Fix issues in Index.union with duplicate index values (phofl)
070abad Run black pandas (phofl)
7a90556 Parametrize tests (phofl)
dbcd0a7 Fix small bug and Pep8 issues (phofl)
7a9aec7 Change import order (phofl)
01d55c6 Fix black bug (phofl)
8463078 Implement reviews (phofl)
cddd7f9 Move code away from try except (phofl)
3c6b079 Catch r_reindexer None (phofl)
90056b7 Resort imports... (phofl)
46b7c6c Simplify resorting (phofl)
9f55905 Rename variable (phofl)
870c0ac Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
97fe91f Rename variable (phofl)
e724ade Move resorting to algos (phofl)
920f9ff Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
c36ccff Add test and improve doc (phofl)
396c70f Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
1062778 Fix pattern issues (phofl)
cf06418 Add check (phofl)
6f04408 Rename function (phofl)
167d695 Adjust whatsnew (phofl)
0b3b548 Change gh reference (phofl)
73d9ab3 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
561f835 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
59dbdf6 Remove pd (phofl)
51131be Adress review comments (phofl)
04817b4 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
a1887c7 Move whatsnew (phofl)
fe41a6f Fix gh reference (phofl)
b63a732 Remove comment and fix test (phofl)
fa49dfe Add test for concat issue (phofl)
2a6f6ed Add whatsnew (phofl)
d80949c Remove concat test (phofl)
48e041a Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
b57f00f Add dropna (phofl)
aa4533a Fix join func (phofl)
8fd90a3 Fix bug (phofl)
091942a Fix bug (phofl)
446eb50 Refactor code and add tests (phofl)
6b8fa64 Run Black (phofl)
484f4f8 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
5209bf0 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
6cabad8 Adress review (phofl)
b80fbdd Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
60bceec Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
2547f65 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
25885d4 Merge master and adjust condition (phofl)
fcc4635 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
ccaa0c1 Remove unused import (phofl)
f4ee466 Reformat code (phofl)
20e62e6 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
e92ab7a Add comments and refactor code (phofl)
7ffa07a Adress review (phofl)
76ded89 Merge branch 'master' of https://github.com/pandas-dev/pandas into 36289 (phofl)
0af939d Fix merge introduced missing object (phofl)
@@ -2939,32 +2939,44 @@ def _union(self, other: Index, sort):
         lvals = self._values
         rvals = other._values
 
-        if sort is None and self.is_monotonic and other.is_monotonic:
+        if (
+            sort is None
+            and self.is_monotonic
+            and other.is_monotonic
+            and not (self.has_duplicates and other.has_duplicates)
+        ):
+            # Both are unique and monotonic, so can use outer join
             try:
-                result = self._outer_indexer(lvals, rvals)[0]
+                return self._outer_indexer(lvals, rvals)[0]
             except (TypeError, IncompatibleFrequency):
                 # incomparable objects
-                result = list(lvals)
+                value_list = list(lvals)
 
                 # worth making this faster? a very unusual case
                 value_set = set(lvals)
-                result.extend([x for x in rvals if x not in value_set])
-                result = Index(result)._values  # do type inference here
-        else:
-            # find indexes of things in "other" that are not in "self"
-            if self.is_unique:
-                indexer = self.get_indexer(other)
-                missing = (indexer == -1).nonzero()[0]
-            else:
-                missing = algos.unique1d(self.get_indexer_non_unique(other)[1])
+                value_list.extend([x for x in rvals if x not in value_set])
+                return Index(value_list)._values  # do type inference here
 
-            if len(missing) > 0:
-                other_diff = algos.take_nd(rvals, missing, allow_fill=False)
-                result = concat_compat((lvals, other_diff))
+        elif not other.is_unique and not self.is_unique:
+            # self and other both have duplicates
+            result = algos.union_with_duplicates(lvals, rvals)
+            return _maybe_try_sort(result, sort)
 
-            else:
-                result = lvals
+        # Either other or self is not unique
+        # find indexes of things in "other" that are not in "self"
+        if self.is_unique:
+            indexer = self.get_indexer(other)
+            missing = (indexer == -1).nonzero()[0]
+        else:
+            missing = algos.unique1d(self.get_indexer_non_unique(other)[1])
+
+        if len(missing) > 0:
+            other_diff = algos.take_nd(rvals, missing, allow_fill=False)
+            result = concat_compat((lvals, other_diff))
+        else:
+            result = lvals
 
+        if not self.is_monotonic or not other.is_monotonic:
             result = _maybe_try_sort(result, sort)
 
         return result

Review comment on the new "if (" line: can you add a comment here on when these branches are taken (similar to what you did below).
Reply: Done
jreback marked the conversations on the new condition and on the "elif not other.is_unique and not self.is_unique:" line as resolved.
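The review comment asks when each branch is taken. As a rough illustration, the conditions in the hunk can be restated in plain Python. This is only a sketch: `pick_union_branch` and its two helpers are hypothetical names that do not appear in the PR, plain lists stand in for `Index` objects, and "monotonic" is assumed here to mean non-decreasing.

```python
from typing import Optional, Sequence


def is_monotonic_increasing(values: Sequence) -> bool:
    """True if each element is >= the one before it (repeats allowed)."""
    return all(a <= b for a, b in zip(values, values[1:]))


def has_duplicates(values: Sequence) -> bool:
    """True if any value occurs more than once."""
    return len(set(values)) != len(values)


def pick_union_branch(left: Sequence, right: Sequence, sort: Optional[bool]) -> str:
    """Hypothetical helper: name the branch the rewritten _union would take."""
    both_monotonic = is_monotonic_increasing(left) and is_monotonic_increasing(right)
    both_have_dups = has_duplicates(left) and has_duplicates(right)

    # Branch 1: sort is None, both sides monotonic, at most one side has duplicates.
    if sort is None and both_monotonic and not both_have_dups:
        return "outer_indexer"
    # Branch 2: both sides contain duplicates.
    if both_have_dups:
        return "union_with_duplicates"
    # Branch 3: everything else; append the values of `right` missing from `left`.
    return "append_missing"
```

Note that the first branch still accepts inputs where exactly one side has duplicates, because the added condition only excludes the case where both sides do.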
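The hunk only shows the call to `algos.union_with_duplicates`, not its body. One plausible reading of a union of two non-unique sequences is that each value is kept as many times as it occurs on the side with more copies. The sketch below implements that reading in plain Python. The function name `union_keep_max_count` is hypothetical, and the max-count semantics and first-seen ordering are assumptions, not something confirmed by the diff.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def union_keep_max_count(
    left: Sequence[Hashable], right: Sequence[Hashable]
) -> List[Hashable]:
    """Union of two sequences, keeping max(count in left, count in right) copies.

    Assumed semantics, for illustration only; values are emitted in the order
    they are first seen, scanning `left` and then `right`.
    """
    left_counts = Counter(left)
    right_counts = Counter(right)

    result: List[Hashable] = []
    seen = set()
    for value in list(left) + list(right):
        if value in seen:
            continue  # all copies of this value were already emitted
        seen.add(value)
        copies = max(left_counts[value], right_counts[value])
        result.extend([value] * copies)
    return result


print(union_keep_max_count([1, 1, 2], [1, 2, 2, 3]))  # [1, 1, 2, 2, 3]
```

Under this reading the result does not depend on how many duplicates the two sides share, which is the kind of consistency the PR title refers to.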