-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
split up pandas/tests/indexes/test_multi.py #18644 #21514
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #21514 +/- ##
=======================================
Coverage 91.92% 91.92%
=======================================
Files 158 158
Lines 49703 49703
=======================================
Hits 45689 45689
Misses 4014 4014
Continue to review full report at Codecov.
|
|
||
|
||
@pytest.fixture | ||
def _index(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to prefix with '_'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The non-fixture variable "index" is used in many of these tests. Several of these also use the fixture "index" which I dubbed _index in an attempt to distinguish them.
I suggest using something like m_index or index_fixture to provide some separation.
Thoughts?
from pandas.core.indexes.datetimelike import DatetimeIndexOpsMixin | ||
|
||
|
||
def verify_pickle(indices): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't create functions like this, inline it instead (unless there is a lot of usage about this, but indicate that)
# GH11193, when an existing index is passed, and a new name is not | ||
# specified, the new index should inherit the previous object name | ||
expected = _index | ||
if not isinstance(expected, MultiIndex): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why would this NOT be a MI?
@@ -0,0 +1,938 @@ | |||
# -*- coding: utf-8 -*- | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as long as you are splitting, would not create a test_base, rather create functional named files, similar to how pandas/tests/indexes/datetimes
; since you already have some functional area files below, just distribute these
assert not _index.is_numeric() | ||
|
||
|
||
def test_bounds(_index): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where did this come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume the file itself, correct? I somewhat started over, and inretrospect, I really shouldn't have. I can more closely follow Xavier's orginal stucture or that of another similar directory.
expected = mi.get_level_values(level) | ||
|
||
|
||
def test_multiindex_compare(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move compare/equal/identical to a new file
|
||
idx = MultiIndex.from_arrays([['qux', 'baz', 'foo', 'bar'], np.arange( | ||
4)]) | ||
result = idx.isin(values) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isin should be in indexing
level='first') | ||
|
||
|
||
def test_reindex_preserves_names_when_target_is_list_or_ndarray(_index): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call this test_indexing
@jreback thank you for the quick feedback. I will make updates over the next day or two. |
0171ac6
to
275c653
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking really good. a couple of comments / questions.
assert result == i | ||
|
||
|
||
def test_dims(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next PR can come back and remove this
first.tolist() - idx[-3:] | ||
|
||
|
||
def test_nlevels(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be in integrity
# GH 21149 | ||
# Ensure comparison operations for MultiIndex with nlevels == 1 | ||
# behave consistently with those for MultiIndex with nlevels > 1 | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to equivalence
tm.assert_raises_regex(ValueError, "Duplicated level name:", | ||
mi.rename, names[:2], level=[0, 2]) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you split out duplicated & uniques tests to a separate file
@@ -0,0 +1,88 @@ | |||
# -*- coding: utf-8 -*- | |||
import numpy as np | |||
import pandas.util.testing as tm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have a couple of tests for unsorted in integrity, can you move here
return index | ||
|
||
|
||
@pytest.fixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that you are iterating over this in a number of tests. why is that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's more or less a holdover from how the inherited base test class was implemented. I was thinking I'd leave it in case anyone wanted to extend some of these cases with another test multi-index, but yes it looks kind of sloppy. I'll refactor it and remove the duplicate fixture.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, done with that, but some tests are not relevant to the MultiIndex class or will need to be refactored. I marked those tests with a TODO. The tests that I could refactor have been. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep that's ok. i expect after your refactor will need to go over this again and clean a bit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a number of comments, but should be straightforward. ping on green.
|
||
@pytest.fixture | ||
def idx(): | ||
major_axis = Index(['foo', 'bar', 'baz', 'qux']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment describing the fixture (always do this for fixtures)
return ['first', 'second'] | ||
|
||
|
||
@pytest.fixture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok w/o using these a leading _ (e.g. holder) is ok
def f(): | ||
if idx: | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you change to do this with with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not following here. What is the need to use a context manager here? None of the functions seems to have a valid __enter__
or __exit__
methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want something like with pytest.raises(ValueError):
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep exactly
def f(): | ||
if common: | ||
pass | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
|
||
|
||
def test_reconstruct_sort(): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these 2 should be in test_sorting (reconstrut_*)
assert not ind.is_monotonic | ||
|
||
|
||
def test_partial_string_timestamp_multiindex(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partial string should be in test_indexing, or test_timeseries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved this to test_indexing.
@pytest.mark.parametrize('first_type,second_type', [ | ||
('int64', 'int64'), | ||
('datetime64[D]', 'str')]) | ||
def test_remove_unused_levels_large(first_type, second_type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move this to test_sorting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
['a', 'd', 'b', 'unused']]) | ||
@pytest.mark.parametrize('level1', [['w', 'x', 'y', 'z'], | ||
['w', 'x', 'y', 'z', 'unused']]) | ||
def test_remove_unused_nan(level0, level1): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to test_sorting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved it, but why not in test_missing.py? Is it because of the implementation of the underlying algorithm?
(u('x'), u('out'), u('z'), 7, u('y'), u('in'), u('z'), 119), | ||
(u('x'), u('out'), u('z'), 9, u('y'), u('in'), u('z'), 135), | ||
(u('x'), u('out'), u('z'), 13, u('y'), u('in'), u('z'), 145), | ||
(u('x'), u('out'), u('z'), 14, u('y'), u('in'), u('z'), 158), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just rename module to: test_unique_and_duplicates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
assert idx[:0].empty | ||
|
||
|
||
def test_unique_na(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move test_unique_and_duplicates
6b3b36c
to
edfc1ad
Compare
So, someone removed the get_data_path from the pandas.utils.testing. I forgot to run pytest after I rebased. I can inline the function if need be. The bigger issue is the rebase broke my ability to run pytest even on master. Any thoughts on how to deal with the below?
|
You need to rebuild the Cython extensions after pulling from master as there have been recent changes with those |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
last round! thanks for this. ping on green.
|
||
|
||
@pytest.mark.xfail | ||
def test_hasnans_isnans(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move to test_missing
if len(idx) == 0: | ||
pass | ||
elif isinstance(idx, DatetimeIndexOpsMixin): | ||
values[1] = iNaT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can simplify this here
assert result.names == index.names | ||
|
||
|
||
@pytest.mark.skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs to be @pytest.mark.skipif(PY3, reason=......)
from pandas.compat import lrange, range | ||
|
||
|
||
def test_get_loc(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_loc* also goes to test_indexing
|
||
pytest.raises(KeyError, idx.get_loc, (False, True)) | ||
pytest.raises(KeyError, idx.get_loc, (True, False)) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thru here
|
||
arrays = [[], []] | ||
index = pd.MultiIndex.from_arrays(arrays) | ||
result = index.get_level_values(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i indicated some more to move
assert result == (0, len(index)) | ||
|
||
|
||
def test_shift(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to test_analytics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the "I indicated some more to move" above, my impression is to move all the test_get_level_* to test_indexing.py. Is that correct?
assert indexer.dtype == np.intp | ||
|
||
|
||
def test_partial_string_timestamp_multiindex(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you move test_partial to a new file: test_partial_indexing.py
|
||
|
||
def test_nan_stays_float(): | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to test_missing
tm.assert_numpy_array_equal(index.notna(), ~result) | ||
|
||
|
||
@pytest.mark.parametrize('level', [0, 1]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these last 2 here move to test_indexing
Will do thanks!
…On Tue, Jun 26, 2018 at 12:52 PM, William Ayd ***@***.***> wrote:
You need to rebuild the Cython extensions after pulling from master as
there have been recent changes with those
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#21514 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADSn4F5qlxelBvmPjD6W-vs43sPeqq8Iks5uAnTkgaJpZM4Uq6xt>
.
|
I think the datapath fixture is a step backward from the get_data_path which was relative to the test directory. I can't run pytest from the directly from the tests/indexes/mulit directory successfully anymore. I now have to run it from the root directory if I want the test to pass. This should probably be relative to the director of the test like before. |
edfc1ad
to
bd4a635
Compare
hmm. I almost always run at the root directory. but I suppose this could be addressed as well. can you open a separate issue for this? |
How does pytest handle fixtures when you don't run from the root of the tests dir? |
I don't think it is necessarily a pytest issue. An absolute path would
just be better since it makes the working directory irrelevant. Something
like os.path.dir(__file__) to resolve this would be relatively straight
forward.
…On Tue, Jun 26, 2018, 2:18 PM Tom Augspurger ***@***.***> wrote:
How does pytest handle fixtures when you don't run from the root of the
tests dir?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#21514 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ADSn4OBcTwxX3X8AMs7wiAADqPtZdpeGks5uAokIgaJpZM4Uq6xt>
.
|
We need a fixture so that these tests are skipped / failed (depending on
the config option) when the file isn't present.
What exception do you get?
On Tue, Jun 26, 2018 at 2:38 PM, LeakedMemory <[email protected]>
wrote:
… I don't think it is necessarily a pytest issue. An absolute path would
just be better since it makes the working directory irrelevant. Something
like os.path.dir(__file__) to resolve this would be relatively straight
forward.
On Tue, Jun 26, 2018, 2:18 PM Tom Augspurger ***@***.***>
wrote:
> How does pytest handle fixtures when you don't run from the root of the
> tests dir?
>
> —
> You are receiving this because you authored the thread.
> Reply to this email directly, view it on GitHub
> <#21514 (comment)
>,
> or mute the thread
> <https://github.com/notifications/unsubscribe-auth/
ADSn4OBcTwxX3X8AMs7wiAADqPtZdpeGks5uAokIgaJpZM4Uq6xt>
> .
>
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#21514 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABQHIgH1RRF_RYFFOQHTzMMM__rMXUEbks5uAo2qgaJpZM4Uq6xt>
.
|
I will try to get to the last round of comments later this week. I pushed some fixes prior to seeing the comments. |
@TomAugspurger I see where you're going with the fixture. The approach definitely makes sense, but I just like that absolute file paths let me run code regardless of where I am on the command line. I opened an issue, #21646, and submitted a pull request to address it. Thanks. |
thanks @elmq0022 ping when you want me to look. |
bd4a635
to
ce18bcd
Compare
ce18bcd
to
8f360d9
Compare
@jreback looks like everything is green. I believe all the comments are addressed. I also included the last updates to test_multi.py in test unique and duplicates. |
@jreback it looks like there were a couple of tests added that I need to include and rebase. Do you have any additional feedback. I would like to get this merged as soon as possible so everything can stay in sync. Thanks! |
yes I agree. pls rebase. ping as soon as green and can merge. |
test_nlevels -> test_integrity.py unsort tests -> test_sorting.py duplicates and unique test -> test_unqi_dups.py
8f360d9
to
baaace3
Compare
@jreback I just pushed the rebased code. Will check back in a half hour or so to see if everything is green. Thanks! |
great thanks! I will go thru and possibly make some comments. but take them as for a followup. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bunch of comments. but let's do these in a follow up please. thanks!
tm.assert_index_equal(result, expected) | ||
|
||
|
||
def test_from_arrays_invalid_input(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parameterize
pytest.raises(TypeError, MultiIndex.from_arrays, arrays=i) | ||
|
||
|
||
def test_from_arrays_different_lengths(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parameterize
labels=[[], [], []], names=names) | ||
tm.assert_index_equal(result, expected) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parametrize
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broke into several parameterized tests as the length of the names and the levels were different in each. Trying to do it all in one level seemed clunky to me.
|
||
|
||
def test_from_product_index_series_categorical(): | ||
# GH13743 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parametrize
index = idx | ||
index.names = ['foo', 'bar'] | ||
result = pd.Index(index) | ||
tm.assert_index_equal( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
write the expected out on a separate line
idx.astype('category') | ||
|
||
|
||
def test_repeat(): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to test_analytics
first.tolist() - idx[-3:] | ||
|
||
|
||
def test_argsort(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to the sorting tests
tm.assert_numpy_array_equal(result, expected) | ||
|
||
|
||
def test_map(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move map tests to test_analytics
with np.errstate(all='ignore'): | ||
func(idx) | ||
|
||
for func in [np.isfinite, np.isinf, np.isnan, np.signbit]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
want to split & parameterize these tests
tm.assert_index_equal(result, expected) | ||
|
||
|
||
def test_numpy_ufuncs(idx): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just combine test_operations and test_analytics (call it test_analytics)
thanks @elmq0022 very very nice! great job! if you can do a followup to address comments (and some TODO's you had in the code would be much appreciated) |
git diff upstream/master -u -- "*.py" | flake8 --diff
The file is split out, the number of tests remains 377 (2 skips on my machine; all other test cases pass), the code uses fixtures in all instances, new fixtures are in a local conftest.py file, and there are no dependencies on classes.
I'm happy to receive and address any feedback.