
REF/CLN: test_get_dummies #33184

Merged: 4 commits, Apr 1, 2020

Conversation

jbrockmendel
Member

No description provided.

@jreback added the Testing label (pandas testing functions or related to the test suite) Mar 31, 2020
@jreback added this to the 1.1 milestone Mar 31, 2020
@jreback
Contributor

jreback commented Mar 31, 2020

lgtm. merge on green.

Member

@jschendel jschendel left a comment

lgtm, one small comment

Member

@datapythonista datapythonista left a comment

Is there an ongoing effort to replace pd.DataFrame by DataFrame in our code and to be redundant with the test method names (TestGetDummies.test_basic by TestGetDummies.test_get_dummies_basic)?

Replacing loops by parametrization improves readability, but these other changes look very personal and opinionated, and if we want to make them a standard, it would be nice to have an issue referenced here. Otherwise we can end up in a situation where every time someone works with a file, we'll get some pd. added and removed (for example, I would personally prefer to always have the module name and not import objects).
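
As an illustration of the loop-versus-parametrization point, here is a minimal sketch. It is not the code from this PR: the helper `check_dummies` and the input data are hypothetical, chosen only to show the shape of the change.

```python
import pandas as pd
import pytest


def check_dummies(sparse):
    # Hypothetical shared assertions: one indicator column per distinct value.
    result = pd.get_dummies(pd.Series(list("abca")), sparse=sparse)
    assert list(result.columns) == ["a", "b", "c"]
    assert result.shape == (4, 3)


def test_get_dummies_basic_loop():
    # Loop style: a failure for sparse=True stops the test before
    # sparse=False runs, and the report shows a single test name.
    for sparse in [True, False]:
        check_dummies(sparse)


@pytest.mark.parametrize("sparse", [True, False])
def test_get_dummies_basic(sparse):
    # Parametrized style: pytest collects one test per value and
    # reports each case independently.
    check_dummies(sparse)
```

Under pytest, the parametrized version should show up as two separate test IDs, one per `sparse` value, which is what makes a failing case easier to spot.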

@jbrockmendel
Member Author

and to be redundant with the test method names (TestGetDummies.test_basic by TestGetDummies.test_get_dummies_basic)?

In this case it is redundant, but in many other cases we'll see something like test_basic in a class without an informative name. It's also not uncommon to see tests for something other than get_dummies living inside a TestGetDummies class. I think being explicit is worthwhile here.

Is there an ongoing effort to replace pd.DataFrame by DataFrame in our code

Not that I'm aware of. For files testing lower-level code, I like to have all the relevant imports at the top so that it is easy to tell what is tested in this file. If pd is in the namespace, I can't rule anything out. Not really relevant for this file, just a habit I've gotten into.
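
For readers unfamiliar with the two styles being compared, a small sketch follows. The test names and data are made up for illustration; both tests exercise the same `get_dummies` call and differ only in how the pandas objects are referenced.

```python
# Style A: import the objects under test by name, so the import block
# at the top of the file lists what the file exercises.
from pandas import DataFrame, get_dummies

# Style B: import the module and qualify every reference with `pd.`.
import pandas as pd


def test_get_dummies_explicit_imports():
    df = DataFrame({"A": ["a", "b", "a"]})
    result = get_dummies(df)
    assert list(result.columns) == ["A_a", "A_b"]


def test_get_dummies_module_namespace():
    df = pd.DataFrame({"A": ["a", "b", "a"]})
    result = pd.get_dummies(df)
    assert list(result.columns) == ["A_a", "A_b"]
```

Both names refer to the same objects (`DataFrame is pd.DataFrame`), so the choice is purely about readability and consistency, not behavior.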

@jbrockmendel
Member Author

updated with typo fixup + green

@WillAyd
Member

WillAyd commented Apr 1, 2020

(for example, I would personally prefer to always have the module name and not import objects).

Yea @jorisvandenbossche and I agree, but at the same time I think someone needs to champion that cause if we want to establish a standard

@WillAyd WillAyd merged commit d8d1dc9 into pandas-dev:master Apr 1, 2020
@WillAyd
Member

WillAyd commented Apr 1, 2020

Thanks @jbrockmendel

@jreback
Contributor

jreback commented Apr 1, 2020

(for example, I would personally prefer to always have the module name and not import objects).

Yea @jorisvandenbossche and I agree, but at the same time I think someone needs to champion that cause if we want to establish a standard

i have always been in favor of this :)

moving to this will require a good code check though - so it may not be that easy

@jbrockmendel jbrockmendel deleted the tst-get_dummies branch April 1, 2020 00:51
@jorisvandenbossche
Member

jorisvandenbossche commented Apr 1, 2020

IMO this doesn't necessarily require a code check initially.
But we could already just stop merging PRs that make changes like pd.DataFrame -> DataFrame (or ask them to revert that part). @WillAyd left comments like #33184 (comment) on several PRs recently I think, but in the end we always just merged the PR.

@jreback
Contributor

jreback commented Apr 1, 2020

I care most about consistency in a single file meaning we shouldn’t mix ways of referencing common things like Series vs pd.Series

I do prefer using import of Series and not pd.Series
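
A per-file consistency check of the kind mentioned above could be sketched roughly as follows. This is a hypothetical illustration using only the standard library, not an existing pandas script: the function name `find_mixed_references`, the default list of names, and the assumption that pandas is imported under the alias `pd` are all assumptions made here.

```python
import ast


def find_mixed_references(source, names=("Series", "DataFrame")):
    """Return names used both as `pd.X` and as bare `X` in one source string."""
    tree = ast.parse(source)
    qualified = set()  # names seen as pd.<name>
    bare = set()       # names seen as a plain identifier being read
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "pd"
            and node.attr in names
        ):
            qualified.add(node.attr)
        elif (
            isinstance(node, ast.Name)
            and node.id in names
            and isinstance(node.ctx, ast.Load)
        ):
            bare.add(node.id)
    return sorted(qualified & bare)


example = (
    "import pandas as pd\n"
    "from pandas import Series\n"
    "x = Series([1])\n"
    "y = pd.Series([2])\n"
    "z = pd.DataFrame()\n"
)
print(find_mixed_references(example))  # ['Series']
```

Here `Series` is flagged because it appears both ways in the same file, while `DataFrame` is only ever referenced as `pd.DataFrame` and so passes.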

@jorisvandenbossche
Member

I do prefer using import of Series and not pd.Series

I am confused now. I interpreted the above "i have always been in favor of this" as the opposite (#33184 (comment)), since that is replying to a quote of Will that is basically saying that he, Marc and I prefer to not import objects but modules (so using pd.Series instead of Series)

@jorisvandenbossche
Member

I think someone needs to champion that cause if we want to establish a standard

I opened a separate issue about it: #33203. Let's try to decide there which of the two ways has our preference.

Labels
Clean, Testing (pandas testing functions or related to the test suite)

6 participants