-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
[26660] compare int with double by subtracting during describe percen… #26768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[26660] compare int with double by subtracting during describe percen… #26768
Conversation
Codecov Report
@@ Coverage Diff @@
## master #26768 +/- ##
==========================================
- Coverage 91.71% 91.71% -0.01%
==========================================
Files 178 178
Lines 50755 50755
==========================================
- Hits 46552 46548 -4
- Misses 4203 4207 +4
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #26768 +/- ##
==========================================
- Coverage 91.71% 91.71% -0.01%
==========================================
Files 178 178
Lines 50771 50776 +5
==========================================
+ Hits 46567 46570 +3
- Misses 4204 4206 +2
Continue to review full report at Codecov.
|
…mining if indexes are integers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
cc @jreback
a868714
to
894c73e
Compare
0d5450e
to
5a73741
Compare
pandas/io/formats/format.py
Outdated
@@ -1246,7 +1246,7 @@ def format_percentiles(percentiles): | |||
raise ValueError("percentiles should all be in the interval [0,1]") | |||
|
|||
percentiles = 100 * percentiles | |||
int_idx = (percentiles.astype(int) == percentiles) | |||
int_idx = (abs(percentiles.astype(int) - percentiles) < 1e-4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reasoning for using 1e-4? May make more sense just to leverage np.allclose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good insight - it makes code more readable. Instead of np.allclose, I've opted out for np.isclose, which returns array of booleans. Reasoning for that is that further down in the function that mask is used when determining how to format integers (where mask is True) and decimal numbers (where mask is False).
Another approach would be that if at least one percentile is decimal, all percentiles should be represented as decimals. This is currently not the case, for instance if percentiles to describe are: [0, 0.5, 0.334] the method will return ["0%", "50%", "33.4%"]. This approach would yield: ["0.0%", "50.0%", "33.4%"].
I wanted to change as little as possible, so I returned mask of booleans from this expression, but if it fits your business logic to change to second approach, I can modify the code to that behavior also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. This is fine as is - really just wanted to circumvent hard coding thresholds like before. Thanks for reviewing in detail!
@@ -2740,6 +2740,14 @@ def test_format_percentiles(): | |||
fmt.format_percentiles([0.1, 0.5, 'a']) | |||
|
|||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add an additional test that uses .describe, like you did in the OP: #26660 (comment), add in pandas/tests/frame/test_analytics
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the test you mentioned. Please comment if it is not sufficient.
thanks @jkovacevic |
git diff upstream/master -u -- "*.py" | flake8 --diff
I am not sure whether I had put my issue in whatsnew entry in appropriate place.