ENH: Implement DataFrame.value_counts #31247
Merged
44 commits
0830e36
Add value_counts tests
d946e93
Update docs
7d9306d
Start implementing value_counts
2e58db4
Set MultiIndex name
25d7f2f
Format
aa96c98
Sort imports
aef75ae
Remove typing for now
acb81cc
Simplify test a little
786de34
Remove single col example for now
7eba59a
Update error for bins
60554e9
Update pandas/core/frame.py
dsaxton 4c4e858
Update pandas/core/frame.py
dsaxton 07f0e76
Import Label type
d055b5c
Merge remote-tracking branch 'upstream/master' into df-val-counts
957a8ec
Make Sequence optional
4fee5e0
Fix docstring
b8f4126
Clean docstring
310c688
Update to comments
a266021
Add to Series See Also
d738bf7
Update tests and add back tolist
2618220
Don't import pytest
98e7e5b
Merge branch 'master' into df-val-counts
1ab2aeb
Merge branch 'master' into df-val-counts
a97347f
Merge branch 'master' into df-val-counts
e12117e
Merge branch 'master' into df-val-counts
9e75083
Merge branch 'master' into df-val-counts
0d46697
Add to basics.rst
81991a1
Move to Notes
d618677
Merge branch 'master' into df-val-counts
85bc213
Rename to avoid doc error
425ef73
Merge branch 'master' into df-val-counts
b03978c
Merge branch 'master' into df-val-counts
de043d9
Merge branch 'master' into df-val-counts
2f0f46d
Merge branch 'master' into df-val-counts
5544716
Merge branch 'master' into df-val-counts
12898ad
Merge branch 'master' into df-val-counts
d743ac2
Merge remote-tracking branch 'upstream/master' into df-val-counts
dsaxton f7c3abe
Merge remote-tracking branch 'upstream/master' into df-val-counts
dsaxton c297143
Merge remote-tracking branch 'upstream/master' into df-val-counts
dsaxton 47683ad
Add See Also
dsaxton e60de83
Move tests
dsaxton 3903a4d
versionadded
dsaxton 9ee6e0e
Merge remote-tracking branch 'upstream/master' into df-val-counts
dsaxton de40484
Merge remote-tracking branch 'upstream/master' into df-val-counts
dsaxton
@@ -0,0 +1,102 @@
import numpy as np

import pandas as pd
import pandas._testing as tm


def test_data_frame_value_counts_unsorted():
    df = pd.DataFrame(
        {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
        index=["falcon", "dog", "cat", "ant"],
    )

    result = df.value_counts(sort=False)
    expected = pd.Series(
        data=[1, 2, 1],
        index=pd.MultiIndex.from_arrays(
            [(2, 4, 6), (2, 0, 0)], names=["num_legs", "num_wings"]
        ),
    )

    tm.assert_series_equal(result, expected)


def test_data_frame_value_counts_ascending():
    df = pd.DataFrame(
        {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
        index=["falcon", "dog", "cat", "ant"],
    )

    result = df.value_counts(ascending=True)
    expected = pd.Series(
        data=[1, 1, 2],
        index=pd.MultiIndex.from_arrays(
            [(2, 6, 4), (2, 0, 0)], names=["num_legs", "num_wings"]
        ),
    )

    tm.assert_series_equal(result, expected)


def test_data_frame_value_counts_default():
    df = pd.DataFrame(
        {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
        index=["falcon", "dog", "cat", "ant"],
    )

    result = df.value_counts()
    expected = pd.Series(
        data=[2, 1, 1],
        index=pd.MultiIndex.from_arrays(
            [(4, 6, 2), (0, 0, 2)], names=["num_legs", "num_wings"]
        ),
    )

    tm.assert_series_equal(result, expected)


def test_data_frame_value_counts_normalize():
    df = pd.DataFrame(
        {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
        index=["falcon", "dog", "cat", "ant"],
    )

    result = df.value_counts(normalize=True)
    expected = pd.Series(
        data=[0.5, 0.25, 0.25],
        index=pd.MultiIndex.from_arrays(
            [(4, 6, 2), (0, 0, 2)], names=["num_legs", "num_wings"]
        ),
    )

    tm.assert_series_equal(result, expected)


def test_data_frame_value_counts_single_col_default():
    df = pd.DataFrame({"num_legs": [2, 4, 4, 6]})

    result = df.value_counts()
    expected = pd.Series(
        data=[2, 1, 1],
        index=pd.MultiIndex.from_arrays([[4, 6, 2]], names=["num_legs"]),
    )

    tm.assert_series_equal(result, expected)


def test_data_frame_value_counts_empty():
    df_no_cols = pd.DataFrame()

    result = df_no_cols.value_counts()
    expected = pd.Series([], dtype=np.int64)

    tm.assert_series_equal(result, expected)


def test_data_frame_value_counts_empty_normalize():
    df_no_cols = pd.DataFrame()

    result = df_no_cols.value_counts(normalize=True)
    expected = pd.Series([], dtype=np.float64)

    tm.assert_series_equal(result, expected)
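For readers who want to try the behavior these tests exercise, here is a minimal usage sketch. It assumes a pandas release that already ships `DataFrame.value_counts` (1.1 or later) and reuses the `num_legs` / `num_wings` frame from the tests. The comparison with `groupby(...).size()` is only an illustration of the counting semantics on this data, not a statement about how the method is implemented.

```python
import pandas as pd

df = pd.DataFrame(
    {"num_legs": [2, 4, 4, 6], "num_wings": [2, 0, 0, 0]},
    index=["falcon", "dog", "cat", "ant"],
)

# Count each distinct (num_legs, num_wings) row; the result is a Series
# indexed by a MultiIndex and sorted by count, largest first.
counts = df.value_counts()

# With normalize=True the counts become fractions of the total row count.
proportions = df.value_counts(normalize=True)

# Illustration only: on this frame, grouping on every column and taking
# the group sizes gives the same counts (ordered by key, not by count).
by_group = df.groupby(["num_legs", "num_wings"]).size()

print(counts)
print(proportions)
print(by_group)
```

The order of tied counts is deliberately not relied on here; only the largest count and the per-key lookups are stable things to check.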