DOC: update the DataFrame.stack docstring #20430

Merged

samuelsinayoko
Contributor

Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):

  • PR title is "DOC: update the DataFrame.stack docstring"
  • The validation script passes: scripts/validate_docstrings.py <your-function-or-method>
  • The PEP8 style check passes: git diff upstream/master -u -- "*.py" | flake8 --diff
  • The html version looks good: python doc/make.py --single <your-function-or-method>
  • It has been proofread on language by another sprint participant

Please include the output of the validation script below between the "```" ticks:

# paste output of "scripts/validate_docstrings.py <your-function-or-method>" here
# between the "```" (remove this comment, but keep the "```")

If the validation script still gives errors, but you think there is a good reason
to deviate in this case (and there are certainly such cases), please state this
explicitly.

Checklist for other PRs (remove this part if you are doing a PR for the pandas documentation sprint):

  • closes #xxxx
  • tests added / passed
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Member

datapythonista left a comment

Great work. Added some comments about formatting, and some that in my opinion should make the examples better.

-----
The function is named by analogy with a stack of books
(levels) being re-organised from a horizontal position (column
levels) to a vertical position (index levels).

Examples
Member

Examples section should go at the end, after Returns and See Also.

Contributor Author

Fixed in 99734ac

onto column axis.
DataFrame.pivot: reshape dataframe from long format to wide
format.
DataFrame.pivot_table: create a spreadsheet-style pivot table
Member

There should be a space before the colon, and the description should start with a capital letter.

Contributor Author

Fixed in 652f7b2
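
As a minimal sketch of the See Also layout requested in this thread (`name : Description`, with the description capitalised), the snippet below attaches such a section to a placeholder function. The function name `reshape_placeholder` and its summary line are hypothetical; the two entries reuse the descriptions quoted in the diff above, and the second one is left as short as it appears there.

```python
def reshape_placeholder():
    """
    Placeholder summary line (hypothetical, for illustration only).

    See Also
    --------
    DataFrame.pivot : Reshape dataframe from long format to wide
        format.
    DataFrame.pivot_table : Create a spreadsheet-style pivot table
    """


# Pull out the entry lines of the See Also section to inspect their layout.
see_also = reshape_placeholder.__doc__.split("--------")[1]
entries = [line.strip() for line in see_also.splitlines() if " : " in line]
print(entries)
```

Each entry line keeps a space on both sides of the colon, and continuation lines are indented under the entry they belong to.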

b 1
two a 2
b 3
dtype: int64
Member

It's just a personal opinion, but I think defining all the data first with descriptive names makes the examples a bit more complex to understand.

We could have a separate section for each case, with a title in bold (surrounding the text with double stars), followed by the data creation, using simply df in all cases.

Also, I think real-world data would make the example easier to understand. As a, b... don't have a meaning, it's a bit harder to see what's going on.

A minor thing: when creating the data, I think it makes more sense to define each row as a tuple rather than as a list.

For example:

**Single level**

>>> df = pd.DataFrame([(8, 12), (22, 35)],
...                   index=['cat', 'dog'],
...                   columns=['weight', 'max_speed'])
>>> df

>>> df.stack()

Contributor Author

samuelsinayoko commented Mar 22, 2018

👍 split the examples in several sections in 15902ed
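
The single-level example suggested in the review leaves its output blank. As a hedged sketch, the same frame can be run as a plain script to see what `stack()` returns; the checks in the comments describe what pandas is expected to produce for this particular frame.

```python
import pandas as pd

# Same data as in the review suggestion: one tuple per row.
df = pd.DataFrame([(8, 12), (22, 35)],
                  index=['cat', 'dog'],
                  columns=['weight', 'max_speed'])

# With single-level columns, stack() returns a Series whose index has
# two levels: the original row label and the former column label.
stacked = df.stack()

print(type(stacked).__name__)            # Series
print(stacked.index.nlevels)             # 2
print(stacked[('dog', 'max_speed')])     # 35
```

The values come out row by row: both measurements for `cat`, then both for `dog`.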

codecov bot commented Mar 21, 2018

Codecov Report

Merging #20430 into master will increase coverage by 0.04%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #20430      +/-   ##
==========================================
+ Coverage    91.8%   91.85%   +0.04%     
==========================================
  Files         152      152              
  Lines       49215    49231      +16     
==========================================
+ Hits        45181    45220      +39     
+ Misses       4034     4011      -23
Flag        Coverage Δ
#multiple   90.23% <100%> (+0.04%) ⬆️
#single     41.83% <66.66%> (-0.01%) ⬇️

Impacted Files                        Coverage Δ
pandas/core/frame.py                  97.18% <100%> (ø) ⬆️
pandas/core/arrays/categorical.py     96.2% <0%> (-0.02%) ⬇️
pandas/core/base.py                   96.78% <0%> (ø) ⬆️
pandas/core/indexes/datetimelike.py   96.72% <0%> (ø) ⬆️
pandas/core/series.py                 93.84% <0%> (ø) ⬆️
pandas/core/panel.py                  97.29% <0%> (ø) ⬆️
pandas/core/generic.py                95.85% <0%> (ø) ⬆️
pandas/core/indexes/category.py       97.3% <0%> (ø) ⬆️
pandas/core/indexes/base.py           96.68% <0%> (ø) ⬆️
... and 8 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 01882ba...5bc794c.

pep8speaks commented Mar 22, 2018

Hello @samuelsinayoko! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on March 26, 2018 at 13:05 Hours UTC

column labels) having a hierarchical index with a new inner-most level
of row labels.
The level involved will automatically get sorted.
Stack the prescribed level(s) from the column axis onto the index
Contributor

This should be a single line. Can you shorten by "column axis" -> "columns" and "index axis" -> "index"?

Contributor Author

Thanks for the review. Fixed in a2c9b1a.

I've also modified the description in the Notes section. It was never completely clear to me why the method was called stack (I was imagining the column as a board being moved from a horizontal position to a vertical position, whereas I think the name comes from a collection of items being moved from a side-by-side position to a stack), so I've tried to explain that in the Notes section. Hope it makes sense!
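
To make the side-by-side-to-stacked picture concrete, here is a small sketch with two-level columns. The frame, its labels and its values are illustrative assumptions rather than content taken from the PR.

```python
import pandas as pd

# Two column levels: a measurement name and its unit (illustrative data).
columns = pd.MultiIndex.from_tuples([('weight', 'kg'), ('weight', 'pounds')])
df = pd.DataFrame([(1, 2), (2, 4)], index=['cat', 'dog'], columns=columns)

# By default, stack() moves the inner-most column level into the index,
# so the units that sat side by side now sit one above the other.
stacked = df.stack()

print(df.columns.nlevels, stacked.columns.nlevels)   # 2 1
print(stacked.index.nlevels)                         # 2
print(stacked.loc[('dog', 'pounds'), 'weight'])      # 4
```

Because one column level remains after stacking, the result here is a DataFrame with a single `weight` column rather than a Series.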

@samuelsinayoko
Contributor Author

@datapythonista Nice seeing you at the London meetup this week, thanks again for organising. Are you happy with the latest changes?

TomAugspurger added this to the 0.23.0 milestone on Mar 26, 2018
TomAugspurger added the Reshaping (Concat, Merge/Join, Stack/Unstack, Explode) label on Mar 26, 2018
TomAugspurger merged commit 402ad45 into pandas-dev:master on Mar 26, 2018
@TomAugspurger
Contributor

Thanks @samuelsinayoko !

ZackStone pushed a commit to ZackStone/pandas that referenced this pull request Mar 26, 2018
javadnoorb pushed a commit to javadnoorb/pandas that referenced this pull request Mar 29, 2018
dworvos pushed a commit to dworvos/pandas that referenced this pull request Apr 2, 2018
kornilova203 pushed a commit to kornilova203/pandas that referenced this pull request Apr 23, 2018
Labels: Docs, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)