Skip to content

DOC: update the dtypes/ftypes docstring (Seoul) #20100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 12, 2018
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
49 changes: 48 additions & 1 deletion pandas/core/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -4275,7 +4275,29 @@ def get_ftype_counts(self):

@property
def dtypes(self):
"""Return the dtypes in this object."""
"""
Return the dtypes in this object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain in more depth to a novice user that this is used to get the dtypes per column of the DataFrame.


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a Returns section as specified in the guide?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a 'See Also' section to contemplate common dtypes.

Notes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can remove the "Notes" header, and just make this the extended summary.

-----
It returns a Series with the data type of each column.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe replace "it" with "This method". And let's say what the values and index is.

This returns a Series with the data type of each column. The result's index is the
original DataFrame's columns.

Let's also replace all instances of "object" with "DataFrame". I'm not sure why this is in generic.py since I think it's specific to DataFrame.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm modifying dtype @property of the NDFrame in generic.py
Is "This property" okay instead of "This method"?

At first, object was NDFrame, but found an error.

Private classes (['NDFrame']) should not be mentioned in public docstring.

but it seems to be better that the way you specify with "DataFrame", I will do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Create a 'Returns' section to explain the output. Notes should be used only to explain technical details about the implementation of the algorithm or function behavior.

If object contains data multiple dtypes in a single column,
dtypes will be chosen to accommodate all of the data types.
``object`` is the most general.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is worth explaining that str will be represented as object.


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe in a See Also section ftypes could be mentioned.

Examples
--------
>>> df = pd.DataFrame({'f': pd.np.random.rand(3),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just write out three floating point values? I'd like to avoid random data.

And FYI in general you don't want to use pd.np. You'll want to import NumPy manually (it's assumed to be imported in our docstrings.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, and could you just make all these length-1 lists, just to be clearer? so {'f': [1.0], 'i': [1], ...}

... 'i': 1,
... 'd': pd.Timestamp('20180310'),
... 'o': 'foo'})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add in See Also to Series.dtype

>>> df.dtypes
f float64
i int64
d datetime64[ns]
o object
dtype: object
"""
from pandas import Series
return Series(self._data.get_dtypes(), index=self._info_axis,
dtype=np.object_)
Expand All @@ -4285,6 +4307,31 @@ def ftypes(self):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would better organized it there was pull request for dtypes and another pull request for ftypes.

Return the ftypes (indication of sparse/dense and dtype)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring guide asks that the summary should fit in a single line. Could you rephrase it that way? If all information could not fit in a single line you can then use an Extended Summary section.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Short summary should have 1 line.

in this object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a return section.

Notes
-----
Sparse data should have the same dtypes as its dense representation
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End in a .


See Also
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'See Also' should go before notes.

--------
dtypes, SparseDataFrame
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of simply dtypes you should use pandas.DataFrame.dtypes. You can even try linking to dtypes with the notation found in the guide: :meth:`pandas.Series.sum`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make a short description for the itens cited in the summary.


Examples
--------
>>> arr = pd.np.random.randn(100, 4)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arr = np.random.RandomState(0).randn(100, 4) for reproducibility

>>> arr[arr < .8] = pd.np.nan
>>> pd.DataFrame(arr).ftypes
0 float64:dense
1 float64:dense
2 float64:dense
3 float64:dense
dtype: object
>>> pd.SparseDataFrame(arr).ftypes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a blank line before this to break things up a bit.

0 float64:sparse
1 float64:sparse
2 float64:sparse
3 float64:sparse
dtype: object
"""
from pandas import Series
return Series(self._data.get_ftypes(), index=self._info_axis,
Expand Down