DOC: update pandas.core.resample.Resampler.nearest docstring #20381

Merged
merged 4 commits into from
Nov 20, 2018
Changes from 3 commits
69 changes: 63 additions & 6 deletions pandas/core/resample.py
@@ -418,23 +418,80 @@ def pad(self, limit=None):

def nearest(self, limit=None):
"""
- Fill values with nearest neighbor starting from center
+ Resample by using the nearest value.

When resampling data, missing values may appear (e.g., when the
resampling frequency is higher than the original frequency).
The nearest fill will replace ``NaN`` values that appeared in
Member

"nearest fill" refers to this method, right? Maybe make that explicit with "nearest fill method"?

the resampled data with the value from the nearest member of the
sequence, based on the index value.
Missing values that existed in the original data will not be modified.
If `limit` is given, fill only `limit` values in each direction for
Member

Maybe the second "limit" could be "this many"? Probably worthwhile to get a non-native speaker to weigh in on what is clearest.

each of the original values.

Parameters
----------
- limit : integer, optional
- limit of how many values to fill
+ limit : int, optional
+ Limit of how many values to fill.

.. versionadded:: 0.21.0

Returns
-------
- an upsampled Series
+ Series or DataFrame
Member

@datapythonista do we have a convention for saying same-type-as-input?

An upsampled Series or DataFrame with ``NaN`` values filled with
their nearest value.

See Also
--------
- Series.fillna
- DataFrame.fillna
backfill: Backward fill the new missing values in the resampled data.
fillna : Fill ``NaN`` values using the specified method, which can be
'backfill'.
pad : Forward fill ``NaN`` values.
pandas.Series.fillna : Fill ``NaN`` values in the Series using the
specified method, which can be 'backfill'.
pandas.DataFrame.fillna : Fill ``NaN`` values in the DataFrame using
Member

The thoroughness is good, but this seems excessive. @datapythonista what's the convention for how much to write in this section?

the specified method, which can be 'backfill'.

Examples
--------
>>> s = pd.Series([1, 2, 3],
... index=pd.date_range('20180101',
... periods=3,
... freq='1h'))
>>> s
2018-01-01 00:00:00 1
2018-01-01 01:00:00 2
2018-01-01 02:00:00 3
Freq: H, dtype: int64

>>> s.resample('20min').nearest()
2018-01-01 00:00:00 1
2018-01-01 00:20:00 1
2018-01-01 00:40:00 2
2018-01-01 01:00:00 2
2018-01-01 01:20:00 2
2018-01-01 01:40:00 3
2018-01-01 02:00:00 3
Freq: 20T, dtype: int64

Limit the number of upsampled values imputed by the nearest:

>>> s.resample('10min').nearest(limit=1)
Contributor

This is a tad long. Can you change it to '20min'? I think that'll still make the point, as the first will be filled and the second will be NaN.

Member

+1

2018-01-01 00:00:00 1.0
2018-01-01 00:10:00 1.0
2018-01-01 00:20:00 NaN
2018-01-01 00:30:00 NaN
2018-01-01 00:40:00 NaN
2018-01-01 00:50:00 2.0
2018-01-01 01:00:00 2.0
2018-01-01 01:10:00 2.0
2018-01-01 01:20:00 NaN
2018-01-01 01:30:00 NaN
2018-01-01 01:40:00 NaN
2018-01-01 01:50:00 3.0
2018-01-01 02:00:00 3.0
Freq: 10T, dtype: float64
"""
return self._upsample('nearest', limit=limit)
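
To make the `limit` wording discussed above concrete, here is a minimal plain-Python sketch of nearest filling on a regular grid. It is an illustrative model only, not pandas' implementation: `make_grid` and `nearest_fill` are hypothetical helpers, the model assumes the original timestamps fall on the new grid, and it breaks ties toward the earlier original point, which may differ from pandas. It reproduces the values shown in the docstring examples for the '20min' and '10min' cases.

```python
from datetime import datetime, timedelta


def make_grid(start, end, step):
    """Return evenly spaced datetimes from start to end, inclusive."""
    grid = []
    t = start
    while t <= end:
        grid.append(t)
        t += step
    return grid


def nearest_fill(original, grid, step, limit=None):
    """Hypothetical model of nearest fill (not the pandas code path).

    original: dict mapping datetime -> value
    grid:     list of datetimes for the upsampled index
    step:     timedelta between consecutive grid points
    limit:    maximum number of grid steps a value may be carried
              in each direction; None means no limit
    """
    times = sorted(original)
    filled = []
    for t in grid:
        # Ties go to the earlier point here; this is an arbitrary choice.
        nearest = min(times, key=lambda o: abs(o - t))
        steps_away = abs(nearest - t) // step
        if limit is not None and steps_away > limit:
            filled.append(None)  # stands in for NaN
        else:
            filled.append(original[nearest])
    return filled


start = datetime(2018, 1, 1)
end = start + timedelta(hours=2)
hourly = {start + timedelta(hours=i): i + 1 for i in range(3)}

ten = timedelta(minutes=10)
twenty = timedelta(minutes=20)

result_20min = nearest_fill(hourly, make_grid(start, end, twenty), twenty)
result_10min_limit1 = nearest_fill(hourly, make_grid(start, end, ten), ten, limit=1)
result_20min_limit1 = nearest_fill(hourly, make_grid(start, end, twenty), twenty, limit=1)

print(result_20min)         # [1, 1, 2, 2, 2, 3, 3]
print(result_10min_limit1)  # [1, 1, None, None, None, 2, 2, 2, None, None, None, 3, 3]
print(result_20min_limit1)  # [1, 1, 2, 2, 2, 3, 3]
```

Under this model, a '20min' grid with `limit=1` leaves no gaps, because every new slot is one step away from an original value in one direction or the other. A gap appears only when at least three new slots sit between consecutive original values, so that the middle ones are more than `limit` steps from both neighbors.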