[maskedtensor] Add missing nan ops tutorial #2046

@@ -0,0 +1,32 @@
Implementing missing torch.nan* operators
-----------------------------------------

In the above issue, there is a request to add additional operators to cover the various ``torch.nan*`` applications,
such as ``torch.nanmax``, ``torch.nanmin``, etc.

In general, these problems lend themselves more naturally to masked semantics, so instead of introducing additional
operators, we propose using :class:`MaskedTensor`. Since
`nanmean has already landed <https://github.com/pytorch/pytorch/issues/21987>`__, we can use it as a comparison point:

>>> x = torch.arange(16).float()
>>> y = x * x.fmod(4)
>>> y = y.masked_fill(y == 0, float('nan'))
>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
MaskedTensor( 16.6667, True)

Review thread on the ``y.nanmean()`` line:

- it might be useful to inline some comments on what you're trying to show here
- probably outside the scope of this review, but why do we have nanmean() as an API instead of the pandas-style …
- Not sure..

Review thread on the ``torch.mean(masked_tensor(y, ~torch.isnan(y)))`` line:

- is the goal to have sequential tutorials or keep each self contained? If the latter, can you add the relevant imports up top.
- This tutorial will be merged with overview!
- did you replace …
- have we considered API sugar: (1) instantiating a MT from a Tensor, assuming nan is the mask:

      >>> MaskedTensor(y)
      MaskedTensor(
        [ --, 1.0000, 4.0000, 9.0000, --, 5.0000, 12.0000, 21.0000, --, 9.0000, 20.0000, 33.0000, --, 13.0000, 28.0000, 45.0000]
      )

  (2) instantiating a MT where the user just states the mask value instead of passing the mask: ``y = MaskedTensor(y, mask_value=float(1))``
- Not yet! I think an unspecified mask could also be an indication that they would like all … Another one would be to allow for just … All been discussed and will take note to add in :)
- where are you tracking feature requests?
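
Following the example above (and the review question about imports), here is a short, hedged sketch of how the
other requested ``torch.nan*`` reductions could be written as plain masked reductions. It is illustrative and not
part of this PR's diff: it assumes the prototype ``masked_tensor`` constructor is importable from ``torch.masked``
and that ``amax``/``amin`` are among the reductions :class:`MaskedTensor` supports, as described in the overview
tutorial::

    import torch
    from torch.masked import masked_tensor  # assumed import path for the prototype API

    x = torch.arange(16).float()
    y = x * x.fmod(4)
    y = y.masked_fill(y == 0, float('nan'))

    # Mask out the NaN slots once; every reduction then "skips" them.
    mt = masked_tensor(y, ~torch.isnan(y))

    print(torch.mean(mt))  # masked mean -> 16.6667, matching y.nanmean() above
    print(torch.amax(mt))  # the would-be torch.nanmax -> 45.
    print(torch.amin(mt))  # the would-be torch.nanmin -> 1.

With this pattern the mask is constructed once and reused, rather than adding a separate ``torch.nan*`` operator
for every reduction.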

:class:`MaskedTensor` can also support reductions when the data is fully masked out, which is equivalent
to the case above when the data Tensor is completely ``nan``. ``nanmean`` would return ``nan``
(an ambiguous return value), while MaskedTensor would more accurately indicate a masked out result.

>>> x = torch.empty(16).fill_(float('nan'))
>>> x
tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
>>> torch.nanmean(x)
tensor(nan)
>>> torch.mean(masked_tensor(x, ~torch.isnan(x)))
MaskedTensor(--, False)

Review thread on the ``torch.mean(masked_tensor(x, ~torch.isnan(x)))`` line:

- same comment above on MaskedTensor
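
To spell out why the fully masked result is less ambiguous than ``nan``, here is a small, hedged sketch (not part
of this PR's diff) of how downstream code could react to it. It assumes the ``get_mask()`` and ``to_tensor()``
accessors shown in the MaskedTensor overview; exact names may differ in the prototype::

    import torch
    from torch.masked import masked_tensor  # assumed import path for the prototype API

    x = torch.empty(16).fill_(float('nan'))

    # nanmean collapses "no valid data" into nan, which is indistinguishable
    # from a computation that legitimately produced nan.
    ambiguous = torch.nanmean(x)  # tensor(nan)

    # The masked reduction keeps that information in the result's mask.
    result = torch.mean(masked_tensor(x, ~torch.isnan(x)))
    if not result.get_mask():  # 0-dim mask is False: every input element was masked out
        # Nothing was valid: apply an explicit policy instead of silently propagating nan.
        value = result.to_tensor(0.)  # e.g. fill with 0., or raise, or drop this row
    else:
        value = result.to_tensor(float('nan'))
    print(ambiguous, value)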