[maskedtensor] Add missing nan ops tutorial #2046
Conversation
[ghstack-poisoned]
Please address the comments on masked_tensor vs MaskedTensor. I'm also not sure that this tutorial is all that helpful as a standalone. Would it make more sense to just merge it into the overall tutorial, or to have a second tutorial that aggregates advanced features? This feels like a pretty minor demonstration of nanmean with Tensors vs mean with MT.
>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
        28., 45.])
>>> y.nanmean()
It might be useful to inline some comments on what you're trying to show here.
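For example, a lightly commented version of that snippet might read as follows (a sketch, assuming the prototype masked_tensor factory is importable from torch.masked; the exact import path may differ by release):

>>> import torch
>>> from torch.masked import masked_tensor  # prototype API; import path may differ
>>> y = torch.tensor([float('nan'), 1., 4., 9., float('nan'), 5., 12., 21.,
...                   float('nan'), 9., 20., 33., float('nan'), 13., 28., 45.])
>>> # Tensor route: a dedicated nan-aware reduction that skips the nan entries
>>> y.nanmean()
tensor(16.6667)
>>> # MaskedTensor route: mask out the nan entries once, then use the ordinary mean
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))  # same value, as a 0-dim MaskedTensor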
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Is the goal to have sequential tutorials or to keep each one self-contained? If the latter, can you add the relevant imports up top?
This tutorial will be merged with the overview!
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Did you replace masked_tensor with MaskedTensor in pytorch/pytorch@5e9c26c? If so, can you update the tutorial here?
masked_tensor is the preferred function to use!
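A minimal illustration of the distinction (a sketch; assuming masked_tensor is the factory function and MaskedTensor the class, analogous to torch.tensor vs torch.Tensor):

>>> import torch
>>> from torch.masked import masked_tensor  # prototype API
>>> data = torch.tensor([1., 2., 3.])
>>> mask = torch.tensor([True, False, True])
>>> mt = masked_tensor(data, mask)  # preferred: factory function, like torch.tensor
>>> # MaskedTensor(data, mask) would construct via the class directly, like torch.Tensor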
tensor([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan])
>>> torch.nanmean(x)
tensor(nan)
>>> torch.mean(masked_tensor(x, ~torch.isnan(x)))
Same comment as above regarding MaskedTensor.
>>> y
tensor([nan, 1., 4., 9., nan, 5., 12., 21., nan, 9., 20., 33., nan, 13.,
        28., 45.])
>>> y.nanmean()
Probably outside the scope of this review, but why do we have nanmean() as an API instead of the pandas-style mean(..., skipna=True)?
Not sure.
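For reference, the two styles being contrasted look roughly like this (a sketch; pandas is used only to illustrate the skipna form):

>>> import torch
>>> import pandas as pd
>>> t = [1., float('nan'), 3.]
>>> torch.nanmean(torch.tensor(t))  # dedicated nan-aware op
tensor(2.)
>>> pd.Series(t).mean(skipna=True)  # pandas-style keyword on the ordinary mean
2.0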
        28., 45.])
>>> y.nanmean()
tensor(16.6667)
>>> torch.mean(masked_tensor(y, ~torch.isnan(y)))
Have we considered API sugar:

(1) instantiating an MT from a Tensor, assuming NaN is the mask:

>>> MaskedTensor(y)
MaskedTensor(
  [ --, 1.0000, 4.0000, 9.0000, --, 5.0000, 12.0000, 21.0000, --, 9.0000, 20.0000, 33.0000, --, 13.0000, 28.0000, 45.0000]
)

(2) instantiating an MT where the user just states the mask value instead of passing the mask:

y = MaskedTensor(y, mask_value=float(1))
Not yet! I think an unspecified mask could also be an indication that they would like an all-True mask, so that could be a third option as well. Another one would be to allow just MaskedTensor(y) if y is a sparse tensor, because then the mask is "implied". This has all been discussed, and I'll take note to add it in :)
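Until such sugar lands, the NaN-as-mask construction can be wrapped in a small helper; a sketch (the helper name is hypothetical, not an existing API):

>>> import torch
>>> from torch.masked import masked_tensor  # prototype API
>>> def masked_from_nan(t):
...     # hypothetical convenience: mask out every NaN entry of t
...     return masked_tensor(t, ~torch.isnan(t))
...
>>> y = torch.tensor([float('nan'), 1., 4., 9.])
>>> torch.mean(masked_from_nan(y))  # mean over the three unmasked values, 14 / 3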
Where are you tracking feature requests?
Stack from ghstack (oldest at bottom):