-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Update docstrings of pm.set_data and model.Data #6087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update docstrings of pm.set_data and model.Data #6087
Conversation
…ing posterior predictive sampling
pymc/model.py
Outdated
@@ -1849,6 +1849,13 @@ def point_logps(self, point=None, round_vals=2): | |||
def set_data(new_data, model=None, *, coords=None): | |||
"""Sets the value of one or more data container variables. | |||
|
|||
When doing prediction or posterior predictive sampling, make sure that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
whats the difference between prediction and posterior predictive sampling?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing.. wanted to catch the eye of someone skimming with either word. Maybe better phrasing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would just drop the first prediction mention.
"make sure" is also a bit strong, as it's not required, it's more of a convenience. And maybe also say at the end that the benefit of this is that you only have to set_data()
x
, but not both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also add an example of resizing y
alone first. MutableData is not useful only in posterior predictive sampling with GLMs where this concern for having to resize both variables arises.
It's also useful to fit the same model to different datasets.
Would be strange for me as user seeing this being discussed in the only example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twiecki that makes sense, I'll change it.
@ricardoV94 For an example with resizing y
alone first, would y
be a pm.MutableData
object, and observed? Makes sense RE fitting the same model to different datasets. In the past I've just remade the model object but you could save on compiling time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## main #6087 +/- ##
==========================================
+ Coverage 89.54% 92.14% +2.59%
==========================================
Files 72 92 +20
Lines 12929 21061 +8132
==========================================
+ Hits 11577 19406 +7829
- Misses 1352 1655 +303
|
Is this ready to merge? |
pymc/data.py
Outdated
@@ -594,6 +594,11 @@ def Data( | |||
To set the value of the data container variable, check out | |||
:func:`pymc.Model.set_data`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:func:`pymc.Model.set_data`. | |
:meth:`pymc.Model.set_data`. |
pymc/data.py
Outdated
When making predictions or doing posterior predictive sampling, the shape of the | ||
registered data variable will most likely need to be changed. If you encounter an | ||
Aesara shape mismatch error, refer to the documentation for | ||
:func:`pymc.model.set_data`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:func:`pymc.model.set_data`. | |
:func:`pymc.set_data`. |
The function is imported as pymc.set_data
and since 4.0 we document objects in the same place they should be imported.
pymc/model.py
Outdated
the random variable acting as the likelihood (`observed` is True) is dynamically | ||
resized when the data is updated. Set the `shape` argument of the random | ||
variable acting as the likelihood to the shape of the relevant | ||
:class:`pymc.data.MutableData` object, as shown in the example below |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:class:`pymc.data.MutableData` object, as shown in the example below | |
:class:`pymc.MutableData` object, as shown in the example below |
Took another crack at it, basically made the docstring more minimal and added an example as per your suggestion @ricardoV94. Would it be nice to also have the syntax allow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great
I guess so ;) |
One last question, if you don't mind, I can't seem to figure out why pre-commit isn't happy. Since this PR just changes docstrings, what's it saying about numpy, positional arguments? |
That was fixed today. If you rebase from main it should go away |
Adds docstrings describing how to track changing shape of
MutableData
in the likelihood. Meant to address issue #5987. The recommended solution is then to set theshape=x.shape
, like this:Major / Breaking Changes
None
Bugfixes / New features
None
Docs / Maintenance
Add recommendation to docstrings relevant to prediction and posterior predictive sampling.