Generalization of logp test comparison to facilitate relative and/or absolute comparison #6159

Open · wants to merge 5 commits into base: main
6 changes: 4 additions & 2 deletions docs/source/contributing/implementing_distribution.md
@@ -350,9 +350,11 @@ These methods will perform a grid evaluation on the combinations of `domain` and
There are a couple of details worth keeping in mind:

1. By default, the first and last values (edges) of the `Domain` are not compared (they are used for other things). If it is important to test the edge of the `Domain`, the edge values can be repeated, as the `Bool` domain does: `Bool = Domain([0, 0, 1, 1], "int64")`
-3. There are some default domains (such as `R` and `Rplus`) that you can use for testing your new distribution, but it's also perfectly fine to create your own domains inside the test function if there is a good reason for it (e.g., when the default values lead too many extreme unlikely combinations that are not very informative about the correctness of the implementation).
+3. There are some default domains (such as `R` and `Rplus`) that you can use for testing your new distribution, but it's also perfectly fine to create your own domains inside the test function if there is a good reason for it (e.g., when the default values lead to too many extreme, unlikely combinations that are not very informative about the correctness of the implementation). For example, the `Rplusbig` domain is suited to scale parameters, where values close to zero are atypical.
4. By default, a random subset of 100 `param` x `paramdomain` combinations is tested, to keep the test runtime under control. When testing your shiny new distribution, you can temporarily set `n_samples=-1` to force all combinations to be tested. This is important to avoid your `PR` leading to surprising failures in future runs whenever some bad combinations of parameters are randomly tested.
-5. On GitHub some tests run twice, under the `pytensor.config.floatX` flags of `"float64"` and `"float32"`. However, the reference Python functions will run in a pure "float64" environment, which means the reference and the PyMC results can diverge quite a lot (e.g., underflowing to `-np.inf` for extreme parameters). You should therefore make sure you test locally in both regimes. A quick and dirty way of doing this is to temporarily add `pytensor.config.floatX = "float32"` at the very top of file, immediately after `import pytensor`. Remember to set `n_samples=-1` as well to test all combinations. The test output will show what exact parameter values lead to a failure. If you are confident that your implementation is correct, you may opt to tweak the decimal precision with `select_by_precision`, or adjust the tested `Domain` values. In extreme cases, you can mark the test with a conditional `xfail` (if only one of the sub-methods is failing, they should be separated, so that the `xfail` is as narrow as possible):
+5. On GitHub some tests run twice, under the `pytensor.config.floatX` flags of `"float64"` and `"float32"`. However, the reference Python functions will run in a pure "float64" environment, which means the reference and the PyMC results can diverge quite a lot (e.g., underflowing to `-np.inf` for extreme parameters). You should therefore make sure you test locally in both regimes. A quick and dirty way of doing this is to temporarily add `pytensor.config.floatX = "float32"` at the very top of the file, immediately after `import pytensor`. Remember to set `n_samples=-1` as well to test all combinations. The test output will show which exact parameter values lead to a failure. The tolerance (see the next note) for `"float64"` or `"float32"` can be set with `select_by_precision`, or the tested `Domain` values can be adjusted. In extreme cases, you can mark the test with a conditional `xfail` (if only one of the sub-methods is failing, they should be separated, so that the `xfail` is as narrow as possible):
+6. If tests fail and you are confident that your implementation is correct, you may opt to tweak the tolerances against which your `logp` or `logcdf` is being tested. For this, you can pass an `atol` and/or `rtol` parameter to `check_logp` or `check_logcdf`, which set the acceptable absolute and relative tolerances respectively, following the {doc}`NumPy documentation <numpy:reference/generated/testing.assert_allclose>` for `assert_allclose`, which is used for the comparison. A sketch of such a call is shown after this diff.


```python
...
```
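To make the new tolerance note concrete, here is a minimal sketch of a distribution test that overrides the defaults. It assumes the test helpers (`check_logp`, `R`, `Rplus`) are importable from `pymc.tests.distributions.util`; the exact module path may differ between versions.

```python
import scipy.stats as st

import pymc as pm
from pymc.tests.distributions.util import R, Rplus, check_logp

# atol/rtol are forwarded to numpy.testing.assert_allclose, so a looser
# relative tolerance can absorb float32 noise without masking real errors.
check_logp(
    pm.Normal,
    R,
    {"mu": R, "sigma": Rplus},
    lambda value, mu, sigma: st.norm.logpdf(value, mu, sigma),
    rtol=1e-6,
    atol=1e-8,
)
```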
2 changes: 1 addition & 1 deletion pymc/distributions/continuous.py
@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

-# Contains code from Aeppl, Copyright (c) 2021-2022, PyTensor Developers.
+# Contains code from Aeppl, Copyright (c) 2021-2022, Aesara Developers.

# coding: utf-8
"""
30 changes: 15 additions & 15 deletions pymc/tests/distributions/test_continuous.py
@@ -235,14 +235,14 @@ def test_polyagamma(self):
Rplus,
{"h": Rplus, "z": R},
lambda value, h, z: polyagamma_pdf(value, h, z, return_log=True),
-decimal=select_by_precision(float64=6, float32=-1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
)
check_logcdf(
pm.PolyaGamma,
Rplus,
{"h": Rplus, "z": R},
lambda value, h, z: polyagamma_cdf(value, h, z, return_log=True),
-decimal=select_by_precision(float64=6, float32=-1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
)

def test_flat(self):
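The `select_by_precision` helper used throughout these tests picks the tolerance that matches the active `pytensor.config.floatX` mode. Roughly, it behaves like the following sketch (an illustration of the semantics, not the actual pymc source):

```python
import pytensor


def select_by_precision(float64, float32):
    """Return whichever tolerance matches the active pytensor.config.floatX."""
    return float64 if pytensor.config.floatX == "float64" else float32


# Example: under float32, allow a much looser relative tolerance.
rtol = select_by_precision(float64=1e-6, float32=1e-1)
```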
@@ -269,14 +269,14 @@ def test_normal(self):
R,
{"mu": R, "sigma": Rplus},
lambda value, mu, sigma: st.norm.logpdf(value, mu, sigma),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-3),
)
check_logcdf(
pm.Normal,
R,
{"mu": R, "sigma": Rplus},
lambda value, mu, sigma: st.norm.logcdf(value, mu, sigma),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-3),
)

def test_half_normal(self):
@@ -285,7 +285,7 @@ def test_half_normal(self):
Rplus,
{"sigma": Rplus},
lambda value, sigma: st.halfnorm.logpdf(value, scale=sigma),
-decimal=select_by_precision(float64=6, float32=-1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
)
check_logcdf(
pm.HalfNormal,
@@ -320,7 +320,7 @@ def test_wald_logp(self):
Rplus,
{"mu": Rplus, "alpha": Rplus},
lambda value, mu, alpha: st.invgauss.logpdf(value, mu=mu, loc=alpha),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-2),
)

def test_wald_logcdf(self):
@@ -443,7 +443,7 @@ def test_laplace_asymmetric(self):
R,
{"b": Rplus, "kappa": Rplus, "mu": R},
laplace_asymmetric_logpdf,
-decimal=select_by_precision(float64=6, float32=2),
+rtol=select_by_precision(float64=1e-6, float32=1e0),
)

def test_lognormal(self):
@@ -596,7 +596,7 @@ def test_fun(value, mu, sigma):
Rplus,
{"mu": Rplus, "sigma": Rplus},
test_fun,
-decimal=select_by_precision(float64=4, float32=3),
+rtol=select_by_precision(float64=1e-6, float32=1e-5),
)

def test_pareto(self):
@@ -653,7 +653,7 @@ def test_skew_normal(self):
R,
{"mu": R, "sigma": Rplusbig, "alpha": R},
lambda value, alpha, mu, sigma: st.skewnorm.logpdf(value, alpha, mu, sigma),
-decimal=select_by_precision(float64=5, float32=3),
+rtol=select_by_precision(float64=1e-7, float32=1e-5),
)

@pytest.mark.parametrize(
@@ -720,14 +720,14 @@ def test_logistic(self):
R,
{"mu": R, "s": Rplus},
lambda value, mu, s: st.logistic.logpdf(value, mu, s),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
)
check_logcdf(
pm.Logistic,
R,
{"mu": R, "s": Rplus},
lambda value, mu, s: st.logistic.logcdf(value, mu, s),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
)

def test_logitnormal(self):
@@ -738,7 +738,7 @@
lambda value, mu, sigma: (
st.norm.logpdf(sp.logit(value), mu, sigma) - (np.log(value) + np.log1p(-value))
),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
)

@pytest.mark.skipif(
@@ -841,7 +841,7 @@ def scipy_logp(value, mu, sigma, lower, upper):
R,
{"mu": R, "sigma": Rplusbig, "lower": -Rplusbig, "upper": Rplusbig},
scipy_logp,
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
skip_paramdomain_outside_edge_test=True,
)

@@ -850,7 +850,7 @@ def scipy_logp(value, mu, sigma, lower, upper):
R,
{"mu": R, "sigma": Rplusbig, "upper": Rplusbig},
ft.partial(scipy_logp, lower=-np.inf),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
skip_paramdomain_outside_edge_test=True,
)

@@ -859,7 +859,7 @@ def scipy_logp(value, mu, sigma, lower, upper):
R,
{"mu": R, "sigma": Rplusbig, "lower": -Rplusbig},
ft.partial(scipy_logp, upper=np.inf),
-decimal=select_by_precision(float64=6, float32=1),
+rtol=select_by_precision(float64=1e-6, float32=1e-1),
skip_paramdomain_outside_edge_test=True,
)

4 changes: 2 additions & 2 deletions pymc/tests/distributions/test_logprob.py
@@ -224,7 +224,7 @@ def test_joint_logp_subtensor():
# The compiled graph should not contain any `RandomVariables`
assert_no_rvs(logp_vals_fn.maker.fgraph.outputs[0])

-decimals = select_by_precision(float64=6, float32=4)
+atol = select_by_precision(float64=1e-8, float32=1e-5)

for i in range(10):
bern_sp = sp.bernoulli(p)
@@ -238,7 +238,7 @@

logp_vals = logp_vals_fn(A_idx_value, I_value)

-np.testing.assert_almost_equal(logp_vals, exp_obs_logps, decimal=decimals)
+np.testing.assert_allclose(logp_vals, exp_obs_logps, atol=atol)


def test_logp_helper():
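The switch from `assert_almost_equal` to `assert_allclose` changes the comparison semantics: `decimal=d` requires `|actual - desired| < 1.5 * 10**-d`, while `assert_allclose` passes when `|actual - desired| <= atol + rtol * |desired|`. A small illustration (with made-up logp values) of why the relative form scales better with large magnitudes:

```python
import numpy as np

a, b = -1234.5678, -1234.5679  # hypothetical logp values for illustration

# Absolute check: would fail, since |a - b| = 1e-4 exceeds 1.5e-6.
# np.testing.assert_almost_equal(a, b, decimal=6)

# Relative check: passes, since the error is only ~8e-8 of the magnitude.
np.testing.assert_allclose(a, b, rtol=1e-6)
```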
20 changes: 9 additions & 11 deletions pymc/tests/distributions/test_multivariate.py
@@ -272,22 +272,22 @@ def test_mvnormal(self, n):
RealMatrix(5, n),
{"mu": Vector(R, n), "chol": PdMatrixChol(n)},
normal_logpdf_chol,
-decimal=select_by_precision(float64=6, float32=-1),
+rtol=select_by_precision(float64=1e-8, float32=1e-3),
extra_args={"size": 5},
)
check_logp(
pm.MvNormal,
Vector(R, n),
{"mu": Vector(R, n), "chol": PdMatrixChol(n)},
normal_logpdf_chol,
-decimal=select_by_precision(float64=6, float32=0),
+rtol=select_by_precision(float64=1e-8, float32=1e-1),
)
check_logp(
pm.MvNormal,
Vector(R, n),
{"mu": Vector(R, n), "chol": PdMatrixCholUpper(n)},
normal_logpdf_chol_upper,
-decimal=select_by_precision(float64=6, float32=0),
+rtol=select_by_precision(float64=1e-8, float32=1e-1),
extra_args={"lower": False},
)

@@ -338,7 +338,7 @@ def test_matrixnormal(self, n):
"colcov": PdMatrix(n) * mat_scale,
},
matrix_normal_logpdf_cov,
-decimal=select_by_precision(float64=5, float32=3),
+rtol=select_by_precision(float64=1e-8, float32=1e-6),
)
check_logp(
pm.MatrixNormal,
@@ -349,7 +349,7 @@
"colcov": PdMatrix(n) * mat_scale,
},
matrix_normal_logpdf_cov,
-decimal=select_by_precision(float64=5, float32=3),
+rtol=select_by_precision(float64=1e-8, float32=1e-6),
)
check_logp(
pm.MatrixNormal,
@@ -360,7 +360,7 @@
"colchol": PdMatrixChol(n) * mat_scale,
},
matrix_normal_logpdf_chol,
-decimal=select_by_precision(float64=5, float32=3),
+rtol=select_by_precision(float64=1e-8, float32=1e-6),
)
check_logp(
pm.MatrixNormal,
@@ -371,7 +371,7 @@
"colchol": PdMatrixChol(3) * mat_scale,
},
matrix_normal_logpdf_chol,
-decimal=select_by_precision(float64=5, float32=3),
+rtol=select_by_precision(float64=1e-8, float32=1e-6),
)

@pytest.mark.parametrize("n", [2, 3])
@@ -787,10 +787,8 @@ def test_car_logp(self, sparse, size):
delta_logp = scipy_logp - car_logp

# Check to make sure all the delta values are identical.
-tol = 1e-08
-if pytensor.config.floatX == "float32":
-tol = 1e-5
-assert np.allclose(delta_logp - delta_logp[0], 0.0, atol=tol)
+atol = select_by_precision(float64=1e-8, float32=1e-5)
+assert np.allclose(delta_logp - delta_logp[0], 0.0, atol=atol)


@pytest.mark.parametrize(
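The CAR test above compares the two implementations only up to an additive constant, which is why it checks that the deltas are all identical rather than zero. A minimal illustration with made-up numbers:

```python
import numpy as np

scipy_logp = np.array([-1.2, -2.8, -3.5])
car_logp = scipy_logp + 0.7  # agrees with scipy up to a constant offset

delta_logp = scipy_logp - car_logp
# All deltas equal the first one, so the logps match up to a constant.
assert np.allclose(delta_logp - delta_logp[0], 0.0, atol=1e-8)
```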
4 changes: 2 additions & 2 deletions pymc/tests/distributions/test_timeseries.py
@@ -754,8 +754,8 @@ def test_logp(self):
z = Normal("z", mu=0, sigma=vol, shape=data.shape)
garch_like = t.compile_logp(y)({"y": data})
reg_like = t.compile_logp(z)({"z": data})
-decimal = select_by_precision(float64=7, float32=4)
-np.testing.assert_allclose(garch_like, reg_like, 10 ** (-decimal))
+rtol = select_by_precision(float64=1e-9, float32=1e-6)
+np.testing.assert_allclose(garch_like, reg_like, rtol)

@pytest.mark.parametrize(
"batched_param",
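Note that `rtol` above is passed as `assert_allclose`'s third positional argument, which is indeed `rtol` (the old code relied on the same fact when passing `10 ** (-decimal)`). The equivalent keyword form, sketched with hypothetical values:

```python
import numpy as np

garch_like, reg_like = -1053.23400001, -1053.23400002  # hypothetical values

# Same check as above, with the tolerance named explicitly.
np.testing.assert_allclose(garch_like, reg_like, rtol=1e-9)
```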