Description
Motivating examples
Currently, the CAR distribution is set up to accept alpha values >= -1 and <= 1. When alpha = 1, the distribution starts behaving in ways that are difficult to understand, giving what look like inconsistent answers on very similar problems. Here are two examples that illustrate the problem. First, suppose the adjacency structure is a line. If you try to evaluate the logp of some example values, it returns nan.
import pymc as pm
import numpy as np

N = 4
adj_matrix = np.array([[0, 1, 0, 0],
                       [1, 0, 1, 0],
                       [0, 1, 0, 1],
                       [0, 0, 1, 0]])
values = np.array([0.5, 0.1, 0.2, 0.6])

phi = pm.CAR.dist(mu=np.zeros(N), tau=1, alpha=1, W=adj_matrix)
pm.logp(phi, values).eval()
However, if we adjust the adjacency structure slightly so it's a square lattice instead, it returns a value, in this case -17.8452531.
N = 4
adj_matrix = np.array([[0, 1, 0, 1],
                       [1, 0, 1, 0],
                       [0, 1, 0, 1],
                       [1, 0, 1, 0]])
values = np.array([0.5, 0.1, 0.2, 0.6])

phi = pm.CAR.dist(mu=np.zeros(N), tau=1, alpha=1, W=adj_matrix)
pm.logp(phi, values).eval()
The pattern of results holds up across a bunch of adjustments. If you pass a fully connected 4x4 adjacency matrix, you get nan. If you pass a 4x4 adjacency matrix where every node is connected except one pair, logp gives you a value. Similar things happen for matrices of other sizes: I tried both types of 3x3 adjacency matrices and got one nan and one result. pm.draw produces results that follow the same pattern: whenever the logp would return nan, draw throws an error saying the matrix is not positive semi-definite. Finally, none of these issues arise if alpha is any positive number less than 1. A sketch of the fully connected case is below.
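To make the pattern concrete, here is a minimal sketch of the fully connected case (reusing values and the imports from above; adj_full and phi_full are illustration names, and the expected outputs are the ones reported above rather than re-verified):

# Minimal sketch of the fully connected case described above.
N = 4
adj_full = np.ones((N, N)) - np.eye(N)   # every node is a neighbour of every other node
phi_full = pm.CAR.dist(mu=np.zeros(N), tau=1, alpha=1, W=adj_full)

pm.logp(phi_full, values).eval()         # per the report: nan, like the line-graph case

try:
    pm.draw(phi_full)
except Exception as err:
    print(err)                           # per the report: matrix is not positive semi-definite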
Why is this happening
The big picture here is that when alpha = 1, the resulting precision matrix is singular, so the multivariate normal density isn't well defined. PyMC's implementation doesn't take the matrix inverse of the precision matrix directly. Instead, the logp computes the log determinant from a generalized eigenvalue problem:
lam = pt.slinalg.eigvalsh(DWD, pt.eye(DWD.shape[0]))
where the DWD bit is a matrix derived from the adjacency matrix and the pt.eye() bit is an identity matrix of the same size as DWD. Then we have
logdet = pt.log(1 - alpha.T * lam[:, None]).sum()
When alpha = 1, the alpha factor drops out, so the logdet term reduces to taking the log of (1 - lam). Wherever an eigenvalue equals 1 exactly, we should therefore get -infinity (from log(0)) or nan back. In my first example, lam = array([-1. , -0.5, 0.5, 1. ]), so the last eigenvalue should trigger the nan. However, the subtraction 1 - lam instead yields a tiny negative number, -2.22044605e-16, in the last spot of the vector. log still returns nan, a correct result but for the wrong reason.
When we switch to the lattice case, the eigenvalue in the last position is 1.00000000e+00 and the subtraction yields a tiny positive number, 2.22044605e-16. log() happily returns a number when it shouldn't. In both cases the tiny number 2.22...e-16 should be exactly zero, but approximation error makes the value fluctuate slightly around 0 instead. That explains the pattern of results: sometimes the approximation lands on the negative side (nan) and sometimes on the positive side (a finite logp).
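To see the numerical issue directly, here is a minimal numpy sketch of the 1 - lam computation. It assumes DWD is the scaled adjacency matrix D^{-1/2} W D^{-1/2}, with D the diagonal matrix of neighbour counts, and uses np.linalg.eigvalsh instead of PyMC's pt.slinalg.eigvalsh; the helper one_minus_lam is just for illustration, not the PyMC code itself:

import numpy as np

def one_minus_lam(W):
    d = W.sum(axis=1)
    Dinv_sqrt = np.diag(1.0 / np.sqrt(d))
    DWD = Dinv_sqrt @ W @ Dinv_sqrt     # scaled adjacency matrix (assumed form)
    lam = np.linalg.eigvalsh(DWD)       # eigenvalues should lie in [-1, 1]
    return 1.0 - lam                    # the term fed into log() when alpha = 1

line = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 0],
                 [0, 1, 0, 1],
                 [0, 0, 1, 0]])
lattice = np.array([[0, 1, 0, 1],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [1, 0, 1, 0]])

print(one_minus_lam(line))      # last entry ~0 up to rounding; per the report it comes out
                                # slightly negative, so log() gives nan
print(one_minus_lam(lattice))   # last entry ~0 up to rounding; per the report it comes out
                                # slightly positive, so log() returns a finite number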
Proposed solution
I'm curious what others think, but I believe the desired behavior is that none of these examples should work. The logp should be nan in each case where alpha = 1, and it shouldn't be possible to sample either. The literature describes the CAR with alpha = 1 as non-generative, or an improper prior. One possible fix is to change the RV's make_node so it throws an error when alpha = 1, and to change the check_parameters part of the logp to constrain alpha to be less than 1. I need to do a bit more research about what is supposed to happen when alpha = -1, but I imagine the same restriction needs to be there. So the fixes might look like:
class CARRV(RandomVariable):
    def make_node(self, rng, size, dtype, mu, W, alpha, tau):
        ...
        alpha = pt.as_tensor_variable(floatX(alpha))
        # proposed addition: reject alpha outside the open interval (-1, 1)
        # if alpha >= 1.0 or alpha <= -1.0:
        #     raise ValueError("the domain of alpha is: -1 < alpha < 1.")
class CAR(Continuous):
    def logp(value, mu, W, alpha, tau):
        ...
        logquad = (tau * delta * tau_dot_delta).sum(axis=-1)
        # proposed additions: constrain alpha to (-1, 1) alongside the existing tau > 0 check
        return check_parameters(
            0.5 * (logtau + logdet - logquad),
            # -1 < alpha,
            # alpha < 1,
            tau > 0,
            # msg="-1 < alpha < 1, tau > 0",
        )
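For context, a sketch of how the constrained version might respond, assuming the usual behavior where a failed check_parameters condition raises ParameterValueError on eager evaluation (not tested, and the make_node check would reject the construction even earlier):

# Hypothetical behaviour once the logp constraint is in place (a sketch, not tested).
phi = pm.CAR.dist(mu=np.zeros(N), tau=1, alpha=1, W=adj_matrix)
pm.logp(phi, values).eval()
# expected: ParameterValueError: -1 < alpha < 1, tau > 0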
In the medium term, I'm also working on an implementation of ICAR to allow users to handle the special case of alpha = 1.
PyMC version information:
Windows
PyMC 5.5.0
Pytensor 2.12.1