Compute correct log_likelihood for models with partially observed variables #5255

Open
@ricardoV94

Description

Masked observations are no longer associated with the observed RV, but with a separate free RV.

import numpy as np
import pymc as pm

with pm.Model() as m:
  x = pm.Normal('x', observed=[np.nan, 1, 2, 3])

print(m['x_observed'].tag.observations)  # TensorConstant{[1.0 2.0 3.0]}
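The split described above can be illustrated with NumPy alone. This is only a sketch of the idea (NaN entries are masked out, the remaining values stay with the observed RV); it does not reproduce PyMC's internal code path, and the names `observed_part` and `n_missing` are made up for illustration.

```python
import numpy as np

# Same data as in the example above; NaN marks the missing entry.
data = np.array([np.nan, 1.0, 2.0, 3.0])

# masked_invalid masks NaN (and inf) entries.
masked = np.ma.masked_invalid(data)
mask = np.ma.getmaskarray(masked)

# Values that remain attached to the observed RV (hypothetical name).
observed_part = data[~mask]

# Number of entries that would be handled by a separate free RV (hypothetical name).
n_missing = int(mask.sum())

print(observed_part)
print(n_missing)
```

Only the three finite values remain in the observed part, which matches the three-element constant printed in the example above.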

As such, I am not sure whether we are computing the correct model log_likelihood for partially observed models, assuming the imputed observations should appear in the final log_likelihood.
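To make the question concrete, here is a hand-rolled sketch of the two candidate element-wise log-likelihoods for a standard normal. `normal_logp` and `imputed_draw` are hypothetical and written only for this illustration; they are not PyMC functions or values from an actual trace.

```python
import math
import numpy as np

def normal_logp(value, mu=0.0, sigma=1.0):
    """Element-wise log-density of Normal(mu, sigma), written out by hand."""
    value = np.asarray(value, dtype=float)
    return (
        -0.5 * ((value - mu) / sigma) ** 2
        - math.log(sigma)
        - 0.5 * math.log(2 * math.pi)
    )

observed = np.array([1.0, 2.0, 3.0])
imputed_draw = 0.3  # made-up posterior draw for the missing entry

# Log-likelihood over the observed entries only: one term per observed value.
logp_observed_only = normal_logp(observed)

# Log-likelihood that also includes the imputed entry: one term per original entry.
logp_with_imputed = normal_logp(np.concatenate([[imputed_draw], observed]))

print(logp_observed_only.shape, logp_with_imputed.shape)
```

The two results differ in length by the number of missing entries, which is the discrepancy this issue is asking about.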

Running all tests in test_idata_conversion.py with coverage confirmed that these lines in InferenceDataConverter.log_likelihood_vals_point are not being triggered:

pymc/pymc/backends/arviz.py

Lines 248 to 257 in 0c90e82

if isinstance(var.owner.op, (AdvancedIncSubtensor, AdvancedIncSubtensor1)):
    try:
        obs_data = extract_obs_data(var.tag.observations)
    except TypeError:
        warnings.warn(f"Could not extract data from symbolic observation {var}")
    mask = obs_data.mask
    if np.ndim(mask) > np.ndim(log_like_val):
        mask = np.any(mask, axis=-1)
    log_like_val = np.where(mask, np.nan, log_like_val)
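The masking step in the quoted lines can be exercised in isolation. The sketch below assumes the observations are already available as a NumPy masked array; `mask_log_likelihood` is a hypothetical helper written for this illustration, and the log-likelihood values are made up.

```python
import numpy as np

def mask_log_likelihood(log_like_val, observations):
    """Set log-likelihood entries to NaN wherever the observation is masked."""
    mask = np.ma.getmaskarray(observations)
    # For multivariate observations, collapse the mask over the last axis
    # so that it lines up with one log-likelihood value per row.
    if np.ndim(mask) > np.ndim(log_like_val):
        mask = np.any(mask, axis=-1)
    return np.where(mask, np.nan, log_like_val)

# Univariate case: one log-likelihood value per observation.
obs = np.ma.masked_invalid([np.nan, 1.0, 2.0, 3.0])
log_like = np.array([-0.9, -1.4, -2.9, -5.4])  # made-up values
out = mask_log_likelihood(log_like, obs)

# Multivariate case: one log-likelihood value per row of observations.
obs_mv = np.ma.masked_invalid([[np.nan, 1.0], [2.0, 3.0]])
log_like_mv = np.array([-1.0, -2.0])  # made-up values
out_mv = mask_log_likelihood(log_like_mv, obs_mv)

print(out)
print(out_mv)
```

If the masked entries are no longer part of the observed RV, the observations reaching this code contain no mask, which would be consistent with these lines never being hit in the coverage run.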

This came up in #5245

Labels: needs info (Additional information required), trace-backend (Traces and ArviZ stuff), v4
