New example for auto-imputation aka handle missing values with a simple dataset and full workflow #721

Closed
@jonsedar

Notebook proposal

Title: GLM-missing-numeric-values

New example for auto-imputation aka handle missing values with a simple dataset and full workflow.

I propose this as a present-day solution that should be generally applicable. If/when we get newer
functionality, let's extend this notebook with a new Section 2 to demonstrate and compare it. Further related
discussion is in pymc-devs/pymc#6626 and pymc-devs/pymc#7204.

Why should this notebook be added to pymc-examples?

Our problem statement is that when faced with data with missing values, we want to:

  1. Infer the missing values for the in-sample dataset and sample full posterior parameters
  2. Predict the endogenous feature and the missing values for an out-of-sample dataset

This notebook takes the opportunity to:

  • Demonstrate a general method using a numpy.ma.masked_array, often mentioned in pymc folklore but rarely demonstrated
  • Demonstrate a reasonably complete Bayesian workflow {cite:p}`gelman2020bayesian`, including data creation
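
As a rough illustration of the masked-array idea, here is a minimal sketch that creates a small synthetic feature matrix, knocks out a few values, and wraps the result in a numpy masked array. The dataset shape, the seed, and the positions of the missing values are all hypothetical, chosen only for illustration; this is not the notebook's actual data or model.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical synthetic dataset: 10 observations of 2 numeric features
n_obs, n_features = 10, 2
x = rng.normal(size=(n_obs, n_features))

# Simulate missingness by setting a few arbitrary entries to NaN
missing_idx = [(0, 1), (3, 0), (7, 1)]
x_missing = x.copy()
for i, j in missing_idx:
    x_missing[i, j] = np.nan

# Wrap in a masked array: NaN entries become masked, the rest stay observed
x_masked = np.ma.masked_invalid(x_missing)

# The mask records which entries are missing
n_missing = int(x_masked.mask.sum())

# Reductions on a masked array skip the masked entries
observed_mean = x_masked.mean(axis=0)

print(n_missing)
print(observed_mean)
```

In PyMC, passing a masked array like this as the `observed` argument of a distribution is what triggers automatic imputation of the masked entries. The model itself is not sketched here.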

This notebook is a partner to another pymc-examples notebook, Missing_Data_Imputation.ipynb,
which goes into more detail on taxonomies and works through a much more complicated dataset in a tutorial style.

Suggested categories:

  • Level: Intermediate
  • Diataxis type: Reference

References

Further related discussion is in pymc-devs/pymc#6626 and pymc-devs/pymc#7204.

Already in references.bib:

@book{enders2022,
title = {Applied Missing Data Analysis},
author = {Enders, Craig K.},
year = {2022},
publisher = {The Guilford Press}
}

@article{gelman2020bayesian,
title = {Bayesian workflow},
author = {Gelman, Andrew and Vehtari, Aki and Simpson, Daniel and Margossian, Charles C and Carpenter, Bob and Yao, Yuling and Kennedy, Lauren and Gabry, Jonah and B{\"u}rkner, Paul-Christian and Modr{\'a}k, Martin},
journal = {arXiv preprint arXiv:2011.01808},
year = {2020},
url = {https://arxiv.org/abs/2011.01808}
}

Metadata

Labels

proposal: New notebook proposal still up for discussion
