add post about the pipeline approach #88
Draft: avallecam wants to merge 10 commits into epiverse-trace:main from avallecam:post/pipelines
Commits
- 2dff0ed add draft post about pipelines (avallecam)
- b406da3 add bib file for pipelines (avallecam)
- a36941c add explanation and references to pipelines (avallecam)
- 234402e add authors info (avallecam)
- 2d43535 minor writing edits (avallecam)
- d2bd66d replace verbs and terms for homogenious reading (avallecam)
- e6a2a3a minor rearrangement of fig in scenarios intro (avallecam)
- 8a5c3b2 add alt text for figure 1 (avallecam)
- d319f07 remove white spaces for lintr (avallecam)
- a951edf remove trailing whitespace after : (avallecam)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,67 @@ | ||
--- | ||
title: "Outbreak analytics Pipelines" | ||
author: | ||
- name: "Andree Valle-Campos" | ||
orcid: "0000-0002-7779-481X" | ||
- name: "Carmen Tamayo Cuartero" | ||
orcid: "0000-0003-4184-2864" | ||
- name: "Anna Carnegie" | ||
orcid: "0000-0002-6385-7795" | ||
- name: "Sebastian Funk" | ||
orcid: "0000-0002-2842-3406" | ||
- name: "Adam Kucharski" | ||
orcid: "0000-0001-8814-9421" | ||
- name: "Rosalind M Eggo" | ||
orcid: "0000-0002-0362-6717" | ||
date: last-modified | ||
categories: [outbreak analytics, pipelines, tasks, packages] | ||
bibliography: pipelines.bib | ||
image: "sigmund-4CNNH2KEjhc-unsplash.jpg" | ||
format: | ||
html: | ||
toc: true | ||
--- | ||
|
||
## The Pipeline approach | ||
|
||
We can solve outbreak analytics *tasks* by connecting multiple packages in *pipelines*. | ||
|
||
## Outbreak analytics | ||
|
||
*Outbreak analytics* is a specialized field within data science that focuses on the technological and methodological aspects of the outbreak data pipeline. This includes the systematic collection, analysis, modeling, and reporting of data to inform outbreak response [@polonsky2019outbreak]. | ||
|
||
### Tasks | ||
|
||
We can view outbreak analytics as a set of related data analysis __Tasks__. In @fig-tasks we represent them as a directed graph, where each *node* is a Task and each *directed edge* represents the flow of input and output data. Tasks are connected similarly to the [tidyverse](https://r4ds.hadley.nz/whole-game.html) diagram for exploratory data analysis. | ||
|
||
{#fig-tasks fig-alt="Directed graph where tasks are nodes and data flows are directed edges like arrows. One task connects with multiple other tasks."} | ||
|
||
@fig-tasks-detailed summarizes the data inputs and outputs that connect Tasks. For example, the first task on the left, *Read case data*, takes a data input called *Case data* and produces two data outputs called *Linelist* and *Contact data*. | ||
|
||
{#fig-tasks-detailed} | ||
|
||
One Task can contain different methods and packages for similar data inputs and outputs. | ||
|
||
### Pipelines | ||
|
||
We define a __Pipeline__ as a set of connected Tasks required to obtain an informative outcome for decision-making purposes. | ||
|
||
For example, to quantify the time-varying reproduction number we can follow the *Transmissibility pipeline* (@fig-pipe-01). First, we *Read case data* to generate a linelist. Then, we *Describe case data*, using the linelist as input to generate delay distributions and epicurves. Finally, we use both outputs as inputs to *Quantify transmission* and generate an estimate of transmission. This output allows us to determine the intensity of interventions needed to achieve epidemic control [@cori2017key]. | ||
|
||
{#fig-pipe-01} | ||
|
||
Similarly, to simulate the final size of an epidemic we can follow the *Scenarios pipeline* (@fig-pipe-02). First, we *Read population data* to obtain its demographic distribution and social contact matrix. Next, we take the estimate of transmission, ideally the data output of the *Transmissibility pipeline*. Finally, we use these three data outputs as inputs to *Simulate transmission scenarios* and determine the proportion of the population infected. This output allows us to assess the long-term impact of the outbreak and evaluate intervention choices [@cori2017key]. | ||
|
||
{#fig-pipe-02} | ||
|
||
## How do we use Pipelines? | ||
|
||
We use the Pipeline approach to connect multiple packages in the design of: | ||
|
||
- Reproducible report templates per Pipeline stored in the [`{episoap}`](https://epiverse-trace.github.io/episoap/) package, | ||
- Code scripts stored in the [`{howto}`](https://epiverse-trace.github.io/howto/) repository, and | ||
- [New](https://github.com/orgs/epiverse-trace/discussions/87) packages in relation to other upstream packages and tasks. | ||
|
||
## Attributions | ||
|
||
- The image of this feed is from [Unsplash](https://unsplash.com/photos/4CNNH2KEjhc), provided by [Sigmund](https://unsplash.com/@sigmund), free to use under the [Unsplash License](https://unsplash.com/license). |
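As a rough illustration of the Tasks and Pipelines described in this post, the sketch below encodes the *Transmissibility pipeline* as a small directed graph and derives the order in which its Tasks can run. The Task and data names come from the post; the dictionary layout and the helper names `task_dependencies` and `pipeline_order` are assumptions made for this example and are not part of any Epiverse-TRACE package. It uses Python's standard-library `graphlib` (Python 3.9+).

```python
from graphlib import TopologicalSorter

# Each Task lists the data inputs it needs and the data outputs it produces.
# Names follow the post; this data structure is a hypothetical illustration.
TASKS = {
    "Read case data": {
        "inputs": {"Case data"},
        "outputs": {"Linelist", "Contact data"},
    },
    "Describe case data": {
        "inputs": {"Linelist"},
        "outputs": {"Delay distributions", "Epicurves"},
    },
    "Quantify transmission": {
        "inputs": {"Delay distributions", "Epicurves"},
        "outputs": {"Estimate of transmission"},
    },
}


def task_dependencies(tasks):
    """Map each Task to the set of Tasks that produce at least one of its inputs."""
    producers = {}
    for name, spec in tasks.items():
        for output in spec["outputs"]:
            producers[output] = name
    # Inputs with no producer (e.g. raw "Case data") are external to the graph.
    return {
        name: {producers[i] for i in spec["inputs"] if i in producers}
        for name, spec in tasks.items()
    }


def pipeline_order(tasks):
    """Return the Tasks in an order that respects the data flow between them."""
    return list(TopologicalSorter(task_dependencies(tasks)).static_order())


order = pipeline_order(TASKS)
print(" -> ".join(order))
# Read case data -> Describe case data -> Quantify transmission
```

Deriving the edges from matching data outputs and inputs, rather than listing them by hand, mirrors the idea that Tasks are linked by the data that flows between them.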
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
@article{cori2017key, | ||
doi = {10.1098/rstb.2016.0371}, | ||
url = {https://doi.org/10.1098/rstb.2016.0371}, | ||
year = {2017}, | ||
month = apr, | ||
publisher = {The Royal Society}, | ||
volume = {372}, | ||
number = {1721}, | ||
pages = {20160371}, | ||
author = {Anne Cori and Christl A. Donnelly and Ilaria Dorigatti and Neil M. Ferguson and Christophe Fraser and Tini Garske and Thibaut Jombart and Gemma Nedjati-Gilani and Pierre Nouvellet and Steven Riley and Maria D. Van Kerkhove and Harriet L. Mills and Isobel M. Blake}, | ||
title = {Key data for outbreak evaluation: building on the Ebola experience}, | ||
journal = {Philosophical Transactions of the Royal Society B: Biological Sciences} | ||
} | ||
|
||
@article{polonsky2019outbreak, | ||
doi = {10.1098/rstb.2018.0276}, | ||
url = {https://doi.org/10.1098/rstb.2018.0276}, | ||
title={Outbreak analytics: a developing data science for informing the response to emerging pathogens}, | ||
author={Polonsky, Jonathan A and Baidjoe, Amrish and Kamvar, Zhian N and Cori, Anne and Durski, Kara and Edmunds, W John and Eggo, Rosalind M and Funk, Sebastian and Kaiser, Laurent and Keating, Patrick and others}, | ||
journal={Philosophical Transactions of the Royal Society B}, | ||
volume={374}, | ||
number={1776}, | ||
pages={20180276}, | ||
year={2019}, | ||
publisher={The Royal Society} | ||
} |
Nice! I didn't know about this!
possibly I spend too much time looking at the quarto documentation, hehe
I now realize this may come with issues as we sometimes need to update already published posts (typos, breaking changes in quarto, broken URLs, etc.)