Adding Tinas Blog (#969)

* Adding Tinas Blog
* removed EIOS from former post
* adding AAAI paper to Delphi website
* change people file to add Cat
cmu-delphi · Aug 14, 2024 · bad57df
1 parent 79b715e
Showing 10 changed files with 87 additions and 11 deletions.
diff --git a/content/about/publications/images/ranking.png b/content/about/publications/images/ranking.png
diff --git a/content/about/publications/index.md b/content/about/publications/index.md
@@ -1,6 +1,12 @@
 ---
 title: Research
 papers:
+  - title: "Outlier Ranking for Large-Scale Public Health Data"
+    image: ranking.png
+    authors: Joshi, Townes T., Gormley, Neureiter, Wilder, Rosenfeld
+    link: https://ojs.aaai.org/index.php/AAAI/article/view/30222
+    year: 2024
+    journal: Association for the Advancement of Artificial Intelligence
   - title: "Smooth Multi-Period Forecasting with Application to Prediction of COVID-19 Cases"
     image: smoothing-paper-teaser.jpg
     authors: Tuzhilina, Hastie, McDonald, Tay, Tibshirani

diff --git a/content/blog/2024-01-01-flash-intro.Rmd b/content/blog/2024-01-01-flash-intro.Rmd
@@ -9,6 +9,7 @@ authors:
   - nolan
   - richa
   - tina
+  - catalina
 heroImage: blog-lg-flash.jpeg
 heroImageThumb: blog-thumb-flash.jpeg
 summary: | 
@@ -29,13 +30,34 @@ These issues, if undetected, can have critical downstream ramifications for data
 
 ![Fig 1. Data quality changes in case counts, shown by the large spikes in March and July 2022, when cases were trending down, resulted in similar spikes for predicted counts (red) from multiple forecasts that were then sent to the US CDC. A weekly forecast per state, for cases, hospitalizations, and deaths, up to 4 weeks in the future means that modeling teams would have to review 600 forecasts per week and may not have been able to catch the upstream data issue.](/blog/2024-01-01-flash-intro/forecast.jpg)
 
+We care about finding data issues like these so that we can alert downstream data users accordingly. That is why our goal in the FlaSH team (Flagging Anomalies in Streams related to public Health) is to quickly identify data points that warrant human inspection and create tools to support data review. Towards this goal, our team of researchers, engineers, and data reviewers iterate on our deployed interdisciplinary approach. We will cover the different methods and perspectives of the FlaSH project, starting with the visualization and user experience perspectives. 
 
-We care about finding data issues like these so that we can alert downstream data users accordingly. That is why our goal in the FlaSH team (Flagging Anomalies in Streams related to public Health) is to quickly identify data points that warrant human inspection and create tools to support data review. Towards this goal, our team of researchers, engineers, and data reviewers iterate on our deployed interdisciplinary approach. In this blog series, we will cover the different methods and perspectives of the FlaSH project.
+## Visualization and User Experience
+*Perspectives from our expert data reviewer, who has been working with this system for over a year -- Tina Townes.*
 
-Members: Ananya Joshi, Nolan Gormley, Richa Gadgil, Tina Townes   \  
+<center><div class="float">[![**Fig 2a.** Revised FlaSH Dashboard](/blog/2024-01-01-flash-intro/new_dash.png)](/blog/2024-01-01-flash-intro/new_dash.png)</div></center>
+
+<p>In its initial stages, the FlaSH dashboard (Fig 2b) only let me assess potential anomalies by viewing graphs line by line, one for each location of the many signals that the FlaSH program had flagged. This was a particularly daunting task because the daily FlaSH outputs produced, and continue to produce, a large number of reports in the form of compressed lines that I had to click to expand and reveal more details. Without the new dashboard's features, I spent a significant amount of time scrolling through the daily list of anomaly reports and manually sorting what I wanted to review: I would click on and expand only certain report lines, then leave them expanded until I had finished selecting and was ready to review them. I would also often make notes and document interesting patterns in anomalies in a separate notepad, which slowed my review process. My attention became divided as I was parsing through the daily anomaly list to search for reports in certain geographies (which I knew I wanted to examine because of prior report patterns) while simultaneously trying to focus on assessing new anomalies.</p>
+
+<center><div class="float">[![**Fig 2b.** Prior FlaSH Dashboard](/blog/2024-01-01-flash-intro/old_dash.png)](/blog/2024-01-01-flash-intro/old_dash.png)</div></center>
+
+
+<p>With the old dashboard setup, reviewing the lines of daily anomaly reports was not easy because I couldn't efficiently filter incoming anomalies when I needed to examine specific geographic areas or signals. For example, one week I saw a lot of anomaly reports in a county in Puerto Rico from Monday through Wednesday. By Thursday, I wanted to filter the daily anomaly reports down to that Puerto Rican county as soon as I logged into the platform, but the old dashboard had no way of filtering by geography. The updated dashboard now has a menu that lets me filter lines not only by geographic region but also by indicator. This new setup speeds up my daily review process because I can quickly focus on specific geographies, finish reviewing those, and then move on to anomaly reports in other geographies.
+</p>
+
+
+<p>Now, in its current iteration (Fig 2a), the FlaSH dashboard lets me easily filter daily anomaly results by various variables including geos and signal types, and also view a national map offering a quick glimpse of locations of high FlaSH scores. Furthermore, the updated FlaSH dashboard now enables me to take detailed notes on particularly interesting anomalies, trends and other issues of importance, and maintain these notes in an organized, searchable fashion within the platform.</p>
+<p>Finally, now with the dashboard’s repositioned filtering menu, the page layout becomes an even more familiar environment. The menu echoes the user-friendly layouts of popular retail and informational sites, making navigation much more intuitive and smoother, thus allowing me to work through various options more quickly.</p>
+<p>These new dashboard features allow me to devote more of my time and efforts to assessing anomalies of interest and focus on geographies with high concentrations of problematic data or noteworthy trends.</p>
+
+## Additional Information 
+For more information, please check out our [demo video](https://www.youtube.com/watch?v=fWe6M-rTQQ0), open-source methods [(1)](https://github.com/cmu-delphi/covidcast-indicators/blob/main/_delphi_utils_python/delphi_utils/flash_eval/eval_day.py) [(2)](https://github.com/Ananya-Joshi/outshines_sparky), and publications [(1)](https://arxiv.org/abs/2306.16914) [(2)](https://ojs.aaai.org/index.php/AAAI/article/view/30222).
+
+
+Members: Ananya Joshi, Nolan Gormley, Richa Gadgil, Tina Townes, Catalina Vajiac (part time)   \  
 
 Former Members: Luke Neureiter, Katie Mazaitis  \   
 
 Advisors: Peter Jhon, Roni Rosenfeld, Bryan Wilder
 
-
+**Revised July 12th 2024**
diff --git a/content/blog/2024-01-01-flash-intro.html b/content/blog/2024-01-01-flash-intro.html
@@ -9,6 +9,7 @@
   - nolan
   - richa
   - tina
+  - catalina
 heroImage: blog-lg-flash.jpeg
 heroImageThumb: blog-thumb-flash.jpeg
 summary: | 
@@ -19,19 +20,59 @@
 ---
 
 
+<div id="TOC">
+<ul>
+<li><a href="#visualization-and-user-experience" id="toc-visualization-and-user-experience">Visualization and User Experience</a></li>
+<li><a href="#additional-information" id="toc-additional-information">Additional Information</a></li>
+</ul>
+</div>
 
-<p>Delphi publishes millions of public-health-related data points per day, including the total number of daily influenza cases, hospitalizations, and deaths per county and state in the United States (US). This data helps public health practitioners, data professionals, and members of the public make important, informed decisions relating to health and well-being.</p>
+<p>Delphi publishes millions of public-health-related data points per day, such as the total number of daily COVID-19 cases, hospitalizations, and deaths per county and state in the United States (US). This data helps public health practitioners, data professionals, and members of the public make important, informed decisions relating to health and well-being.</p>
 <p>Yet, as data volumes continue to grow quickly (Delphi’s data volume expanded 1000x in just 3 years), it is infeasible for data reviewers to inspect every one of these data points for subtle changes in</p>
 <ul>
 <li>quality (like those resulting from data delays) or</li>
 <li>disease dynamics (like an outbreak).</li>
 </ul>
 <p>These issues, if undetected, can have critical downstream ramifications for data users (as shown by the example in Fig 1).</p>
-<div class="figure">
-<img src="/blog/2024-01-01-flash-intro/forecast.jpg" alt="" />
-<p class="caption">Fig 1. Data quality changes in case counts, shown by the large spikes in March and July 2022, when cases were trending down, resulted in similar spikes for predicted counts (red) from multiple forecasts that were then sent to the US CDC. A weekly forecast per state, for cases, hospitalizations, and deaths, up to 4 weeks in the future means that modeling teams would have to review 600 forecasts per week and may not have been able to catch the upstream data issue.</p>
+<div class="float">
+<img src="/blog/2024-01-01-flash-intro/forecast.jpg" alt="Fig 1. Data quality changes in case counts, shown by the large spikes in March and July 2022, when cases were trending down, resulted in similar spikes for predicted counts (red) from multiple forecasts that were then sent to the US CDC. A weekly forecast per state, for cases, hospitalizations, and deaths, up to 4 weeks in the future means that modeling teams would have to review 600 forecasts per week and may not have been able to catch the upstream data issue." />
+<div class="figcaption">Fig 1. Data quality changes in case counts, shown by the large spikes in March and July 2022, when cases were trending down, resulted in similar spikes for predicted counts (red) from multiple forecasts that were then sent to the US CDC. A weekly forecast per state, for cases, hospitalizations, and deaths, up to 4 weeks in the future means that modeling teams would have to review 600 forecasts per week and may not have been able to catch the upstream data issue.</div>
+</div>
+<p>We care about finding data issues like these so that we can alert downstream data users accordingly. That is why our goal in the FlaSH team (Flagging Anomalies in Streams related to public Health) is to quickly identify data points that warrant human inspection and create tools to support data review. Towards this goal, our team of researchers, engineers, and data reviewers iterate on our deployed interdisciplinary approach. We will cover the different methods and perspectives of the FlaSH project, starting with the visualization and user experience perspectives.</p>
+<div id="visualization-and-user-experience" class="section level2">
+<h2>Visualization and User Experience</h2>
+<p><em>Perspectives from our expert data reviewer, who has been working with this system for over a year – Tina Townes.</em></p>
+<center>
+<div class="float">
+<a href="/blog/2024-01-01-flash-intro/new_dash.png"><img src="/blog/2024-01-01-flash-intro/new_dash.png" alt="Fig 2a. Revised FlaSH Dashboard" /></a>
 </div>
-<p>We care about finding data issues like these so that we can alert downstream data users accordingly. That is why our goal in the FlaSH team (Flagging Anomalies in Streams related to public Health) is to quickly identify data points that warrant human inspection and create tools to support data review. Towards this goal, our team of researchers, engineers, and data reviewers iterate on our deployed interdisciplinary approach. In this blog series, we will cover the different methods and perspectives of the FlaSH project.</p>
-<p>Members: Ananya Joshi, Nolan Gormley, Richa Gadgil, Tina Townes  </p>
+</center>
+<p>
+In its initial stages, the FlaSH dashboard (Fig 2b) only let me assess potential anomalies by viewing graphs line by line, one for each location of the many signals that the FlaSH program had flagged. This was a particularly daunting task because the daily FlaSH outputs produced, and continue to produce, a large number of reports in the form of compressed lines that I had to click to expand and reveal more details. Without the new dashboard’s features, I spent a significant amount of time scrolling through the daily list of anomaly reports and manually sorting what I wanted to review: I would click on and expand only certain report lines, then leave them expanded until I had finished selecting and was ready to review them. I would also often make notes and document interesting patterns in anomalies in a separate notepad, which slowed my review process. My attention became divided as I was parsing through the daily anomaly list to search for reports in certain geographies (which I knew I wanted to examine because of prior report patterns) while simultaneously trying to focus on assessing new anomalies.
+</p>
+<center>
+<div class="float">
+<a href="/blog/2024-01-01-flash-intro/old_dash.png"><img src="/blog/2024-01-01-flash-intro/old_dash.png" alt="Fig 2b. Prior FlaSH Dashboard" /></a>
+</div>
+</center>
+<p>
+With the old dashboard setup, reviewing the lines of daily anomaly reports was not easy because I couldn’t efficiently filter incoming anomalies when I needed to examine specific geographic areas or signals. For example, one week I saw a lot of anomaly reports in a county in Puerto Rico from Monday through Wednesday. By Thursday, I wanted to filter the daily anomaly reports down to that Puerto Rican county as soon as I logged into the platform, but the old dashboard had no way of filtering by geography. The updated dashboard now has a menu that lets me filter lines not only by geographic region but also by indicator. This new setup speeds up my daily review process because I can quickly focus on specific geographies, finish reviewing those, and then move on to anomaly reports in other geographies.
+</p>
+<p>
+Now, in its current iteration (Fig 2a), the FlaSH dashboard lets me easily filter daily anomaly results by various variables including geos and signal types, and also view a national map offering a quick glimpse of locations of high FlaSH scores. Furthermore, the updated FlaSH dashboard now enables me to take detailed notes on particularly interesting anomalies, trends and other issues of importance, and maintain these notes in an organized, searchable fashion within the platform.
+</p>
+<p>
+Finally, now with the dashboard’s repositioned filtering menu, the page layout becomes an even more familiar environment. The menu echoes the user-friendly layouts of popular retail and informational sites, making navigation much more intuitive and smoother, thus allowing me to work through various options more quickly.
+</p>
+<p>
+These new dashboard features allow me to devote more of my time and efforts to assessing anomalies of interest and focus on geographies with high concentrations of problematic data or noteworthy trends.
+</p>
+</div>
+<div id="additional-information" class="section level2">
+<h2>Additional Information</h2>
+<p>For more information, please check out our <a href="https://www.youtube.com/watch?v=fWe6M-rTQQ0">demo video</a>, open-source methods <a href="https://github.com/cmu-delphi/covidcast-indicators/blob/main/_delphi_utils_python/delphi_utils/flash_eval/eval_day.py">(1)</a> <a href="https://github.com/Ananya-Joshi/outshines_sparky">(2)</a>, and publications <a href="https://arxiv.org/abs/2306.16914">(1)</a> <a href="https://ojs.aaai.org/index.php/AAAI/article/view/30222">(2)</a>.</p>
+<p>Members: Ananya Joshi, Nolan Gormley, Richa Gadgil, Tina Townes, Catalina Vajiac (part time)  </p>
 <p>Former Members: Luke Neureiter, Katie Mazaitis  </p>
 <p>Advisors: Peter Jhon, Roni Rosenfeld, Bryan Wilder</p>
+<p><strong>Revised July 12th 2024</strong></p>
+</div>
diff --git a/content/blog/2024-01-30-flash-framework.Rmd b/content/blog/2024-01-30-flash-framework.Rmd
@@ -13,7 +13,7 @@ output:
     toc: true
 acknowledgements: Thank you to George Haff, Carlyn Van Dyke, and Ron Lunde for editing this blog post. 
 ---
-Insights from public health data can keep communities safe. However, identifying these insights in large volumes of modern public health data can be laborious^[Rosen, George. A history of public health. JHU Press, 2015.]. As a result, over the past few decades, public health agencies have built monitoring systems, like [ESSENCE](https://www.cdc.gov/nssp/new-users.html) (CDC), [EIOS](https://www.who.int/initiatives/eios) (WHO), and [DHIS2](https://dhis2.org/) (WHO), where users can set custom statistical alerts and then investigate these alerts using data visualizations^[Chen, Hsinchun, Daniel Zeng, and Ping Yan. Infectious disease informatics: syndromic surveillance for public health and biodefense. Vol. 21. New York: Springer, 2010.]. These alerting systems largely follow the following formula^[Murphy, Sean Patrick, and Howard Burkom. "Recombinant temporal aberration detection algorithms for enhanced biosurveillance." Journal of the American Medical Informatics Association 15.1 (2008): 77-86.] as shown in Fig 1.:
+Insights from public health data can keep communities safe. However, identifying these insights in large volumes of modern public health data can be laborious^[Rosen, George. A history of public health. JHU Press, 2015.]. As a result, over the past few decades, public health agencies have built monitoring systems, like [ESSENCE](https://www.cdc.gov/nssp/new-users.html) (CDC) and [DHIS2](https://dhis2.org/) (WHO), where users can set custom statistical alerts and then investigate these alerts using data visualizations^[Chen, Hsinchun, Daniel Zeng, and Ping Yan. Infectious disease informatics: syndromic surveillance for public health and biodefense. Vol. 21. New York: Springer, 2010.]. These alerting systems largely follow the following formula^[Murphy, Sean Patrick, and Howard Burkom. "Recombinant temporal aberration detection algorithms for enhanced biosurveillance." Journal of the American Medical Informatics Association 15.1 (2008): 77-86.] as shown in Fig 1.:
 
 
 <center>

diff --git a/content/blog/2024-01-30-flash-framework.html b/content/blog/2024-01-30-flash-framework.html
@@ -21,7 +21,7 @@
 </ul>
 </div>
 
-<p>Insights from public health data can keep communities safe. However, identifying these insights in large volumes of modern public health data can be laborious<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a>. As a result, over the past few decades, public health agencies have built monitoring systems, like <a href="https://www.cdc.gov/nssp/new-users.html">ESSENCE</a> (CDC), <a href="https://www.who.int/initiatives/eios">EIOS</a> (WHO), and <a href="https://dhis2.org/">DHIS2</a> (WHO), where users can set custom statistical alerts and then investigate these alerts using data visualizations<a href="#fn2" class="footnote-ref" id="fnref2"><sup>2</sup></a>. These alerting systems largely follow the following formula<a href="#fn3" class="footnote-ref" id="fnref3"><sup>3</sup></a> as shown in Fig 1.:</p>
+<p>Insights from public health data can keep communities safe. However, identifying these insights in large volumes of modern public health data can be laborious<a href="#fn1" class="footnote-ref" id="fnref1"><sup>1</sup></a>. As a result, over the past few decades, public health agencies have built monitoring systems, like <a href="https://www.cdc.gov/nssp/new-users.html">ESSENCE</a> (CDC) and <a href="https://dhis2.org/">DHIS2</a> (WHO), where users can set custom statistical alerts and then investigate these alerts using data visualizations<a href="#fn2" class="footnote-ref" id="fnref2"><sup>2</sup></a>. These alerting systems largely follow the following formula<a href="#fn3" class="footnote-ref" id="fnref3"><sup>3</sup></a> as shown in Fig 1.:</p>
 <center>
 <div class="float">
 <img src="/blog/2024-01-30-flash-framework/image3.png" alt="Fig 1 Standard Approach for Alerting Systems" />

diff --git a/content/people/headshots/catalina.jpeg b/content/people/headshots/catalina.jpeg
diff --git a/content/people/index.md b/content/people/index.md
@@ -912,6 +912,13 @@ people:
   affiliation: CMU/MLD
   team:
   - core
+- key: catalina
+  firstName: Catalina
+  lastName: Vajiac
+  image: catalina.jpeg
+  affiliation: CMU/CSD
+  team:
+  - contributors
 - firstName: Ana&nbsp;Karina
   lastName: Van&nbsp;Nortwick
   image: ana-karina-van-nortwick.jpeg

diff --git a/static/blog/2024-01-01-flash-intro/new_dash.png b/static/blog/2024-01-01-flash-intro/new_dash.png
diff --git a/static/blog/2024-01-01-flash-intro/old_dash.png b/static/blog/2024-01-01-flash-intro/old_dash.png