spatialstatisticsupna/IHD_ST_patterns

A fast approach for analyzing spatio-temporal patterns in ischaemic heart disease mortality across US counties (1999-2021)

This repository contains the R code to reproduce the analysis presented in the paper entitled "A fast approach for analyzing spatio-temporal patterns in ischaemic heart disease mortality across US counties (1999-2021)" (Urdangarin et al., 2025).

Data

Ischaemic heart disease (IHD) mortality data in US counties during 1999-2021

The annual IHD mortality totals for each US county between 1999 and 2021, along with the corresponding county populations, have been obtained from the Centers for Disease Control and Prevention’s (CDC) Wide-ranging Online Data for Epidemiologic Research (WONDER) website, https://wonder.cdc.gov/. Here, we use the publicly available data provided by NCHS to CDC WONDER, in which counts below 10 are suppressed for confidentiality reasons.

The datasets in IHD_data.Rdata include 3105 counties from all the US states except Alaska and Hawaii. Additionally, Puerto Rico, Samoa, the Dukes and Nantucket islands in Massachusetts, as well as San Juan Island in Washington, are excluded. Details regarding the data preprocessing procedures are provided in Supplementary Material A. We have considered the US administrative division into four geographic regions, namely, West, Midwest, South, and Northeast, encompassing the selected 48 states.

The IHD_data.Rdata file contains the following objects:

  • counts: contains the IHD mortality counts for each year. Suppressed counts are denoted as NA. It is a dataframe where the first column, labeled GEOID, serves as the identifier for counties, and the subsequent columns correspond to the counts for each year during 1999-2021.
  • pop: contains the population for each county in each year. It is a dataframe where the first column, labeled GEOID, serves as the identifier for counties, and the subsequent columns correspond to the population for each year during 1999-2021.
  • pop.NAs: contains the population for each county in each year, with the population corresponding to suppressed counts denoted as NA. It is a dataframe where the first column, labeled GEOID, serves as the identifier for counties, and the subsequent columns correspond to the population for each year during 1999-2021.
  • carto: cartography of the 3105 US counties. It includes the following variables:
    • STATEFP: state level FIPS codes
    • GEOID: Geographic Identifiers of the counties
    • NAME: names of the counties
    • STATEFP2: state level FIPS codes of the partition chosen for "divide and conquer" approach
    • State.Name: names of the states
    • Region: 1=West, 2=Midwest, 3=South, 4=Northeast
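
As a minimal sketch of how the layout described above can be inspected, the snippet below builds a toy data frame shaped like `counts` (a GEOID column followed by one column per year) and summarises the suppressed cells. The values and the year column names `1999` and `2000` are assumptions for illustration only; the actual column names in IHD_data.Rdata may differ. With the real data, one would start from `load("IHD_data.Rdata")` instead of the toy object.

```r
# Toy stand-in for the `counts` object (hypothetical values and year labels).
# With the real data: load("IHD_data.Rdata")
counts_toy <- data.frame(
  GEOID = c("01001", "01003", "01005"),
  `1999` = c(25, NA, 12),
  `2000` = c(30, NA, NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Logical matrix flagging suppressed (NA) cells in the year columns.
is_suppressed <- is.na(counts_toy[, -1, drop = FALSE])

# Proportion of suppressed counts in each year.
prop_suppressed <- colMeans(is_suppressed)

# Counties with at least one suppressed count in the period.
any_suppressed <- counts_toy$GEOID[rowSums(is_suppressed) > 0]

print(prop_suppressed)
print(any_suppressed)
```

The same two summaries applied to the real `counts` object would indicate how much of the data has to be imputed before modelling.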

The folder Data includes three .txt files with some additional information.

  • Capitals.txt: includes information on the capitals of the 48 states in the US. For each state, the capital city is listed under the column Capital and the column County contains the county where the capital is situated. The last two columns refer to the FIPS (Federal Information Processing Standards) code of the state and the corresponding GEOID (Geographic Identifier) of the county.
  • FP_StateNames.txt: comprises the FIPS code that corresponds to each state and the region in which each state is located.
  • GEOID_vs_CountyNames.txt: comprises the GEOID of each county in the US.

Imputed data

The IHD_imputed_data.Rdata encompasses the counts dataset after imputing the missing values. Specifically, it contains the following objects:

  • counts: contains the IHD mortality counts for each year after imputing the missing data.
  • pop: contains the population for each county in each year.
  • exp: contains the expected cases for each county in each year, computed using the overall US rate over the whole study period (for details, see page 6 of the paper).
  • carto: cartography of the 3105 US counties.
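
The description of `exp` above can be illustrated with a small worked example. The sketch below assumes that the expected cases are the county-year population multiplied by the overall rate (all deaths divided by all person-years over the study period); the paper (page 6) remains the reference for the exact definition. All object names ending in `_toy` and all values are hypothetical.

```r
# Toy imputed counts and populations (hypothetical values):
# rows = counties, columns = years.
counts_toy <- matrix(c(12, 18, 33, 37), nrow = 2,
                     dimnames = list(c("01001", "01003"), c("1999", "2000")))
pop_toy <- matrix(c(1000, 2000, 3000, 4000), nrow = 2,
                  dimnames = dimnames(counts_toy))

# Overall rate: all deaths divided by all person-years in the study period.
overall_rate <- sum(counts_toy) / sum(pop_toy)

# Expected cases: county-year population times the overall rate.
exp_toy <- pop_toy * overall_rate

# Standardized mortality ratio (observed / expected) per county and year.
smr_toy <- counts_toy / exp_toy

print(exp_toy)
print(round(smr_toy, 3))
```

Under this assumption the expected cases sum to the observed total, so ratios above 1 mark county-years with more deaths than the overall rate would predict.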

R code

The folder labeled R contains the code to impute the missing counts and reproduce all the tables and figures of the paper.

Computations were run using R-4.2.1, R-INLA version 22.12.16 (dated 2022-12-23) and bigDM version 0.5.3.
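
Before rerunning the scripts, it may help to compare the local setup with the versions listed above. The following is only a sketch: it assumes the packages are installed under their usual names `INLA` and `bigDM`, and it reports `NA` for any package that is not found rather than failing.

```r
# Versions reported in this README (INLA and bigDM are the R package names).
expected <- c(INLA = "22.12.16", bigDM = "0.5.3")

# Installed version of each package, or NA if the package is not installed.
installed <- vapply(names(expected), function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(packageVersion(pkg))
  } else {
    NA_character_
  }
}, character(1))

version_check <- data.frame(
  package = names(expected),
  expected = unname(expected),
  installed = unname(installed),
  stringsAsFactors = FALSE
)

print(R.version.string)
print(version_check)
```

Differences in versions do not necessarily prevent the code from running, but they are a natural first thing to check if results differ from those in the paper.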

Acknowledgements

This work has been supported by Project PID2020-113125RB-I00/MCIN/AEI/10.13039/501100011033 and by Project UNEDPAM/PI/PR24/05A.

References

Urdangarin, A., Goicoa, T., Congdon, P. and Ugarte, M.D. (2025). A fast approach for analyzing spatio-temporal patterns in ischaemic heart disease mortality across US counties (1999-2021). Spatial and Spatio-temporal Epidemiology, 52, 100700. doi: https://doi.org/10.1016/j.sste.2024.100700.
