2022-12-31-markus22a.md

abstract: "People often interpret clinical prediction models to detect ‘risk factors’, i.e. to identify variables associated to the outcome. We shed light on the stability of prediction models by performing a large-scale experiment developing over 450 prediction models using LASSO logistic regression and investigating model changes across databases (care settings) and phenotype definitions. Our results show that model stability, as measured by the similarity of selected variables, is poor across the prediction tasks but slightly better for the top (i.e. most important) variables. Differences in the top variables are mostly due to database choice and not due to using different target population and/or outcome phenotype definitions. However, this means using a different database might lead to finding different ‘risk factors’. Furthermore, we found the effect (i.e. sign) of variables is not always the same across models, which makes clinical interpretation of potential ‘risk factors’ difficult. This study shows it is important to be careful when using LASSO regression to identify ‘risk factors’ and not to over-interpret the developed models in general. For ‘risk factor’ detection, we recommend investigating model robustness across settings or using alternative methods (e.g. univariate analysis)."
booktitle: Proceedings of the 7th Machine Learning for Healthcare Conference
title: "Why predicting risk can’t identify ‘risk factors’: empirical assessment of model stability in machine learning across observational health databases"
volume: 182
year: 2022
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: markus22a
month: 0
tex_title: "Why predicting risk can’t identify ‘risk factors’: empirical assessment of model stability in machine learning across observational health databases"
firstpage: 828
lastpage: 852
page: 828-852
order: 828
cycles: false
bibtex_author: Markus, Aniek F. and Rijnbeek, Peter R. and Reps, Jenna M.
author:
- given: Aniek F.
  family: Markus
- given: Peter R.
  family: Rijnbeek
- given: Jenna M.
  family: Reps
date: 2022-12-31
address:
container-title: Proceedings of the 7th Machine Learning for Healthcare Conference
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 12
  - 31
pdf:
extras: