abstract: "People often interpret clinical prediction models to detect ‘risk factors’, i.e. to identify variables associated with the outcome. We shed light on the stability of prediction models by performing a large-scale experiment in which we develop over 450 prediction models using LASSO logistic regression and investigate how the models change across databases (care settings) and phenotype definitions. Our results show that model stability, as measured by the similarity of selected variables, is poor across the prediction tasks but slightly better for the top (i.e. most important) variables. Differences in the top variables are mostly due to database choice rather than to different target population and/or outcome phenotype definitions. This nevertheless means that using a different database might lead to finding different ‘risk factors’. Furthermore, we found that the effect (i.e. sign) of variables is not always the same across models, which makes clinical interpretation of potential ‘risk factors’ difficult. This study shows it is important to be careful when using LASSO regression to identify ‘risk factors’ and not to over-interpret the developed models in general. For ‘risk factor’ detection, we recommend investigating model robustness across settings or using alternative methods (e.g. univariate analysis)."
booktitle: Proceedings of the 7th Machine Learning for Healthcare Conference
title: 'Why predicting risk can’t identify ‘risk factors’: empirical assessment of model stability in machine learning across observational health databases'
volume: 182
year: 2022
layout: inproceedings
series: Proceedings of Machine Learning Research
publisher: PMLR
issn: 2640-3498
id: markus22a
month: 0
tex_title: 'Why predicting risk can’t identify ‘risk factors’: empirical assessment of model stability in machine learning across observational health databases'
firstpage: 828
lastpage: 852
page: 828-852
order: 828
cycles: false
bibtex_author: Markus, Aniek F. and Rijnbeek, Peter R. and Reps, Jenna M.
author:
- given: Aniek F.
  family: Markus
- given: Peter R.
  family: Rijnbeek
- given: Jenna M.
  family: Reps
date: 2022-12-31
address:
container-title: Proceedings of the 7th Machine Learning for Healthcare Conference
genre: inproceedings
issued:
  date-parts:
  - 2022
  - 12
  - 31
pdf: https://proceedings.mlr.press/v182/markus22a/markus22a.pdf
extras:
