Posted on 2023-10-19, 14:20. Authored by Matthew Davis, Kit Simpson, Leslie A. Lenert, Vanessa Diaz, Alexander V. Alekseyenko
Cross-validated model training results on the test fold, showing logistic regression as the best performer by AUC and SVM as the best performer by F1 score.
Funding
HHS | NIH | National Cancer Institute (NCI)
ARTICLE ABSTRACT
Cancer is the second leading cause of death in the United States, and breast cancer is the fourth leading cause of cancer-related death, with 42,275 women dying of breast cancer in the United States in 2020. Screening is a key strategy for reducing mortality from breast cancer and is recommended by various national guidelines. This study applies machine learning classification methods to predict which patients will fail to complete a mammogram screening after one is ordered, and to understand the underlying features that influence those predictions. The results show that a small group of patients who are very unlikely to complete mammogram screening can be identified, enabling care managers to focus resources.
The motivation behind this study is to create an automated system that can identify a small group of individuals who are at elevated risk of not completing an ordered mammogram screening. This will allow interventions that boost screening to be focused on the patients least likely to complete it.