Supplementary Data from Recalibrating Risk Prediction Models by Synthesizing Data Sources: Adapting the Lung Cancer PLCO Model for Taiwan
Funding
Ministry of Health and Welfare (MOHW)
National Health Research Institutes (NHRI)
Ministry of Science and Technology, Taiwan (MOST)
ARTICLE ABSTRACT
Methods that synthesize multiple data sources without prospective datasets have been proposed for developing absolute risk models. This study proposed methods for adapting risk models to another population that lacks prospective cohorts, which would help alleviate the health disparities caused by advances in absolute risk models. As an example, we adapted the lung cancer risk model PLCOM2012, well studied in the West, for Taiwan.
Using multiple Taiwanese data sources, we formed an age-matched case–control study of ever-smokers (AMCCSE), estimated the number of ever-smoking lung cancer patients in 2011–2016 (NESLP2011), and synthesized a dataset (SPES2010) resembling the 2010 population of cancer-free ever-smokers with respect to the PLCOM2012 risk factors. The AMCCSE was used to estimate the overall calibration slope. To address calibration-in-the-large, we required that NESLP2011 equal the total estimated risk summed over the individuals in SPES2010.
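The calibration-in-the-large constraint described above can be illustrated with a minimal sketch. It assumes the adapted model is logistic in a linear predictor, that the calibration slope has already been estimated elsewhere, and that the intercept is then chosen so the summed predicted risks match a target case count. The function names, the bisection solver, and all toy numbers are hypothetical and are not taken from the study.

```python
import math

def expit(x):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def total_risk(intercept, slope, linear_predictors):
    """Sum of predicted risks over a (synthetic) population."""
    return sum(expit(intercept + slope * lp) for lp in linear_predictors)

def solve_intercept(slope, linear_predictors, target_cases,
                    lo=-30.0, hi=30.0, tol=1e-10):
    """Find the intercept at which total predicted risk equals target_cases.

    total_risk is strictly increasing in the intercept, so bisection
    converges whenever the target lies between the two bracket values.
    """
    if not (total_risk(lo, slope, linear_predictors) < target_cases
            < total_risk(hi, slope, linear_predictors)):
        raise ValueError("target_cases is outside the bracketed range")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if total_risk(mid, slope, linear_predictors) < target_cases:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Toy inputs (hypothetical): original-model linear predictors for five
# synthetic individuals, an assumed calibration slope, and an assumed
# target number of cases playing the role of NESLP2011.
toy_lps = [-6.0, -5.0, -4.5, -4.0, -3.0]
toy_slope = 0.9
toy_target = 0.08

intercept = solve_intercept(toy_slope, toy_lps, toy_target)
print("recalibrated intercept:", intercept)
print("total predicted risk:", total_risk(intercept, toy_slope, toy_lps))
```

Bisection is used here only because the constraint is a one-dimensional monotone equation; any root finder would serve the same purpose.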
The adapted models PLCOT-1 and PLCOT-2 had AUCs of 0.78 and 0.75, respectively. Both performed well in calibration and clinical usefulness on subgroups of SPES2010 defined by age and smoking experience. Selecting the same number of individuals for low-dose computed tomography screening using PLCOT-1 (PLCOT-2) would have identified approximately 6% (8%) more lung cancers than the US Preventive Services Task Force 2021 criteria. Smokers with 40+ pack-years had an average PLCOT-1 (PLCOT-2) risk of 3.8% (2.6%).
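The equal-sized screening comparison can be sketched as follows. The sketch assumes that the expected number of cancers in a selected group is approximated by the sum of predicted risks in that group; the study's own evaluation may differ. The helper names and the toy risks and eligibility flags are hypothetical.

```python
def expected_cases(risks, selected):
    """Expected number of cases among selected individuals (sum of risks)."""
    return sum(r for r, s in zip(risks, selected) if s)

def select_top_n(risks, n):
    """Flag the n individuals with the highest predicted risk."""
    order = sorted(range(len(risks)), key=lambda i: risks[i], reverse=True)
    chosen = set(order[:n])
    return [i in chosen for i in range(len(risks))]

# Toy data (hypothetical): predicted risks and a categorical eligibility
# rule standing in for fixed screening criteria.
toy_risks = [0.040, 0.030, 0.020, 0.010, 0.005, 0.002]
toy_criteria = [True, False, True, True, False, False]

# Select the same number of people by model risk as the criteria select.
n_selected = sum(toy_criteria)
model_selected = select_top_n(toy_risks, n_selected)

cases_criteria = expected_cases(toy_risks, toy_criteria)
cases_model = expected_cases(toy_risks, model_selected)
relative_gain = cases_model / cases_criteria - 1.0
print("relative gain in expected cases:", relative_gain)
```

Holding the number screened fixed isolates the effect of who is selected, so any gain reflects better targeting rather than a larger program.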
The adapted PLCOT models had high predictive performance.
The PLCOT models could be used to design lung cancer screening programs in Taiwan. The methods could be applicable to other cancer models.