Comparison of AUCs between PKU-M model, Brock model, PKU model, Mayo model, and VA model in subgroups
National Natural Science Foundation of China
Peking University People's Hospital Research and Development Funds
ARTICLE ABSTRACT
Nodule evaluation is challenging and critical in the diagnosis of multiple pulmonary nodules (MPNs). We aimed to develop and validate a machine learning–based model to estimate the probability of malignancy of MPNs to guide decision-making.
A boosted ensemble algorithm (XGBoost) was used to predict malignancy from the clinicoradiologic variables of 1,739 nodules from 520 patients with MPNs at a Chinese center. The model (PKU-M model) was trained using 10-fold cross-validation, during which hyperparameters were selected and fine-tuned. The model was validated and compared with solitary pulmonary nodule (SPN) models, clinicians, and a computer-aided diagnosis (CADx) system in an independent transnational cohort and a prospective multicentric cohort.
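The 10-fold cross-validation split described above can be illustrated with a minimal standard-library sketch. The helper name `k_fold_indices` and the fixed random seed are hypothetical. The abstract does not state whether folds were stratified by outcome or grouped by patient, so this shows only a plain shuffled split over the 1,739 nodules, not the authors' actual procedure.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) index lists for plain k-fold cross-validation.

    Hypothetical helper for illustration only; no stratification or
    patient-level grouping is applied.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, val_idx


# 1,739 nodules, as in the development cohort; each nodule is held out once.
folds = list(k_fold_indices(1739, k=10))
```

In each round, a candidate hyperparameter setting would be fitted on `train_idx` and scored on `val_idx`, and the setting with the best average validation score across the ten folds would be retained.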
The PKU-M model showed excellent discrimination [area under the curve (AUC), 0.909; 95% confidence interval (95% CI), 0.854–0.946] and calibration (Brier score, 0.122) in the development cohort. External validation (583 nodules) revealed that the AUC (95% CI) of the PKU-M model was 0.890 (0.859–0.916), higher than those of the Brock model [0.806 (0.771–0.838)], PKU model [0.780 (0.743–0.817)], Mayo model [0.739 (0.697–0.776)], and VA model [0.682 (0.640–0.722)]. Prospective comparison (200 nodules) showed that the AUC of the PKU-M model [0.871 (0.815–0.915)] was higher than those of the surgeons [0.790 (0.711–0.852), 0.741 (0.662–0.804), and 0.727 (0.650–0.788)], the radiologist [0.748 (0.671–0.814)], and the CADx system [0.757 (0.682–0.818)]. Furthermore, the model outperformed the clinicians, with an increase of 14.3% in sensitivity and 7.8% in specificity.
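For readers less familiar with the metrics reported above, the following minimal sketch shows how AUC, Brier score, sensitivity, and specificity are defined. The function names, the 0.5 decision threshold, and the six-nodule toy data are all hypothetical and are not taken from the study. The confidence intervals reported above would require an additional procedure, such as bootstrapping, which is not shown.

```python
def auc(y_true, y_prob):
    """AUC as the probability that a random malignant case scores higher
    than a random benign case (ties count as one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def sensitivity_specificity(y_true, y_prob, threshold=0.5):
    """Sensitivity and specificity when probabilities >= threshold are called
    malignant. The 0.5 default is an assumption; both classes must be present."""
    tp = fn = tn = fp = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]

toy_auc = auc(y_true, y_prob)
toy_brier = brier_score(y_true, y_prob)
toy_sens, toy_spec = sensitivity_specificity(y_true, y_prob)
```

A higher AUC indicates better discrimination, while a lower Brier score indicates better overall accuracy of the predicted probabilities, which is why the two are reported together.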
Developed using machine learning algorithms, validated in transnational multicentric cohorts, and compared prospectively with clinicians and the CADx system, this novel prediction model for MPNs showed solid performance and may serve as a convenient reference to aid decision-making.