American Association for Cancer Research
FIGURE 1 from Integrative Prognostic Machine Learning Models in Mantle Cell Lymphoma

posted on 2023-08-02, 14:20 authored by Holly A. Hill, Preetesh Jain, Chi Young Ok, Koji Sasaki, Han Chen, Michael L. Wang, Ken Chen
<p>Selection process and workflow of the prognostic models. <b>A,</b> Flowchart of the MCL patient selection process for inclusion in the models. <b>B,</b> Flowchart showing data availability for the patient cohort (<i>n</i> = 794). All patients had clinicopathologic data, and most (<i>n</i> = 642) also had cytogenetic and/or genomic data. <b>C,</b> Workflow of the ML (XGBoost) modeling. The dataset containing all 794 patients was split into a training/validation set and a test set. The test set was held out from all initial preprocessing and hyperparameter tuning to avoid data leakage. Data preprocessing included removing zero-variance and near-zero-variance (NZV) features, dummy encoding categorical features, and collapsing low-frequency categorical levels into an “other” category. The training set was then split into 10 cross-validation folds, on which the hyperparameters of the XGBoost model were tuned. The hyperparameter set with the highest mean ROC AUC was chosen for the final fit, which was evaluated on the test set to check the model's fit. Variable importance was visualized with a variable importance plot (VIP) and SHAP (SHapley Additive exPlanations) values. The fitted XGBoost model was served through a REST API, demonstrating clinical utility.</p>
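The preprocessing and fold-splitting steps named in panel C can be illustrated with a minimal, standard-library Python sketch. It is not the authors' code: the caption does not give thresholds or tooling, so the function names, the 5% rarity cutoff, and the caret-style near-zero-variance rule (frequency ratio above 95/5 and fewer than 10% unique values) are all assumptions chosen for illustration.

```python
from collections import Counter


def collapse_rare(values, min_frac=0.05, other="other"):
    """Replace categories rarer than min_frac (assumed cutoff) with one 'other' label."""
    counts = Counter(values)
    n = len(values)
    return [v if counts[v] / n >= min_frac else other for v in values]


def is_near_zero_variance(values, freq_ratio_cut=19.0, unique_pct_cut=10.0):
    """Flag zero-variance and near-zero-variance columns.

    Assumed heuristic: the most common value is more than freq_ratio_cut times
    as frequent as the second most common, and distinct values make up less
    than unique_pct_cut percent of the rows.
    """
    counts = Counter(values).most_common()
    if len(counts) == 1:
        return True  # zero variance: a single distinct value
    freq_ratio = counts[0][1] / counts[1][1]
    unique_pct = 100.0 * len(counts) / len(values)
    return freq_ratio > freq_ratio_cut and unique_pct < unique_pct_cut


def dummy_encode(values, prefix="is_"):
    """Create one 0/1 indicator column per level, dropping the first sorted level."""
    levels = sorted(set(values))
    return {f"{prefix}{lvl}": [int(v == lvl) for v in values] for lvl in levels[1:]}


def kfold_indices(n, k=10):
    """Yield (train_idx, valid_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        valid = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, valid
        start += size
```

To respect the leakage rule in the caption, these helpers would be applied to the training/validation rows only, with the kept columns and category levels then reused unchanged on the held-out test set. In practice the rows would be shuffled (and likely stratified by outcome) before calling `kfold_indices`.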

Funding

HHS | NIH | National Cancer Institute (NCI)

ARTICLE ABSTRACT

Our model is the first to integrate a dynamic algorithm with multiple clinical and molecular features, allowing for accurate predictions of MCL disease outcomes in a large patient cohort.