Figure S3: Confusion Matrix Across All Confidence Predictions from Deep-Learning Model for Tumor-Type Prediction Using Targeted Clinical Genomic Sequencing Data
posted on 2024-06-12, 16:20 authored by Madison Darmofal, Shalabh Suman, Gurnit Atwal, Michael Toomey, Jie-Fu Chen, Jason C. Chang, Efsevia Vakiani, Anna M. Varghese, Anoop Balakrishnan Rema, Aijazuddin Syed, Nikolaus Schultz, Michael F. Berger, Quaid Morris
Overall heatmap of the top prediction across all confidence values. The heatmap is row-normalized and sorted by overall precision. Off-target values represent the proportion of samples predicted as the row type whose true type is given by the column. NSCLC, Non-Small Cell Lung Cancer; GIST, Gastrointestinal Stromal Tumor; SQC, Squamous Cell Carcinoma; SCLC, Small Cell Lung Cancer; PNET, Pancreatic Neuroendocrine Tumor; Lu-NET, Lung Neuroendocrine Tumor; GI-NET, Gastro-intestinal Neuroendocrine Tumor; Carc., Carcinoma; MPNST, Malignant Peripheral Nerve Sheath Tumor.
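The normalization described in the caption can be sketched as follows. This is a minimal illustration on toy data, not the authors' code: the function name `row_normalized_confusion`, the integer label encoding, and the descending sort order are all assumptions. Rows index the predicted type and columns the true type, so after dividing each row by its total the diagonal entry is the precision for that type.

```python
import numpy as np

def row_normalized_confusion(pred, true, n_types):
    """Hypothetical helper: rows are predicted types, columns are true types."""
    counts = np.zeros((n_types, n_types), dtype=float)
    # Count each (predicted, true) pair, including repeated pairs.
    np.add.at(counts, (pred, true), 1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Divide each row by its total; rows for never-predicted types stay zero.
    norm = np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)
    # With rows as predictions, the diagonal is per-type precision.
    precision = np.diag(norm)
    # Assumed ordering: highest precision first, applied to rows and columns.
    order = np.argsort(-precision, kind="stable")
    return norm[np.ix_(order, order)], order

# Toy labels for three tumor types encoded as 0, 1, 2 (illustrative only).
pred = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2])
true = np.array([0, 0, 1, 1, 1, 2, 2, 2, 0])
matrix, order = row_normalized_confusion(pred, true, n_types=3)
print(order)
print(matrix.round(2))
```

Each row of `matrix` sums to 1 for any type that was predicted at least once, so off-diagonal cells read directly as the share of that predicted type belonging to each other true type.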
Funding
National Cancer Institute (NCI)
United States Department of Health and Human Services
Tumor type guides clinical treatment decisions in cancer, but histology-based diagnosis remains challenging. Genomic alterations are highly diagnostic of tumor type, and tumor-type classifiers trained on genomic features have been explored, but the most accurate methods are not clinically feasible, relying on features derived from whole-genome sequencing (WGS), or predicting across limited cancer types. We use genomic features from a data set of 39,787 solid tumors sequenced using a clinically targeted cancer gene panel to develop Genome-Derived-Diagnosis Ensemble (GDD-ENS): a hyperparameter ensemble for classifying tumor type using deep neural networks. GDD-ENS achieves 93% accuracy for high-confidence predictions across 38 cancer types, rivaling the performance of WGS-based methods. GDD-ENS can also guide diagnoses of rare types and cancers of unknown primary and incorporate patient-specific clinical information for improved predictions. Overall, integrating GDD-ENS into prospective clinical sequencing workflows could provide clinically relevant tumor-type predictions to guide treatment decisions in real time.
We describe a highly accurate tumor-type prediction model, designed specifically for clinical implementation. Our model relies only on widely used cancer gene panel sequencing data, predicts across 38 distinct cancer types, and supports integration of patient-specific nongenomic information for enhanced decision support in challenging diagnostic situations. See related commentary by Garg, p. 906. This article is featured in Selected Articles from This Issue, p. 897.