Posted on 2024-01-11, 14:20. Authored by Marta Ligero, Garazi Serna, Omar S.M. El Nahhas, Irene Sansano, Siarhei Mauchanski, Cristina Viaplana, Julien Calderaro, Rodrigo A. Toledo, Rodrigo Dienstmann, Rami S. Vanguri, Jennifer L. Sauter, Francisco Sanchez-Vega, Sohrab P. Shah, Santiago Ramón y Cajal, Elena Garralda, Paolo Nuciforo, Raquel Perez-Lopez, Jakob Nikolas Kather
Visualization of high and low PD-L1 score patients from the NSCLC-MSK and pan-cancer-VHIO cohorts. Magnification of tumor areas with the highest attention scores for both high PD-L1 (TPS ≥ 1%) and low PD-L1 (TPS < 1%) samples from the training and validation cohorts. Magnification shows that, for both high and low scores, the model gives more attention to tumor cells, ignoring areas of high lymphocyte density [A: tumor cells (high attention, left), lymphocytes (low attention, right); B and D: tumor cells (high attention, right), lymphocytes (low attention, left); C: tumor cells (high attention), stroma (low attention)] (Image magnification, A: 1.25×–6.12×, B:, C: 1.25×–2.5×, D: 2.5×–5.0×).
Funding
Bundesministerium für Gesundheit (BMG)
Deutsche Krebshilfe (German Cancer Aid)
Bundesministerium für Bildung und Forschung (BMBF)
'la Caixa' Foundation ('la Caixa')
CRIS Cancer Foundation (CRIS Foundation)
MEC | Instituto de Salud Carlos III (ISCIII)
NIHR | National Institute for Health and Care Research Applied Research Collaboration Oxford and Thames Valley (ARC OTV)
Fundación Fero (Fundació Fero)
Prostate Cancer Foundation (PCF)
PERIS
ARTICLE ABSTRACT
Programmed death-ligand 1 (PD-L1) IHC is the most commonly used biomarker for immunotherapy response. However, quantification of PD-L1 status in pathology slides is challenging. Neither manual quantification nor a computer-based mimicking of manual readouts is perfectly reproducible, and the predictive performance of both approaches regarding immunotherapy response is limited. In this study, we developed a deep learning (DL) method to predict PD-L1 status directly from raw IHC image data, without explicit intermediary steps such as cell detection or pigment quantification. We trained the weakly supervised model on PD-L1–stained slides from the non–small cell lung cancer (NSCLC)-Memorial Sloan Kettering (MSK) cohort (N = 233) and validated it on the pan-cancer-Vall d'Hebron Institute of Oncology (VHIO) cohort (N = 108). We also investigated the performance of the model to predict response to immune checkpoint inhibitors (ICI) in terms of progression-free survival. In the pan-cancer-VHIO cohort, the performance was compared with tumor proportion score (TPS) and combined positive score (CPS). The DL model showed good performance in predicting PD-L1 expression (TPS ≥ 1%) in both the NSCLC-MSK and pan-cancer-VHIO cohorts (AUC 0.88 ± 0.06 and 0.80 ± 0.03, respectively). The predicted PD-L1 status showed an improved association with response to ICIs [HR: 1.5 (95% confidence interval: 1–2.3), P = 0.049] compared with TPS [HR: 1.4 (0.96–2.2), P = 0.082] and CPS [HR: 1.2 (0.79–1.9), P = 0.386]. Notably, our explainability analysis showed that the model does not just look at the amount of brown pigment in the IHC slides, but also considers morphologic factors such as lymphocyte conglomerates. Overall, end-to-end weakly supervised DL shows potential for improving patient stratification for cancer immunotherapy by analyzing PD-L1 IHC, holistically integrating morphology and PD-L1 staining intensity.
The weakly supervised DL model to predict PD-L1 status from raw IHC data, integrating tumor staining intensity and morphology, enables enhanced patient stratification in cancer immunotherapy compared with traditional pathologist assessment.
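As a rough illustration of how a weakly supervised model can turn many image tiles into one slide-level PD-L1 prediction while also yielding per-tile attention scores, the following is a minimal sketch of attention-based pooling. The abstract and caption do not specify the architecture, so the attention-pooling design, the function names (`attention_pool`, `tps_label`), the feature dimensions, and the random weights are all assumptions made for illustration; nothing here is the authors' implementation or a trained model.

```python
import numpy as np


def tps_label(tps_percent, cutoff=1.0):
    """Binarize a tumor proportion score at the TPS >= 1% cutoff used in the abstract."""
    return int(tps_percent >= cutoff)


def attention_pool(tile_features, w_hidden, w_score, w_out):
    """Aggregate tile features into one slide-level probability (hypothetical sketch).

    tile_features: (n_tiles, d) array, one feature vector per image tile
    w_hidden:      (d, h) projection used to score tiles
    w_score:       (h,) vector producing one raw attention score per tile
    w_out:         (d,) linear classifier applied to the pooled slide embedding
    """
    hidden = np.tanh(tile_features @ w_hidden)      # (n_tiles, h)
    scores = hidden @ w_score                       # (n_tiles,)
    scores = scores - scores.max()                  # stabilize the softmax
    attention = np.exp(scores) / np.exp(scores).sum()  # weights sum to 1
    slide_embedding = attention @ tile_features     # attention-weighted mean, (d,)
    logit = slide_embedding @ w_out
    probability = 1.0 / (1.0 + np.exp(-logit))      # sigmoid
    return probability, attention


# Toy example with random, untrained weights (assumed sizes: 50 tiles, 16 features).
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 16))
prob, attn = attention_pool(
    features,
    rng.normal(size=(16, 8)),
    rng.normal(size=8),
    rng.normal(size=16),
)
top_tiles = np.argsort(attn)[::-1][:5]  # tiles one would magnify in a heatmap
```

Only the slide-level label (for example `tps_label(5.0)`) is needed for training in such a setup, which is what makes it weakly supervised; the per-tile `attn` values are the kind of quantity an attention heatmap would display.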