posted on 2023-03-31, 19:32 authored by Wendy A. Cooper, Prudence A. Russell, Maya Cherian, Edwina E. Duhig, David Godbolt, Peter J. Jessup, Christine Khoo, Connull Leslie, Annabelle Mahar, David F. Moffat, Vanathi Sivasubramaniam, Celine Faure, Alena Reznichenko, Amanda Grattan, Stephen B. Fox

Table S2 shows agreement with the gold standard, stratified by the gold-standard PD-L1 tumor proportion score. Agreement was compared between trained and untrained participating pathologists.



Purpose: Reliable and reproducible methods for identifying PD-L1 expression on tumor cells are necessary to identify responders to anti–PD-1 therapy. We tested the reproducibility of the assessment of PD-L1 expression in non–small cell lung cancer (NSCLC) tissue samples by pathologists.

Experimental Design: NSCLC samples were stained with the PD-L1 22C3 pharmDx kit using the Dako Autostainer Link 48 platform. Two sample sets of 60 samples each were designed to assess inter- and intraobserver reproducibility considering two cut points for positivity: 1% or 50% of PD-L1 stained tumor cells. A randomization process was used to obtain equal distribution of PD-L1 positive and negative samples within each sample set. Ten pathologists were randomly assigned to two subgroups. Subgroup 1 analyzed all samples on two consecutive days. Subgroup 2 performed the same assessments, except they received a 1-hour training session prior to the second assessment.

Results: For intraobserver reproducibility, the overall percent agreement (OPA) was 89.7% [95% confidence interval (CI), 85.7–92.6] for the 1% cut point and 91.3% (95% CI, 87.6–94.0) for the 50% cut point. For interobserver reproducibility, OPA was 84.2% (95% CI, 82.8–85.5) for the 1% cut point and 81.9% (95% CI, 80.4–83.3) for the 50% cut point, and Cohen's κ coefficients were 0.68 (95% CI, 0.65–0.71) and 0.58 (95% CI, 0.55–0.62), respectively. The training was found to have no or very little impact on intra- or interobserver reproducibility.

Conclusions: Pathologists reported good reproducibility at both 1% and 50% cut points. More adapted training could potentially increase reliability, in particular for samples with PD-L1 proportion scores around 50%. Clin Cancer Res; 23(16); 4569–77. ©2017 AACR.
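
For readers unfamiliar with the two agreement statistics reported in the Results, the following is a minimal sketch of how overall percent agreement (OPA) and Cohen's κ could be computed for a single pair of raters at a given cut point. It is not the study's analysis code: the function names, the example scores, and the convention that a sample counts as positive when its score is at or above the cut point are all assumptions made for illustration. The abstract does not describe how agreement was pooled across the ten pathologists or how the confidence intervals were obtained, so neither is attempted here.

```python
from typing import List, Sequence


def dichotomize(tps_scores: Sequence[float], cut_point: float) -> List[bool]:
    """Convert tumor proportion scores (in %) to positive/negative calls.

    Assumption: a sample is called positive when score >= cut_point.
    """
    return [score >= cut_point for score in tps_scores]


def overall_percent_agreement(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Fraction of samples on which two sets of calls agree."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("call lists must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def cohens_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters making binary calls.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each rater's
    marginal positive rate.
    """
    n = len(a)
    p_o = overall_percent_agreement(a, b)
    p_a = sum(a) / n  # rater A's positive rate
    p_b = sum(b) / n  # rater B's positive rate
    p_e = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical scores (in %) for ten samples; not data from the study.
rater_a = [0, 0, 2, 5, 10, 40, 55, 60, 80, 90]
rater_b = [0, 1, 0, 5, 20, 55, 45, 70, 80, 95]

for cut in (1, 50):
    calls_a = dichotomize(rater_a, cut)
    calls_b = dichotomize(rater_b, cut)
    opa = overall_percent_agreement(calls_a, calls_b)
    kappa = cohens_kappa(calls_a, calls_b)
    print(f"cut point {cut}%: OPA = {opa:.3f}, kappa = {kappa:.3f}")
```

In this made-up example the two raters agree on 8 of 10 samples at both cut points, yet κ differs between them because the chance-expected agreement depends on how evenly the positive and negative calls are split. The same effect may explain how similar OPA values can go with noticeably different κ values.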

