American Association for Cancer Research
10780432ccr121915-sup-fig1-11.pdf (1.75 MB)

Supplementary Figures 1 - 11 from Expression Profiling of Archival Tumors for Long-term Health Studies

journal contribution
posted on 2023-03-31, 17:15 authored by Levi Waldron, Shuji Ogino, Yujin Hoshida, Kaori Shima, Amy E. McCart Reed, Peter T. Simpson, Yoshifumi Baba, Katsuhiko Nosho, Nicola Segata, Ana Cristina Vargas, Margaret C. Cummings, Sunil R. Lakhani, Gregory J. Kirkner, Edward Giovannucci, John Quackenbush, Todd R. Golub, Charles S. Fuchs, Giovanni Parmigiani, Curtis Huttenhower

PDF file - 1787K.
Supplemental Figure S1: Immunohistochemistry for selected proteins.
Supplemental Figure S2: Pairwise scatterplots of technical variables for 1,003 DASL arrays from the NHS/HPFS study, including median value of Illumina control probes, sample age, RNA concentration, fraction of probes called present (p<0.05), and interquartile range (IQR).
Supplemental Figure S3: Spatial distribution of sample-independent positive control probes (CY3_HYB) across the twelve 96-well plates of the NHS/HPFS experiment.
Supplemental Figure S4: Spearman correlation of expression profiles of individual technical replicates to the median pseudochip decreases with interquartile range (IQR), particularly for samples with IQR less than 1.
Supplemental Figure S5: CAT-boxplots for six covariates available in the NHS/HPFS study, showing reproducibility of ranked lists of differentially expressed genes between two independent samples by plotting fractional concordance of the top n genes in each list on the y-axis against n on the x-axis.
Supplemental Figure S6: Examples of two of the "best" (0.1% and 0.2% quantile) and "worst" (99.8% and 99.9% quantile) probes for reproducibility as assessed by Spearman correlation, Euclidean distance, and Manhattan distance.
Supplemental Figure S7: Reproducibility of individual probe intensity measurements. For samples passing quality control in the BC/A experiment, Spearman correlation between technical replicates was calculated for each probe, as a function of four different measures of probe activity.
Supplemental Figure S8: Analysis of probe reproducibility measured by Euclidean distance, and relationships between different probe filtering criteria.
Supplemental Figure S9: t-statistics using samples passing strict QC and permissive QC for the molecular markers considered in Supplemental Table 1.
Supplemental Figure S10: Validation of MSI-associated mRNA transcripts previously reported from fresh-frozen tissues (24).
Supplemental Figure S11: Distribution of presence calls (fraction of samples in which a probe is called present, p<0.01) for all 24,526 probes on the DASL assay, and for probes involved in validation of previously identified MSI-associated gene transcripts (24).
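
The concordance measure described for Supplemental Figure S5 (fraction of the top n genes shared by two ranked lists, plotted against n) can be illustrated with a minimal Python sketch. This is not the authors' code: the function name `cat_curve`, the choice to rank genes by absolute test statistic, and the toy gene names are assumptions made for illustration, and the sketch computes a single concordance curve rather than the boxplots shown in the figure.

```python
from typing import Dict, List


def cat_curve(stats_a: Dict[str, float],
              stats_b: Dict[str, float],
              max_n: int) -> List[float]:
    """Fractional concordance of the top-n genes of two lists, for n = 1..max_n.

    stats_a and stats_b map gene name -> test statistic from two independent
    sample sets. Ranking by absolute statistic is an assumption of this sketch.
    """
    shared = set(stats_a) & set(stats_b)
    # Sort by decreasing |statistic|; break ties by gene name for determinism.
    rank_a = sorted(shared, key=lambda g: (-abs(stats_a[g]), g))
    rank_b = sorted(shared, key=lambda g: (-abs(stats_b[g]), g))
    max_n = min(max_n, len(shared))
    return [len(set(rank_a[:n]) & set(rank_b[:n])) / n
            for n in range(1, max_n + 1)]


# Toy example with hypothetical genes and statistics.
stats_a = {"g1": 5.0, "g2": -4.0, "g3": 1.0, "g4": 0.5}
stats_b = {"g1": 4.5, "g2": 0.2, "g3": -3.0, "g4": 0.1}
print(cat_curve(stats_a, stats_b, max_n=4))  # [1.0, 0.5, 1.0, 1.0]
```

A curve that stays near 1 for small n indicates that the most strongly associated genes are reproduced between the two sample sets; the curve always reaches 1 when n equals the number of shared genes.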



Purpose: More than 20 million archival tissue samples are stored annually in the United States as formalin-fixed, paraffin-embedded (FFPE) blocks, but RNA degradation during fixation and storage has prevented their use for transcriptional profiling. New and highly sensitive assays for whole-transcriptome microarray analysis of FFPE tissues are now available, but resulting data include noise and variability for which previous expression array methods are inadequate.

Experimental Design: We present the two largest whole-genome expression studies from FFPE tissues to date, comprising 1,003 colorectal cancer (CRC) and 168 breast cancer samples, combined with a meta-analysis of 14 new and published FFPE microarray datasets. We develop and validate quality control (QC) methods through technical replication, independent samples, comparison to results from fresh-frozen tissue, and recovery of expected associations between gene expression and protein abundance.

Results: Archival tissues from large, multicenter studies showed a much wider range of transcriptional data quality relative to smaller or frozen tissue studies and required stringent QC for subsequent analysis. We developed novel methods for such QC of archival tissue expression profiles based on sample dynamic range and per-study median profile. This enabled validated identification of gene signatures of microsatellite instability and additional features of CRC, and improved recovery of associations between gene expression and protein abundance of MLH1, FASN, CDX2, MGMT, and SIRT1 in CRC tumors.

Conclusions: These methods for large-scale QC of FFPE expression profiles enable study of the cancer transcriptome in relation to extensive clinicopathological information, tumor molecular biomarkers, and long-term lifestyle and outcome data.

Clin Cancer Res; 18(22); 6136–46. ©2012 AACR.
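
The two QC quantities named in the abstract, sample dynamic range and similarity to a per-study median profile, can be sketched as follows. This is a hedged illustration, not the published method: it assumes dynamic range is summarized by the interquartile range (IQR) of a sample's log-scale intensities and that similarity is the Spearman correlation to the probe-wise median across samples (the "median pseudochip" of Supplemental Figure S4). The function names and the cutoffs `min_iqr` and `min_rho` are hypothetical.

```python
from statistics import median, quantiles
from typing import Dict, List


def iqr(values: List[float]) -> float:
    """Interquartile range, used here as a proxy for sample dynamic range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def average_ranks(values: List[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # constant profile: correlation undefined, treat as 0
    return cov / (vx * vy) ** 0.5


def qc_metrics(profiles: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Per-sample IQR and Spearman correlation to the median pseudochip.

    profiles maps sample ID -> log-scale intensities in a common probe order.
    """
    names = list(profiles)
    n_probes = len(profiles[names[0]])
    pseudochip = [median(profiles[s][p] for s in names) for p in range(n_probes)]
    return {s: {"iqr": iqr(profiles[s]),
                "rho": spearman(profiles[s], pseudochip)} for s in names}


def passing(metrics: Dict[str, Dict[str, float]],
            min_iqr: float = 1.0, min_rho: float = 0.5) -> List[str]:
    """Samples meeting both cutoffs (hypothetical thresholds, not the paper's)."""
    return [s for s, m in metrics.items()
            if m["iqr"] >= min_iqr and m["rho"] >= min_rho]


# Toy data: three well-behaved samples and one with compressed dynamic range.
profiles = {
    "s1": [1.0, 3.0, 5.0, 7.0, 9.0],
    "s2": [1.5, 3.5, 5.5, 7.5, 9.5],
    "s3": [2.0, 4.0, 6.0, 8.0, 10.0],
    "flat": [5.2, 5.1, 5.0, 4.9, 4.8],
}
print(passing(qc_metrics(profiles)))  # ['s1', 's2', 's3']
```

In practice the cutoffs would be chosen per study, for example by inspecting how replicate agreement changes with IQR, rather than fixed in advance.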