Supplementary Table S3 from A Functional Survey of the Regulatory Landscape of Estrogen Receptor–Positive Breast Cancer Evolution
Supplementary Table S3: SIDP results in MCF7 grown in full (red; +E2) media.
S3.1: results of the differential abundance analysis for the positive controls and the non-targeting sgRNAs (as indicated in the genome_partition field). For each sgRNA, an identifier, the pool, and the results from the edgeR analysis are shown. Average abundance of the sgRNA at days 7 and 21 post-infection is indicated as logCPM (log counts per million). The log2-fold changes (log2FC) between day 21 and day 7, and between day 21 and the initial library, are indicated, along with the FDR (Benjamini-corrected p-value). Two further fields indicate whether the sgRNA was identified as showing a significant increase in frequency (IF; FDR <= 0.05 and linear fold-change >= 1.5) or a significant decrease in frequency (DF; FDR <= 0.05 and linear fold-change <= -1.5).
S3.2: like S3.1, but listing the results for the sgRNAs targeting the genomic regions of interest. Hg38 coordinates are also included in this case.
S3.3: summary of the results at the level of each SID region. For each region, hg38 coordinates are listed, along with the symbol of the nearest gene and the distance to its TSS in bp (positive or negative values indicate that the region is downstream or upstream of the TSS, respectively). The table then indicates whether the region was selected as a gene promoter, putative enhancer, or putative insulator. The number of sgRNAs targeting the enlarged region (indicated coordinates ± 1 kbp) is followed by information on the overlapping sgRNAs that scored significantly, separately for DF and IF. In both cases, the total number of significant guides, the corresponding fraction, and the FDR and log2FC of the highest-scoring sgRNA are reported. A column indicating whether one or more sgRNAs were significant is also provided.
S3.4: enriched terms in the set of genes close to the regions showing scoring sgRNAs, separately for the DF and the IF sets. For each group, hallmark sets showing a p-value <= 0.05 are included in the table. Statistics of the hypergeometric test are shown, along with the total number and identity of the overlapping genes.
S3.5: overlap between the regions identified in our +E2 MCF7 SIDP assay and previously published screens in breast cancer cell lines (marcotte: Marcotte et al. 2012; fei: Fei et al. 2019; Korkmaz: Korkmaz et al. 2019; ggg: Rui Lopes et al. 2020).
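
For readers who want to re-derive the IF/DF flags or the enrichment p-values from the tabulated values, the following is a minimal sketch and not the authors' code. It assumes that the "linear fold-change" is the signed value obtained from log2FC (2^log2FC for increases, -(2^-log2FC) for decreases), and that the hypergeometric test is a one-sided upper-tail test; neither convention is stated explicitly in the caption. All function and parameter names are hypothetical.

```python
import math

FDR_CUTOFF = 0.05  # threshold stated in the caption
FC_CUTOFF = 1.5    # linear fold-change threshold stated in the caption


def signed_linear_fc(log2fc):
    """Convert log2FC to a signed linear fold change (assumed convention)."""
    if log2fc >= 0:
        return 2.0 ** log2fc
    return -(2.0 ** -log2fc)


def classify_sgrna(log2fc, fdr):
    """Return 'IF', 'DF', or 'NS' (not significant) for one sgRNA."""
    if fdr > FDR_CUTOFF:
        return "NS"
    fc = signed_linear_fc(log2fc)
    if fc >= FC_CUTOFF:
        return "IF"
    if fc <= -FC_CUTOFF:
        return "DF"
    return "NS"


def hypergeom_pvalue(overlap, set_size, query_size, universe_size):
    """Upper-tail hypergeometric p-value, P(X >= overlap).

    overlap:       genes shared by the query list and the gene set
    set_size:      genes in the gene set (e.g. one hallmark set)
    query_size:    genes in the query list (e.g. genes near DF regions)
    universe_size: genes in the background (an assumption; the caption
                   does not say which background was used)
    """
    upper = min(set_size, query_size)
    total = math.comb(universe_size, query_size)
    tail = sum(
        math.comb(set_size, i)
        * math.comb(universe_size - set_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total
```

For example, under these assumptions an sgRNA with log2FC = -1.0 and FDR = 0.01 has a signed linear fold change of -2 and would be flagged as DF.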