microRNA-associated progression pathways and potential therapeutic targets identified by integrated mRNA and microRNA expression profiling in breast cancer

Running title: Integrated mRNA and microRNA analysis in breast cancer


Introduction
Breast cancer is a heterogeneous disease, with treatment resistance a consequence of pathologic and genomic characteristics.mRNA expression profiling in clinical cohorts has led to identification of functional pathways with roles in tumor progression and collections of genes (gene signatures) associated with disease outcome (1), some of which are now approved by Food and Drug Administration for clinical use, such as MammaPrint and OncotypeDX.microRNAs are small noncoding RNA molecules regulating cell function both at transcriptional and posttranscriptional levels which open up a new area of prognostic marker research complementary to established transcriptional gene signature or traditional marker studies (2).A study by Blenkiron and colleagues of 93 breast cancers identified several human microRNAs associated with intrinsic breast cancer subtype (3).Another recent study of 38 breast cancers (4) selected 12 microRNAs associated with clinicopathologic variables for analysis in a cohort of 261 breast cancers.Among these, 4 (miR-7, -128a, -210, and -516-3p) were prognostic, miR-210 having been previously identified (5).
However, integrated analysis of microRNA and mRNA global expression profiles has yet to be explored in prognostic studies.Such analyses have the potential not only to identify microRNAs that are independent prognostic factors, but also to elucidate microRNA function in vivo, and identify interactions between microRNAs and targeted mRNAs for enhanced marker and therapeutic target discovery.Thus, we conducted comprehensive microRNA and mRNA expression profiling in a large, well-annotated cohort of 207 early-invasive breast cancers.
To identify microRNAs that provide independent information, and are not simply surrogates of previously identified prognostic covariates, a Cox regression for distant relapse-free survival (DRFS) was carried out, including all microRNAs, clinical covariates and gene signatures.Penalized least-square minimization with variable selection and regularization was used to prevent overfitting.
To investigate the functional role of prognostic microRNAs, relationships between host genes (pri-microRNAs), mature microRNAs, and cognate target mRNAs were examined by global expression analysis.Finally, findings were confirmed by 2 further independent analyses.First, pri-microRNA (upstream of microRNA expression) and genes in the microRNA processing machinery were analyzed.In addition, expression of target genes, reflecting the functional effect downstream of microRNA expression, was considered.Prognostic significance of both analyses was assessed in published cohorts (n > 1,000 cases).

Patient characteristics
A retrospective series of 219 patients with early primary breast cancer was considered (5); extended demographics are provided in Supplementary Table S1 and Supplementary Information.Informed consent was obtained for each subject.Clinical investigations were conducted after approval by the local research ethics committee and in accordance with the ethical principles expressed in the Declaration of Helsinki.Main endpoint was DRFS.

mRNA and microRNA profiling
Matched microRNA and mRNA profiling was successfully obtained from 207 of 219 samples by using Illumina Human RefSeq-8 and miRNAv1 arrays (see Supplementary Information).Data have been submitted to GEO (6), superSeries GSE22220.In addition, Affymetrix U133A-B/plus2 array data from previously published breast cancer cohorts were analyzed [Supplementary Table S2, n ¼ 1,050 with recurrence-free survival (RFS), n ¼ 592 with RFS outcome].

Statistical methods
Workflow with study design is provided in Supplementary Figure S1.Analyses were implemented and carried out in R v2.9.0, and Sweave (7) was used for Automatic Generation of Reports.
Penalized regression allows identification of microRNAs whose expression is prognostic, or associated with a clinicopathologic factor, independently from other covariates.Genomic datasets are characterized by a greater number of variables than samples and high structure.Therefore, we used L1-and L2penalized regression to enable efficient variable selection and encourage a grouping effect (8).Penalization parameters were optimized by cross-validation, and leave-one-out was used to test the models.Penalized Cox regression was carried out in 2 steps: Only microRNAs selected by both steps were considered independently prognostic of DRFS.
Penalized linear regression analysis was used to study association of microRNA expression with clinicopathologic factors [Supplementary Information, Equation (SM3)].
Meta-analysis of published cohorts was carried out as previously described (9); datasets are summarized in Supplementary Table S2 and selection criteria are provided in Supplementary Information.

Global expression analysis of microRNA-target relationship
An extremely small number of microRNA targets have been validated previously, thus analyses of microRNA-target relationship relies on in silico predictions.Simultaneous use of multiple prediction algorithms has been suggested (10), and to avoid bias resulting from underlying correlations we considered the union of all predicted targets from 6 major algorithms (further details in Supplementary Information).We expected that if a microRNA were functional, a significant proportion of its predicted targets would be downregulated.Because of the challenges involved, the following 3 methods were compared: 1. Cumulative relative risk (RR) plots.All transcripts present on the array were ranked on the basis of the correlation of their expression with that of the microRNA under study.
The RR was defined as the probability of observing, at each correlation level, a given number of predicted targets among genes whose expression is inversely correlated with that of the microRNA, with respect to genes with positive correlation [Supplementary  (12,13), hypoxia (9), stem cell (14), invasion, immune response, and apoptosis (13).
This method identified 3 novel and 1 known (miR-128a) prognostic microRNA in ER þ breast cancer, and 5 novel and 1 known (miR-210) in ER À breast cancer (Fig. 1).Specifically, high miR-767-3p, -128a, and/or -769-3p expression was associated with poor prognosis, and high miR-135a with good prognosis in ER þ cases; high miR-27b, -144, and/or -210 was associated with poor prognosis, and high miR-342, -150, and/or -30c with good prognosis in ER À cases.Full results are reported in Supplementary Figure S2A-G.Three further microRNAs, miR-29c, -642 (high values, good prognosis), and -548d (high values, poor prognosis) were identified in an analysis including all samples (Supplementary Fig. S2C).An analysis including intrinsic subtype classification (Supplementary Table S1) produced similar results (data not shown); however, because the stability of this classification has been recently questioned (15) we have used the HER2/ ER/proliferation signature classification for the main analysis (Supplementary Fig. S2C-E).
The above analysis confirmed 2 known prognostic micro-RNAs from previous studies (miR-210 and -128a); however, miR-7 and -516-3p, also prognostic in a previous study (4), could not be confirmed.In our study, miR-7 was associated with DRFS in univariate Cox analysis but not in a model including all microRNAs, suggesting that this microRNA is not an independent prognostic factor, whereas miR-516-3p was not significantly associated with DRFS.
Also, real-time PCR was carried out to measure expressions of 4 prognostic microRNAs, miR-210, miR-342, miR-144 and miR-27b, relative to 3 small nucleolar RNAs (snoRNAs) controls (Supplementary Information).Strength of correlation between real-time PCR and arrays results varied but was significant for all 4 microRNAs (P < 0.0001 in all cases); analysis with nonnormalized CT values produced similar results (see Supplementary Information for discussion on PCR normalization).
microRNA signatures divided patients efficiently into good and poor prognosis groups when either a simple median split (Fig. 1E and G) or clustering with Bayesian Information Criterion (Supplementary Fig. S2F) were considered, and importantly they performed well when tested on the leftout cases (Supplementary Fig. S2B and G).Among microRNAs prognostic in ER À samples, 5 (miR-150, -342, good prognosis; and miR-210, -144, -27b, poor prognosis) were also prognostic  S2A).Heat maps display the cross-validated model at each leave-one-out iteration (y-axis) for analyses in all 207 (A), 82 ER À (B), and 90 ER þ Tamoxifen-treated (C) samples.Similar models are clustered (standard correlation).Color reflects the HR (logged and rescaled per column, see bar).microRNAs that were selected in at least 5% of iterations are shown.Consistently selected microRNAs (>95% iterations) were further analyzed by a Cox model including clinicopathologic factors and gene expression signatures (D-G).microRNAs consistently significant in this model were considered prognostic and are indicated with an asterisk (red, not known; black, previously known to be prognostic).Summary results are shown for ER þ Tamoxifen-treated (D) and ER À (F) samples (full results in Supplementary Fig. S2C-E).miR-768-3p is in parenthesis as it was retired from miRBase.Kaplan-Meier plots of prognostic microRNA signature in ER þ (E) and ER À (G) cases are also shown.Expression of each prognostic microRNA (asterisks from D and F, respectively) was ranked.In each case, the mean rank of poor prognosis microRNAs was subtracted from that of good prognosis microRNAs to provide a nonparametric summary score.Cases were categorized as high and low risk by median value of this score.Box plots of the microRNA ranked expression in the 2 groups show median, quartiles, and range.GSE6532 overlaps partially (n ¼ 69) with the present dataset.This was exploited to establish the correlation between microRNA and pri-microRNA expression as measured by Affymetrix arrays; however, the common cases were taken out from GSE6532OXF in all prognostic analysis where this dataset is used (here and also in Fig. 4).d Summary HR and 95% CI for the meta-analysis.Significant results are in bold.

Analysis of microRNA host-gene (pri-microRNA) expression in published cohorts
in univariate analysis of triple negative (TRN) breast cancers (n ¼ 37, see legend of Supplementary Table S1).miR-342, -27b, and -150 were also significant in a Cox analysis of this group including clinicopathologic factors and gene signatures (Supplementary Fig. S2H).

Prognostic microRNA clusters
miR-767-3p (prognostic in ER þ cases), and miR-27b and miR-144 (prognostic in ER À cases) are part of microRNA clusters.Analysis of clustered microRNAs revealed that miR-451, clustered with miR-144, was significantly associated with good prognosis in ER þ cases (Supplementary Table S3), suggesting an independent role for these 2 clustered microRNAs in ER þ and ER À tumors.Conversely, the miR-24/27/23 cluster was consistently associated with poor prognosis, with all but one microRNA of this family (miR-27a/b, -23a/b, -24; but not -189) significantly associated with DRFS in univariate analysis of all cases, and all microRNAs but one (miR-23b) significant in analysis of ER À cases (Supplementary Table S3).However, none of them was associated with prognosis in ER þ cases, providing evidence that this family plays a specific role in ER À breast cancer.

Transcriptional regulation of microRNAs and analysis of pri-microRNA in independent cohorts
We investigated whether prognostic microRNAs and their pri-microRNAs showed coordinated expression, and studied the prognostic potential of pri-microRNAs in independent cohorts (n ¼ 1,050 with RFS, n ¼ 592 with DRFS information; for further details on these cohorts see Supplementary Information and Supplementary Table S2).Initially, transcripts containing pri-microRNA for 6 prognostic microRNAs could be mapped to the arrays and their expression found to be significantly correlated with that of the mature microRNA in our series (Spearman's rank correlation test P < 0.05, microRNAs are listed in Table 1).This suggests transcriptional regulation of these microRNAs in breast cancer, rather than modulation of the mature microRNA due to changes in processing.Among these, pri-miR-128a, 210, -29c, -342, and -548d showed significant association with prognosis in metaanalysis of the independent cohorts and a concordant effect with respect to the microRNA prognostic analysis in Figure 1; specifically, pri-miR-128a, -210, -342, and -548d were significant for DRFS, and -128a, -210, -27b, -342, and -548d were significant for RFS (Table 1).

Association of prognostic microRNAs with processes critical for breast cancer biology
microRNA association with clinicopathologic factors was studied by multiple penalized linear regression (Fig. 2A and B).This confirmed, for example, the previously reported hypoxia regulation of miR-210 (Fig. 2C), but suggested that miR-210 expression is also associated with proliferation, ER positivity, grade, and nodal invasion (Fig. 2B and C).Interestingly, associations with hypoxia and proliferation were significant after accounting for grade and ER status (Fig. 2B and C).This highlights the usefulness of testing associations simultaneously by including clinical and molecular covariates in the same regression model.miR-128a was found to be associated with ER positivity, both as measured by IHC and ESR1 expression signature (Fig. 2A).Among the newly discovered prognostic microRNAs, a reported role for miR-150 in immunity (16) agrees with an association with immune response signature in our dataset (Fig. 2C).miR-135a was inversely correlated with the proliferation signature, in agreement with reports of negative regulation of cell growth in Hodgkin's lymphoma (17).Finally, miR-27b was associated with the invasion signature (Fig. 2B), in agreement with its reported ability to promote invasion and angiogenesis (18).
A recent study in 20 inflammatory breast cancers (IBC; ref. 19), one of the most aggressive forms of locally advanced breast cancer, identified microRNAs belonging to families miR-29 (-29a), -30 (-30b), and -342 as downregulated in IBC.This agrees with our findings of expression of microRNAs in these families (miR-29c, -30c, -342) being associated with good prognosis.In the same study, miR-548a-5p was found upregulated in IBC, in agreement with high expression of the miR-548 family being associated with poor prognosis (-548d).

Target expression is consistent with regulatory effect of prognostic microRNAs
Downregulation of predicted target transcript was observed by using cumulative RR plots for all prognostic microRNAs, with miR-128a, -144, and -210 providing the most consistent and significant results (Fig. 3A; Supplementary Fig. S3).Furthermore, microRNA association with clinicopathologic factors (Fig. 2) was reflected by coordinated downregulation of cognate target mRNAs (Fig. 3B; Supplementary Fig. S3).The most significant were miR-210 and -144 targets, strongly underexpressed in highly proliferative tumors (Fig. 3).In 3 cases (miR-144, -210, and -769-3p), results could be confirmed consistently by both PTsign-and RE-score methods (Supplementary Table S4); however, PTsign and RE scores were strongly correlated for all microRNAs [P << 0.00001 for all microRNAs, correlation coefficient range: (0.7-0.95)].Contrary to RR plots, these measure inhibition of target expression irrespective of amount of downregulation.

A global analysis of microRNA targets reveals pathways dysregulated in cancer
A pathway analysis revealed that a number of downregulated predicted targets, greater than that expected by chance, were implicated in pathways which have important roles in direct tumor growth and metastasis (Supplementary Fig. S4); and also that microRNA target downregulation would lead to induction of these pathways (Supplementary Fig. S4A).For example, downregulated targets included MAPK8 and MAPK14 ((miR-144 and miR-27b targets, respectively), which can be proapoptotic under stress conditions (20); SPRY2 (miR-128a target) in the FGF receptor signaling pathway, which has a major role in inhibiting tyrosine kinases (21); and tumor suppressor genes PTEN and FOXO1 (miR-144 and -128a targets).miR-27b predicted targets in the metabotropic glutamate receptor pathway, namely GRIN3A, GRM6, and GRIA4, and voltage-dependent L-type calcium channel subunit, were consistently downregulated.Their association with neuronal cell death under hypoxic stress (22) suggests a new potential mechanism by which cancer cells escape death under stress.Recently, mutations in this pathway were reported as the commonest mutations in melanoma (23).Wnt antagonists such as miR-144 predicted target SRFP1, and targets involved in angiogenesis such as Frizzled4 were also among downregulated predicted  S3.A, plots show enrichment in the number of predicted targets among transcripts whose downregulation is associated with microRNA overexpression (Supplementary Fig. S3 for workflow of this analysis).At each correlation level r, the RR of observing a given number of predicted targets whose expression is anticorrelated with microRNA expression, with correlation of Àr or less, is plotted.Number of downregulated targets is shown on the plot.Dotted lines represent the mean RR for randomly sampled (Â1,000) set of transcripts, with 5% and 95% CI.B, RR plots of enrichment in the number of predicted targets among transcripts whose expression is anticorrelated with gene expression signatures of biological processes.Proliferation was considered as biological process with strongest association with microRNA expression (Fig. 2A).RR plot statistics as described in A.
In TRN cancers, several signaling pathways related with miR-150 predicted targets were found upregulated in poor prognosis cases including Akt2, insulin receptor, ErbB3, S6 kinase, MAP kinase pathways, and stress response kinase JNK, as well as downstream protease MMP-13 (Supplementary Fig. S4B).miR-342 predicted target RRM2 was also upregulated in this group (Supplementary Fig. S4B).RRM2 is a target for several inhibitors of proliferation that affect cells in S-phase, and would be compatible with the high proliferation rate of TRN cancer (6).Similar pattern was observed for miR-342 target glucose 1,6-bisphosphate synthase, a critical component of glycolysis (25) to which inhibitors are currently being developed; and TLE-1, a downstream component of notch signaling (26), that is already recognized to be activated in TRN cancers.

Validation of prognostic microRNAs using target expression in the present and independent cohorts
We focused on prognostic microRNAs showing consistent and significant association with downregulation of cognate target transcripts.Among these, miR-210, -128, and -27b predicted targets were prognostic for DRFS and RFS, both when using cumulative RR (Supplementary Fig. S5) and summary scores in meta-analysis of published cohort studies (Fig. 4A-F).Furthermore, tumors with concomitant microRNA overexpression and target underexpression had significantly worse prognosis, whereas the opposite was true for cases with low microRNA levels and high target expression (Fig. 4G-I).
We also conducted a single target analysis for experimentally derived miR-210 targets (Supplementary Fig. S4E).Three of 5 targets showing most significant downregulation, namely ISCU, CBX7, and IGF1R, were significantly associated with  S2).Dots represent HR; dimensions are proportional to dataset size.Gray bars are 95% CI.A-C, DRFS; D-F, RFS.G-I, Kaplan-Meier curves for expression of mature miR-210 (G), -27b (H), and -128a (I), and respective PTSigns, in the Oxford breast cancer dataset (n ¼ 207).microRNA expression levels and PTSign score were split by median value (À, below; þ, above).HR for deviation contrasts in a Cox regression comparing each category with the overall effect are shown for significant comparisons.
DRFS; specifically, low levels were associated with poor prognosis (Supplementary Fig. S4E).This suggests that novel targets can be confirmed with this approach.To further assess this potential we assessed protein expression by using immunohistochemistry (IHC) for a predicted target with suitable commercial antibody available.NFE2L2 (alias NRF2) was chosen as top ranking downregulated predicted target of miR-144, novel prognostic microRNAs showing evidence of coordinated downregulation of target mRNAs (Fig. 1; Supplementary Table S4).IHC was done on 137 cases (Supplementary Information); staining was predominantly cytoplasmic with minimal stromal positivity (Supplementary Fig. S6).Tumors with weak protein and mRNA expression showed significantly higher miR-144 expression than tumors with consistently high NFE2L2 expression (Supplementary Fig. S6).This was true only in ER À cases (Supplementary Fig. S6D and E), miR-144 being prognostic only in ER À tumors (Fig. 1) in agreement.Low NFE2L2 IHC product score (IPS) was associated with worse DRFS (univariate Cox HR ¼ 0.64, P ¼ 0.18; IPS ranked and normalized between 0 and 1); in agreement with NFE2L2 acting as a tumor suppressor (23).Furthermore, tumors with concomitant miR-144 overexpression and NFE2L2 underexpression had significantly worse prognosis, whereas the opposite was true for cases with low miR-144 levels and high NFE2L2 expression (Supplementary Fig. S6F).

microRNAs are prognostic independent of expression of microRNA-processing genes
We found that the expression of some, but not all, pri-microRNAs was correlated with that of the mature micro-RNAs.Thus, we investigated whether microRNA-processing genes might have a role in regulation of mature microRNA levels.We tested the prognostic significance of several processing genes.High expression of transport gene exportin 5 (XPO5) and RISC complex genes EIF2C2 and EIF2C3 were significantly associated with poor prognosis, whereas high expression of DICER genes DICER1 and TARBP1, and RISC complex gene EIF2C4 were associated with good prognosis (Table 2).Of these, expression of XPO5 showed a significant correlation with mature microRNA expression both for the ER À and ER þ microRNA prognostic signatures (Fig. 5A and B), whereas EIF2C2 and EIF2C3 showed correlation only with the ER À microRNA prognostic signature.However, in all cases fold expression changes were very small (Fig. 5A and B).When samples were stratified by low and high risk, based on microRNA signature and expression of processing genes, the effect of the latter was never significant (Fig. 5C-F).A Cox analysis including single prognostic microRNAs and processing genes showed that in all cases, mature microRNA expression was significantly independent of the expression of the processing genes (Supplementary Table S5).Overall, these results suggest that the expression profile and prognostic significance of these microRNAs is due to regulation of their transcription rather than differential processing.For miR-128a, -210, -342, and -548d, this is also confirmed by the prognostic significance of their pri-microRNA expression in independent cohorts (Table 1).
Interestingly, EIF2C2 (AGO2) expression in our samples, although mildly differential, was always very high (Fig. 5), thus allowing mature microRNAs to modulate target gene expression.This is particularly important in view of recent work (11) suggesting that AGO expression is mandatory for an effect of microRNAs on the expression profiles of tumor samples.

Discussion
In this study, microRNAs that are independently prognostic for DRFS were identified and validated by integrated analysis of microRNA and mRNA profiling.Nine microRNAs were found prognostic in ER þ and ER À breast cancer, 3 of which remained significant in the clinically challenging TRN group.Our findings regarding mature miR-210 and miR-128a expression confirm previous findings of a smaller study (4).Multiple penalized linear regression revealed that prognostic micro-RNAs are associated with key biological processes in breast cancer, such as proliferation (miR-135a), hypoxia (miR-210, -342), invasion (miR-27b), and immune response (miR-150).However, these microRNAs are prognostic independent of gene expression signatures of these processes.This highlights the importance of considering both microRNA and transcript expression data in prognostic studies.
Several microRNAs previously linked with breast cancer subtype or progression in experimental models were not identified as independently prognostic.This agrees with previous results from prognostic studies (4, 27); reasons could include discrepancy of assays and/or the importance of microRNA expression in specific cell populations which would not appear in whole tumor sample analyses.However, it also highlights the need to differentiate between biomarkers of tumor presence and prognostic biomarkers, and identify factors carrying information independently rather than surrogates of known prognostic covariates.
To examine whether microRNAs might be prognostic due to regulatory effects on cognate targets, associations between Figure 5. Expression of microRNA-processing genes does not affect microRNA prognostic significance.A and B, box plots of expression of microRNA-processing genes that were prognostic in Cox univariate analysis (Table 2) are shown for the low-risk profile (LRP) and highrisk profile (HRP) in ER þ (A) and ER À (B) samples as defined in Fig. 1E and G legend.Fold changes in mean expression (FC) and nonparametric Mann-Whitney test are shown for significant cases.C-F, Kaplan-Meier plots of DRFS.Samples are stratified into the LRP and HRP, and also by median value of processing genes' expression; À, below; þ, above median.This value is calculated by using all the samples, thus the expression cutpoint is equal in both LRP and HRP group.The HR of LRP and HRP groups was always significant, whereas the HR of [À] and [þ] groups was never significant (Cox model contrasts, threshold P < 0.05).
microRNAs and predicted targets were studied by global expression analysis.This is challenging, as the microRNA-mRNA interaction network is complex and the effect on each individual mRNA is often small.Nevertheless, prognostic miR-128a, -27b, and -210 showed evidence of cognate target downregulation, and expression of their targets was prognostic in meta-analysis of several cohorts (Fig. 4).Furthermore, combined use of microRNA and target expression identified cases with the poorest prognosis, suggesting that a combined score could reflect not only expression but also functionality of the microRNA.
Prognostic microRNAs could be used not only to select patients for specific interventions, but also to define therapeutic approaches.To this end, functional validation of microRNA targets is needed to elucidate the role of micro-RNAs in cancer.In this respect, analysis of microRNA-mRNA relationships in clinical datasets could assist in target prioritization, and cohorts such as the present one will be a useful resource for future validation studies.Prognostic microRNAs were found to regulate a wide range of previously poorly investigated pathways that may be related to cancer progression and potential candidates for therapy, such as RRM2 and TLE-1 in TRN breast cancers.A validated example was miR-210 target ISCU (iron-sulfur cluster scaffold homolog), which was downregulated in samples with high miR-210 levels, and whose underexpression was associated with both hypoxia and poor prognosis.ISCU plays an important role in mitochondrial respiration and DNA repair (28,29).Tumors with high miR-210 might be for example more susceptible to DNA damaging agents combined with DNA repair inhibitors.Importantly, ISCU could not be identified in traditional supervised gene expression prognostic analyses.
An additional benefit of merging microRNA and mRNA data was that coordinate expression of microRNAs and precursorcontaining pri-microRNAs, and expression of microRNA-processing genes could be explored.As this analysis suggested that several prognostic microRNAs are regulated at the tran-scriptional level rather than through changes in RNA processing mechanisms (Fig. 5), pri-microRNA data (Table 1) could be used for the validation of our findings in cohorts for which only mRNA profiling is available.The most striking results were for miR-210 and -128a, where coordinated expression of pri-microRNA (upstream of microRNA expression), mature microRNA and predicted targets was observed.Expression of each one of these components was found to be prognostic.
In conclusion, this is the first large study integrating micro-RNA and mRNA global profiles in human breast cancer.Prognostic microRNAs in ER þ and ER À breast cancer were identified, and regulatory action on target transcripts shown.These results elucidated potential novel therapeutic targets, and could also be exploited to validate findings in independent cohorts.Our approach consistently validated known micro-RNA-target interactions, and may therefore be broadly applicable to other biologically and clinically heterogeneous diseases.

Figure 1 .
Figure1.microRNAs independently prognostic of DRFS in breast cancer (n ¼ 207).Prognostic microRNAs were selected by 2-step penalized Cox regression (see Materials and Methods).A-C, first a model including all microRNAs (x-axis) was considered (full results in Supplementary Fig.S2A).Heat maps display the cross-validated model at each leave-one-out iteration (y-axis) for analyses in all 207 (A), 82 ER À (B), and 90 ER þ Tamoxifen-treated (C) samples.Similar models are clustered (standard correlation).Color reflects the HR (logged and rescaled per column, see bar).microRNAs that were selected in at least 5% of iterations are shown.Consistently selected microRNAs (>95% iterations) were further analyzed by a Cox model including clinicopathologic factors and gene expression signatures (D-G).microRNAs consistently significant in this model were considered prognostic and are indicated with an asterisk (red, not known; black, previously known to be prognostic).Summary results are shown for ER þ Tamoxifen-treated (D) and ER À (F) samples (full results in Supplementary Fig.S2C-E).miR-768-3p is in parenthesis as it was retired from miRBase.Kaplan-Meier plots of prognostic microRNA signature in ER þ (E) and ER À (G) cases are also shown.Expression of each prognostic microRNA (asterisks from D and F, respectively) was ranked.In each case, the mean rank of poor prognosis microRNAs was subtracted from that of good prognosis microRNAs to provide a nonparametric summary score.Cases were categorized as high and low risk by median value of this score.Box plots of the microRNA ranked expression in the 2 groups show median, quartiles, and range.
pri-microRNA HR for DRFS in published cohorts (n = 592) b pri-microRNA HR for RFS in published cohorts (n = 1,050) c

Figure 2 .
Figure2.Biological processes and clinicopathologic factors associated with prognostic microRNAs.Expression signatures derived from large-scale analysis of cancer datasets were used as surrogate markers of biological processes; samples were ranked from lowest to highest as measured by summary scores of these signatures.microRNAs prognostic in ER þ (A) and ER À (B) cases are shown.Association between expression of each microRNA (y-axis) with clinicopathologic variables and gene signatures (x-axis) was obtained by using penalized linear regression.Heat maps illustrate the microRNA association with signature/clinical variable in the cross-validated model (see bar).Red, covariate levels high, microRNA expression high; blue, opposite.C, scatter plots with examples of significant associations.MicroRNA expression (x-axis) and covariate (y-axis) are plotted with a second covariate shown as color stratification.Linear fits and Spearman's rank correlation are shown in the 2 strata.HS up, Hypoxia Signature Score above median; HS down, below median; Nþ, nodal involvement; N0, no involvement.

Figure 3 .
Figure 3. Inhibition of cognate target expression by prognostic microRNAs.Cumulative RR plots are shown for the most significant effects; further results in Supplementary FigureS3.A, plots show enrichment in the number of predicted targets among transcripts whose downregulation is associated with microRNA overexpression (Supplementary Fig.S3for workflow of this analysis).At each correlation level r, the RR of observing a given number of predicted targets whose expression is anticorrelated with microRNA expression, with correlation of Àr or less, is plotted.Number of downregulated targets is shown on the plot.Dotted lines represent the mean RR for randomly sampled (Â1,000) set of transcripts, with 5% and 95% CI.B, RR plots of enrichment in the number of predicted targets among transcripts whose expression is anticorrelated with gene expression signatures of biological processes.Proliferation was considered as biological process with strongest association with microRNA expression (Fig.2A).RR plot statistics as described in A.

Figure 4 .
Figure 4. Validation of prognostic microRNAs in independent breast cancer datasets by analysis of cognate targets.A-F, forest plots are shown for miR-210 (A, D), -27b (B, E), and -128a (C, F) predicted targets (PTSign summary score, see Materials and Methods) in the present (Oxf) and published BC datasets (GEO IDs provided, further details in Supplementary TableS2).Dots represent HR; dimensions are proportional to dataset size.Gray bars are 95% CI.A-C, DRFS; D-F, RFS.G-I, Kaplan-Meier curves for expression of mature miR-210 (G), -27b (H), and -128a (I), and respective PTSigns, in the Oxford breast cancer dataset (n ¼ 207).microRNA expression levels and PTSign score were split by median value (À, below; þ, above).HR for deviation contrasts in a Cox regression comparing each category with the overall effect are shown for significant comparisons.

Table 2 .
Prognostic significance of microRNA processing genes a Cox univariate analysis.Gene expression considered as continuous variable, ranked, and normalized between 0 and 1.When more than one probeset mapped to the same gene, the probeset with the greater effect is shown.Probesets with P < 0.05 are highlighted in bold.