Evidence for the Presence of Disease-perturbed Networks in Prostate Cancer Cells by Genomic and Proteomic Analyses: a Systems Approach to Disease

Prostate cancer is initially responsive to androgen ablation therapy and progresses to androgen-unresponsive states that are refractory to treatment. The mechanism of this transition is unknown. A systems approach to disease begins with the quantitative delineation of the informational elements (mRNAs and proteins) in various disease states. We employed two recently developed high-throughput technologies, massively parallel signature sequencing (MPSS) and isotope-coded affinity tag, to gain a comprehensive picture of the changes in mRNA levels and more restricted analysis of protein levels, respectively, during the transition from androgen-dependent LNCaP (model for early-stage prostate cancer) to androgen-independent CL1 cells (model for late-stage prostate cancer). We sequenced >5 million MPSS signatures, obtained >142,000 tandem mass spectra, and built comprehensive MPSS and proteomic databases. The integrated mRNA and protein expression data revealed underlying functional differences between androgen-dependent and androgen-independent prostate cancer cells. The high sensitivity of MPSS enabled us to identify virtually all of the expressed transcripts and to quantify the changes in gene expression between these two cell states, including functionally important low-abundance mRNAs, such as those encoding transcription factors and signal transduction molecules. These data enable us to map the differences onto extant physiologic networks, creating perturbation networks that reflect prostate cancer progression. We found 37 BioCarta and 14 Kyoto Encyclopedia of Genes and Genomes pathways that are up-regulated and 23 BioCarta and 22 Kyoto Encyclopedia of Genes and Genomes pathways that are down-regulated in LNCaP cells versus CL1 cells. Our efforts represent a significant step toward a systems approach to understanding prostate cancer progression.


Introduction
Prostate cancer is the most common nondermatologic cancer in the United States (1). Initially, its growth is androgen dependent; early-stage therapies, including chemical and surgical castration, kill cancerous cells by androgen deprivation. Although such therapies produce tumor regression, they eventually fail because most prostate carcinomas become androgen independent (2). To improve the efficacy of prostate cancer therapy, it is necessary to understand the molecular mechanisms underlying the transition from androgen dependence to androgen independence.
The transition from androgen-dependent to androgen-independent status likely results from multiple processes, including activation of oncogenes, inactivation of tumor suppressor genes, and changes in key components of signal transduction pathways and gene regulatory networks. Systems approaches to biology and disease are predicated on the identification of the elements of the systems, the delineation of their interactions, and their changes in distinct disease states. Biological information is of two types: the digital information of the genome (e.g., genes and cis-control elements) and environmental cues. Normal protein and gene regulatory networks may be perturbed by disease, through genetic and/or environmental perturbations, and understanding these differences lies at the heart of systems approaches to disease. Disease-perturbed networks initiate altered responses that bring about pathologic phenotypes, such as the invasiveness of cancer cells.
To map network perturbations in cancer initiation and progression, one must measure changes in expression levels of virtually all transcripts. Certain low-abundance transcripts, such as those encoding transcription factors and signal transducers, wield significant regulatory influence even though they may be present in the cell at very low copy numbers. Differential display (3) and cDNA microarrays (4,5) have been used to profile changes in gene expression during the androgen-dependent to androgen-independent transition; however, those technologies can identify only a limited number of the more abundant mRNAs, and they miss many low-abundance mRNAs because of their low detection sensitivities. Massively parallel signature sequencing (MPSS), a recently introduced method, allows 20-nucleotide signature sequences to be determined in parallel for >1,000,000 DNA sequences from an individual cDNA library or cell state (6). The frequency of each MPSS signature is calculated for each sample and expressed in transcripts per million (tpm). MPSS technology allows identification and cataloging of almost all mRNAs, even those with one or a few transcripts per cell. Differentially expressed genes thus identified can be mapped onto cellular networks to provide a systemic understanding of changes in cellular state.
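The tpm normalization mentioned above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name and the toy signature strings are hypothetical, and real MPSS signatures are about 20 bases long.

```python
from collections import Counter


def signatures_to_tpm(signature_reads):
    """Convert a list of sequenced signatures into transcripts per million.

    Hypothetical helper: each element of signature_reads is one sequenced
    signature (a string); identical strings are counted together.
    """
    counts = Counter(signature_reads)
    total = sum(counts.values())
    if total == 0:
        return {}
    # tpm = (reads for this signature / all reads in the library) * 1,000,000
    return {sig: n * 1_000_000 / total for sig, n in counts.items()}


# Toy library of four reads (shortened signatures, for illustration only).
toy_reads = ["GATCAAAA", "GATCAAAA", "GATCAAAA", "GATCCCCC"]
toy_tpm = signatures_to_tpm(toy_reads)
```

Because tpm is a frequency, values from libraries of different depth (here 2.22 and 2.96 million signatures) can be compared directly.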
Although transcriptome (mRNA levels) differences are easier to study than proteome (protein levels) differences, cellular functions are usually performed by proteins. RNA expression profiling studies do not address how the encoded proteins function biologically, and transcript abundance levels do not always correlate with protein abundance levels (7). We therefore complemented our mRNA expression profiling with a more limited protein profiling by using isotope-coded affinity tags (ICAT) coupled with tandem mass spectrometry (MS/MS; ref. 8).
The LNCaP cell line is a widely used androgen-sensitive model for early-stage prostate cancer from which androgen-independent sublines have been generated (4,5,9). The cells of one such variant, CL1, in contrast to their LNCaP progenitors, are highly tumorigenic and exhibit invasive and metastatic characteristics in intact and castrated mice (9,10). Thus, CL1 cells model late-stage prostate cancer. MPSS and ICAT data extracted from these model cell lines can be validated by real-time reverse transcription-PCR (RT-PCR) or Western blot analysis in more relevant biological models (tumor xenografts) and in tumor biopsies.
We conducted an MPSS analysis of ~5 million signatures for the androgen-dependent LNCaP cell line and its androgen-independent derivative CL1. Our database offers the first comprehensive view of the digital transcriptomes of two states of prostate cancer cells and allows us to explore the cellular pathways perturbed during the transition from androgen-dependent to androgen-independent growth. We additionally compared protein expression profiles between LNCaP and CL1 cells using ICAT-MS/MS technology. These are the first steps toward a systems approach to disease through an integrative, systemic understanding of prostate cancer progression at the mRNA, protein, and network levels.

Materials and Methods
Massively parallel signature sequencing analysis. LNCaP and CL1 cells were grown as described by Tso et al. (10). MPSS cDNA libraries were constructed, and individual cDNA sequences were amplified, attached to individual beads, and sequenced as described elsewhere (6). The resulting signatures, generally 20 bases long, were annotated using the then most recently annotated human genome sequence (Human Genome Release hg16, released in November 2003) and the human Unigene (Unigene Build 171, released in July 2004) according to a previously published method (11). We considered only 100% matches between an MPSS signature and a genome signature. We also excluded those signatures that were expressed at <3 tpm in both the LNCaP and CL1 libraries, as they might not be reliably detected (12). Additionally, we classified cDNA signatures by their positions relative to polyadenylation signals and polyadenylic acid [poly(A)] tails and by their orientation relative to the 5′-3′ orientation of the source mRNA. The Z-test (13,14) was used to calculate P values for comparison of gene expression levels between the cell lines.
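The abundance filter and the Z-test comparison can be sketched as below. The exact form of the test is defined in refs. 13 and 14; this sketch assumes a pooled two-proportion Z-test with a two-sided P value, which is one common formulation for comparing tag counts between two libraries. All function names are hypothetical, and the library sizes are the totals reported in the Results.

```python
import math


def passes_abundance_filter(tpm_a, tpm_b, min_tpm=3):
    """Keep a signature unless it is below min_tpm in BOTH libraries."""
    return tpm_a >= min_tpm or tpm_b >= min_tpm


def tpm_to_count(tpm, library_size):
    """Approximate raw signature count from a tpm value."""
    return round(tpm * library_size / 1_000_000)


def two_proportion_z_test(count_a, n_a, count_b, n_b):
    """Pooled two-proportion Z-test (assumed form); returns (z, two-sided P)."""
    pooled = (count_a + count_b) / (n_a + n_b)
    if pooled in (0.0, 1.0):
        return 0.0, 1.0  # no information: both proportions identical
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (count_a / n_a - count_b / n_b) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value


# Library sizes from the Results: 2.22 million (LNCaP), 2.96 million (CL1).
N_LNCAP, N_CL1 = 2_220_000, 2_960_000

# Androgen receptor: 584 tpm in LNCaP versus 0 tpm in CL1.
z_ar, p_ar = two_proportion_z_test(
    tpm_to_count(584, N_LNCAP), N_LNCAP, tpm_to_count(0, N_CL1), N_CL1
)
```

With counts this far apart the P value is far below the 0.001 cutoff; for transcripts near the 3 tpm floor, a much larger fold change is needed to reach the same significance.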
Isotope-coded affinity tag analysis. ICAT reagents were purchased from Applied Biosystems, Inc. (Foster City, CA). Fractionation of cells into cytosolic, microsomal, and nuclear fractions, as well as ICAT labeling, MS/MS, and data analyses, were done as described by Han et al. (15). In addition, probability score analysis (16) and Automated Statistical Analysis on Protein Ratio (17) were used to assess the quality of MS spectra and to calculate protein ratios from multiple peptide ratios. Descriptions of these software tools are available at http://regis.systemsbiology.net/software. To compare protein and mRNA expression levels, the Unigene numbers of the differentially expressed proteins were used to find the corresponding MPSS signatures and their expression levels in tpm. If one Unigene had more than one MPSS signature, likely due to alternative terminations, the average tpm of all signatures was taken.
Real-time reverse transcription-PCR. All primers were designed with the PRIMER3 program (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi) and BLAST searched against the human cDNA and expressed sequence tag (EST) database for uniqueness. Primer sequences and PCR conditions are available on request. Real-time PCR was done on an ABI 7700 machine (Applied Biosystems), and SYBR Green dye (Molecular Probes, Inc., Eugene, OR) was used as a reporter. PCR conditions were designed to give bands of the expected size with minimal primer dimer bands.
Identification of perturbed networks. Genes in the 314 BioCarta and 155 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways or networks (http://cgap.nci.nih.gov/Pathways/) were downloaded and compared with the MPSS data using Unigene IDs as identifiers. If a Unigene ID or an Enzyme Classification number corresponded to multiple signatures, potentially due to multiple alternatively terminated isoforms, the tpm counts of the isoforms were combined and then subjected to the Z-test (13,14). Genes with P values of ≤0.001 were considered to be significantly differentially expressed. The following criteria were used to identify perturbed networks: a perturbed network must have more than three genes represented on our differentially expressed gene list (P < 0.001), and at least 50% of those genes must be up-regulated (an up-regulated pathway) or down-regulated (a down-regulated pathway).
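The pathway criteria above can be sketched as a small classifier. This is an illustrative reading of the stated rule, not the authors' code: function names and the toy gene IDs are hypothetical, "more than three genes" is taken to mean at least four, and an exact 50/50 split (which the text does not address) is reported separately as a tie.

```python
def combine_isoforms(signature_tpm, signature_to_gene):
    """Sum the tpm of all signatures that map to the same gene ID."""
    combined = {}
    for signature, tpm in signature_tpm.items():
        gene = signature_to_gene.get(signature)
        if gene is not None:
            combined[gene] = combined.get(gene, 0) + tpm
    return combined


def classify_pathway(pathway_genes, direction_by_gene, min_genes=4):
    """Classify one pathway from its significantly changed member genes.

    direction_by_gene maps each significantly changed gene (P < 0.001) to
    "up" or "down"; genes without a significant change are simply absent.
    """
    hits = [direction_by_gene[g] for g in pathway_genes if g in direction_by_gene]
    if len(hits) < min_genes:
        return "not perturbed"
    up, down = hits.count("up"), hits.count("down")
    if up > down:
        return "up-regulated"
    if down > up:
        return "down-regulated"
    return "tie"  # assumption: the 50/50 case is not specified in the text


# Toy example with hypothetical gene IDs.
toy_directions = {"Hs.1": "up", "Hs.2": "up", "Hs.3": "up", "Hs.4": "down"}
toy_result = classify_pathway(["Hs.1", "Hs.2", "Hs.3", "Hs.4", "Hs.5"], toy_directions)
```

Which cell line counts as the reference for "up" is set by whoever builds direction_by_gene; the text reports pathway direction for LNCaP cells versus CL1 cells.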

Results
Massively parallel signature sequencing analyses of the androgen-dependent LNCaP cell line and its androgen-independent variant CL1. Using MPSS technology, we sequenced 2.22 million signature sequences for LNCaP cells and 2.96 million for CL1 cells. We identified a total of 19,595 unique transcript signatures expressed at levels >3 tpm in at least one of the samples. The signatures were classified into three major categories: 1,093 signatures matched repeat sequences, 15,541 signatures matched unique cDNAs or ESTs, and 2,961 signatures had no matches to any cDNA or EST sequences (but did match genomic sequences). The last category comprises three possibilities: signatures representing new transcripts yet to be defined, signatures representing polymorphisms in cDNA sequences (a match of an MPSS sequence to cDNA or EST sequences requires 100% sequence identity), or errors in the MPSS reads. Transcript tags with matches to a cDNA or EST sequence were further classified based on the signatures' orientation relative to the direction of transcription and their position relative to a polyadenylation site and/or poly(A) tail. We also built a searchable MySQL database (http://www.mysql.com) containing the expression levels (tpm), the genomic locations of the MPSS sequences, the cDNA or EST matches, and the classification of each signature. A detailed description of the classification scheme is available in Supplementary Table S1. A snapshot of a representative data query is shown in Supplementary Fig. S1.
We first restricted our analysis to those MPSS signatures corresponding to cDNAs with poly(A) tails and/or polyadenylation sites, so that the corresponding genes could be conclusively identified. We used the Z-test (13,14) to compare differential gene expression between LNCaP and CL1 cells. Using a very stringent P value cutoff (<0.001), we identified 2,088 MPSS signatures (corresponding to 1,987 unique genes, as some genes have two or more MPSS signatures due to alternative use of polyadenylation sites) with significant differential expression. Of these, 1,011 signatures (965 genes) were overexpressed in CL1 cells and 1,077 signatures (1,022 genes) were overexpressed in LNCaP cells (Supplementary Table S2). The Z-score depends on mRNA abundance in the library, so the fold change required to reach significance varies with expression level. For example, using a cutoff P of <0.001 in our data set, the expression level changed from 0 to 26 tpm for the most lowly expressed transcript (>26-fold) but from 7,591 to 11,206 tpm for the most highly expressed transcript (1.48-fold).
We randomly selected nine genes from the 1,987 differentially expressed genes identified by our MPSS analysis and compared their changes in expression levels with those obtained by quantitative real-time RT-PCR. The expression levels of these nine genes changed in the same direction by both methods (Table 1). The MPSS expression profiling data were also consistent with the available published data. For example, using RT-PCR, Patel et al. (9) showed that CL1 tumors express barely detectable prostate-specific antigen (PSA) and androgen receptor mRNAs compared with LNCaP cells. Our MPSS results indicated that LNCaP cells expressed 584 tpm of androgen receptor and 841 tpm of PSA; CL1 cells did not express either androgen receptor or PSA (0 tpm in both cases). Freedland et al. found that CD10 expression was lost in CL1 cells compared with LNCaP cells (19); likewise, we found that CD10 was expressed at 0 tpm in CL1 cells but at 56 tpm in LNCaP cells. Using cDNA microarrays, Vaarala et al. (4) compared LNCaP cells with another androgen-independent variant, a non-PSA-producing LNCaP line that is similar to CL1, and identified a total of 56 differentially expressed genes. We found that the expression levels of these 56 genes changed in the same direction (concordant) between LNCaP and CL1 cells and between LNCaP and non-PSA-producing LNCaP cells (data not shown). This identification of 1,987 versus 56 differentially expressed genes, respectively, underscores the striking differences in sensitivity between MPSS and cDNA microarray techniques.
To compare the sensitivity of the MPSS and cDNA microarray procedures, we hybridized cDNA microarrays containing 40,000 human cDNAs to the same LNCaP and CL1 RNAs that we used for MPSS. Three replicate array hybridizations were done. MPSS signatures and array clone IDs were mapped to Unigene IDs for data extraction and comparisons. We found that only those genes expressed at >40 tpm by MPSS could be reliably detected as changing levels by cDNA microarray hybridizations [judged by an expression level twice the SD of the background, a standard cutoff value for microarray data analysis (data not shown)]. This observation is consistent with the 33 to 60 tpm sensitivity of microarrays estimated from the experiment of Hill et al. (20), in which known concentrations of synthetic transcripts were added. In LNCaP and CL1 cells, ~68.75% (13,471 of 19,595) of MPSS signatures (>3 tpm) were expressed at a level below 40 tpm; changes in the levels of these genes will be missed by microarray methods. Many attempts have been made to increase the sensitivity of DNA array technology (21,22). We have not compared these new improvements against MPSS, but it is clear that there will still be significant differences in the levels of change that can be detected.
Serial analysis of gene expression (SAGE; ref. 23) is another technology for gene expression profiling; like MPSS, it is digital and can generate a large number of signature sequences. However, MPSS (~1 million signatures per sample) can achieve much deeper coverage than SAGE (typically ~10,000-100,000 signatures sequenced per sample) at reasonable cost. We compared our MPSS data on LNCaP cells against publicly available SAGE data on LNCaP cells (National Center for Biotechnology Information SAGE database) through common Unigene IDs. The SAGE library GSM724 (total SAGE tags sequenced: 22,721; ref. 24) is derived from LNCaP cells with an inactivated PTEN gene; it is the SAGE library most similar to our LNCaP cells. Only 400 (~20%) of our 1,987 significantly differentially expressed genes (P < 0.001) had any SAGE tag entry in GSM724. These data illustrate the importance of deep sequence coverage in identifying state changes in transcripts expressed at low-abundance levels.
Functional classifications of genes differentially expressed between LNCaP and CL1 cells. Examination of the Gene Ontology classification of our 1,987 genes revealed that multiple cellular processes changed during the transition from LNCaP to CL1 cells. The complete list, including Gene Ontology annotations, is shown in Supplementary Table S2. The most interesting groups, categorized by function, are shown in Table 2.
Nineteen differentially expressed proteins are related to apoptosis. Twelve of these are up-regulated in CL1 cells, including the apoptosis inhibitors human T-cell leukemia virus type I binding protein 1 and CASP8 and FADD-like apoptosis regulator. Seven are down-regulated in CL1, including programmed cell death 8 and 5 (apoptosis-inducing factors) and BCL2-like 13 (an apoptosis facilitator). Because CL1 cells have increased expression of apoptosis inhibitors and decreased expression of apoptosis inducers, net inhibition of apoptosis may contribute to their greater tumorigenicity.
Matrix metalloproteinases (MMP), which degrade extracellular matrix components that physically impede cell migration, are implicated in tumor cell growth, invasion, and metastasis. We found that MMPs 1, 2, 10, and 13 are significantly overexpressed in CL1 cells (Table 2), which may partially explain these cells' aggressive and metastatic behavior.
CD markers are generally localized at the cell surface; some may be associated with prostate cancer (25). We converted all currently identified CD markers (CD1-CD247) from the PROW CD index database (http://www.ncbi.nlm.nih.gov/prow/guide/45277084.htm) to Unigene numbers and used these numbers to identify their signatures and their expression levels. We identified 15 CD markers that are differentially expressed between LNCaP and CL1 cells (P < 0.001; Table 2). Eleven CD markers, including CD213a1 and CD213a2, which encode interleukin (IL)-13 receptors α1 and α2, are up-regulated in CL1 cells; three CD markers, CD9, CD10, and CD107, are down-regulated in these cells (Table 2). Six CD markers went from 0 or 1 to >35 tpm (Table 2), making them good digital or absolute markers or therapeutic targets. These data suggest that carefully selected CD markers may be useful in following the progression of prostate cancer and indeed could serve as potential targets for antibody-mediated therapies (25). Additional functional categories can be seen in Supplementary Table S2.
Delineation of disease-perturbed networks in prostate cancer cells. Genes and proteins rarely act alone but rather generally operate in networks of interactions. Identifying key nodes (proteins) in the disease-perturbed networks may provide insights into effective drug targets. Comparing the genes (proteins) currently available in the 314 BioCarta and 155 KEGG pathway or network databases (http://cgap.nci.nih.gov/Pathways/) with the MPSS data through Unigene IDs, we identified 37 BioCarta and 14 KEGG pathways that are up-regulated and 23 BioCarta and 22 KEGG pathways that are down-regulated in LNCaP cells versus CL1 cells (Table 3). The number of genes whose expression patterns changed in each pathway is listed in Table 3. Each gene, along with its expression level in LNCaP and CL1 cells, is listed pathway by pathway in our database (ftp://ftp.systemsbiology.net/pub/blin/mpss). Changes in these pathways reveal the underlying phenotypic differences between LNCaP and CL1 cells. For example, multiple networks involved in modulating cell mobility, adhesion, and spreading are up-regulated in CL1 cells, which are more metastatic and invasive than LNCaP cells (Table 3). In the uCalpain and friends in cell spread pathway, calpains are calcium-dependent thiol proteases implicated in cytoskeletal rearrangements and cell migration. During cell migration, calpain cleaves target proteins, such as talin, ezrin, and paxillin, at the leading edge of the membrane while at the same time cleaving the cytoplasmic tails of the integrins β1(a) and β3(b) to release adhesion attachments at the trailing membrane edge. Increased activity of calpains increases migration rates and facilitates cell invasiveness (26).
Many pathways we identified as perturbed in the LNCaP and CL1 comparison are interconnected to form networks (in fact, there are probably no discrete pathways, only networks). For example, the insulin signaling pathway, the signal transduction through IL-1 receptor pathway, and the nuclear factor-κB (NF-κB) signaling pathway are interconnected through c-Jun, IL-1 receptor, and NF-κB. The mapping of genes onto networks/pathways will be an ongoing objective as more networks/pathways become available. Our transcriptome data will be an invaluable resource in delineating these relationships.
As gene regulatory networks controlled by transcription factors form the top layer of the hierarchy that controls the physiologic network, we sought to identify differentially expressed transcription factors. Of 554 transcription factors expressed in LNCaP and CL1 cells, 112 showed significantly different levels between the cell lines (P < 0.001; Supplementary Table S3). This clearly shows significant differences in the functioning of the corresponding gene regulatory networks during the progression of prostate cancer from the early to late stages. As secreted proteins can readily be exploited for blood-based cancer diagnosis and prognosis, we next asked how many of our differentially expressed genes encode secreted proteins. We identified 521 signatures belonging to 460 genes potentially encoding secreted proteins (Supplementary Table S6). Among these, 287 signatures (259 genes) are overexpressed and 234 signatures (201 genes) are underexpressed in CL1 cells compared with LNCaP cells. Thus, one can consider using blood diagnostics (changes in relevant protein concentrations) to follow prostate cancer progression.
Quantitative proteomic analysis of prostate cancer cells. We quantitatively profiled the protein expression changes between LNCaP and CL1 cells using the ICAT-MS/MS protocol described by Han et al. (15). We generated a total of 142,849 MS/MS spectra, 7,282 of which corresponded to peptides with a mass spectrum quality score P of >0.9 (allowing unambiguous identification of peptides; ref. 16). We obtained quantitative peptide ratios for 4,583 peptides corresponding to 940 proteins. The number of peptides is greater than the number of proteins because (a) mass spectrometry identified multiple peptides from the same protein and (b) the ionization step of mass spectrometry created different charge states for the same peptide. The protein ratios were calculated from multiple peptide ratios using an algorithm for the Automated Statistical Analysis on Protein Ratio (17). In the end, we identified 82 proteins that are down-regulated and 108 proteins that are up-regulated by at least 1.8-fold in LNCaP cells compared with CL1 cells. The functional classification of the proteins identified is shown in Supplementary Table S4.
Fifty-four percent (103 of 190) of the differentially expressed proteins identified have enzymatic activity. Many of the proteins identified are involved in fatty acid and lipid metabolism, including fatty acid synthase, carnitine palmitoyltransferase II, and propionyl CoA carboxylase α polypeptide. Fatty acid and lipid metabolism is perturbed in prostate cancer (27,28). Additionally, many genes involved in lipid transport were altered, including five Annexin family proteins, prosaposin, and fatty acid binding protein 5 (Supplementary Table S4). Annexin A1 was shown to be overexpressed in non-PSA-producing LNCaP cells compared with PSA-producing LNCaP cells (4). Annexin A7 is postulated to be a prostate tumor suppressor gene (29). Annexin A2 expression is reduced or lost in prostate cancer cells, and its re-expression inhibits prostate cancer cell migration (30).
Other genes we identified here have been implicated in carcinogenesis, including tumor suppressor p16 and insulin-like growth factor-II receptor (27,31). Some genes have been implicated previously in prostate cancer, such as prostate cancer overexpressed gene 1 (POV1), which is overexpressed in prostate cancer (32), and δ1 and α1 catenin (cadherin-associated protein) and junction plakoglobin, which are down-regulated in prostate cancer cells (33). However, the potential relationships of most of the proteins identified here to prostate cancer require further elucidation. For example, transmembrane protein 4 (TMEM4), a gene predicted to encode a 182-amino acid type II transmembrane protein, is down-regulated ~2-fold in CL1 cells compared with LNCaP cells. MPSS data also indicated that TMEM4 is down-regulated ~2-fold in CL1 cells. Many type II transmembrane proteins, such as TMPRSS2, are overexpressed in prostate cancer patients (34). It will be interesting to see whether TMEM4 overexpression plays a primary role in prostate carcinogenesis. We also identified 12 proteins that have not been annotated or functionally characterized. The relationships between these novel proteins and prostate cancer also need further study.
Additionally, we sought to compare the changes in expression at the protein level in the two cell states with changes at the mRNA level. We converted the protein IDs and MPSS signatures to Unigene IDs to compare the MPSS data with the ICAT-MS/MS data. We limited this comparison to those with common Unigene IDs and with reliable ICAT ratios (SD <0.5) and ended up with a subset of 79 proteins. Of these, 66 genes (83.5%) were concordant in their changes in mRNA and protein levels of expression and 13 genes (16.5%) were discordant (i.e., having higher protein expression but lower mRNA expression or vice versa). The scatter plot of protein/mRNA expression ratios is shown in Fig. 1. There are no functional similarities among the discordant genes. As these mRNAs and proteins are expressed at relatively high levels, discordance due to measurement errors is unlikely. Clearly, posttranscriptional regulation of protein expression is important, although elucidation of the specific mechanism(s) awaits further studies.
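The concordance call described above can be sketched by comparing the signs of the log-transformed ratios. This is an assumed reading of "changes in the same direction", not the authors' code; the function names and the toy ratio pairs are hypothetical, and both ratios are assumed to be strictly positive (a 0 tpm value would need separate handling).

```python
import math


def is_concordant(protein_ratio, mrna_ratio):
    """True when protein and mRNA ratios change in the same direction.

    A ratio above 1 has a positive log and a ratio below 1 a negative log,
    so the product of the logs is positive only for same-direction changes.
    A ratio of exactly 1 (no change) is treated as not concordant.
    """
    return math.log(protein_ratio) * math.log(mrna_ratio) > 0


def concordance_rate(ratio_pairs):
    """Fraction of (protein_ratio, mrna_ratio) pairs that are concordant."""
    if not ratio_pairs:
        return 0.0
    concordant = sum(1 for p, m in ratio_pairs if is_concordant(p, m))
    return concordant / len(ratio_pairs)


# Toy pairs (hypothetical values): three concordant, one discordant.
toy_pairs = [(2.0, 3.0), (0.5, 0.25), (2.0, 0.5), (0.4, 0.9)]
toy_rate = concordance_rate(toy_pairs)

# The reported figures: 66 of 79 genes concordant.
reported_percent = round(66 / 79 * 100, 1)
```

Plotting the two log ratios against each other, as in Fig. 1, places concordant genes in the upper-right and lower-left quadrants.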

Discussion
The systems approach to disease is predicated on the idea that the disease process is reflected in disease-perturbed protein and gene regulatory networks. Molecular systems biology has two important features: (a) it employs global analyses, where global implies studying changes in transcript or protein levels as well as the relationships of all of the elements in the system, and (b) it integrates different types of biological information (single nucleotide polymorphisms, DNA, mRNA, protein, protein interactions, etc.). MPSS is a powerful and sensitive technology that allows deep analysis of the prostate transcriptome. The MPSS protocol we used for this study relies on GATC enzymatic sites to cleave the 3′ region of cDNAs to generate DNA fragments as substrates for MPSS. cDNAs lacking GATC in their 3′ region would be excluded from these analyses. The estimated percentage of cDNA clones lacking an appropriately positioned GATC site is ~3%, as calculated from the Mammalian Genome Collection full-length sequences. Among the 15,064 Mammalian Genome Collection sequences, 14,602 (96.93%) have appropriate GATC sites. The protocol we used is also biased toward capturing MPSS signatures within 500 bp 5′ of the poly(A) site. If the GATC site is located beyond 500 bp 5′ of the poly(A) site, it will likely be missed as well. For example, NKX3.1, a prostate-specific and androgen-regulated gene (35,36), is not found in our MPSS data set because its GATC site closest to the poly(A) tail (Genbank accession no. AF247704) is 2.8 kb away. Recently, a new protocol that eliminates this bias was developed at Lynx.
We estimated that LNCaP cells expressed ~280,000 transcripts per cell. We obtained ~900 μg of total RNA from 10^8 cells. With an average of 3% polyadenylated RNA and an average transcript length of 1 kb, this corresponds to 280,000 transcripts per cell. Therefore, with >2 million signatures obtained for each cell state by MPSS, we can detect transcripts expressed at levels of <1 transcript per cell (this means that not all cells express the transcript).
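The sensitivity argument above can be made concrete with a short conversion between tpm and copies per cell. This sketch simply takes the stated estimate of 280,000 transcripts per cell and the reported library sizes as given; the function names are hypothetical.

```python
TRANSCRIPTS_PER_CELL = 280_000  # estimate stated in the text


def tpm_to_copies_per_cell(tpm, transcripts_per_cell=TRANSCRIPTS_PER_CELL):
    """Average copies per cell implied by a tpm value."""
    return tpm * transcripts_per_cell / 1_000_000


def tpm_of_single_read(library_size):
    """tpm value that corresponds to one signature read in a library."""
    return 1_000_000 / library_size


# One read in the 2.22 million signature LNCaP library.
one_read_tpm = tpm_of_single_read(2_220_000)
one_read_copies = tpm_to_copies_per_cell(one_read_tpm)

# The 3 tpm reliability floor and the ~40 tpm microarray limit.
floor_copies = tpm_to_copies_per_cell(3)
array_limit_copies = tpm_to_copies_per_cell(40)
```

Under these assumptions, the 3 tpm floor corresponds to less than one copy per cell on average, while the roughly 40 tpm microarray limit corresponds to about 11 copies per cell.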
The BioCarta and KEGG databases describe 469 protein pathways or networks (http://cgap.nci.nih.gov/Pathways/). We have identified 37 BioCarta and 14 KEGG pathways that are up-regulated and 23 BioCarta and 22 KEGG pathways that are down-regulated in LNCaP cells versus CL1 cells. We have also shown that 112 transcription factors change between these two disease states, consistent with the fact that several different gene regulatory networks are perturbed. These changes indicate significant alterations of the corresponding gene regulatory networks. These transcription factors include androgen receptor along with six other transcription factors, such as the ets homologous factor, a liver-specific bHLH-Zip transcription factor, an IFN regulatory factor, and CCCTC-binding factor (zinc finger protein; see Supplementary Table S3). The fascinating question is which of these networks are directly correlated with prostate cancer progression and which are changed secondarily as a consequence of their connections to the primary disease networks. We are working on strategies to distinguish these possibilities. Nevertheless, we can firmly conclude that the progression from early-stage to late-stage prostate cancer, as represented by LNCaP and CL1 cells, is clearly reflected in significant changes in both protein and gene regulatory networks.
In contrast to the MPSS technology, the ICAT technology is an immature technology that cannot yet carry out global analyses (37). The integration of different types of data provides powerful new approaches to defining protein and gene regulatory networks more precisely (38). We have shown that the protein and RNA expression levels of 66 of 79 genes (83.5%) were concordant (i.e., changes in the same direction; Supplementary Table S5). This concordance rate is higher than that reported elsewhere (39,40). Waghray et al. found that only 8 of 25 (32%) androgen-responsive genes in LNCaP cells showed concordance between protein levels measured by two-dimensional gels and MS/MS and mRNA levels analyzed by SAGE (39). Although genes in different experimental systems may have different concordance rates between mRNA and protein expression, use of different methods for quantitative protein profiling (ICAT-MS/MS versus two-dimensional gel-MS/MS) and mRNA expression profiling (MPSS versus SAGE) may also account for the differences. It is also critical to use only those data with high confidence levels in the comparisons between mRNA and protein levels. The expression levels obtained by MPSS are more accurate than those obtained by SAGE or DNA microarrays because of the deep sequence coverage MPSS achieves. We have also limited our data set to only those proteins (649 of them) that were identified in multiple peptide hits and in which the ICAT ratios did not vary greatly among different peptides from the same protein (SD < 0.5). Such variation could derive from experimental errors or from different protein isoforms. A multiplicity of post-transcriptional mechanisms have been described, and there are probably more to be identified (41). The important point is that this major aspect of control could not have been identified without the integration of two data types: mRNAs and proteins.
The systems approach provides a powerful new approach to diagnostics. The idea is that disease-perturbed networks change their patterns of mRNA and protein expression both within the diseased cells and in terms of the proteins they synthesize that are secreted into the blood. Of the 1,987 mRNAs that changed in the transition from LNCaP to CL1 cells (early-stage to late-stage cancer), 460 (23.2%) encoded proteins that were potentially secreted (Supplementary Table S6). Sixteen of these putative secreted proteins were also identified as differentially expressed in these two cell states by the ICAT approach (Supplementary Table S6). Of the 190 differentially expressed proteins identified by the ICAT approach, 22 were predicted to be secreted proteins (Supplementary Table S6). These proteins are excellent candidates for investigation as diagnostic markers for prostate cancer progression. The interesting point is that these secreted diagnostic markers will serve as surrogates for the state of the corresponding protein and gene regulatory networks and potentially will enable one to (a) stratify disease into distinct categories (e.g., relatively benign, slowly invasive, and rapidly metastatic for prostate cancer), because these different types of prostate cancer will employ different disease-perturbed networks; (b) follow progression; (c) follow response to therapy; and (d) monitor adverse drug reactions. The other interesting possibility is that the perturbed secreted proteins will serve as markers to identify the primary disease-perturbed networks and accordingly will identify networks that may harbor excellent protein candidates for drug targeting: drug targets that may kill disease cells specifically or return the networks to a more normal state.
Interestingly, the transition between these two states of prostate cancer progression can produce "digital changes" (i.e., changes from 0 to ≥50 tpm). Thus, one can possibly obtain diagnostic markers that are digital in the sense that they transition from no expression to some expression. In the transition from LNCaP cells to CL1 cells, there are 175 signatures (169 mRNAs) that go from 0 to ≥50 tpm. Likewise, in going from CL1 cells to LNCaP cells, there are 131 signatures (128 mRNAs; Supplementary Table S2). Among the transcription factors we identified, eight changed from 0 tpm in LNCaP cells to >50 tpm in CL1 cells and seven changed from >50 tpm in LNCaP cells to 0 tpm in CL1 cells (Supplementary Table S3). Eight pathways were affected by the "digital changes" (Supplementary Table S7). For example, acid ceramidase 1 and aspartate aminotransferase changed from >50 tpm in LNCaP cells to 0 tpm in CL1 cells, affecting multiple pathways, including the insulin-like growth factor-I receptor pathway and the activation of COOH-terminal Src kinase pathway (Supplementary Table S7). It will be interesting to test these potential digital diagnostic markers.
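The selection of "digital changes" can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the signature labels, and the tpm values are hypothetical. Treating the 50 tpm cutoff as inclusive and treating a signature absent from a library as 0 tpm are both assumptions.

```python
def digital_changes(tpm_a, tpm_b, threshold=50):
    """Find signatures that switch between no expression and clear expression.

    tpm_a and tpm_b map MPSS signatures to transcripts per million (tpm)
    in two cell states. A signature missing from a dict is assumed to be
    0 tpm. Returns two sorted lists:
      switched_on  - 0 tpm in state A and >= threshold in state B
      switched_off - >= threshold in state A and 0 tpm in state B
    """
    signatures = set(tpm_a) | set(tpm_b)
    switched_on = sorted(
        s for s in signatures
        if tpm_a.get(s, 0) == 0 and tpm_b.get(s, 0) >= threshold
    )
    switched_off = sorted(
        s for s in signatures
        if tpm_b.get(s, 0) == 0 and tpm_a.get(s, 0) >= threshold
    )
    return switched_on, switched_off


# Hypothetical tpm values, for illustration only.
lncap = {"sig1": 0, "sig2": 120, "sig3": 10, "sig4": 75}
cl1 = {"sig1": 64, "sig2": 0, "sig3": 300, "sig5": 50}

on_in_cl1, off_in_cl1 = digital_changes(lncap, cl1)
```

In this example, `sig3` is excluded even though it rises 30-fold, because it is already expressed in the first state; only transitions that start or end at 0 tpm count as digital.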
Our analyses provide an excellent database and a powerful resource for developing tools for multivariable diagnosis and prognosis. They represent a significant step toward a system-wide understanding of prostate cancer progression. The systems approach to disease will offer powerful new approaches to diagnostics, therapeutics, and even prevention in the future (42). It will almost certainly usher in an era of predictive and preventive medicine over the next 10 to 20 years (43).

Figure 1. Scatter plot of the protein ratios obtained by ICAT and the mRNA expression ratios obtained by MPSS. Expression ratios in Supplementary Table S3 were transformed to natural logarithms and then plotted.

Table 1. Comparison of MPSS and real-time RT-PCR results. *LNCR, LNCaP cells stimulated with androgen; LNCX, LNCaP cells starved of androgen.

Table 2. Examples of differentially expressed genes and their functional classifications

Table 3. Pathways that are up-regulated or down-regulated comparing LNCaP cells to CL1 cells