Genetic Alterations in the Tyrosine Kinase Transcriptome of Human Cancer Cell Lines

Protein tyrosine kinases (PTKs) play a critical role in the manifestation of cancer cell properties, and respective signaling mechanisms have been studied extensively on immortalized tumor cells. To characterize and analyze commonly used cancer cell lines with regard to variations in the primary structure of all expressed PTKs, we conducted a cDNA-based sequence analysis of the entire tyrosine kinase transcriptome of 254 established tumor cell lines. The profiles of cell line intrinsic PTK transcript alterations and the evaluation of 155 identified polymorphisms and 234 somatic mutations are made available in a database designated "Tykiva" (tyrosine kinome variant). Tissue distribution analysis and/or the localization within defined protein domains indicate functional relevance of several genetic alterations. The cysteine replacement of the highly conserved Y367 residue in fibroblast growth factor receptor 4 or the Q26X nonsense mutation in the tumor-suppressor kinase CSK are examples, and may contribute to cell line–specific signaling characteristics and tumor progression. Moreover, known variants, such as epidermal growth factor receptor G719S, that were shown to mediate anticancer drug sensitivity could be detected in other than the previously reported tumor types. Our data therefore provide extensive system information for the design and interpretation of cell line–based cancer research, and may stimulate further investigations into broader clinical applications of current cancer therapeutics. [Cancer Res 2007;67(23):11368–76]


Introduction
For decades, research on established cancer cell lines has been the basis of pioneering discoveries in diverse areas of modern biology. The fields of tumor biology and signal transduction research in particular have depended critically on studies of tumor-derived cell lines. The development of currently used cancer therapeutics, including protein tyrosine kinase (PTK)-targeting agents, was likewise made possible through the use of cancer cells in culture.
Because of the established relevance of PTKs for cancer initiation and progression, extensive screens for genomic sequence changes of the cancer kinome have been undertaken and yielded a diversity of sporadic alterations (1–8). Besides sporadic mutations that accumulate in the developing tumor due to cancer genome instability, DNA sequence polymorphisms have been linked to cancer aggressiveness and susceptibility (9). The fibroblast growth factor receptor 4 (FGFR4) G388R single nucleotide polymorphism (SNP), for instance, was found to be associated with increased cell motility in vitro as well as lymph node metastasis, advanced tumor stage, and reduced survival of breast cancer patients (10). Moreover, this SNP was shown to mediate resistance to adjuvant therapy in primary breast cancer, and its clinical relevance was also shown for soft tissue sarcoma, and head and neck, colon, and prostate cancers (11). Recent genome-wide association studies to identify SNPs connected with cancer susceptibility reflect the increasing recognition of polymorphisms as cancer-relevant parameters (12–14).
Through cDNA-based tyrosine kinase transcriptome (TKT) sequence evaluation, we performed a comprehensive characterization and analysis of widely used cancer cell lines with regard to variations in the primary structure of all expressed tyrosine kinases. For 254 established tumor cell lines of 19 tissue origins, we determined the profiles of both somatic mutations and germ line polymorphisms detectable in the transcripts of 90 PTK genes (15). The localization within the respective protein sequence, sequence comparisons with related genes, or the tissue distribution indicated potential functional relevance for individual genetic alterations that may define the cell line-specific signaling characteristics and/or properties related to oncogenesis.

Materials and Methods
Samples. The panel of 254 cancer and 7 nontumorigenic cell lines was obtained from various sources (Supplementary Table S1). Poly(A)+ RNAs from diverse organs of 15 different cancer-free individuals were purchased from Ambion (Supplementary Table S1); genomic DNA derived from blood of 90 noncancer donors was from Coriell Institute for Medical Research. Samples of 55 primary invasive breast carcinomas were obtained from H. Hoefler and S. Iacobelli. Biopsies of 55 kidney tumors and 55 prostate cancer specimens were provided by S. Peter (Supplementary Table S2).
RNA isolation and cDNA synthesis. Total RNA was extracted from exponentially growing cells using guanidinium thiocyanate (16).
Genes, PCR, and sequencing. With purified cDNAs prepared from 261 cell lines and 15 control tissues as templates, overlapping fragments covering the entire coding region of all 58 receptor tyrosine kinase (RTK) and 32 cytoplasmic tyrosine kinase genes (15) were PCR amplified in at least two independent experiments. For each fragment, PCR conditions were specifically optimized using pooled cDNAs in a first step and cDNAs from six individual cell lines in a second step. In this process, the efficiencies of three proofreading polymerases (Novagen, Roche, Invitrogen) with or without additives (DMSO, betaine, TMA oxalate), various primer pairs, and different annealing temperatures were compared, and the best parameters were applied to the full panel of samples.
PCR products for all genes except IRR, MUSK, FGR, and SRMS, for which we obtained no amplicons, were subjected to direct sequencing of both forward and reverse strands. For selected alterations, homologous fragments from cDNAs of 165 primary tumors were PCR amplified using the same primer pairs. To further verify the potential germ line origin of given variants, additional primers were designed to PCR amplify the respective region from genomic DNA of 90 blood samples from healthy donors.
Primers for amplification and sequencing of cDNA (Supplementary Table S3) and genomic DNA (Supplementary Table S4) were designed using the Primer3 program and refer to National Center for Biotechnology Information (NCBI) reference sequence files with accession numbers provided in Supplementary Table S5.
Data analysis. Sequence differences from the NCBI reference sequence were identified via manual inspection of aligned electropherograms assisted by the Mutation Surveyor software package. In addition to nonsynonymous genetic alterations, we detected numerous silent sequence variations that are neither presented nor further analyzed here. It is noteworthy that 63 sequence differences from the NCBI database occurred in all cDNA samples we analyzed and represent either other isoforms or variants that might actually be wild-type rather than genetic alterations (Supplementary Table S6).

Results and Discussion
Characterization of PTK transcripts in 254 tumor cell lines.
To comprehensively characterize widely used tumor cell lines with regard to nonsilent alterations in all expressed tyrosine kinase genes, we evaluated the sequence of the entire TKT of 254 established cancer cell lines (Supplementary Table S1). These cell lines were derived from 19 different tissue origins (Fig. 1A); controls included seven nontumorigenic cell lines and 15 tissues from different organs of healthy individuals.
For each cell line and control sample, whole-cell cDNA was prepared and used for the amplification and direct sequencing of the complete PTK coding region. Despite the lack of normal tissue counterparts for the established tumor cell lines, we attempted to define the identified PTK transcript variations as somatic or germ line sequence differences. We classified as germ line polymorphisms those sequence differences that were either detected in our 22 controls or previously reported as hereditary variants in the NCBI SNP database, the Ensembl Genome Browser, the UniProtKB/Swiss-Prot database, the SNP500Cancer database, or the literature. Besides zygosity, cell line-specific variant profiles thus indicate germ line or somatic origin of the individual TKT sequence variations. Representative examples are displayed for the skin-derived tumor cell lines A-375, BOW-G, C-32, C-8161, and Colo-16 (Fig. 1B). The characterization of all cell lines can be found in Supplementary Table S7.
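The germ line versus somatic assignment described above is a simple membership rule, and can be sketched in Python as follows. This is an illustrative sketch only, not the authors' pipeline: the `Variant` class, the function name, and the contents of the two example sets are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str    # PTK gene symbol, e.g. "FGFR4"
    change: str  # protein-level change, e.g. "G388R"


def classify_origin(variant, control_variants, database_variants):
    """Apply the rule stated in the text: a variant counts as a germ line
    polymorphism if it was seen in a control sample or is listed as a
    hereditary variant in a reference database; otherwise it is somatic."""
    if variant in control_variants or variant in database_variants:
        return "germline"
    return "somatic"


# Placeholder sets for illustration; they do not reproduce the study's data.
controls = {Variant("FGFR4", "G388R")}
databases = {Variant("EGFR", "R521K")}

for v in (Variant("FGFR4", "G388R"),
          Variant("EGFR", "R521K"),
          Variant("FGFR4", "Y367C")):
    print(v.gene, v.change, classify_origin(v, controls, databases))
```

Because no matched normal tissue exists for the cell lines, a rule of this kind can only label a variant "somatic" by exclusion, which is why rare germ line variants may end up in the somatic class.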
Based on these data, we determined the absolute number and distribution of TKT-linked somatic mutations and germ line polymorphisms within the entire tumor cell line panel. For polymorphisms, we observed a Gaussian-like distribution with an average of 12.3 sequence variations per cancer cell line. In contrast, somatic mutations were unevenly distributed (Fig. 1C). No somatic alterations were detected in the TKT of 119 cancer cell lines, consistent with the complete absence of kinome mutations in subsets of recently screened breast cancer, lung carcinoid, and testicular germ-cell tumor samples (1–3). Conversely, high frequencies of 9 to 14 somatic mutations in the transcribed tyrosine kinomes of LNCaP, Jurkat, MeWo, MKN-1, HCT-15, and DLD-1 might reflect a mutator phenotype (1). They are in agreement with sequence data of 24 cancer genes in the NCI-60 cell line panel that also showed HCT-15 to be one of the most frequently mutated tumor cell lines (18). With intermediate mutation rates for the other tumor cell lines, our data indicate an accumulation of somatic mutations in PTK transcripts of various cancer cell lines, which may contribute to the progression characteristics of certain cancers.
In the entire panel of cancer cell lines and controls, we identified 389 nonsynonymous genetic variations within 39.85 of 72.08 Mb of PTK coding sequence that were amplifiable and represent the detectable TKT of all samples. As an alternative to the allocation to cell line-specific variant profiles, we grouped these sequence differences by genes and PTK subfamilies. Each variation was thereby specified regarding the spectrum of affected cell lines as well as the zygosity status and the presumable somatic versus germ line origin. These data are shown for FGFR4 (Fig. 2), which is discussed below. The full information for all transcript variants and PTK genes can be obtained from Supplementary Table S8. Sequence specifications of frame shift alterations and insertions are provided in Supplementary Table S9.
Tykiva database for cancer cell line TKT analysis. Additionally, all data on the identified PTK transcript variants are compiled in the database designated Tykiva. Transcript variants can be specifically retrieved for each of the 254 tumor cell lines, 19 tissue origins/tumor types, or any of the 90 PTK genes. Somatic or germ line origin is indicated, and other cell lines carrying the same variant are referred to. In graphical gene representations, the localization of all detected variants is displayed in the context of the reference amino acid sequence as well as predicted protein domain structures according to Swiss-Prot data. Optionally projectable to the major known isoforms, these illustrations cross-reference our data to variant information from the NCBI SNP database, the Ensembl Genome Browser, the Swiss-Prot and GenBank databases, the KinMutBase, the IDbases, and the literature. They thereby comprise the current knowledge of nonsilent genetic variations in PTK genes.
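The three retrieval routes described for the database (by cell line, by tissue origin, by gene) and the cross-reference to other carriers can be illustrated with a small in-memory index. This is a hypothetical sketch, not the actual Tykiva schema or interface; the record layout and function names are assumptions. The four example records use cell line, gene, and variant assignments mentioned in this article.

```python
from collections import defaultdict

# Record layout (assumed): (cell_line, tissue, gene, change, origin).
records = [
    ("MDA-MB-453", "breast", "FGFR4", "Y367C", "somatic"),
    ("SW-48", "colon", "EGFR", "G719S", "somatic"),
    ("DLD-1", "colon", "CSK", "Q26X", "somatic"),
    ("HCT-15", "colon", "CSK", "Q26X", "somatic"),
]


def build_indexes(records):
    """Group records by cell line, by tissue origin, and by gene."""
    by_cell_line = defaultdict(list)
    by_tissue = defaultdict(list)
    by_gene = defaultdict(list)
    for rec in records:
        cell_line, tissue, gene, _change, _origin = rec
        by_cell_line[cell_line].append(rec)
        by_tissue[tissue].append(rec)
        by_gene[gene].append(rec)
    return by_cell_line, by_tissue, by_gene


def other_carriers(records, gene, change, cell_line):
    """List the other cell lines that carry the same variant."""
    return sorted(r[0] for r in records
                  if r[2] == gene and r[3] == change and r[0] != cell_line)


by_cell_line, by_tissue, by_gene = build_indexes(records)
print(by_gene["CSK"])
print(other_carriers(records, "CSK", "Q26X", "DLD-1"))
```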
The expressed PTK variants may define cell line-specific signaling characteristics and cancer-related cell properties. In the following sections, we therefore address tissue distribution and localization of each polymorphism and somatic mutation within the respective protein sequence. Based on these data and the current literature, we discuss potential functional and/or clinical relevance for some of the identified genetic variants.
Characterization of 155 PTK gene sequence polymorphisms. According to our definition of somatic and germ line sequence variants, 155 of the 389 identified alterations were classified as sequence polymorphisms. They include 131 SNPs, 16 germ line deletions, and 8 insertions. Their overall frequencies and localization in distinct protein domains are summarized in Fig. 3A and B. Moreover, the occurrence frequency of each polymorphism in individual tumor types or control samples was determined. Occurrence frequency was defined as the ratio of the number of carriers of a given sequence variant to the number of cell lines with the same tissue origin that express the corresponding gene regardless of its genotype (paired numbers in Fig. 3C and Supplementary Table S10). It therefore reflects the expression aspect of respective genes and alterations, as addressed by our cDNA analysis.
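The occurrence frequency defined above can be computed as a pair of counts, as in the following Python sketch. The sample dictionaries and their field names are hypothetical and serve only to show which samples enter the denominator; they are not taken from the supplementary tables.

```python
def occurrence_frequency(samples, tissue, gene, change):
    """Return (carriers, expressing) for one variant in one tissue origin.

    `expressing` counts samples of the given tissue that express the gene,
    regardless of genotype; `carriers` counts those among them that carry
    the variant. These are the paired numbers referred to in the text.
    """
    expressing = [s for s in samples
                  if s["tissue"] == tissue and gene in s["expressed"]]
    carriers = [s for s in expressing if (gene, change) in s["variants"]]
    return len(carriers), len(expressing)


# Hypothetical samples for illustration only.
samples = [
    {"name": "line_A", "tissue": "colon", "expressed": {"EGFR", "MET"},
     "variants": {("EGFR", "R521K")}},
    {"name": "line_B", "tissue": "colon", "expressed": {"EGFR"},
     "variants": set()},
    {"name": "line_C", "tissue": "colon", "expressed": {"MET"},
     "variants": set()},  # does not express EGFR, so it is not counted
    {"name": "line_D", "tissue": "skin", "expressed": {"EGFR"},
     "variants": {("EGFR", "R521K")}},
]

carriers, expressing = occurrence_frequency(samples, "colon", "EGFR", "R521K")
print(f"{carriers}/{expressing}")  # prints "1/2"
```

Excluding non-expressing samples from the denominator is what ties the frequency to the cDNA-based design: a cell line that does not transcribe the gene cannot be scored as carrier or noncarrier.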
Of the 131 missense substitutions, 100 had been reported previously. However, only 12 of them, as well as two deletions, have been connected with cancer thus far (Supplementary Table S10). It is noteworthy that five of eight novel deletions involve entire exons (Supplementary Table S10) and most likely represent splice variants. Such variants could preferentially be detected because of the use of cDNA as sequencing target. Moreover, other transcription-related mechanisms such as epigenetic gene silencing or mRNA stability are also reflected by cDNA, and genetic alterations identified therein are thus likely to be expressed within the cell. However, a disadvantage of our approach is that it does not detect fusion kinase genes or amplified kinase transcripts.
To verify the in vivo relevance of the sequence variations detected in tumor cell lines, we analyzed cDNA from 165 primary breast, kidney, and prostate cancer specimens as well as blood DNA from 90 healthy individuals for the occurrence of a representative subset of our identified genetic alterations. This subset was defined as all nonconservative sequence changes that were found at least twice in our panel of cell lines and control samples with at least one cell line originating from breast, kidney, or prostate cancer. All but 2 of the 46 polymorphisms that fulfilled these criteria could be verified in patient sample cDNAs or blood DNA (Supplementary Table S11), hence confirming the in vivo relevance of sequence variations in tumor cell lines.
Figure 1. Characterization of tumor cell lines with regard to genetic alterations in the TKT. A, samples. The tissue origins and number of tumor cell lines derived thereof are summarized. B, patterns of genetic alterations. For each of the 254 tumor and seven control cell lines, the specific pattern of nonsynonymous genetic alterations within the TKT is provided (Supplementary Table S7) and exemplarily shown for five skin-derived tumor cell lines. Blue, germ line polymorphisms; yellow, somatic mutations. C, genetic alterations per TKT. The number of tumor cell lines with the indicated number of somatic (yellow) or germ line (blue) variations detected therein is illustrated.
Figure 3 (legend fragment; beginning not recovered). [...] CO, colon; EP, endometrium and placenta; HN, head and neck; HL, hematopoietic and lymphoid system; KI, kidney; LI, liver; LU, lung; OV, ovary; PA, pancreas; PR, prostate; SK, skin; ST, stomach; TE, testes; TY, thyroid; NO, normal control samples) was determined for all polymorphisms (Supplementary Table S10) and presented here for those described in the text. Paired numerals, number of carriers of the indicated variant as a subset of all cell lines with the same tissue origin that express the corresponding gene regardless of its genotype. Bold type, novel germ line alterations; parenthesized numbers, supplementary references that associate respective polymorphisms with cancer.
Cancer relevance of identified polymorphisms. Some of the identified polymorphisms have previously been associated with nonproliferative diseases. Because of the pleiotropic effects of many PTKs, the respective functional modulations may also be relevant for cancer. This is exemplified by the V722I substitution in the pseudokinase domain of Janus-activated kinase 3 that we identified as a rare heterozygous polymorphism in the head and neck cancer cell lines SCC-10A and SCC-10B (Fig. 3C). First reported in patients with autosomal recessive T-B+ SCID syndrome (19), its recent detection in an acute megakaryoblastic leukemia patient and the capacity to transform Ba/F3 cells (20) support a potential role in cancer. Another example is NTRK1 R780Q, which we found in the colon, ovarian, and head and neck cancer cell lines Caco2, SK-OV-8, and SCC-9 (Fig. 3C), respectively. This SNP affects the same arginine residue whose replacement with proline was shown to be associated with "congenital insensitivity to pain with anhidrosis" and abrogation of catalytic tyrosine kinase activity in vitro (21). Assuming a similar loss of function for the NTRK1 Q780 isotype, this variant may exert antiapoptotic and hence pro-oncogenic effects, as expression of NTRK1 wild-type was associated with induction of apoptosis and a favorable prognosis of neuroblastoma patients (22).
Cancer relevance was also established for MET T1010I, which represents a biomarker for MET inhibitor efficacy (23) and was originally reported as a somatic gain-of-function mutation in small cell lung cancer and non-small cell lung cancer (NSCLC; ref. 24) and malignant pleural mesotheliomas (23). Its detection in 4 of our 90 blood control DNAs (Supplementary Table S11), however, confirmed previous hints for potential germ line occurrence (25). Moreover, the identification of MET T1010I in the prostate carcinoma cell line TSU-PR1 and a primary prostate tumor as well as in the brain, breast, colon, hematopoietic, and skin cancer cell lines IHR-32, DAL, LS-123, U-266, and Colo-829, respectively (Fig. 3C and Supplementary Tables S10 and S11), suggests enhanced MET signaling in these tumor cell lines and expands the currently reported spectrum of affected tumor types.
Polymorphism frequencies in cancer cells versus normal tissues. Differential occurrence rates of sequence polymorphisms in particular cancer types and/or normal tissues may indicate tumor-suppressive or tumor-promoting effects. To address the potential relevance of all polymorphisms for certain tumor types, we compared their occurrence frequencies in tissue types and control samples (Fig. 3C and Supplementary Table S10). Only some examples are displayed in Fig. 4. For epidermal growth factor receptor (EGFR) R521K, a relative overrepresentation of the EGFR K521 allele in cDNAs of normal control samples (55%), colon (52%), and head and neck (69%) tumor cell lines was detected (Fig. 4, left). This indicates a possible tumor-suppressive activity of the EGFR K521 isotype that apparently is not relevant to colon cancer and head and neck cancer. An attenuated growth response to EGFR ligands and reduced induction of the proto-oncogenes FOS, JUN, and MYC in EGFR K521-expressing, but not EGFR R521-expressing, cells (26), and an increased risk of local recurrence after chemoradiation treatment for rectal cancer patients with at least one EGFR R521 allele (27) support these conclusions. Similarly, a clearly differential occurrence of TYK2 F362 allele carriers was observed in brain (75%)-derived and hematopoietic/lymphoid system (67%)-derived tumor cell lines compared with control tissues (31%) or other tumor types (Fig. 4, middle). The novel polymorphism TNK M598delinsEVRSHX was found at low frequencies in control samples (5%) and cancer cells of several tissue origins, but occurred in 62% of blood-derived, 55% of skin-derived, and, even more prominently, 80% of brain-derived tumor cell lines (Fig. 4, right). In contrast to EGFR R521K, the underrepresentation of the TYK2 F362 allele and the TNK insertion in control samples indicates a tumor-promoting function with particular relevance for leukemia, melanoma, and glioma. It can be expected that, as for EGFR R521K (27) or FGFR4 G388R (10), the correlation with clinical parameters will assign therapeutic and/or predictive value to many such unequally distributed alleles.
Polymorphisms affecting the kinase domain. The localization of genetic alterations within the respective protein sequence may be indicative of structural and/or functional consequences. The skipping of entire exons in the cytoplasmic kinases TYK2 and TXK, TYK2 E971fsX67 and TXK Y414fsX15, for instance, results in frame shifts and premature translation termination in the tyrosine kinase domains, and thus most likely is associated with catalytic inactivation. The truncated TYK2 variant that lacks 206 amino acids of the kinase domain, including the catalytic site and the activation loop, may also impose a dominant negative effect on cell signals. Interestingly, Stoiber et al. (28) reported that TYK2-deficient mice developed B and T lymphoid leukemia with higher incidence and shortened latency as a result of decreased cytotoxic capacity of TYK2−/− natural killer (NK) and NKT cells and thus impaired tumor surveillance. Because NK activity as part of the innate immune system mediates tumor rejection in general, the significance of TYK2 loss of function might not be restricted to hematopoietic malignancies, but may also be important for other cancer types. Consistent with this possibility, we detected TYK2 E971fsX67 in cancer cells derived from various tissues, including breast, cervix/vulva, colon, endometrium, lung, and pancreas (Fig. 3C), as well as 33 clinical breast, prostate, and kidney cancer specimens. Its occurrence in the control cell line BPH-1 suggests potential germ line origin. Hence, the TYK2 E971fsX67 splice variant may also represent a prognostic marker for cancer patients and support therapeutic decision making.
[Figure legend fragment; beginning not recovered] ... 3) with an expression of the corresponding gene in at least 10 samples have been selected for this analysis.
Overall, these examples point to the potential role of sequence polymorphisms as genetic parameters that may contribute to a patient-specific definition of disease predisposition, rate of progression, or responsiveness to therapeutic agents. In conjunction with simple detectability in blood samples, this makes polymorphisms highly valuable biomarkers for diagnostic patient characterization.
Identification of 234 somatic PTK gene mutations. Of all sequence differences, 234 were undetectable in any of the control samples or public databases and were thus defined as somatic mutations. However, because of the lack of cell line-specific normal tissue controls, we cannot exclude the possibility that some actually represent rare germ line polymorphisms. The somatic mutations are composed of 210 missense and 2 nonsense single nucleotide substitutions as well as 19 deletions and 3 insertions. Although the majority (186) occurred once, 53 were found two to five times, and 3 in 6 to 10 tumor cell lines (Fig. 5A). Among the twice-occurring somatic mutations, 20 were detected in cell lines originating from the same tumor donor (Supplementary Table S1). They may be considered single mutations, thus adding up to a total of 206 nonrecurring mutational events. As for the polymorphisms, we present all somatic TKT alterations in the context of the respective protein domains and tumor types, and display the ratio of affected and expression-positive cell lines for each tissue origin (Fig. 5B and C; Supplementary Table S12).
Figure 5. Distributions of nonsynonymous somatic mutations identified in all transcribed PTK genes from 254 tumor cell lines. A, rates of somatic mutations. The allocation to missense or nonsense substitutions, deletions and insertions as well as frequency categories (1, 2-5, 6-10, or >10 affected samples) is shown. B, domain localization of identified mutations. The localization within defined domains or other protein regions is indicated. C, tissue distribution of sporadic alterations. For each somatic mutation, the tissue distribution (see legend to Fig. 3 for abbreviations) was determined (Supplementary Table S12) and presented here for text-related examples. Paired numerals, the number of mutated and expression-positive cell lines within a given tumor type. Bold type, novel somatic mutations; numbers in parentheses, supplementary references that associate respective mutations with cancer.
Consistent with SYK A353T representing one of the two most prevalent mutations, the tumor-suppressive tyrosine kinase SYK turned out to be the most frequently mutated kinase within our panel of 254 tumor cell lines. When absolute numbers of somatic mutations were compared, SYK scored highest with mutations detected in 11 nonrelated tumor cell lines (Supplementary Table S13). After normalization with respect to the PTK transcription status, SYK showed the highest mutation rate of 30.3 sporadic alterations per 1 Mb expressed coding sequence, followed by NTRK1, EPHA2, and FLT3 (Supplementary Table S14). The domain organization of SYK and the known and novel genetic alterations are shown in Fig. 6A.
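The normalization to expressed coding sequence can be illustrated with a short Python sketch. The text does not spell out how the denominator was assembled, so this sketch assumes that the expressed coding sequence of a gene equals its coding length multiplied by the number of samples in which the transcript was amplified. The function name and the example numbers are hypothetical and are not the values behind Supplementary Table S14.

```python
def mutation_rate_per_mb(n_mutations, coding_length_bp, n_expressing_samples):
    """Somatic mutations per 1 Mb of expressed coding sequence.

    Assumption: expressed coding sequence = coding length (bp) multiplied by
    the number of samples expressing the gene.
    """
    sequenced_bp = coding_length_bp * n_expressing_samples
    if sequenced_bp <= 0:
        raise ValueError("gene is not expressed in any sample")
    return n_mutations * 1_000_000 / sequenced_bp


# Hypothetical example: 5 mutations in a 2,000 bp coding region
# that was amplified from 100 samples.
print(mutation_rate_per_mb(5, 2_000, 100))  # prints 25.0
```

Normalizing in this way keeps a long or broadly expressed gene from ranking high merely because more of its sequence was read.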
Somatic mutations with possible oncogenic potential. Somatic mutations clustering in the EGFR kinase domain (EGFR G719S, L858R, L861Q, and others) have recently been reported for patients with gefitinib-responsive NSCLC and were shown to enhance tyrosine kinase activity and sensitivity to gefitinib in vitro (29–31). We found the EGFR G719S mutation to be heterozygously expressed in the colon cancer cell line SW-48 (Fig. 5C; Supplementary Tables S1 and S2). This shows the existence of Iressa sensitivity-mediating mutations in cancers other than NSCLC and, in particular, suggests colon cancer as another potential indication for gefitinib therapy.
Similar to EGFR L858R and gefitinib, the KIT N822K mutation that we confirmed in the acute myelogenous leukemia cell line KASUMI-1 (32) was reported to mediate sensitivity to Gleevec (33). The enhanced in vitro receptor activation shown for these EGFR and KIT mutations (30, 34) might be related to their location within the regulatory activation loop. In this respect, the sporadic variations FLT3 R849H, TEK A1006T, ABL G417E, ARG K450R, and TEC W531R, which we detected homozygously or heterozygously in BM-1604, SK-MEL-2, MM-Leh, Caki-2, and Jurkat (Fig. 5C), are particularly intriguing as they are located in the activation loop as well. By inference, these mutations may also have a higher probability of modulating the tyrosine kinase catalytic activity and/or related signaling pathways within the respective tumor cell lines.
The 18 somatic mutations we identified in intracellular juxtamembrane domains (Supplementary Table S12) might affect functionally important elements that mediate down-regulation of RTK activity. The in-frame deletion MET D981_E1027del as a result of exon 14 skipping, for instance, leads to the loss of c-Cbl E3-ligase binding, decreased ubiquitination, and prolonged ligand-dependent cell signaling in vitro and in vivo (35). Although we confirmed MET D981_E1027del in the NSCLC cell line NCI-H596, its homozygous detection in breast and stomach cancer cell lines MDA-MB-415 and Hs746T, respectively (Fig. 5C), provides evidence for its occurrence in tumor types other than the reported NSCLC (35, 36). Presuming enhanced sensitivity to anti-MET therapeutics that MET D981_E1027del was suggested to mediate (35), our findings extend the potential clinical relevance for this deletion.
The somatic mutations we identified within the extracellular domain of two FGFR family members, FGFR1 P252S and FGFR4 Y367C (Figs. 5C and 6B), possibly augment receptor activation by influencing ligand binding and receptor dimerization, respectively. The highly conserved FGFR1 P252 residue we found to be heterozygously exchanged with hydrophilic serine in the melanoma cell line MeWo has previously been shown to be replaced by threonine in lung cancer (2) and arginine in patients with Pfeiffer syndrome (37). The crystal structure of the homologous activating FGFR2 mutant, FGFR2 P252R, revealed the formation of three additional hydrogen bonds with complexed fibroblast growth factor 2 (FGF2). They were predicted to increase the receptor affinity for its specific ligand as well as to allow binding of a different set of ligands (38). Because the hydroxyl group of the P252-replacing serine residue in FGFR1 also has a high potential to form additional hydrogen bonds, the somatic FGFR1 P252S substitution may represent a gain-of-function mutation with functional consequences analogous to those of FGFR2 P252R. This is particularly intriguing in the context of studies demonstrating that blockage of FGFR1 or basic fibroblast growth factor function was associated with suppressed proliferation and survival of melanoma cells (39).
The novel FGFR4 Y367C mutation was detected as a homozygous genotype in the breast cancer cell line MDA-MB-453, and the affected Y367 residue in the extracellular juxtamembrane domain is highly conserved throughout the FGFR family. Remarkably, homologous substitutions in FGFR1 (Y372C), FGFR2 (Y375C), and FGFR3 (Y373C) were shown to cause various osteogenic deficiency syndromes (40) through the formation of intermolecular disulfide bonds that force receptor dimerization and activation. Ligand-independent, constitutive receptor activation has been confirmed in vitro for FGFR1 Y372C (41) and FGFR3 Y373C (42). Furthermore, the oncogenic potential of the FGFR3 Y373C variant has been shown and was suggested to contribute to tumor progression of multiple myeloma (43). Thus, it is most likely that Y367C as the homologous FGFR4 variant also results in basal receptor activation, which strongly indicates an important role of this mutant in cancer.
Nonsense substitutions abrogate tumor suppressor activity. Down-regulation of tumor-suppressive activity is expected for the two nonsense substitutions we detected in EPHB2 and CSK (Fig. 5C). Q722X-mediated truncation and kinase inactivation of EPHB2 in the two prostate cancer cell lines BM-1604 and DU-145 supports the involvement of mutational inactivation in the progression of prostate cancer as proposed by Huusko et al. (44). They showed suppressed growth and colony formation of DU-145 cells upon reconstitution with functional EPHB2. The heterozygous CSK Q26X nonsense substitution detected in the colon cancer cell lines DLD-1 and HCT-15 is consistent with reduced protein levels of this negative regulator of SRC family kinases that were reported for ~60% of human colon cancer cases with elevated SRC activity (45). These data indicate a significant role of CSK nonsense mutations in the development and/or progression of colon carcinoma and therefore strongly suggest the inclusion of SRC kinase inhibitors in the therapeutic regimen of this prevalent malignancy.
The examples discussed above represent only a partial extract of our overall data. Other genetic alterations affecting less investigated PTKs such as members of the AATYK, DDR, EPH, ROR, ROS, or FRK families, as well as tyrosine kinases that more recently captured scientific attention such as HER3 or ACK1, have been found (Supplementary Tables S2-S6). Their identification will support novel functional investigations toward the understanding of the therapeutic value of these kinases.
Low redundancy of PTK gene mutations in human tumors. In agreement with results from previous studies (1–7, 46), our analysis of 254 cancer cell lines and additional primary tumors indicates that mutational patterns might be quite unique for the majority of human tumors, and that the frequency of specific somatic mutations in PTKs is low. Data mining of public databases and the literature revealed that only nine of all sporadic alterations identified in our study have been described before (Supplementary Table S12). Among them are KIT N822K and VEGFR1 R781Q as the only two alterations that were picked up in the currently most comprehensive mutational kinome analysis of human cancer samples (7). The low redundancy of somatic mutations is furthermore reflected by the nonrecurrence that we found for 206 of the 234 mutational events within our panel of tumor cell lines.
Consistent with this picture, none of the seven somatic representatives of our exemplary subset of nonconservative and more frequent alterations found in at least one breast, kidney, or prostate cancer cell line could be detected in any of the 165 primary breast, kidney, and prostate cancer specimens (Supplementary Table S11). In fact, two genetic alterations, YES K113Q and TYRO3 E489K, were found in blood controls and therefore must be considered rare germ line alterations.
Despite the low redundancy of individual mutations, 70 of the tyrosine kinase genes turned out to carry at least one somatic mutation. Although most of our mutations require further experimental evaluation to determine their cancer relevance and in some cases may turn out to represent "passenger" rather than "driver" mutations, this broad incidence of sporadic alterations underscores the central importance of the entire PTK family in oncogenesis. Moreover, it provides further compelling support for the development of multitargeted kinase inhibitors or combination of complementary therapeutics as cancer treatments that may be adapted to the pathologic and genetic parameters of an individual patient. The extensive characterization of established tumor cell lines with respect to transcriptional profiles of genetic variations in this currently most promising cancer target family will aid in the selection of suitable cell systems, data interpretation and target validation, and thereby support preclinical development of novel targeted cancer drugs.
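The recurrence figures quoted above (206 nonrecurrent events out of 234, and the number of genes carrying at least one somatic mutation) are simple tallies over a table of mutation calls. The following minimal Python sketch illustrates how such tallies could be derived. It is not the procedure used in this study: the function name `summarize_recurrence`, the record layout (gene, variant, cell line), and all entries in `toy_records` are hypothetical placeholders chosen for illustration only, and the sketch does not account for related cell lines.

```python
from collections import defaultdict


def summarize_recurrence(records):
    """Tally mutational events from (gene, variant, cell_line) tuples.

    Hypothetical helper for illustration; an "event" is a distinct
    (gene, variant) pair, and it is nonrecurrent if exactly one
    cell line carries it.
    """
    carriers = defaultdict(set)
    for gene, variant, cell_line in records:
        carriers[(gene, variant)].add(cell_line)

    nonrecurrent = sum(1 for lines in carriers.values() if len(lines) == 1)
    genes = {gene for gene, _variant in carriers}
    return {
        "events": len(carriers),
        "nonrecurrent": nonrecurrent,
        "genes_mutated": len(genes),
    }


# Placeholder records, NOT data from the study.
toy_records = [
    ("GENE_A", "Q10X", "LINE_1"),
    ("GENE_A", "Q10X", "LINE_2"),  # same event seen in a second line
    ("GENE_A", "R20K", "LINE_3"),
    ("GENE_B", "G30S", "LINE_1"),
]
summary = summarize_recurrence(toy_records)

# Fraction of nonrecurrent events, using the counts reported in the text.
reported_fraction = 206 / 234

print(summary)
print(f"{reported_fraction:.1%} of reported events were nonrecurrent")
```

With the counts given in the text, the nonrecurrent fraction comes to roughly 88% of all mutational events.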

Figure 2. Characterization of PTK genes with regard to genetic alterations detected in the transcripts of 276 tumor cell lines and control samples. As exemplified here for FGFR4, the spectrum of identified genetic alterations and the corresponding patterns of affected tumor cell lines or control samples were determined for each tyrosine kinase gene (Supplementary Table S8). The total sample number carrying a given sequence variant is indicated, and affected cancer cell lines are subdivided according to their tissue origin. Blue, germ line polymorphisms; yellow, somatic mutations. Heterozygosity is indicated by a hash; the other samples are homozygous carriers of the respective variation.

Figure 3. Distributions of nonsynonymous polymorphisms identified in the TKT of 276 cancer cell lines and control samples. A, rates of germ line alterations. The rates of missense (MS) or nonsense (NS) substitutions, deletions (DEL) and insertions (INS) subdivided into four frequency categories (1, 2-5, 6-10, or >10 affected samples) are summarized. B, domain localization of identified polymorphisms. The number of polymorphisms detected in distinct domains or other protein regions is indicated. C, tissue distribution of germ line variations. The tissue distribution (BL, bladder; BS, bone and soft tissue; BA, brain; BE, breast; CV, cervix and vulva; CO, colon; EP, endometrium and placenta; HN, head and neck; HL, hematopoietic and lymphoid system; KI, kidney; LI, liver; LU, lung; OV, ovary; PA, pancreas; PR, prostate; SK, skin; ST, stomach; TE, testes; TY, thyroid; NO, normal control samples) was determined for all polymorphisms (Supplementary Table S10) and presented here for those described in the text. Paired numerals, number of carriers of the indicated variant as a subset of all cell lines with the same tissue origin that express the corresponding gene regardless of its genotype. Bold type, novel germ line alterations; parenthesized numbers, supplementary references that associate respective polymorphisms with cancer.

Figure 4. Diverging occurrence rates of polymorphisms in different tumor types and/or control samples. The frequency of homozygous (HO; dark blue column) and heterozygous (HE; light blue column) carriers of EGFR R521K, TYK2 V362F, and TNK1 M598delinsEVRSHX was determined. Only tissue origins (for abbreviations, see legend to Fig. 3) with an expression of the corresponding gene in at least 10 samples have been selected for this analysis.

Figure 6. Illustration of known and novel genetic alterations in selected genes. A, SYK. The domain organization and location of genetic alterations are displayed. B, sequence comparison of FGFR1-4. For FGFR1-4, the general domain organization (middle) and sequence comparisons of the linker region connecting the IG-D2-domain and IG-D3-domain (top) as well as a part of the extracellular juxtamembrane region (bottom) are illustrated. Genetic alterations identified in our cell line screen are indicated below, whereas known sequence variants are depicted above the graphical representation of the domain structure. Blue, polymorphisms; yellow, somatic mutations. Numbers in parentheses, number of affected nonrelated cell lines (SH2, Src homology 2 domain; TK, tyrosine kinase domain; S, signal peptide; TM, transmembrane domain; IG, immunoglobulin-like domain).