[CANCER RESEARCH 62, 1284–1288, March 1, 2002] Advances in Brief

Mutation Profiling of Mismatch Repair-deficient Colorectal Cancers Using an in Silico Genome Scan to Identify Coding Microsatellites1

Jane Park,2 Doron Betel,2 Robert Gryfe, Katerina Michalickova, Nando Di Nicola, Steven Gallinger, Christopher W. V. Hogue, and Mark Redston3

Centre for Cancer Genetics, Samuel Lunenfeld Research Institute, Mount Sinai Hospital, and Departments of Laboratory Medicine and Pathobiology, Biochemistry, and Surgery, University of Toronto, Toronto, Ontario, Canada M5G 1X5 [J. P., D. B., R. G., K. M., N. D. N., S. G., C. W. V. H., M. R.], and Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts 02115 [M. R.]

Received 9/26/01; accepted 1/14/02.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 Supported in part by the National Cancer Institute of Canada with funds from the Terry Fox Run. M. R. was the recipient of a Research Scientist Award from the National Cancer Institute of Canada supported with funds provided by the Canadian Cancer Society.
2 These authors contributed equally to this work.
3 To whom requests for reprints should be addressed, at Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, 75 Francis Street, Boston, MA 02115. Phone: (617) 732-7592; Fax: (617) 264-6301; E-mail: [email protected].
4 The abbreviations used are: CRC, colorectal cancer; NCBI, National Center for Biotechnology Information; MMR, mismatch repair; MSH3, mutS (Escherichia coli) homologue 3; MSI, microsatellite instability; MSI-H, high frequency microsatellite instability; PMS2, postmeiotic segregation increased (S. cerevisiae) 2.
5 Internet address: ftp://ncbi.nlm.nih.gov/toolbox/ncbi_tools.
6 K. Michalickova, G. D. Bader, R. Isserlin, and C. W. V. Hogue. SeqHound biological sequence database system as a platform for bioinformatics research, manuscript in preparation.
7 Internet address: http://bioinfo.mshri.on.ca.
8 Internet address: http://bioinfo.mshri.on.ca/kangaroo.
Downloaded from cancerres.aacrjournals.org on October 4, 2021. © 2002 American Association for Cancer Research.

Abstract

Human colorectal, endometrial, and gastric cancers with defective DNA mismatch repair (MMR) have microsatellite instability, a unique molecular alteration characterized by widespread frameshift mutations of repetitive DNA sequences. We developed “Kangaroo,” a bioinformatics program for searches in nucleotide and protein sequence databases, and performed an in silico genome scan for DNA coding microsatellites that may have novel mutations in MMR-deficient cancers. Examination of 29 previously untested coding polyadenines revealed widespread mutations in MMR-deficient colorectal cancers, with the highest frequencies in ERCC5, CASP8AP2, p72, RAD50, CDC25, RECQL1, CBF2, RACK7, GRK4, and DNAPK (range, 10–33%). This algorithm allows comprehensive mutation profiling of MMR-deficient cancers, an important step in understanding the pathogenesis of these neoplasms.

Introduction

CRC4 is the second leading cause of cancer death in North America, providing the impetus for research aimed at understanding the biology of this disease. Among the important discoveries, in recent years it has become clear that there are at least two major molecular pathogenetic pathways to CRC: (a) MSI, because of defects in DNA MMR; and (b) chromosomal instability, because of defects in mitotic spindle apparatus and other genes (1). Importantly, the pathological and clinical attributes of the cancers arising out of each of these two pathways are different. MSI-H CRCs are more often located in the right colon, are typically polypoid, and have high grade histology with a prominent lymphoid reaction (2). This pathway also underlies most cases of hereditary nonpolyposis colon cancer (3) and leads to cancers that display less aggressive growth characteristics with fewer metastases and better overall survival (4). The fundamental difference between these two cancer pathways lies in the underlying mechanism of genomic instability (1). CRCs with chromosomal instability are characterized by widespread chromosomal deletions and translocations, whereas those with MSI have ubiquitous DNA mutations (3, 5, 6). As predicted by bacterial and yeast models, MMR deficiency leads to instability of short repeated sequences, particularly mononucleotides and dinucleotides (7). This is exemplified by inactivating frameshift mutations in coding microsatellites in MSI-H CRCs, most notably transforming growth factor, β receptor II (8). Therefore, an in silico search for genes with coding microsatellites should uncover the novel genetic targets involved in the molecular progression of these neoplasms. Unfortunately, the current query programs of the public sequence databases have two limitations that prohibit such a search. First, they do not support searches for low complexity regions, because these regions are filtered out as “background noise,” and second, they do not allow searching solely within human open reading frames. We devised a computer program, “Kangaroo,” that searches for DNA sequences in annotated human GenBank records. Although GenBank is a highly redundant database, we identified many records containing coding microsatellite sequences and demonstrated mutations in a number of novel target genes that may be involved in the pathogenesis of MMR-deficient cancers. This approach unveils the possibility of comprehensive mutation profiling of MMR-deficient cancers and will be integral to uncovering the biologically important molecular alterations of these neoplasms.

Materials and Methods

Bioinformatics Search Algorithm. We developed a two-step search algorithm, Kangaroo, written in the C programming language using the NCBI toolkit (J. Ostell, NCBI Software Development Toolkit, 1997)5 and developed on a dual Pentium II processor Linux machine. In the first step, NCBI GenBank records are retrieved, and coding region sequences are parsed out from all of the records. NCBI GenBank records are accessed from our in-house database (SeqHound),6,7 which mirrors the latest NCBI GenBank release (v.123.0 Apr.2001), the NCBI taxonomy database, and the Brookhaven protein databank (9). Coding region information was derived from the sequence annotations as entered in the GenBank flatfile by the individual record submitters. Although GenBank provides a reliable source of regularly updated records, it is a highly redundant database, and, thus, a single gene may be represented as many as 20 times in our searches. In the second step, Kangaroo searches through coding regions for the DNA pattern submitted by the user. We designed Kangaroo to permit searches of short and/or low complexity DNA sequences and query sequences that contain IUPAC DNA ambiguity codes. The search algorithm is based on Regular Expression functions and is part of the NCBI C toolkit. The strategy described here was extended to search different organism databases and to search general DNA and protein records. To develop this search algorithm into a user-friendly, public, bioinformatics tool, we amalgamated the features into a web-based application. Kangaroo,8 which runs on a four-processor Sun Solaris server, can perform searches through amino acids, DNA, and annotated coding regions in 10 different organisms with custom flexibility that is not available in other recent database search tools (10–12).

Tissue Samples and MSI Testing. Patients (<50 years of age) with resected CRCs were identified through the Ontario Cancer Registry in a population-based study (4). Paraffin-embedded tissues were obtained and a histopathological review performed to locate regions of high neoplastic cellularity (>50%). Tissue was microdissected and DNA extracted as described (4). Briefly, tissue was scraped from two to three unstained 10-μm slides into 50–100 μl of lysis buffer [10 mM Tris-Cl (pH 7.0), 100 mM KCl, 2.5 mM MgCl2, and 0.45% Tween 20]. After a 10-min incubation at 95°C, tissue samples were subjected to proteinase K (20 mg/ml, 15–35 μl) digestion overnight at 65°C. A total of 16 human cancer cell lines were obtained from the American Type Culture Collection (Manassas, VA), including 7 MSI-H CRC cell lines (SW48, LS174T, LS411, LoVo, HCT-8, HCT-116, and DLD-1), 1 MSI-H endometrial carcinoma cell line (HEC1A), and 8 microsatellite stable CRC cell lines (HT-29, SW480, SW620, SW837, SW1116, Colo320HSR, LS513, and LS1034; Refs. 13–16). DNA was extracted from the cell lines using DNeasy Tissue kit (Qiagen, Mississauga, ON), according to the manufacturer’s instructions. MSI was tested in the primary CRCs by PCR of five reference panel loci outlined in the National Cancer Institute Workshop on Microsatellite Instability, and CRCs were classified as microsatellite stable, low frequency microsatellite instability, or MSI-H as defined (17). The loci used in our study were BAT-25, BAT-26, D2S123, D5S346, D17S250, BAT-40, BAT-RII, D18S58, D18S69, and D17S787, with PCR conditions as described (4, 17). In total, there were 102 MSI-H primary CRCs available for mutation screening (results of the MSI testing have been published previously; Ref. 4). The MSI status of the cell lines was confirmed by PCR analysis of the BAT26 locus.

Mutation Profiling of Coding Microsatellites. PCR primers corresponding to selected coding regions were designed to amplify a product <150 bp (primer sequences and annealing temperatures available on request). The reverse primer was end-labeled in a final volume of 10 μl; 0.3 μM of the reverse primers was combined with 60 μCi of [γ-33P]ATP (Easytides; NEN–US, Boston, MA) and 5.88 units of FPLCpure Polynucleotide Kinase (Amersham Pharmacia Biotech, Baie d’Urfe, Quebec, Canada). The reaction was incubated at 37°C for 1 h and denatured at 90°C for 2 min. In a 15-μl PCR reaction, 2 μl of genomic DNA from primary CRCs or 1 μl of DNA from the cell lines was combined with 10× PCR buffer [200 mM Tris-HCl (pH 8.4), 500 mM KCl], 1.5 mM MgCl2, 0.13 mM deoxynucleotide triphosphates, 0.4 μM of each forward and reverse primer, and 1 unit of Platinum Taq polymerase (Life Technologies, Inc., Burlington, Ontario, Canada). PCR cycling conditions were 2 min at 94°C followed by 35 cycles of 15 s at 94°C, 15 s at annealing temperature, and 20 s at 72°C (DNA Engine, model PTC-200; MJ Research, Watertown, MA). After PCR, 7.5 μl of denaturing formamide dye was added, the samples were denatured at 94°C for 4 min, cooled, and loaded on to a denaturing 6% polyacrylamide gel. The gel was transferred onto 3-mm Whatman paper, dried, and exposed to Kodak Biomax film (Rochester, NY). All of the putative mutations were confirmed by sequencing. Templates were reamplified, PCR products were gel purified using the Concert Rapid PCR Purification System (Life Technologies, Inc.), and the reverse primer was used for sequencing using the Thermo Sequenase radiolabeled terminator cycle sequencing kit (Amersham Pharmacia Biotech, Cleveland, OH) according to the manufacturer’s instructions.

Results

Using Kangaroo, we identified many records with mononucleotide tracts at least six nucleotides in length in human coding sequences. Because GenBank is a highly redundant database, the number of identified records with mononucleotides may be 5–10 times higher than the actual number of coding mononucleotides in human sequences. The number of six-base mononucleotides was by far the most frequent (80% of the total), and the frequency dropped sharply with increasing tract length (Fig. 1). Polyadenine tracts were present at much higher frequency than all of the other tracts. In particular, (A)8 tracts were about five times more abundant than any other type of (N)8 repeat. In addition, whereas there were occasional very long (12 and 13 nucleotide) tracts, these were all A/T tracts, and similar length C/G tracts were not identified. There were a number of mononucleotides >13 bases in length identified in the database, the majority of which were putative genes identified from sequencing projects, and the repeats were most frequently located at the extreme 3′ end of the entry (data not shown). Several entries were isolated mutations that extended open reading frames to include repeats in the 3′ UTR. A single (A)32 was also identified in the original entry for regulator of mitotic spindle assembly 1 (HUMPROTXA/RMSA-1), but this sequence was subsequently recognized to be a cloning artifact. The longest definite coding mononucleotide identified, an (A)14, was in melastatin 1 (MLSN1).

Fig. 1. Semilog distribution of mononucleotide repetitive sequences identified in GenBank records of annotated human coding region. Mononucleotides less than six bases in length were too abundant to enumerate using Kangaroo. For the purposes of this figure, searches for coding microsatellites >13 nucleotides in length were truncated because of apparent ambiguities in many of these sequence entries (see text). Because of the redundancy of GenBank, the number of actual coding mononucleotides is likely to be much smaller than that indicated by the number of records identified.

To investigate the role of some of these novel candidates, we selected 18 genes containing polyadenine tracts measuring at least eight nucleotides in length and screened for mutations in MMR-deficient CRCs. Genes were selected because of their probable function in cell cycle regulation or transcription activation. Mutation screening revealed frameshift mutation frequencies varying widely among the genes (from 0% to 33%). Half of the genes had mutation frequencies of ≥5% (Table 1). The highest mutation frequencies were found in DNA-activated protein kinase catalytic subunit (DNAPK), G protein-coupled receptor kinase 2 (Drosophila)-like (GRK4), protein kinase C-binding protein 1 (RACK7), and CCAAT box binding protein (CBF2), all of which were >15%. The remaining nine genes had mutation frequencies from 0 to 4% (data not shown). For comparison, we also screened for mutations in seven genes with previously reported coding polyadenine microsatellites, including five genes with mutation frequencies of ≥5% (Table 1). The overall results were similar to those described previously (17). Of note, the overall mutation frequency in the tumors was 61 of 196 (31%) for (A)10 tracts, 77 of 461 (17%) for (A)9 tracts, and 112 of 1560 (7.2%) for (A)8 tracts, suggesting an association between polyadenine length and mutation frequency.

Table 1 Mutation frequency of coding polyadenine microsatellites in MSI-H colorectal cancers

| Gene symbol(a) | Gene name; function | Accession no. | Location(b) | Coding microsatellite(c) | Mutation frequency (%) |
|---|---|---|---|---|---|
| Newly identified coding polyadenines | | | | | |
| DNAPK/PRKDC | Protein kinase, DNA-activated, catalytic subunit; DNA double-strand break repair and recombination | NM_006904 | 487 | (A)10 | 33/99 (33) |
| | | | 10807 | (A)8 | 8/98 (8) |
| GRK4/GPRK2L | G protein-coupled receptor kinase 2 (Drosophila)-like; desensitize G protein-coupled receptors | NM_005307 | 656 | (A)9 | 19/91 (21) |
| RACK7/PRKCBP1 | Protein kinase C-binding protein 1; anchors protein kinase C-beta-1 | NM_012408 | 319 | (A)8 | 17/91 (19) |
| CBF2 | CCAAT box binding protein; transcriptional activation | NM_005760 | 1420 | (A)9 | 14/91 (16) |
| RECQL1 | RecQ (DNA helicase) protein-like; DNA repair | NM_032941 | 112 | (A)9 | 11/93 (12) |
| CDC25C | Cell division cycle 25C; triggers entry into mitosis | NM_001790 | 724 | (A)8 | 10/93 (11) |
| ERCC5 | Excision repair cross-complementing, complementation group 5; nucleotide excision and transcription-coupled repair of DNA damage | NM_000123 | 2743 | (A)9 | 8/80 (10) |
| TF-34 | Human Krueppel-related DNA binding protein; zinc finger protein | GI: 1124875 | 500 | (A)8 | 6/86 (7) |
| BLYM | Burkitt’s lymphoma transforming gene; unknown gene product | GI: 179497, NM_005179 | 78 | (A)8 | 5/96 (5) |
| Previously reported coding polyadenines | | | | | |
| MSH3 | MutS (E. coli) homolog 3; DNA mismatch repair | NM_002439 | 1141 | (A)8 | 42/101 (42) |
| ATR | Ataxia telangiectasia and Rad3 related; DNA recombination, damage checkpoint, double-strand break repair | NM_001184 | 2311 | (A)10 | 28/97 (29) |
| BLM | Bloom syndrome; ATPase/helicase, may suppress inappropriate recombination | NM_000057 | 1536 | (A)9 | 22/97 (23) |
| BCL10 | B-cell CLL/lymphoma 10; activates nuclear factor κB, promotes apoptosis, suppresses transformation | NM_003921 | 129 | (A)8 | 7/102 (7) |
| | | | 493 | (A)7 | 2/102 (2) |
| PMS2 | Postmeiotic segregation increased (S. cerevisiae) 2; DNA mismatch repair | NM_000535 | 1232 | (A)8 | 5/99 (5) |

a Includes only those genes with coding microsatellite mutation frequencies of ≥5%.
b The numerical position of the repeat sequence is denoted as its position within the coding sequence and is retrieved by Kangaroo from the GenBank Flatfile.
c In addition to the identified polyadenine tract, all other coding mononucleotides in the same genes measuring at least seven bases in length were also screened for mutations.

To compare the mutation profiles of cultured CRC cells, we tested a subset of the same genes in a panel of MSI-H cell lines. The mutation frequencies ranged from zero (11 of the 24 genes tested) to 50% (MSH3). Comparison of the mutation frequencies in the primary CRCs revealed that the results in the cell lines were similar (Fig. 2). Four of the six genes with mutation frequencies >20% in the cell lines were also >20% in the primary CRCs. Similarly, there were 18 genes with mutation frequencies <20% in the cell lines, and 17 of these were <20% in the primary CRCs. Whereas some low frequency mutations were detected in the primary tumors but not in the cell lines, this probably reflects the statistical variability of a mutation screen in a small panel of lines. The overall mutation frequency in the cell lines for all of the genes tested, 20 of 185 (10.8%), was the same as the overall mutation frequency in the primary CRCs for the corresponding genes, 241 of 2201 (10.9%). These findings suggested that the MSI-H cell lines could be used in a screen to identify targets that were mutated frequently in the primary MSI-H CRCs. We used this strategy to test 11 additional coding polyadenines, and identified 3 genes, RAD50 (Saccharomyces cerevisiae) homologue (RAD50), DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 17 (DDX17/p72), and CASP8-associated protein 2 (CASP8AP2/RIP25), with coding microsatellites mutated in ≥10% of the samples (Table 2). The remaining 8 genes had no coding microsatellite mutations identified in any of the cell lines (data not shown).

Fig. 2. Coding mononucleotide mutation frequencies in MMR-deficient primary CRCs and cell lines. The genes are organized into three groups based on the mutation frequency in the cell lines: 0% (no cell lines with instability), 1–20% (one cell line with coding instability), and >20% (two or more cell lines with coding instability). Within each of these groups, the genes are arranged according to the mutation frequency in the primary tumors. For some genes, only seven tumor cell lines amplified successfully.

Table 2 Mutation frequency of coding polyadenine microsatellites in MSI-H human cancer cell lines

| Gene symbol(a) | Gene name; function | Accession no. | Location(b) | Coding microsatellite(c) | Mutation frequency (%) |
|---|---|---|---|---|---|
| RAD50 | RAD50 (S. cerevisiae) homolog, DNA double-strand break repair/recombination | NM_005732 | 2175 | (A)9 | 2/8 (25) |
| | | | 2812 | (A)8 | 0/8 |
| DDX17/p72 | DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 17; ATPase involved in transcription/translation | NM_006386 | 118 | (A)8 | 1/7 (14) |
| CASP8AP2/RIP25 | CASP8 associated protein 2; interacts with and activates caspase-8 in FAS mediated apoptosis | NM_02115 | 3700 | (A)8 | 1/8 (12.5) |
| | | | 3382 | (A)9 | 0/8 |
| | | | 2855 | (A)7 | 0/8 |
| | | | 3794 | (A)7 | 0/8 |

a Includes only those genes with coding microsatellite mutation frequencies of 10% or greater.
b The numerical position of the repeat sequence is denoted as its position within the coding sequence and is retrieved by Kangaroo from the GenBank Flatfile.
c In addition to the identified polyadenine tract, all other coding mononucleotides in the same genes measuring at least seven bases in length were also screened for mutations.

Discussion

DNA MMR deficiency results in widespread frameshift mutations because of a failure to repair one and two base slippage events occurring during replication of repetitive DNA sequences (3, 5–7). The presence of coding microsatellite frameshift mutations in CRCs with defective DNA MMR suggests a molecular mechanism for this distinctive pathogenetic pathway (8, 17). Kangaroo is an important advance for investigating these neoplasms, because it permits a comprehensive in silico genome scan to identify previously unrecognized coding microsatellites. The high frequency of polyadenine tracts identified by Kangaroo was particularly striking, and the relative absence of longer G/C mononucleotides raises the possibility that these homopolymers may be more highly selected against in evolution. This is consistent with the greater susceptibility of G/C homopolymeric tracts to instability in the presence of DNA MMR deficiency (18). The total number of records with coding mononucleotides identified by Kangaroo is similar to that reported by Woerner et al. (11), which also used a highly redundant database (EMBL) for source records. In contrast, the study by Mori et al. (12) identified a much smaller number of coding mononucleotides, most likely reflecting the fact that source records were obtained from the less redundant Unigene database. Wren et al. (10) enumerated the presence of coding microsatellites, predominantly trinucleotide repeats, based on the identification of in-frame repetitive elements only.

We selected a number of biologically important candidate genes and identified widespread coding microsatellite mutations, similar to other recent mutation surveys of coding sequences (11, 12, 19, 20). This suggests that the coding microsatellites identified by Kangaroo likely represent numerous novel molecular progression targets of MMR-deficient neoplasms. Several of the genes we found to have the highest coding microsatellite mutation frequencies are candidate genes in tumorigenesis. Mutation of DNA-dependent protein kinase catalytic subunit (DNAPK), which is involved in DNA-damage response double-strand break repair, causes the severe combined immunodeficiency phenotype and lymphoma predisposition in mice. Although DNA-dependent protein kinase catalytic subunit (DNAPK) inactivation is not reported in human neoplasms, several genes with related functions, including ataxia telangiectasia mutated (ATM), breast cancer 1 early onset (BRCA1), breast cancer 2 early onset (BRCA2), and p53, are directly implicated in tumorigenesis. In addition, we identified mutations in two other genes involved in DNA repair, RecQ (DNA helicase) protein-like (RECQL1) and excision repair cross-complementing complementation group 5 (ERCC5), and confirmed the presence of mutations in MSH3, ataxia telangiectasia and Rad3 related (ATR), Bloom syndrome (BLM), and RAD50 (S. cerevisiae) homologue (RAD50). Thus, there is a growing list of DNA repair genes that have coding microsatellites inactivated in MMR-deficient tumorigenesis [including also mutS (Escherichia coli) homologue 6 and methyl-CpG binding domain protein 4 (MBD4)]. The curious presence of hypermutable repetitive sequences in several of the DNA MMR genes has been noted, raising theories about a role in evolutionary modulation (21).

We also found relatively high mutation frequencies in cell division cycle 25C (CDC25C), which is involved in triggering entry into mitosis, and CASP8 associated protein 2 (CASP8AP2/RIP25), which is involved in FAS-mediated apoptosis. In combination with mutations in the coding 8 base polyguanine tract of BCL2-associated X protein, these findings suggest that there may be a host of coding microsatellite targets contributing to abrogation of cell cycle and apoptosis regulation in MMR-deficient neoplasms. In contrast, several other DNA repair, cell cycle, and apoptosis regulatory genes, including centromere protein F (CENP-F/MITOSIN), ATP-dependent DNA ligase III (DNA ligase III), transcription factor Dp-2 (DP-2), protein tyrosine phosphatase nonreceptor type 13 (FAP-1/PTPN13), cell division cycle 7 (S. cerevisiae homologue)-like 1 (CDC7), REV1 (yeast homologue)-like (REV1), PMS2, apoptotic protease activating factor (APAF1), and checkpoint (Schizosaccharomyces pombe) homologue (CHK1), harbor relatively low mutation frequencies. Finally, the negligible mutation frequencies in two putative apoptosis inhibitors, BCL2-associated athanogene 4/silencer of death domain (BAG4/SODD) and BCL2-associated X protein antagonist selected in saccharomyces 1 (BASS1), as well as several signaling pathway genes, Vaccinia-related kinase 2 (VRK2), mitogen inducible 2 (MIG-2), PDZ domain-containing guanine nucleotide exchange factor I (GNEF/LOC51735), mitogen activated protein kinase kinase kinase (MAP3K4/MTK1/MEKK4), renal tumor antigen (MOK/RAGE), and Ras-like protein (TC10), also raise the possibility that mutations in some coding microsatellites could be selected against in tumor development.

Although some clues can be obtained from overall mutation frequencies, it is not possible to infer the biological significance of coding microsatellite mutations without functional studies of the cell biological effects of these alterations. It is entirely possible that the frameshift mutations in a given coding microsatellite could be bystanders, even in genes with putative roles in apoptosis and cell cycle regulation. Furthermore, interpretation of mutation frequency requires an understanding of the role that repeat length, nucleotide type, and adjacent sequence context play in determining sequence stability (17). Attempts have been made to compare coding to noncoding mutation frequencies for microsatellites of similar sequence composition (8, 17); however, much larger surveys of intronic sequences are required. Polycytosines and polyguanines may have greater instability than polyadenines and polythymines (18), precluding comparisons of mutation frequencies for different tract compositions. Therefore, additional studies will be required before the functional significance of the mutations identified by us is known.

DNA MMR deficiency underlies a significant proportion of CRCs, endometrial cancers, and gastric cancers. We developed Kangaroo, a powerful bioinformatics search algorithm, to perform an in silico genome scan for coding microsatellites that may be mutated in DNA MMR-deficient tumors. It will now be possible to develop a comprehensive mutation profile across hundreds of coding microsatellites that are putative targets for MMR-deficient tumors. Whereas it is often difficult to understand the biological and functional importance of some of these alterations, the delineation of a comprehensive mutation profile may be analogous to the development of a transcriptional profile of a tumor. This type of systematic approach will be essential to better understand the molecular pathogenesis of MMR-deficient neoplasms.

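The length dependence and the cell line comparison reported in Results can be recomputed from the counts given in the text. The sketch below does that arithmetic in Python and, as an addition that is not part of the paper's analysis, attaches a 95% Wilson score interval to each proportion to give a rough sense of the sampling variability the authors mention for small panels. Pooling counts in this way ignores differences between genes, so the intervals are illustrative only.

```python
import math


def frequency(mutated, tested):
    """Mutation frequency as a percentage."""
    return 100.0 * mutated / tested


def wilson_interval(mutated, tested, z=1.96):
    """Wilson score interval for a binomial proportion, in percent."""
    p = mutated / tested
    denom = 1.0 + z * z / tested
    center = (p + z * z / (2.0 * tested)) / denom
    half = z * math.sqrt(p * (1.0 - p) / tested
                         + z * z / (4.0 * tested * tested)) / denom
    return 100.0 * (center - half), 100.0 * (center + half)


# Counts as reported in Results (mutated, tested).
TRACTS = {"(A)10": (61, 196), "(A)9": (77, 461), "(A)8": (112, 1560)}
POOLED = {"MSI-H cell lines": (20, 185), "primary MSI-H CRCs": (241, 2201)}

for label, (mutated, tested) in list(TRACTS.items()) + list(POOLED.items()):
    low, high = wilson_interval(mutated, tested)
    print("%s: %d/%d = %.1f%% (95%% Wilson interval %.1f-%.1f%%)"
          % (label, mutated, tested, frequency(mutated, tested), low, high))
```

The point estimates reproduce the rounded values in the text (31%, 17%, and 7.2% for the three tract lengths; 10.8% and 10.9% for the pooled comparison).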

References

1. Lengauer, C., Kinzler, K. W., and Vogelstein, B. Genetic instabilities in human cancer. Nature (Lond.), 396: 643–649, 1998.
2. Kim, H., Jen, J., Vogelstein, B., and Hamilton, S. R. Clinical and pathological characteristics of sporadic colorectal carcinomas with DNA replication errors in microsatellite sequences. Am. J. Pathol., 145: 148–156, 1994.
3. Aaltonen, L. A., Peltomaki, P., Leach, F. S., Sistonen, P., Pylkkanen, L., Mecklin, J. P., Jarvinen, H., Powell, S. M., Jen, J., Hamilton, S. R., Petersen, G. M., Kinzler, K. W., Vogelstein, B., and de la Chapelle, A. Clues to the pathogenesis of familial colorectal cancer. Science (Wash. DC), 260: 812–816, 1993.
4. Gryfe, R., Kim, H., Hsieh, E. T., Aronson, M. D., Holowaty, E. J., Bull, S. B., Redston, M., and Gallinger, S. Tumor microsatellite instability and clinical outcome in young patients with colorectal cancer. N. Engl. J. Med., 342: 69–77, 2000.
5. Ionov, Y., Peinado, M. A., Malkhosyan, S., Shibata, D., and Perucho, M. Ubiquitous somatic mutations in simple repeated sequences reveal a new mechanism for colonic carcinogenesis. Nature (Lond.), 363: 558–561, 1993.
6. Thibodeau, S. N., Bren, G., and Schaid, D. Microsatellite instability in cancer of the proximal colon. Science (Wash. DC), 260: 816–819, 1993.
7. Sia, E. A., Kokoska, R. J., Dominska, M., Greenwell, P., and Petes, T. D. Microsatellite instability in yeast: dependence on repeat unit size and DNA mismatch repair genes. Mol. Cell Biol., 17: 2851–2858, 1997.
8. Markowitz, S., Wang, J., Myeroff, L., Parsons, R., Sun, L., Lutterbaugh, J., Fan, R. S., Zborowska, E., Kinzler, K. W., and Vogelstein, B. Inactivation of the type II TGF-β receptor in colon cancer cells with microsatellite instability. Science (Wash. DC), 268: 1336–1338, 1995.
9. Bernstein, F. C., Koetzle, T. F., Williams, G. J., Meyer, E. F. J., Brice, M. D., Rodgers, J. R., Kennard, O., Shimanouchi, T., and Tasumi, M. The Protein Data Bank: a computer-based archival file for macromolecular structures. Arch. Biochem. Biophys., 185: 584–591, 1978.
10. Wren, J. D., Forgacs, E., Fondon, J. W., III, Pertsemlidis, A., Cheng, S. Y., Gallardo, T., Williams, R. S., Shohet, R. V., Minna, J. D., and Garner, H. R. Repeat polymorphisms within gene regions: phenotypic and evolutionary implications. Am. J. Hum. Genet., 67: 345–356, 2000.
11. Woerner, S. M., Gebert, J., Yuan, Y. P., Sutter, C., Ridder, R., Bork, P., and von Knebel, D. M. Systematic identification of genes with coding microsatellites mutated in DNA mismatch repair-deficient cancer cells. Int. J. Cancer, 93: 12–19, 2001.
12. Mori, Y., Yin, J., Rashid, A., Leggett, B. A., Young, J., Simms, L., Kuehl, P. M., Langenberg, P., Meltzer, S. J., and Stine, O. C. Instabilotyping. Comprehensive identification of frameshift mutations caused by coding region microsatellite instability. Cancer Res., 61: 6046–6049, 2001.
13. Cottu, P. H., Muzeau, F., Estreicher, A., Flejou, J. F., Iggo, R., Thomas, G., and Hamelin, R. Inverse correlation between RER+ status and p53 mutation in colorectal cancer cell lines. Oncogene, 13: 2727–2730, 1996.
14. Hoang, J. M., Cottu, P. H., Thuille, B., Salmon, R. J., Thomas, G., and Hamelin, R. BAT-26, an indicator of the replication error phenotype in colorectal cancers and cell lines. Cancer Res., 57: 300–303, 1997.
15. Sparks, A. B., Morin, P. J., Vogelstein, B., and Kinzler, K. W. Mutational analysis of the APC/β-catenin/Tcf pathway in colorectal cancer. Cancer Res., 58: 1130–1134, 1998.
16. Schwartz, S., Jr., Yamamoto, H., Navarro, M., Reventos, J., and Perucho, M. Frameshift mutations at mononucleotide repeats in caspase-5 and other target genes in endometrial and gastrointestinal cancer of the microsatellite mutator phenotype. Cancer Res., 59: 2995–3002, 1999.
17. Boland, C. R., Thibodeau, S. N., Hamilton, S. R., Sidransky, D., Eshleman, J. R., Burt, R. W., Meltzer, S. J., Rodriguez-Bigas, M. A., Fodde, R., Ranzani, G. N., and Srivastava, S. A National Cancer Institute Workshop on Microsatellite Instability for cancer detection and familial predisposition: development of international criteria for the determination of microsatellite instability in colorectal cancer. Cancer Res., 58: 5248–5257, 1998.
18. Zhang, L., Yu, J., Willson, J. K., Markowitz, S. D., Kinzler, K. W., and Vogelstein, B. Short mononucleotide repeat sequence variability in mismatch repair-deficient cancers. Cancer Res., 61: 3801–3805, 2001.
19. Forgacs, E., Wren, J. D., Kamibayashi, C., Kondo, M., Xu, X. L., Markowitz, S., Tomlinson, G. E., Muller, C. Y., Gazdar, A. F., Garner, H. R., and Minna, J. D. Searching for microsatellite mutations in lung, breast, ovarian and colorectal cancer. Oncogene, 20: 1005–1009, 2001.
20. Duval, A., Rolland, S., Compoint, A., Tubacher, E., Iacopetta, B., Thomas, G., and Hamelin, R. Evolution of instability at coding and non-coding repeat sequences in human MSI-H colorectal cancers. Hum. Mol. Genet., 10: 513–518, 2001.
21. Chang, D. K., Metzgar, D., Wills, C., and Boland, C. R. Microsatellites in the eukaryotic DNA mismatch repair genes as modulators of evolutionary mutation rate. Genome Res., 11: 1145–1146, 2001.
