Mouse telocentric sequences reveal a high rate of homogenization and possible role in Robertsonian translocation

Paul Kalitsis*†‡, Belinda Griffiths*†, and K. H. Andy Choo*†‡

*Murdoch Childrens Research Institute, Royal Children’s Hospital, Melbourne, Victoria 3052, Australia; and †Department of Pediatrics, University of Melbourne, Royal Children’s Hospital, Melbourne, Victoria 3052, Australia

Edited by Elizabeth Blackburn, University of California, San Francisco, CA, and approved April 17, 2006 (received for review January 10, 2006) The telomere and centromere are two specialized structures of p-arms or, indeed, those of other eukaryotic telocentric chromo- eukaryotic that are essential for sta- somes. bility and segregation. These structures are usually characterized Previous attempts by others at isolating the p-arm telomeric and by large tracts of tandemly repeated DNA. In mouse, the two subtelomeric sequences of the mouse chromosomes have largely structures are often located in close proximity to form telocentric been unsuccessful because of the nature of the DNA chromosomes. To date, no detailed sequence information is avail- residing in these regions and difficulties associated with anchoring able across the mouse telocentric regions. The antagonistic mech- these sequences to individual chromosomes. We have identified anisms for the stable maintenance of the mouse telocentric karyo- fosmid clones generated for the C57BL͞6 mouse project type and the occurrence of whole-arm Robertsonian translocations (8), containing large randomly sheared inserts that bridge the gap remain enigmatic. We have identified large-insert fosmid clones between the p-arm telomere and centromere. The study of these that span the telomere and centromere of several mouse chromo- clones has enabled us to establish the true molecular nature of the some ends. Sequence analysis shows that the distance between mouse telocentric p-arm regions and have allowed us to propose the telomeric T2AG3 and centromeric minor repeats range possible mechanisms involving karyotype maintenance and from 1.8 to 11 kb. The telocentric regions of different mouse evolution. chromosomes comprise a contiguous linear order of T2AG3 repeats, a highly conserved truncated long interspersed element Results 1 repeat, and varying amounts of a recently discovered telocentric Identification of Fosmid Clones with Telomeric and Minor Satellite tandem repeat that shares considerable identity with, and is DNA. Previous studies using pulsed-field gel electrophoresis inverted relative to, the centromeric minor satellite DNA. The (PFGE) (5, 6) and stretched chromatin fiber FISH (7) have telocentric domain as a whole exhibits the same polarity and a high suggested that the mouse ‘‘short-arm’’ telomere [p-telomeric (p- sequence identity of >99% between nonhomologous chromo- tel)] was situated in close proximity to the centromeric mouse minor somes. This organization reflects a mechanism of frequent recom- satellite DNA. In an attempt to bridge the gap between the p-tel and binational exchange between nonhomologous chromosomes that the centromere, we used the fosmid end-sequence clone library should promote the stable evolutionary maintenance of a telocen- created from the public C57BL͞6 mouse genome project (8). This tric karyotype. It also provides a possible mechanism for occasional library provides a relatively unbiased representation of the inverted mispairing and recombination between the oppositely C57BL͞6 mouse genome, because the DNA inserts were isolated oriented TLC and minor satellite repeats to result in Robertsonian from randomly sheared genomic DNA, and the fosmid clones were translocations. maintained in a single-copy vector in bacterial cells to minimize recombination and rearrangement between tandemly repeated centromere ͉ chromosome ͉ evolution ͉ telomere DNA. A telomere repeat (TTAGGG)10 and a mouse minor satellite ukaryotic chromosomes contain telomeres and centromeres, monomer unit were used to BLAST search the end-sequence library. Eboth being essential for faithful chromosome segregation dur- Twelve fosmid clones were identified with both telomere and minor ing division. The telomere comprises functionally important satellite hits, suggesting that the telomere and centromere were small tandem repeats and is a cap-like structure that protects the within 40 kb (Fig. 1A). We sequenced the ends of these fosmid ends of chromosomes from degradation and aberrant fusion with clones to confirm their identity (Table 1, which is published as other chromosome ends (1). The centromere, on the other hand, is supporting information on the PNAS web site). Three clones, usually located along the chromosome arm and is essential for #0773, #1977, and #1022, contained the same insert, because they mitotic spindle capture and checkpoint control, sister chromatid displayed identical start positions for each end; therefore, only cohesion and release, and cytokinesis (2, 3). The centromere of clone #0773 is shown in Fig. 1A. Of the 12 fosmid clones, one was higher typically consists of tandemly repeated DNA found to have no telomere sequence and was eliminated from this arrays that are devoid of . In some , such a tandem study. For the remaining clones, the TTAGGG repeat was con- served and Ͻ500 bp in length. repeat structure is absent, as seen in the point centromeres of the budding yeast and the ectopic centromeres (or neocentromeres)

that form in nonrepetitive regions of the genome (4). Conflict of interest statement: No conflicts declared. The karyotype of the mouse genus, Mus, is made up of 19 pairs This paper was submitted directly (Track II) to the PNAS office. of telocentric autosomes and a telocentric X chromosome. Here, we Abbreviations: MEF, mouse embryonic fibroblast; PFGE, pulsed-field gel electrophoresis; define a telocentric chromosome as having no obvious short arm at tL1, truncated long interspersed nucleotide element 1; CENP, centromere protein; Rob, a cytogenetic level. The Y chromosome contains a short distal arm Robertsonian; p-tel, p-telomeric. and can be defined as being acrocentric. Molecular cytogenetic Data deposition: The sequences reported in this paper have been deposited in the GenBank studies have placed the mouse centromeric minor satellite DNA database (accession nos. DQ322707–DQ322722). 10–1,000 kb away from the end of the telocentric short arm (p-arm) ‡To whom correspondence may be addressed. E-mail: [email protected] or (5–7). However, what remains unknown are the precise DNA [email protected]. sequence and molecular organization of the mouse telocentric © 2006 by The National Academy of Sciences of the USA

8786–8791 ͉ PNAS ͉ June 6, 2006 ͉ vol. 103 ͉ no. 23 www.pnas.org͞cgi͞doi͞10.1073͞pnas.0600250103 Downloaded by guest on September 23, 2021 satellite order, whereas the other a telomere3tL13novel tandem repeat (see below) 3minor satellite order (Fig. 1A). Multiple sequence alignments of the tL1 repeats from the different fosmid clones showed an extremely high degree of con- servation across all of the clones (Fig. 6, which is published as supporting information on the PNAS web site). Across the 1,780-bp sequence, there were only up to four base substitutions between the clones, representing a sequence identity of 99.8–100%. Sequence comparison with full-length mouse L1 repeats showed that tL1 belonged to the L1Md-A2 family and involved a 5Ј truncation through ORF2. The remaining 1,104 bp of ORF2 contained no stop codons, suggesting a recent insertion event. Interestingly, when the mouse genome was searched with the tL1 sequence, a high level of homology of 98.5–98.9% between some genomic L1s was seen. The top 12 matching L1 sequences were present at different chromo- some locations, with 10 of the 12 sequences being full length, further suggesting that the tL1 originated from a relatively young member of the L1 family.

A Newly Discovered Repeat (TLC) Is Present Between the Telomere and Centromere. Sequencing between the tL1 and the minor satellite array revealed a tandem repeat with a monomer unit of 146 bp in some of the fosmid clones. This repeat, which we have called TLC, is found in three of the four possible groupings of chromosomal loci represented by the different telomere-minor satellite-containing fosmids (Fig. 1A). Complete sequence analysis of a full-length 8.8-kb TLC array was carried out by using clone #1783. A consen- sus restriction site BfaI, present in Ͼ50% of monomers, was used as base 1 in the TLC-monomer analysis. Of the 58 monomers within this clone, 29 were 146 bp in length. Pair-wise comparisons of each monomer combination showed a mean identity of 82 Ϯ 10% with a range between 45% and 100%. The entire 8.8-kb TLC array in clone #1783 displayed a region of Ϸ4.7 kb that showed internal duplication of 1.0- and 1.8-kb fragments (Fig. 1B). In contrast, the

flanking sequences were predominantly made up of monomeric GENETICS Fig. 1. Sequence structure of the telocentric regions of mouse chromosomes. units of the TLC repeat. (A) List of fosmid clones identified with (TTAGGG)n (black) and minor satellite Sequence comparison between TLC and minor satellite showed (dark gray) repeats on either end. All regions were sequenced apart from the that the two sequences shared homology ranging in identity be- checkered boxes. The white and light-gray boxes define a truncated LINE-1 tween 74% and 77% across the TLC array. The base composition (tL1) and a telocentric tandem repeat (TLC), respectively. i–iv denote four revealed that the TLC repeat is 63–65% AT-rich, which is similar possible groupings of chromosomal loci represented by the different fosmid to that of minor satellite DNA (67% AT). In addition, the sequence clones. (B) Consensus domain map of the telocentric region of mouse chro- mosomes. Shown on top is a typical mouse telocentric chromosome showing displayed either a thymidine- or adenosine-rich strand, as observed the positions of centromeric minor satellite and pericentric major satellite in minor satellite. TLC and minor satellite divergence was evident , m and M, respectively. Below is shown the restriction map of clone from the absence of the centromere protein (CENP)-B box motif #1783 for enzymes used in C–E; BspHI and DraI are not shown. The map that is well conserved in the minor satellite sequences. Further- corresponds with that obtained by using genomic DNA digestion (see below) more, TLC lacked crosshybridization with minor satellite under and therefore represents a consensus mouse telocentric map. Arrows in the moderate stringency conditions (see below); however, crosshybrid- telocentric region denote internal duplications. (C) Duplicate C57 genomic ization occurred when the Southern blots were probed under low DNA blots probed with TLC or minor satellite under low-stringency conditions. stringency conditions (Fig. 1C). Interestingly, the TLC and minor For both probes, a laddering pattern of 120-bp periodicity that corresponds to satellite repeat clusters were positioned in opposite orientations minor satellite is seen with NdeI (N) and͞or SpeI (S) or PstI (not shown). (D) Fosmid clones #0550 and #1783 and C57BL͞6 (C57) genomic DNA were di- (Fig. 2A). gested with BspHI (B), DraI (D), SpeI (S), and XbaI (X) and hybridized with the To confirm that the TLC repeat was specific to the telocentric TLC probe (1783-44). Internal restriction fragments within the TLC array ends of the mouse chromosomes, we performed FISH on C57BL͞6 ranged in size from 1.2 to 2.9 kb and were observed for both fosmid and cells using a 3.5-kb (subclone #1783-44) TLC probe under non- genomic DNA. (E) Genomic C57 DNA probed with TLC revealed an 8.4-kb PstI crosshybridizing conditions for minor satellite. Positive signals that (P) fragment bridging the TLC and minor satellite arrays, whereas a 9.7-kb colocalized with those for the minor satellite were observed on most NdeI (N) fragment links up the tL1, TLC, and minor satellite repeats (refer metaphase chromosomes (Fig. 3A). Individual chromosomes were to restriction map in B). Other restriction enzymes used were XbaI (X) and subsequently identified with chromosome-specific paints. The re- SpeI (S). sults demonstrated the presence of the TLC repeat on all of the chromosomes except for the Y chromosome, and barely detectable signals on chromosomes 13 and 18 (data not shown). A Highly Conserved Truncated Long Interspersed Nucleotide Element 1 (tL1) Element Is Present in Every Fosmid Clone. Fosmid end-sequencing Genomic Verification of Fosmid Clones. Sequencing of Ϸ3kbofthe indicated that for each of the clones, the telomeric repeat sequence telomeric ends of the different fosmid clones across the led directly into a 1,780-bp 5Ј-tL1 retrotransposable DNA element. (TTAGGG)n, tL1, and various lengths of TLC and minor satellite We used primer walking to sequence through the tL1 in each clone. suggested that the clones represented at least four different chro- This initial characterization indicated that the clones could be mosome ends (Fig. 1A). To confirm that the observed base changes divided into two groups, one showing a telomere3tL13minor were representative of the mouse chromosomes, we performed

Kalitsis et al. PNAS ͉ June 6, 2006 ͉ vol. 103 ͉ no. 23 ͉ 8787 Downloaded by guest on September 23, 2021 Fig. 3. Juxtapositioning of TLC and telomere repeats. (A) The TLC repeat (i; green) colocalizes with the minor satellite (ii, red) at the p-tel ends of most C57BL͞6 mouse chromosomes (stained with DAPI; blue). (iii) Merged images. (B) PFGE analysis of genomic C57BL͞6 DNA probed with the TLC and (TTAGGG) repeat probes. Cohybridizing fragments ranged in size from 30 to Fig. 2. Minor satellite monomers show a high level of identity, are rich in 5 80 kb for BspHI (B), DraI (D), SpeI (S), and XbaI (X). CENP-B box motifs, and bind CENP-A. (A) Average identities of pair-wise analyses are shown for each repeat cluster at junction and clone ends, including standard deviation and range. Fosmid clones #1783 and #2085 (derived from different chromosomes) both contain TLC repeats. TTAGGG repeats are shown in black, tL1 Further verification was carried out by PCR amplification across in white, TLC repeat in light gray, and minor satellite in dark gray (checkered block the other repeat junctions (tL1-TLC; tL1-minor satellite; and has not been sequenced). Pair-wise analysis was performed on a minor satellite TLC-minor satellite) using C57BL͞6 and monochromosomal hy- array of 2.2 kb (18 monomers) of clone #1783. The first two monomers bordering brid chromosome 5. Sequences from these products corresponded the TLC-minor junction showed a slightly decreased level of identity with a mean to at least one fosmid clone. In addition, junction sequences were of 92% and 94%, respectively, when compared with the other monomers. also used in database searches of the C57BL͞6 mouse genome Monomer units 3 to 18 have a mean identity of 96%. Rectangular boxes represent (Table 2, which is published as supporting information on the PNAS the 120-bp monomer unit numbered 1–18 with the monomers containing the web site). Each junction sequence showed 100% identity to small conserved CENP-B box motif indicated by asterisks. (B) Fosmid clones #0394 and #0773 (derived from the same chromosome) do not contain any TLC repeat. Cross unplaced contigs that were not long enough to cover the entire comparisons of tL1-adjacent and clone-end minor satellite clusters show an region between telomere and minor satellite DNA. average identity of 96 Ϯ 2.1% and 97 Ϯ 2.4% for clones #0394 and #0773, Next, we used Southern blot analysis of C57BL͞6 genomic and respectively. Minor satellite monomer analysis among clones #1783, #2085, and fosmid clone DNA to determine and compare the genomic orga- #0773 at the TLC or tL1 borders showed a high level of identity, where same- nizationofthep-tel domain. Using the 3.5-kb TLC subclone colored boxes represent Ͼ99% identity. (C) CENP-A occupies a subset of the minor (#1783-44) as a probe under moderate hybridization stringency, we satellite DNA. Chromatin fibers from mouse C57BL͞6 MEFs were prepared by observed the predicted internal restriction fragments of the TLC mechanical stretching and stained with (i) anti-CENP-A antibody (green), fol- array (Fig. 1D). Further Southern blot analysis enabled the linking lowed by FISH hybridization with (ii) minor satellite DNA (red), and (iii) merged images with DAPI staining (blue). of the TLC array to adjacent repeats (Fig. 1E), as predicted by the restriction map derived from the sequencing of clone #1783 (Fig. 1B). PCR and sequenced the products across the tel-tL1 junction of Terminal restriction fragment analysis was used to physically link separate monochromosomal hybrids containing mouse chromo- the telomere with the TLC array. Restriction enzymes BspHI, DraI, some 5, 6, 10, 12, or 14 (9). The PCR region displayed polymor- SpeI, and XbaI were chosen because they did not cut within the tL1 phisms that assisted in the classification of the fosmid clones and repeat. Cohybridizing fragments were observed for both TLC and their genomic sequences (Fig. 6). Cloned tel-tL1 PCR products (TTAGGG)5 probes, ranging in size from 30 to 80 kb (Fig. 3B). from chromosome 5 were found to be identical to fosmid clone #2085, and products from chromosomes 10, 12, and 14 were Mouse Minor Satellite Monomers Are Highly Homogeneous Right Up identical to the #1783, #1815, and #1964 group of fosmids. to the TLC or tL1 Junction. CLUSTALW sequence analysis of 18 minor Interestingly, the sequenced PCR product from chromosome 6 satellite monomers next to the TLC-minor satellite junction of contained one base change and did not have an identical match to clone #1783 showed a mean identity of 95 Ϯ 2.5%, with a range of any of the fosmid clones. To ascertain whether the PCR products 88–100% (Fig. 2A). The minor satellite monomer identity did not were nonspecifically amplifying nontelomeric L1 sequences, we greatly diverge as the TLC junction was approached, with only the searched the genome database and found only one unplaced first two TLC-proximal minor satellite monomers displaying a sequence contig with telomere and L1 DNA, with the next highest relatively small (mean 92% and 94%, respectively) decrease in hit containing six base changes within the 571 bp of L1 sequence for identity when compared with the other minor monomers. A high chromosome 5. These changes were never found in the tel-tL1 PCR level of minor satellite identity was also observed at the TLC products. junction of clone #2085 (Fig. 2A). In addition, we found a high level

8788 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0600250103 Kalitsis et al. Downloaded by guest on September 23, 2021 of sequence conservation between the two blocks of seven or six TLC-proximal minor satellite monomers for fosmid clones #1783 and #2085 (Fig. 2A). Minor satellite DNA binds the conserved CENP-B by means of a 17-bp sequence motif known as the CENP-B box. This motif is commonly found in centromeric DNA of many mammalian species and is required for de novo centromere assembly (10). We examined the distribution of the CENP-B boxes within the minor satellite array next to the TLC repeats of clone #1783. Of the 18 minor satellite monomers examined, six contained the consensus CENP-B box (Fig. 2A). We also examined the mouse minor satellite monomers of the non-TLC-containing fosmid clones. Two clones, #0394 and #0773, derived from the same chromosomal locus were used (Fig. 1A). We sequenced blocks of five or six contiguous minor satellite monomers adjacent to the tL1 repeat (m1, Fig. 2B) and at the minor satellite end of the fosmid clones (m2 or m3). The level of monomer identity in pairwise analysis revealed an average value of 96 Ϯ 2.3%, 95 Ϯ 1.7%, and 96 Ϯ 2.7% for blocks, m1, m2, and m3, respectively. The average identity was additionally examined between tL1-adjacent block (m1) and fosmid-end blocks (m2 and m3) and gave an average identity of 96 Ϯ 2.1% and 97 Ϯ 2.4% for m1͞m2 and m1͞m3, respectively. These results suggested there is not a significant drop-off in sequence identity as the minor satellite arrays approach the TLC or the tL1 junction. As with the TLC-containing fosmids, CENP-B boxes were also observed in the minor satellite of non- TLC fosmids.

DNA Methylation Extends into the TLC Array. Fig. 4. TLC repeats are methylated and found in closely related Mus species. Mouse pericentric ͞ DNA is characterized by the methylation of cytosines and a (A) HpaII–MspI digestion of genomic C57BL 6 DNA from MEF and spleen showed a strong 0.15-kb TLC laddering in the MspI digest from DNA of both transcriptionally repressed state (11, 12). To examine the methyl- tissues and a weaker laddering signal in the HpaII digest of the spleen DNA. (B) ation status of the TLC repeats, we digested genomic DNA from Cross-species conservation of the TLC repeat in the Mus genus. Five micro- C57BL͞6 spleen and mouse embryonic fibroblast (MEF) cells with grams of genomic DNA from each strain or species were digested with MspI. MspI and HpaII that cut on average every second monomer. Both Lanes 1–6 represent C57BL͞6, M. musculus castaneus, M. musculus͞

DNAs exhibited marked methylation as shown by an increase in domesticus, M. pahari, M. spretus, and M. caroli. A shorter exposure for lanes GENETICS fragment sizes in the HpaII digests (Fig. 4A). Interestingly, spleen 5 and 6 is shown (5Ј and 6Ј). genomic DNA showed slightly less methylation when compared with MEF DNA. genome. To verify that the TLC repeat is present at the ends of the Conservation of TLC in Other Mouse Species. Centromeric DNA is chromosomes and not at interstitial sites, we performed terminal often characterized by its high degree of sequence divergence restriction fragment analysis with TLC and (TTAGGG)5 repeat between species. To determine whether the TLC repeat was probes. Comigrating fragments with sizes ranging from 30 to 80 kb conserved in closely related mouse species, we digested equal are observed by using four different restriction enzymes. The sizes amounts of genomic DNA with MspI and probed the Southern blot of the presumed terminal restriction fragments are within the with TLC (Fig. 4B). Varying intensities of a typical laddering mouse telomere size range from previous PFGE (13) and Q-FISH pattern of 0.15 kb corresponding to that of TLC repeat arrays was studies (14) using TTAGGG probes. Additional evidence support- observed for C57BL͞6, Mus musculus castaneus, Mus musculus͞ ing the fosmids being representative of the mouse telomeric ends domesticus, and Mus spretus. No bands were detected with Mus comes from the fact that the telomeric hexamers are in correct pahari, whereas Mus caroli showed a very strong hybridization signal orientation with respect to the ends of the chromosomes, and that with an irregular laddering pattern. the sequenced hexameric repeats show no significant degeneration, as would be expected if the repeats resided at interstitial genomic Discussion sites. The Mouse Proximal Telomere Is Truly Telocentric. We show that the telomere and centromere sequences of the different mouse chro- Extensive Sequence Homogenization Between Nonhomologous Chro- mosome ends studied are separated by a distance ranging from 1.8 mosomes Extends from the Telomere to the Centromere. Tandemly to 11 kb. We demonstrate that one or two types of sequences repeated sequences at the centromeric and pericentromeric regions separate the telomere and centromere: a tL1 sequence found on all are known to undergo homogenization via intra- and interchromo- fosmid clones, and a 146-bp TLC tandem repeat found on most somal crossing-over events (15, 16). Similarly, the subtelomeric ends fosmids. One group of fosmids (all derived from the same chro- of metacentric chromosomes contain tracts of tandem repeat DNA mosomal locus) contained only the 1.8-kb tL1 but no TLC repeat, that rapidly exchange between chromosomes (17). Acrocentric and bringing the centromere and telomere to within 1.8 kb of each telocentric chromosomes are unique in structure, because they other. contain two distinct chromosomal loci in close proximity. We now To demonstrate the telocentric location of the TLC repeats, we present DNA sequence information that bridges the gap between used FISH on metaphase chromosomes. Within the limits of these two structures on the mouse telocentric chromosomes. We detection, all of the mouse chromosomes, apart from chromosomes demonstrate that the DNA from the p-tel region through to minor 13, 18, and Y, show the TLC repeats at the telocentric end. PCR satellite is in the same orientation and has a very high level of amplification and Southern blot analyses of genomic DNA further homology (Ͼ99% across blocks of DNA) from the different confirm that the fosmid clones are stable and representative of the chromosome ends (Fig. 2 and Figs. 6 and 7, which are published as

Kalitsis et al. PNAS ͉ June 6, 2006 ͉ vol. 103 ͉ no. 23 ͉ 8789 Downloaded by guest on September 23, 2021 supporting information on the PNAS web site). This level of sequence conservation indicates a high prevalence of exchanges between nonhomologues leading to extensive sequence homoge- nization. A previous study has also demonstrated that the mono- mers of the pericentric mouse major satellite are highly homoge- neous (18), suggesting that homogenization extends into the pericentric satellite domain. It is possible that such frequent inter- chromosomal exchanges of the telocentric regions is facilitated by the close association of the mouse telomeres observed during meiosis (19).

Unusually High Level of Minor Satellite Sequence Homogeneity at the Array Periphery. In humans, centromeric ␣-satellite monomer se- quence identity varies between 70% and 100% (20). These mono- mers are often arranged into higher-order repeat units that show a significantly greater degree of sequence identity. The ␣-satellite monomers in these higher-order units commonly contain the conserved CENP-B-binding motif, CENP-B box (21). As the ␣-satellite boundary is approached, the level of sequence identity decreases. An example of a well-studied ␣-satellite junction region is found in Xp11 (22). Here, the level of ␣-satellite identity drops Fig. 5. Possible mechanisms for interchromosomal exchanges between off from 98–99% down to 70%, together with the frequency of telocentric chromosomes. (A) Exchange of telocentric arms between nonho- CENP-B boxes (23). mologous chromosomes anywhere (dashed crosses) along the common uni- A previous study using stretched fiber immuno-FISH analysis on directional telocentric DNA drives p-tel sequence homogenization and main- tains stability of the telocentric karyotype. (B) Occasional mispairing and a minichromosome in different somatic cell backgrounds showed exchange between sequence-related but oppositely oriented TLC and minor that the minor satellite DNA partially colocalized with the kinet- satellite repeat arrays result in the generation of a Rob chromosome and an ochore protein CENP-A (24). Here, we have performed immu- acentric fragment that is subsequently lost. nostretched fiber FISH on C57BL͞6 cells and shown that CENP-A occupies over half of the minor satellite domain (Fig. 2C), providing direct evidence that minor satellite is the primary site for nucleating chromatin. Further studies, such as involving the identification of kinetochore assembly. Furthermore, we demonstrate that the mi- chromatin boundary protein elements, will be required to address nor satellite array is orientated in the same direction on all of the this possibility. fosmid clones, confirming a similar finding of a previous report We have shown that the TLC repeat is located at the telocentric based on the use of chromosome orientation FISH (25). We have ends of most C57BL͞6 mouse chromosomes. The average sequence shown that, unlike human ␣-satellite, mouse minor satellite mono- identity between monomers across a TLC array is Ϸ82%, which is mers at the boundary of TLC or tL1 show a high level of sequence considerably lower than those seen between the minor or major homology (mean identity of 95–96% between monomers) and a satellite monomers (95% and 96%, respectively; see ref. 18 and the high prevalence of CENP-B boxes. Such a high level of minor present study). Sequence comparison between minor satellite and satellite identity extending across the minor satellite array into the TLC repeats suggests that the two repeats are distantly related, periphery is consistent with a mechanism of rapid interchromo- raising the possibility that the TLC repeat might have been derived somal sequence homogenization across the entire telocentric from minor satellite or a common progenitor sequence. It is unclear region. why the TLC monomers are less homogeneous compared to those of the minor (or major) satellite. Perhaps, like the tL1 sequence, the Spacer Sequences Between the Telomere and Centromere. The DNA TLC domain may merely serve a spacer role between the centro- adjacent to the telomeric ends of human chromosomes comprises mere and telomere regions, in which case the conservation of a small 0.5- to 2.4-kb stretches of tandemly repeated DNA (17). In the strict sequence identity may be less crucial. The identification of at mouse genome, only 11q and Xq telomeric ends have been com- least a small number of fosmids͞chromosomes that lack this domain pletely sequenced, where the adjacent sequence is 200–300 bp of further argues against an essential role of the TLC domain. DNA common to both chromosomes followed by B1 and LTR Cross-species analysis within the Mus genus indicates that the repeats. Our study has shown that the p-tel ends of the mouse TLC repeat is present in closely related subspecies and in more telocentric chromosomes contain a common partial L1 (tL1) repeat distantly related species such as M. spretus. However, this repeat that is next to TLC or minor satellite tandem repeats. appears absent in M. pahari, which last shared a common ancestor L1 repeats have been identified at the pericentric regions of with M. musculus around 6 million years (Myr) ago. It is interesting human chromosomes. The most detailed example comes from the that M. caroli diverged around 4 Myr ago but shows a very strong study of the human Xp, where an age-dependent gradient of L1 hybridization signal, albeit with a different periodicity that may repeats has been observed, with evolutionary age decreasing as the reflect crosshybridization with the predominant 60- and 79-bp higher-order ␣-satellite repeat array is approached (22, 23). We centromeric repeat monomers of M. caroli. The M. caroli repeat have shown the presence of a tL1 repeat next to the TTAGGG monomers show an identity ranging from 86% to 92% across 38- to repeat in at least four of the mouse p-tel ends. The presence of this 50-bp regions against the TLC repeat (26). tL1 may be explained by an initial transposition event to a single subtelomeric chromosome site that then rapidly spreads to other Telocentric Karyotype Evolution and Robertsonian (Rob) Transloca- chromosomes by means of interchromosomal translocations. It is tions. Telocentric chromosomes are not uncommon in mammalian interesting that the junction between the TTAGGG repeat and tL1 species and are found, for example, in mouse, dog, cattle, sheep, and contains an overlapping AACCC sequence (see Fig. 6). This goats (27). The prevalence of telocentric chromosomes in the Mus sequence may have directed the truncation of a full-length L1 by an genus suggests that this karyotype is relatively stable through unequal crossing-over event. At present, it is unclear what function, evolution. Here, we provide evidence that the various DNA se- if any, the tL1 sequence might serve. It is possible that the presence quences of the telocentric domain exhibit the same polarity that of this sequence serves to demarcate the telomeric and centric extends from the minor satellite array to the p-tel end. This identical

8790 ͉ www.pnas.org͞cgi͞doi͞10.1073͞pnas.0600250103 Kalitsis et al. Downloaded by guest on September 23, 2021 polarity should greatly favor the exchange of the telocentric ends FISH and Immunocytochemistry. FISH was performed on C57BL͞6 (but not the occurrence of Rob translocations; see below) among MEF by using standard methods. A 3.5-kb TLC DNA probe nonhomologous chromosomes, thereby providing a mechanism to (#1783-44) was subcloned from fosmid clone #1783. The minor maintain the stability of the telocentric karyotype (Fig. 5A). satellite probe was used from a previous study (32). For medium The mouse telocentric structure also allows a possible mecha- stringency hybridization and washing, probes were hybridized in nism for the occasional Rob translocation events to take place. This 30% formamide͞2 ϫ SSC, 37°C, and washed at 1 ϫ SSC, 50°C. mechanism may involve the mispairing between the sequence- For high stringency, hybridization was performed in 50% for- related TLC and minor satellite repeat arrays. Because the arrays mamide, 2 ϫ SSC, 37°C, and washing at 60°C, 0.1 ϫ SSC was for these two repeats are orientated in opposite directions, aberrant used. Medium stringency was used for the TLC FISH followed recombination between them will give rise to Rob translocations by the Rainbow FISH. Images of metaphase spreads were (Fig. 5B). The resulting Rob chromosomes can retain considerable captured and coordinates recorded by using an England Finder amounts of minor satellite DNA to preserve centromere function. graticule (SPI, West Chester, PA). Individual mouse chromo- Previous studies have shown that minor satellite is present in somes were then identified with a second round of FISH by using varying amounts on Rob chromosomes. This suggests that different Rainbow Star FISH (Cambio, Cambridge, U.K.) per the man- exchange sites within the minor satellite array and͞or the subse- ufacturer’s instructions. Stretched chromatin fiber FISH was quent loss of minor satellite sequences are possible through pro- performed as described (33). Images were captured by using an cesses such as unequal crossing over or looping out unstable Axioplan2 microscope (Zeiss), with a Sensys-cooled charge- inverted minor satellite sequences generated from the translocation coupled device camera (Photometrics, Tucson, AZ) and pro- events (25, 28–30). Rob translocations constitute the most common cessed with IP LAB software (Scanalytics, Billerica, MA). chromosomal rearrangement observed in humans. Similar molec- ular mechanisms have been proposed for the derivation of such Cell Lines and DNA. MEF cell lines were derived from pooled day translocation events (31). Further studies will be required to 12.5 male and female C57BL͞6 mouse embryos. A monochromo- determine whether the TLC repeats indeed provide a molecular somal somatic cell hybrid containing a mouse chromosome 5 in a mechanism for mouse Rob translocation events, and whether human background was generated for this study through microcell- possible lessons learned from such mouse studies will assist in the mediated fusion of a mouse embryonic stem (ES) cell line con- delineation of the human Rob mechanisms. taining a targeted CENP-A-GFP fusion product and neomycin resistance (32) into the human fibrosarcoma cell line HT1080, Materials and Methods using a previously described method (34). Genomic DNA was ͞ Mouse Fosmid Clones and Subclones. The female C57BL 6 sheared- extracted from the spleen of a 4.5-mo-old C57BL͞6 female or insert fosmid library was previously constructed by Kerstin pooled day-12.5 embryos for liquid and PFGE blocks using stan- Lindblad-Toh and colleagues, Broad Institute of Harvard and dard molecular methods. PFGE gels were run under the following Massachusetts Institute of Technology, Cambridge, MA. This conditions: 0.9% agarose͞6V͞cm, 14°C͞0.5 ϫ TBE, pulsed time library was deposited at the BACPAC Resource Center, Children’s 2–30 sec, run time 22 h, using a Bio-Rad Chef Mapper. Mouse Hospital Oakland Research Institute, from which replicate fosmid genomic DNA from different species and strains were obtained clones were obtained for this study. To facilitate the sequencing of from The Jackson Laboratory. Southern blots using the TLC probe GENETICS the repetitive fosmid inserts, DNA was randomly sheared using were hybridized at 55°C or 60°C in Church buffer and washed in 2 ϫ sonication and separated on an agarose gel to excise fragments SSC͞0.1% SDS, 55°C or 60°C. The (TTAGGG) telomere probe between 3 and 5 kb. The fragments were purified by using Gelase 5 was end-labeled, and blots were hybridized at 50°C in Church buffer (Epicentre Technologies, Madison, WI), and the ragged ends were and washed in 2 ϫ SSC͞0.1% SDS, 50°C. repaired with T4 DNA polymerase and T4 polynucleotide kinase (NEB) and blunt-end cloned into vector pAlter-1 (Pro- We thank Dr. Kerstin Lindblad-Toh (Broad Institute of Harvard and mega). Insert sizes of the subclones were then determined, and Massachusetts Institute of Technology, Cambridge, MA) for the mouse paired-sequence reads assisted in the assembly and closure of fosmid clones, Prof. Phil Avner for mouse monochromosomal hybrid contigs. Additional sequencing and PCR analyses are in Supporting DNAs, and Dr. Tim Littlejohn for assistance with bioinformatics anal- Text, which is published as supporting information on the PNAS yses. K.H.A.C. is a Senior Principal Research Fellow of National Health web site. and the Medical Research Council.

1. Chan, S. R. & Blackburn, E. H. (2004) Philos. Trans. R. Soc. London B 359, 109–121. 21. Ikeno, M., Masumoto, H. & Okazaki, T. (1994) Hum. Mol. Genet. 3, 1245–1257. 2. Cleveland, D. W., Mao, Y. & Sullivan, K. F. (2003) Cell 112, 407–421. 22. Schueler, M. G., Higgins, A. W., Rudd, M. K., Gustashaw, K. & Willard, H. F. (2001) Science 3. Amor, D. J., Kalitsis, P., Sumer, H. & Choo, K. H. A. (2004) Trends Cell Biol. 14, 359–368. 294, 109–115. 4. Amor, D. J. & Choo, K. H. A. (2002) Am. J. Hum. Genet. 71, 695–714. 23. Schueler, M. G., Dunn, J. M., Bird, C. P., Ross, M. T., Viggiano, L., Rocchi, M., Willard, 5. Kipling, D., Ackford, H. E., Taylor, B. A. & Cooke, H. J. (1991) Genomics 11, 235–241. H. F. & Green, E. D. (2005) Proc. Natl. Acad. Sci. USA 102, 10563–10568. 6. Narayanswami, S., Doggett, N. A., Clark, L. M., Hildebrand, C. E., Weier, H. U. & Hamkalo, 24. Zeng, K., de las Heras, J. I., Ross, A., Yang, J., Cooke, H. & Shen, M. H. (2004) Chromosoma B. A. (1992) Mamm. Genome 2, 186–194. 113, 84–91. 7. Garagna, S., Zuccotti, M., Capanna, E. & Redi, C. A. (2002) Cytogenet. Genome Res. 96, 25. Garagna, S., Marziliano, N., Zuccotti, M., Searle, J. B., Capanna, E. & Redi, C. A. (2001) 125–129. Proc. Natl. Acad. Sci. USA 98, 171–175. 8. Waterston, R. H., Lindblad-Toh, K., Birney, E., Rogers, J., Abril, J. F., Agarwal, P., 26. Kipling, D., Mitchell, A. R., Masumoto, H., Wilson, H. E., Nicol, L. & Cooke, H. J. (1995) Agarwala, R., Ainscough, R., Alexandersson, M., An, P., et al. (2002) Nature 420, 520–562. Mol. Cell. Biol. 15, 4009–4020. 9. Sabile, A., Poras, I., Cherif, D., Goodfellow, P. & Avner, P. (1997) Mamm. Genome 8, 81–85. 27. Pardo-Manuel de Villena, F. & Sapienza, C. (2001) Genetics 159, 1179–1189. 10. Ohzeki, J., Nakano, M., Okada, T. & Masumoto, H. (2002) J. Cell Biol. 159, 765–775. 28. Garagna, S., Redi, C. A., Capanna, E., Andayani, N., Alfano, R. M., Doi, P. & Viale, G. 11. Okano, M., Bell, D. W., Haber, D. A. & Li, E. (1999) Cell 99, 247–257. (1993) Cytogenet. Cell Genet. 64, 247–255. 12. Martens, J. H., O’Sullivan, R. J., Braunschweig, U., Opravil, S., Radolf, M., Steinlein, P. & 29. Garagna, S., Broccoli, D., Redi, C. A., Searle, J. B., Cooke, H. J. & Capanna, E. (1995) Jenuwein, T. (2005) EMBO J. 24, 800–812. 13. Kipling, D. & Cooke, H. J. (1990) Nature 347, 400–402. Chromosoma 103, 685–692. 14. Zijlmans, J. M., Martens, U. M., Poon, S. S., Raap, A. K., Tanke, H. J., Ward, R. K. & 30. Nanda, I., Schneider-Rasp, S., Winking, H. & Schmid, M. (1995) Chromosome Res. 3, Lansdorp, P. M. (1997) Proc. Natl. Acad. Sci. USA 94, 7423–7428. 399–409. 15. Smith, G. P. (1976) Science 191, 528–535. 31. Choo, K. H. (1990) Mol. Biol. Med. 7, 437–449. 16. Dover, G. (1982) Nature 299, 111–117. 32. Kalitsis, P., Fowler, K. J., Earle, E., Griffiths, B., Howman, E., Newson, A. J. & Choo, 17. Linardopoulou, E. V., Williams, E. M., Fan, Y., Friedman, C., Young, J. M. & Trask, B. J. K. H. A. (2003) Chromosome Res. 11, 345–357. (2005) Nature 437, 94–100. 33. Blower, M. D., Sullivan, B. A. & Karpen, G. H. (2002) Dev. Cell 2, 319–330. 18. Vissel, B. & Choo, K. H. (1989) Genomics 5, 407–414. 34. Saffery, R., Wong, L. H., Irvine, D. V., Bateman, M. A., Griffiths, B., Cutts, S. M., Cancilla, 19. Bass, H. W. (2003) Cell Mol. Sci. 60, 2319–2324. M. R., Cendron, A. C., Stafford, A. J. & Choo, K. H. A. (2001) Proc. Natl. Acad. Sci. USA 20. Rudd, M. K. & Willard, H. F. (2004) Trends Genet. 20, 529–533. 98, 5705–5710.

Kalitsis et al. PNAS ͉ June 6, 2006 ͉ vol. 103 ͉ no. 23 ͉ 8791 Downloaded by guest on September 23, 2021