
Copyright © 1995 by the Genetics Society of America

A Physical Map of the X Chromosome of Drosophila melanogaster: Cosmid Contigs and Sequence Tagged Sites

Encarna Madueño, George Papagiannakis, Georgina Rimmington, Robert D. C. Saunders, Charalambos Savakis, Inga Sidén-Kiamos, George Skavdis, Lefteris Spanos, Jenny Trenear, Paul Adam, Michael Ashburner, Panayiotis Benos, Viacheslav N. Bolshakov, Daren Coulson, David M. Glover, Siegrun Herrmann, Fotis C. Kafatos, Christos Louis, Tamsin Majerus and Juan Modolell

Centro de Biología Molecular Severo Ochoa, CSIC, Universidad Autónoma de Madrid, 28049 Madrid, Spain; Institute of Molecular Biology and Biotechnology, FORTH, Heraklion, Greece; Department of Anatomy and Physiology and CRC Cell Cycle Genetics Group, University of Dundee, Dundee, Scotland; Division of Medical Sciences, Medical School and Department of Biology, University of Crete, Heraklion, Greece; Department of Genetics, University of Cambridge, Cambridge, England; and European Molecular Biology Laboratory, Heidelberg, Germany

Manuscript received July 15, 1994
Accepted for publication December 21, 1994

ABSTRACT

A physical map of the euchromatic X chromosome of Drosophila melanogaster has been constructed by assembling contiguous arrays of cosmids that were selected by screening a library with DNA isolated from microamplified chromosomal divisions. This map, consisting of 893 cosmids, covers ~64% of the euchromatic part of the chromosome. In addition, 568 sequence tagged sites (STSs), in aggregate representing 120 kb of sequenced DNA, were derived from selected cosmids. Most of these STSs, spaced at an average distance of ~35 kb along the euchromatic region of the chromosome, represent DNA tags that can be used as entry points to the fruitfly genome. Furthermore, 42 genes have been placed on the physical map, either through the hybridization of specific probes to the cosmids or through the fact that they were represented among the STSs. These provide a link between the physical and the genetic maps of D. melanogaster. Nine novel genes have been tentatively identified in Drosophila on the basis of matches between STS sequences and sequences from other species.

The order of authors was determined, with minor variation, according to Article 16 of Royal Decree 2223-1984 of the Kingdom of Spain, which governs the order of presentation of candidates to the Spanish Civil Service.

Corresponding author: Michael Ashburner, Department of Genetics, University of Cambridge, Downing St., Cambridge CB2 3EH, UK. E-mail: [email protected]

DROSOPHILA melanogaster is, genetically, the best known higher organism. Nearly 90 years of study of this fly have provided an unparalleled model for the study of biological phenomena as apparently diverse as population biology and cell death (ASHBURNER 1989). Over 6000 genes have been identified (FlyBase 1994) and ~25% of these have been cloned and sequenced, at least in part. Novel genetic techniques have been developed, such as transposon mutagenesis and enhancer trapping, that detect DNA sequence elements controlling the tissue and developmental specificity of genes (O'KANE and GEHRING 1987). These techniques allow the identification of the great majority of genes affecting any specific developmental process, regardless of whether or not mutations in these genes have an identifiable phenotype. Together with the increasing use of screens to recover mutations in genes that interact during development, these techniques are providing an exceptionally detailed view of the control of developmental and cellular phenomena. Encouragingly, many of the basic regulatory mechanisms being discovered in Drosophila subsequently have been shown by molecular means to have parallels in other organisms.

Drosophila is also a rich mine for evolutionary studies. There are now 3318 species known in the family. Most of these, it is true, are known only to taxonomists, but many have been the subject of detailed ecological and evolutionary study, by both classical methods such as polytene chromosome banding and by molecular methods [see KRIMBAS and POWELL (1992) and POWELL and DE SALLE (1995) for reviews, respectively].

Given this rich background and following the development of the requisite technology in other organisms (e.g., COULSON et al. 1986), it was natural for Drosophila biologists to consider mapping the entire euchromatic genome of D. melanogaster with cloned DNA. Indeed, several such projects are underway, using cosmids (SIDÉN-KIAMOS et al. 1990), YACs (GARZA et al. 1989; AJIOKA et al. 1991), P1 phage (SMOLLER et al. 1991) and combining these various vector-insert systems into a "reference library" (HOHEISEL et al. 1991; MERRIAM et al. 1991; HARTL and LOZOVSKAYA 1992, for review). Physical maps will enable the cloning of otherwise uncloned genes, will be a prerequisite to whole genome sequencing and will be a resource for comparative studies of drosophilid genomes. They will also be a model

Genetics 139: 1631-1647 (April, 1995)

for the analysis of the genomes of insects of economic or medical importance (see ZHENG et al. 1991).

One physical map of the D. melanogaster genome is in the form of overlapping cosmid clones and is being assembled by a European consortium of laboratories (SIDÉN-KIAMOS et al. 1990; KAFATOS et al. 1991). This map is being linked to the polytene chromosome map, a regular pattern of ~5000 bands and interbands (BRIDGES 1935), by in situ hybridization. This permits the assignment of selected "canonical" cosmids from sets of overlapping cosmids (contigs) to specific bands, or at least to small chromosomal map subdivisions, and provides the linkage between the physical and the genetic map to a resolution limit of tens of kilobase pairs.

To enhance the physical map and tie it to the ultimate map, the chromosomal DNA sequence, we have begun to determine a large number of "sequence tagged sites" (STSs) (OLSON et al. 1989). These are sequences of short stretches of DNA located at known positions within the cosmid inserts. An advantage of the STSs is that they make the physical map independent of the actual cosmids that generated the map. We are obtaining STSs corresponding to the ends of the DNA inserts by sequencing from the T7 and SP6 promoters adjacent to the cloning site within the cosmid vector. The sequences obtained have an average length of 211 bp.

Here we report the map of the euchromatic X chromosome of D. melanogaster, consisting of 893 cosmid clones and 568 STSs. This map is anchored to preexisting maps by the identification of cosmids that include genes previously identified and sequenced.

MATERIALS AND METHODS

Map assembly and in situ hybridization: The map of chromosome X was assembled as described by SIDÉN-KIAMOS et al. (1990). The cosmid library used in the construction of the map was generated in the Lorist 6 vector (GIBSON et al. 1987), which contains SP6 and T7 polymerase recognition sequences near and at each side of the BamHI cloning site. The cosmid master library (18,432 clones, equivalent to 4.6 genomes) was gridded onto new filters using a robotic device to attain a density of 9216 clones per filter (see LEHRACH et al. 1990). Contig assembly used the programs of SULSTON et al. (1988) as modified in the course of this work (T. BENOS and C. SAVAKIS, unpublished data).

To estimate the real physical length of the contigs, we determined the restriction maps of a sample of contigs of various lengths. It was observed that the overlap between cosmids was, on average, larger for the larger contigs than for the smaller. The following convention was used to estimate the physical length of contigs, assuming an average insert length of 38 kb. The length was estimated as 38 kb for the first cosmid, plus 9.5 kb for each additional cosmid for cosmids 2-5, plus 7.5 kb for each cosmid for cosmids 6-10 and plus 5.5 kb for each cosmid for cosmids 11 or above.

In situ hybridization to larval polytene chromosomes was performed as described earlier (SAUNDERS et al. 1989).

Hybridization procedures: To determine the presence of previously cloned genes on the contigs, the cosmid library was hybridized to either subcloned DNA or oligonucleotides (20 bp). Oligonucleotides were 32P-labeled with kinase (SAMBROOK et al. 1989). Hybridization was performed in "CHURCH" buffer (CHURCH and GILBERT 1984) at 37° for 16 hr. Filters were then washed three times for 30 min in 6X SSC at room temperature and rinsed twice for 20 min in a buffer consisting of 3 M tetramethylammonium chloride, 50 mM Tris-HCl, pH 8.0, 2 mM EDTA, 0.1% SDS. A temperature of 57-58° was selected for the rinses to allow for an 18-bp-long match for a 20-bp oligonucleotide probe (WOOD et al. 1985). Before exposing to film the filters were briefly rinsed in 6X SSC.

DNA preparations: For sequencing, cosmid DNA was prepared from 4 ml overnight cultures grown at 37° in TB medium containing 12 g/liter Bacto-Tryptone, 24 g/liter yeast extract, 4 ml/liter glycerol, 17 mM KH2PO4, 72 mM K2HPO4, and 30 µg/ml kanamycin. DNA was extracted from cells by the alkaline lysis procedure (SAMBROOK et al. 1989) and purified with Wizard columns (Promega) according to the manufacturer's directions. Yields ranged from 1 to 20 µg. For most preparations they were between 7 and 10 µg.

DNA sequencing: Oligonucleotides (20-mers) complementary to the SP6 and T7 sites were used as primers to obtain sequences of the ends of the insert DNA. DNA was sequenced using heat-denatured, double-stranded templates and the linear amplification method (CRAXTON 1991) as described in the Promega fmol sequencing kit with several modifications. In brief, 1-2 µg of template DNA were mixed, in sequencing buffer, with 20 µCi [35S]dATP (1000 Ci/mmol), 3 pmol SP6 or T7 primer and 5 U Taq polymerase, up to a final volume of 17 µl. Four-microliter aliquots were mixed with the appropriate deoxy- and dideoxy-nucleotide triphosphates, heated to 95° for 2 min in a model 480 Perkin Elmer Cetus thermal cycler, and subjected to 30 cycles consisting of 30 sec at 95°, 30 sec at 42° and 60 sec at 70°. Three microliters of stop buffer were added to each reaction mixture, DNA was denatured by incubation at 80° for 2 min and 2-3 µl of each mixture was electrophoresed in 6% acrylamide gels. Another portion of each mixture was electrophoresed in a second gel. After autoradiography for 4-5 days the gels were read and the sequences were entered into a MicroVax computer with the help of a digitizing tablet. Consensus sequences were obtained with the University of Wisconsin software package (DEVEREUX et al. 1984). STSs were named by appending an "S" or "T" to the corresponding cosmid name, according to whether the STS was obtained by priming on the SP6 or T7 recognition site, respectively.

STS analysis: Approximately 75% of the cosmids yielded readable sequences from both termini, 14% failed to sequence from one of the two termini and the rest failed from both ends. More than half of the failures (60%) were due to compressions. The remaining were due to a failure of elongation. The length distribution of the STSs was as follows: 3% (17 STSs), 50-100 nucleotides (nt); 35.4% (201 STSs), 101-200 nt; 38.2% (217 STSs), 201-250 nt; 18.7% (106 STSs), 251-300 nt; 4.7% (27 STSs), longer than 300 nt. The average length of all STSs was 211 bp.

Data deposition: The STS sequence data described in this paper, and of a further 168 autosomal cosmid STSs, have been deposited in the EMBL/GenBank/DDBJ data libraries with accession numbers Z31722, Z31727-Z31742, Z31744-Z32401, Z32417-Z32477, Z32515. These sequences are also available from the dbSTS database (National Center for Biotechnology Information, Washington, DC) with accession numbers 4202-4937.

Availability of cosmids: All of the cosmids are freely available from Dr. I. SIDÉN-KIAMOS (Institute of Molecular Biology and Biotechnology, Foundation for Research and Technology - Hellas, PO Box 1527, Heraklion 711 10, Crete, Greece. E-mail: [email protected]). Researchers requiring cosmids in a particular chromosome region are recommended to state this in their request, as cosmids additional to those listed in this paper may have become available.
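The convention for estimating the physical length of a contig from its number of cosmids, stated under Map assembly above, can be written out as a short calculation. The following Python sketch is illustrative only: the function name `estimated_contig_length_kb` is hypothetical and not part of the software used in this work, and the only inputs are the increments quoted in the text (38 kb for the first cosmid, then 9.5, 7.5 and 5.5 kb per additional cosmid).

```python
def estimated_contig_length_kb(n_cosmids: int) -> float:
    """Estimate contig length (kb) from the number of cosmids it contains.

    Hypothetical helper that applies the convention quoted in the text:
    38 kb for the first cosmid, 9.5 kb for each of cosmids 2-5,
    7.5 kb for each of cosmids 6-10 and 5.5 kb for each cosmid from 11 on.
    """
    if n_cosmids < 1:
        raise ValueError("a contig must contain at least one cosmid")
    length = 38.0  # assumed average insert length, first cosmid
    for position in range(2, n_cosmids + 1):
        if position <= 5:
            length += 9.5
        elif position <= 10:
            length += 7.5
        else:
            length += 5.5
    return length


if __name__ == "__main__":
    # Print the estimate for a few contig sizes, up to the largest (24 cosmids).
    for n in (1, 2, 5, 10, 24):
        print(n, estimated_contig_length_kb(n))
```

Under this convention the marginal contribution of each cosmid falls as the contig grows, which reflects the observation that overlaps were, on average, larger in the larger contigs.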

RESULTS AND DISCUSSION

Contig analysis: The basic strategy used to produce a physical map of chromosome X was that described by SIDÉN-KIAMOS et al. (1990), with slight modifications. In brief, polytene chromosomes of D. simulans were microdissected into regions corresponding to numbered divisions (SAUNDERS et al. 1989). After extraction, the DNA was digested with Sau3A and a 24-mer oligonucleotide adaptor with a 5' overhanging Sau3A sequence was ligated to each end. The DNA was then PCR amplified using primers corresponding to the adaptor sequence. The PCR product was used, after labeling, both for in situ hybridization to polytene chromosomes of D. melanogaster (for purposes of verification) and as a probe to screen high density filters of the cosmid library. The selected clones, from 50 to 400 for each screen, were highly enriched for cosmids derived from the microdissected chromosome division. These cosmids were fingerprinted by HinfI digestion and contigs were assembled by the comparison of the fingerprints, using the computer-assisted methods developed by SULSTON et al. (1988).

Following a first round of contig assembly, each contig was represented by a minimum overlapping set of "canonical" cosmids, that is, both cosmids of contigs consisting of two cosmids, both terminal cosmids of contigs with three members, and these and additional internal cosmids in the case of larger contigs. From this set, selected cosmids were hybridized in situ to the polytene chromosomes of D. melanogaster. Routinely, one cosmid from contigs consisting of two cosmids, and both outer cosmids and representative internal cosmids for all larger contigs, were analyzed in this way.

Some cosmids, selected with a probe from a particular polytene chromosome division, hybridized in situ to a different chromosome division. The reasons for this are various but include sequence similarities between probe and clone (detectable by filter hybridization but not by in situ hybridization to chromosomes) and contamination of the microdissected DNA (by, e.g., the dissecting needle touching another chromosome region). Many of the false positives would appear, on the basis of their in situ hybridization signals, to be of heterochromatic origin.

Some cosmids that remained unattached to a contig after the analysis of fingerprints were hybridized to the polytene chromosomes. This strategy was followed extensively when the division-specific probes selected comparatively few clones.

The cytogenetic localization of a total of 1594 cosmids was determined. Of these, 367 hybridized to only a single polytene chromosome site, 539 hybridized to multiple sites but with a clear "primary" site and 688 hybridized to multiple sites with no primary site that could be determined. After the in situ hybridization of cosmids, a second round of computer-assisted contig building allowed the generation of the final contig maps. The results of this analysis are presented in detail in the APPENDIX. In the final map, a total of 718 cosmids from the X chromosome was assembled into 138 contigs, with from two to 24 cosmids per contig, on average 5.2. In addition, 175 further clones, unattached to any contig, have been mapped to a lettered polytene chromosome subdivision by in situ hybridization.

For the calculation of the proportion of the genome that has been covered by cosmids, we used SORSA's estimates of the amount of DNA per haploid equivalent of each of the BRIDGES polytene chromosome divisions (SORSA 1988). Using an algorithm to estimate the real physical length of a contig from the number of its cosmids (see MATERIALS AND METHODS), we estimate that, overall, the total coverage of the X chromosome is ~64%. This number varies between 28 and >100% for different divisions. Some divisions, particularly 6, 9 and 10, are poorly covered. Repeated microdissections and preparation of new microamplified probes for these problematic divisions did not improve their coverage. It is possible that these divisions correspond to regions of low sequence similarity between D. melanogaster and D. simulans. However, probes from D. melanogaster selected far too many positive clones (>1000), presumably due to repetitive sequences in the probe selecting clones from heterochromatic regions. For some divisions (4, 13 and 16) the number of clones selected by the D. simulans probes was within the expected range, but a large proportion of these were repetitive by in situ hybridization and could not be mapped to a primary site. Preliminary experiments have shown that using a library from another strain of D. melanogaster may be helpful in these cases.

The current coverage of approximately two thirds of the X chromosome is the practical limit of what can be achieved with the present library (nominally representing 4.6 genome equivalents) and the present method of detecting cosmid overlaps by fingerprinting. To minimize false positives in contig building, we are using a rather stringent criterion for cosmid overlap, accepting overlaps at a level of confidence better than 10^-5. This corresponds to >50% overlap between two cosmids. We are now developing new algorithms for contig assembly that take into account not only common sized bands between fingerprinted cosmids but also band intensity (T. BENOS and C. SAVAKIS, unpublished data) and expect that these new methods, applied to the available fingerprint data, will increase the coverage by contigs.

To provide a direct link with the genetic maps, we determined whether genes, previously cloned by others, were present in the cosmids mapped to contigs. We either hybridized preexisting clones of particular genes or synthesized oligonucleotides (20-mers) based on sequences available in the sequence databases. Table 1 shows the Drosophila genes that were identified in this way. Often, more than one cosmid was shown to have sequences similar to a particular

TABLE 1
Known genes placed on the cosmid map

Gene [abbreviation]                            Map position   Method of detection
achaete [ac]                                   1B1.2          HYB
asense [ase]                                   1B1.2          HYB
β-amyloid-protein-precursor [Appl]             1B5-8          STS
embryonic lethal, abnormal vision [elav]       1B5-9          HYB
suppressor of sable [su(s)]                    1B10-C1        STS, HYB
armadillo [arm]                                2B15           STS, HYB
Phosphogluconate-dehydrogenase [Pgd]           2D3-4          HYB
polyhomeotic [ph]                              2D3-4          STS, HYB
pecanex [pcx]                                  2E2            STS
prune [pn]                                     2E2-3          HYB
female-sterile-(1)-Ya [fs(1)Ya]                3B4-6          HYB
period [per]                                   3B1.2          HYB
Notch [N]                                      3C7            STS, HYB
dunce [dnc]                                    3C11-D4        STS, HYB
Salivary-gland-secretion-4 [Sgs4]              3C11-12        HYB
Ubiquitin-fusion-52 [Ubi-f52]                                 STS
Sex lethal [Sxl]                               6F4-7B3        STS
defective chorion-1 [dec-1]                    7C1-3          OL
female-sterile-(1)-homoeotic [fs(1)h]          7D1-5          STS
Neuroglian [Nrg]                               7F1            OL
ovarian tumors [otu]                           7F1            OL
Yolk protein-2 [Yp2]                           9A-B           OL
Protein-phosphatase-1β-9C [Pp1β-9C]            9C             STS
sevenless [sev]                                10A1.2         HYB, OL
vermilion [v]                                  10A1.2         OL
lethal (1) discs large [l(1)dlg]               10C            OL
Position specific antigen 1α [PS1α]            11D7-E4        STS
Yolk protein-3 [Yp3]                           12B-C          OL
rutabaga [rut]                                 12F5-13A       OL
Protein-phosphatase-1-13C [Pp1-13C]            13C1.2         HYB
scalloped [sd]                                 13F            STS, OL
Myb-oncogene-like [Myb]                        13F            OL
shibire [shi]                                  13F-14         OL
extradenticle [exd]                            14A1-B1        STS
Cyclophilin-1 [Cyp1]                           14C            STS
no on or off transient A [nonA]                14C1-2         STS, OL
rudimentary [r]                                15A1           OL
β-Spectrin [Spec-β]                            16C1-4         OL
Transcription activating factor-42 [Taf42]     16D4-5         STS
wings up A [wupA]                              16F3           STS
shaking-B [shakB]                              19E3-4         OL
suppressor of forked [su(f)]                   20E-F          HYB

The methods used to place these genes on the cosmid map were hybridization to cloned sequences (HYB), hybridization to oligonucleotides (OL) and determination of sequence tagged sites (STS). The genes are listed from distal to proximal. Cytological map positions are from FlyBase.

probe, either verifying a contig or helping join two contigs that were considered separate according to the fingerprint analysis alone. Table 1 also includes genes that were identified by the STS analysis (see below), as well as some genes whose presence was determined by other workers to whom our cosmids had been sent. The cosmids to which these probes hybridized are identified in the APPENDIX.

STS analysis: STSs were determined from 568 ends of cosmids, all of which had been mapped by in situ hybridization to the polytene chromosomes. These STSs amount to a total of 120 kbp of sequenced DNA with known cytological locations. The majority of STSs were determined from canonical cosmids of contigs. When the coverage of a chromosomal division by contigs was low, we also determined the end sequences from some unattached, but in situ mapped, cosmids, aiming at an even distribution of STSs along each chromosomal division.

Classes of sequence tagged sites: All STS sequences were analyzed to determine whether or not similar sequences were already present in nucleic acid databases or whether they could potentially code for polypeptides whose sequences were present in protein databases.

TABLE 2
Classification of 568 X chromosome STSs (total length: 120,113 bp)

                                                                     Number   Percent
Class I: Known genes
  a) Known D. melanogaster genes (see Table 1)                         33       5.8
  b) Genes identified by homology (see Figure 1)                        9       1.6
       Phe-tRNA synthetase (cosmid maps to 7D)
       LDL-receptor-like protein (cosmid maps to 8D)
       Phosphorylase B kinase (cosmid maps to 10D)
       Prolyl endopeptidase (cosmid maps to 9A)
       ATP-dependent helicase (cosmid maps to 10A1.2)
       Protein phosphatase B (cosmid maps to 14D-F)
       Lipoamide acyltransferase (cosmid maps to 13B)
       Alkaline protease (cosmid maps to 19BC)
       IAP-promoted placenta protein (cosmid maps to 8D8-12)
Class II: Other useful STSs                                           495      87.2
  a) Apparently nonrepetitive DNA                                     394      69.4
  b) Partially repetitive                                             101      17.8
Class III: Repetitive sequences                                        40       7.0
  a) rDNA and Type I inserts                                           13       2.3
  b) Transposable elements                                             24       4.2
  c) Histone H1-H3 spacer                                               1       0.2
  d) Satellite DNA                                                      2       0.4

The numbers and percentages sum to more than the total number of STSs, because some STS sequences fall into more than one class.

Nucleic acid searches were performed using the BLASTN programme (ALTSCHUL et al. 1990) with the EMBL nucleic acid sequence database. BLASTX searches of the SWISSPROT database were also performed. Based on the results obtained, we could classify the sequenced cosmid ends as belonging to one of three categories (Table 2); individual tagged cosmids are identified in the APPENDIX.

Class I STSs represent sequences that show a considerable degree of similarity to, or even complete identity with, previously sequenced genes (33 STSs, 5.8% of the total). Class I STSs are particularly useful in linking the physical map to the genetic map of Drosophila (see above). The cytogenetic localizations of these cosmids, as determined by in situ hybridization, fully correspond to the genetic locations at which the corresponding genes had previously been mapped, with one partial exception. Cosmid 70B4 hybridizes in situ to two primary sites at 14C and 19A, in addition to several secondary sites. The two ends of this cosmid are similar to the genes Cyclophilin-1 and no on or off transient A, and these genes have been found to be very closely linked at 14C (C. ZUKER, personal communication). A cloning artifact inserting a sequence from 19A between two linked loci is very improbable. Our interpretation of this cosmid is that its insert is wholly derived from 14C but that some of its internal sequences are duplicated elsewhere in the genome, most faithfully, or most extensively, at 19A. STS 144B11T is colinear to sequences from the gene PS1α over only 42 bp. The cosmid maps to 12A, while PS1α is reported to map in 11D7-E4 (WEHRLI et al. 1993).

Nine STSs with no similarity to any known gene of Drosophila are included in class I. Each showed a significant BLASTX score with a protein sequence in the SWISSPROT database. The BESTFIT comparisons between the conceptual translations of these STS sequences and the highest scoring matches from SWISSPROT are shown in Figure 1. The identities between pairs of sequences range from 31% to 62%; the similarities, allowing for conservative amino acid changes, range from 62% to 81%. These STSs presumably represent genes in Drosophila encoding similar proteins.

Class II STSs, which are the most frequent (495 STSs, 87.2% of the total), contain sequences that have not previously been described for Drosophila. For about four fifths of these (394/495), the BLASTN and BLASTX searches were negative, while for the rest (101/495) BLASTN scores were obtained indicating similarities, but with a low degree of confidence. Closer inspection of these 101 STSs revealed that the high similarity scores were due to the presence, in both the STS and matched sequences, of simple repeated motifs, such as (CCX)n, (GT)n, (AT)n or essentially homopolymeric runs (e.g., Tn). With a few exceptions, the repeat arrays were much shorter than the STS. These partially repetitive STSs should still be useful for the design of primers, if the simple sequence repeats are not included.

Finally, class III STSs contained sequences that are similar to known Drosophila repetitive sequences (40 cosmid ends, 7.0% of the total). These included 24 sequences (4.2% of the total) that were similar to the transposable elements copia, mdg1, F, 412, 297, jockey (= Sancho), gypsy, Doc, BS and 1360 (Table 3). Some

FIGURE 1. Protein sequence alignment between nine entries from SWISSPROT (top lines of each comparison) and nine translated D. melanogaster STS sequences (bottom lines of each comparison). The percent similarity and identity scores are indicated for each BESTFIT comparison. The in situ locations of the cosmids from which the STSs were determined are given in Table 2.

Comparisons shown in this part of the figure (SWISSPROT entry x translated STS):
LDLR_RABIT.Swiss x 110E6T.Pep: Similarity = 64%, Identity = 31%
PPCE_PIG.Swiss x 100B6T.Pep: Similarity = 76%, Identity = 43%
PRH1_SCHPO.Swiss x 11G9S.Pep: Similarity = 73%, Identity = 49%
KPBH_HUMAN.Swiss x 194B3T.Pep: Similarity = 67%, Identity = 53%
P2B1_RAT.Swiss x 134D6T.Pep: Similarity = 81%, Identity = 56%
SYFD_YEAST.Swiss x 47C4T.Pep: Similarity = 75%, Identity = 62%

LDLR_RABIT.Swiss, fragment of low density lipoprotein receptor precursor (Oryctolagus); KPBH_HUMAN.Swiss, gamma (catalytic) chain of testis phosphorylase B kinase (human); SYFD_YEAST.Swiss, beta chain of the cytoplasmic phenylalanine-tRNA synthetase (S. cerevisiae); PPCE_PIG.Swiss, prolyl endopeptidase (Sus scrofa); PRH1_SCHPO.Swiss, probable ATP-dependent RNA helicase PRH1 (S. pombe); P2B1_RAT.Swiss, protein phosphatase 2B catalytic subunit 1 (Rattus); ODB2_HUMAN.Swiss, lipoamide acyltransferase component (E2) (human); YAEP_YARLI.Swiss, hypothetical protein in alkaline extracellular protease 3' region (Y. lipolytica); MIPP_MOUSE.Swiss, IAP-promoted placenta-expressed protein (Mus).
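The identity scores quoted for the comparisons in Figure 1 can be illustrated with a minimal Python sketch. It assumes that identity is counted as the fraction of aligned, ungapped columns carrying the same residue; the exact conventions of BESTFIT are not described here and may differ. The function name `percent_identity` is hypothetical, and the two 13-residue strings are a short segment transcribed from the KPBH_HUMAN x 194B3T comparison, used only as example input.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns, ignoring columns with a gap.

    Assumption: both strings are rows of one pairwise alignment and
    therefore have equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    swissprot_segment = "LHANNIVHRDLKP"  # KPBH_HUMAN segment (top line)
    sts_segment = "IHAKSIVHRDLKP"        # translated 194B3T segment (bottom line)
    print(round(percent_identity(swissprot_segment, sts_segment), 1))
```

A similarity score of the kind reported in the figure would additionally count conservative substitutions, which requires a substitution table and is not attempted in this sketch.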

STSs may represent novel transposons. For example, 124B1S is similar to, but distinct from, a region of the ORF2 of the 297 (67% identity) and 412 (65% identity) elements, while 164B9T shows a considerable degree of similarity (63%) to gypsy. In both of these cases, the similarity extends to nonoverlapping segments of the elements' reverse transcriptase encoding regions.

FIGURE 1. Continued. Comparisons shown in this part of the figure (SWISSPROT entry x translated STS):
ODB2_HUMAN.Swiss x 143C1T.Pep: Similarity = 81%, Identity = 60%
YAEP_YARLI.Swiss x 25C7S.Pep: Similarity = 75%, Identity = 45%
MIPP_MOUSE.Swiss x 33A10T.Pep: a) Similarity = 62%, Identity = 38%; b) Similarity = 74%, Identity = 51%

Four sequences in class III, determined from contigs derived from different chromosomal locations, show no similarity to previously known transposons and, apparently, represent a novel repetitive element that we propose to name Lefka, after a well-known feature of western Crete. Two of these (199H11S and 63E7T) are very similar to each other over their entire length (94%) except for small gaps, while the other two (11G9T and 30D3S) show only partial overlaps over 57 bp. BLASTN searches showed that this sequence can be aligned with various nonrelated entries of the EMBL database, the similarity being restricted to non-gene-encoding segments. For example, 30D3S is almost 100% co-linear with sequences that are located downstream of the forked gene. Figure 2 shows the alignment of the sequences present in three of these STSs, as well as the BLASTN output of a search of the nucleic acid database using the corresponding segment derived from 30D3S.

In addition to these transposon-like sequences, 13 cosmid ends included in class III (from 11 cosmids) were homologous to ribosomal DNA or type I insertion sequences (R1 element). The type I sequence is an element found in the nucleolar organizer of the X chromosome, as well as at other sites (GLOVER 1981). Finally, class III includes two STSs that contain sequences similar to the 1.688 satellite DNA and one that is identical to a segment of the long H1-H3 histone spacer. It should be noted that our in situ hybridization data suggest that most of the rDNA-homologous STSs, as well as the one bearing the H1-H3 spacer, are derived from "euchromatic" parts of the X chromosome rather than from the nucleolar organizer or the cytogenetic locus 39D (where the histone genes are located), respectively.

Further analysis and significance of the STSs: The first 197 STSs were translated in all six frames and subjected to FASTA and TFASTA searches against the PIR and EMBL databases, respectively; the SWISSPROT database was searched using the BLASTX programme. Only two new hits were found to have a high degree of confidence. These were 49F4S, similar to the human EST HSXT00891, and 38H5T, similar to the Drosophila genes su(Hw) and odd-skipped. Further two-way analysis between this last STS and the two Drosophila genes revealed that the similarity is restricted to a small fragment that carries an incomplete zinc finger motif. Because of the limited returns and CPU intensity of the protein searches, for the remainder of the STSs the nucleic acid sequence database searches were only supplemented by BLASTX. The BLASTX searches verified all of the results obtained from scanning the EMBL database. The BLAST searches of all STSs are repeated within the NCBI's dbSTS database and are recorded within the dbSTS records.

The determination of DNA sequences (STS) from cytologically determined sites is a valuable addition to the Drosophila genome mapping project. Over 90% of the STSs include at least some sequences that are apparently unique and are, therefore, useful. They can be used not only for the recovery of cloned DNA sequences from any genomic library by PCR but also for mapping projects, as was recently demonstrated for the construction of the physical map of the human 21q and Y chromosomes (CHUMAKOV et al. 1992; FOOTE et al. 1992). The STSs that harbor stretches of simple repeats may also be useful in providing polymorphic markers based on the number of repeats in different strains of Drosophila.

The STSs corresponding to previously sequenced Drosophila genes are valuable for the mapping project, because they represent secure links between the physical and genetic maps. Their relatively high frequency

TABLE 3

Cosmid clones with STS sequences that match repetitive sequences from D. melanogaster

Element                          Cosmid   In situ site(s)
297 element                      10D9     19A, 34A, dispersed repeats; chromocenter
                                 124B1    18C, few dispersed repeats; chromocenter
                                 125A12   8D; dispersed repeats; chromocenter
                                 30D3     4C, 3C; dispersed repeats; chromocenter
412 element                      142A8    5B/U
                                 10H9     3E/U
                                 5G4      ND
17.6 element                     147H5    Dispersed repeats; chromocenter
                                 66E7     25C/U
                                 88F5     8D; chromocenter
gypsy element                    164B9    1D/U
Doc element                      136H3    20A-C; dispersed repeats; chromocenter
                                 72A3     10A/U
1360 element                     164B9    1D/U
                                 34C6     9A, 16B, dispersed repeats; chromocenter
Bari-1 element                   26B1     56F, 7D; dispersed repeats; chromocenter
copia element                    61E7     Chromocenter
                                 27H8     100B/U
Lefka element                    63E7     19F; dispersed repeats; chromocenter
                                 199H11   20A-C/U
                                 11G9     10A/U
                                 30D3     3C, 4C; dispersed repeats; chromocenter
mdg1 element                     64G2     Dispersed repeats; chromocenter
BS element                       69G7     1D, 1E; dispersed repeats; chromocenter
roo element                      7B3      25A/U
F element                        46E1     3B, 3C, 3D-E; dispersed repeats; chromocenter
R1 rDNA insertion element        125H11   19F-20A; dispersed repeats; chromocenter
                                 120H6    21E; dispersed repeats; chromocenter
                                 7664     1E; chromocenter
                                 26C12    3C; dispersed repeats; chromocenter
                                 93H7     3B-C/U
rDNA                             97H10    5C, 64B; dispersed repeats; chromocenter
                                 99F1     9A-B; dispersed repeats
                                 17E12    X, 2R tips; 82E
                                 120H6    21E; dispersed repeats; chromocenter
                                 10H9     3E/U
                                 52H6     3E; chromocenter
                                 46E1     3B, 3C, 3D-E; dispersed repeats; chromocenter
                                 5G4      ND
Histone spacer sequences         177B1    82A, 19F; dispersed repeats; chromocenter
Su(Ste) sequences                34C6     9A, 16B; dispersed repeats; chromocenter
1.688 satellite-like sequences   58D8     Dispersed repeats; chromocenter
                                 66D1     3D/U

(5.8%) presumably reflects a combination of the density of genes in the Drosophila genome and the proportion of genes that have already been sequenced. Finally, additional STSs reveal several previously unknown Drosophila genes.

Conclusions and prospects: This paper represents a significant advance in the physical mapping of the D. melanogaster genome, as it extends our previous study of the tip of the X (SIDEN-KIAMOS et al. 1990) to the entire euchromatic chromosome. As much as two thirds of the euchromatin of this chromosome is now available in the form of nearly 900 mapped cosmids, of which over 700 have been linked into contigs. The X chromosome has also been covered by a dense array of STS sequences (on average one every 35 kb of the euchromatic genome).

The degree of cosmid coverage is satisfactory, in that most genomic sequences will either be represented in the physical map or only a short chromosome walk away from mapped cosmids and STSs. A completely contiguous map of the chromosome would be desirable but will require contig linkage using clones of larger size, in either P1 phage or YAC vectors (COULSON et al. 1988). More recently, considerable progress has been made in the contig mapping of the major autosome arms of D. melanogaster (I. SIDEN-KIAMOS, C. LOUIS, C. SAVAKIS, M. ASHBURNER, D. GLOVER, R. D. C. SAUNDERS, J. MODOLELL and F. C. KAFATOS, unpublished data), with over 1500 cosmids assigned to contigs and over 1693 mapped to primary sites on the autosomes by in situ hybridization. The data are already publicly available from FlyBase (ftp.bio.indiana.edu:/flybase/clones/cosmids.txt). Together with the other Drosophila physical mapping projects now underway, we can look forward within a very few years to an essentially complete representation of the genome of D. melanogaster in characterized clones and a dense array of STS landmarks. It is our hope that these will be used to facilitate research in many fields of biology and, in particular, efforts to sequence the entire genome of this interesting fly.

11G9T      AATGCTATAGTCGAGTTCCCCGACTATGAGATACCCTTTACTCAGCTAG%WAM4G
30D3S      AATGCTATAGTCGAGTTCCCCAACTATC!ACATACCCGWACTCAGCTMW~%
63E7T      AATGCTATAGTCGAGTTCCTCGACTATCAGATACCGCTTACT!L’GGCTAG~TG
identities ******************* * ***** ***** ***** **** ****** *

Sequences producing high-scoring segment pairs          High    Smallest Poisson
                                                         Score   Probability
DSOVERLAB   D. simulans su(f) gene                       235     9.4e-25
DMTG124     D. melanogaster tRNA gene cluster            249     1.0e-13
DBDNABPA    D. melanogaster DNA-binding protein          249     1.0e-13
-11         D. melanogaster genes for tRNA6              249     1.0e-13
DMRDOC      D. melanogaster rdgC gene                    204     5.6e-10
DMFORKEDA   D. melanogaster putative f gene              195     3.2e-09
DMSUPF      D. melanogaster su(f) gene                   163     1.5e-06
DMINTBETN   D. melanogaster integrin beta subunit        159     3.1e-06
DSINTERSP   D. simulans DNA sequence                     152     1.2e-05

FIGURE 2.-Alignments of three STSs that define the Lefka element, a novel repetitive element of D. melanogaster. Below the alignment are the highest scoring matches from a screen of the EMBL nucleic acid sequence database with the 30D3S STS sequence. All of the matches are to noncoding regions.

We thank Dr. DOELZ for performing the FASTA and TFASTA searches in the initial phase of the work and Dr. H. LEHRACH and Dr. J. HOHEISEL for use of the ICRF robot facility. We also thank Dr. C. TOLSTOSHEV of the NCBI and Dr. R. FUCHS of the EMBL for help with the database submissions. We are also indebted to our numerous colleagues who gave us information concerning cosmids sent to them. This work was supported by grants from the European Communities to all participating laboratories (ST2P-0477-C and SC1*-CT92-0787) and by an institutional grant from Fundación Ramón Areces to the Centro de Biología Molecular (Madrid).

LITERATURE CITED

AJIOKA, J. W., D. A. SMOLLER, K. W. JONES, J. P. CARULLI, A. E. C. VELLEK et al., 1991 Drosophila genome project: one-hit coverage in yeast artificial chromosomes. Chromosoma 100: 495-509.
ALTSCHUL, S. F., W. GISH, W. MILLER, E. W. MYERS and D. J. LIPMAN, 1990 Basic local alignment search tool. J. Mol. Biol. 215: 403-410.
ASHBURNER, M., 1989 Drosophila, A Laboratory Handbook. Cold Spring Harbor Press, Cold Spring Harbor, NY.
BRIDGES, C. M., 1935 Salivary chromosome maps with a key to the banding of the chromosomes of Drosophila melanogaster. J. Hered. 26: 60-64.
CHUMAKOV, I., P. RIGAULT, S. GUILLOU, P. OUGEN, A. BILLAUT et al., 1992 Continuum of overlapping clones spanning the entire human chromosome 21q. Nature 359: 380-387.
CHURCH, G., and W. GILBERT, 1984 Genome sequencing. Proc. Natl. Acad. Sci. USA 81: 1991-1995.
COULSON, A., J. SULSTON, S. BRENNER and J. KARN, 1986 Towards a physical map of the genome of the nematode Caenorhabditis elegans. Proc. Natl. Acad. Sci. USA 83: 7821-7825.
COULSON, A., R. WATERSTON, J. KIFF, J. SULSTON and Y. KOHARA, 1988 Genome linking with yeast artificial chromosomes. Nature 335: 184-186.
CRAXTON, M., 1991 Linear amplification sequencing, a powerful method for sequencing DNA. Methods 3: 20-26.
DEVEREUX, J., P. HAEBERLI and O. SMITHIES, 1984 A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12: 387-395.
FLYBASE, 1994 The Drosophila database. Available from the World Wide Web at the URL http://morgan.harvard.edu
FOOTE, S., D. VOLLRATH, A. HILTON and D. C. PAGE, 1992 The human Y chromosome: overlapping DNA clones spanning the euchromatic regions. Science 258: 60-66.
GARZA, D., J. W. AJIOKA, D. T. BURKE and D. L. HARTL, 1989 Mapping the Drosophila genome with yeast artificial chromosomes. Science 246: 641-646.
GIBSON, T. J., A. ROSENTHAL and R. H. WATERSTON, 1987 Lorist6, a cosmid vector with BamHI, NotI, ScaI and HindIII cloning sites and altered neomycin phosphotransferase gene expression. Gene 53: 283-286.
GLOVER, D. M., 1981 The rDNA of Drosophila melanogaster. Cell 26: 297-298.
HARTL, D., and E. R. LOZOVSKAYA, 1992 The Drosophila genome project: current status of the physical map. Comp. Biochem. Physiol. 103B: 1-8.
HOHEISEL, J. D., G. L. LENNON, G. ZEHETNER and H. LEHRACH, 1991 Use of high coverage reference libraries of Drosophila melanogaster for relational data analysis. J. Mol. Biol. 220: 903-914.
KAFATOS, F. C., C. LOUIS, C. SAVAKIS, D. M. GLOVER, M. ASHBURNER et al., 1991 Integrated maps of the Drosophila genome: progress and prospects. Trends Genet. 7: 155-161.
KRIMBAS, C., and J. POWELL (Editors), 1992 Drosophila Inversion Polymorphism. CRC Press, Boca Raton, FL.
LEHRACH, H., R. DRMANAC, J. HOHEISEL, Z. LARIN, G. LENNON et al., 1990 Hybridization fingerprinting in genome mapping and sequencing, pp. 39-81 in Genetic and Physical Mapping, Vol. 1, Genome Analysis, edited by K. E. DAVIS and S. M. TILGHMAN. Cold Spring Harbor Press, Cold Spring Harbor, NY.
MERRIAM, J., M. ASHBURNER, D. L. HARTL and F. C. KAFATOS, 1991 Progress towards cloning and mapping the Drosophila melanogaster genome. Science 254: 221-225.
O'KANE, C., and W. J. GEHRING, 1987 Detection in situ of genomic regulatory elements in Drosophila. Proc. Natl. Acad. Sci. USA 84: 9123-9127.
OLSON, M., L. HOOD, C. CANTOR and D. BOTSTEIN, 1989 A common language for physical mapping of the human genome. Science 245: 1434-1435.
POWELL, J., and R. DE SALLE, 1995 Drosophila molecular phylogenies and their uses. Evol. Biol. 28: 87-138.
SAMBROOK, J., E. F. FRITSCH and T. MANIATIS, 1989 Molecular Cloning. Cold Spring Harbor Press, Cold Spring Harbor, NY.
SAUNDERS, R. D. C., D. M. GLOVER, M. ASHBURNER, I. SIDEN-KIAMOS, C. LOUIS et al., 1989 PCR amplification of DNA microdissected from a single polytene band: a comparison with conventional microcloning. Nucleic Acids Res. 17: 9027-9037.

SIDEN-KIAMOS, I., R. D. C. SAUNDERS, L. SPANOS, T. MAJERUS, J. TRENEAR et al., 1990 Towards a physical map of the Drosophila melanogaster genome: mapping of cosmid clones within defined genomic divisions. Nucleic Acids Res. 18: 6261-6270.
SMOLLER, D. A., D. PETROV and D. L. HARTL, 1991 Characterization of bacteriophage P1 library containing inserts of Drosophila DNA of 75-100 kilobase pairs. Chromosoma 100: 487-494.
SORSA, V., 1988 Chromosome Maps of Drosophila. CRC Press, Boca Raton, FL.
SULSTON, J., F. MALLETT, R. STADEN, R. DURBIN, T. HORSNELL et al., 1988 Software for genome mapping by fingerprinting techniques. Comput. Appl. Biosci. 4: 125.
WEHRLI, M., A. DIANTONIO, I. M. FEARNLEY, R. J. SMITH and M. WILCOX, 1993 Cloning and characterization of alpha PS1, a novel Drosophila melanogaster integrin. Mech. Dev. 43: 21-36.
WOOD, W. I., J. GITSCHIER, L. A. LASKY and R. M. LAWN, 1985 Base composition-independent hybridization in tetramethylammonium chloride: a method for oligonucleotide screening of highly complex gene libraries. Proc. Natl. Acad. Sci. USA 81: 1585-1588.
ZHENG, L., R. D. C. SAUNDERS, D. FORTINI, A. DELLA TORRE, M. COLUZZI et al., 1991 Low resolution map of the malarial mosquito Anopheles gambiae. Proc. Natl. Acad. Sci. USA 88: 11187-11191.

Communicating editor: V. G. FINNERTY
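The pairwise comparison of the Lefka STSs summarized in Figure 2 can be illustrated with a short script. The following is a minimal sketch, not the procedure used in this study: the function names are hypothetical, and the input is limited to the first 40 alignment columns of 11G9T and 63E7T as they appear in Figure 2, so the value it reports describes that excerpt only and is not a figure quoted in the text.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns carrying the same base in both sequences.

    Columns in which either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def mismatch_columns(a: str, b: str) -> list:
    """1-based alignment columns at which the two sequences differ."""
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


# Assumption: first 40 columns of the Figure 2 alignment, copied as printed.
STS_11G9T = "AATGCTATAGTCGAGTTCCCCGACTATGAGATACCCTTTA"
STS_63E7T = "AATGCTATAGTCGAGTTCCTCGACTATCAGATACCGCTTA"

if __name__ == "__main__":
    print(percent_identity(STS_11G9T, STS_63E7T))
    print(mismatch_columns(STS_11G9T, STS_63E7T))
```

Skipping gapped columns is one possible convention; counting gaps as mismatches would lower the value for alignments that contain them.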

APPENDIX

1A 37 23E12 1A/U S 1c 2 158H9 1D1.2/U S.T 100H2 150E12 14 125H1014 1A 184A10 1D 12788 1B/U 26B3 IC 1B 9 70B12 1B/U 44D2 171Dll 109H7 ICD/U 96H4 40F6 lC/U 180D6 1BI4/U ac 77G2 lD/U 180Cll 133C8 1D1.2/U 165H7 lBl4/U ase 166G9 175G11 1B14/U ase 146H10 109G12 1B1-4,20 ase 34F3 7 102C3 lB7-10/U elav S,T 81G10 153H8 1c 82Hll 136G8 1B/U elav 173 192Dll 99F11 IB/U elav 55G8 1CD 14265 1B/U U 155B1 IC 65F1 1B/U 1D 4 125E5 46C11 1B elav S,T 24F3 1EF/U 63B3 29F10 1E/U 5 3 llE3 lB3-10/U 118E1 1D/U 4787 10 55F7 102All 105C8 16C9 lB3-10/U 76C4 1E 97E2 lB/U 99F6 lD/U 11F6 1B/U 150B5 1E45/U 129E1 191C3 151F8 162E6 1D/U 88B3 1B/U 182E10 1D 117B4 11 197A5 1D 118B3 1C 164B9 1D/U 146F4 1E 5 67A3 1E,39 131H5 53F6 17E12 1B,60F,82E 141H8 57F12 1BC/U 144A7 1F/U 5 103A3 94G12 112H3 78H5 1EF/U 84C5 73B7 IF-PA/U 112C6 lB9-10/U 33C11 142A1 IC/U 57F1 1F/U 49F4 1D,3A 6 78E1 82D12 1C/U 190A7 115C2 lB10-14/U 192A1 1E 165D1 lB10-14/U 35F1 1E/U u 122H9 1BC 12 43F5


1E 185Bll 1E 1EF/U 2A 52C10 2A,58F 69D5 1 EF u 123B12 2A u 76B2 1E 2B 15 147C9 2B1-10 55G7 lEF,18C,58F,60 87G9 2B/U 46B2 1E,97DE 85F12 63D2 1E 59C12 2B1-8 1F 1 26G3 lE3.41F/U 87A1 2B1-10 190F7 1F 34E5 2B1-10 19F10 1F/U 162G1 8D8 1F/U 1oc7 2B1-4/U 48A10 IF 183Bll 2B1-3/U 195F10 1F/U 191D10 2B1-10/U 198D4 1F,2B 103A4 154A5 47D8 80A9 147A6 95A2 1F/U 4E6 2B14U 132E8 1F/U 79G8 2B1-10 67E6 2A/U 92G5 5468 3367 2B1-10/U 66G10 2A/U 118G1 64D12 2A/U 80H7 2B 107B11 1F-2A/U 36E4 2B1-5/U 66G4 2A/U 88B2 2B5-6/U 4964 1 F-2A/U 171A5 16E12 1F-2A 18 137E7 2A 16 154H3 2A/U 71D4 136H9 82B9 2B 52Hll 2B1-10/U 30B7 2B5-8 50C4 2B1-10/U 17E4 2B47/U 58A11 48D3 2B/U 42H4 20B3 2Bll-18 71D7 153c9 2B11-18/U 33G8 112~5 2B5-8/U 133D10 54G7 4F1 2B1-10/U 131F2 2B7-8/U 86A3 2B1-8/U 20H 1 2B11-18/U 79H6 2B7-8/U 76D5 15H6 2B1-10/U 129E12 2B9-16/U 9E2 2B16,13EF,65B 152A10 9D2 2B1-10/U 145G5 19F2 2B1-10/U 63B12 2B10-14,U 138D4 2B48/U 60E8 98F7 2B15/U 19C523 2CD 50G5 86E4 73D1 38H 12 2B10-14/U 10B12 2B1-10/U 11IF12 49E11 2B1-10/U 39E1 2B1318/U 190C3 82H7 17 115D8 2B1-10/U 81C459 2B10-18 123F11 2B1-10/U 133E12 -2c/u 78B8 2B1-10/U 133 20D1 2B38 45B7 199H8 2B1-10/U 70B2 2A/U u 81D4 2B11-18/U 56F3 2B1-5 122Ell 2B1-5,19CD 83G4 2B1-5/U 67C7 2B10-18/U 17A10 2B1-10/U 36D1 2B1-5/U 121c1 58C11 2B1-5 57812 2B1-10 252C 121B10 1 7A9 2B1-6/U 199E7 2CD/U 171E4 172F1 2D/U 25D2 2Bl-lO/U 93 67A9 2C1-4/U 171A7 2B1-10 22E5 2C410/U 145191D12 u 81E3 2C,83A


2E 20 62D9 2F/U 3D 10H9 3E [Sl,[Tl 103B4 2EF/U u 85E1 3DE 41C12 343E 155B10 3E/U 152A3 2E/U 57F3 3F 28C2 99D2 3E 2D5 2F,70EF,97D 170B5 3E 5 87B1 35 66A1 3E/U 62C3 2F-3A/U 117G1 4A 5 134194D5 u 155B5 3E 184C3 2EF/U 147Bll 3E u 176D6 2EF/U 3F u 30G8 3F-4A 2F 21 30B8 2F/U 4A u 53C7 4A 94D9 2F,12D 8483 4Al3,13B,94A 192G4 2F/U 28E10 -4,86EF 25E8 2F/U 4C 36 70D2 4C 196H12 2F/U 84H4 4c/u 17E2 2F/mult 6D8 83D1 2F/U 162F6 176E6 157D4 184C2 136C10 4c u 75F9 2F u 30D3 4c 3A 17594D5 3A/U 66C1 4c 16G10 69D 1 4C,6DE,19DE,SOAB u 45F2 3A,39 4D 38 34H6 4D 3B 28 lOOGl0 3C1-3/U 143E12 60G6 3B 28D9 -4 23A2 3C1.2 65G11 4D/U 5 102C5 4E 49 llB8 4E 65H5 3B-3C1.2/U 19F8 5A 95B7 3C/U 117 138A9 4E/mult 31 74A6 3B/U 121Gll 4E/U 182Bll 139 127B10 32 93H7 3BC/U lBll 75G10 31C4 4E/U 140E9 u lOBll 4EF/U 155E2 3c/u 127B6 4E u 100G7 3B 4F 42 189G10 3C 68G826 3C/U 68C5 61F11 3C 70C2 4F,5A 140Gll 3C1.2/U u 123Dll 4F/U 24H7 194D9 4F,8D 149H2 3C7.8/U 5A 47 87F2 5A 21H10 71D8 5A 16482 3C2-6/U 116E448 5A 163A10 103Fll 5B 33 96G10 3C5-7,2B1-10 u 106F2 5 38B10 3c/u 84A6 -5A 29 66D129 3D/U 67E10 5AB/u 61F10 3c 4E7 -5 152F11 3D/U 163C8 -5 58G5 3D 5B 40 81A4 5B u 20Fll 3C/U 47E12 92A3 3C/U 12A12 5B/mult 131D1 3C/U u 15A6 5B 26C12 3c 103G8 5B 140G12 2OAc,3C 142A8 5B/U 3D 27 114E2 3DE 63G741 5C 5BC 96F12 34F11 37F4 3E 15B1 5c/u 60D 10 99F7 5G4 29D12 5c 132A3 173E9 105H9 102E3 -5EF/U


5c 43 105D2 5c/u 7D 67 48E10 26D4 5c 114H4 7D/U 115H9 41G4 14E8 143C12 7D/U 36A1 5c 53E3 116D7 71 47C4 7D (T) 5F2 5c 156F8 97H10 5C,-64B Dl 71B7 7D U 106G1 5c 89F8 7D 123A4 5C,22A 200F9 7D S,T 5D 46 66C2 5D/U 49G868 7D,66AB,82E otu 62C6 5D/U 40G3 7DE otu IllDll 5A? 8883 7E 44 132F8 u 78B6 7D,29EF,62B,70D 40D3 5D u 190F5 7D/U dec-1 37C3 5D/U 7E 66 36G10 79E10 154G11 U 84F7 5D 14D5 5E 45 199A6 5A? 97G10 7E/U 143G11 5D/U 134A4 179A4 44C2 105A2 -5EF/U 166H8 -7EF/U 38C8 5DE/U 29F5 136G3 5c 163B4 U 9885 -5EF 89E2 7DE 6A 62 186H4 -6AC 44G3 173C3 6BC/U 69 138B6 U 139A3 6A 80Gll 7F IllHl 6A 132F4 7E 6B U 114C4 -6B/U 16962 7EF 131A5 -6B/U 168F470 6C 61 124A5 79A12 83A4 -6C/U 191F7 114A5 48C2 78E9 32B8 108B9 25C12 7E/mult U 170Hll 6C 1B2 - 7E 6D 60 52A5 6D u 173D3 7E,27CD 26E7 78C8 7E 141D5 133F473 8B U 113C9 6D,66B 112A7 -8B?U 6F 63 5H6 135G2 60H5 6F-7A/U 57D 1 148 125A5 6F-7A/U 5c4 -8C?U 42C4 153H12 8D 7A U 128H8 7AB,39AB 125A12 8D 7B 159 36C9 8C u 143G2 8CD 192B4 8D 74 190E8 8D/U 144H1 7B 88F5 8D 65 107B5 7BC/U 33A10 8D8-12 149Bll 42C2 8E,26C,88C 7A4 7C,65AB 75 140B10 8D,13E,87B 143H10 164F6 8D/U U 52D2 7B/U 58E4 110H5 7B/U u llOE6 - 8D 7D 64 3B11 7D 194H272 8E SEF, 14C 188G11 181B5 60G8 16C8 80H4 14A12 T 109D1 145F9 161H12 27H3 47E2 -7B? 125E2 8EF 22H7 7D1-3/U 134C9 8E


8E 61H5 11C 8F8 11c u 34D3 8E/U 69E2 11C 87C6 8E14/U 77F1 11C 133612 8E 11D 88 16A12 11D T 48D 12 8E 178E5 11D/U 8F u 39A1 8F,17A,26A 108Fll 11EF/U 9A 8232Dll 9A/U 144Bll 12A/U Psla (T) 99F1 9AB u 55G6 11D,60E u 7288 9A 11E u ll0A5 11E 132C3 9A 135Fll 11E 100B6 9A 151F1 11E 34C6 9A 160Gll 11E/U 9AB 15F5 11EF 121A3 Ppl-p 9c 11F u 146F3 11F/U 9B u 150B2 9B,86E 12A 87 171A8 -12E/U YP3 S 9C 100C576 YC/mult 48F5 24E6 9c 7c5 12DE/U 9D u 152F6 9DE,38CD,95F 22F12 136B7 9D 167A4 9F u 36A5 SF-lOA/U 189B8 12AB,45A S,T 10A 30 76A7 45F11 12C YP3 S,T 11G9 lOAl.2/U u 36F3 12A,31C,92A 34G9 10A/U 12B 92 36G4 79 72A3 10A/U 94D8 12B/U 84C6 29E9 163150D1 1OA/U 17G3 12B/U lllF8 141B7 10B u 121H2 1OB 2387 10c 75D 1 10C 110D5 12B/U 80A1 10C u 34B5 12B 1OD u 62B12 10D,71A 12C 89 194.48 194B3 10D/U 176A8 62C12 lOD/U 189C4 12D/U 10E u 48H8 10E/U 174Ell 10F 83 176C8 41 G3 175F2 10F/U 17H2 12C-12D 16B10 1OF 186H1 12D/U 14E6 10F/U 22G11 - 12F 122D7 46F1 126D12 95 145B9 11A 58C1084 188C6 38B7 11A 47D4 u 66F8 11AB 50E1 12CD 89D3 11A 14E5 174Hll 11A 12D 90 65G4 SOH6 11A 61E5 11B 86 36A7 25H11 144D12 11B 181F7 12D 46B10 11B 174151Dll 24 43C6 1lBC/U 107H5 12DE,57E 61A7 11BC/U u 30H2 -12D,34BC,-38EF 167B1 11B/U 83A12 12DE u 126E5 11BC,43A 182C5 12D,60E 87B12 11B 191A7 12D/U 11c 85 24B12 11B? 12E 8 127E10 12EF 98G1 llCD/U 1 J9F12 l0lGll 91 154C12 12E/U 83E3 158C9 123Hll 29A5 49E12 11c 26D3 107D4 80G4 12E8/U 1SA7 11D 65G10 u 109E10 1lC/U 45D10


33 - 12E 26C1 13D 98A4 - 13D 158B2 14A 105 3B9 14.4 exd (T) 144D2 12F/U 176A4 14AB/U S,T 137B8 12F-l3A/U 150Bll 194F10 12F/U 17H10 160 129A8 41F5 140B2 12EF 52A11 1648G12 10oF9 5869 12E/U 109A1 12F u 33B2 12F 73H12 14AB S,T 13A 35E694 rut 74H6 48G9 143168D9 M?b 2A1 29H7 14A,85D 171H6 145Cll M?b 184C7 160E7 MYb 151E9 13A/U S 142E2 shi 147E3 17G6 shi 165156B3 u 87G7 14A/u 186D5 13A/U S,T 30E5 14A 90F4 13D 44H2 14A u 165E9 13A/U 53D12 14A 198H1 13A 14C 104 60Gll 14CD 61Cll 13,49AB 156C11 13B 9613B 8C2 13D/U sd 17B2 14C 41F1 65B4 14CD/U 72D3 13B,39CD 36D4 128Bll 106 176A2 14C,99A,59EF nonA 66BI 1 - 13C/U 36F12 14DE 14C1 126A2 - 14C 99 127A5 10791E3 143C 1 13B/U S.T 153H10 157A9 196B1 15H1 13B/U 7688 65A3 128B10 66B12 86D9 176H1 13BC/U S,T 94A9 14DE/U u 78F9 13B 192D9 14F/U 13C 100 192F5 13B/U. S,T 159C5 14C 146812 13C/U Pp"C u 70B4 14C,19A 134E7 Pp"C 50C8 14C,44A 16B5 13C/U 14D 103 18F10 14EF 45B5 Ppl-l?c 175E5 169B10 41A10 98 189A12 sd 194G1 13B2 13C/U 38C6 14E 18G8 sd 11A4 14EF 78G2 sd 41G7 16D7 13D/U 133H1 32D10 sd 134D6 14DF 40B 1 sd u llA6 14D u 15E1 - 13CD,86A 183E9 -14D/U 81Fll 13CD,62F 14F 167113H12 14F 18962 13CD 120G2 13D 81C697 13E u 143D3 14F-l5A/U 47H9 - 13D/U 15A 101 148Hll - 15AB/U 144A2 136D4 15AB/U 12F4 13E 168 168F2 81D7 177G3 15A,86C 120B3 u 149D5 15A 176C7 13E/U 15B 109 26F5 15B/U 102 149C1 13D/U 14H10 15DE 150B6 13E/U 72F8 u 27A2 13D 105A12 15DE



15B u 108G7 15BC/U 17D 119 3885 17D,86E 15D u 102G12 15DE/U 25E9 17D/U 15E 108 20Hll 15EF,19 147G8 -17D,43C,39 40C1 15EF 121 174E3 6H7 108H 1 17D/U 110H3 15F-16A U 60A11 17D 88C7 15D 17E 170 129D9 17EF/U 176G5 2986 18A/U 194F2 15F/U 63C 12 18A,83C 112 189B1 15EF/U U 44H6 17EF? 189F7 15F 58E9 17E,58D u 57A3 15EF/U 96D 10 17E/U 15F 149E10110 17F U 12H3 17F-1SA 96E8 -15F 62A3 17F,45A 74D3 16A 18A 115 117G7 193Cll 15F-16A 142A6 19B,18B 200C5 16CD/U 117A4 18AB/U 168D7 16B 133G5 65E4 124 29G11 18BC/U u 107Hll 15F 148D10 16A u 86C3 16A 159D4 18A/U 16C11356F9 16C/U 173E8 142E4 118E8 99E1 62A4 18B/U 45D1 U 144F7 -18,29D 96C3 16C/U 18B 19 142G8 18C u 12C3 16C,50A 60D7 18BC/U 16D 111 117C8 171 9c9 18B/U 81C10 16D4.5 S.T 7A5 18C/U 44D4 16F/U 142H9 u 106E5 16DE 18C 125 114D12 16E 116 106H1 16EF 60H9 18C/U 109F9 16F S.T 171Gll 109F10 48A3 18C/U 27B3 17AB/U 97F3 18C 83C11 16EF/U 108C8 18C/U 114E10 16F/U 166 61H8 18C,88B? u 199E12 16EF/U 6484 200E1 16EF/U 143E8 117E7 16E,27B U 97D7 18C 16F 114 5382 16F 6H5 18C,65EF 198B1 16F/U 103C1 18C u 185C3 16F/U 124B1 18C 17A 169 8387 -17A 187E12 18C 194H9 17B 136C6 18C u 61H2 17A,43C,38EF 120E1 18C 81D2 17A,99DE 18D 120 llFll 18D/U 17B u 48A9 17B,54F 174G5 17C 118 llG6 - 17CD/U 9E11 16A3 173Hll 18D/U 103B2 17CD 47C12 126F7 9G6 I19H5 17DE 41D9 14683 17B? 27B12 96E10 17C-17D 155F7 18CD/U 85D7 17DE 180H4 IlAll 17F/U 19A10 191G5 17CE/U 94E9 18D/U 181H6 17D-E 122 15D12 18D/U 37c7 99H7 u 37C10 17C/U 11H6 18D 60B1 1 17C,17D,18F 43812 140H1 -17C U 125F12 18D/U


8D 156M18D 18D/U 19B 58G1 19B 138B7 18D 118C11 19B 18F126 68Hll 19D u 143G1 19D S 26F9 18F4.5 19E 128 106F5 19E 13D5 16389 19E/U S,T 158E6 18F4.5/U 19F1 u 15D4 - 18F 129 66F6 19EF 42A4 18F 1OlGl 143D6 18F4.5 153F10 2OAC/U 19A 123 4C9 19B 115G9 20AC 1OD9 19A,34A 194G6 81C2 u 198A3 19E 128D7 19F 51 63E7 19F u 100G4 19AB 199Hll 2OAC/U 139G12 - 19A 18G9 139E12 19A u 125Hll 19F-20A 19B 12734H5 19B 78D4 19F/U 14D12 19c 20A 1306H2 20 172 3286 163A11 20D/U sum 15E10 19B 174F6 2OAC?U suv) S,T u 122G1 19B S,T 181H12 sum 176F7 19BC shukB u 146B9 20A/U 25C7 19BC 20D u 130Gll 20D

Cosmids mapped to the X chromosome of D. melanogaster are ordered by polytene chromosome subdivision and grouped by arbitrarily numbered contig. Cosmids marked “u” are so far unattached to any other. The “cytology” column shows the in situ hybridization data for cosmids analyzed by this means (a blank in this column means not done). The /U qualifier indicates that the in situ signal was unique. In situ sites not marked /U are only those mapped. In addition, these cosmids may hybridize to other sites, including the chromocenter. The “locus” column shows the correspondence between cosmids and known genes of D. melanogaster, determined by a variety of methods (see Table 1). An entry in the STS column indicates that an STS sequence has been determined from the SP6 (S) or T7 (T) end of the cosmid insert. Class I STSs (see text) are enclosed in parentheses, Class II are unbracketed, and Class III are in square brackets. For availability of cosmids see MATERIALS AND METHODS.
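The notation described in this legend can be read mechanically. The sketch below is an illustration only, assuming clean entries of the kind the legend describes; the names `Cytology`, `parse_cytology` and `parse_sts` are hypothetical and are not part of any FlyBase or project software. It separates the /U qualifier from the listed in situ sites and assigns each STS entry its cosmid end (SP6 or T7) and class (I, II or III).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cytology:
    sites: List[str]  # in situ sites as printed, e.g. ["7D", "66AB", "82E"]
    unique: bool      # True when the entry carries the /U qualifier


def parse_cytology(entry: str) -> Cytology:
    """Split a "cytology" entry into its sites and its /U (unique) flag."""
    text = entry.strip()
    unique = text.endswith("/U")
    if unique:
        text = text[:-2]
    sites = [s.strip() for s in text.split(",") if s.strip()]
    return Cytology(sites=sites, unique=unique)


# S and T denote the SP6 and T7 ends of the cosmid insert (see legend).
END_NAMES = {"S": "SP6", "T": "T7"}


def parse_sts(entry: str) -> List[Tuple[str, str]]:
    """Return (cosmid end, STS class) pairs for an STS-column entry."""
    result = []
    for token in entry.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("(") and token.endswith(")"):
            sts_class, end = "I", token[1:-1]      # parentheses: Class I
        elif token.startswith("[") and token.endswith("]"):
            sts_class, end = "III", token[1:-1]    # square brackets: Class III
        else:
            sts_class, end = "II", token           # unbracketed: Class II
        if end not in END_NAMES:
            raise ValueError(f"unrecognized STS entry: {token!r}")
        result.append((END_NAMES[end], sts_class))
    return result


if __name__ == "__main__":
    print(parse_cytology("2B1-10/U"))
    print(parse_cytology("7D,66AB,82E"))
    print(parse_sts("[S],[T]"))
```

Entries that do not follow the legend exactly are either passed through unchanged (cytology) or rejected with an error (STS), so that damaged records are not silently reinterpreted.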