Proc. Natl. Acad. Sci. USA
Vol. 87, pp. 5916-5920, August 1990
Genetics

Structural and functional comparisons of the Drosophila virilis and Drosophila melanogaster rough genes
(homeobox/eye development/evolution)

ULRIKE HEBERLEIN AND GERALD M. RUBIN

Howard Hughes Medical Institute and Department of Molecular and Cell Biology, University of California, Berkeley, CA 94720

Contributed by Gerald M. Rubin, May 23, 1990

ABSTRACT

We have isolated the homeobox gene rough (ro) from Drosophila virilis. Comparison of the predicted amino acid sequences of the D. melanogaster and D. virilis rough proteins reveals that domains of high conservation, including the homeodomain, are interspersed with highly diverged regions. Stretches of significant sequence conservation are also observed in the 5' promoter region and in the introns. The D. virilis rough gene rescues the rough mutant phenotype and is properly regulated when introduced into the D. melanogaster genome. Thus the rough protein as well as the cis-regulatory elements that ensure proper temporal and spatial regulation are functionally conserved between these Drosophila species.

The compound eye of Drosophila consists of several hundred units, or ommatidia, each containing a stereotyped arrangement of photoreceptor, pigment, and cone cells. These ommatidia develop during late larval and pupal life in the eye imaginal disc, in a process that involves the recruitment of undifferentiated epithelial cells into gradually growing ommatidial clusters (for review see refs. 1 and 2). The rough (ro) mutation disrupts cellular interactions at an early stage of ommatidial assembly, leading to irregularly arranged clusters containing variable numbers of photoreceptor cells (3).

The rough gene encodes a homeodomain protein (3, 35) and is believed to specify the identity of a subset of photoreceptor cells in the developing retina (4, 5). The rough protein is restricted to the eye imaginal disc, where it is expressed in a complex and dynamic pattern (4). Unlike mutations in other Drosophila homeobox genes, flies carrying complete loss-of-function alleles of rough are viable (unpublished data). This, together with the relatively small size of the rough gene, provides a unique opportunity to study structure-function relationships of a homeodomain protein in its natural developmental context.

As a first step to identify functionally relevant domains of the rough protein, as well as cis-regulatory DNA sequences required for proper regulation, we have compared the sequences of the rough genes from two distantly related Drosophila species, D. melanogaster and D. virilis.* These two species are separated by an evolutionary period of ~60 million years (6), which is sufficiently distant for unconstrained DNA sequences to have diverged extensively, allowing putative functional elements to be identified by sequence conservation. To test whether the observed conservation is of functional importance, we introduced the D. virilis rough gene into the D. melanogaster genome and analyzed its function.

MATERIALS AND METHODS

A genomic D. virilis library in bacteriophage λ EMBL3 (ref. 7; a gift of M. Scott, Stanford University) was screened with a full-length D. melanogaster rough cDNA (3). Hybridizations were carried out at 42°C in 2x SSC (1x SSC is 0.15 M NaCl/0.015 M sodium citrate, pH 7) containing 35% formamide, 0.1% Ficoll, 0.1% bovine serum albumin, 0.1% polyvinylpyrrolidone, and 100 µg of sonicated salmon sperm DNA per ml. Washing conditions were 1x SSC/0.1% SDS, 50°C. Approximately one clone was obtained per genome screened. DNA blot analysis of the isolated phage DNA identified an 8-kilobase (kb) genomic Sal I fragment that hybridized with the D. melanogaster rough cDNA. This fragment was subcloned into pBluescript KS(+) (Stratagene), and random clones were generated by sonication and subcloning into phage M13. Sequencing was done by the chain-termination method (8). Both strands were sequenced except for two 100-base-pair (bp) regions in the first intron. Sequences were compiled and analyzed using the IntelliGenetics and University of Wisconsin Genetics Computer Group software packages.

The 8-kb DNA fragment containing the D. virilis rough gene was cloned into the P-element transformation vector pDM30 (9) and germ-line transformants were obtained by standard techniques (10).

Antibody staining of eye imaginal discs with the rough monoclonal antibody (MAbrol) was carried out exactly as described (4). Fixation and sectioning of adult Drosophila heads were performed as described (11).

RESULTS AND DISCUSSION

The D. virilis rough gene was isolated from a genomic library by virtue of its cross-hybridization with a D. melanogaster rough cDNA (see Materials and Methods). The regions of homologous sequence in the two genes were found to be completely contained within an 8-kb D. virilis Sal I fragment. The DNA sequence of most of this genomic fragment is shown in Fig. 1, which includes alignments with D. melanogaster protein-coding sequences. From the analysis of these alignments we conclude that the D. virilis DNA fragment contains all the protein-coding sequences, as well as ~1 kb each of 5' and 3' noncoding DNA. The D. melanogaster rough protein is encoded by three exons. The DNA sequences of the splice sites and adjacent regions are conserved in the two species, arguing that the overall genomic organization is the same. A dot-matrix comparison of the D. virilis and D. melanogaster rough sequences is shown in Fig. 2. Although the homologies are concentrated in the coding regions, several stretches of highly conserved sequence are observed in each intron. The conservation at the DNA level in the three exons, calculated as percent nucleotide identity relative to the total number of nucleotides in the D. melanogaster sequence, is 46%, 81%, and 69% in the first, second, and third exons, respectively. It is difficult to calculate the overall conservation in the introns due to significant differences in their length. However, ~20% of the sequence

The publication costs of this article were defrayed in part by page charge payment. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact.

*The sequence reported in this paper has been deposited in the GenBank data base (accession no. M35372).

[Fig. 1. DNA sequence of the D. virilis rough genomic fragment, with alignments to the D. melanogaster protein-coding sequences. The sequence itself is not legible in this copy.]
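The exon conservation figures in Results and Discussion are given as percent nucleotide identity relative to the total number of nucleotides in the D. melanogaster sequence. A minimal sketch of that calculation follows, assuming the two sequences have already been aligned and that gaps are written as "-". The function name `percent_identity` and the toy sequences are hypothetical illustrations; they are not taken from the paper or from the deposited sequence, and the alignment software the authors used is not reproduced here.

```python
def percent_identity(ref_aligned: str, other_aligned: str, gap: str = "-") -> float:
    """Percent of reference nucleotides that pair with an identical nucleotide.

    Both arguments are rows of one pairwise alignment (equal length, gaps
    marked with `gap`). The denominator is the number of nucleotides in the
    reference row, i.e. its gap-free length.
    """
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have equal length")

    # Denominator: every non-gap position in the reference sequence.
    ref_len = sum(1 for base in ref_aligned if base != gap)
    if ref_len == 0:
        raise ValueError("reference sequence contains no nucleotides")

    # Numerator: columns where the reference has a nucleotide and the
    # other sequence carries the same nucleotide (case-insensitive).
    matches = sum(
        1
        for r, o in zip(ref_aligned.upper(), other_aligned.upper())
        if r != gap and r == o
    )
    return 100.0 * matches / ref_len


if __name__ == "__main__":
    # Toy alignment, invented for illustration only.
    mel = "ATGCA-GTTC"  # reference row (stands in for D. melanogaster)
    vir = "ATGGAAGT-C"  # compared row (stands in for D. virilis)
    print(f"{percent_identity(mel, vir):.1f}% identity")
```

Because the denominator is the reference length, insertions in the compared sequence do not lower the score, while deletions and mismatches do. This is one reason a single figure is hard to give for introns that differ substantially in length.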