
© Jones & Bartlett Learning, LLC. NOT FOR SALE OR DISTRIBUTION.

4 The Interrupted Gene

A self-splicing intron in an rRNA of the large ribosomal subunit. © Kenneth Eward/Photo Researchers, Inc.

CHAPTER OUTLINE

4.1 Introduction
4.2 An Interrupted Gene Consists of Exons and Introns
4.3 Organization of Interrupted Genes May Be Conserved
    Methods and Techniques: The Discovery of Introns by DNA-RNA Hybridization
4.4 Exon Sequences Are Usually Conserved, but Introns Vary
4.5 Genes Show a Wide Distribution of Sizes Due Primarily to Intron Size and Number Variation
4.6 Some DNA Sequences Encode More Than One Polypeptide
4.7 Some Exons Can Be Equated with Functional Domains
4.8 Members of a Gene Family Have a Common Organization
4.9 Summary


4.1 Introduction

The simplest form of a gene is a length of DNA that directly corresponds to its polypeptide product. Bacterial genes are almost always of this type, in which a continuous coding sequence of 3N base pairs encodes a polypeptide of N amino acids. In eukaryotes, however, a gene may include additional sequences that lie within the coding region and interrupt the sequence that encodes the polypeptide. These sequences are removed from the RNA product following transcription, generating an mRNA that includes a nucleotide sequence exactly corresponding to the polypeptide product according to the rules of the genetic code.

The sequences of DNA comprising an interrupted gene are divided into the two categories depicted in FIGURE 4.1.
• Exons are the sequences retained in the mature RNA product. By definition, a gene begins and ends with exons that correspond to the 5′ and 3′ ends of the RNA.
• Introns are the intervening sequences that are removed when the primary transcript is processed to give the mature RNA product.
The exon sequences are in the same order in the gene and in the RNA, but an interrupted gene is longer than its final RNA product because of the presence of the introns.

interrupted gene: A gene in which the coding sequence is not continuous due to the presence of introns.
exon: Any segment of an interrupted gene that is represented in the mature RNA product.
intron: A segment of DNA that is transcribed but later removed from within the transcript by splicing together the sequences (exons) on either side of it.

The processing of interrupted genes requires an additional step that is not needed for uninterrupted genes. The DNA of an interrupted gene is transcribed to an RNA copy (a transcript) that is exactly complementary to the original gene sequence. This RNA is only a precursor, though; it cannot yet be used to produce a polypeptide. First, the introns must be removed from the RNA to give a messenger RNA that consists only of the series of exons. This process is called RNA splicing (see Section 2.10, Several Processes Are Required to Express the Product of a Gene) and involves precisely deleting the introns from the primary transcript and then joining the ends of the RNA on either side to form a covalently intact molecule (see Chapter 21, RNA Splicing and Processing).

RNA splicing: The process of excising introns from RNA and connecting the exons into a continuous mRNA.

The original gene comprises the region in the genome between the points corresponding to the 5′ and 3′ terminal bases of the mature mRNA. We know that transcription starts at the DNA template corresponding to the 5′ end of the mRNA and usually extends beyond the complement to the 3′ end of the mature mRNA, which is generated by cleavage of the primary RNA transcript (see Section 21.14, The 3′ Ends of mRNAs Are Generated by Cleavage and Polyadenylation). The gene is also considered to include the regulatory regions on both sides of the gene that are required for initiating and (sometimes) terminating transcription.
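The relationship between a primary transcript and its spliced mRNA can be illustrated with a minimal Python sketch. The transcript, the exon coordinates, and the function name `splice` are hypothetical and chosen only for illustration; real splicing is carried out by cellular machinery that recognizes sequence signals, not by supplied coordinates.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) in their original 5'-to-3' order."""
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Hypothetical primary transcript: exons in upper case, introns in lower case.
PRE_MRNA = "AUGGCU" "guaaguuuucag" "GAAUUC" "guaugucuuag" "CCGUAA"
EXONS = [(0, 6), (18, 24), (35, 41)]  # assumed exon positions in PRE_MRNA

mrna = splice(PRE_MRNA, EXONS)
coding = mrna[:-3]                # drop the termination codon (UAA)
n_amino_acids = len(coding) // 3  # 3N coding bases correspond to N amino acids

print(mrna)                       # AUGGCUGAAUUCCCGUAA
print(len(PRE_MRNA), len(mrna))   # 41 18
print(n_amino_acids)              # 5
```

In this toy case the precursor is more than twice as long as the mRNA, yet the exons appear in the mRNA in the same order as in the precursor.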
FIGURE 4.1 Interrupted genes are expressed via a precursor RNA. Introns are removed when the exons are spliced together. The mature mRNA has only the sequences of the exons.
[Diagram: DNA (exon 1, intron 1, exon 2, intron 2, exon 3) → RNA synthesis → pre-mRNA → splicing removes introns → mRNA → protein synthesis → protein. Length of precursor RNA (not mRNA) defines region of gene. Individual coding regions are separated in gene.]

CONCEPT AND REASONING CHECK
Why might it be difficult to express an intact human gene inserted into a bacterial cell?

4.2 An Interrupted Gene Consists of Exons and Introns

How does the existence of introns change our view of the gene? During splicing, the exons are always joined together in the same order they are found in the original DNA, so the correspondence between the gene and polypeptide sequences is maintained. FIGURE 4.2 shows that the order of exons in the gene remains the same as the order of exons in the processed mRNA, but the distances between sites in the gene (as determined by recombination analysis) do not correspond to the distances between sites in the processed mRNA. The length of the gene is defined by the length of the primary RNA transcript instead of the length of the processed mRNA.

All the exons are present in the primary RNA transcript, and their splicing together occurs only as an intramolecular reaction. There is usually no joining of exons carried by different RNA transcripts, so the splicing mechanism excludes any splicing together of sequences from different alleles. (However, in a phenomenon known as trans-splicing, sequences from different mRNAs are ligated together into a single molecule for translation.) Mutations located in different exons of a gene cannot complement one another, so they continue to be defined as members of the same complementation group.

Mutations that directly affect the sequence of a polypeptide must occur in exons. What are the effects of mutations in the introns? The introns are not part of the processed messenger RNA, so mutations in them cannot directly affect the polypeptide sequence. However, they may affect the processing of the messenger RNA by inhibiting the splicing together of exons. A mutation of this sort acts only on the allele that carries it. As a result, it fails to complement any other mutation in that allele and is part of the same complementation group as the exons.

Mutations that affect splicing are usually deleterious. The majority are single-base substitutions at the junctions between introns and exons. They may cause an exon to be left out of the product, cause an intron to be included, or make splicing occur at a different site. The most common outcome is a termination codon that truncates the polypeptide sequence. About 15% of the point mutations that cause human diseases disrupt splicing.

Some eukaryotic genes are not interrupted, and, like prokaryotic genes, correspond directly with the polypeptide product. In yeast, most genes are uninterrupted. In multicellular eukaryotes, most genes are interrupted, and introns are usually much longer than exons, so that genes are considerably larger than their coding regions. (See Figure 4.6 for the example of the mammalian β-globin genes, in which exons can be ~37% of the total length of the gene.)

FIGURE 4.2 Exons remain in the same order in mRNA as in DNA, but distances along the gene do not correspond to distances along the mRNA or polypeptide products. The distance A-B in the gene is smaller than the distance B-C; but the distance A-B in the mRNA (and polypeptide) is greater than the distance B-C.
[Diagram: genomic DNA (exon 1, intron 1, exon 2, intron 2, exon 3) and mRNA (exon 1, exon 2, exon 3), each marked with sites A, B, and C and the intervals A-B and B-C.]
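The point made by FIGURE 4.2, that distances in the gene and in the mRNA need not correspond, can be checked with a short Python sketch. The exon coordinates and the positions of sites A, B, and C below are hypothetical values chosen for illustration (a short first intron and a long second intron); `gene_to_mrna` is an illustrative helper, not a standard function.

```python
from typing import Optional

# Hypothetical exon intervals on the gene (0-based, end-exclusive), in bp.
EXONS = [(0, 300), (400, 600), (2600, 2800)]


def gene_to_mrna(pos: int, exons: list[tuple[int, int]]) -> Optional[int]:
    """Map a gene coordinate to its mRNA coordinate; None if pos lies in an intron."""
    offset = 0  # total length of the exons already passed
    for start, end in exons:
        if start <= pos < end:
            return offset + (pos - start)
        offset += end - start
    return None


# Hypothetical sites: A in exon 1, B in exon 2, C in exon 3.
A, B, C = 50, 550, 2650
a, b, c = (gene_to_mrna(p, EXONS) for p in (A, B, C))

print(B - A, C - B)  # distances in the gene: 500 2100
print(b - a, c - b)  # distances in the mRNA: 400 100
```

With these assumed coordinates, A-B is the shorter interval in the gene but the longer interval in the mRNA, because the long second intron contributes to B-C only in the gene.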


KEY CONCEPTS
• Introns are removed by the process of RNA splicing.
• Only mutations in exons can affect polypeptide sequence; however, mutations in introns can affect processing of the RNA and therefore prevent production of polypeptide.

CONCEPT AND REASONING CHECK
Describe the types of mutations that lead to abnormal splicing and their specific effects.

4.3 Organization of Interrupted Genes May Be Conserved

The characterization of eukaryotic genes was first made possible by the development of techniques for physically mapping DNA. When an mRNA is compared with the DNA sequence from which it was transcribed, the DNA sequence turns out to have extra regions that are not represented in the mRNA.

One technique for comparing mRNA with genomic DNA is to hybridize the mRNA with the complementary strand of the DNA. If the two sequences are colinear, a duplex is formed. FIGURE 4.3 shows a typical result when an RNA transcribed from an uninterrupted gene is hybridized with a DNA that includes the gene. The sequences on either side of the gene are not represented in the RNA, but the DNA sequence of the gene hybridizes with the RNA to form a continuous duplex region.

Suppose we now perform the same experiment with RNA transcribed from an interrupted gene. The difference is that the sequences represented in the mRNA lie on either side of a sequence that is not in the mRNA. FIGURE 4.4 shows that the RNA-DNA hybrid forms a duplex, but the unhybridized DNA sequence in the middle remains single stranded, forming a loop that extrudes from the duplex. The hybridizing regions correspond to the exons, and the extruded loop corresponds to the intron.

The structure of the mRNA-DNA hybrid can be visualized by electron microscopy. One of the very first examples of the visualization of an interrupted gene is shown in FIGURE 4.5. On the right side of the figure, tracing the structure shows that three introns are located close to the very beginning of the gene.

When a gene is uninterrupted, the restriction map of its DNA corresponds exactly with the map of its mRNA. When a gene has introns, the maps of the gene and its mRNA are different except at each end, corresponding to the first and last exon. The gene map will have additional regions due to the presence of introns.

FIGURE 4.3 Hybridizing an mRNA from an uninterrupted gene with the DNA of the gene generates a duplex region corresponding to the gene.
[Diagram: the blue region of DNA is complementary to RNA; the gray region is not represented in RNA. The DNA is denatured and hybridized with mRNA; the RNA-DNA hybrid forms a continuous duplex region.]

FIGURE 4.4 RNA hybridizing with the DNA from an interrupted gene produces a duplex corresponding to the exons, with an intron excluded as a single-stranded loop between the exons.
[Diagram: the blue region of DNA is complementary to RNA; the gray region is not represented in RNA. The DNA is denatured and hybridized with mRNA; the noncomplementary region of DNA (the intron, between exon 1 and exon 2) is extruded as a single-strand loop from the duplex.]

FIGURE 4.5 Hybridization between an adenovirus mRNA and its DNA identifies three loops corresponding to introns that are located at the beginning of the gene. Photo reproduced from S. M. Berget, C. Moore, and P. A. Sharp, Proc. Natl. Acad. Sci. USA 74 (1977): 3171–3175. Used with permission of Philip Sharp, Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology.
[Tracing labels: DNA, RNA, hybrid, loop A, loop B, loop C.]

FIGURE 4.6 Comparison of the restriction maps of cDNA and genomic DNA for mouse β-globin shows that the gene has two introns that are not present in the cDNA. The exons can be aligned exactly between cDNA and gene.
[Diagram: genomic DNA (exon 1, intron 1, exon 2, intron 2, exon 3) with sites cleaved by restriction enzymes; the cDNA map corresponds to exon 1 + exon 2 + exon 3 of the genomic map.]

FIGURE 4.6 compares the restriction maps of a β-globin gene and a complementary DNA (cDNA) copy of its mRNA. The gene has two introns, each of which contains a series of restriction sites that are absent from the cDNA. The pattern of restriction sites in the exons is the same in both the cDNA and the gene.

cDNA: A single-stranded DNA complementary to an RNA and synthesized from it by reverse transcription in vitro.

Ultimately, a comparison of the nucleotide sequences of the gene and its mRNA precisely identifies the introns. As indicated in FIGURE 4.7, an intron usually has no open reading frame. An intact reading frame in the mRNA results from the removal of the introns.

FIGURE 4.7 An intron is a sequence present in the gene but absent from the mRNA (here shown in terms of the cDNA sequence). The reading frame is indicated by the alternating open and shaded blocks; note that all three possible reading frames are closed by termination codons in the intron.
Gene sequence (exon, intron, exon): ..TCCTTATTGTGCCGTGTCATAGAAAGTAATAATTGCTTCAAGTCATAATATGGCCTAGAAATCGTGGTGTTTT...
mRNA sequence: ....UCCUUAUUGUGCCGUAAAUCGUGGUGUUUU....
Polypeptide: Ser Leu Leu Ser Arg Asn Ser Trp Cys Phe
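As a rough illustration of identifying an intron by sequence comparison, the following Python sketch uses the gene and mRNA sequences printed in FIGURE 4.7. It assumes exactly one intron and otherwise identical sequences, which holds for this short example but not for genes in general; the function names are illustrative, not part of any library.

```python
# Sequences as printed in Figure 4.7.
GENE = "TCCTTATTGTGCCGTGTCATAGAAAGTAATAATTGCTTCAAGTCATAATATGGCCTAGAAATCGTGGTGTTTT"
MRNA = "UCCUUAUUGUGCCGUAAAUCGUGGUGUUUU"
CDNA = MRNA.replace("U", "T")  # mRNA written in DNA letters

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_single_intron(gene: str, cdna: str) -> tuple[int, int]:
    """Return (start, end) of the one gene segment that is missing from the cDNA.

    Assumes a single intron and no other differences between the sequences.
    """
    # Extend the match from the 5' end for as long as the bases agree.
    prefix = 0
    while prefix < len(cdna) and gene[prefix] == cdna[prefix]:
        prefix += 1
    # The remainder of the cDNA must then match the 3' end of the gene.
    suffix = len(cdna) - prefix
    start, end = prefix, len(gene) - suffix
    if gene[:start] + gene[end:] != cdna:
        raise ValueError("sequences are not related by a single intron")
    return start, end


def frames_closed_in(gene: str, start: int, end: int) -> dict[int, bool]:
    """For each reading frame, report whether a stop codon lies wholly inside gene[start:end]."""
    return {
        frame: any(
            gene[i:i + 3] in STOP_CODONS
            for i in range(frame, len(gene) - 2, 3)
            if start <= i and i + 3 <= end
        )
        for frame in range(3)
    }


start, end = find_single_intron(GENE, CDNA)
print(start, end, end - start)             # 15 58 43
print(frames_closed_in(GENE, start, end))  # {0: True, 1: True, 2: True}
```

For these sequences the comparison recovers a 43-base intron, and each of the three reading frames meets a termination codon inside it, consistent with the figure legend.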

FIGURE 4.8 Most genes are uninterrupted in yeast, but most genes are interrupted in flies and mammals. (Uninterrupted genes have only one exon and are shown in the leftmost column in red.)
[Histograms: percent of genes versus number of exons for S. cerevisiae, D. melanogaster, and mammals; uninterrupted genes make up 96%, 17%, and 6% of the total, respectively.]

The structures of eukaryotic genes vary extensively. Some genes are uninterrupted, so that the gene sequence corresponds to the mRNA sequence. Most multicellular eukaryotic genes are interrupted, but among genes the introns vary enormously in both number and size. (See FIGURE 4.8 for distributions of intron number variation among genes and FIGURE 4.9 for distributions of intron-size variation.)

FIGURE 4.9 Introns range from very short to very long.
[Histograms: percent of introns versus intron length in kb for worm, fly, and human.]

Genes encoding polypeptides, rRNA, or tRNA may all have introns. Introns are also found in mitochondrial genes of plants, fungi, protists, and one metazoan (a sea anemone), and in chloroplast genes. Genes with introns have been found in every class of eukaryotes, archaea, bacteria, and bacteriophages, although they are extremely rare in prokaryotic genomes.

Some interrupted genes have only one or a few introns. The globin genes provide an extensively studied example (see Section 4.8, Members of a Gene Family Have a Common Organization). The two general classes of globin gene, α and β, share a common organization. They originated from an ancient gene duplication event and are described as paralogous genes, or paralogs. The consistent structure of mammalian globin genes is evident from the "generic" globin gene presented in FIGURE 4.10.

paralogous genes (paralogs): Genes that share a common ancestry due to gene duplication.

Introns are found at homologous positions (relative to the coding sequence) in all known active globin genes, including those of mammals, birds, and frogs. Although intron lengths vary, the first intron is always fairly short and the second is usually longer. Most of the variation in the lengths of different globin genes results from length variation in the second intron. For example, the second intron in the mouse α-globin gene is 150 bp of the total 850 bp of the gene, whereas the homologous intron in the mouse major β-globin gene is 585 bp of the total 1382 bp. The difference in length of the genes is much greater than that of their mRNAs (α-globin mRNA = 585 bases; β-globin mRNA = 620 bases).

The globin genes are examples of a general phenomenon: genes that share a common ancestry have similar organizations, with conservation of the positions of at least some of the introns. Variations in the lengths of the genes are primarily due to intron length variation.
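The arithmetic behind the statement that gene length varies mainly through intron length can be made explicit with the numbers quoted above for the two mouse globin genes. This is only a sketch: the dictionary keys are illustrative labels, and the calculation uses nothing beyond the lengths given in the text.

```python
# Lengths quoted in the text for two mouse globin genes (bp for DNA, bases for mRNA).
genes = {
    "shorter gene": {"gene": 850, "intron2": 150, "mrna": 585},
    "longer gene": {"gene": 1382, "intron2": 585, "mrna": 620},
}

short, longer = genes["shorter gene"], genes["longer gene"]

gene_diff = longer["gene"] - short["gene"]           # difference in gene length
intron2_diff = longer["intron2"] - short["intron2"]  # difference in second-intron length
mrna_diff = longer["mrna"] - short["mrna"]           # difference in mRNA length
share = intron2_diff / gene_diff                     # part of the gene difference due to intron 2

print(gene_diff, intron2_diff, mrna_diff)  # 532 435 35
print(f"{share:.0%}")                      # 82%
```

On these figures the second intron alone accounts for roughly four fifths of the 532-bp difference between the genes, whereas the mRNAs differ by only 35 bases.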

FIGURE 4.10 All functional globin genes have an interrupted structure with three exons. The lengths indicated in the figure apply to the mammalian β-globin genes.
Exon 1 (142–145 bp): contains 5′ UTR + coding for amino acids 1–30
Intron 1 (116–130 bp)
Exon 2 (222 bp): contains coding for amino acids 31–104
Intron 2 (573–904 bp)
Exon 3 (216–255 bp): contains coding for amino acids 105–end + 3′ UTR

METHODS AND TECHNIQUES
The Discovery of Introns by DNA-RNA Hybridization

The discovery of introns (intervening sequences) in eukaryotic genes was unexpectedly made in 1977. Two different experimental methods, both taking advantage of DNA-RNA hybridization, led to this startling finding. One avenue of research was the mapping of the adenovirus genome using RNA displacement loops to determine the position of mRNA transcripts relative to the viral genome. Adenovirus is a double-stranded DNA virus that infects human epithelial cells, causing respiratory cold symptoms and stomach flu. This virus was an early model system for studying eukaryotic transcription.

In 1977, two independent groups, the labs of Richard Roberts at Cold Spring Harbor Laboratory and Phillip Sharp at the Massachusetts Institute of Technology, used the method of R loop mapping to determine the regions of the adenovirus genome that are transcribed. Using this technique, double-stranded genomic DNA is mixed with individual mRNA transcripts under conditions that promote DNA-RNA hybrids between complementary sequences. Since the mRNA will anneal to only one strand of the DNA, the other strand of DNA is displaced, resulting in a loop (R loop) of single-stranded DNA visible under the electron microscope (see Figure 4.4). When mapping the R loops formed between adenovirus DNA and viral mRNA transcripts expressed late in the infection cycle, it was observed that the 5′ and 3′ ends of the RNA did not anneal with the DNA. This was not surprising for the 3′ end of the mRNA transcript, which was known to be modified following transcription by the addition of a polyadenylate tail; however, the discontinuity of the 150 to 200 nucleotides at the 5′ end was unexpected. Hybridization of the 5′ terminal mRNA and discrete single-stranded fragments of genomic DNA demonstrated that this mRNA sequence is actually composed of sequences derived from three distinct and dispersed regions of the adenovirus genome (see Figure 4.5). This was the first evidence of the post-transcriptional process of mRNA splicing, and Drs. Roberts and Sharp were recognized for their discovery of "interrupted genes" in 1993 with a Nobel Prize in Physiology or Medicine.

A second line of research, construction of physical maps of eukaryotic genes, revealed that interrupted genes are not unique to viral genomes. Prior to the advent of whole genome sequencing, physical maps of genomes were constructed by mapping restriction endonuclease cleavage sites in genomic DNA. Restriction endonucleases are bacterial enzymes that cleave double-stranded DNA in a site-specific fashion; different restriction endonucleases recognize and cleave distinct sequences (see Section 3.2, Nucleases). Whole-genomic DNA can be fragmented by cleavage with restriction endonucleases, and the resulting fragments are separated based on size through agarose gel electrophoresis. It is impossible to directly visualize the fragments for a specific gene because of the complexity and size of a typical eukaryotic genome; however, using the method of Southern blotting and DNA hybridization, the fragments derived from a specific gene can be identified. The separated fragments of genomic DNA are denatured and transferred to a membrane in a process referred to as Southern blotting (named for its inventor Ed Southern; see Section 3.9, Blotting Methods). The membrane with the bound DNA is then immersed in a solution containing a specific radio-labeled DNA probe. The DNA probe will anneal only to the complementary sequences bound to the membrane, and when this membrane is washed to remove unbound probe, dried, and exposed to X-ray film, the fragments of DNA annealing to the probe are revealed.

Typically, a physical map of a gene is constructed by using a cDNA (copied DNA from an mRNA transcript) to probe the genomic DNA. One of the first genes to be mapped in this fashion was the rabbit β-globin gene, by Jeffreys and Flavell in 1977. This led to the revelation that there is a sequence of 600 base pairs in the coding region of the genomic DNA that is not present in the cDNA. Another interrupted gene discovered at the same time is the chicken ovalbumin gene, which is also interrupted in the coding region. With time, it became clear that most eukaryotic genes are interrupted by intervening sequences that are transcribed and then removed by mRNA splicing. In 1978 Walter Gilbert proposed that these intervening sequences be referred to as introns.

KEY CONCEPTS
• Introns can be detected by the presence of additional regions when genes are compared with their RNA products by restriction mapping or electron microscopy. The ultimate determination, though, is based on comparison of sequences.
• The positions of introns are usually conserved when homologous genes are compared between different organisms. The lengths of the corresponding introns may vary greatly.

CONCEPT AND REASONING CHECK
Why would the genes of viruses with DNA genomes that infect eukaryotic cells be expected to have introns?

4.4 Exon Sequences Are Usually © Jones & Bartlett Learning, LLC © Jones & Bartlett Learning, LLC Conserved,NOT FORbut SALEIntrons OR Vary DISTRIBUTION NOT FOR SALE OR DISTRIBUTION Is a single-copy structural gene completely unique among other genes in a genome? The answer depends on how “completely unique” is defined. Considered as a whole, the gene is unique, but its exons may be related to those of other genes. As a general rule, when two genes are related, the relationship between their exons is closer than the relationship© Jones &between Bartlett their Learning, introns. In LLC an extreme case, the exons© ofJones two differ- & Bartlett Learning, LLC ent genesNOT may FOR encode SALE the OR same DISTRIBUTION polypeptide sequence while the intronsNOT are different.FOR SALE OR DISTRIBUTION This situation can result from the duplication of a common ancestral gene followed by unique base substitutions in both copies, with substitutions restricted in the exons by the need to encode a functional polypeptide. As we will see in Chapter 8, Genome Evolution , when we consider the evolution of © Jones &the Bartlett genome, Learning,exons can be LLC considered basic building blocks© Jones that may& Bartlett be assembled Learning, in LLC NOT FORvarious SALE combinations. OR DISTRIBUTION It is possible for a gene to haveNOT some FORexons SALE related ORto those DISTRIBUTION of another gene, with the remaining exons unrelated. Usually, in such cases, the introns are not related at all. Such homologies between genes may result from duplication and translocation of individual exons. The homology between two genes can be plotted in the form of a dot matrix compari- © Jones & Bartlett Learning, LLC © Jones & Bartlett Learning, LLC son, as in FIGURE 4.11 . A dot is placed in each position that is identical in both genes. The dots form a solid line on theNOT diagonal FOR of theSALE matrix OR if theDISTRIBUTION two sequences are completely identi- NOT FOR SALE OR DISTRIBUTION cal. 
If they are not identical, the line is broken by gaps that lack homology and is displaced laterally or vertically by nucleotide deletions or insertions in one or the other sequence.

When the two mouse β-globin genes are compared in this way, a line of homology extends through the three exons and the small intron. The line disappears in the flanking UTRs and in the large intron. This is a typical pattern in related genes; the coding sequences and areas of introns adjacent to exons retain their similarity, but there is greater divergence in longer introns and the regions on either side of the coding sequence.

FIGURE 4.11 The sequences of the mouse βmaj- and βmin-globin genes are closely related in coding regions but differ in the flanking UTRs and the long intron. Data provided by Philip Leder, Harvard Medical School. [Dot matrix: β-globin major (5′ to 3′, exons 1 to 3) plotted against β-globin minor (5′ to 3′, exons 1 to 3).]

The overall degree of divergence between two homologous exons in related genes corresponds to the differences between the polypeptides. It is mostly a result of base substitutions. In the translated regions, changes in exon sequences are constrained by selection against mutations that alter or destroy the function of the polypeptide. Many of the preserved changes do not affect codon meanings because they change one codon into another for the same amino acid (i.e., they are synonymous substitutions). Similarly, there are higher rates of changes in nontranslated regions of the gene (specifically, those that are transcribed to the 5′ UTR [leader] and 3′ UTR [trailer] of the mRNA).

In homologous introns, the pattern of divergence involves both changes in length (due to deletions and insertions) and base substitutions. Introns evolve much more rapidly than exons. When a gene is compared among different species, there are instances where its exons are homologous but its introns have diverged so much that very little homology is retained. Although mutations in certain intron sequences (branch site, splicing junctions, and perhaps other sequences influencing splicing) will be subject to selection, most intron mutations are expected to be selectively neutral.

In general, mutations occur at the same rate in both exons and introns, but exon mutations are eliminated more effectively by selection. However, because of the low level of functional constraints, introns may more freely accumulate point substitutions and other changes. The empirical observation of faster evolution in introns implies that introns have fewer sequence-specific functions, but that does not necessarily mean that their presence is not required for normal gene function (see Section 4.6, Some DNA Sequences Encode More Than One Polypeptide).

KEY CONCEPTS
• Comparisons of related genes in different species show that the sequences of the corresponding exons are usually conserved but the sequences of the introns are much less similar.
• Introns evolve much more rapidly than exons because of the lack of selective pressure to produce a polypeptide with a useful sequence.
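The dot matrix comparison described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the method used to produce Figure 4.11: the function names and the two toy sequences are hypothetical, and the sketch compares single bases exactly as the text describes, whereas practical dot plots usually score short windows to suppress chance matches.

```python
def dot_matrix(seq_a, seq_b):
    """Return rows of booleans: cell [i][j] is True when seq_a[i] == seq_b[j]."""
    return [[base_a == base_b for base_b in seq_b] for base_a in seq_a]


def render(matrix):
    """Draw the matrix as text, one row per position of the first sequence."""
    return "\n".join("".join("*" if hit else "." for hit in row) for row in matrix)


# Hypothetical toy sequences (not real globin data).
gene_1 = "ATGGTGCAC"
gene_2 = "ATGGTTTGCAC"  # gene_1 with "TT" inserted after the fourth base

identical = dot_matrix(gene_1, gene_1)
shifted = dot_matrix(gene_1, gene_2)

# Identical sequences: every cell on the main diagonal is a match.
unbroken_diagonal = all(identical[i][i] for i in range(len(gene_1)))

# With the insertion, the matching line follows the main diagonal for the
# first four bases and then continues two columns to the right.
before_insertion = all(shifted[i][i] for i in range(4))
after_insertion = all(shifted[i][i + 2] for i in range(4, len(gene_1)))

print(render(shifted))
```

In the printed matrix the line of matches is displaced sideways at the insertion point, which is the lateral displacement the text attributes to insertions and deletions.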

CONCEPT AND REASONING CHECK
In comparing two genes in different species, consider whether they are homologous given the following differences: (1) different exons; (2) the same exons, but completely different introns; (3) the same exons, and introns in the same positions but of different sizes.

4.5 Genes Show a Wide Distribution of Sizes Due Primarily to Intron Size and Number Variation

Figure 4.8 compares the organization of genes in a yeast, an insect, and mammals. In the yeast Saccharomyces cerevisiae, the majority of genes (~96%) are not interrupted, and those that have introns generally have three or fewer. There are virtually no S. cerevisiae genes with more than four exons.

In insects and mammals, the situation is reversed. Only a few genes have uninterrupted coding sequences (6% in mammals). Insect genes tend to have a fairly small number of exons, typically fewer than 10. Mammalian genes are split into more pieces, and some have more than 60 exons. About half of mammalian genes have more than 10 introns.

If we examine the effect of intron number variation on the total size of genes, we see in FIGURE 4.12 that there is a striking difference between yeast and multicellular eukaryotes. The average yeast gene is 1.4 kb long, and very few are longer than 5 kb. The predominance of interrupted genes in multicellular eukaryotes, however, means that the gene can be much larger than the sum total of the exon lengths. Only a small percentage of genes in flies or mammals are shorter than 2 kb, and most have lengths between 5 kb and 100 kb. The average human gene is 27 kb long (see Figure 6.11). The dystrophin gene, with a length of 2000 kb, is the longest known human gene.

FIGURE 4.12 Yeast genes are short, but genes in flies and mammals have a dispersed distribution extending to very long sizes. [Histograms for S. cerevisiae, D. melanogaster, and mammals; x-axis: size of gene (kb), in bins from <0.5 to >100; y-axis: percent.]

The switch from largely uninterrupted to largely interrupted genes seems to have occurred with the evolution of multicellular eukaryotes. In fungi other than S. cerevisiae, the majority of genes are interrupted, but they have a relatively small number of exons (<6) and are fairly short (<5 kb). In the fruit fly, gene sizes have a bimodal distribution: many are short but some are quite long. With this increase in the length of the gene due to the increased number of introns, the correlation between genome size and organism complexity becomes weak (see Figure 8.7).

Is there an evolutionary trend in intron length as well as intron number? Yes, organisms with larger genomes tend to have larger introns, whereas exon size does not tend to increase (FIGURE 4.13). In multicellular eukaryotes, the average exon codes for ~50 amino acids, and the general distribution is consistent with the hypothesis that genes have evolved by the gradual addition of exon units that code for short, functionally independent protein domains (see Section 8.6, How Did Interrupted Genes Evolve?). There is no significant difference in the average size of exons in different multicellular eukaryotes, although the size range is smaller in vertebrates, for which there are few exons longer than 200 bp. In yeast, there are some longer exons that represent uninterrupted genes for which the coding sequence is intact.

FIGURE 4.13 Exons encoding polypeptides are usually short. [Histograms for yeast, fly, and human; x-axis: exon length in nucleotides (100 to 900); y-axis: % exons.]

Figure 4.9 shows that introns vary widely in size among multicellular eukaryotes. In worms and flies, the average intron is not much longer than the exons. There are no very long introns in worms, but flies contain many. In vertebrates, the size distribution is much wider, extending from approximately the same length as the exons (<200 bp) up to 60 kb in extreme cases. Some fish, such as fugu, have compressed genomes with shorter introns and intergenic spaces than mammals have.
Very long genes are the result of very long introns, not the result of encoding longer products. There is no correlation between total gene size and total exon size in multicellular eukaryotes, nor is there a good correlation between gene size and number of exons. The size of a gene is, therefore, determined primarily by the lengths of its individual introns. In mammals and insects, the average gene length is approximately five times that of the total length of its exons.
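The arithmetic behind this point (gene length equals exon length plus intron length) can be made concrete with a short sketch. The exon coordinates below are hypothetical, chosen so that each exon is 150 nucleotides (about 50 amino acids) and the gene is five times longer than its exons combined; the function names are illustrative only.

```python
def gene_span(exons):
    """Length of the gene from the first exon start to the last exon end.

    exons: sorted list of (start, end) pairs, 1-based and inclusive.
    """
    return exons[-1][1] - exons[0][0] + 1


def total_exon_length(exons):
    """Sum of the lengths of all exons."""
    return sum(end - start + 1 for start, end in exons)


def intron_lengths(exons):
    """Length of each gap between consecutive exons."""
    return [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(exons, exons[1:])
    ]


# Hypothetical gene: four 150-nt exons separated by three introns.
exons = [(1, 150), (651, 800), (2201, 2350), (2851, 3000)]

span = gene_span(exons)            # 3000
coding = total_exon_length(exons)  # 600
introns = intron_lengths(exons)    # [500, 1400, 500]
ratio = span / coding              # 5.0

print(span, coding, introns, ratio)
```

Lengthening any one intron changes the span without touching the exon total, which is why gene size tracks intron length rather than product size in this toy model.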

KEY CONCEPTS
• Most genes are uninterrupted in S. cerevisiae but are interrupted in most other eukaryotes.
• Exons are usually short, typically coding for <100 amino acids.
• Introns are short in unicellular eukaryotes but can be many kilobases in length in multicellular eukaryotes.
• The overall length of a gene is determined largely by its introns.

CONCEPT AND REASONING CHECK
Would you expect there to be a general evolutionary trend for increase or decrease in the size of introns? Why or why not?

4.6 Some DNA Sequences Encode More Than One Polypeptide

Most structural genes consist of a sequence of DNA that encodes a single polypeptide, although the gene may include noncoding regions at both ends and introns within the coding region. However, there are some cases in which a single sequence of DNA encodes more than one polypeptide.

The simplest overlapping gene is one in which one gene is part of another. In other words, the first half (or second half) of a gene independently specifies a protein that is the first (or second) half of the protein specified by the full gene. This relationship is illustrated in FIGURE 4.14.

overlapping gene: A gene in which part of the sequence is found within part of the sequence of another gene.

A more complex form of overlapping gene occurs when the same sequence of DNA encodes two nonhomologous proteins because it has more than one reading frame. Usually, a coding DNA sequence is read in only one of the three potential reading frames. In some viral and mitochondrial genes, however, there is some overlap between two adjacent genes that are read in different reading frames, as illustrated in FIGURE 4.15. The length of overlap is usually short, so that most of the DNA sequence encodes a unique protein sequence.

In many genes, alternative proteins result from switches in the process of connecting the exons.
A single gene may generate a variety of mRNA products that differ in their exon content. Often, this is because there are pairs of exons that are treated as mutually exclusive: one or the other is included in the mature transcript, but not both. The alternative proteins have one part in common and one unique part. Such an example is presented in FIGURE 4.16. The 3′ half of the rat troponin T gene contains five exons, but only four are used to construct an individual mRNA. Three exons, W, X, and Z, are included in all mRNAs. However, in one alternative splicing pattern the α exon is included between X and Z, whereas in the other pattern it is replaced by the β exon. The α and β forms of troponin T therefore differ in the sequence of the amino acids between W and Z, depending on which of the alternative exons (α or β) is used.

mature transcript: A modified RNA transcript. Modification may include the removal of intron sequences and alterations to the 5′ and 3′ ends.
alternative splicing: The production of different RNA products from a single product by changes in the usage of splicing junctions.

FIGURE 4.14 Two proteins can be generated from a single gene by starting (or terminating) expression at different points. [Diagram labels: full-length protein; part-length protein; START, alternative START, triplet codons, STOP.]

FIGURE 4.15 Two genes may overlap by reading the same DNA sequence in different frames. [Diagram labels: bases; START and codons used for protein 1; START and codons used for protein 2.]

FIGURE 4.16 Alternative splicing generates the α and β variants of troponin T. [Diagram: exons W, X, and Z are common to both mRNAs; one mRNA has the α exon and the other has the β exon.]

There are also cases of alternative splicing in which certain exons are optional; i.e., they may be included or spliced out, as in the example presented in FIGURE 4.17. There is a single primary transcript from the gene, but it can be spliced in either one of two ways. In the first, more standard way, the two introns are spliced out and the three exons are joined together. In the second way, the second exon is excluded along with the two introns, as if a single large intron were spliced out. The two alternate proteins are the same at their ends, but one has an additional sequence in the middle. (Other types of combinations that are produced by alternative splicing are discussed in Section 21.11, Alternative Splicing Is a Rule, Rather Than an Exception, in Multicellular Eukaryotes.)

primary transcript: The original unmodified RNA product corresponding to the transcription unit of a gene.

Sometimes the two alternative splicing patterns operate simultaneously, with a certain proportion of the primary mRNA transcripts being spliced in each way. However, in some genes the splicing patterns are alternatives that are expressed under different conditions, e.g., one in one cell type and one in another cell type.
OR DISTRIBUTION So, alternative (or differential) splicing can generate different proteins with related sequences from a single stretch of DNA. It is curious that the multicellular eukaryotic genome is often extremely large, with long genes that are often widely dispersed along a chromosome, but at the same time there may be multiple products from a single locus. Due© Jones to alternative & Bartlett splicing, Learning, there are LLC~15% more proteins than © Jones & Bartlett Learning, LLC genes in flies and worms,NOT butFOR it isSALE estimated OR thatDISTRIBUTION the majority of human genes are NOT FOR SALE OR DISTRIBUTION alternatively spliced (see Section 6.5, The Human Genome Has Fewer Genes Than Origi- nally Expected ).

FIGURE 4.17 Alternative splicing uses the same pre-mRNA to generate mRNAs that have different combinations of exons. [Diagram: a gene with exon 1, intron, exon 2, intron, and exon 3 is transcribed (RNA synthesis). When only the introns are spliced out, the mRNA yields a long protein that has all 3 exons. When the introns and exon 2 are spliced out, the mRNA yields a short protein that has only exon 1 + exon 3.]

KEY CONCEPTS
• The use of alternative start or stop codons allows two proteins to be generated where one is equivalent to a fragment of the other.
• Nonhomologous protein sequences can be produced from the same sequence of DNA when it is read in different reading frames by two (overlapping) genes.
• Homologous proteins that differ by the presence or absence of certain regions can be generated by differential (alternative) splicing when certain exons are included or excluded. This may take the form of including or excluding individual exons or of choosing between alternative exons.
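As a rough illustration of the second concept, the sketch below translates one DNA sequence starting from two different positions, so that the same bases are read in two different frames. It assumes the standard genetic code (mitochondrial genomes often use variant codes), and the 18-base sequence is a hypothetical example constructed for this purpose, not a real viral or mitochondrial gene.

```python
from itertools import product

# Standard genetic code, listed in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, start):
    """Translate dna from index `start`, stopping at the first stop codon."""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequence with a start codon at index 0 and another at index 5,
# which lies in a different reading frame (5 is not a multiple of 3).
dna = "ATGGCATGCGTTAAGTGA"

protein_1 = translate(dna, 0)  # codons ATG GCA TGC GTT AAG, then TGA stop
protein_2 = translate(dna, 5)  # codons ATG CGT, then TAA stop

print(protein_1)  # MACVK
print(protein_2)  # MR
```

The bases from index 5 to 13 contribute to both products, yet the two amino acid sequences are unrelated because the codon boundaries differ.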

CONCEPT AND REASONING CHECK
What are the advantages to having the same gene produce two or more related polypeptides? Why might different versions of the same polypeptide be necessary?

4.7 Some Exons Can Be Equated with Protein Functional Domains

The issue of the evolution of interrupted genes is more fully considered in Section 8.6, How Did Interrupted Genes Evolve? If proteins evolve by recombining parts of ancestral proteins that were originally separate, the accumulation of protein domains is likely to have occurred sequentially, with one exon added at a time. Are the current functions of these domains the same as their original functions? In other words, can we assign particular functions of current proteins to individual exons?

In some cases, there is a clear relationship between the structures of the gene and the protein. The example par excellence is provided by the immunoglobulin (antibody) proteins, which are encoded by genes in which every exon corresponds exactly to a known functional domain of the protein. FIGURE 4.18 compares the structure of an immunoglobulin with its gene.

An immunoglobulin is a tetramer of two light chains and two heavy chains that covalently bond to generate a protein with several distinct domains. Light chains and heavy chains differ in structure, and there are several types of heavy chains. Each type of chain is produced from a gene that has a series of exons corresponding to the structural domains of the protein.

FIGURE 4.18 Immunoglobulin light chains and heavy chains are encoded by genes whose structures (in their expressed forms) correspond to the distinct domains in the protein. Each protein domain corresponds to an exon; introns are numbered I1 to I5. [Diagram: the light chain gene has an L exon, a V-J exon, and a C exon separated by introns I1 and I2; the heavy chain gene has an L exon, a V-D-J exon, a C1 exon, a hinge exon, a C2 exon, and a C3 exon separated by introns I1 to I5. Protein domains shown: leader, variable, constant, hinge, constant 2, constant 3.]

In many instances, some of the exons of a gene can be identified with particular functions. In secretory proteins, such as insulin, the first exon often specifies the signal sequence involved in membrane secretion.

The view that exons are the functional building blocks of genes is supported by cases in which two genes may share some related exons but also have unique exons. FIGURE 4.19 summarizes the relationship between the human LDL (plasma low density lipoprotein) receptor and other proteins. The LDL receptor gene has a series of exons related to the exons of the EGF (epidermal growth factor) precursor gene, and another series of exons related to those of the blood protein complement factor C9. Apparently, the LDL receptor gene evolved by the assembly of modules suitable for its various functions. These modules are also used in different combinations in other proteins.

FIGURE 4.19 The LDL receptor gene consists of 18 exons, some of which are related to EGF precursor exons and some of which are related to the C9 blood complement gene. Triangles mark the positions of introns. [Diagram: the LDL receptor, with its regions of C9 complement homology and EGF precursor homology, aligned with the EGF precursor.]

Exons tend to be fairly small (see Figure 6.11), around the size of the smallest polypeptide that can assume a stable folded structure (~20 to 40 residues).
It may be that proteins were originally assembled from rather small modules. Each individual module need not correspond to a current function; several modules could have combined to generate a new functional unit. Larger genes tend to have more exons, which is consistent with the view that proteins acquire multiple functions by successively adding appropriate modules.

KEY CONCEPTS
• Many exons can be equated with encoding polypeptide sequences that have particular functions.
• Related exons are found in different genes.

CONCEPT AND REASONING CHECK
Would you expect two membrane-bound enzymes that catalyze different reactions to have (1) completely different exons, (2) some similar exons and some different exons, or (3) all similar exons? Why?

4.8 Members of a Gene Family Have a Common Organization

Many genes in a multicellular eukaryotic genome are related to others in the same genome. A gene family is defined as a group of genes that encode related or identical products as a result of gene duplication events. After the first duplication event, the two copies are identical, but then they diverge as different mutations accumulate in them. Further duplications and divergences extend the family. The globin genes are an example of a family that can be divided into two subfamilies (α globin and β globin), but all its members have the same basic structure and function. In some cases, we can find genes that are more distantly related, but still can be recognized as having common ancestry. Such a group of gene families is called a superfamily.

gene family: A set of genes within a genome that encode related or identical proteins or RNAs. The members originated from duplication of an ancestral gene followed by accumulation of changes in sequence between the copies. Most often the members are related but not identical.
superfamily: A set of genes all related by presumed descent from a common ancestor but now showing considerable variation.

A fascinating case of evolutionary conservation is presented by the α and β globins and two other proteins related to them. Myoglobin is a monomeric oxygen-binding protein in animals. Its amino acid sequence suggests a common (though ancient) origin with globins. Leghemoglobins are oxygen-binding proteins found in legume plants; like myoglobin, they are monomeric and share a common origin with the other heme-binding proteins. Together, the globins, myoglobin, and leghemoglobins make up the globin superfamily, a set of gene families all descended from an ancient common ancestor.

Both α- and β-globin genes have three exons and two introns at conserved positions (see Figure 4.6). The central exon represents the heme-binding domain of the globin chain. There is a single myoglobin gene in the human genome and its structure is essentially the same as that of the globin genes. The three-exon structure therefore predates the common ancestor of the myoglobin and globin genes. Leghemoglobin genes contain three introns, the first and last of which are homologous to the two introns in the globin genes. This remarkable similarity suggests an exceedingly ancient origin for the interrupted structure of heme-binding proteins, as illustrated in FIGURE 4.20.

FIGURE 4.20 The exon structure of globin genes corresponds to protein function, but leghemoglobin has an extra intron in the central domain. [Diagram: the globin gene has three exons (E) and two introns (I), with the central exon encoding the heme-binding domain; in leghemoglobin an extra intron divides this domain.]

Orthologous genes, or orthologs, are genes that are homologous (homologs) due to speciation; in other words, they are related genes in different species.

orthologous genes (orthologs): Related genes in different species.
Compari- NOT FORent species.SALE OR DISTRIBUTIONson of orthologs that differNOT in structure FOR SALE may provide OR DISTRIBUTION information about their evolution. homologous genes (homo- An example is insulin. Mammals and birds have only one gene for insulin, except for logs) Related genes in the rodents, which have two. FIGURE 4.21 illustrates the structures of these genes. We use the same species, such as alleles principle of parsimony in comparing the organization of orthologous genes by assum- on homologous chromosomes ing that a common feature predates the evolutionary separation of the two species. In or multiple genes in the same© Jones chickens, & Bartlett the single Learning, insulin gene LLC has two introns; one of the© two Jones homologous & Bartlett rat genes Learning, LLC genome sharing common NOT FORhas the SALE same ORstructure. DISTRIBUTION The common structure implies thatNOT the ancestralFOR SALE insulin OR gene DISTRIBUTION ancestry. had two introns. However, since the second rat gene has only one intron, it must have evolved by a gene duplication in rodents that was followed by the precise removal of one intron from one of the homologs. The organizations of some orthologs show extensive discrepancies between spe- © Jones & Bartlett cies.Learning, In these cases,LLC there must have been© deletion Jones or & insertion Bartlett of Learning,introns during LLC evo- NOT FOR SALE ORlution. DISTRIBUTION A well-characterized example is thatNOT of theFOR actin SALE genes. OR The DISTRIBUTION common features of actin genes are a nontranslated leader of <100 bases, a coding region of ~1200 bases, and a trailer of ~200 bases. Most actin genes have introns, and their positions can be aligned with regard to the coding sequence (except for a single intron some- times found in the leader). FIGURE 4.22 shows that almost every actin gene is differ- © Jones & Bartlett Learning, LLCent in its pattern of intron© Jonespositions. 
Among all the genes being compared, introns occur at 19 different sites. However, the number of introns per gene ranges only from zero to six. How did this situation arise? If we suppose that the primordial actin gene had introns, and that all current actin genes are related to it by loss of introns, different introns have been lost in each evolutionary branch. Probably some introns have been lost entirely, so the primordial gene could well have had 20 introns or more. The alternative is to suppose that a process of intron insertion continued independently in the different lines of evolution.

FIGURE 4.21 The rat insulin gene with one intron evolved by loss of an intron from an ancestor with two introns. [Diagram labels: common insulin gene (chicken and rat), with leader exon, intron, coding exon, intron, coding exon; deleting one intron gives the second insulin gene in rat, with leader exon, intron, coding exon.]

FIGURE 4.22 Actin genes vary widely in their organization. The sites of introns are indicated by dark boxes. The bar at the top summarizes all the intron positions among the different orthologs. [Genes shown: S. pombe; S. cerevisiae; Acanthamoeba; Thermomyces; C. elegans; D. melanogaster A6, A1, A4, A2; sea urchin C, J; chick muscle; rat muscle; rat cytoplasmic; human smooth muscle; human cardiac muscle; soybean.]

The relationship between individual exons and functional protein domains is somewhat erratic. In some cases there is a clear 1:1 relationship; in others no pattern can be discerned. One possibility is that the removal of introns has fused previously adjacent exons. This means that the intron must have been precisely removed, without changing the integrity of the coding region. An alternative is that some introns arose by insertion into an exon encoding a single domain. Together with the variations that we see in exon placement in cases such as the actin genes, the conclusion is that intron positions can evolve.

The correspondence of at least some exons with protein domains and the presence of related exons in different proteins leave no doubt that the duplication and juxtaposition of exons have played important roles in evolution. It is possible that the number of ancestral exons (from which all proteins have been derived by duplication, variation, and recombination) could be relatively small, perhaps only a few thousand. The idea that exons are the building blocks of new genes is consistent with the “introns early” model for the origin of genes encoding proteins (see Section 8.6, How Did Interrupted Genes Evolve?).
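The parsimony argument used for the insulin genes can be made concrete with a small calculation. The Python sketch below is illustrative only. It applies Fitch's counting rule for small parsimony (a standard method that this chapter does not itself name) to the presence or absence of the second insulin intron. The tree shape follows the text, with a duplication inside the rodent lineage after its split from birds. The function name fitch and the labels rat_gene_a and rat_gene_b are hypothetical, chosen only for this example.

```python
def fitch(tree, leaf_states):
    """Return (candidate root states, minimum number of changes) for `tree`.

    `tree` is either a leaf name (str) or a (left, right) pair of subtrees.
    `leaf_states` maps each leaf name to its observed character state.
    """
    if isinstance(tree, str):
        # A leaf: its state is observed, so no change is needed.
        return {leaf_states[tree]}, 0

    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)

    shared = left_set & right_set
    if shared:
        # The two subtrees agree on at least one state: no extra change.
        return shared, left_cost + right_cost
    # The subtrees disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Assumed tree: chicken splits off first; the two rat genes arise by a
# later duplication within rodents (labels are hypothetical).
tree = ("chicken", ("rat_gene_a", "rat_gene_b"))

# Character: is the second intron present in each gene?
second_intron = {"chicken": True, "rat_gene_a": True, "rat_gene_b": False}

root_states, changes = fitch(tree, second_intron)
print(root_states, changes)  # prints: {True} 1
```

Under these assumptions the minimum is a single change, with the intron present at the root. This matches the inference in the text that the ancestral insulin gene had two introns and that one rat gene later lost one. The count identifies the simplest history for the assumed tree; it does not by itself show that this history is the true one.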

KEY CONCEPTS
• A common feature in a set of genes is assumed to identify a property that preceded their separation in evolution.
• All globin genes have a common form of organization with three exons and two introns, suggesting that they are descended from a single ancestral gene.

CONCEPT AND REASONING CHECK
“Parsimony” is the least complex explanation for a given observation. For example, the explanation that modern actin genes evolved from an ancestral gene with many exons is more parsimonious than the explanation that modern actin genes gained introns independently. Evaluate the statement, “Selecting the most parsimonious explanation guarantees the choice of the correct explanation.”

4.9 Summary

Virtually all eukaryotic genomes contain interrupted genes. The proportion of interrupted genes is low in some fungi, but few genes are uninterrupted in multicellular eukaryotes. Introns are found in all classes of eukaryotic genes. The structure of the interrupted gene is the same in all tissues: exons are spliced together in RNA in the same order as they are found in DNA, and the introns, which usually have no coding function, are removed from RNA by splicing. Some genes are expressed by alternative splicing patterns, in which a particular sequence is removed as an intron in some situations, but retained as an exon in others. Often, when the organizations of orthologous genes are compared, the positions of introns are conserved. Intron sequences vary, and may even be unrelated, although exon sequences are clearly related. The conservation of exon sequence and position can be used to isolate related genes in different species.

The size of a gene is determined primarily by the lengths of its introns. Large introns probably first appeared in the multicellular eukaryotes, and there is an evolutionary trend toward increased intron (and consequently, gene) size. The range of gene sizes in mammals is generally from 1 to 100 kb, but it is possible to have even larger genes.

Some genes share only some of their exons with other genes, suggesting that they have been assembled by addition of exons representing functional “modular units” of the protein. Such modular exons may have been incorporated into a variety of different proteins.

CHAPTER QUESTIONS

1. Mutations that affect splicing:
   A. are usually inconsequential.
   B. are usually deleterious.
   C. are always deleterious.
   D. increase gene expression.
2. In which group of organisms are most genes interrupted (have introns)?
   A. bacteria
   B. yeast
   C. animals
   D. more than one of the above
3. The length of genes in a gene family often varies; this variation is most often determined by:
   A. length of the 5′ untranslated region.
   B. number and size of exons.
   C. number and size of introns.
   D. length of the 3′ untranslated region.
4. In general, for two genes that are related, which shows the closest relationship?
   A. exons
   B. introns
   C. both introns and exons
   D. the promoter region and first exon
5. What percentage of genes is uninterrupted in animals?
   A. <5%
   B. <10%
   C. <20%
   D. ~33%
6. In general, as genome size increases in different organisms:
   A. exons also increase in size.
   B. introns increase in size.
   C. both exons and introns increase in size.
   D. 5′ and 3′ untranslated regions increase in size.
7. Which group of organisms has the longest average intron size?
   A. bacteria
   B. flies
   C. worms
   D. mammals

8. The hypothesis that new genes are constructed by combining exons from existing genes is supported by the observation that genes with different functions:
   A. have all related introns.
   B. have some related exons.
   C. have all related exons.
   D. have some related introns.
9. Genes that encode related or identical proteins in an organism are part of:
   A. a gene family.
   B. a superfamily.
   C. homologous genes.
   D. orthologous genes.
10. α-globin and β-globin are:
   A. unrelated.
   B. orthologs of each other.
   C. paralogs of myoglobin.
   D. paralogs of each other.

KEY TERMS
alternative splicing
cDNA
exon
gene family
homologous genes (homologs)
interrupted gene
intron
mature transcript
orthologous genes (orthologs)
overlapping gene
paralogous genes (paralogs)
primary transcript
RNA splicing
superfamily

FURTHER READING
Black, D. L. (2003). Mechanisms of alternative pre-messenger RNA splicing. Annu. Rev. Biochem. 72, 291–336. An in-depth review of specific examples of alternative splicing, including the Drosophila sex determination system.
Faustino, N. A., and Cooper, T. A. (2003). Pre-mRNA splicing and human disease. Genes Dev. 17, 419–437. A review of the mechanisms by which problems with alternative splicing of pre-mRNA can result in human diseases.
Ponting, C. P., and Russell, R. R. (2002). The natural history of protein domains. Annu. Rev. Biophys. Biomol. Struct. 31, 45–71. A discussion of the origin and evolution of protein domains, including the suggestion that domains be classified in a hierarchical taxonomic system.
Reddy, A. S. N. (2007). Alternative splicing of pre-messenger RNAs in plants in the genomic era. Annu. Rev. Plant Biol. 58, 267–294. A review of unexpected recent discoveries of alternative splicing of plant genes.
Rodríguez-Trelles, F., Tarrío, R., and Ayala, F. J. (2006). Origins and evolution of spliceosomal introns. Annu. Rev. Genet. 40, 47–76. A review of classic and newer hypotheses about the origins of introns.
