09166_CH01_001_026_FINL.qxp 6/30/07 1:21 PM Page 1

© 2007 Jones and Bartlett Publishers, Inc. NOT FOR SALE OR DISTRIBUTION 1

Introduction to Molecular Biology

OUTLINE OF TOPICS

1.1 Intellectual Foundation
Two studies performed in the 1860s provided the intellectual underpinning for molecular biology.

1.2 Genotypes and Phenotypes
Each gene is responsible for the synthesis of a single polypeptide.

1.3 Nucleic Acids
Nucleic acids are linear chains of nucleotides.

1.4 DNA Structure and Function
Transformation experiments led to the discovery that DNA is the hereditary material.
Chemical experiments also supported the hypothesis that DNA is the hereditary material.
The blender experiment demonstrated that DNA is the genetic material in bacterial viruses.
RNA serves as the hereditary material in some viruses.
Rosalind Franklin and Maurice Wilkins obtained x-ray diffraction patterns of extended DNA fibers.
James Watson and Francis Crick proposed that DNA is a double-stranded helix.
The central dogma provides the theoretical framework for molecular biology.
Recombinant DNA technology allows us to study complex biological systems.
A great deal of molecular biology information is available on the Internet.

Suggested Reading
Classic Papers

Photo courtesy of James Gathany / CDC

The term molecular biology first appeared in a report prepared for the Rockefeller Foundation in 1938 by Warren Weaver, then director of the Foundation’s Natural Sciences Division. Weaver coined the term to describe a research approach in which physics and chemistry would be used to address fundamental biological problems. Weaver proposed that the Rockefeller Foundation fund research efforts to seek molecular explanations for biological processes. His proposal was remarkably farsighted, especially considering that many of his contemporaries believed that living cells possessed a vital force that could not be explained by the chemical or physical laws that govern the inanimate world. In fact, some physicists entered the field of biology in the hope that they might discover new physical laws.

At the time of Weaver’s report, biology was on the threshold of major changes. Two new disciplines, biochemistry and genetics, had altered the way that biologists think about living systems. Biochemists had delivered a major blow to the vital force theory by demonstrating that cell-free extracts can perform many of the same functions as intact cells. Geneticists established that the functional and physical unit of heredity is the gene. However, they did not know how the hereditary information was stored in the gene, how the gene was replicated so that it could be transmitted to the next generation, or how the information stored in the gene determined a specific physical trait such as eye color.

Neither biochemistry nor genetics had the power to solve these problems on its own. In fact, it took an interdisciplinary effort involving specialists in many fields of the life sciences, including biochemistry, biophysics, chemistry, x-ray crystallography, developmental biology, genetics, immunology, microbiology, and virology, to solve the hereditary problem. This interdisciplinary effort resulted in the creation of a new discipline, molecular biology, which seeks to explain genetic phenomena in chemical and physical terms.

1.1 Intellectual Foundation

Two studies performed in the 1860s provided the intellectual underpinning for molecular biology.

The earliest intellectual roots of molecular biology can be traced back to the work of two investigators in the 1860s. No connection was apparent between the experiments performed by the two investigators for more than 75 years, but when the connection was finally made, the result was the birth of molecular biology and the beginning of a scientific revolution that continues today.

Mendel’s Three Laws of Inheritance

The work of the first investigator, Gregor Mendel, an Austrian monk and botanist, is familiar to all biology students and so will only be summarized briefly here. Mendel discovered three basic laws of inheritance by studying the way in which simple physical traits are passed on from one generation of pea plants to the next. For convenience, Mendel’s laws of inheritance will be described using two modern biological terms, gene for a unit of heredity and chromosome for a structure bearing several linked genes.

1. The law of independent assortment: Specific physical traits such as plant size and color are inherited independently of one another. Mendel was fortunate to have selected physical traits that were determined by genes that were on different chromosomes.
2. The law of independent segregation: A specific gene may exist in alternate forms called alleles. An organism inherits one allele for each trait from each parent. The two alleles, which may be the same or different, segregate (or separate) in germ cells (sperm or egg) and combine again during reproduction so that each parent transmits one allele to each offspring.
3. The law of dominance: For each physical trait, one allele is dominant so that the physical trait that it specifies appears in a definite 3:1 ratio. The alternative form is recessive. In Mendel’s peas, tallness was dominant and shortness recessive. Therefore, three times as many pea plants were tall as were short. Today we know that there are exceptions to the law of dominance. Sometimes neither allele is dominant. For instance, a plant that inherits a gene for a red flower and a gene for a white flower may produce a pink flower.

Unfortunately, scientists failed to recognize the significance of Mendel’s work during his lifetime. His paper remained obscure until about 1900, when scientists rediscovered Mendel’s laws of inheritance, giving birth to the science of genetics.

Miescher and DNA

The second investigator, the Swiss physician Friedrich Miescher, performed experiments that led to the discovery of deoxyribonucleic acid (DNA), which we, of course, now know is the hereditary material. Miescher did not set out to discover the hereditary material but instead was interested in studying cell nuclei from white blood cells, which he collected from pus discharges on discarded bandages that had been used to cover infected wounds. Miescher used a combination of protease (enzymes that hydrolyze proteins) digestion and solvent extraction to disrupt and fractionate the white blood cells. One fraction, which he called nuclein, contained an acidic material with unusually high phosphorus content. Miescher later found that salmon sperm cells, which have remarkably large cell nuclei, are also an excellent source of nuclein. In 1889, Miescher’s student, Richard Altmann, separated nuclein into protein and a substance with a very high phosphorus content that he named nucleic acid. Because of its high phosphorus content, investigators initially thought that nucleic acids might serve as storehouses for cellular phosphorus.
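The 3:1 ratio in Mendel's law of dominance can be checked by enumerating the allele combinations from a cross between two hybrid plants. The short Python sketch below is only an illustration: the allele symbols T (tall, dominant) and t (short, recessive) and the function names are assumptions made for this example, not notation taken from Mendel's work.

```python
from collections import Counter
from itertools import product


def cross(parent1, parent2):
    """Count offspring genotypes when each parent contributes one allele."""
    # sorted() places the uppercase (dominant) allele first,
    # so "tT" and "Tt" are counted as the same genotype.
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))


def phenotype(genotype, dominant="T"):
    """A plant is tall if it carries at least one dominant allele."""
    return "tall" if dominant in genotype else "short"


# Hypothetical cross between two hybrid (Tt) pea plants.
genotypes = cross("Tt", "Tt")

phenotypes = Counter()
for genotype, count in genotypes.items():
    phenotypes[phenotype(genotype)] += count

print(dict(genotypes))
print(dict(phenotypes))
```

Of the four equally likely allele combinations, three carry at least one T allele, which gives three tall plants for every short one.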


1.2 Genotypes and Phenotypes

Each gene is responsible for the synthesis of a single polypeptide.

Mendel’s experiments showed that the genetic makeup of an organism, its genotype, determines the organism’s physical traits, its phenotype. However, his experiments did not show how genes are able to determine complex physical traits such as plant color or size. Archibald Garrod, an English physician, was the first to provide an explanation for the relationship between genotype and phenotype. Garrod uncovered this relationship while studying alkaptonuria, a rare inherited human disorder in which the urine of affected individuals becomes very dark upon standing due to the accumulation of homogentisic acid, a breakdown product of the amino acid tyrosine. Garrod correctly proposed that alkaptonuria results from a recessive gene, which causes a deficiency in the enzyme that normally converts homogentisic acid into colorless products.

FIGURE 1.1 The two sugars present in nucleic acids. (a) Ribose and (b) deoxyribose each contain five carbon atoms and an aldehyde group and therefore belong to the aldopentose family of simple sugars. The only difference between the two sugars is that ribose has a hydroxyl group at carbon-2 and deoxyribose has a hydrogen atom at carbon-2.

Garrod’s work was generally ignored until the early 1940s, when the American geneticists George Beadle and Edward Tatum rediscovered it while seeking experimental proof for the connection between genes and enzymes. They believed that, if a gene really does specify an enzyme, it should be possible to create genetic mutants that cannot carry out specific enzymatic reactions. They therefore exposed spores of the bread mold Neurospora crassa to x-rays or UV radiation and demonstrated that the mutant molds had a variety of special nutritional needs. Unlike their wild-type parents, the mutants could not reproduce without having specific amino acids or vitamins added to their growth medium. A mutant that requires a specific supplement that is not required by the wild-type parent is called an auxotroph. Genetic analysis revealed that each auxotroph appeared to be blocked at a specific step in the metabolic pathway for the required amino acid or vitamin. Furthermore, the auxotrophs accumulated large quantities of the substance formed just prior to the blocked step. Thus, Beadle and Tatum had replicated in the bread mold the same type of situation that Garrod had observed in alkaptonuria. A defective gene caused a defect in a specific enzyme that resulted in the abnormal accumulation of an intermediate in a metabolic pathway.

As a result of their work with the N. crassa mutants, Beadle and Tatum proposed the one gene-one enzyme hypothesis, which states that each gene is responsible for synthesizing a single enzyme. We now know that many enzymes are made of more than one type of polypeptide chain and that a single mutation may affect just one of the polypeptide chains. Hence, the original one gene-one enzyme hypothesis was modified to become a one gene-one polypeptide hypothesis. However, as we will see later, even the one gene-one polypeptide hypothesis is an oversimplification.
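The reasoning behind the auxotroph experiments can be sketched with a toy model of a linear metabolic pathway. The Python example below is a simplified illustration only: the compound names are hypothetical placeholders, and the idea that any compound downstream of the block can serve as a supplement is an assumption of this model rather than a result stated above.

```python
# Hypothetical linear pathway: one enzyme converts each compound into the next.
PATHWAY = ["precursor", "intermediate_1", "intermediate_2", "end_product"]


def accumulated(blocked_step):
    """Compound that builds up when the enzyme for one step is missing.

    Step i converts PATHWAY[i] into PATHWAY[i + 1], so a block at step i
    leaves PATHWAY[i], the substance formed just before the block, unused.
    """
    return PATHWAY[blocked_step]


def rescuing_supplements(blocked_step):
    """Compounds downstream of the block (model assumption: any of them
    bypasses the missing enzyme when added to the growth medium)."""
    return PATHWAY[blocked_step + 1:]


for step in range(len(PATHWAY) - 1):
    print(f"block at step {step}: accumulates {accumulated(step)}, "
          f"rescued by {rescuing_supplements(step)}")
```

In this picture, alkaptonuria corresponds to a block at the step that consumes homogentisic acid, so homogentisic acid is the compound that accumulates.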


FIGURE 1.2 Conversion of straight chain ribose and deoxyribose to cyclic forms. (a) Conversion of ribose to ribofuranose and (b) conversion of deoxyribose to deoxyribofuranose. (c) Structure of furan, a five atom ring system.

1.3 Nucleic Acids

Nucleic acids are linear chains of nucleotides.

Investigators slowly came to realize that nucleic acids could be divided into two major groups: deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Although DNA was initially thought to be present in animals and RNA in plants, investigators eventually detected both kinds of nucleic acids in all living systems. The principal difference between DNA and RNA is that the former contains deoxyribose, and the latter, ribose (FIGURE 1.1).

The two five-carbon sugars (pentoses) differ in only one substituent: deoxyribose has a hydrogen atom at carbon-2, whereas ribose has a hydroxyl group at this position. By convention, the carbon of the aldehyde group in the pentose is C-1, the next carbon is C-2, and so forth. Each pentose chain can close to form a five-membered ring in which an oxygen bridge joins C-1 to C-4 (FIGURE 1.2a and b). Because the sugar rings are derivatives of furan (FIGURE 1.2c), they are called furanoses. Ribofuranose and deoxyribofuranose are often depicted as Haworth structures (named after Walter N. Haworth, the investigator who devised the representations). As illustrated in FIGURE 1.3, a Haworth structure represents a cyclic sugar as a flat ring perpendicular to the plane of the page with the ring-oxygen in the back and C-1 to the right. The ring’s thick lower edge projects toward the viewer, its upper edge projects back behind the page, and its substituents are visualized as being either above or below the plane of the ring. A line is used to represent a hydrogen atom attached to the sugar ring. Ring formation can lead to two different stereochemical arrangements at C-1. One arrangement, the α-anomer, is represented by drawing the hydroxyl group attached to C-1 below the plane of the ring and the other, the β-anomer, by drawing it above the plane of the ring. We will see in Chapter 4 that ribofuranose and deoxyribofuranose actually have puckered rather than planar conformations. Nevertheless, Haworth structures are convenient representations for sugar rings when precise three-dimensional information is not required.

FIGURE 1.3 Haworth structures for ribofuranose and deoxyribofuranose. A Haworth structure represents a cyclic sugar as a flat ring perpendicular to the plane of the page with the ring-oxygen in the back and C-1 to the right. A hydrogen atom attached to the sugar ring is represented by a line. Ring formation can lead to two different stereochemical arrangements at C-1: one in which the hydroxyl at C-1 points down (a) and another in which it points up (b). Structures shown: α-D-ribofuranose, β-D-ribofuranose, α-D-deoxyribofuranose, and β-D-deoxyribofuranose.

1.4 DNA Structure and Function

An early pioneer in the study of nucleic acid chemistry, Phoebus A. Levene found that DNA contains four different kinds of heterocyclic ring structures, which are now known simply as bases because they can act as proton acceptors. Two of the bases, thymine (T) and cytosine (C), are derivatives of pyrimidine (FIGURE 1.4a) and the other two, adenine (A) and guanine (G), are derivatives of purine (FIGURE 1.4b). RNA also contains cytosine, adenine, and guanine, but the pyrimidine uracil (U) replaces thymine (FIGURE 1.5). The only difference between uracil and thymine is that the latter contains a methyl group attached to carbon-5. Levene showed that T, C, A, and G combine with deoxyribose to form a class of compounds called deoxyribonucleosides (FIGURE 1.6) and that U, C, A, and G combine with ribose to form a related class of compounds called ribonucleosides (FIGURE 1.7). Each base is linked to the pentose ring by a bond that joins a specific nitrogen atom on the base (N-1 in pyrimidines and N-9 in purines) to C-1 on the furanose ring. This bond is termed an N-glycosidic bond. Because each nucleoside has two ring systems (the sugar and the base that is attached to it), a method is required to distinguish between atoms in each ring system. This problem is solved by adding a prime (′) after the sugar atoms. Thus, the first carbon atom in the sugar becomes 1′, the second 2′, and so forth.

FIGURE 1.4 Pyrimidine and purine bases in DNA. (a) Pyrimidine bases: thymine (T) and cytosine (C), shown with pyrimidine. (b) Purine bases: adenine (A) and guanine (G), shown with purine.

FIGURE 1.5 Uracil (U).

FIGURE 1.6 Deoxyribonucleosides. Structures shown: deoxythymidine, deoxycytidine, deoxyadenosine, and deoxyguanosine.

FIGURE 1.7 Ribonucleosides. Structures shown: uridine, cytidine, adenosine, and guanosine.

Nucleosides that have a phosphate group attached to the sugar group are called nucleotides. Ribonucleoside and deoxyribonucleoside derivatives are called ribonucleotides and deoxyribonucleotides, respectively (FIGURE 1.8). The pentose carbon atom to which the phosphate group is attached is given as part of the nucleotide’s name. Thus, the phosphate group is attached to C-3′ in uridine-3′-monophosphate (3′-UMP) and thymidine-3′-monophosphate (3′-dTMP) and to C-5′ in uridine-5′-monophosphate (5′-UMP) and thymidine-5′-monophosphate (5′-dTMP). As indicated in Table 1.1, nucleoside monophosphates have two alternative names. For example, cytidine-5′-monophosphate (5′-CMP) is also known as 5′-cytidylate.

FIGURE 1.8 Nucleotides. Nucleotides are formed by adding a phosphate group to the pentose ring in a nucleoside. (a) Nucleotides formed by adding a phosphate group to the 5′-hydroxyl group in uridine or thymidine: uridine-5′-monophosphate (5′-UMP) or 5′-uridylate, and thymidine-5′-monophosphate (5′-dTMP) or 5′-thymidylate. (b) Nucleotides formed by adding a phosphate group to the 3′-hydroxyl group in uridine or thymidine: uridine-3′-monophosphate (3′-UMP) or 3′-uridylate, and thymidine-3′-monophosphate (3′-dTMP) or 3′-thymidylate.

TABLE 1.1 Bases, Nucleosides, and Nucleotides

Base          Sugar        Nucleoside        5′-Mononucleotide
Uracil (U)    ribose       uridine           Uridine-5′-monophosphate or 5′-uridylate (5′-UMP)
Cytosine (C)  ribose       cytidine          Cytidine-5′-monophosphate or 5′-cytidylate (5′-CMP)
Adenine (A)   ribose       adenosine         Adenosine-5′-monophosphate or 5′-adenylate (5′-AMP)
Guanine (G)   ribose       guanosine         Guanosine-5′-monophosphate or 5′-guanylate (5′-GMP)
Thymine (T)   deoxyribose  deoxythymidine¹   Deoxythymidine-5′-monophosphate or 5′-deoxythymidylate (5′-dTMP)¹
Cytosine (C)  deoxyribose  deoxycytidine     Deoxycytidine-5′-monophosphate or 5′-deoxycytidylate (5′-dCMP)
Adenine (A)   deoxyribose  deoxyadenosine    Deoxyadenosine-5′-monophosphate or 5′-deoxyadenylate (5′-dAMP)
Guanine (G)   deoxyribose  deoxyguanosine    Deoxyguanosine-5′-monophosphate or 5′-deoxyguanylate (5′-dGMP)

¹ Deoxythymidine and deoxythymidine-5′-monophosphate are also called thymidine and thymidine-5′-monophosphate, respectively. When thymine is attached to ribose, the nucleoside is called ribothymidine and the nucleotide is called ribothymidylate. This nomenclature convention follows from the fact that thymine is most frequently attached to deoxyribose.

Levene also suggested that neighboring deoxyribonucleosides in DNA are joined to one another by 5′ to 3′ (5′→3′) phosphodiester bridges. Unfortunately, Levene is remembered more for a hypothesis that turned out to be wrong than for all his important contributions to the field of nucleic acid chemistry. To place Levene’s hypothesis in context, it is important to note that the analytical tools available to him were quite poor. Therefore, not surprisingly, Levene based his idea about DNA structure on the incorrect assumption that T, C, A, and G are present in DNA in equimolar concentrations. He therefore proposed that DNA has a tetranucleotide structure (FIGURE 1.9). This hypothetical structure became untenable in the 1930s when the Swedish researchers Torbjörn Caspersson and Einar Hammarsten showed that DNA has a very high molecular mass and therefore must be a very large molecule or macromolecule.

Nevertheless, scientists continued to accept the idea that DNA contains a repeating sequence of the four nucleotides. They were therefore forced to look elsewhere for a molecule that could carry genetic information because DNA, which they mistakenly thought had a repetitive and monotonous nucleotide sequence, seemed an unlikely candidate to carry much information. Proteins seemed to be much better candidates for the genetic material. Although little was known about protein structure in the early part of the twentieth century,


FIGURE 1.9 The tetranucleotide structure proposed by Phoebus A. Levene. This structure shows the correct connectivity between nucleotides but assumes that the structure closes on itself (is circular rather than linear). Although wrong, Levene's proposal had a marked influence on the field of nucleic acid chemistry for many years. Adapted from Klug, W. S., and Cummings, M. R. Concepts of Genetics, 2/e. Merrill Publishing Company, 1986.

there did seem to be many kinds of different proteins. Furthermore, proteins, along with DNA, were known to be part of the chromosomes, distinct structures in the nucleus that had been demonstrated to carry genes. Once DNA had been eliminated as the genetic material because it seemed to be a “dumb” molecule, investigators focused on proteins.

It was not until 1952 that the distinguished British organic chemist Lord Alexander Robertus Todd and his associates showed the way that the various groups in DNA are joined together. These DNA groups form a long unbranched polymer in which phosphate groups join 5′- and 3′-carbons of neighboring deoxyribonucleosides (FIGURE 1.10a). Thus, each linear DNA chain has a 3′- and a 5′-terminus. This directionality will be very important in future discussions of DNA structure and function. The structure of RNA is very similar to that for DNA (FIGURE 1.11a).

FIGURE 1.10 Segment of a polydeoxyribonucleotide. (a) Extended structure as a sodium salt, (b) stick figure structure, and (c) an abbreviated structure. The segment runs from the 5′ end to the 3′ end and carries the bases A, C, G, and T; its abbreviated structure is pApCpGpTp.

FIGURE 1.11 Segment of a polyribonucleotide. (a) Extended structure as a sodium salt, (b) stick figure structure, and (c) an abbreviated structure. The segment runs from the 5′ end to the 3′ end and carries the bases A, C, G, and U; its abbreviated structure is pApCpGpUp.

Just three years after the discovery that nucleic acids are linear chains of nucleotides, Frederick Sanger, working in England, showed that each kind of polypeptide molecule is a linear chain of amino acids arranged in a specific order. The amino acid order is responsible for the unique physical and chemical properties of each type of protein, including its ability to catalyze specific reactions, protect cells from foreign organisms and substances, transport materials, or support the cellular infrastructure. The awareness that DNA molecules are made of linear chains of nucleotides and polypeptides are made of linear chains of amino acids led to the sequence hypothesis, which proposes that nucleotide sequences specify amino acid sequences.

Drawing extended structures for DNA and RNA chains requires considerable time and space. It is more convenient to draw stick figure structures, which are adequate representations for many purposes (FIGURES 1.10b and 1.11b). The few simple conventions for drawing stick figure representations for DNA and RNA are as follows: (1) A single horizontal line represents the pentose ring. (2) The letter A, G, C, U, or T, at one end of the horizontal line, represents the purine or pyrimidine attached to C-1′ of the pentose ring. (3) The letter P, connected by short diagonal lines to adjacent horizontal lines, represents the 5′→3′ phosphodiester bond. (4) The symbol OH represents a hydroxyl group. An even simpler method for indicating nucleotide sequence is to just write the letters corresponding to the bases (FIGURES 1.10c and 1.11c).
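The abbreviated notation lends itself to a small worked example. The Python sketch below reads a string such as pApCpGpTp, in which each p stands for a phosphate group and each capital letter for a base, and returns the bases in 5′→3′ order. The function names are hypothetical, and classifying a chain as DNA or RNA solely by the presence of T or U is a simplifying assumption, since thymine can occasionally be attached to ribose.

```python
def parse_abbreviated(notation):
    """Return the bases, in 5'-to-3' order, from a string such as 'pApCpGpTp'."""
    # Every 'p' marks a phosphate group; the remaining letters are bases.
    bases = [symbol for symbol in notation if symbol != "p"]
    unknown = set(bases) - set("ACGTU")
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return bases


def nucleic_acid_type(bases):
    """Guess the nucleic acid from its distinguishing pyrimidine.

    Simplifying assumption: T marks DNA and U marks RNA.
    """
    has_t, has_u = "T" in bases, "U" in bases
    if has_t and has_u:
        raise ValueError("sequence mixes T and U")
    if has_t:
        return "DNA"
    if has_u:
        return "RNA"
    return "undetermined"


for notation in ("pApCpGpTp", "pApCpGpUp"):
    bases = parse_abbreviated(notation)
    print(notation, "->", "".join(bases), nucleic_acid_type(bases))
```

Because the notation is always written from the 5′ end to the 3′ end, the first base in the returned list sits at the 5′ terminus and the last at the 3′ terminus.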

Transformation experiments led to the discovery that DNA is the hereditary material.

The first hint that genes are made of DNA came from an observation made in 1928 by Fred Griffith, who was studying Streptococcus pneumoniae, the bacterium responsible for human pneumonia. The virulence of this bacterium was known to depend on a surrounding polysaccharide capsule that protects the bacterium from the body’s defense systems. This capsule also causes the bacterium to produce smooth-edged (S) colonies on an agar surface. It was known that S bacteria normally killed mice. Griffith isolated a rough-edged (R) colony mutant, which proved to be both non-encapsulated and non-lethal. He then observed that while either live R or heat-killed S bacteria are non-lethal, a mixture of the two is lethal (FIGURE 1.12). Furthermore, when bacteria were isolated from a mouse that had died from such a mixed infection, the bacteria were live S and R. Therefore, the live R bacteria had somehow either been replaced by or transformed to S bacteria.

FIGURE 1.12 Griffith’s experiment demonstrating bacterial transformation. A mouse dies from pneumonia if injected with the virulent S (smooth) strain of Streptococcus pneumoniae. However, the mouse remains healthy if injected with either the nonvirulent R (rough) strain or the heat-killed S strain. R cells in the presence of heat-killed S cells are transformed into the virulent S strain, killing the mouse.

Cells injected                             Result                      Colonies isolated
Living S cells                             Mouse contracts pneumonia   S colonies, from tissue of dead mouse
Living R cells                             Mouse remains healthy       R colonies, from tissue
Heat-killed S cells                        Mouse remains healthy       No colonies, from tissue
Living R cells plus heat-killed S cells    Mouse contracts pneumonia   R and S colonies, from tissue of dead mouse

Several years later, investigators showed that the mouse itself was not needed to mediate this transformation because when a mixture containing live R bacteria and heat-killed S bacteria was incubated in culture medium, living S cells were produced. One possible explanation for this surprising phenomenon was that the R cells restored the viability of the dead S cells. However, this hypothesis was eliminated by the observation that living S cells appeared even when the heat-killed S culture in the mixture was replaced by a cell extract prepared from broken S cells, which had been freed from both intact cells and the capsular polysaccharide by centrifugation. Hence, it was concluded that the cell extract contained a transforming principle, the nature of which was unknown.

The next development occurred in 1944 when Oswald Avery, Colin MacLeod, and Maclyn McCarty determined the chemical nature of the transforming principle. They did so by isolating DNA from S cells and adding the DNA to live R bacterial cultures (FIGURE 1.13). After allowing the mixture to incubate for a period of time, they placed samples on an agar surface and incubated them until colonies appeared. Some of the colonies (about 1 in 10⁴) that grew were S type. To show that this was a permanent genetic change, Avery and coworkers dispersed many of the newly formed S colonies and placed them on a second agar surface. The resulting colonies were again S type. If an R colony arising from the original mixture was dispersed, only R bacteria grew in subsequent generations. Hence, the R colonies retained the R character, whereas the transformed S colonies bred true as S. Because S and R colonies differed by a polysaccharide coat around each S bacterium, Avery and coworkers tested the ability of purified polysaccharide to transform, but observed no transformation. Since the procedures for isolating DNA then in use produced DNA containing many impurities, it was necessary to provide evidence that the transformation was actually caused by DNA and not an impurity. This evidence was provided by the following four procedures.

1. Chemical analysis of samples containing the transforming principle showed that the major component was a deoxyribose-containing nucleic acid.
2. Physical measurements showed that the sample contained a highly viscous substance having the properties of DNA.
3. Incubation with trypsin or chymotrypsin, enzymes that catalyze protein hydrolysis, or with ribonuclease (RNase), an enzyme that catalyzes RNA hydrolysis, did not affect transforming activity. Thus, the transforming principle is neither protein nor RNA.
4. Incubation with deoxyribonuclease (DNase), an enzyme that catalyzes DNA hydrolysis, inactivated the transforming principle.

FIGURE 1.13 The Avery, MacLeod, and McCarty transformation experiment. Preparation of transforming principle from S strain: cells of the encapsulated S strain are disrupted and centrifuged to give a cell-free extract, which is precipitated with ethanol and redissolved in water to give the transforming principle. Addition of transforming principle to R strain: the transforming principle from the S strain is mixed with the R strain and spread on agar plates, giving a culture containing both S and R cells.

In drawing conclusions from their experiments, Avery, MacLeod, and McCarty avoided stating explicitly that DNA was the hereditary material and concluded at the end of their work only that nucleic acids have “biological specificity, the chemical basis of which is as yet undetermined.” The problem they faced in persuading the scientific community to accept their conclusion was that the hereditary material had to be a substance capable of enormous variation in order to contain the information carried by the huge number of genes. However, the tetranucleotide hypothesis seemed incompatible with the idea that DNA could be the sole component of the hereditary material. Furthermore, the consensus was that genes were made of chromosomal protein, an idea that, in the course of a 40-year period, had logically evolved from the recognition that protein composition and structure varied greatly among organisms. For these reasons, the transformation experiments had little initial impact. Investigators who supported the genes-as-protein theory posed two alternative explanations for the transformation results. (1) The transforming principle might not be DNA but rather one of the proteins invariably contaminating the DNA sample. (2) DNA somehow affected capsule formation directly by acting in the metabolic pathway for biosynthesis of the polysaccharide and permanently altering this pathway. The first point should have already been discounted by the original work because the experiments showed insensitivity to proteolytic enzymes and sensitivity to DNase. However, since the DNase was not a pure enzyme, the possibility could not be eliminated conclusively.

The transformation experiment was repeated five years later by Rollin Hotchkiss with a DNA sample with a protein content that was only 0.02%, and it was found that this extensive purification did not reduce the transforming activity. This result supported the view of Avery, MacLeod, and McCarty but still did not prove it. The second alternative, however, was clearly eliminated, also by Hotchkiss, with an experiment in which he transformed a penicillin-sensitive bacterial strain to penicillin resistance. Because penicillin resistance is totally distinct from the rough-smooth character of the bacterial capsule, this experiment showed that the transforming ability of DNA was not limited to capsule synthesis. Interestingly enough, most biologists still remained unconvinced that DNA was the genetic material. It was not until Erwin Chargaff showed in 1950 that a wide variety of chemical structures in DNA were possible, thus allowing biological specificity, that this idea was accepted.

Chemical experiments also supported the hypothesis that DNA is the hereditary material.

The hypothesis of a tetranucleotide structure for DNA arose from the belief that DNA contained equimolar quantities of adenine, thymine, guanine, and cytosine. This incorrect conclusion arose for two reasons. First, in the chemical analysis of DNA, the technique used to separate the bases before identification did not resolve them very well, so the quantitative analysis was poor. Second, the DNA analyzed was usually isolated from animals, plants, and yeast in which the four bases are present in nearly equimolar concentration, or from bacterial species such as Escherichia coli that also happened to have nearly equimolar base concentrations. Using the DNA from a wide variety of organisms, Chargaff applied new separation and analytical techniques and showed that the molar content of bases (generally called the base composition) could vary widely. The base composition of DNA from a particular organism is usually expressed as a fraction of all bases that are G • C pairs. This fraction, called the G + C content, can be expressed as follows:

G + C content = ([G] + [C])/[all bases]

where the square brackets ([ ]) denote molar concentrations.

Chargaff’s studies also revealed one other remarkable fact about DNA base composition. In each of the DNA samples that Chargaff studied, he found that [A] = [T] and [G] = [C]. Although the significance of these equalities, known as Chargaff’s rules, was not immediately apparent, they would later help to confirm the structure of the DNA molecule.

Base compositions of hundreds of organisms have been determined. Generally speaking, the value of the G + C content is near 0.50 for the higher organisms and has a very small range from one species to the next (0.49–0.51 for the primates). For the lower organisms, the value of the G + C content varies widely from one genus to another. For example, for bacteria the extremes are 0.27 for the genus Clostridium and 0.76 for the genus Sarcina; E. coli DNA has the value 0.50. Thus, it was demonstrated that DNA could have variable composition, a primary requirement for the hereditary material. Upon publication of Chargaff’s results, the tetranucleotide hypothesis quietly died and the DNA-gene idea began to catch on. Shortly afterward, workers in several laboratories found that, for a wide variety of organisms, somatic cells have twice the DNA content of germ cells, a characteristic to be expected of the genetic material, given the tenets of classical chromosome genetics. Although the same would be true of any component of chromosomes, once this result was revealed, objections to the work of Avery, MacLeod, and McCarty were no longer heard, and the hereditary nature of DNA rapidly became the fashionable idea.
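The G + C content formula and Chargaff’s rules can be tried out on a short sequence. The following Python sketch is only an illustration: the function names and the eight-base example strand are hypothetical, and the partner strand is generated by A-T and G-C pairing, which is the pairing scheme that later explained Chargaff’s equalities.

```python
from collections import Counter


def base_counts(strand: str) -> Counter:
    """Count A, T, G, and C in a single strand."""
    strand = strand.upper()
    unknown = set(strand) - set("ATGC")
    if unknown:
        raise ValueError(f"unexpected symbols: {sorted(unknown)}")
    return Counter(strand)


def duplex_counts(strand: str) -> Counter:
    """Base counts for a strand together with its A-T / G-C paired partner."""
    partner = strand.upper().translate(str.maketrans("ATGC", "TACG"))
    return base_counts(strand) + base_counts(partner)


def gc_content(counts: Counter) -> float:
    """G + C content = ([G] + [C]) / [all bases]."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("empty sequence")
    return (counts["G"] + counts["C"]) / total


# Hypothetical strand, chosen only for illustration.
example = "GGGGCAAT"
single = base_counts(example)
double = duplex_counts(example)

print("single strand:", dict(single))
print("duplex:", dict(double))
print("G + C content of duplex:", gc_content(double))
print("[A] = [T]:", double["A"] == double["T"])
print("[G] = [C]:", double["G"] == double["C"])
```

In this sketch the single strand does not obey the equalities, whereas the duplex does, which is why base composition is usually reported for double-stranded DNA.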

The blender experiment demonstrated that DNA is the genetic material in bacterial viruses.

An elegant confirmation that DNA is the hereditary material came from an experiment performed by Alfred Hershey and Martha Chase in 1952 in which viral replication was studied in the bacterium E. coli. Bacterial viruses are called bacteriophages or phages for short (see Chapter 8 for more information about bacteriophages and other viruses). The experiment, known as the blender experiment because a kitchen blender was used as a major piece of apparatus, demonstrated that the DNA injected by a bacteriophage T2 particle into a bacterium contains all of the information required to synthesize progeny phage particles.

A single phage T2 particle consists of DNA (now known to be a single molecule) encased in a protein shell and a long protein tail by which it attaches to sensitive bacteria (FIGURE 1.14). Hershey and Chase showed that an attached phage can be torn from a bacterial cell wall by violent agitation of the infected cells in a kitchen blender. Thus, it was possible to separate an adsorbed phage from a bacterium and determine the component(s) of the phage that could not be shaken free by agitation; presumably, those components had been injected into the bacterium.

DNA is the only phosphorus-containing substance in the phage particle. The proteins of the shell, which contain the amino acids methionine and cysteine, have the only sulfur atoms. T2 phage containing radioactive DNA can be prepared by infecting bacteria with T2 phage in a growth medium that contains [32P]phosphate as the sole source of phosphorus. If instead the growth medium contains radioactive sulfur as [35S]sulfate, phage containing radioactive proteins are obtained. When these two kinds of labeled phages are used to infect a bacterial host, the phage DNA and the protein molecules can always be located by their radioactivity. Hershey and Chase used these phages to show that 32P but not 35S remains associated with the bacterium.

FIGURE 1.14 Bacteriophage T2. (a) Drawing of E. coli phage T2, showing various components. The DNA, which is confined to the interior of the head, is the only component labeled by 32P. Methionine and cysteine in the proteins can be labeled with 35S. (b) An electron micrograph of phage T4, a closely related phage. [Electron micrograph courtesy of Robert Duda, University of Pittsburgh.] [Labels in (a): head (protein and DNA); tail (protein only).]


35S-Labeled phage particles were adsorbed to bacteria for a few minutes. The bacteria were separated from unadsorbed phage and phage fragments by centrifuging the mixture and collecting the sediment (the pellet), which consisted of the phage-bacterium complexes. These complexes were resuspended in aqueous solutions and blended. The suspension was again centrifuged, and the pellet (consisting almost entirely of bacteria) and the supernatant were collected. It was found that 80% of the radioactive sulfur was in the supernatant and 20% was in the pellet. The 20% of the 35S that remained associated with the bacteria was shown many years later to consist mostly of phage tail fragments that adhered too tightly to the bacterial surface to be removed by the blending.

A very different result was observed when the phage population was labeled with radioactive phosphate. In this case, 70% of the 32P remained associated with the bacteria in the pellet after blending and only 30% was in the supernatant. Of the radioactivity in the supernatant, roughly one third could be accounted for by breakage of the bacteria during the blending. (The remainder was shown some years later to be a result of defective phage particles that could not inject their DNA.) When the pellet material was resuspended in growth medium and reincubated, it was found to be capable of phage production. Thus, the ability of a bacterium to synthesize progeny phage is associated with transfer of 32P, and hence of DNA, from parental phage to the bacteria.

Another series of experiments (FIGURE 1.15), known as transfer experiments, supported the interpretation that genetic material contains 32P but not 35S. In these experiments, progeny phage were isolated, after blending, from cells that had been infected with either 35S- or 32P-containing phage and the progeny were then assayed for radioactivity. The idea was that some parental genetic material should be found in the progeny. No 35S but about half of the injected 32P was transferred to the progeny. This result indicated that although 35S might be residually associated with the phage-infected bacteria, it was not part of the phage genetic material. The interpretation (now known to be correct) of the transfer of only half of the 32P was that progeny DNA is selected at random for packaging into protein coats and that not all progeny DNA is successfully packaged.
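The percentages reported for the blender and transfer experiments can be tallied in a few lines. The Python sketch below is simple bookkeeping, not a reconstruction of the original analysis; the variable names are hypothetical, and equating the “injected” 32P with the 70% found in the pellet is an assumption made only for illustration.

```python
# Fraction of each radioactive label found in the pellet (with the bacteria)
# after blending, as reported in the text.
pellet = {"35S": 0.20, "32P": 0.70}

# Whatever is not in the pellet is in the supernatant.
supernatant = {label: 1.0 - fraction for label, fraction in pellet.items()}

# Roughly one third of the 32P in the supernatant came from broken bacteria;
# the remainder was later traced to defective phage particles.
p32_from_breakage = supernatant["32P"] / 3
p32_from_defective_phage = supernatant["32P"] - p32_from_breakage

# Assumption for illustration: the injected 32P is equated with the pellet
# fraction, and about half of it is transferred to progeny phage.
p32_in_progeny = 0.5 * pellet["32P"]

for name, value in [
    ("35S in supernatant", supernatant["35S"]),
    ("32P in supernatant", supernatant["32P"]),
    ("32P released by cell breakage", p32_from_breakage),
    ("32P in defective phage", p32_from_defective_phage),
    ("32P passed to progeny", p32_in_progeny),
]:
    print(f"{name}: {value:.0%}")
```

Under these assumptions, about 10% of the input 32P is lost through cell breakage, about 20% stays in defective phage, and about 35% reappears in progeny, whereas essentially no 35S does.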

FIGURE 1.15 The Hershey-Chase (“transfer”) experiment demonstrating that DNA, not protein, is directing reproduction of phage T2 in infected E. coli cells. [Diagram, two panels. (a) E. coli cells grown in 32P-containing medium (labels DNA) are infected with nonradioactive T2 phage; phage reproduction and cell lysis release DNA-labeled progeny phage, which are used to infect nonradioactive cells. After infection, the part of the phage remaining attached to the cells is removed by violent agitation in a kitchen blender. Phage reproduction and cell lysis then release progeny phage that contain some 32P-labeled DNA from the parental phage DNA. (b) The same steps starting with E. coli cells grown in 35S-containing medium (labels protein) yield protein-labeled phage; after infection and blending, the progeny phage contain almost no 35S-labeled protein.]

RNA serves as the hereditary material in some viruses.

The transformation and blender experiments settled once and for all the question of the chemical identity of the genetic material. The absolute generality of the conclusion remained in question, though, because several plant and animal viruses were known to contain single-stranded RNA and no DNA. These particles became understandable shortly afterward as the role of RNA in the flow of information from gene to protein became clear. That is, DNA stores genetic information for protein synthesis and the pathway from DNA to protein always requires the synthesis of an RNA intermediate, which is copied from a DNA template. Thus, a virus that lacks DNA utilizes the base sequence of RNA both for storage of information and as a template from which the amino acid sequence of proteins can be obtained. RNA does not serve these two functions as efficiently as the DNA→RNA system, but it does work.

Rosalind Franklin and Maurice Wilkins obtained x-ray diffraction patterns of extended DNA fibers.

At the same time that chemists were attempting to learn something about the composition of DNA, crystallographers were trying to obtain a three-dimensional image of the molecule. Rosalind Franklin and Maurice Wilkins obtained some excellent x-ray diffraction patterns of extended DNA fibers in the early 1950s (FIGURE 1.16). One might predict that all the x-ray diffraction patterns would look alike, but this was not the case. DNA structure and, therefore, x-ray diffraction patterns depend on several variables. One of the most important of these is the relative humidity of the chamber in which DNA fibers are placed. Two types of DNA structure are of particular interest. B-DNA is stable at a relative humidity of about 92%. A-DNA appears as the relative humidity falls to about 75%. Crystallographers did not know whether A-DNA or B-DNA is present in the living cell. Partly for this reason, Wilkins turned his attention toward taking x-ray diffraction pictures of DNA in sperm cells. Franklin focused her attention on x-ray diffraction patterns of A-DNA because they appeared to provide more detail. She believed that careful analyses of the detailed patterns would eventually lead to the solution of DNA’s structure.

FIGURE 1.16 X-ray diffraction patterns of the A and B forms of the sodium salt of DNA. Reproduced from Franklin, R. E., and Gosling, R. G., Acta Crystallographica 6 (1953): 673–677. Photos courtesy of International Union of Crystallography. [Panels: A-form DNA; B-form DNA.]

James Watson and Francis Crick proposed that DNA is a double-stranded helix.

The American biologist, James D. Watson, and the English crystallographer, Francis Crick, working together in England, took a different approach to determining DNA’s structure. They tried to obtain as much information as they could from the x-ray diffraction patterns and then to build a model consistent with this information.

The term model has a special meaning to scientists. A model is a hypothesis or tentative explanation of the way a system works, usually including the components, interactions, and sequences of events. A successful model suggests additional experiments and allows investigators to make predictions that can be tested in the laboratory. If predictions do not agree with experimental results, the model must be considered incorrect in its current form and modified. A model cannot be proved to be correct merely by showing that it makes a correct prediction. However, if it makes many correct predictions, it is probably nearly, if not completely, correct.

Watson and Crick focused their attention on Franklin’s x-ray diffraction patterns of B-DNA. This pattern indicated that B-DNA has a helical structure, a diameter of approximately 2.0 nm, and a repeat distance of 0.34 nm. Their model would have to account for these structural features. Watson and Crick still had to work out the number of DNA chains in a DNA molecule, the location of the bases, and the position of the phosphate and deoxyribose groups. The density of DNA seemed to be consistent with one, two, or three chains per molecule. Watson and Crick tried to build a two-chain model with hydrogen bonds (weak electrostatic attractions; see Chapter 2) holding the bases together. They were unsuccessful until Jerry Donohue suggested that they use the keto tautomeric forms of T and G in their models. At first, Watson tried to link two purines together and two pyrimidines together in what he called a “like-with-like” model. A dramatic turning point occurred in 1953 when Watson realized that adenine forms hydrogen bonds with thymine and guanine forms hydrogen bonds with cytosine. Watson describes this turning point in his book, The Double Helix:

“When I got to our still empty office the following morning, I quickly cleared away the paper from my desk top so that I would have a large, flat surface on which to form pairs of bases held together by hydrogen bonds. Though I initially went back to my like-with-like prejudices, I saw all too well that they led nowhere. When Jerry [Donohue] came in I looked up, saw that it was not Francis [Crick], and began shifting the bases in and out of various other pairing possibilities. Suddenly I became aware that an adenine-thymine pair held together by two hydrogen bonds was identical in shape to a guanine-cytosine held together by at least two hydrogen bonds. All the hydrogen bonds seemed to form naturally; no fudging was required to make the two types of base pairs identical in shape.”

With the realization that adenine-thymine and cytosine-guanine base pairs have the same width, Watson and Crick were quickly able to construct a double helix model of DNA that fit Franklin’s x-ray diffraction data (FIGURE 1.17).

FIGURE 1.17 The figure in the 1953 paper by Watson and Crick in Nature that shows their double helix model for DNA for the first time. Reproduced from Watson, J. D., and Crick, F. H. C., Nature 171 (1953): 737–738.

The key features of the Watson-Crick Model for B-DNA are as follows:
1. Two polydeoxyribonucleotide strands twist about each other to form a double helix.
2. Phosphate and deoxyribose groups form a backbone on the outside of the helix.
3. Purine and pyrimidine base pairs stack inside the helix and form planes perpendicular to the helix axis and the deoxyribose groups.
4. The helix diameter is 2.0 nm (or 20 Å).
5. Adjacent base pairs are separated by an average distance of 0.34 nm (or 3.4 Å) along the helix axis. The structure repeats itself after about ten base pairs, or about once every 3.4 nm (or 34 Å) along the helix axis.
6. Adenine always pairs with thymine and guanine with cytosine. The original model showed two hydrogen bonds stabilizing each kind of base pair. Although this was an accurate description for A-T base pairs, later work showed that G-C base pairs are stabilized by three hydrogen bonds (FIGURE 1.18). (These base pairing relationships explained Chargaff’s observation that the molar ratios of adenine to thymine and guanine to cytosine are one.)

FIGURE 1.18 Base pairs in DNA. [Structural diagram: an adenine-thymine pair and a guanine-cytosine pair, with each base attached to deoxyribose.]

7. The two strands are antiparallel, which means that the strands run in opposite directions. That is, one strand runs 3′→5′ in one direction while the other strand runs 5′→3′ in the same direction. Because the strands are antiparallel, a convention is needed for stating the sequence of bases of a single chain. The convention is to write a sequence with the 5′-P terminus at the left; for example, ATC denotes the trinucleotide 5′-ATC-3′. This is also often written as pApTpC, again using the conventions that the left side of each base is the 5′-terminus of the nucleotide and that a phosphodiester group is represented by a p between two capital letters.
8. A major and a minor groove wind about the cylindrical outer helical face. The two grooves are of about equal depth but the major groove is much wider than the minor groove.

The Watson-Crick Model indicates that when the nucleotide sequence of one strand is known, the sequence of the complementary strand can be predicted, providing the theoretical framework needed to understand the fidelity of gene replication. Each strand serves as a mold or template for the synthesis of the complementary strand (FIGURE 1.19).

FIGURE 1.19 Replication of DNA. Replication of DNA duplex as originally envisioned by Watson and Crick. As the parental strands separate, each parental strand serves as a template for the formation of a new daughter strand by means of A-T and G-C base pairing. [Diagram labels: parent duplex; daughter duplex; template strands; replica strands.]

Watson and Crick ended their short paper announcing the double helix model with the following sentence that must be one of the greatest understatements in the scientific literature: “It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.” The double helix showed how establishing a chemical structure can be used to understand biological function and to make predictions that guide new research.

Structure-function relationships remain a central theme of molecular biology. We will see, time and again, throughout this book how knowledge of structure helps us to understand function and leads us to new insights. Biological structures are sometimes quite complex and difficult to study. However, it is worth the effort to study the structures because the reward for doing so is so great. The Watson-Crick Model also serves as a powerful example of the fundamental principles that help to define molecular biology as a discipline. (1) The same physical and chemical laws apply to living systems and inanimate objects. (2) The same biological principles tend to apply to all organisms. (3) Biological structure and function are intimately related. The Watson-Crick Model provided an excellent starting point for the challenging job of elucidating the chemical basis of heredity.
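The pairing and antiparallel rules of the Watson-Crick Model, together with the 5′-at-the-left writing convention, lend themselves to a short illustration. The Python sketch below is a toy model under stated assumptions: the function names are hypothetical, both strands are written 5′→3′, and replication is reduced to pure templating with no enzymes or errors.

```python
PAIRING = str.maketrans("ATGC", "TACG")  # A pairs with T, G pairs with C


def complementary_strand(strand: str) -> str:
    """Return the partner strand, also written 5'->3'.

    Because the two strands are antiparallel, the partner is the
    base-by-base complement read in the reverse order.
    """
    strand = strand.upper()
    if set(strand) - set("ATGC"):
        raise ValueError("sequence may contain only A, T, G, and C")
    return strand.translate(PAIRING)[::-1]


def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Copy a duplex: each parental strand templates one new strand."""
    top, bottom = duplex
    return [
        (top, complementary_strand(top)),        # old top, new bottom
        (complementary_strand(bottom), bottom),  # new top, old bottom
    ]


def phosphate_notation(strand: str) -> str:
    """Write ATC as pApTpC, with the 5'-P terminus at the left."""
    return "".join("p" + base for base in strand.upper())


parent = ("ATC", complementary_strand("ATC"))
print(parent)
print(replicate(parent))
print(phosphate_notation("ATC"))
```

Each daughter duplex in this sketch keeps one parental strand and gains one newly templated strand, and both daughters carry the same sequence as the parent.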


The central dogma provides the theoretical framework for molecular biology.

Much of the research in molecular biology in the late 1950s was directed toward discovering the mechanisms by which nucleotide sequences specify amino acid sequences. In a talk given at a 1957 symposium, Crick suggested a model for information flow that provides a theoretical framework for molecular biology. The following excerpt from Crick’s presentation states the theory known as the central dogma in a clear and concise fashion.

“In more detail, the transfer of information from nucleic acid to nucleic acid, or from nucleic acid to protein may be possible, but transfer from protein to protein, or from protein to nucleic acid is impossible. Information means here the precise determination of sequence, either of bases in nucleic acid or of amino acid residues in protein.”

Thus, Crick was proposing that genetic information flows from DNA to DNA (DNA replication), from DNA to RNA (transcription), and from RNA to polypeptide (translation) (FIGURE 1.20).

FIGURE 1.20 The “central dogma.” The central dogma as originally proposed by Francis Crick postulated information flow from DNA to RNA to protein. The ribosome is an essential part of the translation machinery. Later studies demonstrated that information can also flow from RNA to RNA and from RNA to DNA (reverse transcription). [Diagram labels: DNA; replication; transcription; mRNA; translation; ribosome; polypeptide.]

By the mid-1960s, molecular biologists had obtained considerable experimental support for the central dogma. In particular, they had discovered enzymes that catalyze replication and transcription and elucidated the pathway for translating nucleotide sequences to amino acid sequences. With some variations in detail, all organisms use the same basic mechanism to translate information from nucleotide sequences to amino acid sequences. Three major components of the translation machinery, messenger RNA (mRNA), ribosomes, and transfer RNA (tRNA), play major roles in this information transfer (FIGURE 1.21). Each gene can serve as a template for the synthesis of specific mRNA molecules. Messenger RNA molecules program ribosomes (protein synthetic factories) to form specific polypeptides. Transfer RNA molecules carry activated amino acids to programmed ribosomes, where under the direction of the mRNA the amino acids join to form a polypeptide chain.

The twelve years of extraordinary scientific progress that followed the discovery of DNA’s structure were capped by a series of brilliant investigations by Marshall W. Nirenberg and others that culminated in deciphering the genetic code. By 1965, investigators could predict a polypeptide chain’s amino acid sequence from a DNA or mRNA molecule’s nucleotide sequence. Remarkably, the genetic code was found to be nearly universal. Each particular sequence of three adjacent nucleotides, or codon, specifies the same amino acid in bacteria, plants, and animals. Could one ask for more convincing support for the uniformity of life processes?
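The flow of information from a nucleotide sequence to an amino acid sequence can be illustrated with a very small piece of the genetic code. The Python sketch below is a simplification with stated assumptions: the function names and the example sequence are hypothetical, only eight of the 64 codons of the standard code are listed, and transcription is modeled by rewriting the DNA strand that has the same sequence as the mRNA (T replaced by U) rather than by copying from the template strand.

```python
# A few entries of the standard genetic code (mRNA codon -> amino acid).
# Only the codons needed for the example below are listed.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "AAA": "Lys",
    "UGG": "Trp", "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def transcribe(dna: str) -> str:
    """Return mRNA with the same 5'->3' sequence as the given DNA strand."""
    return dna.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read successive codons from the first AUG until a stop codon."""
    start = mrna.find("AUG")
    if start < 0:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]  # KeyError if codon is not listed
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Hypothetical sequence, chosen only for illustration.
dna = "ATGTTTGGCAAATGGTAA"
mrna = transcribe(dna)
print(mrna)
print("-".join(translate(mrna)))
```

Because the code is nearly universal, a lookup table of this kind, extended to all 64 codons, applies with few exceptions to sequences from bacteria, plants, and animals alike.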

FIGURE 1.21 Simplified schematic diagram of the protein synthetic machinery in an animal, plant, or yeast cell. Transfer RNA (tRNA) carries the next amino acid to be attached to the growing polypeptide chain to the ribosome, which recognizes a match between a three nucleotide sequence in the mRNA (the codon) and a complementary sequence in the tRNA (the anticodon) and transfers the growing polypeptide chain to the incoming amino acid while it is still attached to the tRNA. Adapted from Secko, D., The Science Creative Quarterly 2 (2007). [Diagram labels: nucleus; cytoplasm; DNA; transcription; mRNA; amino acids; tRNA; incoming tRNA with amino acid; outgoing tRNA; ribosome; translation; growing polypeptide chain.]

Recombinant DNA technology allows us to study complex biological systems.

The second major wave in molecular biology started in the late 1970s with the development of recombinant DNA technology (also known as genetic engineering). Thanks to recombinant DNA technology, genes can be manipulated in the laboratory just like any other organic molecule. They can be synthesized or modified as desired and their nucleotide sequence determined. Sequence information can save molecular biologists months and perhaps even years of work. For example, when a new gene is discovered that causes a specific disease in humans, molecular biologists may be able to obtain valuable clues to the new gene’s function by comparing its nucleotide sequence with sequences of all other known genes. If similarities are found with a gene of known function from some other organism, then it is likely that the new gene will have a similar function. Alternatively, a segment of the nucleotide sequence of the new gene may predict an amino acid sequence that is known to have specific binding or catalytic functions.

The development of recombinant DNA technology has allowed the incredible progress that has been made and continues to be made in solving problems that seemed intractable just a few years ago. Recombinant DNA technology has helped investigators to study cell division, cell differentiation, transformation of normal cells to cancer cells, programmed cell death (apoptosis), antibody production, hormone action, and a variety of other fundamental biological processes.

One of the most exciting applications of recombinant DNA technology is in medicine. Until a few years ago, many diseases could only be studied in humans because no animal models existed. Very little progress was made in studying these diseases because of obvious ethical constraints associated with studying human diseases. Once investigators learned how to transfer genes from one species to another, they were able to create animal models for human diseases, thus facilitating study of the diseases. Recombinant DNA technology also has led to the production of new drugs to treat diabetes, anemia, cardiovascular disease, and cancer as well as to the development of diagnostic tools to detect a wide variety of diseases. The list of practical medical applications grows longer with each passing day.

Although recombinant DNA technology promises to change our lives for the better, it also forces us to consider important social, political, ethical, and legal issues. For example, how do we protect the interests of an individual when DNA analysis reveals that the individual has alleles that are likely to cause a serious physical or mental disease in the future, especially if there is no cure or treatment? What impact will the knowledge have on affected individuals and their families? Should insurance companies or potential employers have access to the genetic information? If not, how do we limit access to information and how do we enforce the limitation?

Rapid progress in recombinant DNA technology also raises troubling ethical issues. Germ line therapy allows new genes to be introduced into fertilized eggs and thereby alters the genetic characteristics of future generations. This technique has been used to introduce desired traits into plants and animals but it has not as yet been reported in humans. An argument can be made for using germ line therapy to correct human genetic diseases such as Tay-Sachs disease, cystic fibrosis, or Huntington disease so that affected individuals and their families can be spared the devastating consequences of these diseases. However, we must be very careful about application of germ line therapy to humans because the technique has the power to do great harm. Who will decide which genetic characteristics are desirable and which are undesirable? Should the technique be used to change physical appearance, intelligence, or personality traits? Should anyone be permitted to make such decisions?

A great deal of molecular biology information is available on the Internet.

Recombinant DNA technology has generated so much information that it would be nearly impossible to share all of it in a timely fashion with the entire molecular biology community by conventional means such as publishing in professional journals or writing books. Fortunately, there is an alternate method for sharing large quantities of rapidly accumulating information that is both quick and efficient. A worldwide network of communication networks, the Internet, allows us to gain almost instant access to the information. The Internet also provides many helpful tutorials and instructive animations. This text will include references to helpful Internet sites. However, because the addresses for these Web sites tend to change over time, this text will refer the reader to a primary site maintained by the publisher.


Suggested Reading

General

Choudhuri, S. 2003. The path from nuclein to human genome: a brief history of DNA with a note on human genome sequencing and its impact on future research in biology. Bull Sci Technol Soc 23:360–367.
Crick, F. 1974. The double helix: a personal view. Nature 248:766–769.
Crick, F. 1988. What Mad Pursuit: A Personal View of Scientific Discovery. New York: Basic Books.
Judson, H. F. 1996. The Eighth Day of Creation: Makers of the Revolution in Biology. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory.
Maddox, B. 2002. Rosalind Franklin: The Dark Lady of DNA. New York: Harper Collins.
McCarty, M. 1985. The Transforming Principle: Discovering that Genes Are Made of DNA. New York: W. W. Norton.
Olby, R. C. 1994. The Path to the Double Helix: The Discovery of DNA. New York: Dover Publications.
Pukkila, P. J. 2001. Molecular biology: the central dogma. Encyclopedia of Life Sciences. pp. 1–5. London, UK: Nature Publishing Co.
Sayre, A. 1978. Rosalind Franklin and DNA. New York: W. W. Norton.
Stent, G. 1972. Prematurity and uniqueness in scientific discovery. Sci Am 227:84–93.
Summers, W. C. 2002. History of molecular biology. Encyclopedia of Life Sciences. pp. 1–8. London, UK: Nature Publishing Co.
Watson, J. D. 1968. The Double Helix: A Personal Account of the Discovery of the Structure of DNA. New York: Atheneum Books.
Wilkins, M. 2003. The Third Man of the Double Helix: The Autobiography of Maurice Wilkins. Oxford, UK: Oxford University Press.

Classic Papers

Avery, O. T., MacLeod, C. M., and McCarty, M. 1944. Studies on the chemical nature of the substance inducing transformation of pneumococcal types. Induction of transformation by a deoxyribonucleic acid fraction isolated from Pneumococcus type III. J Exp Med 79:137–156.
Beadle, G. W., and Tatum, E. 1941. Genetic control of biochemical reactions in Neurospora. Proc Nat Acad Sci USA 27:499–506.
Hershey, A. D., and Chase, M. 1952. Independent functions of viral protein and nucleic acid in growth of bacteriophage. J Gen Physiol 36:39–56.
Watson, J. D., and Crick, F. H. 1953. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171:737–738.
