Transcription | Principles of Biology from Nature Education
Principles of Biology
49 Transcription
Transcription is the process of copying information from DNA to RNA.
A scribe makes careful work. Much as a scribe copies one manuscript into another, transcription relays the information in DNA into a new but similar form, RNA. Jean Le Tavernier, portrait of Jean Miélot, after 1456.
Topics Covered in this Module
Transcription versus DNA Replication
Major Objectives of this Module
Explain the processes that occur during the three phases of transcription.
Describe the molecular factors that aid in transcription.
Relate the importance of specific sequences on the DNA molecule to the process of transcription.
Describe the differences between eukaryotic and prokaryotic transcription.
Describe RNA processing.
Transcription versus DNA Replication
Each diploid cell contains only two copies of each gene, but it needs to make a large amount of protein from that genetic information. The first step in this process is to create many copies of the genetic information as RNA instead of DNA. The process of transcription creates an RNA version of the information coded in the DNA.
Transcription is similar to DNA replication in that the DNA is unwound and a polymerase reaction adds the appropriate nucleotide substrates to the growing nucleotide chain. However, there are several key differences between DNA replication and transcription. During transcription, only one strand of the DNA is used as a template to create the RNA molecule. This is called the template strand. The other strand is called the non-template or coding strand. It is called the coding strand because its sequence matches the sequence of the newly created RNA strand, except that the RNA contains the nucleotide uracil (U) wherever the DNA contains thymine (T). The enzyme that performs the polymerase reaction in transcription is called RNA polymerase.
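The relationship between the coding strand, the template strand, and the RNA transcript described above can be sketched in a few lines of Python. This is only an illustration: the function names and the nine-base example sequence are hypothetical and do not come from the module.

```python
# Illustrative sketch only: the sequence and function names are hypothetical.

# Watson-Crick pairing between the two DNA strands.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def template_from_coding(coding_5to3: str) -> str:
    """Return the template strand, written 3'->5', that pairs
    base-by-base with a coding strand written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3)


def rna_from_coding(coding_5to3: str) -> str:
    """The RNA transcript matches the coding strand,
    with uracil (U) in place of thymine (T)."""
    return coding_5to3.replace("T", "U")


coding = "ATGGCATTC"                     # coding strand, 5'->3'
print(template_from_coding(coding))      # TACCGTAAG (template, 3'->5')
print(rna_from_coding(coding))           # AUGGCAUUC (RNA, 5'->3')
```

Reading the two printed lines together shows why the non-template strand is called the coding strand: the RNA can be predicted from it directly, without reference to the template.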
Bacteria have one type of RNA polymerase, while eukaryotes have at least three. RNA polymerase I transcribes genes that code for the large RNA molecules, called ribosomal RNA (rRNA), that are found in ribosomes. RNA polymerase II transcribes protein-coding genes and creates messenger RNA (mRNA). RNA polymerase III transcribes genes that code for transfer RNAs (tRNAs) that play a key role during translation. In addition to these, new RNA polymerases that produce RNA involved in regulation of gene expression have recently been identified.
RNA polymerase moves 3′ to 5′ along the template strand of the DNA and synthesizes the RNA molecule in the 5′ to 3′ direction. Using the coding strand as a reference, sequences that are on the 5′-side of a reference point are called "upstream," and sequences on the 3′-side are called "downstream." Unlike DNA polymerase, RNA polymerase does not need a primer to start transcription. The stretch of DNA that is transcribed into RNA is known as the transcription unit.
Transcription has three distinct phases: initiation, elongation and termination. During initiation, with the help of additional factors, RNA polymerase binds to the DNA and unwinds it. During the elongation phase, RNA polymerase moves along the DNA template and creates the RNA transcript. Finally, termination occurs when RNA polymerase reaches the termination site and the RNA transcript is released.
The initiation of transcription requires a special DNA sequence called a promoter. The promoter tells the RNA polymerase where to start transcription and is positioned upstream of the transcription start site, also known as the +1 site because it is the site at which the first RNA nucleotide is added. The promoter also tells RNA polymerase which DNA strand to use as the template. The sequences and factors involved in initiation differ between prokaryotic and eukaryotic transcription.
Transcription differs in prokaryotes and eukaryotes.
In prokaryotes, promoters are 40 to 50 base pairs long and include a six-base-pair sequence identical or similar to TATAAT. This sequence is located approximately 10 base pairs upstream from the +1 site and is known as the -10 box. A second key sequence, TTGACA, occurs 35 base pairs upstream from the +1 site, and is therefore called the -35 box. While most prokaryotic promoters include both a -10 box and a -35 box, the promoter sequences outside of these regions vary widely.
The sequences in eukaryotic promoters are more diverse than those in prokaryotic promoters. Despite this diversity, many eukaryotic promoters for protein-coding genes share a similar structure in their "core" promoter. One element of the core promoter, called the TATA box, is located 25 to 30 base pairs upstream from the transcription start site. Another consensus site, the TFIIB recognition element, is often located in the promoter region at approximately 35 base pairs upstream from the transcription start site. Finally, the core promoter may also include an initiator element centered on the transcription start site and a downstream core promoter element roughly 30 base pairs downstream of the +1 site (Figure 1).
Figure 1: Eukaryotic Promoter Structure. Several consensus sequences are found in the core promoter region of a gene that codes for a protein. Not all of these sequences are found in every promoter. A transcription start site consists of a core promoter element and a regulatory promoter. The core promoter elements include the TFIIB recognition element, the TATA box, the initiator element and the downstream core promoter element. © 2011 Nature Education All rights reserved.
Eukaryotes also use enhancer sequences, which increase the efficiency of transcription initiation of the corresponding gene. Enhancers may be located hundreds or thousands of base pairs from the promoter and are brought to the promoter by DNA looping.
This looping is facilitated by proteins known as activators. Proteins that inhibit looping are called repressors.
In addition to RNA polymerase, other factors are required for transcription. In prokaryotes, a protein subunit called sigma binds to the core RNA polymerase to create what is known as the RNA polymerase holoenzyme. It is the sigma portion of the holoenzyme that binds to the promoter to initiate transcription. There are a variety of sigma proteins, each with a slightly different structure. By pairing with different sigma proteins, RNA polymerase may bind to different promoters. The genes transcribed by the holoenzyme therefore depend on which sigma protein is present in the holoenzyme.
Eukaryotes also require additional factors for RNA polymerase to bind to the DNA. These proteins are called the general transcription factors. They assemble at the promoter first, and then RNA polymerase binds to form what is known as the transcription initiation complex.
Once the holoenzyme (in prokaryotes) or transcription initiation complex (in eukaryotes) is bound to the promoter, the DNA helix unwinds, exposing approximately 13 base pairs at a time. Using the template strand of DNA, RNA polymerase begins adding nucleotide monomers to the growing transcript. Once approximately 10 nucleotides are polymerized, initiation is considered complete and elongation begins.
Test Yourself
If a mutation changed the sequence of the -10 box, what would you expect the result to be?
BIOSKILL
DNA-RNA Hybridization
How do scientists determine which DNA sequences are bound by transcription factor proteins? Protein-DNA interactions are important for transcription, DNA replication, and many other biological processes, and it is important to understand where along the DNA the protein is binding. One of the laboratory techniques that scientists use to study protein-DNA interactions is chromatin immunoprecipitation (ChIP) (Figure 2).
The first step in ChIP is to cross-link the protein-DNA complexes in the cell using a cross-linking agent, such as formaldehyde. This maintains the association of the protein with the DNA so that the entire complex can be isolated. The DNA is then physically disrupted or enzymatically digested into approximately 500-base-pair pieces. The pieces of protein-bound DNA are then isolated using an antibody highly specific for the protein of interest and precipitated away from protein-DNA complexes not containing the protein of interest.
Cross-linking of the immunoprecipitated protein-DNA sample is then reversed by breaking the bonds between the protein and DNA. The DNA that was isolated with the protein is purified and analyzed using one of several techniques, including quantitative PCR, sequencing, or microarray. This allows scientists to identify which DNA sequences are directly bound to the protein of interest.
Figure 2: Steps of the chromatin immunoprecipitation (ChIP) procedure. In a ChIP procedure, bound protein is used to isolate the DNA sequences recognized by the protein. In this example, Caenorhabditis elegans genomic DNA sequences are bound to specific regulatory proteins, and these complexes are cross-linked, immunoprecipitated, and purified. The DNA sequences can be analyzed by PCR, microarray, cloning or Southern blotting. © 2013 Nature Education All rights reserved.
Elongation, termination, and processing create the final RNA transcript.
During elongation, RNA polymerase moves along the DNA template 3′ to 5′ and adds new nucleotides to the 3′ end of the RNA transcript (Figure 3). Nucleotides are added to the RNA by complementary base pairing to the DNA template strand. Base pairing during transcription follows the same rules as DNA base pairing, except that RNA contains uracil instead of thymine, so adenine in the template pairs with uracil in the RNA. Therefore, RNA polymerase uses the nucleotides CTP, GTP, ATP, and UTP to create the transcript.
RNA polymerase catalyzes the formation of phosphodiester bonds between these monomers as the transcript is created, at a rate of approximately 40 nucleotides per second. As transcription continues along the DNA, the RNA transcript separates from the DNA template and the DNA double helix is re-formed (Figure 4). A single gene may produce many RNA transcripts at the same time.
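As a rough illustration of elongation, the Python sketch below reads a template strand in the 3′ to 5′ direction, appends the complementary RNA nucleotide to the 3′ end of the transcript at each step, and estimates elongation time from the approximate rate of 40 nucleotides per second given above. The function names, the short template sequence, and the 1,200-nucleotide transcription unit are hypothetical values chosen for the example, and the time estimate assumes a constant rate with no pausing.

```python
# Illustrative sketch only: names, sequence, and lengths are hypothetical.

# Which RNA nucleotide pairs with each DNA template base.
TEMPLATE_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def elongate(template_3to5: str) -> str:
    """Walk the template 3'->5' and add each complementary
    nucleotide to the 3' end of the growing transcript."""
    transcript = ""
    for base in template_3to5:
        transcript += TEMPLATE_TO_RNA[base]  # new nucleotide joins the 3' end
    return transcript  # written 5'->3'


def elongation_seconds(length_nt: int, rate_nt_per_s: float = 40.0) -> float:
    """Estimated elongation time, assuming a constant rate."""
    return length_nt / rate_nt_per_s


print(elongate("TACCGTAAG"))       # AUGGCAUUC
print(elongation_seconds(1200))    # 30.0
```

At this rate a single polymerase would need about 30 seconds for the hypothetical 1,200-nucleotide unit, which helps explain why having many polymerases transcribe the same gene at once increases the supply of transcripts.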