[Figure: map of a prokaryote operon (Gene A, Gene C) on double-stranded DNA, labeled 5' to 3': URS sequences, -35 box (TTGACA), -10 Pribnow box (TATAAT), σ discriminator, transcription start (+1), Shine-Dalgarno sequence (AGGAGGT), translation start (ATG) and stop codons, and the rho or GC hairpin loop transcription termination sequences.]

DNA Prokaryote Transcription Steps (updated February 2013)

1. In initiation, the RNA polymerase core enzyme (two alpha, one beta, one beta prime and one omega subunit) binds the sigma (σ) subunit to form the holoenzyme. Sigma in turn binds to the Pribnow box and the TTGACA discriminator.


2. Transcription factors bind to the upstream regulatory sequences (URS) and to the alpha (α) subunits of the RNA polymerase. These factors affect the binding strength of the RNA polymerase and the ability of sigma (σ) to bend and open (melt) the DNA double helix around the transcription start point.

3. The RNA polymerase starts synthesis of the mRNA, assembling a strand of about 11 RNA nucleotides.

4. In clearance, sigma domains reposition so that the RNA polymerase holoenzyme can enter the elongation stage, where the rest of the template DNA is transcribed, increasing the length of the mRNA strand.

5. NusA and NusG bind to the RNA polymerase to keep it on track to finish DNA transcription. The elongation complex leaves behind the transcription factors associated with the URS. In rho-independent termination, NusA facilitates GC hairpin loop formation.

[Figure: elongation complex on Gene A with NusA and NusG bound to the RNA polymerase (α, β, β', ω, σ); nascent mRNA 5' AAUCGUAGGAGGUCCAGCGAUG 3'.]

5a. In rho-dependent termination, rho proteins bind to the mRNA strand and follow it until they catch up to the beta subunit of the RNA polymerase. When rho is bound to both the mRNA and the beta subunit, the complex falls off the DNA template strand, because rho functions like a DNA-RNA helicase and unwinds the mRNA from the template DNA strand. There is a stem loop but no uracil tract.

[Figure: rho-dependent termination on the Gene A / Gene C template, with rho bound to the mRNA at the RNA polymerase (α, β, β', ω, σ).]

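5b. In rho-independent termination, a GC hairpin loop preceding an A-rich template DNA sequence puts drag on the RNA polymerase. The polymerase is then more easily derailed, because the RNA-DNA hybrid behind the hairpin is held only by A-U base pairs, which have fewer hydrogen bonds than G-C pairs and are therefore weaker.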
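The signal described in step 5b, a GC-rich stem loop followed directly by a run of U residues, can be searched for in a transcript. The following is only a rough sketch under simplifying assumptions: the stem is a perfect inverted repeat of fixed length, the U tract starts immediately after the stem, and the function name, thresholds and example sequence are all hypothetical rather than taken from the handout.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the RNA sequence that base-pairs with `rna`, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_hairpin_terminators(rna, stem=5, min_loop=3, max_loop=8,
                             min_u=5, min_gc=0.6):
    """Find GC-rich hairpins that are followed directly by a U tract.

    Returns a list of (stem_start, loop_length, u_tract_start) tuples.
    All thresholds are illustrative assumptions, not measured values.
    """
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        left = rna[i:i + stem]
        gc_fraction = sum(base in "GC" for base in left) / stem
        if gc_fraction < min_gc:
            continue  # stem is not GC-rich enough
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop           # start of the 3' half of the stem
            right = rna[j:j + stem]
            if right != reverse_complement(left):
                continue                  # the two halves cannot pair
            if rna[j + stem:].startswith("U" * min_u):
                hits.append((i, loop, j + stem))
    return hits


# Hypothetical transcript: stem GCCGC, loop UUCG, stem GCGGC, then eight U.
example = "AUGACUAC" + "GCCGC" + "UUCG" + "GCGGC" + "UUUUUUUU"
terminators = find_hairpin_terminators(example)
```

A real terminator search would also allow mismatches and bulges in the stem and would score hairpin stability, which this sketch does not attempt.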

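[Figure: rho-independent termination. The mRNA (5' AAUCGUAGGAGGU ... AUG ...) forms a hairpin followed by a polyU tract (UUUUUUUUU) paired with the polyA template DNA (AAAAAAAAA); the RNA polymerase subunits (α, β, β', ω) are shown with NusA and NusG.]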
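Steps 1 to 3 above (sigma recognizing the -35 and -10 boxes, then synthesis starting at +1) can be illustrated with a small sequence scan. This is a minimal sketch, not a promoter-prediction method: it assumes exact consensus boxes, a 15 to 19 bp spacer between them, and a fixed 7 bp distance from the Pribnow box to +1. Those distances, the function names and the flanking bases are hypothetical. The bases from +1 onward are chosen so that the transcript begins with AAUCGUAGGAGGUCCAGCGAUG, the mRNA sequence shown with step 5.

```python
import re

# -35 box, assumed 15-19 bp spacer, -10 (Pribnow) box, assumed 7 bp to +1.
PROMOTER = re.compile(r"TTGACA[ACGT]{15,19}TATAAT[ACGT]{7}")


def find_transcription_start(coding_strand):
    """Return the 0-based index of the +1 base, or None if no promoter matches."""
    match = PROMOTER.search(coding_strand)
    return None if match is None else match.end()


def transcribe(coding_strand, start):
    """Return the mRNA from `start` onward.

    The mRNA is built on the template strand, so it has the same sequence
    as the coding (nontemplate) strand with U in place of T.
    """
    return coding_strand[start:].replace("T", "U")


# Hypothetical coding strand assembled from the boxes named in the figure.
dna = ("GGCC" + "TTGACA" + "ACGTACGTACGTACGTA" + "TATAAT" + "GCGCGCG"
       + "AATCGTAGGAGGTCCAGCGATGTTA")

plus_one = find_transcription_start(dna)
mrna = transcribe(dna, plus_one)

# Locate the Shine-Dalgarno sequence and the first AUG downstream of it.
sd_index = mrna.find("AGGAGGU")
start_codon_index = mrna.find("AUG", sd_index + len("AGGAGGU"))
```

Real promoters rarely match the consensus exactly, so a practical scan would score partial matches instead of requiring the exact boxes.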

DNA Transcription Steps (updated February 2013)

Eukaryote transcription is monocistronic, meaning that only one polypeptide coding region is under control of the promoter. The promoter has several sequences that are similar to the Pribnow and TTGACA boxes in prokaryote promoters. The TATA box (TATAAA) is almost identical to the Pribnow sequence (TATAAT). Only about 32% of the known eukaryote core promoters have the TATA box, at -26 to -31. If the initiator (Inr) core promoter element (YYA+1Nt/aYY)* is present at -2 to +4, it acts synergistically with the TATA box. The downstream promoter element (DPE) at +28 to +32 (a/g G a/t c/t g/a/c) in TATA-less promoters requires the initiator to function. The motif ten element (MTE) at +18 to +27 (C g/a A a/g C g/c c/a/g AACG g/c) also requires the initiator. It can act independently of the TATA box or the downstream promoter element, or synergistically when either is present.

Proximal promoter elements include the CAAT box (CCAAT) at -70 to -200 and the GC box (GGGCGG), also at -70 to -200, whose binding proteins act to tether long-range regulatory elements such as enhancers to the core promoter through the mediator transcription factor. Long-range regulatory elements (upstream activating sequences) include enhancers, silencers, and insulators. Insulators are found at euchromatin/heterochromatin boundaries, where they prevent bleed-over by enhancers and silencers of adjacent genes. Also included are locus control regions (LCRs), which organize downstream chromatin into open configurations that give RNA polymerase access for transcription. Matrix attachment regions (MARs) in interphase chromatin and scaffold attachment regions (SARs) in metaphase chromosomes organize chromatin into loops averaging 70 kilobases (kb) between attachment points, recruiting enzymes to assist tissue-specific and developmental stage-specific transcription of genes.
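The degenerate consensus sequences above can be written in IUPAC nucleotide codes and turned into regular expressions. The sketch below assumes the following translations of the handout's notation: Inr as YYANWYY (t/a written as W) and DPE as RGWYV. The MTE is left out, and the function names and the test sequence are hypothetical.

```python
import re

# IUPAC nucleotide codes used by the consensus sequences in this sketch.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "V": "[ACG]", "N": "[ACGT]",
}

ELEMENTS = {
    "TATA box": "TATAAA",
    "Inr": "YYANWYY",
    "DPE": "RGWYV",
    "CAAT box": "CCAAT",
    "GC box": "GGGCGG",
}


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus into a regex that also finds overlapping hits."""
    body = "".join(IUPAC[code] for code in consensus)
    return re.compile("(?=" + body + ")")


def scan_elements(sequence, elements=ELEMENTS):
    """Return {element name: [0-based start positions]} for the coding strand."""
    return {
        name: [m.start() for m in consensus_to_regex(cons).finditer(sequence)]
        for name, cons in elements.items()
    }


# Hypothetical sequence: GC box, CAAT box, TATA box, then an Inr-like site.
sequence = "GGGCGG" + "AC" + "CCAAT" + "GC" + "TATAAA" + "GGGG" + "TCAGTCT"
hits = scan_elements(sequence)
```

Short degenerate motifs such as the DPE also match by chance, so a realistic scan would accept an element only at its expected position relative to +1, which this sketch does not check.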

The TATA-box binding protein (TBP) binds the minor groove of the DNA at the TATAAA sequence with saddle-like antiparallel β sheets, which cause the DNA to bend almost 90° and melt (open up) in a fashion similar to sigma protein binding of the Pribnow, TGn and TTGACA boxes. TBP-associated factors (TAFs) bind the TATA-box binding protein to form the TFIID complex. TAF-1 and TAF-2 bind the initiator. SP1, which binds the GC box, also binds TAF-4 with a glutamine-rich transactivation domain. TFIIB binds to one end of TBP and to a GC-rich TFIIB recognition element (BRE) upstream of the TATA box. This gives both direction and strand specificity to the transcription pre-initiation complex, because TFIIB also binds RNA polymerase II (RNA pol II, a 12-subunit holoenzyme) with a cysteine-rich zinc-binding ribbon domain, recruiting RNA pol II to the pre-initiation complex with the DNA at base +1 above the active site center. TFIIF also assists the placement of promoter DNA in this complex, strengthening binding. Next TFIIE binds and recruits TFIIH, which has helicase activity. With TFIIE, the helicase unwinds the DNA in the active site, and TFIIF captures the nontemplate DNA strand, moving it out of the active site while the template strand migrates into the active site channel. TFIIE drops off. TFIIH also has kinase activity: it phosphorylates the C-terminal domain (CTD) of the largest RNA pol II subunit at multiple serine-rich amino acid repeats. This initiates promoter clearance. Site-specific serine phosphorylation and dephosphorylation are involved in RNA pol II binding (pre-initiation), promoter clearance, elongation and termination. Only RNA pol II with a dephosphorylated CTD can bind the promoter. In promoter clearance, TFIIH helicase activity and TFIIB assist formation of the transcription bubble. Once the RNA strand exceeds 10 nucleotides (bases), TFIIB drops off.
Additional phosphorylation of the CTD of RNA pol II by TFIIH pushes the polymerase into the elongation phase. TFIIH drops off. TFIID stays behind to form a new pre-initiation complex. TFIIF stays to keep the nontemplate DNA sequestered. TFIIS binds to the RNA pol II complex to keep the polymerase on track, much like NusA and NusG in prokaryote transcription. This gives a basal rate of transcription. Mediator binding of RNA pol II and of the transcription factors at proximal and long-range regulatory elements can speed up processing of the pre-initiation complex and movement through promoter clearance.

In eukaryotes there are three different RNA polymerases: RNA polymerase I transcribes rDNA; RNA polymerase II transcribes DNA that codes for polypeptides as hnRNA, and structural genes that produce splicing snRNA; and RNA polymerase III transcribes 5S rDNA, tDNA and other snDNA genes. Other transcription factors bind the CAAT box, GC boxes or CACCC boxes if present, as well as other sequences which may also be found in certain upstream regulatory sequences of a given structural gene promoter. Sometimes included in these regulatory sequences are response elements for different hormones, heat shock, light, etc., and possibly homeoboxes (animals) or MADS boxes (plants) that control developmental pathways. Regulation of eukaryotic genes appears to be more complex than that of prokaryotic genes. Once all necessary factors are in place, the DNA double helix opens and the RNA polymerase is able to transcribe the RNA directly.

Termination for rRNA sequences uses a rho-like factor that binds the DNA downstream of the termination site to cause RNA polymerase I to separate from the DNA. This differs from the prokaryote mechanism, in which rho binds the newly synthesized RNA molecule and, with helicase activity, unwinds it from the RNA polymerase-DNA complex to stop transcription. RNA polymerase III relies on a DNA termination sequence of many A nucleotides, generating a polyU sequence in the tRNA or 5S rRNA that makes it easier for the polymerase and RNA to separate from the DNA template. This is similar to the rho-independent system of prokaryotes, except that the G-C hairpin loop is not required. For RNA polymerase II transcribing polypeptide genes to ultimately make mRNA, the transcribed RNA is called hnRNA (heterogeneous nuclear RNA) or sometimes pre-mRNA (precursor mRNA). Transcription often continues for several thousand bases past the end of the polypeptide coding region.
On this 3' end of the hnRNA, about 30 to 50 nucleotides past the translation stop codon, there is the polyadenylation signal AAUAAA. Somewhere within 20 to 200 or more bases downstream of this signal, the excess RNA is cleaved and a polyA tail is added to the 3' end of the newly cleaved hnRNA. The other piece of RNA trails the RNA polymerase II. However, a 5'-3' exonuclease (Rat1 in yeast, homologous to the human exonuclease Xrn2) attaches to this free 5' end and chews up the RNA, moving rapidly towards the RNA polymerase II, where it stops transcription. A helicase may also be involved. As mentioned above, there may be DNA/RNA sequences or signals that either cause the RNA pol II to pause, or that recruit phosphatases to remove phosphate groups from the CTD serine repeats of the RNA pol II, slowing it down. This is analogous to rho termination, except that in bacteria the rho protein unwinds the RNA instead of chewing it up into pieces. A 7-methylguanosine cap is added to the 5' end of the hnRNA. This happens shortly after the nascent RNA strand appears, when guanylyltransferase and methyltransferase are recruited to the phosphorylated CTD of the RNA pol II. The 7-methylguanosine cap is required for translation initiation, in which eIF4F (eukaryotic initiation factor 4F), the 5' 7-methylguanosine cap of the mRNA, the 40S ribosomal subunit and the initiator tRNAiMet associate with the assistance of other initiation factors to form the 48S initiation complex. Also, in eukaryote structural genes there are non-coding regions of DNA within the polypeptide coding regions. The non-coding regions are called introns, while the coding regions are exons. The introns are spliced out, leaving a mature mRNA consisting of the 5' 7-methylguanosine cap, the exons, and the 3' polyA tail. The mature mRNA is transported through the nuclear pores to the cytoplasm for translation. In the eubacteria prokaryotes, structural mRNA does not have introns processed out.
In both archaebacteria and eubacteria prokaryotes, mRNA translation occurs in the cytoplasm even before transcription is completed. In prokaryotes, replication, transcription and translation (protein synthesis) can occur, and often do occur, at the same time [with some limited spatial separation].
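The eukaryote hnRNA processing described above (cleavage downstream of AAUAAA, addition of the polyA tail, and removal of introns) can be mimicked on a toy string. This is a sketch only: the cleavage offset of 20 nucleotides, the tail length, the intron coordinates and the example sequence are all hypothetical, and the 5' cap is not modeled because it is not part of the nucleotide string.

```python
def cleave_and_polyadenylate(hnrna, signal="AAUAAA", offset=20, tail_length=10):
    """Cut `offset` nt downstream of the signal and add a polyA tail.

    Returns (polyadenylated transcript, trailing fragment). The trailing
    fragment is the piece that the 5'-3' exonuclease would degrade.
    """
    position = hnrna.find(signal)
    if position == -1:
        raise ValueError("no polyadenylation signal found")
    cut = min(position + len(signal) + offset, len(hnrna))
    return hnrna[:cut] + "A" * tail_length, hnrna[cut:]


def splice(rna, introns):
    """Remove introns given as non-overlapping (start, end) half-open intervals."""
    exons, previous_end = [], 0
    for start, end in sorted(introns):
        exons.append(rna[previous_end:start])
        previous_end = end
    exons.append(rna[previous_end:])
    return "".join(exons)


# Hypothetical hnRNA: exon 1, one intron (GU...AG), exon 2 ending in a stop
# codon, a short 3' region with AAUAAA, 20 nt, then excess transcript.
exon1, intron, exon2 = "AUGGCC", "GUAAGUUUUCAG", "GAAUGA"
hnrna = exon1 + intron + exon2 + "CCCC" + "AAUAAA" + "G" * 20 + "U" * 10

tailed, trailing = cleave_and_polyadenylate(hnrna)
# Intron coordinates lie upstream of the cut, so they are still valid here.
mature = splice(tailed, [(len(exon1), len(exon1) + len(intron))])
```

In the cell these steps overlap in time with transcription; running them one after another here is only for readability.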

*R stands for purine A or G; Y stands for pyrimidine T (U) or C; N stands for any of the four (five) nucleotides.

The above information is largely taken from:

1) Lizabeth A. Allison (2012) Fundamental Molecular Biology, 2nd ed. John Wiley & Sons, Inc. USA.

2) Patricia Richard and James L. Manley (2009) Transcription termination by nuclear RNA polymerases. Genes & Development 23:1247-1269. Cold Spring Harbor Laboratory Press.

3) Victoria H. Cowling (2010) Regulation of mRNA cap methylation. Biochemical Journal 425:295-302. BJ www.biochemj.org.

4) Sacha A. F. T. van Hijum et al. (2009) Mechanisms and evolution of control logic in prokaryotic transcriptional regulation. Microbiology and Molecular Biology Reviews 73:481-509. American Society for Microbiology.