Eukaryotic and Prokaryotic Gene Structure Thomas Shafee*, Rohan Lowe
Total Page:16
File Type:pdf, Size:1020Kb
WikiJournal of Medicine, 2017, 4(1):2 doi: 10.15347/wjm/2017.002 Figure Article Eukaryotic and prokaryotic gene structure Thomas Shafee*, Rohan Lowe Abstract Genes consist of multiple sequence elements that together encode the functional product and regulate its expres- sion. Despite their fundamental importance, there are few freely available diagrams of gene structure. Presented here are two figures that summarise the different structures found in eukaryotic and prokaryotic genes. Common gene structural elements are colour-coded by their function in regulation, transcription, or translation. Introduction Previous images Gene structure Access to freely available diagrams is important for sci- entists, medical professions and the general pub- Genes contain the information necessary for living cells lic.[7][8]However, current open-access gene structure to survive and reproduce.[1][2] In most organisms, genes figures are limited in their scope (Supplementary fig- are made of DNA, where the particular DNA sequence ures 1-4), typically showing one or a few aspects (e.g. determines the function of the gene. A gene is tran- exon splicing, or promoter regions). Information-rich scribed (copied) from DNA into RNA, which can either diagrams that use a consistent colour scheme and lay- be non-coding (ncRNA) with a direct function, or an in- out should help the comprehension of complex con- termediate messenger (mRNA) that is then translated cepts.[9] into protein. Each of these steps is controlled by specific sequence elements, or regions, within the gene. Every This work provides two diagrams that summarise the gene, therefore, requires multiple sequence elements complex structure and terminology of genes. Common to be functional.[2] This includes the sequence that ac- elements of gene structure are presented in a con- tually encodes the functional protein or ncRNA, as well sistent layout and format to highlight the relationships as multiple regulatory sequence regions. These regions between components. Key differences between eukar- may be as short as a few base pairs, up to many thou- yotes and prokaryotes are indicated. sands of base pairs long. Much of gene structure is broadly similar between eu- karyotes and prokaryotes. These common elements Results largely result from the shared ancestry of cellular life in organisms over 2 billion years ago.[3] Key differences in Common gene structure features gene structure between eukaryotes and prokaryotes re- The structures of both eukaryotic and prokaryotic flect their divergent transcription and translation ma- genes involve several nested sequence elements. Each [4][5] chinery. Understanding gene structure is the foun- element has a specific function in the multi-step pro- dation of understanding gene annotation, expression, cess of gene expression. The sequences and lengths of [6] and function. these elements vary, but the same general functions are present in most genes.[2] Although DNA is a double- stranded molecule, typically only one of the strands en- codes information that the RNA polymerase reads to La Trobe Institute for Molecular Science, La Trobe University, produce protein-coding mRNA or non-coding RNA. Melbourne, Australia This 'sense' or 'coding' strand, runs in the 5' to 3' direc- *Author correspondence: [email protected] ORCID1: 0000-0002-2298-7593 tion where the numbers refer to the carbon atoms of ORCID2: 0000-0003-0653-9704 the backbone's ribose sugar. The open reading frame Supplementary material: Supplementary figures (ORF) of a gene is therefore usually represented as an Licensed under: CC-BY-SA Received 14-11-2016; accepted 17-01-2017 1 of 5 | WikiJournal of Medicine WikiJournal of Medicine, 2017, 4(1):2 doi: 10.15347/wjm/2017.002 Figure Article arrow indicating the direction in which the sense strand Eukaryotes is read.[10] The structure of eukaryotic genes includes features not Regulatory sequences are located at the extremities of found in prokaryotes (Figure 1). Most of these relate to genes. These sequence regions can either be next to the post-transcriptional modification of pre-mRNAs to pro- transcribed region (the promoter) or separated by duce mature mRNA ready for translation into protein. [11] many kilobases (enhancers and silencers). The pro- Eukaryotic genes typically have more regulatory ele- moter is located at the 5' end of the gene and is com- ments to control gene expression compared to prokar- posed of a core promoter sequence and a proximal pro- yotes.[5] This is particularly true in multicellular eukary- moter sequence. The core promoter marks the start site otes, humans for example, where gene expression var- for transcription by binding RNA polymerase and other ies widely among different tissues.[24] proteins necessary for copying DNA to RNA. The proxi- mal promoter region binds transcription factors that A key feature of the structure of eukaryotic genes is that modify the affinity of the core promoter for RNA poly- their transcripts are typically subdivided into exon and merase.[12][13] Genes may be regulated by multiple en- intron regions. Exon regions are retained in the final hancer and silencer sequences that further modify the mature mRNA molecule, while intron regions are activity of promoters by binding activator or repressor spliced out (excised) during post-transcriptional pro- [25] proteins.[14][15] Enhancers and silencers may be distantly cessing. Indeed, the intron regions of a gene can be located from the gene, many thousands of base pairs considerably longer than the exon regions. Once away. The binding of different transcription factors, spliced together, the exons form a single continuous therefore, regulates the rate of transcription initiation protein-coding regions, and the splice boundaries are at different times and in different cells.[16] not detectable. Eukaryotic post-transcriptional pro- cessing also adds a 5' cap to the start of the mRNA and Regulatory elements can overlap one another, with a a poly-adenosine tail to the end of the mRNA. These ad- section of DNA able to interact with many competing ditions stabilise the mRNA and direct its transport from activators and repressors as well as RNA polymerase. the nucleus to the cytoplasm, although neither of these For example, some repressor proteins can bind to the features are directly encoded in the structure of a [17] core promoter to prevent polymerase binding. For gene.[20] genes with multiple regulatory sequences, the rate of transcription is the product of all of the elements com- Prokaryotes bined.[18] Binding of activators and repressors to multi- ple regulatory sequences has a cooperative effect on The overall organisation of prokaryotic genes is mark- transcription initiation.[19] edly different from that of the eukaryotes (Figure 2). The most obvious difference is that prokaryotic ORFs Although all organisms use both transcriptional activa- are often grouped into a polycistronic operon under the tors and repressors, eukaryotic genes are said to be 'de- [5] control of a shared set of regulatory sequences. These fault off', whereas prokaryotic genes are 'default on'. ORFs are all transcribed onto the same mRNA and so The core promoter of eukaryotic genes typically re- are co-regulated and often serve related func- quires additional activation by promoter elements for tions.[26][27] Each ORF typically has its own ribosome expression to occur. The core promoter of prokaryotic binding site (RBS) so that ribosomes simultaneously genes, conversely, is sufficient for strong expression [5] translate ORFs on the same mRNA. Some operons also and is regulated by repressors. display translational coupling, where the translation An additional layer of regulation occurs for protein cod- rates of multiple ORFs within an operon are linked.[28] ing genes after the mRNA has been processed to pre- This can occur when the ribosome remains attached at pare it for translation to protein. Only the region be- the end of an ORF and simply translocates along to the tween the start and stop codons encodes the final pro- next without the need for a new RBS.[29] Translational tein product. The flanking untranslated regions (UTRs) coupling is also observed when translation of an ORF af- contain further regulatory sequences.[20] The 3' UTR fects the accessibility of the next RBS through changes contains a terminator sequence, which marks the end- in RNA secondary structure.[30] Having multiple ORFs point for transcription and releases the RNA polymer- on a single mRNA is only possible in prokaryotes be- ase.[21] The 5’ UTR binds the ribosome, which translates cause their transcription and translation take place at the protein-coding region into a string of amino acids the same time and in the same subcellular loca- that fold to form the final protein product. In the case of tion.[26][31] genes for non-coding RNAs the RNA is not translated but instead folds to be directly functional.[22][23] 2 of 5 | WikiJournal of Medicine WikiJournal of Medicine, 2017, 4(1):2 doi: 10.15347/wjm/2017.002 Figure Article Regulatory sequence Regulatory sequence Enhancer Enhancer /silencer Promoter 5’UTR Open reading frame 3 ’UTR /silencer Proximal Core Start Stop Terminator DNA Transcription Exon Exon Exon Intron Intron Pre- mRNA Post-transcriptional modification Protein coding region + 5’cap + Poly-A tail Mature mRNA Translation Protein Figure 1 | The structure of a eukaryotic protein-coding gene. Regulatory sequence controls when and where ex- pression occurs for the protein coding region (red). Promoter and enhancer regions (yellow) regulate the transcrip- tion of the gene into a pre-mRNA which is modified to remove introns (light grey) and add a 5' cap and poly-A tail (dark grey). The mRNA 5' and 3' untranslated regions (blue) regulate translation into the final protein product. Polycistronic operon Regulatory sequence Regulatory sequence Enhancer Enhancer /silencer Operator Promoter 5’UTR ORF UTR ORF 3’UTR /silencer Start Stop Start Stop DNA Transcription Protein coding region Protein coding region RBS RBS mRNA Translation Protein Figure 2 | The structure of a prokaryotic operon of protein-coding genes.