How New RNA Genes Are Born
Biology ︱ Nicholas Delihas

The study of gene birth and evolution focuses on the identification of ancestral genetic sequences, highly conserved during evolution, that can serve as a foundation for gene development. Nicholas Delihas, Professor Emeritus at the Renaissance School of Medicine at Stony Brook University, New York, has identified one such ancestral element and presented data and a model to show how new RNA genes were created with the ancestral sequence serving as a foundation. Evolutionarily, this small sequence also forms an important part of two ancient genes, the gamma-glutamyltransferase (GGT5) and the ubiquitin specific peptidase 18 (USP18), but how these protein genes were created has yet to be determined.

Gene duplication, the formation of new genes from an exact copy of existing ones, has long been considered the major process behind gene formation. However, it has been shown that genes may be born from non-coding DNA, regions of DNA that have open reading frames and display translational activity but do not encode proteins. This process is known as de novo protein gene formation based on a protogene sequence or a sequence that does not represent an existing gene. The concept has been formulated by Dr Anne-Ruxandra Carvunis and co-workers at the University of Pittsburgh and by investigators at other institutions. It constitutes a model to explain how during evolution new and different protein genes arose by a de novo mechanism instead of by existing gene sequence duplication.

Dr Nicholas Delihas, at the Renaissance School of Medicine at Stony Brook University, New York, initially was trying to understand a repeat DNA sequence present in an RNA gene family, the FAM230 long intergenic non-coding RNA (lincRNA) family that is found in human chromosome 22. This repeat sequence and the RNA gene family may be related to the onset of human genetic disorders involving aberrant chromosomal recombination and subsequent chromosomal deletions. However, these studies led to an unexpected discovery of an ancestral DNA repeat sequence, termed FAM247, that can serve as a gene-forming element: a nucleation site for new RNA gene development that parallels the concept envisioned for de novo protein gene birth by Dr Carvunis and co-workers and by scientists at other institutions.

Evolutionary scientists routinely survey the genetic architecture of human and primate populations, along with those of other species, to find out the specific roles that different genes play in the way different populations adapt to the environment and to map any changes to the genome that occurred during evolution. Lately, there has been a major focus on the genomic determination and sequencing of long non-coding RNA (lncRNA) genes, which are now considered to be key players in numerous biochemical pathways. This leads to the question of how many lncRNA genes are present in the genome, their age in evolutionary time and how they were born.

With an increased interest in the study of de novo genes, that is genes made from scratch and not from a template, this has also led to questions on what in fact constitutes a gene. It is generally agreed that a gene is a genetic sequence that leads to the formation of a functional product, which might be RNA or protein. There are different methods that are employed to confirm that a gene is a functional entity. One approach is to confirm gene expression at the RNA and protein level through biochemical techniques. Another method would be to disrupt a specific genetic sequence and to observe if any changes occur in the phenotype. This could, however, be problematic when analysing entire genomes. Evolutionary approaches look at the presence of specific genetic ‘signatures’ that provide evidence of selection in an attempt to confirm the presence of a gene. On the other hand, despite the challenges associated with the identification of genes, there is now plenty of evidence that the phenomenon of de novo gene formation has occurred throughout evolutionary history.

An evolutionarily conserved sequence forms part of diverse genes and appears to serve as a nucleation site for the development of new genes during evolution.

DISCOVERY OF AN EVOLUTIONARILY CONSERVED NUCLEATION SITE

There are five human FAM247 lincRNA family genes that were recently formed in humans and this gene family appears to have originated by gene sequence duplication of the FAM247 sequence, a sequence that is 11,231 bp in length. This conforms with the established model for gene family origins via gene duplications. However, other genes, pseudogenes and protein genes were found to contain segments of the FAM247 lincRNA family sequence (see figure 1). The pseudogenes are of particular interest as they are unique and contain extraneous chromosomal sequences unrelated to FAM247 or the parent gene of origin; thus, they significantly differ from true pseudogenes.

Pseudogenes are defined as genes that have copies of sequences of a protein gene but cannot form a protein product because of key mutations within the sequence. A model of gene formation involving these so-called pseudogenes has been presented by Dr Delihas whereby the FAM247 fragment serves as a nucleation element or a foundation site. Other sequence blocks from other parts of the genome are added to the FAM247 sequence to form the mature gene (see figure 2). These FAM247 fragments are in genomic regions that display evolutionarily conserved sequence signatures. The FAM247 sequence appears to carry the information for the attachment of these extraneous sequences and it represents a focal point for the de novo genesis of this and other pseudogenes. These pseudogenes display RNA transcript expression, several of them in a very broad and robust manner in various tissues. Functions of these genes are not known. The determination of gene function is important, however, to assess the degree of relevance of these pseudogenes to cellular molecular processes.

Surprisingly, two ancient protein genes, gamma-glutamyltransferase (GGT5) and ubiquitin specific peptidase 18 (USP18), were also found to contain segments of the FAM247 sequence (see figure 1). Both genes date back from one hundred to several hundred million years in evolutionary time. Thus, the FAM247 sequence has formed a part of diverse genes through much of vertebrate evolution. Unlike the modified pseudogenes that are formed de novo and the five human FAM247 lincRNA family genes formed by gene duplication, the mechanism that led to

function in the regulation of the immune system. USP18 is the ubiquitin specific peptidase gene, a member

There is a very high identity between the FAM247 nucleotide sequence and the human/primate carboxy terminal end

Fig 1. The FAM247 sequence forms various genes over evolutionary time. The pseudogenes, GGT5 and USP18, only contain segments of the FAM247 sequence whereas the FAM247 lincRNA genes contain the full sequence. The USP18 protein carboxyl end sequence is crucial for the regulation of the immune system.
Figure labels:
Putative ancestral gene-forming FAM247 sequence
Human FAM247 long intergenic non-coding RNA gene family
FAM247 forms human pseudogenes such as POM121L9P
FAM247 spans evolutionary time over 350 million years
GGT5: gamma-glutamyltransferase 5 protein gene found in house mouse contains the 5’ half sequence of FAM247
USP18 protein carboxy end sequence from FAM247: QETAYLLVYMKMEC372. A similar sequence is in zebrafish

The formation of a new gene
Attached sequences: putative POM121L-1 (left) and BCR (right) flanking the FAM247 sequence
|-POM121L-1-|-----FAM247-----|--------------------BCR-----------------------|
pseudogene POM121L9P
|--------------------------------------------------------------------------------------------|

Fig 2. The birth of gene POM121L9P can be visualised by the addition of sequences from parts of other genes termed putative POM121L-1 and BCR to the FAM247 sequence. The FAM247 sequence is situated at a human chromosomal site that displays sequence synteny with the comparable chromosomal site found in the chimpanzee and represents a nucleation site for gene formation. POM121L1 is a POM121-like protein 1 gene. POM121 is a membrane component of the vertebrate nuclear pore complex. The POM121L1 sequence is attached on the left (5’) side of the FAM247 sequence. A section of the BCR gene, termed BCR activator of RhoGEF and GTPase, is on the right (3’) side of FAM247.

Image caption: Marble statues of Phrasikleia Kore (left, c. 550–540 BC) and Anavysos (Kroisos) Kouros (right, c. 530 BC).

Image caption: The FAM247 sequence, or part of it, was present in the zebrafish USP18 gene. This may imply that certain functions of the ubiquitin specific peptidase USP18 originated in vertebrates several hundred million years ago.

Long non-coding RNA genes formed from FAM247 nucleation sequences in humans.

Photo credits: ustas7777777/Shutterstock.com; Kazakov Maksim/Shutterstock.com; Lefteris Papaulakis/Shutterstock.com; Wiki Commons

www.researchoutreach.org

Behind the Research
Dr Nicholas Delihas
E: [email protected] T: +1 631 286 9427 W: https://orcid.org/0000-0002-1704-2587

Research Objectives
Prof Delihas studies long non-coding RNA genes and their evolutionary development.

Detail
Nicholas Delihas
Department of Microbiology and Immunology

References
Delihas, N. (2020). Genesis of Non-Coding RNA Genes in Human Chromosome 22 – A Sequence Connection with Protein Genes Separated by Evolutionary Time. Non-Coding RNA, 6(3), 36. Available at: https://doi.org/10.3390/ncrna6030036
Delihas,