High Concentrations of Long Interspersed Nuclear Element Sequence Distinguish Monoallelically Expressed Genes

Elena Allen*, Steve Horvath*†, Frances Tong*, Peter Kraft†, Elizabeth Spiteri‡, Arthur D. Riggs‡, and York Marahrens*§

Departments of *Human Genetics and †Biostatistics, University of California, Los Angeles, CA 90095; and ‡Division of Biology, Beckman Research Institute of the City of Hope, 1450 East Duarte Road, Duarte, CA 91010

Edited by Stanley M. Gartler, University of Washington, Seattle, WA, and approved June 26, 2003 (received for review December 5, 2002)

9940–9945 | PNAS | August 19, 2003 | vol. 100 | no. 17 | www.pnas.org/cgi/doi/10.1073/pnas.1737401100

This paper was submitted directly (Track II) to the PNAS office.
Abbreviations: LINEs, long interspersed nuclear elements; SINEs, short interspersed nuclear elements; RefSeq, Reference Sequence.
§To whom correspondence should be addressed. E-mail: [email protected].

Genes subject to monoallelic expression are expressed from only one of the two alleles either selected at random (random monoallelic genes) or in a parent-of-origin-specific manner (imprinted genes). Because high densities of long interspersed nuclear element (LINE)-1 transposon sequence have been implicated in X-inactivation, we asked whether monoallelically expressed autosomal genes are also flanked by high densities of LINE-1 sequence. A statistical analysis of repeat content in the regions surrounding monoallelically and biallelically expressed genes revealed that random monoallelic genes were flanked by significantly higher densities of LINE-1 sequence, evolutionarily more recent and less truncated LINE-1 elements, fewer CpG islands, and fewer base pairs of short interspersed nuclear element (SINE) sequence than biallelically expressed genes. Random monoallelic and imprinted genes were pooled and subjected to a clustering analysis algorithm, which found two clusters on the basis of the aforementioned sequence characteristics. Interestingly, these clusters did not follow the random monoallelic vs. imprinted classifications. We infer that chromosomal sequence context plays a role in monoallelic gene expression and may involve the recognition of long repeats or other features. The sequence characteristics that distinguished the high-LINE-1 category were used to identify more than 1,000 additional genes from the human and mouse genomes as candidate genes for monoallelic expression.

Genes expressed from only one allele (monoallelic genes) are either selected at random (random monoallelic genes) or in a parent-of-origin-specific manner (imprinted genes). Both types of monoallelic genes reside in chromosomal regions defined by allelic differences in chromatin properties, including replication timing (Table 1). These regions vary in scale from single genes (e.g., IL-2/Il-2), to small gene clusters [e.g., PWS/AS gene region (1)], to the ≈3,000 genes that are silenced during X-inactivation (2).

On the basis of the inability of X-inactivation to spread into many autosomal regions in X/autosome translocations, it was proposed that way stations located throughout the X chromosome are required for gene silencing to spread (3, 4). The discovery that inactivation correlates with high densities of long interspersed nuclear elements (LINEs) in mice led to the hypothesis that LINE elements are way stations (5). An analysis of human genome sequence supported this hypothesis (6).

LINEs are mostly truncated descendants of 6- to 7-kb (7) transposons enriched in chromosomal G bands (8) and in L1 and L2 isochores (genomic DNA fractions in Cs2SO4 density gradients) (9). LINE elements make up 14.5% and 17.6% of mouse and human autosomes and 28.9% and 31.0% of mouse and human X chromosomal sequence (E.A. and Y.M., unpublished data). Other abundant repeats throughout the genomes are the <350-bp short interspersed nuclear elements (SINEs), enriched in chromosomal R bands (8) and H3 and H4 isochores (9), and the retrovirus-derived long terminal repeats (LTRs). A host-defense mechanism (10, 11) is thought to suppress the multiplication of intact elements by the formation of heterochromatin over their sequence (12–15).

The high density of LINE-1 elements around X-inactivated genes raises the possibility that monoallelically expressed autosomal genes are also flanked by high densities of repetitive sequence. Here we show that monoallelic autosomal genes display significantly higher densities of LINE-1 sequence, less truncated LINE-1 elements, and fewer CpG islands and SINE elements in their flanking regions. We find evidence that imprinted and random monoallelic genes each separate into two statistically significant distinct groups. We identify >1,000 additional genes that share the characteristics of one of these groups but are not currently known to exhibit allelic differences in gene expression or chromatin structure.

Methods

Data. We identified 33 random monoallelically expressed (20 mouse, 13 human) genes; 39 imprinted (15 mouse, 24 human) genes; and 28 (15 mouse, 13 human) genes that we designated as being biallelically expressed (Table 1). Genes were assigned to the biallelically expressed category on the basis of gene expression data and/or evidence of synchronous replication timing. All monoallelic genes that have been assayed for replication timing are asynchronously replicating (Table 1 and ref. 16). We consequently assumed that synchronously replicated genes are biallelically expressed. However, we consider it plausible that biallelic genes that are replicated asynchronously exist. The presence of biallelically expressed genes within the aforementioned monoallelic gene set would bias the results of the current study toward the null hypothesis of no difference between the monoallelic and biallelic groups.

The majority of known autosomal monoallelic genes belong to gene families. A single representative from each of these gene families was selected (Table 1). In addition, 150 genes (75 mouse, 75 human) were randomly selected from the National Center for Biotechnology Information Reference Sequence (RefSeq) genes. The location of each gene and the sequence characteristics surrounding it were determined by using the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.cse.ucsc.edu), February 2002 and June 2002 builds for mouse and human, respectively. Repeat information was determined by using the REPEATMASKER (A. F. A. Smit and P. Green, http://ftp.genome.washington.edu/RM/RepeatMasker.html) track of the UCSC Genome Browser. See Supporting Methods, which is published as supporting information on the PNAS web site, www.pnas.org, for a more detailed description of the procedures used to obtain data.

Table 1. Mouse and human genes that are known to exhibit allelic differences or allelic equivalence

Species/gene          Chromosome  Allelic expression  Rep. timing  Reference

Random monoallelic
  Interleukins
    m
      Il-2            3           M                   A            39
      Il-1a*          2           M                   –            42
      Il-4            11          M                   –            44
    h
      IL-2            4           M                   –            46
  X chromosome
    m
      Xist†           X           M                   –            49
    h
      XIST            X           M                   A            51, 52
  Immunoglobulin loci
    m
      Igκ             6           M                   A            55
      IgL‡            16          M                   A            55
      IgH‡            12          M                   A            55
      TCRB            6           M                   A            55
    h
      Igκ             2           M                   –            61
      IgL‡            22          M                   –            61
      IgH‡            14          M                   –            61
      TCRA*           7           M                   –            61
      TCRB‡           14          M                   –            61
      TCRD‡           14          M                   –            61
      TCRG‡           7           M                   –            71
  Olfactory receptors
    m
      Or10            14          M                   A            60
      Olfr41‡         7           M                   A            60
      Olfr72‡         9           M                   A            60
      Ors25‡          7           M                   A            60
      Ora16‡          2           M                   A            60
      OrM71‡          9           M                   A            60
    h
      OR1E1           17          M                   –            61
      OR12D2‡         2           M                   –            61
      OR2S2‡          9           M                   –            61
      OR51B2‡         11          M                   –            61
  NK receptors
    m
      Nkg2d*          6           M                   –            87
      Ly49A*‡         6           M                   –            87
      Ly49G2*‡        Un          M                   –            87
  Genes near t complex
    m
      Nubp2           17          M                   –            92
      Igfals‡         17          M                   –            92
      Jsap‡           17          M                   –            92

Imprinted
    m
      Tssc3           7           M                   –            37
      U2af1-rs1       11          M                   –            40, 41
      Peg3            7           M                   –            43
      Mas1            17          M                   –            45
      Gnas            2           M                   –            47
      Rasgrf1         9           M                   –            48
      Ins1            19          M                   –            50
      H19             7           M                   A            41, 53
      Snrpn           7           M                   A            41, 54
      Igf2            7           M                   A            56
      p57(KIP2)       7           M                   A            57, 58
      Zfp127          7           M                   –            59
      Igf2-R          17          M                   –            40, 41
      Ndn             7           M                   –            62
      Mash2           7           M                   –            63
    h
      H19             11          M                   –            64
      ARHI            1           M                   –            66
      MAGEL2          15          M                   –            68
      PEG10           7           M                   –            69
      PLAGL1          6           M                   –            72
      ZIM2            19          M                   –            73
      ZNF264          19          M                   –            74
      TP73            1           M                   –            75
      CDKN1C          11          M                   –            76
      HYMAI           6           M                   –            78
      NDN             15          M                   –            79
      NNAT            20          M                   –            81
      GRB10           7           M                   –            82
      ZIM3            19          M                   –            74
      GNAS1           20          M                   –            84
      DLK1            14          M                   –            85
      MEST            7           M                   –            86
      ZNF215*         11          M                   –            88
      UBE3A           15          M                   –            89
      ATP10C          15          M                   –            90
      IGF2            11          M                   –            41, 91
      p57(KIP2)       11          M                   –            76
      WT1*            11          M                   –            93
      SNRPN           15          M                   –            94

Biallelic
    m
      Cebpb           2           B                   –            38
      Edn3            2           B                   –            38
      Chrna4          2           B                   –            38
      E2f1            2           B                   –            38
      Ntsr            2           B                   –            38
      Kcnb1           2           B                   –            38
      Mc3r            2           B                   –            38
      C-mpl           4           –                   S            39
      Tmp             6           –                   S            55
      Cd2             3           –                   S            55
      Rras            7           –                   S            41
      Alb             5           –                   S            41
      Myc             15          –                   S            60
      Pfkl            10          –                   S            41
      Trp53           11          –                   S            41
    h
      ACTB            7           B                   –            65
      TSGA14          7           B                   –            67
      NAP1L4          11          B                   –            64
      IGFBP1          7           B                   –            70
      ACHE            7           –                   S            41
      APOB            2           –                   S            41
      CD3D            11          –                   S            41
      MYC             8           –                   S            41
      TP53            17          –                   S            77
      PYGM            11          –                   S            41
      ERBB2           17          –                   S            80
      RB1             13          –                   S            80
      RUNX            21          –                   S            83

m, mouse; h, human; Rep., replication.
*Monoallelically expressed in a subset of cells.
†Paternally imprinted in some tissues.
‡Excluded from the analyses that involve individual representatives from each gene family (see supporting information).

Sequence Analysis. A set of 68 covariates that described each gene and the sequence in the region surrounding it, defined as 100 kb to the 3′ and 5′ sides of the gene, was written. Definitions of covariates are in Supporting Methods.

Statistical Methods. Univariate differences in covariates were tested across categorical groupings by using the Kruskal–Wallis test (17). Distributions of covariates across categorical groupings were visualized with boxplots. To facilitate unsupervised learning, an intrinsic dissimilarity measure was constructed with a random-forest analysis of the covariates describing the 200-kb regions flanking the genes (18–20).
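As an illustration of two steps described in the Methods (measuring repeat content in the 100-kb flanks of a gene, and comparing one covariate across gene categories with the Kruskal–Wallis test), the following Python sketch may be helpful. It is not the authors' code. The function names, the half-open coordinate convention, and all numbers in it are assumptions made for illustration; the actual covariate definitions are in Supporting Methods. The P value uses the usual chi-square approximation, which for three groups (2 degrees of freedom) reduces to exp(−H/2).

```python
from itertools import chain
from math import exp

FLANK = 100_000  # 100 kb on each side of the gene, as stated in the Methods


def flank_density(gene_start, gene_end, repeats, flank=FLANK):
    """Fraction of the two flanking windows covered by repeat intervals.

    `repeats` holds (start, end) half-open intervals on the gene's chromosome
    (a hypothetical input format). Overlapping annotations are merged so that
    each base is counted once.
    """
    windows = [(max(0, gene_start - flank), gene_start),
               (gene_end, gene_end + flank)]
    covered = 0
    for w_start, w_end in windows:
        # Clip every overlapping repeat to the window, then merge in order.
        clipped = sorted((max(s, w_start), min(e, w_end))
                         for s, e in repeats if s < w_end and e > w_start)
        cur_start = cur_end = None
        for s, e in clipped:
            if cur_end is None or s > cur_end:
                if cur_end is not None:
                    covered += cur_end - cur_start
                cur_start, cur_end = s, e
            else:
                cur_end = max(cur_end, e)
        if cur_end is not None:
            covered += cur_end - cur_start
    return covered / sum(e - s for s, e in windows)


def kruskal_wallis_h(groups):
    """Tie-corrected Kruskal-Wallis H statistic for a list of samples."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    if n < 2:
        raise ValueError("need at least two observations")
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    return h / correction


def p_value_three_groups(h):
    """Chi-square tail probability for df = 2 (valid for three groups only)."""
    return exp(-h / 2)


# Made-up LINE-1 densities for three gene categories (not data from the paper).
densities = {
    "random monoallelic": [0.31, 0.28, 0.35, 0.40],
    "imprinted": [0.22, 0.30, 0.18, 0.26],
    "biallelic": [0.10, 0.15, 0.12, 0.20],
}
h = kruskal_wallis_h(list(densities.values()))
print(f"H = {h:.3f}, approximate P = {p_value_three_groups(h):.4f}")
```

With samples as small as these toy groups the chi-square approximation is rough, so an exact or permutation-based P value would be preferable in practice.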