Mouse Tom1l1 Conditional Knockout Project (CRISPR/Cas9)

https://www.alphaknockout.com

Objective:
To create a Tom1l1 conditional knockout mouse model (C57BL/6J) by CRISPR/Cas-mediated genome engineering.

Strategy summary:
The Tom1l1 gene (NCBI Reference Sequence: NM_028011; Ensembl: ENSMUSG00000020541) is located on mouse chromosome 11. Sixteen exons are identified, with the ATG start codon in exon 4 and the TAA stop codon in exon 15 (Transcript: ENSMUST00000107868). Exons 9-10 will be selected as the conditional knockout region (cKO region). Deletion of this region should result in loss of function of the mouse Tom1l1 gene.

To engineer the targeting vector, the homology arms and the cKO region will be generated by PCR using BAC clone RP24-71O2 as template. Cas9, gRNA and the targeting vector will be co-injected into fertilized eggs for cKO mouse production. The pups will be genotyped by PCR followed by sequencing analysis.

Note:
Exon 9 starts at about 52.14% of the coding region. Knockout of exons 9-10 will result in a frameshift of the gene.
The size of intron 8 for 5'-loxP site insertion: 3547 bp, and the size of intron 10 for 3'-loxP site insertion: 1263 bp.
The size of the effective cKO region: ~995 bp.
The cKO region does not contain any other known gene.

Overview of the Targeting Strategy
[Figure: schematic, drawn 5' to 3', of the wildtype allele (exons 1 to 16, with gRNA regions on either side of exons 9-10), the targeting vector, the targeted allele, and the constitutive KO allele (after Cre recombination). Legend: exon of mouse Tom1l1; homology arm; cKO region; loxP site.]

Overview of the Dot Plot
[Figure: dot plot of the sequence against itself. Window size: 10 bp. Forward and reverse-complement matches are shown. Axis label: Sequence 12.]
Note: The sequence of the homology arms and cKO region is aligned with itself to determine whether there are tandem repeats. No significant tandem repeat is found in the dot plot matrix, so this region is suitable for PCR screening or sequencing analysis.
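The self-comparison behind a dot plot can be sketched in a few lines of Python. This is only an illustration of the general idea under stated assumptions, not the tool that produced the report: the function names are hypothetical, matching is exact (no mismatches allowed within a window), and the toy sequence is invented. Each reported pair (i, j) is a point that would be drawn off the main diagonal; dense runs of such points parallel to the diagonal would suggest tandem repeats.

```python
from collections import defaultdict

# Hypothetical helper names; exact-match windows only (an assumption,
# the report does not say how its dot plot scores matches).
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def self_dot_plot(seq, window=10):
    """Compare a sequence with itself using exact matches of `window` bp.

    Returns two sorted lists of (i, j) window start positions:
      forward: seq[i:i+window] == seq[j:j+window] with i < j
               (the trivial main diagonal i == j is left out)
      reverse: seq[j:j+window] is the reverse complement of seq[i:i+window]
               (both (i, j) and (j, i) are listed)
    """
    seq = seq.upper()
    starts_by_word = defaultdict(list)
    for i in range(len(seq) - window + 1):
        starts_by_word[seq[i:i + window]].append(i)

    forward, reverse = [], []
    for word, starts in starts_by_word.items():
        # Every pair of positions sharing the same word is a forward dot.
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                forward.append((i, j))
        # Positions holding the reverse complement are reverse dots.
        partners = starts_by_word.get(reverse_complement(word), [])
        for i in starts:
            for j in partners:
                reverse.append((i, j))
    return sorted(forward), sorted(reverse)


# Toy input (invented): the 4 bp word AAAA occurs at positions 0 and 8.
forward_hits, reverse_hits = self_dot_plot("AAAACCCCAAAA", window=4)
print(forward_hits)
print(reverse_hits)
```

For a real 7.5 kb sequence the report's 10 bp window would be used; highly repetitive input can produce a very large number of pairs, so this sketch is meant for inspection rather than production use.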
Overview of the GC Content Distribution
[Figure: GC content distribution along the sequence. Window size: 300 bp. Axis label: Sequence 12.]
Summary: Full Length (7495 bp) | A (28.7%, 2151) | C (20.39%, 1528) | T (30.37%, 2276) | G (20.55%, 1540)
Note: The sequence of the homology arms and cKO region is analyzed to determine the GC content. No significant high GC-content region is found, so this region is suitable for PCR screening or sequencing analysis.

BLAT Search Results (up)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
YourSeq   3000      1  3000   3000    100.0%  chr11  -        90658459   90661458  3000
YourSeq    175   2249  2678   3000     87.1%  chr16  +        95647252   95647494   243
YourSeq    165   2499  2678   3000     97.2%  chr14  -        56662553   56662732   180
YourSeq    165   2294  2678   3000     92.3%  chr12  -        90514880   90515258   379
YourSeq    161   2497  2690   3000     92.1%  chr14  -        47450978   47451164   187
YourSeq    159   2498  2678   3000     94.2%  chr19  -        23098224   23098401   178
YourSeq    159   2499  2680   3000     96.0%  chr16  +        64793018   64793202   185
YourSeq    158   2517  2709   3000     89.8%  chr15  -        26740057   26740231   175
YourSeq    157   2509  2678   3000     95.9%  chr18  -        77628962   77629130   169
YourSeq    157   2520  2712   3000     90.3%  chr7   +        45658871   45659051   181
YourSeq    157   2517  2680   3000     98.2%  chr19  +        34823131   34823348   218
YourSeq    156   2517  2678   3000     98.2%  chr15  +        59660697   59660858   162
YourSeq    155   2482  2667   3000     96.5%  chr5   +       147255649  147255840   192
YourSeq    154   2506  2706   3000     87.5%  chr2   +       168215534  168215710   177
YourSeq    153   2520  2678   3000     96.9%  chr17  -        64373778   64373935   158
YourSeq    153   2516  2678   3000     97.0%  chr9   +        21486414   21486576   163
YourSeq    153   2517  2705   3000     90.1%  chr3   +        58580283   58580455   173
YourSeq    153   2517  2678   3000     98.2%  chr11  +        21074631   21074792   162
YourSeq    152   2517  2678   3000     97.0%  chr1   +        13075727   13075888   162
YourSeq    151   2499  2678   3000     89.6%  chr4   -       149776525  149776696   172

Note: The 3000 bp section upstream of Exon 9 is BLAT searched against the genome. No significant similarity is found.

BLAT Search Results (down)

QUERY    SCORE  START   END  QSIZE  IDENTITY  CHROM  STRAND      START        END  SPAN
YourSeq   3000      1  3000   3000    100.0%  chr11  -        90654464   90657463  3000
YourSeq    141   1238  1731   3000     90.3%  chr16  +        30635919   30636422   504
YourSeq    138   1328  1807   3000     80.3%  chr1   +        61127721   61128206   486
YourSeq    137   1239  1683   3000     85.2%  chr2   -       127549701  127550143   443
YourSeq    137   1242  1731   3000     85.2%  chr18  -        66481548   66482031   484
YourSeq    136   1238  1731   3000     75.8%  chr2   +       129861509  129861964   456
YourSeq    134   1245  1683   3000     76.7%  chr7   -       132450828  132451240   413
YourSeq    134   1238  1683   3000     78.0%  chr2   +        20976322   20976732   411
YourSeq    126   1243  1683   3000     75.0%  chr2   -        25590537   25590848   312
YourSeq    125   1249  1701   3000     83.6%  chr4   +        81051152   81051594   443
YourSeq    123   1238  1680   3000     81.1%  chr16  -        91205517   91205916   400
YourSeq    123   1258  1729   3000     80.3%  chr1   -       151147789  151148239   451
YourSeq    121   1179  1683   3000     79.1%  chrX   -       153619775  153620260   486
YourSeq    120   1162  1683   3000     83.9%  chr15  -        12685258   12685848   591
YourSeq    120   1329  1774   3000     91.1%  chr5   +       137991364  137991986   623
YourSeq    119   1235  1683   3000     81.1%  chr6   -        49138166   49138592   427
YourSeq    118   1235  1679   3000     80.8%  chr11  +        19893418   19893782   365
YourSeq    115   1237  1674   3000     78.5%  chr9   +       120401193  120401552   360
YourSeq    114   1238  1654   3000     88.2%  chr12  +        39860634   39861050   417
YourSeq    114   1190  1683   3000     81.5%  chr12  +        17399553   17399982   430

Note: The 3000 bp section downstream of Exon 10 is BLAT searched against the genome. No significant similarity is found.

Gene and protein information:
Tom1l1 target of myb1-like 1 (chicken) [ Mus musculus (house mouse) ]
Gene ID: 71943, updated on 24-Oct-2019

Gene summary
Official Symbol: Tom1l1 (provided by MGI)
Official Full Name: target of myb1-like 1 (chicken) (provided by MGI)
Primary source: MGI:MGI:1919193
See related: Ensembl:ENSMUSG00000020541
Gene type: protein coding
RefSeq status: VALIDATED
Organism: Mus musculus
Lineage: Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Myomorpha; Muroidea; Muridae; Murinae; Mus; Mus
Also known as: C80573; Srcasm; 2310045L10Rik
Expression: Ubiquitous expression in liver E14 (RPKM 14.0), liver E14.5 (RPKM 12.2) and 28 other tissues
Orthologs: human

Genomic context
Location: 11 D; 11 55.47 cM
Exon count: 16

Annotation release  Status             Assembly                      Chr  Location
108                 current            GRCm38.p6 (GCF_000001635.26)  11   NC_000077.6 (90645690..90688279, complement)
Build 37.2          previous assembly  MGSCv37 (GCF_000001635.18)    11   NC_000077.5 (90507005..90548915, complement)

[Figure: genomic context, Chromosome 11 - NC_000077.6]

Transcript information:
Gene: Tom1l1 ENSMUSG00000020541
Description: target of myb1-like 1 (chicken) [Source:MGI Symbol;Acc:MGI:1919193]
Gene Synonyms: 2310045L10Rik, Srcasm
Location: Chromosome 11: 90,643,465-90,688,366 reverse strand. GRCm38:CM001004.2
About this gene: This gene has 8 transcripts (splice variants), 158 orthologues, 10 paralogues and is a member of 1 Ensembl protein family.
Transcripts

Name        Transcript ID         bp    Protein     Translation ID        Biotype          CCDS       UniProt  Flags
Tom1l1-203  ENSMUST00000107868.7  2093  397aa       ENSMUSP00000103500.1  Protein coding   CCDS36280  Q8BZR6   TSL:1, GENCODE basic
Tom1l1-201  ENSMUST00000020849.8  4308  474aa       ENSMUSP00000020849.2  Protein coding   -          Q923U0   TSL:1, GENCODE basic, APPRIS P1
Tom1l1-202  ENSMUST00000107867.7  2439  227aa       ENSMUSP00000103499.1  Protein coding   -          Q923U0   TSL:1, GENCODE basic
Tom1l1-204  ENSMUST00000107869.8  1826  398aa       ENSMUSP00000103501.2  Protein coding   -          Q923U0   TSL:1, GENCODE basic
Tom1l1-208  ENSMUST00000154599.1  592   191aa       ENSMUSP00000123329.1  Protein coding   -          A6PWP3   CDS 3' incomplete, TSL:3
Tom1l1-207  ENSMUST00000147329.1  2694  No protein  -                     Retained intron  -          -        TSL:2
Tom1l1-205  ENSMUST00000127034.1  728   No protein  -                     Retained intron  -          -        TSL:5
Tom1l1-206  ENSMUST00000131055.1  749   No protein  -                     lncRNA           -          -        TSL:3

[Figure: Ensembl genomic region view, 64.90 kb, chromosome 11 from 90.64 Mb to 90.69 Mb. Forward strand: Cox11-201 and Cox11-202 (protein coding); contigs AL646046.8 and AL672199.16. Reverse strand: Tom1l1-201, -202, -203, -204 and -208 (protein coding), Tom1l1-205 and -207 (retained intron), Tom1l1-206 (lncRNA); Stxbp4-201, -202, -203 and -207 (protein coding), Stxbp4-205 (lncRNA). Regulatory Build track. Regulation legend: CTCF; enhancer; open chromatin; promoter; promoter flank. Gene legend: protein coding (merged Ensembl/Havana; Ensembl protein coding); non-protein coding (RNA gene; processed transcript).]

Transcript: ENSMUST00000107868 (Tom1l1-203, protein coding), reverse strand, 41.91 kb

[Figure: protein domain and variant view for the Tom1l1-203 translation, scale 0 to 397 amino acids. Tracks and annotations:
MobiDB lite
Low complexity (Seg)
Superfamily: ENTH/VHS; SSF89009
Pfam: VHS domain; GAT domain
PROSITE profiles: VHS domain; GAT domain
PIRSF: Target of Myb protein 1
PANTHER: TOM1-like protein 1; PTHR13856
Gene3D: ENTH/VHS; GAT domain superfamily
CDD: cd16997; cd14237
Sequence variants (dbSNP and all other sources). Variant legend: inframe deletion; missense variant; synonymous variant.]

We wish to acknowledge the following valuable scientific information resources: Ensembl, MGI, NCBI, UCSC.
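Two of the figures quoted earlier in this report can be cross-checked with simple arithmetic: the base counts in the GC content summary, and the chr11 coordinates of the two 3000 bp BLAT queries. The sketch below copies those numbers from the report; the variable and function names are hypothetical. It assumes that the two BLAT queries directly abut the cKO region, which the report does not state explicitly, although the resulting gap matches the quoted effective cKO size. The sliding-window function is a generic illustration of a 300 bp GC scan, not the vendor's tool.

```python
# Base counts and length copied from the GC content summary.
base_counts = {"A": 2151, "C": 1528, "T": 2276, "G": 1540}
reported_length = 7495

total_bases = sum(base_counts.values())
gc_percent = 100 * (base_counts["G"] + base_counts["C"]) / total_bases

# chr11 coordinates (minus strand) of the 100% self-matches in the BLAT tables.
upstream_query = (90658459, 90661458)    # 3000 bp upstream of exon 9
downstream_query = (90654464, 90657463)  # 3000 bp downstream of exon 10

# Assumption: the queries flank the cKO region with no extra bases between,
# so the gap between them is the cKO region itself.
gap_between_queries = upstream_query[0] - downstream_query[1] - 1


def gc_windows(seq, window=300, step=1):
    """GC fraction of each full window of `window` bp, moved by `step` bp."""
    seq = seq.upper()
    return [
        (seq[i:i + window].count("G") + seq[i:i + window].count("C")) / window
        for i in range(0, len(seq) - window + 1, step)
    ]


print(total_bases == reported_length)  # counts add up to the full length
print(round(gc_percent, 2))            # overall GC content in percent
print(gap_between_queries)             # size of the gap in bp
```

With the reported values, the four base counts sum to 7495, the overall GC content comes to about 40.9%, and the gap between the two BLAT queries is 995 bp, in line with the effective cKO region size given in the strategy notes.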