org.Hs.eg.db August 24, 2021
Recommended publications
"1" : Interaction Predicted; "0" : Interaction Not Predicted. Mirna ID
# "1": interaction predicted; "0": interaction not predicted.
miRNA ID | miRNA Acc | Refseq | Symbol | Description | diana | microinspector | miranda
hsa-miR-137 | MIMAT0000429 | NM_004440 | EPHA7 | EPH receptor A7 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_007249 | KLF12 | Kruppel-like factor 12 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_001316 | CSE1L | CSE1 chromosome segregation 1-like (yeast) | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_001152 | SLC25A5 | solute carrier family 25 (mitochondrial carrier; adenine nucleotide translocator), member 5 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_001105 | ACVR1 | activin A receptor, type I | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_013261 | PPARGC1A | peroxisome proliferator-activated receptor gamma, coactivator 1 alpha | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_013309 | SLC30A4 | solute carrier family 30 (zinc transporter), member 4 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_032017 | STK40 | serine/threonine kinase 40 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_013437 | LRP12 | low density lipoprotein-related protein 12 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_013449 | BAZ2A | bromodomain adjacent to zinc finger domain, 2A | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_031372 | HNRPDL | heterogeneous nuclear ribonucleoprotein D-like | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_001046 | SLC12A2 | solute carrier family 12 (sodium/potassium/chloride transporters), member 2 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_181712 | KANK4 | KN motif and ankyrin repeat domains 4 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_014445 | SERP1 | stress-associated endoplasmic reticulum protein 1 | 0 | 0 | 1
hsa-miR-137 | MIMAT0000429 | NM_014682 | ST18 | suppression of tumorigenicity 18 (breast carcinoma) (zinc finger protein) | 0 | 0 | 1
hsa-miR-137
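The 0/1 flags in the excerpt above can be tallied per target to see how many prediction tools agree. The following is a minimal base-R sketch, assuming the three flag columns are read as integers; the data frame name `targets` and the column `n_tools` are hypothetical, and only three rows from the excerpt are copied in.

```r
# Three rows copied from the excerpt; the object and column names are
# illustrative assumptions, not part of the original file.
targets <- data.frame(
  symbol         = c("EPHA7", "KLF12", "ACVR1"),
  diana          = c(0L, 0L, 0L),
  microinspector = c(0L, 0L, 0L),
  miranda        = c(1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Count how many of the three tools predict an interaction for each target.
tool_cols <- c("diana", "microinspector", "miranda")
targets$n_tools <- rowSums(targets[, tool_cols])

# Targets predicted by at least one tool.
predicted_by_any <- targets$symbol[targets$n_tools > 0]

print(targets)
print(predicted_by_any)
```

In these rows only the miranda column is set, so each target is supported by a single tool.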
Protein Sequence
Inside protein databases: bridging sequences and knowledge

WELCOME! Geneva, 2017
SIB Swiss Institute of Bioinformatics
[email protected] & [email protected] (Swiss-Prot, SIB)
[email protected] (CALIPHO, SIB)

SIB Swiss Institute of Bioinformatics
• 70 groups
• 800 collaborators
• biologists, biochemists, computer scientists, physicists, physicians, chemists, mathematicians, pharmacists, …
Common point: bioinformatics

All the material is available here: http://education.expasy.org/cours/InsideProteinDatabases2017/

Content
• a description of the major protein sequence databases and their sequence annotation pipelines, focusing on UniProtKB/Swiss-Prot
• an introduction to Gene Ontology (GO)
• practical sessions on how to query protein sequence databases, how to perform enrichment analysis on datasets, and how to interpret the results of such analyses

Objectives
• know the differences between the major protein sequence databases
• understand the major sequence annotation pipelines and the GO annotation pipelines
• estimate the protein sequence accuracy and the annotation quality

Schedule
08h30 Protein sequence databases: theory
10h30 COFFEE BREAK
11h00 Controlled vocabularies and standardization resources: theory
12h15 PAUSE
13h30 Protein sequence databases and Gene Ontology: practicals
15h00 COFFEE BREAK
15h30 Analysis tools using ontologies: theory
16h00 Protein sequence databases and Gene Ontology: practicals
17h00 Evaluation / Exam
Telomere-To-Telomere Assembly of a Complete Human X Chromosome
Article
Telomere-to-telomere assembly of a complete human X chromosome
https://doi.org/10.1038/s41586-020-2547-7
Received: 30 July 2019. Accepted: 29 May 2020. Published online: 14 July 2020. Open access.

Karen H. Miga1,24 ✉, Sergey Koren2,24, Arang Rhie2, Mitchell R. Vollger3, Ariel Gershman4, Andrey Bzikadze5, Shelise Brooks6, Edmund Howe7, David Porubsky3, Glennis A. Logsdon3, Valerie A. Schneider8, Tamara Potapova7, Jonathan Wood9, William Chow9, Joel Armstrong1, Jeanne Fredrickson10, Evgenia Pak11, Kristof Tigyi1, Milinn Kremitzki12, Christopher Markovic12, Valerie Maduro13, Amalia Dutra11, Gerard G. Bouffard6, Alexander M. Chang2, Nancy F. Hansen14, Amy B. Wilfert3, Françoise Thibaud-Nissen8, Anthony D. Schmitt15, Jon-Matthew Belton15, Siddarth Selvaraj15, Megan Y. Dennis16, Daniela C. Soto16, Ruta Sahasrabudhe17, Gulhan Kaya16, Josh Quick18, Nicholas J. Loman18, Nadine Holmes19, Matthew Loose19, Urvashi Surti20, Rosa Ana Risques10, Tina A. Graves Lindsay12, Robert Fulton12, Ira Hall12, Benedict Paten1, Kerstin Howe9, Winston Timp4, Alice Young6, James C. Mullikin6, Pavel A. Pevzner21, Jennifer L. Gerton7, Beth A. Sullivan22, Evan E. Eichler3,23 & Adam M. Phillippy2 ✉

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation.
Data File Formats File Format V1.3 Software V1.8.0
Data File Formats
File format v1.3
Software v1.8.0

Copyright © 2010 Complete Genomics Incorporated. All rights reserved. cPAL and DNB are trademarks of Complete Genomics, Inc. in the US and certain other countries. All other trademarks are the property of their respective owners.

Disclaimer of Warranties. COMPLETE GENOMICS, INC. PROVIDES THESE DATA IN GOOD FAITH TO THE RECIPIENT “AS IS.” COMPLETE GENOMICS, INC. MAKES NO REPRESENTATION OR WARRANTY, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION ANY IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE OR USE, OR ANY OTHER STATUTORY WARRANTY. COMPLETE GENOMICS, INC. ASSUMES NO LEGAL LIABILITY OR RESPONSIBILITY FOR ANY PURPOSE FOR WHICH THE DATA ARE USED.

Any permitted redistribution of the data should carry the Disclaimer of Warranties provided above.

Data file formats are expected to evolve over time. Backward compatibility of any new file format is not guaranteed.

Complete Genomics data is for Research Use Only and not for use in the treatment or diagnosis of any human subject.

Information, descriptions and specifications in this publication are subject to change without notice.

Table of Contents
Preface
Conventions
SUPPLEMENTARY DATA Supplementary Table 1. Characteristics of the Organ Donors and Human Islet Preparations Used for RNA-Seq
SUPPLEMENTARY DATA
Supplementary Table 1. Characteristics of the organ donors and human islet preparations used for RNA-seq and independent confirmation and mechanistic studies.

Gender | Age (years) | BMI (kg/m2) | Cause of death | Purity (%)
F | 77 | 23.8 | Trauma | 45
M | 36 | 26.3 | CVD | 51
M | 77 | 25.2 | CVD | 62
F | 46 | 22.5 | CVD | 60
M | 40 | 26.2 | Trauma | 34
M | 59 | 26.7 | NA | 58
M | 51 | 26.2 | Trauma | 54
F | 79 | 29.7 | CH | 21
M | 68 | 27.5 | CH | 42
F | 76 | 25.4 | CH | 30
F | 75 | 29.4 | CVD | 24
F | 73 | 30.0 | CVD | 16
M | 63 | NA | NA | 46
F | 64 | 23.4 | CH | 76
M | 69 | 25.1 | CH | 68
F | 23 | 19.7 | Trauma | 70
M | 47 | 27.7 | CVD | 48
F | 65 | 24.6 | CH | 58
F | 87 | 21.5 | Trauma | 61
F | 72 | 23.9 | CH | 62
M | 69 | 25 | CVD | 85
M | 85 | 25.5 | CH | 39
M | 59 | 27.7 | Trauma | 56
F | 76 | 19.5 | CH | 35
F | 50 | 20.2 | CH | 70
F | 42 | 23 | CVD | 48
M | 52 | 24.5 | CH | 60
F | 79 | 27.5 | CH | 89
M | 56 | 24.7 | Cerebral ischemia | 47
M | 69 | 24.2 | CVD | 57
F | 79 | 28.1 | Trauma | 61
M | 79 | 23.7 | NA | 13
M | 82 | 23 | CH | 61
M | 32 | NA | NA | 75
F | 23 | 22.5 | Cardiac arrest | 46
M | 51 | NA | Trauma | 37

Abbreviations: F: Female; M: Male; BMI: Body mass index; CVD: Cardiovascular disease; CH: Cerebral hemorrhage.
illuminaHumanv1.db
illuminaHumanv1.db August 24, 2021

illuminaHumanv1ACCNUM    Map Manufacturer identifiers to Accession Numbers

Description
illuminaHumanv1ACCNUM is an R object that contains mappings between a manufacturer’s identifiers and manufacturers accessions.

Details
For chip packages such as this, the ACCNUM mapping comes directly from the manufacturer. This is different from other mappings which are mapped onto the probes via an Entrez Gene identifier.
Each manufacturer identifier maps to a vector containing a GenBank accession number.
Mappings were based on data provided by: Entrez Gene ftp://ftp.ncbi.nlm.nih.gov/gene/DATA With a date stamp from the source of: 2015-Mar17

Examples
    x <- illuminaHumanv1ACCNUM
    # Get the probe identifiers that are mapped to an ACCNUM
    mapped_probes <- mappedkeys(x)
    # Convert to a list
    xx <- as.list(x[mapped_probes])
    if(length(xx) > 0) {
      # Get the ACCNUM for the first five probes
      xx[1:5]
      # Get the first one
      xx[[1]]
    }

illuminaHumanv1ALIAS2PROBE    Map between Common Gene Symbol Identifiers and Manufacturer Identifiers

Description
illuminaHumanv1ALIAS is an R object that provides mappings between common gene symbol identifiers and manufacturer identifiers.

Details
Each gene symbol is mapped to a named vector of manufacturer identifiers. The name represents the gene symbol and the vector contains all manufacturer identifiers that are found for that symbol. An NA is reported for any gene symbol that cannot be mapped to any manufacturer identifiers.
This mapping includes ALL gene symbols including those which are already listed in the SYMBOL map. The SYMBOL map is meant to only list official gene symbols, while the ALIAS maps are meant to store all used symbols.
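The alias-to-probe behaviour described above (several symbols pointing at the same probes, and NA for symbols with no probe) can be mimicked with a plain named list. This is a base-R sketch on invented toy data, not the package's own object: the probe identifiers and the symbol FOO1 are hypothetical, and the filtering step only imitates what mappedkeys() reports for a real annotation map.

```r
# Hypothetical toy data: the probe identifiers and the symbol FOO1 are
# invented for illustration and do not come from illuminaHumanv1.db.
alias2probe <- list(
  TP53 = c("probe_001", "probe_002"),
  P53  = c("probe_001", "probe_002"),  # alias stored alongside the official symbol
  FOO1 = NA_character_                 # symbol that maps to no probe
)

# Keep only the symbols that map to at least one probe; this imitates
# (but is not) what mappedkeys() returns for an annotation map.
is_mapped <- vapply(alias2probe, function(p) !all(is.na(p)), logical(1))
mapped_symbols <- names(alias2probe)[is_mapped]

# Look up the probes recorded for one alias.
probes_for_p53 <- alias2probe[["P53"]]

print(mapped_symbols)
print(probes_for_p53)
```

With the real package loaded, the same two steps would go through mappedkeys() and as.list() as in the ACCNUM example.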
Using Galaxy for NGS Analyses Luce Skrabanek
Using Galaxy for NGS Analyses
Luce Skrabanek

Registering for a Galaxy account
Before we begin, first create an account on the main public Galaxy portal. Go to: https://main.g2.bx.psu.edu/ Under the User tab at the top of the page, select the Register link and follow the instructions on that page. This will only take a moment, and will allow all the work that you do to persist between sessions and allow you to name, save, share, and publish Galaxy histories, workflows, datasets and pages.

Disk quotas
As a registered user on the Galaxy main server, you are now entitled to store up to 250GB of data on this public server. The little green bar at the top right of your Galaxy page will always show you how much of your allocated resources you are currently using. If you exceed your disk quota, no further jobs will be run until you have (permanently) deleted some of your datasets.

Note: Many of the examples throughout this document have been taken from published Galaxy pages. Thanks to users james, jeremy and aun1 for making these pages and the associated datasets available.

Importing data into Galaxy
The right hand panel of the Galaxy window is called the Tools panel. Here we can access all the tools that are available in this build of Galaxy. Which tools are available will depend on how the administrator of the Galaxy instance you are using has set it up. The first tools we will look at will be the built-in tools for importing data into Galaxy.
Bioinformatics Algorithms
Bioinformatics Algorithms
Data sources and formats
David Hoksza
http://siret.ms.mff.cuni.cz/hoksza

Sequence databases and data formats

Sequence Databases
• DNA
  • GenBank/RefSeq (NCBI), European Nucleotide Archive (EMBL-EBI), DNA Database of Japan (DDBJ)
• Proteins
  • PIR (USA), SwissProt (EMBL-EBI)
  • UniProt (SwissProt + TrEMBL + PIR)
• Derived Databases
  • Pfam, PROSITE, SILVA
• … and MANY more …

GenBank
• Annotated collection of all publicly available DNA sequences and their protein transcripts including mRNA sequences with coding regions, segments of genomic DNA with a single gene or multiple genes, and ribosomal RNA gene clusters
• Maintained by National Center for Biotechnology Information (NCBI)
• Part of the International Nucleotide Sequence Database Collaboration with the European Nucleotide Archive (ENA) operated by European Bioinformatics Institute (EBI) and the DNA Data Bank of Japan (DDBJ)
• 654,057,069,549 bases from 218,642,238 sequences as of August 2020
• More than 100,000 distinct organisms
• Multiple entries for some loci (sequencing can take place under slightly different conditions in various individuals)

RefSeq
• Reference Sequence (RefSeq) database is a curated collection of DNA, RNA, and protein sequences built by NCBI
• Provides separate and linked records for

GenBank | RefSeq
Not curated | Curated
Author submits | NCBI creates from existing data
Only author can revise | NCBI revises as new data emerge
Multiple records for the same loci | Single record for each molecule of major organisms
Records can contradict each
Mutational Landscape of Aggressive Prostate Tumors in African American Men Karla J
Published OnlineFirst February 26, 2016; DOI: 10.1158/0008-5472.CAN-15-1787
Cancer Research
Molecular and Cellular Pathobiology

Mutational Landscape of Aggressive Prostate Tumors in African American Men
Karla J. Lindquist1, Pamela L. Paris2, Thomas J. Hoffmann1,3, Niall J. Cardin1, Remi Kazma1, Joel A. Mefford1, Jeffrey P. Simko2, Vy Ngo2, Yalei Chen4, Albert M. Levin4, Dhananjay Chitale4, Brian T. Helfand5, William J. Catalona5, Benjamin A. Rybicki4, and John S. Witte1,2,3,6

Abstract
Prostate cancer is the most frequently diagnosed and second most fatal nonskin cancer among men in the United States. African American men are two times more likely to develop and die of prostate cancer compared with men of other ancestries. Previous whole genome or exome tumor-sequencing studies of prostate cancer have primarily focused on men of European ancestry. In this study, we sequenced and characterized somatic mutations in aggressive (Gleason 7, stage T2b) prostate tumors from 24 African American patients. We describe the locations and prevalence of small somatic mutations (up to 50 bases in length), copy number aberrations, and structural rearrangements in the tumor genomes compared with patient-matched normal genomes. We observed several mutation patterns consistent with previous studies, such as large copy number aberrations in chromosome 8 and complex rearrangement chains. However, TMPRSS2-ERG gene fusions and PTEN losses occurred in only 21% and 8% of the African American patients, respectively, far less common than in patients of European ancestry. We also identified mutations that appeared specific to or more common in African American patients, including a novel CDC27-OAT gene fusion occurring in 17% of patients. The genomic aberrations reported in this study warrant further investigation of their biologic significant role in the incidence and clinical outcomes of prostate cancer in African
Extensive Sequence Divergence Between the Reference Genomes of Two Elite Indica Rice Varieties Zhenshan 97 and Minghui 63
PNAS PLUS
Extensive sequence divergence between the reference genomes of two elite indica rice varieties Zhenshan 97 and Minghui 63

Jianwei Zhang (张建伟)a,b,c,1, Ling-Ling Chen (陈玲玲)a,1, Feng Xing (邢锋)a,1, David A. Kudrnab,c, Wen Yao (姚文)a, Dario Copettib,c,d, Ting Mu (穆婷)a, Weiming Li (李伟明)a, Jia-Ming Song (宋佳明)a, Weibo Xie (谢为博)a, Seunghee Leeb,c, Jayson Talagb,c, Lin Shao (邵林)a, Yue An (安玥)a, Chun-Liu Zhang (张春柳)a, Yidan Ouyang (欧阳亦聃)a, Shuai Sun (孙帅)a, Wen-Biao Jiao (焦文标)a, Fang Lv (吕芳)a, Bogu Du (杜博贾)a, Meizhong Luo (罗美中)a, Carlos Ernesto Maldonadob,c, Jose Luis Goicoecheab,c, Lizhong Xiong (熊立仲)a, Changyin Wu (吴昌银)a, Yongzhong Xing (邢永忠)a, Dao-Xiu Zhou (周道绣)a, Sibin Yu (余四斌)a, Yu Zhao (赵毓)a, Gongwei Wang (王功伟)a, Yeisoo Yub,c,2, Yijie Luo (罗艺洁)a, Zhi-Wei Zhou (周智伟)a, Beatriz Elena Padilla Hurtadob,c, Ann Danowitzb, Rod A. Wingb,c,d,3, and Qifa Zhang (张启发)a,3

aNational Key Laboratory of Crop Genetic Improvement, Huazhong Agricultural University, Wuhan 430070, China; bArizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721; cBIO5 Institute, School of Plant Sciences, University of Arizona, Tucson, AZ 85721; and dInternational Rice Research Institute, Genetic Resource Center, Los Baños, Laguna, Philippines

Contributed by Qifa Zhang, July 12, 2016 (sent for review May 12, 2016; reviewed by James J. Giovannoni and Yaoguang Liu)

Asian cultivated rice consists of two subspecies: Oryza sativa subsp. indica and O.

eating quality, and thus has been the most widely cultivated hybrid