Browsing Genes and Genomes with Ensembl

Total Page:16

File Type:pdf, Size:1020Kb

Browsing Genes and Genomes with Ensembl Browsing Genes and Genomes with Ensembl Ensembl Workshop University of Cambridge 3rd September 2013 www.ensembl.org Denise Carvalho-Silva Ensembl Outreach Notes: This workshop is based on Ensembl release 72 (June 2013). 1) Presentation slides The pdf file of the talks presented in this workshop is available in the link below: http://www.ebi.ac.uk/~denise/naturimmun 2) Course booklet A diGital Copy of this course booklet can be found below: http://www.ebi.ac.uk/~denise/coursebooklet.pdf 3) Answers The answers to the exercises in this course booklet can be found below: http://www.ebi.ac.uk/~denise/answers 2 TABLE OF CONTENTS OVERVIEW ...................................................................................................... 4 INTRODUCTION TO ENSEMBL .................................................................. 5 BROWSER WALKTHROUGH .................................................................... 14 EXERCISES .................................................................................................... 38 BROWSER ..................................................................................................... 38 BIOMART ...................................................................................................... 41 VARIATION ................................................................................................... 48 COMPARATIVE GENOMICS ...................................................................... 50 REGULATION ............................................................................................... 52 QUICK GUIDE TO DATABASES AND PROJECTS ................................. 55 CHECK YOUR LEARNING! 10 ADVANCED EXERCISES ..................... 57 3 OVERVIEW Getting started with Ensembl at www.ensembl.orG Ensembl provides Genes and other annotation such as regulatory reGions, Conserved base pairs aCross speCies, and sequenCe variations. The Ensembl gene set is based on protein and mRNA evidence in UniProtKB and NCBI RefSeq databases, along with manual annotation from the VEGA/Havana group. All the data are freely available and Can be aCCessed via the web browser at www.ensembl.orG. Perl proGrammers Can direCtly aCCess Ensembl databases through an Application Programming Interfaces (Perl APIs). Gene sequenCes Can be downloaded from the Ensembl browser itself, or through the use of the BioMart web interfaCe, whiCh can extract information from the Ensembl databases without the need for proGramminG knowledGe! A sister browser at www.ensemblGenomes.orG is set up to aCCess non-chordates, namely bacteria, plants, fungi, metazoa, and protists). #$%&'("=*J$+ !"#$%&'(")*&"+$,-$.'$+"/01("234#562345 !"7$1&0$8$"9%1%"/01("20*:%&1 !"5&<"*-&"G%&0%.1"EH$'1"I&$90'1*& !"4''$++"1($"E.'*9$"9%1%"0."E.+$C=? !"5&<"*-&"G%&0%.1"EH$'1"I&$90'1*& !" 2&*/+$" >?%.1+@" =%'1$&0%@" )-.A0@" >&*B+1+" !" ;(**+$" <*-&" )%8*-&01$" %.9"C$1%D*%"%1"E.+$C=?"F$.*C$+ 8$&1$=&%1$"+>$'0$+ Points covered: • Why do we need Genome browsers? • An introduCtion to the Ensembl browser Check our video tutorial! • How data can be accessed in Ensembl ? http://www.youtube.com/user/ • An overview of Ensembl tools EnsemblHelpdesk The Ensembl Genome browser Introduction to BioMart 4 INTRODUCTION TO ENSEMBL Ensembl is a joint projeCt between the EBI (European BioinformatiCs Institute) and the WellCome Trust SanGer Institute that annotates chordate genomes (i.e. vertebrates and Closely related invertebrates with a notoChord, such as sea squirt). Gene sets from model organisms (e.G. yeast, fruitfly and worm are also imported for comparative analysis by the Ensembl Comparative GenomiCs team. Most annotation is updated every two months, leadinG to inCreasinG Ensembl versions (such as version 73), however the Gene sets are determined less frequently. !"#$%$&$%' !$+&'#3'4,&'56'+0' (#$)'*+,-.*/+%'+)& !"#$%&)#-+&0#*1)& 2'+$%' Figure 1: The ReGion in detail view: Protein alignments and other traCks. Click on the cog wheel icon to add more data track to Ensembl views. Alternatively, you may want to click on the Configure this page button instead at the left hand side. The vast amount of information associated with genomiC sequences demands a way to organise and access that information. This is where Genome browsers Come in. Ensembl strives to display many layers of Genome annotation into a simplified view for the ease of the user. Figure 1 above shows the ReGion in detail page for the BRCA2 gene in human. The example shows blocks of Conserved sequence refleCting Conservation sCores of sequence identity on a base pair level aCross 36 species. Conserved regions are displayed as dark bloCks that represent loCal reGions of aliGnment. 5 Also in this fiGure proteins alignments from the UniProtKB have been added to the Genomic loCation of the human BRCA2 gene. Filled yellow blocks show where these UniProtKB proteins align to the genome, and gaps in the alignment are shown as empty yellow bloCks. Note, in this Case, the UniProtKB proteins support most of the exons shown in the Ensembl BRCA2-001 and BRCA2-201 transcripts. Both Ensembl and Vega (Havana) transcripts are portrayed as exons (boxes) and introns (connecting lines). Filled boxes show codinG sequence and empty boxes refleCt untranslated regions (UTRs). This ReGion in detail view is useful for comparing Ensembl gene models with Current proteins, mRNAs and ESTs from other databases, suCh as NCBI RefSeq, ENA, UniGene and UniProtKB. Everything in this view is aliGned to the Genome. Figure 2: The ReGion in detail view: 1000 Genomes traCks The ReGion in detail view can be configured (using the Configure this page button) to show reGulatory features, sequence variation, and more! For example, CliCk on dbSNP under the Variation menu and turn on the sequence variants (dbSNP and all other sources) at the riGht side of this paGe. Save and Close. BaCk to the ReGion in detail view, click on the sequence variation of interest. A pop-up box will show you a few variation properties such as rs number, alleles, type, and others. CliCk on the rs properties link to take you to an information paGe for the Genetic variation, inCludinG links to population frequencies, if available. You Can do the same for reGulatory features as well. An index paGe is provided for eaCh speCies with information about the source of the Genomic sequence assembly, a karyotype (if available), and a link to past sites on our archive. The picture below shows the Ensembl homepaGe for human. Links to the human 6 karyotype, to the previous human assembly and a summary of gene and genome information are found in this index paGe. News SearCh Information Links to and statistiCs example features in Ensembl Ensembl devotes separate pages and views in the browser to display a variety of information types, using a tabbed structure. The three main entry points in Ensembl are the LoCation tab, Gene tab and TransCript tab. We have also the speCies tab, variation tab and reGulation tab. SpeCies tab Gene tab Variation tab LoCation tab TransCript ReGulation tab tab You Can for example Change the speCies of interest in the speCies tab, view a chromosomal region in the Location tab (also known as ReGion in detail), visualise Gene trees in the Gene tab, browse the cDNA sequence alonGside the protein translation in the transcript pages, look for Genotype information in the variation tab and get details by cell line for regulatory features. You Can also perform a similarity search aGainst any species in Ensembl by usinG our BlastView, whiCh offers aCCess to both BLAST and BLAT proGrams. 7 !"#$%&"'()*+'),*-#"'#)'.$%* /0$0%*#$1*&.$%0"-01*"02'.$%* 3.4.5.260%*'$*/0$0*!"00%* 789:!*#$1*789!*%0#"&,0%* Let’s take a look at the Ensembl Genomes homepage at www.ensemblGenomes.orG Links to the taxa- speCifiC sites Link baCk to Ensembl News 8 Click on the different taxa to see their homepages. Each of them is colour-coded: Protists Fungi Metazoa Plants Bacteria You Can naviGate most of the taxa in the same way as you would with Ensembl, but Ensembl BaCteria has a larGe number of Genomes, so needs slightly different methods. Let’s look at it in more detail. 9 SearCh for SearCh for a gene a speCies Information on Ensembl BaCteria There’s no full speCies list for bacteria as it would be hard to navigate with the number of speCies. To find a speCies, start to type the speCies name into the speCies searCh box. A drop down list will appear with possible speCies. For example, to find a substrain of Clostridium difficile type in Clostridium d. The drop down Contains various strains of Clostridium difficile. Let’s choose Clostridium difficile 630. This will take us to another speCies homepage, where we can explore various features. 10 Retrieving Data from Ensembl BioMart is a web-interfaCe that Can extraCt information from the Ensembl databases and present the user with a table of information without the need for proGramminG. It Can be used to output sequences or tables of Genes along with gene positions (Chromosome and base pair locations), single nucleotide polymorphisms (SNPs), homologues, and other annotation in HTML, text, or Microsoft Excel format. BioMart Can also translate one type of ID to another, identify genes associated with an InterPro domains or gene ontology (GO) terms, export Gene expression data and lots more. Ensembl uses MySQL relational databases to store its information. A comprehensive set of Application Programme Interfaces (APIs) serve as a middle-layer between underlyinG database schemes and more specific application proGrammes. The API aims to encapsulate
Recommended publications
  • CISD2 (NM 001008388) Human Tagged ORF Clone Product Data
    OriGene Technologies, Inc. 9620 Medical Center Drive, Ste 200 Rockville, MD 20850, US Phone: +1-888-267-4436 [email protected] EU: [email protected] CN: [email protected] Product datasheet for RC207131L3 CISD2 (NM_001008388) Human Tagged ORF Clone Product data: Product Type: Expression Plasmids Product Name: CISD2 (NM_001008388) Human Tagged ORF Clone Tag: Myc-DDK Symbol: CISD2 Synonyms: ERIS; Miner1; NAF-1; WFS2; ZCD2 Vector: pLenti-C-Myc-DDK-P2A-Puro (PS100092) E. coli Selection: Chloramphenicol (34 ug/mL) Cell Selection: Puromycin ORF Nucleotide The ORF insert of this clone is exactly the same as(RC207131). Sequence: Restriction Sites: SgfI-MluI Cloning Scheme: ACCN: NM_001008388 ORF Size: 405 bp This product is to be used for laboratory only. Not for diagnostic or therapeutic use. View online » ©2021 OriGene Technologies, Inc., 9620 Medical Center Drive, Ste 200, Rockville, MD 20850, US 1 / 2 CISD2 (NM_001008388) Human Tagged ORF Clone – RC207131L3 OTI Disclaimer: The molecular sequence of this clone aligns with the gene accession number as a point of reference only. However, individual transcript sequences of the same gene can differ through naturally occurring variations (e.g. polymorphisms), each with its own valid existence. This clone is substantially in agreement with the reference, but a complete review of all prevailing variants is recommended prior to use. More info OTI Annotation: This clone was engineered to express the complete ORF with an expression tag. Expression varies depending on the nature of the gene. RefSeq: NM_001008388.1 RefSeq Size: 5892 bp RefSeq ORF: 408 bp Locus ID: 493856 UniProt ID: Q8N5K1 Protein Families: Transmembrane MW: 15.3 kDa Gene Summary: The protein encoded by this gene is a zinc finger protein that localizes to the endoplasmic reticulum.
    [Show full text]
  • Ensembl Genomes: Extending Ensembl Across the Taxonomic Space P
    Published online 1 November 2009 Nucleic Acids Research, 2010, Vol. 38, Database issue D563–D569 doi:10.1093/nar/gkp871 Ensembl Genomes: Extending Ensembl across the taxonomic space P. J. Kersey*, D. Lawson, E. Birney, P. S. Derwent, M. Haimel, J. Herrero, S. Keenan, A. Kerhornou, G. Koscielny, A. Ka¨ ha¨ ri, R. J. Kinsella, E. Kulesha, U. Maheswari, K. Megy, M. Nuhn, G. Proctor, D. Staines, F. Valentin, A. J. Vilella and A. Yates EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK Received August 14, 2009; Revised September 28, 2009; Accepted September 29, 2009 ABSTRACT nucleotide archives; numerous other genomes exist in states of partial assembly and annotation; thousands of Ensembl Genomes (http://www.ensemblgenomes viral genomes sequences have also been generated. .org) is a new portal offering integrated access to Moreover, the increasing use of high-throughput genome-scale data from non-vertebrate species sequencing technologies is rapidly reducing the cost of of scientific interest, developed using the Ensembl genome sequencing, leading to an accelerating rate of genome annotation and visualisation platform. data production. This not only makes it likely that in Ensembl Genomes consists of five sub-portals (for the near future, the genomes of all species of scientific bacteria, protists, fungi, plants and invertebrate interest will be sequenced; but also the genomes of many metazoa) designed to complement the availability individuals, with the possibility of providing accurate and of vertebrate genomes in Ensembl. Many of the sophisticated annotation through the similarly low-cost databases supporting the portal have been built in application of functional assays.
    [Show full text]
  • Abstracts Genome 10K & Genome Science 29 Aug - 1 Sept 2017 Norwich Research Park, Norwich, Uk
    Genome 10K c ABSTRACTS GENOME 10K & GENOME SCIENCE 29 AUG - 1 SEPT 2017 NORWICH RESEARCH PARK, NORWICH, UK Genome 10K c 48 KEYNOTE SPEAKERS ............................................................................................................................... 1 Dr Adam Phillippy: Towards the gapless assembly of complete vertebrate genomes .................... 1 Prof Kathy Belov: Saving the Tasmanian devil from extinction ......................................................... 1 Prof Peter Holland: Homeobox genes and animal evolution: from duplication to divergence ........ 2 Dr Hilary Burton: Genomics in healthcare: the challenges of complexity .......................................... 2 INVITED SPEAKERS ................................................................................................................................. 3 Vertebrate Genomics ........................................................................................................................... 3 Alex Cagan: Comparative genomics of animal domestication .......................................................... 3 Plant Genomics .................................................................................................................................... 4 Ksenia Krasileva: Evolution of plant Immune receptors ..................................................................... 4 Andrea Harper: Using Associative Transcriptomics to predict tolerance to ash dieback disease in European ash trees ............................................................................................................
    [Show full text]
  • The ELIXIR Core Data Resources: ​Fundamental Infrastructure for The
    Supplementary Data: The ELIXIR Core Data Resources: fundamental infrastructure ​ for the life sciences The “Supporting Material” referred to within this Supplementary Data can be found in the Supporting.Material.CDR.infrastructure file, DOI: 10.5281/zenodo.2625247 (https://zenodo.org/record/2625247). ​ ​ Figure 1. Scale of the Core Data Resources Table S1. Data from which Figure 1 is derived: Year 2013 2014 2015 2016 2017 Data entries 765881651 997794559 1726529931 1853429002 2715599247 Monthly user/IP addresses 1700660 2109586 2413724 2502617 2867265 FTEs 270 292.65 295.65 289.7 311.2 Figure 1 includes data from the following Core Data Resources: ArrayExpress, BRENDA, CATH, ChEBI, ChEMBL, EGA, ENA, Ensembl, Ensembl Genomes, EuropePMC, HPA, IntAct /MINT , InterPro, PDBe, PRIDE, SILVA, STRING, UniProt ● Note that Ensembl’s compute infrastructure physically relocated in 2016, so “Users/IP address” data are not available for that year. In this case, the 2015 numbers were rolled forward to 2016. ● Note that STRING makes only minor releases in 2014 and 2016, in that the interactions are re-computed, but the number of “Data entries” remains unchanged. The major releases that change the number of “Data entries” happened in 2013 and 2015. So, for “Data entries” , the number for 2013 was rolled forward to 2014, and the number for 2015 was rolled forward to 2016. The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences ​ 1 Figure 2: Usage of Core Data Resources in research The following steps were taken: 1. API calls were run on open access full text articles in Europe PMC to identify articles that ​ ​ mention Core Data Resource by name or include specific data record accession numbers.
    [Show full text]
  • Cryptic Inoviruses Revealed As Pervasive in Bacteria and Archaea Across Earth’S Biomes
    ARTICLES https://doi.org/10.1038/s41564-019-0510-x Corrected: Author Correction Cryptic inoviruses revealed as pervasive in bacteria and archaea across Earth’s biomes Simon Roux 1*, Mart Krupovic 2, Rebecca A. Daly3, Adair L. Borges4, Stephen Nayfach1, Frederik Schulz 1, Allison Sharrar5, Paula B. Matheus Carnevali 5, Jan-Fang Cheng1, Natalia N. Ivanova 1, Joseph Bondy-Denomy4,6, Kelly C. Wrighton3, Tanja Woyke 1, Axel Visel 1, Nikos C. Kyrpides1 and Emiley A. Eloe-Fadrosh 1* Bacteriophages from the Inoviridae family (inoviruses) are characterized by their unique morphology, genome content and infection cycle. One of the most striking features of inoviruses is their ability to establish a chronic infection whereby the viral genome resides within the cell in either an exclusively episomal state or integrated into the host chromosome and virions are continuously released without killing the host. To date, a relatively small number of inovirus isolates have been extensively studied, either for biotechnological applications, such as phage display, or because of their effect on the toxicity of known bacterial pathogens including Vibrio cholerae and Neisseria meningitidis. Here, we show that the current 56 members of the Inoviridae family represent a minute fraction of a highly diverse group of inoviruses. Using a machine learning approach lever- aging a combination of marker gene and genome features, we identified 10,295 inovirus-like sequences from microbial genomes and metagenomes. Collectively, our results call for reclassification of the current Inoviridae family into a viral order including six distinct proposed families associated with nearly all bacterial phyla across virtually every ecosystem.
    [Show full text]
  • Cytotoxic Effects and Changes in Gene Expression Profile
    Toxicology in Vitro 34 (2016) 309–320 Contents lists available at ScienceDirect Toxicology in Vitro journal homepage: www.elsevier.com/locate/toxinvit Fusarium mycotoxin enniatin B: Cytotoxic effects and changes in gene expression profile Martina Jonsson a,⁎,MarikaJestoib, Minna Anthoni a, Annikki Welling a, Iida Loivamaa a, Ville Hallikainen c, Matti Kankainen d, Erik Lysøe e, Pertti Koivisto a, Kimmo Peltonen a,f a Chemistry and Toxicology Research Unit, Finnish Food Safety Authority (Evira), Mustialankatu 3, FI-00790 Helsinki, Finland b Product Safety Unit, Finnish Food Safety Authority (Evira), Mustialankatu 3, FI-00790 Helsinki, c The Finnish Forest Research Institute, Rovaniemi Unit, P.O. Box 16, FI-96301 Rovaniemi, Finland d Institute for Molecular Medicine Finland (FIMM), University of Helsinki, P.O. Box 20, FI-00014, Finland e Plant Health and Biotechnology, Norwegian Institute of Bioeconomy, Høyskoleveien 7, NO -1430 Ås, Norway f Finnish Safety and Chemicals Agency (Tukes), Opastinsilta 12 B, FI-00521 Helsinki, Finland article info abstract Article history: The mycotoxin enniatin B, a cyclic hexadepsipeptide produced by the plant pathogen Fusarium,isprevalentin Received 3 December 2015 grains and grain-based products in different geographical areas. Although enniatins have not been associated Received in revised form 5 April 2016 with toxic outbreaks, they have caused toxicity in vitro in several cell lines. In this study, the cytotoxic effects Accepted 28 April 2016 of enniatin B were assessed in relation to cellular energy metabolism, cell proliferation, and the induction of ap- Available online 6 May 2016 optosis in Balb 3T3 and HepG2 cells. The mechanism of toxicity was examined by means of whole genome ex- fi Keywords: pression pro ling of exposed rat primary hepatocytes.
    [Show full text]
  • Learning Protein Constitutive Motifs from Sequence Data Je´ Roˆ Me Tubiana, Simona Cocco, Re´ Mi Monasson*
    TOOLS AND RESOURCES Learning protein constitutive motifs from sequence data Je´ roˆ me Tubiana, Simona Cocco, Re´ mi Monasson* Laboratory of Physics of the Ecole Normale Supe´rieure, CNRS UMR 8023 & PSL Research, Paris, France Abstract Statistical analysis of evolutionary-related protein sequences provides information about their structure, function, and history. We show that Restricted Boltzmann Machines (RBM), designed to learn complex high-dimensional data and their statistical features, can efficiently model protein families from sequence information. We here apply RBM to 20 protein families, and present detailed results for two short protein domains (Kunitz and WW), one long chaperone protein (Hsp70), and synthetic lattice proteins for benchmarking. The features inferred by the RBM are biologically interpretable: they are related to structure (residue-residue tertiary contacts, extended secondary motifs (a-helixes and b-sheets) and intrinsically disordered regions), to function (activity and ligand specificity), or to phylogenetic identity. In addition, we use RBM to design new protein sequences with putative properties by composing and ’turning up’ or ’turning down’ the different modes at will. Our work therefore shows that RBM are versatile and practical tools that can be used to unveil and exploit the genotype–phenotype relationship for protein families. DOI: https://doi.org/10.7554/eLife.39397.001 Introduction In recent years, the sequencing of many organisms’ genomes has led to the collection of a huge number of protein sequences, which are catalogued in databases such as UniProt or PFAM Finn et al., 2014). Sequences that share a common ancestral origin, defining a family (Figure 1A), *For correspondence: are likely to code for proteins with similar functions and structures, providing a unique window into [email protected] the relationship between genotype (sequence content) and phenotype (biological features).
    [Show full text]
  • DECIPHER: Harnessing Local Sequence Context to Improve Protein Multiple Sequence Alignment Erik S
    Wright BMC Bioinformatics (2015) 16:322 DOI 10.1186/s12859-015-0749-z RESEARCH ARTICLE Open Access DECIPHER: harnessing local sequence context to improve protein multiple sequence alignment Erik S. Wright1,2 Abstract Background: Alignment of large and diverse sequence sets is a common task in biological investigations, yet there remains considerable room for improvement in alignment quality. Multiple sequence alignment programs tend to reach maximal accuracy when aligning only a few sequences, and then diminish steadily as more sequences are added. This drop in accuracy can be partly attributed to a build-up of error and ambiguity as more sequences are aligned. Most high-throughput sequence alignment algorithms do not use contextual information under the assumption that sites are independent. This study examines the extent to which local sequence context can be exploited to improve the quality of large multiple sequence alignments. Results: Two predictors based on local sequence context were assessed: (i) single sequence secondary structure predictions, and (ii) modulation of gap costs according to the surrounding residues. The results indicate that context-based predictors have appreciable information content that can be utilized to create more accurate alignments. Furthermore, local context becomes more informative as the number of sequences increases, enabling more accurate protein alignments of large empirical benchmarks. These discoveries became the basis for DECIPHER, a new context-aware program for sequence alignment, which outperformed other programs on largesequencesets. Conclusions: Predicting secondary structure based on local sequence context is an efficient means of breaking the independence assumption in alignment. Since secondary structure is more conserved than primary sequence, it can be leveraged to improve the alignment of distantly related proteins.
    [Show full text]
  • Assessment of Genetic Mutations in WFS1 & CISD2 in Wolfram
    Online Journal of Neurology and L UPINE PUBLISHERS Brain Disorders Open Access DOI: 10.32474/OJNBD.2018.01.000104 ISSN: 2637-6628 Research Article Assessment of Genetic Mutations in WFS1 & CISD2 in Wolfram Syndrome Human Shahin Asadi*, Mahsa Jamali, Hamideh Mohammadzadeh and Mahya Fattahi Director of the Division of Medical Genetics and Molecular Research, Iran Received: January 25, 2018; Published: February 15, 2018 *Corresponding author: Shahin Asadi, Molecular Genetics Iran Tabriz, Director of the Division of Medical Genetics and Molecular Research, Iran Abstract In this study we have analyzed 30 people. 10 patients Wolfram syndrome and 20 persons control group. The genes WFS1 and CISD2, analyzed in terms of genetic mutations made. In this study, people who have genetic mutations were targeted, with nervous disorders Wolfram syndrome. In fact, of all people with Wolfram syndrome. 10 patients Wolfram syndrome had a genetic mutation in the genes WFS1 and CISD2 Wolfram syndrome. Any genetic mutations in the target genes control group did not show. Keywords: Genetic study; Wolfram syndrome; Mutations The gene WFS1 and CISD2; RT-PCR. Generalizations of Wolfram Syndrome urinary tract, decreased testosterone levels in men (hypogonadism), Wolfram syndrome is a genetic disorder that affects many body Nervous or psychiatric disorders [1]. systems. The specific characteristics of Wolfram syndrome include: mellitus) and progressive vision loss due to degeneration of the syndrome, which is commonly diagnosed at about 6 years of age. high blood sugar levels due to insulin hormone deficiency (diabetes Diabetes mellitus is usually the first symptom of Wolfram nerves that transports visual information from the eye to the brain Almost all people with Wolfram syndrome who have diabetes need (vision atrophy).
    [Show full text]
  • Wolfram Syndrome
    Wolfram syndrome Description Wolfram syndrome is a condition that affects many of the body's systems. The hallmark features of Wolfram syndrome are high blood sugar levels resulting from a shortage of the hormone insulin (diabetes mellitus) and progressive vision loss due to degeneration of the nerves that carry information from the eyes to the brain (optic atrophy). People with Wolfram syndrome often also have pituitary gland dysfunction that results in the excretion of excessive amounts of urine (diabetes insipidus), hearing loss caused by changes in the inner ear (sensorineural deafness), urinary tract problems, reduced amounts of the sex hormone testosterone in males (hypogonadism), or neurological or psychiatric disorders. Diabetes mellitus is typically the first symptom of Wolfram syndrome, usually diagnosed around age 6. Nearly everyone with Wolfram syndrome who develops diabetes mellitus requires insulin replacement therapy. Optic atrophy is often the next symptom to appear, usually around age 11. The first signs of optic atrophy are loss of color vision and side ( peripheral) vision. Over time, the vision problems get worse, and people with optic atrophy are usually blind within approximately 8 years after signs of optic atrophy first begin. In diabetes insipidus, the pituitary gland, which is located at the base of the brain, does not function normally. This abnormality disrupts the release of a hormone called vasopressin, which helps control the body's water balance and urine production. Approximately 70 percent of people with Wolfram syndrome have diabetes insipidus. Pituitary gland dysfunction can also cause hypogonadism in males. The lack of testosterone that occurs with hypogonadism affects growth and sexual development.
    [Show full text]
  • 1 Codon-Level Information Improves Predictions of Inter-Residue Contacts in Proteins 2 by Correlated Mutation Analysis 3
    1 Codon-level information improves predictions of inter-residue contacts in proteins 2 by correlated mutation analysis 3 4 5 6 7 8 Etai Jacob1,2, Ron Unger1,* and Amnon Horovitz2,* 9 10 11 1The Mina & Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat- 12 Gan, 52900, 2Department of Structural Biology 13 Weizmann Institute of Science, Rehovot 7610001, Israel 14 15 16 *To whom correspondence should be addressed: 17 Amnon Horovitz ([email protected]) 18 Ron Unger ([email protected]) 19 1 20 Abstract 21 Methods for analysing correlated mutations in proteins are becoming an increasingly 22 powerful tool for predicting contacts within and between proteins. Nevertheless, 23 limitations remain due to the requirement for large multiple sequence alignments (MSA) 24 and the fact that, in general, only the relatively small number of top-ranking predictions 25 are reliable. To date, methods for analysing correlated mutations have relied exclusively 26 on amino acid MSAs as inputs. Here, we describe a new approach for analysing 27 correlated mutations that is based on combined analysis of amino acid and codon MSAs. 28 We show that a direct contact is more likely to be present when the correlation between 29 the positions is strong at the amino acid level but weak at the codon level. The 30 performance of different methods for analysing correlated mutations in predicting 31 contacts is shown to be enhanced significantly when amino acid and codon data are 32 combined. 33 2 34 The effects of mutations that disrupt protein structure and/or function at one site are often 35 suppressed by mutations that occur at other sites either in the same protein or in other 36 proteins.
    [Show full text]
  • Annual Scientific Report 2013 on the Cover Structure 3Fof in the Protein Data Bank, Determined by Laponogov, I
    EMBL-European Bioinformatics Institute Annual Scientific Report 2013 On the cover Structure 3fof in the Protein Data Bank, determined by Laponogov, I. et al. (2009) Structural insight into the quinolone-DNA cleavage complex of type IIA topoisomerases. Nature Structural & Molecular Biology 16, 667-669. © 2014 European Molecular Biology Laboratory This publication was produced by the External Relations team at the European Bioinformatics Institute (EMBL-EBI) A digital version of the brochure can be found at www.ebi.ac.uk/about/brochures For more information about EMBL-EBI please contact: [email protected] Contents Introduction & overview 3 Services 8 Genes, genomes and variation 8 Molecular atlas 12 Proteins and protein families 14 Molecular and cellular structures 18 Chemical biology 20 Molecular systems 22 Cross-domain tools and resources 24 Research 26 Support 32 ELIXIR 36 Facts and figures 38 Funding & resource allocation 38 Growth of core resources 40 Collaborations 42 Our staff in 2013 44 Scientific advisory committees 46 Major database collaborations 50 Publications 52 Organisation of EMBL-EBI leadership 61 2013 EMBL-EBI Annual Scientific Report 1 Foreword Welcome to EMBL-EBI’s 2013 Annual Scientific Report. Here we look back on our major achievements during the year, reflecting on the delivery of our world-class services, research, training, industry collaboration and European coordination of life-science data. The past year has been one full of exciting changes, both scientifically and organisationally. We unveiled a new website that helps users explore our resources more seamlessly, saw the publication of ground-breaking work in data storage and synthetic biology, joined the global alliance for global health, built important new relationships with our partners in industry and celebrated the launch of ELIXIR.
    [Show full text]