A Library of Protein Families and Subfamilies Indexed by Function Paul D
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
1 Mutational Heterogeneity in Cancer Akash Kumar a Dissertation
Mutational Heterogeneity in Cancer Akash Kumar A dissertation Submitted in partial fulfillment of requirements for the degree of Doctor of Philosophy University of Washington 2014 June 5 Reading Committee: Jay Shendure Pete Nelson Mary Claire King Program Authorized to Offer Degree: Genome Sciences 1 University of Washington ABSTRACT Mutational Heterogeneity in Cancer Akash Kumar Chair of the Supervisory Committee: Associate Professor Jay Shendure Department of Genome Sciences Somatic mutation plays a key role in the formation and progression of cancer. Differences in mutation patterns likely explain much of the heterogeneity seen in prognosis and treatment response among patients. Recent advances in massively parallel sequencing have greatly expanded our capability to investigate somatic mutation. Genomic profiling of tumor biopsies could guide the administration of targeted therapeutics on the basis of the tumor’s collection of mutations. Central to the success of this approach is the general applicability of targeted therapies to a patient’s entire tumor burden. This requires a better understanding of the genomic heterogeneity present both within individual tumors (intratumoral) and amongst tumors from the same patient (intrapatient). My dissertation is broadly organized around investigating mutational heterogeneity in cancer. Three projects are discussed in detail: analysis of (1) interpatient and (2) intrapatient heterogeneity in men with disseminated prostate cancer, and (3) investigation of regional intratumoral heterogeneity in -
PANTHER Tutorial 2011.Pptx
PANTHER Classificaon System version 7 Huaiyu Mi Department of Preven3ve Medicine Keck School of Medicine University of Southern California USA August 27, 2011, ICSB Tutorial, Heidelberg, Germany 0 Outline • PANTHER Background – How PANTHER is built? • PANTHER Website at a Glance – Brief overview of all PANTHER pages • PANTHER Basic Func3onali3es • PANTHER Tools – Tutorial on tool usage 1 PANTHER BACKGROUND 2 PANTHER Database 3 4 What’s new in PANTHER 7.0? • Whole genome sequence coverage from 48 organisms. • New tree building algorithm (GIGA) for improved phylogene3c relaonships of genes and families. • Improved Hidden-Markov Models • Improved ortholog iden3ficaon. • Implement GO slim and PANTHER protein class for classifying genes and families. • Expanded sets of genomes and sequence iden3fier for PANTHER tools. • PANTHER Pathway diagram in SBGN. 5 PANTHER PROTEIN LIBRARY 6 What is PANTHER? PANTHER library (PANTHER/LIB) • a family tree Sequences • a mul3ple sequence alignment • an HMM PANTHER subfamily HMM models PANTHER GO slim and Protein Class Stas3c models Phylogene3c trees Mul3sequence (HMM) alignments • Molecular func3on • Biological process • Cellular component • Protein class 7 Building PANTHER Protein Family Library Select sequences Build clusters Curaon PANTHER Build MSA Protein Libray Build trees PANTHER GO slim Build and Protein Class HMMs ontology 8 Complete Gene Sets • 12 GO Reference Genomes • 36 other genomes to help reconstruct evolu3onary history – 14 bacterial genomes – 2 archaeal genomes – 2 fungal genomes – 2 plant genomes – 1 amoebozoan genome – 3 prost genomes – 2 protostome genomes – 10 deuterostome genomes 9 “Standard” set of protein coding genes and corresponding protein sequences Get list of genes in each genome • 48 genomes • Sources of genes – MOD Get list of all protein products – ENSEMBL from given source – NCBI (Entrez) • Sources of protein sequences Get mapping of – UniProt each protein – product to UniProt NCBI (Refseq) – ENSEMBL • One protein is selected for Select one each gene. -
Ejhg2009157.Pdf
European Journal of Human Genetics (2010) 18, 342–347 & 2010 Macmillan Publishers Limited All rights reserved 1018-4813/10 $32.00 www.nature.com/ejhg ARTICLE Fine mapping and association studies of a high-density lipoprotein cholesterol linkage region on chromosome 16 in French-Canadian subjects Zari Dastani1,2,Pa¨ivi Pajukanta3, Michel Marcil1, Nicholas Rudzicz4, Isabelle Ruel1, Swneke D Bailey2, Jenny C Lee3, Mathieu Lemire5,9, Janet Faith5, Jill Platko6,10, John Rioux6,11, Thomas J Hudson2,5,7,9, Daniel Gaudet8, James C Engert*,2,7, Jacques Genest1,2,7 Low levels of high-density lipoprotein cholesterol (HDL-C) are an independent risk factor for cardiovascular disease. To identify novel genetic variants that contribute to HDL-C, we performed genome-wide scans and quantitative association studies in two study samples: a Quebec-wide study consisting of 11 multigenerational families and a study of 61 families from the Saguenay– Lac St-Jean (SLSJ) region of Quebec. The heritability of HDL-C in these study samples was 0.73 and 0.49, respectively. Variance components linkage methods identified a LOD score of 2.61 at 98 cM near the marker D16S515 in Quebec-wide families and an LOD score of 2.96 at 86 cM near the marker D16S2624 in SLSJ families. In the Quebec-wide sample, four families showed segregation over a 25.5-cM (18 Mb) region, which was further reduced to 6.6 Mb with additional markers. The coding regions of all genes within this region were sequenced. A missense variant in CHST6 segregated in four families and, with additional families, we observed a P value of 0.015 for this variant. -
A Comprehensive Evaluation of 181 Reported CHST6 Variants in Patients with Macular Corneal Dystrophy
www.aging‐us.com AGING 2019, Vol. 11, No. 3 Research Paper A comprehensive evaluation of 181 reported CHST6 variants in patients with macular corneal dystrophy Jing Zhang1, Dan Wu1, Yue Li1, Yidan Fan1, Yiqin Dai1, Jianjiang Xu1 1Department of Ophthalmology and Visual Science, Eye and ENT Hospital, Shanghai Medical College of Fudan University, NHC Key Laboratory of Myopia (Fudan University), Shanghai Key Laboratory of Visual Impairment and Restoration, Shanghai, China Correspondence to: Jianjiang Xu; email: [email protected] Keywords: CHST6, macular corneal dystrophy, genetic variants Received: October 23, 2018 Accepted: January 25, 2019 Published: February 4, 2019 Copyright: Zhang et al. This is an open‐access article distributed under the terms of the Creative Commons Attribution License (CC BY 3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. ABSTRACT Macular corneal dystrophy (MCD) is an autosomal recessive disease featured by bilateral progressive stromal clouding and loss of vision, consequently necessitating corneal transplantation. Variants in CHST6 gene have been recognized as the most critical genetic components in MCD. Although many CHST6 variants have been described until now, the detailed mechanisms underlying MCD are still far from understood. In this study, we integrated all the reported CHST6 variants described in 408 MCD cases, and performed a comprehensive evaluation to better illustrate the causality of these variants. The results showed that majority of these variants (165 out of 181) could be classified as pathogenic or likely pathogenic. Interestingly, we also identified several disease causal variants with ethnic specificity. In addition, the results underscored the strong correlation between mutant frequency and residue conservation in the general population (Spearman’s correlation coefficient = ‐0.311, P = 1.20E‐05), thus providing potential candidate targets for further genetic manipulation. -
PANTHER Version 11: Expanded Annotation Data from Gene Ontology and Reactome Pathways, and Data Analysis Tool Enhancements
Published online 28 November 2016 Nucleic Acids Research, 2017, Vol. 45, Database issue D183–D189 doi: 10.1093/nar/gkw1138 PANTHER version 11: expanded annotation data from Gene Ontology and Reactome pathways, and data analysis tool enhancements Huaiyu Mi*, Xiaosong Huang, Anushya Muruganujan, Haiming Tang, Caitlin Mills, Diane Kang and Paul D. Thomas* Division of Bioinformatics, Department of Preventive Medicine, Keck School of Medicine of USC, University of Southern California, Los Angeles, CA 90033, USA Received October 05, 2016; Revised October 27, 2016; Editorial Decision October 28, 2016; Accepted November 16, 2016 ABSTRACT INTRODUCTION The PANTHER database (Protein ANalysis THrough Protein ANalysis THrough Evolutionary Relationships Evolutionary Relationships, http://pantherdb.org) (PANTHER) is a multifaceted data resource for classifica- contains comprehensive information on the evolu- tion of protein sequences by evolutionary history, and by tion and function of protein-coding genes from 104 function. Protein-coding genes from 104 organisms are clas- completely sequenced genomes. PANTHER software sified by evolutionary relationships, and by structured rep- resentations of protein function including the Gene Ontol- tools allow users to classify new protein sequences, ogy (GO) and biological pathways. The foundation of PAN- and to analyze gene lists obtained from large-scale THER is a comprehensive ‘library’ of phylogenetic trees genomics experiments. In the past year, major im- of protein-coding gene families. These trees attempt to re- provements include a large expansion of classifica- construct the evolutionary events (speciation, gene duplica- tion information available in PANTHER, as well as tion and horizontal gene transfer) that led to the modern- significant enhancements to the analysis tools. -
Microarray Bioinformatics and Its Applications to Clinical Research
Microarray Bioinformatics and Its Applications to Clinical Research A dissertation presented to the School of Electrical and Information Engineering of the University of Sydney in fulfillment of the requirements for the degree of Doctor of Philosophy i JLI ··_L - -> ...·. ...,. by Ilene Y. Chen Acknowledgment This thesis owes its existence to the mercy, support and inspiration of many people. In the first place, having suffering from adult-onset asthma, interstitial cystitis and cold agglutinin disease, I would like to express my deepest sense of appreciation and gratitude to Professors Hong Yan and David Levy for harbouring me these last three years and providing me a place at the University of Sydney to pursue a very meaningful course of research. I am also indebted to Dr. Craig Jin, who has been a source of enthusiasm and encouragement on my research over many years. In the second place, for contexts concerning biological and medical aspects covered in this thesis, I am very indebted to Dr. Ling-Hong Tseng, Dr. Shian-Sehn Shie, Dr. Wen-Hung Chung and Professor Chyi-Long Lee at Change Gung Memorial Hospital and University of Chang Gung School of Medicine (Taoyuan, Taiwan) as well as Professor Keith Lloyd at University of Alabama School of Medicine (AL, USA). All of them have contributed substantially to this work. In the third place, I would like to thank Mrs. Inge Rogers and Mr. William Ballinger for their helpful comments and suggestions for the writing of my papers and thesis. In the fourth place, I would like to thank my swim coach, Hirota Homma. -
The BCAR1 Locus in Carotid Intima-Media Thickness and Atherosclerosis
The BCAR1 Locus in Carotid Intima-Media Thickness and Atherosclerosis Freya Boardman-Pretty University College London Thesis submitted for the degree of Doctor of Philosophy 1 Declaration of work I, Freya Boardman-Pretty, confirm that the work presented in this thesis is my own and has been generated by me as the result of my own original research. Where information has been derived from other sources, this has been indicated in the text. My role in the work presented in each chapter is outlined below. In chapter 3 I carried out all bioinformatics analysis, including identification of SNPs in strong LD, analysis of regulatory marks associated with these SNPs using UCSC Genome Browser, HaploReg and ElDorado, and selection of candidate SNPs. I used the Gene-Tissue Expression (GTEx) browser and Gilad/Pritchard eQTL browser to investigate eQTLs. In chapter 4 I was responsible for genotyping the PLIC cohort for the SNP rs4888378. Genotype data and an analysis plan were sent to Danilo Norata and Andrea Baragetti (University of Milan), who carried out the statistical tests that I requested in PLIC and provided test results and the necessary data for the meta-analysis. I carried out the sex-stratified meta-analysis and additional event rate analysis. In chapter 5 I carried out electrophoretic mobility shift assays (EMSAs) on candidate SNPs, multiplex competitor EMSAs and supershift EMSAs. For the luciferase assay, I designed reporter fragments of interest and carried out the cloning experiments to create the four sets of reporter vectors, and carried out the luciferase assays. I also carried out transfection experiments. -
Exploring the Roles of Horizontal Gene Transfer in Metazoans by Dongliang Chen June, 2016 Director: Dr. Jinling Huang DEPARTMEN
Exploring the Roles of Horizontal Gene Transfer in Metazoans By Dongliang Chen June, 2016 Director: Dr. Jinling Huang DEPARTMENT OF BIOLOGY Horizontal gene transfer (HGT; also known as lateral gene transfer, LGT) refers to the movement of genetic information between distinct species by overcoming normal mating barriers. Historically HGT is only considered to be important in prokaryotes. Some researchers believe that eukaryotes have sexual recombination and HGT is insignificant. However, HGT has also been found to play roles in many aspects of eukaryotic evolution, like parasitism and the colonization of land by plants, although at lower frequencies than in prokaryotes. In this dissertation, I first estimated the scope of HGT in 16 selected metazoan species by genome screening using AlienG. These species are sampled to represent major lineages of metazoans. Among all the 16 species, Nematostella vectensis (4.08%) has the highest percentage of HGT genes, while parasitic Schistosoma japonicum (0.47%) ranks the lowest. In order to find out which factors are correlated with HGT rates in different species, living habitat, diet, lineage group and reproductive type were analyzed in a statistical framework. In Chapter 3 and Chapter 4, Ciona intestinalis and Trichoplax adhaerens were chosen as models to investigate horizontally acquired genes. Tunicate cellulose synthase was discovered to originate from green algae, instead from bacteria as found in previous studies. 43 genes of 21 families in T. adhaerens were found to be horizontally acquired. -
V12a18-Klintworth Pgmkr
Molecular Vision 2006; 12:159-76 <http://www.molvis.org/molvis/v12/a18/> ©2006 Molecular Vision Received 13 October 2005 | Accepted 8 March 2006 | Published 10 March 2006 CHST6 mutations in North American subjects with macular corneal dystrophy: a comprehensive molecular genetic review Gordon K. Klintworth,1,2 Clayton F. Smith,2 Brandy L. Bowling2 Departments of 1Pathology and 2Ophthalmology, Duke University Medical Center, Durham, NC Purpose: To evaluate mutations in the carbohydrate sulfotransferase-6 (CHST6) gene in American subjects with macular corneal dystrophy (MCD). Methods: We analyzed CHST6 in 57 patients from 31 families with MCD from the United States, 57 carriers (parents or children), and 27 unaffected blood relatives of affected subjects. We compared the observed nucleotide sequences with those found by numerous investigators in other populations with MCD and in controls. Results: In 24 families, the corneal disorder could be explained by mutations in the coding region of CHST6 or in the region upstream of this gene in both the maternal and paternal chromosome. In most instances of MCD a homozygous or heterozygous missense mutation in exon 3 of CHST6 was found. Six cases resulted from a deletion upstream of CHST6. Conclusions: Nucleotide changes within the coding region of CHST6 are predicted to alter the encoded protein signifi- cantly within evolutionary conserved parts of the encoded sulfotransferase. Our findings support the hypothesis that CHST6 mutations are cardinal to the pathogenesis of MCD. Moreover, the observation that some cases of MCD cannot be explained by mutations in CHST6 suggests that MCD may result from other subtle changes in CHST6 or from genetic heterogeneity. -
Gnomad Lof Supplement
1 gnomAD supplement gnomAD supplement 1 Data processing 4 Alignment and read processing 4 Variant Calling 4 Coverage information 5 Data processing 5 Sample QC 7 Hard filters 7 Supplementary Table 1 | Sample counts before and after hard and release filters 8 Supplementary Table 2 | Counts by data type and hard filter 9 Platform imputation for exomes 9 Supplementary Table 3 | Exome platform assignments 10 Supplementary Table 4 | Confusion matrix for exome samples with Known platform labels 11 Relatedness filters 11 Supplementary Table 5 | Pair counts by degree of relatedness 12 Supplementary Table 6 | Sample counts by relatedness status 13 Population and subpopulation inference 13 Supplementary Figure 1 | Continental ancestry principal components. 14 Supplementary Table 7 | Population and subpopulation counts 16 Population- and platform-specific filters 16 Supplementary Table 8 | Summary of outliers per population and platform grouping 17 Finalizing samples in the gnomAD v2.1 release 18 Supplementary Table 9 | Sample counts by filtering stage 18 Supplementary Table 10 | Sample counts for genomes and exomes in gnomAD subsets 19 Variant QC 20 Hard filters 20 Random Forest model 20 Features 21 Supplementary Table 11 | Features used in final random forest model 21 Training 22 Supplementary Table 12 | Random forest training examples 22 Evaluation and threshold selection 22 Final variant counts 24 Supplementary Table 13 | Variant counts by filtering status 25 Comparison of whole-exome and whole-genome coverage in coding regions 25 Variant annotation 30 Frequency and context annotation 30 2 Functional annotation 31 Supplementary Table 14 | Variants observed by category in 125,748 exomes 32 Supplementary Figure 5 | Percent observed by methylation. -