UNIVERSITY OF CALIFORNIA, SAN DIEGO

Effective Design and Analysis of Systems Genetics Studies

A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Computer Science

by

Hyun Min Kang

Committee in charge:
Professor Pavel Pevzner, Chair
Professor Eleazar Eskin, Co-Chair
Professor Vineet Bafna
Professor Sanjoy Dasgupta
Professor Trey Ideker
Professor Nicholas J. Schork

2009

Copyright Hyun Min Kang, 2009. All rights reserved.

The dissertation of Hyun Min Kang is approved, and it is acceptable in quality and form for publication on microfilm and electronically:
Co-Chair
Chair
University of California, San Diego, 2009

DEDICATION

To Jihye and Joseph.

EPIGRAPH

Get the facts, or the facts will get you. And when you get them, get them right, or they will get you wrong.
— Thomas Fuller

TABLE OF CONTENTS

Signature Page
Dedication
Epigraph
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita and Publications
Abstract of the Dissertation

Chapter 1 Introduction

Chapter 2 A high-density haplotype resource of 94 inbred mouse strains
  2.1 Motivation
  2.2 Results
    2.2.1 The mouse HapMap resource
    2.2.2 Haplotype structure among the strains
    2.2.3 Integrating NIEHS/Perlegen resequencing and HapMap data
    2.2.4 Effects of larger resources
    2.2.5 Trait mapping with the mouse HapMap resource
  2.3 Discussion
  2.4 Methods

Chapter 3 An adaptive and memory efficient algorithm for genotype imputation
  3.1 Motivation
  3.2 Materials and methods
    3.2.1 The imputation problem
    3.2.2 Imputation algorithm for haploid model
    3.2.3 Extension to unphased genotypes (diploid model)
  3.3 Results
    3.3.1 Genotype imputation of 94 inbred mouse strains
    3.3.2 Imputation of HapMap SNPs in WTCCC samples
  3.4 Conclusion

Chapter 4 Efficient control of population structure in model organism association mapping
  4.1 Motivation
  4.2 Materials and methods
    4.2.1 Genotypes and phenotypes
    4.2.2 Efficient mixed model association (EMMA)
    4.2.3 Similarity-based kinship matrix
    4.2.4 Phylogenetic control
    4.2.5 Statistical tests and multiple hypothesis testing
    4.2.6 Simulation studies
    4.2.7 Derivation of restricted likelihood and derivatives
  4.3 Results
    4.3.1 Comparison with previous methods
    4.3.2 High resolution genome-wide association mapping in inbred mouse strains
    4.3.3 Power of inbred association mapping
  4.4 Discussion

Chapter 5 Accounting for sample structure in large scale genome-wide association studies using a variance component model
  5.1 Motivation
  5.2 Materials and methods
    5.2.1 Variance component model to account for sample structure
    5.2.2 Estimating marker specific inflation factor
    5.2.3 Accounting for large effect sizes at some SNPs
    5.2.4 Application to case control datasets
    5.2.5 Genotype and phenotype data
  5.3 Results
    5.3.1 NFBC66
    5.3.2 Application to WTCCC case-control data
    5.3.3 Marker specific inflation factors
  5.4 Discussion

Chapter 6 Accurate discovery of expression quantitative trait loci under confounding from spurious and genuine regulatory hotspots
  6.1 Motivation
  6.2 Results
    6.2.1 Spurious regulatory hotspots in recombinant inbred mice
    6.2.2 Inter-sample correlation as signatures of systematic confounding effects
    6.2.3 Inter-sample Correlation Emended (ICE) eQTL mapping
    6.2.4 Some trans-regulatory bands in high quality datasets are likely to correspond to real genetic effects
    6.2.5 Correcting for confounding effects in human lymphoblastoid cell line expression
    6.2.6 Comparison with previous methods
  6.3 Discussion
  6.4 Materials and methods
    6.4.1 Gene expression data and genetic maps
    6.4.2 Traditional eQTL mapping and genome wide eQTL maps
    6.4.3 Explicit batch effect correction and Surrogate Variable Analysis
    6.4.4 Genome wide inter-sample correlation
    6.4.5 Simulation studies
    6.4.6 Variance component test
    6.4.7 ICE eQTL mapping
    6.4.8 Assessing the statistical significance of trans-regulatory bands

Chapter 7 A High Resolution Association Mapping Panel for the Dissection of Complex Traits in Mice
  7.1 Motivation
  7.2 Results
    7.2.1 Design principles of mouse association studies
    7.2.2 Strain selection for the Hybrid Mouse Diversity Panel
    7.2.3 Validating the statistical power of the HMDP through mapping metabolic clinical traits
    7.2.4 Resolution of mouse association studies
    7.2.5 Application of the HMDP by mapping metabolic clinical traits
    7.2.6 Comparison to previous mouse association studies
  7.3 Discussion
  7.4 Materials and methods
    7.4.1 Animals
    7.4.2 Phenotypes/phenotyping protocols
    7.4.3 Genotyping
    7.4.4 RNA isolation and expression profiling
    7.4.5 Gene expression analysis
    7.4.6 Genome-wide association mapping accounting for population structure
    7.4.7 Estimation of power and mapping resolution
    7.4.8 Genome-wide significance threshold
    7.4.9 Validation of clinical and expression associations

Chapter 8 Conclusion and future work
  8.1 Summary and conclusion
  8.2 Future work
    8.2.1 GWAS with unstratified populations
    8.2.2 Exploring multiple rare variants hypothesis
    8.2.3 Capturing unmodeled confounding effects inherent in various high-throughput data
    8.2.4 Challenges in sequence-based association mapping

Bibliography

LIST OF FIGURES

Figure 1.1: A conceptual diagram of systems genetics studies
Figure 2.1: Classification of 94 strains used in the mouse HapMap projects
Figure 2.2: Fraction of genome covered by shared segments
Figure 2.3: Fraction of pairwise shared genomic segments
Figure 2.4: Estimated imputation accuracy and coverage
Figure 2.5: Phenotypic variance explained by population structure
Figure 2.6: Number of phenotypes with significant associations
Figure 2.7: Comparison of genomic control inflation factors
Figure 3.1: An example of the imputation problem
Figure 3.2: An example of HMM
Figure 4.1: Comparison between different methods
Figure 4.2: Cumulative distribution of p-values
Figure 4.3: Genome-wide association plots
Figure 4.4: Power estimates based on real phenotypes
Figure 4.5: Simulation-based power estimates
Figure 5.1: Principal components and geographical information
Figure 5.2: Scatterplot of five principal components
Figure 5.3: QQ plot of LDL association
Figure 5.4: Inflation by the number of principal components
Figure 5.5: Comparison between IBS and IBD estimates
Figure 5.6: QQ plots for NFBC66 association mapping
Figure 5.7: Comparison between EMMA and EMMAX
Figure 5.8: Comparisons of LDL association plots
Figure 5.9: Concordance between different methods
Figure 5.10: Differences in beta estimates
Figure 5.11: QQ plots in WTCCC association mapping
Figure 5.12: Distribution of the marker specific inflation factors
Figure 5.13: QQ plots comparison using simulated phenotypes
Figure 5.14: Concordance of per-marker inflation factor
Figure 6.1: Comparison of regulatory hotspots
Figure 6.2: Genome wide eQTL maps
Figure 6.3: Genome wide inter-sample correlation with replicated samples
Figure 6.4: eQTL maps of simulated expression datasets
Figure 6.5: Statistical power under various systematic confounding effects
Figure 6.6: trans-regulatory bands under various systematic confounding effects
Figure 6.7: Number of genes with significant eQTLs
Figure 6.8: Concordance of eQTLs
Figure 6.9: Inter-sample correlation with M430v2 chip
Figure 6.10: Regulatory hotspots between original and simulated datasets
Figure 6.11: Genome wide eQTL maps using Surrogate Variable Analysis
Figure 6.12: P-values for differential expressions across populations
Figure 6.13: QQ plot of differential expressions between populations
Figure 6.14: Cis associations within each HapMap population
Figure 7.1: Power Calculations
Figure 7.2: Detection of associations for plasma lipids in HMDP strains
Figure 7.3: Expression traits demonstrate high resolution of HMDP
Figure 7.4: Correcting for population structure dramatically reduces false positives

Recommended publications
  • Genome-Wide Analysis of 5-hmC in the Peripheral Blood of Systemic Lupus Erythematosus Patients Using an hMeDIP-Chip
    INTERNATIONAL JOURNAL OF MOLECULAR MEDICINE 35: 1467-1479, 2015
    Genome-wide analysis of 5-hmC in the peripheral blood of systemic lupus erythematosus patients using an hMeDIP-chip
    WEIGUO SUI (1*), QIUPEI TAN (1*), MING YANG (1), QIANG YAN (1), HUA LIN (1), MINGLIN OU (1), WEN XUE (1), JIEJING CHEN (1), TONGXIANG ZOU (1), HUANYUN JING (1), LI GUO (1), CUIHUI CAO (1), YUFENG SUN (1), ZHENZHEN CUI (1) and YONG DAI (2)
    (1) Guangxi Key Laboratory of Metabolic Diseases Research, Central Laboratory of Guilin 181st Hospital, Guilin, Guangxi 541002; (2) Clinical Medical Research Center, the Second Clinical Medical College of Jinan University (Shenzhen People's Hospital), Shenzhen, Guangdong 518020, P.R. China
    Received July 9, 2014; Accepted February 27, 2015
    DOI: 10.3892/ijmm.2015.2149
    Abstract. Systemic lupus erythematosus (SLE) is a chronic, potentially fatal systemic autoimmune disease characterized by the production of autoantibodies against a wide range of self-antigens. To investigate the role of the 5-hmC DNA modification with regard to the onset of SLE, we compared the levels of 5-hmC between SLE patients and normal controls. Whole blood was obtained from patients, and genomic DNA was extracted. Using the hMeDIP-chip analysis and validation by quantitative RT-PCR (RT-qPCR), we identified the differentially hydroxymethylated regions that are associated with SLE.
    Introduction. Systemic lupus erythematosus (SLE) is a typical systemic autoimmune disease, involving diffuse connective tissues (1) and is characterized by immune inflammation. SLE has a complex pathogenesis (2), involving genetic, immunologic and environmental factors. Thus, it may result in damage to multiple tissues and organs, especially the kidneys (3). SLE arises from a combination of heritable and environmental influences. Epigenetics, the study of changes in gene expression
  • Supplementary Table S1. Upregulated Genes Differentially
    Supplementary Table S1. Upregulated genes differentially expressed in athletes (p < 0.05 and 1.3-fold change)
    Probe ID | Gene Symbol | p Value | Fold Change
    221051_s_at | NMRK2 | 0.01 | 2.38
    236518_at | CCDC183 | 0.00 | 2.05
    218804_at | ANO1 | 0.00 | 2.05
    234675_x_at | | 0.01 | 2.02
    207076_s_at | ASS1 | 0.00 | 1.85
    209135_at | ASPH | 0.02 | 1.81
    228434_at | BTNL9 | 0.03 | 1.81
    229985_at | BTNL9 | 0.01 | 1.79
    215795_at | MYH7B | 0.01 | 1.78
    217979_at | TSPAN13 | 0.01 | 1.77
    230992_at | BTNL9 | 0.01 | 1.75
    226884_at | LRRN1 | 0.03 | 1.74
    220039_s_at | CDKAL1 | 0.01 | 1.73
    236520_at | | 0.02 | 1.72
    219895_at | TMEM255A | 0.04 | 1.72
    201030_x_at | LDHB | 0.00 | 1.69
    233824_at | | 0.00 | 1.69
    232257_s_at | | 0.05 | 1.67
    236359_at | SCN4B | 0.04 | 1.64
    242868_at | | 0.00 | 1.63
    1557286_at | | 0.01 | 1.63
    202780_at | OXCT1 | 0.01 | 1.63
    1556542_a_at | | 0.04 | 1.63
    209992_at | PFKFB2 | 0.04 | 1.63
    205247_at | NOTCH4 | 0.01 | 1.62
    1554182_at | TRIM73///TRIM74 | 0.00 | 1.61
    232892_at | MIR1-1HG | 0.02 | 1.61
    204726_at | CDH13 | 0.01 | 1.6
    1561167_at | | 0.01 | 1.6
    1565821_at | | 0.01 | 1.6
    210169_at | SEC14L5 | 0.01 | 1.6
    236963_at | | 0.02 | 1.6
    1552880_at | SEC16B | 0.02 | 1.6
    235228_at | CCDC85A | 0.02 | 1.6
    1568623_a_at | SLC35E4 | 0.00 | 1.59
    204844_at | ENPEP | 0.00 | 1.59
    1552256_a_at | SCARB1 | 0.02 | 1.59
    1557283_a_at | ZNF519 | 0.02 | 1.59
    1557293_at | LINC00969 | 0.03 | 1.59
    231644_at | | 0.01 | 1.58
    228115_at | GAREM1 | 0.01 | 1.58
    223687_s_at | LY6K | 0.02 | 1.58
    231779_at | IRAK2 | 0.03 | 1.58
    243332_at | LOC105379610 | 0.04 | 1.58
    232118_at | | 0.01 | 1.57
    203423_at | RBP1 | 0.02 | 1.57
    208498_s_at | AMY1A///AMY1B///AMY1C///AMY2A///AMY2B///AMYP1 | 0.03 | 1.57
    237154_at | LOC101930114 | 0.00 | 1.56
    1559691_at | | 0.01 | 1.56
    243481_at | RHOJ | 0.03 | 1.56
    238834_at | MYLK3 | 0.01 | 1.55
    213438_at | NFASC | 0.02 | 1.55
    242290_at | TACC1 | 0.04 | 1.55
    ANKRD20A1///ANKRD20A12P///ANKRD20A2///
  • A Computational Approach for Defining a Signature of Β-Cell Golgi Stress in Diabetes Mellitus
    A Computational Approach for Defining a Signature of β-Cell Golgi Stress in Diabetes Mellitus
    Robert N. Bone (1,6,7), Olufunmilola Oyebamiji (2), Sayali Talware (2), Sharmila Selvaraj (2), Preethi Krishnan (3,6), Farooq Syed (1,6,7), Huanmei Wu (2), Carmella Evans-Molina (1,3,4,5,6,7,8*)
    Departments of (1) Pediatrics, (3) Medicine, (4) Anatomy, Cell Biology & Physiology, (5) Biochemistry & Molecular Biology, the (6) Center for Diabetes & Metabolic Diseases, and the (7) Herman B. Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202; (2) Department of BioHealth Informatics, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202; (8) Roudebush VA Medical Center, Indianapolis, IN 46202.
    *Corresponding Author(s): Carmella Evans-Molina, MD, PhD ([email protected]) Indiana University School of Medicine, 635 Barnhill Drive, MS 2031A, Indianapolis, IN 46202, Telephone: (317) 274-4145, Fax (317) 274-4107
    Running Title: Golgi Stress Response in Diabetes
    Word Count: 4358
    Number of Figures: 6
    Keywords: Golgi apparatus stress, Islets, β cell, Type 1 diabetes, Type 2 diabetes
    Diabetes Publish Ahead of Print, published online August 20, 2020
    ABSTRACT
    The Golgi apparatus (GA) is an important site of insulin processing and granule maturation, but whether GA organelle dysfunction and GA stress are present in the diabetic β-cell has not been tested. We utilized an informatics-based approach to develop a transcriptional signature of β-cell GA stress using existing RNA sequencing and microarray datasets generated using human islets from donors with diabetes and islets where type 1 (T1D) and type 2 diabetes (T2D) had been modeled ex vivo. To narrow our results to GA-specific genes, we applied a filter set of 1,030 genes accepted as GA associated.
  • Download Download
    Supplementary Figure S1. Results of flow cytometry analysis, performed to estimate CD34 positivity, after immunomagnetic separation in two different experiments. As monoclonal antibody for labeling the sample, the fluorescein isothiocyanate (FITC)-conjugated mouse anti-human CD34 MoAb (Miltenyi) was used. Briefly, cell samples were incubated in the presence of the indicated MoAbs, at the proper dilution, in PBS containing 5% FCS and 1% Fc receptor (FcR) blocking reagent (Miltenyi) for 30 min at 4 °C. Cells were then washed twice, resuspended with PBS and analyzed by a Coulter Epics XL (Coulter Electronics Inc., Hialeah, FL, USA) flow cytometer.

    Supplementary Table S1. Complete list of the datasets used in this study and their sources.
    GEO series | Platform | Total samples in series | GEO selected samples | GEO accession of used samples | Reference
    GSE6146 | HG-U133A | 14 | 8 | GSM142565, GSM142566, GSM142567, GSM142568, GSM142569, GSM142571, GSM142572, GSM142574 | -
    GSE2666 | HG-U133A | 36 | 4 | GSM51391, GSM51392, GSM51393, GSM51394 | 1
    GSE12803 | HG-U133A | 20 | 3 | GSM321583, GSM321584, GSM321585 | 2
    GSE64282 | HG-U133A | 8 | 8 | Promyelocytes_1, Promyelocytes_2, Promyelocytes_3, Promyelocytes_4, Promyelocytes_5, Promyelocytes_6, Promyelocytes_7, Promyelocytes_8 | 3

    Supplementary Table S2. Chromosomal regions up-regulated in CD34+ samples as identified by the LAP procedure with the two-class statistics coded in the PREDA R package and an FDR threshold of 0.5. Functional enrichment analysis has been performed using DAVID (http://david.abcc.ncifcrf.gov/)
  • The Gas6 Gene rs8191974 and Ap3s2 Gene rs2028299 Are Associated with Type 2 Diabetes in the Northern Chinese Han Population
    Vol. 64, No 2/2017, 227–231
    https://doi.org/10.18388/abp.2016_1299
    Regular paper
    The Gas6 gene rs8191974 and Ap3s2 gene rs2028299 are associated with type 2 diabetes in the northern Chinese Han population
    Elena V. Kazakova#, Tianwei Zghuang#, Tingting Li, Qingxiao Fang, Jun Han and Hong Qiao*
    The Fifth Endocrine Department, the Second Affiliated Hospital, Harbin Medical University, Harbin, Heilongjiang, China
    Previous studies in other countries have shown that single nucleotide polymorphisms (SNPs) in the growth arrest-specific gene 6 (Gas6; rs8191974) and adapter-related protein complex 3 subunit sigma-2 (Ap3s2; rs2028299) were associated with an increased risk for type 2 diabetes mellitus (T2DM). However, the association of these loci with T2DM has not been examined in Chinese populations. We performed a replication study to investigate the association of these susceptibility loci with T2DM in the Chinese population. We genotyped 1968 Chinese participants (996 with T2DM and 972 controls) for rs8191974 in Gas6 and rs2028299 near Ap3s2, and examined their association with T2DM using a logistic regression analysis.
    INTRODUCTION
    Diabetes mellitus affects more than 300 million individuals worldwide, with increasing prevalence particularly in the developing countries (Whiting et al., 2011). In fact, the prevalence of type 2 diabetes mellitus (T2DM) in China is among the highest in the world. The combination of insulin resistance in peripheral tissues and impaired insulin secretion from pancreatic β-cells is believed to contribute to the development and progression of T2DM. Both, the genetic and environmental factors confer susceptibility to T2DM. In recent years, studies of gene polymorphisms have helped identify a number
  • Supplementary Table S4. FGA Co-Expressed Gene List in LUAD
    Supplementary Table S4. FGA co-expressed gene list in LUAD tumors
    Symbol | R | Locus | Description
    FGG | 0.919 | 4q28 | fibrinogen gamma chain
    FGL1 | 0.635 | 8p22 | fibrinogen-like 1
    SLC7A2 | 0.536 | 8p22 | solute carrier family 7 (cationic amino acid transporter, y+ system), member 2
    DUSP4 | 0.521 | 8p12-p11 | dual specificity phosphatase 4
    HAL | 0.51 | 12q22-q24.1 | histidine ammonia-lyase
    PDE4D | 0.499 | 5q12 | phosphodiesterase 4D, cAMP-specific
    FURIN | 0.497 | 15q26.1 | furin (paired basic amino acid cleaving enzyme)
    CPS1 | 0.49 | 2q35 | carbamoyl-phosphate synthase 1, mitochondrial
    TESC | 0.478 | 12q24.22 | tescalcin
    INHA | 0.465 | 2q35 | inhibin, alpha
    S100P | 0.461 | 4p16 | S100 calcium binding protein P
    VPS37A | 0.447 | 8p22 | vacuolar protein sorting 37 homolog A (S. cerevisiae)
    SLC16A14 | 0.447 | 2q36.3 | solute carrier family 16, member 14
    PPARGC1A | 0.443 | 4p15.1 | peroxisome proliferator-activated receptor gamma, coactivator 1 alpha
    SIK1 | 0.435 | 21q22.3 | salt-inducible kinase 1
    IRS2 | 0.434 | 13q34 | insulin receptor substrate 2
    RND1 | 0.433 | 12q12 | Rho family GTPase 1
    HGD | 0.433 | 3q13.33 | homogentisate 1,2-dioxygenase
    PTP4A1 | 0.432 | 6q12 | protein tyrosine phosphatase type IVA, member 1
    C8orf4 | 0.428 | 8p11.2 | chromosome 8 open reading frame 4
    DDC | 0.427 | 7p12.2 | dopa decarboxylase (aromatic L-amino acid decarboxylase)
    TACC2 | 0.427 | 10q26 | transforming, acidic coiled-coil containing protein 2
    MUC13 | 0.422 | 3q21.2 | mucin 13, cell surface associated
    C5 | 0.412 | 9q33-q34 | complement component 5
    NR4A2 | 0.412 | 2q22-q23 | nuclear receptor subfamily 4, group A, member 2
    EYS | 0.411 | 6q12 | eyes shut homolog (Drosophila)
    GPX2 | 0.406 | 14q24.1 | glutathione peroxidase
  • Identification of 42 Genes Linked to Stage II Colorectal Cancer Metastatic Relapse
    Int. J. Mol. Sci. 2016, 17, 598; doi:10.3390/ijms17040598
    Supplementary Materials: Identification of 42 Genes Linked to Stage II Colorectal Cancer Metastatic Relapse
    Rabeah A. Al-Temaimi, Tuan Zea Tan, Makia J. Marafie, Jean Paul Thiery, Philip Quirke and Fahd Al-Mulla

    Figure S1. Mean expression levels of fourteen genes of significant association with CRC DFS and OS that are differentially expressed in normal colon compared to CRC tissues. Each dot represents a sample.

    Table S1. Copy number aberrations associated with poor disease-free survival and metastasis in early stage II CRC as predicted by STAC and SPPS combined methodologies with resident gene symbols. CN stands for copy number, whereas CNV is copy number variation.
    Region | Region Length | Cytoband Location | Event | % of CNV Overlap | Count of Genes | Gene Symbols
    chr1:113,025,076–113,199,133 | 174,057 | p13.2 | CN Loss | 0.0 | 2 | AKR7A2P1, SLC16A1
    chr1:141,465,960–141,822,265 | 356,305 | q12–q21.1 | CN Gain | 95.9 | 1 | SRGAP2B
    chr1:144,911,564–146,242,907 | 1,331,343 | q21.1 | CN Gain | 99.6 | 16 | MIR5087, LOC100130000, FLJ39739, LOC100286793, PPIAL4G, PPIAL4A, NBPF14, NBPF15, NBPF16, PPIAL4E, NBPF16, PPIAL4D, PPIAL4F, LOC645166, LOC388692, FCGR1C
    chr1:177,209,428–177,226,812 | 17,384 | q25.3 | CN Gain | 0.0 | 0 |
    chr1:197,652,888–197,676,831 | 23,943 | q32.1 | CN Gain | 0.0 | 1 | KIF21B
    chr1:201,015,278–201,033,308 | 18,030 | q32.1 | CN Gain | 0.0 | 1 | PLEKHA6
    chr1:201,289,154–201,298,247 | 9093 | q32.1 | CN Gain | 0.0 | 0 |
    chr1:216,820,186–217,043,421 | 223,235 | q41 | CN
  • Classification and Evolution of P-loop GTPases and Related ATPases
    doi:10.1006/jmbi.2001.5378, available online at http://www.idealibrary.com
    J. Mol. Biol. (2002) 317, 41-72
    Classification and Evolution of P-loop GTPases and Related ATPases
    Detlef D. Leipe, Yuri I. Wolf, Eugene V. Koonin* and L. Aravind
    National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
    Sequences and available structures were compared for all the widely distributed representatives of the P-loop GTPases and GTPase-related proteins with the aim of constructing an evolutionary classification for this superclass of proteins and reconstructing the principal events in their evolution. The GTPase superclass can be divided into two large classes, each of which has a unique set of sequence and structural signatures (synapomorphies). The first class, designated TRAFAC (after translation factors) includes enzymes involved in translation (initiation, elongation, and release factors), signal transduction (in particular, the extended Ras-like family), cell motility, and intracellular transport. The second class, designated SIMIBI (after signal recognition particle, MinD, and BioD), consists of signal recognition particle (SRP) GTPases, the assemblage of MinD-like ATPases, which are involved in protein localization, chromosome partitioning, and membrane transport, and a group of metabolic enzymes with kinase or related phosphate transferase activity. These two classes together contain over 20 distinct families that are further subdivided into 57 subfamilies (ancient lineages) on the basis of conserved sequence motifs, shared structural features, and domain architectures. Ten subfamilies show a universal phyletic distribution compatible with presence in the last universal common ancestor of the extant life forms (LUCA). These include four translation factors, two OBG-like GTPases, the YawG/YlqF-like GTPases (these two subfamilies also consist of predicted translation factors), the two signal-recognition-associated GTPases, and the MRP subfamily of MinD-like ATPases.
  • Integrative Cross Tissue Analysis of Gene Expression Identifies Novel
    bioRxiv preprint doi: https://doi.org/10.1101/108134; this version posted February 27, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
    Integrative cross tissue analysis of gene expression identifies novel type 2 diabetes genes
    Jason M. Torres (1), Alvaro N. Barbeira (2), Rodrigo Bonazzola (2), Andrew P. Morris (3), Kaanan P. Shah (2), Heather E. Wheeler (4), Graeme I. Bell (5,6), Nancy J. Cox (7,*), Hae Kyung Im (2,*)
    (1) Committee on Molecular Metabolism and Nutrition, Biological Sciences Division, The University of Chicago, Chicago, IL, USA
    (2) Section of Genetic Medicine, Department of Medicine, The University of Chicago, Chicago, IL, USA
    (3) Institute of Translational Medicine, University of Liverpool, Liverpool, United Kingdom
    (4) Departments of Biology and Computer Science, Loyola University Chicago, Chicago, IL, USA
    (5) Department of Medicine, The University of Chicago, Chicago, IL, USA
    (6) Department of Human Genetics, The University of Chicago, Chicago, IL, USA
    (7) Division of Genetic Medicine, Vanderbilt University, Nashville, TN, USA
    * Correspondence to: [email protected] and [email protected]
    Abstract
    To understand the mechanistic underpinnings of type 2 diabetes (T2D) loci mapped through GWAS, we performed a tissue-specific gene association study in a cohort of over 100K individuals (n_cases ≈ 26K, n_controls ≈ 84K) across 44 human tissues using MetaXcan, a summary statistics extension of PrediXcan.
  • NUBP2 (NM 012225) Human Tagged ORF Clone Lentiviral Particle Product Data
    OriGene Technologies, Inc.
    9620 Medical Center Drive, Ste 200, Rockville, MD 20850, US
    Phone: +1-888-267-4436
    [email protected]
    EU: [email protected]
    CN: [email protected]

    Product datasheet for RC203571L4V
    NUBP2 (NM_012225) Human Tagged ORF Clone Lentiviral Particle

    Product data:
    Product Type: Lentiviral Particles
    Product Name: NUBP2 (NM_012225) Human Tagged ORF Clone Lentiviral Particle
    Symbol: NUBP2
    Synonyms: CFD1; CIAO6; NBP 2; NUBP1
    Vector: pLenti-C-mGFP-P2A-Puro (PS100093)
    ACCN: NM_012225
    ORF Size: 813 bp
    ORF Nucleotide Sequence: The ORF insert of this clone is exactly the same as (RC203571).
    OTI Disclaimer: The molecular sequence of this clone aligns with the gene accession number as a point of reference only. However, individual transcript sequences of the same gene can differ through naturally occurring variations (e.g. polymorphisms), each with its own valid existence. This clone is substantially in agreement with the reference, but a complete review of all prevailing variants is recommended prior to use.
    OTI Annotation: This clone was engineered to express the complete ORF with an expression tag. Expression varies depending on the nature of the gene.
    RefSeq: NM_012225.1
    RefSeq Size: 1408 bp
    RefSeq ORF: 816 bp
    Locus ID: 10101
    UniProt ID: Q9Y5Y2, B7Z6P0
    MW: 28.8 kDa
    Gene Summary: This gene encodes an adenosine triphosphate (ATP) and metal-binding protein that is required for the assembly of cytosolic iron-sulfur proteins. The encoded protein functions in a heterotetramer with nucleotide-binding protein 1 (NUBP1). Alternative splicing results in multiple transcript variants. [provided by RefSeq, Oct 2013]
    This product is to be used for laboratory only.
  • Download Special Issue
    BioMed Research International
    Novel Bioinformatics Approaches for Analysis of High-Throughput Biological Data
    Guest Editors: Julia Tzu-Ya Weng, Li-Ching Wu, Wen-Chi Chang, Tzu-Hao Chang, Tatsuya Akutsu, and Tzong-Yi Lee

    Copyright © 2014 Hindawi Publishing Corporation. All rights reserved. This is a special issue published in “BioMed Research International.” All articles are open access articles distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

    Contents
    Novel Bioinformatics Approaches for Analysis of High-Throughput Biological Data, Julia Tzu-Ya Weng, Li-Ching Wu, Wen-Chi Chang, Tzu-Hao Chang, Tatsuya Akutsu, and Tzong-Yi Lee. Volume 2014, Article ID 814092, 3 pages
    Evolution of Network Biomarkers from Early to Late Stage Bladder Cancer Samples, Yung-Hao Wong, Cheng-Wei Li, and Bor-Sen Chen. Volume 2014, Article ID 159078, 23 pages
    MicroRNA Expression Profiling Altered by Variant Dosage of Radiation Exposure, Kuei-Fang Lee, Yi-Cheng Chen, Paul Wei-Che Hsu, Ingrid Y. Liu, and Lawrence Shih-Hsin Wu. Volume 2014, Article ID 456323, 10 pages
    EXIA2: Web Server of Accurate and Rapid Protein Catalytic Residue Prediction, Chih-Hao Lu, Chin-Sheng
  • Modeling Genomic Diversity and Tumor Dependency in Malignant Melanoma
    Research Article
    Modeling Genomic Diversity and Tumor Dependency in Malignant Melanoma
    William M. Lin (1,3,5), Alissa C. Baker (1,3), Rameen Beroukhim (1,3,5), Wendy Winckler (1,3,5), Whei Feng (1,3,5), Jennifer M. Marmion (7), Elisabeth Laine (8), Heidi Greulich (1,3,5), Hsiuyi Tseng (1,3), Casey Gates (5), F. Stephen Hodi (1), Glenn Dranoff (1), William R. Sellers (1,6), Roman K. Thomas (9,10), Matthew Meyerson (1,3,4,5), Todd R. Golub (2,3,5), Reinhard Dummer (8), Meenhard Herlyn (7), Gad Getz (3,5) and Levi A. Garraway (1,3,5)
    Departments of (1) Medical Oncology and (2) Pediatric Oncology and (3) Center for Cancer Genome Discovery, Dana-Farber Cancer Institute, Harvard Medical School; (4) Department of Pathology, Harvard Medical School, Boston, Massachusetts; (5) The Broad Institute of M.I.T. and Harvard; (6) Novartis Institutes for Biomedical Research, Cambridge, Massachusetts; (7) Cancer Biology Division, Wistar Institute, Philadelphia, Pennsylvania; (8) Department of Dermatology, University of Zurich Hospital, Zürich, Switzerland; (9) Max Planck Institute for Neurological Research with Klaus Joachim Zulch Laboratories of the Max Planck Society and the Medical Faculty of the University of Cologne; and (10) Center for Integrated Oncology and Department I for Internal Medicine, University of Cologne, Cologne, Germany
    Abstract
    The classification of human tumors based on molecular criteria offers tremendous clinical potential; however, discerning critical and "druggable" effectors on a large scale will also require robust experimental models reflective of tumor genomic diversity.
    ... tumorigenesis have been derived from functional studies involving cultured human cancer cells (e.g., established cell lines, short-term cultures, etc.). Despite their limitations, cancer cell line collections whose genetic alterations reflect their primary tumor counterparts should provide malleable proxies that facilitate mechanistic ...