ARTICLE
https://doi.org/10.1038/s41467-020-19996-z
OPEN

Genetic architecture of host proteins involved in SARS-CoV-2 infection

Maik Pietzner1, Eleanor Wheeler1, Julia Carrasco-Zanini1, Johannes Raffler2, Nicola D. Kerrison1, Erin Oerton1, Victoria P. W. Auyeung1, Jian’an Luan1, Chris Finan3,4, Juan P. Casas5,6, Rachel Ostroff7, Steve A. Williams7, Gabi Kastenmüller2, Markus Ralser8,9, Eric R. Gamazon1,10, Nicholas J. Wareham1,11, Aroon D. Hingorani3,4,12 & Claudia Langenberg1,8,11,13

Understanding the genetic architecture of host proteins interacting with SARS-CoV-2 or mediating the maladaptive host response to COVID-19 can help to identify new or repurpose existing drugs targeting those proteins. We present a genetic discovery study of 179 such host proteins among 10,708 individuals using an aptamer-based technique. We identify 220 host DNA sequence variants acting in cis (MAF 0.01-49.9%) and explaining 0.3-70.9% of the variance of 97 of these proteins, including 45 with no previously known protein quantitative trait loci (pQTL) and 38 encoding current drug targets. Systematic characterization of pQTLs across the phenome identified protein-drug-disease links and evidence that putative viral interaction partners such as MARK3 affect immune response. Our results accelerate the evaluation and prioritization of new drug development programmes and repurposing of trials to prevent, treat or reduce adverse outcomes. Rapid sharing and detailed interrogation of results is facilitated through an interactive webserver (https://omicscience.org/apps/covidpgwas/).

1 MRC Epidemiology Unit, University of Cambridge, Cambridge, UK.
2 Institute of Computational Biology, Helmholtz Zentrum München – German Research Center for Environmental Health, Neuherberg, Germany.
3 Institute of Cardiovascular Science, Faculty of Population Health Sciences, University College London, London WC1E 6BT, UK.
4 UCL BHF Research Accelerator centre, London, UK.
5 Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA.
6 Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC), VA Boston Healthcare System, Boston, Massachusetts, USA.
7 SomaLogic, Inc., Boulder, CO, USA.
8 The Molecular Biology of Metabolism Laboratory, The Francis Crick Institute, London, UK.
9 Department of Biochemistry, Charité University Medicine, Berlin, Germany.
10 Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA.
11 Health Data Research UK, Wellcome Genome Campus and University of Cambridge, Cambridge, UK.
12 Health Data Research UK, Institute of Health Informatics, University College London, London, UK.
13 Computational Medicine, Berlin Institute of Health (BIH), Charité University Medicine, Berlin, Germany.
✉ email: [email protected]; [email protected]

NATURE COMMUNICATIONS | (2020) 11:6397 | https://doi.org/10.1038/s41467-020-19996-z | www.nature.com/naturecommunications

The pandemic of the novel coronavirus SARS-CoV-2 infection, the cause of COVID-19, is causing severe global disruption and excess mortality1,2. While ultimately strategies are required that create vaccine-derived population immunity, in the medium term there is a need to develop new therapies or to repurpose existing drugs that are effective in treating patients with severe complications of COVID-19, and also to identify agents that might protect vulnerable individuals from becoming infected. The experimental characterization of 332 SARS-CoV-2-human protein–protein interactions and their mapping to 69 existing FDA-approved drugs, drugs in clinical trials, and/or preclinical compounds3 points to new therapeutic strategies, some of which are currently being tested. The measurement of circulating host proteins that associate with COVID-19 severity or mortality also provides insight into potentially targetable maladaptive host responses, with current interest being focused on the innate immune response4, coagulation5,6, and novel candidate proteins7.

Naturally occurring sequence variation in or near a human gene that is encoding a drug target and affecting its expression or activity can be used to provide direct support for drug mechanisms and safety in humans. This approach is now used by major pharmaceutical companies for drug target identification and validation for a wide range of non-communicable diseases, and to guide drug repurposing8,9. Genetic evidence linking molecular targets to diseases relies on our understanding of the genetic architecture of drug targets. Proteins are the most common biological class of drug targets, and advances in high-throughput proteomic technologies have enabled systematic analysis of the “human druggable proteome” and genetic target validation to rapidly accelerate the prioritization (or de-prioritization) of therapeutic targets for new drug development or repurposing trials.

Identification and in-depth genetic characterization of proteins utilized by SARS-CoV-2 for entry and replication as well as those proteins involved in the maladaptive host response will help to understand the systemic consequences of COVID-19. For example, if confirmed, the reported protective effect of blood group O on COVID-19-induced respiratory failure10 might well be mediated by the effect of genetically reduced activity of a ubiquitously expressed glycosyltransferase on a diverse range of proteins.

In this study, we integrate large-scale genomic and aptamer-based plasma proteomic data from a population-based study of 10,708 individuals prior to any SARS-CoV-2 infection or COVID-19 to characterize the genetic architecture of 179 host proteins relevant to COVID-19. We identify genetic variants that regulate host proteins interacting with SARS-CoV-2, or which may contribute to the maladaptive host response. We deeply characterize protein quantitative trait loci (pQTLs) in close proximity to protein-encoding genes (±500 kb window around the gene body), cis-pQTLs, and use genetic score analysis and phenome-wide scans to interrogate potential consequences for targeting those proteins by drugs. Our results enable the use of genetic variants as instruments for drug target validation in emerging genome-wide association studies (GWAS) of SARS-CoV-2 infection and COVID-19 (refs. 10,11).

Results
Coverage of COVID-19-relevant proteins. We identified COVID-19-relevant candidate proteins based on different layers of evidence to be involved in the pathology of COVID-19: (1) two human proteins related to viral entry12, (2) 332 human proteins shown to interact with viral proteins3, (3) 26 proteomic markers of disease severity7, and (4) 54 protein biomarkers of adverse prognosis, complications, and disease deterioration4–6,13 (Fig. 1 and Supplementary Data 1). Of the 409 proteins prioritized, 179 were detectable by the currently most-comprehensive proteomic assay using an aptamer-based technology (SomaScan©), including 28 recognized by more than one aptamer (i.e., 179 proteins recognized by 190 aptamers). We further included complementary data from proximity extension assays (Olink©) for 32 out of the 179 candidate proteins in a subset of 485 Fenland study individuals (Supplementary Data 1). Of these 179 proteins, targets,

[Figure 1: flowchart with boxes "2 Proteins for viral entry", "332 Proteins identified as interaction partners with SARS-CoV-2", "26 Proteins associated with disease severity", "54 Clinical biomarkers of outcome and prognosis", "409 Candidate proteins", "4,775 Proteins measured using SOMAscan v4", and "179 Candidate proteins measured among 10,708 participants of the Fenland study with available genotype data"; bar chart of # Proteins covered and not covered by SOMAscan v4 per category (virus entry, viral baits, disease severity, adverse prognosis, coagulation cascade, fibrinolysis, cytokine storm).]
Fig. 1 Flowchart of the identification of candidate proteins and coverage by the SomaScan v4 platform within the Fenland cohort. More details for each protein targeted are given in Supplementary Data 1.

[Figure 2: Manhattan plot; x-axis: chromosomal position (chromosomes 1 to 22 and X); y-axis: −log10(p-value); gene labels grouped by the legend categories virus entry, SARS-CoV-2-human PPI, disease severity, adverse prognosis, coagulation cascade, fibrinolysis, and cytokine storm.]
Fig. 2 Manhattan plot of cis-association statistics (encoding gene ±500 kb) for 179 proteins. The most significant regional sentinel protein quantitative trait loci (pQTL) acting in cis are annotated by larger dots for 104 unique protein targets (dashed line; p < 5×10−8). Starred genes indicate those targeted by multiple aptamers (n = 9 genes).
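
The cis definition given in the text and in the Fig. 2 caption (encoding gene ±500 kb; p < 5×10−8) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Gene` class, the function names, and the gene coordinates in the usage note are hypothetical and chosen only for illustration, and treating the window boundary as inclusive is an assumption.

```python
from dataclasses import dataclass

# Values taken from the text: ±500 kb around the gene body, p < 5×10−8.
CIS_WINDOW_BP = 500_000
GENOME_WIDE_P = 5e-8


@dataclass(frozen=True)
class Gene:
    """Hypothetical container for the gene body of a protein-encoding gene."""
    chrom: str
    start: int  # first base pair of the gene body
    end: int    # last base pair of the gene body


def is_cis(gene: Gene, variant_chrom: str, variant_pos: int,
           window: int = CIS_WINDOW_BP) -> bool:
    """True if the variant lies in the gene body extended by `window` bp on
    each side (boundary treated as inclusive, which is an assumption)."""
    if variant_chrom != gene.chrom:
        return False
    return gene.start - window <= variant_pos <= gene.end + window


def is_cis_pqtl(gene: Gene, variant_chrom: str, variant_pos: int,
                p_value: float) -> bool:
    """True if the variant is in cis and passes the p < 5e-8 threshold."""
    return is_cis(gene, variant_chrom, variant_pos) and p_value < GENOME_WIDE_P
```

For a made-up gene `Gene("1", 1_000_000, 1_050_000)`, a variant at position 1,020,000 on chromosome 1 with p = 1e-10 would count as a cis-pQTL under these assumptions, whereas the same variant with p = 1e-5, or any variant on a different chromosome, would not.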