Blood Donor Genotyping
Total Page:16
File Type:pdf, Size:1020Kb
Blood Donor Genotyping Nicholas S. Gleadall Department of Haematology University of Cambridge This dissertation is submitted for the degree of Doctor of Philosophy Gonville & Caius College September 2020 I would like to dedicate this thesis to the 813,212 donors, patients and healthy individuals whose data this work has used. Their decision to share personal medical and genetic information so that it may be used for the benefit of others is laudable. I hope that thework presented here meets the expectations they had when making such a valuable donation. Declaration I hereby declare that except where specific reference is made to the work of others, the contents of this dissertation are original and have not been submitted in whole or in part for consideration for any other degree or qualification in this, or any other university. This dissertation is my own work and contains nothing which is the outcome of work done in collaboration with others, except as specified in the text and Acknowledgements. This dissertation contains fewer than 65,000 words including appendices, bibliography, footnotes, tables and equations and has fewer than 150 figures. Nicholas S. Gleadall September 2020 Blood Donor Genotyping Nicholas S. Gleadall Abstract Transfusion of blood is one of the oldest and most widely used clinical interventions. In 2020 the World Health Organisation reported that globally 118.5 million blood donations had been collected worldwide. This blood will be used to provide life-saving transfusion support for millions of individuals with a wide range of medical conditions. To ensure the safety of each blood transfusion it is common policy to identify and ensure compatibility between the ABO and RhD antigens of both donor and recipient. Although this policy prevents the majority of adverse haemolytic transfusion reaction’s (HTR), approximately 3% of recipients become sensitised following an immune reaction to a non-self blood group antigen after a single transfusion episode. This proportion can rise dramatically in patients requiring frequent transfusion support, with immunisation rates as high as 60% being reported for some haemoglobinopathy patients. Sensitisation to non-self red blood cell (RBC) antigens confers a lifetime risk of HTRs, which from 2013 through 2017 were responsible for 17% (32 of 185) and 6% (7 of 110) of transfusion-related deaths reported to the US Food and Drug Administration and Serious Hazards of Transfusion UK, respectively. Furthermore, sensitisation can render transfusion- dependent patients non-transfusable and cause haemolytic disease in pregnancy which is potentially life-threatening to the fetus. A more precise blood matching policy will reduce sensitisation rates, however, adoption of this is resisted because of perceived logistical challenges, donor typing costs, and the lack of evidence from large scale clinical trials that reducing sensitisation rates results in health gains. Antibody-based haemagglutination tests are the current gold standard for RBC antigen typing; however, reliable reagents and high-throughput techniques are not available for all clinically relevant antigens. DNA-based tests have been used to overcome these limitations, and a range of in-house and commercial assays have been developed for donor genotyping. However, due to the limited number of antigens typed for and the low throughput capacity of these assays, they have not been widely applied to blood donor typing. Advances in the technologies used for genome-wide genotyping and sequencing have substantially reduced the cost of generating genetic variation data at population scale. Mul- tiple studies have demonstrated that it is possible to extract antigen typing information from the data produced by these technologies, indicating that they could be used to deliver vii genomics-based precision transfusion medicine to the patient bedside. However, the same studies also highlight a series of challenges that would have to be overcome before these technologies could be safely integrated into the clinical laboratory. In this thesis, I will present the work that has been done to overcome some of the issues surrounding the interpretation of blood cell antigen typing from genomic data and the development of a universal donor genotyping platform which can be used by blood supply organisations worldwide to implement of a policy of precision transfusion medicine. Acknowledgements The work presented in this thesis has been made possible through the continued support of many individuals. Firstly, I would like to thank Karen, Stephen and Sam - my family - for supporting me throughout my studies. I would like to thank all friends, loungers, crewmates, and ghost house residents who have kept me entertained throughout my PhD, I have been incredibly fortunate to meet you all. Secondly, from the NIHR BioResource, INTERVAL, COMPARE and DIS-III study teams I would like to thank Emanuele Di Angelantonio, John Bradley, Sarah Fahle, John M Jongerius, Nathalie Kingston, Jyoti Khadake, Amy McMahon, Susan Mehenny, Carmel Moore, Jennifer Sambrook, Olga Shamardina, Kathy Stirrups, Tiffany C. Timmer, Katja van den Hurk, Matt Walker, and Neil Walker for the genotyping and patient data they provided that was so crucial to this study. Thirdly, from the ISBT, NHS Blood and Transplant, and Sanquin I would like to thank Christoph Gassner, Alan Grey, Shane Grimsley, Peter Ligthart, and Nicole Thornton for taking the time to give me an education in Immunoheamtology and Judith Chapman, Gail Miflin, Dave Roberts, and Jill Storry who have publicly supported this work and helpedto bring it to the attention of the wider transfusion medicine community. A special thanks must go to William Astle, Kim Brugger, Karyn Megy, and Christopher Penkett who have done me hundreds of favours and put up with continual questions for four years. I would also like to give special thanks to Debra Fletcher and Lindsay Walker who have without fail managed to keep me organised, on-time, and generally alive throughout. From the Blood transfusion Genomics Consortium and my extensive panel of co- supervisors, I would like to thank Adam Butterworth, Jeremy Gollub, William Lane, John Ord, Ellen van der Schoot, Barbera Veldhuisen, Nicholas Watkins, and Connie Westhoff who have committed countless hours of their time to this project and to my professional development. Last but certainly not least, I would like to thank Willem H Ouwehand who has been consistently supportive, generous with his time and ideas, and an inspiration throughout. Being his student has been a privilege and a joy, I could not have asked for a better supervisor. x Because of him, I have had the opportunity to complete 1 PhD, visit 9 countries, give 32 presentations, win 6 prizes, receive 1 Lancet rejection, receive 1 Blood Advances acceptance, meet 1 billionaire, climb in the Rocky Mountains, sail on Lake Ontario, save William Lane from drowning in the Rhine, and make no less than 25 Circos plots. I will never be able to thank him enough for his decision to give me an opportunity to study for my PhD at the University of Cambridge. Table of contents List of figures xv List of tables xvii Nomenclature xix 1 Introduction1 1.1 A brief history of blood transfusion . .1 1.2 Blood group antigens . .3 1.3 Pre-transfusion testing and donor screening . .4 1.4 The molecular basis of blood group antigens . .7 1.4.1 The ABO blood group system . .7 1.4.2 The Kell blood group system . 11 1.5 Donor Genotyping . 11 1.6 Population Scale Genotyping . 13 1.7 Aims . 17 2 Materials and Methods 19 2.1 Overview . 19 2.2 An overview of Axiom array genotyping . 22 2.3 Next Generation Sequencing . 26 2.4 Bioinformatics . 30 2.4.1 bloodTyper . 30 2.4.2 HLA impute . 32 2.4.3 BWA, SAMtools, and BCFtools . 34 2.4.4 Genome Analysis Toolkit, Manta and Canvas . 34 2.4.5 Axiom Analysis Suite . 34 2.5 Summary of external data sources . 35 2.6 Blood Donor DNA samples . 35 xii Table of contents 2.6.1 INTERVAL panel (n=1,257) . 35 2.6.2 NIHR BioResource panel (n=507) . 36 2.6.3 COMPARE and DIS3 panel (n=7,477) . 36 2.7 Clinical RBC and PLT antigen typing data . 37 2.8 RBC and HPA Antigen typing concordance analysis . 39 2.9 HLA antigen typing concordance analysis . 39 2.10 Data availability . 40 2.11 My contribution to each chapter . 40 3 Mapping blood group antigens to the human genome 43 3.1 Abstract . 43 3.2 Introduction . 44 3.3 Chapter Workflow . 46 3.4 The Blood transfusion Genomics Consortium . 48 3.5 Data driven review of ISBT reference transcripts . 49 3.6 Exploring whole-genome sequencing data . 53 3.6.1 Further validation of the bloodTyper WGS algorithm . 53 3.6.2 Antigen typing 13,037 whole genomes . 54 3.6.3 Variation in the Kell blood group system . 60 3.7 Discussion . 65 4 A universal donor genotyping platform 69 4.1 Abstract . 69 4.2 Introduction . 70 4.3 Chapter workflow . 72 4.4 Stage 1 - Proof of principle . 74 4.5 Stage 2 - Designing an array for donor genotyping . 77 4.6 Stage 3: Large scale blinded trial of the UKBBv2 array . 83 4.6.1 Selection of trial set samples and initial genotyping quality control . 86 4.6.2 Allele frequency validation of the UKBBv2 array . 90 4.6.3 Overall antigen typing performance . 92 4.6.4 RBC antigen typing performance . 93 4.6.5 HPA antigen typing . 93 4.6.6 HLA antigen typing . 93 4.7 Impact of an extensively genotyped donor panel . 94 4.7.1 Identification of individuals with rare phenotypes . 96 4.7.2 Immediate clinical benefit of extended typing data . 96 Table of contents xiii 4.7.3 Donor and patient health information . 98 4.8 Discussion .