A Thesis:

entitled

Improving a Method for High-Resolution HLA-Typing In Transplantation

by

Afnan Sami H Malebari

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the

Master of Science Degree in Biomedical Sciences:

Medical Microbiology and Immunology

______Dr. Stanislaw Stepkowski, Committee Chair

______Dr. Kevin Pan, Committee Member

______Dr. Sadik A. Khuder, Committee Member

______Dr. Amanda Bryant-Friedrich, Dean College of Graduate Studies

The University of Toledo

May 2017

Copyright 2017, Afnan Sami H Malebari

This document is copyrighted material. Under copyright law, no parts of this document may be reproduced without the express permission of the author.

An Abstract of

Improving a Method for High-Resolution HLA-Typing In Transplantation

by

Afnan Sami H Malebari

Submitted to the Graduate Faculty as partial fulfillment of the requirements for the Master of Science Degree in Biomedical Sciences: Medical Microbiology and Immunology

The University of Toledo

May 2017

For a successful unrelated organ transplantation with a low risk of graft-versus-host disease and mortality, it is important to have a close human leukocyte antigen (HLA) match between donor and recipient. In diagnostic laboratories, HLA typing is therefore the key tool for testing donor and recipient compatibility. However, because HLA genes are the most polymorphic genes in the human genome, high-resolution HLA typing is challenging with established methods. Sanger sequencing-based typing (SBT) is the gold standard sequencing method and continues to identify new HLA alleles. Nevertheless, SBT has disadvantages owing to its limited sensitivity and its inability to sequence multiple targets in parallel. On the other hand, next generation sequencing (NGS) technologies may improve the sequencing process for HLA typing because they allow massively parallel sequencing and are faster and more cost-effective than SBT. NGS is therefore a promising tool for overcoming the intrinsic problems of HLA typing.


Therefore, in this study, I improve a preliminary NGS assay for high-resolution HLA typing in transplantation. The preliminary assay, which was designed for high-resolution typing of class-I and class-II HLA, has some obstacles in identifying four-digit HLA alleles. I focus on improving class-I HLA typing and on developing model solutions for HLA class-I (HLA-A, B, and C) that can then be applied to class-II. The library preparation of the preliminary assay uses two types of primers: broad primer pairs, which increase the amplification of the target genes relative to their pseudogenes, and target-specific primer pairs, which amplify specific regions of the target genes. The redesigned broad primers for class-I HLA genes amplified the target genes more specifically, and reduced the amplification of pseudogenes, compared with the previously designed broad primers. Additionally, a preliminary computer program plays a vital role in analyzing the sequences and identifying the correct high-resolution HLA alleles. The rate of correct allele calls increased after excluding any sequence read shorter than 90% of the expected read length of its amplicon, and after minimizing interfering sequence reads by redesigning the broad primers.
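The read-length filter described above can be sketched as follows. This is a minimal illustration, not the program used in this work: the function name, the data layout (pairs of amplicon label and read sequence), and the amplicon label in the usage note are assumptions made for the example. Only the 90% threshold comes from the text.

```python
from typing import Dict, List, Tuple

# Threshold taken from the text: reads shorter than 90% of the expected
# amplicon read length are excluded.
MIN_LENGTH_PERCENT = 90


def filter_reads_by_length(
    reads: List[Tuple[str, str]],
    expected_lengths: Dict[str, int],
    min_percent: int = MIN_LENGTH_PERCENT,
) -> List[Tuple[str, str]]:
    """Keep reads whose length is at least min_percent of the expected length.

    reads: (amplicon label, read sequence) pairs (hypothetical layout).
    expected_lengths: expected read length for each amplicon label.
    """
    kept = []
    for amplicon, sequence in reads:
        expected = expected_lengths[amplicon]
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if len(sequence) * 100 >= min_percent * expected:
            kept.append((amplicon, sequence))
    return kept
```

For a hypothetical amplicon with an expected read length of 300 bases, a read of 270 bases is kept and a read of 269 bases is discarded.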


To my husband, Shadi Melebari: your presence in my life, your love, and your standing beside me encouraged me to overcome the difficulties of this journey. I am forever thankful for the enormous sacrifices you have made along the way.

To my Mom and Dad, Thuraya Melebari and Sami Melebari: I could never have done this without your encouragement to be the best in my studies and your confidence in me. You always taught me the love of knowledge, determination, and patience. Thank you for teaching me to believe in myself, in Allah, and in my dreams.

Acknowledgements

I would like to express my appreciation and gratitude to my advisor, Dr. Stepkowski, for his support, expert guidance, and advice throughout my Master's studies. I would also like to thank the faculty and staff in the Department of Medical Microbiology and Immunology at the University of Toledo.

I also thank those who helped me in my research and lit the darkness that sometimes stood in my way, and those who provided me with assistance, ideas, and information, possibly without being aware of it. They all have my thanks, and in particular Caitlin Baum, Dr. Beata Mierzejewska, Dr. Willey James, Dulat Bekbolsynov, and Erin Crawford.

I give special gratitude to Dr. Thomas Blomquist, whose stimulating suggestions and modifications to the computer program helped me to complete my project.


Table of Contents

Abstract

Acknowledgements

Table of Contents

List of Tables

List of Figures

List of Abbreviations

List of Symbols

1 Chapter 1

1.1 Introduction

1.2 Objectives

2 Chapter 2

2.1 Literature Survey

3 Chapter 3

3.1 Materials and Methods

3.1.1 Samples

3.1.2 DNA extractions

3.1.3 DNA concentration and purity determination

3.1.4 Reagent Design

3.1.5 Library construction

3.1.6 Next Generation sequencing

3.1.7 Evaluating primers and PCR reaction products

3.1.8 Data processing

3.1.9 Analyzing the raw sequence data without the software

3.1.10 Statistical Analysis

4 Chapter 4

4.1 Results

4.1.1 Obstacles in the preliminary assay

4.1.2 Redesigning broad primers

4.1.3 Performance testing of competitive amplicon library preparation

4.2 Discussion

5 Chapter 5

5.1 Conclusion

Reference

A Preliminary assay primers sequences

List of Tables

Table 3.1. HLA Class-I-broad primer sets.

Table 3.2. HLA Class-I target specific primer sets.

Table 4.1. Average percentage of preliminary NGS assay of sequence reads agreement with Sanger sequence reads per amplicon among 6 patients.

Table 4.2. Redesigned broad primers for HLA-A exon 4.

Table 4.3. Redesigned broad primers for HLA-B exon 2, 3, and 4.

Table 4.4. Redesigned broad primers for HLA-C exon 2, 3, and 4.

Table 4.5. Average percentage of modified NGS assay of sequence reads agreement with Sanger sequence reads per amplicon among 6 patients.

Table 4.6. Allelic type results comparison between Sanger method, old assay, and the new assay.

Table 5.1. Previously designed broad primers.

Table 5.2. Previously designed target specific primers for HLA-A.

Table 5.3. Previously designed target specific primers for HLA-B.

Table 5.4. Previously designed target specific primers for HLA-C.

List of Figures

Figure 1-1. Nomenclature of HLA Alleles1.

Figure 2-1. Genomic location of the HLA Class I and II genes.

Figure 4-1. Percentage of NGS sequence reads agreement with Sanger sequence reads of 6 patients.

Figure 4-2. Excel snapshot: an example of the original data analysis.

Figure 4-3. Schematic illustration of broad primer and target specific primer design.

Figure 4-4. Agilent results of redesigned broad primers for HLA-A exon 4.

Figure 4-5. Schematic illustration of nested PCR reactions.

Figure 4-6. Agilent results for nested PCR products.

Figure 4-7. Agilent results of redesigned primers for HLA-B exon 2.

Figure 4-8. Agilent results of redesigned broad primers for HLA-B exon 3.

Figure 4-9. Agilent result for redesigned broad primers of HLA-B exon 4.

Figure 4-10. Agilent result for redesigned broad primers of HLA-C exon 2.

Figure 4-11. Agilent result for redesigned broad primers of HLA-C exon 3.

Figure 4-12. Agilent result for redesigned broad primers of HLA-C exon 4.

Figure 4-13. Agilent results of class-I-A1 for spot checked samples.

Figure 4-14. Agilent results of class-I-A2 for spot checked samples.

Figure 4-15. Agilent results of class-I-B1 for spot checked samples.

Figure 4-16. Agilent results of class-I-B2 for spot checked samples.

Figure 4-17. Agilent results of class-I-C1 for spot checked samples.

Figure 4-18. Agilent results of class-I-C2 for spot checked samples.

Figure 4-19. Agilent results for barcode PCR reaction products.

Figure 4-20. Agilent results for platform PCR reaction products.

Figure 4-21. Comparison of NGS/Sanger mean percentage agreement between the old and new assay for HLA-A exon 4.

Figure 4-22. Comparison of NGS/Sanger mean percentage agreement between the old and new assay for HLA-B exon 2, 3, and 4.

Figure 4-23. Comparison of NGS/Sanger mean percentage agreement between the old and new assay for HLA-C exon 2, 3, and 4.

Figure 4-24. Agilent results of previously designed broad primer pair for HLA-A exon 4.

Figure 4-25. Agilent results of previously designed broad primer pair for HLA-B exon 2.

Figure 4-26. Agilent results of previously designed broad primer pair for HLA-B exon 3.

Figure 4-27. Agilent results of previously designed broad primer pair for HLA-B exon 4.

Figure 4-28. Agilent results of previously designed broad primer pair for HLA-C exon 2.

Figure 4-29. Agilent results of previously designed broad primer pair for HLA-C exon 3.

Figure 4-30. Agilent results of previously designed broad primer pair for HLA-C exon 4.

List of Abbreviations

ASHI ...... American Society for Histocompatibility and Immunogenetics

DNA ...... Deoxyribonucleic Acid

ESRD ...... End-Stage Renal Disease

EFI ...... European Federation for Immunogenetics

gDNA ...... Genomic DNA

HLA ...... Human Leukocyte Antigen

IDT ...... Integrated DNA Technologies

IMGT ...... International ImMunoGeneTics

L ...... Low Expression

MHC ...... Major Histocompatibility Complex

NIST ...... National Institute of Standards and Technology

NGS ...... Next Generation Sequencing

N ...... Not Expressed (Null)

PCR ...... Polymerase Chain Reaction

PERL ...... Practical Extraction and Reporting Language

Q ...... Questionable Expression

ROC ...... Receiver Operating Characteristics

SBT ...... Sanger Sequencing-Based Typing

SAM ...... Sequence Alignment/Map

SSOP ...... Sequence-Specific Oligonucleotide Probes

SSP ...... Sequence-Specific Primers

List of Symbols

°C ...... Degree Celsius

µl ...... Microliter

mL ...... Milliliter

M ...... Molar

1 Chapter 1

1.1 Introduction

End-stage renal disease (ESRD) is a global health problem, and kidney transplantation is considered the only treatment that can restore relatively normal kidney function and health 2, by removing a healthy kidney from a donor and transplanting it into a recipient suffering from terminal ESRD. The earliest mention of kidney transplantation was recorded in 1907 by an American researcher in an article entitled ‘Tendencies in Pathology’3. The researcher, Simon Flexner, argued that organ transplantation would be possible in the future, but only to replace diseased human organs. He declared that it would become possible to substitute organs such as the stomach, kidney, heart, or even arteries with healthy ones in the event of failure or damage 3. Some years later, in 1933, a surgeon from the Soviet Union attempted to perform kidney transplantation in a patient with ESRD. Dr. Yuriy Voroniy removed the kidney from a donor six hours after the donor's death and transplanted it into the thigh of a recipient 4. Although the transplanted kidney seemed to function, the patient died after two days as a result of complications; moreover, no attempt had been made to match the donor and the recipient 4. After that event, a number of organ transplantations were performed and recorded, but none of them was truly successful.

In fact, early kidney transplantation suffered a number of setbacks related to the lack of the concept of an allograft as foreign tissue subject to rejection, the lack of understanding of the need for immunosuppression, and the necessity of ongoing medical supervision 5. The setbacks were not resolved until researchers had clarified these basic facts. The success of the first successful kidney transplantation, performed in 1954, stems from the fact that the team led by Dr. Murray understood the basic principles of transplantation6,7. Indeed, that transplant was performed between two identical twins, an approach that eliminated the need for immunosuppression6,7. Since then, steady progress in research introduced the first successful immunosuppressive therapy in the 1960s, improved immunosuppression through the introduction of cyclosporine in the 1980s, and refined immune therapies and the management of kidney transplant patients in the 1990s8-12. Currently, kidney transplantation is the most successful treatment for patients with ESRD, with one- and five-year kidney allograft survival rates of 96% and 85%, respectively.

Kidney transplantation and generally organ transplantation have helped to save many lives ever since the first kidney transplantation in 1950 performed by Dr. Richard Lawler on Ruth Tucker as a recipient: the kidney transplant functioned for 5 years until she died from heart disease and pneumonia13,14.

Despite the high success rates of kidney transplants, real progress required several fundamental discoveries related to kidney harvesting and preservation, evaluation of patients’ sensitization, HLA tissue typing and identification of other transplant antigens, as well as induction therapy and the tailoring of individualized immunosuppression15-17. While allograft rejection may happen at any time after transplantation, in most instances acute kidney rejection occurs within the first six months after transplantation 14. There are generally three types of rejection, namely hyperacute, acute, and chronic18. Hyperacute rejection happens when a patient has preformed donor-specific antibodies18. This is now a very rare occurrence, as all patients are required to be tested for the presence of such preformed antibodies. Acute rejection is mediated by the immune response (T cells or antibodies) after the kidney transplant is recognized as foreign tissue18. While immunosuppression is very effective at blocking acute rejection, different factors influence its effectiveness. Finally, chronic rejection affects the majority of kidney allografts through a chronic process of graft damage. There is no effective therapy for chronic rejection, nor is there a preventive strategy. Progress in preventing and treating chronic rejection may require much more basic scientific work. The vast majority of transplanted organs are totally unmatched, which accounts for the potency of the immune response. Although almost all transplant patients experience some type of acute and/or chronic rejection, the overall results of kidney transplantation are excellent 19.

Rejection episodes, and especially chronic rejection, do not always entail signs and symptoms 20,21. In fact, they can sometimes be detected only by direct biopsy or by frequent routine blood work21. Patients must be given immunosuppressants to suppress their immune system so that it accepts the organ allograft, yet around 10-20% of kidney transplant recipients experience kidney rejection even with immunosuppressants 21. However, a rejection episode does not automatically mean that the new kidney transplant will be lost, as most episodes are treatable by adjusting the immunosuppression or, more often, by temporary therapy with an anti-rejection medication. The episodes can be associated with flu-like symptoms, fever above 101 °F, decreased urine output, pain over the transplant and, in particular cases, fatigue 18. However, since these symptoms often do not appear, periodic blood checks are necessary.

Kidney rejection is mostly a result of the immune system being activated by the newly transplanted organ. The sole purpose of the recipient’s immune system is to protect the body from foreign substances, such as cancer cells and infectious germs18. Foreign proteins, referred to as antigens, include those of transplanted organs, which differ from the antigens of the recipient’s own body 18. Therefore, the transplanted organ is attacked as soon as it is recognized as a foreign object by the recipient’s immune system. Only transplants between identical twins or genetically matched individuals do not initiate rejection. When transplanted organs are not matched, the tissue antigens expressed on their cells are detected and induce transplant rejection 18.

Organ transplantation engages both innate and adaptive immunity, which together constitute the main response mechanism exerted on the transplanted tissue. This can be attributed to the goal of self-integrity, as the immune response, represented by T cells, checks self-HLA and other self-proteins 22. Molecules on cell surfaces that provide the antigenic stimulus first activate T cells, which then engage other immune cells; the result is rejection of the donated tissue or organ by activated T cells 23. Various antigens related to transplantation have been described, such as the major (MHC) and minor histocompatibility molecules and the ABO blood group antigens, among many others 22. It is proposed that over 300 proteins in humans are polymorphic and thus can induce an alloimmune response 24. Blood transfusions, previous transplants, and pregnancy can sensitize a patient to polymorphic alloantigens, leading to the development of anti-HLA antibodies, which play a role in inducing graft rejection after organ transplantation25.

The first attempts at kidney, and more generally organ, transplantation in the early 20th century failed because of a lack of understanding of the concept of allograft rejection and, more precisely, because of the lack of immunosuppression. Experimentation in the 1940’s helped to establish the basic principles of immunological organ allograft rejection26. Several experiments created the concepts of foreign tissues, allografts, and finally self and non-self. The idea of matching donor and recipient for successful transplant survival was confirmed by observing the indefinite survival of self-transplants and the prompt rejection of foreign, non-self transplants. Later, several time-consuming mouse breeding experiments defined the MHC and minor histocompatibility antigens. Today, there is a clear distinction between major (MHC) and minor histocompatibility antigens, as well as a clear understanding that either may induce allograft rejection26. Before a harvested organ is accepted for transplantation, tests have to be carried out to ensure that the organ matches the particular recipient. In particular, blood type must be matched between donor and recipient, and the cross-match must confirm the absence of IgG antibodies against the donor (donor-specific antibodies; DSA). These two tests are compulsory prior to transplantation.

The first serological assays for identifying the presence of antibodies were developed in the middle of the 19th century27. These assays identified antibodies within the serum and how they respond in the event of an infection. From these studies, very useful assays were developed that eventually made it possible to measure the antibody responses of potential recipients against potential donors. There are a variety of serological techniques, such as the enzyme-linked immunosorbent assay, several precipitation assays, and several versions of complement-fixation assays, as well as assays measuring agglutinating and neutralizing antibodies 27.

The basic laboratory testing for tissue transplants entails mixing the donor’s leukocytes with the recipient’s T cells or sera to evaluate the response of the recipient’s immune system. Proliferation of the recipient’s T cells in response to the donor’s leukocytes flags the onset of an immune reaction and the presumable rejection of the donor tissue. The newest methods use advanced non-radioactive indicators of deoxyribonucleic acid (DNA) levels in the recipient’s T cells, in response to donor cells that have been irradiated to block their division, as a measurement of proliferation.

Tissue typing methods have been developing gradually since the 1960s, with the biggest technological progress during the 1980s, when better techniques for matching donor and recipient blood types and better cross-match techniques were developed. At the same time, DNA testing was introduced as a standard procedure for identifying the HLA of donors and recipients, slowly replacing serological methods based on panels of HLA-reactive antibodies. Human chromosome 6p21 encodes multiple HLA antigens, including HLA-A, B, C, DR, DQB/DQA, and DPB/DPA28-30. The most characteristic feature of these genes is that they are extremely polymorphic among humans. They encode glycoproteins expressed on the membrane of all nucleated cells as the HLA antigens. Their presence on all cells represents the signature of individual self-reference for immune T and B cells: self-tolerance is defined by the elimination of all T and B cells able to recognize self-HLA and other self-antigens. Consequently, foreign HLAs and other non-self antigens are recognized by an individual’s T and B cells, initiating allograft rejection 31,32. These self-HLAs and other self-antigens permit our immune system to “perceive” our own organs and tissues as “confirmed” self and as different from those of another person. The concept of self-antigens and non-self-antigens underlies a uniform resistance against foreign antigens and thus creates the framework for the immune system: tolerance of self (negative response), attack on invasive or infectious agents (positive response), and mistaken identification of self-antigens as foreign (autoimmune response) 33,34. Unfortunately, foreign organs or tissues, which often save lives, are classified as an invasion, thus initiating a positive response and aggressive allograft rejection that hinders solid organ and bone marrow transplantation 35,36.

There are various well-established methods of HLA typing, such as serological methods, molecular techniques, sequence-based typing, reference strand-based typing, and sequence-specific oligonucleotide probing22. The difficulties in organ transplantation stem from the limited number of organs and from immunosuppressive treatments that prevent acute but not chronic rejection and that also produce side effects. On one hand, the transplantation field has been transformed into a very successful branch of medicine, saving thousands of lives over decades. On the other hand, the average survival time of a kidney transplant is only 12 years for deceased donors and 15 years for live donors. New efforts must be made to address these challenges, namely to extend long-term allograft survival. One of the solutions is to improve HLA matching between donor and recipient, but that requires a much less expensive method for high-resolution HLA typing. The development of a 4-digit HLA typing method is the goal of our work, as the current method of 2-digit HLA typing may eliminate many “good” donors for highly sensitized recipients.

The polymorphic differences between the HLA genes of donors and recipients range from highly immunogenic to weakly immunogenic and therefore influence the risk of allograft rejection. The most suitable DNA-based methods for tissue typing are those that most accurately identify high-resolution 4-digit HLA alleles. DNA is extracted from the cells of donors and recipients, processed, and then subjected to a method that reads enough of the HLA sequence to assign the proper HLA designation; proper controls and reference sequences are always used to increase the method’s accuracy and quality 21. The use of high-resolution 4-digit HLA typing for organ transplantation needs further investigation, as the benefit of accounting for HLA polymorphisms between the donor and recipient needs to be documented. Indeed, six HLA antigens, HLA-A, B, C, DR, DQB/DQA, and DPB/DPA, have been identified as important in the immune response to allografts 21.

The discovery of HLAs and of typing methods was a great achievement in matching organ donors to recipients, and it increased the survival rate of kidney transplants. HLA genes play a fundamental role in determining organ transplantation outcomes. These genes also help to identify multiple HLA-dependent diseases. The HLA system comprises multiple clusters of genes situated on chromosome 6 and encodes a variety of proteins important in immune regulation34. The self-HLA proteins present on every nucleated cell provide self-identity, informing the self-trained immune system not to attack (tolerogenic signal)34. There are three important class I loci, HLA-A, -B, and -Cw, and five class II loci, HLA-DR, -DQ, -DP, -DM, and -DO. A defining feature of all these molecules is that they are highly polymorphic 37. The contribution of the various alleles of class I and II genes to immune recognition and alloreactivity can be analyzed by serological techniques and by molecular methods at the DNA level, such as sequence-specific priming and oligotyping with locus- and allele-specific oligonucleotide probes 37. HLA class I and II matching is essential in organ transplantation 37, particularly in kidney and bone marrow transplantation. In heart and lung transplantation, HLA matching at the DR locus is important, yet there are some challenges, such as ischemic times, availability of donors, and clinical need of recipients. Corneal transplants are usually not affected by HLA matching unless transplanted into a vascularized bed 37. Transplantation of allogeneic tissue activates both humoral and cellular immune reactions in the recipient.

With regard to HLA nomenclature, each HLA allele has a unique number consisting of up to four sets of digits (fields) separated by colons. The first field defines the allele group, which generally corresponds to the serological specificity of the HLA protein. The second field indicates variation in the DNA sequence that changes the amino acid sequence. The third field is used to show synonymous DNA substitutions in the coding regions, and the fourth field refers to differences in the non-coding regions. Variations in the third and fourth fields therefore do not have any impact on the resulting HLA protein, as they are either synonymous or lie in untranslated regions 27. At the end of the name of an HLA allele, there may be a suffix indicating changes in the expression of the protein (N = not expressed, Null allele; L = low expression; Q = questionable expression), as illustrated in Figure 1-1. Null alleles are clinically relevant and usually need to be excluded or confirmed 38.

Figure 1-1. Nomenclature of HLA Alleles1.

HLA typing can be challenging at high resolution, even with complete sequencing data 37. The 2-digit typing helps in excluding any donor who has the same antigen, or any 4-digit allele belonging to that antigen 27. A 4-digit typing, on the other hand, excludes only donors who carry the precise allele identified by HLA typing 27. In other words, the full allele name contains up to four separate fields that indicate the different levels of variation in the DNA sequence and in the resulting protein 37. Tissue typing plays a significant role in matching and in helping to prevent kidney rejection. It also enables a cross-match to be performed 27.
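The naming scheme described above can be illustrated with a short parsing sketch. This is only an illustration under stated assumptions: the function and class names are hypothetical, the optional "HLA-" prefix and "*" separator follow the common written form of allele names, and only the three expression suffixes mentioned in the text (N, L, Q) are accepted.

```python
import re
from typing import NamedTuple, Optional, Tuple


class HLAAllele(NamedTuple):
    gene: str                   # e.g. "A" or "DRB1"
    fields: Tuple[str, ...]     # one to four colon-separated fields
    suffix: Optional[str]       # "N", "L", "Q", or None


# Assumed written form: optional "HLA-" prefix, gene, "*", 1-4 fields, suffix.
_ALLELE_RE = re.compile(r"^(?:HLA-)?([A-Z0-9]+)\*(\d+(?::\d+){0,3})([NLQ])?$")


def parse_allele(name: str) -> HLAAllele:
    """Split an allele name into gene, fields, and expression suffix."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognized HLA allele name: {name!r}")
    gene, digits, suffix = match.groups()
    return HLAAllele(gene, tuple(digits.split(":")), suffix)
```

For example, parsing "HLA-A*02:01:01:02L" yields the gene "A", the four fields "02", "01", "01", "02", and the suffix "L" for low expression.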

The required resolution of HLA typing depends on the clinical application and is defined by the local transplant protocol 39. The European Federation for Immunogenetics (EFI) standards give some minimal requirements that must be fulfilled. In general, for solid organ transplantation, a 2-digit resolution distinguishing the serological splits is required. Most low-resolution sequence-specific primer (SSP) kits do not achieve what is actually required. For bone marrow registries, low to medium resolution is the basic requirement for screening39. Most registries request a 4-digit typing since this increases the likelihood that a donor is selected. For the donor and the recipient of a hematopoietic stem cell transplantation, a true high-resolution tissue typing is required, which must be achieved by sequencing.

The majority of centers limit their HLA typing to low or medium resolution, which in this case is 2-digit DNA typing. Low-resolution typing is sufficient to confirm the match between a recipient and a donor only if the donor is a sibling (in reality, such siblings share precise 4-digit HLA). This is a result of the strong linkage disequilibrium that exists between HLA-DR and DQ and also between HLA-B and C 22, in accordance with the EFI standards for kidney transplantation22. Nevertheless, 2-digit tissue typing has its limits: the recipient and the donor may not match at HLA-DQ and HLA-C. This usually happens if the recipient and the donor did not inherit the same haplotypes, in which case the two can turn out to be mismatched when alleles are compared at the 4-digit level. Therefore, the more appropriate and comprehensive of the two HLA testing methodologies is 4-digit typing, since it evaluates the match between a donor and a recipient at a finer level. It is well established that HLA antigens have a vital role in kidney transplantation outcomes. Better matching correlates with better survival, and vice versa. Thus, these retrospectively examined rules suggest that improved donor/recipient selection may improve kidney allograft survival.
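The difference between matching at 2-digit and 4-digit resolution can be sketched in a few lines. This is an illustrative example only, assuming that "2-digit" corresponds to the first field of the allele name and "4-digit" to the first two fields; the function names are hypothetical, and the alleles in the usage note are used purely as examples.

```python
def truncate_allele(name: str, n_fields: int) -> str:
    """Reduce an allele name to its first n_fields colon-separated fields.

    Assumption: 2-digit typing uses 1 field and 4-digit typing uses 2 fields.
    An expression suffix (N, L, Q) is dropped for the comparison.
    """
    if not 1 <= n_fields <= 4:
        raise ValueError("n_fields must be between 1 and 4")
    gene, separator, rest = name.partition("*")
    if not separator:
        raise ValueError(f"missing '*' in allele name: {name!r}")
    fields = rest.rstrip("NLQ").split(":")
    return gene + "*" + ":".join(fields[:n_fields])


def alleles_match(donor: str, recipient: str, n_fields: int) -> bool:
    """Compare two allele names at the chosen resolution."""
    return truncate_allele(donor, n_fields) == truncate_allele(recipient, n_fields)
```

With these definitions, "HLA-B*44:02" and "HLA-B*44:03" match at 2-digit resolution (one field) but are mismatched at 4-digit resolution (two fields), which is the situation the paragraph above describes.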

HLA sequencing is typically used to obtain high-resolution, 4-digit allele-level typing for HLA-A, B, C, DR, DQ, and DP. The gold standard sequencing method, Sanger sequencing, uses a DNA polymerase to extend an oligonucleotide primer, with selective incorporation of chain-terminating, fluorescently labeled dideoxynucleotide triphosphates, followed by size separation by electrophoresis 40,41. The method has disadvantages as a result of its limited sensitivity and its inability to analyze multiple targets in parallel. Further disadvantages are that it can be applied only to single exons at a particular locus amplified by polymerase chain reaction (PCR), and that each amplicon often requires multiple sequencing reactions. The data also have to be confirmed at short intervals. Up to the present time, the highest resolution HLA typing for exons 2 and 3 in class I HLA genes (HLA-A, B, and C) or exon 2 in class II HLA genes (HLA-DR, DQ, and DP) has been done using Sanger-based DNA sequencing and obtained with sequence-specific oligonucleotide probes (SSOP), SSP, and sequence-based typing (SBT) 42. Sanger sequencing-based typing can perform four-digit HLA typing, but PCR amplification must be done for each exon at each locus, and each amplicon requires a separate sequencing reaction. Due to the highly polymorphic nature of HLA genes, ambiguity still appears in HLA typing. Resolving these ambiguities requires tedious approaches, such as analyzing the two alleles independently after amplification 43.

However, in the past decade, the advancement of next generation sequencing (NGS) has made whole-genome analysis possible in humans. Research on HLA, a widely studied molecule involved in immunity, has benefited from NGS technologies. So far, several high-throughput HLA-typing methods using NGS have been developed. In HLA research, NGS facilitates complete HLA sequencing and is expected to improve our understanding of the mechanisms through which HLA genes are modulated, including transcription, regulation of gene expression, and epigenetics. Above all, NGS may also allow the investigation of HLA-omics. NGS is superior to established Sanger sequencing in that it offers massively parallel sequencing and a reduced cost per sample, is quicker, uses significantly less DNA, and is more reliable 42,44. These remarkable properties permit its broad application to diverse areas in clinical transplantation 45. Several researchers have applied NGS technologies to genotype the highly polymorphic HLA genes using different amplification and library preparation techniques, sequencing platforms, and sequence analysis approaches to significantly improve sequencing depth and resolve ambiguities 46.

1.2 Objectives

The main objective of this study is to improve an assay designed for high-resolution HLA typing by a next generation sequencing (NGS) method. Such a method will provide an inexpensive and precise tool for 4-digit high-resolution HLA typing of humans, which may be particularly useful in transplantation. For example, matching of donor and recipient for bone marrow transplantation as well as organ transplantation may become easier, more cost effective, and more efficient. Our work has been inspired by the plan of improving the matching of recipients and donors for kidney transplantation. Typically, kidney transplantation requires the identification of HLA-A, B, and DR, increasingly also HLA-C, DQB and DPB, and more recently DQA and DPA 47-49. Since there is increasing evidence that each of these HLAs may play an important role in allograft rejection, matching for these HLAs becomes critical, especially for sensitized patients 47-49. However, each gene has multiple exons. In particular, for identification of Class I HLA (A, B and C) we chose to sequence exons 2, 3, and 4, whereas for Class II HLA we chose exons 2 and 3, because the extracellular regions and the amino acid variability are predominantly found in these exons 50. The primers for sequencing HLA genes were designed in pairs, as forward and reverse primers, to assess the entire exonic region of each gene, and NGS technology is then used for the final sequencing process. Initial HLA sequencing data from the preliminary assay were very encouraging. However, we also realized the complexity of such an NGS method, as it requires an excellent selection of primer pairs as well as a very efficient computer program to correctly read sequences and thus name the appropriate HLAs. Consequently, our NGS method needs to be improved in order to reach 100% accuracy at 4-digit high-resolution HLA typing.

The presented work has focused on improving the Class I HLA-A, -B and -C assays in exons 2, 3 and 4. Our NGS method was designed to amplify HLA genes in two phases, namely 1) broad primer pairs amplifying large fragments of selected exons; and 2) target-specific primers amplifying all selected specific fragments of HLA exons. These two amplification steps were designed to avoid sequencing and identification of non-HLA genes and pseudogenes. However, the testing of patients with different HLA alleles showed that further redesign of both broad and target-specific primers is necessary. Consequently, the present work is focused on redesigning the broad primers used to amplify HLA genes by PCR, in order to increase primer specificity and efficacy compared with the previous primers. The computer program must be designed to identify HLA sequences and correctly call the proper HLAs; such a program should be written with a set of precise instructions and rules that enable it to sort out sequence reads shorter than 90% of the expected read length of each amplicon, thereby giving clear genotype readings. Eventually, the cutoff for eliminating short reads may be raised to 95% or even 99% of the expected length, but such a determination needs to be tested.
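The read-length rule described above can be illustrated with a minimal sketch. It is not the collaborator's program; the function name `keep_full_length`, the toy reads and the expected length of 100 bases are assumptions made only for illustration. The sketch shows how the number of retained reads changes as the cutoff is raised from 90% to 95% and 99% of the expected amplicon length.

```python
def keep_full_length(reads, expected_len, min_fraction=0.90):
    """Keep reads whose length is at least min_fraction of the expected length.

    Hypothetical helper for illustration; not taken from the thesis software.
    """
    cutoff = min_fraction * expected_len
    return [read for read in reads if len(read) >= cutoff]


# Toy reads for one amplicon with an assumed expected length of 100 bases.
toy_reads = ["A" * 100, "A" * 96, "A" * 91, "A" * 80]

for fraction in (0.90, 0.95, 0.99):
    kept = keep_full_length(toy_reads, expected_len=100, min_fraction=fraction)
    print(f"cutoff {fraction:.0%}: {len(kept)} of {len(toy_reads)} reads kept")
```

With these toy reads, a stricter cutoff removes more of the shorter reads, which is the trade-off that would need to be tested on real data.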


2 Chapter 2

2.1 Literature Survey

End-stage renal disease (ESRD), also called kidney failure, is a highly common cause of morbidity and mortality. According to the most recent U.S. Renal Data System Annual Data Report, the number of prevalent ESRD cases continues to rise (by around 21,000 cases per year) 51. For patients with ESRD who fail to respond to therapy, transplantation is the treatment of choice 52. Organ transplantation is the process of surgically removing a healthy organ from one body and implanting it in the body of a recipient to replace a failed or injured organ. Although transplantation can improve patients' lives, as it eliminates the need for dialysis and thus gives them an opportunity to live a normal life, the process can involve various complications, including infection, acute and chronic rejection, and malignancy 53.

Furthermore, the compatibility between a donor and a recipient contributes to transplant success. The benefit of the best donor/recipient matching is especially evident in analyses of long-term kidney allograft survival 54. Zero mismatches (HLA-A, -B, and -DR) produced the best survival, whereas every additional mismatched HLA allele resulted in worse survival. If it were possible to maximize the matching of kidney transplants among 16,000 recipients, several new transplants to first-time recipients could be made instead of re-transplants.

A retrospective single-center study of live and deceased donor kidney transplants has shown that HLA mismatch remained a vital determinant of acute rejection risk in renal transplant recipients receiving quadruple immunosuppression, including interleukin-2 receptor antibody induction, tacrolimus, mycophenolate mofetil and corticosteroids 55. In addition, a decrease in acute rejection risk as a beneficial effect of improved HLA matching has been shown mostly in kidney transplant recipients receiving a cyclosporine-based immunosuppressive regimen 56,57.

For a long time the HLA community has been progressing toward a strategy that will precisely identify the extensive polymorphism of the HLA genes using DNA-based methods. The advent of PCR, combined with different technologies (Sanger sequencing, SSOP, SSP, Luminex) 58,59, provided a formula for substantially improving the detection of HLA polymorphisms, yet with a few limitations that continue to constrain the ability to fully characterize the HLA genes. Technologies developed in recent years, called NGS platforms, have provided new opportunities that permit the complete characterization of the HLA genes in a haploid fashion. NGS has two defining features: clonal sequencing of DNA fragments and enormously high throughput.

NGS gives the ability to phase polymorphisms, thereby eliminating all ambiguities, and provides HLA typing at the three- to four-digit level without reflexive testing, thus offering a comprehensive solution to the HLA typing problem. Today, DNA sequencing is the gold standard for HLA typing worldwide and is a well-characterized, dependable and efficient method able to recognize and distinguish both alleles at the 6 loci of HLA Class-I (HLA-A, HLA-B and HLA-C) and HLA Class-II (HLA-DRB1, HLA-DQB1 and HLA-DPB1). Because NGS offers a single-molecule sequencing approach, a simple laboratory workflow, speed, highly informative results and a dramatic decrease in sequencing cost 60, this method, on the Illumina MiSeq system, was used in the preliminary assay designed for high-resolution HLA typing in transplantation, combined with analytical software able to take account of the need to analyze sequence from single DNA strands in determining allele assignment.

On the other hand, a few studies have been published that describe unanticipated allele reactivity with transplant recipient sera. For example, a kidney transplant recipient who typed as HLA-B*44:03 had antibodies that reacted with HLA-B*44:02; they were specific for the 156DA-defined epitope presented by the immunizing HLA-C*07:04 allele and shared with HLA-B*44:02 and a few other HLA-B alleles 61. Two kidney transplant recipients who typed as HLA-B*13:02 had antibodies that reacted with all HLA-Bw4-carrying alleles except HLA-B*13:01 and HLA-B*13:02 62. These antibodies recognized a specific epitope defined by 145R paired with the HLA-Bw4-associated 82LR. We know of other similar findings, and all of these cases illustrate how problematic it would be if mismatch acceptability were determined exclusively at the antigen level.

Duquesnoy et al. discuss the positive effect of high-resolution typing: for highly sensitized patients awaiting living donor transplants, typing focused on alleles is a more precise approach than antigen typing for determining mismatch acceptability 63. Furthermore, allele mismatches appear to contribute to rejection or graft-versus-host disease, which is the reason for requiring high-resolution HLA typing in bone marrow transplantation 64. In particular, four-digit or high-resolution HLA typing is used to recognize allelic differences. More than 11859 alleles have been defined in class-I HLA-A, HLA-B and HLA-C, and more than 2252 alleles in class-II HLA-DR 65. HLA typing remains challenging due to its highly polymorphic nature and the absence of a known full sequence of this chromosome 6p21.3 region 66-68. In addition, other genes, called pseudogenes, lie within the class I region and share broad homology with the class I antigens. Those pseudogenes have substantially less allelic polymorphism, and studies of their expression have shown that they serve functions distinct from, though possibly overlapping with, those of the classical antigens 69-71. Some of those pseudogenes have been identified and given the names HLA-E, HLA-F, and HLA-G 72-74 (see Figure 2-1 A). This is one of the problems in the preliminary assay, and I attempt to avoid pseudogenes in order to assist the computer program in determining the correct alleles.

Figure 2-1. Genomic location of the HLA Class I and II genes.


In summary, the presented studies have shown the advantages in transplantation of high-resolution HLA typing by NGS, which gives accurate genomic consensus sequences: it permits a better assessment of donor-recipient compatibility and could improve the outcome of the transplanted organ. These advantages could also decrease the chances of organ rejection and increase long-term transplant survival.

However, it is clear for some time that these improvements translate into clinically significant benefits.


3 Chapter 3

3.1 Materials and Methods

3.1.1 Samples

To evaluate the performance of the developed assay, twenty-three well-characterized class-I HLA-typed control genomic DNA (gDNA) samples from the American Society for Histocompatibility and Immunogenetics (ASHI) were obtained from the University of Michigan. The samples had previously been typed by Sanger sequencing for the class-I HLA genes HLA-A, HLA-B and HLA-C, exons 2, 3 and 4, by the University of Michigan. The Sanger sequence data were used to determine the accuracy of the DNA sequences produced by the developed assay. In addition, unidentified male and female gDNA samples were used to evaluate the redesigned broad primer pairs. Molecular grade water was used as a negative control.

3.1.2 DNA extractions

The gDNA samples from the University of Michigan were extracted and purified with the Qiagen EZ1 DNA Blood Kit on a Geno-M6 robot (Qiagen, Alameda, CA). The unidentified male and female gDNA samples were extracted from whole blood using a protocol for the extraction of gDNA from whole blood 75. Briefly, one volume of Buffer A (0.32 M sucrose, 10 mM Tris-HCl, 5 mM MgCl2, 0.75% Triton-X-100) was added to 1 volume of blood and 2 volumes of cold, sterile, distilled, deionized water. Following centrifugation, the pellet was mixed with 2 mL of Buffer A and 6 mL of water, and the purification steps were repeated two to three times until a clean white or cream pellet was obtained. The pellet was resuspended in 5 mL of Buffer B (20 mM Tris-HCl, 4 mM Na2EDTA, 100 mM NaCl), 500 µL of 10% SDS, and 50 µL of Proteinase K solution (20 mg/mL), then incubated for 2 hours at 55°C. Following complete cell lysis, 5.3 M NaCl was added to solidify the DNA and make it visible. Upon centrifugation, the supernatant was decanted into a clean microfuge tube and an equal volume of ice-cold isopropyl alcohol was added. Genomic DNA was removed from the tube using a wide-bore pipette tip or an inoculating loop, then transferred to a microfuge tube containing 400 µL of TE buffer. The DNA was allowed to dissolve before quantification with the Nanodrop.

3.1.3 DNA concentration and purity determination

The concentration and purity of gDNA were measured spectrophotometrically using a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). The concentrated gDNA samples were diluted with PCR-grade water to a concentration of 10 ng/µL.
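As a worked illustration of the dilution step, the volumes follow from the standard relation C1 x V1 = C2 x V2. The sketch below is only an example: the function name `dilution_volumes`, the stock concentration of 50 ng/µL and the final volume of 100 µL are assumed values, not measurements from this study.

```python
def dilution_volumes(stock_ng_per_ul, final_ul, target_ng_per_ul=10.0):
    """Return (stock volume, water volume) in µL using C1 * V1 = C2 * V2.

    Hypothetical helper; the default target of 10 ng/µL is the working
    concentration stated in the text.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_ul = target_ng_per_ul * final_ul / stock_ng_per_ul
    return stock_ul, final_ul - stock_ul


# Assumed example: a 50 ng/µL stock diluted to 100 µL at 10 ng/µL.
stock_ul, water_ul = dilution_volumes(50.0, 100.0)
print(f"{stock_ul} µL gDNA + {water_ul} µL PCR-grade water")
```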


3.1.4 Reagent Design

Primer design and synthesis. The previously designed broad primers for HLA-A exon 2 and exon 3 were specific for the HLA-A gene and did not require any modification. However, the forward and reverse broad primers for HLA-A exon 4 differentiated HLA-A from the pseudogenes HLA-E, HLA-F and HLA-G, but did not differentiate HLA-A from the pseudogene HLA-H. Therefore, a primer pair that differentiates HLA-A from HLA-H was designed for use in a first-round broad primer PCR reaction, followed by the previously designed primer pair (F10/R10) that differentiates HLA-A from the other pseudogenes. In addition, the previously designed broad primers for HLA-B and HLA-C were not specific to the target genes and could amplify portions of HLA-A, HLA-B or HLA-C. Consequently, forward and reverse PCR broad primers were redesigned for three target gene regions (exon 2, exon 3, and exon 4) of each of the class-I genes HLA-B and HLA-C in the human genome.

I used human consensus sequence alignments of the target class-I genes HLA-A, HLA-B, and HLA-C, in addition to consensus sequences of the class-I pseudogenes HLA-E, HLA-F, HLA-G and HLA-H, to design uniquely target-specific forward and reverse broad primers that amplify the target HLA genes only. All sequences were obtained from the International ImMunoGeneTics (IMGT-HLA) database 76 and aligned in a single document. Using OligoAnalyzer 3.1 by Integrated DNA Technologies (IDT), each forward and reverse broad primer was designed to minimize the presence of SNPs at the 3' end of the primer, minimize homodimers, and minimize hairpin structures. The length of the primer and the annealing temperature of the PCR reaction control the specificity of the primer; thus, primers between 18 and 24 bases long, used with an annealing temperature set within a few degrees of the primer melting temperature and with a reasonable GC content (50-60%), tend to be very sequence specific 77. In addition, for multi-template PCR, each target-specific primer was designed with a universal tail sequence that does not exist in the human genome, to allow the addition of barcodes and platform-specific sequencing adapters. Separate primer pools for all broad primers and all target-specific primers were created after checking primer-primer interactions within each pool with National Institute of Standards and Technology (NIST) resources; the synthesized primers were then combined, after testing, in equimolar ratio and diluted in molecular grade water to a final working concentration of 2.5 µM of each broad primer in the broad primer sets and 100 µM of each target-specific primer in the target primer sets. Altogether, the class-I broad primer sets are class-I-BP-HLA-A, class-I-BP-HLA-B and class-I-BP-HLA-C (see Table 3.1), and the class-I target-specific primer sets are class-I-A1, class-I-A2, class-I-B1, class-I-B2, class-I-C1, and class-I-C2 (see Table 3.2).
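Two of the design rules above, primer length (18 to 24 bases) and GC content (50-60%), are simple enough to check programmatically. The following minimal sketch is not part of OligoAnalyzer and does not evaluate melting temperature, 3' SNPs, homodimers or hairpins; the function names and the example 20-mer are hypothetical and serve only to illustrate the two numeric criteria.

```python
def gc_percent(primer):
    """Percentage of G and C bases in a primer sequence."""
    seq = primer.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def passes_basic_rules(primer, min_len=18, max_len=24, gc_min=50.0, gc_max=60.0):
    """Check only the length and GC-content criteria stated in the text."""
    length_ok = min_len <= len(primer) <= max_len
    gc_ok = gc_min <= gc_percent(primer) <= gc_max
    return length_ok and gc_ok


# Hypothetical 20-mer, not one of the primers used in this study.
example_primer = "ACGTGCCAGTCAGGCTTCAA"
print(len(example_primer), gc_percent(example_primer), passes_basic_rules(example_primer))
```

A primer that passes these two checks would still need the remaining criteria (melting temperature, secondary structure, SNP position) evaluated with a dedicated tool.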

3.1.5 Library construction

Broad primers and nested PCR reactions. A first-round PCR using the broad primer sets for the polymorphic regions (exons 2-4) of the class-I genes (HLA-A, HLA-B, HLA-C) is done to artificially enrich the target genes relative to their pseudogenes. Moreover, since the primer pair used in the first-round PCR for HLA-A exon 4 only differentiates HLA-A from HLA-H and not from the other pseudogenes, a second-round PCR is done using a previously designed primer pair (F10/R10) that differentiates HLA-A from the other pseudogenes, such as HLA-E, HLA-F, and HLA-G.


Target-specific primers PCR reactions. Next, a PCR reaction is performed on each purified broad primer PCR product with the previously designed target-specific primers, arranged in new sets (class-I-A1, class-I-A2, class-I-B1, class-I-B2, class-I-C1, and class-I-C2).

The addition of barcodes and sequencing platform adaptors. After broad primer amplification and target-specific amplification, a barcoding PCR reaction is done on the purified target-specific PCR products: a sample from each target-specific PCR reaction was labeled using a unique set of barcode primers. The forward and reverse target-specific primers were tagged with different barcodes to dual-index each sample and lower the probability of assigning an incorrect barcode to a sequence read 78. Next, a platform PCR reaction is done on the purified barcode PCR products with the appropriate forward and reverse platform primers.

Reaction components. All broad primer and target-specific primer PCR reactions were run in 25 µl volumes using hot start Taq DNA polymerase (Alkali Scientific Inc, Pompano Beach, FL, USA). The PCR mixture consisted of 1x reaction buffer (with 3 mM MgCl2 and 1 mM dNTP), 0.25 µl of 1x hot start Taq DNA polymerase, and either 1 µl each of forward and reverse broad primer at 10 µM each, 2 µl of a broad primer set at 2.5 µM each, or 2 µl of a target-specific primer set at 100 nM each.

All barcode primer and platform primer PCR reactions were run in 25 µl volumes using Platinum Taq kits and reagents (Invitrogen by Thermo Fisher Scientific, Waltham, MA, USA). The PCR mixture consisted of 10x PCR reaction buffer, 50 mM MgCl2, 2 mM dNTP, 0.125 µl of Platinum Taq DNA polymerase, and either 2.5 µl each of forward and reverse barcode primer at 10 µM or 2.5 µl each of forward and reverse platform primer at 10 µM.

Thermal cycling parameters. Each broad primer PCR reaction was cycled in a C1000 thermal cycler (Bio-Rad, Hercules, CA, USA) under the following conditions, in a protocol called TD6259C: 95°C for 2 minutes; 5 cycles of 95°C for 15 seconds, 62°C for 15 seconds, and 72°C for 15 seconds; 5 cycles of 95°C for 15 seconds, 61°C for 15 seconds, and 72°C for 15 seconds; 5 cycles of 95°C for 15 seconds, 60°C for 15 seconds, and 72°C for 15 seconds; 20 cycles of 95°C for 15 seconds, 59°C for 15 seconds, and 72°C for 15 seconds; followed by a final five-minute extension at 72°C. Reaction tubes were stored at 4°C. Each target-specific PCR reaction was cycled in a C1000 thermal cycler (Bio-Rad) under the following conditions, in a protocol called HLA60C: 95°C for 30 seconds; 30 cycles of 95°C for 30 seconds, 60°C for 30 seconds, and 72°C for 1 minute; then 72°C for 5 minutes, followed by a hold at 4°C. Reaction tubes were stored at 4°C.

Each barcoding reaction and each platform reaction was cycled in a C1000 thermal cycler (Bio-Rad) under the following conditions, in a protocol called Barcode: 98°C for 2 minutes; 15 cycles of 98°C for 30 seconds, 58°C for 30 seconds, and 72°C for 30 seconds; then 72°C for 1 minute, followed by a hold at 4°C. Reaction tubes were stored at 4°C.
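The touchdown structure of the TD6259C protocol can be written down as data, which makes the total cycle number and the programmed hold time easy to verify. This is an illustrative sketch only: the variable and function names are assumptions, and the time computed excludes temperature ramping between steps, so the actual run time on the instrument is longer.

```python
# (number of cycles, annealing temperature in degrees C) for each touchdown stage
TD6259C_STAGES = [(5, 62), (5, 61), (5, 60), (20, 59)]

STEP_SECONDS = 15          # denaturation, annealing and extension are 15 s each
INITIAL_DENATURE_S = 120   # 95 degrees C for 2 minutes
FINAL_EXTENSION_S = 300    # 72 degrees C for 5 minutes


def total_cycles(stages):
    """Sum the cycle counts over all touchdown stages."""
    return sum(cycles for cycles, _ in stages)


def programmed_seconds(stages):
    """Programmed hold time only; ramp time between steps is not included."""
    cycling = total_cycles(stages) * 3 * STEP_SECONDS
    return INITIAL_DENATURE_S + cycling + FINAL_EXTENSION_S


print(total_cycles(TD6259C_STAGES), "cycles")
print(programmed_seconds(TD6259C_STAGES) / 60, "minutes of programmed holds")
```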

Amplicon pooling and PCR purification. After the broad primer set PCR reactions, 10 µl of the PCR products of each of the broad primer sets class-I-BP-B and class-I-BP-C and of the second-round PCR products of the F10/R10 primers were purified for each of the 23 samples using the Qiaquick PCR purification kit (Qiagen). These volumes from each patient can be combined and run on the same purification column. After purification, 2 µl of eluate is diluted in 198 µl of sterile water, and the dilutions are used for the target-specific reactions. Similarly, after the target-specific primer set PCR reactions, 5 µl of each PCR reaction (class-I-A1, class-I-A2, class-I-B1, class-I-B2, class-I-C1 and class-I-C2) from the same patient are combined in one tube and column purified again using the Qiaquick PCR purification kit (Qiagen). The final elution, in a volume of 30 µl, is done with low-EDTA TE buffer (10 mM Tris-Cl, pH 7.4, 0.1 mM EDTA). After purification, 2 µl of eluate is diluted in 198 µl of low-EDTA TE buffer, and the dilutions are used for the barcoding reactions.

Further, the entire volume of each barcoding PCR reaction was column purified individually. The final elution, in a volume of 25 µl, is done with low-EDTA TE buffer (10 mM Tris-Cl, pH 7.4, 0.1 mM EDTA), and these purified barcode products were used for the platform tag PCR reaction. Finally, 5 µl of the platform tag PCR reaction products of each sample were mixed in a 1:1 ratio, and the entire mixture was column purified with the Qiaquick PCR purification kit (Qiagen) as one sample and sent for sequencing.

3.1.6 Next Generation sequencing

All NGS platforms currently available on the market perform massively parallel, or deep, sequencing of small DNA fragments. The most common NGS tools are the Roche GS 454 FLX, Pacific Biosciences SMRT, Ion Torrent PGM and Illumina MiSeq/HiSeq. In this study we chose to use the Illumina MiSeq because, unlike Sanger sequencing and other NGS tools, it offers higher-resolution HLA typing results and a faster, less costly, and less demanding workflow 60.

Accordingly, after the platform PCR reaction of the 23 patient samples, the amplicon pools of each patient were mixed in a 1:1 ratio and purified. Next, the mixture was sent to Nationwide Children’s Hospital for a MiSeq PE300 bp run, and they provided us with forward and reverse FASTQ data.

3.1.7 Evaluating primers and PCR reaction products

Redesigned primers and PCR products were analyzed on an Agilent 2100 Bioanalyzer using microfluidics-based DNA chips with DNA 1000 Kit reagents according to the manufacturer’s protocol (Agilent Technologies Deutschland GmbH, Waldbronn, Germany). Unlike traditional slab gel electrophoresis, the DNA kits together with the Agilent 2100 Bioanalyzer are well suited to the quantitative and qualitative analysis of PCR fragments or fragmented DNA: the analysis is precise and fast, the workflow is easier, and the results are shown as a gel-like image, an electropherogram and in tabular format.

3.1.8 Data processing

Computer program and data set. An algorithm is used to piece together all sequence reads of DNA fragments by mapping the individual reads to the human reference genome.

FASTQ file processing. Nationwide Children’s Hospital provided us with the raw sequencing data from the Illumina MiSeq in FASTQ format. Briefly, according to our collaborator 79, all sequencing reads shorter than 90% of the expected length were excluded, and each of the remaining sequence reads was parsed into 3 separate FASTQ files: 1) the forward and 2) the reverse barcode regions, as well as 3) the central portion of the amplicon, which is the region internal to the target-specific priming sites.

BFAST of sequences against index databases 79. Our collaborator aligned each of the three FASTQ files against a FASTA database, chosen according to whether the file held a barcode or an amplicon region, using the BLAT-like fast accurate search tool (BFAST, version 0.7.0a), with file output in sequence alignment/map (SAM) format 80. BFAST matching against the index databases and SAM file output were performed for the trimmed FASTQ files containing 1) forward barcode, 2) reverse barcode and 3) captured amplicon subject sequences.

Binning of sequence counts 79. Using a practical extraction and reporting language (PERL) hash table (http://www.perl.org/), with the sequence read ID as the key for matching, the three SAM files from 1) the forward and 2) the reverse barcode and 3) the amplicon region were merged. Based on the barcode and amplicon alignments, each sequencing read was binned into an array, and this happened only if the forward and reverse barcode alignment calls matched. The binned sequence reads were then processed as outlined in the Statistical Methods section.
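The merging and binning logic can be sketched with a dictionary keyed by read ID, which plays the role of the PERL hash table. This is not the collaborator's code: it is written in Python rather than PERL, it assumes the three SAM files have already been reduced to read ID to alignment call mappings, and it assumes that a barcode "call" is the sample the barcode resolves to. The function name `bin_reads` and the toy read IDs are hypothetical.

```python
from collections import defaultdict


def bin_reads(fwd_calls, rev_calls, amplicon_calls):
    """Merge three read-ID-keyed mappings and bin reads by (sample, amplicon).

    A read is binned only if it has all three calls and its forward and
    reverse barcode calls agree (assumed here to mean the same sample).
    """
    bins = defaultdict(list)
    for read_id, fwd_sample in fwd_calls.items():
        rev_sample = rev_calls.get(read_id)
        amplicon = amplicon_calls.get(read_id)
        if rev_sample is None or amplicon is None:
            continue  # read missing from one of the alignment outputs
        if fwd_sample != rev_sample:
            continue  # discordant barcodes: read is not binned
        bins[(fwd_sample, amplicon)].append(read_id)
    return dict(bins)


# Toy alignment calls (hypothetical read IDs and sample names).
fwd = {"r1": "S1", "r2": "S1", "r3": "S2", "r4": "S2"}
rev = {"r1": "S1", "r2": "S2", "r3": "S2"}
amp = {"r1": "A2-1", "r2": "A2-1", "r3": "B3-2", "r4": "B3-2"}
print(bin_reads(fwd, rev, amp))
```

In this toy example, read r2 is dropped because its two barcode calls disagree, and r4 is dropped because it has no reverse barcode call.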

3.1.9 Analyzing the raw sequence data without the software.

In order to optimize the developed method for high-resolution HLA typing, the raw sequence data produced by the preliminary assay for 6 gDNA samples were analyzed without the program and compared with the Sanger method results. For each sample at each amplicon I evaluated the data as follows: 1) I excluded anything with very low amplification (under 10 reads); 2) I excluded everything of inaccurate size, in other words any sequence read shorter than 90% of the expected sequence read length; 3) I noted the sequence with the highest read count, which was sometimes a correct match to the Sanger read and sometimes not; 4) I determined whether any sequence reads matched the Sanger sequence (there could be up to two if the sample was heterozygous); and finally 5) I looked at the read counts of the matching NGS sequences. If I found a sequence with a high read count that did not match the Sanger sequence read, I searched it with IMGT NCBI BLAST to find regions of sequence similarity and thus clues about the interfering inappropriate sequence reads.
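The manual steps above can be summarized in a short sketch. The analysis in this study was done by hand in a spreadsheet, so the code below is only an illustration; the function name `summarize_amplicon`, the toy sequences, and in particular the definition of percent agreement (matching read counts divided by all retained read counts) are assumptions. The BLAST search of unmatched sequences is not modeled; the function only lists the candidates for it.

```python
from collections import Counter


def summarize_amplicon(reads, expected_len, sanger_alleles,
                       min_count=10, min_fraction=0.90):
    """Apply the count and length filters, then compare with Sanger alleles."""
    counts = Counter(reads)
    kept = {seq: n for seq, n in counts.items()
            if n >= min_count and len(seq) >= min_fraction * expected_len}
    matching = {seq: n for seq, n in kept.items() if seq in sanger_alleles}
    total = sum(kept.values())
    # Assumed definition: share of retained reads that match a Sanger allele.
    percent = 100.0 * sum(matching.values()) / total if total else 0.0
    unmatched = sorted((s for s in kept if s not in sanger_alleles),
                       key=kept.get, reverse=True)
    return {
        "top_sequence": max(kept, key=kept.get) if kept else None,
        "matching": matching,            # up to two entries if heterozygous
        "percent_agreement": percent,
        "unmatched": unmatched,          # candidates for a BLAST search
    }


# Toy data for one amplicon with an assumed expected length of 10 bases.
allele_1, allele_2 = "ACGTACGTAC", "ACGTACGTAA"
off_target, rare, short = "TTGTACGTAC", "ACGTACGTTT", "ACGT"
toy_reads = ([allele_1] * 60 + [allele_2] * 30 + [off_target] * 10
             + [rare] * 3 + [short] * 20)
print(summarize_amplicon(toy_reads, 10, {allele_1, allele_2}))
```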

3.1.10 Statistical Methods.

Methods to assess agreement between estimates of native target read counts 79. Briefly, difference plots display the differences between estimates given by two methods. For this particular application, differences were plotted on the base 10 logarithm scale. Along with the difference plots, scatter plots of the data and corresponding R2 values (percent variance explained) from linear models are also displayed.
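The two quantities behind these plots can be computed directly. The sketch below is illustrative only and is not the collaborator's statistical code: the function names and the toy read counts are assumptions, and R2 is computed as the squared Pearson correlation, which equals the percent variance explained for a simple linear model with one predictor.

```python
import math


def log10_differences(method_a, method_b):
    """Per-target difference between two estimates on the base 10 log scale."""
    return [math.log10(a) - math.log10(b) for a, b in zip(method_a, method_b)]


def r_squared(x, y):
    """R2 of a simple linear fit of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return (sxy * sxy) / (sxx * syy)


# Toy read-count estimates from two hypothetical methods.
counts_a = [120.0, 950.0, 40.0, 3100.0]
counts_b = [100.0, 1000.0, 50.0, 3000.0]
print(log10_differences(counts_a, counts_b))
print(r_squared(counts_a, counts_b))
```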

In addition, the area under the receiver operating characteristics (ROC) curve (and corresponding 95% confidence interval) was calculated to assess accuracy in the detection of fold differences (fold changes of 1.10, 1.25, 1.50, 2.00, and 4.00) of ERCC controls known to exist between samples A and B, as well as their derivative mixtures resulting in samples C and D. Results for fold-change ROC curve analysis were binned across differential ratio subpools of pairwise comparisons.


Statistical analysis. The mean percentage agreement between NGS and Sanger results for the old and the new assay was compared by a two-tailed t-test, and differences were considered significant when the P value was less than 0.05.

Table 3.1. HLA Class-I-broad primer sets.

PCR amplification round   Class-I broad primer set   Target gene   Exon   Forward primer   Reverse primer   Amplicon length (bp)
First PCR round           Class-I-BP-A               A             2      PreAmp_A_F1      PreAmp_A_R9      ~429
                                                     A             3      A-F4mod          PreAmp_A_R4      ~730
                                                     A             4      HLA_H_F3         HLA_H_R3         ~812
First PCR round           Class-I-BP-B               B             2      HLA_B_2Re_F1     HLA_B_2Re_R1     ~581
                                                     B             3      B-F4mod          HLA_B_3Re_R2     ~707
                                                     B             4      HLA_B_4Re_F1     HLA_B_4Re_R2     ~687
First PCR round           Class-I-BP-C               C             2      HLA_C_2Re_F2     PreAmp_C_R9      ~552
                                                     C             3      HLA_C_3Re_F2     HLA_C_3Re_R1     ~735
                                                     C             4      HLA_C_4Re_F2     HLA_C_4Re_R2     ~794
Second PCR round          -                          A             4      A-F10            A-R10            ~662


Table 3.2. HLA Class-I target specific primer sets.

Class-I target specific primer set   Target gene   Exon   Forward primer   Reverse primer   Amplicon length (bp)
Class-I-A1                           A             2      A2-2F            A2-2R            ~134
                                     A             3      A-3-1F           A3-1R            ~145
                                     A             3      A-3-3F           A-3-3R           ~142
                                     A             4      A-4-1F           A-4-1R           ~154
                                     A             4      A-4-3F           A-4-3R           ~158
Class-I-A2                           A             2      A-2-1F           A-2-1R           ~163
                                     A             2      A-2-3F           A-2-3R           ~170
                                     A             3      A-3-2F           A-3-2R           ~135
                                     A             3      A-3-4F           A-3-4R           ~144
                                     A             4      A-4-2F           A-4-2R           ~144
                                     A             4      A-4-4F           A-4-4R           ~124
Class-I-B1                           B             2      B-2-2F           B-2-2R           ~167
                                     B             3      B-3-1F           B-3-1R           ~146
                                     B             3      B-3-3F           B-3-3R           ~174
                                     B             4      B-4-2F           B-4-2R           ~135
                                     B             4      B-4-4F           B-4-4R           ~147
Class-I-B2                           B             2      B-2-1F           B-2-1R           ~147
                                     B             2      B-2-3F           B-2-3R           ~181
                                     B             3      B-3-2F           B-3-2R           ~123
                                     B             4      B-4-1F           B-4-1R           ~153
                                     B             4      B-4-3F           B-4-3R           ~135
Class-I-C1                           C             2      C-2-1F           C-2-1R           ~144
                                     C             2      C-2-2F           C2-2R            ~140
                                     C             2      C-2-4F           C-2-4R           ~153
                                     C             3      C-3-2F           C-3-2R           ~168
                                     C             3      C-3-4F           C-3-4R           ~162
                                     C             4      C-4-2F           C-4-2R           ~165
Class-I-C2                           C             2      C-2-3F           C-2-3R           ~177
                                     C             3      C-3-1F           C-3-1R           ~166
                                     C             3      C-3-3F           C-3-3R           ~154
                                     C             4      C-4-1F           C-4-1R           ~168
                                     C             4      C-4-3F           C-4-3R           ~175


4 Chapter 4

4.1 Results

4.1.1 Obstacles in the preliminary assay.

In order to optimize the preliminary assay for high-resolution HLA typing, the original NGS sequence reads for 6 patient samples were analyzed without the computer program, and our results were compared with the Sanger method results that we had obtained from the University of Michigan to see whether both sets of reads matched (Figure 4-1). This comparison showed that the information provided to the computer program may suggest incorrect alleles for some patients. We interpret these results to mean that our method may amplify not only the desired genes but also pseudogenes or other undesired HLA genes.

In detail, for each sample at each amplicon we evaluated the data as follows. First, we excluded sequence reads with very low amplification, namely fewer than 10 repeated reads, and all reads of an inaccurate size. Next, we focused on the sequence with the highest read count; very often this was the correct match to the Sanger read, and there could be up to two matching sequences if the sample was heterozygous. Finally, we looked at the read counts of the matching NGS sequences and considered the possibility of an allelic dropout, which means that we were able to find only one of two heterozygous sequences (Figure 4-2).

In summary, when I searched the BLAST website for the sequence reads that did not match the Sanger sequence reads, they appeared to match one of the class I HLA pseudogenes, namely HLA-H. We confirmed that such incorrect readings were present only in exon 4 of HLA-A. Repeated assays of the HLA-A exons confirmed that exons 2 and 3 were almost always correct, with the NGS sequence reads matching the Sanger sequence reads. On the other hand, we also found that some of the results were incorrect for HLA-B and HLA-C. For example, some HLA-C sequence reads matched those of HLA-A and HLA-B. Based on this analysis we concluded that the readouts of exon 4 for HLA-A, exons 2, 3, and 4 for HLA-B, and exons 2, 3, and 4 for HLA-C all need to be re-analyzed and improved for correct and precise readings (Table 4.1).


Figure 4-1. Percentage of NGS sequence reads agreement with Sanger sequence reads of 6 patients.


Figure 4-2. Excel snapshot: an example of the original data analysis.

An example of the original data analysis for the A2-2 and A2-3 assays. NGS sequence results that match the Sanger sequence results are highlighted in yellow, amplification of heterozygous sequences is highlighted in pink, and sequence reads that do not match the Sanger sequence reads are not highlighted.


Table 4.1. Average percentage agreement of the preliminary NGS assay sequence reads with Sanger sequence reads, per amplicon, among 6 patients

For each gene, the columns give the NGS assay, the mean % agreement with the Sanger method, and the standard deviation.

HLA-A assay   Mean %   SD       HLA-B assay   Mean %   SD       HLA-C assay   Mean %   SD
A2-1          97%      0.023%   B2-1          76%      0.346%   C2-1          0%       0%
A2-2          99%      0.012%   B2-2          60%      0.416%   C2-2          54%      0.396%
A2-3          93%      0.092%   B2-3          96%      0.026%   C2-3          0%       0%
A3-1          88%      0.124%   B3-1          37%      0.325%   C2-4          92%      0.089%
A3-2          99%      0.008%   B3-2          64%      0.30%    C3-1          0%       0%
A3-3          80%      0.195%   B3-3          97%      0.016%   C3-2          73%      0.35%
A3-4          97%      0.035%   B4-1          97%      0.05%    C3-3          60%      0.379%
A4-1          62%      0.31%    B4-2          97%      0.038%   C3-4          76%      0.385%
A4-2          35%      0.21%    B4-3          63%      0.19%    C4-1          82%      0.40%
A4-3          86%      0.119%   B4-4          0%       0%       C4-2          81%      0.399%
A4-4          54%      0.109%                                   C4-3          82%      0.40%


4.1.2 Redesigning broad primers.

The previously designed broad and target specific primers for HLA-A were evaluated by comparing each primer pair sequence with the consensus sequences of HLA-A, HLA-B, HLA-C, HLA-E, HLA-F, HLA-G, and HLA-H to see whether the primers could amplify genes other than the HLA target gene. Notably, most of the target specific primers could bind to more than one gene. Because the exonic regions of these genes are short and very similar, it is difficult to find multiple unique exonic regions for HLA-A; the broad primers were therefore designed in the intronic regions to increase the PCR amplification ratio of HLA-A while avoiding the pseudogenes (Figure 4-3). We propose that redesigning the unspecific broad primers is a more efficient approach than modifying the target specific primers.

Figure 4-3. Schematic illustration of broad primer and target specific primer design.
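The comparison of each primer against the consensus sequences of several genes can be sketched as a sliding-window mismatch count. This is a minimal sketch, assuming ungapped comparison in the forward orientation only (a reverse primer would first be reverse-complemented). The function names, the two-mismatch threshold, and the toy sequences are hypothetical; they are not real HLA consensus sequences and do not describe the tool actually used.

```python
def mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def best_binding_site(primer, template):
    """Return (start, mismatch_count) of the closest ungapped match, or None."""
    best = None
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        count = mismatches(primer, window)
        if best is None or count < best[1]:
            best = (start, count)
    return best

def cross_reactivity(primer, consensus_by_gene, max_mismatches=2):
    """Genes whose consensus has a site within max_mismatches of the primer."""
    hits = {}
    for gene, seq in consensus_by_gene.items():
        site = best_binding_site(primer, seq)
        if site is not None and site[1] <= max_mismatches:
            hits[gene] = site
    return hits

# Toy sequences for illustration only.
primer = "ACGTTGCA"
consensus = {
    "GENE_X": "TTTACGTTGCATTT",  # exact site
    "GENE_Y": "GGGACGTAGCAGGG",  # site with one mismatch
    "GENE_Z": "CCCCCCCCCCCCCC",  # no plausible site
}
hits = cross_reactivity(primer, consensus)
print(hits)
```

A primer that returns more than one gene under such a check is a candidate for amplifying non-target genes, which is the situation described above for most of the target specific primers.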

Nested PCR for HLA-A exon 4. The previously designed forward and reverse broad primers for exon 4 of HLA-A differentiated HLA-A from HLA-E, HLA-F, and HLA-G, but did not differentiate HLA-A from the HLA-H pseudogene. To improve this, five forward primers (HLA_H_F1, HLA_H_F2, HLA_H_F3, HLA_H_F4, and HLA_H_F5) and four reverse primers (HLA_H_R1, HLA_H_R2, HLA_H_R3, and HLA_H_R4) were designed to differentiate HLA-A from HLA-H (Table 4.2). Each broad primer pair was then tested in a PCR reaction on male and female DNA. Next, the specificity of each broad primer pair was evaluated by analyzing the PCR product on the Agilent 2100 Bioanalyzer (Figure 4-4 A-L). Based on the Agilent results, the broad primer pairs HLA_H_F1 and HLA_H_R3, as well as HLA_H_F3 and HLA_H_R3, were chosen because the product matched the expected size, hardly any primer dimer formed, and the target peak was narrow and sharp in both male and female DNA samples (Figure 4-4 A, B, E and F). In contrast, the other tested primer pairs did not produce a target amplicon of the expected size and were therefore not considered for further testing (Figure 4-4 C, D, G, H, K and L).

Table 4.2. Redesigned broad primers for HLA-A exon 4.

Forward Primer   Sequence (5'-3')
HLA_H_F1         GTTCTGTGCTCTCTTCCCCAT
HLA_H_F2         GAGTGGTTCCCTTTGACAC
HLA_H_F3         TTCTGTGCTCTCTTCCCCATC
HLA_H_F4         CTGTGCTCTCTTCCCCATC
HLA_H_F5         TGTGGGGGTCTGAGTCCAGCA

Reverse Primer   Sequence (5'-3')
HLA_H_R1         TGTGCCCTGTCTCATTACTGG
HLA_H_R2         CTGTGTGCCAGCACTTACTC
HLA_H_R3         AGGACAGATTTATCACCTTGAT
HLA_H_R4         CACACATTTCTGGAAACTTC
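As a rough illustration of how candidate primers such as those in Table 4.2 can be compared, the sketch below computes length, GC content, and a melting temperature estimate using the Wallace rule (2 °C per A/T plus 4 °C per G/C). The Wallace rule is a coarse approximation that is usually applied to short oligonucleotides, and the thesis does not state which criteria or software were used for primer design, so this choice is an assumption made for illustration only. The sequences are copied from Table 4.2; the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Wallace rule estimate in degrees Celsius: 2 per A/T plus 4 per G/C."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

# Sequences copied from Table 4.2 (the primers selected for nested PCR testing).
primers = {
    "HLA_H_F1": "GTTCTGTGCTCTCTTCCCCAT",
    "HLA_H_F3": "TTCTGTGCTCTCTTCCCCATC",
    "HLA_H_R3": "AGGACAGATTTATCACCTTGAT",
}

for name, seq in primers.items():
    print(f"{name}: length={len(seq)} nt, GC={gc_content(seq):.1%}, "
          f"Tm(Wallace)={wallace_tm(seq)} C")
```

Such numbers only help to flag badly mismatched pairs; the actual selection in this work was based on the Bioanalyzer results.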

Because of these similarities, it was very hard to design a single broad primer set that could clearly differentiate HLA-A from the other class I HLA genes and from the pseudogenes. Therefore, nested PCR was performed using the newly designed primer pairs as the outer primer set in the first round of PCR and the previously designed broad primer pair (HLA-A-F10 and HLA-A-R10) as the inner primer set in the second round (Figure 4-5). Nested PCR has been demonstrated to be more successful than diluting the product and re-amplifying it with the previously designed broad primers 81. The nested PCR results (Figure 4-6 B) showed that the new primer pair HLA_H_F3 and HLA_H_R3 was more efficient: there was no primer dimer, the target peak was narrow and sharp, the product matched the expected size, and only one product was amplified. In contrast, the primer pair HLA_H_F1 and HLA_H_R3 amplified more than one product, which indicates low specificity (Figure 4-6 A).
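The logic of the two-round nested PCR can be sketched in silico. This is a simplified model, assuming exact primer matching and a single binding site per primer; the primer and template sequences are invented toy sequences, and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def amplify(template, forward, reverse):
    """Return the amplicon bounded by the two primers, or None if either is absent."""
    start = template.find(forward)
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]

def nested_pcr(template, outer, inner):
    """First round with the outer pair, second round with the inner pair."""
    first_round = amplify(template, *outer)
    if first_round is None:
        return None
    return amplify(first_round, *inner)

# Toy primers (forward, reverse), both written 5'->3'.
outer = ("GATTACAG", "GGATCCAA")
inner = ("TTGCATCA", "CAGTCAGT")

# The inner region is shared by the target and a similar off-target template;
# only the target carries the outer primer sites.
core = inner[0] + "ACGTACGT" + reverse_complement(inner[1])
target = "AAAA" + outer[0] + "CC" + core + "GG" + reverse_complement(outer[1]) + "TTTT"
off_target = "AAAA" + "CCCCCCCC" + "CC" + core + "GG" + "CCCCCCCC" + "TTTT"

print(amplify(off_target, *inner) == core)        # inner pair alone amplifies the off-target
print(nested_pcr(target, outer, inner) == core)   # nested PCR still amplifies the target
print(nested_pcr(off_target, outer, inner))       # nested PCR excludes the off-target
```

The sketch shows why an inner primer pair that cannot distinguish two similar genes can still yield a specific product when the outer pair binds only the target.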


Figure 4-4. Agilent results of redesigned broad primers for HLA-A exon 4.


Figure 4-5. Schematic illustration of nested PCR reactions.

Figure 4-6. Agilent results for nested PCR products.

Redesigning the broad primers for HLA-B and HLA-C. The BLAST results showed that the interfering sequences found in HLA-B and HLA-C match other class I genes. For instance, the interfering sequence reads in HLA-B match HLA-A or HLA-C, and vice versa. Thus, redesigning the broad primers for exons 2, 3, and 4 of HLA-B and HLA-C is important to increase the amplification specificity of the target genes.


For HLA-B, multiple forward and reverse broad primers were designed for exons 2, 3, and 4 (Table 4.3). Figure 4-7 shows the PCR products obtained from female DNA with the redesigned broad primer pairs for HLA-B exon 2. The primer pair HLA_B_2Re_F1 and HLA_B_2Re_R1 was chosen because the product matched the expected size, there were no primer dimers, and the target peak was narrow and sharp (Figure 4-7 A). In contrast, the Agilent result for the primer pair HLA_B_2Re_F2 and HLA_B_2Re_R1 showed some homo-dimer, which could reduce the efficiency of amplifying the target gene (Figure 4-7 B). For HLA-B exon 3, four redesigned broad primer pairs were tested, and the previously designed forward primer B-4mod together with the newly designed reverse primer HLA_B_3Re_R2 was chosen as the primer pair for this exon because the product matched the expected size, there was no primer dimer, only one product was amplified, and the target peak was narrow and sharp (Figure 4-8 C). Although the Agilent results of the other primer pairs looked good, each of their reverse primers had a drawback (Figure 4-8 A, B, and D): HLA_B_3Re_R1 differentiates HLA-B from HLA-F at the 12th base, which could lower the primer specificity; HLA_B_3Re_R3 differentiates HLA-B from HLA-C only at the ultimate 3' nucleotide; and HLA_B_3Re_R4 lacks nucleotide diversity. For HLA-B exon 4, six redesigned broad primer pairs were tested on female DNA. Based on the Agilent results of the PCR products, the primer pair HLA_B_4Re_F1 and HLA_B_4Re_R2 appeared to be the best choice because the product matched the expected size, there was no primer dimer, only one product was amplified, and the target peak was narrow and sharp (Figure 4-9 F). In contrast, the primer pairs shown in Figure 4-9 A and C formed primer dimers, which reduces the efficiency of amplifying the target gene.
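The position of the bases that distinguish a primer from a non-target gene, as discussed above for the HLA-B exon 3 reverse primers, can be reported relative to the 3' end of the primer. The sketch below is illustrative only: it assumes that the primer and the corresponding off-target site are aligned without gaps and written 5'->3', and the five-base "3' window" is a hypothetical rule of thumb, not a criterion stated in this thesis. The sequences are toy sequences.

```python
def mismatch_distances_from_3prime(primer, off_target_site):
    """1-based distances from the primer's 3' end at which the two sequences differ."""
    if len(primer) != len(off_target_site):
        raise ValueError("sequences must have equal length")
    n = len(primer)
    return [n - i for i, (p, s) in enumerate(zip(primer, off_target_site)) if p != s]

def summarize_discrimination(primer, off_target_site, window=5):
    """Count distinguishing bases overall and within the 3'-terminal window."""
    distances = mismatch_distances_from_3prime(primer, off_target_site)
    return {
        "total_mismatches": len(distances),
        "mismatches_in_3prime_window": sum(1 for d in distances if d <= window),
        "closest_to_3prime": min(distances) if distances else None,
    }

# Toy example: one site differs only at the last base, the other far from the 3' end.
primer = "ACGTACGTAC"
site_last_base = "ACGTACGTAG"
site_internal = "ACCTACGTAC"

print(summarize_discrimination(primer, site_last_base))
print(summarize_discrimination(primer, site_internal))
```

A summary like this makes it easy to see whether a primer relies on a single distinguishing base and where that base lies.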

Table 4.3. Redesigned broad primers for HLA-B exon 2, 3, and 4.

Exon  Forward Primer  Sequence (5'-3')
2     HLA_B_2Re_F1    AGTGCGGGTCGGGAGGGAAAT
2     HLA_B_2Re_F2    CGGACTCAGAGTCTCCT
3     HLA_B_3Re_F1    CGGTTTCATTTTCAGTTGAG
4     HLA_B_4Re_F1    GTTCTCTGCCTCACACTCAG
4     HLA_B_4Re_F2    CCAGCACTTCTGAGTCACTTTAC

Exon  Reverse Primer  Sequence (5'-3')
2     HLA_B_2Re_R1    TGAGGCCAAAATCCCCGC
3     HLA_B_3Re_R1    CTCTGATTCCAGCACTTCTG
3     HLA_B_3Re_R2    GTGTGTTTGGGGCTCTGATTCC
3     HLA_B_3Re_R3    GTCGCCCTCCGTTGAATGGA
3     HLA_B_3Re_R4    GAAGAGGAGGAAAATGGGATC
4     HLA_B_4Re_R1    TCTTCCCCTCCTTTCCCAG
4     HLA_B_4Re_R2    GTCCACCATCCCCATCGTG


Figure 4-7. Agilent results of redesigned primers for HLA-B exon 2.


Figure 4-8. Agilent results of redesigned broad primers for HLA-B exon 3.


Figure 4-9. Agilent result for redesigned broad primers of HLA-B exon 4.


Additionally, for HLA-C, multiple forward and reverse broad primers were designed for exons 2, 3, and 4 (Table 4.4). For HLA-C exon 2, three redesigned broad primer pairs were tested on female DNA; based on the Agilent results, the forward primer HLA_C_2Re_F2 and the previously designed reverse primer PreAmp_C_R9 were chosen because the product matched the expected size, there was no primer dimer, and the target peak was narrow and sharp (Figure 4-10 E and F). For HLA-C exon 3, two redesigned broad primer pairs were tested on female DNA; based on the Agilent results, the primer pair HLA_C_3Re_F2 and HLA_C_3Re_R1 was chosen because the product matched the expected size, there was no primer dimer, and the target peak was narrow and sharp in both male and female DNA samples (Figure 4-11 C and D). In contrast, the primer pair HLA_C_3Re_F1 and HLA_C_3Re_R1 gave a product longer than expected (Figure 4-11 A and B). For HLA-C exon 4, nine redesigned broad primer pairs were tested on female DNA; based on the Agilent results, the forward primer HLA_C_4Re_F2 and the reverse primer HLA_C_4Re_R2 were chosen because the product matched the expected size, no primer dimers formed, and the target peak was narrow and sharp (Figure 4-12 C).


Table 4.4. Redesigned broad primers for HLA-C exon 2, 3, and 4.

Exon  Forward Primer  Sequence (5'-3')
2     HLA_C_2Re_F1    GGCCTGTGAGTGCGGGGTT
2     HLA_C_2Re_F2    CTGTGAGTGCGGGGTTG
3     HLA_C_3Re_F1    GACACAGAACTACAAGCG
3     HLA_C_3Re_F2    GTCTGAGATCCACCCCAAGG
4     HLA_C_4Re_F1    TGACCACTTTGACCACTG
4     HLA_C_4Re_F2    GACCAGAAGTCGCTGTTCCTCC
4     HLA_C_4Re_F3    TTCTCAGGATGGTCACATGGGC
4     HLA_C_4Re_F4    CCTTTGACCACTTTGACCAC

Exon  Reverse Primer  Sequence (5'-3')
3     HLA_C_3Re_R1    GGCTGCTGACCTTTCTCTC
4     HLA_C_4Re_R1    TGGTTGTCCTAGCTGTCC
4     HLA_C_4Re_R2    AGTTTCAAGCCCCAGGTAG


Figure 4-10. Agilent result for redesigned broad primers of HLA-C exon 2.


Figure 4-11. Agilent result for redesigned broad primers of HLA-C exon 3.


Figure 4-12. Agilent result for redesigned broad primers of HLA-C exon 4.


4.1.3 Performance testing of competitive amplicon library preparation.

Performance with gDNA. Twenty-three gDNA patient samples from the University of Michigan were used to test the assay with the newly designed broad primers. Six new pools of the previously designed target specific primers for class I HLA genes (class-I-A1, class-I-A2, class-I-B1, class-I-B2, class-I-C1, and class-I-C2) were used to amplify multiple regions of each of exons 2, 3, and 4 from the broad primer PCR products. Figure 4-13 shows the Agilent results for five randomly spot-checked samples of the class-I-A1 set, which produced more than one product of approximately the expected size of ~147 bp. Figure 4-14 shows a similar pattern for the class-I-A2 set, with ~147 bp products. Figure 4-15 shows the Agilent results of five randomly spot-checked samples for the class-I-B1 set, with products of the expected size of ~154 bp. Figure 4-16 presents the Agilent results of five randomly spot-checked samples for the class-I-B2 set, also with amplification products of the expected size of ~147 bp. Figure 4-17 shows samples for the class-I-C1 set, with products of ~158 bp. Similarly, Figure 4-18 shows the class-I-C2 set, which produced ~168 bp fragments. These results show that all target primer sets amplified products of the expected sizes.
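The spot check of product sizes can be expressed as a simple comparison against the expected sizes given above. In the sketch below, the expected sizes are taken from the text, while the 10% tolerance, the observed peak values, and the function name are hypothetical and chosen only for illustration.

```python
# Approximate expected product sizes (bp) per target specific primer pool, from the text.
EXPECTED_BP = {
    "class-I-A1": 147,
    "class-I-A2": 147,
    "class-I-B1": 154,
    "class-I-B2": 147,
    "class-I-C1": 158,
    "class-I-C2": 168,
}

def size_ok(pool, observed_bp, tolerance=0.10):
    """True if the observed peak lies within a relative tolerance of the expected size."""
    expected = EXPECTED_BP[pool]
    return abs(observed_bp - expected) <= tolerance * expected

# Hypothetical observed peaks for one spot-checked sample.
observed = {"class-I-A1": 150, "class-I-B1": 152, "class-I-C2": 120}
report = {pool: size_ok(pool, bp) for pool, bp in observed.items()}
print(report)
```

A peak flagged as out of range would prompt a closer look at the electropherogram rather than an automatic rejection.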


Figure 4-13. Agilent results of class-I-A1 for spot checked samples.


Figure 4-14. Agilent results of class-I-A2 for spot checked samples.


Figure 4-15. Agilent results of class-I-B1 for spot checked samples.

Figure 4-16. Agilent results of class-I-B2 for spot checked samples.

Figure 4-17. Agilent results of class-I-C1 for spot checked samples.

Figure 4-18. Agilent results of class-I-C2 for spot checked samples.

Platform and barcode PCR reactions. After the amplification by the target specific primers was spot checked, barcode primer PCR reactions were performed for each of the 23 patient samples to differentiate one patient from another. Figure 4-19 shows the Agilent results for 5 spot-checked samples of the barcode PCR products. The platform PCR reaction was then performed for each of the 23 patient samples. Figure 4-20 shows the results for the same 5 spot-checked samples, with an increase in size. Finally, after the library preparation had been evaluated and amplification confirmed, the final amplified sample was sent for sequencing on the Illumina MiSeq system.

Figure 4-19. Agilent results for barcode PCR reaction products.
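The purpose of the barcode step, assigning each sequence read back to its patient, can be sketched as follows. This is a minimal sketch, assuming that every read starts with a fixed-length barcode and that barcodes must match exactly; the barcode sequences, patient labels, and function name are hypothetical and do not reflect the actual barcode design of this assay.

```python
def demultiplex(reads, barcode_to_patient):
    """Group reads by their leading barcode and strip the barcode from each read."""
    lengths = {len(barcode) for barcode in barcode_to_patient}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    k = lengths.pop()
    by_patient = {patient: [] for patient in barcode_to_patient.values()}
    unassigned = []
    for read in reads:
        patient = barcode_to_patient.get(read[:k])
        if patient is None:
            unassigned.append(read)
        else:
            by_patient[patient].append(read[k:])
    return by_patient, unassigned

# Hypothetical barcodes and reads.
barcodes = {"AACC": "patient_1", "GGTT": "patient_2"}
reads = ["AACCACGTACGT", "GGTTTTGCATGC", "AACCACGTACGA", "CCCCACGTACGT"]

by_patient, unassigned = demultiplex(reads, barcodes)
print(by_patient)
print(unassigned)
```

Reads with an unrecognized barcode are kept separately so that they can be inspected instead of being silently discarded.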


Figure 4-20. Agilent results for platform PCR reaction products.


4.1.4 Sequencing results for 6 patients

Agreement of NGS sequences with Sanger sequences. The same 6 patients that were analyzed in the preliminary assay were chosen to analyze the sequence results of the new assay. Figure 4-21 compares the mean percentage NGS/Sanger agreement of the old and new assays for HLA-A exon 4. As shown by the summary results, the percentage agreement increased for A4-1, A4-2, and A4-4 and remained essentially unchanged for A4-3 (Tables 4.1 and 4.5). Moreover, the HLA-H sequences that interfered in the old assay became barely visible in the new assay, which is a clear indication of highly specific amplification. These results confirm that the revised assay is an improvement.

Figure 4-21. Comparison of NGS/Sanger mean percentage agreement between the old and new assay for HLA-A exon 4. The difference in mean agreement between the old and new assays was not statistically significant for A4-1 (p=0.21) and A4-3 (p=0.75), but was significant for A4-2 (p=0.05) and A4-4 (p=0.01).
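The thesis does not state which statistical test produced the p-values reported for the old versus new assay comparisons. As one possible way to compare paired per-patient agreement values, the sketch below implements an exact two-sided sign-flip permutation test using only the standard library. This is a substitute technique chosen for illustration, not the method used in this study, and the agreement values are hypothetical.

```python
from itertools import product

def paired_permutation_p(old, new):
    """Exact two-sided sign-flip permutation test on paired differences."""
    if len(old) != len(new):
        raise ValueError("old and new must have the same number of patients")
    diffs = [n - o for o, n in zip(old, new)]
    observed = abs(sum(diffs))
    total = 0
    extreme = 0
    # Under the null hypothesis each difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(s * d for s, d in zip(signs, diffs)))
        if stat >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical percent agreement for 6 patients on one amplicon.
old_assay = [30, 35, 40, 20, 45, 40]
new_assay = [60, 55, 70, 50, 65, 60]
p_value = paired_permutation_p(old_assay, new_assay)
print(f"p = {p_value:.5f}")
```

With only 6 paired observations, the smallest p-value this exact test can return is 2/64, so it is conservative for small samples.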


Figure 4-22 A-C shows that for the HLA-B assays there was no significant difference in NGS/Sanger agreement between the old and new assays for B2-1 and B2-2 (exon 2), B3-2 (exon 3), and B4-1, B4-2, and B4-3 (exon 4). However, for the target specific primer B3-1 of HLA-B exon 3 (Figure 4-22 B) and primer B4-4 of HLA-B exon 4 (Figure 4-22 C), the NGS/Sanger agreement increased in the revised assay. In contrast, the agreement for B2-3 and B3-3 was lower in the new assay (Figure 4-22 A and B; Tables 4.1 and 4.5). It is possible that some interfering sequences from HLA-C fragments were still present. Since the Agilent results for the redesigned HLA-B broad primers showed amplification of a single product, the interfering sequences could be due to low specificity of the target specific primers.

In HLA-C exon 2, the two assays C2-1 and C2-3 did not show any amplification in either the old or the new assay. For the target specific primer C2-2 the NGS/Sanger agreement increased, while for C2-4 it decreased slightly (Figure 4-23 A). Notably, in HLA-C exon 3, the target specific primer C3-1 did not show any amplification in the old assay, whereas in the new assay it showed amplification with 53% NGS/Sanger agreement (Figure 4-23 B). There was no significant difference between the old and new assays for HLA-C exon 4, as shown by the NGS/Sanger mean agreement (Figure 4-23 C). However, in the old assay the interfering sequence reads in HLA-C matched both HLA-A and HLA-B, whereas in the new assay the interfering sequences matched only HLA-B. Overall, these results indicate that although the newly designed broad primers for HLA-C excluded the amplification of HLA-A, either they still amplified HLA-B, as the previously designed broad primers did, or the target specific primers need to be modified, and they were therefore modified. In fact, a nested PCR reaction was needed to exclude the amplification of HLA-B. Table 4.5 summarizes the NGS/Sanger mean percentage agreement of each amplicon for all HLA class I genes.


Figure 4-22. Comparison of NGS/Sanger mean percentage agreement between the old and new assay for HLA-B exon 2, 3, and 4. (A) There was a significant difference in mean agreement between the old and new assay for B2-3 (p=0.02), but no statistically significant difference for B2-1 (p=0.26) and B2-2 (p=0.95). (B) There was a significant difference in mean agreement between the old and new assay for B3-1 (p=0.05) and B3-3 (p=0.004), but no statistically significant difference for B3-2 (p=0.60). (C) There was no significant difference in mean agreement between the old and new assay for B4-1 (p=0.06), B4-2 (p=0.07), and B4-3 (p=0.50), but there was a statistically significant difference for B4-4 (p=2.15E-09).

Figure 4-23. Comparison of NGS/Sanger mean percentage agreement between the old and new assay for HLA-C exon 2, 3, and 4. (A) There was no significant difference in mean agreement between the old and new assay for C2-2 (p=0.16) and C2-4 (p=0.06), and there was no amplification in either the old or the new assay for C2-1 and C2-3. (B) There was a significant difference in mean agreement between the old and new assay for C3-1 (p=0.02), and no statistically significant difference for C3-2 (p=0.45), C3-3 (p=0.66), and C3-4 (p=0.66). (C) There was no significant difference in mean agreement between the old and new assay for C4-1 (p=0.67), C4-2 (p=0.84), and C4-3 (p=0.60).


Table 4.5. Average percentage agreement of NGS sequence reads with Sanger sequence reads per amplicon among 6 patients in the modified NGS assay.

Mean % = mean of % agreement with Sanger method; SD = standard deviation.

HLA-A                      HLA-B                      HLA-C
Assay  Mean %  SD          Assay  Mean %  SD          Assay  Mean %  SD
A4-1   73%     0.288%      B2-1   47%     0.37%       C2-1   0%      0%
A4-2   60%     0.267%      B2-2   61%     0.18%       C2-2   68%     0.258%
A4-3   84%     0.175%      B2-3   46%     0.37%       C2-3   0%      0%
A4-4   83%     0.208%      B3-1   73%     0.2%        C2-4   81%     0.09%
                           B3-2   57%     0.3%        C3-1   53%     0.4%
                           B3-3   84%     0.059%      C3-2   53%     0.39%
                           B4-1   63%     0.335%      C3-3   49%     0.3%
                           B4-2   91%     0.03%       C3-4   65%     0.4%
                           B4-3   54%     0.29%       C4-1   74%     0.17%
                           B4-4   85%     0.02%       C4-2   85%     0.1%
                                                      C4-3   75%     0.09%


4.2 Discussion

As explained in detail in the literature survey, many studies illustrate the importance of matching HLA alleles in transplantation 63,64. Some papers also demonstrate the positive effect of high-resolution typing focused on 4-digit HLA alleles 63,64. A more precise approach suggests the importance of typing all HLA antigens, namely HLA-A, -B, -C, -DR, -DQB/DQA, and -DPB/DPA 54,82. Rapidly mounting evidence demonstrates that each of these HLA antigens may induce allograft rejection 54,82. At the same time, several important details and circumstances need to be clarified. The contribution of different mismatched alleles to rejection or graft-versus-host disease needs to be re-defined 63,64.

There are occasional clinical data suggesting that some HLAs are more dangerous than others, while some may be not dangerous or even beneficial for long-term allograft survival. Although the Sanger sequencing method is the gold standard for sequencing, it is expensive and time consuming. NGS technologies such as the Illumina sequencing method may instead provide a high-quality and significantly less expensive alternative. In fact, NGS is faster and costs only a fraction of the $4,000 needed for the Sanger method 83. The present study aims at improving the existing assay so that 100% of samples are identified at high-resolution 4-digit HLA typing. An initial objective was to identify problems in the assay of each class I target gene and to address them by: 1) adjusting the broad primers; and 2) re-designing the target specific primers. The main effort was placed on identifying the interfering sequence reads, such as those from pseudogenes and other unwanted HLA genes. Our plan was to identify problematic/interfering sequences and to determine whether they can be excluded from the proper HLA sequences by: 1) redesigning the sets of primers; and 2) applying rules to, or “teaching”, the computer program how to exclude them. In particular, the redesign focused on the broad primers of exon 4 for HLA-A, exons 2, 3, and 4 for HLA-B, and exons 2, 3, and 4 for HLA-C. In other words, most of the effort was directed at improving the amplification specificity of the target genes.

It is clear from the Agilent results that the nested PCR reaction for HLA-A exon 4 raised the amplification specificity, as documented by the amplification of only one product (Figure 4-6 B). In contrast, as shown in Figure 4-24, the previously designed broad primers tested on male and female gDNA amplified more than one product. Overall, our focus is on establishing an improved method for our NGS sequencing assay to precisely identify 4-digit HLAs.

Figure 4-24. Agilent results of previously designed broad primer pair for HLA-A exon 4.


In addition, the redesigned broad primers for HLA-B exon 2 (Figure 4-7 A) also amplified only one product, unlike the previously designed broad primers (Figure 4-25). Similarly, the newly designed broad primers for HLA-B exon 3 (Figure 4-8 C) and HLA-B exon 4 (Figure 4-9 F) amplified only one product without primer dimers, in contrast to the previously designed broad primers, which gave more products (Figure 4-26 and Figure 4-27). Furthermore, the Agilent results for the redesigned broad primers of HLA-C exon 2 (Figure 4-10 E and F), HLA-C exon 3 (Figure 4-11 C and D), and HLA-C exon 4 (Figure 4-12 C) showed more specific amplification than the previously designed broad primers (Figure 4-28, Figure 4-29, and Figure 4-30). Consequently, the newly designed broad primers appear promising for increasing the amplification specificity of the target genes. Twenty-three samples were sequenced after preparing the library with the newly designed broad primers, and 6 patients were chosen to compare the sequence results of the old and new assays.


Figure 4-25. Agilent results of previously designed broad primer pair for HLA-B exon 2.

Figure 4-26. Agilent results of previously designed broad primer pair for HLA-B exon 3.


Figure 4-27. Agilent results of previously designed broad primer pair for HLA-B exon 4.

Figure 4-28. Agilent results of previously designed broad primer pair for HLA-C exon 2.


Figure 4-29. Agilent results of previously designed broad primer pair for HLA-C exon 3.

Figure 4-30. Agilent results of previously designed broad primer pair for HLA-C exon 4.


This study showed the benefit of redesigning the broad primers, as they filtered out most of the interfering sequence reads that had previously complicated the assay results. Although some interfering HLA-C sequence reads remain in the HLA-B amplification and some interfering HLA-B sequence reads remain in the HLA-C amplification, there was a clear improvement in the readouts of class I HLA alleles for all 6 patients (Table 4.6). With the new assay, the computer program was able to identify the correct 4-digit high-resolution typing of HLA-A, HLA-B, and HLA-C for one patient (P.16), a dramatic improvement over the old assay. In fact, for this patient the old assay identified only 2-digit low-resolution typing, and only for one allele (Table 4.6). This shows that the new modification helped to improve the identification of class I HLAs. At the same time, we suggest that more work is needed to apply new rules to the computer program to increase the accuracy of HLA allele identification, and possibly to modify the target specific primers. We now plan to focus on the computer program to improve the selection of the appropriate 4-digit HLA typing for all tested patients. The 4-digit HLAs are present in the list of selected HLA sequences for each patient together with the proper 2-digit HLA, but the filtering out of irrelevant HLA sequences is not yet working for 4 out of 6 patients. If necessary, we plan to further adjust the target specific primers.

We have found at least six reports on the development of methods for HLA typing using NGS technology. One of them used in-solution targeted capture of class I (HLA-A, B and C) as well as class II (HLA-DRB1, DQA1, DQB1, DPA1 and DPB1) genes. Its algorithm assigns alleles at “three-field” resolution and was validated on 357 commercial samples 42. Another report described an NGS method able to type four HLA loci (HLA-A, B, C and DRB1), producing up to 6-digit or even 8-digit high-resolution, unambiguous, phased HLA typing 46. NGS on the 454 FLX Titanium platform was developed to read class I HLA loci from four worldwide populations with 96.4% accuracy at 4-digit resolution 84. A fully integrated, automated HLA typing workflow showed 97.3% initial sequencing typing, with a mean ambiguity reduction of 93.5% for the analyzed loci 85. Single Molecule Real Time DNA sequencing technology was reported to type seven samples for HLA-A, B and C 86. Finally, one more system typed seven class I and II HLA loci for 24 or 48 individuals in a single run using sample-specific internal sequence tags. These authors incorporated an HLA typing software application from Conexio Genomics that is able to assign HLA genotypes for seven loci (HLA-A, B, C, DRB1, DQA1, DQB1, and DPB1). These reported methods show the recent dramatic expansion in the development of much-needed applications for 4-digit or even higher-resolution HLA typing to avoid ambiguities.

Our NGS method uses a unique technique with internal standards to prevent allelic dropouts and to provide an internal control over amplification performance 79. The present work improved the performance of our NGS method for 4-digit HLA typing. Continuous efforts will be made to achieve 100% accuracy in identifying even the rarest HLAs.


Table 4.6. Allelic type results comparison between Sanger method, old assay, and the new assay.

Method   Locus  P.4               P.5               P.6               P.16              P.18              P.20
Sanger   HLA-A  A*02:01, A*03:01  A*24:02, A*32:01  A*02:05, A*30:02  A*02:01, A*02:06  A*02:06, A*32:01  A*11:01, A*34:01
Sanger   HLA-B  B*07:02, B*44:02  B*44:03, B*51:01  B*27:05, B*50:01  B*39:02, B*39:05  B*14:01, B*52:01  B*15:02, B*15:21
Sanger   HLA-C  C*05:01, C*07:02  C*04:01, C*14:02  C*01:02, C*06:02  C*07:02           C*04:01, C*08:02  C*04:03, C*08:01
Old NGS  HLA-A  A*02, A*03        A*32              A*30              A*02              A*32              A*11
Old NGS  HLA-B  B*07              B*51:01           B*50              B*39              indeterminate     B*15
Old NGS  HLA-C  C*05, C*07        C*04              C*01              C*07              C*04              C*04
New NGS  HLA-A  A*02, A*03        A*32              A*30              A*02:01, A*02:06  A*32              A*11
New NGS  HLA-B  B*07              B*51              B*49, B*50        B*39:02, B*39:05  B*14              B*15:02, B*15:21
New NGS  HLA-C  C*05, C*07        C*04, C*14        C*01, C*06        C*07:02           C*04              C*04:03

In the original table, 4-digit HLA typing results are highlighted in yellow and newly identified 2-digit HLA typing results are highlighted in gray.
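The comparison in Table 4.6 amounts to asking, for each NGS call, whether it agrees with a Sanger allele at the first field only (2-digit) or at the first two fields (4-digit). The sketch below illustrates that comparison using allele names from the table; the function names and the three-level classification are hypothetical and are not part of the computer program used in this study.

```python
def fields(allele):
    """Split an allele name such as 'A*02:01' into ('A', ['02', '01'])."""
    locus, _, rest = allele.partition("*")
    return locus, rest.split(":")

def match_level(ngs_call, sanger_allele):
    """Return '4-digit', '2-digit', or 'none' for one NGS call against one Sanger allele."""
    ngs_locus, ngs_fields = fields(ngs_call)
    ref_locus, ref_fields = fields(sanger_allele)
    if ngs_locus != ref_locus or ngs_fields[0] != ref_fields[0]:
        return "none"
    if len(ngs_fields) >= 2 and len(ref_fields) >= 2 and ngs_fields[1] == ref_fields[1]:
        return "4-digit"
    # The first field agrees; the second is either missing or different.
    return "2-digit"

def best_match(ngs_call, sanger_alleles):
    """Best agreement level of an NGS call against a patient's Sanger alleles."""
    order = {"none": 0, "2-digit": 1, "4-digit": 2}
    return max((match_level(ngs_call, a) for a in sanger_alleles), key=order.get)

# Patient P.16, HLA-A (values from Table 4.6).
sanger_p16_a = ["A*02:01", "A*02:06"]
print(best_match("A*02", sanger_p16_a))     # old assay call
print(best_match("A*02:06", sanger_p16_a))  # new assay call

# Patient P.6, HLA-B: the new assay call B*49 has no Sanger counterpart.
print(best_match("B*49", ["B*27:05", "B*50:01"]))
```

Calls that agree with no Sanger allele, such as B*49 for P.6, are the irrelevant sequences that the program still needs to filter out.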


Chapter 5

5.1 Conclusion

This study was designed to define problems with our already developed NGS method for 4-digit high-resolution HLA typing and to find solutions to them. We identified the following problems: 1) some broad primers amplified pseudogenes and other HLA genes in addition to the targeted gene fragments; 2) some target specific primers were also not sufficiently exclusive and specific; and 3) the computer program needs reprogramming to eliminate non-targeted HLAs. To address some of these problems, we redesigned the broad primers for HLA-A exon 4, HLA-B exons 2, 3, and 4, and HLA-C exons 2, 3, and 4. The results confirmed that the redesigned broad primers performed much better, amplifying fragments of these exons more specifically. In addition, the nested PCR for HLA-A exon 4 increased the mean percentage agreement between the NGS and Sanger methods. Our study also revealed that the redesigned broad primers for HLA-B and HLA-C need to be further improved, alone or in combination with their target specific primers, to exclude the amplification of non-targeted genes. We are also considering the use of nested PCR to filter out the non-targeted genes. We plan to continue improving our NGS method using the approach developed in the current work. In particular, we expect major improvements from changes to the computer program: “teaching” it how to distinguish sequences of pseudogenes and other unwanted HLA genes and to eliminate them from the identification of targeted HLAs. Moreover, it would be interesting to analyze the base call accuracy between NGS and Sanger, as well as the HLA allele calling accuracy, in order to apply a threshold in the computer program that would help exclude incorrect sequence reads.
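One way such a filtering rule with a threshold could look is sketched below: each read is assigned to the closest reference sequence and kept only if that reference is the target gene and the number of mismatches is within a threshold. This is a hypothetical sketch of the proposed future work, not the existing computer program. It assumes reads and references of equal length, and the gene labels are attached to invented toy sequences rather than real HLA-A or HLA-H sequences.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)

def classify_read(read, references):
    """Return (gene, mismatches) for the reference closest to the read."""
    gene = min(references, key=lambda g: hamming(read, references[g]))
    return gene, hamming(read, references[gene])

def filter_reads(reads, references, target_gene, max_mismatches=1):
    """Keep reads whose closest reference is the target gene within the threshold."""
    kept, rejected = [], []
    for read in reads:
        gene, distance = classify_read(read, references)
        if gene == target_gene and distance <= max_mismatches:
            kept.append(read)
        else:
            rejected.append((read, gene))
    return kept, rejected

# Toy references: labels are for illustration, sequences are invented.
references = {"HLA-A": "ACGTACGTAC", "HLA-H": "ACGTTCGTTC"}
reads = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTTC"]

kept, rejected = filter_reads(reads, references, target_gene="HLA-A")
print(kept)
print(rejected)
```

The mismatch threshold would have to be chosen from the base call and allele calling accuracy analysis proposed above, since a threshold that is too strict would also discard reads from rare or novel alleles.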


References

1. Bauer DC, Zadoorian A, Wilson LOW, Alliance MGH, Thorne NP. Evaluation of computational programs to predict HLA genotypes from genomic sequencing data. Briefings in Bioinformatics. 2016:1–9.

2. Suthanthiran M, Strom TB. Medical progress - renal-transplantation. New England Journal of Medicine. 1994;331(6):365-376.

3. Porter R, Stark T. Knife to the heart - the story of transplant surgery. TLS-The Times Literary Supplement. 1996(4873):36-36.

4. Matevossian E, Kern H, Huser N, et al. Surgeon Yurii Voronoy (1895-1961) - a pioneer in the history of clinical transplantation: in Memoriam at the 75th Anniversary of the First Human Kidney Transplantation. Transplant International. 2009;22(12):1132-1139.

5. Cornell LD, Smith RN, Colvin RB. Kidney transplantation: Mechanisms of rejection and acceptance. Annual Review of Pathology-Mechanisms of Disease. Vol 3. Palo Alto: Annual Reviews; 2008.

6. Merrill JP, Murray JE, Harrison JH, Guild WR. Successful homotransplantation of the human kidney between identical twins. Journal of Urology. 2002;167(2):830-830.

7. Murray JE, Merrill JP, Harrison JH. Renal homotransplantation in identical twins. Journal of the American Society of Nephrology. 2001;12(1):201-204.


8. Starzl TE, Iwatsuki S, Klintmalm G, et al. The use of cyclosporin-A and prednisone in cadaver kidney-transplantation. Surgery Gynecology & Obstetrics. 1980;151(1):17-26.

9. Starzl TE, Klintmalm GBG, Porter KA, Iwatsuki S, Schroter GPJ. Liver-transplantation with use of cyclosporin-A and prednisone. New England Journal of Medicine. 1981;305(5):266-269.

10. Reitz BA, Wallwork JL, Hunt SA, et al. Heart-lung transplantation - successful therapy for patients with pulmonary vascular-disease. New England Journal of Medicine. 1982;306(10):557-564.

11. Griffith BP, Hardesty RL, Deeb GM, Starzl TE, Bahnson HT. Cardiac transplantation with cyclosporin-A and prednisone. Annals of Surgery. 1982;196(3):324-329.

12. Cooper JD. The evolution of techniques and indications for lung transplantation. Annals of Surgery. 1990;212(3):249-256.

13. Petechuk D. Organ Transplantation. Westport, Connecticut London: Greenwood Publishing Group; 2006.

14. Patel R, Terasaki PI. Significance of positive crossmatch test in kidney transplantation. New England Journal of Medicine. 1969;280(14):735-&.

15. Kimball PM, Baker MA, Wagner MB, King A. Surveillance of alloantibodies after transplantation identifies the risk of chronic rejection. Kidney International. 2011;79(10):1131-1137.


16. Cardarelli F, Pascual M, Tolkoff-Rubin N, et al. Prevalence and significance of anti-HLA and donor-specific antibodies long-term after renal transplantation. Transplant International. 2005;18(5):532-540.

17. Anasetti C, Amos D, Beatty PG, et al. Effect of HLA compatibility on engraftment of bone-marrow transplants in patients with leukemia or lymphoma. New England Journal of Medicine. 1989;320(4):197-204.

18. Nymann T, Hathaway DK, Shokouh-Amiri MH, et al. Patterns of acute rejection in portal-enteric versus systemic-bladder pancreas-kidney transplantation. Clinical Transplantation. 1998;12(3):175-183.

19. Meier-Kriesche HU, Schold JD, Srinivas TR, Kaplan B. Lack of improvement in renal allograft survival despite a marked decrease in acute rejection rates over the most recent era. American Journal of Transplantation. 2004;4(3):378-383.

20. Ivanyi B. Transplant capillaropathy and transplant glomerulopathy: ultrastructural markers of chronic renal allograft rejection. Nephrology Dialysis Transplantation. 2003;18(4):655-660.

21. Bray RA, Nolen JDL, Larsen C, et al. Transplanting the highly sensitized patient: The Emory algorithm. American Journal of Transplantation. 2006;6(10):2307-2315.

22. Mahdi B. A glow of HLA typing in organ transplantation. Clinical and Translational Medicine. 2013;2(1):6.

23. Pratt JR, Basheer SA, Sacks SH. Local synthesis of complement component C3 regulates acute renal transplant rejection. Nature Medicine. 2002;8(6):582-587.


24. Creighton TE. Proteins: Structures and Molecular Properties. 2nd ed. New York: W.H. Freeman and Company; 1993.

25. Warren RP, Storb R, Weiden PL, Su PJ, Thomas ED. Lymphocyte-mediated cyto-toxicity and antibody-dependent cell-mediated cyto-toxicity in patients with aplastic-anemia - distinguishing transfusion-induced sensitization from possible immune-mediated aplastic-anemia. Transplantation Proceedings. 1981;13(1):245-247.

26. Soulillou JP, Peyrat MA, Guenel J. Association between treatment-resistant kidney-allograft rejection and post-transplant appearance of antibodies to donor B-Lymphocyte alloantigens. Lancet. 1978;1(8060):354-356.

27. Mack SJ. A gene feature enumeration approach for describing HLA allele polymorphism. Human Immunology. 2015;76(12):975-981.

28. Trowsdale J, Knight JC. Major histocompatibility complex genomics and human disease. In: Chakravarti A, Green E. Annual review of genomics and human genetics. Vol 14. Palo Alto: Annual Reviews; 2013.

29. Horton R, Wilming L, Rand V, et al. Gene map of the extended human MHC. Nature Reviews Genetics. 2004;5(12):889-899.

30. Klein J, Sato A. Advances in immunology - The HLA system - First of two parts. New England Journal of Medicine. 2000;343(10):702-709.

31. Colvin RB, Hirohashi T, Farris AB, Minnei F, Collins AB, Smith RN. Emerging role of B cells in chronic allograft dysfunction. Kidney International. 2010;78:S13-S17.

32. Trivedi HL. Immunobiology of rejection and adaptation. Transplantation

Proceedings. 2007;39(3):647-652.

33. Davidson A, Diamond B. Advances in immunology - Autoimmune diseases. New

England Journal of Medicine. 2001;345(5):340-350.

34. Murphy K. Janeway's Immunobiology Janeway's Immunobiology. 8th ed:

Garland Science; 2011.

35. Eapen M, Rubinstein P, Zhang MJ, et al. Outcomes of transplantation of unrelated

donor umbilical cord blood and bone marrow in children with acute leukaemia: a

comparison study. Lancet. 2007;369(9577):1947-1954.

36. Lee SJ, Klein J, Haagenson M, et al. High-resolution donor-recipient HLA

matching contributes to the success of unrelated donor marrow transplantation.

Blood. 2007;110(13):4576-4583.

37. Khodakov D, Wang CY, Zhang DY. Diagnostics based on nucleic acid sequence

variant profiling: PCR, hybridization, and NGS approaches. Advanced Drug

Delivery Reviews. 2016;105:3-19.

38. Marsh SGE, Albert ED, Bodmer WF, et al. Nomenclature for factors of the HLA

system, 2010. Tissue Antigens. 2010;75(4):291-455.

39. Aubert V, Venetz JP, Pantaleo G, Pascual M. Low levels of human leukocyte

antigen donor-specific antibodies detected by solid phase assay before

transplantation are frequently clinically irrelevant. Human Immunology.

2009;70(8):580-583.

87

40. Sanger F, Coulson AR. RAPID METHOD FOR DETERMINING SEQUENCES

IN DNA BY PRIMED SYNTHESIS WITH DNA-POLYMERASE. Journal of

Molecular Biology. 1975;94(3):441-&.

41. Sanger F, Nicklen S, Coulson AR. DNA SEQUENCING WITH CHAIN-

TERMINATING INHIBITORS. Proceedings of the National Academy of

Sciences of the United States of America. 1977;74(12):5463-5467.

42. Wittig M, Anmarkrud JA, Kassens JC, et al. Development of a high-resolution

NGS-based HLA-typing and analysis pipeline. Nucleic Acids Research.

2015;43(11):E70-U11.

43. Bentley G, Higuchi R, Hoglund B, et al. High-resolution, high-throughput HLA

genotyping by next-generation sequencing. Tissue Antigens. 2009;74(5):393-403.

44. Lind C, Ferriola D, Mackiewicz K, et al. Next-generation sequencing: the solution

for high-resolution, unambiguous human leukocyte antigen typing. Human

Immunology. 2010;71(10):1033-1042.

45. Lan JH, Zhang QH. Clinical applications of next-generation sequencing in

histocompatibility and transplantation. Current Opinion in Organ Transplantation.

2015;20(4):461-467.

46. Ehrenberg PK, Geretz A, Baldwin KM, et al. High-throughput multiplex HLA

genotyping by next-generation sequencing using multi-locus individual tagging.

Bmc Genomics. 2014;15:8.

47. Qiu JX, Cai JC, Terasaki PI, El-Awar N, Lee J. Detection of antibodies to HLA-

DP in renal transplant recipients using single antigen beads. Transplantation.

2005;80(10):1511-1513. 88

48. Aubert O, Bories MC, Suberbielle C, et al. Risk of Antibody-Mediated Rejection

in Kidney Transplant Recipients With Anti-HLA-C Donor-Specific Antibodies.

American Journal of Transplantation. 2014;14(6):1439-1445.

49. Freitas MCS, Rebellato LM, Ozawa M, et al. The Role of Immunoglobulin-G

Subclasses and C1q in De Novo HLA-DQ Donor-Specific Antibody Kidney

Transplantation Outcomes. Transplantation. 2013;95(9):1113-1119.

50. Bjorkman PJ, Saper MA, Samraoui B, Bennett WS, Strominger JL, Wiley DC.

Structure of the human class-I histocompatibility antigen, HLA-A2. Nature.

1987;329(6139):506-512.

51. USRDS. 2015 USRDS annual data report: Epidemiology of Kidney Disease in the

United States. United States Renal Data System: National Institutes of Health,

National Institute of Diabetes and Digestive and Kidney Diseases; 2015.

52. Shah MR, Starling RC, Longacre LS, Mehra MR, Working Grp P. Heart

transplantation research in the next decade-a goal to achieving evidence-based

outcomes. Journal of the American College of Cardiology. 2012;59(14):1263-

1269.

53. Huynh TN, Kleerup EC, Raj PP, Wenger NS. The opportunity cost of futile

treatment in the ICU. Critical Care Medicine. 2014;42(9):1977-1982.

54. Opelz G, Mytilineos J, Scherer S, et al. Survival of DNA HLA-DR typed and

matched cadaver kidney-transplants. Lancet. 1991;338(8765):461-463.

55. Wissing KM, Fomegne G, Broeders N, et al. HLA mismatches remain risk factors

for acute kidney allograft rejection in patients receiving quadruple

89

immunosuppression with anti-interleukin-2 receptor antibodies. Transplantation.

2008;85(3):411-416.

56. McKenna RM, Lee KR, Gough JC, et al. Matching for private or public HLA

reduces acute rejection episodes and improves two-year renal allograft

function. Transplantation. 1998;66(1):38-43.

57. Beckingham IJ, Dennis MJS, Bishop MC, Blamey RW, Smith SJ, Nicholson ML.

Effect of human-leukocyte antigen matching on the incidence of acute rejection in

renal-transplantation. British Journal of Surgery. 1994;81(4):574-577.

58. Dunn PPJ. Human leucocyte antigen typing: techniques and technology, a critical

appraisal. International Journal of Immunogenetics. 2011;38(6):463-473.

59. Erlich H. HLA DNA typing: past, present, and future. Tissue Antigens.

2012;80(1):1-11.

60. Vogiatzi P. Some considerations on the current debate about typing resolution in

solid organ transplantation. Transplantation Research. 2016;5:6.

61. Lomago J, Jelenik L, Zern D, et al. How did a patient who types for HLA-B*4403

develop antibodies that react with HLA-B*4402? Human Immunology.

2010;71(2):176-178.

62. Marrari M, Conca R, Pratico-Barbato L, Amoroso A, Duquesnoy RJ. Brief report:

Why did two patients who type for HLA-B13 have antibodies that react with all

Bw4 antigens except HLA-B13? Transplant Immunology. 2011;25(4):217-220.

63. Duquesnoy RJ, Kamoun M, Baxter-Lowe LA, et al. Should HLA mismatch

acceptability for sensitized transplant candidates be determined at the high-

90

resolution rather than the antigen level? American Journal of Transplantation.

2015;15(4):923-930.

64. Petersdorf EW, Longton GM, Anasetti C, et al. The significance of HLA-DRB1

matching on clinical outcome after HLA-A, HLA-B, HLA-DR identical unrelated

donor marrow transplantation. Blood. 1995;86(4):1606-1613.

65. EMBL-EBI. IMGT HLA sequence database. 2017;

https://www.ebi.ac.uk/ipd/imgt/hla/stats.html. Accessed February 2, 2017.

66. Bai Y, Ni M, Cooper B, Wei Y, Fury W. Inference of high resolution HLA types

using genome-wide RNA or DNA sequencing reads. Bmc Genomics. 2014;15:16.

67. Hosomichi K, Jinam TA, Mitsunaga S, Nakaoka H, Inoue I. Phase-defined

complete sequencing of the HLA genes by next-generation sequencing. Bmc

Genomics. 2013;14:16.

68. Zheng X, Shen J, Cox C, et al. HIBAG-HLA genotype imputation with attribute

bagging. Pharmacogenomics Journal. 2014;14(2):192-200.

69. Kovats S, Main EK, Librach C, Stubblebine M, Fisher SJ, Demars R. A class-I

antigen, HLA-G, expressed in human trophoblasts. Science. 1990;248(4952):220-223.

70. Wei XH, Orr HT. Differential expression of HLA-E, HLA-F, and HLA-G

transcripts in human tissue. Human Immunology. 1990;29(2):131-142.

71. Ishitani A, Geraghty DE. Alternative splicing of HLA-G transcripts yields

proteins with primary structures resembling both class-I and class-II antigens. Proceedings of the National Academy of Sciences of the United States of

America. 1992;89(9):3947-3951. 91

72. Koller BH, Geraghty DE, Shimizu Y, Demars R, Orr HT. HLA-E - A novel HLA

class-I gene expressed in resting lymphocytes-T. Journal of Immunology.

1988;141(3):897-904.

73. Geraghty DE, Wei XH, Orr HT, Koller BH. Human-leukocyte antigen-F (HLA-F)

- an expressed hla gene composed of a class-I coding sequence linked to a novel

transcribed repetitive element. Journal of Experimental Medicine. 1990;171(1):1-

18.

74. Geraghty DE, Koller BH, Orr HT. A human major histocompatibility complex

class-I gene that encodes a protein with a shortened cytoplasmic segment.

Proceedings of the National Academy of Sciences of the United States of

America. 1987;84(24):9145-9149.

75. Laura-Lee B. Extraction genomic DNA from whole blood. 2004;

http://www.protocol-online.org/prot/Protocols/Extraction-of-genomic-DNA-from-

whole-blood-3171.html. Accessed 3-21, 2017.

76. Robinson J, Mistry K, McWilliam H, Lopez R, Parham P, Marsh SGE. The

IMGT/HLA database. Nucleic Acids Research. 2011;39:D1171-D1176.

77. Dieffenbach CW, Lowe TMJ, Dveksler GS. General concepts for PCR primer

design. Pcr-Methods and Applications. 1993;3(3):S30-S37.

78. Kircher M, Sawyer S, Meyer M. Double indexing overcomes inaccuracies in

multiplex sequencing on the Illumina platform. Nucleic Acids Research.

2012;40(1):8.

79. Blomquist TM, Crawford EL, Lovett JL, et al. Targeted RNA- sequencing with

competitive multiplex-PCR amplicon libraries. Plos One. 2013;8(11):14. 92

80. Homer N, Merriman B, Nelson SF. BFAST: an alignment tool for large scale

genome resequencing. Plos One. 2009;4(11):A95-A106.

81. Albert J, Fenyo EM. Simple, sensitive, and specific detection of human-

immunodeficiency-virus type-1 in clinical specimens by polymerase chain-

reaction with nested primers. Journal of Clinical Microbiology. 1990;28(7):1560-

1564.

82. Takemoto SK, Terasaki PI, Gjertson DW, Cecka JM. Twelve years' experience

with national sharing of HLA-matched cadaveric kidneys for transplantation. New

England Journal of Medicine. 2000;343(15):1078-1084.

83. Cereb N, Kim HR, Ryu J, Yang SY. Advances in DNA sequencing technologies

for high resolution HLA typing. Human Immunology. 2015;76(12):923-927.

84. Erlich RL, Jia XM, Anderson S, et al. Next-generation sequencing for HLA

typing of class I loci. Bmc Genomics. 2011;12:13.

85. Danzer M, Niklas N, Stabentheiner S, et al. Rapid, scalable and highly automated

HLA genotyping using next-generation sequencing: a transition from research to

diagnostics. Bmc Genomics. 2013;14:11.

86. Mayor NP, Robinson J, McWhinnie AJM, et al. HLA Typing for the Next

Generation. Plos One. 2015;10(5)

93

Appendix A

Preliminary assay primer sequences.

Table 9.1. Previously designed broad primers.

Exon  Forward Primer  Sequence (5'-3')
3     B-F4mod         GCGTTTACCCGGTTTCATTTTCAGTTG
4     B-F8mod         TCTGATTCCAGCACTTCTGAGTCACTTTA

Exon  Reverse Primer  Sequence (5'-3')
2     C-R9mod         GGTAAAGGTGACTGGGGCTCTCT
2     PreAmp_C_R9     GGAGAGAGCCCCAGTCACCTTTA
4     PreAmp_C_R2     GTTACTGGAAGCACCATCCACACA

Table 9.2. Previously designed target-specific primers for HLA-A.

Exon  Forward Primer  Sequence (5'-3')
2     A2-1F           GGTCTCAGCCACTGCTCG
2     A2-2F           TTCGTGCGGTTCGACAGC
2     A2-3F           AGGAGACACGGAATGTGAAGG
3     A3-1F           GCGCCTTTACCCGGTT
3     A3-2F           GGGCCAGGTTCTCACACC
3     A3-3F           ACGCCTACGACGGCAAG
3     A3-4F           CCATGAGGCGGAGCAG
4     A4-1F           TGTCCCATGACAGATGCAA
4     A4-2F           GGCCACCCTGAGGTGCT
4     A4-3F           CAGGACACGGAGCTCGT
4     A4-4F           GGAGCAGAGATACACCTGCCAT

Exon  Reverse Primer  Sequence (5'-3')
2     A2-1R           GCTCCATCCTCTGGCTCG
2     A2-2R           TCCACTCGGTCAGTCTGTGA
2     A2-3R           TCGGACCCGGAGACTGTG
3     A3-1R           CCCACGTCGCAGCC
3     A3-2R           CCAAGAGCGCAGGTCCTC
3     A3-3R           GTGCCATCCAGGTAGGCT
3     A3-4R           GGGAGATCTACAGGCGATCAG
4     A4-1R           CCAGGTCAGTGTGATCTCCG
4     A4-2R           GCCGCCCACTTCTGGA
4     A4-3R           CCCACCTTACCCCATCTCAG
4     A4-4R           GTCTCCAGAGAGGCTCCTGCTT

Table 9.3. Previously designed target-specific primers for HLA-B.

Exon  Forward Primer  Sequence (5'-3')
2     B2-1F           GTCGGGCGGGTCTCAG
2     B2-2F           TCTCAGTGGGCTACGTGGAC
2     B2-3F           TGGGACCGGAACACACAG
3     B3-1F           GGGGGACTGGGCTGACC
3     B3-2F           AGTACGCCTACGACGGCAA
3     B3-3F           CAGATCACCCAGCGCAAGT
4     B4-1F           CATTCTCAGGCTGGTCACAT
4     B4-2F           CCACCACCCCATCTCTGAC
4     B4-3F           GGAGATCACACTGACCTGGC
4     B4-4F           GAAGTGGGCAGCTGTGGTG

Exon  Reverse Primer  Sequence (5'-3')
2     B2-1R           CGCTGTCGAACCTCACGAA
2     B2-2R           CAGGCTCTCTCGGTCAGTCTG
2     B2-3R           GATCTCGGACCCGGAGACT
3     B3-1R           CTCGTTCAGGGCGATGTAATC
3     B3-2R           GCTGCTCCGCCTCACG
3     B3-3R           CCGGCGACCTATAGGAGATG
4     B4-1R           GGGCCCAGCACCTCAG
4     B4-2R           GGTCTGGTCTCCACAAGCTC
4     B4-3R           CTCTGCTCTTCTCCAGAAGGC
4     B4-4R           GCTCCTGCTTTCCCTGAGAA

Table 9.4. Previously designed target-specific primers for HLA-C.

Exon  Forward Primer  Sequence (5'-3')
2     C2-1F           TCGGGCGGGTCTCAGC
2     C2-2F           CAGGCTCCCACTCCATGA
2     C2-3F           TCGTGCGGTTCGACAGC
2     C2-4F           CAGGAGGGGCCGGAGTAT
3     C3-1F           CCGGTTTCATTTTCGGTTTAGG
3     C3-2F           GGTCTCACACCCTCCAGAGGAT
3     C3-3F           CTGCGCTCCTGGACCG
3     C3-4F           AGTGGCTCCGCAGATACCTG
4     C4-1F           GCTGGAGTGTCCCAAGAGAGAT
4     C4-2F           GCTTCTACCCTGCGGAGATC
4     C4-3F           GAGATGGAACCTTCCAGAAGTG

Exon  Reverse Primer  Sequence (5'-3')
2     C2-1R           CTGTCGAACCGCACGAAC
2     C2-2R           CTCCCCTCTCGGACTCGC
2     C2-3R           CACCGTCCTCGCTCTGGTT
2     C2-4R           GGGAGGGGTCGTGACCT
3     C3-1R           GGTCATACCCGCGGAGGA
3     C3-2R           ACTTGCGCTGGGTGATCTGA
3     C3-3R           GCTGCAGCGTCTCCTTCC
3     C3-4R           TTCAAGGGAGGGCGATATTC
4     C4-1R           CCCGCTGCCAGGTCAGT
4     C4-2R           GCATATGGCACGTGTATCTCTG
4     C4-3R           GGCTCCAGAAGGACTTCTGC
