HIV Type 1 Intra-Subtype Superinfection Results in Increased : A Case Report

Agnes Nordquist

under the direction of Melissa Norström, PhD Candidate, Department of Laboratory Medicine, Karolinska Institutet

Research Academy for Young Scientists
July 11, 2012

Abstract

Human immunodeficiency virus type 1 (HIV-1) superinfection has implications for our understanding of HIV pathogenesis, transmission and vaccine development. In this study, suspected superinfection in a treatment naïve HIV-1 infected patient, followed from early infection up to seven years, was investigated in detail. Sequences from the viral population were obtained from the gag p24 region by single genome amplification. Bioinformatic analyses were performed on a dataset including sequences from ten different time points during the course of the HIV-1 infection. The findings indicate that an HIV-1 intra-subtype B superinfection occurred in this subject around one and a half years post infection. Most probably this resulted in an increase in viral load and loss of viral control. The study highlights the importance of HIV-1 superinfection and its clinical consequences, which must be considered in future vaccine design.

Contents

1 Introduction
1.1 Classification
1.2 The virion
1.3 The replication cycle
1.4 The course of HIV-1 infection
1.5 Human leukocyte antigen
1.6 Superinfection

2 Materials and methods
2.1 Patient characteristics
2.2 Experimental procedures
2.2.1 Viral RNA extraction
2.2.2 cDNA synthesis
2.2.3 Single genome amplification
2.3 Bioinformatic analyses
2.3.1 Editing and alignment of sequences
2.3.2 Subtyping and excluding the possibility of contamination
2.3.3 Excluding duplicate sequences and blasting
2.3.4 Phylogenetic tree construction for all time points

3 Results
3.1 HIV-1 disease course
3.2 Experimental results
3.3 Results from bioinformatic analyses
3.3.1 Contamination was excluded and subtype determined as clade B
3.3.2 Genetic variation results
3.3.3 Evolutionary relationship between patient sequences

4 Discussion

5 Acknowledgements

List of abbreviations

AIDS - Acquired immune deficiency syndrome

CD4 count - CD4+ T cell levels

cDNA - Complementary DNA

HIV - Human immunodeficiency virus

HLA - Human leukocyte antigen

IFN - Interferon

IL - Interleukin

MHC - Major histocompatibility complex

MSM - Men who have sex with men

NT - Nucleotide

PCR - Polymerase chain reaction

RTase - Reverse transcriptase

SIV - Simian immunodeficiency virus

TNF - Tumor necrosis factor

VL - Viral load

WPI - Weeks post infection

1 Introduction

Human immunodeficiency virus (HIV) was identified in 1983 as the causative agent of what has since become one of the most devastating infectious diseases to have emerged in recent history. The virus was recognised by two separate research groups, one led by Luc Montagnier in France and one led by Robert Gallo in the United States of America [1, 2]. Its discovery answered the search for the cause of acquired immune deficiency syndrome (AIDS), a disease first described in the early 1980s. However, HIV is believed to have existed since 1910 or earlier, having infected humans after evolving from simian immunodeficiency virus (SIV) through multiple cross-species transmissions from other primate species in Africa [3]. HIV is a blood-borne pathogen. The virus spreads mainly through sexual contact, contaminated injection equipment, blood transfusions and from mother to child during pregnancy, delivery or breastfeeding [4]. According to UNAIDS, an estimated 34 million people across the globe are living with HIV today (Figure 1) [5].

Figure 1: HIV-1 distribution worldwide. The HIV-1 distribution worldwide 2010 according to UNAIDS, World Health Organization [5].

1.1 Classification

Two types of HIV are distinguished: HIV-1 and HIV-2. The more virulent and aggressive HIV-1 is the prevailing type in most countries, while HIV-2 is endemic in West Africa. HIV is one of the fastest evolving organisms, which enables recombinant forms to emerge from the genetically distinct groups and subtypes of HIV currently present [4]. This rapid evolution is a result of the error-prone viral reverse transcriptase coupled with a high viral replication rate [6]. HIV-1 is divided into three groups: group M (for major/main) and the two minor groups N and O [7]. HIV-1 group M has spread globally, and its dispersion has led to the ascendancy of different group M lineages in different geographic areas (Figure 2). These lineages are currently classified into nine subtypes (A-D, F-H, J, K) and an additional 40 circulating recombinant forms (CRFs). The recombinant forms were generated when multiple subtypes infected inhabitants of the same geographic area [7, 8].

Figure 2: The HIV-1 global prevalence and subtype distribution. This figure shows the global HIV-1 group M subtype distribution (2004 - 2007) and the HIV-1 prevalence (2009) for each country. The subtype distribution is presented in larger regions [4].

1.2 The virion

HIV is a lentivirus, i.e. a retrovirus with a long incubation period. The virus particles are spherical with a diameter of approximately 100 nm [9]. The HIV virion has a nucleocapsid with an RNA nucleoprotein core. It also holds the enzymes protease, reverse transcriptase and integrase, all of which are necessary for viral replication. The viral particle is covered by a lipid bilayer derived from the host-cell membrane, which contains host-cell proteins and is studded with the virally encoded envelope glycoproteins gp120 and gp41 (Figure 3). The HIV genome, like that of other retroviruses, contains the genes env, gag and pol. In HIV-1 the env gene codes for the precursor envelope glycoprotein (gp160). The gag gene encodes the matrix (p17), capsid (p24) and nucleocapsid (p6, p7) proteins. The pol gene encodes the viral enzymes protease, reverse transcriptase and integrase [10].

Figure 3: The HIV-1 virion. This is a schematic illustration of the HIV-1 virion [11].

1.3 The replication cycle

Viral replication begins when the viral envelope glycoprotein gp120 binds to a CD4 receptor and a co-receptor in the host-cell membrane. This causes a conformational change that enables the envelope glycoprotein gp41 to mediate fusion of the viral envelope with the membrane of the host cell, allowing the viral genome and viral proteins to enter the cell (Figure 4). In the cytoplasm of the host cell the viral RNA is reverse transcribed to cDNA by reverse transcriptase. The cDNA is then transported into the nucleus, where the enzyme integrase incorporates it into the DNA of the host cell. In a process called transcription the viral DNA is read, and the code is then used to make viral proteins. This depends in part on the enzyme protease, which cleaves precursor proteins into their component viral peptides. The viral peptides and proteins, along with new viral RNA, form new, immature HIV virions. The host cell dies within a few days of being infected by HIV, through one of the following mechanisms: direct killing as a result of virions binding to cell-surface receptors, increased susceptibility to apoptosis, or killing by cytotoxic CD8 T cells [3]. HIV chiefly infects CD4+ T cells and thereby affects the immune system. It also infects macrophages and dendritic cells, since these cells express CD4 receptors just like CD4+ T cells [9].

1.4 The course of HIV-1 infection

The pathogenesis of an HIV infection depends on the virus, e.g. the level of virulence and the quantity of virus, and on the host, e.g. age, genetic differences, immune responses and environmental factors. Furthermore, coinfection with other microbes can affect the rate and severity of disease progression [13]. On average, however, disease progression in treatment naïve HIV-1 infected individuals takes 10 years (Figure 5). There are exceptions: some individuals control the virus better and thus delay disease progression, while others progress to AIDS within two to three years after infection [14].

The acute (primary) phase of HIV infection is accompanied by an outbreak of viraemia. For some individuals this outbreak comes with flu-like symptoms. The number of CD4+ T cells decreases rapidly [15]. However, as innate and HIV-specific immune responses increase, the viraemia is reduced and peripheral CD4+ T cells recover. The level to which the viraemia decreases is called the set point, and it is predictive of disease progression [16]. Subsequently, the infection enters its asymptomatic, chronic phase, during which HIV-specific immunity may partially control the virus. The infection can persist because escape variants of the virus constantly emerge and because the HIV provirus is integrated into the cellular DNA. The chronic phase is accompanied by a gradual loss of CD4+ T cells and defects in the responsiveness of several immune functions [13].

The HIV infection progresses to AIDS when the massive reduction in CD4+ T cells has led to a count lower than 200 cells/mm3. At this point the immune system is so weak that opportunistic infections, e.g. due to microbes such as Candida, Toxoplasma or herpesviruses, and rare cancer types such as Kaposi’s sarcoma are observed. Thus, the virus and opportunistic infections take over, leading to the death of the patient if left untreated [9].

Figure 4: The HIV-1 replication cycle. Schematic illustration of the HIV-1 replication cycle summarised in seven steps [12].

Figure 5: The HIV-1 disease course. Graph showing viral load and CD4 counts in a treatment naïve human during the course of an HIV-1 infection [17].

1.5 Human leukocyte antigen

The major histocompatibility complex (MHC) is a set of cell surface molecules that mediate interactions between leukocytes, the cells of the immune system. In humans the MHC is called human leukocyte antigen (HLA). There are different classes of HLA with various functions; HLA-A, -B and -C correspond to MHC class I. The genetic makeup of a patient determines which HLA types the patient carries. The HLA system presents peptides (epitopes) produced in the cell. If a viral peptide has been produced, this foreign antigen will attract CD8+ T cells. These cytotoxic T cells express receptors on their surface that can recognise a specific antigen. There is high affinity between the CD8+ T cell receptor and the HLA complex presenting the specific antigen, causing the CD8+ T cell to bind to the infected cell. This enables the CD8+ T cell to destroy the infected cell by releasing cytokines, chemokines and cytolytic molecules (e.g. IFNγ, IL-2, TNFα, perforin) [18]. Research has shown a correlation between HLA-B*57 and slower disease progression [19]. There are several HLA-B*57-restricted epitopes in the gag p24 region.

1.6 Superinfection

Superinfection with HIV is a type of dual infection in which an individual is infected with a second viral strain after the initial infection and the immune response to it have been established. A dual infection that occurs before the immune response is established is classified as coinfection. Superinfection is difficult to detect and is therefore very likely underdiagnosed, especially when routine clinical methods are used [20].

If one cell is infected with two separate HIV strains, an RNA strand from each strain can be packaged into the same virion. When this virion then infects a cell, recombination can occur if the viral enzyme reverse transcriptase switches from one RNA template to the other, creating a mosaic of the parental viruses in the reverse transcript [21]. This type of recombination contributes to a more rapid increase in viral diversity than does the accumulation of mutations through replication errors [22, 23]. The resulting genetic heterogeneity allows rapid adaptation to host immune responses, target cell availability and antiretroviral therapy. Furthermore, it can lead to augmented viral pathogenicity and infectivity and to reduced susceptibility to antiretroviral therapy [24]. Since 2002, 16 cases of HIV-1 superinfection have been reported worldwide [6].

This study is unique compared to previous studies on superinfection. The patient in this case report was both treatment naïve and followed longitudinally from early infection, 10 weeks post infection (wpi), up to seven years (324 wpi) post infection. Furthermore, a more precise sequencing method, single genome amplification, was used to obtain more detailed information on the viral population, and advanced bioinformatic tools were used to analyse the sequence data. The aim of this study was to investigate the suspicion of HIV-1 superinfection and its clinical impact in a subject carrying the HLA-B*5701 allele.
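To make the template-switching mechanism described in this section more concrete, the following minimal sketch builds a mosaic sequence from two aligned parental sequences. The sequences and the breakpoint positions are invented for illustration and are not data from this study; the function name `mosaic` is likewise hypothetical.

```python
def mosaic(parent_a, parent_b, breakpoints):
    """Copy from parent_a, switching template at each breakpoint (0-based index)."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parental sequences must be aligned to equal length")
    templates = (parent_a, parent_b)
    switch_points = set(breakpoints)
    current = 0  # copying starts on parent_a
    copied = []
    for i in range(len(parent_a)):
        if i in switch_points:
            current = 1 - current  # reverse transcriptase jumps to the other RNA strand
        copied.append(templates[current][i])
    return "".join(copied)


# Toy, made-up parental sequences (assumption: already aligned)
strain_a = "AAAAAAAAAA"
strain_b = "GGGGGGGGGG"
recombinant = mosaic(strain_a, strain_b, breakpoints=[4, 8])
print(recombinant)  # AAAAGGGGAA
```

Each breakpoint produces one switch, so two breakpoints give a recombinant that starts and ends with material from the first parent.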

2 Materials and methods

2.1 Patient characteristics

The patient in this study was an HIV-1 positive Caucasian, and the time of infection was estimated as the midpoint between the reported negative and positive tests. The patient was treatment naïve and followed from early infection (10 wpi) up to seven years (324 wpi) post infection. In addition, high-resolution HLA-typing data were available and the patient was HLA-B*5701 positive. The risk group for this patient was men who have sex with men (MSM). The study was performed on longitudinal plasma samples (stored at -80 °C). CD4+ T cell counts (cells/mm3) and viral load (copies/mL) were measured regularly during the course of infection, and plasma samples for this study were analysed from ten different occasions (wpi = 13, 26, 45, 75, 97, 112, 133, 179, 274, 324).
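The midpoint estimate of the infection date and the resulting weeks post infection can be illustrated with a short sketch. The test dates and the sample date below are invented, since the actual dates are not given in this report, and the function names are hypothetical.

```python
from datetime import date


def estimate_infection_date(negative_test, positive_test):
    """Midpoint between the negative and the positive HIV test."""
    if positive_test < negative_test:
        raise ValueError("positive test must not precede the negative test")
    return negative_test + (positive_test - negative_test) / 2


def weeks_post_infection(infection_date, sample_date):
    """Whole weeks elapsed between estimated infection and sampling."""
    return (sample_date - infection_date).days // 7


# Hypothetical dates, for illustration only
infection = estimate_infection_date(date(2005, 1, 3), date(2005, 3, 14))
wpi = weeks_post_infection(infection, date(2005, 5, 9))
print(infection, wpi)  # 2005-02-07 13
```

The uncertainty of such an estimate is half the interval between the two tests, which matters when events are dated to within a few weeks.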

2.2 Experimental procedures

2.2.1 Viral RNA extraction

RNA extraction was performed to obtain HIV RNA from plasma samples, according to a previously described protocol [25]. Briefly, 1 mL plasma was used for extraction of HIV-1 RNA with the RNeasy Lipid Tissue Mini Kit (Qiagen, Hilden, Germany). The extracted viral RNA was eluted in 32 µL of MilliQ water and the entire extract was used for the cDNA synthesis. The viral RNA extraction was carried out in a P3 security laboratory.

2.2.2 cDNA synthesis

Complementary DNA (cDNA) was synthesised from the viral RNA template using the ThermoScript RT-PCR system (Invitrogen). A gene-specific primer (MJ7B) for gag was used: 5’-TCTTTCATTTGRTGTCCTTC-3’ (HXB2 nt position 2063-2044). This primer corresponds to the reverse primer in the first-round polymerase chain reaction (PCR-1). The cDNA synthesis consisted of two steps: denaturation and RT incubation. For the denaturation, 8 µL RNA was added to a mixture of 1 µL MJ7B (10 µM), 1 µL dNTP mix (10 µM) and 2 µL MilliQ water. In total, four cDNA reactions were prepared and placed in a 2720 Thermal Cycler (Applied Biosystems). The reaction tubes were incubated at 65 °C for five minutes and then placed on ice for one minute. Thereafter the mixture for RT incubation was prepared with 4 µL cDNA synthesis buffer (5x), 1 µL DTT (0.1 M), 1 µL RNase OUT (40 u/ml), 1 µL ThermoScript RT (15 u/ml) and 1 µL MilliQ water. The RT incubation mixture (8 µL) was added to the denaturation mixture (12 µL) and mixed by gently pipetting up and down. The same procedure was performed for all four reaction tubes. For the RT incubation the following thermal cycler program was used: 50 °C 60 min, 85 °C 5 min, 4 °C.
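As a quick consistency check of the volumes listed above, the sketch below sums the per-reaction volumes and scales the RT incubation mixture to the four reactions. The helper `master_mix` and its optional `overage` parameter are illustrative assumptions and are not part of the described protocol.

```python
# Per-reaction volumes in microlitres, taken from the text above
DENATURATION_MIX_UL = {
    "RNA": 8.0,
    "MJ7B primer": 1.0,
    "dNTP mix": 1.0,
    "MilliQ water": 2.0,
}
RT_MIX_UL = {
    "cDNA synthesis buffer (5x)": 4.0,
    "DTT": 1.0,
    "RNase OUT": 1.0,
    "ThermoScript RT": 1.0,
    "MilliQ water": 1.0,
}


def master_mix(per_reaction_ul, n_reactions, overage=0.0):
    """Scale per-reaction volumes to n reactions (overage is a hypothetical extra fraction)."""
    factor = n_reactions * (1.0 + overage)
    return {reagent: round(vol * factor, 2) for reagent, vol in per_reaction_ul.items()}


denaturation_total = sum(DENATURATION_MIX_UL.values())  # 12.0 µL
rt_total = sum(RT_MIX_UL.values())                      # 8.0 µL
rt_master = master_mix(RT_MIX_UL, n_reactions=4)

print(denaturation_total, rt_total, denaturation_total + rt_total)  # 12.0 8.0 20.0
print(rt_master)
```

The totals agree with the 12 µL and 8 µL stated in the text, giving a final reaction volume of 20 µL.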

2.2.3 Single genome amplification

Single genome amplification is a method for obtaining PCR products derived from single cDNA molecules [26]. To obtain PCR products from single cDNA molecules, the cDNA was diluted until approximately 30% of the PCR reactions yielded DNA product [27]. This percentage was estimated from a 1:3 serial dilution of the cDNA (1:3, 1:9, 1:27, 1:81 etc.), with ten replicate reactions performed for each dilution. A nested PCR was performed to obtain an adequate concentration of amplicons corresponding to the target HIV-1 gag p24 region. A nested PCR consists of an outer (PCR-1) and an inner (PCR-2) round of PCR.

The primers used in PCR-1 were the forward primer (MJ5B): 5’-CATMTAGTATGGGCAAGCAG-3’ (HXB2 nt position 886-905) and the reverse primer (MJ7B) used in the cDNA synthesis. A PCR-1 mixture was prepared with 0.5 µL MJ5B (10 µM), 0.5 µL MJ7B (10 µM), 2.5 µL PCR buffer (10x), 0.85 µL MgCl2 (50 mM), 0.5 µL dNTP (10 mM), 0.1 µL Platinum Taq DNA Polymerase (Invitrogen) and 17.55 µL MilliQ water for each reaction, giving a total volume of 22.5 µL. To each reaction tube 2.5 µL cDNA was added to the PCR-1 mix, giving a final volume of 25 µL. Additionally, six negative (MilliQ) and two positive controls (MN) were included in the PCR-1 plate. The following program was used for PCR-1: 94 °C 2 min - 40x (94 °C 20 sec - 55 °C 20 sec - 72 °C 90 sec) - 72 °C 5 min - 4 °C.

This was followed by PCR-2, in which each PCR-1 product was used as a template, with the forward primer (MJ6B): 5’-GTCAGCCAAAATTACCCTA-3’ (HXB2 nt position 1171-1189) and the reverse primer (MJ8B): 5’-CCTTCCTTTCCACATTTCC-3’ (HXB2 nt position 2048-2030). The PCR-2 mixture was prepared with the MJ6B and MJ8B primers and the same reagents, at the same concentrations and volumes, as the PCR-1 mixture. To each reaction tube 2.5 µL PCR-1 product was added to the PCR-2 mix. The program used for PCR-2 was: 94 °C 2 min - 30x (94 °C 20 sec - 55 °C 20 sec - 72 °C 90 sec) - 72 °C 5 min - 4 °C.

Positive nested PCR products were identified by agarose gel electrophoresis, using E-gel with agarose (Invitrogen). PCR-2 products were diluted 1:4 with MilliQ water (75 µL) and 20 µL from each sample was loaded onto the E-gel. The dilution 1:81, at which three out of ten wells on the E-gel were positive, was chosen for amplification of at least 20 single cDNA molecules. The remaining cDNA was diluted to this concentration, as determined in the serial dilution experiment described above, and a nested PCR was performed according to the same protocol. The contents of the positive wells were collected for purification and sequencing, which was performed by a core facility using the PCR-2 forward (MJ6B) and reverse (MJ8B) primers.
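The rationale for aiming at roughly 30% positive wells can be illustrated with a short calculation. The sketch assumes that template molecules are distributed over the wells according to a Poisson distribution; this is a common assumption for limiting dilution but is not stated explicitly in this report, and the function names are hypothetical.

```python
import math


def mean_templates_per_well(fraction_positive):
    """Poisson mean implied by the observed fraction of positive wells."""
    if not 0.0 < fraction_positive < 1.0:
        raise ValueError("fraction_positive must lie strictly between 0 and 1")
    # P(no template in a well) = exp(-mean) = 1 - fraction_positive
    return -math.log(1.0 - fraction_positive)


def fraction_single_template(fraction_positive):
    """Share of positive wells expected to contain exactly one template."""
    mean = mean_templates_per_well(fraction_positive)
    p_exactly_one = mean * math.exp(-mean)
    return p_exactly_one / fraction_positive


mean = mean_templates_per_well(0.30)
single = fraction_single_template(0.30)
print(f"mean templates per well: {mean:.3f}")           # about 0.357
print(f"single-template positive wells: {single:.3f}")  # about 0.832
```

Under this assumption most, but not all, positive wells at 30% positivity derive from a single template, which is consistent with the additional check for double peaks in the chromatograms described in section 2.3.1.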

2.3 Bioinformatic analyses

2.3.1 Editing and alignment of sequences

Sequencing files (AB1 format) obtained from the core facility were imported into the Sequencher software (Gene Codes Corporation, Ann Arbor, MI, USA). Sequencher is a program for DNA sequence analysis, including contig assembly [28]. The forward sequence for each sample was assembled with the corresponding reverse sequence to obtain a contig. The chromatograms were examined manually; the presence of double peaks indicated more than one template in the sequencing reaction, and such sequences were discarded. The consensus sequences were exported in FASTA format. More than 20 sequences were obtained for each time point. The sequences were thereafter imported (FASTA format) and manually aligned in the sequence alignment editor BioEdit [29]. The alignment with nine additional time points was exported in FASTA format.

2.3.2 Subtyping and excluding the possibility of contamination

A dataset was assembled in BioEdit to exclude the possibility of contamination and to perform subtyping. To the alignment, 28 selected subtype references, downloaded from the HIV database [30], and the positive control sequence (MN) were added. This alignment was imported into the Molecular Evolutionary Genetics Analysis version 5 (MEGA5) software (Center for Evolutionary Functional Genomics, The Biodesign Institute, Tempe, AZ, USA) [31]. A phylogenetic neighbor-joining tree with 500 bootstrap replicates was inferred using this program; bootstrapping is a way to judge the strength of support for nodes on phylogenetic trees [32]. The tree file (Newick format) was imported into FigTree [33] to be visualised and modified. Branches of sequences belonging to the same subtype were collapsed for display purposes. The Rega HIV-1 subtyping tool is a program in which phylogenetic methods are used to identify the subtype of a specific sequence. For this patient, HIV-1 subtyping was confirmed using Rega HIV version 2.0 [34].
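The idea behind the bootstrap replicates mentioned above can be sketched as follows. The code only illustrates the resampling step, in which alignment columns are drawn with replacement; it is not the MEGA5 implementation, and the toy alignment, the seed and the function name are assumptions made for the example. In a full analysis a tree would be inferred from every replicate, and the support for a node would be the share of replicate trees that contain it.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same aligned length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Toy alignment (made-up sequences, not patient data)
toy = {"seq1": "ACGTACGT", "seq2": "ACGTACGA", "seq3": "ACCTACGA"}
rng = random.Random(2012)  # fixed seed so the example is reproducible
replicates = [bootstrap_alignment(toy, rng) for _ in range(500)]
print(len(replicates), replicates[0])
```

Because whole columns are resampled, every replicate keeps the same sequences and the same alignment length as the original data.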

2.3.3 Excluding duplicate sequences and blasting

Gag p24 is a conserved region in the HIV-1 genome and it is therefore likely that several sequences from the same time point will be identical. Homogeneity in the viral population is more common in early HIV-1 infection than in late infection. It was therefore of interest to investigate the number of unique sequences at each time point. The unique sequences for each time point were identified using the web-based program HyPhy at the website datamonkey.org [35]. The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. This program was used to compare the unique nucleotide sequences to sequence databases and to calculate the statistical significance of the matches [36]. The results of the BLAST search and the country of origin of the matching sequences were summarised for each time point.
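The step of counting unique sequences per time point can be sketched in a few lines. This is a simple stand-in and not the HyPhy procedure used in the study; the FASTA records and the naming scheme `<time point>_<clone>` are invented for illustration.

```python
from collections import defaultdict


def parse_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def unique_per_time_point(records):
    """Map each time point to (number of unique sequences, total sequences)."""
    by_time_point = defaultdict(list)
    for name, seq in records:
        time_point = name.split("_")[0]  # hypothetical naming: '13wpi_01'
        by_time_point[time_point].append(seq)
    return {tp: (len(set(seqs)), len(seqs)) for tp, seqs in by_time_point.items()}


# Made-up example records (not patient data)
toy_fasta = """>13wpi_01
ACGTACGT
>13wpi_02
ACGTACGT
>13wpi_03
ACGTACGA
>97wpi_01
ACCTACGA
>97wpi_02
ACCTACGA
"""
summary = unique_per_time_point(parse_fasta(toy_fasta))
print(summary)  # {'13wpi': (2, 3), '97wpi': (1, 2)}
```

Exact string comparison treats sequences that differ only in ambiguous bases or alignment gaps as distinct, so real data may need normalisation first.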

2.3.4 Phylogenetic tree construction for all time points

An additional phylogenetic neighbor-joining tree was inferred using MEGA5. The dataset for this tree consisted of all sequences obtained from the ten time points. The tree was constructed with 500 bootstrap replicates and the tree file was imported into FigTree, where each time point was coloured manually.

3 Results

3.1 HIV-1 disease course

The HIV-1 disease course of the patient is given as CD4+ T cell count and viral load (VL) over time (wpi) (Figure 6). The CD4+ T cell count and VL for this patient differ from the average disease progression in treatment naïve HIV-1 infected individuals (Figure 5). The patient displays a distinct increase in VL after 75 wpi (from 125 to 25,433 copies/mL), and VL remained at higher levels during the rest of the study period. A general decline in CD4+ T cells was observed over time.
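The size of the VL increase reported above can be expressed as a fold change and as a log10 change, the scale commonly used for viral load. The short calculation below uses only the two values given in the text.

```python
import math

vl_before = 125     # copies/mL at 75 wpi
vl_after = 25433    # copies/mL after the increase

fold_change = vl_after / vl_before
log10_change = math.log10(fold_change)

print(f"fold change: {fold_change:.0f}x")         # about 203x
print(f"log10 change: {log10_change:.2f} log10")  # about 2.31 log10
```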

Figure 6: Clinical data during the course of HIV-1 infection. CD4+ T cell count and viral load characteristics of patient followed from early infection (10 wpi) up to seven years (324 wpi). CD4+ T cell count (cells/mm3) in blue on the y-axis to the left and plasma RNA HIV-1 load (copies/mL) in orange on the y-axis to the right. Black arrows indicate the plasma samples that were obtained at ten different occasions (13, 26, 45, 75, 97, 112, 133, 179, 274 and 324 wpi) for further analyses.

Figure 7: E-gel with results from the serial dilution. 96-well 1% agarose gel electrophoresis for time point 75 wpi with ten replicates for each dilution (1:3, 1:9, 1:27, 1:81, 1:243, 1:729), negative controls MilliQ (blue) and positive controls MN (yellow). The dilution 1:81 corresponded to 30% positive wells for this time point and was therefore chosen for further amplification (marked in red).

Figure 8: E-gel with positive wells for sequencing. Agarose gel electrophoresis results for cDNA at the dilution 1:81, used to obtain single molecules (red). In total the product of 28 wells was collected at time point 75 wpi. Six negative controls MilliQ (blue) and two positive controls MN (yellow) were also included.

3.2 Experimental results

A serial dilution experiment was performed to find the dilution corresponding to 30% positive wells in order to obtain single genomes of the gag p24 region. This percentage was obtained with the dilution 1:81 (Figure 7). This dilution was thereafter used to obtain at least 20 single genome sequences from the time point 75 wpi. In total 28 samples out of 88 (31.8%) were positive (Figure 8). These samples were collected for sequencing.

3.3 Results from bioinformatic analyses

3.3.1 Contamination was excluded and subtype determined as clade B

The dataset used for bioinformatic analyses consisted of sequences from ten different time points with a total of 230 sequences. The phylogenetic neighbor-joining tree obtained to define the subtype and exclude the possibility of contamination is shown in Figure 9. The tree was constructed with the aligned sequences from the patient together with the positive control (MN) and subtype reference sequences. The sequence obtained from the positive control clustered together with other subtype B sequences and not with any of the patient sequences. Therefore, contamination could be excluded. The patient sequences clustered closest to the subtype B references, with a bootstrap value of 0.98. The subtype of all sequences obtained from this patient was determined as subtype B. This was also confirmed using the Rega 2.0 subtyping tool.

3.3.2 Genetic variation results

The genetic variation over time was estimated using the information obtained from datamonkey.org, which shows the number of identical sequences at the ten different time points. The number of unique sequences increased over time, indicating an increase in viral diversity (Table 1). A BLAST search was performed for each unique sequence and the results are given in Table 1. A major shift in the country of origin of the best BLAST matches was observed between sequences from 75 wpi and 97 wpi (Figure 10). The patient sequences from the first four time points were most similar (had the highest BLAST score) to an HIV-1 strain from Canada. However, after 75 wpi the patient sequences were most similar to an HIV-1 strain from Spain.

Figure 9: Phylogenetic tree with gag p24 genealogies for patient, MN and subtype sequences. The evolutionary history for HIV-1 gag p24 sequences obtained through single genome amplification was inferred using the Neighbor-Joining method [37]. Outgroup sequences from the Los Alamos HIV database are shown as grey lines, sequences from subtype B as red lines, the MN contamination reference sequence as a white star and sequences from the patient, from all time points, as blue lines. The optimal tree with the sum of branch length = 1.87018401 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches [38] for bootstrap values > 70. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [39] and are in the units of the number of base substitutions per site. The rate variation among sites was modeled with a gamma distribution (shape parameter = 0.5). The analysis involved 330 nucleotide sequences. All ambiguous positions were removed for each sequence pair. There were a total of 690 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [40].

Table 1: General information of the sequence dataset.

3.3.3 Evolutionary relationship between patient sequences

A phylogenetic tree was constructed with the aligned sequences from the patient from ten different time points (Figure 11). The phylogenetic tree shows the evolutionary relationship between the sequences from the different viral populations. The sequences from the different time points were coloured according to the figure legend to the lower right. The tree was rooted with a sequence from the earliest time point (13 wpi). The branch lengths of the sequences from the first three time points (13, 26, 45 wpi) were in general shorter than those of sequences from the majority of the subsequent time points. However, all but one of the sequences from the fifth time point (97 wpi) had short branch lengths and clustered together separately. Sequences from some of the later time points (133, 179 wpi) cluster closer to the root and show more similarity to the viral population of the first four time points than to that of the later time points.

Figure 10: Number of unique sequences and BLAST results from each time point. Number of unique sequences shown as quantity of identical sequences/total single genome sequences, viral variants (%), versus weeks post infection. The origins of the sequences, obtained through BLAST search, were Canada (EU242363.1, AY779552.1), USA (AY134957.1), Spain (JQ846178.1) and Other (M38432.1, FJ645346.1, EF116335.1).

4 Discussion

The patient carries the HLA-B*5701 allele, which is associated with better control of HIV-1 infection and slower disease progression. This might be one reason why the patient had such a low viral load (50-777 copies/mL) during the first one and a half years of the infection. However, the distinct increase in VL after 75 wpi is similar to the pattern of VL increase seen during acute HIV-1 infection. Interestingly, VL remained at higher levels after 75 wpi and during the rest of the study period. For some reason, the viral control this patient showed at the beginning of the infection was not maintained.

To investigate this in more detail, longitudinal plasma samples were analysed. Phylogenetic analyses demonstrated that the experimental results obtained were reliable. In the phylogenetic neighbor-joining tree the positive control sequence clustered together with other subtype B sequences and not with any of the patient sequences. All patient sequences clustered together with the subtype B reference sequences and had high bootstrap values in the phylogenetic tree. This indicates that the patient sequences obtained from the different time points were trustworthy. The conclusion is therefore that all patient sequences from the entire infection course are subtype B with no contamination. Considering that subtype B is the most common HIV-1 subtype among Caucasians, this was not unexpected.

Figure 11: Phylogenetic tree with gag p24 genealogies for all patient time points. The evolutionary history for HIV-1 gag p24 sequences obtained through single genome amplification was inferred using the Neighbor-Joining method [37]. Sequences from all ten time points are shown in different colours: 13 wpi - red, 26 - orange, 45 - yellow, 75 - green, 97 - pale blue, 112 - dark blue, 133 - purple, 179 - pink, 274 - grey and 324 - black. The optimal tree with the sum of branch length = 0.20525505 is shown. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (500 replicates) is shown next to the branches [38] for bootstrap values > 70. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances used to infer the phylogenetic tree. The evolutionary distances were computed using the Maximum Composite Likelihood method [39] and are in the units of the number of base substitutions per site. The rate variation among sites was modeled with a gamma distribution (shape parameter = 0.5). The analysis involved 284 nucleotide sequences. All ambiguous positions were removed for each sequence pair. There were a total of 690 positions in the final dataset. Evolutionary analyses were conducted in MEGA5 [40].

Additionally, the genetic variation and BLAST analyses indicate a change in the viral population. The timing of this change coincides with the timing of the increased VL (between 75 and 97 wpi). A new HIV-1 subtype B infection is therefore suspected. According to the BLAST search, the sequences from the first four time points were similar to an HIV-1 subtype B strain collected from a subject in Canada. However, after 75 wpi the majority of the sequences were similar to an HIV-1 subtype B strain collected in Spain. An HIV-1 intra-subtype superinfection, originating from another continent, is therefore suspected. It is worth mentioning that one patient sequence from the third time point had its highest-scoring match to a sequence originating from the United States. According to the BLAST search this sequence is very similar to the Canadian strain, so it is still possible that this specific patient sequence originates from Canada.

The phylogenetic tree with the patient sequences from all time points supports this suspicion of an intra-subtype superinfection occurring in the interval 75 to 97 wpi. This is partly because all but one of the sequences from the fifth time point (97 wpi) had short branch lengths and clustered together, separately from the earlier time points.
This may indicate a new infection with a less divergent viral population, which is common in acute infection. Additionally, sequences from some of the later time points (133 and 179 wpi) show more similarity to the viral population of the early time points than to that of the later time points. This may be due to recombination between the two different HIV-1 subtype B strains. Recombination needs to be confirmed with more advanced phylogenetic tools developed specifically for the detection of recombination.

The decrease observed in VL after 133 wpi can be explained by activation of the adaptive immune defence, specific to the new HIV-1 strain. If recombination has occurred, that could also explain the decrease in VL, since the recombinant carries sections of the genome from the original HIV-1 strain.

In most reports of HIV-1 superinfection to date, disease progression could not be directly assessed, because subjects either did not have long-term follow-up or were treated. In this case report there are several results implying superinfection, but further analyses need to be performed to confirm this, especially regarding the time interval in which superinfection occurred. The apparent superinfection could alternatively be explained by coinfection. In that case, the coinfection would be with two divergent viruses, followed by late outgrowth of the variant which initially was a minor population, localised to a cellular or anatomic compartment, e.g. lymph nodes or the central nervous system [41].

Since this is a case report, it is important to confirm the results obtained in this study in other HIV-1 intra-subtype superinfected patients. However, these subjects are difficult to find, since a long follow-up time and advanced analyses are necessary to confirm an intra-subtype superinfection. To date, fewer than eight of these subjects have been described in the literature [6]. Few of these previously described intra-subtype superinfected subjects were treatment-naïve, and they had a shorter follow-up time than the subject in this study. This makes it difficult to investigate the impact superinfection can have on the clinical outcome of the HIV-1 infection.

This study highlights the importance of superinfection and potential loss of viral control.
The occurrence of HIV-1 superinfection has implications for our understanding of HIV pathogenesis, transmission and vaccine development. These implications must be considered for future vaccine design.

5 Acknowledgements

I cannot thank my mentor and partner in crime, Melissa Norström, enough for the endless hours of discussion and laughter we have shared. Miss Norström's great mentorship and expertise have been of utmost importance for my study. I would also like to thank Prof. David Erlinge at the Department of Cardiology at Lund University, whom I have come to regard as a mentor and a great inspiration. I believe I owe him a great deal for the opportunity to conduct research at these facilities. I would also like to thank Prof. Anna Karlsson at the Department of Laboratory Medicine at Karolinska Institutet for the interesting scientific discussion we shared. Last but not least, a final thank you to the organisers of the Research Academy for Young Scientists, AstraZeneca, LIF and Europaskolan for making this research possible. Thank you all.

References

[1] Barré-Sinoussi F, Chermann JC, Rey F, Nugeyre MT, Chamaret S, Gruest J, et al. Isolation of a T-lymphotropic retrovirus from a patient at risk for acquired immune deficiency syndrome (AIDS). 1983. Revista de investigación clínica; órgano del Hospital de Enfermedades de la Nutrición. 1983;56(2):126–9.

[2] Gallo R, Sarin PS, Gelmann EP, Robert-Guroff M, Richardson E. Isolation of human T-cell leukemia virus in acquired immune deficiency syndrome (AIDS). Science. 1983;220:868–871.

[3] Parham P. The Immune System. 1st ed. New York, NY: Garland Publishing; 2000.

[4] Skar H, Hedskog C, Albert J. HIV-1 evolution in relation to molecular epidemiology and antiretroviral resistance. Annals of the New York Academy of Sciences. 2011 Aug;1230:108–18.

[5] World Health Organization, UNAIDS. UNAIDS World AIDS Day Report - Core Epidemiology Slides; 2011. Available from: http://www.unaids.org/en/.

[6] Smith DM, Richman DD, Little SJ. HIV Superinfection. Journal of Infectious Diseases. 2005;192:438–444.

[7] Sharp PM, Hahn BH. Origins of HIV and the AIDS Pandemic. Cold Spring Harbor perspectives in medicine. 2011 Sep;1(1):a006841.

[8] Taylor B, Sobieszczyk M, McCutchan F, Hammer S. The Challenge of HIV-1 Subtype Diversity. N Engl J Med. 2008;358(15):1590–1602.

[9] Nowroozalizadeh S. Studies of innate immune stimulation with CpG in HIV infection; 2008.

[10] Frankel AD, Young JA. HIV-1: fifteen proteins and an RNA. Annu Rev Biochem. 1998;(67):1–25.

[11] Castillo R. Cell-mediated deficiency. Available from: http://arapaho.nsuok.edu/~castillo/Cell-mediateddeficiency..html.

[12] NIAID. Schematic illustration of the HIV-1 replication cycle; 2012. Available from: http://www.niaid.nih.gov/topics/HIVAIDS/Understanding/Biology/pages/hivreplicationcycle.aspx.

[13] Levy JA. HIV and the Pathogenesis of AIDS. 3rd ed. Washington, D.C.: ASM Press; 2007.

[14] Pantaleo G, Fauci AS. Immunopathogenesis of HIV Infection. Annual Review of Microbiology. 1996;50:825–854.

[15] Kahn JO, Walker BD. Acute Human Immunodeficiency Virus Type 1 Infection. New England Journal of Medicine. 1998;339(1):33–39.

[16] Mellors JW, Kingsley LA, Rinaldo CR, Todd JA, Hoo BS, Kokka RP, et al. Quantitation of HIV-1 RNA in Plasma Predicts Outcome after Seroconversion. Annals of Internal Medicine. 1995;122(8):573–579.

[17] Jurema O. The Immunopathogenesis of Human Immunodeficiency Virus Infection. 2004. Available from: http://en.wikipedia.org/wiki/File:Hiv-timecourse.png.

[18] Schellens IMM, Borghans J, Jansen C, De Cuyper IM, Geskus RB, van Baarle D, et al. Abundance of early functional HIV-specific CD8+ T cells does not predict AIDS-free survival time. PloS one. 2008 Jan;3(7):2745.

[19] Norström MM. Virologiska och immunologiska mönster kopplade till lägre risk att utveckla AIDS. BestPractice. 2012;(01):11–13.

[20] Waters L, Smit E. HIV-1 superinfection. Current opinion in infectious diseases. 2012 Feb;25(1):42–50.

[21] Jung A, Maier R, Vartanian JP, Bocharov G, Jung V, Fischer U, et al. Recombination: Multiply infected spleen cells in HIV patients. Nature. 2002;418(6894):144.

[22] Malim MH, Emerman M. HIV-1 sequence variation: drift, shift, and attenuation. Cell. 2001 Feb;104(4):469–72.

[23] Robertson DL, Sharp PM, McCutchan FE, Hahn BH. Recombination in HIV-1. Nature. 1995;374(6518):124–126.

[24] Blackard JT, Cohen DE, Mayer KH. Human immunodeficiency virus superinfection and recombination: current state of knowledge and potential clinical consequences. Clinical infectious diseases: an official publication of the Infectious Diseases Society of America. 2002 Apr;34(8):1108–14.

[25] Lindkvist A, Edén A, Norström MM, Gonzalez VD, Nilsson S, Svennerholm B, et al. Reduction of the HIV-1 reservoir in resting CD4+ T-lymphocytes by proof-of-concept study. AIDS Research and Therapy. 2009;6(15):1–8.

[26] Kearney M, Palmer S, Maldarelli F, Shao W, Polis MA, Mican J, et al. Frequent polymorphism at drug resistance sites in HIV-1 protease and reverse transcriptase. AIDS (London, England). 2008;(22):497–501.

[27] Palmer S, Kearney M, Maldarelli F, Halvas EK, Bixby CJ, Bazmi H, et al. Multiple, Linked Human Immunodeficiency Virus Type 1 Drug Resistance Mutations in Treatment-Experienced Patients Are Missed by Standard Genotype Analysis. Journal of Clinical Microbiology. 2005;43(1):406–413.

[28] UNC. UNC Center for Bioinformatics - Software; 2009. Available from: http://bioinformatics.unc.edu/software/sequencher/index.htm.

[29] Hall T. BioEdit version 5.0.6. 2001. Available from: http://www.mbio.ncsu.edu/BioEdit/BioDoc.pdf.

[30] Los Alamos National Security. Los Alamos HIV database; 2011. Available from: http://www.hiv.lanl.gov/.

[31] Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Molecular biology and evolution. 2011 Oct;28(10):2731–9.

[32] Efron B, Halloran E, Holmes S. Bootstrap confidence levels for phylogenetic trees. Proceedings of the National Academy of Sciences of the United States of America. 1996 Nov;93(23):13429–34.

[33] Morariu VI, Srinivasan BV, Raykar VC, Duraiswami R, Davis LS. Automatic online tuning for fast Gaussian summation. Advances in Neural Information Processing Systems. 2008;p. 1–8.

[34] De Oliveira T, Deforche K, Cassol S, Salminen M, Paraskevis D, Seebregts C, et al. An automated genotyping system for analysis of HIV-1 and other microbial sequences. Bioinformatics. 2005;21(19):3797–3800.

[35] Delport W, Poon AF, Frost SDW, Kosakovsky Pond SL. Datamonkey 2010: a suite of phylogenetic analysis tools for evolutionary biology; 2010.

[36] NCBI. BLAST - basic information. Available from: http://blast.ncbi.nlm.nih.gov/.

[37] Saitou N, Nei M. The neighbor-joining method: A new method for reconstructing phylogenetic trees. Molecular Biology and Evolution. 1987;(4):406–425.

[38] Felsenstein J. Confidence limits on phylogenies: An approach using the bootstrap. Evolution. 1985;(39):783–791.

[39] Tamura K, Nei M, Kumar S. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proceedings of the National Academy of Sciences (USA). 2004;(101):11030–11035.

[40] Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: Molecular Evolutionary Genetics Analysis using Maximum Likelihood, Evolutionary Distance, and Maximum Parsimony Methods. Molecular Biology and Evolution. 2011; (In Press).

[41] Smith DM, Wong JK, Hightower GK, Ignacio CC, Koelsch KK, Petropoulos CJ, et al. HIV drug resistance acquired through superinfection. AIDS (London, England). 2005;(19):1251–1256.

Appendix

Figure 12: Editing process in the program Sequencher.

Figure 13: Alignment process in the program BioEdit.

Figure 14: Confirmation of subtyping using Rega tool 2.0.

Figure 15: BLAST search of sequences.

Table 2: CD4+ T cell count and viral load over time.

Table 3: Selected HIV-1 subtype reference sequences.

Table 4: Data from blasting of sequences.
