IMPAIRED SOMATIC HYPERMUTATION AND INCREASED ACTIVATION-INDUCED CYTIDINE DEAMINASE EXPRESSION DURING HIV-1 INFECTION

by

ELISABETH BOWERS

B.S., University of Washington, Seattle, 2002

A thesis submitted to the

Faculty of the Graduate School of the

University of Colorado in partial fulfillment

of the requirements for the degree of

Doctor of Philosophy

Microbiology Program

2013

This thesis for the Doctor of Philosophy degree by

Elisabeth Bowers

has been approved for the

Microbiology Program

by

Jerry Schaack, Chair

Edward N. Janoff, Advisor

Linda Van Dyk

Larry Wysocki

Cara Wilson

Date 7/18/13


Bowers, Elisabeth (Ph.D., Microbiology)

Impaired Somatic Hypermutation and Increased Activation-Induced Cytidine Deaminase Expression during HIV-1 Infection

Thesis directed by Professor Edward N. Janoff

ABSTRACT

Background: HIV-1 infection is complicated by high rates of opportunistic infections against which specific antibodies contribute to immune defense. Antibody function depends on somatic hypermutation (SHM) of variable regions of immunoglobulin (Ig) heavy chain genes. SHM is mediated by a B cell-specific enzyme, activation-induced cytidine deaminase (AID).

Methods: We characterized the frequency of SHM in expressed VH3-family IgD, IgM, IgA, and IgG mRNA immunoglobulin transcripts from control and HIV-1-infected patients using high-throughput pyrosequencing. We also compared AID mRNA expression by qRT-PCR, AID isoform expression by PCR, and the activation phenotype of B and T lymphocyte subsets among peripheral blood mononuclear cells (PBMC) from HIV-1-infected and control subjects pre- and post-stimulation.

Results: VH3-IgM and VH3-IgA SHM frequencies did not differ between HIV-1-infected patients and controls. VH3-IgG SHM frequencies were significantly lower in HIV-1-infected patients, as were mutation frequencies in Bcl6, a non-Ig AID target. In contrast, VH3-IgD SHM frequencies were significantly higher in HIV-1-infected patients. Mutation patterns were comparable in both groups in all isotypes regardless of SHM frequency.

AID mRNA expression was significantly higher in HIV-1-infected patients compared with controls. AID increased significantly post-stimulation in both groups, but the expression levels were lower among HIV-1-infected patients. At baseline, activation markers for B and T cells in multiple naïve and memory subsets were significantly higher in HIV-1-infected patients, but activation levels were not significantly different post-stimulation. AID expression correlated significantly with VH3-IgD SHM frequency and activation of T cells, but not with VH3-IgG SHM frequency, Bcl6 mutation frequency, B cell activation, or plasma HIV-1 RNA. AID isoform expression was comparable in both groups.

Conclusions: B cells from HIV-1-infected patients show disparate SHM frequencies, especially among sequences known to control opportunistic infections, which commonly cause morbidity during HIV-1 infection. Similar mutation patterns suggest differences in quantity, but not quality, of AID activity. Despite increased expression of AID mRNA and surface activation markers at baseline, B cells from HIV-1-infected patients demonstrated a diminished capacity to upregulate AID mRNA in response to stimuli. These impairments may compromise humoral immune responses to opportunistic infections and even to HIV-1 itself.

The form and content of this abstract are approved. I recommend its publication.

Approved: Edward N. Janoff


ACKNOWLEDGEMENTS

I would like to thank my family, especially my husband and my daughter, for all the sacrifices they have made and the support they have given me so that I could pursue my scientific goals. God has blessed me immensely with your support, encouragement and patience; so much more than I deserve. I would not have finished without your help.

I would also like to thank all the friends that we have made during our time in Colorado who have also given us great encouragement and support when being so far from home.

Many people contributed to the work presented in this thesis. The sequencing experiments would not have been possible without the valuable input from the Pollock lab, specifically Todd and Jill Castoe. Fantastic customer service was provided by Sergio Pereira at the Toronto TCAG Sequencing Core. Janoff lab and Frank lab current and former members, Diana Ir, Leah Feazel, and Emily Eshleman, all contributed to generating and submitting sequences. Sequence analysis has been a huge challenge throughout this project, and the greatest contributor to my success in this arena is Dan Frank, who sacrificed a lot of time, hard-drive space, and RAM to analyze the massive amounts of data generated by this project. Initial efforts at Batch Analyzer coding by Abdul Rajib Bahar and Isaac Spitzer were also highly appreciated. Thank you also to Sam MaWhinney for thoughtful and tireless statistical analysis and to Tim Wright for patient recruitment at Denver Health Hospital. I am greatly appreciative also to the resident flow cytometry expert in the Janoff lab, Harsh Pratap, for endless hours spent isolating PBMC and optimizing, running, and analyzing the flow cytometry data. Thanks also to Alison McMahon and Melissa Keays, former Janoff lab members, for flow expertise and effort. Other Janoff lab members have also been incredibly supportive and knowledgeable in helping me survive this experience with some sanity still intact: Jeremy Rahkola, Jana Palaia, Claire Gustafson, Jenna Achenbach, Rick Sullivan, and Jacinta Cooper. I would also like to thank my committee members Jerry Schaack, Linda van Dyk, Cara Wilson, and Larry Wysocki for all of the time that you have dedicated to this project and my advancement, for your patience, encouragement, and insightful suggestions. Finally, I would also like to thank my mentor, Edward Janoff, for all the time spent pushing me to work harder, think outside the box, and expand my scientific horizons. I am extremely appreciative of the time spent and the effort expended on my behalf.


TABLE OF CONTENTS

CHAPTER Page

I. INTRODUCTION……………………………………………………………1

HIV-1………………………………………………………………………….1

Epidemiology……………………………………………………………...1

HIV-1 life cycle…………………………………………………….……..3

Pathogenesis……………………………………………………….………8

B cells………………………………………………………………..……….10

B cell development……………………………………………..………..10

Antibody structure……………………………………….………………18

Activation-induced cytidine deaminase…………………………...…..…21

Somatic hypermutation and class-switch recombination………….……..25

B cells and HIV-1………………………………………………….…….32

II. MATERIALS AND METHODS…………………………………………….35

Patient data…………………………………………………………..……….35

PBMC isolation and stimulation……………………………………………..36

Isolation…………………………………………………………………..36

Stimulation………………………………………………………….……36

Total immunoglobulins………………………………………………...…….36

mRNA isolation……………………………………………………..……….37

DNA isolation………………………………………………………………..37

cDNA generation………………………………………………………..…..37

Real-time PCR………………………………………………………………38

AID isoforms amplification and cloning…………………..………………...39

AID isoforms PCR………………………………………………..……..39

AID isoforms cloning………………………………………………...….40

Amplification, cloning, and sequencing of VH3-IgG genes………..………..41

PCR amplification of VH3-IgG genes……………………………………41

Cloning of VH3-IgG genes……………………………………………….43

Sequencing of VH3-IgG clones……………………………….………….43

VH3-IgG cloned sequence analysis…………………………………..………43

Determination of polymerase fidelity……………………………………43

VH3-IgG cloned sequence alignments and mutation calculations……….44

Mutation patterns analysis…………………………...…………………..44

CDR3 region analysis……………………………………..…………….45

VH3 454 PCR/High-throughput pyrosequencing…………………..………..45

1st round amplification…………………………………………………..45

2nd round amplification…………………………………………..………46

VH3 454-pyrosequencing analysis…………………………………….……..49

VhIGene program………………………………………………….……..49

Mutation patterns analysis………………………………………...……..50

CDR3 region analysis……………………………………………………50

Preliminary Bcl6 gene PCR, cloning, sequencing, and analysis…….………50

Preliminary Bcl6 PCR amplification……………………………………50


Cloning of Bcl6 genes………………………………………………..….52

Sequencing of Bcl6 clones…………………………………………..…..54

Bcl6 cloned sequence alignments and mutation calculations……...……54

Bcl6 454 PCR/High-throughput pyrosequencing……………….…………..54

1st round amplification…………………………………….…………….54

2nd round amplification………………………………………...………..55

Bcl6 454-pyrosequencing analysis……………………………….…………57

Flow cytometry………………………………………………………...……57

Statistical analysis……………………………………………………….…..58

III. THE SOMATIC HYPERMUTATION FREQUENCY OF VH3 FAMILY IMMUNOGLOBULIN GENES IS ALTERED IN HIV-1-INFECTED PATIENTS COMPARED WITH HEALTHY CONTROLS………………………………………………………….……..59

Introduction………………………………………………………….….59

Results………………………………………………………………..….62

VH somatic hypermutation frequency is reduced in VH3-IgG cloned samples from viremic HIV-1-infected patients but the mutation pattern is normal……………..………………………….…62

VH somatic hypermutation frequency is reduced in VH3-IgG 454- pyrosequenced genes from viremic HIV-1-infected patients but increased in VH3-IgD genes……………………………………….…82

Mutation frequency is reduced in Bcl6 genes in HIV-1-infected patients………………………………………………………...……115

Discussion………………………………………………………..……..123

Disparate SHM frequencies occur during HIV-1 infection……...…123

V(D)J recombination is normal during HIV-1-infection………...…124

Regulation of AID may be altered during HIV-1 infection……...…125


DNA repair pathways involved in SHM may be impaired during HIV-1 infection………………………………………………..……127

Antibody selection and function may be decreased during HIV-1 infection………………………………..…128

Non-specific B cell activation during HIV-1 infection………..……128

Limitations of the studies……………………………………...……129

IV. AID mRNA EXPRESSION IS INCREASED IN FRESH PBMC FROM HIV-1-INFECTED COMPARED WITH HEALTHY CONTROLS………………………………………………………….…….131

Introduction………………………………………………..……………131

Results……………………………………………………..……………138

Baseline AID mRNA expression is higher, but AID mRNA induction post-stimulation is lower in HIV-1-infected patients in which VH3-IgG cloning and sequencing was performed……………………………………………...……………138

Baseline B cell and T cell activation is higher in HIV-1-infected patients in whom VH3-IgG cloning and sequencing was performed, but activation levels are comparable to controls post-stimulation………………………………………..……………145

Baseline AID mRNA expression is higher in HIV-1-infected patients in which 454-Pyrosequencing of VH3 Genes was performed…………………………………...………………………149

B and T cell subsets are more activated in HIV-1-infected patients in whom 454-pyrosequencing was performed…………….……153

Discussion………………………………………………………………168

High baseline AID mRNA expression is likely driven by T cell activation…………………………………………………..…168

Induction of AID mRNA expression and B cell differentiation may be impaired during HIV-1 infection……………………...……170

Correlation of high VH3-IgD SHM frequency and high baseline AID implies normal AID function during HIV-1 infection………....172


AID isoform expression does not likely affect AID function during HIV-1 infection……………………………………………...172

B cell subset proportions cannot account for discrepancies in SHM frequency……………………………………………….……..173

Limitations of the studies……………………………………………174

V. SUMMARY AND CONCLUSIONS…………………………………....…176

Summary……………………………………………………………..…176

Disparate frequencies of SHM by isotype during HIV-1 infection………………………………………………………….….177

High baseline AID mRNA expression and decreased expression post-stimulation…………………………………………………...…179

B and T cell subsets are more activated during HIV-1 infection…....180

Potential Mechanisms……………………………………………..……182

AID may be less functional in germinal centers during HIV-1 Infection…………………………………………………………..…182

Early dissolution of the germinal center during HIV-1 infection...…187

Abnormal trafficking within germinal centers during HIV-1 Infection…………………………………………………………..…192

Non-specific, T cell-independent B cell activation during HIV-1 Infection………………………………………………………..……194

Future Directions…………………………………………………….…198

Which B cell subsets have disparate SHM frequencies during HIV-1 infection? ……………………………………………………198

Which B cell subsets have high AID expression? …………….……200

Does increased AID expression translate to increased SHM frequency during HIV-1 infection?……………………………….…201

Are GC signaling molecule levels normal in HIV-1-infected patients? …………………………………………………..………...202


Impact of the work in this thesis………………………….…………203

REFERENCES…………………………………………………………………………205

APPENDIX 230

A. Mutation Pattern Tables……………………………………………………..…230

B. Failed Experiments………………………………………………………..……241

AID Functional Assay……………………………………………..……241

Measuring AID Protein Expression by Flow Cytometry…………….…256

C. Appendix References…………………………………………………………..265


LIST OF TABLES

TABLE Page

2.1 AID Isoforms PCR Primer Sequences……………………………………...……40

2.2 VH3-IgG PCR Primer Sequences…………………………………………….…..42

2.3 VH3 454-Pyrosequencing PCR Primer Sequences…………………………...….48

2.4 Preliminary Bcl6 PCR Amplification and Cloning Primer Sequences……….….53

2.5 Bcl6 454-Pyrosequencing PCR Primer Sequences…………………………..…..56

3.1 Clinical Characteristics of VH3-IgG Cloning Study Subjects…………………....63

3.2 Biochemical Characteristics of Amino Acids in the CDR3 Regions……...…….67

3.3 RGYW/WRCY Motifs and Targeted Mutation Frequencies in CDR and FR Regions in VH3-IgG Genes………………………………………...……77

3.4 Nucleotide Mutation Patterns in VH3-IgG Genes …………………………….....78

3.5 Nucleotide Mutation Proportions in VH3-IgG Genes ………..……………….…79

3.6 Nucleotide Mutations and Adjacent Nucleotide Patterns in VH3-IgG Genes…....81

3.7 Clinical Characteristics of 454-Pyrosequencing Study Subjects…………….…..90

3.8 Nucleotide Mutation Patterns and Proportions in VH3 454-Pyrosequenced Samples…………………………………………………………………………111

3.9 Transition and Transversion Mutation Proportions in VH3 454-Pyrosequenced Samples…………………………………………………………………………114

3.10 Preliminary Bcl6 Mutation Frequencies………………………………..………117

3.11 Preliminary Bcl6 Mutation Frequencies Post-Stimulation……………..………119

3.12 Clinical Characteristics of Bcl6 Study Subjects………………………………..119

4.1 Cell Marker Expression in CD19+ B Cell Subsets……………………….……..135

4.2 Cell Marker Expression in T Cell Subsets……………………………...………137

4.3 Clinical Characteristics of Study Subjects……………………………...………139

4.4 Clinical Characteristics of AID Isoform Study Subjects…………………….....146

4.5 AID Isoforms Expression………………………………………………………147

4.6 Clinical Characteristics of Study Subjects………………………………….…..151

5.1 Factors Required for Germinal Center Maintenance………………………...…190

A1 Biochemical Characteristics of Amino Acids in CDR3 Regions from VH3 454-Pyrosequenced Samples……………………………………………....230

A2 RGYW/WRCY Motifs and Targeted Mutation Frequencies in CDR and FR Regions in VH3 454-Pyrosequenced Samples……………………….…233

A3 Nucleotide Mutation Patterns and Proportions in VH3 454- Pyrosequenced Samples………………………………………………………...235

A4 Nucleotide Mutations and Adjacent Nucleotide Patterns in VH3 454- Pyrosequenced Samples………………………………………………..……….237

B1 Primer Sequences………………………………………………………..……...245


LIST OF FIGURES

FIGURE Page

1.1 Typical Course of HIV Infection.…………………………………………………4

1.2 HIV-1 Life Cycle………………………………………………………….………5

1.3 B Cell Development………………………………………………………...……11

1.4 V(D)J Recombination………………………………………………………...….13

1.5 Germinal Center Reaction…………………………………………………….….14

1.6 Antibody Structure……………………………………………………………….19

1.7 AID Enzyme……………………………………………………………………..22

1.8 AID Isoforms……………………………………………………………….……24

1.9 Model of Somatic Hypermutation…………………………………………….…27

2.1 Representative Real-Time PCR Run…..………………………………..….……39

2.2 Immunoglobulin Heavy Chain mRNA…………………………………….…….42

2.3 454-Pyrosequencing Protocol……….……………………………………...……47

2.4 Flow Diagram of VhIGene Program……………………………………….…….51

2.5 Preliminary Segments Selected for Testing Bcl6 Mutation Frequency……...…..52

3.1 Immunoglobulin Heavy Chain mRNA……………………………….………….61

3.2 Expression of VH3-IgG Family Genes and VH3-23……………………..……….65

3.3 Expression of D and JH Genes…………………………………………………...66

3.4 Mutation Frequency is Reduced in CDR1/2 Regions of VH3-IgG Genes from Viremic HIV-1-Infected Patients………………………………...…70

3.5 Replacement Amino Acid Mutation Frequencies are Reduced in FR1/2/3 Regions of VH3-IgG Genes from Viremic HIV-1-Infected Patients…………………………………………………………………………..71


3.6 The Proportions of Mutated Sequences in VH3-IgG Genes……..………………72

3.7 Frequencies of Amino Acid Replacement Mutations in Specific VH3-IgG Genes……………………………………………………………….…73

3.8 Frequencies of Amino Acid Mutations at Each Amino Acid Position in VH3-IgG Genes……………………………………………………………….…76

3.9 Dinucleotide Analysis of C:T and A:G Transition Mutations in VH3-IgG Genes…………………………………………………………………………….80

3.10 454-FLX Pyrosequencing of VH3 cDNA Sequences Isolated from Control Subject PBMCs………………………………………………………………….84

3.11 Error Frequency vs. Quality Score Cut-Off………………………….…………89

3.12 VH3 Gene Expression in 454-Pyrosequenced Samples…….………...………….92

3.13 D Gene Expression in 454-Pyrosequenced Samples………….…..…………….93

3.14 JH Gene Expression in 454-Pyrosequenced Samples……………...…………….95

3.15 Mutation Frequencies in CDR1/2 Regions of VH3 454-Pyrosequenced Samples...... 99

3.16 Mutation Frequencies in FR1/2/3 Regions of VH3 454-Pyrosequenced Samples………………………………………………………………………....102

3.17 Proportions of Mutated Sequences in VH3 454-Pyrosequenced Samples…………………………………………………………………………105

3.18 Frequencies of Replaced Amino Acid Mutations in Specific VH3 Genes from 454-Pyrosequenced Samples…………………………………………….……..106

3.19 VH3-IgD SHM Frequency Correlates with VH3-IgM SHM Frequency and HIV-1 Plasma Viral Load………………………………………………………108

3.20 Bcl6 Gene and Segment Positions……………………..……………………….116

3.21 Mutation Frequencies in 454-Pyrosequenced Bcl6 Sequences are Reduced in HIV-1-Infected Patients……………………………………………………..121

3.22 Baseline Bcl6 Mutation Frequency Correlates with VH3-IgG SHM Frequency and CD4+ T Cell Count……….………...………………………….122

4.1 AID mRNA Expression in PBMCs..………………………...…………………141

4.2 AID mRNA Expression Induced by B Cell Stimulants……...…………………143

4.3 AID Isoforms Expression………………………………………………………146

4.4 Activation Phenotype of B Cells and T Cells at Baseline and Post-Stimulation………………………………………………………………...148

4.5 Relationship Between T Cell Activation with Baseline AID mRNA Expression in PBMC……….…………………………………………..………150

4.6 AID mRNA Expression in 454-Pyrosequenced PBMCs………….……………152

4.7 Baseline AID mRNA Expression Positively Correlates with VH3-IgD CDR1/2 Amino Acid SHM Frequency………..……………………..…………153

4.8 B Cell Subsets in 454-Pyrosequenced PBMCs…………………………………156

4.9 Baseline and Stimulated AID mRNA Expression Correlate with BND, GC, and IgM C-S B Cell Subsets……………………… ………………………..….160

4.10 T Cell Subsets in 454-Pyrosequenced PBMCs…………………………………162

4.11 CD8+ T Cell Activation and TFH Activation Correlate with Baseline AID mRNA Expression………………………………………………………….…..164

4.12 TFH Cell Proportions Correlate with VH3-IgA and VH3-IgG SHM Frequencies…………………………………………………………………..…164

4.13 Activated T Cell Subsets Correlate with Activated B Cell Subsets…………….166

5.1 Induction of AID mRNA Expression…………………………………………..184

5.2 AID Protein Function………………………………………………………...…186

5.3 Germinal Center Signaling…………………………………………..…………189

5.4 Impairment of AID, IL-21 Signaling Deficiencies, and T Cell-Independent Germinal Center Formation during HIV-1 Infection…………………..………197

B1 AID Functional Assay Plasmids………………………………………….…….250

B2 Mock-Nucleofection of Primary Human B Cells…………………………....…252

B3 Nucleofection of Primary Human B Cells with Control GFP Plasmid…….…..253

B4 Nucleofection of Primary Human B Cells with FEpM1-3 Assay Plasmid…..…254

B5 Staining of Non-Fixed/Permeabilized PBMC………………………………….259

B6 Staining of Fresh PBMC using the Cell Signaling Technologies anti-Human AID Rat Monoclonal Antibody………………………………………...………260

B7 Staining of Stimulated PBMC using the Cell Signaling Technologies anti-Human AID Rat Monoclonal Antibody…………………………...………261

B8 Staining of Fresh PBMC using the RnD anti-Human AID Mouse Monoclonal Antibody………………………………………………………………..………262

B9 Staining of Stimulated PBMC using the RnD anti-Human AID Mouse Monoclonal Antibody………………………………………………………..…263


LIST OF ABBREVIATIONS

ADCC Antibody-Dependent Cell-mediated Cytotoxicity

AID Activation-Induced Cytidine Deaminase

AIDS Acquired Immunodeficiency Syndrome

A-NHEJ Alternative Non-Homologous End Joining

ANOVA Analysis Of Variance

APE Apurinic/Apyrimidinic Endonuclease

ART Anti-Retroviral Therapy

BAFF B cell Activating Factor belonging to the TNF Family

BAFFR BAFF Receptor

BCR B Cell Receptor

BER Base Excision Repair

BLAST Basic Local Alignment Search Tool

CDR Complementarity Determining Region

CH Heavy Chain Constant Region

CSR Class Switch Recombination

D Heavy Chain Diversity Region

DC Dendritic Cell

DC-SIGN Dendritic Cell-Specific Intercellular Adhesion Molecule-3-Grabbing Non-Integrin

DNA Deoxyribonucleic Acid

dsDNA Double Stranded DNA

Env HIV-1 Envelope Gene

ESCRT Endosomal Sorting Complex Required for Transport

FACS Fluorescence-Activated Cell Sorting

FDC Follicular Dendritic Cell

FR Framework Region

Gag HIV-1 Group-specific Antigen

GALT Gut-Associated Lymphoid Tissue

GAPDH Glyceraldehyde 3-Phosphate Dehydrogenase

GC Germinal Center

HAART Highly-Active Anti-Retroviral Therapy

HIV-1 Human Immunodeficiency Virus-1

IFN Interferon

Ig Immunoglobulin

IL Interleukin

IMGT ImMunoGeneTics Database

Indel Insertions and Deletions

JH Heavy Chain Joining Region

LTR Long Terminal Repeat

MFI Mean Fluorescence Intensity

MHC Major Histocompatibility Complex

MMR Mismatch Repair

Nef HIV-1 Negative regulatory Factor

NES Nuclear Export Signal

NHEJ Non-Homologous End Joining

NLS Nuclear Localization Signal


NNRTI Non-Nucleoside Reverse Transcriptase Inhibitor

NRTI Nucleoside Reverse Transcriptase Inhibitor

PBMC Peripheral Blood Mononuclear Cell

PCNA Proliferating Cell Nuclear Antigen

PCR Polymerase Chain Reaction

PI Protease Inhibitor

PIC Pre-Integration Complex

PKA Protein Kinase A

RAG Recombination Activating Gene

Rev HIV-1 Regulator of Expression of Viral proteins

RFU Relative Fluorescence Units

RNA Ribonucleic Acid

RPA Replication Protein A

RSS Recombination Signal Sequence

RT Reverse Transcriptase

RTC Reverse Transcription Complex

SHM Somatic Hypermutation

Tat HIV-1 Transactivator

TD T Cell-Dependent

TdT Terminal Deoxynucleotidyl Transferase

TFH Follicular Helper T Cell

TI T Cell-Independent

TNF Tumor Necrosis Factor


TLR Toll-Like Receptor

TLS Translesion Synthesis

UNG Uracil DNA Glycosylase

VH Heavy Chain Variable Region

Vif HIV-1 Viral Infectivity Factor

VL Light Chain Variable Region

Vpr HIV-1 Viral Protein R

Vpu HIV-1 Viral Protein U


CHAPTER I

INTRODUCTION

HIV-1

Epidemiology

Human immunodeficiency virus-1 (HIV-1) has become a major global health issue since the epidemic was first recognized in 1981 [1]. In the U.S. alone, an estimated 40,000-60,000 people become newly infected yearly, 1.2 million people are currently infected with HIV-1, and nearly 18,000 people die every year due to complications of HIV-1 infection [2]. Globally, an estimated 34 million people are infected, with 2.7 million new infections per year, resulting in 1.8 million deaths annually [3]. Infection and death rates are especially high in developing countries due to clinical presentation late in the course of disease and limited access to both medical resources and antiretroviral therapies.


HIV-1 is especially problematic because it infects, and can directly and indirectly destroy, cells of the immune system (CD4+ T lymphocytes and macrophages), leaving the host susceptible to numerous opportunistic infections that do not normally cause sickness or death in the healthy general population [4, 5].

The course of HIV-1 infection is divided into several stages (Figure 1.1). The primary infection can last up to 6 months post-infection. During this stage plasma HIV-1 RNA levels, or “viral loads,” can reach as high as a million copies per mL of blood and the CD4+ T cell count drops below the normal range (800-1200 cells/mm³ of blood) [1]. Most HIV-1 transmission is proposed to occur during this period of high-level viremia. Within approximately 6 months, the immune system gains partial control over viral replication, the viral load decreases and stabilizes at a “set point” that can range from <2,000 virions/mL to >100,000 virions/mL, and CD4+ T cell counts increase. This clinical latency or asymptomatic stage lasts 8-10 years, on average. The magnitude of the viral set point is predictive of the rate of the patient’s disease progression and immune decline over this period [6]. During the asymptomatic stage, antigen-selected viral mutations can lead to immunological escape from both CD8+ cytotoxic T cells and antibodies [7], resulting in increasing viral loads and decreasing CD4+ T cell counts.

Patients then enter the symptomatic stage, during which viral loads increase to over 100,000 copies/mL and CD4+ T cell counts drop below 500 cells/mm³. Patients develop minor infections such as herpes simplex, warts and fungal infections, and oral (thrush) and vaginal candidiasis. As viral loads continue to increase and CD4+ T cell counts decrease further, patients develop acquired immunodeficiency syndrome (AIDS), characterized by CD4+ T cell counts below 200 cells/mm³ and the development of opportunistic infections and cancers, particularly a range of B cell lymphomas, eventually resulting in the patient’s death [1, 4]. Understanding the viral life cycle has provided insights into identifying vulnerable pathways for antiretroviral medications and, potentially, vaccine development to limit disease progression and its morbid sequelae.
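As a rough illustration of the CD4+ T cell thresholds quoted above (below 500 cells/mm³ for the symptomatic stage, below 200 cells/mm³ for AIDS), the following Python sketch maps a count to a coarse stage label. The function name and the labels are assumptions introduced only for illustration. The sketch ignores viral load and clinical findings, so it is not a clinical staging tool.

```python
def classify_stage(cd4_per_mm3: float) -> str:
    """Return a coarse stage label for a CD4+ T cell count (cells/mm^3).

    Hypothetical helper: the cut-offs are the ones quoted in the text
    (<500 symptomatic, <200 AIDS); the labels are illustrative only.
    """
    if cd4_per_mm3 < 0:
        raise ValueError("CD4+ T cell count cannot be negative")
    if cd4_per_mm3 < 200:
        return "AIDS"
    if cd4_per_mm3 < 500:
        return "symptomatic"
    # Counts of 500 or more cannot distinguish primary infection from
    # clinical latency without additional information (e.g., time course).
    return "primary or asymptomatic"


if __name__ == "__main__":
    for count in (950, 420, 150):
        print(count, classify_stage(count))
```

A count alone cannot separate primary infection from clinical latency, which is why the sketch returns a combined label for counts of 500 cells/mm³ or more.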

HIV-1 life cycle

HIV-1 is a spherical, enveloped virus belonging to the retrovirus family. Inside the envelope is a nucleocapsid core surrounding two copies of a 9 kb single-stranded, positive-sense RNA genome [4, 10]. The genome is composed of three genes common to all retroviruses (Gag, Pol, and Env), as well as six accessory genes (Tat, Nef, Rev, Vpr, Vif, and Vpu), encoding 17 different proteins [10]. The Env gene product, gp160, is cleaved into two glycoproteins expressed in the envelope, gp41 and gp120, which mediate viral binding [4].

As binding of HIV-1 gp120 to the primary viral receptor, CD4 (Figure 1.2), is of relatively low affinity, early attachment of HIV-1 to host cells is likely the result of binding to cell surface molecules such as heparan sulfate proteoglycan, LFA-1, and nucleolin [11]. Upon binding of gp120 to CD4 (Figure 1.2, step 1), the conformation of gp120 changes, leading to the exposure of new epitopes that can bind to a coreceptor: CCR5, CXCR4, or both. Binding to the coreceptor induces further conformational changes in gp120, which dissociates from gp41, at which point gp41 inserts into the host plasma membrane, resulting in fusion of the viral and host membranes (Figure 1.2, step 2) [10-12].


Figure 1.1: Typical Course of HIV-1 Infection. During the early period after primary infection there is widespread dissemination of virus and a sharp decrease in the number of CD4 T cells in peripheral blood. An immune response to HIV ensues, with a decrease in detectable viremia followed by a prolonged period of clinical latency. The CD4 T cell count continues to decrease during the following years, until it reaches a critical level below which there is a substantial risk of opportunistic diseases [8, 9].

The events immediately following entry of the virion core into the cytoplasm are not well characterized. The capsid core may begin to dissolve immediately, releasing the Reverse Transcription Complex (RTC) into the cytoplasm, where reverse transcription begins [12], or it may remain intact and travel along microtubules and actin filaments to the nuclear membrane [13]. In the latter scenario, reverse transcription occurs inside the core, and the Pre-Integration Complex (PIC), containing the reverse-transcribed viral DNA genome, actively enters the nucleus through a nuclear pore (Figure 1.2, steps 3-5).


Figure 1.2: HIV-1 Life Cycle. The infection begins when the envelope (Env) glycoprotein spikes engage the receptor CD4 and the membrane-spanning co-receptor CC-chemokine receptor 5 (CCR5) (step 1), leading to fusion of the viral and cellular membranes and entry of the viral particle into the cell (step 2). Partial core shell uncoating (step 3) facilitates reverse transcription (step 4), which in turn yields the pre-integration complex (PIC). Following import into the cell nucleus (step 5), PIC-associated integrase orchestrates the formation of the integrated provirus (step 6). Proviral transcription (step 7) yields viral mRNAs of different sizes (step 8). mRNAs serve as templates for protein production (step 9), and genome-length RNA is incorporated into viral particles with protein components (step 10). Viral-particle budding (step 11) and release (step 12) from the cell is mediated by ESCRT (endosomal sorting complex required for transport) complexes and is accompanied or soon followed by protease-mediated maturation (step 13) to create an infectious particle [12].

Reverse transcription, whether it takes place inside the capsid core or in the cytoplasm, occurs in the context of the RTC (Figure 1.2, step 4). This complex is proposed to comprise the viral RNA genome and several viral proteins including the Reverse Transcriptase (RT) and the matrix protein [13]. Reverse transcription begins with the synthesis of minus-strand DNA at the 5’ end of the RNA genome. This short DNA fragment then undergoes a strand transfer event and becomes the primer at the 3’ end of the genome for the rest of the minus-strand DNA. While the minus-strand DNA is being copied, the RNA template is degraded. Synthesis of the plus-strand DNA then begins near the center of the minus-strand DNA template. After a second strand transfer, the remaining plus-strand DNA is synthesized at the 5’ end of the minus-strand DNA template [13].

The completed double-stranded (ds) DNA viral genome is now part of the PIC, composed of the dsDNA, matrix protein, Viral Protein R (Vpr), and viral Integrase [11, 12]. This complex is actively transported through nuclear pores by the action of both viral and host proteins (Figure 1.2, step 5). Both Integrase and the matrix protein bind host DNA-binding proteins on chromatin, preferentially near genes that are transcribed by RNA Pol II [11]. Integrase cleaves both ends of the viral DNA genome as well as the host DNA, then joins the viral and host DNA at the resulting free 5’ and 3’ ends. Host enzymes repair the single-stranded gaps, creating the stable provirus (Figure 1.2, step 6) [10, 12].

Transcription of the viral genome is dependent upon the binding of transcription factors to the promoter in the 5’ Long Terminal Repeat (LTR) of the viral genome (Figure 1.2, step 7). Early transcripts are multiply spliced, code mainly for viral accessory proteins Tat, Nef, and Rev, and can easily exit the nucleus through nuclear pores. Later transcripts are singly spliced or unspliced and require the action of Rev to exit the nucleus (Figure 1.2, step 8) [14]. Rev binds to a region of the transcript and to a host nuclear export protein, CRM1, to mediate nuclear exit [12]. These longer transcripts code for structural genes Gag, Pol, and Env, and serve as new viral genomes (Figure 1.2, step 9) [10, 12].

Assembly of immature virions occurs in lipid rafts at the plasma membrane (Figure 1.2, step 10) [15, 16]. The Env gene product is translated in the endoplasmic reticulum, then glycosylated and cleaved at the Golgi complex and directed to the plasma membrane [10]. The Gag gene product is myristoylated at its N-terminus and targeted to the plasma membrane [15]. The Gag gene product is cleaved into 6 proteins: the matrix (p17), nucleocapsid (p7), and capsid (p24) proteins, and the smaller core proteins p1, p2, and p6 [10]. Both the nucleocapsid and p2 proteins bind to a packaging signal present on unspliced RNA viral genomes. The matrix protein binds Gag-Pol transcripts [10]. Both are brought to the plasma membrane when the Gag complex assembles. Other viral and host proteins are also incorporated into the assembled viral core, including Vif, Vpr, and Cyclophilin A [16]. The assembled complex induces membrane curvature leading to bud formation (Figure 1.2, step 11) [16]. Budding requires the interaction of viral protein p6 and host protein Tsg101, a component of the endosomal sorting complex required for transport (ESCRT) [12, 15, 16]. Ion channels formed by Viral Protein U (Vpu) also aid in viral budding [12, 15]. Once released, the immature virus must undergo additional maturation to become an infectious particle (Figure 1.2, step 12). Maturation involves a conformational change of the structural components of the virus, as well as enzymatic cleavage of Gag-Pol precursors. Maturation requires four cleavage steps. First, the Gag-Pol immature protease cleaves Gag-Pol to form the mature protease. The mature protease then cleaves the Gag-Pol precursor into p2, nucleocapsid, and matrix proteins (Figure 1.2, step 13) [12, 16]. The enveloped virus is now infectious. Envelope gp120 and gp41 moieties are the primary targets of neutralizing antibodies that inhibit person-to-person transmission, but these antibodies appear unsuccessful in limiting viral replication within the infected person due to serial antigen-driven mutations and glycosylation in gp120.

Pathogenesis

HIV-1 infects immune cells which express both CD4 and either of the coreceptors CCR5 or CXCR4: T cells and macrophages [17, 18]. Primary HIV-1 infections most often occur at mucosal sites, where the virus is believed to be captured by the dendrites of intra- or subepithelial dendritic cells (DCs), Langerhans cells, or macrophages. Captured virus is then passed on to CD4+CCR5+ T cells in which productive infection is established [7, 18]. The virus spreads rapidly to other lymphoid tissues, particularly the gut-associated lymphoid tissue (GALT) and especially the effector lamina propria, where it can directly and indirectly cause the depletion of ~80% of the largely activated CD4+ T cells [7], including both memory and IL-17-secreting (Th17) CD4+ T cells, within the first three weeks of infection [19]. This massive depletion of CD4+ T cells, as well as indirect dysregulation of B cells, results in the loss of nearly half of all gut germinal centers [7]. As the GALT becomes compromised, the integrity of the gut mucosal barrier breaks down, leading to the translocation of microbial products from the mucosa into the bloodstream [20]. These microbial products are proposed to contribute to the systemic immune activation and dysregulation characteristic of chronic HIV-1 infection, including non-specific chronic B and T cell activation [7, 19, 20]. From the gut, the virus spreads to other lymphoid tissue, especially the lymph nodes, resulting, in later stages of infection, in attenuation of the follicular dendritic cell network and in the formation of germinal centers [21].

Chronic HIV-1 infection results in changes in the proportions of T cells in the blood and lamina propria: a drop in CD4+ and an increase in CD8+ T cell numbers, with increased activated and memory cell subsets. This skewing is not limited to T cells, however, but similarly affects B cells, DCs, and natural killer cells. Immune cells also frequently become exhausted, losing both functionality and proliferative capacity during the course of infection [22-24]. In nearly 50% of patients infected with HIV-1 subtype B (the predominant circulating subtype in North America and Europe), the virus may shift from utilizing CCR5 as a coreceptor to using CXCR4 during the later stages of infection [17]. This switch in viral tropism is classically associated with very advanced disease, low CD4+ T cell numbers, and the development of AIDS in untreated patients [4].

Although HIV-1 has only been shown to infect CD4+ T lymphocytes and macrophages in vivo, the virus does have direct and indirect effects on the phenotype and function of the humoral effector arm of the adaptive immune response, B lymphocytes. The consequence of this effect is a dramatically increased incidence of infections for which antibodies are important in defense (e.g., Streptococcus pneumoniae, Haemophilus influenzae, non-typhi Salmonellae, and Cryptococcus neoformans) as well as of autoimmune disease and B cell lymphomas [25-29].


B cells

B cell development

B cells begin as pluripotent hematopoietic CD34+ stem cells in the bone marrow (Figure 1.3) [30-33]. With the help of soluble signaling factors produced by bone marrow stromal cells, including stem cell factor (SCF), thrombopoietin (TPO), flt3 ligand, and other unidentified factors [34], these stem cells differentiate into early-B and pro-B cells [34, 35].

Pro-B cells begin antigen-independent rearrangement of the variable portion of the B cell receptor (BCR), composed of the variable (V), diversity (D), and joining (J) gene segments of the heavy (H)-chain genes (Figure 1.4) [30-32]. V(D)J recombination is mediated primarily by two enzymes expressed during the pro-B cell stage, recombination activating genes (RAG)-1 and -2 [31, 36, 37]. These enzymes mediate recombination first by bringing together two recombination signal sequences (RSSs) that have spacer lengths of 12 and 23 base pairs [36, 37]. The RAG proteins both create single strand nicks in the RSSs. The 3’-OH end of the cleaved DNA strand invades the opposite DNA strand, forming blunt RSS ends and closed hairpin coding ends [37]. These ends are processed through the endonuclease activity of DNA-PKcs and Artemis and repaired by the classical nonhomologous end-joining DNA repair pathway (NHEJ) [36, 37]. During this process, splicing inaccuracies by the RAG enzymes may occur, as well as the addition of random nucleotides by the activity of the Pol X family of polymerases (TdT, Pol μ, and Pol λ), both of which create great diversity at the V-D-J junctions [36-38]. In the early pro-B cell, D and J gene segments of the H-chain gene rearrange first. The D-J rearrangement is then joined with a V region in the late pro-B cell [31, 32]. When late pro-B cells express the pre-BCR, composed of the in-frame VDJ rearrangement linked to a μ constant region chain, the VpreB/λ5 complex (a surrogate light chain) and CD79, they become large pre-B cells. Expression of the pre-BCR signals a halt in further rearrangement and several rounds of proliferation [30, 32]. Rearrangement of the V and J gene segments of the L-chain then commences [32]. A complete immunoglobulin, including functional H-chain and L-chains, associates on the cell surface with CD79α and CD79β to form a functional BCR [31-33]. Cells which express a functional BCR are termed immature B cells [31, 32].

Figure 1.3: B Cell Development. B cells originate in the bone marrow as pluripotent hematopoietic stem cells. Guided by bone marrow stromal cells, the stem cells differentiate into pro-B cells where they rearrange the heavy chain genes of the B cell receptor (BCR). Successful rearrangement leads to several rounds of proliferation and transition into the pre-B cell stage, where the light chain genes are rearranged. Immature B cells express a functional BCR: rearranged heavy chains linked to the IgM constant region, rearranged light chains, CD79α and CD79β. Immature B cells start to express the IgD constant region and become mature B cells, which exit the bone marrow and circulate in the periphery. Upon antigen encounter, B cells can either become short-lived plasma cells, secreting antibodies that are either IgM or IgA (gut only), or migrate to the primary follicle to initiate a germinal center (GC) reaction. GC B cells undergo somatic hypermutation (SHM; centroblasts) and class-switch recombination (CSR; centrocytes) to improve BCR/Immunoglobulin affinity and functionality. High-affinity B cells can then differentiate further into long-lived antibody-secreting plasma cells or memory B cells.

Immature B cells are negatively selected to remove any self-reacting BCRs from the B cell pool, a process dependent on both BCR and BAFF/BAFFR (B cell activation factor belonging to the TNF family) signaling [40]. Unselected cells express BCRs of both μ and δ isotypes (IgM and IgD, respectively) and become naïve mature cells which circulate in the blood stream and [31, 33].

Upon encountering the antigen for which the BCR is specific, the B cells upregulate CCR7, which attracts them to both CCL19 and CCL21, expressed by stromal cells in the T cell zones of secondary lymphoid organs [31, 41]. The antigen is internalized and processed by B cells, then is presented by MHC class II molecules on the B cell surface. Nearby CD4+ T cells, activated previously by DCs [31, 42], recognize the peptide-MHC complex, and express soluble (IL-4) and cognate (CD40 ligand) costimulatory molecules [31, 41]. Activated B cells may then develop into antibody-secreting plasma cells in extrafollicular sites or mature into germinal-center precursor B cells [42].


Figure 1.4: V(D)J Recombination. The two DNA coding segments to be joined are shown as dark blue and pink rectangles. The blue coding segment is flanked by a 12-RSS and the pink coding segment is flanked by a 23-RSS (heptamer, spacer, nonamer indicated by 7, 12/23, 9, respectively). The first step of V(D)J recombination is the binding of the RAG1/2 complex to an RSS. Binding first to a 12-RSS or to a 23-RSS is equally likely. Capture of the second RSS occurs in the process of synapsis. RAG1/2 cleave the DNA within the synaptic complex, yielding the cleaved signal complex. Hairpin structures on coding ends are nicked by DNA-PKcs and Artemis (Art), followed by the addition of non-templated nucleotides (N) by terminal deoxynucleotidyl transferase (TdT), and coding end ligation by the XRCC4-ligase IV (Lig IV) complex. The signal joint is formed by blunt end ligation of signal ends by the XRCC4-ligase IV complex [adapted from 39].
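The pairing constraint implied by the 12-RSS and 23-RSS in Figure 1.4 can be sketched in a few lines of Python. This is only an illustrative toy, not a model of RAG biochemistry: the `RSS` record, the `can_synapse` helper, and the segment labels are hypothetical names introduced here, and the check assumes that synapsis requires one RSS of each spacer length.

```python
from dataclasses import dataclass

HEPTAMER_LEN = 7          # conserved heptamer, per the caption (7)
NONAMER_LEN = 9           # conserved nonamer, per the caption (9)
ALLOWED_SPACERS = (12, 23)


@dataclass(frozen=True)
class RSS:
    """Hypothetical record for a recombination signal sequence."""
    segment: str   # label of the coding segment the RSS flanks
    spacer: int    # spacer length in base pairs

    def __post_init__(self):
        if self.spacer not in ALLOWED_SPACERS:
            raise ValueError(f"spacer must be one of {ALLOWED_SPACERS}")

    def length(self) -> int:
        """Total RSS length: heptamer + spacer + nonamer."""
        return HEPTAMER_LEN + self.spacer + NONAMER_LEN


def can_synapse(a: RSS, b: RSS) -> bool:
    """Assumed rule: a synaptic complex needs one 12-RSS and one 23-RSS."""
    return {a.spacer, b.spacer} == set(ALLOWED_SPACERS)


blue = RSS("blue", 12)
pink = RSS("pink", 23)
print(can_synapse(blue, pink), blue.length(), pink.length())
```

Under this assumption, two segments that both carry a 23-RSS would be rejected by `can_synapse`, which is one way to picture why only certain segment pairs are joined.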

Germinal centers (GCs) (Figure 1.5) are composed mostly of proliferating B cells, but also a small proportion of antigen-specific T cells and follicular dendritic cells (FDCs) [31, 43]. They are divided into areas called the dark zone, characterized by high frequencies of B cell proliferation and high cell density among CXCR4+ centroblasts, and the light zone, where B cell activation, selection, and apoptosis occur among CXCR4- centrocytes [42-44]. In the periphery of the light zone are a network of FDCs and CD4+ follicular helper T cells (TFH) which provide both antigen, and survival and activation signals to the proliferating B cells [31, 45, 46].

Figure 1.5: Germinal Center Reaction. Antigen-activated B cells differentiate into centroblasts that undergo clonal expansion in the dark zone of the germinal center. Current hypotheses hold that during proliferation, the process of somatic hypermutation (SHM) introduces base-pair changes into the V(D)J region of the rearranged genes encoding the immunoglobulin variable region (IgV) of the heavy chain and light chain; some of these base-pair mutations lead to a change in the amino-acid sequence. Centroblasts then differentiate into centrocytes and move to the light zone, where the modified antigen receptor, with help from immune helper cells including T cells and follicular dendritic cells (FDCs), is selected for improved binding to the immunizing antigen. Newly generated centrocytes that produce an unfavorable antibody undergo apoptosis and are removed. A subset of centrocytes undergoes immunoglobulin class-switch recombination (CSR). Cycling of centroblasts and centrocytes between dark and light zones seems to be mediated by a chemokine gradient, presumably established by stromal cells in the respective zones (not shown). Antigen-selected centrocytes eventually differentiate into memory B cells or plasma cells [42].

In the dark zone, centroblasts are retained by their expression of CXCR4, which is attracted to the high concentration of CXCL12, produced by local stromal cells in the dark zone [42-44]. Centroblasts in the dark zone express high levels of activation-induced cytidine deaminase (AID) and undergo somatic hypermutation (SHM) to alter the affinity of the BCR for the antigen [42, 47]. Centroblasts can downregulate CXCR4 expression and upregulate CXCR5 expression, becoming centrocytes. CXCR5+ GC B cells are attracted to CXCL13, produced by FDCs in the light zone [31, 42-45].

Centrocytes reduce their rate of cell division and undergo positive selection to retain only the highest affinity BCRs. Centrocytes express their newly mutated BCR, which can bind with varying affinities to antigen captured by FDCs. High-affinity binding leads to antigen processing and presentation on MHC class II molecules to TFH cells. B cells expressing high densities of MHC class II/antigen complexes elicit survival signals, especially CD40L and IL-21, from TFH cells. B cells expressing low affinity BCRs will not be able to compete as well for antigen, will not be able to present high densities of antigen to T cells, and thus, will not receive survival signals. Low-affinity B cells, therefore, undergo apoptosis [31, 42, 43, 45-47]. High-affinity B cells that are selected can upregulate CXCR4 and return to the dark zone to undergo further rounds of affinity maturation as a result of SHM and selection [42, 45, 47]. Current data suggest that after SHM occurs, the isotype of the BCR may be altered to change its functionality [48]. Class switch recombination (CSR), also dependent on the activity of AID, is the switch from the constant regions expressed by mature naïve B cells, IgM and IgD, to expression of either IgM, IgG, IgA, or IgE alone [47, 49, 50].


Cells that survive the selection process in germinal centers can differentiate into either memory B cells or antibody-secreting plasma cells in the bone marrow and other tissues, including the lamina propria of mucosal tissues. These distinct maturation pathways are mediated by T cell factors including CD40/CD40L and IL-4 signaling for memory cell development, and OX40-OX40L and IL-2, IL-3, IL-6, and IL-10 signaling for plasma cell development [31, 51]. Memory cells are long-lived cells that express surface immunoglobulin, but do not secrete antibody. During subsequent exposures to their specific antigen, memory B cells become quickly reactivated, leading to proliferation and differentiation into plasma cells [51, 52]. Crotty et al. have suggested that “bystander” activation of memory B cells by adjacent antigen-unrelated but activated T cells can also sustain the B cell memory and plasma cell populations [53]. Thus, a small pool of memory B cells can be maintained for long periods of time. Plasma cells express surface immunoglobulin and secrete large amounts of antibody [51]. Plasma cells persist for several weeks to several years and are responsible for large, rapid increases in levels of specific antibody during subsequent antigenic exposure [45, 51, 52].

Migration of plasmablasts and memory cells from the germinal centers is most likely directed by chemokine expression. After infections are cleared, memory cells will circulate from germinal centers to extrafollicular sites such as spleen and bone marrow via lymph and blood [47, 51, 54]. Plasma cells are retained in the bone marrow and found in peripheral blood only at low frequencies [45, 51, 52, 55].

Not all antibody responses require GC formation. T cell-independent (TI) antigens can activate B cells without any T cell help. These antigens are divided into two groups: TI-1 antigens activate B cells through non-BCR signaling pathways, such as TLRs, and TI-2 antigens bind and cross-link BCRs via their repeating epitopes [56]. TI-1 antigens include TLR ligands such as single-stranded RNA from viruses (TLR-7/8) and CpG DNA from bacteria and viruses (TLR-9) [57]. TI-2 antigens include bacterial polysaccharides from S. pneumoniae, H. influenzae, and N. meningitidis, and some viral capsid proteins, such as those found in vesicular stomatitis virus (VSV), poliovirus, and polyoma virus [56]. Activation signals, in conjunction with TI-antigen binding to B cells, can be provided by various cell types, such as epithelial cells and DCs, to activate the B cells [58-59]. Transient, short-lived GCs may form, but only limited SHM and CSR occur, and only involve switch to IgM in extrafollicular sites or IgM and IgA in the gut [31, 41, 45, 50].

In the context of HIV-1 infection, B cell development from the stem cell to the naïve mature stage appears normal. Indeed, despite decreased total B cell numbers in HIV-1-infected patients, multiple studies have shown normal to increased proportions of naïve mature B cells compared to healthy control subjects [60-63]. GC responses and maturation to memory and plasma cells may be impaired, however. GC development is interrupted early during the course of infection, recovers during the asymptomatic stage, and declines again as patients progress to AIDS [7, 21, 63], presumably due to a lack of CD4+ T cell help [7], but also because of attenuated and disrupted FDC networks [21].

The proportion of memory B cells, including IgM memory cells (IgM+IgD-CD27+) and class-switched plasma cells is also decreased during HIV-1 infection [22, 63-65]. Whether these perturbations in B cell phenotype are due to the reductions in the number or function of GC T cells, particularly TFH cells, to the functional activity of AID, the enzyme responsible for CSR, to the numbers and sizes of GC, or to an HIV-1-associated defect in the ability of B cells to respond to antigenic stimuli and/or T cell help is not known. How these conditions affect the genes that encode antibodies (e.g., VH) and the molecules that regulate them (e.g., AID) is a primary focus of this thesis.

Although the primary function of B cells is antibody production, B cells can also direct the immune responses and development of other cell types. B cells release cytokines that can direct T cell and DC differentiation and responses, including IL-4, IL-6, IL-10, and IFN-γ [66-68], regulate lymphoid tissue generation and organization, such as the differentiation of FDCs through lymphotoxin α/β expression and M cell development in the GALT [69], regulate wound healing, and influence tumor immunity [67].

Antibody structure

Immunoglobulins on the B cell surface (B cell receptor; BCR) and their secreted congeners, antibodies, are composed of two identical heavy (H)-chains and two identical light (L)-chains (Figure 1.6). These four chains are held together by disulfide bonds. The N-terminal domains of the variable region of each chain (VDJ for H-chains and VJ for L-chains) bind antigen. Each V region is further subdivided into complementarity determining regions (CDRs) and framework regions (FRs). CDR loops come into direct contact with the antigen. FRs provide structural support for the CDRs [70]. The C-terminal domains form the constant region, which defines the class and function of the antibody [38].


Figure 1.6: Antibody Structure. Pairs of heavy and light chains combine to form a Y-shaped molecule. Two antigen-binding sites are formed by the combination of variable domains from one light (VL) and one heavy (VH) chain. Cleavage within the hinge region separates the Fab and Fc portions of the protein. The Fc portion of the molecule also contains bound carbohydrate [Adapted from 71].

All antibodies are glycoproteins; 3-13% of their weight is carbohydrate [38]. The extent of glycosylation varies by isotype, is essential for antibody structure, and can affect antibody function [38, 72]. Glycans can influence binding to both Fc receptors and immune mediators, are essential for antibody-dependent cellular cytotoxicity (ADCC), play a role in complement-dependent cytotoxicity, can protect the hinge region of IgA from proteolytic cleavage, and can even bind antigen [72].

The constant regions define the 5 antibody classes or isotypes: IgG, IgA, IgM, IgD, and IgE. Additionally, there are four subclasses of IgG (IgG1-IgG4) and two subclasses of IgA (IgA1 and IgA2) [38]. IgD is predominantly expressed by naïve mature B cells; during the GC reaction, expression is downregulated prior to class switching [31]. There are additional exclusively-IgD+ B cell subsets that can have either a naïve mature (BND anergic) or a memory or antibody-secreting plasma cell phenotype (class-switched IgD B cells) [73-75]. While little is known about the function of secreted IgD [76], class-switched IgD B cells have heavily mutated VH genes and are typically autoreactive [74, 75]. IgM functions in neutralization and opsonization. IgM antibodies tend to be lower affinity (intrinsic strength of antigen binding) because B cells can differentiate into IgM+IgD- B cells outside of a germinal center without undergoing affinity-enhancing SHM. However, IgM forms pentamers, allowing the resulting 10 antigen-binding sites to bind antigen at a high overall avidity (functional affinity). IgM is found in blood and lymph [72, 77]. IgG is the most abundant isotype in blood, from which it can diffuse into tissues more readily than IgM. IgG is capable of neutralization, opsonization and activation of the complement pathway. IgA is the principal isotype in secretions, especially in the intestinal and respiratory tracts. It functions mainly as a neutralizing antibody. IgE is present at the lowest serum concentration of all the classes and interacts with mast cells to fight off parasitic infections and mediate allergic disease [72-77].

Antibodies can have both direct and indirect effects on pathogens. They can directly block pathogens from binding to host tissues, prevent pathogen entry into host cells, interfere with pathogen mobility, and block pathogen effector protein activity [78]. Indirectly, antibodies can activate the classical complement pathway, recruit innate immune cells and promote phagocytosis of the pathogen, or bind to Fc receptors found on immune effector cells such as macrophages, dendritic cells, neutrophils, natural killer cells, eosinophils, basophils, and mast cells. Different cell types express different Fc receptors for IgG (FcγRI (CD64), II (CD32), and III (CD16)), IgA (FcαR; CD89), and IgM [78, 79]. Fc receptor engagement can activate cells to phagocytose an antibody-coated microorganism, release stored toxic chemicals such as reactive oxygen and nitrogen species, and induce cytokine production, such as IL-4, IL-6, IL-8, and TNF-α.

The development of effective, specific antibody structure, and therefore function, is dependent upon a B cell-specific enzyme called activation-induced cytidine deaminase (AID).

Activation-induced cytidine deaminase

AID is an enzyme expressed predominantly by germinal center B cells and is required for both CSR and SHM [80, 81]. This 198-amino-acid enzyme is encoded by five exons (Figure 1.7) [82, 83]. The N-terminal portion of the enzyme is required for SHM and the C-terminal portion is required for CSR [84]. Although a crystal structure has not been resolved, AID is thought to function as a dimer in vivo, but activity has also been shown by the monomer in vitro [85, 86].

AID is expressed in response to B cell activation. Activation by various stimuli, such as T cell engagement (CD40 signaling) [82], Toll-like receptor (TLR) signaling (e.g., TLR-9) [87], hormones such as estrogen [88], and many microorganisms (Hepatitis C virus, Epstein-Barr virus [87, 89], Helicobacter pylori [90], Bacteroides species [91], and HIV-1 [92, 93]), leads to the upregulation of AID mRNA. These stimuli activate the transcription factors E47, Stat6, NF-κB, and Pax5, which in turn upregulate AID expression [87].

Figure 1.7: AID Enzyme. AID is composed of five exons and divided into two domains, the N-terminal domain, associated with SHM, and the C-terminal domain, associated with CSR. A nuclear localization signal (NLS) is located in the N-terminal domain. A nuclear export signal (NES) is located in the C-terminal domain. Deaminase activity has been mapped to the center of the enzyme and is completely encoded by exon 3. Three phosphorylation sites that are important for CSR and SHM have been identified; Serine (S) 38, Threonine (T) 140, and Tyrosine (Y) 184.

AID initiates both SHM and CSR by deaminating cytidine residues, leaving uracil residues behind in the target DNA. As a result of the potential for subsequent mutagenic activity, AID is tightly regulated by four mechanisms: transcription, cellular compartmentalization, phosphorylation, and degradation. In addition to the transcription factors that control activation of AID expression, AID expression can be downregulated by the Ca2+-sensing protein calmodulin [94] and the hormone progesterone [95]. AID expression is also sensitive to post-transcriptional control by the microRNAs miR-181b and miR-155, which both downregulate AID protein expression [96-98]. AID protein is expressed primarily in the cytoplasm [99, 100]. AID has both a nuclear export signal (NES) found in the C-terminal portion of the protein [100, 101] and a nuclear localization signal (NLS) in the N-terminal portion [102]. Active nuclear export mediated by CRM-1 is required to maintain AID in the cytoplasm [100, 102], but active nuclear import is required for AID activity [101].

As its name implies, AID deaminates cytidine residues on single stranded DNA (ssDNA) substrates [103]. Active transcription of the variable region of heavy and light immunoglobulin chains (VH and VL, respectively) is required for SHM and of the switch regions near the constant region genes for CSR [80]. Indeed, AID has been shown to directly associate with the transcription apparatus, specifically RNA polymerase II, and there is a correlation between SHM frequency and transcription levels [104].

Upon entering the nucleus, AID can be phosphorylated at three sites, S38, T140, and Y184 [105]. Both CSR and SHM require phosphorylation at S38 [106], whereas T140 phosphorylation affects only SHM [107]. The relevance of Y184 phosphorylation is unknown [105]. Phosphorylation of AID is important for its association with chromatin [99]. Nuclear AID is targeted for degradation by the ubiquitin-proteasomal pathway [108]. It is likely that other regulatory mechanisms also exist, especially negative regulatory mechanisms. Indeed, when AID is constitutively over-expressed transgenically, the enzyme has diminished, rather than increased, activity relative to endogenous AID [109], although the mechanism limiting AID function is unknown.

Phosphorylated AID also interacts directly with Replication Protein A (RPA), a single stranded DNA binding protein required for AID activity [110]. After cytidine deamination, AID may dissociate from the substrate, leaving RPA bound to the DNA. RPA then may act as a scaffold for downstream DNA repair pathways [111].

In addition to the full-length isoform, four other mRNA variants of AID are expressed in vivo (Figure 1.8) [83, 112-114]. The longest isoform, AIDins3, contains all five exons plus intron 3. This isoform maintains the cytidine deaminase domain, but lacks the portion of the protein responsible for CSR due to a frameshift resulting in a premature stop codon. The 562 base pair isoform (AID-ΔE4a) contains all five exons, but lacks the last 30 base pairs of exon 4, though it retains all functional domains with no frameshifts. The AID-ΔE4 isoform is missing all of exon 4 and, because of the resulting frameshift, the majority of the C-terminus, including the nuclear export signal, resulting in primarily nuclear localization and the inability to perform CSR. Finally, the AID ΔE3-E4 isoform lacks both exons 3 and 4 without any frameshifts, but this deletes both the cytidine deaminase domain (present in exon 3) and the T140 phosphorylation site (exon 4), resulting in the loss of both CSR and SHM [83]. Thus, alternative splicing of AID may have functional consequences, such as those described below, that may underlie HIV-1-associated defects in SHM and CSR.

Figure 1.8: AID Isoforms. Exons (1-5) in AID mRNA can be alternatively spliced into 5 different isoforms. The different isoforms have been associated with different levels of SHM and CSR activity as well as cellular localization, relative to the full-length isoform. *Premature stop codon resulting from a frameshift.


The functional AID multimer is likely a dimer in vivo [85]. Because of deletions of many functional domains in the isoforms, many of their functional capacities may vary from that of the full-length isoform. If multiple isoforms of AID were to be expressed in the same cell, heterodimerization of different isoforms may affect the functionality of the dimer. Like the full-length isoform, both AID-ΔE4a and AID ΔE3-E4 are predominantly found in the cytoplasm, whereas both AIDins3 and AID-ΔE4 are found more abundantly in the nucleus than the full-length isoform because they lack the NES [83, 115]. Whereas the full-length isoform is both SHM- and CSR-competent, none of the other isoforms mediate CSR. Similarly, AID ΔE3-E4 is also defective for SHM, likely due to the absence of the cytidine deaminase domain. In contrast, AID-ΔE4a and AID-ΔE4 not only were capable of hypermutating a test substrate, but did so at a higher frequency than the full-length isoform [116]. However, using a different assay, none of the other isoforms had any catalytic activity [115]. Expression patterns of the various AID isoforms and their functional capacities in SHM and CSR, both in health and disease states, such as with HIV-1 infection, may reveal their physiologic impact on overall AID activity.

Somatic hypermutation and class-switch recombination

AID mediates two important stages of antibody development: Class switch recombination (CSR) and somatic hypermutation (SHM). CSR is responsible for the switch of antibody constant regions from the IgD to the IgM, IgG, IgA, or IgE isotypes. The switch in the constant region is capable of changing the functionality of the antibody, as described above in “Antibody structure”. SHM leads to the introduction of point mutations into the variable region of antibody genes with the goal of producing antibodies that are more specific and of higher affinity for a particular antigen. Both processes occur predominantly in the GC, with SHM putatively preceding CSR [48].

SHM has been shown in vitro to be impaired by HIV-1. The viral protein Vif can directly bind to AID when both are co-expressed in E. coli. Binding interrupts the cytidine deaminase activity of AID, thereby preventing the introduction of point mutations [117]. It is not known, however, whether this process occurs in vivo or during HIV-1 infection.

SHM is initially triggered by the deamination of cytidine residues by AID in actively-transcribed variable regions (Figure 1.9). In the nucleus, AID and Protein Kinase A (PKA) are likely both recruited to the V regions, where PKA phosphorylates the S38 residue [118]. How the T140 residue becomes phosphorylated is unknown, though it may involve another protein kinase, PKC [107]. Phosphorylated-AID then associates with RPA, a single stranded DNA (ssDNA) binding protein, on chromatin. AID begins mutating ssDNA along the V region starting ~150 base pairs downstream of the IgV promoter and ending 1-2 kb downstream [82, 119]. Activity of AID on a cytidine residue results in a U:G lesion.
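As a toy illustration of the targeting window described above, the following Python sketch converts cytidines to uracil only between a start offset (~150 bp) and an end offset downstream of a promoter position. Everything here is an assumption made for illustration: the function name `deaminate_window`, the choice of 1500 bp as the end of the window (the text gives only 1-2 kb), the uniform per-cytidine probability, and the treatment of a single strand. It is not a model of how AID actually selects its targets.

```python
import random


def deaminate_window(seq, promoter_pos=0, start_offset=150, end_offset=1500,
                     prob=0.01, rng=None):
    """Return (new_seq, lesion_positions) after stochastic C->U deamination.

    Only cytidines located between promoter_pos + start_offset and
    promoter_pos + end_offset (exclusive) are eligible. All parameters are
    illustrative assumptions, not measured values.
    """
    rng = rng or random.Random(0)  # fixed seed keeps the toy reproducible
    lo = promoter_pos + start_offset
    hi = min(len(seq), promoter_pos + end_offset)
    bases = list(seq)
    lesions = []
    for i in range(lo, hi):
        if bases[i] == "C" and rng.random() < prob:
            bases[i] = "U"      # deaminated cytidine; the partner G is unchanged
            lesions.append(i)   # position of the resulting U:G lesion
    return "".join(bases), lesions


toy_template = "ACGT" * 500  # hypothetical 2 kb sequence
mutated, lesion_sites = deaminate_window(toy_template, prob=0.05)
print(len(lesion_sites), "U:G lesions, all at index >= 150:",
      all(i >= 150 for i in lesion_sites))
```

Each position in `lesion_sites` stands for one U:G lesion, which is the substrate for the repair pathways discussed next.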

The U:G lesion can be recognized and repaired by several pathways. If the DNA replication machinery encounter the lesion first, an A will be paired with the U, creating a

C to T transition mutation. The U:G lesion can also be recognized by DNA repair machinery. Uracil-DNA glycosylase (UNG) will remove the uracil residue, leaving an abasic site at the lesion. The abasic site can also be recognized by several repair

Figure 1.9: Model of Somatic Hypermutation. dU is not relevant to DNA and the dU:dG mismatch is ‘replicated over’ or dealt with by the DNA repair machinery. Replicating over dU results in a dC→dT transition mutation (Phase 1a), whereas dU deglycosylation by Ung gives rise to an abasic site. In the presence of PCNA (orange ring), DNA synthesis opposite the abasic site by TLS pol , which has both nucleotide inserter and extender activity, or by the nucleotide inserter pol , Rev1 or, perhaps pol , followed by the nucleotide extender pol  or pol , yields dC→dT transitions and dC→dA or dC→dG transversions (Phase 1b). Alternatively, the abasic site can be recognized and excised by APE or the Mre11-Rad50 lyase to create a DNA nick. This nick can be repaired by DNA pol  (light pink circles) in an error-free fashion (short- patch BER) or repaired in an error-prone fashion by a TLS polymerase through a long- patch BER also involving PCNA and Fen1. dU:dG mispairs can also be recognized by the MMR machinery, resulting in a DNA-gap formation through the intervention of an unidentified endonuclease or MRN and Exo I. Subsequently, TLS pol , pol , Rev1, pol and , perhaps, pol  can affect DNA re-synthesis as part of a patch repair, thereby inserting mismatches (Phase 2). In the long-patch BER or MMR, RPA (large brown ovals) and PCNA would recruit other repair proteins to the lesion and co-ordinate their actions. MMR proteins are indicated as large green ovals. Mutated nucleotides are shown in red [120].

pathways. RPA still bound to the ssDNA can associate with proliferating cell nuclear antigen (PCNA), a scaffold protein known to recruit low-fidelity translesion (TLS) DNA polymerases and to promote efficient DNA repair. Replication over the abasic site by the TLS polymerases will result in the random incorporation of a nucleotide at the site, creating both transition (purine→purine or pyrimidine→pyrimidine) and transversion (purine↔pyrimidine) mutations. If the abasic site left by UNG is recognized by apurinic/apyrimidinic endonuclease (APE), the abasic site will be removed, leaving a gap in the DNA strand [120, 121].
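The distinction between transitions and transversions drawn above can be illustrated with a minimal Python sketch. The function name and the example base pairs are hypothetical and are not part of the analysis software used in this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(germline_base, mutated_base):
    """Label a single-base change as a transition or a transversion."""
    ref, alt = germline_base.upper(), mutated_base.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected one of A, C, G, T")
    if ref == alt:
        raise ValueError("bases are identical, so there is no substitution")
    pair = {ref, alt}
    # Transition: both bases are purines, or both are pyrimidines.
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    # Otherwise a purine was exchanged for a pyrimidine (or vice versa).
    return "transversion"

# The three possible outcomes at a deaminated cytidine.
for ref, alt in [("C", "T"), ("C", "A"), ("C", "G")]:
    print(f"{ref}->{alt}: {classify_substitution(ref, alt)}")
```

Of the three possible changes at a C, only C to T is a transition, which is consistent with replication directly over the U:G lesion producing transitions only.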

This gap can be repaired by either the high-fidelity short-patch base excision repair (BER) pathway or the error-prone long-patch BER pathway. Short-patch repair machinery recruits a high-fidelity DNA polymerase, polymerase β, to the site of the lesion and corrects the initial lesion, leaving no resulting mutation. Long-patch repair machinery recruits PCNA and an endonuclease, Fen1, which excises many nucleotides around the gap. The gap is then repaired by low-fidelity TLS polymerases, which may introduce several mutations around the original lesion [120, 121]. The abasic site created by UNG can also be recognized by the MRN (Mre11-Rad50-Nbs1) complex that can nick the abasic site, also creating a gap [119, 120], but leaving ends that high-fidelity DNA polymerases cannot extend [119]. PCNA is recruited and the gap is repaired by the long-patch BER pathway [120]. Lastly, the U:G lesion originally created by AID can also be recognized by the mismatch repair (MMR) pathway. MMR pathway machinery is recruited to the lesion, including both Exo I and another unknown endonuclease that nicks the strand around the lesion. The lesion is then repaired by TLS polymerases, potentially yielding multiple mutations around the original lesion [120, 122]. Thus, a greater number of mutations can be generated in bases other than C by the initial deamination of cytidine.

Mutations resulting from SHM occur at a one million-fold higher rate than spontaneous somatic DNA mutations in the rest of the genome, or about 10⁻³ to 10⁻⁴ changes per base per cell division. The vast majority of these mutations are point mutations [82, 103] and the process has a slight preference for transition over transversion mutations [82, 123, 124]. Most AID-induced mutations occur at RGYW/WRCY “hotspot” motifs, the sequences targeted by AID [103, 125]. Most AT mutations occur at WA/TW motifs [125]. Mutations at each base occur at different rates, with mutations occurring most frequently at A bases (35% of mutations), followed by C and G bases (both 25%), with the fewest occurring at T bases (15%) [126]. Mutations are targeted more to the hypervariable CDR regions of the gene than to the more conserved structural FR regions. This differential mutation frequency is supported in part by the increased prevalence of RGYW/WRCY motifs in CDRs, thus recruiting AID more frequently to these regions [82].
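As an illustration of how hotspot motifs can be located in a sequence, the following minimal Python sketch expands the IUPAC codes in the motif names (R = A/G, Y = C/T, W = A/T) into regular expressions and reports every match, including overlapping ones. The function names and the example sequence are hypothetical; this is not the software used for the analyses in this thesis.

```python
import re

# IUPAC codes used in the motif names: R = A/G, Y = C/T, W = A/T.
IUPAC = {"R": "[AG]", "Y": "[CT]", "W": "[AT]",
         "A": "A", "C": "C", "G": "G", "T": "T"}

def motif_positions(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    pattern = "".join(IUPAC[ch] for ch in motif)
    # A lookahead keeps overlapping matches that a plain search would skip.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", seq.upper())]

def hotspot_summary(seq):
    """Count RGYW and WRCY hotspot motifs in one DNA sequence."""
    return {motif: len(motif_positions(seq, motif)) for motif in ("RGYW", "WRCY")}

example = "AGCTAGCAAGCTTACC"  # made-up sequence, not from the study data
print(motif_positions(example, "RGYW"))
print(motif_positions(example, "WRCY"))
print(hotspot_summary(example))
```

Counting motifs separately in CDR and FR segments of a germline gene would give a rough view of the differential hotspot density described above.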

It is not known exactly how AID is targeted to IgV regions of chromatin. Several mechanisms, including immunoglobulin (Ig) enhancer elements, chromatin accessibility, cis-acting sequences and cell cycle regulation, could all play a role. Ig enhancer elements have been found in both intronic sequences and at the 3’ end of rearranged Ig variable regions [127]. Changes in histone acetylation may also be dependent on AID expression and recruit error-prone DNA repair machinery [124]. Cis-acting sequences, such as R- and G-loops in transcribed Ig regions, may offer AID a stable binding target [124]. SHM and CSR are both restricted to the nonreplicative phases of the cell cycle, thus potentially separating error-free DNA repair that occurs during DNA replication from the error-prone DNA repair that occurs during CSR and SHM. While all of these factors may attract AID, none is sufficient in itself to target SHM and CSR.

Repair by low fidelity TLS DNA polymerases is essential to the SHM process.

Without them, mutations might not occur at all, or too many mutations could overwhelm the error-free DNA repair machinery leading to the death of the cell. There are eight TLS polymerases in humans that are believed to be involved in SHM and CSR mutation repair: pol , , , , , Rev 1, and pol  and  [120, 128, 129]. Their exact roles in both processes are still being discovered, but it is thought that they work cooperatively with each other in both filling and bypassing the abasic sites and gaps introduced by the repair machinery [120]. Impairment of one or more of these polymerases can affect the resulting nucleotide mutation pattern [128, 129]. Pol  and  are thought to have the greatest roles in SHM [120, 128, 129], while pol  and  have been implicated in CSR

[128].

CSR, like SHM, is initiated by AID. AID is targeted to RGYW/WRCY hotspots in the switch (S) regions located upstream of each of the H-chain constant regions except for Cδ (IgD), rather than to the VH regions as in SHM. S regions are tandem repeats of short (20-80 base pair) G-rich sequences. They differ in sequence and in length (1-12 kb) by isotype [49]. Cytokines secreted by T cells in the GC or by DCs and monocytes outside the GC direct the isotype switch [49, 58, 59].

Similar to SHM, CSR begins with transcription. Actively transcribed S regions are bound by AID, which deaminates the cytidine residues into uracils in RGYW/WRCY “hotspot” motifs in the sequence. The lesions are recognized by UNG, which cleaves out the uracil, leaving an abasic site. The abasic site is recognized by APE, which removes the phosphate backbone of the DNA, creating single stranded (ss) DNA breaks. Exo I is believed to recognize the ssDNA breaks and continue removing bases until it reaches another ssDNA break on the other DNA strand, therefore creating a double stranded (ds) break. The dsDNA breaks create 5’ or 3’ single strand overhangs. These will be either excised by the endonuclease ERCC1-XPF or filled by TLS polymerases recruited to the site by MMR machinery and PCNA [49, 77]. Once dsDNA breaks are formed in both the donor S region and the acceptor Sx region, the regions are recombined by either non-homologous end joining (NHEJ) or alternative non-homologous end joining (A-NHEJ) [49, 130], and give rise to an extrachromosomal DNA switch circle comprised of the deleted DNA sequence [77].

CSR requires at least two complete rounds of cell division to complete. Once complete, the B cell will express an immunoglobulin with the new IgM, IgG, IgA, or IgE isotype [49, 77]. The specification of class switching is dependent upon cytokine signaling, such as IL-4, TGF-β, and IFN-γ, from activated T cells; these signals promote transcription at the different isotype switch region promoters [77]. Similar to SHM, studies from Qiao et al. have shown both in vitro and in vivo that CSR can be impaired by HIV-1 [131]. HIV-1 protein Nef can suppress NF-κB activation by inducing NF-κB inhibitor IκBα expression and by inducing suppressor of cytokine signaling (SOCS) protein expression, which inhibits CD40/CD40L and other cytokine signaling [131]. Indeed, data from our lab suggest a subtle but reproducible HIV-1-associated decrement in IgA- and an increment in IgM-producing plasma cells in the intestine [132, 133].

B cells and HIV-1

HIV-1-infected patients experience higher incidences of mucosal and systemic secondary and opportunistic infections with microorganisms such as Streptococcus pneumoniae, Haemophilus influenzae, Cryptococcus neoformans, and Salmonella species than non-infected patients [25-27]. In the case of S. pneumoniae infections, HIV-1-infected patients have a higher incidence of both pneumonia (infection of the lungs, about 10 times higher than non-infected) and of bacteremia (infection of the bloodstream, 100 times higher than non-infected) [134]. The recurrence of infection is also higher in HIV-1-infected patients than in non-infected patients [65]. These opportunistic infections and others are controlled by antibody responses [25-27].

HIV-1 does not directly infect B cells in vivo, but HIV-1 infection is consistently associated with B cell dysfunction. The consequences of HIV-1-associated B cell dysfunction include hypergammaglobulinemia early in infection, increased polyclonal B cell activation, lymphadenopathy, high frequencies of B cell lymphomas, autoantibodies in the serum, and autoimmune disease [22, 29, 135-138]. Many of these defects precede the significant depletion of CD4+ T cells and are not completely corrected in vitro when normal T cells are added, or with elimination of viremia with antiretroviral therapy, suggesting that the defect in B cell activity may be, in part, intrinsic to the B cell [138-140].

High levels of B cell activation are caused both directly and indirectly by the virus. HIV-1 can bind directly to B cells through interactions between CD21 on B cells and complement C3 fragments bound to virions [93]. Other HIV-1 binding receptors found on B cells include DC-SIGN and other C-type lectin receptors, the framework regions of VH3 family-specific surface immunoglobulins [22], and CD40 receptors on the B cells that bind host membrane-derived CD40 ligand on the viral envelope [139].

Indirect activation occurs through other activated immune cell types including virus-activated CD4+ T cells, monocytes, dendritic cells, and natural killer cells either through direct CD40-CD40L interaction or through the release of cytokines [64]. HIV-1-infected patients have increased serum levels of cytokines and growth factors including IFN-α, TNF-α, IL-6, IL-10, and B cell Activation Factor (BAFF) [22, 141], all of which can activate B cells. High levels of B cell activation are not the only B cell abnormality in HIV-1-infected patients. B cell subpopulations, such as immature transitional cells, exhausted cells, activated mature cells, and plasmablasts are found at much higher proportions than in uninfected control subjects. Conversely, the number and proportion of memory B cells (IgD-CD27+CD21+) are lower, as is function [64, 65], a phenotype that also is not corrected with successful antiretroviral treatment [22].

Despite high levels of B cell activation, HIV-1-infected subjects mount diminished responses to neoantigens and polysaccharides in vivo and to B cell mitogens in vitro [29, 138, 140, 142, 143]. The limited magnitude of such responses may explain, in part, the high occurrence of antibody-controlled opportunistic infections and the poor responses to vaccines administered to prevent them, as may limitations in the quality of responses, such as antibody affinity, a property determined by SHM and controlled by AID, as described above. Numerous studies have shown that immunization using pneumococcal vaccines results in poor T cell-dependent vaccination responses in HIV-1-infected individuals [65, 144-147]. Likewise, poor vaccination responses have also been seen when using vaccines targeting both neoantigens (KLH, Hepatitis B) [140] and recall antigens (tetanus toxoid [145] and H. influenzae type b [148, 149]). Not only may the quantities of the antibodies produced be limited, but some data suggest that the quality of the antibody response (i.e. avidity and affinity) is also decreased. Indeed, the avidity of oral antibodies to Candida antigens [150, 151] and H. influenzae type b specific IgG and IgA antibodies in the blood [148] among patients with HIV-1 infection, qualities dependent on SHM and AID activity, may be lower than those in control subjects.

Thus, despite the consistent presence of hypergammaglobulinemia, lower levels of antibody responses to infection and vaccination, as well as a potential decrement in antibody quality, may contribute to the high rates of opportunistic infections, especially those occurring in the blood. I propose to determine whether B cell function in HIV-1-infected patients is impaired at the phenotypic and molecular levels, and if so, to determine the mechanisms underlying these defects. I hypothesized that the poor vaccine and infection responses identified in HIV-1-infected patients could be attributed, in part, to decreased frequencies of SHM in the antigen-binding variable region of circulating class-switched IgG B cells, and that these decreases in SHM resulted from a decrement in both AID expression and AID function.

CHAPTER II

MATERIALS AND METHODS

Patient data

We enrolled 42 HIV-seronegative control subjects and 72 patients with HIV-1 infection (19 on antiretroviral therapy (aviremics) and 53 not on therapy with detectable plasma HIV-1 RNA (viremics)). Exclusion criteria included any acute medical illness, significant underlying medical illnesses (e.g., renal, hepatic, pulmonary) or immunosuppressive therapy and additionally for control subjects, any risks for HIV-1 infection, including intravenous drug use. For the 6 months preceding enrollment, all HIV-1-infected patients were stable on antiretroviral regimens or had received no therapy. All subjects were enrolled after completing written informed consent with protocols approved by Institutional Review Boards at the Veterans Affairs Medical Centers in Minneapolis and Denver and at the University of Minnesota and the University of Colorado Denver.

PBMC isolation and stimulation

Isolation

PBMC were separated from ≈60 mL of fresh whole blood by Ficoll-Hypaque (Sigma, St. Louis, MO) density centrifugation and washed. For immunophenotyping (Flow Cytometry), 2 million cells were resuspended in FACS Staining Buffer and placed on ice. For RNA, 5-10 million cells were resuspended in RNAlater (Qiagen, Inc., Valencia, CA) and stored at -80°C.

Stimulation

PBMC (2 × 10⁶ cells/mL) were cultured with either media alone (RPMI media (Invitrogen, Grand Island, NY) with 10% heat-inactivated FCS (HyClone Laboratories, Logan, UT) and 10 µg/mL gentamicin (Invitrogen)) or stimulated with single agents or a combination of IL-4 (10 ng/mL; Peprotech, Inc., Rocky Hill, NJ), anti-CD40 (1 µg/mL; BD-Biosciences, San Diego, CA), and anti-IgM (1.3 µg/mL; Jackson ImmunoResearch Laboratories, Inc., West Grove, PA) in 25 mL flasks at 37°C for 4 days. On day 4, cells were removed from the flasks, and the two halves of each flask were placed in separate tubes on ice for Flow Cytometry or RNA extraction.

Total immunoglobulins

Supernatants from parallel cultures in 96 well plates were harvested on day 7 and assayed for total IgG, IgM, and IgA by enzyme immunoassay as previously described by our lab [133]. Total IgG, IgM, and IgA from sera were measured by commercial nephelometry.

mRNA isolation

RNA was extracted using either an RNEasy RNA Extraction Kit or an AllPrep RNA/DNA/Protein Mini kit according to the manufacturer’s protocol (Qiagen, Inc.), quantified using a NanoDrop spectrophotometer (NanoDrop Technologies Inc., Wilmington, DE), and stored at -80°C.

DNA isolation

Genomic DNA was extracted using a DNEasy DNA Extraction Kit or an AllPrep RNA/DNA/Protein Mini kit according to the manufacturer’s protocol (Qiagen, Inc.), quantified using a NanoDrop spectrophotometer (NanoDrop Technologies), and stored at -20°C.

cDNA generation

cDNA for sequencing was synthesized according to the manufacturer’s protocol in a 60 µL reaction consisting of 23 µL of RNA template, 0.5 mM each dNTP, 7.5 µg/mL random hexamers (Invitrogen), 8.3 µg/mL oligo(dT) (Invitrogen), and 10 U/µL Moloney murine leukemia virus (M-MLV) reverse transcriptase (Invitrogen). The samples were incubated at 37°C for 1 hour then heated to 70°C for 15 minutes. Two units of RNAse H (Invitrogen) were added to the cDNA. The samples were incubated at 37°C for 20 minutes then stored at -20°C.

Real-time PCR

For reverse transcription of mRNA, 2 µg of RNA were aliquoted, dried, and reverse transcribed using the First Strand cDNA Synthesis Kit (Invitrogen) and random hexamers (Invitrogen) according to the manufacturer’s protocol. Nuclease-free water (80 µL) was added to each reaction, resulting in cDNA at a concentration of 20 ng/µL. Real time quantitative PCR analysis of AID mRNA expression was performed using either Platinum SYBR Green qPCR Supermix-UDG with ROX (Invitrogen) and primer sets targeting human β-actin and human AID (aicda; SuperArray Bioscience Inc., Frederick, MD) or TaqMan Gene Expression Assays for human AICDA and human GAPDH (Life Technologies, Grand Island, NY). Both human AID assay primer sets amplify AID cDNA beginning in exon 2 and ending in exon 3. Samples were tested in triplicate, using 1 µL of template cDNA per 20 µL reaction on either an ABI 7300 real time PCR machine (SYBR Green reactions; Applied Biosystems, Inc., Foster City, CA) or a BioRad CFX96 Real Time System (TaqMan reactions; BioRad, Hercules, CA, Figure 2.1). AID expression was determined relative to the “housekeeping”/reference genes β-actin or GAPDH by subtracting the cycle threshold (Ct) for the reference gene from that for AID using the following equation: Relative AID mRNA expression = AID/reference gene = $2^{-(C_t(\text{sample}) - C_t(\text{reference}))} = 2^{-\Delta C_t}$, where $\Delta C_t = C_t(\text{sample}) - C_t(\text{reference})$. Assays were validated by amplifying a standard curve with each primer set in triplicate.
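The relative expression calculation can be sketched in Python as follows. Averaging the triplicate Ct values before taking the difference is an assumption made for this illustration, and the Ct values shown are hypothetical rather than measurements from this study.

```python
from statistics import mean

def relative_expression(ct_target, ct_reference):
    """Relative expression 2^-(Ct_target - Ct_reference) from replicate Ct values.

    Replicates are averaged first; that choice is an assumption of this sketch.
    """
    delta_ct = mean(ct_target) - mean(ct_reference)
    return 2 ** (-delta_ct)

# Hypothetical triplicate Ct values, not data from this study.
aid_ct = [30.0, 30.0, 30.0]
gapdh_ct = [20.0, 20.0, 20.0]
print(relative_expression(aid_ct, gapdh_ct))
```

Because each cycle roughly doubles the product, a target that crosses the threshold 10 cycles later than the reference is estimated at 2^-10 of the reference level, assuming equal amplification efficiency for both primer sets.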

Figure 2.1: Representative Real-Time PCR Run. Cycle number (x-axis) is graphed against the Relative Fluorescent Units (RFU; y-axis). Cycle threshold (Ct) for genes is the point on the curve that intersects the horizontal threshold line (blue, above). 20 µL reactions were run in triplicate for each sample. AID expression was calculated relative to either β-actin or GAPDH (shown above). The fewer cycles required to attain the threshold, the higher the specific cDNA concentration.

AID isoforms amplification and cloning

AID isoforms PCR

To determine expression of splice variants (isoforms) of AID, AID-specific cDNA transcripts were amplified by PCR. Primers used for PCR amplification were commercially synthesized primers which bind AID cDNA transcripts in exons 2 (AIDfwd1) and 5 (AIDrev1) (Table 2.1). Reactions were done in duplicate. PCR (25 µL) consisted of: 5 µL of cDNA or 1.25 ng of cloned positive control, 2.5 µL 2 mM dNTPs, 1.5 µL 25 mM MgSO4, 2.5 µL of 10x KOD Buffer, 0.75 µL each primer (10 µM), 0.5 µL KOD Hot Start High-Fidelity DNA polymerase (Novagen, Gibbstown, NJ). Samples were heated to 95°C for 2 minutes in an Applied Biosystems 9700 thermocycler (“hot start”) followed by 45 cycles consisting of 20 seconds of 95°C denaturation, 10 seconds of 60°C annealing, and 10 seconds of 70°C extension. PCR products were run on a 2.0% 1x TBE agarose gel (Sigma) and stained with ethidium bromide. The presence of each isoform in cDNA samples was determined by comparing the lengths of all amplified PCR products with the lengths of the positive control cloned isoform products.
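As a check on the reaction set-up, final concentrations follow from the dilution relation C1 × V1 = C2 × V2. The minimal Python sketch below applies it to the stock volumes listed for the 25 µL isoform PCR, assuming the volumes are in microliters and the primer stocks are 10 µM. The function and variable names are hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, reaction_volume_ul):
    """Solve C1*V1 = C2*V2 for C2 (result is in the same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / reaction_volume_ul

reaction_ul = 25.0
# (stock concentration, volume added in microliters), taken from the text above.
components = {
    "dNTPs (mM)": (2.0, 2.5),
    "MgSO4 (mM)": (25.0, 1.5),
    "each primer (uM)": (10.0, 0.75),
}
for name, (stock, volume) in components.items():
    print(name, round(final_concentration(stock, volume, reaction_ul), 3))
```

Under these assumptions the reaction contains 0.2 mM dNTPs, 1.5 mM MgSO4, and 0.3 µM of each primer, the same final concentrations reported for the KOD reactions in the pyrosequencing protocol.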

Table 2.1: AID Isoforms PCR Primer Sequences.

Primer Name                    Sequence
AIDfwd1                        5’-TCTTGATGAACCGGAGGAAG-3’
AIDrev1                        5’-CGAAATGCGTCTCGTAAGTC-3’
STC1 AIDrev1 (cloning only)    5’-CCTTCGCCGACTGACGAAATGCGTCTCGTAAGTC-3’

AID isoforms cloning

In order to generate positive PCR controls and sequences with which to compare test sample product lengths, sequences of all 5 isoforms were PCR amplified, as above, gel-extracted using a MinElute Gel Extraction kit (Qiagen), and cloned using a StabyCloning kit (Eurogentec) following kit instructions. Briefly, 10 µL of PCR product with an STC1 adaptor sequence attached at the 3’ end were cloned into a pSTC1.3 vector and transformed into chemically competent E. coli. Plasmids from individual bacterial colonies cultured overnight with 2 mL of LB medium (100 µg/mL ampicillin) were purified using a QIAprep Spin Miniprep kit as per the manufacturer’s protocol (Qiagen). Cloning of each isoform was verified by sequencing in one direction by the Colorado Cancer Center Core Sequencing Facility using an ABI 3739 Sequencer (Applied Biosystems).

Amplification, cloning, and sequencing of VH3-IgG genes

PCR amplification of VH3-IgG genes

Primers used for amplification were commercially synthesized primers derived from the first exon of the IgG constant gene (3’IgG, CH2A (indicates the name of the primer, not its position), and SC-CH2A) [152, 153] and the 5’ end of the VH3-family specific leader sequence (5’VH3) [154] (Figure 2.2, Table 2.2). For cloning using a TOPO TA kit (Invitrogen), PCR (20 µL) consisted of: 0.5 µL of cDNA, 0.4 µL dNTPs (10 mM), 0.8 µL 50 mM MgSO4, 2 µL of 10x High Fidelity PCR buffer, 0.4 µL of primer mix (3’IgG and 5’VH3, 10 µM each), and 0.1 µL Platinum Taq High Fidelity DNA polymerase (Invitrogen). Samples were heated to 94°C for 30 seconds in a Perkin-Elmer 9700 thermocycler before initiating cycling (“hot start”). Each of the 24 cycles consisted of a 30 second 94°C denaturation, a 30 second 60°C annealing, and a 30 second 68°C extension. For cloning using a StabyCloning kit (Eurogentec, San Diego, CA), PCR (25 µL) consisted of: 1 µL cDNA, 0.75 µL dNTPs (10 mM each), 0.5 µL 50 mM MgSO4, 2.5 µL 10x Pfx Amplification buffer, 0.3 µL CH2A primer (10 µM), 0.6 µL 5’VH3 primer (10 µM), 0.75 µL SC-CH2A primer (10 µM), and 0.2 µL Platinum Pfx DNA polymerase (Invitrogen). Samples were heated to 94°C for 2 minutes in an Applied Biosystems 9700 thermocycler (“hot start”), followed by 45 cycles consisting of a 30 second 94°C denaturation, a 30 second 60°C annealing, and a 30 second 68°C extension.

All PCR product was purified using a QIAquick PCR Purification Kit according to the manufacturer’s protocol (Qiagen Inc.).

Figure 2.2: Immunoglobulin Heavy Chain mRNA. Immunoglobulin heavy chain mRNA is composed of VH, D, and JH segments rearranged and linked to a constant region (CH). The VDJ variable domain is divided into 3 complementarity-determining regions (CDRs) that directly bind antigen held together by 4 framework regions (FRs). At the 5’ end of the mRNA there is an untranslated leader sequence which is used to distinguish VH families. The 5’VH3 primer binds at the beginning of the leader sequence. The 3’IgG and CH2A primers bind in the first exon of the IgG constant region.

Table 2.2: VH3-IgG PCR Primer Sequences.

Primer Name    Sequence
VH3-IgG Sequencing
5’VH3          5’-CCATGGAGTTGGGGCTGAGCTGC-3’
3’IgG          5’-GGGCACCAGGGGGAAGACCG-3’
CH2A           5’-CACCGGTTCGGGGAAGTAGTCC-3’
SC-CH2A        5’-CCTTCGCCGACTGACACCGGTTCGGGGAAGTAGTCC-3’

Cloning of VH3-IgG genes

PCR products (1-4 µL) were cloned in a pCR®-4TOPO vector following the TOPO TA Cloning® kit instructions (Invitrogen) or a pSTC1.3 vector (10 µL of PCR product) following the StabyCloning kit instructions (Eurogentec). Plasmids from 84-212 individual bacterial colonies cultured overnight with 2 mL of LB medium (100 µg/mL ampicillin) were purified using a QIAprep Spin Miniprep kit as per the manufacturer’s protocol (Qiagen).

Sequencing of VH3-IgG clones

For each patient sample, 56-105 plasmid inserts were sequenced either in both directions using SequiTherm Excell (Epicentre Technologies, Madison, WI), or in one direction by the Colorado Cancer Center Core Sequencing Facility using an ABI 3739 Sequencer (Applied Biosystems).

VH3-IgG cloned sequence analysis

Determination of polymerase fidelity

Comparison of an 18 bp segment of IgG constant regions from 823 blood IgG-VH3 sequences with the germline sequences yielded a maximum potential error rate of 1.15 × 10⁻³ (17 mutations/14,814 bp), or about 1 error/871 bp (or ≈1 error per 2.5 VH genes). Given the high frequency of mutation in the blood VH3 genes in this analysis (mean 20 mutations/VH3 gene in the control subjects), only ≈1.7% of the mutations could be ascribed to polymerase amplification errors.
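The arithmetic behind these figures can be reproduced with a short Python sketch. The VH gene length used in the last step (300 bp) is an assumption chosen for illustration, since the length used in the original calculation is not stated. All variable names are hypothetical.

```python
segment_bp = 18            # constant-region segment compared per sequence
n_sequences = 823          # blood IgG-VH3 sequences examined
observed_mismatches = 17   # differences from germline in that segment

total_bp = segment_bp * n_sequences                # bases examined in total
error_rate = observed_mismatches / total_bp        # maximum errors per base
bp_per_error = total_bp / observed_mismatches      # bases per expected error

# Assumption for illustration only: a VH gene of about 300 bp.
assumed_vh_length_bp = 300
mean_mutations_per_gene = 20                       # control subjects, from the text
expected_errors_per_gene = error_rate * assumed_vh_length_bp
share_from_polymerase = expected_errors_per_gene / mean_mutations_per_gene

print(total_bp, f"{error_rate:.2e}", round(bp_per_error))
print(f"{share_from_polymerase:.1%}")
```

With the assumed 300 bp length, roughly 0.34 polymerase errors are expected per gene, which is about 1.7% of a mean of 20 mutations per gene.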

VH3-IgG cloned sequence alignments and mutation calculations

Somatic hypermutations in 40-66 sequences per patient were identified by comparing the sequences to the Vbase (http://vbase.mrc-cpe.cam.ac.uk/) database of variable region germline sequences using the DNAplot software accessed at this website. Multiple identical sequences, as determined by their CDR3 sequence and which represented 26.7% of sequences, were excluded from this analysis. The Batch Analyzer program developed and validated in our group by comparison with online programs (Vbase, IMGT® http://www.imgt.org [155] and JOINSOLVER® http://joinsolver.niams.nih.gov [156]) was used to exclude multiple identical sequences from the same patient and to calculate mutation frequencies and patterns in each sequence.
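The two steps described here, exclusion of repeated sequences by CDR3 and calculation of a per-sequence mutation frequency, can be sketched in Python as follows. This is not the Batch Analyzer code; the record layout and function names are hypothetical, and the sketch assumes each sequence has already been aligned to its germline gene with no insertions or deletions.

```python
def drop_cdr3_duplicates(records):
    """Keep the first record for each (patient, CDR3) pair."""
    seen = set()
    unique = []
    for rec in records:
        key = (rec["patient"], rec["cdr3"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def mutation_frequency(sequence, germline):
    """Fraction of aligned positions that differ from the germline sequence."""
    if len(sequence) != len(germline) or not germline:
        raise ValueError("sequences must be non-empty and pre-aligned to equal length")
    mismatches = sum(a != b for a, b in zip(sequence.upper(), germline.upper()))
    return mismatches / len(germline)

# Made-up records for illustration only.
records = [
    {"patient": "P1", "cdr3": "ARDYW", "sequence": "ACGTACGA", "germline": "ACGTACGT"},
    {"patient": "P1", "cdr3": "ARDYW", "sequence": "ACGTACGA", "germline": "ACGTACGT"},
    {"patient": "P2", "cdr3": "ARDYW", "sequence": "ACGTACGT", "germline": "ACGTACGT"},
]
for rec in drop_cdr3_duplicates(records):
    print(rec["patient"], mutation_frequency(rec["sequence"], rec["germline"]))
```

Keying on the patient as well as the CDR3 means that identical CDR3 sequences from different patients are both retained, matching the exclusion of repeats "from the same patient".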

Mutation patterns analysis

Conservative amino acid substitutions in either direction included: Ala to Gly, Ile, or Leu; Ile to Leu, Met, Phe, or Val; Leu to Met, Phe, or Val; Arg to His or Lys; Gly to Ile, Leu, or Val; Asp to Glu; Asn to Gln; His to Lys; Met to Phe or Val; Phe to Val; Ser to Thr. All other substitutions, which showed changes in charge or polarity, were considered non-conservative. Insertions or deletions within sequences were not included in the analysis. D regions were identified using the criteria of a minimum of 6 bp of identity over a 7 bp length of sequence [157].
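The classification rule can be expressed as a lookup over unordered residue pairs, as in the minimal Python sketch below. The pairs are those listed above; the function name and the "unchanged" label for identical residues are hypothetical additions for this illustration.

```python
# Conservative pairs as listed in the text; each applies in either direction.
CONSERVATIVE_PARTNERS = {
    "Ala": ["Gly", "Ile", "Leu"],
    "Ile": ["Leu", "Met", "Phe", "Val"],
    "Leu": ["Met", "Phe", "Val"],
    "Arg": ["His", "Lys"],
    "Gly": ["Ile", "Leu", "Val"],
    "Asp": ["Glu"],
    "Asn": ["Gln"],
    "His": ["Lys"],
    "Met": ["Phe", "Val"],
    "Phe": ["Val"],
    "Ser": ["Thr"],
}

# frozenset makes each pair order-independent ("in either direction").
CONSERVATIVE = {
    frozenset((residue, partner))
    for residue, partners in CONSERVATIVE_PARTNERS.items()
    for partner in partners
}

def classify_replacement(germline_aa, mutated_aa):
    """Classify an amino acid replacement using three-letter residue codes."""
    if germline_aa == mutated_aa:
        return "unchanged"
    if frozenset((germline_aa, mutated_aa)) in CONSERVATIVE:
        return "conservative"
    return "non-conservative"

print(classify_replacement("Ser", "Thr"))
print(classify_replacement("Thr", "Ser"))
print(classify_replacement("Asp", "Lys"))
```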

CDR3 region analysis

The first amino acid in CDR3 was assigned as the third amino acid following a conserved cysteine residue at position 92 (Kabat numbering system) of the VH gene [152]. The last amino acid was assigned just before the first conserved tryptophan of FR4 (Kabat position 103). The CDR3 sequences were analyzed for length, composition of acidic (Asp and Glu), basic (Arg, His, and Lys), uncharged polar (Gly, Ser, Thr, Cys, Tyr, Asn, and Gln), nonpolar (Ala, Val, Leu, Ile, Pro, Met, Phe, Trp), and aromatic residues (His, Phe, Tyr, and Trp), and average hydrophobicity using the Kyte-Doolittle scale [158] as normalized by Eisenberg [159].
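A minimal Python sketch of such a CDR3 summary is given below. The residue classes follow the lists above. As a simplifying assumption, the hydropathy average uses the raw Kyte-Doolittle values rather than the Eisenberg-normalized scale used in this study, so the numbers are illustrative only. The function name and the example peptide are hypothetical.

```python
# Residue classes as listed in the text (one-letter codes). Aromatic residues
# are counted in addition to their charge/polarity class, so counts overlap.
RESIDUE_CLASSES = {
    "acidic": set("DE"),
    "basic": set("RHK"),
    "uncharged_polar": set("GSTCYNQ"),
    "nonpolar": set("AVLIPMFW"),
    "aromatic": set("HFYW"),
}

# Raw Kyte-Doolittle hydropathy values (NOT the Eisenberg-normalized scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def cdr3_profile(cdr3):
    """Length, residue-class counts, and mean hydropathy of a CDR3 peptide."""
    cdr3 = cdr3.upper()
    if not cdr3:
        raise ValueError("CDR3 sequence is empty")
    counts = {
        name: sum(aa in members for aa in cdr3)
        for name, members in RESIDUE_CLASSES.items()
    }
    mean_hydropathy = sum(KYTE_DOOLITTLE[aa] for aa in cdr3) / len(cdr3)
    return {"length": len(cdr3), "counts": counts, "mean_hydropathy": mean_hydropathy}

print(cdr3_profile("ARDYW"))  # made-up peptide, not a sequence from the study
```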

VH3 454 PCR/ High-throughput pyrosequencing

1st round amplification (Figure 2.3)

We wished to confirm the results of the VH3-IgG clones sequencing study on a larger number of sequences and in another class-switched isotype, IgA, as well as more naïve isotypes, IgD and IgM. Therefore, we switched from using a cloning and sequencing strategy to performing high-throughput pyrosequencing using Roche’s 454-FLX Titanium platform (Roche Applied Science, Indianapolis, IN). The primers used for PCR amplification were commercially synthesized primers derived from the VH3-family specific leader primer [154] (5’VH3) with random 8 bp nucleotide barcode sequences attached at the 5’ end to allow for sample pooling, the CH1 region of the IgG constant genes [153] (CH2A) and the first exons of the IgM, IgA [63], and IgD constant genes (Table 2.3). First round synthesis was performed in a 50 µL reaction containing 2-5 µL of cDNA template, 0.2 mM each dNTP, 1.5 mM MgSO4, 5 µL 10x KOD Hot Start Buffer, 0.3 µM each primer, and 0.02 U/µL KOD Hot Start High Fidelity DNA Polymerase (Novagen). Samples were heated to 95°C for 2 minutes in an Applied Biosystems GeneAmp PCR System 9700 thermocycler before initiating cycling (“hot start”). Samples were run for 5 cycles consisting of a 20 second 95°C denaturation, a 10 second annealing (68°C for IgD and IgG primer sets, 63°C for IgM and IgA primer sets), and a 10 second 70°C extension. PCR product was purified using a QIAquick PCR Purification kit (Qiagen) and run on a 1.5% agarose gel stained with ethidium bromide. The 5 cycle reaction gel bands were excised and purified from the gel using a MinElute Gel Extraction kit according to the manufacturer’s protocol (Qiagen). Purified PCR product was eluted into 15 µL of elution buffer.

2nd round amplification

The primers used for 2nd round amplification, to add the 454-adaptor sequences to the 5’ and 3’ ends of the PCR amplicons, were commercially synthesized primers as described above but with 454-sequencing tags (Roche) attached at the 5’ (5’VH3) and 3’ (CH2A, IgM, IgA, and IgD) ends (Table 2.3). Second round synthesis was performed in a 50 µL reaction containing the 15 µL of purified 1st round 5 cycle PCR product described above, 0.2 mM each dNTP, 1.5 mM MgSO4, 5 µL 10x KOD Hot Start Buffer, 0.3 µM each primer, and 0.02 U/µL KOD Hot Start High Fidelity DNA Polymerase (Novagen). Samples were heated to 95°C for 2 minutes. The samples were amplified for 17-22 cycles consisting of a 20 second 95°C denaturation, a 10 second annealing (65°C for IgD and IgG primer sets, 60°C for IgM and IgA primer sets), and a 10 second 70°C extension. Samples were purified and run on an agarose gel, then excised and purified from the gel slice as described above for the first round amplification product. The purified PCR product was eluted into 20 µL of elution buffer (Qiagen).

Figure 2.3: 454-Pyrosequencing Protocol. 454-Pyrosequencing requires two rounds of PCR to amplify the specific gene of interest and incorporate the 454 tags required for sequencing. The first round of amplification utilizes gene-specific primers to amplify the cDNA of choice. At the 5’ end of the forward primers are barcode sequences composed of 8 random nucleotides. Addition of the barcodes allows for sample pooling during sequencing. The final product of the first round is the PCR-amplified gene of interest attached to the barcode sequence. First-round PCR product is run out on an agarose gel, extracted from the gel, and purified. The second round of PCR amplifies the levels of first round product and adds the 454-sequencing primer sequence to the 5’ end of each PCR product and the bead-specific sequence to the 3’ end. The bead specific sequence attaches to the sequencing beads, allowing for sequencing of individual PCR amplicons on a multiwell plate. Second-round PCR product is also run out on an agarose gel, extracted, and purified, to ensure that only full-length PCR products are sequenced.

Table 2.3: VH3 454-Pyrosequencing PCR Primer Sequences

Primer Name                      Sequence

1st Round Amplification
IgG Barcode Sequences            TAAGAACG, TTACCGTG, TCGTCCGT, TTGAGGTT, TCCTTAGT,
                                 AAGGTGTT, TGTTAGGT, ACTTCGGT, CTTACCTT, GTAACCGT,
                                 TATATGCG, TATGTGCT, TATGCATG, TATGCACA, TAGAGACT
IgA Barcode Sequences            TTAACGGA, AATTACGG, AACTTAGG, ACGGTACG, TTGTACGG,
                                 TAAGTTGG, TTCGGTGG, TTCGTAAG, TAAGAACG, TTCACCGG,
                                 TATAGCTG, TATCGCGT, TACTGTCA, TAGATGCT, TCTGTGCA
IgD Barcode Sequences            ACCGTTAA, AACGGTTA, GGTAACGG, ACAACCGG, TTCTAACG,
                                 TTATTCCG, TTAACACG, TTACTAGG, TTACCGTG, TAGTTACG,
                                 CAGCACAT, CGCATACG, GATCGAGT, GCACAGCA, GCGAGCTA
IgM Barcode Sequences            TCGTTACG, CCGGTTAG, AAGTTCCG, TACGACCG, TAACGTTG,
                                 TAAGTACG, TTGTACGG, TCAACCTT, GGTACCGT, ATACCGGT,
                                 TCATCACA, TCATGTGT, TCACTCTG, TCGTCAGT, TGATGATG
5’VH3                            5’-Barcode sequence-CCATGGAGTTGGGGCTGAGCTGC-3’
CH2A (IgG)                       5’-CACCGGTTCGGGGAAGTAGTCC-3’
IgA                              5’-GAGGCTCAGCGGGAAGACCTT-3’
IgD                              5’-CCCAGTTATCAAGCATGCCAGGAC-3’
IgM                              5’-CGGGGAATTCTCACAGGAGAC-3’

2nd Round Amplification
FLX 454A (Adaptor)               5’-GCCTCCCTCGCGCCATCAG-3’
FLX 454B (Adaptor)               5’-GCCTTGCCAGCCCGCTCAG-3’
FLX-Titanium T454A (Adaptor)     5’-GCCTCCATCTCATCCCTGCGTGTCTCCGACTCAG-3’
FLX-Titanium T454B (Adaptor)     5’-CCTATCCCCTGTGTGCCTTGGCAGTCTCAG-3’
5’VH3-T454                       5’-T454A sequence-Barcode sequence-CCATGGAG-3’
CH2A-T454                        5’-T454B sequence-CACCGGTTCGGGGAAGTAGTCC-3’
IgA-T454                         5’-T454B sequence-GAGGCTCAGCGGGAAGACCTT-3’
IgD-T454                         5’-T454B sequence-CCCAGTTATCAAGCATGCCAGGAC-3’
IgM-T454                         5’-T454B sequence-CGGGGAATTCTCACAGGAGAC-3’

VH3 454-pyrosequencing analysis

VhIGene program

Due to the nature of the pyrosequencing results (format, number, and a significant number of indels per sequence), we were unable to utilize either our previous analysis program, Batch Analyzer, or any other publicly available alignment tools. Therefore, we developed the computer software pipeline VhIGene (Figure 2.4) in collaboration with Dan Frank, Ph.D. (Division of Infectious Diseases), who did the program coding, in order to classify the human V, D, and J germline alleles from which a rearranged B cell immunoglobulin gene most likely was derived and to calculate a variety of VH mutation statistics from high-throughput VH gene pyrosequencing data. The tasks performed by this software include 1) initial polishing of sequence reads (bartab); 2) profile alignment (hmmr) to the IMGT reference database of V-D-J allelic variants [155]; 3) identification for each VH read of its nearest-neighbor IMGT V-D-J sequence (vdj_match); and 4) calculation of a variety of statistics (e.g., base composition and substitutions, indels) for CDR1-3 and FR1-3 segments (vdj_stats). Our software was validated using several test sets including: 1) gold-standard sequences generated by Sanger sequencing (~3000 VH clones from which high-confidence V-D-J classifications were made through Vbase and JOINSOLVER) [156], and 2) mock 454 sequences generated by pyrosequencing independent PCR libraries made from 10 Sanger-sequenced VH clones (79,683 polished reads), which allow us to model the background frequency of base changes and insertions/deletions (indels) from pyrosequencing.
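The barcode-based sorting that precedes alignment can be illustrated with a minimal sketch. This is not the VhIGene (bartab) code, which is not reproduced in this thesis; the function name, the sample labels, and the exact-match rule are assumptions made for illustration only. The sketch assumes that each read begins with its 8-nucleotide barcode and that reads with an unrecognized barcode are set aside.

```python
from collections import defaultdict

BARCODE_LEN = 8  # barcodes on the forward primers are described as 8 nucleotides long


def demultiplex(reads, barcode_to_sample):
    """Sort pooled reads into per-sample libraries by their leading barcode.

    Hypothetical helper: requires an exact barcode match at the 5' end.
    Returns (libraries, unassigned). Assigned reads are stored with the
    barcode trimmed off; reads with an unknown barcode are returned as-is.
    """
    libraries = defaultdict(list)
    unassigned = []
    for read in reads:
        read = read.upper()
        barcode, insert = read[:BARCODE_LEN], read[BARCODE_LEN:]
        sample = barcode_to_sample.get(barcode)
        if sample is None:
            unassigned.append(read)
        else:
            libraries[sample].append(insert)
    return dict(libraries), unassigned


# Two IgG barcodes from Table 2.3; the subject labels are made up.
barcodes = {"TAAGAACG": "subject_1", "TTACCGTG": "subject_2"}
pooled = ["TAAGAACGCCATGGAG", "TTACCGTGCCATGGAG", "GGGGGGGGCCATGGAG"]
libraries, unassigned = demultiplex(pooled, barcodes)
```

A production pipeline would also need quality filtering, duplicate removal, and tolerance for sequencing errors within the barcode, none of which is attempted here.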

Mutation patterns analysis

Mutation patterns were calculated according to the same rules as with the VH3-IgG cloned sequences. See the VH3-IgG Cloned Sequence Analysis section above.

CDR3 region analysis

The CDR3 region was determined and characteristics calculated according to the same rules as with the VH3-IgG cloned sequences. See the VH3-IgG Cloned Sequence Analysis section above.

Preliminary Bcl6 gene PCR, cloning, sequencing, and analysis

Preliminary Bcl6 PCR amplification

Figure 2.4: Flow Diagram of VhIGene Program. Pooled or “Multiplexed” Next-Generation Sequence Datasets are initially polished by BARTAB, which sorts the sequences based on the attached barcodes into individual libraries, removes poor quality sequences from the analysis, removes identical copies of sequences from the analysis based on alignment of the first 200 base pairs of sequence, and trims poor quality bases from the remaining set of sequences. The polished, de-multiplexed libraries are then aligned to each other and to a reference dataset derived from IMGT using HMMER alignment software. VHFILTER removes any insertions or deletions identified from the alignment which are not equal to multiples of 3 bp in length (e.g., 3, 6, or 9 base pairs, which may represent new or deleted amino acids). The closest VH, D, and JH gene matches in the reference dataset are identified by VHMATCH. VHSTATS identifies and calculates mutation frequencies and patterns in the Complementarity Determining Regions (CDRs) and Framework Regions (FRs) of the sequence and outputs the data into a .csv file.

To measure SHM in a target other than VH genes, Bcl6 was also sequenced. We wanted to sequence the most highly mutated portion of the gene, so preliminary amplification, cloning, and sequencing were performed on 12 selected segments of the gene. Primers used for amplification were commercially synthesized to amplify the 12 selected segments (Figure 2.5, Table 2.4). Each PCR (25 µL) consisted of 200 ng template DNA, 0.75 µL dNTPs (10 mM each), 0.5 µL 50 mM MgSO4, 2.5 µL 10x Pfx Amplification buffer, 0.3 µL Bcl6 Rev primer (10 µM), 0.6 µL Bcl6 Fwd primer (10 µM), 0.75 µL STC1 Bcl6 Rev primer (10 µM), and 0.2 µL Platinum Pfx DNA polymerase (Invitrogen). Samples were heated to 94°C for 2 minutes in an Applied Biosystems 9700 thermocycler (“hot start”), followed by 45 cycles consisting of a 30 second 94°C denaturation, a 30 second 60°C annealing, and a 30 second 68°C extension. All PCR product was purified using a QIAquick PCR Purification Kit according to the manufacturer’s protocol (Qiagen Inc.).
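For readers who wish to compare this reaction with the later KOD-based reactions, which are reported as final concentrations, the volumes listed above can be converted with the standard dilution relationship C_final = C_stock x V_stock / V_total. The short sketch below is an illustration only; it assumes that the listed volumes are in microliters, that the primer stocks are micromolar, and that the total reaction volume is exactly 25 µL.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution formula: C_final = C_stock * V_stock / V_total.

    The result is in the same concentration units as stock_conc.
    """
    return stock_conc * stock_volume_ul / total_volume_ul


TOTAL_UL = 25.0  # assumed total reaction volume in microliters

final = {
    "dNTP_each_mM": final_concentration(10.0, 0.75, TOTAL_UL),
    "MgSO4_mM": final_concentration(50.0, 0.5, TOTAL_UL),
    "Bcl6_Rev_uM": final_concentration(10.0, 0.3, TOTAL_UL),
    "Bcl6_Fwd_uM": final_concentration(10.0, 0.6, TOTAL_UL),
    "STC1_Bcl6_Rev_uM": final_concentration(10.0, 0.75, TOTAL_UL),
}

for name, value in final.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the reaction contains 0.3 mM of each dNTP, 1 mM MgSO4, and 0.12, 0.24, and 0.3 µM of the Rev, Fwd, and STC1 Rev primers, respectively.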

Figure 2.5: Preliminary Segments Selected for Testing Bcl6 Mutation Frequency. Segments 1-11 were chosen based on their RGYW/WRCY AID hotspot motif content. Segment 1.5 was chosen based on sequencing experiments found in the literature and because its position at the 5’ end of the gene more closely resembles the portion of the VH gene that is most highly mutated [160-162].

Cloning of Bcl6 genes

PCR products (10 L) were cloned in a pSTC1.3 vector following the

StabyCloning kit instructions (Eurogentec). Plasmids from individual bacterial colonies cultured overnight with 2 mL of LB medium (100 g/mL ampicillin) were purified using a QIAprep Spin Miniprep kit as per the manufacturer’s protocol (Qiagen).


Table 2.4: Preliminary Bcl6 PCR Amplification and Cloning Primer Sequences.

Primer Name           Sequence

Bcl6-1Fwd             5’-GCTGAAAGTCCCAAGCTGTC-3’
Bcl6-1Rev             5’-GAAAGGGGCAATTGGAGAAT-3’
STC1 Bcl6-1Rev        5’-CCTTCGCCGACTGAGAAAGGGGCAATTGGAGAAT-3’
Bcl6-2Fwd             5’-AATTCTCCAATTGCCCCTTT-3’
Bcl6-2Rev             5’-CACTGGTCATCCAGCAAAGA-3’
STC1 Bcl6-2Rev        5’-CCTTCGCCGACTGACACTGGTCATCCAGCAAAGA-3’
Bcl6-3Fwd             5’-TCTTTGCTGGATGACCAGTG-3’
Bcl6-3Rev             5’-GAAGGGGAAGAGAGCGATTT-3’
STC1 Bcl6-3Rev        5’-CCTTCGCCGACTGAGAAGGGGAAGAGAGCGATTT-3’
Bcl6-4Fwd             5’-CATCTCAGGCTGTGTTCTGC-3’
Bcl6-4Rev             5’-TTTACTGGTTTGGGGCTTTG-3’
Bcl6-5Fwd             5’-GAGCCTCCTATGAACGAGGA-3’
Bcl6-5Rev             5’-CGTCATCCCAGATGCAGTAA-3’
STC1 Bcl6-5Rev        5’-CCTTCGCCGACTGACGTCATCCCAGATGCAGTAA-3’
Bcl6-6Fwd             5’-AAGTCAAAGTGGGGTGATGG-3’
Bcl6-6Rev             5’-CAACTCAAACCCCAAGCAAT-3’
Bcl6-7Fwd             5’-AGGGGCTGAGTATCAGTGCT-3’
Bcl6-7Rev             5’-AGCCTGGAAAACCCTTCTGT-3’
STC1 Bcl6-7Rev        5’-CCTTCGCCGACTGAAGCCTGGAAAACCCTTCTGT-3’
Bcl6-8Fwd             5’-ATGCTAGGGTGATTGCATCC-3’
Bcl6-8Rev             5’-AGTGGCAGGTTGTTCTCCAC-3’
STC1 Bcl6-8Rev        5’-CCTTCGCCGACTGAAGTGGCAGGTTGTTCTCCAC-3’
Bcl6-9Fwd             5’-AGGTGGTGGAGAACAACCTG-3’
Bcl6-9Rev             5’-GGTCCCCTGCTACATCAAGA-3’
STC1 Bcl6-9Rev        5’-CCTTCGCCGACTGAGGTCCCCTGCTACATCAAGA-3’
Bcl6-10Fwd            5’-GGGTCTCAGAGCTTGAGTGG-3’
Bcl6-10Rev            5’-ATGACCCTGTGCCAAATCTC-3’
STC1 Bcl6-10Rev       5’-CCTTCGCCGACTGAATGACCCTGTGCCAAATCTC-3’
Bcl6-11Fwd            5’-GGTTTCCTCTGCTGAGCATC-3’
Bcl6-11Rev            5’-AGCCCCTCATTAGCACACAG-3’
STC1 Bcl6-11Rev       5’-CCTTCGCCGACTGAAGCCCCTCATTAGCACACAG-3’
Bcl6-1.5Fwd           5’-GCAGTGGTAAAGTCCGAAGC-3’
Bcl6-1.5Rev           5’-AGGGAACACCAAAACACTCG-3’
STC1 Bcl6-1.5Rev      5’-CCTTCGCCGACTGAAGGGAACACCAAAACACTCG-3’


Sequencing of Bcl6 clones

For each segment and time point, 10-55 (median = 37) plasmid inserts were sequenced in one direction by the Colorado Cancer Center Core Sequencing Facility using an ABI 3739 Sequencer (Applied Biosystems).

Bcl6 cloned sequence alignments and mutation calculations

Cloned Bcl6 sequences were trimmed by visual analysis and aligned with a human Bcl6 reference sequence derived from GenBank (www.ncbi.nlm.nih.gov/genbank) using the BLAST tool bl2seq (www.ncbi.nlm.nih.gov/BLAST), which aligns two input sequences. Individual base substitution mutations were manually counted based on the alignment and recorded in an Excel file. Insertions and deletions were not counted as mutations. Mutation frequency was calculated as the total number of nucleotide mismatch mutations identified divided by the total number of nucleotides sequenced.
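Although the counting in this study was done by hand, the same calculation can be sketched in a few lines of code. The example below is an illustration rather than the procedure actually used: it assumes each clone has already been aligned to the reference as two equal-length strings with "-" marking gaps, and it assumes that gapped columns are excluded from both the numerator and the denominator. The function names are hypothetical.

```python
def count_substitutions(reference, clone, gap="-"):
    """Count mismatches and compared bases in one pairwise alignment.

    Columns with a gap in either sequence are skipped, so insertions and
    deletions are not counted as mutations.
    """
    if len(reference) != len(clone):
        raise ValueError("aligned sequences must have equal length")
    mismatches = 0
    compared = 0
    for ref_base, clone_base in zip(reference.upper(), clone.upper()):
        if gap in (ref_base, clone_base):
            continue
        compared += 1
        if ref_base != clone_base:
            mismatches += 1
    return mismatches, compared


def mutation_frequency(aligned_pairs):
    """Total mismatches divided by total nucleotides compared across clones."""
    total_mismatches = 0
    total_bases = 0
    for reference, clone in aligned_pairs:
        mismatches, compared = count_substitutions(reference, clone)
        total_mismatches += mismatches
        total_bases += compared
    return total_mismatches / total_bases if total_bases else 0.0


# Toy alignments (not real Bcl6 sequence): one substitution, one gap column.
pairs = [("ACGTACGT", "ACGTACGA"), ("ACGT-CGT", "ACGTACGT")]
frequency = mutation_frequency(pairs)  # 1 mismatch over 15 compared bases
```

Pooling counts across clones before dividing, as done here, weights longer reads more heavily than averaging per-clone frequencies would.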

Bcl6 454 PCR/ High-throughput pyrosequencing

1st round amplification

The primers used for PCR amplification were commercially synthesized primers designed to amplify 5 fragments (A, B, C, D, and E) derived from segment Bcl6-1.5, which had the greatest increase in mutation frequency after PBMC stimulation (Table 2.5). First round synthesis was performed in a 50 µL reaction containing 400 ng DNA template, 0.2 mM each dNTP, 1.5 mM MgSO4, 5 µL 10x KOD Hot Start Buffer, 0.3 µM each primer, and 0.02 U/µL KOD Hot Start High Fidelity DNA Polymerase (Novagen). Samples were heated to 95°C for 2 minutes in an Applied Biosystems GeneAmp PCR System 9700 thermocycler before initiating cycling (“hot start”). Samples were run for 5 cycles consisting of a 20 second 95°C denaturation, a 10 second 60°C annealing, and a 10 second 70°C extension. PCR product was purified using a QIAquick PCR Purification kit (Qiagen) and run on a 1.5% agarose gel stained with ethidium bromide. The 5-cycle reaction gel bands were excised and purified from the gel using a MinElute Gel Extraction kit according to the manufacturer’s protocol (Qiagen). Purified PCR product was eluted into 15 µL of elution buffer.

2nd round amplification

The primers used for 2nd round amplification were commercially synthesized primers as described above but with 454-sequencing tags (Roche) attached at the 5’ (Fwd; F) and 3’ (Rev; R) ends (Table 2.5). Second round synthesis was performed in a 50 µL reaction containing the 15 µL of purified 1st round 5-cycle PCR product described above, 0.2 mM each dNTP, 1.5 mM MgSO4, 5 µL 10x KOD Hot Start Buffer, 0.3 µM each primer, and 0.02 U/µL KOD Hot Start High Fidelity DNA Polymerase (Novagen). Samples were heated to 95°C for 2 minutes. The samples were amplified for 17-22 cycles consisting of a 20 second 95°C denaturation, a 10 second 55°C annealing, and a 10 second 70°C extension. Samples were purified and run on an agarose gel, then excised and purified from the gel slice as described above for the first round amplification product. The purified PCR product was eluted into 20 µL of elution buffer.


Table 2.5: Bcl6 454-Pyrosequencing PCR Primer Sequences.

Primer Name                Sequence

1st Round Amplification

Bcl6A Barcode Sequences    TTCCGGTT, TCCGTTAA, TTAACCGA, TTACCGTA, TTCCGAAC,
                           TAAGGAAC, TAAGTTCC, AACTTACC, TTAACTCC, TTACTACC
Bcl6B Barcode Sequences    TACGGTTA, TAGTTAGG, TACCAAGG, TAAGTACG, TAACGTTG,
                           AACGGAGG, TATTACGG, ACCGGTTG, ACGTTCCG, TTCGTAAG
Bcl6C Barcode Sequences    AAGACCG, TTGGTTC, TCTTAAC, TTCTAAG, GGTAACC,
                           AACCTAGG, TTAGAACG, TTATTCCG, TAACCAAG, TAACTTCG
Bcl6D Barcode Sequences    TACGTAA, AACCGGT, TCCTTAC, TTAGGTT, TTACCGT,
                           AGGTTAAC, TTCGTAGT, AAGTAGGT, ATAACCGT, AGTCCGTT
Bcl6E Barcode Sequences    TTACGGAA, AGTTAACC, TTAAGGTC, TCCGGAAC, TCGTTAAC,
                           ATTAACGG, CGTTAAGG, TCGGTTGG, TTCGGAGG, TTGAACCG
Bcl6A Fwd                  5’-Barcode sequence-GGAACCTCCAAATCCGAGAC-3’
Bcl6B Fwd                  5’-Barcode sequence-CAAATGCTTTGGCTCCAAGT-3’
Bcl6C Fwd                  5’-Barcode sequence-CACCCTCCCTTGTGTTGTTT-3’
Bcl6D Fwd                  5’-Barcode sequence-GCAAACTGCTTTCCTTGCTC-3’
Bcl6E Fwd                  5’-Barcode sequence-TTCAGAGCCGTGATCTTCCT-3’
Bcl6A Rev                  5’-GAAAACTTGGAGCCAAAGCA-3’
Bcl6B Rev                  5’-CTCCTTCCTCTCCTCCACCT-3’
Bcl6C Rev                  5’-CGGAGTTACCCAGAAGGACA-3’
Bcl6D Rev                  5’-GCAGGGAACACCAAAACACT-3’
Bcl6E Rev                  5’-AGCGCCCAAAATACAAACAC-3’

2nd Round Amplification

T454A (Adaptor)            5’-GCCTCCATCTCATCCCTGCGTGTCTCCGACTCAG-3’
T454B (Adaptor)            5’-CCTATCCCCTGTGTGCCTTGGCAGTCTCAG-3’
BaF-454                    5’-T454A sequence-Barcode sequence-GGAACCTC-3’
BbF-454                    5’-T454A sequence-Barcode sequence-CAAATGC-3’
BcF-454                    5’-T454A sequence-Barcode sequence-CACCCTCC-3’
BdF-454                    5’-T454A sequence-Barcode sequence-GCAAACTG-3’
BeF-454                    5’-T454A sequence-Barcode sequence-TTCAGAGC-3’
BaR-454                    5’-T454B sequence-GAAAACTTGGAGCCAAAGCA-3’
BbR-454                    5’-T454B sequence-CTCCTTCCTCTCCTCCACCT-3’
BcR-454                    5’-T454B sequence-CGGAGTTACCCAGAAGGACA-3’
BdR-454                    5’-T454B sequence-GCAGGGAACACCAAAACACT-3’
BeR-454                    5’-T454B sequence-AGCGCCCAAAATACAAACAC-3’

Bcl6 454-pyrosequencing analysis

Reference sequences from the GenBank human Bcl6 gene were created for the 5 fragments sequenced by 454-pyrosequencing. 454-generated sequences were de-multiplexed, polished, aligned, and analyzed by VhIGene. The 5 reference sequences were used in place of the IMGT reference dataset.

Flow cytometry

To measure B and T cell marker expression, unstimulated and stimulated PBMC (2 x 10^6 cells/tube) were washed in filter-sterilized FACS buffer (Dulbecco’s PBS (Invitrogen) + 1% BSA (Fisher Scientific, Pittsburgh, PA)) and stained with 9-color panels of monoclonal antibodies to B and T cell markers. Cells were incubated with antibodies for 40 minutes at 4°C. After two additional washes in FACS buffer, stained cells were fixed in 1% paraformaldehyde for 10 minutes at 4°C, then run on an LSRII flow cytometer (BD Biosciences) within 24 hours. Data were analyzed using FlowJo software (Tree Star, Inc.; Ashland, OR). Cells were gated on singlets, followed by lymphocytes, and finally on CD19+ B cells or CD3+ T cells. Positive expression of each marker was determined by mean fluorescence intensity measurements (MFIs) from cells singly stained with each marker. B cell markers included CD38-FITC, CD21-PE, CD10-PE-CF594, CD40-APC, CD86-BV421 (BD Pharmingen, San Diego, CA), IgM-PerCP-Cy5.5, IgD-PE-Cy7, CD19-AF700, and CD27-BV650 (Biolegend, San Diego, CA). T cell markers included CD38-FITC, HLA-DR-PE-Cy7, PD-1-APC, CD3-AF700, CD4-Pacific Blue (Biolegend), CXCR5-PE (R&D Systems, Minneapolis, MN), CD45-RA-ECD, CD27-PE-Cy5 (Beckman-Coulter, Indianapolis, IN), and CD8-APC-AF750 (Invitrogen).

Statistical analysis

Primary analyses (HIV-1-viremic vs. aviremic vs. control subjects) used non-parametric two-sided tests with a significance level of 0.05. A Fisher’s protected least significant difference approach was utilized for secondary analyses, such that pairwise (secondary) comparisons were conducted only if the overall primary test was significant. Secondary tests utilized a Kruskal-Wallis, non-parametric ANOVA. Primary analyses between HIV-1-infected and control subjects only utilized a non-parametric Mann-Whitney test. For each subject, the expression of specific VH3 family genes was compiled and the mean percent expression was calculated; the median percent expression was calculated from the means of the individual patients for each group. Generalized linear models (binomial family) were used for comparing the proportion of events between groups.
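The two-level summary used for gene expression (a per-subject percentage, then a group median across subjects) can be made concrete with a short sketch. This is an illustration under stated assumptions, not the analysis code used in the study: it interprets the per-subject value as the percentage of that subject's clones assigned to a given gene, and the data structure and function names are hypothetical.

```python
from statistics import median


def percent_expression(gene_calls, gene):
    """Percentage of one subject's sequences assigned to the given gene."""
    return 100.0 * gene_calls.count(gene) / len(gene_calls)


def group_median_expression(subjects, gene):
    """Median across subjects of each subject's percent expression.

    `subjects` maps a subject label to the list of gene assignments for
    that subject's clones, so every subject contributes one value
    regardless of how many clones were sequenced.
    """
    per_subject = [percent_expression(calls, gene) for calls in subjects.values()]
    return median(per_subject)


# Made-up gene calls for three subjects (4 clones each).
controls = {
    "s1": ["VH3-23", "VH3-7", "VH3-23", "VH3-30"],   # 50% VH3-23
    "s2": ["VH3-23", "VH3-7", "VH3-9", "VH3-30"],    # 25% VH3-23
    "s3": ["VH3-7", "VH3-7", "VH3-7", "VH3-7"],      # 0% VH3-23
}
vh3_23_median = group_median_expression(controls, "VH3-23")
```

Summarizing per subject first keeps a subject with many clones from dominating the group value, which is the apparent motivation for the approach described above.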


CHAPTER III

THE SOMATIC HYPERMUTATION FREQUENCY OF VH3 FAMILY IMMUNOGLOBULIN GENES IS ALTERED IN HIV-1-INFECTED PATIENTS COMPARED WITH HEALTHY CONTROLS

Introduction

B cell activation and hypergammaglobulinemia are among the first and most persistent immunologic consequences of HIV-1 infection [163, 164]. High rates of infection and impaired humoral responses to vaccines during HIV-1 infection may be related to an impaired ability to generate pathogen-specific antibodies not only in sufficient quantities but also of sufficient quality and function to control these pathogens [165-167]. The successful evolution of antibody diversity, specificity and function is determined by three distinct processes. First, antigen-independent recombination of variable (V), diversity (D) and joining (J) gene segments establishes the primary repertoire in naïve B cells (IgD+IgM+) and appears relatively intact during HIV-1 infection [29]. Subsequently, in lymph node germinal centers, antigen-dependent somatic hypermutation (SHM) modifies the antigen-binding variable regions of the heavy (VH) and light (VL) chains, which, following selection, enhances antigen specificity and avidity [168]. Finally, class-switch recombination (CSR) modifies the effector constant regions of the heavy chain (CH) to a single isotype (IgG, IgA or IgM) and may be somewhat impaired during HIV-1 infection [131, 133, 135, 169].

SHM introduces point mutations into the variable regions of immunoglobulin genes to increase both specificity and affinity. While both light (VL) and heavy (VH) chain genes are somatically hypermutated, the major contributor to diversity and specificity is the heavy chain gene [170]. Immunoglobulin heavy chain genes (Figure 3.1) are divided into two types of regions: complementarity determining regions (CDRs), which bind directly to antigen, and framework regions (FRs), which form the structural regions of the molecule [70]. Three CDRs are surrounded by four FRs. CDR1 and CDR2, as well as FR1/2/3, are encoded by the V segment, CDR3 is encoded by the rearranged V, D, and J segments, and FR4 is encoded by the J segment [171, 172]. CDRs are more highly mutated than FRs. This is due both to codon composition, since mutations in the CDRs are more likely to result in amino acid replacements, and to the greater number of RGYW/WRCY mutation “hotspot” motifs found in the CDRs [82].
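To make the hotspot notation concrete, the sketch below locates RGYW and WRCY motifs in a nucleotide sequence. It is an illustration only and is not part of the analysis pipeline used in this study; the function name is hypothetical. The motifs are written with standard IUPAC codes (R = A or G, Y = C or T, W = A or T), and WRCY is the reverse complement of RGYW. A zero-width lookahead is used so that overlapping motifs are all reported.

```python
import re

# IUPAC codes expanded into character classes: R = [AG], Y = [CT], W = [AT].
# The lookahead (?=...) lets overlapping motif occurrences each be found.
RGYW = re.compile(r"(?=[AG]G[CT][AT])")
WRCY = re.compile(r"(?=[AT][AG]C[CT])")


def hotspot_positions(sequence):
    """Return 0-based start positions of RGYW and WRCY motifs."""
    seq = sequence.upper()
    return {
        "RGYW": [match.start() for match in RGYW.finditer(seq)],
        "WRCY": [match.start() for match in WRCY.finditer(seq)],
    }


# Toy sequence: AGCT (position 0) fits both motifs; AGCA (position 4) fits RGYW only.
positions = hotspot_positions("AGCTAGCA")
```

Dividing the number of motif positions by the region length would give a simple hotspot density for comparing CDRs with FRs.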

Figure 3.1: Immunoglobulin Heavy Chain mRNA. Immunoglobulin heavy chain (IgH) mRNA is composed of V, D, and J segments rearranged and linked to a constant region. The VDJ variable domain is divided into 3 complementarity-determining regions (CDRs) that directly bind antigen, held together by 4 framework regions (FRs). At the 5’ end of IgH mRNA there is an untranslated leader sequence which is used to distinguish VH families. The CDR3 region, composed of V, D, and J gene segments, is the most variable of all regions.

We focused our studies on sequences of the immunoglobulin VH3 family. Of the seven VH families, VH3 is the largest, comprising 22 of the 44 functional human VH genes [170, 173], and is also the most highly expressed VH gene family [174]. VH3 family genes encode most antibodies to capsular polysaccharides of common HIV-1-associated pathogens (e.g. Streptococcus pneumoniae, Haemophilus influenzae, Cryptococcus neoformans, and Salmonella spp.) [25-27, 173]. VH3 genes have also been implicated in superantigen-like activity in the context of HIV-1 infection. The HIV-1 envelope glycoprotein, gp120, can bind to the products of several VH3 family genes in an area outside of the antigen-binding region and can lead to B cell activation in vitro [175, 176]. In the absence of adequate T cell signals, such binding and activation could deplete these cells [175, 176]. Thus, impairment of SHM of VH3 family genes could have a substantial impact on HIV-1 disease progression.

We show that, compared with uninfected control subjects, HIV-1 infection is associated with significantly decreased frequencies of SHM in CDR1/2 (nucleotides and amino acids) of IgG class-switched VH3 cDNA. While the quantity of the SHM response may be decreased during HIV-1 infection, the quality of the process, as determined by examining the mutation pattern, appears intact. Consistent with this result, a significantly lower mutation frequency was also found in HIV-1-infected patients in another non-Ig target of the SHM machinery, Bcl6. Because antibody avidity and function are determined by SHM, these decrements in VH3 mutation may contribute to the increased rates of primary and recurrent infections against which antibodies contribute to protection, and to the limited efficacy of polysaccharide vaccines to protect against these pathogens in this adult population [177].

Surprisingly, when we examined the SHM frequency in presumably naïve IgD VH3 transcripts, we found that it was significantly higher in the HIV-1-infected patients compared with controls. However, as seen in the VH3-IgG transcripts, mutation patterns imply only a change in quantity of mutation, not quality. The mechanisms of HIV-1-associated disparities in SHM frequency may include altered regulation in the frequency or magnitude of the SHM process mediated by activation-induced cytidine deaminase (AID) protein, related DNA repair enzymes, antibody selection in germinal centers, altered B cell subset proportions, and/or non-specific peripheral T-independent activation of circulating naïve B cells.

Results

VH3 somatic hypermutation frequency is reduced in VH3-IgG cloned samples from viremic HIV-1-infected patients but the mutation pattern is normal

VH3-23 gene expression is reduced in viremic HIV-1-infected patients. We began our study by examining VH3-family immunoglobulin gene expression in IgG mRNA transcripts from 10 control subjects compared with two HIV-1-infected cohorts, one group on stable therapy with no detectable plasma HIV-1 viral loads (aviremic, 6 patients) and another untreated group with high viral loads (viremic, 15 patients) (Table 3.1).

V-D-J gene recombination is the first step in generating the antibody repertoire. We characterized VH3 gene utilization by cloning and sequencing 494 VH3-IgG cDNA clones from circulating class-switched IgG B cells from 10 HIV-1-seronegative control subjects, 793 clones from 15 viremic HIV-1-infected patients (80% >10,000 copies/mL) and 278 clones from 6 aviremic HIV-1-infected patients (median 50 clones/subject; range 40-66; accession numbers JN576421-JN577983). At least 20 of the 22 individual VH3 genes were represented in all groups. VH3-23, although the most frequently expressed gene [173, 174], was significantly decreased among viremic HIV-1-infected patients compared with control subjects, as were VH3-7, VH3-9, and VH3-53 (Figure 3.2a). This decrease was consistent for most viremic patients, among whom the frequency of VH3-23 expression was below the median for control subjects in 13 of 15 patients (Figure 3.2b).

Table 3.1: Clinical Characteristics of VH3-IgG Cloning Study Subjects.

                                      Control      Aviremic       Viremic                    p value
Number                                10           6              15
Age, median years (range)             28 (24-52)   53.5 (44-59)   43 (26-52)                 0.017
Gender (M:F)                          6:4          6:0            14:1                       0.04
% Non-Caucasian(a)                    0            33%            33%
CD4+ T cells/µL, median (range)       N/A          270 (129-316)  140 (34-394)               0.047
HIV-1 RNA copies/mL, median (range)   N/A          <50            56,324 (500-1,029,032)
Antiretroviral Therapy(b)             N/A
  No Therapy                          N/A          0              10 (67%)
  1 medication                        N/A          0              2 (13%)
  ≥3 medications                      N/A          6 (100%)       3 (20%)

(a) Non-Caucasian = African American, Asian/Pacific Islander, Native American, Hispanic, or Other.
(b) Antiretroviral therapy: 1 medication = NRTI, ≥3 medications = 1 PI + 2 NRTI or NNRTI + 2 NRTI.
Effect of gender was evaluated by Fisher’s exact test.

D and JH gene expression is very similar in all groups. Only limited differences were observed in the utilization of the other variable region genes, D and JH. D regions, located centrally in the hypervariable CDR3 region and the most commonly mutated of the 3 variable region segments, could be assigned in 84.6% of sequences. We identified few significant differences in D gene utilization of less prevalent D genes across all groups (Figure 3.3a). Similarly, utilization of JH segments was comparable in each group, with JH4b comprising approximately half of all JH segments (Figure 3.3b). Thus, as suggested in earlier work on naïve B cells [29], the repertoire of expressed variable region genes appears fundamentally intact in these patients with HIV-1 infection.

[Figure 3.2 graphic. Panel A: VH3 Family Gene Expression; x-axis: VH3 gene (7, 9, 11, 15, 23, 30, 33, 48, 53, 73, 74, 30.3, 30.5, Other); y-axis: % expression of VH3 gene; groups: Control, HIV+ Aviremic, HIV+ Viremic; * p<0.05. Panel B: VH3-23 Gene Expression; x-axis: HIV status (Control, Aviremic, Viremic); y-axis: % expression of VH3-23 gene; p = 0.03 and p = 0.01.]

Figure 3.2: Expression of VH3-IgG Family Genes and VH3-23. A) VH3 family gene expression. Data are shown as group medians among control subjects (n=10; black bars), aviremic HIV-1-infected patients (n=6; gray bars), and viremic HIV-1-infected patients (n=15; white bars). Values were calculated from individual mean percent expression for each gene. “Other” includes VH3-13, -20, -43, -49, -64, -65, -66, -72, and unidentified genes which ranged from 0.4% to 12.6% expression. B) VH3-23 expression. Each point represents the mean percentage of VH3-23/total VH3 genes for each control subject (n=10; black circles), aviremic HIV-1-infected patient (n=6; gray squares), and viremic HIV-1-infected patient (n=15; white triangles). Primary p=0.008. The solid bar indicates the group median.

[Figure 3.3 graphic. Panel A: DH Gene Expression; x-axis: DH gene (D1-26, D2-15, D3-3, D3-10, D3-22, D4-17, D5-5, D5-24, D6-13, D6-19, No ID, Other); y-axis: % expression of DH gene. Panel B: JH Gene Expression; x-axis: JH gene (JH2, JH3a, JH3b, JH4b, JH5a, JH5b, JH6b, Other); y-axis: % expression of JH gene. Groups in both panels: Control, HIV+ Aviremic, HIV+ Viremic.]

Figure 3.3: Expression of D and JH Genes. A) D gene expression was calculated as with VH3 gene expression. “Other” includes D1-1, D1-7, D1-14, D1-20, D2-2, D2-8, D2-21, D3, D3-9, D3-16, D4, D4-b, D4-11, D4-23, D5-12, D6-6, and D6-25 which ranged from 0.4% to 4.3% expression. Gene identification could not be assigned in 12.5-18.0% of sequences. B) JH gene percent expression was calculated as described above. “Other” includes JH1, JH4a, JH4d, JH6a, and JH6c which ranged from 0.2% to 6.7% expression.

CDR3 region characteristics are similar in all groups. The CDR3 region of the antigen-binding variable segment typically plays the greatest role in antigen binding [178] and has the greatest degree of diversity. We characterized the CDR3 region of each sequence by analyzing its length, hydrophobicity, and the composition of acidic, basic, uncharged polar, nonpolar, and aromatic residues (Table 3.2), as these attributes all make important contributions to the region’s antigen specificity [172]. The length of CDR3, which spans the junctions of the VH, D, and JH genes [172], was similar in sequences from the three groups. The degree of hydrophobicity, a reflection of the tendency of hydrophobic residues to be buried within the protein, was also comparable between groups, as was the amino acid composition in the CDR3 region. Together with results on the expression of VH3, D, and JH genes, these data suggest that the initial V(D)J recombination process, involving RAG1, RAG2, and terminal deoxynucleotidyl transferase (TdT), is intact in HIV-1-infected individuals and does not materially contribute to the reported decreased avidity and functional activity of antibodies produced by HIV-1-infected patients [29, 179, 180].
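The residue composition summarized in Table 3.2 is a per-sequence fraction: the number of residues of each type divided by the total number of residues in the CDR3. A minimal sketch of that calculation follows. The class memberships below are common textbook groupings chosen for illustration and are an assumption; the rules actually applied are those given in the methods. Aromatic residues are counted as an overlapping class, so the percentages need not sum to 100.

```python
# Assumed residue classes (one-letter amino acid codes); illustrative only.
RESIDUE_CLASSES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "uncharged_polar": set("STNQCYG"),
    "nonpolar": set("AVLIMFWP"),
    "aromatic": set("FWY"),  # overlaps with the classes above
}


def residue_composition(cdr3):
    """Percent of CDR3 residues falling in each class for one sequence."""
    residues = cdr3.upper()
    if not residues:
        raise ValueError("CDR3 sequence is empty")
    total = len(residues)
    return {
        name: 100.0 * sum(aa in members for aa in residues) / total
        for name, members in RESIDUE_CLASSES.items()
    }


# Toy 5-residue CDR3: A (nonpolar), R (basic), D (acidic),
# Y (uncharged polar, aromatic), W (nonpolar, aromatic).
composition = residue_composition("ARDYW")
```

Per-sequence values of this kind would then be averaged within each subject before group medians are taken, consistent with the summary scheme described in the statistical methods.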

Table 3.2: Biochemical Characteristics of Amino Acids in the CDR3 Regions.

                             Control                 Aviremic                Viremic                p value
CDR3 Length (amino acids)    13.1 (12.2-14.2)        12.7 (11.1-12.9)        13.4 (11.5-15.1)       0.32
Hydrophobicity Index         -0.0090 (-0.08-0.05)    -0.0022 (-0.05-0.02)    0.0014 (-0.07-0.10)    0.39
Acidic Residues              15.9% (14.6-19.2%)      17.5% (14.3-18.7%)      15.7% (13.0-18.2%)     0.54
Basic Residues               7.5% (6.2-9.9%)         6.7% (6.1-8.7%)         7.3% (5.6-9.2%)        0.52
Uncharged Polar Residues     42.7% (36.1-46.7%)      42.4% (37.2-43.8%)      43.6% (35.9-47.6%)     0.70
Nonpolar Residues            33.4% (29.9-36.5%)      33.4% (31.0-37.6%)      34.1% (29.7-40.1%)     0.76
Aromatic Residues            24.9% (18.1-29.0%)      23.8% (21.6-27.7%)      24.3% (22.9-27.7%)     0.91

CDR3 length and hydrophobicity index were calculated as described in the methods. Residue characteristics were calculated with the number of each type of residue divided by the total number of residues in the CDR3 region of each sequence. Values are listed as the group median with the range of individual means in parentheses.

Mutation frequency is reduced in CDRs and FRs of VH3-IgG genes from viremic HIV-1-infected patients. SHM is typically an antigen-driven process that enhances antibody affinity and function. Hypervariable complementarity determining regions (CDR), the principal antigen-binding regions that serve as the primary targets for SHM, were assessed. The nucleotide mutation frequency in the CDR1/2 regions of cloned VH3-IgG sequences from B cells of the viremic HIV-1-infected group was significantly lower than that of control subjects (p=0.033; Figure 3.4a), whereas values for aviremic patients on effective antiretroviral therapy were not different. Although the overall frequency of nucleotide mutation was lower in the viremic patients, the ratio of replacement to silent (R/S) amino acid changes resulting from these nucleotide mutations, one indicator of positive antigen-driven selection, was high and similar in all three groups.

Nevertheless, consistent with the nucleotide results, the frequency of amino acid mutation in CDR1/2 was also lower among the viremic patients compared with control subjects (Figure 3.4b). However, the character of the amino acid changes was similar in that over two-thirds were non-conservative, changes which are more likely to impact antigen binding than conservative mutations. Non-conservative mutations are more likely to be positively selected during affinity maturation of VH immunoglobulin genes. The percent of non-conservative mutations was not statistically different between the three groups, suggesting that mutations were antigen-driven and targeted rather than more randomly distributed in these patients. Thus, although the frequencies of nucleotide and resultant amino acid replacement mutations were lower in viremic HIV-1-infected patients, the types of changes in nucleotides (replacement vs. silent) and amino acids (non-conservative vs. conservative) were not.
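The distinction between replacement and silent mutations can be illustrated with a short sketch. It is not the counting procedure used in this study, whose rules are given in the methods; the function name and the handling of codons with more than one change are assumptions. The sketch assumes an in-frame, ungapped germline and mutated sequence of equal length, uses the standard genetic code, and scores each changed nucleotide independently against the germline codon.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def count_replacement_silent(germline, mutated):
    """Return (replacement, silent) nucleotide mutation counts.

    Each differing nucleotide is placed alone into the germline codon;
    it is silent if the encoded amino acid is unchanged, otherwise it is
    a replacement mutation.
    """
    germline, mutated = germline.upper(), mutated.upper()
    if len(germline) != len(mutated) or len(germline) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    replacement = 0
    silent = 0
    for start in range(0, len(germline), 3):
        germ_codon = germline[start:start + 3]
        mut_codon = mutated[start:start + 3]
        for pos in range(3):
            if germ_codon[pos] == mut_codon[pos]:
                continue
            single = germ_codon[:pos] + mut_codon[pos] + germ_codon[pos + 1:]
            if CODON_TABLE[single] == CODON_TABLE[germ_codon]:
                silent += 1
            else:
                replacement += 1
    return replacement, silent


# Toy example: GCT->GCC keeps alanine (silent); AAA->AGA changes Lys to Arg.
r_count, s_count = count_replacement_silent("GCTAAA", "GCCAGA")
```

The R/S ratio for a region is then the replacement count divided by the silent count, which is undefined when no silent mutations are present and needs separate handling in that case.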

For the structural framework (FR1/2/3) regions (Figure 3.5a), the nucleotide mutation frequency in general, as expected [82, 181], was lower than in the CDR1/2 regions (2-8% vs. 5-15% (Fig. 3.4a), respectively). The variance in individual FR nucleotide and amino acid mutation frequencies was greater in FR from viremic patients than in the other two groups, but only significantly different for amino acids (Figures 3.5a and 3.5b). The R/S ratios were also similar between groups but lower overall compared with the CDRs, reflecting the targeted effect of antigen-driven selection on the latter. Similarly, the proportion of non-conservative amino acid mutations did not vary among groups. Results from aviremic patients effectively treated with antiretroviral therapy were most similar to those of control subjects.

While the mutation frequencies were lower in sequences from viremic patients compared with control subjects, the median proportions of unmutated sequences did not differ between groups (controls: median = 1.14%, range = 0-12.8%; aviremic: median = 1.07%, range = 0-4.3%; viremic: median = 7.69%, range = 0-18.8%; overall p=0.15), suggesting that the lower frequencies were not due to skewing by unmutated IgG class-switched sequences, as have been described in young infants [182]. Rather, the lower number of mutated nucleotides per sequence in the viremic group was distributed throughout the antibody population (Figure 3.6). Neither nucleotide nor amino acid mutation frequencies in CDR1/2 or FR1/2/3 correlated significantly with either plasma HIV-1 RNA (r=0.46, p=0.30) or CD4+ T cell number (r=0.15, p=0.63; data not shown).

VH3-IgG gene-specific SHM is consistent with overall SHM frequency in viremic HIV-1-infected patients. The decrement in mutation frequency in CDR1/2 of viremic patients was present across nearly all VH3 family genes, including the 5 most commonly expressed genes, representing 53% of all sequences analyzed (833/1,565) (Figure 3.7). Differences in mutation frequencies between viremic patients and control subjects were only statistically significant for VH3-33 (p=0.01). Mutation frequencies in the 5 genes proposed to bind to the HIV-1 envelope protein gp120 in a nontraditional manner to VH regions outside the antigen-binding pocket [183] were also evaluated. Such targeted binding has been proposed as a superantigen-like stimulus for selective deletion of B

[Figure 3.4 graphic. Panel A: CDR Nucleotide Mutation Frequency; x-axis: HIV status (Control, HIV+ Aviremic, HIV+ Viremic); y-axis: % CDR nucleotide mutations; overall p = 0.04, p = 0.03; R/S ratios: 3.88 (2.47-5.39), 4.10 (2.99-4.20), 3.65 (2.36-7.27). Panel B: CDR Amino Acid Mutation Frequency; x-axis: HIV status (Control, HIV+ Aviremic, HIV+ Viremic); y-axis: % CDR a.a. replacement mutations; overall p = 0.02, p = 0.02; non-conservative mutations: 69.0% (66.3-72.3%), 73.1% (69.4-81.1%), 71.0% (65.5-77.0%).]

Figure 3.4: Mutation Frequency is Reduced in CDR1/2 Regions of VH3-IgG Genes from Viremic HIV-1-Infected Patients. The mean percent of mutated nucleotides (A) or the mean percent of mutated amino acids (B) in CDR1 and CDR2 regions was calculated for each control subject (n=10; black circles), aviremic HIV-1-infected patient (n=6; gray squares), or viremic HIV-1-infected patient (n=15; white triangles) based on the alignment of the cloned sequences with VH3 sequences from the Vbase immunoglobulin database. The solid bar indicates the group median. A) The median (and range) of R/S ratios listed below the data points indicates the number of amino acids replaced as a result of nucleotide mutations relative to unchanged (silent) amino acids. B) The median (and range) percent non-conservative mutations (listed below the data points with the range of patient means in parentheses) was calculated relative to the total number of replaced amino acid mutations.

[Figure 3.5, plot. Panel A: FR Nucleotide Mutation Frequency; y-axis: % FR nucleotide mutations (0-10); x-axis: HIV status (Control, HIV+ Aviremic, HIV+ Viremic); overall p = 0.18; R/S ratios: Control 1.60 (1.41-1.76), HIV+ Aviremic 1.50 (1.15-1.63), HIV+ Viremic 1.51 (1.19-1.98). Panel B: FR Amino Acid Mutation Frequency; y-axis: % FR amino acid replacement mutations (0-15); x-axis: HIV status; overall p = 0.11, with p = 0.05 marked on the plot; non-conservative mutations: Control 56.9% (55.1-62.0%), HIV+ Aviremic 57.9% (51.6-60.0%), HIV+ Viremic 59.6% (49.2-64.5%).]

Figure 3.5: Replacement Amino Acid Mutation Frequencies are Reduced in FR1/2/3 Regions of VH3-IgG Genes from Viremic HIV-1-Infected Patients. The mean percent of mutated nucleotides (A) or mutated amino acids (B) in FR1, FR2, and FR3 was calculated for each control subject (n=10, black circles), HIV-1-infected aviremic patient (n=6, gray squares), and HIV-1-infected viremic patient (n=15, white triangles) based on the alignment of cloned sequences with VH3 sequences from the Vbase immunoglobulin database. The solid bar indicates the group median. A) The median (and range) of R/S ratios listed below the data points indicates the number of amino acids replaced as a result of nucleotide mutations relative to unchanged (silent) amino acids. B) The median (and range) percent non-conservative mutations listed below the data points was calculated relative to the total number of replaced amino acid mutations.

[Figure 3.6, plot. Title: Proportions of Mutated Sequences; y-axis: % of total sequences (0-30); x-axis: nucleotide mutation frequency (%) in bins 0-1.9, 2-3.9, 4-5.9, 6-7.9, 8-9.9, 10-11.9, 12-13.9, 14-15.9, 16-17.9, and 18+; legend: Control, HIV+ Aviremic, HIV+ Viremic; p values marked on the plot: p=0.04, p=0.04, p=0.02.]

Figure 3.6: The Proportions of Mutated Sequences in VH3-IgG Genes. The median proportion of sequences with varying percentages of nucleotide mutations per sequence is graphed for control subjects (n=10, black circles), HIV-1-infected aviremic patients (n=6, gray squares), and HIV-1-infected viremic patients (n=15, white triangles). Proportions are expressed as a percent of the total number of sequences from each individual. Primary p values are listed on the graph. Secondary analyses: 0-1.9%, control vs. viremic p<0.05; 4-5.9%, aviremic vs. viremic p<0.05; 6-7.9%, control vs. viremic p<0.05.
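The binning described in this caption can be sketched as follows. The helper names `bin_label` and `proportions` are hypothetical, and the 2% bin width with an open-ended 18+ bin is taken from the x-axis of the figure; the sketch only illustrates how per-sequence mutation percentages from one individual could be converted to proportions of that individual's total sequences.

```python
def bin_label(pct: float, width: float = 2.0, top: float = 18.0) -> str:
    """Return the histogram bin for a per-sequence mutation percentage."""
    if pct >= top:
        return f"{top:g}+"
    low = int(pct // width) * width
    return f"{low:g}-{low + width - 0.1:g}"


def proportions(mutation_pcts: list) -> dict:
    """Percent of one individual's sequences that fall in each bin."""
    counts = {}
    for pct in mutation_pcts:
        label = bin_label(pct)
        counts[label] = counts.get(label, 0) + 1
    total = len(mutation_pcts)
    return {label: 100.0 * n / total for label, n in counts.items()}


# Four toy sequences with 0.5%, 1.0%, 5.0%, and 20.0% mutated nucleotides.
print(proportions([0.5, 1.0, 5.0, 20.0]))
```

Group medians for each bin would then be taken across the per-individual proportions.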

cells bearing these gene products (VH3-30.5, -23, -15, -30, and -73). However, we found no consistent differences in the pattern of amino acid replacement frequencies by group for each of these genes (Figure 3.7).

Figure 3.7: Frequencies of Amino Acid Replacement Mutations in Specific VH3-IgG Genes. The group medians for CDR1/2 in each of the 5 most frequently expressed VH3 genes (VH3-74, -33, -07, -30.5, and -23) and in the VH3 genes proposed to bind to HIV-1 gp120 outside of the antigen-binding region (VH3-30.5, -23, -15, -30, and -73) are calculated from the mean mutation frequencies for each individual patient in a group. The table below shows the total number of sequences cloned (N) for each gene by group.

The topographical pattern of amino acid mutations was similar in all groups. The frequency of amino acid replacements by numbered position within the VH molecule was virtually identical in all groups (Figure 3.8). Moreover, the proportion of non-conservative replacements at each VH site in each patient group mirrored the others. As noted, the highest mutation frequencies were found in the CDR1/2 regions. A larger portion of these mutations in the CDR1/2 regions were non-conservative compared with those in the FR1/2/3 (non-conservative to conservative [N/C] ratio of 2.23-2.71 in CDR1/2 vs. 1.32-1.47 in FR1/2/3; Figure 3.8), again suggesting antigen-driven selection requiring functional differences in structure compared with germline. In contrast, a larger portion of the replacement amino acid mutations in the FR1/2/3 regions were conservative mutations, as we expected based on FR function and nucleotide content [82].

Binding of HIV-1 gp120 to VH3 genes in a superantigen-like fashion has been reported to involve amino acid residues in regions FR1 (residues 10, 13, 19, 23, 28), CDR1 (residue 32), CDR2 (residues 54, 59, 64, 65), and FR3 (residues 75, 79, 81, 82a, 83, and 85) [183]. Despite suggestions that gp120 binding would be sensitive to SHM at these residues [183], amino acid replacement frequencies at nearly all of these positions did not vary by group, whereas variation would be expected if large proportions of the VH3-expressing B cells that bound gp120 at these positions were being selectively deleted. Only at amino acid position 54 was there a significant difference in the proportion of non-conservative mutations between viremic patients and control subjects, with viremic patients less likely to have a non-conservative mutation at this site. This was the only significant difference seen between groups when all VH3 gene sequencing results were compared (p=0.033), when only putative gp120-binding VH3 gene sequencing results were compared (p=0.063), or when only VH3-23 gene sequence results were compared (p=0.016).

Nucleotide mutation patterns were similar in all groups. Mutations in G and C are a direct result of the activity of activation-induced cytidine deaminase (AID). AID targets SHM mutations to “hotspot” motifs (RGYW and complementary WRCY nucleotide motifs) that comprise a minority of total nucleotides but include the majority of mutations in CDR1/2 (Table 3.3) [82, 125]. The decreased mutation frequency observed in the CDR regions from patients with plasma viremia was not due to differences in the number of these hotspot motifs in the variable regions (11-24 per sequence in all groups) (Table 3.3). In addition, nucleotide mutations in both CDRs and FRs were clustered equally. In all groups, 59-61% of all CDR mutations occurred in these hotspots, as did 29-30% of mutations in FR1/2/3 (Table 3.3).
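The hotspot analysis can be illustrated with a short sketch. The function names are hypothetical, and two choices are assumptions rather than the thesis's stated method: overlapping motifs are found with a regular-expression lookahead, and a mutation is counted as "in a hotspot" when it falls on any of the four positions of an RGYW or WRCY motif in the germline sequence.

```python
import re

# IUPAC codes: R = A/G, Y = C/T, W = A/T.
# The lookahead lets overlapping motifs be reported at every start position.
HOTSPOT = re.compile(r"(?=([AG]G[CT][AT]|[AT][AG]C[CT]))")


def hotspot_positions(germline: str) -> set:
    """0-based positions covered by any RGYW or WRCY motif."""
    covered = set()
    for match in HOTSPOT.finditer(germline.upper()):
        covered.update(range(match.start(), match.start() + 4))
    return covered


def percent_in_hotspots(germline: str, observed: str) -> float:
    """Percent of point mutations that fall inside a hotspot motif."""
    hot = hotspot_positions(germline)
    mutated = [i for i, (g, o) in enumerate(zip(germline, observed)) if g != o]
    if not mutated:
        return 0.0
    return 100.0 * sum(i in hot for i in mutated) / len(mutated)


# Toy example: AGCT (positions 2-5) is a hotspot; one of two mutations hits it.
print(percent_in_hotspots("TTAGCTTT", "TTAACTTA"))
```

Note that a palindromic motif such as AGCT satisfies both RGYW and WRCY and is counted once per start position in this sketch.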

The initial deamination of cytidine nucleotides by AID [184] is subsequently repaired by low-fidelity translesion DNA synthesis (TLS) polymerases (DNA pol , , , , and Rev1), yielding mutations in A and T nucleotides [121]. Impairment of one or more of these polymerases can affect the resulting nucleotide mutation pattern [128, 129]. Based on knock-out mutations and RNA silencing studies in mice, yeast, and humans, loss of Pol , , , and  does not result in any changes in mutation frequency or pattern. However, loss of Pol , , , and Rev1 results in a decrease in some or all types of mutations [128]. Fewer A:T but more G:C mutations are found in the absence of Pol . Decreases of all nucleotides are seen in the absence of Pol  (moderate decreases) and Pol  (60-80% decrease) [128]. Loss of Rev1 may lead to a decrease in C:G transversion mutations [128]. Whereas all nucleotides tended to show a lower proportion of mutation, the proportions of mutated A and G nucleotides, relative to the total number of A and G nucleotides in the unmutated reference sequence, were significantly lower in viremic patients compared with control subjects in the CDR1/2 regions (Table 3.4). Changes in nucleotide mutation patterns in FR1/2/3 regions were less pronounced; however, the mutation of G nucleotides was again significantly lower among viremic patients compared with control subjects. Despite the lower proportions of mutated A and G nucleotides, the relative frequency of each nucleotide mutated, relative to the total number of mutations, was not different between groups (G> A> C> T in CDRs, G> C> A> T in FR in all groups) (Table 3.5).


[Figure 3.8, plot. Title: Amino Acid Mutation Frequency by Position. Panels: A) Control Group (CDR N/C ratio: 2.23; FR N/C ratio: 1.32); B) HIV+ Aviremic Group (CDR N/C ratio: 2.71; FR N/C ratio: 1.38); C) HIV+ Viremic Group (CDR N/C ratio: 2.45; FR N/C ratio: 1.47). y-axis: amino acid mutation frequency (%) (0-60); x-axis: amino acid position (1-94, including insertion positions 31a, 31b, 52a, 52b, 52c, 82a, 82b, and 82c), grouped as FR1, CDR1, FR2, CDR2, and FR3; legend: % non-conservative mutations, % conservative mutations.]

Figure 3.8: Frequencies of Amino Acid Mutations at Each Amino Acid Position in VH3-IgG Genes. The percent of non-conservative (black bars) or conservative (gray bars) mutations relative to the total number of amino acids sampled at each position is shown; the sum of the two equals the total percent replacement amino acid mutation frequency at each position. The mean percent replacement amino acid mutation frequency was calculated for each individual at each amino acid position, and the median of the group mutation frequencies at each amino acid position was calculated from the individual means. Control subjects (A; n=10), HIV-1-infected aviremic patients (B; n=6), and HIV-1-infected viremic patients (C; n=15).

Table 3.3: RGYW/WRCY Motifs and Targeted Mutation Frequencies in CDR and FR Regions in VH3-IgG Genes.

                                                Control           Aviremic          Viremic           p value
Number of RGYW/WRCY motifs per VH segment       17.8 (11-23)      18.3 (13-23)      17.8 (13-24)      0.80

Median Percent (Subject Range)
% of CDR nucleotides mutated                    13.5 (7.4-15.2)   12.4 (10.2-14.1)  9.8 (4.9-14.4)    0.03
% of CDR mutations present in RGYW/WRCY motifs  58.8 (48.5-62.5)  61.1 (56.8-71.0)  59.4 (53.5-69.2)  0.60
% of FR nucleotides mutated                     5.2 (4.0-6.5)     4.8 (4.2-5.7)     4.1 (2.4-7.8)     0.11
% of FR mutations present in RGYW/WRCY motifs   29.3 (19.6-33.5)  30.1 (27.3-33.0)  28.7 (24.4-35.2)  0.68
% of all nucleotides mutated                    7.2 (6.0-8.4)     6.5 (5.6-7.6)     5.3 (3.1-9.3)     0.07
% of all mutations present in RGYW/WRCY motifs  41.2 (31.9-46.3)  43.3 (39.9-48.6)  41.4 (37.1-45.6)  0.56

Group medians are listed with the range of individual means in each group in parentheses.

Table 3.4: Nucleotide Mutation Patterns in VH3-IgG Genes.

                           Control           Aviremic          Viremic           p value
Complementarity Determining Regions (CDR), Median Percent (Range)
% C nucleotides mutated    13.0 (9.1-16.8)   13.8 (10.2-16.0)  10.3 (6.9-19.0)   0.13
% G nucleotides mutated    15.1 (7.5-19.2)   14.0 (12.9-17.8)  11.0 (6.2-16.9)   0.04
% A nucleotides mutated    13.1 (6.8-16.2)   12.1 (8.1-13.8)   9.7 (3.9-14.4)    0.02
% T nucleotides mutated    10.0 (6.9-11.5)   9.1 (8.0-10.3)    7.6 (2.5-13.4)    0.49
Framework Regions (FR)
% C nucleotides mutated    5.3 (3.9-7.4)     5.4 (4.4-6.4)     4.0 (2.2-7.5)     0.16
% G nucleotides mutated    5.4 (4.2-7.5)     5.1 (4.7-6.1)     4.2 (2.6-8.7)     0.03
% A nucleotides mutated    5.8 (4.8-6.8)     5.4 (4.6-6.6)     4.8 (3.2-9.1)     0.16
% T nucleotides mutated    3.4 (2.2-4.0)     3.4 (2.6-4.1)     2.7 (0.9-5.2)     0.36

“% C nucleotides mutated” indicates the proportion of nucleotides in either the CDR1/2 or FR1/2/3 regions that were mutated relative to the total number of C nucleotides present in the unmutated reference sequence, expressed as a percent. The group medians are listed (individual patient mean ranges in parentheses).

Table 3.5: Nucleotide Mutation Proportions in VH3-IgG Genes.

                                          Control           Aviremic          Viremic           p value
Complementarity Determining Region (CDR), Median Percent (Range)
% of mutations that were C nucleotides    19.1 (17.5-26.8)  20.4 (18.1-22.5)  20.1 (17.0-29.1)  0.70
% of mutations that were G nucleotides    30.9 (26.9-50.2)  33.1 (29.3-38.1)  31.3 (27.0-38.3)  0.80
% of mutations that were A nucleotides    30.4 (26.9-35.0)  28.5 (24.7-35.3)  29.3 (24.6-36.3)  0.89
% of mutations that were T nucleotides    19.3 (11.8-21.7)  17.4 (14.3-20.3)  17.1 (12.0-22.5)  0.18
Framework Regions (FR)
% of mutations that were C nucleotides    27.9 (23.8-33.6)  27.2 (25.8-30.7)  27.0 (22.7-33.0)  0.28
% of mutations that were G nucleotides    35.7 (29.5-48.9)  35.0 (33.2-36.4)  34.1 (31.5-39.7)  0.19
% of mutations that were A nucleotides    24.7 (21.7-35.0)  23.0 (20.9-24.6)  24.7 (21.2-29.4)  0.72
% of mutations that were T nucleotides    13.3 (10.5-20.0)  14.4 (11.9-17.0)  14.4 (11.0-17.0)  0.39

“% of mutations that were C nucleotides” indicates the proportion of mutations in either the CDR1/2 or FR1/2/3 regions that were C nucleotides in the unmutated reference sequence relative to the total number of mutations in the region, expressed as a percent. The group medians are listed (individual patient mean ranges in parentheses).

Characterization of neighboring bases can also be useful in examining both stages of SHM: AID-induced mutation and subsequent TLS DNA polymerase-mediated repair.

Consistent with previous data that AID-mediated mutation of cytidine nucleotides is determined in part by neighboring bases, particularly A or G in the -1 (5’) position [185], we showed that 76-82% of mutations of C occurred in the presence of an A or G in the adjacent -1 position (-1R, Figure 3.9A), regardless of whether the mutation resulted in a transition (C:T) or a transversion (C:A or C:G) (Table 3.6). C or T were the preferred +1 (3’) adjacent nucleotides in 54-74% of C mutations (+1Y). These values did not differ significantly between groups. Mutations of G showed a very similar preference of -1 and +1 adjacent nucleotides [186]. Mutations of A and T were more variable in the preference of adjacent nucleotides depending on the type of resulting mutations. For A:G transition mutations, an A or G in the +1 adjacent position was preferred (66-68%), whereas in A:T transversion mutations a preference for C or T was seen. A:C transversion mutations showed no significant preference at either the -1 or +1 positions. The only difference seen between control subjects and viremic patients was in the preference at the -1 position in A:G transition mutations. In controls, a -1 C or T was slightly preferred over A or G, and in viremic patients a -1 A or G was slightly preferred over a C or T (Figure 3.9B; Table 3.6). No differences were seen in the aviremic group compared with controls. C or T was the preferred nucleotide in both the -1 and +1 position for A:T transversions (Table 3.6). An A or G in the +1 position of T mutations resulting in a transition or a transversion was found in 67-82% of mutations. However, only in the case of T:A transversions was there any preference at the -1 position (A or G, 61-70%). These results, consistent with those in the literature [125], now shown for the first time among patients with HIV-1 viremia, and taken together with the previous mutation pattern


[Figure 3.9, plot. Panel A: C → T Transition Mutations; y-axis: % of C:T mutations (0-100); x-axis: nucleotide/position (-1R, -1Y, C, +1R, +1Y); overall p values: -1R/Y p = 0.22, +1R/Y p = 0.92. Panel B: A → G Transition Mutations; y-axis: % of A:G mutations (0-80); x-axis: nucleotide/position (-1R, -1Y, A, +1R, +1Y); overall p values: -1R/Y p = 0.03, +1R/Y p = 0.43; *p = 0.02. Legend: Control, HIV+ Aviremic, HIV+ Viremic.]

Figure 3.9: Dinucleotide Analysis of C:T and A:G Transition Mutations in VH3-IgG Genes. The percent of R (A or G) or Y (C or T) nucleotides occurring in the -1 (5’) position or the +1 (3’) position relative to a mutation was calculated for A) C → T transition mutations and B) A → G transition mutations. Medians are represented for the control group (black bars), the HIV-1 aviremic group (gray bars), and the HIV-1 viremic group (white bars).


Table 3.6: Nucleotide Mutations and Adjacent Nucleotide Patterns in VH3-IgG Genes.

R:Y ratio of the adjacent nucleotide, by mutation and position

Mutation      Position   Control    Aviremic   Viremic    p value
C → G (Tr)    5’         4.9 : 1    3.9 : 1    4.2 : 1    0.47
              3’         1 : 2.2    1 : 2.6    1 : 2.9    0.45
C → A (Tr)    5’         3.2 : 1    3.1 : 1    4.0 : 1    0.37
              3’         1 : 1.2    1 : 1.3    1 : 1.3    0.60
C → T (Ts)    5’         4.2 : 1    4.3 : 1    3.6 : 1    0.28
              3’         1 : 1.9    1 : 1.8    1 : 1.8    0.80
G → C (Tr)    5’         2.7 : 1    3.0 : 1    2.9 : 1    0.93
              3’         1 : 2.5    1 : 2.4    1 : 2.4    0.98
G → A (Ts)    5’         2.1 : 1    2.1 : 1    2.1 : 1    0.74
              3’         1 : 3.1    1 : 2.8    1 : 2.9    0.30
G → T (Tr)    5’         1 : 1.1    1 : 1.0    1.2 : 1    0.42
              3’         1 : 3.1    1 : 3.3    1 : 3.6    0.09
A → C (Tr)    5’         1 : 1.1    1 : 1.2    1 : 1.1    0.85
              3’         1.3 : 1    1.1 : 1    1.4 : 1    0.85
A → G (Ts)    5’         1 : 1.1    1 : 1.0    1.2 : 1    0.02
              3’         2.2 : 1    2.0 : 1    2.1 : 1    0.30
A → T (Tr)    5’         1 : 2.8    1 : 3.6    1 : 3.6    0.13
              3’         1 : 2.2    1 : 2.1    1 : 2.1    0.44
T → C (Ts)    5’         1.6 : 1    1.5 : 1    1.5 : 1    0.10
              3’         2.7 : 1    2.4 : 1    2.7 : 1    0.76
T → G (Tr)    5’         1 : 1.2    1 : 1.4    1 : 1.1    0.52
              3’         2.3 : 1    2.6 : 1    2.1 : 1    0.93
T → A (Tr)    5’         1.6 : 1    1.7 : 1    2.4 : 1    0.17
              3’         4.8 : 1    4.5 : 1    3.5 : 1    0.23

R = A or G; Y = C or T. Mutations: Ts = transition (purine ↔ purine or pyrimidine ↔ pyrimidine); Tr = transversion (purine ↔ pyrimidine).

results reveal that the process of SHM, if not its frequency, as well as SHM-associated lesion repair are intact during HIV-1 infection.

Finally, the preference of AID-induced mutations to be transitions (purine:purine or pyrimidine:pyrimidine) rather than transversions (purine:pyrimidine) [82, 123, 124] is preserved in all groups (53-56% vs. 45-48%, respectively, data not shown). Overall, we show that the frequency of SHM is significantly decreased during HIV-1 infection, particularly in the critical antigen-binding CDR regions. In contrast, the hierarchies in SHM and AID activity described in other non-HIV-1 groups [82, 120, 121, 125, 128, 184], such as nucleotide mutation patterns, mutation proportions, and mutation preferences, are maintained among patients with HIV-1 viremia.
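The classification of each point mutation as a transition or transversion, together with the purine (R) or pyrimidine (Y) class of its -1 (5’) and +1 (3’) neighbors, can be sketched as below. The function names are hypothetical, and the sketch assumes gap-free, equal-length, uppercase A/C/G/T sequences; it is an illustration of the analysis, not the code used to generate the reported values.

```python
from typing import List, Optional, Tuple

PURINES = frozenset("AG")


def is_transition(ref: str, alt: str) -> bool:
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (ref in PURINES) == (alt in PURINES)


def neighbor_class(germline: str, index: int) -> Optional[str]:
    """'R' (A/G) or 'Y' (C/T) for the germline base at index, if it exists."""
    if not 0 <= index < len(germline):
        return None  # the mutation sits at the end of the sequence
    return "R" if germline[index] in PURINES else "Y"


def classify_mutations(
    germline: str, observed: str
) -> List[Tuple[str, str, Optional[str], Optional[str]]]:
    """Return (mutation, Ts/Tv, -1 class, +1 class) for each point mutation."""
    calls = []
    for pos, (ref, alt) in enumerate(zip(germline, observed)):
        if ref == alt:
            continue
        kind = "Ts" if is_transition(ref, alt) else "Tv"
        calls.append((f"{ref}>{alt}", kind,
                      neighbor_class(germline, pos - 1),
                      neighbor_class(germline, pos + 1)))
    return calls


# Toy example: a C>T transition flanked by G (5', purine) and T (3', pyrimidine).
print(classify_mutations("AGCTA", "AGTTA"))
```

Tallying the -1 and +1 classes for each mutation type would give R:Y ratios of the kind summarized in Table 3.6.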

VH3 somatic hypermutation frequency is reduced in VH3-IgG 454-pyrosequenced genes from viremic HIV-1-infected patients but increased in VH3-IgD genes

454-pyrosequencing reduces time and increases the number of sequence reads generated per patient sample. Due to the limited number of sequences obtained for each patient by cloning, we decided to pursue high-throughput 454-pyrosequencing technology to more completely characterize the VH3-IgG+ B cell compartment. We also decided to expand the analysis to include the more naïve Ig isotypes, IgD and IgM, as well as confirm the IgG results in another class-switched isotype, IgA. 454-pyrosequencing is a technique that allows for large numbers of sequences to be generated straight from PCR product, therefore bypassing time- and labor-intensive cloning steps.

Our initial experiments were performed using 454-FLX technology, which produces sequences 200-300 base pairs in length. Genes from the VH3 family were sequenced from IgA+ and IgM+ cDNA transcripts isolated from the mRNA of PBMCs from a healthy control subject (Figure 3.10). The majority of VH3-IgA sequences were in the expected length range; however, many of the VH3-IgM sequences were very short and did not cover much of the variable region sequence. Even at their longest, sequences generated on the 454-FLX platform extend only partially through the VH region and do not cover the V-D-J junction and CDR3 regions.

Shortly after this experiment was performed, Roche introduced 454-FLX Titanium technology, producing up to 1 million sequences per plate with lengths of up to 500 base pairs. For Ig molecules, this longer length would cover nearly the entire variable region of the expressed immunoglobulin molecule. Thus, switching from the cloning and Sanger sequencing method to the high-throughput pyrosequencing method would allow us to generate 100- to 200-fold more sequences in less time and at lower cost per sequence, without potentially sacrificing sequence data. This would allow us to identify a much larger number of sequences for each VH3 gene and reduce the risk of type II errors associated with low experimental numbers.

[Figure 3.10, plots: panels A and B.]

Figure 3.10: 454-FLX Pyrosequencing of VH3 cDNA Sequences Isolated from Control Subject PBMCs. 454-FLX Pyrosequencing was performed on VH3 family sequences of either A) the IgA isotype or B) the IgM isotype. The total number of sequences (y-axis) generated at each length (base pairs, x-axis) is reported.

The VhIGene alignment and analysis program is designed to accommodate 454-pyrosequencing generated data. Pyrosequencing offered many new challenges to data analysis. Several groups have reported high incidences of insertions and deletions (indels) present in the sequencing results [187-191], and our data were consistent with these reports. Because of the nature of the sequence results, the publicly available websites commonly used for Ig alignment, Vbase, IMGT [155], and JOINSOLVER [156], which we had utilized before to analyze cloning data, could not accurately align the sequences with the reference sequences in the websites’ databases. In addition, these public tools also allow for only single sequence or small batch sequence alignments, potentially making our new datasets very time-consuming to analyze even if indels could be removed or accommodated. Therefore, we collaborated with Dr. Daniel Frank to develop a new alignment and analysis program, VhIGene, to polish, align, and calculate gene expression, mutation frequency, and mutation patterns in our new Ig data sets.

Use of barcode-labeled PCR primers allowed us to pool multiple samples and isotypes on a single 454-pyrosequencing plate. VhIGene’s first step is to take the pooled “multiplexed” dataset and sort the sequences into individual libraries based on identification of the barcode sequence. Any sequences in which a barcode cannot be identified are excluded from the dataset. Once libraries are created, VhIGene then “polishes” the sequences by removing sequences with a low overall quality score, removing sequences shorter than 200 base pairs in length, removing identical copies of sequences (“de-replication”) based on the alignments of the first 200 bps, and identifying and trimming poor quality bases from the remaining set of sequences. The polished, de-multiplexed libraries are then aligned with each other using an alignment program called HMMER3 (hmmer.org) [192]. VhIGene uses these alignments to identify and remove any indels that are not multiples of 3 base pairs in length. Our assumption is that indels of one or two nucleotides represent sequencing errors, whereas indels of 3 nucleotides (and multiples thereof) represent bona fide codon insertions during V(D)J recombination. Sequences are then aligned to the IMGT reference dataset [155] using HMMER3. Mutation frequencies and patterns are identified from the alignments to the reference sequences and cataloged by VhIGene. The resulting output is the most comprehensive and detailed analysis of Ig alignments currently available. VhIGene also includes additional analyses that were not calculated by Batch Analyzer, such as tracking total nucleotide counts and proportions in both reference sequences and test sequences, and tracking the number of indels and poor quality bases excluded from analysis.
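The de-multiplexing and polishing steps described above can be sketched in a few lines. This is not VhIGene's code: the function names, the barcode-to-library mapping, and the use of exact prefix matching for de-replication (rather than alignment of the first 200 bases) are all simplifying assumptions, and quality-score filtering is omitted for brevity.

```python
from collections import defaultdict

MIN_LENGTH = 200  # reads shorter than 200 bp are removed


def demultiplex(reads, barcodes):
    """Sort reads into libraries by leading barcode; drop unidentified reads.

    `barcodes` maps a barcode string to a library name (hypothetical format).
    """
    libraries = defaultdict(list)
    for read in reads:
        for barcode, name in barcodes.items():
            if read.startswith(barcode):
                libraries[name].append(read[len(barcode):])
                break  # reads without any known barcode are excluded
    return dict(libraries)


def polish(reads, min_length=MIN_LENGTH, prefix=200):
    """Drop short reads and de-replicate on the first `prefix` bases."""
    seen, kept = set(), []
    for read in reads:
        if len(read) < min_length:
            continue
        key = read[:prefix]
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def is_sequencing_error(indel_length):
    """Indels that are not multiples of 3 are treated as sequencing errors."""
    return indel_length % 3 != 0


# Toy example with 4-base barcodes (real barcodes and reads are longer).
pooled = ["AAGT" + "C" * 10, "TTGA" + "G" * 10, "CCCC"]
print(demultiplex(pooled, {"AAGT": "IgG", "TTGA": "IgA"}))
```

In practice, barcode matching would likely need to tolerate sequencing errors within the barcode itself, which this exact-match sketch does not attempt.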

Our older analysis program, Batch Analyzer, utilized alignments derived from the Vbase Ig database; unfortunately, the Vbase reference dataset is no longer updated by its programmers. VhIGene utilizes a reference dataset from IMGT, a site that is frequently updated to include the most current sequencing datasets and is used frequently by other alignment tools (JOINSOLVER) and in the literature [190, 191, 193]. VhIGene was also designed to utilize probabilistic models (HMMER) rather than local sequence alignment tools (BLAST) to ensure the most accurate alignments. BLAST-based alignment tools align short stretches of nucleotides, whereas HMMER utilizes an algorithm closer to Smith-Waterman alignment and aligns individual nucleotides, an approach that is more accurate but more time-consuming. Therefore, HMMER runs somewhat slower than BLAST-based alignment tools. However, due to the high number of indels present in 454-pyrosequenced samples, in most cases our sequences could not be accurately aligned using BLAST-based alignment tools.
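To illustrate what nucleotide-by-nucleotide alignment with indel tolerance involves, the following is a minimal Smith-Waterman local alignment score. It is offered only as an illustration of the dynamic-programming idea referred to above; HMMER itself uses profile hidden Markov models, and the scoring values here (match 2, mismatch -1, gap -2) are arbitrary assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell can restart at 0, extend diagonally, or open a gap.
            score = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            current.append(score)
            best = max(best, score)
        previous = current
    return best


# A single inserted base costs one gap penalty but keeps both flanks aligned.
print(smith_waterman_score("ACGTACGT", "ACGTTACGT"))
```

Every base is considered individually, so a one-base insertion is absorbed as a gap instead of ending the alignment. The cost is runtime proportional to the product of the two sequence lengths.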

VhIGene was validated using Sanger-sequenced clone results and a mock-454 sequencing experiment. 454-Pyrosequencing and VhIGene were both validated in several ways. First, results from 2983 sequences generated from cloned VH3-IgG, VH3-IgA, and VH3-IgM transcripts that were Sanger sequenced and analyzed by Vbase, JOINSOLVER [156], and Batch Analyzer were run through VhIGene. VhIGene’s results were compared with both Batch Analyzer’s results as well as outputs from JOINSOLVER [156] (NIH’s Ig alignment program that also utilizes IMGT’s [155] reference dataset). VhIGene identified the correct VH3 gene in 99.9% of cloned sequences, the correct JH gene in 93.8% of cloned sequences, and the correct D gene in 85% of cloned sequences. The CDR3 sequence was correctly identified in 97.8% of cloned sequences. In the remaining 2.2%, the CDR3 region could not be identified. VhIGene also correctly identified all mutations identified in each sequence by Vbase, JOINSOLVER [156], and Batch Analyzer as well as correctly characterized each mutation.

Next, to determine the error frequency of both the PCR and sequencing steps of 454 pyrosequencing, 10 of the clones analyzed above were PCR-amplified separately and sequenced using barcoded and 454-tagged PCR primers and 454-Pyrosequencing reagents and technology. The results from the 10 pooled clones were de-multiplexed by VhIGene and aligned to the 10 clones’ reference sequences as determined by Sanger sequencing without any further polishing steps. Any base pair changes discovered in these sequencing results were considered errors. The error frequency detected in this dataset was 1.31%. This frequency is slightly higher than others reported in the literature. One study performed using 454-FLX technology, which produces shorter sequences, reported an error frequency of ~0.5% [187]. Another study using 454-FLX Titanium technology and, therefore, sequencing longer amplicons, reported an error frequency of 1.07% [189]. However, the longer amplicons reported by Gilles, et al., were not PCR amplified prior to sequencing as our samples were, and our sequences were longer than those reported in the 454-FLX study [187]. Indeed, according to Gilles, et al., error frequency increases with sequence length. After additional sequence polishing by VhIGene, including removing any sequences which have <80% homology with the sequences in the reference dataset, a step which is performed during polishing for analysis of all additional sequencing results, the error frequency dropped to 0.49% (Figure 3.11). The levels of nucleotide mutation detected in our ~3000 Sanger-sequenced clones due to SHM are 2-8% in FR regions (median = 4-5%) and 5-15% in CDR regions (median = 9-12%) in both HIV-1-infected patients and control subjects, well above both the raw 1.31% error frequency due to the 454-Pyrosequencing technique and especially the 0.49% error frequency after polishing.

Analyzing this dataset also validates VhIGene’s ability to accurately analyze 454-generated sequencing data, which contains much higher frequencies of indels than Sanger sequencing results. We were also able to use this dataset to determine the ideal quality score cutoff. Each base identified by 454-pyrosequencing analysis tools is assigned a quality score ranging from 0-40 based on PHRED values [194]. PHRED scores use log-transformed error probabilities to calculate a quality value, such that a score of 40 indicates that there is a 1/10,000 chance of a base call being incorrect and a score of 20 indicates a 1/100 chance. By running the 10 clones’ 454-sequencing data through VhIGene and adjusting the quality score filter, we were able to calculate the error frequency relative to quality scores (Figure 3.11). Based on our results we have chosen a quality score cutoff of 20, such that any bases in a sequence with a score between 0 and 19 are ignored by VhIGene and not included in mutation calculations. This quality score was selected as a reasonable compromise between including all data but increasing the error frequency and decreasing the error frequency but losing sequencing information. As seen in Figure 3.11, much less data is lost by using a quality cutoff score of 20 versus 30 or 40, without significantly increasing the error frequency. All subsequent datasets were run through VhIGene with the quality score filter set at 20.
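The relationship between PHRED score and error probability, Q = -10 log10(P), and the effect of a quality cutoff can be sketched as follows. The function names are hypothetical; masking low-quality bases with "N" and comparing the read to its reference position by position (with no indels) are assumptions made to keep the example small.

```python
import math


def phred_to_error_probability(q: float) -> float:
    """P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def mask_low_quality(bases: str, quals: list, cutoff: int = 20) -> str:
    """Replace bases scoring below `cutoff` with 'N' so they are ignored."""
    return "".join(b if q >= cutoff else "N" for b, q in zip(bases, quals))


def error_frequency(reference: str, read: str) -> float:
    """Percent of unmasked bases that differ from the reference."""
    pairs = [(r, o) for r, o in zip(reference, read) if o != "N"]
    if not pairs:
        return 0.0
    return 100.0 * sum(r != o for r, o in pairs) / len(pairs)


# Scores of 20 and 40 correspond to 1/100 and 1/10,000 error probabilities.
print(phred_to_error_probability(20), phred_to_error_probability(40))
print(mask_low_quality("ACGT", [30, 19, 20, 5]))
```

Repeating `mask_low_quality` and `error_frequency` over a range of cutoffs, while counting the masked bases, would yield the two curves of the kind plotted in Figure 3.11.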


[Figure 3.11, plot. Title: Quality Score vs. Error Frequency; left y-axis: error frequency (% of total nucleotides), 0.0-0.5; right y-axis: data cut (% of total nucleotides), 0-40; x-axis: quality score cut-off (0-40); series: Error Frequency, Data Cut from Analysis.]

Figure 3.11: Error Frequency vs. Quality Score Cut-Off. Quality scores were calculated based on log-transformed PHRED sequence quality scores. The error frequency (number of miscalled bases) was calculated at various quality score cut-offs (ranging from 0-40), such that bases with quality scores below the cut-off were ignored in the analysis. The amount of poor quality data cut from analysis was calculated from the total number of bases ignored per analysis expressed as a percentage of the total number of nucleotides in the dataset.

454-FLX Titanium pyrosequencing was used to sequence VH3 transcripts from viremic HIV-1-infected patients and control subjects. We analyzed 198,570 unique VH3-IgD sequences (median = 10,401 sequences per patient, range = 82 - 56,425), 185,566 unique VH3-IgM sequences (median = 8,806 sequences per patient, range = 4,314 - 21,763), 251,810 unique VH3-IgA sequences (median = 8,850 sequences per patient, range = 3,528 - 66,147), and 218,844 unique VH3-IgG sequences (median = 10,017 sequences per patient, range = 4,512 - 45,868) from 8 HIV-1-seronegative control subjects and 9 HIV-1-infected patients with detectable viremia and low CD4+ T cell counts (Table 3.7).


Table 3.7: Clinical Characteristics of 454-Pyrosequencing Study Subjects

                                        Control          HIV-1+                    p value
Number                                  8                9
Age, median years (range)               38 (31-50)       42 (29-60)                0.41
Gender (M:F)                            7:1              9:0                       0.47(a)
Ethnicity(b)                            4C, 1B, 1H, 2O   5C, 1B, 1H, 2O            1.00(a)
CD4+ T cells/µL, median (range)         N/A              148 (41-474)
HIV-1 RNA, median copies/mL (range)     N/A              70,400 (2,780-835,000)
Antiretroviral Therapy
  No therapy (%)                        N/A              9 (100%)
  ≥1 medication                         N/A              0

(a) Calculated by Fisher’s Exact test. (b) Ethnicity: C = Caucasian, B = Black, H = Hispanic, O = Other.

VH3-23 expression was reduced in 454-pyrosequenced VH3-IgG transcripts from viremic HIV-1-infected patients. We first looked at individual VH3 gene expression in all isotypes of both groups. Of the 21 functional individual VH3 genes (as defined by IMGT [155] in 2010 when our reference dataset was created), all 21 genes were expressed amongst all four isotypes in both groups. VH3-23 and VH3-30 were consistently the most dominant genes expressed across all isotypes, while the naïve IgD+ B cells had the most evenly distributed gene expression of the four isotypes (Figure 3.12A). Whereas VH3-11 and VH3-74 were significantly lower in the HIV-1-infected group, no other differences were found in VH3-IgD sequences compared with controls. Similarly, gene expression amongst VH3-IgM (Figure 3.12B) and VH3-IgA (Figure 3.12C) sequences was comparable between groups. There was a significantly higher proportion of VH3-7 gene expression amongst VH3-IgM sequences from the HIV-1-infected group compared with controls, while VH3-11 was significantly lower amongst VH3-IgA sequences. Consistent with our previous cloning results, VH3-23 expression was significantly lower in VH3-IgG sequences from the HIV-1-infected group compared with controls, though a slightly higher level of VH3-13 expression was also discovered in this group (Figure 3.12D).

D and JH gene expression are comparable in both groups among isotypes except for IgD. Few differences were also seen in D gene expression, and nearly all of them were present in the VH3-IgD sequences (Figure 3.13A). The more commonly expressed IGHD2-02 and IGHD2-15 genes were significantly lower in the HIV-1-infected group, as were the rarely expressed IGHD5-24 (p=0.01) and IGHD6-13 (p=0.01) (data not shown). Conversely, expression of IGHD1-14 (p=0.04) and IGHD3-03 (p=0.007) was significantly higher in the HIV-1-infected group compared with controls.

Similar to VH3 gene expression, few differences were seen in VH3-IgM (Figure 3.13B) and VH3-IgA sequences (Figure 3.13C). In VH3-IgM sequences, both IGHD4-04 (p=0.01) and IGHD6-13 (p=0.05) were lower in the HIV-1-infected group, as was IGHD5-05 (p=0.01) in VH3-IgA sequences. No significant differences in D gene expression were found amongst the VH3-IgG sequences (Figure 3.13D).

Finally, JH gene expression was even less variable than D gene expression. Only one difference in the naïve IgD+ B cell subset was found; IGHJ4*02 gene expression was lower in the HIV-1-infected group compared with controls (Figure 3.14A). No differences were detected amongst the other three isotypes (Figure 3.14). Taken together, and consistent with the previous VH3-IgG cloned dataset results, the repertoire of expressed variable region genes appears to be fundamentally intact during HIV-1 infection.

[Figure 3.12 graphic. Bar charts in four panels: A) IgD VH Gene Expression, B) IgM VH Gene Expression, C) IgA VH Gene Expression, D) IgG VH Gene Expression. X-axis: VH3 Gene (7, 9, 11, 13, 15, 23, 30, 30-3, 33, 48, 66, 72, 74); y-axis: % of Total Sequences; legend: Control, HIV-1+.]

Figure 3.12: VH3 Gene Expression in 454-Pyrosequenced Samples. VH3-family mRNA sequences were amplified from A) naïve IgD+ B cells, B) IgM+ B cells, C) IgA+ class-switched B cells, or D) IgG+ class-switched B cells from control subjects (black bars) and HIV-1-infected patients (HIV-1+, white bars). Group median values for the 13 most highly expressed VH3 family genes are shown. Values were calculated from individual mean percent expression for each gene.
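The summary statistic named in the caption, a group median of per-individual percentages, can be illustrated with a short sketch. This is a minimal Python illustration and not the analysis code used for this study; the function names and the toy gene assignments are hypothetical, and the per-individual value is assumed here to be a simple percentage of that individual's sequences.

```python
from collections import Counter
from statistics import median


def percent_expression(gene_calls):
    """Percent of one individual's sequences assigned to each gene."""
    counts = Counter(gene_calls)
    total = sum(counts.values())
    return {gene: 100.0 * n / total for gene, n in counts.items()}


def group_median(per_individual, gene):
    """Median across individuals of the per-individual percent for a gene."""
    return median(p.get(gene, 0.0) for p in per_individual)


# Hypothetical gene assignments for three subjects (not study data).
subjects = [
    ["VH3-23"] * 5 + ["VH3-30"] * 3 + ["VH3-7"] * 2,
    ["VH3-23"] * 2 + ["VH3-30"] * 6 + ["VH3-7"] * 2,
    ["VH3-23"] * 4 + ["VH3-30"] * 4 + ["VH3-7"] * 2,
]
per_individual = [percent_expression(s) for s in subjects]
print(group_median(per_individual, "VH3-23"))  # prints 40.0
```

Computing the percentage within each individual first keeps a subject with many sequences from dominating the group value.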

[Figure 3.13 graphic. Bar charts in four panels: A) IgD D Gene Expression, B) IgM D Gene Expression, C) IgA D Gene Expression, D) IgG D Gene Expression. X-axis: D Gene (tick labels recovered: 1-01, 2/OR, 2-02, 2-15, 2-21, 4-04, 4-23, 6-06, 7-27, DIR); y-axis: % of Total Sequences; legend: Control, HIV-1+.]

Figure 3.13: D Gene Expression in 454-Pyrosequenced Samples. VH3-family mRNA sequences were amplified from A) naïve IgD+ B cells, B) IgM+ B cells, C) IgA+ class-switched B cells, or D) IgG+ class-switched B cells from control subjects (black bars) and HIV-1-infected patients (HIV-1+, white bars). Group median values for the 10 most highly expressed D genes are shown. Due to both sequence length and junctional sequence diversity, gene identification could be assigned in 61.27% of sequences. Values were calculated from individual mean percent expression for each gene.

[Figure 3.14 graphic. Bar charts in four panels: A) IgD JH Gene Expression, B) IgM JH Gene Expression, C) IgA JH Gene Expression, D) IgG JH Gene Expression. X-axis: J Gene (3*02, 4*01, 4*02, 4*03, 5*02, 6*02, Other); y-axis: % of Total Sequences; legend: Control, HIV-1+.]

Figure 3.14: JH Gene Expression in 454-Pyrosequenced Samples. VH3-family mRNA sequences were amplified from A) naïve IgD+ B cells, B) IgM+ B cells, C) IgA+ class-switched B cells, or D) IgG+ class-switched B cells from control subjects (black bars) and HIV-1-infected patients (HIV-1+, white bars). Group median values for each gene are shown. Due to sequence lengths, gene identification could be assigned in 54.61% of sequences. “Other” indicates IGHJ1*01, 2*01, 3*01, 5*01, 6*01, and 6*03. Values were calculated from individual mean percent expression for each gene.


CDR3 region characteristics were similar in both groups among all isotypes.

Characteristics of the main antigen-binding segment of the variable region, CDR3, were determined for each isotype (Appendix Table A1). Due to the limit in sequence length, the CDR3 region could only be identified in a small portion of the sequences from all isotypes (median = 6.66% of sequences, range = 0.68-32.79%), though the numbers of CDR3 regions identified did not vary by group. Neither the amino acid length nor the degree of hydrophobicity varied by group for any isotype. The only difference in amino acid composition was seen in the VH3-IgD sequences, where HIV-1-infected patients had a slightly but significantly lower proportion of aromatic residues. This could be due to the differences in D gene usage seen in the VH3-IgD sequences in this group, as described earlier. The CDR3 region overlaps the VH-D-JH junction, and thus the majority of its composition is contributed by the D gene. Combined with the VH3, D, and JH gene expression results, these data suggest that the initial V(D)J recombination process is intact during HIV-1 infection.

SHM frequencies are lower in CDR1/2 and FR1/2/3 regions of VH3-IgG sequences but higher in VH3-IgD sequences in viremic HIV-1-infected patients. Next, we characterized the SHM frequencies of both nucleotides and amino acids in the CDR1 and CDR2 regions (Figures 3.15A and 3.15B, respectively). Surprisingly, both nucleotide and amino acid SHM frequencies in the VH3-IgD sequences were significantly greater in the HIV-1-infected group compared with controls. This was unexpected, as the majority of circulating IgD+ B cells are naïve cells that are not believed to be antigen-experienced or to have participated in a germinal center reaction and thus should have unmutated sequences. There was no difference in either the nucleotide or amino acid SHM frequencies in the VH3-IgM and VH3-IgA sequences. Consistent with the previous VH3-IgG cloned dataset results, but contrary to the VH3-IgA results above, in VH3-IgG 454-pyrosequences both the nucleotide and amino acid SHM frequencies were lower in the HIV-1-infected group compared with controls; however, the differences did not quite reach statistical significance due to a high degree of variability in the HIV-1-infected samples.

The high SHM frequency found in VH3-IgD sequences in HIV-1-infected patients is not likely due to technical error or contamination. Samples from the four patients with significantly higher SHM frequency in VH3-IgD sequences were PCR-amplified at different times with different reagent aliquots, including different barcode sequences. They were also pooled and run with other samples with low VH3-IgD SHM frequencies. Similarly, high VH3-IgD SHM frequency did not correlate with SHM frequency among other isotypes within the same patient. Mutation patterns were also the same in these four patients compared with the four patients with normal VH3-IgD SHM frequencies. Even the inversion of the ratio of transition-to-transversion mutations, as discussed below, remains consistent regardless of high or normal VH3-IgD SHM frequency. Though unexpected, the results in the VH3-IgD compartment appear authentic.

Despite the varied SHM frequencies between controls and HIV-1-infected patients in VH3-IgD and VH3-IgG pyrosequenced genes, no differences were found in the R/S ratios or the proportion of non-conservative amino acid changes. R/S ratios in the CDR1/2 regions were high and consistent with the VH3-IgG cloning results (median ratios from controls vs. HIV-1-infected patients; IgD: 3.4 vs. 3.6, p=0.67; IgM: 3.8 vs. 3.5, p=0.32; IgA: 3.9 vs. 4.3, p=0.27; IgG: 3.6 vs. 3.5, p=0.85). Consistent with the ratios in the pyrosequenced genes, the proportion of non-conservative amino acid mutations relative to the total number of mutations was also high and similar between groups (median percent non-conservative mutations in controls vs. HIV-1-infected patients; IgD: 78.4% vs. 73.9%, p=0.19; IgM: 74.4% vs. 74.2%, p=1.00; IgA: 74.8% vs. 75.0%, p=0.76; IgG: 74.4% vs. 74.1%, p=0.92).

[Figure 3.15 graphic. Two dot-plot panels: A) CDR Nucleotide Mutation Frequency (y-axis: % CDR Nucleotide Mutations) and B) CDR Amino Acid Replacement Mutation Frequency (y-axis: % CDR amino acid mutations). X-axis in both panels: Control and HIV-1+ for IgD, IgM, IgA, and IgG. P values printed above the isotypes, in that order: A) 0.03, 0.74, 0.25, 0.06; B) 0.02, 0.67, 0.21, 0.07.]

Figure 3.15: Mutation Frequencies in CDR1/2 Regions of VH3 454-Pyrosequenced Samples. The mean percent of mutated nucleotides (A) or the mean percent of replaced mutated amino acids (B) in CDR1 and CDR2 regions was calculated for each isotype (IgD, IgM, IgA, and IgG) for each control subject (black circles) or HIV-1-infected patient (HIV-1+, white triangles). The solid bar indicates the group median.

As expected, both the nucleotide and amino acid SHM frequencies in the structural FR1, FR2, and FR3 regions (Figures 3.16A and 3.16B, respectively) were lower, in general, than the SHM frequencies found in the CDR1/2 regions in all isotypes and in both groups. Similar to SHM frequencies in the CDR1/2 regions, both nucleotide and amino acid SHM frequencies were higher in VH3-IgD sequences in HIV-1-infected patients, although in this case the differences did not reach statistical significance. Also consistent with results in CDR1/2 regions, no significant differences were seen in either nucleotide or amino acid SHM frequencies in VH3-IgM and VH3-IgA sequences. Though the lower nucleotide SHM frequency in VH3-IgG sequences also did not reach statistical significance, the amino acid SHM frequency was significantly lower in the HIV-1-infected group compared with controls, as seen in the VH3-IgG cloned dataset results.

FR1/2/3 R/S ratios were not different between groups in any isotype and were lower, in general, than those in the CDR1/2 regions (median ratios for controls vs. HIV-1-infected patients; IgD: 1.6 vs. 1.9, p=0.44; IgM: 1.6 vs. 1.5, p=0.16; IgA: 1.6 vs. 1.6, p=0.71; IgG: 1.7 vs. 1.7, p=0.66). Similarly, the proportion of non-conservative amino acid mutations was also lower, in general, compared with those in the CDR1/2 regions and did not vary by group in VH3-IgD, -IgM, or -IgA sequences (median percent of non-conservative mutations in controls vs. HIV-1-infected patients; IgD: 57.7% vs. 58.7%, p=0.67; IgM: 56.8% vs. 56.1%, p=0.48; IgA: 57.2% vs. 57.2%, p=0.79). However, in contrast to the VH3-IgG cloned dataset results described earlier, the proportion of non-conservative amino acid changes was significantly lower in VH3-IgG sequences from HIV-1-infected patients compared with controls (58.0% for controls vs. 56.5% for HIV-1-infected patients, p=0.03), suggesting potentially altered antigen-driven selection pressure during the affinity maturation process in HIV-1 infection.
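For reference, the R/S ratios reported above compare replacement (R) mutations, which change the encoded amino acid, with silent (S) mutations, which do not. The sketch below shows one common way to count them; it is not the pipeline used in this study. The function name count_r_s and the toy sequences are hypothetical, and the sketch assumes that each mutated nucleotide is scored on its own against the germline codon.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, with codons enumerated in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_r_s(germline: str, observed: str):
    """Count replacement (R) and silent (S) nucleotide mutations.

    Both sequences must be in-frame, the same length, and gap-free.
    Each mutated nucleotide is scored independently against the germline
    codon, a simplification for codons carrying more than one mutation.
    """
    if len(germline) != len(observed) or len(germline) % 3:
        raise ValueError("sequences must be aligned, in-frame codons")
    r = s = 0
    for i in range(0, len(germline), 3):
        g_codon = germline[i:i + 3]
        for j in range(3):
            base = observed[i + j]
            if base == g_codon[j]:
                continue
            single = g_codon[:j] + base + g_codon[j + 1:]
            if CODON_TABLE[single] == CODON_TABLE[g_codon]:
                s += 1
            else:
                r += 1
    return r, s


# Hypothetical 4-codon germline and mutated sequence (not study data).
germline = "GCTAGCTACTGG"  # Ala Ser Tyr Trp
observed = "GCCAGATACTGG"  # GCT>GCC is silent; AGC>AGA replaces Ser with Arg
r, s = count_r_s(germline, observed)
print(r, s, r / s if s else float("inf"))  # prints 1 1 1.0
```

A ratio well above what random mutation would produce is usually read as evidence of selection for amino acid change, which is why higher values are expected in the CDR1/2 regions than in the framework regions.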

To further characterize the populations of mutated sequences in each group and isotype, the proportions of sequences with varying densities of mutations were graphed (Figure 3.17). In VH3-IgD and VH3-IgM sequences, a majority of sequences had no or few nucleotide mutations (Figures 3.17A and 3.17B, respectively). The proportions of unmutated sequences also did not differ between groups in either isotype (IgD: 39.3% unmutated sequences for controls vs. 25.1% for HIV-1-infected patients, p=0.10; IgM: 26.3% vs. 22.9%, p=0.74), suggesting that, at least for IgD, a larger proportion of unmutated “naïve” VH3-IgD transcripts in control subjects was not biasing the difference in SHM frequency. In VH3-IgM sequences, though the proportion of sequences with 14-15.9% mutation density was significantly higher in the HIV-1-infected group compared with the controls, the small percentage of sequences (0.16% for HIV-1-infected patients) in this category was not enough to affect the overall SHM frequency across all sequences.

In the class-switched isotypes, IgA and IgG (Figures 3.17C and 3.17D, respectively), a higher proportion of sequences were more densely mutated than in the IgD and IgM isotypes. In VH3-IgA sequences, while both the control and HIV-1-infected sequences peak at the same density (4-5.9%, Figure 3.17C), a significantly higher proportion of control sequences were also more densely mutated (6-7.9%) than in the HIV-1-infected group, although, as with VH3-IgM sequences, this difference was not enough to drive a change in the overall SHM frequency when the control and HIV-1-infected groups were compared. In the case of VH3-IgG sequences, however, mutation density in the HIV-1-infected group peaks at a much lower density (2-3.9%, Figure 3.17D) compared with controls (6-7.9%), consistent with previous VH3-IgG cloned dataset results. The differences in the proportion of sequences measured at each density in these groups may explain the lower mutation frequencies observed in the VH3-IgG sequences from HIV-1-infected patients. The proportion of unmutated sequences, though similar in the cloned sequences, was higher in the pyrosequenced HIV-1-infected group compared with controls (median 0.76% unmutated sequences for controls vs. 2.61% for HIV-1-infected patients, p=0.004) and may have contributed to the reduced SHM frequency.

[Figure 3.16 graphic. Two dot-plot panels: A) FR Nucleotide Mutation Frequency (y-axis: % FR Nucleotide Mutations) and B) FR Amino Acid Replacement Mutation Frequency (y-axis: % FR amino acid mutations). X-axis in both panels: Control and HIV-1+ for IgD, IgM, IgA, and IgG. P values printed above the isotypes, in that order: A) 0.08, 0.85, 0.37, 0.28; B) 0.13, 0.54, 0.09, 0.008.]

Figure 3.16: Mutation Frequencies in FR1/2/3 Regions of VH3 454-Pyrosequenced Samples. The mean percent of mutated nucleotides (A) or the mean percent of replaced mutated amino acids (B) in FR1, FR2, and FR3 regions was calculated for each isotype (IgD, IgM, IgA, and IgG) for each control subject (black circles) and HIV-1-infected patient (HIV-1+, white triangles). The solid bar indicates the group median.

VH3 gene-specific SHM is consistent with overall SHM frequencies measured in each isotype in viremic HIV-1-infected patients. The differences in overall SHM frequencies can also be seen in individual VH3 genes. In general, CDR1/2 amino acid mutation frequencies were higher in VH3-IgD sequences from HIV-1-infected patients compared with controls and significantly higher in VH3-9 (Figure 3.18A) and VH3-30-3 (p=0.03, data not shown). Consistent with finding no differences in the overall CDR1/2 amino acid SHM frequency, in VH3-IgM sequences no differences were seen in any specific gene between groups (Figure 3.18B). In VH3-IgA sequences from HIV-1-infected patients, a significant decrease was seen in the SHM frequency in VH3-30, one of the most highly expressed VH3 genes and potentially bound by HIV-1 gp120 in vivo (Figure 3.18C). However, no other significant differences were found in the other putative gp120-binding genes or other highly expressed VH3 genes when compared with controls. In contrast, nearly every VH3 gene from VH3-IgG sequences displayed a lower SHM frequency in the HIV-1-infected patients compared with controls, with statistically significant differences found in 10 of the 21 VH3 genes, including the most highly expressed VH3 genes (VH3-33, -48, -7, -30, and -23; Figure 3.18D) and all putative gp120-binding genes (VH3-30, -23, -73, and -15), contrary to results previously seen in the VH3-IgG cloned dataset.

Finally, CDR1/2 amino acid SHM frequencies in each isotype were compared with those of all other isotypes, with HIV-1 plasma viral load, and with CD4+ T cell number. Only VH3-IgD and VH3-IgM SHM frequencies were significantly correlated (Figure 3.19A). None of the other CDR1/2 SHM frequencies correlated with either VH3-IgD or each other, including VH3-IgD versus VH3-IgG (r=0.10, p=0.71, data not shown). No CDR1/2 SHM frequencies in any isotype correlated significantly with CD4+ T cell number (IgD: r=0.0, p=1.0; IgM: r=0.24, p=0.58; IgA: r=-0.31, p=0.46; IgG: r=0.14, p=0.75). However, the CDR1/2 amino acid SHM frequency in VH3-IgD sequences did positively correlate with HIV-1 plasma viral load (Figure 3.19B). The CDR1/2 amino acid SHM frequencies from the other three isotypes, including IgG, did not significantly correlate with viral load (IgM: r=0.35, p=0.36; IgA: r=0.23, p=0.55; IgG: r=0.22, p=0.58). Therefore, the increased mutation accumulation in VH3-IgD sequences may be dependent on HIV-1 viral loads; however, the decreased SHM frequencies seen in VH3-IgG sequences, consistent with the previous VH3-IgG cloned dataset results, do not seem to be dependent on viral loads.

[Figure 3.17 graphic. Four panels titled Proportions of Mutated Sequences: A) IgD, B) IgM, C) IgA, D) IgG. X-axis: Nucleotide Mutation Frequency (%) in bins 0-1.9, 2-3.9, 4-5.9, 6-7.9, 8-9.9, 10-11.9, 12-13.9, 14-15.9, 16-17.9, 18+; y-axis: % of Total Sequences; legend: Control, HIV+. Asterisks mark bins with significant differences.]

Figure 3.17: Proportions of Mutated Sequences in VH3 454-Pyrosequenced Samples. The median proportion of sequences with varying percentages of nucleotide mutations per sequence is graphed for control subjects (black circles) and HIV-1-infected patients (HIV-1+, white triangles) from A) naïve IgD+ B cells, B) IgM+ B cells, C) IgA+ class-switched B cells, or D) IgG+ class-switched B cells. Proportions are expressed as a percent of the total number of sequences from each individual.
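The binning behind these panels can be sketched briefly. The code below is a minimal illustration, assuming each sequence is summarized by its number of nucleotide mutations and its aligned length; the function names and the example values are hypothetical and are not taken from the study's analysis code.

```python
from collections import Counter

BIN_LABELS = ["0-1.9", "2-3.9", "4-5.9", "6-7.9", "8-9.9",
              "10-11.9", "12-13.9", "14-15.9", "16-17.9", "18+"]


def density_bin(mutations: int, length: int) -> str:
    """Label the 2%-wide bin holding a sequence's mutation percentage."""
    percent = 100.0 * mutations / length
    return BIN_LABELS[min(int(percent // 2), len(BIN_LABELS) - 1)]


def bin_proportions(sequences):
    """Percent of one individual's sequences in each bin.

    `sequences` is an iterable of (mutation_count, aligned_length) pairs.
    """
    counts = Counter(density_bin(m, n) for m, n in sequences)
    total = sum(counts.values())
    return {label: 100.0 * counts[label] / total for label in BIN_LABELS}


# Hypothetical (mutations, length) pairs for one individual (not study data).
example = [(0, 250), (3, 250), (6, 250), (12, 250), (50, 250)]
print(bin_proportions(example))
```

In this example two of the five sequences fall in the 0-1.9% bin, so that bin holds 40% of the individual's sequences; the group value plotted for each bin would then be the median of such per-individual percentages.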


Figure 3.18: Frequencies of Replacement Amino Acid Mutations in Specific VH3 Genes from 454-Pyrosequenced Samples. Group medians of the CDR1/2 replacement mutation frequency for each of the 5 most frequently expressed VH3 genes (33, 48, 11, 9, 7, 30, 23) and for the VH3 genes proposed to bind to HIV-1 gp120 outside of the antigen-binding region (30, 23, 15, and 73) were calculated from the mean mutation frequencies of each individual in the control (black bars) and HIV-1-infected (HIV-1+, white bars) groups.

[Figure 3.19 graphic. Two scatter plots of IgD CDR amino acid SHM frequency, one against IgM CDR amino acid SHM frequency (axes: % CDR a.a. Mutation Frequency (IgD) vs. % CDR a.a. replacement mutations (IgM)) and one against HIV-1 viral load (axes: HIV-1 Plasma Viral Load (RNA copies/mL) vs. % CDR a.a. replacement mutations). Statistics printed on the panels: r = 0.83, r = 0.50, p = 0.02, p = 0.05.]

Figure 3.19: VH3-IgD SHM Frequency Correlates with VH3-IgM SHM Frequency and HIV-1 Plasma Viral Load. Spearman correlations were calculated for VH3-IgD CDR1/2 amino acid SHM frequency and A) VH3-IgM CDR1/2 amino acid SHM frequency and B) HIV-1 plasma viral load (RNA copies/mL blood).
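As a reminder of what the Spearman coefficient measures, the sketch below computes it as the Pearson correlation of rank-transformed values, with tied values given their average rank. It is a minimal pure-Python illustration, not the statistics package used in this study; the function names and the example values are hypothetical, and no p value is computed.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for six patients (not study data).
igd_shm = [2.1, 3.0, 4.5, 8.2, 15.0, 22.4]
viral_load = [3_000, 12_000, 9_000, 70_000, 150_000, 800_000]
print(round(spearman_r(igd_shm, viral_load), 3))  # prints 0.943
```

Because only ranks enter the calculation, the coefficient is unaffected by the wide, skewed range of viral load values, which is one reason a rank-based test suits data of this kind.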

Nucleotide mutation patterns were similar between groups among all isotypes.

Despite variable levels in SHM frequencies, mutation patterns were generally comparable between groups amongst the different isotypes. Any differences resulting from the targeting of the mutational machinery were discerned by comparing 1) the targeting of mutations to specific “hotspot” motifs (RGYW/WRCY), 2) the proportion of mutations occurring at each nucleotide, 3) the nature of the mutational event, and 4) the pattern of nucleotides surrounding the mutation.

SHM is initiated by the enzyme AID, which targets RGYW/WRCY motifs in both variable and switch region sequences. RGYW/WRCY motifs were present in similar proportions in both groups (Appendix Table A2) in all isotypes except IgM, where the number of motifs per sequence was significantly lower in HIV-1-infected patients. This may be due to the significantly higher proportion of VH3-7 sequences in the HIV-1-infected population, as the VH3-7 reference sequence has fewer RGYW/WRCY motifs than some of the other more highly expressed VH3 genes (e.g., VH3-23). However, despite a smaller number of motifs in the VH3-IgM sequences of HIV-1-infected patients, there were no differences in the proportion of mutations found in the motifs in either the CDR1/2 or FR1/2/3 regions. Whereas no differences were observed in the proportion of mutations occurring in these motifs in the CDR1/2 regions of VH3-IgD sequences, a significantly higher proportion of mutations were found in motifs in the FR1/2/3 regions of HIV-1-infected patients compared with controls, suggesting that targeting to these motifs was actually increased in these sequences. No other targeting differences were found in the VH3-IgA or VH3-IgG sequences, consistent with previous VH3-IgG cloned dataset results.
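The motif analysis described above can be illustrated with a short sketch that locates RGYW/WRCY motifs in a reference sequence and asks what fraction of mutated positions fall on them. This is a minimal illustration under stated assumptions rather than the method used here: it scores only the G of RGYW and the C of WRCY as the hotspot position, and the function names and the example sequence are hypothetical.

```python
import re

# IUPAC codes: R = A/G, Y = C/T, W = A/T.
# Zero-width lookaheads allow overlapping motifs to be found.
RGYW = re.compile(r"(?=[AG]G[CT][AT])")
WRCY = re.compile(r"(?=[AT][AG]C[CT])")


def hotspot_positions(seq: str):
    """Return 0-based positions of the G in RGYW or the C in WRCY."""
    seq = seq.upper()
    positions = {m.start() + 1 for m in RGYW.finditer(seq)}   # G is 2nd base
    positions |= {m.start() + 2 for m in WRCY.finditer(seq)}  # C is 3rd base
    return sorted(positions)


def fraction_in_hotspots(seq: str, mutated_positions):
    """Fraction of mutated positions that fall on a hotspot G or C."""
    hot = set(hotspot_positions(seq))
    mutated = list(mutated_positions)
    return sum(p in hot for p in mutated) / len(mutated) if mutated else 0.0


# Hypothetical germline fragment (not a real VH3 sequence).
germline = "TTAGCTAAGCAT"
print(hotspot_positions(germline))             # prints [3, 4, 8]
print(fraction_in_hotspots(germline, [3, 5]))  # prints 0.5
```

In the example, AGCT matches both RGYW and WRCY, so both its G and its C count as hotspot positions, while AGCA matches only RGYW.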

Whereas decreased targeting to “hotspot” motifs by AID was not observed in either group in any isotype, downstream mutation repair mechanisms can disrupt the proportions and patterns of nucleotide mutations. The overall increased SHM frequency seen in VH3-IgD sequences from HIV-1-infected patients was present across all nucleotides, especially in the CDR1/2 regions, in the proportions of C, A, and T nucleotides mutated relative to the total number of C, A, and T nucleotides present in the unmutated reference sequence (Table 3.8). Increased proportions of all nucleotides mutated were also seen in the FR1/2/3 regions, but did not reach statistical significance.

The frequency of each nucleotide mutated, relative to the total number of mutations in the CDR1/2 regions, was similar in both groups (G > A > C > T). Although the frequency pattern of mutations in the FR1/2/3 regions was similar in both groups (G > C > A > T), the frequency of T mutations was significantly lower in the HIV-1-infected group compared with controls. No differences in the proportions of each nucleotide mutated were found in VH3-IgM sequences between groups (Appendix Table A3). A significantly higher frequency of mutations in the FR1/2/3 regions occurred at C nucleotides in the HIV-1-infected patients compared with controls, although the frequency patterns were the same in both CDR1/2 and FR1/2/3 regions in both groups. In VH3-IgA sequences from HIV-1-infected patients, A nucleotides were mutated in the CDR1/2 regions at a significantly decreased frequency compared with controls; however, no other differences were found in nucleotide proportions or frequency patterns between groups in either the CDR1/2 or FR1/2/3 regions (Appendix Table A3). Contrary to previous VH3-IgG cloned dataset results, which showed a decrease in only A and G nucleotide proportions, all nucleotides were mutated at a lower frequency in pyrosequenced VH3-IgG sequences in both the CDR1/2 and FR1/2/3 regions, although only the proportions of C and G nucleotides in the CDR1/2 regions and T nucleotides in the FR1/2/3 regions were significantly lower (Table 3.8). While frequency patterns were similar in both groups in both the CDR1/2 and FR1/2/3 regions, there were significantly fewer T nucleotide mutations present in the CDR1/2 regions of HIV-1-infected patients compared with controls. The decrease seen across all nucleotides in VH3-IgG sequences from HIV-1-infected patients neither suggests nor rules out a potential deficit in the downstream mutation repair mechanisms.
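The two kinds of nucleotide proportion discussed here are easy to confuse, so a small sketch may help. It computes, for an aligned reference and observed sequence, both the percent of each reference base that was mutated and the percent of all mutations that occurred at each reference base. The function name and the toy sequences are hypothetical, and the sketch assumes gap-free uppercase A/C/G/T input; it is not the study's analysis code.

```python
def nucleotide_mutation_stats(reference: str, observed: str):
    """Return two per-base percentages for an aligned, gap-free pair.

    pct_of_base_mutated[b]: mutated b positions / all b positions in the
        reference ("% b nucleotides mutated").
    pct_of_mutations[b]: mutations at reference b / all mutations
        ("% of mutations that were b nucleotides").
    """
    if len(reference) != len(observed):
        raise ValueError("sequences must be the same length")
    present = dict.fromkeys("ACGT", 0)
    mutated = dict.fromkeys("ACGT", 0)
    for ref_base, obs_base in zip(reference, observed):
        present[ref_base] += 1
        if obs_base != ref_base:
            mutated[ref_base] += 1
    total = sum(mutated.values())
    pct_of_base_mutated = {
        b: 100.0 * mutated[b] / present[b] if present[b] else 0.0
        for b in "ACGT"
    }
    pct_of_mutations = {
        b: 100.0 * mutated[b] / total if total else 0.0
        for b in "ACGT"
    }
    return pct_of_base_mutated, pct_of_mutations


# Hypothetical G-rich reference with four mutations (not study data).
reference = "GGGGGGGGCCAAAATT"
observed = "AGGTGGGGCTAAAATA"
by_base, of_mutations = nucleotide_mutation_stats(reference, observed)
print(by_base)       # G: 25.0, C: 50.0, A: 0.0, T: 50.0
print(of_mutations)  # G: 50.0, C: 25.0, A: 0.0, T: 25.0
```

In this example G accounts for half of all mutations only because it is the most abundant base; normalized to base composition, C and T are mutated twice as often. This is why both measures are reported.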

Mutations that result from AID activity are more likely to be transitions than transversions [82, 123, 124]. While no differences in the proportions of transition mutations versus transversion mutations were observed between groups for any isotype (Table 3.9), the opposite pattern was, interestingly, observed in the VH3-IgD sequences. Transversion mutations were slightly more common in these sequences in controls, and this pattern was maintained in HIV-1-infected patients, possibly suggesting that AID may not be the only source of these mutations, that downstream mutagenic repair mechanisms may vary, or that selection pressures may vary in the highly mutated IgD+ B cell subset.

Table 3.8: Nucleotide Mutation Patterns and Proportions in VH3 454-Pyrosequenced Samples.

IgD                                       Control Day 0        HIV-1+ Day 0         p value
                                          Median Percent (Subject Range)
CDR
  % C nucleotides mutated                 2.68 (1.5-4.6)       5.72 (2.5-25.5)      0.02
  % G nucleotides mutated                 2.69 (2.1-4.2)       7.30 (2.3-35.1)      0.07
  % A nucleotides mutated                 2.02 (1.2-4.0)       7.66 (1.9-18.4)      0.04
  % T nucleotides mutated                 1.51 (1.1-4.2)       5.33 (1.2-21.0)      0.04
FR
  % C nucleotides mutated                 1.27 (0.8-3.3)       3.20 (0.8-9.2)       0.13
  % G nucleotides mutated                 1.48 (0.7-2.0)       3.08 (0.7-8.3)       0.07
  % A nucleotides mutated                 1.74 (1.3-2.8)       4.00 (1.0-18.8)      0.08
  % T nucleotides mutated                 1.22 (0.8-3.4)       2.28 (0.6-6.3)       0.21
CDR
  % of mutations that were C nucleotides  23.15 (15.7-29.5)    20.68 (11.7-29.1)    0.38
  % of mutations that were G nucleotides  30.2 (27.0-44.2)     31.56 (21.4-45.4)    0.80
  % of mutations that were A nucleotides  24.81 (19.3-30.1)    25.06 (21.3-43.0)    0.65
  % of mutations that were T nucleotides  19.18 (17.2-30.5)    19.64 (16.2-27.5)    0.72
FR
  % of mutations that were C nucleotides  23.41 (19.6-29.2)    27.53 (12.3-31.5)    0.33
  % of mutations that were G nucleotides  27.97 (21.5-36.8)    32.62 (25.2-41.0)    0.07
  % of mutations that were A nucleotides  21.34 (19.0-26.6)    23.48 (17.9-32.2)    0.28
  % of mutations that were T nucleotides  25.67 (22.0-32.4)    17.54 (13.6-24.4)    0.002

IgG                                       Control Day 0        HIV-1+ Day 0         p value
                                          Median Percent (Subject Range)
CDR
  % C nucleotides mutated                 21.63 (18.4-24.7)    15.30 (11.4-23.8)    0.05
  % G nucleotides mutated                 20.34 (17.2-23.1)    13.75 (9.4-26.9)     0.02
  % A nucleotides mutated                 18.39 (12.5-21.0)    11.97 (9.9-24.6)     0.09
  % T nucleotides mutated                 12.10 (9.0-20.8)     8.20 (5.1-13.6)      0.09
FR
  % C nucleotides mutated                 5.26 (4.8-5.7)       4.15 (3.3-6.2)       0.06
  % G nucleotides mutated                 5.10 (4.8-5.3)       3.65 (3.1-6.7)       0.20
  % A nucleotides mutated                 5.23 (4.8-7.4)       3.68 (3.2-7.5)       0.11
  % T nucleotides mutated                 2.70 (2.4-4.6)       2.13 (1.5-3.0)       0.02
CDR
  % of mutations that were C nucleotides  19.54 (16.1-21.7)    21.13 (13.4-23.8)    0.11
  % of mutations that were G nucleotides  30.58 (28.4-34.5)    31.66 (25.2-35.1)    0.74
  % of mutations that were A nucleotides  30.56 (22.2-34.2)    29.04 (25.3-38.9)    0.81
  % of mutations that were T nucleotides  19.60 (17.4-30.5)    17.97 (14.9-23.5)    0.04
FR
  % of mutations that were C nucleotides  26.62 (25.4-30.0)    28.19 (24.5-30.9)    0.81
  % of mutations that were G nucleotides  34.91 (31.0-38.9)    37.09 (34.2-44.2)    0.29
  % of mutations that were A nucleotides  21.81 (20.1-24.7)    19.93 (16.6-25.4)    0.17
  % of mutations that were T nucleotides  14.15 (13.1-19.6)    13.74 (11.3-18.8)    0.32

“% C nucleotides mutated” indicates the proportion of nucleotides in either the CDR1/2 or FR1/2/3 regions that were mutated relative to the total number of C nucleotides present in the unmutated reference sequence, expressed as a percent. “% of mutations that were C nucleotides” indicates the proportion of mutations in either the CDR1/2 or FR1/2/3 regions that were C nucleotides in the unmutated reference sequence relative to the total number of mutations in the region, expressed as a percent. The medians of each group are listed with individual patient mean ranges in parentheses.

Finally, AID-mediated mutation is determined, in part, by bases immediately surrounding the targeted nucleotide. Similar to previous reports [125, 185, 186] and to the previous VH3-IgG cloned dataset results, all C mutations showed a strong preference for an A or G nucleotide in the -1 (5’) position (Appendix Table A4). All G mutations showed preferences for both an A or G nucleotide in the -1 position and a C or T nucleotide in the +1 (3’) position. An A or G nucleotide in the +1 position was preferred for T → C transition and T → A transversion mutations; however, T → G transversion mutations did not reveal any preferences in either neighboring base. A → G transition mutations showed a preference for an A or G nucleotide in the +1 position. Conversely, A → T transversion mutations had a preference for a C or T nucleotide in the +1 position. Similar to T → G transversion mutations, no strong preferences were observed for A → C transversion mutations. These patterns were consistent in all isotypes between both groups, although a few significant differences were found in the degree to which certain nucleotides were preferred. Contrary to the previous cloning data, no significant differences in the pattern of preferred neighboring bases were observed in the VH3-IgG sequences. Taken together, these data suggest that despite the increased SHM frequencies found in VH3-IgD sequences and the decreased SHM frequencies found in VH3-IgG sequences from HIV-1-infected patients, targeting of AID and subsequent downstream repair pathways do not appear to be impaired during HIV-1 infection.

Table 3.9: Transition and Transversion Mutation Proportions in VH3 454-Pyrosequenced Samples.

          Control             HIV-1+              p value
Transition Mutations
  IgD     45.8 (41.1-53.3)    47.1 (37.4-52.6)    0.65
  IgM     51.1 (43.9-54.7)    50.4 (49.3-58.2)    0.96
  IgA     52.9 (47.1-56.0)    53.4 (49.6-57.1)    0.54
  IgG     52.8 (50.1-55.4)    53.8 (45.2-60.4)    0.89
Transversion Mutations
  IgD     54.2 (46.7-59.0)    53.5 (36.5-62.6)    0.88
  IgM     48.9 (45.3-56.1)    49.6 (41.8-50.8)    0.96
  IgA     47.1 (44.1-52.9)    46.6 (42.9-50.4)    0.54
  IgG     47.2 (44.6-49.9)    47.5 (44.5-54.8)    0.81

Transition Mutation = purine ↔ purine or pyrimidine ↔ pyrimidine; Transversion Mutation = purine ↔ pyrimidine. The median percents of each group are listed with individual patient mean ranges in parentheses.
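The transition/transversion definition used in Table 3.9 can be expressed in a few lines. The sketch below is illustrative only; the function names and the example substitutions are hypothetical and are not taken from the study's analysis code.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def mutation_class(ref_base: str, new_base: str) -> str:
    """Classify a substitution as a 'transition' or a 'transversion'."""
    if ref_base == new_base:
        raise ValueError("not a substitution")
    pair = {ref_base, new_base}
    # Transition: both bases are purines, or both are pyrimidines.
    same_class = pair <= PURINES or pair <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def transition_percent(substitutions) -> float:
    """Percent of (reference, new) substitutions that are transitions."""
    subs = list(substitutions)
    n_transitions = sum(mutation_class(r, n) == "transition" for r, n in subs)
    return 100.0 * n_transitions / len(subs)


# Hypothetical substitutions (not study data).
subs = [("A", "G"), ("C", "T"), ("T", "A"), ("G", "C")]
print(transition_percent(subs))  # prints 50.0
```

Under this definition A → G and T → C are transitions, whereas T → A, T → G, A → T, and A → C are transversions. Since only 4 of the 12 possible substitutions are transitions, a transition share near or above 50% already reflects a strong bias toward them.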

In the 454-pyrosequenced dataset, we have shown that although SHM frequency is lower in VH3-IgG class-switched sequences, consistent with our previous VH3-IgG cloned dataset results, SHM frequency in presumably naïve VH3-IgD sequences is significantly increased in HIV-1-infected patients. However, such patterns could not be confirmed in sequences from either the other class-switched isotype, IgA, or in IgM sequences, which may, even in class-switched memory cells, retain a “naïve-like” repertoire of no or little SHM in healthy adults. Despite variable SHM frequencies in both VH3-IgG and VH3-IgD transcripts, mutation patterns were similar in both groups, suggesting that the deficit in VH3-IgG SHM frequency and the increase in VH3-IgD SHM frequency reflect changes in the quantity of mutation, but not the quality.

Mutation frequency is reduced in Bcl6 genes in HIV-1-infected patients

AID targets non-Ig genes, including Bcl6. VH genes are not the only targets of AID during SHM. Many other genes, including Bcl6, CD83, CD79a, CD79b, Pim1, and Myc, have also been shown to be targeted by AID [160, 162]. Of these known additional SHM targets, Bcl6 accumulates the highest mutation frequency [45, 160, 162, 195, 196]. Mutations found in Bcl6 are also known to have been caused by AID activity, as the mutation frequency in Bcl6 in AID-/- murine B cells is reduced 80-fold compared to wild type B cells [162]. The Bcl6 protein is a transcriptional repressor that affects B cell activation, differentiation, cell cycle regulation, and a cell’s response to DNA damage [45, 195-197]. The gene encoding the protein is highly expressed in B cells during the germinal center reaction and concurrent with SHM [42, 195-197]. The Bcl6 gene is 24.3 kilobases (kb) long and encodes 10 exons. Within its genomic sequence are 1457 RGYW/WRCY AID “hotspot” motifs. In order to determine if aberrant mutation frequencies exist in other AID-targeted genes, such as Bcl6, multiple segments of the Bcl6 gene were amplified and sequenced.
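As an illustration of how RGYW/WRCY “hotspot” motifs can be counted in a genomic sequence, the following Python sketch scans one strand with overlapping matches, using the standard IUPAC codes (R = A or G, Y = C or T, W = A or T). The function name is hypothetical, and the counting convention used to obtain the 1457 motifs reported above is not specified in the text, so the sketch should be read as one plausible convention only.

```python
import re

# IUPAC codes: R = A/G, Y = C/T, W = A/T.
# WRCY is the reverse complement of RGYW, so scanning one strand for both
# patterns covers hotspots on either strand.
# Lookahead groups allow overlapping motifs to be counted.
RGYW = re.compile(r"(?=([AG]G[CT][AT]))")
WRCY = re.compile(r"(?=([AT][AG]C[CT]))")


def count_hotspots(sequence: str) -> dict:
    """Count overlapping RGYW and WRCY motifs on the given strand."""
    seq = sequence.upper()
    return {
        "RGYW": sum(1 for _ in RGYW.finditer(seq)),
        "WRCY": sum(1 for _ in WRCY.finditer(seq)),
    }
```

Note that a palindromic tetramer such as AGCT satisfies both patterns and is counted once under each; whether such sites are counted once or twice is an assumption that would need to match the original analysis.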

Preliminary Bcl6 sequencing reveals different levels of mutation at different gene segments. First, we determined which of several selected segments had the highest mutation frequency in the gene. Primer pairs for 11 segments were designed, covering 11.8 kb of both intronic and exonic sequences (Figure 3.20). These 11 segments were selected because they had the highest densities of RGYW/WRCY “hotspots”. Indeed, though they cover less than half of the genomic sequence, they include over half (753/1457) of the encoded RGYW/WRCY “hotspot” motifs in the gene. An additional segment was also selected based on previous sequencing experiments found in the literature and because its position at the 5’ end of the gene more closely resembles the portion of the VH gene that is most highly mutated (e.g. the first 1.5 kb of VH genes) [160-162]. Shen, et al., in normal murine B cells, and Dijkman, et al., in primary cutaneous follicle center lymphoma (PCFCL) and primary cutaneous large B cell lymphoma (PCLBCL) patients, found that the first 1.5 kb of the Bcl6 gene was the most highly mutated portion of the sequence [160, 161] (Figure 3.20).

Genomic DNA was extracted from fresh PBMCs isolated from a healthy control subject. Of the 12 selected Bcl6 segments, primer pairs for segments 4 and 6 did not successfully amplify product from the DNA. The remaining 10 segments were PCR amplified, cloned into plasmids, and sequenced by Sanger sequencing. Sequencing of segment 2 gave consistently poor results and cloning and sequencing were discontinued. Cloned sequences were aligned using the BLAST tool bl2seq (www.ncbi.nlm.nih.gov/BLAST) with a human Bcl6 reference sequence from the GenBank database (www.ncbi.nlm.nih.gov/genbank). Mutations were detected in all 9 sequenced segments (Table 3.10). The segments with the highest mutation frequencies were 5, 9, and 10. The segment with the lowest mutation frequency was 7. The segment covering the first 1.5 kb of the Bcl6 gene, Bcl6-1.5, also had a low mutation frequency compared with the other segments.

Figure 3.20: Bcl6 Gene and Segment Positions. The 24.3 kb Bcl6 gene encodes 10 exons and 1457 RGYW/WRCY AID “hotspot” target motifs. Segments 1-11 were chosen based on their high density of RGYW/WRCY motifs. Segment 1.5 was chosen based on sequencing experiments in the literature and because its position at the 5’ end of the gene more closely resembles the portion of the VH gene that is most highly mutated [160-162].

Table 3.10: Preliminary Bcl6 Mutation Frequencies

Bcl6 Segment    Mutation Frequency    Number of Base Pairs
Bcl6-1          6.5 x 10^-4           24470
Bcl6-3          8.1 x 10^-4           35707
Bcl6-5          1.2 x 10^-3           35753
Bcl6-7          1.2 x 10^-4           33210
Bcl6-8          8.0 x 10^-4           28787
Bcl6-9          9.6 x 10^-4           36381
Bcl6-10         1.9 x 10^-3           22508
Bcl6-11         6.8 x 10^-4           27752
Bcl6-1.5        6.0 x 10^-4           11684

Bcl6 accumulates mutations during stimulation at the 5’ region of the gene. We wanted to investigate not only the baseline levels of SHM in Bcl6, but also the accumulation of mutation in the Bcl6 gene in stimulated PBMCs over time. This method may indirectly measure AID activity within stimulated B cells. In order to determine which of the segments would be best for further testing, we selected the 3 most highly mutated segments from the previous experiment, 5, 9, and 10, as well as the 1.5 kb segment for testing in stimulated PBMCs. Genomic DNA was isolated from PBMCs from a healthy control subject at baseline (fresh cells), or after stimulation for 4 or 10 days in culture with B cell stimulants anti-CD40, anti-IgM, and IL-4. PCR-amplified segments were cloned and sequenced and analyzed as before.

Baseline mutation frequencies were similar for each segment, except for segment 10, which was 1.5-fold lower in this batch of sequences (Table 3.11). Though the Bcl6-1.5 segment had the lowest baseline mutation frequency of all 4 segments tested, this segment had the greatest increase in mutation frequency over time (2-fold increase). However, this increase only occurred between Days 0 and 4. No additional mutation accumulation was observed at Day 10. Therefore, the Bcl6-1.5 segment was selected for high-throughput 454-pyrosequencing in both control subjects and HIV-1-infected patients at baseline and post-stimulation in order to measure AID functionality in these two populations.

Bcl6 baseline mutation frequencies were lower in viremic HIV-1-infected patients. High-throughput 454-pyrosequencing was done on genomic DNA isolated from patients in the same dataset as the 454-pyrosequenced VH3 gene analysis (Table 3.12). The 1.5 kb segment selected from the preliminary studies was too long for the 454-pyrosequencing platform, which routinely gives results in the 300-500 bp range. Therefore, the 1.5 kb segment was divided further into 5 smaller fragments (A, B, C, D, and E) ranging in size from 252-411 bp. These 5 fragments cover the entire 1.5 kb segment sequenced in the preliminary results. Genomic DNA was isolated from fresh and stimulated PBMCs from both control subjects and HIV-1-infected patients. All 5 barcoded fragments were PCR-amplified, aligned to the reference sequence for each fragment using VhIGene, and analyzed for mutation frequency at both Days 0 and 4.


Table 3.11: Preliminary Bcl6 Mutation Frequencies Post-Stimulation.

Bcl6 Segment    Time Point    Mutation Frequency    Number of Base Pairs
Bcl6-5          Day 0         2.3 x 10^-3           22425
                Day 4         2.5 x 10^-3           22008
                Day 10        1.8 x 10^-3           27800
Bcl6-9          Day 0         7.7 x 10^-4           24823
                Day 4         6.1 x 10^-4           19713
                Day 10        6.6 x 10^-4           13568
Bcl6-10         Day 0         7.0 x 10^-5           14271
                Day 4         0                     7228
                Day 10        0                     9343
Bcl6-1.5        Day 0         6.0 x 10^-4           11684
                Day 4         1.2 x 10^-3           11325
                Day 10        1.2 x 10^-3           14784

Table 3.12: Clinical Characteristics of Bcl6 Study Subjects.

                                        Control           HIV-1+ Viremic     p value
Number                                  8                 8
Age, median years (range)               43 (31-48)        43 (29-60)         0.71
Sex (M:F)                               8:0               8:0
Ethnicity(b)                            4W, 1B, 1H, 2O    3W, 2B, 1H, 2O     1.00(a)
CD4+ T cells/μl, median (range)         N/A               299 (93-736)
HIV-1 RNA, median copies/mL (range)                       62,860 (2,780-835,000)

(a) Fisher’s Exact test. (b) Ethnicity: C = Caucasian, B = Black, H = Hispanic, O = Other.

We analyzed 244,507 sequences from control subjects (median = 23,612 sequences per patient, range = 909-32,216) and 145,058 sequences from HIV-1-infected patients (median = 23,415 sequences per patient, range = 3,059-58,315). Bcl6 sequencing results were calculated in two ways. To determine the overall mutation frequency, the total number of nucleotides sequenced in all 5 fragments was calculated per patient. The mutation frequency was calculated by dividing the total number of mutations counted by VhIGene by the total number of nucleotides sequenced. This “all sequences” mutation frequency was significantly lower in the HIV-1-infected patients compared with controls (Figure 3.21A) both at baseline (Day 0) and post-stimulation (Day 4). The mutation frequency did not change significantly from baseline to post-stimulation in either group, however.

As genomic DNA was extracted from PBMCs, including B cells, T cells, and monocytes, and not from sorted B cells, we also determined the mutation frequency in sequences that had ≥1 mutation in order to adjust for any differences in circulating non-B cell subsets between control and HIV-1-infected patients. Such cells do not express AID and do not undergo SHM, and presumably have nonmutated Bcl6 sequences. This mutation frequency was calculated by dividing the total number of mutations counted by VhIGene by the total number of nucleotides present in only the mutated sequences per patient (Figure 3.21B). This “mutated sequences only” mutation frequency was also significantly lower in HIV-1-infected patients and reduced the amount of variance in both groups, both at baseline and post-stimulation. The number of unmutated sequences in both groups was not significantly different either at baseline (p=0.09) or post-stimulation (p=0.23).
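The two ways of calculating Bcl6 mutation frequency described above can be summarized in a short Python sketch. The input format, one (read length, mutation count) pair per aligned sequence, and the function name are assumptions made for illustration; the actual counts in this study were produced by VhIGene.

```python
def bcl6_mutation_frequencies(reads):
    """Compute per-patient mutation frequency (as a percent) two ways.

    `reads` is a hypothetical list of (read_length_nt, mutation_count)
    pairs pooled over all fragments for one patient.
    Returns (all_sequences_pct, mutated_sequences_only_pct).
    """
    total_mutations = sum(n_mut for _, n_mut in reads)
    total_nt = sum(length for length, _ in reads)
    # Denominator restricted to reads carrying at least one mutation.
    mutated_nt = sum(length for length, n_mut in reads if n_mut >= 1)

    all_pct = 100.0 * total_mutations / total_nt if total_nt else float("nan")
    mutated_pct = (100.0 * total_mutations / mutated_nt
                   if mutated_nt else float("nan"))
    return all_pct, mutated_pct
```

Because unmutated reads contribute nucleotides but no mutations, the “mutated sequences only” value is always at least as large as the “all sequences” value for the same patient.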


[Figure 3.21 image. Panel A: Bcl6 Nucleotide Mutation Frequency, All Sequences. Panel B: Bcl6 Nucleotide Mutation Frequency, Mutated Sequences Only. y-axis: % nucleotide mutation frequency; x-axis: Day 0 and Day 4 for Control and HIV-1+ groups. p values shown in the plots: 0.02, 0.0003, 0.03, and 0.006.]

Figure 3.21: Mutation Frequencies in 454-Pyrosequenced Bcl6 Sequences are Reduced in HIV-1-Infected Patients. Mutation frequency was calculated by dividing the number of mutations by the total number of nucleotides sequenced and expressed as a percent. Mutation frequency was determined from the number of mutations and nucleotides totaled for all 5 fragments from either A) all sequences per control subject (black/dark gray circles) and HIV-1-infected patient (white/light gray triangles), or B) all sequences with ≥1 mutation identified. Changes between Day 0 (Baseline) and Day 4 (Post-Stimulation) were not significant in either controls or HIV-1-infected patients when calculating mutation frequency from all sequences or mutated sequences only.

Bcl6 mutation frequency correlates with VH3-IgG SHM frequency, but not with VH3-IgD SHM frequency. The Bcl6 mutation frequencies found in control and HIV-1-infected patients mirror those found in the VH3-IgG pyrosequenced transcripts, and indeed, these two variables are significantly and positively correlated in this patient cohort (Figure 3.22A). Baseline Bcl6 mutation frequency, however, did not correlate with the CDR1/2 amino acid SHM frequency of VH3-IgD (Figure 3.22B), VH3-IgM (p=0.99, data not shown), or VH3-IgA (p=0.22, data not shown). Similar to VH3-IgG CDR amino acid SHM frequency, baseline Bcl6 mutation frequency did not correlate with HIV-1 plasma viral load (p=0.78, data not shown); however, baseline Bcl6 mutation frequency did positively correlate with CD4+ T cell counts (Figure 3.22C). Therefore, though Bcl6 mutation frequency does not appear to be directly related to viral levels, it does appear to be affected by low CD4+ T cell counts, which are the result of viral infection.

[Figure 3.22 image. Panel A: IgG CDR Amino Acid SHM Frequency vs Bcl6 Mutation Frequency, Mutated Sequences Only (r = 0.79, p = 0.006). Panel B: IgD CDR Amino Acid SHM Frequency vs Bcl6 Mutation Frequency, Mutated Sequences Only (r = -0.02, p = 0.95). Panel C: CD4 Count vs Bcl6 Mutation Frequency, All Sequences (r = 0.96, p = 0.003). Axes: % Bcl6 nucleotide mutation frequency against % CDR amino acid mutation frequency (A, B) or CD4 count in cells/μl (C).]

Figure 3.22: Baseline Bcl6 Mutation Frequency Correlates with VH3-IgG SHM Frequency and CD4+ T Cell Count. Spearman correlations were calculated for Baseline Bcl6 mutation frequency from mutated sequences only and A) Baseline VH3-IgG CDR1/2 amino acid SHM frequency or B) Baseline VH3-IgD CDR1/2 amino acid SHM frequency and for C) Baseline Bcl6 mutation frequency from all sequences and CD4+ T cell count.
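For readers unfamiliar with the Spearman correlations reported for Figure 3.22, the coefficient is the Pearson correlation of the rank-transformed values, with tied values given their average rank. The following pure-Python sketch illustrates the calculation; the function names are hypothetical, the statistical software actually used in the study is not stated here, and the sketch returns only the coefficient, not a p value.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)
```

In this setting, x would hold one value per patient (for example, baseline Bcl6 mutation frequency) and y the paired measurement (for example, CD4+ T cell count).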


In summary, consistent with the reduced VH3-IgG CDR1/2 amino acid SHM frequency characterized in viremic HIV-1-infected patients in both the cloning and 454-pyrosequenced datasets, mutation frequency is reduced in another non-Ig target of SHM, Bcl6. Unfortunately, the lack of change in mutation frequency in Bcl6 sequences from baseline to post-stimulation does not provide any additional information on the potential mechanism(s) creating low Bcl6 and VH3-IgG mutation frequencies and, conversely, high VH3-IgD mutation frequencies.

Discussion

Disparate SHM frequencies occur during HIV-1 infection

We demonstrate on a molecular level that the frequency of SHM is decreased among class-switched VH3-IgG B cells in patients with advanced viremic HIV-1 infection. We confirmed this phenotype in multiple datasets and by measuring mutation frequency in an additional non-Ig SHM target, Bcl6. Unexpectedly, we also discovered a significantly increased SHM frequency in VH3-IgD transcripts in 4/8 patients with advanced HIV-1 infection. This result correlated significantly and positively with patient HIV-1 plasma viral loads. Despite the opposite reactions in the IgG+ and IgD+ B cell subsets in HIV-1-infected patients, mutation frequencies were normal in another class-switched isotype, IgA, as well as in a more “naïve-like” isotype, IgM, when compared with controls.

SHM underlies the ability of B cells to make antibodies of high specificity, avidity and function. We targeted the VH3 genes due to their frequency (half of all expressed VH genes) and because VH3 antibodies are prominent in defense against invasive HIV-1-associated mucosally-acquired systemic pathogens [25-27] (e.g., S. pneumoniae, H. influenzae, and Cryptococcus neoformans) and in responses to associated preventive vaccines [25-28]. Thus, a decreased ability to support SHM in IgG class-switched effector B cells, and, by inference, an impaired capacity to generate high affinity antigen-specific antibodies that target systemic pathogens [29, 179, 180], may underlie, in part, the high rates of invasive and often fatal HIV-1-associated bacterial infections [25, 198-200].

V(D)J recombination is normal during HIV-1 infection

Three processes underlie the development of antibody diversity, specificity and function. The first, establishment of the primary B cell repertoire by V(D)J recombination, appears to be intact in B cells of all four isotypes tested in HIV-1-infected patients. In the VH3-IgG cloned dataset, we found slightly decreased expression of only 4 of 22 VH3 genes in HIV-1-infected patients compared with controls, consistent with earlier data that the primary IgM+IgD+ repertoire in HIV-1-infected patients is intact [29]. The results in the VH3-IgG cloned dataset were confirmed in the 454-pyrosequenced samples. Few differences were seen between control subjects and HIV-1-infected patients in any of the isotypes in VH, D, or JH gene expression, although, just as in the VH3-IgG cloned dataset, expression of VH3-23 was significantly lower in the HIV-1-infected patients.

Of note, selected VH3 genes (VH3-15, -30, -30.5, -73, and the prominent VH3-23) have been reported to bind gp120 in a non-classical superantigen-like manner outside of the antigen-binding pocket [183]. In the absence of adequate T cell signals, such binding and activation could deplete these cells [175]. In the VH3-IgG cloned dataset we identified only a decrement in VH3-23 expression, although low numbers of sequences of the other genes may have precluded finding such a difference. Thus, we performed high-throughput pyrosequencing in multiple isotypes to increase the number of sequences acquired per patient. In the 454-pyrosequenced dataset only decreased expression of VH3-23 could be confirmed in the VH3-IgG sequences. Additionally, the much greater number of sequences analyzed in this set rules out decreased expression of the other putative gp120-binding genes in HIV-1-infected patients. We also did not confirm suggestions that such putative gp120-binding residues would be more frequently mutated during SHM in expressed VH3+ B cells, nor in VH3+ B cells thought to be targets of gp120-binding [183]. Therefore, our data do not support preferential deletion of specific VH3 family members by gp120 binding nor preferential mutation at putative gp120 binding sites in the VH3 B cell repertoire.

Regulation of AID may be altered during HIV-1 infection

The second and third processes, SHM and CSR, are both mediated by AID. For SHM, the decreased frequencies of nucleotide and amino acid mutations in cloned VH3-IgG sequences from viremic HIV-1-infected patients may imply decreased levels of AID activity. This trend in SHM frequencies was confirmed in the 454-pyrosequenced VH3-IgG dataset. Surprisingly, in contrast, no differences in SHM frequencies were found in the other class-switched VH3-IgA 454-pyrosequenced samples from the same patients, and VH3-IgD sequences displayed an increased SHM frequency, again, predominantly in the CDR1/2 regions. Therefore, potential differences in AID function may be specific to certain B cells or to certain lymphoid tissues but not a global humoral defect.

That the increased mutation frequency was not also seen in the VH3-IgM subset of sequences would indicate that the VH3-IgD sequences with high SHM frequency are not originating from IgD+IgM+CD27-CD10+ transitional cells, IgD+IgM+CD27-CD10- naïve cells, or IgDlowIgM+CD27+ marginal zone cells. Likely, these sequences arise from either IgD+IgM-CD27- BND anergic cells, thought to be precursors to autoreactive B cells [73], or IgD+IgM-CD27+ class-switched IgD cells [75]. An increased population of the class-switched IgD cells, which have characteristically high SHM frequencies [73, 75], may imply altered CSR regulation during HIV-1 infection, as class-switch to IgD occurs through an alternative mechanism than class-switch to any other isotype [49], though this process still requires AID [76].

The lower frequencies of SHM in the VH3-IgG sequences of both datasets and the higher frequency of SHM in the VH3-IgD sequences both had very similar patterns of mutation during HIV-1 viremia, suggesting a disparity in mutation quantity, but not quality. Three studies have shown increased expression of AID, the enzyme that initiates SHM, among comparable patients with HIV-1 infection [92, 201, 202]. These findings would predict higher, as discovered in the VH3-IgD subset, but not lower, mutation frequencies. However, it is the VH3-IgG data that are consistent with results from a small number of circulating IgD-CD27+ memory B cells (which include IgG+ cells), in which mutation frequencies were lower in HIV-1-infected patients than in controls [202]. That AID mRNA expression was higher among CD27+ B cells is compatible with a murine study showing that high, constitutive AID expression was accompanied by decreased SHM, presumably due to negative regulation of AID activity [109]. In addition to positively influencing AID expression, HIV-1 can also have a negative effect on AID function. The HIV-1 protein Vif can block the SHM-associated activity of AID in E. coli in vitro by direct protein-protein interactions [117], although any effects of Vif-AID interactions on CSR or Vif activity in vivo are unknown. Similarly, the HIV-1 protein Nef has been shown to be taken up by B cells in vivo and interrupt CSR in vitro by blocking B cell activation pathways [131]. However, effects of Nef on SHM have not been determined.

DNA repair pathways involved in SHM may be impaired during HIV-1 infection

AID initiates SHM by deaminating cytidines, creating a U:G lesion. The lesion can either be replicated over, creating a dC to dT transition mutation, or repaired by DNA repair pathways utilizing low fidelity translesion (TLS) DNA polymerases, creating additional mutations at A and T [128, 129]. Among the cloned VH3-IgG CDR1/2 nucleotide studies herein, the significant decrease in overall mutation frequency, particularly in the percent of A and G mutations, does not match any phenotype observed in cells or mice with selectively deleted DNA repair pathways [128, 129]. Moreover, no other significant patterns were found in the 454-pyrosequenced samples for any isotype that would specifically suggest a decrement in DNA repair pathway activity. Thus, our data do not support disruption of lesion repair, and no effects of HIV-1 virions or proteins on DNA repair pathways are reported in the literature.

Antibody selection and germinal center function may be decreased during HIV-1 infection

SHM occurs with antigen stimulation and selection, processes that occur in germinal centers. Indeed, that germinal centers are smaller in patients such as ours with advanced HIV-1 infection, among whom the follicular dendritic cell (FDC) networks are attenuated and disrupted [21], may explain the decreased VH3-IgG SHM frequency. Loss of rigorous selection of high-affinity serially-mutated B cells within a germinal center reaction may allow for the survival and escape of low affinity and less mutated B cells into the periphery. Along with attenuated FDC networks, germinal center function may also be impaired by a decreased population of follicular helper T cells (TFH cells), which are an important source of HIV-1 replication [203]. TFH cells are thought to participate in positive B cell selection during the GC reaction by providing survival signals to somatically hypermutating B cells [204, 205]. TFH cells are also the primary producer of IL-21, which is required for GC maintenance [46, 205]. Indeed, IL-21 serum levels have been shown to be reduced during HIV-1 infection [206]. Therefore, deletion of TFH cells may interrupt selection and encourage the dissolution of the GC and escape of low affinity B cells prior to selection.

Non-specific B cell activation during HIV-1 infection

Alternatively, polyclonal bystander activation of non-specific and low affinity B cells by cytokines and growth factors (e.g. IFN-, TNF-, IL-4, IL-10, BAFF) and surface CD40L, which may be over-expressed in untreated HIV-1 infection [22, 141], may also affect the selection process. Such non-specifically activated B cells may compete with antigen-specific activated B cells for survival factors [141], avoid deletion in attenuated FDC networks and diminished TFH populations during the selection process, and survive to migrate from GCs to the periphery. However, the ratios of replacement/silent nucleotide mutations and of non-conservative/conservative amino acid mutations, both measures of positive antigen-driven selection, were similar in both groups in both sequencing data sets. Though a single R/S ratio alone cannot predict the degree of positive selection in an Ig sequence [181, 207, 208], comparison of R/S ratio values between controls and HIV-1-infected patients supports the suggestion that the process, if not the rigor, of selection may be relatively intact during HIV-1 infection.
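The replacement/silent (R/S) ratio referred to above can be illustrated with a minimal Python sketch that compares an in-frame germline sequence with a mutated sequence using the standard genetic code. The function names are hypothetical, and the convention of scoring each substitution independently against the germline codon is an assumption made for simplicity; it is not necessarily the convention used in this study, and the sketch accepts only the bases A, C, G, and T.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def count_replacement_silent(germline: str, mutated: str):
    """Count replacement (R) and silent (S) substitutions.

    Both sequences must be in frame and of equal length. Each changed base
    is scored on its own against the germline codon (an assumption).
    """
    germline, mutated = germline.upper(), mutated.upper()
    if len(germline) != len(mutated) or len(germline) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    replacement = silent = 0
    for start in range(0, len(germline), 3):
        ref_codon = germline[start:start + 3]
        for offset in range(3):
            alt = mutated[start + offset]
            if alt == ref_codon[offset]:
                continue
            single = ref_codon[:offset] + alt + ref_codon[offset + 1:]
            if CODON_TABLE[single] == CODON_TABLE[ref_codon]:
                silent += 1
            else:
                replacement += 1
    return replacement, silent


def rs_ratio(germline: str, mutated: str):
    """Return R/S, or None when there are no silent mutations."""
    r, s = count_replacement_silent(germline, mutated)
    return r / s if s else None
```

For example, comparing germline GCTAAA with GCCAGA yields one silent change (GCT to GCC, both alanine) and one replacement change (AAA to AGA, lysine to arginine).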

Limitations of the studies

Potential limitations of the study include the unequal distribution of age, gender, and ethnicity among groups in the VH3-IgG cloning dataset. HIV-1-infected patients with and without viremia, although older than controls, are distributed across the age range. As reported in the literature, both VH diversity and mutation frequencies were similar in 1200 transcripts from tonsillar B cells from older adults (50-73 years) and children [209], and mutation frequencies were higher among older (65-92 years) than younger (19-30 years) adults [210]. If mutation frequencies are higher among older adults, as suggested by the literature, then under the null hypothesis our age distribution should have produced a result opposite to that reported herein, or should at least have made the decreased SHM frequency more difficult to detect. Also, although gender may be a potential confounder of differences between groups in this dataset, mutation frequencies in CDR1/2 between male (25.8%; 95% C.I.=18.1-29.7%) and female (22.1%; 95% C.I.=15.2-31.0%) controls were not different (p=0.61). In addition, despite the age range being higher in the aviremic population compared with the viremic population and gender and ethnicity being similar, SHM frequency in the aviremic group more closely resembled that of the control group than that of the viremic group, further supporting our conclusion that age, gender, and ethnic differences are not confounding our results in this data set. As our results were confirmed in the VH3-IgG 454-pyrosequenced dataset, where age, gender, and ethnicity were not different between groups, we conclude that none of these variables significantly confounded the results of the VH3-IgG cloned dataset.

Finally, due to the number of cells acquired from each patient, we sequenced expressed mRNA transcripts and genomic DNA from unsorted PBMC (including circulating naïve IgD+, IgM+, IgA+ class-switched, or IgG+ class-switched B cells) rather than sorted B cell populations. Despite the presence of non-B cells, in the case of Bcl6 sequencing, which could have potentially masked an effect on Bcl6 mutation frequency, we were able to discover the decrement in mutation frequency in the HIV-1-infected patients. In addition, although multiple identical Ig sequences were isolated from patients, only one of these sequences was included from each patient in both datasets to prevent overrepresentation of a single clone or plasmablast. Future experiments may facilitate the identification of specific B cell subtypes that may harbor the lower VH3-IgG and Bcl6 SHM frequencies and higher VH3-IgD SHM frequencies in HIV-1-infected patients.


CHAPTER IV

AID mRNA EXPRESSION IS INCREASED IN FRESH PBMC FROM HIV-1-INFECTED PATIENTS COMPARED WITH HEALTHY CONTROLS

Introduction

Somatic hypermutation (SHM) is a process initiated by the B cell enzyme, activation-induced cytidine deaminase (AID) [80, 81]. AID is predominantly expressed during germinal center (GC) reactions, but can also be induced by cytokines derived from activated peripheral cells (T cell-independent) and by some pathogens, including HIV-1 [80, 82, 87, 89, 90, 211]. Once translated, AID is targeted to the nucleus by the presence of an N-terminal nuclear localization signal (NLS) in its sequence [100, 101]. The exact mechanism of targeting AID to VH regions remains undiscovered; however, stalling of RNA Pol II during transcription is thought to recruit Spt5, which in turn recruits AID [212, 213]. AID is phosphorylated by PKA, and perhaps PKC, which enables it to interact with RPA, a single stranded DNA (ssDNA) binding protein believed to stabilize ssDNA generated during transcription [99, 100, 111, 214]. AID deaminates cytidine residues to uracils, creating single nucleotide point mutations that are mutagenically repaired by down-stream DNA repair pathways, including Base Excision Repair and Mismatch Repair pathways, which can extend the original U:G lesion to neighboring bases [120, 121].

During a GC reaction, AID expression is induced by CD40-CD40L signaling between activated B cells and TFH cells [59, 82]. Signaling through CD40 leads to activation of NF-κB, which, through interactions with other transcription factors including E47, Pax5, E2A, and STAT6, binds to the AID promoter and activates transcription [87]. Transcription can also be activated in a TFH-independent, CD40-independent manner involving B cell cytokines BAFF and APRIL in peripheral tissues outside of GCs [50, 58, 59]. Like CD40 signaling, BAFF, bound to BAFF-R, or APRIL, bound to TACI (Transmembrane Activator and CAML (calcium-modulator and cyclophilin ligand) Interactor) receptors on B cells, activates NF-κB and promotes transcription of AID [55, 215]. Such activation is common in the lamina propria, where AID expression leads to IgM and IgA class switching. BAFF is expressed by monocytes and DCs, whereas APRIL is expressed by locally activated epithelial cells, macrophages, monocytes, DCs and activated T cells [58, 59]. Additionally, a few pathogens can directly induce AID expression, through the Type IV secretion system in the case of H. pylori [90, 211], through binding of CD81 on the surface of B cells in the case of Hepatitis C [87, 89], through LMP-1 signaling in the case of Epstein-Barr Virus (EBV) [87, 89], or through binding of envelope proteins gp160 and gp120 in the case of HIV-1 [141, 216]. Indeed, one study found increased AID mRNA expression in circulating IgD-CD27+ B cells during HIV-1 infection [202]. However, despite increased AID mRNA expression, the outcomes of AID activity, CSR and SHM, are thought to be impaired during HIV-1 infection [131, 148, 150, 151] (see previous chapter). Impairment may be due to negative regulation by resident B cell factors, as has been shown in murine cells transgenically expressing AID [109], or may be due to direct viral interactions, such as has been shown with HIV-1 viral proteins Vif and Nef [117, 131].

Direct binding of HIV-1 envelope glycoproteins to B cells is not the only means by which the virus can upregulate AID expression during HIV-1 infection. HIV-1 can also activate B cells indirectly by being bound by complement fragments which bind CD21 on the cell surface [93], or by incorporating CD40L into its envelope during budding from infected T cells and binding CD40 on nearby B cells [139]. HIV-1 can also activate other cells which, in turn, produce cytokines and chemokines such as IFN-, TNF-, IL-6, IL-10, CD40L and BAFF and activate nearby B cells [19, 22]. Such non-specific polyclonal B cell activation is thought to be a cause of hypergammaglobulinemia, an accumulation of high serum antibody titers, commonly described during HIV-1 infection [64].

Abnormalities in the B cell compartment are among the first detrimental effects on immune system function during HIV-1 infection [7, 139]. Though not directly infected by HIV-1 in vivo, both naïve and memory B cell populations are quickly depleted during acute HIV-1 infection [19, 60, 61, 64, 203], likely due to activation-induced death [217]. These effects are not entirely corrected by successful antiretroviral therapy [22, 29, 65]. Some B cell subsets are more affected by chronic HIV-1 infection than others. Naïve B cells, for instance, in addition to being depleted, express increased levels of activation markers [19, 22, 24, 64, 65, 218]. IgM memory B cells are disproportionately more depleted than their non-IgM expressing memory counterparts [65]. Plasma cell proportions are also frequently found to be decreased [22, 64, 65]. Several groups have also described increased levels of immature/transitional B cells [22, 24, 219]. Finally, B cell exhaustion has been described in memory B cell populations during chronic HIV-1 infection [22, 24, 220].

The different B cell subsets are defined by surface marker expression (Table 4.1), developmental stage, and function. Transitional/immature B cells are the first subset to enter the periphery after development in the bone marrow [31, 32, 55]. After positive selection, these cells mature into naïve IgD+IgM+ cells, which circulate through the blood and lymph waiting for antigen encounter [31, 40]. Another naïve B cell subset, the exclusively IgD-expressing B cells, BND anergic cells, are naïve autoreactive B cells thought to be precursors to high-affinity autoreactive B cells implicated in several autoimmune diseases [73]. Once IgD+IgM+ naïve B cells bind antigen and are activated by cognate T cell signals and cytokines (i.e. CD40L and IL-4), they can form GCs where they proliferate, undergo affinity maturation and CSR, and differentiate further into memory cells and plasmablasts [42, 43]. Plasmablasts will further mature into antibody- secreting plasma cells [221]. Class-switched IgD+, IgM+, and memory IgG+, or IgA+ cells can be short or long-lived and recirculate through the blood. Upon re-exposure to antigen they rapidly differentiate into antibody-secreting plasma cells [51, 74, 75, 222].

Not all B cell subsets mature through a GC reaction, however. Marginal zone B cells arise from activated naïve B cells [42, 43, 56] and circulate through the blood. They have a much lower threshold of activation than GC-derived B cells, react to antigens derived from blood-borne bacteria, and once activated, can differentiate into antibody-secreting plasma cells [56].

In response to antigen, B cell activation is normally driven by help from activated CD4+ T cell subsets [42, 47]. CD4+ T cell subsets are the primary targets of HIV-1 infection and, just as B cell subsets are impaired by chronic HIV-1 infection, T cell subsets are even more devastated by direct infection, chronic antigen stimulation, as well as stimulation from abnormal proinflammatory cytokine levels [19]. Indeed, acute HIV-1 infection leads to a massive depletion of CD4+ T cells, and especially CXCR5-expressing effector memory cells in the lamina propria of the gut [7, 19, 23, 223]. In response to vast amounts of antigen produced by the virus during replication, CD8+ T cell numbers increase and both CD4+ and CD8+ T cells become more highly activated [19, 23]. Increased levels of T cell exhaustion, as evidenced by increased Programmed Cell Death (PD)-1 receptor expression, have also been discovered [224-226], further weakening the ability of the T cell compartment to combat the infection.

Table 4.1: Cell Marker Expression in CD19+ B Cell Subsets.

Subsets (table columns, in order): Transitional; Naive; BND Anergic; Germinal Center; Marginal Zone-like IgM Memory; IgM Memory; IgD Memory; Switch Memory (IgG+ or IgA+).

Marker   Expression by subset
IgD      +   +   +   -   -/low   -   +   -
IgM      +   +   -/low   +/-   +   +   -   -
CD10     +   -   -   +
CD21     -/+   +   +
CD27     -   -   -   -   +   +   +   +
CD40     +
AID      -   -   +
Bcl6     -   -   +

Similar to B cells, T cell subsets are defined by surface marker expression (Table 4.2), developmental stage, and function. Naïve T cells encounter antigen, become activated, proliferate, and differentiate into effector T cells [227, 228]. These effector cells, including terminally differentiated effector cells (TD), are not long-lived, however, and >90% apoptose shortly after an infection recedes [227, 228]. The remaining T cells can differentiate into either central memory (CM) cells or effector memory (EM) cells. TEM cells are found in the spleen, blood, and non-lymphoid tissues and provide the first line of defense against re-exposure to a pathogen. TEM cells have cytotoxic capability and produce IFN-γ, TNF-α, IL-4, and IL-5 upon reactivation [227-230]. TCM cells circulate through the blood and secondary lymphoid tissues but can also enter inflamed peripheral sites. TCM cells can proliferate extensively, produce IL-2, and constitute the second wave of action against a recurring pathogen [227-229]. An additional T cell subset, important in GC B cell activation and implicated as an HIV-1 viral reservoir, is the follicular helper (TFH) subset [203]. These cells are found within the GC, where they provide costimulatory molecules (CD40L) and survival signals (IL-4, IL-21) to GC B cells undergoing affinity maturation [43, 46].


We confirm that, although evidence demonstrates that CSR and SHM may be inhibited during HIV-1 infection [131] (see previous chapter), AID mRNA in circulating B cells of viremic [202], and now aviremic, HIV-1-infected patients is significantly higher compared with control subjects, and we highlight the association with both general T cell activation and TFH cell activation. We extend these results to show that, despite increased AID mRNA expression at baseline and the ability to upregulate markers of B cell activation in multiple B cell subsets to levels comparable to those in control subjects, the ability to induce new AID mRNA upon stimulation with surrogates for antigen (anti-IgM) and both cognate (anti-CD40) and soluble (IL-4) T cell stimuli may be significantly impaired. These data suggest that the inability to appropriately respond to B cell stimuli may, in part, impair humoral responses in HIV-1-infected patients.

Table 4.2: Cell Marker Expression in T Cell Subsets.

Marker   Naive   Follicular Helper (TFH)   Central Memory (CM)   Effector Memory (EM)   Terminally Differentiated (TD)
CD3      +       +                         +                     +                      +
CD4      +/-     +                         +/-                   +/-                    +/-
CD8      +/-     -                         +/-                   +/-                    +/-
CD45RA   +                                 -                     -                      +
CD27     +                                 +                     -                      -
CXCR5            +
PD-1             +


Results

Baseline AID mRNA expression is higher, but AID mRNA induction post-stimulation is lower, in HIV-1-infected patients in whom VH3-IgG cloning and sequencing was performed

Immunoglobulins. As suggested in previous studies, HIV-1 can inhibit CSR, a process mediated by AID [131, 133, 169]. To test the ability of AID to initiate class switching in our HIV-1-infected groups, we measured levels of total IgG, IgM, and IgA in serum from 16 HIV-1-infected viremic patients, 17 aviremic patients, and 20 control subjects who were included in the VH3-IgG cloned dataset in which VH3-IgG SHM frequency was measured in the previous chapter (Table 4.3). Levels of total IgG were significantly higher in the viremic patients compared with those in both the controls and aviremic patients, as were levels of IgM but not IgA. Despite the differences in the quantity of immunoglobulins in serum, the median proportions of IgG (73.6%, 77.3%, 76.9%, p=0.30) and IgA (15.8%, 15.6%, 10.3%, p=0.26) compared with total serum Ig in controls, aviremic and viremic patients, respectively, were comparable. The proportion of IgM was lower only in the aviremic group compared with controls (4.8 vs. 10.6%, p=0.02). Similarly, as measured in culture, IgG and IgA produced spontaneously by PBMCs over 7 days were significantly higher among viremic patients. Spontaneous IgG production in culture did not correlate with serum values in any group. Thus, the distribution by isotype of antibodies in serum and secreted in culture ex vivo suggests that the ability to class-switch from IgM to IgG or IgA, a function of AID, appears to be generally intact in antibody-secreting cells in our HIV-1-infected cohort.
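The isotype proportions reported above are each subject's IgG, IgA, or IgM concentration expressed as a fraction of total serum Ig. A minimal Python sketch of that arithmetic follows. It assumes that total Ig is the sum of the three measured isotypes; the function name and the input concentrations are hypothetical and do not come from any study subject.

```python
def isotype_proportions(igg, iga, igm):
    """Return each isotype as a percentage of total Ig (IgG + IgA + IgM).

    Assumption: total serum Ig is taken as the sum of the three measured
    isotypes, all given in the same units (e.g., mg/dL).
    """
    total = igg + iga + igm
    if total <= 0:
        raise ValueError("total immunoglobulin must be positive")
    return {
        "IgG": 100.0 * igg / total,
        "IgA": 100.0 * iga / total,
        "IgM": 100.0 * igm / total,
    }


# Hypothetical serum concentrations (mg/dL) for a single subject.
example = isotype_proportions(igg=1200.0, iga=250.0, igm=150.0)
```

Because the proportions are computed per subject before taking the group median, the median proportion need not equal the ratio of the group median concentrations.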


Table 4.3: Clinical Characteristics of Study Subjects.

                                        Control         HIV-1+ Aviremic    HIV-1+ Viremic            p value
Number                                  22              19                 16
Age – median years (Range)              30 (23-54)      52 (33-62)         46 (26-55)                0.0001^a
Sex (M:F)                               10:12           18:1               16:0                      0.0001^b
Ethnicity^c                             20C, 2A         13C, 3B, 1H, 2O    8C, 6B, 1A, 1O            0.02^d
CD4+ T cells/µL – median (Range)        N/A             264 (129-429)      217.5 (34-394)            0.20
HIV-1 RNA median copies/mL (Range)      N/A             <50                41,315 (2,840 to >1x10^6)
Antiretroviral Therapy                  N/A             100%               0%
Serum Ig mg/dL – median^e (Range)
  IgG                                   1090 (776-1390) 1380 (1090-3480)   2295 (1100-6850)          0.001^f
  IgA                                   244 (98-443)    313 (146-703)      309 (110-1090)            0.08
  IgM                                   155 (47-288)    95 (46-323)        271 (117-682)             0.0003^g
Spontaneous ex vivo Immunoglobulin production – median ng/mL^h (Range)
  IgG                                   55 (44-143)     256 (73-1094)      532 (32-12,441)           <0.04
  IgA                                   26 (2-70)       269 (88-1397)      191 (35-8093)             <0.01
  IgM                                   11 (2-24)       2 (2-106)          36 (2-153)                0.14

^a Primary p value listed on table. Secondary analyses: control vs. aviremic p<0.001, control vs. viremic p<0.01, aviremic vs. viremic p>0.05 (not significant).
^b Secondary analyses: control vs. aviremic p=0.0008, control vs. viremic p=0.0003, aviremic vs. viremic p=1.0.
^c Ethnicity: C = Caucasian, B = Black, A = Asian, H = Hispanic, O = Other.
^d Secondary analyses: control vs. aviremic p=0.12, control vs. viremic p=0.008, aviremic vs. viremic p=0.32.
^e Control n=20, aviremic n=17, viremic n=16.
^f Secondary analyses: control vs. aviremic p<0.05, control vs. viremic p<0.001, aviremic vs. viremic p<0.05.
^g Secondary analyses: control vs. aviremic p>0.05 (not significant), control vs. viremic p<0.01, aviremic vs. viremic p<0.001.
^h Immunoglobulins were measured in supernatants from 10^5 unstimulated PBMC cultured in RPMI and 10% heat-inactivated fetal calf serum for 7 days; control: n=5; HIV-1+ aviremic: n=6; HIV-1+ viremic: n=7.


AID mRNA expression is upregulated in HIV-1-infected patients at baseline but impaired post-stimulation. To determine AID expression profiles in HIV-1-infected patients, we measured expression of AID mRNA relative to β-actin (ΔCt) in freshly isolated PBMC and post-stimulation from 16 viremic patients, 17 aviremic patients, and 21 control subjects from the VH3-IgG cloned dataset. Baseline AID mRNA was significantly higher in both HIV-1-infected patient groups (Day 0; Figure 4.1a). The presence of HIV-1 viremia did not affect levels of AID mRNA, as levels between aviremic and viremic patients were comparable.

After four days of stimulation with surrogates of B cell receptor engagement (anti-IgM) and cognate (anti-CD40) and soluble (IL-4) T cell stimulation, AID mRNA expression (ΔCt) increased significantly in all groups (Day 4; Figure 4.1a). Despite the lowest baseline values, AID mRNA was highest in controls post-stimulation. Thus, the increase in expression from baseline to Day 4 post-stimulation (ΔΔCt) was also greater in the control group than in either HIV-1-infected group (Figure 4.1b), and especially in the viremic group, where the magnitude of AID induction was significantly lower than in control subjects. Neither expression of AID mRNA at baseline nor post-stimulation correlated significantly with VH3-IgG SHM frequency, CD4+ T cell counts, plasma HIV-1 RNA, or total levels and proportions of IgG, IgA, or IgM (Table 4.3) (data not shown). These baseline levels of AID mRNA were also unrelated to those of proposed soluble biomarkers in plasma of systemic exposure to mucosal bacterial antigens (levels of LPS, IgM specific for LPS (EndoCab IgM)) and systemic inflammation (soluble CD14) [22, 231, 232] (data not shown).
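As a rough illustration of the normalization described above, the Python sketch below computes a ΔCt from raw threshold cycles, converts it to a relative expression value, and takes the Day 4 minus baseline difference. It assumes the common convention ΔCt = Ct(AID) - Ct(reference gene) and relative expression = 2^(-ΔCt), which the text does not spell out. The function names and Ct values are hypothetical.

```python
def delta_ct(ct_target, ct_reference):
    """Delta-Ct: target gene threshold cycle minus reference gene threshold cycle."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference):
    """Relative expression under the assumed 2^(-delta Ct) convention."""
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


def change_in_expression(baseline, day4):
    """Induction: Day 4 relative expression minus baseline relative expression."""
    return day4 - baseline


# Hypothetical Ct values: AID is the target, beta-actin is the reference.
baseline = relative_expression(ct_target=32.0, ct_reference=18.0)  # delta Ct = 14
day4 = relative_expression(ct_target=26.0, ct_reference=18.0)      # delta Ct = 8
induction = change_in_expression(baseline, day4)
```

A lower ΔCt corresponds to higher expression, so a fall in ΔCt from 14 to 8 in this hypothetical sample represents a 64-fold rise in AID mRNA relative to the reference gene.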


Figure 4.1: AID mRNA Expression in PBMCs. A) AID mRNA expression was measured by qRT-PCR in freshly isolated PBMCs from control subjects (black circles, n=21), aviremic patients (gray squares, n=17), and viremic patients (white triangles, n=16) at baseline and after 4 days of culture with B cell stimulants (anti-CD40, IL-4, anti-IgM). AID expression levels were normalized to β-actin expression. The primary p value for baseline expression between all groups is 0.001. The primary p value for Day 4 expression is 0.04. B) The change in AID mRNA expression, relative to β-actin expression, was calculated by subtracting baseline expression levels from day 4 post-stimulation levels. The primary p value for change in expression level is 0.03.

Individual stimulants have varying effects on AID mRNA expression. Next, we determined whether our VH3-IgG cloned dataset subjects with and without HIV-1 infection showed a differential AID mRNA response to individual B cell stimuli alone or in combination (Figure 4.2), each of which engages different receptors and activates distinct pathways. PBMCs incubated in media alone for four days showed a significant rise in AID mRNA in all groups compared with baseline values (Figure 4.2). Similarly, markers of B cell activation (CD86+ and CD23-) increased when incubated in media alone for 4 days among controls (p=0.006 and p=0.03, respectively), but not among HIV-1-infected patients (data not shown). Using values on day 4 without stimulation (Day 4 no stims) as the comparator, all stimuli, alone or in combination, elicited a significant rise in AID mRNA in all groups, with the exception of anti-CD40 alone in both HIV-1-infected groups (Figure 4.2, panel 1).

Although expression increased within each group, we identified differences in the magnitude of responses between groups (Figure 4.2, panels 2 and 3). Averaged over all groups, the relative effects on AID expression of the single stimuli IL-4, anti-IgM, and anti-CD40 were mean log ΔCt (standard deviation) 0.94 (0.71, 1.18), 0.69 (0.41, 0.97), and -0.31 (-0.60, -0.12), p=0.04, respectively, compared with Day 4 no stimulation. Thus, the overall pattern of change in the relative expression of AID in stimulated PBMC was IL-4 ≥ anti-IgM > anti-CD40; this pattern was consistent in all groups.

However, although similar in both HIV-1-infected groups, the change in AID mRNA response from Day 4 no stimulation to Day 4 with either IL-4 or anti-IgM stimulation was lower in viremic patients compared with controls, albeit with low numbers in the latter comparison, as was the response to IL-4 in aviremic patients (Figure 4.2, panel 2). Combinations of two or more stimuli elicited higher AID mRNA expression in all three groups compared with single stimuli. Finally, the combination of all three stimuli did not result in a significant increase in either HIV-1-infected group when compared with IL-4 + anti-CD40 or anti-IgM + anti-CD40 combinations. Therefore, adding either anti-IgM or IL-4, respectively, to the other two stimuli in the HIV-1-infected groups did not significantly increase AID mRNA expression. Taken together, these data suggest that control subjects could generate increased amounts of AID mRNA with each additional stimulus, whereas B cells from HIV-1-infected patients may lack this additional reserve to respond.

Figure 4.2: AID mRNA Expression Induced by B Cell Stimulants. AID mRNA expression was determined by qRT-PCR in PBMC from control subjects (C, black circles, n=9-21), aviremic (A, gray squares, n=4-18), and viremic (V, white triangles, n=3-16) patients with HIV-1 infection. Numbers tested per condition varied by cell availability. AID mRNA expression was measured at baseline, after 4 days of culture in media alone, and after 4 days of culture with B cell stimulants alone or in combination. AID expression levels were normalized to β-actin expression. The final model, determined by likelihood ratio tests, demonstrated group effects (IL-4, anti-CD40, and anti-IgM, p<0.0001), but no effect of the interaction terms (estimates with >1 stimulus, p=0.50). *0.05 > p > 0.01, **0.01 > p > 0.001, ***p < 0.001, ns = not significant. The ‘X’ in the baseline columns indicates the median. The line in the Day 4 columns indicates the mixed model estimate.

HIV-1 infection does not affect AID splicing. Alternative splicing of AID transcripts may yield five different isoforms of AID in healthy adults and patients with cancer [83, 112-114] (Figure 1.8). Whether the isoforms have functional implications remains controversial [115, 116]. To determine AID isoform expression levels during HIV-1 infection, we amplified AID cDNA by PCR using primers specific for exons shared by all isoforms (exons 2 and 5) (Table 2.1). We cloned and amplified each of the 5 isoforms, verified the sequences by Sanger sequencing, and used the sizes of the isoforms as positive controls to assign the presence or absence of each isoform in patient samples. We used total mRNA isolated from baseline and stimulated PBMCs to generate cDNA from 9 control subjects and 12 viremic patients taken from both the VH3-IgG cloned sequence and the 454-pyrosequenced datasets (Figure 4.3, Table 4.4). All viremic patients and 8/9 control subjects showed expression of 3 or more isoforms at baseline (Table 4.5). Expression increased visibly after stimulation with anti-IgM, anti-CD40 and IL-4, primarily of the full-length isoform; only 2/12 viremic patients and 2/9 control subjects expressed 3 or more isoforms after stimulation. Thus, HIV-1-infected patients and control subjects share AID isoform expression profiles in circulating cells, and the full-length isoform predominates in each group after stimulation.

Baseline B cell and T cell activation is higher in HIV-1-infected patients in whom VH3-IgG cloning and sequencing was performed, but activation levels are comparable to controls post-stimulation

B cell activation does not correlate with AID mRNA expression in HIV-1 infection. Because AID mRNA expression is upregulated with B cell activation, we determined whether AID expression levels correlated with markers of B cell activation in subjects from the VH3-IgG cloned dataset (Table 4.3). In fresh PBMC, three surface markers of activation (cells expressing CD86 and those lacking CD21 and CD23 expression) were increased in CD19+ B cells from viremic patients compared with those from controls (Figure 4.4a); results for aviremic patients were intermediate. These B cell markers in viremic patients correlated directly with CD4+ T cell activation (CD4+HLA-DR+CD38+) (CD86+ p=0.06; CD21- p=0.009, CD23- p<0.04). The proportions expressing the memory marker, CD27, and CD69, an early activation marker, were similar.

Table 4.4: Clinical Characteristics of AID Isoform Study Subjects.

                                        Control            HIV-1+ Viremic            p value
Number                                  9                  12
Age – median years (Range)              39 (22-48)         41.5 (29-60)              0.59
Gender (M:F)                            9:0                12:0
Ethnicity^a                             4W, 1B, 1H, 3O     7W, 2B, 1H, 2O            0.67^b
CD4+ T cells/µL – median (Range)                           196 (41-736)
HIV RNA median copies/mL (Range)                           90,200 (2,780-1,502,153)

^a Ethnicity: C = Caucasian, B = Black, H = Hispanic, O = Other.
^b Fisher’s Exact test.

Figure 4.3: AID Isoforms Expression. Representative agarose gel of a control subject and an HIV-1-infected viremic patient at baseline and at day 4 post-stimulation. Positive controls (AIDins3, FL, E4a, E4, and E3-E4) were cloned and PCR-amplified to compare amplicon sizes.

Table 4.5: AID Isoforms Expression

Isoform      Control Baseline   Control Day 4*                        HIV-1+ Viremic Baseline   HIV-1+ Viremic Day 4*
AID-FL       8/9 (89%)          8/9 (89%), intensity increases        12/12 (100%)              12/12 (100%), intensity increases
AID-ins3     8/9 (89%)          6/9 (67%), no change in intensity     9/12 (75%)                9/12 (75%), no change in intensity
AID-E4a      6/9 (67%)          2/9 (22%), no change in intensity     11/12 (92%)               1/12 (8%), no change in intensity
AID-E4       7/9 (78%)          2/9 (22%), no change in intensity     12/12 (100%)              3/12 (25%), no change in intensity
AID-E3-E4    4/9 (44%)          1/9 (11%), no change in intensity     10/12 (83%)               0/12

*Day 4 indicates AID isoforms expression after four days of stimulation with anti-IgM, anti-CD40, and IL-4.

After stimulation with anti-IgM, anti-CD40, and IL-4 for four days, the proportions of B cells expressing each marker changed significantly in all groups for CD27+, CD86+, CD69+, and CD21- (p<0.05), whereas the proportion of CD23- B cells did not change significantly in any group (p>0.07) (Figure 4.4b). The magnitude of change in the surface activation molecules was similar. Only the change in CD27+ memory cell expression post-stimulation was significantly lower in both HIV-1-infected groups compared with controls, suggesting a potential HIV-1-associated limitation in generating new memory cells. The percent of CD19+ B cells did not increase in culture over time, independent of group or stimulus (baseline primary p=0.48, post-stimulation primary p=0.11). Thus, B cells in PBMC from HIV-1-infected patients can upregulate surface markers of activation with sufficient stimulus, but may not be able to differentiate appropriately into CD27+ memory cells.

Despite increases in both AID mRNA and B cell activation marker expression following stimulation, we identified no direct correlations between any of the B cell activation markers or CD27 with AID expression at baseline, or changes in marker expression post-stimulation with upregulation of AID post-stimulation in vitro (data not shown). However, B cell activation did correlate with CD4+ T cell activation at baseline (CD86+ p=0.06; CD21- p=0.009, CD23- p<0.04; data not shown).


Figure 4.4: Activation Phenotype of B Cells and T Cells at Baseline and Post-Stimulation. B cell and T cell activation markers were measured in control subjects (dark gray bars, n=22), aviremic patients (light gray bars, n=16), and viremic patients (white bars, n=17). A) We measured markers of memory (CD27+) and activation (CD86+, CD69+, CD21-, and CD23-) on CD19+ B cells by flow cytometry at baseline. B) The change in B cell marker expression levels from baseline to Day 4 post-stimulation with anti-CD40, IL-4, and anti-IgM was calculated for each B cell marker. C) T cell activation markers (CD38+, HLA-DR+) on CD4+ and CD8+ CD3+ T cells at baseline. D) Change in T cell activation marker expression from baseline to Day 4 post-stimulation with anti-CD40, IL-4, and anti-IgM. *0.05 > p > 0.01, **0.01 > p > 0.001, ***p < 0.001.


Baseline AID mRNA levels correlate with T cell activation. As anticipated [19], both CD4+ and CD8+ T cell subsets showed more activation at baseline in the viremic, but not aviremic, patients compared with controls based on co-expression of CD38 and HLA-DR (Figure 4.4c) in subjects from the VH3-IgG cloned dataset. Because AID expression in germinal center B cells is largely dependent on T cell help [202], we correlated T cell activation with AID expression. Indeed, baseline AID expression correlated significantly with activation of both CD4+ and CD8+ T cells (Figures 4.5a and 4.5b). Of note, the B cell-targeted stimuli did not promote increased T cell activation in any group (Figure 4.4d). Thus, the increased activation of B cells and of AID expression in vitro was due to the crosslinking of the BCR and the added surrogate cognate and soluble T cell stimulation, and was not likely related to the effects of these stimuli on the T cells themselves. Therefore, the limitations in AID responses to stimulation in vitro among the HIV-1-infected patients were not due to a failure of their own T cells to elicit these responses but, rather, to B cell defects.
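Figure 4.5 summarizes this relationship as the estimated change in baseline AID mRNA expression for every 0.5 increase in log10 (% CD38+HLA-DR+ T cells). The Python sketch below shows how such an estimate can be read off a fitted slope. It assumes an ordinary least-squares fit with a single predictor, which may be simpler than the model actually used; the function name and data points are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: % activated T cells and log10 baseline AID expression.
pct_activated = [1.0, 10.0, 100.0]
log10_aid = [-5.0, -4.0, -3.0]

log10_activation = [math.log10(p) for p in pct_activated]
slope, intercept = fit_line(log10_activation, log10_aid)

# Estimated change in log10 AID expression per 0.5 increase in log10 activation.
effect_per_half_log = 0.5 * slope
```

Scaling the slope by 0.5 only changes the unit in which the effect is reported; it does not alter the fit itself.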

Baseline AID mRNA expression is higher in HIV-1-infected patients in whom 454-pyrosequencing of VH3 genes was performed

Baseline AID mRNA expression is higher during HIV-1 infection. We also measured AID mRNA expression at baseline and post-stimulation in subjects from the 454-pyrosequenced dataset (Table 4.6). As in the previous VH3-IgG cloned dataset, baseline AID mRNA levels, relative to GAPDH expression, were significantly higher in the HIV-1-infected group compared with controls in the 454-pyrosequenced dataset (Figure 4.6a). In the 454-pyrosequenced dataset, however, AID mRNA levels in the HIV-1-infected patients reached levels comparable to those in control subjects after stimulation for four days with anti-IgM, anti-CD40, and IL-4. Similarly, the magnitude of induction was also comparable between the groups in this dataset (Figure 4.6b), though this differs from the results in the VH3-IgG cloned dataset. These different results could be due, in part, to the different control groups used in the comparisons. In the VH3-IgG cloned dataset, control subjects are significantly younger (Table 4.3), whereas the control group in the 454-pyrosequenced dataset more closely matches the HIV-1-infected patient group (Table 4.6). However, we did not see any significant relationship between AID mRNA expression and age either at baseline (controls: r^2=8.0 x 10^-8, p=0.99; HIV-1-infected: r^2=1.3 x 10^-5, p=0.98) or post-stimulation (controls: r^2=0.08, p=0.25; HIV-1-infected: r^2=0.001, p=0.83, data not shown) to suggest that age may have been a confounding factor. Normal levels of AID mRNA expression post-stimulation in the HIV-1-infected group could also be due to sampling error, as even some of the patients in the VH3-IgG cloned dataset had AID mRNA levels equivalent to those of control subjects, though they were a minority (Figures 4.1A and B).

Figure 4.5: Relationship Between T Cell Activation and Baseline AID mRNA Expression in PBMC. Baseline AID mRNA expression (log10 ΔCt) values increased with baseline CD4+ T cell (A) or CD8+ T cell (B) activation levels (CD38+HLA-DR+) among all patient groups by linear regression (control n=9, aviremic n=6, viremic n=7). For every 0.5 increase in log10 (% CD38+HLA-DR+CD4+ T cells), we estimate a 1.19 (0.42, 1.96) increase in baseline AID mRNA expression. For every 0.5 increase in log10 (% CD38+HLA-DR+CD8+ T cells), we estimate a 0.74 (0.11, 1.36) increase in baseline AID mRNA expression.

Baseline AID mRNA expression positively correlates with VH3-IgD SHM frequency. As we saw in the VH3-IgG cloned dataset, baseline AID mRNA expression in the 454-pyrosequenced dataset did not correlate with CD4+ T cell count (r=0.156, p=0.52), HIV-1 plasma RNA (r=0.040, p=0.87), or VH3-IgG SHM frequency (Figure 4.7D). However, baseline AID mRNA expression did positively correlate with VH3-IgD SHM frequency (Figure 4.7A), suggesting, at least in this B cell subset, that SHM frequency is related to AID levels and that increased AID expression can lead to increased SHM frequencies in vivo. Baseline AID mRNA levels did not correlate significantly with VH3-IgM SHM frequency (Figure 4.7B) or VH3-IgA SHM frequency (Figure 4.7C).
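The correlations in Figure 4.7 are Spearman rank correlations. As a minimal sketch of what that statistic measures, the Python code below ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. It is a plain standard-library illustration, not the software used for the analysis, and it omits the p value. The function names and data values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient; undefined if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / (sxx * syy) ** 0.5


def spearman_r(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical values: relative AID mRNA expression and % CDR replacement mutations.
aid_expression = [-4.8, -4.1, -3.6, -3.0]
shm_frequency = [5.0, 12.0, 9.0, 20.0]
r = spearman_r(aid_expression, shm_frequency)
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotonic transformation of either variable, such as taking the log of the expression values.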

Table 4.6: Clinical Characteristics of Study Subjects

                                        Control            HIV-1+                    p value
Number                                  15                 26
Age – median years (Range)              37 (22-56)         43 (21-64)                0.56
Gender (M:F)                            13:2               25:1                      0.54^a
Ethnicity^b                             9W, 1B, 2H, 3O     12W, 6B, 6H, 2O           0.52^a
CD4+ T cells/µL – median (Range)        N/A                226 (3-736)
HIV-1 RNA – median copies/mL (Range)    N/A                108,814 (553-3,793,762)
Antiretroviral Therapy                  N/A                1/26 (3.8%)

^a Fisher’s Exact test.
^b Ethnicity: C = Caucasian, B = Black, H = Hispanic, O = Other.


[Figure 4.6 panels. A) Baseline and Stimulated AID Expression: Relative AID Expression (log scale) at Day 0 and Day 4 for Control and HIV-1+ groups; p values shown: 0.01 and 0.66. B) Change in AID Expression Post-Stimulation: Change in Relative AID Expression (log scale) for Control and HIV-1+ groups; p=0.09.]

Figure 4.6: AID mRNA Expression in 454-Pyrosequenced PBMCs. A) AID mRNA expression was measured by qRT-PCR in freshly isolated PBMCs from control subjects (black circles, n=14) and HIV-1-infected (HIV-1+) patients (white triangles, n=19) at baseline and after 4 days of culture with B cell stimulants (anti-CD40, IL-4, anti-IgM). AID expression levels were normalized to GAPDH expression. The differences between Day 4 Relative AID Expression levels and baseline (Day 0) Relative AID Expression levels were significant in both the Control and HIV-1-infected groups (p<0.0001 for both). B) The change in AID mRNA expression, relative to GAPDH expression, was calculated by subtracting baseline expression levels from day 4 post-stimulation levels.


[Figure 4.7 panels: Baseline AID mRNA Expression and Baseline CDR Amino Acid SHM Frequency. X-axis: Relative AID mRNA Expression; y-axis: % CDR a.a. replacement mutations. A) IgD: r = 0.600, p = 0.02. B) IgM: r = 0.253, p = 0.38. C) IgA: r = -0.467, p = 0.11. D) IgG: r = -0.464, p = 0.09.]

Figure 4.7: Baseline AID mRNA Expression Positively Correlates with IgD CDR1/2 Amino Acid SHM Frequency. Spearman correlations were calculated for baseline AID mRNA expression and A) IgD CDR amino acid SHM frequency, B) IgM CDR amino acid SHM frequency, C) IgA CDR amino acid SHM frequency, and D) IgG CDR amino acid SHM frequency.

B and T cell subsets are more activated in HIV-1-infected patients in whom 454-pyrosequencing was performed

B cell subsets are more activated during HIV-1 infection. In the 454-pyrosequenced dataset, we did a more comprehensive examination of both B and T cell subset proportions and activation levels, rather than just activation levels as in the VH3-IgG cloned dataset. We were able to determine the activation states of immature (transitional: IgD+IgM+CD27-CD10+), mature (naïve: IgD+IgM+CD27-CD10-; BND anergic: IgD+IgM-CD27-; and germinal center (GC): IgD-CD27+CD10+CD38+), and memory B cell subsets (marginal zone (MZ): IgD+IgM+CD27+CD10-; class-switched IgM (IgM C-S): IgD-IgM+CD27+; class-switched IgD (IgD C-S): IgD+IgM-CD27+; and class-switched memory: IgD-IgM-) at baseline and after four days of stimulation with anti-IgM, anti-CD40, and IL-4. Activation was determined by expression of CD21, CD40 and CD86.
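The subset definitions above amount to a small decision table over surface markers. The Python sketch below encodes them as listed in the text, with each marker given as positive (True) or negative (False). Two points are assumptions and not stated in the text: the germinal center phenotype is checked first because it overlaps with the IgD- memory definitions, and any phenotype that matches no listed definition is returned as "Unclassified". The function name is hypothetical and does not represent the gating software used.

```python
def classify_b_cell(igd, igm, cd27, cd10=False, cd38=False):
    """Assign a CD19+ B cell to a subset from marker positivity (True = positive).

    Assumption: the germinal center phenotype takes priority over the IgD-
    memory definitions, which it would otherwise overlap.
    """
    if not igd and cd27 and cd10 and cd38:
        return "Germinal center"          # IgD-CD27+CD10+CD38+
    if igd and igm:
        if cd27:
            # Marginal zone: IgD+IgM+CD27+CD10-
            return "Marginal zone" if not cd10 else "Unclassified"
        # Transitional: IgD+IgM+CD27-CD10+; naive: IgD+IgM+CD27-CD10-
        return "Transitional" if cd10 else "Naive"
    if igd and not igm:
        # IgD C-S: IgD+IgM-CD27+; BND anergic: IgD+IgM-CD27-
        return "IgD class-switched" if cd27 else "BND anergic"
    if igm:
        # IgM C-S: IgD-IgM+CD27+
        return "IgM class-switched" if cd27 else "Unclassified"
    return "Switched memory"              # IgD-IgM-


# Example: an IgD+IgM+CD27-CD10+ cell falls in the transitional subset.
subset = classify_b_cell(igd=True, igm=True, cd27=False, cd10=True)
```

In practice, markers such as IgD are scored on a continuum (for example, the "-/low" entries in Table 4.1), so a boolean encoding like this one is a simplification.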

Contrary to previous reports [22, 24, 219], we did not find any significant difference in the proportion of transitional cells in our HIV-1-infected group at baseline (Figure 4.8A). CD21 expression, which is downregulated with activation, was significantly higher in the control group (p=0.01), though CD86 and CD40 levels, which increase with activation, were not different. The proportions of naïve and GC B cells were also not different between groups, though activation of naïve cells was significantly higher in the HIV-1-infected group (CD86+, p=0.04; CD21-, p<0.0001), which is consistent with previous studies [19, 22, 24, 64, 65, 218] and thought to be caused by non-specific bystander B cell activation by cytokines and surface markers on activated T cells and activated antigen presenting cells [64, 137]. In contrast, the proportion of BND anergic IgD+ B cells was significantly lower in the HIV-1-infected group, though activation was higher (CD21-, p<0.0001, CD40+, p=0.003). Few significant differences were found post-stimulation (Figure 4.8C). The proportions of both naïve and BND anergic cells were significantly lower in the HIV-1-infected group; however, there were no differences in activation marker expression in either of these subsets. Marker expression was also comparable in transitional cells. Only CD21 expression was different post-stimulation in GC B cells in HIV-1-infected patients (CD21-, p=0.004).


In the memory cell subsets, both MZ and IgD C-S B cell proportions were comparable between groups at baseline (Figure 4.8B) and had similarly lower CD21 expression in the HIV-1-infected group (MZ, p<0.0001; IgD C-S, p=0.001). Likewise, the proportion of non-IgM, non-IgD class-switched memory cells (Switch Memory) was comparable between groups, but more highly activated in the HIV-1-infected group (CD21-, p=0.0001; CD86+, p=0.05). In contrast, the IgM C-S memory cell proportion was higher in HIV-1-infected patients as were activation levels (CD21-, p<0.0001; CD86+, p=0.002). None of the memory subset proportions were different between groups post-stimulation (Figure 4.8D). Only CD40 expression on Switch Memory B cells was significantly different between groups (p=0.02) and was higher in the control group.

Only one difference was found in the magnitude of change from baseline to post-stimulation between the two groups across all subsets (Figure 4.8E and F). The proportion of IgM C-S cells decreased in the HIV-1-infected group, resulting in normal proportions post-stimulation compared with control subjects (Figure 4.8D). Within each group, the proportions of IgD C-S and Switch Memory cells increased significantly from baseline to post-stimulation (p<0.0001, data not shown), as did activation levels (CD21-, p<0.0003; CD86+, p<0.0004). In contrast, proportions of transitional, naïve, and marginal zone cells decreased (p<0.04). Despite the decrease in proportion, however, activation was significantly increased in all three subsets in both groups (CD21-, p<0.001; CD86+, p<0.0002). The proportions of BND anergic cells did not change significantly in either group, though activation increased significantly (CD21-, p<0.0001; CD86+, p<0.0005). Finally, while the proportion of GC B cells only increased significantly in the HIV-1-infected group (p=0.009), increases in activation only occurred within the control group (CD21-, p=0.03; CD86+, p=0.001), although CD21 levels were significantly reduced in the HIV-1-infected group also (p=0.003).

[Figure 4.8, panels A-F: box plots for control and HIV-1+ subjects. A) Baseline Immature and Mature Subsets; B) Baseline Memory Subsets; C) Immature and Mature Subsets Post-Stimulation; D) Memory Subsets Post-Stimulation; E) Change in Immature/Mature Subsets (Day 4 - Day 0); F) Change in Memory Subsets (Day 4 - Day 0). Y-axis: % positive on CD19+ B cells (A-D) or % change on CD19+ B cells (E, F). X-axis groups: Transitional, Naive, BND Anergic, and GC (A, C, E); Marginal Zone, IgM C-S, IgD C-S, and Switch Memory (B, D, F).]

Figure 4.8: B Cell Subsets in 454-Pyrosequenced PBMCs. CD19+ B cells were divided into subsets based on marker expression as measured by flow cytometry. Immature and mature subsets are defined as Transitional (IgD+IgM+CD27-CD10+), Naïve (IgD+IgM+CD27-CD10-), BND anergic (IgD+IgM-CD27-), and Germinal Center (GC; IgD-CD27+CD10+CD38+). Memory subsets are defined as marginal zone (IgD+IgM+CD27+CD10-), class-switched IgM (IgM C-S; IgD-IgM+CD27+), class-switched IgD (IgD C-S; IgD+IgM-CD27+), and switched memory B cells (IgD-IgM-). Immature and mature subset proportions were measured in control subjects (gray bars, n=13) and HIV-1-infected patients (HIV-1+, white bars, n=19) at baseline (A) and post-stimulation (C). Memory subset proportions were measured at baseline (B) and post-stimulation (D). Changes in the proportions of immature and mature subsets (E) or memory subsets (F) with stimulation were calculated by subtracting baseline proportions from post-stimulated proportions. Median, interquartile range, and minimum and maximum values per group are represented in each data set.
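The Figure 4.8 legend describes two simple calculations: the change with stimulation (post-stimulation proportion minus baseline proportion) and the per-group summary shown in each box plot (median, interquartile range, minimum, and maximum). A minimal Python sketch of both is given here for illustration only. The function names and the example values are hypothetical and are not study data, and the quartile convention (the "inclusive" method) is an assumption, since the plotting software's convention is not stated.

```python
from statistics import median, quantiles


def change_from_baseline(day0, day4):
    """Per-subject change in a subset proportion: Day 4 minus Day 0."""
    if len(day0) != len(day4):
        raise ValueError("day0 and day4 must be paired per subject")
    return [post - pre for pre, post in zip(day0, day4)]


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, and maximum."""
    # Assumption: quartiles use the 'inclusive' convention.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median(values),
        "q3": q3,
        "max": max(values),
    }


# Hypothetical % of CD19+ B cells in one subset for five subjects.
baseline = [10.0, 12.0, 8.0, 15.0, 11.0]
stimulated = [14.0, 13.0, 12.0, 20.0, 11.0]

changes = change_from_baseline(baseline, stimulated)
summary = five_number_summary(changes)
```

A positive change indicates that the subset expanded with stimulation, and a negative change indicates that it contracted, which matches the sign convention of panels E and F.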

In summary, B cell subset proportions were mostly comparable at baseline between groups, though activation levels in naïve, GC, IgM C-S, and Switch Memory B cells were significantly higher in the HIV-1-infected group. After stimulation, however, proportions of subsets were comparable with the exception of naïve and BND anergic cells, as were activation levels, suggesting that B cells of all subsets were capable of achieving activation levels similar to control subjects upon stimulation. Interestingly, the only exception to increased activation was in GC B cells from HIV-1-infected patients. While the proportion of these cells was significantly increased during stimulation, activation was not. Of note, however, activation levels measured post-stimulation were not different from control subjects, possibly suggesting that GC activation had already peaked at baseline in HIV-1-infected patients.
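The subset definitions in the Figure 4.8 legend can be read as a set of gating rules on surface markers. The following Python sketch expresses them that way, purely as an illustration. It assumes each CD19+ cell has already been called positive or negative for each marker, and it assumes an order of precedence (GC is checked first) that the legend does not specify, because the GC definition overlaps with the IgD- memory definitions. The function names and example cells are hypothetical.

```python
from collections import Counter


def classify_b_cell(igd, igm, cd27, cd10, cd38):
    """Assign a CD19+ B cell to a subset from boolean marker calls.

    True means marker-positive. Checking GC first is an assumption.
    """
    if not igd and cd27 and cd10 and cd38:
        return "GC"                      # IgD-CD27+CD10+CD38+
    if igd and igm:
        if cd27:
            # Marginal zone: IgD+IgM+CD27+CD10-
            return "Marginal Zone" if not cd10 else "Unclassified"
        # Transitional: IgD+IgM+CD27-CD10+; Naive: IgD+IgM+CD27-CD10-
        return "Transitional" if cd10 else "Naive"
    if igd and not igm:
        # IgD C-S: IgD+IgM-CD27+; BND anergic: IgD+IgM-CD27-
        return "IgD C-S" if cd27 else "BND Anergic"
    if not igd and igm and cd27:
        return "IgM C-S"                 # IgD-IgM+CD27+
    if not igd and not igm:
        return "Switch Memory"           # IgD-IgM-
    return "Unclassified"


def subset_proportions(cells):
    """Percentage of cells falling in each subset."""
    counts = Counter(classify_b_cell(*cell) for cell in cells)
    total = len(cells)
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical cells as (IgD, IgM, CD27, CD10, CD38) marker calls.
example_cells = [
    (True, True, False, False, False),   # naive
    (True, True, False, False, False),   # naive
    (False, False, True, True, True),    # GC
    (False, False, True, False, False),  # switch memory
]
proportions = subset_proportions(example_cells)
```

In practice, gating is performed on continuous fluorescence intensities in flow cytometry software. The boolean inputs here stand in for the outcome of that step.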

AID mRNA expression correlates with BND, GC, and IgM C-S B cell subsets. To determine whether high baseline AID mRNA expression in HIV-1-infected patients was related to high levels of B cell activation, we correlated baseline and post-stimulation AID mRNA levels with proportions of total B cell subsets and activated B cell subsets (Figure 4.9). Baseline AID expression was negatively correlated with total BND Anergic cell proportions (Figure 4.9A), a cell population characterized as having unmutated germline VH sequences, as well as with naïve BND Anergic cells (CD21+, p=0.004; data not shown). Baseline AID was positively correlated with GC B cells (Figure 4.9B), although, surprisingly, not with activated (CD21-, CD40+, or CD86+) GC B cells (p>0.37). Baseline AID expression also positively correlated with IgM C-S B cells, both total proportions (Figure 4.9C) and CD21+ proportions (p=0.03), suggesting that elevated levels of AID mRNA in HIV-1-infected patients may be driving the increased proportions of IgM C-S B cells found in our HIV-1-infected patients. Post-stimulation AID mRNA levels correlated only with post-stimulation total IgM C-S B cell proportions (Figure 4.9D) and, as with baseline results, CD21+ IgM C-S B cell proportions (p=0.04).
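The correlations in this section are Spearman rank correlations, as stated in the figure legends. For readers unfamiliar with the statistic, a minimal pure-Python sketch follows: each variable is converted to ranks (tied values receive the average of their ranks) and the Pearson correlation of the ranks is computed. The function names and the example values are hypothetical and are not study data. The sketch returns only the coefficient and does not compute the p value reported in the figures.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need paired observations from at least 3 subjects")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical relative AID mRNA values and % BND anergic cells.
aid_expression = [-5.0, -4.5, -4.0, -3.5, -3.0]
bnd_percent = [20.0, 15.0, 16.0, 8.0, 5.0]
rho = spearman_rho(aid_expression, bnd_percent)
```

Because the statistic depends only on rank order, it is insensitive to the scale on which AID mRNA expression is reported.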

T cell subsets are more activated during HIV-1 infection. We next measured activation levels in CD4+ and CD8+ T cell subsets in patients from the 454-pyrosequenced dataset. The subsets tested were total, PD-1+, naïve (CD45RA+CD27+), central memory (CM; CD45RA-CD27+), effector memory (EM; CD45RA-CD27-), and terminally differentiated effector memory (TD; CD45RA+CD27-) CD4+ and CD8+ T cells, as well as CD4+ follicular helper (TFH; CD4+CXCR5+PD-1+) T cells. Activation was determined by expression of HLA-DR and CD38.

Total proportions of CD4+ T cells were significantly decreased in the HIV-1-infected group with a concurrent increase in CD8+ T cells (Figure 4.10A), as commonly reported in the literature [7, 23, 223]. Both CD4+ and CD8+ T cells were also more activated in HIV-1-infected patients compared with control subjects (Figures 4.10B and C). Levels of PD-1, a marker commonly used to identify T cell exhaustion in non-TFH T cells, were significantly higher in CD4+ T cells from the HIV-1-infected group, as has also been previously seen during chronic HIV-1 infection (Figure 4.10B) [224-226]. Surprisingly, PD-1 levels were not significantly elevated in CD8+ T cells in HIV-1-infected patients (Figure 4.10C). In general, all other CD4+ T cell subsets were similar, with only a slight but significant increase in effector memory cells (Figure 4.10B). All subsets, however, were more activated in the HIV-1-infected group (p<0.03), a reflection of constant antigen stimulation during chronic HIV-1 infection. Similar to CD4+ T cells, the CD8+ effector memory population was also significantly higher in the HIV-1-infected patients compared with controls (Figure 4.10C). The proportion of naïve cells was decreased. All CD8+ T cell subsets were significantly more activated in the HIV-1-infected group as well (p<0.0001).

[Figure 4.9, panels A-D: scatter plots with relative AID mRNA expression on the x-axis. A) Baseline AID mRNA Expression and Baseline BND Anergic B Cells (y-axis: % CD19+IgD+IgM-CD27- cells; r = -0.66). B) Baseline AID mRNA Expression and Baseline GC B Cells (y-axis: % CD19+IgD-CD27+CD10+CD38+ cells; r = 0.50). C) Baseline AID mRNA Expression and Baseline IgM C-S B Cells (y-axis: % CD19+IgD-IgM+CD27+ cells; r = 0.60, p = 0.002). D) Post-Stimulation AID mRNA Expression and Post-Stimulation IgM C-S B Cells (y-axis: % CD19+IgD-IgM+CD27+ cells; r = 0.52, p = 0.02).]

Figure 4.9: Baseline and Stimulated AID mRNA Expression Correlate with BND, GC, and IgM C-S B Cell Subsets. Spearman correlations were calculated for baseline AID mRNA expression and baseline proportions of A) BND Anergic B cells, B) germinal center (GC) B cells, or C) class-switched IgM (IgM C-S) B cells, and D) post-stimulation AID mRNA expression and proportions of IgM C-S B cells.

T cell subsets and activation correlate with AID, VH3-IgD SHM frequency, and B cell subsets. We have previously shown in the VH3-IgG cloned dataset that AID mRNA expression was correlated with both CD4+ and CD8+ T cell activation, though not B cell activation. Thus, we tested whether AID expression correlated with T cell subset proportions and activation levels in the 454-pyrosequenced dataset as well (Figure 4.11). While baseline AID mRNA expression did positively correlate with activation in CD8+ T cells (Figure 4.11A), as before, it surprisingly did not correlate with CD4+ T cell activation (p=0.18). Interestingly, baseline AID mRNA expression also correlated with activated proportions of TFH cells (Figure 4.11B), though not with total proportions of TFH cells. As these T cells are the predominant T cell subset in GCs, this result is not surprising and potentially provides another mechanism of AID mRNA upregulation during HIV-1 infection.

Baseline TFH proportions were also found to correlate with both VH3-IgA (Figure 4.12A) and VH3-IgG (Figure 4.12B) SHM frequencies, despite the frequencies in VH3-IgG sequences being lower. No correlations were seen between TFH proportions and VH3-IgD or VH3-IgM sequence SHM frequencies. It is interesting to note that TFH proportions only correlate with the isotypes commonly associated with T cell-dependent GCs and not with isotypes commonly believed to be of a more naïve phenotype or to class-switch outside of GCs. Surprisingly, however, neither VH3-IgA (p=0.46) nor VH3-IgG (p=1.00) SHM frequencies correlated with GC B cell proportions or with AID mRNA expression (Figures 4.7C and D).

[Figure 4.10, panels A-C: A) Baseline CD4+ and CD8+ T Cells (y-axis: % positive on CD3+ cells; groups: Control and HIV-1+ for CD4+ and CD8+). B) Baseline CD4+ T Cell Subsets (y-axis: % positive on CD4+ cells; groups: Activ., PD-1, Naive, TFH, CM, EM, TD). C) Baseline CD8+ T Cell Subsets (y-axis: % positive on CD8+ cells; groups: Activ., PD-1, Naive, CM, EM, TD).]

Figure 4.10: T Cell Subsets in 454-Pyrosequenced PBMCs. CD3+ T cells were divided into subsets based on marker expression as measured by flow cytometry. A) The proportion of CD3+CD4+ and CD3+CD8+ T cells was calculated in control subjects (black circles, n=13) and HIV-1-infected patients (HIV-1+, white triangles, n=19). Subset proportions were also measured in control subjects (gray bars) and HIV-1-infected patients (HIV-1+, white bars) in CD4+ T cells (B) and CD8+ T cells (C). Cell subsets are defined as Activated (Activ.; HLA-DR+CD38+), PD-1 (PD-1+), Naïve (CD45RA+CD27+), Follicular Helper (TFH; CXCR5+PD-1+), Central Memory (CM; CD45RA-CD27+), Effector Memory (EM; CD45RA-CD27-), and Terminally Differentiated Effector Memory (TD; CD45RA+CD27-) T cells. Median, interquartile range, and minimum and maximum values per group are represented in each data set.

[Figure 4.11, panels A and B: scatter plots with relative AID expression on the y-axis. A) Baseline AID mRNA Expression and Baseline Activated CD8+ T Cells (x-axis: % CD3+CD8+HLA-DR+CD38+ T cells; r = 0.46, p = 0.04). B) Baseline AID mRNA Expression and Baseline Activated TFH T Cells (x-axis: % CD3+CD4+CXCR5+PD-1+HLA-DR+CD38+ T cells; r = 0.49, p = 0.03).]

Figure 4.11: CD8+ T Cell Activation and TFH Activation Correlate with Baseline AID mRNA Expression. Spearman correlations were calculated for baseline AID mRNA expression and A) baseline activated CD8+ T cells and B) baseline activated follicular helper (TFH) T cells.

[Figure 4.12, panels A and B: scatter plots with % CD3+CD4+CXCR5+PD-1+ T cells on the x-axis and % CDR amino acid replacement mutations on the y-axis. A) IgA CDR Amino Acid Replacement Mutation Frequency and Baseline TFH T Cells (r = -0.93, p = 0.007). B) IgG CDR Amino Acid Replacement Mutation Frequency and Baseline TFH T Cells (r = -0.74, p = 0.05).]

Figure 4.12: TFH Cell Proportions Correlate with VH3-IgA and VH3-IgG SHM Frequencies. Spearman correlations were calculated for baseline proportions of follicular helper (TFH) T cells and A) VH3-IgA CDR amino acid SHM frequency and B) VH3-IgG CDR amino acid SHM frequency.

Finally, we correlated activated T cells and T cell subsets with B cell subsets. We found T cell activation, both CD4+ (Figure 4.13A) and CD8+ (Figure 4.13B), to be negatively correlated with baseline BND Anergic B cells, possibly suggesting a role in T cell-mediated maturation or deletion of this B cell subset. Similarly, a significant negative correlation was also found between CD4+ (Figure 4.13C) and CD8+ (Figure 4.13D) T cell activation and marginal zone (MZ) B cell proportions. Activated TFH T cells, which provide survival signals to GC B cells, were positively correlated with activated GC B cells (Figure 4.13E), suggesting that activated TFH cells may be providing sufficient stimulation to activate GC B cells. Interestingly, however, activated TFH cells did not correlate with total GC B cell proportions, possibly hinting that while activated TFH cells can activate GC B cells, they may not be able to prevent their apoptosis.

[Figure 4.13, panels A-E: scatter plots. A) Baseline Activated CD4+ T Cells and Baseline BND Anergic B Cells (x-axis: % CD3+CD4+HLA-DR+CD38+ T cells; y-axis: % CD19+IgD+IgM-CD27- cells; r = -0.61, p = 0.001). B) Baseline Activated CD8+ T Cells and Baseline BND Anergic B Cells (x-axis: % CD3+CD8+HLA-DR+CD38+ T cells; y-axis: % CD19+IgD+IgM-CD27- cells; r = -0.67, p = 0.0002). C) Baseline Activated CD4+ T Cells and Baseline MZ B Cells (x-axis: % CD3+CD4+HLA-DR+CD38+ T cells; y-axis: % CD19+IgD+IgM+CD27+CD10- cells; r = -0.53, p = 0.005). D) Baseline Activated CD8+ T Cells and Baseline MZ B Cells (x-axis: % CD3+CD8+HLA-DR+CD38+ T cells; y-axis: % CD19+IgD+IgM+CD27+CD10- cells; r = -0.47, p = 0.02). E) Baseline Activated TFH T Cells and Baseline Activated GC B Cells (x-axis: % CD3+CD4+CXCR5+PD-1+HLA-DR+CD38+ T cells; y-axis: % CD19+IgD-CD27+CD10+CD38+CD86+ cells; r = 0.55, p = 0.004).]

Figure 4.13: Activated T Cell Subsets Correlate with Activated B Cell Subsets. Spearman correlations were calculated for baseline proportions of BND Anergic B cells and A) CD4+ T cell activation and B) CD8+ T cell activation; baseline proportions of marginal zone (MZ) B cells and C) CD4+ T cell activation and D) CD8+ T cell activation; and E) baseline proportions of activated TFH cells and activated germinal center (GC) B cells.

In summary, baseline AID mRNA expression was found to be higher in HIV-1-infected patients in both the VH3-IgG cloned dataset and the 454-pyrosequenced dataset. AID mRNA expression correlated positively with the increased VH3-IgD SHM frequency we described in the last chapter, but not with SHM frequency in any other isotype. Though AID mRNA expression increased significantly with stimulation in both groups in both datasets, the magnitude of the change was significantly lower in the HIV-1-infected group compared with control subjects in only the VH3-IgG cloned dataset, suggesting that AID mRNA induction may be impaired during HIV-1 infection and that this impairment may affect multiple activation pathways. The expression of the 5 different isoforms of AID, which could have functional implications for AID dimers formed in vivo, was not different between groups. Baseline B cell activation levels were higher in HIV-1-infected patients in both datasets, in line with the high baseline AID mRNA levels, though no significant correlations were found between them. Unlike AID expression, however, we did not detect any significant difference in B cell activation post-stimulation, suggesting that B cells are capable of upregulating activation markers to levels matching controls upon stimulation. Baseline AID mRNA levels did correlate with the high T cell activation levels found in HIV-1-infected patients, both in CD4+ and CD8+ T cells in the VH3-IgG cloned dataset, and in CD8+ T cells and activated TFH T cells in the 454-pyrosequenced dataset. Taken together, these data indicate that increased AID expression may be driven, in part, by high levels of T cell activation during HIV-1 infection. However, while increased AID expression may lead to increased SHM frequencies in B cell subsets which may not participate in T cell-dependent GC reactions (i.e., IgD+ B cells), increased AID expression may not translate to increased SHM activity within a GC environment (i.e., low VH3-IgG SHM frequency).

Discussion

High baseline AID mRNA expression is likely driven by T cell activation

HIV-1 infection results in perturbations of B cell populations and function, particularly high levels of B cell activation and hypergammaglobulinemia [22, 163, 233]. Despite this activity, humoral responses to both recall and neoantigens are diminished in HIV-1-infected patients [140, 142, 164, 198]. Effective antibody responses, including affinity maturation through SHM and isotype class switching, are dependent on the activity of a B cell-specific enzyme, AID [80, 81]. In both the viremic and aviremic HIV-1-infected cohorts from both the VH3-IgG cloned and 454-pyrosequenced datasets, we found high baseline AID mRNA expression. AID is upregulated in response to a variety of stimuli, such as T cell engagement (CD40 signaling) [82], TLR signaling (e.g., TLR-9 [87]), hormone signaling [88], and exposure to many viruses and bacteria [87, 89-91], including HIV-1 [92]. During HIV-1 infection, activated CD4+ T cells, as we also observed in our HIV-1-infected cohorts, and circulating levels of antigens and cytokines likely lead to the high levels of bystander B cell activation reported in multiple studies in HIV-1-infected patients [22, 64, 93, 139, 141, 231, 232]. The high baseline levels of AID expression seen in our datasets are consistent with these results, though none of the serum or plasma biomarkers that we measured, including EndoCab IgM, LPS, and sCD14, were elevated in the HIV-1-infected viremic group (VH3-IgG cloned dataset patients) or correlated with either B cell activation or AID expression. Similarly, neither B cell activation nor AID expression was correlated with HIV-1 viral RNA levels in either dataset. Thus, these markers of systemic antigen and cytokine exposure are not likely directly causing the high frequencies of B cell activation and high baseline AID expression we observed. Indeed, unlike in mice, human B cells do not express receptors for LPS and are not readily activated by LPS. Activated T cell frequencies, including the GC T cell subset TFH cells, were higher in our HIV-1-infected patients, however, and did correlate with AID expression, suggesting that T cell activation, and likely T cell signaling, contributed to the high B cell activation levels and high baseline AID expression.


Induction of AID mRNA expression and B cell differentiation may be impaired during HIV-1 infection

Within the viremic HIV-1-infected patients in the VH3-IgG cloned dataset, we also found decreased AID mRNA expression post-stimulation in response to B cell-specific stimuli compared with those in seronegative control subjects. In contrast, B cell activation markers in the same dataset, also high at baseline, reached equivalent levels in HIV-1-infected patients compared with controls. Differentiation into CD27-expressing memory B cells in response to stimulation, however, was diminished in this set of HIV-1-infected patients. Taken together, these data suggest that in these HIV-1-infected patients, B cells can be activated in response to appropriate stimulation compared with control patients, but may be limited in their ability to advance further in maturation and differentiation.

The significant but decreased capacity to upregulate AID expression seems contrary to the increased levels of B cell activation marker expression with stimulation in cells from viremic HIV-1-infected patients in the VH3-IgG cloned dataset. The stimulants used in our study were selected to mimic activation provided by antigen engagement of the BCR (anti-IgM), by cognate CD4+ T cell binding of CD40 on B cells (anti-CD40), and by T cell-derived cytokine stimulation (IL-4). Whereas B cells from control subjects up-regulated AID mRNA in response to either IL-4 or anti-IgM alone, responses to these single stimuli were reduced in the HIV-1-infected groups, suggesting an intrinsic B cell defect in signaling capacity through these pathways. Analogous to this observation, changes in expression of CD23, an IL-4-responsive gene [234], were reduced, though not significantly, from those in controls. Although AID expression was only marginally upregulated in response to anti-CD40 in both HIV-1-infected groups and controls, expression of the B cell memory marker CD27 was similarly lower in the HIV-1-infected groups compared with controls post-stimulation. Progression from activated mature B cells to memory B cells may require long-term signaling through the CD40 receptor [42, 235], expression of which may be downregulated during HIV-1 infection. Consistent with our results, inefficient signaling in B cells from viremic HIV-1-infected patients, such as may have occurred during stimulation, could result in decreased levels of AID mRNA expression and decreased frequencies of memory B cells.

These results could not be confirmed in the 454-pyrosequenced dataset, however, where we found comparable levels of both AID mRNA expression and CD27+ B cell subsets post-stimulation between HIV-1-infected patients and control subjects, despite using identical reagents and protocols. This discrepancy could be due to the differences between control subjects and HIV-1-infected patients in the VH3-IgG dataset, as described below, or could simply be a sampling error. Despite significant differences in the levels of both AID mRNA expression and CD27+ B cell populations post-stimulation in the VH3-IgG dataset, there is overlap between the two groups, suggesting that at least some of the HIV-1-infected patients even in this dataset were capable of reaching similar levels of expression and differentiation as those in control subjects. A larger number of patient samples of matching demographics may be necessary, therefore, to resolve this discrepancy.


Correlation of high VH3-IgD SHM frequency and high baseline AID implies normal AID function during HIV-1 infection

Potential AID mRNA expression disparities in HIV-1-infected patients may lead to functional disparities as well. As seen in the last chapter, levels of SHM in viremic HIV-1-infected patients in both the VH3-IgG cloned and 454-pyrosequenced datasets were lower in VH3-IgG sequences, either as a direct result of AID activity or possibly because of fewer rounds of affinity maturation. In contrast, SHM frequencies of VH3-IgD sequences were significantly higher in the HIV-1-infected patients. That the IgD SHM frequency, but not the IgG SHM frequency, correlated with baseline AID mRNA expression suggests that the VH3-IgD SHM frequency may be a direct result of AID activity, while the VH3-IgG SHM frequency may be the result of another mechanism. This conclusion would imply, therefore, that AID may be functioning normally during HIV-1 infection.

AID isoform expression does not likely affect AID function during HIV-1 infection

If, in addition to dysregulation of AID mRNA, the activity of AID is abnormal during HIV-1 infection, such activity does not appear to be due to differential expression of different isoforms of the AID enzyme that may also affect function [116]. Dimerization of different isoforms, which can lack functional domains of the enzyme, including the cytidine deaminase functional domain, could confer increased or decreased functional capacity on the operative AID molecule. Our data show for the first time that, while multiple isoforms of AID are expressed at baseline in both control subjects and HIV-1-infected patients, upon stimulation the full-length AID isoform is predominantly expressed, and to a comparable degree in both groups. Thus, isoform expression should not substantially alter the functional capacity of AID during HIV-1 infection.

B cell subset proportions cannot account for discrepancies in SHM frequency

Discrepancies in SHM frequency were not readily explainable by examining the proportions of B cell subsets in HIV-1-infected patients in the 454-pyrosequenced dataset. Indeed, with higher SHM frequencies found in the VH3-IgD sequences, we might have expected to see a higher proportion of class-switched IgD B cells. These B cells are known to harbor extremely high mutation levels in their VH genes [75]. An increase in this population would, therefore, have accounted for the increased SHM frequency. However, no differences were seen in this B cell subset either at baseline or post-stimulation. Further experiments are needed to identify which IgD+ B cell subset or subsets are accumulating high SHM frequencies. Similarly, lower SHM frequencies might have been explained by higher proportions of switch IgG+ memory cells, which may accumulate fewer mutations than similarly isotype-switched plasma cells [236]. The proportion of switched memory B cells was not significantly different in the HIV-1-infected patients, although we did not specifically look at the IgG+ subset of switch memory B cells, only switch memory cells as a whole. Therefore, inclusion of IgA+ switch memory cells may be masking a potential increase in the proportion of IgG+ switch memory cells.


Limitations of the study

Potential limitations of the study include stimulation of the B cells in the mixed PBMC population rather than of purified B cells. However, we chose to maintain the cells in a more physiologic context. Moreover, the B cell-specific stimuli did not appear to significantly activate the T cells, as judged by expression of activation markers, but cytokines from T cells or antigen-presenting cells may have contributed to the effects seen, as they may in vivo. In addition, the seronegative control subjects in the VH3-IgG cloned dataset were more likely to be younger and female than either group of HIV-1-infected patients (Table 1) in the same dataset. Studies in mice and humans have reported lower levels of AID expression in older populations post-stimulation [237]. However, the ages of our control and HIV-infected populations among whom AID expression was measured span overlapping ranges (23-54 for controls, 26-55 for viremics, and 33-62 for aviremics), though the group medians differed (30, 46, and 52, respectively). Indeed, we found no correlation between age and AID expression in our group of seronegative controls or HIV-1-infected patients either at baseline or with stimulation (data not shown). Moreover, an age-related deficit in AID expression would not explain the high levels of AID expression at baseline among the HIV-1-infected adults. Finally, in a model controlling for age, HIV-1 status, and gender, overall differences between all three groups were still significant at baseline (overall p=0.001, p<0.01 for both aviremic vs. controls and viremic vs. controls) and when measuring the change in AID expression post-stimulation (Ct overall p<0.0001, p<0.005 for both aviremic vs. controls and viremic vs. controls). Therefore, discrepancies in gender or age do not appear to be significantly confounding our results. That we could not confirm all results in the more carefully age-, gender-, and ethnicity-matched 454-pyrosequenced dataset, however, casts some doubt on the reliability of the post-stimulation AID mRNA expression and B cell differentiation results. As stated earlier, additional tests will be required to resolve the differing results.


CHAPTER V

SUMMARY AND DISCUSSION

Summary

The goal of this thesis was to provide insights into the mechanism of B cell dysfunction which results in a decreased quantity and quality of antibody responses to infection among HIV-1-infected patients. Several processes direct the development of an effective antibody response, including the diversification of the antibody repertoire through V(D)J recombination, additional nucleotide modifications to enhance antibody affinity (SHM), and improvement of the functionality of an antibody by alteration of its effector constant region (CSR). More limited previous studies and those described herein using novel high-throughput molecular sequencing and analytic tools reveal that V(D)J recombination in the initial naïve circulating B cell repertoire (IgD+IgM+) is comparable during HIV-1 infection to that of healthy control subjects [29]. In contrast, substantial data support the conclusion that CSR is affected by HIV-1 infection by both direct and indirect mechanisms. CSR may be impaired by HIV-1 itself, perhaps by signal blockade by HIV-1 Nef protein [131, 133, 169], and the quantities of class-switched circulating antibodies in serum (IgG and IgA) are increased, likely by indirect non-specific bystander B cell activation from activated T cells [29, 64, 135-137, 140, 202]. However, very little evidence is available in the literature that describes the effect of HIV-1 infection on the process of SHM. Therefore, we focused our studies on characterizing the consequences of HIV-1 infection on SHM, the accumulation of nucleotide point mutations in the variable regions of expressed antibody genes that bind antigen, and on the enzyme required for both CSR and SHM, activation-induced cytidine deaminase.

Disparate frequencies of SHM by isotype during HIV-1 infection

SHM is an antigen-driven process designed to improve the affinity of an antibody for its specific antigen. The introduction of point mutations by AID and the subsequent changes to the expressed protein can increase or decrease the antibody's specificity and avidity for an antigen, making it more or less effective in binding antigen and in supporting the antibody functions that prevent and control infections. Indeed, we identified decreased SHM frequency in nucleotides of expressed VH3-IgG transcripts both by direct cloning and sequencing and by high-throughput sequencing (Figures 3.4 and 3.15) and in the predicted amino acid sequences. These observations may underlie the poor vaccine responses and the higher incidences in HIV-1-infected patients of infections that are normally controlled by VH3-IgG antibodies in healthy controls (e.g., Streptococcus pneumoniae, Haemophilus influenzae, Cryptococcus neoformans, and Salmonella spp.) [65, 144-149, 165-167]. Our data also suggest that antiretroviral treatment may increase the effectiveness of antibody responses in HIV-1-infected patients, as patients successfully treated with Highly-Active Anti-Retroviral Therapy (HAART) with no detectable circulating HIV RNA had SHM frequencies closer to those in control subjects than did untreated HIV-1-infected patients with advanced disease (Figure 3.4). We confirmed that SHM is impaired during HIV-1 infection by revealing for the first time that mutation frequencies in Bcl6, the second most common gene target of AID in vivo, are also significantly reduced (Figure 3.21). Therefore, we concluded that SHM, like CSR, can be impaired during HIV-1 infection. Whether the decreased prevalence of these mutations in VH3-IgG and Bcl6 genes is due to decreased function of the AID protein or differential expression of B cell subsets is under investigation. Preliminary data from our lab suggest that the decreased avidity of IgG reactive with pneumococcal polysaccharides after vaccination of HIV-1-infected adults in vivo is associated with decreased function: the ability of these antibodies to support opsonophagocytic killing of S. pneumoniae in vitro.
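As a concrete illustration of what an SHM frequency measures, the following Python sketch counts the nucleotide positions at which an expressed transcript differs from its germline gene and expresses the count as a percentage of the positions compared. This is a simplified sketch under stated assumptions, not the analysis pipeline used in this thesis: it assumes the two sequences have already been aligned to equal length, it skips gaps and ambiguous bases, and the function name and example sequences are hypothetical.

```python
def shm_frequency(germline, transcript):
    """Percent of compared nucleotide positions that differ from germline.

    Assumes the sequences are pre-aligned to the same length. Positions
    with a gap or an ambiguous base in either sequence are skipped.
    """
    if len(germline) != len(transcript):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mutated = 0
    for g, t in zip(germline.upper(), transcript.upper()):
        if g in "ACGT" and t in "ACGT":
            compared += 1
            if g != t:
                mutated += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mutated / compared


# Hypothetical aligned sequences (not study data): 2 mismatches in 10 bases.
germline_vh = "ACGTACGTAC"
expressed_vh = "ACGTTCGTAA"
frequency = shm_frequency(germline_vh, expressed_vh)
```

Frequencies computed per sequence in this way can then be summarized per subject and compared between groups or correlated with other measurements.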

Much to our surprise, however, results in antibodies of other were not consistent with these decrements in SHM and VH3-IgG. Indeed, in another class-switched isotype,

IgA, despite greater variance, SHM frequencies did not differ significantly between HIV-

1-infected patients and controls (Figure 3.15). Similarly, in IgM transcripts, an isotype expressed in naïve and some memory B cell subsets, SHM frequencies also did not differ between groups. In contrast to results from all other isotypes, however, IgD transcripts showed a significantly increased SHM frequency in half of our patients with HIV-1 infection. IgD, like IgM, is an isotype predominantly expressed in naïve and selective memory B cell subsets. Therefore, HIV-1 infection is associated with discordant and differential effects in different B cell isotypes and, likely, subsets. Because SHM is initiated by AID in vivo and the enzyme is required for this process, we sought to

determine a mechanism for the disparate SHM frequencies by characterizing expression of AID mRNA and its alternatively-spliced isoforms during HIV-1 infection.

High baseline AID mRNA expression and decreased expression post-stimulation

AID mRNA expression was measured by real-time PCR in HIV-1-infected patients. Baseline levels of AID mRNA were significantly higher in HIV-1-infected patients compared with controls. We considered these results in the context of the lower frequencies of SHM in the VH3-IgG transcripts, for which the high AID would seem inconsistent, and for the higher frequency of SHM in half of the HIV-1-infected patients for VH3-IgD, for which the high AID would be consistent. Regarding the VH3-IgG results, a murine study showed that increased AID mRNA expression can yield decreased

SHM frequencies, likely through negative regulation [109], although the exact mechanism is unknown. When we stimulated PBMCs, the absolute level of AID mRNA expression and the change from baseline AID mRNA induced by stimulation were reduced in some patients compared with controls. These data suggest that upon stimulation in vitro by mixed stimuli (engagement of the B cell receptor (BCR) with anti-IgM, and cognate and soluble T cell factors with anti-CD40 and IL-4, respectively) and perhaps by antigen in vivo, the ability to upregulate AID may be diminished, with a consequent diminution of IgG SHM. Regarding the IgD results, baseline AID mRNA expression correlated positively with the high SHM frequencies in VH3-IgD sequences. IgD-expressing B cells comprise the majority of circulating B cells (>60-70%), so the correlation suggests that the primary effect of high AID may be related to the IgD results, whereas the lower change in AID with stimulation may be more closely related to the

lower SHM in class-switched IgG cells that have previously been exposed to antigen stimulation. That neither result was targeted to any individual VH gene(s) or gene segment (V, D, or J) in either isotype indicates that the underlying mechanisms were generalized effects and not likely due to a common antigen, such as HIV-1 itself. Future work will be directed to resolving these differential effects in IgG versus IgD and in baseline versus stimulated AID, initially by sorting relevant B cell subsets and determining the AID expression and frequency of SHM.

B and T cell subsets are more activated during HIV-1 infection

B and T cell subset proportions and activation levels were measured by flow cytometry in HIV-1-infected patients. Consistent with previous reports [22, 64, 218], we found both higher overall B cell activation at baseline (Figure 4.4) and higher activation within the specific B cell subsets examined, including Naïve mature, GC, IgM memory, and switch memory subsets (Figure 4.8). Higher activation levels in these B cell subsets are consistent with the higher baseline AID mRNA discussed earlier, although, interestingly, baseline AID mRNA did not correlate with either overall B cell activation levels or activation levels in any specific subset. Baseline AID mRNA expression did, however, correlate with both GC and IgM Memory cell proportions, both cell subsets where AID mRNA expression may be detected in vivo. Of note, baseline AID mRNA levels were negatively correlated with BND anergic IgD+ B cells, such that higher AID mRNA levels were associated with lower BND anergic B cell proportions (Figure 4.9).

As VH3-IgD SHM frequency had a positive association with AID mRNA expression

(Figure 4.7), these two results taken together suggest that the high SHM frequency in


VH3-IgD sequences is not likely to be found in the BND anergic population. Similarly, high VH3-IgD SHM frequency may also be explained by lower proportions of BND anergic B cells with low or unmutated Ig genes and higher proportions of IgD memory B cells known to harbor high levels of mutation in their Ig genes [73-75].

Overrepresentation of IgD memory cells in the periphery would skew overall SHM frequencies in VH3-IgD sequences. However, overrepresentation of IgD memory B cells was not seen in our HIV-1-infected patients, either in patients with high VH3-IgD SHM frequencies or low frequencies, and similarly, BND anergic B cell proportions were also lower in our HIV-1-infected patients. It may be interesting in future studies, therefore, to characterize more carefully the IgD+ B cell subsets during HIV-1 infection, their activation states, and their degree of SHM.

Consistent with our B cell activation results and studies in the literature [7, 23, 223], T cells in both CD4+ and CD8+ subsets were significantly more activated at baseline in HIV-1-infected patients compared with control subjects. This result remained consistent in the different subsets of CD4+ and CD8+ T cell populations as well (Figures 4.4 and 4.10), which included Naïve, Central Memory (CM), Effector Memory (EM), Terminally Differentiated Effector cells (TD), and additionally for CD4+ T cells, Follicular Helper (TFH) cells. That all T cell subsets identified were more highly activated in the HIV-1-infected patients is reflective of the high levels of circulating antigen and pro-inflammatory cytokines previously reported during HIV-1 infection [22, 141]. Though baseline AID mRNA expression did not correlate with B cell activation levels, it did correlate with T cell activation, both overall CD8+ T cell activation and, more specifically, activated TFH cell proportions (Figure 4.11). That activated TFH cells

would associate with increased AID mRNA expression is contradictory to our results in VH3-IgG gene SHM frequency. TFH cells are the primary T cell population in GCs, where IgG memory cells are induced. High activation of TFH cells and high AID mRNA expression within GCs would suggest that VH3-IgG SHM frequency would be increased, not decreased. Of note, however, VH3-IgG SHM frequency correlated only with TFH cell proportions, but not TFH cell activation (Figure 4.12), suggesting that perhaps a reduction in TFH number in the GC, though not activation levels, may result in a decreased level of

SHM in the IgG B cell subset. While we cannot rule out reduced numbers of TFH cells in

GCs, we did not find a decrease in the number of TFH cells in the periphery in our HIV-1-infected patients. Finally, levels of TFH activation also correlated significantly with levels of GC B cell activation, although, similarly, not with the proportion of GC B cells (Figure

4.13), suggesting that while TFH cells can become activated and may, in turn, provide sufficient stimulus for GC B cells, they may not be capable of preventing apoptosis of

GC B cells. Characterization of GC versus peripheral blood population subsets and activation levels, therefore, will be necessary to discern associations between GC cell populations (i.e. GC B cells and TFH cells), AID expression, and SHM frequency.

Potential mechanisms

AID may be less functional in germinal centers during HIV-1 infection

Despite high baseline levels of AID expression during HIV-1 infection, SHM frequency in VH3-IgG sequences and Bcl6 sequences was low, suggesting that AID may be impaired on a functional level. In addition, as noted above, the ability to stimulate

AID mRNA expression may also be impaired during HIV-1 infection. AID mRNA

expression is induced by stimulation through either immune signaling (i.e. via cognate signaling by engagement of CD40 on B cells by CD40 ligand on CD4+ T cells or by soluble factors such as BAFF or APRIL through BAFF-R/TACI) (Figure 5.1) or by pathogens or their products stimulating B cells (i.e. signaling through TLRs, or in the case of HIV-1, CD40, CD21, and the BCR) [58, 59, 87, 90, 92, 93, 139, 141, 216]. This stimulation can also be impaired by HIV-1. The HIV-1 viral protein Nef has been shown to be taken up by B cells in vivo and in vitro. The presence of Nef can inhibit AID mRNA production by preventing the activation of NF-κB by signaling through CD40 [131]. Similarly, the presence of HIV-1 Vif can also impair induction of AID expression in vitro through an unknown mechanism [117].

In addition, HIV-1 Vif can also impair AID function [117]. AID protein enters the nucleus through active import mediated, in part, by the nuclear localization signal

(NLS) [101] (Figures 1.7 and 5.2). In the nucleus, AID is phosphorylated by Protein

Kinase A (PKA) and possibly PKC [99, 107, 124, 214] and is recruited to the VH region, likely by Spt5, which is itself recruited by stalled RNA Pol II at active transcription sites

[212, 213]. At transcription sites phosphorylated AID associates with Replication Protein

A (RPA), a single stranded DNA binding protein, and CTNNBL-1, another nuclear protein thought to be involved in RNA splicing [213, 238, 239]. Deamination of cytidine residues then follows, creating U:G lesions in the transcribed DNA [84, 103, 104]. U:G lesions are subsequently mutagenically repaired by several DNA repair pathways, including Base Excision Repair (BER) and Mismatch Repair (MMR) pathways [120,

121].


Figure 5.1: Induction of AID mRNA Expression. Activation of several signaling pathways (BAFF/BAFF-R, CD40/CD40L, TLR/TLR ligands, and APRIL/TACI) has been shown to lead to activation of NF-κB and, when coupled with other transcription factors at the AID promoter (E47, STAT6, Pax5, E2A, and/or estrogen), to upregulate AID mRNA expression. HIV-1 Nef has been shown to block NF-κB activation. Inhibition of AID mRNA induction has been shown for HIV-1 Vif and for Ca2+/calmodulin or progesterone binding at the AID promoter. After transcription, AID mRNA can also be targeted for degradation by binding of the microRNAs miR-155 and miR-181B to the 3’ end of the AID mRNA transcript.


Activity of AID within the nucleus is controlled by several mechanisms. Most

AID is found in the cytoplasm [99, 100]. AID is actively exported out of the nucleus by

CRM-1 associating with the nuclear export signal (NES) at the 3’ end of AID [101,

102, 213]. AID is also targeted for degradation in the nucleus by polyubiquitination

[108]. Polyubiquitination targets AID for proteasomal degradation. How HIV-1 Vif impairs AID in vitro is not known; however, it does require direct Vif-AID protein interaction [117].

Inhibition of AID function may explain the decreased SHM frequency that we have found in VH3-IgG sequences as well as in another AID target, Bcl6. Impairment of function would lead to a decrease in the creation of U:G lesions in transcribed DNA without affecting the downstream low-fidelity repair mechanisms. Thus, lower AID activity would lead to fewer mutations but not change the pattern of mutation, just as we see in our

VH3-IgG sequences (Tables 3.3-3.6). Since SHM of IgD- B cells and expression of both

Bcl6 and AID are most often localized to the GC, the SHM decrement that we found in

VH3-IgG genes is most likely an effect in GCs.

Figure 5.2: AID Protein Function. AID concentration within the nucleus and cytoplasm is controlled by active import and export of the AID protein through the nuclear pore. Mediators of active import have not been identified; however, active export is believed to be mediated by CRM-1. AID concentrations within the nucleus are also controlled by polyubiquitination and subsequent proteasomal degradation. Inside the nucleus, AID is phosphorylated at multiple sites by PKA and PKC. AID is believed to be targeted to actively transcribed target sequences (including VH and Bcl6 genes) by Spt5, which is itself recruited by stalled RNA Pol II. Phosphorylated AID associates with RPA and CTNNBL-1.

An overall deficit in AID function cannot explain either the increased SHM frequency in VH3-IgD sequences, which may not have undergone a GC reaction, or the normal SHM frequencies in VH3-IgA and VH3-IgM sequences. Inhibition of AID, therefore, would have to occur only in B cells which class switch to IgG. Such a process, in which AID is impaired in B cells that are prone to switch to IgG, may occur in GCs where HIV-1 preferentially replicates and high concentrations of HIV-1 viral products accumulate, and/or where viral products are the target antigen. The majority of anti-HIV-1 antibodies induced in GCs is of the IgG isotype [240-242]. If affinity maturation to HIV-1 viral products, particularly envelope peptides, drives B cell activation and class switch to IgG, the presence of viral products in the GC (such as Nef and Vif) may be able to specifically impair AID expression and function in cells likely to class switch to IgG.

Determining AID expression and function in specific B cell subsets, both in the periphery and in GCs, therefore, would be instructive in determining whether, and where, AID expression and function may be impaired during HIV-1 infection.

Early dissolution of the germinal center during HIV-1 infection

An alternative explanation for the low SHM frequency in VH3-IgG genes is inadequate GC activity, which typically includes multiple rounds of replication in the dark zone and selection by FDCs in the light zone. These processes may be compromised by the early dissolution of GC integrity as a result of loss of CD4+ T cells, attenuation of the

FDC network and progressive fibrosis in these tissues [21, 223]. If B cells cannot complete enough rounds of proliferation, mutation, and selection to achieve a level of affinity maturation necessary to obtain high affinity effective antibodies, these B cells will potentially accumulate fewer mutations than the number found in B cells which have undergone a complete germinal center reaction. Several factors are required for germinal center maintenance (Figure 5.3). Integrin ligands (Vascular Cell Adhesion Molecule-1,

VCAM-1, and Intercellular Adhesion Molecule-1, ICAM-1), BAFF, Notch ligands

(Delta-like 1 and Jagged1), and PD-L1 are all produced by resident FDCs in the light zone where selection of high affinity BCR-expressing B cells is thought to occur [45,

204]. CD40L, CD28, ICOS, PD-1, IL-4, and IL-21 are produced by TFH cells, also in the light zone [43, 45, 46, 204, 236]. Blockade of any of these factors results in a reduction of GC cell survival and dissolution of the GC [31, 43, 45, 46, 204, 205, 235, 243, 244]. B cell factors may also affect affinity maturation. Overexpression of anti-apoptotic signals

Bcl-xL, Bcl-2, and Bim can prevent apoptosis of low affinity B cells, resulting in their prolonged survival and accumulation in the germinal center [205, 245]. Only blockade of some of these factors, however, leads to a reduction in affinity maturation, as seen in the

VH3-IgG sequences of our HIV-1-infected patients (Table 5.1).

Several mutation and blocking studies have been done to illuminate the role of these factors in germinal center function in vitro and in vivo (Table 5.1). Disruption of

BAFF or Notch signaling affects B cell survival and GC size, but not affinity maturation

[45]. Overexpression of Bcl-xL, similar to blocking PD-1/PD-L1 signaling, results in an increase in affinity maturation [45, 204, 236, 245]. Affinity maturation is reduced, however, when signaling through integrin ligands, CD40, CD28, ICOS, and IL-21 is interrupted, and when Bcl-2 and Bim are overexpressed in B cells [45, 204, 205, 236,

245], similar to our results in the VH3-IgG sequences. However, reduced levels of integrin ligands or blocking CD40L or CD28 signaling results in a reduction in serum

IgG levels, a phenotype not found in our subjects nor reported in other HIV-1 studies [45,

205]. Bcl-2 and Bim are also not likely candidates in this case, as overexpression of these proteins results in increased numbers of memory B cells, the opposite of what is seen during HIV-1 infection [205, 245]. Disruption of ICOS signaling leads to decreased affinity maturation, but has no effect on the numbers of memory B cells produced [205,

236], contrary to results in HIV-1 studies reporting fewer memory B cells [19, 22, 65, 149]. Blockade of IL-4 and IL-21 signaling resulted in lower affinity maturation and lower plasma and memory cell numbers [45, 204, 205, 236]. In addition, Bcl6 expression is maintained in GC B cells by IL-21 signaling, and impairment of Bcl6 led to a decrease in SHM in IgG-expressing B cells [205, 244], matching our phenotype in VH3-IgG-specific cells. The significantly lower mutation frequency we found in Bcl6 may also be explained by a deficiency in IL-21 signaling, whether in levels of the cytokine or in the receptor and its downstream signaling network, as a resulting decrease in Bcl6 transcription would allow for less AID-mediated SHM.

Figure 5.3: Germinal Center Signaling. Germinal centers are composed of dark zones and light zones. CXCR4+ centroblasts traffic to dark zones in response to an SDF-1 (CXCL12) chemical gradient in the dark zone produced by stromal cells. Centroblasts undergo SHM and several rounds of proliferation, then downregulate CXCR4 expression and upregulate CXCR5 expression. CXCR5 expression homes the non-proliferating centrocytes (formerly centroblasts) to the light zone through its attraction to CXCL13, produced in the light zone by FDCs. Centrocytes express the newly mutated BCR, and B cells expressing high affinity BCRs are positively selected. High affinity B cells receive survival signals (ICOS, IL-4, IL-21, PD-1, CD28, CD40L, BAFF, IL-6) from FDCs and TFH cells. Low affinity B cells do not receive survival signals and undergo apoptosis. High affinity B cells can return to the dark zone for further rounds of proliferation and SHM by upregulating CXCR4 expression, or they can undergo CSR and differentiate into memory B cells or plasmablasts.

Table 5.1: Factors required for Germinal Center Maintenance.

Factor              Affinity Maturation   Memory Cell/Plasma Cell Number   Other
BAFF                Unknown               Decreased
Notch Ligands       Unknown               Decreased
Bcl-xL              Increased             Increased
PD-1/L1             Increased             Decreased
Integrin Ligands    Decreased             Decreased                        Decreased IgG Titers
CD40L               Decreased             Decreased                        Decreased IgG Titers
CD28                Decreased             Decreased                        Decreased IgG Titers
Bcl-2               Decreased             Increased
Bim                 Decreased             Increased
ICOS                Decreased             No Change
IL-4                Decreased             Decreased                        Decreased CSR
IL-21               Decreased             Decreased                        Decreased Bcl6 expression

Blockade or knockout of the above factors can affect affinity maturation (increased or decreased levels) and the numbers of memory cells and plasma cells produced (increased or decreased numbers).


During HIV-1 infection, serum IL-21 levels have been reported to be decreased beginning early in infection and to decrease progressively in association with decreasing CD4+ T cell numbers in later stages of the infection [206, 246]. IL-21 is produced by

TFH cells [43, 45, 46, 246]. Although TFH cell numbers have not been shown to be decreased during HIV-1 infection in our ongoing studies in blood and in others [203] or in lymph nodes [247], levels of IL-21 produced by TFH cells infected with HIV-1 are significantly lower [206]. In addition, though CD4+ T cell counts increase with successful antiretroviral therapy (ART), IL-21 serum levels remain significantly lower than those measured in healthy controls. This lack of effect of ART on IL-21 levels is similar to the persistently decreased frequencies of SHM in VH3-IgG genes in aviremic patients on therapy (Figure 3.4) and is consistent with reports showing limited reconstitution of B cell subsets during ART [206]. This inadequate recovery with ART could be due, in part, to persistent effects of HIV-1, as low levels of viral replication continue in GCs during successful antiretroviral treatment [248]. Though levels of virus are undetectable in the serum, low copy numbers of viral RNA have been detected within germinal centers. With ongoing viral replication occurring in germinal centers during treatment, infected TFH cells, thought to be a major viral reservoir in these tissues [203], may be producing lower amounts of IL-21 in response to activation compared with their uninfected counterparts

[206]. Similarly, a recent report highlighted the importance of the interactions among these signals: binding of PD-L1, which is highly expressed on lymph node GC B cells in patients with HIV-1 infection, to PD-1 on TFH cells was found to lead to a decrease in both ICOS and IL-21 production by TFH cells [247].


A putative loss of IL-21 signaling in germinal centers provides an intriguing hypothesis to explain the decreased SHM in VH3-IgG sequences, but cannot explain the increased SHM observed in VH3-IgD sequences, nor the normal levels of SHM found in

VH3-IgM and VH3-IgA sequences. The effect of IL-21 signal disruption would have to either be focused in GCs responsive to antigens that lead to IgG class-switch exclusively, or only have a significant impact on Bcl6 expression without affecting B cell survival or affinity maturation in IgA class-switched B cells. As discussed above, IgG-specific SHM decrements may occur in GCs where HIV-1 viral products are the target antigen. The majority of anti-HIV-1 antibodies produced in germinal centers are of the IgG isotype

[240-242]. If affinity maturation to HIV-1 viral products drives class switch to IgG, IL-

21 signaling may be most highly affected in germinal centers where TFH cells are virally infected and a) producing lower levels of IL-21 as a consequence of infection and b) productively replicating virus and therefore creating increased levels of antigen in the germinal center. Therefore, investigation of IL-21 and its effects on Bcl6 expression and

SHM in IgG+ B cells during HIV-1 infection may provide a mechanism for the reduced

SHM we have described in this B cell compartment. None of our studies to date have included stimulation of T cells in the context of B cell activation. Such an approach will likely provide complementary information on the T-B interaction during HIV-1 infection.

Abnormal trafficking within germinal centers during HIV-1 infection

Signaling deficiencies in GCs during HIV-1 infection also extend to GC B cell trafficking between the dark zone and light zone [22, 249, 250]. The dark zone has been proposed to be the site of SHM, whereas CSR is thought to occur in the light zone


(Figure 5.3) [42, 45]. Trafficking between dark and light zones is determined by chemical gradients produced by stromal cells in the dark zone (SDF-1/CXCL12) and

FDCs in the light zone (CXCL13). Centroblasts in the dark zone express CXCR4, which is attracted to its ligand CXCL12 and thus directs B cells within this GC area. However, transformation into centrocytes results in downregulation of CXCR4 and upregulation of CXCR5. The activating ligand for CXCR5, CXCL13, is produced in high concentrations in the light zone and directs centrocytes to leave the dark zone and migrate to the light zone [42-44]. Interrupted trafficking could lead to accumulation of B cells in one zone or the other. Accumulation in the dark zone could lead to increased frequencies of SHM without additional CSR, consistent with the high

SHM frequencies in our VH3-IgD sequences. Alternatively, B cells accumulating in the light zone after fewer cycles through the dark zone could produce class-switched antibodies with low SHM frequencies, consistent with the low SHM frequencies in our

VH3-IgG sequences. Lower levels of both CXCR4 and CXCR5 on B cells with an exhausted phenotype and decreased CXCR5 levels in peripheral B cells have been reported during HIV-1 infection [22, 250]. However, despite lower CXCR5 levels, in vitro chemotactic ability towards CXCL13 and in CXCR4+ B cells towards CXCL12 was actually increased in HIV-1-infected patients with low CD4+ T cell counts compared with healthy controls [250]. Therefore, altered GC trafficking patterns during HIV-1 infection are not likely contributing prominently to the disparate SHM frequencies seen in our patient cohort.


Non-specific, T cell-independent B cell activation during HIV-1 infection

Excessive B cell activation during HIV-1 infection is manifested in several ways

[19, 22, 24, 64, 65, 218]. Hypergammaglobulinemia, high serum levels of autoantibodies, lymphadenopathy, and high incidence of B cell lymphomas are all characteristics of chronic HIV-1 infection [19, 29, 64, 135, 136]. Several mechanisms have been proposed to explain high B cell activation both directly and indirectly by HIV-1. Directly, HIV-1 Env glycoproteins can bind B cells via C-type lectin receptors [141] and via BCRs encoded by specific VH3 genes [175, 176]. HIV-1 gp160 binding to B cells can also lead to activation of the B cell in the presence of T cells [216]. HIV-1 Tat stimulates B cell activation, although its mechanism is unknown [19]. Indirectly, B cells have been shown to bind complement proteins associated with HIV-1 virions via the

CD21 complement receptor [93]. Similarly, HIV-1 virions can also incorporate host

CD40L into their envelopes during viral budding which can bind CD40 on B cells [92,

139]. HIV-1 Nef has been shown to induce ferritin protein production in macrophages which can also stimulate B cells [22, 249]. Finally, high levels of circulating cytokines and chemokines such as IFN, TNF, IL-6, IL-10, CD40L, and BAFF produced by monocytes, macrophages, DCs, activated CD4+ T cells, or activated epithelial and endothelial cells have been reported during HIV-1 infection and can lead to increased B cell activation [20, 22, 55, 58, 59, 141, 145, 251, 252]. In addition to increased levels of

B cell activation, several of these mechanisms have been shown to induce AID expression both within and outside germinal centers [58, 59, 93, 139, 141].


SHM may occur outside of the traditional T cell-dependent (TD) GC environment

[42, 45]. GCs may be generated in response to high doses of T cell-independent (TI) antigens. High concentrations of antigen in addition to stimulatory cytokines from nearby antigen presenting cells can support the formation of a GC without the need for T cell help. These TI-generated GCs are short-lived, however, and collapse within days

[43, 45]. SHM has been shown to occur in the absence of CSR, which is reminiscent of the high SHM found in VH3-IgD sequences, although early collapse of these TI GCs would likely prevent the accumulation of high frequencies of somatic mutations [236].

Of note, the introduction of CD40L signaling within TI GCs may rescue early dissolution of these GCs [45]. CD4+ T cells are also highly activated during HIV-1 infection [7, 19,

23] (see Chapter IV). It is possible that high circulating levels of T cell-derived cytokines or immune complexes on FDCs which can signal through CD40 could potentially extend these short-lived GC reactions in the absence of TFH cells within the GCs [45]. Similarly,

HIV-1 virions themselves may provide T cell signals in the form of host CD40L incorporated into the viral envelope during budding from activated, HIV-1-infected CD4+ T cells [92, 139].

Binding of virion-bound CD40L to CD40 on activated B cells could prolong TI GC B cell survival and allow accumulation of high levels of point mutations. Indeed, addition of non-T cell-derived CD40L to TI GCs has been shown to rescue B cell centrocyte apoptosis and induce transformation back into centroblasts, in which SHM predominantly occurs [45]. SHM frequency can also increase when antigen is presented as immune complexes [45]. Class switch in a TI GC environment appears to be rare, however [236].

Additionally, CSR outside of GC (whether TD or TI) predominantly leads to switch to


IgM and IgA isotypes [58, 59]. Therefore, non-specific B cell activation and subsequent development of TI GCs may underlie the high SHM frequency seen in our VH3-IgD sequences.

In summary, several mechanisms may be affecting SHM during HIV-1 infection

(Figure 5.4). Within the GC B cell, impairment of AID expression and function by the

HIV-1 viral proteins Nef and Vif may directly decrease AID activity. Concurrently, outside the GC B cell, GC signaling may be interrupted during HIV-1 infection, preventing the survival and differentiation of high affinity isotype-switched B cells.

Blocking of IL-21 expression by TFH cells through either active HIV-1 replication or signaling through PD-1 can prevent necessary levels of survival signals from accumulating in the GC and lead to apoptosis of B cells potentially expressing high affinity antibodies. Simultaneously, lack of IL-21 signaling in both B and T cells prevents upregulation of Bcl6 expression, which, in addition to other transcription factors, controls CXCR4, CXCR5, CD40L, ICOS, CXCL13, and PD-1 expression [246], further exacerbating aberrant GC signaling. Outside the TD GC, non-specifically activated B cells with increased AID expression in the presence of large amounts of antigen may be forming TI GCs and accumulating high numbers of mutations without undergoing class switch. Correction of these deficiencies through therapeutic interventions (such as supplemental IL-21) may allow HIV-1-infected patients to develop effective antibody responses to secondary infections, vaccines, and HIV-1.


Figure 5.4: Impairment of AID, IL-21 Signaling Deficiencies, and T Cell-Independent Germinal Center Formation during HIV-1 Infection. In TD GCs, impairment of AID expression can be mediated by both HIV-1 viral proteins Nef and Vif. Impairment of AID function can be mediated by Vif. HIV-1 replication within TFH cells as well as PD-L1 signaling from B cells can downregulate IL-21 expression in TFH cells. Impairment of IL-21 production and signaling results in decreased induction of Bcl6 expression in both B and T cells which can lead to decreased levels of CXCR4, CXCR5, CD40L, ICOS, CXCL13, and PD-1 expression, and can also result in decreased B cell survival and differentiation. X indicates blockade or impairment. In TI GCs, B cells can be stimulated through surface receptor signaling by HIV-1 viral proteins and virion-associated CD40L and complement. Activation and survival signals can be released from activated antigen-presenting cells (APC), including IFN, TNF, IL-6, IL-10, CD40L, and BAFF.


Future directions

Which B cell subsets have disparate SHM frequencies during HIV-1 infection?

The discovery of high SHM frequencies in VH3-IgD genes was surprising and intriguing. IgD+ B cells represent the majority of circulating B cells, but the proportions of this subset were not significantly different in patients with and without HIV-1 infection. As a result, it is difficult to explain the phenotype of the cells harboring the significantly increased SHM frequencies beyond their isotype. Since similarly increased rates of SHM frequency were not seen in VH3-IgM sequences, it seems unlikely that these cells are naïve IgD+IgM+ B cells. Sorting IgD+ B cells into different subsets may determine which subset is harboring the increased SHM frequency.

Freshly isolated PBMC from HIV-1-infected and control subjects would be sorted into IgD+ B cell subsets by flow cytometry based on B cell marker expression. Cells would be sorted into class-switched IgD cells (IgD+IgM-CD27+; <5% of control B cells), which have been shown to have 2-3-fold higher SHM frequencies compared with other isotypes [73, 75]; BND anergic cells (IgD+IgM-CD27-; <5-10% of control B cells), which are not class-switched and in which no SHM is reported [73]; Marginal Zone-like B cells (IgD+IgM+CD27+; 5-10% of control B cells), which are proposed to respond to T cell-independent antigens; and Naïve cells (IgD+IgM+CD27-CD10-), which are the largest subset (40-50% of control B cells) and typically show few to no mutations. mRNA would be extracted from each of the subsets and VH3 genes would be sequenced to determine SHM frequency. It is possible that more than one

subset may have increased SHM frequencies. If this is the case, additional B cell markers

(i.e. activation markers such as CD40, CD80, and CD86 or plasma cell markers such as

CD38 and CD138) could be used to further differentiate the IgD+ B cell subset populations, as would staining for AID protein and measuring AID mRNA in each population.

Similarly, further characterization of IgG+ B cells with low SHM frequency may direct additional experiments to determine the mechanism of decreased SHM in VH genes from these cells. IgG+ B cells could be sorted based on B cell marker expression such as germinal center markers (IgD-CD38+CD10+CD27-) with relevant receptors (IL-21R, PD-L1, CD40), activation levels (CD80, CD86), and plasma cell markers (CD38+CD20-CD10-). mRNA from sorted cells would be tested for AID and Bcl6 expression and sequenced for SHM frequency in VH3 and Bcl6. This experiment would clarify in which class-switched subsets low SHM is occurring and its relationship to AID expression.

Additionally, VH-IgG sequencing could be expanded to other VH families, especially VH4, the second most commonly expressed family, and VH1, the third most common, which is frequently utilized by broadly neutralizing antibodies against HIV-1 [253]. Indeed, the development of broadly neutralizing antibodies requires high levels of mutation accumulation [249]. This assay could be used to determine whether broadly neutralizing antibodies are only rarely developed in HIV-1-infected patients because of low frequencies of SHM in HIV-1-specific VH-IgG genes specifically or in VH-IgG genes overall.

Which B cell subsets have high AID expression?

We have found high baseline AID expression in circulating B cells during HIV-1 infection. However, we have not determined which B cell subset is expressing the increased levels of AID mRNA. If these are naïve activated B cells, which are increased during HIV-1 infection [22, 249], these cells may or may not accumulate increased levels of mutations. Indeed, high levels of AID have been shown to be negatively regulated in vitro in a murine model [109]. Perhaps AID expression in naïve B cells would not support SHM or CSR if cofactors of AID are not also upregulated simultaneously. In contrast, if high AID expression is found in GC B cells, concurrent with low SHM frequencies, this may implicate an HIV-1-specific mechanism of inhibition of AID function or impaired selection of attenuated GCs in vivo. Therefore, characterization of AID expression in different circulating and GC-derived B cell subsets should be performed.

Freshly isolated mononuclear cells from blood (PBMC) and lymphoid tissue from HIV-1-infected and control patients could be sorted by FACS, based on marker expression, into naïve (IgD+IgM+CD27-), memory (CD27+), activated (CD80+CD86+), centroblast (CXCR4highCD83lowCD86low), and centrocyte (CXCR4lowCD83highCD86high) B cell subsets. AID mRNA expression could then be measured from isolated mRNA by real-time PCR.

Similarly, AID mRNA expression could be measured in mononuclear cells from lymphoid follicles of different tissues, such as spleen, lamina propria, and tonsils, to determine if AID expression varies in different lymphoid tissues during HIV-1 infection.

Studies in blood are facilitated by accessibility but often limited by the number of B cells available (3-6% of PBMC). Studies from lymphoid tissues are limited by the availability of these tissues but facilitated by the substantial numbers of cells available in these tissues. For blood cells, for each of the experiments described above, the most straightforward and feasible approach would be to separate B cell subsets by sorting CD19+ B cells by flow cytometry into 4 subsets and test for a) mRNA levels of AID and b) Bcl6 and mutation frequencies in IgD (groups 1-3) and IgG (memory cells; group 4) B cell subsets:

Group 1) IgD+IgM+CD27- (naïve cells; largest subset; ≈40-60%),

Group 2) IgD+IgM-CD27- (BND anergic B cells; smallest subset; ≈4-10%),

Group 3) IgD+CD27+ (includes IgD+IgM- class-switched memory and IgD+IgM+CD27+ marginal zone-like IgM memory cells; 5-13%) and

Group 4) IgD-CD27+ (includes IgM, IgG and IgA memory cells; second largest subset; ≈15-30%).
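
The four-group sorting scheme above can be written out as a simple decision rule. The following Python sketch is illustrative only: the function name `assign_sort_group` and its boolean inputs are hypothetical, and in practice positivity for each marker would be set by gating on fluorescence intensity against appropriate controls rather than supplied as true/false values.

```python
from typing import Optional


def assign_sort_group(igd: bool, igm: bool, cd27: bool) -> Optional[int]:
    """Return the sort group (1-4) for a CD19+ B cell, or None if unassigned.

    Inputs are hypothetical positive/negative calls for surface IgD, IgM and CD27.
    """
    if igd and cd27:
        return 3  # IgD+CD27+: class-switched IgD memory and marginal zone-like cells
    if igd and igm:
        return 1  # IgD+IgM+CD27-: naive cells
    if igd:
        return 2  # IgD+IgM-CD27-: BND anergic cells
    if cd27:
        return 4  # IgD-CD27+: IgM, IgG and IgA memory cells
    return None   # IgD-CD27-: not covered by the four groups
```

Cells that are IgD-CD27- fall outside the four groups and are returned as `None` here; whether such cells would be discarded or collected separately is left open in this sketch.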

Does increased AID expression translate to increased SHM frequencies during HIV-1 infection?

This question could be addressed by two methods. First, freshly isolated PBMC from both HIV-1-infected and control subjects could be sorted into AIDhigh and AIDlow populations by FACS. This experiment would require a well-tested and very specific anti-AID antibody, a commodity that we have not yet identified despite considerable testing of commercially available antibodies (see Appendix B). Should such an antibody become available, mRNA could be extracted from the sorted populations and sequenced to measure SHM frequency in the expressed VH genes. One could then correlate AID protein expression with SHM frequency.
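
The correlation proposed above could be computed, for example, with a rank-based (Spearman) coefficient. The Python sketch below is an illustration under stated assumptions rather than the analysis used in this thesis: SHM frequency is assumed to be expressed as mutations per 100 sequenced nucleotides, the function names are hypothetical, and the inputs would be per-sample AID expression values and SHM frequencies from the sorted populations.

```python
from typing import List, Sequence


def shm_frequency(mutations: int, nucleotides: int) -> float:
    """Mutations per 100 sequenced nucleotides (assumed definition)."""
    if nucleotides <= 0:
        raise ValueError("nucleotides must be positive")
    return 100.0 * mutations / nucleotides


def _ranks(values: Sequence[float]) -> List[float]:
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation, computed as the Pearson correlation of ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    rx, ry = _ranks(x), _ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / (var_x * var_y) ** 0.5
```

A rank-based coefficient is chosen here only because it does not assume a linear relationship between AID expression and SHM frequency; a different statistic may be more appropriate for the actual data.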

Second, AID function could be directly measured by monitoring a specific mutation in a transfected substrate molecule. Such functional assays have been designed and used in B cell lines to determine relative AID function [254-256]. We have also designed a functional assay using a mutated GFP plasmid designed to be transfected into primary human B cells (see Appendix B). This plasmid would be transfected into fresh PBMC isolated from HIV-1-infected patients and control subjects and cultured for 24 hours, then assayed for GFP expression by flow cytometry. While the plasmid containing the mutated GFP sequence has been created, transfection of the primary human B cells and measurement of GFP expression need to be optimized for this assay to be informative. This assay could also be used more generally with primary B cells to determine the requirements for SHM in vitro.

Are GC signaling molecule levels normal in HIV-1-infected patients?

Low levels of IL-21 in serum from HIV-1-infected patients have been reported in other studies [206, 246]. However, IL-21 expression within GCs has yet to be characterized and may differ from serum levels. IL-21 can be rapidly depleted from the circulation by GC cells [247]; thus, IL-21 levels in serum may not reflect IL-21 expression in GCs. IL-21 production in GC cells could be measured by RT-PCR in cells extracted from GCs from HIV-1-infected patients and controls. Expression of additional GC signaling proteins, such as PD-1/PD-L1, ICOS, and IL-4, could also be measured in the same tissues. Furthermore, correlation of these results with both VH-Ig SHM frequency and Bcl6 mutation frequency from GC B cells would provide compelling evidence that impairment of these signaling pathways has a detrimental effect on SHM.

Impact of the work in this thesis

The work described herein will advance the field of HIV-1 research for several reasons. First, this work may, in part, explain why vaccine responses are low in HIV-1-infected individuals. One explanation for the high rates of secondary infections with pathogens normally controlled by antibody responses that have been reported in HIV-1-infected patients is the reduced SHM frequency in VH3-IgG genes that we report here. The inability to develop high-affinity antibodies may impair the ability of antibodies to neutralize and clear infections. The next steps for this work would be to clarify the defects associated with high IgD mutation and low IgG mutation and to correlate these results with actual vaccine responses in vivo. We could then attempt to replace (e.g., IL-21) or block (e.g., PD-1 or PD-L1) the relevant molecules during vaccine administration.

Similarly, the low frequency of development of broadly neutralizing antibodies to HIV-1 in chronically infected patients may result from similar mechanisms. Neutralization breadth is positively correlated with the level of affinity maturation for anti-HIV-1 antibodies, the most effective of which are of the IgG isotype [249]. Reversing the cause of low VH-IgG SHM may, therefore, improve antibody responses not only to secondary infections but also to HIV-1. Thus, this work has implications for understanding the basic consequences of chronic immune activation and the resultant B cell dysfunction during HIV-1 infection, and potentially during other chronic infections such as hepatitis C and malaria, and may provide direction to reverse these defects and enhance the efficacy of both humoral defenses against these infections and the vaccines designed to prevent them.

REFERENCES

1. Klimas, N., Koneru, A. O., and Fletcher, M. A. (2008). "Overview of HIV." Psychosom Med 70(5): 523-530.

2. www.cdc.org/hiv/resources/factsheets/us.htm

3. www.avert.org/worldstats.htm

4. Manavi, K. (2006). "A review on infection with human immunodeficiency virus." Best Pract Res Clin Obstet Gynaecol 20(6): 923-940.

5. Smith, P. D., Meng, G., Shaw, G. M., and Li, L. (1997). "Infection of gastrointestinal tract macrophages by HIV-1." J Leukoc Biol 62(1): 72-77.

6. Seage, G. R., 3rd, Losina, E., Goldie, S. J., Paltiel, A. D., Kimmel, A. D., and Freedberg, K. A. (2002). "The relationship of preventable opportunistic infections, HIV-1 RNA, and CD4 Cell counts to chronic mortality." J Acquir Immune Defic Syndr 30(4): 421-428.

7. McMichael, A. J., Borrow, P., Tomaras, G. D., Goonetilleke, N., and Haynes, B. F. (2010). "The immune response during acute HIV-1 infection: clues for vaccine development." Nat Rev Immunol 10(1): 11-23.

8. Pantaleo, G., Graziosi, C., and Fauci, A. S. (1993). "The immunopathogenesis of human immunodeficiency virus infection." N Engl J Med 328(5): 327-335.

9. http://www.dwp.gov.uk/publications/specialist-guides/medical-conditions/a-z-of-medical-conditions/hiv-aids/clinical-features/

10. Ferguson, M. R., Rojo, D. R., von Lindern, J. J., and O'Brien, W. A. (2002). "HIV-1 replication cycle." Clin Lab Med 22(3): 611-635.

11. Nisole, S. and A. Saib (2004). "Early steps of retrovirus replicative cycle." Retrovirology 1: 9.

12. Engelman, A. and P. Cherepanov (2012). "The structural biology of HIV-1: mechanistic and therapeutic insights." Nat Rev Microbiol 10(4): 279-290.

13. Arhel, N. (2010). "Revisiting HIV-1 uncoating." Retrovirology 7: 96.

14. Krug, R. M. (1993). "The regulation of export of mRNA from nucleus to cytoplasm." Curr Opin Cell Biol 5(6): 944-949.

15. Wapling, J., Srivastava, S., Shehu-Xhilaga, M., and Tachedjian, G. (2007). "Targeting human immunodeficiency virus type 1 assembly, maturation and budding." Drug Target Insights 2: 159-182.

16. Bukrinskaya, A. G. (2004). "HIV-1 assembly and maturation." Arch Virol 149(6): 1067-1082.

17. Mariani, S. A., Vicenzi, E., and Poli, G. (2011). "Asymmetric HIV-1 co-receptor use and replication in CD4(+) T lymphocytes." J Transl Med 9 Suppl 1: S8.

18. Moir, S., Chun, T. W., and Fauci, A. S. (2011). "Pathogenic mechanisms of HIV disease." Annu Rev Pathol 6: 223-248.

19. Haas, A., Zimmermann, K., and Oxenius, A. (2011). "Antigen-dependent and -independent mechanisms of T and B cell hyperactivation during chronic HIV-1 infection." J Virol 85(23): 12102-12113.

20. Brenchley, J. M. and D. C. Douek (2008). "The mucosal barrier and immune activation in HIV pathogenesis." Curr Opin HIV AIDS 3(3): 356-361.

21. Haase, A. T., Henry, K., Zupancic, M., Sedgewick, G., Faust, R. A., Melroe, H., Cavert, W., Gebhard, K., Staskus, K., Zhang, Z. Q., Dailey, P. J., Balfour, H. H., Jr., Erice, A., and Perelson, A. S. (1996). "Quantitative image analysis of HIV-1 infection in lymphoid tissue." Science 274(5289): 985-989.

22. Moir, S. and A. S. Fauci (2009). "B cells in HIV infection and disease." Nat Rev Immunol 9(4): 235-245.

23. Chattopadhyay, P. K. and M. Roederer (2010). "Good cell, bad cell: flow cytometry reveals T-cell subsets important in HIV disease." Cytometry A 77(7): 614-622.

24. Fogli, M., Torti, C., Malacarne, F., Fiorentini, S., Albani, M., Izzo, I., Giagulli, C., Maggi, F., Carosi, G., and Caruso, A. (2012). "Emergence of exhausted B cells in asymptomatic HIV-1-infected patients naive for HAART is related to reduced immune surveillance." Clin Dev Immunol 2012: 829584.

25. Abadi, J., Friedman, J., Mageed, R. A., Jefferis, R., Rodriguez-Barradas, M. C., and Pirofski, L. (1998). "Human antibodies elicited by a pneumococcal vaccine express idiotypic determinants indicative of V(H)3 gene segment usage." J Infect Dis 178(3): 707-716.

26. Adderson, E. E., Shackelford, P. G., Quinn, A., Wilson, P. M., Cunningham, M. W., Insel, R. A., and Carroll, W. L. (1993). "Restricted immunoglobulin VH usage and VDJ combinations in the human response to Haemophilus influenzae type b capsular polysaccharide. Nucleotide sequences of monospecific anti-Haemophilus antibodies and polyspecific antibodies cross-reacting with self antigens." J Clin Invest 91(6): 2734-2743.

27. Pirofski, L., Lui, R., DeShaw, M., Kressel, A. B., and Zhong, Z. (1995). "Analysis of human monoclonal antibodies elicited by vaccination with a Cryptococcus neoformans glucuronoxylomannan capsular polysaccharide vaccine." Infect Immun 63(8): 3005-3014.

28. Subramaniam, K., French, N., and Pirofski, L. A. (2005). "Cryptococcus neoformans-reactive and total immunoglobulin profiles of human immunodeficiency virus-infected and uninfected Ugandans." Clin Diagn Lab Immunol 12(10): 1168-1176.

29. Scamurra, R. W., Miller, D. J., Dahl, L., Abrahamsen, M., Kapur, V., Wahl, S. M., Milner, E. C., and Janoff, E. N. (2000). "Impact of HIV-1 infection on VH3 gene repertoire of naive human B cells." J Immunol 164(10): 5482-5491.

30. Blom, B. and H. Spits (2006). "Development of human lymphoid cells." Annu Rev Immunol 24: 287-320.

31. Sagaert, X., Sprangers, B., and De Wolf-Peeters, C. (2007). "The dynamics of the B follicle: understanding the normal counterpart of B-cell-derived malignancies." Leukemia 21(7): 1378-1386.

32. Holmes, M. L., Pridans, C., and Nutt, S. L. (2008). "The regulation of the B-cell gene expression programme by Pax5." Immunol Cell Biol 86(1): 47-53.

33. Ollila, J. and M. Vihinen (2005). "B cells." Int J Biochem Cell Biol 37(3): 518-523.

34. Nakamori, Y., Liu, B., Ohishi, K., Suzuki, K., Ino, K., Matsumoto, T., Masuya, M., Nishikawa, H., Shiku, H., Hamada, H., and Katayama, N. (2012). "Human bone marrow stromal cells simultaneously support B and T/NK lineage development from human haematopoietic progenitors: a principal role for flt3 ligand in lymphopoiesis." Br J Haematol 157(6): 674-686.

35. Sanz, E., Munoz, A. N., Monserrat, J., Van-Den-Rym, A., Escoll, P., Ranz, I., Alvarez-Mon, M., and de-la-Hera, A. (2010). "Ordering human CD34+CD10-CD19+ pre/pro-B-cell and CD19- common lymphoid progenitor stages in two pro-B-cell development pathways." Proc Natl Acad Sci U S A 107(13): 5925-5930.

36. Schatz, D. G. and P. C. Swanson (2011). "V(D)J recombination: mechanisms of initiation." Annu Rev Genet 45: 167-202.

37. Bassing, C. H., Swat, W., and Alt, F. W. (2002). "The mechanism and regulation of chromosomal V(D)J recombination." Cell 109 Suppl: S45-55.

38. Delves, P. J. and I. M. Roitt (2000). "The immune system. First of two parts." N Engl J Med 343(1): 37-49.

39. http://mutagenetix.utsouthwestern.edu/phenotypic/phenotypic_rec.cfm?pk=410

40. Rowland, S. L., Tuttle, K., Torres, R. M., and Pelanda, R. (2013). "Antigen and cytokine receptor signals guide the development of the naive mature B cell repertoire." Immunol Res 55(1-3): 231-240.

41. Delves, P. J. and I. M. Roitt (2000). "The immune system. Second of two parts." N Engl J Med 343(2): 108-117.

42. Klein, U. and R. Dalla-Favera (2008). "Germinal centres: role in B-cell physiology and malignancy." Nat Rev Immunol 8(1): 22-33.

43. Allen, C. D., Okada, T., and Cyster, J. G. (2007). "Germinal-center organization and cellular dynamics." Immunity 27(2): 190-202.

44. Caron, G., Le Gallou, S., Lamy, T., Tarte, K., and Fest, T. (2009). "CXCR4 expression functionally discriminates centroblasts versus centrocytes within human germinal center B cells." J Immunol 182(12): 7595-7602.

45. Vinuesa, C. G., Linterman, M. A., Goodnow, C. C., and Randall, K. L. (2010). "T cells and follicular dendritic cells in germinal center B-cell formation and selection." Immunol Rev 237(1): 72-89.

46. Craft, J. E. (2012). "Follicular helper T cells in immunity and systemic autoimmunity." Nat Rev Rheumatol 8(6): 337-347.

47. Brandtzaeg, P. and F. E. Johansen (2005). "Mucosal B cells: phenotypic characteristics, transcriptional regulation, and homing properties." Immunol Rev 206: 32-63.

48. Wu, Y. C., Kipling, D., Leong, H. S., Martin, V., Ademokun, A. A., and Dunn-Walters, D. K. (2010). "High-throughput immunoglobulin repertoire analysis distinguishes between human IgM memory and switched memory B-cell populations." Blood 116(7): 1070-1078.

49. Stavnezer, J., Guikema, J. E., and Schrader, C. E. (2008). "Mechanism and regulation of class switch recombination." Annu Rev Immunol 26: 261-292.

50. Xu, W., Santini, P. A., Matthews, A. J., Chiu, A., Plebani, A., He, B., Chen, K., and Cerutti, A. (2008). "Viral double-stranded RNA triggers Ig class switching by activating upper respiratory mucosa B cells through an innate TLR3 pathway involving BAFF." J Immunol 181(1): 276-287.

51. Calame, K. L. (2001). "Plasma cells: finding new light at the end of B cell development." Nat Immunol 2(12): 1103-1108.

52. Kelly, D. F., Pollard, A. J., and Moxon, E. R. (2005). "Immunological memory: the role of B cells in long-term protection against invasive bacterial pathogens." JAMA 294(23): 3019-3023.

53. Crotty, S. and R. Ahmed (2004). "Immunological memory in humans." Semin Immunol 16(3): 197-203.

54. Paramithiotis, E. and M. D. Cooper (1997). "Memory B lymphocytes migrate to bone marrow in humans." Proc Natl Acad Sci U S A 94(1): 208-212.

55. Rickert, R. C., Jellusova, J., and Miletic, A. V. (2011). "Signaling by the tumor necrosis factor receptor superfamily in B-cell biology and disease." Immunol Rev 244(1): 115-133.

56. Swanson, C. L., Pelanda, R., and Torres, R. M. (2013). "Division of labor during primary humoral immunity." Immunol Res 55(1-3): 277-286.

57. Chiron, D., Bekeredjian-Ding, I., Pellat-Deceunynck, C., Bataille, R., and Jego, G. (2008). "Toll-like receptors: lessons to learn from normal and malignant human B cells." Blood 112(6): 2205-2213.

58. Macpherson, A. J. and K. McCoy (2007). "APRIL in the intestine: a good destination for immunoglobulin A(2)." Immunity 26(6): 755-757.

59. He, B., Xu, W., Santini, P. A., Polydorides, A. D., Chiu, A., Estrella, J., Shan, M., Chadburn, A., Villanacci, V., Plebani, A., Knowles, D. M., Rescigno, M., and Cerutti, A. (2007). "Intestinal bacteria trigger T cell-independent immunoglobulin A(2) class switching by inducing epithelial-cell secretion of the cytokine APRIL." Immunity 26(6): 812-826.

60. Chong, Y., Ikematsu, H., Yamamoto, M., Murata, M., Yamaji, K., Nishimura, M., Nabeshima, S., Kashiwagi, S., and Hayashi, J. (2004). "Increased frequency of CD27- (naive) B cells and their phenotypic alteration in HIV type 1-infected patients." AIDS Res Hum Retroviruses 20(6): 621-629.

61. Titanji, K., Chiodi, F., Bellocco, R., Schepis, D., Osorio, L., Tassandin, C., Tambussi, G., Grutzmeier, S., Lopalco, L., and De Milito, A. (2005). "Primary HIV-1 infection sets the stage for important B lymphocyte dysfunctions." AIDS 19(17): 1947-1955.

62. Jacobsen, M. C., Thiebaut, R., Fisher, C., Sefe, D., Clapson, M., Klein, N., and Baxendale, H. E. (2008). "Pediatric human immunodeficiency virus infection and circulating IgD+ memory B cells." J Infect Dis 198(4): 481-485.

63. Scamurra, R. W., Nelson, D. B., Lin, X. M., Miller, D. J., Silverman, G. J., Kappel, T., Thurn, J. R., Lorenz, E., Kulkarni-Narla, A., and Janoff, E. N. (2002). "Mucosal plasma cell repertoire during HIV-1 infection." J Immunol 169(7): 4008-4016.

64. De Milito, A., Nilsson, A., Titanji, K., Thorstensson, R., Reizenstein, E., Narita, M., Grutzmeier, S., Sonnerborg, A., and Chiodi, F. (2004). "Mechanisms of hypergammaglobulinemia and impaired antigen-specific humoral immunity in HIV-1 infection." Blood 103(6): 2180-2186.

65. D'Orsogna, L. J., Krueger, R. G., McKinnon, E. J., and French, M. A. (2007). "Circulating memory B-cell subpopulations are affected differently by HIV infection and antiretroviral therapy." AIDS 21(13): 1747-1752.

66. Harris, D. P., Haynes, L., Sayles, P. C., Duso, D. K., Eaton, S. M., Lepak, N. M., Johnson, L. L., Swain, S. L., and Lund, F. E. (2000). "Reciprocal regulation of polarized cytokine production by effector B and T cells." Nat Immunol 1(6): 475-482.

67. LeBien, T. W. and T. F. Tedder (2008). "B lymphocytes: how they develop and function." Blood 112(5): 1570-1580.

68. Morva, A., Lemoine, S., Achour, A., Pers, J. O., Youinou, P., and Jamin, C. (2012). "Maturation and function of human dendritic cells are regulated by B lymphocytes." Blood 119(1): 106-114.

69. Lipsky, P. E. (2001). "Systemic lupus erythematosus: an autoimmune disease of B cell hyperactivity." Nat Immunol 2(9): 764-766.

70. Davis, M. M. (2004). "The evolutionary and structural 'logic' of antigen receptor diversity." Semin Immunol 16(4): 239-243.

71. Nelson, D. L. and M. M. Cox (2000). Lehninger Principles of Biochemistry, 3rd Edition. New York, NY, Worth Publishers: 228.

72. Schroeder, H. W., Jr. and L. Cavacini (2010). "Structure and function of immunoglobulins." J Allergy Clin Immunol 125(2 Suppl 2): S41-52.

73. Duty, J. A., Szodoray, P., Zheng, N. Y., Koelsch, K. A., Zhang, Q., Swiatkowski, M., Mathias, M., Garman, L., Helms, C., Nakken, B., Smith, K., Farris, A. D., and Wilson, P. C. (2009). "Functional anergy in a subpopulation of naive B cells from healthy humans that express autoreactive immunoglobulin receptors." J Exp Med 206(1): 139-151.

74. Koelsch, K., Zheng, N. Y., Zhang, Q., Duty, A., Helms, C., Mathias, M. D., Jared, M., Smith, K., Capra, J. D., and Wilson, P. C. (2007). "Mature B cells class switched to IgD are autoreactive in healthy individuals." J Clin Invest 117(6): 1558-1565.

75. Arpin, C., de Bouteiller, O., Razanajaona, D., Fugier-Vivier, I., Briere, F., Banchereau, J., Lebecque, S., and Liu, Y. J. (1998). "The normal counterpart of IgD myeloma cells in germinal center displays extensively mutated IgVH gene, Cmu-Cdelta switch, and lambda light chain expression." J Exp Med 187(8): 1169-1178.

76. Chen, K. and A. Cerutti (2010). "New insights into the enigma of immunoglobulin D." Immunol Rev 237(1): 160-179.

77. Xu, Z., Zan, H., Pone, E. J., Mai, T., and Casali, P. (2012). "Immunoglobulin class-switch DNA recombination: induction, targeting and beyond." Nat Rev Immunol 12(7): 517-531.

78. Joller, N., Weber, S. S., and Oxenius, A. (2011). "Antibody-Fc receptor interactions in protection against intracellular pathogens." Eur J Immunol 41(4): 889-897.

79. Kubagawa, H., Oka, S., Kubagawa, Y., Torii, I., Takayama, E., Kang, D. W., Gartland, G. L., Bertoli, L. F., Mori, H., Takatsu, H., Kitamura, T., Ohno, H., and Wang, J. Y. (2009). "Identity of the elusive IgM Fc receptor (FcmuR) in humans." J Exp Med 206(12): 2779-2793.

80. Durandy, A. (2003). "Activation-induced cytidine deaminase: a dual role in class-switch recombination and somatic hypermutation." Eur J Immunol 33(8): 2069-2073.

81. Muramatsu, M., Sankaranand, V. S., Anant, S., Sugai, M., Kinoshita, K., Davidson, N. O., and Honjo, T. (1999). "Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RNA-editing deaminase family in germinal center B cells." J Biol Chem 274(26): 18470-18476.

82. Wu, X., Feng, J., Komori, A., Kim, E. C., Zan, H., and Casali, P. (2003). "Immunoglobulin somatic hypermutation: double-strand DNA breaks, AID and error-prone DNA repair." J Clin Immunol 23(4): 235-246.

83. Iacobucci, I., Lonetti, A., Messa, F., Ferrari, A., Cilloni, D., Soverini, S., Paoloni, F., Arruga, F., Ottaviani, E., Chiaretti, S., Messina, M., Vignetti, M., Papayannidis, C., Vitale, A., Pane, F., Piccaluga, P. P., Paolini, S., Berton, G., Baruzzi, A., Saglio, G., Baccarani, M., Foa, R., and Martinelli, G. (2010). "Different isoforms of the B-cell mutator activation-induced cytidine deaminase are aberrantly expressed in BCR-ABL1-positive acute lymphoblastic leukemia patients." Leukemia 24(1): 66-73.

84. Honjo, T., Muramatsu, M., and Fagarasan, S. (2004). "AID: how does it aid antibody diversity?" Immunity 20(6): 659-668.

85. Wang, J., Shinkura, R., Muramatsu, M., Nagaoka, H., Kinoshita, K., and Honjo, T. (2006). "Identification of a specific domain required for dimerization of activation-induced cytidine deaminase." J Biol Chem 281(28): 19115-19123.

86. Brar, S. S., Sacho, E. J., Tessmer, I., Croteau, D. L., Erie, D. A., and Diaz, M. (2008). "Activation-induced deaminase, AID, is catalytically active as a monomer on single-stranded DNA." DNA Repair (Amst) 7(1): 77-87.

87. Gourzi, P., Leonova, T., and Papavasiliou, F. N. (2007). "Viral induction of AID is independent of the interferon and the Toll-like receptor signaling pathways but requires NF-kappaB." J Exp Med 204(2): 259-265.

88. Pauklin, S., Sernandez, I. V., Bachmann, G., Ramiro, A. R., and Petersen-Mahrt, S. K. (2009). "Estrogen directly activates AID transcription and function." J Exp Med 206(1): 99-111.

89. Rosenberg, B. R. and F. N. Papavasiliou (2007). "Beyond SHM and CSR: AID and related cytidine deaminases in the host response to viral infection." Adv Immunol 94: 215-244.

90. Takaishi, S. and T. C. Wang (2007). "Providing AID to p53 mutagenesis." Nat Med 13(4): 404-406.

91. Yanagibashi, T., Hosono, A., Oyama, A., Tsuda, M., Hachimura, S., Takahashi, Y., Itoh, K., Hirayama, K., Takahashi, K., and Kaminogawa, S. (2009). "Bacteroides induce higher IgA production than Lactobacillus by increasing activation-induced cytidine deaminase expression in B cells in murine Peyer's patches." Biosci Biotechnol Biochem 73(2): 372-377.

92. Epeldegui, M., Thapa, D. R., De la Cruz, J., Kitchen, S., Zack, J. A., and Martinez-Maza, O. (2010). "CD40 ligand (CD154) incorporated into HIV virions induces activation-induced cytidine deaminase (AID) expression in human B lymphocytes." PLoS One 5(7): e11448.

93. Moir, S., Malaspina, A., Li, Y., Chun, T. W., Lowe, T., Adelsberger, J., Baseler, M., Ehler, L. A., Liu, S., Davey, R. T., Jr., Mican, J. A., and Fauci, A. S. (2000). "B cells of HIV-1-infected patients bind virions through CD21-complement interactions and transmit infectious virus to activated T cells." J Exp Med 192(5): 637-646.

94. Hauser, J., Sveshnikova, N., Wallenius, A., Baradaran, S., Saarikettu, J., and Grundstrom, T. (2008). "B-cell receptor activation inhibits AID expression through calmodulin inhibition of E-proteins." Proc Natl Acad Sci U S A 105(4): 1267-1272.

95. Pauklin, S. and S. K. Petersen-Mahrt (2009). "Progesterone inhibits activation-induced deaminase by binding to the promoter." J Immunol 183(2): 1238-1244.

96. de Yebenes, V. G., Belver, L., Pisano, D. G., Gonzalez, S., Villasante, A., Croce, C., He, L., and Ramiro, A. R. (2008). "miR-181b negatively regulates activation-induced cytidine deaminase in B cells." J Exp Med 205(10): 2199-2206.

97. Dorsett, Y., McBride, K. M., Jankovic, M., Gazumyan, A., Thai, T. H., Robbiani, D. F., Di Virgilio, M., Reina San-Martin, B., Heidkamp, G., Schwickert, T. A., Eisenreich, T., Rajewsky, K., and Nussenzweig, M. C. (2008). "MicroRNA-155 suppresses activation-induced cytidine deaminase-mediated Myc-Igh translocation." Immunity 28(5): 630-638.

98. Teng, G., Hakimpour, P., Landgraf, P., Rice, A., Tuschl, T., Casellas, R., and Papavasiliou, F. N. (2008). "MicroRNA-155 is a negative regulator of activation-induced cytidine deaminase." Immunity 28(5): 621-629.

99. McBride, K. M., Gazumyan, A., Woo, E. M., Barreto, V. M., Robbiani, D. F., Chait, B. T., and Nussenzweig, M. C. (2006). "Regulation of hypermutation by activation-induced cytidine deaminase phosphorylation." Proc Natl Acad Sci U S A 103(23): 8798-8803.

100. Longerich, S., Basu, U., Alt, F., and Storb, U. (2006). "AID in somatic hypermutation and class switch recombination." Curr Opin Immunol 18(2): 164-174.

101. Patenaude, A. M. and J. M. Di Noia (2010). "The mechanisms regulating the subcellular localization of AID." Nucleus 1(4): 325-331.

102. McBride, K. M., Barreto, V., Ramiro, A. R., Stavropoulos, P., and Nussenzweig, M. C. (2004). "Somatic hypermutation is limited by CRM1-dependent nuclear export of activation-induced deaminase." J Exp Med 199(9): 1235-1244.

103. Mayorov, V. I., Rogozin, I. B., Adkison, L. R., Frahm, C., Kunkel, T. A., and Pavlov, Y. I. (2005). "Expression of human AID in yeast induces mutations in context similar to the context of somatic hypermutation at G-C pairs in immunoglobulin genes." BMC Immunol 6: 10.

104. Larson, E. D. and N. Maizels (2004). "Transcription-coupled mutagenesis by the DNA deaminase AID." Genome Biol 5(3): 211.

105. Basu, U., Franklin, A., Schwer, B., Cheng, H. L., Chaudhuri, J., and Alt, F. W. (2009). "Regulation of activation-induced cytidine deaminase DNA deamination activity in B-cells by Ser38 phosphorylation." Biochem Soc Trans 37(Pt 3): 561-568.

106. Chelico, L., Pham, P., Petruska, J., and Goodman, M. F. (2009). "Biochemical basis of immunological and retroviral responses to DNA-targeted cytosine deamination by activation-induced cytidine deaminase and APOBEC3G." J Biol Chem 284(41): 27761-27765.

107. McBride, K. M., Gazumyan, A., Woo, E. M., Schwickert, T. A., Chait, B. T., and Nussenzweig, M. C. (2008). "Regulation of class switch recombination and somatic mutation by AID phosphorylation." J Exp Med 205(11): 2585-2594.

108. Aoufouchi, S., Faili, A., Zober, C., D'Orlando, O., Weller, S., Weill, J. C., and Reynaud, C. A. (2008). "Proteasomal degradation restricts the nuclear lifespan of AID." J Exp Med 205(6): 1357-1368.

109. Muto, T., Okazaki, I. M., Yamada, S., Tanaka, Y., Kinoshita, K., Muramatsu, M., Nagaoka, H., and Honjo, T. (2006). "Negative regulation of activation-induced cytidine deaminase in B cells." Proc Natl Acad Sci U S A 103(8): 2752-2757.

110. Basu, U., Chaudhuri, J., Phan, R. T., Datta, A., and Alt, F. W. (2007). "Regulation of activation induced deaminase via phosphorylation." Adv Exp Med Biol 596: 129-137.

111. Chaudhuri, J., Khuong, C., and Alt, F. W. (2004). "Replication protein A interacts with AID to promote deamination of somatic hypermutation targets." Nature 430(7003): 992-998.

112. Oppezzo, P., Vuillier, F., Vasconcelos, Y., Dumas, G., Magnac, C., Payelle-Brogard, B., Pritsch, O., and Dighiero, G. (2003). "Chronic lymphocytic leukemia B cells expressing AID display dissociation between class switch recombination and somatic hypermutation." Blood 101(10): 4029-4032.

113. Albesiano, E., Messmer, B. T., Damle, R. N., Allen, S. L., Rai, K. R., and Chiorazzi, N. (2003). "Activation-induced cytidine deaminase in chronic lymphocytic leukemia B cells: expression as multiple forms in a dynamic, variably sized fraction of the clone." Blood 102(9): 3333-3339.

114. McCarthy, H., Wierda, W. G., Barron, L. L., Cromwell, C. C., Wang, J., Coombes, K. R., Rangel, R., Elenitoba-Johnson, K. S., Keating, M. J., and Abruzzo, L. V. (2003). "High expression of activation-induced cytidine deaminase (AID) and splice variants is a distinctive feature of poor-prognosis chronic lymphocytic leukemia." Blood 101(12): 4903-4908.

115. van Maldegem, F., Scheeren, F. A., Aarti Jibodh, R., Bende, R. J., Jacobs, H., and van Noesel, C. J. (2009). "AID splice variants lack deaminase activity." Blood 113(8): 1862-1864; author reply 1864.

116. Wu, X., Darce, J. R., Chang, S. K., Nowakowski, G. S., and Jelinek, D. F. (2008). "Alternative splicing regulates activation-induced cytidine deaminase (AID): implications for suppression of AID mutagenic activity in normal and malignant B cells." Blood 112(12): 4675-4682.

117. Santa-Marta, M., Aires da Silva, F., Fonseca, A. M., Rato, S., and Goncalves, J. (2007). "HIV-1 Vif protein blocks the cytidine deaminase activity of B-cell specific AID in E. coli by a similar mechanism of action." Mol Immunol 44(4): 583-590.

118. Vuong, B. Q., Lee, M., Kabir, S., Irimia, C., Macchiarulo, S., McKnight, G. S., and Chaudhuri, J. (2009). "Specific recruitment of protein kinase A to the immunoglobulin locus regulates class-switch recombination." Nat Immunol 10(4): 420-426.

119. Teng, G. and F. N. Papavasiliou (2007). "Immunoglobulin somatic hypermutation." Annu Rev Genet 41: 107-120.

120. Casali, P., Pal, Z., Xu, Z., and Zan, H. (2006). "DNA repair in antibody somatic hypermutation." Trends Immunol 27(7): 313-321.

121. Neuberger, M. S., Di Noia, J. M., Beale, R. C., Williams, G. T., Yang, Z., and Rada, C. (2005). "Somatic hypermutation at A.T pairs: polymerase error versus dUTP incorporation." Nat Rev Immunol 5(2): 171-178.

122. Langerak, P., Nygren, A. O., Krijger, P. H., van den Berk, P. C., and Jacobs, H. (2007). "A/T mutagenesis in hypermutated immunoglobulin genes strongly depends on PCNAK164 modification." J Exp Med 204(8): 1989-1998.

123. Mastache, E. F., Lindroth, K., Fernandez, C., and Gonzalez-Fernandez, A. (2006). "Somatic hypermutation of Ig genes is affected differently by failures in apoptosis caused by disruption of Fas (lpr mutation) or by overexpression of Bcl-2." Scand J Immunol 63(6): 420-429.

124. Peled, J. U., Kuang, F. L., Iglesias-Ussel, M. D., Roa, S., Kalis, S. L., Goodman, M. F., and Scharff, M. D. (2008). "The biochemistry of somatic hypermutation." Annu Rev Immunol 26: 481-511.

125. Spencer, J. and D. K. Dunn-Walters (2005). "Hypermutation at A-T base pairs: the A nucleotide replacement spectrum is affected by adjacent nucleotides and there is no reverse complementarity of sequences flanking mutated A and T nucleotides." J Immunol 175(8): 5170-5177.

126. Upton, D. C., Gregory, B. L., Arya, R., and Unniraman, S. (2011). "AID: a riddle wrapped in a mystery inside an enigma." Immunol Res 49(1-3): 14-24.

127. Besmer, E., Gourzi, P., and Papavasiliou, F. N. (2004). "The regulation of somatic hypermutation." Curr Opin Immunol 16(2): 241-245.

128. Seki, M., Gearhart, P. J., and Wood, R. D. (2005). "DNA polymerases and somatic hypermutation of immunoglobulin genes." EMBO Rep 6(12): 1143-1148.

129. Pham, P., Zhang, K., and Goodman, M. F. (2008). "Hypermutation at A/T sites during G.U mismatch repair in vitro by human B-cell lysates." J Biol Chem 283(46): 31754-31762.

130. Bothmer, A., Robbiani, D. F., Feldhahn, N., Gazumyan, A., Nussenzweig, A., and Nussenzweig, M. C. (2010). "53BP1 regulates DNA resection and the choice between classical and alternative end joining during class switch recombination." J Exp Med 207(4): 855-865.

131. Qiao, X., He, B., Chiu, A., Knowles, D. M., Chadburn, A., and Cerutti, A. (2006). "Human immunodeficiency virus 1 Nef suppresses CD40-dependent immunoglobulin class switching in bystander B cells." Nat Immunol 7(3): 302-310.

132. Janoff, E. N., Smith, P. D., and Blaser, M. J. (1988). "Acute antibody responses to Giardia lamblia are depressed in patients with AIDS." J Infect Dis 157(4): 798-804.

133. Janoff, E. N., Jackson, S., Wahl, S. M., Thomas, K., Peterman, J. H., and Smith, P. D. (1994). "Intestinal mucosal immunoglobulins during human immunodeficiency virus type 1 infection." J Infect Dis 170(2): 299-307.

134. Janoff, E. N. and J. B. Rubins (2004). Immunodeficiency and invasive pneumococcal disease. The Pneumococcus. E. I. Tuomanen, T. J. Mitchell, D. A. Morrison and B. G. Spratt. Washington, DC, ASM Press: 252-280.

135. Janoff, E. N., Scamurra, R. W., Sanneman, T. C., Eidman, K., and Thurn, J. R. (1999). "Human immunodeficiency virus type 1 and mucosal humoral defense." J Infect Dis 179 Suppl 3: S475-479.

136. Amadori, A. and L. Chieco-Bianchi (1990). "B-cell activation and HIV-1 infection: deeds and misdeeds." Immunol Today 11(10): 374-379.

137. Shirai, A., Cosentino, M., Leitman-Klinman, S. F., and Klinman, D. M. (1992). "Human immunodeficiency virus infection induces both polyclonal and virus-specific B cell activation." J Clin Invest 89(2): 561-566.

138. Berberian, L., Valles-Ayoub, Y., Sun, N., Martinez-Maza, O., and Braun, J. (1991). "A VH clonal deficit in human immunodeficiency virus-positive individuals reflects a B-cell maturational arrest." Blood 78(1): 175-179.

139. Martin, G., Roy, J., Barat, C., Ouellet, M., Gilbert, C., and Tremblay, M. J. (2007). "Human immunodeficiency virus type 1-associated CD40 ligand transactivates B lymphocytes and promotes infection of CD4+ T cells." J Virol 81(11): 5872-5881.

140. Miedema, F., Petit, A. J., Terpstra, F. G., Schattenkerk, J. K., de Wolf, F., Al, B. J., Roos, M., Lange, J. M., Danner, S. A., Goudsmit, J., et al. (1988). "Immunological abnormalities in human immunodeficiency virus (HIV)-infected asymptomatic homosexual men. HIV affects the immune system before CD4+ T helper cell depletion occurs." J Clin Invest 82(6): 1908-1914.

141. He, B., Qiao, X., Klasse, P. J., Chiu, A., Chadburn, A., Knowles, D. M., Moore, J. P., and Cerutti, A. (2006). "HIV-1 envelope triggers polyclonal Ig class switch recombination through a CD40-independent mechanism involving BAFF and C-type lectin receptors." J Immunol 176(7): 3931-3941.

142. Yarchoan, R., Redfield, R. R., and Broder, S. (1986). "Mechanisms of B cell activation in patients with acquired immunodeficiency syndrome and related disorders. Contribution of antibody-producing B cells, of Epstein-Barr virus-infected B cells, and of immunoglobulin production induced by human T cell lymphotropic virus, type III/lymphadenopathy-associated virus." J Clin Invest 78(2): 439-447.

143. Carson, P. J., Schut, R. L., Simpson, M. L., O'Brien, J., and Janoff, E. N. (1995). "Antibody class and subclass responses to pneumococcal polysaccharides following immunization of human immunodeficiency virus-infected patients." J Infect Dis 172(2): 340-345.

144. Viau, M., Veas, F., and Zouali, M. (2007). "Direct impact of inactivated HIV-1 virions on B lymphocyte subsets." Mol Immunol 44(8): 2124-2134.

145. Hart, M., Steel, A., Clark, S. A., Moyle, G., Nelson, M., Henderson, D. C., Wilson, R., Gotch, F., Gazzard, B., and Kelleher, P. (2007). "Loss of discrete memory B cell subsets is associated with impaired immunization responses in HIV-1 infection and may be a risk factor for invasive pneumococcal disease." J Immunol 178(12): 8212-8220.

146. Rodriguez-Barradas, M. C., Musher, D. M., Lahart, C., Lacke, C., Groover, J., Watson, D., Baughn, R., Cate, T., and Crofoot, G. (1992). "Antibody to capsular polysaccharides of Streptococcus pneumoniae after vaccination of human immunodeficiency virus-infected subjects with 23-valent pneumococcal vaccine." J Infect Dis 165(3): 553-556.

147. Chang, Q., Abadi, J., Alpert, P., and Pirofski, L. (2000). "A pneumococcal capsular polysaccharide vaccine induces a repertoire shift with increased VH3 expression in peripheral B cells from human immunodeficiency virus (HIV)-uninfected but not HIV-infected persons." J Infect Dis 181(4): 1313-1321.

148. Payeras, A., Martinez, P., Mila, J., Riera, M., Pareja, A., Casal, J., and Matamoros, N. (2002). "Risk factors in HIV-1-infected patients developing repetitive bacterial infections: toxicological, clinical, specific antibody class responses, opsonophagocytosis and Fc(gamma) RIIa polymorphism characteristics." Clin Exp Immunol 130(2): 271-278.

149. Janoff, E. N., Hardy, W. D., Smith, P. D., and Wahl, S. M. (1991). "Humoral recall responses in HIV infection. Levels, specificity, and affinity of antigen-specific IgG." J Immunol 147(7): 2130-2135.

150. Coogan, M. M. and S. J. Challacombe (2000). "Serum and salivary antibodies to a mycobacterial 65-kDa stress protein are elevated in HIV-positive patients and modified by oral candidiasis." Oral Microbiol Immunol 15(5): 284-289.

151. Challacombe, S. J. and S. P. Sweet (1997). "Salivary and mucosal immune responses to HIV and its co-pathogens." Oral Dis 3 Suppl 1: S79-84.

152. Kabat, E. A., Wu, T. T., Perry, H. M., Gottesman, K. S., and Foeller, C. (1991). Sequences of proteins of immunological interest. Bethesda, U.S. Department of Health and Human Services.

153. Owens, G. P., Ritchie, A. M., Burgoon, M. P., Williamson, R. A., Corboy, J. R., and Gilden, D. H. (2003). "Single-cell repertoire analysis demonstrates that clonal expansion is a prominent feature of the B cell response in multiple sclerosis cerebrospinal fluid." J Immunol 171(5): 2725-2733.

154. Kristiansen, S. V., Pascual, V., and Lipsky, P. E. (1994). "Staphylococcal protein A induces biased production of Ig by VH3-expressing B lymphocytes." J Immunol 153(7): 2974-2982.

155. Lefranc, M. P., Giudicelli, V., Ginestoux, C., Jabado-Michaloud, J., Folch, G., Bellahcene, F., Wu, Y., Gemrot, E., Brochet, X., Lane, J., Regnier, L., Ehrenmann, F., Lefranc, G., and Duroux, P. (2009). "IMGT, the international ImMunoGeneTics information system." Nucleic Acids Res 37(Database issue): D1006-1012.

156. Souto-Carneiro, M. M., Longo, N. S., Russ, D. E., Sun, H. W., and Lipsky, P. E. (2004). "Characterization of the human Ig heavy chain antigen binding complementarity determining region 3 using a newly developed software algorithm, JOINSOLVER." J Immunol 172(11): 6790-6802.

157. Mortari, F., Wang, J. Y., and Schroeder, H. W., Jr. (1993). "Human cord blood antibody repertoire. Mixed population of VH gene segments and CDR3 distribution in the expressed C alpha and C gamma repertoires." J Immunol 150(4): 1348-1357.

158. Kyte, J. and R. F. Doolittle (1982). "A simple method for displaying the hydropathic character of a protein." J Mol Biol 157(1): 105-132.

159. Eisenberg, D. (1984). "Three-dimensional structure of membrane and surface proteins." Annu Rev Biochem 53: 595-623.

160. Dijkman, R., Tensen, C. P., Buettner, M., Niedobitek, G., Willemze, R., and Vermeer, M. H. (2006). "Primary cutaneous follicle center lymphoma and primary cutaneous large B-cell lymphoma, leg type, are both targeted by aberrant somatic hypermutation but demonstrate differential expression of AID." Blood 107(12): 4926-4929.

161. Shen, H. M., Peters, A., Baron, B., Zhu, X., and Storb, U. (1998). "Mutation of BCL-6 gene in normal B cells by the process of somatic hypermutation of Ig genes." Science 280(5370): 1750-1752.

162. Liu, M., Duke, J. L., Richter, D. J., Vinuesa, C. G., Goodnow, C. C., Kleinstein, S. H., and Schatz, D. G. (2008). "Two levels of protection for the B cell genome during somatic hypermutation." Nature 451(7180): 841-845.

163. Lane, H. C., Masur, H., Edgar, L. C., Whalen, G., Rook, A. H., and Fauci, A. S. (1983). "Abnormalities of B-cell activation and immunoregulation in patients with the acquired immunodeficiency syndrome." N Engl J Med 309(8): 453-458.

164. Terpstra, F. G., Al, B. J., Roos, M. T., De Wolf, F., Goudsmit, J., Schellekens, P. T., and Miedema, F. (1989). "Longitudinal study of leukocyte functions in homosexual men seroconverted for HIV: rapid and persistent loss of B cell function after HIV infection." Eur J Immunol 19(4): 667-673.

165. Obaro, S. K., Pugatch, D., and Luzuriaga, K. (2004). "Immunogenicity and efficacy of childhood vaccines in HIV-1-infected children." Lancet Infect Dis 4(8): 510-518.

166. Pancharoen, C., Ananworanich, J., and Thisyakorn, U. (2004). "Immunization for persons infected with human immunodeficiency virus." Curr HIV Res 2(4): 293-299.

167. Rivas, P., Herrero, M. D., Puente, S., Ramirez-Olivencia, G., and Soriano, V. (2007). "Immunizations in HIV-infected adults." AIDS Rev 9(3): 173-187.

168. Berek, C., Berger, A., and Apel, M. (1991). "Maturation of the immune response in germinal centers." Cell 67(6): 1121-1129.

169. Fogelman, I., Davey, V., Ochs, H. D., Elashoff, M., Feinberg, M. B., Mican, J., Siegel, J. P., Sneller, M., and Lane, H. C. (2000). "Evaluation of CD4+ T cell function In vivo in HIV-infected patients as measured by bacteriophage phiX174 immunization." J Infect Dis 182(2): 435-441.

170. Matsuda, F., Ishii, K., Bourvagnet, P., Kuma, K., Hayashida, H., Miyata, T., and Honjo, T. (1998). "The complete nucleotide sequence of the human immunoglobulin heavy chain variable region locus." J Exp Med 188(11): 2151-2162.

171. Kirkham, P. M. and H. W. Schroeder, Jr. (1994). "Antibody structure and the evolution of immunoglobulin V gene segments." Semin Immunol 6(6): 347-360.

172. Rosner, K., Winter, D. B., Tarone, R. E., Skovgaard, G. L., Bohr, V. A., and Gearhart, P. J. (2001). "Third complementarity-determining region of mutated VH immunoglobulin genes contains shorter V, D, J, P, and N components than non-mutated genes." Immunology 103(2): 179-187.

173. Sasso, E. H., Buckner, J. H., and Suzuki, L. A. (1995). "Ethnic differences of polymorphism of an immunoglobulin VH3 gene." J Clin Invest 96(3): 1591-1600.

174. Brezinschek, H. P., Foster, S. J., Brezinschek, R. I., Dorner, T., Domiati-Saad, R., and Lipsky, P. E. (1997). "Analysis of the human VH gene repertoire. Differential effects of selection and somatic hypermutation on human peripheral CD5(+)/IgM+ and CD5(-)/IgM+ B cells." J Clin Invest 99(10): 2488-2501.

175. Berberian, L., Goodglick, L., Kipps, T. J., and Braun, J. (1993). "Immunoglobulin VH3 gene products: natural ligands for HIV gp120." Science 261(5128): 1588-1591.

176. Goodglick, L., Zevit, N., Neshat, M. S., and Braun, J. (1995). "Mapping the Ig superantigen-binding site of HIV-1 gp120." J Immunol 155(11): 5151-5159.

177. French, N., Nakiyingi, J., Carpenter, L. M., Lugada, E., Watera, C., Moi, K., Moore, M., Antvelink, D., Mulder, D., Janoff, E. N., Whitworth, J., and Gilks, C. F. (2000). "23-valent pneumococcal polysaccharide vaccine in HIV-1-infected Ugandan adults: double-blind, randomised and placebo controlled trial." Lancet 355(9221): 2106-2111.

178. Morea, V., Tramontano, A., Rustici, M., Chothia, C., and Lesk, A. M. (1998). "Conformations of the third hypervariable region in the VH domain of immunoglobulins." J Mol Biol 275(2): 269-294.

179. Nair, N., Moss, W. J., Scott, S., Mugala, N., Ndhlovu, Z. M., Lilo, K., Ryon, J. J., Monze, M., Quinn, T. C., Cousens, S., Cutts, F., and Griffin, D. E. (2009). "HIV-1 infection in Zambian children impairs the development and avidity maturation of measles virus-specific immunoglobulin G after vaccination and infection." J Infect Dis 200(7): 1031-1038.

180. Brunell, P. A., Vimal, V., Sandu, M., Courville, T. M., Daar, E., and Israele, V. (1995). "Abnormalities of measles antibody response in human immunodeficiency virus type 1 (HIV-1) infection." J Acquir Immune Defic Syndr Hum Retrovirol 10(5): 540-548.

181. Dunn-Walters, D. K. and J. Spencer (1998). "Strong intrinsic biases towards mutation and conservation of bases in human IgVH genes during somatic hypermutation prevent statistical analysis of antigen selection." Immunology 95(3): 339-345.

182. Rogosch, T., Kerzel, S., Hoss, K., Hoersch, G., Zemlin, C., Heckmann, M., Berek, C., Schroeder, H. W., Jr., Maier, R. F., and Zemlin, M. (2012). "IgA response in preterm neonates shows little evidence of antigen-driven selection." J Immunol 189(11): 5449-5456.

183. Karray, S., Juompan, L., Maroun, R. C., Isenberg, D., Silverman, G. J., and Zouali, M. (1998). "Structural basis of the gp120 superantigen-binding site on human immunoglobulins." J Immunol 161(12): 6681-6688.

184. Luo, Z., Ronai, D., and Scharff, M. D. (2004). "The role of activation-induced cytidine deaminase in antibody diversification, immunodeficiency, and B-cell malignancies." J Allergy Clin Immunol 114(4): 726-735.

185. Beale, R. C., Petersen-Mahrt, S. K., Watt, I. N., Harris, R. S., Rada, C., and Neuberger, M. S. (2004). "Comparison of the differential context-dependence of DNA deamination by APOBEC enzymes: correlation with mutation spectra in vivo." J Mol Biol 337(3): 585-596.

186. Langlois, M. A., Beale, R. C., Conticello, S. G., and Neuberger, M. S. (2005). "Mutational comparison of the single-domained APOBEC3C and double-domained APOBEC3F/G anti-retroviral cytidine deaminases provides insight into their DNA target site specificities." Nucleic Acids Res 33(6): 1913-1923.

187. Huse, S. M., Huber, J. A., Morrison, H. G., Sogin, M. L., and Welch, D. M. (2007). "Accuracy and quality of massively parallel DNA pyrosequencing." Genome Biol 8(7): R143.

188. Eriksson, N., Pachter, L., Mitsuya, Y., Rhee, S. Y., Wang, C., Gharizadeh, B., Ronaghi, M., Shafer, R. W., and Beerenwinkel, N. (2008). "Viral population estimation using pyrosequencing." PLoS Comput Biol 4(4): e1000074.

189. Gilles, A., Meglecz, E., Pech, N., Ferreira, S., Malausa, T., and Martin, J. F. (2011). "Accuracy and quality assessment of 454 GS-FLX Titanium pyrosequencing." BMC Genomics 12: 245.

190. Logan, A. C., Gao, H., Wang, C., Sahaf, B., Jones, C. D., Marshall, E. L., Buno, I., Armstrong, R., Fire, A. Z., Weinberg, K. I., Mindrinos, M., Zehnder, J. L., Boyd, S. D., Xiao, W., Davis, R. W., and Miklos, D. B. (2011). "High-throughput VDJ sequencing for quantification of minimal residual disease in chronic lymphocytic leukemia and immune reconstitution assessment." Proc Natl Acad Sci U S A 108(52): 21194-21199.

191. Rubelt, F., Sievert, V., Knaust, F., Diener, C., Lim, T. S., Skriner, K., Klipp, E., Reinhardt, R., Lehrach, H., and Konthur, Z. (2012). "Onset of immune senescence defined by unbiased pyrosequencing of human immunoglobulin mRNA repertoires." PLoS One 7(11): e49774.

192. Eddy, S. R. (2009). "A new generation of homology search tools based on probabilistic inference." Genome Inform 23(1): 205-211.

193. Glanville, J., Zhai, W., Berka, J., Telman, D., Huerta, G., Mehta, G. R., Ni, I., Mei, L., Sundar, P. D., Day, G. M., Cox, D., Rajpal, A., and Pons, J. (2009). "Precise determination of the diversity of a combinatorial antibody library gives insight into the human immunoglobulin repertoire." Proc Natl Acad Sci U S A 106(48): 20216-20221.

194. Ewing, B. and P. Green (1998). "Base-calling of automated sequencer traces using phred. II. Error probabilities." Genome Res 8(3): 186-194.

195. Hofmann, W. P., Fernandez, B., Herrmann, E., Welsch, C., Mihm, U., Kronenberger, B., Feldmann, G., Spengler, U., Zeuzem, S., and Sarrazin, C. (2007). "Somatic hypermutation and mRNA expression levels of the BCL-6 gene in patients with hepatitis C virus-associated lymphoproliferative diseases." J Viral Hepat 14(7): 484-491.

196. Capello, D., Vitolo, U., Pasqualucci, L., Quattrone, S., Migliaretti, G., Fassone, L., Ariatti, C., Vivenza, D., Gloghini, A., Pastore, C., Lanza, C., Nomdedeu, J., Botto, B., Freilone, R., Buonaiuto, D., Zagonel, V., Gallo, E., Palestro, G., Saglio, G., Dalla-Favera, R., Carbone, A., and Gaidano, G. (2000). "Distribution and pattern of BCL-6 mutations throughout the spectrum of B-cell neoplasia." Blood 95(2): 651-659.

197. Crotty, S., Johnston, R. J., and Schoenberger, S. P. (2010). "Effectors and memories: Bcl-6 and Blimp-1 in T and B lymphocyte differentiation." Nat Immunol 11(2): 114-120.

198. Ballet, J. J., Sulcebe, G., Couderc, L. J., Danon, F., Rabian, C., Lathrop, M., Clauvel, J. P., and Seligmann, M. (1987). "Impaired anti-pneumococcal antibody response in patients with AIDS-related persistent generalized lymphadenopathy." Clin Exp Immunol 68(3): 479-487.

199. French, N., Gilks, C. F., Mujugira, A., Fasching, C., O'Brien, J., and Janoff, E. N. (1998). "Pneumococcal vaccination in HIV-1-infected adults in Uganda: humoral response and two vaccine failures." AIDS 12(13): 1683-1689.

200. Janoff, E. N., Fasching, C., Ojoo, J. C., O'Brien, J., and Gilks, C. F. (1997). "Responsiveness of human immunodeficiency virus type 1-infected Kenyan women with or without prior pneumococcal disease to pneumococcal vaccine." J Infect Dis 175(4): 975-978.

201. Epeldegui, M., Breen, E. C., Hung, Y. P., Boscardin, W. J., Detels, R., and Martinez-Maza, O. (2007). "Elevated expression of activation induced cytidine deaminase in peripheral blood mononuclear cells precedes AIDS-NHL diagnosis." AIDS 21(17): 2265-2270.

202. Cagigi, A., Nilsson, A., Pensieroso, S., and Chiodi, F. (2010). "Dysfunctional B-cell responses during HIV-1 infection: implication for influenza vaccination and highly active antiretroviral therapy." Lancet Infect Dis 10(7): 499-503.

203. Perreau, M., Savoye, A. L., De Crignis, E., Corpataux, J. M., Cubas, R., Haddad, E. K., De Leval, L., Graziosi, C., and Pantaleo, G. (2013). "Follicular helper T cells serve as the major CD4 T cell compartment for HIV-1 infection, replication, and production." J Exp Med 210(1): 143-156.

204. Victora, G. D. and M. C. Nussenzweig (2012). "Germinal centers." Annu Rev Immunol 30: 429-457.

205. Good-Jacobson, K. L. and M. J. Shlomchik (2010). "Plasticity and heterogeneity in the generation of memory B cells and long-lived plasma cells: the influence of germinal center interactions and dynamics." J Immunol 185(6): 3117-3125.

206. Iannello, A., Boulassel, M. R., Samarani, S., Debbeche, O., Tremblay, C., Toma, E., Routy, J. P., and Ahmad, A. (2010). "Dynamics and consequences of IL-21 production in HIV-infected individuals: a longitudinal and cross-sectional study." J Immunol 184(1): 114-126.

207. Hershberg, U., Uduman, M., Shlomchik, M. J., and Kleinstein, S. H. (2008). "Improved methods for detecting selection by mutation analysis of Ig V region sequences." Int Immunol 20(5): 683-694.

208. MacDonald, C. M., Boursier, L., D'Cruz, D. P., Dunn-Walters, D. K., and Spencer, J. (2010). "Mathematical analysis of antigen selection in somatically mutated immunoglobulin genes associated with autoimmunity." Lupus 19(10): 1161-1170.

209. Kolar, G. R., Mehta, D., Wilson, P. C., and Capra, J. D. (2006). "Diversity of the Ig repertoire is maintained with age in spite of reduced germinal centre cells in human tonsil lymphoid tissue." Scand J Immunol 64(3): 314-324.

210. Chong, Y., Ikematsu, H., Yamaji, K., Nishimura, M., Kashiwagi, S., and Hayashi, J. (2003). "Age-related accumulation of Ig V(H) gene somatic mutations in peripheral B cells from aged humans." Clin Exp Immunol 133(1): 59-66.

211. Matsumoto, Y., Marusawa, H., Kinoshita, K., Endo, Y., Kou, T., Morisawa, T., Azuma, T., Okazaki, I. M., Honjo, T., and Chiba, T. (2007). "Helicobacter pylori infection triggers aberrant expression of activation-induced cytidine deaminase in gastric epithelium." Nat Med 13(4): 470-476.

212. Stavnezer, J. (2011). "Complex regulation and function of activation-induced cytidine deaminase." Trends Immunol 32(5): 194-201.

213. Storck, S., Aoufouchi, S., Weill, J. C., and Reynaud, C. A. (2011). "AID and partners: for better and (not) for worse." Curr Opin Immunol 23(3): 337-344.

214. Basu, U., Chaudhuri, J., Alpert, C., Dutt, S., Ranganath, S., Li, G., Schrum, J. P., Manis, J. P., and Alt, F. W. (2005). "The AID antibody diversification enzyme is regulated by protein kinase A phosphorylation." Nature 438(7067): 508-511.

215. He, B., Santamaria, R., Xu, W., Cols, M., Chen, K., Puga, I., Shan, M., Xiong, H., Bussel, J. B., Chiu, A., Puel, A., Reichenbach, J., Marodi, L., Doffinger, R., Vasconcelos, J., Issekutz, A., Krause, J., Davies, G., Li, X., Grimbacher, B., Plebani, A., Meffre, E., Picard, C., Cunningham-Rundles, C., Casanova, J. L., and Cerutti, A. (2010). "The transmembrane activator TACI triggers immunoglobulin class switching by activating B cells through the adaptor MyD88." Nat Immunol 11(9): 836-845.

216. Chirmule, N., Kalyanaraman, V. S., Saxinger, C., Wong-Staal, F., Ghrayeb, J., and Pahwa, S. (1990). "Localization of B-cell stimulatory activity of HIV-1 to the carboxyl terminus of gp41." AIDS Res Hum Retroviruses 6(3): 299-305.

217. Amadori, A., Zamarchi, R., Veronese, M. L., Panozzo, M., Mazza, M. R., Barelli, A., Borri, A., and Chieco-Bianchi, L. (1991). "B-cell activation during HIV-1 infection. III. Down-regulating effect of mitogens." AIDS 5(7): 821-828.

218. Amadori, A., Zamarchi, R., Ciminale, V., Del Mistro, A., Siervo, S., Alberti, A., Colombatti, M., and Chieco-Bianchi, L. (1989). "HIV-1-specific B cell activation. A major constituent of spontaneous B cell activation during HIV-1 infection." J Immunol 143(7): 2146-2152.

219. Benedetto, A., Di Caro, A., Camporiondo, M. P., Gallone, D., Zaniratti, S., Tozzi, V., and Elia, G. (1992). "Identification of a CD21 receptor-deficient, non-Ig-secreting peripheral B lymphocyte subset in HIV-seropositive drug abusers." Clin Immunol Immunopathol 62(2): 139-147.

220. Lanzavecchia, A. and F. Sallusto (2009). "Human B cell memory." Curr Opin Immunol 21(3): 298-304.

221. Calame, K. L., Lin, K. I., and Tunyaplin, C. (2003). "Regulatory mechanisms that determine the development and function of plasma cells." Annu Rev Immunol 21: 205-230.

222. Chen, K., Xu, W., Wilson, M., He, B., Miller, N. W., Bengten, E., Edholm, E. S., Santini, P. A., Rath, P., Chiu, A., Cattalini, M., Litzman, J., Bussel, J. B., Huang, B., Meini, A., Riesbeck, K., Cunningham-Rundles, C., Plebani, A., and Cerutti, A. (2009). "Immunoglobulin D enhances immune surveillance by activating antimicrobial, proinflammatory and B cell-stimulating programs in basophils." Nat Immunol 10(8): 889-898.

223. Schacker, T. (2008). "The role of secondary lymphatic tissue in immune deficiency of HIV infection." AIDS 22 Suppl 3: S13-18.

224. Trautmann, L., Janbazian, L., Chomont, N., Said, E. A., Gimmig, S., Bessette, B., Boulassel, M. R., Delwart, E., Sepulveda, H., Balderas, R. S., Routy, J. P., Haddad, E. K., and Sekaly, R. P. (2006). "Upregulation of PD-1 expression on HIV-specific CD8+ T cells leads to reversible immune dysfunction." Nat Med 12(10): 1198-1202.

225. D'Souza, M., Fontenot, A. P., Mack, D. G., Lozupone, C., Dillon, S., Meditz, A., Wilson, C. C., Connick, E., and Palmer, B. E. (2007). "Programmed death 1 expression on HIV-specific CD4+ T cells is driven by viral replication and associated with T cell dysfunction." J Immunol 179(3): 1979-1987.

226. Kulpa, D. A., Lawani, M., Cooper, A., Peretz, Y., Ahlers, J., and Sekaly, R. P. (2013). "PD-1 coinhibitory signals: The link between pathogenesis and protection." Semin Immunol. doi: 10.1016/j.smim.2013.02.002.

227. Ojdana, D., Safiejko, K., Lipska, A., Radziwon, P., Dadan, J., and Tryniszewska, E. (2008). "Effector and memory CD4+ and CD8+ T cells in the chronic infection process." Folia Histochem Cytobiol 46(4): 413-417.

228. Mueller, S. N., Gebhardt, T., Carbone, F. R., and Heath, W. R. (2013). "Memory T cell subsets, migration patterns, and tissue residence." Annu Rev Immunol 31: 137-161.

229. Seder, R. A., Darrah, P. A., and Roederer, M. (2008). "T-cell quality in memory and protection: implications for vaccine design." Nat Rev Immunol 8(4): 247-258.

230. Gattinoni, L., Klebanoff, C. A., and Restifo, N. P. (2012). "Paths to stemness: building the ultimate antitumour T cell." Nat Rev Cancer 12(10): 671-684.

231. Brenchley, J. M., Price, D. A., Schacker, T. W., Asher, T. E., Silvestri, G., Rao, S., Kazzaz, Z., Bornstein, E., Lambotte, O., Altmann, D., Blazar, B. R., Rodriguez, B., Teixeira-Johnson, L., Landay, A., Martin, J. N., Hecht, F. M., Picker, L. J., Lederman, M. M., Deeks, S. G., and Douek, D. C. (2006). "Microbial translocation is a cause of systemic immune activation in chronic HIV infection." Nat Med 12(12): 1365-1371.

232. Jiang, W., Lederman, M. M., Hunt, P., Sieg, S. F., Haley, K., Rodriguez, B., Landay, A., Martin, J., Sinclair, E., Asher, A. I., Deeks, S. G., Douek, D. C., and Brenchley, J. M. (2009). "Plasma levels of bacterial DNA correlate with immune activation and the magnitude of immune restoration in persons with antiretroviral-treated HIV infection." J Infect Dis 199(8): 1177-1185.

233. Zamarchi, R., Barelli, A., Borri, A., Petralia, G., Ometto, L., Masiero, S., Chieco-Bianchi, L., and Amadori, A. (2002). "B cell activation in peripheral blood and lymph nodes during HIV infection." AIDS 16(9): 1217-1226.

234. Zhang, W. J., Koltun, W. A., Tilberg, A. F., Thompson, J. L., and Chorney, M. J. (2003). "Genetic control of interleukin-4-induced activation of the human signal transducer and activator of transcription 6 signaling pathway." Hum Immunol 64(4): 402-415.

235. Arpin, C., Dechanet, J., Van Kooten, C., Merville, P., Grouard, G., Briere, F., Banchereau, J., and Liu, Y. J. (1995). "Generation of memory B cells and plasma cells in vitro." Science 268(5211): 720-722.

236. Shlomchik, M. J. and F. Weisel (2012). "Germinal center selection and the development of memory B and plasma cells." Immunol Rev 247(1): 52-63.

237. Frasca, D., Landin, A. M., Lechner, S. C., Ryan, J. G., Schwartz, R., Riley, R. L., and Blomberg, B. B. (2008). "Aging down-regulates the transcription factor E2A, activation-induced cytidine deaminase, and Ig class switch in human B cells." J Immunol 180(8): 5283-5290.

238. Conticello, S. G., Ganesh, K., Xue, K., Lu, M., Rada, C., and Neuberger, M. S. (2008). "Interaction between antibody-diversification enzyme AID and spliceosome-associated factor CTNNBL1." Mol Cell 31(4): 474-484.

239. Kim, Y. and M. Tian (2010). "The recruitment of activation induced cytidine deaminase to the immunoglobulin locus by a regulatory element." Mol Immunol 47(9): 1860-1865.

240. Beniguel, L., Begaud, E., Peruchon, S., Cognasse, F., Gabrie, P., Marovich, M., Lucht, F., Genin, C., and Garraud, O. (2004). "Isotype profiles of anti-gp160 antibodies from HIV-infected patients in plasma and culture supernatants." Immunol Lett 93(1): 57-62.

241. Tomaras, G. D., Yates, N. L., Liu, P., Qin, L., Fouda, G. G., Chavez, L. L., Decamp, A. C., Parks, R. J., Ashley, V. C., Lucas, J. T., Cohen, M., Eron, J., Hicks, C. B., Liao, H. X., Self, S. G., Landucci, G., Forthal, D. N., Weinhold, K. J., Keele, B. F., Hahn, B. H., Greenberg, M. L., Morris, L., Karim, S. S., Blattner, W. A., Montefiori, D. C., Shaw, G. M., Perelson, A. S., and Haynes, B. F. (2008). "Initial B-cell responses to transmitted human immunodeficiency virus type 1: virion-binding immunoglobulin M (IgM) and IgG antibodies followed by plasma anti-gp41 antibodies with ineffective control of initial viremia." J Virol 82(24): 12449-12463.

242. Fouda, G. G., Yates, N. L., Pollara, J., Shen, X., Overman, G. R., Mahlokozera, T., Wilks, A. B., Kang, H. H., Salazar-Gonzalez, J. F., Salazar, M. G., Kalilani, L., Meshnick, S. R., Hahn, B. H., Shaw, G. M., Lovingood, R. V., Denny, T. N., Haynes, B., Letvin, N. L., Ferrari, G., Montefiori, D. C., Tomaras, G. D., Permar, S. R., and Center for HIV/AIDS Vaccine Immunology (2011). "HIV-specific functional antibody responses in breast milk mirror those in plasma and are primarily mediated by IgG antibodies." J Virol 85(18): 9555-9567.

243. Rolf, J., Bell, S. E., Kovesdi, D., Janas, M. L., Soond, D. R., Webb, L. M., Santinelli, S., Saunders, T., Hebeis, B., Killeen, N., Okkenhaug, K., and Turner, M. (2010). "Phosphoinositide 3-kinase activity in T cells regulates the magnitude of the germinal center reaction." J Immunol 185(7): 4042-4052.

244. Zotos, D., Coquet, J. M., Zhang, Y., Light, A., D’Costa, K., Kallies, A., Corcoran, L. M., Godfrey, D. I., Toellner, K. M., Smyth, M. J., Nutt, S. L., and Tarlinton, D. M. (2010). "IL-21 regulates germinal center B cell differentiation and proliferation through a B cell-intrinsic mechanism." J Exp Med 207(2): 365-378.

245. Takahashi, Y., Ohta, H., and Takemori, T. (2001). "Fas is required for clonal selection in germinal centers and the subsequent establishment of the memory B cell repertoire." Immunity 14(2): 181-192.

246. Pallikkuth, S., Parmigiani, A., and Pahwa, S. (2012). "The role of interleukin-21 in HIV infection." Cytokine Growth Factor Rev 23(4-5): 173-180.

247. Cubas, R. A., Mudd, J. C., Savoye, A. L., Perreau, M., van Grevenynghe, J., Metcalf, T., Connick, E., Meditz, A., Freeman, G. J., Abesada-Terk, G., Jr., Jacobson, J. M., Brooks, A. D., Crotty, S., Estes, J. D., Pantaleo, G., Lederman, M. M., and Haddad, E. K. (2013). "Inadequate T follicular cell help impairs B cell immunity during HIV infection." Nat Med 19(4): 494-499.

248. Popovic, M., Tenner-Racz, K., Pelser, C., Stellbrink, H. J., van Lunzen, J., Lewis, G., Kalyanaraman, V. S., Gallo, R. C., and Racz, P. (2005). "Persistence of HIV-1 structural proteins and glycoproteins in lymph nodes of patients under highly active antiretroviral therapy." Proc Natl Acad Sci U S A 102(41): 14807-14812.

249. Shen, X. and Tomaras, G. D. (2011). "Alterations of the B-cell response by HIV-1 replication." Curr HIV/AIDS Rep 8(1): 23-30.

250. Cagigi, A., Du, L., Dang, L. V., Grutzmeier, S., Atlas, A., Chiodi, F., Pan-Hammarstrom, Q., and Nilsson, A. (2009). "CD27(-) B-cells produce class switched and somatically hyper-mutated antibodies during chronic HIV-1 infection." PLoS One 4(5): e5427.

251. Chatzigeorgiou, A., Lyberi, M., Chatzilymperis, G., Nezos, A., and Kamper, E. (2009). "CD40/CD40L signaling and its implication in health and disease." Biofactors 35(6): 474-483.

252. Graham, J. P., Arcipowski, K. M., and Bishop, G. A. (2010). "Differential B-lymphocyte regulation by CD40 and its viral mimic, latent membrane protein 1." Immunol Rev 237(1): 226-248.

253. van der Donk, E. M., Schutten, M., Osterhaus, A. D., and van der Heijden, R. W. (1994). "Molecular characterization of variable heavy and light chain regions of five HIV type 1-specific human monoclonal antibodies." AIDS Res Hum Retroviruses 10(12): 1639-1649.

254. Chui, Y. L., Lozano, F., Jarvis, J. M., Pannell, R., and Milstein, C. (1995). "A reporter gene to analyse the hypermutation of immunoglobulin genes." J Mol Biol 249(3): 555-563.

255. Ruckerl, F. and Bachl, J. (2005). "Activation-induced cytidine deaminase fails to induce a mutator phenotype in the human pre-B cell line Nalm-6." Eur J Immunol 35(1): 290-298.

256. Yoshikawa, K., Okazaki, I. M., Eto, T., Kinoshita, K., Muramatsu, M., Nagaoka, H., and Honjo, T. (2002). "AID enzyme-induced hypermutation in an actively transcribed gene in fibroblasts." Science 296(5575): 2033-2036.

APPENDIX A

MUTATION PATTERN TABLES

Table A1: Biochemical Characteristics of Amino Acids in CDR3 Regions from VH3 454-Pyrosequenced Samples.

IgD                         Control                         HIV-1+                      p value
CDR3 Length (amino acids)   13.98 (12.3-15.5)               15.3 (13.4-17.0)            0.08
Hydrophobicity Index        -0.005915 (-0.0315 to 0.0004)   0.004 (-0.087 to 0.084)     0.33
Polar Residues              25.15 (22.7-26.8)               25.3 (23.9-27.2)            0.51
Acidic Residues             13.83 (12.6-15.9)               14.1 (12.9-16.6)            0.51
Basic Residues              11.0 (9.6-11.9)                 11.1 (10.0-12.4)            0.44
Uncharged Polar Residues    39.22 (36.2-42.8)               38.2 (24.8-40.9)            0.38
Nonpolar Residues           35.74 (34.0-38.4)               36.0 (34.7-39.9)            0.65
Aromatic Residues           22.33 (20.9-24.5)               20.7 (14.2-21.4)            0.002

IgM                         Control Day 0                   HIV-1+ Day 0                p value
CDR3 Length (amino acids)   14.50 (12.4-16.2)               14.2 (12.8-15.4)            0.48
Hydrophobicity Index        -0.01288 (-0.0452 to -0.0043)   -0.017 (-0.053 to 0.015)    0.67
Polar Residues              25.44 (23.4-28.2)               26.9 (23.8-28.0)            0.44
Acidic Residues             14.01 (13.2-16.6)               14.4 (13.4-15.9)            0.37
Basic Residues              11.28 (10.1-13.0)               11.8 (10.2-13.8)            0.28
Uncharged Polar Residues    39.94 (36.4-41.7)               38.0 (36.9-41.8)            0.54
Nonpolar Residues           34.66 (34.3-37.3)               35.0 (32.3-36.4)            0.96
Aromatic Residues           22.14 (19.1-23.6)               20.8 (18.4-23.2)            0.28

IgA                         Control Day 0                   HIV-1+ Day 0                p value
CDR3 Length (amino acids)   14.27 (13.6-15.5)               14.1 (10.7-15.9)            0.84
Hydrophobicity Index        -0.01783 (-0.033 to 0.0776)     -0.013 (-0.140 to 0.014)    1.00
Polar Residues              26.91 (21.5-30.2)               27.0 (24.2-29.2)            1.00
Acidic Residues             14.21 (8.7-19.7)                14.8 (13.6-17.0)            0.60
Basic Residues              12.11 (10.5-13.6)               12.2 (10.5-12.7)            0.83
Uncharged Polar Residues    37.69 (24.1-40.2)               37.6 (35.5-49.9)            0.68
Nonpolar Residues           35.92 (34.3-45.7)               34.8 (23.2-37.3)            0.35
Aromatic Residues           18.41 (2.2-22.7)                20.0 (18.6-24.4)            0.25

IgG                         Control Day 0                   HIV-1+ Day 0                p value
CDR3 Length (amino acids)   14.50 (13.1-15.2)               13.9 (10.6-16.0)            0.89
Hydrophobicity Index        -0.0096 (-0.0621 to 0.0689)     -0.024 (-0.159 to 0.062)    0.74
Polar Residues              27.68 (24.3-31.8)               27.2 (25.7-38.2)            1.00
Acidic Residues             14.31 (11.7-17.0)               14.9 (14.0-17.2)            0.37
Basic Residues              12.77 (10.9-15.0)               12.4 (11.0-21.0)            0.74
Uncharged Polar Residues    36.08 (32.2-38.6)               36.6 (29.2-39.6)            0.54
Nonpolar Residues           36.72 (33.4-39.8)               34.9 (32.5-42.6)            0.32
Aromatic Residues           19.24 (16.8-20.3)               20.5 (15.5-24.7)            0.28

Residue characteristics were calculated with the number of each type of residue divided by the total number of residues in the CDR3 region of each sequence. Values are listed as the group median with the range of individual means in parentheses.
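The calculation described in the note to Table A1 (a per-sequence residue fraction, averaged within each subject, then summarized as a group median and range) can be sketched as follows. This is a minimal illustration only: the residue class assignments, the function names, and the CDR3 sequences are assumptions made for the example, not the groupings or data used in this analysis.

```python
from statistics import mean, median

# Illustrative residue classes (an assumption; the groupings used for Table A1
# are not listed in this appendix).
RESIDUE_CLASSES = {
    "acidic": set("DE"),
    "basic": set("KRH"),
    "aromatic": set("FWY"),
}


def residue_percentages(cdr3):
    """Percent of residues in each class for one CDR3 amino acid sequence."""
    if not cdr3:
        raise ValueError("CDR3 sequence is empty")
    return {
        name: 100.0 * sum(aa in members for aa in cdr3) / len(cdr3)
        for name, members in RESIDUE_CLASSES.items()
    }


def group_summary(subjects, residue_class):
    """Group median and (min, max) range of the per-subject mean percentages."""
    subject_means = [
        mean(residue_percentages(seq)[residue_class] for seq in seqs)
        for seqs in subjects.values()
    ]
    return median(subject_means), (min(subject_means), max(subject_means))


# Made-up CDR3 sequences for three hypothetical subjects.
subjects = {
    "S1": ["ARDYYGSGSYFDY", "AKDRGYFDY"],
    "S2": ["ARWFDP"],
    "S3": ["ARDEKR", "AAAAAA"],
}
print(group_summary(subjects, "aromatic"))
```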

Table A2: RGYW/WRCY Motifs and Targeted Mutation Frequencies in CDR and FR Regions in VH3 454-Pyrosequenced Samples.

IgD                                                        Control Day 0        HIV-1+ Day 0         p value
Number of RGYW/WRCY motifs per VH segment                  17.74 (17.1-18.1)    17.5 (16.3-19.3)     0.88
Median Percent (Subject Ranges)
% of CDR nucleotides mutated                               1.90 (1.4-4.0)       7.03 (1.8-20.0)      0.03
% of CDR mutations present in RGYW/WRCY motifs             62.19 (51.9-64.0)    64.09 (48.1-77.6)    0.33
% of FR nucleotides mutated                                1.55 (0.9-2.8)       3.17 (0.7-9.6)       0.08
% of FR mutations present in RGYW/WRCY motifs              25.85 (19.3-36.6)    32.15 (25.2-43.7)    0.05
% all nucleotides mutated                                  1.64 (1.0-2.7)       3.70 (0.9-10.4)      0.08
% of all mutated nucleotides present in RGYW/WRCY motifs   30.75 (24.3-41.6)    40.24 (27.7-46.3)    0.02

IgM                                                        Control Day 0        HIV-1+ Day 0         p value
Number of RGYW/WRCY motifs per VH segment                  18.15 (17.6-18.5)    17.7 (16.9-18.3)     0.04
Median Percent (Subject Ranges)
% of CDR nucleotides mutated                               4.32 (2.6-6.1)       3.41 (2.6-10.5)      0.74
% of CDR mutations present in RGYW/WRCY motifs             65.38 (63.1-67.8)    64.47 (53.3-75.5)    0.96
% of FR nucleotides mutated                                1.73 (1.3-2.1)       1.71 (1.0-3.8)       0.85
% of FR mutations present in RGYW/WRCY motifs              29.39 (25.6-30.8)    32.72 (23.9-37.7)    0.24
% all nucleotides mutated                                  2.16 (1.6-2.7)       1.95 (1.3-4.5)       0.74
% of all mutated nucleotides present in RGYW/WRCY motifs   36.24 (32.8-40.0)    40.57 (28.0-48.4)    0.07

IgA                                                        Control Day 0        HIV-1+ Day 0         p value
Number of RGYW/WRCY motifs per VH segment                  17.95 (17.2-18.1)    17.7 (16.9-18.6)     0.25
Median Percent (Subject Ranges)
% of CDR nucleotides mutated                               15.28 (13.8-16.5)    12.89 (9.2-23.8)     0.25
% of CDR mutations present in RGYW/WRCY motifs             69.43 (64.1-74.8)    67.75 (52.3-93.4)    0.61
% of FR nucleotides mutated                                4.67 (4.1-5.2)       3.58 (2.6-6.6)       0.37
% of FR mutations present in RGYW/WRCY motifs              36.16 (35.6-38.9)    35.47 (33.5-39.8)    0.41
% all nucleotides mutated                                  6.23 (5.5-6.7)       5.00 (3.6-9.3)       0.35
% of all mutated nucleotides present in RGYW/WRCY motifs   47.09 (46.1-52.2)    47.13 (40.3-71.8)    0.92

IgG                                                        Control Day 0        HIV-1+ Day 0         p value
Number of RGYW/WRCY motifs per VH segment                  17.82 (17.2-18.0)    17.4 (16.4-18.0)     0.07
Median Percent (Subject Ranges)
% of CDR nucleotides mutated                               15.64 (13.3-16.8)    11.81 (8.4-18.3)     0.06
% of CDR mutations present in RGYW/WRCY motifs             66.47 (63.1-70.1)    65.93 (57.8-70.8)    0.89
% of FR nucleotides mutated                                4.50 (4.4-5.2)       3.38 (2.9-4.9)       0.28
% of FR mutations present in RGYW/WRCY motifs              36.63 (33.2-39.8)    38.08 (32.6-39.9)    0.37
% all nucleotides mutated                                  6.26 (5.7-6.8)       4.44 (3.8-6.9)       0.07
% of all mutated nucleotides present in RGYW/WRCY motifs   47.18 (43.8-49.0)    46.76 (43.8-49.1)    0.67

Group medians are listed with the range of individual means in each group in parentheses.
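For readers who wish to reproduce motif counts of the kind reported in Table A2, the following is a minimal sketch of locating RGYW/WRCY hotspots in a nucleotide sequence and computing the share of mutations that fall within them. It rests on several assumptions: the IUPAC codes are expanded as R = A/G, Y = C/T, W = A/T; a position matching both motifs (for example AGCT) is counted once; and a mutation is counted as "in a motif" if it lies anywhere within the 4-nucleotide window. The function names and test sequences are hypothetical, and the analysis pipeline used for this thesis may have applied different rules.

```python
import re

# IUPAC codes: R = A/G, Y = C/T, W = A/T.
# Zero-width lookaheads let overlapping motifs be found.
RGYW = re.compile(r"(?=[AG]G[CT][AT])")
WRCY = re.compile(r"(?=[AT][AG]C[CT])")


def hotspot_starts(seq):
    """Sorted start positions (0-based) of every RGYW or WRCY motif."""
    seq = seq.upper()
    starts = {m.start() for m in RGYW.finditer(seq)}
    starts |= {m.start() for m in WRCY.finditer(seq)}
    return sorted(starts)


def percent_mutations_in_hotspots(seq, mutated_positions):
    """Percent of mutated positions lying inside any 4-nt hotspot window."""
    if not mutated_positions:
        return 0.0
    covered = {start + k for start in hotspot_starts(seq) for k in range(4)}
    hits = sum(pos in covered for pos in mutated_positions)
    return 100.0 * hits / len(mutated_positions)


print(hotspot_starts("AGCAGCT"))
print(percent_mutations_in_hotspots("TTAGCTTT", [3, 7]))
```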

Table A3: Nucleotide Mutation Patterns and Proportions in VH3 454-Pyrosequenced Samples.

IgM                                        Control Day 0        HIV-1+ Day 0         p value
Median Percent (Subject Range)
CDR
  % C nucleotides mutated                  5.49 (3.1-9.1)       6.10 (3.2-15.9)      0.48
  % G nucleotides mutated                  6.23 (3.9-8.8)       5.55 (3.6-18.0)      0.96
  % A nucleotides mutated                  4.68 (2.6-6.4)       4.20 (2.6-10.5)      0.61
  % T nucleotides mutated                  3.26 (1.8-4.3)       3.30 (1.9-8.2)       0.60
FR
  % C nucleotides mutated                  2.07 (1.3-2.4)       1.81 (1.3-4.6)       0.44
  % G nucleotides mutated                  1.65 (1.3-2.2)       1.74 (1.1-3.9)       0.81
  % A nucleotides mutated                  2.33 (1.8-2.7)       2.14 (1.1-4.6)       1.00
  % T nucleotides mutated                  1.21 (1.2-1.5)       1.30 (0.8-2.7)       0.74
CDR
  % of mutations that were C nucleotides   21.33 (19.5-23.0)    23.00 (14.9-24.5)    0.24
  % of mutations that were G nucleotides   32.64 (29.0-36.0)    32.98 (26.0-39.8)    0.96
  % of mutations that were A nucleotides   27.00 (23.5-29.8)    26.48 (24.4-35.2)    0.54
  % of mutations that were T nucleotides   17.69 (17.1-23.5)    17.28 (12.4-23.8)    0.42
FR
  % of mutations that were C nucleotides   23.93 (20.5-28.0)    28.41 (22.4-32.6)    0.02
  % of mutations that were G nucleotides   31.77 (27.3-40.0)    33.39 (28.7-36.9)    0.54
  % of mutations that were A nucleotides   22.58 (20.0-24.6)    20.89 (16.7-25.8)    0.17
  % of mutations that were T nucleotides   19.76 (17.0-28.5)    17.52 (13.5-24.5)    0.09

IgA                                        Control Day 0        HIV-1+ Day 0         p value
Median Percent (Subject Range)
CDR
  % C nucleotides mutated                  18.97 (7.9-20.5)     16.44 (10.0-20.7)    0.41
  % G nucleotides mutated                  19.90 (18.4-28.2)    20.84 (11.1-64.7)    0.68
  % A nucleotides mutated                  18.33 (15.3-21.9)    14.26 (11.2-20.7)    0.02
  % T nucleotides mutated                  12.37 (9.1-13.5)     9.77 (6.7-27.2)      0.14
FR
  % C nucleotides mutated                  5.74 (4.4-6.0)       4.16 (2.8-6.8)       0.17
  % G nucleotides mutated                  4.74 (4.4-6.1)       3.92 (2.8-9.6)       0.25
  % A nucleotides mutated                  5.56 (4.6-6.1)       4.88 (3.5-10.4)      0.40
  % T nucleotides mutated                  2.82 (2.5-3.4)       2.37 (1.6-3.2)       0.07
CDR
  % of mutations that were C nucleotides   18.82 (9.0-19.6)     18.88 (15.3-23.3)    0.68
  % of mutations that were G nucleotides   31.60 (27.3-33.8)    32.13 (27.2-40.5)    0.30
  % of mutations that were A nucleotides   30.52 (28.4-39.4)    27.95 (21.2-41.0)    0.17
  % of mutations that were T nucleotides   19.81 (17.8-21.9)    19.77 (15.2-22.9)    0.35
FR
  % of mutations that were C nucleotides   28.2 (25.5-28.7)     28.64 (18.4-30.0)    0.41
  % of mutations that were G nucleotides   35.09 (32.5-37.0)    35.31 (32.0-43.0)    0.63
  % of mutations that were A nucleotides   21.26 (20.6-26.8)    21.36 (20.2-27.5)    0.96
  % of mutations that were T nucleotides   14.97 (14.5-17.7)    14.02 (11.1-17.3)    0.17

“% C nucleotides mutated” indicates the proportion of C nucleotides in either the CDR1/2 or FR1/2/3 regions that were mutated relative to the total number of C nucleotides present in the unmutated reference sequence, expressed as a percent. “% of mutations that were C nucleotides” indicates the proportion of mutations in either the CDR1/2 or FR1/2/3 regions that were C nucleotides in the unmutated reference sequence relative to the total number of mutations in the region, expressed as a percent. The medians of each group are listed with individual patient mean ranges in parentheses.
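The two quantities defined in the note to Table A3 use different denominators, which a short sketch may make concrete. The example below assumes a reference and an observed sequence that are already aligned to the same length and differ only by substitutions; the function name and the two toy sequences are hypothetical and serve only to illustrate the definitions.

```python
def mutation_percentages(reference, observed):
    """Return (% of each reference base mutated, % of all mutations at each base)."""
    if len(reference) != len(observed):
        raise ValueError("sequences must be aligned to the same length")
    ref_counts = {base: 0 for base in "ACGT"}
    mut_counts = {base: 0 for base in "ACGT"}
    for ref_base, obs_base in zip(reference.upper(), observed.upper()):
        if ref_base not in ref_counts:
            continue  # skip gaps or ambiguous reference positions
        ref_counts[ref_base] += 1
        if obs_base != ref_base:
            mut_counts[ref_base] += 1
    total_mutations = sum(mut_counts.values())
    # Denominator: number of that base in the unmutated reference.
    pct_base_mutated = {
        b: 100.0 * mut_counts[b] / ref_counts[b] if ref_counts[b] else 0.0
        for b in "ACGT"
    }
    # Denominator: total number of mutations in the region.
    pct_of_mutations = {
        b: 100.0 * mut_counts[b] / total_mutations if total_mutations else 0.0
        for b in "ACGT"
    }
    return pct_base_mutated, pct_of_mutations


# Toy region with 4 C, 2 G, 1 A, 1 T; one C and one G are mutated.
per_base, per_mutation = mutation_percentages("CCCCGGAT", "CTCCGAAT")
print(per_base)
print(per_mutation)
```

In this toy case, C and G each account for half of the mutations, yet only a quarter of the C nucleotides are mutated compared with half of the G nucleotides.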

Table A4: Nucleotide Mutations and Adjacent Nucleotide Patterns in VH3 454-Pyrosequenced Samples.

IgD
              5’ R:Y Ratio                     3’ R:Y Ratio
Mutation      Control   Viremic   p value      Control   Viremic   p value
C → G (Tr)    2.5:1     2.0:1     1.00         1.1:1     1:1.2     0.28
C → A (Tr)    1.1:1     1.2:1     0.57         1.4:1     1.0:1     0.96
C → T (Ts)    1.4:1     3.6:1     0.08         1:1.7     1:2.4     0.19
G → C (Tr)    3.4:1     4.6:1     0.96         1:3.1     1:3.0     0.88
G → A (Ts)    2.1:1     2.2:1     0.72         1.3:1     1:1.1     0.64
G → T (Tr)    1:1.8     1:1.4     0.60         1:4.2     1:5.9     0.51
A → C (Tr)    1:2.6     1:1.7     0.19         1:1.6     1:1.9     0.96
A → G (Ts)    1.6:1     1.4:1     0.38         3.1:1     2.6:1     0.13
A → T (Tr)    1:1.7     1:1.9     0.72         1:3.3     1:3.8     0.88
T → C (Ts)    1:1.4     1.3:1     0.28         1.3:1     1.9:1     0.44
T → G (Tr)    1.3:1     1.0:1     0.75         1:1.2     2.1:1     0.13
T → A (Tr)    1.6:1     1.3:1     0.96         5.1:1     4.5:1     0.88

IgM
              5’ R:Y Ratio                     3’ R:Y Ratio
Mutation      Control   Viremic   p value      Control   Viremic   p value
C → G (Tr)    3.0:1     3.2:1     0.33         1:2.3     1:1.2     0.88
C → A (Tr)    1.4:1     1.1:1     0.57         1:1.1     1.2:1     0.83
C → T (Ts)    2.4:1     2.6:1     0.43         1:1.6     1:2.0     0.23
G → C (Tr)    3.9:1     3.4:1     0.43         1:3.0     1:3.4     0.72
G → A (Ts)    2.7:1     2.0:1     0.57         1:1.1     1:1.9     0.16
G → T (Tr)    1:1.3     1:1.5     0.23         1:3.0     1:3.3     0.96
A → C (Tr)    1:1.8     1:1.4     0.09         1:1.4     1:1.3     0.33
A → G (Ts)    1.2:1     1:1.1     0.60         2.2:1     2.2:1     0.60
A → T (Tr)    1:2.6     1:2.0     0.96         1:2.5     1:2.3     0.44
T → C (Ts)    1:1.1     1.2:1     0.10         1.8:1     2.3:1     0.23
T → G (Tr)    1.5:1     1.1:1     0.02         1:1.4     1.1:1     0.04
T → A (Tr)    1.6:1     1.8:1     0.88         3.0:1     4.4:1     0.19

IgA
              5’ R:Y Ratio                     3’ R:Y Ratio
Mutation      Control   Viremic   p value      Control   Viremic   p value
C → G (Tr)    4.3:1     4.5:1     0.54         1:2.3     1:2.3     0.68
C → A (Tr)    2.2:1     3.0:1     0.06         1:1.3     1:1.3     0.84
C → T (Ts)    3.9:1     4.1:1     1.00         1:1.9     1:2.1     1.00
G → C (Tr)    3.4:1     3.3:1     1.00         1:3.0     1:3.0     0.92
G → A (Ts)    2.4:1     2.2:1     0.35         1:3.3     1:3.3     0.79
G → T (Tr)    1.1:1     1:1.0     0.47         1:4.1     1:3.8     0.79
A → C (Tr)    1:1.3     1:1.1     0.35         1:1.0     1:1.0     0.96
A → G (Ts)    1:1.3     1:1.2     0.84         1.9:1     2.0:1     0.47
A → T (Tr)    1:3.5     1:4.8     0.24         1:1.8     1:1.8     0.68
T → C (Ts)    1.4:1     1.4:1     0.20         2.6:1     2.6:1     0.84
T → G (Tr)    1.4:1     1.1:1     0.03         1.3:1     1.5:1     0.35
T → A (Tr)    1.9:1     2.1:1     0.79         3.6:1     4.1:1     0.37

IgG
              5’ R:Y Ratio                     3’ R:Y Ratio
Mutation      Control   Viremic   p value      Control   Viremic   p value
C → G (Tr)    4.5:1     4.2:1     0.24         1:3.0     1:2.1     0.04
C → A (Tr)    3.2:1     1.9:1     0.009        1:1.7     1:1.1     0.24
C → T (Ts)    4.3:1     4.2:1     0.89         1:1.9     1:2.0     0.56
G → C (Tr)    3.0:1     3.0:1     1.00         1:2.8     1:2.9     0.74
G → A (Ts)    2.4:1     2.2:1     0.74         1:3.1     1:2.9     0.32
G → T (Tr)    1.2:1     1:1.1     0.17         1:3.5     1:3.0     0.23
A → C (Tr)    1:1.3     1:1.5     0.89         1.1:1     1:1.2     0.24
A → G (Ts)    1:1.0     1:1.1     0.96         1.9:1     2.2:1     0.54
A → T (Tr)    1:2.7     1:2.8     0.89         1:1.8     1:2.3     0.09
T → C (Ts)    1.6:1     1.6:1     0.61         2.5:1     2.6:1     0.81
T → G (Tr)    1.3:1     1:1.0     0.14         1.2:1     2.0:1     0.32
T → A (Tr)    1.7:1     1.0:1     0.31         3.9:1     3.9:1     0.77

R = A or G nucleotides, Y = C or T nucleotides. Mutations: Ts = transition (purine ↔ purine or pyrimidine ↔ pyrimidine); Tr = transversion (purine ↔ pyrimidine). The group median ratio of R:Y nucleotides in the adjacent -1 (5’) or +1 (3’) positions is listed for each mutation type.
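The classification and tallying behind Table A4 can be illustrated with a short sketch: each substitution is labeled a transition or transversion, and the purine (R) or pyrimidine (Y) identity of the nucleotides at the -1 (5’) and +1 (3’) positions is counted. The function names and the example sequence are hypothetical, and the sketch does not reproduce how ratios were summarized per subject or per group.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def mutation_class(ref_base, new_base):
    """'Ts' for purine<->purine or pyrimidine<->pyrimidine, otherwise 'Tr'."""
    if ref_base == new_base:
        raise ValueError("not a substitution")
    pair = {ref_base, new_base}
    return "Ts" if pair <= PURINES or pair <= PYRIMIDINES else "Tr"


def flanking_ry_counts(reference, mutated_positions):
    """Count R and Y nucleotides at the -1 (5') and +1 (3') positions."""
    counts = {"5'": {"R": 0, "Y": 0}, "3'": {"R": 0, "Y": 0}}
    for pos in mutated_positions:
        for label, idx in (("5'", pos - 1), ("3'", pos + 1)):
            if 0 <= idx < len(reference):
                base = reference[idx]
                if base in PURINES:
                    counts[label]["R"] += 1
                elif base in PYRIMIDINES:
                    counts[label]["Y"] += 1
    return counts


# Two mutated C positions (indices 2 and 6) in a made-up reference sequence.
print(mutation_class("C", "T"), mutation_class("C", "G"))
print(flanking_ry_counts("AGCTAGCA", [2, 6]))
```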

APPENDIX B

FAILED EXPERIMENTS

AID Functional Assay

Introduction

Directly measuring human AID protein function has proven difficult. Several in vitro assays have been developed in numerous cell lines to measure both endogenous and transgenic AID protein function [A1-A5]. Previous studies have assessed AID function either indirectly, by measuring mutation frequencies in V gene sequences from stimulated human B cell lines [A1, A2], or more directly, by detecting mutator activity in a reporter gene assay [A3-A5]. As described in Chapter III, we have already assessed AID function indirectly by sequencing IgG-VH3 mRNA from fresh primary human B cells. Our results indicated that AID function may be impaired. We attempted to confirm these results by developing a reporter gene assay to measure AID function in primary human B cells. We intended to correlate the results of the reporter gene assay with the sequencing results to further validate our hypothesis.

The AID functional reporter gene assay was modeled after the in vitro assay designed by Ruckerl et al. [A4]. This group measured AID activity in a cell line known to express AID (BL70, a human germinal center-derived Burkitt’s lymphoma cell line) and in a cell line with no AID expression (Nalm-6, a human pre-B cell line). AID activity was measured by the frequency at which a GFP gene with a premature stop codon, present on a transfected plasmid, was reverted to encode a full-length functional protein through AID-mediated mutation. Reversion of the stop codon present in the GFP sequence would indicate hypermutation. Indeed, in the AID-expressing cell line, all 8 samples tested were positive for GFP expression when measured by flow cytometry. Consistent with the hypothesis that AID activity caused the reversion of the stop codon, none of the 8 samples tested from the cell line that did not express AID had any GFP fluorescence.

We attempted to optimize this assay for use in primary human B cells (derived from PBMCs) in order to test AID functional capacity in PBMCs isolated from healthy control subjects and HIV-1-infected patients. The plasmid to be transfected was constructed using a pmaxFP-Green-C plasmid backbone (Lonza, Gaithersburg, MD). This plasmid contains a GFP gene controlled by the constitutively active cytomegalovirus (CMV) promoter and a Kanamycin resistance gene for propagation in E. coli cells. The complete mechanism of targeting AID to Ig V genes in vivo is unknown; however, a few elements have been identified that enhance SHM at specific regions [A6-A10]. Immunoglobulin heavy chain large intron enhancer sequences (ELi) have been identified in both mice and humans and are thought to encourage SHM by regulating VH gene transcription [A11-A13]. The presence of an enhancer sequence near the reporter gene on a transfected plasmid has also been shown to increase mutation frequencies in a reporter gene [A9, A10]. Therefore, the human HS1,2 ELi sequence was also cloned into the pmaxFP-Green-C plasmid backbone at the 3’ end of the GFP gene (Figure B1). AID also preferentially deaminates cytidine residues in the hotspot sequence motif “RGYW” [A14-A16]. Such motifs are prevalent in the V genes [A17-A19] and are also found in other genes known to be mutated by AID in vivo, including Bcl6 [A10, A19, A20]. The GFP gene sequence in the pmaxFP-Green-C plasmid has several RGYW sequence motifs. In one of these motifs, a stop codon was introduced, preventing translation of the full-length GFP protein (Figure B1). The final element added to the plasmid backbone was a triple-encoded FLAG-tag sequence (5’-ATGGACTACAAAGACCATGACGGTGATTATAAAGATCATGACATCGATTACAAGGATGACGATGACAAG-3’) at the 5’ end of the GFP gene to measure transfection efficiency (Figure B1).
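As a quick sanity check on the FLAG-tag sequence quoted above, the nucleotide sequence can be translated in frame using the standard genetic code. The sketch below builds the codon table from the conventional TCAG ordering; the variable and function names are illustrative only.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# 3xFLAG coding sequence as given in the text (5' to 3').
FLAG_DNA = (
    "ATGGACTACAAAGACCATGACGGTGATTA"
    "TAAAGATCATGACATCGATTACAAGGATGACGATGACAAG"
)


def translate(dna):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


print(translate(FLAG_DNA))  # MDYKDHDGDYKDHDIDYKDDDDK
```

The translation contains no internal stop codon and ends with the DYKDDDDK FLAG epitope, preceded by two related DYKDHD repeats.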

Fresh and stimulated PBMC were transfected with the plasmid using a Human B Cell Nucleofector kit from Lonza, shown to have up to a 36% transfection efficiency with a control plasmid. We hypothesized that in transfected B cells from stimulated control PBMCs, AID activity would mutate the stop codon in the GFP gene, reverting it to an amino acid codon. Should this occur, GFP fluorescence would be detectable by flow cytometry. However, in stimulated PBMCs isolated from HIV-1-infected patients, who may have diminished AID activity, fewer mutations would revert the stop codon in GFP to an amino acid codon, and therefore fewer CD19+ B cells would express GFP. We developed the functional assay and tested the ability of PBMC to be nucleofected using this plasmid. While our nucleofection efficiencies are currently quite low, there is much that could be done to optimize this assay.

Methods

PCR

Amplification of ELi enhancer sequence. Primers used for amplification were previously published [A11], commercially synthesized primers designed to amplify the HS1,2A enhancer region at the 3’ end of the constant alpha genes of the human immunoglobulin gene locus (Table B1). PCR (25 µl) consisted of: 100 ng DNA, 2.5 µl 10x KOD Hot Start buffer, 2.5 µl 2 mM dNTPs, 1.5 µl 25 mM MgSO4, 0.6 µl Eli FWD primer (10 µM), 0.3 µl Eli REV primer (10 µM), 0.75 µl STC1 Eli REV primer (10 µM), and 0.5 µl KOD Hot Start DNA Polymerase (1 U/µl; Novagen, Gibbstown, NJ). Samples were heated to 95°C for 2 minutes in an Applied Biosystems 9700 thermocycler (“hot start”; Applied Biosystems, Foster City, CA), followed by 45 cycles consisting of a 20 second 95°C denaturation, a 10 second 60°C annealing, and a 10 second 70°C extension. PCR product was run on a 1.5% 1x TAE agarose (Sigma, St. Louis, MO) gel and stained with ethidium bromide. PCR product was excised from the agarose gel and purified using a MinElute Gel Extraction Kit (Qiagen). PCR product (10 µl) was cloned into a pSTC1.3 vector (Eurogentec, San Diego, CA), according to the manufacturer’s protocol.
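The final concentration of each component in the 25 µl ELi amplification reaction follows from the dilution relation C1 x V1 = C2 x V2. The sketch below works through that arithmetic for the stock concentrations and volumes listed above, assuming that the volume and concentration units are µl, µM, and U/µl (the unit symbols are partly lost in some copies of this text). The function and dictionary names are illustrative.

```python
TOTAL_VOLUME_UL = 25.0


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul=TOTAL_VOLUME_UL):
    """Dilution arithmetic: C_final = C_stock * V_stock / V_total."""
    return stock_conc * stock_volume_ul / total_volume_ul


# name: (stock concentration, unit, volume added in µl), taken from the Methods text
components = {
    "KOD Hot Start buffer": (10.0, "x", 2.5),
    "dNTPs": (2.0, "mM", 2.5),
    "MgSO4": (25.0, "mM", 1.5),
    "Eli FWD primer": (10.0, "µM", 0.6),
    "Eli REV primer": (10.0, "µM", 0.3),
    "STC1 Eli REV primer": (10.0, "µM", 0.75),
    "KOD Hot Start DNA Polymerase": (1.0, "U/µl", 0.5),
}

for name, (stock, unit, volume) in components.items():
    print(f"{name}: {final_concentration(stock, volume):.3g} {unit}")
```

Under these assumptions the reaction contains 1x buffer, 0.2 mM dNTPs, 1.5 mM MgSO4, and 0.12 to 0.3 µM of each primer.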

Mutagenic PCR. Primers used for mutagenesis were commercially synthesized primers designed to introduce point mutations into the p3xFLAG plasmid (p3xFLAG AgeI FWD1/REV1 and p3xFLAG AgeI FWD3/REV3) and the pMAX plasmid (5’GFPstop3, 3’GFPstop3) (Table B1). PCR (50 µl) consisted of: 50 ng plasmid, 5 µl 10x Pfu Ultra High Fidelity buffer, 1 µl 10 mM dNTPs, 125 ng of each primer, and 2.5 units of Pfu Ultra High Fidelity DNA polymerase (Stratagene, La Jolla, CA). Samples were amplified and mutated in an Applied Biosystems 9700 thermocycler using a program consisting of 20 cycles (p3xFLAG) or 22 cycles (pMAX) of a 30 second 95°C denaturation, a 1 minute 55°C annealing, and a 4.75 minute (p3xFLAG) or a 5 minute (pMAX) 68°C extension. Mutagenized plasmids were digested with 1 µl of DpnI enzyme (Stratagene) for 1 hour at 37°C.

Table B1: Primer Sequences

Primer Name          Primer Sequence
Eli FWD              5’-GACTCATTCTGGGCAGACTTG-3’
Eli REV              5’-GTCCTGGTCCCAAAGATGG-3’
STC1 Eli REV         5’-CCTTCGCCGACTGAGTCCTGGTCCCAAAGATGG-3’
p3xFLAG AgeI FWD1    5’-GAGCTCGTTTAGTGAACCGGTCAGAATTAACCATGGAC-3’
p3xFLAG AgeI REV1    5’-GTCCATGGTTAATTCTGACCGGTTCACTAAACGAGCTC-3’
p3xFLAG AgeI FWD3    5’-CGATGACAAGCTACCGGTCGCGAATTCATCGATAG-3’
p3xFLAG AgeI REV3    5’-CTATCGATGAATTCGCGACCGGTAGCTTGTCATCG-3’
5’GFPstop3           5’-GACCTTCAGCCCCTAGCTGCTGAGCCACGTG-3’
3’GFPstop3           5’-CACGTGGCTCAGCAGCTAGGGGCTGAAGGTC-3’
pMAX fwd2            5’-AGGAGGATCACAGCAACACC-3’
5’GFPseq             5’-CACCAAAATCAACGGGACTT-3’
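Mutagenic primer pairs of this kind are normally exact reverse complements of one another and carry the intended change in the middle of the primer. The sketch below checks both properties for the three mutagenic pairs in Table B1: that each reverse primer is the reverse complement of its forward partner, that the p3xFLAG primers contain the AgeI recognition site (ACCGGT), and that the GFP primer contains the TAG stop codon within an AGCT motif. The dictionary layout and function name are illustrative, and the check confirms internal consistency of the primer sequences only, not the plasmid sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Forward/reverse mutagenic primer pairs copied from Table B1 (5' to 3').
PRIMER_PAIRS = {
    "p3xFLAG AgeI 1": (
        "GAGCTCGTTTAGTGAACCGGTCAGAATTAACCATGGAC",
        "GTCCATGGTTAATTCTGACCGGTTCACTAAACGAGCTC",
    ),
    "p3xFLAG AgeI 3": (
        "CGATGACAAGCTACCGGTCGCGAATTCATCGATAG",
        "CTATCGATGAATTCGCGACCGGTAGCTTGTCATCG",
    ),
    "GFPstop3": (
        "GACCTTCAGCCCCTAGCTGCTGAGCCACGTG",
        "CACGTGGCTCAGCAGCTAGGGGCTGAAGGTC",
    ),
}
AGEI_SITE = "ACCGGT"

for name, (fwd, rev) in PRIMER_PAIRS.items():
    print(name, "reverse complements:", reverse_complement(fwd) == rev)

gfp_fwd = PRIMER_PAIRS["GFPstop3"][0]
print("AgeI site in FWD1:", AGEI_SITE in PRIMER_PAIRS["p3xFLAG AgeI 1"][0])
print("TAG within AGCT context:", "TAGCT" in gfp_fwd)
```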

Restriction Digests

Plasmids were digested using BamHI (Invitrogen, Carlsbad, CA; ELi sequence insert, pMAX vector) or AgeI (New England Biolabs, Ipswich, MA; p3xFLAG insert, pMAX + ELi vector) according to the manufacturers’ protocols. Restriction digests (50 µl) consisted of: 2000 ng plasmid substrate, 30 units of enzyme, and 5 µl 10x enzyme buffer. Restriction digests were incubated at 37°C for 2 hours (BamHI) or overnight (AgeI), then purified using a PCR Purification kit (Qiagen). Purified digested vectors (pMAX or pMAX + ELi) were then treated with Calf-Intestinal Alkaline Phosphatase (CIAP; New England Biolabs) in preparation for ligation reactions. One unit of enzyme was added to 49 µl of purified digested vector plasmid and incubated at 37°C for 1 hour. All digested inserts and vectors were run on a 1.0% 1x TAE agarose (Sigma) gel and stained with ethidium bromide. Products were excised and purified using a MinElute Gel Extraction kit (Qiagen). Purified inserts and vectors were quantified using a NanoDrop spectrophotometer (NanoDrop Technologies Inc., Wilmington, DE).

Ligation Reactions

Digested insert sequences and vector plasmids were ligated using T4 DNA ligase (Invitrogen). Ligation reactions (20 µl) consisted of: 1 unit of T4 DNA ligase, 4 µl 5x ligase buffer, and digested and purified vector plasmid and insert DNA at a 3:1 ratio. Ligation reactions were incubated at room temperature for 1 hour (ELi insert + pMAX vector) or 3 hours (p3xFLAG insert + pMAX + ELi vector).

Transformation of E. coli

Chemically competent E. coli (XL1-Blue cells for PCR-mutated plasmids, Stratagene; CYS21 cells for cloned PCR product plasmids, Eurogentec; or 5-alpha cells for ligated plasmids, New England Biolabs) were transformed with 10 µl of plasmid according to the manufacturers’ protocols. Cells were plated on LB agar + Kanamycin (50 µg/ml) plates (pMAX plasmids) or LB agar + Ampicillin (100 µg/ml) plates (pSTC1.3 and p3xFLAG plasmids) and grown overnight at 37°C. Individual bacterial colonies were selected from the plates and used to inoculate 3 ml LB broth + antibiotic cultures. Cultures were grown overnight with shaking at 37°C. Plasmid DNA was isolated from the cultures using a QIAprep Spin Miniprep kit (Qiagen). Purified plasmid DNA was quantified using a NanoDrop spectrophotometer (NanoDrop Technologies Inc.).

Sequencing

Mutagenic PCRs and ligation reactions were verified by sequencing. Sequencing reactions (17 µl) consisted of: 1000 ng of template plasmid + 1.25 µl 10 µM sequencing primers (Table B1). Sequencing was performed by the Colorado Cancer Center Core Sequencing Facility using an ABI 3730 Sequencer (Applied Biosystems).

Endotoxin-Free Plasmid Isolation

AID functional assay plasmids (Figure B1) were amplified and purified using an EndoFree Plasmid Purification Mega kit (Qiagen) according to the manufacturer’s protocol.

PBMC isolation

See Chapter II, PBMC Isolation.

Nucleofection

Ficoll-isolated fresh human PBMC (1 x 10^6 cells/100 µl Human B Cell Nucleofector solution/condition) were either nucleofected with 5 µg of the unmutated FEpM1-3 functional assay plasmid or 2 µg of the control plasmid pmaxGFP (Lonza), or mock-nucleofected with Endotoxin-free EB buffer (Qiagen; no plasmid), using the U-015 program on a Nucleofector device (Lonza) according to the manufacturer’s protocol. Immediately after nucleofection, 500 µl of fresh RPMI media (Invitrogen) + 10% heat-inactivated fetal bovine serum (HyClone Laboratories, Logan, UT) + 10 µg/ml gentamicin (Invitrogen) were added to the cells. Cells were incubated in 6 well plates for 24 hours at 37°C in 5% CO2.

Flow cytometry

Nucleofected PBMCs were washed in cold FACS buffer (Dulbecco’s PBS (Invitrogen) + 1% heat-inactivated BSA (Fisher Scientific)) and stained with 20 µl CD19-APC-AF750 monoclonal antibody (Invitrogen) for 30 minutes at room temperature, then washed again. Stained PBMCs were permeabilized and fixed using a Caltag Permeabilization/Fixation kit (Caltag, Grand Island, NY) according to the manufacturer’s protocol, then stained with 2 µl anti-FLAG-M2-Cy3 antibody (Sigma) for 30 minutes at room temperature. PBMCs were washed again and fixed in 1% paraformaldehyde (Sigma), then run immediately on an LSRII Flow Cytometer (BD Biosciences, San Jose, CA).

Results

Building the plasmid

The human HS1,2A ELi sequence was PCR-amplified from DNA isolated from frozen human PBMCs. The amplified HS1,2 ELi sequence was then cloned into the pSTC1.3 vector backbone and sequenced. The cloned sequence was cleaved out of the pSTC1.3 vector using BamHI and ligated into the pmaxFP-Green-C vector backbone at the 3’ end of the GFP gene. Ligation was verified by sequencing. Next, the 3xFLAG sequence present in the expression vector p3xFLAG-CMV-7.1 (Sigma) was mutated to introduce two AgeI sites, one at the 5’ end and one at the 3’ end of the 3xFLAG sequence. Mutations were verified by sequencing. The 3xFLAG sequence was cleaved out of the mutated p3xFLAG-CMV-7.1 plasmid using AgeI and ligated into the pmaxFP-Green-C + ELi vector between the CMV promoter and the start of GFP. Ligation was verified by sequencing. Finally, the pmaxFP-Green-C + ELi + 3xFLAG plasmid was mutated to introduce a premature stop codon into the GFP sequence. This stop codon is present at amino acid 57 (GFP is 257 amino acids total) and its inclusion introduces a new RGYW/WRCY AID hotspot motif, AGCT. In this motif, AID targets the G residue for mutation, and the AG nucleotides in this sequence make up the final two nucleotides in the stop codon TAG. The mutation was verified by sequencing. Both the unmutated (FEpM1-3) and mutated (M5-7) pmaxFP-Green-C + ELi + 3xFLAG assay plasmids (Figure B1) were amplified in large cultures of E. coli and isolated using an EndoFree Plasmid Purification kit (Qiagen) to prevent contamination by LPS.
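Because the assay readout depends on a point mutation converting the TAG stop codon into a sense codon, it can be useful to enumerate which single-nucleotide substitutions would do so. The sketch below lists every single-base change to TAG and its translation under the standard genetic code. It is a worked illustration of the reversion logic only; the names are hypothetical, and it makes no claim about which substitutions AID or downstream repair would actually produce in this plasmid.

```python
# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def single_base_outcomes(codon):
    """Map each single-nucleotide variant of a codon to its translation ('*' = stop)."""
    outcomes = {}
    for pos, original in enumerate(codon):
        for new in "ACGT":
            if new != original:
                mutant = codon[:pos] + new + codon[pos + 1:]
                outcomes[mutant] = CODON_TABLE[mutant]
    return outcomes


outcomes = single_base_outcomes("TAG")
for mutant, amino_acid in sorted(outcomes.items()):
    status = "still a stop codon" if amino_acid == "*" else f"reverts to {amino_acid}"
    print(mutant, status)
```

Of the nine possible single-base changes, eight yield a sense codon. The exception is at the G position described above: a G to A change gives TAA, which is still a stop codon, whereas G to C or G to T gives a tyrosine codon.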

Figure B1: AID Functional Assay Plasmids. Both the unmutated FEpM1-3 assay plasmid (A) and the mutated M5-7 assay plasmid (B) were constructed using the pmaxFP-Green-C vector as a backbone. The human HS1,2 ELi enhancer sequence was cloned into the backbone vector to attract AID to nearby sequences. The 3xFLAG sequence was cloned to be expressed along with GFP as a positive indicator of nucleofection. The M5-7 (B) assay plasmid contains a single point mutation in the GFP sequence creating both a premature stop codon and an AGCT “hotspot” motif for AID targeting.

Testing the plasmid

Freshly isolated human PBMCs were nucleofected with either buffer alone (mock-nucleofected; negative control), a control GFP plasmid (pmaxGFP), or the unmutated pmaxFP-Green-C + ELi + 3xFLAG plasmid FEpM1-3. Cells were cultured for 24 hours to allow GFP expression, stained with a monoclonal antibody to detect expression of CD19, a pan-B cell marker, then permeabilized, fixed, and stained for intracellular 3xFLAG expression.

In the mock-nucleofected cells, over 99% of lymphocytes were both GFP- (Figure B2B) and 3xFLAG- (Figure B2D), indicating only low levels of background fluorescence in these two channels. As expected, 18.4% of gated lymphocytes were CD19+ B cells (Figure B2C).

In cells nucleofected with the positive control GFP plasmid, only 13.5% of gated lymphocytes were GFP+ (Figure B3B). Only 10% of CD19+ B cells were also GFP+, indicating that the nucleofection protocol will need to be optimized further. Background levels of 3xFLAG expression were still low, with >98% of lymphocytes 3xFLAG- (Figure B3D).

In cells nucleofected with the FEpM1-3 assay plasmid, both GFP (Figure B4B) and 3xFLAG (Figure B4D) expression were extremely low, with levels of 3xFLAG not above the background levels seen in the negative controls. Only 0.5% of CD19+ B cells were GFP+, and only 0.25% of CD19+ B cells were 3xFLAG+, suggesting that nucleofection with the assay plasmid is much less efficient than with the positive control GFP plasmid. Only 27.8% of GFP+ cells were also 3xFLAG+, indicating that the 3xFLAG peptide is not well detected with the current protocol.

Figure B2: Mock-nucleofection of primary human B cells. A) Cells were gated on lymphocytes based on size (forward scatter, X axis) and granularity (side scatter, Y axis). 150,673 events were collected. 62.8% of cells were lymphocytes. B) Negative control for GFP expression. C) 18.4% of gated lymphocytes were CD19+ B cells. D) Negative control for 3xFLAG peptide expression.

Figure B3: Nucleofection of primary human B cells with control GFP plasmid. A) Cells were gated on lymphocytes based on size (forward scatter, X axis) and granularity (side scatter, Y axis). 58,555 events were collected. 40.2% of cells were lymphocytes. B) GFP fluorescence was detected in 13.5% of lymphocytes. C) 22.5% of gated lymphocytes were CD19+ B cells. D) Negative control for 3xFLAG peptide expression.

Figure B4: Nucleofection of primary human B cells with the FEpM1-3 assay plasmid. A) Cells were gated on lymphocytes based on size (forward scatter, X axis) and granularity (side scatter, Y axis). 88,966 events were collected. 51.1% of cells were lymphocytes. B) GFP fluorescence was detected in 1.9% of lymphocytes. C) 26.8% of gated lymphocytes were CD19+ B cells. D) The 3xFLAG peptide was detected in 0.9% of lymphocytes.
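The gate percentages reported in the captions for Figures B2 to B4 are hierarchical (lymphocytes as a fraction of all events, then CD19+ cells as a fraction of lymphocytes), so approximate cell numbers at each gate can be recovered by multiplying through. The sketch below does this arithmetic for the three samples. The function name is illustrative, and because the published percentages are rounded, the resulting counts are estimates only.

```python
def cells_in_gate(total_events, *gate_percentages):
    """Estimate the number of events remaining after successive nested gates."""
    count = float(total_events)
    for pct in gate_percentages:
        count *= pct / 100.0
    return count


# (events collected, % lymphocytes, % CD19+ of lymphocytes) from the figure captions
samples = {
    "mock (Figure B2)": (150673, 62.8, 18.4),
    "pmaxGFP (Figure B3)": (58555, 40.2, 22.5),
    "FEpM1-3 (Figure B4)": (88966, 51.1, 26.8),
}

for name, (events, pct_lymph, pct_cd19) in samples.items():
    lymphocytes = cells_in_gate(events, pct_lymph)
    b_cells = cells_in_gate(events, pct_lymph, pct_cd19)
    print(f"{name}: ~{lymphocytes:.0f} lymphocytes, ~{b_cells:.0f} CD19+ B cells")
```

With B cell gates of this size, a GFP+ frequency of 0.5% of CD19+ B cells corresponds to only a few dozen events, which is worth keeping in mind when interpreting the low-frequency populations.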

Discussion/Future Directions

The ability to detect AID function in primary human B cells would greatly enhance our understanding of the protein not only during HIV-1 infection, but also in a more biologically relevant setting. Thus, using assays designed for B cell lines as a model, we sought to create an assay for use in primary human B cells.

Highly efficient nucleofection of primary human B cells has been reported by Lonza using their control plasmid, pmaxGFP; however, we were not able to replicate such high efficiencies in our experiments using either the control plasmid or our assay plasmid FEpM1-3. This may be corrected by further optimization of the nucleofection protocol. Several parameters could be adjusted to improve nucleofection efficiency, including the number of cells used in each experiment, the ratio of cells to plasmid, the voltage used for nucleofection, and perhaps the nucleofection buffer, which can vary from cell type to cell type.

Flow cytometric analysis will also need to be optimized in order to detect both levels of nucleofection and GFP expression. The magnitude of GFP fluorescence detected using either the control plasmid or our assay plasmid was much lower than expected based on results presented by Lonza. GFP fluorescence should peak at a much higher level. It is possible that our fixative reagent, paraformaldehyde, quenches the GFP signal prior to signal detection on the flow cytometer (Matt Downey, personal communication). Thus, one way to correct this problem would be to test other fixative agents that may have a less detrimental effect on GFP fluorescence. Detection of the p3xFLAG signal in nucleofected cells was also lower than expected given the number of GFP-fluorescent cells detected using the assay plasmid. We were expecting to see similar levels of GFP+ and 3xFLAG+ cells in our experiment. The quantity of antibody used and the intracellular staining protocol, therefore, may also need to be optimized to more accurately measure 3xFLAG expression.
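As a rough illustration of the gap between the GFP and 3xFLAG signals, the percentages in the Figure B4 caption can be converted to approximate event counts. This is only a sketch: the counts are back-calculated from rounded percentages, not taken from the raw flow cytometry files, and the helper name `approx_count` is hypothetical.

```python
# Values taken from the Figure B4 caption. Counts are approximate because
# the reported percentages are rounded to one decimal place.
TOTAL_EVENTS = 88_966
PCT_LYMPHOCYTES = 51.1  # percent of all collected events
PCT_OF_LYMPHOCYTES = {"GFP+": 1.9, "CD19+": 26.8, "3xFLAG+": 0.9}


def approx_count(parent_count, percent):
    """Approximate number of events in a gate, given the parent gate count."""
    return round(parent_count * percent / 100)


lymphocytes = approx_count(TOTAL_EVENTS, PCT_LYMPHOCYTES)
counts = {
    gate: approx_count(lymphocytes, pct)
    for gate, pct in PCT_OF_LYMPHOCYTES.items()
}

# Ratio of GFP+ to 3xFLAG+ lymphocytes; a value near 1 was expected.
gfp_to_flag = PCT_OF_LYMPHOCYTES["GFP+"] / PCT_OF_LYMPHOCYTES["3xFLAG+"]

print(lymphocytes, counts, round(gfp_to_flag, 2))
```

Under these assumptions, roughly twice as many lymphocytes were GFP+ as were 3xFLAG+, which is the discrepancy that motivates optimizing the intracellular staining protocol.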

Another area of optimization is the time point at which GFP expression is measured. According to the protocol provided by Lonza, GFP expression should be measured 24 hours after nucleofection. Indeed, we were able to detect GFP expression at this time point in our experiment. However, this may not be the ideal time point at which to detect GFP reversion. Upregulation of AID expression, translation of the protein, trafficking to the nucleus, and deamination at the stop codon may take much more than 24 hours to complete. Therefore, once the nucleofection and flow cytometry protocols are optimized, the proper time point at which to measure GFP expression will need to be determined.

Finally, the stimulants used to upregulate AID expression and function may need to be optimized as well. While the anti-CD40 + anti-IgM + IL-4 cocktail is known to both increase AID expression and enhance CSR, it is not known whether these stimuli induce SHM. Other stimuli, including antigens, cytokines, and antibodies, may need to be tested to determine which combination leads to increased SHM frequencies in primary human B cells. Though much work is left to be done, successful optimization of the nucleofection, flow cytometry, and stimulation protocols may eventually allow us to detect AID functional levels in PBMC in both HIV-1-infected patients and controls.

Measuring AID Protein Expression by Flow Cytometry

Introduction

Expression of AID mRNA may not translate directly to AID protein expression. Multiple microRNAs have been characterized which inhibit AID translation post-transcriptionally [A21-A23]. Therefore, the increased levels of AID mRNA expression may not mean that increased levels of AID protein are produced during HIV-1 infection. Decreased levels of AID protein may result in decreased CSR and SHM. In order to determine AID protein levels in PBMCs from both HIV-1-infected patients and control subjects, we tested multiple monoclonal anti-human AID antibodies commercially produced by several companies (see Methods below). Despite their recommended use in flow cytometry by the companies that produce them, we could not identify an antibody that was specific enough for use by this method.

Methods

PBMC Isolation and Stimulation

See Chapter II, PBMC Isolation and Stimulation section for the Ficoll-isolation protocol of PBMCs from fresh blood and stimulation reagents and protocols.

Flow Cytometry

See Chapter II, Flow Cytometry for staining protocol. PBMC were stained with 3-color panels with monoclonal antibodies to AID, B cell, and T cell markers. Several labeled and unlabeled anti-human AID antibodies were tested from Cell Signaling Technologies (mouse and rat antibodies; Danvers, MA), RnD Systems (mouse and rat; Minneapolis, MN), Invitrogen (mouse), Abcam (rat; Cambridge, MA), Novus Biologicals (mouse; Littleton, CO), Abnova (mouse; Walnut, CA), and Santa Cruz Biotechnologies (mouse; Dallas, TX). Secondary antibodies used were purchased from eBiosciences (San Diego, CA). Prior to antibody staining, PBMC were incubated in 50 µl of Human AB serum for 5 minutes at room temperature to block any non-specific binding. Data were analyzed within 24 hours of staining using FlowJo software (Tree Star, Inc.). Cells were gated on singlets, followed by lymphocytes, and divided into two populations, CD19+ B cells and CD3+CD4+ T cells.


Results.

Many anti-human AID monoclonal antibodies were tested on Ficoll-isolated human PBMCs using flow cytometry with varying degrees of success. In general, the antibodies tested either did not stain cells at all (Invitrogen), or displayed high amounts of background and/or non-specific staining. In one representative experiment, we stained unpermeabilized cells (AID is an intracellular protein) using one of our antibodies from RnD Systems as a control (Figure B5). The RnD Systems antibody was the primary antibody, followed by secondary staining using an anti-mouse PE-conjugated antibody (eBiosciences). AID signal was detected in both CD19+ B cells (Figure B5-A) and in CD4+ T cells (Figure B5-B), despite AID being an intracellular protein expressed mainly in B cells [A24, A25]. AID was detected in over 5% of B cells and, surprisingly, in over 9% of T cells (Figure B5-C). The results of the RnD antibody tested in this experiment are representative of the other antibodies where staining was detected.

Figure B5: Staining of Non-Fixed/Permeabilized PBMC. Freshly isolated PBMC were stained without fixation/permeabilization to measure background levels of binding for each antibody. For the RnD antibody, staining was detected in CD19+ B cells (A) and CD3+CD4+ T cells (B). The percentage of positively-stained cells was determined in each subset (C).

While all AID antibodies tested showed high background staining in unpermeabilized PBMC, we decided to test the antibodies on both fresh and stimulated PBMC to determine if we could see an increase in AID expression with B cell stimulation. For this experiment we tested two antibodies, the RnD antibody as above, and an antibody from Cell Signaling Technologies which had been used in previously published AID protein studies in humans [A26]. In fresh PBMC, the Cell Signaling Technologies (CS) antibody showed a low degree of background in the CD4+ T cells (Figures B6-B and -C); however, staining in CD19+ B cells was not much higher (6.0% in B cells vs. 4.0% in T cells; Figures B6-A and -C). AID protein has never been reported in T cells and AID mRNA expression has only been reported transiently during T cell development and in very small numbers [A27]. In addition, staining with only the secondary antibody did not result in significantly high levels of background in either CD19+ B or CD4+ T cells (0.7% staining in both, data not shown). Therefore, we must conclude that the AID expression detected in T cells is non-specific staining.

In stimulated cells, we found the same results. AID protein expression in B cells did increase significantly with stimulation (6% in fresh cells vs. 52.5% in stimulated cells; Figure B6-A vs. B7-A); however, staining in T cells also increased (4% in fresh cells vs. 26.8% in stimulated cells; Figure B6-B vs. B7-B). Though the increase in AID protein levels was expected in B cells with stimulation, no increase should have been detected in T cells.


Figure B6: Staining of Fresh PBMC using the Cell Signaling Technologies anti-Human AID Rat Monoclonal Antibody. Freshly isolated PBMC were stained after fixation/permeabilization using the antibody purchased from Cell Signaling (CS) Technologies. Staining was detected in CD19+ B cells (A) and CD3+CD4+ T cells (B). The percentage of positively-stained cells was determined in each subset (C).

Taken together, the degree of non-specific staining measured in T cells both at baseline and post-stimulation makes this antibody unsuitable for measuring AID protein expression in primary human B cells.

Figure B7: Staining of Stimulated PBMC using the Cell Signaling Technologies anti-Human AID Rat Monoclonal Antibody. Stimulated PBMC were stained after fixation/permeabilization using the antibody purchased from Cell Signaling (CS) Technologies. Staining was detected in CD19+ B cells (A) and CD3+CD4+ T cells (B). The percentage of positively-stained cells was determined in each subset (C).

We next tried staining fresh and stimulated PBMC using the RnD antibody that had shown promising staining results in experiments using human tonsil B cells (data not shown). Using fresh PBMC, we found staining in 29.2% of CD19+ B cells (Figures B8-A and -C), but also in 33.2% of CD4+ T cells (Figures B8-B and -C), suggesting that any specific staining by this antibody would be masked by the high degree of non-specific staining. Staining with only the secondary antibody resulted in only 2.7% PE-stained B cells and 2.1% PE-stained T cells (data not shown). Therefore, the high levels of non-specific staining are due to aberrant binding of the primary antibody.

Figure B8: Staining of Fresh PBMC using the RnD anti-Human AID Mouse Monoclonal Antibody. Freshly isolated PBMC were stained after fixation/permeabilization using the antibody purchased from RnD Systems. Staining was detected in CD19+ B cells (A) and CD3+CD4+ T cells (B). The percentage of positively-stained cells was determined in each subset (C).

This result was even more evident in the stimulated PBMC. Only a slight increase was seen in AID protein expression in CD19+ B cells from baseline to post-stimulation (29.2% vs. 31.2%; Figures B8-A vs. B9-A); however, a large increase was seen in CD4+ T cells, where staining increased from 33.2% at baseline to 90.8% post-stimulation (Figures B8-B vs. B9-B). Consistent with the Cell Signaling Technologies antibody, the extensive degree of non-specific binding associated with the RnD antibody makes it an unsuitable candidate for flow cytometry.
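One way to summarize these results is to compare the percentage of stained CD19+ B cells with the percentage of stained CD4+ T cells for each antibody and condition. The sketch below uses only the percentages reported in the text. The B:T ratio is an illustrative summary introduced here, not a metric used in the original analysis, and the function name `b_to_t_ratio` is hypothetical. Because AID protein is not expected in T cells, a specific antibody should give a ratio far above 1.

```python
# Percent AID-positive cells reported in the Results text.
# Keys: (antibody, condition); values: (% of CD19+ B cells, % of CD4+ T cells).
staining = {
    ("Cell Signaling", "fresh"): (6.0, 4.0),
    ("Cell Signaling", "stimulated"): (52.5, 26.8),
    ("RnD", "fresh"): (29.2, 33.2),
    ("RnD", "stimulated"): (31.2, 90.8),
}


def b_to_t_ratio(b_pct, t_pct):
    """Ratio of stained B cells to stained T cells (illustrative metric only)."""
    return b_pct / t_pct


ratios = {
    key: round(b_to_t_ratio(b_pct, t_pct), 2)
    for key, (b_pct, t_pct) in staining.items()
}

for (antibody, condition), ratio in ratios.items():
    print(f"{antibody:15s} {condition:11s} B:T ratio = {ratio}")
```

Under this summary, neither antibody stains B cells more than about twice as often as T cells in any condition, and the RnD antibody stains T cells more often than B cells.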

Discussion/Future Directions.

Figure B9: Staining of Stimulated PBMC using the RnD anti-Human AID Mouse Monoclonal Antibody. Stimulated PBMC were stained after fixation/permeabilization using the antibody purchased from RnD Systems. Staining was detected in CD19+ B cells (A) and CD3+CD4+ T cells (B). The percentage of positively-stained cells was determined in each subset (C).

Our attempts at identifying an antibody specific for human AID were unsuccessful. All antibodies tested either failed to stain CD19+ B cells and CD4+ T cells or showed high non-specific staining in the T cells, where AID protein is not produced. Similarly, high background levels were also seen when staining unpermeabilized cells, though AID is an intracellular protein. The antibodies that we tested constitute the entire repertoire of anti-human AID antibodies commercially available. Testing antibodies reactive to mouse AID, however, may allow us to find an antibody specific enough for human use in CD19+ B cells (mouse and human AID protein sequences are >90% identical).

Few studies in the literature report AID protein expression using flow cytometry, and our results may explain why this is the case. Indeed, one study used the same primary antibody from Cell Signaling Technologies that we tested and reported increased AID protein expression, concurrent with increased AID mRNA expression, after stimulation with HIV-1 virions [A26]. However, in this study, CD19+ B cells were first sorted from the total PBMC (including CD4+ T cells) and staining in CD4+ T cells was either not performed or not reported. Our results showing high non-specific staining, therefore, call into question the validity of the data reported by Epeldegui et al., as high levels of non-specific staining undermine any conclusions that can be drawn from the CD19+ B cell subset, despite finding expected patterns of expression (i.e., staining levels are lower at baseline and increase with stimulation, similar to induction of AID mRNA expression) [A26]. Careful consideration of controls, therefore, is required for accurate AID protein expression measurement in human PBMC.


APPENDIX C

APPENDIX REFERENCES

A1. Zan, H., Cerutti, A., Dramitinos, P., Schaffer, A., Li, Z., and Casali, P. (1999). "Induction of Ig somatic hypermutation and class switching in a human monoclonal IgM+ IgD+ B cell line in vitro: definition of the requirements and modalities of hypermutation." J Immunol 162(6): 3437-3447.

A2. Bardwell, P. D., Martin, A., and Scharff, M. D. (2002). "Mutation detection of immunoglobulin V-regions by DHPLC." J Immunol Methods 266(1-2): 165-173.

A3. Chui, Y. L., Lozano, F., Jarvis, J. M., Pannell, R., and Milstein, C. (1995). "A reporter gene to analyse the hypermutation of immunoglobulin genes." J Mol Biol 249(3): 555-563.

A4. Ruckerl, F. and J. Bachl (2005). "Activation-induced cytidine deaminase fails to induce a mutator phenotype in the human pre-B cell line Nalm-6." Eur J Immunol 35(1): 290-298.

A5. Yoshikawa, K., Okazaki, I. M., Eto, T., Kinoshita, K., Muramatsu, M., Nagaoka, H., and Honjo, T. (2002). "AID enzyme-induced hypermutation in an actively transcribed gene in fibroblasts." Science 296(5575): 2033-2036.

A6. Stavnezer, J. (2011). "Complex regulation and function of activation-induced cytidine deaminase." Trends Immunol 32(5): 194-201.

A7. Besmer, E., Gourzi, P., and Papavasiliou, F. N. (2004). "The regulation of somatic hypermutation." Curr Opin Immunol 16(2): 241-245.

A8. Teng, G. and F. N. Papavasiliou (2007). "Immunoglobulin somatic hypermutation." Annu Rev Genet 41: 107-120.

A9. Bachl, J. and C. Olsson (1999). "Hypermutation targets a green fluorescent protein-encoding transgene in the presence of immunoglobulin enhancers." Eur J Immunol 29(4): 1383-1389.

A10. Tanaka, A., Shen, H. M., Ratnam, S., Kodgire, P., and Storb, U. (2010). "Attracting AID to targets of somatic hypermutation." J Exp Med 207(2): 405-415.

A11. Tolusso, B., Frezza, D., Mattioli, C., Fedele, A. L., Bosello, S., Faustini, F., Peluso, G., Giambra, V., Pietrapertosa, D., Morelli, A., Gremese, E., De Santis, M., and Ferraccioli, G. F. (2009). "Allele *2 of the HS1,2A enhancer of the Ig regulatory region associates with rheumatoid arthritis." Ann Rheum Dis 68(3): 416-419.

A12. Komori, A., Xu, Z., Wu, X., Zan, H., and Casali, P. (2006). "Biased dA/dT somatic hypermutation as regulated by the heavy chain intronic iEmu enhancer and 3'Ealpha enhancers in human lymphoblastoid B cells." Mol Immunol 43(11): 1817-1826.

A13. Mills, F. C., Harindranath, N., Mitchell, M., and Max, E. E. (1997). "Enhancer complexes located downstream of both human immunoglobulin Calpha genes." J Exp Med 186(6): 845-858.

A14. Mayorov, V. I., Rogozin, I. B., Adkison, L. R., Frahm, C., Kunkel, T. A., and Pavlov, Y. I. (2005). "Expression of human AID in yeast induces mutations in context similar to the context of somatic hypermutation at G-C pairs in immunoglobulin genes." BMC Immunol 6: 10.

A15. Spencer, J. and D. K. Dunn-Walters (2005). "Hypermutation at A-T base pairs: the A nucleotide replacement spectrum is affected by adjacent nucleotides and there is no reverse complementarity of sequences flanking mutated A and T nucleotides." J Immunol 175(8): 5170-5177.

A16. Wu, X., Feng, J., Komori, A., Kim, E. C., Zan, H., and Casali, P. (2003). "Immunoglobulin somatic hypermutation: double-strand DNA breaks, AID and error-prone DNA repair." J Clin Immunol 23(4): 235-246.

A17. Li, Z., Zhao, C., Iglesias-Ussel, M. D., Polonskaya, Z., Zhuang, M., Yang, G., Luo, Z., Edelmann, W., and Scharff, M. D. (2006). "The mismatch repair protein Msh6 influences the in vivo AID targeting to the Ig locus." Immunity 24(4): 393-403.

A18. Cohen, R. M., Kleinstein, S. H., and Louzoun, Y. (2011). "Somatic hypermutation targeting is influenced by location within the immunoglobulin V region." Mol Immunol 48(12-13): 1477-1483.

A19. Chen, Z., Viboolsittiseri, S. S., O'Connor, B. P., and Wang, J. H. (2012). "Target DNA sequence directly regulates the frequency of activation-induced deaminase-dependent mutations." J Immunol 189(8): 3970-3982.


A20. Liu, M., Duke, J. L., Richter, D. J., Vinuesa, C. G., Goodnow, C. C., Kleinstein, S. H., and Schatz, D. G. (2008). "Two levels of protection for the B cell genome during somatic hypermutation." Nature 451(7180): 841-845.

A21. Dorsett, Y., McBride, K. M., Jankovic, M., Gazumyan, A., Thai, T. H., Robbiani, D. F., Di Virgilio, M., Reina San-Martin, B., Heidkamp, G., Schwickert, T. A., Eisenreich, T., Rajewsky, K., and Nussenzweig, M. C. (2008). "MicroRNA-155 suppresses activation-induced cytidine deaminase-mediated Myc-Igh translocation." Immunity 28(5): 630-638.

A22. de Yebenes, V. G., Belver, L., Pisano, D. G., Gonzalez, S., Villasante, A., Croce, C., He, L., and Ramiro, A. R. (2008). "miR-181b negatively regulates activation-induced cytidine deaminase in B cells." J Exp Med 205(10): 2199-2206.

A23. Teng, G., Hakimpour, P., Landgraf, P., Rice, A., Tuschl, T., Casellas, R., and Papavasiliou, F. N. (2008). "MicroRNA-155 is a negative regulator of activation-induced cytidine deaminase." Immunity 28(5): 621-629.

A24. Durandy, A. (2003). "Activation-induced cytidine deaminase: a dual role in class-switch recombination and somatic hypermutation." Eur J Immunol 33(8): 2069-2073.

A25. Muramatsu, M., Sankaranand, V. S., Anant, S., Sugai, M., Kinoshita, K., Davidson, N. O., and Honjo, T. (1999). "Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RNA-editing deaminase family in germinal center B cells." J Biol Chem 274(26): 18470-18476.

A26. Epeldegui, M., Thapa, D. R., De la Cruz, J., Kitchen, S., Zack, J. A., and Martinez-Maza, O. (2010). "CD40 ligand (CD154) incorporated into HIV virions induces activation-induced cytidine deaminase (AID) expression in human B lymphocytes." PLoS One 5(7): e11448.

A27. Qin, H., Suzuki, K., Nakata, M., Chikuma, S., Izumi, N., Huong le, T., Maruya, M., Fagarasan, S., Busslinger, M., Honjo, T., and Nagaoka, H. (2011). "Activation-induced cytidine deaminase expression in CD4+ T cells is associated with a unique IL-10-producing subset that increases with age." PLoS One 6(12): e29141.
