Genetic association and expression analysis of

inflammatory in

by

Aabida Saferali

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE

REQUIREMENTS FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES

(Experimental Medicine)

THE UNIVERSITY OF BRITISH COLUMBIA

(Vancouver)

September 2016

© Aabida Saferali, 2016

ABSTRACT

Cystic fibrosis (CF) is characterized by a progressive decline in lung function due to airway obstruction, infection, and inflammation. CF patients are particularly susceptible to respiratory infection by a variety of pathogens, and the inflammatory response in CF is dysregulated and prolonged. This thesis identifies and characterizes BPI fold containing family A, member 1 (BPIFA1) and BPIFB1 as putative anti-inflammatory molecules in CF, and explores the CF inflammatory response to rhinovirus infection.

BPIFA1 and BPIFB1 are proposed innate immune molecules expressed in the upper airways. We interrogated BPIFA1/BPIFB1 single-nucleotide polymorphisms in data from the North American genome-wide association study (GWAS) for lung disease severity in CF and discovered that the G allele of rs1078761 was associated with reduced lung function in CF patients. Microarray and qPCR analysis implicated rs1078761 G as being associated with reduced BPIFA1 and BPIFB1 gene expression, suggesting that decreased levels of these genes are detrimental in CF.

Functional assays to characterize the role of BPIFA1 and BPIFB1 in CF indicated that these molecules do not have an anti-bacterial role against P. aeruginosa, but do have an immunomodulatory function in CF airway epithelial cells. To further investigate the mechanism of action of BPIFA1 and BPIFB1 during bacterial infection, gene expression was profiled using RNA-Seq in airway epithelial cells stimulated with P. aeruginosa and treated with recombinant BPIFA1 and BPIFB1.

Viral infections are now recognized to play an important role in the short- and long-term health of CF patients. Rhinovirus is emerging as a leading viral pathogen, although little is known about the inflammatory response triggered by rhinovirus in the CF lung. To investigate whether CF patients have a dysregulated response to rhinovirus infection, primary airway epithelial cells from CF and healthy control children were infected with rhinovirus and gene expression profiles were assessed by RNA-Seq. Although rhinovirus stimulation resulted in significantly altered gene expression, the response to infection did not differ between CF patients and healthy controls. However, CF cells had significantly higher rhinovirus levels than controls, indicating that CF patients may have a deficient antiviral response allowing for increased rhinovirus replication.


PREFACE

All work was conducted at the Centre for Heart Lung Innovation and the Child and Family Research Institute/BC Children’s Research Institute. Ethics approval was obtained for collection of saliva samples from CF patients and healthy controls from UBC Providence Health Care Research Institute Ethics Board (certificates H12-03293 and H15-02759). Healthy control samples were obtained through recruitment of volunteers by A. Saferali at the Centre for Heart Lung Innovation with informed written consent. CF patient samples were collected with informed written consent from the Pacific Lung Health Centre CF Clinic directed by Dr. B. Quon.

A. Saferali was the lead investigator for all the work presented in this thesis. All aspects of study design, analysis, and execution were carried out by A. Saferali together with Drs. A.J. Sandford and S.E. Turvey. Primary airway epithelial cells from CF patients described in Chapter 4 were obtained from Drs. A. Kicic and S. Stick in Perth, Australia. RNA-Seq data shown in Chapter 4 were generated in the lab of Dr. R.E. Hancock. F. Shaheen helped with the immunohistochemistry shown in section 3.4.1. Dr. A. Tang performed the experiments described in section 3.4.5 and wrote the materials and methods for that section. In addition, Dr. Tang contributed 10% of the western blots shown in section 3.4.2.

The study presented in Chapter 2 of this work was published in the American Journal of Respiratory Cell and Molecular Biology in November 2015. The data are reprinted with permission of the American Thoracic Society1.

1 Copyright © 2016 American Thoracic Society. The American Journal of Respiratory Cell and Molecular Biology is an official journal of the American Thoracic Society.

Saferali, A., et al., Polymorphisms associated with expression of BPIFA1/BPIFB1 and lung disease severity in cystic fibrosis. Am J Respir Cell Mol Biol, 2015. 53(5): p. 607-14.

Certain aspects of Chapter 1 were included in a review to be published in Wiley eLS.

Saferali, A., et al., (September 2016) Cystic Fibrosis: Modifier Genes. In: eLS. John Wiley & Sons, Ltd: Chichester. DOI: 10.1002/9780470015902.a0020233.


TABLE OF CONTENTS

ABSTRACT
PREFACE
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
ACKNOWLEDGEMENTS
CHAPTER 1: A BACKGROUND ON CYSTIC FIBROSIS MODIFIER GENES AND HOST RESPONSE
1.1 Molecular basis of cystic fibrosis
1.2 Genome-wide association studies in CF
1.2.1 Consortium approach to genome-wide association studies
1.2.2 Genome-wide studies of lung function in cystic fibrosis
1.2.3 Genetic association studies of other CF related phenotypes
1.2.3.1 Body mass index and nutritional status
1.2.3.2 CF-related diabetes
1.2.3.3 Meconium ileus
1.2.4 CF modifier genes, inflammation and immunity in CF
1.3 Infection in cystic fibrosis
1.3.1 Bacterial infection
1.3.2 Viral infection in cystic fibrosis
1.3.3 Rhinovirus in CF
1.3.4 Fungal pathogens
1.4 Inflammation in CF
1.4.1 Dysregulated immune response in CF
1.4.2 Abnormal airway surface liquid and mucociliary clearance
1.4.3 Secondary inflammatory defects in CF airways
1.4.4 Anti-inflammatory therapies
1.5 BPIFA1 and BPIFB1: Innate immune molecules in CF
1.5.1 The BPI fold containing family of
1.5.2 Antimicrobial properties of BPIFA1
1.5.3 Immunomodulatory function of BPIFA1
1.5.4 BPIFA1 in ion transport and ASL regulation
1.5.5 Function of BPIFB1
1.6 Summary of thesis objectives
CHAPTER 2: PLUNCs AS MODIFIER GENES IN CYSTIC FIBROSIS
2.1 Rationale
2.2 Background
2.3 Materials and methods
2.3.1 Selection of candidate SNPs from CF modifier GWAS data
2.3.2 Lung expression quantitative trait loci (eQTL) study
2.3.3 Bioinformatic analysis of SNPs in the 20q11 region
2.3.4 Collection of saliva samples from healthy volunteers
2.3.5 DNA extraction and genotyping of saliva samples
2.3.6 Quantitative Real-Time PCR (qPCR) of BPIFA1 and BPIFB1 in the lung
2.3.7 Western blot analysis of BPIFA1 and BPIFB1 levels in the lung and saliva
2.3.8 Statistical analysis of BPIFA1 and BPIFB1 expression according to genotype
2.4 Results
2.4.1 Assessment of BPIFA1/BPIFB1 polymorphisms
2.4.2 Replication of rs1078761 association with CF lung disease severity
2.4.3 Identification of SNPs associated with BPIFA1 and BPIFB1 gene expression levels
2.4.4 Potential functional effects of BPIFA1/BPIFB1 SNPs
2.4.5 SNPs associated with BPIFA1 and BPIFB1 gene expression levels in CF samples
2.4.6 qPCR gene expression analysis of BPIFA1 and BPIFB1 in lung tissue
2.4.7 Measurement of BPIFA1 and BPIFB1 protein levels in lung tissue
2.4.8 Measurement of BPIFA1 and BPIFB1 protein levels in the saliva
2.4.9 Relationship between BPIFA1 and BPIFB1 levels and lung function
2.5 Discussion
CHAPTER 3: BPIFA1 AS AN ANTI-INFLAMMATORY TARGET IN CYSTIC FIBROSIS
3.1 Rationale
3.2 Background
3.3 Methods
3.3.1 Immunohistochemistry
3.3.2 Recruitment of CF patients
3.3.3 Western blot analysis of BPIFA1 and BPIFB1 protein levels in saliva
3.3.4 DNA extraction and genotyping of saliva samples
3.3.5 Bacterial growth assays
3.3.6 Cell culture
3.3.7 Cell stimulation and cytokine quantification
3.3.8 BPIFA1 overexpression assays
3.3.9 RNA extraction and RNA sequencing
3.4 Results
3.4.1 BPIFA1 expression profile
3.4.2 Relationship between rs1078761 genotype and levels of BPIFA1 and BPIFB1 in CF saliva
3.4.3 Investigation of bacterial growth inhibition by BPIFA1
3.4.4 Effect of BPIFA1 overexpression on bacterial growth and inflammatory cytokine production
3.4.5 Effect of BPIFA1 and BPIFB1 on production of inflammatory cytokines in airway epithelial cells
3.4.6 Characterization of BPIFA1 and BPIFB1 treatment on gene expression in CF airway epithelial cells
3.5 Discussion
CHAPTER 4: INFLAMMATORY RESPONSE TO RHINOVIRUS INFECTION IN CYSTIC FIBROSIS
4.1 Rationale
4.2 Background
4.3 Materials and methods
4.3.1 Recruitment of CF patients and healthy controls
4.3.2 Airway brushing and primary cell culture
4.3.3 Rhinovirus stimulation
4.3.4 RNA-Sequencing and analysis
4.4 Results
4.4.1 Establishment of time course for expression of viral response genes
4.4.2 Characterization of viral response gene expression in airway epithelial cells
4.4.3 Quantification of HRV1b in airway epithelial cells
4.4.4 Investigation of differential response to viral infection in CF at the global level
4.4.5 Investigation of pathways that have previously been shown to be dysregulated in the CF response to rhinovirus
4.5 Discussion
CHAPTER 5: GENERAL CONCLUSIONS AND FUTURE DIRECTIONS
5.1 Main contributions to the field of CF and inflammation research
5.2 Future Directions
5.2.1 What is the role of the rs1078761 polymorphism in BPIFA1 and BPIFB1 expression and regulation?
5.2.2 What is the mechanism by which BPIFA1 alters the airway epithelial response to infection?
5.2.3 What is the mechanism by which rhinovirus levels are elevated in CF?
REFERENCES
APPENDIX


LIST OF TABLES

Table 1.1: CFTR mutation classes
Table 1.2: Summary of CF modifier GWAS/linkage study findings
Table 2.1: Characteristics of included GWAS individuals
Table 2.2: CF and non-CF patients from the lung eQTL study
Table 2.3: Healthy saliva donors
Table 2.4: Lung eQTL study patients selected for qPCR validation
Table 2.5: Association between genotype and mRNA levels from the lung eQTL study
Table 2.6: Association between genotype and mRNA levels in CF and non-CF patients from the lung eQTL study
Table 2.7: Association between genotype and mRNA levels in the lung measured by qPCR
Table 2.8: Association between genotype and protein levels in saliva
Table 3.1: Overrepresented pathways by Sigora analysis in genes that are differentially expressed in response to BPIFA1 or BPIFB1 treatment
Table 4.1: Characteristics of CF patients and healthy controls
Table 4.2: Overrepresented pathways associated with HRV1b infection at 2, 4, 8 and 24 hours after exposure
Table 4.3: Sigora pathway overrepresentation analysis of genes that are differentially expressed at 24 hours after HRV1b infection in airway epithelial cells
Table 4.4: Top differentially expressed genes stratified into CF and healthy control groups
Table 4.5: Pro-inflammatory cytokines, interferons and apoptosis genes from RNA-Seq data


LIST OF FIGURES

Figure 1.1: Apical membrane constituents that are associated with CF related traits
Figure 2.1: Locus zoom plot of the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1
Figure 2.2: Locus zoom plot of the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1 in GWAS 1+2
Figure 2.3: Interaction of genotype and CF status on BPIFA1 and BPIFB1 mRNA levels in lung tissue
Figure 2.4: BPIFA1 and BPIFB1 mRNA levels in the lung
Figure 2.5: BPIFB1 protein levels in lung tissue
Figure 2.6: BPIFA1 and BPIFB1 protein expression levels in saliva
Figure 3.1: BPIFA1 expression in airway sections from healthy individuals
Figure 3.2: BPIFA1 expression in small airway sections from a healthy individual and a CF patient
Figure 3.3: BPIFA1 and BPIFB1 levels in saliva from stable CF patients by rs1078761 genotype
Figure 3.4: P. aeruginosa growth curves with the addition of BPIFA1 or BPIFB1
Figure 3.5: Quantification of colony forming units in P. aeruginosa treated with recombinant BPIFA1 and BPIFB1 protein
Figure 3.6: Detection of secreted BPIFA1 in transfected IB3-1 cells
Figure 3.7: Quantification of colony forming units and inflammatory cytokine production in IB3-1 cells transfected with BPIFA1 plasmid
Figure 3.8: Preliminary experiments to establish conditions for stimulation of airway epithelial cells
Figure 3.9: Quantification of inflammatory cytokine production by airway epithelial cells pretreated with recombinant BPIFA1 or BPIFB1 prior to bacterial stimulation
Figure 3.10: PCA plots of global gene expression in CF airway epithelial cell lines
Figure 3.11: Differentially expressed genes in IB3-1 cells stimulated with heat-killed P. aeruginosa
Figure 3.12: Protein:protein network of genes that were differentially expressed as a result of BPIFA1/BPIFB1 treatment in IB3-1 cells
Figure 3.13: Protein:protein network of genes that were differentially expressed as a result of BPIFA1/BPIFB1 treatment in CFBE41o- cells
Figure 3.14: Heatmap showing all PAO1 responsive genes in IB3-1 cells
Figure 4.1: PCA clustering of airway epithelial gene expression
Figure 4.2: Number of differentially expressed genes post RV1b infection
Figure 4.3: PCA clustering and MA plots of CF and healthy control airway epithelial gene expression at 24 hours post HRV1b infection
Figure 4.4: Heatmap of top 200 differentially expressed genes after rhinovirus infection
Figure 4.5: Network analysis of differentially expressed genes at 24 hours after HRV1b infection
Figure 4.6: Rhinovirus levels in CF and healthy control airway epithelial cells
Figure 4.7: Factor map and hierarchical clustering of gene expression data from CF and healthy control airway epithelial cells infected with HRV1b
Figure 4.8: Venn diagram and chord diagram of genes that were differentially expressed in both CF and healthy control airway epithelial cells
Figure 4.9: Heatmaps illustrating genes that respond differently to HRV1b in CF compared to controls


LIST OF ABBREVIATIONS

ABPA – Allergic bronchopulmonary aspergillosis

ALI – Air liquid interface

ANOVA – Analysis of variance

APAF1 – Apoptotic peptidase activating factor 1

APIP – APAF1-interacting protein

ARRDC3 – Arrestin domain-containing 3

ASL – Airway surface liquid

BEBM – Bronchial epithelial basal medium

BMI – Body mass index

BPI – Bactericidal permeability increasing protein

BPIF – BPI fold containing

BPIFA1 – BPI fold containing family A, member 1

BPIFB1 – BPI fold containing family B, member 1

CCL5 – C-C motif chemokine ligand 5

CETP – Cholesteryl ester-transfer protein

CF – Cystic fibrosis

CFRD – CF-related diabetes

CFTR – Cystic fibrosis transmembrane conductance regulator

CFU – Colony forming units

CGS – Canadian Consortium for Genetic Studies

COPD – Chronic obstructive pulmonary disease

CXCL8 – C-X-C motif chemokine ligand 8

dsRNA – Double-stranded RNA

EGR1 – Early growth response 1

EHF – ETS homologous factor

ENaC – Epithelial sodium channel

ENCODE – Encyclopedia of DNA Elements

eQTL – Expression quantitative trait locus

FEV1 – Forced expiratory volume in one second

FrGMC – French CF Gene Modifier Consortium

G-CSF – Granulocyte-colony stimulating factor

GM-CSF – Granulocyte-macrophage colony-stimulating factor

GMS – Gene Modifier Study

GWAS – Genome-wide association study

HRV – Human rhinovirus

HRV1b – Human rhinovirus 1b

ICAM-1 – Intercellular adhesion molecule 1

IFNs – Interferons

IFRD1 – Interferon related developmental regulator 1

IL6 – Interleukin 6

IRF1 – Interferon regulatory factor 1

IRF9 – Interferon regulatory factor 9

ISG15 – ISG15 ubiquitin-like modifier

ISGs – Interferon stimulated genes

KNoRMA – Kulich normal residual mortality-adjusted lung disease phenotype

LBP – LPS-binding protein

LD – Linkage disequilibrium

LDLR – Low-density lipoprotein receptor

LPLUNC – Long palate, lung, nasal epithelium clone

LPS – Lipopolysaccharide

MBL2 – Mannose-binding lectin 2

MDA-5 – Melanoma differentiation associated gene 5

MOI – Multiplicity of infection

NHE3 – Cation proton antiporter 3

NTM – Nontuberculous mycobacteria

OAS1 – 2’5’-oligoadenylate synthetase 1

PLTP – Phospholipid transfer protein

PRR – Pattern recognition receptors

qPCR – Quantitative real-time polymerase chain reaction

RIG-I – Retinoic acid-inducible gene I

RSV – Respiratory syncytial virus

Sigora – Signature overrepresentation analysis

SNPs – Single nucleotide polymorphisms

SPLUNC – Short palate, lung, nasal epithelium clone

ssRNA – Single-stranded RNA

STAT1 – Signal transducer and activator of transcription 1

TGFB1 – Transforming growth factor beta 1

TLR3 – Toll-like receptor 3

TLRs – Toll-like receptors

TNF – Tumor necrosis factor

TSS – Twin and Sibling Study

TULIP – Tubular lipid binding domain


ACKNOWLEDGEMENTS

I am truly lucky to have had the opportunity to work with two outstanding supervisors. I would like to thank Andy for giving me the chance to start my PhD in his lab. I am incredibly grateful for his supervision, mentorship, and advice over the years. I would like to thank him for taking the time to listen to my problems, and for helping me with every aspect of my research and career development. He truly went above and beyond for me and I know that I could not have made it through my degree without his support. I would also like to thank Stuart for allowing me to join his lab unexpectedly, and for taking a geneticist into the fold. I am sincerely appreciative of the supervision and career advice he provided me, and I am very thankful for the experience and knowledge I gained while in his lab. I also greatly value the time I spent shadowing him in the clinic, and it was tremendously inspiring to see the perspective of a clinician-scientist.

I would like to thank everyone in the Turvey and Sandford labs for their help with my project and everything they taught me. I will greatly miss our steeped tea breaks and the interesting conversations we’ve had. Finally, I would like to thank my family for their support over the many years that I’ve been a student.


CHAPTER 1: A BACKGROUND ON CYSTIC FIBROSIS MODIFIER GENES AND HOST RESPONSE

Cystic fibrosis (CF) is a life-shortening autosomal recessive disorder that affects approximately 70,000 individuals worldwide. It is characterized by the production of abnormally viscous mucus, which obstructs bronchial airways and pancreatic ducts, leading to infection, inflammation and tissue damage. Respiratory failure is the leading cause of morbidity and mortality in CF. CF is also characterized by impaired function of other organ systems including the sweat glands, liver, male reproductive tract, and intestine. In addition, CF patients often suffer from malnutrition and poor growth due to loss of pancreatic function. Individuals with CF are treated with antibiotics, lung clearance strategies and replacement of pancreatic enzymes, among other therapies guided by a multidisciplinary team. This has resulted in great improvement in survival, and Canadian children born with CF today are predicted to live to a median age of 51.8 years [1].

1.1 Molecular basis of cystic fibrosis

In 1946, Andersen and Hodges proposed that CF is caused by a recessive mutation [2]. However, it was not until 1989 that the cystic fibrosis transmembrane conductance regulator (CFTR) gene was determined to be responsible for CF [3-5]. CFTR functions mainly as a chloride and bicarbonate transporter, but also plays important roles in the regulation of other ion channels such as the epithelial sodium channel (ENaC) [6]. The first identified mutation in CFTR was the p.Phe508del deletion (also known as ΔF508), which is also the most common CF causing mutation. Since then, over 2000 mutations have been identified in CFTR. Of these mutations, 40% result in amino acid substitution, 36% affect RNA processing, ~3% are CFTR rearrangements, and 1% alter promoter regions. In addition, 14% of the CFTR mutations are sequence variants that are thought to be neutral (and not CF-causing), and 6% have unknown effects [7]. Mutations in the CFTR gene can result in disease by altering the amount and/or function of CFTR protein that localizes to the cell membrane. These mutations have been grouped into five and sometimes six classes based on the level of effect on CFTR (Table 1.1) [8, 9]. However, it is important to be aware that one variant can affect multiple processes in the cell [7]. For example, the p.Phe508del mutation alters CFTR mRNA translation, folding of the protein in the Golgi and gating of the channel at the membrane [7, 10-12]. Therefore it could belong to at least three mutation classes.

While the exact mechanism by which mutation in CFTR results in CF disease pathophysiology is currently unknown, several hypotheses have been proposed. One consequence of CFTR mutation may be impaired chloride transport and increased sodium absorption in the airway epithelium, resulting in reduced airway surface liquid (ASL) volume and dehydrated airways [13, 14]. However, this is controversial and it is not clear whether loss of CFTR function directly results in sodium hyper-absorption [13]. The production of abnormal ASL is thought to impair mucociliary transport and also results in mucus that fails to detach from submucosal gland ducts [14]. Furthermore, defective bicarbonate transport due to CFTR mutation causes the production of acidic ASL which may impair bacterial killing in the airways [14]. In addition to the described consequences in the airway epithelium, CFTR mutation has been shown to impair the bacterial killing capacity of monocytes and macrophages [15-17].

The correlation between CFTR mutation and disease severity has been shown to be imprecise [18]; however, patient phenotype normally reflects whether there is complete loss of CFTR function caused by the presence of two class I or II mutations as opposed to having residual CFTR function due to the presence of a less severe mutation. For example, patients with one copy of the R117H mutation retain some chloride permeability in their cells, resulting in milder disease than in patients who are homozygous for the p.Phe508del mutation [19, 20]. Exocrine pancreatic function is often used as a surrogate for mutation severity as more severe mutations are associated with loss of pancreatic function whereas patients with mild mutations are pancreatic sufficient [21].

Recent progress in correcting specific mutations using small molecule therapy has led to the development of two new breakthrough drugs for CF. The first small molecule that was shown to be effective is Ivacaftor [22-24]. Clinical trials indicated that Ivacaftor is highly effective in patients with class III mutations, particularly in patients with the G551D mutation or related gating mutations [25-30]. Ivacaftor has been shown to result in improved lung function, reduced sweat chloride concentration, improved quality of life, and less frequent pulmonary exacerbations [26]. For patients who are homozygous for the p.Phe508del mutation, a combination of Ivacaftor together with the corrector Lumacaftor has been tested, and has been shown to significantly improve weight, and modestly improve quality of life and lung function [31-33]. These results indicate that a strategic approach to targeting specific mutations in CFTR can be of significant benefit to CF patients.


Table 1.1: CFTR mutation classes

Mutation Class | Example Mutations in this Class | Phenotype
I | R1162X, G542X | Complete lack of CFTR production due to a premature stop codon resulting in a truncated protein.
II | p.Phe508del | Trafficking mutations. Reduced amounts of CFTR reach the cell membrane as a result of abnormal folding and transport. CFTR is retained in the endoplasmic reticulum and degraded.
III | G551D | Gating mutations. CFTR protein reaches the apical membrane, but channels do not open properly resulting in reduced Cl- transport.
IV | R117H, R347P | Full length CFTR reaches the apical membrane and is partially functional but has reduced anion permeability due to channel narrowing.
V | 3120+1G>A, 2789+5G>A | Decreased CFTR expression resulting in reduced amounts of CFTR reaching the cell membrane. CFTR is fully functional.
VI | Q1412X | Shorter life of CFTR protein at the apical membrane.

Classes of CFTR mutation with example mutations of each class and description of the effect on CFTR.

1.2 Genome-wide association studies in CF

Individuals with CF have a large degree of variability in disease severity and outcome. Allelic heterogeneity within the CFTR gene plays a role in determining disease severity, and the presence of at least one mild CFTR mutation is associated with better lung function and nutritional status [34-36]. While CFTR genotype has been found to correlate well with pancreatic disease severity and somewhat with sweat chloride levels [36-41], there is a large variation in pulmonary phenotype even for patients with the same CFTR genotype [34, 42]. Some of this clinical variability can be attributed to environmental factors such as tobacco smoke exposure, bacterial infection, socioeconomic status, and nutrition [43-48]. Twin studies have demonstrated that monozygotic CF twin pairs have greater concordance for lung function than dizygotic twin and sibling pairs [49, 50]. The heritability for lung disease severity in CF is ~50%, demonstrating that gene modifiers have a strong contribution to this phenotype. Furthermore, there is evidence that four other traits relevant to survival in CF have a genetic contribution separate from CFTR mutations: body mass index (BMI) / nutritional status, CF-related diabetes and intestinal obstruction at birth (meconium ileus).

The two fundamental strategies that can be used to identify disease-modifying genes are linkage analysis and genetic association studies. Linkage studies utilize information from families to determine whether genotype and disease severity are co-inherited in siblings and/or other family members. In contrast, association studies determine whether a genotype is associated with disease severity in unrelated patients. The majority of genetic modifier studies in CF have been association studies, mainly due to insufficient numbers of families with multiple affected siblings. There are two main approaches that are utilized in genetic association studies: an “a priori” or candidate gene approach, and an array-based genotyping approach to test for associations throughout the genome.
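As a minimal illustration of the basic idea behind an association test in unrelated patients, the sketch below compares allele counts between two severity groups using a Pearson chi-square statistic on a 2x2 table. The function name and the allele counts are hypothetical and chosen only for illustration; this is not the statistical method used in any of the studies cited here, which relied on more elaborate models.

```python
def allelic_chi_square(group1_counts, group2_counts):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table.

    Each argument is a pair (count of allele 1, count of allele 2)
    observed in one group of unrelated patients.
    """
    table = [list(group1_counts), list(group2_counts)]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were unrelated to group
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2


# Hypothetical allele counts (two alleles per patient, 160 patients per group)
severe_group = (120, 200)
mild_group = (90, 230)

chi2 = allelic_chi_square(severe_group, mild_group)
# A statistic above 3.84 corresponds to p < 0.05 at 1 degree of freedom,
# before any correction for multiple testing.
print(f"chi-square = {chi2:.3f}")
```

In a genome-wide setting the same comparison is repeated for every genotyped SNP, which is why a far more stringent significance threshold than 0.05 is required.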

1.2.1 Consortium approach to genome-wide association studies

GWAS for a disease as rare as CF typically require sample sizes that can only be attained by meta-analysis of several independent cohorts. Several patient groups have been utilized in meta-analyses of CF-related traits: the Gene Modifier Study (GMS) from the University of North Carolina, which consisted of unrelated pancreatic insufficient individuals who were p.Phe508del homozygotes selected from the extremes of lung function [51]; the Twin and Sibling Study (TSS) from Johns Hopkins University, which included families in which two or more children had CF [52]; the Canadian Consortium for Genetic Studies (CGS), a population-based sample of unrelated Canadian CF individuals with pancreatic insufficiency [53]; and the French CF Gene Modifier Consortium (FrGMC), a population-based sample of French CF patients with pancreatic insufficiency [54]. The GMS, CGS and FrGMC are suitable for use in genome-wide association studies, while the TSS can be used in both linkage and association analysis due to its family-based design.

1.2.2 Genome-wide studies of lung function in cystic fibrosis

Lung disease is responsible for the majority of morbidity and mortality in CF and as a result has been the most studied phenotype in CF modifier studies. The first GWAS for CF modifier genes was a case-control design using mild and severe patients from the GMS cohort [52]. The phenotype was based on lung function assessed by the forced expiratory volume in one second (FEV1) and included 160 mild patients from the highest quartile of FEV1 and 160 severe patients from the lowest quartile. Replication of the most significant results was sought in the TSS cohort using a family-based test of association. Although the study was small and tested a relatively limited number of polymorphisms (n=100,000), a significant association was found with single nucleotide polymorphisms (SNPs) in and around the interferon related developmental regulator 1 (IFRD1) gene. IFRD1 is a protein expressed in neutrophils and the authors demonstrated that neutrophils without IFRD1 were deficient in effector functions such as the oxidative burst and cytokine production [52, 55]. IFRD1 polymorphisms have also been associated with nasal polyposis in CF patients [56].
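The extreme-phenotype sampling described above, in which mild and severe groups are drawn from the highest and lowest quartiles of FEV1, can be sketched as follows. The patient identifiers, FEV1 values and the function name are hypothetical, and the quartile cut-offs use the default method of Python's statistics.quantiles; the original study may have defined its quartiles differently.

```python
import statistics


def select_extremes(fev1_by_patient):
    """Return (mild_ids, severe_ids) from the top and bottom FEV1 quartiles.

    fev1_by_patient maps a patient identifier to an FEV1 value.
    """
    values = list(fev1_by_patient.values())
    # quantiles(n=4) returns the three cut points between the four quartiles
    q1, _median, q3 = statistics.quantiles(values, n=4)
    mild = sorted(p for p, v in fev1_by_patient.items() if v >= q3)
    severe = sorted(p for p, v in fev1_by_patient.items() if v <= q1)
    return mild, severe


# Hypothetical cohort of 12 patients with FEV1 values from 35 to 90
cohort = {f"P{i:02d}": 30 + 5 * i for i in range(1, 13)}

mild, severe = select_extremes(cohort)
print("mild:", mild)
print("severe:", severe)
```

Sampling only the extremes increases the contrast between groups for a fixed genotyping budget, at the cost of discarding patients with intermediate lung function.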

CF lung disease severity is most often clinically assessed by measurement of FEV1. However, this metric has limitations when applied to disease severity in meta-analyses of multiple patient groups. Specifically, comparison of different groups is confounded by different ages, study designs and changing patterns of mortality in CF. Therefore, the North American Cystic Fibrosis Gene Modifier Consortium developed a novel phenotype to characterize lung disease severity using age-specific CF percentile values based on multiple measurements of FEV1 over 3 years normalized for mortality attrition [57]. This statistic is known as KNoRMA (Kulich Normal Residual Mortality-Adjusted lung disease phenotype) and has been shown to be heritable (h=0.51) as well as comparable across studies [57].

The North American Cystic Fibrosis Gene Modifier Consortium used this phenotype in two large GWAS for lung disease severity. The first (GWAS1) included 3,467 CF patients across three study designs, the CGS, TSS and GMS, and a second GWAS (GWAS2) was performed combining an additional 2,921 patients from North America and France. The two studies were also analyzed as a meta-analysis across the four cohorts (CGS, TSS, GMS and FrGMC) allowing for an impressively large sample size of 6,365 patients with greater power to detect genome-wide associations [54]. In addition, a linkage study was performed in 486 sibling pairs from the TSS.

The most significant association in both GWAS studies was a locus at 11p13 located between the EHF and APIP genes [58], although it is not yet clear which of the two genes is causative for the association. Interestingly, this locus has also been associated with lung function in asthmatics [59]. EHF (ETS homologous factor) encodes an epithelial protein that belongs to the ETS subfamily, which regulates differentiation of epithelial cells under conditions of stress and inflammation [60]. EHF has been found to modify the capability of epithelial cells to fold and process mutant CFTR, supporting its potential role as a CF modifier gene [61]. APIP (APAF1 interacting protein) is an inhibitor of cytochrome c-dependent and

APAF1 (apoptotic peptidase activating factor 1)-mediated cell death and prevents hypoxic injury. Elevated expression of APIP, resulting in a decrease in apoptosis, may be detrimental to

lung function in CF due to delayed clearance of neutrophils [62, 63]. Therefore, both EHF and

APIP have strong biological relevance for CF, and either or both genes could be responsible for the association.

In addition to the association at 11p13, associations were identified with SNPs in

AGTR2/SLC6A14 on Xq23, HLA Class II on chromosome 6p21, AHRR/SLC9A3 on chromosome 5p13, and MUC4/MUC20 genes on chromosome 3q29. Genes present in all four loci have strong biological relevance for lung disease in CF. The MUC4 and MUC20 genes code for mucins that can be found both tethered to ciliated airway epithelial cells where they contribute to the periciliary layer, and in airway mucus where they likely contribute to host defense [64-66]. SLC9A3 encodes the cation proton antiporter 3 (NHE3) which plays a role in regulation of pH and transport of ions in the epithelium, and has been implicated in intestinal obstruction in both a mouse model of CF and in a hypothesis-driven GWAS of meconium ileus in CF patients [67-70]. Polymorphisms in SLC9A3 were also associated with age of first

Pseudomonas aeruginosa infection in CF patients in a candidate gene study [71]. The HLA

Class II locus has been associated with respiratory traits including asthma [72], lung disease in

CF [73], lung function in healthy populations [74] and allergic bronchopulmonary aspergillosis

[75]. This suggests that polymorphisms affecting antigen processing may play a role in determining susceptibility to lung disease progression and bacterial infection which is supported by evidence that HLA class II polymorphisms are associated with chronic P. aeruginosa infection [73]. AGTR2 encodes the angiotensin type II receptor which has been shown to mediate signaling in pulmonary fibrosis, and play an anti-inflammatory role in the lung [76, 77].

SLC6A14 is an amino acid transporter which has previously been associated with meconium ileus in CF patients [69]. Interestingly, SLC6A14 polymorphisms have been associated with

male infertility [78, 79], a phenotype observed in CF patients. In addition to the GWAS, linkage analysis of the TSS cohort identified a locus on chromosome 20q13.2 that includes five genes that are expressed in either fetal or adult lung or bronchial epithelial cells (Table 1.2). Further studies are currently in progress to investigate the genes in the linkage peak [80].

Table 1.2: Summary of CF modifier GWAS/linkage study findings

| Phenotype | Study | Design | Sample Size | Statistically Significant Associations | Associations that are Suggestive of Significance |
|---|---|---|---|---|---|
| Lung Disease Severity | Wright, Strug et al. 2011 [58] | Association | 3467 patients from GMS + CGS + TSS | EHF/APIP | HLA-DRA, EEA1, SLC8A3, AHRR, CDH8, AGTR2/SLC6A14 |
| | Corvol, Blackman et al. 2015 [54] | Association | 6365 patients from GMS + CGS + TSS + FrGMC | EHF/APIP, AGTR2/SLC6A14, MUC4/MUC20, SLC9A3, HLA Class II | |
| BMI/Nutritional Status | Bradley, Blackman et al. 2012 [81] | Linkage | 1010 patients from TSS | 1p36.1, 5q14 | 1p31-22, 2q14.3, 6q25, 7q33, 9q34, 10q25-26, 11q12, 11q14, 12p12, 13q33, 14q21, 16p11.2 |
| CF-Related Diabetes | Blackman, Commander et al. 2013 [82] | Association | 644 CF patients with CFRD and 2415 CF patients without CFRD from GMS + CGS + TSS | SLC26A9 | CYP11B2, KRT18i33, NCKAP1L, LPHN3 |
| Meconium Ileus | Blackman, Deering-Brose et al. 2006 [83] | Linkage | 26 CF sibling pairs with MI and 282 CF sibling pairs without MI from TSS | | 4q35.1, 8p23.1, 11q25, 20p11.22, 21q22.3 |
| | Dorfman, Li et al. 2009 [84] | Linkage | 226 CF families with no MI and 71 CF families with at least one MI child at birth from CGS | 12p13.3 (ADIPOR2); 4q13.3 (SLC4A4) | |
| | Sun, Rommens et al. 2012 [69] | Association | 3763 CF patients from CGS + GMS + TSS | SLC6A14, SLC26A9, SLC9A3 | |

1.2.3 Genetic association studies of other CF related phenotypes

In addition to lung disease severity, there are several other CF phenotypes that have evidence for contribution by non-CFTR genetic factors.

1.2.3.1 Body mass index and nutritional status

CF is also characterized by poor growth due to a combination of pancreatic disease resulting from pancreatic enzyme deficiency and chronic lung disease. Individuals with CF are treated with pancreatic enzyme replacement; however, some individuals maintain an extremely low BMI.

BMI in CF patients has been shown to have a large genetic contribution outside of CFTR genotype, with heritability estimates ranging from 0.54 to 0.80 [81]. The contribution of gene modifiers to nutritional status in CF children (age 5-10 years) was investigated using the TSS cohort. Significant linkage was found between BMI and chromosomal regions 1p36.1 and 5q14

[81]. The linkage peak at 5q14 contains the arrestin domain-containing 3 (ARRDC3) gene which may be involved in the regulation of body mass and energy expenditure in males but not in females, with a rare haplotype contributing to high body weight in males [85]. This finding is also of relevance to the gender differences in CF severity, as it has been shown that females have a significantly higher risk for an abnormally low BMI [86], and that gender differences in body composition may contribute to the more severe disease found in female CF patients [87].

Therefore, some of the gender gap in CF could be explained by ARRDC3 variants that may be protective against low BMI in males. See Table 1.2 for a summary of the findings.

1.2.3.2 CF-related diabetes

Individuals with CF often develop diabetes mellitus with advancing age, and approximately 19% of adolescents and 40-50% of adults with CF are affected. CF-related diabetes (CFRD) is associated with an increased decline in lung function, which can be improved with medical management of blood glucose [88]. Approximately 90% of CF patients have exocrine pancreatic insufficiency, which is mainly determined by CFTR mutation [89], with the majority of pancreatic insufficient patients having severe class I, II and III mutations [89]. While all patients who develop CFRD are pancreatic insufficient [90, 91], CFTR mutations do not otherwise significantly contribute to diabetes risk, and variation in other genes accounts for the majority of the risk for CFRD [92]. The heritability of CFRD is extremely high and has been estimated as near one [92]. A GWAS for CFRD was conducted in 3,059 individuals with CF from the North American CF Gene Modifiers Consortium, of whom 644 had CFRD. A single SNP in the SLC26A9 gene was associated with CFRD (p = 3.6×10⁻⁸), and was subsequently replicated in a separate group of 694 individuals (of whom 124 had CFRD) [82]. SLC26A9 is an epithelial chloride/bicarbonate channel that has been demonstrated to physically interact with CFTR [93-95]. However, the identified SLC26A9 SNP is not associated with lung disease severity in CF, suggesting that the effect of this SNP is tissue-specific. It is also possible that SLC26A9 is involved in CFRD by playing a role in glucose metabolism, as SLC26A9 SNPs are also associated with type 2 diabetes in non-CF individuals, although with an opposite direction of effect [82]. Furthermore, a hypothesis-driven analysis of solute channel polymorphisms identified SLC26A9 SNPs as being associated with exocrine pancreatic disease measured by immunoreactive trypsinogen level [96, 97]. The association of SLC26A9 SNPs with two phenotypes in the same organ highlights the significance of this gene to pancreatic function and warrants further investigation. It has been proposed that pancreatic damage could result in loss of beta cell function and a corresponding lack of insulin secretion [98], which is supported by the finding that early pancreatic damage predicts the development of CFRD later in life [96, 99]. In addition, there were four other loci, CYP11B2, KRT18i33, NCKAP1L and LPHN3, which each had one SNP that was suggestive of association (p < 1.8×10⁻⁶) with CFRD. See Table 1.2 for a summary of the findings.
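The figures quoted above can be put in context with a short calculation. The sketch below computes the fraction of CFRD cases in the discovery and replication groups and compares the reported SLC26A9 p-value with a genome-wide significance cutoff. The cutoff of 5×10⁻⁸ is the conventional value and is an assumption here, since the text states only the suggestive threshold; the variable and function names are illustrative.

```python
# Conventional genome-wide cutoff; an assumption, not stated in the text.
GENOME_WIDE_THRESHOLD = 5e-8
# Suggestive threshold as quoted in the text.
SUGGESTIVE_THRESHOLD = 1.8e-6


def case_fraction(cases, total):
    """Fraction of a cohort affected by CFRD."""
    return cases / total


discovery = case_fraction(644, 3059)   # discovery GWAS
replication = case_fraction(124, 694)  # replication group

slc26a9_p = 3.6e-8
is_genome_wide = slc26a9_p < GENOME_WIDE_THRESHOLD

print(f"discovery CFRD fraction:   {discovery:.1%}")
print(f"replication CFRD fraction: {replication:.1%}")
print(f"SLC26A9 SNP below genome-wide cutoff: {is_genome_wide}")
```

Roughly one in five individuals in the discovery group had CFRD, with a somewhat lower fraction in the replication group, and the SLC26A9 p-value falls below the assumed cutoff by a narrow margin.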

1.2.3.3 Meconium ileus

Meconium ileus is an obstruction of the small intestine at birth that occurs in ~15% of newborns with CF. This phenotype is strongly regulated by gene modifiers, and heritability has been estimated at ~1.0 [83]. In addition to two linkage studies (summarized in Table 1.2) [83, 84], a conventional GWAS analysis in 3,763 patients from the CGS, GMS and TSS cohorts identified five SNPs in the SLC6A14 and SLC26A9 genes that were associated with meconium ileus [69]. These associations were replicated in an independent sample of 2,372 patients from North America and France. A parallel hypothesis-driven GWAS (GWAS-HD) was performed using the prior knowledge that disease-causing mutations in CFTR affect the flow of electrolytes and fluid flux. The GWAS-HD identified the same SNPs in SLC6A14 and SLC26A9 in addition to identifying a third gene, SLC9A3, with replication in the FrGMC cohort. In addition, the GWAS-HD was used to test all SNPs annotated to apical membrane genes jointly, and this analysis confirmed that apical membrane genes as a group contribute to susceptibility to meconium ileus. Furthermore, SLC26A9, SLC9A3 and SLC6A14 are pleiotropic, as SLC26A9 was associated with prenatal pancreatic damage, SLC6A14 was associated with lung disease and age at first P. aeruginosa infection, and SLC9A3 was associated with pediatric lung disease severity in the CGS cohort [96]. These results show that apical membrane constituents significantly contribute not only to the development of meconium ileus but also to other CF comorbidities, suggesting that modulation of these genes may be therapeutically beneficial for multiple phenotypes. See Figure 1.1 for a summary of apical membrane constituents that are associated with CF severity. See Table 1.2 for a summary of the findings.

Figure 1.1: Apical membrane constituents that are associated with CF related traits

1.2.4 CF modifier genes, inflammation and immunity in CF

Genome-wide association and linkage studies in CF have identified a number of genes that are associated with CF phenotypes. Strikingly, several of the identified genes are involved in inflammation and immune response, such as EHF, APIP, and HLA. Furthermore, many associated genes are apical membrane constituents that contribute to ASL properties and mucociliary clearance, such as MUC4, MUC20, SLC26A9, SLC6A14, and SLC9A3, and therefore indirectly contribute to inflammation and immunity since ASL volume and efficient mucociliary clearance are critical aspects of the immune response in the lung. This highlights the significance of inflammation and immune response to disease severity and progression. In addition to genome-wide association and linkage studies, candidate gene studies in CF have identified several modifier genes that are involved in the immune response. One of the first identified candidate genes in CF was mannose-binding lectin 2 (MBL2), a molecule which binds to a monosaccharide on bacterial cell walls and activates the complement cascade to both directly lyse bacteria as well as recruit phagocytes. In most studies, genotypes associated with MBL2 deficiency were associated with decreased lung function [100]. Another well-characterized candidate gene in CF is transforming growth factor beta 1 (TGF-β1) [100]. It is a strong candidate modifier since TGF-β1 genotype has also been associated with other respiratory diseases such as asthma and chronic obstructive pulmonary disease (COPD) [101-103]. TGF-β1 plays an important role in mechanisms that are critical to CF lung disease such as regulation of inflammation and tissue remodeling [104].

1.3 Infection in cystic fibrosis

1.3.1 Bacterial infection

While CF infants are born with healthy lungs, early in life they become infected with a variety of pathogens. Children with CF are most susceptible to infection with Haemophilus influenzae and Staphylococcus aureus, which result in damage to the airways due to the inflammatory response to infection. As the disease progresses, individuals with CF acquire a variety of Gram-negative bacteria. Many of these bacterial pathogens are commonly found in the environment, and only result in respiratory infection in individuals who are immune compromised or have reduced integrity of the airway epithelium. P. aeruginosa is the most common pathogen in adults with

CF, and 60-70% of CF patients are infected by this bacterium by the age of 20 years [105]. Once acquired, P. aeruginosa can often result in chronic infection which is associated with higher levels of inflammation, increased recruitment of neutrophils and correspondingly higher levels of serine proteases which contribute to airway obstruction and damage (reviewed in [106]). During chronic colonization, P. aeruginosa undergoes a phenotypic conversion which includes changes in motility, virulence, and antibiotic susceptibility allowing it to be better adapted to the CF airways (reviewed in [106, 107]). Several other Gram-negative bacteria can be found in CF including Stenotrophomonas maltophilia, Achromobacter spp and Burkholderia cepacia complex. Although it is one of the less common pathogens in CF, B. cepacia can be especially problematic as it is one of the most virulent CF pathogens and chronic infection is associated with significant morbidity and mortality [45]. B. cepacia can lead to development of “cepacia syndrome”, a rapidly progressive pneumonic illness which is difficult to treat and is often fatal

[108, 109]. Environmental non-tuberculous mycobacteria (NTM), such as Mycobacterium abscessus and Mycobacterium avium-intracellulare, are additionally found in CF patients.

These are aerobic, non-motile, Gram-positive, rod-shaped bacteria with thick cell walls.

NTM result in chronic pulmonary infection, which can often be difficult to treat due to antibiotic resistance [110-112]. Several CF pathogens can be transmitted between individuals, which is a serious concern in CF clinics [113, 114]. In particular, B. cepacia, P. aeruginosa, S. aureus and

M. abscessus have been shown to be transmitted between individuals [114, 115]. This has resulted in significant effort by CF clinics to reduce cross-infection, and patient-to-patient contact is strictly restricted [113, 114]. Recent studies using next generation sequencing to characterize the CF microbiome have identified a much larger range of bacteria in the CF airways than was previously known, which includes some anaerobic bacteria. Some of these bacteria can also be found in healthy individuals so are likely not all pathogenic [116, 117]. The high relative abundance of a particular organism such as P. aeruginosa and lower diversity of bacterial species has been shown to be associated with reduced lung function [117-120]. These data indicate that the use of antibiotics in CF may be detrimental in some cases, which is supported by studies indicating that prophylactic antibiotic use in early life in CF is associated with increased prevalence of P. aeruginosa.

1.3.2 Viral infection in cystic fibrosis

Although bacterial pathogens are most commonly associated with respiratory infections resulting in clinical deterioration, recent studies implicate respiratory viruses as a major contributor to short- and long-term health in CF [121]. In healthy individuals, viral infection is cleared without impacting long-term respiratory health. Although the frequency of viral infections is not different between healthy children and those with CF [122, 123], in CF patients respiratory viral infections are associated with up to 50% of CF exacerbations and can result in a persistent decline in lung function [121, 123-127]. Furthermore, respiratory virus infections have been linked to an increase in bacterial adherence, and there is evidence that viral infection may trigger bacterial infection (reviewed in [128]). In addition, viral infections are associated with prolonged hospitalizations and increased use of antibiotics [127]. In children with CF, viral-induced exacerbations are associated with worse severity and decreased quality of life compared to non-viral exacerbations [129]. The most common viral pathogens that are detected in CF patients are influenza, respiratory syncytial virus (RSV) and rhinovirus [121, 130]. Rhinovirus is the predominant viral pathogen and is found in up to 40% of virus-associated exacerbations in CF [121, 125, 126].

In response to detection of a respiratory virus, pattern recognition receptors (PRRs), particularly Toll-like receptors (TLRs) and retinoic acid-inducible gene I-like receptors, are activated by the presence of double-stranded RNA (dsRNA), which is produced during viral replication [130]. This results in airway epithelial cells producing anti-viral molecules including interferons (IFNs), particularly IFN-β and IFN-λ [131-133]. IFNs bind to specific receptors and activate the JAK-STAT pathway, which results in the induction of IFN-stimulated genes (ISGs), such as 2′-5′-oligoadenylate synthetase 1 (OAS1), that trigger the production of several anti-viral proteins.

1.3.3 Rhinovirus in CF

The majority of common colds and respiratory tract infections in childhood are caused by human rhinovirus, a positive-sense single-stranded RNA (ssRNA) virus that is part of the Picornaviridae family. Over a hundred strains of rhinovirus have been identified, belonging to three viral species (A, B and C). Furthermore, rhinovirus is classified into two groups, the major and minor group, depending on airway epithelial receptor affinity. The major group represents the majority of rhinovirus strains, which utilize intercellular adhesion molecule 1 (ICAM-1) to bind to and enter cells, while the remaining (minor group) strains bind to low-density lipoprotein receptors (LDLR) [134, 135].

While the majority of studies suggest that CF patients do not have increased susceptibility to rhinovirus infection [122, 123, 136, 137], one recent report showed that children with CF may have increased risk of rhinovirus acquisition [138]. In healthy individuals, rhinovirus infection is normally confined to the upper respiratory tract; however, in CF patients antiviral immunity appears to be deficient, allowing the virus to move to the lower airways [136]. The mechanism involved in increased morbidity due to rhinovirus infection in CF is currently unknown. Rhinovirus has been recognized as a contributor to exacerbation and lung function decline in patients with other chronic lung diseases such as COPD and asthma. COPD patients have been shown to have increased susceptibility to rhinovirus infection despite an increased IFN response [139], although a conflicting report found that COPD-derived airway epithelial cells had lower IFN production in response to rhinovirus [140]. In addition, more than 50% of asthma exacerbations are associated with rhinovirus infection [141], and deficient production of IFNs, specifically IFN-β and IFN-λ, has been shown to contribute to the dysregulated control of rhinovirus infection in asthmatics [142-146]. Therefore, it has been hypothesized that production of IFNs, as well as inflammatory cytokines, in response to rhinovirus infection is also abnormal in CF.

It is well established that airway inflammation plays a large role in the progression of CF lung disease and that it contributes to damage of the airways. This has resulted in the hypothesis that there is an elevated production of inflammatory mediators and recruitment of inflammatory cells in response to respiratory virus infection in CF, and that this contributes to the increased respiratory virus morbidity in CF. Current data in the literature reporting on the CF response to rhinovirus have been conflicting. Sutanto et al. found that CF airway epithelial cells stimulated with human rhinovirus 1b (HRV1b) responded with increased interleukin (IL)-6 and IL-8 production, as well as reduced apoptosis and increased viral replication compared to non-CF cells [147]. In contrast, Kieninger et al. [148] did not find a significant difference in the production of inflammatory cytokines released in response to rhinovirus infection in CF airway epithelial cells compared to controls. Furthermore, while Kieninger et al. also found that rhinovirus load is higher in children with CF compared to healthy controls, in particular during CF exacerbations, they showed that increased viral load was associated with decreased IFN levels [136]. It has been shown that CF-derived airway epithelial cells produce a reduced IFN response to rhinovirus and have a higher viral load when co-infected with P. aeruginosa, while this phenomenon is not observed in healthy individuals. In addition, infection of CF-derived airway epithelial cells with rhinovirus following P. aeruginosa infection resulted in the dispersal of bacteria into a planktonic form, which was associated with increased production of several chemokines including IL-8 [149]. Most recently, Dauletbaev et al. [150] showed that CF airway epithelial cells do not produce significantly different amounts of interferon-β and IL-8 compared to controls in response to infection with HRV16. However, they found a significantly elevated viral load in CF cells, which could be detected as early as two hours post infection. This confirms other studies showing that rhinovirus load is elevated in CF [136, 147, 151, 152]. Further work is required to establish gene expression profiles in CF cells compared to healthy controls after rhinovirus infection to determine if viral response pathways are dysregulated in CF.

1.3.4 Fungal pathogens

The role of fungal pathogens in CF lung disease has become increasingly appreciated in recent years. Aspergillus and Candida species are most commonly detected in CF lungs. In addition to these more prevalent fungal pathogens, additional genera such as Penicillium, Alternaria and Scedosporium have been isolated in CF. The clinical importance of airway colonization with fungi is currently unclear but is gaining interest for further study. The clinical syndrome known as allergic bronchopulmonary aspergillosis (ABPA) is the most well recognized effect of fungal colonization in the airways, and is known to be caused by Aspergillus fumigatus [153]. ABPA is the result of an exaggerated immune response to the presence of Aspergillus spores in the airways and is associated with elevated levels of Aspergillus-specific IgE. ABPA is well recognized to be associated with reduced lung function and an increased rate of lung function decline [154]. ABPA is normally treated with inhaled and oral corticosteroids to attenuate inflammation and immune response [155, 156], and with antifungal agents such as itraconazole [157]. Although the consequences of the presence of Candida species in the airways have not been well studied, there is evidence that Candida acquisition is associated with reduced FEV1, increased frequency of exacerbation and more rapid decline in FEV1 [158]. While these findings suggest that colonization of the airways with Candida is detrimental in CF, it is important to consider the possibility that colonization is an effect of increased infection and resultant antibiotic treatment.

1.4 Inflammation in CF

1.4.1 Dysregulated immune response in CF

There is evidence that inflammation is dysregulated in the CF lung [13, 159, 160]. While CF patients are born with anatomically normal lungs, early in life even in the absence of obvious infection, CF infants and children have evidence of inflammation such as elevated IL-8 and high numbers of neutrophils [161, 162]. CF patients soon become infected with a characteristic group of bacteria and the inflammatory response to these pathogens is severe and prolonged [163].

High levels of neutrophil elastase are present in CF infants. This persistent exaggerated immune response causes permanent structural damage to the CF airways resulting in loss of lung function and eventual respiratory failure. It is currently unknown why defective CFTR function results in dysregulated inflammation and several mechanisms are likely involved. The most well characterized consequence of CFTR deficiency is the impact on ASL and the mucociliary escalator.

1.4.2 Abnormal airway surface liquid and mucociliary clearance

The ASL and mucociliary escalator are important aspects of airway biology that dynamically respond to signals from the environment and host. Adequate volume, pH and composition of the

ASL are important for proper function of cilia and mucociliary clearance. The ASL also contains antimicrobial molecules that play a role in innate and adaptive immune responses. The airway mucus serves the functions of trapping pathogens and removing them through the mucociliary escalator, as well as protecting the airway epithelium from toxins. CFTR plays an important role in modulating the viscosity of the mucus layer to find a balance between sufficient fluidity for airway clearance and viscosity for protection [164]. CFTR deficiency is thought to

result in excessive water absorption and the production of dehydrated mucus due to loss of chloride secretion and the resulting change in osmotic pressure. This likely results in impaired clearance of pathogens and a corresponding secondary immune response. Tethered and secreted mucins contribute to ASL hydration by attracting and storing water, and CFTR contributes to this function by providing the required water by secreting chloride and regulating sodium absorption [65, 160]. The recent identification of MUC4 and MUC20 as modifier genes of lung disease severity in CF is further evidence of the importance of mucins in ASL hydration in CF

[54]. Since CFTR also contributes to ASL pH through its role in bicarbonate transport, loss of

CFTR function results in reduced ASL pH. Reduced pH is thought to be detrimental to the function of many antimicrobial factors, and reduced pH in a CF pig model has been shown to reversibly inhibit antimicrobial activity [165]. Reduced bicarbonate also alters the properties of airway mucus and results in a dense mucus that is impenetrable and tightly tethered to the epithelium in which the cilia are not able to properly move.

CFTR can modulate additional apical membrane ion channels and transporters which contribute to ASL homeostasis. The epithelial sodium channel (ENaC) plays an important role in sodium transport on the apical membrane, and thus contributes to hydration of the ASL and mucus. While the mechanism behind the relationship is unknown and the hypothesis is controversial, defective CFTR is believed to result in increased activity of ENaC and there are data supporting increased ENaC activity in CF [166, 167]. As ENaC is responsible for sodium transport, upregulated ENaC activity in the context of CF results in hyperabsorption of sodium which causes ASL dehydration [166, 168]. ENaC activity in the CF airways is also altered due to several other mechanisms. ENaC is modulated through cleavage by serine proteases such as neutrophil elastase that are elevated in CF airways [169, 170]. In addition, ENaC is inhibited by

BPI fold containing family A member 1 (BPIFA1), a secreted product of airway epithelial cells which loses its regulatory ability in the acidic CF environment. Another family of apical membrane transporters is the SLC26A family, whose members are involved in the transport of chloride and bicarbonate in the airway epithelium and have been shown to be regulated by CFTR [171]. The genetic association of SLC26A9 polymorphisms with CF-related diabetes, meconium ileus and CF pancreatic disease [69, 82, 96, 97] further highlights the significance of this family of solute channels in CF. In summary, CFTR mutations can result in altered ASL and mucus both directly and through the modulation of additional solute channels and transporters such as ENaC and the SLC26A family. This can result in increased inflammation due to inefficient mucociliary clearance of pathogens and the inhibition of antimicrobial molecules.

1.4.3 Secondary inflammatory defects in CF airways

Inflammation in the CF lung is elevated and prolonged due to the constant recruitment of immune cells into the airways. The inflammation is characterized by high levels of neutrophils, which are sentinel cells of the lung against bacterial and fungal pathogens such as P. aeruginosa and A. fumigatus [172]. However, activation of neutrophils at high levels can be damaging to the lungs due to the release of oxidants and proteases [173]. Neutrophil granules contain several serine proteases: neutrophil elastase, cathepsin G, proteinase-3, and neutrophil serine protease-

[174, 175]. Elevated levels of serine proteases that are found in the context of CF have been shown to be detrimental to lung function. Neutrophil elastase has been shown to degrade CFTR resulting in loss of function [176], and elevated neutrophil elastase levels are associated with the development of bronchiectasis in CF children [176]. Loss of neutrophil elastase has been shown to be beneficial in CF models and is associated with reduced levels of inflammation, mucus hypersecretion and emphysema [177]. Calgranulins are another class of molecules that are

derived from inflammatory cells and play a role in the inflammatory response [178]. They are released from neutrophils, macrophages and monocytes, and levels are markedly increased in CF [179, 180]. These molecules contribute to CF inflammation through several mechanisms including activation of TLR4 signaling, phosphorylation of ERK, facilitation of NF-κB translocation to the nucleus, and induction of MUC5AC secretion [181, 182]. While inflammation in the CF lung is primarily neutrophilic, other immune cells, particularly macrophages, dendritic cells and T cells, also play a role. T cells have been shown to be altered in CF [183], and this could contribute to defects in the adaptive immune response.

1.4.4 Anti-inflammatory therapies

Airway inflammation plays a large role in the progression of lung disease in CF. Slowing disease progression with treatments such as CFTR potentiators and correctors, mucolytics and antibiotics may contribute to reduction of inflammation in the airways. However, therapies which directly target inflammation have been shown to be beneficial in CF, although most are not routinely used due to safety concerns. Ibuprofen, which is a non-steroidal anti-inflammatory drug, has been shown to significantly reduce the decline of FEV1 as well as improve body weight [184, 185]. While ibuprofen is not routinely used in the treatment of CF patients due to concerns about side effects [186], there is evidence that ibuprofen might be beneficial in proportion to the risks for CF patients with mild to moderate lung disease [187, 188]. More recently, ibuprofen has been shown to slow the decline of lung function in pediatric CF patients who started with well-preserved lung function, with a low rate of adverse events [189].

Prednisone is a corticosteroid which has also been shown to be beneficial for lung function and BMI in CF, and has been associated with a decrease in the number of pulmonary exacerbations [190, 191]. However, side effects including growth failure and the development of cataracts [192] have prevented prednisone from being used routinely as a long-term therapy.

Azithromycin, a macrolide antibiotic, has been shown to be beneficial for CF lung function in clinical trials and is routinely prescribed as an anti-inflammatory therapy [193]. While the mechanism of action is currently unclear, there is evidence supporting several anti-inflammatory effects including inhibition of quorum sensing [194] resulting in breakdown of biofilms [195], inhibition of NF-κB and AP-1 mediated inflammatory responses [196], and restoration of chloride efflux in CF cells [197, 198].

1.5 BPIFA1 and BPIFB1: Innate immune molecules in CF

1.5.1 The BPI fold containing family of proteins

The upper respiratory tract, starting from the nasal and oral cavities, is a main route of entry of pathogens into the body. As well as serving as a structural barrier, airway epithelial cells produce proteins that are secreted into the airway lumen and provide a first line of defense against pathogenic exposures. These molecules are part of the innate immune response. Many innate immune molecules can specifically recognize and respond to molecules which are present on the bacterial surface, such as bacterial lipopolysaccharide (LPS). In humans, two essential innate immune molecules which interact with LPS are LPS-binding protein (LBP) and bactericidal permeability increasing protein (BPI) [199, 200]. While BPI and LBP are structurally similar and can bind the Lipid A component of LPS, they are considered to have opposing functions. LBP is a plasma protein that can bind to LPS on Gram-negative bacteria and transfer it to other LPS-binding proteins on the cell surface, alerting the host to the presence of bacteria and inducing an inflammatory response. BPI serves an antagonistic role whereby it binds to LPS and reduces its presentation to the host; therefore it has an anti-inflammatory function. BPI and LBP are both members of the large tubular lipid-binding (TULIP) domain superfamily, which also contains cholesteryl ester-transfer protein (CETP), phospholipid-transfer protein (PLTP) [201], as well as the human BPIF (BPI fold containing) family. All of the TULIP genes are located on chromosome 20, except for CETP, the most distantly related family member, which is located on chromosome 16.

Proteins that contain BPI fold structures, which consist of two large barrel-shaped domains connected by a central β-sheet, are considered part of the BPIF superfamily. Modeling studies have demonstrated that BPIF proteins subdivide into two groups, originally known as short palate, lung, nasal epithelium clone (SPLUNC) and long PLUNC (LPLUNC) proteins. While LPLUNCs contain domains that are structurally similar to both domains of BPI, SPLUNCs only contain domains similar to the N-terminal domain of BPI [202]. Although the two domains are structurally similar, their sequence identity is low and the domains exhibit distinct cellular functions [203]. In recent years, members of this gene family have been implicated in a variety of diseases. The best characterized genes of the BPIF family are BPIFA1 (SPLUNC1) and BPIFB1 (LPLUNC1), both of which are secreted by airway epithelial cells in the upper respiratory tract [204-206].

1.5.2 Antimicrobial properties of BPIFA1

BPIFA1 is one of the most highly expressed proteins in the respiratory tract and is produced by epithelial cells in the upper airway and proximal lower respiratory tract. While expression levels are high in the trachea and bronchi including the secretory ducts and submucosal glands of normal human trachea and bronchi, BPIFA1 levels decrease from the proximal to the distal airway, until they are undetectable in the peripheral lungs [207].

There are multiple secreted isoforms of human BPIFA1 that have been detected in respiratory secretions [208, 209], and are likely produced through post-translational modifications including phosphorylation, deamidation, truncation or glycosylation [208, 210]. The acidic isoforms of BPIFA1 have been found to be sialylated, and these isoforms have been proposed to play a role in the inflammatory response and host defense in seasonal allergic rhinitis patients [211]. Based on early predictions of BPIFA1 amino acid sequence, and similarity to other secreted proteins, it was proposed that BPIFA1 is secreted [207, 212, 213], and this was later confirmed when BPIFA1 was detected in nasal secretions [214-217], saliva, sputum [212, 218], and airway epithelial cell culture supernatant [212]. BPIFA1 mRNA and protein expression levels have been found to be increased in CF airways, suggesting that BPIFA1 may be of importance in CF patients [204, 219, 220].

Soon after its discovery, BPIFA1 was predicted to play a role in innate immunity due to its structural similarity to BPI and LBP. This function was confirmed, as BPIFA1 has been found to bind to LPS and to inhibit the growth of P. aeruginosa, Mycoplasma pneumoniae and Klebsiella pneumoniae [221-226]. Transgenic mice that over-express human BPIFA1 have been shown to display enhanced bacterial clearance after challenge with P. aeruginosa, resulting in decreased infiltration of neutrophils and lower levels of inflammatory cytokines [227]. Conversely, reduction of BPIFA1 using siRNA was associated with increased growth of M. pneumoniae and increased IL-8 production [221], and BPIFA1 knockout mice were less able to clear M. pneumoniae infection, had impaired neutrophil activation and increased levels of inflammatory cells in response to infection [224]. BPIFA1 also has anti-viral properties against Epstein-Barr virus [222]. In addition, there is evidence that BPIFA1 has surfactant properties and can inhibit biofilm formation by P. aeruginosa and K. pneumoniae [225, 228].

1.5.3 Immunomodulatory function of BPIFA1

Experiments in mice have shown that BPIFA1 has immunomodulatory properties in models of acute airway inflammation. BPIFA1-deficient mice have increased levels of inflammation, characterized by increased levels of eosinophils, mucus production, and airway hyperreactivity, as well as increased production of the TH2 cytokines IL-4, IL-5, and IL-13 [229, 230]. In contrast, in a model of pulmonary inflammation in the absence of infection, BPIFA1 overexpressing mice were stimulated with potently inflammatory inhaled single-walled carbon nanotubes. Increased BPIFA1 levels were associated with greater neutrophilic inflammation and elevated levels of the inflammatory cytokines TNF-α and IL-6. Although this model system was associated with a pro-inflammatory role for BPIFA1, over-expression of BPIFA1 in these mice was associated with protection against fibrosis [231]. Another mechanism by which BPIFA1 may regulate airway inflammation is by influencing apoptosis of airway epithelial cells. Over-expression of BPIFA1 in a nasopharyngeal carcinoma cell line was associated with decreased expression of anti-apoptotic proteins and increased expression of pro-apoptotic proteins, and resulted in higher levels of apoptosis in these cells [232]. In addition to influencing the production of chemokines, there is evidence that BPIFA1 may itself function as a chemoattractant that can recruit neutrophils and macrophages [226].

1.5.4 BPIFA1 in ion transport and ASL regulation

BPIFA1 has been identified as a pH-sensitive inhibitor of ENaC and has been shown to lose this function in the acidic CF airway environment, potentially contributing to depletion of airway surface liquid in CF airways [233-235]. The mechanism behind this inhibition involves binding of BPIFA1 to ENaC to prevent its cleavage, reducing ENaC expression on the apical membrane and also possibly regulating the opening of the channel [169, 236]. This function of BPIFA1 is especially relevant in the context of CF since ENaC may be overactive, and this increased activity may contribute to dehydrated airway mucus, decreased ASL height and impaired mucociliary clearance.

1.5.5 Function of BPIFB1

BPIFB1 is another highly expressed protein that is produced by a population of goblet cells in the airway epithelium and nasal passages, and can also be found in submucosal glands and minor glands of the oral and nasal cavities. BPIFB1 is secreted and can be detected in bronchoalveolar lavage fluid as two glycosylated forms [206]. Glycosylated isoforms of BPIFB1 have also been identified in saliva [237]. Post-translational modification of this protein is believed to be important for biological activity [206]. Although less well characterized, BPIFB1 is also hypothesized to function in innate immunity due to its structural similarity with BPI and LBP [202]. Further evidence that BPIFB1 plays a role in innate immunity comes from a genetic association with cholera [238] as well as data indicating that BPIFB1 modifies the innate immune response to Vibrio cholerae [239]. Furthermore, auto-antibodies to BPIFB1 have been associated with the development of interstitial lung disease [240]. BPIFB1 expression levels are elevated in airway epithelial cells from CF patients [241], suggesting that BPIFB1 may play a role in innate immunity in CF airways.

1.6 Summary of thesis objectives

Inflammation in response to respiratory infection plays a large role in CF disease pathogenesis and contributes strongly to the decline in pulmonary function characteristic of the disease. Genetic association studies have identified several modifier genes of CF disease severity which are known to play a role in inflammation and the innate immune response. The overarching goal of this thesis was to utilize information from genetic association studies and whole transcriptome analysis to investigate the inflammatory response to infection in CF, and identify and characterize anti-inflammatory targets.

The central goal of this thesis will be addressed through three main objectives which will be described in three data chapters. The first objective was to identify and characterize SNPs in the BPIFA1/BPIFB1 chromosomal region associated with CF lung disease severity. This was done by interrogating data from the CF GWAS for lung disease severity and integrating with genotype and lung gene expression data from microarrays, with validation by quantitative PCR (qPCR) and western blotting. The second objective was to characterize the function of BPIFA1 and BPIFB1 in CF. This objective was addressed through mechanistic and observational studies investigating BPIFA1 and BPIFB1 expression, as well as characterizing their putative antimicrobial and anti-inflammatory functions. The third objective was to investigate whether the CF response to rhinovirus infection is altered compared to healthy controls in order to identify possible anti-inflammatory targets. This objective was tackled by performing whole transcriptome analysis of CF and healthy control airway epithelial cells infected with rhinovirus.

In combination, these studies provide valuable insight into the inflammatory response in CF.

CHAPTER 2: PLUNCs AS MODIFIER GENES IN CYSTIC FIBROSIS

2.1 Rationale

There is compelling evidence for a role of both BPIFA1 and BPIFB1 in airway defense. Furthermore, BPIFA1 has surfactant properties, and has been shown to inhibit ENaC. Therefore, BPIFA1 and BPIFB1 are excellent candidates as modifier genes in CF, although a connection between these molecules and variability in CF disease severity has not yet been established. The purpose of this study was to investigate whether these genes contribute to lung disease in CF, by determining whether BPIFA1/BPIFB1 polymorphisms are associated with CF lung function in data from the CF modifier GWAS, and to further investigate whether associated polymorphisms modulate BPIFA1/BPIFB1 mRNA and protein expression levels.

2.2 Background

BPIFA1 and BPIFB1 are members of the PLUNC family of proteins that are secreted from airway epithelial cells in the upper respiratory tract, as well as from the nasopharynx and submucosal glands [204-206]. Both proteins may play important roles in host defense of CF airways due to their putative roles in innate immunity. BPIFA1 has been found to bind to LPS and inhibit the growth of P. aeruginosa, M. pneumoniae, and K. pneumoniae [221-225]. BPIFA1 has also been identified as a pH sensitive inhibitor of ENaC, and has been shown to lose this function in the acidic CF airway environment, potentially contributing to depletion of airway surface liquid in CF airways [233, 234]. In addition, BPIFA1 has surfactant properties and may inhibit biofilm formation in the airways [225]. Thus, the antimicrobial, ENaC inhibitory, and surfactant properties of BPIFA1 may play an important role in CF airways.

Although less well characterized, BPIFB1 is also hypothesized to function in innate immunity due to its structural similarity with BPI and LBP, both of which are innate immune molecules with recognized roles in sensing and responding to Gram-negative bacteria [202]. Further evidence that BPIFB1 plays a role in innate immunity comes from a genetic association with cholera [238] as well as data indicating that BPIFB1 modifies the innate immune response to Vibrio cholerae [239]. In addition, BPIFB1 expression levels are elevated in airway epithelial cells from CF patients [241]. Therefore, BPIFB1 may also play a role in innate immunity in CF airways.

As described in Chapter 1, GWAS have been performed for lung disease severity in CF and these studies have identified several genes that are associated with CF lung function. However, GWAS are also a rich resource for identifying candidate genes in a hypothesis-driven manner. Here, we interrogated data from the first CF GWAS (GWAS1) to identify polymorphisms that are associated with CF lung disease severity, and utilized the second GWAS (GWAS2) as a replication cohort. We also tested whether associated polymorphisms modulated BPIFA1/BPIFB1 mRNA and protein expression levels.

Hypothesis: Polymorphisms in the BPIFA1/BPIFB1 region are associated with lung disease severity in CF

Aim: To identify and characterize SNPs in the BPIFA1/BPIFB1 region associated with CF lung disease severity

2.3 Materials and methods

2.3.1 Selection of candidate SNPs from CF modifier GWAS data

A total of 1357 individuals with CF from the CGS and 1137 CF individuals from the GMS were genotyped genome-wide for 570,725 SNPs on the Illumina 610-Quad BeadChip [58]. Imputation was performed using the MaCH/Minimac software in the region spanning chromosome 20: 31,800,000-32,000,000 using Phase I, Version 3 haplotype data from the 1000 Genomes reference population. Lung disease severity was assessed using the North American CF Modifier Consortium lung phenotype (KNoRMA), which is an average of CF-specific FEV1 percentiles adjusted for age, sex and mortality [57]. Associations were tested using an additive model adjusted for sex and principal components that reflect genetic ancestry, and results from the two populations were combined using a meta-analysis approach (n=2,494). We assessed SNPs within the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1 (Figure 2.1 and Appendix Table 1). The critical value for significance was determined using a previously described Meff-based calculation [242, 243], which estimates the number of independent statistical tests required to correct for multiple comparisons. The replication cohort consisted of 2333 individuals, including 285 additional individuals from CGS, 1222 individuals from the FrGMC and 826 additional individuals from the GMS, that were combined in a meta-analysis approach. In addition, we performed a meta-analysis combining all 4827 patients from the discovery and replication cohorts across the three study designs. See Table 2.1 for a summary of included individuals.
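
The text does not state which meta-analysis model was used to combine the cohorts. As a minimal sketch, assuming a fixed-effect inverse-variance weighted model (a common choice for GWAS, but an assumption here), per-cohort effect estimates could be combined as shown below. The function name, effect sizes and standard errors are hypothetical and are not values from this study.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance meta-analysis.

    Assumed model for illustration; the source does not name the method used.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Hypothetical per-cohort effect estimates and standard errors
beta, se, p = inverse_variance_meta([-0.10, -0.10], [0.05, 0.05])
print(f"beta={beta:.3f}, se={se:.4f}, p={p:.4g}")
```

Two cohorts with identical estimates keep the same pooled effect, while the pooled standard error shrinks by a factor of √2, which is why combining cohorts strengthens the evidence for a consistent association.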

Table 2.1: Characteristics of included GWAS individuals

| Cohort | Study | Lead institution | Design | Number of subjects | p.Phe508del/p.Phe508del, n (%) |
|---|---|---|---|---|---|
| Discovery cohort (GWAS 1) | Genetic Modifier Study (GMS) | University of North Carolina | Extremes of phenotype (severe n=406, mild n=731) | 1137 | 1137 (100.0) |
| Discovery cohort (GWAS 1) | Canadian Consortium for Genetic Studies (CGS) | Hospital for Sick Children | Population-based | 1357 | 841 (62.0) |
| Replication cohort (GWAS 2) | French CF Gene Modifier Consortium (FrGMC) | Université of Pierre and Marie Curie | Population-based | 1222 | 716 (58.6) |
| Replication cohort (GWAS 2) | Genetic Modifier Study (GMS) | University of North Carolina | Extremes of phenotype | 469 | 191 (40.7) |
| Replication cohort (GWAS 2) | Genetic Modifier Study (GMS) | University of North Carolina | Not extremes of phenotype | 357 | 214 (59.9) |
| Replication cohort (GWAS 2) | Canadian Consortium for Genetic Studies (CGS) | Hospital for Sick Children | Population-based | 285 | 189 (66.3) |
| GWAS 1 + 2 | Total | | | 4827 | |

Description of CF patients included in GWAS 1 and GWAS 2 for lung disease severity in CF

2.3.2 Lung expression quantitative trait loci (eQTL) study

Details of the genome-wide genotyping and gene expression analysis from the lung as well as a description of the study participants have been previously published [244]. Briefly, lung tissue samples were collected from patients who underwent lung resections at three participating sites: Laval University, University of British Columbia, and University of Groningen. Gene expression levels were measured using a custom Affymetrix array (see GEO platform GPL10379), and samples were genotyped on the Illumina Human1M-Duo BeadChip array. Imputation was performed using the MaCH/Minimac software in the region spanning chromosome 20: 31,800,000-32,000,000 using Phase I, Version 3 haplotype data from the 1000 Genomes population. A linear model was used to adjust normalized gene expression data for age, gender and recruitment centre. A total of 1,111 patients, including 51 CF patients (Table 2.2), passed standard quality control measures. SNPs meeting a 10% FDR threshold were considered statistically significant at the genome-wide level, and SNPs with p-value<0.05 were defined as nominally significant.
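
The procedure used to control the FDR is not named in the text. As an illustrative sketch only, assuming the Benjamini-Hochberg step-up procedure (an assumption, not a confirmed detail of the lung eQTL study), a 10% FDR threshold could be applied to a list of p-values as follows. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, fdr=0.10):
    """Flag p-values that are significant at the given FDR (Benjamini-Hochberg)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * fdr:
            cutoff_rank = rank
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Hypothetical p-values for five SNP-gene pairs
flags = benjamini_hochberg([0.001, 0.2, 0.01, 0.5, 0.03])
print(flags)  # [True, False, True, False, True]
```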

Table 2.2: CF and non-CF patients from the lung eQTL study

| | CF patients: observed | CF patients: expected | Non-CF individuals: observed | Non-CF individuals: expected |
|---|---|---|---|---|
| Age (years) | 21.1±9.5 | | 60.2±12.6 | |
| % Male | 60.0 | | 55.1 | |
| rs1078761 AA | 21 (41%) | 22 | 576 (54%) | 575 |
| rs1078761 AG | 28 (55%) | 24 | 409 (39%) | 411 |
| rs1078761 GG | 2 (4%) | 5 | 75 (7%) | 74 |
| rs1078761 P-value¹ | 0.0495 | | 0.8369 | |
| rs750064 AA | 14 (27%) | 14 | 290 (27%) | 298 |
| rs750064 AG | 25 (49%) | 25 | 544 (51%) | 528 |
| rs750064 GG | 12 (24%) | 12 | 226 (21%) | 234 |
| rs750064 P-value¹ | 0.8972 | | 0.326 | |
| Total | 51 | | 1060 | |

Demographic and genotypic characteristics of CF and non-CF patients included in the lung eQTL study. Age is expressed as the mean ± standard deviation. Genotype frequency is expressed as the number of individuals with the specified genotype. ¹P-value from Chi-squared goodness-of-fit test of the genotype distributions against Hardy-Weinberg equilibrium. A p-value <0.05 indicates that the SNP is not in Hardy-Weinberg equilibrium.
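
The Hardy-Weinberg test named in the footnote of Table 2.2 can be sketched as below. This is a minimal illustration that assumes a chi-squared goodness-of-fit test with one degree of freedom and allele frequencies estimated from the observed counts; the function name is hypothetical and the exact implementation used in the study is not described. Applied to the observed rs1078761 counts in CF patients, it gives a P-value close to the reported 0.0495.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test (1 df, assumed) against Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    observed = [n_aa, n_ab, n_bb]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of the chi-squared distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Observed rs1078761 genotype counts in CF patients (AA, AG, GG) from Table 2.2
chi2, p_value = hwe_chi_square(21, 28, 2)
print(f"chi2={chi2:.3f}, p={p_value:.4f}")
```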

2.3.3 Bioinformatic analysis of SNPs in the 20q11 region

The SIFT [245] and PolyPhen2 [246] prediction algorithms were used to test for functional impact of the rs1078761 polymorphism. Genotype data for estimation of linkage disequilibrium (LD) were accessed from the 1000 Genomes Project [247]. SNP function data were accessed from the Encyclopedia of DNA Elements (ENCODE), which identifies functional elements in the human genome [58].

2.3.4 Collection of saliva samples from healthy volunteers

A total of 101 healthy volunteers were recruited from the Centre for Heart Lung Innovation to donate saliva samples. Demographic and genotypic characteristics of the study participants are summarized in Table 2.3. Saliva samples were collected based on the protocol published by the International Agency for Research on Cancer Biobank [248]. Saliva samples were centrifuged at 14000 RPM for 20 minutes. Supernatants were removed and stored at -80°C for subsequent measurement of protein levels. Saliva pellets were stored for extraction of DNA for genotyping.

Table 2.3: Healthy saliva donors

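| Characteristic | Value |
|---|---|
| Age (years) | 29.2±8.5 |
| % Male | 46.5 |
| rs1078761 genotype AA | 53 (52%) |
| rs1078761 genotype AG | 40 (40%) |
| rs1078761 genotype GG | 8 (8%) |
| rs750064 genotype* AA | 28 (28%) |
| rs750064 genotype* AG | 58 (58%) |
| rs750064 genotype* GG | 14 (14%) |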

Demographic and genotypic characteristics of healthy volunteers recruited to donate saliva samples. Age is expressed as the mean ± standard deviation. Genotype frequency is expressed as the number of individuals with the specified genotype. *One individual failed genotyping for rs750064

2.3.5 DNA extraction and genotyping of saliva samples

DNA was extracted from saliva pellets using the QIAamp DNA Mini Kit (Qiagen, Toronto, ON, Canada). 5 ng of DNA was genotyped using TaqMan assays (Life Technologies, Burlington, ON, Canada) for rs1078761 and rs750064. Genotyping was performed on the Applied Biosystems ViiA7 Real-Time PCR System (Life Technologies). DNA samples from CEPH individuals of known genotype were used as positive controls (Coriell Institute, Camden, NJ, USA).

2.3.6 Quantitative Real-Time PCR (qPCR) of BPIFA1 and BPIFB1 in the lung

Seventy-eight individuals from the UBC recruitment site of the lung eQTL study were selected based on rs1078761 genotype for validation using qPCR. Of these, 77 were included in the lung eQTL study described above. Demographic and genotypic characteristics of these individuals are summarized in Table 2.4. Total RNA was extracted from 35 mg of peripheral lung tissue using the RNeasy Mini Kit (Qiagen, Toronto, ON, Canada). RNA was quantified using the Nanodrop 8000 spectrophotometer (Thermo Scientific, Wilmington, DE, USA), and 200 ng of RNA was reverse transcribed using the SuperScript III First-Strand Synthesis Kit (Life Technologies, Burlington, ON, Canada). qPCR was performed on the ABI 7900HT real-time PCR instrument using 20 ng of cDNA and TaqMan gene expression assays for BPIFA1 and BPIFB1 (assay IDs Hs00213177_m1 and Hs00264197_m1; Life Technologies). Standard TaqMan qPCR cycling conditions were used. PPIA was used as a reference gene as it was previously shown to be stably expressed in the lung [249]. A sample of pooled cDNA from several lung samples was used as a calibrator. Gene expression was calculated using the 2^(-ΔΔCT) method [250].
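
As a brief worked illustration of the 2^(-ΔΔCT) calculation, the sketch below normalizes a target gene to a reference gene (PPIA in this study) and then to the pooled calibrator sample. The function name and the CT values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression of a target gene by the 2^(-ΔΔCT) method."""
    delta_ct_sample = ct_target - ct_reference              # ΔCT in the sample
    delta_ct_calibrator = ct_target_cal - ct_reference_cal  # ΔCT in the calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator  # ΔΔCT
    return 2.0 ** (-delta_delta_ct)

# Hypothetical CT values: target vs reference gene in a sample and in the calibrator
# Sample ΔCT = 4, calibrator ΔCT = 6, so ΔΔCT = -2
fold = relative_expression(24.0, 20.0, 26.0, 20.0)
print(fold)  # 4.0
```

A ΔΔCT of -2 corresponds to a four-fold higher expression in the sample than in the calibrator, because each PCR cycle is assumed to double the amount of product.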

Table 2.4: Lung eQTL study patients selected for qPCR validation

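| Characteristic | Value |
|---|---|
| Age (years) | 56.2±15.9 |
| % Male | 65.8 |
| CF | 10 |
| Non-CF | 68 |
| rs1078761* genotype AA | 37 (48%) |
| rs1078761* genotype AG | 25 (32%) |
| rs1078761* genotype GG | 15 (19%) |
| rs750064** genotype AA | 21 (30%) |
| rs750064** genotype AG | 23 (32%) |
| rs750064** genotype GG | 27 (38%) |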

Demographic and genotypic characteristics of samples selected from the lung eQTL study for validation of BPIFA1 and BPIFB1 gene expression levels in the lung. Age is expressed as the mean ± standard deviation. Genotype frequency is expressed as the number of individuals with the specified genotype. *1 individual failed genotyping for rs1078761 **7 individuals failed genotyping for rs750064

2.3.7 Western blot analysis of BPIFA1 and BPIFB1 protein levels in the lung and saliva

Ninety-three samples of lung tissue and 101 samples of saliva were selected for measurement of BPIFA1 and BPIFB1 protein levels. Protein was obtained by homogenizing 35 mg of peripheral lung tissue in 250 μl of lysis buffer containing protease and phosphatase inhibitors. Homogenized samples were loaded into QIAshredder columns (Qiagen) and centrifuged at 13,000 RPM for 1 minute. The flow-through was quantified for total protein levels using the DC Protein Assay (Bio-Rad, Mississauga, ON, Canada). 15 mg of lung protein and 15 μl of undiluted saliva were run on western blots, which were probed for BPIFA1 and BPIFB1 using a goat anti-human BPIFA1 antibody (R&D Systems, Minneapolis, MN, USA) and a mouse anti-human BPIFB1 antibody (Sigma). Donkey anti-goat (Life Technologies) and goat anti-mouse (Rockland Immunochemicals, Gilbertsville, PA, USA) fluorescently conjugated antibodies were used for detection. A reference sample was run on each gel to normalize for differences between gels. Densitometry was performed using ImageJ to measure the intensity of BPIFA1 and BPIFB1 signal.

2.3.8 Statistical analysis of BPIFA1 and BPIFB1 expression according to genotype

Multiple linear regression was performed using an additive model to test for the effect of genotype on mRNA and protein expression levels. Microarray data were adjusted for age, gender and centre, and qPCR gene expression data were adjusted for age and gender. Analyses of BPIFB1 protein levels in saliva were adjusted for age. All expression values were log10 transformed to approximate a normal distribution. Multiple linear regression was also used to test for a relationship between pre-bronchodilator FEV1 % predicted and gene expression levels in a subset of 739 individuals from the lung eQTL study who had no lung disease, other than smoking-related disease, that could influence lung function.
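
The additive model described above can be sketched as follows: each genotype is coded as the number of copies of the risk allele (0, 1 or 2), and log10 expression is regressed on that dosage together with the covariates. This is an illustrative sketch only. The helper names, the simulated data and the pure-Python least-squares solver are assumptions made for the example and do not represent the software used in this study; significance testing of the coefficients is omitted.

```python
import math

def risk_allele_dosage(genotype, risk_allele="G"):
    """Additive coding: number of copies of the risk allele (0, 1 or 2)."""
    return genotype.count(risk_allele)

def ols(X, y):
    """Least-squares coefficients from the normal equations (Gauss-Jordan elimination)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

def simulated_expression(genotype, age, male):
    """Hypothetical expression with a per-allele effect of -0.3 on the log10 scale."""
    return 10 ** (2.0 - 0.3 * risk_allele_dosage(genotype) + 0.01 * age + 0.05 * male)

# Hypothetical individuals: (genotype, age in years, male indicator)
individuals = [("AA", 30, 1), ("AG", 45, 0), ("GG", 60, 1),
               ("AG", 25, 1), ("AA", 50, 0), ("GG", 35, 0)]

# Design matrix: intercept, genotype dosage, age, sex
X = [[1.0, risk_allele_dosage(g), age, male] for g, age, male in individuals]
y = [math.log10(simulated_expression(g, age, male)) for g, age, male in individuals]

intercept, beta_genotype, beta_age, beta_sex = ols(X, y)
print(round(beta_genotype, 3))  # recovers the simulated per-allele effect
```

A negative genotype coefficient in such a model means that each additional copy of the risk allele is associated with lower log10 expression, which is the direction of effect reported for rs1078761 G.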

2.4 Results

2.4.1 Assessment of BPIFA1/BPIFB1 polymorphisms

Within the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1, 268 polymorphisms were either genotyped or imputed and tested for association with lung disease severity of CF patients. Fifty of these SNPs displayed p-values less than 0.05. The minimum p-value was observed at rs1078761 (p=2.71×10⁻⁴), which reaches regional significance after correction for the effective number of independent statistical tests (effective number of independent SNPs=93.58; corrected p-value=0.0254) (Figure 2.1, Appendix Table 1). Specifically, the minor allele of rs1078761 (G) was associated with reduced lung function. The rs1078761 polymorphism is located in exon 3 of the BPIFB1 gene and causes a change of amino acid 84 from isoleucine to valine; however, this change is not predicted to have a functional impact on the protein (SIFT score = 0.15; PolyPhen2 score = 0.016).
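
The corrected p-value reported above is consistent with a Bonferroni-type multiplication of the nominal p-value by the effective number of independent tests. The short sketch below reproduces that arithmetic under this assumption; the function name is hypothetical, and the derivation of the effective number of tests itself [242, 243] is not shown.

```python
def meff_corrected_p(p_value, m_eff):
    """Bonferroni-type correction using the effective number of independent tests."""
    return min(1.0, p_value * m_eff)

# Values reported in the text for rs1078761
corrected = meff_corrected_p(2.71e-4, 93.58)
print(f"{corrected:.4f}")  # 0.0254

# Equivalent regional significance threshold for a nominal alpha of 0.05
threshold = 0.05 / 93.58
print(f"{threshold:.2e}")
```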

Figure 2.1: Locus zoom plot of the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1

P-values on the -log10 scale for SNPs that were tested for association in GWAS 1 for CF lung disease severity. rs1078761 displays the smallest association p-value in the region (purple diamond). The extent of LD (r²) with rs1078761 for the remaining SNPs is indicated with colors.

2.4.2 Replication of rs1078761 association with CF lung disease severity

To test for replication of the rs1078761 association, data from an independent cohort of CF patients were assessed. The G allele of rs1078761 was significantly associated with increased CF lung disease severity in 2333 individuals with CF from the CGS, GMS and FrGMC (p=1.27×10⁻⁴). Furthermore, a meta-analysis of GWAS1 + GWAS2 including 4827 CF patients from CGS, GMS and FrGMC reached a p-value of 1.25×10⁻⁷. The rs1078761 polymorphism remained the most significant polymorphism in the region (Figure 2.2).

Figure 2.2: Locus zoom plot of the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1 in GWAS 1+ 2

P-values on the -log10 scale for SNPs that were tested for association in the meta-analysis of GWAS 1+2 for CF lung disease severity. rs1078761 retains the smallest association p-value in the region (purple diamond). The extent of LD (r²) with rs1078761 for the remaining SNPs is indicated with colors.

2.4.3 Identification of SNPs associated with BPIFA1 and BPIFB1 gene expression levels

In order to identify polymorphisms which may be important for regulation of the BPIFA1 and BPIFB1 genes, data from the lung eQTL study were interrogated. Briefly, this approach identifies SNPs associated with differences in gene expression using whole genome genotyping and gene expression data. A total of 69 SNPs located no more than 150 kb away from rs1078761 were associated with BPIFA1 levels at the ten percent false discovery rate (0.1 FDR) (Appendix Table 2). In contrast, no SNP was significantly associated with BPIFB1 levels at the 0.1 FDR threshold. The genotyped polymorphism providing the greatest evidence for association with BPIFA1 levels was rs750064 (p=1.31×10⁻²⁷). In addition, the G allele of rs1078761, which is associated with increased CF lung disease severity, was significantly associated with reduced BPIFA1 levels (p=4.08×10⁻¹⁵). The G allele of rs1078761 was nominally associated with reduced BPIFB1 expression (p=0.0314), while rs750064 was not associated with BPIFB1 levels (p=0.2403) (Table 2.5). BPIFA1 and BPIFB1 expression levels were significantly correlated (R²=0.59, p=2×10⁻²¹⁹), which suggests that the two genes are co-regulated. Due to its association with BPIFA1 gene expression, rs750064 was selected for further analysis along with rs1078761.

Table 2.5: Association between genotype and mRNA levels from the lung eQTL study

| SNP | BPIFA1 expression P-value | BPIFB1 expression P-value |
|---|---|---|
| rs1078761 | 4.08×10⁻¹⁵ | 0.0314 |
| rs750064 | 1.31×10⁻²⁷ | 0.2403 |

P-values from the lung eQTL study for the association between genotype and gene expression levels of BPIFA1 and BPIFB1 in the full cohort of 1111 individuals.

2.4.4 Potential functional effects of BPIFA1/BPIFB1 SNPs

We next determined whether rs1078761, rs750064, and the other SNPs nominally associated with lung disease severity in CF have potential functional effects using ENCODE data (Appendix Table 3). The rs1078761 polymorphism is located in a DNase I hypersensitive site in both urothelial and colorectal adenocarcinoma cells, and is predicted to alter the binding sites for the CTCF and HIF1 transcription factors, indicating that it may have a potential regulatory function. The rs750064 polymorphism is located in a DNase I hypersensitive site and is predicted to alter binding sites for the FoxJ2, AIRE and Evi1 transcription factors. However, in the four lung epithelial cell types included in the ENCODE project (small airway epithelial cells, primary tracheal epithelial cells, bronchial epithelial cells treated with retinoic acid, and adenocarcinomic alveolar epithelial (A549) cells), there were no DNase I hypersensitive sites identified for either rs1078761 or rs750064.

2.4.5 SNPs associated with BPIFA1 and BPIFB1 gene expression levels in CF samples

The primary focus of our study was to investigate if BPIFA1 and/or BPIFB1 contribute to CF lung disease severity. Since gene expression in CF lungs can differ considerably from non-CF lung tissue [251], the analysis of lung expression data as a function of rs1078761 and rs750064 genotype was repeated in the CF (n=51) and non-CF (n=1060) subsets. Within both the CF and non-CF sub-groups, both polymorphisms were significantly associated with gene expression levels of BPIFA1 (Table 2.6). However, rs1078761 was only associated with BPIFB1 levels in CF samples, while rs750064 was not associated with BPIFB1 in either group (Table 2.6). The effect size (β-coefficient of the linear regression models) of rs1078761 on BPIFA1 and BPIFB1 levels was markedly higher in the CF group compared to the non-CF group, and an interaction model indicated that the rs1078761 genotype significantly interacted with CF status (Table 2.6 and Figure 2.3), demonstrating that genotype had a greater effect on BPIFA1 and BPIFB1 levels in CF patients than in non-CF individuals. In addition, within the CF patients, rs1078761 had a greater effect on BPIFA1 and BPIFB1 than rs750064.

BPIFA1 gene expression levels were 1.28-fold higher in CF samples compared with non-CF samples (p=1.3×10⁻⁸). However, BPIFA1 levels were negatively associated with age in the total study group (p=1.3×10⁻⁷) and in the non-CF patients alone (p=0.0041). Since the CF patients were much younger than the non-CF patients (Table 2.2), it is not possible to conclude that the difference in BPIFA1 levels between CF and non-CF patients was due to disease status. To further explore this issue we performed a subset analysis of CF patients (n=40) and non-CF patients (n=41) who were ≤35 years of age. The age distribution did not differ between CF and non-CF patients in this subset (mean age=23.7 and 24.0 years, respectively; p=0.8971) but BPIFA1 levels were higher in the CF patients (p=0.0025). Although the sample size is small, this suggests that at least some of the difference in BPIFA1 levels between the two groups may be due to the presence of CF. The differences in BPIFB1 levels between CF and non-CF samples did not reach statistical significance.

Table 2.6: Association between genotype and mRNA levels in CF and non-CF patients from the lung eQTL study

| Gene | SNP | CF patients: P-value | CF patients: effect size (β) | Non-CF patients: P-value | Non-CF patients: effect size (β) | Interaction P-value¹ |
|---|---|---|---|---|---|---|
| BPIFA1 | rs1078761 | 8.02×10⁻⁴ | -2.65 | 1.07×10⁻¹³ | -0.95 | 0.0131 |
| BPIFA1 | rs750064 | 0.0157 | -1.50 | 1.58×10⁻²⁶ | -1.21 | 0.5574 |
| BPIFB1 | rs1078761 | 0.0033 | -2.21 | 0.1252 | -0.18 | 0.0014 |
| BPIFB1 | rs750064 | 0.1278 | -0.91 | 0.4139 | -0.09 | 0.1142 |

P-values and scaled estimates of effect sizes from the lung eQTL study for the association between genotype and gene expression levels of BPIFA1 and BPIFB1 in CF (n=51) and non-CF (n=1060) samples. Effect sizes refer to the effect of additional copies of the risk allele (allele associated with more severe disease), sign indicates the direction of effect. Negative effect sizes indicate that the risk allele is associated with decreased gene expression. 1P-value for interaction between genotype and disease state in multiple linear regression.

Figure 2.3: Interaction of genotype and CF status on BPIFA1 and BPIFB1 mRNA levels in lung tissue

Plots showing the results of multiple linear regression.

2.4.6 qPCR gene expression analysis of BPIFA1 and BPIFB1 in lung tissue

To validate the microarray gene expression data, a subset of samples was tested for BPIFA1 and BPIFB1 gene expression levels using qPCR. rs1078761 G and rs750064 G were both associated with significantly reduced mRNA expression levels of BPIFA1 but not BPIFB1 (Table 2.7, Figure 2.4a). In addition, mean BPIFA1 levels were 37.4-fold higher in CF tissue (n=10) compared with non-CF tissue (n=68) (p=7.2×10⁻⁵). BPIFB1 levels were 6.15-fold higher in CF compared with non-CF (p=0.0182); however, these groups were not matched for age (Figure 2.4b).

Table 2.7: Association between genotype and mRNA levels in the lung measured by qPCR

| SNP | BPIFA1: P-value¹ | BPIFA1: direction of effect² | BPIFB1: P-value | BPIFB1: direction of effect |
|---|---|---|---|---|
| rs1078761 | 0.0173 | - | 0.3218 | - |
| rs750064 | 0.0065 | - | 0.0909 | - |

P-values for the association between genotype and gene expression levels measured by qPCR. ¹P-value from multiple linear regression using an additive model to test for the effect of genotype on mRNA levels. ²Direction of effect indicates whether the risk allele (allele associated with more severe disease) is associated with increased (+) or decreased (-) gene expression level.

Figure 2.4: BPIFA1 and BPIFB1 mRNA levels in the lung

BPIFA1 and BPIFB1 gene expression levels in lung tissue measured by qPCR. a) Effect of rs1078761 genotype on BPIFA1 and BPIFB1 mRNA expression. P-values are from multiple linear regression using an additive model to test for the effect of genotype on mRNA levels. b) Comparison of BPIFA1 and BPIFB1 mRNA expression levels in CF compared to non-CF lung tissue.

2.4.7 Measurement of BPIFA1 and BPIFB1 protein levels in lung tissue

To test whether the effect of genotype on expression levels of BPIFA1 and BPIFB1 extends to the protein level, BPIFA1 and BPIFB1 protein levels were measured in a subset of 93 lung samples. Neither rs1078761 nor rs750064 was associated with BPIFB1 protein levels (p=0.1619 and 0.1929, respectively, Figure 2.5). Only 13 out of the 93 lung tissue samples contained detectable levels of BPIFA1 protein; therefore, BPIFA1 levels were not analyzed according to genotype. Although the groups were not age-matched, BPIFA1 protein level was 3.43-fold higher in CF compared with non-CF samples (p=0.0026), but there was no difference in BPIFB1 protein levels between the two groups (p=0.6106).

Figure 2.5: BPIFB1 protein levels in lung tissue

BPIFB1 protein levels in lung tissue measured by western blotting according to rs1078761 and rs750064 genotype

2.4.8 Measurement of BPIFA1 and BPIFB1 protein levels in the saliva

Since saliva is known to contain BPIFA1 protein at high levels [252, 253], healthy volunteers were recruited to donate saliva samples for measurement of BPIFA1. The G allele of rs1078761 was significantly associated with reduced BPIFA1 levels in the saliva (p=0.0161); however, rs750064 was not associated (p=0.1483). Although BPIFB1 was detected at high levels in the saliva, levels were not associated with rs1078761 or rs750064 genotypes (p=0.4277 and 0.8650, respectively) (Table 2.8, Figure 2.6).

Table 2.8: Association between genotype and protein levels in saliva

| SNP | BPIFA1: P-value¹ | BPIFA1: direction of effect² | BPIFB1: P-value | BPIFB1: direction of effect |
|---|---|---|---|---|
| rs1078761 | 0.0161 | - | 0.4277 | No effect |
| rs750064 | 0.1483 | No effect | 0.8650 | No effect |

P-values for the association between genotype and protein expression levels of BPIFA1 and BPIFB1 in saliva and lung tissue. ¹P-value from multiple linear regression using an additive model to test for the effect of genotype on protein levels. ²Direction of effect indicates whether the risk allele (allele associated with more severe disease) is associated with increased (+) or decreased (-) protein expression levels.

Figure 2.6: BPIFA1 and BPIFB1 protein expression levels in saliva

BPIFA1 and BPIFB1 protein expression levels in saliva samples from healthy volunteers according to rs1078761 and rs750064 genotype

2.4.9 Relationship between BPIFA1 and BPIFB1 levels and lung function

To test whether BPIFA1 and BPIFB1 levels were associated with lung function, we analyzed data from the lung eQTL study for a relationship between gene expression levels and lung function in a subset of 739 individuals who did not have diseases, other than smoking-related disease, that could influence lung function. We found that increased BPIFA1 levels were significantly associated with decreased lung function measured by pre-bronchodilator FEV1 % predicted (R²=0.0203, p=1.00×10⁻⁴), although BPIFB1 levels were not associated (p=0.6909).

2.5 Discussion

Using a candidate gene approach, we interrogated BPIFA1/BPIFB1 SNPs in data from the GWAS for lung disease severity in CF patients (GWAS1) and found that the G allele of rs1078761 was associated with more severe lung disease. This association was replicated in an independent cohort of CF patients (GWAS2). Furthermore, we found that the G allele of rs1078761 was significantly associated with reduced mRNA and protein expression levels of both BPIFA1 and BPIFB1. Interestingly, the association with BPIFB1 mRNA levels was unique to individuals with CF. Overall, the results of our study suggest that reduced levels of BPIFA1 and BPIFB1 proteins in the lung contribute to greater lung disease severity in CF patients.

The G allele of rs1078761 had a greater effect on both BPIFA1 and BPIFB1 mRNA levels in the CF compared with the non-CF subgroup. Gene expression is known to differ in CF airways compared with non-CF airways, partly due to the heightened inflammatory response and elevated levels of immune cells [251]. rs1078761 may disrupt binding sites of transcription factors and other regulatory proteins that are expressed more highly in the context of CF. In particular, ENCODE data indicated that rs1078761 is predicted to alter the binding sites for the CTCF and HIF1 transcription factors. Therefore, genotype may have a greater effect on BPIFA1 and BPIFB1 expression in CF patients due to the differential expression of regulatory proteins.

The G allele of rs1078761 was also associated with reduced BPIFA1 protein levels in the saliva. This indicates that the effect of rs1078761 on BPIFA1 expression extends to the salivary glands. BPIFA1 is known to be expressed in the tonsil, tongue and salivary glands [205], and has previously been shown to be detectable in saliva with a large variability between individuals [252]. Here, we have shown that genotype contributes to this variability. While BPIFB1 was detectable at high levels in the saliva, there was no relationship with genotype. This suggests that the effect of genotype on BPIFB1 is restricted to the lungs.

While the G allele of rs750064 was associated with reduced BPIFA1 mRNA levels in the lung of both the CF and non-CF subgroups, in CF patients the effect size was markedly lower than that of rs1078761. This suggests that although rs750064 may play an important role in regulation of BPIFA1 expression in non-CF individuals, alternative mechanisms involving rs1078761 may be involved in regulating BPIFA1 expression in CF patients. These data suggest that rs750064 is less likely to be responsible for the association with lung disease severity in CF than rs1078761.

Previous studies have shown that BPIFA1 gene expression levels are significantly higher in CF compared to non-CF individuals [204] and this suggests that BPIFA1 may play a critical role in the disease. Our data were consistent with these observations but not conclusive due to the potential confounder of age. Previous studies have shown that BPIFB1 protein is upregulated in CF [241]. In our experiments BPIFB1 mRNA levels were 6.15 fold higher in CF patients compared with non-CF individuals, but this result may also have been confounded by age.

BPIFA1 and BPIFB1 may be upregulated in CF as a part of the augmented innate immune response which occurs in CF patients as a result of chronic infection.

We found that rs1078761 was associated with expression levels of both BPIFA1 and BPIFB1. In addition, the effect size of the association between genotype and gene expression levels within CF patients was similar for both genes. Therefore, it is not possible to determine which of the two genes is responsible for the association with lung function in CF, or if both genes are involved. Recent studies have identified a role for BPIFA1 in CF through the regulation of ENaC [234, 254, 255]. While BPIFB1 is known to be upregulated in CF, there have been no published studies to date investigating the function of BPIFB1 in CF. Additional investigation is required to elucidate the role of BPIFB1 and determine which of the two genes is responsible for the association seen in our study.

Results from the eQTL study also revealed that increased BPIFA1 levels were significantly associated with decreased lung function measured by pre-bronchodilator FEV1 % predicted in a group of 739 patients with smoking-related lung disease (primarily lung cancer and chronic obstructive pulmonary disease). Previous studies [204, 219] suggest that there is increased BPIFA1 expression in CF patients which may be a response to the chronic lung infection typically seen in these patients. These observations suggest that increased BPIFA1 expression is associated with lower lung function. Nevertheless, the G allele of rs1078761 was associated with reduced BPIFA1 and BPIFB1 levels and more severe lung disease. We speculate that the G allele of rs1078761 may attenuate the protective innate immune response, thus leading to increased microbial burden and more severe lung disease in CF.

Polymorphisms in BPIFB1 and BPIFA1 have previously been shown to be associated with disease. The rs1078761 polymorphism was associated with lung function in a large meta-analysis of lung function data from ~48,000 individuals of European ancestry [256] (publicly available at www.gwascentral.org) [257]. rs1078761 was associated with both FEV1 (p=0.0382) and FEV1/FVC ratio (p=0.0043). rs750064 and another BPIFA1 promoter polymorphism, rs2752903, were found to be associated with susceptibility to nasopharyngeal carcinoma in Cantonese-speaking Chinese individuals [258]. However, only the rs2752903 polymorphism replicated in a Malaysian Chinese population, and the intronic rs1407019 polymorphism in BPIFA1 was identified to be the most likely functional polymorphism responsible for the association [259]. Neither rs2752903 nor rs1407019 is in LD with rs1078761 (r²<0.2), indicating these are not the functional polymorphisms responsible for the association with CF severity. The rs11906665 polymorphism in the BPIFB1 promoter has been associated with cholera in a Bangladeshi population [238]. However, this SNP is not polymorphic in the European population and therefore cannot be responsible for the association with CF severity.
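
The r² threshold used above is the standard pairwise measure of linkage disequilibrium. A minimal sketch of the calculation follows, assuming phased haplotype frequencies are available. The frequencies in the example are invented for illustration and are not those of the SNPs discussed; the function name is hypothetical.

```python
# Sketch: pairwise linkage disequilibrium (r^2) from haplotype frequencies.
# Example frequencies are made up; they are not taken from any study.

def ld_r_squared(freq_ab, freq_a, freq_b):
    """r^2 = D^2 / (pA (1 - pA) pB (1 - pB)), where D = pAB - pA * pB.

    freq_ab: frequency of the haplotype carrying allele A and allele B
    freq_a:  frequency of allele A at the first SNP
    freq_b:  frequency of allele B at the second SNP
    """
    if not (0.0 < freq_a < 1.0 and 0.0 < freq_b < 1.0):
        # r^2 is undefined when either SNP is monomorphic in the population.
        raise ValueError("both SNPs must be polymorphic")
    d = freq_ab - freq_a * freq_b
    return d * d / (freq_a * (1.0 - freq_a) * freq_b * (1.0 - freq_b))

r2 = ld_r_squared(freq_ab=0.50, freq_a=0.60, freq_b=0.70)
in_low_ld = r2 < 0.2   # same threshold as quoted in the text
```

The guard against monomorphic SNPs mirrors the argument made for rs11906665: a variant that does not vary in a population cannot be correlated with another variant in that population.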

A possible weakness of this study is that expression data in the lung eQTL study were generated from peripheral lung tissue samples, while the pathophysiology of CF lung disease is predominantly in the conducting airways. However, the peripheral lung tissue samples would be expected to contain a percentage of epithelial cells from small airways. Since BPIFB1 and BPIFA1 are specifically expressed by airway epithelial cells, the gene expression signal measured in this study likely originates from airways that were included in the lung samples.

While rs1078761 is a coding polymorphism located in exon 3 of the BPIFB1 gene, it encodes a conservative change which is not predicted to have a functional impact on the protein. Therefore, we investigated regulatory functions for this polymorphism and confirmed that rs1078761 genotype is associated with BPIFA1 and BPIFB1 gene expression levels. However, there is a possibility that rs1078761 may also have an effect on BPIFB1 protein function which may have contributed to the clinical association with lung disease severity in CF. Additional experiments are required to further investigate the effect of the G allele of rs1078761 on BPIFB1 function.

BPIFB1 is hypothesized to function in innate immunity [238, 239], and there is evidence supporting a role for BPIFA1 in defense against bacterial infection [221, 222], inhibiting ENaC activity [233] and reducing biofilm formation [225]. Consistent with these functions, our findings indicate that polymorphisms associated with more severe CF are also associated with decreased BPIFA1 levels. Therefore, CF patients may require higher levels of BPIFA1 than non-CF individuals to protect their airways against infection. If BPIFA1 and/or BPIFB1 can be shown to be protective against bacterial infection in the lung, supplementation with aerosolized forms of BPIFA1/BPIFB1 may be a novel therapy for CF patients to protect against infection.

CHAPTER 3: BPIFA1 AS AN ANTI-INFLAMMATORY TARGET IN CYSTIC FIBROSIS

3.1 Rationale

We have established that variants in the BPIFA1/BPIFB1 region which are associated with decreased gene expression are associated with increased CF lung disease severity. This suggests that decreased BPIFA1 and/or BPIFB1 expression is detrimental to lung function in CF. Substantial evidence has emerged in recent years implicating BPIFA1 as an innate immune molecule, with anti-microbial, surfactant, immunomodulatory and ENaC inhibitory properties. While BPIFB1 has been less characterized, several studies have identified potential functions in innate immunity. However, there has been little characterization of the role of BPIFA1/BPIFB1 in CF and it is not yet clear why increased expression of these genes could contribute to better lung function in CF. The purpose of this study was to elucidate the function of BPIFA1/BPIFB1 in CF.

3.2 Background

As described in section 1.5, BPIFA1 has been demonstrated to have antimicrobial properties, and has been shown by several groups to inhibit the growth of a number of bacterial species in vitro including P. aeruginosa, M. pneumoniae and K. pneumoniae, as well as to bind LPS [221-226]. However, other researchers have not found antimicrobial activity of BPIFA1 [221, 260]. Transgenic mice that overexpress human BPIFA1 have enhanced bacterial clearance of P. aeruginosa, together with reduced inflammatory cytokine production [227]. Conversely, BPIFA1 knockout mice have impaired bacterial clearance and increased levels of inflammatory cells [221, 224].

BPIFA1 has also been shown to have an immunomodulatory function in mouse models of airway inflammation. Mice that are deficient in BPIFA1 have increased levels of inflammation including higher levels of eosinophils, mucus production, and airway hyperreactivity, as well as increased production of the TH2 cytokines IL-3, IL-5, and IL-13 [229, 230]. In a different model system, BPIFA1 had pro-inflammatory properties, with BPIFA1 overexpressing mice producing elevated levels of TNF-α and IL-6 in response to stimulation with potently inflammatory carbon nanotubes [231]. Furthermore, BPIFA1 functions as a chemoattractant in vitro: administration of recombinant BPIFA1 protein results in increased neutrophil migration [226]. Therefore, BPIFA1 may function in modulation of the immune response through a variety of mechanisms.

While BPIFB1 has also been proposed to function as an immune molecule, there are currently little data supporting this in the literature. Some evidence that BPIFB1 plays a role in innate immunity comes from a genetic association with clinical outcomes in cholera [238], which is supported by data indicating that BPIFB1 modifies the innate immune response to Vibrio cholerae [239].

BPIFA1 has been shown to have several functions in the innate immune response, including antimicrobial and immunomodulatory activity. There is also some evidence that BPIFA1 has surfactant properties [225, 228], and that it may inhibit ENaC [233-235]. Since the data supporting the antimicrobial and immunomodulatory functions for BPIFA1 are the most compelling, and little is currently known about the specific function of BPIFB1, we chose to further investigate the role of BPIFA1 and BPIFB1 in bacterial growth inhibition and the inflammatory response in the context of CF.

Hypothesis: BPIFA1 and BPIFB1 have antimicrobial and immunomodulatory activity in CF.

Aim: To characterize the function of BPIFA1 and BPIFB1 in CF.

3.3 Methods

3.3.1 Immunohistochemistry

Lung tissue samples were obtained from the James Hogg Lung Tissue Registry at the Centre for Heart Lung Innovation. Paraffin embedded lung tissue sections were deparaffinized and rehydrated with xylene and ethanol washes. Antigen retrieval was performed by placing slides in Target Retrieval Solution (Dako) and autoclaving for 25 minutes. Slides were washed in TBS then placed in 3% hydrogen peroxide for 10 minutes to quench peroxidase activity. For BPIFA1 and BPIFB1 staining, slides were blocked in 5% horse serum for 1 hour. The slides were then incubated in mouse anti-hPLUNC (BPIFA1) antibody (R&D) overnight. The following day, slides were washed and incubated with horse anti-mouse biotin tagged secondary antibody for 1 hour (Vector). The slides were then incubated in Streptavidin Horseradish-Peroxidase (Dako) for 10 minutes, followed by incubation in DAB substrate for an additional 10 minutes. Slides were counterstained in Harris’s hematoxylin followed by dehydration in ethanol washes. Cover slips were sealed to slides using Cytoseal Mounting Medium (VWR).

3.3.2 Recruitment of CF patients

CF patients were recruited from the adult CF clinic at the Pacific Lung Health Centre. Subjects were recruited if they had a confirmed diagnosis of CF based on sweat chloride testing and/or genotyping. Saliva samples were collected from 30 CF patients during a routine stable clinic visit. Saliva samples were collected by asking patients to rinse their mouths with bottled water and to spit into a sterile collection container 5 times over 5 minutes. Saliva samples were processed within one hour of collection by adding Sputolysin reagent (Calbiochem) to a ratio of 4 mL Sputolysin per 1 gram of saliva. Samples were then incubated in a 37 °C water bath for 20 minutes with shaking by inversion every 5 minutes. Samples were centrifuged at 500 relative centrifugal force (RCF) for 10 minutes at 4 °C, and the supernatant was again centrifuged at 4000 RCF for 20 minutes at 4 °C. Samples were aliquoted and frozen at -80 °C until analysis. Pellets from saliva centrifugation were stored for DNA extraction.

3.3.3 Western blot analysis of BPIFA1 and BPIFB1 protein levels in saliva

Total protein was quantified in saliva samples using the Coomassie Plus (Bradford) assay (Pierce). 11.2 μg of total protein was run on 12% gels which were probed for BPIFA1 and BPIFB1 using a goat anti-human BPIFA1 antibody (R&D Systems, Minneapolis, MN, USA) and a mouse anti-human BPIFB1 antibody (Sigma). Donkey anti-goat (Life Technologies) and goat anti-mouse (Rockland Immunochemicals, Gilbertsville, PA, USA) fluorescently conjugated antibodies were used for detection. A reference sample was run on each gel to normalize for differences between gels. Densitometry was performed using ImageJ to measure the intensity of the BPIFA1 and BPIFB1 signals.
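
The text does not spell out the arithmetic used to normalize between gels. One common approach, shown here only as an assumed illustration, is to express each band as a ratio to the reference sample on the same gel. The sample labels, intensities, and function name are hypothetical.

```python
# Sketch: express each band's densitometry signal relative to the reference
# sample run on the same gel. Values and names are made up for illustration.

def normalize_to_reference(band_intensities, reference_intensity):
    """Divide every band intensity on a gel by that gel's reference band."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return {sample: value / reference_intensity
            for sample, value in band_intensities.items()}

# Two gels with different overall signal but a shared reference sample.
gel_1 = normalize_to_reference({"S01": 1200.0, "S02": 600.0},
                               reference_intensity=800.0)
gel_2 = normalize_to_reference({"S03": 450.0},
                               reference_intensity=300.0)
```

After this step, samples S01 and S03 carry the same relative value even though their raw intensities differ, which is the purpose of including a reference sample on every gel.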

3.3.4 DNA extraction and genotyping of saliva samples

DNA was extracted from saliva pellets using the QIAamp DNA Mini Kit (Qiagen). 5 ng of DNA was genotyped using TaqMan assays (Life Technologies) for rs1078761 and rs750064. Genotyping was performed on the Applied Biosystems ViiA7 Real-Time PCR System (Life Technologies). DNA samples from CEPH individuals of known genotype were used as positive controls (Coriell Institute).

3.3.5 Bacterial growth assays

Bacterial growth assays were performed using the PAO1 strain of P. aeruginosa, which is a common lab strain obtained from a wound isolate. Frozen bacterial cultures were streaked out onto LB agar plates and incubated overnight at 37 °C. Individual colonies were selected and grown overnight in 3 mL of LB broth in a 37 °C incubator with shaking. The next morning, optical densities of the overnight cultures were measured, and cultures were diluted to an optical density of 0.05. Cultures were plated onto honeycomb plates (Fisher) and treated with recombinant BPIFA1 from yeast (Abnova) or E. coli (Novus), or PBS as a negative control. Each condition was performed in triplicate, and each experiment was performed at least 3 times. Plates were incubated in the Bioscreen C (Oy Growth Curves Ab Ltd), an automated microbiology growth curve analysis system which was used to incubate cultures at 37 °C with shaking, with measurements of optical density performed every 15 minutes. After 24 hours of growth, cultures were serially diluted and two to three dilutions were plated onto agar plates and incubated at 37 °C overnight. Colonies were counted in the morning to determine colony forming units (CFU) for each treatment condition.
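
Colony counts from serial dilutions are converted to CFU per mL by scaling for the dilution and the volume plated. The sketch below shows this standard calculation. The plated volume (0.1 mL), the dilution factors, and the colony counts are assumptions chosen for illustration, since they are not stated above; the function names are hypothetical.

```python
# Sketch: convert colony counts from serially diluted cultures to CFU/mL.
# Plated volume, dilutions, and counts are illustrative assumptions.

def cfu_per_ml(colony_count, dilution_factor, plated_volume_ml):
    """CFU/mL = colonies x dilution factor / volume plated (mL)."""
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colony_count * dilution_factor / plated_volume_ml

def mean_cfu_per_ml(plate_counts, plated_volume_ml):
    """Average the estimates from several countable dilutions.

    plate_counts maps dilution factor -> colonies counted on that plate.
    """
    estimates = [cfu_per_ml(count, dilution, plated_volume_ml)
                 for dilution, count in plate_counts.items()]
    return sum(estimates) / len(estimates)

single_plate = cfu_per_ml(45, 10**6, plated_volume_ml=0.1)
two_dilutions = mean_cfu_per_ml({10**5: 430, 10**6: 45}, plated_volume_ml=0.1)
```

Plating two to three dilutions, as described above, makes it likely that at least one plate has a countable number of colonies; averaging across countable plates is one simple way to combine them.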

3.3.6 Cell culture

IB3-1, CuFi-1 and CFBE41o- CF airway epithelial cell lines were obtained from the American Type Culture Collection. IB3-1 cells were derived from a patient who was a compound heterozygote for the p.Phe508del and W1282X mutations, and CuFi-1 and CFBE41o- cells were derived from p.Phe508del homozygous patients. Cells were cultured as recommended by their respective suppliers using standard protocols. IB3-1 cells were grown in basal LHC-8 medium (Invitrogen) supplemented with 10% fetal bovine serum (FBS), 2 mM L-glutamine, and 1 mM sodium pyruvate. CuFi-1 cells were cultured in BEBM serum-free medium (Lonza) with supplement bullet kit (EGF, hydrocortisone, bovine pituitary extract, transferrin, bovine insulin, triiodothyronine, epinephrine, retinoic acid), 2 mM L-glutamine, and 1 mM sodium pyruvate. CFBE41o- cells were grown in minimum essential medium with Earle’s salt (Sigma) supplemented with 10% FBS and Glutamax (ThermoFisher). Prior to stimulation, airway epithelial cell lines were plated in coated 24-well plates (BD Biosciences) at 1.7×10⁵ cells/well, and allowed to adhere overnight. Plates for IB3-1 and CFBE41o- stimulations were coated in a mixture of bovine serum albumin (10 mg/cm²), fibronectin (1 mg/cm²), and bovine collagen type I (3.3 mg/cm²) (BD Biosciences). Plates for CuFi-1 were coated with collagen type IV (6 mg/cm²) (Sigma Aldrich).

3.3.7 Cell stimulation and cytokine quantification

Airway epithelial cells were pretreated for 30 minutes with recombinant BPIFA1 (Abnova), recombinant BPIFB1 (Abnova), or fresh media as a negative control. Heat killed P. aeruginosa (using the PAO1 strain) were generated by incubating bacterial cultures at 95 °C for 30 min. Pretreated and untreated airway epithelial cells were stimulated with heat killed P. aeruginosa at a multiplicity of infection (MOI) of 50 (unless otherwise specified). Cell lysates were collected at 8 hours after stimulation for extraction of RNA, and supernatants were collected after 24 hours for measurement of cytokine levels (unless otherwise specified). IL-6 and IL-8 levels were quantified in cell supernatants using sandwich ELISA (eBioscience), according to the manufacturer’s instructions.

3.3.8 BPIFA1 overexpression assays

IB3-1 cells were seeded in 12-well plates and allowed to achieve ~80% density over 1 to 3 days. Cells were transfected with 500 ng total plasmid of either BPIFA1 (Origene) or empty pCMV6-entry vector (Origene) using Lipofectamine 2000 according to the manufacturer’s instructions. After transfection overnight, the medium was replaced with 350 μl of fresh LHC8 medium and the cells were allowed to rest overnight for 24 hours. The next day, P. aeruginosa (using the PAO1 strain) was grown from an overnight culture to log phase (~2 hours) and used to infect cultures at a MOI of 10 (confluent 12-well plate of IB3-1 ~180,000 cells). After 5 hours of incubation, a portion of the supernatant was obtained for CFU counts, and the rest was clarified by centrifugation for ELISA and western blotting. Cells were also collected to make lysates. For CFU, supernatants were serially diluted 10-fold, plated on LB plates, and allowed to incubate overnight at 37 °C prior to counting the next day. To confirm that the transfection had been successful, supernatants were blotted for the presence of BPIFA1.
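
As a worked example of the MOI arithmetic, the sketch below computes the number of bacteria needed for the stated MOI of 10 and ~180,000 cells per well. The culture density used to convert this into a volume (1×10⁸ CFU/mL) is a hypothetical value, not one reported in this study, and the function names are illustrative.

```python
# Sketch: bacteria required for a target multiplicity of infection (MOI).
# The culture density below is an assumed value for illustration only.

def bacteria_for_moi(cell_count, moi):
    """MOI is the ratio of bacteria to host cells, so bacteria = cells x MOI."""
    return cell_count * moi

def inoculum_volume_ml(cell_count, moi, culture_cfu_per_ml):
    """Volume of culture that contains the required number of bacteria."""
    if culture_cfu_per_ml <= 0:
        raise ValueError("culture density must be positive")
    return bacteria_for_moi(cell_count, moi) / culture_cfu_per_ml

bacteria_needed = bacteria_for_moi(cell_count=180_000, moi=10)
volume_ml = inoculum_volume_ml(180_000, 10, culture_cfu_per_ml=1.0e8)
```

With these inputs, 1.8 million bacteria are required per well. In practice the culture density would be estimated from optical density or plate counts before the inoculum volume is calculated.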

3.3.9 RNA extraction and RNA sequencing

RNA was extracted from cell lysates using the RNeasy Mini kit (Qiagen). RNA concentration, integrity and purity were assessed using the RNA Nano Kit with the Agilent 2100 Bioanalyzer (Agilent Technologies). mRNA was reverse transcribed, amplified and sequenced using the Ion Torrent library kits and the Ion Proton next generation sequencing system (ThermoFisher) at the UBC sequencing core at the Djavad Mowafaghian Centre for Brain Health. All samples were sequenced to a depth of at least 15 million reads. Sequencing quality was assessed using FastQC [261], and adapter sequences were trimmed from reads using cutadapt [262]. Reads were mapped to the human genome (hg19) with a two-step alignment protocol using TopHat2 followed by Bowtie2 [263, 264]. Sorting and indexing of BAM and SAM files was performed using SAMtools [265]. HTSeq-count was used to generate read count tables. Differential gene expression analysis was performed using the Bioconductor packages DESeq2 [266] and Limma [267]. Due to the absence of biological replicates, p-values were not calculated, and a fold change threshold greater than or equal to +/- 1.5 was used to identify differentially expressed genes. Downstream analysis was performed using Sigora or InnateDB [268] for pathway analysis, and NetworkAnalyst to generate protein:protein interaction networks for differentially expressed genes (http://www.networkanalyst.ca/NetworkAnalyst/) [269].
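
One way to read a threshold of "+/- 1.5" is as a signed fold change: ratios of at least 1.5 count as upregulated, and ratios of at most 1/1.5 are reported as -1.5 or lower and count as downregulated. The sketch below applies that reading to a pair of normalized counts per gene. It is not the DESeq2 or Limma calculation; the pseudocount, gene labels, counts, and function names are all assumptions made for illustration.

```python
# Sketch: flag genes by a signed fold-change threshold when no replicates
# are available. Not the DESeq2/Limma procedure; inputs are illustrative.

def signed_fold_change(treated, control, pseudocount=1.0):
    """Return ratio if >= 1, otherwise the negative reciprocal of the ratio.

    The pseudocount (an assumption here) avoids division by zero.
    """
    ratio = (treated + pseudocount) / (control + pseudocount)
    return ratio if ratio >= 1.0 else -1.0 / ratio

def differentially_expressed(counts, threshold=1.5):
    """Keep genes whose absolute signed fold change meets the threshold."""
    hits = {}
    for gene, (treated, control) in counts.items():
        fold_change = signed_fold_change(treated, control)
        if abs(fold_change) >= threshold:
            hits[gene] = fold_change
    return hits

# Hypothetical normalized counts: (treated, control) per gene.
counts = {
    "GENE_A": (299.0, 99.0),   # ratio 3.0, upregulated
    "GENE_B": (99.0, 199.0),   # ratio 0.5, reported as -2.0
    "GENE_C": (109.0, 99.0),   # ratio 1.1, below threshold
}
hits = differentially_expressed(counts)
```

Because no P-values are available without biological replicates, a threshold of this kind ranks genes by effect size only and cannot account for variability between samples.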

3.4 Results

3.4.1 BPIFA1 expression profile

Immunohistochemistry was performed on paraffin embedded lung tissue from healthy lung in order to characterize the expression profile of BPIFA1 in the airways. We detected BPIFA1 expression at high levels in the larger airways, and identified BPIFA1 staining in a proportion of airway epithelial cells as well as coating the surface of the airway (Figure 3.1). In addition, we found high levels of BPIFA1 protein localization in the submucosal glands and ducts. We did not find any BPIFA1 expression in the parenchymal tissue of the lung. BPIFA1 expression was absent in the smaller airways of healthy lung (Figure 3.2a). In contrast, BPIFA1 protein localization was detectable in the epithelial cells of the smaller airways of CF lung as well as in the lumen of the airways, where it was observed in mucus plugs along with the presence of high numbers of inflammatory cells (Figure 3.2b). The elevated levels of BPIFA1 found in CF could indicate that BPIFA1 contributes to CF disease pathophysiology or is important for host defense.

Figure 3.1: BPIFA1 expression in airway sections from healthy individuals

Immunohistochemistry showing BPIFA1 expression in the large airways of healthy individuals specifically in the a) airway epithelium b) submucosal ducts and coating the surface of the airway epithelium c) submucosal glands

Figure 3.2: BPIFA1 expression in small airway sections from a healthy individual and a CF patient

Immunohistochemistry showing BPIFA1 expression in the small airways of a) a healthy individual b) a CF patient

3.4.2 Relationship between rs1078761 genotype and levels of BPIFA1 and BPIFB1 in CF saliva

We previously established (see section 2.4.8) that the rs1078761 polymorphism is associated with differences in BPIFA1 expression levels in the saliva of healthy individuals. Furthermore, while we showed that rs1078761 was associated with BPIFA1 and BPIFB1 gene expression levels in CF, it was not confirmed whether genotype is associated with BPIFA1/BPIFB1 protein expression in CF patients. Therefore, we collected saliva samples from CF patients during their stable clinic visit to the Pacific Lung Health Centre, measured BPIFA1 and BPIFB1 levels by western blotting, and genotyped for rs1078761. Patients who were homozygous for the G allele had significantly reduced BPIFA1 levels compared to patients who were homozygous for the A allele (Kruskal-Wallis was used as a non-parametric alternative to ANOVA with p=0.083; Mann-Whitney was used as a non-parametric alternative to Student’s t-test with p=0.0121 for comparison of AA to GG) (Figure 3.3a). Furthermore, patients who were homozygous for the G allele had significantly reduced BPIFA1 compared to patients who were heterozygous or homozygous for the A allele combined (Mann-Whitney p=0.0121). BPIFB1 levels were not different between genotypes (Figure 3.3b), consistent with our previous findings in healthy individuals.
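
The Mann-Whitney comparison of two genotype groups can be sketched as follows. This is an exact permutation version suitable only for small groups; it is not known which software or which variant of the test (exact or asymptotic) produced the P-values above. The protein levels in the example are synthetic and the function names are hypothetical.

```python
# Sketch: Mann-Whitney U statistic with an exact two-sided permutation
# P-value. Synthetic data; only practical for small group sizes.
from itertools import combinations

def mann_whitney_u(x, y):
    """Count pairs (x_i, y_j) with x_i > y_j; ties contribute 0.5."""
    return sum(1.0 if xi > yj else 0.5 if xi == yj else 0.0
               for xi in x for yj in y)

def exact_two_sided_p(x, y):
    """Share of all group relabelings with U at least as far from its mean."""
    pooled = list(x) + list(y)
    n_x = len(x)
    mean_u = n_x * len(y) / 2.0
    observed = abs(mann_whitney_u(x, y) - mean_u)
    extreme = 0
    total = 0
    for indices in combinations(range(len(pooled)), n_x):
        chosen = set(indices)
        group_x = [pooled[i] for i in chosen]
        group_y = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(group_x, group_y) - mean_u) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Synthetic relative protein levels for two genotype groups.
aa_levels = [1.9, 2.4, 2.8, 3.1]
gg_levels = [0.4, 0.7, 1.1, 1.5]

u_stat = mann_whitney_u(aa_levels, gg_levels)
p_value = exact_two_sided_p(aa_levels, gg_levels)
```

In this synthetic example every AA value exceeds every GG value, so U takes its maximum of 16 and only 2 of the 70 possible relabelings are as extreme, giving p = 2/70.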

Figure 3.3: BPIFA1 and BPIFB1 levels in saliva from stable CF patients by rs1078761 genotype

CF saliva samples from stable CF patients with genotyping of rs1078761 and quantification by western blot of a) BPIFA1 and b) BPIFB1. Kruskal-Wallis was used to test for differences in BPIFA1 and BPIFB1 levels between all three genotypes (left panel). In addition, Mann-Whitney was used to test for differences between the AA and GG genotypes. Due to the small sample size, the AA and AG genotypes were pooled and compared to the GG genotype, and the Mann-Whitney test was used to test for differences in BPIFA1 and BPIFB1 levels (right panel).

3.4.3 Investigation of bacterial growth inhibition by BPIFA1

In order to test whether BPIFA1 can inhibit growth of P. aeruginosa, we incubated bacterial cultures with recombinant BPIFA1 and measured optical density every 15 minutes for 24 hours to establish growth curves. We found that there was no decrease in optical density with the addition of BPIFA1 or BPIFB1 (Figure 3.4). The lack of effect on optical density indicates that BPIFA1 and BPIFB1 do not have bactericidal properties (ability to kill bacteria); however, it is possible that these molecules may be bacteriostatic (keep bacteria in stationary phase of growth). A bacteriostatic agent may not result in decreased optical density as the bacteria have not been lysed even though growth has been inhibited. Therefore, we measured CFU at 4 and 24 hours of P. aeruginosa growth to determine if the addition of BPIFA1 and/or BPIFB1 resulted in fewer live bacteria that were able to form colonies, and we did not find any change in CFU with the addition of BPIFA1 or BPIFB1 (p=0.427 and 0.957 respectively, at 24 hours of bacterial growth) (Figure 3.5a). A possible reason why some groups have found BPIFA1 to have antimicrobial properties while others have not may be the recombinant BPIFA1 protein being used. Many groups have utilized recombinant BPIFA1 produced in E. coli. BPIFA1 is a highly glycosylated protein, and since E. coli do not normally have the capacity for glycosylation [270], BPIFA1 produced in E. coli would differ from that produced in a eukaryotic system such as the BPIFA1 we used, which was derived from yeast. Therefore, we performed additional experiments utilizing an E. coli derived BPIFA1 protein to test the effect on P. aeruginosa optical density and CFU. Again, there was no difference in optical density growth curves or CFU with the addition of BPIFA1 produced in E. coli (p=0.275) (Figure 3.5b).

Figure 3.4: P. aeruginosa growth curves with the addition of BPIFA1 or BPIFB1

Growth curves of the PAO1 strain of P. aeruginosa treated with a) 1, 5 and 10 μg of recombinant BPIFA1 protein b) 1 and 10 μg/mL of recombinant BPIFB1 protein

Figure 3.5: Quantification of colony forming units in P. aeruginosa treated with recombinant BPIFA1 and BPIFB1 protein

Quantification of colony forming units in P. aeruginosa treated with recombinant a) BPIFA1 produced in yeast b) BPIFA1 produced in E. coli c) BPIFB1 produced in yeast. The Kruskal-Wallis test was used to determine statistical differences between conditions and the corresponding p-values are shown. Colony forming units were measured at 4 hours and 24 hours of bacterial culture growth.


3.4.4 Effect of BPIFA1 overexpression on bacterial growth and inflammatory cytokine production

We next tested whether recombinant BPIFA1 produced in human airway epithelial cells can inhibit bacterial growth in vitro. IB3-1 cells, an airway epithelial cell line derived from a CF patient (W1282X/F508del), did not secrete detectable amounts of BPIFA1 protein at baseline as assessed by western blot. These cells were transfected with a plasmid containing the full BPIFA1 sequence in order to overexpress human BPIFA1. Subsequent western blotting confirmed that BPIFA1 was produced and secreted by these cells, and that the signal was proportional to the amount of BPIFA1 plasmid transfected (Figure 3.6). After allowing BPIFA1 to accumulate in the medium overnight, IB3-1 cells were stimulated with live P. aeruginosa for 4 hours to test whether the secreted BPIFA1 had antimicrobial properties. P. aeruginosa growth was quantified by counting CFU. Overexpression of BPIFA1 using either 100 or 500 ng of transfected plasmid did not result in decreased CFU counts compared to transfection with empty vector (Figure 3.7a). We also quantified the production of the pro-inflammatory cytokines IL-6 and IL-8 in order to test whether BPIFA1 had an immunomodulatory effect in vitro. IL-6 and IL-8 levels were not significantly different with BPIFA1 overexpression (Figure 3.7b, c). In order to compensate for the potential inflammatory background associated with nucleic acid transfection, the total amount of plasmid was normalized to 500 ng per transfection. The conditions were as follows: 500 ng BPIFA1 plasmid, 100 ng BPIFA1 plasmid + 400 ng pCMV6 vector, and 500 ng pCMV6 vector.


Figure 3.6: Detection of secreted BPIFA1 in transfected IB3-1 cells

Detection of secreted BPIFA1 in IB3-1 cells transfected with BPIFA1 plasmid with and without stimulation with P. aeruginosa

Figure 3.7: Quantification of colony forming units and inflammatory cytokine production in IB3-1 cells transfected with BPIFA1 plasmid

IB3-1 cells were transfected with 500 ng or 100 ng of BPIFA1 plasmid, or empty vector and stimulated with P. aeruginosa for 5 hours. This was followed by quantification of a) colony forming units b) IL-8 production by ELISA c) IL-6 production by ELISA


3.4.5 Effect of BPIFA1 and BPIFB1 on production of inflammatory cytokines in airway epithelial cells

We further investigated the putative immunomodulatory effect of BPIFA1 using recombinant BPIFA1 protein in three airway epithelial cell lines. Preliminary experiments performed to establish conditions indicated that the 24 hour time point with an MOI of 50 resulted in maximal production of IL-8 after stimulation with P. aeruginosa. Furthermore, we found that heat-killed P. aeruginosa resulted in a greater production of IL-8 than live bacteria (Figure 3.8).

In order to determine if BPIFA1 and BPIFB1 have an anti-inflammatory effect on airway epithelial cells during conditions of infection, airway epithelial cell lines were pretreated with recombinant BPIFA1 or BPIFB1 (produced in yeast) prior to stimulation with heat-killed P. aeruginosa. We found that pretreatment of IB3-1 and CuFi-1 but not CFBE41o- cells with recombinant BPIFA1 prior to stimulation with PAO1 resulted in a reduction in IL-8 production (ANOVA p = 0.0015 and 0.0030, respectively). Furthermore, pretreatment with BPIFB1 was associated with decreased production of IL-8 in IB3-1 cells (ANOVA p = 0.0030). However, BPIFA1 and BPIFB1 pretreatment was not associated with reduction of IL-6 production in any cell type (Figure 3.9).


Figure 3.8: Preliminary experiments to establish conditions for stimulation of airway epithelial cells

Preliminary experiments performed in IB3-1 cells to establish stimulation conditions. a) Comparison of heat-killed P. aeruginosa to live bacteria at multiplicity of infection of 2, 10, 50 and 100 on IL-8 production by IB3-1 cells. b) IL-8 produced in IB3-1 cells stimulated with heat-killed bacteria at multiplicity of infection of 2, 10, 20, 50 and 100. Supernatants were collected at 4, 8 and 24 hours after stimulation. c) IL-8 produced in IB3-1 cells pretreated with 0, 0.1, 1, or 10 μg/mL of recombinant BPIFA1 followed by stimulation with heat-killed P. aeruginosa.


Figure 3.9: Quantification of inflammatory cytokine production by airway epithelial cells pretreated with recombinant BPIFA1 or BPIFB1 prior to bacterial stimulation

Quantification of IL-8 and IL-6 by ELISA in IB3-1, CuFi-1 and CFBE41o- airway epithelial cell lines that were unstimulated, stimulated with heat-killed P. aeruginosa, pretreated with recombinant BPIFA1 for 30 min followed by stimulation with heat-killed P. aeruginosa, or pretreated with recombinant BPIFB1 for 30 min followed by stimulation with heat-killed P. aeruginosa.


3.4.6 Characterization of BPIFA1 and BPIFB1 treatment on gene expression in CF airway epithelial cells

In order to further explore the function of BPIFA1 and BPIFB1 in airway epithelial cells, RNA-Seq was used to characterize the effect of BPIFA1/BPIFB1 on the airway epithelial transcriptome. RNA samples from IB3-1 airway epithelial cell lines treated with recombinant BPIFA1 or BPIFB1 alone or followed by stimulation with P. aeruginosa were sequenced. Principal component analysis (PCA) plots showing global changes in gene expression in each cell line are shown in Figure 3.10.

We found that PAO1 stimulation resulted in the differential expression of 182 genes in IB3-1 cells, of which 122 were upregulated and 60 were downregulated (Figure 3.11a). Pathway analysis performed using InnateDB identified overrepresented pathways among the differentially expressed genes, including the Jak-STAT signaling pathway, interferon alpha/beta signaling, and cytokine signaling in immune response (Figure 3.11b). Treatment of IB3-1 cells with recombinant BPIFA1 alone or recombinant BPIFB1 alone resulted in large transcriptome changes, with 1110 genes responding to BPIFA1 and 1324 genes responding to BPIFB1 compared to untreated cells. Sigora was used to identify overrepresented pathways in the differentially expressed genes that responded to BPIFA1/BPIFB1 in IB3-1 cells. This method of pathway analysis focuses on genes or gene pairs that are specific to a single pathway. In this way it utilizes the status of other genes in the experimental context to identify the most relevant pathways and minimize the identification of spurious pathways. Sigora pathway analysis indicated that overrepresented pathways in the differentially expressed genes responding to BPIFA1 treatment included Signaling by Rho GTPases, Rho GTPase cycle, and RHO GTPase effectors (Table 3.1). Furthermore, BPIFB1 treatment resulted in the overrepresentation of pathways in the differentially expressed genes including Signaling by Rho GTPases, RHO GTPase effectors, and Downregulation of TGF-beta receptor signaling (Table 3.1).
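The idea of scoring pathways by gene pairs that are specific to a single pathway can be illustrated with a simplified sketch. This is not the Sigora implementation: the pathway names, gene identifiers and functions below are hypothetical, pairs are unweighted, and a plain hypergeometric test stands in for Sigora's own statistics.

```python
from itertools import combinations
from math import comb

def unique_pairs(pathways):
    """Map each pathway to the gene pairs that occur in no other pathway."""
    owner = {}
    for name, genes in pathways.items():
        for pair in combinations(sorted(genes), 2):
            owner.setdefault(pair, set()).add(name)
    signatures = {name: set() for name in pathways}
    for pair, names in owner.items():
        if len(names) == 1:
            signatures[next(iter(names))].add(pair)
    return signatures

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) when drawing without replacement."""
    upper = min(successes, draws)
    total = comb(population, draws)
    hits = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return hits / total

def pair_enrichment(pathways, de_genes):
    """Return {pathway: (pairs fully differentially expressed, p-value)}."""
    signatures = unique_pairs(pathways)
    all_pairs = set().union(*signatures.values())
    # A pair counts as a hit only when both of its genes are in the DE list.
    hit_pairs = {p for p in all_pairs if p[0] in de_genes and p[1] in de_genes}
    results = {}
    for name, pairs in signatures.items():
        k = len(pairs & hit_pairs)
        results[name] = (k, hypergeom_sf(k, len(all_pairs), len(pairs), len(hit_pairs)))
    return results

# Hypothetical pathways: G2 and G3 are shared, so the pair (G2, G3) is ignored.
pathways = {
    "Pathway_A": {"G1", "G2", "G3"},
    "Pathway_B": {"G2", "G3", "G4"},
    "Pathway_C": {"G5", "G6"},
}
for name, (k, p) in pair_enrichment(pathways, {"G1", "G2", "G3"}).items():
    print(name, k, round(p, 3))
```

In this toy input, G2 and G3 are differentially expressed but belong to two pathways, so only pairs involving G1 provide evidence, and that evidence points to Pathway_A alone.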

Analysis of biological networks to which the differentially expressed genes belong was carried out using the network biology tool NetworkAnalyst [269]. The genes that were differentially expressed after treatment with BPIFA1 or BPIFB1 compared to baseline were used to create zero order (including interactions between differentially expressed genes only) protein:protein networks (Figure 3.12), which included 373 seed proteins for BPIFA1 and 513 seed proteins for BPIFB1. Biological function enrichment analysis indicated that both networks were enriched for innate immune genes. The BPIFA1 network in IB3-1 cells was enriched for several immune pathways including Signaling by TGF-beta receptor complex, Antiviral mechanism by IFN-stimulated genes, Downregulation of TGF-beta receptor signaling and Interferon signaling. The BPIFB1 network from IB3-1 cells was enriched for pathways including Downregulation of TGF-beta receptor signaling, NOD like receptor signaling pathway and MAPK signaling pathway. To corroborate these findings, RNA from an additional CF airway epithelial cell line, CFBE41o-, treated with recombinant BPIFA1 and BPIFB1 was sequenced. Because fewer genes were differentially expressed in these cells, a higher order network (including interactions between seed proteins as well as proteins that are known to interact with them) was plotted. For BPIFA1, this resulted in a network of 292 nodes (Figure 3.13) that was enriched for biological pathways including Influenza infection, RIG-I/MDA5 mediated induction of IFN-alpha/beta pathways, and NOD1/2 signaling pathway. The BPIFB1 network from CFBE41o- cells was a higher order network consisting of 255 nodes and was enriched for pathways including Influenza Infection, TRAF6 mediated NF-kB activation, RIG-I/MDA5 mediated induction of IFN-alpha/beta pathways and DAI mediated induction of type I IFNs. These findings indicate that treatment of CF airway epithelial cells with BPIFA1 or BPIFB1 results in activation of the innate immune response.
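The distinction between the zero order and higher order networks described above can be sketched as a filter over a list of known protein:protein interactions. The interaction list, protein names and function names below are hypothetical, the higher order case is assumed here to be a first order expansion (seeds plus their direct partners), and NetworkAnalyst itself draws on curated interaction databases rather than a hand-written list.

```python
def zero_order_network(interactions, seeds):
    """Keep only interactions in which both partners are seed proteins."""
    return {(a, b) for a, b in interactions if a in seeds and b in seeds}

def first_order_network(interactions, seeds):
    """Keep interactions with at least one seed partner (seeds plus neighbours)."""
    return {(a, b) for a, b in interactions if a in seeds or b in seeds}

def nodes(edges):
    """Collect every protein that appears in at least one edge."""
    return {protein for edge in edges for protein in edge}

# Hypothetical data: S* are seed (differentially expressed) proteins,
# X* are interaction partners that were not differentially expressed.
interactions = [("S1", "S2"), ("S1", "X1"), ("X1", "X2"), ("S3", "X2")]
seeds = {"S1", "S2", "S3"}

zero = zero_order_network(interactions, seeds)
first = first_order_network(interactions, seeds)
print(len(nodes(zero)), "nodes in the zero order network")
print(len(nodes(first)), "nodes in the first order network")
```

A seed with no seed partners, such as S3 here, drops out of the zero order network entirely, which is why a short list of differentially expressed genes may call for the expanded network.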

In order to explore the mechanism by which BPIFA1 and BPIFB1 may modulate the immune response to P. aeruginosa infection we compared gene expression in cells pretreated with BPIFA1 or BPIFB1 prior to stimulation with P. aeruginosa to that of IB3-1 cells that were not pretreated. We found that pretreatment with BPIFA1 resulted in 66 differentially expressed genes compared to cells that were not pre-treated and pretreatment with BPIFB1 resulted in 279 differentially expressed genes compared to cells that were not pretreated. Sigora analysis did not find any pathways to be overrepresented in genes that were differentially expressed in IB3-1 cells pretreated with BPIFA1 prior to P. aeruginosa stimulation compared to cells that were only stimulated with P. aeruginosa. In cells that were pretreated with BPIFB1 prior to stimulation, compared to cells that were stimulated with P. aeruginosa alone, Sigora identified Gap junction trafficking and regulation, Platelet degranulation, and Metabolism of vitamins and cofactors pathways as overrepresented pathways. A heatmap showing the genes that respond to PAO1 stimulation illustrates that BPIFA1 or BPIFB1 pretreatment did not further alter expression of those genes (Figure 3.14).


Table 3.1: Overrepresented pathways by Sigora analysis in genes that are differentially expressed in response to BPIFA1 or BPIFB1 treatment

Comparison: Unstimulated vs. BPIFA1
Pathway    P-value
Signaling by Rho GTPases    1.38×10^-16
Rho GTPase cycle    6.93×10^-13
Infectious disease    2.66×10^-7
Rho GTPase Effectors    2.45×10^-6
SRP-dependent cotranslational targeting to membrane    8.16×10^-5
KSRP (KHSRP) binds and destabilizes mRNA    5.10×10^-4
Downstream signaling of activated FGFR1    0.0187
Cyclin A/B1 associated events during G2/M transition    0.0497

Comparison: Unstimulated vs. BPIFB1
Pathway    P-value
Oxygen-dependent proline hydroxylation of hypoxia-inducible factor alpha    3.49×10^-9
Downregulation of TGF-beta receptor signaling    1.42×10^-8
Transcription of the HIV genome    4.07×10^-8
KSRP (KHSRP) binds and destabilizes mRNA    1.79×10^-6
Metabolism of vitamins and cofactors    3.00×10^-5
RNA Polymerase II Transcription    6.52×10^-5
RHO GTPase Effectors    1.87×10^-4
Metabolism of polyamines    9.76×10^-4
Constitutive signaling by NOTCH1 PEST domain mutants    1.79×10^-3
Cleavage of growing transcript in the termination region    3.10×10^-3
Signaling by Rho GTPases    3.76×10^-3
Golgi associated vesicle biogenesis    4.35×10^-6
Cyclin A/B1 associated events during G2/M transition    0.0102
Regulation of cholesterol biosynthesis by SREBP (SREBF)    0.0348
Chondroitin sulfate/dermatan sulfate metabolism    0.0396
Golgi cisternae pericentriolar stack reorganization    0.0451

Pathways that are overrepresented in genes that are significantly differentially expressed in IB3-1 cells treated with BPIFA1 or BPIFB1 compared to unstimulated cells.


Figure 3.10: PCA plots of global gene expression in CF airway epithelial cell lines

PCA plot generated using DESeq2 of whole transcriptome sequencing data from IB3-1 airway epithelial cells at baseline, stimulated with heat-killed P. aeruginosa, pretreated with recombinant BPIFA1 or BPIFB1 prior to stimulation with heat-killed P. aeruginosa, or treated with recombinant BPIFA1 or BPIFB1 alone.


Figure 3.11: Differentially expressed genes in IB3-1 cells stimulated with heat-killed P. aeruginosa.

a) MA plot of genes that were differentially expressed in IB3-1 cells stimulated with P. aeruginosa compared to unstimulated cells. 122 genes were upregulated and 60 were downregulated. b) InnateDB analysis of overrepresented pathways in genes that were differentially expressed in IB3-1 cells as a result of PAO1 stimulation.


Figure 3.12: Protein:protein network of genes that were differentially expressed as a result of BPIFA1/BPIFB1 treatment in IB3-1 cells

Zero order protein:protein networks, which show only direct interactions between differentially expressed genes, were plotted for IB3-1 cells for the comparison of a) treatment with BPIFA1 compared to unstimulated cells b) treatment with BPIFB1 compared to unstimulated cells


Figure 3.13: Protein:protein network of genes that were differentially expressed as a result of BPIFA1/BPIFB1 treatment in CFBE41o- cells

Higher order protein:protein networks, which show all known interactions between differentially expressed genes, were plotted for CFBE41o- cells for the comparison of a) treatment with BPIFA1 compared to unstimulated cells b) treatment with BPIFB1 compared to unstimulated cells


Figure 3.14: Heatmap showing all PAO1 responsive genes in IB3-1 cells

Heatmap of all genes that were differentially expressed in IB3-1 cells stimulated with P. aeruginosa compared to unstimulated cells.


3.5 Discussion

We have previously shown that a polymorphism in the BPIFA1/BPIFB1 region is associated with lung disease severity in CF (Chapter 2) [271]. Furthermore, we found that the genotype associated with more severe disease is associated with reduced BPIFA1, and to a lesser extent BPIFB1, gene and protein expression (Chapter 2) [271]. To investigate why reduced levels of these molecules are detrimental to CF lung function we characterized their role in CF, specifically focusing on potential antimicrobial and immunomodulatory functions. Since the findings of our previous study pointed to BPIFA1 as being the causative gene, but did not rule out BPIFB1, this work investigated both BPIFA1 and BPIFB1, with a focus on the former.

Immunohistochemistry illustrated that BPIFA1 is highly expressed in the upper airway, where it can be found in airway epithelial cells, coating the surface of the airway epithelium, and in submucosal glands and submucosal gland ducts. Since the airway epithelium is the first line of defense against pathogens in the respiratory tract, this pattern of expression supports a role for BPIFA1 in host defense. Furthermore, we found that while BPIFA1 is absent from the small airways of healthy individuals, it is markedly upregulated in CF small airways. These findings confirm what has been shown by other groups [204, 205]. While we did not characterize BPIFB1 expression, it has previously been shown by others that BPIFB1 shares a similar pattern of expression with BPIFA1 and can be detected in goblet cells in the airway epithelium and nasal passages, as well as in the submucosal glands [206]. Like BPIFA1, BPIFB1 has also been found to be upregulated in CF airway epithelium [241]. Elevated levels of BPIFA1 and BPIFB1 in CF provide evidence that these molecules may contribute to disease pathogenesis.


Our previous study (Chapter 2) demonstrated that the rs1078761 polymorphism, which is associated with CF lung disease severity, is also associated with BPIFA1 but not BPIFB1 levels in the saliva of healthy individuals. While we did show that rs1078761 was associated with BPIFA1 gene expression in CF patients, we did not confirm whether this effect extended to the protein level in CF. Therefore, in this study we collected saliva samples from CF patients and quantified BPIFA1 and BPIFB1 levels according to rs1078761 genotype. We found that patients who were homozygous for the G allele had significantly reduced BPIFA1 but not BPIFB1 levels compared to patients who were homozygous for the A allele. These results support our previous findings, which indicate that the G allele of rs1078761 is associated with reduced BPIFA1 but not BPIFB1. Furthermore, these data suggest that BPIFA1 is the causative gene for the association of rs1078761 with CF lung disease severity, since BPIFB1 levels in CF patients were not associated with rs1078761 genotype.

We next explored findings in the literature that have demonstrated that BPIFA1 can inhibit bacterial growth in vitro [221-226]. These findings have been controversial as some groups have found that BPIFA1 has antimicrobial properties while others have not [221, 260].

We found that recombinant BPIFA1 protein produced in yeast does not inhibit bacterial growth measured by optical density or CFU. A possible reason for the discrepancy in the literature is the type of recombinant BPIFA1 protein used. Many groups have utilized recombinant BPIFA1 produced in E. coli. Since E. coli normally lack the capacity for protein glycosylation [270], and BPIFA1 is a highly glycosylated protein [210], the recombinant protein produced in yeast may be significantly different from the one produced in E. coli, both structurally and functionally. We also tested the effect of BPIFA1 produced in E. coli on P. aeruginosa growth and found that there was no reduction in bacterial growth measured by optical density or CFU. In addition, we found that recombinant BPIFA1 produced by CF airway epithelial cells transfected with BPIFA1 plasmid also does not inhibit colony formation by P. aeruginosa. Together these data indicate that BPIFA1 does not have bacteriostatic or bactericidal activity against P. aeruginosa.

We also found that the addition of recombinant BPIFA1 protein to CF airway epithelial cells prior to stimulation with P. aeruginosa resulted in reduced production of IL-8 but not IL-6. In addition, treatment with recombinant BPIFB1 protein prior to P. aeruginosa stimulation resulted in reduced production of IL-8 but not IL-6 in CuFi-1 cells. These results indicate that BPIFA1 and BPIFB1 may have anti-inflammatory properties in response to P. aeruginosa infection. While anti-inflammatory action of BPIFA1 or BPIFB1 has not previously been shown in human cells, in mouse model systems BPIFA1 has been demonstrated to have anti-inflammatory effects [229, 230].

To investigate the mechanism of action of BPIFA1 and BPIFB1, whole transcriptome sequencing of airway epithelial cells pretreated with recombinant BPIFA1 or BPIFB1 with or without stimulation with P. aeruginosa was performed. Although the cells were responsive to stimulation with P. aeruginosa, we found that pretreatment with BPIFA1 and BPIFB1 had little effect on the response to infection. However, we did find that treatment of airway epithelial cells with BPIFA1 or BPIFB1 in the absence of stimulation had a large effect on the transcriptome.

Pathway overrepresentation analysis revealed that both BPIFA1 and BPIFB1 activated several pathways related to Rho GTPases. Rho GTPases have been shown to play a central role in cellular migration [272], including in the recruitment of neutrophils [273]. This suggests that the primary function of both BPIFA1 and BPIFB1 may be chemotaxis in the recruitment of inflammatory cells, which is supported by data in the literature indicating that BPIFA1 may function in neutrophil recruitment [226].

Network analysis of genes that were differentially expressed in response to BPIFA1 and BPIFB1 was performed, and biological function analysis of these networks identified several pathways related to the innate immune response, including Signaling by TGF-beta receptor complex, Antiviral mechanism by IFN-stimulated genes, Downregulation of TGF-beta receptor signaling, Interferon signaling, NOD like receptor signaling pathway, and MAPK signaling pathway. Many of these pathways are related to viral infection, suggesting that BPIFA1 and BPIFB1 may play a role in the anti-viral response. This is supported by preliminary data indicating that BPIFA1 may function in the anti-viral response to Influenza A virus [274].

An interesting finding from pathway overrepresentation analysis and biological function analysis of protein:protein interaction networks was that the response to BPIFA1 treatment was very similar to the response to BPIFB1 treatment. Since little is currently known about the function of BPIFB1, it was surprising to find that it has a similar effect on gene expression to BPIFA1, which has been well characterized in the literature. However, the findings that BPIFA1 and BPIFB1 are structurally similar, share a similar expression profile, and are strongly co-regulated (Chapter 2) [271] are consistent with similar biological functions for the two molecules.

Although we found that pretreatment of IB3-1 cells with BPIFA1 and BPIFB1 resulted in decreased IL-8 protein production in response to infection with P. aeruginosa, we did not find decreased gene expression of IL-8 in these samples. This may be due to the time point at which samples were collected for RNA extraction. While the 8 hour time point allowed for detection of large differences in gene expression in response to BPIFA1 and BPIFB1 alone, an alternative time point may have been more appropriate to identify transcriptional changes in response to bacterial infection as a result of BPIFA1 or BPIFB1 pretreatment.

From these experiments, it is still not clear whether BPIFA1 or BPIFB1 is causative for the association with lung disease severity shown in Chapter 2. Our data favor BPIFA1 being responsible, since rs1078761 was only associated with BPIFA1 levels in CF saliva. However, both molecules were able to reduce IL-8 production in response to P. aeruginosa infection, and RNA-Seq data indicated that both molecules have functions related to CF. It is possible that BPIFA1 is directly regulated by the rs1078761 polymorphism and is causative for the genetic association with CF severity, but that it functions synergistically with BPIFB1 in CF, so both molecules may play an important role in the disease.

In conclusion, we have generated several lines of new evidence supporting a role for BPIFA1 and BPIFB1 in the inflammatory response in CF, and have demonstrated that these molecules may contribute to CF severity through several complementary functions. Furthermore, our data support a mechanism by which BPIFA1 and BPIFB1 may contribute to CF lung disease severity that differs from what has been shown in the literature. Our RNA-Seq findings indicate that BPIFA1 and BPIFB1 result in gene expression changes in CF airway epithelial cells that could influence cell migration through Rho GTPase pathways, and also alter the response to viral infection. Additional experiments investigating BPIFA1 and BPIFB1 would help to further characterize these functions.


CHAPTER 4: INFLAMMATORY RESPONSE TO RHINOVIRUS INFECTION IN CYSTIC FIBROSIS

4.1 Rationale

While there is significant evidence that the CF response to bacterial infection is dysregulated and prolonged, little is currently known about the response to respiratory viruses in CF. In recent years there has been an increased appreciation of the role of respiratory virus infections in CF exacerbations. As many as 50% of CF exacerbations have a viral component, which can result in a persistent decline in lung function [121, 123-127]. While the rate of respiratory virus infection is not higher in CF patients compared to those without CF, CF patients experience more severe and prolonged infections [133, 138] which are associated with increased antibiotic use, hospitalization, and decline in lung function [127]. However, the mechanisms resulting in increased respiratory virus related morbidity in CF are currently unknown. Since the airway epithelium is the first line of defense against pathogens in the respiratory tract, investigation of the epithelial response to viral infection may provide insight into disease progression as a result of respiratory virus infection in CF patients.

4.2 Background

The most common viral pathogen in CF is rhinovirus, which accounts for up to 40% of CF virus-related exacerbations [121, 125, 126]. Rhinovirus is an ssRNA virus and a member of the Picornaviridae family. It is composed of a positive-sense genome encapsulated in a protein capsid. There are over 100 serotypes of human rhinovirus (HRV) belonging to three species (A, B and C). Furthermore, rhinovirus is classified into two groups, the major and minor group, depending on airway epithelial receptor affinity. The major group represents the majority of rhinovirus strains, which utilize the intercellular adhesion molecule ICAM-1 to bind to and enter cells, while the remaining (minor group) strains bind to low-density lipoprotein receptors (LDLR) [134, 135]. Rhinovirus enters the cell through receptor-mediated endocytosis, upon which the virus is uncoated and the RNA genome is transferred into the cytosol. Once in the cytoplasm, the RNA is translated into a polyprotein which is then cleaved into viral proteins, and is also replicated by the viral polymerase [275].

The entry and replication of rhinovirus in airway epithelial cells is detected by pattern recognition receptors including TLRs, retinoic acid-inducible gene I (RIG-I), and melanoma differentiation-associated gene 5 (MDA-5) [276]. The HRV capsid is detected by TLR2, which is present on the surface of airway epithelial cells, while ssRNA and dsRNA associated with the presence of rhinovirus are detected by TLR3, TLR7 and TLR8 [277, 278]. MDA-5 and RIG-I also detect the presence of dsRNA which is produced during viral replication. This triggers the activation of signaling pathways which lead to release of cytokines including type I interferons (IFN-α/IFN-β) and type III interferons (IL-28A, IL-28B and IL-29), in addition to IL-1β, tumor necrosis factor (TNF), IL-6, IL-8, IL-11, IL-12 and IL-15 [276]. IFNs are important for direct restriction of viral replication, while other cytokines such as IL-12 and IL-15 are involved in the differentiation, survival and recruitment of cytotoxic and natural killer cells [275]. Growth factors such as granulocyte colony-stimulating factor (G-CSF) and granulocyte-macrophage colony-stimulating factor (GM-CSF) also play important roles in response to HRV infection [276]. Furthermore, chemokines such as C-X-C motif chemokine ligand 8 (CXCL8), CXCL5, CXCL10 and C-C motif chemokine ligand 5 (CCL5) are involved in the recruitment and activation of granulocytes, particularly neutrophils [275, 276].


It has been hypothesized that the CF response to HRV infection is dysregulated, and that this contributes to the increased HRV-related morbidity found in CF patients. Current data in the literature reporting on the CF response to HRV have been conflicting, and it remains unclear whether the viral response is exaggerated or dampened in CF. While there are some data supporting an inflammatory response to viral exposure that is disproportionately high [147], other studies have found that higher viral load is associated with lower interferon response [136]. Characterization of the viral response in CF may help identify dysregulated pathways to reveal novel therapeutic targets.

Hypothesis: CF patients have an exaggerated immune response to rhinovirus infection compared to healthy controls

Aim: To investigate whether the CF response to rhinovirus infection is altered compared to healthy controls in order to identify possible anti-inflammatory targets.

4.3 Materials and methods

4.3.1 Recruitment of CF patients and healthy controls

Children with CF were recruited when attending the Princess Margaret Hospital for Children, Perth, Western Australia, for their annual bronchoscopy and bronchoalveolar lavage [147]. Samples were also obtained from children without CF who were undergoing elective surgery for non-respiratory conditions. Individuals with pre-existing bacterial or viral chest infection were excluded [279-281]. See Table 4.1 for a description of included individuals.


Table 4.1: Characteristics of CF patients and healthy controls

CF patients                              Healthy controls
n                            9           n                    10
Mean age                     2.8 ± 1.7   Mean age             3.9 ± 1.6
% male                       33.3%       % male               50.0%
Mutation                                 Atopy
  p.Phe508del / p.Phe508del  7             Non-atopic         8
  p.Phe508del / p.Arg334Trp  1             Atopic             2
  p.Phe508del / p.Arg117His  1
Brushing location            Tracheal    Brushing location    Tracheal

Demographic and clinical information of CF patients and healthy controls recruited for bronchoscopy and airway brushing

4.3.2 Airway brushing and primary cell culture

Airway epithelial cells were harvested using trans-laryngeal non-bronchoscopic brushing of the tracheal mucosa, and primary airway epithelial cell cultures were established according to previously described protocols [280-282]. Briefly, cells were obtained by gentle brushing of the airways, and were detached from the brush tip by vortexing. Cells were then centrifuged at 500 × g for 7 min at 8 °C and resuspended in Bronchial Epithelial Basal Medium (BEBM) supplemented with bovine pituitary extract (50 μg/ml), hydrocortisone (0.5 μg/ml), epidermal growth factor (0.5 ng/ml), epinephrine (0.5 μg/ml), triiodothyronine (6.5 ng/ml), insulin (5 μg/ml), transferrin (10 μg/ml), Fungizone (2.5 μg/ml), penicillin/streptomycin (100 Units/ml penicillin and 0.1 mg/ml streptomycin) and gentamicin (50 μg/ml). Cultures were utilized between passage one and three. Epithelial lineage was confirmed for each established culture via cytokeratin-19 expression, and cultures were established to be free from mesenchymal, macrophage, dendritic and endothelial cell contamination using immunohistochemistry and RT-PCR. Growth medium on established primary cell cultures was replaced every second day. Confluent cells were passaged by incubating with 0.25% Trypsin/0.05% EDTA for 7 min at 37 °C, followed by centrifugation, resuspension of the cell pellet in growth medium, and cell counting. Cells were plated into new flasks coated in a solution containing 10 mM fibronectin, 20 mM collagen, and 100 mM bovine serum albumin.

4.3.3 Rhinovirus stimulation

HRV1b was provided by Dr. Peter Wark (Hunter Medical Research Institute, NSW, Australia). Viral stock was expanded four times in HeLa cells. Airway epithelial cells were grown in 12-well plates until 70% confluence in BEBM plus supplements and growth factors. HRV1b was added to the cells to an MOI of 12 and incubated at 37 °C for up to 24 hours. Cells were also sham treated with UV-inactivated virus as a control, which was immediately washed off after addition. Cells were collected for analysis at specified times after infection (2, 4, 8, or 24 hours).
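As a worked illustration of the MOI used above, the volume of viral stock required per well follows from the number of cells and the stock titer. The cell number and titer in this sketch are hypothetical values chosen for the arithmetic only, and the function name is introduced here for illustration.

```python
def virus_volume_ul(moi, cells_per_well, titer_per_ml):
    """Volume of viral stock (in microlitres) needed to reach the target MOI."""
    if titer_per_ml <= 0:
        raise ValueError("titer must be positive")
    # MOI is the ratio of infectious units to cells.
    infectious_units = moi * cells_per_well
    return infectious_units / titer_per_ml * 1000.0

# Hypothetical inputs: 2e5 cells per well and a stock of 1e8 infectious units/ml.
volume = virus_volume_ul(moi=12, cells_per_well=2e5, titer_per_ml=1e8)
print(f"{volume:.1f} ul of stock per well")
```

With these assumed inputs, 12 × 2×10^5 = 2.4×10^6 infectious units are needed, which corresponds to 24 μl of a 10^8 units/ml stock.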

4.3.4 RNA-Sequencing and analysis

RNA was extracted from CF and control cells using the RNeasy Mini kit (QIAGEN, Valencia CA). RNA concentration, integrity and purity were assessed using the RNA Nano Kit with the Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara CA). Libraries for sequencing were prepared using the Illumina TruSeq RNA library preparation kit (Illumina catalogue number FC-122-1002). Briefly, mRNA was purified from 1 μg of total RNA using poly-dT beads and used to synthesize cDNA. cDNA ends were repaired and adenylated, and adaptors containing unique barcodes were ligated. DNA bound to adaptor was amplified using PCR and quantified.


A cBot instrument was used for cluster generation followed by RNA sequencing on a GAIIx instrument (Illumina, San Diego, CA), performed as a single end run of 64 nucleotides. Sequencing data were demultiplexed and converted to FASTQ files using CASAVA. Sequencing quality was assessed using FastQC [261], and if necessary adapter sequences were trimmed from reads using cutadapt [262]. Sequencing reads were aligned to the hg19 reference genome using Tophat2 [263], followed by sorting and indexing of BAM and SAM files using SAMtools [265]. HTSeq-count was used to generate read count tables. Differential gene expression analysis was performed using the Bioconductor packages DESeq2 [266] and Limma [267]. Unless otherwise indicated, a gene was considered differentially expressed if it showed at least a two-fold increase or decrease in expression (fold change threshold of ±2) and a false discovery rate corrected p-value less than or equal to 0.05.
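The significance criteria described above can be expressed as a simple filter. The following is a minimal Python sketch and is not the code used in this thesis, where the analysis was run in R with DESeq2 and Limma. The function name, gene labels and example values are hypothetical, and the sketch assumes that results are reported as log2 fold changes, so that a two-fold change corresponds to an absolute log2 fold change of at least 1.

```python
import math


def is_differentially_expressed(log2_fold_change, padj, min_fold=2.0, max_padj=0.05):
    """Apply a fold change threshold of +/-min_fold and an FDR cut-off of max_padj."""
    # Genes without an adjusted p-value (reported as NA) cannot pass the filter.
    if padj is None or math.isnan(padj):
        return False
    return abs(log2_fold_change) >= math.log2(min_fold) and padj <= max_padj


# Hypothetical results table: gene label -> (log2 fold change, adjusted p-value).
results = {
    "GENE_A": (6.6, 1e-100),        # strongly upregulated
    "GENE_B": (0.6, 1e-6),          # significant, but below the fold change threshold
    "GENE_C": (-1.2, 0.2),          # large change, not significant
    "GENE_D": (-1.1, 0.004),        # downregulated and significant
    "GENE_E": (3.0, float("nan")),  # no adjusted p-value available
}

significant = [gene for gene, (lfc, padj) in results.items()
               if is_differentially_expressed(lfc, padj)]
print(significant)
```

Setting `min_fold=1.0` removes the fold change requirement, which corresponds to analyses where only the FDR cut-off is applied.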

Downstream analysis was performed using Sigora [268] for pathway analysis, and NetworkAnalyst to generate protein:protein interaction networks for differentially expressed genes (http://www.networkanalyst.ca/NetworkAnalyst/) [269].

HRV1b levels in the airway epithelial cells were quantified by converting reads that were unmapped by Tophat2 to FASTQ files using Bam2fastq [283], followed by alignment of unmapped reads to the Rhinovirus A genome using Bowtie2 [264].
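The final counting step, after the unmapped reads have been aligned to the viral genome, can be illustrated as follows. This is a minimal Python sketch under stated assumptions rather than the procedure used in this thesis: it assumes that the Bowtie2 output is available as SAM-format text, and it counts primary alignments by inspecting the FLAG field (bit 0x4 marks an unmapped read; bits 0x100 and 0x800 mark secondary and supplementary alignments). The function name, reference name and example records are hypothetical.

```python
def count_mapped_reads(sam_lines):
    """Count primary mapped alignments in an iterable of SAM-format lines."""
    mapped = 0
    for line in sam_lines:
        # Skip blank lines and header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])
        # 0x4: read unmapped; 0x100: secondary; 0x800: supplementary alignment.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        mapped += 1
    return mapped


# Hypothetical SAM records for illustration only.
example_sam = [
    "@SQ\tSN:rhinovirus_A\tLN:7000",
    "read1\t0\trhinovirus_A\t100\t42\t64M\t*\t0\t0\t*\t*",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "read3\t16\trhinovirus_A\t250\t42\t64M\t*\t0\t0\t*\t*",
    "read3\t272\trhinovirus_A\t900\t1\t64M\t*\t0\t0\t*\t*",
]
print(count_mapped_reads(example_sam))
```

Counting only primary alignments ensures that each read contributes at most once to the viral read count.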

4.4 Results

4.4.1 Establishment of time course for expression of viral response genes

In order to determine the time point after infection with HRV1b to best assess gene expression in response to viral infection, we performed RNA-Seq on airway epithelial cells stimulated with HRV for 2, 4, 8 and 24 hours (n=5; 3 CF patients and 2 healthy controls at each time point). PCA clustering was performed using DESeq2 and indicated that the earlier time points cluster separately from those of 24 hours after infection (Figure 4.1). DESeq2 was used to test for differential gene expression, and the number of differentially expressed genes at each time point is shown in Figure 4.2. A FDR corrected p-value threshold of 0.05 was used for significance, but no fold change cut off was applied in order to capture smaller changes in gene expression which may occur at earlier time points. While changes in gene expression begin at 8 hours, the majority of differentially expressed genes are found at 24 hours after HRV infection. Sigora pathway analysis indicated that genes involved in viral response pathways such as interferon signaling and regulation of RIG-I/MDA5 signaling are differentially expressed at 8 hours after HRV1b infection and continue to be differentially expressed after 24 hours (Table 4.2).

Together these data indicate that changes in viral response gene expression begin at 8 hours, but are maximized at 24 hours after infection in the time points we studied, and therefore we focused on the 24 hour time point for all subsequent analyses.


Figure 4.1: PCA clustering of airway epithelial gene expression

PCA clustering of gene expression data from airway epithelial cells stimulated with HRV1b and harvested at various time points. PCA plots were generated using DESeq2 and were calculated using the 500 genes with greatest variance across all samples.

Figure 4.2: Number of differentially expressed genes post RV1b infection

[Bar chart. x-axis: time after HRV1b stimulation (2, 4, 8 and 24 hours); y-axis: number of DE genes (0 to 1500).]

Number of differentially expressed genes at 2, 4, 8 and 24 hours after HRV1b stimulation. DESeq2 was used to compare gene expression at each time point to gene expression of uninfected cells. A FDR corrected p-value cut off of 0.05 was applied, but no fold change threshold was used. Data from CF samples and controls were pooled for this analysis.

Table 4.2: Overrepresented pathways associated with HRV1b infection at 2, 4, 8 and 24 hours after exposure

Database            Time       Pathway Name                                      Adjusted p-value
Reactome Pathways   2 hours    N/A (no differentially expressed genes)
                    4 hours    N/A (no significantly overrepresented pathways)
                    8 hours    Interferon alpha/beta signaling                   3.69×10^-27
                               Interferon Signaling                              1.35×10^-23
                               Negative regulators of RIG-I/MDA5 signaling       2.38×10^-14
                               ISG15 antiviral mechanism                         2.16×10^-6
                               Interferon gamma signaling                        0.000677
                    24 hours   Interferon alpha/beta signaling                   <1×10^-300
                               Interferon gamma signaling                        3.13×10^-159
                               Interferon signaling                              4.07×10^-115
                               Negative regulators of RIG-I/MDA5 signaling       6.25×10^-82
                               TRIF-mediated TLR3/TLR4 signaling                 3.38×10^-29
Kegg Pathways       2 hours    N/A (no differentially expressed genes)
                    4 hours    Ribosome biogenesis in eukaryotes                 0.0471
                    8 hours    Influenza A                                       1.15×10^-32
                               Ribosome biogenesis in eukaryotes                 0.00362
                               Hepatitis C                                       0.00508
                               RNA transport                                     0.0112
                               Spliceosome                                       0.0187
                    24 hours   Influenza A                                       2.16×10^-95
                               NOD-like receptor signaling pathway               2.57×10^-38
                               RIG-I-like receptor signaling pathway             9.68×10^-31
                               Proteasome                                        1.39×10^-20
                               Steroid biosynthesis                              4.17×10^-20

Sigora (signature overrepresentation analysis) was used to identify pathways that are overrepresented in differentially expressed genes at each time point after HRV1b infection. Annotations curated from both Reactome and Kegg databases were analyzed. Adjusted p-values are Bonferroni corrected for multiple comparisons.


4.4.2 Characterization of viral response gene expression in airway epithelial cells

We next characterized the changes in gene expression that occur in airway epithelial cells, focusing specifically on the 24 hour time point. PCA clustering indicated that there are marked changes in gene expression in both CF and healthy control airway epithelial cells at 24 hours after HRV stimulation (Figure 4.3a). DESeq2 was used to test for differential expression in the full sample of CF patients and healthy controls combined (n=19; 9 CF patients and 10 healthy controls) using a paired design (matching before and after infection samples in each individual) to maximize power. There were a total of 763 differentially expressed genes after viral infection using a fold change threshold of ±2 and a FDR adjusted p-value cut-off of 0.05; 600 genes were upregulated after viral infection and 163 genes were downregulated, as displayed in the MA plot shown in Figure 4.3b. The top 200 differentially expressed genes were plotted in a heatmap (Figure 4.4), which illustrates that there is considerable inter-individual variability in expression of viral response genes after HRV1b infection, but global gene expression was not different between CF patients and controls. The genes with altered expression levels following HRV1b infection were subjected to a biological pathway overrepresentation analysis using Sigora to identify the biological processes affected by HRV infection. As expected, the majority of biological pathways overrepresented as a result of HRV1b stimulation were related to interferon response (Table 4.3). Analysis of biological networks to which differentially expressed genes belong was carried out using the network biology tool NetworkAnalyst [269]. The genes that were differentially expressed after viral stimulation compared to baseline were used to create a zero order (including interaction between differentially expressed genes only) protein:protein network (Figure 4.5), which included 230 seed proteins. The top hubs (highly connected nodes) in this network were interferon regulatory factor 1 (IRF1), signal transducer and activator of transcription 1 (STAT1), early growth response 1 (EGR1), ISG15 ubiquitin-like modifier (ISG15), and interferon regulatory factor 9 (IRF9), all of which are involved in the interferon response except for EGR1, which is a transcriptional regulator. Biological function enrichment analysis indicated that the genes found in this network are involved in innate immune response, defense response to virus, and cytokine mediated signaling.


Figure 4.3: PCA clustering and MA plots of CF and healthy control airway epithelial gene expression at 24 hours post HRV1b infection

a) PCA plot of gene expression in CF and healthy control airway epithelial cells before and after infection with HRV1b for 24 hours. PCA plots were generated using DESeq2 and were calculated using the 500 genes with greatest variance across all samples. b) MA plot showing genes that are differentially expressed at 24 hours after HRV1b infection in CF and healthy control airway epithelial cells combined. Red dots indicate genes with FDR corrected p-values less than 0.05.
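The gene selection step mentioned in the caption, restricting the PCA to the genes with greatest variance across all samples, can be sketched as follows. This is a minimal Python illustration and not the DESeq2 implementation used for the figures. The function name, gene labels and toy values are hypothetical, and the expression values are assumed to be already log-transformed or variance-stabilised.

```python
from statistics import pvariance


def top_variable_genes(expression, n_top=500):
    """Return the n_top gene names with the greatest variance across samples.

    `expression` maps each gene name to its list of per-sample values.
    """
    variances = {gene: pvariance(values) for gene, values in expression.items()}
    ranked = sorted(variances, key=variances.get, reverse=True)
    return ranked[:n_top]


# Toy matrix with four hypothetical genes measured in four samples.
toy_expression = {
    "g1": [5.0, 5.0, 5.0, 5.0],    # constant: variance 0
    "g2": [1.0, 9.0, 1.0, 9.0],    # variance 16
    "g3": [4.0, 5.0, 4.0, 5.0],    # variance 0.25
    "g4": [0.0, 10.0, 0.0, 12.0],  # variance 30.75
}
selected = top_variable_genes(toy_expression, n_top=2)
print(selected)
```

Principal components would then be computed on the selected genes only, so that genes with little variation between samples do not contribute to the clustering.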


Figure 4.4: Heatmap of top 200 differentially expressed genes after rhinovirus infection

Heat map showing the top 200 genes that are differentially expressed in CF and healthy control airway epithelial cells at 24 hours after HRV1b infection. DESeq2 was used to calculate Z scores for each sample based on the mean expression of each gene across all samples. Negative Z scores indicating lower than average expression are shown in purple and positive Z scores indicating increased expression are shown in orange.
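The per-gene Z score described in the caption can be written out explicitly: for each gene, the mean expression across all samples is subtracted from each sample's value and the result is divided by the standard deviation across samples. The following is a minimal Python sketch of that scaling. It is an illustration under the assumption that the sample standard deviation is used, and the function name and example values are hypothetical.

```python
from statistics import mean, stdev


def gene_z_scores(values):
    """Scale one gene's expression values to Z scores across samples."""
    if len(values) < 2:
        raise ValueError("at least two samples are needed to compute a Z score")
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (assumption)
    if sd == 0:
        # A gene with identical expression in every sample has no deviation.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


# Hypothetical expression of one gene in three samples.
z = gene_z_scores([2.0, 4.0, 6.0])
print(z)
```

Negative values correspond to lower than average expression and positive values to higher than average expression, matching the colour scale of the heat map.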


Figure 4.5: Network analysis of differentially expressed genes at 24 hours after HRV1b infection

Biological network of zero-order interaction between genes that were differentially expressed in airway epithelial cells after HRV1b infection. Upregulated genes are shown in red and downregulated genes are shown in green. The size of the node indicates the number of interaction partners. Proteins with many interaction partners (or hubs) are larger in size.


Table 4.3: Sigora pathway overrepresentation analysis of genes that are differentially expressed at 24 hours after HRV1b infection in airway epithelial cells

Reactome Pathways                             Corrected p-value   Kegg Pathways                           Corrected p-value
Interferon alpha/beta signaling               <10^-300            Influenza A                             4.11×10^-83
Interferon gamma signaling                    1.10×10^-126        Jak-STAT signaling pathway              6.19×10^-63
Interferon signaling                          2.49×10^-102        TNF-signaling pathway                   4.54×10^-33
Negative regulators of RIG-I/MDA5 signaling   7.28×10^-68         NOD-like receptor signaling pathway     1.67×10^-27
Signaling by Interleukins                     1.87×10^-19         RIG-I like receptor signaling pathway   7.06×10^-10
Interleukin-6 signaling                       8.78×10^-19         B cell receptor signaling pathway       5.26×10^-08
Diseases of Immune system                     2.27×10^-14         Amoebiasis                              1.01×10^-06
Endosomal/Vacuolar pathway                    6.84×10^-12         Proteoglycans in cancer                 3.82×10^-06
Termination of translesion DNA synthesis      1.38×10^-11         Toxoplasmosis                           7.15×10^-06
Toll-like Receptor 3 (TLR3) Cascade           7.30×10^-10         Cytosolic DNA-sensing pathway           0.0002996

Kegg and Reactome pathways that are significantly overrepresented in genes that are differentially expressed in airway epithelial cells from both CF and healthy control individuals 24 hours after HRV1b infection.

4.4.3 Quantification of HRV1b in airway epithelial cells

To detect and quantify HRV1b in the infected airway epithelial cells, sequencing reads from RNA-Seq were analyzed for the presence of the HRV1b genome. Since HRV1b is an RNA virus with a polyadenylated genome, its genome is present in the mRNA used for sequencing. Therefore, reads that did not map to the human genome were aligned to the Rhinovirus A (the rhinovirus class to which HRV1b belongs) genomic sequence using Bowtie2, and the number of aligned reads was quantified. Figure 4.6a shows the viral levels at each time point after HRV1b stimulation and reveals that viral levels peak and plateau at 8 hours following rhinovirus infection, although the majority of changes in gene expression are not detectable until 24 hours after viral stimulation. Furthermore, HRV1b levels are significantly elevated in CF patients compared to healthy controls at 24 hours after infection (p<0.0001) (Figure 4.6b).

Figure 4.6: Rhinovirus levels in CF and healthy control airway epithelial cells

a) HRV1b levels detected from RNA-Seq data in CF and healthy control airway epithelial cells in the time course experiment (n=5). b) HRV1b levels detected from RNA-Seq data in CF and healthy control airway epithelial cells at 24 hours (n=19)


4.4.4 Investigation of differential response to viral infection in CF at the global level

To investigate whether the response to viral infection is dysregulated in CF patients compared to controls, multiple factor analysis was performed using FactoMineR. This analysis clusters samples into groups based on gene expression. While the unstimulated samples clustered separately from those that were infected with HRV1b, there was no separation according to disease. Hierarchical clustering and factor map plots demonstrate that there are no global differences in gene expression between CF patients and healthy controls after rhinovirus exposure (Figure 4.7). FactoMineR was additionally used to test whether qualitative and quantitative variables, including age, gender, and CFTR mutation, were associated with sample clustering. The only variable that was significantly associated with sample clustering was HRV1b infection (p=6.5×10^-9); no other clinical or demographic variable, including disease status, was associated. These findings also suggest that gender is not associated with response to rhinovirus infection in CF.

Next, DESeq2 was used to test for differential gene expression within the CF patients and controls individually. There were 774 genes that were significantly differentially expressed in CF patients after viral infection, and 552 differentially expressed genes in controls. Of these, 440 genes were differentially expressed in both groups (Figure 4.8). Out of the 763 genes that were differentially expressed in the combined group of CF patients and controls (section 4.4.2), 624 were differentially expressed in the CF samples individually and 491 were differentially expressed in healthy controls. The top 10 genes from the analysis of the combined group of CF patients and controls are shown in Table 4.4, which illustrates that fold change for these top genes was similar for CF patients and healthy controls. Next, DESeq2 was used to test for differentially expressed genes using a nested interaction model to identify genes that respond differently in CF patients compared to controls. Eight genes were identified using a less stringent FDR corrected p-value threshold of 0.1 (4 genes had p<0.05) (Figure 4.9a). Furthermore, the likelihood ratio test was used in DESeq2 to test for the contribution of interaction between disease and HRV1b stimulation to gene expression, and seven differentially expressed genes were identified at an FDR of 0.1 (2 genes had p<0.05) (Figure 4.9b). While both interaction models identified several genes as responding differently in CF patients compared to controls, the heatmaps indicate that the differences in expression of these genes between CF and controls are very subtle and may not be biologically relevant.
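The idea behind an interaction between disease and HRV1b stimulation can be illustrated as a difference of differences on the log2 scale: the response to infection in CF samples minus the response to infection in control samples. The following is a simplified Python sketch of that quantity only. It is not the DESeq2 model used in this thesis, which estimates the effect within a negative binomial framework and tests its significance; the function names and all values are hypothetical.

```python
from statistics import mean


def log2_fold_change(before, after):
    """Mean change in log2 expression between uninfected and infected samples."""
    return mean(after) - mean(before)


def interaction_effect(cf_before, cf_after, control_before, control_after):
    """Difference between the CF response and the control response (log2 scale)."""
    return (log2_fold_change(cf_before, cf_after)
            - log2_fold_change(control_before, control_after))


# Hypothetical log2 expression values for a single gene.
effect = interaction_effect(
    cf_before=[2.0, 3.0], cf_after=[8.0, 9.0],            # CF response: 6.0
    control_before=[2.0, 2.0], control_after=[7.0, 8.0],  # control response: 5.5
)
print(effect)
```

A value near zero indicates that the gene responds to infection similarly in both groups, even when it is strongly induced in each group.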

Since viral levels were shown to be elevated in CF patients (section 4.4.3), we tested for differential gene expression by linear regression in Limma using a nested interaction design with correction for viral levels. This test would identify genes that respond differently to infection in CF when gene expression is normalized for viral levels, and in essence would identify genes whose expression is out of proportion to viral levels in CF compared to controls. No significantly differentially expressed genes were identified in Limma with or without correction for viral levels.

In combination, analysis of gene expression in CF and healthy control samples individually and in interaction models indicates that the global response to viral infection is not dysregulated in CF patients compared to controls under the conditions tested in these studies.


Figure 4.7: Factor map and hierarchical clustering of gene expression data from CF and healthy control airway epithelial cells infected with HRV1b

a) Factor map of gene expression in CF and healthy control airway epithelial cells infected with HRV1b for 24 hours. Colors indicate clusters generated by FactoMineR based on global gene expression. Samples with more similar gene expression will be clustered together. CF patients before treatment are labeled “CF_Control”, CF patients after treatment are “CF_Virus”, healthy controls before treatment are labeled “Healthy_Control”, and healthy controls after treatment are “Healthy_Virus”. The factor map was calculated using the 500 genes with greatest variance across all samples. b) Hierarchical clustering of gene expression data highlighting clusters generated by FactoMineR.

Figure 4.8: Venn diagram and chord diagram of genes that were differentially expressed in both CF and healthy control airway epithelial cells

a) Venn diagram showing genes that are commonly differentially expressed in both CF and healthy control airway epithelial cells 24 hours after HRV1b infection. b) Chord diagram showing genes that are differentially expressed in both CF and healthy control airway epithelial cells 24 hours after HRV1b infection. The chord diagram displays the relationship between genes that are differentially expressed in healthy controls (shown in orange) and genes that are differentially expressed in CF (shown in purple). Blue lines connecting the two segments indicate genes that are differentially expressed in both groups.
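The partition shown in a Venn diagram of this kind follows directly from set operations on the two lists of differentially expressed genes. The following is a minimal Python sketch. The gene identifiers are placeholders constructed only so that the set sizes match the counts reported in section 4.4.4 (774 genes in CF, 552 in controls, 440 shared); they are not the actual gene lists, and the function name is hypothetical.

```python
def venn_partition(genes_a, genes_b):
    """Return the sizes of (only in a, in both, only in b)."""
    both = genes_a & genes_b
    return len(genes_a - both), len(both), len(genes_b - both)


# Placeholder identifiers sized to match the reported counts.
cf_genes = {f"gene{i}" for i in range(774)}            # 774 genes in CF
control_genes = {f"gene{i}" for i in range(334, 886)}  # 552 genes in controls

cf_only, shared, control_only = venn_partition(cf_genes, control_genes)
print(cf_only, shared, control_only)
```

With the reported counts, 334 genes are differentially expressed only in CF (774 minus 440) and 112 only in controls (552 minus 440).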


Table 4.4: Top differentially expressed genes stratified into CF and healthy control groups

         All Samples                      CF                               Controls                         Interaction
Gene     log2 Fold Change  padj           log2 Fold Change  padj           log2 Fold Change  padj           Interaction Score  padj
IFIT2    6.62              3.73×10^-118   5.92              2.50×10^-59    5.38              1.15×10^-47    -0.240             0.999
IFIT1    6.57              1.90×10^-108   5.82              2.50×10^-55    5.32              3.83×10^-45    -0.607             0.999
IFNB1    6.49              5.25×10^-74    5.68              7.10×10^-48    4.66              5.88×10^-31    -1.03              NA
CMPK2    6.4               1.57×10^-98    5.58              3.60×10^-49    5.01              2.05×10^-38    -0.313             0.999
ZBP1     6.3               2.41×10^-73    5.13              1.40×10^-38    4.9               8.49×10^-34    0.0629             NA
CXCL10   6.09              1.16×10^-57    4.24              2.50×10^-22    3.7               8.80×10^-17    -1.34              0.999
HERC5    6.07              1.05×10^-110   5.75              4.90×10^-63    5                 1.92×10^-46    -0.429             0.999
RSAD2    6.01              5.82×10^-58    4.5               1.20×10^-25    3.81              4.85×10^-18    -1.04              0.999
IFIT3    6.01              8.60×10^-164   5.8               2.40×10^-86    5.31              4.90×10^-70    -0.494             0.999
CXCL11   5.99              3.45×10^-60    4.5               5.30×10^-26    3.97              6.20×10^-20    -0.392             0.999

The top 10 genes by fold change that are differentially expressed in all samples combined 24 hours after rhinovirus infection, and p-values and fold changes for those same genes within the CF and healthy control subsets individually. Interaction p-values and interaction scores were calculated to identify genes that respond differently to HRV1b in healthy controls compared to CF. These data indicate that the top differentially expressed genes in the combined group are not different between CF and healthy controls.


Figure 4.9: Heatmaps illustrating genes that respond differently to HRV1b in CF compared to controls

Heat maps generated using DESeq2 to calculate Z scores for each sample based on the mean expression of each gene across all samples. Negative Z scores indicating lower than average expression are shown in purple and positive Z scores indicating increased expression are shown in orange. a) Heatmap showing genes that respond differently to HRV1b infection in CF samples compared to controls using a nested interaction model in DESeq2 b) Heatmap showing genes that respond differently to HRV1b infection in CF samples compared to controls using the likelihood ratio test in DESeq2


4.4.5 Investigation of pathways that have previously been shown to be dysregulated in the CF response to rhinovirus

Previous studies have identified several pathways as being dysregulated in CF patients compared to healthy controls after rhinovirus infection. Several groups have found that levels of interferons and of inflammatory cytokines, specifically IL-6 and IL-8, are altered in CF [147, 284]. Furthermore, Sutanto et al. found that apoptosis was reduced in CF cells in response to virus [147]. Therefore, we examined these specific pathways in a hypothesis driven manner. We hypothesized that inflammatory cytokines, IFNs and apoptosis genes respond differently in CF compared to controls. We specifically targeted the pro-inflammatory cytokines IL-6 and IL-8, the interferons IFN-β1, IFN-λ1, IFN-λ2 and IFN-λ3, as well as all members of the epithelial cell apoptotic process GO pathway (GO:1904019). While many of these genes were significantly differentially expressed in the combined dataset, as well as in the CF and healthy control subsets, the gene expression changes were comparable between CF patients and healthy controls, and there was no statistically significant interaction between gene expression and disease (Table 4.5).


Table 4.5: Pro-inflammatory cytokines, interferons and apoptosis genes from RNA-Seq data

            Combined group             CF                         Healthy Controls           Interaction in Combined Samples
Gene        P-value      Fold Change   P-value      Fold Change   P-value      Fold Change   P-value   Interaction Score

Pro-inflammatory cytokines
IL-6        5.6×10^-03   1.1           0.065        0.76          0.017        1.0           0.60      -0.42
CXCL8       1.4×10^-05   0.98          0.016        0.68          6.4×10^-06   1.3           0.20      0.45

Interferons
IFNB1       1.7×10^-76   6.5           1.4×10^-50   5.68          2.1×10^-33   4.7           0.56      -1.0
IFNL1       4.2×10^-38   5.2           2.6×10^-21   3.96          2.6×10^-13   3.1           0.12      -2.8
IFNL2       2.2×10^-51   5.8           3.5×10^-31   4.75          2.2×10^-20   3.8           0.32      -1.7
IFNL3       1.4×10^-49   5.8           1.6×10^-25   4.33          5.6×10^-16   3.4           0.63      -0.92

Epithelial cell apoptotic process
BMPR2       4.9×10^-07   0.60          4.0×10^-04   0.54          3.0×10^-05   0.66          0.68      0.084
CAPN10      0.98         3.8×10^-3     0.87         -0.03         0.91         0.024         0.77      0.11
MTCH2       0.015        -0.19         0.062        -0.21         0.15         -0.17         0.91      0.017
SERPINB13   0.031        -0.67         0.052        -0.74         0.31         -0.39         0.19      0.58
SFRP4       0.53         0.14          1.0          0             0.48         0.11          0.96      0.30
MST1        0.75         -0.080        0.43         0.24          0.19         -0.39         0.38      -0.39
TIA1        0.28         0.15          0.97         -0.01         0.13         0.30          0.41      0.19
MTOR        0.24         -0.11         0.23         -0.14         0.60         -0.063        0.95      9.1×10^-3
PPARGC1A    4.6×10^-03   -1.07         3.9×10^-03   -1.19         0.06         -0.77         0.27      1.31
SIX3        0.89         -0.031        1.0          0             0.70         -0.058        1.00      0
TEK         0.56         0.23          0.72         -0.15         0.20         0.54          0.27      1.4
IL13        0.57         0.12          0.51         0.10          1.00         0             0.89      -0.95
RGCC        2.8×10^-03   -0.98         1.3×10^-03   -1.2          0.04         -0.77         0.22      0.89
WFS1        0.11         -0.33         0.12         -0.41         0.32         -0.27         0.75      0.11
AGER        0.014        0.54          0.162        0.41          0.05         0.58          0.46      0.37

P-values and fold changes for pro-inflammatory cytokines, interferons and apoptosis genes in RNA-Seq data calculated using DESeq2. P-values and fold changes are shown for the combined group of CF and healthy control individuals, for CF patients alone, and for healthy controls alone. In addition, a nested interaction model was used to investigate whether each gene responds differently to HRV1b infection in CF compared to controls. All p-values shown are unadjusted; therefore the threshold for significance is p<0.002 with Bonferroni correction.
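The Bonferroni threshold quoted in the caption can be reproduced by dividing the nominal significance level by the number of genes tested. The following is a minimal Python sketch, assuming a nominal level of 0.05 and one test per gene listed in Table 4.5 (21 genes). The interaction p-values for the cytokine and interferon genes are copied from the table; the function and variable names are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# 2 cytokine + 4 interferon + 15 apoptosis genes are listed in Table 4.5.
n_genes = 2 + 4 + 15
threshold = bonferroni_threshold(0.05, n_genes)

# Unadjusted interaction p-values from Table 4.5 (cytokines and interferons).
interaction_p = {
    "IL-6": 0.60, "CXCL8": 0.20,
    "IFNB1": 0.56, "IFNL1": 0.12, "IFNL2": 0.32, "IFNL3": 0.63,
}
passing = [gene for gene, p in interaction_p.items() if p < threshold]
print(f"threshold = {threshold:.4f}; genes passing: {passing}")
```

None of these interaction p-values approaches the corrected threshold, which is consistent with the absence of a significant interaction between gene expression and disease.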


4.5 Discussion

Respiratory virus infection is now recognized to play an important role in the short and long term health of CF patients [121]. While CF patients do not have increased susceptibility to respiratory virus infections, there is increased severity and duration of infection in CF which results in significant morbidity [121, 138]. Therefore, we sought to characterize the immune response to rhinovirus infection in order to identify possible pathways that may be dysregulated in the anti-viral response in CF. The main finding of this study was that CF airway epithelial cells do not have a dysregulated response to rhinovirus infection, despite having elevated viral levels. Furthermore, we showed that most changes in gene expression are detectable at 24 hours after HRV1b infection, and that the response to rhinovirus infection in airway epithelial cells is characterized by interferon and RIG-I/MDA5 signaling.

We found that changes in expression of genes in viral response pathways such as interferon signaling and regulation of RIG-I/MDA5 signaling begin at 8 hours after viral infection, and are of greater magnitude at 24 hours. However, viral levels plateau in both CF and healthy control cells at 8 hours after infection, indicating that there is a delay after replication before viral response gene expression begins. This finding supports previous studies in airway epithelial cells which found that few genes were upregulated in response to viral infection at 6 or 8 hours, but many changes can be detected by 12-24 hours [285, 286]. Furthermore, an in vivo study in healthy volunteers showed that gene expression changes in peripheral blood in response to rhinovirus infection cannot be detected at 8 hours but are present at 24 hours after infection [287].


We also found that there are large transcriptional changes in airway epithelial cells at 24 hours after HRV1b infection. As expected, the differentially expressed genes were enriched for viral response pathways including interferon signaling, regulation of RIG-I/MDA5 signaling, and signaling by interleukins. These pathways are consistent with known gene expression profiles in response to rhinovirus infection [286]. Furthermore, although our rhinovirus stimulations were performed in cultured primary cells, the gene expression profile identified is comparable to that observed in nasal epithelial cells stimulated with rhinovirus in vivo [288]. Proud et al. [288] found that nasal epithelial cells infected with rhinovirus in vivo responded by producing a range of viral response genes including interferon stimulated genes. Similar responses were later found in airway epithelial cells from asthmatics and healthy controls that were grown at the air liquid interface (ALI) [289]. These findings indicate that gene expression in response to viral infection is well represented in the primary airway epithelial cells used in our experiments.

Some previous studies have shown that CF airway epithelial cells may have a dysregulated response to rhinovirus, while others have found that this is not the case [136, 147-150]. Therefore we investigated whether CF airway epithelial cells had an altered response to HRV1b infection compared to healthy controls. While several genes were identified as responding differently in CF compared to healthy controls, the p-values were of borderline significance and the differences in gene expression were not striking, suggesting that these may be spurious findings. Subset analysis of the top overall viral responsive genes in CF and healthy control subgroups indicated that the fold change for these genes is similar in CF patients compared to healthy controls, and testing for interaction did not reveal significant interaction between gene expression and disease. Furthermore, a hypothesis driven investigation of interferon and apoptosis pathways did not identify any genes with a significant interaction score. Therefore we concluded that the CF response to rhinovirus is not altered compared to controls. This is supported by several studies that found that interferon levels and inflammatory cytokine levels in response to rhinovirus infection are not altered in CF patients compared to controls [148, 150]. However, our results are inconsistent with the findings of Sutanto et al., who demonstrated that CF patients have increased production of IL-6 and IL-8 in response to rhinovirus [147], Kieninger et al., who found that interferon levels were reduced in CF patients with increased viral load [136], and Schogler et al., who found that HRV1b was associated with increased production of interferons and pro-inflammatory cytokines [136, 284]. One possible reason that was considered to account for these discrepancies was that different studies have used different rhinovirus strains. There are two main groups of rhinovirus which use different receptors on airway epithelial cells for entry, and there is some evidence in the literature that Class A and Class B rhinovirus strains may have different effects on interferon response in both CF cells and healthy controls [284, 290, 291]. However, HRV class does not appear to be responsible for differences in epithelial response here, as studies using the same strain have produced different results. For example, similar to our study, Sutanto et al. used HRV1b, but found elevated production of pro-inflammatory cytokines in CF. Furthermore, Dauletbaev et al. worked with HRV16, which is a Class B rhinovirus, but found results matching ours. Therefore rhinovirus strain is unlikely to be responsible for the discrepancy found in the literature. Another important technical difference between studies is the experimental readout, specifically gene expression versus protein expression. While our study focused on gene expression, others quantified protein expression [147], which may explain the lack of consistency in our conclusions. Furthermore, since our study analyzed the entire transcriptome, correction for multiple comparisons was required, which could have resulted in the loss of true positives.

Using the RNA-Seq reads, we were able to quantify the amount of HRV1b genome present in the airway epithelial cells, and we found that viral levels are significantly elevated in CF airway epithelial cells compared to controls at 24 hours after rhinovirus infection. This is consistent with what has previously been found in several studies [136, 147, 151]. This finding raised the question of why the virus is able to replicate more and/or infect more cells in CF compared to healthy controls. Since the viral levels are higher in CF, but viral response gene expression is the same as in healthy controls, the viral response may actually be low in proportion to the level of virus. This may allow the virus to continue replicating, as interferon levels may not be sufficient for the amount of virus present. Therefore we tested for differential gene expression using an interaction model with correction for viral levels, in order to identify genes that respond differently to infection in CF in proportion to viral levels; however, no genes were identified. Further experiments are required to investigate why viral levels are higher in CF, particularly to look at anti-viral responses that occur early on in viral infection. It is possible that differences in CF cells and healthy cells occur during viral entry. Since both major and minor group rhinovirus strains have been shown to be elevated in CF compared to controls [147, 150, 151], and the two groups use different receptors for entry, it is unlikely that differences in viral binding to receptors are responsible. Therefore, differences in virus internalization through endosomal uptake may be involved, which could contribute to elevated virus load in CF. However, our findings show that differences in viral levels between CF patients and controls are not detectable until 24 hours after virus infection, and that there are no differences in viral levels at 2, 4 and 8 hours after infection. If differences in viral internalization were responsible, we would expect to see changes in viral levels early in the infection. This could indicate that differences in virus replication, as opposed to internalization, may be involved. It is important to note, however, that in our experiments sample size was limited at the 2, 4 and 8 hour time points (n=3 for CF; n=2 for healthy controls), which may be the reason that we did not detect differences in viral levels between CF and controls at these time points. Alternatively, it is possible that viral clearance is impaired or delayed in CF, allowing for viral levels to be elevated in CF patients at 24 hours after infection while virus has started to be cleared in healthy controls. Investigation of gene expression and viral load at later time points would clarify whether viral clearance is delayed in CF, and possibly identify genes involved in viral clearance that are dysregulated in CF patients.

In conclusion, we found that while viral response gene expression is intact in CF airway epithelial cells, there are increased levels of virus associated with CF. This could contribute to the elevated rhinovirus related morbidity found in CF. Further studies are required to investigate the reason that viral levels are higher in CF, and why this does not result in an elevated viral response. Elucidation of the mechanism by which viral levels are elevated in CF may reveal novel anti-viral targets for CF, and thereby slow the rate of respiratory decline due to viral related exacerbation.


CHAPTER 5: GENERAL CONCLUSIONS AND FUTURE DIRECTIONS

It has been shown that the inflammatory response in CF is dysregulated and prolonged. This is due in part to the frequent and severe respiratory infections to which CF patients are susceptible, but also reflects an inflammatory response that may be disproportionately high. While bacterial infection is known to play a significant role in CF, viral infection is emerging as an important contributor to the health of CF patients. Therefore, the primary objective of the studies described in this thesis was to utilize information from genetic association studies and whole transcriptome analysis to investigate the inflammatory response to infection in CF, and to identify and characterize anti-inflammatory targets. We began by identifying and characterizing BPIFA1 and BPIFB1 as modifier genes in CF, focusing on the putative antimicrobial and immunomodulatory properties of these molecules. Next we used whole transcriptome gene expression data generated by RNA-Seq to characterize the inflammatory response of CF airway epithelial cells to rhinovirus infection.

5.1 Main contributions to the field of CF and inflammation research

The body of research presented in this thesis has shown that:

1) Polymorphisms in the BPIFA1/BPIFB1 chromosomal region are associated with lung disease severity in CF. This indicates that these putative innate immune genes contribute to CF lung function. While it has been shown in the literature that BPIFA1 and BPIFB1 are elevated in CF [204, 219, 220, 241], our results are the first to implicate these genes in CF severity.

2) The G allele of rs1078761, which is associated with increased CF lung disease severity, is also associated with decreased mRNA expression of BPIFA1 and BPIFB1 in lung tissue and BPIFA1 protein levels in saliva. This suggests that decreased levels of BPIFA1 and BPIFB1 are detrimental to CF lung function. These findings are the first in the literature to indicate that decreased BPIFA1/BPIFB1 levels may contribute to CF lung disease severity.

3) BPIFA1, but not BPIFB1, mRNA and protein levels are significantly elevated in lung tissue from CF patients compared to healthy controls. This supports findings in the literature indicating that BPIFA1 is elevated in CF [204, 219, 220, 241]. These findings suggest that BPIFA1 may contribute to the host response in CF, and therefore CF patients who carry the G allele of rs1078761 may have more severe lung disease due to an impaired host response as a result of lower BPIFA1 or BPIFB1 expression.

4) Using immunohistochemistry, we found that BPIFA1 is expressed in ciliated epithelial cells in the upper airways and is absent in the small airways of healthy lung. In contrast, BPIFA1 can be detected at high levels in both large and small airways of CF lung. This supports previous studies of BPIFA1 localization in healthy and CF tissue, and indicates that these molecules may play an important role in host defense given their presence at the epithelial barrier [204, 205].

5) Recombinant BPIFA1 does not have antimicrobial properties against P. aeruginosa in vitro. This is supported by several studies in the literature [221, 260] but contradicted by others [221-226].

6) Recombinant BPIFA1 and BPIFB1 have an anti-inflammatory effect in CF airway epithelial cell lines, as measured by IL-8 cytokine production in response to infection with P. aeruginosa. This is consistent with several mouse studies which have shown that BPIFA1 has anti-inflammatory effects in vivo [229, 230].

7) Whole genome gene expression data generated by RNA-Seq indicate that while BPIFA1 and BPIFB1 do not have the ability to inhibit inflammation produced as a result of P. aeruginosa infection, both molecules significantly alter gene expression in CF airway epithelial cells. Specifically, both BPIFA1 and BPIFB1 alter expression of genes involved in Rho GTPase signaling, which is known to play a central role in cellular migration [272].

8) Changes in viral response gene expression begin at 8 hours after viral infection, and are of greater magnitude at 24 hours.

9) While viral response gene expression in response to rhinovirus infection is not altered in CF airway epithelial cells compared to healthy controls, rhinovirus levels are elevated in CF airway epithelial cells. This suggests that there is a deficiency in the viral response mechanism in CF which allows for greater viral internalization and/or replication. This confirms what was previously shown by Dauletbaev et al. [150].

5.2 Future Directions

5.2.1 What is the role of the rs1078761 polymorphism in BPIFA1 and BPIFB1 expression and regulation?

Our findings indicate that the rs1078761 polymorphism, which is associated with CF lung disease severity, is also associated with BPIFA1 and BPIFB1 expression levels. However, it is not clear how this polymorphism regulates expression of these genes. Experiments such as chromatin immunoprecipitation (ChIP), electrophoretic mobility shift assays (EMSA) and chromosome conformation capture analysis could be used to identify regulatory proteins and DNA sequences that interact with the SNP. ENCODE data indicated that the CTCF and HIF1 transcription factors bind at rs1078761, and therefore antibodies against these proteins could be used in ChIP and EMSA experiments. Furthermore, since rs1078761 is an amino acid changing polymorphism in BPIFB1, it is of interest to investigate whether this change alters BPIFB1 function. Cloning, site-directed mutagenesis and in vitro translation could be used to produce BPIFB1 protein with the amino acid change, which could then be used in functional experiments such as the ones performed in our studies as well as the proposed experiments described below.

5.2.2 What is the mechanism by which BPIFA1 alters the airway epithelial response to infection?

Our data suggest that while BPIFA1 may have some mild anti-inflammatory properties in response to P. aeruginosa infection, BPIFA1 does not have anti-microbial properties in vitro. In contrast to the functions that have previously been explored for BPIFA1 and BPIFB1 in the literature, we found that treatment of CF airway epithelial cells with recombinant BPIFA1 or BPIFB1 results in significant changes in expression of genes involved in Rho GTPase signaling pathways. Since Rho GTPases are involved in cellular migration [272], including the recruitment of neutrophils [273], BPIFA1 and BPIFB1 may be involved in chemotaxis.

Therefore, it would be of interest to further investigate whether BPIFA1 and BPIFB1 have the ability to recruit inflammatory cells, possibly using cell migration assays in vitro. Since neutrophilic inflammation is central to CF disease pathophysiology, it would be of particular relevance to investigate whether BPIFA1 and BPIFB1 can influence neutrophil migration at physiological doses. Furthermore, our findings indicate that BPIFA1 and BPIFB1 alter expression of genes involved in the anti-viral response, and this is supported by early findings that BPIFA1 may function in the anti-viral response to influenza A virus [274]. Further experiments could be performed to investigate the mechanism by which BPIFA1 (and possibly BPIFB1) alters the viral response in airway epithelial cells by treating CF airway epithelial cells with recombinant BPIFA1 and BPIFB1 proteins prior to infection with respiratory viruses including influenza and rhinovirus. Inflammatory cytokines could be measured using ELISA, and gene expression profiles assessed using RNA-Seq, similar to what was done in our experiments with P. aeruginosa. These experiments may identify novel roles for BPIFA1 and BPIFB1 in host response, and further support the relevance of inhaled BPIFA1 as a therapy.

5.2.3 What is the mechanism by which rhinovirus levels are elevated in CF?

Our studies demonstrate that although gene expression in response to rhinovirus infection is not altered in CF patients compared to healthy controls, virus levels are elevated in CF airway epithelial cells. Further experiments are required to elucidate the mechanism that allows for higher viral levels in CF. Investigation of early time points may identify viral response genes that respond differently early in infection in CF cells compared to healthy controls. In addition, analysis of later time points could uncover viral clearance genes that may be expressed later in CF patients compared to healthy individuals. Future experiments could involve RNA-Seq of airway epithelial cells at an early time point such as 2 hours after infection, at which elevated viral loads have been shown to already be detectable [150], as well as time points later than 24 hours. Alternatively, qPCR or ELISA could be used to quantify a specific panel of viral response molecules such as IFNs and IFN responsive genes in a time-course experiment.
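
As an illustration of how a qPCR time-course of this kind might be summarized, the following minimal Python sketch computes relative expression with the comparative Ct (2^-ddCt) method at each time point. The choice of this method, the reference gene (GAPDH), the target gene (IFNB1), the function names and every Ct value are assumptions made for illustration only; none of them are data or methods taken from this thesis.

```python
# Illustrative sketch only: comparative Ct (2^-ddCt) fold change per time point.
# All Ct values below are hypothetical and chosen to give round numbers.


def delta_delta_ct_fold_change(ct_target_treated, ct_ref_treated,
                               ct_target_control, ct_ref_control):
    """Return the fold change of a target gene (treated vs. control),
    normalized to a reference gene, using 2^-(ddCt)."""
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)


def time_course_fold_changes(ct_table, reference_gene):
    """Compute fold changes for every non-reference gene at every time point.

    ct_table has the shape {hours: {"infected": {gene: Ct}, "mock": {gene: Ct}}}.
    Returns {hours: {gene: fold_change}} ordered by time point.
    """
    result = {}
    for hours in sorted(ct_table):
        infected = ct_table[hours]["infected"]
        mock = ct_table[hours]["mock"]
        result[hours] = {
            gene: delta_delta_ct_fold_change(
                infected[gene], infected[reference_gene],
                mock[gene], mock[reference_gene],
            )
            for gene in infected
            if gene != reference_gene
        }
    return result


# Hypothetical Ct values for one interferon gene at 2, 8 and 24 hours.
EXAMPLE_CT = {
    2: {"infected": {"IFNB1": 30.0, "GAPDH": 18.0},
        "mock": {"IFNB1": 30.0, "GAPDH": 18.0}},
    8: {"infected": {"IFNB1": 28.0, "GAPDH": 18.0},
        "mock": {"IFNB1": 30.0, "GAPDH": 18.0}},
    24: {"infected": {"IFNB1": 25.0, "GAPDH": 18.0},
         "mock": {"IFNB1": 30.0, "GAPDH": 18.0}},
}

if __name__ == "__main__":
    for hours, genes in time_course_fold_changes(EXAMPLE_CT, "GAPDH").items():
        for gene, fold_change in genes.items():
            print(f"{hours} h  {gene}: {fold_change:.1f}-fold vs. mock")
```

Running the same calculation separately for CF and healthy control cells would give per-time-point fold changes that could then be compared between the two groups; in practice, replicate handling and statistical testing would need to be added.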

Furthermore, it is of importance to investigate which step in the process of viral infection results in increased viral levels in CF. The defect could lie at the point of viral binding to cellular receptors or in viral internalization by endosomal uptake. Viral binding to cellular receptors could be quantified using purified radiolabeled viruses that are allowed to attach to cultured cells [292]. In addition, viral internalization could be compared between CF and healthy control cell lines using previously described viral internalization assays [293]. These experiments may elucidate the mechanism by which virus levels are elevated in CF to identify pathways that could be targeted therapeutically.

REFERENCES

1. Cystic Fibrosis Canada, The Canadian Cystic Fibrosis Registry Annual Report. 2014. 2. Andersen, D.H. and R.G. Hodges, Celiac syndrome; genetics of cystic fibrosis of the pancreas, with a consideration of etiology. Am J Dis Child, 1946. 72: p. 62-80. 3. Kerem, B., et al., Identification of the cystic fibrosis gene: genetic analysis. Science, 1989. 245(4922): p. 1073-80. 4. Rommens, J.M., et al., Identification of the cystic fibrosis gene: chromosome walking and jumping. Science, 1989. 245(4922): p. 1059-65. 5. Riordan, J.R., et al., Identification of the cystic fibrosis gene: cloning and characterization of complementary DNA. Science, 1989. 245(4922): p. 1066-73. 6. Elborn, J.S., Cystic fibrosis. Lancet, 2016. 7. Cutting, G.R., Cystic fibrosis genetics: from molecular understanding to clinical application. Nat Rev Genet, 2015. 16(1): p. 45-56. 8. Welsh, M.J. and A.E. Smith, Molecular mechanisms of CFTR chloride channel dysfunction in cystic fibrosis. Cell, 1993. 73(7): p. 1251-4. 9. Zielenski, J., Genotype and phenotype in cystic fibrosis. Respiration, 2000. 67(2): p. 117-33. 10. Cheng, S.H., et al., Defective intracellular transport and processing of CFTR is the molecular basis of most cystic fibrosis. Cell, 1990. 63(4): p. 827-34. 11. Denning, G.M., et al., Processing of mutant cystic fibrosis transmembrane conductance regulator is temperature-sensitive. Nature, 1992. 358(6389): p. 761-4. 12. Lazrak, A., et al., The silent codon change I507-ATC->ATT contributes to the severity of the DeltaF508 CFTR channel dysfunction. FASEB J, 2013. 27(11): p. 4630-45. 13. Stoltz, D.A., D.K. Meyerholz, and M.J. Welsh, Origins of cystic fibrosis lung disease. N Engl J Med, 2015. 372(4): p. 351-62. 14. Munder, A. and B. Tummler, Origins of cystic fibrosis lung disease. N Engl J Med, 2015. 372(16): p. 1574. 15. Haggie, P.M. and A.S. Verkman, Cystic fibrosis transmembrane conductance regulator- independent phagosomal acidification in macrophages. J Biol Chem, 2007. 282(43): p. 
31422-8. 16. Barriere, H., et al., Revisiting the role of cystic fibrosis transmembrane conductance regulator and counterion permeability in the pH regulation of endocytic organelles. Mol Biol Cell, 2009. 20(13): p. 3125-41. 17. Steinberg, B.E., et al., A cation counterflux supports lysosomal acidification. J Cell Biol, 2010. 189(7): p. 1171-86. 18. Castellani, C., et al., Consensus on the use and interpretation of cystic fibrosis mutation analysis in clinical practice. J Cyst Fibros, 2008. 7(3): p. 179-96. 19. Reddy, M.M. and P.M. Quinton, Control of dynamic CFTR selectivity by glutamate and ATP in epithelial cells. Nature, 2003. 423(6941): p. 756-60. 20. Sheppard, D.N., et al., Mutations in CFTR associated with mild-disease-form Cl- channels with altered pore properties. Nature, 1993. 362(6416): p. 160-4. 21. Kristidis, P., et al., Genetic determination of exocrine pancreatic function in cystic fibrosis. Am J Hum Genet, 1992. 50(6): p. 1178-84. 22. Van Goor, F., et al., Rescue of CF airway epithelial cell function in vitro by a CFTR potentiator, VX-770. Proc Natl Acad Sci U S A, 2009. 106(44): p. 18825-30.

23. Yu, H., et al., Ivacaftor potentiation of multiple CFTR channels with gating mutations. J Cyst Fibros, 2012. 11(3): p. 237-45. 24. Van Goor, F., et al., Effect of ivacaftor on CFTR forms with missense mutations associated with defects in protein processing or function. J Cyst Fibros, 2014. 13(1): p. 29-36. 25. Ramsey, B.W., et al., A CFTR potentiator in patients with cystic fibrosis and the G551D mutation. N Engl J Med, 2011. 365(18): p. 1663-72. 26. Davies, J.C., et al., Efficacy and safety of ivacaftor in patients aged 6 to 11 years with cystic fibrosis with a G551D mutation. Am J Respir Crit Care Med, 2013. 187(11): p. 1219-25. 27. Davies, J., et al., Assessment of clinical response to ivacaftor with lung clearance index in cystic fibrosis patients with a G551D-CFTR mutation and preserved spirometry: a randomised controlled trial. Lancet Respir Med, 2013. 1(8): p. 630-8. 28. McKone, E.F., et al., Long-term safety and efficacy of ivacaftor in patients with cystic fibrosis who have the Gly551Asp-CFTR mutation: a phase 3, open-label extension study (PERSIST). Lancet Respir Med, 2014. 2(11): p. 902-10. 29. Rowe, S.M., et al., Clinical mechanism of the cystic fibrosis transmembrane conductance regulator potentiator ivacaftor in G551D-mediated cystic fibrosis. Am J Respir Crit Care Med, 2014. 190(2): p. 175-84. 30. De Boeck, K., et al., Efficacy and safety of ivacaftor in patients with cystic fibrosis and a non- G551D gating mutation. J Cyst Fibros, 2014. 13(6): p. 674-80. 31. Van Goor, F., et al., Correction of the F508del-CFTR protein processing defect in vitro by the investigational drug VX-809. Proc Natl Acad Sci U S A, 2011. 108(46): p. 18843-8. 32. Rehman, A., N.U. Baloch, and I.A. Janahi, Lumacaftor-Ivacaftor in Patients with Cystic Fibrosis Homozygous for Phe508del CFTR. N Engl J Med, 2015. 373(18): p. 1783. 33. 
Boyle, M.P., et al., A CFTR corrector (lumacaftor) and a CFTR potentiator (ivacaftor) for treatment of patients with cystic fibrosis who have a phe508del CFTR mutation: a phase 2 randomised controlled trial. Lancet Respir Med, 2014. 2(7): p. 527-38. 34. Kerem, E., et al., The relation between genotype and phenotype in cystic fibrosis--analysis of the most common mutation (delta F508). N Engl J Med, 1990. 323(22): p. 1517-22. 35. Sosnay, P.R., et al., Defining the disease liability of variants in the cystic fibrosis transmembrane conductance regulator gene. Nat Genet, 2013. 45(10): p. 1160-7. 36. Correlation between genotype and phenotype in patients with cystic fibrosis. The Cystic Fibrosis Genotype-Phenotype Consortium. N Engl J Med, 1993. 329(18): p. 1308-13. 37. Wilschanski, M., et al., Correlation of sweat chloride concentration with classes of the cystic fibrosis transmembrane conductance regulator gene mutations. J Pediatr, 1995. 127(5): p. 705- 10. 38. Hubert, D., et al., Genotype-phenotype relationships in a cohort of adult cystic fibrosis patients. Eur Respir J, 1996. 9(11): p. 2207-14. 39. Kraemer, R., P. Birrer, and S. Liechti-Gallati, Genotype-phenotype association in infants with cystic fibrosis at the time of diagnosis. Pediatr Res, 1998. 44(6): p. 920-6. 40. Kraemer, R., et al., Early detection of lung disease and its association with the nutritional status, genetic background and life events in patients with cystic fibrosis. Respiration, 2000. 67(5): p. 477-90. 41. Kraemer, R., et al., Long-term gas exchange characteristics as markers of deterioration in patients with cystic fibrosis. Respir Res, 2009. 10: p. 106.

42. Tsui, L.C. and P. Durie, Genotype and phenotype in cystic fibrosis. Hosp Pract (1995), 1997. 32(6): p. 115-8, 123-9, 134, passim. 43. Kerem, E., et al., Pulmonary function and clinical course in patients with cystic fibrosis after pulmonary colonization with Pseudomonas aeruginosa. J Pediatr, 1990. 116(5): p. 714-9. 44. Rubin, B.K., Exposure of children with cystic fibrosis to environmental tobacco smoke. N Engl J Med, 1990. 323(12): p. 782-8. 45. Corey, M. and V. Farewell, Determinants of mortality from cystic fibrosis in Canada, 1970-1989. Am J Epidemiol, 1996. 143(10): p. 1007-17. 46. Schechter, M.S., et al., The association of socioeconomic status with outcomes in cystic fibrosis patients in the United States. Am J Respir Crit Care Med, 2001. 163(6): p. 1331-7. 47. O'Connor, G.T., et al., Median household income and mortality rate in cystic fibrosis. Pediatrics, 2003. 111(4 Pt 1): p. e333-9. 48. Steinkamp, G. and H. von der Hardt, Improvement of nutritional status and lung function after long-term nocturnal gastrostomy feedings in cystic fibrosis. J Pediatr, 1994. 124(2): p. 244-9. 49. Mekus, F., et al., Categories of deltaF508 homozygous cystic fibrosis twin and sibling pairs with distinct phenotypic characteristics. Twin Res, 2000. 3(4): p. 277-93. 50. Vanscoy, L.L., et al., Heritability of lung disease severity in cystic fibrosis. Am J Respir Crit Care Med, 2007. 175(10): p. 1036-43. 51. Drumm, M.L., et al., Genetic modifiers of lung disease in cystic fibrosis. N Engl J Med, 2005. 353(14): p. 1443-53. 52. Gu, Y., et al., Identification of IFRD1 as a modifier gene for cystic fibrosis lung disease. Nature, 2009. 458(7241): p. 1039-42. 53. Dorfman, R., et al., Complex two-gene modulation of lung disease severity in children with cystic fibrosis. J Clin Invest, 2008. 118(3): p. 1040-9. 54. Corvol, H., et al., Genome-wide association meta-analysis identifies five modifier loci of lung disease severity in cystic fibrosis. Nat Commun, 2015. 6: p. 8382. 55. 
Hector, A., et al., Expression and regulation of interferon-related development regulator-1 in cystic fibrosis neutrophils. Am J Respir Cell Mol Biol, 2013. 48(1): p. 71-7. 56. Baldan, A., et al., IFRD1 gene polymorphisms are associated with nasal polyposis in cystic fibrosis patients. Rhinology, 2015. 53(4): p. 359-64. 57. Taylor, C., et al., A novel lung disease phenotype adjusted for mortality attrition for cystic fibrosis genetic modifier studies. Pediatr Pulmonol, 2011. 46(9): p. 857-69. 58. Wright, F.A., et al., Genome-wide association and linkage identify modifier loci of lung disease severity in cystic fibrosis at 11p13 and 20q13.2. Nat Genet, 2011. 43(6): p. 539-46. 59. Li, X., et al., Genome-wide association study identifies TH1 pathway genes associated with lung function in asthmatic patients. J Allergy Clin Immunol, 2013. 132(2): p. 313-20 e15. 60. Fossum, S.L., et al., Ets homologous factor regulates pathways controlling response to injury in airway epithelial cells. Nucleic Acids Res, 2014. 42(22): p. 13588-98. 61. Stanke, F., et al., The CF-modifying gene EHF promotes p.Phe508del-CFTR residual function by altering protein glycosylation and trafficking in epithelial cells. Eur J Hum Genet, 2014. 22(5): p. 660-6. 62. Dibbert, B., et al., Cytokine-mediated Bax deficiency and consequent delayed neutrophil apoptosis: a general mechanism to accumulate effector cells in inflammation. Proc Natl Acad Sci U S A, 1999. 96(23): p. 13330-5.

63. Harris, J.F., et al., Bcl-2 sustains increased mucous and epithelial cell numbers in metaplastic airway epithelium. Am J Respir Crit Care Med, 2005. 171(7): p. 764-72. 64. Reid, C.J., S. Gould, and A. Harris, Developmental expression of mucin genes in the human respiratory tract. Am J Respir Cell Mol Biol, 1997. 17(5): p. 592-8. 65. Button, B., et al., A periciliary brush promotes the lung health by separating the mucus layer from airway epithelia. Science, 2012. 337(6097): p. 937-41. 66. Kesimer, M., et al., Molecular organization of the mucins and glycocalyx underlying mucus transport over mucosal surfaces of the airways. Mucosal Immunol, 2013. 6(2): p. 379-92. 67. Tse, C.M., et al., Cloning and sequencing of a rabbit cDNA encoding an intestinal and kidney- specific Na+/H+ exchanger isoform (NHE-3). J Biol Chem, 1992. 267(13): p. 9340-6. 68. Orlowski, J., R.A. Kandasamy, and G.E. Shull, Molecular cloning of putative members of the Na/H exchanger gene family. cDNA cloning, deduced amino acid sequence, and mRNA tissue expression of the rat Na/H exchanger NHE-1 and two structurally related proteins. J Biol Chem, 1992. 267(13): p. 9331-9. 69. Sun, L., et al., Multiple apical plasma membrane constituents are associated with susceptibility to meconium ileus in individuals with cystic fibrosis. Nat Genet, 2012. 44(5): p. 562-9. 70. Bradford, E.M., et al., Reduced NHE3-mediated Na+ absorption increases survival and decreases the incidence of intestinal obstructions in cystic fibrosis mice. Am J Physiol Gastrointest Liver Physiol, 2009. 296(4): p. G886-98. 71. Dorfman, R., et al., Modulatory effect of the SLC9A3 gene on susceptibility to infections and pulmonary function in children with cystic fibrosis. Pediatr Pulmonol, 2011. 46(4): p. 385-92. 72. Kontakioti, E., et al., HLA and asthma phenotypes/endotypes: a review. Hum Immunol, 2014. 75(8): p. 930-9. 73. Aron, Y., et al., HLA class II polymorphism in cystic fibrosis. A possible modifier of pulmonary phenotype. 
Am J Respir Crit Care Med, 1999. 159(5 Pt 1): p. 1464-8. 74. Hancock, D.B., et al., Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function. PLoS Genet, 2012. 8(12): p. e1003098. 75. Chauhan, B., et al., Evidence for the involvement of two different MHC class II regions in susceptibility or protection in allergic bronchopulmonary aspergillosis. J Allergy Clin Immunol, 2000. 106(4): p. 723-9. 76. Konigshoff, M., et al., The angiotensin II receptor 2 is expressed and mediates angiotensin II signaling in lung fibrosis. Am J Respir Cell Mol Biol, 2007. 37(6): p. 640-50. 77. Wagenaar, G.T., et al., Angiotensin II type 2 receptor ligand PD123319 attenuates hyperoxia- induced lung and heart injury at a low dose in newborn rats. Am J Physiol Lung Cell Mol Physiol, 2014. 307(3): p. L261-72. 78. Noveski, P., et al., Study of Three Single Nucleotide Polymorphisms in the SLC6A14 Gene in Association with Male Infertility. Balkan J Med Genet, 2014. 17(2): p. 61-6. 79. Siasi, E. and A. Aleyasin, Four Single Nucleotide Polymorphisms in INSR, SLC6A14, TAS2R38, and OR2W3 Genes in Association with Idiopathic Infertility in Persian Men. J Reprod Med, 2016. 61(3-4): p. 145-52. 80. Knowles, M.R. and M. Drumm, The influence of genetics on cystic fibrosis phenotypes. Cold Spring Harb Perspect Med, 2012. 2(12): p. a009548. 81. Bradley, G.M., et al., Genetic modifiers of nutritional status in cystic fibrosis. Am J Clin Nutr, 2012. 96(6): p. 1299-308.

82. Blackman, S.M., et al., Genetic modifiers of cystic fibrosis-related diabetes. Diabetes, 2013. 62(10): p. 3627-35. 83. Blackman, S.M., et al., Relative contribution of genetic and nongenetic modifiers to intestinal obstruction in cystic fibrosis. Gastroenterology, 2006. 131(4): p. 1030-9. 84. Dorfman, R., et al., Modifier gene study of meconium ileus in cystic fibrosis: statistical considerations and gene mapping results. Hum Genet, 2009. 126(6): p. 763-78. 85. Patwari, P., et al., The arrestin domain-containing 3 protein regulates body mass and energy expenditure. Cell Metab, 2011. 14(5): p. 671-83. 86. Harness-Brumley, C.L., et al., Gender differences in outcomes of patients with cystic fibrosis. J Womens Health (Larchmt), 2014. 23(12): p. 1012-20. 87. Fogarty, A.W., et al., Are measures of body habitus associated with mortality in cystic fibrosis? Chest, 2012. 142(3): p. 712-7. 88. Finkelstein, S.M., et al., Diabetes mellitus associated with cystic fibrosis. J Pediatr, 1988. 112(3): p. 373-7. 89. Ahmed, N., et al., Molecular consequences of cystic fibrosis transmembrane regulator (CFTR) gene mutations in the exocrine pancreas. Gut, 2003. 52(8): p. 1159-64. 90. Moran, A., et al., Diagnosis, screening and management of cystic fibrosis related diabetes mellitus: a consensus conference report. Diabetes Res Clin Pract, 1999. 45(1): p. 61-73. 91. Mackie, A.D., S.J. Thornton, and F.P. Edenborough, Cystic fibrosis-related diabetes. Diabet Med, 2003. 20(6): p. 425-36. 92. Blackman, S.M., et al., Genetic modifiers play a substantial role in diabetes complicating cystic fibrosis. J Clin Endocrinol Metab, 2009. 94(4): p. 1302-9. 93. Bertrand, C.A., et al., SLC26A9 is a constitutively active, CFTR-regulated anion conductance in human bronchial epithelia. J Gen Physiol, 2009. 133(4): p. 421-38. 94. Ko, S.B., et al., Gating of CFTR by the STAS domain of SLC26 transporters. Nat Cell Biol, 2004. 6(4): p. 343-50. 95. 
Chang, M.H., et al., Slc26a9 is inhibited by the R-region of the cystic fibrosis transmembrane conductance regulator via the STAS domain. J Biol Chem, 2009. 284(41): p. 28306-18. 96. Li, W., et al., Unraveling the complex genetic model for cystic fibrosis: pleiotropic effects of modifier genes on early cystic fibrosis-related morbidities. Hum Genet, 2014. 133(2): p. 151-61. 97. Miller, M.R., et al., Variants in Solute Carrier SLC26A9 Modify Prenatal Exocrine Pancreatic Damage in Cystic Fibrosis. J Pediatr, 2015. 166(5): p. 1152-1157 e6. 98. Moran, A., et al., Epidemiology, pathophysiology, and prognostic implications of cystic fibrosis- related diabetes: a technical review. Diabetes Care, 2010. 33(12): p. 2677-83. 99. Durie PR, S.D., Gonska T, Ip W, Keenan K, Miller M, Sun L, Rommens J, Strug LJ, Early exocrine pancreatic damage determined by serum immunoreactive trypsinogen is a significant predictor of CF-related diabetes at a later age. . Pediatr Pulmonol, 2012. 47(S35): p. 408. 100. Collaco, J.M. and G.R. Cutting, Update on gene modifiers in cystic fibrosis. Curr Opin Pulm Med, 2008. 14(6): p. 559-66. 101. Pulleyn, L.J., et al., TGFbeta1 allele association with asthma severity. Hum Genet, 2001. 109(6): p. 623-7. 102. Wu, L., et al., Transforming growth factor-beta1 genotype and susceptibility to chronic obstructive pulmonary disease. Thorax, 2004. 59(2): p. 126-9. 103. Silverman, E.S., et al., Transforming growth factor-beta1 promoter polymorphism C-509T is associated with asthma. Am J Respir Crit Care Med, 2004. 169(2): p. 214-9.

104. Akhurst, R.J., TGF beta signaling in health and disease. Nat Genet, 2004. 36(8): p. 790-2. 105. FitzSimmons, S.C., The changing epidemiology of cystic fibrosis. J Pediatr, 1993. 122(1): p. 1-9. 106. Folkesson, A., et al., Adaptation of Pseudomonas aeruginosa to the cystic fibrosis airway: an evolutionary perspective. Nat Rev Microbiol, 2012. 10(12): p. 841-51. 107. Jimenez, P.N., et al., The multiple signaling systems regulating virulence in Pseudomonas aeruginosa. Microbiol Mol Biol Rev, 2012. 76(1): p. 46-65. 108. Jones, A.M., M.E. Dodd, and A.K. Webb, Burkholderia cepacia: current clinical issues, environmental controversies and ethical dilemmas. Eur Respir J, 2001. 17(2): p. 295-301. 109. Grimwood, K., T.J. Kidd, and M. Tweed, Successful treatment of cepacia syndrome. J Cyst Fibros, 2009. 8(4): p. 291-3. 110. Sherrard, L.J., M.M. Tunney, and J.S. Elborn, Antimicrobial resistance in the respiratory microbiota of people with cystic fibrosis. Lancet, 2014. 384(9944): p. 703-13. 111. Bryant, J.M., et al., Whole-genome sequencing to identify transmission of Mycobacterium abscessus between patients with cystic fibrosis: a retrospective cohort study. Lancet, 2013. 381(9877): p. 1551-60. 112. Floto, R.A. and C.S. Haworth, The growing threat of nontuberculous mycobacteria in CF. J Cyst Fibros, 2015. 14(1): p. 1-2. 113. Saiman, L., et al., Infection prevention and control guideline for cystic fibrosis: 2013 update. Infect Control Hosp Epidemiol, 2014. 35 Suppl 1: p. S1-S67. 114. Schaffer, K., Epidemiology of infection and current guidelines for infection prevention in cystic fibrosis patients. J Hosp Infect, 2015. 89(4): p. 309-13. 115. Kidd, T.J., et al., Shared Pseudomonas aeruginosa genotypes are common in Australian cystic fibrosis centres. Eur Respir J, 2013. 41(5): p. 1091-100. 116. Tunney, M.M., et al., Lung microbiota and bacterial abundance in patients with bronchiectasis when clinically stable and during exacerbation. Am J Respir Crit Care Med, 2013. 
187(10): p. 1118-26. 117. Fodor, A.A., et al., The adult cystic fibrosis airway microbiota is stable over time and infection type, and highly resilient to antibiotic treatment of exacerbations. PLoS One, 2012. 7(9): p. e45001. 118. O'Neill, K., et al., Reduced bacterial colony count of anaerobic bacteria is associated with a worsening in lung clearance index and inflammation in cystic fibrosis. PLoS One, 2015. 10(5): p. e0126980. 119. Coburn, B., et al., Lung microbiota across age and disease stage in cystic fibrosis. Sci Rep, 2015. 5: p. 10241. 120. Paganin, P., et al., Changes in cystic fibrosis airway microbial community associated with a severe decline in lung function. PLoS One, 2015. 10(4): p. e0124348. 121. Wat, D., et al., The role of respiratory viruses in cystic fibrosis. J Cyst Fibros, 2008. 7(4): p. 320-8. 122. Ramsey, B.W., et al., The effect of respiratory viral infections on patients with cystic fibrosis. Am J Dis Child, 1989. 143(6): p. 662-8. 123. Wang, E.E., et al., Association of respiratory viral infections with pulmonary deterioration in patients with cystic fibrosis. N Engl J Med, 1984. 311(26): p. 1653-8. 124. Armstrong, D., et al., Severe viral respiratory infections in infants with cystic fibrosis. Pediatr Pulmonol, 1998. 26(6): p. 371-9. 125. Collinson, J., et al., Effects of upper respiratory tract infections in patients with cystic fibrosis. Thorax, 1996. 51(11): p. 1115-22.

126. de Almeida, M.B., et al., Rhinovirus C and respiratory exacerbations in children with cystic fibrosis. Emerg Infect Dis, 2010. 16(6): p. 996-9. 127. van Ewijk, B.E., et al., Viral respiratory infections in cystic fibrosis. J Cyst Fibros, 2005. 4 Suppl 2: p. 31-6. 128. Hament, J.M., et al., Respiratory viral infection predisposing for bacterial disease: a concise review. FEMS Immunol Med Microbiol, 1999. 26(3-4): p. 189-95. 129. Asner, S., et al., Role of respiratory viruses in pulmonary exacerbations in children with cystic fibrosis. J Cyst Fibros, 2012. 11(5): p. 433-9. 130. Hendricks, M.R. and J.M. Bomberger, Digging through the Obstruction: Insight into the Epithelial Cell Response to Respiratory Virus Infection in Patients with Cystic Fibrosis. J Virol, 2016. 90(9): p. 4258-61. 131. Takaoka, A., et al., Integration of interferon-alpha/beta signalling to responses in tumour suppression and antiviral defence. Nature, 2003. 424(6948): p. 516-23. 132. Kotenko, S.V., et al., IFN-lambdas mediate antiviral protection through a distinct class II cytokine receptor complex. Nat Immunol, 2003. 4(1): p. 69-77. 133. Khaitov, M.R., et al., Respiratory virus induction of alpha-, beta- and lambda-interferons in bronchial epithelial cells and peripheral blood mononuclear cells. Allergy, 2009. 64(3): p. 375- 86. 134. Hofer, F., et al., Members of the low density lipoprotein receptor family mediate cell entry of a minor-group common cold virus. Proc Natl Acad Sci U S A, 1994. 91(5): p. 1839-42. 135. Greve, J.M., et al., The major human rhinovirus receptor is ICAM-1. Cell, 1989. 56(5): p. 839-47. 136. Kieninger, E., et al., High rhinovirus burden in lower airways of children with cystic fibrosis. Chest, 2013. 143(3): p. 782-90. 137. Hiatt, P.W., et al., Effects of viral lower respiratory tract infection on lung function in infants with cystic fibrosis. Pediatrics, 1999. 103(3): p. 619-26. 138. 
Dijkema, J.S., et al., Frequency and Duration of Rhinovirus Infections in Children With Cystic Fibrosis and Healthy Controls: A Longitudinal Cohort Study. Pediatr Infect Dis J, 2016. 35(4): p. 379-83. 139. Schneider, D., et al., Increased cytokine response of rhinovirus-infected airway epithelial cells in chronic obstructive pulmonary disease. Am J Respir Crit Care Med, 2010. 182(3): p. 332-40. 140. Mallia, P., et al., Experimental rhinovirus infection as a human model of chronic obstructive pulmonary disease exacerbation. Am J Respir Crit Care Med, 2011. 183(6): p. 734-42. 141. Johnston, N.W., et al., The September epidemic of asthma exacerbations in children: a search for etiology. J Allergy Clin Immunol, 2005. 115(1): p. 132-8. 142. Wark, P.A., et al., Asthmatic bronchial epithelial cells have a deficient innate immune response to infection with rhinovirus. J Exp Med, 2005. 201(6): p. 937-47. 143. Contoli, M., et al., Role of deficient type III interferon-lambda production in asthma exacerbations. Nat Med, 2006. 12(9): p. 1023-6. 144. Forbes, R.L., et al., Impaired type I and III interferon response to rhinovirus infection during pregnancy and asthma. Thorax, 2012. 67(3): p. 209-14. 145. Sykes, A., et al., Rhinovirus 16-induced IFN-alpha and IFN-beta are deficient in bronchoalveolar lavage cells in asthmatic patients. J Allergy Clin Immunol, 2012. 129(6): p. 1506-1514 e6. 146. Edwards, M.R., et al., Impaired innate interferon induction in severe therapy resistant atopic asthmatic children. Mucosal Immunol, 2013. 6(4): p. 797-806.


147. Sutanto, E.N., et al., Innate inflammatory responses of pediatric cystic fibrosis airway epithelial cells: effects of nonviral and viral stimulation. Am J Respir Cell Mol Biol, 2011. 44(6): p. 761-7.
148. Kieninger, E., et al., Lack of an exaggerated inflammatory response on virus infection in cystic fibrosis. Eur Respir J, 2012. 39(2): p. 297-304.
149. Chattoraj, S.S., et al., Rhinovirus infection liberates planktonic bacteria from biofilm and increases chemokine responses in cystic fibrosis airway epithelial cells. Thorax, 2011. 66(4): p. 333-9.
150. Dauletbaev, N., et al., Rhinovirus Load Is High despite Preserved Interferon-beta Response in Cystic Fibrosis Bronchial Epithelial Cells. PLoS One, 2015. 10(11): p. e0143129.
151. Schogler, A., et al., Novel antiviral properties of azithromycin in cystic fibrosis airway epithelial cells. Eur Respir J, 2015. 45(2): p. 428-39.
152. Chattoraj, S.S., et al., Pseudomonas aeruginosa suppresses interferon response to rhinovirus infection in cystic fibrosis but not in normal bronchial epithelial cells. Infect Immun, 2011. 79(10): p. 4131-45.
153. Stevens, D.A., et al., Allergic bronchopulmonary aspergillosis in cystic fibrosis--state of the art: Cystic Fibrosis Foundation Consensus Conference. Clin Infect Dis, 2003. 37 Suppl 3: p. S225-64.
154. Kraemer, R., et al., Effect of allergic bronchopulmonary aspergillosis on lung function in children with cystic fibrosis. Am J Respir Crit Care Med, 2006. 174(11): p. 1211-20.
155. Vlahakis, N.E. and T.R. Aksamit, Diagnosis and treatment of allergic bronchopulmonary aspergillosis. Mayo Clin Proc, 2001. 76(9): p. 930-8.
156. Judson, M.A. and D.A. Stevens, Current pharmacotherapy of allergic bronchopulmonary aspergillosis. Expert Opin Pharmacother, 2001. 2(7): p. 1065-71.
157. Leon, E.E. and T.J. Craig, Antifungals in the treatment of allergic bronchopulmonary aspergillosis. Ann Allergy Asthma Immunol, 1999. 82(6): p. 511-6; quiz 516-9.
158. Chotirmall, S.H., et al., Sputum Candida albicans presages FEV(1) decline and hospital-treated exacerbations in cystic fibrosis. Chest, 2010. 138(5): p. 1186-95.
159. Cohen-Cymberknoh, M., et al., Airway inflammation in cystic fibrosis: molecular mechanisms and clinical implications. Thorax, 2013. 68(12): p. 1157-62.
160. Cantin, A.M., et al., Inflammation in cystic fibrosis lung disease: Pathogenesis and therapy. J Cyst Fibros, 2015. 14(4): p. 419-30.
161. Khan, T.Z., et al., Early pulmonary inflammation in infants with cystic fibrosis. Am J Respir Crit Care Med, 1995. 151(4): p. 1075-82.
162. Balough, K., et al., The relationship between infection and inflammation in the early stages of lung disease from cystic fibrosis. Pediatr Pulmonol, 1995. 20(2): p. 63-70.
163. Muhlebach, M.S., et al., Quantitation of inflammatory responses to bacteria in young cystic fibrosis and control patients. Am J Respir Crit Care Med, 1999. 160(1): p. 186-91.
164. Collawn, J.F. and S. Matalon, CFTR and lung homeostasis. Am J Physiol Lung Cell Mol Physiol, 2014. 307(12): p. L917-23.
165. Pezzulo, A.A., et al., Reduced airway surface pH impairs bacterial killing in the porcine cystic fibrosis lung. Nature, 2012. 487(7405): p. 109-13.
166. Stutts, M.J., et al., CFTR as a cAMP-dependent regulator of sodium channels. Science, 1995. 269(5225): p. 847-50.
167. Collawn, J.F., et al., The CFTR and ENaC debate: how important is ENaC in CF lung disease? Am J Physiol Lung Cell Mol Physiol, 2012. 302(11): p. L1141-6.


168. Donaldson, S.H. and R.C. Boucher, Sodium channels and cystic fibrosis. Chest, 2007. 132(5): p. 1631-6.
169. Gaillard, E.A., et al., Regulation of the epithelial Na+ channel and airway surface liquid volume by serine proteases. Pflugers Arch, 2010. 460(1): p. 1-17.
170. Weldon, S., et al., Decreased levels of secretory leucoprotease inhibitor in the Pseudomonas-infected cystic fibrosis lung are due to neutrophil elastase degradation. J Immunol, 2009. 183(12): p. 8148-56.
171. Ko, S.B., et al., A molecular mechanism for aberrant CFTR-dependent HCO(3)(-) transport in cystic fibrosis. EMBO J, 2002. 21(21): p. 5662-72.
172. Hirche, T.O., et al., Neutrophil elastase mediates innate host protection against Pseudomonas aeruginosa. J Immunol, 2008. 181(7): p. 4945-54.
173. Nauseef, W.M. and N. Borregaard, Neutrophils at work. Nat Immunol, 2014. 15(7): p. 602-11.
174. Korkmaz, B., et al., Neutrophil elastase, proteinase 3, and cathepsin G as therapeutic targets in human diseases. Pharmacol Rev, 2010. 62(4): p. 726-59.
175. Perera, N.C., et al., NSP4, an elastase-related protease in human neutrophils with arginine specificity. Proc Natl Acad Sci U S A, 2012. 109(16): p. 6229-34.
176. Sly, P.D., et al., Risk factors for bronchiectasis in children with cystic fibrosis. N Engl J Med, 2013. 368(21): p. 1963-70.
177. Gehrig, S., et al., Lack of neutrophil elastase reduces inflammation, mucus hypersecretion, and emphysema, but not mucus obstruction, in mice with cystic fibrosis-like lung disease. Am J Respir Crit Care Med, 2014. 189(9): p. 1082-92.
178. Zackular, J.P., W.J. Chazin, and E.P. Skaar, Nutritional Immunity: S100 Proteins at the Host-Pathogen Interface. J Biol Chem, 2015. 290(31): p. 18991-8.
179. Ryckman, C., et al., Proinflammatory activities of S100: proteins S100A8, S100A9, and S100A8/A9 induce neutrophil chemotaxis and adhesion. J Immunol, 2003. 170(6): p. 3233-42.
180. Vaos, G., et al., The role of calprotectin in pediatric disease. Biomed Res Int, 2013. 2013: p. 542363.
181. Lorenz, E., et al., Different expression ratio of S100A8/A9 and S100A12 in acute and chronic lung diseases. Respir Med, 2008. 102(4): p. 567-73.
182. Kang, J.H., S.M. Hwang, and I.Y. Chung, S100A8, S100A9 and S100A12 activate airway epithelial cells to produce MUC5AC via extracellular signal-regulated kinase and nuclear factor-kappaB pathways. Immunology, 2015. 144(1): p. 79-90.
183. Hector, A., et al., Regulatory T-cell impairment in cystic fibrosis patients with chronic pseudomonas infection. Am J Respir Crit Care Med, 2015. 191(8): p. 914-23.
184. Konstan, M.W., et al., Effect of high-dose ibuprofen in patients with cystic fibrosis. N Engl J Med, 1995. 332(13): p. 848-54.
185. Lands, L.C. and S. Stanojevic, Oral non-steroidal anti-inflammatory drug therapy for cystic fibrosis. Cochrane Database Syst Rev, 2007(4): p. CD001505.
186. Oermann, C.M., M.M. Sockrider, and M.W. Konstan, The use of anti-inflammatory medications in cystic fibrosis: trends and physician attitudes. Chest, 1999. 115(4): p. 1053-8.
187. Flume, P.A., et al., Cystic fibrosis pulmonary guidelines: chronic medications for maintenance of lung health. Am J Respir Crit Care Med, 2007. 176(10): p. 957-69.
188. Konstan, M.W., Ibuprofen therapy for cystic fibrosis lung disease: revisited. Curr Opin Pulm Med, 2008. 14(6): p. 567-73.


189. Lands, L.C. and S. Stanojevic, Oral non-steroidal anti-inflammatory drug therapy for lung disease in cystic fibrosis. Cochrane Database Syst Rev, 2016. 4: p. CD001505.
190. Auerbach, H.S., et al., Alternate-day prednisone reduces morbidity and improves pulmonary function in cystic fibrosis. Lancet, 1985. 2(8457): p. 686-8.
191. Eigen, H., et al., A multicenter study of alternate-day prednisone therapy in patients with cystic fibrosis. Cystic Fibrosis Foundation Prednisone Trial Group. J Pediatr, 1995. 126(4): p. 515-23.
192. Lai, H.C., et al., Risk of persistent growth impairment after alternate-day prednisone treatment in children with cystic fibrosis. N Engl J Med, 2000. 342(12): p. 851-9.
193. Equi, A., et al., Long term azithromycin in children with cystic fibrosis: a randomised, placebo-controlled crossover trial. Lancet, 2002. 360(9338): p. 978-84.
194. Nalca, Y., et al., Quorum-sensing antagonistic activities of azithromycin in Pseudomonas aeruginosa PAO1: a global approach. Antimicrob Agents Chemother, 2006. 50(5): p. 1680-8.
195. Ichimiya, T., et al., The influence of azithromycin on the biofilm formation of Pseudomonas aeruginosa in vitro. Chemotherapy, 1996. 42(3): p. 186-91.
196. Cigana, C., et al., Anti-inflammatory effects of azithromycin in cystic fibrosis airway epithelial cells. Biochem Biophys Res Commun, 2006. 350(4): p. 977-82.
197. Oliynyk, I., et al., Azithromycin increases chloride efflux from cystic fibrosis airway epithelial cells. Exp Lung Res, 2009. 35(3): p. 210-21.
198. Saint-Criq, V., et al., Restoration of chloride efflux by azithromycin in airway epithelial cells of cystic fibrosis patients. Antimicrob Agents Chemother, 2011. 55(4): p. 1792-3.
199. Elsbach, P. and J. Weiss, Role of the bactericidal/permeability-increasing protein in host defence. Curr Opin Immunol, 1998. 10(1): p. 45-9.
200. Fenton, M.J. and D.T. Golenbock, LPS-binding proteins and receptors. J Leukoc Biol, 1998. 64(1): p. 25-32.
201. Kopec, K.O., V. Alva, and A.N. Lupas, Bioinformatics of the TULIP domain superfamily. Biochem Soc Trans, 2011. 39(4): p. 1033-8.
202. Bingle, C.D. and C.J. Craven, PLUNC: a novel family of candidate host defence proteins expressed in the upper airways and nasopharynx. Hum Mol Genet, 2002. 11(8): p. 937-43.
203. Bingle, C.D., L. Bingle, and C.J. Craven, Distant cousins: genomic and sequence diversity within the BPI fold-containing (BPIF)/PLUNC protein family. Biochem Soc Trans, 2011. 39(4): p. 961-5.
204. Bingle, L., et al., Differential epithelial expression of the putative innate immune molecule SPLUNC1 in cystic fibrosis. Respir Res, 2007. 8: p. 79.
205. Bingle, L., et al., SPLUNC1 (PLUNC) is expressed in glandular tissues of the respiratory tract and in lung tumours with a glandular phenotype. J Pathol, 2005. 205(4): p. 491-7.
206. Bingle, C.D., et al., Human LPLUNC1 is a secreted product of goblet cells and minor glands of the respiratory and upper aerodigestive tracts. Histochem Cell Biol, 2010. 133(5): p. 505-15.
207. Weston, W.M., et al., Differential display identification of plunc, a novel gene expressed in embryonic palate, nasal epithelium, and adult lung. J Biol Chem, 1999. 274(19): p. 13698-703.
208. Campos, M.A., et al., Purification and characterization of PLUNC from human tracheobronchial secretions. Am J Respir Cell Mol Biol, 2004. 30(2): p. 184-92.
209. Ghafouri, B., et al., PLUNC (palate, lung and nasal epithelial clone) proteins in human nasal lavage fluid. Biochem Soc Trans, 2003. 31(Pt 4): p. 810-4.
210. Ghafouri, B., et al., PLUNC in human nasal lavage fluid: multiple isoforms that bind to lipopolysaccharide. Biochim Biophys Acta, 2004. 1699(1-2): p. 57-63.


211. Ghafouri, B., et al., Comparative proteomics of nasal fluid in seasonal allergic rhinitis. J Proteome Res, 2006. 5(2): p. 330-8.
212. Di, Y.P., et al., Molecular cloning and characterization of spurt, a human novel gene that is retinoic acid-inducible and encodes a secretory protein specific in upper respiratory tracts. J Biol Chem, 2003. 278(2): p. 1165-73.
213. LeClair, E.E., et al., Genomic organization of the mouse plunc gene and expression in the developing airways and thymus. Biochem Biophys Res Commun, 2001. 284(3): p. 792-7.
214. Lindahl, M., B. Stahlbom, and C. Tagesson, Identification of a new potential airway irritation marker, palate lung nasal epithelial clone protein, in human nasal lavage fluid with two-dimensional electrophoresis and matrix-assisted laser desorption/ionization-time of flight. Electrophoresis, 2001. 22(9): p. 1795-800.
215. Ghafouri, B., et al., Newly identified proteins in human nasal lavage fluid from non-smokers and smokers using two-dimensional gel electrophoresis and peptide mass fingerprinting. Proteomics, 2002. 2(1): p. 112-20.
216. Sung, Y.K., et al., Plunc, a member of the secretory gland protein family, is up-regulated in nasal respiratory epithelium after olfactory bulbectomy. J Biol Chem, 2002. 277(15): p. 12762-9.
217. Debat, H., et al., Identification of human olfactory cleft mucus proteins using proteomic analysis. J Proteome Res, 2007. 6(5): p. 1985-96.
218. Nicholas, B., et al., Shotgun proteomic analysis of human-induced sputum. Proteomics, 2006. 6(15): p. 4390-401.
219. Scheetz, T.E., et al., Large-scale gene discovery in human airway epithelia reveals novel transcripts. Physiol Genomics, 2004. 17(1): p. 69-77.
220. Roxo-Rosa, M., et al., Proteomic analysis of nasal cells from cystic fibrosis patients and non-cystic fibrosis control individuals: search for novel biomarkers of cystic fibrosis lung disease. Proteomics, 2006. 6(7): p. 2314-25.
221. Chu, H.W., et al., Function and regulation of SPLUNC1 protein in Mycoplasma infection and allergic inflammation. J Immunol, 2007. 179(6): p. 3995-4002.
222. Zhou, H.D., et al., Effect of SPLUNC1 protein on the Pseudomonas aeruginosa and Epstein-Barr virus. Mol Cell Biochem, 2008. 309(1-2): p. 191-7.
223. Liu, Y., et al., Increased susceptibility to pulmonary Pseudomonas infection in Splunc1 knockout mice. J Immunol, 2013. 191(8): p. 4259-68.
224. Gally, F., et al., SPLUNC1 promotes lung innate defense against Mycoplasma pneumoniae infection in mice. Am J Pathol, 2011. 178(5): p. 2159-67.
225. Liu, Y., et al., SPLUNC1/BPIFA1 contributes to pulmonary host defense against Klebsiella pneumoniae respiratory infection. Am J Pathol, 2013. 182(5): p. 1519-31.
226. Sayeed, S., et al., Multifunctional role of human SPLUNC1 in Pseudomonas aeruginosa infection. Infect Immun, 2013. 81(1): p. 285-91.
227. Lukinskiene, L., et al., Antimicrobial activity of PLUNC protects against Pseudomonas aeruginosa infection. J Immunol, 2011. 187(1): p. 382-90.
228. Gakhar, L., et al., PLUNC is a novel airway surfactant protein with anti-biofilm activity. PLoS One, 2010. 5(2): p. e9098.
229. Wright, P.L., et al., Epithelial reticulon 4B (Nogo-B) is an endogenous regulator of Th2-driven lung inflammation. J Exp Med, 2010. 207(12): p. 2595-607.
230. Thaikoottathil, J.V., et al., SPLUNC1 deficiency enhances airway eosinophilic inflammation in mice. Am J Respir Cell Mol Biol, 2012. 47(2): p. 253-60.


231. Di, Y.P., et al., Dual acute proinflammatory and antifibrotic pulmonary effects of short palate, lung, and nasal epithelium clone-1 after exposure to carbon nanotubes. Am J Respir Cell Mol Biol, 2013. 49(5): p. 759-67.
232. Chen, P., et al., SPLUNC1 regulates cell progression and apoptosis through the miR-141-PTEN/p27 pathway, but is hindered by LMP1. PLoS One, 2013. 8(3): p. e56929.
233. Garcia-Caballero, A., et al., SPLUNC1 regulates airway surface liquid volume by protecting ENaC from proteolytic cleavage. Proc Natl Acad Sci U S A, 2009. 106(27): p. 11412-7.
234. Garland, A.L., et al., Molecular basis for pH-dependent mucosal dehydration in cystic fibrosis airways. Proc Natl Acad Sci U S A, 2013. 110(40): p. 15973-8.
235. Tarran, R. and M.R. Redinbo, Mammalian short palate lung and nasal epithelial clone 1 (SPLUNC1) in pH-dependent airway hydration. Int J Biochem Cell Biol, 2014. 52: p. 130-5.
236. Rollins, B.M., et al., SPLUNC1 expression reduces surface levels of the epithelial sodium channel (ENaC) in Xenopus laevis oocytes. Channels (Austin), 2010. 4(4): p. 255-9.
237. Ramachandran, P., et al., Comparison of N-linked Glycoproteins in Human Whole Saliva, Parotid, Submandibular, and Sublingual Glandular Secretions Identified using Hydrazide Chemistry and Mass Spectrometry. Clin Proteomics, 2008. 4(3-4): p. 80-104.
238. Larocque, R.C., et al., A variant in long palate, lung and nasal epithelium clone 1 is associated with cholera in a Bangladeshi population. Genes Immun, 2009. 10(3): p. 267-72.
239. Shin, O.S., et al., LPLUNC1 modulates innate immune responses to Vibrio cholerae. J Infect Dis, 2011. 204(9): p. 1349-57.
240. Shum, A.K., et al., BPIFB1 is a lung-specific autoantigen associated with interstitial lung disease. Sci Transl Med, 2013. 5(206): p. 206ra139.
241. Bingle, L., et al., BPIFB1 (LPLUNC1) is upregulated in cystic fibrosis lung disease. Histochem Cell Biol, 2012. 138(5): p. 749-58.
242. Li, J. and L. Ji, Adjusting multiple testing in multilocus analyses using the eigenvalues of a correlation matrix. Heredity (Edinb), 2005. 95(3): p. 221-7.
243. Nyholt, D.R., A simple correction for multiple testing for single-nucleotide polymorphisms in linkage disequilibrium with each other. Am J Hum Genet, 2004. 74(4): p. 765-9.
244. Hao, K., et al., Lung eQTLs to help reveal the molecular underpinnings of asthma. PLoS Genet, 2012. 8(11): p. e1003029.
245. Ng, P.C. and S. Henikoff, Predicting deleterious amino acid substitutions. Genome Res, 2001. 11(5): p. 863-74.
246. Sunyaev, S., et al., Prediction of deleterious human alleles. Hum Mol Genet, 2001. 10(6): p. 591-7.
247. Abecasis, G.R., et al., An integrated map of genetic variation from 1,092 human genomes. Nature, 2012. 491(7422): p. 56-65.
248. IARC, Common Minimum Technical Standards and Protocols for Biological Resource Centres dedicated to Cancer Research, E.P. Caboux, A. Hainaut, P. IARC Working Group Reports Editor. 2007, World Health Organization International Agency for Research on Cancer. p. 48.
249. He, J.Q., et al., Selection of housekeeping genes for real-time PCR in atopic human bronchial epithelial cells. Eur Respir J, 2008. 32(3): p. 755-62.
250. Livak, K.J. and T.D. Schmittgen, Analysis of relative gene expression data using real-time quantitative PCR and the 2^(-ΔΔCT) method. Methods, 2001. 25(4): p. 402-8.
251. Ogilvie, V., et al., Differential global gene expression in cystic fibrosis nasal and bronchial epithelium. Genomics, 2011. 98(5): p. 327-36.


252. Kohlgraf, K.G., et al., Quantitation of SPLUNC1 in saliva with an xMAP particle-based antibody capture and detection immunoassay. Arch Oral Biol, 2012. 57(2): p. 197-204.
253. Vitorino, R., et al., Identification of human whole saliva protein components using proteomics. Proteomics, 2004. 4(4): p. 1109-15.
254. Hobbs, C.A., et al., Identification of the SPLUNC1 ENaC-inhibitory domain yields novel strategies to treat sodium hyperabsorption in cystic fibrosis airway epithelial cultures. Am J Physiol Lung Cell Mol Physiol, 2013. 305(12): p. L990-L1001.
255. Hobbs, C.A., et al., Identification of SPLUNC1's ENaC-inhibitory domain yields novel strategies to treat sodium hyperabsorption in cystic fibrosis airways. FASEB J, 2012. 26(10): p. 4348-59.
256. Soler Artigas, M., et al., Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nat Genet, 2011. 43(11): p. 1082-90.
257. Beck, T., et al., GWAS Central: a comprehensive resource for the comparison and interrogation of genome-wide association studies. Eur J Hum Genet, 2014. 22(7): p. 949-52.
258. He, Y., et al., Association of PLUNC gene polymorphisms with susceptibility to nasopharyngeal carcinoma in a Chinese population. J Med Genet, 2005. 42(2): p. 172-6.
259. Yew, P.Y., et al., Identification of a functional variant in SPLUNC1 associated with nasopharyngeal carcinoma susceptibility among Malaysian Chinese. Mol Carcinog, 2012. 51 Suppl 1: p. E74-82.
260. Bartlett, J.A., et al., PLUNC is a secreted product of neutrophil granules. J Leukoc Biol, 2008. 83(5): p. 1201-6.
261. Andrews, S., FastQC: a quality control tool for high throughput sequence data. 2010.
262. Martin, M., Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal, 2011. 11(1): p. 10-12.
263. Kim, D., et al., TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol, 2013. 14(4): p. R36.
264. Langmead, B. and S.L. Salzberg, Fast gapped-read alignment with Bowtie 2. Nat Methods, 2012. 9(4): p. 357-9.
265. Li, H., et al., The Sequence Alignment/Map format and SAMtools. Bioinformatics, 2009. 25(16): p. 2078-9.
266. Love, M.I., W. Huber, and S. Anders, Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biol, 2014. 15(12): p. 550.
267. Ritchie, M.E., et al., limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res, 2015. 43(7): p. e47.
268. Foroushani, A.B., F.S. Brinkman, and D.J. Lynn, Pathway-GPS and SIGORA: identifying relevant pathways based on the over-representation of their gene-pair signatures. PeerJ, 2013. 1: p. e229.
269. Xia, J., E.E. Gill, and R.E. Hancock, NetworkAnalyst for statistical, visual and network-based meta-analysis of gene expression data. Nat Protoc, 2015. 10(6): p. 823-44.
270. Wacker, M., et al., N-linked glycosylation in Campylobacter jejuni and its functional transfer into E. coli. Science, 2002. 298(5599): p. 1790-3.
271. Saferali, A., et al., Polymorphisms associated with expression of BPIFA1/BPIFB1 and lung disease severity in cystic fibrosis. Am J Respir Cell Mol Biol, 2015. 53(5): p. 607-14.
272. Ridley, A.J., Rho GTPase signalling in cell migration. Curr Opin Cell Biol, 2015. 36: p. 103-12.
273. Gambardella, L. and S. Vermeren, Molecular players in neutrophil chemotaxis--focus on PI3K and small GTPases. J Leukoc Biol, 2013. 94(4): p. 603-12.


274. Akram K., M.N., Tompkins M, Tripp R, Bingle L, Stewart J, Bingle C, An innate defence role for BPIFA1/SPLUNC1 against influenza-A infection. European Respiratory Journal, 2015. 46(S59).
275. Blaas, D. and R. Fuchs, Mechanism of human rhinovirus infections. Mol Cell Pediatr, 2016. 3(1): p. 21.
276. Kennedy, J.L., et al., Pathogenesis of rhinovirus infection. Curr Opin Virol, 2012. 2(3): p. 287-93.
277. Triantafilou, K., et al., Human rhinovirus recognition in non-immune cells is mediated by Toll-like receptors and MDA-5, which trigger a synergetic pro-inflammatory immune response. Virulence, 2011. 2(1): p. 22-9.
278. Slater, L., et al., Co-ordinated role of TLR3, RIG-I and MDA5 in the innate response to rhinovirus in bronchial epithelium. PLoS Pathog, 2010. 6(11): p. e1001178.
279. Kicic, A., et al., Decreased fibronectin production significantly contributes to dysregulated repair of asthmatic epithelium. Am J Respir Crit Care Med, 2010. 181(9): p. 889-98.
280. Kicic, A., et al., Intrinsic biochemical and functional differences in bronchial epithelial cells of children with asthma. Am J Respir Crit Care Med, 2006. 174(10): p. 1110-8.
281. Lane, C., et al., The use of non-bronchoscopic brushings to study the paediatric airway. Respir Res, 2005. 6: p. 53.
282. Banerjee, B., et al., The airway epithelium is a direct source of matrix degrading enzymes in bronchiolitis obliterans syndrome. J Heart Lung Transplant, 2011. 30(10): p. 1175-85.
283. Dexheimer, P.S., J., Bam2fastq.
284. Schogler, A., et al., Interferon response of the cystic fibrosis bronchial epithelium to major and minor group rhinovirus infection. J Cyst Fibros, 2016. 15(3): p. 332-9.
285. Kim, T.K., et al., A systems approach to understanding human rhinovirus and influenza virus infection. Virology, 2015. 486: p. 146-57.
286. Chen, Y., et al., Rhinovirus induces airway epithelial gene expression through double-stranded RNA and IFN-dependent pathways. Am J Respir Cell Mol Biol, 2006. 34(2): p. 192-203.
287. Caliskan, M., et al., Host genetic variation influences gene expression response to rhinovirus infection. PLoS Genet, 2015. 11(4): p. e1005111.
288. Proud, D., et al., Gene expression profiles during in vivo human rhinovirus infection: insights into the host response. Am J Respir Crit Care Med, 2008. 178(9): p. 962-8.
289. Bai, J., et al., Phenotypic responses of differentiated asthmatic human airway epithelial cultures to rhinovirus. PLoS One, 2015. 10(2): p. e0118286.
290. Sykes, A., et al., Rhinovirus-induced interferon production is not deficient in well controlled asthma. Thorax, 2014. 69(3): p. 240-6.
291. Wark, P.A., et al., Diversity in the bronchial epithelial cell response to infection with different rhinovirus strains. Respirology, 2009. 14(2): p. 180-6.
292. Arnberg, N., et al., Adenovirus type 37 uses sialic acid as a cellular receptor. J Virol, 2000. 74(1): p. 42-8.
293. Burkard, C., et al., Dissecting virus entry: replication-independent analysis of virus binding, internalization, and penetration using minimal complementation of beta-galactosidase. PLoS One, 2014. 9(7): p. e101762.


APPENDIX

Table A.1: All SNPs in the region 10 kb upstream of BPIFA1 to 10 kb downstream of BPIFB1 which have been genotyped or imputed in the genome-wide association study for lung disease severity in CF.

| SNP | CF GWAS p-value¹ | MAF² | Risk allele³ | LD with rs1078761 (r²)⁴ | eQTL p-value⁵ (BPIFA1) | eQTL p-value⁵ (BPIFB1) | eQTL direction of effect⁶ (BPIFA1) | eQTL direction of effect⁶ (BPIFB1) | SNP location⁷ | Genotyped/Imputed⁸ (GWAS) | Genotyped/Imputed⁸ (eQTL) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| rs1078761 | 2.71×10⁻⁴ | 0.267 | G | - | 4.08×10⁻¹⁵ | 0.0314 | - | - | Exon 3 of BPIFB1 | Genotyped | Genotyped |
| rs2295576 | 8.96×10⁻⁴ | 0.271 | C | 0.857 | 4.25×10⁻¹⁵ | 0.0339 | - | - | Intron 2 of BPIFB1 | Imputed | Imputed |
| rs113915824 | 0.008 | 0.153 | A | 0.257 | 2.01×10⁻¹³ | 0.0070 | - | - | Intergenic | Imputed | Imputed |
| rs3827028 | 0.008 | 0.276 | G | 0.892 | 1.35×10⁻¹⁴ | 0.0418 | - | - | Intron 2 of BPIFB1 | Imputed | Imputed |
| rs1884880 | 0.01 | 0.277 | T | 0.892 | 8.76×10⁻¹⁵ | 0.0324 | - | - | Intergenic | Imputed | Imputed |
| rs11167201 | 0.011 | 0.219 | A | 0.607 | 1.67×10⁻¹³ | 0.0681 | - | - | Intergenic | Imputed | Imputed |
| rs6057802 | 0.013 | 0.22 | C | 0.636 | 1.53×10⁻¹³ | 0.0670 | - | - | Intergenic | Imputed | Imputed |
| rs4911310 | 0.014 | 0.277 | G | 0.892 | 1.19×10⁻¹⁴ | 0.0281 | - | - | Intergenic | Imputed | Imputed |
| rs4911309 | 0.015 | 0.275 | G | 0.964 | 1.53×10⁻¹⁴ | 0.0327 | - | - | Intergenic | Imputed | Imputed |
| rs1570033 | 0.016 | 0.469 | A | 0.189 | 2.16×10⁻²² | 0.3257 | - | - | Intron 4 of BPIFA3 | Imputed | Genotyped |
| rs1321423 | 0.016 | 0.27 | G | 0.892 | 2.20×10⁻¹⁴ | 0.0381 | - | - | Intergenic | Imputed | Imputed |
| rs1964852 | 0.016 | 0.275 | T | 0.684 | 1.69×10⁻¹⁴ | 0.0310 | - | - | Intergenic | Imputed | Imputed |
| rs6119362 | 0.016 | 0.275 | A | 0.964 | 1.56×10⁻¹⁴ | 0.0539 | - | - | Intron 2 of BPIFB1 | Imputed | Imputed |
| rs982561 | 0.017 | 0.27 | T | 0.892 | 2.24×10⁻¹⁴ | 0.0380 | - | - | Intergenic | Imputed | Imputed |
| rs1884882 | 0.017 | 0.275 | C | 0.892 | 1.56×10⁻¹⁴ | 0.0338 | - | - | Intergenic | Imputed | Imputed |
| rs6057806 | 0.017 | 0.277 | T | 0.892 | 1.04×10⁻¹⁴ | 0.0290 | - | - | Intergenic | Imputed | Imputed |
| rs941682 | 0.017 | 0.276 | G | 0.964 | 1.15×10⁻¹⁴ | 0.0329 | - | - | Intergenic | Imputed | Imputed |
| rs982562 | 0.018 | 0.275 | C | 0.892 | 1.35×10⁻¹⁴ | 0.0333 | - | - | Intergenic | Imputed | Imputed |
| rs6059230 | 0.018 | 0.276 | T | 0.928 | 1.16×10⁻¹⁴ | 0.0326 | - | - | Intergenic | Imputed | Imputed |
| rs982563 | 0.019 | 0.22 | C | 0.636 | 1.31×10⁻¹³ | 0.0651 | - | - | Intergenic | Imputed | Imputed |
| rs13037711 | 0.019 | 0.277 | G | 0.892 | 8.68×10⁻¹⁰ | 0.0283 | - | - | Intergenic | Imputed | Imputed |
| rs13037156 | 0.019 | 0.22 | T | 0.636 | 1.49×10⁻¹³ | 0.0645 | - | - | Intergenic | Imputed | Imputed |
| rs1884881 | 0.019 | 0.221 | G | 0.636 | 8.59×10⁻¹⁴ | 0.0541 | - | - | Intergenic | Imputed | Imputed |
| rs2275082 | 0.02 | 0.048 | T | 0.115 | 0.0728 | 0.1566 | - | - | Intron 9 of BPIFB1 | Imputed | Imputed |
| rs911138 | 0.022 | 0.221 | T | 0.636 | 1.09×10⁻¹³ | 0.0624 | - | - | Intergenic | Imputed | Imputed |
| rs75123718 | 0.022 | 0.006 | A | 0.075 | * | * | | | Intron 7 of BPIFB1 | Imputed | NA |
| rs6057816 | 0.023 | 0.46 | G | 0.245 | 2.42×10⁻¹⁵ | 0.2889 | + | + | Intron 6 of BPIFB1 | Imputed | Imputed |
| rs1321421 | 0.023 | 0.277 | A | 0.892 | 9.45×10⁻¹⁵ | 0.0323 | - | - | Intergenic | Imputed | Imputed |
| rs6120204 | 0.024 | 0.277 | C | 0.583 | 8.84×10⁻¹⁵ | 0.0297 | - | - | Intergenic | Imputed | Imputed |
| rs6059212 | 0.025 | 0.469 | G | 0.287 | 3.61×10⁻²⁶ | 0.2819 | - | - | Intergenic | Imputed | Imputed |
| rs4911307 | 0.026 | 0.473 | C | 0.287 | 9.81×10⁻²⁶ | 0.3008 | - | - | Intergenic | Imputed | Imputed |
| rs6059217 | 0.026 | 0.473 | C | 0.287 | 1.19×10⁻²⁵ | 0.3046 | - | - | Intergenic | Imputed | Imputed |
| rs6057777 | 0.027 | 0.477 | C | 0.260 | 1.32×10⁻²⁷ | 0.2396 | - | - | Intergenic | Imputed | Imputed |
| rs6059187 | 0.027 | 0.476 | G | 0.260 | 1.31×10⁻²⁷ | 0.2403 | - | - | Intron 5 of BPIFA1 | Imputed | Genotyped |
| rs750064 | 0.027 | 0.477 | C | 0.274 | 1.31×10⁻²⁷ | 0.2403 | - | - | 5' region of BPIFA1 | Imputed | Genotyped |
| rs6141909 | 0.027 | 0.217 | T | 0.635 | 3.92×10⁻¹³ | 0.0732 | - | - | Intergenic | Imputed | Imputed |
| rs3787144 | 0.028 | 0.477 | G | 0.274 | 1.25×10⁻²⁷ | 0.2369 | - | - | Intron 1 of BPIFA1 | Imputed | Imputed |
| rs1884883 | 0.028 | 0.217 | C | 0.665 | 3.66×10⁻¹³ | 0.0793 | - | - | Intergenic | Genotyped | Genotyped |
| rs78841773 | 0.031 | 0.056 | T | 0.151 | 0.0893 | 0.3808 | - | - | Intron 1 of BPIFB1 | Imputed | Genotyped |
| rs1570034 | 0.035 | 0.477 | T | 0.274 | 6.76×10⁻²⁷ | 0.2744 | - | - | Intergenic | Genotyped | Genotyped |
| rs6141908 | 0.036 | 0.216 | G | 0.665 | 4.06×10⁻¹³ | 0.0739 | - | - | Intergenic | Imputed | Imputed |
| rs6057813 | 0.036 | 0.056 | A | 0.151 | 0.0899 | 0.3777 | - | - | Intron 1 of BPIFB1 | Genotyped | Genotyped |
| rs6141379 | 0.038 | 0.218 | T | 0.636 | 2.50×10⁻¹³ | 0.0730 | - | - | Intergenic | Imputed | Imputed |
| rs7263699 | 0.041 | 0.217 | C | 0.665 | 2.48×10⁻¹³ | 0.0733 | - | - | Intergenic | Imputed | Imputed |
| rs4911311 | 0.042 | 0.218 | C | 0.665 | 2.37×10⁻¹³ | 0.0729 | - | - | 5' region of BPIFB1 | Imputed | Imputed |
| rs6141911 | 0.043 | 0.217 | A | 0.665 | 2.54×10⁻¹³ | 0.0736 | - | - | Intergenic | Imputed | Imputed |
| rs7262822 | 0.045 | 0.217 | A | 0.665 | 2.58×10⁻¹³ | 0.0736 | - | - | Intergenic | Genotyped | Genotyped |
| rs75914519 | 0.045 | 0.045 | T | 0.151 | 0.0753 | 0.1691 | - | - | Intron 7 of BPIFB1 | Imputed | Imputed |
| rs2253335 | 0.046 | 0.409 | C | 0.213 | 5.39×10⁻¹² | 0.2767 | - | - | Intron 11 of BPIFB1 | Genotyped | Genotyped |
| rs13045552 | 0.047 | 0.221 | C | 0.607 | 1.07×10⁻¹³ | 0.0629 | - | - | Intergenic | Imputed | Imputed |

¹ P-value for association between genotype and lung disease severity in 2494 CF patients.
² Minor allele frequency.
³ Allele associated with more severe lung disease in CF.
⁴ LD with rs1078761 in the CEU subset of the 1000 Genomes reference population (Pilot 1).
⁵ P-value for association between genotype and gene expression levels using multiple linear regression with adjustment for age, gender and centre.
⁶ Whether the risk allele is associated with increased (+) or decreased (-) gene expression levels.
⁷ 5' region = within 2 kb of the start of transcription.
⁸ Whether genotype information was derived by genotyping or imputation.
* rs75123718 was not imputed in the eQTL study due to low MAF.


Table A.2: SNPs identified in the lung eQTL study as being significantly associated with BPIFA1 gene expression levels at the 10% false discovery rate.

SNP          Distance from   LD (r²) with  P-value²     Allele ↑³  Allele ↓⁴  Location of SNP⁵    Genotyped/Imputed⁶
             rs1078761 (bp)  rs1078761¹
rs3787144    -51,912         0.274         1.25×10^-27  A          G          Intron 1 of BPIFA1  Imputed
rs750064     -53,051         0.274         1.31×10^-27  T          C          5' region of BPIFA1 Genotyped
rs6059187    -48,416         0.26          1.31×10^-27  A          G          Intron 5 of BPIFA1  Genotyped
rs6057777    -56,528         0.26          1.32×10^-27  A          C          Intergenic          Imputed
rs1570034    -37,645         0.274         6.76×10^-27  C          T          Intergenic          Genotyped
rs6059212    -29,458         0.287         3.61×10^-26  A          G          Intergenic          Imputed
rs4911307    -27,223         0.287         9.81×10^-26  T          C          Intergenic          Imputed
rs6059217    -25,754         0.287         1.19×10^-25  T          C          Intergenic          Imputed
rs911139     -83,612         0.117         3.26×10^-24  T          G          Intergenic          Genotyped
rs1570033    -62,333         0.189         2.16×10^-22  G          A          Intron 4 of BPIFA3  Genotyped
rs149380522  -103,326        0.132         2.40×10^-22  A          C          Intergenic          Imputed
rs735625     -83,677         0.045         5.41×10^-21  C          T          Intergenic          Genotyped
rs725914     -65,130         0.045         5.59×10^-21  C          T          Intron 1 of BPIFA3  Genotyped
rs6057753    -88,644         0.034         7.45×10^-21  C          T          Intergenic          Imputed
rs1040795    -96,934         0.034         1.02×10^-20  A          T          Intergenic          Imputed
rs2145254    -104,370        0.109         1.08×10^-19  A          T          Intergenic          Imputed
rs2424966    -146,413        0             2.31×10^-15  G          T          Intergenic          Imputed
rs6057816    4,918           0.245         2.42×10^-15  G          A          Intron 6 of BPIFB1  Imputed
rs1078761    -               -             4.08×10^-15  A          G          Exon 3 of BPIFB1    Genotyped
rs2295576    -196            0.857         4.25×10^-15  T          C          Intron 2 of BPIFB1  Imputed
rs2273529    -47,598         0.598         6.21×10^-15  G          A          Intron 5 of BPIFA1  Imputed
rs927159     -56,892         0.598         6.24×10^-15  C          T          Intergenic          Imputed
rs13037711   -20,834         0.892         8.01×10^-15  A          G          Intergenic          Imputed
rs1884880    -24,397         0.892         8.76×10^-15  C          T          Intergenic          Imputed
rs6120204    -21,554         0.583         8.84×10^-15  T          C          Intergenic          Imputed
rs1321421    -23,094         0.892         9.45×10^-15  G          A          Intergenic          Imputed
rs6057806    -18,799         0.892         1.04×10^-14  C          T          Intergenic          Imputed
rs941682     -8,841          0.964         1.15×10^-14  A          G          Intergenic          Imputed
rs6059230    -9,649          0.928         1.16×10^-14  C          T          Intergenic          Imputed
rs4911310    -15,422         0.892         1.19×10^-14  A          G          Intergenic          Imputed
rs3827028    -2,424          0.892         1.35×10^-14  A          G          Intron 2 of BPIFB1  Imputed
rs982562     -19,353         0.892         1.35×10^-14  T          C          Intergenic          Imputed
rs6141895    -57,670         0.073         1.50×10^-14  A          G          Intergenic          Imputed
rs4911309    -15,709         0.964         1.53×10^-14  A          G          Intergenic          Imputed
rs6119362    -1,812          0.964         1.56×10^-14  C          A          Intron 2 of BPIFB1  Imputed
rs1884882    -17,304         0.892         1.56×10^-14  T          C          Intergenic          Imputed
rs1964852    -14,401         0.684         1.69×10^-14  C          T          Intergenic          Imputed
rs1321423    -19,660         0.892         2.20×10^-14  A          G          Intergenic          Imputed
rs982561     -19,661         0.892         2.24×10^-14  C          T          Intergenic          Imputed
rs2424967    10,737          0.269         3.12×10^-14  T          C          Intron 7 of BPIFB1  Genotyped
rs1321419    -24,673         0.204         3.43×10^-14  T          C          Intergenic          Imputed
rs1884881    -24,201         0.636         8.59×10^-14  C          G          Intergenic          Imputed
rs13045552   -20,855         0.607         1.07×10^-13  G          C          Intergenic          Imputed
rs911138     -23,557         0.636         1.09×10^-13  C          T          Intergenic          Imputed
rs982563     -19,275         0.636         1.31×10^-13  T          C          Intergenic          Imputed
rs13037156   -20,806         0.636         1.49×10^-13  C          T          Intergenic          Imputed
rs6057802    -22,530         0.636         1.53×10^-13  T          C          Intergenic          Imputed
rs11167201   -21,100         0.607         1.67×10^-13  G          A          Intergenic          Imputed
rs6141913    -3,395          0.455         1.69×10^-13  G          A          Intron 1 of BPIFB1  Imputed
rs113915824  -57,685         0.257         2.01×10^-13  G          A          Intergenic          Imputed
rs4911311    -6,591          0.665         2.37×10^-13  G          C          5' region of BPIFB1 Imputed
rs7263699    -8,432          0.665         2.48×10^-13  T          C          Intergenic          Imputed
rs6141379    -22,134         0.636         2.50×10^-13  C          T          Intergenic          Imputed
rs6141911    -9,395          0.665         2.54×10^-13  T          A          Intergenic          Imputed
rs7262822    -10,984         0.665         2.58×10^-13  G          A          Intergenic          Genotyped
rs3746393    -2,860          0.636         3.21×10^-13  C          T          Intron 1 of BPIFB1  Genotyped
rs1884883    -17,286         0.665         3.66×10^-13  T          C          Intergenic          Genotyped
rs6141909    -14,590         0.635         3.92×10^-13  C          T          Intergenic          Imputed
rs6141908    -15,010         0.665         4.06×10^-13  C          G          Intergenic          Imputed
rs2253335    14,322          0.213         5.39×10^-12  T          C          Intron 11 of BPIFB1 Genotyped
rs4911305    -63,547         0.325         2.84×10^-11  C          T          Intron 3 of BPIFA3  Genotyped
rs6059172    -57,666         0.189         3.28×10^-10  A          G          Intergenic          Imputed
rs2145250    1,911           0.558         2.88×10^-09  C          G          Intron 4 of BPIFB1  Imputed
rs6087461    -117,831        0.114         1.12×10^-08  C          G          Intron 2 of BPIFA2  Imputed
rs4911314    9,301           0.527         2.00×10^-08  C          G          Intron 7 of BPIFB1  Imputed
rs4911313    9,279           0.558         2.01×10^-08  A          T          Intron 7 of BPIFB1  Imputed
rs6141382    7,388           0.527         2.33×10^-08  A          C          Intron 6 of BPIFB1  Genotyped
rs4911308    -25,604         0.108         1.21×10^-06  G          A          Intergenic          Imputed
rs6088122    -145,356        0.002         1.85×10^-06  A          C          Intergenic          Imputed

¹ LD with rs1078761 in the CEU subset of the 1000 Genomes reference population (Pilot 1).
² P-value for association between genotype and BPIFA1 expression levels using multiple linear regression with adjustment for age, gender and centre.
³ Allele associated with increased BPIFA1 expression.
⁴ Allele associated with decreased BPIFA1 expression.
⁵ 5' region = within 2 kb of the start of transcript.
⁶ Indicates whether genotype information was derived by genotyping or imputation.
rs750064 is shown in bold as it is the most statistically significant genotyped SNP in the region, and was selected for follow-up.
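The regression named in the P-value footnote can be illustrated with a minimal sketch. It assumes an additive genotype coding (0, 1 or 2 copies of the tested allele) and a single 0/1 indicator for two centres; neither detail is stated in the table notes. The function names `ols` and `genotype_effect` are hypothetical, and the data are synthetic values invented for illustration, not study data.

```python
import math

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.
    Returns (coefficients, standard errors)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    # Invert X'X by Gauss-Jordan elimination with partial pivoting.
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(k)]
           for i, row in enumerate(xtx)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [y[r] - sum(X[r][j] * beta[j] for j in range(k)) for r in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - k)  # residual variance
    se = [math.sqrt(sigma2 * inv[i][i]) for i in range(k)]
    return beta, se

def genotype_effect(dosage, age, sex, centre, expression):
    """Per-allele effect on expression, adjusted for age, sex and centre.
    Returns (effect, standard error, t statistic)."""
    X = [[1.0, g, a, s, c] for g, a, s, c in zip(dosage, age, sex, centre)]
    beta, se = ols(X, expression)
    return beta[1], se[1], beta[1] / se[1]

# Synthetic example (assumption): 12 individuals, two centres.
dosage = [0, 1, 2, 0, 1, 2, 0, 1, 2, 0, 1, 2]   # copies of the tested allele
age    = [20, 25, 31, 36, 22, 28, 40, 19, 33, 27, 38, 24]
sex    = [0, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0]
centre = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
noise  = [0.02, -0.01, 0.01, -0.02, 0.01, -0.01,
          0.02, -0.02, 0.01, -0.01, 0.02, -0.02]
# Expression is built with a true effect of -0.5 per allele copy.
expression = [8.0 - 0.5 * g + 0.01 * a + 0.1 * s + 0.2 * c + e
              for g, a, s, c, e in zip(dosage, age, sex, centre, noise)]

effect, stderr, t_stat = genotype_effect(dosage, age, sex, centre, expression)
print(effect, stderr, t_stat)
```

A negative effect corresponds to the tested allele being the "Allele ↓" of the table. A two-sided P-value would follow from comparing the t statistic with a t distribution on n − 5 degrees of freedom; with more than two centres, one indicator column per additional centre would be needed.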


Table A.3. ENCODE data (http://www.genome.gov/10005107; accessed 06/08/2014) regarding the potential functional impact of SNPs in the BPIFA1/BPIFB1 region that were nominally associated with CF disease severity. The same polymorphisms are shown as in Supplementary Table 1.

Columns: SNP; Conserved region; Promoter histone marks¹; Enhancer histone marks²; DNase I hypersensitive sites³; Proteins bound⁴; Motifs changed⁵
Each row gives the SNP followed by its non-empty cells in column order.

rs1078761; Urothelial cells, Caco-2 (colorectal adenocarcinoma); CTCF, HIF1
rs2295576; Prostate adenocarcinoma, child fibroblasts; AP-2, BDP1, Egr-1, Pax-5, Rad21, TCF12, ZBTB7A, Znf143
rs113915824; EWSR1-FLI1, GATA, HDAC2
rs3827028; B-lymphoblastoid cell line; Zbtb3, p300
rs1884880; Prostate epithelial cell line
rs11167201; Foxa, GR, Pbx-1, STAT
rs6057802; Znf143
rs4911310; Yes; Hepatocellular carcinoma; Hepatocellular carcinoma, mammary gland adenocarcinoma; Arid5a, Foxa, HDAC2, TCF11::MafG, TCF12
rs4911309; Hepatocellular carcinoma; Fibroblasts (Parkinson's disease), epidermal melanocytes, fibroblasts (Hutchinson-Gilford progeria syndrome), urothelial cells; Pax-5
rs1570033; Child fibroblasts; SZF1-1, Spz1
rs1321423; Hepatocellular carcinoma; CD34+ hematopoietic progenitor cells, CD14+ monocytes; BDP1, Ehf, Elf3, Gfi1, NRSF, PU.1, Sin3Ak-20, VDR, p300
rs1964852; Hepatocellular carcinoma; HNF4
rs6119362; Pancreas adenocarcinoma, primary hepatocytes; SP2
rs982561; Hepatocellular carcinoma; CD34+ hematopoietic progenitor cells, CD14+ monocytes; BDP1, Ehf, PU.1, Sin3Ak-20
rs1884882; Primary Th1 T cells
rs6057806
rs941682; Hepatocellular carcinoma; CD4+ cells enriched for Th0 populations; TCF4
rs982562; Hepatocellular carcinoma; Pdx1
rs6059230; Hepatocellular carcinoma; Osteoblasts, CD4+ cells enriched for Th0 populations; Foxa, GR, Nanog, Sin3Ak-20
rs982563; Foxa, Pou1f1, Pou2f2, Pou3f2, Pou3f3, Pou5f1, Zfp187
rs13037711; COMP1
rs13037156; Hand1, Smad3, Znf143
rs1884881
rs2275082; Primary Th1 T cells; Eomes, Zfp410
rs911138
rs75123718; CTCF, EBF, NRSF, Roaz, Sin3Ak-20
rs6057816; GATA, HEN1, KAP1, PPAR, Pax-5
rs1321421; Gfi1
rs6120204
rs6059212; Huvec, K562; Evi-1, GATA, HDAC2, HP1-site-factor, Hoxb9, Ik-2, Pou2f2, SPIB, TATA
rs4911307; Urothelial cells, Undifferentiated embryonic stem cells; USF1, YY1; AP-4, E2A, LBP-1, Rad21, TAL1, TCF12
rs6059217; Foxj1
rs6057777; LUN-1
rs6059187
rs750064; Yes; PanIslets; AIRE, Evi-1, Foxj2
rs6141909; Hepatocellular carcinoma; TCF12
rs3787144; THAP1
rs1884883; Primary Th1 T cells; Nkx6-1
rs78841773; Prostate adenocarcinoma, osteoblasts, Caco-2 (colorectal adenocarcinoma); AP-2, Duxl
rs1570034
rs6141908; Hepatocellular carcinoma; Primary Th1 T cells, mammary gland adenocarcinoma, urothelial cells, hippocampal astrocytes; FOSL2
rs6057813; Hepatocellular carcinoma; Prostate adenocarcinoma, mammary gland adenocarcinoma, urothelial cells; BCL, DMRT4, Irf, PU.1, Pax-5, RXRA, p300
rs6141379; Esr2, GR, HNF4, Hdx, Pbx3
rs7263699; Hepatocellular carcinoma; Primary Th1 T cells
rs4911311; Hepatocellular carcinoma; USF1; Foxa, VDR
rs6141911; Hepatocellular carcinoma; Endometrial adenocarcinoma, neuroblastoma, hippocampal astrocytes, cerebellar astrocytes, renal cortical epithelial cells, renal epithelial cells, renal proximal tubule epithelial cells; BCL, EBF, ERalpha-a, NRSF
rs7262822; H1 embryonic stem cells, Hepatocellular carcinoma; A549 (lung carcinoma), skeletal muscle myoblasts, hepatocellular carcinoma, retinoblastoma; FOXA1, HNF4A, P300; Arid5b
rs75914519; BHLHE40, Rad21
rs2253335; BDP1, ERalpha-a, Ets, GATA, GR, HNF4, PU.1, SZF1-1, Zfp691
rs13045552; CTCF, TFII-I, YY1

¹ Polymorphisms in a region showing histone modifications characteristic of active promoters. The entry indicates the cell type that these modifications were identified in.
² Polymorphisms in a region showing histone modifications characteristic of active enhancers. The entry indicates the cell type that these modifications were identified in.
³ Polymorphisms in regions that are sensitive to cleavage by DNase I that are characteristic of regulatory regions such as enhancers, silencers, promoters, insulators and locus control regions. The entry indicates the cell type that the hypersensitive sites were identified in.
⁴ Polymorphisms in sequences found to bind to protein using chromatin immunoprecipitation. The identity of the bound protein is shown.
⁵ Polymorphisms that are predicted to alter binding sites for transcription factors.
