TRANSCRIPTOMICS: ADVANCING HYPERTENSION PHARMACOGENOMICS

By

ANA CAROLINE COSTA SÁ

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2017

© 2017 Ana Caroline Costa Sá

To my precious family, my mother Sônia Maria Costa Sá, my father Raimundo Delmar de Sá, my two siblings Marcus and Ana Paula, my grandparents, and my husband Roque

ACKNOWLEDGMENTS

My deepest gratitude and appreciation go to my mentor, Dr. Julie Johnson, for her mentorship, training, guidance, help and support. Over the past four years, she has inspired and enriched my growth as a young scientist. During the most challenging moments of my PhD project, she has been a constant oasis of ideas and great passion for science. I have no doubt that I am indebted to her more than she knows, and will be for the rest of my career. Further, I would like to thank Dr. Yan Gong, Dr. Matias Kirst, Dr. Marta Wayne, and Dr. Somnath Datta for serving on my committee and for their valuable advice, guidance, encouragement and sincere help throughout this work. My absolute gratitude is also extended to Dr. José Paulo Leite, who provided me with great guidance and support before I joined Dr. Johnson’s lab. He is indeed one of the great researchers who significantly influenced my character and shaped my career path.

I would like to express my sincere gratitude to Dr. Rhonda Cooper-DeHoff, Dr. Caitrin McDonough, Dr. Taimour Langaee, Dr. Larisa Cavallari and Dr. Reggie Frye for their scientific guidance, valuable advice, and continuous support during my PhD. As a Genetics & Genomics PhD student, I have had the luxury of finding a home both in the Genetics Institute and in the Department of Pharmacotherapy and Translational Research at the College of Pharmacy. I appreciate the confidence in me and the support provided by Dr. Wilfred Vermerris, Dr. Jorg Bungert, Dr. Connie Mulligan, Dr. Patrick Concannon and Hope Parmeter. Special thanks to Dr. Mohamed Shahin, Dr. Issam Hamadeh, Dr. Nihal El-Rouby, Dr. Shin-wen Chang, and Dr. Mohamed Solayman for their great friendship, compassion and kindness, which created a family environment that I will never forget. I would also like to extend many thanks to Ben Burkley, Cheryl Galloway, and Lynda Stauffer, who facilitated part of the research included in this dissertation.

Last but not least, I would like to deeply thank my beloved husband, Roque, for his indispensable emotional support, kindness, patience and encouragement. He is not only the love of my life, but also my best friend and favorite computer programming specialist, whose advice and feedback I always seek in building my coding skills.

Additionally, I would like to take the opportunity to extend my deepest gratitude to my precious family, my parents and my two siblings, for their unconditional love and support. They have always believed in me, more than I do and have been fully supportive of all my decisions. They have been continuously praying for my success and they were always there for me through the good and bad times. I would like to dedicate this dissertation to them for their endless love, support and self-sacrifices.

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ...... 4

LIST OF TABLES ...... 8

LIST OF FIGURES ...... 9

LIST OF ABBREVIATIONS ...... 10

ABSTRACT ...... 13

CHAPTER

1 HYPERTENSION PHARMACOGENOMICS AND THE POTENTIAL FOR DISCOVERIES WITH WHOLE TRANSCRIPTOME SEQUENCING ...... 15

Introduction ...... 15
Hypertension Pharmacogenomics ...... 17
Potential for Scientific Discoveries through RNA-Sequencing ...... 20
RNA-Seq Technology ...... 20
RNA-Seq Applications ...... 22
mRNA Expression Profiling ...... 22
Alternative Splicing ...... 23
Expression Regulation ...... 23
Breakthrough Discoveries with RNA-Seq in Cardiovascular Disease and HTN ...... 25
Summary and Aims of the Project ...... 27
Significance ...... 29

2 BLOOD PRESSURE SIGNATURE AND BLOOD PRESSURE RESPONSE TO THIAZIDE DIURETICS: RESULT FROM PEAR AND PEAR-2 STUDIES ...... 36

Introduction ...... 36
Methods ...... 37
Study Population and Ethics Statement ...... 37
Gene Expression Profile with RNA-Seq ...... 38
Statistical Methods ...... 39
Genomics Analysis ...... 40
Allele Specific Expression Analysis ...... 41
Results ...... 42
Discussion ...... 44

3 WHOLE TRANSCRIPTOME SEQUENCING ANALYSES REVEAL MOLECULAR MARKERS OF BLOOD PRESSURE RESPONSE TO THIAZIDE DIURETICS ...... 62

Introduction ...... 62
Methods ...... 63
Study Participants ...... 63
Gene expression profile with RNA-Seq ...... 64
Statistical Methods ...... 66
Genomics Analysis ...... 67
Allele Specific Expression (ASE) Analysis ...... 68
Results ...... 69
Differential mRNA Expression ...... 69
Validation of gene expression associations with BP response to TD ...... 70
Genomics Analysis ...... 71
Allele Specific Expression Analysis ...... 72
Discussion ...... 73

4 SUMMARY AND CONCLUSION ...... 94

APPENDIX: SUPPLEMENTARY INFORMATION FOR CHAPTER 3 ...... 101

LIST OF REFERENCES ...... 102

BIOGRAPHICAL SKETCH ...... 1123

LIST OF TABLES

Table page

1-1 Advantages of RNA-Seq compared with Microarrays ...... 30

2-1 Characteristics of PEAR and PEAR-2 participants classified as responders and non-responders for the RNA-Seq analysis ...... 49

2-2 Genes previously associated with BP/HTN15 and the expression measurements in PEAR whites and PEAR-2 whites and blacks ...... 50

2-3 Genes differentially expressed between responders and non-responders to HCTZ and chlorthalidone in all 3 cohorts ...... 52

2-4 Differences in baseline expression levels for FOS, DUSP1 and PPP1R15A between thiazide diuretics responders and non-responders ...... 53

2-5 Representative trans eQTL for top differentially expressed genes and association with BP response to thiazide diuretics in PEAR and PEAR-2 ...... 54

2-6 SNPs with AEI ≥1.3-fold and eQTLs associations ...... 55

3-1 Characteristics of PEAR and PEAR-2 participants classified as responders and non-responders for RNA-Seq analyses ...... 78

3-2 Potassium and uric acid mean changes in non-responders...... 79

3-3 Summary of mapping statistics from alignment with Tophat2 ...... 80

3-4 Genes differentially expressed in PEAR whites treated with HCTZ ...... 81

3-5 Genes differentially expressed in PEAR-2 whites treated with chlorthalidone .... 82

3-6 Genes differentially expressed between responders and non-responders to HCTZ and chlorthalidone in all 3 cohorts ...... 83

3-7 Genes differentially expressed between responders and non-responders to chlorthalidone in PEAR-2 whites and blacks ...... 84

3-8 Differences in baseline expression levels for CEBPD and TSC22D3 with adjustment for age, gender and baseline blood pressure ...... 85

3-9 SNPs in SERINC5 gene region with allele specific expression (ASE) ≥1.3-fold and significant eQTLs association from Blood eQTL browser ...... 86

LIST OF FIGURES

Figure page

1-1 Blood pressure response to HCTZ by chromosome 17 rs16960228 ...... 31

1-2 Blood pressure response to HCTZ by chromosome 20 rs2273359...... 32

1-3 Overview of a typical RNA-Seq experiment and most common applications ..... 33

1-4 Genome-based assembly strategy for reconstructing transcripts from RNA-Seq reads ...... 34

1-5 RNA-Seq can also be used to interrogate allelic effects, in sites with a polymorphism confirmed by dense coverage of reads...... 35

2-1 Mapping statistics for PEAR and PEAR-2 RNA-Seq data...... 56

2-2 Linkage disequilibrium plots between rs11065987, rs653178, rs10774625 and rs11066301 single nucleotide polymorphisms ...... 57

2-3 Rs7101 allele-specific expression analysis ...... 58

2-4 The effect of rs11065987 polymorphism on the blood pressure response of Whites treated with HCTZ in PEAR...... 59

2-5 Rs1046117 allele-specific expression analysis...... 60

2-6 PPP1R15A rs557806 allele-specific expression ratios...... 61

3-1 Volcano plots comparing gene expression between responders and non-responders to HCTZ and chlorthalidone ...... 87

3-2 Plots showing CEBPD and TSC22D3 baseline expression levels between thiazide responders compared to non-responders...... 88

3-3 The effect of SERINC5 rs10042497 polymorphism on the blood pressure response of whites treated with chlorthalidone ...... 89

3-4 Allele-specific expression ratios in SERINC5 rs10072008 ...... 90

3-5 Allele-specific expression ratios in SERINC5 rs7707754...... 91

3-6 Allele-specific expression ratios in SERINC5 rs78174795...... 92

3-7 Allele-specific expression ratios in SERINC5 rs11951568...... 93

A-1 TSC22D3 expression by gender...... 101

LIST OF ABBREVIATIONS

AEI Allelic Expression Imbalance

AGT Angiotensinogen

ALDH1A3 Aldehyde Dehydrogenase 1 family member A3

ALT/REF Alternative and Reference Alleles

AP-1 Activator Protein 1 (transcription factor)

ASE Allele-Specific Expression

BMI Body Mass Index

BP Blood Pressure

cDNA Complementary DNA

CEBPB CCAAT/Enhancer Binding Protein Beta

CEBPD CCAAT/Enhancer Binding Protein Delta

Chr Chromosome

CLIC5 Chloride Intracellular Channel 5

CLTD Chlorthalidone

DBP Diastolic Blood Pressure

DNA Deoxyribonucleic acid

DUSP1 Dual Specificity Phosphatase 1

eIF-2alpha Eukaryotic Initiation Factor 2 Alpha

ENCODE The Encyclopedia of DNA Elements

eQTL Expression Quantitative Trait Loci

ERK Extracellular Signal-Regulated Kinases

FDR False Discovery Rate

FOS Fos Proto-Oncogene

FPKM Fragments Per Kilobase Of Exon Model Per Million

FRS2 Fibroblast Growth Factor Receptor Substrate 2

FTO Fat Mass And Obesity-Associated

GATK Genome Analysis Toolkit

GENRES Genetics of Drug Responsiveness in Essential Hypertension study

GERA Genetic Epidemiology of Responses to Antihypertensives study

GNAS G Protein Alpha Subunit

GTEX Genotype-Tissue Expression project

GWAS Genome-wide Association Studies

HCTZ Hydrochlorothiazide

HF Heart Failure

hg19 Human Genome Build 19

HTN Hypertension

IRX3 Iroquois Homeobox 3

JUN Jun Proto-Oncogene

kb Kilobase

LD Linkage Disequilibrium

LRRC15 Leucine Rich Repeat Containing 15

LYZ Lysozyme

mmHg Millimeter Of Mercury

mRNA Messenger RNA

NGS Next Generation Sequencing

NORDIL The Nordic Diltiazem (Nordil) Study

Pbinom P-value from Binomial Statistic Test

PDGF-αR Platelet-Derived Growth Factor-α Receptor

PEAR Pharmacogenomics Evaluation of Antihypertensives Response

PGRN Pharmacogenomics Research Network

PGx Pharmacogenomics

Poly(a) Polyadenylation

PP1 Protein Phosphatase 1

PPP1R15A Protein Phosphatase 1 Regulatory Subunit 15A

PRKCA Protein Kinase C Alpha

R/FPKM Read or Fragments Per Kilobase Of Exon Model Per Million

REF/ALT allele ratios

RNA Ribonucleic Acid

RNA-Seq RNA Sequencing

RV Right Ventricle

SBP Systolic Blood Pressure

SHR Spontaneously Hypertensive Rat

SLC25A32 Solute Carrier Family 25 Member 32

SNP Single-Nucleotide Polymorphism

SPARCL1 SPARC like 1

STEAP4 STEAP4 metalloreductase

TD Thiazide Diuretics

TSC22D3 TSC22 Domain Family Member 3

US United States

UTR Untranslated Region

VSIG4 V-Set And Immunoglobulin Domain Containing 4

VSMC Vascular Smooth Muscle Cell

YEATS4 YEATS domain containing 4

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

TRANSCRIPTOMICS: ADVANCING HYPERTENSION PHARMACOGENOMICS

By

Ana Caroline Costa Sá

May 2017

Chair: Julie A. Johnson

Major: Genetics and Genomics

Hypertension (HTN) is a prevalent and silent health threat in the United States and the leading cause of cardiovascular diseases worldwide. The thiazide diuretics hydrochlorothiazide (HCTZ) and chlorthalidone are some of the most commonly prescribed antihypertensive medications, with over 100 million prescriptions annually in the US. However, fewer than 50% of treated patients achieve blood pressure (BP) control. HTN pharmacogenomics studies hold the potential to improve the management of HTN by expanding the knowledge on molecular markers of disease susceptibility or drug response, while also providing potential insight into new mechanisms underlying the pathophysiology of HTN or antihypertensive effects. Currently available data from Genome-wide Association Studies (GWAS) reveal compelling genetic signals associated with HTN and antihypertensive drug response, but these signals do not yet account for enough of the variability in blood pressure or drug response to advance into clinical translation.

Additional research on the transcriptome (the complete set of RNA transcripts) has the potential to expand the knowledge of gene expression regulation mechanisms impacting variability in drug response. Therefore, this study aims to identify novel molecular determinants of thiazide diuretic BP response through the systematic study of the transcriptome. Associations of gene expression differences with BP response to thiazide diuretics were assessed in 150 hypertensive participants treated with HCTZ or chlorthalidone in the PEAR (Pharmacogenomics Evaluation of Antihypertensives Response) and PEAR-2 studies, respectively. From PEAR, 50 white participants were selected for RNA-Sequencing from the upper and lower quartiles of BP response to HCTZ. Likewise, in PEAR-2, white and black participants were classified as responders and non-responders to chlorthalidone. FOS, DUSP1, PPP1R15A, CEBPD, TSC22D3 and SERINC5 were differentially expressed across all cohorts (meta-analysis p-value < 2×10⁻⁶), and responders to HCTZ or chlorthalidone presented up-regulated transcripts. Of these genes, only FOS had previously been documented in functional studies related to BP regulation mechanisms. Collectively, the findings from this project document the use of transcriptomics RNA-Seq data to identify biomarkers of drug response and suggest CEBPD, TSC22D3, SERINC5, FOS, DUSP1 and PPP1R15A as potential molecular determinants of antihypertensive response to thiazide diuretics. Further evaluation of these genes may provide new insights into molecular mechanisms underlying BP response to thiazides.

CHAPTER 1 HYPERTENSION PHARMACOGENOMICS AND THE POTENTIAL FOR DISCOVERIES WITH WHOLE TRANSCRIPTOME SEQUENCING

Introduction

Hypertension (HTN) affects approximately 1 billion individuals worldwide1 and is the most important modifiable risk factor for cardiovascular diseases (coronary artery disease, myocardial infarction, heart failure, stroke and peripheral vascular diseases), regardless of gender, racial group, geographic region and income2. Treatment with antihypertensive (anti-HTN) medications clearly reduces chronic blood pressure (BP) elevations, contributing to reduced morbidity and mortality rates3-5.

Even with multiple anti-HTN medications available, targeting different BP regulatory systems, only about half of those with treated HTN in fact manage to control their BP6, 7. Several factors possibly contribute to global rates of uncontrolled BP: poor adherence to therapy; ineffectiveness of the current treatment approach, which can be largely due to the use of single drugs instead of more aggressive strategies using combination therapy; poor response to the anti-HTN agent; and therapeutic inertia on the part of healthcare providers when poorly controlled HTN is identified, among others. As the current method for therapy selection is essentially based on trial and error, stratifying HTN patients based on predictors of drug response has the potential not only to improve control rates but also to help reduce adverse cardiovascular events.

Systolic (SBP) and diastolic (DBP) blood pressures are considered complex physiological traits that are under the influence of genetic, physiologic and environmental factors. The heritable component of BP is estimated at 30-50%8-10. However, genetic signals associated with HTN/BP that have been identified through genome-wide association studies (GWAS) explain only a small proportion of inter-individual BP variability11. Additionally, the biological mechanisms underpinning most genes identified in BP/HTN GWAS are still unknown. Hence, additional studies are crucial for understanding the molecular mechanisms behind these signals and for defining their functional relations to BP physiology.

Recently, several HTN pharmacogenomic studies have advanced our understanding of the potential role of genetics in variable response to anti-HTN medications12-14. GWAS have shown success in identifying novel genetic variants associated with variability in drug response15-19. However, none of these variants explains enough of the BP response variability to guide clinical decisions. Additionally, the GWAS approach tests genomic DNA, which represents only the first step in the flow of genetic information and therefore captures only part of the complexity of the system. Moving forward, it is important to understand the transcriptome (the full set of transcripts in a cell) and its regulatory mechanisms in order to understand more completely the factors that underlie the diversity in response to drugs.

In the past few years, the development of novel high-throughput DNA sequencing tools has provided a new method for both mapping and quantifying transcriptomes20. This method, RNA-Seq, has emerged as an innovative way to characterize transcriptome signatures associated with many diseases and traits21-23. When compared to other transcriptomic techniques, such as microarrays, RNA-Seq quantifies expression levels with higher accuracy and throughput, which makes it the best approach for revealing the full repertoire of differentially expressed genes. It also provides a dynamic assessment of mechanisms associated with many diseases and traits in order to bridge the gap between genomics and phenotype24. Transcriptome approaches have the potential to contribute to our understanding of the complexity of antihypertensive blood pressure response and therefore of hypertension.

To provide context for this thesis project, this chapter reviews the most compelling data from HTN pharmacogenomics and provides background information on RNA-Seq and on how RNA-Seq is being applied in HTN and HTN pharmacogenomics research.

Hypertension Pharmacogenomics

Over the past 20 years, a substantial number of studies have investigated genetic variants influencing BP response to anti-HTN medications, and some recent reviews put this body of literature into perspective25-27. These studies reveal genetic polymorphisms with modest to moderate effect sizes, relative to the large effect sizes that have been observed for pharmacogenetics of other cardiovascular drugs, namely clopidogrel, warfarin and simvastatin28-30. Although there are no examples of HTN pharmacogenomics signals ready for application in clinical practice, herein we highlight the most promising findings to date.

Discoveries through Genome-Wide Association Studies

In the past decade, genome-wide association studies have been the most widely employed tool to investigate the link between genetic polymorphisms and common diseases. Their strength is an agnostic approach in which genetic variation across the whole human genome is tested, allowing discovery of novel genes and pathways. This approach successfully revealed multiple genetic signals associated with HTN/BP and BP response to anti-hypertensive drugs.

In 2008, the first GWAS with a HTN pharmacogenomics phenotype identified a haplotype on chromosome 12q15 (rs317689, rs315135 and rs7297610, in proximity to LYZ, YEATS4 and FRS2, respectively) associated with DBP response to HCTZ in African Americans31. The finding was replicated in independent samples from PEAR African American hypertensive participants treated with HCTZ32.

To identify novel genetic variants associated with variability of HCTZ BP response in hypertensive participants of European American ancestry, five independent studies were combined: PEAR and GERA as discovery cohorts, and GENRES, NORDIL and Milan as replication cohorts12. The GWAS meta-analysis revealed two novel regions, rs16960228 in PRKCA (protein kinase C, alpha) (Figure 1-1) and rs2273359 near GNAS (G-protein alpha subunit) (Figure 1-2), that were replicated in the other cohorts and showed clinically relevant effects on BP response in HCTZ-treated patients12.

Another GWAS investigated genome-wide SNP associations with BP response to the four main classes of anti-HTN drugs in the GENRES study14. All subjects received randomized monotherapy treatment with amlodipine, bisoprolol, HCTZ and losartan14. A missense variant in the NPHS1 coding region was associated with response to losartan in European Americans (SBP: β = -2.8, P = 2×10⁻⁵; DBP: β = -1.6, P = 2×10⁻⁴), and the findings were replicated with the same direction of association in GERA and SOPHIA14. In addition, results from the meta-analysis of GENRES, PEAR and GERA revealed two other replicated variants influencing HCTZ BP response: rs3825926 (β = 6.7, P = 5.6×10⁻⁶) and rs321329 (β = -1.8, P = 7.3×10⁻⁵), close to ALDH1A3 and CLIC5, respectively14.

A more recent genome-wide meta-analysis was the first to be performed with African American hypertensive participants from PEAR and PEAR-2 treated with atenolol and metoprolol, respectively33. Two genetic variants were identified in the monotherapy analysis and achieved genome-wide significance (P < 5×10⁻⁸) in a 3-group meta-analysis that also included a cohort of African Americans from PEAR treated first with HCTZ monotherapy and then with add-on atenolol33: the SLC25A32 rs201279313 deletion (β = -4.42 mmHg per variant allele, P = 2.5×10⁻⁸) and LRRC15 rs11313667 (β = -3.65 mmHg per variant allele, P = 7.2×10⁻⁸)33.

The findings presented here suggest promising genetic determinants of response to antihypertensives, although none of them has been sufficiently replicated in larger studies or has shown a large enough effect size by itself to drive modifications in clinical practice.

In conclusion, while there are strong data on BP response to thiazide diuretics, particularly HCTZ, and β-blockers, limited or no literature can be found for the other major classes of antihypertensives. In addition, the currently available data on HTN and HTN pharmacogenomics reveal that genetic signals alone do not explain enough of the response variability. To make a prominent contribution to the field, it is crucial to explore the biology beyond DNA variation. For complex phenotypes, such as variability in BP, BP response to drugs or even complex diseases, one viable alternative is to systematically study the transcriptome.

Potential for Scientific Discoveries through RNA-Sequencing

Despite the GWAS advances presented herein, genomic information provides only one dimension of molecular information about BP, hypertension, and BP response to anti-HTN treatment. Although it is a critical dimension, analyzing genetic variation alone is insufficient for achieving an understanding of the multidimensional complexity of BP and BP response to antihypertensive agents. In this context, transcriptomics, the global characterization of genes/transcripts that are actively expressed in multiple tissues or experimental conditions, represents an innovative approach that enables the discovery of biomarkers associated with diseases and traits.

Until the past decade, microarrays represented the most cost-effective, reliable and rapid technology for high-throughput profiling of gene expression. However, microarrays require a priori knowledge of the sequences to be investigated, limiting the identification of de novo splicing isoforms or novel exons, transcripts and genes20. In addition, hybridization-based methods can also limit the dynamic range of gene expression quantification (Table 1-1), casting doubt on measurements of highly abundant transcripts34.

RNA-Seq Technology

With the widespread diffusion of Next Generation Sequencing (NGS) platforms, RNA-Seq, a methodology for RNA profiling that uses millions of short reads (sequence strings), theoretically enables the investigation of all the RNA in a sample35. In practice, the input population of RNA, either total RNA or fractionated (mRNA or poly(A) selected, for example), is converted to a library of fragmented cDNA35. Then, adaptors are attached to one or both ends of each fragment35. These fragments are randomly amplified and sequenced in a high-throughput manner, generating millions of short reads35 (Figure 1-3 highlights the main experimental steps).

Depending on the sequencing platform of choice (Illumina, Ion Torrent, BGISEQ, Qiagen GeneReader), read lengths typically range between 30 and 500 base pairs36. Read length is an important decision early in experimental design, since longer reads improve mappability and transcript identification37. Another important factor is the library size or read depth, which is the number of sequence reads for a given sample. The deeper the sequencing, the more precise transcript quantification will be37. While some studies advocate the use of as few as 5 million reads for accurate quantification of moderately to highly expressed genes38, the ENCODE best practices recommend library sizes of more than 25 million reads for a typical RNA-Seq protocol for investigating mRNA expression39.

Once high-quality reads are obtained, RNA-Seq reads are computationally mapped to the human reference genome, revealing a transcriptional map20, 40. Owing to the extensive alternative splicing that occurs in the human transcriptome, mapping reads that span splice junctions is more challenging36. RNA-Seq read alignment is also complicated by the fact that short reads may be assigned to multiple regions of the human genome36. The most widely used RNA-Seq alignment software programs use gene annotation to achieve better placement of spliced reads and correctly handle multiple short read assignment in the vast majority of occurrences41. Next, overlapping reads that were mapped to a particular exon are clustered for quantification at the gene or isoform level37. Raw read counts alone are not sufficient to compare expression levels among samples37. The most frequently reported measure of gene expression from RNA-Seq analysis is the R/FPKM (reads or fragments per kilobase of exon model per million), a within-sample normalization method that accounts for transcript length and the total number of mapped reads37. The data analysis then allows the characterization of gene expression levels that can be applied to investigate distinct features of transcriptome diversity. Figure 1-3 highlights the main computational steps for RNA-Seq data. As with all large-scale analyses, the resulting RNA levels are subject to error, so important findings need to be replicated with alternative methods such as quantitative Real Time-Polymerase Chain Reaction (qRT-PCR).
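The R/FPKM normalization described above can be illustrated with a minimal Python sketch. It assumes that a fragment count per gene, an exon model length in base pairs, and the total number of mapped fragments in the sample are already available; the function name, gene labels and counts are hypothetical and are not taken from PEAR or PEAR-2 data.

```python
def fpkm(fragment_count, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon model per million mapped fragments."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    length_kb = exon_length_bp / 1_000            # exon model length in kilobases
    library_millions = total_mapped_fragments / 1_000_000  # library size in millions
    return fragment_count / (length_kb * library_millions)


# Hypothetical example: two genes with identical raw counts but different lengths
counts = {"GENE_A": 500, "GENE_B": 500}
lengths_bp = {"GENE_A": 2_000, "GENE_B": 1_000}
library_size = 25_000_000  # mapped fragments in this hypothetical sample

fpkm_values = {
    gene: fpkm(counts[gene], lengths_bp[gene], library_size) for gene in counts
}
print(fpkm_values)
```

With equal raw counts, the shorter gene receives the higher FPKM, which shows why raw counts alone cannot be compared between genes of different lengths or between libraries of different sizes.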

RNA-Seq Applications

The beauty of the RNA-Seq tool lies in the fact that the previously distinct core activities of transcript discovery and transcript quantification can now be combined in a single high-throughput assay. This relatively new method provides a significant qualitative and quantitative improvement for studying the transcriptome: it can detect genes with low expression, distinguish sense from antisense transcripts more accurately, and achieve high resolution20.

mRNA Expression Profiling

One of the most biologically relevant applications of RNA-Seq is the comparison of transcriptomes across distinct developmental stages, across diseased versus normal samples, or across other specific experimental conditions42. For this type of analysis, it is crucial to accurately construct the isoform structure in order to compare transcript abundances across multiple samples (Figure 1-4)36. This powerful approach is essential for the interpretation of the functional elements of the genome and for the discovery or elucidation of key genes or transcripts in the molecular mechanisms underlying disease susceptibility or response to drugs.

Alternative Splicing

Alternative splicing events play a key role in shaping biological complexity and genomic diversity43. As a consequence, they are involved in 15-50% of mutations associated with a vast range of diseases43, 44. The term alternative splicing refers to distinct inclusion/exclusion of exons in the processed RNA product when compared to constitutive splicing events.

The RNA-Seq technology enables the exploration of transcriptome structure, investigating different patterns of splice junctions with more accuracy than microarrays45. Deep surveying of alternative splicing with RNA-Seq data revealed unprecedented diversity of splice junctions, tissue-specific RNA-binding motifs and splicing regulatory elements46.

Gene Expression Regulation

Most of the SNPs identified through GWAS fall in non-coding or intergenic regions of the genome47. For this reason, one can argue that causal variants are more likely to influence traits/phenotypes by impacting gene expression48-50. Genetic polymorphisms associated with variation in gene expression levels, termed Expression Quantitative Trait Loci (eQTLs), have been extensively studied over the years and are known to be widespread across human populations50, 51. These regulatory variants contribute to phenotype diversity by interfering with steps in the flow of genetic information in a cell, from DNA to protein.
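As a hedged illustration of what an eQTL association means in practice, the sketch below regresses expression on allele dosage (0, 1 or 2 copies of the alternative allele) using ordinary least squares. It assumes a simple additive model without covariates, which is a simplification and not necessarily the analysis used in this dissertation; the function name and the data are hypothetical.

```python
def eqtl_fit(dosages, expression):
    """Least-squares fit of expression on allele dosage (additive model).

    Returns (slope, intercept); the slope is the estimated change in
    expression per additional copy of the alternative allele.
    """
    n = len(dosages)
    if n != len(expression) or n < 2:
        raise ValueError("need paired observations for at least two samples")
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("all samples share one genotype; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: six individuals, expression rising with allele dosage
dosages = [0, 0, 1, 1, 2, 2]
expression = [5.0, 5.2, 6.1, 5.9, 7.0, 7.2]
slope, intercept = eqtl_fit(dosages, expression)
print(slope, intercept)
```

A slope that differs from zero beyond what sampling error would explain is what marks the variant as a candidate eQTL; a real analysis would add a significance test, covariates and multiple-testing correction.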

RNA-Seq enables further investigation of the regulatory role of specific sequences in gene expression by taking advantage of its single-nucleotide resolution. Individuals who are heterozygous at a particular locus carry two allelic forms, which makes it possible to investigate whether one allele is more highly expressed than the other. This event is called allele-specific expression (ASE) and suggests a potential regulatory effect on gene expression due to genetic and/or epigenetic determinants that govern transcriptional activity at the different alleles (Figure 1-5)52, 53. Often, ASE is evidence of the disruption of a highly regulated process, leading to disease susceptibility52, 53 or potential variability in drug response54.
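One common way to quantify ASE at a heterozygous site is to compare the reference and alternative allele read counts against the 1:1 ratio expected under balanced expression. The sketch below computes the fold imbalance and an exact two-sided binomial p-value. It is only an illustration: the read counts are hypothetical, the 0.05 significance level is an assumption, and the 1.3-fold cutoff simply mirrors the threshold named in the List of Tables rather than reproducing the exact procedure of later chapters.

```python
from math import exp, lgamma, log


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n trials, computed in log space."""
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_coeff + k * log(p) + (n - k) * log(1 - p))


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided p-value: total probability of outcomes no more likely than k."""
    observed = binomial_pmf(k, n, p)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(
        prob
        for prob in (binomial_pmf(i, n, p) for i in range(n + 1))
        if prob <= observed * tolerance
    )
    return min(1.0, total)


def allelic_imbalance(ref_reads, alt_reads, min_fold=1.3, alpha=0.05):
    """Return (fold, p_value, flagged) for one heterozygous site."""
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    major, minor = max(ref_reads, alt_reads), min(ref_reads, alt_reads)
    fold = float("inf") if minor == 0 else major / minor
    p_value = binomial_two_sided_p(ref_reads, n)
    return fold, p_value, fold >= min_fold and p_value < alpha


# Hypothetical read counts at two heterozygous sites
imbalanced = allelic_imbalance(ref_reads=65, alt_reads=35)
balanced = allelic_imbalance(ref_reads=52, alt_reads=48)
print(imbalanced, balanced)
```

Requiring both a minimum fold difference and a small p-value avoids flagging sites where a large ratio rests on only a handful of reads.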

The largest effect sizes, or the strongest genetic effects on the expression of individual genes, are predominantly observed locally (often on the same chromosome) or within the respective target gene region51, 55. These are called cis-regulatory regions and are composed of cis-regulatory elements, which usually are transcription factor binding sites and other regulatory sequences: promoters and enhancers in active cis-regulatory regions, and repressors in transcriptionally inactive regions51. Transcription factor binding sites are the central elements of the cis-regulatory regions; in the presence of transcription factors or epigenetic modifications, they can determine whether transcription is turned on or off, as well as the rate of the transcription process51, 56.

Trans-acting variants, polymorphic variants that regulate gene expression via an intermediate factor, can be located anywhere in the human genome and typically convey a smaller effect size than cis-acting variants48, 51, 56. One of the reasons may be that expression levels of a particular gene are usually under the effect of multiple trans-acting regulators, such as different transcription factors, co-activator proteins and proteins that help stabilize transcription factors; consequently, the effect size of each one of these trans-acting regulators is diminished56, 57. So far, several trans-acting regulatory regions have been identified as “hot spots”, but the underlying mechanism has been determined for only a few of them57-63.

Breakthrough Discoveries with RNA-Seq in Cardiovascular Disease and HTN

Recently, multiple studies have bridged the causality gap between human

regulatory variants, gene expression and phenotypes 64-67, including discovery of

polymorphisms detected in the intronic region of FTO (encoding fat mass and obesity-

associated protein) associated with obesity68. This intronic region was found to serve as

an enhancer making physical contact with the IRX3 gene promoter, which is more than

500 kb away from the obesity-associated variants, thereby regulating its gene

expression in both cerebellum and human adipocytes 69. Through IRX3 knockout

models, a causal link was established between SNPs, IRX3 expression, and obesity 69.

Additionally, a large-scale study with RNA-Seq data from the TwinsUK cohort

(n=856) conducted a genome-wide search for gene-by-body mass index (BMI)

interactions on the regulation of gene expression in multiple tissues (adipose, skin,

whole blood and lymphoblastoid cell lines)70. This study identified 16 cis-acting regulatory variants and one trans-acting variant, rs3851570, regulating the expression

of 53 genes in adipose tissue70. This demonstrates the importance of investigating the

role of eQTLs in influencing downstream traits.

In recent years, multiple studies have investigated the transcriptome signature of

heart failure (HF). Differential expression analysis was conducted comparing whole

transcriptome profiles between explanted human HF right ventricles (RV) and 5 unused

donor human heart RVs71. STEAP4, SPARCL1 and VSIG4 were identified as potential

right ventricular myocardial biomarkers in human HF71. The same group also identified long noncoding RNAs differentially expressed between normal and HF RVs72. Another study used transcriptomics data, generated by RNA-Seq and microarrays, to identify novel myocardial gene expression signatures of HF73.

RNA-Seq approaches have also been used to enhance understanding of HTN. A

large-scale, unbiased investigation of BP/HTN gene expression signature using whole

blood RNA revealed 34 genes that in aggregate explain up to 9% of inter-individual

variability in BP63. These results, based on exploration of differential expression in HTN, contrast with the mere 3% of variability in BP explained collectively by the GWAS findings.

Further, the integration of the BP signature genes, eQTLs and GWAS results revealed that 6 SNPs associated with BP (p-value < 5 × 10⁻⁸ in the ICBP GWAS 74) are also trans-regulators of several top BP signature genes63. Therefore, this study highlights important avenues for future investigation on the impact of these transcriptomic markers in the treatment of HTN.

Additionally, the application of RNA-Seq for transcriptome profiling in rodent models of HTN revealed novel potential mechanisms involved in the pathophysiology of HTN and its complications. Cowley et al75 identified genes and biological pathways associated with a protective effect in Dahl salt-sensitive rats. Tain et al76 identified genes of importance for programmed HTN through transcriptome characterization of the offspring of pregnant mouse models under suboptimal conditions (high fructose and dexamethasone administration). Differential expression and pathway analysis revealed genes involved in arachidonic acid metabolism as potential gatekeepers in programmed hypertension76.


Each of these studies highlights the potential scientific insights that can be gained through experimental approaches that apply RNA-Seq data. Likewise, we anticipate that studies arising from transcriptome analyses are likely to increase our understanding of the mechanisms of BP regulation and the causes of inter-individual differences in drug response. The application of RNA-Seq may lead not only to the discovery of signature genes of BP response to drugs but it may also enable the characterization of isoform diversity, cis/trans-acting regulatory variants and gene expression networks impacting variability of BP response to antihypertensives. This powerful tool holds the potential to provide global insights into the mechanisms underlying BP regulation.

Summary and Aims of the Project

For over half a century, thiazide diuretics have been a centerpiece of antihypertensive therapy, with more than 100 million prescriptions annually in the US alone. The large inter-individual variability in BP response emphasizes the need for molecular predictors of drug response that hold potential for improving antihypertensive therapy. Determining predictors of BP response to thiazide diuretics will lead to an improved understanding of their mechanisms of BP lowering, and may lead to approaches that could be used to optimize antihypertensive treatment selection, leading to better control of patients’ BP and consequently decreasing the risk of CV morbidity and mortality.

Collectively, the genetic signals reviewed in Chapter 1 put the field a step closer to tailoring clinical therapy based on the individual characteristics of a hypertensive patient. Additional studies are needed to advance this field to the level of knowledge and clinical recommendations that other cardiovascular drugs have achieved. Moving


forward, there is great potential for the use of transcriptomics to refine treatment strategies for the management of HTN. Although the use of transcriptomics data in pharmacogenomics, and in HTN pharmacogenomics in particular, is currently scarce, recent advances in NGS technologies allow accurate transcript quantification for differential expression between biological conditions, identification of splicing events and, owing to the high resolution of the data, the assessment of regulatory mechanisms of gene expression control. These processes generate diversity in protein/metabolite function, with proven consequences for drug disposition, mechanism of action and clinical outcomes.

Therefore, we sought to use whole transcriptome analysis to help decipher the

complexity of anti-HTN BP response, and lead to better understanding of mechanisms

underlying HTN. We hypothesize that functional elements of the genome, evaluated

through RNA-Seq data, contain important determinants of antihypertensive drug

response. We tested our hypothesis through the following specific aims:

Aim 1: Identify and validate molecular determinants of BP response to thiazide

diuretics through differences in gene expression levels, followed by testing of

expression regulatory variants as mechanisms for the differential expression and drug

response.

Aim 1a: Identify and validate genes differentially expressed by comparing

genome-wide expression levels from responders and non-responders to thiazide

diuretics (HCTZ and chlorthalidone).


Aim 1b: Identify and validate cis-acting regulatory variants that may impact the expression levels of the differentially expressed genes (Aim 1a), driving variability in BP response to HCTZ.

Significance

In the long term, the results of this study may lead to better approaches for guiding HTN treatment selection. Additionally, the molecular markers potentially associated with variability in BP response to thiazide diuretics may lead to a better understanding of the mechanisms of hypertension and/or BP lowering by these medications, and may help identify new anti-HTN drug targets.


Table 1-1. Advantages of RNA-Seq compared with microarrays

Feature                                            Microarrays     RNA-Seq
Principle                                          Hybridization   High-throughput sequencing
Resolution                                         > 100 bp        Single base
Reliance on genomic sequence                       Yes             Not necessarily
Background noise                                   High            Low
Dynamic range for gene expression quantification   Few 100-fold    >8,000-fold
Ability to distinguish isoforms                    Limited         Yes
Ability to distinguish allelic expression          Limited         Yes
Required amount of RNA                             High (µg)       Low (ng)


Figure 1-1. Blood pressure response to hydrochlorothiazide by chromosome 17 rs16960228 genotype of participants from 5 independent studies. A) Diastolic blood pressure (DBP) response. B) Systolic blood pressure (SBP) response. The blood pressure responses are adjusted for pretreatment blood pressure levels, age, and sex and P values are for contrast of adjusted means between genotype groups. GENRES indicates the Genetics of Drug Responsiveness in Essential Hypertension Study; GERA, Genetic Epidemiology of Responses to Antihypertensives; NORDIL, the Nordic Diltiazem; and PEAR, Pharmacogenomic Evaluation of Antihypertensive Responses. Source: From Turner et al12 with permission.


Figure 1-2. Blood pressure response to hydrochlorothiazide by chromosome 20 rs2273359 genotype of participants from 3 independent studies. A) Systolic blood pressure (SBP) response. B) Diastolic blood pressure (DBP) response. The blood pressure responses are adjusted for pretreatment blood pressure levels, age, and sex and P values are for contrast of adjusted means between genotype groups. GERA indicates Genetic Epidemiology of Responses to Antihypertensives; NORDIL, the Nordic Diltiazem; and PEAR, Pharmacogenomic Evaluation of Antihypertensive Responses. Source: From Turner et al12 with permission.


Figure 1-3. Overview of a typical RNA-Seq experiment and most common applications. The workflow starts with RNA preparations, followed by sequencing and analysis steps, leading to applications and biological insights.


Figure 1-4. Genome-based assembly strategy for reconstructing transcripts from RNA-Seq reads. First, short RNA-Seq reads are aligned to the reference genome, accounting for possible splicing events. Then, transcripts are reconstructed from the spliced alignments. The colors of the RNA-Seq reads represent the transcript isoform from which they are derived.


Figure 1-5. RNA-Seq can also be used to interrogate allelic effects at sites with a polymorphism confirmed by dense read coverage. Based on the reads aligned to a specific genomic locus, it is possible to calculate the ratio of reads from each allele (allele 1:allele 2). Allele-specific expression (ASE) is inferred if the calculated ratio deviates from the expected 50:50.


CHAPTER 2
BLOOD PRESSURE SIGNATURE GENES AND BLOOD PRESSURE RESPONSE TO THIAZIDE DIURETICS: RESULT FROM PEAR AND PEAR-2 STUDIES

Introduction

Hypertension (HTN) is the most important modifiable risk factor for cardiovascular diseases (coronary artery disease, myocardial infarction, heart failure, stroke and peripheral vascular diseases); controlling blood pressure (BP) is critical for reducing long-term mortality and morbidity rates2. Despite the plethora of therapeutic

options, selection of the initial anti-HTN treatment remains empirical. Worldwide, 1

billion people suffer from HTN3 but only about 50% of those under drug therapy achieve

the treatment goal, which highlights that anti-HTN drug selection for a specific patient likely impacts therapy success6, 77.

Thiazide diuretics (TD) are a centerpiece of anti-HTN therapy due to their effectiveness and safety profile in the management of HTN. Among the available anti-HTN medications, HCTZ, chlorthalidone and other TD are considered first-line options for most patients with uncomplicated essential HTN, and are highly recommended for patients requiring more than one anti-HTN therapy for control of BP 78. However, TD

have variable efficacy, and less than 50% of HCTZ-treated patients achieve BP control77. The inter-individual variability in BP response to TD is likely to contribute to

suboptimal BP control.

Most recently, two replicated regions, one in PRKCA (protein kinase C, alpha)

and the other one near GNAS (G protein alpha subunit), were identified with clinically

relevant effects on BP response to HCTZ 12. Despite the successes, the GWAS

approach provides only one dimension of molecular information about BP response to

anti-HTN treatment. While it is a critical dimension, analyzing DNA variation alone is


insufficient for achieving an understanding of the multidimensional complexity of BP response to TD. In this context, transcriptomics (gene expression profiling) has been described as an innovative approach that enables biomarker discovery associated with different diseases and traits79-82.

Recently, 34 genes were found to be differentially expressed in relation to BP/HTN, which in aggregate explain ~9% of inter-individual variability in BP63. We hypothesize that some of the differentially expressed genes associated with BP/HTN are also associated with BP response to antihypertensive treatment with TD. We assessed whether differential expression of these 34 genes is associated with BP response to TD by applying RNA sequencing to whole blood samples from 150 hypertensive participants from the Pharmacogenomic Evaluation of Antihypertensive Responses (PEAR) and PEAR-2 studies.

Methods

Study Population and Ethics Statement

This study includes data from PEAR and PEAR-2 (NCT00246519 and NCT01203852, www.clinicaltrials.gov), which were previously described in detail83.

Briefly, PEAR was a multicenter, randomized clinical trial with the primary aim of evaluating the role of genetic variability in the BP response of patients treated with HCTZ and/or atenolol. Study participants (n=768) with uncomplicated HTN were randomized to

receive monotherapy of either the thiazide diuretic HCTZ, or the beta-blocker atenolol

for a period of 9 weeks. Fasting blood and urine samples were collected at baseline

(untreated), after 9 weeks of monotherapy, and after 9 weeks of combination therapy.

BP responses were assessed using office, home, and 24-hour ambulatory BP and then a composite BP response was constructed84.


The PEAR-2 clinical trial included a hypertensive population similar to the one in PEAR, in which metoprolol, a beta-blocker, and chlorthalidone, a thiazide-like diuretic, were tested. Details of this prospective clinical trial were previously published85. Briefly, 417 hypertensive participants were treated in a sequential monotherapy design with metoprolol and then chlorthalidone, with washout periods of at least 4 weeks prior to each active treatment. Data

collected included home and clinic BP measurements, adverse metabolic effects, RNA

and DNA from whole blood, and urine samples.

All study participants provided written informed consent. The Institutional Review

Boards at participating clinical trial sites including the University of Florida, Mayo Clinic,

and Emory University approved both PEAR and PEAR-2. The studies were conducted

in accordance with the principles of the Declaration of Helsinki and the US Code of Federal Regulations for Protection of Human Subjects.

Gene Expression Profile with RNA-Seq

PEAR whites and PEAR-2 white and black participants were selected for gene

expression profiling with RNA-Seq based on the differences in their BP response to

HCTZ and chlorthalidone treatment, respectively. A total of 149 patients with BP

responses to either HCTZ or chlorthalidone in the top and bottom quartiles from each of

the three cohorts were selected and classified as poor BP responders (non-responders)

and good BP responders (responders).
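The extreme-phenotype selection described above can be sketched in a few lines. This is only an illustration, not the procedure used in PEAR or PEAR-2: the participant IDs and BP values are hypothetical, and the sketch assumes that BP response is expressed as the change in BP on treatment (mm Hg), so that the most negative values correspond to the best responders.

```python
from statistics import quantiles


def select_extremes(bp_change):
    """Return (responders, non_responders) IDs taken from the outer quartiles.

    bp_change maps a participant ID to the change in BP on treatment (mm Hg).
    ASSUMPTION: negative values mean BP fell, so the lowest quartile holds
    the good responders and the highest quartile the poor responders.
    """
    q1, _, q3 = quantiles(bp_change.values(), n=4)  # three quartile cut points
    responders = sorted(pid for pid, delta in bp_change.items() if delta <= q1)
    non_responders = sorted(pid for pid, delta in bp_change.items() if delta >= q3)
    return responders, non_responders


# Hypothetical cohort used only to illustrate the selection.
hypothetical_cohort = {
    "P1": -20.0, "P2": -16.0, "P3": -12.0, "P4": -10.0,
    "P5": -8.0, "P6": -5.0, "P7": -2.0, "P8": 1.0,
}
responders, non_responders = select_extremes(hypothetical_cohort)
print(responders, non_responders)
```

In a real analysis the selection would be run separately within each cohort, as described in the text, so that the quartile boundaries reflect the response distribution of that cohort.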

Using whole blood samples collected before HCTZ or chlorthalidone

monotherapy, RNA was extracted using the PAXgene Blood RNA kit IVD (Qiagen,

Valencia, CA). The selection of poly(A) mRNA from total RNA was performed using

Sera-Mag Magnetic Oligo(dT) Beads (Illumina, San Diego, CA) according to the


manufacturer’s protocol. 100 ng of RNA was then used as template for cDNA synthesis.

Libraries were prepared following the strand-specific protocol86. DNA clusters were

generated using the Illumina cluster station, followed by 100 cycles of paired-end

sequencing on the Illumina HiSeq 2000, performed at Baylor Human Genome

Sequencing Center in Texas. For data quality control purposes, duplicate reads were removed using the MarkDuplicates tool in Picard (http://picard.sourceforge.net).

The 100 bp reads generated in the paired-end RNA sequencing were uniquely mapped to the human reference genome (hg19) using TopHat87 (v2.0.10), allowing for four read mismatches, a read edit distance of six, one mismatch in the anchor region of a spliced read, and a maximum of five multi-hits. Transcript assembly was performed using Cufflinks v2.2.1. Statistical analyses were carried out with Cuffdiff, and gene expression levels are reported in fragments per kilobase of transcript per million mapped reads (FPKM), considering reads mapped to exonic regions of the 34 genes previously associated with BP/HTN63.
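To make the FPKM unit concrete, the following minimal sketch computes it from raw quantities. It is not the Cufflinks/Cuffdiff estimator, which additionally models effective transcript length and assigns fragments among isoforms; the function name and the numbers are hypothetical and serve only as an illustration.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Simplified definition for illustration only; gene_length_bp is assumed
    to be the summed exonic length of the gene in base pairs.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    kilobases = gene_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


# Hypothetical gene: 640 fragments on 2 kb of exons in a library of 32 million fragments.
print(fpkm(fragments=640, gene_length_bp=2_000, total_mapped_fragments=32_000_000))
```

Normalizing by both gene length and library size is what allows expression levels to be compared across genes and across samples sequenced to different depths.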

Additionally, we performed differential expression analysis using alternative tools in order to adjust the expression levels for age, gender and baseline diastolic BP. Using BAM files from the TopHat 2 alignments, we counted the number of reads for each known human gene (Gencode gene annotation release 18) with the htseq-count script from the HTSeq Python package88. Counts were fitted to a negative binomial distribution using a generalized linear model in edgeR89.

Statistical Methods

Based on the fact that the BP signature genes, selected for this analysis, were

discovered in whites, the primary data analysis was also performed in whites treated

with HCTZ or chlorthalidone. Differences in the expression levels of these genes between responders and non-responders to TD were evaluated using a t-test on the gene expression measurements (FPKM). Bonferroni-corrected P values < 0.0015 (0.05/34) were considered statistically significant.

For each differentially expressed gene in PEAR or in PEAR-2 whites (6 in total), we attempted replication in PEAR-2 blacks or the alternate group of whites in order to validate the association of the genes with BP response to TD. A strict approach was established for validation with Bonferroni corrected P value (< 0.05/6 = 0.008) and the same fold change direction (either up or down regulation) as the primary analysis in whites treated with HCTZ or chlorthalidone.
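The significance thresholds and the validation rule described above can be expressed compactly. The sketch below is an illustration under the stated criteria (Bonferroni-corrected P value and same fold change direction); the function names are hypothetical, and the example values are the PPP1R15A results reported in this chapter.

```python
import math


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


def passes_validation(p_value, fc_primary, fc_replication, n_genes=6):
    """Apply the two validation criteria: corrected P value and same direction.

    Fold changes are responders/non-responders ratios; values above 1 mean
    higher expression in responders. Direction is compared on the log2 scale.
    """
    same_direction = math.log2(fc_primary) * math.log2(fc_replication) > 0
    return p_value < bonferroni_threshold(n_genes) and same_direction


print(round(bonferroni_threshold(34), 4))   # discovery threshold for 34 genes
print(round(bonferroni_threshold(6), 3))    # validation threshold for 6 genes

# PPP1R15A: primary fold change 1.27 in PEAR whites.
print(passes_validation(1.75e-3, 1.27, 1.29))  # PEAR-2 blacks
print(passes_validation(3.61e-2, 1.27, 1.19))  # PEAR-2 whites
```

Requiring both criteria guards against a replication that is statistically significant but points in the opposite direction from the primary finding.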

For those genes that passed the validation criteria, the differential expression results from each study cohort were combined in a meta-analysis, using standardized p-values to satisfy the assumptions of the Fisher p-value combination method implemented in the R package metaRNASeq90. We considered genes with meta-analysis p values < 2.0 × 10⁻⁶ (0.05/25,000) to have achieved transcriptome-wide association with BP response to TD.
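For readers unfamiliar with the Fisher combination method, a minimal sketch follows. It is not the metaRNASeq implementation; it assumes independent p-values from k cohorts and uses the fact that the statistic X = -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom. The function name is hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) is chi-square distributed with 2k degrees of freedom
    under the null. For even degrees of freedom the upper tail probability is
    exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!.
    """
    if not p_values or any(not 0 < p <= 1 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= half_stat / j  # builds (X/2)^j / j! incrementally
        total += term
    return math.exp(-half_stat) * total


# Two hypothetical cohorts, each with p = 0.05.
print(fisher_combined_p([0.05, 0.05]))
```

A combined p-value obtained this way would then be compared with the transcriptome-wide threshold of 0.05/25,000 stated above.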

Genomics Analysis

Previous studies have described the genome-wide genotyping results for the PEAR and PEAR-2 studies in detail91, 92. GWAS data for

chlorthalidone in PEAR-2 will be reported separately. Briefly, DNA samples were

genotyped using Illumina Human Omni-1Million Quad BeadChip and 2.5M-8 BeadChip

(Illumina, San Diego CA) for PEAR and PEAR-2, respectively. Genotypes were called using GenTrain2 clustering algorithm (GenomeStudio, Illumina, San Diego CA). MaCH software (version 1.0.16) was used to impute SNPs based on HapMapIII haplotypes.


In order to identify SNPs potentially regulating the expression of the genes differentially expressed in the RNA-Seq data, we consulted the Blood eQTL browser93.

The SNPs identified as eQTLs for the differentially expressed genes were then evaluated in the PEAR and PEAR-2 GWAS data to test for a genetic association with BP response to TD. SNP associations with BP response were evaluated using previously conducted GWAS analyses91 that included data on systolic and diastolic BP responses to HCTZ in 228 white participants from PEAR, and responses to chlorthalidone in 185 white and 142 black participants from PEAR-2. PLINK software was used to run the analyses, with adjustment for age, gender, pre-HCTZ/chlorthalidone BP and population substructure (the first and second principal components, PC1 and PC2).

Allele Specific Expression Analysis

We also searched for cis-eQTLs in blood (Blood eQTL browser93) indicated by

allelic mRNA expression imbalance in heterozygous white participants from PEAR and

PEAR-2 (n=100). A personalized genome was built by substituting the reference allele with the variant allele of each SNP in hg19 using the GATK FastaAlternateReferenceMaker tool (www.software.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_walkers_fasta_FastaAlternateReferenceMaker.php) in order to overcome potential bias in read alignment, in which reads carrying the reference allele align preferentially over reads carrying the alternative allele94. RNA-Seq reads were mapped using STAR v2.5.2b with a two-pass strategy. We followed the Broad Institute best practices workflow for SNP and indel calling from RNA-Seq data (https://www.broadinstitute.org/gatk/guide/article?id=3891). For each SNP, allelic expression imbalance (AEI) ratios were obtained by dividing the reference allele read counts by the alternative allele read counts. A binomial test was applied to determine whether this ratio deviates from the expected 50:50 ratio observed when the two alleles are expressed equally.
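The AEI ratio and the binomial test can be illustrated with a short sketch. This is not the code used in the study: the function names and read counts are hypothetical, and the two-sided p-value is computed as the total probability of all outcomes that are no more likely than the observed count, which is one common convention for an exact binomial test.

```python
import math


def log2_allelic_ratio(ref_reads, alt_reads):
    """log2 of the reference/alternative read count ratio (0 means balance)."""
    return math.log2(ref_reads / alt_reads)


def binomial_two_sided_p(ref_reads, alt_reads, expected=0.5):
    """Exact two-sided binomial test of the reference allele read fraction.

    Probabilities are computed in log space so that sites with high read
    coverage do not overflow.
    """
    n = ref_reads + alt_reads

    def pmf(k):
        log_p = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
                 + k * math.log(expected) + (n - k) * math.log(1 - expected))
        return math.exp(log_p)

    observed = pmf(ref_reads)
    # Sum outcomes at least as extreme as the observed one (small tolerance for ties).
    total = sum(p for p in map(pmf, range(n + 1)) if p <= observed * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical heterozygous site covered by 130 reference and 100 alternative reads.
print(log2_allelic_ratio(130, 100))
print(binomial_two_sided_p(130, 100))
```

Because the test depends on the total number of reads at the site, the same allelic ratio yields weaker evidence at low coverage, which is why a minimum read coverage is typically required before interpreting AEI.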

Results

Table 2-1 presents baseline and demographic characteristics from PEAR whites

treated with HCTZ and PEAR-2 whites and blacks treated with chlorthalidone who were

selected for RNA-Sequencing. For PEAR, age, gender and baseline BP were not

statistically different between participants classified as responders and non-responders

to HCTZ. However, in PEAR-2 white participants, differences in gender and baseline BP

were statistically significant between responders and non-responders to chlorthalidone.

Differences in baseline BP were also observed in PEAR-2 blacks between responders

and non-responders to chlorthalidone.

In order to identify genes with differential expression involved in BP response to

thiazide diuretics, whole transcriptome sequences were generated from 149 participants

treated with HCTZ or chlorthalidone. One of the samples from HCTZ responders did not

achieve enough library yield for adequate performance in sequencing. On average, 32

million reads per sample were mapped to the human reference genome (hg19) and

about 93% were uniquely mapped (Figure 2-1).

At a Bonferroni corrected alpha (0.0015), 6 genes were differentially expressed in

whites treated with HCTZ or chlorthalidone (Table 2-2). For each gene differentially

expressed in PEAR or PEAR-2 whites, we attempted replication in the other white group

and in blacks from PEAR-2 (Table 2-2). Of the six genes identified, FOS and DUSP1

were differentially expressed and showed consistent fold change direction in all 3

cohorts (Table 2-3), passing the stringent Bonferroni corrected alpha at 0.008 for

validation. PPP1R15A showed consistent directional fold change in all three cohorts,


and met the Bonferroni threshold p value in PEAR whites given HCTZ (Fold Change (responders/non-responders): 1.27, p = 1.15 × 10⁻³) and PEAR-2 blacks given chlorthalidone (Fold Change: 1.29, p = 1.75 × 10⁻³), while only achieving nominal significance in PEAR-2 whites (Fold Change: 1.19, p = 3.61 × 10⁻²). The meta-analysis of all participants with RNA-Seq data included FOS, DUSP1 and PPP1R15A, and confirmed associations with BP response to TD that far exceeded transcriptome-wide (and genome-wide) significance for the expression of FOS (p = 2 × 10⁻¹²), DUSP1 (p = 9.5 × 10⁻¹²) and PPP1R15A (p = 3.6 × 10⁻⁸) (Table 2-3). Even though the statistical strength of the association lessened after adjustment for age, gender and baseline BP, the fold change direction remained consistent across PEAR whites and PEAR-2 whites and blacks regardless of the statistical methods used (Table 2-4).

Based on data in the Blood eQTL browser93, we identified 4 trans-eQTLs

(rs11065987, rs653178, rs10774625 and rs11066301) associated with reduced

expression of both FOS and PPP1R15A (Table 2-5). Because of the high linkage

disequilibrium between these SNPs (Figure 2-3), we selected a representative SNP

(rs11065987) to test for an association with BP response to thiazide diuretics. Rs11065987 was associated with SBP and DBP response to HCTZ in PEAR whites (SBP: β = -2.1, p = 1.7 × 10⁻³; DBP: β = -1.4, p = 2.9 × 10⁻³) (Figure 2-4) and showed a consistent direction of association in PEAR-2 whites, but did not reach statistical significance in PEAR-2 whites or blacks treated with chlorthalidone (Table 2-5).

Additionally, one and nine cis-eQTLs from the Blood eQTL browser93 showed

significant association with decreased expression of FOS and PPP1R15A, respectively,

and had coverage of at least 30 RNA-Seq reads for AEI data analysis (Table 2-6). For


FOS rs7101, there were 28 heterozygotes among PEAR and PEAR-2 whites; of those, we observed 10 samples with allele ratios (REF/ALT or ALT/REF) greater than log2 0.3 (1:1.3), suggestive of modest AEI (Figure 2-3). We also tested SNPs in high LD (r² > 0.8) with rs7101 in the exon region of FOS. We found rs1046117 C>T in high LD with rs7101 (r² = 0.92, D’ = 0.73), showing read coverages of 41-229, and the vast majority of samples tested displayed a consistent direction of allelic imbalance (AEI ratio > log2 0.3): the variant T allele had greater expression than the reference C allele (Figure 2-5).

Of the 9 eQTLs in the exonic region of PPP1R15A, rs557806 showed a consistent direction of AEI ratios, greater than log2 0.3, in 8 of the 19 heterozygotes tested, with a mean log2 allelic expression ratio of 0.32 (p = 0.05) (Figure 2-6). This indicates that there is a potential cis-acting regulatory SNP in high LD with rs557806. Rs595474 was found in high LD (r² > 0.8) with rs557806 but showed low read coverage (mean = 13.3), which leads to inaccurate allelic ratio estimation. Collectively, these results show evidence of modest cis-acting regulatory effects in FOS and PPP1R15A. Although we were able to demonstrate evidence of allelic imbalance at rs7101, rs1046117 and rs557806, limitations in sample size and low RNA-Seq read coverage in specific regions meant that we were unable to identify the specific causal variants responsible for the expression imbalances detected.

Discussion

Despite the widespread use of thiazide diuretics, there is large inter-individual variability in BP response to these drugs, which has motivated the identification of genetic markers with the potential to optimize antihypertensive treatment selection. GWAS results have expanded current knowledge of the potential role of genetics in inter-individual variability in drug response in general and in thiazide BP response in particular12, 92. However, this approach provides only one dimension of molecular

information in thiazide BP response, which may not be sufficient to understand the

complexity of this phenotype. In this study, we investigated differences in gene

expression underlying extreme BP response to thiazides in white and black participants

from PEAR and PEAR-2. Such approaches have the potential to provide methods for

precision medicine, but additionally may provide previously unrecognized insights into

BP regulation and responses to antihypertensive drugs.

Herein, we have shown that applying transcriptome sequencing data helped us

to identify molecular markers potentially implicated in BP response to thiazide diuretics.

Among the 34 genes previously documented to influence BP/HTN, FOS, DUSP1 and PPP1R15A mRNAs were differentially expressed between responders and non-responders in three different cohorts treated with thiazide diuretics, with consistent directional fold change in whites treated with HCTZ and whites and blacks treated with chlorthalidone.

Among these three genes, only FOS has been associated previously with the pathophysiology of HTN. FOS (FBJ murine osteosarcoma viral oncogene homolog, also known as AP-1 transcription factor subunit) encodes a leucine zipper protein that forms a transcription factor complex when dimerized with JUN; its expression is linked to neuronal activation of vasomotor areas in mice95. Also, blockade of FOS expression with oligonucleotides attenuates high BP in mice with induced HTN and in spontaneously hypertensive mice96.

We did not find in the literature any direct evidence of the involvement of DUSP1

and PPP1R15A that could account mechanistically for a potential susceptibility for HTN

and/or BP response to thiazides. However, we found that these genes are involved in


biological processes related to BP regulatory mechanisms. For instance, DUSP1 has shown consistent inhibition of ERK 1/2 (Extracellular Regulated Kinases) signaling in vitro and in vivo 97, with potential attenuation on the effects of angiotensin II-mediated

vascular smooth muscle cell (VSMC) proliferation and vasoconstriction 98.

PPP1R15A is a regulatory subunit of protein phosphatase 1 (PP1)99. PP1 is the catalytic subunit of myosin phosphatase, a key convergence point of contractility pathways in VSMC, which dephosphorylates myosin light chain and initiates the relaxation process for vasodilation 100. Of relevance, PP1 has a highly specific inhibitor, inhibitor 1 (I-1), which, when activated by protein kinase A, forms a heterotrimeric complex with PP1 and PPP1R15A99. This specific interaction of PPP1R15A with the C-terminal region of I-1 engenders strong PP1 inhibition99 and a potential amplification of the contractile response in VSMC101. In addition, PPP1R15A is known for targeting PP1 for the dephosphorylation of the eukaryotic initiation factor 2 alpha subunit (eIF-2alpha), leading to regulation of cell growth arrest and apoptosis under specific stress conditions including deprivation, heat shock, and viral infection102. Since there is no concrete evidence of the consequences of I-1 regulation on contractile signaling through the interaction with PPP1R15A, specifically in VSMC, we can only speculate that this gene may be important for BP regulatory mechanisms. Further experimental validation will be crucial to establish the link between PPP1R15A interactions with I-1 and the regulation of PP1 activity in VSMC.

In addition, we found rs11065987 associated with both systolic and diastolic BP

responses to HCTZ in PEAR whites, and it is also associated in trans with decreased

expression of 2 genes in our top list of BP signature genes: FOS and PPP1R15A.


rs11065987, the lead SNP in this small haplotype block, is an intergenic SNP on chromosome 12 whose closest gene is BRCA1 associated protein; previous cardiovascular disease GWA studies identified 12q24 as a risk locus for coronary artery disease and HTN103. Further experiments will be valuable to understand the mechanisms involved in gene expression regulation in the chromosome 12q24 region that could potentially affect BP regulation as well.

We also observed allele specific FOS and PPP1R15A mRNA expression in 9

SNPs, which had previously been identified in association with decreased expression of

these genes. We observed moderate differences between reference and alternative

allele expression, which suggests the presence of cis-acting regulatory variants. Future

studies with greater sample sizes will enable search for potential regulatory variants in

FOS and PPP1R15A regions.

Although it is not clear how FOS, DUSP1 and PPP1R15A are involved in BP

regulation, the differences in gene expression documented in this study taken together

with evidence of gene expression regulatory mechanism with AEI in cis-eQTLs and

trans-eQTLs associated with BP response to HCTZ suggest that these genes may be

markers of response to thiazide diuretics. Further functional studies may provide

additional insights to the field.

This study has some limitations. First, the number of samples with RNA-Seq data may have limited the power to identify additional differentially expressed genes as well as to validate some of the transcriptomic signals; however, we enhanced the power achievable with the number of samples tested by taking an extreme phenotype approach.

Second, using RNA from whole blood for RNA-Seq data analysis may have limited the


detection of the expression of some genes/regulatory mechanisms that might be cell type-specific. However, it may be challenging to select only one tissue in which to investigate gene expression as a marker of BP regulation, since response to anti-HTN drugs might arise from a variety of target tissues such as heart, brain, kidney or vasculature. Not only are these tissues difficult to access in relatively healthy patients, as hypertensive patients are, but it is also not obvious which tissue should be used. Thus we are using whole blood as a surrogate for multiple tissues. Moreover, the original study that served as the basis for selection of BP signature genes also used whole blood samples for its transcriptome-wide gene expression studies, owing to the convenience of identifying biomarkers in easily accessible body fluids11.

In conclusion, these findings suggest that whole transcriptome data can provide insights into genes potentially involved in the pharmacogenetic phenotype of antihypertensive drug response. Specifically, we identified genes, previously reported through BP/HTN transcriptome profiling, that are also relevant determinants of BP response to TD. In particular, FOS, DUSP1 and PPP1R15A, through their differential expression, may be involved in the response to TD. To strengthen the finding, through use of a publicly available eQTL database, we found an eQTL (SNP) of FOS and PPP1R15A that was associated with BP response to TD. Further work is needed

to understand the mechanistic basis by which differential expression of FOS, DUSP1

and PPP1R15A may influence BP regulation and response to TD.


Table 2-1. Characteristics of PEAR and PEAR-2 participants classified as responders and non-responders for the RNA-Seq analysis.

                      Whites (n=99)                                                   Blacks (n=50)
                      HCTZ                            Chlorthalidone                  Chlorthalidone
Characteristics       Responders      Non-responders  Responders      Non-responders  Responders      Non-responders
                      (n=24)          (n=25)          (n=25)          (n=25)          (n=25)          (n=25)
Age                   48±12           48±8            53±7.9          47±11           50±8            50±10
Female, n (%)         11 (44%)        10 (40%)        15 (75%)        5 (25%)         12 (48%)        12 (48%)
Baseline DBP          93.5±4.9        94.4±4.3        96.5±6.5        92.9±4.9        97.7±6.1        93.4±4.1
Baseline SBP          146±10.5        144.1±9.7       151.6±10.8      144.5±10        152.6±10.4      145.8±10.5
DBP response to TD    -8.8±6.3        0.06±3.6        -13.8±3.7       -0.8±2.0        -16.9±4.2       -1.4±2.8
SBP response to TD    -12.5±6.3       -0.9±5.8        -20.7±7.0       -2.5±5.1        -27.4±7.8       -4.4±5.1

Values for continuous variables are presented as mean ± standard deviation.
SBP: systolic blood pressure; DBP: diastolic blood pressure; TD: thiazide diuretics


Table 2-2. Genes previously associated with BP/HTN15 and the expression measurements in PEAR whites and PEAR-2 whites and blacks treated with HCTZ and chlorthalidone, respectively

                                      HCTZ whites             Chlorthalidone whites   Chlorthalidone blacks
Gene      Chr.  Position (bp)         Fold change  P value    Fold change  P value    Fold change  P value
DUSP1     5     172185228-172204777   1.38         1.50E-04   1.30         1.35E-03   1.29         3.55E-03
FGFBP2    4     15961865-16086001     0.75         3.50E-04   1.37         4.00E-04   1.09         3.25E-01
PPP1R15A  19    49375648-49379314     1.27         1.15E-03   1.19         3.61E-02   1.29         1.75E-03
NKG7      19    51874859-51875969     0.78         1.40E-03   1.27         3.80E-03   1.07         4.42E-01
FOS       14    75745476-75748933     1.26         2.90E-03   1.29         1.15E-03   1.46         5.00E-05
GPR56     16    57644563-57698944     0.75         7.50E-03   1.31         1.05E-03   1.15         1.07E-01
GLRX5     14    95998633-96011061     0.80         1.32E-02   1.01         9.47E-01   0.84         9.15E-02
SLC31A2   9     115913221-115983641   1.30         5.13E-02   1.24         1.31E-01   1.21         2.01E-01
PTGS2     1     186640922-186649559   1.18         5.49E-02   1.05         5.82E-01   1.04         7.21E-01
GZMB      14    25064928-25126980     0.80         6.99E-02   1.13         3.78E-01   1.15         3.42E-01
IL2RB     22    37521877-37595425     0.86         7.25E-02   1.09         3.29E-01   1.00         9.75E-01
PRF1      10    72357103-72362531     0.88         1.01E-01   1.07         4.43E-01   1.03         6.95E-01
TAGLN2    1     159887896-159895522   1.15         1.03E-01   1.11         2.28E-01   1.28         8.40E-03
VIM       10    17256237-17279592     1.15         1.10E-01   1.15         1.01E-01   1.16         9.62E-02
MYADM     19    54357834-54379691     1.14         1.13E-01   1.25         8.35E-03   1.26         1.14E-02
CD97      19    14491312-14519537     1.21         1.20E-01   1.13         2.76E-01   1.18         1.34E-01
TAGAP     6     159393902-159486305   1.12         1.98E-01   1.19         8.26E-02   1.17         1.62E-01
MCL1      1     150547031-150552066   1.11         2.42E-01   1.13         1.51E-01   1.12         1.81E-01
GRAMD1A   19    35485687-35517375     1.15         2.60E-01   1.02         8.47E-01   1.21         6.87E-02
OBFC2A    2     192542793-192553251   1.12         2.80E-01   1.06         5.69E-01   1.07         5.55E-01
GNLY      2     85912297-85925977     0.89         3.00E-01   1.15         2.20E-01   1.24         7.71E-02


Table 2-2. Continued

                                      HCTZ whites             Chlorthalidone whites   Chlorthalidone blacks
Gene      Chr.  Position (bp)         Fold change  P value    Fold change  P value    Fold change  P value
CLC       19    40221889-40228668     1.08         4.56E-01   1.15         2.34E-01   1.16         2.27E-01
S100A10   1     151955390-151966866   1.06         4.76E-01   1.21         3.91E-02   1.24         1.73E-02
ANXA1     9     75766672-75785309     1.06         4.99E-01   1.10         3.14E-01   1.14         3.20E-01
ANTXR2    4     80822302-81046608     1.09         5.00E-01   1.10         5.19E-01   1.22         2.55E-01
AHNAK     11    62201015-62323707     0.92         5.48E-01   1.11         2.24E-01   1.13         2.22E-01
TMEM43    3     14166439-14242619     1.07         5.64E-01   1.11         3.97E-01   1.17         2.10E-01
TIPARP    3     156389650-156424559   0.96         6.68E-01   1.14         1.89E-01   1.10         3.65E-01
BHLHE40   3     4938492-5027008       1.03         7.01E-01   1.13         1.48E-01   1.03         7.41E-01
PIGB      15    55495163-55800432     1.11         7.65E-01   1.22         1.35E-01   1.14         3.45E-01
ARHGAP15  2     143848930-144533642   1.10         8.09E-01   1.05         8.59E-01   1.16         5.85E-01
FBXL5     4     15606161-15739936     1.00         9.73E-01   1.02         8.53E-01   1.08         5.12E-01
HAVCR2    5     156512842-156682201   1.01         9.74E-01   1.05         7.82E-01   0.95         8.04E-01


Table 2-3. Genes differentially expressed between responders and non-responders to HCTZ and chlorthalidone in all 3 cohorts, with consistent direction and transcriptome-wide statistical significance when meta-analyzed

          HCTZ Whites                               Chlorthalidone Whites                     Chlorthalidone Blacks                     Meta-analysis
Genes     Non-resp.  Resp.  Fold Change  P value    Non-resp.  Resp.  Fold Change  P value    Non-resp.  Resp.  Fold Change  P value    P value
FOS       39.2       49.5   1.3          2.9E-03    29.4       38.0   1.29         1.15E-03   24.6       35.9   1.46         5.00E-05   2.08E-12
DUSP1     76.0       105    1.4          1.5E-04    71.5       92.8   1.30         1.35E-03   63.3       81.7   1.29         3.55E-03   9.50E-12
PPP1R15A  38.3       48.7   1.3          1.1E-03    29.9       35.5   1.19         3.61E-02   27.6       35.6   1.29         1.75E-03   3.64E-08

Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM)
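As a quick arithmetic check on the fold change definition in the footnote above, the following minimal sketch divides the mean responder FPKM by the mean non-responder FPKM for the HCTZ whites columns of Table 2-3. The function name `fold_change` is ours and purely illustrative; the FPKM values are copied from the table.

```python
def fold_change(responder_fpkm: float, non_responder_fpkm: float) -> float:
    """Ratio of mean expression in responders to mean expression in non-responders."""
    if non_responder_fpkm <= 0:
        raise ValueError("non-responder FPKM must be positive")
    return responder_fpkm / non_responder_fpkm


# Mean FPKM (non-responders, responders) from the HCTZ whites columns of Table 2-3.
hctz_whites = {
    "FOS": (39.2, 49.5),
    "DUSP1": (76.0, 105.0),
    "PPP1R15A": (38.3, 48.7),
}

fold_changes = {
    gene: round(fold_change(resp, non_resp), 2)
    for gene, (non_resp, resp) in hctz_whites.items()
}
print(fold_changes)
```

A ratio above 1 indicates higher expression in responders. Rounded to two decimals, these ratios agree with the HCTZ whites fold changes listed for the same genes in Table 2-2.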


Table 2-4. Differences in baseline expression levels for FOS, DUSP1 and PPP1R15A between thiazide diuretics responders and non-responders in PEAR and PEAR-2 with adjustment for age, gender and baseline blood pressure

          HCTZ Whites             Chlorthalidone Whites   Chlorthalidone Blacks
Genes     Fold Change  P value    Fold Change  P value    Fold Change  P value
FOS       1.23         0.0334     1.23         0.0454     1.3          0.069
DUSP1     1.45         0.0242     1.23         0.0466     1.18         0.14
PPP1R15A  1.28         0.0025     1.14         0.1632     1.2          0.071

Generalized linear model implemented in edgeR21
Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM)


Table 2-5. Representative trans eQTL for top differentially expressed genes and association with BP response to thiazide diuretics in PEAR whites and PEAR-2 whites and blacks

SNP - Gene Association* for rs11065987
Gene        Z score   P value
FOS         -5.4      5.60E-08
PPP1R15A    -4.7      2.81E-06

Association of rs11065987 with BP response
Participants     Phenotype   β      SE    P
PEAR whites      HCTZ DBP    -1.4   0.5   2.9E-03
PEAR whites      HCTZ SBP    -2.1   0.7   1.8E-03
PEAR-2 whites    CLTD DBP    -0.5   0.5   0.363
PEAR-2 whites    CLTD SBP    -0.1   0.8   0.858
PEAR-2 blacks    CLTD DBP    1.1    1.4   0.426
PEAR-2 blacks    CLTD SBP    2.5    2.1   0.247

*Data from Blood eQTL database22
SNP, single nucleotide polymorphism; HCTZ, hydrochlorothiazide; CLTD, chlorthalidone; SBP, systolic blood pressure and DBP, diastolic blood pressure


Table 2-6. SNPs with AEI ≥1.3-fold and significant eQTLs association from Blood eQTL browser

                      Samples  Samples with AEI  No    Samples with AEI  Average     Blood eQTL browser   Functional
Genes     SNPs        tested   ≤ -log2 0.3       AEI   ≥ log2 0.3        read depth  Z-score   P-value    annotation
FOS       rs7101      28       5                 18    5                 138         -3.5      4.0E-04    5'UTR
PPP1R15A  rs564196    25       2                 18    5                 92          -10.0     1.7E-23    Missense
PPP1R15A  rs611251    27       1                 20    6                 110         -10.1     5.6E-24    Missense
PPP1R15A  rs557806    19       0                 11    8                 121         -12.8     2.4E-37    Missense
PPP1R15A  rs610308    27       4                 21    2                 99          -9.2      5.0E-20    Missense
PPP1R15A  rs556052    31       7                 17    7                 93          -9.5      1.4E-21    Missense
PPP1R15A  rs500079    25       4                 18    3                 92          -12.1     9.6E-34    Missense
PPP1R15A  rs524       30       3                 22    5                 104         -12.1     9.6E-34    Synonymous
PPP1R15A  rs527       27       4                 17    6                 87          -12.1     1.2E-33    Synonymous


Figure 2-1. Mapping statistics for PEAR and PEAR-2 RNA-Seq data. The blue line represents the total number of reads aligned to the human reference genome (hg19) for the 149 samples included in this study. The orange line represents uniquely mapped reads per sample and the dashed line represents the total number of reads that remained after duplicate removal with the Picard MarkDuplicates option.



Figure 2-2. Linkage disequilibrium plots between the rs11065987, rs653178, rs10774625 and rs11066301 single nucleotide polymorphisms. Linkage disequilibrium is represented as r² (A) and D' (B) values, with data from the 1000 Genomes Project phase 3 release, CEU population, using Haploview104.


[Figure 2-3 plot. x-axis: heterozygotes at rs7101 (n=28); y-axis: log2 expression ratio (C/T), from -1.0 to 1.0]

Figure 2-3. Rs7101 allele-specific expression analysis. Each bar represents one heterozygous individual at the rs7101 SNP.



Figure 2-4. The effect of rs11065987 polymorphism on the blood pressure response of Whites treated with HCTZ in PEAR. Blood pressure responses were adjusted for baseline blood pressure, age, sex, and population substructure. P-values represent the contrast of adjusted means between different genotype groups in the PEAR white participants. Error bars represent standard error of the mean. A) systolic blood pressure response to HCTZ in PEAR whites. B) diastolic blood pressure response to HCTZ in PEAR whites.


Figure 2-5. Rs1046117 allele-specific expression analysis. Each bar represents one heterozygous individual at the rs1046117 SNP.


Figure 2-6. PPP1R15A rs557806 allele-specific expression ratios (major allele over minor allele). Each bar represents the magnitude and direction of allelic expression imbalance (AEI) for one heterozygous individual indicated on a log2 scale.


CHAPTER 3 WHOLE TRANSCRIPTOME SEQUENCING ANALYSES REVEAL MOLECULAR MARKERS OF BLOOD PRESSURE RESPONSE TO THIAZIDE DIURETICS

Introduction

Hypertension (HTN) affects approximately 80 million adults in the United States and 1 billion worldwide3, 93. While it is the most important modifiable risk factor for

cardiovascular diseases and renal disease, the current available evidence shows that

use of antihypertensive medications is associated with decreased morbidity and

mortality2. Despite the availability of numerous blood pressure (BP) lowering

medications from different drug classes, with different mechanisms of action, only about

half of patients on antihypertensive treatment achieve appropriate BP control6, 105.

Thiazide diuretics (TD) are among the most commonly prescribed antihypertensive medications in the US, with more than 50 million hydrochlorothiazide (HCTZ) prescriptions106 in 2014, and likely double that when combination products are considered. Thiazides are a first-line option for HTN treatment, yet patients’ responses vary widely and fewer than 40% of patients achieve BP control6, 107. This suggests that inter-individual variability in BP response to TD likely contributes to suboptimal BP control.

In the past 10 years, pharmacogenomic studies have increased our understanding of the potential role of specific genetic variants in BP response to antihypertensive drugs13, 14, 92. Recently, two replicated regions, one in PRKCA (protein kinase C, alpha) and the other near GNAS (G protein alpha subunit), were identified with clinically relevant effects on BP response to HCTZ12. Despite success with the GWAS approach, stringent cutoffs for statistical significance (P < 5.0 × 10⁻⁸) relative to the sample sizes available in hypertension pharmacogenomics cohorts limit


the detection of additional polymorphisms influencing BP response to antihypertensive drugs.

The recent development of cheaper, faster, high-throughput sequencing technologies has enabled the systematic analysis of hundreds of millions of DNA and

RNA fragments20. Among its applications, RNA-Seq has brought relevant qualitative

and quantitative improvements to transcriptome analysis, offering an unprecedented

level of resolution and a unique tool to simultaneously investigate different layers of

transcriptome complexity. RNA-Seq has facilitated transcriptomic approaches that have successfully identified biomarkers associated with different diseases and traits, helping to bridge the gap between genomics and phenotype. Thus, in this study, we aim to identify genes/transcripts associated with BP response to thiazide diuretics

and investigate allele specific expression within these genes, as a mechanism to

potentially explain the detected differences in gene expression.

Methods

Study Participants

The primary analysis of this study included clinical data and whole blood samples from hypertensive participants from the Pharmacogenomic Evaluation of

Antihypertensive Responses (PEAR) and PEAR-2 studies (NCT00246519 and NCT01203852, www.clinicaltrials.gov). Details of these studies were previously published83. In brief, PEAR was a multicenter, randomized clinical trial, one of whose primary aims was to evaluate the role of genetics in the BP response of patients treated with HCTZ and/or atenolol. PEAR recruited 768 study participants with uncomplicated HTN from

the University of Florida (Gainesville, FL), Emory University (Atlanta, GA), and the Mayo

Clinic (Rochester, MN). These participants were randomized to receive monotherapy with


either the thiazide diuretic HCTZ, or the beta-blocker atenolol for a period of 9 weeks.

Fasting blood (including DNA and RNA) and urine samples were collected at baseline

(untreated), after 9 weeks of monotherapy, and after 9 weeks of combination therapy

(HCTZ + atenolol). BP response measurements were assessed using office, home, and

24-hour ambulatory BP and then a composite BP response was constructed84.

PEAR-2 was a prospective, multi-center, sequential monotherapy clinical trial,

which recruited a hypertensive population with similar characteristics to the one in

PEAR. One of its primary aims was to investigate the role of genetics in the response to metoprolol, a beta-blocker, and chlorthalidone, a thiazide-like diuretic. Details of this prospective clinical trial were previously published85. Briefly, 417 hypertensive participants had a washout period of at least 4 weeks prior to each active treatment period, with metoprolol (beta-blocker) and then chlorthalidone (thiazide diuretic). Home and clinic BP

measurements, adverse metabolic effects, RNA and DNA from whole blood, and urine

samples were collected.

Study participants from PEAR and PEAR-2 provided written informed consent.

The Institutional Review Boards at the participating clinical trial sites approved both

PEAR and PEAR-2 studies, which were conducted in accordance with the principles of

the Declaration of Helsinki and the US Code of Federal Regulations for Protection of

Human Subjects.

Gene expression profile with RNA-Seq

RNA-Seq was performed in 150 PEAR whites and PEAR-2 white and black

participants, selected based on the differences in their BP response to HCTZ and

chlorthalidone treatment, respectively. Sample selection was based on BP responses to

either HCTZ or chlorthalidone in the top and bottom quartiles from each of the three


cohorts and participants were classified as poor BP responders (non-responders) and good BP responders (responders).

We determined the mean changes in serum potassium and serum uric acid concentrations in non-responders after treatment with HCTZ and chlorthalidone in order to assess treatment compliance in the group of non-responders to TD. We also compared baseline and post-treatment serum potassium and uric acid using paired t-tests. Both potassium depletion and uric acid elevation are commonly observed secondary to treatment with TD108-110, and both changed significantly in the overall PEAR clinical study population111, 112.

Total RNA was extracted from whole blood samples using the PAXgene Blood RNA kit IVD (Qiagen, Valencia, CA), then mRNA was selected using a poly(A) selection protocol with Sera-Mag Magnetic Oligo(dT) Beads (Illumina, San Diego, CA) and fragmented to a mean length of approximately 120 to 180 base pairs. Strand-specific complementary DNA libraries

were prepared and sequenced on an Illumina HiSeq 2000 at the Baylor Human Genome Sequencing Center in Texas. One of the samples from HCTZ responders did not yield enough library for adequate sequencing.

The paired-end 100 bp reads generated were uniquely mapped to the human

reference genome (hg19) using TopHat (v2.0.10)87, allowing four read mismatches, a read edit distance of six, one mismatch in the anchor region of a spliced read, and a maximum of five multi-hits. PCR duplicates were removed using the Picard (http://picard.sourceforge.net) MarkDuplicates option. Transcript structure assembly

was performed using Cufflinks v2.2.1 on each sample. Gene expression levels (in


Fragments per Kilobase of transcript per Million mapped reads, FPKM) were calculated by considering per-isoform FPKM measurements carried out with Cuffdiff v2.2.1.
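To make the FPKM unit concrete, the sketch below implements the textbook normalization: fragment count divided by transcript length in kilobases and by library size in millions of mapped fragments. This is only an illustration of the unit. Cufflinks and Cuffdiff estimate abundances with a statistical model rather than this direct ratio, and the function names, the example numbers, and the choice to obtain a gene-level value by summing isoform values are assumptions made here for illustration.

```python
def fpkm(fragment_count: int, transcript_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of transcript per million mapped fragments."""
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("transcript length and library size must be positive")
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragment_count / (kilobases * millions)


def gene_fpkm(isoform_fpkms: list[float]) -> float:
    """Gene-level FPKM taken as the sum of per-isoform FPKM values (assumption)."""
    return sum(isoform_fpkms)


# Hypothetical isoform: 500 fragments, 2,000 bp long, in a library of 20 million mapped fragments.
example = fpkm(500, 2_000, 20_000_000)
print(example)
```

Because the count is scaled by both length and library size, FPKM values can be compared across genes of different lengths and across samples sequenced to different depths.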

Additionally, alternative tools were applied for differential expression analysis in order to include age, gender and baseline diastolic BP in the statistical model for association with BP response to TD. With BAM files from TopHat 2 alignments, the htseq-count script from the HTSeq Python package88 was applied to count the number of reads assigned to known human genes (Gencode gene annotation release 18). Then, these read counts were fitted to a negative binomial distribution using a generalized linear model in edgeR89.

Statistical Methods

The primary data analysis for this study was performed in whites treated with

HCTZ or chlorthalidone. Whole transcriptome expression levels were quantified by

measuring read counts that overlap protein coding genes (count matrix) and Fragments

per Kilobase of transcript per Million mapped reads (FPKM). A t-test was applied in

order to assess the statistical significance for the observed differences in expression

levels between responders and non-responders to TD. False discovery rate (FDR)

adjusted p-values (Q value) < 0.05 were considered statistically significant.
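The FDR adjustment can be sketched as follows. The text does not name the adjustment procedure, so the Benjamini-Hochberg step-up method is assumed here; the function name and the input p-values are hypothetical and are not results from this study.

```python
def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Return Benjamini-Hochberg adjusted p-values (q-values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is capped by those ranked above it.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical per-gene p-values, for illustration only.
hypothetical_p = [0.01, 0.04, 0.03, 0.005]
q = benjamini_hochberg(hypothetical_p)
significant = [qv < 0.05 for qv in q]
print(q, significant)
```

Each q-value is the smallest FDR level at which the corresponding gene would be declared significant, which is why a single cutoff of 0.05 can be applied across the whole transcriptome.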

In order to validate the association of gene expression differences with BP response to TD, we aimed to replicate each gene found to be differentially expressed in PEAR or PEAR-2 whites in the alternate group of whites and in PEAR-2 blacks. The a priori criteria for validation were Q value < 0.05 (considering the subset of genes differentially expressed) and consistent fold change direction (up or down regulation of expression) in all three groups: 1) whites treated with HCTZ, 2) whites treated with chlorthalidone and 3) blacks treated with chlorthalidone.
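The validation rule described above can be written as a small predicate. This is a minimal sketch: the class and function names are hypothetical, and the fold changes and Q values in the example are made up for illustration rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class CohortResult:
    fold_change: float  # expression in responders divided by expression in non-responders
    q_value: float      # FDR-adjusted p-value


def passes_validation(results: list[CohortResult], q_cutoff: float = 0.05) -> bool:
    """True if every cohort is significant and all fold changes point the same way."""
    if not results:
        return False
    significant = all(r.q_value < q_cutoff for r in results)
    all_up = all(r.fold_change > 1.0 for r in results)
    all_down = all(r.fold_change < 1.0 for r in results)
    return significant and (all_up or all_down)


# Hypothetical gene measured in the three cohorts (HCTZ whites, chlorthalidone whites and blacks).
hypothetical_gene = [CohortResult(1.3, 0.01), CohortResult(1.2, 0.03), CohortResult(1.5, 0.001)]
print(passes_validation(hypothetical_gene))
```

Requiring both significance and a shared direction guards against a gene that is significant in every cohort but up-regulated in one and down-regulated in another.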


The differential expression results from each study cohort were combined in a meta-analysis, applying the Empirically Adjusted Meta-analysis113 with central matching

approach to estimate the empirical null, followed by the Fisher p-value combination

method implemented by the R package MetaRNASeq90. We considered that genes with meta-analysis p-values < 2.0 × 10⁻⁶ (0.05/25,000) achieved transcriptome-wide significance for association with BP response to TD.
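The Fisher combination step and the transcriptome-wide threshold can be illustrated with a short standard-library sketch. It covers only the Fisher step: the empirical null adjustment that precedes it in the analysis is not reproduced, this is not the MetaRNASeq implementation, and the function and constant names are ours. Combined values computed this way from raw per-cohort p-values should therefore not be expected to match the reported meta-analysis p-values.

```python
import math


def fisher_combined_p(p_values: list[float]) -> float:
    """Combine independent p-values with Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with 2k
    degrees of freedom. For even degrees of freedom the upper tail has the
    closed form exp(-x/2) * sum_{i<k} (x/2)^i / i!.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # equals x / 2
    term, tail = 1.0, 1.0
    for i in range(1, len(p_values)):
        term *= half_stat / i
        tail += term
    return math.exp(-half_stat) * tail


# Bonferroni threshold stated in the text: 0.05 divided by 25,000 tests.
TRANSCRIPTOME_WIDE_ALPHA = 0.05 / 25_000


def is_transcriptome_wide(p_values: list[float]) -> bool:
    """True if the Fisher-combined p-value is below the transcriptome-wide threshold."""
    return fisher_combined_p(p_values) < TRANSCRIPTOME_WIDE_ALPHA
```

For two p-values the combined value reduces to p1·p2·(1 − ln(p1·p2)), which gives a convenient hand check of the implementation.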

Genomics Analysis

The genome-wide genotyping results for the PEAR and PEAR-2 studies were

previously reported91, 92. GWAS associations with chlorthalidone response in PEAR-2

will be reported separately. Briefly, Illumina Human Omni-1Million Quad BeadChip and

2.5M-8 BeadChip (Illumina, San Diego CA) platforms were used for genotyping PEAR and PEAR-2 DNA samples, respectively. SNP calling was performed using GenTrain2

clustering algorithm (GenomeStudio, Illumina, San Diego CA). MaCH software (version

1.0.16) was used for pre-phasing and Minimac was used to impute SNPs based on the reference panels from the 1000 Genomes Project Phase I.

We selected SNPs within the two genes that passed the validation criteria and

the top gene associated with BP response to TD in the meta-analysis of gene

expression for statistical tests of association with HCTZ and chlorthalidone BP response. For each of the three genes, SNPs were selected within 1 kb of the coding region. Linkage disequilibrium pruning was conducted using the LDlink web tool (https://analysistools.nci.nih.gov/LDlink/), based on the 1000 Genomes panel representing the population with Caucasian ancestry (CEU), pruning SNPs with r² greater than 0.7. A total of 109 SNP associations with BP response were then

evaluated using previously conducted GWAS analyses91 that included data on systolic


and diastolic BP responses to HCTZ in 228 white participants from PEAR, and responses to chlorthalidone in 185 white and 142 black participants from PEAR-2.

PLINK software was used to run the analysis with adjustment for age, gender, pre-HCTZ/chlorthalidone BP and population substructure, by including the first and second principal components (PC1 and PC2), in all our analyses. A Bonferroni-corrected p-value below 4.6 × 10⁻⁴ (0.05/109) was defined as the statistical significance threshold for this analysis.

Allele Specific Expression (ASE) Analysis

We also tested for allelic mRNA expression imbalance within 2 kb upstream or downstream of the coding region for the genes that passed the validation criteria in the differential expression analysis and for the top gene associated with TD BP response in the meta-analysis of gene expression. The ASE analyses were conducted with heterozygous white participants from PEAR and PEAR-2 (n=100), as our sample size in blacks (n=50) was too small for a meaningful analysis. A personalized genome was built by substituting the reference allele with the variant allele at each SNP in hg19 using the GATK FastaAlternateReferenceMaker tool (www.software.broadinstitute.org/gatk/gatkdocs/org_broadinstitute_gatk_tools_fasta_FastaAlternateReferenceMaker.php) in order to overcome potential bias in read alignment, where reference allele reads can align preferentially over alternative allele reads94. RNA-Seq reads were mapped using STAR v2.5.2b and a two-pass strategy.

We followed the Broad Institute best practices workflow for SNP and indel calling from

RNA-Seq data (https://www.broadinstitute.org/gatk/guide/article?id=3891). For each

SNP, ASE ratios were obtained by dividing reference allele read counts by alternative allele read counts. A binomial statistical test was applied to determine


whether this ratio deviates from the expected 50:50, when the two alleles are expressed equally.
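The ASE ratio and the binomial test described above can be sketched with the standard library alone. This is a minimal illustration for modest read depths, not the pipeline used in the study: the function names are hypothetical, and the two-sided p-value is defined here as the total probability of all outcomes no more likely than the observed one, which is one common convention.

```python
import math


def log2_ase_ratio(ref_count: int, alt_count: int) -> float:
    """log2 of reference-allele reads over alternative-allele reads."""
    if ref_count <= 0 or alt_count <= 0:
        raise ValueError("both alleles need at least one read to form a ratio")
    return math.log2(ref_count / alt_count)


def binomial_two_sided_p(ref_count: int, alt_count: int, expected: float = 0.5) -> float:
    """Exact two-sided binomial test of the reference-allele fraction against `expected`."""
    n = ref_count + alt_count
    pmf = [math.comb(n, k) * expected**k * (1 - expected) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Hypothetical heterozygous site with 40 reference reads and 20 alternative reads.
ratio = log2_ase_ratio(40, 20)
p_value = binomial_two_sided_p(40, 20)
print(ratio, p_value)
```

A log2 ratio of 0 corresponds to the expected 50:50 expression; positive values indicate higher expression of the reference allele and negative values higher expression of the alternative allele.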

Results

Table 3-1 displays baseline and demographic characteristics from PEAR and

PEAR-2 participants selected for RNA-Sequencing. Age, gender and baseline BP did not differ significantly between PEAR participants classified as responders and non-responders to HCTZ. Nonetheless, we detected

statistically significant differences between PEAR-2 white and black responders and

non-responders to chlorthalidone, as shown in Table 3-1.

After treatment with HCTZ and chlorthalidone, there were significant reductions

in serum potassium concentrations and significant increases in serum uric acid levels

in participants classified as non-responders (Table 3-2). These changes are consistent

with previously reported metabolic effects after treatment with TD111, 112, and suggest

high treatment compliance in the group of non-responders to TD.

In order to study inter-individual variability in expression that potentially impacts

BP response to TD, we generated transcriptome sequencing data from 150

hypertensive participants treated with HCTZ or chlorthalidone, and data passed quality

control procedures on 149. For each sample, RNA-Seq reads were mapped to the

human genome, resulting in 11-63 million mapped reads per sample. Of those, 79-95%

of the reads were uniquely mapped. These and other mapping statistics are presented in Table 3-3.

Differential mRNA Expression

For the primary analysis with PEAR and PEAR-2 whites, we investigated genes

differentially expressed between responders and non-responders to HCTZ and


chlorthalidone, respectively. There were 12,948 and 13,160 transcripts detected with FPKM ≥ 1 in the responders or non-responders to HCTZ and chlorthalidone, respectively. At Q value < 0.05, there were 11 and 18 unique genes differentially expressed in PEAR and PEAR-2 whites, respectively (Figure 3-1 and Tables 3-3 and 3-4).

Validation of gene expression associations with BP response to TD

In order to validate the differential expression results, we attempted replication in the other white group and in PEAR-2 blacks for each gene differentially expressed in

PEAR or PEAR-2 whites (Tables 3-3 and 3-4). CEBPD and TSC22D3 showed

statistically significant differences in expression and consistent fold change direction

(FPKM in responders compared to non-responders) in all 3 groups tested (Table 3-6).

The results from the meta-analysis displayed in Table 3-6 revealed that the association of CEBPD and TSC22D3 expression with BP response to TD achieved transcriptome-wide significance (CEBPD: P = 1.8 × 10⁻¹¹ and TSC22D3: P = 1.9 × 10⁻⁹). We observed higher

CEBPD expression in responders than non-responders to TD across blacks and whites

and the two different drugs in the TD drug class: HCTZ and chlorthalidone (Figure 3-2 A-C). In contrast, TSC22D3 showed increased expression levels in non-responders to TD consistently in PEAR whites and PEAR-2 white and black participants (Figure 3-2 D-F). These results indicate the potential for CEBPD and TSC22D3 to be considered as

molecular determinants of BP response to TD.

Although SERINC5, TFCP2, PPP2R5C, METTL23 and LTF did not pass the

validation criteria for association with TD BP response in all the PEAR and PEAR-2

cohorts tested for differential expression, these genes reached statistical significance with the same fold change direction in the two PEAR-2 cohorts (whites and blacks) treated


with chlorthalidone, though differences in expression between responders and non-responders to HCTZ were not observed (Table 3-5). In addition, SERINC5 had the lowest p-value in the meta-analysis and showed the greatest magnitude of fold change between responders and non-responders to chlorthalidone (PEAR-2 whites: fold change = 0.05, blacks: fold change = 0.11) (Table 3-7).

The differential expression results with edgeR, including age, gender and baseline BP in the statistical model, revealed effect sizes (fold change in expression between responders and non-responders) similar to those obtained with Cuffdiff for CEBPD (Table 3-8), although the p value of this association was not as low. The edgeR analyses for TSC22D3 and SERINC5 were not statistically significant (Table 3-8).

Since TSC22D3 is located on the X chromosome, we also investigated the overall expression levels (FPKM) of this gene in PEAR and PEAR-2 male and female participants (Appendix, Figure A-1). There were no sex-specific differences detected in TSC22D3 expression (PEAR: P=0.09, PEAR-2 whites: P=0.37 and PEAR-2 blacks: P=0.39), which suggests that escape from X inactivation was not the cause of the observed TSC22D3 differential expression results.

Genomics Analysis

A total of 109 independent SNPs were selected to test for association with TD BP response. These SNPs were within 1 kb of the coding region of CEBPD,

TSC22D3 and SERINC5. We did not find any SNPs associated with BP response to

HCTZ or chlorthalidone in CEBPD or TSC22D3 target regions. SERINC5 intronic SNP

rs10042497 showed statistically significant association with SBP and DBP response to


chlorthalidone in PEAR-2 white participants (SBP: β = 3.1; p = 1.7 × 10⁻⁴; DBP: β = 1.7; p = 1.6 × 10⁻³) (Figure 3-3) and showed consistent directional association in PEAR whites (SBP: β = 0.67; p = 0.155; DBP: β = 0.68; p = 0.067).

Allele Specific Expression Analysis

We also sought to determine whether there was evidence of cis-acting regulation

for CEBPD and TSC22D3. However, we were not able to obtain a sufficient number of heterozygotes (>2) or enough RNA-Seq coverage (> 30 reads) for ASE analysis in these

candidate gene regions.

Because SERINC5 was the top gene associated with BP response to TD in the

meta-analysis, we considered investigating potential genetic mechanisms that could

account for the differences in expression observed in PEAR-2 whites and blacks, even

though this gene did not pass the a priori criteria for validation (due to unchanged

expression in PEAR whites). Table 3-9 shows all SNPs in SERINC5 region with at least

2 heterozygous participants presenting allelic expression imbalance and sequencing

coverage greater than 30 reads. ASE analysis revealed that the reference C allele in

SERINC5 rs10072008 is more expressed than the variant T allele (mean log2 ASE

ratio= 0.3 and Pbinom=0.03). This ASE effect, with fold change ≥ log2 0.3 (1:1.3), was

consistently observed in 7 out of 17 heterozygotes at this locus (Figure 3-4). We found another 3’UTR SNP, rs7707754, in high LD with rs10072008 (r² = 0.5 and D' = 1)

presenting similar ASE pattern: fold change ≥ log2 0.3 in 6 of 13 heterozygotes at

rs7707754 (Figure 3-5). Both rs10072008 and rs7707754 are significantly associated with reduced SERINC5 expression in whole blood (Blood eQTL browser)93 (Table 3-9).

In addition, we observed consistent ASE effect in almost all heterozygotes (4 of 5) at


SERINC5 intronic SNP rs78174795, where the C allele was more expressed than the T reference allele (mean log2 ASE ratio=1.1 and Pbinom=0.004) (Figure 3-6). A similar

effect was observed at rs11951568 (mean log2 ASE ratio=1.0 and Pbinom=0.045) (Figure 3-7), which is in high LD with rs78174795 (r² = 0.7 and D' = 1) and strengthens the evidence for a regulatory effect on SERINC5 gene expression.

Discussion

To the best of our knowledge, this is the first study to investigate the association

of global gene expression levels with BP response to antihypertensive drugs. Unlike

other studies profiling gene expression, here we have RNA-Seq data from whole blood

samples from 3 cohorts of participants selected based on the extremes of BP response

to TD: PEAR whites treated with HCTZ and PEAR-2 whites and blacks treated with

chlorthalidone. The application of robust methods to quantify gene expression, with high sequencing resolution and available data for the replication and validation of the results, reveals the potential to provide previously unrecognized insights into BP regulation and responses to antihypertensive drugs.

Herein, we have shown that 29 genes were differentially expressed (Q value <

0.05) between white participants classified as responders and non-responders to HCTZ

or chlorthalidone (Table 3-4 and 3-5). Among them, CEBPD and TSC22D3 were

differentially expressed between responders and non-responders in three different cohorts treated with thiazide diuretics, with consistent directional fold change in whites

treated with HCTZ and whites and blacks treated with chlorthalidone (Table 3-6).

CEBPD, our top differentially expressed gene (meta-analysis P-value = 1.8 × 10⁻¹¹), is located at chromosome 8p11.2-p11.1 and encodes the transcription factor

CCAAT/enhancer binding protein delta. Previously, the expression of CEBPD was


associated with strain-specific differential transcription activation of Platelet-Derived

Growth Factor-α Receptor (PDGF-αR) expression between spontaneously hypertensive

(SHR) and normotensive (Wistar-Kyoto) rats114. This strong bimodal (all versus none)

strain-specific effect in PDGF-αR expression suggests that PDGF-αR and its

transcription-regulating factors are significantly related to genetic hypertension through

proliferation and migration of vascular smooth muscle cells114. Additionally, members of

the CEBP family of transcription factors, especially CEBPB (beta) and CEBPD, showed

regulatory effects on the expression of the angiotensinogen (AGT) gene by increasing the promoter activity mediated by interleukin 6¹¹⁵. CEBPD is known to facilitate the

binding of other transcription factors and contribute to chromatin remodeling not only for

the genes mentioned here116, with documented impact in hypertension, but also genes

involved in immune and inflammatory responses117. Therefore, further experiments will

be valuable to understand the regulatory mechanisms by which CEBPD is involved in

BP response to TD.

We also observed differences in TSC22D3 expression highly associated with BP

response to HCTZ and chlorthalidone (meta-analysis P-value = 1.9 × 10⁻⁹). TSC22D3, located at chromosome Xq22.3, encodes the anti-inflammatory protein glucocorticoid (GC)-induced leucine zipper, also known as Gilz. TSC22D3 expression is stimulated by glucocorticoids¹¹⁸, interleukin 10¹¹⁹ and aldosterone¹²⁰, and the latter plays a role in sodium homeostasis in the distal nephron via activation of the apical epithelial sodium channel (ENaC)¹²¹. Aldosterone dose-dependent activation of TSC22D3 mediates the inhibition of the negative feedback mechanism regulating ENaC deactivation, which ultimately drives sodium retention¹²⁰. Further experimental validation will be crucial to establish the link between TSC22D3 and BP regulation with TD.

In humans, most X-linked genes are subject to X-inactivation¹²². However, about 15% of them are thought to escape X-inactivation, which implies that they are expressed from both the active and the inactive X chromosome in women¹²². Because TSC22D3 is located on the X chromosome, we tested the association of its expression levels with gender (Appendix, Figure A-1). There was no statistically significant difference in expression levels between genders. Collectively, these results suggest that an effect of escape from X-inactivation is unlikely.

Although SERINC5 was not associated with BP response to TD in all the cohorts tested and did not pass the a priori validation criteria, it was differentially expressed in

PEAR-2 whites and blacks treated with chlorthalidone (Table 3-7). Also, the fact that

PEAR responders and non-responders to HCTZ did not show differences in SERINC5 expression suggests that this gene may be a potential molecular marker specific to chlorthalidone BP response. In addition, a SNP in SERINC5, rs10042497, was found in association with BP response to chlorthalidone in PEAR-2 whites (Figure 3-3).

SERINC5 encodes serine incorporator 5, a member of a family of putative carrier proteins with at least 10 transmembrane domains that incorporates serine into membranes and promotes the synthesis of phosphatidylserine and sphingolipids, two serine-derived lipids¹²³. This gene was previously linked to myelin formation and mechanisms involved in neural activity¹²⁴. Thus, this is the first report of association of SERINC5 with a blood pressure phenotype generally and chlorthalidone BP response specifically.


In the search for genetic mechanisms regulating the expression of the differentially expressed genes, we also observed 2 distinct signals that are potential candidate variants accounting for SERINC5 ASE. Cis-acting regulation by genetic variants may affect different aspects of gene expression, for example transcription, alternative mRNA processing or mRNA stability⁹⁴. SERINC5 rs10072008 and rs7707754 showed a similar pattern of consistent ASE ratios (Figures 3-4 and 3-5, respectively). Since both SNPs have been associated with decreased SERINC5 expression in whole blood samples⁹³ and are located in the 3’UTR of a gene with reported alternative polyadenylation events¹²⁵, our results suggest a potential cis-acting regulatory mechanism impacting gene expression through alternative processing events (3’UTR extension or truncation), which is usually associated with decreased mRNA expression. We also report two other SNPs, rs78174795 and rs11951568, in the SERINC5 intronic region with a greater than 2-fold ASE effect and high LD (Figures 3-6 and 3-7, respectively). The fact that we could detect expression of these intronic variants is probably due to intron retention¹²⁶ and alternative SERINC5 mRNA splicing. However, we could not find any splicing variants annotated in SERINC5 in the current genetics databases. To confirm the intron retention, it would be necessary to investigate the allelic ratios in the genomic DNA of the participants heterozygous for these SNPs.

This study has some limitations. First, our sample size for the RNA-Seq differential expression and ASE analyses may have restricted the power to identify additional signals and to validate some of the findings; however, we increased the power available from the number of samples tested by taking an extreme phenotype approach. Second, using whole blood samples for RNA-Seq data analysis may have limited the detection of some tissue-specific genes or regulatory mechanisms. However, it is challenging to select a single tissue in which to investigate gene expression as a marker of BP regulation, since response to anti-HTN drugs might arise from a variety of target tissues such as vasculature, heart, brain, or kidney. Not only are these tissues difficult to access in relatively healthy patients, as hypertensive patients are, but it is also not obvious which tissue should be used. Thus, we used whole blood as a surrogate for multiple tissues, recognizing the limitations of this approach with respect to tissue-specific expression.

In conclusion, this is the first report of whole transcriptome sequencing analysis

to identify genes potentially involved in the phenotype of antihypertensive drug

response. More specifically, we identified differences in CEBPD and TSC22D3

expression associated with BP response to HCTZ and chlorthalidone in 3 unique

cohorts. In addition, SERINC5 expression was associated with BP response to

chlorthalidone only. We also report unique genetic signals from this gene in association with this phenotype, along with a potential regulatory effect on SERINC5 expression. Additional experiments are needed to demonstrate the mechanisms by which CEBPD, TSC22D3 and SERINC5 may influence BP response to TD.


Table 3-1. Characteristics of PEAR and PEAR-2 participants classified as responders and non-responders for RNA-Seq differential expression and allele specific expression analyses

                     Whites (n=99)                                                   Blacks (n=50)
                     HCTZ                            Chlorthalidone                  Chlorthalidone
Characteristics      Responders      Non-responders  Responders      Non-responders  Responders      Non-responders
                     (n=24)          (n=25)          (n=25)          (n=25)          (n=25)          (n=25)
Age                  48±12           48±8            53±8            48±10           52±8            50±10
Female, n (%)        11 (44%)        10 (40%)        15 (75%)*       5 (25%)*        12 (48%)        12 (48%)
Baseline DBP         93±5            94±4            97±6*           93±5*           98±6*           93±4*
Baseline SBP         146±10          144±10          152±11*         144±9*          152±10*         146±10*
DBP response to TD   -9±6***         0.06±4***       -14±4***        -0.2±2***       -17±4***        -1.4±3***
SBP response to TD   -12±6***        -0.9±6***       -22±7***        -1.5±5***       -27±7***        -4.4±5***

Mean and standard deviation values are presented for the continuous variables. SBP: systolic blood pressure; DBP: diastolic blood pressure; TD: thiazide diuretics. * P < 0.05; *** P < 0.001


Table 3-2. Potassium and uric acid mean changes in participants classified as non-responders after treatment with HCTZ and chlorthalidone.

                         Whites                                                            Blacks
                         Non-responders to               Non-responders to                 Non-responders to
                         HCTZ (n=25)                     Chlorthalidone (n=25)             Chlorthalidone (n=25)
Parameters               Mean change ± s.d.   P value    Mean change ± s.d.   P value      Mean change ± s.d.   P value
Serum K+ (mEq/L)         -0.2±0.4             0.016      -0.6±0.4             2.0E-07      -0.45±0.6            0.001
Serum uric acid, mg/dl   0.9±1.0              9.6E-05    1.1±1.0              2.8E-05      1.1±1.4              5.6E-04

HCTZ, hydrochlorothiazide; K+, potassium. P values represent the comparison between baseline and the end of the monotherapy


Table 3-3. Summary of mapping statistics from PEAR and PEAR-2 RNA-Seq alignment with Tophat2

Mapping characteristics                 PEAR whites                          PEAR-2 whites                        PEAR-2 blacks
                                        mean (range)                         mean (range)                         mean (range)
Reads aligned                           27,032,558 (14,247,017-44,816,469)   33,709,514 (15,437,743-52,173,426)   33,737,127 (11,287,549-63,147,881)
Uniquely mapped (%)                     93.0 (88.4-94.9)                     93.5 (91.3-95.1)                     92.1 (78.6-95.2)
Remaining after duplicate removal (%)   47.3 (15.0-63.9)                     61.3 (29.0-80.5)                     58.7 (28.1-84.9)
Known junctions (%)                     86.8 (82.9-90.5)                     85.1 (79.9-89.9)                     84.8 (76.7-91.5)
Reads aligned to exonic regions (%)     65.5 (57.5-70.8)                     61.6 (55.7-68.5)                     60.5 (42.7-70.2)


Table 3-4. Genes differentially expressed in PEAR whites treated with HCTZ at Q-value < 0.05 and gene expression results in PEAR-2 whites or blacks treated with chlorthalidone

            PEAR HCTZ whites                PEAR-2 Chlorthalidone whites    PEAR-2 Chlorthalidone blacks
GENE        FOLD CHANGE  P         Q        FOLD CHANGE  P         Q        FOLD CHANGE  P         Q
TSEN34      1.5          5.0E-05   0.034    1.07         2.0E-01   0.259    1.35         3.0E-04   0.002
CEBPD       1.4          5.0E-05   0.034    1.25         2.4E-03   0.031    1.31         5.3E-04   0.002
TIGD3       1.4          5.0E-05   0.034    1.19         4.8E-02   0.171    1.37         1.8E-03   0.006
VNN1        1.7          5.0E-05   0.034    1.15         9.6E-02   0.208    1.29         1.3E-02   0.033
TSPO        1.4          5.0E-05   0.034    1.05         2.8E-01   0.327    1.20         2.0E-02   0.044
CDC42EP2    1.4          5.0E-05   0.034    0.98         4.2E-01   0.416    1.19         3.3E-02   0.062
RHOB        1.4          5.0E-05   0.034    1.12         8.2E-02   0.208    1.12         8.0E-02   0.130
TRGC1       0.6          5.0E-05   0.034    1.12         1.6E-01   0.238    1.09         2.2E-01   0.292
FCRL6       0.7          5.0E-05   0.034    1.21         5.3E-02   0.171    1.08         2.6E-01   0.306
CHI3L1      1.6          5.0E-05   0.034    0.91         1.6E-01   0.238    0.95         3.2E-01   0.351
IGHG1       0.6          5.0E-05   0.034    1.15         1.1E-01   0.213    1.00         5.0E-01   0.496

Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM). Highlighted genes passed the specified criteria for validation: consistent gene expression fold change and statistical significance (Q < 0.05). *One-sided p-value based on a one-sided hypothesis tested in the validation cohorts


Table 3-5. Genes differentially expressed in PEAR-2 whites treated with chlorthalidone at Q-value < 0.05 and gene expression results in PEAR whites or PEAR-2 blacks

                PEAR-2 WHITES                   PEAR WHITES                     PEAR-2 BLACKS
GENE            FOLD CHANGE  P         Q        FOLD CHANGE  P         Q        FOLD CHANGE  P         Q
TRIT1           1.71         5.0E-05   0.004    1.09         2.3E-01   0.494    0.37         2.5E-05   0.0001
SERINC5         0.05         5.0E-05   0.004    1.10         3.7E-01   0.494    0.11         2.5E-05   0.0001
TFCP2           0.37         5.0E-05   0.004    1.02         4.2E-01   0.494    0.15         2.5E-05   0.0001
PPP2R5C         0.50         5.0E-05   0.004    1.02         4.6E-01   0.494    0.33         2.5E-05   0.0001
METTL23         2.75         5.0E-05   0.004    1.01         4.8E-01   0.494    2.79         2.5E-05   0.0001
METTL6          10.71        5.0E-05   0.004    1.00         4.9E-01   0.494    0.45         2.5E-05   0.0001
TSC22D3         0.78         1.1E-03   0.049    0.77         1.8E-03   0.018    0.82         8.8E-03   0.0221
LTF             1.55         5.0E-05   0.004    0.88         1.2E-01   0.470    1.42         1.1E-02   0.0251
GPR56           0.76         1.1E-03   0.047    1.33         3.8E-03   0.025    0.87         5.3E-02   0.1067
IGHA2           1.47         3.5E-04   0.020    1.12         1.2E-01   0.470    0.88         9.5E-02   0.1458
BPI             1.70         1.5E-04   0.011    1.04         3.8E-01   0.494    1.25         8.1E-02   0.1458
PHACTR4         1.57         2.5E-04   0.016    1.00         4.9E-01   0.494    1.18         9.4E-02   0.1458
FGFBP2, PROM1   0.73         4.0E-04   0.023    1.33         1.8E-04   0.004    0.92         1.6E-01   0.2320
AP3S2           1.58         2.5E-04   0.016    0.99         4.9E-01   0.494    0.91         2.3E-01   0.3005
OCIAD2          0.08         5.0E-05   0.004    1.02         4.6E-01   0.494    0.89         2.4E-01   0.3008
LRBA            0.48         5.0E-05   0.004    1.12         2.6E-01   0.494    0.94         3.8E-01   0.4413
SLC37A3         0.37         5.0E-05   0.004    0.90         2.2E-01   0.494    1.03         4.2E-01   0.4655
PHKB            3.05         5.0E-05   0.004    1.02         4.6E-01   0.494    0.99         4.7E-01   0.4743

Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM). Highlighted genes passed the specified criteria for validation: consistent gene expression fold change and statistical significance (Q < 0.05). *One-sided p-value based on a one-sided hypothesis tested in the validation cohorts


Table 3-6. Genes differentially expressed between responders and non-responders to HCTZ and chlorthalidone in all 3 cohorts, with consistent direction and transcriptome-wide statistical significance when meta-analyzed

            HCTZ whites             Chlorthalidone whites   Chlorthalidone blacks   Meta-analysis
Genes       Fold Change  P-value    Fold Change  P-value    Fold Change  P-value    P-value
CEBPD       1.4          5.0E-05    1.2          2.4E-03    1.3          5.3E-04    1.8E-11
TSC22D3     0.8          1.8E-03    0.8          4.87E-02   0.8          8.8E-03    1.9E-09

Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM).


Table 3-7. Genes differentially expressed between responders and non-responders to chlorthalidone in PEAR-2 whites and blacks, with consistent direction and transcriptome-wide statistical significance when meta-analyzed

                  Chlorthalidone whites   Chlorthalidone blacks   HCTZ whites             Meta-analysis
Genes             Fold Change  P-value    Fold Change  P-value    Fold Change  P-value    P-value
SERINC5           0.05         5.0E-05    0.11         2.5E-05    1.10         3.7E-01    1.2E-11
TFCP2             0.37         5.0E-05    0.15         2.5E-05    1.02         4.2E-01    1.5E-11
PPP2R5C           0.50         5.0E-05    0.33         2.5E-05    1.02         4.6E-01    1.6E-11
METTL23, MFSD11   2.75         5.0E-05    2.79         2.5E-05    1.01         4.8E-01    1.8E-11

Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM).


Table 3-8. Differences in baseline expression levels for CEBPD and TSC22D3 between thiazide diuretics responders and non-responders in PEAR and PEAR-2 with adjustment for age, gender and baseline blood pressure

            HCTZ Whites             Chlorthalidone Whites   Chlorthalidone Blacks
Genes       Fold Change  P value    Fold Change  P value*   Fold Change  P value*
CEBPD       1.45         0.0337     1.25         0.02       1.21         0.05
SERINC5     1.00         0.9772     0.94         0.17       0.98         0.43
TSC22D3     1.32         0.1248     1.14         0.06       1.12         0.10

Generalized linear model implemented in edgeR²¹. Fold change corresponds to gene expression levels in responders divided by levels in non-responders, in fragments per kilobase per million reads (FPKM). *One-sided p-value based on a one-sided hypothesis tested in the validation cohorts


Table 3-9. SNPs in SERINC5 gene region with allele specific expression (ASE) ≥1.3-fold and significant eQTLs association from Blood eQTL browser²²

              Samples   Samples with ASE   No      Samples with ASE   Blood eQTL browser²²   Functional
SNP           tested    ≤ -log2 0.3        ASE     ≥ log2 0.3         Z-score    P-value     Annotation
rs10072008    17        0                  10      7                  -7.35      2.0E-13     3' UTR
rs7707754     13        0                  7       6                  -5.74      9.4E-09     3' UTR
rs55740328    22        2                  18      2                  -          -           3' UTR
rs55777108    49        19                 13      6                  -          -           3' UTR
rs4704617     9         7                  5       2                  -          -           3' UTR
rs4704618     10        3                  5       2                  -          -           3' UTR
rs4704619     8         4                  2       2                  -          -           3' UTR
rs10053887    38        13                 9       4                  -          -           3' UTR
rs12521674    7         1                  1       5                  -          -           3' UTR
rs35085860    24        5                  8       11                 -          -           3' UTR
rs4703803     10        1                  6       3                  -          -           3' UTR
rs75946551    8         1                  3       4                  -          -           Intronic
rs1132801     13        2                  1       9                  -          -           Intronic
rs78174795    5         0                  1       4                  -          -           Intronic
rs11951568    5         0                  1       4                  -          -           Intronic

SNPs with a hyphen sign in the Blood eQTL browser columns have no data. SNPs highlighted in gray show statistical significance (Pbinomial < 0.05) for ASE analysis.
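The per-SNP sample counts in Table 3-9 follow from comparing each heterozygote's log2 allelic ratio with the ±0.3 threshold. The Python sketch below shows one way such a tally could be produced; the function names, category labels, and example ratios are hypothetical and are used only to illustrate the thresholding, not to reproduce the study's data.

```python
def classify_ase(log2_ratio, threshold=0.3):
    """Label one heterozygote by the direction of its allelic imbalance."""
    if log2_ratio >= threshold:
        return "positive"  # ASE at or above +threshold on the log2 scale
    if log2_ratio <= -threshold:
        return "negative"  # ASE at or below -threshold on the log2 scale
    return "none"          # within the threshold band: no ASE called


def summarize_ase(log2_ratios, threshold=0.3):
    """Count heterozygotes in each ASE category for one SNP."""
    counts = {"negative": 0, "none": 0, "positive": 0}
    for ratio in log2_ratios:
        counts[classify_ase(ratio, threshold)] += 1
    return counts


# Hypothetical log2 ratios for five heterozygotes at a single SNP.
example_ratios = [1.1, 0.9, 0.1, 1.3, 1.2]
print(summarize_ase(example_ratios))
```

With these made-up ratios, four samples exceed the positive threshold and one shows no ASE, which mirrors the shape of a row such as 0 / 1 / 4 in the table.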


Figure 3-1. Volcano plots comparing gene expression between responders and non-responders to HCTZ in PEAR whites (A) and chlorthalidone in PEAR-2 whites (B). Plot of log-fold changes versus log-p-values of probability of differential expression. Each gene is represented on the plot as a single dot. The red dots represent genes that passed the statistical threshold of Q value < 0.05.
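For readers less familiar with this type of plot, the Python sketch below shows how a gene's coordinates on a volcano plot can be computed from group expression levels and a P value. It assumes that the group-level expression is the mean FPKM, that the x-axis uses log2 fold change, and that the y-axis uses -log10 of the P value; these are common conventions rather than details confirmed by the caption, and all names and numbers are hypothetical.

```python
import math
from statistics import mean


def fold_change(responder_fpkm, non_responder_fpkm):
    """Mean FPKM in responders divided by mean FPKM in non-responders."""
    return mean(responder_fpkm) / mean(non_responder_fpkm)


def volcano_point(fc, p_value):
    """Return (x, y) = (log2 fold change, -log10 P value) for one gene."""
    return math.log2(fc), -math.log10(p_value)


# Hypothetical FPKM values for one gene in the two response groups.
responders = [12.0, 15.0, 9.0]
non_responders = [6.0, 7.0, 5.0]
fc = fold_change(responders, non_responders)
x, y = volcano_point(fc, 5.0e-5)
print(f"fold change = {fc:.2f}, x = {x:.2f}, y = {y:.2f}")
```

Genes far from zero on the x-axis and high on the y-axis combine a large expression difference with a small P value.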


Figure 3-2. Plots showing CEBPD and TSC22D3 baseline expression levels in thiazide responders compared to non-responders in the PEAR and PEAR-2 RNA-Seq analyses. A) CEBPD in PEAR (whites). B) CEBPD in PEAR-2 whites. C) CEBPD in PEAR-2 blacks. D) TSC22D3 in PEAR. E) TSC22D3 in PEAR-2 whites. F) TSC22D3 in PEAR-2 blacks. Abundance comparisons between thiazide diuretic responders and non-responders were carried out using Cufflinks v2.2.1. Error bars indicate standard error of the mean. HCTZ: hydrochlorothiazide, FPKM: fragments per kilobase per million reads.


Figure 3-3. The effect of SERINC5 rs10042497 polymorphism on the blood pressure response of whites treated with chlorthalidone in PEAR-2. Blood pressure responses were adjusted for baseline blood pressure, age, sex, and population substructure. P-values represent the contrast of adjusted means between different genotype groups in the PEAR-2 white participants. Error bars represent standard error of the mean. A) Systolic blood pressure response to chlorthalidone in PEAR-2 whites. B) Diastolic blood pressure response to chlorthalidone in PEAR-2 whites.


Figure 3-4. Allele-specific expression ratios (major allele over minor allele) in SERINC5 rs10072008, located at 3’ untranslated region (3’ UTR). Each bar represents the magnitude and direction of allele specific expression (ASE) for one heterozygous individual indicated on a log2 scale. The horizontal dashed lines at log2 expression 0.3 and -0.3 represent the pre-established threshold for ASE.


Figure 3-5. Allele-specific expression ratios (major allele over minor allele) in SERINC5 rs7707754. Each bar represents the magnitude and direction of allele specific expression (ASE) for one heterozygous individual indicated on a log2 scale. The horizontal dashed lines at log2 expression 0.3 and -0.3 represent the pre-established threshold for ASE.


Figure 3-6. Allele-specific expression ratios (major allele over minor allele) in SERINC5 rs78174795, located at the intronic region. Each bar represents the magnitude and direction of allele specific expression (ASE) for one heterozygous individual indicated on a log2 scale. The horizontal dashed lines at log2 expression 0.3 and -0.3 represent the pre-established threshold for ASE.


Figure 3-7. Allele-specific expression ratios (major allele over minor allele) in SERINC5 rs11951568. Each bar represents the magnitude and direction of allele specific expression (ASE) for one heterozygous individual indicated on a log2 scale. The horizontal dashed lines at log2 expression 0.3 and -0.3 represent the pre-established threshold for ASE.


CHAPTER 4 SUMMARY AND CONCLUSION

Hypertension (HTN) is the most significant risk factor for cardiovascular and kidney disease, affecting about 1 billion individuals worldwide. Despite the many options for antihypertensive therapy, only ~50% of patients treated for HTN achieve blood pressure control. HTN pharmacogenomics holds the potential to guide selection of HTN treatment based on molecular markers of drug response, while also providing potential insight into mechanisms underlying the antihypertensive effects of drugs. As discussed in Chapter 1, there are currently multiple compelling genetic signals associated with antihypertensive drug response, identified through Genome-wide Association Studies (GWAS), though they may not yet collectively explain enough response variability to be predictive. Chapter 1 focused on the additional insights that might be gained through research of the transcriptome, the complete set of transcripts (RNA), to expand the knowledge on the influence of gene expression regulation mechanisms on variability in drug response. Although the use of transcriptomics data in HTN pharmacogenomics is currently scarce, Next Generation Sequencing technologies provide high-resolution data that allow accurate transcript quantification for differential expression between biological conditions, identification of splicing events and the assessment of regulatory mechanisms of gene expression control. These are important processes for generating RNA transcript diversity, which may in turn impact protein/metabolite abundance with proven consequences for drug disposition, mechanism of action and clinical outcomes. RNA-Sequencing, a revolutionary tool that allows whole transcriptome analysis with accuracy and high data resolution, has been successfully applied to study multiple disease phenotypes. Thus, the overall goal of this research project was to use innovative transcriptome sequencing tools, i.e. RNA-Seq, to identify novel molecular markers that can contribute to optimizing thiazide diuretic treatment selection.

In Chapter 2, we tested the hypothesis that some of the genes previously associated with BP/HTN might also be associated with BP response to antihypertensive treatment with thiazide diuretics. We assessed 34 such genes for differential expression in relation to BP response to thiazide diuretics using RNA sequencing of whole blood samples from 150 hypertensive participants from the Pharmacogenomic Evaluation of Antihypertensive Responses (PEAR) and PEAR-2 studies. From PEAR, 50 white participants were selected from the upper and lower quartiles of BP response (25 responders and 25 non-responders) to hydrochlorothiazide (HCTZ). Likewise, in PEAR-2, white (n=50) and black participants (n=50) were classified as responders and non-responders to chlorthalidone. FOS, DUSP1 and PPP1R15A were differentially expressed across all cohorts (meta-analysis p-value < 2.0 × 10⁻⁶), and responders to HCTZ or chlorthalidone presented up-regulated transcripts. Of these genes, only FOS was previously documented in functional studies to have some relationship with BP regulatory mechanisms, through the neuronal activation of vasomotor areas in animal models⁹⁵,⁹⁶. The other two genes, DUSP1 and PPP1R15A, are involved in pathways regulating vascular smooth muscle contraction or relaxation. For instance, DUSP1, dual specificity phosphatase 1, has been known to attenuate the effects of angiotensin II-mediated vasoconstriction through inhibition of ERK1/2⁹⁷,⁹⁸. PPP1R15A is a regulatory subunit that inhibits protein phosphatase 1 (PP1), which may lead to a contractile response in vascular smooth muscle cells⁹⁹⁻¹⁰¹. Collectively, the results from Chapter 2 point to novel pathways for the long-term BP-lowering effects of thiazide diuretics.
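The extreme-phenotype sampling summarized in this paragraph can be sketched in Python as follows. The sketch assumes that participants are ranked by their BP change on treatment (more negative meaning a larger BP reduction) and that those at or beyond the lower and upper quartile cut-points are labeled responders and non-responders, respectively. The function name, the participant identifiers, the BP values, and the exact quartile definition are hypothetical and are not taken from the PEAR protocols.

```python
from statistics import quantiles


def select_extremes(bp_response):
    """Split participants into responders and non-responders by quartile.

    bp_response maps a participant ID to the BP change on treatment in mmHg,
    where a more negative value means a larger BP reduction.
    """
    q1, _median, q3 = quantiles(bp_response.values(), n=4)
    responders = sorted(pid for pid, v in bp_response.items() if v <= q1)
    non_responders = sorted(pid for pid, v in bp_response.items() if v >= q3)
    return responders, non_responders


# Hypothetical BP changes (mmHg) for eight participants.
example = {
    "P1": -20, "P2": -16, "P3": -12, "P4": -10,
    "P5": -6, "P6": -4, "P7": -1, "P8": 2,
}
print(select_extremes(example))
```

Only the two tails are carried forward for sequencing under this scheme, which concentrates the contrast between groups while keeping the number of sequenced samples small.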

Of note, rs11065987 in chromosome 12, a trans-eQTL for expression of FOS,

PPP1R15A and other genes, is also associated with BP response to HCTZ in PEAR

whites. Additionally, allele specific expression (ASE) analysis, with STAR and GATK

tools for SNP calling, revealed a modest imbalance in PPP1R15A rs557806 indicating

the presence of cis-acting regulatory variants. These findings document the potential

value of transcriptomics data to identify biomarkers of drug response and suggest FOS,

DUSP1 and PPP1R15A as potential molecular determinants of antihypertensive

response to thiazide diuretics.

In Chapter 3, we assessed global expression levels in whole blood samples from

150 participants using RNA-Seq data in order to identify novel molecular markers

associated with BP response to thiazide diuretics. In addition to differential expression

data analyses, we used a well-validated pipeline, following GATK best practices

for SNP calling with RNA-Seq data, to investigate genetic variants potentially regulating

gene expression. We identified 29 genes that were differentially expressed in relation to

HCTZ or chlorthalidone BP response in whites. For each gene differentially expressed,

we attempted replication in the alternate white group and PEAR-2 blacks. CEBPD and

TSC22D3 were differentially expressed in all 3 cohorts. SERINC5 was differentially

expressed in PEAR-2 whites and blacks treated with chlorthalidone but did not pass our

validation criteria in PEAR whites treated with HCTZ. CEBPD is a transcription factor that was previously associated with differential transcriptional regulation of PDGF-αR expression in hypertensive rats compared with normotensive rats, and is involved in the mechanisms of vascular smooth muscle proliferation and migration⁸. In addition,

CEBPD showed regulatory effects on the expression of the angiotensinogen gene

(AGT), whose product is the precursor of the angiotensin hormone that causes

vasoconstriction¹¹⁵. TSC22D3 is known in the literature to regulate sodium retention in the kidneys through deactivation of the apical epithelial sodium channel (ENaC)¹²⁰,¹²¹. Although SERINC5 was the top gene associated with BP response to TD

in the meta-analysis, we could not find information in the literature that could link

SERINC5 to pathways or mechanisms related to BP regulation.

Additionally, we detected genetic variants in SERINC5 associated with SBP and

DBP response to chlorthalidone in PEAR-2 whites and striking evidence of allelic imbalance in SERINC5 expression in intronic and 3’UTR positions. This suggests a cis-

acting regulatory effect in SERINC5 expression through potential alternative splicing

and alternative processing mechanisms. To our knowledge, this is the first report of the

use of transcriptome sequencing data to identify molecular markers of antihypertensive

drug response. These findings suggest the potential for CEBPD and TSC22D3 as

determinants of BP response to thiazide diuretics, and SERINC5 as a determinant of BP response to chlorthalidone.

In summary, this project applied RNA-Sequencing technology with the aim of identifying biomarkers associated with thiazide diuretics BP response. The results revealed novel genes/transcripts differentially expressed between responders and non-responders to thiazides: FOS, DUSP1, PPP1R15A, SERINC5, CEBPD and TSC22D3. Even though the strategies used to identify these genes differed, i.e. starting from a select list of genes associated with BP versus comparing gene expression differences at the whole transcriptome level, these results have in common the fact that they are supported by multiple levels of replication, which is one of the strengths of this study and supports the validity and potential utility of these findings for guiding antihypertensive treatment selection.

The eQTLs and cis-acting regulatory variants identified in this study also shed light on regions of DNA relevant for the regulatory activity of the differentially expressed genes. These results provide insight into how these loci may influence our phenotype of interest. Studying the downstream effects of the eQTLs and SNPs with ASE identified here, and the molecular architecture of gene expression variation, can help to further understand the regulatory mechanisms underlying the observed differences in gene expression. For example, the trans-eQTLs identified on chromosome 12 influencing FOS and PPP1R15A expression (Chapter 2) may reflect looping of chromatin, resulting in dynamic interactions between these genetic loci, or potential epistatic effects (synergistic interaction) within transcriptional networks.

Since there were also cis-acting regulatory variants of weak to moderate ASE effect only in FOS and PPP1R15A (not in DUSP1) and these two genes showed high gene expression correlation, these results raise the possibility of cis-trans SNP interactions, which reinforces the hypothesis of a potential epistatic effect. In addition, the cis-acting variants identified in the SERINC5 3’ untranslated and intronic regions (Chapter 3) suggest that post-transcriptional modification and alternative splicing mechanisms may play a role in the differences in gene expression observed in this study. Also, classical epigenetic marks such as DNA methylation, chromatin state or accessibility can be modulated directly or indirectly by these variants. This study was successful in mapping and quantifying relevant gene expression regulatory activity for the genes associated with BP response to thiazide diuretics. A variety of other tools can be applied to characterize and experimentally validate the mechanisms by which these variants are involved, including analysis of protein-DNA interactions and reporter gene expression. New genome editing technologies, such as the RNA-guided clustered regularly interspaced short palindromic repeats (CRISPR)-Cas nuclease system, provide an amenable approach for investigating genetic variants and regulatory elements of the genome in the context of the inherent genetic makeup.

Moreover, the findings from this study shed light on novel pathways and molecular markers associated with thiazide BP response, and suggest that thiazide BP-lowering mechanisms might be mediated by effects on pathways likely involved in the regulation of vascular smooth muscle function (DUSP1 and PPP1R15A), vasomotor function in the brain (FOS), general cell proliferation mechanisms with implications for smooth muscle activity (DUSP1 and CEBPD), sodium retention mechanisms in the kidney (TSC22D3) and the long-known renin-angiotensin system of BP regulation (CEBPD). These findings suggest that thiazide diuretics may exert their BP-lowering effect through multiple tissues and complex mechanisms. Therefore, we hypothesize that developing a strategy to optimize antihypertensive treatment selection will require algorithms that take into consideration the involvement of several genes and molecular markers with documented association with thiazide BP response. The biomarkers revealed in this study should be considered in future models and algorithms whose main goal is to optimize the use of thiazide diuretics in the treatment of HTN. In addition, functional studies will be valuable to close gaps in the literature regarding the role of these genes in BP regulation. For example, it would be useful to investigate the interaction of PPP1R15A with PP1 in vascular smooth muscle in the context of thiazide diuretic treatment. Also, experimental manipulations in model systems are needed to progressively implicate these genes and other related genes as relevant mediators of these pathways/mechanisms, which might identify additional novel antihypertensive drug targets.

In conclusion, the main findings of this research project revealed the strengths of studying the human transcriptome for identification of novel molecular markers associated with thiazide diuretic BP response. More pervasive implementation of transcriptome sequencing strategies in the field of HTN pharmacogenomics holds the potential to reveal novel avenues in antihypertensive treatment selection and may also expand the current knowledge of BP-lowering mechanisms.


APPENDIX SUPPLEMENTARY INFORMATION FOR CHAPTER 3

Figure A-1. TSC22D3 expression (Fragments per Kilobase of Exon Mapped, FPKM) by gender in A) PEAR, B) PEAR-2 white and C) PEAR-2 black participants. P-values are from t-tests comparing mean expression between genders. The bars correspond to the mean and 95% confidence interval.


LIST OF REFERENCES

1. Kearney PM, Whelton M, Reynolds K, Muntner P, Whelton PK and He J. Global burden of hypertension: analysis of worldwide data. Lancet. 2005;365:217-23.

2. Oparil S and Schmieder RE. New approaches in the treatment of hypertension. Circ Res. 2015;116:1074-95.

3. Mozaffarian D, Benjamin EJ, Go AS, et al. Executive Summary: Heart Disease and Stroke Statistics-2016 Update A Report From the American Heart Association. Circulation. 2016;133:447-454.

4. Sundström J, Arima H, Woodward M, et al. Blood pressure-lowering treatment based on cardiovascular risk: a meta-analysis of individual patient data. Lancet. 2014;384:591-8.

5. Neal B, MacMahon S, Chapman N; Blood Pressure Lowering Treatment Trialists' Collaboration. Effects of ACE inhibitors, calcium antagonists, and other blood-pressure-lowering drugs: results of prospectively designed overviews of randomised trials. Lancet. 2000;356:1955-64.

6. Materson BJ, Reda DJ, Cushman WC, Massie BM, Freis ED, Kochar MS, Hamburger RJ, Fye C, Lakshman R, Gottdiener J, et al. Single-drug therapy for hypertension in men. A comparison of six antihypertensive agents with placebo. The Department of Veterans Affairs Cooperative Study Group on Antihypertensive Agents. N Engl J Med. 1993;328:914-21.

7. Egan BM, Zhao Y and Axon RN. US trends in prevalence, awareness, treatment, and control of hypertension, 1988-2008. JAMA. 2010;303:2043-50.

8. Bochud M, Bovet P, Elston RC, Paccaud F, Falconnet C, Maillard M, Shamlaye C and Burnier M. High heritability of ambulatory blood pressure in families of East African descent. Hypertension. 2005;45:445-50.

9. Alwan H, Ehret G, Ponte B, et al. Heritability of ambulatory and office blood pressure in the Swiss population. J Hypertens. 2015;33:2061-7.

10. Wang X, Ding X, Su S, Harshfield G, Treiber F and Snieder H. Genetic influence on blood pressure measured in the office, under laboratory stress and during real life. Hypertens Res. 2011;34:239-44.

11. Munroe PB, Barnes MR and Caulfield MJ. Advances in blood pressure genomics. Circ Res. 2013;112:1365-79.

12. Turner ST, Boerwinkle E, O'Connell JR, et al. Genomic association analysis of common variants influencing antihypertensive response to hydrochlorothiazide. Hypertension. 2013;62:391-7.

13. McDonough CW, Burbage SE, Duarte JD, Gong Y, Langaee TY, Turner ST, Gums JG, Chapman AB, Bailey KR, Beitelshees AL, Boerwinkle E, Pepine CJ, Cooper-DeHoff RM and Johnson JA. Association of variants in NEDD4L with blood pressure response and adverse cardiovascular outcomes in hypertensive patients treated with thiazide diuretics. J Hypertens. 2013;31:698-704.

14. Hiltunen TP, Donner KM, Sarin AP, et al. Pharmacogenomics of hypertension: a genome‐wide, placebo‐controlled cross‐over study, using four classes of antihypertensive drugs. J Am Heart Assoc. 2015;4:e001521.

15. Cooper GM, Johnson JA, Langaee TY, Feng H, Stanaway IB, Schwarz UI, Ritchie MD, Stein CM, Roden DM, Smith JD, Veenstra DL, Rettie AE and Rieder MJ. A genome-wide scan for common genetic variants with a large influence on warfarin maintenance dose. Blood. 2008;112:1022-7.

16. Perera MA, Cavallari LH, Limdi NA, et al. Genetic variants associated with warfarin dose in African-American individuals: a genome-wide association study. Lancet. 2013;382:790-6.

17. Shuldiner AR, O'Connell JR, Bliden KP, et al. Association of cytochrome P450 2C19 genotype with the antiplatelet effect and clinical efficacy of clopidogrel therapy. JAMA. 2009;302:849-57.

18. Park HW, Dahlin A, Tse S, Duan QL, Schuemann B, Martinez FD, Peters SP, Szefler SJ, Lima JJ, Kubo M, Tamari M and Tantisira KG. Genetic predictors associated with improvement of asthma symptoms in response to inhaled corticosteroids. J Allergy Clin Immunol. 2014;133:664-9.e5.

19. Barber MJ, Mangravite LM, Hyde CL, et al. Genome-wide association of lipid-lowering response to statins in combined study populations. PLoS One. 2010;5:e9763.

20. Wang Z, Gerstein M and Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10:57-63.

21. Ellsworth DL, Croft DT, Weyandt J, Sturtz LA, Blackburn HL, Burke A, Haberkorn MJ, McDyer FA, Jellema GL, van Laar R, Mamula KA, Chen Y and Vernalis MN. Intensive cardiovascular risk reduction induces sustainable changes in expression of genes and pathways important to vascular function. Circ Cardiovasc Genet. 2014;7:151-60.

22. Sitras V, Fenton C and Acharya G. Gene expression profile in cardiovascular disease and preeclampsia: a meta-analysis of the transcriptome based on raw data from human studies deposited in Gene Expression Omnibus. Placenta. 2015;36:170-8.

23. Marshall A, Lukk M, Kutter C, Davies S, Alexander G and Odom DT. Global gene expression profiling reveals SPINK1 as a potential hepatocellular carcinoma marker. PLoS One. 2013;8:e59459.

24. Barrie ES, Smith RM, Sanford JC and Sadee W. mRNA transcript diversity creates new opportunities for pharmacological intervention. Mol Pharmacol. 2012;81:620-30.

25. Johnson JA. Advancing management of hypertension through pharmacogenomics. Ann Med. 2012;44 Suppl 1:S17-22.

26. Cooper-DeHoff RM and Johnson JA. Hypertension pharmacogenomics: in search of personalized treatment approaches. Nat Rev Nephrol. 2016;12:110-22.

27. Gong Y, McDonough CW, Padmanabhan S and Johnson JA. Hypertension Pharmacogenomics. 2014:747-778.

28. Scott SA, Sangkuhl K, Stein CM, Hulot JS, Mega JL, Roden DM, Klein TE, Sabatine MS, Johnson JA, Shuldiner AR and Consortium CPI. Clinical Pharmacogenetics Implementation Consortium guidelines for CYP2C19 genotype and clopidogrel therapy: 2013 update. Clin Pharmacol Ther. 2013;94:317-23.

29. Johnson JA, Gong L, Whirl-Carrillo M, Gage BF, Scott SA, Stein CM, Anderson JL, Kimmel SE, Lee MT, Pirmohamed M, Wadelius M, Klein TE, Altman RB and Consortium CPI. Clinical Pharmacogenetics Implementation Consortium Guidelines for CYP2C9 and VKORC1 genotypes and warfarin dosing. Clin Pharmacol Ther. 2011;90:625-9.

30. Ramsey LB, Johnson SG, Caudle KE, et al. The clinical pharmacogenetics implementation consortium guideline for SLCO1B1 and simvastatin-induced myopathy: 2014 update. Clin Pharmacol Ther. 2014;96:423-8.

31. Turner ST, Bailey KR, Fridley BL, Chapman AB, Schwartz GL, Chai HS, Sicotte H, Kocher JP, Rodin AS and Boerwinkle E. Genomic association analysis suggests chromosome 12 locus influencing antihypertensive response to thiazide diuretic. Hypertension. 2008;52:359-65.

32. Duarte JD, Turner ST, Tran B, Chapman AB, Bailey KR, Gong Y, Gums JG, Langaee TY, Beitelshees AL, Cooper-Dehoff RM, Boerwinkle E and Johnson JA. Association of chromosome 12 locus with antihypertensive response to hydrochlorothiazide may involve differential YEATS4 expression. Pharmacogenomics J. 2013;13:257-63.

33. Gong Y, Wang Z, Beitelshees AL, et al. Pharmacogenomic Genome-Wide Meta-Analysis of Blood Pressure Response to β-Blockers in Hypertensive African Americans. Hypertension. 2016;67:556-63.

34. Shendure J. The beginning of the end for microarrays? Nat Methods. 2008;5:585-7.

35. Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, Gerstein M and Snyder M. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008;320:1344-9.

36. Han Y, Gao S, Muegge K, Zhang W and Zhou B. Advanced Applications of RNA Sequencing and Challenges. Bioinform Biol Insights. 2015;9:29-46.

37. Conesa A, Madrigal P, Tarazona S, Gomez-Cabrero D, Cervera A, McPherson A, Szczesniak MW, Gaffney DJ, Elo LL, Zhang X and Mortazavi A. A survey of best practices for RNA-seq data analysis. Genome Biol. 2016;17:13.

38. Sims D, Sudbery I, Ilott NE, Heger A and Ponting CP. Sequencing depth and coverage: key considerations in genomic analyses. Nat Rev Genet. 2014;15:121-32.

39. The ENCODE Consortium. Standards, Guidelines and Best Practices for RNA-Seq. 2011;2016.

40. Martin JA and Wang Z. Next-generation transcriptome assembly. Nat Rev Genet. 2011;12:671-82.

41. Engstrom PG, Steijger T, Sipos B, Grant GR, Kahles A, Ratsch G, Goldman N, Hubbard TJ, Harrow J, Guigo R, Bertone P and Consortium R. Systematic evaluation of spliced alignment programs for RNA-seq data. Nat Methods. 2013;10:1185-91.

42. Costa V, Aprile M, Esposito R and Ciccodicola A. RNA-Seq and human complex diseases: recent accomplishments and future perspectives. Eur J Hum Genet. 2013;21:134-42.

43. Caceres JF and Kornblihtt AR. Alternative splicing: multiple control mechanisms and involvement in human disease. Trends Genet. 2002;18:186-93.

44. Wang GS and Cooper TA. Splicing in disease: disruption of the splicing code and the decoding machinery. Nat Rev Genet. 2007;8:749-61.

45. Pan Q, Shai O, Lee LJ, Frey BJ and Blencowe BJ. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat Genet. 2008;40:1413-5.

46. Tress ML, Martelli PL, Frankish A, et al. The implications of alternative splicing in the ENCODE protein complement. Proc Natl Acad Sci U S A. 2007;104:5495-500.

47. Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS and Manolio TA. Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci U S A. 2009;106:9362-7.

48. Cheung VG, Spielman RS, Ewens KG, Weber TM, Morley M and Burdick JT. Mapping determinants of human gene expression by regional and genome-wide association. Nature. 2005;437:1365-9.

49. Stranger BE, Forrest MS, Dunning M, et al. Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science. 2007;315:848-53.

50. Rockman MV and Kruglyak L. Genetics of global gene expression. Nat Rev Genet. 2006;7:862-72.

51. Majewski J and Pastinen T. The study of eQTL variations by RNA-seq: from SNPs to phenotypes. Trends Genet. 2011;27:72-9.

52. Pastinen T. Genome-wide allele-specific analysis: insights into regulatory variation. Nat Rev Genet. 2010;11:533-8.

53. Chuang LC, Kao CF, Shih WL and Kuo PH. Pathway analysis using information from allele-specific gene methylation in genome-wide association studies for bipolar disorder. PloS one. 2013;8:e53092.

54. Johnson AD, Zhang Y, Papp AC, Pinsonneault JK, Lim JE, Saffen D, Dai Z, Wang D and Sadee W. Polymorphisms affecting gene transcription and mRNA processing in pharmacogenetic candidate genes: detection through allelic expression imbalance in human target tissues. Pharmacogenet Genomics. 2008;18:781-91.

55. Bryois J, Buil A, Evans DM, Kemp JP, Montgomery SB, Conrad DF, Ho KM, Ring S, Hurles M, Deloukas P, Davey Smith G and Dermitzakis ET. Cis and trans effects of human genomic variants on gene expression. PLoS Genet. 2014;10:e1004461.

56. Albert FW and Kruglyak L. The role of regulatory variation in complex traits and disease. Nat Rev Genet. 2015;16:197-212.

57. Petretto E, Mangion J, Dickens NJ, Cook SA, Kumaran MK, Lu H, Fischer J, Maatz H, Kren V, Pravenec M, Hubner N and Aitman TJ. Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet. 2006;2:e172.

58. Zhu J, Zhang B, Smith EN, Drees B, Brem RB, Kruglyak L, Bumgarner RE and Schadt EE. Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks. Nat Genet. 2008;40:854-61.

59. Brem RB, Storey JD, Whittle J and Kruglyak L. Genetic interactions between polymorphisms that affect gene expression in yeast. Nature. 2005;436:701-3.

60. Orozco LD, Bennett BJ, Farber CR, et al. Unraveling inflammatory responses using systems genetics and gene-environment interactions in macrophages. Cell. 2012;151:658-70.

61. Ghazalpour A, Bennett B, Petyuk VA, et al. Comparative analysis of proteome and transcriptome variation in mouse. PLoS Genet. 2011;7:e1001393.

62. Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS and Cheung VG. Genetic analysis of genome-wide variation in human gene expression. Nature. 2004;430:743-7.

63. Huan T, Esko T, Peters MJ, et al. A meta-analysis of gene expression signatures of blood pressure and hypertension. PLoS Genet. 2015;11:e1005035.

64. Harismendy O, Notani D, Song X, Rahim NG, Tanasa B, Heintzman N, Ren B, Fu XD, Topol EJ, Rosenfeld MG and Frazer KA. 9p21 DNA variants associated with coronary artery disease impair interferon-gamma signalling response. Nature. 2011;470:264-8.

65. Small KS, Hedman AK, Grundberg E, et al. Identification of an imprinted master trans regulator at the KLF14 locus related to multiple metabolic phenotypes. Nat Genet. 2011;43:561-4.

66. Musunuru K, Strong A, Frank-Kamenetsky M, et al. From noncoding variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature. 2010;466:714-9.

67. Kapoor A, Sekar RB, Hansen NF, et al. An enhancer polymorphism at the cardiomyocyte intercalated disc protein NOS1AP locus is a major regulator of the QT interval. Am J Hum Genet. 2014;94:854-69.

68. Frayling TM, Timpson NJ, Weedon MN, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007;316:889-94.

69. Claussnitzer M, Dankel SN, Kim KH, et al. FTO Obesity Variant Circuitry and Adipocyte Browning in Humans. N Engl J Med. 2015;373:895-907.

70. Glastonbury CA, Vinuela A, Buil A, Halldorsson GH, Thorleifsson G, Helgason H, Thorsteinsdottir U, Stefansson K, Dermitzakis ET, Spector TD and Small KS. Adiposity-Dependent Regulatory Effects on Multi-tissue Transcriptomes. Am J Hum Genet. 2016;99:567-79.

71. di Salvo TG, Yang KC, Brittain E, Absi T, Maltais S and Hemnes A. Right ventricular myocardial biomarkers in human heart failure. J Card Fail. 2015;21:398-411.

72. Di Salvo TG, Guo Y, Su YR, Clark T, Brittain E, Absi T, Maltais S and Hemnes A. Right ventricular long noncoding RNA expression in human heart failure. Pulm Circ. 2015;5:135-61.

73. Liu Y, Morley M, Brandimarto J, Hannenhalli S, Hu Y, Ashley EA, Tang WH, Moravec CS, Margulies KB, Cappola TP, Li M and consortium MA. RNA-Seq identifies novel myocardial gene expression signatures of heart failure. Genomics. 2015;105:83-9.

74. International Consortium for Blood Pressure Genome-Wide Association S, Ehret GB, Munroe PB, et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011;478:103-9.

75. Cowley AW, Jr., Moreno C, Jacob HJ, et al. Characterization of biological pathways associated with a 1.37 Mbp genomic region protective of hypertension in Dahl S rats. Physiol Genomics. 2014;46:398-410.

76. Tain YL, Huang LT, Chan JY and Lee CT. Transcriptome analysis in rat kidneys: importance of genes involved in programmed hypertension. Int J Mol Sci. 2015;16:4744-58.

77. Materson BJ. Variability in response to antihypertensive drugs. American Journal of Medicine. 2007;120:10-20.

78. James PA. 2014 Evidence-based Guideline for the Management of High Blood Pressure in Adults: Report From the Panel Members Appointed to the Eighth Joint National Committee (JNC 8) (vol 311, pg 507, 2014). Jama-J Am Med Assoc. 2014;311:1809-1809.

79. Chepelev I, Wei G, Tang QS and Zhao KJ. Detection of single nucleotide variations in expressed exons of the human genome using RNA-Seq. Nucleic Acids Res. 2009;37.

80. Himes BE, Jiang XF, Wagner P, et al. RNA-Seq Transcriptome Profiling Identifies CRISPLD2 as a Glucocorticoid Responsive Gene that Modulates Cytokine Function in Airway Smooth Muscle Cells. Plos One. 2014;9.

81. Huan T, Esko T, Peters MJ, et al. A meta-analysis of gene expression signatures of blood pressure and hypertension. Plos Genet. 2015;11:e1005035.

82. Peng ZY, Cheng YB, Tan BCM, et al. Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Nat Biotechnol. 2012;30:253-+.

83. Johnson JA, Boerwinkle E, Zineh I, Chapman AB, Bailey K, Cooper-DeHoff RM, Gums J, Curry RW, Gong Y, Beitelshees AL, Schwartz G and Turner ST. Pharmacogenomics of antihypertensive drugs: Rationale and design of the Pharmacogenomic Evaluation of Antihypertensive Responses (PEAR) study. American Heart Journal. 2009;157:442-449.

84. Turner ST, Schwartz GL, Chapman AB, Beitelshees AL, Gums JG, Cooper-Dehoff RM, Boerwinkle E, Johnson JA and Bailey KR. Power to identify a genetic predictor of antihypertensive drug response using different methods to measure blood pressure response. J Transl Med. 2012;10:47.

85. Hamadeh IS, Langaee TY, Dwivedi R, Garcia S, Burkley BM, Skaar TC, Chapman AB, Gums JG, Turner ST, Gong Y, Cooper-DeHoff RM and Johnson JA. Impact of CYP2D6 polymorphisms on clinical efficacy and tolerability of metoprolol tartrate. Clin Pharmacol Ther. 2014;96:175-81.

86. Levin JZ, Yassour M, Adiconis X, Nusbaum C, Thompson DA, Friedman N, Gnirke A and Regev A. Comprehensive comparative analysis of strand-specific RNA sequencing methods. Nat Methods. 2010;7:709-15.

87. Trapnell C, Hendrickson DG, Sauvageau M, Goff L, Rinn JL and Pachter L. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat Biotechnol. 2013;31:46-+.

88. Anders S, Pyl PT and Huber W. HTSeq-a Python framework to work with high-throughput sequencing data. Bioinformatics. 2015;31:166-169.

89. Robinson MD, McCarthy DJ and Smyth GK. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics. 2010;26:139-140.

90. Rau A, Marot G and Jaffrezic F. Differential meta-analysis of RNA-seq data from multiple studies. BMC Bioinformatics. 2014;15:91.

91. Turner ST, Boerwinkle E, O'Connell JR, et al. Genomic Association Analysis of Common Variants Influencing Antihypertensive Response to Hydrochlorothiazide. Hypertension. 2013;62:391-397.

92. Gong Y, McDonough CW, Wang Z, Hou W, Cooper-DeHoff RM, Langaee TY, Beitelshees AL, Chapman AB, Gums JG, Bailey KR, Boerwinkle E, Turner ST and Johnson JA. Hypertension susceptibility loci and blood pressure response to antihypertensives: results from the pharmacogenomic evaluation of antihypertensive responses study. Circ Cardiovasc Genet. 2012;5:686-91.

93. Westra HJ, Peters MJ, Esko T, et al. Systematic identification of trans eQTLs as putative drivers of known disease associations. Nat Genet. 2013;45:1238-43.

94. Smith RM, Webb A, Papp AC, Newman LC, Handelman SK, Suhy A, Mascarenhas R, Oberdick J and Sadee W. Whole transcriptome RNA-Seq allelic expression in human brain. BMC Genomics. 2013;14:571.

95. Minson J, Arnolda L, LlewellynSmith I, Pilowsky P and Chalmers J. Altered c-fos in rostral medulla and spinal cord of spontaneously hypertensive rats. Hypertension. 1996;27:433-441.

96. Rao F, Zhang L, Wessel J, et al. Tyrosine hydroxylase, the rate-limiting enzyme in catecholamine biosynthesis - Discovery of common human genetic variants governing transcription, autonomic activity, and blood pressure in vivo. Circulation. 2007;116:993-1006.

97. Duff JL, Monia BP and Berk BC. Mitogen-Activated Protein (Map) Kinase Is Regulated by the Map Kinase Phosphatase (Mkp-1) in Vascular Smooth-Muscle Cells. Journal of Biological Chemistry. 1995;270:7161-7166.

98. Touyz RM, Deschepper C, Park JB, He G, Chen X, Neves MF, Virdis A and Schiffrin EL. Inhibition of mitogen-activated protein/extracellular signal-regulated kinase improves endothelial function and attenuates Ang II-induced contractility of mesenteric resistance arteries from spontaneously hypertensive rats. J Hypertens. 2002;20:1127-34.

99. Connor JH, Weiser DC, Li S, Hallenbeck JM and Shenolikar S. Growth arrest and DNA damage-inducible protein GADD34 assembles a novel signaling complex containing protein phosphatase 1 and inhibitor 1. Mol Cell Biol. 2001;21:6841-50.

100. Terrak M, Kerff F, Langsetmo K, Tao T and Dominguez R. Structural basis of protein phosphatase 1 regulation. Nature. 2004;429:780-784.

101. Lipskaia L, Bobe R, Chen J, et al. Synergistic role of protein phosphatase inhibitor 1 and sarco/endoplasmic reticulum Ca2+-ATPase in the acquisition of the contractile phenotype of arterial smooth muscle cells. Circulation. 2014;129:773-85.

102. Tsaytler P, Harding HP, Ron D and Bertolotti A. Selective inhibition of a regulatory subunit of protein phosphatase 1 restores proteostasis. Science. 2011;332:91-4.

103. Ikram MK, Sim X, Jensen RA, et al. Four novel Loci (19q13, 6q24, 12q24, and 5q14) influence the microcirculation in vivo. PLoS Genet. 2010;6:e1001184.

104. Barrett JC, Fry B, Maller J and Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21:263-5.

105. Materson BJ. Variability in response to antihypertensive drugs. Am J Med. 2007;120:S10-20.

106. IMS Institute. Medicines Use and Spending Shifts: A Review of the Use of Medicines in the U.S. in 2014. 2014;2016.

107. Johnson JA, Gong Y, Bailey KR, Cooper-DeHoff RM, Chapman AB, Turner ST, Schwartz GL, Campbell K, Schmidt S, Beitelshees AL, Boerwinkle E and Gums JG. Hydrochlorothiazide and atenolol combination antihypertensive therapy: effects of drug initiation order. Clin Pharmacol Ther. 2009;86:533-9.

108. Shafi T, Appel LJ, Miller ER, 3rd, Klag MJ and Parekh RS. Changes in serum potassium mediate thiazide-induced diabetes. Hypertension. 2008;52:1022-9.

109. Gosfield E, Jr. Thiazide-induced hyperuricemia. N Engl J Med. 1963;268:562.

110. Duarte JD and Cooper-DeHoff RM. Mechanisms for blood pressure lowering and metabolic effects of thiazide and thiazide-like diuretics. Expert Rev Cardiovasc Ther. 2010;8:793-802.

111. Smith SM, Anderson SD, Wen S, Gong Y, Turner ST, Cooper-Dehoff RM, Schwartz GL, Bailey K, Chapman A, Hall KL, Feng H, Boerwinkle E, Johnson JA and Gums JG. Lack of correlation between thiazide-induced hyperglycemia and hypokalemia: subgroup analysis of results from the pharmacogenomic evaluation of antihypertensive responses (PEAR) study. Pharmacotherapy. 2009;29:1157-65.

112. Smith SM, Gong Y, Turner ST, Cooper-DeHoff RM, Beitelshees AL, Chapman AB, Boerwinkle E, Bailey K, Johnson JA and Gums JG. Blood pressure responses and metabolic effects of hydrochlorothiazide and atenolol. Am J Hypertens. 2012;25:359-65.

113. Sinjini Sikdar SD, Susmita Datta. EAMA: Empirically Adjusted Meta-analysis for Large-scale Simultaneous Hypothesis Testing in Genomic Experiments. In preparation. 2017.

114. Kitami Y, Fukuoka T, Hiwada K and Inagami T. A high level of CCAAT-enhancer binding protein-delta expression is a major determinant for markedly elevated differential gene expression of the platelet-derived growth factor-alpha receptor in vascular smooth muscle cells of genetically hypertensive rats. Circ Res. 1999;84:64-73.

115. Jain S, Li Y, Patil S and Kumar A. A single-nucleotide polymorphism in human angiotensinogen gene is associated with essential hypertension and affects glucocorticoid induced promoter activity. J Mol Med (Berl). 2005;83:121-31.

116. Wang F, Demura M, Cheng Y, et al. Dynamic CCAAT/enhancer binding protein-associated changes of DNA methylation in the angiotensinogen gene. Hypertension. 2014;63:281-8.

117. Balamurugan K and Sterneck E. The many faces of C/EBPδ and their relevance for inflammation and cancer. Int J Biol Sci. 2013;9:917-33.

118. D'Adamio F, Zollo O, Moraca R, Ayroldi E, Bruscoli S, Bartoli A, Cannarile L, Migliorati G and Riccardi C. A new dexamethasone-induced gene of the leucine zipper family protects T lymphocytes from TCR/CD3-activated cell death. Immunity. 1997;7:803-12.

119. Karaki S, Garcia G, Tcherakian C, Capel F, Tran T, Pallardy M, Humbert M, Emilie D and Godot V. Enhanced glucocorticoid-induced leucine zipper in dendritic cells induces allergen-specific regulatory CD4(+) T-cells in respiratory allergies. Allergy. 2014;69:624-31.

120. Bhalla V, Soundararajan R, Pao AC, Li H and Pearce D. Disinhibitory pathways for control of sodium transport: regulation of ENaC by SGK1 and GILZ. Am J Physiol Renal Physiol. 2006;291:F714-21.

121. Loffing J and Korbmacher C. Regulated sodium transport in the renal connecting tubule (CNT) via the epithelial sodium channel (ENaC). Pflugers Arch. 2009;458:111-35.

122. Carrel L and Willard HF. X-inactivation profile reveals extensive variability in X-linked gene expression in females. Nature. 2005;434:400-4.

123. Inuzuka M, Hayakawa M and Ingi T. Serinc, an activity-regulated protein family, incorporates serine into membrane lipid synthesis. J Biol Chem. 2005;280:35776-83.

124. Fukazawa N, Ayukawa K, Nishikawa K, Ohashi H, Ichihara N, Hikawa Y, Abe T, Kudo Y, Kiyama H, Wada K and Aoki S. Identification and functional characterization of mouse TPO1 as a myelin membrane protein. Brain Res. 2006;1070:1-14.

125. Thierry-Mieg D and Thierry-Mieg J. AceView: a comprehensive cDNA-supported gene and transcripts annotation. Genome Biol. 2006;7 Suppl 1:S12 1-14.

126. Serre D, Gurd S, Ge B, Sladek R, Sinnett D, Harmsen E, Bibikova M, Chudin E, Barker DL, Dickinson T, Fan JB and Hudson TJ. Differential allelic expression in the human genome: a robust approach to identify genetic and epigenetic cis-acting mechanisms regulating gene expression. PLoS Genet. 2008;4:e1000006.

BIOGRAPHICAL SKETCH

Ana Caroline Costa Sá was born and raised in Brasília, Brazil. She received her bachelor’s degree in pharmaceutical sciences in September 2009 from the University of Brasília (UnB) in Brasília. After graduation, she completed her master’s degree in parasite biology and genetics at the Oswaldo Cruz Foundation. To continue her education in genetics, she started her Doctor of Philosophy degree in genetics and genomics at the University of Florida Genetics Institute in 2012. During her doctoral training, Ana Caroline was involved in a diverse set of activities including teaching, clinical study coordination, and research in pharmacogenomics. She has authored multiple peer-reviewed manuscripts and presented her research findings at national meetings. Ana Caroline earned her PhD degree from the University of Florida in the spring of 2017. She will continue her career as a research scientist in the biotechnology/pharmaceutical industry, starting with a postdoctoral fellowship with Dr. James Brown at GlaxoSmithKline.
