GENETIC EPIDEMIOLOGY OF HYPERTENSION IN POPULATIONS:

APPLICATIONS OF MODIFIED METHODS

by

PRIYA BHATIA SHETTY, M.S.

Submitted in partial fulfillment of the requirements

For the degree of Doctor of Philosophy

Dissertation Adviser: Xiaofeng Zhu, Ph.D.

Department of Epidemiology and Biostatistics

CASE WESTERN RESERVE UNIVERSITY

January, 2014

CASE WESTERN RESERVE UNIVERSITY

SCHOOL OF GRADUATE STUDIES

We hereby approve the dissertation of Priya Bhatia Shetty,

candidate for the Doctor of Philosophy degree*.

Xiaofeng Zhu, PhD

Robert C. Elston, PhD

Jing Li, PhD

Nathan Morris, PhD

September 6, 2013

*We also certify that written approval has been obtained for any proprietary material

contained therein.

Dedication

This dissertation is dedicated to my husband Manju and our children Suvan, Poppy,

Jujubee, and Sayali.

TABLE OF CONTENTS

List of Tables

List of Figures

Abstract

Chapter 1. Background

Chapter 2. Specific Aims

Chapter 3. Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping

Chapter 4. Novel variants for HDL-C, LDL-C and triglycerides identified from admixture mapping and fine-mapping analysis in families

Chapter 5. Identification of admixture regions associated with risk of apparent treatment-resistant hypertension in African-Americans

Chapter 6. Discussion

References

List of Tables

Chapter 1. Table 1. Genome-wide Association Studies of Hypertension

Chapter 1. Table 2. Other Analysis Methods in the Genetic Analysis of Hypertension in Unrelated Subjects

Chapter 3. Table 1. Summary statistics

Chapter 3. Table 2. Results of -based multivariable regression models

Chapter 3. Table 3. Results of gene-based multivariable regression models that were nominally significant at α equal to 0.05

Chapter 3. Table 4. Results of gene-based multivariable regression models using local ancestry for replication analysis

Chapter 4. Table 1. Summary statistics

Chapter 4. Table 2. Regions suggested by admixture mapping

Chapter 4. Table 3. Significant fine-mapping results

Chapter 5. Table 1. Summary statistics

Chapter 5. Table 2. Significant results from the admixture mapping analysis

List of Figures

Chapter 4. Figures 1a-1g. Quantile-Quantile plots of fine-mapping analysis in families

Chapter 4. Figures 2a-2g. Combined analyses plots

Chapter 4. Supplemental Figures 3-3g. Admixture mapping

Chapter 5. Figure 1. Admixture mapping in all patients

Chapter 5. Figure 2. Admixture mapping in patients taking 2 anti-hypertensive drugs

Genetic Epidemiology of Hypertension in Populations: Applications of Modified

Methods

Abstract

by

PRIYA BHATIA SHETTY

Objective: Modified methods of analysis were applied in three studies of African-

Americans to identify variants associated with blood pressure and to address missing

heritability in genetic studies of hypertension.

Methods: Three genetic epidemiology studies were conducted in African-American

subjects in the National Heart, Lung and Blood Institute’s Family Blood Pressure

Program. First, a candidate gene association study using gene scores was conducted in

genomic regions that previously showed admixture association evidence for blood

pressure and other cardiovascular traits. Second, admixture mapping analyses were conducted on the 22 autosomal chromosomes for blood pressure and lipids, followed by fine-mapping analyses of the suggestive admixture regions in families. Third, admixture mapping analyses were conducted for blood pressure in subjects treated with at least 2 anti-hypertensive medications, and the regions suggesting admixture evidence were followed up with fine-mapping analyses in families to identify variants associated with apparent treatment-resistant hypertension. Stratified analyses were conducted by number of drugs for phenotypes showing significant association results in all subjects.

Results: In the first study, CXADR and F2RL1 were associated with blood pressure and

body-mass-index, respectively. Analysis of local ancestry suggested that other associated

SNPs were present in these regions. In the second study, the admixture mapping analyses

identified seven regions associated with blood pressure and lipids. In the fine-mapping

analyses, 11 SNPs in 8 independent loci were identified for associations with lipids. In

the third study, admixture regions on chromosomes 3 and 5 were identified for

association with blood pressure in all treated subjects and in subjects taking two anti-

hypertensive drugs, respectively. The regions included variants that were previously identified in a large blood pressure GWAS, and the results suggested that the variants

were associated with treatment response, rather than blood pressure, in African-

Americans.

Conclusions: A number of novel and known genomic and genetic variants were identified for blood pressure and other cardiovascular phenotypes. The methods of analysis were

modified to incorporate gene scores, two-stage study designs and fine-mapping association analyses in families, and these adaptations were key in addressing some of the missing heritability in the genetics of hypertension and other cardiovascular traits.

Chapter 1. Background

Hypertension

One-quarter of adults worldwide suffer from high blood pressure, and this is a

major public health issue because hypertension is an important risk factor for

cardiovascular disease and stroke [1, 2]. Hypertension (HTN) is defined as untreated

systolic blood pressure greater than or equal to 140 mm Hg and/or untreated diastolic

blood pressure greater than or equal to 90 mm Hg or taking medication to reduce high

blood pressure [3, 4]. Systolic blood pressure (SBP) is the amount of arterial pressure

present when the heart contracts and diastolic blood pressure (DBP) is the amount of

arterial pressure present when the heart relaxes.
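As an illustration only, the definition above can be written as a small Python predicate. The function and argument names are hypothetical; the thresholds are the ones stated in the text.

```python
def is_hypertensive(sbp_mm_hg: float, dbp_mm_hg: float, on_bp_medication: bool) -> bool:
    """Apply the definition in the text.

    A subject is classified as hypertensive if SBP >= 140 mm Hg and/or
    DBP >= 90 mm Hg, or if the subject takes blood-pressure-lowering medication.
    """
    return on_bp_medication or sbp_mm_hg >= 140 or dbp_mm_hg >= 90


# Hypothetical readings, for illustration only.
print(is_hypertensive(128, 92, False))  # elevated DBP alone meets the definition
print(is_hypertensive(118, 76, True))   # treated subjects meet the definition
```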

Between 1999 and 2008, approximately 30% of adults in the United States

suffered from high blood pressure and only one-third of hypertensive patients had adequate control of their blood pressure to below 140/90 mm Hg according to the

National Health and Nutrition Examination Survey (NHANES) of 1999-2000 [4, 5].

Epidemiological analyses of hypertension data have shown that risk of hypertension varies by race and ethnicity, sex and age. Approximately 40% of non-Hispanic black adults in the United States had hypertension in 1999-2008, compared to 25-30% of non-Hispanic white adults and Mexican American adults in the U.S. during the same time period [4]. An analysis of NHANES data from 2000-2006 also showed that non-Hispanic black hypertensive adults had 1.49 times the odds of poorly controlled blood pressure compared with non-Hispanic white hypertensive adults (odds ratio 1.49, 95% confidence interval (1.12, 1.98)) [6]. From young adulthood until age 60, SBP tends to be higher in men than in women, after which women experience a more rapid age-associated

increase in SBP than men [7]. After age 60, women have a higher SBP than men;

however, DBP increases steadily with age until approximately 55 years for both sexes,

and it often declines after this point [7].

Besides age and sex, one of the most important risk factors for hypertension is obesity; a 10 kg increase in weight is associated with a 3 mm Hg increase in SBP and a

2.3 mm Hg increase in DBP [8]. Most of the effects of obesity on hypertension are mediated through changes in cardiac output and peripheral vascular resistance, given that blood pressure = cardiac output × systemic vascular resistance. Obesity results in increased blood volume and stroke volume, which lead to increased cardiac output. In addition, obesity causes increased peripheral vascular resistance through a number of mechanisms, including endothelial dysfunction, sleep apnea and the release of inflammatory cytokines, such as IL-6 and TNF-α, by adipocytes [8].

Both genetic and environmental factors are associated with variation in blood pressure. The heritability of hypertension has been estimated to range between 34% and

67% from family studies and twin studies [9-11]. Interestingly, SBP and DBP have also shown considerable variability in heritability estimates by sex, in that males have higher heritability estimates than females for both SBP and DBP [7, 12].

In addition to these points, other considerations for studying blood pressure phenotypes are the decisions of how to treat the phenotype for analysis, continuous as in

SBP and DBP or dichotomous as in hypertension (yes or no), and how to effectively adjust for anti-hypertensive treatment in the analysis. For instance, increased statistical power might be available when treating blood pressure as a continuous trait compared to a categorical trait, but the normality requirements of statistical methods for continuous

variables (e.g., linear regression) may require an appropriate transformation. On the other hand, treating blood pressure as a dichotomous phenotype can be favorable because the resulting interpretations are more clinically meaningful than those from the analysis of continuous traits. However, reducing the continuous trait to a categorical trait may result in a considerable loss in statistical power to detect small effect sizes, which are common in genetic epidemiology studies.

Furthermore, age, sex and obesity are key potential confounders that need to be adjusted for in blood pressure studies. As mentioned previously, age and sex are important potential confounders because hypertension varies considerably with these traits [7]. Obesity is an essential risk factor for hypertension that affects blood pressure through cardiac output and peripheral vascular resistance [8]. In statistical analyses, obesity is usually modeled as body-mass-index (BMI) and this is calculated as weight

(kg)/height² (m²). Taking all of these elements into consideration, it is evident that the

heterogeneity of blood pressure is a complex public health challenge and well-designed

research approaches and methods are necessary to address these issues.
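The BMI formula mentioned above, weight (kg) divided by squared height (m²), can be sketched as follows. The function name and the example values are hypothetical and serve only to make the units explicit.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI in kg/m^2: weight in kilograms divided by squared height in meters."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject: 81 kg and 1.80 m.
print(round(body_mass_index(81.0, 1.80), 1))
```

In a regression model for blood pressure, a value computed this way would typically enter as a covariate alongside age and sex.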

As hypertension is a complex disease with many known and unknown

contributing and causal factors, a number of different approaches have been attempted to

address the genetic epidemiology of hypertension. This review will discuss the methods

and findings from genome-wide association studies (GWAS) and admixture mapping studies for blood pressure phenotypes to provide an overview of the genetics of hypertension at the population level.

Genome-Wide Association Studies

As hypertension is a disease that is present across many populations, it seems

reasonable that genetic variants that confer risk of the disease may exist in fairly high

frequencies across populations [13, 14]. With this in mind, genome-wide association

studies (GWAS) have sought to identify common genetic variants associated with blood

pressure efficiently by comparing genetic profiles between large numbers of cases and

controls.

GWAS Methods

One of the most common study designs used to identify genetic variants for a

disease is the genome-wide association study. This type of study is centered on the

common disease common variant (CDCV) hypothesis to identify multiple single

nucleotide polymorphisms (SNPs) that are associated with the disease. In the CDCV

hypothesis, it is expected that common, complex diseases are likely to be caused by

genetic variants that are relatively frequent in human populations [14-16].

In order to identify these genetic variants, GWAS are often designed as case-

control and cohort studies that compare the genetic variation between subjects to

determine which variants are associated with the phenotype of interest [17, 18]. As in

traditional epidemiological studies, care must be taken to select cases and controls from

the source population(s) such that the controls are well-matched to the cases with regards

to the risk of disease and associated potential confounding factors. The genetic data for

GWAS is obtained from a multi-stage procedure, beginning with selecting a large

number of genome-wide SNPs for genotyping. After the genotyping is complete, it is

necessary to conduct quality control procedures, including determining genotype calls for

the SNP data, filtering out SNPs with genotype call rates below a predefined threshold

and removing SNPs that demonstrate a departure from Hardy-Weinberg Equilibrium.

SNPs that show a considerable departure from Hardy-Weinberg Equilibrium are

considered likely genotyping errors if other considerations, including systematic

genotyping errors and population stratification, are not a concern [19, 20].
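One common way to screen a SNP for departure from Hardy-Weinberg Equilibrium is a one-degree-of-freedom chi-square goodness-of-fit test on the three genotype counts. The sketch below is a minimal illustration under that assumption; the function name and the example counts are hypothetical, and exact tests are often preferred in practice when genotype counts are small.

```python
import math
from typing import Tuple


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float]:
    """Return (chi-square statistic, p-value) for departure from HWE at a biallelic SNP."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        return 0.0, 1.0  # monomorphic SNP: no departure can be assessed
    # Genotype counts expected under HWE: n*p^2, 2*n*p*q, n*q^2.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: a complete absence of heterozygotes is a strong departure.
print(hwe_chi_square(50, 0, 50))
```

A SNP whose p-value falls below a predefined threshold would then be flagged for removal or closer inspection.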

Finally, association analyses are used to evaluate the relationship between the genetic data and the phenotype data [17]. A number of tests and methods can be used to determine if an association exists between the genetic data and the phenotype data. For example, the Cochran-Armitage trend test evaluates the relationship between the dosage of an allele and the phenotype data using a co-dominant or additive genetic model. In addition, single SNPs can be tested using statistical regression methods, such as linear regression and logistic regression, depending on the nature of the phenotype data. These methods are well-suited for GWAS because they are well-established statistical methods that have straightforward biological and clinical interpretations and are easily applied with multiple software options, including R and PLINK [21, 22]. Furthermore, the regression models allow for the incorporation of covariates, including adjustments for population stratification. However, because a very large number of SNPs are tested, these results should be interpreted after a correction for multiple testing. Alternatively, the tests can be evaluated for statistical significance using permutation testing.
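To make the trend test concrete, the following is a minimal sketch of the Cochran-Armitage statistic for a 2 x 3 table of genotype counts in cases and controls, using additive weights (0, 1, 2) for the number of copies of the tested allele. The function name and the example counts are hypothetical; this is not the implementation used by R or PLINK.

```python
import math
from typing import Sequence, Tuple


def cochran_armitage_trend(cases: Sequence[int], controls: Sequence[int],
                           weights: Sequence[float] = (0.0, 1.0, 2.0)) -> Tuple[float, float]:
    """Return (z statistic, two-sided p-value) for a trend in case proportion across genotypes."""
    if not (len(cases) == len(controls) == len(weights)):
        raise ValueError("cases, controls and weights must have the same length")
    totals = [r + s for r, s in zip(cases, controls)]  # column totals per genotype
    n = sum(totals)
    n_cases = sum(cases)
    n_controls = sum(controls)
    if n == 0:
        raise ValueError("the table is empty")
    sum_t_r = sum(t * r for t, r in zip(weights, cases))
    sum_t_n = sum(t * c for t, c in zip(weights, totals))
    sum_t2_n = sum(t * t * c for t, c in zip(weights, totals))
    # Trend statistic and its variance under the null hypothesis of no association.
    numerator = n * sum_t_r - n_cases * sum_t_n
    variance = n_cases * n_controls * (n * sum_t2_n - sum_t_n ** 2) / n
    if variance <= 0:
        raise ValueError("the trend test is undefined for this table")
    z = numerator / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Hypothetical counts by genotype (0, 1, 2 copies of the tested allele).
z, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
print(round(z, 3), p)
```

In a genome-wide setting, the resulting p-value would be compared with a threshold corrected for the number of SNPs tested (for example, a Bonferroni threshold of α divided by the number of tests) rather than with 0.05.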

The results from these analyses are often graphically represented as quantile-quantile (Q-Q) plots and Manhattan plots [17, 23]. Q-Q plots compare the distribution of the expected test results under the null hypothesis with the distribution of the observed test results to indicate if there are more statistically significant results observed than would be expected in the GWAS. These points are usually the ones that deviate from the

45-degree reference line at the tail of the distribution. In addition, the Q-Q plots can indicate if there are systematic problems in the GWAS dataset, such as population stratification; in this case, the curve deviates from the 45-degree reference line throughout most of the distribution. Manhattan plots are an attractive way to identify the loci along the entire genome where the most statistically significant findings are located. These graphs plot all of the SNPs’ –log10 p-values, ordered by chromosome and location, and

the most significant observations look like skyscrapers on the plot.
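The coordinates of a Q-Q plot can be computed directly from the list of p-values. The sketch below assumes the plotting position (i - 0.5)/n for the i-th smallest of n p-values, which is one common convention rather than the only one; the function name and example values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def qq_points(p_values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) -log10 p-value pairs for a Q-Q plot.

    Expected values are quantiles of a uniform distribution, i.e. the
    distribution of p-values under the null hypothesis of no association.
    """
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    observed = sorted(p_values)
    n = len(observed)
    points = []
    for i, p_obs in enumerate(observed, start=1):
        p_exp = (i - 0.5) / n  # assumed plotting position
        points.append((-math.log10(p_exp), -math.log10(p_obs)))
    return points


# Hypothetical p-values: the smallest one lies well above the 45-degree line.
for expected, observed in qq_points([0.8, 0.45, 0.2, 0.0001]):
    print(round(expected, 2), round(observed, 2))
```

Points whose observed value exceeds the expected value only in the upper tail correspond to the candidate associations described above, while a departure along the whole curve points to a systematic problem.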

In addition to the primary analysis, many GWAS also include a secondary

analysis of multiple studies’ results because the increased sample size allows for

improved detection of the small effect sizes that are typical of SNPs associated with

common, complex diseases. Meta-analysis combines the results of multiple studies

to provide global estimates of effect, while accounting for differences in the distributions

of phenotypes, race/ethnicity and other key elements of a study. One of the most popular

methods used to combine studies in meta-analysis is the inverse-variance weighting

method, in which the statistical result from each study is weighted by its inverse variance,

such that the influence of each study is inversely proportional to the amount of variance

in its result. Software packages such as METAL can be used to implement meta-analysis

for GWAS [24].
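The inverse-variance weighting described above can be sketched as a fixed-effect calculation in which each study contributes an effect estimate and its standard error. The function name and the example numbers are hypothetical, and the sketch is not a reproduction of METAL.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_meta(betas: Sequence[float],
                          std_errors: Sequence[float]) -> Tuple[float, float, float]:
    """Return the pooled effect, its standard error and the z statistic."""
    if len(betas) == 0 or len(betas) != len(std_errors):
        raise ValueError("betas and std_errors must be non-empty and of equal length")
    if any(se <= 0 for se in std_errors):
        raise ValueError("standard errors must be positive")
    # Each study is weighted by the inverse of the variance of its estimate.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled_beta, pooled_se, pooled_beta / pooled_se


# Hypothetical per-allele effects on SBP (mm Hg) from two studies.
print(inverse_variance_meta([1.0, 2.0], [1.0, 0.5]))
```

The more precise second study dominates the pooled estimate, which is the behavior described in the text: influence is inversely proportional to the variance of each result.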

Recently, researchers have also attempted to use data from GWAS to estimate the

heritability of a phenotype. Yang et al. reported that they estimated the heritability of

height to be 45%, using genome-wide SNPs [25]. The heritability of height from family

studies has been estimated at 80%; the authors noted that the discrepancy between their

estimate and the estimate from family studies, the missing heritability, was due to

incomplete linkage disequilibrium. In their analysis, Yang et al. used a restricted

maximum likelihood (REML) method with almost 295,000 SNPs in 3,925 unrelated

subjects to estimate the variance of height due to the SNPs. However, estimates for heritability using data from unrelated subjects are not the same as estimates obtained

from family data. On the one hand, using unrelated subjects may underestimate the

heritability due to imperfect LD between variants available on genotyping arrays and the

causal variants. On the other hand, family-based methods may overestimate the

heritability due to common environment. As part of the Genetic Analysis Workshop 17

(GAW17), heritability estimates based on family data were compared to estimates based

on unrelated data using the Yang et al. method and two REML methods for a simulated

phenotype in the workshop data [25, 26]. The heritability of the phenotype using family

data was estimated to be 65.0%. The two REML methods estimated the heritability to be

46.2% (standard error 10.0%) and 53.5% (standard error 12.1%); however, the Yang et

al. method did not achieve stable estimates. Further, there were notable discrepancies

between the heritability estimates obtained from the family data and the unrelated data.

A similar concern in GWAS involves the disparity between the heritability estimates of the disease and the amount of variability in phenotypes that is explained by genetic variants. This difference is often referred to as the missing heritability of the disease. For hypertension, the heritability is estimated to range between 34% and 67% from family studies and twin studies, but only 0.55% - 1% of the variability of blood

pressure has been captured by the top SNPs in recent, large blood pressure GWAS [9-11,

27-29]. This disparity between the family-based estimates and the variability accounted

for by variants discovered in GWASs may be due to insufficient tagging SNPs on arrays,

rare genetic variants, copy number variation, gene-gene interactions and epistatic effects, and

inadequate phenotype definitions that do not account for disease heterogeneity [25, 30].

GWAS Studies of Hypertension

Over the last several years, over 1,600 publications from GWAS assaying at least

11,000 SNPs have been published in the National Human Genome Research Institute

(NHGRI) GWAS Catalog as of August 2013 [31]. Hundreds of smaller GWAS have been published in scientific journals during this time as well. Despite this wealth of studies, the overall results of most GWAS have not met the initial, great expectations of

GWAS having broad clinical applications following the sequencing of the human genome. In particular, genetic findings from GWAS in hypertension and blood pressure

studies have been somewhat limited.

Two of the earliest genome-wide association studies of hypertension were published in 2007. The Diabetes Genetics Initiative published a genome-wide association study of 2931 subjects (1464 type II diabetes cases and 1467 controls) from Finland and

Sweden on 18 traits, including blood pressure as measured by SBP and DBP (Table 1)

[32]. Saxena et al. also conducted a replication analysis on 10,850 additional subjects who were type II diabetes cases and controls from Sweden, Poland and the United States,

13,965 subjects from the Wellcome Trust Case Control Consortium (WTCCC)/UKT2D

(United Kingdom Type II Diabetes) studies and 4808 subjects from the FUSION

(Finland-United States Investigation of NIDDM Genetics) study. Despite a combined sample size of up to 32,554 subjects, the authors reported that no SNPs were genome-wide significantly associated with SBP or DBP. The authors did not offer any possible explanations for the absence of significant findings for blood pressure;

however, it is quite reasonable to expect that their study was underpowered to detect associations for blood pressure as their cases were selected for the study based on their diabetes status and the hypertension analysis was secondary.

At the same time, the Wellcome Trust Case Control Consortium published a genome-wide association study of seven common diseases in 14,000 cases and 3000 controls [33]. The authors studied genetic variants for association with hypertension,

Crohn’s disease, type I diabetes, type II diabetes, bipolar disorder, coronary artery disease and rheumatoid arthritis. Each association analysis was conducted on approximately 2000 cases and 3000 common controls from the 1958 Birth Cohort (the

National Child Development Study) conducted in the United Kingdom and the UK Blood

Services, and all of the subjects were white Europeans. The cases for the hypertension analysis were obtained from the BRIGHT (British Genetics of Hypertension) study. As in the Diabetes Genetics Initiative study, the WTCCC reported that no SNPs were genome-wide significantly associated with hypertension. However, they reported that

SNP rs2820037 on chromosome 1q43, which is close to the RYR2, CHRM3 and

ZP4 genes, was most strongly associated with hypertension. The WTCCC commented that their inability to find genome-wide significant results may have been due to inadequate tagging by the genotyped SNPs, fewer common risk alleles of large effect size in hypertension, and misclassification bias in that some of the controls may have been hypertensive as well.

Subsequently, three of the largest GWAS for blood pressure were published by the CHARGE (Cohorts for Heart and Aging Research in Genome Epidemiology), Global

BPgen, and the International Consortium for Blood Pressure Genome-Wide Association

Studies (ICBP GWAS) consortia [27-29]. The CHARGE consortium analyzed 29,136 subjects in 6 cohorts of European ancestry in the primary analysis, the Global BPgen

consortium analyzed 34,333 subjects in 17 cohorts of European ancestry, and the ICBP

GWAS analyzed 200,000 subjects in 65 cohorts of European ancestry. The CHARGE and Global BPgen studies were published simultaneously, and they each used the other group’s sample for their replication analysis. In addition, the Global BPgen consortium used a sample of 71,225 European subjects and 12,889 Indian Asian subjects for a second replication analysis. The ICBP GWAS employed a sample of over 73,000 subjects of different ethnicities (29,719 East Asian, 23,977 South Asian, and 19,775 African

ancestry) for replication analysis. All three consortia examined the effect of SNPs on

treatment-adjusted SBP and DBP.

Among the three studies, a number of SNPs were found to be genome-wide significantly associated with blood pressure phenotypes in the primary and replication analyses. Levy et al. reported that SNPs in ATP2B1 were associated with hypertension,

SBP and DBP, and SNPs in SH2B3 were associated with SBP and DBP [27]. In addition,

SNPs in CYP17A1 and PLEKHA7 were associated with SBP, and SNPs in ULK4,

CACNB2, TBX3-TBX5 and CSK-ULK3 were associated with DBP. Newton-Cheh et al. reported that SNPs in MTHFR, CNNM2/NT5C2 and PLCD3 were associated with SBP, and SNPs in FGF5, c10orf107, ATXN2, CSK and ZNF652 were associated with DBP

[28]. Of these genes, only CYP17A1, SH2B3 and MDS1 were significantly associated with SBP, and CSK, SH2B3 and MDS1 were significantly associated with DBP in replication studies [13]. The ICBP GWAS reported 16 novel SNPs that were associated with blood pressure traits: MOV10 (SBP, DBP), SLC4A7 (DBP), MECOM (SBP, DBP),

SLC39A8 (SBP, DBP), GUCY1A3-GUCY1B3 (DBP), NPR3-C5orf23 (SBP, DBP, HTN),

EBF1 (SBP, DBP), HFE (SBP, DBP, HTN), BAT2-BAT5 (SBP, DBP, HTN), PLCE1

(SBP, HTN), ADM (SBP), FLJ32810-TMEM133 (SBP, DBP, HTN), FURIN-FES (SBP,

DBP), GOSR2 (SBP), JAG1 (SBP, DBP), and GNAS-EDN3 (SBP, DBP, HTN) [29]. In addition, the consortium reported an additional SNP in CACNB2 that was associated with

SBP and DBP, and the consortium replicated 12 out of the 13 results from the CHARGE and Global BPgen GWAS (the result for PLCD3 was not replicated).

Although the consortia reported several statistically-significant findings, the effect sizes were quite modest, such that each risk allele was associated with up to a 1.17 mm

Hg change in SBP and/or a 0.65 mm Hg change in DBP in the Global BPgen study and a

1 mm Hg change in SBP and/or a 0.5 mm Hg change in DBP in the CHARGE study [27,

28]. Similarly, the ICBP GWAS found that only 0.9% of the trait variances for SBP and

DBP were accounted for by the 29 SNPs they reported [29]. These modest effects may have been partly due to insufficient tagging of SNPs in linkage disequilibrium (LD) with common causal variants and rare variants, the heterogeneity of the phenotype and some cohort effects. The use of an Asian Indian cohort for the replication analysis in the

Global BPgen study was interesting, but it was not surprising that many of the Global

BPgen results did not replicate in this cohort as Asian Indians have a three- to four-fold increased risk of coronary artery disease compared to people of European ancestry, so different genetic variants may play a key role in this risk disparity [34]. Similarly, only some of the results from the ICBP GWAS were replicated in East Asians, South Asians, and individuals of African ancestry with regard to statistical significance and direction of the effect [29].

A number of medium and smaller GWAS have been conducted in subjects of

European ancestry as well. Org et al. conducted a genome-wide association study of

hypertension and unadjusted SBP and DBP in 1644 German subjects (1017 untreated

subjects) in the KORA S3 cohort [35]. They conducted their replication analyses in 1830

Germans from the KORA S4 cohort, 1832 Estonians from the HYPEST cohort and 4370

British from the BRIGHT family-based cohort. They reported that no SNPs were genome-wide significantly associated with blood pressure or hypertension, but SNP rs11646213 on chromosome 16q23.3 had one of the strongest associations with hypertension, SBP and DBP in their meta-analysis. This SNP is located upstream from

CDH13, and this gene is associated with vascular wall remodeling and angiogenesis, as well as with some cancers. In another European cohort, Sabatti et al. studied genetic risk factors for treatment adjusted SBP and DBP, as well as 7 other metabolic phenotypes, in

5654 subjects in the Northern Finland Birth Cohort of 1966 [36]. They conducted both

GWAS and a genome-wide haplotype association study, but they reported that no SNPs were genome-wide significantly associated with SBP or DBP. In addition, Wang et al. conducted a genome-wide association study of treatment-adjusted SBP and DBP in a cohort of 542 Amish subjects with replication in 7151 additional subjects of European ancestry [37]. They reported that SNP rs4977950 was genome-wide significantly associated with SBP, but this SNP is located in a gene desert, so the clinical and biological implications of this finding are currently limited. Their next most interesting findings were in SNPs on chromosome 2q24.3, near STK39, which were associated with

SBP and DBP. STK39 is associated with sodium excretion in the kidneys. Interestingly, the authors reported that the genetic model between STK39 and blood pressure was most

significantly additive or dominant in older Amish subjects, but it was most significantly

recessive in younger Amish subjects.

These GWAS reported few or no genome-wide significant findings, most likely because they were underpowered to detect small effects. In the Sabatti et al. study, the authors studied 9 metabolic traits overall, and this may have limited their power to detect small effect sizes because few variants have pleiotropic effects, as in metabolic syndrome, that can be detected by moderately sized studies [36]. In the Org et al. study, phenotype heterogeneity and associated small effect sizes may have played a substantial role in their inability to find genome-wide significant findings [35]. The primary cohort included a large percentage (61.8%) of subjects who were untreated for their hypertension, which was likely milder than the disease present in treated hypertensives.

Therefore, the study conducted by Org et al. may have had too little power to discover genetic variants associated with small effects in the less severe hypertension phenotype.

The study conducted by Wang et al. was also underpowered to find variants associated with small effects on hypertension because the primary analysis was conducted in a very small, isolated cohort of only 542 Amish subjects [37].

Using a variation on this study design, Padmanabhan et al. conducted a genome-wide association study of hypertension and treatment-adjusted SBP and DBP in 1621 extreme hypertension cases and 1699 controls in the primary analysis and 19,845 extreme hypertension cases and 16,541 controls in the replication analysis [38]. The subjects were all of European ancestry. The authors reported that the SNP rs13333226 (UMOD) was genome-wide significantly associated with hypertension, and they suggested that this variant may be associated with hypertension through its role in renal function. Although

Padmanabhan et al. reported an interesting finding, it is unclear if this result is limited to

extreme hypertension cases only.

GWAS in non-European subjects

Despite their increased disease burden, there are fewer genome-wide association

studies conducted in African-Americans than in people of European ancestry; however, it is

beneficial to conduct GWAS in African-Americans because their genetic backgrounds

provide additional opportunities for discovering variants. African-Americans are a

recently admixed population, and their background linkage disequilibrium (LD) contains

regions associated with African ancestry and regions associated with European ancestry.

The African ancestry regions are shorter than the European ancestry regions because their

length decreases with the increasing age of the population [39, 40]. Consequently, by capitalizing on the presence of these shorter African ancestry blocks, it may be possible to map risk variants more precisely to a particular location.

As a caveat, it is also necessary to account for population stratification in African-

American cohorts and other recently admixed populations to avoid reporting spurious results. If the ancestral populations of an admixed group have substantial allele frequency differences for a particular variant (population stratification), recent admixture can create LD between unlinked genetic markers. To avoid this issue, each individual’s genome-wide proportion of African or European ancestry is included in the analysis, often as a covariate in the regression modeling.
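The effect of including ancestry as a covariate can be illustrated with a toy example. In the sketch below, the phenotype depends only on the genome-wide ancestry proportion, but the genotype is correlated with ancestry, so an unadjusted regression shows a spurious SNP effect. The adjusted slope is obtained by residualizing both genotype and phenotype on ancestry, which gives the same genotype coefficient as a multiple regression that includes ancestry. All names and data values are hypothetical.

```python
from typing import List, Sequence


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def slope(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares slope of y on x in a model with an intercept."""
    mx, my = _mean(x), _mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def residuals(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Residuals of y after a simple linear regression on x."""
    b = slope(x, y)
    a = _mean(y) - b * _mean(x)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def ancestry_adjusted_slope(genotype: Sequence[float], phenotype: Sequence[float],
                            ancestry: Sequence[float]) -> float:
    """Genotype effect on the phenotype after adjusting for ancestry proportion."""
    return slope(residuals(ancestry, genotype), residuals(ancestry, phenotype))


# Toy data: SBP depends on ancestry only; the genotype has no true effect.
ancestry = [0.6, 0.7, 0.8, 0.9, 0.6, 0.7, 0.8, 0.9]
genotype = [0, 0, 1, 2, 0, 1, 1, 2]
phenotype = [120 + 20 * a for a in ancestry]

unadjusted = slope(genotype, phenotype)
adjusted = ancestry_adjusted_slope(genotype, phenotype, ancestry)
print(unadjusted, adjusted)
```

In this constructed example the unadjusted slope is clearly positive, whereas the ancestry-adjusted slope is zero up to floating-point error, which is the kind of spurious result that the covariate is meant to prevent.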

One of the first major African-American GWAS was conducted in 1,017 subjects from the Howard University Family Study with a replication analysis conducted in 980

West Africans [41]. The authors did not find any genetic variants that were genome-wide

significantly associated with blood pressure in hypertensive subjects. However, the

authors reported that SNPs in PMS1, CACNA1H, SLC24A4, YWHAZ, IPO7 and

pseudogene AL365365.23 were genome-wide significantly associated with SBP in

normotensive subjects in the primary analysis, and the SNP in SLC24A4 replicated in the

West African sample. The authors were also able to replicate the previously-reported

associations between blood pressure and STK39 and CDH13 [35, 37, 41]. In addition,

Adeyemo et al. replicated some of the top hits for hypertension from the Diabetes

Genetics Initiative study, SLC24A4, IPO7 and PMS1, and the CHARGE study, CACNB2

and PMS1 [27, 32, 41].

Recently, one of the largest African-American GWAS of blood pressure reported

a significant association between rs11041530 (CYB5R2) and SBP in the primary analysis, as well as significant associations for three novel loci (RSPO3, PLEKHG1, and EVX1-HOXA) and one independent signal in a known locus (SOX6) for blood pressure traits in a trans-ethnic replication analysis [42]. Franceschini et al. conducted the meta-analysis of 19 African-American GWAS comprising 29,378 subjects, and the replication analysis was conducted in an additional 10,386 African-Americans, 69,395 subjects of European ancestry, and 19,601 subjects of East Asian ancestry. The authors also found that the effect sizes for SNPs that were previously reported for association with blood pressure in the ICBP GWAS were correlated with effect sizes for blood pressure in other ethnic groups, indicating that the variants had common effects in different populations [29, 42].

Two additional African-American GWAS of hypertension were also conducted in subjects from the CARe (Candidate Gene Association Resource) consortium.

Fox et al. conducted a genome-wide association study for treatment-adjusted SBP and

DBP in 8591 African-American subjects in five cohorts of the CARe consortium with replication studies in five African-American cohorts (N = 11,882; Maywood, Howard

University Family Study, the GENOA network of the Family Blood Pressure Project, the

Women’s Health Initiative and the International Collaborative Study on Hypertension in

Blacks) and one European-American cohort (N = 69,899 from the International

Consortium for Blood Pressure Genome-wide Association Studies) [43]. The authors reported that rs10474346, which is near the genes GPR98 and ARRDC3, and rs2258119 in C21orf91 were genome-wide statistically significant. However, of the top hits from the European-ancestry subjects in the Global BPgen and CHARGE consortia, only three

SNPs in SH2B3, TBX2-TBX5 and CSK-ULK3 replicated in the African-American subjects from the CARe consortium.

Similarly, Lettre et al. conducted a study for hypertension in 8090 African-

American subjects from 5 cohorts in the CARe project in their primary study and 8849

African-American and African-Caribbean subjects from 4 population-based cohorts for their replication analysis [44]. The authors reported that the imputed SNP rs7801190

(SLC12A9) was genome-wide significantly associated with hypertension in the primary analysis, but it was not significant in the replication study. The reason for this lack of replication is unclear, as the authors conducted a well-designed study with large sample sizes for the discovery and replication analyses. Because the authors studied six phenotypes (coronary heart disease and its risk factors), it is possible that the study was underpowered to detect small genetic effects that would be genome-wide significant after accounting for multiple testing.

The hypertension GWAS conducted in African-Americans highlight the need to

conduct studies of blood pressure in a variety of populations in order to better understand

the complex genetic epidemiology of blood pressure by capitalizing on differences in

linkage disequilibrium patterns and variability in allele frequencies between populations.

In this same vein, several GWAS conducted in Asians further emphasized this issue in other diverse groups. One of the earliest studies was a genome-wide association study of hypertension in Japanese subjects conducted in 188 male hypertension cases and 1504 controls with replication analyses conducted in 1371 cases and 2158 controls [45]. Kato et al. reported that no SNPs were genome-wide significantly associated with hypertension. However, SNP rs3755351 in ADD2 was their most strongly associated

SNP after primary and replication analyses. Other studies have reported associations between SNPs in ADD2 and SBP in untreated hypertensive subjects [46]. The authors reported mostly null results, which may be due to the very small sample size of the primary analysis, as the study was underpowered to detect the small effect sizes that are expected with common genetic variants. It is also unclear why the cases in the primary analysis were restricted to males while the controls and the replication samples were not, and this may have affected the analysis as there are differences in blood pressure heritability and trends by sex [7, 12].

Hiura et al. also conducted a small study of blood pressure phenotypes and 6 other traits in Japanese subjects [47]. In their primary analysis, they studied 936 subjects who did not receive medication for hypertension and 6123 subjects in the replication analysis.

The authors found that SNP rs1652080 (CCBE1) on chromosome 8 was genome-wide significantly associated with SBP in the primary analysis, and it was significantly

associated with SBP, DBP and hypertension in the replication analysis. Although the authors reported an interesting finding for hypertension, their study also suffered from a small sample size in the primary analysis and cohort differences between the primary and replication samples. The primary analysis used subjects who were not receiving antihypertensive medication, but up to 29% of the subjects in the replication analysis were being treated with antihypertensive medication. Therefore, it is possible that the genetic variants that were statistically significant in the primary analysis were associated with untreated hypertension, but the statistically significant variants in the replication analysis may have been associated with gene effects, drug effects or both.

In a much larger study, Cho et al. conducted a genome-wide association study of

SBP and DBP, and six other quantitative traits, in 8842 Korean subjects from the Korea

Association Resource and a replication study in 7861 Korean subjects from the Health2 cohort, and both of these cohorts were part of the Korean Genome Epidemiology Study

(KoGES) [48]. From the discovery and replication analyses, the authors reported that their strongest association was for SBP and DBP and SNP rs17249754, which is near

ATP2B1, but this result was not genome-wide significant. The CHARGE consortium also reported genome-wide significant associations between all three blood pressure phenotypes and ATP2B1, so it is possible that this may be a true finding. Recently, Hong et al. also conducted a medium-sized GWAS of SBP and DBP in 7551 Korean subjects in the primary analysis and 3703 Korean subjects in the replication analysis [49]. The authors reported that no SNPs were genome-wide significantly associated with SBP or

DBP, but the associations between SNP rs11638762 (AKAP13) on chromosome 15q24-25 and SBP and DBP were the strongest associations in the study. Functional studies have shown that AKAP13 is associated with cardiac development in mouse models [50].

In a large meta-analysis of treated SBP and DBP GWAS, Kato et al. studied

19,608 East Asians in the AGEN-BP (Asian Genetic Epidemiology Network-Blood Pressure) consortium [51]. After the GWAS, the authors genotyped the top results in

10,518 additional Asian subjects and conducted an additional replication analysis in an independent sample of 20,247 East Asians. The cohorts included Japanese, Han Chinese,

Korean and Malaysian subjects. The authors reported novel genome-wide significant associations with blood pressure for ST7L-CAPZA1 and ENPEP for DBP, as well as

FIGN-GRB14 and NPR3 for SBP. They found genome-wide significant associations that are specific to East Asians for ALDH2 and SBP and DBP, as well as for rs35444 (TBX3)

and DBP. Kato et al. also replicated previous genome-wide significant associations between DBP and CASZ1, as well as between FGF5, CYP17A1 and ATP2B1 with SBP and DBP. The authors’ exciting findings were likely due to the large sample sizes in this study, as they were able to detect small effect sizes associated with common genetic variants; furthermore, the replication of the findings in different ethnic cohorts suggests that these results may be true effects.

These genome-wide association studies have made numerous contributions to the genetic epidemiology of hypertension across different populations. From the initial large

GWAS in subjects of European ancestry, it has been noted that many genetic variants of varying effect sizes are likely associated with blood pressure phenotypes and that these modest effects are best observed with large sample sizes. The GWAS conducted in

African-American cohorts and Asian cohorts have indicated that some of the variants

associated with blood pressure may be ethnicity-specific, while others are present across

multiple race/ethnic groups. It is possible that some of the limited significant findings in

the non-European cohorts may be due to the inability of tagging SNPs to identify genetic

variants that are important in subjects of non-European ancestry. Furthermore, the

analysis of the African-American cohorts has emphasized the importance of searching for

genetic variants for hypertension in high-risk cohorts.

Lessons from GWAS

Genome-wide association studies function as large-scale epidemiological studies

and have the potential to uncover numerous variants across the genome that are

associated with common diseases in an efficient manner. In addition, they play an

essential role in the discovery of novel variants towards the overall goals of

understanding the genetic architecture of diseases and applying this knowledge for

improved prevention and treatment of common diseases.

With the publication of numerous GWAS, there have been several key lessons

about the importance of genome-wide association studies in understanding the genetics of

hypertension and other common diseases. For instance, GWAS have reasonable power to

identify common variants underlying common diseases when the study samples are

homogeneous within cohorts and have sufficient sample sizes, given the expected small

effect sizes of the genetic variants, as seen with the CHARGE and Global BPgen GWAS

[27, 28]. In addition, replication studies are always necessary to refute or confirm any association evidence from GWAS because the multiple tests can lead to type I errors that are not replicated in independent studies, as many of the hypertension studies have shown. It is also of interest to conduct replication analyses in different populations to

take advantage of variability in LD block size for improved mapping of causal variants and

to determine the causal variants’ effects in populations with differing disease risks, as

demonstrated in the blood pressure GWAS conducted in African-American and Asian

cohorts.

On the other hand, some aspects of this study design suggest that it is necessary to reevaluate the role of GWAS in the study of the genetics of common diseases across human populations. For example, a number of the GWAS for hypertension reviewed in this dissertation found no genetic variants that were genome-wide significant for association, but their authors reported that some of the top hits were clinically or biologically important. It is possible that the statistical tests used to determine genome-wide significance miss results that are biologically or clinically relevant because of stringent corrections for multiple comparisons. In addition, biological and clinical data, such as blood pressure data, may not meet the normality assumptions and other assumptions of the statistical methods that are used. Furthermore, GWAS are primarily intended as a tool for discovery, rather than confirmation [32]. Instead of strictly focusing on genome-wide significance, it may be more important to highlight the top hits from previous GWAS for follow-up studies and future GWAS and to increase focus on the “robust replication” of GWAS findings [52].

Along these same lines is the important caveat of genome-wide association studies that statistically-significant SNPs are unlikely to be the causal variants for the disease of interest [53]. This is because GWAS are designed to identify SNPs that are in high linkage disequilibrium with the functional variant, rather than the functional variant itself. When there are well-described functional variants in linkage disequilibrium with

the significant genetic variants, the biological interpretations are clear; however, when

the significant variants are located in a gene desert, as with rs4977950 and SBP, or not in

any known genes with well-annotated function, the clinical and biological importance of

the finding, if any, is challenging to interpret [37]. In addition, top findings from GWAS often fail to replicate in independent studies from different investigators, especially the top findings from studies with small discovery sample sizes. This may indicate that experimental errors contribute greatly to regression to the mean for results from small GWAS. These experimental errors include failing to account for ascertainment, genotyping errors, phenotypic measurement errors and environmental confounding. Thus, independent replication studies are extremely important.

Although initial GWAS have queried the genome for common variants associated with common diseases and they were not based on pre-existing hypotheses, it is time to use all sources of available information to guide the next phase of GWAS analysis [18,

54]. Moreover, conducting GWAS with a focus on known and putative biological pathways, systems and functions may yield more clinically-relevant results in future studies.

Moving forward from initial GWAS

Capturing a larger proportion of the missing genetic variability of blood pressure is one of the key challenges in the genetics of hypertension and other common, complex diseases, but the best approaches to achieve this goal are less clear. Although there are a number of limitations associated with GWAS, it is not time to abandon GWAS. Instead, it is important to explore new methods for discovering genetic variants for common

diseases and to reevaluate the results of previous GWAS with increased emphasis on

biological and clinical plausibility for new methods of analysis with these study designs.

A number of new analysis methods and modifications of well-established

methods of analysis are paving the way for exciting discoveries in the understanding of

the genetic epidemiology of common, complex diseases. These modifications include

using candidate gene studies to replicate top findings that were not genome-wide

significant in GWAS in diverse populations, conducting candidate gene studies in regions

that are associated with drug response for related phenotypes, analyzing rare variants

with candidate gene studies, and admixture mapping analysis.

After the Wellcome Trust Case Control Consortium (WTCCC) genome-wide association study of seven common diseases was published [33], Ehret et al. conducted a candidate gene replication analysis of its top six hits for hypertension in African-American, Hispanic-American and European-American samples [55]. They performed their analysis in 11,433 subjects from 4449 families in three networks of the Family Blood Pressure Project.

Unlike the WTCCC study, the subjects were of different ancestries: 39% were African-Americans, 39% were European-Americans and 22% were Hispanic-Americans. The authors reported that the G allele of SNP rs1937506 was significantly associated with decreased SBP (-24.9 mm Hg) and DBP (-8.7 mm Hg) in European-Americans, but the same allele was associated with increased SBP (+27.7 mm Hg) in Hispanic-Americans and was not associated with blood pressure in African-Americans. According to dbSNP, the frequency of the G allele was 73.9% in the HapMap CEU (Utah residents

with Northern and Western European ancestry from the CEPH) population and 85.7% in the HapMap YRI (Yoruba in Ibadan, Nigeria) population. Although the study was similar to other candidate gene studies, this work was important as it highlighted the heterogeneity of complex diseases, such as hypertension, both in the effects of a risk SNP and in the directions of effect of a risk allele. However, the large estimated effect sizes for the G allele were inconsistent with the published literature, in which small effect sizes are expected for complex traits. Despite the uncertainty about the validity of the study’s results, the methods of analysis implemented are interesting modifications of the candidate gene method.

Recently, Johnson et al. also conducted an interesting modification of candidate gene studies for blood pressure by examining regions of the genome that encode proteins targeted by anti-hypertensive drugs [56]. Their primary analysis involved 29,136 subjects of European ancestry from the CHARGE consortium and their replication analysis involved 34,433 subjects of European ancestry from the Global BPgen consortium and 23,019 subjects of European ancestry in the Women’s Genome Health Study. After the primary and replication analyses, Johnson et al. reported that rs1801253 in ADRB1 and rs2004776 in AGT were significantly associated with hypertension and treatment-adjusted SBP and DBP. SNP rs4305 in ACE was also significantly associated with hypertension. The authors reported that their results were consistent with other studies showing that genetic variation in the β-adrenergic receptor gene ADRB1 and the renin-angiotensin genes AGT and ACE is associated with variation in blood pressure. These variants were not genome-wide significant in the CHARGE and Global BPgen GWAS; their detection in this study is most likely due to Johnson et al.’s modified candidate gene analysis. By using

clinical and pharmacological information to identify genes in pathways and mechanisms

that are targeted by anti-hypertensive medication, the authors were able to reduce the

number of tests they needed to perform to determine statistical significance.

Furthermore, by targeting genes that were in pathways known to be clinically-important

for anti-hypertensive medication, they were more likely to find variants that were

associated with the disease.
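
The reduction in the multiple-testing burden can be made concrete with a Bonferroni-style calculation. This is an illustrative sketch only: the number of candidate SNPs (30), the assumption of roughly one million independent tests for a GWAS and the example p-value are hypothetical, and Johnson et al. may have used a different correction.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


ALPHA = 0.05

# Hypothetical numbers of tests, chosen only for illustration.
gwas_threshold = bonferroni_threshold(ALPHA, 1_000_000)   # about 5e-8
candidate_threshold = bonferroni_threshold(ALPHA, 30)     # about 1.7e-3

# A hypothetical SNP with a modest p-value.
example_p = 1e-5
significant_in_gwas = example_p < gwas_threshold
significant_in_candidate_study = example_p < candidate_threshold
```

Under these assumptions the same p-value fails the genome-wide threshold but passes the candidate gene threshold, which is one way a variant can be missed by a GWAS and detected in a targeted analysis.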

Candidate gene studies can also be used to conduct association analyses for rare

variants. One of the main challenges in studying rare variants is determining the most

appropriate method to treat the variants in the statistical analysis. As traditional single

SNP association analysis is difficult to conduct with rare variants due to small cell sizes,

many approaches involve collapsing rare variants in some manner before applying

statistical analyses [57]. For example, the cohort allelic sums test (CAST) compares the

frequency of subjects with at least one rare variant between cases and controls; the

combined multivariate and collapsing (CMC) method modifies the CAST method by

collapsing the rare variants as in CAST but then compares the distributions of rare

variants between cases and controls [58, 59]. Madsen and Browning proposed a different method that uses the frequency of each variant as a weight for each of the collapsed rare

variant groups; and Feng et al. presented an odds ratio weighted sum statistic that uses the

odds ratios from an independent case-control analysis as weights [60, 61]. In addition,

rare variants may be combined based on biological considerations; for instance, the

variants may be collapsed into their respective genes.
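
The collapsing idea behind CAST can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the published procedure: each subject is reduced to a carrier indicator (at least one rare allele at any site in the region), and carrier frequencies in cases and controls are compared with a two-sided Fisher's exact test, which is a choice made for this sketch. The function names and the toy genotypes are hypothetical.

```python
from math import comb


def carrier_flags(genotypes):
    """Collapse each subject to 1 if any site carries a rare allele, else 0.

    `genotypes` is a list of subjects; each subject is a list of rare-allele
    counts (0, 1 or 2), one per variant site in the region.
    """
    return [1 if any(count > 0 for count in subject) else 0
            for subject in genotypes]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers among the cases.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(prob(x) for x in range(low, high + 1)
               if prob(x) <= observed * (1 + 1e-9))


def collapsed_carrier_test(case_genotypes, control_genotypes):
    """Return the carrier/non-carrier table and its Fisher's exact p-value."""
    case_flags = carrier_flags(case_genotypes)
    control_flags = carrier_flags(control_genotypes)
    a = sum(case_flags)
    b = len(case_flags) - a
    c = sum(control_flags)
    d = len(control_flags) - c
    return (a, b, c, d), fisher_exact_two_sided(a, b, c, d)


# Toy data: 6 cases and 6 controls genotyped at 3 rare variant sites.
cases = [[1, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 1], [1, 1, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
table, p_value = collapsed_carrier_test(cases, controls)
```

Collapsing trades information about individual variants for larger cell counts, which is why single-SNP tests that would be unstable for each rare variant become feasible at the region level.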

These modifications on traditional candidate gene methods improve existing analysis methods by increasing the possibility of finding interesting, significant results.

On the other hand, admixture mapping analysis presents a novel method of analysis that

identifies regions that may contain causal variants for the disease.

Admixture mapping analysis

Admixture mapping analysis, also known as mapping by admixture linkage disequilibrium (MALD), is a newer analysis method for genetic epidemiology studies of complex diseases in unrelated subjects [62]. It is different from other association analysis methods because it capitalizes on the architecture of the genome and disparities in the frequency of disease risk variants to map regions of the genome that may contain causal variants. In fact, admixture mapping analysis can be used to find variants that are missed by GWAS, including variants that are rare in the general population. Rare variants and others may be missed by GWAS because the tagging SNPs on genotyping arrays may not be designed to identify them; however, admixture mapping analysis does not rely on tagging techniques to capture variants that are associated with disease.

In general, admixture analysis is based on the hypothesis that the frequency of a risk allele, which is associated with the disease of interest, varies between ancestral populations. Consequently, it is assumed that in a recently admixed population, cases have increased ancestry for the ancestral population that has a greater frequency of the risk allele at the causal locus, compared to controls. This local excess ancestry at the causal variant is also expected to be greater than the genome-wide ancestry.

Hypertension is well-suited for admixture mapping analysis because one of the highest-risk groups for the disease, African-Americans, is a recently admixed population

[4, 6]. In addition, the blood pressure GWAS and subsequent candidate gene studies conducted in African-Americans suggest that there are population-specific genetic

variants and risk alleles between African-Americans and European-Americans [41, 43,

44, 55]. In a study of individual ancestry estimates and blood pressure in two admixed

groups, Tang et al. also reported that there was a slight increase in African ancestry in

hypertensive subjects compared to normotensive subjects [63].

Admixture mapping methods

First, in admixture mapping analysis, genome-wide ancestry estimates and

estimates of local ancestry for ancestry blocks along the chromosomes are calculated

using available ancestry informative markers, such as those developed by Smith et al.

[62, 64]. The length (cM) of the ancestry blocks varies based on the history and age of the admixed populations, such that the block length shortens with increasing age of the population. The ancestral populations for most African-Americans in the United States are African and European, but the background LD decay of African-Americans is quite similar to that of African populations.

Next, the local ancestry values are estimated along the genome for cases and controls. The local ancestry values for cases are compared to their average genome-wide ancestry estimates to determine if there are any regions that have greater-than-expected ancestry from the increased-risk population. The local ancestry estimates are also compared between cases and controls to determine which chromosomal loci are associated with excess ancestry in the cases, compared to the controls. The regions that contain greater ancestry than would be expected genome-wide and in cases compared to controls are anticipated to contain genetic variants that are associated with the disease of interest [62]. Often, admixture mapping studies are followed by association analyses to

determine if particular genetic variants in the regions identified by the admixture mapping are associated with the disease.
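
The two comparisons described above can be sketched as simple Z statistics at a single locus. This is a generic illustration and not the exact statistic of any study reviewed in this chapter: it assumes that a local ancestry estimate and a genome-wide ancestry estimate (both as proportions of ancestry from the higher-risk population) are already available for each subject, and the function names and toy values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def excess_ancestry(local, genomewide):
    """Per-subject excess of local ancestry over genome-wide ancestry."""
    return [l - g for l, g in zip(local, genomewide)]


def case_only_z(case_excess):
    """Z statistic testing whether mean excess ancestry in cases is zero."""
    return mean(case_excess) / (stdev(case_excess) / sqrt(len(case_excess)))


def case_control_z(case_excess, control_excess):
    """Z statistic for the case-control difference in mean excess ancestry."""
    se = sqrt(stdev(case_excess) ** 2 / len(case_excess)
              + stdev(control_excess) ** 2 / len(control_excess))
    return (mean(case_excess) - mean(control_excess)) / se


# Toy values at one locus (hypothetical).
case_local = [0.90, 0.85, 0.95, 0.80]
case_global = [0.80, 0.80, 0.80, 0.80]
control_local = [0.82, 0.78, 0.81, 0.79]
control_global = [0.80, 0.80, 0.80, 0.80]

case_excess = excess_ancestry(case_local, case_global)
control_excess = excess_ancestry(control_local, control_global)
z_case_only = case_only_z(case_excess)
z_case_control = case_control_z(case_excess, control_excess)
```

In a genome scan these statistics would be computed at every marker or ancestry block, and regions where both are large would be carried forward to association analysis.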

Admixture mapping studies in hypertension

One of the first admixture mapping studies for hypertension was conducted using genome-wide microsatellites in 1340 hypertensive African-Americans and 737 cases and

573 controls, all from the Family Blood Pressure Project (Table 2) [65]. Zhu et al. calculated excess African ancestry for each marker using ancestry probabilities obtained from the software STRUCTURE for each subject [66]. Then, they constructed a Z score for each marker. Finally, they noted the regions that had high Z scores among the cases only and the regions that had the largest difference in Z scores between cases and controls. Zhu et al. reported that local African ancestry on chromosomes 2, 3, 6 and 21 was significantly associated with hypertension.

After this study, Deo et al. conducted an admixture mapping analysis of hypertension in African-American hypertensive cases from the Multiethnic Cohort and the Genomic Collaborative study and 387 African-American controls from the

Multiethnic Cohort [67]. The authors conducted their admixture mapping analysis using

ANCESTRYMAP software, which is based on a Bayesian approach. They found no loci that were significantly associated with hypertension at the genome-wide level. In their candidate gene association analysis, Deo et al. reported that one of the loci previously reported to be associated with hypertension by Zhu et al., 6q21.3, was associated with hypertension in their study, although nominally [65].

Subsequently, Zhu et al. conducted an admixture mapping analysis of hypertension using SNPs in 1743 African-Americans, 581 Mexican-Americans and 1000

European-Americans from the Dallas Heart Study [68]. In this study, the authors estimated local ancestry for the markers using the software ADMIXPROGRAM, which estimates local ancestry using a hidden Markov model with the allele frequencies being updated via the Expectation-Maximization algorithm for the continuous gene flow model

[69]. From these estimates, Zhu et al. constructed and compared Z scores for excess ancestry for cases only and for cases and controls. They replicated their 2005 study as they found that regions on chromosome 6 (129-166 cM) and chromosome 21 (0-30 cM) were significantly associated with African ancestry. In their association analysis, they reported that SNP rs2272996 (VNN1) was significantly associated with hypertension in

African-Americans and Mexican-Americans.

Recently, Zhu et al. also conducted a large admixture mapping study of treatment-adjusted SBP and DBP in 6303 African-Americans in the Candidate Gene Association

Resource consortium [70]. They conducted their replication analysis in 11,882 subjects in four African-American cohorts (Maywood, Howard University Family Study, GENOA of the Family Based Blood Pressure Project, and the Women’s Health Initiative) and one

Nigerian cohort. For this study, the authors also used the ADMIXPROGRAM software to estimate local ancestry. However, to map loci of excess African ancestry that were associated with hypertension, they constructed a linear regression model that accounted for key potential confounders for hypertension and considered p-values < 0.001 as suggestive association evidence in admixture mapping. The authors reported that 5 regions showed association with blood pressure and African ancestry in the admixture analysis: 1q41-42 and 21q21 for SBP, 5p13-11 and 17q11 for DBP, and 2q21-24 for SBP and DBP with DBP restricted to 2q22-24. In their association analysis, 2q21-24 and

21q21 were associated with SBP and 5p13-11 was associated with DBP. Further analysis

showed that 4 SNPs in 2q21-24 and 1 SNP in 21q21 explained the admixture evidence for SBP, and 1 SNP in 5p13-11 explained the admixture evidence for DBP. In the meta-analysis and the replication analysis, SNP rs7726475 in 5p13-11 was significantly associated with SBP and DBP.
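
A regression-based admixture scan of this kind can be written in a generic form. The model below is a sketch under stated assumptions and not necessarily the exact model fitted by Zhu et al.: the inclusion of genome-wide ancestry as a covariate and the covariate vector are assumptions made for illustration. Here $y_i$ is the treatment-adjusted blood pressure of subject $i$, $A_i(\ell)$ is the estimated local African ancestry at locus $\ell$, $\bar{A}_i$ is the genome-wide African ancestry, and $\mathbf{x}_i$ collects potential confounders.

```latex
\[
  y_i \;=\; \beta_0 \;+\; \beta_1\, A_i(\ell) \;+\; \beta_2\, \bar{A}_i
        \;+\; \boldsymbol{\gamma}^{\top}\mathbf{x}_i \;+\; \varepsilon_i ,
  \qquad
  H_0:\ \beta_1 = 0 .
\]
```

Under this formulation, a locus $\ell$ would be flagged as suggestive when the p-value for the test of $\beta_1 = 0$ falls below 0.001, the threshold described above.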

Although there have been only a few admixture mapping studies conducted for blood pressure phenotypes, it is evident that this method has identified genetic variants that were missed by GWAS. By using the structure of the genome and variability in the

frequency of alleles at risk variants, admixture mapping analysis is an efficient way to

narrow regions that harbor disease variants with substantial allele frequency differences

in ancestral populations. Compared to association studies, admixture mapping studies

have a number of advantages that make them an attractive option [71, 72]. For instance,

admixture studies are less sensitive to the genetic heterogeneity of diseases compared to

association studies, which is an important consideration in studying complex diseases.

Admixture mapping studies also require fewer SNPs than association studies, although association studies are much more effective in identifying specific risk variants. In addition, admixture mapping depends on the assumptions that risk allele frequencies vary between ancestral populations and that the local ancestry along chromosomes can be accurately estimated using marker maps and the available software. Despite these limitations, this

method has the potential to make substantial contributions to the genetics of hypertension

and other complex diseases.

Discussion

The complexity of blood pressure and hypertension is underscored by the

difficulties presented in exploring the genetic epidemiology of these phenotypes. While

past studies have discovered a number of genetic variants that are associated with

hypertension, the considerable missing heritability emphasizes that there are still many

challenges that need to be met. By improving existing analysis methods and developing new techniques, these issues can be addressed. Furthermore, by applying these methods,

we can demonstrate their strengths and understand how their findings contribute towards

an improved biologically- and clinically-relevant understanding of complex diseases.

Table 1. Genome-wide Association Studies of Hypertension

Analysis Method: GWAS
Blood Pressure Phenotypes: SBP and DBP
Populations: 2931 subjects (1464 type II diabetes cases and 1467 controls) from Finland and Sweden in the primary analysis; 10,850 additional subjects (type II diabetes cases and controls) from Sweden, Poland and the United States, 13,965 subjects from the WTCCC/UKT2D study and 4808 subjects from the FUSION study in the replication analysis
Statistical Methods: Linear regression using PLINK software for the primary analysis and haplotype association analysis
Results: No SNPs were genome-wide significantly associated with SBP or DBP
Citation: Saxena, et al. (2007)

Analysis Method: GWAS
Blood Pressure Phenotypes: Hypertension
Populations: 2000 cases from the BRIGHT study and 3000 controls from the 1958 Birth Cohort in the United Kingdom and the UK Blood Services, all white Europeans
Statistical Methods: Logistic regression using R software
Results: No SNPs were genome-wide significantly associated with hypertension, but rs2820037 on 1q43 (close to genes RYR2, CHRM3 and ZP4) was most strongly associated with hypertension
Citation: WTCCC, et al. (2007)

Analysis Method: GWAS
Blood Pressure Phenotypes: Hypertension
Populations: 188 male subjects with hypertension and 1504 controls in 2 groups in the primary analysis, all of Japanese ancestry; 752 cases and 752 controls in the first replication study and 619 cases and 1406 controls in the second replication study, all of Japanese ancestry
Statistical Methods: The chi-square test and logistic regression with permutation tests to determine statistical significance
Results: No SNPs were genome-wide significantly associated with hypertension, but SNP rs3755351 in ADD2 was the most strongly associated SNP after the primary analysis and the two replication analyses
Citation: Kato, et al. (2008)

Analysis Method: GWAS
Blood Pressure Phenotypes: Hypertension and raw and treatment-adjusted SBP and DBP
Populations: 1017 African-Americans from the Howard University Family Study and 980 unrelated West Africans in a replication study
Statistical Methods: Logistic and linear regression using PLINK software and meta analysis using an inverse-variance weighting method with METAL software
Results: SNPs rs5743185 (PMS1), rs3751664 (CACNA1H), rs11160059 (SLC24A4), rs17365948 (YWHAZ), rs12279202 (IPO7) and rs1687730 (pseudogene AL365365.23) were genome-wide significantly associated with SBP in normotensive subjects in the primary analysis and rs1160059 (SLC24A4) replicated in the West African sample; SNPs in STK39 and CDH13 replicated in the HUFS sample; SNPs in SLC24A4, IPO7 and PMS1 replicated from the Diabetes Genetics Initiative study
Citation: Adeyemo, et al. (2009)

Analysis Method: GWAS
Blood Pressure Phenotypes: SBP and DBP
Populations: 8842 Korean subjects from the Korea Association Resource (KARE) project and an independent sample of 7861 Korean subjects from the Health2 cohort for the replication study
Statistical Methods: Linear regression using PLINK software or SAS software
Results: SNP rs17249754 (near ATP2B1) with SBP and DBP was not genome-wide significant, but it was the strongest association of the blood pressure traits from the primary and replication analyses
Citation: Cho, et al. (2009)

Analysis Method: GWAS
Blood Pressure Phenotypes: Hypertension and treatment-adjusted SBP and DBP
Populations: 29,136 subjects of European-ancestry in 6 cohorts of the CHARGE consortium in the primary analysis and 34,433 subjects of European-ancestry in the Global BPgen consortium in the replication analysis
Statistical Methods: Logistic and linear regression; meta-analysis using an inverse-variance weighting method
Results: After primary and replication analysis, 1 SNP was genome-wide significantly associated with hypertension (ATP2B1), 4 with SBP (CYP17A1, PLEKHA7, ATP2B1 and SH2B3) and 6 with DBP (ULK4, CACNB2, ATP2B1, SH2B3, TBX3-TBX5 and CSK-ULK3)
Citation: Levy, et al. (2009)

Analysis Method: GWAS
Blood Pressure Phenotypes: Treatment-adjusted SBP and DBP
Populations: 34,433 subjects of European-ancestry in 17 cohorts of the Global BPgen consortium in the primary analysis; 71,225 subjects of European ancestry and 12,889 subjects of Indian Asian ancestry in the first replication analysis; 29,136 subjects of European-ancestry in the CHARGE consortium in the second replication analysis
Statistical Methods: Linear regression; meta-analysis using an inverse-variance weighting method
Results: After primary and replication analyses, SNPs rs17367504 (MTHFR), rs11191548 (CNNM2/NT5C2) and rs12946454 (PLCD3) were associated with SBP and SNPs rs16998073 (FGF5), rs1530440 (c10orf107), rs653178 (ATXN2), rs1378942 (CSK) and rs1694808 (ZNF652) were associated with DBP and these results were genome-wide significant
Citation: Newton-Cheh, et al. (2009)

Analysis Method: GWAS
Blood Pressure Phenotypes: Hypertension and unadjusted SBP and DBP
Populations: 1644 Germans in the KORA S3 cohort in the primary analysis (1017 untreated); 1830 Germans in the KORA S4 cohort, 1832 Estonians in the HYPEST cohort and 4370 British from the BRIGHT family-based cohort in the replication analysis
Statistical Methods: Linear and logistic regression using PLINK software; meta-analysis with inverse-variance weighting method using the KORA S3, KORA S4 and HYPEST cohorts using R software
Results: No SNPs were genome-wide significantly associated with hypertension, SBP or DBP. SNP rs11646213 on 16q23.3, which is upstream of CDH13, was one of the strongest associations and this SNP was associated with hypertension, SBP and DBP in the meta-analysis
Citation: Org, et al. (2009)

44 GWAS and Treatment-adjusted SBP 5654 subjects enrolled Linear regression No SNPs were genome- Sabatti, et al.

genome-wide and DBP in the NFBC 1966 using PLINK wide significantly (2009) haplotype (Northern Finland software for the associated with SBP or association Birth Cohort 1966) primary analysis DBP and weighted haplotype association for the genomewide haplotye association analysis

GWAS Treatment-adjusted SBP 542 Amish subjects in T-test for the SNP rs4977950 was Wang, et al. and DBP the AFDS in the primary analysis genome-wide significantly (2009) primary analysis; 557 and meta- associated with SBP, but it Amish subjects from analysis using is located in a gene desert. the AFDS, 790 Amish METAL A group of SNPs on subjects from the software 2q24.3, near STK39, were HAPI study, 1345 not genome-wide subjects from the significant but contained FHS, 3082 subjects many of the next strongest from the DGI, 575 signals for SBP and they Hutterite subjects and were also associated with 802 GenNet subjects DBP in the replication

45 analysis, all of

European ancestry GWAS and Young-onset Overall, 1008 young- Logistic No single SNPs reached Yang, et al. multilocus hypertension onset hypertension regression using genome-wide significance (2009) analysis cases and 1008 SAS software for for association with young- (genome- controls, all Han GWAS; genome- onset hypertension in the wide Chinese in Taiwan; wide haplotype primary or replication haplotype 175 cases and 175 association tests studies; the SNP group association) controls in the with trend rs9308945-rs6711736- primary analysis and regression using rs6729869-rs10495809 on 833 cases and 833 HelixTree 2p22.3 was associated with controls in the software (Golden young-onset hypertension replication analysis Helix, Inc. in the haplotype analysis Bozeman, MT)

GWAS SBP and DBP and 936 subjects from the One-way SNP rs1652080 (CCBE1 on Hiura, et al. unadjusted and treatment- Japanese Suita Study analysis of chromosome 8) was (2010) adjusted hypertension in the primary variance and associated with SBP and analysis; replication logistic this result achieved analysis in 2895 regression using genome-wide significance Japanese subjects in JMP software in the primary analysis and the Nomura Study (SAS Institute, was significantly associated and a subset of the Cary, NC) with SBP, DBP and 3228 Japanese hypertension in the subjects in the Suita replication analysis Study GWAS Hypertension and 1621 extreme HTN Logistic The minor allele (G) in Padmanabhan, treatment-adjusted SBP cases and 1699 HTN regression and SNP rs13333226 in UMOD et al. (2010)

46 and DBP controls, of European meta analysis was genome-wide

ancestry; replication using an inverse- significantly associated in 19,845 extreme variance with reduced risk of HTN cases and weighting hypertension 16,541 controls, of method European ancestry

GWAS Treatment-adjusted SBP 8591 African- Linear regression In the association analysis, Fox, et al. and DBP Americans in the using PLINK a rs10474346 (near GPR98 (2011) CARe consortium and linear mixed and ARRDC3) and replication in 4 effects model for rs2258119 (near C21orf91) African-American 1 cohort to adjust attained genome-wide cohorts (N = 11,882 for family significance and the top hits in Maywood, Howard structure; meta- did not replicate in University Family analysis independent datasets. Of Study, GENOA conducted with the top hits from the network of the Family the inverse- European-ancestry Global Blood Pressure variance BPGen and CHARGE Project, Women's weighting GWAS, 3 SNPs in SH2B3, Health Initiative and method in TBX2-TBX5 and CSK-

47 the International METAL ULK3 replicated In the

Collaborative Study African-American CARe on Hypertension in subjects. Blacks) and an European-American cohort (N = 69,899 in the International Consortium for Blood Pressure Genome- wide Association Studies)

GWAS SBP and DBP 7551 Koreans and Linear regression No SNPs were genome- Hong, et al. 3703 Koreans in a using PLINK wide significantly (2011) replication study software and associated with SBP or meta-analysis DBP. SNP rs11638762 using R software (AKAP13, 15q24-25) was associated with DBP in the primary and replication analyses Meta-analysis Treatment-adjusted SBP 19,608 East Asians in Inverse-variance Genome-wide significant Kato, et al. of GWAS and DBP the AGEN-BP weighting meta associations with blood (2011) consortium (Asian analysis using pressure for ST7L- Genetic METAL CAPZA1 and ENPEP for Epidemiology software DBP, as well as FIGN-

48 Network- Blood GRB14 and NPR3 for SBP;

Pressure) from 8 genome-wide significant GWAS. Followed associations that are with de novo specific to East Asians for genotyping of top ALDH2 and SBP and DBP, results in 10,518 as well as for rs35444 additional subjects. (TBX3) and DBP; Further replication in replicated previous an independent genome-wide significant sample (N = 20,247 associations between DBP of East Asian and CASZ1,and between ancestry) FGF5, CYP17A1 and ATP2B1 with SBP and DBP

GWAS Hypertension 8090 African- Logistic Imputed SNP rs7801190 Lettre, et al. American subjects regression using (SLCI2A9) was genome- (2011) from 5 cohorts in the PLINK software wide significantly CARe project in the or R software associated with primary study and hypertension in the primary 8849 African- analysis, but this SNP was American and not statistically significant African-Caribbean in the replication study subjects from 4 population-based cohorts in the replication Meta-analysis Hypertension and 203,056 subjects of Inverse-variance 16 novel associations with Ehret, et al.

49 of GWAS treatment-adjusted SBP European ancestry weighting meta blood pressure were (2011)

and DBP from 65 cohorts in the analysis reported in: MOV10 (SBP, primary study and DBP), SLC4A7 (DBP), 73,471 subjects of MECOM (SBP, DBP), East Asian (29,719), SLC39A8 (SBP, DBP), South Asian (23,977) GUCY1A3-GUCY1B3 and African ancestries (DBP), NPR3-C5orf23 (19,775) in the (SBP, DBP, HTN), EBF1 replication (SBP, DBP), HFE (SBP, DBP, HTN), BAT2-BAT5 (SBP, DBP, HTN), PLCE1 (SBP, HTN), ADM (SBP), FLJ32810-TMEM133 (SBP, DBP, HTN), FURIN- FES (SBP, DBP), GOSR2 (SBP), JAG1 (SBP, DBP), and GNAS-EDN3 (SBP, DBP, HTN). 12/13 SNPs

from the CHARGE and Global BPgen GWAS were also replicated.

Meta-analysis Hypertension and 29,378 African- Inverse-variance 3 novel associations with Franceschini, of GWAS treatment-adjusted SBP Americans in 19 weighting meta blood pressure reported in: et al. (2013)

50 and DBP cohorts in the analysis EVX1-HOXA (SBP, DBP),

discovery analysis RSPO3 (SBP, DBP) and and 99,382 subjects of PLEKHG1 (SBP, DBP); 1 African-American independent association in (10,386), European a known BP gene SOX6 (69,395), and East (DBP); and 1 known Asian (19,601) association in ULK4 ancestries. (DBP).

Table 2. Other Analysis Methods in the Genetic Analysis of Hypertension in Unrelated Subjects

| Analysis Method | Blood Pressure Phenotypes | Populations | Statistical Methods | Results | Citation |
| --- | --- | --- | --- | --- | --- |
| Admixture mapping and association analysis using microsatellite markers | Hypertension | 1340 hypertensive African-Americans and 737 cases and 573 controls, all from the FBPP | Admixture analysis by comparing genome-wide Z-scores of African ancestry for cases only with Z-scores for cases and controls; association analysis using logistic regression between markers that were in regions that were significantly associated with hypertension in the admixture analysis | Microsatellite markers on chromosomes 2, 3, 6 and 21 were significantly associated with ancestry. The markers on 6q and 21q were most significantly associated with hypertension and ancestry. | Zhu, et al. (2005) |
| Admixture mapping and association analysis | Hypertension | 1743 African-Americans, 1000 European-Americans and 581 Mexican-Americans from the Dallas Heart Study | Admixture analysis by comparing genome-wide Z-scores of African ancestry for cases only with Z-scores for cases and controls; association analysis using logistic regression between SNPs that were in regions that were significantly associated with hypertension in the admixture analysis | In the admixture analysis, findings on chromosome 6 (129-166 cM) and chromosome 21 (0-30 cM) were significantly associated with ancestry, which replicated previous findings (Zhu 2005). 51 SNPs on chromosome 6 were analyzed in association analysis, and rs2272996 (VNN1) was significantly associated with hypertension in African-Americans and Mexican-Americans. | Zhu, et al. (2007) |
| Admixture mapping and association analysis | Hypertension | 1670 hypertensive African-American cases from the Multiethnic Cohort and the Genomic Collaborative study and 387 African-American controls from the Multiethnic Cohort | Admixture analysis using ANCESTRYMAP software; association analysis using logistic regression | No loci were significantly associated with hypertension genome-wide; among candidate genes, the 6q21.3 locus identified by Zhu et al. (2005) was associated with hypertension | Deo, et al. (2007) |
| Candidate gene replication of GWAS | Hypertension and unadjusted SBP and DBP | 11,433 subjects (in 4449 families) in three networks of the Family Blood Pressure Project, subjects were 39% African-American, 39% European-American and 22% Hispanic-American | Regression methods using Merlin software and Lamp software; transmission-disequilibrium tests using FBAT/PBTAT software and QTDT software | In this replication of the top 6 hits for hypertension from the WTCCC study, the G allele of SNP rs1937506 was significantly associated with decreased SBP and DBP in European-Americans, associated with increased SBP in Hispanic-Americans, and not associated with blood pressure in African-Americans. | Ehret, et al. (2008) |
| Candidate gene association of antihypertensive drug targets and blood pressure | Hypertension and treatment-adjusted SBP and DBP | 29,136 subjects of European ancestry in 6 cohort studies of the CHARGE consortium. Replication in the Global BPGen Consortium (N = 34,433 of European ancestry) and the Women's Genome Health Study (N = 23,019 of European ancestry) | Regression models within each cohort, and inverse-variance weighted meta analysis for the overall results | Significant results after replication analysis for rs1801253 in ADRB1 (SBP, DBP and hypertension), rs2004776 in AGT (SBP, DBP and hypertension) and rs4305 in ACE (hypertension) | Johnson, et al. (2011) |
| Admixture mapping and association analysis | Treatment-adjusted SBP and DBP | 6303 unrelated African-Americans in the Candidate Gene Association Resource (CARe) consortium. Replication in 11,882 subjects in 4 African-American cohorts (Maywood, Howard University Family Study, GENOA of the Family Blood Pressure Project and Women's Health Initiative) and 1 Nigerian cohort | Admixture analysis, association analysis using linear regression models and a linear mixed effects model for 1 cohort to adjust for family structure, and meta-analysis using the inverse-variance weighting method in METAL software. | Admixture mapping: 5 regions showed association with blood pressure and African ancestry (p < 0.0009): 1q41-42 (SBP), 2q21-24 (SBP and DBP: 2q22-24), 5p13-11 (DBP), 17q11 (DBP) and 21q21 (SBP). Association analysis showed that 2q21-24 and 21q21 were associated with SBP and 5p13-11 was associated with DBP. 4 SNPs in 2q21-24 and 1 SNP in 21q21 explained the admixture evidence to SBP, and 1 SNP in 5p13-11 explained the admixture evidence to DBP. In the meta-analysis and the replication analysis, rs7726475 (5p13-11) was associated with SBP and DBP (p < 0.0015 in meta-analysis; p < 8 x 10^-8 for replication analysis) | Zhu, et al. (2011) |

Chapter 2. Specific Aims

Overall, the hypothesis of this dissertation was that the missing heritability in the genetics of hypertension can be addressed by utilizing modifications of common genetic epidemiology methods of analysis. This hypothesis was tested by conducting three studies on the genetic epidemiology of blood pressure phenotypes, comparing the results of these studies and describing their collective contribution to hypertension. All analyses were conducted on African-American subjects from the National Heart, Lung and Blood Institute’s Family Blood Pressure Program to facilitate comparisons across the studies.

Study Sample

The studies were conducted on African-American subjects in the National Heart, Lung and Blood Institute’s Family Blood Pressure Program (FBPP), which is a multiethnic and multicenter study on the genetics of hypertension [73]. The history of the FBPP and the dataset are described in detail in the original FBPP paper. Briefly, the FBPP is composed of four independent multicenter studies, GENOA, GenNet, HyperGEN and SAPPHIRe, which are focused on the genetic risk factors that are associated with blood pressure and hypertension. Each of the studies collected epidemiological, clinical and laboratory data on families with high blood pressure.

The Genetic Epidemiology Network of Atherosclerosis (GENOA) study includes African Americans from Mississippi, Mexican Americans from Texas, and non-Hispanic white subjects from Minnesota. The GenNet study includes African Americans from Illinois and non-Hispanic white subjects from Michigan. The Hypertension Genetic Epidemiology Network (HyperGEN) study includes African American and non-Hispanic white subjects from Alabama, North Carolina, Massachusetts, Minnesota and Utah. The Stanford Asian Pacific Program in Hypertension and Insulin Resistance (SAPPHIRe) study includes people of Chinese and Japanese ancestry in Taiwan, Hawaii and California.

Specific Aim 1: Discovery of variants in regions identified in previous admixture mapping analyses of cardiovascular disease traits using gene scores.

Candidate gene association analysis was conducted in regions on chromosomes 5q14-32, 6q24, 8q12-13 and 21q21 that showed admixture evidence for cardiovascular traits, including blood pressure, hypertension, BMI, and high-density lipoprotein cholesterol (HDL-C) in previous admixture mapping studies [65, 68, 74, 75]. The study was conducted in 1,733 unrelated African-American subjects in the GenNet, GENOA and HyperGEN networks from the FBPP. The analyses in this study were performed with PLINK and SAS 9.2 (SAS Institute, Inc., Cary, North Carolina, USA) [22]. Additional information about the SNPs was obtained from the Database of Single Nucleotide Polymorphisms (dbSNP) [76].

In the primary association analysis, multivariable linear and logistic regression methods were employed to determine the genes’ effects on the phenotypes of interest. Gene scores were developed to model the gene effects by counting the number of minor alleles (0, 1, or 2) in each SNP of the gene and then collapsing the SNPs of each gene in an additive manner, allowing for a biologically-relevant interpretation of the variants’ effects on the traits. Replication analyses were conducted on 3,723 African Americans from the Atherosclerosis Risk in Communities (ARIC) and Jackson Heart Study (JHS) cohorts from the NHLBI’s Candidate Genes Association Resource (CARe) study.
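
The gene score described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `gene_scores`, the SNP identifiers and the gene labels are all hypothetical, and genotypes are assumed to be already coded as minor-allele counts.

```python
from typing import Dict, List


def gene_scores(
    genotypes: Dict[str, List[int]],
    gene_to_snps: Dict[str, List[str]],
) -> Dict[str, List[int]]:
    """Additively collapse minor-allele counts (0, 1 or 2) over each gene's SNPs.

    genotypes maps a SNP id to one minor-allele count per subject;
    gene_to_snps maps a gene label to the SNP ids assigned to it.
    """
    scores: Dict[str, List[int]] = {}
    for gene, snps in gene_to_snps.items():
        n_subjects = len(genotypes[snps[0]])
        totals = [0] * n_subjects
        for snp in snps:
            counts = genotypes[snp]
            if len(counts) != n_subjects:
                raise ValueError(f"{snp}: unexpected number of subjects")
            for i, count in enumerate(counts):
                if count not in (0, 1, 2):
                    raise ValueError(f"{snp}: allele count must be 0, 1 or 2")
                totals[i] += count
        scores[gene] = totals
    return scores


# Toy data for four subjects (hypothetical SNP ids and gene labels).
genotypes = {
    "snp_a": [0, 1, 2, 0],
    "snp_b": [1, 1, 0, 0],
    "snp_c": [2, 0, 0, 1],
}
gene_to_snps = {"GENE1": ["snp_a", "snp_b"], "GENE2": ["snp_c"]}
scores = gene_scores(genotypes, gene_to_snps)
```

Each per-gene score would then enter the regression model as a single predictor alongside the covariates, so that one test is performed per gene rather than per SNP.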


This study reported that CXADR on chromosome 21 was associated with SBP and DBP and F2RL1 was associated with BMI in African-Americans [77]. The association for CXADR was consistent with previous studies that reported its role in electrical conduction of the heart, as well as viral myocarditis and subsequent dilated cardiomyopathy [78, 79]. CXADR was also recently reported to be associated with ventricular fibrillation in acute myocardial infarction [80]. The association for F2RL1 and BMI was consistent with the previous finding that mice lacking F2RL1 were resistant to weight gain [81]. The findings in CXADR and F2RL1 remained significant with and without adjustment for local ancestry, indicating that additional associated SNPs may be present in these regions. This study contributed to the overall aim of this dissertation by reporting genetic associations with blood pressure and BMI in African-Americans using a gene score method in a candidate gene study.

Specific Aim 2: Admixture analysis followed by family-based association analysis to identify genetic variants underlying cardiovascular disease traits.

Admixture mapping analysis was conducted on the 22 autosomal chromosomes for hypertension, SBP, DBP, total cholesterol, HDL-C, low-density lipoprotein cholesterol (LDL-C), and triglycerides, in 1,905 unrelated African-Americans in the NHLBI’s FBPP. Seven regions showing admixture evidence were followed up with fine-mapping association analysis conducted in 3,556 African-Americans in families from the FBPP. The analyses in this study were performed with PLINK, SAS 9.2 (SAS Institute, Inc., Cary, North Carolina, USA), Stata 9.2 (StataCorp. 2005, Stata Statistical Software: Release 9. College Station, TX: StataCorp LP), and FamCC v.1.0 software [22]. Additional information about the chromosomes and the SNPs was obtained from the Database of Single Nucleotide Polymorphisms (dbSNP), the UCSC Genome Browser database, and the SNP Annotation and Proxy Search (SNAP) tool [76, 82, 83].

In the admixture mapping analysis, linear and logistic regression analyses were performed as in prior admixture mapping studies [70, 74]. The model included terms for the average African ancestry for a subject and the difference between the local ancestry at a given SNP and the average African ancestry for a subject, and this second term was used to test the null hypothesis of no association. The seven regions identified in the admixture mapping were for SBP (chromosomes 3q13 and 5p15), total cholesterol (chromosome 19p13), HDL-C (chromosome 8p21), LDL-C (chromosome 7p15), and triglycerides (chromosomes 14q32 and 19p13). No regions showed admixture evidence for hypertension or DBP.
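
For a quantitative trait, the admixture mapping model described above can be sketched as an ordinary least-squares regression of the phenotype on a subject's average African ancestry and on the deviation of local ancestry from that average, with the test based on the deviation term. The sketch below is an illustration under assumptions: the function name `admixture_test`, the simulated data and the omission of other covariates (such as age and sex) are choices made here, not details taken from the study.

```python
import numpy as np


def admixture_test(y, global_anc, local_anc):
    """Regress y on global ancestry and (local - global) ancestry.

    Returns the estimate, standard error and t statistic for the
    deviation term, which carries the test of no association.
    """
    y = np.asarray(y, dtype=float)
    g = np.asarray(global_anc, dtype=float)
    d = np.asarray(local_anc, dtype=float) - g
    X = np.column_stack([np.ones_like(g), g, d])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    se = float(np.sqrt(cov[2, 2]))
    return float(beta[2]), se, float(beta[2] / se)


# Simulated example (all values hypothetical): 500 subjects, one locus.
rng = np.random.default_rng(0)
n = 500
global_anc = rng.uniform(0.6, 0.95, n)        # average African ancestry
local_anc = rng.binomial(2, global_anc) / 2   # local ancestry: 0, 0.5 or 1
sbp = 120 + 10 * global_anc + 5 * (local_anc - global_anc) + rng.normal(0, 1, n)

beta_dev, se_dev, t_dev = admixture_test(sbp, global_anc, local_anc)
```

For a binary trait such as hypertension, the same two ancestry terms would enter a logistic rather than a linear model.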

In the fine-mapping analysis, 61,925 SNPs in the seven admixture regions were tested in 3,556 subjects with complete information in families from the FBPP. The association analysis was conducted using linear regression with mixed models to account for familial correlations within pedigrees. This stage of the analysis identified 2 independent SNPs on chromosome 8 for HDL-C, 4 SNPs in 3 independent loci on chromosome 7 for LDL-C, and 5 SNPs in 3 independent loci on chromosome 14 for triglycerides. Some of the admixture regions and SNPs from the results from this study were previously reported for associations with cardiovascular disease traits or related phenotypes, and some novel associations were found as well. This work contributed toward the general hypothesis of this dissertation by utilizing a multi-stage study design, including a family-based association analysis in the second stage of the study, to improve the power to identify significant variants. Family-based association studies are robust to population stratification, which is a concern in this recently admixed study population, and they have increased power to detect variants by accounting for heritability [84].

Specific Aim 3: Admixture mapping analysis followed by family-based association analysis to identify genomic and genetic variants associated with risk of apparent treatment-resistant hypertension.

Patients diagnosed with apparent treatment-resistant hypertension (aTRH) have either uncontrolled hypertension while taking three or more anti-hypertensive medications of different classes, including a diuretic, or they have controlled blood pressure while taking at least 4 anti-hypertensive medications [85]. To identify variants associated with risk of aTRH, admixture mapping analysis and fine-mapping analysis of SBP and DBP were performed in subjects taking > 2 anti-hypertensive medications and patients with controlled blood pressure while taking 1 anti-hypertensive medication. As African-American race is a key risk factor for aTRH, the admixture analysis was conducted in 606 unrelated African-Americans for SBP and in 758 unrelated African-Americans for DBP, followed by fine-mapping association analysis in 3,567 African-Americans in families (1,495 subjects with complete genotype and phenotype information) [86]. Stratified analyses followed by fine-mapping analyses were also conducted for strata defined by the number of anti-hypertensive medications (2 anti-hypertensive drugs or > 3 anti-hypertensive drugs) for phenotypes that were significant with the whole dataset.
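
The clinical definition of aTRH given at the start of this section can be written as a simple rule. The sketch below is illustrative only: the function name `is_atrh` and its arguments are hypothetical, and it encodes the definition as stated, not the analysis strata used in this study.

```python
def is_atrh(n_drug_classes: int, on_diuretic: bool, bp_controlled: bool) -> bool:
    """Apply the aTRH definition as stated in the text.

    aTRH: uncontrolled blood pressure on three or more anti-hypertensive
    medications of different classes, including a diuretic, or controlled
    blood pressure on at least four anti-hypertensive medications.
    """
    if bp_controlled:
        return n_drug_classes >= 4
    return n_drug_classes >= 3 and on_diuretic


# Hypothetical patients.
uncontrolled_three_with_diuretic = is_atrh(3, on_diuretic=True, bp_controlled=False)
uncontrolled_three_no_diuretic = is_atrh(3, on_diuretic=False, bp_controlled=False)
controlled_on_four = is_atrh(4, on_diuretic=False, bp_controlled=True)
controlled_on_one = is_atrh(1, on_diuretic=False, bp_controlled=True)
```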

The admixture mapping and fine-mapping association analyses were conducted as in Specific Aim 2 in African-American subjects from the FBPP. The analyses in this study were performed with PLINK, SAS 9.2 (SAS Institute, Inc., Cary, North Carolina, USA), Statistical Analysis for Genetic Epidemiology (S.A.G.E.), Stata 9.2 (StataCorp. 2005, Stata Statistical Software: Release 9. College Station, TX: StataCorp LP), and FamCC v.1.0 [22]. Additional information about the chromosomes and the SNPs was obtained from the Database of Single Nucleotide Polymorphisms (dbSNP), the UCSC Genome Browser database, the SNP Annotation and Proxy Search (SNAP) tool, the National Human Genome Research Institute’s Catalog of Published Genome-Wide Association Studies, the Database of Genotypes and Phenotypes (dbGaP) and the SNP Selection and Functional Information (SNPinfo) tool [76, 82, 83, 87-89].

In the admixture mapping analysis, significant ancestry association for SBP was identified in an 8 Mb region with the peak SNP at chromosome 3p24.2, and the surrounding admixture region included a SNP in SLC4A7 that was previously reported for significant association with blood pressure in the GWAS conducted by the ICBP [29]. In the stratified analysis, 7 SNPs in a large admixture region with the peak on chromosome 5q33.3 were identified for association with SBP in patients taking 2 anti-hypertensive drugs. The admixture region included EBF1, which was reported for association with blood pressure in the ICBP GWAS, as well as 4 other SNPs with novel blood pressure associations and 2 other SNPs with known associations with cardiovascular disease traits [29]. No SNPs were significant in the overall and stratified fine-mapping analyses of SBP after controlling for multiple comparisons. No admixture regions showed significant association between local ancestry and DBP or SBP in patients taking > 3 medications, so fine-mapping was not performed for these traits.


To address a priori concerns of limited power to detect differences due to small sample sizes, admixture mapping analysis was used in the first stage to improve the prior probabilities of the test hypotheses and a family-based sample was used in the second stage to account for heritability. This study contributed to this dissertation’s general aim by conducting the first admixture mapping analysis of aTRH and by using a two-stage analysis to improve limited power to detect genomic and genetic variants of interest. Additionally, the results of this study replicated findings from the ICBP GWAS, and this has been a considerable challenge in African-American populations. As the present study was conducted in subjects taking anti-hypertensive medications, it is possible that these effects were difficult to replicate as they were associated with treatment effects in African-Americans, rather than hypertension itself. These findings indicate that differences in phenotypes and sample composition may also contribute to the missing heritability in the genetics of blood pressure traits.


Chapter 3. Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping

The following chapter has been published as an article: Shetty, P.B., et al., Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping. Journal of Hypertension, 2012. 30(10): p. 1970-6.

The article is included in this dissertation with permission from the publisher Lippincott Williams & Wilkins (License Number 3216721362510). Promotional and commercial use of the material in print, digital or mobile device format is prohibited without the permission from the publisher Lippincott Williams & Wilkins. Please contact [email protected] for further information.


Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping

Authors: Priya B. SHETTY^a, Hua TANG^b, Bamidele O. TAYO^c, Alanna C. MORRISON^d, Craig L. HANIS^d, D.C. RAO^e, J. Hunter YOUNG^f, Ervin R. FOX^g, Eric BOERWINKLE^d, R.S. COOPER^c, Neil RISCH^i, Xiaofeng ZHU^a, the Candidate Gene Association Resource (CARe) Consortium

Affiliations: ^aDepartment of Epidemiology and Biostatistics, Case Western Reserve University School of Medicine, 10900 Euclid Avenue, Cleveland, OH 44106; ^bDepartment of Genetics, Stanford University School of Medicine; ^cDepartment of Preventive Medicine and Epidemiology, Loyola University of Chicago Stritch School of Medicine; ^dDivision of Epidemiology, Human Genetics and Environmental Sciences, The University of Texas Health Science Center at Houston School of Public Health; ^eDivision of Biostatistics, Washington University in St. Louis School of Medicine; ^fDepartment of Epidemiology, Johns Hopkins Bloomberg School of Public Health; ^gDivision of Cardiology, Department of Medicine, University of Mississippi Medical Center; ^iDepartment of Epidemiology and Biostatistics, University of California, San Francisco.

Conflicts of Interest and Source of Funding: The authors declare no conflicts of interest. This work was supported by the National Institutes of Health and grant number HL086718 from the National Heart, Lung and Blood Institute.

Author Responsible for Correspondence Concerning the Manuscript and to Whom Requests for Reprints Should be made: Xiaofeng Zhu, Department of Epidemiology and Biostatistics, Case Western Reserve University School of Medicine, Wolstein Research Building, 2103 Cornell Road, Room 1317, Cleveland, OH 44106; [email protected]; 216-368-0201.

Abstract and Keywords

Objective: Genetic variants in 296 genes in regions identified through admixture mapping of hypertension, BMI, and lipids were assessed for association with hypertension, blood pressure, BMI, and HDL-C.

Methods: This study identified coding SNPs from HapMap2 data that were located in genes on chromosomes 5, 6, 8, and 21, where ancestry association evidence for hypertension, BMI or HDL-C was identified in previous admixture mapping studies. Genotyping was performed in 1733 unrelated African-Americans from the National Heart, Lung and Blood Institute’s Family Blood Pressure Project, and gene-based association analyses were conducted for hypertension, SBP, DBP, BMI, and HDL-C. A gene score based on the number of minor alleles of each SNP in a gene was created and used for gene-based regression analyses, adjusting for age, age², sex, local marker ancestry, and BMI, as applicable. An individual’s African ancestry estimated from 2507 ancestry-informative markers was also adjusted for to eliminate any confounding due to population stratification.

Results: CXADR (rs437470) on chromosome 21 was associated with SBP and DBP with or without adjusting for local ancestry (P < 0.0006). F2RL1 (rs631465) on chromosome 5 was associated with BMI (P = 0.0005). Local ancestry in these regions was associated with the respective traits as well.

Conclusions: This study suggests that CXADR and F2RL1 likely play important roles in blood pressure and obesity variation, respectively. These findings are consistent with other studies, and replication and functional analyses are necessary.


Key Words: African-Americans, blood pressure, genetic association studies, obesity

Abbreviations: AIMs, ancestry-informative markers; CARe, Candidate Gene Association Resource; FBPP, Family Blood Pressure Project; GWASs, genome-wide association studies; HWE, Hardy-Weinberg equilibrium; NHLBI, National Heart, Lung and Blood Institute

Background

More than 82.6 million adults in the United States have cardiovascular disease, which is associated with hypertension, poorly controlled cholesterol levels, and obesity [1]. Poor cardiovascular health is associated with a number of serious outcomes, including stroke, coronary heart disease, and death. It is evident that this is a substantial public health issue.

Approximately one-quarter of adults worldwide have high blood pressure (BP) with about 7.5 million deaths worldwide due to high BP [2]. Some 34-67% of the inter-individual variation in BP is thought to be due to genetic factors [3-7]. Similarly, approximately 2.6 million deaths worldwide are due to high cholesterol and about 2.8 million deaths worldwide are due to overweight or obese status [2]. Estimates of the genetic contribution to the phenotypic variation of these diseases are 40-69% for high-density lipoprotein cholesterol (HDL-C), 40-66% for low-density lipoprotein cholesterol (LDL-C), and 16-85% for obesity determined by BMI [8-14]. Genome-wide association studies (GWASs) and admixture mapping analyses in recently admixed populations are two methods that have been used to identify genetic variants associated with BP, lipids, and BMI.

Common genetic variants for hypertension have been identified through GWAS, but the resulting associations are of modest effect sizes. In the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium and the Global BPgen Consortium blood pressure GWAS, 13 loci were collectively associated with a 1.0 mmHg increase in SBP and a 0.5 mmHg increase in DBP [15, 16]. Recently, the largest BP GWAS identified 29 independent variants at 28 loci (16 novel loci) significantly associated with BP in approximately 200,000 individuals; however, these variants collectively accounted for only 0.9% of the phenotypic variance [17].

In a smaller GWAS conducted on 8591 African-Americans in the Candidate Gene Association Resource (CARe) study, Fox et al. [18] reported two SNPs that were significantly associated with BP but were not replicated. Another GWAS of 1017 African-Americans identified several SNPs that were significantly associated with SBP, DBP, and hypertension; however, none of these findings were successfully replicated in the larger CARe study [18, 19]. Difficulty in replicating significant findings illustrates the challenges in identifying genetic variants that affect BP in the African-American population.

In the largest GWAS of blood lipids, significant results for 95 loci (59 novel loci) were reported [20]. For HDL-C, 62 loci were significant and 57 loci were significant for

LDL-C. A number of loci overlapped between the two traits. In an analysis that included secondary signals in 26 loci, the mapped variants collectively accounted for 25-30% of the genetic variance of each lipid phenotype.

For obesity, one of the earliest GWASs found a genetic variant in the FTO gene on chromosome 16q that was significantly associated with BMI [21]. A multistage obesity GWAS later reported significant results for 11 regions (seven novel regions), and the cumulative effect of the 11 loci accounted for less than 1% of the population


variability of obesity [22]. Another large GWAS reported 32 loci (18 novel loci) that were associated with BMI, but cumulatively, the loci explained only 2-4% of the genetic variance of BMI [23]. Further, the authors estimated that only 6-11% of the genetic variation of BMI could be explained by almost 300 genetic variants of similar effect sizes that probably exist in a sample size of over 730,000. These analyses clearly demonstrate some of the difficulties of conducting GWAS for complex diseases.

In addition to GWAS, African-American populations and other admixed populations are well suited for admixture mapping studies to identify genetic variants associated with BP, lipids and BMI [24, 25]. Admixture mapping capitalizes on differences in disease prevalence between parental populations of an admixed population to detect genetic variants associated with the disease. Previously, results of admixture analyses for hypertension, BMI and lipids in African-Americans were reported [26-30].

Several genomic regions were identified for the phenotypes of interest, including 6q24 and 21q21 for hypertension, 5q14-5q32 for BMI, and 8q11-8q21 for HDL-C.

The present study follows up SNPs identified from HapMap2 data in genes in four genomic regions, 5q14-32, 6q24, 8q11-8q21, and 21q21, that were reported in previous admixture analyses [26-29]. African-American participants from the National

Heart, Lung and Blood Institute’s (NHLBI) Family-Based Blood Pressure Program

(FBPP) were genotyped for SNPs in the four regions and for 2507 ancestry informative markers [31]. The aim of this study was to find variants in coding regions that are

associated with SBP and DBP, hypertension, BMI, and HDL-C in these four genomic

regions that were identified in admixture mapping studies. Gene-based analyses, rather

than single SNP analyses, were used to assess the variants with an emphasis on biological


function. In addition, the study was conducted in the African-American population, so there are advantages in utilizing their admixture to further examine the genetic variation of hypertension and in focusing on a population that has been shown to be more likely to have poorly controlled BP than non-Hispanic white hypertensive adults [32].

Methods

Sample

The NHLBI’s FBPP is a multicenter study that examines the genetic causes of hypertension and related phenotypes in different racial and ethnic groups: African-

Americans, Asians and Asian Americans, European Americans, and Mexican Americans

[31]. As the FBPP study was designed for linkage analysis, each of the networks in the

FBPP ascertained families via probands with elevated BP. This study focused on

African-Americans, who were recruited for three FBPP Networks. The Hypertension

Genetic Epidemiology Network (HyperGEN) recruited African American participants from Birmingham, Alabama and Forsyth County, North Carolina. The Genetic

Epidemiology Network of Atherosclerosis (GENOA) included African Americans from

Jackson, Mississippi, and the GenNet study recruited African Americans from Maywood,

Illinois. Additional details of the FBPP Networks are described elsewhere [31].

In this study, 1733 unrelated participants, 18-70 years, were selected for analysis from the FBPP data by first selecting the control from the families, if any were available.

If there were multiple controls present, the oldest control was selected. For cases, the youngest case in the family was selected. SBP and DBP measurements were obtained from Dinamap BP monitors. BMI was calculated from weight (in kilograms) and height

(in meters) measurements as weight/height².
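The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Person` record, its field names, and the example families are hypothetical, and applying the 18-70 year age filter before the within-family selection is an assumption, since the text does not specify the order of these steps.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record layout; field names are not taken from the FBPP data.
    family_id: str
    person_id: str
    age: float
    hypertensive: bool  # True = case, False = control


def select_unrelated(people, min_age=18, max_age=70):
    """Pick one person per family: the oldest control if any exist,
    otherwise the youngest case."""
    families = {}
    for p in people:
        # Assumption: the age restriction is applied before selection.
        if min_age <= p.age <= max_age:
            families.setdefault(p.family_id, []).append(p)
    selected = []
    for family_id in sorted(families):
        members = families[family_id]
        controls = [p for p in members if not p.hypertensive]
        if controls:
            selected.append(max(controls, key=lambda p: p.age))
        else:
            selected.append(min(members, key=lambda p: p.age))
    return selected


def bmi(weight_kg, height_m):
    """Body-mass index as weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


# Made-up example families.
example_people = [
    Person("F1", "a", 45, False),
    Person("F1", "b", 60, False),
    Person("F1", "c", 50, True),
    Person("F2", "d", 35, True),
    Person("F2", "e", 52, True),
]
example_selected = [p.person_id for p in select_unrelated(example_people)]
```

In the example, family F1 contributes its oldest control and family F2, which has no controls, contributes its youngest case.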


Genotyping Methods

SNPs from HapMap2 data that were located in the exons of 296 genes in regions on chromosomes 5q14-32, 6q24, 8q11-21, and 21q21 were identified. Each of these

regions was defined by a one-unit drop in the –log(P) value from the peak in the admixture mapping analysis. As a result, 91 genes on chromosome 5, 117 genes on chromosome 6, 37 genes on chromosome 8, and 51 genes on chromosome 21 were examined in this study. These regions were selected for this study because they showed evidence of association with hypertension, BMI, or HDL-C in previous admixture studies [26-29]. Genes on chromosome 5 were examined for association with BMI, genes on chromosome 8 were examined for association with HDL-C, and genes on chromosomes 6 and 21 were examined for association with SBP, DBP, and hypertension. In addition,

2507 ancestry-informative markers (AIMs) across the genome were assessed to differentiate between African and European local and global ancestry and to adjust for population stratification.

Each of the four networks obtained blood samples from the participants, and DNA was extracted by standard methods. The participants were genotyped using the Illumina iSelect Custom Bead Chip at the University of California San Francisco. The genetic data were examined for departures from Hardy-Weinberg equilibrium (HWE) in the hypertension controls. Two SNPs were found to have departures from HWE at the threshold corrected for testing 611 coding SNPs (P = 8 × 10⁻⁵), and these two SNPs were removed from the analysis. In addition, the genotype call rates were all at least 95%.

The AIMs were selected from SNPs available on the Illumina Human 1M array, the

Illumina 650K array, and the Affymetrix 6.0 array.
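The HWE filtering step can be illustrated with a short Python sketch. The text does not state which HWE test was used, so the one-degree-of-freedom chi-square goodness-of-fit test below is an assumption, and the genotype counts are made up. The threshold is taken as 0.05/611, which is consistent with the reported value of about 8 × 10⁻⁵.

```python
import math


def hwe_chisq_p(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square goodness-of-fit test for HWE.

    Assumption: the source does not say whether a chi-square or an exact
    test was used; this sketch uses the chi-square version.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    return math.erfc(math.sqrt(chisq / 2.0))


N_CODING_SNPS = 611
HWE_THRESHOLD = 0.05 / N_CODING_SNPS  # Bonferroni-style threshold


def fails_hwe(n_aa, n_ab, n_bb, threshold=HWE_THRESHOLD):
    return hwe_chisq_p(n_aa, n_ab, n_bb) < threshold


# Made-up genotype counts among controls.
in_equilibrium = fails_hwe(100, 200, 100)       # counts match HWE exactly
heterozygote_deficit = fails_hwe(150, 100, 150)  # strong departure
```

A SNP for which `fails_hwe` returns `True` in the controls would be dropped before the association analysis.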


Statistical Methods

Hypertension was defined as having SBP at least 140 mm Hg, DBP at least 90

mm Hg, and/or taking prescription medication for high BP [33]. According to the

guidelines of the American Heart Association, suboptimal HDL-C levels were less than 40 mg/dL in men and less than 50 mg/dL in women, and a suboptimal LDL-C level was at least 100 mg/dL. Obesity was defined as having a body-mass index (kg/m²) > 30 [34].

First, SBP and DBP were imputed for participants being treated for hypertension

by adding 10 mm Hg to SBP and 5 mm Hg to DBP, consistent with the CHARGE

GWAS strategy [15]. The number of minor alleles for each SNP was counted, and the

SNPs in each gene were collapsed in an additive manner. Specifically, each SNP was

coded as 0, 1 or 2, based on the number of minor alleles present in the SNP; then, a gene

score was obtained by summing all SNP values in each gene for each person.

Association tests were based on these gene scores.
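The treatment adjustment and the additive gene score described above can be sketched as follows. The function names and the example genotypes are hypothetical; the SNP-to-gene assignments are those listed in Tables 2 and 3. How missing genotypes were handled is not stated in the text, so this sketch simply raises an error when a genotype is absent.

```python
def adjust_bp(sbp, dbp, on_treatment):
    """Add 10 mm Hg to SBP and 5 mm Hg to DBP for treated participants."""
    if on_treatment:
        return sbp + 10.0, dbp + 5.0
    return sbp, dbp


def gene_score(genotypes, snps_in_gene):
    """Sum the minor-allele counts (0, 1 or 2) over all SNPs in one gene.

    genotypes: dict mapping SNP id -> minor-allele count for one person.
    A missing SNP raises KeyError (handling of missing data is an assumption).
    """
    score = 0
    for snp in snps_in_gene:
        count = genotypes[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"invalid minor-allele count for {snp}: {count}")
        score += count
    return score


# SNP membership as listed in Tables 2 and 3.
GENE_SNPS = {
    "CXADR": ["rs437470"],
    "ENPP1": ["rs9483347", "rs1804025"],
}

# Made-up genotypes for one participant.
example_genotypes = {"rs437470": 0, "rs9483347": 1, "rs1804025": 2}
example_scores = {
    gene: gene_score(example_genotypes, snps) for gene, snps in GENE_SNPS.items()
}
```

For a gene with a single genotyped SNP, such as CXADR, the gene score reduces to the usual additive coding of that SNP.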

For hypertension, multivariable logistic regression analysis was conducted with adjustment for age, age², sex, and BMI. For SBP, DBP and HDL-C, the regression models were adjusted for age, age², sex, and BMI, and the BMI regression models were adjusted for age, age², and sex. Local ancestry and global African ancestry were estimated from the 2507 AIMs using ADMIXPROGRAM [35]. To control for population stratification, global ancestry was included in the regression models [30]. Analyses were performed with and without local ancestry as a covariate in the regression models [36]. Finally, multiple comparisons were corrected for by adjusting for the number of genes in all four regions using the Bonferroni correction to obtain adjusted critical p-values.
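To illustrate why the models were fitted both with and without local ancestry, the following Python sketch fits the quantitative-trait model by ordinary least squares on simulated, noise-free data. Everything here is an assumption made for illustration: the analyses in this study were run in PLINK and SAS, the data and effect sizes are invented, and age is centered and rescaled only for numerical stability (this does not change the gene-score coefficient). The logistic model used for hypertension is not shown.

```python
import random


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def design_row(score, age, sex, bmi, global_anc, local_anc=None):
    """Intercept, gene score, age, age^2, sex, BMI, global ancestry,
    and optionally local ancestry."""
    a = (age - 50.0) / 10.0  # centered/rescaled age (numerical convenience)
    row = [1.0, float(score), a, a * a, float(sex), bmi, global_anc]
    if local_anc is not None:
        row.append(float(local_anc))
    return row


def simulate(n=300, seed=0):
    """Invented data in which the gene score is correlated with local ancestry."""
    rng = random.Random(seed)
    people, sbp = [], []
    for _ in range(n):
        local = rng.choice([0, 1, 2])  # copies of African ancestry at the locus
        p_minor = 0.10 + 0.15 * local  # minor allele more common on African background
        score = sum(rng.random() < p_minor for _ in range(2))
        age, sex = rng.uniform(18, 70), rng.choice([0, 1])
        bmi, glob = rng.uniform(18, 45), rng.uniform(0.3, 1.0)
        a = (age - 50.0) / 10.0
        people.append((score, age, sex, bmi, glob, local))
        sbp.append(100 + 3.0 * score + 4.0 * a + a * a + 2.0 * sex
                   + 0.5 * bmi + 10.0 * glob + 5.0 * local)
    return people, sbp


people, sbp = simulate()
X_without = [design_row(s, a, x, b, g) for s, a, x, b, g, _ in people]
X_with = [design_row(s, a, x, b, g, l) for s, a, x, b, g, l in people]
beta_without = ols(X_without, sbp)[1]  # gene-score effect, local ancestry omitted
beta_with = ols(X_with, sbp)[1]        # gene-score effect, local ancestry adjusted
```

In this simulation the gene-score coefficient is inflated when local ancestry is omitted, because the score partly tags the ancestry of the surrounding region. Comparing the two fits, as in Table 2, indicates how much of a gene-level signal is attributable to the variant itself.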


It was expected that variants with association evidence should show a substantial

difference in allele frequencies between African and European populations, because this

study focused on variants in regions where local marker-specific ancestry was associated

with phenotypic variation. Therefore, the minor allele frequency of each variant was compared between the presumed representative reference populations for African Americans: the Yoruba in Ibadan, Nigeria (YRI) and the CEU (Utah residents with ancestry from northern and western Europe) [37].

Replication analyses of the study findings were performed in two cohorts obtained from the NHLBI’s CARe [38]. CARe is a multinetwork study examining associations between genotypes and phenotypes that are of interest to the NHLBI, including BP, lipids and blood biomarkers. There are five African American cohorts in

CARe, and they were genotyped on the Affymetrix 6.0 platform. The replication analyses were conducted on 2916 African-Americans in the ARIC (Atherosclerosis Risk in Communities) cohort and 2144 African-Americans in the JHS (Jackson Heart Study) cohort, as they were most comparable to the FBPP dataset in terms of demographics.

The SNPs identified in this study were not available on the Affymetrix 6.0 platform; therefore, imputed SNPs in ARIC and JHS were used in the replication analysis, and these SNPs had quality scores of R² > 0.92. The local ancestry estimates for the genes of interest were also used for the replication analysis.

All statistical analyses were conducted with PLINK [39] and SAS 9.2 (SAS Institute, Cary, North Carolina).

Results


The effects of coding SNPs in genes on chromosomes 5, 6, 8, and 21 on SBP, DBP, hypertension and BMI were evaluated in 1733 unrelated African-American participants. The descriptive statistics of the FBPP dataset are presented in Table 1.

For SBP, DBP, and hypertension, 117 genes on chromosome 6 (6q24) and 51 genes on chromosome 21 (21q21) were studied. For BMI, 91 genes in the genomic region 5q14-32 were examined. For HDL-C, 37 genes in the genomic region 8q11-21 were studied. Genes were reported in Table 2 if they were statistically significant after adjusting for the total number of genes tested for each trait. Genes were reported with and without local ancestry in the models to determine whether additional variants of interest were present in the same local ancestry region. The effect of local ancestry alone was also included.
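The significance criterion for Table 2 can be reproduced with a short calculation. The sketch below assumes a family-wise α of 0.05 and uses the gene counts given in the note to Table 2 (51 genes on chromosome 21 for SBP and DBP, 91 genes on chromosome 5 for BMI); the variable names are illustrative only.

```python
ALPHA = 0.05

# Number of genes tested per trait, as given in the note to Table 2.
GENES_TESTED = {"SBP": 51, "DBP": 51, "BMI": 91}

# Bonferroni-adjusted critical P value for each trait.
CRITICAL_P = {trait: ALPHA / m for trait, m in GENES_TESTED.items()}


def is_significant(trait, p_value):
    """True if p_value is below the Bonferroni-adjusted threshold for the trait."""
    return p_value < CRITICAL_P[trait]


# P values reported in Table 2 for the FBPP dataset:
# (trait, gene, without local ancestry, with local ancestry)
TABLE2_FBPP = [
    ("SBP", "CXADR", 0.0001, 0.0004),
    ("DBP", "CXADR", 0.0002, 0.0006),
    ("BMI", "F2RL1", 0.0011, 0.0005),
]

table2_calls = {
    (trait, gene): (is_significant(trait, p_without), is_significant(trait, p_with))
    for trait, gene, p_without, p_with in TABLE2_FBPP
}
```

Under these thresholds, CXADR passes for SBP and DBP in both models, whereas F2RL1 passes for BMI only in the model that includes local ancestry.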

The gene CXADR on chromosome 21 was significantly associated with increased

SBP (β = 3.17, P = 0.0001) and DBP (β = 1.70, P = 0.0002) before adjusting for local

ancestry. This gene was still significant after including local ancestry in the regression

models (SBP: β = 2.94, P = 0.0004; DBP: β = 1.60, P = 0.0006). All of these analyses

adjusted for age, age², sex, BMI, and global African ancestry; these results were still

significant after adjusting for testing 51 genes for SBP and for DBP. Since local ancestry

was significant (P = 0.011 for SBP and P = 0.001 for DBP), there may be other variants

in this region that are independently associated with SBP and DBP. In this study,

CXADR only contained the nonsynonymous SNP rs437470. The minor allele was far

more prevalent in the YRI population than in the CEU population (C: 40.7% vs 9.7%),

and it was moderately common (C: 23.5%) in the HapMap ASW (individuals with

African ancestry in Southwest USA) population.


The gene F2RL1 on chromosome 5 was associated with obesity as measured by

BMI (β = 3.71, P = 0.0005) after adjusting for age, age², sex, and local and global

African ancestry. The association with BMI was less significant before adjusting for

local ancestry (β = 3.41, P = 0.0011). Local African ancestry was not significant for

BMI (P = 0.352). The synonymous SNP rs631465 was the only SNP examined in

F2RL1 in our study. The minor allele was infrequent in both the YRI and CEU populations (A: 0.0% vs 4.1%). Despite testing 37 genes in a region on

chromosome 8 that were selected based on admixture mapping results, no significant results were found for HDL-C.

Additional genes that were nominally significant at α = 0.05 but were not significant after correction for multiple comparisons are reported in

Table 3. The results in Table 3 were reported from regression models with and without

local ancestry. The effect of local ancestry alone was also reported. For models that

were more significant with the local ancestry adjustment than without it, the results

indicated that there may be additional variants in this region associated with the

phenotype.

Replication Analyses

In the replication analyses using local ancestry estimates, association evidence for

CXADR and F2RL1 was not seen in the combined ARIC and JHS dataset (N = 3723).

Local ancestry at F2RL1 was marginally associated with BMI in the multivariable model

(β = -0.86, P = 0.0549) (Table 4). For SBP and DBP, the association of local ancestry at

CXADR was not significant, but the effects were in the same direction as in the FBPP dataset (SBP-β = 1.59, P = 0.2483; DBP-β = 1.23, P = 0.1248).


When only the genes were assessed, the statistically significant results from the

FBPP study did not replicate in the combined ARIC and JHS cohorts (N = 5044), and the effect sizes were muted compared to the results in the primary dataset. The results for

CXADR in the combined ARIC and JHS datasets were in the same direction as in the

FBPP analysis after accounting for local ancestry (β-SBP = 0.19, P = 0.71 and β-DBP =

0.04, P = 0.90; Table 2). The SNP in F2RL1 was not present in the replication sets, so it was not possible to test this gene. It is important to note that the samples in the ARIC and JHS datasets had SNPs imputed, rather than genotyped directly. Although the imputation quality for each of the SNPs was high, imputed SNPs may have reduced the study’s power to replicate the FBPP findings.

Discussion

An association analysis of genetic variants in genomic coding regions that demonstrated association evidence in previous admixture mapping analyses was conducted [26-29, 31]. Coding SNPs that were identified from the HapMap project,

rather than tagging SNPs, were of interest in this study. Thus, the burden due to the

multiple comparisons was reduced by increasing the prior probability of the tested SNPs.

To further reduce the number of comparisons, the SNPs for each gene were collapsed and

a gene-based analysis was performed, similar to a rare variant analysis [40]. The

advantages of such an approach are a substantial reduction in the number of tests

conducted and the incorporation of a biological framework into the analysis.

The gene CXADR on chromosome 21 was identified as being statistically

significantly associated with SBP and DBP after correcting for testing 51 genes. One

nonsynonymous SNP, rs437470, was identified from the HapMap data in CXADR.


Interestingly, there was a substantial difference in the minor allele frequency of rs437470

between the HapMap YRI and CEU samples, suggesting that this result was consistent

with the finding that African ancestry is associated with increased BP in admixture

mapping studies [28-30]. Adjustment for local African ancestry resulted in models that

were still significantly associated with SBP and DBP, suggesting that additional variants

in this region may play a role in BP variation. The replication analyses using the local

ancestry estimates were consistent with these findings. Furthermore, CXADR plays a role

in the electrical conduction of the heart, and it has also been reported to be associated

with viral myocarditis and subsequent dilated cardiomyopathy, which is associated with

high BP [41, 42]. Recently, a SNP near CXADR was reported to have an association with

ventricular fibrillation in acute myocardial infarction [43].

The gene F2RL1 on chromosome 5 was significantly associated with BMI after correcting for testing 91 genes. In our analysis, this gene contained only the synonymous SNP rs631465, and this SNP’s minor allele was infrequent in both the YRI and CEU populations. Interestingly, the association evidence improved for this gene when local ancestry was included in the regression model, suggesting that this variant was unlikely to fully explain the association evidence observed in the admixture mapping of BMI [26]. In the replication analyses, local ancestry was marginally associated with BMI (P = 0.0549), indicating that other variants for BMI may exist in this region. The association with obesity as measured by BMI was consistent with the finding that mice lacking this gene were resistant to weight gain [44]. The gene F2RL1 encodes the protease-activated receptor 2 (PAR2), and the activation of this receptor via coagulation factor VIIa is a key pathway for obesity. The association of F2RL1 with


nadir BMI was also reported recently in a study of expression SNPs (eSNPs) in Roux-en-

Y gastric bypass patients [45].

Overall, the results indicated that the local ancestry association regions contain

variant(s) of interest for the phenotypes. The results implied that the association between

BMI and F2RL1 may be real, but it was not known if rs631465 (F2RL1) is in linkage disequilibrium with a causal SNP for BMI. Local ancestry of F2RL1 was not significantly associated with BMI in the FBPP analysis, but it was associated with BMI in the multivariable replication analysis in the CARe dataset. This indicated that other local variants of interest for BMI may be present, but rs631465 in F2RL1 likely

accounted for some of the admixture evidence in this region. As F2RL1 is a large gene

spanning over 16 kb, different variant(s) in F2RL1 may be associated with BMI.

For CXADR, local ancestry was significantly associated with SBP and DBP in the

FBPP analysis, indicating that there may be additional variants in this region that are

associated with BP. The local ancestry marker for CXADR in the CARe

dataset was shifted from the admixture peak that was previously reported in this region,

so this may have contributed to the lack of replication with local ancestry [29]. As with

F2RL1, the causal SNPs could be elsewhere in the gene as CXADR is over 80 kb. As CXADR and F2RL1 show significant association evidence in this study and have been reported to be associated with BP or obesity-related traits in other studies, these genes warrant further study [41-45].

The association results did not replicate in the gene-based analysis using the CARe replication dataset. This difficulty in replicating the gene-level associations is expected a priori, as the participants from the FBPP dataset were ascertained based on a family or personal history of hypertension, but the CARe dataset participants were recruited into population-based studies. As a result, the difficulty in replicating the gene-level results may also reflect the phenotypic heterogeneity among these cohorts, the possible overestimation of the FBPP effect sizes due to the “winner’s curse,” and the use of imputed variants, all of which further reduce statistical power in a replication study.

One of this study’s main strengths was that the results of related admixture mapping analyses were used to narrow down the regions in which the candidate gene analysis was performed. In conducting a gene-based analysis, the interpretations of the study’s results were improved by employing a more biologically relevant framework than was possible from testing all SNPs. As the selection of candidate genes was based on previous findings, the number of tests performed was reduced compared to the number of tests that would be necessary in a genome-wide association study.

As the gene score method did not account for linkage disequilibrium patterns, the effects of the correlation structure between SNPs on the estimates are unclear. This may have resulted in muted effect sizes, as this method only identified genes that contained

SNPs in the dataset. Further, this study was limited by the necessity to impute BPs for participants undergoing treatment for hypertension rather than obtaining pretreatment

BPs.

The present study reported significant associations between hypertension and obesity phenotypes and genes in the African-American community. The genes CXADR and F2RL1 were associated with BP and BMI, and these findings are consistent with replication analyses and published literature. As a result, these genes may be important in predicting the risk of hypertension and obesity, so future replication and functional


analyses including resequencing studies are warranted. Furthermore, this study described

how local ancestry affected effect estimates of association and demonstrated how local

ancestry may be useful in replication analyses. Consideration of such factors may be important in determining the contributions of genetic and genomic effects in disease

association studies.

Acknowledgements

This work was supported by the National Institutes of Health and grant number

HL086718 from the National Heart, Lung and Blood Institute.

Conflicts of interest

There are no conflicts of interest.


Table 1. Summary statistics

Variable                                  N              Mean     Median   (Min, Max)
Age (years)                               1733           48.2     48.0     (19.0, 70.0)
BMI (kg/m²)                               1733           31.0     29.9     (13.8, 70.7)
Mean proportion African ancestry          1733           83.6%    85.5%    (33.5%, 98.8%)
Sex                                       1733
  Male                                    617 (35.6%)    .        .        .
  Female                                  1116 (64.4%)   .        .        .
Taking any anti-hypertension medication   1733
  Yes                                     770 (44.4%)    .        .        .
  No                                      963 (55.6%)    .        .        .
SBP, adjusted for treatment (mmHg)        1730           132.9    129.0    (74.0, 237.5)
DBP, adjusted for treatment (mmHg)        1730           75.9     74.7     (42.0, 136.0)

Table 2. Results of gene-based multivariable regression models

                                                 Without local ancestry      With local ancestry
Dataset      Outcome   Gene (Chr)   SNP(s)       β      SE(β)   P value      β      SE(β)   P value     P value of local ancestry
FBPP         SBP       CXADR (21)   rs437470     3.17   0.830   0.0001       2.94   0.835   0.0004      0.011
FBPP         DBP       CXADR (21)   rs437470     1.70   0.460   0.0002       1.60   0.463   0.0006      0.001
FBPP         BMI       F2RL1 (5)    rs631465     3.41   1.045   0.0011       3.71   1.056   0.0005      0.352
ARIC & JHS   SBP       CXADR (21)   rs437470     0.27   0.499   0.5918       0.19   0.504   0.7110      0.277
ARIC & JHS   DBP       CXADR (21)   rs437470     0.10   0.291   0.7269       0.04   0.294   0.9005      0.134

Chr denotes chromosome number. Statistically significant after adjusting for the total number of genes tested. 51 genes on chromosome (Chr) 21 were tested for DBP and SBP (critical P value = 0.000980392), and 91 genes on chromosome 5 were tested for BMI (critical P = 0.000549451) for the FBPP dataset analyses. All SBP and DBP multivariable models were adjusted for age, age², sex, BMI, and global ancestry, unless otherwise noted. All BMI multivariable models were adjusted for age, age², sex, and global ancestry, unless otherwise noted. ARIC, Atherosclerosis Risk in Communities; FBPP, Family-Based Blood Pressure Program; JHS, Jackson Heart Study.
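As a rough consistency check on Table 2, a P value can be approximated from each reported β and SE(β) with a two-sided Wald test. The text does not state how the reported P values were computed, so the normal approximation below is an assumption; small differences from the tabulated values are expected from rounding of β and SE(β) and from the use of a t rather than a normal reference distribution.

```python
import math


def wald_p(beta, se):
    """Two-sided P value for beta/se under a standard normal approximation."""
    z = abs(beta / se)
    return math.erfc(z / math.sqrt(2.0))


# (label, beta, SE, reported P) taken from the FBPP rows of Table 2.
TABLE2_ROWS = [
    ("SBP, CXADR, without local ancestry", 3.17, 0.830, 0.0001),
    ("DBP, CXADR, with local ancestry", 1.60, 0.463, 0.0006),
    ("BMI, F2RL1, without local ancestry", 3.41, 1.045, 0.0011),
]

# Absolute difference between the approximate and the reported P value.
wald_differences = {
    label: abs(wald_p(beta, se) - reported)
    for label, beta, se, reported in TABLE2_ROWS
}
```

Each approximate P value lies within 0.0001 of the tabulated value, so the reported estimates, standard errors, and P values are mutually consistent at the precision shown.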

Table 3. Results of gene-based multivariable regression models that were nominally significant at α equal to 0.05

                                                     Without local ancestry     With local ancestry
Phenotype  Gene        Chr  SNPs                     β       SE(β)   P          β       SE(β)   P         P value of local ancestry
BMI        ATG10       5    rs3734114, rs1864183     0.50    0.243   0.0380     0.59    0.246   0.0176    0.1929
BMI        IQGAP2      5    rs10036913, rs7722711,   0.24    0.109   0.0281     0.25    0.109   0.0238    0.3301
                            rs2431352, rs2910819,
                            rs2455230, rs2431363
BMI        MRPS27      5    rs17375461, rs10942927   -0.44   0.193   0.0217     -0.43   0.194   0.0279    0.6049
BMI        THBS4       5    rs1866389                0.84    0.391   0.0320     0.92    0.393   0.0195    0.2632
SBP        ENPP1       6    rs9483347, rs1804025     2.10    0.915   0.0220     2.01    0.938   0.0324    0.0310
SBP        GPR126      6    rs11155242               2.18    0.959   0.0234     2.15    0.959   0.0252    0.0025
SBP        L3MBTL3     6    rs9388768, rs7451021     1.23    0.522   0.0185     1.55    0.563   0.0058    0.0481
SBP        NOX3        6    rs12195525               -2.82   1.242   0.0233     -2.82   1.242   0.0233    0.4296
SBP        RSPO3       6    rs1892172                1.58    0.809   0.0507     1.69    0.813   0.0380    0.0165
DBP        C6orf184    6    rs9400272, rs6927569     -1.35   0.443   0.0023     -1.37   0.451   0.0024    0.0626
DBP        ENPP1       6    rs9483347, rs1804025     1.27    0.507   0.0122     1.28    0.520   0.0135    0.0786
DBP        L3MBTL3     6    rs9388768, rs7451021     0.68    0.290   0.0192     0.82    0.313   0.0091    0.1246
DBP        TXLNB       6    rs17068451, rs9321712    0.75    0.317   0.0186     0.73    0.325   0.0255    0.0112
SBP        DOPEY2      21   rs3827183                2.97    1.248   0.0173     3.10    1.252   0.0134    0.3174
DBP        DOPEY2      21   rs3827183                1.40    0.691   0.0437     1.39    0.694   0.0454    0.2382
HTN        C6orf211    6    rs9397054                -0.20   0.102   0.0484     -0.20   0.103   0.0525    0.2965
HTN        ENPP1       6    rs9483347, rs1804025     0.23    0.091   0.0105     0.22    0.093   0.0162    0.0138
HTN        L3MBTL3     6    rs9388768, rs7451021     0.11    0.051   0.0358     0.14    0.055   0.0126    0.0368
HTN        CXADR       21   rs437470                 0.20    0.082   0.0166     0.19    0.083   0.0227    0.1348
HDL-C      LOC137886   8    rs13277646               -1.99   0.796   0.0125     -1.98   0.799   0.0132    0.6924

Table 4. Results of gene-based multivariable regression models using local ancestry for replication analysis

Dataset Outcome Gene (Chr) β SE(β) P

ARIC & JHS SBP CXADR (21) 1.59 1.377 0.2483

ARIC & JHS DBP CXADR (21) 1.23 0.803 0.1248

ARIC & JHS BMI F2RL1 (5) -0.86 0.447 0.0549

Chr denotes chromosome number. Models tested local ancestry instead of the gene of interest. Multivariable SBP and DBP models were adjusted for age, age², sex, BMI, and global ancestry. The multivariable BMI model was adjusted for age, age², sex, and global ancestry. ARIC, Atherosclerosis Risk in Communities; JHS, Jackson Heart Study.

References

1. Roger VL, Go AS, Lloyd-Jones DM, Benjamin EJ, Berry JD, Borden WB, et al. Executive summary: heart disease and stroke statistics--2012 update: a report from the American Heart Association. Circulation. 2012; 125 (1):188-97.
2. WHO. Global health risks: mortality and burden of disease attributable to selected major risks. Geneva; 2009.
3. Hottenga JJ, Boomsma DI, Kupper N, Posthuma D, Snieder H, Willemsen G, et al. Heritability and stability of resting blood pressure. Twin Res Hum Genet. 2005; 8 (5):499-508.
4. Kupper N, Willemsen G, Riese H, Posthuma D, Boomsma DI, de Geus EJ. Heritability of daytime ambulatory blood pressure in an extended twin design. Hypertension. 2005; 45 (1):80-5.
5. Levy D, DeStefano AL, Larson MG, O'Donnell CJ, Lifton RP, Gavras H, et al. Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham Heart Study. Hypertension. 2000; 36 (4):477-83.
6. Kearney PM, Whelton M, Reynolds K, Muntner P, Whelton PK, He J. Global burden of hypertension: analysis of worldwide data. Lancet. 2005; 365 (9455):217-23.
7. Lewington S, Clarke R, Qizilbash N, Peto R, Collins R. Age-specific relevance of usual blood pressure to vascular mortality: a meta-analysis of individual data for one million adults in 61 prospective studies. Lancet. 2002; 360 (9349):1903-13.
8. Kathiresan S, Manning AK, Demissie S, D'Agostino RB, Surti A, Guiducci C, et al. A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study. BMC Med Genet. 2007; 8 Suppl 1:S17.
9. Weiss LA, Pan L, Abney M, Ober C. The sex-specific genetic architecture of quantitative traits in humans. Nat Genet. 2006; 38 (2):218-22.
10. Adeyemo A, Luke A, Cooper R, Wu X, Tayo B, Zhu X, et al. A genome-wide scan for body mass index among Nigerian families. Obes Res. 2003; 11 (2):266-73.
11. Allison DB, Kaprio J, Korkeila M, Koskenvuo M, Neale MC, Hayakawa K. The heritability of body mass index among an international sample of monozygotic twins reared apart. Int J Obes Relat Metab Disord. 1996; 20 (6):501-6.
12. McQueen MB, Bertram L, Rimm EB, Blacker D, Santangelo SL. A QTL genome scan of the metabolic syndrome and its component traits. BMC Genet. 2003; 4 Suppl 1:S96.
13. Pietilainen KH, Kaprio J, Rissanen A, Winter T, Rimpela A, Viken RJ, et al. Distribution and heritability of BMI in Finnish adolescents aged 16y and 17y: a study of 4884 twins and 2509 singletons. Int J Obes Relat Metab Disord. 1999; 23 (2):107-15.
14. Platte P, Papanicolaou GJ, Johnston J, Klein CM, Doheny KF, Pugh EW, et al. A study of linkage and association of body mass index in the Old Order Amish. Am J Med Genet C Semin Med Genet. 2003; 121C (1):71-80.
15. Levy D, Ehret GB, Rice K, Verwoert GC, Launer LJ, Dehghan A, et al. Genome-wide association study of blood pressure and hypertension. Nat Genet. 2009; 41 (6):677-87.
16. Newton-Cheh C, Johnson T, Gateva V, Tobin MD, Bochud M, Coin L, et al. Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet. 2009; 41 (6):666-76.
17. Ehret GB, Munroe PB, Rice KM, Bochud M, Johnson AD, Chasman DI, et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011; 478 (7367):103-9.
18. Fox ER, Young JH, Li Y, Dreisbach AW, Keating BJ, Musani SK, et al. Association of genetic variation with systolic and diastolic blood pressure among African Americans: the Candidate Gene Association Resource study. Hum Mol Genet. 2011; 20 (11):2273-84.


19. Adeyemo A, Gerry N, Chen G, Herbert A, Doumatey A, Huang H, et al. A genome-wide association study of hypertension and blood pressure in African Americans. PLoS Genet. 2009; 5 (7):e1000564.
20. Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM, Koseki M, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010; 466 (7307):707-13.
21. Frayling TM, Timpson NJ, Weedon MN, Zeggini E, Freathy RM, Lindgren CM, et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science. 2007; 316 (5826):889-94.
22. Thorleifsson G, Walters GB, Gudbjartsson DF, Steinthorsdottir V, Sulem P, Helgadottir A, et al. Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet. 2009; 41 (1):18-24.
23. Speliotes EK, Willer CJ, Berndt SI, Monda KL, Thorleifsson G, Jackson AU, et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet. 2010; 42 (11):937-48.
24. Smith MW, O'Brien SJ. Mapping by admixture linkage disequilibrium: advances, limitations and guidelines. Nat Rev Genet. 2005; 6 (8):623-32.
25. Zhu X, Tang H, Risch N. Admixture mapping and the role of population structure for localizing disease genes. Adv Genet. 2008; 60:547-69.
26. Basu A, Tang H, Arnett D, Gu CC, Mosley T, Kardia S, et al. Admixture mapping of quantitative trait loci for BMI in African Americans: evidence for loci on chromosomes 3q, 5q, and 15q. Obesity (Silver Spring). 2009; 17 (6):1226-31.
27. Basu A, Tang H, Lewis CE, North K, Curb JD, Quertermous T, et al. Admixture mapping of quantitative trait loci for blood lipids in African-Americans. Hum Mol Genet. 2009; 18 (11):2091-8.
28. Zhu X, Cooper RS. Admixture mapping provides evidence of association of the VNN1 gene with hypertension. PLoS One. 2007; 2 (11):e1244.
29. Zhu X, Luke A, Cooper RS, Quertermous T, Hanis C, Mosley T, et al. Admixture mapping for hypertension loci with genome-scan markers. Nat Genet. 2005; 37 (2):177-81.
30. Zhu X, Young JH, Fox E, Keating BJ, Franceschini N, Kang S, et al. Combined admixture mapping and association analysis identifies a novel blood pressure genetic locus on 5p13: contributions from the CARe consortium. Hum Mol Genet. 2011; 20 (11):2285-95.
31. FBPP. Multi-center genetic study of hypertension: The Family Blood Pressure Program (FBPP). Hypertension. 2002; 39 (1):3-9.
32. Redmond N, Baer HJ, Hicks LS. Health behaviors and racial disparity in blood pressure control in the National Health and Nutrition Examination Survey. Hypertension. 2011; 57 (3):383-9.
33. Chobanian AV, Bakris GL, Black HR, Cushman WC, Green LA, Izzo JL, Jr., et al. Seventh report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Hypertension. 2003; 42 (6):1206-52.
34. James WP, Jackson-Leach R, Ni Mhurchu C, Kalamara E, Shayeghi M, Rigby N, et al. Overweight and Obesity (High Body Mass Index). In: Ezzati M, Lopez AD, Rodgers A, Murray CJL, editors. Comparative quantification of health risks: global and regional burden of disease attributable to selected major risk factors. Geneva: WHO; 2004. p. 497-596.
35. Zhu X, Zhang S, Tang H, Cooper R. A classical likelihood based approach for admixture mapping using EM algorithm. Hum Genet. 2006; 120 (3):431-45.
36. Qin H, Morris N, Kang SJ, Li M, Tayo B, Lyon H, et al. Interrogating local population structure for fine mapping in genome-wide association studies. Bioinformatics. 2010; 26 (23):2961-8.
37. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007; 449 (7164):851-61.

86

38. Musunuru K, Lettre G, Young T, Farlow DN, Pirruccello JP, Ejebe KG, et al. Candidate gene association resource (CARe): design, methods, and proof of concept. Circ Cardiovasc Genet. 2010; 3 (3):267-75. 39. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007; 81 (3):559-75. 40. Zhu X, Feng T, Li Y, Lu Q, Elston RC. Detecting rare variants for complex traits using family and unrelated data. Genet Epidemiol. 2010; 34 (2):171-87. 41. Bowles NE, Richardson PJ, Olsen EG, Archard LC. Detection of Coxsackie-B-virus- specific RNA sequences in myocardial biopsy samples from patients with myocarditis and dilated cardiomyopathy. Lancet. 1986; 1 (8490):1120-3. 42. Lisewski U, Shi Y, Wrackmeyer U, Fischer R, Chen C, Schirdewan A, et al. The tight junction protein CAR regulates cardiac conduction and cell-cell communication. J Exp Med. 2008; 205 (10):2369-79. 43. Bezzina CR, Pazoki R, Bardai A, Marsman RF, de Jong JS, Blom MT, et al. Genome- wide association study identifies a susceptibility locus at 21q21 for ventricular fibrillation in acute myocardial infarction. Nat Genet. 2010; 42 (8):688-91. 44. Badeanlou L, Furlan-Freguia C, Yang G, Ruf W, Samad F. Tissue factor-protease- activated receptor 2 signaling promotes diet-induced obesity and adipose inflammation. Nat Med. 2011; 17 (11):1490-7. 45. Greenawalt DM, Dobrin R, Chudin E, Hatoum IJ, Suver C, Beaulaurier J, et al. A survey of the genetics of stomach, liver, and adipose gene expression from a morbidly obese cohort. Genome Res. 2011; 21 (7):1008-16.

Chapter 4. Novel variants for HDL-C, LDL-C and triglycerides identified from admixture mapping and fine-mapping analysis in families

Abstract

Objective: Admixture mapping of blood pressure traits and lipids was followed up by family-based association analysis of the chromosomes and phenotypes to identify variants for cardiovascular disease in African-Americans.

Methods: The present study conducted admixture mapping analysis on 22 autosomal chromosomes for hypertension, systolic blood pressure, diastolic blood pressure, total cholesterol, high-density lipoprotein cholesterol, low-density lipoprotein cholesterol and triglycerides. The admixture analysis was performed for 1,905 unrelated African-American subjects from the National Heart, Lung and Blood Institute’s Family Blood Pressure Program (FBPP). Regions that showed admixture evidence were followed up with family-based association analysis in 3,556 African-American subjects from the FBPP. The regression analyses used in the admixture mapping analysis and in the family-based association analysis were adjusted for age, age², sex, body-mass-index, and mean ancestry to minimize confounding due to population stratification.

Results: Regions with suggestive evidence of local ancestry association are found on chromosomes 3 and 5 (systolic blood pressure), 7 (low-density lipoprotein cholesterol, LDL-C), 8 (high-density lipoprotein cholesterol, HDL-C), 14 (triglycerides) and 19 (total cholesterol and triglycerides). In the fine-mapping analysis in families, 61,925 SNPs were tested and 11 SNPs (8 independent SNPs) show nominally significant association with HDL-C (2 SNPs), LDL-C (4 SNPs) and triglycerides (5 SNPs). The increased information from the family data identifies SNPs that show novel associations with the phenotypes of interest, as well as regions that contain genes known to be associated with cardiovascular disease.

Conclusions: This study identifies regions on chromosomes 3, 5, 7, 8, 14 and 19 that are of interest for further studies of cardiovascular disease in African-Americans. A number of the admixture regions contain genes with known associations for cardiovascular traits, and 11 SNPs from the fine-mapping are identified for association with HDL-C, LDL-C and triglycerides. The current study also demonstrates the benefits of alternate and multi-stage analysis methods, as well as the advantages of using family data in association analysis for genetic epidemiology studies.

Background

Inadequately controlled blood pressure (BP) and lipids are key indicators of poor cardiovascular health, which is a substantial public health issue worldwide. Approximately 7.5 million deaths worldwide each year are due to high blood pressure, and 2.6 million are due to poorly controlled cholesterol [1].

Both genetic and environmental factors are associated with variation in blood pressure and lipids. The heritability of hypertension has been estimated to range between 34% and 67% in family studies and twin studies [2-4]. The heritability of key lipids is similar, ranging from 58% to 69% for HDL-C, LDL-C and triglycerides in the Framingham Heart Study Offspring cohort [5]. Genome-wide association studies (GWAS) have been used successfully to identify genetic variants that are associated with cardiovascular disease, including blood pressure traits and lipid traits [6-10]. However, the primary analysis for most of these studies has been conducted in subjects of European ancestry, and studies in African ancestry populations are still limited to relatively small sample sizes, in which replication of the findings has been shown to be a challenge [11, 12].

Recently, a meta-analysis of almost 30,000 subjects of African ancestry identified three novel BP loci and one independent signal in a known locus [13]. Although this study suggested that BP loci identified in populations of European ancestry have common effects across different populations, replicating such association evidence in individual loci in populations of African ancestry can still be a challenge. Similar results were seen in a recent study of blood lipids across multiple ethnically-diverse populations [14].

Admixture mapping, on the other hand, uses the disparities in disease prevalence between the ancestral populations of admixed populations to identify genomic regions and genetic variants that are associated with the disease [15, 16]. In admixture analysis, the frequency of a risk allele is expected to vary between ancestral populations. It is assumed that in a recently admixed population, affected subjects have increased ancestry for the ancestral population that has a greater frequency of the risk allele at the causal variant, compared to controls. Ancestry at the causal variant is also expected to be greater than average genome-wide ancestry. This increased ancestry for the high-risk ancestral population has been previously reported in hypertension cases [17]. In African-Americans, some genomic regions have already been identified for hypertension and lipids using admixture mapping, including regions 5p13-11, 6q24 and 21q21 for BP and hypertension, and 8q11-8q21 for HDL-C [14, 18-21].

In the current study, admixture mapping analyses were first conducted to identify genomic regions that were associated with blood pressure traits (hypertension, systolic blood pressure (SBP) and diastolic blood pressure (DBP)) and lipid traits (total cholesterol (TC), high-density lipoprotein cholesterol (HDL-C), low-density lipoprotein cholesterol (LDL-C) and triglycerides (TG)). The regions showing association evidence with local ancestry were then followed up by fine-mapping association analysis using SNPs in families.

Methods

Sample

The study sample for the admixture mapping analysis and the sample for the family-based association analysis were obtained from the NHLBI’s Family Blood Pressure Program (FBPP). The FBPP is a multi-center study that focuses on the genetic causes of hypertension and associated traits in African Americans, Asians and Asian Americans, European Americans, and Mexican Americans [22]. The FBPP study design was set up for linkage analysis, and the networks in the FBPP ascertained families through probands with increased blood pressure.

The current study was conducted in African Americans, and these subjects were recruited for three FBPP Networks: The Hypertension Genetic Epidemiology Network (HyperGEN) in Birmingham, Alabama and Forsyth County, North Carolina; the Genetic Epidemiology Network of Atherosclerosis (GENOA) in Jackson, Mississippi; and the GenNet study in Maywood, Illinois. Further details of the FBPP Networks are available in [22].

The admixture mapping analysis was conducted in 1,905 unrelated subjects. In brief, these subjects were selected from the overall FBPP data by first selecting the control from each family, if available. If multiple controls were available, the oldest control was chosen. Among cases, the youngest case in the family was selected. Subjects with age < 19 years or age > 80 years or average global African ancestry < 2.5% were removed from the dataset.
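The selection rule described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the record fields (`family`, `age`, `is_case`, `ancestry`) are hypothetical, and the sketch assumes that the age and ancestry exclusions are applied before one subject is picked per family and that a case is chosen only when a family has no eligible control.

```python
from collections import defaultdict

def select_unrelated(subjects, min_age=19, max_age=80, min_ancestry=0.025):
    """Pick at most one subject per family: the oldest control, else the youngest case.

    `subjects` is a list of dicts with hypothetical keys
    'family', 'age', 'is_case' and 'ancestry' (global African ancestry, 0-1).
    """
    by_family = defaultdict(list)
    for s in subjects:
        # Exclusions stated in the text (assumed here to be applied first).
        if s["age"] < min_age or s["age"] > max_age or s["ancestry"] < min_ancestry:
            continue
        by_family[s["family"]].append(s)

    selected = []
    for members in by_family.values():
        controls = [s for s in members if not s["is_case"]]
        if controls:
            # Oldest control when one or more controls are available.
            selected.append(max(controls, key=lambda s: s["age"]))
        else:
            # Otherwise the youngest case in the family.
            selected.append(min(members, key=lambda s: s["age"]))
    return selected
```

Choosing older controls and younger cases is a common way to sharpen the phenotypic contrast, since an older unaffected subject is less likely to become a case later.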

For the fine-mapping study, chromosomal regions that showed local ancestry association evidence for the phenotypes were selected for follow-up analysis. There were 7,053 subjects in 1,459 pedigrees (up to 3,556 subjects with complete phenotype and genotype information) from the FBPP in the family-based, fine-mapping association analysis.

Genotyping

The FBPP networks obtained blood samples from the study participants and DNA was extracted by standard methods. A total of 2,593 evenly spaced ancestry-informative markers (AIMs) were identified across the genome that maximized the difference in allele frequency between the ancestral populations, HapMap CEU (Utah residents with Northern and Western European ancestry from the CEPH collection) and HapMap YRI (Yoruba in Ibadan, Nigeria). These SNPs were selected from those that were available on the Illumina Human 1M array, the Illumina 650K array, and the Affymetrix 6.0 array. These AIMs were genotyped for the 1,905 unrelated subjects using the Illumina iSelect Custom Bead Chip at the University of California San Francisco [18-21, 23]. After dropping SNPs with call rate < 0.95 and Illumina GenTrain score < 0.7, a final dataset of 2,507 AIMs was used in the admixture mapping analysis.

The subjects from the GENOA network were genotyped using the Affymetrix Array 6.0 and Illumina 1M arrays, and the subjects from the HyperGEN network were genotyped using the Affymetrix Array 6.0. Quality control measures were performed separately for the GENOA and HyperGEN samples. SNPs with call rate less than 95% were excluded [13]. An additional 872 African-American subjects from the HyperGEN and GENOA networks were not genotyped with conventional GWAS platforms; rather, these individuals were genotyped using the Affymetrix Axiom chips. SNPs were called using the Affymetrix Genotyping Console (GTC) by analyzing CEL files of intensity calculations from the Affymetrix Axiom arrays (www.affymetrix.com). SNPs with call rates < 95% were excluded. The three datasets were combined in the fine-mapping analysis, and the total sample size was 3,556 subjects with complete information. The combined genotyping data were examined for Mendelian errors and departures from Hardy-Weinberg Equilibrium (p < 0.001), and 42 SNPs on chromosome 5 and 30 SNPs on chromosome 19 were removed. After all of the quality control measures, 61,925 SNPs were analyzed in seven regions identified from the admixture mapping analysis.
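The two SNP-level filters named above (call rate of at least 95% and no Hardy-Weinberg departure at p < 0.001) can be illustrated with a short sketch. It is not the study's pipeline: the function names are hypothetical, and a simple 1-degree-of-freedom chi-square test is used here as a stand-in, whereas the software actually used may apply an exact test.

```python
import math

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test of Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(n_aa, n_ab, n_bb, n_missing, min_call_rate=0.95, hwe_alpha=0.001):
    """Keep a SNP only if its call rate and HWE p-value meet the stated thresholds."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    return call_rate >= min_call_rate and hwe_chisq_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha
```

In family data the genotype counts for such a test would normally be taken from unrelated founders only, because relatives do not contribute independent genotypes.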

Statistical Analyses

In the admixture mapping, the subjects were genotyped with 2,507 ancestry-informative markers [22]. Two different software programs were used for estimating an individual's local ancestry, ADMIXPROGRAM and SABER, and the performance of the two programs was compared [24, 25]. As the AIMs were not spaced closely together, the estimates from ADMIXPROGRAM were reasonable; for markers that are spaced closely, such as those from GWAS, SABER is more suitable for local ancestry estimation [26]. Furthermore, ADMIXPROGRAM modeled the local ancestry probability with a continuous gene-flow model that is applicable to admixed populations with two ancestral populations, such as African-Americans [16, 24]. On the other hand, SABER modeled the local ancestry probability with a transition matrix using an intermixing model that accounts for more than two ancestral populations with separate admixing times, which may be more suitable for Hispanic populations [16, 25]. Based on the sample of interest, the estimates from ADMIXPROGRAM were used. The regions showing local ancestry association evidence with the estimates from the ADMIXPROGRAM software were followed up with fine-mapping association analysis on the 61,925 SNPs in the families.

For continuous traits, linear regression analysis was performed similarly to previous admixture mapping studies [21, 27]. Specifically, let $y_i$ be the trait value of individual $i$. Let $A_{ij}$ be the African ancestry at the $j$th AIM and $A_i$ be the average African ancestry of individual $i$. $A_{ij}$ was estimated by ADMIXPROGRAM and $A_i$ was calculated as the mean of $A_{ij}$ for all AIMs. The linear regression analysis was performed as

$$y_i = \beta_0 + \beta_1 A_i + \beta_2 (A_{ij} - A_i) + \beta_3\,\mathrm{SEX} + \beta_4\,\mathrm{AGE} + \beta_5\,\mathrm{AGE}^2 + \beta_6\,\mathrm{BMI} + \varepsilon_i$$

and tested the null hypothesis $\beta_2 = 0$, which was used to assess statistical significance in admixture mapping. A P-value < 0.001 was considered for testing the null hypothesis $\beta_2 = 0$ as suggestive evidence of association in admixture mapping analysis; further, a region of admixture mapping was defined as comprising the markers that were within the 1 unit drop of -log10(P-value) from the peak signal on either side. Average African ancestry $A_i$ was included in the regression model to adjust for population stratification. For a binary trait such as hypertension, the linear regression was replaced by logistic regression with the same covariates. The admixture mapping analysis was repeated using local ancestry estimates obtained with the software SABER, and the results were compared to the results obtained with the ADMIXPROGRAM [25].
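The marker-level test of $\beta_2 = 0$ for a continuous trait can be sketched with ordinary least squares. The sketch below is illustrative only and makes several assumptions: the function name and argument layout are hypothetical, local ancestry is supplied as a subjects-by-AIMs matrix `A` of values between 0 and 1, and the P-value uses a normal approximation to the t statistic, which is reasonable for samples of this size but is not necessarily what the original software computed.

```python
import math
import numpy as np

def admixture_marker_test(y, A, j, sex, age, bmi):
    """Fit y ~ A_i + (A_ij - A_i) + SEX + AGE + AGE^2 + BMI and test beta_2 = 0.

    Returns (beta_2 estimate, standard error, two-sided P-value).
    """
    y = np.asarray(y, dtype=float)
    A = np.asarray(A, dtype=float)
    age = np.asarray(age, dtype=float)
    mean_anc = A.mean(axis=1)               # A_i: mean local ancestry over all AIMs
    X = np.column_stack([
        np.ones_like(y),                    # beta_0 (intercept)
        mean_anc,                           # beta_1 (stratification adjustment)
        A[:, j] - mean_anc,                 # beta_2 (excess local ancestry at AIM j)
        sex, age, age ** 2, bmi,            # beta_3 .. beta_6
    ])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    dof = X.shape[0] - X.shape[1]
    cov = (resid @ resid / dof) * np.linalg.inv(X.T @ X)
    se = math.sqrt(cov[2, 2])
    z = beta[2] / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided, normal approximation
    return beta[2], se, p
```

Running this function once per AIM and recording -log10(p) would give the genome-wide profile from which peaks are read; for a binary trait the same design matrix would be passed to a logistic regression instead.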

In the fine-mapping analysis, family-based association analysis using the ASSOC program in S.A.G.E. software v. 6.3.0 was performed. The family-based association analysis is similar to traditional association analysis conducted in unrelated subjects in that linear regression is performed. However, the family-based analysis also accounts for familial correlation among members within a pedigree through a mixed model. The family-based analysis has a number of advantages over traditional trait-marker association analysis in unrelated subjects, including the estimation of heritability and additional quality control of the genotyping data by checking Mendelian errors. For the family-based association analysis, residuals were first calculated after adjusting for covariates (age, age², sex and BMI). Linear regression or logistic regression was then used for the SNP-trait association analysis.

For the family samples, the first ten principal components were estimated and included as covariates to account for population stratification as described in [28]. In brief, the PCs for all founders were first calculated and the PCs of the rest of the family members were projected according to their genotype data [28]. Principal component (PC) analysis was conducted using FamCC v.1.0 software. The PCs of founders and non-founders were similar to the PCs that were calculated from founders only. The analyses in this study were conducted with PLINK software, SAS 9.2 (SAS Institute Inc., Cary, North Carolina, USA), Stata 9.2 (StataCorp. 2005. Stata Statistical Software: Release 9. College Station, TX: StataCorp LP), and FamCC v.1.0 software [29]. Additional information for the chromosomes and SNPs was obtained from the Database of Single Nucleotide Polymorphisms (dbSNP), the UCSC Genome Browser database, and the SNP Annotation and Proxy Search (SNAP) tool [30-32].
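The founder-then-project scheme for principal components described in this section can be illustrated as follows. This is a generic sketch under stated assumptions, not the FamCC implementation: the function name is hypothetical, genotypes are assumed to be coded as 0/1/2 allele counts with no missing values, and markers are only mean-centered on the founders (no variance scaling).

```python
import numpy as np

def founder_pcs_and_projection(G_founders, G_others, k=10):
    """Compute the top-k PCs from founders and project other family members onto them.

    G_founders, G_others: (subjects x SNPs) arrays of allele counts.
    Returns (founder PC scores, projected PC scores for the other subjects).
    """
    G_f = np.asarray(G_founders, dtype=float)
    G_o = np.asarray(G_others, dtype=float)
    mu = G_f.mean(axis=0)                          # per-SNP mean among founders only
    _, _, Vt = np.linalg.svd(G_f - mu, full_matrices=False)
    loadings = Vt[:k].T                            # SNP loadings of the top-k PCs
    return (G_f - mu) @ loadings, (G_o - mu) @ loadings
```

Estimating the axes from founders alone avoids letting large pedigrees dominate the components, while projection still gives every genotyped relative a score on the same axes.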

Results

Admixture Mapping

Admixture mapping analyses were conducted genome-wide for hypertension, SBP, DBP, total cholesterol, HDL-C, LDL-C and triglycerides in a sample of 1,905 unrelated African-American subjects in the FBPP. The descriptive statistics are presented in Table 1. The admixture mapping analyses for hypertension, SBP, DBP, total cholesterol, HDL-C, LDL-C and triglycerides are presented in Supplemental Figures 3a-3g.

In the admixture mapping analysis, chromosomal regions suggesting evidence of local ancestry association with a phenotype are those with P-value < 0.001 (-log10(P-value) > 3). The results for each phenotype are reported with the associated peak, corresponding SNP, -log10(P-value) of the SNP, and region defined by a 1 unit drop in the -log10(P-value) on either side of the peak (Table 2). The results are consistent between the ADMIXPROGRAM estimates and the SABER estimates for each chromosome (Supplemental Figures 3a-3g), and the results that are presented in Table 2 use the estimates from ADMIXPROGRAM.
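The 1 unit drop rule used to delimit each region can be sketched as below. The function name is hypothetical, and the sketch assumes one plausible reading of the rule: the region is the contiguous run of markers around the peak whose -log10(P-value) stays within 1 unit of the peak value, with markers supplied in chromosomal order.

```python
def one_unit_drop_region(positions_mb, neglog10_p, drop=1.0):
    """Return (start, peak, end) positions of the region around the strongest marker.

    positions_mb: marker positions in Mb, sorted along one chromosome.
    neglog10_p:   -log10(P-value) for each marker, in the same order.
    """
    peak = max(range(len(neglog10_p)), key=lambda i: neglog10_p[i])
    cutoff = neglog10_p[peak] - drop
    left = peak
    while left > 0 and neglog10_p[left - 1] >= cutoff:
        left -= 1                      # extend toward the start of the chromosome
    right = peak
    while right < len(neglog10_p) - 1 and neglog10_p[right + 1] >= cutoff:
        right += 1                     # extend toward the end of the chromosome
    return positions_mb[left], positions_mb[peak], positions_mb[right]
```

Under this reading, a second marker that exceeds the cutoff but is separated from the peak by a weaker marker is not included in the region.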

For SBP, local ancestry association evidence is observed on chromosome 3 in a region spanning from 112.6 Mb to 118.6 Mb, and the peak is located at SNP rs7431707 (3q13) in the intron of LSAMP. Local ancestry evidence for SBP is also suggested on chromosome 5 from 4.8 Mb to 9.3 Mb, and the maximum -log10(P-value) on chromosome 5 is located at SNP rs371869 (5p15), which is between SRD5A1 and PAPD7. The admixture mapping results for DBP and hypertension do not identify any associations with local ancestry.

For total cholesterol, local ancestry association evidence is suggested on chromosome 19 (4.9 Mb-40.1 Mb). The maximum -log10(P-value) in this region is at SNP rs901792 (19p13), which is located within TRIC-A (TMEM38A). For HDL-C, suggested ancestry evidence is found on chromosome 8 (16.8 Mb-27.4 Mb) and the peak is at SNP rs13438843 (8p21). For LDL-C, suggested local ancestry evidence is found on chromosome 7 (16.8 Mb-23.9 Mb), and the peak SNP in this region is at rs7793253 (7p15).

For triglycerides, local ancestry association evidence is suggested on chromosome 14 (78.3 Mb-95.6 Mb) with the maximum -log10(P-value) at rs3825663 (14q32) in TDP1. Association evidence is also identified for triglycerides on chromosome 19 (1.6 Mb-8.1 Mb) with the peak at rs8110664 (19p13). Following the admixture mapping analysis, these regions on chromosomes 3, 5, 7, 8, 14 and 19 are selected for the fine-mapping using the family-based association analysis.

Fine-Mapping Analysis

For each trait, only the chromosomal regions showing local ancestry association evidence in the admixture mapping analysis were studied in the family-based association analysis. Each of the association models was adjusted for age, age², sex, BMI and the first ten PCs for population stratification. A summary of the significant fine-mapping results for the phenotypes is reported in Table 3.

Overall, 61,925 SNPs were tested in the fine-mapping analysis, and the number of SNPs tested for each phenotype was: SBP (4,788 SNPs on chromosome 3 and 4,198 SNPs on chromosome 5), total cholesterol (17,258 SNPs), HDL-C (10,995 SNPs), LDL-C (6,964 SNPs) and triglycerides (15,328 SNPs on chromosome 14 and 2,394 SNPs on chromosome 19). The fine-mapping was not performed for hypertension and DBP because no regions in the admixture mapping were significant at P < 0.001. Quantile-quantile plots of the observed and expected -log10(P-value) from the fine-mapping analyses show substantial deviation from the diagonal line after accounting for population stratification for HDL-C, LDL-C and triglycerides (chromosome 14), but not for SBP (chromosome 3 and chromosome 5), total cholesterol and triglycerides (chromosome 19) (Figures 1a-1g). These results suggest that SNPs in these regions are truly associated with HDL-C, LDL-C and triglycerides.
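The coordinates of a quantile-quantile plot of this kind can be computed as in the sketch below. The function name is hypothetical, and the expected quantiles use the plotting position rank/(n + 1), which is one common convention; the plotting position used for Figures 1a-1g is not stated in the text.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10(P-value) pairs, smallest P-value first."""
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected_p = rank / (n + 1)    # expected quantile under the null hypothesis
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points
```

Points that lie above the diagonal (observed greater than expected) at the upper end of the plot correspond to the deviation described above.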

For HDL-C, two SNPs are significant at the Bonferroni-corrected critical value (p = 4.55 x 10⁻⁶) after testing 10,995 SNPs. The most significant SNP, rs10096633 (p = 4.17 x 10⁻⁷; 8p21), is not located within any known genes, but it is near LPL (lipoprotein lipase). The second SNP, rs13702 (p = 3.44 x 10⁻⁶; 8p21), is located within LPL.
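The Bonferroni-corrected critical value quoted for HDL-C is reproduced by dividing a family-wise error rate by the number of SNPs tested in the region. In the sketch below the function name is hypothetical and the error rate of 0.05 is an assumption: it is not stated explicitly in the text, but it yields the reported value for the 10,995 SNPs tested for HDL-C.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test critical P-value that controls the family-wise error rate at alpha."""
    return alpha / n_tests

# HDL-C region on chromosome 8: 10,995 SNPs tested.
hdl_threshold = bonferroni_threshold(10995)

# P-values reported for the two HDL-C SNPs.
hdl_pvalues = {"rs10096633": 4.17e-7, "rs13702": 3.44e-6}
hdl_significant = [snp for snp, p in hdl_pvalues.items() if p < hdl_threshold]
```

Because the correction is applied per region rather than across all 61,925 SNPs, the critical value differs between phenotypes according to the number of SNPs in each admixture region.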

For LDL-C, 4 SNPs are significant at the Bonferroni-corrected critical value (p = 7.18 x 10⁻⁵) after testing 6,964 SNPs. The most significant SNP is rs12534314 (p = 9.29 x 10⁻⁶; 7p15), and this SNP is in high LD (D’ = 1.00, r² = 1.00) with rs12531660 (p = 2.13 x 10⁻⁵; 7p15). SNP rs12534314 and rs6966083 (p = 1.02 x 10⁻⁵; 7p21) are not located within genes, but SNP rs10486301 (p = 9.37 x 10⁻⁵; 7p21) is located within HDAC9.
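The two linkage disequilibrium measures quoted above, D’ and r², can be computed from the four haplotype frequencies of a SNP pair. The sketch below is a textbook calculation with a hypothetical function name; it assumes phased haplotype frequencies are available, whereas tools that report LD from genotype data first estimate these frequencies.

```python
def ld_stats(f_AB, f_Ab, f_aB, f_ab):
    """Return (D', r^2) for two biallelic SNPs from haplotype frequencies summing to 1."""
    p_A = f_AB + f_Ab                  # allele A frequency at the first SNP
    p_B = f_AB + f_aB                  # allele B frequency at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("LD is undefined when either SNP is monomorphic")
    D = f_AB - p_A * p_B
    if D > 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(D) / d_max
    r2 = D * D / denom
    return d_prime, r2
```

When both measures equal 1, as reported for the SNP pair above, the two SNPs carry the same information in the sample, which is why such SNPs are counted as a single independent signal.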

On chromosome 14, there were 15,328 SNPs tested for triglycerides in the fine-mapping analysis, and five of these SNPs are significant after correcting for multiple comparisons (at the Bonferroni-corrected critical p-value of 3.26 x 10⁻⁶). Three of the five SNPs are in high LD (D’ = 1.00, r² = 1.00), resulting in three independent SNPs of interest for triglycerides. The most significant SNP in this region is rs10483943 (14q31) (p = 3.35 x 10⁻⁷), which is not located within any known gene. SNPs rs759512 (p = 7.74 x 10⁻⁷) and rs757645 (p = 1.92 x 10⁻⁶) are in close proximity to rs10483943. In addition, SNPs rs10130530 (14q24) in NRXN3 and rs11620666 (14q32) in TTC7B are significantly associated with triglycerides (p = 2.13 x 10⁻⁶ and p = 3.08 x 10⁻⁶, respectively).

To summarize the results from the admixture mapping, the family-based fine-mapping analysis, and large GWAS in the literature simultaneously, graphs were created to evaluate the results from these analyses (Figures 2a-2g). The SNP results from the fine-mapping analysis and the SNP results from the literature were overlaid on the admixture mapping results from the present study.

Discussion

Admixture mapping analysis of blood pressure traits and lipid traits was conducted for unrelated African-Americans in the FBPP, and the chromosomes that contained regions of admixture evidence were followed up with fine-mapping association analysis using family data. This procedure has been suggested to identify SNP associations which may be missed by traditional GWAS [21, 33]. The admixture mapping analysis capitalized on the recent admixture of the African-American population to narrow down regions of interest for a trait, and the family-based association analysis allowed for further fine mapping of the regions.

Seven regions in the admixture mapping analysis that showed local ancestry evidence are reported for association with blood pressure or lipids. In the follow-up fine-mapping analysis in families, 11 SNPs (8 independent SNPs) are associated with lipid traits. Findings for four SNPs are consistent with previously reported results and the other four are novel associations from these analyses.

Suggestive admixture associations are observed for SBP on chromosomes 3 and 5 and for total cholesterol and triglycerides on chromosome 19, but significant fine-mapping results are not found in these regions. For SBP, the peak on chromosome 3 is located at rs7431707 in the intron of LSAMP, a gene that was previously reported for association with left main coronary artery disease [34]. On chromosome 5, the admixture mapping region extends from 4.8 Mb to 9.3 Mb (5p15.32-31), which is near the 5p13-11 region that was previously reported for BP using admixture mapping [21]. These variants were not previously reported by the large BP GWAS or admixture studies [6-8, 19, 20]. Local ancestry association evidence for hypertension is still observed in the regions on chromosomes 6q and 21q that were reported by previous studies, although the associations were relatively weak in this study [19, 20].

For total cholesterol, the reported admixture region includes 4 SNPs that were previously reported for total cholesterol, HDL-C and LDL-C in a large GWAS of blood lipids in populations of European ancestry [9]. The admixture mapping peak in this region for total cholesterol is at SNP rs901792 in TRIC-A (TMEM38A), which was previously reported for association with BP maintenance in knockout mice studies [35].

For HDL-C, the admixture mapping peak at SNP rs13438843 (8p21) is not near any known genes. While the most significant SNP from the fine-mapping, rs10096633 (p = 4.17 x 10⁻⁷; 8p21), is not located within any known genes, it is near LPL (lipoprotein lipase), and the second SNP, rs13702 (p = 3.44 x 10⁻⁶; 8p21), is located within LPL. This gene plays a key role in the hydrolysis of the triglyceride component of a number of lipoproteins and consequently in lipid metabolism [36, 37]. For LDL-C, the admixture peak at SNP rs7793253 is not near any known genes, but the admixture region on chromosome 7 (16.8 Mb-23.9 Mb) includes rs12670798 (DNAH11), which was previously reported for association with LDL-C and total cholesterol [9]. The most significant SNP in the fine-mapping analysis, rs12534314, is located near STEAP1B, and the region including this gene was previously reported for association with all-cause death on dialysis in African-Americans with type 2 diabetes in a recent GWAS [38]. Further, SNP rs10486301 in HDAC9 was previously reported for association with stroke and obesity-related phenotypes [39-41].

For triglycerides, the admixture peak on chromosome 14 is located at rs3825663 in TDP1, which has been reported for association with carotid plaque in a family study of Dominican people [42]. In the fine-mapping of this region, SNP rs10130530 in NRXN3 was previously reported for association with obesity, body-mass-index, and waist circumference in GWAS conducted in European-ancestry populations [43-45]. SNP rs11620666 (14q32) in TTC7B was reported in a GWAS of European-ancestry individuals for cytomegalovirus (CMV) antibody response; CMV was previously found to be associated with atherosclerosis and acute coronary events [46-49]. The admixture peak on chromosome 19 is located at rs8110664, which is close to ZNF556. Neither of the local ancestry regions for triglycerides included SNPs that were previously reported for association in the lipids GWAS [9].

Finally, the results from the admixture mapping analysis, the fine-mapping association analysis, and results from GWAS were evaluated concurrently (Figures 2a-2g) [8, 9, 14, 50-52]. As expected, the fine-mapping SNPs for HDL-C, LDL-C and triglycerides (chromosome 14) are quite consistent with the admixture results with regard to the location of the peak SNPs. Yet, it was evident that the precision and magnitude of the results benefit from the additional information obtained from the family data and the increased number of SNPs in the fine-mapping analysis. For example, for LDL-C, the locations of the most significant SNPs are consistent with the peaks from the admixture analysis; however, the magnitude of the p-values indicates that the strength of evidence showing departure from the null hypothesis (of no association) is much stronger in the family-based fine-mapping analysis than in the admixture mapping analysis, which uses unrelated subjects only. As a result, the admixture mapping results are narrowed down to the significant SNPs from the fine-mapping analysis. Further, these graphs demonstrate consistencies between the studies, despite being performed in different populations, which is also reported in recent studies [13, 14]. At the same time, the graphs also show that the current study’s use of African-American subjects has revealed novel findings that are not found in studies of other populations.

By conducting a two-stage analysis, this study has reduced the number of SNPs tested in the family-based association study because it focuses on variants in chromosomes containing regions of admixture evidence for their respective traits. This method of analysis has the potential to uncover the variants missed by traditional studies, such as GWAS [21, 33]. In fact, it is worth noting that several SNPs in this study are significant at the Bonferroni-corrected critical p-value, even though the Bonferroni p-value is quite conservative.

In summary, the current study reports seven regions showing evidence of association between local ancestry and cardiovascular phenotypes. In the follow-up fine-mapping analysis in families, 8 independent SNPs are associated with lipid traits. These regions and SNPs may aid in identifying potential therapeutic targets for cardiovascular disease traits. Furthermore, they may be candidates for further blood pressure and lipid studies, especially in African-American populations. The present study also demonstrates the benefits of alternate and multi-stage analysis methods for genetic studies, particularly in recently admixed populations that are not well-studied with traditional association studies.

Acknowledgements

This work was supported by the National Institutes of Health through grant numbers HL086718 and HL007567-29 (T32) from the National Heart, Lung and Blood Institute. Some of the results of this paper were obtained by using the software package S.A.G.E., which was supported by a U.S. Public Health Service Resource Grant (RR03655) from the National Center for Research Resources. The authors wish to acknowledge the members of the International HapMap Consortium and the communities of the International HapMap Project for their contributions.

Tables

Table 1. Summary statistics

Variable                                  N              Mean     Median   (Min, Max)
Mean African Ancestry                     1905           83.7%    85.5%    (33.5%, 98.8%)
Age (years)                               1905           48.8     49.0     (19.0, 80.0)
BMI (kg/m²)                               1905           31.1     29.9     (13.8, 70.7)
Sex                                       1905           .        .        .
  Male                                    682 (35.8%)    .        .        .
  Female                                  1223 (64.2%)   .        .        .
Taking Any Anti-Hypertension Medication   1905           .        .        .
  No                                      1038 (54.5%)   .        .        .
  Yes                                     867 (45.5%)    .        .        .
SBP (mm Hg)                               1902           133.4    130.0    (74.0, 237.5)
DBP (mm Hg)                               1902           75.9     74.7     (42.0, 136.0)
Total Cholesterol (mg/dL)                 1530           197.7    194.0    (80.0, 418.0)
HDL-C (mg/dL)                             1529           54.6     52.0     (22.0, 171.0)
LDL-C (mg/dL)                             1479           119.2    116.8    (25.6, 290.4)
Triglycerides (mg/dL)                     1530           122.2    105.0    (23.0, 3170.0)

Table 2. Regions suggested by admixture mapping

Phenotype           Chromosome and Region (Mb)*    Peak SNP in the region   -log10(P-value)
SBP                 Chromosome 3: 112.6 – 118.6    rs7431707                3.32
                    Chromosome 5: 4.8 – 9.3        rs371869                 3.05
Total Cholesterol   Chromosome 19: 4.9 – 40.1      rs901792                 3.06
HDL-C               Chromosome 8: 16.8 – 27.4      rs13438843               3.05
LDL-C               Chromosome 7: 16.8 – 23.4      rs7793253                3.21
Triglycerides       Chromosome 14: 78.3 – 95.6     rs3825663                3.26
                    Chromosome 19: 1.6 – 8.1       rs8110664                3.96

*Region is the 1 unit drop of -log10(P-value) of the peak SNP in the region.

Table 3. Significant fine-mapping results

| Phenotype | # SNPs tested | Bonferroni-Corrected Significance Level | Significant SNPs | Heritability | Beta | SE | Wald p-value | Effect Allele Frequency in CEU | Effect Allele Frequency in YRI |
|---|---|---|---|---|---|---|---|---|---|
| HDL (Chr 8) | 10,995 | 4.55 x 10^-6 | rs10096633 | 0.495 | 2.10 | 0.41 | 4.17 x 10^-7 | C 0.858 | C 0.487 |
| | | | rs13702 | 0.497 | -1.89 | 0.41 | 3.44 x 10^-6 | A 0.712 | A 0.416 |
| LDL (Chr 7) | 6,964 | 7.18 x 10^-6 | rs12534314# | 0.566 | -8.50 | 1.92 | 9.29 x 10^-6 | T 0.192 | T 0.033 |
| | | | rs6966083 | 0.565 | -18.27 | 4.14 | 1.02 x 10^-5 | T 0.509 | T 0.164 |
| | | | rs12531660# | 0.566 | -8.22 | 1.93 | 2.13 x 10^-5 | G 0.175 | G 0.026 |
| | | | rs10486301 | 0.569 | -8.34 | 2.14 | 9.37 x 10^-5 | G 0.225 | G 0.008 |
| Triglycerides (Chr 14) | 15,328 | 3.26 x 10^-6 | rs10483943^ | 0.242 | 36.95 | 7.24 | 3.35 x 10^-7 | G 0.168 | G 0.004 |
| | | | rs759512^ | 0.241 | 35.97 | 7.28 | 7.74 x 10^-7 | T 0.202 | T 0.009 |
| | | | rs757645^ | 0.241 | 33.05 | 6.94 | 1.92 x 10^-6 | A 0.175 | A 0.018 |
| | | | rs10130530 | 0.254 | 53.28 | 11.24 | 2.13 x 10^-6 | A 0.025 | A 0.008 |
| | | | rs11620666 | 0.243 | 23.85 | 5.11 | 3.08 x 10^-6 | T 0.430 | T 0.008 |

#In linkage disequilibrium (D’ = 1.00, r2 = 1.00) on chromosome 7. ^In linkage disequilibrium (D’ = 1.00, r2 = 1.00) on chromosome 14.
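The Bonferroni-corrected significance levels in Table 3 are consistent with dividing a nominal level by the number of SNPs tested in each region. The short sketch below reproduces that arithmetic. The nominal level of 0.05 is an assumption inferred from the tabulated values rather than a figure stated in the table, and the names `snps_tested` and `bonferroni_threshold` are illustrative.

```python
# Numbers of SNPs tested per region, taken from Table 3.
snps_tested = {
    "HDL (Chr 8)": 10995,
    "LDL (Chr 7)": 6964,
    "Triglycerides (Chr 14)": 15328,
}

ALPHA = 0.05  # assumed nominal significance level (not stated in the table)


def bonferroni_threshold(n_tests, alpha=ALPHA):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


thresholds = {region: bonferroni_threshold(n) for region, n in snps_tested.items()}
for region, level in thresholds.items():
    print(f"{region}: {level:.2e}")
```

Under this assumption, the computed levels round to the values listed in the Bonferroni column of Table 3.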

Figures 1a-1g. Quantile-Quantile plots of fine-mapping analysis in families

61,925 SNPs were tested, and the number of SNPs tested for each phenotype was: SBP (4,788 SNPs on chromosome 3 and 4,198 SNPs on chromosome 5), total cholesterol (17,258 SNPs), HDL-C (10,995 SNPs), LDL-C (6,964 SNPs) and triglycerides (15,328 SNPs on chromosome 14 and 2,394 SNPs on chromosome 19). Significant results were reported for HDL-C, LDL-C and triglycerides (chromosome 14).

Figures 2a-2g. Combined analyses plots

[Figure panels: SBP (Chromosome 3); SBP (Chromosome 5); Total Cholesterol (Chromosome 19); HDL-C (Chromosome 8); LDL-C (Chromosome 7); Triglycerides (Chromosome 14); Triglycerides (Chromosome 19). Each panel plots -log10(p-value) against distance along the chromosome (Mb), with labeled SNPs.]

The SNP results from the fine-mapping analysis and from the literature were overlaid on the admixture mapping results from the present study. Results from the present study are represented by circles, and results from the literature are represented by triangles. Additional significant results were reported by Teslovich et al. for LDL-C at rs12670798 (21.6 Mb) and rs2072183 (44.5 Mb) on chromosome 7 and for total cholesterol at rs6511720 (11.1 Mb) and rs4420638 (50.1 Mb) on chromosome 19 without p-values [9].

Supplemental Figures 3a-3g: Admixture mapping

[Figure panels: Hypertension; DBP; SBP; Total Cholesterol; HDL-C; LDL-C; Triglycerides.]

In each panel, the ADMIXPROGRAM results are the dashed lines and the SABER results are the dotted lines.

References

1. WHO, Global health risks: mortality and burden of disease attributable to selected major risks. 2009: Geneva.
2. Hottenga, J.J., et al., Heritability and stability of resting blood pressure. Twin research and human genetics : the official journal of the International Society for Twin Studies, 2005. 8(5): p. 499-508.
3. Kupper, N., et al., Heritability of daytime ambulatory blood pressure in an extended twin design. Hypertension, 2005. 45(1): p. 80-5.
4. Levy, D., et al., Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham Heart Study. Hypertension, 2000. 36(4): p. 477-83.
5. Kathiresan, S., et al., A genome-wide association study for blood lipid phenotypes in the Framingham Heart Study. BMC medical genetics, 2007. 8 Suppl 1: p. S17.
6. Levy, D., et al., Genome-wide association study of blood pressure and hypertension. Nature genetics, 2009. 41(6): p. 677-87.
7. Newton-Cheh, C., et al., Genome-wide association study identifies eight loci associated with blood pressure. Nature genetics, 2009. 41(6): p. 666-76.
8. Ehret, G.B., et al., Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature, 2011. 478(7367): p. 103-9.
9. Teslovich, T.M., et al., Biological, clinical and population relevance of 95 loci for blood lipids. Nature, 2010. 466(7307): p. 707-13.
10. Saxena, R., et al., Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science, 2007. 316(5829): p. 1331-6.
11. Adeyemo, A., et al., A genome-wide association study of hypertension and blood pressure in African Americans. PLoS genetics, 2009. 5(7): p. e1000564.
12. Fox, E.R., et al., Association of genetic variation with systolic and diastolic blood pressure among African Americans: the Candidate Gene Association Resource study. Human molecular genetics, 2011. 20(11): p. 2273-84.
13. Franceschini, N., et al., Genome-wide association analysis of blood pressure traits in nearly 30,000 African ancestry individuals reveals a common set of associated genes in African and non-African populations. American journal of human genetics, 2013. In press.
14. Coram, M.A., et al., Genome-wide Characterization of Shared and Distinct Genetic Components that Influence Blood Lipid Levels in Ethnically Diverse Human Populations. American journal of human genetics, 2013.
15. Smith, M.W. and S.J. O'Brien, Mapping by admixture linkage disequilibrium: advances, limitations and guidelines. Nature reviews. Genetics, 2005. 6(8): p. 623-32.
16. Zhu, X., H. Tang, and N. Risch, Admixture mapping and the role of population structure for localizing disease genes. Advances in genetics, 2008. 60: p. 547-69.
17. Tang, H., et al., Racial admixture and its impact on BMI and blood pressure in African and Mexican Americans. Human genetics, 2006. 119(6): p. 624-33.
18. Basu, A., et al., Admixture mapping of quantitative trait loci for blood lipids in African-Americans. Human molecular genetics, 2009. 18(11): p. 2091-8.
19. Zhu, X. and R.S. Cooper, Admixture mapping provides evidence of association of the VNN1 gene with hypertension. PloS one, 2007. 2(11): p. e1244.
20. Zhu, X., et al., Admixture mapping for hypertension loci with genome-scan markers. Nature genetics, 2005. 37(2): p. 177-81.

21. Zhu, X., et al., Combined admixture mapping and association analysis identifies a novel blood pressure genetic locus on 5p13: contributions from the CARe consortium. Human molecular genetics, 2011. 20(11): p. 2285-95.
22. FBPP, Multi-center genetic study of hypertension: The Family Blood Pressure Program (FBPP). Hypertension, 2002. 39(1): p. 3-9.
23. Shetty, P.B., et al., Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping. Journal of hypertension, 2012. 30(10): p. 1970-6.
24. Zhu, X., et al., A classical likelihood based approach for admixture mapping using EM algorithm. Human genetics, 2006. 120(3): p. 431-45.
25. Tang, H., et al., Reconstructing genetic ancestry blocks in admixed individuals. American journal of human genetics, 2006. 79(1): p. 1-12.
26. Wang, X., et al., Adjustment for local ancestry in genetic association analysis of admixed populations. Bioinformatics, 2011. 27(5): p. 670-7.
27. Basu, A., et al., Admixture mapping of quantitative trait loci for BMI in African Americans: evidence for loci on chromosomes 3q, 5q, and 15q. Obesity, 2009. 17(6): p. 1226-31.
28. Zhu, X., et al., A unified association analysis approach for family and unrelated samples correcting for stratification. American journal of human genetics, 2008. 82(2): p. 352-65.
29. Purcell, S., et al., PLINK: a tool set for whole-genome association and population-based linkage analyses. American journal of human genetics, 2007. 81(3): p. 559-75.
30. Meyer, L.R., et al., The UCSC Genome Browser database: extensions and updates 2013. Nucleic acids research, 2013. 41(Database issue): p. D64-9.
31. Database of Single Nucleotide Polymorphisms (dbSNP). Bethesda (MD): National Center for Biotechnology Information, National Library of Medicine. (dbSNP Build ID: 137). Available from: http://www.ncbi.nlm.nih.gov/SNP/.
32. Johnson, A.D., et al., SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics, 2008. 24(24): p. 2938-9.
33. Qin, H. and X. Zhu, Power comparison of admixture mapping and direct association analysis in genome-wide association studies. Genetic epidemiology, 2012. 36(3): p. 235-43.
34. Wang, L., et al., Polymorphisms of the tumor suppressor gene LSAMP are associated with left main coronary artery disease. Annals of human genetics, 2008. 72(Pt 4): p. 443-53.
35. Yamazaki, D., et al., TRIC-A channels in vascular smooth muscle contribute to blood pressure maintenance. Cell metabolism, 2011. 14(2): p. 231-41.
36. Eckel, R.H., Lipoprotein lipase. A multifunctional enzyme relevant to common metabolic diseases. The New England journal of medicine, 1989. 320(16): p. 1060-8.
37. Wang, H. and R.H. Eckel, Lipoprotein lipase: from gene to obesity. American journal of physiology. Endocrinology and metabolism, 2009. 297(2): p. E271-88.
38. Murea, M., et al., Genome-wide association scan for survival on dialysis in African-Americans with type 2 diabetes. American journal of nephrology, 2011. 33(6): p. 502-9.
39. Comuzzie, A.G., et al., Novel genetic loci identified for the pathophysiology of childhood obesity in the Hispanic population. PloS one, 2012. 7(12): p. e51954.
40. Traylor, M., et al., Genetic risk factors for ischaemic stroke and its subtypes (the METASTROKE collaboration): a meta-analysis of genome-wide association studies. Lancet neurology, 2012. 11(11): p. 951-62.
41. Bellenguez, C., et al., Genome-wide association study identifies a variant in HDAC9 associated with large vessel ischemic stroke. Nature genetics, 2012. 44(3): p. 328-33.

42. Dong, C., et al., Follow-up association study of linkage regions reveals multiple candidate genes for carotid plaque in Dominicans. Atherosclerosis, 2012. 223(1): p. 177-83.
43. Wang, K., et al., A genome-wide association study on obesity and obesity-related traits. PloS one, 2011. 6(4): p. e18939.
44. Speliotes, E.K., et al., Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nature genetics, 2010. 42(11): p. 937-48.
45. Heard-Costa, N.L., et al., NRXN3 is a novel locus for waist circumference: a genome-wide association study from the CHARGE Consortium. PLoS genetics, 2009. 5(6): p. e1000539.
46. Vercellotti, G.M., Effects of viral activation of the vessel wall on inflammation and thrombosis. Blood coagulation & fibrinolysis : an international journal in haemostasis and thrombosis, 1998. 9 Suppl 2: p. S3-6.
47. Zhou, Y.F., et al., Association between prior cytomegalovirus infection and the risk of restenosis after coronary atherectomy. The New England journal of medicine, 1996. 335(9): p. 624-30.
48. Liu, R., et al., Presence and severity of Chlamydia pneumoniae and Cytomegalovirus infection in coronary plaques are associated with acute coronary syndromes. International heart journal, 2006. 47(4): p. 511-9.
49. Kuparinen, T., et al., Genome-wide association study does not reveal major genetic determinants for anti-cytomegalovirus antibody response. Genes and immunity, 2012. 13(2): p. 184-90.
50. Kathiresan, S., et al., Common variants at 30 loci contribute to polygenic dyslipidemia. Nature genetics, 2009. 41(1): p. 56-65.
51. Aulchenko, Y.S., et al., Loci influencing lipid levels and coronary heart disease risk in 16 European population cohorts. Nature genetics, 2009. 41(1): p. 47-55.
52. Weissglas-Volkov, D., et al., Genomic study in Mexicans identifies a new locus for triglycerides and refines European lipid loci. Journal of medical genetics, 2013. 50(5): p. 298-308.

Chapter 5. Identification of admixture regions associated with risk of apparent treatment-resistant hypertension in African-Americans

Abstract

Objective: Admixture mapping analysis of blood pressure was conducted in a sample of patients taking ≥ 2 anti-hypertensive medications and patients with blood pressure controlled on one medication, followed by fine-mapping analysis in families, to identify risk variants for apparent treatment-resistant hypertension.

Methods: This study conducted admixture mapping analysis in 606 unrelated African-Americans for systolic blood pressure (SBP) and 758 unrelated African-Americans for diastolic blood pressure (DBP) to identify genetic variations associated with risk of apparent treatment-resistant hypertension (aTRH). Regions suggesting evidence for association with local ancestry were further examined with fine-mapping association analysis in 1,495 African-Americans with complete information in families. Linear regression was used to perform the admixture mapping and the fine-mapping association analysis. The analyses were adjusted for age, age², sex, body-mass-index, and number of anti-hypertensive medications, as well as mean global ancestry to account for potential confounding due to population stratification. All of the subjects were from the National Heart, Lung, and Blood Institute’s Family Blood Pressure Program.

Results: The admixture mapping analysis reveals an 8 Mb region on chromosome 3 showing evidence for association between local ancestry and SBP (peak SNP -log10(p-value) = 3.20). Stratified admixture analysis in subjects taking only 2 anti-hypertensive medications identifies a large region on chromosome 5 that is associated with SBP (peak SNP -log10(p-value) = 3.17).

Conclusions: Admixture regions on chromosomes 3 and 5 are associated with aTRH, specifically with SBP in African-American patients taking multiple anti-hypertensive medications. These regions include genes that were previously reported for association with blood pressure traits in a large European-ancestry GWAS of blood pressure, but they have not been replicated in African-Americans [1]. As the results from this study were obtained only in treated patients, it is possible that these variants are associated with treatment response in African-Americans. These admixture regions should be pursued in future studies as targets for drug development and personalized treatment for patients at risk of aTRH and hypertension in general.

Background

With an estimated one-fourth of all adults worldwide suffering from hypertension and approximately 7.5 million deaths each year due to high blood pressure (BP) globally, the importance of preventing and successfully treating hypertension in public health is clear [2]. There are a number of challenges in controlling BP to optimal levels, including selection of effective treatment regimens and patient adherence to treatment and lifestyle changes; however, a subset of patients have high blood pressure levels that are refractory to these interventions.

Apparent treatment-resistant hypertension (aTRH) is diagnosed in patients with uncontrolled hypertension despite use of 3 or more anti-hypertensive medications of different classes, one of which is a diuretic, and in patients with controlled hypertension on ≥ 4 anti-hypertensive medications at optimal treatment levels [3]. Notably, patients with aTRH have fairly low levels of poor treatment adherence [4]. While the prevalence of aTRH is not well-documented, an analysis of the National Health and Nutrition Examination Survey (NHANES) data showed that the prevalence of patients with uncontrolled BP being treated with ≥ 3 medications increased from 15.9% (in the 1998-2004 data) to 28.0% (in the 2005-2008 data) [5]. More recently, a community-based study reported that 30.3% of uncontrolled hypertensive patients were taking at least 3 anti-hypertensive medications and 12.3% of patients with controlled BP required ≥ 4 medications [6]. In that study, the proportion of African-American patients with uncontrolled BP increased with an increasing dose of medications, while the proportion of European-ancestry patients with uncontrolled BP decreased with an increasing dose of medications, implying that the European-ancestry patients likely had a better response to treatment than the African-American patients [6]. Independent risk factors for aTRH include obesity, increased age, female sex and African-American race [5].

There are a limited number of genetic studies for aTRH, so the heritability of this trait is unclear. The heritability of hypertension in general, however, was estimated to be between 34% and 67% based on family and twin studies [7-9]. Variants associated with aTRH include those in the renin-angiotensin-aldosterone and epithelial sodium channel systems, and these variants were identified with candidate gene studies [10-12]. For hypertension in general, genome-wide association studies (GWAS) have successfully identified a number of associated variants. The blood pressure GWAS of the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) and Global BPgen consortia collectively reported 13 loci associated with blood pressure traits, and the International Consortium for Blood Pressure Genome-Wide Association Studies (ICBP GWAS) recently reported 16 novel loci (29 variants at 28 loci overall) for blood pressure in a GWAS of approximately 200,000 subjects [1, 13, 14]. While promising, the 29 variants identified in the ICBP GWAS accounted for only 0.9% of the trait variance. Furthermore, these studies were primarily conducted in subjects of European ancestry, and the generalizability of these associations to other populations is still not clear. Smaller, early GWAS conducted in African-Americans have also reported significant associations between blood pressure and a number of SNPs, but these results were not replicated [15, 16]. Recently, a large GWAS of blood pressure conducted in nearly 30,000 individuals of African ancestry revealed a common set of associated genes in African and non-African ancestry populations, although the total phenotype variation accounted for by these genes was still small [17].

Admixture mapping analysis is a genetic epidemiology study design that is well-suited to the identification of genetic regions that are associated with a trait of interest in admixed populations, such as African-Americans [18, 19]. Admixture mapping utilizes the difference in disease prevalence of ancestral populations to identify genetic variants that are associated with the trait of interest. In particular, it is expected that the frequency of a risk allele differs between ancestral populations, such that one parental population has a higher frequency of the risk allele at the causal variant (“high-risk ancestral population”) than the others. For a recently admixed population, it is assumed that cases have increased ancestry for the high-risk ancestral population, compared to controls. The proportion of ancestry from the high-risk ancestral population at the causal variant is thought to be greater than the average genome-wide ancestry as well, which is consistent with findings from a study of admixed subjects from the National Heart, Lung and Blood Institute’s (NHLBI) Family Blood Pressure Program (FBPP) [20].

In the present study, two-stage analyses were performed to identify genomic and genetic variants associated with systolic blood pressure (SBP) and diastolic blood pressure (DBP) in African-Americans taking 2 or more anti-hypertensive medications and patients with controlled blood pressure on one medication to discover variants for aTRH. While subjects uncontrolled on 3 or more anti-hypertensive drugs are diagnosed with aTRH, subjects uncontrolled on 2 anti-hypertensive drugs are at considerable risk of aTRH and are also of interest for the prevention of aTRH [3]. The first stage of analysis was conducted as an admixture-mapping analysis in unrelated subjects from the FBPP, who were genotyped using 2,507 ancestry informative markers (AIMs) [21]. In the second stage, fine-mapping association analysis was performed for any region that suggested admixture association evidence for blood pressure. The fine-mapping analysis was conducted in families from the FBPP, which accounted for family structure as part of the analysis.

Methods

Sample

The study samples were obtained from the National Heart, Lung and Blood Institute’s (NHLBI) Family Blood Pressure Program (FBPP). The FBPP is a large, multi-site program examining the role of genetic factors in high blood pressure and associated traits in four ethnic groups: African-Americans, Asians and Asian-Americans, Mexican Americans and European Americans [21]. The FBPP networks ascertained families through affected probands with high blood pressure.

The present study was conducted in a sample of subjects taking ≥ 2 anti-hypertensive medications and subjects controlled on 1 anti-hypertensive medication. These patients were recruited by three FBPP networks: the Hypertension Genetic Epidemiology Network (HyperGEN) in Birmingham, Alabama and Forsyth County, North Carolina; the Genetic Epidemiology Network of Atherosclerosis (GENOA) in Jackson, Mississippi; and the GenNet study in Maywood, Illinois. Additional details on the FBPP networks are available elsewhere [21]. The sample was restricted to subjects with age > 18 years and < 80 years and with > 2.5% average African ancestry. The final dataset contained 606 unrelated African-Americans for the admixture mapping of SBP, which included 301 subjects taking ≥ 2 anti-hypertensive drugs and 305 subjects with SBP controlled on 1 drug. The final dataset for the admixture mapping of DBP contained 758 unrelated African-Americans, which included 301 subjects taking ≥ 2 anti-hypertensive drugs and 484 subjects with DBP controlled on 1 drug. The sample sizes differed for the SBP dataset and the DBP dataset because the number of subjects controlled on 1 drug differed by blood pressure trait. The fine-mapping analysis was conducted in 3,567 African-Americans in families (1,495 patients with complete phenotype and genotype data).

Genotyping

The genotyping was described in detail in previous studies [22, 23]. Briefly, 2,593 ancestry-informative markers (AIMs) that were evenly spaced across the genome were identified. These SNPs maximized the allele frequency differences for the relevant ancestral populations: HapMap CEU (Utah residents with Northern and Western European ancestry from the CEPH collection) and HapMap YRI (Yoruba in Ibadan, Nigeria) [24]. The AIMs were chosen from the SNPs that were available on the Illumina Human 1M array, the Illumina 650K array, and the Affymetrix 6.0 array. The unrelated subjects were genotyped on the Illumina iSelect Custom Bead Chip at the University of California San Francisco [22, 25-28]. SNPs with call rates < 95% and Illumina GenTrain scores < 0.7 were dropped, and the final dataset for the admixture analysis contained 2,507 AIMs.

In the fine-mapping analysis, the subjects from the HyperGEN network were genotyped on the Affymetrix Array 6.0 and the subjects from the GENOA network were genotyped on the Illumina 1M array and the Affymetrix Array 6.0. Quality control was conducted separately for the subjects from the two FBPP networks, and SNPs with call rates < 95% were dropped [17]. An additional 872 subjects that were previously not genotyped from the HyperGEN and GENOA networks were genotyped on the Affymetrix Axiom chip, which was designed to increase the coverage of common variants in African ancestral populations. These SNPs were called with the Affymetrix Genotyping Console, and SNPs with call rates < 95% were dropped. The combined datasets were evaluated for Mendelian errors, which were set to missing when identified. Any SNPs showing departure from Hardy-Weinberg equilibrium (p < 0.001) were also excluded. Consequently, 1 SNP was removed from chromosome 3 and 6 SNPs were removed from chromosome 5. The final datasets for fine-mapping analysis contained 7,050 SNPs in the chromosome 3 admixture region and 19,816 SNPs in the chromosome 5 admixture region.
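The SNP-level quality control filters described above (call rate and Hardy-Weinberg equilibrium) can be illustrated with a minimal sketch. The text does not state which Hardy-Weinberg test was applied, so the 1-degree-of-freedom chi-square test used here is an assumption (an exact test is a common alternative), and the function names `hwe_chi_square_p` and `passes_qc` are hypothetical. The thresholds (call rate of 95%, p < 0.001) are those given in the text.

```python
import math


def hwe_chi_square_p(n_aa, n_ab, n_bb):
    """P-value of a chi-square (1 df) test of Hardy-Weinberg equilibrium,
    computed from genotype counts (assumed test; not specified in the text)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0:  # monomorphic SNP: the test is undefined
        return 1.0
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(n_aa, n_ab, n_bb, n_missing, min_call_rate=0.95, hwe_alpha=0.001):
    """Keep a SNP only if its call rate and Hardy-Weinberg p-value are acceptable."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    return call_rate >= min_call_rate and hwe_chi_square_p(n_aa, n_ab, n_bb) >= hwe_alpha


# Hypothetical genotype counts for three SNPs.
print(passes_qc(25, 50, 25, 0))   # in equilibrium, fully called
print(passes_qc(50, 0, 50, 0))    # no heterozygotes: strong departure
print(passes_qc(25, 50, 25, 10))  # call rate below 95%
```

In family data the genotype counts would ordinarily be taken from unrelated founders, since relatedness distorts the expected genotype proportions; that choice is also an assumption of this sketch.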

Statistical Analyses

The phenotypes of interest were SBP and DBP for the discovery of variants associated with risk for aTRH. Subject characteristics for the admixture mapping dataset were summarized using descriptive statistics.

In the admixture mapping, linear regression was conducted as in previous admixture mapping analyses [28, 29]. In the regression model, $z_i$ was the value of the blood pressure trait (SBP or DBP) for subject $i$, $G_{ij}$ was the African ancestry at the $j$th AIM as estimated by ADMIXPROGRAM software, and $\bar{G}_i$ was the calculated mean of $G_{ij}$ for the 2,507 AIMs [30]. The general formula used for the linear regression analysis was

$$z_i = \beta_0 + \beta_1 \bar{G}_i + \beta_2 \left(G_{ij} - \bar{G}_i\right) + \beta_3\,\mathrm{SEX} + \beta_4\,\mathrm{AGE} + \beta_5\,\mathrm{AGE}^2 + \beta_6\,\mathrm{BMI} + \beta_7\,\mathrm{Medications} + \varepsilon_i .$$

The term $\bar{G}_i$ (average African ancestry) was included in the model to adjust for population stratification. The term $(G_{ij} - \bar{G}_i)$ represents the difference between the local ancestry and global ancestry for a given AIM. Consequently, the regression analysis tested the null hypothesis $\beta_2 = 0$. Evidence of association with local ancestry was suggested by a p-value < 0.001, that is, a –log10(p-value) of 3 or more. In addition, an associated admixture region was defined as the interval extending from the peak admixture mapping signal, on either side, to the marker at which the –log10(p-value) had dropped by at least 1 unit. Admixture mapping analysis stratifying on the number of drugs (2 drugs or ≥ 3 drugs) was conducted for the blood pressure trait if significant results were noted for the whole sample.
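To make the regression model above concrete, the following sketch fits it by ordinary least squares at a single AIM and tests the coefficient on the local-minus-global ancestry term. Everything in it is illustrative: the data are simulated, the effect sizes (including `TRUE_BETA2`) are hypothetical, local ancestry is drawn as 0, 0.5 or 1 rather than estimated by ADMIXPROGRAM, and the p-value uses a normal approximation to the t test. The helper `ols` is a small hand-written solver, not a function from any of the software packages used in this study.

```python
import math
import random


def ols(X, y):
    """Ordinary least squares: returns coefficients and their standard errors."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    # Invert X'X by Gauss-Jordan elimination on the augmented matrix [X'X | I].
    aug = [row + [float(i == j) for j in range(k)] for i, row in enumerate(xtx)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [v / d for v in aug[c]]
        for r in range(k):
            if r != c:
                f = aug[r][c]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[c])]
    inv = [row[k:] for row in aug]
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    se = [math.sqrt(sigma2 * inv[a][a]) for a in range(k)]
    return beta, se


rng = random.Random(1)
TRUE_BETA2 = -10.0  # hypothetical local-ancestry effect on the trait
X, z = [], []
for _ in range(600):
    g_bar = rng.uniform(0.6, 0.95)  # global (mean) African ancestry
    g_local = sum(rng.random() < g_bar for _ in range(2)) / 2  # local ancestry at one AIM
    sex, age = rng.randint(0, 1), rng.uniform(18, 80)
    bmi, meds = rng.gauss(31, 6), rng.randint(1, 4)
    # Columns: intercept, G_bar, (G_ij - G_bar), SEX, AGE, AGE^2, BMI, Medications
    X.append([1.0, g_bar, g_local - g_bar, sex, age, age ** 2, bmi, meds])
    z.append(120 + 5 * g_bar + TRUE_BETA2 * (g_local - g_bar)
             + 0.3 * age + 0.2 * bmi + 2 * meds + rng.gauss(0, 5))

beta, se = ols(X, z)
t2 = beta[2] / se[2]
p2 = math.erfc(abs(t2) / math.sqrt(2))  # two-sided p-value, normal approximation
print(f"beta2 = {beta[2]:.2f}, -log10(p) = {-math.log10(p2):.1f}")
```

In a genome scan, the same fit would be repeated at each AIM, with only the third column changing from marker to marker.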

The fine-mapping association analysis was conducted in families, using S.A.G.E. software (v. 6.3.0). Linear regression using a mixed model was used to account for familial correlations among pedigree members with the ASSOC program of S.A.G.E. It was advantageous to use the family-based analysis as it included additional information about the trait-marker relationship that would not be considered in the use of only unrelated patients. For both SBP and DBP, the trait was adjusted for key covariates age, age², body-mass-index (BMI), sex, number of anti-hypertensive drugs, and global mean ancestry by regressing these covariates on the phenotype using linear regression. The residuals from these models were used in the SNP-trait association analysis. Fine-mapping analysis was also conducted for regions showing significant association evidence in the stratified admixture mapping.
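The two-step adjustment described above, in which the trait is first regressed on the covariates and the residuals are then tested against each SNP, can be sketched as follows. This is a simplified illustration only: the data are hypothetical, only two covariates are shown, and the second step uses a simple regression slope rather than the mixed model of the S.A.G.E. ASSOC program, so familial correlations are ignored here. The names `residualize` and `slope` are illustrative.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def residualize(y, covariates):
    """Residuals of y after least-squares regression on an intercept plus covariates.

    `covariates` is a list of columns, one list per covariate.
    """
    basis = []  # orthonormal basis of the covariate space (modified Gram-Schmidt)
    for column in [[1.0] * len(y)] + covariates:
        v = [float(x) for x in column]
        for q in basis:
            proj = dot(v, q)
            v = [vi - proj * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-9:  # skip columns linearly dependent on earlier ones
            basis.append([vi / norm for vi in v])
    resid = [float(x) for x in y]
    for q in basis:
        proj = dot(resid, q)
        resid = [ri - proj * qi for ri, qi in zip(resid, q)]
    return resid


def slope(x, y):
    """Slope from a simple linear regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    xc = [a - mx for a in x]
    return dot(xc, [b - my for b in y]) / dot(xc, xc)


# Hypothetical toy data: SBP, two covariates, and SNP allele counts (0, 1, 2).
sbp = [128.0, 141.0, 135.0, 150.0, 122.0, 138.0]
age = [40.0, 55.0, 48.0, 62.0, 35.0, 51.0]
bmi = [27.0, 33.0, 30.0, 36.0, 25.0, 31.0]
snp = [0, 1, 0, 2, 1, 2]

adjusted_sbp = residualize(sbp, [age, bmi])  # step 1: covariate-adjusted trait
beta_snp = slope(snp, adjusted_sbp)          # step 2: SNP effect on the residuals
print(f"estimated SNP effect on adjusted SBP: {beta_snp:.3f}")
```

By construction, the residuals are uncorrelated with every covariate used in the first step, which is what allows the second step to be run without repeating the covariates for each SNP.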

The fine-mapping analysis results were corrected for multiple comparisons using the false discovery rate method as described by [31]. Population stratification was accounted for in the analysis by including the first ten principal components (PCs) as demonstrated by [32]. Specifically, the PCs were calculated for the founders in the families, and the PCs for the rest of the family members were estimated with regard to their genotype data. The PC analysis was performed with FamCC (v.1.0) software.
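As an illustration of a false discovery rate correction, the sketch below implements the Benjamini-Hochberg step-up procedure. Whether this is the exact procedure of reference [31] is an assumption, since the text does not name the method; the FDR level of 0.05 and the p-values are likewise hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for ten SNPs.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(p_vals, q=0.05)
print(significant)
```

With several thousand SNPs in a fine-mapping region, a SNP is declared significant only if its p-value falls below a rank-dependent threshold, which is less stringent than a Bonferroni correction but still guards against false positives.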

The analyses in this study were conducted using PLINK software, SAS 9.2 (SAS Institute Inc., Cary, North Carolina, USA), S.A.G.E. software, FamCC (v.1.0) software, and Stata 9.2 (StataCorp. 2005. Stata Statistical Software: Release 9. College Station, TX: StataCorp LP) [33]. Additional information about the chromosomal regions and SNPs was obtained from the Database of Single Nucleotide Polymorphisms (dbSNP), the UCSC Genome Browser database, the SNP Annotation and Proxy Search (SNAP) tool, the National Human Genome Research Institute’s (NHGRI) Catalog of Published Genome-Wide Association Studies, the Database of Genotypes and Phenotypes (dbGaP), and the SNP Selection and Functional Information (SNPinfo) tool [34-39].

Results

Two-stage analyses were conducted to identify genomic and genetic variants associated with risk of aTRH in African-Americans. Descriptive statistics for the admixture mapping sample are presented in Table 1.

Table 2 presents the admixture mapping results. One region on chromosome 3 shows significant association evidence with SBP, and the peak is located at rs1303629 in RARB (chromosome 3p24.2; –log10(p-value) = 3.20) (Figure 1). The region surrounding the peak ranges from rs7652410 in ZNF385D (chromosome 3p24.3; –log10(p-value) = 1.70) to rs2197896 in RBMS3 (chromosome 3p24.1; –log10(p-value) = 1.32), spanning approximately 8 Mb. In the fine-mapping analysis, 7,050 SNPs were tested on chromosome 3 from 3p24.3 to 3p24.1. After accounting for multiple comparisons using the false-discovery rate method, no SNPs were significantly associated with SBP.
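The way an admixture region is delimited around a peak, by moving outward on each side until the –log10(p-value) has fallen at least 1 unit below the peak, can be sketched as follows. The marker positions and signal values in the example are illustrative and are not study data; the function name `admixture_region` is hypothetical.

```python
def admixture_region(positions_mb, neg_log10_p, drop=1.0):
    """Return (start_mb, end_mb, peak_index) for the region around the peak signal.

    Markers must be ordered along the chromosome. On each side of the peak,
    the region ends at the first marker whose -log10(p) is at least `drop`
    below the peak value, or at the last available marker if none qualifies.
    """
    peak = max(range(len(neg_log10_p)), key=lambda i: neg_log10_p[i])
    cutoff = neg_log10_p[peak] - drop
    left = peak
    while left > 0 and neg_log10_p[left] > cutoff:
        left -= 1
    right = peak
    while right < len(neg_log10_p) - 1 and neg_log10_p[right] > cutoff:
        right += 1
    return positions_mb[left], positions_mb[right], peak


# Illustrative markers (positions in Mb) and admixture mapping signals.
positions = [19.0, 21.0, 23.0, 25.0, 26.5, 27.5, 29.0, 31.0]
signals = [0.5, 1.7, 2.6, 3.2, 2.9, 2.4, 1.3, 0.9]

start, end, peak_index = admixture_region(positions, signals)
print(f"peak at {positions[peak_index]} Mb; region {start}-{end} Mb")
```

Because the boundary is placed at an observed marker, the drop at the boundary is generally somewhat larger than 1 unit, as is the case for the flanking markers reported above.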

In the stratified analysis, the admixture mapping was conducted for subjects taking 2 anti-hypertensive drugs and for subjects taking ≥ 3 anti-hypertensive drugs separately. The significant results from these analyses are presented in Table 2. The 2-drug stratum analysis was conducted in 240 subjects, and one region on chromosome 5 shows significant admixture association evidence with SBP (Figure 2). The peak marker is at rs17057017 (chromosome 5q33.3; –log10(p-value) = 3.17). The admixture range around the peak extends 22.5 Mb from rs4705411 in HMGXB3 (chromosome 5q32; –log10(p-value) = 2.10) to rs1107776 (chromosome 5q35.1; –log10(p-value) = 1.91). In the fine-mapping analysis in subjects taking 2 anti-hypertensive drugs, no SNPs are associated with SBP after accounting for multiple comparisons. Stratified analyses were conducted for SBP in patients taking ≥ 3 anti-hypertensive drugs, but no admixture regions demonstrated significant association evidence.

Similar admixture mapping analysis was performed for DBP, but no chromosomal regions showed significant ancestry association.

Discussion

The present study is the first admixture mapping analysis conducted for risk of apparent treatment-resistant hypertension. Genomic variants on chromosomes 3 and 5 are identified for risk of aTRH in African-Americans after conducting admixture mapping analyses of blood pressure and subsequent fine-mapping analyses. The admixture region on chromosome 3 shows association evidence for SBP in all patients in the study, and the admixture region on chromosome 5 shows association for SBP in patients taking 2 anti-hypertensive medications.

For SBP in all patients, the admixture peak rs1303629 is located at chromosome 3p24.2. The peak SNP is within 2 Mb of the index SNP rs13082711 in SLC4A7 that was reported for significant association with DBP and association with SBP in the ICBP GWAS [1]. The effect of African ancestry in this region was protective in the current study, which was consistent with the protective effect of the SLC4A7 locus in subjects of European ancestry and subjects of African ancestry in the ICBP GWAS.

For SBP in patients treated with 2 anti-hypertensive medications, the admixture

peak rs17057017 is located at chromosome 5q33.3. This admixture peak is in the same region as the ICBP GWAS index SNP rs11953630 (EBF1), which was reported for significant association with SBP and DBP [1]. In the chromosome 5 admixture region,

African ancestry is associated with increased blood pressure in the current study, but the effect of the EBF1 locus was protective in subjects of European ancestry and subjects of

African ancestry in the ICBP GWAS.

Replication of the ICBP results in African-Americans has been a challenging task

[1, 17]. As the findings in this study are reported for association with blood pressure in treated patients, these variants may be associated with treatment effects in hypertensives, rather than blood pressure itself. The proportion of treated patients in the ICBP GWAS was not reported, so it is possible that some of the findings from this GWAS were partly driven by treatment effects as well. Previous studies in African-Americans have had difficulty in replicating the results from GWAS conducted primarily in subjects of

European ancestry [15, 16, 40]. In addition to disparities in allele frequencies, inadequate tagging SNPs on genotyping arrays, and differences in linkage disequilibrium patterns, this study suggests that anti-hypertensive treatment resistance may also contribute to the difficulty in replicating the previous findings.

The current study suggests two genomic regions containing several genes that may be associated with blood pressure that is resistant to anti-hypertensive treatment. In the primary analysis, RARB on chromosome 3 has not been previously reported for association with blood pressure traits, but it was reported for risk of extreme obesity in a

GWAS of subjects of European ancestry [41]. In addition to SLC4A7 in the ICBP

GWAS, the admixture region on chromosome 3 included UBE2E2 and RBMS3, which were previously reported for association with blood pressure [1, 42, 43]. In the stratified analysis, an admixture region on chromosome 5 shows suggestive association evidence with SBP in patients taking 2 anti-hypertensive medications. As noted earlier, EBF1 in the chromosome 5 admixture region was reported for association with blood pressure in the ICBP GWAS, as well as for association with stroke and coronary artery disease in other analyses [1, 44, 45]. A number of other genes in the same admixture region were

reported for association with blood pressure, including SGCD, GABRA6, GLRXP3,

RPLP0P9, ODZ2, FGF18, and C5orf50 [42, 46, 47].

Despite several significant results in the admixture mapping analyses, no SNPs

are significantly associated with SBP in the fine-mapping for the primary or stratified

analyses after correcting for multiple comparisons. This was not unexpected as the study

had limited power because of small sample sizes. Furthermore, the trait definition of aTRH was expanded to risk of aTRH in this study due to limited treatment information and small samples, and this broader phenotype definition may have introduced some trait heterogeneity.
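As a hedged illustration of what correcting for multiple comparisons involves in a fine-mapping step, the sketch below applies a Benjamini-Hochberg step-up false discovery rate procedure in Python. Treating an FDR-type correction as a fair stand-in is an assumption, since this passage does not restate which correction was applied. The function name, the p-values, and the 0.05 level are hypothetical and are not study data.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which tests are rejected at FDR level q.

    Step-up rule: find the largest rank k (1-based, p-values sorted ascending)
    with p_(k) <= k * q / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_passing_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            largest_passing_rank = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_passing_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight SNPs in one fine-mapped region (not study data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
example_rejected = benjamini_hochberg(example_p, q=0.05)
print(example_rejected)
```

With few subjects, per-SNP p-values are rarely small enough to pass even the first few step-up thresholds, which is one way to see why limited power leads to no significant fine-mapping results.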

As many of the results from this study are supported by previous studies, the

genomic variants identified in this study may be of interest for future replication and

association studies of aTRH. These findings may be important in the future for

developing tools to predict risk of aTRH, identifying patients who may benefit from more

aggressive treatment soon after diagnosis, and discovering variants that may be potential

drug targets.

Acknowledgements

This work was supported by National Institutes of Health grants HL086718 and HL007567-29 (T32) from the National Heart, Lung and Blood Institute.

Some of the results of this paper were obtained by using the software package S.A.G.E., which was supported by a U.S. Public Health Service Resource Grant (RR03655) from the National Center for Research Resources. The authors wish to acknowledge the members of the International HapMap Consortium and the communities of the

International HapMap Project for their contributions as well.

Table 1. Summary statistics

                            Subjects on 2+ Drugs                      Subjects with SBP controlled on 1 Drug    Subjects with DBP controlled on 1 Drug
Variable                    N (%)       Mean, Median    Min, Max      N (%)       Mean, Median    Min, Max      N (%)       Mean, Median    Min, Max
Average African Ancestry    301         84%, 86%        41%, 98%      305         85%, 86%        49%, 98%      484         84%, 86%        34%, 98%
Age (years)                 301         54, 55          29, 75        305         52, 53          23, 77        484         54, 55          23, 78
BMI (kg/m2)                 301         33, 32          17, 71        305         32, 31          18, 61        484         33, 31          18, 63
SBP (mm Hg)                 301         139, 137        72, 221       305         123, 124        81, 140       .           .               .
DBP (mm Hg)                 301         77, 76          47, 131       .           .               .             482         74, 75          45, 90
Sex
  Male                      113 (38%)   .               .             85 (28%)    .               .             131 (27%)   .               .
  Female                    118 (62%)   .               .             220 (72%)   .               .             353 (73%)   .               .
Number of Anti-hypertensive Medications
  Two                       240 (80%)   .               .             .           .               .             .           .               .
  Three                     49 (16%)    .               .             .           .               .             .           .               .
  Four                      10 (3%)     .               .             .           .               .             .           .               .
  Five                      5 (1%)      .               .             .           .               .             .           .               .

Table 2. Significant results from the admixture mapping analysis

Sample     Chromosome (Admixture Region)    Peak SNP (Location)     Effect    Standard Error    -log10(P-value)
All        3 (3p24.3 – 3p24.1)              rs1303629 (3p24.2)      -12.13    3.53              3.20
2 drugs    5 (5q32 – 5q35.1)                rs17057017 (5q33.3)     21.73     6.31              3.17
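The quantities in Table 2 can be related to one another with a short calculation. The Python sketch below divides each effect by its standard error and converts the resulting statistic to –log10(p) under a two-sided normal approximation. That approximation is an assumption made for illustration; the test statistic actually used in the admixture mapping software may differ, so small discrepancies from the reported values are expected. The function names are hypothetical.

```python
import math


def wald_neg_log10_p(effect, standard_error):
    """-log10 of the two-sided p-value for z = effect / SE under a normal approximation."""
    z = effect / standard_error
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail area of N(0, 1)
    return -math.log10(p_value)


def p_from_neg_log10(neg_log10_p):
    """Convert a reported -log10(p) back to a p-value."""
    return 10.0 ** (-neg_log10_p)


# Rows of Table 2: (sample, effect, standard error, reported -log10(p-value))
table_2 = [
    ("All", -12.13, 3.53, 3.20),
    ("2 drugs", 21.73, 6.31, 3.17),
]

for sample, effect, se, reported in table_2:
    approx = wald_neg_log10_p(effect, se)
    print(
        f"{sample}: z = {effect / se:.2f}, "
        f"approximate -log10(p) = {approx:.2f}, reported = {reported:.2f}, "
        f"reported p = {p_from_neg_log10(reported):.1e}"
    )
```

The sign of the effect carries the direction of the ancestry association (negative for the chromosome 3 region, positive for the chromosome 5 region), while the p-value depends only on the magnitude of effect divided by standard error.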

Figure 1. Admixture mapping in all patients

[Plot not reproduced. Panel title: All Patients. X-axis: Distance along Chromosome 3 (Mb), 0 to 200. Y-axis: -log(p-value), 0 to 3.]

Figure 2. Admixture mapping in patients taking 2 anti-hypertensive drugs

[Plot not reproduced. Panel title: Patients Taking 2 Anti-Hypertensive Medications. X-axis: Distance along Chromosome 5 (Mb), 0 to 200. Y-axis: -log(p-value), 0 to 3.]

References

1. Ehret, G.B., et al., Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature, 2011. 478(7367): p. 103-9.
2. WHO, Global health risks: mortality and burden of disease attributable to selected major risks. 2009: Geneva.
3. Calhoun, D.A., et al., Resistant hypertension: diagnosis, evaluation, and treatment: a scientific statement from the American Heart Association Professional Education Committee of the Council for High Blood Pressure Research. Circulation, 2008. 117(25): p. e510-26.
4. Irvin, M.R., et al., Prevalence and correlates of low medication adherence in apparent treatment-resistant hypertension. Journal of clinical hypertension, 2012. 14(10): p. 694-700.
5. Egan, B.M., et al., Uncontrolled and apparent treatment resistant hypertension in the United States, 1988 to 2008. Circulation, 2011. 124(9): p. 1046-58.
6. Egan, B.M., et al., Prevalence of Optimal Treatment Regimens in Patients With Apparent Treatment-Resistant Hypertension Based on Office Blood Pressure in a Community-Based Practice Network. Hypertension, 2013.
7. Levy, D., et al., Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the Framingham Heart Study. Hypertension, 2000. 36(4): p. 477-83.
8. Kupper, N., et al., Heritability of daytime ambulatory blood pressure in an extended twin design. Hypertension, 2005. 45(1): p. 80-5.
9. Hottenga, J.J., et al., Heritability and stability of resting blood pressure. Twin research and human genetics : the official journal of the International Society for Twin Studies, 2005. 8(5): p. 499-508.
10. Jones, E.S., E.P. Owen, and B.L. Rayner, The association of the R563Q genotype of the ENaC with phenotypic variation in Southern Africa. American journal of hypertension, 2012. 25(12): p. 1286-91.
11. Cruz-Gonzalez, I., et al., Association between -T786C NOS3 polymorphism and resistant hypertension: a prospective cohort study. BMC cardiovascular disorders, 2009. 9: p. 35.
12. Donner, K.M., et al., CYP2C9 genotype modifies activity of the renin-angiotensin-aldosterone system in hypertensive men. Journal of hypertension, 2009. 27(10): p. 2001-9.
13. Newton-Cheh, C., et al., Genome-wide association study identifies eight loci associated with blood pressure. Nature genetics, 2009. 41(6): p. 666-76.
14. Levy, D., et al., Genome-wide association study of blood pressure and hypertension. Nature genetics, 2009. 41(6): p. 677-87.
15. Fox, E.R., et al., Association of genetic variation with systolic and diastolic blood pressure among African Americans: the Candidate Gene Association Resource study. Human molecular genetics, 2011. 20(11): p. 2273-84.
16. Adeyemo, A., et al., A genome-wide association study of hypertension and blood pressure in African Americans. PLoS genetics, 2009. 5(7): p. e1000564.
17. Franceschini, N., et al., Genome-wide association analysis of blood pressure traits in nearly 30,000 African ancestry individuals reveals a common set of associated genes in African and non-African populations. American journal of human genetics, 2013. In press.
18. Zhu, X., H. Tang, and N. Risch, Admixture mapping and the role of population structure for localizing disease genes. Advances in genetics, 2008. 60: p. 547-69.
19. Smith, M.W. and S.J. O'Brien, Mapping by admixture linkage disequilibrium: advances, limitations and guidelines. Nature reviews. Genetics, 2005. 6(8): p. 623-32.

20. Tang, H., et al., Racial admixture and its impact on BMI and blood pressure in African and Mexican Americans. Human genetics, 2006. 119(6): p. 624-33.
21. FBPP, Multi-center genetic study of hypertension: The Family Blood Pressure Program (FBPP). Hypertension, 2002. 39(1): p. 3-9.
22. Shetty, P.B., et al., Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping. Journal of hypertension, 2012. 30(10): p. 1970-6.
23. Shetty, P.B., et al., Novel Variants for HDL-C, LDL-C and Triglycerides Identified from Admixture Mapping and Fine-Mapping Analysis in Families (in preparation). 2013.
24. The International HapMap Project. Nature, 2003. 426(6968): p. 789-96.
25. Basu, A., et al., Admixture mapping of quantitative trait loci for blood lipids in African-Americans. Human molecular genetics, 2009. 18(11): p. 2091-8.
26. Zhu, X. and R.S. Cooper, Admixture mapping provides evidence of association of the VNN1 gene with hypertension. PloS one, 2007. 2(11): p. e1244.
27. Zhu, X., et al., Admixture mapping for hypertension loci with genome-scan markers. Nature genetics, 2005. 37(2): p. 177-81.
28. Zhu, X., et al., Combined admixture mapping and association analysis identifies a novel blood pressure genetic locus on 5p13: contributions from the CARe consortium. Human molecular genetics, 2011. 20(11): p. 2285-95.
29. Basu, A., et al., Admixture mapping of quantitative trait loci for BMI in African Americans: evidence for loci on chromosomes 3q, 5q, and 15q. Obesity, 2009. 17(6): p. 1226-31.
30. Zhu, X., et al., A classical likelihood based approach for admixture mapping using EM algorithm. Human genetics, 2006. 120(3): p. 431-45.
31. Benjamini, Y. and Y. Hochberg, Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 1995. 57(1): p. 289-300.
32. Zhu, X., et al., A unified association analysis approach for family and unrelated samples correcting for stratification. American journal of human genetics, 2008. 82(2): p. 352-65.
33. Purcell, S., et al., PLINK: a tool set for whole-genome association and population-based linkage analyses. American journal of human genetics, 2007. 81(3): p. 559-75.
34. Database of Single Nucleotide Polymorphisms (dbSNP). Bethesda (MD): National Center for Biotechnology Information, National Library of Medicine. (dbSNP Build ID: 137). Available from: http://www.ncbi.nlm.nih.gov/SNP/.
35. Johnson, A.D., et al., SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics, 2008. 24(24): p. 2938-9.
36. Meyer, L.R., et al., The UCSC Genome Browser database: extensions and updates 2013. Nucleic acids research, 2013. 41(Database issue): p. D64-9.
37. Hindorff, L.A., et al., A Catalog of Published Genome-Wide Association Studies.
38. Mailman, M.D., et al., The NCBI dbGaP database of genotypes and phenotypes. Nature genetics, 2007. 39(10): p. 1181-6.
39. Xu, Z. and J.A. Taylor, SNPinfo: integrating GWAS and candidate gene information into functional SNP selection for genetic association studies. Nucleic acids research, 2009. 37(Web Server issue): p. W600-5.
40. Ehret, G.B., et al., Replication of the Wellcome Trust genome-wide association study of essential hypertension: the Family Blood Pressure Program. European journal of human genetics : EJHG, 2008. 16(12): p. 1507-11.
41. Cotsapas, C., et al., Common body mass index-associated variants confer risk of extreme obesity. Human molecular genetics, 2009. 18(18): p. 3502-7.
42. Levy, D., et al., Framingham Heart Study 100K Project: genome-wide associations for blood pressure and arterial stiffness. BMC medical genetics, 2007. 8 Suppl 1: p. S3.

43. Sabatti, C., et al., Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nature genetics, 2009. 41(1): p. 35-46.
44. Genome-wide association between genotype and incident stroke in African-American participants. Accession: pha002887.1, in STAMPEED: Cardiovascular Health Study (CHS) GWAS to identify genetic variants associated with aging and CVD risk factors and events. dbGaP Study Accession: phs000226.v3.p1.
45. BLOM Transformed CAC, Coronary Calcium, Phase-1, Visit-2. Accession: pha003031.1, in NHLBI Family Heart Study (FamHS-Visit1 and FamHS-Visit2). dbGaP Study Accession: phs000221.v1.p1.
46. O'Donnell, C.J., et al., Genome-wide association study for subclinical atherosclerosis in major arterial territories in the NHLBI's Framingham Heart Study. BMC medical genetics, 2007. 8 Suppl 1: p. S4.
47. Lowe, J.K., et al., Genome-wide association studies in an isolated founder population from the Pacific Island of Kosrae. PLoS genetics, 2009. 5(2): p. e1000365.

Chapter 6. Discussion

The studies conducted in this dissertation demonstrated the benefits of applying

modified methods of analysis to address issues of missing heritability in genetic studies

for the discovery of variants associated with hypertension and related cardiovascular

traits. The methods applied identified novel genetic and genomic associations, as well as

those that were consistent with published literature. Furthermore, these methods improved the efficiency of discovering variants over traditional methods, such as GWAS,

by: reducing the number of hypotheses being tested with gene scores and admixture

mapping in the first stage of analysis, improving the prior probabilities of test hypotheses

with a two-stage study design, and using samples in families in association analyses to

account for heritability in the phenotype-SNP relationship.
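The benefit of testing fewer hypotheses can be made concrete with a Bonferroni-style calculation. The Python sketch below compares the per-test significance threshold for a single-stage scan of every SNP with the threshold for a second-stage fine-mapping analysis restricted to regions carried forward from the first stage. The test counts are hypothetical round numbers chosen only for illustration, and Bonferroni is used here as a simple assumed correction, not as a description of the corrections applied in the three papers.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical counts, chosen only to show the scale of the difference.
GENOME_WIDE_SNP_TESTS = 1_000_000  # single-stage scan of every genotyped SNP
FINE_MAPPING_SNP_TESTS = 2_000     # SNPs inside regions carried forward from stage one

single_stage = bonferroni_threshold(0.05, GENOME_WIDE_SNP_TESTS)
second_stage = bonferroni_threshold(0.05, FINE_MAPPING_SNP_TESTS)

print(f"single-stage per-SNP threshold: {single_stage:.1e}")
print(f"second-stage per-SNP threshold: {second_stage:.1e}")
```

Under these assumed counts the second-stage threshold is several hundred times less stringent, so an association of a given strength is easier to detect at the same sample size.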

Across the three studies, one of the key limitations was the difficulty in

replicating the results in an independent dataset or identifying significant signals in

follow-up fine-mapping analyses. In the first paper, the association results were not

replicated in the CARe dataset; this was attributed to differences in ascertainment and

phenotypic heterogeneity between the datasets and the use of imputed variants in the

replication dataset. In the second paper, fine-mapping association analysis was

performed with the goal of identifying SNPs in seven admixture regions associated with

SBP (on chromosomes 3 and 5), total cholesterol, HDL-C, LDL-C and triglycerides (on

chromosomes 14 and 19). Significant fine-mapping results were not reported for SBP,

total cholesterol and triglycerides (on chromosome 19). In the third paper, fine-mapping

analyses did not identify any SNPs that were significantly associated with blood pressure

in treated patients in the admixture regions in the whole sample or in the 2-medication

stratum. These difficulties in identifying significant results in the fine-mapping analyses

in the second and third papers are attributed to small sample sizes and limited power to

detect differences. In addition, the SNPs may have tagged causal variants in African-Americans inadequately, and this concern was illustrated in the third paper. The significant admixture regions included SNPs that were reported in a larger blood pressure

GWAS of Europeans, but the findings were only seen in treated patients in the current study, indicating that the variants reported in Europeans may be responsible for treatment effects rather than variation in blood pressure.

To further the contributions of these results to cardiovascular disease research, the results reported in these studies must be followed up with replication and functional studies, which may address some of the limitations in the initial studies. Replication studies must be conducted in other racial/ethnic groups, including other admixed populations, to evaluate whether the results are likely to be true and whether the results can be generalized to other communities. These studies are aided by recent, increasingly diverse genetic reference panels to determine whether the variants are population-specific or common to other communities. Replication analyses should also utilize newer genotyping methods, such as next generation sequencing, to examine the associations reported for the variants in these studies. These newer methods allow for improved detection of less common variants, which is important in genetic studies of non-European populations as many of the tagging SNPs in traditional arrays are selected from European populations that may not tag causal variants in non-European populations, including

African-Americans. As a result, using newer technology may be instrumental in identifying causal variants due to the higher resolution of these methods.

Specifically, the significant genomic and genetic results reported in the current

studies should be assessed in other African-American datasets, as well as in datasets of

other ethnicities. Next generation sequencing and arrays that have more population-specific variants may aid in identifying causal variants that may be poorly tagged by

traditional tagging SNPs or variants that may be population-specific. In addition, fine-mapping analyses of the significant admixture regions in the second and third papers

should be conducted in larger samples and with next generation sequencing to improve

the chances of identifying specific variants associated with blood pressure and cardiovascular phenotypes. Future replication studies that utilize next generation

sequencing may also benefit from the two-stage study design that was implemented in the

second and third papers to reduce the number of hypotheses being tested.

Functional studies are also necessary to better understand the role of these results

in the larger disease framework. While functional studies traditionally focused on

transcription and translation of the gene and protein-protein interactions, these studies can

be improved with the incorporation of relevant “omics” data as well. As the data from

metabolomics, proteomics, transcriptomics, and genomics tend to be quite dense, multi-stage analyses to narrow regions of interest are particularly helpful to reduce the amount

of data that must be managed.

In the studies presented here, the first paper utilized a biologically-relevant

framework with gene scores as the variant of interest in the association analyses. These

results can be extended by identifying other genes in the same pathways as CXADR and

F2RL1 for blood pressure and obesity and then conducting pathway-based association

studies for these traits. In the second paper, the significant results for HDL-C, LDL-C

and triglycerides from the fine-mapping analyses should also be followed up with

pathway association analyses to improve the biological and clinical interpretation of these

findings. The pathway analyses of lipid metabolism and these variants may be improved

with the incorporation of other “omics” data to better understand the roles of these

variants in lipid metabolism. For the third paper, following successful fine-mapping

analyses of the significant admixture regions in the future, the variants should be assessed

in pathways to determine why the variants were associated with blood pressure in Europeans in the literature but with treatment response in African-Americans. Functional analyses involving “omics” data may also be incorporated in

these analyses to determine whether there are population-specific differences in the

“omics” data as well.

Finally, the results from these studies and those from replication and functional

studies should be evaluated with the goal of clinically- and biologically-meaningful data interpretation. Methods focusing on pathways and systems biology aid in making reasonable attempts at this goal. Currently, some of the key limitations for these methods

include the management of extremely large and dense datasets, identification and

application of suitable statistical methods for the new types of data, and appropriate

evaluation of the collective results.

While the initial excitement for personalized medicine that followed the

sequencing of the human genome has not yet been realized as imagined, it may be too

early to abandon this goal. Instead, this goal has become more interesting as multiple

layers of data must be considered simultaneously for many complex diseases. As the

development of newer methods in laboratory technology, biostatistics, and computing advances health research and knowledge, these steps also result in improved translational medicine and public health.

References For Chapters 1 and 2

1. Kearney PM, Whelton M, Reynolds K, Muntner P, Whelton PK, He J. Global burden of hypertension: analysis of worldwide data. Lancet. 2005; 365 (9455):217-23. 2. Lewington S, Clarke R, Qizilbash N, Peto R, Collins R. Age-specific relevance of usual blood pressure to vascular mortality: a meta-analysis of individual data for one million adults in 61 prospective studies. Lancet. 2002; 360 (9349):1903-13. 3. Chobanian AV, Bakris GL, Black HR, Cushman WC, Green LA, Izzo JL, Jr., et al. Seventh report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure. Hypertension. 2003; 42 (6):1206-52. 4. Yoon SS, Ostchega Y, Louis T. Recent trends in the prevalence of high blood pressure and its treatment and control, 1999-2008. NCHS Data Brief. 2010; (48):1-8. 5. NHLBI NH, Lung, and Blood Institute, US Department of Health and Human Services. Your Guide to Lowering Blood Pressure. National Institutes of Health Publication No 03-5232. 2003. 6. Redmond N, Baer HJ, Hicks LS. Health behaviors and racial disparity in blood pressure control in the national health and nutrition examination survey. Hypertension. 2011; 57 (3):383-9. 7. Kotchen TA. Chapter 241: Hypertensive Vascular Disease. In: Fauci AS, Braunwald E, Kasper DL, Hauser SL, Longo DL, Jameson JL, et al., editors. Harrison's Principles of Internal Medicine, 17e; 2008. 8. Poirier P, Giles TD, Bray GA, Hong Y, Stern JS, Pi-Sunyer FX, et al. Obesity and cardiovascular disease: pathophysiology, evaluation, and effect of weight loss: an update of the 1997 American Heart Association Scientific Statement on Obesity and Heart Disease from the Obesity Committee of the Council on Nutrition, Physical Activity, and Metabolism. Circulation. 2006; 113 (6):898-918. 9. Hottenga JJ, Boomsma DI, Kupper N, Posthuma D, Snieder H, Willemsen G, et al. Heritability and stability of resting blood pressure. Twin Res Hum Genet. 2005; 8 (5):499-508. 10. 
Kupper N, Willemsen G, Riese H, Posthuma D, Boomsma DI, de Geus EJ. Heritability of daytime ambulatory blood pressure in an extended twin design. Hypertension. 2005; 45 (1):80-5. 11. Levy D, DeStefano AL, Larson MG, O'Donnell CJ, Lifton RP, Gavras H, et al. Evidence for a gene influencing blood pressure on chromosome 17. Genome scan linkage results for longitudinal blood pressure phenotypes in subjects from the framingham heart study. Hypertension. 2000; 36 (4):477-83. 12. Seda O, Tremblay J, Gaudet D, Brunelle PL, Gurau A, Merlo E, et al. Systematic, genome-wide, sex-specific linkage of cardiovascular traits in French Canadians. Hypertension. 2008; 51 (4):1156-62. 13. Doris PA. The genetics of blood pressure and hypertension: the role of rare variation. Cardiovasc Ther. 2011; 29 (1):37-45. 14. Risch N, Merikangas K. The future of genetic studies of complex human diseases. Science. 1996; 273 (5281):1516-7. 15. Lander ES. The new genomics: global views of biology. Science. 1996; 274 (5287):536-9.

16. Chakravarti A. Population genetics--making sense out of sequence. Nat Genet. 1999; 21 (1 Suppl):56-60. 17. McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JP, et al. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008; 9 (5):356-69. 18. Pearson TA, Manolio TA. How to interpret a genome-wide association study. Jama. 2008; 299 (11):1335-44. 19. Leal SM. Detection of genotyping errors and pseudo-SNPs via deviations from Hardy-Weinberg equilibrium. Genet Epidemiol. 2005; 29 (3):204-14. 20. Hartl DL, Clark AG. Principles of Population Genetics. 4th edition ed. Sunderland, MA: Sinauer Associates, Inc.; 2007. 21. Cantor RM, Lange K, Sinsheimer JS. Prioritizing GWAS results: A review of statistical methods and recommendations for their application. Am J Hum Genet. 2010; 86 (1):6-22. 22. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007; 81 (3):559-75. 23. Ehret GB. Genome-wide association studies: contribution of genomics to understanding blood pressure and essential hypertension. Curr Hypertens Rep. 2010; 12 (1):17-25. 24. Willer CJ, Li Y, Abecasis GR. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics. 2010; 26 (17):2190-1. 25. Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 2010; 42 (7):565-9. 26. Shetty PB, Qin H, Namkung J, Elston RC, Zhu X. Estimating heritability using family and unrelated data. BMC Proceedings. 2011. 27. Levy D, Ehret GB, Rice K, Verwoert GC, Launer LJ, Dehghan A, et al. Genome- wide association study of blood pressure and hypertension. Nat Genet. 2009; 41 (6):677- 87. 28. Newton-Cheh C, Johnson T, Gateva V, Tobin MD, Bochud M, Coin L, et al. 
Genome-wide association study identifies eight loci associated with blood pressure. Nat Genet. 2009; 41 (6):666-76. 29. Ehret GB, Munroe PB, Rice KM, Bochud M, Johnson AD, Chasman DI, et al. Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature. 2011; 478 (7367):103-9. 30. Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, et al. Finding the missing heritability of complex diseases. Nature. 2009; 461 (7265):747-53. 31. Hindorff LA JH, Hall PN, Mehta JP, and Manolio TA. A Catalog of Published Genome-Wide Association Studies. 32. Saxena R, Voight BF, Lyssenko V, Burtt NP, de Bakker PI, Chen H, et al. Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science. 2007; 316 (5829):1331-6. 33. WTCCC. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature. 2007; 447 (7145):661-78.

34. Enas EA, Mehta J. Malignant coronary artery disease in young Asian Indians: thoughts on pathogenesis, prevention, and therapy. Coronary Artery Disease in Asian Indians (CADI) Study. Clin Cardiol. 1995; 18 (3):131-5. 35. Org E, Eyheramendy S, Juhanson P, Gieger C, Lichtner P, Klopp N, et al. Genome-wide scan identifies CDH13 as a novel susceptibility locus contributing to blood pressure determination in two European populations. Hum Mol Genet. 2009; 18 (12):2288-96. 36. Sabatti C, Service SK, Hartikainen AL, Pouta A, Ripatti S, Brodsky J, et al. Genome-wide association analysis of metabolic traits in a birth cohort from a founder population. Nat Genet. 2009; 41 (1):35-46. 37. Wang Y, O'Connell JR, McArdle PF, Wade JB, Dorff SE, Shah SJ, et al. From the Cover: Whole-genome association study identifies STK39 as a hypertension susceptibility gene. Proc Natl Acad Sci U S A. 2009; 106 (1):226-31. 38. Padmanabhan S, Melander O, Johnson T, Di Blasio AM, Lee WK, Gentilini D, et al. Genome-wide association study of blood pressure extremes identifies variant near UMOD associated with hypertension. PLoS Genet. 2010; 6 (10):e1001177. 39. Reich DE, Cargill M, Bolk S, Ireland J, Sabeti PC, Richter DJ, et al. Linkage disequilibrium in the human genome. Nature. 2001; 411 (6834):199-204. 40. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, et al. The structure of haplotype blocks in the human genome. Science. 2002; 296 (5576):2225-9. 41. Adeyemo A, Gerry N, Chen G, Herbert A, Doumatey A, Huang H, et al. A genome-wide association study of hypertension and blood pressure in African Americans. PLoS Genet. 2009; 5 (7):e1000564. 42. Franceschini N, Fox E, Zhang Z, Edwards TL, Nalls MA, Sung YJ, et al. Genome-wide association analysis of blood pressure traits in nearly 30,000 African ancestry individuals reveals a common set of associated genes in African and non- African populations. The American Journal of Human Genetics. 2013; In press. 43. 
Fox ER, Young JH, Li Y, Dreisbach AW, Keating BJ, Musani SK, et al. Association of genetic variation with systolic and diastolic blood pressure among African Americans: the Candidate Gene Association Resource study. Hum Mol Genet. 2011; 20 (11):2273-84. 44. Lettre G, Palmer CD, Young T, Ejebe KG, Allayee H, Benjamin EJ, et al. Genome-wide association study of coronary heart disease and its risk factors in 8,090 African Americans: the NHLBI CARe Project. PLoS Genet. 2011; 7 (2):e1001300. 45. Kato N, Miyata T, Tabara Y, Katsuya T, Yanai K, Hanada H, et al. High-density association study and nomination of susceptibility genes for hypertension in the Japanese National Project. Hum Mol Genet. 2008; 17 (4):617-27. 46. Kardia SL, Sun YV, Hamon SC, Barkley RA, Boerwinkle E, Turner ST. Interactions between the adducin 2 gene and antihypertensive drug therapies in determining blood pressure in people with hypertension. BMC Med Genet. 2007; 8:61. 47. Hiura Y, Tabara Y, Kokubo Y, Okamura T, Miki T, Tomoike H, et al. A genome- wide association study of hypertension-related phenotypes in a Japanese population. Circ J. 2010; 74 (11):2353-9. 48. Cho YS, Go MJ, Kim YJ, Heo JY, Oh JH, Ban HJ, et al. A large-scale genome- wide association study of Asian populations uncovers genetic factors influencing eight quantitative traits. Nat Genet. 2009; 41 (5):527-34.

49. Hong KW, Lim JE, Oh B. A regulatory SNP in AKAP13 is associated with blood pressure in Koreans. J Hum Genet. 2011; 56 (3):205-10.
50. Mayers CM, Wadell J, McLean K, Venere M, Malik M, Shibata T, et al. The Rho guanine nucleotide exchange factor AKAP13 (BRX) is essential for cardiac development in mice. J Biol Chem. 2010; 285 (16):12344-54.
51. Kato N, Takeuchi F, Tabara Y, Kelly TN, Go MJ, Sim X, et al. Meta-analysis of genome-wide association studies identifies common variants associated with blood pressure variation in east Asians. Nat Genet. 2011; 43 (6):531-8.
52. Hunter DJ, Kraft P. Drinking from the fire hose--statistical issues in genomewide association studies. N Engl J Med. 2007; 357 (5):436-9.
53. Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008; 40 (6):695-701.
54. Hirschhorn JN, Daly MJ. Genome-wide association studies for common diseases and complex traits. Nat Rev Genet. 2005; 6 (2):95-108.
55. Ehret GB, Morrison AC, O'Connor AA, Grove ML, Baird L, Schwander K, et al. Replication of the Wellcome Trust genome-wide association study of essential hypertension: the Family Blood Pressure Program. Eur J Hum Genet. 2008; 16 (12):1507-11.
56. Johnson AD, Newton-Cheh C, Chasman DI, Ehret GB, Johnson T, Rose L, et al. Association of hypertension drug target genes with blood pressure and hypertension in 86,588 individuals. Hypertension. 2011; 57 (5):903-10.
57. Bansal V, Libiger O, Torkamani A, Schork NJ. Statistical analysis strategies for association studies involving rare variants. Nat Rev Genet. 2010; 11 (11):773-85.
58. Li B, Leal SM. Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. Am J Hum Genet. 2008; 83 (3):311-21.
59. Morgenthaler S, Thilly WG. A strategy to discover genes that carry multi-allelic or mono-allelic risk for common diseases: a cohort allelic sums test (CAST). Mutat Res. 2007; 615 (1-2):28-56.
60. Madsen BE, Browning SR. A groupwise association test for rare mutations using a weighted sum statistic. PLoS Genet. 2009; 5 (2):e1000384.
61. Feng T, Elston RC, Zhu X. Detecting rare and common variants for complex traits: sibpair and odds ratio weighted sum statistics (SPWSS, ORWSS). Genet Epidemiol. 2011; 35 (5):398-409.
62. Zhu X, Tang H, Risch N. Admixture mapping and the role of population structure for localizing disease genes. Adv Genet. 2008; 60:547-69.
63. Tang H, Jorgenson E, Gadde M, Kardia SL, Rao DC, Zhu X, et al. Racial admixture and its impact on BMI and blood pressure in African and Mexican Americans. Hum Genet. 2006; 119 (6):624-33.
64. Smith MW, O'Brien SJ. Mapping by admixture linkage disequilibrium: advances, limitations and guidelines. Nat Rev Genet. 2005; 6 (8):623-32.
65. Zhu X, Luke A, Cooper RS, Quertermous T, Hanis C, Mosley T, et al. Admixture mapping for hypertension loci with genome-scan markers. Nat Genet. 2005; 37 (2):177-81.
66. Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000; 155 (2):945-59.

67. Deo RC, Patterson N, Tandon A, McDonald GJ, Haiman CA, Ardlie K, et al. A high-density admixture scan in 1,670 African Americans with hypertension. PLoS Genet. 2007; 3 (11):e196.
68. Zhu X, Cooper RS. Admixture mapping provides evidence of association of the VNN1 gene with hypertension. PLoS One. 2007; 2 (11):e1244.
69. Zhu X, Zhang S, Tang H, Cooper R. A classical likelihood based approach for admixture mapping using EM algorithm. Hum Genet. 2006; 120 (3):431-45.
70. Zhu X, Young JH, Fox E, Keating BJ, Franceschini N, Kang S, et al. Combined admixture mapping and association analysis identifies a novel blood pressure genetic locus on 5p13: contributions from the CARe consortium. Hum Mol Genet. 2011; 20 (11):2285-95.
71. Darvasi A, Shifman S. The beauty of admixture. Nat Genet. 2005; 37 (2):118-9.
72. McKeigue PM. Prospects for admixture mapping of complex traits. Am J Hum Genet. 2005; 76 (1):1-7.
73. FBPP. Multi-center genetic study of hypertension: The Family Blood Pressure Program (FBPP). Hypertension. 2002; 39 (1):3-9.
74. Basu A, Tang H, Arnett D, Gu CC, Mosley T, Kardia S, et al. Admixture mapping of quantitative trait loci for BMI in African Americans: evidence for loci on chromosomes 3q, 5q, and 15q. Obesity (Silver Spring). 2009; 17 (6):1226-31.
75. Basu A, Tang H, Lewis CE, North K, Curb JD, Quertermous T, et al. Admixture mapping of quantitative trait loci for blood lipids in African-Americans. Hum Mol Genet. 2009; 18 (11):2091-8.
76. Database of Single Nucleotide Polymorphisms (dbSNP). Bethesda (MD): National Center for Biotechnology Information, National Library of Medicine. (dbSNP Build ID: 137). Available from: http://www.ncbi.nlm.nih.gov/SNP/.
77. Shetty PB, Tang H, Tayo BO, Morrison AC, Hanis CL, Rao DC, et al. Variants in CXADR and F2RL1 are associated with blood pressure and obesity in African-Americans in regions identified through admixture mapping. J Hypertens. 2012; 30 (10):1970-6.
78. Bowles NE, Richardson PJ, Olsen EG, Archard LC. Detection of Coxsackie-B-virus-specific RNA sequences in myocardial biopsy samples from patients with myocarditis and dilated cardiomyopathy. Lancet. 1986; 1 (8490):1120-3.
79. Lisewski U, Shi Y, Wrackmeyer U, Fischer R, Chen C, Schirdewan A, et al. The tight junction protein CAR regulates cardiac conduction and cell-cell communication. J Exp Med. 2008; 205 (10):2369-79.
80. Bezzina CR, Pazoki R, Bardai A, Marsman RF, de Jong JS, Blom MT, et al. Genome-wide association study identifies a susceptibility locus at 21q21 for ventricular fibrillation in acute myocardial infarction. Nat Genet. 2010; 42 (8):688-91.
81. Greenawalt DM, Dobrin R, Chudin E, Hatoum IJ, Suver C, Beaulaurier J, et al. A survey of the genetics of stomach, liver, and adipose gene expression from a morbidly obese cohort. Genome Res. 2011; 21 (7):1008-16.
82. Meyer LR, Zweig AS, Hinrichs AS, Karolchik D, Kuhn RM, Wong M, et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 2013; 41 (Database issue):D64-9.

83. Johnson AD, Handsaker RE, Pulit SL, Nizzari MM, O'Donnell CJ, de Bakker PI. SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap. Bioinformatics. 2008; 24 (24):2938-9.
84. Benyamin B, Visscher PM, McRae AF. Family-based genome-wide association studies. Pharmacogenomics. 2009; 10 (2):181-90.
85. Calhoun DA, Jones D, Textor S, Goff DC, Murphy TP, Toto RD, et al. Resistant hypertension: diagnosis, evaluation, and treatment: a scientific statement from the American Heart Association Professional Education Committee of the Council for High Blood Pressure Research. Circulation. 2008; 117 (25):e510-26.
86. Egan BM, Zhao Y, Axon RN, Brzezinski WA, Ferdinand KC. Uncontrolled and apparent treatment resistant hypertension in the United States, 1988 to 2008. Circulation. 2011; 124 (9):1046-58.
87. Hindorff LA, MacArthur J, Morales J, Junkins HA, Hall PN, Klemm AK, et al. A Catalog of Published Genome-Wide Association Studies.
88. Mailman MD, Feolo M, Jin Y, Kimura M, Tryka K, Bagoutdinov R, et al. The NCBI dbGaP database of genotypes and phenotypes. Nat Genet. 2007; 39 (10):1181-6.
89. Wu YK. Epidemiology and community control of hypertension, stroke and coronary heart disease in China. Chin Med J (Engl). 1979; 92 (10):665-70.
