Evaluation of Genetic Variants Influencing Susceptibility to Type 2 Diabetes Associated End- Stage Kidney Disease and Associated Phenotypes in African and European Americans

By

Jason Andrew Bonomo

A Dissertation Submitted to the Graduate Faculty of

Wake Forest University Graduate School of Arts and Sciences

in Partial Fulfillment of the Requirements

for the Degree of

Doctor of Philosophy

Molecular Medicine and Translational Science

May 2014

Winston Salem, NC

Approved By:

Donald W. Bowden, Ph.D., Advisor

Examining Committee:

Barry I. Freedman, M.D., Chairman

Maggie C.Y. Ng, Ph.D.

Nicholette D. P. Allred, Ph.D.

Timothy D. Howard, Ph.D.

Acknowledgements

Foremost, I would like to thank my advisor, Dr. Donald W. Bowden, for allowing me the opportunity to work in the Bowden lab, and providing extensive mentorship. During my time in the lab I appreciated your unique insights on many subjects including the genetics of complex diseases, NIH grant writing, and manuscript preparation. The environment fostered by your mentorship promotes an open environment which encourages critical thinking, collaboration, guidance, and supports student-led hypotheses. I would not be the scientist I am today without your guidance, Dr. Bowden. To Dr. Barry I. Freedman, I cannot say how appreciative I am of your support, the opportunities you offered to work with collaborators across the country, clinical mentorship, visits to dialysis facilities, help with manuscript revisions, and physician related life advice you’ve given to me over the years. Without your dedication and passion for understanding the genetic bases of nephropathies we would not have had access to many of the samples you have recruited over the decades. Dr. Maggie C.Y.

Ng, I am thankful for your mentorship the and one-on-one meetings during my first year in the

Bowden lab; you taught me much about statistical methods in genomics and always encouraged me work through challenges, and you helped me develop the cognitive skills necessary to answer questions which would have been left unanswered otherwise. You always answered any questions I had – which were many – and I appreciate the time you took to in explaining complex topics to me. I will always be thankful to Dr. Nicholette P. Allred who helped train me when I joined the lab, always had an open door, and dealt with years of my incessant questioning in a composed, patient manner coupled with an air of humor. I thank Dr.

Timothy Howard for his advice and guidance navigating graduate school, the world of publishing, and for scientific insights into my projects.

ii

To those in the Bowden Lab I am grateful for the camaraderie, training, guidance, advice, and the casual scientific exchanges that made experiments and papers more developed.

Becky Blevins – thank you for always helping with DNA plates, making work in the lab more enjoyable, and being a good friend . Pam, thank you for your help getting me started in the lab and with all the help in genotyping projects. To the statistical staff in the Bowden Lab – Poorva and JJ – thank you for teaching me to run complex statistical analyses and aiding my search for rare- and low-frequency variants from the T2D- study. To my coworkers in the Bowden

Lab I’ve always appreciated your help, feedback, collaborations, and advice with projects.

To the faculty and staff of Molecular Medicine and Translational Science, particularly

Dr. K. Bridget Brosnihan, Dr. Micheal Seeds, and Kay Collare, thank you for your advice, guidance, and endless support over the past years; I would n ot be as far along in my career if it was not for the support I have received from MMTS. A special thank you to Dr. Martha

Alexander-Miller who has offered continued guidance, mentorship, and su pport over the past five years, and to Mrs. Pierce in the grad uate school for her seemingly endless knowledge on all issues pertaining to graduate students and the medical center.

The Deans of the medical school, especially Drs. Knudson and Latham-Sadler, I appreciate your help and support from my transition to grad uate school from medical school, and the proactive approaches you all have taken to advance my career as an aspiring physician - scientist. To Drs. Paul Laurienti and Chris Whitlow, former and current directors of the

MD/ PhD Program, thank you for your advocacy and mentorship over the past 5 years. I truly believe joining the MD/ PhD program at Wake, with all its uniqueness, was the most important life decision I have made thus family.

To my family – I don’t know when, where, or how to begin. To my late father, Thomas

Anthony Bonomo, PhD, I thank you for your intense intellectual curiosity; I must have inherited

iii this from you. To Brendan, my late brother, you are the reason why I’ve chosen the path I’m on.

You were my motivation for medical school, and your trials gave me further inspiration to pursue a PhD. Your legacy will remain cerebrally ingrained , reminding me daily how fortunate

I am to have a “healthy life,” how the act of breathing freely in and of itself is no small miracle, how we as a society take all too much for granted, and that when all is said and done, nothing is more important than the happiness that comes from within. After dozens of operations, coding on operation tables, facing death but saying “no, not yet”, spending summers and winters at hospitals around the country, and spending your all too abbreviated life in stoic acceptance due to no fault of your own, you persevered, never once complained, and were always smiling, always living every moment as if it were your last. You even managed to teach me how to golf!

,Natalie Ann .وال عال م ألخ ي – Your journey serves as constant inspiration, never to be forgotten could I have asked for a better sister? Thanks for being a travel buddy; I’ll never forget our fight over cantaloupe while being detained at the Syrian border. Thanks for always being there with open ears, giving me unconditional support and encouragement over the years. Finally, to my mom Nawal, I might not have enjoyed some of the things you put Nat, Brendan, and I through as kids (being dragged to operas and musicals, French and Spanish tutors, playing instruments

I loathed, this list could really be endless…), but I am more appreciative of those experiences now than before. I sincerely appreciate everything you’ve done. Those experiences afforded me new perspectives, exposed me to new cultures, and helped me to recognize and appreciate the subtleties in life. I think your inability to accept adequacy ultimately served in my best interest and molded the young adult I am now . Thank you for college, my condo, the countless vacations around the world, and your steadfast, absolute support. You are an incredible mom and I’ll never forget all you’ve done.

iv

To my friends: thank you for keeping me sane. Whether it was short weekend trips, concerts, backpacking through Europe and South America, hiking, or simply hanging out in PA over holidays, overwhelmed by nostalgia and the stagnation of time, thank you for always being there – especially through the tougher times. You guys are family and are the single best escape from stress out there. I look forward to the eternal continuation of our epic adventures and forthcoming trips.

Finally, I would like the patients, their families, staff of the dialysis centers, and recruitment staff. Without the selfless participation of these patients with renal disease and type

2 diabetes this work would not have been possible. To those that have worked long days collecting, purifying, and quantifying the thousands of DNA samples, your efforts are most appreciated.

v

Table of Contents

Page Number

List of Tables vii

List of Figures ix

Abbreviations x

Abstract xiii

Chapter 1: Introduction 1

Chapter 2: Coding variants in NPHS1 modulate susceptibility to 27 diabetic and non-diabetic end-stage kidney disease in African Americans Submitted to the Clinical Journal of the American Society of Nephrology

Chapter 3: The ras responsive transcription factor RREB1 is a novel 64 candidate for type 2 diabetes and nephropathy Submitted to Human Molecular Genetics

Chapter 4: Complement gene associations with ESKD in 103 African Americans Published in Nephrology, Dialysis, & Transplantation

Chapter 5: Summary & Conclusions 122

Appendices

A.1 Creation and Evaluation of an African American T2D-ESKD 137 Genetic Risk Score Based on Common Genetics Variants in the Notch Signalling Pathway

A.2 RTN1 is a Novel Risk Gene for Kidney Disease 145 Data was included in a manuscript submitted to Nature Medicine in February 2014

A.3 Genetic variants in NPHS1 are associated with T2D-ESKD 159 in European Americans

A.4 A CRCT1 missense variant is associated with T2D-ESKD 161 in African Americans

Curriculum Vitae 163

vi

List of Tables

Page Number Chapter 2

Table 1: Clinical Characteristics of African American Study Cohorts 48 Table 2: Rs35238405 (T233A) Single-SNP association testing (Dominant Model) 49 with adjustment for admixture and APOL1 G1/ G2 allele status Table 3: NPHS1 variants nominally associated with ESKD in -wide analysis 50 using single-SNP association testing, following adjustment for admixture and APOL1 G1/ G2 allele status Table 4: Locus-wide NPHS1 SKAT Analysis weighted for rare variants 51 Sup. Table 1: NPHS1 exome sequencing results 52 Sup. Table 2: NPHS1 Locus-Wide Analysis 54 Sup. Table 3: T233A characteristics within all ESKD cases 56 Sup. Table 4: Reported rs35238405 (T233A) minor allele frequencies 57 Sup. Table 5: Rs35238405 (T233A) Single-SNP association testing (Dominant Model) 58 in T2D-only, non-nephropathy cases, adjusted for admixture only Sup. Table 6: Breakdown of variants identified in T2D-GENES exome sequencing 59 “Discovery” study (genome wide) Sup. Table 7: NPHS1 variants included in SKAT analysis 60 Sup. Table 8: CNV in NPHS1 61 Chapter 3 Table 1: Clinical Characteristics of African American Study Cohorts 90 Table 2: Clinical Characteristics of European American Study Cohorts 91 Table 3: RREB1 variants identified through African American T2D-ESKD 92 Exome Sequencing Study and African American Replication Analyses Table 4A: RREB1 African American non-T2D ESKD Locus-Wide Study 93 Table 4B: RREB1 African American T2D-only, non-nephropathy Locus-Wide Study 93 Table 5A: RREB1 EA T2D-ESKD Locus-Wide Study 94 Table 5B: RREB1 EA T2D-only Locus-Wide Study 94 Sup. Table 1: RREB1 variants genotyped in a locus-wide analysis 95 Sup. Table 2: MAF of RREB1 variants across genetic resources 96 Sup. Table 3: Transcription Factor Binding Sites rs9505090 is located within 97

vii

Sup. Table 4: CNV in RREB1 gene region 101 Chapter 4 Table 1: Clinical characteristics of African American study samples 115 Table 2: CFH single SNP association results in African Americans with 116 non-diabetic ESKD Table 3: CFH single SNP association results in African Americans with 117 putative type 2 diabetes-associated ESKD Table 4A: CFH Locus-wide SKAT analysis (non-T2D ESKD) 118 Table 4B: CFH Locus-wide SKAT analysis (T2D-ESKD) 118 Sup. Table 1: CFH variants genotyped 119 Appendix 1 Table 1: Top Epistatic Interactions in the Notch Signalling Pathway 142 Appendix 2 Table 1: Clinical Characteristics of Genetic Study Samples 154 Table 2: ESRD Analysis 155 Table 3: Trait Discrimination 156 Appendix 3 Table 1: European American NPHS1 variants genotyped in T2D-ESKD 159 Table 2: European American NPHS1 variant associations with T2D-ESKD 160

viii

List of Figures

Page Number Chapter 2 Supp. Figure 1: Potential mechanism of deleterious effects of T233A: 62 disruption of β-pleated sheet in vital bridge region between 2nd and 3rd Ig-C2 like domains of nephrin. Supp. Figure 2: NPHS1 mRNA expression correlates with eGFR in 63 patients with diabetic nephropathy Chapter 3 Supp. Figure 1: TFBS in which rs9505090 is located within 102 Appendix 1 Figure 1A: Genes used to create Notch signalling 141 Figure 1B: Number of SNPs per gene associated with T2D-ESKD 141 Figure 2: Notch Pathway T2D-ESKD GRS with all 89 associated SNPs 142 Figure 3: Epistatic Notch Pathway GRS 143 Appendix 2 Figure 1: LD Block (D’) of 219 SNP RTN1 Gene Region Utilized in Discovery Study 157 Appendix 4 Figure 1: CRCT1 Expression in Healthy Kidneys 161

ix

List of Abbreviations

AA amino acid ACEi angiotensin converting enzyme inhibitor ACTN4 gene for alpha actinin 4 AIM ancestry informative marker AKD acute kidney disease AUH gene for the protein AU RNA binding protein/ enoyl-CoA hydratase ASW African Americans from the Southwestern U.S. bp CARS gene for the protein cysteinyl-TRNA synthetase CD2AP gene for the protein CD2-associated protein CFH complement factor H gene CNDP1 gene for carnosine dipeptidase 1 CKD chronic kidney disease cM centimorgan CUBN gene for the protein cubulin DDD dense deposit disease CVD cardiovascular disease DN diabetic nephropathy FGGS focal global glomerulosclerosis FRMD3 gene for the protein FERM Domain Containing 3 FSGS focal segmental glomerulosclerosis ELMO1 gene for protein engulfment and cell motility 1 EPO erythropoietin ESKD end-stage kidney disease GBM glomerular basement membrane eGFR estimated glomerular filtration rate GFR glomerular filtration rate GWAS genome-wide association study

HbA1C hemoglobin A1C HIVAN human immunodeficiency virus associated nephropathy

x

LIMK2 gene for the protein LIM Domain Kinase 2 LWK Luhyans in Webuye, Kenya LOD logarithm of the odds LRP2 gene for the protein megalin MALD mapping by admixture linkage disequilibrium MAP mean arterial pressure miRNA micro RNAs NAT8 gene for the protein N-Acetyltransferase 8 non-T2D ESKD non type 2 diabetes associated end -stage kidney disease NF-kB nuclear factor kappa-light-chain-enhancer of activated B cells NPHS1 gene for the protein nephrin NPHS2 gene for the protein podocin Nrf2 nuclear 1 factor-like 2 factor OOA Out of Africa PCA principle component analysis PVT1 RNA gene for Pvt1 oncogene RPS12 gene for the ribosomal protein S12 RREB1 ras responsive element binding protein 1 RRT renal replacement therapy SASH1 gene for the protein SAM and SH3 domain containing 1 SFI1 gene for the protein Sfi1 Homolog, Spindle Assembly Associated (Yeast) SIC Shannon Information Content SLC7A9 gene for the protein solute carrier family 7, member 9 TCF7L2 gene for the protein transcription factor 7-like 2 TGFβ transforming growth factor beta T1D Type 1 Diabetes Mellitus T2D Type 2 Diabetes Mellitus T2D-ESKD T2D associated end-stage kidney disease UMOD gene for the protein uromodulin YRI Yorubans from Nigeria

xi

“In logic…the question is not how we think, but how we ought to think.” EMMANUEL KANT, 1800.

xii

Abstract

Evaluation of Genetic Variants Influencing Susceptibility to Type 2 Diabetes Associated End-Stage Kidney Disease and Associated Phenotypes in African and European Americans

Jason Andrew Bonomo

Dissertation under the direction of:

Donald W. Bowden, Ph.D., Professor of Biochemistry and

Center for Genomics and Personalized Medicine Research

There is strong clinical and epidemiological evidence supporting a genetic link between type 2 diabetes associated end-stage kidney disease (T2D-ESKD) and African ancestry.

Incidence and prevalence of T2D-ESKD, in addition to non-diabetic forms of ESKD, are markedly higher in African Americans compared to all other ethnicities even after adjustment for socioeconomic factors. Familial aggregation studies also support a strong genetic component to T2D-ESKD in African Americans. In this series of studies we examined this health care disparity from a genetics perspective by leveraging state-of-the-art genetic methodologies, including whole exome sequencing and bioinformatic tools, in an effort to further elucidate the genetic architecture of T2D-ESKD and non-T2D ESKD in several samples of African Americans, and to a lesser extent, in European Americans.

Data from these investigations provided wide-ranging genetic insights into T2D-ESKD, non-T2D ESKD, and also T2D in the absence of kidney disease. We have implicated rare- and low-frequency missense variants in the gene NPHS1 (which codes for the protein nephrin) that contribute to ESKD in the general African American population. Previously, NPHS1 variants

xiii were believed to contribute to monogenic diseases and uncommon forms of nephropathy.

Moreover, we identified novel roles for the gene RREB1, coding for the ras responsive element binding protein 1, with implications for T2D, T2D-ESKD, and non-T2D ESKD in populations of both African and European ancestry. Data from an additional study demonstrate that genetic variations in the complement factor H gene (CFH) influence susceptibility to common, non- diabetic forms of ESKD in African Americans. Previously, CFH had been linked primarily to

IgA nephropathy and dense deposit disease (DDD) in Asian and European populations.

Data from the aforementioned studies, in addition to those in the Appendix, provide new insights into T2D, T2D-ESKD, and non-T2D ESKD in both African and European

Americans. The ultimate goals of any genetic study are to aid in prognostication of disease risk, provide biologic insights into disease pathogenesis, and to identify tractable drug targets. The investigations presented here expand our knowledge of the genetic underpinnings of T2D, T2D-

ESKD, and non-T2D ESKD, laying the foundational framework for future studies aiming to construct a comprehensive genetic prognostication panel for T2D-ESKD and non-T2D ESKD, in addition to identifying a novel drug target (RREB1) which may have implications for T2D, T2D-

ESKD, and non-T2D ESKD.

xiv

Chapter 1

Introduction

End-Stage Kidney Disease

Broadly, kidney disease falls into two categories: acute kidney disease (AKD) and chronic kidney disease (CKD). Whereas ADK has an abrupt onset, generally within 48 hou rs, often accompanied with nephritic symptoms (hematuria or oliguria) and pathologies (casts in urine), CKD progresses slowly over periods of years to decades. There are 5 stages of CKD with

“Stage 1” characterized by a normal glomerular filtration rate (GFR; >90ml/ min/ 1.73m 2), a quantitative measure of kidney function, coupled with pathogenic findings in the urine (eg, proteinuria, hematuria). Often, but not always, CKD progresses to “Stage 5”, also known as end-stage kidney disease (ESKD) (de Zeeuw 2011).

ESKD refers to a group of heterogeneous pathologies with varying etiologies that ultimately result in the inability of the kidneys to maintain proper homeostatic roles. These functions include filtration and excretion of toxic metabolites and reclamation of solutes, endocrine and paracrine functions (eg, renin production, conversion of calcidiol to calcitriol

[active form of vitamin D], erythropoietin [EPO] synthesis, calcium homeostasis via parathyroid feedback), and maintenance of vascular tone. It is recognized that one has ESKD when their

GFR falls below 15mL/ min/ 1.73m 2. Ultimately, ESKD results in anemia and the accumulation of nitrogenous waste products (azotemia) and other toxic metabolites that result in uremia unless renal replace therapy (RRT) is initiated. Uremia has systemic consequences affecting nearly all organ systems and is fatal if left untreated ; for much of the developing world, RRT is not a viable option (Bargman, Skorecki 2012).

RRT may take on several forms including hemodialysis, peritoneal dialysis, or kidney transplantation. Kidney transplantation is the most effective form of treatment. In 2011, 113,000

1 patients initiated RRT, and of this total, 2,850 patients received a renal transplant as their first line treatment (USRDS 2013). This reflects the large disparity between those who need a donor kidney and their limited availability. As with other complex diseases it is thought that a combination of lifestyle, environmental, and genetic factors contribute to one’s risk of developing ESKD.

Incidence of ESKD in the United States

In 2011 it was reported that the national incidence rate of ESKD stood at 357 new cases per million people, similar to the levels observed in the previous decade (USRDS 2013). In the

United States End Stage Renal Disease Network 6, which includes patients with ESKD from

North Carolina, South Carolina, and Georgia, the incid ence rate was the highest in the nation standing at 387 new cases per million (USRDS 2013). Etiologies of ESKD include type 1 diabetes mellitus (T1D), type 2 diabetes mellitus (T2D), idiopathic glomerulonephritis (or primary focal segmental glomerulosclerosis [FSGS]), lupus nephritis, “hypertension attributed ” ESKD, and human immunodeficiency virus associated nephropathy (HIVAN), amongst others (USRDS

2013). A major concern for long term trends in ESKD incidence is the rising epidemic of type 2 diabetes in the United States and worldwide.

Type 2 Diabetes

Type 2 diabetes mellitus (T2D) is a complex metabolic disorder that results in decreased insulin sensitivity ultimately resulting in the impaired ability of pancreatic β-cells to secrete insulin. Clinically, T2D is often preceeded by the metabolic syndrome, encompassing risk factors for both cardiovascular disease and diabetes mellitus which include dyslipidemia, hyperglycemia, hypertension, and central adiposity (Eckel 2012). Much like Stage 1 CKD progressing to ESKD, metabolic syndrome often progresses to T2D, and those who meet criteria for metabolic syndrome are at a 3-5x increased risk for developing T2D (Eckel 2012). Clinically,

2

T2D is defined as repeated fasting blood glucose levels over 126mg/ dL (7.0mmol/ L) or greater

and an HbA1c > 6.5 on at least two measurements (Masharani, German 2011).

T2D results in chronic hyperglycemia with systemic implications if not properly treated; patient compliance with medications and willingness to modify diet and lifestyle choices are major challenges to the long term effectiveness of treatments (personal experience). Once patients with T2D become insulin dependent, hyperglycemia becomes especially challenging to treat. Temporally, higher doses of insulins (long and short acting) administration at specific and consistent timepoints are critical to achieve euglyemecia, which further strains patient compliance. Long-term complications of T2D are common and include retinopathy, peripheral neuropathy, and diabetic nephropathy (DN), in addition to increased risks of developing cardiovascular disease (CVD). DN is defined as albuminuria, ≥30mg/ 24hours that follows diagnosis of T2D (Bakris 2013). Hematuria may also accompany albuminuria, although this is a much less common presentation (Bakris 2013). Risk factors associated with DN risk includes poor glycemic control, obesity, smoking, and ethnicity (African Americans and Hispanic

Americans) (Bakris 2013).

DN progression is responsible for 45% of all new cases of ESKD, a process that generally takes 10-15 years termed T2D associated ESKD (T2D-ESKD) (USRDS 2013). With global trends demonstrating clear increases in rates of T2D in developed and developing nations, coupled with recent troublesome observations where T2D is being diagnosed in adolescent and young adult populations, rates of DN and T2D-ESKD are expected to parallel the rise of T2D (Zeitler el al. 2007; de Boer et al. 2011). The excess mortality associated with T2D is largely the result of DN

(ie, T2D-ESKD); when one has T2D without DN their overall mortality risk is nearly equal to that of non-diabetics in the general population (H immelfarb, Tuttle 2013; Afkarian et al. 2013;

Groop et al. 2009).

3

Ethnic Disparities in Incidence and Prevalence of ESKD

There are clear ethnic differences in the rates of ESKD and T2D-ESKD. Whereas

European Americans have the lowest incidence rates of ESKD in the United States, or 280 new cases per million in 2011, African Americans have the highest rates of newly diagnosed ESKD at

940 new cases per million – 3.4 times that of European Americans (USRDS 2013). Disparities are also observed in Hispanic Americans, Native Americans, and Asian Americans, although not to the same extent as with African Americans. For example, in the United States End Stage Renal

Disease Network 6, African Americans composed 55% of the 9,600 patients with T2D-ESKD in

2011, representing the highest percentage of African American T2D-ESKD cases in the country

(USRDS 2013). This statistic is a particularly poignant representation of the ESKD health care disparity borne by African Americans, especially younger African Americans. While the incidence rates of T2D-ESKD have increased 3.5% since 2000 for European Americans ages 30-

39, African Americans in this same age group saw rates rise 72% (USRDS 2013). Even after controlling for socioeconomic factors, rates of T2D-ESKD and non-T2D ESKD remain markedly higher in African American populations compared to European Americans (Freedman et al.

1995; Spray et al. 1995; Freedman 2002).

In 2011 the total prevalence rate of ESKD increased 3.2% to roughly 550,000 affected individuals. Nationally, the point prevalence of “all cause” ESKD is 1329 per million. Again,

African American bear a disproportionate burden of this disease with a 2011 point prevalence of 5,584 per million, or 4x that of European American, 2.5x that of Asian Americans, and 2x that of all other ethnic minorities (USRDS 2013).

Treatments, Outcomes, and Costs Associated with ESKD

The optimal treatment for any patient with ESKD is kidney transplantation.

Aforementioned, the demand for donor kidneys far outweighs their availability. Mortality rates

4 of ESKD patients have decreased roughly 20% since 2000 (USRDS 2013); however, the dialysis mortality rate was 254 per 1000 patients in 2011 (USRDS 2013). That is, each year on dialysis is associated with a 20-25% risk of death. Peritoneal dialysis is more effective than hemodialysis with 61% of patients remaining alive at 3 years compared to 52% for those on hemodialysis

(USRDS 2013). If one examines the ethnic breakdown of renal transplantations during this time, another disparity arises: after 3 years of being on dialysis half (51%) of African American will still be waiting for a renal transplant, whereas two thirds of European Americans (64%) will have received one (USRDS 2013).

The costs of ESKD cannot only be reflected by dollar amounts spent by the government and private insurers on treatment. Intangible costs borne by patients and their families are often ill-studied but include reduced mobility, increased morbidity, dietary restrictions, and side effects resulting from treatment (fatigue, nausea, bacteremia/ infections, and sexual side effects, amongst others). Hemodialysis ESKD patients suffer from reduced quality of life, spending on average 12 hours each week on dialysis, excluding the time traveled to and from dialysis facilities. This represents an annual cost of $90,000 per hemodialysis patient for medicare

(USRDS 2013); peritoneal dialysis is more flexible, offering at-home, overnight dialysis associated with lower costs ($72,000/ year; USRDS 2013) and slightly improved quality of life.

Overall it is estimated that total ESKD spending by Medicare and private insurers in 2011 was

$50billion (USRDS 2013).

Current therapeutic options for CKD are limited and at best slow the progression to

ESKD. These include the use of angiotensin converting enzyme inhibitors (ACEi), which decrease glomerular capillary pressure [via preferential dilation of the efferent arteriole] and increase levels of bradykinin (Kon et al 1993), and oral steroids (eg, prednisone, dexamethasone) which inhibit the TGF-β pathway and reduce inflammation and fibrosis in FSGS and associated

5 nephropathies. Recent clinical trials of novel agents to treat T2D-ESKD have largely failed. A

Phase III trial assessing the efficacy of bardoxolone methyl in T2D patients with Stage 4 CKD was halted early due to poorer survival and increased number of adverse effects in the treatment group (de Zeeuw et al. 2013). Bardoxolone methyl is an activator of the Nrf2 transcription factor which increases the productions of anti-oxidant genes (Himmelfarb, Tuttle

2013). Novel insights into the underlying genetic architecture of T2D-ESKD and ESKD are desperately needed in order to develop targeted therapeutics that would modulate the underlying pathological drivers of ESKD.

Clinical Markers of ESKD

Aforementioned, the GFR is the primary clinical marker used to assess CKD. Estimated

GFR, or eGFR, is a widely used clinical surrogate for the actual GFR, although it tends to underestimate true GFR by ≈10%. eGFR calculation using the Modification of Diet in Renal

Disease (MDRD) Study equation (Peterson et al. 1995) takes into account serum creatinine levels, a byproduct of creatine phosphate metabolism in muscle that is produced at a relatively constant state in the body, in addition to age, gender, and African ancestry (four variables). The formula is as follows:

( ) ( ) ( ) ( ) ( )

This equation has limitations and may not provide accurate estimates when serum creatinine is in a non-steady state (eg, pregnancy), at high GFRs, ages over 70, extremes in muscle mass (eg, amputees and those with exceedingly high muscle mass), and populations that are not primarily of European or African Ancestry (eg, Hispanics, Asians, et cetera) (Earley et al.

2012). A recent review comparing the MDRD eGFR equation to the Chronic Kidney Disease

Epidemiology Collaboration (CKD-EPI; Levey et al. 2009) concluded that the CKD-EPI equation

6 is more accurate on a population-level than the MDRD, although the MDRD equation out- performed CKD-EPI at lower GFRs (ie, in states of CKD) (Earley et al. 2012). The CKD-EPI equation is as follows:

( ) ( ) ( ) ( )

where α = -0.329 if female and α = -0.411 if male, k = 0.7 if female and k = 0.9 if male

(Levey et al. 2009).

An additional hallmark of chronic kidney disease is proteinuria. This is a direct consequence of the accumulations of glomerular insults temporally. Albumin is the most abundant protein in the bloodstream (serum component) and serves a variety of essential functions. At 3.6 nanometers (nm) in diameter exceedingly small amounts of albumin are able to traverse the negatively charged heparin sulfate barrier of the glomerular basement membrane and pores of the slit diaphragm (4nm), entering Bowman’s space for filtration through the tubular components of the nephron. The majority of this albumin that enters the proximal tubules are reclaimed by the endocytic megalin (LRP2) and cubilin (CUBN) in proximal tubule endothelial cells and are reintroduced into the bloodstream ; however, small amounts of albumin (<25mg/ day) may enter the urine in healthy individuals (Woredekal,

Friedman 2009).

In CKD this intricate filtration system is disrupted. Glomerular basement thicken ing, podocyte effacement, expansion of the mesangial matrix and deposition of amorphous proteins, glomerular sclerosis, and loss of the negatively charged heparin sulfate barrier all contribute to excess albumin entering the urine (albuminuria) and declining GFR (Bargman, Skorecki 2012).

In the initial stages of CKD microalbuminuria is a sensitive test for early renal damage and refers to amounts of albumin undetectable on urinary dipstick (Bargman, Skorecki 2012).

7

Microalbuminuria is defined as 30-299mg of albumin on 24 hour urinary collection, and overt proteinuria is defined as ≥300mg/ 24 hours (Woredekal, Friedman 2009). 24 hour urinary collection is a cumbersome clinical test and often the urine albumin-to-creatinine ratio (UACR) is used. UACR values over 30mg/ g are reflective of CKD (NKDEP 2010) and are clinically associated with CVD outcomes including stroke, myocardial infarction, and death (Rossing et al.

1996).

Increased levels of urinary protein excretion due to CKD are known to be associated with glomerular and tubular injury, cellular apoptosis, and endoplasmic reticulum (ER) stress

(Lee EK et al. 2011; Lee JY et al. 2011). Proteinuria may also drive interstitial inflammation, activate NF-kB signalling, and induce tubular complement synthesis (Fogo 2007; Abbate et al.

2006). One hypothesis of interstitial fibrosis posits that inflamatory cytokines released from damaged podocytes, mesangial cells, and infiltrating macrophages sustain an inflammatory, proteolytic, and fibrotic macro-environmental cascade as they pass through the efferent arteriole (and vasa recta in corticomedullary nephrons), causing further deleterious effects along the nephron, adjacent nephrons, in addition to the ischemic changes found at the primary site of glomerular insult (Fogo 2007). Initial hyperfiltration (↑GFR) following renal insults, as seen in T2D-ESKD, may also contribute to sclerosis and is believed to be a maladaptive compensatory renal response (Brenner et al. 1982).

In one study of Caucasians with diabetic nephropathy (average duration of proteinuria

6.5 years), their renal function (eGFR) declined an average of 5.2ml/ min/ 1.73m 2 per year

(Woredekal, Friedman 2009). Factors associated with higher rates of renal decline included

smoking, higher HbA 1c (indicative of how well T2D is controlled longitudin ally [≈3months]), and higher systolic blood pressure (Woredekal, Friedman 2009). Prospective studies in African

Americans have shown that, on average, it takes 5-25 years for patients to develop ESKD from

8 time of T2D diagnosis (Woredekal, Friedman 2009); there is often a lag in diagnosis of T2D as the symptoms are more insidious than T1D so progression time to ESKD tends to vary. Not all patients with T2D, DN, or early stages of CKD will progress to ESKD.

Architecture of the

The advent of the first wholey sequenced human genome in 2001 revolutionized the field of genomics by providing a reference scaffold for future studies, an accurate estimate of the number of protein-coding genes (≈20,000, which compose ≈1% of the 3 billion total DNA base pairs), and mapping the location of 2 million common single-nucleotide polymorphisms

(SNPs), a major source of human genetic variation (Venter et al. 2001). It’s estimated that there are >10million common SNPs, occuring every 300-600bp, which compromise the majority of interindividual genetic heterogeneity. What took Venter and colleagues 9 months and $1 billion

13 years ago can now be accomplished in 1-2 days at a cost of several thousand dollars w ith much greater accuracy and precision.

The human genome is composed of 22 autosomes, a pair of sex chromosom es, and a separate mitochondrial genome arranged in a circular . The DNA of each of the 23 pairs of is arranged into distinct haplotype blocks whose lengths vary depending on population ancestry. Haplotype blocks are discreet stretches of DNA encompassing a combination of alleles that are inherited together with little or no recombination. Genetic variants within a haplotype block are said to be in linkage disequilibrium (LD). LD values reflect how often alleles are inherited together rather than segregating independently. The median haplotype block length in those of African ancestry is 22kb, whereas in European derived and Asian populations it is 44kb (Gabriel et al. 2002). The smaller haplotype blocks in

African Americans are reflective of greater genetic diversity, minimal bottlenecking, and an

9 older population where longer periods of recombination led to increased breakdown of haplotype blocks.

There are several measures of LD including D, Lewontin’s D (D’) and r2. Lewontin’s D overcomes the shortfalls of D which is largely dependent on allele frequencies (Gaut, Long

2003). Lewontin’s D (D’) is a standardized measure of D wh ere the maximum value of 1 reflects two SNPs being in complete LD (Lewontin 1964). A D’=1 means that at least one of the four possible haplotypes was not observed (ie, 2 or 3 haplotypes were observed) and suggests that no recombination events occurred in human history. On the other hand, D’=0 implies the all haplotypes were observed and the loci are in linkage equilibrium, segregating independently

(Sun 2013). An advantage to this approach is that D’ is directly related to the recombination rate between haplotypes across generations; disadvantages include decreased sensitivity to the extent of LD, larger D’ values for smaller sample sizes (bias), and small D’ values when an allele is found at low frequencies (Sun 2013). These disadvantages of D’ led to the frequent use of a third measure of LD: r 2, or the correlation coefficient, which is D2 divided by the product of the allele frequencies at two loci (Hill, Roberston 1968). Like D’, r2 ranges from 0-1; although the values of D’ may be negative, its direction has no bearing on the actual measurement. r2 provides more robust measurements of LD when allele frequencies are low and is more useful in association studies. For example, D’=1 when 2 or 3 haplotypes are observed, whereas r 2=1 when exactly 2 haplotypes are observed.

Admixture

An important consideration for genetic studies is the effect of admixture on the population(s) being studied. Admixture describes the recent breeding between two previously separated genetic populations which results in a hybrid pop ulation. For example, the African slave trade beginning in the 16th century and ensuing forced African diaspora to North America

10 led to the interbreeding between populations of European and African ancestry, resulting in a genetically unique African American population. The ancestry of African Americans is, on average, 80% African and 20% European.

In genetic association studies admixture may play a confounding role if not properly accounted for. A number of approaches exist to mitigate the effects of admixture on the trait or phenotype being studied. These include utilizing a Principle Component Analysis (PCA) or

Ancestry Informative Markers (AIMs). PCA uses a multivariate analysis to reduce the dimensionality (X, Y, and Z) of datasets by implementing eigenvectors to account for population-level genetic variance, simultaneously preserving as much individual variation as possible (Jolliffe 2005). An alternative approach to PCA is to genotype AIMs. AIMs are essentially alleles (eg, SNPs) spaced out across the genome (≈1cm) with the largest Shannon

Information Content (SIC) – fundamentally a marker that has a large difference in frequency between parent populations of the admixed group (Smith, O’Brien 2005). For example, an ideal

AIM would have be absent in one population (eg, European Americans) and prevalent at high frequencies in the other (eg, Yorubans from Nigeria [YRI]). The alleles used for AIMs do not follow this ideal example. Rather, an expectation-maximization algorithm is used to estimate the proportion of ancestry for each sample based on the distributions of AIMs (Keene et al.

2008).

Heritability of GFR/UACR

Heritability is the proportion of observable differences between individuals that is due to genetic variation. Studies have estimated the genetic heritability of GFR and albuminuria at

36-75% and 16-49%, respectively (O’Seaghda, Fox 2011; Placha et al. 2005). The first study examining the heritability of GFR in diabetic individuals estimated it to be 68-76% in European

2 Americans following extensive covariate adjustment including medications, HbA 1c, age, age ,

11 gender, and mean arterial pressure (MAP) (Langefeld et al. 2004). A variety of genetic methodologies have been used to implicate genes with the qualitative traits diabetic nephropathy, CKD and ESKD, as well as with quantitative traits such as eGFR and albuminuria. It has been estimated that the cumulative impact of loci associated with these latter phenotypes account for less than 1.5% of the known variance in heritability (Böger, Heid

2011). A notable exception has been the genetics of non -T2D ESKD in African ancestry populations, which will be discussed shortly. In addition to heritability studies of CKD and

ESKD, it has been shown that ESKD aggregates in families (Freedman et al. 1995; Freedman

2002); this is especially true in African ancestry populations where having a relative with ESKD puts one at 9x greater risk for developing the disease (compared to 2.7x greater risk for

European ancestral populations).

Genetic Susceptibility to ESKD

Prior to the advent of genome-wide association studies (GWAS) linkage analysis was a common methodology to identify genetic loci in families with high rates of ESKD. In linkage analyses, linkage disequilibrium and genetic polymorphisms are leveraged in order to map genetic loci in families that segregate with a particular phenotype (Teare, Barret 2005). Linkage analysis often implicates large regions of the genome with disease susceptibility and further fine mapping approaches are needed to identify the causative alleles (Teare, Barret 2005). Linkage is measured as a logarithm of the odds (LOD) score. LOD scores are reflective of the recombination fraction or chromosomal position, and LOD scores > 3 are generally used to implicate a genetic locus with a phenotype. A LOD value of 3 is equivalent to a p=0.0001 for association (Teare, Barret 2005).

12

Linkage analysis was successful in identifying genes associated with Mendelian kidney diseases such as NPHS1 in Congenital Nephrotic Syndrome of the Finnish Type (CNS; Kestila et al. 1998). Mutations in NPHS2, which interacts with NPHS1 on a protein level, is mutated in autosomal recessive childhood -onset nephrotic syndrome (Boute et al. 2000) and an additional analysis identified novel NPHS2 mutations as contributing to non-diabetic end-stage kidney disease in African Americans (Dusal et al. 2005). Familial linkage analysis was also used to implicate the CNDP1 gene with diabetic nephropathy with replication in multiple ethnic populations (Vardaril et al. 2002; Janssen et al. 2005; Bowden et al. 2004; Freedman et al. 2007;

Palmer et al. 2014 [GWAS]), the UMOD gene in medullary cystic kidney disease 2 and familial juvenile hyperuricaemic nephropathy (Hart et al. 2002), ADPKD1/ADPKD2 with autosomal dominant polycystic kidney disease (Reeders et al. 1985, Mochizuki et al. 1996), ACTN4 with adult onset autosomal dominant FSGS (Kaplan et al. 2000), and IGAN1 in IgA nephropathy

(Gharavi et al. 2000), amongst others. Candidate gene studies have largely failed to identify genetic variants associated with T2D-ESKD (Schena, Gesualdo 2005), and candidate gene studies of non-T2D ESKD have been limited by their lack of replication and criticized for selection methods (Köttgen 2010). An epitome of the pitfalls of candidate gene stud ies is demonstrated in recent manuscript published by Barua et al. 2014. Here the authors claimed to identify a novel candidate gene for FSGS (PODXL) through exome sequencing in two related individuals with FSGS. They identified 1 homozygous recessive allele in the gene that had no functional impact in multiple in vitro studies, yet still concluded that PODXL was a candidate

FSGS gene (Barua et al. 2014). One notable success, however, has been the identification of

CUBN as a candidate ESKD gene in a Dutch population (Reznichenko et al. 2012).

The first ever GWAS was published in 2005 and implicated the complement factor H

(CFH) gene in age-related macular degeneration using a SNP array (Klein et al. 2005). This effort

13 was made possible by publically available SNP databases, sequencing of the human genome, the HapMap project which characterizes genetic variants, their frequencies in diverse ethnic populations, and SNPs in LD with the variants, as well as the availability of large study populations (Köttgen 2010). A major advantage of GWAS over previous genetic methodologies is that it provides an unbiased view of the entire genome for association with a phenotype.

GWAS approaches are built using SNPs to tag a haplotype block, and thus by proxy the common genetic variation within that haplotype block, across the genome. By using a SNP in a haplotype block one can essentially examine the majority of common variation within that haplotype. This then reduces the numbers of genetic variations needed to identify associations between gene regions and the phenotype of interest. A limitation to this approach is that a SNP associated with a phenotype in GWAS may not be the causal SNP; rather, it may be in high LD with the associated SNP and fine-mapping is often necessary to further elucidate the genetic underpinnings of disease phenotypes. Due to the high number of statistical tests performed in

GWAS strict P value cutoffs of < 5x10-8 are necessary for SNPs to meet genome-wide significance, portending a strong statistical penalty (Böger, Heid 2011).

GWAS studies in T1D-ESKD, T2D-ESKD, ESKD, CKD, as well as their quantitative traits, have been successful in identifying multiple areas of the genome associated with nephropathy. Shroom3, UMOD, NAT8, and SLC7A9 have been associated with CKD in

European-ancestry populations (Köttgen et al. 2009; Chambers et al. 2010). Loci associated with

DN include ELMO1 (Shimazaki et al. 2005), PVT1 (Hanson et al. 2007), and FRMD3 and CARS

(Pezzolesi et al. 2009). A recent GWAS of African Americans with T2D-ESKD identified several novel candidate loci including SASH1, RPS12, AUH, MSRB3-HMGA2, and LIMK2-SFI1, and the lattermost loci was also associated with all-cause ESKD (T2D-ESKD and non-T2D ESKD;

McDonough et al. 2011). The genetic variants identified through GWAS have provided insights

14 into the architecture of nephropathies, but common variants identified through these approaches rarely identify SNPs with ORs > 1.5, leaving much of heritability to be explained

(Bowden 2010).

APOL1-Associated Nephropathies in African Americans

Observations that African Americans were at much greater risk for developing FSGS and HIVAN led to genome-wide mapping by admixture linkage disequilibrium (MALD) studies in an attempt to identify regions conferring th is increased susceptibility to ESKD. A number of MYH9 haplotypes on , including the E1 haplotype, were initially identified as the putative susceptibility loci for non-T2D ESKD in African Americans under recessive models (Kopp et al. 2008; Kao et al 2008). Follow-up studies confirmed associations between MYH9 SNPs and non-T2D ESKD in African Americans (Bostrom et al. 2010). These studies, however, were unable to identify the causal variant(s).

Further studies interrogating a larger chromosomal region adjacent to MYH9, including the APOL1 gene, ultimately identified the causal variants responsible for 70% of the prevalence of non-T2D ESKD in African Americans (Genovese et al. 2010). Utilizing data from The 1000

Genomes Project in conjunction with association studies comparing African Americans with biopsy proven FSGS to population based controls resulted in the identification of the causal variants in APOL1, immediately centromeric to MYH9 and ≈10kbp away. The APOL1 G1 allele includes two missense variants, rs73885319 (S342G) and rs60910145 (I382M), in near perfect LD, and represented the strongest signal in the study. After performing regression analysis on the

G1 allele, Genovese and colleagues identified a second allele in APOL1, termed “G2”, that represented a 6bp deletion (rs71785313) removing amino acids N388 and Y389. After performing an analysis controlling for these two alleles, the signal in the MYH9 region was strongly attenuated .

15

The evolutionary significance of these two alleles was also investigated in that study as it appeared these G1 and G2 alleles were positively selected for and arose independent of one another (Genovese et al. 2010). in vitro studies identified that the ApoL1 protein with either the

G1 or G2 allele were able to lyse (T.b.) rhodesiense, a causative agent of

African sleeping sickness; ApoL1 without the G1 or G2 allele was unable to lyse T.b. rhodesiense

(Genovese et al. 2010). Thus, Genovese and colleagues provided strong genetic and biological data explaining why those of African descent were much more likely to develop non -T2D ESKD: natural selection led to the positive selection of these common variants which provided strong heterozygote survival advantages in the local environment against certain strains of T.b.

(Genovese et al. 2010).

A recent report by Parsa et al. described an association between the APOL1 G1/ G2 alleles and progression of diabetes associated CKD (Parsa et al. 2013). Despite the authors claim that these finding implicates APOL1 with DN and T2D-ESKD, the study suffers from many limitations thus their resultant conclusions are controversial. In the Parsa et al. study a hazards ratio of 1.95 was observed in African American diabetics with two APOL1 risk variants to the composite endpoint – a 50% reduction in GFR – compared to European American diabetics with

CKD (Parsa et al. 2013). These results were very similar to African Americans compared to

European Americans with CKD in the absence of diabetes, though the hazard ratios wer e slightly higher in this comparison (2.68). The limitations to the authors’ conclusions are as follows.

First, there is no mention either when African Americans developed CKD, nor when the

African Americans samples were diagnosed with T2D. Diabetes and the APOL1 G1/ G2 alleles are common in the African American population. 18.7% of African Americans have T2D

(NIDDK 2010) and the MAF of the APOL1 G1 and G2 alleles are 20% and 9%, respectively, and

16 the frequency of carriers with two APOL1 risk alleles is 12% in the general population

(Genovese et al. 2010); less than 4% of individuals with two risk alleles will develop ESKD.

Herein lays the problem: T2D is a common disease, the APOL1 variants are common, and there is no attempt to document if T2D preceeded CKD, the duration of T2D prior to CKD, or the percentage of individuals who developed T2D following the diagnosis of CKD. Thus there is significant confounding which likely resulted in the coincident finding that associated APOL1

G1/ G2 risk alleles with DN progression. Rather than documenting an association between

APOL1 G1/ G2 with DN/ T2D-ESKD, the conclusions from the Parsa study should be interpreted as a demonstration that the APOL1 alleles are associated with CKD progression independent of hyperglycemia. This has been a point advocated by some for many years prior to this report.

DN aside, the importance of APOL1 in susceptibility to non-T2D ESKD cannot be understated. The G1 and G2 alleles account for 70% of the prevalence of non -T2D ESKD in

African Americans and confer ORs remarkably high for common variants: 29 for HIVAN and 17 for FSGS (Genovese et al. 2010; Kopp et al. 2011). Any genetic study investigating ESKD in populations of African descent must then take into the accounts of these two alleles, assuming a recessive model of risk. These alleles and their associated nephropathies may be coincident with

T2D-ESKD (Freedman et al. 2011).

Next Generation Exome Sequencing

“Next Generation Exome Sequencing,” or NGES, refers to the relatively recent advent of massively parallel sequencing of the exome. The exome is the portion of the genome that contains protein coding genes. Unlike GWAS, NGES offers the ability to scrutinize genetic variations previously impermeable to GWAS such as low frequency (MAF = 1-5%) and rare variants (MAF < 1%), often at depths of >40x coverage per genetic variant. Thus, NGES can be

17 leveraged to implicate low and rare frequency variants with complex diseases. NGES approaches have been successful in implicating such variants to autism, multiple sclerosis, schizophrenia, and dilated cardiomyopathy, amongst other diseases (Sander et al. 2012;

Ramagopalan et al. 2011, Xu et al. 2011, Pinto et al. 2011). Since protein coding genes represent

1% of the genome but contain 85% of mutations associated with disease traits (Majewski et al.

2011), studying low frequency exonic variants in T2D-ESRD provides an attractive framework to elucidate the genetic underpinnings of disease mechanisms (Bowden 2010).

It is quite striking that 70% of non-diabetic ESKD risk in African Americans is due to common coding variants in the APOL1 gene (Genovese et al. 2010). It is equally striking that

APOL1 likely has a modest or negligible contribution to T2D-ESKD (McDonough et al. 2011;

Palmer et al. 2014). The aforementioned GWAS performed by our group investigating T2D-

ESKD in African Americans identified a number of potentially significant common variants, but as with other common variant GWAS, these studies were only able to explain modest risk associations (McDonough et al. 2011; Palmer et al. 2014). One likely source of the missing heritability is low frequency coding variants with minor allele frequencies (MAF) of less than

5% (Manolio et al. 2009).

Project Objectives

While much is known about the genetic su sceptibility to non-T2D ESKD in African

Americans (ie, APOL1 associated nephropathy), it remains to be elucidated what genetic loci influence the disproportionate burden of T2D-ESKD borne by the African American community. This “missing heritability” has led to many hypotheses that aim to elucidate its sources, positing copy number variation (CNV), epigenetics, low frequency and rare variants, micro RNAs (miRNAs), and gene-environment interactions as potential sources. Scientific advances over the past decade has also led to an explosion in the number of available resources

18 which may serve as adjuvants to logical, pragmatic genetic explorations seeking to bridge conventional genetic studies. The reconciliation of these vast resources with exome sequencing data in an effort to facilitate the study of the missing heritability of T2D-ESKD in African

Americans is a major component of this thesis work. We posit that this approach may provide tangible, novel insights into the pathogenesis of disease, aid in prognostication, and identify potential drug targets for T2D-ESKD. Independently, exome sequencing and these data resources may produce unique understandings into the missing heritability of T2D-ESKD, but by unifying these resources, multiplicative advances in our understanding cannot be excluded.

Throughout the following studies we sought to leverage whole exome sequencing

(WES) data with bioinformatic databases to aid in the identification of the missing heritability of

T2D-ESKD in African Americans. We hypothesized that WES would provide novel insights into the missing heritability of T2D-ESKD in African Americans by identifying gene regions and genetic variants previously impervious to GWAS and linkage studies, especially “functional” low frequency and rare SN Ps. By using the term “functional” we are encompassing intronic, synonymous, and non-synonymous SNPs; intronic SNPs may be functional if they affect conserved transcription factor binding sites (TFBS) or are located in DNaseI hypersensitivity loci, synonymous variants may affect splicing, and non-synonymous SNPs may affect secondary or tertiary protein structure through amino acid substitutions. Data resources such as

RegulomeDB (Stanford University), The 1000 Genomes Projects, Nephromine (University of

Michigan), and the Exome Variant Server (NHLBI/ University of Washington), coupled with our WES from the T2D-GENES Consortium, will serve as vital resources and aid in the generation of novel hypotheses in our search for genetic variations influencing T2D-ESKD susceptibility in African Americans.

19

In Chapter 2 we used this data to investigate the hypothesis that low frequency and rare missense variants in a known Mendelian nephropathy gene, NPHS1, an essential slit diaphragm protein (nephrin), contribute to T2D-ESKD susceptibility. In this study 28 rare and low frequency variants were identified from the T2D-GENES exome sequencing data, the Exome

Variant Server, and The 1000 Genomes Project. These variants were then genotyped in T2D-

ESKD cases, non-T2D ESKD cases, T2D-only cases, and population based controls in a pragmatic, stepwise manner to assess phenotypic associations of the variants in question. A major limitation to working with these types of variants is the power needed to detect them .

Thus, it was necessary to employ statistical methods specifically designed to detect these types of variants with maximum power.

WES data was also used to generate a priori hypotheses for novel drivers of T2D-ESKD, such as in Chapter 3. In this investigation we initially identified two genetic variants in the

RREB1 gene which were highly associated with T2D-ESKD in our WES discovery study.

Importantly, RREB1 splice variants have been shown the repress the angiotensinogen gene

(Date et al. 2004), and genetic variants in RREB1 have previously been reported to interact with

APOL1 and have been associated with eGFR (Bostrom et al. 2012; Yang et al. 2010); the latter finding was reported in supplementary data. Biological plausibility coupled with encouraging genetic evidence led us to the hypothesis that functional variants with high MAF discrepancies across ethnicities, ie, African and European Americans, in RREB1 could modulate susceptibility to T2D-ESKD and explain some of the ethnic disparity observed in T2D-ESKD. Ultimately, we were able to validate this hypothesis, and the data we observed led us to also study and establish novel pleotrophic roles for RREB1. An important resource in this exploration was

RegulomeDB, which utilizes TFBS, DNaseI hypersensitivity sites, and expression quantitative trait loci (eQTL) data to identify and prioritize functional intronic variants which may modulate

20 these loci. Through the extension of our WES data and follow -up studies of targeted genotyping utilizing RegulomeDB we were able to identify an intronic variant associated with multiple phenotypes that may disrupt >40 TFBS in two ethnic populations.

In Chapter 4 we hypothesized functional variants in a known IgA nephropathy (IgAN) gene, complement factor H (CFH), modulated susceptibility to ESKD in the general African

American population. This study was built upon a previous genetic association in an intronic

CFH variant identified and published by our lab group (Bostrom et al. 2012). Using similar approaches as in Chapters 2 & 3, we genotyped missense and nonsynonymous variants from

The 1000 Genomes Project and Exome Variant Server based on MAF discrepancy between

African and European Americans, PolyPhen2 prediction, as well as MAF frequency. Here we tested both the common-variant complex disease and rare-variant complex disease hypotheses.

The appendices of this thesis also include unique investigations studying T2D-ESKD. In

Appendix 1 we sought to create a genetic risk score of T2D-ESKD based on the Notch signalling pathway – known to be dysregulated in T2D-ESKD. Appendix 2 includes data implicating a novel gene, RTN1, with ESKD in African and European Americans. This investigation was part of a joint collaboration, where our partners identified RTN1 as being differentially expressed in rodent models of kidney disease. We extended our findings from Chapter 2 to European

Americans in Appendix 3, identifying several variants in NPHS1 which may be associated with

T2D-ESKD. Finally, in Appendix 4, we identified a missense variant in CRCT1 associated with

T2D-ESKD in two independent samples of African Americans.

21

References (Introduction)

1. Abbate M, Zoja C, Remuzzi G. 2006. How does proteinuria cause progressive renal damage? J Am Soc Nephrol. 17: 2974-2989.

2. Afkarian M, Sachs MC, Kestenbaum B, et al. 2013. Kidney disease and increased mortality risk in type 2 diabetes. J Am Soc Nephrol. 24: 302-308.

3. Bakris 2013. Overview of diabetic nephropathy. UpToDate. Accessed March 3rd, 2014.

4. Bargman JM, Skorecki K. Chapter 280. Chronic Kidney Disease. In: Longo DL, Fauci AS, Kasper DL, Hauser SL, Jameson J, Loscalzo J. eds. Harrison's Principles of Internal Medicine, 18e. New York: McGraw-Hill; 2012. http:/ / accessmedicine.mhmedical.com/ content.aspx?bookid=331&Sectionid=40727069. Accessed February 03, 2014.

5. Barua M, Shieh H, Sclonodorff J, et al. 2014. Exome sequencing and in vitro studies identified podocalyxin as a candidate gene for focal segmental glomerulosclerosis. Kidney Int. 85(1): 124-133.

6. Brenner BM, Meyer TW, Hostetter TH. 1982. Dietary protein intake and the progressive nature of kidney disease: the role of hemodynamically mediated glomerular injury in the pathogenesis of progressive glomerular sclerosis in aging, renal ablation, and intrinsic renal disease. N Eng J Med. 307: 652-659.

7. Böger CA, Heid IM. 2011. Chronic Kidney Disease: Novel Insights from Genome-Wide Association Studies. Kidney Blood Press Res. 34: 225-234.

8. Bostrom MA, Lu L, Chou J, Hicks PJ, et al. Candidate genes for non-diabetic ESRD in African Americans: a genome-wide association study using pooled DNA. Hum Genet. 128(2): 195-204.

9. Bostrom MA, Kao WH, Li M et al. 2012. Genetic association and gene-gene interaction analyses in African American dialysis patients with nondiabetic nephropathy. Am J Kidney Dis. 59: 210-221

10. Bostrom MA, Kao WHL, Li Man, Abboud HE, Adler SG, Iyengar SK, Kimmel PL, et al. 2012. Genetic Association and Gene-Gene interaction Analyses in African American Dialysis Patients with Nondiabetic Nephropathy. American Journal of Kidney Diseases. 59(2): 210-221.

11. Boute N, Gribouval O, Roselli S, Benessy F, Lee H, et al. 2000. NPHS2, encoding the glomerular protein podocin, is mutated in autosomal recessive steroid-resistant nephrotic syndrome. Nat Genet. 24: 349-352.

12. Bowden DW, Colicigno CJ, Langefeld CD, et al. 2004. A genome scan for diabetic nephropathy in African Americans. Kidney Int. 66: 1517-1526.

13. Bowden DW. 2010. Will Family Studies Return to Prominence in Human Genetics and Genomics? Rare Variants and Linkage Analysis of Complex Traits. Genes & Genomics. 33: 1-8.

22

14. Chambers JC, Zhang W, Lord GM, van der Harst P, et al. 2010. Genetic loci influencing kidney function and chronic kidney disease in man. Nat Genet. 42(5): 373-375.

15. Date S, Nibu Y, Yanai K, Hirata J, Yagami K, Fukamizu A. 2004. Finb, a multiple zinc finger protein, represses transcription of the human angiotensinogen gene. Int J Mol Med. 13(5): 637-42.

16. de Boer IH, Rue TC, Hall YN, et al. 2011. Temporal trends in the prevalence of diabetic kidney disease in the United States. JAMA. 305(24): 2532-2539.

17. de Zeeuw, D. 2011. Unmet need in renal protection--do we need a more comprehensive approach? Contrib Nephrol 171:157-160.

18. de Zeeuw D, Akizawa T, Audhya P, Barkis GL, et al. 2013. Bardoxolone Methyl in Type 2 Diabetes and Stage 4 Chronic Kidney Disease. N Eng J Med. 369: 2492-2503.

19. Dusel JA, Burdon KP, Hicks PJ, Hawkins GA, Bowden DW, Freedman BI. 2005. Identification of podocin (NPHS2) gene mutations in African Americans with nondiabetic end-stage renal disease. Kidney Int. 68(1): 256-262.

20. Early A, Miskulin D, Lamb EJ, Levey AS, Uhlig K. 2012. Estimating equations for glomerular filtration rate in the era of creatinine standardization: a systematic review. Ann Intern Med. 156(11): 785-795.

21. Eckel RH. Chapter 242. The Metabolic Syndrome. In: Longo DL, Fauci AS, Kasper DL, Hauser SL, Jameson J, Loscalzo J. eds. Harrison's Principles of Internal Medicine, 18e. New York: McGraw-Hill; 2012. http:/ / accessmedicine.mhmedical.com/ content.aspx?bookid=331&Sectionid=40727020. Accessed February 04, 2014.

22. Fogo AB. Mechanisms of progression of chronic kidney disease. Pediatr Nephrol. 22: 2011- 2022.

23. Freedman BI, Volkova NV, Satko SG, et al. 1995. Population-based screening for family history of end-stage renal disease among incident dialysis patients. Am J Nephrol. 25(6): 529-535.

24. Freedman BI. 2002. End-stage renal failure in African Americans: insights in kidney disease susceptibility. Nephrol Dial Transplant. 17(2): 198-200.

25. Freedman BI, Hicks PJ, Sale MM, et al. 2007. A leucine repeat in the carnosinase gene CNDP1 is associated with diabetic end-stage renal disease in European Americans. Nephrol Dial Transplant. 22: 1131-1135.

26. Freedman BI, Langefeld CD, Lu L, et al. 2011. Differential effects of MYH9 and APOL1 risk variants on FRMD3 Association with Diabetic ESRD in African Americans. PLoS Genet. 7(6): e1002150.

27. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, et al. 2002. The Structure of Haplotype Blocks in the Human Genome. Science. 296(5576): 2225-2229.

23

28. Gaut BS, Long AD. 2003. The Lowdown on Linkage Disequilibrium. Plant Cell. 15(7): 1502-1506.

29. Genovese G, Friedman DJ, Ross MD, Lecordier L, et al. 2010. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science. 329(5993): 841-5.

30. Gharavi AG, Yan Y, Scolari F, Schena FP, Frasca GM, et al. 2000. IgA nephropathy, the most common cause of glomerulonephritis, is linked to 6q22-23. Nature Genetics. 26: 354- 357.

31. Groop PH, Thomas MC, Moran JL, et al. 2009. The presence and severity of chronic kidney disease predicts all-cause mortality in type 1 diabetes. Diabetes. 58: 1651-1658.

32. Hanson RL, Craig DW, Millis MP, et al. 2007. Identification of PVT1 as a candidate gene for end-stage renal disease in type 2 diabetes using a pooling-based genome-wide single nucleotide polymorphism association study. Diabetes. 56: 975-983.

33. Hart TC, Gorry MC, Woodard AS, et al. 2002. Mutations of the UMOD gene are responsible for medullary cystic kidney disease 2 and familial juvenile hyperuricaemic nephropathy. J Med Genet. 39(12): 882-892.

34. Himmelfarb J, Tuttle KR. 2013. New Therapies for Diabetic Kidney Disease. N Engl J Med. 369: 2549-2550.

35. Janssen B, Hohenadel D, Brinkkoetter P, et al. 2005. Carnosine as a protective factor in diabetic nephropathy: association with a leucine repeat of the carnosinase gene CNDP1. Diabetes. 54: 2320-2327.

36. Jolliffe IT. 2005. Principal Component Analysis. John Wiley & Sonds, Ltd.

37. Kao WH, Klag MJ, Meoni LA, Reich D, et al. MYH9 is associated with nondiabetic end- stage renal disease in African African Americans. Nat Genet. 40(10): 1185-1192.

38. Kaplan JM, Kim SH, North KN, Rennke H, Correia LA, Tong HQ, et al. 2000. Mutations in ACTN4, encoding alpha-actinin-4, cause familial focal segmental glomerulosclerosis. Nat Genet. 24: 251-256.

39. Keene KL, Mychaleckyj JC, Leak TS, Smith SG, et al. 2008. Exploration of the utility of ancestry informative markers for genetic association studies of African Americans with type 2 diabetes and end stage renal disease. Human Genetics. 124(2: 147-154.

40. Kestila M, Lenkkeri U, Mannikko M, Lamerdin J, et al. 1998. Positionally Cloned Gene for a Novel Glomerular Protein – Nephrin – Is mutated in Congenital Nephrotic Syndrome. Mol Cell. 1(4): 575-82.

41. Klein RJ, Zeiss C, Chew EY, et al. 2005. Complement factor H polymorphisms in age- related macular degeneration. Science. 308(5720): 385-389.

42. Kon V, Fogo A, Ichikawa I. 1993. Bradykinin causes selective efferent arteriolar dilation during angiotensin I converting enzyme inhibition. Kidney Int. 44: 545-550.

24

43. Kopp JB, Smith MW, Nelson GW, Johnson RC, et al. 2008. MYH9 is a major-effect risk gene for focal segmental glomerulosclerosis. Nat Genet. 40(10): 1175-1184.

44. Kopp JB, Nelson GW, Swampath K, et al. 2011. APOL1 genetic variants in focal segmental glomerulosclerosis and HIV-associated nephropathy. J Am Soc Nephrol. 22(111): 2129-213.

45. Köttgen A. 2010. Genome-Wide Association Studies in Nephrology Research. American Journal of Kidney Diseases. 56(4): 743-758.

46. Köttgen A, Glazer NL, Dehgham A, Hwang SJ, et al. 2009. Multiple loci associated with indices of renal function and chronic kidney disease. Nat Genet. 41(6): 712-717.

47. Langefeld CD, Beck SR, Bowden DW, Rich SS, Wagenknecht LE, Freedman BI. 2004. Heritability of GFR and Albuminuria in Caucasians with Type 2 Diabetes. American Journal of Kidney Disease. 43(5): 796-800.

48. Lee EK, Jeong JU, Chang JW, Yang WS, Kim SB, et al. 2011. Activation of AMP-activated protein kinase inhibits albumin-induced endoplasmic reticulum stress and apoptosis through inhibition of reactive oxygen species. Nephron Exp Nephrol. 121: 38-48

49. Lee JY, Chang JW, Yang WS, Kim SB, Park SK, et al. 2011. Albumin-induced epithelial- mesenchymal transition and ER stress are regulated through a common ROS-c-Src kinase-mTOR pathway: effect of imatinib mesylate. Am J Physiol Renal Physiol. 300: 1214- 1222.

50. Lewontin RC. 1964. The interaction of selection and linkage. I. General Considerations; Heterotic Models. Genetics. 49(1): 49-67.

51. Manolio T, Collins F, Cox N, Goldstein D, Hindroff L, et al. 2009. Finding the missing heritability of complex diseases. Nature. 462(7265): 747-53.

52. Masharani U, German MS. Chapter 17. Pancreatic Hormones and Diabetes Mellitus. In: Gardner DG, Shoback D. eds. Greenspan’s Basic & Clinical Endocrinology, 9e. New York: McGraw-Hill; 2011. http:/ / accessmedicine.mhmedical.com/ content.aspx?bookid=380&Sectionid=39744057. Accessed February 04, 2014.

53. McDonough CW, Palmer ND, Hicks PJ, Roh BH, et al. 2011. A Genome Wide Association Study for Diabetic Nephropathy Genes in African Americans. Kidney Int. 79(5): 563-572.

54. Mochizuki, Toshio, et al. 1996. PKD2, a gene for polycystic kidney disease that encodes an integral membrane protein. Science 272(5266): 1339-1342.

55. NIDDK 2010. National Diabetes Statistics, 2011. National Diabetes Information Clearinghouse. < diabetes.niddk.nih.gov/ dm/ pubs/ statistics> Accessed 2/ 27/ 14.

56. NKDEP 2010. National Kidney Disease Education Program, U.S. Department of Health and Human Services, National Institutes of Health. Accessed 02/ 06/ 2014.

25

57. O’Seaghdha CM, Fox CS. 2011. Genetics of chronic kidney disease. Nephron Clin Pract. 118:c55-c63.

58. Palmer ND, Ng MCY, Hicks PJ, Mudgal P, et al. 2014. Evaluation of Candidate Nephropathy Susceptibility Genes in a Genome-Wide Association Study of African American Diabetic Kidney Disease. PLoS One. 9(2): e88273.

59. Parsa A, Kao WHL, Xie D, Astor BC, et al. 2013. APOL1 Risk Variants, Race, and Progression of Chronic Kidney Disease. N Eng J Med. 369(23): 2183-2196.

60. Peterson JC, Adler S, Burkat JM, Green T, et al. 1995. Blood Pressure Control, Proteinuria, and the Progression of Renal Disease: The Modification of Diet in Renal Disease Study. Ann Intern Med. 123(10): 754-762.

61. Pezzolesi MG, Poznik GD, Mychaleckyj JC, et al. 2009. Genome-wide association scan for diabetic nephropathy susceptibility genes in type 1 diabetes. Diabetes. 58: 1403-1410.

62. Pinto JR, Siegfried JD, Parvatiyar MS, Li D, et al. 2011. Functional characterization of TNNC1 rare variants identified in dilated cardiomyopathy. The Journal of Biological Chemistry. 286(39): 34404-12.

63. Placha G, Canani LH, Warram JK, Krowleski AS. 2005. Evidence for different susceptibility genes for proteinuria and ESRD in type 2 diabetes. Adv Chronic Kidney Dis. 12: 155-169.

64. Ramagopalan SV, Dvemt DA, Cader MZ, et al. 2011. Rare variants in the CYP27B1 gene are associated with multiple sclerosis. Annals of Neurology. 70(6): 881-6.

65. Reeders ST, Breuning MH, Davies KE, et al. 1985. A highly polymorphic DNA marker linked to adult polycystic kidney disease on chromosome 16. Nature. 317:542-544.

66. Reznichenko A, Snieder H, van den Born J, de Borst MH, et al. CUBN as a novel locus for end-stage renal disease: insights from renal transplantation. PLoS One. 7(5): e35612.

67. Rossing P, Hougaard P, Borch-Johnsen K, Parving HH. 1996. Predictors of mortality in insulin dependent diabetes: 10 Year observational follow up study. BMJ. 313: 779-784.

68. Sanders SJ, Murtha MT, Gupta AR, Murdoch JD, et al. 2012. De novo mutations revealed by whole-exome sequencing are strongly associated with autism. Nature. 4;485(7397): 237-41.

69. Schena FP, Gesualdo L. 2005. Pathogenetic Mechanisms of Diabetic Nephropathy. J Am Soc Nephrol. 16: 530-533.

70. Shimazaki A, Kawamura Y, Kanazawa A, et al. 2005. Genetic variations in the gene encoding ELMO1 are associated with susceptibility to diabetic nephropathy. Diabetes. 54: 1171-1178.

71. Smith MW, O’Brien SJ. 2005. Mapping by Admixture Linkage Disequilibrium: Advances, Limitations and Guidelines. Nat Rev Genet. 6(8): 623-632.

26

72. Spray BJ, Atassi NG, Tuttle AB, et al. 1995. Familial risk, age at onset, and cause of end - stage renal disease in white Americans. J Am Soc Nephrol. 5:1806-1810. PMID: 7787148

73. Sun J. 2013. Population Genetics: Hardy Weinber and Linkage Disequilibrium. MOGN 726.

74. Teare MD, Barret JH. 2005. Genetic linkage studies. Lancet. 366(9490): 1036-1044.

75. USRDS 2013. U.S. Renal Data System, USRDS 2013 Annual Data Report: Atlas of Chronic Kidney Disease and End-Stage Renal Disease in the United States, National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, 2013.

76. Vardaril I, Baier LJ, Hanson RL, et al. 2002. Gene for susceptibility to diabetic nephropathy in type 2 diabetes maps to 18q22.3-23. Kidney Int. 66: 1517-1526.

77. Woredekal Y, Friedman EA. Chapter 54. Diabetic Nephropathy. In: Lerma EV, Berns JS, Nissenson AR. eds. CURRENT Diagnosis & Treatment: Nephrology & Hypertension. New York: McGraw-Hill; 2009. http:/ / accessmedicine.mhmedical.com/ content.aspx?bookid=372&Sectionid=39961197. Accessed February 03, 2014.

78. Xu B, Roos JL, Dexheimer P, Boone B, Plummer B, Levy S, Gogos JA, Karayiorgou M. 2011. Exome sequencing supports a de novo mutational paradigm for schizophrenia. Nature Genetics. 43(9):864-8.

79. Yang Q, Kottgen A, Dehghan A, Smith AV, Glazer NL, Chen MH, et al. 2010. Multiple genetic loci influence serum urate levels and their relationship with gout and cardiovascular disease risk factors. Circ Cardiovasc Genet. 3(6): 523-30.

80. Zeitler P, Epstein L, Grey M, Hirst K, et al. "Treatment options for type 2 diabetes in adolescents and youth: a study of the comparative efficacy of metformin alone or in combination with rosiglitazone or lifestyle intervention in adolescents with type 2 diabetes." Pediatric diabetes 8.2 (2007): 74-87.

27

Chapter 2

Coding variants in NPHS1 modulate susceptibility to diabetic and non-diabetic end-stage kidney disease in African Americans

Jason A. Bonomo, BA 1,2, Maggie C. Y. Ng, PhD2,3, Nicholette D. Palmer, PhD 2,3, Jacob M. Keaton 2, Chris P. Larsen, MD4, Pamela J. Hicks, BS2, The T2D-GENES Consortium, Carl D. Langefeld, PhD2, 5, Barry I. Freedman, MD 6, and Donald W. Bowden, PhD 2,3,*

Author Affiliations:

1Department of Molecular Medicine and Translational Science, 2Center for Human Genomics and Personalized Medicine, 3Department of Biochemistry, 4Nephropath, Little Rock, AR, 5Department of Biostatistical Sciences, 6Department of Internal Medicine- Nephrology, Wake Forest School of Medicine, Winston-Salem, North Carolina, USA.

JAB performed experiments, ran the statistical analyses, and prepared the manuscript. MCYN and NDP provided experimental guidance and ed ited the manuscript. JMK confirmed exome sequencing data in the T2D-GENES cohort via targeted genotyping. PJH helped with experiments. The T2D-GENES consortium provided exome sequencing data. CDL created the program used for single-SNP association tests. BIF recruited subjects and served in an editorial capacity. DWB served in an advisory and editorial capacity.

*Corresponding author: Donald W. Bowden 1 Medical Center Blvd Winston Salem, NC 27157 phone: (336) 713-7507 fax: (336) 713-7566 email: [email protected]

Running Title: NPHS1 in African American ESKD

Abstract Word Count: 245 Text Word Count: 4,176 Tables: 4 + 8 (supplementary) Figures: 0 + 2 (supplementary)

Research Support: NIDDK F30 DK098836 (JAB), NIDDK RO1 DK53591 (DWB), NIDDK RO1 DK070941, (BIF), and NIDDK RO1 DK071891 (BIF). The T2D-GENES Consortium is supported by NIDDK U01’s DK085501, DK085524, DK085526, DK085545 and DK085584.

An abbreviated version of this manuscript was accepted to the Clinical Journal of the American Society of Nephrology in May of 2014.

28

Abstract

Background: Familial clustering and presumed genetic risk for type 2 diabetic (T2D) and non - diabetic end-stage kidney disease (ESKD) appear strong in populations of African ancestry.

Methods: Exome sequencing data from African American T2D-ESKD cases and non- diabetic non-nephropathy controls in the T2D-GENES study (“Discovery”) was evaluated, focusing on missense variants in the NPHS1 gene (nephrin). Variants associated in the

Discovery study were next evaluated in independent T2D-ESKD (“Replication”), non-T2D

ESKD, and T2D-only, non-nephropathy samples. A locus-wide NPHS1 analysis was performed by genotyping 28 additional rare- and low-frequency missense variants from exome sequencing resources in T2D-ESKD cases, non-T2D ESKD cases, and non-diabetic, non-nephropathy controls. Finally, sequence kernel association testing (SKAT) was performed on all NPHS1 variants passing quality control in a comprehensive analysis to increase the power of our study.

Results: Initial analysis identified rs35238405 (T233A; MAF=0.0096) as associated with

T2D-ESKD (p=0.042-0.080, OR=2.89); with replication in an independent T2D-ESKD cohort

(p=0.018, OR=4.3). The T233A variant was also associated with non -diabetic ESKD (p=0.016,

OR=4.48), but not type 2 diabetes mellitus (p=0.16-0.28). In a combined analysis, T233A was associated with all-cause ESKD (p=0.0038, OR=2.82; N=3270 cases and N=1187 controls). Two additional variants (H800R, Y1174H) were nominally associated with ESKD and were protective (p=0.036, 0.0084; OR=0.44, 0.04, respectively) in the locus-wide analysis. P-values of

5.64x10-7 and 1.18x10-6 were obtained following adjustment for admixture and admixture +

APOL1 G1/ G2, respectively, in the SKAT analysis.

Conclusion: Coding variants in NPSH1 modulate both risk and protection to common forms of nephropathy in African Americans.

29

Introduction

Multiple lines of evidence support genetic influences on susceptibility to end -stage kidney disease (ESKD). This is particularly true in African Americans where ESKD incidence rates are 3.5 fold higher than in European Americans.1 After adjustment for socioeconomic status, incidence rates and familial aggregation of ESKD remain markedly higher among

African Americans compared to European Americans and other ethnic minorities.2,3 The apolipoprotein L1 gene (APOL1) G1 and G2 alleles, present in those of West African descent, explain a substantial portion of the ethnic disparity in HIV-associated collapsing glomerulopathy, idiopathic focal segmental glomerulosclerosis (FSGS), and hypertension - attributed ESKD, with odds ratios (OR) for association in African Americans of 29, 17, and 7.3, respectively.4,5 These variants account for a portion of the ethnic difference in risk for non - diabetic forms of ESKD (non-T2D ESKD); however, they fail to account for the excess risk of type 2 diabetic nephropathy (T2D-ESKD) in African Americans.6 It is likely that other genetic loci contribute to T2D-ESKD risk in the African American population.7

Next-generation (massively parallel) exome sequencing (NGES) provides a powerful technology to comprehensively identify and test genetic variations in coding sequences of genes for disease association, facilitating the detailed exploration of previously u ntested genetic regions. We have used NGES data to survey NPHS1 (nephrin), an essential slit diaphragm protein involved in both adhesion and signaling, which has been implicated in Congenital

Nephrotic Syndrome of the Finnish Type (CNS).8,9,10 In the canonical CNS a 2-bp frameshift

mutation (Fin major) and a nonsense mutation in exon 26 (Fin minor) are responsible for >80% of all

CNS cases in Finland.8 In addition, variants in nephrin have been linked to minimal change

30 nephrotic syndrome (MCNS) and childhood -onset steroid-resistant nephrotic syndromes,11,12 as well as late-onset forms of steroid -resistant nephrotic syndrome.13

In contrast to these early onset and uncommon forms of nephropathy, we evaluated rare and low frequency NPHS1 variants for association with ESKD in a community-based sample of

African Americans. We hypothesized that low frequency (MAF < 5%) and rare (MAF <1%) coding variants in NPHS1 contribute to ESKD risk in African American populations. In order to identify associations amongst low and rare frequency variants we employed sequence kernel association testing to augment the study.

31

Materials & Methods

Study Subjects

Recruitment and sample collection procedures have previously been described in detail.6,16 The study was approved by the Institutional Review Board at Wake Forest School of

Medicine. All participants were unrelated, born in North Carolina, South Carolina, Georgia,

Tennessee, or Virginia and provided written informed consent. DNA extraction was performed using the PureGene system (Gentra Systems, Minneapolis, MN, USA). African Americans with

T2D-ESKD were recruited from dialysis facilities. Type 2 diabetes (T2D) was diagnosed in those developing diabetes after age 25 years, without historical evidence of diabetic ketoacidosis or receiving solely insulin therapy since diagnosis. T2D-ESKD was diagnosed after >5 year diabetes duration prior to renal replacement therapy, or with diabetic retinopathy or ≥ 100 mg/ dl proteinuria on urinalysis (when available), in the absence of other ca uses of nephropathy. Unrelated African American controls without diabetes or renal disease (based on a serum creatinine concentration <1.5 [men] or <1.3 mg/ dl [women]) were recruited from the community and internal medicine clinics at Wake Forest Baptist H ealth. Ethnicity was self- reported and confirmed by genotyping with ancestry informative markers. The mean African ancestry of the samples was 79.85% ± 11.54 for cases and 77.7% ± 10.96 for controls.

African American non-diabetic (non-T2D) ESKD cases

African American non-T2D ESKD cases lacked diabetes (or diabetes developed after initiating renal replacement therapy). ESKD was typically attributed to chronic glomerular disease (e.g., FSGS), HIV-associated nephropathy, “hypertension-attributed”, or unknown cause. Patients with ESKD due to urologic/ surgical cause, polycystic kidney disease or IgA nephropathy were excluded. The mean African ancestry of the non -T2D ESKD samples was

80.01% ± 10.96.

32

African American T2D (non-nephropathy) samples

African Americans with T2D but lacking nephropathy were recruited from the African

American-Diabetes Heart Study (AA-DHS),17 as well as internal medicine clinics at Wake Forest

Baptist Health. These diabetic controls were receiving insulin or oral agents, had an HbA1C

>6.5% or a fasting plasma glucose >126 mg/ dl, and serum creatinine concentration <1.5 mg/ dl

(men) or 1.3 mg/ dl (women). All T2D-only non-nephropathy controls in this study had an eGFR >60ml/ min/ 1.73m 2 and a urine albumin:creatinine ratio (UACR) <60 mg/ g. Althou gh some diabetic controls had low level microalbuminuria, a UACR up to 60 mg/ g was included to improve power since many African Americans with T2D have mild albuminuria and normal

GFR.

Sample Preparation, Genotyping, and Quality Control

African American T2D-ESKD T2D-GENES Discovery cases and controls

Next-generation exome sequencing (NGES) was performed under the auspices of the

T2D-GENES consortium (https:/ / t2d-genes.sph.umich.edu/ ) utilizing an Agilent V2 capture array platform (Agilent Technologies, Santa Clara, CA) at the Broad Institute (Cambridge, MA).

Data underwent multiple levels of quality control (QC) prior to release including DNA QC, tests for cryptic relatedness, removal of artifact sites, tests for heterozygosity, and confirmation of variant calling through comparison with prior GWAS data (>90% concordance rates). T2D -

ESKD cases and non-diabetic non-nephropathy controls were selected from a previously published African American T2D-ESKD GWAS.6

Targeted Genotyping

Targeted genotyping of NPHS1 variant rs35238405 was performed across all samples utilizing the Sequenom MassArray system (Sequenom, San Diego, CA) in the Center for

Genomics and Personalized Medicine Research at Wake Forest School of Medicine (Winston-

33

Salem, NC). SNPs were PCR-amplified using primers designed in MassARRAY Assay Design

3.1 (Sequenom, San Diego, CA) and genotypes were analyzed using MassARRAY Typer

(Sequenom). Call rates >97% were achieved and quality control was ensured using blind duplicates within each cohort of samples (100% concordance rate). The T2D-GENES discovery study also underwent targeted genotyping and 100% concordance rates between the exome sequencing and genotyping data were observed.

Locus-wide analysis

The 1000 Genomes Project (ASW, YRI, LWK) and the Exome Variant Server (NHLBI) were mined for additional rare (<0.5%) and low (MAF 0.5-5%) frequency NPHS1 missense variants beyond those found in T2D-GENES. Variants for analysis in this study were chosen based on allele enrichment in African versus European ancestral populations, type of mutation

(missense), and PolyPhen2 prediction (http:/ / genetics.bwh.harvard.edu/ pph/ data/ ). In total, an additional 32 NPHS1 variants were selected for targeted genotyping in T2D-ESKD, non-T2D

ESKD, and population-based control cohorts. Of the 32 variants, 4 were unable to be genotyped due to their genomic context. Targeted genotyping of the remaining 28 variants was performe d on the Sequenom system as described above. Four of the 28 variants failed quality control.

Statistical Analysis

Single SNP Association Testing

Each SNP was tested for departure from Hardy-Weinberg Equilibrium (HWE) expectations through a chi square goodness of fit test (HWE > 0.00001 cases and HWE > 0.001 controls). The overall genotypic test of association and the three genetic models (dominant, additive, and recessive) were computed to test for association between each SNP and each phenotype. Data from T2D-GENES was adjusted for admixture-only through a principal component analysis (Supplementary Table 1) as noted in the text. All other tests for association

34 were adjusted for admixture18 and APOL1 G1/ G2 risk allele status (assuming a recessive model of disease risk)4 (Tables 2 and 3). These tests were computed using the SNPGWA program

(http:/ / www.phs.wfubmc.edu/ public_bios/ sec_gene/ downloads.cfm).

Locus-Wide Burden Analysis

Sequence Kernel Association Testing (SKAT) was performed on all 16 non - monomorphic NPHS1 variants genotyped in T2D-ESKD cases, non-T2D ESKD cases, and non- diabetic non-nephropathy population-based controls; 8 variants were monomorphic.14 The

SKAT Meta package was run on R 3.0.1 (http:/ / cran.r-project.org/ web/ views/ Genetics.html) using the default model weighted for rare-variants, termed “[c(1,25)]”. An “all-cause” ESKD analysis was performed by combining genotype information from the T2D-ESKD cases and non-T2D ESKD cases versus the controls. We used the Fisher’s protected least significant differences (LSD) approach to evaluate significance of the variants. Specifically, we computed the overall gene-level test provided by SKAT Meta to determine if there was evidence of an effect within NPHS1. If the overall test was significant, we computed the individual SNP comparisons. The goal with the latter set of comparisons is to identify the variants that are driving the gene-level test.

Comparative Analysis

A two-tailed T-test was performed in order to assess whether phenotypic variations in carriers of the NPHS1 T233A variant were significant using an α=0.05.

35

Results

The influence of NPHS1 coding variants on susceptibility to ESKD was evaluated in the general African American population. Characteristics of the two samples of T2D-ESKD cases

(T2D-GENES and “Replication T2D-ESKD”), the non-T2D ESKD cases, T2D non-nephropathy controls, and two samples of non-diabetic non-nephropathy controls are summarized in Table

1. Participants with T2D-ESKD in the T2D-GENES exome sequencing cohort (Discovery cases) and Replication T2D-ESKD cases were similar for all characteristics. Ages at enrollment for both

T2D-ESKD case groups were older than the age for the two control groups (T2D non - nephropathy and non-diabetic non-nephropathy controls); however, the ages at T2D onset among T2D-ESKD cases were younger than the age at enrollment for the controls. All cohorts except non-T2D ESKD cases had a larger percentage of females compared to males. The distribution of BMI is broadly similar across all samples, although subjects with non -T2D ESKD had the lowest mean BMI (Table 1).

Data from next-generation exome sequencing on 529 African American Discovery T2D-

ESKD cases and 535 African American non-diabetic non-nephropathy controls included in the

T2D-GENES Consortium (t2d-genes.sph.umich.edu) was evaluated. A total of 40 coding NPHS1 variants were identified in at least one individual (see Supplementary Table 1 summarizing

NPHS1 exome sequencing results). Due to the low power of this discovery study we chose to focus on variants within NPHS1 which were statistically associated with T2D-ESKD

(Supplementary Tables 1, 6). The NPHS1 variant rs35238405 (T233A) was among the most common coding variants (MAF = ~1%) with PolyPhen2 prediction of impacting protein function. In the T2D-GENES Discovery analysis, rs35238405 was associated with T2D-ESKD based upon a comparison of allele frequencies with non -T2D, non-nephropathy controls. The

T233A variant had a minor allele frequency (MAF) of 1.4% in T2D-ESKD cases and 0.6% in

36 controls (Table 2), p=0.042 adjusted for admixture (OR=2.89 [95% CI 1.04, 8.03]; Supplementary

Table 1) and p=0.080 adjusted for admixture + APOL1 (Table 2). The rs35238405 SNP was subsequently genotyped in 1,305 independent African American T2D-ESKD Replication cases and 760 additional African American Replication non -diabetic non-nephropathy controls (T2D-

ESKD Replication, Table 2). T233A was associated with T2D-ESKD (p=0.018, OR=4.30 [1.29,

14.34]) in the Replication analysis after adjustment for admixture + APOL1. Rs35238405 was found at similar MAFs in both samples (MAF = 0.01-0.014 T2D-ESKD cases, MAF = 0.0034-

0.0056 non diabetic, non-nephropathy controls). Pooling the Discovery and Replication T2D-

ESKD cohorts (Combined T2D-ESKD) and contrasting them with the combined Discovery and

Replication non-T2D, non-nephropathy controls revealed an OR=2.86 [1.38, 5.94] for association; p=0.0047 adjustment for admixture + APOL1 (Table 2).

With consistent association between the rs35238405 (T233A) variant and T2D-ESKD, we investigated whether there was evidence of association of NPHS1 with non-diabetic etiologies of ESKD. Rs35238405 was genotyped in 1,705 African American non -T2D ESKD cases and the

T233A variant was significantly associated with non -T2D ESKD (OR = 4.48 [1.33, 15.11]; p=0.016 compared to the 671 African American non-diabetic non-nephropathy controls used in the T2D-

ESKD Replication analysis; Table 2). The T233A NPHS1 variant was also tested for association with T2D in the absence of nephropathy; in 503 African American T2D non -nephropathy samples (eGFR > 60ml/ min/ 1.732 and SCr < 1.5mg/ dL [men] and <1.3mg/ dL [women]) no evidence for association with T2D was detected (P=0.16, 0.28, following adjustment for admixture + APOL1 and admixture alone, respectively; Supplementary Table 5). Finally, rs35238405 was investigated in a combined all-cause ESKD analysis (Discovery T2D-GENES,

Replication T2D-ESKD cases, and non-T2D ESKD cases). Here, the T233A variant was associated with all-cause ESKD (p=0.0038) and had an OR=2.82 [1.4, 5.68] (Table 2).

37

To determine if other rare- and low-frequency missense variants contribute to T2D-

ESKD, non-T2D ESKD, or all-cause ESKD an additional 28 NPHS1 variants chosen from various exome sequence data resources were genotyped in our samples. Four variants failed quality control metrics, eight variants proved to be monomorphic and MAFs of the remaining 16 observed variants in T2D-ESKD cases ranged from 0.0004 to 0.13. In single SNP association testing rs146400394 (H800R) was associated with protection from all-cause ESKD (p=0.0084;

OR=0.04 [0, 0.44]) following adjustment for admixture + APOL1 (Table 3), and the strength of this association was similar to that observed with the T233A variant. Rs115489112 (Y1174H) was also nominally associated with protection from T2D-ESKD (p=0.022, OR=0.29 [0.10, 0.84]) and all-cause ESKD (p=0.036, OR=0.44 [0.2, 0.95]) (Table 3), after adjustment for admixture +

APOL1, but not non-T2D ESKD.

The contributions of NPHS1 variations to genetic susceptibility to ESKD is reflected most comprehensively by an assessment of the cumulative effect of variants using sequence kernel association testing (SKAT), which weights variants on minor allele freq uency in order to increase the study power (Wu et al. 2011). The SKAT analysis was run by combining genotype information from the T2D-ESKD replication cohort, non-T2D ESKD cohort, and the replication set of 760 non-T2D non-nephropathy controls. An unbiased approached was used by including all 16 non-monomorphic NPHS1 variants which passed QC in the SKAT analysis

(Supplementary Table 7). Under a model weighted for rare/ low frequency variants P-values of

2.95x10-8, 5.64x10-7, and 1.18x10-6 were obtained in an unadjusted, admixture, and then admixture + APOL1 adjusted model, respectively, for all-cause ESKD (Table 4).

38

Discussion

In an effort to identify low frequency variants associated with nephropathy susceptibility we evaluated NPHS1 exome sequence data from a cohort of African American

T2D-ESKD cases and non-diabetic, non-nephropathy controls. It is important to understand that while much is known about the genetic susceptibility to non -T2D ESKD in African Americans

(i.e., APOL1 associated nephropathies), the genetics of T2D-ESKD is less well understood in

T2D-ESKD with the APOL1 loci having a neglible impact. Additionally, we specifically chose to focus on the nephrin gene – rather than considering all genes in the nephrin/ slit diaphragm pathway – due to the constraints of the discovery study which limited our power. Moreover, nephrin mRNA expression correlates with GFR in patients with diabetic nephropathy and T2D-

ESKD (Supplementary Figure 2).

The discovery study identified a missense mutation (T233A) in the NPHS1 gene, rs35238405, which was associated with a heightened risk of T2D-ESKD (OR=2.89 [1.04, 8.03]; p=0.042). The exome sequencing data for this T233A variant was confirmed with targeted genotyping resulting in a 100% concordance rate. The T233A variant is predicted to be

“probably-damaging” by PolyPhen2 (http:/ / genetics.bwh.harvard.edu/ pph/ data/ ) and has been observed solely in African ancestral populations. The NPHS1 gene has been previously associated with Congenital Nephrotic Syndrome of the Finnish Type, and been implicated in additional uncommon etiologies of nephropathy.8,11-13 Genetic variants in nephrin have not been implicated in nephropathy in general populations, nor has the T233A variant been previously reported with CNS or any other type of nephropathy.

We further evaluated the contributions of rs35238405 to nephropathy-susceptibility in experiments designed to assess whether rs35238405 contributes solely to T2D-ESKD or to other non-diabetic etiologies of ESKD. Following adjustment for admixture and APOL1 G1/ G2 allele

39

status, rs35238405 was independently associated with T2D-ESKD (p=0.0045) and non-T2D

ESKD (p=0.016) in African Americans (Table 2). We note that rs60910145, a compon ent of the

APOL1 G1 haplotype block, was associated with non-T2D ESKD after adjustment for admixture

(p=4.8x10-25), but not after adjustment for admixture and APOL1 G1/ G2 allele status (p=0.11), demonstrating the effectiveness of controlling for APOL1 influences (data not shown).

Importantly, consistent odds ratios and MAFs were observed across the phenotypes (Table 2). A combined all-cause ESKD analysis pooled genotypes from the exome sequencing Discovery study, T2D-ESKD Replication cohort, non-T2D ESKD cases (N=3270 cases) and comparing them to non-T2D non-nephropathy controls (N=1187 controls) revealed that the T233A variant was associated with all-cause ESKD (p=0.0038) and conferred an OR of 2.82 [1.40, 5.68] following adjustment for admixture and APOL1 G1/ G2 allele status (MAF Cases/ Controls =

0.012/ 0.0038); thus, this T233A variant appears to be an ESKD specific variant independent of primary phenotype, ie, non-diabetes-associated nephropathies and diabetic nephropathy.

Rs35238405 was not associated with T2D (p=0.16-0.28; Table 1, Supplementary Table 5); importantly, in that analysis, we only used diabetic samples with an eGFR > 60mL/ min/ 1.73 2 and a SCr cutoff of <1.5mg/ dL (men) and <1.3mg/ dL (women), ensuring the quality of the

T2D-only, non-nephropathy phenotype. We collaborated with the laboratory of Chris Larson,

MD, at Nephropath (Little Rock, AR) where an additional 534 African Americans with various etiologies of ESKD were genotyped for the T233A allele. In that analysis a MAF of 2.4% was observed (personal communications), suggesting that the prevalence of this allele may be higher than estimated.

One third (33%) of T233A ESKD carriers had two APOL1 risk alleles compared to 25% of non-T233A ESKD cases, trending towards significance (P=0.09; Supplem entary Table 3). It is possible and perhaps likely that there is an additive interaction between the T233A variant and

40

the APOL1 G1/ G2 risk alleles; however, given our sample size and the low frequency of the

T233A variant we may lack statistical power to detect this effect. We examined the histopathology of T233A in an independent sample of 542 African American patients who underwent renal biopsy at Nephropath; 13 were carriers of T233A (MAF=2.4%). Findings on biopsy included diabetic glomerulosclerosis, m embranous glomerulopathy, arterionephrosclerosis, and collapsing glomerulopathy. There were no consistent changes associated with the T233A allele at the light microscopy or ultrastructural level, although this may be due to the small sample size.

To further investigate the role of NPHS1 missense variants in African Americans with nephropathy, an additional 24 rare- and low-frequency NPHS1 variants were successfully genotyped (Supplementary Table 2). Two additional NPHS1 missense SNPs (H800R, Y1174H) were nominally associated with protection from all-cause ESKD (Table 3). It appears that rs115489112 (Y1174H) may be a T2D-ESKD specific variant as its OR and P values (0.29 [0.1,

0.84]; p=0.022) were weaker in the all-cause ESKD analysis, yet remained significantly associated (p=0.036, OR=0.44 [0.2, 0.95]).

To assess the cumulative impact of the NPHS1 locus on ESKD risk – and to increase our study power – we utilized the SKAT program, developed by Wu et al.,14 which weights variants based on their MAF and incorporates direction of effect for each variant. In SKAT, a variant’s contribution to risk, protection, or no effect (neutral) is considered. This provides users much greater power and flexibility to detect associations amongst rare- and low-frequency variants compared to single-SNP association testing. P-values of 5.64x10-7 and 1.18x10-6 were obtained following adjustment for (admixture alone) and (admixture + APOL1 G1/ G2 status), respectively, providing compelling evidence that the effects of NPHS1 variants were independent of the APOL1 G1/ G2 loci and that this study is well powered. Moreover, the

41

effects we observed in this analysis suggest a strong role for low frequency and rare NPHS1 variations modulating common nephropathy risk in African Americans.

We also investigated another source of genetic diversity in the NPHS1 gene: copy number variation (CNV; Supplementary Table 8). Although we are limited by power to detect associations, it appears that an (1) insertion spanning the length of NPHS1 may be associated with ESKD as this insertion was not found in any controls. Additionally, 2 insertions in exon s

20-25 may also be associated with ESKD. Interestingly, 2 insertions in exons 22-25 were found in a control sample which suggests this region may not be as critical to protein stability.

Collectively, our data implicates rare- and low-frequency functional variations in NPHS1 in nephropathy-susceptibility independent of previously implicated loci (i.e., the APOL1 G1 and

G2 alleles) or phenotypes (i.e., T2D). That the T233A variant was found in our control populations suggests incomplete penetrance of the allele. We observed no homozygotes for

T233A and thus cannot infer if that genotype would be at higher risk for ESKD; cases of T233A homozygotes have not been reported in the literature, where this variant found at greatest MAF of 4.1% in Kenyan populations (Supplementary Table 4).

Interestingly, those with ESKD carrying the T233A allele were more likely to be female

(62.4% vs 51.3%; p=0.043), have a greater percentage of African ancestry (0.83 vs 0.80; p=0.020) – consistent with the allelic distribution, and a trend towards a family history of ESKD was observed (38.6% vs 28.9%; p=0.16) (Supplementary Table 3). Given the low MAF (<0.005) of the two other variants identified in single-SNP association testing (rs115489112 and rs146400394), we were unable to meaningfully compare and contrast phenotypic and clinical characteristics of these variants. It should be noted that the MAFs obtained in this study are consistent with those from 1000 Genomes (ASW), the Exome Sequencing Project (NHLBI), and dbGAP, serving as an additional barometer of the data quality (Supplementary Table 4).

42

Finally, to further investigate the role of NPHS1 variants on the population attribute risk

(PAR) of T2D-ESKD we utilized formulas published by Kraft et al. 2009.19 We obtained T2D-

ESKD PARs of 1.1%, 2.43%, and 4.56% for the T233A, Y1174H, and H800R variants, respectively

(data not shown). Thus the NPHS1 locus appears to contribute meaningfu lly to the genetic risk for ESKD in African Americans.

Given that the primary focus of this study was low and rare frequency coding variants we did not find it appropriate to assess the haplotype structures associated with each variant.

Unlike common variants which may lie in discreet haplotype blocks, rare variants do not follow this pattern. Of the 4,500 samples investigated, 3 samples were compound heterozygotes for 5

NPHS1 variants, 21 samples had 4 NPHS1 variants and 61 had 3 NPHS1 variants; the remaining

98.38% of our samples had less than 2 NPHS1 variants (data not shown).

The T233A variant results in a threonine to alanine missense mutation at position 233 (of

1241) immediately distal to the bridge between the 2nd and 3rd Ig-C2 domains of nephrin. It has been postulated that this extracellular region of nephrin, consisting of 8 Ig -C2 domains and a fibronectin domain, is pivotal to the nephrin -nephrin and nephrin-Neph1/ 2 interactions between opposite and adjacent podocytes forming the slit diaphragm. We employed 3D models to assess the structural implications of this variant and hypothesize that this mutation alters the secondary structure of a ß-pleated sheet, compromising a critical structural region bridging the

Ig-C2[2] and Ig-C2[3] domains (Supplementary Figure 1). Alternatively, this variant could interfere with nephrin signaling, either through ligand binding or impaired formation of lipid raft and signaling microdomains. This T233A variant has a MAF of 4.1% in East African

Populations (LWK), while the MAF is 1.7% in West African (YRI) populations, which raises the possibility of a heterozygote advantage under certain environmental conditions.

43

Multiple mechanisms may exist through which variants rs115489112 (Y1174H) and rs146400394 (H800R) confer protection. The Y1174H variant is found in the domain of nephrin

(pos. 1160-1241) which binds to podocin (NPHS2), potentially enhancing the integrity of this interaction. The APOL1 and NPHS2 genes reportedly interact, supporting this concept.15 H800R is located in the 7th Ig-C2 type domain of nephrin and may increase binding stability to nephrin homodimers, nephrin - Neph1/ Neph2 heterodimers, and thereby increase stability of the slit diaphragm microstructure. That these variants (T233A, H800R, Y1774H) are largely absent outside of African ancestral populations suggests that variations in NPHS1 may modulate risk and protection from T2D-ESKD and non-T2D ESKD in African ancestral populations.

This study demonstrates the challenges of working with low frequ ency variants. Even with a substantial estimated odds ratio in the 2.75 range, overwhelming significance of association is not observed even in a fairly large sample numbers (~4,500 cases and controls for the primary tests). While it can be argued that our initial T2D-GENES exome sequencing study was underpowered, we overcome this limitation through replication in T2D-ESKD and non-

T2D ESKD cohorts where we estimate 98% power to detect an association at a MAF of 0.8% assuming an OR of 2.75 with 3000 cases and 1150 controls at an α=0.005 (CaTs Power

Calculator, University of Michigan; data not shown); thus, it highly unlikely the T233A association is a false positive and this proves our methodology overcomes the initial power limitations. Moreover, we have complemented the single association tests by performing an unbiased locus-wide test using SKAT analysis resulting in both greater power, stronger evidence of association (e.g. P≈1.18x10-6; Table 4), and consistently implicating missense variations in NPHS1 to common nephropathies in African Americans.

An important clinical link for this study relates to the treatment of hypertension and nephropathy. Many patients are treated with angiotensin converting enzyme inhibitors (ACEi),

44

such as lisinopril and enalapril, which have been show to increase nephrin expression.20 From this two putative conclusions may be reached: that increased levels of nephrin are the mechanism by which ACEi’s ameliorate nephropathy, or that if one was to prognosticate patients early in T2D (or prior to CKD) based on T233A allele status, levels of the non - deleterious nephrin allele may be increased (renoprotective effects).

Conclusion

NPHS1 variants contribute to all-cause ESKD in African Americans regardless of primary phenotype (eg, type 2 diabetes or non-diabetic nephropathy). We identified several rare coding NPHS1 variants through next-generation exome sequencing coupled with data mining of genomic resources (1000 Genomes, Exome Variant Server) and targeted genotyping in multiple, independent cohorts. These data lend support to the common -disease rare-variant hypothesis, but we must continue to apply novel, pragmatic approaches to expand our understanding of complex diseases, such as T2D-ESKD. These data provide novel genetic insights into the disproportionate burden of ESKD borne by the African -ancestry population and provide a framework for further studies investigating the genetic architecture of severe nephropathy.

45

Disclosures

Conflicts of Interest/ Disclosures: The authors report no competing financial interests.

Acknowledgements: The authors of this study would foremost like to thank the patients for their participation in this research. This work was supported by NIH grants NIDDK F30

DK098836 (JAB), NIDDK RO1 DK53591 (DWB), NIDDK RO1 DK070941 (BIF), and NIDDK RO1

DK071891 (BIF). The T2D-GENES Consortium is supported by NIDDK U01’s DK085501,

DK085524, DK085526, DK085545 and DK085584.

46

References (Chapter 2)

1. USRDS 2010. USRDS Annual Data Report: Volume Two, Chapter 11. 367-382. Accessed 5/ 29/ 12.

2. Spray BJ, Atassi NG, Tuttle AB, et al. 1995. Familial risk, age at onset, and cause of end - stage renal disease in white Americans. J Am Soc Nephrol. 5:1806-1810. PMID: 7787148

3. Freedman BI, Tuttle AB, Spray BJ. Familial predisposition to nephropathy in African- Americans with non-insulin-dependent diabetes mellitus. Am J Kidney Dis. 1995;25:710- 713.

4. Genovese G, Friedman DJ, Ross MD, et al. 2010. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science. 329(5993): 841-5.

5. Kopp JB, Nelson GW, Sampath K, et al. 2011. APOL1 genetic variants in focal segmental glomerulosclerosis and HIV-associated nephropathy. J Am Soc Nephrol. 22(111):2129-37. PMID: 21997394

6. McDonough CW, Palmer ND, Hicks PJ, et al. 2011. A Genome Wide Association Study for Diabetic Nephropathy Genes in African Americans. Kidney Int. 79(5): 563-572.

7. Palmer ND, Freedman BI. 2012. Insights into the Genetic Architecture of Diabetic Nephropathy. Current Diabetes Reports. 12(4):423-31. PMID: 22573336

8. Kestila M, Lenkkeri U, Mannikko M, Lamerdin J, McCready P, Putaala H, Ruotsalainen V, Takako M, Nissinen M, Herva R, Kashtan CE, Peltonen L, Holmber C, Olsen A, Tryggvason K. 1998. Positionally Cloned Gene for a Novel Glomerular Protein – Nephrin – Is mutated in Congenital Nephrotic Syndrome. Mol Cell. 1(4): 575-82.

9. Lenkkeri U, Mannikko M, McCready P, et al. 1999. Structure of the gene for congenital nephrotic syndrome of the Finnish type (NPHS1) and characterization of mutations. American Journal of Human Genetics. 64(1):51-61.

10. Beltcheva O, Martin P, Lenkkeri U, et al. 2001. Mutation Spectrum in the Nephrin Gene (NPHS1) in Congenital Nephrotic Syndrome. Human Mutation. 17:368-373. PMID: 11317351

11. Lahdenkari AT, Kestila M, Holmber C, et al. 2004. Nephrin gene (NPHS1) in patients with minimal change nephrotic syndrome (MCNS). Kidney International. 65(5):1856-63. PMID: 15086927.

12. Philippe A, Nevo F, Esquivel EL, et al. 2008. Nephrin mutations can cause childhood - onset steroid-resistant nephrotic syndrome. J Am Soc Nephrology. 19(10):1871-1890. PMID: 18614772

13. Santin S, Garcia-Maset R, Ruiz P, et al. 2009. Nephrin mutations cause childhood - and adult-onset focal segmental glomerulosclerosis. Kidney International. 76(12): 1268-76. PMID: 19812541

47

14. Wu MC, Lee S, Cai T, et al. Rare-variant association testing for sequencing data with the sequence kernel association test. 2011. American Journal of Human Genetics. 89(1):82-93.

15. Divers J, Palmer ND, Lu L, et al. 2013. Gene-gene interactions in APOL1-associated nephropathy. Nephrology Dial Transplant. Epub ahead of print (Oct 24). PMID: 24157943

16. Cooke JN, Bostrom MA, Hicks PJ, et al. 2012. Polymorphisms in MYH9 are associated with diabetic nephropathy in European Americans. Nephrology, Dialysis, Transplant. 27(4):1505-11. PMID: 21968013

17. Divers J, Wagenknecht LE, Bowden DW, et al. 2010. Ethnic differences in the relationship between albuminuria and calcified atherosclerotic plaque: the African American - diabetes heart study. Diabetes Care. 33(1):131-8. PMID: 19825824

18. Keene KL, Mychaleckyj JC, Leak TS, et al. 2008. Exploration of the utility of ancestry informative markers for genetic association studies of African Americans with type 2 diabetes and end stage renal disease. Human Genetics. 124(2): 147-54.

19. Kraft P, Wacholder S, Cornelis MC, et al. 2009. Beyond odds ratios – communicating disease risk based on genetic profiles. Nature Reviews Genetics. 10(4): 264-9. PMID: 19238176

20. Blanco S, Bonet J, Lopez D, Casas I, Romero R. 2005. ACE inhibitors improve nephrin expression in Zucker rats with glomerulosclerosis. Kidney Int. 93:S10-14.

48

Table 1. Clinical Characteristics of African American Study Cohorts

Discovery T2D- Discovery Replication T2D- Replication T2D -only (GFR > Non-T2D ESKD 2 ESKD Controls ESKD Controls 60ml/min/1.73m ) T2D-GENES T2D-GENES T2D-ESKD GWAS T2D-ESKD GWAS Non-T2D ESKD T2D non-nephropathy Sample Source Cases Controls Cases Controls cases Controls n 529 535 1305 760 1705 503

Female (%) 61.2 57.3 60.7 57.9 43.7 58.7

Age (years) 61.6 ± 10.5 49.0 ± 11.9 61.3 ± 10.8 48.4 ± 12.7 54.6 ± 14.6 55.9 ± 9.5

Age at T2D (years) 41.6 ± 12.4 - 41.3 ± 12.4 - - 46.2 ± 10.3

Duration T2D prior to ESKD 17.6 ± 10.2 - 17.1 ± 10.7 - - - (years)

Duration of ESKD (years) 3.77 ± 3.8 - 3.66 ± 3.9 - 2.2 ± 1.64 -

Blood Glucose (mg/dl) - 88.8 ± 13.1 - 89.2 ± 13.6 88.6 ± 8.66 -

BMI (at recruitment; 29.7 ± 7.0 30.0 ± 7.0 30.3 ± 7.2 29.2 ± 7.4 27.2 ± 6.98 36.5 ± 18.4 kg/m2)

BUN (mg/dl) - 13.3 ± 5.4 - 13.3 ± 4.5 - -

Serum Creatinine (mg/dl) - 0.99 ± 0.25 - 1.03 ± 0.46 - 0.95 ± 0.2

Categoric data expressed as percentage; continuous data as mean ± SD.

49

Table 2. Rs35238405 (T233A) Single-SNP association testing (Dominant Model) with adjustment for admixture and APOL1 G1/G2 st tusᶧ

Study N Case / Control MAF Case / Control P OR 95% CI T2D-GENES (exome seq.) 528 / 514 0.014 / 0.0056 0.080 2.36 0.90, 6.16 T2D-ESKD Replication 1255 / 673 0.010 / 0.0022 0.018 4.30 1.29, 14.34 Combined T2D-ESKD 1783 / 1187 0.012 / 0.0038 0.0047 2.86 1.38, 5.94 non -T2D ESKD 1487 / 671 0.012 / 0.0022 0.016 4.48 1.33, 15.11

T2D -Only (vs controls) 480 / 1187 0.0073 / 0.0038 0.16 2.04 0.75, 5.53

All -cause ESKD 3270 / 1187 0.012 / 0.0038 0.0038 2.82 1.4, 5.68 MAF, minor allele frequency; P, P-value; OR, odds ratio; 95% CI, 95% confidence interval.

ᶧsample sizes reflect those with complete clinical and genotypic data used in each analysis

50

Table 3. NPHS1 variants nominally associated with ESKD in locus-wide analysis using single-SNP association testing, following adjustment for admixture and APOL1 G1/G2 st tus ᶧ

Study SNP N Case / Control MAF Case / Control P OR 95% CI T2D-ESKD Replication rs115489112 (Y1174H)* 1222 /642 0.0025 / 0.0078 0.022 0.29 0.10, 0.84 All-cause ESKD rs115489112 (Y1174H)* 2673 / 642 0.0039 / 0.0078 0.036 0.44 0.20, 0.95 rs146400394 (H800R)** 2757 / 642 0.0002 / 0.0022 0.0084 0.040 0, 0.44 * Additive model; **Dominant model

ᶧ sample sizes reflect those with complete clinical and genotypic data used in each analysis

51

Table 4. Locus-wide NPHS1 SKAT Analysis weighted for rare variants.

Model P Value Q cMAF (%) N SNPs 2.95x10-8 280417 0.29 16 Unadjusted 5.64x10-7 159005 0.28 15 Admixture 1.18x10-6 145754 0.28 15 Admixture + APOL1 G1/G2

Q, test statistic (similar to Chi-Square); cMAF, cumulative minor allele frequency for all SNPs analyzed.

52

Supplementary Data

Supplementary Table 1: NPHS1 exome sequencing results (missense variants, dominant model, admixture adjusted) Amino Genotype Genotype Position Acid Protein CI - P Count Count SNP CHR (hg19) Change Position OR CI - L95 U95 Value MAF Cases Controls rs35238405 19 36340467 THR,ALA 233/1241 2.89 1.041 8.028 0.04165 0.0096 0/15/506 0/5/513 var_19_36336287 19 36336287 TYR,CYS 638/1241 5.05 0.5858 43.54 0.1407 0.0029 0/5/516 0/1/517 rs4806213 19 36322601 ASN, SER 1077/1241 1.26 0.9256 1.72 0.1413 0.14 10/105/301 6/95/333 var_19_36340212 19 36340212 ARG,TRP 256/1241 4.27 0.4741 38.47 0.1955 0.0024 0/4/515 0/1/515 rs113825926 19 36340009 THR,ILE 294/1241 0.39 0.07452 2.006 0.2579 0.0034 0/2/519 0/5/513 var_19_36322018 19 36322018 ARG,CYS 1140/1241 1.62 0.7009 3.74 0.2593 0.012 0/15/506 0/9/509 rs116700257 19 36321778 ALA, THR 1188/1241 0.53 0.1764 1.596 0.2594 0.0067 0/5/516 0/9/509 rs35240811 19 36317544 PRO,SER 1200/1241 1.53 0.708 3.3 0.2799 0.013 0/17/504 0/11/507 rs33950747 19 36339247 ARG,GLN 408/1241 1.36 0.6431 2.882 0.4201 0.014 0/16/505 0/13/505 rs3814995 19 36342212 GLU,LYS 117/1241 1.15 0.8128 1.625 0.4312 0.077 2/78/441 5/67/445 var_19_36341302 19 36341302 SER,TYR 191/1241 0.48 0.04369 5.379 0.5554 0.0014 0/1/520 0/2/516 rs34736717 19 36330277 VAL,LEU 991/1241 0.92 0.6985 1.221 0.5771 0.14 9/121/391 9/127/382 rs34320609 19 36339295 LEU,PRO 392/1241 0.91 0.5864 1.401 0.6588 0.045 4/39/478 1/45/472 rs140626538 19 36342505 VAL,ALA 43/1241 1.19 0.5074 2.773 0.6935 0.011 0/12/509 0/10/507 rs114112112 19 36339558 MET,THR 382/1241 1.4 0.2315 8.399 0.7166 0.0024 0/3/518 0/2/516 rs34982899 19 36340187 PRO,ARG 264/1241 0.95 0.5292 1.71 0.8671 0.023 0/23/497 1/23/493 var_19_36341311 19 36341311 ASN, ILE 188/1241 1.14 0.1592 8.213 0.8939 0.0019 0/2/519 0/2/516 var_19_36326622 19 36326622 THR,ALA 1051/1241 1.17 0.07268 18.9 0.9109 0.00096 0/1/520 0/1/517 rs115489112 19 36321820 HIS,TYR 1174/1241 0.96 0.1916 4.779 0.9572 0.0029 0/3/518 0/3/515 var_19_36336581 19 36336581 GLU,LYS 583/1241 0.93 0.05797 14.93 0.9594 0.00097 0/1/517 0/1/517 var_19_36335305 19 36335305 GLU,LYS 663/1241 1 0.06242 16.09 0.9987 0.001 0/1/488 0/1/507 var_19_36336918 19 36336918 ALA,GLU 540/1241 6.46E-10 0 Inf 0.999 0.0016 0/0/307 0/2/325 var_19_36326643 19 36326643 GLU,LYS 1044/1241 7.80E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36332686 19 36332686 ALA,SER 916/1241 7.35E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36334400 19 36334400 PRO,THR 770/1241 1.40E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518

53

var_19_36335035 19 36335035 ALA,SER 728/1241 6.05E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36335107 19 36335107 LEU,VAL 704/1241 1.85E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518 var_19_36335326 19 36335326 VAL,LEU 656/1241 1.61E+09 0 Inf 0.9993 0.00056 0/1/433 0/0/460 var_19_36336411 19 36336411 ALA,PRO 597/1241 1.99E+09 0 Inf 0.9993 0.00048 0/1/517 0/0/517 var_19_36338980 19 36338980 ILE,THR 468/1241 5.77E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36339202 19 36339202 ALA,GLY 423/1241 6.10E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36339233 19 36339233 LEU,VAL 413/1241 6.40E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36339298 19 36339298 GLY,GLU 391/1241 1.45E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518 var_19_36339634 19 36339634 LEU,VAL 359/1241 1.53E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518 var_19_36339636 19 36339636 THR,ILE 358/1241 5.20E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36339923 19 36339923 VAL,MET 323/1241 1.64E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518 rs115308424 19 36340175 ARG,GLN 268/1241 1.51E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518 var_19_36340545 19 36340545 ARG,TRP 207/1241 2.00E+09 0 Inf 0.9993 0.00048 0/1/520 0/0/518 var_19_36342518 19 36342518 GLU,LYS 39/1241 7.80E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/517 var_19_36342539 19 36342539 ARG,TRP 32/1241 7.28E-10 0 Inf 0.9993 0.00048 0/0/521 0/1/516

54

Supplementary Table 2. NPHS1 Locus-Wide Analysis Amino Protein SNP Chr Position PolyPhen2 Acid Δ Pos. Alleles rs35238405 19 36340467 probably-damaging ALA,THR 233/1241 C/T rs4806213 19 36322601 probably-damaging SER,ASN 1077/1241 C/T rs141141839 19 36336287 probably-damaging CYS,TYR 638/1241 C/T C19:36340212 19 36340212 probably-damaging TRP,ARG 256/1241 A/G rs148104086 19 36342539 probably-damaging TRP,ARG 32/1241 A/G rs138173172 19 36332686 possibly-damaging SER,ALA 916/1241 A/C rs33950747 19 36339247 probably-damaging GLN,ARG 408/1241 T/C rs3814995 19 36342212 possibly-damaging LYS,GLU 117/1241 T/C rs115308424 19 36340175 possibly-damaging GLN,ARG 268/1241 T/C rs115489112 19 36321820 possibly-damaging TYR,HIS 1174/1241 A/G rs140673499 19 36326622 possibly-damaging ALA,THR 1051/1241 C/T rs147641617 19 36336581 probably-damaging LYS,GLU 583/1241 T/C rs112624813 19 36340499 probably-damaging LEU,PRO 222/1241 A/G rs115333628 19 36340506 possibly-damaging ALA,SER 220/1241 C/A rs139472106 19 36341989 possibly-damaging ALA,PRO 134/1241 C/G rs143649022 19 36332704 probably-damaging PRO,SER 910/1241 G/A rs143986233 19 36333098 probably-damaging HIS,ARG 864/1241 T/C rs144203682 19 36322556 probably-damaging HIS,ARG 1092/1241 T/C rs146400394 19 36333388 probably-damaging HIS,ARG 800/1241 T/C rs146858871 19 36322629 possibly-damaging THR,ALA 1068/1241 T/C rs149649169 19 36340211 probably-damaging GLN,ARG 256/1241 T/C rs150623032 19 36333180 possibly-damaging SER,ALA 837/1241 A/C rs151121915 19 36317420 probably-damaging ALA,VAL 1241/1241 G/A C19:36337051 19 36337051 probably-damaging SER,ARG 496/1241 T/G C19:36339181 19 36339181 possibly-damaging ARG,LYS 430/1241 C/T C19:36339287 19 36339287 probably-damaging SER,GLY 395/1241 T/C C19:36340454 19 36340454 probably-damaging GLN,LEU 237/1241 T/A C19:36342559 19 36342559 possibly-damaging VAL,ALA 25/1241 A/G rs34982899 19 36340187 possibly-damaging PRO,ARG 264/1241 G/C

55

rs116620503 19 36342451 N/A ALA,VAL 61/1241 A/G rs73928330 19 36342697 benign GLY,ARG 15/1241 C/G rs34320609 19 36339295 benign LEU,PRO 392/1241 A/G

56

Supplementary Table 3. T233A characteristics within all ESKD cases. African APOL1 G1/G2 (2 Age of Ancestry Sex (%) Age risk alleles) BMI FamHx ESKD (%) ESKD 0.83 ± 51.76 ± T233A (+) ESKD Cases 0.11 62.4 (F) 55.9 ± 14.4 0.33 ± 0.47 29.62 ± 7.36 38.6 14.81 53.47 ± T233A (-) ESKD Cases 0.8 ± 0.12 51.3 (F) 56.13 ± 14.3 0.25 ± 0.43 28.93 ± 7.24 28.9 14.26 P-Value (two tailed T-test) 0.020 0.043 0.89 0.090 0.40 0.16 0.30 Categoric data are expressed as percentage; continuous data as mean ± SD.

57

Supplementary Table 4. Reported rs35238405 (T233A) minor allele frequencies

Source MAF dbsnp 0.006 ESV (Af. American) 0.012 1000 Genomes All 0.006 AFR 0.024 LWK 0.041 YRI 0.017 ASW 0.008 EUR 0 AMR 0 CLM 0.008

58

Supplementary Table 5. Rs35238405 (T233A) Single-SNP association testing (Dominant Model) in T2D-only, non- nephropathy cases, adjusted for admixture only. Study N Case / Control MAF Case / Control P OR 95% CI T2D-Only (vs. controls) 502 / 1280 0.0070 / 0.0043 0.28 1.7 0.65, 4.41

59

Supplementary Table 6. Breakdown of variants identified in T2D-GENES exome sequencing “Discovery” study (genome wide)

Effect Frequency Proportion

Nonsynonymous Coding 237,989 0.34 p < 0.0001 4

0.0001 ≤ p < 0.001 23 0.001 ≤ p < 0.01 358

0.01 ≤ p < 0.05 1968 Start Gained 2,166 0.0031

Stop Gained 4,254 0.0060

Stop Lost 263 0.00040

60

Supplementary Table 7: NPHS1 variants included in SKAT analysis Chr SNP BP (hg19) Notes 19 rs151121915 36317420 - 19 rs115489112 36321820 - 19 rs144203682 36322556 monomorphic 19 rs4806213 36322601 - 19 rs146858871 36322629 - 19 rs140673499 36326622 - 19 rs138173172 36332686 - 19 rs143649022 36332704 - 19 rs150623032 36333180 monomorphic 19 rs146400394 36333388 - 19 rs147641617 36336581 - 19 19:36337051 36337051 - 19 19:36339181 36339181 monomorphic 19 19:36339287 36339287 - 19 rs34320609 36339295 - 19 rs149649169 36340211 monomorphic 19 rs35238405 36340467 - 19 rs112624813 36340499 monomorphic 19 rs139472106 36341989 monomorphic 19 rs3814995 36342212 - 19 rs116620503 36342451 monomorphic 19 rs148104086 36342539 monomorphic 19 19:36342559 36342559 -

61

Supplementary Table 8: CNV in NPHS1 Bowden ID ds1 NPHS1 CNV Affy Probe ID no CNV CN_769995 CN_769996 CN_769997 CN_769998 CN_769999 CN_770000 CN_770001 CN_770002 CN_770003 3686 2 2 2 2 2 2 2 2 2 2

1 Deletion NCBH-046 1 2 2 2 2 2 2 2 2 2 #N/A 1 2 2 2 2 2 2 2 2 2

1 Insertion DMS-137 2 3 3 3 3 3 3 3 3 3 608 2 3 3 3 3 3 3 3 3 3 1454 2 3 3 3 3 3 3 3 3 3

2 Insertions NCBH-046 1 2 2 2 2 2 2 2 2 2 RGF-1201 1 2 2 2 2 4 4 2 2 2 692 2 2 2 2 2 2 4 4 2 2 2138 2 2 2 2 4 4 4 2 2 2 DMS-2001 2 2 2 2 4 4 4 4 2 2 ds2= T2D-ESKD, ds1=control

62

Supplementary Figure 1. Potential mechanism of deleterious effects of T233A: disruption of β-pleated sheet in vital bridge region between 2nd and 3rd Ig-C2 like domains of nephrin.

63

Supplementary Figure 2. NPHS1 mRNA expression correlates with eGFR in patients with diabetic nephropathy.

Source: nephromine.org

64

Chapter 3

The ras responsive transcription factor RREB1 is a novel candidate gene for type 2 diabetes and nephropathy

Jason A. Bonomo1,2, Maggie C.Y. Ng2, Nicholette D. Palmer 2,3, Jianzhao Xu 2, Jacob M. Keaton 2, Pamela J. Hicks2, Janice P. Lea4, The T2D-GENES Consortium, Carl D. Langefeld 2,5, Barry I. Freedman 2,6, Donald W. Bowden 2,3,* Author Affiliations:

1Department of Molecular Medicine and Translational Science, Wake Forest School of Medicine 2Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine 3Department of Biochemistry, Wake Forest School of Medicine, 4Department of Internal Medicine – Nephrology, Emory University School of Medicine, Atlanta, Georgia, USA, 5Department of Biostatistical Sciences, Wake Forest School of Medicine, 6Department of Internal Medicine - Nephrology, Wake Forest School of Medicine, Winston Salem, North Carolina, USA.

JAB performed and designed the experiments, ran the statistical analyses, and prepared the manuscript. MCYN helped with interpretation of exome sequencing data. NDP served in an editorial capacity. JX aided in statistical analyses and JMK identified dsQTL locations. PJF helped perform experiments. JPL help ed recruit subjects. The T2D-GENES Consortium provided the exome sequencing data. CDL wrote the statistical software for single-SNP association testing. BIF recruited subjects and served in an advisory and editorial capacity. DWB provided methodological guidance, served in an advisory capacity, and helped edit the manuscript.

*Corresponding author: Donald W. Bowden Center for Genomics and Personalized Medicine Research1 1 Medical Center Blvd Winston Salem, NC 27157 phone: (336) 713-7507 fax: (336) 713-7566 email: [email protected]

An abbreviated version of this manuscript was submitted to Human Molecular Genetics in February of 2014.

65

Abstract Familial clustering and presumed genetic risk for type 2 diabetic (T2D) and non -diabetic end-stage kidney disease (ESKD) appear strong in populations of African ancestry. Examination of exome sequencing data from 529 African American T2D-ESKD cases and 535 non-diabetic non-nephropathy controls from the T2D-GENES collaboration identified two low frequency variants in the RREB1 gene, a repressor of the angiotensinogen (AGT) gene previously associated with kidney function: rs9379084 (p=0.0023, OR=0.28; D1171N) and rs41302867

(p=0.0021, OR=0.22; intronic splice site). These observations were replicated in independent samples of African Americans with T2D-ESKD (rs9379084 p=0.029 [OR=0.36] and rs41302867 p=0.011 [OR=0.50]), as well as European Americans with T2D-ESKD (rs9379084 p=8.3x10-5

[OR=0.52] and rs41302867 p=0.0064 [OR=0.69]). rs9379084 was not associated with non -T2D

ESKD or T2D per se (p=0.37 and p=0.19, respectively), but was associated with T2D in European

Americans (p=0.0068, OR=0.65). In African Americans, rs41302867 showed evidence of association with non-T2D ESKD and T2D (p=0.019; OR=0.57 and p=0.032; OR=0.57, respectfully), and rs41302867 was associated with “all-cause” ESKD in African Americans

(p=5.85x10-5, OR=0.49). A locus-wide analysis evaluating putatively functional intronic and missense SNPs revealed four additional variants associated with non -T2D ESKD, one associated with T2D in African Americans and four in European Americans, and one variant trending towards association with T2D-ESKD in African and European Americans. RREB1 is a large, complex gene which codes for a multidomain zinc finger binding protein and transcription factor. We propose that variants in RREB1 modulate seemingly d isparate phenotypes (T2D-

ESKD, non-T2D ESKD, and T2D) through potentially altered functions in splice variants and

SNP-induced dysregulation at transcription factor binding sites.

66

Introduction

Multiple lines of evidence support genetic influences on susceptibility to end-stage kidney disease (ESKD). This is particularly true in African Americans where ESKD incidence rates are 3.5 fold higher than in European Americans.1 After adjustment for socioeconomic status, incidence rates and familial aggregation of ESKD remain markedly higher among

African Americans compared to European Americans and other ethnic minorities. 2,3 The apolipoprotein L1 gene (APOL1) G1 and G2 alleles, present in those of West African descent, explain a substantial portion of the ethnic d isparity in HIV-associated collapsing glomerulopathy, idiopathic focal segmental glomerulosclerosis (FSGS), and hypertension - attributed ESKD.4,5 These variants account for a substantial portion of the ethnic difference in risk for non-diabetic forms of ESKD (non-T2D ESKD), but they fail to account for the excess risk of type 2 diabetic ESKD (T2D-ESKD) in African Americans.6 It is likely that other genetic loci contribute to non-T2D ESKD and especially T2D-ESKD risk in the African American population.7

Next-generation (massively parallel) exome sequencing (NGES) provides a powerful technology to comprehensively identify and test genetic variations in coding sequences of genes for disease association, facilitating detailed exploration of previously untested genetic regions.

We utilized NGES data to survey the RREB1 (ras-responsive element binding protein-1) gene, an upstream regulator of the Renin-Angiotensin System (RAS). Multiple splice variants of

RREB1, also known as Finb, have been shown to act as sequence-specific transcriptional repressors of the angiotensinogen (AGT) gene.8 A genome-wide association study (GWAS) published by the CHARGE consortium identified an association between an RREB1 variant and estimated glomerular filtration rate (eGFR), and an in tronic variant near the RREB1 locus has evidence of interaction with the APOL1 gene in African Americans with non-T2D ESKD.9,10

67

Given the a priori functional and genomics data, we hypothesized that genetic variations in

RREB1 may contribute to nephropathy susceptibility in the African American population.

68

Results

The impact of genetic variations in RREB1 modulating T2D-ESKD, non-T2D ESKD, and

T2D (in the absence of nephropathy) susceptibility in the African American population was investigated using a multistage study design. Initially RREB1 was analyzed in exome sequence data from 529 African American T2D-ESKD cases and 535 non-diabetic non-nephropathy controls (Discovery). While the Discovery sample provides a unique resource, individually it has limited power which results in, at best, nominal evidence of association. Thus, this was followed by testing in an independent T2D-ESKD case-control sample (Replication), followed by analysis in African American non-T2D-ESKD cases, and African American T2D-only cases

(i.e. T2D, without nephropathy). Finally RREB1 variants were tested in an independent

European American T2D-ESKD case-control sample. Characteristics of the African American samples, two independent samples with T2D-ESKD, non-T2D ESKD, T2D non-nephropathy, and population-based controls (non-T2D, non-nephropathy), are detailed in Table 1. Discovery

T2D-ESKD cases were broadly similar to those in the Replication T2D-ESKD study for all characteristics. The population-based controls for the Discovery and Replication studies are, on average, younger than the cases in the Discovery and Replication studies; however, the age of

T2D diagnosis in the cases is younger than the mean age of the population -based controls at enrollment. All cohorts except African American non-T2D ESKD cases had a larger percentage of females. The distributions of body mass index (BMI) were similar across cohorts, with the non-T2D ESKD cohort having the lowest mean BMI. As with African American cases, European

American T2D-ESKD and T2D-only cases tended to be older than controls, but the age at T2D diagnosis was younger than the mean age for controls (Table 2). BMIs across European

American cohorts were in the overweight to obese range, with the European American T2D - only cases having the highest BMI (Table 2).

69

NGES data of 529 African American T2D-ESKD cases and 535 African American non- diabetic non-nephropathy controls (Discovery) included in the T2D-GENES Consortium (t2d- genes.sph.umich.edu) was evaluated, focusing on the RREB1 gene. The initial T2D-GENES exome sequencing study identified two protective variants associated with T2D-ESKD under a dominant model: rs9379084, a D1771N missense variant (p=0.0023; OR=0.28[0.13, 0.64]), and rs41302867, an intronic variant near a splice site (p=0.0021; OR=0.22[0.08, 0.58]), following adjustment for admixture, age, gender, and APOL1 G1/ G2 allele status (“Model 1”) (Table 3).

Rs9379084 and rs41302867 are low frequency variants with minor allele frequencies (MAF) of

1.1% and 0.77% in cases, and 3.90% and 3.30% in controls, respectively (Table 3). The rs9379084 and rs41302867 variants were next genotyped in 1,305 independent African American T2D-

ESKD cases and 760 African American population-based controls (non-T2D, non-nephropathy controls; Replication) (Table 3). Both variants replicated association (rs41302867 p=0.016;

OR=0.52 [0.31, 0.89] and rs9379084 p=0.029; OR=0.56 [0.33, 0.94]) and consistent MAF were observed in cases and controls compared to the Discovery study under Model 2 (Model 1 +

BMI) (Table 3).

With consistent evidence of association of rs41302867 and rs9379084 with T2D-ESKD in

African Americans, we investigated whether these two RREB1 variants were associated with non-diabetic ESKD (non-T2D ESKD). Both variants were genotyped in a cohort of 1,705 African

American non-T2D ESKD cases. Rs41302867 was associated with non-T2D ESKD (p=0.019), remained protective (OR=0.52 [0.31, 0.89]), and similar MAF were observed under Model 2

(MAF=0.015 cases and 0.031 controls; Table 3). The rs9379084 variant was not associated with non-T2D ESKD (p=0.37; Table 3). Based on consistent association of rs41302867 with ESKD, we performed an African American “all-cause” ESKD analysis by pooling samples from the

Discovery T2D-ESKD cases and controls, Replication T2D-ESKD cases and controls, and non-

70

T2D ESKD cases. The African American all-cause ESKD analysis revealed p=5.85x10-5 for association with rs41302867 (OR 0.49 [0.35, 0.69]) under Model 2 (Table 3).

Evidence for T2D association in African Americans w as observed for rs41302867

(p=0.032) with a consistent OR (0.57 [0.34, 0.95]) and MAF (0.021 cases and 0.031 controls) under

Model 1 but not Model 2 (p=0.076) (Table 3). A weak trend for rs9379084 association with T2D was observed (p=0.16-0.19; Table 3).

As RREB1 variants appeared to modulate nephropathy and T2D susceptibility in

African Americans, we performed a locus-wide analysis by genotyping putatively functional

RREB1 variants in the African American Replication T2D-ESKD cohort, non-T2D ESKD cohort, and T2D-only, non-nephropathy samples (Supplementary Table 1). Of the 15 additional RREB1 variants selected for targeted genotyping, 3 failed quality control (Supplementary Table 1).

None of the additional variants were significantly associated with T2D-ESKD, although a trend was observed with rs9505090 under Model 1 (p=0.077, OR=1.38 [0.97, 1.98]; data not shown).

Four additional RREB1 variants were associated with non-T2D ESKD (Table 4A). Rs9505090, with a MAF 6.3% in cases and 4.2% in controls, was most strongly associated with non-T2D

ESKD (p=0.0081) and conferred risk (OR=1.57 [1.12, 2.24]) in Model 1 (Table 4A). A common variant, rs6919004, was associated with non-T2D ESKD (p=0.015) and conferred protection

(OR=0.78 [0.63, 0.95]; Table 4A). Rs116295722, an S1135I missense variant, was modestly associated with non-T2D ESKD under Model 1 (p=0.042; OR=0.55 [0.30, 1.00]) and trended towards significance in Model 2 (p=0.083; OR=0.59 [0.32, 1.07]; Table 4A). Additionally, rs77510523, a low frequency intronic variant (MAF 0.0056 cases and 0.016 controls), was modestly associated with non-T2D ESKD in Model 2 (p=0.042; OR=0.010 [0.01, 0.92]) (Table 4A).

One variant was associated with T2D in African Americans: rs9505090 (p=0.041-0.042); the OR 1.47 [1.02, 2.22] was similar to that in non-T2D ESKD cases, in addition to consistent

71

MAF (0.059 cases and 0.042 controls; Table 4B). A trend towards association of an S4N missense variant, rs116821447, with T2D was observed (p=0.079; OR=1.93 [0.93, 4.01]; data not shown).

We next genotyped 16 (of the 17 total) variants in 637 European Americans with T2D-

ESKD and 1020 population-based European American non-T2D, non-nephropathy controls; four failed quality control. Rs41302867 and rs9379084 replicated T2D-ESKD association

(p=0.0064; OR=0.69 [0.52, 0.90] and p=0.000083; OR=0.52 [0.40, 0.73], respectively) following age, gender, and BMI adjustment (Table 5A). A G1384R variant (rs2281833) was associated with

T2D-ESKD in European Americans (p=0.0064; OR=1.52[1.12, 2.05) followin g adjustment for age, gender, and BMI (Table 5A), but not in African Americans. In an unadjusted model, a trend towards significance was seen for rs9505090 with T2D-ESKD in European Americans (p=0.053;

OR=6.54[0.73, 58.6]; Table 5A).

These RREB1 variants were genotyped in 560 unrelated T2D non-nephropathy

European American cases and contrasted with European American population -based (non- diabetic non-nephropathy) controls. Following age, gender, and BMI adjustment, several variants associated with T2D were detected: rs9379084 (p=0.0068, OR=0.65 [0.48, 0.89]), and rs6919004 (p=0.016; OR=1.25 [1.04, 1.50]) (Table 5B). A trend was observed with rs1334576, a

G195R missense variant (p=0.053; Table 5B). Rs41302867 was associated with T2D in a model adjusted for age and gender (p=0.030; OR=0.75 [0.58, 0.97]) and in an unadjusted model rs9505090 trended towards association (p=0.098) (Table 5B).

Finally, we investigated the role of RREB1 CNV in T2D-ESKD (Supplementary Table 4).

Using Affymetric CNV probe IDs, CNV data was mined from our previously published T2D-

ESKD GWAS in African Americans (6). The exact role that CNV in RREB1 plays in modulating

T2D-ESKD is unclear given the extremely small numbers of samples that had evidence of CNV

(N Controls = 3, N Cases = 6). It appears that an insertion spanning approximately 70kb may be

72

associated with T2D-ESKD, and we identified 1 case of T2D-ESKD that had no CNV in RREB1.

One controls was identified that had 2 copies of the gene region Chr6: 7153721-7154299.

Interestingly, this 529bp region of RREB1 is highly regulated with extensive H3K4Me1 mono- methylation across 7 ENCODE cell lines (UCSC Genome Browser, data not shown). There is also evidence of this region being a strong enhancer and weak enhancer depending on

ENCONDE cell.

73

Discussion

We tested whether genetic variations in RREB1, a gene with strong a priori biological and genomic plausibility for association with nephropathy, modulated susceptibility to common forms of ESKD in the general African American and European American population. Results from these groups with different population ancestry were generally consistent, but complex.

Data were consistent with protection from T2D-ESKD in association of RREB1 in African

Americans and European Americans; albeit stronger association of RREB1 variants were observed in African Americans with non-T2D ESKD compared to T2D-ESKD. Similar results were observed in European Americans with T2D-ESKD; however, we do not have a corresponding DNA collection from European Americans with non-T2D-ESKD since etiologies of ESKD are likely more heterogeneous than African Americans, who often have APOL1- associated non-T2D ESKD. RREB1 variants were associated with T2D in African Americans and

European Americans, suggesting pleiotropic roles. None of the variants genotyped in this study were in linkage disequilibrium (LD) (r 2 < 0.30; data not shown) with the exception of rs41302867 and rs9379084 (r 2 = 0.63 for African Americans and r 2 = 0.62 for European Americans). These observations appear consistent given the complex structure and function of the RREB1 gene.

RREB1 is a large gene spanning 104 kB with at least 12 isoforms. Amino acid (AA) lengths for isoforms range from 4 AAs with 1 coding exon (RREB1-12) to 1742 AAs and 10 coding exons resulting in a protein product with 15 C2H2 zinc finger domains (RREB1-1). These longer isoforms of RREB1, previously termed “Finb” and including RREB1-1 (“Finb 188”), have been shown to repress expression of the AGT gene.8 Relevant to nephropathy susceptibility,

RREB1 polymorphisms reportedly interact with APOL1 and associate with kidney.9,10 Recent

GWAS implicate RREB1 with fat distribution and fasting glucose, effects potentially related to the T2D associations we observed.11, 36 Finally, while this manuscript was under review at HMG,

74

a manuscript was published in Nature Genetics describing a combined GWAS of >80,000 ethnically diverse samples that implicated a SNP in the RREB1-SSRI gene region as being associated with TD (p=1.4x10-9).39 These a priori data provide strong support for the pleiotrophic observations made in the present study.

RREB1 in African Americans w ith T2D-ESKD and non-T2D ESKD

In the initial African American T2D-GENES exome sequencing study (Discovery) two

RREB1 variants conferred protection from T2D-ESKD following adjustment for age, gender, admixture, APOL1, and BMI (Model 2): rs9379084, a D1171N missense variant (p=0.0023;

OR=0.28[0.13, 0.64]) and rs41302867, an intronic variant adjacent to a splice site (p=0.0021;

OR=0.22[0.08, 0.58]) (Table 2). Rs41302867 is found immediately downstream to an alternatively spliced exon in RREB1-1 (exon 7 of 10; RefSeq NM_001003699.3) and RREB1-3 (exon 7 of 9;

RefSeq NM_001003698.3). Both associations replicated in independent African American cases with T2D-ESKD and controls (Table 3). Rs41302867 was associated with non -T2D ESKD

(p=0.019; OR=0.56[0.35, 0.91]) in African Americans, whereas rs9379084 was not (p=0.37-0.55).

In a “combined ESKD” analysis, which pooled samples from the T2D-ESKD Discovery, T2D-

ESKD Replication, and non-T2D ESKD cohorts, strong evidence of rs41302867 association was observed after covariate adjustment (p=5.85x10-5; Model 2, Table 3). In that analysis we estimate

86% power to detect association at a MAF=0.014 with an OR of 0.49 at α=0.001 (CaTs Power

Calculator, University of Michigan; data not shown). Rs41302867 is found at greatest MAF amongst European popuations (MAF EUR=0.13 and MAF CEU=0.11; 1000 Genomes, data not shown) and is absent in native African populations (LWK, YRI). As such, this allele was likely introduced to African Americans through admixture (MAF ASW=0.033; Supplementary Table

2). The MAF are consistent with the data presented and serve as an additional layer of quality control (Supplementary Table 2).

75

We next performed a locus-wide analysis of 15 putatively functional variants in the

RREB1 gene region. Included in this analaysis were missense variants predicted to be damaging by PolyPhen2 and intronic variants located at splice sites, DNase1 hypersensivity cluster s, or transcription factor binding sites (TFBS) identified through RegulomeDB. 14 Through these analyses, rs9505090, a low frequency intronic variant with a high RegulomeDB score (1F), was implicated in non-T2D ESKD (p=0.0081; OR=1.57 [1.12, 2.24]) in African Americans; and a trend towards significance with T2D-ESKD in African Americans (p=0.077, OR=1.38 [0.97, 1.98]) was also observed. Two additional variants, rs6919004 and rs116295722, were modestly associated with non-T2D ESKD in African Americans (p=0.015, OR=0.78 [0.63, 0.95] and p=0.048, OR=0.55

[0.30, 1.00]), respectively, following adjustment (Model 1; Table 4A). Rs6919004 is a common intronic variant with a high RegulomeDB score (2b) located within a DNase1 hypersensitivity cluster and rs116295722 is a low frequency S1135I missense variant predicted to be “probably damaging” by PolyPhen2.

RREB1 in European Americans w ith T2D-ESKD

Both RREB1 variants from the initial T2D-GENES discovery study replicated association and direction of effect in a cohort of European Americans with T2D-ESKD compared to healthy non-diabetic, non-nephropathy European American population -based controls (Table 5A).

Rs9379084 (D1171N) was more strongly associated with T2D-ESKD in European Americans than rs41320867; the opposite was observed in African Americans. An additional variant in

European Americans associated with T2D-ESKD, an observation which did not extend to the

African American cohorts. The MAF discrepancy between African Americans and European

Americans played a large role in SNP selection; we hypothesized that alleles modulating nephropathy would be found at different frequencies between African -ancestral and European populations due to the major disparity in incidence of ESKD. Thus, while many of the variants

76

genotyped in African Americans are low frequency, the majority of SNPs are common variants in European Americans, affording us greater power for their detection. The missense variant rs2281833 (G1384R) was associated with T2D-ESKD in European Americans (p=0.0065; OR=1.52

[1.12, 2.05]; Table 5A) but not in African Americans (p=0.39) following adjustment for age, gender, and BMI (data not shown). Finally, we observed a trend towards association for rs9505090 with T2D-ESKD in European Americans (p=0.053) in an unadjusted model (greatest power). This variant was found at rare MAF in the T2D-ESKD cases (MAF = 0.0024) and was absent in our European American controls (Table 5A).

RREB1 in African Americans w ith T2D lacking nephropathy

We identified two variants modestly associated with T2D in African Americans lacking nephropathy. The T2D non-nephropathy cohort had an eGFR > 60mL/ min/ 1.73m 2, SCr <

1.5mg/ dL (men) and < 1.3mg/ dL (women), and UACR < 30mg/ g, demonstrating the quality of the phenotype. Rs41302867 was associated with T2D in African Americans under Model 1

(p=0.032; OR=0.57 [0.24, 0.95], Table 3), and rs9509050 was associated under Models 1 & 2

(p=0.041; OR=1.50 [1.02, 2.22]).

RREB1 in European Americans w ith T2D lacking nephropathy

The contributions of RREB1 variants to T2D in European Americans were examined. As in the T2D non-nephropathy African American cohort, the same clinical cutoffs for eGFR, SCr, and UACR were employed in European Americans. Rs41302867 was associated with T2D in

European Americans in a mod el adjusted for age and gender (p=0.030; OR=0.75 [0.58, 0.97])

(Table 5B). Rs9379084 (D1171N) was also associated with T2D in European Americans, but not in African Americans (p=0.0068, OR=0.65[0.48, 0.89]; Table 5B), though the strength of this association was two orders of magnitude less than that with T2D-ESKD in European

Americans. We cannot exlude the possibility that this was a spurious residual association with

77

D1171N due to the common frequencies of the allele and relatively high LD in European

Americans (r2=0.62). Rs6919004, which was associated with protection from non -T2D ESKD in

African Americans, was associated with T2D risk in European Americans (p=0.016; OR=1.25).

The minor allele of rs6919004 in African Americans is the major allele in European Americans, which may explain this discrepancy in direction of effect. Additionally, rs1334576, a “probably - damaging” (PolyPhen2) G195R missense variant, was associated with T2D in European

Americans in an unadjusted model (p=0.031) and trended towards significance following adjustment for age, gender, and BMI (p=0.053). Finally, a trend towards rs9505090 association was observed in an unadjusted model (p=0.098).

RREB1: making sense of pleiotropic roles

The majority of studies evaluating RREB1 focus on its role in oncology, as this transcription factor induces expression of and represses p16.15,16 The roles RREB1 plays in non-oncogenic pathologies are less well studied, as are isoform -specific actions of RREB1. Aside from a study which documented larger transcripts of RREB1 repressing the AGT gene, only one study examined isoform-specific actions of RREB1 in urologic cancer.8,34 Evidence for RREB1 association with T2D has not been previously established. Below et al. reported association with

T2D and SNP rs3099797; however, this SNP is likely in the NTM gene on chromosome 11 and not RREB1 (chromosome 6).13 The association of an RREB1 SNP with eGFR in a CHARGE consortium GWAS was reported in supplementary data.9

In addition to actively repressing AGT, a precursor component of the RAS, RREB1 has been shown to repress ZIP3/SLC39A3, a gene involved in zinc transport whose homolog

SLC30A8 has been associated with type 1 and type 2 diabetes.17-20 Moreover, RREB1 induces the expression of MT-IIA, a metallothioein produced extensively in the liver and kidneys which protects against oxidative stress,21 and secretin, which promotes osmoregularity, triggers insulin

78

release, and promotes the growth and maintenance of the pancreas. 22,23 RREB1 also potentiates

NeuroD1/ β2 activation, a gene implicated in Mature Onset Diabetes of the Young (MODY), and activates the nuclear androgen receptor (AR); a known target of AR activation is insulin - like growth factor-1 (IGF-1).23-25 Finally, GWAS have associated RREB1 with fat distribution and fasting glucose levels.11,36 Thus, variation in RREB1 may impact disparate phenotypes including predisposition to T2D and ESKD related to both diabetic and non -diabetic etiologies in multiple population ancestral groups. It’s also interesting to note that a DIAGRAGM constrium paper recently published identified the RREB1-SSR1 gene region as being highly associated with

T2D.39 That this consortium did not fine-map the location of the signal, coupled with findings from our data, are suggestive that the true T2D susceptibility may actually reside in the RREB1 gene region and that their signal was in high LD with RREB1 T2D variants.

The association of rs9505090 with African American non-T2D ESKD, T2D (with a trend in European American T2D), and a trend towards T2D-ESKD in African and European

Americans may be explained by its location. Rs9505090 sits in the most highly regulated region of RREB1, composed of an extensive DNase1 hypersensitivity cluster across all ENCODE cell lines with 40 matched known TFBS (Supplem entary Table 3, Supplementary Figure 1). The more notable TFBS include cyclic AMP-dependent transcription factor ATF-3 (ATF3), upstream stimulatory factor 1 (USF1), zinc finger and BTB domain-containing protein 7A (ZBTB7A), EA1 binding protein p300 (EP300), and transcriptional repressor CTCF (CTCF) (Supplementary

Table 3). ATF3 has been implicated in acute kidney injury, regulation of glucagon release, and is a regulator of stress-induced pancreatic β-cell apoptosis,26-29 USF1 has been described as a critical transcriptional factor regulating diabetic kidney disease in addition to modulating insulin induction,30,31 ZBTB7A represses CDKN2A (a known T2D gene),32 EP300 is a co-activator of

NeuroD1/ β2 dependent transcription of secretin,33 and finally, CTCF is a genetic insulator

79

involved in the repression of insulin-like growth factor 2 (IGF-2) which has been shown to interact with IGF2BP2, conferring protection against diabetic nephropathy.35 We posit that rs9505090 may disrupt transcription factor binding in this highly regulated region of RREB1, resulting in altered transcriptional activity. Rs9505090 is found exclusively in African -ancestral populations with the greatest MAF in West Africans (YRI=0.097, 1000 Genomes; data not shown), followed by East Africans and African Americans (MAF LWK=0.073 and MAF

ASW=0.049), and previously identified as being absent in European populations (MAF

EUR=0.00, 1000 Genomes). Moreover, Rs9505090 has been reported as a DNase1 sensitivity quantitative trait loci (dsQTL) in lymphoblastoid cell lines derived from in dividuals of African descent.37 Additionally, CHIP-seq evidence from ENCODE indicates rs9505090 falls within a binding site for the transcription factor CTCF in H epG2 cells and that the minor allele (G; risk allele) may result in aberrant regulation of RREB1 gene expression.38 Our data reveal the transfer of high impact, low frequency risk alleles continues to occur across population ancestries.

Rs6919004 is a common variant associated with non-T2D ESKD and conferring protection in African Americans, and associated with T2D in European Americans (Tables 3A,

5B). Like rs9505090, rs6919004 sits in a DNase1 hypersensivity cluster with a number of TFBS including USF1 and ATF3. As stated, the minor allele for rs6919004 in African -ancestral populations (C) is the major allele in populations of European ancestry.

The S1135I missense variant rs116295722 was modestly associated with non -T2D ESKD under Model 1 (Table 3A). The S1135I variant is believed to be phosphorylated (UniProt.org; data not shown) and conversion of this serine to isoleucine may reduce non -T2D ESKD susceptibility in African Americans through altered RREB1 activity. The rs77510523 variant, modestly associated with non-T2D ESKD in Model 2, is another intronic variant found within a

80

DNase1 hypersensitivity region with extensive histone methylation in liver cell lines (ENCODE; data not shown). TFBS for V-Myb Myeloblastosis Viral Oncogene Homolog (Avian)-Like 2

(MYBL2), Retinoid X Receptor Alpha (RXRA), and (DNA-directed RNA polymerase II subunit

RPB1) POLR2A are found in the same region as rs77510523 (data not shown). Rs116295722 is found exclusively in African-ancestral populations, whereas rs77510523 is found p rimarily in ancestral European populations (MAF EUR=0.18) and is absent in native African populations; this allele was likely introduced to African Americans through admixture (MAF ASW=0.049).

In addition to establishing potentially novel roles for RREB1 in T2D, T2D-ESKD, and non-T2D ESKD, this study provides genetic insights into the disproportionate burden of T2D and ESKD borne by the African American community. That the protective alleles we identified

(rs41302867, rs77510523, rs6919004) are found at higher MAF in European-ancestral populations and have the same direction of effect in European American populations, coupled with the most significant risk allele (rs9505090) being an African-ancestral allele with evidence of limited introduction into European American populations, may provide a novel genetics perspective on related health care disparities.

This study demonstrates how a single gene with numerous splice variants and multiple regulatory elements may influence seemingly unrelated pathologies including T2D, T2D-ESKD,

ESKD, and malignancy. Nearly a third of the TFBSs associated with rs9505090 relate to oncogenesis, consistent with previously implicated gene function (Supplementary Table 3). The ability of one gene to modulate a range of clinically relevant phenotypes warrants further investigation into pharmacologic targets which can exploit these pleiotropic properties. This study demonstrates the effectiveness of NGES coupled with pragmatic, targeted genotyping utilizing bioinformatic tools and a priori knowledge of the pathology (eg, prioritizing variants

81

based on allelic discrepancies between disproportionately affected populations) to aid in genotyping of clinically relevant genetic variants.

82

Methods

Study Subjects

Recruitment and sample collection procedures have been reported.6,16 The study was approved by the Institutional Review Board at Wake Forest School of Medicine (WFSM). All participants were unrelated, born in North Carolina, South Carolina, Georgia, Tennessee, or

Virginia and provided written informed consent. DNA extraction was performed using the

PureGene system (Gentra Systems, Minneapolis, MN, USA). African Americans with ESKD were recruited from dialysis facilities. Type 2 diabetes (T2D) was diagnosed in those developing diabetes after age 25 years, without historical evidence of diabetic ketoacidosis or receiving solely insulin therapy since diagnosis. T2D-ESKD was diagnosed after >5 year diabetes duration prior to renal replacement therapy, or with diabetic retinopathy or ≥ 100 mg/ dl proteinuria on urinalysis (when available), in the absence of other causes of nephropathy.

Unrelated African American controls without diabetes or renal disease (based on a serum creatinine concentration <1.5 [men] or <1.3 mg/ dl [women]) were recruited from the community and internal medicine clinics at WFSM. Ethnicity was self-reported and confirmed by genotyping with ancestry informative markers. The mean African ancestry of the samples was 79.85% ± 11.54 for cases and 77.7% ± 10.96 for controls. Recruitment and sample collection for European Americans with T2D-ESKD was performed as above controls were recruited from medicine clinics at WFSM (non-nephropathy, non-T2D population based controls).

African American non-diabetic (non-T2D) ESKD cases

African American non-T2D ESKD cases lacked diabetes (or diabetes developed after initiating renal replacement therapy). ESKD was typically attributed to chronic glomerular disease (e.g., FSGS), HIV-associated nephropathy, “hypertension-attributed”, or unknown cause. Patients with ESKD due to urologic/ surgical cause, polycystic kidney disease or IgA

83

nephropathy were excluded. The mean African ancestry of the non -T2D ESKD samples was

80.01% ± 10.96.

African American T2D (non-nephropathy) samples

African Americans with T2D but lacking nephropathy were recruited from a previously published African American T2D GWAS study,12 as well as internal medicine clinics at WFSM.

Diabetic controls were receiving insulin or oral agents, had an HbA1C >6.5% or a fasting plasma glucose >126 mg/ dl, and serum creatinine concentration <1.5 mg/ dl (men) or 1.3 mg/ dl

(women). All T2D-only non-nephropathy controls in this study had an eGFR

>60ml/ min/ 1.73m 2 and a urine albumin:creatinine ratio (UACR) <30 mg/ g.

European American T2D (non-nephropathy) samples

European Americans with T2D but lacking nephropathy were recruited from the

Diabetes Heart Study (DHS) at WFSM. These diabetic controls were receiving insulin or oral agents, had an HbA1C >6.5% or a fasting plasma glucose >126 mg/ dl, and serum creatinine concentration <1.5 mg/ dl (men) or 1.3 mg/ dl (women). All T2D-only non-nephropathy controls in this study had an eGFR >60ml/ min/ 1.73m 2 and a UACR <60 mg/ g.

Sample Preparation, Genotyping, and Quality Control

African American T2D-ESKD T2D-GENES Discovery cases and controls

NGES was performed under the auspices of the T2D-GENES consortium (https:/ / t2d- genes.sph.umich.edu/ ) utilizing an Agilent V2 capture array platform (Agilent Technologies,

Santa Clara, CA) at the Broad Institute (Cambridge, MA). Data underwent multiple levels of quality control (QC) prior to release including DNA QC, tests for cryptic relatedness, removal of artifact sites, tests for heterozygosity, and confirmation of variant calling through comparison with prior GWAS data (>90% concordance rates). T2D-ESKD cases and non-diabetic non-

84

nephropathy controls were selected from a previously published African American T2D -ESKD

GWAS.6

Targeted Genotyping

Targeted genotyping of RREB1 variants rs9379084 and rs41302867 were performed across all cohorts utilizing the Sequenom MassArray system (Sequenom, San Diego, CA) in the

Center for Genomics and Personalized Medicine Research at WFSM. SNPs were PCR-amplified using primers designed in MassARRAY Assay Design 3.1 (Sequenom, San Diego, CA) and genotypes were analyzed using MassARRAY Typer (Sequenom). Call rates >97% were achieved with the exception of rs77510523 which achieved >98% call rates in cases but 95% call rate in controls; quality control was ensured using blind duplicates within each cohort of samples

(100% concordance rate).

Locus-wide analysis

The 1000 Genomes Project (ASW, YRI, LWK), Exome Variant Server (NHLBI), and

RegulomeDB (http:/ / regulome.stanford.edu/ ) were mined for additional rare (<0.5%), low

(MAF 0.5-5%), and common (MAF >5%) frequency RREB1 missense and functional intronic variants (based on Regulome-DB score) beyond those found in T2D-GENES. Variants for analysis in this study were chosen based on allele enrichment in African versus European ancestral populations, type of mutation (missense and functional intronic), and PolyPhen2 prediction (http:/ / genetics.bwh.harvard.edu/ pph/ data/ ). In total, an additional 13 RREB1 variants were selected for targeted genotyping in T2D-ESKD, non-T2D ESKD, and population- based control cohorts. Of the 12 variants, 3 failed quality control.

Statistical Analysis

85

Single SNP Association Testing

Each SNP was tested for departure from Hardy-Weinberg Equilibrium (HWE) expectations through a chi square goodness of fit test (HWE > 0.01 cases and HWE > 0.05 controls). The overall genotypic test of association and the two genetic models (dominant and additive) were computed to test for association between each SNP and each phenotype. Data for all tests of association were adjusted for admixture,18 APOL1 G1/ G2 risk allele status (assuming a recessive model of disease risk),4 age, and gender (Model 1) or Model 1 + BMI (Model 2), unless noted (ie, in the T2D-only, non-nephropathy analysis). These tests were computed using the SNPGWA program

(http:/ / www.phs.wfubmc.edu/ public_bios/ sec_gene/ downloads.cfm).

86

References (Chapter 3)

1. USRDS 2010. USRDS Annual Data Report: Volume Two, Chapter 11. 367-382. Accessed 5/ 29/ 12.

2. Spray BJ, Atassi NG, Tuttle AB, Freedman BI. 1995. Familial risk, age at onset, and cause of end-stage renal disease in white Americans. J Am Soc Nephrol. 5:1806-1810. PMID: 7787148

3. Freedman BI, Tuttle AB, Spray BJ. Familial predisposition to nephropathy in African- Americans with non-insulin-dependent diabetes mellitus. Am J Kidney Dis. 1995;25:710- 713.

4. Genovese G, Friedman DJ, Ross MD, Lecordier L, Uzureau P, Freedman BI, Bowden DW, Langefeld CD, Olesyk TK, Uscinski Knob AL, Bernhardy AJ, Hicks PJ, Nelson GW, Vanhollebeke B, Winkler CA, Kopp JB, Pays E, Pollak MR. 2010. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science. 329(5993): 841-5.

5. Kopp JB, Nelson GW, Sampath K, Johnson RC, Genovese G, An P, Friedman D, Briggs W, Dart R, Korbet S, Mokrzycki MH, Kimmel PL, Limou S, Ahuja TS, Berns JS, Fryc J, Simon EE, Smith MC, Trachtman H, Michel DM, Schelling JR, Vlahov D, Pollak M, Winkler CA. 2011. APOL1 genetic variants in focal segmental glomerulosclerosis and HIV-associated nephropathy. J Am Soc Nephrol. 22(111):2129-37. PMID: 21997394

6. McDonough CW, Palmer ND, Hicks PJ, Roh BH, An SS, Cooke JN, Hester JM, Wing MR, Bostrom MA, Rudock ME, Lewis JP, Talbert ME, Blevins RA, Lu L, Ng MCY, Sale MM, Divers J, Langefeld CD, Freedman BI, Bowden DW. 2011. A Genome Wide Association Study for Diabetic Nephropathy Genes in African Americans. Kidney Int. 79(5): 563-572.

7. Palmer ND, Freedman BI. 2012. Insights into the Genetic Architecture of Diabetic Nephropathy. Current Diabetes Reports. 12(4):423-31. PMID: 22573336

8. Date S, Nibu Y, Yanai K, Hirata J, Yagami K, Fukamizu A. 2004. Finb, a multiple zinc finger protein, represses transcription of the human angiotensinogen gene. Int J Mol Med. 13(5): 637-42.

9. Yang Q, Kottgen A, Dehghan A, Smith AV, Glazer NL, Chen MH, et al. 2010. Multiple genetic loci influence serum urate levels and their relationship with gout and cardiovascular disease risk factors. Circ Cardiovasc Genet. 3(6): 523-30

10. Bostrom MA, Kao WHL, Li Man, Abboud HE, Adler SG, Iyengar SK, Kimmel PL, et al. 2012. Genetic Association and Gene-Gene interaction Analyses in African American Dialysis Patients with Nondiabetic Nephropathy. American Journal of Kidney Diseases. 59(2): 210-221

87

11. Liu CT, Monda KL, Taylor KC, Lange L, Demerath EW, Palmas W, Wojczynski MK, et al. 2013. Genome-Wide Association of Body Fat Distribution in African Ancestry Populations Suggests New Loci. PLoS Genetics. 9(8): e1003681

12. Palmer ND, McDonough CW, Hicks PJ, Roh BH, Wing MR, An SS, Hester KM, et al. 2012. A genome-wide association search for type 2 diabetes genes in African Americans. PLoS One. 7(1):e29202

13. Below JE, Gamazon ER, Morrison JV, Konkashbaev A, Pluzhnikov A, McKeigue PM, et al. 2011. Genome-wide association and meta-analysis in populations from Starr County, Texas, and Mexico City identify type 2 diabetes susceptibility loci and enrichment for expression quantitative trait loci in top signals. Diabetologia. 54(8): 2047-55.

14. Boyle AP, Hong EL, Hariharan M, Cheng Y, Schaub MA, Kasowski M, Karczewski KJ, Park J, Hitz BC, Weng S, Cherry JM, Snyder M. 2012. Annotation of functional variation in personal genomes using RegulomeDB. Genome Research. 22(9):1790-1797. PMID: 22955989.

15. Liu H, Hew HC, Lu ZG, Yamaguchi T, Miki Y, Yoshida K. 2009. DNA damage signalling recruits RREB-1 to the p53 tumour suppressor promoter. Biochem J. 422:543-551.

16. Zhang S, Qian X, Hedman C, Bilskovski V, Hamsay ES, Lowy DR, Mock BA. 2003. p16| NK4a gene promoter variation and differential binding of a repressor, the ras- responsive zinc-finger transcription factor, RREB. 22:2285-2295.

17. Wenzlau JM, Juhl K, Yu L, Moua O, Sarkar SA, Gottlieb P, Rewers M, et al. 2007. The cation efflux transporter ZnT8 (Slc30A8) is a major autoantigen in human type 1 diabetes. PNAS. 104(43): 17040-17045.

18. Ng MCY, Park KS, Oh B, Tam CHT, Cho YM, Shin HD, Lam VKL, Ma RCW, et al. 2008. Implication of Genetetic Variants Near TCF7L2, SLC30A8, HHEX, CDKAL1, CDK2A/B, IGF2BP2, and FTO in Type 2 Diabetes and Obesity in 6,719 Asians. Diabetes. 57(8): 2226- 2233.

19. Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Durean W, Erdos MR, Stringham HM, et al. 2007. A Genome-Wide Association Study of Type 2 Diabetes in Finns Detects Multiple Susceptibility Variants. Science. 316(5829): 1341-1345.

20. Milon BC, Agyapong A, Bautista R, Costello LC, Franklin RB. 2010. Ras responsive element binding protein-1 (RREB1) down-regulates hZIP1 expression in prostate cancer cells. Prostate. 70:288-296.

21. Fujimoto-Nishiyama A, Ishii S, Matsuda S, Inoue J, Yamamoto I. 1997. A novel sinze finger protein, FinB, is a transcriptional activetor and localized in nuclear bodies. Gene. 195: 267-275.

88

22. Dembinski AB, Johnson LR. 1980. Stimulation of Pancreatic Growth by Secretin, Caerulein, and Pentagastrin. Endocrinology. 106(1): 323-328.

23. Ray SK, Nishitani J, Petry MW, Fessing MY, Leiter AB. 2003. Novel transcriptional potentiation of BETA2/ NeuroD on the secretin gene promoter by the DNA bindin g protein Finb/ RREB-1. Mol Cell Biol. 23: 259-271.

24. Pandini G, Mineo R, Frasca F, Roberts CT Jr, Marcelli M, Vigneri R, Belfiore A. 2005.Androgens up-regulate the insulin-like growth factor-I receptor in prostate cancer cells. Cancer Res. 65(5): 1849-57.

25. Mukhopadhyay NK, Cinar B, Mukhopadhyay L, Lutchman M, Ferdinand AS, Kim J, Chung LW, et al. 2007. The zinc finger protein ras-responsive element binding protein-1 is a coregulator of the androgen receptor: implications for the role of the Ras pathw ay in enhancing androgenic signalling in prostate cancer. Mol Endocrinol. 22: 2285-2295.

26. Yoshida T, Sugiura H, Mitobe M, Tsuchiya K, Shirota S, Nishimura S, et al. 2008. ATF3 protects against renal ischemia-reperfusion injury. JASN. 19(2): 217-24.

27. Li HF, Cheng CF, Liao WJ, Lin H, Yang RB. 2010. ATF3-mediated epigenetic regulation protects against acute kidney injury. JASN. 21(6): 1003-13.

28. Lee YS, Kobayashi M, Kikuchi O, Sasaki T, Yokota-Hasimoto H, Sustanti VY, Ido Kitamura Y, Kitamura T. 2010. ATF3 expression is induced by low glucose in pancreatic α and β cells and regulates glucagon but not insulin gene transcription. Endocr J. Epub ahead of print. PMID: 24140652.

29. Hartman MG, Lu D, Kim ML, Kociba GJ, Shukri T, Buteau J, Wan X, Frankel WL, Guttridge D, et al. 2004. Role for Activating Transcription Factor 3 in Stress-Induced β- Cell Apoptosis. Mol Cell Biol. 24(13): 5271-5732.

30. Sanchez AP, Zhao J, You Y, Decleves AE, Diamond -Stanic M, Sharma K. 2011. Role of the USF1 transcription factor in diabetic kidney disease. Am J Physiol Renal Physiol. 30(2): F271-9.

31. Naukkarinen J, Nilson E, Koistinen HA, Soderlund S, Lyssenko V, Vaag A, Poulsen P, Groop L, Taskinen MR, Peltonen L. 2009. Functional variant disrupts insulin induction of USF1: mechanism for USF1-associated dyslipidemias. Circ Cardiovasc Genet. 2(5): 522- 9.

89

32. Yang X, Zu X, Tang J, Xiong W, Zhang Y, Liu F, Jiang Y. 2012. Zbtb7 suppresses the expression of CDK2 and E2F4 in liver cancer cells: implications for the role of Zbtb7 in cell cycle regulation. Mold Med Rep. 5(6): 1475-80.

33. Kim JY, Chu K, Kim HJ, Seong HA, Park KC, Sanyal S, Takeda J, Ha H, Shong M, Tsai MJ, Choi HS. 2004. Orphan nuclear receptor small heterodimer partner, a novel corepressor for a basic helix-loop-helix transcription factor BETA2/ neuroD. Mol Endocrinol. 18(4):776-90.

34. Nitz MD, Harding MA, Smith SC, Thomas S, Theodorescu D. 2011. RREB1 Transcription Factor Splice Variants in Urologic Cancer. Tumorigenesis and Neoplastic Progression. 179(1): 477-486.

35. Gu T, Horova E, Mollsten A, Seman NA, Falhammar H, Prazny M, Brismar K, Gu HF. 2012. IGF2BP2 and IGF2 genetic effects in diabetes and diabetic nephropathy. J Diabetes Complications. 26(5): 393-398.

36. Scott RA, Lagou V, Welch RP, Wheeler W, Montasser ME, Luan J, Magi R, Strawbridge RJ, et al. 2012. Large-scale association analyses identify new loci influencing glycemic traits and provide insight into the underlying biological pathways. Nature Genetics. 44: 991-1005.

37. Degner JF, Pai AA, Pique-Regi R, Veyrieras JB, et al. 2012. DNaseI sensitivity QTLs are a major determinant of human expression variation. Nature. 482(7385): 390–394.

38. ENCODE Project Consortium. 2012. An integrated encyclopedia of DNA elements in the human genome. Nature. 489 (7414): 57–74.

39. DIAGRAM Consortium et al. 2014. Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility. Nat Genet. Epub ahead of print. doi: doi:10.1038/ ng.2897.

90

Acknowledgements: This work was supported by NIH grants NIDDK F30 DK098836 (JAB),

NIDDK RO1 DK53591 (DWB), NIDDK RO1 DK070941 (BIF), and NIDDK RO1 DK071891 (BIF).

The T2D-GENES Consortium is supported by NIDDK U01’s DK085501, DK085524, DK085526,

DK085545 and DK085584.

Conflicts of Interest/Disclosures: The authors report no competing financial interests.

91

Table 1. Clinical Characteristics of African American Study Cohorts

T2D -only (UACR < 30mg/g Discovery Discovery Replication Replication Non-T2D and eGFR > T2D-ESKD Controls T2D-ESKD Controls ESKD 60ml/min/1.73m2) T2D-GENES T2D-GENES T2D-ESKD T2D-ESKD Non-T2D T2D non-nephropathy Sample Source Cases Controls GWAS Cases GWAS Controls ESKD cases Controls N 529 535 1305 760 1705 850 Female (%) 61.2 57.3 60.7 57.9 43.7 58.7

Age (years) 61.6 ± 10.5 49.0 ± 11.9 61.3 ± 10.8 48.4 ± 12.7 54.6 ± 14.6 55.9 ± 9.5

Age at T2D (years) 41.6 ± 12.4 - 41.3 ± 12.4 - - 46.2 ± 10.3

Duration T2D prior to ESKD (years) 17.6 ± 10.2 - 17.1 ± 10.7 - - -

Duration of ESKD (years) 3.77 ± 3.8 - 3.66 ± 3.9 - 2.2 ± 1.64 -

Blood Glucose (mg/dl) - 88.8 ± 13.1 - 89.2 ± 13.6 88.6 ± 8.66 -

BMI (at recruitment; kg/m2) 29.7 ± 7.0 30.0 ± 7.0 30.3 ± 7.2 29.2 ± 7.4 27.2 ± 6.98 36.5 ± 18.4

BUN (mg/dl) - 13.3 ± 5.4 - 13.3 ± 4.5 - -

Serum Creatinine (mg/dl) - 0.99 ± 0.25 - 1.03 ± 0.46 - 0.95 ± 0.2 Categoric data expressed as percentage; continuous data as mean ± SD.

92

Table 2. Clinical Characteristics of European American Study Cohorts European American European American T2D -only (UACR < 60mg/g and T2D-ESKD cases Controls eGFR > 60ml/min/1.73m2) N 637 1020 560 Female (%) 49.5 63.5 54.4

Age (years) 65.3 ± 10.4 53.9 ± 15.1 63.1 ± 9.0

Age at T2D (years) 45.4 ± 13.6 - 51.2 ± 10.4

Duration T2D prior to ESKD (years) 20.1 ± 10.1 - -

Duration of ESKD (years) 2.60 ± 3.0 - -

Blood Glucose (mg/dl) - 93.1 ± 16.9 -

BMI (at recruitment; kg/m2) 29.6 ± 7.1 28.4 ± 5.7 32.6 ± 7.1

BUN (mg/dl) - 15.1 ± 5.1 -

Serum Creatinine (mg/dl) - 0.96 ± 0.50 1.03± 0.29

93

Table 3. RREB1 variants identified through African American T2D-ESKD Exome Sequencing Study and African American Replication Analyses (Dominant Model)ᶧ

Model 1 Model 2

N Case / MAF Case / Study SNP Control Control P OR 95% CI P OR 95% CI Discovery rs9379084 525 / 510 0.011 / 0.039 0.0023 0.28 0.13, 0.64 0.0031 0.29 0.13, 0.66 T2D-ESKD rs41302867 521 / 507 0.0077 / 0.033 0.0021 0.22 0.08, 0.58 0.0028 0.23 0.09, 0.60

Replication rs9379084 1220 / 605 0.019 / 0.029 0.072 0.6 2 0.36, 1.05 0.029 0.56 0.33, 0.94 T2D-ESKD rs41302867 1238 / 611 0.016 / 0.031 0.016 0.52 0.31, 0.89 0.011 0.50 0.29, 0.85 non-T2D, rs9379084 1424 / 599 0.020 / 0.028 0.37 0.81 0.52, 1.27 0.55 0.87 0.54, 1.39 ESKD-only Cases rs41302867 1427 / 611 0.015 / 0.031 0.019 0.56 0.35, 0.91 0.019 0.57 0.36, 0.91 Combined, “all cause” ESKD rs41302867 3186 / 1118 0.014 / 0.032 7.69x10-5 0.50 0.35, 0.70 5.85x10-5 0.49 0.35, 0.69 T2D-only vs. population rs9379084 809 / 662 0.024 / 0.029 0.16 0.69 0.42, 1.15 0.19 0.70 0.42, 1.19 based controls* rs41302867* 828 / 675 0.021 / 0.031 0.032 0.57 0.24, 0.95 0.076 0.62 0.37, 1.05 ᶧ sample sizes reflect those with complete clinical and genotypic data used in each analysis Model 1 covariates: age, gender, APOL1 G1/G2 status, admixture Model 2 covariates: Model 1 + BMI *No APOL1 G1/G2 adjustment in T2D-only analysis

94

Table 4A. RREB1 African American non-T2D ESKD Locus-Wide Study (Dominant Model, unless noted)ᶧ Model 1 Model 2 SNP N Case/ Control MAF Case / Control P OR 95% CI P OR 95% CI rs6919004 1424 / 650 0.30 / 0.33 0.015 0.78 0.63, 0.95 0.022 0.78 0.64, 0.97 rs9505090 1425 / 651 0.063 / 0.042 0.0081 1.57 1.12, 2.24 0.018 1.53 1.08, 2.17 rs116295722 1429 / 657 0.0056 / 0.016 0.048 0.55 0.30, 1.00 0.083 0.59 0.32, 1.07 rs77510523* 1427 / 562** 0.026 / 0.036 0.064 0.12 0.10, 1.13 0.042 0.1 0.01, 0.92

Table 4B. RREB1 African American T2D-only, non-nephropathy Locus- Wide Study (Dominant Model) ᶧ ☼ Model 1 Model 2 SNP N Case/ Control MAF Case / Control P OR 95% CI P OR 95% CI rs9505090 827 / 675 0.059 / 0.042 0.042 1.47 1.01, 2.13 0.041 1.5 1.02, 2.22 rs41302867 828 / 675 0.021 / 0.031 0.032 0.57 0.34, 0.95 0.076 0.62 0.37, 1.05 ᶧ sample sizes reflect those with complete clinical and genotypic data used in each analysis *Recessive Model; **smaller control sample size reflects a call rate efficiency of 95% for controls (vs >98%) ☼ Model 1 & 2 are not adjusted for APOL1 G1/G2 in T2D-only analysis Model 1 covariates: age, gender, APOL1 G1/G2 , admixture Model 2 covariates: Model 1 + BMI

95

Table 5A. RREB1 EA T2D-ESKD Locus-Wide Study (Additive Model, Unless noted) no covariates age, gender age, gender, BMI N Case / MAF Case / N Case / SNP Control Control P OR 95% CI Control P OR 95% CI P OR 95% CI rs2281833 600 / 966 0.14 / 0.11 0.0052 1.35 1.09, 1.67 571 / 686 0.00095 1.54 1.19, 1.99 0.0065 1.52 1.12, 2.05 rs9505090* 616 / 1002 0.0024 / 0.00 0.053 6.54 0.73, 58.62 586 / 711 ------rs9379084 600 / 983 0.097 / 0.14 0.00018 0.65 0.62, 0.81 572 / 693 0.000075 0.58 0.44, 0.76 0.000083* 0.52 0.40, 0.73 rs41302867 619 / 984 0.11 / 0.13 0.022 0.77 0.62, 0.96 589 / 697 0.0073 0.70 0.53, 0.91 0.0064 0.69 0.52, 0.90 *Dominant model

Table 5B. RREB1 EA T2D-only Locus-Wide Study (Additive Model, Unless noted) no covariates age, gender age, gender, BMI N Case/ MAF Case / N Case / SNP Control Control P OR 95% CI Control P OR 95% CI P OR 95% CI rs1334576 549 / 995 0.43 / 0.39 0.031 1.18 1.01, 1.36 542 / 703 0.089 1.16 0.98, 1.37 0.053 1.2 1, 1.44 rs9505090* 551 / 1002 0.0016 / 0.00 0.098 5.47 0.57, 52.70 544 / 711 ------rs9379084 542 / 983 0.098 / 0.14 0.001 0.68 0.54, 0.86 535 / 693 0.0023* 0.62 0.46, 0.83 0.0068* 0.65 0.48, 0.89 rs41302867 548 / 984 0.11 / 0.13 0.020 0.76 0.61, 0.96 541 / 697 0.030 0.75 0.58, 0.97 0.073 0.78 0.59, 1.02 rs6919004 551 / 997 0.47 / 0.43 0.034 1.18 1.01, 1.32 544 / 704 0.047 1.18 1, 1.40 0.016 1.25 1.04, 1.50 *Dominant model

96

Supplementary Table 1. RREB1 variants genotyped in a locus-wide analysis

SNP Chr Position Function PolyPhen2 (missense) RegulomDB Alleles Notes (hg19) (intronic) rs1334576 6 7211818 G195R Prob. Damaging - G/A - rs2256596 6 7247248 L1467P benign - T/ C Failed QC rs2281833 6 7246998 G1384R benign - G/A - rs2714315 6 7229346 3'UTR - 5 T/ C Failed QC rs6919004 6 7215541 intron - 2b C/T - rs9379084 6 7231843 D1171N Prob. Damaging - G/A - rs9502564 6 7230680 G783V benign - G/T Failed QC rs9505090 6 7239201 intron - 1F A/G - rs41302867 6 7240876 intron/ near splice site - - G/A - rs73374662 6 7189398 T90A Prob. Damaging - G/A - rs80334326 6 7202444 intron - 2a C/T - rs115093903 6 7231280 L983S Prob. Damaging - T/ C - rs115479859 6 7108269 5'UTR - 2a A/T - rs115693225 6 7182107 5'UTR - 5 A/G - rs116295722 6 7231736 S1135I Prob. Damaging - G/T Failed QC in European Americans rs116821447 6 7182155 S4N benign - G/A - rs147897366 6 7181353 splice site - 5 C/G - rs77510523 6 7240053 Intron - 2a A/T Was not genotyped in European Americans

97

Supplementary Table 2. MAF of RREB1 variants across genetic resources rs9379084 rs41302867 rs6919004 rs9505090 rs116295722 1000 Genomes AFR 0.008 0.008 0.33 0.07 0.01 YRI 0 0 0.32 0.1 0.011 LWK 0 0 0.29 0.07 0.005 ASW 0.033 0.033 0.39 0.049 0.016 Exome Variant Server Af. American 0.022 0.022 - - 0.0039

98

Supplementary Table 3: Transcription Factor Binding Sites rs9505090 is located within

Transcriptio Category (broad)* Link to study Pubmed n Factor

POLR2A RNA Pol subunit

SP4 tumor suppression

JUND AP1 TF

CBX3 Heterochromatin formation

ARID3A Embryogenesis

SIN3A STAT

MYC pseudooncogene

YY1 Histone acetylation/ deacteylation

ELF1 ATP-BC family

IRF1 immune response (interferon beta), apoptosis

MAZ oncogenesis

CTCF insulator; chromatin remodeling; Represses IGF-2-->T2D? represses IGF-2

TEAD4 Embryogenesis

STAT1 interferon upregulation

99

MAX hereditary pheochromocytoma

ZBTB7A Represses CDKN2A; induces thymic CDKN2A polymorphisms insulin repsonse associated with T2D

RCOR1 Neural cell differentiation

EP300 co-activator of HIF1A; co-activator of NeuroD1/ Secretin and T2D NeuroD1 of NeuroD1-dependent transcription of secretin

SMC3 chromosome cohesion during mitosis

ETS1 oncogene homolog

BRCA1 tumor suppression; oncogene; activates CDKN1B

BACH1 MAFK signalling

ZNF143 transcriptional activator of tRNAsec

FOXP2 mediates speech & language development

EGR1 cancer suppressor; cellular differentiation and mitogenesis

SIN3A repression mediated by Mad -Max

SP1 cell differentiation, growth, apoptosis, immune response

RAD21 repair DNA double strand breaks;

100

apoptosis via CASP3

BHLHE40 chondrogenesis

FOXA1 development of endoderm organs (eg,liver, deletion of CDKN1B ameliorates http://www.ncbi.nlm.nih.gov/pubmed/15685 pancreas, lung, and prostate); activates hyperglycemia in diabetic mice 168 CDKN1B; glucose homeostasis

USF2 interacts with PPRC - mitochondiral PPRC similar to PPARYy biogenesis

EBF1 B cell development

FOSL2 AP-1 transcription factors

CTCFL epigenitic modifier via imprinting of male IGF2 germline; methylation of IGF2/H19

REST neuronal regulation

USF1 linked to familial hyperlipidemia; regulator "critical transcriptional factor http://www.ncbi.nlm.nih.gov/pubmed/21543 of glucose mediated TGF-B1 in mesangial regulating diabetic kidney 418 cells disease" ; USF1 SNPs linked to http://www.ncbi.nlm.nih.gov/pubmed/20031 defective insulin response-->CHD 629 & dyslipidemia

ATF3 expression induced by low glucose in kidney disease, glucose response http://www.ncbi.nlm.nih.gov/pubmed/24140 pancreaitic cells, regulates glucagon 652; release, implicated in acute kidney injury http://www.ncbi.nlm.nih.gov/pubmed/20360 311; http://www.ncbi.nlm.nih.gov/pubmed/18235 102

101

E2F6 tumor suppressor

NFIC ondoblast formation

NDAC2 deactylation - -

*From .org, Weizman Institute of Science, Rehovot, Israel

102

Supplementary Table 4. CNV in RREB1 gene region.

CN CN CN CN CN CN CN CN CN CN CN CN CN CN CN ID ds1 1193443 1195607 1195620 1195621 1195640 1195648 1195656 1195687 1195691 1200115 329265 329267 1200142 1200153 1202240 no CNV 3686 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

1 Deletion NCBH- 046 1 2 2 2 2 2 2 2 4 4 1 1 1 1 1 1

1 Insertion DMS-137 2 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2 608 2 3 3 3 3 3 3 3 3 3 2 2 2 2 2 2

2 Insertions NCBH- 046 1 2 2 2 2 2 2 2 4 4 1 1 1 1 1 1 RGF-1201 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 692 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2138 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 DMS-2001 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 ds1=control; ds2=T2D-ESKD cases

103

Supplementary Figure 1. TFBS in which rs9505090 is located within.

104

Chapter 4

Complement factor H gene associations with ESKD in African Americans Jason A. Bonomo1,2, Nicholette D. Palmer 2,3, Pamela J. Hicks2, Janice P. Lea4, Mark D. Okusa 5, Carl D. Langefeld 2,6, Donald W. Bowden 2,3, Barry I. Freedman 2,7

1Department of Molecular Medicine and Translational Science, 2Center for Genomics and Personalized Medicine Research, 3Department of Biochemistry, 6Department of Biostatistical Sciences, 7Department of Internal Medicine - Section on Nephrology; Wake Forest School of Medicine, Winston-Salem, North Carolina; 4Department of Medicine – Division of Renal Medicine; Emory School of Medicine; Atlanta, Georgia; 5Department of Medicine – Division of Nephrology; University of Virginia School of Medicine, Charlottesville, Virginia.

JAB performed experiments, ran the statistical analyses, prepared the methods section , and served in an editorial capacity. NDP aided in manuscript preparation. PJH aided with experiments. JPL and MDO recruited subjects. CDL wrote the statistical software for single-SNP association tests. DWB served in an advisory capacity. BIF prepared the manuscript, recruited subjects, and served in an advisory capacity.

This is a pre-copy-editing, author-produced PDF of an article accepted for publication in Nephrology, Dialysis, & Transplantation following peer review. The definitive publisher- authenticated version [Bonomo et al. 2014. NDT. Epub ahead of print. doi: 10.1093/ ndt/ gfu036] is available online at: http:/ / ndt.oxfordjournals.org/ content/ early/ 2014/ 02/ 27/ ndt.gfu036.long

Running Head: CFH in African Americans with ESKD

Key Words: African Americans, CFH, end-stage kidney disease, genetics, kidney disease

Word counts: Abstract: 250 Text: 2224 Tables: 5 (4 + 1 Supplementary)

Correspondence: Barry I. Freedman, MD Internal Medicine - Nephrology Wake Forest School of Medicine Medical Center Boulevard Winston Salem, NC 27157-1053 USA phone: (336) 716-6192 fax: (336) 716-4318 email: [email protected]

105

Abstract

Background: Mutations in the complement factor H gene (CFH) region associate with renal- limited mesangial proliferative forms of glomerulonephritis including IgA nephropathy (IgAN), dense deposit disease (DDD), and C3 glomerulonephritis (C3GN). Lack of kidney biopsies could lead to under diagnosis of CFH-associated end-stage kidney disease (ESKD) in African

Americans (AAs), with incorrect attribution to other causes. A prior genome-wide association study in AAs with non-diabetic ESKD implicated an intronic CFH single nucleotide polymorphism (SNP).

Methods: Thirteen CFH SNPs (10 exonic, 2 3’UTR, and the previously associated intronic variant rs379489) were tested for association with common forms of non-diabetic and type 2 diabetes-associated (T2D) ESKD in 3,770 AAs (1,705 with non-diabetic ESKD, 1,305 with T2D-

ESKD, 760 controls). Most cases lacked kidney biopsies; those with known IgAN, DDD, or

C3GN were excluded.

Results: Adjusting for age, gender, ancestry, and apolipoprotein L1 gene risk variants, single

SNP analyses detected 6 CFH SNPs (5 exonic and the intronic variant) as significantly associated with non-diabetic ESKD (p=0.002-0.01), three of these SNPs were also associated with T2D-

ESKD. Weighted CFH locus-wide sequence kernel association testing (SKAT) in non -diabetic

ESKD (p=0.00053) and T2D-ESKD (p=0.047) confirmed significant evidence of association.

Conclusions: CFH was associated with commonly reported etiologies of ESKD in the AA population. These results suggest that a subset of cases with ESKD clinically ascribed to the effects of hypertension or glomerulosclerosis actually have CFH-related forms of mesangial proliferative glomerulonephritis. Genetic testing may prove usefu l to identify the causes of renal-limited kidney disease in patients with ESKD who lack renal biopsies.

106

Introduction

The is critical to protect hosts from invading pathogens. 1

Dysregulation of this system is associated with susceptibility to infection and autoimmune disorders including systemic lupus erythematosus.2 The complement factor H gene (CFH) and five CFH-related genes (CFHR) with high are located on chromosome 1q32.

This complex genomic region regulates the activity of the alternative pathway of the complement system. CFH encodes Factor H protein, a critical inhibitor of the alternative pathway.3 Loss of function mutations in CFH associate with age-related macular degeneration, presumably from microvascular retinal injury due to loss of inhibitory effect on the alternative complement pathway.4

Several renal-limited forms of mesangial proliferative glomerulonephritis also associate with mutations in the CFH and CFHR genes.5 These include IgA nephropathy (IgAN), C3 glomerulonephritis (C3GN), and dense deposit disease (DDD). 6-9 Mutations in the N-terminal regulatory region of CFH associate with complement-mediated C3GN and DDD; both disorders can be progressive and lead to end -stage kidney disease (ESKD). A genome-wide association study (GWAS) in IgAN implicated a single nucleotide polymorphism (SNP) in intron 12 of

CFH.6 The intronic SNP that showed the strongest association with IgAN is in high linkage disequilibrium (LD) with copy number variation in the adjacent CFHR1 and CFHR3 genes.

Deletion of these two genes appears to reduce susceptibility to IgAN. Finally, mutations near the C-terminus of CFH are associated with atypical hemolytic uremic syndrome (aHUS).10 aHUS is a systemic thrombotic disorder manifesting endothelial cell injury and leading to progressive

107

kidney failure. aHUS lacks a mesangial proliferative injury pattern and manifest s clinically with thrombocytopenia and intravascular hemolysis.

Extra-renal manifestations are strong clues to presence of aHUS. In contrast, renal- limited IgAN, C3GN, and DDD can only be diagnosed with a kidney biopsy. In the absence of biopsy material, subjects with progressive renal-limited kidney disease are often diagnosed as having hypertensive or chronic glomerulosclerosis-associated ESKD.11;12 A GWAS in African

American (AA) cases with non-diabetic etiologies of ESKD implicated the apolipoprotein L1

(APOL1) and CFH genes.13 After the profound effect of APOL1, intronic CFH SNP rs379489 was the most significantly associated variant. The current analyses evaluated this SNP and 12 additional exonic (coding) CFH variants to determine whether they were associated with commonly reported forms of non-diabetic and type 2 diabetes-associated (T2D) ESKD in AAs.

108

Methods

Study Subjects

Recruitment and sample collection procedures have previously been reported. 13;14 The study was approved by the Institutional Review Board at Wake Forest School of Medicine

(WFSM) and all participants provided written informed consent. Cases and controls were unrelated and born in North Carolina, South Carolina, Georgia, Tennessee, or Virginia. DNA was extracted from whole blood using the PureGene system (Gentra Systems, Minneapolis,

MN, USA). AA cases with ESKD were recruited from dialysis facilities; cases with non -T2D

ESKD lacked diabetes at initiation of renal replacement therapy. ESKD was attributed to hypertension (~60%), unspecified glomerular disease or focal segmental glomerulosclerosis

(FSGS) (~30%), HIV-associated nephropathy (~5%) or unknown cause in the absence of a kidney biopsy (~5%); less than 2% of cases had a kidney biopsy. T2D was diagnosed in cases developing diabetes after age 25 years, without diabetic ketoacidosis or treatment solely with insulin since diagnosis. T2D-ESKD was diagnosed after >5 year T2D duration prior to renal replacement therapy, or with diabetic retinopathy or ≥ 100 mg/ dl proteinuria on urinalysis

(when available), in the absence of other causes of nephropathy. Cases with ESKD due to urologic/ surgical cause, polycystic kidney disease, aHUS, IgAN, membranous glomerulonephritis, membranoproliferative glomerulonephritis, C3GN, or DDD were not recruited. AA controls without T2D or kidney disease (serum creatinine concentration <1.5

[men] or <1.3 mg/ dl [women]) were recruited from the community and WFSM internal medicine clinics. Ethnicity was self-reported and confirmed by genotyping with ancestry informative markers.

Sample Preparation, Genotyping, and Quality Control

109

CFH variants were selected from exome sequencing resources (The 1000 Genomes

Project, Exome Variant Server) in addition to the intronic variant selected from our prior report.13 (SNP selection criteria included allelic discrepancies between European and AA populations (Supplementary Table 1), PolyPhen2 prediction

(http:/ / genetics.bwh.harvard.edu/ pph/ data/ ), and minor allele frequency. Due to the low predictive value of PolyPhen2, the amino acid changes of exonic variants were considered independently of PolyPhen2 score in certain circumstances (Supplementary Table 1). Our intention was to test genetic variants in the CFH gene based on the common disease-common variant and the common disease-rare variant hypotheses. This is reflected in Tables 2 and 3 with common single-SNP association testing and in Table 4 with the SKAT analysis powered for rare variants.

Targeted Genotyping

Targeted genotyping of 13 CFH variants was performed utilizing the Sequenom

MassArray system (Sequenom, San Diego, CA) in the Center for Genomics and Personalized

Medicine Research at WFSM. SNPs were PCR-amplified using primers designed in

MassARRAY Assay Design 3.1 (Sequenom, San Diego, CA) and genotypes w ere analyzed using

MassARRAY Typer (Sequenom). Call rates >97% were achieved for all variants; quality control was ensured using blind duplicates within each cohort of samples (100% concordance rate).

Statistical Analysis

Single SNP Association Testing

Each SNP was tested for departure from Hardy-Weinberg Equilibrium (HWE) expectations through Fisher’s exact test (HWE p>0.05 in both cases and controls). The overall genotypic test of association and the three genetic models (dominant, additive, and recessiv e) were computed to test for association between each SNP and each phenotype. Data for all tests

110

of association were adjusted for admixture 15 and APOL1 G1/ G2 risk allele status assuming a recessive model of disease risk 16 (Model 1) or Model 1 + age and gender (Model 2). These tests were computed using the SNPGWA program

(http:/ / www.phs.wfubmc.edu/ public_bios/ sec_gene/ downloads.cfm). Given the a priori evidence of association between CFH and nephropathy, we employed a p-value cutoff of <0.05 for statistical significance.

Sequence Kernel Association Testing

Sequence Kernel Association Testing (SKAT) was p erformed on all 13 CFH variants genotyped in non-diabetic ESKD cases, T2D-ESKD cases, and non-diabetic non-nephropathy controls.17 The SKAT Meta package was run on R 3.0.1 (http:/ / cran.r- project.org/ web/ views/ Genetics.html) using a model weighted for low frequency variants, termed “[c(1,3)]”, or an unweighted model termed “[c(1,1)]”. The SKAT analysis provides quality control and support for association. SKAT considers whether a variant confers risk, protection, or neutral (no effect), and it sim ultaneously considers the influence of all variants and their directions of effect.

111

Results

Table 1 contains demographic and laboratory characteristics of all 3,010 AA cases with non-diabetic ESKD and T2D-ESKD, as well as the 760 non-nephropathy controls. The percentage African ancestry in non-diabetic ESKD cases, T2D-

ESKD cases, and non-nephropathy controls were mean + standard deviation (SD) 80.01% ±

10.96%, 79.85% ± 11.54%, and 77.7% ± 10.96%, respectively. Contrasting non-T2D ESKD cases and controls, cases were more often male (p=0.006), older (p<0.001), and had lower body mass index (BMI) (p<0.001). Contrasting T2D-ESKD cases and controls, cases were more likely female

(p=0.03), older (p<0.001), and had higher BMI (p=0.004). Supplementary Table 1 lists the 13

SNPs that were evaluated. All SNPs met HWE expectations in cases and controls.

Table 2 displays single SNP association results in cases with non -diabetic ESKD, relative to controls. Results are reported with an additive genetic model, unless a significant (p<0.05) lack of fit (LOF) to additivity was observed. Results for Model 1 were adjusted for admixture and APOL1 and in Model 2 for admixture, APOL1, age, and gender. The previously associated intronic CFH SNP rs379489 remained significantly associated in this larger sample (dominant model, odds ratio [OR] 0.74, 95% confidence interval [CI] 0.61-0.91; p=0.0041). In addition, two synonymous and three non-synonymous SNPs were also significantly associated with non - diabetic ESKD after full adjustment (Model 2). Two of these SNPs were associated in a dominant model, rs3753396 and rs1065489 (OR 1.62 [95% CI 1.19-2.21] p=0.0023; and OR 1.58

[1.16-2.15] p=0.0036, respectively). Three SNPs were associated in an additive model, rs1061147, rs1061170 and rs515299 (OR 0.81 (95% CI 0.70-0.93) p=0.0030; OR 0.83 (95% CI 0.72-0.95) p=0.0091; and OR 0.80 (95% CI 0.67-0.95, p=0.0098, respectively). Variants rs1061147 and rs1061170 are in LD (r 2=0.80; data not shown) and likely reflect the same signal. Modest LD w as

112

observed for rs1061170 and rs379489 (r 2=0.52; data not shown); all other variants in this study had an r2 value <0.40.

These same 13 CFH SNPs were next evaluated for association with ESKD attributed to

T2D in AAs and four were significantly associated (Model 2). Of the four SNPs, three were also associated with non-diabetic ESKD, including the intronic SNP rs379489 (OR 0.84 [95% CI 0.70-

1.00] p=0.049) and exonic rs1061170 (OR 0.85 [95% CI 0.72-0.99] p=0.042) and rs1061147 (OR 0.83

[95% CI 0.71-0.97] p=0.017). The fourth SNP (exonic rs35274867) was associated with putative

T2D-ESKD (OR 2.00 [95%CI 1.17-3.40] p=0.011), but not with non-diabetic ESKD.

Table 4 displays both unweighted and weighted Sequence Kernel Association Testing

(SKAT) results in non-diabetic ESKD cases (Table 4A) and T2D-ESKD cases (Table 4B). This revealed significant CFH association (p=0.00053 adjusted for age, gender, ancestry and APOL1) and provides evidence of locus-wide CFH association based on all 13 SNPs for non-diabetic

ESKD cases. Modest evidence of CFH association was also observed with T2D-ESKD in the unweighted and weighted models (p=0.026-0.047).

113

Discussion

We report a series of CFH genetic association analyses in AAs clinically diagnosed with common etiologies of ESKD, specifically excluding those with known IgAN, DDD, or C3GN.

Consistent and significant evidence of association was observed with several coding CFH SNPs in non-diabetic forms of ESKD, and to a lesser extent in T2D-attributed ESKD. CFH is associated with several forms of mesangial proliferative glomerulonephritis, all with glomerular complement deposition.5

These analyses were initiated to follow -up an intronic CFH variant associated with non- diabetic ESKD in a GWAS.13 The present study included a larger sample of AAs presumably with common forms of non-diabetic ESKD and T2D-associated ESKD who typically lacked renal biopsies. Two synonymous, two 3’ untranslated region (UTR), and eight coding exonic variants were assessed; four are predicted to be deleterious (Supplementary Table 1). Lack of renal histologic material in AAs (and many patients with ESKD) hinders diagnostic accuracy in renal-limited forms of glomerulonephritis. Many of these patients develop progressive nephropathy with resultant secondary hypertension.11;12 In the absence of a kidney biopsy their disease is often attributed to the effects of systemic hypertension, coded as glomerulosclerosis, or as unknown cause.18

Genetic analyses have the potential to dissect subsets of patients with related or specific etiologies of ESKD from heterogeneous samples.19 IgAN is the most common form of CFH- associated glomerular disease and it occurs significantly less often in AAs than in populations of Asian and European ancestry.20;21 Much of this disparity is genetically mediated and geographic differences exist in risk allele frequencies.6 However, IgAN, as well as the rare disorders DDD and C3GN, would be under diagnosed in the absence of a kidney biopsy. AAs and ethnic minorities generally have poorer access to healthcare relative to European

114

Americans.22 Our data support that cases of IgAN, DDD, and C3GN may be missed in AAs. It is widely appreciated that many AAs clinically diagnosed with T2D-associated ESKD have non- diabetic forms of nephropathy.19;23

The renal literature frequently lacks kidney biopsy material necessar y to diagnose renal- limited disease. Many of these AA cases likely had renal-limited forms of mesangial proliferative disorders related to CFH and CFHR, although it cannot be proven histologically. It is possible that CFH associates with progressive FSGS and/ or focal global glomerulosclerosis or that alternative pathway complement activation as seen in IgAN, DDD and C3GN were present in our cases. Although power tends to be a limitation for many genetic studies, we do not believe it adversely affected our results. We estimate 80% power to detect non -T2D ESKD associations for common variants (rs1061147 and rs1061170) and >90% power to detect associations amongst low frequency variants (rs3753396 an d rs1065489) at α=0.01 (CaTs Power

Calculator, University of Michigan; data not shown). We were unable to directly assess whether the observed variants are in linkage disequilibrium (LD) with copy number variation (CNV) in

CFHR3/ CFHR1 since the non-T2D ESKD cases lack individually genotyped GWAS data (DNA was pooled). One would not expect low frequency and rare variants to be in LD with CFHR

CNVs, especially given the distance (i.e. 50 kb). We therefore used common SNPs which were associated with ESKD as a proxy search (using the SNAP program from the Broad Institute) to investigate whether they were in LD (r 2 >0.8) with SNPs located within exons of CFHR3/ CFHR1 using published CNV methodology in these genes.24;25 Common CFH SNPs that we found to be associated with ESKD were not in LD with CNV-associated SNPs in CFHR3/ CFHR1. Moreover, we assessed whether rs6677604, the intronic CFH SNP previously associated with IgAN which is in strong LD with (and thus a proxy for) the CNV CFHR1,3Δ (as reported by Gharavi et al.),6

115

was in LD with any of the variants identified in this study. Again, no variants were found to be in LD.

In conclusion, we replicated association of an intronic SNP in CFH with clinically diagnosed non-diabetic and T2D-associated etiologies of ESKD in AAs. We further demonstrate that multiple exonic or coding SNPs in CFH are associated with common complex forms of

ESKD in populations of recent African ancestry. This supports a higher frequency of mesangial proliferative forms of glomerulonephritis in the AA population with ESKD than previously appreciated or that the complement system plays a role in progressive glomerulosclerosis. It also demonstrates how genetic methodologies can be applied to dissect related kidney disorders from large and heterogeneous sample sets.

Acknowledgements

The authors have no conflicts of interest to report. This work was supported by F30

DK098836-01 (JAB); R01 DK53591-10 (DWB); RO1 DK070941 (BIF); and RO1 DK071891 (BIF).

116

Table 1. Clinical characteristics of African American study samples Non-T2D ESKD T2D-ESKD Cases Controls Cases N 1305 760 1705 Female (%) 60.7 57.9 43.7

Age (years) 61.3 ± 10.8 48.4 ± 12.7 54.6 ± 14.6

Age at T2D (years) 41.3 ± 12.4 - -

Duration T2D prior to ESKD (years) 17.1 ± 10.7 - -

Duration of ESKD (years) 3.66 ± 3.9 - 2.2 ± 1.64

Fasting serum glucose (mg/dl) - 89.2 ± 13.6 88.6 ± 8.66

BMI (at recruitment; kg/m2) 30.3 ± 7.2 29.2 ± 7.4 27.2 ± 6.98

BUN (mg/dl) - 13.3 ± 4.5 -

Serum creatinine (mg/dl) - 1.03 ± 0.46 - Categoric data expressed as percentage; continuous data as mean ± SD.

117

Table 2. CFH single SNP association results in African Americans with non-diabetic ESKD

Model Model 1 2 MAF Case / SNP BP (hg19) N Case / Control☼ Control P OR 95% CI P OR 95% CI rs1061147 196654324 1424 / 691 0.42 / 0.46 0.0037 0.82 0.71, 0.94 0.0030 0.81 0.70, 0.93 rs1061170 196659237 1419 / 691 0.36 / 0.39 0.021 0.85 0.74, 0.98 0.0091 0.83 0.72, 0.95 rs74443307 196670520 1432 / 722 0.017 / 0.012 0.39 1.30 0.71, 2.38 0.52 1.22 0.66, 2.26 rs35453854 196684855 1426 / 693 0.054 / 0.065 0.20 0.83 0.62, 1.10 0.44 0.89 0.66, 1.20 rs379489 196693451 1422 / 688 0.25 / 0.29 0.012* 0.78 0.64, 0.95 0.0041* 0.74 0.61, 0.91 rs3753396 196695742 1426 / 690 0.072 / 0.057 0.00055* 1.70 1.26, 2.31 0.0023* 1.62 1.19, 2.21 rs142902005 196696005 1425 / 691 0.0014 / 0.00 ------rs515299 196706677 1423 / 690 0.19 / 0.22 0.017 0.82 0.69, 0.96 0.0098 0.80 0.67, 0.95 rs1065489 196709774 1426 / 693 0.072 / 0.058 0.00084* 1.67 1.23, 2.25 0.0036* 1.58 1.16, 2.15 rs34362004 196711098 1425 / 694 0.011 / 0.017 0.076 0.58 0.32, 1.06 0.099 0.59 0.32, 1.10 rs35274867 196712596 1422 / 692 0.026 / 0.022 0.63 1.12 0.70, 1.79 0.56 1.16 0.71, 1.88 rs34247141 196715063 1427 / 693 0.13 / 0.13 0.66 0.95 0.77, 1.18 0.96 0.99 0.80, 1.24 rs371680052 196716472 1419 / 691 0.0014 / 0.0014 0.83 1.21 0.21, 7.03 0.93 1.09 0.19, 6.34 Model 1: Additive model adjusted for APOL1 & admixture Model 2: Model 1 + age & gender *denotes Dominant Model (lack of fit to additive < 0.05) MAF – minor allele frequency; OR – odds ratio; CI – confidence interval ☼ sample sizes reflect those with complete clinical and genotypic data used in each analysis

118

Table 3. CFH single SNP association results in African Americans with putative type 2 diabetes-associated ESKD

Model 1 Model 2 SNP BP (hg19) N Case / Control☼ MAF Case / Control P OR 95% CI P OR 95% CI rs1061147 196654324 1284 / 691 0.42 / 0.46 0.015 0.85 0.74, 0.97 0.017 0.83 0.71, 0.97 rs1061170 196659237 1281 / 691 0.37 / 0.39 0.17 0.91 0.8, 1.04 0.042 0.85 0.72, 0.99 rs74443307 196670520 1288 / 722 0.017 / 0.012 0.20 1.44 0.82, 2.54 0.29 1.42 0.74, 2.71 rs35453854 196684855 1284 / 693 0.050 / 0.065 0.029 0.73 0.55, 0.97 0.37 0.85 0.60, 1.21 rs379489 196693451 1285 / 688 0.26 / 0.29 0.13 0.89 0.77, .04 0.049 0.84 0.7, 1.00 rs3753396 196695742 1284 / 690 0.050 / 0.057 0.60 0.93 0.69, 1.23 0.46 0.88 0.63, 1.23 rs142902005 196696005 1284 / 691 0.0016 / 0.00 ------rs515299 196706677 1284 / 690 0.20 / 0.22 0.22 0.91 0.77, 1.06 0.17 0.88 0.72, 1.06 rs1065489 196709774 1285 / 693 0.051 / 0.058 0.53 0.91 0.69, 1.21 0.35 0.85 0.61, 1.19 rs34362004 196711098 1284 / 694 0.013 / 0.017 0.28 0.74 0.43, 1.28 0.49 0.79 0.41, 1.53 rs35274867 196712596 1284 / 692 0.032 / 0.022 0.081 1.46 0.95, 2.24 0.011* 2.00 1.17, 3.40 rs34247141 196715063 1285 / 693 0.12 / 0.13 0.15 0.86 0.7, 1.06 0.63 0.94 0.73, 1.21 rs371680052 196716472 1286 / 691 0.0023 / 0.0014 0.55 1.63 0.33, 8.20 0.49 1.84 0.33, 10.17 Model 1: Additive model adjusted for APOL1 & admixture Model 2: Model 1 + age & gender *denotes dominant Model (lack of fit to additive p<0.05) MAF – minor allele frequency; OR – odds ratio; CI – confidence interval ☼ sample sizes reflect those with complete clinical and genotypic data used in each analysis

119

Table 4A. CFH Locus-wide SKAT analysis (non-T2D ESKD)

N Model P-value SNPs Unweighted [1,1] 0.0014 13 Weighted [1,3] 0.00053 13

Table 4B. CFH Locus-wide SKAT analysis (T2D-ESKD)

N Model P-value SNPs Unweighted [1,1] 0.026 13 Weighted [1,3] 0.047 13 Covariates: admixture, APOL1, age, gender

120

Supplementary Table 1. CFH variants genotyped

MAF SNP Chr BP Function PolyPhen2 MAF AA MAF EA Asian rs1061147* 1 196654324 A307A - 0.43 0.41 0.07 rs1061170* 1 196659237 H402Y benign 0.39 0.41 0.07 rs74443307* 1 196670520 3' UTR - 0.01 0 0 probably rs35453854* 1 196684855 I551T 0 0 damaging 0.04 rs379489* 1 196693451 intronic13 - 0.28 0.44 0.07 rs3753396* 1 196695742 Q672Q - 0.08 0.16 0.5 rs142902005** 1 196696005 T724K possibly damaging 0.00091 0 - rs515299* 1 196706677 S890I benign 0.2 0 0 rs1065489* 1 196709774 E936D possibly damaging 0.08 0.16 0.51 rs34362004* 1 196711098 T1071I possibly damaging 0.01 0 0 rs35274867* 1 196712596 N1050Y benign 0.03 0.03 0 rs34247141** 1 196715063 Q1143E benign 0.1 0.00012 - rs371680052** 1 196716472 3' UTR - 0.002 0 - *1000 Genomes (AA=ASW, EA=CEU) **Exome Variant Server, University of Washington/NHLBI MAF – minor allele frequency; AA – African American; EA – European American

121

Reference List (Chapter 4)

1. Yachnin S. Functions and mechanism of action of com plement. N Engl J Med 1966; 274: 140-145

2. Zhao J, Wu H, Khosravi M et al. Association of genetic variants in complement factor H and factor H-related genes with systemic lupus erythematosus susceptibility. PLoS Genet 2011; 7: e1002079

3. Alexander JJ, Pickering MC, Haas M, Osawe I, Quigg RJ. Complement factor h limits immune complex deposition and prevents inflammation and scarring in glomeruli of mice with chronic serum sickness. J Am Soc Nephrol 2005; 16: 52-57

4. Klein RJ, Zeiss C, Chew EY et al. Complement factor H polymorphism in age-related macular degeneration. Science 2005; 308: 385-389

5. Kiryluk K, Novak J, Gharavi AG. Pathogenesis of immunoglobulin A nephropathy: recent insight from genetic studies. Annu Rev Med 2013; 64: 339-356

6. Gharavi AG, Kiryluk K, Choi M et al. Genome-wide association study identifies susceptibility loci for IgA nephropathy. Nat Genet 2011; 43: 321-327

7. Gale DP, de Jorge EG, Cook HT et al. Identification of a mutation in complement factor H - related protein 5 in patients of Cypriot origin with glomerulonephritis. Lancet 2010; 376: 794-801

8. Pickering MC, Cook HT, Warren J et al. Uncontrolled C3 activation causes membranoproliferative glomerulonephritis in mice deficient in complement factor H. Nat Genet 2002; 31: 424-428

9. brera-Abeleda MA, Nishimura C, Smith JL et al. Variations in the complement regulatory genes factor H (CFH) and factor H related 5 (CFHR5) are associated with membranoproliferative glomerulonephritis type II (dense deposit disease). J Med Genet 2006; 43: 582-589

10. Ying L, Katz Y, Schlesinger M et al. Complement factor H gene mutation associated with autosomal recessive atypical hemolytic uremic syndrome. Am J Hum Genet 1999; 65: 1538- 1546

11. Freedman BI, Iskandar SS, Appel RG. The link between hypertension and nephrosclerosis. Am J Kidney Dis 1995; 25: 207-221

12. Zarif L, Covic A, Iyengar S, Sehgal AR, Sedor JR, Schelling JR. Inaccuracy of clinical phenotyping parameters for hypertensive nephrosclerosis. Nephrol Dial Transplant 2000; 15: 1801-1807

13. Bostrom MA, Kao WH, Li M et al. Genetic association and gene-gene interaction analyses in African American dialysis patients with nondiabetic nephropathy. Am J Kidney Dis 2012; 59: 210-221

122

14. McDonough CW, Palmer ND, Hicks PJ et al. A genome-wide association study for diabetic nephropathy genes in African Americans. Kidney Int 2011; 79: 563-572

15. Ng PC, Levy S, Huang J et al. Genetic variation in an individual human exome. PLoS Genet 2008; 4(8): e1000160

16. Genovese G, Friedman DJ, Ross MD et al. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 2010; 329: 841-845

17. Wu MC, Lee S, Cai T, Li Y, Boehnke M, Lin X. Rare-variant association testing for sequencing data with the sequence kernel association test. Am J Hum Genet 2011; 89: 82-93

18. U.S. Renal Data System, USRDS 2012 Annual Data Report, Vol 1: Atlas of Chronic Kidney Disease and End-Stage Renal Disease in the United States, National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases. 2012.

19. Freedman BI, Langefeld CD, Lu L et al. Differential Effects of MYH9 and APOL1 Risk Variants on FRMD3 Association with Diabetic ESRD in African Americans. PLoS Genet 2011; 7: e1002150

20. Kiryluk K, Li Y, Sanna-Cherchi S et al. Geographic differences in genetic susceptibility to IgA nephropathy: GWAS replication study and geospatial risk analysis. PLoS Genet 2012; 8: e1002765

21. Wyatt RJ, Julian BA. IgA nephropathy. N Engl J Med 2013; 368: 2402-2414

22. Crews DC, Pfaff T, Powe NR. Socioeconomic factors and racial disparities in kidney disease outcomes. Semin Nephrol 2013; 33: 468-475

23. Gopalakrishnan I, Iskandar SS, Daeihagh P et al. Coincident idiopathic focal segmental glomerulosclerosis collapsing variant and diabetic nephropathy in an African American homozygous for MYH9 risk variants. Hum Pathol 2011; 42: 291-294

24. Moore I, Strain L, Pappworth I et al. Association of factor H autoantibodies with deletions of CFHR1, CFHR3, CFHR4, and with mutations in CFH, CFI, CD46, and C3 in patients with atypical hemolytic uremic syndrome. Blood 2010; 115: 379-387

25. Holmes LV, Strain L, Staniforth SJ et al. Determining the population frequency of the CFHR3/ CFHR1 deletion at 1q32. PLoS One 2013; 8: e60352

123

Chapter 5

Summary & Conclusions

Throughout these studies, and those in the Appendix, we employed logical and comprehensive analyses in order to gain further insight into the genetic architecture of T2D -

ESKD, non-T2D ESKD, and T2D. We utilized the vast wealth of renal data in an effort to approach the missing heritability of T2D-ESKD from an original perspective. We were able to successfully validate several of our hypotheses. First, we established that rare and low frequency missense variants in NPHS1 modulate susceptibility to ESKD in the general African

American population, regardless of primary phenotype. We also identified a novel candidate gene, RREB1, for both T2D and ESKD susceptibility in African and European Americans using a priori genetics and biological data. Finally, we identified CFH as a susceptibility locus for non-

T2D ESKD in African Americans. These are some of the first studies implicating rare and low frequency variants to ESKD susceptibility in the general population, and these data offer insights into the disproportionate burden of ESKD borne by those of African ancestry . By combining exome sequencing data from the T2D-GENES consortium with outside resources

(The 1000 Genomes Project, Exome Variant Server) and bioinformatics tools (RegulomeDB,

Nephromine, and PolyPhen2) we were able to approach the question of “missing heritability” of T2D-ESKD in African Americans from a particularly unique viewpoint.

From an anthropological and genetics perspective African Americans represent a uniquely admixed group where, on average, 80% of their genome is derived from a population that is ≈160,000 years old, and, on average, the remaining 20% of the genome attributed to admixture with European-derived populations whose ancestry is much younger: 25,000-40,000 years old (Stewart, Stringer 2012). Native African populations never experienced the same bottlenecking that Eurasian populations underwent following their exit from the African

124

continent (“Out of Africa” [OOA]), despite previous unsubstantiated claims in the Grim

Hypothesis (Freedman 2002). Moreover, we know that Eurasian populations, but not African populations, interbred with Neanderthals. These Neanderthal genes compromise ≈1% of

European genomes and ≈1.5% of Asian genomes (Sankararaman et al. 2014).

This difference in genetic diversity is reflected by present day differences in haplotype structures: the more ancient African ancestral populations experienced greater LD deca y over the milenina, minimal bottlenecks, and unique environmental drivers of natural selection (eg,

APOL1 G1/ G2, sickle cell anemia alleles [HgbS]) which are reflected by their smaller haplotype blocks, greater genetic diversity, and unique, common genetic variants tailored to their prior local environments (Charles et al. 2014, Tennessen et al. 2012, Genovese et al 2010). This is in contrast to European ancestral populations where multiple bottlenecking events, shorter durations of LD decay, and different environmental conditions led to larger haplotype blocks with less genetic diversity (Charles et al. 2014; Tennessen et al. 2012). Despite differences in ancestry, African, European, and Asian populations all experienced recent exponential population growth which is illustrated by the low and rare frequency genetic variants unique to each population we observe today (Coventry et al. 2010; Nelson et al. 2012). These rare and low frequency variants arose over the past 2,500-75,000 years (depending on ancestry), are more likely to be deleterious, and have yet to be selected against through natural selection (Coventry et al. 2010). Herein, we largely focused on understanding the role that putatively functional rare and low frequency variants play in modulating susceptibility to complex renal diseases.

Nearly all of our African American ESKD samples came from the United States End

Stage Renal Disease Network 6 (“Southeastern Kidney Council, Inc.”), quite literally the epicenter of the T2D-ESKD health disparity (USRDS 2013). As a major academic medical center in this network it is our obligation to address this challenging inequality that affects the patients

125

we serve. In this thesis we sought to understand this disparity through a genetics and bioinformatics perspective in an attempt to reconcile the vast data wealth available to nephropathy researchers with in house, state-of-the-art genetic techniques.

In Chapters 2-4 we leveraged allelic discrepancies between European and African ancestral populations in order to drive our selection of putatively functional variants for targeted genotyping. Given the major disparity in incidence of T2D-ESKD and ESKD in African

Americans compared to European Americans, coupled with the missing heritability of T2D-

ESKD, we hypothesized that functional genetic drivers of T2D-ESKD and ESKD would be found at different allelic frequencies between these two populations. This approach is in line with previous successful studies (Genovese et al. 2010). Our results from these investigations were generally consistent and supported our hypotheses.

Summary Chapter 2

In Chapter 2 we examined the influence of rare and low frequency missense variants in

NPHS1 on susceptibility to T2D-ESKD and non-T2D ESKD in African Americans.

Aforementioned, African Americans have nearly twice the number of genetic variants in

NPHS1 compared to European Americans. Many of the variants we genotyped were found exclusively in African ancestral populations. For example, The T233A variant (rs35238405;

OR=2.82 for all-cause ESKD) was found at a MAF of 1.2% in cases compared to 0.38% in controls. In native African populations it’s found at higher MAF (YRI=1.7%, LWK=4.1%) and is completely absent in European populations. Interestingly, the T233A variant was found at a

MAF of 2.4% in a separate cohort of ESKD cases in the Southern United States (Chris Larsen,

Nephropath). This suggests local population stratification may be at play with this variant, and this allele may be found at higher or lower frequencies depending on th e geographic location

126

within the United States. Alternatively, this MAF discrepancy could be due to genotyping errors.

Additionally, we were able to provide evidence supporting the rare-variant complex- disease hypothesis. We did so by implicating the NPHS1 locus as contributing to ESKD in

African Americans through the SKAT analysis and by assessing the population attributable risk

(PAR) of NPHS1 variants. In total 16 non-monomorphic missense variants in NPHS1 with a

MAF range of 0.04 – 13% were successfully genotyped. Assessing the cumulative impact of these 16 missense variants simultaneously in SKAT we obtained P-values of 2.95x10-8, 5.63x10-7, and 1.18x10-6 for ESKD association of the NPHS1 locus in unadjusted, admixture, and admixture

+ APOL1 adjusted models, respectively. Two missense variants, in addition to T233A, were associated with ESKD through single-SNP association testing (Y1174H OR=0.44, H800R

OR=0.04). Combining the PAR of the T233A, H800R, and Y1174H variants, we were able to explain roughly 8.1% of the missing heritability in African Americans with T2D-ESKD.

A clinical extension of this study relates to glomerular nephrin expression. It is known that nephrinuria, or the pathologic finding of nephrin in the urine, precedes microalbuminuria

(Langham et al. 2002). Microalbuminuria is presently the most sensitive, widely implemented laboratory test for detecting the earliest stages of CKD and DN. It thus stands to reason that if glomerular injuries are recognized earlier – ie, detecting nephrinuria rather than microalbuminuria – treatments to slow the progression of CKD and DN could then be initiated earlier. ACEi increase nephrin mRNA and protein expression in the podocyte (Langham et al.

2002; Doublier et al. 2003), and although the exact mechanism of this is poorly understood, ACEi can be leveraged as therapy. The nephrin variants in this study were heterozygotes and conferred both risk and protection. It’s possible that initiation of ACEi therapy earlier in the

127

course of CKD or DN may slow nephropathy progression by increasing expression of the beneficial nephrin allele.

Summary Chapter 3

Our genetic investigations of RREB1 in Chapter 3 took us on an unexpected path in elucidating genetic associations of variants with multiple phenotypes across two ethnic groups.

We began the investigation by positing genes with strong biological plausibility for influencing

T2D-ESKD could be identified through our initial exome sequencing studies. Here we started with two genes, RREB1, which, amongst many of its function, is to inhibit the expression of angiotensinogen, and DUOX2, a gene whose protein is involved in modulating cellular responses to oxidative stress and is widely expressed in the kidney. The hypotheses were that

RREB1 variants could modulate susceptibility to T2D-ESKD through differences in angiotensinogen repression, and that genetic variants in DUOX2 could influence T2D-ESKD through protein alternations (missense mutations) modulating susceptibility to hyperglycemia induced oxidative stress.

While we were able to replicate findings from our discovery samples for two RREB1 variants (rs9379084 [D1171N] and rs41302867 [splice site]), the DUOX2 variant rs269868

(S1067L) did not replicate (p=0.070; data not shown). Thus, for Chapter 3, we focused on investigating the role of RREB1 variants in T2D-ESKD and the associated phenotype T2D (in the absence of nephropathy) and non-T2D ESKD (ESKD in the absence of T2D). As these studies progressed we made the observations that putatively functional genetic variants in RREB1 contributed to T2D, T2D-ESKD, and non-T2D ESKD in African American and European

American populations.

128

Perhaps most striking was the observation that the splice site variant rs41302867 was strongly associated with ESKD in African Americans (p =5.85x10-5, OR=0.49; MAF Cases=1.4% and MAF controls=3.2%) and European Americans, though it’s found at greater MAF (MAF EA

T2D-ESKD=11%, MAF EA controls = 13%). Haplotype block differences between African and

European Americans may explain why rs9379084 (p=8.30x10-5) was associated with T2D-ESKD in European Americans several orders of magnitude greater than rs41302867 (p=0.0064), the strongest signal observed in African Americans.

Remarkably, we identified common, protective T2D-ESKD variants (MAF=14% controls,

OR=0.52) in European Americans that are found at low MAF in African Americans (MAF=3%,

OR=0.49). These observations shed important light on the African American T2D-ESKD health disparity from a genetics perspective. The identification of RREB1 variants associated with non-

T2D ESKD phenotypes in African Americans and with T2D-ESKD in European Americans provided further insight into the missing heritability of ESKD.

One final notable finding from the RREB1 study was the pleiotrophic phenotypes associated with rs9505090. This low-frequency variant was associated with T2D and non -T2D

ESKD in African Americans with a trend towards association with T2D-ESKD in African

Americans and European Americans. In the latter group it was found at very rare frequencies (3 observations). Most interestingly, this intronic variant resides in a highly regulated region of

RREB1 and may disrupt binding of upwards of 40 transcription factors. This variant was identified using RegulomeDB and provides a substantiated framework for variant selection in complex disease association studies. The portion of the genome which regulates gene expression and transcription factor binding is poorly understand, and by incorporating genetic variants in these loci, further studies may be able to further elucidate mechanisms of disease risk.

129

Clinically, RREB1 may serve as a potential drug target for the treatment of T2D and

CKD. That the variants identified in this study largely confer protection to these phenotypes, rather than risk, is remarkable as majority of previous drug discovery studies sought to identify damaging variants; this current paradigm is shifting, providing further clinical relevance to our findings (Flannick et al. 2014). With upwards of 12 splice variants for the RREB1 gene, and our poor understanding of their actions, it will be vital for in vitro and in vivo studies to follow up on the mechanisms of biological action. Identifying small molecules that increase transcription of phenotype-specific splice variants may have important implications for T2D and CKD/ EKSD treatments.

Chapter 4

In Chapter 4 we identified a number of SNPs in CFH associated with non-T2D ESKD, and to a lesser extent, T2D-ESKD in African Americans. Our results were somewhat contradictory to our hypothesis that African Americans would harbor a greater number of risk variants compared to European Americans. We did not genotype these CFH variants in

European Americans with T2D-ESKD (or non-T2D ESKD), so the following statements, at best, reflect inferences from our data. A number of the low frequency CFH variants we identified including rs3753396 and rs1065489 were strongly associated with non-T2D ESKD risk in African

Americans (p=0.0023, OR=1.62 and p=0.0036, OR=1.58, respectively) where they are found in the low frequency to common MAF range (MAF≈5-7%). These variants, however, are found at common allele frequencies in European Americans (MAF≈16%). Moreover, we identified a common protective non-T2D-ESKD variant in CFH (rs515299, OR=0.80) that is absent in

European ancestral population. It likely that African Americans are at lesser risk for complement mediated forms of glomerulonephropathy, a finding that is generally consistent with the literature (Kiryluk et al. 2012). Here our data suggests that African Americans are

130

likely at greater risk of developing the glomerulosclerosis spectrum of renal diseases and less likely to develop immune complex mediated nephropathies such as IgAN and other CFH/CFHR related renal diseases.

Limitations

There are a number of limitations to these studies. Foremost are the effects of local population stratification on the detection of rare and low frequency variants, which can lead to spurious associations when dealing with rare variants. Similarly, cryptic relatedness among our samples may compound the effects rare variant associations by invalidating the assumption that samples are unrelated. This in turn may produce associations that are more reflective of familial aggregation rather than true phenotypic associations. Finally, calling rare and low frequency variants may be especially difficult if there is not much separation between the heterozygotes and homozygotes in their MALDI-TOF data.

We have attempted to address these issues specific to rare variants through several mechanisms. We have accounted for population stratification using a panel of 70 AIMs previously identified by our group as being highly informative in determining African ancestry

(Keene et al. 2008). Controlling for local population stratification is especially difficult as natural frequencies of these variants may vary widely, even on a local level (Moore et al. 2013). The

MAF observations of the NPHS1 T233A variant in our case samples (1.2%) compared to those from Nephropath (2.4%) are reflective of this. However, our observed MAF for variants identified in these studies are generally consistent with those from other exome sequencing resources (The 1000 Genomes Projects, Exome Variant Server). We also have gene-level association data for rare and low frequency variants in NPHS1 influencing ESKD based on

SKAT the analysis. Replicating associations of RREB1 variants with T2D-ESKD (and non-T2D

ESKD) in 3 independent, ethnically diverse samples also mitigates some of the aforementioned

131

concerns. Finally, by including blind duplicates in our samples as a measure of quality control we address some of the challenges in calling rare variants through mass-spectrometry.

Another limitation to these studies is that the controls utilized in Chapters 2-4 were not the “ideal” control population. In those studied we compared T2D-ESKD cases to population- based controls lacking nephropathy and type 2 diabetes. Ideally, our control group for identifying variants associated with T2D-ESKD would have been T2D-only patients lacking nephropathy for >10years. However, this population is difficult to identify and recruit since many African Americans will develop some degree of renal decline and proteinuria over the course of their T2D. Thus, we used comparisons with non-nephropathy, non-T2D population based controls versus T2D-ESKD samples, followed by comparing the genetic variants in question to a T2D sample lacking nephropathy (to population based controls) in an effort to elucidate true phenotypic associations.

A limitation specific to Chapter 2 is the focus on one protein, nephrin, specific to the glomeruli. There is certainly a great need to investigate other biologically relevant proteins found in both the glomerulus and the tubular components of the nephron. These projects are currently being developed in the Bowden Lab. We are expanding our search for low frequency missense variants to include the endocytic proteins megalin (LRP2) and cubulin (CUBN).

Already, there are hints of promising results based on preliminary exome data from T2D-

GENES.

Future Genetic Investigations of ESKD

Incorporating systems biology and pathway analyses will be critical in order to gain further insights into the genetic architecture of T2D-ESKD and ESKD. Recognizing that proteins do not exist in their own independent, static environments, it will be necessary to investigate genetic variants in protein-protein and gene-gene interaction networks to identify additional

132

ESKD susceptibility loci. Genetic variants in the nephrin pathway are currently being interrogated in the Bowden Lab and low-frequency variants in JAM-A/F11R and CD2AP display strong potential for promising results. Investigating the regulation of these genes may also prove fruitful. It would be especially interesting, although complex, to study splice variants of

RREB1 in vitro and in vivo in an effort to further characterize attributes of splice variants and aid in the development of drug targets to exploit pleiotrophic properties.

Positing the Future of T2D -ESKD

Compound heterozygosity will be a major attribute of T2D-ESKD (and non-T2D ESKD) heritability. This will incorporate compound risk from common, low frequency, and rare variants, as in this thesis work, in addition to gene-gene and gene-environment interactions

(Divers, Palmer et al. 2013; Divers et al. 2013). The intricate protein systems involved in the homeostatic functions of the nephron, renal tissue, and beyond give rise to the plausible reality that genetic and molecular disruptions along a pathway, be it change in amino acid struct ure, altered regulation, disruption of TFBS, or viral infection of the kidney may all contribute to modulating susceptibility to T2D-ESKD and similarly complex renal diseases. For each individual extensive locus and environmental heterogeneity will likely be a hallmark of the genetic underpinnings of ESKD.

Genetic Risk Prediction of ESKD

A major limitation to present day clinical prognostication of advanced nephropathy risk is the lack of an accurate and specific test assessing a patient’s risk of developing ESKD. Not all patients with microalbuminuria will follow the canonical clinical pathway to ESKD through the intermediate phenotype of overt proteinuria; some will permanently revert to normoalbuminuria, and others will develop ESKD without developing overt proteinuria

(Perkins et al. 2010; Roshan, Stanton 2013). It has been estimated that only one third of diabetic

133

patients with microalbuminuria will develop advanced CKD, and that in half of these initially microalbuminuric patients who later developed ESKD, renal decline preceeded overt proteinuria

(Perkins et al. 2010). Elevated HbA 1c is a known risk factor DN. However, even with intensive

medical management of FBG and HbA 1c over many years, DN progression has been observed in patients independent of glycemic control (DCCT 1993). Collectively, these observations demonstrate the strong genetic influences on ESKD susceptibility.

The APOL1 G1/ G2 alleles are presently the strongest known risk alleles for non-T2D

ESKD in African Americans. That said, only 4% of those with two APOL1 risk alleles will develop kidney disease. APOL1 has been associated with DN progression but not directly associated with canonical T2D-ESKD (Parsa et al. 2013; McDonough et al. 2010; Palmer et al.

2014).

The studies of this thesis work focused on identifying genetic variations in the most severe renal disease phenotype: ESKD. This represents a significant advantage over studying dynamic quantitative traits or less severe phenotypes where a patient may not necessarily progress to ESKD. In the studies presented here we identified rare, low frequency, and common genetic variants associated with T2D-ESKD and non-T2D ESKD independent of APOL1. These include the NPHS1 variants rs35238405 (OR=2.82, all-cause ESKD), rs115489112 (OR=0.44, all- cause ESKD), and rs146400394 (OR=0.04, all-cause ESKD), RREB1 variants rs41302867 (OR=0.49, all-cause ESKD) and rs9505090 (OR=1.53, non-T2D ESKD [trend in T2D-ESKD]), CFH variants rs3753396 (OR=1.62, non-T2D ESKD), rs1065489 (OR=1.58, non-T2D ESKD), and rs1061147

(OR=0.81, all-cause ESKD), the RTN1 variant rs12434215 (OR=0.73, all-cause ESKD), and the

CRCT1 variant rs73004856 (OR=1.27-1.56, T2D-ESKD). Collectively, the association of these variants with ESKD represents marked progress in our genetic understanding of ESKD and associated phenotypes.

134

If an African American patient develops microalbuminuria it may be possible to prognosticate their risk of developing ESKD using the panel of genetic v ariants identified in this work in conjunction with the APOL1 G1/ G2 alleles. The collective burden of ESKD risk could then be assessed through the distribution of compound heterozygosity between protective and risk variants. How accurate this type of genetic test would be is uncertain and would require many years of extensive followup for validation. Therefore, it appears that the role of genetics in ESKD prognostication is in its infancy and will likely require years, if not decades, to mature to an accurate and precise test with wide-spread clinical applications. The aforementioned genetic associations with ESKD will likely aid in identifying at-risk patients, and perhaps a calculation considering genetic risk in addition to environmental risk factors (smoking, obesity) will be employed for a comprehensive ESKD risk score.

The logical question following the creation of a validated ESKD genetic framework is then: what next? Current therapeutics for nephropathy at best slow progression to ESKD, and the most recent Phase III trial for Stage 4 CKD failed rather miserably. There is some hope for treatment with the hormone relaxin, where anti-fibrotic renal effects and amelioration of CKD have been observed in rodent models of nephropathy (Yoshida et al. 2012; Sasser 2013). Whether or not these beneficial effects can be replicated in humans is not yet known but certainly warrants further investigations.

Insights from this thesis work have expanded our understanding of the pathobiological molecular mechanisms underlying ESKD, will aid in future genetic ESKD prognostication, and have identified tractable therapeutic targets warranting further research; specifically, identifying mechanisms to increase glomerular nephrin expression and upregulating beneficial

RREB1 splice variants for novel ESKD (and T2D) treatments.

135

References (Chapter 5)

1. Balasubramani A, Winstead CJ, Turner H, et al. 2014. Deletion of a conserved cis-element in the Ifng locus highlights the role of acute histone acetylation in modulating inducible gene transcription. PLoS Genet. 10(1): e1003969.

2. Blattler A, Farnham PJ. 2013. Cross-talk between site-specific transcription factors and DNA methylation states. J Biol Chem. 288(48): 34287-34294.

3. Cawley J, Maclean JC. 2012. Unfit for service: The Implications of Rising Obesity for U.S. Military Recruitment. Health Economics. 21(11): 1348-1366.

4. Charles BA, Shriner D, Torimi CN. 2014. Accounting for Linkage Disequilibrium in Association Analysis of Diverse Populations. Genetic Epidemiology. 0(0): 1-9.

5. Coventry A, Bull-Otterson LMK, Liu X, Maxwell TJ, et al. 2010. Deep resequencing reveals excess rare recent variants consistent with explosive population growth. Nat Commun. 1(131): 1-6.

6. DCCT 1993. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin-dependent diabetes mellitus. The Diabetes Control and Complications Trial Research Group. N Engl J Med. 329(14): 977-86.

7. Divers J, Nũnez M, High KP, Murea M, et al. 2013. JC polyoma virus interacts with APOL1 in African Americans with nondiabetic kidney disease. Kidney Int. 84(6): 1207-13.

8. Divers J, Palmer ND, Lu L, Langefeld DC, et al. Gene-gene interactions in APOL1-associated nephropathy. Nephrol Dial Transplant. Epub ahead of print / doi: 10.1093/ndt/gft423

9. Doublier S, Salvidio G, Lupia E, et al. 2003. Nephrin expression is reduced in human diabetic nephropathy: evidence for a distinct role for glycated albumin and angiotensin II. Diabetes. 52(4): 1023-1030.

10. Flannick J, Thorliefsson G, Beer NL, Jacobs SBR, et al. 2014. Loss-of-function mutations in SLC30A8 protect against type 2 diabetes. Nat Genet. Epud ahead of print (3/2/2014). doi: 10.1038/ng.2915

11. Freedman BI. 2002. End-stage renal failure in African Americans: insights in kidney disease susceptibility. Nephrol Dial Transplant. 17(2): 198-200.

12. Genovese G, Friedman DJ, Ross MD et al. 2010. Association of trypanolytic ApoL1 variants with kidney disease in African Americans. Science 329: 841-845.

13. Keene KL, Mychaleckyj JC, Leak TS, Smith SG, et al. 2008. Exploration of the utility of ancestry informative markers for genetic association studies of African Americans with type 2 diabetes and end stage renal disease. Human Genetics. 124(2: 147-154.

14. Kiryluk K, Li Y, Sanna-Cherchi S, et al. 2012. Geographic differences in genetic susceptibility to IgA nephropathy: GWAS replication study and geospatial risk analysis. PLoS Genet. 8(6): e1002765

136

15. Kramer H, Tuttle K, Leehey D, et al. 2009. Obesity management in adults with CKD. Am Journ Kid Dis. 53(1): 151-165.

16. Lanham RG, Kelly DJ, Cox AJ, Thomson NM, et al. 2002. Proteinuria and the expression of the podocyte slit diaphragm protein, nephrin, in diabetic nephropathy: effects of angiotensin converting enzyme inhibition. Diabetologia. 45(11): 1572-1576.

17. Moore CB, Wallace JR, Wolfe DJ, Frase AT, et al. 2013. Low frequency variants, collapsed based on biological knowledge, uncover complexity of population stratification in 1000 genomes project data. PLoS Genet. 9(12): e1003959.

18. Nelson MR, Wegmann D, Ehm MG, Kessner D, et al. 2012. An Abundance of Rare Functional Variants in 202 Drug Target Genes Sequenced in 14,002 People. Science. 337(6090): 100-104.

19. Palmer ND, Ng MCY, Hicks PJ, Mudgal P, et al. 2014. Evaluation of Candidate Nephropathy Susceptibility Genes in a Genome-Wide Association Study of African American Diabetic Kidney Disease. PLoS One. 9(2): e88273.

20. Parsa A, Kao WHL, Xie D, Astor BC, et al. 2013. APOL1 Risk Variants, Race, and Progression of Chronic Kidney Disease. N Eng J Med. 369(23): 2183-2196.

21. Pelchat ML. 2009. Food Addiction in Humans. J Nutr. 139: 620-622.

22. Perkins BA, Ficociello LH, Roshan B, et al. 2010. In patients with type 1 diabetes and new-onset microalbuminuria the development of advanced chronic kidney disease may not require progression to proteinuria. Kidney Int. 77(1): 57-64.

23. Roshan B, Stanton RC. 2013. A story of microalbuminuria and diabetic nephropathy. J Nephropathol. 2(4): 234-240.

24. Sankararaman S, Mallick S, Dannemann M, et al. 2014. The genomic landscape of Neaderthal ancestry in present-day humans. Nature. Epub ahead of print. doi: 10.1038/ nature12961.

25. Sasser JM. 2013. The emerging role of relaxin as a novel therapeutic pathway in the treatment of chronic kidney disease. Am J Physiol Regul Integr Comp Physiol. 305(6): 559- 565.

26. Tennessen JA, Bigham AW, O’Connor TD, Fu W, et al. Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes. Science. 337(6090): 64-69.

27. USRDS 2013. U.S. Renal Data System, USRDS 2013 Annual Data Report: Atlas of Chronic Kidney Disease and End-Stage Renal Disease in the United States, National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, 2013.

28. van Vilet-Ostaphtchouk JV, Shrir-Sverdlov R, Zhernakova A, et al. 2007. Association of variants of transcription factor 7-like 2 (TCF7L2) with susceptibility to type 2 diabetes in the Dutch Breda cohort. Diabetologia. 50(1): 59-62.

137

29. Yoshida T, Kumagi H, Suzuki A, et al. 2012. Relaxin ameliorates salt-sensitive hypertension and renal fibrosis. Nephrol Dial Transplant. 27(6): 2190-2197.

138

Appendix 1

Creation and Evaluation of an African American T2D-ESKD Genetic Risk Score Based on

Common Genetics Variants in the Notch Signalling Pathw ay

All analyses were performed by Jason A. Bonomo. Dr. Ng and Poorva Mugdal pulled the raw data from the previously published African American T2D-ESKD GWAS (McDonough et al.

2011), and Dr. Allred helped with this. Dr. Ng also helped develop the methodology behind the

GRS. Donald W. Bowden, Ph.D., served in an advisory capacity.

139

Introduction

The Notch pathway is an evolutionarily conserved signaling pathway involved in the determination of cell fate during embryogenesis (Bonegio, Susztak 2012). In the kidneys , the

Notch pathway is crucial for proper development of the proximal tubule cells and also modulates the number of nephrons in the kidney (Bonegio et al. 2011; Cheng et al. 2007). Recent studies have demonstrated that in primary podocytes, HEK-293 cells, and kidneys from diabetic mice, high glucose concentrations (as in T2D) reduces nephrin expression and increases podocyte apoptosis; introduction of shRNA targeting Notch1 ameliorated these effects (Lin et al.

2010). Moreover, Notch signalling has been shown to be re-activated in glomerular disease, and

Notch1 expression in mature podocytes results in proteinuria in animal models (Bonegioa,

Susztak 2012; Niranjan T et al. 2008).

With ample biological data supporting a role for Notch pathway in diabetic kidney disease we sought to develop a genetic risk score (GRS) based on genetic variants in genes involved in this pathway. Data from a previously publish African American T2D-ESKD GWAS was used to create the GRS (McDonough et al. 2011).

Methods

A comprehensive Notch signalling pathway was constructed utilizing the protein - protein interaction (PPI) databases includin g Reactome

(http:/ / www.reactome.org/ PathwayBrowser), KEGG

(http:/ / www.genome.jp/ kegg/ pathway.html), String-DB (http:/ / string-db.org/ ), and

PubMed searchers (Figure 1). Data from the African American T2D-ESKD GWAS was then mined for the genes included in Figure 1, and all SNPs with p<0.05 for T2D-ESKD association were included in the analyses (N=1037 controls and N=971 cases). Quality control included removing all variants with a MAF of < 5% and pruning variants which were in LD (PLINK run

140

on R 3.01). Statistical epistatic interactions were assessed using PLINK and all GRS analyses were run on SAS 9.1 (SAS, Cary, NC, USA).

Results

The genes included in the Notch pathw ay analysis are shown in Figure 1A which also describes their interactions. In total, 89 SNPs in this set of genes were associated with T2D -

ESKD (p < 0.05). Figure 1B details which elements of the pathway had the most associated genetic variants. The MAML gene family had the most associated variants at 21, followed by

TGFβ (10), and Notch, SMAD, EP300, and NFKB each with between 5-9 variants. The remaining

15 genes had between 1 and 4 associated SNPs.

First, a genetic risk score was created using all 89 SNPs from the GWAS data under an additive model (Figure 2). The mean number of risk variants in controls was 51.36, whereas it was 51.73 in cases with p=0.012. In an effort to improve the model epistatic interactions of these

89 SNPs were assessed using PLINK (Table 1) and all variants. 23 variants demonstrated evidence of statistical epistasis (p=0.009-0.0003 for interaction). In this epistatic GRS the mean score for cases was 22.05 and 21.74 for controls with p=9x10-4 (Figure 3).

Conclusion

This was a proof of concept study in which we used a priori biological evidence to create a genetic risk score from a previously published GWAS. After construction of the pathway and creating of the GRS, cases indeed had a higher allelic burden compared to controls; epistasis improved the results, increasing evidence of association three-fold. While these results have no clinical implications, one of the more interesting findings was the high concentration of genetic variants in the MAML gene family (N=21 SNPs). This could be an interesting observation to follow up on with fine mapping and investigating low frequency and rare variants in this gene.

141

Limitations to this study include independence: the results were not replicated.

Additionally, there is inherent bias as the analyses were built off of variants which were already associated with T2D-ESKD in African Americans. Therefore, one would expect to observe difference between cases and controls.

142

Figure 1A. Genes used to create the Notch signalling pathway.

Figure 1B. Number of SNPs per gene associated with T2D-ESKD.

143

Figure 2. Notch Pathway T2D-ESKD GRS with all 89 associated SNPs

Table 1. Top Epistatic Interactions in the Notch Signaling Pathway

144

Figure 3. Epistatic Notch Pathway GRS

145

References (Appendix 1)

1. Bonegio R, Susztak K. 2012. Notch signaling in d iabetic nephropathy. Experimental Cell Research. 318: 986-992.

2. Bonegio R, Beck LH, Kahlon RK, Lu W, Salant DJ. 2011. The fate of notch -deficient nephrogenic progenitor cells during metanephric kidney development. Kidney International. 79: 1099-1112.

3. Cheng HT, Kim M, Valerius MT, Surendran K, et al. 2007. Notch2, but not Notch1, is required for proximal fate acquisition in the mammalian nephron. Development. 134: 801- 811.

4. McDonough CW, Palmer ND, Hicks PJ, Roh BH, et al. 2011. A Genome Wide Association Study for Diabetic Nephropathy Genes in African Americans. Kidney Int. 79(5): 563-572.

5. Lin CL, Wang FS, Hsu YC, Chen CN, Tseng MJ, et al. 2010. Modulation of Notch-1 Signalling Alleviates Vascular Endothelial Growth Factor-Mediated Diabetic Nephropathy. Diabetes. 59: 1915-1925.

6. Thiruvur T, Bielesz B, Gruenwald A, Ponda M, Kopp JB, et al. 2008. The Notch pathway in podocytes plays a role in the development of glomerular disease. Nature Medicine. 14(3): 290-298.

146

Appendix 2

RTN1 is a Novel Risk Gene for Kidney Disease This data was included in a manuscript was submitted to Nature Medicine in February of 2014. The author list is as follows:

Ying Fan 1,2, Jason A. Bonomo 3, Wenzhen Xiao1, 2, Peter Chuang 1, Zhengzhe Li1, Belinda Jim 4, Sandeep Mallipattu 1, Weijia Zhang 1, Chengguo Wei1, Nicholette D. Palmer 3, Niesong Wang 2, Weiping Jia2, Donald W. Bowden 3, Barry I. Freedman 3, and John Cijiang He1

1Department of Medicine, Mount Sinai School of Medicine, NY; 2Department of Nephrology, Shanghai Six Municipal Hospital, Shan ghai Jiaotong University School of Medicine, China; 3Section on Nephrology and Center for Genomics and Personalized Medicine Research, Wake Forest School of Medicine, Winston -Salem, NC; 4Division of Nephrology, Jacobi Medical Center, NY.

Corresponding Author: John Cijiang He, MD, Division of Nephrology, Mount Sinai School of Medicine, NY, NY.

Preface:

This project was a collaborative effort between Dr. John He’s laboratory at Mount Sinai School of Medicine (MSSN) and the Center for Human Genomics and Personalized Medicine Research at Wake Forest School of Medicine. Dr. He’s group identified the RTN1 gene as being a candidate ESKD gene through comparison of kidney transcriptomes from HIV transgenic mice (Tg26) – a model for HIVAN – to wild type mice. After extensive biological investigation into the mechanisms whereby RTN1 contributes to ESKD in vivo and in vitro, our laboratory group (Dr. Bowden, Dr. Freedman, Dr. Allred, and JAB) pursued genetic links between RTN1 and ESKD in African Americans. Ying Fan, PhD, of MSSN and colleagues largely prepared the manuscript by writing the introduction, methods, results, and discussion; an abbreviated version of the introduction, results, methods, and discussion submitted to Nature Medicine are included in this appendix to reflect the work down at WFSOM. GWAS data (McDonough et al. 2011) was pulled by Dr. Nicholette P. Allred. Targeted genotyping and statistical analysis of RTN1 variants was performed by JAB. JAB, Dr. Allred, Dr. Freedman, and Dr. Bowden prepared appropriate sections of the manuscript and edited the final version. Dr. Freedman recruited the patients involved in this study. Biological data from MSSN is not included in this appendix; it is limited to work performed at WFSOM.

147

Abstract:

Gene expression profiling of kidneys from the murine model of HIV-associated nephropathy

(HIVAN) identified an association between the expression of an endoplasmic reticulum (ER)- associated protein reticulon-1, RTN1, and the severity of kidney disease. Upregulation of Rtn1 was confirmed in diseased kidneys of murine models of HIVAN, diabetic nephropathy (DN), and renal fibrosis. RTN1 mRNA levels were also increased in kidneys of patients with DN and correlated inversely with estimated glomerular filtration rate (eGFR). Of the three known RTN1 isoforms, only Rtn1-a protein expression was increased in kidneys of mice and humans with

HIVAN and DN. Protein expression of Rtn1-a in the kidneys also correlated inversely with eGFR in patients with DN. In HK2 cells, RTN1 overexpression induced ER stress/ apoptosis whereas RTN1 knockdown attenuated tunicamycin-, hyperglycemia-, and albumin-induced ER stress/ apoptosis. The role of RTN1 in ER stress was also confirmed in cultured podocytes. In vivo, knockdown of RTN1 expression attenuated renal fibrosis in mice with unilateral ureteral obstruction and proteinuria, glomerular hypertrophy, and mesangial expansion in STZ -induced diabetic mice, which were associated with suppression of ER stress markers. In addition, thr ee single nucleotide polymorphisms of RTN1 in African Americans that significantly associated with type 2 diabetes (T2D)-associated end-stage renal disease (ESRD), non-diabetic ESRD, and all-cause ESRD, but not T2D without nephropathy were identified. Taken together, these data suggest that RTN1 is a novel risk gene for kidney disease and may contribute to kidney injury through ER stress.

148

Introduction:

Chronic kidney disease (CKD) affects approximately 10% of US adults. The incidence and prevalence of this condition are increasing worldwide (Coresh et al. 2007) and more than

500,000 US adults currently suffer from the most severe form of CKD, end -stage renal disease

(ESRD). Therapeutic options for CKD are limited and at best only offer partial protection against disease progression (Lewis et al. 1993; Brenner et al. 2001; de Zeeuw 2011). In this study, we profiled the renal gene expression of a murine model of HIV-associated nephropathy

(HIVAN). Genes where the level of expression correlated with severity of renal injury were identified. Renal expression of RTN1, which encodes for an endoplasmic reticulum (ER)- associated protein called reticulon 1, was associated with the severity of renal injury. The reticulon family has four members: RTN1, RTN2, RTN3 and RTN4. Reticulons were first described in neuroendocrine cells. They localize primarily to the ER membrane as ER-shaping proteins. The human reticulon 1 gene (RTN1) is expressed predominantly in neuroendocrine tissues and has three transcript variants that encode for three RTN1 isoforms: RTN1-A, RTN1-B, and RTN1-C. The function of Rtn1-a, however, is not well characterized. In the current study, we characterized the expression, function, and genetic associations of RTN1 in kidney disease.

Results:

Identification of RTN1 as a highly upregulated gene in CKD

To identify genes that could contribute to development of CKD, we examined gene expression in kidneys of mice with varying extents of renal injury. Tg26 mice, a CKD model of HIVAN, develop significant proteinuria at four weeks of age. Most Tg26 mice do not survive past 3 to 6 months of age due to rapidly progressive renal failure. However, some Tg26 mice have milder kidney injury with stable renal function. The kidney histology of Tg26 mice is characterized by collapsing focal segmental glomerulosclerosis (FSGS) and significant tubular interstitial

149

inflammation and fibrosis with tubular atrophy and dilatation. The kidney transcriptomes of

Tg26 mice with mild and severe kidney disease, based on histologic scoring, were compared to gender-matched wild-type (WT) littermates that exhibited no renal

Association of RTN1 single nucleotide polymorphisms with ESRD in humans

Genetic variants in RTN1 were tested for association w ith ESRD due to type 2 diabetes

(T2D-ESRD) in a cohort of 965 African Americans (AAs) with T2D-ESRD and 1,029 AA population-based controls (McDonough et al. 2011). Using a genome-wide association approach, seven nominally associated intronic variants in RTN1 were identified (p<0.05 among 219 single nucleotide polymorphisms [SNPs] representing 53 independent tests). These variants underwent replication testing in an additional 1,312 unrelated AA T2D-ESRD cases and 774 AA population-based controls. Table 1 contains demographic characteristics. Replication was observed with three correlated variants (p<0.05, r 2>0.82; Table 2). Combined analysis revealed consistent and significant association among these variants with T2D-ESRD in AAs (p=0.015–

3.0x10-4; age, gender, APOL1 G1/ G2 allele status, and ancestry adjusted). Odds ratios (ORs) were protective (0.67–0.77) across studies and similar minor allele frequencies (MAFs) were observed. Association was observed for one of these SNPs (rs12434215) in 604 European -

American (EA) T2D-ESRD cases compared to 1,030 EA population-based controls (p=0.019,

OR=0.69 [0.52, 0.94]) following adjustment for age and gender (Table 2). The minor allele in

AAs corresponds to the major allele in EAs; the same reference allele was used in both analyses.

To determine whether association was limited to T2D-ESRD or included other etiologies of ESRD, the three RTN1 variants were investigated in a cohort of 1,523 AAs with non -diabetic

ESRD (nonT2D–ESRD, attributed to chronic glomerulonephritis or hypertension) relative to

1,803 combined AA population-based controls (Table 2). SNPs rs12431381and rs12434215 were associated with non-diabetic ESRD (p=0.014–0.015; age, gender, APOL1 G1/ G2 allele status,

150

and ancestry adjusted). ORs remained protective (0.77) and similar MAFs were observed.

Analysis of the combined 3,800 AA ESRD cases (diabetic and non -diabetic ESRD) and 1,803 combined AA population-based controls was performed to investigate the association of the three RTN1 SNPs with all-cause ESRD. Both rs12434215 (p=6.7x10-4) and rs12431381 (p=7.5x10-4) were significantly associated with all-cause ESRD in a dominant model adjusted for age, gender, APOL1 G1/ G2 status, and ancestry. The ORs and MAFs for these two variants were consistent across studies (Table 2). SNP rs1952034 was not associated with all-cause ESRD.

To assess potential association with T2D, a trait discrimination analysis was performed in 555 AA T2D cases without nephropathy and 1,803 combined AA population -based controls

(Table 3). Nominal association was seen only with rs12434215 (p=0.032) and no association was observed with rs12431381 and rs1952034 (Table 3), suggesting minimal association with T2D and stronger association with ESRD. Analyses contrasting 625 EA T2D cases with 1,030 EA population-based controls revealed nominal association with rs12434215 (p=0.048) and no association with the other RTN1 SNPs (Table 3).

Discussion:

Despite optimal medical therapy, many patients with CKD progress to ESRD (de Zeeuw

2011). Therefore, it is critical to identify the underlying mechanisms mediating progression of kidney disease. In the current study, we used the Tg26 model as an experimental model of CKD to identify candidate genes likely to be important for the development and progression o f nephropathy. Since Tg26 mice exhibit variable severity of renal phenotypes ranging from rapid progression to renal failure (over 2-3 months) to mild disease with stable renal function for more than 6 months, it is an ideal system for studying factors tha t dictate kidney disease progression. We began by identifying genes that were upregulated in kidneys with more severe kidney disease. To ensure candidate genes identified from the Tg26 model were relevant to

151

human kidney disease we examined the expression of these genes in human diseased kidneys by taking advantage of the publically available datasets from Nephromine.org. Interestingly, we found that RTN1A expression was also correlated with the progression of human DN. We were able to validate the expression of RTN1A in several animal models of kidney disease indicating that RTN1A could play a more general role in the progression of kidney diseases and was not restricted to DN or HIVAN.

RTN1A is expressed predominantly in the tubulointerstitial compartment. However, glomerular expression of RTN1A was also present, suggesting that RTN1A may play a role in both glomerular and tubular cell injury. Consistent with the transcript omic data from patients with DN, we confirmed that RTN1A protein expression correlated with kidney function in an independent cohort of patients with DN. These data strongly support a role for RTN1A in the progression of human DN. In addition, human genetic studies also revealed that three RTN1

SNPs were significantly associated with T2D-ESRD, non-diabetic ESRD, and all cause-ESRD in

AAs. Interestingly, association was also observed in EAs with T2D-ESRD for one of the RTN1

SNPs. Variation in RTN1 may account for a portion of the elevated risk of progressive kidney disease observed in AAs or differences in the haplotype block structure could have contributed to this result. All three associated SNPs were intronic and we note that power was lost after full covariate adjustment due to the decreased number of samples with complete clinical data.

Future studies addressing whether these SNPs are associated with increased RTN1A expression in patients with kidney disease are needed.

In conclusion, we performed a genetic-based association study framed by a mechanistic approach to demonstrate that RTN1A gene variants were associated with ESRD in AAs, and to a lesser extent in EAs. We also provided a plausible molecular mechanism to delineate the role of

RTN1A in DN. RTN1A expression is highly upregulated in HIVAN an d DN. RTN1A is a key

152

molecule mediating hyperglycemia and protein overload -induced ER stress and apoptosis of kidney cells. Therefore, it is likely that increased expression of RTN1A in kidney cells contributes to the progression of kidney disease. In addition, RTN1Agene variants are associated with ESRD in patients of recent African ancestry. Genetic and expression variability in RTN1A could potentially explain why patients of recent African descent are more susceptible to DN than are those of European ancestry.

Methods:

Human genetic study subjects: Recruitment and sample collection procedures have been described (McDonough et al. 2011; Cooke et al. 2012). The study was approved by the

Institutional Review Board at Wake Forest School of Medicine. All participants were unrelated, born in North Carolina, South Carolina, Georgia, Tennessee, or Virginia and provided written informed consent. Ethnicity was self-reported. DNA extraction was performed using the

PureGene system (Gentra Systems, Minneapolis, MN, USA). AAs with T2D-ESRD were recruited from dialysis facilities in the southeastern US. T2D was diagnosed in those developing diabetes after age 25 years, without historical evidence of diabetic ketoacidosis or receiving solely insulin since diagnosis. T2D-ESRD was diagnosed after 5 year T2D duration prior to renal replacement therapy and diabetic retinopathy or ≥100mg/ dl (>1+) proteinuria on urinalysis (when available) in the absence of other causes of nephropathy. Unrelated AA population-based controls without a current diagnosis of diabetes or kidney disease (serum creatinine ≤1.5 or <1.3 mg/ dl [women]) were recruited from the community and internal medicine clinics. AA non-diabetic ESRD cases lacked T2D when initiating renal replacement therapy. ESRD was typically attributed to chronic glomerular disease (e.g., focal segmental glomerulosclerosis), HIVAN, hypertension, or unknown causes. Patients with ESRD due to urologic/ surgical cause, polycystic kidney disease or IgA Nephropathy were excluded.

153

Additional replication non-diabetic, non-nephropathy controls were recruited, as above. AAs with T2D were recruited from the African American-Diabetes Heart Study (AA-DHS), as well as internal medicine clinics. These controls were receiving insulin or oral agents or had an

HbA1C>6.5% or fasting plasma glucose >126 mg/ dl with serum creatinine concentrations <1.5 mg/ dl (men) or <1.3 mg/ dl (women). Healthy EA controls were recruited from community and internal medicine clinics. EA T2D-ESRD cases were recruited from the same regional dialysis centers and diagnosed with T2D as in AAs (see above), except they had diabetes onset after age

30. EA T2D non-nephropathy controls were recruited from the Diabetes Heart Study (DHS) and

internal medicine clinics and were receiving insulin or oral agents or had a HbA1 C>6.5% or fasting plasma glucose >126 mg/ dl with serum creatinine concentrations <1.5 mg/ dl (men) or

1.3 mg/ dl (women).

Human genetic study genotyping and quality control: RTN1 SNPs of interest were identifying utilizing data from a previously published AA T2D-ESRD GWAS (McDonough et al. 2011).

SNPs were selected based on their location within the RTN1 gene region (10kb proximal flanking sequence and 790kb distal flanking sequence were included) and p<0.05 association with T2D-ESRD; 66 SNPs were directly genotyped and 153 were imputed. SNPs were imputed using HumanHap36 and had good quality scores (r 2>0.83). Replication genotyping of RTN1

SNPs was performed across all cohorts utilizing the Sequenom MassArray system (Sequenom,

San Diego, CA) in the Center for Genomics at Wake Forest School of Medicine (Winston Salem,

NC). SNPs were PCR-amplified using primers designed in Assay Design 3.1 (Sequenom, San

Diego, CA) and genotypes were analyzed using the iPlex Sequenom Typer Analyzer program

(Sequenom). Call rates >97% were achieved and quality control was ensured using blind duplicates within each cohort of samples.

154

Human genetic study statistical analysis: Each SNP was tested for departure from Hardy-

Weinberg Equilibrium (HWE) expectations through a chi square goodness of fit test. The overall genotypic test of association and the three a priori genetic models (dominant, additive, and recessive) were computed to test for association between each SNP and each phenotype. Tests for association were adjusted for age, gender, ancestry (AA samples), and apolipoprotein L1

(APOL1) G1/ G2 risk allele status (AA samples) (51). These tests were computed using

SNPGWA (http:/ / www.phs.wfubmc.edu/ public_bios/ sec_gene/ downloads.cfm). Large sample test distribution and permutation methods were used to estimate statistical significance.

Acknowledgement:

JCH is supported by NIH 1R01DK078897, VA Merit Award and NIH 1R01DK08854; JCH and

PEK are supported by NIHP01-DK-56492; PYC is supported by NIH 5K08DK082760. JAB is supported by NIH grant F30 DK098836. BIF is sup ported by NIH grants R01 HL56266, RO1

DK070941 and RO1 DK084149. DWB is supported by NIH grant RO1 DK053591.

Competing interest statement:

The authors declare that they have no competing financial interests.

155

Table 1. Clinical Characteristics of Genetic Study Samples

GWAS AA Replication Replication AA GWAS AA Population- AA Population- Replication AA AA T2D, non- EA Population- EA T2D, non- T2D-ESRD based Controls T2D-ESRD based Controls Non T2D-ESRD Nephropathy EA T2D-ESRD based Controls Nephropathy Variable Cases (965) (1029) Cases (1312) (774) Cases (1523) Controls (555) Cases (604) (1030) Controls (625) Female (%) 61.2 57.3 60.7 57.9 43.7 57.1 49.5 63.8 54.9 Age (years) 61.6 ± 10.5 49.0 ± 11.9 61.3 ± 10.8 48.4 ± 12.7 54.6 ± 14.6 55.6 ± 9.7 65.3 ± 10.4 53.9 ± 15.1 63.1 ± 9.1 Age at onset T2D (years) 41.6 ± 12.4 - 41.3 ± 12.4 - - 45.6 ± 10.0 45.4 ± 13.6 - 51.9 ± 9.2 Duration of ESRD (years) 3.65 ± 4.2 - 3.66 ± 3.9 - 2.2 ± 1.64 - 2.6 ± 3.0 - - Glucose (mg/dl) - 88.8 ± 13.1 - 89.2 ± 13.6 88.6 ± 8.66 - - 93.1 ± 16.9 148.2 ± 55.7 BMI (at recruitment; kg/m2) 29.7 ± 7.0 30.0 ± 7.0 30.3 ± 7.2 29.2 ± 7.4 27.2 ± 6.98 36.3 ± 17.4 29.6 ± 7.1 28.4 ± 5.7 32.8 ± 7.1 Serum Creatinine (mg/dl) - 0.99 ± 0.25 - 1.03 ± 0.46 - 0.97 ± 0.28 - 0.96 ± 0.5 1.03 ± 0.26 African Ancestry (%) 79.9 78.2 79.9 78.3 80 76.9 - - - Continuous data presented as mean ± SD

156

Table 2. ESRD Analysis Covariates: age, gender, admixture (AAs), APOL1 (AAs) SNP N Cases* MAF Cases N Controls* MAF Control s P OR CI add GWAS T2D-ESRD vs. rs1952034 831 0.060 735 0.081 0.088 0.77 0.54, 1.06 population-based rs12431381 936 0.083 861 0.12 0.017 dom 0.72 0.55, 0.94 controls dom

rs12434215 922 0.066 846 0.10 0.0077 0.67 0.50, 0.90 rs1952034 1219 0.097 622 0.12 0.014 dom 0.7 0.53, 0.93 Replication T2D-ESRD add vs. population-based rs12431381 1217 0.087 628 0.11 0.024 0.76 0.59, 0.96 controls rs12434215 1216 0.064 624 0.087 0.014 dom 0.66 0.48, 0.92 dom Combined T2D-ESRD rs1952034 2050 0.081 1357 0.10 0.015 0.77 0.63, 0.95 vs. all population rs12431381 2153 0.086 1489 0.12 0.0039 dom 0.75 0.62, 0.91 based controls rs12434215 2138 0.065 1470 0.095 3.02x10-4 dom 0.67 0.54, 0.83 rs1952034 1459 0.095 1357 0.10 0.85 add 0.95 0.79, 1.21 non-DM ESRD vs. all dom population based rs12431381 1441 0.089 1489 0.12 0.015 0.77 0.77, 0.95

controls rs12434215 1452 0.072 1470 0.095 0.014 add 0.77 0.62, 0.95 AfricanAmerican Samples rs1952034 3509 0.087 1357 0.10 0.12 add 0.88 0.74, 1.03 all-cause ESRD vs. all -4 dom population based rs12431381 3594 0.087 1489 0.12 7.53x10 0.75 0.63, 0.89 controls rs12434215 3590 0.068 1470 0.095 6.70x10-4 dom 0.73 0.61, 0.87 rs1952034 556 0.54 752 0.55 0.29 add 0.92 0.78, 1.08 European T2D-ESRD vs. dom American population based rs12431381 547 0.57 746 0.57 0.09 0.78 0.59, 1.04 Samples controls rs12434215 557 0.58 753 0.58 0.019 dom 0.69 0.52, 0.94 *Sample size reflects those with complete clinical and genotypic data.

157

Table 3. Trait Discrimination

Covariates: age, gender, admixture (AAs) N MAF N MAF SNP Cases* Cases Controls* Controls P OR CI add African American T2D- rs1952034 496 0.089 1608 0.10 0.29 0.87 0.66, 1.13 only (GFR > 60 ml/min) vs rs12431381 497 0.098 1765 0.11 0.26add 0.86 0.67, 1.11 population-based controls rs12434215 495 0.072 1742 0.095 0.032add 0.73 0.55, 0.97 add European American T2D- rs1952034 620 0.53 752 0.55 0.40 0.93 0.79, 1.10 only (GFR > 60 ml/min) vs rs12431381 619 0.54 746 0.57 0.12add 0.88 0.75, 1.03 population-based controls rs12434215 620 0.55 753 0.58 0.048add 0.85 0.72, 1.00 *Sample size reflects those with complete clinical and genotypic data.

158

Figure 1. LD Block (D’) of 219 SNP RTN1 Gene region utilized in Discovery Study

159

References:

1. Brenner, B.M., Cooper, M.E., de Zeeuw, D., Keane, W.F., Mitch, W.E., Parving, H.H., Remuzzi, G., Snapinn, S.M., Zhang, Z., and Shahinfar, S. 2001. Effects of losartan on renal and cardiovascular outcomes in patients with type 2 diabetes and nephropathy. N Engl J Med 345:861-869.

2. Cooke, J.N., Bostrom, M.A., Hicks, P.J., Ng, M.C., Hellwege, J.N., Comeau, M.E., Divers, J., Langefeld, C.D., Freedman, B.I., and Bowden, D.W. 2012. Polymorphisms in MYH9 are associated with diabetic nephropathy in European Am ericans. Nephrol Dial Transplant 27:1505-1511.

3. Coresh, J., Selvin, E., Stevens, L.A., Manzi, J., Kusek, J.W., Eggers, P., Van Lente, F., and Levey, A.S. 2007. Prevalence of chronic kidney disease in the United States. JAMA 298:2038-2047.

4. de Zeeuw, D. 2011. Unmet need in renal protection--do we need a more comprehensive approach? Contrib Nephrol 171:157-160. 5. Lewis, E.J., Hunsicker, L.G., Bain, R.P., and Rohde, R.D. 1993. The effect of angiotensin - converting-enzyme inhibition on diabetic nephropathy. The Collaborative Study Group. N Engl J Med 329:1456-1462.

6. McDonough, C.W., Palmer, N.D., Hicks, P.J., Roh, B.H., An, S.S., Cooke, J.N., Hester, J.M., Wing, M.R., Bostrom, M.A., Rudock, M.E., et al. 2011. A genome-wide association study for diabetic nephropathy genes in African Americans. Kidney Int 79:563-572.

160

Appendix 3

Genetic variants in NPHS1 are associated w ith T2D-ESKD in European Americans

JAB designed the experiments, performed genotyping, and the statistical analyses. Donald W.

Bowden served in an advisory role for this project.

After implicating missense variants with T2D-ESKD and non-T2D ESKD in African

Americans (see “Chapter 2”) we sought to determine whether functional variants also influence susceptibility to T2D-ESKD in a population of European Americans. Using the 1000 Genomes

Project and Exome Variant Server data of missense variants (and putatively functional intronic variants) in European-derived populations was examined. Selection criteria included MAF

(considering power restrictions), PolyPhen2 prediction, and Regulome DB score for intronic variants. Table 1. European American NPHS1 variants genotyped in T2D- ESKD A total of 11 NPHS1 SNP Missense RegDB Notes variants were successfully rs3814995 E117K - - rs33950747 R408Q - - genotyped; rs34320609 failed rs34320609 L392P - Failed Genotyping QC rs34982899 P264R - - quality control metrics (Table 1). rs73928317 - 3b - rs113825926 T294I - - In a discovery cohort of 630 rs114896482 R800C - European Americans with rs116459838 - 3a - rs138173172 A916S - - T2D-ESKD and 911 European rs145125791 N188I - - rs150038620 - splice site - American population based controls we identified two significant association s in an unadjusted model (Table 2). The first was with rs73928317 (p=0.025, OR=5.05), a rare intronic variant found in a repressed region immediately proximal to the first exon of NPHS1 (UCSC, data not shown). The second variant

161

associated with T2D-ESKD in European Americans was the low frequency missense variant

rs145125791 which results in a protective asparagine to isoleucine mutation at position

188 (of 1241) in nephrin (p=0.013, OR=0.57).

Table 2. European American NPHS1 missense variant associations with T2D-ESKD (no covariates)

AA Δ or SNP N Case / Control MAF Case / Control Model P OR RegDB rs73928317 608 / 870 0.0050 / 0.0006 Dom. 0.025 5.05 [1.1, 24.4] 3b rs145125791 618 / 890 0.018 / 0.035 Add. 0.013 0.57 [0.37, 0.90] N188I

A locus-wide analysis was next performed using SKAT. In an unadjusted, unweighted

model with the 11 successfully genotype NPHS1 variants a P value of 0.086 was observed (data

not shown). In an unadjusted analysis weighted for rare variants (c[1,25]), P=0.058 was

observed, suggesting a trend for association. SKAT lacks power at small sample sizes, as in this

study, so it is not surprising we did not observe greater statistical significance. It’s interesting

that rs73928317 was associated with risk in European Americans, as this variant is found at

common MAF in African Americans, providing further genetic insight into the disproportionate

burden of T2D-ESKD borne by African Americans. There are several limitations to this study.

They include small sample size and lack of a confirmatory replication study – necessary for all

genetic studies.

162

Appendix 4

A CRCT1 missense variant is associated w ith T2D-ESKD in African Americans

Dr. Maggie CY Ng provided the raw T2D-GENES data which JJ Xu then analyzed to create “top

100” hits for T2D-ESKD association using several models. JAB then selected variants from these

“top 100” hits for follow -up targeted genotyping and analysis. DWB served in an advisory capacity.

The CRT1 gene codes for a cysteine-rich C-terminal associated protein, which is also associated with a class of RNAs termed “long non-coding” RNA molecules (lncRNA). The functions and clinical relevance of this gene are poorly understood . A CRCT1-MAML2 fusion oncogene has been associated with Warthin’s tumors (tumors of the salivary glands), and is thought to be a protein component of the dermal layer of late cornified envelope proteins (O’Niel 2009; Rinnerthaler RM et al.

2013). In human kidneys CRCT1 is expressed in the proximal and distal convoluted tubules, and to a lesser extent in the glomerulus, but is not expressed the loop of Henle, renal pelvis, or papillary tips (Nephromine 2014; Figure 1).

Next generation exome sequencing Figure 1: CRCT1 Expression in Healthy Kidneys performed in conjuction with the T2D-GENES consortium identified a missense variant,

163

rs73004856 (G20S), in CRCT1 which was associated with T2D-ESKD in African Americans (529 cases and 535 population-based controls) after principle component adjustment under an additive model (p=0.00026; OR=1.56). The MAF of the G20S in the T2D-GENES discovery study was 15%.

The G20S variant was then genotyped in a replication cohort composed of 1260 independent T2D-ESKD cases and 760 population based controls. A MAF of 15.9% and 13.8% was observed in cases and controls, respectively, with p=0.030 and OR=1.27 [1.02, 1.59] following adjustment for age, gender, admixture, and APOL1 G1/ G2 allele status. The G20S variant was observed to be HWE (p=0.52).

The significance of this common missense variant in T2D-ESKD in African Americans is unclear. With a poor biological understanding of the CRCT1 gene there is no basis for speculation as to mechanisms whereby this variant confers risk. However, we were able to replicate association in an independent cohort, which may warrant further investigation of this gene in future studies of T2D-ESKD and ESKD.

References

1. Nephromine. University of Michigan, Ann Arbor. Accessed 02/05/2013.

2. O’N ID. 2009. N w s hts to th tu o W th ’s Tu o . J Oral Pathol Med. 38(1): 145- 149.

3. Rinnerthaler M, Duschl J, Steinbacher P, Salzmann M, et al. 2013. Ager-related changes in the composition of the cornified envelope in human skin. Exp Dermatol. 22(5): 329-335.

164

CURRICULUM VITAE

Jason Andrew Bonomo

Education

2009 – Present Wake Forest University School of Medicine, Wake Forest University Graduate School of Arts & Sciences, Program in Molecular Medicine and Translational Science, MD/ PhD Dual Degree Program Winston Salem, NC 27157

Ph.D., Expected Graduation May 2014 GPA: 3.72 Advisor: Donald W. Bowden, Ph.D.

M.D., Expected Graduation May 2016

2005 – 2009 Case Western Reserve University College of Arts & Sciences, Cleveland, OH 44106 B.A., cum laude GPA: 3.76 Majors: Anthropology, Biology (with Honors) Minor: Chemistry

2008 Universidad Torcuato di Tella Buenos Aires, Argentina Study Abroad (Spanish, Bioethics)

2001-2005 Bloomsburg University of Pennsylvania Bloomsburg, PA 17815 Various college coursework as a high school student (sociology, psychology, political science, anthropology, et cetera)

2001 Universidad Pontifica de Salamanca, Salamanca, Spain College-level Study Abroad/ Certification Program (Spanish)

Languages English native Arabic (Levantine) moderately proficient Spanish moderately proficient

Professional Experience

Graduate Fellow, 2011-Present Molecular Medicine and Translational Science Wake Forest University Graduate School of Arts & Sciences Advisor: Donald W. Bowden, Ph.D.

165

Medical Student, 2009-2011 Wake Forest University School of Medicine

Undergraduate Research Assistant, 2008-2009 Case Western Reserve University School of Medicine Department of Pathology Advisor: Brian Cobb, Ph.D.

SOURCE Scholar, 2008 Case Western Reserve University School of Medicine Department of Pathology Advisor: Brian Cobb, Ph.D.

Undergraduate Research Assistant, 2007-2008 Case Western Reserve University School of Medicine Department of Pathology Advisor: Brian Cobb, Ph.D.

Howard Hughes Medical Institute Undergraduate Research Fellow, 2007 Advisor: Brian Cobb, Ph.D.

Undergraduate Research Assistant, 2006-2007 Case Western Reserve University School of Medicine Department of Pathology Advisor: Brian Cobb, Ph.D.

High School Research Internship Geisinger Medical Center Weis Center for Research Field: Neuroscience Advisor: David Carey, PhD.

Academic and Professional Honors

NIH/ NIDDK F30 NRSA: 1F30DK098836-01 “Detection of Coding Variants Modulating Diabetic Nephropathy in African Americans” 48mo. training grant awarded to JAB

Pennel Pro Humanitate Vitae Global Health Award, WFSOM, 2011

MD/ PhD Scholarship, WFSOM, 2009-Present

Bachelors of Art, cum laude, with Honors in Biology, CWRU, 2009

SOURCE Research Scholar, CWRU, 2008

Howard Hughes Fellow, Howard Hughes Medical Institute, CWRU, 2007

166

Gamma Sigma Alpha National Greek Honor Society, CWRU, 2007

Mortar Board National Honor Society, CWRU, 2007-2009

Gross Memorial Academic Scholarship, Zeta Beta Tau Foundation, 2007, 2008

Dean’s Honors, CWRU, Fall 2006/ 08, Spring 2007

Dean’s High Honors, CWRU, Fall 2005/ 07, Spring 2006/ 08

President’s Scholarship (academic merit), CWRU, 2005-09

Other Professional & Civic Activities

National Youth Leadership Forum in Medicine, Volunteer & Cardiology Director, 2012, 2013

Student Representative, Institutional Task Force on Inclusion and Diversity, Wake Forest Baptist Health, 2011-present

President, MD/ PhD Student Association, WFSOM, 2012

Volunteer Health Care Provider, Community Care Center of Winston Salem, 2011-2014

Recruitment Chair, MD/ PhD Student Association, WFSOM, 2011-2014

Volunteer, Wake Forest School of Medicine Share the Heath Fair, 2010-present

Project Teach, volunteer high school tutor, WFSOM, 2010-2011

Vice President, MD/ PhD Student Association, WFSOM, 2010

DEAC Clinic Volunteer, medical student run clinic, WFSOM, 2009-present

Cultural Awareness Committee, Student Representative, WFSOM, 2009-present

President and co-founder of CWRU chapter, Phi Delta Epsilon International Medical Fraternity, 2008

Volunteer Counselor at Camp Spifida, a camp for children with spina bifida, 2005, 2006, 2008, 2011, 2013

Service and Philanthropy Chair, Zeta Beta Tau fraternity, 2007-2009

Child Life Volunteer, Rainbow Babies and Children’s Hospital, 2006-2009

Undergraduate Student Government, College of Arts and Sciences Rep ., CWRU, 2006-2009

167

Volunteer, Emergency Department, University Hospitals, 2005-2006

Coursework in Graduate School

Statistics in Biomedical Research, Research Ethics, Epidemiology, Human Molecular Genetics, Biochemistry, Cancer Biology, Physiology & Pharmacology I & II, Molecular Biology, Genetic Epidemiology

Professional Interests

I have actively been involved in biomedical research since high school and have conducted research in a variety of fields including neuroscience, immunology, neurooncology, and molecular genetics. These experiences have allowed me to become fluent in a number of techniques including PCR, Western Blotting, molecular cloning, immunofluorescence and immunohistochemistry, ELISA, genotyping, among others. As an undergraduate at CWRU I pursued a double major in biology and anthropology with a minor in chemistry. This provided me with a strong science background with a social science perspective. Uniquely, my medical anthropology research and coursework afforded me the opportunity to study sociocultural determinants of health and the influences of socioeconomic status and culture on health care delivery and disparities. This complements my thesis work which investigates an important health disparity, diabetic nephropathy and end -stage kidney disease, from a molecular genetics perspective. Moreover, my medical school training provides me with an understanding of the pathophysiology of diabetic nephropathy, and my on -going clinical rotation at the Community Care Center (Winston-Salem’s free clinic; continuity clinic during PhD years) allows me to see first-hand the devastating effects of end stage renal disease on patients and their families. In summary, my unique academic and research background coupled with a capable and robust mentoring team provides me with a st rong foundation to sucessfully complete this proposal which will foster my development as an aspiring clinician-scientist.

Research Interests

Genetic epidemiology; pathobiology & genetics of kidney disease; diabetic nephropathy; cross cultural determinants of health; international public health; urban health; medical anthropology; access to healthcare; healthcare challenges facing underserved populations

Research Skills

Sequenom based genotyping, PCR, Western blotting, molecular cloning, protein-protein interaction networks, pathway analyses, immunohistochemistry, immunofluoresence, confocal microscopy, protein production in prokaryotic IPTG-inducible systems, ELISA, co- immunoprecipitation, cell/ tissue culture, induction of pluripotent cancer stem cells, proliferation assays, mouse related work (tail vein injections, necropsy, isolation of BMDCs, splenocytes, et cetera)

Bibliography

168

Journal Articles

Ryan SO, Bonomo JA, Zhao F, BA Cobb. 2011. MHCII glycosylation modulates Bacteroides fragilis carbohydrate antigen presentation. The Journal of Experimental Medicine. 208(5):1041-53. PMID: 21502329

Cooke Bailey JN, Palmer ND, Ng MC, Bonomo JA, Hicks PJ, Langefeld CD, Freedman BI, Bowden DW. 2014. Analysis of coding variants identified from exome sequencing resources for association with diabetic and non-diabetic nephropathy in African Americans. Human Genetics. Epub ahead of print. PMID: 24385048

Bonomo JA, Palmer ND, Hicks PJ, Lea JP, Okusa MD, Langefeld CD, Bowden DW, Freedman BI. 2014. Complement factor H gene associations with ESKD in African Americans. Nephrol Dial Transplant. Epub ahead of print. doi: 10.1093/ ndt/ gfu036.

Bonomo JA, Ng MCY, Palmer ND, Keaton KM, Hicks PJ, The T2D-GENES Consortium, Langefeld CD, Freedman BI, Bowden DW. 2014. Coding variants in NPHS1 modulate susceptibility to diabetic and non-diabetic nephropathy in African Americans. Submitted to CJASN Jan ‘14.

Bonomo JA, Ng MCY, Palmer ND, Hicks PJ, Keaton JM, Lea JP, The T2D-GENES Consortium, Langefeld CD, Freedman BI, Bowden DW. 2014. The ras responsive transcription factor RREB1 is a novel candidate gene for type 2 diabetes and nephropathy. Submitted to Human Molecular Genetics Feb ‘14.

Fan Y, Bonomo JA, Xiao W, Chaung P, Li Z, Jum B, Mallipattu S, Zhang W, Wei C, Palmer ND, Wang N, Jia W, Bowden DW, Freedman BI, He JC. 2014. RTN1 is a Novel Risk Gene for Kidney Disease. Submitted to Nature Medicine March Feb ’14.

Abstracts

Bonomo JA, Kriesman L, Cobb BA. MHC II Glycosylation and Zwitterionic Polysaccharide Antigen Presentation. HHMI/ SPUR Poster Presentation, CWRU, Cleveland, OH. July 2007.

Bonomo JA, Lewis C, Kriesman L, Cobb BA. MHC II Glycosylation Modulates Carbohydrate Antigen Presentation. Support of Undergraduate Research & Creative Endeavors (SOURCE) Research Fair, Case Western Reserve University, Cleveland, OH. August 2008.

Bonomo JA, Gibo DM, Debinski WD. Development of a Bi-Species Monoclonal Antibody Against Interleukin 13 Receptor Alpha 2 for Targeted Delivery of Therapeutics in Glioblastoma Multiforme. 19th Annual Resident & Fellows Research Day, Division of Surgical Sciences, Wake Forest University School of Medicine, Winston Salem, NC. November 2011.

Bonomo JA, Gibo DM, Debinski WD. Development of a Bi-Species Monoclonal Antibody Against Interleukin 13 Receptor Alpha 2 for Targeted Delivery of Therapeutics in Glioblastoma Multiforme. Graduate School Research Day, Wake Forest University School of Medicine, Winston Salem, NC. March 2012.

169

Cooke JN, Palmer ND, Ng MC, Hicks J, Bonomo JA, Hester MH, Langefeld CD, Freed man BI, Bowden DW. Analysis of Coding Variants Identified from Exome Sequencing in Diabetic and Non-diabetic Nephropathy Genes in African Americans. The American Society for Human Genetics, Annual Meeting, November 2012.

Bonomo JA, Ng MC, Allred N, Freedman BI, Bowden DW. Novel coding variants are associated with diabetic and all-cause end-stage renal disease (ESRD) in African Americans. Medical Student Research Day, Wake Forest School of Medicine, Winston Salem, NC. October 2013.

Bonomo JA, Ng MC, Allred N, Freedman BI, Bowden DW. Novel coding variants in NPHS1 are associated with diabetic and all-cause end-stage renal disease (ESRD) in African Americans. American Society of Nephrology, Annual Meeting, Atlanta, GA. November 2013.

Bonomo JA, Ng MC, Xu J, Hicks PJ, Freedman BI, Bowden DW. RREB1, a modulator of the Renin-Angiotensin System, is a novel nephropathy gene in African Americans. American Society of Nephrology, Annual Meeting, Atlanta, November 2013.

Professional Membership

American Society of Nephrology, medical student member, 2013-present

170