From the Department of Biometry, Epidemiology and Medical Bioinformatics

Institute of Genetic Epidemiology

of the Universitätsklinikum Freiburg im Breisgau

Associations between Known Genetic Risk

Variants and CKD Stage and Etiology

in the GCKD Study

INAUGURAL DISSERTATION

for the attainment of the degree of Doctor of Medicine

of the Faculty of Medicine

of the Albert-Ludwigs-Universität

Freiburg im Breisgau

Submitted 2017

by Sebastian Wunnenburger,

born in Leipzig


Dean: Prof. Dr. Kerstin Krieglstein

First reviewer: Prof. Dr. Anna Köttgen, M.P.H.

Second reviewer: Prof. Dr. Wolfgang Kühn

Year of doctorate: 2018


Table of contents

List of Abbreviations
List of Tables
1 Introduction
1.1 Chronic kidney disease (CKD) – definition, epidemiology, clinical presentation, diagnosis and treatment
1.2 Specific etiologies of CKD
1.2.1 Introduction
1.2.2 Glomerular diseases
1.2.3 IgA nephropathy
1.2.4 Membranous nephropathy (MN)
1.2.5 Systemic lupus erythematosus (SLE) and lupus nephritis
1.2.6 Granulomatosis with polyangiitis (GPA)
1.2.7 Diabetes mellitus
1.2.8 Hypertensive chronic kidney disease
1.3 Genetic epidemiology: Genome-wide association studies and SNP associations
1.4 Aims of the thesis
2 Methods
2.1 Study populations
2.2 Exposure
2.2.1 Genotyping and sequencing
2.2.2 Quality control and filtering
2.2.3 Imputation
2.3 Outcome
2.3.1 eGFR/UACR
2.3.2 Case and control groups
2.3.2.1 Case groups
2.3.2.2 Control groups
2.3.3 Covariates
2.4 Statistical analyses
2.4.1 Literature search for previously reported SNPs associated with kidney function and disease
2.4.2 Descriptive statistics
2.4.3 Regression analyses
2.4.4 Covariates
2.4.5 Sensitivity and conditional analyses
2.4.6 Statistical significance
2.4.7 Software
3 Results
3.1 Demographic data and baseline characteristics
3.2 CKD etiologies
3.3 GFR/UACR as kidney function measures
3.4 Hardy-Weinberg equilibrium test
3.5 Genetic associations
3.5.1 Associations of SNPs identified in population-based studies with advanced CKD
3.5.2 Associations of SNPs identified in population-based studies with hypertensive and diabetic kidney disease
3.5.3 Associations of SNPs with specific CKD etiologies
3.5.3.1 Associations of CKD etiology-specific risk variants with the previously reported CKD etiology
3.5.3.2 Associations of CKD etiology-specific risk variants across different CKD etiologies
3.5.3.3 Conditional analysis of independence of CKD etiology-specific SNPs
3.5.3.4 Conditional analysis to assess independence of risk variants for T1DM-attributed CKD and T1DM
3.5.3.5 Linkage disequilibrium calculations
4 Discussion
4.1 Summary of results
4.2 Interpretation in the context of the literature
4.3 Clinical interpretation
4.4 Strengths and limitations
4.5 Conclusion
5 Abstract (English)
6 Abstract (German)
7 Acknowledgement
8 Eidesstattliche Versicherung
9 Supplementary Tables
10 Bibliography
11 Original publication in Scientific Reports


List of Abbreviations

1KGP    1000 Genomes Project
ACE     Angiotensin converting enzyme
AT      Angiotensin
CI      Confidence interval
CKD     Chronic kidney disease
DM      Diabetes mellitus
eGFR    Estimated glomerular filtration rate
ESRD    End-stage renal disease
GCKD    German Chronic Kidney Disease (study)
GFR     Glomerular filtration rate
GPA     Granulomatosis with polyangiitis
HKD     Hypertensive kidney disease
IgA     IgA nephropathy
MN      Membranous nephropathy
n.s.    Not significant
NSAID   Nonsteroidal anti-inflammatory drug
OR      Odds ratio
RAAS    Renin-angiotensin-aldosterone system
SLE     Systemic lupus erythematosus
SNP     Single nucleotide polymorphism
SSNS    Steroid sensitive nephrotic syndrome
T1DM    Type 1 diabetes mellitus
T2DM    Type 2 diabetes mellitus
UACR    Urinary albumin-to-creatinine ratio
WTCCC   Wellcome Trust Case Control Consortium


List of Tables

Table 1. Classification of CKD according to eGFR and albuminuria
Table 2. Categorization of CKD based on presence or absence of systemic diseases and location of pathologic findings
Table 3. Case and control group characteristics for the analyses
Table 4. Composition of the GCKD internal control group for comparison to specific etiologies of CKD
Table 5. Sex distribution in the study populations
Table 6. Overview of candidate SNPs and plausibility checks
Table 7. HWE p-values of SNPs that showed deviation in the GCKD population
Table 8. SNP proxies used in the analyses
Table 9. Demographic data and baseline characteristics of the GCKD study population
Table 10. Leading cause of CKD in the GCKD study
Table 11. Associations of SNPs identified in population-based studies with advanced CKD (stage G3b or A3)
Table 12. Associations of SNPs identified in population-based studies with hypertensive nephropathy (nephrosclerosis)
Table 13. Associations of known risk loci for specific CKD etiologies with the corresponding CKD etiology in the GCKD study
Table 14. Associations between CKD etiology specific SNPs and other CKD etiologies
Table 15. Conditional analyses for independence of CKD etiology specific SNP signals
Table 16. Conditional analyses of SNPs associated with CKD from T1DM and previously known T1DM SNPs
Table 17. Linkage disequilibrium of selected SNPs in the HLA region in the GCKD cohort

Supplementary Table 1. Associations between population-based SNPs and advanced CKD (stage G3b or A3) in the GCKD cohort
Supplementary Table 2. Associations between population-based SNPs and CKD from hypertension and type 2 diabetes mellitus in the GCKD cohort (all variants)
Supplementary Table 3. Associations between CKD etiology-specific SNPs and other CKD etiologies in the GCKD cohort (all variants)


List of Figures

Figure 1. Effects of kidney function on essential homoeostatic processes
Figure 2. Antihypertensive medication
Figure 3. IgA nephropathy: immune-histological IgA deposition in the mesangium
Figure 4. Diabetic kidney disease
Figure 5. Principle of GWAS
Figure 6. Manhattan Plot of an IgA nephropathy GWAS
Figure 7. Relationship between minor allele frequency and effect size for genetic variants associated with continuous CKD-defining traits (eGFR, UACR)
Figure 8. Flowchart: data cleaning in the GCKD study
Figure 9. Age distribution in the GCKD cohort
Figure 10. Distribution of eGFR in the GCKD cohort
Figure 11. Distribution of ln(UACR) in the GCKD cohort
Figure 12. SNP associations across different CKD etiologies in the GCKD cohort


1 Introduction

1.1 Chronic kidney disease (CKD) – definition, epidemiology, clinical presentation, diagnosis and treatment

Definition and categories of CKD

The definition and classification of CKD have been standardized by the KDIGO workgroup. CKD is defined as abnormalities of kidney structure or function, present for more than 3 months, with implications for health. The criteria are a reduced estimated glomerular filtration rate (eGFR) of <60 ml/min/1.73m² and/or the presence of one or more markers of kidney damage. Among these are albuminuria and abnormalities detected in urine sediment or by histology. CKD is classified based on cause, GFR level and albuminuria. Additional classifications differentiate between systemic diseases affecting the kidney and primary kidney diseases, or between different morphologies in pathology or ultrasonography (Levey and Coresh 2012). Among the most common causes of CKD are diabetic nephropathy (15-30%), glomerulonephritis (20-25%), vascular nephropathy (15-25%) and polycystic kidney diseases (10-15%) (USRDS Annual Data Report 2016; Levey and Coresh 2012; Titze, Schmid et al. 2015). GFR is categorized into five stages (G1-G5, with G3 subdivided into G3a and G3b; Table 1A), although G1 and G2 alone do not fulfill the criteria of CKD if further kidney damage markers are absent. Albuminuria is categorized into three stages based on the urinary albumin-to-creatinine ratio (UACR, Table 1B).

Table 1. Classification of CKD according to estimated GFR (eGFR) and albuminuria

Table 1A. eGFR categories in CKD

eGFR category   eGFR (ml/min/1.73m²)   Description relative to young adults
G1              ≥90                    Normal or high
G2              60-89                  Mildly decreased
G3a             45-59                  Mildly to moderately decreased
G3b             30-44                  Moderately to severely decreased
G4              15-29                  Severely decreased
G5              <15                    Kidney failure

Table 1B. Albuminuria categories in CKD

Albuminuria category   UACR (mg/g)   Description relative to young adults
A1                     <30           Normal to mildly increased
A2                     30-299        Moderately increased
A3                     ≥300          Severely increased

eGFR: estimated glomerular filtration rate, UACR: urinary albumin-to-creatinine ratio. (Adapted from: KDIGO 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease)
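The cut-offs in Table 1 can be written as a small lookup. The following Python sketch is only an illustration and not part of the study's analysis code; the function names are hypothetical. Two assumptions are made: non-integer values are handled by treating each lower bound as inclusive, and an albuminuria cut-off of 30 mg/g (category A2 or higher) is used as the numeric damage marker. The check covers only the two numeric criteria and ignores the 3-month duration requirement and other markers of kidney damage.

```python
def gfr_category(egfr: float) -> str:
    """Map an eGFR value (ml/min/1.73m²) to the categories of Table 1A."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def albuminuria_category(uacr: float) -> str:
    """Map a UACR value (mg/g) to the categories of Table 1B."""
    if uacr < 30:
        return "A1"
    if uacr < 300:
        return "A2"
    return "A3"


def meets_numeric_ckd_thresholds(egfr: float, uacr: float) -> bool:
    """Assumption: eGFR < 60 or UACR >= 30 mg/g; duration and other
    damage markers are not represented in this simplified check."""
    return egfr < 60 or uacr >= 30


if __name__ == "__main__":
    for egfr, uacr in [(95, 10), (75, 45), (40, 350)]:
        print(egfr, uacr, gfr_category(egfr), albuminuria_category(uacr),
              meets_numeric_ckd_thresholds(egfr, uacr))
```

Under these assumptions, a person with an eGFR of 75 and a UACR of 10 mg/g falls into G2/A1 and does not meet the numeric thresholds, which mirrors the statement that G1 and G2 alone do not constitute CKD.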

Relevant prognostic markers for CKD progression to end-stage renal disease (ESRD) are the cause of CKD, eGFR and albuminuria category as well as the presence of further risk factors and comorbidities. Frequent comorbidities in CKD patients are diabetes mellitus (DM), arterial hypertension, dyslipidemia and cardiovascular diseases.

Epidemiology

The prevalence of CKD has increased during the last decades, with an estimated 12% of the adult population affected in many countries (Levey and Coresh 2012; Levin, Tonelli et al. 2017). CKD prevalence is strongly related to age, with an estimated 23-40% of persons older than 70 years affected (Zhang and Rothenbacher 2008; Levey, Stevens et al. 2009). Because of its high prevalence and strong association with cardiovascular morbidity and mortality, it is regarded as a major public health problem (Eckardt, Barthlein et al. 2012; Eckardt, Coresh et al. 2013). Very recent investigations showed that the increase in CKD prevalence has halted, especially in some Western countries, where rates are stable or even decreasing (De Nicola and Minutolo 2016; Hallan, Ovrehus et al. 2016). Possible explanations given there are improved control of hypertension and the increased use of nephroprotective renin-angiotensin-aldosterone system (RAAS) inhibitors. The increased prescription of statins may contribute to the effect as well, but their protective effect on the kidneys is not proven (Wanner, Krane et al. 2005; Baigent, Landray et al. 2011). In addition, an increased focus on healthy lifestyle and preventive strategies may have led to a reduction of CKD risk factors. Globally, however, the burden of CKD is still on the rise (Global Burden of Disease Study 2016).

Pathophysiology and clinical presentation

In a subset of patients, CKD leads to renal failure by a continued loss of functional glomeruli over time, independent of its cause (Hallan, Coresh et al. 2006). Blood flow to the nephrons rises to compensate for the lower number of functional glomeruli, leading to hyperfiltration in the remaining glomeruli. The hyperfiltration, which is mediated by angiotensin II and cytokines among others, causes hypertrophy, loss of the glomerular barrier function and subsequently proteinuria and progressive glomerulosclerosis. As the kidneys have many important functions, the consequences of renal failure are diverse (Figure 1):

1. Accumulation of metabolites and waste products and uremia. Substances that are excreted via the urine, such as breakdown products of many medications and metabolites such as organic acids, uric acid and creatinine, can no longer be eliminated adequately, and hence their blood concentration increases. Because of this, creatinine is the most widely used biomarker to diagnose a reduction of the kidneys’ excretory function. However, blood concentrations of creatinine increase only when more than 50% of the renal filtration function is lost. Reduced salt excretion results in volume overload, hypertension and edema. Toxic substances such as uric acid and a large number of metabolites, which cannot be eliminated sufficiently by the kidneys, are thought to cause organ and nerve damage and to increase the risk of cardiovascular events.

2. Unbalanced electrolyte and acid concentrations. When GFR falls below 30 ml/min/1.73m², ions and protons can no longer be eliminated adequately, leading to metabolic acidosis and hyperkalemia.

3. Reduction of endocrine function. Reduced generation of active vitamin D results in renal osteopathy, and lack of erythropoietin synthesis results in anemia. In addition, generation of renin and prostaglandins is impaired.

However, patients with early CKD stages are often asymptomatic. Early symptoms are usually nonspecific, such as fatigue, weakness and hypertension. In advanced stages, pallor or pruritus are found as well.

Figure 1. Effects of kidney function on essential homoeostatic processes.

FGF=fibroblast growth factor. ANF=atrial natriuretic factor. (Adapted from: Eckardt 2013: Evolving importance of kidney disease: from subspecialty to global health burden)


Diagnosis

Apart from a patient’s history and clinical evaluation of the symptoms described above, diagnosis of CKD includes the assessment of laboratory parameters, imaging techniques (mainly ultrasound) and, if indicated, a renal biopsy. The aim is to detect and treat reversible causes of renal insufficiency and to prevent adverse consequences. The most meaningful laboratory parameters are serum creatinine, urea and cystatin C, from which the estimated GFR can be calculated to assess the kidneys’ excretory function and to categorize the disease. Furthermore, UACR quantifies albuminuria and is used for staging. Renal biopsies are recommended only for specific subgroups of patients to confirm the underlying etiology, especially for glomerular or systemic diseases.
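The passage above does not state which estimating equation is used to derive eGFR from serum markers. As a hedged illustration only, the following Python sketch implements the creatinine-based CKD-EPI 2009 equation, which is one widely used option; its use here is an assumption and not a statement about the method applied in this thesis. The function name and parameter names are hypothetical.

```python
def egfr_ckd_epi_2009(scr_mg_dl: float, age_years: float,
                      female: bool, black: bool = False) -> float:
    """Creatinine-based CKD-EPI 2009 estimate in ml/min/1.73m².

    Assumption: serum creatinine is given in mg/dl. The choice of this
    equation is illustrative; the source text does not specify one.
    """
    kappa = 0.7 if female else 0.9          # sex-specific creatinine knot
    alpha = -0.329 if female else -0.411    # exponent below the knot
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha
            * max(ratio, 1.0) ** -1.209
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


if __name__ == "__main__":
    # Higher serum creatinine yields a lower estimate for the same person.
    for scr in (0.7, 1.2, 2.0):
        print(scr, round(egfr_ckd_epi_2009(scr, 50, female=True), 1))
```

The sketch shows why creatinine is a late marker of reduced filtration: the estimate depends on creatinine through a power function, so moderate rises in creatinine correspond to large relative losses of GFR.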

Treatment

Once CKD has developed, it is rarely reversible. Therapeutic measures mainly aim to slow its progression and ameliorate the diverse symptoms, as in most cases it is not possible to eliminate the underlying cause of CKD. Nevertheless, therapy should be started as early as possible to avoid progression and hyperfiltration in the glomeruli. If possible, treatment of the underlying cause of CKD should be included. Some approaches are described below. The mainstay of CKD treatment is antihypertensive medication to lower blood pressure. Target blood pressure is ≤140/90 mm Hg for patients in CKD stage A1 and ≤130/80 mm Hg for patients in stage A2 or A3 (Figure 2, ESC Guidelines 2014). ACE inhibitors and AT II receptor blockers play an important role, as they reduce not only hypertension but also albuminuria and are therefore thought to be nephroprotective. Further recommendations include a low salt diet, an increased fluid intake, increased exercise and elimination or reduction of cardiovascular risk factors such as smoking and high cholesterol levels. Many clinical trials of existing therapy regimens were negative, although some, such as the MDRD study, provided supporting evidence (MDRD study group 1992).

Figure 2. Antihypertensive medication (adapted from: Prinz, Christian: Basiswissen Innere Medizin, Springer 2012)


If anemia, osteopathy or electrolyte disturbances exist, adequate substitution therapies are recommended. In general, side effects of drugs should be considered carefully as several drugs either are nephrotoxic or their renal elimination is impaired in CKD. Doses may have to be adapted to the GFR. If CKD progresses to ESRD, kidney replacement therapy is required, which consists either of dialysis or kidney transplantation. The latter has shown a lower mortality compared to dialysis, but often cannot be performed because of lack of donor organs or contraindications (Wolfe, Ashby et al. 1999).

1.2 Specific etiologies of CKD

1.2.1 Introduction

CKD can result from various causes, which can be categorized into systemic diseases affecting the kidney and primary kidney diseases. Both groups consist of various sub-groups relating to the location of damage. The KDIGO workgroup compiled a classification in 2002, which was updated in 2013 (Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group 2013; Table 2).

Table 2. Categorization of CKD based on presence or absence of systemic diseases and location of pathologic findings

Location of damage                           Examples of systemic diseases affecting the kidneys   Examples of primary kidney diseases

Glomerular diseases                          Diabetes, autoimmune diseases (SLE)                   Membranous nephropathy, focal and segmental glomerulosclerosis
Tubulointerstitial diseases                  Systemic infections, sarcoidosis                      Urinary tract infections, stones
Vascular diseases                            Hypertension, ischemia, vasculitis                    Renal artery stenosis
Cystic and congenital diseases               Polycystic kidney disease, Fabry disease              Renal dysplasia
Diseases affecting the transplanted kidney   Acute and chronic rejection                           BK virus nephropathy

SLE: systemic lupus erythematosus, ANCA: anti-neutrophil cytoplasmic antibody. Adapted from: KDIGO workgroup, 2013.


1.2.2 Glomerular diseases

Pathological processes of glomerular diseases often manifest as either nephrotic or nephritic syndrome (see below). The cause of disease is typically established by a renal biopsy, which shows both specific and nonspecific pathological changes.

The term “nephrotic syndrome” relates to the combination of edema, heavy proteinuria (usually >3.5 g/day for adults), hypoalbuminemia and hyperlipidemia. Pathologies affecting the podocytes result in damage to the glomerular filtration barrier. Consequently, permeability of the membrane increases, causing a heavy loss of proteins. This leads to hypoalbuminemia and generalized edema. Further effects of increased glomerular permeability are a loss of substances such as antithrombin III, causing hypercoagulability, and a loss of immunoglobulins, causing an increased susceptibility to infections. Examples of kidney diseases with nephrotic syndrome are membranous nephropathy, minimal change nephropathy and diabetic nephropathy (Turner 2015).

“Nephritic syndrome” is characterized by the combination of edema, hypertension, glomerular hematuria, and proteinuria in a lower range, usually <1.5 g/day, mainly with an abrupt onset. It is caused by infections, autoimmune diseases or thrombotic events. As in nephrotic syndrome, permeability of the glomerular filtration barrier is increased, but in this case with smaller pores, so that red blood cells, smaller proteins and, in the case of IgA nephropathy, IgA antibodies pass. In contrast, retention of sodium and water increases, causing edema.

1.2.3 IgA nephropathy

IgA nephropathy is a glomerulonephritis caused by the deposition of aberrantly glycosylated IgA complexes in the mesangium of the glomeruli (Tomana, Matousovic et al. 1997; Figure 3). In IgA nephropathy patients, B cells produce galactose-deficient polymeric IgA1 because of inherited defects. The liver and reticuloendothelial system show a reduced uptake of these molecules, leading to their accumulation. If additional cofactors such as antiglycan IgG1 antibodies and further risk factors are present, the IgA molecules form immune complexes that are deposited in the glomeruli (Tomana, Novak et al. 1999). This is called the multi-hit hypothesis, as more than one defect has to be present for the manifestation of this pathology. The presence of multiple genetic and/or environmental risk factors is typical for complex diseases, in contrast to monogenic diseases. The etiology of this complex disease is incompletely understood, which is why more than 90% of the cases are termed “sporadic” despite evidence for a hereditary component.

Figure 3. IgA nephropathy: IgA deposition in the mesangium in immunofluorescent staining (A) and HE (B) and mesangial expansion in electron micrograph (C). (from: Turner et al.: Oxford Textbook Of Clinical Nephrology, 4th edition, 2015.)

Patients are usually young; males are affected more often than females. However, exact epidemiological data are lacking, as the diagnosis requires renal biopsy. The annual incidence is estimated to be 1:100,000 (Wyatt and Julian 2013). It varies widely by ethnicity, which may point towards an interaction of genetic predispositions with environmental factors. This is consistent with the observation that a genetic IgA nephropathy risk score showed an East-West gradient and a North-South gradient within Europe, concordant with differences in prevalence in geospatial analyses of GWAS (Wuttke and Kottgen 2016). In East-Asian populations, the incidence of ESRD resulting from IgA nephropathy is four times as high as in Europe (Kiryluk, Li et al. 2012). IgA nephropathy can also co-occur with systemic lupus erythematosus (SLE), Schönlein-Henoch purpura, liver cirrhosis and other diseases.

Episodic macroscopic hematuria, often one to three days after an upper respiratory tract infection, is a typical first clinical sign. While macroscopic hematuria usually disappears spontaneously, microscopic hematuria with or without proteinuria often persists. Further symptoms may include hypertension and acute kidney failure. The clinical presentation in combination with results from urine analysis suggests a diagnosis of IgA nephropathy, which should be confirmed by a biopsy.

As there is no specific targeted therapy for IgA nephropathy, symptomatic treatment is initiated and includes immunosuppressive and supportive therapy, for example with corticosteroids and RAAS inhibitors, depending on progression and proteinuria. Only a few studies so far have addressed potential targeted therapies, such as decreasing the synthesis of galactose-deficient IgA by application of budesonide or hydroxychloroquine in small case series (Smerud, Barany et al. 2011). Complete remission is possible and occurs in 10% of the patients, but especially those with unfavorable prognostic factors such as proteinuria and hypertension can progress with a loss of up to 10 ml/min of GFR per year, eventually resulting in ESRD (Reich, Troyanov et al. 2007).


1.2.4 Membranous nephropathy (MN)

Membranous nephropathy is a progressive kidney disease affecting mainly adults. In 80% of patients, circulating auto-antibodies against the phospholipase-A2-receptor protein (PLA2R) can form immune complexes with the PLA2R expressed on podocytes. These complexes trigger an immune reaction and thus lead to proteinuria and nephrotic syndrome. Antibodies against further proteins such as THSD7A are also known to appear in patients with MN, but affect only a small proportion of patients (Tomas, Beck et al. 2014). In the remaining patients, the etiology is either unknown, or the disease occurs secondary to autoimmune diseases such as systemic lupus erythematosus (SLE), infections (hepatitis, AIDS), solid tumors or drug therapy (antibiotics, gold). Clinical presentation in most cases includes nephrotic syndrome, proteinuria, edema or hypertension, but asymptomatic patients exist as well. A biopsy is required to confirm the diagnosis. Before starting therapy, an appropriate screening for conditions causing secondary MN is required (Alfaadhel and Cattran 2015). Depending on presentation and risk assessment, which is based on proteinuria and renal function, therapy includes the prescription of diuretics, a salt-restrictive diet, causal therapy of the primary disease and/or immunosuppressive drugs, as well as anticoagulants if thrombosis risk is elevated (Lee, Biddle et al. 2014). The exact mechanisms of the elevated thrombosis risk are still not clear. A loss of coagulation factors, increased platelet activation and enhanced red blood cell aggregation as a consequence of nephrotic syndrome and a low albumin level are thought to be responsible (Mirrakhimov, Ali et al. 2014). Complete remission and partial remission without loss of renal function are each attained in one third of patients; the remaining third progress to ESRD.
However, exact numbers vary widely across different studies and depend on prognostic factors such as sex, age and the presence of various immune markers (Hopper, Trew et al. 1981; Zent, Nagai et al. 1997). Anti-PLA2R1 titers in blood can be used both as a parameter for therapeutic efficacy and as predictors for post-transplantation recurrence (Francis, Beck et al. 2016).

1.2.5 Systemic lupus erythematosus (SLE) and lupus nephritis

Systemic lupus erythematosus is an autoimmune disease including skin manifestations, vasculitis and immune complex accumulation. Prevalence of SLE is 40/100,000. SLE affects mainly women in childbearing years; its etiology is unknown. Several genetic factors that contribute to the disease have been identified (Iwamoto and Niewold 2016); environmental factors such as viruses, medications and smoking are discussed as well (Tsokos 2011). These triggers cause apoptosis of various cells. Consequently, constituents of the nucleus are released. Auto-reactive B- and T-cells expand due to a disorder of their formation, activity and elimination and provoke an autoimmune reaction. This reaction is directed against DNA and proteins of the nucleus.

SLE symptoms are very variable. Most common are general symptoms like fever, weight loss and weakness. Arthritis, skin symptoms (increased sun sensitivity, butterfly rash), cardiopulmonary symptoms (pleurisy, endocarditis, arteriosclerosis) and neuronal symptoms are frequently observed as well. An important complication because of its effect on therapy and prognosis is lupus nephritis. Lupus nephritis manifests with proteinuria, hematuria, nephrotic syndrome and progressive glomerulonephritis, ultimately resulting in ESRD. It is caused by antibody reactions against DNA fragments set free by cell destruction, and the deposition of the resulting immune complexes in the vessels and/or kidneys. Besides lupus nephritis, further renal manifestations of SLE exist; among these are minimal change disease, rapidly progressive glomerulonephritis and thrombotic microangiopathy.

The diagnosis of SLE is made using the Systemic Lupus International Collaborating Clinics (SLICC) or American College of Rheumatology (ACR) criteria, which are similar and consider a variety of symptoms: at least 4 of 11 specified criteria have to be present. To confirm lupus nephritis, a renal biopsy is indicated, which serves at the same time to differentiate between several sub-groups and to determine disease activity. Based on the histopathological appearance, six sub-groups of lupus nephritis exist and define the prognosis. However, a switch between sub-groups is possible.
Important parts of the therapy are consistent protection from sunlight, application of NSAIDs, chloroquine, corticosteroids or immunosuppressive drugs depending on disease activity, antihypertensive therapy to avoid further kidney damage and a diet rich in calcium and vitamin D to prevent mineral bone disease. Side effects may be severe, including bone marrow and gonadal suppression (Tsokos 2011). The prognosis of SLE depends on the disease manifestations and on disease activity. The 10-year survival is above 90% with adequate therapy. The elevated mortality risk is caused by a higher prevalence of cardiovascular events, CKD, infections and uremia (Tsokos 2011).


1.2.6 Granulomatosis with polyangiitis (GPA)

GPA, previously known as Wegener’s disease, is an antibody-mediated vasculitis of small and medium-size blood vessels of unknown etiology affecting many organs. Early stages of GPA are characterized by manifestations limited to the respiratory tract, such as chronic rhinosinusitis, nosebleeds and crusting, perforated septum, ulcerations in the oropharynx and pulmonary infiltrations. Later, the disease progresses and generalizes, which can include arthritis, neuropathy, glomerulonephritis, hemoptysis and skin lesions. Renal or other organ failures that can be life-threatening can be found in advanced stages of GPA, when it is also unresponsive to immunosuppressive therapy. The main renal manifestation is a pauci-immune, focal necrotizing glomerulonephritis. It is characterized by peri-glomerular accumulation of macrophages, neutrophils and lymphocytes. The presence of pauci-immune glomerulonephritis substantially increases the risk of progression to ESRD (Sinico, Di Toma et al. 2013). Besides the clinical presentation, immunological parameters (cytoplasmic anti-neutrophil cytoplasmic antibodies, cANCA), biopsies from nasopharynx, lung or kidney and imaging techniques are used to establish the diagnosis. Therapy depends on stage and activity of the disease and consists mainly of the application of corticosteroids, cyclophosphamide and other immunosuppressive drugs. While initial dosages are higher to quickly control symptoms and to reduce inflammation, low-dose immunosuppression is sufficient to prevent relapse in the maintenance phase (Tarzi and Pusey 2014). GPA prognosis is poor without therapy (survival less than five months), but quite good under adequate therapy (>85% of patients survive 5 years). Kidney damage and adverse effects of cyclophosphamide are the most important unfavorable prognostic factors which limit survival (Hogan, Nachman et al. 1996).

1.2.7 Diabetes mellitus

Diabetes mellitus describes a group of diseases which have chronic hyperglycemia in common. Of all patients with diabetes, 5% suffer from type 1 diabetes (T1DM) and >90% from type 2 diabetes (T2DM). Less prevalent are gestational diabetes and diabetes of other types. These rare types of diabetes can be caused by genetic defects in insulin receptors or mitochondria, endocrinopathies, infections and side effects of drugs. The prevalence of T2DM is increasing worldwide and now affects >15% of the population above 60 years in Western countries (Shaw, Sicree et al. 2010; Jaacks, Siegel et al. 2016).


Type 1 diabetes is an autoimmune disease that leads to the destruction of the insulin-producing beta cells in the pancreas, resulting in a lack of insulin production and consequently in hyperglycemia. Manifestation usually takes place during adolescence, but a late-onset form exists as well. Type 2 diabetes, which usually manifests at >40 years, arises from resistance against insulin. Besides a genetic predisposition, important contributors are a sedentary lifestyle, with lack of exercise, and obesity. Type 2 diabetes is therefore a typical complex disease, with unfavorable environmental factors acting upon genetically susceptible individuals. Chronic elevation of insulin levels leads to a resistance of the insulin receptors. Furthermore, it causes increased appetite and additional food intake, resulting in a vicious circle. At the same time, apoptosis of beta cells takes place, and insulin stocks are thus depleted. The precise molecular mechanisms underlying T2DM pathogenesis are still incompletely understood. Both types of diabetes show a hereditary component. For T1DM, distinct HLA haplotypes are known to increase disease risk if present in the family. For T2DM, if one parent suffers from T2DM, the risk for the children to develop the disease rises up to 50%, and even further if both parents are affected (Klein, Klein et al. 1996). Clinical manifestation is usually early and fast for type 1 but later and slower for type 2 diabetes. Typical clinical symptoms of diabetes are increased thirst, frequent urination, weight loss despite increased appetite, and fatigue. In advanced stages, diabetic cardiomyopathy, neuropathy, nephropathy, microvascular disease and frequent infections occur.

Diagnosis is established by the measurement of blood glucose and HbA1c. Criteria to define diabetes are a) an HbA1c value of >6.5%, b) an HbA1c value of >5.7% in combination with a fasting blood glucose value of ≥126 mg/dl or c) a random blood glucose value of ≥200 mg/dl. In addition, an oral glucose tolerance test or measurement of urinary glucose and ketone bodies can be performed. HbA1c additionally serves as a long-term marker for blood glucose, reflecting glycaemia over the past eight weeks. Therapy of diabetes rests on several pillars. A healthy diet with weight loss is necessary, as well as exercise. Commonly used drugs are oral antidiabetics for patients with T2DM, and insulin. T1DM patients require insulin immediately as they have an absolute lack of insulin; therefore, oral antidiabetics do not help. Patient compliance is of utmost importance; therapy success and glucose levels have to be checked regularly, and all complications and risk factors for cardiovascular diseases have to be treated. Nutrition has to be planned carefully to avoid peaks or strong decreases of blood glucose concentration.


Diabetic nephropathy is an adverse consequence that can arise from both forms of diabetes. It manifests in about 30% of patients, usually 5 to 10 years after disease onset (Adler, Stevens et al. 2003), especially in the presence of risk factors such as high HbA1c levels, arterial hypertension and a positive smoking history. It is the most common cause of ESRD. The percentage of ESRD patients with diabetes as the primary cause varies widely across countries. While it is below 20% in some Northern European countries, some East Asian countries reported more than 60%. In most countries, e.g. the United States, diabetes is considered the primary cause of ESRD in 40-50% of the patients (USRDS Annual Data Report 2016). Pathogenic mechanisms of diabetic nephropathy are not completely understood. Genetic predisposition is known to increase the risk of diabetic kidney disease, although no specific risk locus or pathway has been found that by itself could explain the predisposition. Levels of vascular endothelial growth factor-A (VEGF-A) and inflammatory cytokines increase because of injured glomeruli and mediate glomerular endothelial cell proliferation. Thus, the mesangial area expands and the glomerular basement membrane thickens (Steffes, Osterby et al. 1989). Further pathogenic mechanisms are discussed, such as reactive oxygen species causing damage in the glomeruli, leading to sclerosis and glomerular hypertension (Cooper 2001). The microscopic appearance is illustrated in Figure 4. In T1DM, glomerular changes are already observed in early stages of the disease and are characterized by hyperfiltration. Advanced stages are characterized by Kimmelstiel-Wilson disease, which refers to glomerulosclerosis and proteinuria or nephrotic syndrome because of the microscopic pathologies described above. The changes observed in T2DM mainly overlap with those observed in T1DM, but the vascular and interstitial pathologies are less specific and more heterogeneous (Dalla Vestra, Saller et al. 2000).

Figure 4. Diabetic kidney disease. Glomerulosclerosis with thickened mesangium and membranes. (from: W. Remmele: Pathologie. Bd. 5, Springer, Berlin 1997)

After an asymptomatic period, the glomerular filtration barrier starts to leak, allowing proteins such as albumin to pass, resulting in albuminuria. In earlier stages but not later stages, albuminuria is potentially reversible. Measures to avoid its development and progression to ESRD are antihypertensive medication, control of blood glucose and restriction of salt intake. RAAS blocking agents in particular are widely used as antihypertensive agents, based on the strong evidence of their efficacy in clinical studies. Apart from reducing blood pressure, they significantly reduce albuminuria (Parving 2000; Cravedi, Ruggenenti et al. 2010). Further dietary recommendations remain controversial (Stevens and Levin 2013).

1.2.8 Hypertensive chronic kidney disease

Hypertensive kidney disease, also called hypertensive nephropathy or nephrosclerosis, refers to kidney damage due to high blood pressure without presence of inflammatory processes. Hypertension is present in about 75% of CKD patients and declared as the primary cause of CKD in about 25% of CKD patients (USRDS Annual Data Report 2016). Hypertension is thought to contribute to kidney damage by causing vascular damage and sclerosis. Initially adaptive responses that minimize wall stress, such as medial hypertrophy and intimal thickening, narrow the vascular lumen, leading to decreased blood flow in the kidneys, followed by activation of the RAAS. This increases glomerular pressure and results in permanent elevation of renal blood pressure, resulting in damage to the vasculature and glomeruli, and subsequently in tubular atrophy and interstitial nephritis. Nephrosclerosis can be divided into a benign and a malignant variant. Benign nephrosclerosis progresses slowly over years or decades, while malignant nephrosclerosis progresses rapidly to acute kidney injury. In early stages of nephrosclerosis, microalbuminuria is often the only symptom, resulting in low patient awareness. Over time, albuminuria increases. Fibrosis, sclerosis and plaques deposited in the arteries cause damage to the endothelium, resulting in loss of glomerular function. The patient’s symptoms are similar to those of CKD in general, which are described above. As for diabetic nephropathy, the most important therapeutic approach is the use of RAAS blocking agents to lower blood pressure. Nephrosclerosis is commonly named as a cause of CKD. While it is uncontroversial that hypertension is a risk factor for CKD, it is debated whether hypertension itself is causal for nephropathy or only associated with it (Freedman and Cohen 2016). The authors of this article propose to abandon this term and use “gene-based” or “arteriolar nephropathy” in patients of African ancestry instead. “Gene-based” refers to carrier status of the APOL1 risk alleles, which has been shown to be associated with CKD (O'Seaghdha, Parekh et al. 2011), although the exact mechanism is still unclear. APOL1 encodes apolipoprotein L1, which is upregulated by pro-inflammatory cytokines (Wan, Zhaorigetu et al. 2008) and contributes to the innate immune response. It is assumed that in the context of infections, higher levels of nephrotoxic metabolites are produced in patients with high-risk APOL1 variants than in those with low-risk variants (Olabisi, Zhang et al. 2016). Studies have shown differing results across ethnic groups and for the associated cardiovascular risks, so further research is necessary to address the open questions regarding this pathway.

1.3 Genetic epidemiology: Genome-wide association studies and SNP associations

A genome-wide association study (GWAS) is a gene mapping method which aims to detect associations between genetic variants in DNA (markers) and phenotypic characteristics or diseases. Single nucleotide polymorphisms (SNPs) serve as markers. SNPs occur naturally about every 100 to 300 base pairs. Since close-by SNPs are co-inherited and are therefore correlated (termed “linkage disequilibrium”, Figure 5), only a subset of all SNPs needs to be genotyped in order to achieve good genome coverage. This is currently cheaper than whole-genome sequencing. When performing a GWAS, genotypes at approximately 500,000 to 5 million SNPs per person are determined through array genotyping. Because of the known high correlation between these genotyped and additional ungenotyped SNPs in sequenced reference populations, genotype status at many more SNPs can be inferred (imputation) and subsequently be tested for association (Visscher, Brown et al. 2012).
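To make the notion of correlated SNPs concrete, the following minimal Python sketch computes the squared Pearson correlation (r²) between the genotype dosages (0, 1 or 2 copies of an allele) of two SNPs, which is a common approximation of linkage disequilibrium. The function name `ld_r2` and the toy genotype vectors are hypothetical and serve illustration only; they are not data or software from this thesis.

```python
def ld_r2(geno_a, geno_b):
    """Squared Pearson correlation between two genotype dosage vectors (0/1/2)."""
    n = len(geno_a)
    if n == 0 or n != len(geno_b):
        raise ValueError("genotype vectors must be non-empty and of equal length")
    mean_a = sum(geno_a) / n
    mean_b = sum(geno_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(geno_a, geno_b))
    var_a = sum((a - mean_a) ** 2 for a in geno_a)
    var_b = sum((b - mean_b) ** 2 for b in geno_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined for a monomorphic SNP")
    return cov * cov / (var_a * var_b)


# Hypothetical toy genotypes for eight individuals.
snp1 = [0, 1, 2, 1, 0, 2, 1, 0]
snp2 = [0, 1, 2, 1, 0, 2, 1, 0]  # always inherited together with snp1
snp3 = [2, 0, 1, 1, 2, 0, 1, 0]  # only loosely related to snp1

r2_tagged = ld_r2(snp1, snp2)
r2_loose = ld_r2(snp1, snp3)
```

A SNP pair with r² close to 1 carries nearly the same information, so genotyping one of the two is sufficient to capture the other.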

Figure 5. Principle of GWAS

SNPs in close proximity are often co-inherited and therefore highly correlated. Thus, it is sufficient to genotype a subset of about 1 million SNPs that can be used as markers for a genome-wide screen such as GWAS. At each SNP, an association test between genotype and presence of the disease or trait of interest is carried out to test whether the disease or trait of interest, e.g. CKD, differs across the three genotype categories. Statistical significance of such an association in a GWAS is defined as a p-value of <5×10⁻⁸, which corrects for multiple testing of about 1 million independent SNPs. LD: linkage disequilibrium, SNP: single nucleotide polymorphism, CKD: chronic kidney disease. From: Köttgen, A: Genome-wide Association Studies in Nephrology Research, 2010.


Association tests, usually by linear or logistic regression, determine, for one SNP at a time, whether carrier status of a pre-specified allele at this SNP occurs more frequently in cases than in controls (or whether the mean of a continuous parameter such as eGFR differs between carriers and non-carriers of the pre-specified allele, Figure 5). Because of the high number of tests, stringent correction for multiple testing is required to avoid an excess of false positive results. The p-value threshold for statistical significance is usually set to 5×10⁻⁸, which corresponds to testing 1 million common independent SNPs that are typically found in the genome of European ancestry individuals. It is calculated as 0.05 (the typical type 1 error level) divided by 1 million. Furthermore, as population stratification can confound the genotype-disease association, analyses need to adjust for genetic ancestry (Price, Zaitlen et al. 2010). As GWAS represent a gene mapping method, their results only represent an association and do not imply that an associated SNP directly causes the disease.
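The calculation of the genome-wide significance threshold described above can be written as a short Python sketch. The function names are hypothetical and the p-values in the usage lines are made up for illustration.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def is_genome_wide_significant(p_value, alpha=0.05, n_tests=1_000_000):
    """True if a p-value passes the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# 0.05 divided by 1 million independent SNPs gives the usual GWAS threshold.
genome_wide = bonferroni_threshold(0.05, 1_000_000)
print(f"{genome_wide:.1e}")  # prints 5.0e-08

# Hypothetical p-values: only the first passes the threshold.
hit = is_genome_wide_significant(3e-9)
miss = is_genome_wide_significant(1e-6)
```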

Results from a GWAS are often presented as a Manhattan plot. Figure 6 illustrates this graphical way of presentation: all phenotype-SNP associations (dots) appear with their respective genomic coordinates on the x axis and the negative decadic logarithm of the p-value, indicating the strength of the association, on the y axis. The level of genome-wide significance is indicated by a horizontal line at 5×10⁻⁸. All SNPs with non-significant associations hence fall below this line, while all SNPs with significant associations are located above. Thus, the reader can easily read and compare the strength of the associations on the one hand, and detect regions of special interest on the other hand. In addition, the names of the likely disease-related genes are added to the plot.
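The y-axis transformation of a Manhattan plot can be sketched in a few lines of Python. The SNP identifiers and p-values below are hypothetical and only show how the height of the significance line and the position of individual associations relative to it are obtained.

```python
import math


def manhattan_height(p_value):
    """Height of a SNP in a Manhattan plot: negative decadic logarithm of its p-value."""
    if not 0 < p_value <= 1:
        raise ValueError("p-value must be in the interval (0, 1]")
    return -math.log10(p_value)


GENOME_WIDE_P = 5e-8
threshold_line = manhattan_height(GENOME_WIDE_P)  # about 7.3

# Hypothetical association results (SNP identifier -> p-value).
toy_results = {"rsA": 3e-12, "rsB": 4e-6, "rsC": 0.2}

# SNPs plotted above the significance line.
above_line = [snp for snp, p in toy_results.items()
              if manhattan_height(p) > threshold_line]
```

Because of the logarithmic scale, smaller p-values appear higher in the plot, so the strongest associations form the visible peaks.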

Figure 6. Manhattan Plot of an IgA nephropathy GWAS (Kiryluk, Li et al. 2014)

Statistically significant associations (p-value <5×10⁻⁸) are presented in pink with the respective gene names.

GWAS have become possible because of technical progress allowing high-throughput genotyping, the generation of the human genome sequence and databases of its variation, and the collection of large study populations. The first GWAS was published in 2006 and found two risk loci for wet age-related macular degeneration (Dewan, Liu et al. 2006). One year later, the WTCCC (The Wellcome Trust Case Control Consortium 2007) published the first GWAS conducted in large patient populations and using a SNP array with good coverage of the genome.

Since the onset of GWAS, thousands of new SNP-disease associations have been found, mainly for common complex diseases such as metabolic disorders, autoimmune conditions and psychiatric diseases (Visscher, Brown et al. 2012). These findings have already helped to gain further biological knowledge, for example by using animal models to investigate the loss of function of genes implicated by GWAS. Further insights from GWAS concern areas such as population structure, natural selection and evolution. Nevertheless, the majority of GWAS findings cannot yet be directly related to biological and pathogenic pathways. Time-consuming follow-up studies are required to understand the underlying molecular mechanisms.

The main findings of the GWAS in the past ten years were:

1. Many different risk loci contribute to complex disorders, mostly with small effects, but in sum they can explain a large proportion of the familial aggregation of these diseases.
2. At many risk loci, several SNPs with a wide range of allele frequencies are associated with a disease.
3. Several risk loci contribute to more than one disease.
4. Risk loci are often shared across different ethnic groups.

Figure 7 illustrates conceptual insights from successive GWAS of kidney function and damage in increasingly large populations over time.


Figure 7. Relationship between minor allele frequency and effect size for genetic variants associated with continuous CKD-defining traits (eGFR, UACR)

Adapted from: Wuttke (2016): Insights into kidney diseases from genome-wide association studies. eGFR: estimated glomerular filtration rate, UACR: urine albumin-to-creatinine ratio.

The figure illustrates the following main insights:

1. While earlier GWAS found associations with larger effect sizes, more recent GWAS can detect smaller effect sizes (y axis). These later findings illustrate the increased statistical power of larger study populations, as exemplified by smaller standard errors of the effect sizes (small vertical lines through the effect sizes).
2. Associations of genetic variants with low minor allele frequency (x axis) need to be of larger effect size to achieve sufficient statistical power for identification by GWAS compared to those of higher minor allele frequency. Therefore, they could not be detected in earlier studies.
3. The minor alleles of the detected SNPs can have either a positive or a negative effect on the studied kidney function parameter, suggesting that there is no purifying selection against alleles resulting in lower kidney function. This can be explained by the late onset of the disease, the moderate effect and/or other, beneficial effects of the risk allele, as is known for the APOL1 CKD risk variant, which protects from trypanosomiasis (Vanhamme, Paturiaux-Hanocq et al. 2003).
4. Different ethnic groups can generate complementary insights: the study by Okada et al. in East Asians identified loci that were only later identified in much larger studies of non-Asian populations.

Concerning CKD, several GWAS have been performed resulting in the detection of risk alleles associated with reduced GFR or increased risk of eGFR-defined CKD in the general population (Kottgen, Glazer et al. 2009; Köttgen 2010; Okada, Sim et al. 2012; Pattaro 2015). Other GWAS have found risk alleles for specific causes of CKD such as IgA nephropathy (Gharavi, Kiryluk et al. 2011; Kiryluk, Li et al. 2014; Li, Foo et al. 2015), membranous nephropathy (Stanescu, Arcos-Burgos et al. 2011), systemic lupus erythematosus (Harley, Alarcon-Riquelme et al. 2008; Chung, Taylor et al. 2011), type 1 diabetes mellitus (Sandholm, Salem et al. 2012; Sandholm, McKnight et al. 2013) or granulomatosis with polyangiitis (Holle 2013; Xie, Roshandel et al. 2013), but none of these studies examined the presence of overlapping risk variants for more than one etiology leading to CKD. Together, the findings of these and further GWAS were used as a source for this study and are outlined in detail in the Methods part.

1.4 Aims of the thesis

GWAS in the general population have identified single-nucleotide polymorphisms (SNPs) in >50 independent risk loci associated with the estimated glomerular filtration rate (eGFR), CKD risk and microalbuminuria (MA) (Kottgen, Glazer et al. 2009; Boger, Chen et al. 2011; Pattaro 2015), but it remains unclear whether the SNPs identified in population-based cohorts are also associated with advanced CKD (such as eGFR < 45 ml/min/1.73m² or urinary albumin-creatinine ratio (UACR) ≥ 300 mg/g) and, if so, whether the association is stronger. Moreover, it is unknown how these variants associate with more specific CKD etiologies such as hypertensive CKD and diabetic kidney disease (DKD). In addition, GWAS for specific kidney diseases, such as IgA (Kiryluk, Li et al. 2014) or membranous nephropathy (Stanescu, Arcos-Burgos et al. 2011), have found risk loci not detected in population-based genetic screens of eGFR, but there is no systematic comparison of these risk loci across different specific etiologies of CKD.


To address these knowledge gaps, this thesis aimed to answer the following questions:

1. Do kidney function-associated SNPs discovered in population-based studies translate to advanced CKD and the common CKD etiologies diabetic nephropathy and hypertensive CKD (Aim 1)?
2. Are risk loci identified for specific etiologies also associated with other CKD etiologies? And are there loci contributing risk to several related etiologies such as autoimmune conditions leading to CKD (Aim 2)?


2 Methods

2.1 Study populations

Case groups for all analyses in this thesis were derived from the GCKD study. As controls, three different groups were examined: a) patients from the GCKD study without the investigated characteristics (internal control group), b) European participants of the 1000 Genomes Project and c) controls from the WTCCC study (external control groups).

a) GCKD study

The GCKD study is an ongoing prospective observational study of 5,217 patients with moderate chronic kidney disease (CKD) under continuous nephrological medical care. Patients are followed prospectively for 10 years through structured visits every two years to assess disease course, hospitalization episodes and new-onset complications. Biomaterials (plasma, serum, DNA and spot-urine samples) are collected, processed and frozen for future analyses. A set of care parameters is measured from fresh samples in a central laboratory with standardized processing. In addition, telephone interviews are conducted between visits and patients’ nephrologists are contacted annually to assess key information about the patients’ health and the latest serum creatinine value. Between March 2010 and March 2012, 5,217 patients were enrolled in the study at nine recruitment centers in collaboration with university hospitals and practice based nephrologists all over Germany. All patients provided written informed consent (Eckardt, Barthlein et al. 2012). The GCKD study was approved by local ethics committees and registered in the national registry of clinical studies (DRKS 0003971).

To be eligible for inclusion, participants had to meet the following criteria: 1) age between 18 and 74 years, 2a) eGFR between 30 and 60 ml/min/1.73m² or 2b) eGFR > 60 ml/min/1.73m² and overt albuminuria (which was defined as > 300 mg albumin/g creatinine or > 300 mg albumin/day in the urine) or proteinuria (> 500 mg protein/g creatinine or > 500 mg protein/day in the urine). Of all patients, 4,775 (91.5%) were enrolled based on their eGFR, and 442 (8.5%) because of albuminuria. Exclusion criteria were non-Caucasian race, solid organ or bone marrow transplantation, active malignancy within 24 months prior to screening, heart failure NYHA IV, legal guardianship or unwillingness to provide consent. Nephrologists were asked to define a leading cause of CKD based on biopsies and/or clinical grounds, and patients were subsequently categorized into groups of different CKD etiologies. More detailed information on the study design can be found in Eckardt et al. (Eckardt, Barthlein et al. 2012) and Titze et al. (Titze, Schmid et al. 2015), the latter describing the findings from the GCKD baseline visit.

b) 1000 Genomes Project

The 1000 Genomes Project was conducted between 2008 and 2015 to obtain a comprehensive description of genome-wide genetic variation in worldwide populations. To this end, 2,504 individuals from 26 populations in five different world regions (Africa, East Asia, Europe, South Asia and the Americas) were sampled and their DNA sequenced using both whole-genome and targeted exome sequencing (Auton, Brooks et al. 2015). The Project Consortium applied recently developed guidelines on ethical considerations for investigators performing genetic sampling, which they defined in an “Informed Consent Background Document”.

The 1000 Genomes Project Steering Committee decided about the selection of populations and sample sets. An overview of sample collection, selection criteria and data generation is given in two Nature publications (Auton, Brooks et al. 2015; Sudmant, Rausch et al. 2015) as well as on the Project’s homepage (http://www.1000genomes.org/about).

c) WTCCC study

The WTCCC (Wellcome Trust Case Control Consortium) study investigated genetic risk loci for several common diseases, the case groups, in a population in Great Britain in 2007 (The Wellcome Trust Case Control Consortium 2007). As a shared control group, they used a combination of two independent population-based groups, each of about 1,500 persons presumed to be healthy. This group has been used as an external control group in the present study as well. The control group consisted of 1,500 individuals of the 1958 British Birth cohort and 1,500 blood donors who gave consent to participate in the study. Participants of the 1958 British Birth cohort were all born in Great Britain in a certain week in 1958. Survivors have been followed regularly and at the age of 44-45 years, blood samples for DNA extraction were collected. Of these 17,000 participants, 1,500 who self-reported Caucasian ethnicity served as the first part of the control group in the study. The second part of the group was formed by blood donors who were recruited as study participants for the WTCCC study. Of originally 3,622 samples, 1,564 were selected from patients aged 18 to 69 years and that represented the 1958 British Birth cohort control group best concerning sex and geographical distribution. This study makes use of data generated by the Wellcome Trust Case Control Consortium. A full list of the investigators who contributed to the generation of data is available at www.wtccc.org.uk. Funding for the project was provided by the Wellcome Trust under award 076113 and 085475.


2.2 Exposure

2.2.1 Genotyping and sequencing

In the GCKD study, 5,123 participants were genotyped at 2,337,794 SNPs using the Illumina Infinium 2.5M-8 microarray. The 1000 Genomes Project used an Illumina sequencing platform. Sequence data are available via the 1000 Genomes Project website. SNP genotyping in the WTCCC study was performed with Affymetrix GeneChip 500K arrays. Genotype data are available via dbGaP (database of genotypes and phenotypes, https://www.ncbi.nlm.nih.gov/gap) and were obtained through an approved project application.

2.2.2 Quality control and filtering

Genotype data cleaning in the GCKD study was performed according to standard protocols (Anderson, Pettersson et al. 2010). Individuals were removed if they had missing genotypes for more than 3% of SNPs (n=48), if variant heterozygosity was more than 2 SD away from the mean of all samples (n=15) or if they failed the sex check (n=19) that compares genotypic and reported sex. While 57 individuals failed these initial quality checks, 5,066 individuals remained. Subsequently, only one member of pairs/families with 1st or 2nd degree relatives also participating in the study was retained (n=11 excluded), and individuals who clustered away from other samples based on a common set of genome-wide SNPs, defined as outlying average-DST values and/or >8 SD deviation away from the mean in a principal component analysis, were removed (n=21). Finally, 5,034 individuals remained for further analyses (Figure 8). Data cleaning of the 1000 Genomes Project and the WTCCC study parent projects is described in the respective study design papers (The Wellcome Trust Case Control Consortium 2007; Auton, Brooks et al. 2015).

Figure 8. Flowchart: data cleaning in the GCKD study

SNP: single nucleotide polymorphism, SD: standard deviation, DST: a measure of interpopulational genetic distance based on allele sharing.
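The sample counts of the data cleaning steps can be retraced with a small Python sketch. The starting number of 5,123 genotyped participants is taken from section 2.2.1; the step labels and the function name are hypothetical. Note that the per-criterion counts of the first step (48, 15 and 19) add up to more than 57, presumably because some individuals failed more than one check; this reading is an assumption and is not stated in the text.

```python
def apply_exclusions(n_start, exclusions):
    """Return the number of remaining samples after each exclusion step."""
    remaining = n_start
    trail = []
    for label, n_excluded in exclusions:
        remaining -= n_excluded
        trail.append((label, remaining))
    return trail


N_GENOTYPED = 5123  # genotyped GCKD participants (section 2.2.1)

# Exclusion counts as reported in the text; labels are illustrative.
qc_steps = [
    ("initial sample QC (missingness, heterozygosity, sex check)", 57),
    ("relatedness (1st or 2nd degree)", 11),
    ("ancestry outliers (DST / principal components)", 21),
]

qc_trail = apply_exclusions(N_GENOTYPED, qc_steps)
for label, remaining in qc_trail:
    print(f"{remaining:>5} remaining after {label}")
```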


2.2.3 Imputation

Genotype imputation of the GCKD and WTCCC data was performed according to standard protocols (Marchini and Howie 2010) using SHAPEIT v2.r644 (Delaneau, Zagury et al. 2013) and IMPUTE2 v2.3.0 (Howie, Donnelly et al. 2009) in order to increase the number of available SNPs per individual and to obtain information on the same SNPs although samples were genotyped using different arrays. Phased haplotypes from the 1000 Genomes Project (Auton, Brooks et al. 2015) (Phase 3, release v3 date 2013-05-02, ALL subset) were used as the imputation reference panel. After imputation, about 9.3 million SNPs of high imputation quality (information measure ≥0.8 and minor allele frequency (MAF) >1%) were retained for further analysis.
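The post-imputation filter can be illustrated with a minimal Python sketch. The two cut-offs are those stated above; the function name, the record layout and the SNP entries are hypothetical and do not reflect the IMPUTE2 output format.

```python
def passes_imputation_qc(info, maf, min_info=0.8, min_maf=0.01):
    """Keep a SNP if its information measure is >= 0.8 and its MAF is > 1%."""
    return info >= min_info and maf > min_maf


# Hypothetical imputed SNPs: (identifier, information measure, minor allele frequency).
imputed_snps = [
    ("rsX1", 0.95, 0.23),   # well imputed, common
    ("rsX2", 0.62, 0.31),   # poorly imputed
    ("rsX3", 0.91, 0.004),  # well imputed, but too rare
    ("rsX4", 0.80, 0.02),   # exactly at the information cut-off
]

retained = [snp_id for snp_id, info, maf in imputed_snps
            if passes_imputation_qc(info, maf)]
```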

2.3 Outcome

2.3.1 eGFR/UACR

Previous population-based studies discovered genetic risk loci using screens for eGFR or UACR as the phenotype of interest. For estimating the GFR, the CKD-EPI formula (Levey, Stevens et al. 2009) was used.

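$$\mathrm{eGFR} = 141 \times \min(\mathrm{SCr}/\kappa, 1)^{\alpha} \times \max(\mathrm{SCr}/\kappa, 1)^{-1.209} \times 0.993^{\mathrm{Age}} \times 1.018\ [\text{if female}] \times 1.159\ [\text{if black}]$$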

SCr is the serum creatinine in mg/dl, min indicates the minimum of SCr/κ or 1, max indicates the maximum of SCr/κ or 1, κ = 0.7 (females) or 0.9 (males) and α = -0.329 (females) or -0.411 (males). Serum creatinine is assumed to have been measured with an assay calibration traceable to an isotope dilution mass spectrometry reference measurement procedure, which is the case in the GCKD study. As one important question of this thesis was whether risk loci found to associate with kidney function in the physiological range are also associated with advanced CKD, we used eGFR and UACR as markers to define case groups of more advanced CKD. Categorical variables were derived based on cut-offs of these continuous kidney function markers.
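As an illustration, the CKD-EPI creatinine equation with the constants given above can be written as a short Python function. This is a minimal sketch and not the software used in the GCKD study; the function and parameter names are hypothetical.

```python
def ckd_epi_2009(scr_mg_dl, age_years, female, black=False):
    """eGFR in ml/min/1.73m² according to the CKD-EPI creatinine equation (2009)."""
    if scr_mg_dl <= 0:
        raise ValueError("serum creatinine must be positive")
    kappa = 0.7 if female else 0.9
    alpha = -0.329 if female else -0.411
    ratio = scr_mg_dl / kappa
    egfr = (141.0
            * min(ratio, 1.0) ** alpha      # applies when SCr is at or below kappa
            * max(ratio, 1.0) ** -1.209     # applies when SCr is above kappa
            * 0.993 ** age_years)
    if female:
        egfr *= 1.018
    if black:
        egfr *= 1.159
    return egfr


# Hypothetical example: a 60-year-old man with a serum creatinine of 1.5 mg/dl.
example_egfr = ckd_epi_2009(1.5, 60, female=False)
```

At a fixed age and sex, a higher serum creatinine yields a lower eGFR, which is the behavior the two exponent terms are meant to capture.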

2.3.2 Case and control groups

Table 3 gives an overview of the main characteristics used to define case and control groups. Below, they are described in more detail.


Table 3. Case and control group characteristics for the analyses

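| Analysis | SNPs | Case group characteristics | GCKD internal control characteristics | 1KGP external controls | WTCCC external controls |
|---|---|---|---|---|---|
| Advanced CKD | Population-based loci | eGFR<45 ml/min/1.73m² / UACR≥300 mg/g | eGFR≥60 ml/min/1.73m² / UACR<300 mg/g | all individuals of European ancestry | all individuals of the WTCCC control group |
| Diabetic NP | Population-based loci | leading cause of CKD: T2DM | other cause of CKD | all individuals of European ancestry | all individuals of the WTCCC control group |
| Hypertensive NP | Population-based loci | leading cause of CKD: nephrosclerosis | other cause of CKD (excluding nephrosclerosis) | all individuals of European ancestry | all individuals of the WTCCC control group |
| Specific etiologies | Etiology-specific risk loci | leading cause of CKD: specific disease | other cause of CKD | all individuals of European ancestry | all individuals of the WTCCC control group |
| Covariates | - | age, sex; for sensitivity analyses: intake of RAAS inhibitors, presence of diabetes | age, sex; for sensitivity analyses: intake of RAAS inhibitors, presence of diabetes | sex, principal components | sex; age did not vary for one subgroup |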

SNP: single nucleotide polymorphism, GCKD: German Chronic Kidney Disease Study, 1KGP: Thousand Genomes Project, WTCCC: Wellcome Trust Case Control Consortium, CKD: Chronic Kidney Disease, NP: nephropathy, eGFR: estimated glomerular filtration rate, UACR: urine albumin creatinine ratio, T2DM: type 2 diabetes mellitus, RAAS: renin-angiotensin aldosterone system

2.3.2.1 Case groups

In this thesis, three different case groups were studied to address the study aims.

1. Cases of “advanced CKD” were defined in two ways:
   a) all patients with eGFR < 45 ml/min/1.73m² (defining CKD stage G3b, see section 1.1) and
   b) all patients with UACR ≥ 300 mg/g (defining CKD stage A3).
2. Regarding diabetic and hypertensive nephropathy, all patients were defined as cases whose leading cause of CKD was classified as “type 2 diabetes mellitus” or “nephrosclerosis”, respectively.
3. For the examination of specific etiologies of CKD, the leading cause of CKD served as the basis for the case group definition. All etiologies with a suspected genetic contribution were tested in a separate case group if there were at least 50 patients with the respective leading cause of CKD in the GCKD cohort.
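The cut-off-based definition of the “advanced CKD” analyses can be sketched in Python as follows. The case cut-offs are those given above and the control cut-offs are taken from Table 3; the function names and return labels are hypothetical. Under this reading, patients with an eGFR of 45 to <60 ml/min/1.73m² belong to neither group in the eGFR-based analysis, which is an inference from the stated cut-offs rather than an explicit statement of the text.

```python
def classify_by_egfr(egfr):
    """Assign a patient to the eGFR-based advanced CKD analysis."""
    if egfr < 45:
        return "case"        # advanced CKD, stage G3b or worse
    if egfr >= 60:
        return "control"     # less advanced CKD
    return "not included"    # 45 to <60: neither case nor control


def classify_by_uacr(uacr_mg_per_g):
    """Assign a patient to the UACR-based advanced CKD analysis."""
    return "case" if uacr_mg_per_g >= 300 else "control"


# Hypothetical patients: (eGFR in ml/min/1.73m², UACR in mg/g).
patients = [(38.0, 120.0), (52.0, 450.0), (71.0, 20.0)]
egfr_groups = [classify_by_egfr(egfr) for egfr, _ in patients]
uacr_groups = [classify_by_uacr(uacr) for _, uacr in patients]
```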

2.3.2.2 Control groups

Because selection bias can threaten inferences from case-control studies through a biased selection of controls, each analysis was performed using three different control groups, both internal and external controls (see section 2.1).


a) Internal GCKD control groups

1. Patients with “less advanced CKD” served as a control group for the “advanced CKD” association analyses. This was defined as eGFR ≥ 60 ml/min/1.73m² or UACR < 300 mg/g, respectively.
2. As an internal control group for cases with specific CKD etiologies (see 2.3.2.1.3), 1,655 GCKD patients with causes of CKD that were clearly distinct from the evaluated case groups (nephrosclerosis, infections, tumor nephrectomies, interstitial nephritis and vascular diseases) were used (Table 4). For analyzing associations with nephrosclerosis as the case group, the control group was reduced to the remaining 569 individuals.

Table 4. Composition of the GCKD internal control group for comparison to specific etiologies of CKD

| Kidney disease | n |
|---|---|
| Vascular nephropathy (n=1,160) | |
| Renal artery stenosis | 49 |
| Nephrosclerosis | 1,086 |
| Renal infarct | 6 |
| Other | 19 |
| Interstitial nephropathy (n=220) | |
| Interstitial nephropathy | 145 |
| Analgesic nephropathy | 51 |
| Other | 24 |
| Acute kidney injury (n=62) | |
| Post ischemic | 58 |
| Other | 4 |
| Single kidney (n=123) | |
| Tumor nephrectomy | 62 |
| Kidney donor | 27 |
| Other nephrectomy | 31 |
| Other | 3 |
| (Post-)renal diseases (n=90) | |
| Kidney stones | 22 |
| Infections | 27 |
| Neurogenic bladder dysfunction | 3 |
| Other | 38 |
| Sum | 1,655 |

b) 1000 Genomes Project

This control group was formed by 503 samples of European ancestry sampled from Europe (Great Britain, Finland, Italy and Spain) and the United States (CEU), to maximize comparability to the GCKD cohort. As demographic data, only sex information was released by the project.

c) WTCCC control group

Participants in this control group of 2,597 samples belonged either to the 1958 British Birth cohort or to the UK Blood Service Controls cohort (see section 2.1). Data on sex and race (all European ancestry) were available, but no additional phenotype or demographic data.

2.3.3 Covariates

Sex and age were used as covariates for the association analyses. Sex was available for all three study populations (Table 5), while age was only available and variable for the GCKD cohort and thus was only used as a covariate for the comparison to internal control groups.

Table 5. Sex distribution in the study populations

| Study population | % male (n) | % female (n) |
|---|---|---|
| GCKD | 60.1% (3,027) | 39.9% (2,007) |
| 1KGP | 47.7% (240) | 52.3% (263) |
| WTCCC | 49.9% (1,295) | 50.1% (1,302) |

GCKD: German Chronic Kidney Disease Study, 1KGP: Thousand Genomes Project, WTCCC: Wellcome Trust Case Control Consortium

For sensitivity analysis, presence of diabetes mellitus (n=1,737) and intake of RAAS inhibitors (n=3,869; missing data n=36) of GCKD patients were taken into account as covariates as well. Diabetes mellitus was defined by HbA1c >6.5% or the intake of oral antidiabetic drugs. To evaluate a potential impact of population stratification, a sensitivity analysis with genetic principal components (E1-E3) as covariates was performed for the 1000 Genomes Project control group to account for potentially different geographical origins of the case and external control groups.

2.4 Statistical analyses

2.4.1 Literature search for previously reported SNPs associated with kidney function and disease

A literature search with the help of the GWAS Catalog (Welter, MacArthur et al. 2014) and PubMed was performed to assemble a table of all previously reported SNPs associated with either the kidney function measures eGFR and UACR, CKD risk or risk for specific CKD etiologies. In populations of European ancestry, SNPs had to show genome-wide significant association with the outcome (p<5×10⁻⁸) and evidence for replication. Studies examining both cause-specific and cause-unspecific CKD etiologies were taken into account. Based on linkage disequilibrium information, redundant SNPs were removed if two reported index SNPs were correlated in the European population (r²>0.2) and thus likely reflected the same signal, as calculated by SNiPA (Arnold, Raffler et al. 2015). Genetic information for the SNPs from the literature search was then extracted for all study participants from the imputed GCKD genotype dataset. As quality and plausibility checks, allele frequencies for all study populations and SNPs were calculated and compared to those given in the dbSNP database (Smigielski, Sirotkin et al. 2000). No SNPs were excluded because of allele frequency discrepancies or deviation from Hardy-Weinberg equilibrium (HWE). Typically, SNPs from genome-wide genotyping arrays are discarded for violation of HWE if their HWE p-value is <10⁻⁵. In this study, the cut-off for violation of HWE was set to p<0.05 because of the targeted evaluation of far fewer candidate SNPs. Table 6 gives an overview of all SNPs used in the following analyses and the results of the plausibility checks. The five SNPs with deviations from HWE were then re-tested for HWE deviations based on data of the control groups only, as deviations from HWE in combined case and control groups can result from inclusion of cases for whom the SNP is a strong disease risk factor.
Results are shown in Table 7.

Table 6. Overview of candidate SNPs and plausibility checks

| SNP | Gene | Reference | Chr. | Position (b37) | Allele1 | Allele2 | AF¹ | AF² | HWE p-value |
|---|---|---|---|---|---|---|---|---|---|
| Population based risk loci (eGFR, UACR, CKD) | | | | | | | | | |
| rs10109414 | STC1 | Köttgen 2010 | 8 | 23,751,151 | C | T | 0.57 | 0.57 | 0.370 |
| rs10277115 | UNCX | Okada 2012 | 7 | 1,285,195 | A | T | 0.25 | 0.22 | 0.933 |
| rs10491967 | TSPAN9 | Pattaro 2015 | 12 | 3,368,093 | G | A | 0.89 | 0.89 | 0.671 |
| rs10513801 | ETV5 | Pattaro 2015 | 3 | 185,822,353 | T | G | 0.86 | 0.87 | 0.406 |
| rs10774021 | SLC6A13 | Köttgen 2010 | 12 | 349,298 | C | T | 0.35 | 0.36 | 1.000 |
| rs10794720 | WDR37 | Köttgen 2010 | 10 | 1,156,165 | T | C | 0.06 | 0.09 | 0.875 |
| rs10994860 | A1CF | Pattaro 2015 | 10 | 52,645,424 | C | T | 0.80 | 0.84 | 0.843 |
| rs1106766 | INHBC | Pattaro 2015 | 12 | 57,809,456 | C | T | 0.80 | 0.77 | 0.952 |
| rs11078903 | CDK12 | Pattaro 2012 | 17 | 37,631,924 | G | A | 0.25 | 0.25 | 0.933 |
| rs11666497 | SIPA1L3 | Pattaro 2015 | 19 | 38,464,262 | C | T | 0.80 | 0.82 | 0.001 |
| rs11959928 | DAB2 | Köttgen 2010 | 5 | 39,397,132 | T | A | 0.55 | 0.54 | 0.971 |
| rs12124078 | DNAJC16 | Pattaro 2012 | 1 | 15,869,899 | A | G | 0.70 | 0.70 | 0.218 |
| rs12136063 | SYPL2 | Pattaro 2015 | 1 | 110,014,170 | G | A | 0.31 | 0.31 | 0.874 |
| rs12460876 | SLC7A9 | Köttgen 2010 | 19 | 33,356,891 | T | C | 0.58 | 0.61 | 0.360 |
| rs1260326 | GCKR | Köttgen 2010 | 2 | 27,730,940 | T | C | 0.41 | 0.42 | 0.826 |
| rs12917707 | UMOD | Köttgen 2009 | 16 | 20,367,690 | G | T | 0.80 | 0.84 | 0.596 |
| rs13538 | ALMS1/NAT8 | Köttgen 2010 | 2 | 73,868,328 | A | G | 0.80 | 0.78 | 0.391 |
| rs1394125 | UBE2Q2 | Köttgen 2010 | 15 | 76,158,983 | G | A | 0.66 | 0.64 | 0.881 |
| rs163160 | KCNQ1 | Pattaro 2015 | 11 | 2,789,955 | A | G | 0.82 | 0.82 | 0.324 |
| rs164748 | DPEP1 | Pattaro 2015 | 16 | 89,708,292 | C | G | 0.52 | 0.55 | 0.814 |
| rs17216707 | BCAS1 | Pattaro 2015 | 20 | 52,732,362 | T | C | 0.77 | 0.80 | 0.923 |
| rs17319721 | SHROOM3 | Köttgen 2009 | 4 | 77,368,847 | G | A | 0.59 | 0.56 | 0.975 |
| rs1801239 | CUBN | Böger 2011 | 10 | 16,919,052 | T | C | 0.91 | 0.89 | 0.904 |
| rs2279463 | SLC22A2 | Köttgen 2010 | 6 | 160,668,389 | A | G | 0.88 | 0.88 | 0.902 |
| rs228611 | NFKB1 | Pattaro 2015 | 4 | 103,561,709 | G | A | 0.52 | 0.51 | 0.924 |
| rs2453580 | SLC47A1 | Pattaro 2012 | 17 | 19,438,321 | T | C | 0.60 | 0.59 | 0.109 |
| rs2467853 | SPATA5L1/GATM | Köttgen 2009 | 15 | 45,698,793 | T | G | 0.59 | 0.61 | 0.992 |
| rs267734 | ANXA9/LASS2 | Köttgen 2010 | 1 | 150,951,477 | T | C | 0.81 | 0.79 | 0.432 |
| rs2712184 | IGFBP5 | Pattaro 2015 | 2 | 217,682,779 | C | A | 0.43 | 0.41 | 0.911 |
| rs2802729 | SDCCAG8 | Pattaro 2015 | 1 | 243,501,763 | C | A | 0.55 | 0.50 | 0.999 |
| rs2928148 | INO80 | Pattaro 2012 | 15 | 41,401,550 | G | A | 0.50 | 0.50 | 0.991 |
| rs347685 | TFDP2 | Köttgen 2010 | 3 | 141,807,137 | C | A | 0.28 | 0.27 | 0.999 |
| rs3750082 | KBTBD2 | Pattaro 2015 | 7 | 32,919,927 | T | A | 0.64 | 0.68 | 0.450 |
| rs3828890 | MHC region | Okada 2012 | 6 | 31,440,669 | C | G | 0.88 | 0.91 | 0.979 |
| rs3850625 | CACNA1S | Pattaro 2015 | 1 | 201,016,296 | G | A | 0.88 | 0.88 | 0.167 |
| rs3925584 | MPPED2 | Pattaro 2012 | 11 | 30,760,335 | T | C | 0.51 | 0.55 | 0.233 |
| rs4014195 | AP5B1 | Pattaro 2015 | 11 | 65,506,822 | C | G | 0.65 | 0.62 | 0.879 |
| rs4667594 | LRP2 | Pattaro 2015 | 2 | 170,008,506 | T | A | 0.49 | 0.47 | 0.944 |
| rs4744712 | PIP5K1B | Köttgen 2010 | 9 | 71,434,707 | A | C | 0.38 | 0.39 | 0.659 |
| rs491567 | WDR72 | Köttgen 2010 | 15 | 53,946,593 | A | C | 0.80 | 0.79 | 0.995 |
| rs6088580 | TP53INP2 | Pattaro 2015 | 20 | 33,285,053 | G | C | 0.53 | 0.51 | 0.999 |
| rs626277 | DACH1 | Köttgen 2010 | 13 | 72,347,696 | A | C | 0.58 | 0.62 | 0.993 |
| rs6420094 | SLC34A1 | Köttgen 2010 | 5 | 176,817,636 | A | G | 0.67 | 0.67 | 0.886 |
| rs6431731 | DDX1 | Pattaro 2012 | 2 | 15,863,002 | C | T | 0.04 | 0.04 | 0.591 |
| rs6459680 | RNF32 | Pattaro 2015 | 7 | 156,258,568 | G | T | 0.20 | 0.25 | 0.565 |
| rs6465825 | TMEM60 | Köttgen 2010 | 7 | 77,416,439 | T | C | 0.57 | 0.60 | 0.595 |
| rs6795744 | WNT7A | Pattaro 2015 | 3 | 13,906,850 | G | A | 0.84 | 0.85 | 0.992 |
| rs7422339 | CPS1 | Köttgen 2010 | 2 | 211,540,507 | C | A | 0.70 | 0.68 | 0.997 |
| rs7759001 | ZNF204 | Pattaro 2015 | 6 | 27,341,409 | G | A | 0.23 | 0.22 | 0.995 |
| rs7805747 | PRKAG2 | Köttgen 2010 | 7 | 151,407,801 | G | A | 0.71 | 0.70 | 0.896 |
| rs7956634 | PTPRO | Pattaro 2015 | 12 | 15,321,194 | T | C | 0.83 | 0.82 | 0.765 |
| rs8091180 | NFATC1 | Pattaro 2015 | 18 | 77,164,243 | G | A | 0.40 | 0.43 | 0.685 |
| rs881858 | VEGFA | Köttgen 2010 | 6 | 43,806,609 | G | A | 0.31 | 0.29 | 0.902 |
| rs9682041 | SKIL | Pattaro 2015 | 3 | 170,091,902 | C | T | 0.12 | 0.12 | 0.955 |
| rs9895661 | BCAS3 | Köttgen 2010 | 17 | 59,456,589 | C | T | 0.19 | 0.20 | 0.785 |
| Membranous nephropathy | | | | | | | | | |
| rs2187668 | HLA-DQA1 | Stanescu 2011 | 6 | 32,605,884 | C | T | 0.89 | 0.88 | 0.001 |
| rs4664308 | PLA2R1 | Stanescu 2011 | 2 | 160,917,497 | A | G | 0.59 | 0.59 | 0.988 |
| IgA nephropathy | | | | | | | | | |
| rs11150612 | ITGAM-ITGAX | Kiryluk 2014 | 16 | 31,357,760 | G | A | 0.66 | 0.64 | 0.280 |
| rs11574637 | ITGAM-ITGAX | Kiryluk 2014 | 16 | 31,368,874 | T | C | 0.80 | 0.83 | 0.122 |
| rs12716641 | DEFA | Li 2015 | 8 | 6,898,998 | T | C | 0.54 | 0.51 | 0.644 |
| rs17019602 | VAV3 | Kiryluk 2014 | 1 | 108,188,858 | A | G | 0.80 | 0.79 | 0.242 |
| rs1794275 | HLA-DQA/B | Yu 2011 | 6 | 32,671,248 | G | A | 0.78 | 0.83 | 0.354 |
| rs1883414 | HLA-DPB2 | Gharavi 2011 | 6 | 33,086,448 | G | A | 0.67 | 0.69 | 1.000 |
| rs2033562 | KLF10/ODF1 | Li 2015 | 8 | 103,547,739 | G | C | 0.38 | 0.36 | 1.000 |
| rs2074038 | ACCS | Li 2015 | 11 | 44,087,989 | G | T | 0.88 | 0.89 | 0.229 |
| rs2412971 | HORMAD2/MTMR3 | Gharavi 2011 | 22 | 30,494,371 | G | A | 0.54 | 0.53 | 0.011 |
| rs2523946 | HLA-A | Yu 2011 | 6 | 29,941,943 | C | T | 0.51 | 0.56 | 0.999 |
| rs2738048 | DEFA | Yu 2011 | 8 | 6,822,785 | A | G | 0.68 | 0.68 | 0.112 |
| rs3115573 | HLA region | Feehally 2010 | 6 | 32,218,843 | A | G | 0.59 | 0.53 | 0.691 |
| rs3803800 | TNFSF13 | Yu 2011 | 17 | 7,462,969 | A | G | 0.23 | 0.21 | 0.942 |
| rs4077515 | CARD9 | Kiryluk 2014 | 9 | 139,266,496 | C | T | 0.60 | 0.58 | 0.392 |
| rs660895 | HLA-DRB1 | Yu 2011 | 6 | 32,577,380 | A | G | 0.78 | 0.83 | 0.974 |
| rs6677604 | CFHR1,3 | Gharavi 2011 | 1 | 196,686,918 | G | A | 0.81 | 0.80 | 0.094 |
| rs7634389 | ST6GAL1 | Li 2015 | 3 | 186,738,421 | T | C | 0.62 | 0.63 | 0.833 |
| rs7763262 | HLA-DR–HLA-DQ | Kiryluk 2014 | 6 | 32,424,882 | T | C | 0.28 | 0.34 | 0.981 |
| rs9275596 | HLA-DQB1 | Gharavi 2011 | 6 | 32,681,631 | C | T | 0.31 | 0.36 | 0.978 |
| rs9314614 | DEFA | Li 2015 | 8 | 6,697,731 | C | G | 0.48 | 0.49 | 0.346 |
| rs9357155 | TAP1/2/PSMB8/9 | Gharavi 2011 | 6 | 32,809,848 | G | A | 0.86 | 0.88 | 0.339 |
| Steroid sensitive nephrotic syndrome | | | | | | | | | |
| rs1129740 | HLA-DQA1 | Gbadegesin 2015 | 6 | 32,609,105 | G | A | 0.45 | 0.42 | 0.001 |
| Systemic lupus erythematosus | | | | | | | | | |
| rs10488631 | TNPO3 | Armstrong 2014 | 7 | 128,594,183 | T | C | 0.90 | 0.90 | 0.713 |
| rs1150754 | TNXB | Chung 2011 | 6 | 32,050,758 | C | T | 0.89 | 0.86 | 0.003 |
| rs4963128 | KIAA1542 | Harley 2008 | 11 | 589,564 | T | C | 0.33 | 0.33 | 1.000 |
| rs6445975 | PXK | Harley 2008 | 3 | 58,370,177 | G | T | 0.26 | 0.28 | 0.978 |
| rs7574865 | STAT4 | Chung 2011 | 2 | 191,964,633 | T | G | 0.23 | 0.22 | 0.297 |
| rs9888739 | ITGAM | Chung 2011 | 16 | 31,313,253 | C | T | 0.86 | 0.88 | 0.325 |
| Type 1 diabetes mellitus | | | | | | | | | |
| rs12437854 | ESRD | Sandholm 2012 | 15 | 94,141,833 | T | G | 0.92 | 0.93 | 0.948 |
| rs4972593 | ESRD | Sandholm 2013 | 2 | 174,462,854 | T | A | 0.87 | 0.83 | 0.993 |
| Granulomatosis with polyangiitis | | | | | | | | | |
| rs1949829 | COBL | Xie 2013 | 7 | 51,537,887 | C | T | 0.95 | 0.93 | 0.937 |
| rs4862110 | DCTD | Xie 2013 | 4 | 183,751,029 | T | C | 0.81 | 0.82 | 0.534 |
| rs595018 | CCDC86 | Xie 2013 | 11 | 60,592,276 | T | C | 0.20 | 0.20 | 0.911 |
| rs7151526 | SERPINA1 | Holle 2014 | 14 | 94,863,636 | C | A | 0.96 | 0.95 | 0.703 |
| rs7503953 | WSCD1 | Xie 2013 | 17 | 6,141,677 | A | C | 0.16 | 0.15 | 0.937 |
| rs9277554 | HLA-DPB1 | Xie 2013 | 6 | 33,055,538 | C | T | 0.69 | 0.71 | 0.351 |

1) Allele frequency of allele 1 based on the dbSNP database 2) Allele frequency in the GCKD cohort AF= allele frequency, HWE= Hardy-Weinberg equilibrium. For complete citation of all references see bibliography.

Table 7. HWE p-values of SNPs that showed deviation in the GCKD population

| SNP | GCKD - all patients | GCKD controls | WTCCC controls | 1KGP controls |
|---|---|---|---|---|
| rs11666497 | 0.001 | 0.65 | 1 | 0.1 |
| rs2187668 | 0.001 | 0.98 | 0.95 | 0.97 |
| rs2412971 | 0.011 | <0.001 | 0.99 | 0.66 |
| rs1129740 | 0.001 | 0.03 | 0.9 | <0.01 |
| rs1150754 | 0.003 | 0.94 | 0.96 | 1 |

SNP: single nucleotide polymorphism, GCKD: German Chronic Kidney Disease Study, 1KGP: Thousand Genomes Project, WTCCC: Wellcome Trust Case Control Consortium

Genetic data were not available for one SNP in the 1000 Genomes Project and for another SNP in the WTCCC study. Proxies for these SNPs were therefore identified with SNiPA (Arnold, Raffler et al. 2015) and SNAP (Johnson, Handsaker et al. 2008), which estimate correlations with additional SNPs mapping into the vicinity of these variants, and were used in subsequent analyses. Table 8 shows that perfect proxies (r²=1) could be identified for both SNPs. Because they are perfectly correlated, these proxies are referred to by the original SNP names.

Table 8. SNP proxies used in the analyses

| Control group | | SNP | Allele1 | Allele2 |
|---|---|---|---|---|
| WTCCC | original | rs9275596 | 0.69 T | 0.31 C |
| | replaced by | rs3129721 | 0.69 A | 0.31 C |
| | LD: r²=1, distance: 5,380 bp | | | |
| 1KGP | original | rs6459680 | 0.75 T | 0.25 G |
| | replaced by | rs2365286 | 0.75 A | 0.25 G |
| | LD: r²=1, distance: 391 bp | | | |

WTCCC: Wellcome Trust Case Control Consortium, 1KGP: Thousand Genomes Project, LD: linkage disequilibrium, bp: basepairs
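As an illustration of the r² measure used to select proxies, the following minimal Python sketch computes r² for two biallelic SNPs from a haplotype frequency and the two allele frequencies. It is not the SNiPA or SNAP implementation, which derive these quantities from reference panel haplotypes. The function name and the input values are hypothetical, and the first example assumes that the alleles with frequency 0.31 at both SNPs always occur on the same haplotype.

```python
def ld_r_squared(f_hap, f_a, f_b):
    """Return r² between two biallelic SNPs.

    f_hap: frequency of the haplotype carrying allele a (SNP 1) and allele b (SNP 2)
    f_a, f_b: frequencies of allele a and allele b
    """
    # D measures how far the haplotype frequency departs from independence.
    d = f_hap - f_a * f_b
    return d * d / (f_a * (1 - f_a) * f_b * (1 - f_b))


# Hypothetical input: both alleles have frequency 0.31 and always co-occur.
r2_perfect_proxy = ld_r_squared(0.31, 0.31, 0.31)

# Hypothetical input: two common alleles that are statistically independent.
r2_independent = ld_r_squared(0.25, 0.5, 0.5)

print(round(r2_perfect_proxy, 6), r2_independent)
```

Under these assumptions the first pair is in perfect LD (r² of 1 up to rounding), which is the situation in which a proxy carries the same information as the original SNP.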

2.4.2 Descriptive statistics

Descriptive statistics of cohort baseline characteristics in Table 9 were derived as percentages and frequency distributions for all categorical variables, and as mean and standard deviation or median and interquartile range for continuous variables, depending on their distribution. The missing rate for all variables was below 2%. All percentages refer to the number of available values indicated in the right column of Table 9.

2.4.3 Regression analyses

Multivariable-adjusted logistic regression was used for all analyses, employing additive genetic models based on genotype dosage data to account for imputation uncertainty. Results are provided as odds ratio (OR), 95% confidence interval (CI) and p-value to assess statistical significance.
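The additive dosage model can be sketched as follows. This is a simplified illustration in Python, not the STATA code used for the thesis: it fits only an intercept and the genotype dosage (no sex or age covariates) by Newton-Raphson, and the function name and the toy data are hypothetical. The toy data are constructed so that the odds of being a case double with each additional allele.

```python
import math


def fit_additive_logistic(dosages, is_case, iterations=25):
    """Fit logit(P(case)) = b0 + b1 * dosage; return (b1, standard error of b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0           # score vector (gradient of the log-likelihood)
        h00 = h01 = h11 = 0.0   # information matrix entries
        for d, y in zip(dosages, is_case):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * d)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * d
            h00 += w
            h01 += w * d
            h11 += w * d * d
        det = h00 * h11 - h01 * h01
        # Newton step: parameters += inverse(information) * score
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Variance of b1 from the inverse information of the final iteration
    return b1, math.sqrt(h00 / det)


# Hypothetical toy data; dosages may also be fractional for imputed genotypes.
dosages = [0.0] * 100 + [1.0] * 90 + [2.0] * 40
is_case = [1] * 20 + [0] * 80 + [1] * 30 + [0] * 60 + [1] * 20 + [0] * 20

beta, se = fit_additive_logistic(dosages, is_case)
odds_ratio = math.exp(beta)
ci_low, ci_high = math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se)
p_value = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided Wald test
print(f"OR per allele = {odds_ratio:.2f} [{ci_low:.2f}-{ci_high:.2f}], p = {p_value:.1e}")
```

The odds ratio and its confidence interval are obtained by exponentiating the coefficient and its Wald limits, which is why an OR of 1.0 corresponds to no association.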

2.4.4 Covariates

All main analyses were performed with sex and age as covariates (only sex for 1KGP and WTCCC controls, see above).

2.4.5 Sensitivity and conditional analyses

An additional sensitivity analysis for UACR was performed where intake of RAAS inhibitors and presence of diabetes were considered as further covariates. In another sensitivity analysis for the results of the 1KGP control group, principal components E1-E3 were added as covariates to adjust for potential population stratification. In the conditional analysis examining the independence of the risk loci found in the HLA region on chromosome 6, additional SNPs were used as further covariates. For each CKD etiology, conditional analyses were carried out separately and all SNPs that were associated with other etiologies in the main analysis were considered as covariates. In addition, conditional analyses for nephropathy attributed to T1DM were carried out using previously reported T1DM risk loci as covariates to examine independence of the CKD-associated SNPs from previously reported SNPs that are associated with T1DM but not necessarily CKD in this setting.

2.4.6 Statistical significance

Thresholds of statistical significance were defined for each analysis using a Bonferroni correction to account for multiple testing, with n as the number of tested SNPs and based on two-sided p-values. The threshold was set to p<0.1/n when a one-sided hypothesis was tested (which also required a consistent direction of effect; it was only used when testing previously reported associations) and to p<0.05/n for two-sided hypothesis testing. The two-sided threshold was used for all analyses aiming to detect new associations. Thus, the threshold for statistical significance of two-sided hypothesis testing was set at p=0.05/55=9.1×10⁻⁴ when testing the 55 population-based risk loci, and at p=0.05/38=1.3×10⁻³ when testing the 38 CKD etiology-specific risk loci. The nominal significance threshold was set to p=0.05.
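The thresholds can be reproduced with a few lines of Python. This is only an arithmetic sketch of the rule stated above; the function name and its parameters are hypothetical.

```python
def bonferroni_threshold(n_tests, alpha=0.05, one_sided=False):
    """Return the Bonferroni-corrected threshold for two-sided p-values.

    For a one-sided hypothesis (with the additional requirement of a consistent
    effect direction), the nominal level applied to two-sided p-values is doubled.
    """
    nominal = 2 * alpha if one_sided else alpha
    return nominal / n_tests


population_based = bonferroni_threshold(55)                     # 0.05/55
etiology_two_sided = bonferroni_threshold(38)                   # 0.05/38
etiology_one_sided = bonferroni_threshold(38, one_sided=True)   # 0.1/38

for label, value in [("55 loci, two-sided", population_based),
                     ("38 loci, two-sided", etiology_two_sided),
                     ("38 loci, one-sided", etiology_one_sided)]:
    print(f"{label}: {value:.1e}")
```

The two values for the 38 etiology-specific loci, about 1.3×10⁻³ (two-sided) and 2.6×10⁻³ (one-sided), correspond to the thresholds given in the legend of Table 14.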

2.4.7 Software

PLINK (Chang, Chow et al. 2015) was used to process and clean genotype data. Principal components were derived using EIGENSOFT EIGENSTRAT SmartPCA (Patterson, Price et al. 2006). Imputation was performed using IMPUTE2 (Howie, Donnelly et al. 2009). STATA 13.0 (StataCorp., College Station, TX) was used to perform all statistical analyses related to the association of the candidate SNPs.

3 Results

3.1 Demographic data and baseline characteristics

All GCKD patients were of European ancestry, 60% of them were male. Mean age at study entrance was 60±12 years (ranging from 18 to 76 years, Figure 9). More detailed information about the GCKD study population is given in Table 9.

Figure 9. Age distribution in the GCKD cohort (histogram; x-axis: Age (years), y-axis: Percentage)

Table 9. Demographic data and baseline characteristics of the GCKD study population

| Characteristics | Value | n |
|---|---|---|
| Gender and anthropometric data | | |
| Male | 60.1 (3027) | 5,034 |
| Age, years | 60.1 (12.0) | 5,034 |
| BMI, kg/m² | 29.8 (6.0) | 4,982 |
| Blood pressure | | |
| SBP, mm Hg | 139.5 (20.4) | 5,002 |
| DBP, mm Hg | 79.2 (11.7) | |
| Smoking history | | |
| Current smokers | 16.0 (802) | |
| Former smokers | 43.3 (2173) | 5,018 |
| Never smokers | 40.7 (2043) | |
| Kidney function measures | | |
| eGFR, ml/min | 49.5 (18.1) | 4,993 |
| Serum creatinine, mg/dl | 1.5 (0.5) | 4,993 |
| UACR, mg/g – median (IQR) | 50.52 (9.53, 385.26) | 4,950 |
| <30 mg/g | 42.8 (2117) | |
| 30-300 mg/g | 29.2 (1448) | 4,950 |
| >300 mg/g | 28.0 (1385) | |
| Diabetes mellitus | | |
| Type 1 DM | 2.1 (103) | 5,034 |
| Type 2 DM | 24.5 (1231) | 5,034 |
| HbA1c, % | 6.3 (1.0) | 4,941 |
| Blood lipid measures | | |
| Total cholesterol, mg/dl | 211 (53) | 4,988 |
| LDL cholesterol, mg/dl | 118 (44) | 4,985 |
| HDL cholesterol, mg/dl | 52 (18) | 4,986 |
| Triglycerides, mg/dl – median (IQR) | 168 (118, 239) | 4,986 |

Summary data of all patients included in the analyses (n=5,034). Continuous variables are presented as mean (SD), categorical variables as % (n) unless described otherwise. BMI: Body mass index, SBP: systolic blood pressure, DBP: diastolic blood pressure.

3.2 CKD etiologies

Nephrologists were asked to define the leading cause of CKD for their patients. The most frequent leading causes among the 4,056 patients are listed below (Table 10). For 978 patients it was not possible for the treating nephrologist to clearly define a leading cause and the leading cause was therefore categorized as unknown or impossible to decide.

Table 10. Leading cause of CKD

| Disease | % (n) | with biopsy (%) |
|---|---|---|
| Nephrosclerosis | 26.8 (1,086) | 7.7 |
| Type 2 diabetes mellitus | 16.1 (653) | 4.3 |
| IgA nephropathy | 9.0 (366) | 85.8 |
| ADPKD (autosomal dominant polycystic kidney disease) | 4.2 (171) | 2.9 |
| Primary glomerular nephropathy, others | 3.7 (150) | 28.7 |
| Membranous nephropathy | 3.6 (147) | 95.2 |
| Interstitial nephritis | 3.6 (145) | 19.3 |
| FSGS (Focal-segmental glomerulosclerosis) | 3.5 (143) | 98.6 |
| Systemic lupus erythematosus | 3.2 (128) | 82.8 |
| Granulomatosis with polyangiitis | 2.9 (116) | 69.8 |
| Type 1 diabetes mellitus | 2.2 (91) | 4.4 |
| Microscopic polyangiitis | 1.6 (65) | 89.2 |
| Single kidney because of tumor nephrectomy | 1.5 (62) | 11.3 |
| Status post acute renal failure / postischemic | 1.4 (58) | 12.1 |
| Minimal change disease | 1.4 (55) | 96.4 |
| Analgesic nephropathy | 1.3 (51) | 5.9 |
| Membranoproliferative glomerulonephritis | 1.0 (41) | 95.1 |

Total: n=4,056. Case groups with >50 patients are shown as well as patients with membranoproliferative glomerulonephritis (n=41) as this disease is assumed to have a genetic component. Leading cause of CKD was considered unknown or impossible to decide in 978 patients.

3.3 GFR/UACR as kidney function measures

Since an eGFR between 30 and 60 ml/min/1.73m² was one of the inclusion criteria, eGFR in the GCKD study does not show a normal distribution. Consistent with screening criteria for study inclusion, most individuals had an eGFR in this range. Individuals with eGFR≥60 ml/min/1.73m² were included for proteinuria. Those with eGFR<30 ml/min/1.73m² at the baseline visit, on which creatinine measurements were based, had progressed between the screening and the baseline visit. Mean eGFR was 49.5±18.1 ml/min/1.73m². Estimated GFR was ≤45 ml/min/1.73m² in 2,245 (CKD stage G3b) and >45, but ≤60 ml/min/1.73m² in 1,742 patients (CKD stage G3a, Figure 10). There were 1,006 patients with an eGFR≥60 ml/min/1.73m². Estimated GFR was not available for less than 1% (n=41) of study participants.

Median UACR was 51 mg/g, with 42.8% of the values below 30 mg/g (CKD stage A1), 29.2% between 30 and 300 mg/g (A2) and 28.0% above 300 mg/g (A3). 2% of values (n=84) were missing. Figure 11 displays the distribution of log-transformed UACR values; this form of presentation was chosen because the distribution of UACR was heavily skewed toward high values.

Figure 10. Distribution of eGFR in the GCKD cohort (histogram; x-axis: eGFR (ml/min/1.73m²), y-axis: Percentage)

Figure 11. Distribution of ln(UACR) in the GCKD cohort (histogram; x-axis: ln(UACR) - mg/g, y-axis: Percentage)

eGFR: estimated glomerular filtration rate, UACR: urine albumin creatinine ratio.

3.4 Hardy-Weinberg equilibrium test

The Hardy-Weinberg equilibrium test is used in population genetics to assess whether the observed distribution of genotypes corresponds to the distribution expected from the two allele frequencies p and q in a study population. In the absence of genotyping errors, and when several assumptions such as random mating are met, genotypes in individuals without disease should be distributed as p², 2pq and q². Strong deviations of genotype distributions from HWE are often caused by genotyping errors. Therefore, the test is used as quality control. All results of the Hardy-Weinberg equilibrium tests are reported in Table 6 and Table 7 in the Methods part (see 2.4.1). Ratios of the two allele frequencies (f(GCKD)/f(literature)) were mainly around 1, ranging from 0.91 to 1.46, while the absolute difference of the frequencies (f(GCKD)-f(literature)) was consistently <0.06. In addition, deviation from HWE was tested, with the result that 89 SNPs were unremarkable and five SNPs had HWE p-values <0.05. In a stratified analysis examining HWE exclusively in the GCKD internal control group of patients, HWE p-values were >0.05 for three of these five SNPs, while two SNPs (rs1129740 and rs2412971) continued to show a deviation from HWE. In the WTCCC control group, the alleles at both SNPs were distributed across genotypes as expected, while rs1129740 also deviated from HWE in the 1KGP control group (p<0.01), making a genotyping artifact unlikely. These SNPs were tagged, but not excluded. None of the significant results were based on these two SNPs, minimizing concerns about false positive associations.
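The comparison of observed genotype counts with the expected proportions p², 2pq and q² can be sketched in Python as follows. This is an illustration only: it uses a chi-square goodness-of-fit test with one degree of freedom, which is an assumption, since the text does not state which HWE test (for example an exact test) was applied in this study. The function name and the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 degree of freedom) for Hardy-Weinberg equilibrium.

    Takes the observed counts of the three genotypes and returns
    (chi-square statistic, p-value). Assumes both alleles are observed.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts that match the expected proportions exactly
chi2_in_hwe, p_in_hwe = hwe_chi_square(25, 50, 25)

# Hypothetical counts with a complete lack of heterozygotes
chi2_deviating, p_deviating = hwe_chi_square(50, 0, 50)

print(chi2_in_hwe, p_in_hwe)
print(chi2_deviating, p_deviating)
```

A complete absence of heterozygotes, as in the second hypothetical example, is the kind of pattern that would suggest a genotyping problem rather than a true biological signal.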

3.5 Genetic associations

3.5.1 Associations of SNPs identified in population-based studies with advanced CKD

To address aim 1 of this thesis, all risk loci associated with eGFR in the general population were tested for associations with advanced CKD as defined above. Of these 55 loci, five showed significant associations (p<9.1×10⁻⁴) with CKD stage G3b and two with CKD stage A3 (Table 11, Supplementary Table 1) compared to the largest control group, the WTCCC, after correction for multiple testing. The strongest association was found between CKD stage G3b and rs4014195 near AP5B1 (WTCCC control population, OR=1.25 per G allele, p=1.2×10⁻⁷). The SNP rs12917707 near UMOD was significantly associated with CKD stage G3b using all three control groups (OR=0.71 per T allele, p=9.8×10⁻⁵ with the 1KGP control population, OR=0.81, p=2.6×10⁻⁴ with the WTCCC control population, OR=0.76, p=4.2×10⁻⁴ with the GCKD control population, Table 11). Each A allele at rs3750082 in KBTBD2 was significantly associated with lower odds of both CKD stage G3b and stage A3 using the WTCCC control group. While most associations using the external, smaller 1KGP control group were nominally significant and direction-consistent with the WTCCC control group, associations with the GCKD-internal control group were not significant although the direction of association was consistent with the external groups when different from unity (OR of 1.0). Sensitivity analyses for UACR associations were performed considering intake of RAAS inhibitors and presence of diabetes as covariates. All nominally significant associations stayed robust. In addition, these analyses did not provide new significant findings (data not shown).

Table 11. Associations of SNPs identified in population-based studies with advanced CKD (stage G3b or A3)

| SNP (Gene) | Effect allele | Chr. (Position) | GCKD controls OR [95% CI] | GCKD controls P-value | 1KGP controls OR [95% CI] | 1KGP controls P-value | WTCCC controls OR [95% CI] | WTCCC controls P-value |
|---|---|---|---|---|---|---|---|---|
| CKD stage G3b: GFR<45 ml/min/1.73m² vs. GFR≥60 ml/min/1.73m² | | | | | | | | |
| rs10794720 (WDR37) | C | 10 (1156165) | 0.98 [0.88-1.08] | 8.3×10⁻¹ | 0.67 [0.51-0.88] | 4.7×10⁻³ | 0.77 [0.67-0.90] | 7.1×10⁻⁴ |
| rs12917707 (UMOD) | T | 16 (20367690) | 0.76 [0.71-0.82] | 4.2×10⁻⁴ | 0.71 [0.60-0.84] | 9.8×10⁻⁵ | 0.81 [0.73-0.91] | 2.6×10⁻⁴ |
| rs2802729 (SDCCAG8) | A | 1 (243501763) | 1.04 [0.98-1.10] | 5.3×10⁻¹ | 1.24 [1.08-1.43] | 2.9×10⁻³ | 1.15 [1.06-1.25] | 8.2×10⁻⁴ |
| rs3750082 (KBTBD2) | A | 7 (32919927) | 1.07 [1.00-1.14] | 2.9×10⁻¹ | 0.83 [0.72-0.96] | 9.8×10⁻³ | 0.85 [0.78-0.92] | 1.1×10⁻⁴ |
| rs4014195 (AP5B1) | G | 11 (65506822) | 1.07 [1.00-1.13] | 2.9×10⁻¹ | 1.16 [1.00-1.34] | 4.6×10⁻² | 1.25 [1.15-1.36] | 1.2×10⁻⁷ |
| CKD stage A3: UACR≥300 mg/g vs. UACR<300 mg/g | | | | | | | | |
| rs2453580 (SLC47A1) | C | 17 (19438321) | 1.14 [1.04-1.25] | 4.4×10⁻³ | 1.13 [0.97-1.30] | 1.2×10⁻¹ | 1.20 [1.08-1.32] | 3.7×10⁻⁴ |
| rs3750082 (KBTBD2) | A | 7 (32919927) | 0.98 [0.89-1.07] | 6.2×10⁻¹ | 0.83 [0.71-0.97] | 1.9×10⁻² | 0.84 [0.76-0.93] | 8.3×10⁻⁴ |

SNP: single nucleotide polymorphism, GCKD: German Chronic Kidney Disease Study, 1KGP: Thousand Genomes Project, WTCCC: Wellcome Trust Case Control Consortium, GFR: glomerular filtration rate, UACR: urine albumin-to-creatinine ratio, OR: Odds ratio, CI: Confidence interval. Only associations that were statistically significant in at least one control group and at least nominally significant in a second control group are listed; for complete results see Supplementary Table 1. The significance threshold was set at 9.1×10⁻⁴ (Bonferroni correction, two-sided test). eGFR<45 ml/min/1.73m² cases: n=2,245, UACR≥300 mg/g cases: n=1,385, eGFR≥60 ml/min/1.73m² GCKD controls: n=1,006, UACR GCKD controls: n=3,565, 1KGP controls: n=503, WTCCC controls: n=2,597.

3.5.2 Associations of SNPs identified in population-based studies with hypertensive and diabetic kidney disease

In a second set of analyses, associations between the population-based risk loci and CKD for which the treating nephrologists had determined type 2 diabetes mellitus (n=653) or hypertension (n=1,086) as the leading cause were examined, because these etiologies represent the majority of CKD cases in population-based studies. No significant association was detected for diabetic kidney disease with any of the three control groups. Conversely, index variants at UMOD, KBTBD2 and AP5B1 showed significant association (p<9.1×10⁻⁴) with hypertensive nephropathy in comparison to the external control groups, but not the GCKD control group, although the direction was consistent with the external control groups (Table 12, Supplementary Table 2). The strongest association was observed for rs4014195 near AP5B1 (WTCCC control population, OR=1.41, p=1.1×10⁻¹⁰).

Table 12. Associations of SNPs identified in population-based studies with hypertensive nephropathy (nephrosclerosis)

| SNP (Gene) | Effect allele | Chr. (Position) | GCKD controls OR [95% CI] | GCKD controls P-value | 1KGP controls OR [95% CI] | 1KGP controls P-value | WTCCC controls OR [95% CI] | WTCCC controls P-value |
|---|---|---|---|---|---|---|---|---|
| rs12917707 (UMOD) | T | 16 (20367690) | 0.90 [0.73-1.11] | 3.3×10⁻¹ | 0.69 [0.56-0.83] | 1.7×10⁻⁴ | 0.77 [0.67-0.89] | 3.4×10⁻⁴ |
| rs3750082 (KBTBD2) | A | 7 (32919927) | 0.89 [0.76-1.04] | 1.3×10⁻¹ | 0.79 [0.67-0.92] | 3.4×10⁻³ | 0.81 [0.72-0.90] | 9.7×10⁻⁵ |
| rs4014195 (AP5B1) | G | 11 (65506822) | 1.21 [1.04-1.41] | 1.5×10⁻² | 1.31 [1.12-1.53] | 9.0×10⁻⁴ | 1.41 [1.27-1.57] | 1.1×10⁻¹⁰ |

SNP: single nucleotide polymorphism, GCKD: German Chronic Kidney Disease Study, 1KGP: Thousand Genomes Project, WTCCC: Wellcome Trust Case Control Consortium, OR: Odds ratio, CI: Confidence interval. Only associations that were statistically significant in at least one control group and at least nominally significant in a second control group are listed; for complete results see Supplementary Table 2. The significance threshold was set at 9.1×10⁻⁴ (Bonferroni correction, two-sided test). Hypertensive nephropathy cases: n=1,086, GCKD controls: n=569 (control group with non-genetic causes of CKD excluding nephrosclerosis), 1KGP controls: n=503, WTCCC controls: n=2,597.

3.5.3 Associations of SNPs with specific CKD etiologies

3.5.3.1 Associations of CKD etiology-specific risk variants with the previously reported CKD etiology

To address aim 2 of the thesis, we tested known CKD etiology-specific risk loci for association with additional CKD etiologies across the broad spectrum of CKD etiologies available in the GCKD study. Analyses were performed for each of the three control groups described in the Methods part. Reported results mainly refer to the analyses with the internal GCKD control group as it is the control group that best resembles the case groups (same geographic origin, same genotyping chip and lab, CKD patients). First, associations of previously reported CKD etiology-specific risk loci with the reported CKD etiology were examined. For instance, SNPs previously reported as associated with IgA nephropathy were evaluated for association with IgA nephropathy in the GCKD study. Six risk loci showed significant associations after correction for multiple testing (Table 13). The limited number of significant loci is not unexpected, as many of the loci had been found and reported through meta-analyses assembling a much larger number of cases. Twelve of the loci showed nominal association (p<0.05). Loci that showed nominally significant associations but were not significant after correction for multiple testing were only assessed for consistent effect direction compared to the one reported previously. They were not used for subsequent analyses as the statistical power of their association in the GCKD study alone was too low.

Table 13. Associations of known risk loci for specific CKD etiologies with the corresponding CKD etiology in the GCKD study

| SNP | Gene | Effect allele | Chromosome | Position | CKD etiology | OR [95% CI] | P-value |
|---|---|---|---|---|---|---|---|
| rs2187668 | HLA-DQA1 | T | 6 | 32,605,884 | MN | 4.48 [3.32-6.11] | 2.4×10⁻²² |
| rs4664308 | PLA2R1 | G | 2 | 160,917,497 | MN | 0.45 [0.34-0.60] | 6.7×10⁻⁸ |
| rs11150612 | ITGAM-ITGAX | A | 16 | 31,357,760 | IgA | 1.14 [0.95-1.36] | 1.6×10⁻¹ |
| rs11574637 | ITGAM-ITGAX | C | 16 | 31,368,874 | IgA | 0.76 [0.59-0.98] | 3.2×10⁻² |
| rs12716641 | DEFA | C | 8 | 6,898,998 | IgA | 0.89 [0.74-1.07] | 2.0×10⁻¹ |
| rs17019602 | VAV3 | G | 1 | 108,188,858 | IgA | 1.05 [0.85-1.30] | 6.5×10⁻¹ |
| rs1794275 | HLA-DQA/B | A | 6 | 32,671,248 | IgA | 1.23 [0.98-1.54] | 7.0×10⁻² |
| rs1883414 | HLA-DPB2 | A | 6 | 33,086,448 | IgA | 0.88 [0.72-1.07] | 2.0×10⁻¹ |
| rs2033562 | KLF10/ODF1 | C | 8 | 103,547,739 | IgA | 1.15 [0.96-1.39] | 1.4×10⁻¹ |
| rs2074038 | ACCS | T | 11 | 44,087,989 | IgA | 1.17 [0.89-1.54] | 2.7×10⁻¹ |
| rs2412971 | HORMAD2/MTMR3 | A | 22 | 30,494,371 | IgA | 1.14 [0.96-1.35] | 1.3×10⁻¹ |
| rs2523946 | HLA-A | T | 6 | 29,941,943 | IgA | 1.15 [0.97-1.38] | 1.1×10⁻¹ |
| rs2738048 | DEFA | G | 8 | 6,822,785 | IgA | 0.76 [0.63-0.92] | 5.6×10⁻³ |
| rs3115573 | HLA region | G | 6 | 32,218,843 | IgA | 1.25 [1.05-1.48] | 1.3×10⁻² |
| rs3803800 | TNFSF13 | G | 17 | 7,462,969 | IgA | 0.88 [0.72-1.09] | 2.6×10⁻¹ |
| rs4077515 | CARD9 | T | 9 | 139,266,496 | IgA | 1.15 [0.96-1.39] | 1.2×10⁻¹ |
| rs660895 | HLA-DRB1 | G | 6 | 32,577,380 | IgA | 1.09 [0.86-1.37] | 4.7×10⁻¹ |
| rs6677604 | CFHR1,3 | A | 1 | 196,686,918 | IgA | 0.69 [0.54-0.87] | 1.5×10⁻³ |
| rs7634389 | ST6GAL1 | C | 3 | 186,738,421 | IgA | 1.30 [1.08-1.57] | 5.9×10⁻³ |
| rs7763262 | HLA-DR–HLA-DQ | C | 6 | 32,424,882 | IgA | 1.28 [1.05-1.55] | 1.3×10⁻² |
| rs9275596 | HLA-DQB1 | T | 6 | 32,681,631 | IgA | 1.20 [1.00-1.45] | 5.3×10⁻² |
| rs9314614 | DEFA | G | 8 | 6,697,731 | IgA | 1.00 [0.84-1.20] | 9.9×10⁻¹ |
| rs9357155 | TAP1/2/PSMB8/9 | A | 6 | 32,809,848 | IgA | 0.99 [0.75-1.32] | 9.7×10⁻¹ |
| rs1129740 | HLA-DQA1 | A | 6 | 32,609,105 | SSNS | - | - |
| rs10488631 | TNPO3 | C | 7 | 128,594,183 | SLE | 1.49 [0.98-2.27] | 6.1×10⁻² |
| rs1150754 | TNXB | T | 6 | 32,050,758 | SLE | 1.97 [1.37-2.83] | 2.8×10⁻⁴ |
| rs4963128 | KIAA1542 | C | 11 | 589,564 | SLE | 0.94 [0.69-1.30] | 7.3×10⁻¹ |
| rs6445975 | PXK | T | 3 | 58,370,177 | SLE | 0.94 [0.68-1.31] | 7.3×10⁻¹ |
| rs7574865 | STAT4 | G | 2 | 191,964,633 | SLE | 0.53 [0.39-0.73] | 9.7×10⁻⁵ |
| rs9888739 | ITGAM | T | 16 | 31,313,253 | SLE | 1.60 [1.10-2.34] | 1.4×10⁻² |
| rs12437854 | ESRD | G | 15 | 94,141,833 | T1DM | 0.88 [0.47-1.68] | 7.1×10⁻¹ |
| rs4972593 | ESRD | A | 2 | 174,462,854 | T1DM | 1.29 [0.88-1.89] | 1.9×10⁻¹ |
| rs1949829 | COBL | T | 7 | 51,537,887 | GPA | 0.75 [0.41-1.39] | 3.6×10⁻¹ |
| rs4862110 | DCTD | C | 4 | 183,751,029 | GPA | 1.08 [0.78-1.49] | 6.6×10⁻¹ |
| rs595018 | CCDC86 | C | 11 | 60,592,276 | GPA | 1.00 [0.72-1.41] | 9.8×10⁻¹ |
| rs7151526 | SERPINA1 | A | 14 | 94,863,636 | GPA | 1.73 [0.99-3.00] | 5.3×10⁻² |
| rs7503953 | WSCD1 | C | 17 | 6,141,677 | GPA | 0.93 [0.63-1.40] | 7.4×10⁻¹ |
| rs9277554 | HLA-DPB1 | T | 6 | 33,055,538 | GPA | 0.14 [0.08-0.25] | 1.7×10⁻¹¹ |

SNP: single nucleotide polymorphism, CKD: chronic kidney disease, MN: Membranous nephropathy, IgA: IgA nephropathy, SSNS: steroid sensitive nephrotic syndrome, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval.

For membranous nephropathy, two known risk loci were analyzed and both showed significant and direction-consistent associations. SNP rs2187668 at HLA-DQA1 (OR=4.48 compared to OR=4.32 reported by Stanescu 2011) had the strongest association (p=2.4×10⁻²²). SNP rs4664308 in PLA2R1, located on chromosome 2, was also significantly associated (OR=0.45 compared to OR=0.44 reported by Stanescu 2011, p=6.7×10⁻⁸). We tested 21 known IgA risk loci for association. SNP rs6677604 at CFHR1,3 showed a statistically significant association after correction for multiple testing. The OR was similar to that previously described (0.69 in the GCKD study vs. 0.68 reported by Gharavi 2011). Five further loci were nominally significant (p<0.05) and showed direction-consistent signals with odds ratios comparable to previous reports. Of six loci tested for SLE, two were statistically significant after correction for multiple testing and direction-consistent: rs1150754 at TNXB (OR=1.97, Chung 2011: OR=2.21) and rs7574865 at STAT4 (OR=0.53, Chung 2011: OR=0.56), while rs9888739 at ITGAM showed a nominally significant association in the expected direction. Six known risk loci were analyzed for GPA; rs9277554 at HLA-DPB1 was the only one significantly associated after correction for multiple testing in the GCKD cohort and was direction-consistent (OR=0.14 in comparison to OR=0.22 reported by Xie 2013). For T1DM two known risk loci were tested, but did not show significant association. Steroid-sensitive nephrotic syndrome (SSNS) was not tested because this CKD etiology is not represented in the GCKD study. Minimal change disease may be the most comparable etiology in the GCKD study, but no significant SNP association was found. No specific risk loci were previously identified through GWAS of autosomal dominant polycystic kidney disease (ADPKD), membranoproliferative glomerulonephritis (MPGN), focal-segmental glomerulosclerosis (FSGS) and microscopic polyangiitis (MPA)/hemolytic-uremic syndrome (HUS). Accordingly, no risk loci for these etiologies were evaluated in this study.

3.5.3.2 Associations of CKD etiology-specific risk variants across different CKD etiologies

As the next part of the analysis, associations of these etiology-specific risk loci with etiologies other than those previously reported were examined to identify genetic risk variants shared across CKD etiologies. For membranous nephropathy, three loci known to be associated with a CKD etiology other than MN showed significant associations after correction for multiple testing. All of them were located in the HLA region. SNP rs9275596 (HLA-DQB1, p=3.3×10⁻⁷) had been described as an IgA nephropathy locus (Gharavi, Kiryluk et al. 2011), rs1129740 (HLA-DQA1, p=8.4×10⁻⁵) as a risk locus for SSNS (Gbadegesin, Adeyemo et al. 2015), and rs1150754 (TNXB, p=1.9×10⁻¹¹) as a risk locus for SLE (Chung, Taylor et al. 2011). For IgA nephropathy, no variants originally identified for other CKD etiologies were associated. Three risk loci originally reported as associated with another CKD etiology showed association with SLE: the MN locus rs2187668 (HLA-DQA1, p=5.9×10⁻⁶) and the two IgA loci rs7763262 (HLA-DR–HLA-DQ, p=3.0×10⁻⁴) and rs9275596 (HLA-DQB1, p=3.4×10⁻⁴). Another IgA locus, rs660895 (HLA-DRB1), was significantly associated with GPA (p=2.0×10⁻⁴) in this study. For T1DM three additional associations were found. The strongest association was between the IgA locus rs660895 (HLA-DRB1) and CKD attributed to T1DM (p=4.6×10⁻¹¹). Furthermore, the SSNS locus rs1129740 (HLA-DQA1, p=1.1×10⁻⁵) and the SLE locus rs1150754 (TNXB, p=2.5×10⁻⁷) were significantly associated with T1DM. There were no SNPs significantly associated with ADPKD, MPGN, FSGS, minimal change GN and MPA/HUS (data not shown).

The associations estimated using the GCKD internal control group were then compared to those using the external control groups (data not shown). All significant associations were also observed in at least one, and mostly in both, external control groups. To exclude false positive results due to population stratification, a sensitivity analysis was performed for the 1KGP control group using the principal components E1 to E3 as covariates to capture variation of geographic origin. The effect sizes did not differ by more than 10% from the main results prior to adjustment, except for three SNP effects on MN (difference 14-16%, with changes in both directions; data not shown). These analyses suggest that population stratification is unlikely to give rise to the observed significant associations. Figure 12 and Table 14 display all significant associations in the GCKD cohort across the different CKD etiologies and already take into account the results of the conditional analyses (see below), while the complete results are shown in Supplementary Table 3.


Figure 12. SNP associations across different CKD etiologies in the GCKD cohort

While some SNPs are associated with only one CKD etiology, others are associated with more than one. Similar to Table 14, only independent SNP associations are depicted. MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, gene names in italic.

Table 14. Associations between CKD etiology specific SNPs and other CKD etiologies

| SNP (Gene) | Effect allele | Chr. (Position) | Known locus for | IgA: OR [95% CI]; P-value | MN: OR [95% CI]; P-value | SLE: OR [95% CI]; P-value | T1DM: OR [95% CI]; P-value | GPA: OR [95% CI]; P-value |
|---|---|---|---|---|---|---|---|---|
| rs6677604 (CFHR1,3) | A | 1 (196686918) | IgA | 0.69 [0.54-0.87]; 1.5x10-3 | n.s. | n.s. | n.s. | n.s. |
| rs4664308 (PLA2R1) | G | 2 (160917497) | MN | n.s. | 0.45 [0.34-0.60]; 6.7x10-8 | n.s. | n.s. | n.s. |
| rs7574865 (STAT4) | G | 2 (191964633) | SLE | n.s. | n.s. | 0.53 [0.39-0.73]; 9.7x10-5 | n.s. | n.s. |
| rs1150754 (TNXB) | T | 6 (32050758) | SLE | n.s. | 2.77 [2.06-3.71]; 1.9x10-11 | 1.97 [1.37-2.83]; 2.8x10-4 | 2.53 [1.78-3.60]; 2.5x10-7 | n.s. |
| rs7763262 (HLA-DR-HLA-DQ) | C | 6 (32424882) | IgA | n.s. | n.s. | 0.57 [0.42-0.77]; 3.0x10-4 | n.s. | n.s. |
| rs660895 (HLA-DRB1) | G | 6 (32577380) | IgA | n.s. | n.s. | n.s. | 3.00 [2.17-4.18]; 4.6x10-11 | 1.81 [1.32-2.47]; 2.0x10-4 |
| rs2187668 (HLA-DQA1) | T | 6 (32605884) | MN | n.s. | 4.48 [3.32-6.11]; 2.4x10-22 | 2.36 [1.63-3.42]; 5.9x10-6 | n.s. | n.s. |
| rs1129740 (HLA-DQA1) | A | 6 (32609105) | SSNS | n.s. | 1.69 [1.30-2.19]; 8.4x10-5 | n.s. | 2.13 [1.52-2.97]; 1.1x10-5 | n.s. |
| rs9275596 (HLA-DQB1) | T | 6 (32681631) | IgA | n.s. | 0.52 [0.41-0.67]; 3.3x10-7 | 0.57 [0.41-0.77]; 3.4x10-4 | n.s. | n.s. |
| rs9277554 (HLA-DPB1) | T | 6 (33055538) | GPA | n.s. | n.s. | n.s. | n.s. | 0.14 [0.08-0.25]; 1.7x10-11 |

MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval, n.s.: not significant. Bold: significant and independent association, italic: previously known association, simple style: significant, but dependent association. For independence tests see Table 15. Results show associations using the GCKD internal control group that were confirmed in the external control groups. Significance threshold was set at 2.6x10-3 for known etiology-specific SNPs when examined for associations with the same etiology (Bonferroni correction, one-sided), and at 1.3x10-3 for the others (two-sided test).

3.5.3.3 Conditional analysis of independence of CKD etiology specific SNPs

Several of the significantly associated SNPs mapped into the HLA region and may thus not be completely independent of each other. To examine their dependency, conditional analyses were performed for each disease separately, including the other significant SNPs as covariates in all combinations (Table 15). Associations were defined as independent when ORs did not vary by more than 20% from the original values across all conditional analyses and when p-values remained statistically significant throughout.

Of the four SNPs examined for MN, rs2187668 had the largest effect size and the strongest association, and acted independently of the other SNPs in the region. The effects of all other SNPs disappeared when their associations were adjusted for genotype at rs2187668.

In the case of SLE, none of the SNP associations remained statistically significant once the index SNP (the one with the lowest p-value) was included in the model. Thus, there were no further SNPs in the region that contributed information beyond the index SNP, rs2187668. The odds ratios of most SNPs associated with SLE varied by more than 20% in the conditional analyses. As p-values were only slightly below the significance threshold in the unconditional analyses, significance was lost when further covariates were added. No association fulfilled the criteria of independence defined above, but the association of rs2187668 with SLE was the strongest in the unadjusted analyses, and OR changes were limited to <20% when the association was adjusted for other SNPs in the HLA region. Thus, this SNP can be seen as the “leading SNP” for SLE in this region.

For GPA, the odds ratios and p-values of both rs9277554 and rs660895 did not change in conditional analyses when adjusting for the respective other SNP, so both SNPs can be considered independent risk variants for this CKD etiology.

Concerning T1DM, two of the three associated SNPs showed an independent signal in the conditional analyses: rs1150754 and rs660895. The association of rs1129740 disappeared when conditioned on genotype at the other two variants.
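The independence criterion described above can be sketched in a few lines of Python. This is only an illustration: the function name `is_independent` is hypothetical, and the relative change is computed here as |OR_conditional − OR_unadjusted| / OR_unadjusted, which is an assumption because the text does not state exactly how the 20% variation was calculated. The input values are the T1DM results from Table 15, tested against the 1.3x10-3 threshold.

```python
def is_independent(unadjusted_or, conditional_results, p_threshold, max_rel_change=0.20):
    """Check the two criteria named in the text for one SNP.

    conditional_results: list of (conditional OR, p-value) pairs, one per covariate set.
    The relative-change formula below is an assumed reading of "vary by more than 20%".
    """
    for cond_or, p_value in conditional_results:
        rel_change = abs(cond_or - unadjusted_or) / unadjusted_or
        if rel_change > max_rel_change or p_value >= p_threshold:
            return False
    return True


# T1DM rows of Table 15: unadjusted OR, then (OR, p-value) for each covariate set
t1dm = {
    "rs1150754": (2.53, [(2.75, 7.2e-8), (2.12, 5.5e-5), (2.62, 1.0e-6)]),
    "rs660895": (3.00, [(3.19, 1.5e-11), (2.47, 1.2e-6), (2.93, 3.6e-8)]),
    "rs1129740": (2.13, [(1.85, 5.4e-4), (1.52, 3.0e-2), (1.19, 4.0e-1)]),
}

verdicts = {
    snp: is_independent(or0, conds, p_threshold=1.3e-3)
    for snp, (or0, conds) in t1dm.items()
}
print(verdicts)
```

Under these assumptions the sketch classifies rs1150754 and rs660895 as independent and rs1129740 as dependent, in line with the description of the T1DM results.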


Table 15. Conditional analyses for independence of CKD etiology specific SNP signals

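| Etiology | SNP | Covariates | OR [95% CI] | p-value |
|---|---|---|---|---|
| MN | rs1150754 | - | 2.77 [2.06-3.71] | 1.9x10-11 |
| MN | rs1150754 | rs2187668 | 0.81 [0.51-1.30] | 3.9x10-1 |
| MN | rs1150754 | rs1129740 | 2.50 [1.84-3.39] | 4.0x10-9 |
| MN | rs1150754 | rs9275596 | 2.28 [1.65-3.14] | 6.1x10-7 |
| MN | rs1150754 | rs2187668, rs1129740 | 0.82 [0.51-1.30] | 3.9x10-1 |
| MN | rs1150754 | rs2187668, rs9275596 | 0.82 [0.51-1.30] | 3.9x10-1 |
| MN | rs1150754 | rs1129740, rs9275596 | 1.41 [0.95-2.10] | 8.4x10-2 |
| MN | rs1150754 | rs2187668, rs1129740, rs9275596 | 0.82 [0.51-1.30] | 3.9x10-1 |
| MN | rs2187668 | - | 4.48 [3.32-6.11] | 2.4x10-22 |
| MN | rs2187668 | rs1150754 | 5.21 [3.32-8.19] | 8.1x10-13 |
| MN | rs2187668 | rs1129740 | 4.29 [3.09-5.96] | 4.1x10-18 |
| MN | rs2187668 | rs9275596 | 4.67 [3.16-6.90] | 9.6x10-15 |
| MN | rs2187668 | rs1150754, rs1129740 | 4.97 [3.10-7.96] | 2.7x10-11 |
| MN | rs2187668 | rs1150754, rs9275596 | 5.39 [3.24-8.97] | 8.6x10-11 |
| MN | rs2187668 | rs1129740, rs9275596 | 4.11 [2.36-7.17] | 6.2x10-7 |
| MN | rs2187668 | rs1150754, rs1129740, rs9275596 | 4.74 [2.49-9.04] | 2.2x10-6 |
| MN | rs1129740 | - | 1.69 [1.30-2.19] | 8.4x10-5 |
| MN | rs1129740 | rs1150754 | 1.44 [1.10-1.90] | 8.2x10-3 |
| MN | rs1129740 | rs2187668 | 1.11 [0.82-1.49] | 4.9x10-1 |
| MN | rs1129740 | rs9275596 | 2.57 [1.92-3.45] | 2.8x10-10 |
| MN | rs1129740 | rs1150754, rs2187668 | 1.11 [0.82-1.49] | 5.0x10-1 |
| MN | rs1129740 | rs1150754, rs9275596 | 2.20 [1.57-3.08] | 4.7x10-6 |
| MN | rs1129740 | rs2187668, rs9275596 | 1.14 [0.76-1.70] | 5.3x10-1 |
| MN | rs1129740 | rs1150754, rs2187668, rs9275596 | 1.14 [0.76-1.70] | 5.3x10-1 |
| MN | rs9275596 | - | 0.52 [0.41-0.67] | 3.3x10-7 |
| MN | rs9275596 | rs1150754 | 0.67 [0.51-0.88] | 3.7x10-3 |
| MN | rs9275596 | rs2187668 | 1.06 [0.75-1.48] | 7.5x10-1 |
| MN | rs9275596 | rs1129740 | 0.36 [0.27-0.47] | 6.2x10-13 |
| MN | rs9275596 | rs1150754, rs2187668 | 1.05 [0.75-1.47] | 7.7x10-1 |
| MN | rs9275596 | rs1150754, rs1129740 | 0.43 [0.31-0.61] | 2.0x10-6 |
| MN | rs9275596 | rs2187668, rs1129740 | 0.96 [0.61-1.51] | 8.5x10-1 |
| MN | rs9275596 | rs1150754, rs2187668, rs1129740 | 0.95 [0.60-1.51] | 8.4x10-1 |
| SLE | rs1150754 | - | 1.97 [1.37-2.83] | 2.8x10-4 |
| SLE | rs1150754 | rs7763262 | 1.63 [1.10-2.40] | 1.4x10-2 |
| SLE | rs1150754 | rs2187668 | 1.15 [0.67-1.99] | 6.1x10-1 |
| SLE | rs1150754 | rs9275596 | 1.64 [1.12-2.42] | 1.2x10-2 |
| SLE | rs1150754 | rs7763262, rs2187668 | 1.07 [0.61-1.88] | 8.1x10-1 |
| SLE | rs1150754 | rs7763262, rs9275596 | 1.60 [1.08-2.36] | 1.9x10-2 |
| SLE | rs1150754 | rs2187668, rs9275596 | 1.17 [0.68-2.02] | 5.8x10-1 |
| SLE | rs1150754 | rs7763262, rs2187668, rs9275596 | 1.09 [0.61-1.92] | 7.8x10-1 |
| SLE | rs7763262 | - | 0.57 [0.42-0.77] | 3.0x10-4 |
| SLE | rs7763262 | rs1150754 | 0.65 [0.47-0.90] | 1.0x10-2 |
| SLE | rs7763262 | rs2187668 | 0.70 [0.50-0.99] | 4.1x10-2 |
| SLE | rs7763262 | rs9275596 | 0.72 [0.44-1.18] | 1.9x10-1 |
| SLE | rs7763262 | rs1150754, rs2187668 | 0.70 [0.50-0.99] | 4.6x10-2 |
| SLE | rs7763262 | rs1150754, rs9275596 | 0.79 [0.47-1.32] | 3.6x10-1 |
| SLE | rs7763262 | rs2187668, rs9275596 | 0.73 [0.44-1.21] | 2.2x10-1 |
| SLE | rs7763262 | rs1150754, rs2187668, rs9275596 | 0.74 [0.44-1.24] | 2.6x10-1 |
| SLE | rs2187668 | - | 2.36 [1.63-3.42] | 5.9x10-6 |
| SLE | rs2187668 | rs1150754 | 2.12 [1.22-3.70] | 8.0x10-3 |
| SLE | rs2187668 | rs7763262 | 1.96 [1.30-2.95] | 1.3x10-3 |
| SLE | rs2187668 | rs9275596 | 1.94 [1.26-3.00] | 2.7x10-3 |
| SLE | rs2187668 | rs1150754, rs7763262 | 1.86 [1.04-3.33] | 3.6x10-2 |
| SLE | rs2187668 | rs1150754, rs9275596 | 1.72 [0.94-1.36] | 8.0x10-2 |
| SLE | rs2187668 | rs7763262, rs9275596 | 1.93 [1.25-2.97] | 3.0x10-3 |
| SLE | rs2187668 | rs1150754, rs7763262, rs9275596 | 1.81 [0.97-3.37] | 6.1x10-2 |
| SLE | rs9275596 | - | 0.57 [0.41-0.77] | 3.4x10-4 |
| SLE | rs9275596 | rs1150754 | 0.65 [0.46-0.90] | 1.0x10-2 |
| SLE | rs9275596 | rs7763262 | 0.74 [0.44-1.22] | 2.3x10-1 |
| SLE | rs9275596 | rs2187668 | 0.74 [0.51-1.06] | 1.0x10-1 |
| SLE | rs9275596 | rs1150754, rs7763262 | 0.78 [0.46-1.31] | 3.4x10-1 |
| SLE | rs9275596 | rs1150754, rs2187668 | 0.73 [0.51-1.06] | 9.8x10-2 |
| SLE | rs9275596 | rs7763262, rs2187668 | 0.95 [0.55-1.62] | 8.4x10-1 |
| SLE | rs9275596 | rs1150754, rs7763262, rs2187668 | 0.93 [0.54-1.62] | 8.0x10-1 |
| T1DM | rs1150754 | - | 2.53 [1.78-3.60] | 2.5x10-7 |
| T1DM | rs1150754 | rs660895 | 2.75 [1.91-3.98] | 7.2x10-8 |
| T1DM | rs1150754 | rs1129740 | 2.12 [1.47-3.05] | 5.5x10-5 |
| T1DM | rs1150754 | rs660895, rs1129740 | 2.62 [1.78-3.85] | 1.0x10-6 |
| T1DM | rs660895 | - | 3.00 [2.17-4.18] | 4.6x10-11 |
| T1DM | rs660895 | rs1150754 | 3.19 [2.28-4.46] | 1.5x10-11 |
| T1DM | rs660895 | rs1129740 | 2.47 [1.72-3.56] | 1.2x10-6 |
| T1DM | rs660895 | rs1150754, rs1129740 | 2.93 [2.00-4.30] | 3.6x10-8 |
| T1DM | rs1129740 | - | 2.13 [1.52-2.97] | 1.1x10-5 |
| T1DM | rs1129740 | rs1150754 | 1.85 [1.30-2.62] | 5.4x10-4 |
| T1DM | rs1129740 | rs660895 | 1.52 [1.04-2.21] | 3.0x10-2 |
| T1DM | rs1129740 | rs1150754, rs660895 | 1.19 [0.80-1.77] | 4.0x10-1 |
| GPA | rs660895 | - | 1.81 [1.32-2.47] | 2.0x10-4 |
| GPA | rs660895 | rs9277554 | 1.73 [1.26-2.38] | 7.3x10-4 |
| GPA | rs9277554 | - | 0.14 [0.08-0.25] | 1.7x10-11 |
| GPA | rs9277554 | rs660895 | 0.14 [0.08-0.25] | 2.6x10-11 |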

MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval.

Reading example for Table 15: The results of the conditional analyses for all SNPs associated with MN are given first. One of these SNPs was rs1150754, with an OR of 2.77 and p=1.9x10-11 without considering effects of further SNPs (first line). Next, additional SNPs were added as covariates to the analyses, e.g. rs2187668 in the second line. The OR of rs1150754 adjusted for rs2187668 is now 0.81 with p=3.9x10-1. Thus, it can be concluded that the signal of rs1150754 is not a new and independent association, but is caused by its strong linkage to rs2187668, which itself is known to be strongly associated with MN. In the following lines, the OR and p-value of rs1150754 are given for conditional analyses using rs1129740 and rs9275596 as covariates. Finally, analyses were performed with two and three SNPs as covariates, and results are given as described before.

3.5.3.4 Conditional analysis to assess independence of risk variants for T1DM-attributed CKD and T1DM

The analysis described above showed associations of SNPs with CKD from T1DM. In a further analysis, we addressed the question of whether these associations remained robust in effect size and direction when adjusting for additional previously reported risk variants mapping into the HLA region for T1DM, but not for CKD from T1DM (Hakonarson, Grant et al. 2007; Cooper, Smyth et al. 2008; Barrett, Clayton et al. 2009; Grant, Qu et al. 2009; Bradfield, Qu et al. 2011; Tomer, Dolan et al. 2015). Results are shown in Table 16. The associations of the two SNPs found in this study with CKD attributed to T1DM were independent of the previously reported T1DM SNPs. Thus, they seem to represent newly identified, independent markers associated with CKD from T1DM beyond previously known associations with T1DM itself. This analysis is important because the GCKD study is a case-only study that did not recruit patients with T1DM but without CKD.

Table 16. Conditional analyses of SNPs associated with CKD from T1DM and previously known T1DM SNPs

T1DM: Type 1 diabetes mellitus, OR: Odds ratio

3.5.3.5 Linkage disequilibrium calculations

Most SNPs associated with CKD attributed to specific etiologies were located in the HLA region on chromosome 6. Only three SNPs were located on other chromosomes. To evaluate the correlation of the variants on chromosome 6 and to complement the conditional analyses with further information on their independence, linkage disequilibrium (LD) was calculated as a measure of correlation for all SNPs on chromosome 6 with significant associations with any CKD etiology in the GCKD study. LD indicates whether a combination of alleles at two SNPs appears more frequently than would be expected from their allele frequencies if the SNPs were independent. This usually occurs if two SNPs are located close to each other (<1 Mb (megabase) apart on the same chromosome). As shown in Table 17, there were three pairs of moderately correlated SNPs (r²>0.2): rs2187668 and rs1150754 (r²=0.49), rs2187668 and rs9275596 (r²=0.23), and rs9275596 and rs7763262 (r²=0.61). SNP pairs reported in the previous paragraphs as independent consistently showed small r² values in this analysis as well (r²<0.20).

Table 17. Linkage disequilibrium of selected SNPs in the HLA region in the GCKD cohort

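Upper-right triangle: r²; lower-left triangle: D'.

| | rs1150754 | rs7763262 | rs660895 | rs2187668 | rs1129740 | rs9275596 | rs9277554 | rs1883414 |
|---|---|---|---|---|---|---|---|---|
| rs1150754 | - | 0.14 | <0.01 | 0.49 | 0.05 | 0.12 | 0.01 | <0.01 |
| rs7763262 | 0.66 | - | 0.10 | 0.15 | 0.14 | 0.61 | <0.01 | <0.01 |
| rs660895 | 0.35 | 0.95 | - | 0.03 | 0.16 | 0.11 | <0.01 | <0.01 |
| rs2187668 | 0.74 | 0.74 | 1 | - | 0.11 | 0.23 | 0.02 | 0.01 |
| rs1129740 | 0.64 | 0.44 | 1 | 0.99 | - | 0.13 | <0.01 | <0.01 |
| rs9275596 | 0.64 | 0.82 | 1 | 0.97 | 0.42 | - | 0.01 | <0.01 |
| rs9277554 | 0.16 | 0.06 | 0.01 | 0.2 | 0.05 | 0.09 | - | 0.15 |
| rs1883414 | 0.25 | 0.05 | 0.03 | 0.4 | 0.01 | 0.12 | 0.42 | - |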

Linkage disequilibrium of all SNPs on chromosome 6 that were significantly associated with one or more specific CKD etiologies. D' and r² are two measures of LD and provide complementary information.
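To illustrate why D' and r² provide complementary information, the following minimal Python sketch computes both measures from allele and haplotype frequencies using the standard textbook definitions. It is not the software used in this thesis; the function name `ld_measures` and the example frequencies are hypothetical.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (D', r²) for two biallelic SNPs.

    p_a, p_b: frequencies of allele A at SNP 1 and allele B at SNP 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    # D: deviation of the observed haplotype frequency from independence
    d = p_ab - p_a * p_b
    # Largest |D| attainable for the given allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Hypothetical example: a rare allele A (10%) that always occurs together
# with a common allele B (50%)
d_prime, r2 = ld_measures(p_a=0.1, p_b=0.5, p_ab=0.1)
print(round(d_prime, 2), round(r2, 2))
```

In this constructed example D' equals 1 although r² is only about 0.11, because the two alleles differ strongly in frequency. The same pattern appears in Table 17 for several SNP pairs with D' close to 1 but small r².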


4 Discussion

4.1 Summary of results

Over the past decade, genetic research into different kidney diseases and kidney-disease defining measures has led to the insight that these are complex human diseases and traits. This means that genetic risk variants in many genes, as well as their interactions with the environment, contribute to disease development and progression. However, genome-wide association studies aimed at the discovery of such genetic susceptibility loci have usually evaluated only one kidney disease at a time, or only one subset of the population. It was unclear to which degree genetic risk variants discovered for one CKD etiology are shared across other CKD etiologies. Similarly, it was unclear whether genetic variants associated with kidney function in the normal range are also associated with reduced kidney function or kidney damage. This doctoral thesis aimed at addressing these gaps by examining a large population of patients with CKD from different underlying causes and at different stages.

This doctoral thesis had the following principal findings. First, several loci known to be associated with kidney function and damage in the general population were also significantly associated with “advanced” CKD defined as stage G3b (UMOD, AP5B1, WDR37, SDCCAG8 and KBTBD2) or A3 (KBTBD2 and SLC47A1). Moreover, several loci discovered in association with eGFR in the general population were also associated with hypertensive nephropathy in comparison to external control populations with little if any CKD (UMOD, AP5B1 and KBTBD2). In each instance, the direction of association identified in the GCKD study was consistent with the direction previously reported in the general population, i.e. the same allele at a given SNP conferred increased risk in both settings. Second, several known CKD etiology-specific risk loci were replicated in the GCKD study, such as risk variants at HLA-DQA1 and PLA2R1 with MN, validating the ascertainment of the underlying cause of CKD in the GCKD study. Third, TNXB, HLA-DQA1, -DQB1 and -DRB1 were independently associated with additional CKD etiologies beyond the ones initially reported in their respective discovery studies. For example, an SLE-associated risk variant at TNXB was also associated with CKD from type 1 diabetes, a known MN-associated variant at HLA-DQA1 was additionally associated with SLE, and a known IgA risk variant at HLA-DRB1 was also associated with both GPA and CKD from type 1 diabetes in the GCKD study. There are several potential explanations for this observation: for one, genetic risk variants could be shared across different causes of CKD, but different manifestations could result from interactions with different environmental risk factors or antigens. Alternatively, shared genetic associations could hint at a continuum of clinical presentations for the same underlying disease.

4.2 Interpretation in the context of the literature

UMOD is the first locus emerging from population-based GWAS for which experimental evidence was generated linking genotype at the associated index SNP to gene expression and the subsequent presence of salt-sensitive hypertensive CKD (Trudu, Janas et al. 2013). In support, the risk allele at the UMOD variant was significantly associated not only with stage G3b CKD but also specifically with CKD attributed to hypertension in this doctoral study. The connection of the UMOD locus to hypertensive kidney disease in particular is further supported by previous reports of an association between the CKD-risk allele and hypertension, irrespective of kidney disease (Padmanabhan, Melander et al. 2010). The UMOD gene has also been the target of studies investigating monogenic manifestations of kidney disease. Mutations in this gene have been described as the cause of a variety of monogenic renal disorders such as medullary cystic kidney disease-2 (MCKD2), glomerulocystic kidney disease with hyperuricemia and isosthenuria (GCKDHI), and familial juvenile hyperuricemic nephropathy. These monogenic presentations were later recognized as different forms of the same disease, so-called allelic disorders (Hart, Gorry et al. 2002). This underscores the importance of the UMOD gene across the range of kidney diseases, from rare deleterious mutations to common regulatory variants such as the SNPs studied here. UMOD is exclusively expressed in the kidney (Schaeffer, Cattaneo et al. 2012). The encoded protein uromodulin, also called Tamm-Horsfall protein, is the most abundant protein in the urine of healthy individuals. Levels of uromodulin are high in the tubules, where it is produced and secreted by the epithelial cells; consequently, it cannot be detected in glomeruli. Various functions of this protein have been described.
Uromodulin is thought to protect against urinary infections by abolishing the binding of E. coli to uroplakin receptors, thus shortening the persistence time of E. coli in the bladder and lowering leukocyte levels (Bates, Raffi et al. 2004; Ghirotto, Tassi et al. 2016). A protective effect against kidney stones has been described as well (Liu, Mo et al. 2010). Furthermore, it modulates tubular electrolyte transport, especially by regulating the ion transporters NKCC2 (sodium-potassium-chloride transporter) and ROMK (a potassium channel) in the thick ascending limb of Henle’s loop (Mutig, Kahl et al. 2011; Renigunta, Renigunta et al. 2011). Over-activation of NKCC2 causes increased re-absorption of sodium and chloride and thus contributes to the development of hypertension and kidney damage (Trudu, Janas et al. 2013). In addition, uromodulin plays a role in the innate immune response of the kidneys by activating myeloid dendritic cells and triggering monocytes and granulocytes to produce inflammatory molecules. Uromodulin aggregates attract and bind leukocytes, resulting in an activation of proinflammatory cascades. The aggregates do not seem to have a direct damaging effect on the kidney, but predispose to damage from additional comorbidities (Scolari, Izzi et al. 2015; Devuyst, Olinger et al. 2017). In addition to this proinflammatory function, uromodulin is also thought to decrease inflammation after acute kidney injuries, e.g. after ischemia (El-Achkar, Wu et al. 2008).

The association between advanced CKD and the risk variant in SDCCAG8 in this study is supported by rare mutations in SDCCAG8 that can cause Bardet-Biedl or Senior-Løken syndrome (Schaefer, Zaloszyc et al. 2011), both with severe renal phenotypes. Bardet-Biedl syndrome is a rare ciliopathy leading to systemic disease with multiple organs affected. The renal phenotype is highly variable and may include concentrating defects, cystic diseases, FSGS and dysplasias leading to renal failure.
Further manifestations are retinal dystrophy, obesity, polydactyly and intellectual disability (Forsythe, Sparks et al. 2017). The congenital Senior-Løken syndrome is another ciliopathy, causing nephronophthisis and progressive eye disease due to retinopathy. Nephronophthisis is a medullary cystic kidney disease that often manifests in childhood. Currently, mutations in 20 different genes, all of which are expressed in the cilia, are known to cause the disease. The dysfunctional gene products limit the motility of the cilia, resulting in chronic tubulointerstitial nephritis, cystic renal disease and finally ESRD, usually before the age of 30 (Stokman, Lilien et al. 1993). Although the function of SDCCAG8 is not completely understood, it is assumed that the encoded protein plays a role in the organization of the centrosome during interphase and mitosis. Again, this finding illustrates that genetic variants at the same gene can lead to a spectrum of manifestations, ranging from common variants associated with small effects on renal function in population-based studies, through associations with advanced CKD in the GCKD study, to rare mutations that cause severe renal phenotypes in monogenic diseases.

There is only sparse information about the other loci that we identified as associated with advanced stages of CKD, in or close to the genes WDR37, KBTBD2, AP5B1 and SLC47A1. None of them is exclusively expressed in the kidneys. WDR37, which is located on chromosome 10, encodes a protein that belongs to a family of similar proteins called WD repeat proteins. Based on current knowledge, their function is to facilitate the formation of multiprotein complexes. They are also involved in a variety of cellular processes such as the cell cycle, signal transduction and others (Gene Cards - Human Gene Database). AP5B1 is a gene located on chromosome 11 that encodes an adaptor protein involved in endosomal transport. No relation to a renal disorder is known so far. SLC47A1 is a gene located in the Smith-Magenis syndrome region on chromosome 17. Smith-Magenis syndrome is a rare genetic disease with complex dysplasia including renal malformation. The SLC47A1 gene product, also termed MATE1, is expressed at the brush-border membrane of proximal tubular epithelial cells and transports different cations, sugars, amine compounds and metal ions (Yonezawa and Inui 2011). In an in vitro study, the protein encoded by SLC47A1 was identified as a drug transporter whose downregulation caused renal cell injury, especially when cells were incubated with cisplatin (Mizuno, Sato et al. 2015).

For some of these genes, however, connections to renal pathologies are lacking. One reason is that these connections may simply not have been discovered yet. One approach to functionally characterize associated variants is to test whether they are associated with biomarkers such as metabolomic markers, in order to obtain hints for identifying the underlying biochemical pathways (Rysz, Gluba-Brzozka et al. 2017). Moreover, functional studies using knock-out/knock-down and knock-in/overexpression of the candidate genes or variants in cell lines or model organisms can also contribute to the understanding of the pathways. Finally, additional influences that may modify disease risk, such as epigenetic modification and tissue-specific gene expression, have to be considered and examined. Another factor that complicates the identification of the causal gene in a genetic locus containing many associated SNPs is that the closest gene is not necessarily the gene causing the association. Many of the correlated, associated SNPs are intronic or intergenic. Therefore, the causal disease-associated genetic variant can, for example, be located in a regulatory element that controls the expression of a nearby gene, which does not have to be the closest gene.
Lastly, factors related to the disease or trait under study can complicate the identification of associations. For example, in advanced CKD stage A3, many patients receive medication to lower albuminuria. A case that is successfully treated may hence be misclassified as a control, which can result in decreased power to identify associations. This is a possible explanation why in this study only two risk variants (at SLC47A1 and KBTBD2) were associated with advanced CKD stage A3, in contrast to five loci associated with advanced CKD stage G3b.

The associations between population-based risk variants and advanced CKD in this study were consistently observed when two external control populations of European ancestry were evaluated, but not in comparison to an internal control population of GCKD patients with less severe CKD stages. A potential explanation for this observation is that the GCKD controls were in earlier stages of the same disease, may possess similar genetic susceptibility and would belong to an advanced CKD stage when observed later during the course of their disease.

This study was able to identify four new associations between risk loci and specific etiologies of CKD by taking an alternative approach to previous GWAS, namely evaluating a set of validated candidate SNPs for association across the rich spectrum of CKD etiologies in the GCKD study: rs2187668 was associated with SLE, rs1150754 and rs660895 were associated with T1DM, and rs660895 was additionally associated with GPA. A possible explanation why these risk loci have not been found in former GWAS is the need to correct for multiple testing. Full genome-wide association studies use a Bonferroni correction for the number of independent common SNPs in the genome to indicate statistical significance (i.e. 0.05/1 million independent SNPs = 5x10-8). By focusing on previously validated candidate SNPs, the Bonferroni correction in our study was much less stringent (1.3x10-3 for the etiology-specific analysis), while still being conservative. Therefore, this study exemplifies that a hypothesis-driven approach based on unbiased and validated GWAS data can lead to the identification of additional SNP associations not detected in the primary discovery GWAS.
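The arithmetic behind the two significance thresholds mentioned above can be reproduced with a short Python sketch. The helper `bonferroni_threshold` is a hypothetical name introduced for illustration; the inputs (0.05, one million independent SNPs, and the candidate-SNP threshold of 1.3x10-3) are taken from the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Genome-wide threshold: 0.05 divided by about one million independent SNPs
genome_wide = bonferroni_threshold(0.05, 1_000_000)

# Candidate-SNP threshold reported for the etiology-specific analysis
candidate = 1.3e-3

# Factor by which the candidate-SNP threshold is less stringent
relaxation = candidate / genome_wide
print(genome_wide, relaxation)
```

The candidate-SNP threshold is thus several orders of magnitude less stringent than the genome-wide one, which is consistent with the explanation that associations with p-values such as 2.0x10-4 can be significant here yet fall short of genome-wide significance.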

4.3 Clinical interpretation

A large number of antigens are encoded by a limited number of genes mapping into the HLA (human leukocyte antigen) region. Through recombination, a huge variety of antigens is formed and expressed on the cell surface. These antigens play a key role in the adaptive immune system. Depending on the cell type, different subtypes of HLA are present. While HLA class 1 molecules, which present the cell’s own protein fragments, can be found on every nucleated cell, HLA class 2 molecules are restricted to antigen-presenting cells such as B cells and present phagocytosed molecules. T cells recognize the presented foreign antigens and produce cytokines such as IL-4 and IL-21, which in turn activate the B cells, inducing their proliferation and hypermutation of antibody-producing genes (Crotty 2015). This results in a great variety of antibodies and a rapid selection of the fitting antibodies. The majority of these activated B cells produce antibodies and are called “plasma cells”. A small fraction also differentiates into memory B cells, which remain in the body and can be activated in case of re-infection.

The HLA region is known to contain various risk loci for CKD of different etiologies. We examined several of them across the CKD etiologies available in the GCKD study and showed evidence of independent association. This highlights the shared role of the adaptive immune response across several etiologies of CKD, and suggests some overlap between etiologies. For example, known risk variants for MN were independently associated with CKD from SLE. The histopathology of SLE is heterogeneous, and overlap with the appearance of MN exists. The observed genetic overlap could therefore be influenced by the membranous histopathological appearance of lupus nephritis class V, potentially leading histopathologists to label an SLE case as membranous nephropathy. It needs to be noted, however, that lupus nephritis class V is an infrequent subtype of SLE, and therefore any potential mislabeling is unlikely to fully account for the shared genetic risk variant at the HLA locus. More detailed studies focusing on SNP associations across sub-types of different autoimmune diseases are required to address the important question of shared genetic susceptibility versus different clinical/histological presentations of the same underlying disease. Furthermore, studies with larger case groups of the respective diseases could examine co-incidences of two diseases with overlapping genetic risk factors, such as MN/SLE and T1DM/GPA. Up to now, several case reports have been published about immunological diseases occurring in the same patient, such as MN and IgA nephropathy (Nishida, Kato et al. 2015) or MN and further autoimmune diseases (e.g. ulcerative colitis) (Warling, Bovy et al. 2014). Both publications are case reports of rare co-manifestations, where the linking mechanisms are not completely understood. There are more widely known co-occurrences of several immunologic diseases in the same patients, such as primary sclerosing cholangitis and autoimmune hepatitis, which could be explained by shared genetic risk factors for autoimmune disease.
To date, however, there are no case reports on the co-occurrence of the combinations found in this study. One theory of how the same genetic risk variants could be associated with different autoimmune diseases of the kidney is the presence of different auto-antigens or environmental risk factors that interact with the same genetically encoded HLA variant.

The T1DM risk loci found in this study can be interpreted in several ways: either as risk loci for the disease, or as risk loci for CKD resulting from T1DM, or both. Because the GCKD study is a case-only study that did not recruit individuals with T1DM but without CKD, this question could only be examined indirectly. The persistent genetic association with CKD attributed to T1DM while conditioning on known genetic risk variants for T1DM suggests that the association we observed was important for CKD beyond a mere association with the underlying cause of CKD. Particularly for T1DM, but also for the other CKD etiologies examined in this study, future studies that also recruit controls without CKD but with the disease that can cause CKD are needed to directly address this important research question.

4.4 Strengths and limitations

The strengths of this study include the availability of CKD of different stages and from different etiologies in one study population. As explained above, this opens up the possibility to carry out conditional analyses to address whether genetic risk variants for a given disease are independent of nearby genetic risk variants for other, related diseases. Because of limited sample size within subgroups, analyses were restricted to the examination of a predefined number of candidate SNPs, but still rigorously accounted for testing a number of different candidate SNPs. This resulted in a statistical significance threshold that is less stringent than a correction for genome-wide multiple testing and therefore allowed the detection of significant associations that would have been missed by GWAS. In addition, prospective follow-up data for the GCKD patients will be available in the future, enabling further investigations of the various CKD etiologies.

Limitations of this study include the absence of an internal healthy control group as well as of control groups of patients suffering from the specific diseases underlying CKD, such as T1DM, SLE or GPA, without nephropathy. We thus examined external control groups and compared association results for consistency across different control groups in order to reduce the risk of false positive associations. External control groups may contain patients who would fulfill the criteria for one of the case groups. The prevalence of CKD is around 10% (Levey and Coresh 2012) in the general population across all ages, and very low for the specific CKD etiologies examined in this study. In addition, the WTCCC control group was formed as a healthy control group for the different autoimmune disease case groups, and both blood donors and the 1000 Genomes study population can safely be assumed to be mostly healthy.
Furthermore, genetic effects were consistent at least in direction and often also in significance across the different control groups examined. This suggests that the amount of misclassification of cases as controls was likely small, and that there were no systematic differences in the compositions of the different external control groups.
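The effect of case misclassification among controls can be illustrated with a small numerical sketch in Python. It assumes a simple allelic model in which a fraction of the control group actually carries the allele frequency of cases; the function name `observed_odds_ratio` and all input values are hypothetical and are not estimates from the GCKD data.

```python
def observed_odds_ratio(freq_controls, true_or, contamination):
    """Allelic OR observed when a fraction of 'controls' are unrecognized cases.

    freq_controls: risk allele frequency in true controls.
    true_or: allelic odds ratio between true cases and true controls.
    contamination: fraction of the control group that are in fact cases.
    """
    # Risk allele odds and frequency in cases implied by the true OR
    odds_cases = true_or * freq_controls / (1 - freq_controls)
    freq_cases = odds_cases / (1 + odds_cases)
    # Allele frequency in the contaminated control group
    mixed = (1 - contamination) * freq_controls + contamination * freq_cases
    return odds_cases / (mixed / (1 - mixed))


clean = observed_odds_ratio(freq_controls=0.2, true_or=2.0, contamination=0.0)
contaminated = observed_odds_ratio(freq_controls=0.2, true_or=2.0, contamination=0.1)
print(round(clean, 2), round(contaminated, 2))
```

With these illustrative inputs, a contamination of 10% shrinks a true OR of 2.0 to roughly 1.84. Misclassification of this kind biases estimates toward the null rather than creating spurious associations, which fits the argument that a small amount of misclassification is unlikely to explain the observed results.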

4.5 Conclusion

This study identified several associations between advanced stages of CKD and genetic loci previously associated with GFR or CKD in the general population. In addition, known risk loci for specific forms of CKD were shared with some additional CKD etiologies, suggesting a common mechanism by which the adaptive immune system may contribute to the associated CKD etiologies.

5 Abstract (English)

Chronic kidney disease (CKD) is a global health problem with a genetic component. To gain insights into its complex genetic architecture, genome-wide association studies (GWAS) have identified genetic variants associated with kidney function in the general population and with disease in studies of specific etiologies. The generalization of population-based findings to advanced CKD, as well as the genetic overlap between different CKD etiologies, has not been well studied. This gap was addressed using data from 5,034 patients of the German Chronic Kidney Disease (GCKD) study, most of whom had stage 3 CKD from various etiologies, and healthy controls from both the 1000 Genomes Project and the Wellcome Trust Case Control Consortium. Of 55 eGFR-associated markers identified in the general population, six were significantly associated with stage G3b and/or A3 CKD (in or near UMOD, KBTBD2, AP5B1, SDCCAG8, SLC47A1 and WDR37), some of which were additionally associated with hypertensive CKD (UMOD, KBTBD2 and AP5B1). Across CKD etiologies, a systemic lupus erythematosus-associated risk variant at TNXB was also associated with CKD from type 1 diabetes. A membranous nephropathy-associated variant at HLA-DQA1 was also associated with systemic lupus erythematosus, and an IgA nephropathy risk variant at HLA-DRB1 was also associated with both granulomatosis with polyangiitis and type 1 diabetes. These associations were independent of additional known risk variants in the respective regions. At each SNP, the allele associated with higher risk was the same as the one reported in previous studies for another CKD etiology. In conclusion, some kidney function-associated variants from the general population translate to advanced CKD. Shared associations across CKD etiologies highlight the role of the adaptive immune response and suggest some overlap between these etiologies.

6 Abstract (German, in English translation)

Chronic kidney disease is a global health problem. To gain insight into its complex genetic background, genome-wide association studies have identified genetic risk variants that are associated with reduced kidney function in the general population or with specific etiologies of chronic kidney disease such as IgA nephropathy. So far, however, it is not known whether the population-based risk variants also apply to advanced stages of kidney disease, or whether specific risk variants are associated with more than one disease etiology. This question was examined in 5,034 patients of the German Chronic Kidney Disease study, most of whom had stage 3 chronic kidney disease. Participants of the 1000 Genomes Project and of the Wellcome Trust Case Control Consortium served as healthy controls. Of 55 population-based risk variants, six were significantly associated with chronic kidney disease stage G3b and/or A3 (in or near the genes UMOD, KBTBD2, AP5B1, SDCCAG8, SLC47A1 and WDR37), and some of these were additionally associated with hypertensive kidney disease (UMOD, KBTBD2 and AP5B1). Among the specific etiologies of chronic kidney disease, several new associations and overlaps emerged: a risk locus in TNXB known for systemic lupus erythematosus was also associated with type 1 diabetes; a risk locus in HLA-DQA1, previously known for membranous nephropathy, was also associated with systemic lupus erythematosus; and an IgA nephropathy risk locus in HLA-DRB1 was additionally associated with type 1 diabetes and granulomatosis with polyangiitis. The associations were independent of previously known risk variants in the respective gene regions. For each associated variant, the risk allele was the same as the one reported for the disease in which the variant was first described.

In summary, some population-based risk variants for reduced kidney function also applied to advanced kidney disease. The generalization of specific risk variants across several diseases highlights the role of the adaptive immune system and suggests overlap between the different etiologies with regard to pathology and pathogenesis.

7 Acknowledgement

I would like to thank Prof. Dr. Anna Köttgen for her great and continuous support and supervision of my work as well as for the opportunity to conduct my research in this very interesting scientific area. Her knowledge, enthusiasm and motivation helped me greatly, and I am glad that I was able to write not only my doctoral thesis but also a scientific publication, through which I learned a great deal about scientific methods and writing. I would also like to thank the whole “Genetic Epidemiology” work group, especially Dr. Matthias Wuttke and Yong Li, for their help and collaboration.

Next, I want to thank Prof. Dr. Wolfgang Kühn for serving as the second assessor of my thesis and for giving me important clinical input and advice on scientific writing. In addition, I would like to thank all co-authors of the publication for their suggestions and collaboration, and all investigators, nephrologists and patients who have made the GCKD study possible.

Last, but not least, I want to thank my family and my friends for their support, which I could always count on.

8 Declaration in lieu of an oath pursuant to §8 (1) No. 3 of the doctoral regulations of the University of Freiburg for the Faculty of Medicine

1. The submitted dissertation on the topic “Associations between Known Genetic Risk Variants and CKD Stage and Etiology in the GCKD Study” is my own independent work.
2. I have used only the sources and aids indicated and have not made use of any impermissible help from third parties. In particular, I have marked as such all content taken verbatim or in substance from other works. No one has received from me, directly or indirectly, any payment or benefit of monetary value for work connected with the content of the submitted dissertation.
3. I have taken note of and accepted the regulations of the Albert-Ludwigs-Universität for safeguarding integrity in science.
4. I have not previously submitted the dissertation, or parts of it, at any university in Germany or abroad as part of an examination or qualification.
5. I confirm the correctness of the above declaration.
6. I am aware of the significance of a declaration in lieu of an oath and of the consequences under criminal law of an incorrect or incomplete declaration.

I affirm in lieu of an oath that, to the best of my knowledge, I have declared the pure truth and concealed nothing.

Stuttgart, 15.02.2018 ______

Sebastian Wunnenburger

9 Supplementary Tables

Supplementary Table 1. Associations between population-based SNPs and advanced CKD (stage G3b or A3) in the GCKD cohort

                                        CKD stage G3b                                CKD stage A3
                                        GCKD controls  1KGP controls  WTCCC controls GCKD controls  1KGP controls  WTCCC controls
SNP         Gene         Effect allele  OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value
rs10109414  STC1         T              1.08 1.8E-01   1.02 7.5E-01   1.13 3.3E-03   1.06 3.1E-01   0.98 8.0E-01   1.09 6.4E-02
rs10277115  UNCX         T              1.08 2.6E-01   1.17 6.5E-02   1.13 2.3E-02   1.09 1.6E-01   1.07 4.5E-01   1.04 5.6E-01
rs10491967  TSPAN9       A              0.99 9.6E-01   1.01 9.0E-01   1.33 2.5E-05   1.12 1.8E-01   0.95 6.7E-01   1.27 2.8E-03
rs10513801  ETV5         G              0.94 4.5E-01   0.95 6.0E-01   0.96 4.7E-01   1.08 3.2E-01   0.91 3.7E-01   0.91 2.0E-01
rs10774021  SLC6A13      T              0.97 6.8E-01   0.98 8.2E-01   0.91 3.9E-02   1.03 5.5E-01   1.01 9.3E-01   0.91 7.2E-02
rs10794720  WDR37        C              0.98 8.3E-01   0.67 4.7E-03   0.77 7.1E-04   0.88 1.7E-01   0.75 5.5E-02   0.87 1.1E-01
rs10994860  A1CF         T              0.99 8.8E-01   0.79 6.7E-03   0.94 3.0E-01   0.93 3.0E-01   0.82 3.2E-02   0.98 7.6E-01
rs1106766   INHBC        T              0.98 7.7E-01   1.24 1.5E-02   0.94 2.1E-01   0.99 8.1E-01   1.17 9.5E-02   0.92 1.4E-01
rs11078903  CDK12        A              1.08 2.4E-01   0.99 9.3E-01   0.98 6.6E-01   1.06 3.4E-01   0.91 2.9E-01   0.90 5.0E-02
rs11666497  SIPA1L3      T              1.08 2.8E-01   0.98 8.6E-01   1.08 1.5E-01   0.94 3.2E-01   0.99 9.3E-01   1.09 1.7E-01
rs11959928  DAB2         A              1.05 4.1E-01   1.01 9.0E-01   1.15 7.1E-04   1.03 5.6E-01   1.02 7.8E-01   1.16 1.8E-03
rs12124078  DNAJC16      G              0.93 2.6E-01   1.00 9.7E-01   0.91 3.4E-02   1.08 1.8E-01   0.95 5.1E-01   0.86 4.3E-03
rs12136063  SYPL2        A              1.01 8.9E-01   0.98 7.5E-01   0.94 1.8E-01   0.90 6.0E-02   1.03 6.9E-01   1.01 8.5E-01
rs12460876  SLC7A9       C              1.00 9.5E-01   0.88 6.0E-02   1.00 9.5E-01   0.97 5.2E-01   0.88 9.1E-02   1.01 8.5E-01
rs1260326   GCKR         C              1.09 1.3E-01   0.96 5.4E-01   0.87 6.3E-04   1.04 4.6E-01   0.94 4.2E-01   0.85 8.2E-04
rs12917707  UMOD         T              0.76 4.2E-04   0.71 9.8E-05   0.81 2.6E-04   1.04 6.2E-01   0.78 7.6E-03   0.88 5.6E-02
rs13538     ALMS1/NAT8   G              1.00 9.7E-01   1.01 9.4E-01   0.88 1.3E-02   0.95 4.3E-01   1.07 4.7E-01   0.93 1.9E-01
rs1394125   UBE2Q2       A              1.00 9.6E-01   1.08 3.3E-01   1.03 5.0E-01   1.09 1.3E-01   1.07 3.9E-01   1.02 6.8E-01
rs163160    KCNQ1        G              1.07 3.5E-01   1.06 5.0E-01   1.03 6.1E-01   1.08 2.8E-01   1.00 9.9E-01   0.95 3.7E-01
rs164748    DPEP1        G              1.15 1.7E-02   0.93 2.8E-01   1.14 1.7E-03   0.97 5.3E-01   0.90 1.7E-01   1.10 3.9E-02
rs17216707  BCAS1        C              1.01 9.1E-01   0.89 1.8E-01   1.10 5.2E-02   0.93 3.0E-01   0.93 4.1E-01   1.14 2.8E-02
rs17319721  SHROOM3      A              1.07 2.8E-01   1.12 1.0E-01   0.98 6.8E-01   1.17 4.0E-03   1.01 8.6E-01   0.88 1.1E-02
rs1801239   CUBN         C              1.07 5.0E-01   1.27 5.0E-02   1.01 8.6E-01   0.85 5.7E-02   1.28 4.9E-02   1.01 9.1E-01
rs2279463   SLC22A2      G              1.00 9.8E-01   0.92 4.1E-01   0.80 2.0E-04   1.08 3.6E-01   0.91 3.8E-01   0.78 5.9E-04
rs228611    NFKB1        A              0.99 9.3E-01   1.06 4.3E-01   1.11 8.9E-03   0.95 3.4E-01   1.10 1.9E-01   1.14 5.9E-03
rs2453580   SLC47A1      C              1.14 2.8E-02   1.06 4.0E-01   1.15 8.6E-04   0.89 2.9E-02   1.13 1.2E-01   1.20 3.7E-04
rs2467853   SPATA5L1     G              1.13 3.8E-02   0.96 5.3E-01   1.10 2.7E-02   0.91 8.8E-02   0.95 5.3E-01   1.11 4.2E-02
rs267734    ANXA9/LASS2  C              0.97 6.5E-01   1.09 3.0E-01   0.96 4.1E-01   0.92 1.7E-01   1.15 1.3E-01   1.01 8.9E-01
rs2712184   IGFBP5       A              0.97 6.5E-01   1.04 6.2E-01   1.01 8.0E-01   1.00 9.4E-01   1.06 4.2E-01   1.04 3.7E-01
rs2802729   SDCCAG8      A              1.04 5.3E-01   1.24 2.9E-03   1.15 8.2E-04   1.10 6.4E-02   1.18 3.2E-02   1.08 1.1E-01
rs2928148   INO80        A              0.96 4.6E-01   0.99 8.9E-01   0.86 2.9E-04   0.96 4.3E-01   1.05 5.2E-01   0.92 8.8E-02
rs347685    TFDP2        A              0.97 6.7E-01   1.03 7.4E-01   1.03 4.6E-01   0.99 8.5E-01   1.05 5.9E-01   1.06 3.0E-01
rs3750082   KBTBD2       A              1.07 2.9E-01   0.83 9.8E-03   0.85 1.1E-04   1.03 5.5E-01   0.83 1.9E-02   0.84 8.3E-04
rs3828890   MHC region   G              1.21 6.6E-02   0.77 1.6E-02   1.01 8.6E-01   1.00 9.6E-01   0.73 7.4E-03   0.97 7.1E-01
rs3850625   CACNA1S      A              1.20 5.0E-02   1.09 4.5E-01   1.07 2.7E-01   0.88 1.2E-01   1.10 4.2E-01   1.07 3.3E-01
rs3925584   MPPED2       C              1.01 8.5E-01   0.86 2.8E-02   1.00 9.2E-01   0.96 4.2E-01   0.92 2.5E-01   1.07 1.5E-01
rs4014195   AP5B1        G              1.07 2.9E-01   1.16 4.6E-02   1.25 1.2E-07   1.02 6.7E-01   1.13 1.2E-01   1.23 2.1E-05
rs4667594   LRP2         A              1.03 5.9E-01   1.11 1.4E-01   0.99 7.4E-01   1.04 5.2E-01   1.03 6.8E-01   0.91 6.1E-02
rs4744712   PIP5K1B      C              0.95 3.7E-01   0.91 2.0E-01   0.96 3.4E-01   0.95 3.1E-01   0.96 6.2E-01   1.02 7.3E-01
rs491567    WDR72        C              1.07 3.2E-01   1.09 3.4E-01   0.95 3.5E-01   0.96 5.2E-01   1.05 5.7E-01   0.93 2.5E-01
rs6088580   TP53INP2     C              1.08 1.8E-01   1.14 7.0E-02   1.23 1.1E-06   1.07 2.3E-01   1.04 6.1E-01   1.14 6.0E-03
rs626277    DACH1        C              1.12 6.3E-02   0.88 8.5E-02   0.99 8.5E-01   0.95 3.9E-01   0.86 4.9E-02   0.98 7.1E-01
rs6420094   SLC34A1      G              0.97 6.4E-01   0.99 8.9E-01   0.99 8.2E-01   1.04 5.1E-01   0.99 9.1E-01   1.00 1.0E+00
rs6431731   DDX1         T              0.77 1.3E-01   0.89 5.5E-01   1.67 3.2E-07   1.05 7.1E-01   0.94 7.8E-01   1.78 2.0E-06
rs6459680   RNF32        A              0.91 1.5E-01   0.97 7.2E-01   0.97 5.6E-01   1.07 2.4E-01   0.92 3.3E-01   0.93 2.2E-01
rs6465825   TMEM60       C              0.98 7.2E-01   0.88 7.2E-02   1.02 7.1E-01   1.02 6.9E-01   0.90 1.7E-01   1.05 2.9E-01
rs6795744   WNT7A        A              1.00 9.7E-01   0.94 5.3E-01   1.05 4.0E-01   0.94 3.7E-01   1.01 9.6E-01   1.11 1.2E-01
rs7422339   CPS1         A              1.09 1.6E-01   1.12 1.5E-01   1.03 4.5E-01   1.14 2.1E-02   1.05 5.2E-01   0.96 4.5E-01
rs7759001   ZNF204       A              0.95 4.9E-01   1.05 5.3E-01   1.05 3.4E-01   1.08 2.5E-01   0.99 9.2E-01   1.00 9.6E-01
rs7805747   PRKAG2       A              1.02 7.2E-01   1.05 5.6E-01   1.14 4.3E-03   1.07 2.7E-01   1.03 6.9E-01   1.11 4.4E-02
rs7956634   PTPRO        C              1.01 8.7E-01   1.07 4.4E-01   0.89 2.2E-02   1.01 8.9E-01   1.04 7.1E-01   0.86 1.6E-02
rs8091180   NFATC1       A              0.94 2.9E-01   0.90 1.4E-01   1.07 1.2E-01   0.99 8.0E-01   0.89 1.1E-01   1.06 2.6E-01
rs881858    VEGFA        A              0.94 3.2E-01   1.06 4.2E-01   1.07 1.5E-01   1.00 9.8E-01   1.07 3.9E-01   1.08 1.5E-01
rs9682041   SKIL         T              1.02 8.2E-01   1.07 5.2E-01   1.18 8.3E-03   0.99 9.5E-01   1.09 4.5E-01   1.19 1.8E-02
rs9895661   BCAS3        T              1.01 8.6E-01   0.92 3.6E-01   0.82 2.1E-04   0.92 2.2E-01   0.94 5.3E-01   0.84 4.9E-03

OR: Odds ratio, CI: Confidence interval. The significance threshold was set at 9.1 × 10⁻⁴ (Bonferroni correction, two-sided test). eGFR<45 ml/min/1.73m² cases: n=2,245, UACR≥300 mg/g cases: n=1,385, GFR GCKD controls: n=1,006 (eGFR≥60 ml/min/1.73m²), UACR GCKD controls: n=3,565 (UACR<300 mg/g), 1KGP controls: n=503, WTCCC controls: n=2,597.

Supplementary Table 2. Associations between population-based SNPs and CKD from hypertension and type 2 diabetes mellitus in the GCKD cohort (all variants)

                                        CKD from hypertension                        CKD from type 2 diabetes mellitus
                                        GCKD controls  1KGP controls  WTCCC controls GCKD controls  1KGP controls  WTCCC controls
SNP         Gene         Effect allele  OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value
rs10109414  STC1         T              1.02 7.9E-01   1.01 8.8E-01   1.12 2.8E-02   1.06 3.7E-01   1.08 3.9E-01   1.19 7.3E-03
rs10277115  UNCX         T              1.06 5.3E-01   1.21 3.7E-02   1.17 1.7E-02   1.04 6.5E-01   1.24 3.4E-02   1.20 2.5E-02
rs10491967  TSPAN9       A              1.04 7.6E-01   0.98 8.7E-01   1.31 1.4E-03   1.08 4.4E-01   1.06 6.8E-01   1.43 3.4E-04
rs10513801  ETV5         G              0.97 7.6E-01   0.96 7.1E-01   0.96 5.6E-01   0.93 4.5E-01   0.91 4.5E-01   0.91 3.2E-01
rs10774021  SLC6A13      T              0.99 9.0E-01   1.02 8.4E-01   0.94 2.3E-01   0.86 3.2E-02   0.89 2.0E-01   0.81 1.4E-03
rs10794720  WDR37        C              0.95 6.8E-01   0.66 5.2E-03   0.75 2.0E-03   0.94 6.0E-01   0.59 1.3E-03   0.71 1.8E-03
rs10994860  A1CF         T              1.01 9.5E-01   0.76 6.1E-03   0.93 2.8E-01   1.01 9.5E-01   0.78 2.0E-02   0.94 4.7E-01
rs1106766   INHBC        T              0.88 1.8E-01   1.07 4.7E-01   0.84 5.1E-03   1.11 1.9E-01   1.25 3.4E-02   0.98 8.2E-01
rs11078903  CDK12        A              1.00 9.8E-01   0.96 6.4E-01   0.94 2.8E-01   0.97 7.0E-01   0.94 5.0E-01   0.92 2.3E-01
rs11666497  SIPA1L3      T              1.10 3.3E-01   0.93 4.2E-01   1.02 7.3E-01   1.09 3.1E-01   0.97 7.9E-01   1.08 3.3E-01
rs11959928  DAB2         A              1.02 8.2E-01   1.04 6.5E-01   1.19 1.0E-03   0.93 2.5E-01   0.95 5.8E-01   1.10 1.5E-01
rs12124078  DNAJC16      G              0.98 8.4E-01   0.98 8.4E-01   0.90 6.5E-02   1.06 4.4E-01   1.05 6.1E-01   0.96 5.4E-01
rs12136063  SYPL2        A              0.93 3.9E-01   0.95 5.4E-01   0.93 2.0E-01   0.96 5.9E-01   0.94 4.8E-01   0.92 2.4E-01
rs12460876  SLC7A9       C              0.99 8.9E-01   0.86 6.5E-02   1.00 9.7E-01   1.00 9.4E-01   0.87 1.1E-01   1.00 9.7E-01
rs1260326   GCKR         C              1.13 1.1E-01   0.96 5.9E-01   0.86 4.4E-03   1.07 3.0E-01   0.99 9.2E-01   0.88 4.9E-02
rs12917707  UMOD         T              0.90 3.3E-01   0.69 1.7E-04   0.77 3.4E-04   1.12 2.1E-01   0.79 3.1E-02   0.90 2.1E-01
rs13538     ALMS1/NAT8   G              0.89 2.1E-01   0.99 9.5E-01   0.87 2.4E-02   1.03 7.2E-01   1.07 5.0E-01   0.93 3.4E-01
rs1394125   UBE2Q2       A              1.04 5.9E-01   1.07 3.7E-01   1.03 5.6E-01   1.04 5.8E-01   1.10 2.7E-01   1.06 3.6E-01
rs163160    KCNQ1        G              1.02 8.1E-01   1.14 1.9E-01   1.09 1.9E-01   0.98 8.0E-01   1.12 3.2E-01   1.06 4.8E-01
rs164748    DPEP1        G              0.90 1.6E-01   0.88 9.7E-02   1.09 8.9E-02   1.08 2.3E-01   0.98 8.4E-01   1.22 1.8E-03
rs17216707  BCAS1        C              1.01 9.4E-01   0.90 2.6E-01   1.10 1.3E-01   0.96 6.2E-01   0.85 1.2E-01   1.04 5.8E-01
rs17319721  SHROOM3      A              0.92 3.0E-01   1.10 2.3E-01   0.96 4.1E-01   0.98 7.6E-01   1.08 3.8E-01   0.97 5.8E-01
rs1801239   CUBN         C              0.95 6.9E-01   1.17 2.4E-01   0.91 2.7E-01   1.26 2.3E-02   1.50 4.0E-03   1.16 1.3E-01
rs2279463   SLC22A2      G              1.20 1.2E-01   1.03 8.1E-01   0.90 1.5E-01   0.87 1.7E-01   0.86 2.4E-01   0.74 2.3E-03
rs228611    NFKB1        A              1.00 9.7E-01   1.09 2.6E-01   1.14 1.5E-02   0.98 7.3E-01   1.06 4.8E-01   1.11 1.1E-01
rs2453580   SLC47A1      C              1.15 7.3E-02   1.06 4.5E-01   1.12 3.5E-02   1.03 7.0E-01   1.03 7.2E-01   1.12 9.9E-02
rs2467853   SPATA5L1     G              0.85 3.6E-02   0.88 9.7E-02   1.02 7.0E-01   0.92 2.1E-01   0.85 5.7E-02   1.00 1.0E+00
rs267734    ANXA9/LASS2  C              1.04 6.9E-01   1.04 6.8E-01   0.91 1.4E-01   1.11 2.1E-01   1.11 3.3E-01   0.99 9.5E-01
rs2712184   IGFBP5       A              1.04 6.2E-01   1.07 3.7E-01   1.03 5.3E-01   1.12 1.0E-01   1.15 9.9E-02   1.13 6.4E-02
rs2802729   SDCCAG8      A              0.83 1.7E-02   1.16 5.7E-02   1.08 1.4E-01   0.91 1.5E-01   1.12 1.9E-01   1.06 3.6E-01
rs2928148   INO80        A              0.88 8.9E-02   0.89 1.3E-01   0.77 8.4E-07   1.18 1.4E-02   1.07 4.2E-01   0.94 3.7E-01
rs347685    TFDP2        A              1.04 6.5E-01   1.05 5.7E-01   1.07 2.1E-01   1.04 6.1E-01   1.08 4.1E-01   1.11 1.3E-01
rs3750082   KBTBD2       A              0.89 1.3E-01   0.79 3.4E-03   0.81 9.7E-05   1.02 8.2E-01   0.85 7.3E-02   0.86 2.0E-02
rs3828890   MHC region   G              1.05 7.3E-01   0.76 2.7E-02   1.00 9.9E-01   0.96 7.5E-01   0.72 1.6E-02   0.93 5.1E-01
rs3850625   CACNA1S      A              1.00 1.0E+00   0.99 9.5E-01   0.98 8.4E-01   1.13 2.2E-01   1.11 4.1E-01   1.10 3.1E-01
rs3925584   MPPED2       C              1.01 9.5E-01   0.83 1.2E-02   0.96 4.5E-01   1.12 9.0E-02   0.93 3.8E-01   1.07 2.8E-01
rs4014195   AP5B1        G              1.21 1.5E-02   1.31 9.0E-04   1.41 1.1E-10   0.93 2.5E-01   1.14 1.5E-01   1.23 1.1E-03
rs4667594   LRP2         A              1.08 3.2E-01   1.11 1.8E-01   0.98 7.6E-01   0.94 3.4E-01   1.03 7.1E-01   0.91 1.2E-01
rs4744712   PIP5K1B      C              0.94 4.0E-01   0.90 1.8E-01   0.95 3.6E-01   1.14 5.5E-02   1.04 6.7E-01   1.11 1.1E-01
rs491567    WDR72        C              1.01 8.9E-01   1.00 9.9E-01   0.89 8.0E-02   0.99 9.3E-01   0.98 8.5E-01   0.89 1.2E-01
rs6088580   TP53INP2     C              0.88 1.0E-01   0.99 8.8E-01   1.08 1.3E-01   1.03 7.1E-01   1.07 4.1E-01   1.16 2.2E-02
rs626277    DACH1        C              1.00 9.9E-01   0.85 3.8E-02   0.95 2.9E-01   1.05 4.8E-01   0.89 1.8E-01   0.99 8.2E-01
rs6420094   SLC34A1      G              0.97 6.6E-01   1.00 9.9E-01   1.02 7.4E-01   0.96 5.8E-01   0.97 7.7E-01   0.99 9.1E-01
rs6431731   DDX1         T              1.38 1.1E-01   1.00 9.9E-01   1.89 2.5E-06   1.42 7.4E-02   1.17 5.3E-01   2.24 8.4E-06
rs6459680   RNF32        A              1.03 7.3E-01   0.95 5.6E-01   0.94 3.2E-01   1.03 6.8E-01   0.96 7.0E-01   0.95 4.8E-01
rs6465825   TMEM60       C              0.91 2.3E-01   0.83 2.1E-02   0.98 7.3E-01   1.04 5.3E-01   0.91 2.8E-01   1.06 3.6E-01
rs6795744   WNT7A        A              1.18 1.2E-01   0.95 6.5E-01   1.05 5.0E-01   0.97 7.9E-01   0.86 2.3E-01   0.97 7.0E-01
rs7422339   CPS1         A              0.90 2.3E-01   1.07 4.4E-01   1.00 9.6E-01   0.99 8.6E-01   1.09 3.4E-01   1.03 7.1E-01
rs7759001   ZNF204       A              1.04 6.6E-01   1.03 7.8E-01   1.02 7.0E-01   1.10 2.2E-01   1.07 5.0E-01   1.11 1.9E-01
rs7805747   PRKAG2       A              0.92 3.2E-01   1.10 2.9E-01   1.21 8.7E-04   0.92 2.6E-01   1.08 4.2E-01   1.17 2.3E-02
rs7956634   PTPRO        C              1.02 8.4E-01   1.14 2.0E-01   0.93 2.7E-01   0.95 5.2E-01   1.02 8.5E-01   0.87 7.9E-02
rs8091180   NFATC1       A              0.99 9.2E-01   0.84 3.3E-02   1.01 8.7E-01   1.01 9.0E-01   0.86 7.3E-02   1.02 7.4E-01
rs881858    VEGFA        A              0.97 7.4E-01   1.16 7.4E-02   1.17 7.3E-03   0.83 1.1E-02   0.98 8.4E-01   1.01 9.4E-01
rs9682041   SKIL         T              1.01 9.5E-01   0.93 5.6E-01   1.04 6.4E-01   1.15 1.8E-01   1.07 5.9E-01   1.20 6.3E-02
rs9895661   BCAS3        T              0.94 5.1E-01   0.87 1.5E-01   0.78 2.9E-04   1.00 9.9E-01   0.91 3.9E-01   0.79 3.2E-03

OR: Odds ratio, CI: Confidence interval. The significance threshold was set at 9.1 × 10⁻⁴ (Bonferroni correction, two-sided test). CKD from hypertension (nephrosclerosis) cases: n=1,086, CKD from type 2 diabetes mellitus cases: n=653, GCKD controls: n=569 for hypertension (control group with non-genetic causes of CKD excluding nephrosclerosis), and n=1,655 for T2DM. 1KGP controls: n=503, WTCCC controls: n=2,597.

Supplementary Table 3. Associations between CKD etiology-specific SNPs and other CKD etiologies in the GCKD cohort (all variants)

                                                            IgA                                          MN
                                                            GCKD controls  1KGP controls  WTCCC controls GCKD controls  1KGP controls  WTCCC controls
SNP         Gene            Effect allele  Known locus for  OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value
rs2187668   HLA-DQA1        T              MN               0.93 6.3E-01   0.87 3.9E-01   0.60 1.1E-04   4.48 2.4E-22   4.71 7.8E-17   3.02 2.2E-16
rs4664308   PLA2R1          G              MN               1.04 6.5E-01   1.04 6.6E-01   1.03 7.3E-01   0.45 6.7E-08   0.47 1.0E-06   0.44 6.3E-09
rs11150612  ITGAM-ITGAX     A              IgA              1.14 1.6E-01   1.18 1.2E-01   1.17 6.1E-02   0.73 2.0E-02   0.72 3.1E-02   0.75 3.4E-02
rs11574637  ITGAM-ITGAX     C              IgA              0.76 3.2E-02   0.65 1.9E-03   0.85 1.7E-01   1.13 4.3E-01   0.96 8.2E-01   1.23 1.9E-01
rs12716641  DEFA            C              IgA              0.89 2.0E-01   1.05 6.0E-01   0.96 6.5E-01   1.06 6.4E-01   1.20 1.8E-01   1.06 6.5E-01
rs17019602  VAV3            G              IgA              1.05 6.5E-01   1.11 4.0E-01   1.21 5.6E-02   1.41 1.7E-02   1.52 8.9E-03   1.67 3.0E-04
rs1794275   HLA-DQA/B       A              IgA              1.23 7.0E-02   0.90 4.0E-01   1.25 2.7E-02   0.61 1.4E-02   0.48 3.2E-04   0.64 1.8E-02
rs1883414   HLA-DPB2        A              IgA              0.88 2.0E-01   0.82 7.4E-02   0.85 6.7E-02   0.74 3.8E-02   0.70 2.4E-02   0.74 2.9E-02
rs2033562   KLF10/ODF1      C              IgA              1.15 1.4E-01   1.19 8.6E-02   1.06 4.6E-01   0.78 5.6E-02   0.88 3.5E-01   0.76 2.5E-02
rs2074038   ACCS            T              IgA              1.17 2.7E-01   0.99 9.3E-01   1.13 3.3E-01   0.94 7.5E-01   0.72 1.7E-01   0.88 5.4E-01
rs2412971   HORMAD2/MTMR3   A              IgA              1.14 1.3E-01   1.13 1.9E-01   1.16 6.2E-02   0.90 3.7E-01   0.90 4.3E-01   0.94 6.1E-01
rs2523946   HLA-A           T              IgA              1.15 1.1E-01   0.99 8.9E-01   1.16 5.9E-02   0.80 9.0E-02   0.66 4.0E-03   0.81 8.1E-02
rs2738048   DEFA            G              IgA              0.76 5.6E-03   0.81 4.7E-02   0.89 1.9E-01   0.98 8.7E-01   1.02 9.0E-01   1.14 3.1E-01
rs3115573   HLA region      G              IgA              1.25 1.3E-02   1.58 3.4E-06   1.75 3.8E-12   0.96 7.4E-01   1.19 1.9E-01   1.29 3.8E-02
rs3803800   TNFSF13         G              IgA              0.88 2.6E-01   0.99 9.5E-01   0.92 4.0E-01   1.01 9.7E-01   1.00 9.9E-01   0.97 8.1E-01
rs4077515   CARD9           T              IgA              1.15 1.2E-01   1.26 2.0E-02   1.11 1.9E-01   0.93 5.7E-01   0.96 7.8E-01   0.87 2.5E-01
rs660895    HLA-DRB1        G              IgA              1.09 4.7E-01   0.81 9.2E-02   0.84 8.6E-02   0.60 1.2E-02   0.41 3.3E-05   0.42 1.4E-05
rs6677604   CFHR1,3         A              IgA              0.69 1.5E-03   0.85 2.2E-01   0.75 7.5E-03   0.89 4.7E-01   1.07 7.1E-01   0.92 5.8E-01
rs7634389   ST6GAL1         C              IgA              1.30 5.9E-03   1.13 2.3E-01   1.06 4.9E-01   1.06 6.6E-01   0.95 7.1E-01   0.91 4.3E-01
rs7763262   HLA-DR–HLA-DQ   C              IgA              1.28 1.3E-02   1.06 6.1E-01   1.35 7.2E-04   0.67 1.5E-03   0.55 1.7E-05   0.67 1.0E-03
rs9275596   HLA-DQB1        T              IgA              1.20 5.3E-02   1.03 7.6E-01   1.42 3.7E-05   0.52 3.3E-07   0.43 2.3E-09   0.58 7.0E-06
rs9314614   DEFA            G              IgA              1.00 9.9E-01   0.90 2.8E-01   0.91 2.6E-01   1.07 5.7E-01   0.97 7.9E-01   1.01 9.6E-01
rs9357155   TAP1/2/PSMB8/9  A              IgA              0.99 9.7E-01   0.78 9.5E-02   0.93 5.6E-01   0.67 7.9E-02   0.52 6.9E-03   0.60 2.6E-02
rs1129740   HLA-DQA1        A              SSNS             1.04 6.4E-01   1.08 3.9E-01   0.95 5.6E-01   1.69 8.4E-05   1.65 2.8E-04   1.64 2.4E-04
rs10488631  TNPO3           C              SLE              1.11 4.9E-01   1.06 7.2E-01   0.96 7.7E-01   1.14 5.1E-01   1.17 4.8E-01   1.07 7.2E-01
rs1150754   TNXB            T              SLE              1.07 5.9E-01   1.09 5.6E-01   0.64 2.0E-04   2.77 1.9E-11   3.21 1.5E-10   1.77 3.4E-05
rs4963128   KIAA1542        C              SLE              1.02 8.2E-01   1.06 5.8E-01   1.07 4.2E-01   0.95 7.1E-01   0.95 7.1E-01   1.00 9.9E-01
rs6445975   PXK             T              SLE              1.06 5.5E-01   0.94 5.5E-01   1.08 3.7E-01   1.21 1.9E-01   1.05 7.4E-01   1.18 2.4E-01
rs7574865   STAT4           G              SLE              0.90 3.4E-01   0.93 5.6E-01   0.92 4.0E-01   0.96 7.6E-01   1.00 9.9E-01   1.00 9.9E-01
rs9888739   ITGAM           T              SLE              0.64 4.6E-03   0.57 7.0E-04   0.75 3.6E-02   1.06 7.5E-01   0.95 7.8E-01   1.17 3.9E-01
rs12437854  ESRD            G              T1DM             1.13 4.7E-01   0.99 9.8E-01   1.28 9.7E-02   1.21 4.3E-01   1.04 8.7E-01   1.31 2.2E-01
rs4972593   ESRD            A              T1DM             1.07 6.0E-01   1.38 2.2E-02   1.15 1.8E-01   0.88 4.9E-01   1.09 6.7E-01   0.90 5.3E-01
rs1949829   COBL            T              GPA              1.16 4.0E-01   1.50 4.6E-02   1.53 5.3E-03   0.73 2.6E-01   0.92 7.8E-01   0.97 9.1E-01
rs4862110   DCTD            C              GPA              0.82 8.3E-02   0.87 2.8E-01   1.05 6.8E-01   0.92 6.1E-01   0.99 9.4E-01   1.10 5.7E-01
rs595018    CCDC86          C              GPA              0.97 7.7E-01   0.91 4.2E-01   0.95 6.0E-01   0.95 7.2E-01   0.85 3.5E-01   0.96 7.9E-01
rs7151526   SERPINA1        A              GPA              0.88 5.6E-01   1.06 8.1E-01   1.13 5.2E-01   1.98 5.7E-03   1.94 1.6E-02   2.13 1.4E-03
rs7503953   WSCD1           C              GPA              1.15 3.4E-01   1.36 3.8E-02   1.26 6.1E-02   1.05 7.9E-01   1.24 2.9E-01   1.13 5.0E-01
rs9277554   HLA-DPB1        T              GPA              0.75 6.5E-03   0.66 2.1E-04   0.67 1.5E-05   1.04 7.8E-01   0.88 3.9E-01   0.90 4.4E-01

Supplementary Table 3 (continued)

            SLE                                          GPA                                          T1DM
            GCKD controls  1KGP controls  WTCCC controls GCKD controls  1KGP controls  WTCCC controls GCKD controls  1KGP controls  WTCCC controls
SNP         OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value   OR   p-value
rs2187668   2.36 5.9E-06   2.63 1.0E-06   1.89 5.5E-05   0.75 2.2E-01   0.78 3.4E-01   0.54 9.7E-03   1.89 1.6E-03   2.00 1.9E-03   1.35 1.2E-01
rs4664308   0.83 2.5E-01   0.81 1.6E-01   0.82 1.5E-01   1.03 8.1E-01   1.02 9.0E-01   1.02 8.7E-01   1.36 4.4E-02   1.36 5.3E-02   1.35 4.9E-02
rs11150612  1.03 8.3E-01   1.09 5.9E-01   0.95 7.1E-01   1.09 5.4E-01   1.18 2.8E-01   1.17 2.5E-01   0.71 4.1E-02   0.75 1.1E-01   0.76 9.2E-02
rs11574637  1.56 1.2E-02   1.43 3.6E-02   1.94 8.1E-06   0.50 2.6E-03   0.43 4.1E-04   0.56 1.0E-02   1.24 2.6E-01   1.05 8.1E-01   1.36 1.0E-01
rs12716641  0.90 5.1E-01   1.21 1.8E-01   1.06 6.5E-01   1.21 1.7E-01   1.39 2.2E-02   1.27 7.9E-02   0.75 7.4E-02   0.90 5.1E-01   0.83 2.2E-01
rs17019602  1.17 3.6E-01   1.37 6.6E-02   1.61 1.5E-03   1.42 2.2E-02   1.51 1.2E-02   1.73 3.5E-04   0.97 8.7E-01   1.03 8.8E-01   1.15 4.7E-01
rs1794275   0.75 2.0E-01   0.50 7.5E-04   0.74 1.2E-01   1.32 9.4E-02   0.95 7.7E-01   1.34 7.6E-02   0.64 6.4E-02   0.47 2.0E-03   0.65 6.6E-02
rs1883414   1.35 5.1E-02   1.36 4.3E-02   1.35 2.5E-02   0.60 1.6E-03   0.56 1.3E-03   0.59 1.2E-03   1.04 8.3E-01   1.01 9.4E-01   1.04 7.9E-01
rs2033562   0.92 5.8E-01   0.96 7.6E-01   0.94 6.3E-01   0.94 6.8E-01   1.00 1.0E+00   0.90 4.5E-01   0.80 1.6E-01   0.88 4.4E-01   0.79 1.2E-01
rs2074038   1.01 9.8E-01   0.78 3.1E-01   0.89 5.9E-01   1.36 1.3E-01   1.12 6.0E-01   1.29 2.0E-01   1.04 8.9E-01   0.84 5.2E-01   1.02 9.4E-01
rs2412971   1.00 9.8E-01   1.06 6.7E-01   1.05 7.2E-01   1.08 5.7E-01   1.11 4.5E-01   1.15 2.9E-01   1.19 2.5E-01   1.23 2.0E-01   1.29 9.1E-02
rs2523946   0.72 3.4E-02   0.59 6.2E-04   0.68 5.7E-03   0.91 4.7E-01   0.74 4.0E-02   0.88 3.4E-01   1.04 7.9E-01   0.85 3.3E-01   1.02 9.0E-01
rs2738048   1.13 4.1E-01   1.31 8.0E-02   1.43 6.9E-03   0.86 3.0E-01   0.89 4.6E-01   0.99 9.3E-01   0.98 8.8E-01   1.03 8.7E-01   1.11 5.1E-01
rs3115573   0.86 3.2E-01   0.95 7.1E-01   1.01 9.2E-01   0.82 1.5E-01   0.99 9.2E-01   1.07 6.2E-01   1.05 7.3E-01   1.29 1.1E-01   1.39 3.0E-02
rs3803800   1.02 9.3E-01   1.33 1.0E-01   1.18 3.2E-01   1.09 6.1E-01   1.12 5.0E-01   1.06 7.1E-01   0.99 9.8E-01   1.04 8.5E-01   0.98 9.2E-01
rs4077515   0.70 2.6E-02   0.89 4.5E-01   0.78 5.8E-02   1.02 9.0E-01   1.09 5.6E-01   0.97 8.0E-01   1.33 7.3E-02   1.42 3.7E-02   1.23 1.7E-01
rs660895    1.15 5.1E-01   0.72 9.1E-02   0.76 1.1E-01   1.81 2.0E-04   1.23 2.2E-01   1.30 9.6E-02   3.00 4.6E-11   2.07 4.2E-05   2.19 1.1E-06
rs6677604   0.99 9.4E-01   1.22 2.9E-01   1.09 6.0E-01   1.07 6.7E-01   1.25 2.0E-01   1.12 4.7E-01   0.84 3.6E-01   0.97 9.0E-01   0.86 4.4E-01
rs7634389   1.13 4.4E-01   1.01 9.4E-01   0.99 9.5E-01   1.29 7.5E-02   1.21 1.9E-01   1.09 5.3E-01   0.94 6.9E-01   0.87 4.1E-01   0.82 2.3E-01
rs7763262   0.57 3.0E-04   0.43 3.1E-08   0.52 9.3E-07   1.16 3.0E-01   0.88 4.2E-01   1.14 3.7E-01   1.21 2.5E-01   0.93 6.9E-01   1.20 2.7E-01
rs9275596   0.57 3.4E-04   0.48 1.2E-06   0.62 4.3E-04   1.25 1.2E-01   0.99 9.4E-01   1.38 2.5E-02   1.20 2.6E-01   0.96 8.0E-01   1.34 7.2E-02
rs9314614   0.91 5.5E-01   0.79 1.0E-01   0.78 5.1E-02   0.98 8.6E-01   0.91 5.2E-01   0.91 4.8E-01   0.99 9.2E-01   0.91 5.6E-01   0.91 5.5E-01
rs9357155   1.18 5.0E-01   0.69 1.1E-01   0.87 5.2E-01   1.45 5.2E-02   1.10 6.4E-01   1.32 1.4E-01   1.76 5.8E-03   1.35 1.7E-01   1.60 2.0E-02
rs1129740   1.21 2.2E-01   1.28 7.9E-02   1.20 1.9E-01   1.18 2.2E-01   1.19 2.0E-01   1.01 9.7E-01   2.13 1.1E-05   2.05 2.9E-05   2.09 2.9E-05
rs10488631  1.49 6.1E-02   1.79 5.7E-03   1.54 1.6E-02   1.01 9.8E-01   1.03 9.1E-01   0.93 7.3E-01   1.09 7.3E-01   1.11 7.0E-01   1.00 9.9E-01
rs1150754   1.97 2.8E-04   2.37 8.5E-06   1.51 9.4E-03   0.94 7.7E-01   1.12 6.0E-01   0.64 3.0E-02   2.53 2.5E-07   3.04 9.0E-08   1.71 1.8E-03
rs4963128   0.94 7.3E-01   1.14 4.3E-01   1.12 4.2E-01   0.85 2.7E-01   0.85 3.3E-01   0.88 3.6E-01   0.71 2.8E-02   0.69 3.3E-02   0.73 4.1E-02
rs6445975   0.94 7.3E-01   0.84 2.6E-01   1.01 9.6E-01   0.89 4.1E-01   0.80 1.6E-01   0.92 5.6E-01   0.94 7.2E-01   0.84 3.3E-01   0.96 7.9E-01
rs7574865   0.53 9.7E-05   0.55 1.6E-04   0.51 1.9E-06   1.02 9.2E-01   1.07 7.0E-01   1.07 7.0E-01   0.85 3.4E-01   0.91 6.2E-01   0.89 5.1E-01
rs9888739   1.60 1.4E-02   1.61 1.1E-02   2.24 5.0E-07   0.48 7.9E-03   0.42 2.3E-03   0.55 2.8E-02   1.18 4.6E-01   1.02 9.2E-01   1.32 2.0E-01
rs12437854  1.37 2.7E-01   1.00 9.9E-01   1.25 3.5E-01   1.36 2.1E-01   1.11 7.0E-01   1.37 2.0E-01   0.88 7.1E-01   0.75 4.0E-01   0.99 9.7E-01
rs4972593   1.10 6.5E-01   1.35 1.2E-01   1.18 3.1E-01   0.98 9.0E-01   1.24 3.0E-01   1.04 8.5E-01   1.29 1.9E-01   1.64 2.3E-02   1.34 1.2E-01
rs1949829   0.98 9.5E-01   1.44 2.1E-01   1.60 5.0E-02   0.75 3.6E-01   0.93 8.3E-01   1.01 9.8E-01   1.16 6.1E-01   1.42 2.7E-01   1.51 1.4E-01
rs4862110   0.81 2.8E-01   0.85 4.2E-01   0.83 3.6E-01   1.08 6.6E-01   1.15 4.6E-01   1.25 2.4E-01   1.13 5.1E-01   1.21 3.4E-01   0.88 6.0E-01
rs595018    0.82 2.8E-01   0.91 5.7E-01   0.83 2.4E-01   1.00 9.8E-01   0.97 8.7E-01   1.00 9.9E-01   0.88 5.1E-01   0.84 3.7E-01   0.84 3.3E-01
rs7151526   0.91 8.1E-01   0.97 9.4E-01   0.86 6.6E-01   1.73 5.3E-02   1.77 6.2E-02   2.28 1.3E-03   1.43 2.9E-01   1.44 2.8E-01   1.58 1.6E-01
rs7503953   0.89 5.9E-01   1.13 5.6E-01   1.24 2.8E-01   0.93 7.4E-01   1.07 7.3E-01   1.03 8.9E-01   0.66 5.0E-02   0.80 3.0E-01   0.77 2.0E-01
rs9277554   1.24 1.7E-01   1.17 3.1E-01   1.22 1.5E-01   0.14 1.7E-11   0.13 8.6E-12   0.13 1.4E-12   1.05 7.9E-01   0.94 7.1E-01   0.95 7.6E-01

MN: Membranous nephropathy, IgA: IgA nephropathy, SSNS: Steroid-sensitive nephrotic syndrome, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, ESRD: End-stage renal disease, OR: Odds ratio. Bold: statistically significant association. The significance threshold was set at 2.6 × 10⁻³ for known associations (Bonferroni correction, one-sided hypothesis), and at 1.3 × 10⁻³ for the others.
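The significance thresholds quoted in the footnotes of the supplementary tables can be reproduced with a short calculation. The sketch below assumes a family-wise error rate of 0.05, 55 tested SNPs for Supplementary Tables 1 and 2, and 38 tested SNPs for Supplementary Table 3 (the numbers of SNP rows in those tables). It also assumes that the one-sided threshold corresponds to twice the two-sided one. These assumptions reproduce the quoted values but are not spelled out in the footnotes themselves, and the function name is hypothetical. The conventional genome-wide threshold of 5 × 10⁻⁸ is included for comparison only.

```python
GENOME_WIDE_THRESHOLD = 5e-8  # conventional GWAS significance level, for comparison


def bonferroni_threshold(n_tests: int, alpha: float = 0.05, one_sided: bool = False) -> float:
    """Per-test p-value threshold after Bonferroni correction.

    With one_sided=True the threshold is doubled, which is how a one-sided
    hypothesis is assumed here to be applied to two-sided p-values.
    """
    threshold = alpha / n_tests
    return 2.0 * threshold if one_sided else threshold


if __name__ == "__main__":
    print(f"55 SNPs, two-sided: {bonferroni_threshold(55):.1e}")
    print(f"38 SNPs, one-sided: {bonferroni_threshold(38, one_sided=True):.1e}")
    print(f"38 SNPs, two-sided: {bonferroni_threshold(38):.1e}")
    print(f"genome-wide:        {GENOME_WIDE_THRESHOLD:.1e}")
```

Comparing the candidate-SNP thresholds with the genome-wide value shows how much less stringent the correction becomes when testing is restricted to a predefined set of variants.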

10 Bibliography

Adler, A. I., R. J. Stevens, et al. (2003). "Development and progression of nephropathy in type 2 diabetes: the United Kingdom Prospective Diabetes Study (UKPDS 64)." Kidney Int 63(1): 225-232.
Alfaadhel, T. and D. Cattran (2015). "Management of Membranous Nephropathy in Western Countries." Kidney Dis (Basel) 1(2): 126-137.
Anderson, C. A., F. H. Pettersson, et al. (2010). "Data quality control in genetic case-control association studies." Nat Protoc 5(9): 1564-1573.
Arnold, M., J. Raffler, et al. (2015). "SNiPA: an interactive, genetic variant-centered annotation browser." Bioinformatics 31(8): 1334-1336.
Auton, A., L. D. Brooks, et al. (2015). "A global reference for human genetic variation." Nature 526(7571): 68-74.
Baigent, C., M. J. Landray, et al. (2011). "The effects of lowering LDL cholesterol with simvastatin plus ezetimibe in patients with chronic kidney disease (Study of Heart and Renal Protection): a randomised placebo-controlled trial." Lancet 377(9784): 2181-2192.
Barrett, J. C., D. G. Clayton, et al. (2009). "Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes." Nat Genet 41(6): 703-707.
Bates, J. M., H. M. Raffi, et al. (2004). "Tamm-Horsfall protein knockout mice are more prone to urinary tract infection: rapid communication." Kidney Int 65(3): 791-797.
Boger, C. A., M. H. Chen, et al. (2011). "CUBN is a gene locus for albuminuria." J Am Soc Nephrol 22(3): 555-570.
Bradfield, J. P., H. Q. Qu, et al. (2011). "A genome-wide meta-analysis of six type 1 diabetes cohorts identifies multiple associated loci." PLoS Genet 7(9): e1002293.
Chang, C. C., C. C. Chow, et al. (2015). "Second-generation PLINK: rising to the challenge of larger and richer datasets." Gigascience 4: 7.
Chung, S. A., K. E. Taylor, et al. (2011). "Differential genetic associations for systemic lupus erythematosus based on anti-dsDNA autoantibody production." PLoS Genet 7(3): e1001323.
Cooper, J. D., D. J. Smyth, et al. (2008). "Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci." Nat Genet 40(12): 1399-1401.
Cooper, M. E. (2001). "Interaction of metabolic and haemodynamic factors in mediating experimental diabetic nephropathy." Diabetologia 44(11): 1957-1972.
Cravedi, P., P. Ruggenenti, et al. (2010). "Which antihypertensive drugs are the most nephroprotective and why?" Expert Opin Pharmacother 11(16): 2651-2663.
Crotty, S. (2015). "A brief history of T cell help to B cells." Nat Rev Immunol 15(3): 185-189.
Dalla Vestra, M., A. Saller, et al. (2000). "Structural involvement in type 1 and type 2 diabetic nephropathy." Diabetes Metab 26 Suppl 4: 8-14.

De Nicola, L. and R. Minutolo (2016). "Worldwide growing epidemic of CKD: fact or fiction?" Kidney Int 90(3): 482-484. Delaneau, O., J. F. Zagury, et al. (2013). "Improved whole-chromosome phasing for disease and population genetic studies." Nat Methods 10(1): 5-6. Devuyst, O., E. Olinger, et al. (2017). "Uromodulin: from physiology to rare and complex kidney disorders." Nat Rev Nephrol 13(9): 525-544. Dewan, A., M. Liu, et al. (2006). "HTRA1 promoter polymorphism in wet age-related macular degeneration." Science 314(5801): 989-992. Eckardt, K. U., B. Barthlein, et al. (2012). "The German Chronic Kidney Disease (GCKD) study: design and methods." Nephrol Dial Transplant 27(4): 1454-1460. Eckardt, K. U., J. Coresh, et al. (2013). "Evolving importance of kidney disease: from subspecialty to global health burden." Lancet 382(9887): 158-169. El-Achkar, T. M., X. R. Wu, et al. (2008). "Tamm-Horsfall protein protects the kidney from ischemic injury by decreasing inflammation and altering TLR4 expression." Am J Physiol Renal Physiol 295(2): F534-544. Forsythe, E., K. Sparks, et al. (2017). "Risk Factors for Severe Renal Disease in Bardet-Biedl Syndrome." J Am Soc Nephrol 28(3): 963-970. Francis, J. M., L. H. Beck, Jr., et al. (2016). "Membranous Nephropathy: A Journey From Bench to Bedside." Am J Kidney Dis 68(1): 138-147. Freedman, B. I. and A. H. Cohen (2016). "Hypertension-attributed nephropathy: what's in a name?" Nat Rev Nephrol 12(1): 27-36. Gbadegesin, R. A., A. Adeyemo, et al. (2015). "HLA-DQA1 and PLCG2 Are Candidate Risk Loci for Childhood-Onset Steroid-Sensitive Nephrotic Syndrome." J Am Soc Nephrol 26(7): 1701-1710. Gene Cards - Human Gene Database: http://www.genecards.org/cgi-bin/carddisp.pl?gene=WDR37. Gharavi, A. G., K. Kiryluk, et al. (2011). "Genome-wide association study identifies susceptibility loci for IgA nephropathy." Nat Genet 43(4): 321-327. Ghirotto, S., F. Tassi, et al. (2016). 
"The Uromodulin Gene Locus Shows Evidence of Pathogen Adaptation through Human Evolution." J Am Soc Nephrol 27(10): 2983- 2996. Global Burden of Disease Study 2016 "Global, regional, and national incidence, prevalence, and years lived with disability for 328 diseases and injuries for 195 countries, 1990- 2016: a systematic analysis for the Global Burden of Disease Study 2016." Lancet 390(10100): 1211-1259. Grant, S. F., H. Q. Qu, et al. (2009). "Follow-up analysis of genome-wide association data identifies novel loci for type 1 diabetes." Diabetes 58(1): 290-295. Hakonarson, H., S. F. Grant, et al. (2007). "A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene." Nature 448(7153): 591-594.


Hallan, S. I., J. Coresh, et al. (2006). "International comparison of the relationship of chronic kidney disease prevalence and ESRD risk." J Am Soc Nephrol 17(8): 2275-2284. Hallan, S. I., M. A. Ovrehus, et al. (2016). "Long-term trends in the prevalence of chronic kidney disease and the influence of cardiovascular risk factors in Norway." Kidney Int 90(3): 665-673. Harley, J. B., M. E. Alarcon-Riquelme, et al. (2008). "Genome-wide association scan in women with systemic lupus erythematosus identifies susceptibility variants in ITGAM, PXK, KIAA1542 and other loci." Nat Genet 40(2): 204-210. Hart, T. C., M. C. Gorry, et al. (2002). "Mutations of the UMOD gene are responsible for medullary cystic kidney disease 2 and familial juvenile hyperuricaemic nephropathy." J Med Genet 39(12): 882-892. Hogan, S. L., P. H. Nachman, et al. (1996). "Prognostic markers in patients with antineutrophil cytoplasmic autoantibody-associated microscopic polyangiitis and glomerulonephritis." J Am Soc Nephrol 7(1): 23-32. Holle, J. U. e. a. (2013). "Genetische Risikofaktoren von Vaskulitiden." Internist 2013 · 55:128–134 DOI 10.1007/s00108-013-3305-9. Hopper, J., Jr., P. A. Trew, et al. (1981). "Membranous nephropathy: its relative benignity in women." Nephron 29(1-2): 18-24. Howie, B. N., P. Donnelly, et al. (2009). "A flexible and accurate genotype imputation method for the next generation of genome-wide association studies." PLoS Genet 5(6): e1000529. Iwamoto, T. and T. B. Niewold (2016). "Genetics of human lupus nephritis." Clin Immunol. Jaacks, L. M., K. R. Siegel, et al. (2016). "Type 2 diabetes: A 21st century epidemic." Best Pract Res Clin Endocrinol Metab 30(3): 331-343. Johnson, A. D., R. E. Handsaker, et al. (2008). "SNAP: a web-based tool for identification and annotation of proxy SNPs using HapMap." Bioinformatics 24(24): 2938-2939. Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group (2013). 
"KDIGO 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease." Kidney Int Suppl, 3, 1–150. Kiryluk, K., Y. Li, et al. (2012). "Geographic differences in genetic susceptibility to IgA nephropathy: GWAS replication study and geospatial risk analysis." PLoS Genet 8(6): e1002765. Kiryluk, K., Y. Li, et al. (2014). "Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens." Nat Genet 46(11): 1187- 1196. Klein, B. E., R. Klein, et al. (1996). "Parental history of diabetes in a population-based study." Diabetes Care 19(8): 827-830. Kottgen, A., N. L. Glazer, et al. (2009). "Multiple loci associated with indices of renal function and chronic kidney disease." Nat Genet 41(6): 712-717.


Köttgen, A. e. a. (2010). "Multiple New Loci Associated with Kidney Function and Chronic Kidney Disease: The CKDGen consortium." Nat Genet. 2010 May ; 42(5): 376–384. doi:10.1038/ng.568. Lee, T., A. K. Biddle, et al. (2014). "Personalized prophylactic anticoagulation decision analysis in patients with membranous nephropathy." Kidney Int 85(6): 1412-1420. Levey, A. S. and J. Coresh (2012). "Chronic kidney disease." Lancet 379(9811): 165-180. Levey, A. S., L. A. Stevens, et al. (2009). "A new equation to estimate glomerular filtration rate." Ann Intern Med 150(9): 604-612. Levin, A., M. Tonelli, et al. (2017). "Global kidney health 2017 and beyond: a roadmap for closing gaps in care, research, and policy." Lancet. Li, M., J. N. Foo, et al. (2015). "Identification of new susceptibility loci for IgA nephropathy in Han Chinese." Nat Commun 6: 7270. Liu, Y., L. Mo, et al. (2010). "Progressive renal papillary calcification and ureteral stone formation in mice deficient for Tamm-Horsfall protein." Am J Physiol Renal Physiol 299(3): F469-478. Marchini, J. and B. Howie (2010). "Genotype imputation for genome-wide association studies." Nat Rev Genet 11(7): 499-511. MDRD study group (1992). "The Modification of Diet in Renal Disease Study: design, methods, and results from the feasibility study." Am J Kidney Dis 20(1): 18-33. Mirrakhimov, A. E., A. M. Ali, et al. (2014). "Primary Nephrotic Syndrome in Adults as a Risk Factor for Pulmonary Embolism: An Up-to-Date Review of the Literature." Int J Nephrol 2014: 916760. Mizuno, T., W. Sato, et al. (2015). "Significance of downregulation of renal organic cation transporter (SLC47A1) in cisplatin-induced proximal tubular injury." Onco Targets Ther 8: 1701-1706. Mutig, K., T. Kahl, et al. (2011). "Activation of the bumetanide-sensitive Na+,K+,2Cl- cotransporter (NKCC2) is facilitated by Tamm-Horsfall protein in a chloride-sensitive manner." J Biol Chem 286(34): 30200-30210. Nishida, M., R. Kato, et al. (2015). 
"Coexisting Membranous Nephropathy and IgA Nephropathy." Fetal Pediatr Pathol 34(6): 351-354. O'Seaghdha, C. M., R. S. Parekh, et al. (2011). "The MYH9/APOL1 region and chronic kidney disease in European-Americans." Hum Mol Genet 20(12): 2450-2456. Okada, Y., X. Sim, et al. (2012). "Meta-analysis identifies multiple loci associated with kidney function-related traits in east Asian populations." Nat Genet 44(8): 904-909. Olabisi, O. A., J. Y. Zhang, et al. (2016). "APOL1 kidney disease risk variants cause cytotoxicity by depleting cellular potassium and inducing stress-activated protein kinases." Proc Natl Acad Sci U S A 113(4): 830-837.


Padmanabhan, S., O. Melander, et al. (2010). "Genome-wide association study of blood pressure extremes identifies variant near UMOD associated with hypertension." PLoS Genet 6(10): e1001177. Parving, H. H. (2000). "Blockade of the renin-angiotensin-aldosterone system and renal protection in diabetes mellitus." J Renin Angiotensin Aldosterone Syst 1(1): 30-31. Pattaro, C. (2015). "Genetic Associations at 53 Loci Highlight Cell Types and Biologic Pathways for Kidney Function." Nat Commun 7: 10023. Patterson, N., A. L. Price, et al. (2006). "Population structure and eigenanalysis." PLoS Genet 2(12): e190. Price, A. L., N. A. Zaitlen, et al. (2010). "New approaches to population stratification in genome-wide association studies." Nat Rev Genet 11(7): 459-463. Reich, H. N., S. Troyanov, et al. (2007). "Remission of proteinuria improves prognosis in IgA nephropathy." J Am Soc Nephrol 18(12): 3177-3183. Renigunta, A., V. Renigunta, et al. (2011). "Tamm-Horsfall glycoprotein interacts with renal outer medullary potassium channel ROMK2 and regulates its function." J Biol Chem 286(3): 2224-2235. Rysz, J., A. Gluba-Brzozka, et al. (2017). "Novel Biomarkers in the Diagnosis of Chronic Kidney Disease and the Prediction of Its Outcome." Int J Mol Sci 18(8). Sandholm, N., A. J. McKnight, et al. (2013). "Chromosome 2q31.1 associates with ESRD in women with type 1 diabetes." J Am Soc Nephrol 24(10): 1537-1543. Sandholm, N., R. M. Salem, et al. (2012). "New susceptibility loci associated with kidney disease in type 1 diabetes." PLoS Genet 8(9): e1002921. Schaefer, E., A. Zaloszyc, et al. (2011). "Mutations in SDCCAG8/NPHP10 Cause Bardet- Biedl Syndrome and Are Associated with Penetrant Renal Disease and Absent Polydactyly." Mol Syndromol 1(6): 273-281. Schaeffer, C., A. Cattaneo, et al. (2012). "Urinary secretion and extracellular aggregation of mutant uromodulin isoforms." Kidney Int 81(8): 769-778. Scolari, F., C. Izzi, et al. (2015). 
"Uromodulin: from monogenic to multifactorial diseases." Nephrol Dial Transplant 30(8): 1250-1256. Shaw, J. E., R. A. Sicree, et al. (2010). "Global estimates of the prevalence of diabetes for 2010 and 2030." Diabetes Res Clin Pract 87(1): 4-14. Sinico, R. A., L. Di Toma, et al. (2013). "Renal involvement in anti-neutrophil cytoplasmic autoantibody associated vasculitis." Autoimmun Rev 12(4): 477-482. Smerud, H. K., P. Barany, et al. (2011). "New treatment for IgA nephropathy: enteric budesonide targeted to the ileocecal region ameliorates proteinuria." Nephrol Dial Transplant 26(10): 3237-3242. Smigielski, E. M., K. Sirotkin, et al. (2000). "dbSNP: a database of single nucleotide polymorphisms." Nucleic Acids Res 28(1): 352-355.


Stanescu, H. C., M. Arcos-Burgos, et al. (2011). "Risk HLA-DQA1 and PLA(2)R1 alleles in idiopathic membranous nephropathy." N Engl J Med 364(7): 616-626. Steffes, M. W., R. Osterby, et al. (1989). "Mesangial expansion as a central mechanism for loss of kidney function in diabetic patients." Diabetes 38(9): 1077-1081. Stevens, P. E. and A. Levin (2013). "Evaluation and management of chronic kidney disease: synopsis of the kidney disease: improving global outcomes 2012 clinical practice guideline." Ann Intern Med 158(11): 825-830. Stokman, M., M. Lilien, et al. (1993). "Nephronophthisis." Sudmant, P. H., T. Rausch, et al. (2015). "An integrated map of structural variation in 2,504 human genomes." Nature 526(7571): 75-81. Tarzi, R. M. and C. D. Pusey (2014). "Current and future prospects in the management of granulomatosis with polyangiitis (Wegener's granulomatosis)." Ther Clin Risk Manag 10: 279-293. The Wellcome Trust Case Control Consortium (2007). "Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls." Nature 447(7145): 661-678. Titze, S., M. Schmid, et al. (2015). "Disease burden and risk profile in referred patients with moderate chronic kidney disease: composition of the German Chronic Kidney Disease (GCKD) cohort." Nephrol Dial Transplant 30(3): 441-451. Tomana, M., K. Matousovic, et al. (1997). "Galactose-deficient IgA1 in sera of IgA nephropathy patients is present in complexes with IgG." Kidney Int 52(2): 509-516. Tomana, M., J. Novak, et al. (1999). "Circulating immune complexes in IgA nephropathy consist of IgA1 with galactose-deficient hinge region and antiglycan antibodies." J Clin Invest 104(1): 73-81. Tomas, N. M., L. H. Beck, Jr., et al. (2014). "Thrombospondin type-1 domain-containing 7A in idiopathic membranous nephropathy." N Engl J Med 371(24): 2277-2287. Tomer, Y., L. M. Dolan, et al. (2015). 
"Genome wide identification of new genes and pathways in patients with both autoimmune thyroiditis and type 1 diabetes." J Autoimmun 60: 32-39. Trudu, M., S. Janas, et al. (2013). "Common noncoding UMOD gene variants induce salt- sensitive hypertension and kidney damage by increasing uromodulin expression." Nat Med 19(12): 1655-1660. Tsokos, G. C. (2011). "Systemic lupus erythematosus." N Engl J Med 365(22): 2110-2121. Turner, N. e. a. (2015). "Oxford Textbook of Clinical Nephrology (4th edition)." Oxford University Press. USRDS Annual Data Report 2016. Vanhamme, L., F. Paturiaux-Hanocq, et al. (2003). "Apolipoprotein L-I is the trypanosome lytic factor of human serum." Nature 422(6927): 83-87.


Visscher, P. M., M. A. Brown, et al. (2012). "Five years of GWAS discovery." Am J Hum Genet 90(1): 7-24. Wan, G., S. Zhaorigetu, et al. (2008). "Apolipoprotein L1, a novel Bcl-2 homology domain 3- only lipid-binding protein, induces autophagic cell death." J Biol Chem 283(31): 21540-21549. Wanner, C., V. Krane, et al. (2005). "Atorvastatin in patients with type 2 diabetes mellitus undergoing hemodialysis." N Engl J Med 353(3): 238-248. Warling, O., C. Bovy, et al. (2014). "Overlap syndrome consisting of PSC-AIH with concomitant presence of a membranous glomerulonephritis and ulcerative colitis." World J Gastroenterol 20(16): 4811-4816. Welter, D., J. MacArthur, et al. (2014). "The NHGRI GWAS Catalog, a curated resource of SNP-trait associations." Nucleic Acids Res 42(Database issue): D1001-1006. Wolfe, R. A., V. B. Ashby, et al. (1999). "Comparison of mortality in all patients on dialysis, patients on dialysis awaiting transplantation, and recipients of a first cadaveric transplant." N Engl J Med 341(23): 1725-1730. Wuttke, M. and A. Kottgen (2016). "Insights into kidney diseases from genome-wide association studies." Nat Rev Nephrol 12(9): 549-562. Wyatt, R. J. and B. A. Julian (2013). "IgA nephropathy." N Engl J Med 368(25): 2402-2414. Xie, G., D. Roshandel, et al. (2013). "Association of granulomatosis with polyangiitis (Wegener's) with HLA-DPB1*04 and SEMA6A gene variants: evidence from genome-wide analysis." Arthritis Rheum 65(9): 2457-2468. Yonezawa, A. and K. Inui (2011). "Importance of the multidrug and toxin extrusion MATE/SLC47A family to pharmacokinetics, pharmacodynamics/toxicodynamics and pharmacogenomics." Br J Pharmacol 164(7): 1817-1825. Zent, R., R. Nagai, et al. (1997). "Idiopathic membranous nephropathy in the elderly: a comparative study." Am J Kidney Dis 29(2): 200-206. Zhang, Q. L. and D. Rothenbacher (2008). "Prevalence of chronic kidney disease in population-based studies: systematic review." BMC Public Health 8: 117.


11 Original publication in Scientific Reports

Parts of this thesis have been published as an article entitled “Associations between genetic risk variants for kidney diseases and kidney disease etiology” in Scientific Reports. The article has been available at www.nature.com/articles/s41598-017-13356-6 since 24 October 2017.


www.nature.com/scientificreports

OPEN

Associations between genetic risk variants for kidney diseases and kidney disease etiology

Sebastian Wunnenburger, Ulla T. Schultheiss, Gerd Walz, Birgit Hausknecht, Arif B. Ekici, Florian Kronenberg, Kai-Uwe Eckardt, Anna Köttgen & Matthias Wuttke

Received: 2 June 2017
Accepted: 21 September 2017
Published: xx xx xxxx

Chronic kidney disease (CKD) is a global health problem with a genetic component. Genome-wide association studies have identified variants associated with specific CKD etiologies, but their genetic overlap has not been well studied. This study examined SNP associations across different CKD etiologies and CKD stages using data from 5,034 CKD patients of the German Chronic Kidney Disease study. In addition to confirming known associations, a systemic lupus erythematosus-associated risk variant at TNXB was also associated with CKD attributed to type 1 diabetes (p = 2.5 × 10^−7), a membranous nephropathy-associated variant at HLA-DQA1 was also associated with CKD attributed to systemic lupus erythematosus (p = 5.9 × 10^−6), and an IgA risk variant at HLA-DRB1 was associated with CKD attributed both to granulomatosis with polyangiitis (p = 2.0 × 10^−4) and to type 1 diabetes (p = 4.6 × 10^−11). Associations were independent of additional risk variants in the respective genetic regions. Evaluation of CKD stage showed a significant association of the UMOD risk variant, previously identified in population-based studies for association with kidney function, for advanced stage ≥G3b compared to early-stage CKD (≤stage G2). Shared genetic associations across CKD etiologies and stages highlight the role of the immune response in CKD. Association studies with detailed information on CKD etiology can reveal shared genetic risk variants.

The prevalence of chronic kidney disease (CKD) is high, with >10% of the adult population affected in many countries1. Its genetic architecture is complex and incompletely understood. Genome-wide association studies (GWAS) have helped to gain insight into complex disease genetics2,3 by identifying single-nucleotide polymorphisms (SNPs) in >70 independent risk loci associated with the estimated glomerular filtration rate (eGFR), CKD risk and microalbuminuria (MA)4–6 as well as specific kidney diseases such as IgA7,8 or membranous nephropathy9 in case-control studies. Because many of these specific kidney diseases are individually rare, only very few studies have collected sufficient numbers of patients with CKD attributed to several of these specific etiologies using one study design and protocol. Consequently, genetic risk variants identified in association with a specific etiology of CKD have so far not been examined for their association with CKD attributed to other etiologies. Capitalizing on data from the large German Chronic Kidney Disease (GCKD) study, we therefore aimed to systematically examine whether risk loci discovered for specific etiologies of CKD, especially for autoimmune conditions10, are associated with other CKD etiologies as well. Additionally, we aimed to examine whether risk loci discovered in the general population are also associated with advanced stages of CKD and with CKD in patients for whom the leading cause of disease was hypertension or diabetes, the most common causes of CKD.

Subjects and Methods

The GCKD study11,12 is an ongoing prospective observational study of 5,217 patients under nephrological care, followed for up to 10 years. At enrolment, all patients had CKD defined as an estimated glomerular filtration rate (eGFR) of 30–60 mL/min/1.73 m² or either a urinary albumin-to-creatinine ratio (UACR) >300 mg/g or a

Institute of Genetic Epidemiology, Medical Center - University of Freiburg, Faculty of Medicine, Freiburg, Germany. Division of Nephrology, University of Freiburg, Faculty of Medicine, Freiburg, Germany. Department of Nephrology and Hypertension, University of Erlangen-Nürnberg, Erlangen, Germany. Institute of Human Genetics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany. Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, Innsbruck, Austria. Anna Köttgen and Matthias Wuttke contributed equally to this work. Correspondence and requests for materials should be addressed to A.K.

SCIENTIFIC REPORTS | 7: 13944 | DOI:10.1038/s41598-017-13356-6

protein-to-creatinine ratio >500 mg/g when eGFR was >60 mL/min/1.73 m². The GCKD study was approved by local ethics committees and registered in the national registry of clinical studies (DRKS 0003971). All methods were carried out in accordance with relevant guidelines and regulations. Written informed consent was obtained from all subjects.

Case groups for all analyses of specific CKD etiologies were derived based on the leading cause of CKD, which was determined by the treating nephrologist using standardized case report forms. Serum creatinine was measured using an IDMS-traceable gold-standard method. eGFR was calculated using the CKD-EPI equation13. Regardless of CKD etiology, two case groups of “advanced CKD” status were defined and included all patients with eGFR <45 ml/min/1.73 m² (stage G3b+, n = 2245) or UACR ≥ 300 mg/g (stage A3, n = 1385), respectively. For the etiology-specific analyses, the control group comprised 1,655 GCKD patients for whom CKD etiology was assigned to a cause that could reasonably be assumed to differ from the case groups (nephrosclerosis, infections, tumor nephrectomies, interstitial nephritis and vascular diseases). For the examination of stage G3b+ and A3 CKD, GCKD patients with eGFR ≥ 45 ml/min/1.73 m² (n = 1006) and UACR < 30 mg/g (n = 2117) were selected as control groups, respectively.

In the GCKD study, 5,123 participants were genotyped for 2,612,357 markers at the Helmholtz Center Munich using the Illumina Infinium Omni 2.5 Exome-8 microarray (Illumina, GenomeStudio, Genotyping Module Version 1.9.4). Data cleaning was performed according to standard protocols14,15. Plink v1.90, the R programming language and custom shell scripts were used during cleaning. Individual-level data were filtered for missingness (<3%) and mean heterozygosity (>2 SD). Sex checks were performed. These checks resulted in 57 individuals being filtered out.
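As an illustration, the advanced-CKD case and control definitions given earlier in this section can be written as a small classification routine. This is a minimal sketch in Python and not the study's own code: the function names are hypothetical, and the thresholds are the ones stated in the text (eGFR in ml/min/1.73 m², UACR in mg/g).

```python
from typing import Optional


def egfr_group(egfr: float) -> str:
    """Stage G3b+ analysis: eGFR < 45 defines cases, eGFR >= 45 controls."""
    return "case" if egfr < 45 else "control"


def uacr_group(uacr: float) -> Optional[str]:
    """Stage A3 analysis: UACR >= 300 defines cases, UACR < 30 controls.

    Patients with a UACR of 30-299 mg/g belong to neither group (None).
    """
    if uacr >= 300:
        return "case"
    if uacr < 30:
        return "control"
    return None


# Hypothetical example patients, for illustration only.
print(egfr_group(38.0), uacr_group(450.0))
print(egfr_group(52.0), uacr_group(120.0))
```

Returning None for the intermediate UACR range makes explicit that these patients are excluded from the A3 comparison rather than counted as controls.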
Identity-by-descent (IBD) allele sharing calculations were used to check for unrecognized and cryptic relatedness. Across 1.28 × 10^7 evaluated pairs, we detected 11 with a proportion of alleles shared IBD of >0.1875 (between second and third degree relatives). For these pairs, we removed one individual prior to data analysis. Principal component analyses (PCA) using the software Eigenstrat SmartPCA16 were conducted to examine and account for population stratification. Outliers in terms of genetic ancestry were removed (automatic outlier detection, taking into account 10 PCs, deviation >8 SD). SNPs were filtered for call rate (>96%), minor allele frequency (>1%) and for deviation from the Hardy-Weinberg equilibrium (p > 1 × 10^−5). Genotype imputation using the 1000 Genomes Phase 3 ALL reference panel17 was conducted according to standard protocols14,15. Imputation quality was assessed using the Impute2 “info” measure and was >0.8 for all SNPs. The final dataset after QC contained 5,034 individuals with genotypes for 2,337,794 SNPs.

A literature search was performed to assemble a table of all previously reported SNPs that were associated with either the kidney function measures eGFR and UACR, or with general or etiology-specific CKD risk in populations of European ancestry at genome-wide significance (p < 5 × 10^−8) and with evidence for replication3.

Descriptive statistics were derived as percentages and frequency distributions for all categorical variables and as mean and standard deviation or median and interquartile range for continuous variables, depending on their distribution. Thresholds of statistical significance were defined for each analysis using a Bonferroni correction to account for multiple testing and set to p < 0.05/n for two-sided hypothesis testing, with n as the number of tested SNPs multiplied by the number of phenotypes.
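The Bonferroni rule just described amounts to a single division. The following is a minimal sketch, assuming a family-wise α of 0.05 as stated in the text; the function name is hypothetical, and the SNP and phenotype counts are the ones used for the two sets of analyses in this study (38 SNPs × 5 etiologies and 55 SNPs × 2 phenotypes).

```python
def bonferroni_alpha(n_snps: int, n_phenotypes: int, family_alpha: float = 0.05) -> float:
    """Per-test significance threshold: family-wise alpha divided by the number of tests."""
    return family_alpha / (n_snps * n_phenotypes)


alpha_etiology = bonferroni_alpha(38, 5)  # etiology-specific analyses
alpha_htn_dm = bonferroni_alpha(55, 2)    # hypertensive and diabetic nephropathy analyses

print(f"{alpha_etiology:.1e}")  # 2.6e-04
print(f"{alpha_htn_dm:.1e}")    # 4.5e-04
```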
For the analyses comparing etiologies, we used α = 0.05/(38*5) = 2.6 × 10^−4; for the analyses comparing hypertensive and diabetic nephropathy, we used α = 0.05/(55*2) = 4.5 × 10^−4. Power calculations were performed using the software package Quanto18.

Multivariable adjusted logistic regression analyses were used to evaluate the association between CKD etiology and genotype dosage assuming an additive genetic model, with sex and age as covariates. Conditional analyses additionally accounted for further SNPs. We did not adjust the logistic models for principal components because the genetic ancestry of the study population was very homogeneous as a result of the study design and data cleaning. Sensitivity analyses adjusting for principal components were carried out to verify that results were robust. STATA 13.0 (StataCorp., College Station, TX) was used to perform all analyses.

Results

Association studies were performed using different case definitions as the outcome and genotype data from 38 + 55 = 93 selected SNPs3 as the exposure (Subjects and Methods). All GCKD patients were of European ancestry, and 60% of them were male. Mean age at study entrance was 60 ± 12 years, and mean eGFR was 49.5 ± 18.1 ml/min/1.73 m² (Table 1).

We tested 38 known etiology-specific risk loci for association with additional CKD etiologies across the broad spectrum of CKD etiologies available in the GCKD study: IgA nephropathy (n = 366), membranous nephropathy (MN, n = 147), systemic lupus erythematosus (SLE, n = 128), granulomatosis with polyangiitis (GPA, n = 116) and type 1 diabetes mellitus (T1DM, n = 91; Supplementary Table 1). The proportion of patients with a renal biopsy supporting the diagnosis was 85% (641/757) for these case groups except for T1DM (Table 1).
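The association test described in Subjects and Methods (logistic regression of case status on genotype dosage under an additive model, adjusted for sex and age) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the analysis code of the study, which used STATA: the data are simulated, all function names are hypothetical, the dosage is drawn as an integer 0/1/2 rather than an imputed fractional dosage, and the fit uses a plain Newton-Raphson iteration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def fit_logistic(X, y, iterations=25):
    """Maximum-likelihood logistic regression via Newton-Raphson.

    Each row of X must already contain the intercept column.
    """
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]  # X^T W X
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            w = p * (1.0 - p)
            for r in range(k):
                grad[r] += (yi - p) * xi[r]
                for c in range(k):
                    info[r][c] += w * xi[r] * xi[c]
        step = solve(info, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta


def simulate(n, true_or, seed=1):
    """Simulated cohort (assumption): allele frequency 0.3, 60% male, age in decades around 60."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        dosage = sum(rng.random() < 0.3 for _ in range(2))  # additive coding: 0, 1 or 2 alleles
        male = 1.0 if rng.random() < 0.6 else 0.0
        age10 = rng.gauss(0.0, 1.2)  # (age - 60) / 10
        eta = -1.0 + math.log(true_or) * dosage + 0.2 * male + 0.1 * age10
        p = 1.0 / (1.0 + math.exp(-eta))
        y.append(1.0 if rng.random() < p else 0.0)
        X.append([1.0, float(dosage), male, age10])
    return X, y


X, y = simulate(3000, true_or=2.0)
beta = fit_logistic(X, y)
odds_ratio = math.exp(beta[1])  # odds ratio per additional effect allele
print(f"Estimated OR per allele: {odds_ratio:.2f}")
```

Exponentiating the dosage coefficient gives the per-allele odds ratio, the quantity reported for each SNP in Table 2. A conditional analysis would add the dosage of a second SNP as a further column of X.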
The respective case groups were compared to a control group, which consisted of n = 1655 patients for whom CKD etiology was assigned to a cause which could reasonably be assumed to differ from the case groups (see Subjects and Methods). Several known associations7,9,10,19–21 were replicated: both HLA-DQA1 and PLA2R1 were strongly associated with MN (p = 2.4 × 10^−22 and p = 6.7 × 10^−8, respectively), STAT4 was associated with SLE (p = 9.7 × 10^−5) and HLA-DPB1 with GPA (p = 1.7 × 10^−11, Table 2, Supplementary Table 2), supporting an appropriate selection of the control group. These associations had consistent directions with previously reported associations. Only 4 of 38 (11%) previously reported etiology-specific risk loci showed significant associations after applying the Bonferroni threshold, but many of them had been found and reported from meta-analyses assembling a much larger number of cases. For 21 variants previously reported as associated with IgA nephropathy, 15 displayed associations in the same direction (i.e. the same risk allele) in our data, which is significantly more than expected by chance (p-binomial for observing 15 or more direction-consistent associations = 0.021). Of the 21 variants, 6


Characteristics                                                    N with available data

Demographic and anthropometric data
  Male                             60.1 (3027)                     5034
  Age, years                       60.1 (12.0)                     5034
  BMI, kg/m²                       29.8 (6.0)                      4982
Blood pressure
  SBP, mm Hg                       139.5 (20.4)                    5002
  DBP, mm Hg                       79.2 (11.7)
Kidney function measures
  eGFR, ml/min                     49.5 (18.1)                     4993
  Serum creatinine, mg/dl          1.5 (0.5)                       4993
  UACR, mg/g – median (IQR)        50.52 (9.53, 385.26)            4950
    <30 mg/g                       42.8% (2117)
    30–299 mg/g                    29.2% (1448)                    4950
    ≥300 mg/g                      28.0% (1385)
Diabetes mellitus
  Type 1 DM                        2.1% (103)                      5034
  Type 2 DM                        24.5% (1231)                    5034
HbA1c, %                           6.3% (1.0)                      4941
Leading cause of CKD
  MN                               2.9% (147)
  IgA                              7.3% (366)
  SLE                              2.5% (128)                      5034
  GPA                              2.3% (116)
  T1DM                             1.8% (91)
Positive family history            28.2% (1260)                    4475

Table 1. Demographic data and baseline characteristics in the GCKD cohort. Table summarizes data of all patients included in the analyses (n = 5034). Continuous variables are described in mean (SD), categorical variables in % (n) unless described otherwise. BMI: Body mass index, SBP: systolic blood pressure, DBP: diastolic blood pressure.

were nominally significant (p < 0.05). The effect sizes were similar on average (median effect size difference 0%, inter-quartile range −9% to 6%).

Interestingly, several SNPs previously reported as associated with one specific CKD etiology were associated with one or more additional etiologies of CKD. While PLA2R1 and STAT4 were exclusively associated with the previously reported entities MN and SLE, respectively, TNXB and several genes in the HLA region were shared across several CKD etiologies (Table 2, Supplementary Table 3). For instance, the known SLE risk variant rs1150754 at TNXB was associated not only with CKD attributed to T1DM (OR = 2.53, p = 2.5 × 10^−7), but also with MN (OR = 2.77, p = 1.9 × 10^−11).

In order to investigate whether these newly emerging associations represented new findings or emerged because of co-occurrence with previously reported variants for that CKD etiology on shared haplotypes in the HLA region, conditional analyses were performed and linkage disequilibrium calculations were carried out (Table 3, Supplementary Tables 4 and 5). For MN, there were no new independently associated SNPs in the region beyond the original MN risk SNP rs2187668 at HLA-DQA1, suggesting that new associations between HLA risk variants for other CKD etiologies and MN were observed because of linkage disequilibrium with a known MN risk variant. Conversely, the association between CKD attributed to T1DM and rs1150754 was independent of previously reported risk variants for other CKD etiologies in the HLA region (OR = 2.62, p-conditional = 1.0 × 10^−6). In addition, the IgA risk locus at HLA-DRB1 was associated independently with GPA conditioned on the known GPA risk variant in HLA-DPB1 (p = 2.0 × 10^−4, Table 3). Furthermore, TNXB (previously reported for SLE, p = 2.5 × 10^−7 in GCKD) and HLA-DRB1 (IgA risk locus, p = 4.6 × 10^−11 in GCKD) showed independent associations with CKD attributed to T1DM.
These associations remained robust in effect size and direction when adjusting for additional previously reported risk variants for T1DM22–27 in the HLA region (Supplementary Table 6). LD calculations were consistent with the conditional analyses, with low LD observed between variants for which the effect was not attenuated in conditional analyses (Supplementary Table 4).

Next, two different categories of advanced CKD were examined for 55 SNPs: the first was defined as eGFR < 45 ml/min/1.73 m² (n = 2,245, CKD stage G3b) and the second as UACR ≥ 300 mg/g (n = 1,385, CKD stage A3). These cases were compared to a GCKD control group of 2,245 patients with eGFR > 60 ml/min/1.73 m² and 2,117 patients with UACR < 30 mg/g, respectively. Of the risk loci discovered in the general population, UMOD showed a significant association (OR = 0.76 per T allele, p < 4.2 × 10^−4) with CKD stage 3b (Supplementary Table 7) compared to the GCKD control group, after correction for multiple testing. The effect direction was


SNP (Gene): rs4664308 (PLA2R1); Effect allele: G; Chr. (Position): 2 (160917497); Known locus: MN
  Etiology   OR [95% CI]            P-value
  IgA        (1.04 [0.87–1.25])     (6.5 × 10^−1)
  MN         0.45 [0.34–0.60]       6.7 × 10^−8
  SLE        (0.83 [0.61–1.13])     (2.5 × 10^−1)
  T1DM       (1.36 [1.01–1.84])     (4.4 × 10^−2)
  GPA        (1.03 [0.79–1.36])     (8.1 × 10^−1)

SNP (Gene): rs7574865 (STAT4); Effect allele: G; Chr. (Position): 2 (191964633); Known locus: SLE
  Etiology   OR [95% CI]            P-value
  IgA        (0.90 [0.74–1.11])     (3.4 × 10^−1)
  MN         (0.96 [0.71–1.28])     (7.6 × 10^−1)
  SLE        0.53 [0.39–0.73]       9.7 × 10^−5
  T1DM       (0.85 [0.60–1.19])     (3.4 × 10^−1)
  GPA        (1.02 [0.74–1.40])     (9.2 × 10^−1)

SNP (Gene): rs1150754 (TNXB); Effect allele: T; Chr. (Position): 6 (32050758); Known locus: SLE
  Etiology   OR [95% CI]            P-value
  IgA        (1.07 [0.83–1.40])     (5.9 × 10^−1)
  MN         2.77 [2.06–3.71]       1.9 × 10^−11
  SLE        (1.97 [1.37–2.83])     (2.8 × 10^−4)
  T1DM       2.53 [1.78–3.60]       2.5 × 10^−7
  GPA        (0.94 [0.63–1.42])     (7.7 × 10^−1)

SNP (Gene): rs660895 (HLA-DRB1); Effect allele: G; Chr. (Position): 6 (32577380); Known locus: IgA
  Etiology   OR [95% CI]            P-value
  IgA        (1.09 [0.86–1.37])     (4.7 × 10^−1)
  MN         (0.60 [0.40–0.89])     (1.2 × 10^−2)
  SLE        (1.15 [0.77–1.71])     (5.1 × 10^−1)
  T1DM       3.00 [2.17–4.18]       4.6 × 10^−11
  GPA        1.81 [1.32–2.47]       2.0 × 10^−4

SNP (Gene): rs2187668 (HLA-DQA1); Effect allele: T; Chr. (Position): 6 (32605884); Known locus: MN
  Etiology   OR [95% CI]            P-value
  IgA        (0.93 [0.70–1.24])     (6.3 × 10^−1)
  MN         4.48 [3.32–6.11]       2.4 × 10^−22
  SLE        2.36 [1.63–3.42]       5.9 × 10^−6
  T1DM       (1.89 [1.27–2.80])     (1.6 × 10^−3)
  GPA        (0.75 [0.47–1.20])     (2.2 × 10^−1)

SNP (Gene): rs1129740 (HLA-DQA1); Effect allele: A; Chr. (Position): 6 (32609105); Known locus: SSNS
  Etiology   OR [95% CI]            P-value
  IgA        (1.04 [0.88–1.24])     (6.4 × 10^−1)
  MN         (1.69 [1.30–2.19])     8.4 × 10^−5
  SLE        (1.21 [0.89–1.62])     (2.2 × 10^−1)
  T1DM       2.13 [1.52–2.97]       1.1 × 10^−5
  GPA        (1.18 [0.91–1.54])     (2.2 × 10^−1)

SNP (Gene): rs9275596 (HLA-DQB1); Effect allele: T; Chr. (Position): 6 (32681631); Known locus: IgA
  Etiology   OR [95% CI]            P-value
  IgA        (1.20 [1.00–1.45])     (5.3 × 10^−2)
  MN         0.52 [0.41–0.67]       3.3 × 10^−7
  SLE        (0.57 [0.41–0.77])     (3.4 × 10^−4)
  T1DM       (1.20 [0.87–1.66])     (2.6 × 10^−1)
  GPA        (1.25 [0.94–1.67])     (1.2 × 10^−1)

SNP (Gene): rs9277554 (HLA-DPB1); Effect allele: T; Chr. (Position): 6 (33055538); Known locus: GPA
  Etiology   OR [95% CI]            P-value
  IgA        (0.75 [0.61–0.92])     (6.5 × 10^−3)
  MN         (1.04 [0.79–1.37])     (7.8 × 10^−1)
  SLE        (1.24 [0.91–1.69])     (1.7 × 10^−1)
  T1DM       (1.05 [0.75–1.46])     (7.9 × 10^−1)
  GPA        0.14 [0.08–0.25]       1.7 × 10^−11

Table 2. Associations between CKD etiology associated SNPs and other CKD etiologies. MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval. Bold: significant and independent association, italic: previously known association, simple style: significant, but dependent association. Numbers in brackets (): non-significant. For independence tests see Table 3. Significance threshold was set at 2.6 × 10−4 (Bonferroni correction: α = 0.05/(38*5)).
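The threshold in the caption follows from dividing α = 0.05 by the number of tests (38 SNPs in 5 etiologies). The short Python sketch below reproduces that arithmetic and applies it to three p-values copied from Table 2 for rs2187668; the variable and function names are illustrative and not taken from the study's analysis code.

```python
# Bonferroni threshold as stated in the caption: 38 SNPs tested in 5 etiologies.
N_SNPS = 38
N_ETIOLOGIES = 5
ALPHA = 0.05


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test significance threshold for n_tests comparisons."""
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_SNPS * N_ETIOLOGIES)

# p-values for rs2187668 copied from Table 2 (MN, SLE and T1DM columns).
p_values = {"MN": 2.4e-22, "SLE": 5.9e-6, "T1DM": 1.6e-3}
significant = {etiology: p < threshold for etiology, p in p_values.items()}

print(f"threshold = {threshold:.2e}")
print(significant)
```

With 190 tests the threshold is 0.05/190, which rounds to the 2.6 × 10−4 given in the caption. The T1DM p-value of 1.6 × 10−3 falls above it, which matches its bracketed (non-significant) presentation in the table.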

consistent with observations from population-based studies, with the minor T allele associated with better eGFR and lower CKD risk. In a second set of analyses, the associations between the 55 population-based risk loci and CKD for which the treating nephrologists had determined diabetes (n = 653) or hypertension (n = 1086) as the leading cause were examined, because these etiologies represent the majority of CKD cases in population-based studies. No significant association was detected for either hypertensive nephropathy or diabetic kidney disease (Supplementary Table 8).

Discussion

This study found genetic associations shared across specific etiologies of CKD. Several known CKD etiology-specific risk loci were replicated in the GCKD study, and risk variants at TNXB, HLA-DQA1, -DQB1 and -DRB1 were independently associated with additional CKD etiologies beyond the ones initially reported.

Based on post-hoc power calculations, our study had excellent power (>99%) to detect some true-positive associations such as the ones between MN and the validated risk alleles at HLA-DQA1 and PLA2R1, as well as for some of the new and independent associations with additional disease entities such as the IgA variant rs660895 and T1DM nephropathy (99%) or the MN risk variant rs2187668 and an association with SLE nephropathy (90%). Nevertheless, power was moderate for other combinations, such as the association between the IgA risk variant rs660895 and GPA (50%). A priori power calculations were complicated by the fact that it is unclear whether the effect size of a genetic risk variant is the same or similar for different CKD etiologies. Power calculations were therefore provided across a range of case numbers, allele frequencies and effect sizes (Supplementary Table 9) to assess minimum detectable effect sizes. For example, for a case group of 100 patients, there was >80% power to detect an association signal for a SNP with a frequency of 30% and an OR of 2.0. We therefore cannot exclude that there are additional shared genetic risk variants that will only become apparent in future, larger studies that have assembled cases with CKD from different etiologies.

The HLA region is known for containing various risk loci for CKD of different etiologies. Several of them could be examined across CKD etiologies available in the GCKD study, and showed evidence of novel, additional associations that had not been identified at genome-wide significance in genome-wide association studies of these additional etiologies.
The associations across CKD etiologies highlight the shared role of the adaptive immune response, and suggest some overlap between etiologies. For example, known risk variants for MN were independently associated with CKD attributed to SLE, which is interesting in light of the histopathological appearance of membranous nephropathy in lupus nephritis class V. The shared genetic risk variant for CKD attributed to SLE and to T1DM is supported by a report that examined the co-existence of auto-immune diseases28. In this report, SLE and T1DM co-exist more often than expected based on the prevalence of both diseases. Several case reports describe different overlapping auto-immune diseases that affect the kidney, such as MN and IgA nephropathy29 or MN and further extra-renal autoimmune diseases such as ulcerative colitis30. More detailed and better powered studies focusing on SNP associations across sub-types of different autoimmune diseases are required to address these important questions. Furthermore, studies with larger case groups of the respective disease could examine co-incidences of two diseases with overlapping genetic risk factors such as MN/SLE and T1DM/GPA.
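The power considerations earlier in this Discussion can be made concrete with a rough calculation. The sketch below is not the Quanto calculation used in the study; it is a simplified normal approximation for a two-sided allelic test. It assumes that the stated allele frequency applies to controls, and that the 1,655 patients listed as "Sum" in Supplementary Table 1 serve as the control group. The function name and parameters are hypothetical.

```python
from statistics import NormalDist


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha):
    """Approximate power of a two-sided allelic test (normal approximation).

    Assumption: a per-allele odds ratio acting on the control allele frequency.
    The negligible probability of rejecting in the wrong direction is ignored.
    """
    # Case allele frequency implied by the per-allele odds ratio.
    odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds / (1 + odds)
    # Each person contributes two alleles to the comparison.
    variance = (case_freq * (1 - case_freq) / (2 * n_cases)
                + control_freq * (1 - control_freq) / (2 * n_controls))
    z_effect = abs(case_freq - control_freq) / variance ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit)


# Scenario from the text: 100 cases, allele frequency 30%, OR 2.0,
# tested at the Bonferroni threshold of 2.6e-4.
power_100 = allelic_power(100, 1655, 0.30, 2.0, 2.6e-4)
power_400 = allelic_power(400, 1655, 0.30, 2.0, 2.6e-4)
print(f"power with 100 cases: {power_100:.2f}")
print(f"power with 400 cases: {power_400:.2f}")
```

Because the genetic model and test differ from those in Quanto, the result should only be read as an order-of-magnitude check against the reported ">80% power", not as a reproduction of Supplementary Table 9.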


Etiology | Main SNP | Additional covariate SNPs | OR [95% CI] | p-value
MN | rs2187668 | none | 4.48 [3.32–6.11] | 2.4 × 10−22
MN | rs2187668 | rs1150754 | 5.21 [3.32–8.19] | 8.1 × 10−13
MN | rs2187668 | rs1129740 | 4.29 [3.09–5.96] | 4.1 × 10−18
MN | rs2187668 | rs9275596 | 4.67 [3.16–6.90] | 9.6 × 10−15
MN | rs2187668 | rs1150754, rs1129740 | 4.97 [3.10–7.96] | 2.7 × 10−11
MN | rs2187668 | rs1150754, rs9275596 | 5.39 [3.24–8.97] | 8.6 × 10−11
MN | rs2187668 | rs1129740, rs9275596 | 4.11 [2.36–7.17] | 6.2 × 10−7
MN | rs2187668 | rs1150754, rs1129740, rs9275596 | 4.74 [2.49–9.04] | 2.2 × 10−6
SLE | rs2187668 | none | 2.36 [1.63–3.42] | 5.9 × 10−6
SLE | rs2187668 | rs1150754 | 2.12 [1.22–3.70] | 8.0 × 10−3
SLE | rs2187668 | rs7763262 | 1.96 [1.30–2.95] | 1.3 × 10−3
SLE | rs2187668 | rs9275596 | 1.94 [1.26–3.00] | 2.7 × 10−3
SLE | rs2187668 | rs1150754, rs7763262 | 1.86 [1.04–3.33] | 3.6 × 10−2
SLE | rs2187668 | rs1150754, rs9275596 | 1.72 [0.94–1.36] | 8.0 × 10−2
SLE | rs2187668 | rs7763262, rs9275596 | 1.93 [1.25–2.97] | 3.0 × 10−3
SLE | rs2187668 | rs1150754, rs7763262, rs9275596 | 1.81 [0.97–3.37] | 6.1 × 10−2
GPA | rs660895 | none | 1.81 [1.32–2.47] | 2.0 × 10−4
GPA | rs660895 | rs9277554 | 1.73 [1.26–2.38] | 7.3 × 10−4
GPA | rs9277554 | none | 0.14 [0.08–0.25] | 1.7 × 10−11
GPA | rs9277554 | rs660895 | 0.14 [0.08–0.25] | 2.6 × 10−11
T1DM | rs1150754 | none | 2.53 [1.78–3.60] | 2.5 × 10−7
T1DM | rs1150754 | rs660895 | 2.75 [1.91–3.98] | 7.2 × 10−8
T1DM | rs1150754 | rs1129740 | 2.12 [1.47–3.05] | 5.5 × 10−5
T1DM | rs1150754 | rs660895, rs1129740 | 2.62 [1.78–3.85] | 1.0 × 10−6
T1DM | rs660895 | none | 3.00 [2.17–4.18] | 4.6 × 10−11
T1DM | rs660895 | rs1150754 | 3.19 [2.28–4.46] | 1.5 × 10−11
T1DM | rs660895 | rs1129740 | 2.47 [1.72–3.56] | 1.2 × 10−6
T1DM | rs660895 | rs1150754, rs1129740 | 2.93 [2.00–4.30] | 3.6 × 10−8

Table 3. Conditional analyses for independence of SNP signals. All SNPs with independent associations with the examined etiology are displayed. Associations were defined as independent if they stayed statistically significant in all performed conditional analyses and ORs did not vary by >20%. Although rs2187668 did not show independence for significant association with SLE throughout, it is shown here because it had the lowest p-value of all SNPs associated with SLE and an OR variation of <20%. For complete data see Supplementary Table 5. MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval.
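The independence rule in the caption can be expressed as a simple check over the conditional models. The sketch below is one possible reading of that rule, under two loudly stated assumptions that the caption does not confirm: "statistically significant" in the conditional models is taken as p < 0.05, and OR variation is measured relative to the unconditional OR on the natural scale. The function name is hypothetical, and the input values are copied from Table 3 and Supplementary Table 5.

```python
def is_independent(base_or, conditional, alpha=0.05, max_variation=0.20):
    """Check one reading of the independence rule from the Table 3 caption.

    conditional: list of (odds_ratio, p_value) pairs, one per model that
    adds further SNPs as covariates.
    Assumptions (not confirmed by the source): alpha = 0.05 in conditional
    models; variation is |OR_conditional / OR_base - 1|.
    """
    for odds_ratio, p_value in conditional:
        if p_value >= alpha:
            return False
        if abs(odds_ratio / base_or - 1) > max_variation:
            return False
    return True


# rs660895 and CKD attributed to T1DM (Table 3).
rs660895_t1dm = is_independent(
    3.00, [(3.19, 1.5e-11), (2.47, 1.2e-6), (2.93, 3.6e-8)]
)

# rs1129740 and CKD attributed to T1DM (Supplementary Table 5).
rs1129740_t1dm = is_independent(
    2.13, [(1.85, 5.4e-4), (1.52, 3.0e-2), (1.19, 4.0e-1)]
)

print(rs660895_t1dm, rs1129740_t1dm)
```

Under these assumptions rs660895 passes and rs1129740 does not, in line with how the two SNPs are presented for T1DM. Other definitions of variation (for example on the log-odds scale) could classify borderline cases differently.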

The T1DM risk loci found in our study can be interpreted in several ways: as risk loci for T1DM itself, as risk loci for diabetic nephropathy resulting from T1DM, or as a combination of both. To resolve this question, a control group of T1DM patients without CKD would be required. We additionally found that the UMOD locus known to be associated with kidney function in the general population was also associated with advanced CKD (stage G3b). Experimental evidence links genotype at the UMOD risk variant identified in GWAS to altered gene expression and salt-sensitive hypertensive CKD31. In support, the risk allele at the UMOD variant was significantly associated with stage G3b+ CKD in our study, and the association with CKD attributed to hypertension was direction-consistent with the UMOD variant reported in previous GWAS of hypertension32. This finding illustrates associations across a spectrum of renal function from small changes in eGFR in population-based studies over advanced CKD in the GCKD study to severe renal phenotypes in monogenic diseases caused by loss of function mutations in UMOD33. Strengths of our study include the availability of CKD of different stages and from different etiologies in one study using the same standardized recruitment procedures, and the availability of genome-wide genotype data that allows for carrying out conditional analyses. Other CKD cohorts do not have access to comparable numbers of carefully phenotyped subgroups of patients with as many specific CKD etiologies. Nevertheless, because of limited sample size within subgroups, analyses in our study were restricted to the examination of a predefined number of candidate SNPs and the study of large and moderate genetic effects, rigorously accounting for multiple


comparisons. Limitations include the absence of an internal control group of patients without CKD who suffer from the specific diseases that can give rise to CKD, such as T1DM patients without nephropathy. In conclusion, genetic risk variants for specific etiologies of CKD were shared across some etiologies, suggesting a common mechanism by which the adaptive immune system may contribute to the shared etiologies. In addition, we found the known risk variant at the UMOD locus that is associated with CKD in the general population to also associate with advanced stage CKD (G3b+) in the GCKD study, supporting the presence of genetic risk across the spectrum of kidney function.

References
1. Levey, A. S. & Coresh, J. Chronic kidney disease. Lancet 379, 165–80 (2012).
2. Visscher, P. M., Brown, M. A., McCarthy, M. I. & Yang, J. Five Years of GWAS Discovery. The American Journal of Human Genetics 90, 7–24 (2012).
3. Wuttke, M. & Kottgen, A. Insights into kidney diseases from genome-wide association studies. Nat Rev Nephrol 12, 549–62 (2016).
4. Böger, C. A. CUBN is a gene locus for albuminuria. J Am Soc Nephrol 22, 555–570 (2011).
5. Köttgen, A. et al. New loci associated with kidney function and chronic kidney disease. Nature Genetics 42, 376–384 (2010).
6. Pattaro, C. Genetic Associations at 53 Loci Highlight Cell Types and Biologic Pathways for Kidney Function. Nat Commun 7, 10023 (2015).
7. Kiryluk, K. et al. Discovery of new risk loci for IgA nephropathy implicates genes involved in immunity against intestinal pathogens. Nature Genetics 46, 1187–1196 (2014).
8. Li, M. et al. Identification of new susceptibility loci for IgA nephropathy in Han Chinese. Nature Communications 6, 7270 (2015).
9. Stanescu. Risk HLA-DQA1 and PLA(2)R1 alleles in idiopathic membranous nephropathy. N Engl J Med 364, 616–26 (2011).
10. Sekula. Genetic risk variants for membranous nephropathy: extension and association with other chronic kidney disease aetiologies. Nephrology Dialysis Transplantation, in press (2016).
11. Eckardt, K. U. & Barthlein, B. et al. The German Chronic Kidney Disease (GCKD) study: design and methods. Nephrol Dial Transplant 27(4), 1454–1460 (2012).
12. Titze, S. et al. Disease burden and risk profile in referred patients with moderate chronic kidney disease: composition of the German Chronic Kidney Disease (GCKD) cohort. Nephrol Dial Transplant 30, 441–51 (2015).
13. KDIGO 2012 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease.
14. Anderson, C. A. et al. Data quality control in genetic case-control association studies. Nat Protoc 5, 1564–73 (2010).
15. Marchini, J. & Howie, B. Genotype imputation for genome-wide association studies. Nat Rev Genet 11, 499–511 (2010).
16. Price, A. L. et al. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 38, 904–9 (2006).
17. The 1000 Genomes Project Consortium. An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (2015).
18. Morrison, J. G. W. Quanto v1.2.4; http://biostats.usc.edu/Quanto.html (2009).
19. de Bakker, P. I. W. et al. Differential Genetic Associations for Systemic Lupus Erythematosus Based on Anti–dsDNA Autoantibody Production. PLoS Genetics 7, e1001323 (2011).
20. Gharavi, A. G. et al. Genome-wide association study identifies susceptibility loci for IgA nephropathy. Nature Genetics 43, 321–327 (2011).
21. Xie, G. et al. Association of Granulomatosis With Polyangiitis (Wegener’s) With HLA-DPB1*04 and SEMA6A Gene Variants: Evidence From Genome-Wide Analysis. Arthritis & Rheumatism 65, 2457–2468 (2013).
22. Barrett, J. C. et al. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet 41, 703–7 (2009).
23. Bradfield, J. P. et al. A genome-wide meta-analysis of six type 1 diabetes cohorts identifies multiple associated loci. PLoS Genet 7, e1002293 (2011).
24. Cooper, J. D. et al. Meta-analysis of genome-wide association study data identifies additional type 1 diabetes risk loci. Nat Genet 40, 1399–401 (2008).
25. Grant, S. F. et al. Follow-up analysis of genome-wide association data identifies novel loci for type 1 diabetes. Diabetes 58, 290–5 (2009).
26. Hakonarson, H. et al. A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature 448, 591–4 (2007).
27. Tomer, Y. et al. Genome wide identification of new genes and pathways in patients with both autoimmune thyroiditis and type 1 diabetes. J Autoimmun 60, 32–9 (2015).
28. Kota, S. K., Meher, L. K., Jammula, S. & Modi, K. D. Clinical profile of coexisting conditions in type 1 diabetes mellitus patients. Diabetes Metab Syndr 6, 70–6 (2012).
29. Nishida, M., Kato, R. & Hamaoka, K. Coexisting Membranous Nephropathy and IgA Nephropathy. Fetal Pediatr Pathol 34, 351–4 (2015).
30. Warling, O. et al. Overlap syndrome consisting of PSC-AIH with concomitant presence of a membranous glomerulonephritis and ulcerative colitis. World J Gastroenterol 20, 4811–6 (2014).
31. Trudu, M. et al. Common noncoding UMOD gene variants induce salt-sensitive hypertension and kidney damage by increasing uromodulin expression. Nat Med 19, 1655–60 (2013).
32. Padmanabhan, S. et al. Genome-wide association study of blood pressure extremes identifies variant near UMOD associated with hypertension. PLoS Genet 6, e1001177 (2010).
33. Hart, T. C. et al. Mutations of the UMOD gene are responsible for medullary cystic kidney disease 2 and familial juvenile hyperuricaemic nephropathy. J Med Genet 39, 882–92 (2002).

Acknowledgements
The GCKD study is funded by grants from the German Ministry of Education and Research (BMBF, grant number 01ER0804) and the KfH Foundation for Preventive Medicine. We are grateful for the willingness of the patients to participate in the GCKD study. The enormous effort of the study personnel of the various regional centres is highly appreciated. We thank the large number of nephrologists who provide routine care for the patients and collaborate with the GCKD study. A list of nephrologists currently collaborating with the GCKD study is available at http://www.gckd.org. The article processing charge was funded by the German Research Foundation (DFG) and the University of Freiburg in the funding programme Open Access Publishing. GCKD Investigators are as follows. University of Erlangen-Nürnberg, Germany: Kai-Uwe Eckardt, Stephanie Titze, Hans-Ulrich Prokosch, Barbara Bärthlein, André Reis, Arif B. Ekici, Olaf Gefeller, Karl F. Hilgers, Silvia Hübner, Susanne Avendaño, Dinah Becker-Grosspitsch, Nina Hauck, Susanne A. Seuchter, Birgit Hausknecht, Marion Rittmeier, Anke Weigel,


Andreas Beck, Thomas Ganslandt, Sabine Knispel, Thomas Dressel and Martina Malzer. Technical University of Aachen, Germany: Jürgen Floege, Frank Eitner, Georg Schlieper, Katharina Findeisen, Elfriede Arweiler, Sabine Ernst, Mario Unger and Stefan Lipski. Charité, Humboldt-University of Berlin, Germany: Elke Schaeffner, Seema Baid-Agrawal, Kerstin Petzold and Ralf Schindler. University of Freiburg, Germany: Anna Köttgen, Ulla T. Schultheiss, Simone Meder, Erna Mitsch, Ursula Reinhard and Gerd Walz. Hannover Medical School, Germany: Hermann Haller, Johan Lorenzen, Jan T. Kielstein and Petra Otto. University of Heidelberg, Germany: Claudia Sommerer, Claudia Föllinger and Martin Zeier. University of Jena, Germany: Gunter Wolf, Martin Busch, Katharina Paul and Lisett Dittrich. Ludwig-Maximilians University of München, Germany: Thomas Sitter, Robert Hilge and Claudia Blank. University of Würzburg, Germany: Christoph Wanner, Vera Krane, Daniel Schmiedeke, Sebastian Toncar, Daniela Cavitt, Karina Schönowsky and Antje Börner-Klein. Medical University of Innsbruck, Austria: Florian Kronenberg, Julia Raschenberger, Barbara Kollerits, Lukas Forer, Sebastian Schönherr and Hansi Weißensteiner. University of Regensburg, Germany: Peter Oefner, Wolfram Gronwald and Helena Zacharias. Department of Medical Biometry, Informatics and Epidemiology (IMBIE), University of Bonn: Matthias Schmid. The work of MW and AK was funded by the CRC 1140 Initiative and by KO 3598/3-1 (AK) of the German Research Foundation. UTS and MW were supported by the Else Kröner-Fresenius-Stiftung (2013_Kolleg.03), Bad Homburg, Germany. Genotyping was funded by Bayer Pharma AG.

Author Contributions
S.W., A.K. and M.W. designed this study. U.S., F.K., K.U.E. and A.K. were involved in the management of the GCKD study. B.H. and A.B.E. were involved with biobanking and/or genotyping. S.W. and M.W. developed statistical methods and performed the analyses. S.W., M.W. and A.K. interpreted the results. S.W., A.K. and M.W. drafted the manuscript. All authors critically reviewed the manuscript.

Additional Information
Supplementary information accompanies this paper at https://doi.org/10.1038/s41598-017-13356-6.

Competing Interests: The authors declare that they have no competing interests.

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

© The Author(s) 2017

SUPPLEMENTARY MATERIAL

Associations between genetic risk variants for kidney diseases and kidney disease etiology

Sebastian Wunnenburger1, Ulla T. Schultheiss1,2, Gerd Walz2, Birgit Hausknecht3, Arif B. Ekici4, Florian Kronenberg5, Kai-Uwe Eckardt3, Anna Köttgen1*, Matthias Wuttke1*

Affiliations:
1. Institute of Genetic Epidemiology, Medical Center - University of Freiburg, Faculty of Medicine, Freiburg, Germany
2. Division of Nephrology, University of Freiburg, Faculty of Medicine, Freiburg, Germany
3. Department of Nephrology and Hypertension, University of Erlangen-Nürnberg, Erlangen, Germany
4. Institute of Human Genetics, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Erlangen, Germany
5. Division of Genetic Epidemiology, Department of Medical Genetics, Molecular and Clinical Pharmacology, Medical University of Innsbruck, Innsbruck, Austria

*Indicates joint oversight

Anna Köttgen, MD, MPH Institute of Genetic Epidemiology University Medical Center Freiburg Hugstetter Straße 49 D-79106 Freiburg Germany E-Mail: [email protected]

Contents

Supplementary Table 1. Composition of case and control groups in the GCKD study
Supplementary Table 2. Associations of known risk loci for specific CKD etiologies
Supplementary Table 3. Associations between CKD etiology associated SNPs and other CKD etiologies
Supplementary Table 4. Linkage disequilibrium of selected SNPs in the HLA region in the GCKD cohort
Supplementary Table 5. Conditional analyses for independence of SNP signals
Supplementary Table 6. Conditional analyses of SNPs associated with CKD from T1DM and previously known T1DM SNPs
Supplementary Table 7. Associations between population-based SNPs and advanced CKD (stage G3b+ or A3)
Supplementary Table 8. Associations between population-based SNPs and CKD attributed to hypertension and type 2 diabetes mellitus
Supplementary Table 9. Power calculations across a range of sample sizes (case numbers), allele frequencies and effect sizes

Supplementary Tables

Supplementary Table 1. Composition of case and control groups in the GCKD study

Category | n | n (%) with biopsy
GCKD case groups | |
CKD stage G3b | 2245 | NA
CKD stage A3 | 1385 | NA
Nephrosclerosis | 1086 | 84 (7.7)
Type 2 diabetes mellitus | 653 | 28 (4.3)
IgA nephropathy | 366 | 314 (85.8)
Membranous nephropathy | 147 | 140 (95.2)
Systemic lupus erythematosus | 128 | 106 (82.8)
Granulomatosis with polyangiitis | 116 | 81 (69.8)
Type 1 diabetes mellitus | 91 | 4 (4.4)
GCKD advanced disease control groups | |
CKD stage G1/G2 (eGFR ≥60 ml/min/1.73m²) | 1006 | NA
CKD stage A1/A2 (UACR <300 mg/g) | 2117 | NA
GCKD specific CKD etiology control group | |
Vascular nephropathy (n=1160) | |
Renal artery stenosis | 49 | 0 (0.0)
Nephrosclerosis | 1086 | 84 (7.7)
Renal infarct | 6 | 1 (16.7)
Other | 19 | 2 (10.5)
Interstitial nephropathy (n=220) | |
Interstitial nephropathy | 145 | 28 (19.3)
Analgesic nephropathy | 51 | 3 (5.9)
Other | 24 | 4 (16.7)
Acute kidney injury (n=62) | |
Post ischemic | 58 | 7 (12.1)
Other | 4 | 0 (0.0)
Single kidney (n=133) | |
Tumor nephrectomy | 62 | 7 (11.3)
Kidney donor | 27 | 0 (0.0)
Other nephrectomy | 31 | 0 (0.0)
Other | 3 | 0 (0.0)
(Post-)renal diseases (n=90) | |
Kidney stones | 22 | 1 (4.5)
Infections | 27 | 1 (3.7)
Neurogenic bladder | 3 | 0 (0.0)
Other | 38 | 2 (5.2)
Sum | 1655 | 140 (8.5)

Supplementary Table 2. Associations of known risk loci for specific CKD etiologies

SNP | Gene | Effect allele | Chromosome | Position | CKD etiology | OR [95% CI] | P-value
rs2187668 | HLA-DQA1 | T | 6 | 32,605,884 | MN | 4.48 [3.32–6.11] | 2.4E-22
rs4664308 | PLA2R1 | G | 2 | 160,917,497 | MN | 0.45 [0.34–0.60] | 6.7E-08
rs11150612 | ITGAM-ITGAX | A | 16 | 31,357,760 | IgA | 1.14 [0.95–1.36] | 1.6E-01
rs11574637 | ITGAM-ITGAX | C | 16 | 31,368,874 | IgA | 0.76 [0.59–0.98] | 3.2E-02
rs12716641 | DEFA | C | 8 | 6,898,998 | IgA | 0.89 [0.74–1.07] | 2.0E-01
rs17019602 | VAV3 | G | 1 | 108,188,858 | IgA | 1.05 [0.85–1.30] | 6.5E-01
rs1794275 | HLA-DQA/B | A | 6 | 32,671,248 | IgA | 1.23 [0.98–1.54] | 7.0E-02
rs1883414 | HLA-DPB2 | A | 6 | 33,086,448 | IgA | 0.88 [0.72–1.07] | 2.0E-01
rs2033562 | KLF10/ODF1 | C | 8 | 103,547,739 | IgA | 1.15 [0.96–1.39] | 1.4E-01
rs2074038 | ACCS | T | 11 | 44,087,989 | IgA | 1.17 [0.89–1.54] | 2.7E-01
rs2412971 | HORMAD2/MTMR3 | A | 22 | 30,494,371 | IgA | 1.14 [0.96–1.35] | 1.3E-01
rs2523946 | HLA-A | T | 6 | 29,941,943 | IgA | 1.15 [0.97–1.38] | 1.1E-01
rs2738048 | DEFA | G | 8 | 6,822,785 | IgA | 0.76 [0.63–0.92] | 5.6E-03
rs3115573 | HLA region | G | 6 | 32,218,843 | IgA | 1.25 [1.05–1.48] | 1.3E-02
rs3803800 | TNFSF13 | G | 17 | 7,462,969 | IgA | 0.88 [0.72–1.09] | 2.6E-01
rs4077515 | CARD9 | T | 9 | 139,266,496 | IgA | 1.15 [0.96–1.39] | 1.2E-01
rs660895 | HLA-DRB1 | G | 6 | 32,577,380 | IgA | 1.09 [0.86–1.37] | 4.7E-01
rs6677604 | CFHR1,3 | A | 1 | 196,686,918 | IgA | 0.69 [0.54–0.87] | 1.5E-03
rs7634389 | ST6GAL1 | C | 3 | 186,738,421 | IgA | 1.30 [1.08–1.57] | 5.9E-03
rs7763262 | HLA-DR–HLA-DQ | C | 6 | 32,424,882 | IgA | 1.28 [1.05–1.55] | 1.3E-02
rs9275596 | HLA-DQB1 | T | 6 | 32,681,631 | IgA | 1.20 [1.00–1.45] | 5.3E-02
rs9314614 | DEFA | G | 8 | 6,697,731 | IgA | 1.00 [0.84–1.20] | 9.9E-01
rs9357155 | TAP1/2/PSMB8/9 | A | 6 | 32,809,848 | IgA | 0.99 [0.75–1.32] | 9.7E-01
rs1129740 | HLA-DQA1 | A | 6 | 32,609,105 | SSNS | - | -
rs10488631 | TNPO3 | C | 7 | 128,594,183 | SLE | 1.49 [0.98–2.27] | 6.1E-02
rs1150754 | TNXB | T | 6 | 32,050,758 | SLE | 1.97 [1.37–2.83] | 2.8E-04
rs4963128 | KIAA1542 | C | 11 | 589,564 | SLE | 0.94 [0.69–1.30] | 7.3E-01
rs6445975 | PXK | T | 3 | 58,370,177 | SLE | 0.94 [0.68–1.31] | 7.3E-01
rs7574865 | STAT4 | G | 2 | 191,964,633 | SLE | 0.53 [0.39–0.73] | 9.7E-05
rs9888739 | ITGAM | T | 16 | 31,313,253 | SLE | 1.60 [1.10–2.34] | 1.4E-02
rs12437854 | ESRD | G | 15 | 94,141,833 | T1DM | 0.88 [0.47–1.68] | 7.1E-01
rs4972593 | ESRD | A | 2 | 174,462,854 | T1DM | 1.29 [0.88–1.89] | 1.9E-01
rs1949829 | COBL | T | 7 | 51,537,887 | GPA | 0.75 [0.41–1.39] | 3.6E-01
rs4862110 | DCTD | C | 4 | 183,751,029 | GPA | 1.08 [0.78–1.49] | 6.6E-01
rs595018 | CCDC86 | C | 11 | 60,592,276 | GPA | 1.00 [0.72–1.41] | 9.8E-01
rs7151526 | SERPINA1 | A | 14 | 94,863,636 | GPA | 1.73 [0.99–3.00] | 5.3E-02
rs7503953 | WSCD1 | C | 17 | 6,141,677 | GPA | 0.93 [0.63–1.40] | 7.4E-01
rs9277554 | HLA–DPB1 | T | 6 | 33,055,538 | GPA | 0.14 [0.08–0.25] | 1.7E-11

MN: Membranous nephropathy, IgA: IgA nephropathy, SSNS: steroid sensitive nephrotic syndrome, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval. Significance threshold was set at 2.6 × 10−4 (Bonferroni correction: α < 0.05 / (38*5)); significant association p-values were marked in bold face.
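As a plausibility check on rows of this table, an approximate two-sided p-value can be recovered from an OR and its 95% CI using a Wald statistic on the log-odds scale. The sketch below is an illustration under that assumption, not the regression procedure used in the study; because the tabulated ORs and CIs are rounded, only rough agreement with the reported p-values should be expected. The function name is hypothetical.

```python
import math


def wald_p_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate two-sided p-value from an OR and its 95% CI.

    Assumes a symmetric Wald interval on the log-odds scale, so that
    SE = (ln(upper) - ln(lower)) / (2 * 1.96).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# rs2187668 (HLA-DQA1) and MN: OR 4.48 [3.32-6.11], reported p = 2.4E-22.
p_mn = wald_p_from_ci(4.48, 3.32, 6.11)
# rs7574865 (STAT4) and SLE: OR 0.53 [0.39-0.73], reported p = 9.7E-05.
p_sle = wald_p_from_ci(0.53, 0.39, 0.73)

print(f"rs2187668 / MN:  p ~ {p_mn:.1e}")
print(f"rs7574865 / SLE: p ~ {p_sle:.1e}")
```

Both approximations land within the same order of magnitude as the tabulated values, which is about as close as rounded two-decimal ORs allow.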

Supplementary Table 3. Associations between CKD etiology associated SNPs and other CKD etiologies

SNP | Gene | Effect allele | Known locus for | IgA OR | IgA p-value | MN OR | MN p-value | SLE OR | SLE p-value | GPA OR | GPA p-value | T1DM OR | T1DM p-value
rs2187668 | HLA-DQA1 | T | MN | 0.93 | 6.3E-01 | 4.48 | 2.4E-22 | 2.36 | 5.9E-06 | 0.75 | 2.2E-01 | 1.89 | 1.6E-03
rs4664308 | PLA2R1 | G | MN | 1.04 | 6.5E-01 | 0.45 | 6.7E-08 | 0.83 | 2.5E-01 | 1.03 | 8.1E-01 | 1.36 | 4.4E-02
rs11150612 | ITGAM-ITGAX | A | IgA | 1.14 | 1.6E-01 | 0.73 | 2.0E-02 | 1.03 | 8.3E-01 | 1.09 | 5.4E-01 | 0.71 | 4.1E-02
rs11574637 | ITGAM-ITGAX | C | IgA | 0.76 | 3.2E-02 | 1.13 | 4.3E-01 | 1.56 | 1.2E-02 | 0.50 | 2.6E-03 | 1.24 | 2.6E-01
rs12716641 | DEFA | C | IgA | 0.89 | 2.0E-01 | 1.06 | 6.4E-01 | 0.90 | 5.1E-01 | 1.21 | 1.7E-01 | 0.75 | 7.4E-02
rs17019602 | VAV3 | G | IgA | 1.05 | 6.5E-01 | 1.41 | 1.7E-02 | 1.17 | 3.6E-01 | 1.42 | 2.2E-02 | 0.97 | 8.7E-01
rs1794275 | HLA-DQA/B | A | IgA | 1.23 | 7.0E-02 | 0.61 | 1.4E-02 | 0.75 | 2.0E-01 | 1.32 | 9.4E-02 | 0.64 | 6.4E-02
rs1883414 | HLA-DPB2 | A | IgA | 0.88 | 2.0E-01 | 0.74 | 3.8E-02 | 1.35 | 5.1E-02 | 0.60 | 1.6E-03 | 1.04 | 8.3E-01
rs2033562 | KLF10/ODF1 | C | IgA | 1.15 | 1.4E-01 | 0.78 | 5.6E-02 | 0.92 | 5.8E-01 | 0.94 | 6.8E-01 | 0.80 | 1.6E-01
rs2074038 | ACCS | T | IgA | 1.17 | 2.7E-01 | 0.94 | 7.5E-01 | 1.01 | 9.8E-01 | 1.36 | 1.3E-01 | 1.04 | 8.9E-01
rs2412971 | HORMAD2/MTMR3 | A | IgA | 1.14 | 1.3E-01 | 0.90 | 3.7E-01 | 1.00 | 9.8E-01 | 1.08 | 5.7E-01 | 1.19 | 2.5E-01
rs2523946 | HLA-A | T | IgA | 1.15 | 1.1E-01 | 0.80 | 9.0E-02 | 0.72 | 3.4E-02 | 0.91 | 4.7E-01 | 1.04 | 7.9E-01
rs2738048 | DEFA | G | IgA | 0.76 | 5.6E-03 | 0.98 | 8.7E-01 | 1.13 | 4.1E-01 | 0.86 | 3.0E-01 | 0.98 | 8.8E-01
rs3115573 | HLA region | G | IgA | 1.25 | 1.3E-02 | 0.96 | 7.4E-01 | 0.86 | 3.2E-01 | 0.82 | 1.5E-01 | 1.05 | 7.3E-01
rs3803800 | TNFSF13 | G | IgA | 0.88 | 2.6E-01 | 1.01 | 9.7E-01 | 1.02 | 9.3E-01 | 1.09 | 6.1E-01 | 0.99 | 9.8E-01
rs4077515 | CARD9 | T | IgA | 1.15 | 1.2E-01 | 0.93 | 5.7E-01 | 0.70 | 2.6E-02 | 1.02 | 9.0E-01 | 1.33 | 7.3E-02
rs660895 | HLA-DRB1 | G | IgA | 1.09 | 4.7E-01 | 0.60 | 1.2E-02 | 1.15 | 5.1E-01 | 1.81 | 2.0E-04 | 3.00 | 4.6E-11
rs6677604 | CFHR1,3 | A | IgA | 0.69 | 1.5E-03 | 0.89 | 4.7E-01 | 0.99 | 9.4E-01 | 1.07 | 6.7E-01 | 0.84 | 3.6E-01
rs7634389 | ST6GAL1 | C | IgA | 1.30 | 5.9E-03 | 1.06 | 6.6E-01 | 1.13 | 4.4E-01 | 1.29 | 7.5E-02 | 0.94 | 6.9E-01
rs7763262 | HLA-DR–HLA-DQ | C | IgA | 1.28 | 1.3E-02 | 0.67 | 1.5E-03 | 0.57 | 3.0E-04 | 1.16 | 3.0E-01 | 1.21 | 2.5E-01
rs9275596 | HLA-DQB1 | T | IgA | 1.20 | 5.3E-02 | 0.52 | 3.3E-07 | 0.57 | 3.4E-04 | 1.25 | 1.2E-01 | 1.20 | 2.6E-01
rs9314614 | DEFA | G | IgA | 1.00 | 9.9E-01 | 1.07 | 5.7E-01 | 0.91 | 5.5E-01 | 0.98 | 8.6E-01 | 0.99 | 9.2E-01
rs9357155 | TAP1/2/PSMB8/9 | A | IgA | 0.99 | 9.7E-01 | 0.67 | 7.9E-02 | 1.18 | 5.0E-01 | 1.45 | 5.2E-02 | 1.76 | 5.8E-03
rs1129740 | HLA-DQA1 | A | SSNS | 1.04 | 6.4E-01 | 1.69 | 8.4E-05 | 1.21 | 2.2E-01 | 1.18 | 2.2E-01 | 2.13 | 1.1E-05
rs10488631 | TNPO3 | C | SLE | 1.11 | 4.9E-01 | 1.14 | 5.1E-01 | 1.49 | 6.1E-02 | 1.01 | 9.8E-01 | 1.09 | 7.3E-01
rs1150754 | TNXB | T | SLE | 1.07 | 5.9E-01 | 2.77 | 1.9E-11 | 1.97 | 2.8E-04 | 0.94 | 7.7E-01 | 2.53 | 2.5E-07
rs4963128 | KIAA1542 | C | SLE | 1.02 | 8.2E-01 | 0.95 | 7.1E-01 | 0.94 | 7.3E-01 | 0.85 | 2.7E-01 | 0.71 | 2.8E-02
rs6445975 | PXK | T | SLE | 1.06 | 5.5E-01 | 1.21 | 1.9E-01 | 0.94 | 7.3E-01 | 0.89 | 4.1E-01 | 0.94 | 7.2E-01
rs7574865 | STAT4 | G | SLE | 0.90 | 3.4E-01 | 0.96 | 7.6E-01 | 0.53 | 9.7E-05 | 1.02 | 9.2E-01 | 0.85 | 3.4E-01
rs9888739 | ITGAM | T | SLE | 0.64 | 4.6E-03 | 1.06 | 7.5E-01 | 1.60 | 1.4E-02 | 0.48 | 7.9E-03 | 1.18 | 4.6E-01
rs12437854 | ESRD | G | T1DM | 1.13 | 4.7E-01 | 1.21 | 4.3E-01 | 1.37 | 2.7E-01 | 1.36 | 2.1E-01 | 0.88 | 7.1E-01
rs4972593 | ESRD | A | T1DM | 1.07 | 6.0E-01 | 0.88 | 4.9E-01 | 1.10 | 6.5E-01 | 0.98 | 9.0E-01 | 1.29 | 1.9E-01
rs1949829 | COBL | T | GPA | 1.16 | 4.0E-01 | 0.73 | 2.6E-01 | 0.98 | 9.5E-01 | 0.75 | 3.6E-01 | 1.16 | 6.1E-01
rs4862110 | DCTD | C | GPA | 0.82 | 8.3E-02 | 0.92 | 6.1E-01 | 0.81 | 2.8E-01 | 1.08 | 6.6E-01 | 1.13 | 5.1E-01
rs595018 | CCDC86 | C | GPA | 0.97 | 7.7E-01 | 0.95 | 7.2E-01 | 0.82 | 2.8E-01 | 1.00 | 9.8E-01 | 0.88 | 5.1E-01
rs7151526 | SERPINA1 | A | GPA | 0.88 | 5.6E-01 | 1.98 | 5.7E-03 | 0.91 | 8.1E-01 | 1.73 | 5.3E-02 | 1.43 | 2.9E-01
rs7503953 | WSCD1 | C | GPA | 1.15 | 3.4E-01 | 1.05 | 7.9E-01 | 0.89 | 5.9E-01 | 0.93 | 7.4E-01 | 0.66 | 5.0E-02
rs9277554 | HLA–DPB1 | T | GPA | 0.75 | 6.5E-03 | 1.04 | 7.8E-01 | 1.24 | 1.7E-01 | 0.14 | 1.7E-11 | 1.05 | 7.9E-01

MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio. Significance threshold was set at 2.6 × 10−4 (Bonferroni correction: α < 0.05 / (38*5)); p-values of statistically significant associations were marked in bold face.

Supplementary Table 4: Linkage disequilibrium of selected SNPs in the HLA region in the GCKD cohort

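Values above the diagonal are r²; values below the diagonal are D'.

SNP | rs1150754 | rs7763262 | rs660895 | rs2187668 | rs1129740 | rs9275596 | rs9277554 | rs1883414
rs1150754 | - | 0.14 | <0.01 | 0.49 | 0.05 | 0.12 | 0.01 | <0.01
rs7763262 | 0.66 | - | 0.10 | 0.15 | 0.14 | 0.61 | <0.01 | <0.01
rs660895 | 0.35 | 0.95 | - | 0.03 | 0.16 | 0.11 | <0.01 | <0.01
rs2187668 | 0.74 | 0.74 | 1 | - | 0.11 | 0.23 | 0.02 | 0.01
rs1129740 | 0.64 | 0.44 | 1 | 0.99 | - | 0.13 | <0.01 | <0.01
rs9275596 | 0.64 | 0.82 | 1 | 0.97 | 0.42 | - | 0.01 | <0.01
rs9277554 | 0.16 | 0.06 | 0.01 | 0.2 | 0.05 | 0.09 | - | 0.15
rs1883414 | 0.25 | 0.05 | 0.03 | 0.4 | 0.01 | 0.12 | 0.42 | -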

Linkage disequilibrium of all SNPs on chromosome 6 that were significantly associated with one or more specific CKD etiologies.
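Several SNP pairs in this table combine a D' near 1 with a low r² (for example rs660895 and rs2187668). The sketch below shows how both statistics follow from allele and haplotype frequencies and why they can diverge. It assumes phased haplotype frequencies are already available; the frequencies used are invented for illustration and are not GCKD estimates, and the function name is hypothetical.

```python
def ld_stats(p_a, p_b, p_ab):
    """Return (D', r2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab: frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r2


# Illustrative (made-up) frequencies: two alleles that never share a haplotype.
d_prime, r2 = ld_stats(p_a=0.2, p_b=0.1, p_ab=0.0)
print(f"D' = {d_prime:.2f}, r2 = {r2:.3f}")
```

In this made-up case the two alleles are never inherited together, so D' reaches 1, yet r² stays below 0.03 because the allele frequencies differ. This is the kind of pattern under which two SNPs can carry largely independent association signals despite a high D'.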

Supplementary Table 5. Conditional analyses for independence of SNP signals

SNP Covariates OR [95% CI] p-value SNP Covariates OR [95% CI] p-value MN SLE - 2.77 [2.06-3.71] 1.9E-11 - 1.97 [1.37-2.83] 2.8E-04 rs2187668 0.81 [0.51-1.30] 3.9E-01 rs7763262 1.63 [1.10-2.40] 1.4E-02 rs1129740 2.50 [1.84-3.39] 4.0E-09 rs2187668 1.15 [0.67-1.99] 6.1E-01 rs9275596 2.28 [1.65-3.14] 6.1E-07 rs9275596 1.64 [1.12-2.42] 1.2E-02 rs2187668, rs7763262, 0.82 [0.51-1.30] 3.9E-01 1.07 [0.61-1.88] 8.1E-01 rs1150754 rs1129740 rs1150754 rs2187668 rs2187668, rs7763262, 0.82 [0.51-1.30] 3.9E-01 1.60 [1.08-2.36] 1.9E-02 rs9275596 rs9275596 rs1129740, rs2187668, 1.41 [0.95-2.10] 8.4E-02 1.17 [0.68-2.02] 5.8E-01 rs9275596 rs9275596 rs2187668, rs7763262, rs1129740, 0.82 [0.51-1.30] 3.9E-01 rs2187668, 1.09 [0.61-1.92] 7.8E-01 rs9275596 rs9275596 - 4.48 [3.32-6.11] 2.4E-22 - 0.57 [0.42-0.77] 3.0E-04 rs1150754 5.21 [3.32-8.19] 8.1E-13 rs1150754 0.65 [0.47-0.90] 1.0E-02 rs1129740 4.29 [3.09-5.96] 4.1E-18 rs2187668 0.70 [0.50-0.99] 4.1E-02 rs9275596 4.67 [3.16-6.90] 9.6E-15 rs9275596 0.72 [0.44-1.18] 1.9E-01 rs1150754, rs1150754, 4,97 [3.10-7.96] 2.7E-11 0.70 [0.50-0.99] 4.6E-02 rs2187668 rs1129740 rs7763262 rs2187668 rs1150754, rs1150754, 5.39 [3.24-8.97] 8.6E-11 0.79 [0.47-1.32] 3.6E-01 rs9275596 rs9275596 rs1129740, rs2187668, 4.11 [2.36-7.17] 6.2E-07 0.73 [0.44-1.21] 2.2E-01 rs9275596 rs9275596 rs1150754, rs1150754, rs1129740, 4.74 [2.49-9.04] 2.2E-06 rs2187668, 0.74 [0.44-1.24] 2.6E-01 rs9275596 rs9275596 - 1.69 [1.30-2.19] 8.4E-05 - 2.36 [1.63-3.42] 5.9E-06 rs1150754 1.44 [1.10-1.90] 8.2E-03 rs1150754 2.12 [1.22-3.70] 8.0E-03 rs2187668 1.11 [0.82-1.49] 4.9E-01 rs7763262 1.96 [1.30-2.95] 1.3E-03 rs9275596 2.57 [1.92-3.45] 2.8E-10 rs9275596 1.94 [1.26-3.00] 2.7E-03 rs1150754, rs1150754, 1.11 [0.82-1.49] 5.0E-01 1.86 [1.04-3.33] 3.6E-02 rs1129740 rs2187668 rs2187668 rs7763262 rs1150754, rs1150754, 2.20 [1.57-3.08] 4.7E-06 1.72 [0.94-1.36] 8.0E-02 rs9275596 rs9275596 rs2187668, rs7763262, 1.14 [0.76-1.70] 5.3E-01 1.93 [1.25-2.97] 3.0E-03 rs9275596 rs9275596 rs1150754, 
rs1150754, rs2187668, 1.14 [0.76-1.70] 5.3E-01 rs7763262, 1.81 [0.97-3.37] 6.1E-02 rs9275596 rs9275596 - 0.52 [0.41-0.67] 3.3E-07 - 0.57 [0.41-0.77] 3.4E-04 rs1150754 0.67 [0.51-0.88] 3.7E-03 rs1150754 0.65 [0.46-0.90] 1.0E-02 rs2187668 1.06 [0.75-1.48] 7.5E-01 rs7763262 0.74 [0.44-1.22] 2.3E-01 rs1129740 0.36 [0.27-0.47] 6.2E-13 rs2187668 0.74 [0.51-1.06] 1.0E-01 rs1150754, rs1150754, 1.05 [0.75-1.47] 7.7E-01 0.78 [0.46-1.31] 3.4E-01 rs9275596 rs2187668 rs9275596 rs7763262 rs1150754, rs1150754, 0.43 [0.31-0,61] 2.0E-06 0.73 [0.51-1.06] 9.8E-02 rs1129740 rs2187668 rs2187668, rs7763262, 0.96 [0.61-1.51] 8.5E-01 0.95 [0.55-1.62] 8.4E-01 rs1129740 rs2187668 rs1150754, rs1150754, rs2187668, 0.95 [0.60-1.51] 8.4E-01 rs7763262, 0.93 [0.54-1.62] 8.0E-01 rs1129740 rs2187668 T1DM GPA - 2.53 [1.78-3.60] 2.5E-07 - 1.81 [1.32-2.47] 2.0E-04 rs660895 rs660895 2.75 [1.91-3.98] 7.2E-08 rs9277554 1.73 [1.26-2.38] 7.3E-04 rs1150754 rs1129740 2.12 [1.47-3.05] 5.5E-05 - 0.14 [0.08-0.25] 1.7E-11 rs660895, rs9277554 rs660895 2.62 [1.78-3.85] 1.0E-06 0.14 [0.08-0.25] 2.6E-11 rs1129740

- 3.00 [2.17-4.18] 4.6E-11 rs1150754 3.19 [2.28-4.46] 1.5E-11 rs660895 rs1129740 2.47 [1.72-3.56] 1.2E-06 rs1150754, 2.93 [2.00-4.30] 3.6E-08 rs1129740 - 2.13 [1.52-2.97] 1.1E-05 rs1150754 1.85 [1.30-2.62] 5.4E-04 rs1129740 rs660895 1.52 [1.04-2.21] 3.0E-02 rs1150754, 1.19 [0.80-1.77] 4.0E-01 rs660895

MN: Membranous nephropathy, IgA: IgA nephropathy, SLE: Systemic lupus erythematosus, GPA: Granulomatosis with polyangiitis, T1DM: Type 1 diabetes mellitus, OR: Odds ratio, CI: Confidence interval.
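The OR [95% CI] and p-value columns of such a table are linked by the usual Wald relationships for logistic regression. The following is a minimal sketch, not code from the study: it assumes a Wald interval with z = 1.96, and the function name `wald_from_or_ci` is hypothetical. It recovers the log-odds estimate, its standard error and an approximate two-sided p-value from a reported odds ratio and interval, using the first MN row (2.77 [2.06-3.71]) as input. Because the inputs are rounded to two decimals, the recovered p-value can only approximate the tabulated one.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_ci=1.96):
    """Recover beta, SE, z and a two-sided Wald p-value from OR [95% CI].

    Assumes the interval is a Wald interval: exp(beta +/- z_ci * SE).
    """
    if not ci_low < odds_ratio < ci_high:
        # A point estimate outside its own interval indicates a transcription error.
        raise ValueError("odds ratio lies outside its confidence interval")
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_ci)
    z = beta / se
    # Two-sided normal tail probability, computed with erfc for small p-values.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p


beta, se, z, p = wald_from_or_ci(2.77, 2.06, 3.71)
print(f"beta={beta:.3f} SE={se:.3f} z={z:.2f} p={p:.1e}")
```

The same function doubles as a plausibility check when transcribing rows, since it rejects any entry whose odds ratio falls outside its reported interval.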

Supplementary Table 6. Conditional analyses of SNPs associated with CKD from T1DM and previously known T1DM SNPs

Conditional analyses of rs1150754 and rs660895

Covariate     rs1150754 OR   rs1150754 p-value   rs660895 OR   rs660895 p-value
-             2.53           2.5E-07             3.01          4.6E-11
rs1015166     2.33           8.3E-06             2.86          4.4E-10
rs11755527    2.53           2.6E-07             3.01          4.7E-11
rs1270942     3.27           3.1E-05             3.36          1.9E-12
rs1980493     2.80           8.0E-06             3.65          6.5E-13
rs2251396     2.10           2.6E-04             2.86          6.0E-10
rs2523989     2.37           3.3E-05             3.15          1.4E-11
rs2647044     2.66           7.7E-05             3.59          6.8E-13
rs2857595     2.60           1.8E-05             3.24          6.2E-12
rs3757247     2.55           2.3E-07             3.01          4.7E-11
rs6931865     2.54           2.5E-07             3.01          4.6E-11
rs886424      2.41           1.7E-04             3.22          6.7E-12
rs924043      2.58           1.7E-07             3.02          4.1E-11
rs9268645     2.88           1.4E-08             2.91          7.3E-09
rs9272346     2.27           7.4E-06             2.60          8.0E-07
rs9388489     2.54           2.1E-07             3.04          4.0E-11

T1DM: Type 1 diabetes mellitus, OR: Odds ratio

SNPs were selected based on these publications: Hakonarson, Grant et al. 2007 (PMID 17632545); Cooper, Smyth et al. 2008 (PMID 18978792); Barrett, Clayton et al. 2009 (PMID 19430480); Grant, Qu et al. 2009 (PMID 18840781); Bradfield, Qu et al. 2011 (PMID 21980299); Tomer, Dolan et al. 2015 (PMID 25936594).

Supplementary Table 7. Associations between population-based SNPs and advanced CKD (stage G3b+ or A3)

SNP          Gene          Effect allele   CKD G3b OR   CKD G3b p-value   CKD A3 OR   CKD A3 p-value
rs10109414   STC1          T               1.08         1.8E-01           1.06        3.1E-01
rs10277115   UNCX          T               1.08         2.6E-01           1.09        1.6E-01
rs10491967   TSPAN9        A               0.99         9.6E-01           1.12        1.8E-01
rs10513801   ETV5          G               0.94         4.5E-01           1.08        3.2E-01
rs10774021   SLC6A13       T               0.97         6.8E-01           1.03        5.5E-01
rs10794720   WDR37         C               0.98         8.3E-01           0.88        1.7E-01
rs10994860   A1CF          T               0.99         8.8E-01           0.93        3.0E-01
rs1106766    INHBC         T               0.98         7.7E-01           0.99        8.1E-01
rs11078903   CDK12         A               1.08         2.4E-01           1.06        3.4E-01
rs11666497   SIPA1L3       T               1.08         2.8E-01           0.94        3.2E-01
rs11959928   DAB2          A               1.05         4.1E-01           1.03        5.6E-01
rs12124078   DNAJC16       G               0.93         2.6E-01           1.08        1.8E-01
rs12136063   SYPL2         A               1.01         8.9E-01           0.90        6.0E-02
rs12460876   SLC7A9        C               1.00         9.5E-01           0.97        5.2E-01
rs1260326    GCKR          C               1.09         1.3E-01           1.04        4.6E-01
rs12917707   UMOD          T               0.76         4.2E-04           1.04        6.2E-01
rs13538      ALMS1/NAT8    G               1.00         9.7E-01           0.95        4.3E-01
rs1394125    UBE2Q2        A               1.00         9.6E-01           1.09        1.3E-01
rs163160     KCNQ1         G               1.07         3.5E-01           1.08        2.8E-01
rs164748     DPEP1         G               1.15         1.7E-02           0.97        5.3E-01
rs17216707   BCAS1         C               1.01         9.1E-01           0.93        3.0E-01
rs17319721   SHROOM3       A               1.07         2.8E-01           1.17        4.0E-03
rs1801239    CUBN          C               1.07         5.0E-01           0.85        5.7E-02
rs2279463    SLC22A2       G               1.00         9.8E-01           1.08        3.6E-01
rs228611     NFKB1         A               0.99         9.3E-01           0.95        3.4E-01
rs2453580    SLC47A1       C               1.14         2.8E-02           0.89        2.9E-02
rs2467853    SPATA5L1      G               1.13         3.8E-02           0.91        8.8E-02
rs267734     ANXA9/LASS2   C               0.97         6.5E-01           0.92        1.7E-01
rs2712184    IGFBP5        A               0.97         6.5E-01           1.00        9.4E-01
rs2802729    SDCCAG8       A               1.04         5.3E-01           1.10        6.4E-02
rs2928148    INO80         A               0.96         4.6E-01           0.96        4.3E-01
rs347685     TFDP2         A               0.97         6.7E-01           0.99        8.5E-01
rs3750082    KBTBD2        A               1.07         2.9E-01           1.03        5.5E-01
rs3828890    MHC region    G               1.21         6.6E-02           1.00        9.6E-01
rs3850625    CACNA1S       A               1.20         5.0E-02           0.88        1.2E-01
rs3925584    MPPED2        C               1.01         8.5E-01           0.96        4.2E-01
rs4014195    AP5B1         G               1.07         2.9E-01           1.02        6.7E-01
rs4667594    LRP2          A               1.03         5.9E-01           1.04        5.2E-01
rs4744712    PIP5K1B       C               0.95         3.7E-01           0.95        3.1E-01
rs491567     WDR72         C               1.07         3.2E-01           0.96        5.2E-01
rs6088580    TP53INP2      C               1.08         1.8E-01           1.07        2.3E-01
rs626277     DACH1         C               1.12         6.3E-02           0.95        3.9E-01
rs6420094    SLC34A1       G               0.97         6.4E-01           1.04        5.1E-01
rs6431731    DDX1          T               0.77         1.3E-01           1.05        7.1E-01
rs6459680    RNF32         A               0.91         1.5E-01           1.07        2.4E-01
rs6465825    TMEM60        C               0.98         7.2E-01           1.02        6.9E-01
rs6795744    WNT7A         A               1.00         9.7E-01           0.94        3.7E-01
rs7422339    CPS1          A               1.09         1.6E-01           1.14        2.1E-02
rs7759001    ZNF204        A               0.95         4.9E-01           1.08        2.5E-01
rs7805747    PRKAG2        A               1.02         7.2E-01           1.07        2.7E-01
rs7956634    PTPRO         C               1.01         8.7E-01           1.01        8.9E-01
rs8091180    NFATC1        A               0.94         2.9E-01           0.99        8.0E-01
rs881858     VEGFA         A               0.94         3.2E-01           1.00        9.8E-01
rs9682041    SKIL          T               1.02         8.2E-01           0.99        9.5E-01
rs9895661    BCAS3         T               1.01         8.6E-01           0.92        2.2E-01

OR: Odds ratio, CI: Confidence interval. The significance threshold was set at 4.5 × 10⁻⁴ (Bonferroni correction, 0.05/(2 × n(SNPs)), two-sided test). Cases with eGFR <45 ml/min/1.73m²: n=2245; cases with UACR ≥300 mg/g: n=1385. GCKD controls for eGFR (eGFR >60 ml/min/1.73m²): n=1006; GCKD controls for UACR (UACR <30 mg/g): n=2117.
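The Bonferroni threshold stated in the footnote can be reproduced with a few lines of arithmetic. The sketch below is illustrative only and assumes n(SNPs) = 55 and two outcomes (G3b and A3), which yields the stated 4.5 × 10⁻⁴; the function name `bonferroni_threshold` and the small `rows` sample, copied from three table rows, are introduced here for illustration.

```python
def bonferroni_threshold(alpha, n_snps, n_outcomes):
    """Per-test significance level after Bonferroni correction."""
    return alpha / (n_snps * n_outcomes)


# Assumption: 55 SNPs tested against 2 outcomes (CKD G3b and CKD A3).
threshold = bonferroni_threshold(0.05, n_snps=55, n_outcomes=2)

# (SNP, gene, outcome, p-value) for three rows of the table.
rows = [
    ("rs12917707", "UMOD", "G3b", 4.2e-4),
    ("rs17319721", "SHROOM3", "A3", 4.0e-3),
    ("rs164748", "DPEP1", "G3b", 1.7e-2),
]

# Keep only associations that pass the corrected threshold.
significant = [row for row in rows if row[3] < threshold]

print(f"threshold = {threshold:.2e}")
for snp, gene, outcome, p_value in significant:
    print(snp, gene, outcome, p_value)
```

Of the three sample rows, only the UMOD variant rs12917707 (p = 4.2E-04 for G3b) falls below the corrected threshold.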

Supplementary Table 8. Associations between population-based SNPs and CKD attributed to hypertension and type 2 diabetes mellitus

SNP          Gene          Effect allele   Hypertension OR   Hypertension p-value   Type 2 diabetes mellitus OR   Type 2 diabetes mellitus p-value
rs10109414   STC1          T               1.02              7.9E-01                1.06                          3.7E-01
rs10277115   UNCX          T               1.06              5.3E-01                1.04                          6.5E-01
rs10491967   TSPAN9        A               1.04              7.6E-01                1.08                          4.4E-01
rs10513801   ETV5          G               0.97              7.6E-01                0.93                          4.5E-01
rs10774021   SLC6A13       T               0.99              9.0E-01                0.86                          3.2E-02
rs10794720   WDR37         C               0.95              6.8E-01                0.94                          6.0E-01
rs10994860   A1CF          T               1.01              9.5E-01                1.01                          9.5E-01
rs1106766    INHBC         T               0.88              1.8E-01                1.11                          1.9E-01
rs11078903   CDK12         A               1.00              9.8E-01                0.97                          7.0E-01
rs11666497   SIPA1L3       T               1.10              3.3E-01                1.09                          3.1E-01
rs11959928   DAB2          A               1.02              8.2E-01                0.93                          2.5E-01
rs12124078   DNAJC16       G               0.98              8.4E-01                1.06                          4.4E-01
rs12136063   SYPL2         A               0.93              3.9E-01                0.96                          5.9E-01
rs12460876   SLC7A9        C               0.99              8.9E-01                1.00                          9.4E-01
rs1260326    GCKR          C               1.13              1.1E-01                1.07                          3.0E-01
rs12917707   UMOD          T               0.90              3.3E-01                1.12                          2.1E-01
rs13538      ALMS1/NAT8    G               0.89              2.1E-01                1.03                          7.2E-01
rs1394125    UBE2Q2        A               1.04              5.9E-01                1.04                          5.8E-01
rs163160     KCNQ1         G               1.02              8.1E-01                0.98                          8.0E-01
rs164748     DPEP1         G               0.90              1.6E-01                1.08                          2.3E-01
rs17216707   BCAS1         C               1.01              9.4E-01                0.96                          6.2E-01
rs17319721   SHROOM3       A               0.92              3.0E-01                0.98                          7.6E-01
rs1801239    CUBN          C               0.95              6.9E-01                1.26                          2.3E-02
rs2279463    SLC22A2       G               1.20              1.2E-01                0.87                          1.7E-01
rs228611     NFKB1         A               1.00              9.7E-01                0.98                          7.3E-01
rs2453580    SLC47A1       C               1.15              7.3E-02                1.03                          7.0E-01
rs2467853    SPATA5L1      G               0.85              3.6E-02                0.92                          2.1E-01
rs267734     ANXA9/LASS2   C               1.04              6.9E-01                1.11                          2.1E-01
rs2712184    IGFBP5        A               1.04              6.2E-01                1.12                          1.0E-01
rs2802729    SDCCAG8       A               0.83              1.7E-02                0.91                          1.5E-01
rs2928148    INO80         A               0.88              8.9E-02                1.18                          1.4E-02
rs347685     TFDP2         A               1.04              6.5E-01                1.04                          6.1E-01
rs3750082    KBTBD2        A               0.89              1.3E-01                1.02                          8.2E-01
rs3828890    MHC region    G               1.05              7.3E-01                0.96                          7.5E-01
rs3850625    CACNA1S       A               1.00              1.0E+00                1.13                          2.2E-01
rs3925584    MPPED2        C               1.01              9.5E-01                1.12                          9.0E-02
rs4014195    AP5B1         G               1.21              1.5E-02                0.93                          2.5E-01
rs4667594    LRP2          A               1.08              3.2E-01                0.94                          3.4E-01
rs4744712    PIP5K1B       C               0.94              4.0E-01                1.14                          5.5E-02
rs491567     WDR72         C               1.01              8.9E-01                0.99                          9.3E-01
rs6088580    TP53INP2      C               0.88              1.0E-01                1.03                          7.1E-01
rs626277     DACH1         C               1.00              9.9E-01                1.05                          4.8E-01
rs6420094    SLC34A1       G               0.97              6.6E-01                0.96                          5.8E-01
rs6431731    DDX1          T               1.38              1.1E-01                1.42                          7.4E-02
rs6459680    RNF32         A               1.03              7.3E-01                1.03                          6.8E-01
rs6465825    TMEM60        C               0.91              2.3E-01                1.04                          5.3E-01
rs6795744    WNT7A         A               1.18              1.2E-01                0.97                          7.9E-01
rs7422339    CPS1          A               0.90              2.3E-01                0.99                          8.6E-01
rs7759001    ZNF204        A               1.04              6.6E-01                1.10                          2.2E-01
rs7805747    PRKAG2        A               0.92              3.2E-01                0.92                          2.6E-01
rs7956634    PTPRO         C               1.02              8.4E-01                0.95                          5.2E-01
rs8091180    NFATC1        A               0.99              9.2E-01                1.01                          9.0E-01
rs881858     VEGFA         A               0.97              7.4E-01                0.83                          1.1E-02
rs9682041    SKIL          T               1.01              9.5E-01                1.15                          1.8E-01
rs9895661    BCAS3         T               0.94              5.1E-01                1.00                          9.9E-01

OR: Odds ratio, CI: Confidence interval. The significance threshold was set at 4.5 × 10⁻⁴ (Bonferroni correction, 0.05/(55 × 2), two-sided test). Cases with CKD attributed to hypertension (nephrosclerosis): n=1086; cases with CKD attributed to type 2 diabetes mellitus (T2DM): n=653. GCKD controls: n=569 for hypertension (specific CKD etiology control group excluding nephrosclerosis) and n=1655 for T2DM.

Supplementary Table 9. Power calculations across a range of sample sizes (case numbers), allele frequencies and effect sizes

Power by allele frequency (Freq), odds ratio (OR) and number of cases (N)

OR    N     Freq 0.10   Freq 0.15   Freq 0.20   Freq 0.25   Freq 0.30   Freq 0.35   Freq 0.40
1.0   100   0.0%        0.0%        0.0%        0.0%        0.0%        0.0%        0.0%
1.0   200   0.0%        0.0%        0.0%        0.0%        0.0%        0.0%        0.0%
1.0   300   0.0%        0.0%        0.0%        0.0%        0.0%        0.0%        0.0%
1.0   400   0.0%        0.0%        0.0%        0.0%        0.0%        0.0%        0.0%
1.1   100   0.1%        0.1%        0.1%        0.1%        0.1%        0.1%        0.1%
1.1   200   0.1%        0.1%        0.2%        0.2%        0.3%        0.3%        0.3%
1.1   300   0.2%        0.2%        0.3%        0.4%        0.4%        0.5%        0.5%
1.1   400   0.2%        0.3%        0.5%        0.6%        0.7%        0.8%        0.9%
1.2   100   0.2%        0.3%        0.4%        0.5%        0.6%        0.7%        0.8%
1.2   200   0.5%        0.9%        1.4%        1.8%        2.2%        2.5%        2.7%
1.2   300   1.0%        1.9%        3.0%        4.0%        4.9%        5.6%        6.1%
1.2   400   1.8%        3.4%        5.3%        7.2%        8.9%        10.3%       11.2%
1.3   100   0.6%        1.0%        1.5%        2.0%        2.4%        2.7%        2.9%
1.3   200   2.0%        3.9%        6.0%        8.1%        9.9%        11.3%       12.1%
1.3   300   4.6%        9.1%        14.0%       18.6%       22.5%       25.3%       27.1%
1.3   400   8.3%        16.4%       24.8%       32.1%       37.8%       41.9%       44.4%
1.4   100   1.5%        2.8%        4.2%        5.6%        6.8%        7.6%        8.2%
1.4   200   6.0%        11.7%       17.7%       23.1%       27.4%       30.5%       32.2%
1.4   300   13.9%       26.2%       37.7%       46.8%       53.4%       57.7%       60.1%
1.4   400   24.5%       43.3%       58.0%       68.2%       74.6%       78.5%       80.5%
1.5   100   3.3%        6.4%        9.6%        12.6%       15.0%       16.7%       17.6%
1.5   200   13.9%       26.0%       37.0%       45.7%       51.9%       55.8%       57.8%
1.5   300   30.5%       51.3%       66.1%       75.4%       81.0%       84.1%       85.6%
1.5   400   49.0%       72.6%       85.2%       91.3%       94.3%       95.7%       96.3%
2.0   100   36.6%       57.0%       70.0%       77.5%       81.5%       83.4%       83.9%
2.0   200   84.7%       96.1%       98.8%       99.5%       99.7%       99.8%       99.8%
2.0   300   98.1%       99.9%       100.0%      100.0%      100.0%      100.0%      100.0%
2.0   400   99.9%       100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
2.5   100   82.3%       94.4%       97.7%       98.8%       99.2%       99.3%       99.2%
2.5   200   99.8%       100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
2.5   300   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
2.5   400   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.0   100   97.9%       99.7%       99.9%       100.0%      100.0%      100.0%      100.0%
3.0   200   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.0   300   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.0   400   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.5   100   99.9%       100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.5   200   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.5   300   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
3.5   400   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.0   100   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.0   200   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.0   300   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.0   400   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.5   100   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.5   200   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.5   300   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
4.5   400   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
5.0   100   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
5.0   200   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
5.0   300   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%
5.0   400   100.0%      100.0%      100.0%      100.0%      100.0%      100.0%      100.0%

α = 0.000263, KP = 0.0001, n(controls): 1,655
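To give a sense of how power depends on allele frequency, odds ratio and case number, the following minimal sketch uses a normal approximation for a two-sided allelic (two-proportion) test. It is an illustration under stated assumptions and not the method used to generate the table: the tool behind the tabulated values is not named here, the disease is treated as rare so that the control allele frequency equals the population frequency, and the function names are hypothetical. Its output is therefore not expected to match the tabulated values exactly.

```python
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1 - control_freq)
    return odds / (1 + odds)


def allelic_test_power(control_freq, odds_ratio, n_cases, n_controls, alpha):
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    std_normal = NormalDist()
    p0 = control_freq
    p1 = case_allele_freq(p0, odds_ratio)
    # Each individual contributes two alleles.
    alleles_cases, alleles_controls = 2 * n_cases, 2 * n_controls
    se = (p1 * (1 - p1) / alleles_cases + p0 * (1 - p0) / alleles_controls) ** 0.5
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(p1 - p0) / se
    # Probability that the test statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


# Settings taken from the table footer: alpha = 0.000263, 1,655 controls.
for n_cases in (100, 200, 300, 400):
    power = allelic_test_power(0.25, 1.5, n_cases, 1655, 0.000263)
    print(f"Freq=0.25 OR=1.5 N={n_cases}: power={power:.1%}")
```

At OR = 1.0 this approximation returns the significance level itself (about 0.03%), which is consistent with the 0.0% entries in the first rows of the table.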
