Population Genetics in Kidney Health and Disease

Presentation Outline
• Introduction
• Some principles of Population Genetics
• Illustrative Example: African Heritage Populations - Mapping Common Genetic Variants
• Take Home Messages - Call for Collaborations
• Acknowledgments

Population Genetics: Society and Ethics
• Genetic and non-genetic evidence indicates that humans are part of one extended family
• Human sanctity and value is independent of population affiliation or genetics
• Where we are going to is more important than where we came from

[Figure: Common Ancestors and Unique Ancestors of Individual 1 and Individual 2]

• A “population” in human genetics can be designated as a group of people who share common ancestry patterns at a “genome wide” or “genetic locus” level

• “Population genetic architecture” can be measured using DNA diversity markers

DNA Diversity Markers Arise from “slips” in DNA Replication

Classification of DNA Diversity Markers

Biological Type: SNPs, STRs, CNVs, other

Genomic Location:
• Uniparental (non-recombining): useful in tracing genealogies and sex biased demographic history
• Biparental (autosomal): useful in assessing "relatedness", in tracing evolutionary history of genomic regions, and in trait/disease phenotype mapping

Demographic History

Choice and combination depend on the scientific, historical, genealogical, clinical or forensic question of interest

DNA Markers

SNP = Single Nucleotide Polymorphism

STR = Simple Tandem Repeat (microsatellite), one of many kinds of DNA markers

original: Mississippi

copy error 1: Missississippi

copy error 2: Mississississippi

TWIGS
• Repeat Event Markers: STR
• Timespans < 1000 years
• Combinations of markers define “Haplotypes”

BRANCHES
• Unique Event Markers: SNP
• Timespans > 10K years
• Combinations of markers define “Haplogroups”

Twigs on a given Branch define a LINEAGE
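The "Mississippi" copy-error analogy for STR markers can be reproduced in a few lines. This is only an illustrative sketch: the function names `slip` and `repeat_count` are hypothetical, and the model (each slip adds whole copies of the repeat unit to the end of the tandem run) is a deliberate simplification of replication slippage.

```python
def repeat_count(sequence: str, unit: str) -> int:
    """Count consecutive copies of `unit` in its first tandem run in `sequence`."""
    pos = sequence.find(unit)
    if pos == -1:
        return 0
    count = 0
    while sequence.startswith(unit, pos):
        count += 1
        pos += len(unit)
    return count


def slip(sequence: str, unit: str, extra_copies: int = 1) -> str:
    """Return `sequence` with `extra_copies` more copies of `unit` in its first tandem run."""
    start = sequence.find(unit)
    if start == -1:
        raise ValueError(f"repeat unit {unit!r} not found")
    # Walk to the end of the tandem run, then append the extra copies there.
    end = start + repeat_count(sequence, unit) * len(unit)
    return sequence[:end] + unit * extra_copies + sequence[end:]


original = "Mississippi"            # 2 copies of the unit "ssi"
copy_error_1 = slip(original, "ssi")
copy_error_2 = slip(copy_error_1, "ssi")
print(copy_error_1, repeat_count(copy_error_1, "ssi"))
print(copy_error_2, repeat_count(copy_error_2, "ssi"))
```

The repeat count is the allele that would be scored at an STR locus; because slips recur readily, the same count can arise more than once, which is why STRs resolve recent "twigs" rather than deep "branches".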

“For man is but the tree of the field” (Deuteronomy 20:19)

Uniparental (Y-Chromosome and mtDNA) markers are more useful in tracing genealogies than are biparental region markers

Population-specific blocks of “Linkage Disequilibrium”: their size and structure reflect population architecture, and they are more useful in disease mapping. These regions are biparental and undergo recombination.

Reconstructing Ancestry
• Individuals who share the same set of markers for a given DNA region share a common ancestry at that genomic region

  - DNA region Y chromosome: shared paternal ancestry
  - DNA region mtDNA: shared maternal ancestry
  - Disease related region of interest: shared Identity by Descent (IBD)
  - Extended length haplotypes in panmictic and diverse populations (e.g. Africa) signify evolutionary selection

SNPs: 3 million differences between individuals
• 95% of these differences have no phenotypic effects
• Smaller percentages encode phenotypic differences
• An even smaller percentage cause or predispose to disease or variable drug response
• Population frequency may be “neutral” (influenced just by demography) or affected by “selection”
• Useful to infer human origins and migrations and also in gene mapping

Demographic factors affect the entire genome: founder, bottleneck, expansion, migration, admixture

Selection: differential effect on specific genomic regions

Evolutionary Medicine
Adapted from Lluis Quintana-Murci
McClellan and King 2010

• Common variants which passed the evolutionary filter may now be relevant to common disease due to increased life expectancy (or change in adaptive forces: diet, drugs, other)
• Very many private rare variants which contribute to common disease and have not passed the evolutionary filter

Whole Exome Sequencing: Implications for Rare Variant Mapping

Can High Risk Common Variants be Mapped?

Jewish Heritage

Druze Heritage

African Heritage
• 12 million Africans, principally from three regions of West and South Africa, were forcibly translocated to the Americas in ~400 years of slave trade (~1.5 M did not survive the voyage)
• Subsequent admixture with Europeans led to the current genome wide African American population average of ~83% African ancestry

Identify “ancestry” of chromosomal regions using DNA markers whose allele frequencies differ markedly between parent populations

Admixture Generates Blocks of Linkage Disequilibrium (LD), greatly facilitating population-based gene mapping

Linkage Disequilibrium

Linkage Disequilibrium (LD) is the nonrandom association (at the population level) of two alleles, A and B, on the same chromosome

In equilibrium: P_AB = P_A × P_B

In disequilibrium: P_AB ≠ P_A × P_B (P_AB > P_A × P_B when A and B are found together more often than expected)

Linkage Disequilibrium (LD)

If allele A is found in significant LD with the causative allele (B), we should detect association of allele A to the examined phenotype
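The equilibrium condition above can be turned into the usual LD summary statistics. The sketch below uses the standard definitions D = P_AB - P_A × P_B, |D'| and r²; the function name `ld_stats` and the example frequencies are illustrative assumptions, not values from the slides.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, |D'|, r^2) from the AB haplotype frequency and the two allele frequencies."""
    d = p_ab - p_a * p_b  # zero when P_AB = P_A * P_B (equilibrium)

    # D' rescales D by the largest magnitude it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two loci.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical frequencies: independent alleles vs. alleles always inherited together.
print(ld_stats(p_ab=0.06, p_a=0.2, p_b=0.3))
print(ld_stats(p_ab=0.2, p_a=0.2, p_b=0.2))
```

A marker allele A is only a useful proxy for a causative allele B when r² is high, which is the sense of "significant LD" in the statement above.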

[Figure: alleles A and B on the same chromosome]

In “conventional” genome wide association studies (GWAS), only loci A in sufficient physical proximity to causative locus B to mitigate recombination will be associated with the health or disease phenotype of interest

Recombination Breaks Down LD
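A standard textbook expectation, assuming random mating and a constant recombination fraction c between the two loci, is that D shrinks by a factor (1 - c) each generation, so D_t = D_0 × (1 - c)^t. The sketch below applies that formula; the function name `ld_after` and the starting values (D_0 = 0.25, c = 0.01) are hypothetical choices for illustration.

```python
def ld_after(d0: float, c: float, generations: int) -> float:
    """Expected LD coefficient after `generations` of random mating.

    d0: initial disequilibrium D_0
    c:  recombination fraction between the two loci (0 <= c <= 0.5)
    """
    if not 0.0 <= c <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    return d0 * (1.0 - c) ** generations


for t in (0, 10, 100):
    print(t, round(ld_after(d0=0.25, c=0.01, generations=t), 4))
```

Closely spaced loci (small c) keep their association for many generations, while distant loci lose it quickly.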

[Figure: recombination progressively breaks up the original haplotype after 10 generations and after 100 generations]

Admixture restores LD

Mapping by Admixture Linkage Disequilibrium (MALD)

Cases: admixed population with differential disease risk

Controls: subjects or other chromosomal regions

Adapted from Smith and O’Brien 2005
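The "other chromosomal regions as controls" comparison on this slide can be sketched as a case-only scan: the local ancestry proportion at each locus in cases is compared with the genome-wide average. This is a toy illustration rather than the method of the cited studies; the function name, locus labels and ancestry values are all hypothetical, and a real MALD analysis infers local ancestry from ancestry-informative markers with a likelihood model.

```python
from statistics import mean, pstdev


def case_only_scan(local_ancestry: dict[str, float]) -> dict[str, float]:
    """Z-score of each locus's ancestry proportion in cases against the genome-wide mean.

    local_ancestry maps a locus label to the proportion of case chromosomes
    carrying ancestry from the higher-risk parent population at that locus.
    """
    values = list(local_ancestry.values())
    mu = mean(values)
    sd = pstdev(values)  # simplification: the peak locus itself is included in mean and sd
    return {locus: (v - mu) / sd if sd > 0 else 0.0 for locus, v in local_ancestry.items()}


# Hypothetical data: most loci sit near the genome-wide average, one is enriched.
cases = {
    "locus_01": 0.82, "locus_02": 0.83, "locus_03": 0.81, "locus_04": 0.84,
    "locus_05": 0.82, "locus_06": 0.83, "locus_07": 0.92, "locus_08": 0.82,
    "locus_09": 0.81, "locus_10": 0.83,
}
scores = case_only_scan(cases)
peak = max(scores, key=scores.get)
print(peak, round(scores[peak], 2))
```

A locus whose ancestry proportion in cases rises well above the rest of the genome is the kind of admixture peak a MALD scan looks for.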

In the case of a parent population specific common variant which confers common disease risk, the genomic regions containing that variant should be significantly enriched in markers which “paint” the ancestry of the region in “cases” compared to “controls”, or compared to genomic regions which do not contribute (most of the genome)

Feasibility depends on:
• Population disparity in disease frequency
• Common variant (in the at risk population)
• Admixture LD

End Stage Kidney Disease (ESKD)

• ESKD is the final common pathway of chronic kidney disease (CKD) and is fatal without renal replacement therapy by dialysis or transplantation (much of the developing world where kidney disease is increasing)
• ~550 North Americans (out of 40M with CKD) and ~5,000 Israelis

[Figure: values by group 4.19, 1.95, 1 and 3.62, 1.86, 1; %AF <1, 31.2 (24.3), 83.4 (16.3)]

High Rates of Kidney Disease among HIV Positive African Americans

Highleyman 2007

The most striking discrepancy is a > 10 fold greater risk for Kidney Disease (HIVAN) in HIV infected African Americans compared to HIV infected European Americans

•Population disparities not readily attributable to socio-economic or environmental factors •Familial clustering of CKD and ESKD of varying etiologies in African Americans (Freedman 1999)

OVERARCHING GENETIC RISK VARIANT (Freedman 1999)

Non-diabetic ESKD in African Americans: Admixture scan
• Smith panel (Kao et al.)
• Smith panel (Kopp et al.)
• Tian + EMI panel (Shlush et al.)

Bercovici et al. 2008

[Figure: OR statistic for each SNP. Admixture peak centered on MYH9; 34 other genes were found in the 2 Mb 95% interval. African ancestry >90% in cases, shown against African ancestry in controls]

MYH9, encoding non-muscle myosin heavy chain, was chosen:
• Known Giant Platelet Syndromes caused by rare mutations with dominant Mendelian inheritance pattern sometimes cause ESKD
• Center of the peak

Adapted from Kopp et al 2008 and NIDDK 2010, Kao et al 2008

MYH9-Associated Nephropathies
• Non-monogenic forms of FSGS
• Hypertension attributed CKD in persons of African heritage
• HIVAN
• Contribution to other forms of kidney disease

Multiple Groups Conducted Fine Mapping Using a “Case-Control” Candidate Locus Association Study Design to Identify Disease Risk Markers for Non-Diabetic End Stage Kidney Disease

Behar et al. 2010

Disease Associated Variants
• common ancient variants tend to be associated with low risk
• rare recent variants tend to be associated with high risk

MYH9 region kidney risk allele
• common (~0.6 in African Americans)
• high OR (2-7)

Bodmer, Nature Genetics 2008, adapted from Jeffrey Kopp 2010

• Public health and policy implications of high OR risk alleles
• Find the pathogenic mutation which confers disease risk: presumption of a single common variant in LD with risk variants
  - Molecular and cellular approaches using identified SNPs
  - Resequencing to search for novel SNPs
  - Different approach

Find the Causative

• Open minded approach (Don’t be enchanted by a tasty candidate morsel)

The Emperor’s New Genes 34 genes in the MALD interval

MYH9

• Clinical Observation

• 1,000 Genomes Data Mining

• Ultimately need biology and function

AJKD 2006

Ethiopian Jews (Beta Israel): Low Risk for Kidney Failure, specifically HIVAN

We now know this also to be valid for Ethiopian non-Jews (unpublished, courtesy of Dawit Wolday)

Find the Causative

1,000 Genomes Project (March 2010 Release)
• 60 European, 59 Yoruba
• align and compare
• filter the differences:
  - population polymorphism (not singleton)
  - skewed distribution
  - high LD with MYH9
• 250 passed filter (out of 7,479 SNPs)
• 4/250 non-synonymous (3 missense, 1 stop)
• genotyping

Genes 350 kbp around MYH9

[Figure: gene map with labels Arg182Cys, APOL3, FOXRED2, Q58X, R71C]

MYH9 gene, 110 kbp: contains dozens of ESKD associated INTRONIC SNPs

APOL1 (15 kbp)
• S342G and I384M (G1 missense haplotype), LD 279/280
• del.N388/Y389 (G2 in-frame deletion)

Genotyping

Two Sample Sets: • 955 African American and Hispanic Americans (non‐diabetic ESKD and ethnically matched elderly controls)

• 660 individuals from 12 populations in Africa including 306 Ethiopians

Genotyping Methods
1. KasPar technology (automated allele specific PCR)
2. Manual RFLP
3. Verify by direct sequencing

The APOL1 missense variants (rs73885319 and rs60910145) are far more strongly associated with ESKD risk than the leading MYH9 risk variants, both in terms of OR and p values

Why was MYH9 Picked up first?

Linkage Disequilibrium and Hitchhiking

Positive Selection and Hitchhiking?

Hitchhiking effect: “The increase in frequency of a neutral allele at a locus closely linked to a selectively favored allele at a different locus” (answers.com)

• Hitchhiking may have a distinctively patterned LD‐reducing effect, in particular near the target of selection

• MYH9 variants hitchhiked with APOL1 variants, which were under strong selection due to a tropical infectious pathogen

Stephan 2006, The Hitchhiking Effect on Linkage Disequilibrium Between Linked Neutral Loci

Two leading MYH9 SNPs (Nelson et al. 2010)

African Sleeping Sickness (Brun 2010, Human African trypanosomiasis)

Resistance to APOL1 killing of the parasite

APOL1 Missense Mutations S342G and I384M

bent alpha helix

Mutation predicted to exert a major effect on L internal site binding domain and function

hydrophobic core stabilized the bend

Protein Structure Models for Apolipoproteins L: I-TASSER and CHIMERA, TM-score > 0.5 for all predictions

Genovese et al. Core Findings
• African American: 205 biopsy proven FSGS, 180 control
• Association analysis with 1000 Genomes SNPs
• APOL1 signal >> MYH9 signal
• two allelic variants

G1 missense haplotype: rs73885319 (S342G), LD~1 with rs60910145 (I384M)

G2 in-frame deletion: rs 71585313 (del.N388/Y389)

G1 risk: 52% case, 18% control

G2 risk: 23% case, 15% control

OR (recessive) 10.5 (6-18.4); (additive/dominant) 1

40% West Africans G1 or G2
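For orientation, an odds ratio can be computed directly from the case and control percentages quoted above. This sketch assumes each percentage is the proportion of that group carrying the risk variant (the slide does not say whether these are allele or carrier frequencies), and `odds_ratio` is an illustrative helper name. It does not reproduce the reported recessive OR of 10.5, which requires genotype counts that are not on the slide.

```python
def odds_ratio(p_case: float, p_control: float) -> float:
    """Odds of carrying the variant in cases divided by the odds in controls."""
    for p in (p_case, p_control):
        if not 0.0 < p < 1.0:
            raise ValueError("proportions must lie strictly between 0 and 1")
    return (p_case / (1.0 - p_case)) / (p_control / (1.0 - p_control))


# Percentages from the slide, read as proportions (an assumption).
or_g1 = odds_ratio(0.52, 0.18)
or_g2 = odds_ratio(0.23, 0.15)
print(f"G1: {or_g1:.2f}  G2: {or_g2:.2f}")
```

An odds ratio of 1 means the variant is equally common in cases and controls; values above 1 mean it is enriched in cases.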

G1 and G2 are never on the same parental allele

Question: what percentage of kidney disease among African Americans would be eliminated if G1/G2 “causation” could be “cured”?

Answer: HIVAN 100%, non-HIVAN > 70%

G1/G2 homozygotes or compound heterozygotes: HIV negative 12%, HIV positive 50% (corresponding figures for MYH9 are 5% and 20%)

Ethiopian MYH9 SNPs reflect a different population genetic history and thereby point to a phylogenetic branch upon which the actual “causative” mutation occurred

[Figure: phylogenetic tree, with the label “Causative Mutation occurred here”]

Why do both F1 risk (in MYH9) and the APOL1 missense variants have the same zero frequency in Ethiopia?

They are on the same phylogenetic branch partitioning northeast from sub-Saharan Africa

http://www.proteinatlas.org

Kidney Injury Mechanism

Gain of function or loss of function kidney injury?
• Autophagy
• Lipids
• Trafficking

[Figure labels: Ser342Gly; APOL1 immunohistochemistry]

Experimental Approaches:
- More human genetics: earlier stage CKD, other
- Nephrectomy specimens
- Kidney disease biopsies
- Model systems: rats, mice, flies, yeast (don’t have endogenous APOL1)

MYH9 vs APOL1

[Table comparing MYH9 and APOL1 by genetics, evolution, biology and medical significance; cell entries recovered from the slide: strong, stronger, plausible, new insight, signature, PH4]

Can a preventive or therapeutic intervention hopefully be developed?

Possible Medical Significance Already (already true for MYH9 due to LD)

• Threshold for ART in HIV infection
• Kidney transplant donation
  - LRD: donor and recipient
  - cadaveric: recipient
• Recurrent FSGS
• Hypertension management
• Classification of CKD etiologies

SO YOU HAVE MAPPED A GENETIC VARIANT ASSOCIATED WITH DISEASE RISK. SO WHAT?

Is it correct? Is it significant?

WHAT’S NEXT?
• More genetics to verify?
• Evolution of the variant selection?
• Biological significance
  - does it make sense?
  - is there a new basic biological insight?
• Medical significance
  - is there mechanistic evidence of disease risk causation?
  - PH4 medicine: Predictive, Preventive, Personalized, Participatory
• Can a preventive or therapeutic intervention be developed?

Take Home Messages I

• Population genetics research is both exciting and “healthy”, but must be done in a socially responsible manner
• Rare variants and common variants both contribute to human disease
• Phylogenetics is very helpful in mapping of disease loci
• Keep an open mind and avoid temptation (candidate gene)

Take Home Messages II
• “Nothing in Biology Makes Sense Except in the Light of Evolution” (Theodosius Dobzhansky)
• Missense mutations in the APOL1 gene are most highly associated with risk for non-diabetic kidney disease, accounting for a very high western African population ancestry attributable risk

[Diagram: Kidney Failure, with etiology (Genes + Environment) and treatment; repeated for Diabetes, Hypertension, Kidney Failure]

The Population Genetics of Complex Disease: Principles and Disease Examples

POPGEN GROUP: Doron Behar, Daniella Magen, Liron Berger, Liran Shlush, Guenady Yudkovsky, Tali Shemer, Sara Selig, Yarin Hadid, Shalev Itzkovitz, Shay Tzur, Sivan Bercovici, Walter Wasser

CURRENT COLLABORATING LABS: Geiger, Hammer, Templeton, Gurwitz, Villems, Winkler, Kopp, Nelson

PREVIOUS COLLABORATING LABS: Bradman, Goldstein, Bonne-Tamir, Jobling, Quintana-Murci

Many students, colleagues and collaborators

Thank you: Köszönöm Szépen

ﺷُﻜْﺮًا ﺟﺰﻳﻼ۶ Grazie תודה Shukran