
North African Jewish and non-Jewish populations form distinctive, orthogonal clusters

Christopher L. Campbella,1, Pier F. Palamarab,1, Maya Dubrovskyc,d,1, Laura R. Botiguée, Marc Fellousf, Gil Atzmong,h, Carole Oddouxa, Alexander Pearlmana, Li Haoi, Brenna M. Hennj, Edward Burnsg, Carlos D. Bustamantej, David Comase, Eitan Friedmanc,d, Itsik Pe’erb, and Harry Ostrera,h,2

Departments of aPathology, gMedicine, and hGenetics, Albert Einstein College of Medicine, Bronx, NY 10461; bDepartment of Computer Science, Columbia University, New York, NY 10027; cSusanne Levy Gertner Oncogenetics Unit, Danek Gertner Institute of Human Genetics, Chaim Sheba Medical Center, Tel-Hashomer 52621, Israel; dSackler School of Medicine, Tel Aviv University, Ramat Aviv 69978, Israel; eInstitute of Evolutionary Biology, Consejo Superior de Investigaciones Cientificas, Universitat Pompeu Fabra, 08003 Barcelona, Spain; fCochin Institute, Institut National de la Santé et de la Recherche Médicale 567, 75014 Paris, France; iCenter for Genome Informatics, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07101; and jDepartment of Genetics, Stanford University, Stanford, CA 94305

Edited* by Arno G. Motulsky, University of Washington, Seattle, WA, and approved July 3, 2012 (received for review March 23, 2012)

North African Jews constitute the second largest Jewish Diaspora group. However, their relatedness to each other; to European, Middle Eastern, and other Jewish Diaspora groups; and to their former North African non-Jewish neighbors has not been well defined. Here, genome-wide analysis of five North African Jewish groups (Moroccan, Algerian, Tunisian, Djerban, and Libyan) and comparison with other Jewish and non-Jewish groups demonstrated distinctive North African Jewish population clusters with proximity to other Jewish populations and variable degrees of Middle Eastern, European, and North African admixture. Two major subgroups (Moroccan/Algerian and Djerban/Libyan) were identified by principal component, neighbor-joining tree, and identity-by-descent analysis; they varied in their degree of European admixture. These populations showed a high degree of endogamy and were part of a larger Ashkenazi and Sephardic Jewish group. By principal component analysis, these North African groups were orthogonal to contemporary populations from North and South Morocco, Western Sahara, Tunisia, Libya, and Egypt. Thus, this study is compatible with the history of North African Jews: founding during Classical Antiquity with proselytism of local populations, followed by genetic isolation with the rise of Christianity and then Islam, and admixture following the emigration of Sephardic Jews during the Inquisition.

Jewish genetics | population genetics | North African genetics | identical by descent sharing | deep ancestry

Author contributions: C.L.C., P.F.P., M.D., L.R.B., M.F., G.A., C.O., A.P., L.H., B.M.H., E.B., C.D.B., D.C., E.F., I.P., and H.O. designed research; C.L.C., P.F.P., M.D., L.H., I.P., and H.O. performed research; C.L.C., P.F.P., M.D., L.H., I.P., and H.O. analyzed data; and C.L.C., P.F.P., M.D., L.R.B., G.A., B.M.H., I.P., and H.O. wrote the paper.

The authors declare no conflict of interest.

*This Direct Submission article had a prearranged editor.

1C.L.C., P.F.P., and M.D. contributed equally to this work.

2To whom correspondence should be addressed. E-mail: [email protected].

This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1204840109/-/DCSupplemental.

Jews lived in multiple communities in North Africa for >2,000 y (1). Successive waves of migration from the Middle East and Europe, as well as conversion and admixture of local populations (mostly thought to be Berber and here termed “Maghrebi”), contributed to the formation of Jewish communities (2). Although termed “Sephardic,” the formation of these communities antedated the presence of Jews in the Iberian Peninsula, with significant admixture occurring only after the expulsions from Spain and Portugal in 1492 and 1497, respectively (2). From that time up to their migration to current Israel starting in the 1940s and more massively in the 1950s, each of these populations lived in relative seclusion and was endogamous (3, 4). This separation led to their developing the characteristics of genetic isolates, such as high frequencies of founder mutations for Mendelian disorders and limited repertoires of mitochondrial and Y chromosomal haplotypes (5–7).

The relatedness of these Jewish groups to each other, to European and Middle Eastern Jews, and to their non-Jewish North African neighbors has been addressed in only a fragmentary fashion in prior studies (8–14). Most studies were limited to one or two North African groups. One study challenged the story line of Judean migrants, Berber tribesmen, and Sephardic Jewish refugees contributing to the formation of these groups by demonstrating shared ancestry between Libyan Jews and Yemenite and Ethiopian Jews, groups that are thought to have limited Middle Eastern Jewish ancestry (15).

Previously, using genome-wide SNP and copy number variation data, we demonstrated that Sephardic (Greek and Turkish), Ashkenazi (Eastern European), and Mizrahi (Iranian, Iraqi, and Syrian) Jews with origins in Europe and the Middle East were more related to each other than to their non-Jewish contemporary neighbors (16). We showed that this relatedness could be explained on the basis of sharing DNA segments identical by descent (IBD) within and between populations. Here, we build on this understanding of the Jewish Diasporas by extending our analyses to members of the Jewish communities in Morocco, Algeria, Tunisia, Djerba, Libya, Ethiopia, Yemen, and Georgia and to members of non-Jewish communities from the same regions. We present a comprehensive population genetic analysis of North African Jews, a group that comprises the third major group of World Jewry, following European and Middle Eastern Jews. In addition, we extend these analyses to Georgian, Yemenite, and Ethiopian Jews, thus developing a more comprehensive genetic map for Jewish population genetics.

Results

North African Jewish Populations Form Distinctive Clusters with Genetic Proximity to Each Other and to European and Middle Eastern Jewish Groups. SNP data were generated for 509 unrelated individuals (60.5% female) from the 15 Jewish populations (Table 1). These SNP data were merged with selected datasets from the Human Genome Diversity Project (HGDP) to examine the genetic structure of Jewish populations in both global and regional contexts (Fig. 1 and SI Appendix, Fig. S1). The first two principal components of worldwide populations showed that the North African Jewish populations clustered with the European and Middle Eastern Jewish groups and European non-Jewish groups, but not with the North African non-Jewish groups, suggesting origins distinctive from the latter (Fig. 1A). Georgian Jews formed part of this cluster, whereas Yemenite and Ethiopian Jews did not. When compared only to the European, Middle Eastern, and North African Jewish and non-Jewish populations, the North African Jewish populations formed a common but distinctive cluster (observed by principal components 1 and 2) that was overlapping with the Greek and

Turkish Sephardic Jewish cluster (Fig. 1B). Each of these Jewish groups, in turn, formed a distinctive cluster (observed by principal components 1 and 3) that demonstrated a west to east cline with Algerian and Moroccan Jews in proximity to the Sephardic Jewish populations. The Tunisian Jews exhibited two apparent clusters, one with proximity to Libyan and Djerban Jews and the other proximal to the Moroccan and Algerian Jews (Fig. 1C).

The neighbor-joining tree supported this clustering of the Jewish populations, with the previously described European/Syrian and Middle Eastern branches being discernible (Fig. 2). The European Turkish, Greek, and Italian Jews shared a common branch, with Ashkenazi and Syrian Jews forming connections to this branch. The North African populations added a subbranch to the European/Syrian branch, which in turn bifurcated into Moroccan–Algerian and Tunisian–Djerban–Libyan subbranches. As reported previously (16), the Middle Eastern Jewish branch included the Iranian and Iraqi Jews and the non-Jewish Adygei. This branch was observed now to include Georgian Jews. The Yemenite and Ethiopian Jews were on distinctive branches, with the Yemenite Jews on a branch between Palestinians and Bedouins. The robustness of this phylogenetic tree was demonstrated by the fact that a majority of branches were supported by >90% of bootstrap replications.

Pairwise FST analysis indicated that each of the North African Jewish populations was distinct and, by bootstrap analysis, statistically different from all of the others [SI Appendix, Tables S1 (upper triangle) and S2]. Although FST may be sensitive to small sample sizes, these population differences were confirmed by ANOVA on the principal component analysis (PCA) eigenvectors (P < 0.05; SI Appendix, Table S3), with the exception of Algerian and Moroccan Jews, who were found to overlap. These differences were also confirmed by permutation testing of between-group IBS for all pairwise comparisons of the 15 Jewish populations (SI Appendix, Table S5). In addition, the exact test for population differentiation indicated that these populations were significantly different (P < 0.001), again with the exception of Algerian and Moroccan Jews (P = 0.25). These findings demonstrated that the most differentiated of the North African Jewish populations was Djerban (average FST to all other Jewish populations 0.026). The smallest FST was between Greek and Turkish Sephardic Jews (FST = 0.0024), who were close, in turn, to Italian, Algerian, Moroccan, and [...]. The second smallest FST observed was between Algerian and Moroccan Jews (FST = 0.0027). As a point of reference, the average pairwise FST between Jews and non-Jews (excluding African and Asian reference populations) was 0.019. Thus, North African Jews were identifiable as a third major group along with Middle Eastern Jews and European/Syrian Jews, albeit with a higher degree of relatedness to European Jews.

North African Jewish Populations Showed a High Degree of Endogamy and IBD Sharing Between Jewish Groups. We studied the frequency of IBD haplotypes shared by unrelated individuals

Table 1. Summary of populations included in this study

Population ID  Female  Male  Total  Population
ALGJ           23      1     24     Algerian Jewish
ASHJ           14      20    34     Ashkenazi Jewish*
DJEJ           0       17    17     Djerban Jewish
ETHJ           13      3     16     Ethiopian Jewish
GEOJ           4       9     13     Georgian Jewish
GRKJ           25      29    54     Greek Jewish*
IRNJ           22      27    49     Iranian Jewish*
IRQJ           25      28    53     Iraqi Jewish*
ITAJ           20      19    39     Italian Jewish*
LIBJ           31      6     37     Libyan Jewish
MORJ           32      6     38     Moroccan Jewish
SYRJ           15      21    36     Syrian Jewish*
TUNJ           24      5     29     Tunisian Jewish
TURJ           24      10    34     Turkish Jewish*
YMNJ           36      0     36     Yemenite Jewish
ADYG           10      7     17     Adygei
ALGE           9       9     18     Algerian
BASQ           8       16    24     Basque
BEDN           20      27    47     Bedouin
DRUZ           32      13    45     Druze
EGYP           0       19    19     Egyptian
FREN           17      12    29     French
LIBY           1       16    17     Libyan
MORN           0       18    18     North Moroccan
MORS           5       5     10     South Moroccan
MOZA           9       19    28     Mozabite
NITA           7       14    21     Northern Italian
PALN           34      17    51     Palestinian
RUSS           9       16    25     Russian
SARD           12      16    28     Sardinian
SOCC           0       17    17     Saharan
TUNI           0       15    15     Tunisian
AFRI           1       24    25     Sub-Saharan African
ASIA           10      15    25     Asian

*Samples that were genotyped and reported in Atzmon et al. (16).

www.pnas.org/cgi/doi/10.1073/pnas.1204840109 PNAS | August 21, 2012 | vol. 109 | no. 34 | 13865–13870

Fig. 1. Principal component analysis (PCA) of Jewish populations combined with other HGDP groups in global (A) and regional contexts (B and C). Dense regions with many overlapping populations are circled for the purpose of illustration, with a list of the groups adjacent.
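As a rough illustration of what a first principal component does with genotype data, the following is a minimal sketch in plain Python. It is not the SMARTPCA procedure used in the study: it assumes a small hypothetical matrix of 0/1/2 allele counts (rows are individuals, columns are markers), skips the per-marker normalization and outlier removal that EIGENSOFT performs, and finds only the leading eigenvector by power iteration. All function names are illustrative.

```python
import math
import random


def center_columns(genotypes):
    """Subtract each marker's mean allele count from its column."""
    n, m = len(genotypes), len(genotypes[0])
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    return [[row[j] - means[j] for j in range(m)] for row in genotypes]


def gram_matrix(centered):
    """Individual-by-individual similarity matrix, X X^T / (number of markers)."""
    m = len(centered[0])
    return [[sum(a * b for a, b in zip(r1, r2)) / m for r2 in centered]
            for r1 in centered]


def top_eigenvector(matrix, iterations=200, seed=0):
    """Leading eigenvector of a symmetric matrix by power iteration.

    Each entry is one individual's coordinate on the first component.
    """
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v


# Hypothetical toy data: two groups of three individuals, four markers.
toy = [[2, 2, 0, 0], [2, 1, 0, 0], [2, 2, 0, 1],
       [0, 0, 2, 2], [0, 0, 1, 2], [0, 1, 2, 2]]
pc1 = top_eigenvector(gram_matrix(center_columns(toy)))
```

In this toy case the two groups receive coordinates of opposite sign on the first component, which is the kind of separation that the clusters in Fig. 1 summarize at a much larger scale.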

within and across the groups analyzed. When IBD within populations was examined, the non-Jewish Tunisian Berbers exhibited the highest level of haplotype sharing, suggesting a small effective population size and high levels of endogamy (Fig. 3A) (17, 18). With the exception of this Tunisian cohort, the Jewish populations generally showed higher IBD sharing than non-Jewish groups, indicating greater genetic isolation.

The relationships of the Jewish communities were outlined further by the IBD sharing across populations [Fig. 3B and SI Appendix, Tables S1 (lower triangle) and S4], because the Jewish groups generally demonstrated closer relatedness with other Jewish communities than with geographically near non-Jewish populations. In particular, North African Jewish communities showed some of the highest levels of cross-population IBD sharing for the average pair of individuals (SI Appendix, Fig. S5). A strong degree of relatedness was observed across individuals from the Djerban, Tunisian, and Libyan Jewish communities. Noticeable proximity was also found between Jewish Algerian samples and other North African Jewish cohorts (such as Moroccan, Tunisian, Libyan, and Djerban Jews) and across individuals from the Tunisian and Moroccan Jewish groups. Among non-Jewish North African groups, Algerians, South Moroccans, and West Saharan samples were found to share, on average, a smaller proportion of their genome IBD to other cohorts.

North African Jewish and Non-Jewish Populations Vary in Their Proportions of European and Middle Eastern Ancestry. By STRUCTURE analysis, the North African Jewish groups demonstrated inferred North African–Middle Eastern ancestry with varying inferred European ancestry (Fig. 4 and SI Appendix, Fig. S2). The proportion of inferred European ancestry increased from east to west, with Moroccan Jews demonstrating the highest proportion. In contrast, the neighboring non-Jewish North African populations demonstrated substantially higher inferred North African ancestry and less European ancestry. In addition, these neighboring non-Jewish populations showed inferred sub-Saharan ancestry that was not demonstrable by STRUCTURE analysis in the North African Jewish populations. Reflecting the high degree of relatedness and shared ancestry among Tunisian, Djerban, and Libyan Jews, a unique (red) component was observed at K = 6 and 7. Similarly, reflecting the high degree of relatedness and shared ancestry among Iranian and Iraqi Jews, a unique (green) component was observed at K = 7.

By using Xplorigin to perform ancestry deconvolution for a subset of the populations, the Maghrebi (Tunisian non-Jewish), European (Basque), and Middle Eastern (Palestinian) ancestry components of North African Jewish communities were compared with the corresponding non-Jewish groups (Fig. 5). A stronger signal of European ancestry was found in the genomes of Jewish samples, with a decreased fraction of Maghrebi origins, whereas the Middle Eastern component was comparable across groups. In Jewish groups, geographical proximity to the Iberian Peninsula correlated with an increase in European ancestry and a decrease in Middle Eastern ancestry, whereas the Maghrebi component was only mildly reduced. Differences in ancestry proportions were found to be significant (P < 0.05), except for the Maghrebi component of non-Jewish Northern Moroccan compared with non-Jewish Algerian samples, and the European component of Jewish Moroccan compared with Jewish Algerian samples.

Given the distinctive genetic identity of the Basque population compared with other European populations (19), we also ran the Xplorigin analysis using 48 randomly selected HGDP Russian haplotypes as a reference for the European ancestral component of the analyzed populations. The results of such analysis suggested that, although the estimated European components using Basque or Russian reference genomes were correlated, short haplotype frequencies found in the Russian samples were less representative of the analyzed groups’ European ancestry (Fig. 5 and SI Appendix, Fig. S6).

In addition to genome-wide proportions, this ancestry painting analysis was intersected with regions that harbor long-range IBD haplotypes. The ancestry of these loci likely reflects more recent demographic trends, because long IBD haplotypes are coinherited from more recent common ancestors than the average genomic locus. In Jewish populations, the ancestry proportions in corresponding IBD regions highlighted mild (but in some cases significant) deviations from genome-wide averages (SI Appendix, Fig. S3; detail in SI Appendix, Table S6), whereas stronger differences were observed in the recent ancestry for the corresponding non-Jewish communities. In these groups, recently coinherited regions exhibited significantly increased European ancestry, with significantly decreased Maghrebi ancestry, compared with genome-wide averages. This phenomenon was generally stronger for loci shared IBD with individuals from Jewish communities. This increase in European ancestry and corresponding decrease in Maghrebi ancestry may be interpreted in several ways: (i) This increase may be due to the inherently higher European ancestry of Jewish segments planted into the genomes of non-Jewish populations. (ii) Alternatively, the difference in genome-wide ancestries between Jewish and non-Jewish groups alone could explain this observation in the case of recent symmetric gene flow in both directions. However, this second scenario alone is inconsistent with the data, because it would imply a comparable decrease of European ancestry in regions IBD to non-Jewish populations to be observed in Jewish genomes. (iii) The observed increase of European ancestry could be similarly explained by European segments newly planted in both populations. This explanation is also unlikely, because it would result in a comparable increase of European ancestry in Jewish genomes, which is instead observed to only mildly increase compared with genome-wide averages. The increase in European ancestry is stronger in IBD regions of length between 3 and 4 cM, compared with regions at least 4 cM long (SI Appendix, Table S7), which is compatible with European admixture

Fig. 2. Neighbor-joining tree showing the relationship of European, Jewish, Middle Eastern, and North African populations, using FST as the distance metric. The tree was rooted by using the reference mixed Central and Southern African population as an out-group. Major population groups are labeled at the right, with the red bar within the Jewish group denoting North African Jews. Five hundred bootstrap iterations were tested to assess the robustness of the tree [the labels at the nodes represent the number of iterations (%) in which that configuration was seen]. Populations labeled with an asterisk (*) cluster outside of their expected groups.
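The marker bootstrap behind the FST comparisons and the tree support values can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the study's pipeline: the paper computed FST with Genepop, whereas the sketch below swaps in Hudson's ratio-of-averages estimator because it is compact, and the input format (per-marker allele frequencies plus the number of sampled chromosomes in each population) and all function names are hypothetical.

```python
import random


def hudson_fst(freqs1, freqs2, n1, n2):
    """Hudson's FST estimator, combined across markers as a ratio of sums.

    freqs1, freqs2: per-marker allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population (assumed
    constant across markers for simplicity).
    """
    num = 0.0
    den = 0.0
    for p1, p2 in zip(freqs1, freqs2):
        num += ((p1 - p2) ** 2
                - p1 * (1 - p1) / (n1 - 1)
                - p2 * (1 - p2) / (n2 - 1))
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")


def bootstrap_fst(freqs1, freqs2, n1, n2, iterations=500, seed=1):
    """Percentile interval for FST from resampling markers with replacement.

    Assumes every resample contains at least one polymorphic marker;
    otherwise the estimate is NaN and the sorted order is unreliable.
    """
    rng = random.Random(seed)
    m = len(freqs1)
    estimates = []
    for _ in range(iterations):
        idx = [rng.randrange(m) for _ in range(m)]
        estimates.append(hudson_fst([freqs1[i] for i in idx],
                                    [freqs2[i] for i in idx], n1, n2))
    estimates.sort()
    lower = estimates[int(0.025 * (iterations - 1))]
    upper = estimates[int(0.975 * (iterations - 1))]
    return lower, upper
```

Under this sketch, two populations would be called different when the bootstrap interval for their pairwise FST excludes zero. Feeding each resampled FST matrix to a neighbor-joining routine and counting how often a branch recurs would give support values of the kind shown at the nodes of Fig. 2.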

Fig. 3. Genome-wide IBD sharing for the average pair of individuals within (A) and across (B and C) populations. With the exception of non-Jewish Tunisian samples, IBD sharing is higher within Jewish groups, reflecting higher levels of endogamy. Jewish populations exhibit higher sharing with other Jewish populations than with geographically near groups. The average total sharing across Jewish populations is generally higher than the sharing across other population pairs, and pairs of North African Jewish populations (dark to red color bars) share more segments IBD than most other Jewish pairs.
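The quantity plotted in Fig. 3, IBD sharing for the average pair of individuals, can be illustrated with a short Python sketch. It assumes a hypothetical input format that is not GERMLINE's actual output: a list of (individual, individual, segment length in cM) tuples and a mapping from individual to population label. Total shared length is divided by the number of possible pairs, so pairs with no detected segment count as zero.

```python
from collections import defaultdict


def mean_ibd_sharing(segments, population_of):
    """Average total IBD length (cM) per pair of individuals.

    segments: iterable of (individual_a, individual_b, length_cm) tuples.
    population_of: dict mapping each individual to a population label.
    Returns a dict keyed by a sorted (population, population) tuple.
    Population pairs with no detected segments are absent from the result.
    """
    sizes = defaultdict(int)
    for pop in population_of.values():
        sizes[pop] += 1

    totals = defaultdict(float)
    for a, b, length_cm in segments:
        key = tuple(sorted((population_of[a], population_of[b])))
        totals[key] += length_cm

    means = {}
    for (p, q), total in totals.items():
        if p == q:
            pairs = sizes[p] * (sizes[p] - 1) // 2  # unordered pairs within p
        else:
            pairs = sizes[p] * sizes[q]             # one from each population
        means[(p, q)] = total / pairs
    return means


# Hypothetical example: three individuals in population A, two in B.
population_of = {"a1": "A", "a2": "A", "a3": "A", "b1": "B", "b2": "B"}
segments = [("a1", "a2", 6.0), ("a1", "a3", 3.0),
            ("a1", "b1", 4.0), ("a2", "b2", 8.0)]
sharing = mean_ibd_sharing(segments, population_of)
```

Within-population values of this kind correspond to panel A and cross-population values to panels B and C; a higher within-population mean is what the text reads as a signal of endogamy.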

occurring several generations before present, through ancestors that resided in the Iberian Peninsula.

Ethiopian and Yemenite Jewish Populations Form Distinctive Clusters, Whereas Georgian Jews Do Not. As noted, by PCA, Georgian Jews formed part of the Jewish cluster, whereas Yemenite and Ethiopian Jews did not. ANOVA on the PCA eigenvectors and pairwise FST analysis indicated that the Ethiopian, Yemenite, and Georgian Jewish populations were distinct and statistically different from all of the others [SI Appendix, Tables S1 (upper triangle), S2, and S3]. Ethiopian Jews fell outside of the main three Jewish genetic clusters with the highest average genetic differentiation compared with all other Jewish groups (FST = 0.047). They were most closely related to non-Jewish Libyans and South Moroccans (FST = 0.019) and then to the other North African and Middle Eastern non-Jewish populations. Their closest (yet still quite distant) Jewish neighbors were Yemenite Jews (FST = 0.038). Likewise, they showed little IBD sharing with other Jewish populations (Fig. 3). By STRUCTURE analysis, their ancestry appeared to be of North African, Middle Eastern, and sub-Saharan origin with little European contribution (Fig. 4).

Despite forming a cluster on PCA and neighbor-joining tree that appeared intermediate to Jews and Middle Eastern non-Jews, the Yemenite Jews were genetically closest to Egyptians by FST (0.008), followed by Middle Eastern non-Jews, then Turkish and Greek Jews (FST = 0.010 and 0.012, respectively); however, their mean FST to all other Jewish populations was similar to that of all other Jewish populations with the exception of Ethiopian Jews (SI Appendix, Fig. S4). Their mean levels of IBD sharing with Jewish populations were comparable to the mean levels of IBD sharing of other Jewish populations, except Ethiopian Jews (SI Appendix, Fig. S4). By STRUCTURE analysis, their inferred ancestry was predominantly Middle Eastern and North African with little European contribution.

As noted, the Georgian Jewish cluster overlapped the overall Jewish cluster, and the Georgian Jewish subbranch on the neighbor-

Fig. 4. STRUCTURE results for Jewish populations combined with Middle Eastern, European, East Asian, and African populations from HGDP. K values of 3–7 are shown; each represents an alignment of 10 independent runs. Vertical bars represent individuals, which are grouped by their known populations and further combined into general regional groups, illustrated by the top bar.
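Materials and Methods states that the STRUCTURE runs used a panel of markers chosen for high allele frequency differences between populations and low linkage disequilibrium, without giving the selection procedure. The Python sketch below shows one plausible way such a panel could be assembled and is an assumption throughout: markers are ranked by the range of their allele frequency across populations, and a minimum physical spacing on the same chromosome stands in as a crude proxy for low linkage disequilibrium. The record layout, marker IDs, and the function name are hypothetical.

```python
def select_informative_markers(markers, top_n, min_spacing_bp):
    """Greedy pick of ancestry-informative, well-spaced markers.

    markers: list of dicts with keys "id", "chrom", "pos" (base pairs) and
    "freqs" (dict of population label -> allele frequency).
    Markers are taken in order of decreasing frequency range across
    populations; a marker is skipped if it lies within min_spacing_bp of an
    already chosen marker on the same chromosome (a stand-in for an LD filter).
    """
    ranked = sorted(
        markers,
        key=lambda m: max(m["freqs"].values()) - min(m["freqs"].values()),
        reverse=True,
    )
    chosen = []
    for m in ranked:
        far_enough = all(
            m["chrom"] != c["chrom"] or abs(m["pos"] - c["pos"]) >= min_spacing_bp
            for c in chosen
        )
        if far_enough:
            chosen.append(m)
        if len(chosen) == top_n:
            break
    return chosen


# Hypothetical marker records for two populations, A and B.
markers = [
    {"id": "m1", "chrom": "1", "pos": 1000, "freqs": {"A": 0.1, "B": 0.9}},
    {"id": "m2", "chrom": "1", "pos": 1500, "freqs": {"A": 0.2, "B": 0.9}},
    {"id": "m3", "chrom": "1", "pos": 900000, "freqs": {"A": 0.4, "B": 0.5}},
    {"id": "m4", "chrom": "2", "pos": 1200, "freqs": {"A": 0.3, "B": 0.8}},
]
panel = select_informative_markers(markers, top_n=2, min_spacing_bp=100000)
```

A real analysis would measure linkage disequilibrium directly (for example as r-squared between marker pairs) rather than rely on spacing, so the thinning step here should be read only as a placeholder for that filter.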

Fig. 5. Ancestry deconvolution. The genome-wide ancestry of North African Jewish and non-Jewish populations is compared with respect to European (Basque), Maghrebi (Tunisian non-Jewish), and Middle Eastern (Palestinian) origins. Jewish populations exhibit increased European and decreased Maghrebi ancestry compared with corresponding non-Jewish groups. The Middle Eastern component is comparable across all groups.

joining tree was intermediate to those of Iranian and Iraqi Jews and the Adygei. The Georgian Jews had a low FST compared with Sephardic Jews (mean FST = 0.009), despite their similarity with Iranian and Iraqi Jews in the neighbor-joining tree and PC analysis. By pairwise IBD sharing with other Jewish populations, the Georgian Jews fell within the pattern of Jewish relatedness. By STRUCTURE analysis, their inferred ancestry was predominantly Middle Eastern and European, with little North African contribution. Notably, they shared a small proportion of the unique K = 7 (green) component that was observed in the Iranian and Iraqi Jewish populations, which may be due to either small sample size or genetic drift and founder effect.

Discussion

This study supports and expands the classification of Jewish populations that has been developed by us and others (14–16). It defines North African Jews as a distinct branch with significant relatedness to European and Middle Eastern Jews, with both being part of a larger Jewish cluster. Within this branch are two subbranches: the highly endogamous and related Djerban, Libyan, and Tunisian Jews and the more European Algerian and Moroccan Jews. With these methods, the Tunisian Jews could be differentiated into two subclusters by PCA that were more Libyan/Djerban-related and more Moroccan/Algerian-related, and the Moroccan and Algerian Jews could be demonstrated to be no different from a single population by FST and the exact test of population differentiation. All of these populations were differentiated from the current non-Jewish populations in these countries, reflecting distinctive genetic histories.

These observations are consonant with the history of Jews in North Africa, which stretches back to the earliest recorded history of the region (1–4, 20). Israelite traders may have been among the earliest Phoenician traders who colonized the African coast and established Carthage. The first evidence for Jews in North Africa is from 312 Before Common Era, when King Ptolemy Lagi of Egypt settled Jews in the cities of Cyrenaica in current-day Libya. The later Pax Romana facilitated communication among the Jewish communities of the Mediterranean Basin and assured establishment of Judaism in the two African provinces of Proconsular (Libya and Tunisia) and Caesarean (Algeria and Morocco), reflecting the current subbranches. Following the destruction of the Second Temple in Jerusalem by Roman Emperor Titus in 70 Common Era (CE), 30,000 Jews were deported to Carthage in current-day Tunisia. Josephus reported the presence of 500,000 Jews in Cyrenaica in the 1st century CE. Jewish communities have been identified from the remains at Carthage and at least 13 other sites, and Saint Augustine wrote about Jewish communities at Utica, Simittra, Thusurus, and Oea. Thus, Jewish communities originated in pre-Classical Antiquity and expanded during Classical Antiquity to grow quite large and cover a significant proportion of North Africa.

As demonstrated from these analyses, both admixture and isolation with endogamy contributed to the formation of these groups. Judaism is thought to have spread among the indigenous Berbers of North Africa through proselytism, which was quite common at the time, although the degree to which it occurred is unknown. Significant admixture occurred with the Sephardic Jews of Spain following their expulsion during the Inquisition in 1492 and later, when tens of thousands migrated to the Maghreb (Morocco, Algeria, and Tunisia) and fewer went elsewhere. This admixture is reflected in the higher proportion of European ancestry among Moroccan and Algerian Jews and the greater genetic proximity of these groups and high degree of IBD segment sharing with Sephardic Greek, Turkish, and Italian Jews. Admixture occurred with North African non-Jewish populations. However, as demonstrated by the apparent unrelatedness of the Jewish and non-Jewish North Africans, these events were not recent.

Isolation began for Jews when Roman Emperor Constantine converted to Christianity and made it the state religion of the Roman Empire. In the process, Jews were deprived of their right to convert pagans or accept proselytes. Most Christian communities of North Africa disappeared following the Arab conquest of the 8th century CE. The Jewish communities remained, but were subject to religious and civil suppression. The Jews tended to live in their own special quarters, and the area as a whole was fragmented into small tribal states. The resulting high degree of endogamy within the North African Jewish populations is reflected in the high degree of within-population IBD sharing, the patterns of within-population rare Mendelian disorders, and the identification of a shared inferred ancestral component among the Djerban, Tunisian, and Libyan Jews by STRUCTURE, neighbor-joining tree, and, to a less obvious degree, PCA. Such a uniquely shared inferred ancestral component was also observed among the Middle Eastern/Caucasian–Iranian, Iraqi, and Georgian Jews.

These results are in agreement with previous population genetic studies of North African Jews, yet significantly expand their observations by using larger numbers of populations and contemporary methods. Earlier studies based on blood group markers and serum proteins differentiated North African Jews from other Jewish groups and from non-Jewish North Africans (8–13). A more recent study identified a distinctive signature for Libyan Jews (15). Here, this signature was confirmed and shown to be shared by Djerban and Tunisian Jews. A global study of Jewish population genetics from 2010 that partitioned most Jewish genomes into Ashkenazi–North African–Sephardic,

Campbell et al. PNAS | August 21, 2012 | vol. 109 | no. 34 | 13869

Caucasus–Middle Eastern, and Yemenite subclusters demonstrated that an Ethiopian subcluster was close to the local population, in accordance with what was observed here (14). A previous study of monoallelic matrilineal inheritance demonstrated limited mitochondrial lineages in Tunisian and Libyan Jews, but not in Moroccan Jews, which was observed in this study as a high degree of extended IBD sharing among the more endogamous Tunisian and Libyan populations (5).

The observations for Georgian and Ethiopian Jews met historical expectations: Georgian Jews are an outgrowth from the Iranian and Iraqi Jewish communities, and Ethiopian Jews are an ancient community that had relatively few, if any, Jewish founders from elsewhere and existed in isolation for >2,000 years. Nonetheless, the low FST between Sephardic and Georgian Jews suggests that the latter may have had significant contact with Turkish or Syrian Jews. The observations for the Yemenite Jews are even more surprising. Like the Ethiopian Jews, this population was founded >2,000 y ago and was thought to be comprised mostly of local proselytes, which is reflected in the distinctive clustering of the population away from other Jewish groups and the mostly Middle Eastern ancestry present in this group. However, the observation of comparable FST and IBD sharing with other Jewish communities implies significant common Jewish founders in the absence of more recent genetic flow into the community. Thus, although Jewishness was transmitted by the flow of ideas and genes, both appear to have been under selection for long periods of time.

Materials and Methods

More detailed methods are described in SI Appendix, SI Materials and Methods.

Genotype Data. A total of 509 Jewish samples from 15 populations were genotyped on the Affymetrix 6.0 array (Affy v6). These were combined with 114 non-Jewish individuals from seven North African populations (17) by using the same array and 365 samples genotyped in the HGDP (21), by using Illumina HumanHap650K Beadchips. After merging and quality-control filtering, a total of 163,199 high-quality SNPs were available for downstream analyses.

PCA, FST, and Phylogeny. The SMARTPCA program from the EIGENSOFT package (Version 3.0) (22) was used to perform outlier removal and PCA. FST values were calculated for each population pair by using Genepop (23). A bootstrap method, sampling all markers with replacement for 500 iterations, was used to estimate FST confidence intervals and to generate a consensus tree using the neighbor-joining method from PHYLIP (Version 3.69) (24). Genepop was also used to perform an exact test of population differentiation.

Population Structure. Population ancestry proportions were inferred by using the program STRUCTURE (Version 2.3) (25, 26). A selection of 5,113 markers was used, selected for high differences in allele frequency between populations and low linkage disequilibrium.

IBD Discovery. Following computational phasing with BEAGLE (27), shared IBD segments were detected by using the GERMLINE software package (28). Only non-Jewish populations from North Africa and Jewish populations were analyzed, allowing a higher-density set of 598,260 SNPs to be used.

Ancestry Deconvolution. The Xplorigin software package was used to perform ancestry deconvolution on a subset of North African groups, by using described procedures (29, 30). Samples of North African origins were analyzed with respect to their Maghrebi, Middle Eastern, and European ancestry by using 36 non-Jewish Tunisian Berber, 48 Palestinian, and 48 Basque reference haplotypes, respectively. Data are available upon request.

ACKNOWLEDGMENTS. We thank Anmol Tiwari for contributions to the permutation testing; Alon Keinan, Yongzhao Shao, and Simon Gravel for useful technical discussions; Lanchi U, Malka Sasson, Bridget Riley, Jidong Shan, and David Reynolds for technical contributions; and Lawrence Schiffman, Aron Rodrigue, and Harvey Goldberg for informative historical discussions. This work was supported in part by the Lewis and Rachel Rudin Foundation, the Iranian-American Jewish Federation of New York, the US–Israel Binational Science Foundation, National Institutes of Health Grant 5 U54 CA121852, and and Sidney Lapidus. L.R.B. and D.C. were supported by Ministerio de Ciencia e Innovación Grant CGL2010-14944/BOS.
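
Illustrative sketch (not part of the original analysis). The marker bootstrap described under PCA, FST, and Phylogeny, in which all markers are resampled with replacement for 500 iterations to obtain FST confidence intervals, can be sketched as below. The study itself used Genepop and PHYLIP; this sketch swaps in a simple ratio-of-sums estimator, FST = Σ(H_T − H_S) / ΣH_T, computed from per-marker allele frequencies with equal population weights. The function names, the percentile interval, and the toy frequencies are assumptions made for illustration only.

```python
import random


def fst_ratio_of_sums(p1, p2):
    """Two-population FST from per-marker allele frequencies.

    Assumed estimator (not Genepop's): sum(H_T - H_S) / sum(H_T),
    weighting the two populations equally.
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        mean_p = (a + b) / 2.0
        h_t = 2.0 * mean_p * (1.0 - mean_p)   # heterozygosity of the pooled population
        h_s = a * (1.0 - a) + b * (1.0 - b)   # mean within-population heterozygosity
        num += h_t - h_s
        den += h_t
    return num / den if den > 0 else 0.0


def bootstrap_fst_ci(p1, p2, iterations=500, alpha=0.05, seed=0):
    """Percentile confidence interval from resampling markers with replacement."""
    rng = random.Random(seed)
    n = len(p1)
    estimates = []
    for _ in range(iterations):
        # Draw n marker indices with replacement; the same indices are used
        # for both populations so that each marker stays paired.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(
            fst_ratio_of_sums([p1[i] for i in idx], [p2[i] for i in idx])
        )
    estimates.sort()
    lower = estimates[int((alpha / 2.0) * iterations)]
    upper = estimates[int((1.0 - alpha / 2.0) * iterations) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up allele frequencies for two hypothetical populations.
    toy_rng = random.Random(1)
    pop_a = [toy_rng.uniform(0.05, 0.95) for _ in range(200)]
    pop_b = [min(0.99, max(0.01, p + toy_rng.gauss(0.0, 0.05))) for p in pop_a]
    print("FST:", fst_ratio_of_sums(pop_a, pop_b))
    print("95% CI:", bootstrap_fst_ci(pop_a, pop_b))
```

Resampling markers rather than individuals matches the description in the text. A consensus tree would additionally require building a neighbor-joining tree from the pairwise FST matrix of each bootstrap replicate, which is not shown here.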

1. Baron SW (1952) A Social and Religious History of the Jews (Columbia Univ Press, New York), 2nd Ed.
2. Stillman NA (1979) The Jews of Arab Lands: A History and Source Book (Jewish Publication Society of America, Philadelphia), 1st Ed.
3. Hirschberg HZ (1974) A History of the Jews in North Africa (Brill, Leiden), 2nd Rev. Ed.
4. Chouraqui A (1968) Between East and West: A History of the Jews of North Africa (Jewish Publication Society of America, Philadelphia), 1st Ed.
5. Behar DM, et al. (2008) Counting the founders: The matrilineal genetic ancestry of the Jewish Diaspora. PLoS ONE 3:e2062.
6. Ostrer H (2001) A genetic profile of contemporary Jewish populations. Nat Rev Genet 2:891–898.
7. Manni F, et al. (2005) A Y chromosomal portrait of the population of Djerba (Tunisia) to elucidate its complex demographic history. Bull Soc Anthropol Paris 17:1–12.
8. Bonné-Tamir B, Ashbel S, Bar-Shani S (1978) Ethnic communities in Israel: The genetic blood markers of the Moroccan Jews. Am J Phys Anthropol 49:465–471.
9. Bonné-Tamir B, Ashbel S, Modai J (1977) Genetic markers in Libyan Jews. Hum Genet 37:319–328.
10. Carmelli D, Cavalli-Sforza LL (1979) The genetic origin of the Jews: A multivariate approach. Hum Biol 51:41–61.
11. Karlin S, Kenett R, Bonné-Tamir B (1979) Analysis of biochemical genetic data on Jewish populations: II. Results and interpretations of heterogeneity indices and distance measures with respect to standards. Am J Hum Genet 31:341–365.
12. Kobyliansky E, Micle S, Goldschmidt-Nathan M, Arensburg B, Nathan H (1982) Jewish populations of the world: Genetic likeness and differences. Ann Hum Biol 9:1–34.
13. Livshits G, Sokal RR, Kobyliansky E (1991) Genetic affinities of Jewish populations. Am J Hum Genet 49:131–146.
14. Behar DM, et al. (2010) The genome-wide structure of the Jewish people. Nature 466:238–242.
15. Rosenberg NA, et al. (2001) Distinctive genetic signatures in the Libyan Jews. Proc Natl Acad Sci USA 98:858–863.
16. Atzmon G, et al. (2010) ’s children in the genome era: Major Jewish diaspora populations comprise distinct genetic clusters with shared Middle Eastern ancestry. Am J Hum Genet 86:850–859.
17. Henn BM, et al. (2012) Genomic ancestry of North Africans supports back-to-Africa migrations. PLoS Genet 8:e1002397.
18. Wright S (1931) Evolution in Mendelian populations. Genetics 16:97–159.
19. Rodríguez-Ezpeleta N, et al. (2010) High-density SNP genotyping detects homogeneity of Spanish and French Basques, and confirms their genomic distinctiveness from other European populations. Hum Genet 128:113–117.
20. Gubbay L, Levy A (1992) The Sephardim: Their Glorious Tradition from the Babylonian Exile to the Present Day (Carnell Limited, London).
21. Rosenberg NA, et al. (2002) Genetic structure of human populations. Science 298:2381–2385.
22. Patterson N, Price AL, Reich D (2006) Population structure and eigenanalysis. PLoS Genet 2:e190.
23. Rousset F (2008) genepop’007: A complete re-implementation of the genepop software for Windows and Linux. Mol Ecol Resources 8:103–106.
24. Felsenstein J (1989) PHYLIP: Phylogeny inference package (version 3.2). Cladistics 5:164–166.
25. Falush D, Stephens M, Pritchard JK (2003) Inference of population structure using multilocus genotype data: Linked loci and correlated allele frequencies. Genetics 164:1567–1587.
26. Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959.
27. Browning SR, Browning BL (2007) Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am J Hum Genet 81:1084–1097.
28. Gusev A, et al. (2009) Whole population, genome-wide mapping of hidden relatedness. Genome Res 19:318–326.
29. Velez C, et al. (2011) The impact of Jews on the genomes of modern Latin Americans. Hum Genet.
30. Bonnen PE, et al. (2010) European admixture on the Micronesian island of Kosrae: Lessons from complete genetic information. Eur J Hum Genet 18:309–316.

13870 | www.pnas.org/cgi/doi/10.1073/pnas.1204840109 Campbell et al.