Evolution of the Rapidly Mutating Human Salivary Agglutinin Gene (DMBT1) and Population Subsistence Strategy
Shamik Polley^a, Sandra Louzada^b, Diego Forni^c, Manuela Sironi^c, Theodosius Balaskas^a, David S. Hains^d, Fengtang Yang^b, and Edward J. Hollox^a,1
^aDepartment of Genetics, University of Leicester, Leicester LE1 7RH, United Kingdom; ^bMolecular Cytogenetics Facility, Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, United Kingdom; ^cBioinformatics, Scientific Institute IRCCS E. Medea, 23842 Bosisio Parini, Italy; and ^dDivision of Pediatric Nephrology, University of Tennessee Health Science Center, Le Bonheur Children’s Hospital, Memphis, TN 38103
Edited by Huntington F. Willard, The Marine Biological Laboratory, Woods Hole, MA, and approved March 4, 2015 (received for review August 27, 2014)
The dietary change resulting from the domestication of plant and animal species and development of agriculture at different locations across the world was one of the most significant changes in human evolution. An increase in dietary carbohydrates caused an increase in dental caries following the development of agriculture, mediated by the cariogenic oral bacterium Streptococcus mutans. Salivary agglutinin [SAG, encoded by the deleted in malignant brain tumors 1 (DMBT1) gene] is an innate immune receptor glycoprotein that binds a variety of bacteria and viruses, and mediates attachment of S. mutans to hydroxyapatite on the surface of the tooth. In this study we show that multiallelic copy number variation (CNV) within DMBT1 is extensive across all populations and is predicted to result in between 7 and 20 scavenger receptor cysteine-rich (SRCR) domains within each SAG molecule. Direct observation of de novo mutation in multigeneration families suggests these CNVs have a very high mutation rate for a protein-coding locus, with a mutation rate of up to 5% per gamete. Given that the SRCR domains bind S. mutans and hydroxyapatite in the tooth, we investigated the association of sequence diversity at the SAG-binding gene of S. mutans, and DMBT1 CNV. Furthermore, we show that DMBT1 CNV is also associated with a history of agriculture across global populations, suggesting that dietary change as a result of agriculture has shaped the pattern of CNV at DMBT1, and that the DMBT1-S. mutans interaction is a promising model of host-pathogen-culture coevolution in humans.
copy number variation | agriculture | DMBT1 | mutation | structural variation
Significance
Humans have undergone a very recent evolutionary change in environment of their own making. The development of agriculture profoundly altered diet and exposure to pathogens, and yet the evolutionary response to this is still poorly understood. Here, we characterize extensive copy number variation (CNV) of the gene encoding salivary agglutinin (deleted in malignant brain tumors 1, DMBT1). Salivary agglutinin comprises 10% of salivary protein and binds bacteria, including mediating the attachment of Streptococcus mutans, the causative agent of dental caries, to teeth. We show that DMBT1 is a very fast-mutating protein-coding locus, and DMBT1 CNV correlates with a population history of agriculture. Furthermore, we examine the relationship between variation of the S. mutans region that binds salivary agglutinin and CNV of the DMBT1 gene.
Author contributions: S.P. and E.J.H. designed research; S.P., S.L., T.B., F.Y., and E.J.H. performed research; D.F., M.S., and D.S.H. contributed new reagents/analytic tools; S.P., D.F., M.S., T.B., and E.J.H. analyzed data; and S.P. and E.J.H. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission.
Freely available online through the PNAS open access option.
1To whom correspondence should be addressed. Email: [email protected].
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1416531112/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1416531112 PNAS | April 21, 2015 | vol. 112 | no. 16 | 5105–5110
The effect of the agricultural transition on human genome variation has been extensive (1). In addition to the indirect effect of an exponential increase in population size, direct effects on particular genes have occurred, most notably the evolution, at multiple locations through multiple alleles, of lactase persistence at the LCT gene, enabling adults to drink milk generated from domesticated mammals (2). The agricultural transition is also thought to have had an impact on the oral commensal microbiota, in particular Streptococcus mutans, the causative agent of dental caries, which is the most common chronic infectious disease in humans. Analysis of ancient skeletal remains (3) and modern genomic diversity (4) have suggested that S. mutans became a major oral pathogen only after the development of agriculture and the concomitant increase in availability of sugars consumed directly or derived from starchy foods. The increased level of caries in individuals from agricultural societies is observed in both modern and prehistoric populations (5–7). This increase in caries was likely to have profound consequences to the health of the individuals concerned before the development of modern dental treatment (8). Caries left untreated leads to tooth loss, potentially severe infections, and a decrease in masticatory efficiency potentially leading to a reduction in access of enzymes to the food bolus (9). It is unclear whether human genetic variation has responded to this change in dental health via natural selection.
We analyzed the variation of the deleted in malignant brain tumors 1 (DMBT1) gene encoding a major salivary glycoprotein salivary agglutinin, also known as gp-340, hensin or muclin, and hereafter referred to as DMBT1^SAG (10). This protein comprises ∼10% of total salivary protein in children and 5% in adults (11), and is also present at other mucosal surfaces (12). DMBT1^SAG is a component of innate immunity, acting as a pattern recognition receptor interacting with bacteria such as S. mutans and Helicobacter pylori and viruses such as HIV-1 and influenza (12). Variation between host saliva affects the adhesion of S. mutans (13), and protein variants of DMBT1^SAG have been suggested to affect caries susceptibility in children (14).
Copy number variation (CNV) describes a difference in DNA dosage between different individuals, and includes simple deletions and duplications as well as more complex multicopy and multiallelic variation (15). CNV can affect gene expression by altering the total number of copies of individual genes and therefore gene dosage, by changing tissue-specific enhancers or by varying the number of exons within a gene, potentially altering the number of protein-coding subunits, for example (16). CNV can also show a germ-line mutation rate at least an order of magnitude higher than single nucleotide substitutions, because of the distinct mutational processes that underlie copy number change (16). Genome wide, CNVs are enriched for genes that encode proteins that interact with the environment, particularly those in host defense (17), and a high mutation rate of these loci may contribute to immunological individuality of the host. Whether selection or a relaxation of functional constraint is responsible for this bias in genome-wide distribution remains unresolved, although there are strong arguments for the role of gene duplication in evolution (18). There is convincing evidence that CNV in humans can affect the host’s susceptibility to infectious diseases, including the well established effect of α-globin deletion on malaria susceptibility (19). Furthermore, it has been suggested that the frequency of high copy number alleles of the salivary amylase gene AMY1 has increased by natural selection in populations that eat a carbohydrate-rich diet (20).
DMBT1^SAG mostly consists of an array of scavenger receptor cysteine-rich (SRCR) domains which bind bacteria, including S. mutans (21) and promote their adherence to hydroxyapatite of the tooth (22, 23), which is critical for the cariogenic activity of the bacteria. The canonical DMBT1 gene annotated in the hg19 human genome assembly has 13 repeats each containing a SRCR domain (Fig. 1A). The repeats containing the SRCR domain, hereafter known as SRCR repeats, within the DMBT1 gene are distinct at the DNA level but share ∼80% identity at the protein level. Within the SRCR domain, smaller regions that bind to S. mutans and hydroxyapatite have been identified (Fig. 1D), although bacterial binding is inhibited by sialidases, showing glycosyl groups are also important in bacterial binding (24).
[...] (NAHR) between the 98% identical SRCR repeats carrying SRCR2 and SRCR6 is responsible for CNV1 (Fig. 1D and SI Methods). It is also clear that CNV2 is considerably more complex than the small deletion described previously, being a multiallelic CNV ranging between 1 and 11 copies per diploid genome with each repeat unit carrying a single SRCR domain.
Analysis of further samples from the CEPH-Human Genome Diversity Project (HGDP) panel of 971 individuals from 52 populations worldwide (28) showed rare individuals with a CNV2 copy number of zero. Sanger sequencing of PCR products from these individuals showed that all of the zero-copy CNV2 alleles had a breakpoint within 33 bp of sequence identical between SRCR8 and SRCR11 (Fig. S4), just upstream of the exon encoding the SRCR domain, suggesting that this allele was generated by NAHR between these repeats (SI Methods). This finding suggests that other larger CNV2 alleles have also been generated by NAHR between any of the repeats carrying SRCR domains 8–11.
DMBT1 Copy Number Variation Has a High Mutation Rate. The extensive allelic diversity and repetitive genomic structure of DMBT1, together with the knowledge that NAHR is likely to have mediated generation of new alleles, led us to consider whether CNV1 and CNV2 had a high mutation rate. To study this directly, we used our validated PRT assays to call copy number of DMBT1 CNV1 and CNV2 on 522 samples from 40 large multigenerational