UNIVERSIDADE ESTADUAL DE CAMPINAS FACULDADE DE ODONTOLOGIA DE PIRACICABA

MANUEL ALEXANDER JARA ESPEJO

ASSOCIATION BETWEEN G-QUADRUPLEX FORMING SEQUENCES IN THE FIRST INTRON OF PAX9, MSX1, INHBA AND BMP2 GENES AND MAMMALIAN DENTITION PHENOTYPE

Piracicaba 2017

Dissertation presented to the Piracicaba Dental School of the University of Campinas in partial fulfillment of the requirements for the degree of Master in Dental Biology, in the area of Histology and Embryology.

Advisor: Prof. Dr. Sérgio Roberto Peres Line.

THIS COPY CORRESPONDS TO THE FINAL VERSION OF THE DISSERTATION DEFENDED BY THE STUDENT MANUEL ALEXANDER JARA ESPEJO AND SUPERVISED BY PROF. DR. SÉRGIO ROBERTO PERES LINE.

DEDICATION

To God, for having guided and cared for me during these years and for being my support in every moment of my life.

To my family, especially my parents Martha and Manuel, for their love and constant support in all my choices.

ACKNOWLEDGMENTS

To the University of Campinas, represented by its Rector, Prof. Dr. Jorge Tadeu Jorge.

To the Piracicaba Dental School, in the person of its Director, Prof. Dr. Guilherme Elias Pessanha Henriques, for granting me the privilege of studying and developing my qualities as a researcher and as a person at this magnificent alma mater.

To my advisor, Prof. Dr. Sérgio Roberto Peres Line, for his sincere guidance, his willingness to clarify my doubts, his patience and, above all, for being a true example of a person and a scientist.

To the professors of the Department of Morphology, in the area of Histology: Prof. Dr. Marcelo Rocha Marques, for his patience, support and advice; Prof. Dr. Ana Paula de Souza Pardo, for her example of the good use of didactics; and Prof. Dr. Pedro Duarte Novaes.

To the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), for the financial support granted for the execution of this thesis.

To my friends and fellow students of the Graduate Program in Dental Biology and of other areas, for their companionship and friendship, which made every moment at FOP more pleasant.

SUMMARY

Mammalian dentition is an efficient model for understanding how changes in developmental constraints may have contributed to the emergence of phenotypic diversity during evolution. Different approaches and models have been proposed to explain this diversity, including the morphogenetic field theory and the activator/inhibitor cascade model. Almost half of human genes have guanine-rich sequences at the 5' end of intron 1. The presence of G-quadruplex forming sequences (G4FS) close to exon/intron boundaries suggests their potential to regulate gene expression during transcription and pre-mRNA processing. Our aim was to associate the stability of G4FS, measured by the minimum free energy of RNA, located at the 5' end of the first intron of genes that participate in tooth development with the relative size of the posterior dentition, obtained from skull measurements of 55 placental mammalian species. The minimum free energy of G4FS located in genes with critical roles in tooth development, such as Pax9, Msx1, Inhba and Bmp2, was associated with the relative size of posterior teeth. Taken together, the structural diversity of G4FS and its association with the relative size of posterior teeth allow us to suggest that molecular interactions arising from this structural stability could represent a mechanism regulating gene expression during odontogenesis. Ultimately, this could have contributed to the diversification of the mammalian dental phenotype.

Keywords: G-quadruplex, Dentition, Mammal, Evolution.

ABSTRACT

Mammalian dentition represents a valuable model to understand how shifts in developmental constraints account for phenotypic diversification during evolution. Different concepts and models have been proposed to explain this diversity, from the morphogenetic field theory to the activator/inhibitor cascade model. About half of human genes are G-rich at the 5' end of intron 1, and these sequences have the potential to form G-quadruplexes. The presence of G-quadruplex forming sequences (G4FS) close to exon/intron boundaries suggests their potential to regulate gene expression at the level of transcription or pre-mRNA processing. The aim of this work was to associate the stability of G4FS, measured by the minimum free energy of RNA, located at the 5' end of the first intron of genes that participate in tooth development with the relative size of posterior teeth obtained from skull measurements of 55 placental mammalian species. G4FS of Pax9, Msx1, Inhba and Bmp2, genes known to be critical for tooth development, were associated with relative tooth size. Taken together, the structural diversity of G4FS and their relationship with relative tooth length lead us to argue that molecular interactions arising from their structural stability may represent a mechanism modulating gene expression during odontogenesis. Changes in the stability of G4FS may have played an important role in the diversification of the mammalian dental phenotype.

Keywords: G-quadruplex, Dentition, Mammal, Evolution.

CONTENTS

1. INTRODUCTION

2. ARTICLE: Association between G-quadruplex forming sequences in the first intron of Pax9, Msx1, Inhba and Bmp2 genes and mammalian dentition

3. CONCLUSION

REFERENCES

ANNEXES
Annex 1 - Proof of article submission


1 INTRODUCTION

Mammalian dentition diversified during evolution in response to adaptations to dietary, environmental and social demands. In this process, the organization and functioning of the genetic pathways involved in tooth development had to be modified. The basic dental formula of placental mammals includes groups of three incisors, one canine, four premolars and three molars in each hemi-arch, totaling 44 teeth, and species-specific modifications arose from it. In most cases, changes in the dental phenotype occurred through the loss of some teeth within each group or the loss of entire groups. The fossil record shows that tooth loss is preceded by a decrease in size and a simplification of morphology, associated with a significant functional reduction (Ziegler, 1971).

Different approaches have been used to understand how phenotypic diversity in mammalian dentition was attained. According to Butler (1939), teeth that exhibit similarities are organized in relatively independent fields, called morphogenetic fields, in which a gradient of variability in size and shape can be observed; thus, each field would contain a key tooth, or "pole tooth", which would show the lowest rate of variation. Each tooth, according to its position within its field, would be under different genetic control during development. The activation/inhibition model, on the other hand, proposes a molecular cascade along the field: dynamic interactions between ectomesenchymal activators and inhibitors derived from the developing molar germ define the size of the immediately posterior molar. However, not all mammals fit this model in the determination of molar size (Kavanagh et al., 2007). Final tooth size is not only the result of genetic pathways acting within independent morphological fields; nearby fields and fields of higher hierarchical order also influence this process. The canine, premolar and molar fields can be considered part of a higher-order morphogenetic field, the secondary palate, which can also influence the definition of the dental phenotype (Renvoisé et al., 2009; Ribeiro et al., 2013).

Although the genetic pathways that direct tooth development are largely conserved among mammals (Thesleff & Sharpe, 1997; Jernvall & Thesleff, 2000; Bei, 2009), changes in the mechanisms controlling the spatiotemporal expression of genes involved in odontogenesis may explain phenotypic variations of the dentition. Tooth development is controlled by sequential and reciprocal epithelial-mesenchymal interactions involving genetic pathways made up of ligands, receptors and transcription factors. The Msx1 and Pax9 genes encode transcription factors that are expressed in the dental ectomesenchyme, and both play very important roles in correct tooth formation. In mice mutant for Msx1 and Pax9, tooth development does not progress beyond the bud stage (Satokata & Maas, 1994; Peters et al., 1998). These genes interact at both the transcriptional and the translational level to regulate expression of the Bmp4 gene, which determines the correct transition from the bud to the cap stage (Ogawa et al., 2006). In addition, reduced dosage of Pax9 and Msx1 has been associated with an increased risk of oligodontia (Nakatomi et al., 2010).

Guanine-rich sequences in DNA or RNA can form G-quadruplexes (G4), secondary structures formed by the stacking of two or more guanine quartets. These quartets result from the association of four guanines in a cyclic arrangement stabilized by Hoogsteen hydrogen bonds (Huppert & Balasubramanian, 2005). Previous studies have shown that 48% of human genes are guanine-rich at the 5' end of the first intron (Eddy & Maizels, 2008). G4 located in introns can regulate gene expression at the transcriptional and post-transcriptional levels. In the Fxyd1 gene, G4 cause decreased splicing efficiency or alternative splicing (Dhayan et al., 2014). G4 can also act as repressor elements of transcription, as shown in studies of the Top1 gene (Reinhold et al., 2010).

The Msx1 and Pax9 genes have G4 structures at the beginning of the first intron. G4 located at the 5' end of the first intron of the Pax9 gene can modulate pre-mRNA splicing (Ribeiro et al., 2015). The deletion of 11 nucleotides from the guanine-rich sequence at the beginning of the first intron of the Msx1 gene was associated with oligodontia in two unrelated individuals (Pawlowska et al., 2009). In addition, it has been postulated that splicing deficiencies may modify the levels of protein available within the dental fields during development, affecting genetic interactions that would ultimately produce failures in tooth formation (Tatematsu et al., 2015; Xue et al., 2016). Despite their functional importance, we observed that, among mammals, the G4 forming sequences of the Msx1 and Pax9 genes showed variability in sequence and in minimum free energy (Mfe) values. Since G4 structures in the first intron of the Pax9 and Msx1 genes could play an important role in tooth development, it is plausible to infer that variations in these sequences may be related to variations in the dental phenotype of placental mammals.

Our aim was to evaluate whether there is an association between G-quadruplex forming sequences in the first intron of Pax9, Msx1 and eight other genes associated with tooth development and variations in the mammalian dental phenotype.


2 ARTICLE

ASSOCIATION BETWEEN G-QUADRUPLEX FORMING SEQUENCES IN THE FIRST INTRON OF PAX9, MSX1, INHBA AND BMP2 GENES AND MAMMALIAN DENTITION.

Manuel Alexander Jara Espejo 1

Sergio Roberto Peres Line 1

1 Department of Morphology, Piracicaba Dental School, University of Campinas (UNICAMP), Piracicaba, SP, Brazil

Correspondence to: Prof. Sérgio Roberto Peres Line Departamento de Morfologia, Faculdade de Odontologia de Piracicaba, UNICAMP. Av. Limeira, 901, Piracicaba, SP, Brazil. CEP 13414-903 E-mail: [email protected]

SUMMARY

Mammalian dentition represents a valuable model to understand how shifts in developmental constraints account for phenotypic diversification during evolution. Different concepts and models have been proposed to explain this diversity, from the morphogenetic field theory to the activator/inhibitor cascade model. About half of human genes are G-rich at the 5' end of intron 1, and these sequences have the potential to form G-quadruplexes. The presence of G-quadruplex forming sequences (G4FS) close to exon/intron boundaries suggests their potential to regulate gene expression at the level of transcription or pre-mRNA processing. The aim of this work was to associate the stability of G4FS, measured by the minimum free energy of RNA, located at the 5' end of introns of genes that participate in tooth development with the relative size of posterior teeth obtained from skull measurements of 55 placental mammalian species. G4FS of Pax9, Msx1, Inhba and Bmp2, genes known to be critical for tooth development, were associated with relative tooth size. Taken together, the structural diversity of G4FS and their relationship with relative tooth length lead us to argue that molecular interactions arising from their structural stability may represent a mechanism modulating gene expression during odontogenesis. Changes in the stability of G4FS may have played an important role in the diversification of the mammalian dental phenotype.

Keywords: G-quadruplex, Mammalian dentition, Evolution.

INTRODUCTION

Mammalian dentition acquired a range of diversity during evolution that reflects adaptations to dietary, environmental and social requirements. To produce this diversity, the organization and function of developmental gene networks needed to be modified. According to Ziegler (1971), the basic dental formula in placental mammals includes groups of three incisors, one canine, four premolars and three molars; from there, species-specific dental modifications arose. In most cases, this occurred by the loss of a few teeth within each group or the loss of entire groups. Fossil records indicate that dental loss is preceded by decreasing size and simplification of shape, linked to a reduction in functional significance. Tooth loss proceeded anteroposteriorly in the premolars and posteroanteriorly in the molar row (1).

Different approaches have been used to understand how diverse mammalian dental phenotypes were attained. According to Butler, teeth displaying similarities are organized in relatively independent fields that exhibit a gradient of size and shape variability. Thus each tooth, according to its position within the field, is under differential developmental control (2). The activation/inhibition model proposes an inhibitory cascade in which dynamic interactions between mesenchymal activators and molar-derived inhibitors define the size of the immediately posterior molar; in this way, molar size can increase, decrease or remain the same along the molar row. Nonetheless, not all mammals fit this model (3). Final tooth size does not result only from gene networks acting within independent fields; nearby and higher-order fields are also influential. The canine, premolar and molar fields may be considered part of a higher-order morphogenetic field, the secondary palate, which could also influence tooth phenotype (4,5).

Although the genetic dynamics driving tooth development are conserved among mammals (6–8), shifts in developmental constraints should account for phenotypic diversity. Tooth development is controlled by sequential and reciprocal epithelial-mesenchymal interactions involving genetic pathways that comprise ligands, receptors and transcription factors. The Msx1 and Pax9 genes encode transcription factors that are expressed in dental ectomesenchyme, and both were shown to be critical for tooth morphogenesis. Msx1 and Pax9 homozygous null mutant mice exhibited growth arrest of tooth organs at the bud stage (9,10). These genes interact at the transcriptional and protein levels to regulate Bmp4 expression, which determines the transition from the bud to the cap stage (11). Moreover, reduction of Pax9 and Msx1 dosage has been associated with an increased risk of oligodontia (12).

Guanine-rich DNA or RNA sequences are able to fold into G-quadruplex structures (G4), which are secondary structures formed by the stacking of two or more G-quartets. These G-quartets arise from the association of four guanines into a cyclic arrangement stabilized by Hoogsteen hydrogen bonding (13). Approximately 48% of human genes have G4 forming sequences (G4FS), which are preferentially located on the nontemplate strand, downstream of the transcription start site (TSS), particularly at the 5' end of the first intron (14). It has been demonstrated that intronic G4 can regulate gene expression at the transcriptional and post-transcriptional levels: G4 can alter splicing of Fxyd1, causing reduced splicing activity or alternative splicing (15), whereas Top1 gene transcription was repressed by an intronic G4 (16).

The Pax9 and Msx1 genes have G4FS at the beginning of the first intron. G4FS present at the 5' end of intron 1 of the PAX9 gene can modulate splicing of the pre-mRNA (17). An 11 bp deletion within a G4FS at the beginning of intron 1 of the MSX1 gene was associated with oligodontia in two unrelated individuals (18). Other intronic mutations in Msx1 indicate that deficient splicing may modify the levels of available protein within each developmental tooth field, affecting genetic interactions and ultimately resulting in tooth development failure (19,20). Despite their functional importance, we noted considerable variability among mammalian species in the G4FS located at the 5' end of the first intron of the Pax9 and Msx1 genes. Since these G4FS affect the expression of Pax9 and Msx1 and interfere with tooth development, it is plausible to infer that variations in their sequence and structure may be related to variations of dental phenotype in placental mammals.

In this paper we have analyzed the association between G4FS located within the first intron of Pax9, Msx1 and eight other tooth-related genes and variations of mammalian dentition.


MATERIALS AND METHODS

In this study, 55 placental mammalian species, representing 13 orders, were included; the sample size was limited by the availability of genomic sequences, since some G4 regions were not available for all species (Supplementary Table 1). Three species (Loxodonta africana, Odobenus rosmarus and Dasypus novemcinctus) with residual teeth or anomalous dentitions were not used in the analysis.

Sequence data

Transcripts of PAX9, MSX1 and eight other tooth-related human genes (http://bite-it.helsinki.fi/) having G4FS at the 5' end of intron 1 were downloaded from the Ensembl 85 database (http://www.ensembl.org/index.html); the genes analyzed and the number of species with available sequences for each gene are shown in Supplementary Table 2. Sequences ranging from 100 bp upstream to 200 bp downstream of the exon 1-intron 1 boundary were scanned for the presence of G4FS using the RNAfold -g tool from the ViennaRNA Package (http://www.tbi.univie.ac.at/RNA/). This software not only identifies G4FS but also calculates their minimum free energy (Mfe). Since many species had more than one G4FS, our analyses were performed on the sum of the Mfe of all G4FS present in the guanine-rich region (G4Mfe). Human G4FS were used in the Multiz Alignments of 100 Vertebrates Genome browser tool, from the Genome Browser Gateway (http://www.ucsc.edu/), to obtain the corresponding placental mammalian alignments. Some primate sequences were obtained from GenBank (21). Only genes with a minimum of 20 available sequences belonging to at least 5 orders were selected. In order to assess whether the correlations were specific to G4Mfe, we also obtained the Mfe of the 50 bp sequences immediately upstream of the human G4FS.
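The scanning and summing steps can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in this study: the regular expression is a simplified heuristic for putative G4FS (four runs of at least two guanines separated by loops of 1 to 7 nucleotides) and does not reproduce the RNAfold -g algorithm; the sequence and the per-G4FS energies are made-up values, and the function names are hypothetical.

```python
import re

# Heuristic motif (assumption): four G-runs of >= 2 nt separated by 1-7 nt loops.
# This is a stand-in for illustration only, NOT the RNAfold -g algorithm.
G4_PATTERN = re.compile(r"G{2,}(?:[ACGU]{1,7}G{2,}){3}")


def find_g4fs(sequence):
    """Return (start, end) spans of non-overlapping putative G4FS in a sequence."""
    rna = sequence.upper().replace("T", "U")
    return [match.span() for match in G4_PATTERN.finditer(rna)]


def g4mfe(mfe_values):
    """G4Mfe: sum of the Mfe (kcal/mol) of all G4FS found in one G-rich region."""
    return sum(mfe_values)


region = "AUAUGGGAGGGAGGGAGGGCUCU"   # made-up sequence, not from any species
spans = find_g4fs(region)
total = g4mfe([-12.5, -8.0])          # hypothetical per-G4FS energies
print(spans, total)
```

In practice the Mfe of each G4FS would come from the folding software itself; the sketch only shows how several G4FS in one region collapse into a single G4Mfe value per species and gene.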

Phylogeny analysis

Mitochondrial RNA sequences of all placental mammals analyzed and of two other species (Didelphis virginiana and Alligator mississippiensis) were obtained from the Ensembl database and aligned using the Alignment/Align by ClustalW tool in Mega software (http://www.megasoftware.net). Since mitochondrial RNA from Myotis lucifugus and Eptesicus fuscus was not available, we used sequences from Myotis davidii and Eptesicus serotinus, respectively. The multiple sequence alignment was used to estimate evolutionary distances between species with the Distance|Compute Pairwise Distance tool of Mega.
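As a rough illustration of what a pairwise distance computed from a multiple alignment looks like, the sketch below calculates the p-distance (proportion of differing sites) between aligned sequences. The substitution model used in Mega is not stated here, so the choice of p-distance is an assumption; the species labels and sequences are hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ("-") in either sequence are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def pairwise_distances(alignment):
    """Return {(name_a, name_b): distance} for every unordered pair of sequences."""
    names = sorted(alignment)
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical toy alignment (not real mitochondrial sequences).
toy_alignment = {
    "sp_A": "ACGTACGT",
    "sp_B": "ACGTACGA",
    "sp_C": "AC-TTCGA",
}
distances = pairwise_distances(toy_alignment)
print(distances)
```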

Measurements of upper jaw and teeth

One hundred and thirty-three (133) mammalian skull photographs (ventral view) were obtained from diverse sources (Supplementary Table 3). The ImageJ software (https://imagej.nih.gov/ij/) was used to measure the upper jaw and teeth. The mesiodistal length of each posterior tooth (canine, premolars and molars) was obtained by drawing a straight line between the distal and mesial contact points on the occlusal surface. The lengths between the distal contact point of PM4 and the distal and mesial surfaces of the canine were defined as PM_space and PMC_space, respectively; in animals with missing canines, the most anterior limit was the premaxillary-maxillary suture. Definitive dimensions were obtained by averaging the left and right sides of each specimen analyzed. Molar relative lengths were defined as M1/sumPost, M2/sumPost and M3/sumPost, where M1, M2 and M3 represent the mesiodistal length of each tooth and sumPost denotes the sum of the mesiodistal lengths of the posterior teeth. Premolar relative lengths were defined as PM2/PM_space, PM3/PM_space and PM4/PM_space, dividing each premolar mesiodistal length by PM_space. Canine relative length was obtained by dividing the canine mesiodistal length by PMC_space. These relative lengths were used as phenotypic variables (Figure 1).
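The definitions of the phenotypic variables can be summarized in a short Python sketch. The function names, the dictionary layout and all measurement values are hypothetical and serve only to show the arithmetic: left and right sides are averaged first, molars are divided by sumPost, premolars by PM_space and the canine by PMC_space.

```python
from statistics import mean


def side_average(left, right):
    """Average left- and right-side measurements for each tooth of one specimen."""
    return {tooth: mean((left[tooth], right[tooth])) for tooth in left}


def relative_lengths(teeth, pm_space, pmc_space):
    """Relative lengths from mesiodistal lengths of the posterior teeth present.

    teeth     -- {"C": ..., "PM2": ..., ..., "M3": ...}, same unit for all values
    pm_space  -- distal of PM4 to distal surface of the canine
    pmc_space -- distal of PM4 to mesial surface of the canine
    """
    sum_post = sum(teeth.values())  # sumPost: all posterior teeth supplied
    relative = {}
    for tooth, length in teeth.items():
        if tooth.startswith("PM"):
            relative[tooth] = length / pm_space
        elif tooth.startswith("M"):
            relative[tooth] = length / sum_post
        elif tooth == "C":
            relative[tooth] = length / pmc_space
    return relative


# Made-up measurements in millimetres, for illustration only.
left = {"C": 8.0, "PM2": 4.0, "PM3": 5.0, "PM4": 6.0, "M1": 10.0, "M2": 9.0, "M3": 7.0}
right = {"C": 8.0, "PM2": 4.0, "PM3": 5.0, "PM4": 6.0, "M1": 12.0, "M2": 9.0, "M3": 7.0}
averaged = side_average(left, right)
rel = relative_lengths(averaged, pm_space=20.0, pmc_space=32.0)
print(rel)
```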

Figure 1 Scheme showing how measurements were performed. Adapted from Ribeiro et al., 2013 (5).


Estimation of brain/body ratio

Brain and body mass were obtained for all selected species (Supplementary Table 4). Then, the brain/body variable (BB) was defined as the ratio between both values.

Statistical analysis

All statistical analyses were performed using R software (https://www.r-project.org/). Spearman correlation tests between tooth relative lengths and G4Mfe were performed in order to associate these characteristics. Tests were performed individually for each gene and also with the sum of G4Mfe from two or more genes. Correlation tests were also performed between tooth relative lengths and evolutionary distances and brain/body ratio. The Wilcoxon signed-rank test was used to evaluate statistical differences in tooth relative lengths between groups. In this case, species were split into two groups with the same number of animals according to their G4Mfe values. We considered that a meaningful correlation between phenotype and G4Mfe occurred when the Spearman rank p-value was <= 0.005. This threshold was chosen because it represents the Bonferroni correction for 10 genes (p = 0.05/10) and is also a more reliable level of significance (22). Since the Wilcoxon signed-rank test was performed on arbitrarily divided groups (groups of the same size based on G4Mfe), we considered associations significant when the p-value was <= 0.01.
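The analyses themselves were run in R; the Python sketch below only illustrates three ingredients of the procedure: the Spearman coefficient (computed as the Pearson correlation of average ranks), the Bonferroni threshold for 10 genes, and the split of species into two equal-size groups by G4Mfe. The function names and data are hypothetical, p-values are not computed, and dropping the middle species when the sample size is odd is an assumption of this sketch.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # mean of 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank coefficient: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


def split_by_g4mfe(g4mfe, phenotype):
    """Phenotype values of the lower and upper halves of species ordered by G4Mfe.

    Assumption: with an odd number of species the middle one is left out.
    """
    order = sorted(range(len(g4mfe)), key=lambda i: g4mfe[i])
    half = len(order) // 2
    low = [phenotype[i] for i in order[:half]]
    high = [phenotype[i] for i in order[len(order) - half:]]
    return low, high


ALPHA = 0.05 / 10  # Bonferroni-corrected threshold for 10 genes

rho = spearman_rho([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])  # toy data
low, high = split_by_g4mfe([-30, -10, -20, -40], [0.1, 0.4, 0.3, 0.2])
print(rho, ALPHA, low, high)
```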

RESULTS

Relationship between mammalian molar relative length and G4FS in Msx1 and Pax9 genes

The relative length of M1 was positively associated with the G4Mfe of the first intron of Pax9 and Msx1 (Table 1A); animals with more stable G4 tended to have a smaller M1. M2 relative length was associated only with the G4Mfe of the first intron of Msx1 (Table 2A). For both Msx1 and Pax9, the Spearman rank coefficient (rho) decreased progressively from M1 to M3, while the p-values of this test and of the Wilcoxon signed-rank test increased progressively. Interestingly, summing the Mfe of Pax9 and Msx1 significantly increased the correlation with M1, even though the number of species was smaller than in the single-gene analyses (Table 1B, Fig. 4). This decrease in species number is due to the lack of sequences for some species in different genes: species having available sequences for Msx1 but not for Pax9, or vice versa, were not included in the combinatorial analysis of these two genes. The addition of G4Mfe of other genes decreased or had little effect on the significance of the statistical tests (Tables 1B, 2B and 3B). Since brain size has been reported to have an inverse relation with tooth size, the G4Mfe of Pax9, of Msx1 and of their sum were each divided by the cube root of the brain/body ratio, e.g. $(\mathrm{Pax9} + \mathrm{Msx1}) / BB^{1/3}$. Although the brain/body ratio alone had a small, non-significant association with molar relative lengths, its inclusion in the analysis significantly increased the correlations with M1 (Figures 2, 3, 4 and 5). No significant changes in the correlations were observed when the G4Mfe of Pax9, Msx1 or the sum of both was divided by evolutionary distances. Replacing brain/body values with evolutionary distances decreased the correlation values, showing that the brain/body effect was not due to phylogenetic bias.
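The brain/body-adjusted model can be written out as a small Python sketch. The function name, the handling of missing sequences by returning None, and all numbers are hypothetical; the sketch assumes brain and body mass are given in the same unit so that BB is a dimensionless ratio.

```python
def model_value(g4mfe_by_gene, genes, brain_mass, body_mass):
    """Sum of G4Mfe over the chosen genes divided by BB**(1/3).

    BB is brain mass / body mass (same unit for both). Returns None when a
    sequence is missing for any requested gene, mirroring the exclusion of
    such species from the combinatorial analysis.
    """
    if any(gene not in g4mfe_by_gene for gene in genes):
        return None
    bb = brain_mass / body_mass
    return sum(g4mfe_by_gene[gene] for gene in genes) / bb ** (1 / 3)


# Made-up values: G4Mfe in kcal/mol, masses in grams.
species = {"Pax9": -20.0, "Msx1": -12.0}
value = model_value(species, ["Pax9", "Msx1"], brain_mass=8.0, body_mass=1000.0)
missing = model_value({"Msx1": -12.0}, ["Pax9", "Msx1"], brain_mass=8.0, body_mass=1000.0)
print(value, missing)
```

With these toy numbers BB = 0.008, its cube root is 0.2, and the model value is approximately -160.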

TABLE 1. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between M1 and tooth-related genes G4Mfe.

A (individual genes)
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    0.392    0.434    -0.003   0.303
Spearman p      0.003    0.002    0.984    0.040
Wilcoxon p      0.005    0.007    0.521    0.142

B (combinations of genes)
Pax9            +        +        +        +
Msx1            +        +        +        +
Inhba           –        +        –        +
Bmp2            –        –        +        +
Spearman rho    0.5304   0.4147   0.5163   0.4301
Spearman p      0.0002   0.0119   0.0015   0.0094
Wilcoxon p      0.0050   0.0907   0.0224   0.0971


Figure 2: (A) Plot of Model (Msx1 G4Mfe / $BB^{1/3}$) versus M1. Correlation coefficient and corresponding p-value are shown. (B) Boxplot of groups with low and high Model values, corresponding to animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

Figure 3: (A) Plot of Model (Pax9 G4Mfe / $BB^{1/3}$) versus M1. Correlation coefficient and corresponding p-value are shown. (B) Boxplot of groups with low and high Model values, corresponding to animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

Figure 4: (A) Plot of Model ((Pax9 + Msx1) G4Mfe) versus M1. Correlation coefficient and corresponding p-value are shown. (B) Boxplot of groups with low and high G4Mfe, corresponding to animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

Figure 5: (A) Plot of Model ((Pax9 + Msx1) G4Mfe / $BB^{1/3}$) versus M1. Correlation coefficient and corresponding p-value are shown. (B) Boxplot of groups with low and high Model values, corresponding to animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

TABLE 2. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between M2 and tooth-related genes G4Mfe

A (individual genes)
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    0.145    0.475    -0.101   0.220
Spearman p      0.290    0.001    0.519    0.143
Wilcoxon p      0.511    0.001    0.603    0.489

B (combinations of genes)
Pax9            +        +        +        +
Msx1            +        +        +        +
Inhba           –        +        –        +
Bmp2            –        –        +        +
Spearman rho    0.317    0.201    0.276    0.223
Spearman p      0.034    0.241    0.104    0.190
Wilcoxon p      0.427    0.888    0.696    0.673

Figure 6: (A) Plot of Model ((Pax9 + Msx1) G4Mfe) versus M2. Correlation coefficient and corresponding p-value are shown. (B) Boxplot of groups with low and high G4Mfe, corresponding to animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

TABLE 3. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between M3 and tooth-related genes G4Mfe

A (individual genes)
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    0.091    0.317    -0.070   0.161
Spearman p      0.508    0.030    0.655    0.285
Wilcoxon p      0.729    0.022    0.462    0.558

B (combinations of genes)
Pax9            +        +        +        +
Msx1            +        +        +        +
Inhba           –        +        –        +
Bmp2            –        –        +        +
Spearman rho    0.232    0.123    0.221    0.166
Spearman p      0.174    0.474    0.196    0.333
Wilcoxon p      0.495    0.557    0.837    0.400

Figure 7: (A) Plot of Model ((Pax9 + Msx1) G4Mfe) versus M3. Correlation coefficient and corresponding p-value are shown. (B) Boxplot of groups with low and high G4Mfe, corresponding to animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

No statistically significant coefficients were obtained when evolutionary distances were correlated with molar relative lengths (Supplementary Figures 1 and 2). No significant correlations were found when the sum of Pax9 and Msx1 G4Mfe was tested against the evolutionary distances for Didelphis virginiana (rho = -0.15, p-value = 0.33) or Alligator mississippiensis (rho = 0.01, p-value = 0.93).

Since our sample (n = 55) included a high number of primates (n = 19), we tested whether that could be driving our results by removing the primate species from the analysis. M1 correlation estimates increased after removing primates. The correlations with the G4Mfe of Pax9, Msx1 and the sum of Pax9 and Msx1 were rho = 0.51, p-value = 0.005; rho = 0.60, p-value = 0.0007; and rho = 0.61, p-value = 0.001, respectively. Since rodents had the highest M1 relative lengths and several specimens in this order had no premolar teeth, we wondered whether the correlation was biased by rodents with absent premolars. We therefore repeated the analysis after removing species with no premolar teeth. In this case Pax9 (p = 0.019) and Msx1 (p = 0.15) were no longer significantly correlated with M1. The sum of minimum free energies, however, was still significant (rho = 0.44, p = 0.005), as was the quotient of the Mfe sum and the brain/body mass (rho = 0.58, p = 0.0001).


In order to confirm whether the relationships were imparted specifically by G4FS, the folding energies of the 50 bp sequences flanking the G4FS upstream were correlated with molar relative lengths. These sequences were not G4FS. No significant correlation estimates were obtained with Pax9 or Msx1 (Supplementary Table 12). Non-significant correlations for M1 (rho = 0.10, p-value = 0.54), M2 (rho = 0.95, p-value = 0.01) and M3 (rho = 0.21, p-value = 0.217) were also found when the Mfe of the 50 bp sequences upstream of the Pax9 and Msx1 G4FS were summed.

Pax9, Msx1, Inhba and Bmp2 G4FS are associated with mammalian premolar and canine relative lengths.

Unlike the molars, for which associations were found by dividing the mesiodistal length by the sum of the mesiodistal lengths of all posterior teeth, premolar associations were found by dividing the mesiodistal length of each tooth by the distance from the distal surface of the canine to the distal of PM4 (PM_space), while canine associations were obtained by dividing its mesiodistal length by the distance from the distal of PM4 to the mesial surface of the canine (PMC_space). Significant correlations were found for PM4, PM3 and canine relative lengths, which were negatively associated with G4Mfe. Therefore, animals with more stable G4 structures tended to have PM4, PM3 and canine teeth with larger mesiodistal lengths. The strongest significant correlations for PM4, PM3 and the canine were found with the G4Mfe of the Inhba and Bmp2 genes; Pax9 G4Mfe was also significantly correlated with PM4 (Tables 4A, 5A and 7A). Significant correlations were also found between the Mfe of the (non-quadruplex) sequences upstream of the Bmp2 G4FS and PM4 and PM3 (Supplementary Table 12).

There was a significant correlation between the Didelphis virginiana evolutionary distances and PM3 (Supplementary Figure 1). The canine was significantly correlated with both the Didelphis virginiana and the Alligator mississippiensis evolutionary distances (Supplementary Figures 1 and 2). These data indicate that part of the significant correlations of G4Mfe with premolars and canines may be due to phylogenetic bias.

Combinatorial analysis showed that the best correlation estimates were obtained when the G4Mfe of Pax9, Msx1, Bmp2 and Inhba were summed. The correlation estimates from the combinatorial analysis for PM4, PM3 and Canine were -0.68 (p-value = 4.38e-06), -0.72 (p-value = 7.57e-07) and -0.74 (p-value = 2.936e-07), respectively.
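A combinatorial analysis of this kind can be sketched as follows. This is only an illustration under stated assumptions: the G4Mfe values and relative lengths are invented, every non-empty subset of the four genes is enumerated (the tables report only selected combinations), and Spearman's rho is computed with a small pure-Python helper rather than with the software actually used in this work.

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical G4Mfe values (kcal/mol) per gene, one entry per species.
g4mfe = {
    "Pax9": [-20.0, -35.0, -50.0, -42.0, -28.0],
    "Msx1": [-15.0, -22.0, -30.0, -33.0, -18.0],
    "Inhba": [-10.0, -25.0, -40.0, -31.0, -12.0],
    "Bmp2": [-12.0, -20.0, -38.0, -36.0, -16.0],
}
pm4_relative = [0.30, 0.36, 0.45, 0.41, 0.33]  # hypothetical relative lengths

genes = list(g4mfe)
results = {}
for size in range(1, len(genes) + 1):
    for combo in combinations(genes, size):
        # Sum the G4Mfe of the selected genes species by species.
        summed = [sum(g4mfe[g][i] for g in combo) for i in range(len(pm4_relative))]
        results[combo] = spearman_rho(summed, pm4_relative)

# Most negative rho: the strongest negative association in this toy data.
best = min(results, key=results.get)
```

Summing the energies before ranking treats the genes as additive contributors to a single model value per species, which is the reading of "added" adopted here.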


No significant correlations were found with PM2 (Tables 4B-7B; Figures 8-11). There was, however, a significant correlation between the sum of the G4Mfe of Pax9, Msx1, Bmp2 and Inhba and the Didelphis virginiana evolutionary distances (rho = -0.56, p = 0.00036). Although the correlation with the sum of the Pax9, Msx1, Bmp2 and Inhba G4Mfe was more significant than the correlation with evolutionary distances, it is possible that part of the significant correlations between G4Mfe and premolars and canines is due to phylogenetic bias.

TABLE 4. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between PM4 and tooth-related genes G4Mfe

(A) Individual genes
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    -0.38    -0.24    -0.50    -0.44
Spearman p      0.005    0.111    0.001    0.002
Wilcoxon p      0.002    0.136    0.032    0.006

(B) Gene combinations
Pax9            +        +          +         +          +
Msx1            +        +          +         –          +
Inhba           –        +          –         +          +
Bmp2            –        –          +         +          +
Spearman rho    -0.45    -0.65      -0.58     -0.67      -0.68
Spearman p      0.002    1.6e-05    0.0002    8.3e-06    4.4e-06
Wilcoxon p      0.001    0.0006     0.0026    0.002      6.9e-04

Figure 8: (A) Plot of Model (sum of Pax9, Msx1, Inhba and Bmp2 G4Mfe) versus PM4. The correlation coefficient and corresponding p-value are shown. (B) Boxplot of the groups with low and high G4Mfe, which comprise the animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.


TABLE 5. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between PM3 and tooth-related genes G4Mfe

(A) Individual genes
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    -0.36    -0.32    -0.44    -0.48
Spearman p      0.006    0.03     0.003    0.001
Wilcoxon p      0.004    0.02     0.05     0.01

(B) Gene combinations
Pax9            +        +          +          +          +
Msx1            +        +          +          –          +
Inhba           –        +          –          +          +
Bmp2            –        –          +          +          +
Spearman rho    -0.44    -0.69      -0.66      -0.66      -0.72
Spearman p      0.002    2.7e-06    1.2e-05    9.9e-06    7.6e-07
Wilcoxon p      0.002    0.001      0.003      0.013      0.001

Figure 9: (A) Plot of Model (sum of Pax9, Msx1, Inhba and Bmp2 G4Mfe) versus PM3. The correlation coefficient and corresponding p-value are shown. (B) Boxplot of the groups with low and high G4Mfe, which comprise the animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

TABLE 6. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between PM2 and tooth-related genes G4Mfe

(A) Individual genes
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    0.069    -0.43    0.448    0.076
Spearman p      0.615    0.003    0.003    0.617
Wilcoxon p      0.184    0.004    0.067    0.564

(B) Gene combinations
Pax9            +         +        +         +        +
Msx1            +         +        +         –        +
Inhba           –         +        –         +        +
Bmp2            –         –        +         +        +
Spearman rho    -0.178    0.222    -0.119    0.136    0.047
Spearman p      0.300     0.193    0.489     0.429    0.785
Wilcoxon p      0.624     0.287    0.554     0.467    0.673


Figure 10: (A) Plot of Model (sum of Pax9, Msx1, Inhba and Bmp2 G4Mfe) versus PM2. The correlation coefficient and corresponding p-value are shown. (B) Boxplot of the groups with low and high G4Mfe, which comprise the animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.

TABLE 7. Results of individual (A) and combinatorial (B) Spearman's correlation and Wilcoxon tests between Canine and tooth-related genes G4Mfe

(A) Individual genes
Pax9            +        –        –        –
Msx1            –        +        –        –
Inhba           –        –        +        –
Bmp2            –        –        –        +
Spearman rho    -0.35    -0.27    -0.49    -0.46
Spearman p      0.009    0.07     0.001    0.001
Wilcoxon p      0.002    0.03     0.04     0.01

(B) Gene combinations
Pax9            +        +          +          +          +
Msx1            +        +          +          –          +
Inhba           –        +          –          +          +
Bmp2            –        –          +          +          +
Spearman rho    -0.44    -0.73      -0.67      -0.70      -0.74
Spearman p      0.002    3.1e-07    7.3e-06    1.6e-06    2.9e-07
Wilcoxon p      0.001    8.5e-05    0.0001     0.0003     0.00003


Figure 11: (A) Plot of Model (sum of Pax9, Msx1, Inhba and Bmp2 G4Mfe) versus Canine. The correlation coefficient and corresponding p-value are shown. (B) Boxplot of the groups with low and high G4Mfe, which comprise the animals to the right and left of the dashed line in (A), respectively. P-values of the Wilcoxon test are shown.
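The group comparison summarized in panel (B) of Figures 8-11 can be sketched as below. Several points are assumptions made only for illustration: the Wilcoxon test is taken to be the two-sample rank-sum (Mann-Whitney) form, the threshold that defines the dashed line is arbitrary, the p-value uses a normal approximation without continuity or tie correction, and all data are invented.

```python
import math


def split_by_threshold(model_values, tooth_values, threshold):
    """Split tooth values into low and high groups by the model value (summed G4Mfe)."""
    low = [t for m, t in zip(model_values, tooth_values) if m < threshold]
    high = [t for m, t in zip(model_values, tooth_values) if m >= threshold]
    return low, high


def rank_sum_test(group_a, group_b):
    """Two-sided Wilcoxon rank-sum test via the normal approximation.

    Assumes no tied values. Returns (U statistic of group_a, p-value).
    """
    pooled = sorted([(v, "a") for v in group_a] + [(v, "b") for v in group_b])
    rank_sum_a = sum(
        rank for rank, (_, label) in enumerate(pooled, start=1) if label == "a"
    )
    n_a, n_b = len(group_a), len(group_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    mean_u = n_a * n_b / 2
    sd_u = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u_a - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u_a, p_value


# Hypothetical summed G4Mfe (kcal/mol) and relative tooth lengths.
model = [-158.0, -142.0, -102.0, -74.0, -57.0, -60.0]
tooth = [0.45, 0.41, 0.36, 0.33, 0.30, 0.31]

low_group, high_group = split_by_threshold(model, tooth, threshold=-100.0)
u_stat, p_value = rank_sum_test(low_group, high_group)
```

With samples this small an exact test would normally be preferred; the normal approximation is used here only to keep the sketch short.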

When the sum of the G4Mfe of the four genes included in the model used for premolars and canine was analyzed against evolutionary distances, we obtained correlation coefficients of -0.56 (p-value = 0.0004) and -0.17 (p-value = 0.304) for Didelphis virginiana and Alligator mississippiensis, respectively.

No significant correlations were found between the sum of Pax9, Msx1, Bmp2 and Inhba Mfe of sequences upstream of G4FS with premolar and canine relative lengths (Supplementary Tables 13-16).

Results of the individual correlation tests between the other analyzed tooth-related genes and relative lengths can be seen in Supplementary Tables 5-11. Some of these genes showed weak to moderate correlations with tooth relative lengths; however, when these genes were included in combinatorial analyses (data not shown), correlation levels did not increase.


DISCUSSION

The pattern of dental loss in families with mutations in Msx1 and Pax9 has highlighted the effect of these genes on tooth development, particularly on the growth arrest of M3 and PM4. Besides tooth agenesis, Pax9 mutations cause a significant reduction of all tooth dimensions (23). Fossil records show that dental loss is preceded by a decrease in size and a simplification of shape, linked to reduced functional significance (1). The frequent loss of M3 and PM4 in human families carrying these mutations indicates that these teeth have a lower developmental threshold, where the growth of the tooth germ is more sensitive to changes in gene expression (24). The fact that haploinsufficiency of Pax9 and Msx1 has low pleiotropy, as the dentition is frequently the only phenotype affected, makes these genes potential determinants in the evolution of the mammalian dental pattern. The activation/inhibition (a/i) cascade model of Kavanagh and colleagues (3) postulates that the timing of initiation of the posterior molars depends on the previous molars through a dynamic balance between intermolar inhibition and mesenchymal activation. Changes in this balance lead to modifications of mammalian molar proportions and, therefore, to phenotypic variation. Since M1 is the first molar to develop (25) and controls the development of subsequent molars, M1 initiation and growth are expected to be under strict genetic control. Consistent with this, our results showed that, among molar teeth, M1 has the highest correlations with G4Mfe, followed by M2. Notably, since the a/i cascade initiates from M1, the decreasing correlations we observed along the molar row can be explained by the fact that the development of the second and third molars is more susceptible to variations in this molecular balance and more prone to be influenced by non-genetic/epigenetic factors.

In the correlations between M1 relative length and Pax9 plus Msx1 G4Mfe, Jaculus jaculus (JJ) appeared as the major outlier (Fig. 4), having the lowest Mfe (-90 kcal/mol) among rodents with no premolars. Removal of this species increases the correlation coefficient from 0.53 to 0.61 (p-value = 1.349e-05). Among mammals, Jaculus jaculus presents unique morphological adaptations, such as loss of the anterior and posterior hindlimb digits, fusion of the three central metatarsals, and dramatic elongation of the hindlimb relative to the forelimb with disproportionate elongation of the metatarsals (26). In Pax9-deficient mice, a supernumerary preaxial digit is formed in the limbs, and the flexor of the hindlimb toes is missing (10). Since PAX9 plays an important role in limb development, it is plausible to infer that the unexpectedly high G4Mfe of Jaculus jaculus may be related to the development of its unique limb pattern.

Molar associations were found by dividing the tooth mesiodistal length by the sum of the mesiodistal lengths of all posterior teeth. Therefore, a high relative length is obtained when the other teeth are small or when there are only a few teeth in the dentition. Accordingly, higher M1 relative lengths were found in animals having one or no premolars, such as Mus musculus, Rattus norvegicus and Daubentonia madagascariensis. These animals tended to have less stable G4 structures in the first intron of the Pax9 and Msx1 genes. Lower M1 relative lengths were found in animals with a relatively small M1 and a large PM4, such as Felis catus, Canis lupus familiaris and Homo sapiens. These animals tended to have more stable G4 structures in the first intron of the Pax9 and Msx1 genes. It is noteworthy that the correlations between M1 and the Pax9 and Msx1 G4Mfe increase significantly when Mfe are corrected by brain/body ratio. There is a low correlation between brain/body ratio and Pax9 G4Mfe (rho = -0.02, p-value = 0.89) and a moderate correlation with Msx1 G4Mfe (rho = 0.34, p-value = 0.037), as well as between M1 and brain/body ratio (rho = 0.39, p-value = 0.002). Variations in dental pattern have been related to brain size (27,28), and this relationship can be explained by the fact that dental and craniofacial development share common gene pathways (10). Unlike premolars, which occupy the space left by their deciduous predecessors, molar teeth have to adapt in order to occupy the space in the posterior region of the upper jaw. Since M1 is the first permanent tooth to initiate formation, and its tooth germ can influence the development of the other posterior molars (3), its development must be pre-programmed in order to influence the initiation and growth of M2 and M3, which must occur in concert with cranial growth. The intricacy of this interaction is evidenced by the fact that craniofacial growth is a long-lasting process that extends from intrauterine life to the beginning of sexual maturity.
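The effect of tooth number on the molar relative length can be made concrete with a small sketch. The lengths are hypothetical, and "posterior teeth" is assumed here to mean the premolars and molars present in the dentition.

```python
def molar_relative_lengths(posterior_lengths):
    """Divide each posterior tooth's mesiodistal length by the sum over all posterior teeth."""
    total = sum(posterior_lengths.values())
    return {tooth: length / total for tooth, length in posterior_lengths.items()}


# Hypothetical mesiodistal lengths (mm).
no_premolars = {"M1": 2.0, "M2": 1.5, "M3": 1.0}                  # rodent-like dentition
with_premolars = {"PM3": 6.0, "PM4": 8.0, "M1": 6.0, "M2": 4.0}   # large PM4, smaller M1

rel_a = molar_relative_lengths(no_premolars)    # M1 share = 2.0 / 4.5
rel_b = molar_relative_lengths(with_premolars)  # M1 share = 6.0 / 24.0
```

In this toy case the M1 of the dentition without premolars takes up a larger share of the posterior row even though it is the smaller tooth in absolute terms, which is why the relative length is sensitive to the dental formula.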

Unlike molars, premolar associations were found by dividing the mesiodistal length of each tooth by the distance from the distal of the canine to the distal of PM4, while canine associations were obtained by dividing its mesiodistal length by the distance from the distal of PM4 to the mesial of the canine. During development these teeth have to adjust their size in order to occupy the space between the lateral incisors and first molars, which develop and erupt earlier. In developmental terms, canines and premolars have to fit within a pre-established space that is inherited from the corresponding deciduous predecessors. Therefore, it seems logical that the associations of each premolar and canine are better explained by the relative size within its own developmental field. The fact that both premolars and canines showed a negative correlation with the G4Mfe of the same genes, and also with their sum, suggests that canines and premolars are part of the same developmental field (29).

Most studies on tooth development are performed using the murine model and, therefore, cannot directly represent premolar and canine development. Our findings suggest the existence of correlations between premolar and canine relative lengths and the G4Mfe of the Inhba and Bmp2 genes. Similar correlations were obtained for PM3 and PM4 relative lengths. These results may be explained by the fact that in most species these teeth have similar lengths, possibly due to a fairly similar developmental timing (25). Studies using the murine model demonstrated that Inhba has an important role during craniofacial development (30). It is required for the incisor and mandibular molar tooth germs to progress beyond the bud stage. Homozygous knockout mice lack all teeth except the maxillary molars (30,31). The Bmp2 gene, which is part of the evolutionarily conserved Bmp pathway, is used reiteratively during tooth development. At tooth initiation, Bmp2 inhibits Pax9 expression in the mouse mandibular arch, participating in the determination of the bud formation sites (32). Our results suggest that G4FS in the first intron may regulate the expression of Inhba and Bmp2, which are relevant to the patterning of mammalian premolars and canines.

G-quadruplexes can form highly stable structures that can hinder DNA duplication (33) and transcription (34–36), and are intrinsically mutagenic (37). To prevent massive genome rearrangements by depletions of these sequences, cells require a specialized genome-protection mechanism that involves the unwinding of G4FS by several DNA helicases (38–43). The high mutagenicity of quadruplex sequences, their special nurture by helicases, and their enrichment in promoter sequences and at the 5’ end of the first intron (14) indicate that these sequences are likely to have an active role in gene expression. This is reinforced by the fact that the central guanines of guanine tracts, which are critical for folding stability, have a relatively lower rate of polymorphic bases (44). The G4FS at the 5’ end of the first intron of Pax9 and Msx1 have been previously shown to play a role in gene expression (17,18). The correlations between G4Mfe and relative tooth lengths shown here support the idea that variations in these sequences may be related to variations of dental phenotype in placental mammals. It is important to mention that there was also a correlation between PM3 and canine relative sizes and evolutionary distances, indicating that, for these teeth, the observed association of G4FS could be attributed to phylogenetic bias. This correlation, however, was smaller and less significant than the one obtained with the sum of the G4Mfe of Pax9, Msx1, Inhba and Bmp2.

CONCLUSION

Here we provide evidence that G4FS in intron 1 of the Pax9, Msx1, Inhba and Bmp2 genes are associated with variations in mammalian dental phenotype. These sequences may represent dynamic transcriptional and splicing regulatory elements. Variations in G4FS may have imparted species-specific gene dosage, expression timing, and molecular interactions, contributing to the diversification of dental phenotype.

ACKNOWLEDGMENTS

This research was financially supported by The National Council for Scientific and Technological Development - Brazil (CNPq grant #132289/2015-6).

REFERENCES

1. Ziegler AC. A theory of the evolution of therian dental formulas and replacement patterns. Q Rev Biol. 1971;46(3):226-49.
2. Butler PM. Studies of the Mammalian Dentition. Differentiation of the Post-canine Dentition. Proc Zool Soc London. 1939;109 B (1):1-36.
3. Kavanagh KD, Evans AR, Jernvall J. Predicting evolutionary patterns of mammalian teeth from development. Nature. 2007 Sep 27;449(7161):427-32.
4. Renvoisé E, Evans AR, Jebrane A, Labruère C, Laffont R, Montuire S. Evolution of mammal tooth patterns: new insights from a developmental prediction model. Evolution. 2009 May;63(5):1327-40.
5. Ribeiro MM, de Andrade SC, de Souza AP, Line SR. The role of modularity in the evolution of primate postcanine dental formula: integrating jaw space with patterns of dentition. Anat Rec (Hoboken). 2013 Apr;296(4):622-9.
6. Thesleff I, Sharpe P. Signalling networks regulating dental development. Mech Dev. 1997 Oct;67(2):111-23.


7. Jernvall J, Thesleff I. Reiterative signaling and patterning during mammalian tooth morphogenesis. Mech Dev. 2000 Mar 15;92(1):19-29.
8. Bei M. Molecular genetics of tooth development. Curr Opin Genet Dev. 2009 Oct;19(5):504-10.
9. Satokata I, Maas R. Msx1 deficient mice exhibit cleft palate and abnormalities of craniofacial and tooth development. Nat Genet. 1994 Apr;6(4):348-56.
10. Peters H, Neubüser A, Kratochwil K, Balling R. Pax9-deficient mice lack pharyngeal pouch derivatives and teeth and exhibit craniofacial and limb abnormalities. Genes Dev. 1998 Sep 1;12(17):2735-47.
11. Ogawa T, Kapadia H, Feng JQ, Raghow R, Peters H, D'Souza RN. Functional consequences of interactions between Pax9 and Msx1 genes in normal and abnormal tooth development. J Biol Chem. 2006 Jul 7;281(27):18363-9.
12. Nakatomi M, Wang XP, Key D, Lund JJ, Turbe-Doan A, Kist R, Aw A, Chen Y, Maas RL, Peters H. Genetic interactions between Pax9 and Msx1 regulate lip development and several stages of tooth morphogenesis. Dev Biol. 2010 Apr 15;340(2):438-49.
13. Huppert JL, Balasubramanian S. Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 2005 May 24;33(9):2908-16.
14. Eddy J, Maizels N. Conserved elements with potential to form polymorphic G-quadruplex structures in the first intron of human genes. Nucleic Acids Res. 2008 Mar;36(4):1321-33.
15. Dhayan H, Baydoun AR, Kukol A. G-quadruplex formation of FXYD1 pre-mRNA indicates the possibility of regulating expression of its protein product. Arch Biochem Biophys. 2014 Oct 15;560:52-8.
16. Reinhold WC, Mergny JL, Liu H, Ryan M, Pfister TD, Kinders R, Parchment R, Doroshow J, Weinstein JN, Pommier Y. Exon array analyses across the NCI-60 reveal potential regulation of TOP1 by transcription pausing at guanosine quartets in the first intron. Cancer Res. 2010 Mar 15;70(6):2191-203.
17. Ribeiro MM, Teixeira GS, Martins L, Marques MR, de Souza AP, Line SR. G-quadruplex formation enhances splicing efficiency of PAX9 intron 1. Hum Genet. 2015 Jan;134(1):37-44.
18. Pawlowska E, Janik-Papis K, Wisniewska-Jarosinska M, Szczepanska J, Blasiak J. Mutations in the human homeobox MSX1 gene in the congenital lack of permanent teeth. Tohoku J Exp Med. 2009 Apr;217(4):307-12.


19. Tatematsu T, Kimura M, Nakashima M, Machida J, Yamaguchi S, Shibata A, Goto H, Nakayama A, Higashi Y, Miyachi H, Shimozato K, Matsumoto N, Tokita Y. An aberrant splice acceptor site due to a novel intronic nucleotide substitution in MSX1 gene is the cause of congenital tooth agenesis in a Japanese family. PLoS One. 2015 Jun 1;10(6):e0128227.
20. Xue J, Gao Q, Huang Y, Zhang X, Yang P, Cram DS, Liang D, Wu L. A novel MSX1 intronic mutation associated with autosomal dominant non-syndromic oligodontia in a large Chinese family pedigree. Clin Chim Acta. 2016 Oct 1;461:135-40.
21. Perry GH, Verrelli BC, Stone AC. Molecular evolution of the primate developmental genes MSX1 and PAX9. Mol Biol Evol. 2006 Mar;23(3):644-54.
22. Johnson VE. Revised standards for statistical evidence. Proc Natl Acad Sci U S A. 2013 Nov 26;110(48):19313-7.
23. Brook AH, Elcock C, Aggarwal M, Lath DL, Russell JM, Patel PI, Smith RN. Tooth dimensions in hypodontia with a known PAX9 mutation. Arch Oral Biol. 2009 Dec;54 Suppl 1:S57-62.
24. Line SR. Variation of tooth number in mammalian dentition: connecting genetics, development, and evolution. Evol Dev. 2003 May-Jun;5(3):295-304.
25. Hillson S. Tooth development in human evolution and bioarchaeology. Cambridge University Press; 2014 Mar 13.
26. Cooper KL. The lesser Egyptian jerboa, Jaculus jaculus: a unique rodent model for evolution and development. Cold Spring Harb Protoc. 2011 Dec 1;2011(12):1451-6.
27. Godfrey LR, Samonds KE, Jungers WL, Sutherland MR. Teeth, brains, and primate life histories. Am J Phys Anthropol. 2001 Mar;114(3):192-214.
28. Jiménez-Arenas JM, Pérez-Claros JA, Aledo JC, Palmqvist P. On the relationships of postcanine tooth size with dietary quality and brain volume in primates: implications for hominin evolution. Biomed Res Int. 2014;2014:406507.
29. Line SR. Molecular morphogenetic fields in the development of human dentition. J Theor Biol. 2001 Jul 7;211(1):67-75.
30. Matzuk MM, Kumar TR, Vassalli A, Bickenbach JR, Roop DR, Jaenisch R, Bradley A. Functional analysis of activins during mammalian development. Nature. 1995 Mar 23;374(6520):354-6.


31. Ferguson CA, Tucker AS, Christensen L, Lau AL, Matzuk MM, Sharpe PT. Activin is an essential early mesenchymal signal in tooth development that is required for patterning of the murine dentition. Genes Dev. 1998 Aug 15;12(16):2636-49.
32. Neubüser A, Peters H, Balling R, Martin GR. Antagonistic interactions between FGF and BMP signaling pathways: a mechanism for positioning the sites of tooth formation. Cell. 1997 Jul 25;90(2):247-55.
33. Eddy S, Tillman M, Maddukuri L, Ketkar A, Zafar MK, Eoff RL. Human Translesion Polymerase κ Exhibits Enhanced Activity and Reduced Fidelity Two Nucleotides from G-Quadruplex DNA. Biochemistry. 2016 Sep 20;55(37):5218-29.
34. Broxson C, Beckett J, Tornaletti S. Transcription arrest by a G quadruplex forming-trinucleotide repeat sequence from the human c-myb gene. Biochemistry. 2011 May 17;50(19):4162-72.
35. Nambiar M, Srivastava M, Gopalakrishnan V, Sankaran SK, Raghavan SC. G-quadruplex structures formed at the HOX11 breakpoint region contribute to its fragility during t(10;14) translocation in T-cell leukemia. Mol Cell Biol. 2013 Nov;33(21):4266-81.
36. Tan BG, Wellesley FC, Savery NJ, Szczelkun MD. Length heterogeneity at conserved sequence block 2 in human mitochondrial DNA acts as a rheostat for RNA polymerase POLRMT activity. Nucleic Acids Res. 2016 Sep 19;44(16):7817-29.
37. Kruisselbrink E, Guryev V, Brouwer K, Pontier DB, Cuppen E, Tijsterman M. Mutagenic capacity of endogenous G4 DNA underlies genome instability in FANCJ-defective C. elegans. Curr Biol. 2008 Jun 24;18(12):900-5.
38. Sun H, Karow JK, Hickson ID, Maizels N. The Bloom's syndrome helicase unwinds G4 DNA. J Biol Chem. 1998 Oct 16;273(42):27587-92.
39. Fry M, Loeb LA. Human Werner syndrome DNA helicase unwinds tetrahelical structures of the fragile X syndrome repeat sequence d(CGG)n. J Biol Chem. 1999 Apr 30;274(18):12797-802.
40. London TB, Barber LJ, Mosedale G, Kelly GP, Balasubramanian S, Hickson ID, Boulton SJ, Hiom K. FANCJ is a structure-specific DNA helicase associated with the maintenance of genomic G/C tracts. J Biol Chem. 2008 Dec 26;283(52):36132-9.


41. Wu Y, Shin-ya K, Brosh RM Jr. FANCJ helicase defective in Fanconia anemia and breast cancer unwinds G-quadruplex DNA to defend genomic stability. Mol Cell Biol. 2008 Jun;28(12):4116-28.
42. Wu Y, Brosh RM Jr. DNA helicase and helicase-nuclease enzymes with a conserved iron-sulfur cluster. Nucleic Acids Res. 2012 May;40(10):4247-60.
43. Sanders CM. Human Pif1 helicase is a G-quadruplex DNA-binding protein with G-quadruplex DNA-unwinding activity. Biochem J. 2010 Aug 15;430(1):119-28.
44. Nakken S, Rognes T, Hovig E. The disruptive positions in human G-quadruplex motifs are less polymorphic and more conserved than their neutral counterparts. Nucleic Acids Res. 2009 Sep;37(17):5749-56.


SUPPLEMENTARY DATA

Supplementary Table 1 Species included in the analysis. The maxillary dental formula is also shown.

ORDER | FAMILY | SPECIES | COMMON NAME | ABBREVIATION | MAXILLARY DENTAL FORMULA
Primates | Hominidae | Homo sapiens | Human | HS | I2-C1-P2-M3
Primates | Hominidae | Gorilla gorilla | Western lowland gorilla | GG | I2-C1-P2-M3
Primates | Hominidae | Pan paniscus | Bonobo | PP | I2-C1-P2-M3
Primates | Hominidae | Pan troglodytes | Chimpanzee | PT | I2-C1-P2-M3
Primates | Hominidae | Pongo abelii | Orangutan (Sumatran) | PAB | I2-C1-P2-M3
Primates | Lorisidae | Perodicticus potto | Potto | PP | I2-C1-P2-M3
Primates | Cercopithecidae | Papio anubis | Olive baboon | PA | I2-C1-P2-M3
Primates | Cercopithecidae | Macaca fascicularis | Crab-eating macaque | MF | I2-C1-P2-M3
Primates | Cercopithecidae | Macaca mulatta | Rhesus macaque | MM | I2-C1-P2-M3
Primates | Cercopithecidae | Chlorocebus sabaeus | Green monkey | CHS | I2-C1-P2-M3
Primates | Lemuridae | Lemur catta | Ring-tailed lemur | LC | I2-C1-P3-M3
Primates | Cebidae | Leontopithecus rosalia | Golden lion tamarin | LR | I2-C1-P3-M2
Primates | Cebidae | Callimico goeldii | Goeldi's monkey | CG | I2-C1-P3-M3
Primates | Cebidae | Saguinus oedipus | Cotton-top tamarin | SO | I2-C1-P3-M2
Primates | Cebidae | Callithrix jacchus | Common marmoset | CJ | I2-C1-P3-M2
Primates | Cebidae | Saimiri boliviensis | Bolivian squirrel monkey | SB | I2-C1-P3-M3
Primates | Daubentoniidae | Daubentonia madagascariensis | Aye-aye | DM | I1-C0-P1-M3
Primates | Hylobatidae | Nomascus leucogenys | White-cheeked gibbon | NL | I2-C1-P2-M3
Primates | Tarsiidae | Tarsius syrichta | Philippine tarsier | TS | I2-C1-P3-M3
Carnivora | Ursidae | Ailuropoda melanoleuca | Giant panda | AM | I2-C1-P4-M2
Carnivora | Canidae | Canis lupus familiaris | Dog | CL | I3-C1-P4-M2
Carnivora | Felidae | Felis catus | Cat | FC | I3-C1-P3-M1
Carnivora | Mustelidae | Mustela putorius furo | Ferret | MP | I3-C1-P3-M1
Carnivora | Phocidae | Leptonychotes weddellii | Weddell seal | LW | I2-C1-P3-M2
Artiodactyla | Bovidae | Bos taurus | Cow | BT | I0-C0-P3-M3
Artiodactyla | Bovidae | Capra hircus | Goat | CH | I2-C1-P3-M3
Artiodactyla | Bovidae | Ovis aries | Sheep | OA | I0-C0-P3-M3
Artiodactyla | Bovidae | Pantholops hodgsonii | Tibetan antelope | PH | I0-C0-P2-M3
Artiodactyla | Camelidae | Vicugna pacos | Alpaca | VP | I0-C0-P3-M3
Artiodactyla | Camelidae | Camelus bactrianus | Bactrian camel | CB | I1-C1-P3-M3
Artiodactyla | Suidae | Sus scrofa | Pig | SS | I2-C1-P4-M3
Perissodactyla | Equidae | Equus caballus | - | EC | I3-C0-P3-M3
Perissodactyla | Rhinocerotidae | Ceratotherium simum | White rhinoceros | CS | I0-C0-P3-M3
Afrosoricida | Tenrecidae | Echinops telfairi | Lesser hedgehog tenrec | ET | I2-C1-P3-M3
Afrosoricida | Chrysochloridae | Chrysochloris asiatica | Cape golden mole | CHA | I3-C1-P3-M3
Macroscelidea | Macroscelidea | Elephantulus edwardii | Elephantulus edwardii | EE | I3-C1-P4-M2
Hyracoidea | Procaviidae | Procavia capensis | Rock Hyrax | PC | I0-C0-P4-M3
Chiroptera | Pteropodidae | Pteropus vampyrus | Large flying fox | PV | I2-C1-P3-M2
Chiroptera | Vespertilionidae | Eptesicus fuscus | Big brown bat | EF | I2-C1-P1-M3
Chiroptera | Vespertilionidae | Myotis lucifugus | Little brown bat | ML | I1:2-C1-P3-M3
Soricomorpha | Soricidae | Sorex araneus | Shrew | SA | I3-C1-P3-M3
Soricomorpha | Talpidae | Condylura cristata | Star-nosed Mole | CC | I3-C1-P4-M3
Lagomorpha | Ochotonidae | Ochotona princeps | Pika | OP | I2-C0-P3-M2
Rodentia | Caviidae | Cavia porcellus | - | CP | I1-C0-P1-M3
Rodentia | Bathyergidae | Heterocephalus glaber | Naked mole-rat | HG | I1-C0-P0-M3
Rodentia | Muridae | Mus musculus | Mouse | MUS | I1-C0-P0-M3
Rodentia | Muridae | Rattus norvegicus | Rat | RN | I1-C0-P0-M3
Rodentia | Dipodidae | Jaculus jaculus | Lesser Egyptian Jerboa | JJ | I1-C0-P0-M3
Rodentia | Cricetidae | Microtus ochrogaster | Prairie vole | MO | I1-C0-P0-M3
Rodentia | Cricetidae | Mesocricetus auratus | Golden hamster | MA | I1-C0-P0-M3
Rodentia | Sciuridae | Spermophilus tridecemlineatus | Squirrel | ST | I1-C0-P2-M3
Rodentia | Chinchillidae | Chinchilla lanigera | Long-tailed Chinchilla | CHL | I1-C0-P1-M3
Rodentia | Echimyidae | Isothrix barbarabrownae | Brush-tailed Rat | IB | I1-C0-P0-M3
Tubulidentata | Orycteropodidae | Orycteropus afer | - | OA | I0-C0-P2-M3
Scandentia | Tupaiidae | Tupaia belangeri | Common tree shrew | TB | I2-C1-P3-M3

Supplementary Table 2 Genes included in the analysis. Number of available sequence alignments for each gene.

GENE        ENSEMBL GENE ID         ALIGNMENTS
PAX9        ENST00000361487.6       55
MSX1        ENST00000382723.4       47
BMP2        ENST00000378827.4       46
DLX1        ENST00000361725.4       47
BCOR        ENST00000615339.1       37
BCOR_004    ENST00000378455.8       33
CD34        ENST00000310833.11      40
HDAC2       ENST00000519065.5       32
INHBA       ENST00000442711.1       43
ITGA6       ENST00000412899.5       33


Supplementary Table 3 Identification and source of images used in this study

SPECIES | DESCRIPTION | SEX | SOURCE
Homo sapiens | Homo sapiens DKY_9001-10817 | Female | Takahashi et al., 2006
Homo sapiens | Homo sapiens 10817 DKY_9002 | Male | Takahashi et al., 2006
Homo sapiens | Homo sapiens 10817 DKY_9004 | Female | Takahashi et al., 2006
Homo sapiens | Homo sapiens | Male | Takahashi et al., 2006
Gorilla gorilla | Gorilla gorilla DKY_1755-10625 | Male | Takahashi et al., 2006
Gorilla gorilla | Gorilla gorilla JMC_3958-13718 | Female | Takahashi et al., 2006
Gorilla gorilla | Gorilla gorilla JMC_4297-12869 | Male | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes DKY_0218-10600 | Male | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes JMC_3982-12861 | Unknown | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes THK_9002-12184 | Unknown | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes DKY_1096-10485 | Female | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes DKY_1445-10604 | Female | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes DKY_1608-11088 | Female | Takahashi et al., 2006
Pan troglodytes | Pan troglodytes DKY_2763-9934 | Female | Takahashi et al., 2006
Pan paniscus | Taken measurements | Males and Females | Johanson, 1974
Pan paniscus | Ventral view | Female | Bolter and Zihlman, 2012
Pongo abelii | Ventral view | Male | Skeletons Web-University of Texas
Perodicticus potto | Perodicticus potto DKY_2736-9810 | Female | Takahashi et al., 2006
Perodicticus potto | Perodicticus potto ILF_0168-14837 | Male | Takahashi et al., 2006
Papio anubis | Papio anubis DKY_1257-3562 | Female | Takahashi et al., 2006
Papio anubis | Papio anubis PRI_1625-13520 | Male | Takahashi et al., 2006
Papio anubis | Papio anubis PRI_1626-13513 | Male | Takahashi et al., 2006
Papio anubis | Papio anubis PRI_3086-13464 | Female | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_0989-4134 | Male | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_0995-4557 | Male | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_1229-3502 | Female | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_1556-0620 | Female | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_1860-0269 | Female | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_2047-4745 | Female | Takahashi et al., 2006
Macaca fascicularis | Macaca fascicularis DKY_2447-3809 | Male | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_1086-4058 | Male | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_1682-4722 | Male | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_2654-0272 | Male | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_2677-0284 | Male | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_1098-3791 | Female | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_1686-4061 | Female | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_1713-4023 | Female | Takahashi et al., 2006
Macaca mulatta | Macaca mulatta DKY_2685-4565 | Female | Takahashi et al., 2006
Chlorocebus sabaeus | Paper image | Unknown | Takahashi et al., 2006


Lemur catta | Lemur catta DKY_2205-3526 | Female | Takahashi et al., 2006
Lemur catta | Lemur catta DKY_2576-6489 | Female | Takahashi et al., 2006
Lemur catta | Lemur catta DKY_2715-9130 | Female | Takahashi et al., 2006
Lemur catta | Lemur catta DKY_2876-13777 | Female | Takahashi et al., 2006
Leontopithecus rosalia | Leontopithecus rosalia THK_0255-12647 | Male | Takahashi et al., 2006
Callimico goeldii | Ventral view | Male | Myer et al., 2017
Saguinus oedipus | Saguinus oedipus ILF_0016-14274 | Female | Takahashi et al., 2006
Saguinus oedipus | Saguinus oedipus PRI_1491-13613 | Unknown | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus DKY_2538-6552 | Male | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus DKY_2546-6393 | Female | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus DKY_2553-5818 | Female | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus DKY_2555-6516 | Male | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus THK_1078-12678 | Male | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus THK_1084-12675 | Male | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus THK_1090-12619 | Female | Takahashi et al., 2006
Callithrix jacchus | Callithrix jacchus THK_1097-12616 | Female | Takahashi et al., 2006
Saimiri sciureus* | Saimiri sciureus DKY_2360 | Male | Takahashi et al., 2006
Daubentonia madagascariensis | Paper image | Male | Quinn and Wilson 2004
Nomascus leucogenys | Paper image | Male | Lee E. Harding 2012
Tarsius syrichta | Tarsius syrichta | Female | Myer et al., 2017
Ailuropoda melanoleuca | Taken measurements | Unknown | Jin et al. 2007
Ailuropoda melanoleuca | Ventral view | Unknown | Myer et al., 2017
Canis lupus familiaris | Canis lupus familiaris DKY_0724 | Male | Takahashi et al., 2006
Canis lupus familiaris | Canis lupus familiaris DKY_0725 | Male | Takahashi et al., 2006
Canis lupus familiaris | Canis lupus familiaris DKY_0763 | Female | Takahashi et al., 2006
Felis catus | Felis catus DKY_0165-5117 | Unknown | Takahashi et al., 2006
Felis catus | Felis catus DKY_0466-3370 | Unknown | Takahashi et al., 2006
Felis catus | Felis catus | Male | Will's Skull Page
Leptonychotes weddellii | Paper image | Unknown | Stirling 1971
Mustela putorius furo | Mustela putorius furo | Unknown | Richard Lawrence-Flickr.com
Mustela putorius furo | Mustela putorius furo DKY_2222-7543 | Female | Takahashi et al., 2006
Bos taurus | Bos taurus THK_0453-12789 | Unknown | Takahashi et al., 2006
Bos taurus | Bos taurus THK_0454-12343 | Unknown | Takahashi et al., 2006
Bos taurus | Bos taurus THK_0456-12787 | Unknown | Takahashi et al., 2006
Bos taurus | Bos taurus ILF_0001-14408 | Unknown | Takahashi et al., 2006
Bos taurus | Bos taurus DKY_1692 | Female | Takahashi et al., 2006
Capra hircus | Capra hircus DKY_1559-10735 | Male | Takahashi et al., 2006
Capra hircus | Capra hircus DKY_1587-10221 | Female | Takahashi et al., 2006
Capra hircus | Capra hircus DKY_1620-10159 | Female | Takahashi et al., 2006
Capra hircus | Capra hircus DKY_1526-10699 | Female | Takahashi et al., 2006
Ovis aries | Ovis aries THK_0406-12747 | Unknown | Takahashi et al., 2006
Ovis aries | Ovis aries THK_0023-12275 | Male | Takahashi et al., 2006
Ovis aries | Ovis aries THK_0021-12259 | Male | Takahashi et al., 2006
Ovis aries | Ovis aries DKY_1389-10769 | Unknown | Takahashi et al., 2006


Camelus bactrianus | Camelus bactrianus ILF_0003 | Male | Takahashi et al., 2006
Vicugna pacos | Ventral view (scale from M3) | Female | Quiring and Leggett
Vicugna pacos | Lateral view | Unknown | Shadyufo-Tumblr
Pantholops hodgsonii | Ventral view | Unknown | The Natural History Museum
Sus scrofa | Sus scrofa | Unknown | Museum Victoria
Equus caballus | Equus caballus DKY 2343-10807 | Unknown | Takahashi et al., 2006
Equus caballus | Equus caballus DKY_1964-10946 | Male | Takahashi et al., 2006
Equus caballus | Equus caballus | Male | Mammalian Skeleton - UCL
Equus caballus | Equus caballus | Unknown | Museum Victoria
Ceratotherium simum | Ventral view | Unknown | Colin Groves 1972
Ceratotherium simum | Taken measurements | Unknown | Chang and Jang 2004
Chrysochloris asiatica | Ventral view | Unknown | Simonetta, A.M., 1968
Echinops telfairi | Echinops telfairi ILF_0120-14774 | Male | Takahashi et al., 2006
Elephantulus edwardii | Elephantulus edwardii (brachyrynchus), ventral view | Unknown | Myer et al., 2017
Procavia capensis | Procavia capensis DKY_2660-3406 | Female | Takahashi et al., 2006
Procavia capensis | Procavia capensis DKY_2698-10050 | Female | Takahashi et al., 2006
Procavia capensis | Procavia capensis ILF_0073-13929 | Unknown | Takahashi et al., 2006
Procavia capensis | Procavia capensis DKY_1803-9863 | Female | Takahashi et al., 2006
Pteropus vampyrus | Pteropus vampyrus | Male | Myer et al., 2017
Pteropus vampyrus | Pteropus vampyrus DKY2631-6672 | Male | Takahashi et al., 2006
Pteropus vampyrus | Pteropus vampyrus DKY2362-8717 | Male | Takahashi et al., 2006
Pteropus vampyrus | Pteropus vampyrus DKY2625-9073 | Female | Takahashi et al., 2006
Eptesicus fuscus | Ventral view | Male | UCF Biology
Eptesicus fuscus | Ventral view | Unknown | Myer et al., 2017
Myotis lucifugus | Myotis lucifugus (drawing) | Unknown | Fenton, 1980
Myotis lucifugus | Ventral view | Female | Myer et al., 2017
Sorex araneus | Ventral view | Male | Myer et al., 2017
Condylura cristata | Ventral view | Unknown | Myer et al., 2017
Condylura cristata | Paper image | - | Laerm et al. 2007
Ochotona princeps | Paper image | Male | Hafner and Smith 2010
Ochotona princeps | Paper image | Unknown | Smith and Weston 1990
Ochotona princeps | Paper image | Unknown | Fostowicz-Frelik and Meng (2013)
Cavia porcellus | Cavia porcellus THK_0319-12635 | Female | Takahashi et al., 2006
Cavia porcellus | Cavia porcellus THK_0359-12660 | Unknown | Takahashi et al., 2006
Cavia porcellus | Cavia porcellus ILF_0176-14849 | Male | Takahashi et al., 2006
Cavia porcellus | Cavia porcellus ILF_0177-14848 | Female | Takahashi et al., 2006
Heterocephalus glaber | Ventral view | Male | Takahashi et al., 2006
Jaculus jaculus | Jaculus jaculus ILF_0017 | Unknown | Takahashi et al., 2006
Jaculus jaculus | Ventral view | Female | African Rodentia
Microtus ochrogaster | Ventral view | Unknown | Myer et al., 2017
Mesocricetus auratus | Ventral view | Unknown | Richard Lawrence-Flickr.com
Spermophilus tridecemlineatus | Ventral view | Unknown | African Rodentia
Chinchilla lanigera | Lateral view | Unknown | Paolo Viscardi-Zygoma

42

Ventral view Unknown Myer et al., 2017

Isothrix barbarabrownae Paper image Male and Patterson y Velazco 2006 females Mus musculus Ventral view Unknown Museum Victoria Rattus norvegicus Rattus norvegicus DKY_2574-9142 Male Takahashi et al.,2006 Rattus norvegicus DKY_2830 Male Takahashi et al.,2006

Orycteropus afer Ventral view Unknown Myer et al., 2017

Tupaia belangeri Ventral view(Tupaia sp.) Unknown Wible, J.R., 2011

Ventral view(Tupaia sp.) Unknown Myer et al., 2017

Ventral view(Tupaia sp.) Unknown Myer et al., 2017 *Used instead Saimiri boliviensis in the measurements. Sources

African rodentia. Accessed at: http://projects.biodiversity.be/africanrodentia/

Bolter DR, Zihlman AL. Skeletal development in Pan paniscus with comparisons to Pan troglodytes. American journal of physical anthropology. 2012 Apr 1;147(4):629-36.

Chang CH, Jang CM. On the Processing and Mounting of a Skeleton of a White Rhinoceros, Ceratotherium sinum. Coll. Res. 2004;17:58-69.

Fenton MB. Myotis lucifugus. Mammalian species. 1980 Nov 20(142):1-8.

Flickr. Richard Lawrence. Accessed at: https://www.flickr.com/photos/richardlsphotos/15449950277/

Fostowicz-Frelik Ł, Meng J. Comparative morphology of premolar foramen in lagomorphs (Mammalia: Glires) and its functional and phylogenetic implications. PloS one. 2013 Nov 21;8(11):e79794.

Groves CP. Ceratotherium simum. Mammalian species. 1972 Jun 16(8):1-6.

Hafner DJ, Smith AT. Revision of the subspecies of the American pika, Ochotona princeps (Lagomorpha: Ochotonidae). Journal of Mammalogy. 2010 Apr 16;91(2):401-17.

Harding LE. Nomascus leucogenys (Primates: Hylobatidae). Mammalian Species. 2012 Jan 25;44(1):1-5.

Jin C, Ciochon RL, Dong W, Hunt RM, Liu J, Jaeger M, Zhu Q. The first skull of the earliest giant panda. Proceedings of the National Academy of Sciences. 2007 Jun 26;104(26):10932-7.

Johanson DC. Some metric aspects of the permanent and deciduous dentition of the pygmy chimpanzee (Pan paniscus). American Journal of Physical Anthropology. 1974 Jul 1;41(1):39-48.

Laerm J, Chapman BR, Ford WM. Content and taxonomic comments. The Land Manager's Guide to Mammals of the South. 2007:113.

Mammalian skeleton. Accessed: http://www.ucl.ac.uk/archaeology/boneview/Interactive%20Components/Mammalian%20Skeleton/Elements/Skull/Equussuperior.html

Myers, P., R. Espinosa, C. S. Parr, T. Jones, G. S. Hammond, and T. A. Dewey. 2017. The Animal Diversity Web (online). Accessed at http://animaldiversity.org.

Patterson BD, Velazco PM. A distinctive new cloud-forest rodent (Hystricognathi: Echimyidae) from the Manu Biosphere Reserve, Peru. Mastozoología Neotropical. 2006 Jul;13(2):175-91.

Quinn A, Wilson DE. Daubentonia madagascariensis. Mammalian Species. 2004 Jul:1-6.

Simonetta AM. A NEW GOLDEN MOLE FROM SOMALIA WITH AN APPENDIX ON THE TAXONOMY OF THE FAMILY CHRYSOCHLORIDAE (MAMMALIA, INSECTIVORA) RICERCHE SULLA FAUNA DELLA SOMALIA PROMOSSE DALL'ISTITUTO DI ZOOLOGIA E DAL MUSEO ZOOLOGICO DELL'UNIVERSITÀ DI FIRENZE: XXX. Monitore Zoologico Italiano. Supplemento. 1968 Jan 1;2(1):27-55.

Skeleton: available at http://eskeletons.org/boneviewer/nid/12548/region/skull/bone/cranium

Smith AT, Weston ML. Ochotona princeps. Mammalian species. 1990 Apr 26(352):1-8.

Stirling I. Leptonychotes weddelli. Mammalian Species. 1971 Jan 19(6):1-5.

Takahashi H, Yamashita M, Shigehara N. Cranial photographs of mammals on the web: the Mammalian Crania Photographic Archive (MCPA2) and a comparison of bone image databases. Anthropological Science. 2006;114(3):217-22.

The Museum Victoria. Accessed at: https://museumvictoria.com.au/bioinformatics/mammals/

The Natural History Museum. Available at: http://www.alamy.com/stock-photo-pantholops-hodgsonii-tibetan-antelope-or-chiru-66724756.html

Wible JR. On the treeshrew skull (Mammalia, Placentalia, Scandentia). Annals of Carnegie Museum. 2011 Jun 7;79(3):149-230.

WILL'S SKULL PAGE. Accessed at http://www.skullsite.co.uk/lists.htm.

Supplementary Table 4. Brain and body mass values for the species included in the analysis. Sources of data are also indicated.

| Species | Brain mass (g) | Source | Body mass (g) | Source |
| --- | --- | --- | --- | --- |
| Homo sapiens | 1250 | Boddy et al., 2012 | 65142 | Boddy et al., 2012 |
| Gorilla gorilla | 528.8 | Boddy et al., 2012 | 163668 | Boddy et al., 2012 |
| Pan paniscus | 322.4 | Boddy et al., 2012 | 45400 | Boddy et al., 2012 |
| Pan troglodytes | 320.35 | Boddy et al., 2012 | 40719 | Boddy et al., 2012 |
| Pongo abelii | 385.28 | Taylor et al. 2007 | 57622 | Isler et al 2008 |
| Perodicticus potto | 12.06 | Boddy et al., 2012 | 844 | Boddy et al., 2012 |
| Papio anubis | 161.35 | Boddy et al., 2012 | 14480 | Boddy et al., 2012 |
| Macaca fascicularis | 67.2 | Boddy et al., 2012 | 4136 | Boddy et al., 2012 |
| Macaca mulatta | 87.95 | Boddy et al., 2012 | 2400 | Boddy et al., 2012 |
| Chlorocebus sabaeus | 65.61 | Boddy et al., 2012 | 2289.6 | Boddy et al., 2012 |
| Lemur catta | 22.8 | Boddy et al., 2012 | 1603 | Boddy et al., 2012 |
| Leontopithecus rosalia | 13.1 | Boddy et al., 2012 | 547.5 | Boddy et al., 2012 |
| Callimico goeldii | 11 | Boddy et al., 2012 | 480 | Boddy et al., 2012 |
| Saguinus oedipus | 9.78 | Boddy et al., 2012 | 372.8 | Boddy et al., 2012 |
| Callithrix jacchus | 7.8 | Boddy et al., 2012 | 247.5 | Boddy et al., 2012 |
| Saimiri boliviensis | 24.1 | Boddy et al., 2012 | 750 | Boddy et al., 2012 |
| Daubentonia madagascariensis | 45.15 | Boddy et al., 2012 | 2800 | Boddy et al., 2012 |
| Nomascus leucogenys | 119 | Montgomery SH, 2012 | 7320 | Montgomery SH, 2012 |
| Tarsius syrichta | 3.6 | Boddy et al., 2012 | 125 | Boddy et al., 2012 |
| Ailuropoda melanoleuca | 226 | Boddy et al., 2012 | 37500 | Boddy et al., 2012 |
| Canis lupus familiaris | 119 | Boddy et al., 2012 | 22680 | Boddy et al., 2012 |
| Felis catus | 25 | Roth et al 2006 | 3300 | Rousseeuw et al, 2005 |
| Mustela putorius furo | 6.6 | Boddy et al., 2012 | 415 | Boddy et al., 2012 |
| Leptonychotes weddellii | 586.25 | Boddy et al., 2012 | 368000 | Boddy et al., 2012 |
| Bos taurus | 307 | Jones et al. 2009 | 613000 | Boddy et al., 2012 |
| Capra hircus | 130.6 | Boddy et al., 2012 | 27660 | Boddy et al., 2012 |
| Ovis aries | 125.9 | Boddy et al., 2012 | 32950 | Boddy et al., 2012 |
| Pantholops hodgsonii | 125.3 | Finarelli J 2006 | 32500 | Massicot P, 2001 |
| Vicugna pacos | 188 | Kruska D, 1980 | 64900 | Jones et al. 2009 |
| Camelus bactrianus | 626 | Finarelli J 2006 | 547000 | Boddy et al., 2012 |
| Sus scrofa | 107 | Conrad et al, 2012 | 84500 | Jones et al. 2009 |
| Equus caballus | 524 | Boddy et al., 2012 | 400000 | Jones et al. 2009 |
| Ceratotherium simum | 600 | Cobb S, 1965 | 2260000 | Jones et al. 2009 |
| Echinops telfairi | 0.62 | Boddy et al., 2012 | 87.5 | Boddy et al., 2012 |
| Chrysochloris asiatica | 0.7 | Boddy et al., 2012 | 49 | Boddy et al., 2012 |
| Elephantulus edwardii | 1.3* | Stephan H, 1981 | 49.7 | Jones et al. 2009 |
| Procavia capensis | 19.2 | Boddy et al., 2012 | 2518 | Boddy et al., 2012 |
| Pteropus vampyrus | 9.121 | Swartz et al, 2012 | 1040 | Jones et al. 2009 |
| Eptesicus fuscus | 0.24 | McGuire LP, 2011 | 17.3 | Boddy et al., 2012 |
| Myotis lucifugus | 0.126** | Braun et al. 2009 | 7.7 | Jones et al. 2009 |
| Sorex araneus | 0.2 | Boddy et al., 2012 | 10.3 | Boddy et al., 2012 |
| Condylura cristata | 1.37 | Boddy et al., 2012 | 50 | Boddy et al., 2012 |
| Ochotona princeps | 2.39 | Boddy et al., 2012 | 169 | Boddy et al., 2012 |
| Cavia porcellus | 4.8 | Boddy et al., 2012 | 470 | Boddy et al., 2012 |
| Heterocephalus glaber | 0.52 | Boddy et al., 2012 | 60.8 | Boddy et al., 2012 |
| Mus musculus | 0.43 | Boddy et al., 2012 | 21.24 | Boddy et al., 2012 |
| Rattus norvegicus | 8 | Boddy et al., 2012 | 360 | Boddy et al., 2012 |
| Jaculus jaculus | 1.21 | Boddy et al., 2012 | 55.4 | Boddy et al., 2012 |
| Microtus ochrogaster | 0.71 | Boddy et al., 2012 | 43.8 | Boddy et al., 2012 |
| Mesocricetus auratus | 1 | Rousseeuw et al, 2005 | 120 | Rousseeuw et al, 2005 |
| Spermophilus tridecemlineatus | 2.426 | Boddy et al., 2012 | 155.6 | Boddy et al., 2012 |
| Chinchilla lanigera | 5.3 | McNab BK, 1989 | 432 | McNab BK 1989 |
| Isothrix barbarabrownae | Not available | | Not available | |
| Orycteropus afer | 72 | Boddy et al., 2012 | 58200 | Boddy et al., 2012 |
| Tupaia belangeri | 2.776 | Boddy et al., 2012 | 157.5 | Boddy et al., 2012 |

*Elephantulus fuscipes brain mass
**Myotis albescens brain mass

Sources

Boddy AM, Mcgowen MR, Sherwood CC, Grossman LI, Goodman M, Wildman DE. Comparative analysis of encephalization in mammals reveals relaxed constraints on anthropoid primate and cetacean brain scaling. J Evol Biol. 2012;25(5):981–94.


Braun JK, Layman QD, Mares MA. Myotis albescens (Chiroptera: Vespertilionidae). Mammalian Species. 2009 Nov 25:1-9.

Cobb S. Brain Size. Arch Neurol. 1965;12(6):555-561.

Conrad MS, Dilger RN, Johnson RW. Brain growth of the domestic pig (Sus scrofa) from 2 to 24 weeks of age: a longitudinal MRI study. Developmental neuroscience. 2012 Jul 6;34(4):291-8.

Finarelli JA. Estimation of Endocranial Volume Through the Use of External Skull Measures in the Carnivora (Mammalia). J Mammal. 2006;87(5):1027–36.

Isler K, Kirk EC, Miller JM, Albrecht GA, Gelvin BR, Martin RD. Endocranial volumes of primate species: scaling analyses using a comprehensive and reliable data set. Journal of Human Evolution. 2008 Dec 31;55(6):967-78.

Jones KE, Bielby J, Cardillo M, Fritz SA, O'Dell J, Orme CD, Safi K, Sechrest W, Boakes EH, Carbone C, Connolly C. PanTHERIA: a species‐level database of life history, ecology, and geography of extant and recently extinct mammals. Ecology. 2009 Sep 1;90(9):2648.

Kruska D. Changes of brain size in mammals caused by domestication. Zeitschrift fur zoologische systematik und evolutionsforschung. 1980 Jan 1;18(3):161-95.

Massicot, P. August 16, 2001. "Animal Info - Chiru (Tibetan Antelope)" (On-line). Accessed November 19, 2016 at http://www.animalinfo.org/species/artiperi/panthodg.htm.

McGuire LP, Ratcliffe JM. Light enough to travel: migratory bats have smaller brains, but not larger hippocampi, than sedentary species. Biol Lett [Internet]. 2011;7(2):233–6. Available from: http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=3061165&tool=pmcentrez&rendertype=abstract

McNab BK, Eisenberg JF. Brain Size and Its Relation to the Rate of Metabolism in Mammals. Am Nat. 1989;133(2):157.

Montgomery SH. The primate brain: Evolutionary history & genetics (Doctoral dissertation, University of Cambridge).

Roth G, Dicke U. Evolution of the brain and intelligence. Trends in cognitive sciences. 2005 May 31;9(5):250-7.

Rousseeuw, P. J., & Leroy, A. M. (2005). Robust regression and outlier detection (Vol. 589). John Wiley & Sons.

Rousseeuw, P.J. & Leroy, A.M. (1987) Robust Regression and Outlier Detection. Wiley, p. 57.

Stephan H, Frahm H, Baron G. New and revised data on volumes of brain structures in insectivores and primates. Folia Primatol. 1981;35(1):1–29.

Swartz SM, Diaz JI, Riskin DK, Breuer KS. Evolutionary history of bats: fossils, molecules and morphology. Cambridge University Press; 2012 Jan 1.

Taylor AB, van Schaik CP. Variation in brain size and ecology in Pongo. Journal of Human Evolution. 2007 Jan 31;52(1):59-71.


Supplementary Table 5. Results of individual Spearman's correlation and Wilcoxon tests between M1 and G4Mfe from other tooth related genes.

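| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | -0.08 | 0.08 | 0.02 | 0.26 | -0.19 | 0.12 |
| Spearman p | 0.60 | 0.64 | 0.90 | 0.15 | 0.29 | 0.53 |
| Wilcoxon p | 0.51 | 0.50 | 0.62 | 0.25 | 0.26 | 0.30 |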

Supplementary Table 6. Results of individual Spearman's correlation and Wilcoxon tests between M2 and G4Mfe from other tooth related genes.

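| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | 0.04 | 0.07 | -0.20 | 0.37 | -0.11 | 0.13 |
| Spearman p | 0.78 | 0.68 | 0.20 | 0.04 | 0.55 | 0.47 |
| Wilcoxon p | 0.69 | 0.99 | 0.49 | 0.21 | 0.65 | 0.28 |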

Supplementary Table 7. Results of individual Spearman's correlation and Wilcoxon tests between M3 and G4Mfe from other tooth related genes.

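| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | 0.13 | 0.09 | -0.25 | 0.20 | -0.21 | 0.20 |
| Spearman p | 0.39 | 0.60 | 0.12 | 0.28 | 0.25 | 0.26 |
| Wilcoxon p | 0.91 | 0.85 | 0.65 | 0.82 | 0.70 | 0.12 |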

Supplementary Table 8. Results of individual Spearman's correlation and Wilcoxon tests between PM4 and G4Mfe from other tooth related genes.


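| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | -0.12 | -0.10 | -0.20 | -0.31 | 0.38 | -0.27 |
| Spearman p | 0.43 | 0.56 | 0.23 | 0.09 | 0.03 | 0.14 |
| Wilcoxon p | 0.14 | 0.24 | 0.04 | 0.03 | 0.01 | 0.23 |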

Supplementary Table 9. Results of individual Spearman's correlation and Wilcoxon tests between PM3 and G4Mfe from other tooth related genes.

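| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | 0.006 | -0.123 | -0.307 | -0.443 | 0.373 | -0.205 |
| Spearman p | 0.967 | 0.469 | 0.054 | 0.011 | 0.033 | 0.261 |
| Wilcoxon p | 0.203 | 0.435 | 0.004 | 0.002 | 0.002 | 0.389 |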

Supplementary Table 10. Results of individual Spearman's correlation and Wilcoxon tests between PM2 and G4Mfe from other tooth related genes.

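| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | 0.022 | -0.144 | 0.264 | 0.056 | 0.051 | 0.296 |
| Spearman p | 0.885 | 0.394 | 0.100 | 0.759 | 0.779 | 0.100 |
| Wilcoxon p | 0.131 | 0.883 | 0.198 | 0.420 | 0.826 | 0.678 |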

Supplementary Table 11. Results of individual Spearman's correlation and Wilcoxon tests between Canine and G4Mfe from other tooth related genes.

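| Dlx1 | + | – | – | – | – | – |
| Bcor | – | + | – | – | – | – |
| Cd34 | – | – | + | – | – | – |
| Hdac2 | – | – | – | + | – | – |
| Itga6 | – | – | – | – | + | – |
| C9orf72 | – | – | – | – | – | + |
| Spearman rho | -0.005 | -0.090 | -0.331 | -0.522 | 0.399 | -0.313 |
| Spearman p | 0.976 | 0.597 | 0.037 | 0.002 | 0.021 | 0.081 |
| Wilcoxon p | 0.359 | 0.428 | 0.009 | 0.002 | 0.012 | 0.266 |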


Supplementary Table 12. Results of Spearman's correlation test between posterior tooth relative lengths and upstream sequences Mfe

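| | PAX9_up50 rho | PAX9_up50 P-value | MSX1_up50 rho | MSX1_up50 P-value | INHBA_up50 rho | INHBA_up50 P-value | BMP2_up50 rho | BMP2_up50 P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| M1 | -0.150 | 0.314 | 0.413 | 0.009 | -0.149 | 0.354 | -0.006 | 0.969 |
| M2 | -0.190 | 0.201 | 0.237 | 0.147 | -0.196 | 0.219 | 0.086 | 0.572 |
| M3 | -0.298 | 0.042 | -0.041 | 0.804 | 0.223 | 0.161 | -0.047 | 0.759 |
| PM4 | 0.009 | 0.953 | 0.090 | 0.586 | -0.143 | 0.372 | -0.493 | 0.001 |
| PM3 | 0.021 | 0.889 | -0.027 | 0.872 | -0.205 | 0.198 | -0.429 | 0.003 |
| PM2 | 0.319 | 0.029 | -0.199 | 0.225 | 0.131 | 0.413 | 0.100 | 0.514 |
| Canine | 0.035 | 0.815 | 0.247 | 0.130 | -0.295 | 0.061 | -0.357 | 0.016 |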
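For readers who wish to check rho values of the kind reported in these tables, the following is a minimal sketch of Spearman's rank correlation in plain Python. It is not the software used in this work. The function names (`average_ranks`, `spearman_rho`) and the example numbers are hypothetical and serve only as illustration; P-values are not computed here.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    # Hypothetical values: relative tooth length vs. a sequence Mfe score.
    tooth_length = [0.21, 0.25, 0.19, 0.30, 0.27]
    mfe_score = [-12.4, -10.1, -15.0, -8.7, -9.9]
    print(spearman_rho(tooth_length, mfe_score))
```

Because rho depends only on ranks, any monotonic transformation of either variable leaves it unchanged, which is why it suits measurements taken on different scales.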

Supplementary Table 13. Results of combinatorial tests between PM4 and the sum of upstream sequences Mfe

| Pax9 | + | + | + | + | + |
| Msx1 | + | + | + | – | + |
| Inhba | – | + | – | + | + |
| Bmp2 | – | – | + | + | + |
| Spearman rho | 0.14 | 0.13 | 0.00 | -0.29 | 0.07 |
| Spearman p-value | 0.44 | 0.49 | 0.98 | 0.12 | 0.73 |
| Wilcoxon p-value | 0.34 | 0.57 | 0.49 | 0.07 | 0.84 |

Supplementary Table 14. Results of combinatorial tests between PM3 and the sum of upstream sequences Mfe

| Pax9 | + | + | + | + | + |
| Msx1 | + | + | + | – | + |
| Inhba | – | + | – | + | + |
| Bmp2 | – | – | + | + | + |
| Spearman rho | 0.031 | -0.007 | -0.041 | -0.197 | -0.027 |
| Spearman p-value | 0.867 | 0.970 | 0.827 | 0.288 | 0.886 |
| Wilcoxon p-value | 0.484 | 0.265 | 0.327 | 0.105 | 0.588 |


Supplementary Table 15. Results of combinatorial tests between PM2 and the sum of upstream sequences Mfe

| Pax9 | + | + | + | + | + |
| Msx1 | + | + | + | – | + |
| Inhba | – | + | – | + | + |
| Bmp2 | – | – | + | + | + |
| Spearman rho | -0.212 | -0.244 | -0.157 | 0.087 | -0.231 |
| Spearman p-value | 0.261 | 0.194 | 0.408 | 0.648 | 0.219 |
| Wilcoxon p-value | 0.280 | 0.384 | 0.566 | 0.395 | 0.384 |

Supplementary Table 16. Results of combinatorial tests between Canine and the sum of upstream sequences Mfe

| Pax9 | + | + | + | + | + |
| Msx1 | + | + | + | – | + |
| Inhba | – | + | – | + | + |
| Bmp2 | – | – | + | + | + |
| Spearman rho | 0.29 | 0.23 | 0.22 | -0.13 | 0.20 |
| Spearman p-value | 0.11 | 0.20 | 0.24 | 0.50 | 0.28 |
| Wilcoxon p-value | 0.07 | 0.77 | 0.70 | 0.31 | 0.44 |


Supplementary Figure 1: Results of Spearman's correlation tests between tooth relative lengths and Didelphis virginiana evolutionary distances


Supplementary Figure 2: Results of Spearman's correlation tests between tooth relative lengths and Alligator mississippiensis evolutionary distances


3 CONCLUSION

We provide evidence that G4FS in intron 1 of the Pax9, Msx1, Bmp2 and Inhba genes are associated with variation in the mammalian dentition phenotype. These sequences could represent regulatory elements of gene expression at the level of transcription or splicing. Variation in G4 sequences could therefore have defined species-specific gene dosages, expression timing and molecular interactions, contributing to the diversification of the dentition phenotype.


REFERENCES1

Bei M. Molecular genetics of tooth development. Curr Opin Genet Dev. 2009 Oct; 19(5):504-10.

Butler PM. Studies of the Mammalian Dentition. Differentiation of the Post-canine Dentition. Proc Zool Soc London. 1939;109 B(1):1–36.

Dhayan H, Baydoun AR, Kukol A. G-quadruplex formation of FXYD1 pre-mRNA indicates the possibility of regulating expression of its protein product. Arch Biochem Biophys. 2014 Oct 15; 560:52-8.

Eddy J, Maizels N. Conserved elements with potential to form polymorphic G-quadruplex structures in the first intron of human genes. Nucleic Acids Res. 2008 Mar;36(4):1321-33.

Huppert JL, Balasubramanian S. Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 2005 May 24;33(9):2908-16.

Jernvall J, Thesleff I. Reiterative signaling and patterning during mammalian tooth morphogenesis. Mech Dev. 2000 Mar 15;92(1):19-29.

Kavanagh KD, Evans AR, Jernvall J. Predicting evolutionary patterns of mammalian teeth from development. Nature. 2007 Sep 27;449(7161):427-32.

Nakatomi M, Wang XP, Key D, Lund JJ, Turbe-Doan A, Kist R, Aw A, Chen Y, Maas RL, Peters H. Genetic interactions between Pax9 and Msx1 regulate lip development and several stages of tooth morphogenesis. Dev Biol. 2010 Apr 15;340(2):438-49.

Ogawa T, Kapadia H, Feng JQ, Raghow R, Peters H, D'Souza RN. Functional consequences of interactions between Pax9 and Msx1 genes in normal and abnormal tooth development. J Biol Chem. 2006 Jul 7;281(27):18363-9.

Pawlowska E, Janik-Papis K, Wisniewska-Jarosinska M, Szczepanska J, Blasiak J. Mutations in the human homeobox MSX1 gene in the congenital lack of permanent teeth. Tohoku J Exp Med. 2009 Apr; 217(4):307-12.

Peters H, Neubüser A, Kratochwil K, Balling R. Pax9-deficient mice lack pharyngeal pouch derivatives and teeth and exhibit craniofacial and limb abnormalities. Genes Dev. 1998 Sep 1;12(17):2735-47.

1 In accordance with the UNICAMP/FOP standards, based on the standardization of the International Committee of Medical Journal Editors (Vancouver Group). Journal abbreviations follow PubMed.


Reinhold WC, Mergny JL, Liu H, Ryan M, Pfister TD, Kinders R, Parchment R, Doroshow J, Weinstein JN, Pommier Y. Exon array analyses across the NCI-60 reveal potential regulation of TOP1 by transcription pausing at guanosine quartets in the first intron. Cancer Res. 2010 Mar 15;70(6):2191-203.

Renvoisé E, Evans AR, Jebrane A, Labruère C, Laffont R, Montuire S. Evolution of mammal tooth patterns: new insights from a developmental prediction model. Evolution. 2009 May;63(5):1327-40.

Ribeiro MM, de Andrade SC, de Souza AP, Line SR. The role of modularity in the evolution of primate postcanine dental formula: integrating jaw space with patterns of dentition. Anat Rec (Hoboken). 2013 Apr; 296(4): 622-9.

Ribeiro MM, Teixeira GS, Martins L, Marques MR, de Souza AP, Line SR. G-quadruplex formation enhances splicing efficiency of PAX9 intron 1. Hum Genet. 2015 Jan;134(1):37-44.

Satokata I, Maas R. Msx1 deficient mice exhibit cleft palate and abnormalities of craniofacial and tooth development. Nat Genet. 1994 Apr; 6(4):348-56.

Tatematsu T, Kimura M, Nakashima M, Machida J, Yamaguchi S, Shibata A, Goto H, Nakayama A, Higashi Y, Miyachi H, Shimozato K, Matsumoto N, Tokita Y. An aberrant splice acceptor site due to a novel intronic nucleotide substitution in MSX1 gene is the cause of congenital tooth agenesis in a Japanese family. PLoS One. 2015 Jun 1; 10(6):e0128227.

Thesleff I, Sharpe P. Signalling networks regulating dental development. Mech Dev. 1997 Oct; 67(2):111-23.

Xue J, Gao Q, Huang Y, Zhang X, Yang P, Cram DS, Liang D, Wu L. A novel MSX1 intronic mutation associated with autosomal dominant non-syndromic oligodontia in a large Chinese family pedigree. Clin Chim Acta. 2016 Oct 1;461:135-40.

Ziegler AC. A theory of the evolution of therian dental formulas and replacement patterns. Q Rev Biol. 1971;46(3):226–49.


ANNEX 1 - Proof of article submission.