ARTICLE
Received 15 Jan 2014 | Accepted 28 Jul 2014 | Published 16 Sep 2014
DOI: 10.1038/ncomms5831
OPEN

Common variation near ROBO2 is associated with expressive vocabulary in infancy

Beate St Pourcain1,2,3,*, Rolieke A.M. Cents4,5,*, Andrew J.O. Whitehouse6,*, Claire M.A. Haworth7,8,*, Oliver S.P. Davis8,9,*, Paul F. O’Reilly8,10, Susan Roulstone11, Yvonne Wren11, Qi W. Ang12, Fleur P. Velders4,5, David M. Evans1,13,14, John P. Kemp1,13,14, Nicole M. Warrington12,14, Laura Miller13, Nicholas J. Timpson1,13, Susan M. Ring1,13, Frank C. Verhulst5, Albert Hofman15, Fernando Rivadeneira15,16, Emma L. Meaburn17, Thomas S. Price18, Philip S. Dale19, Demetris Pillas10, Anneli Yliherva20, Alina Rodriguez10,21, Jean Golding13, Vincent W.V. Jaddoe4,15,22, Marjo-Riitta Jarvelin10,23,24,25,26, Robert Plomin8, Craig E. Pennell12, Henning Tiemeier5,15,* & George Davey Smith1,13

Twin studies suggest that expressive vocabulary at ~24 months is modestly heritable. However, the genes influencing this early linguistic phenotype are unknown. Here we conduct a genome-wide screen and follow-up study of expressive vocabulary in toddlers of European descent from up to four studies of the EArly Genetics and Lifecourse Epidemiology consortium, analysing an early (15–18 months, ‘one-word stage’, N_Total = 8,889) and a later (24–30 months, ‘two-word stage’, N_Total = 10,819) phase of language acquisition. For the early phase, one single-nucleotide polymorphism (rs7642482) at 3p12.3 near ROBO2, encoding a conserved axon-binding receptor, reaches the genome-wide significance level (P = 1.3 × 10⁻⁸) in the combined sample. This association links language-related common genetic variation in the general population to a potential autism susceptibility locus and a linkage region for dyslexia, speech-sound disorder and reading.
The contribution of common genetic influences is, although modest, supported by genome-wide complex trait analysis (meta-GCTA h²(15–18 months) = 0.13, meta-GCTA h²(24–30 months) = 0.14) and in concordance with additional twin analysis (5,733 pairs of European descent, h²(24 months) = 0.20).

1 Medical Research Council Integrative Epidemiology Unit, University of Bristol, Oakfield House, 15-23 Oakfield Grove, Bristol BS8 2BN, UK.
2 School of Oral and Dental Sciences, University of Bristol, Lower Maudlin Street, Bristol BS1 2LY, UK.
3 School of Experimental Psychology, University of Bristol, 12a Priory Road, Bristol BS8 1TU, UK.
4 Generation R Study Group, Erasmus MC-University Medical Centre, Postbus 2040, 3000 CA Rotterdam, The Netherlands.
5 Department of Child and Adolescent Psychiatry/Psychology, Erasmus MC-University Medical Centre, Postbus 2060, 3000 CB Rotterdam, The Netherlands.
6 Telethon Kids Institute, Centre for Child Health Research, University of Western Australia, 100 Roberts Road, Subiaco, Western Australia 6008, Australia.
7 Department of Psychology, University of Warwick, Coventry CV4 7AL, UK.
8 Medical Research Council, Social, Genetic and Developmental Psychiatry Centre, Institute of Psychiatry, King’s College London, De Crespigny Park, Denmark Hill, London SE5 8AF, UK.
9 Department of Genetics, Evolution and Environment, UCL, UCL Genetics Institute, Darwin Building, Gower Street, London WC1E 6BT, UK.
10 Department of Epidemiology and Biostatistics, Medical Research Council (MRC) Public Health England (PHE) Centre for Environment and Health, School of Public Health, Imperial College London, Norfolk Place, London W2 1PG, UK.
11 Bristol Speech and Language Therapy Research Unit, University of the West of England, Frenchay Hospital, Frenchay Park Road, BS16 1LE Bristol, UK.
12 School of Women’s and Infants’ Health, University of Western Australia, 374 Bagot Road, Subiaco, Western Australia 6008, Australia.
13 School of Social and Community Medicine, University of Bristol, Canynge Hall, 39 Whatley Road, Bristol BS8 2PS, UK.
14 University of Queensland Diamantina Institute, Translational Research Institute, University of Queensland, 37 Kent Street Woolloongabba, Queensland 4102, Australia.
15 Department of Epidemiology, Erasmus MC-University Medical Centre, Postbus 2040, 3000 CA Rotterdam, The Netherlands.
16 Department of Internal Medicine, Erasmus MC-University Medical Centre, Postbus 2040, 3000 CA Rotterdam, The Netherlands.
17 Department of Psychological Sciences, Birkbeck, University of London, Malet Street, London WC1E 7HX, UK.
18 Institute for Translational Medicine and Therapeutics, University of Pennsylvania School of Medicine, 3400 Civic Center Boulevard, Building 421, Philadelphia, Pennsylvania 19104-5158, USA.
19 Department of Speech and Hearing Sciences, University of New Mexico, 1700 Lomas Boulevard NE Suite 1300, Albuquerque, New Mexico 87131, USA.
20 Faculty of Humanities, Logopedics, Child Language Research Center, University of Oulu, BOX 1000, Oulu 90014, Finland.
21 Mid Sweden University Department for Psychology/Mittuniversitetet Avdelningen för psykologi, 83125 Östersund, Sweden.
22 Department of Pediatrics, Erasmus MC-University Medical Centre, Postbus 2060, 3000 CB Rotterdam, The Netherlands.
23 Unit of Primary Care, Oulu University Hospital, Kajaanintie 50, PO Box 20, FI-90220, Oulu 90029, Finland.
24 Department of Children and Young People and Families, National Institute for Health and Welfare, Aapistie 1, Box 310, FI-90101 Oulu, Finland.
25 Institute of Health Sciences, University of Oulu, PO Box 5000, Oulu FI-90014, Finland.
26 Biocenter Oulu, University of Oulu, PO Box 5000, Aapistie 5A, Oulu FI-90014, Finland.

* These authors contributed equally to this work. Correspondence and requests for materials should be addressed to B.S.P. (email: [email protected]).
NATURE COMMUNICATIONS | 5:4831 | DOI: 10.1038/ncomms5831 | www.nature.com/naturecommunications
© 2014 Macmillan Publishers Limited. All rights reserved.

The number of distinct spoken words is a widely used measure of early language abilities, which manifests during infancy1. Word comprehension (known as receptive language) in typically developing children starts at the age of about 6–9 months2, and the spontaneous production of words (known as expressive language) emerges at about 10–15 months1,3. During the next months the accumulation of words is typically slow, but then followed by an increase in rate, often quite sharp, around 14–22 months of age (‘vocabulary spurt’)1,4. As development progresses, linguistic proficiency becomes more advanced, with two-word combinations (18–24 months of age)1,3 and more complex grammatical structures (24–36 months of age)1,3 arising, accompanied by the steady increase in vocabulary size. Expressive vocabulary is therefore considered to be a rapidly changing phenotype, especially between 12 and 24 months5, with zero size at birth, ~50 words at 15–18 months1,3, ~200 words at 18–30 months1,3, ~14,000 words at 6 years of age3,4 and ≥50,000 words in high school graduates6,7.

Twin analyses of cross-sectional data suggest that expressive vocabulary at ~24 months is modestly heritable (h² = 0.16–0.38)8,9, and longitudinal twin analyses have reported an increase in heritability of language-related factors during development (h² = 0.47–0.63, ≥7 years of age)10. Large-scale investigations of common genetic variation underlying growth in language skills, however, are challenging owing to the complexity and varying nature of the phenotype. This is coupled with a change in psychological instruments, which are used to assess these abilities with progressing age. Current genome-wide association studies (GWASs) using cross-sectional data on language abilities in childhood and adolescence have failed to identify robust signals of genome-wide association11,12, and genes influencing earlier, less-complex linguistic phenotypes are currently unknown.

To attempt to understand genetic factors involved in language development during infancy and early childhood, we perform a GWAS and follow-up study of expressive vocabulary scores in independent children of European descent from the general …

… association signal was observed at rs7642482 on chromosome 3p12.3 near ROBO2 (P = 9.5 × 10⁻⁷, Supplementary Table 1) and for the late phase at rs11742977 on chromosome 5q22.1 within CAMK4 (P = 3.5 × 10⁻⁷, Supplementary Table 2). All independent variants from the discovery analysis (associated P ≤ 10⁻⁴, Supplementary Tables 1 and 2), including these SNPs, were taken forward to a follow-up study (Methods). This included 2,038 18-month-old Dutch-speaking children for the early phase and 4,520 24–30-month-old Dutch or English-speaking children for the later phase (Supplementary Data 1). For four independent loci from the early phase GWAS (rs7642482, rs10734234, rs11176749 and rs1654584), but none for the later phase analysis, we found evidence for association within the follow-up cohort (P < 0.05), assuming the same direction of effect as in the discovery sample (Table 1; Supplementary Tables 1–4). In the combined analysis of all available samples (Table 1; Fig. 3a–d) rs7642482 on chromosome 3p12.3 near ROBO2 (the strongest signal in the discovery cohort) reached the genome-wide significance level (P = 1.3 × 10⁻⁸), and the three other signals approached the suggestive level (rs10734234 on chromosome 11p15.2 near INSC, P = 1.9 × 10⁻⁷; rs11176749 on chromosome 12q15 near CAND1, P = 7.2 × 10⁻⁷; and rs1654584 on chromosome 19p13.3 within DAPK3, P = 3.4 × 10⁻⁷).

Each of these four polymorphisms explained only a small proportion of the phenotypic variance (adjusted regression R²: for rs7642482 = 0.34–0.35%, rs10734234 = 0.27–0.35%, rs11176749 = 0.25–0.27% and rs1654584 = 0.22–0.49%) in both the discovery and the follow-up cohort, but together the four SNPs accounted for >1% of the variation in early expressive vocabulary scores (joint adjusted regression R² = 1.10–1.45%). For the SNP reaching genome-wide significance, rs7642482, each increase in the minor G-allele was associated with lower expressive vocabulary, although, due to the rank-transformation, an interpretation of the magnitude of the genetic effect is not informative. An empirical estimate of the genetic effect in the discovery sample, suggested a decrease of 0.098 s.d.
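
The remark about the rank-transformation can be made concrete with a small sketch. The text only says that scores were rank-transformed; the specific variant below (a rank-based inverse normal transformation with average ranks for ties and a Blom-type offset of 0.375) is an assumption for illustration, not necessarily the procedure the authors used. The function names and the example word counts are hypothetical.

```python
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # mean of the 1-based ranks in the run
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def rank_inverse_normal(values, offset=0.375):
    """Map raw scores to normal quantiles of their ranks.

    offset=0.375 is the Blom-type constant (an assumption, see lead-in):
    quantile = (rank - offset) / (n - 2 * offset + 1).
    """
    n = len(values)
    normal = NormalDist()
    return [
        normal.inv_cdf((rank - offset) / (n - 2 * offset + 1))
        for rank in average_ranks(values)
    ]


# Hypothetical vocabulary counts (words produced) for six children.
word_counts = [0, 3, 3, 12, 50, 210]
scores = rank_inverse_normal(word_counts)

# Inflating the largest raw score leaves every transformed score unchanged,
# because only the ordering of the children enters the transformation.
inflated = rank_inverse_normal([0, 3, 3, 12, 50, 2100])
```

Because only the ordering of scores survives such a transformation, an effect estimated on the transformed scale is expressed in standard-deviation units of the transformed score and does not translate directly into a number of words.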