Supplementary Methods s9


SUPPLEMENTARY METHODS

1) RESEARCH DESIGN

Please see supplementary table 1 for details on clinical characteristics of the 5 GWAS datasets.

Singapore Prospective Study Program (SP2)

SP2 is a cross-sectional study of adult Singaporean Chinese, Malay and Asian-Indian samples, aged between 24 and 95 years. Disproportionate sampling stratified by ethnicity was used to increase the sample size of the Malay and Asian-Indian ethnic groups, and samples from 5,499 Chinese, 1,405 Malays and 1,138 Asian-Indians were available. Detailed methodology for SP2 has been reported previously [1]; for this study, only adult Chinese samples were used.

Singapore Cohort study Of the Risk factors for Myopia (SCORM)

SCORM [2, 3] is a prospective study consisting of 1979 children from grades 1-3 (age 7 to 9) who were recruited from three schools in Singapore (located in the Eastern, Western and Northern regions). Children with serious medical conditions such as heart disorders and leukaemia were excluded from the study (n=94). Samples were followed up yearly and, for the purpose of this study, the BMI measures taken at the visits when the children were 9 years of age were used to minimize the pubertal effects of later paediatric age.

Singapore Diabetes Cohort Study (SDCS)

The SDCS includes Singaporean Chinese, Malay, and Asian-Indian adults with type 2 diabetes and has been published previously [4]. These individuals were recruited from National Healthcare Group Polyclinics, National University Hospital Singapore and the Tan Tock Seng Hospital. The study had a participation response rate of 90% with over 5000 patients recruited.

Singapore Malay Eye Study (SiMES)

SiMES is a cross-sectional study of Singaporean Malay adults (n=3280), aged between 40 and 80 years. Briefly, individuals from 15 residential districts in the southwestern part of Singapore were recruited in an age-stratified manner; the methodology has been described previously [5].

Singapore Indian Eye Study (SINDI)

SINDI is a cross-sectional study of Singaporean Asian-Indian adults (n=3400), aged between 40 and 80 years, recruited in an age-stratified manner. This study is part of the Singapore Indian Chinese Cohort Eye Study and detailed methodology has been previously described [6].

Study variable

Height (m) and weight (kg) were measured similarly in all datasets using standard protocols and used to derive BMI as weight divided by height squared (kg/m²). The rank-based inverse normal transformation was applied to raw BMI values from each dataset to normalize the data and enable comparison of results across datasets. These procedures were performed in Intercooled STATA (version 8.2).
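The derivation of the study variable can be illustrated with a minimal Python sketch. It is not the STATA procedure used in the study. The function names are hypothetical, and the rank offset c = 0.5 and the averaging of tied ranks are assumptions, since the exact variant of the transformation is not stated.

```python
from statistics import NormalDist


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def rank_inverse_normal(values, c=0.5):
    """Map values to normal scores via the quantile (rank - c) / (n - 2c + 1).

    Tied values receive the average of their ranks. The offset c = 0.5 is an
    assumption; the source does not state which offset was used.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in ranks]
```

Applied within each dataset separately, the resulting scores are on a common standard normal scale, which is what allows effect sizes to be compared across datasets.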

2) GENOTYPING AND QUALITY CONTROL

High-throughput genotyping of all samples was carried out using Illumina Beadchips®.

GWAS data were available for Chinese samples from SCORM, SP2 and SDCS. 1155 Chinese children from SCORM were genotyped using Illumina Hap550 (n=418) and Hap550Duo arrays (n=737). 3066 Chinese adults from SP2 were genotyped using Illumina 1Mduov3 (n=1016), HumanHap 610Quad (n=1467) and Hap550 arrays (n=583). Chinese adult diabetic cases from SDCS were genotyped using Illumina 1Mduov3 (n=1015) and HumanHap 610Quad arrays (n=1195). All SiMES Malay adults (n=3072) and all SINDI Asian-Indian adults (n=2953) were genotyped using HumanHap 610Quad arrays.

Supplementary table 2 provides details on sample QC measures. Quality control measures were carried out in a chip-wise manner in all studies except SCORM, where data from the Illumina Hap550 and Hap550Duo arrays had previously been combined. Briefly, samples were excluded based on sample call-rates (<95%) and extreme heterozygosities (<0.25 or >0.35). Identity-by-state measures were computed by pairwise comparison of samples to detect cryptic relatedness such as monozygotic twins, full-sibling pairs and parent-offspring pairs. One sample from each relationship was excluded from further analysis and, where duplicate samples had been genotyped on different SNP arrays, the sample from the denser array was retained.
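The call-rate and heterozygosity exclusions can be sketched as a per-sample filter. This is a minimal illustration, not the pipeline used in the study. The function name and the genotype encoding (0/1/2 allele counts, None for a missing call) are hypothetical, and heterozygosity is assumed here to mean the proportion of heterozygous calls among called genotypes. The identity-by-state step is not shown.

```python
def passes_sample_qc(genotypes, min_call_rate=0.95, het_range=(0.25, 0.35)):
    """Return True if a sample passes the call-rate and heterozygosity filters.

    genotypes: list of 0/1/2 allele counts, with None for a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return False  # nothing was called, so the sample cannot pass
    call_rate = len(called) / len(genotypes)
    heterozygosity = sum(1 for g in called if g == 1) / len(called)
    low, high = het_range
    return call_rate >= min_call_rate and low <= heterozygosity <= high
```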

To prevent confounding of study results, population structure was ascertained using principal component analysis (PCA) [7] and by analysing plots with four reference panels from the International HapMap Project [8] and Singaporean Chinese, Malay and Asian-Indian samples from the Singapore Genome Variation Project [9]. Outliers whose genetically inferred ethnic membership was discordant with their reported ethnicity, and admixed samples, were subsequently detected and removed.

Samples in the PCA plots of SiMES Malays and SINDI Asian-Indians did not cluster tightly even after sample removal; to control for possible inflation of results, we adjusted for the first two principal components in SiMES and the first three in SINDI (Supplementary figures 1 and 2). In the SCORM dataset, the Eigenstrat PCA program [7] with a threshold set at eight standard deviations was used to remove samples that were possible outliers. Samples with discrepant genetically inferred and reported genders were also removed. After sample QC procedures, 1006 SCORM samples, 2431 SP2 samples, 1992 SDCS samples, 2522 SiMES samples and 2531 SINDI samples with BMI data, and 2429 SP2 samples with waist phenotypes, were available.

A breakdown of quality-control measures for SNPs is provided in Supplementary table 3. Sex-chromosome and mitochondrial SNPs were removed, together with gross HWE outliers (p-value < 1 × 10⁻⁴). SNPs that were monomorphic or rare (MAF < 1%) and SNPs with low call-rates (<95%) were also excluded. In datasets where more than one chip was used for genotyping, Mantel-extension tests were carried out to detect differences in allele frequencies of SNPs between the chips [10]. 45, 62 and 69 SNPs were detected in SCORM, SP2 and SDCS respectively and removed from analyses.
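The SNP-level filters can be sketched from genotype counts alone. The sketch below is illustrative only. The function names are hypothetical, and the HWE check uses a 1-df chi-square goodness-of-fit test, which is an assumption because the text does not say which HWE test was applied (an exact test is a common alternative). The Mantel-extension comparison between chips is not shown.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 df.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(n_aa, n_ab, n_bb, n_missing,
                  min_maf=0.01, min_call_rate=0.95, hwe_alpha=1e-4):
    """Apply the MAF, call-rate and HWE thresholds to one SNP."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    call_rate = n_called / (n_called + n_missing)
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)
    if maf < min_maf or call_rate < min_call_rate:
        return False  # monomorphic, rare or poorly called
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_alpha
```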

3) IMPUTATION

Imputation was performed using IMPUTE v0.5.0 [11]. For all Chinese samples (SCORM, SP2, SDCS), genotype calls were based on HapMap Phase 1 and 2 East-Asian samples (CHB and JPT) on NCBI build 36 and dbSNP 126 [8]. For the Malay and Asian-Indian samples, all HapMap reference panels (CEU, YRI and JPT+CHB) on NCBI build 36, release 22, dbSNP 126 were used for imputation to better capture local patterns of haplotype variation [12, 13]. Actual genotyped calls were restored in the files, and only imputed SNPs with a posterior probability ≥ 0.90 and a call-rate ≥ 95% were used. After imputation and QC procedures, 1,816,934 SNPs from SCORM, 1,527,744 SNPs from SINDI, 1,555,870 SNPs from SiMES, 1,745,788 SNPs from SP2 and 1,791,569 SNPs from SDCS were available for subsequent analyses (Supplementary table 3).
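One plausible reading of the posterior-probability and call-rate thresholds is sketched below. It assumes that a genotype is called only when its largest posterior probability reaches 0.90 and that the SNP call-rate is then the fraction of samples with a call. The function names and the input layout (one probability triple per sample) are hypothetical and do not reproduce the IMPUTE file format.

```python
def call_genotype(probs, threshold=0.90):
    """Call the most probable genotype, or None if it is below the threshold.

    probs: (P(AA), P(AB), P(BB)); the returned call is the B-allele count.
    """
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None


def keep_imputed_snp(prob_rows, threshold=0.90, min_call_rate=0.95):
    """Return (keep, calls) for one imputed SNP across all samples."""
    calls = [call_genotype(p, threshold) for p in prob_rows]
    call_rate = sum(c is not None for c in calls) / len(calls)
    return call_rate >= min_call_rate, calls
```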

4) STATISTICAL ANALYSIS

SNP-based trend tests for BMI Z-score associations were carried out in a chip-wise manner. Data were subsequently combined to derive overall results in SP2 and SDCS. Associations between SNPs and quantitative phenotypes were analysed in an additive model and adjusted for age, age-squared, sex (sex was the only covariate in SCORM, as all data used were from age 9) and population stratification (first two principal components in SiMES and first three principal components in SINDI). These analyses were performed using the genome association toolset SNPTEST (version 1.1.5) [11]. For analyses of imputed SNPs, the -proper option was included and SNP information scores were required to be at least 0.5.
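The additive model amounts to a linear regression of the BMI Z-score on allele dosage plus covariates. The sketch below fits it by ordinary least squares in plain Python. It is a simplified stand-in, not SNPTEST: it does not reproduce the -proper treatment of genotype uncertainty, it uses a normal approximation for the p-value, and all function names are hypothetical.

```python
import math


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def additive_association(z, dosage, covariates):
    """Regress z on allele dosage (0 to 2) plus covariates.

    covariates: one tuple per sample (for example age, age squared, sex, PCs).
    Returns (beta, se, p) for the dosage term.
    """
    n = len(z)
    x = [[1.0, d] + list(c) for d, c in zip(dosage, covariates)]
    k = len(x[0])
    xtx = [[sum(row[i] * row[j] for row in x) for j in range(k)] for i in range(k)]
    xtz = [sum(row[i] * zi for row, zi in zip(x, z)) for i in range(k)]
    coef = solve(xtx, xtz)
    resid = [zi - sum(c * v for c, v in zip(coef, row)) for row, zi in zip(x, z)]
    sigma2 = sum(r * r for r in resid) / (n - k)
    # Variance of the dosage coefficient: sigma^2 times the matching
    # diagonal element of the inverse of X'X.
    unit = [1.0 if i == 1 else 0.0 for i in range(k)]
    se = math.sqrt(sigma2 * solve(xtx, unit)[1])
    beta = coef[1]
    p = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided, normal approximation
    return beta, se, p
```

Centring age before squaring it keeps the design matrix well conditioned in a hand-rolled solver like this one.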

The association of each SNP genotype with BMI Z-score was quantified by the regression slope (β), the standard error of β (se) and the association p-value. Genomic control (GC) correction was applied to the association results from each chip to further control for possible residual inflation.
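Genomic control in its usual form estimates an inflation factor λ as the median of the observed 1-df chi-square statistics divided by the expected median (about 0.455), and then inflates each standard error by √λ. The sketch below follows that standard recipe. The function name is hypothetical, and leaving results unchanged when λ < 1 is an assumption rather than something stated in the text.

```python
import math
from statistics import median

CHI2_1DF_MEDIAN = 0.4549364  # median of a chi-square distribution with 1 df


def genomic_control(betas, ses):
    """Return (lambda_gc, corrected_ses, corrected_pvalues)."""
    chi2 = [(b / s) ** 2 for b, s in zip(betas, ses)]
    lam = median(chi2) / CHI2_1DF_MEDIAN
    scale = math.sqrt(max(lam, 1.0))  # assumption: no correction when lambda < 1
    new_ses = [s * scale for s in ses]
    pvals = [math.erfc(abs(b / s) / math.sqrt(2)) for b, s in zip(betas, new_ses)]
    return lam, new_ses, pvals
```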

Individual study results were subsequently pooled (n=10482) using general inverse variance-weighted meta-analysis, assuming a fixed-effects model, to derive overall pooled estimates and two-sided p-values. We further corrected for genomic inflation (λ=1.046) after meta-analysis. Cochran’s Q and I² measures were used to assess between-study heterogeneity, and SNPs with Qpval < 0.1 were considered to show significant heterogeneity [14]. Forest plots were used to assess variations in effect sizes. All meta-analysis procedures were performed using the meta package in STATA.
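The pooled estimate and heterogeneity measures for a single SNP can be sketched as follows. This illustrates the standard inverse variance-weighted fixed-effects formulas, not the STATA meta package. The function name is hypothetical, and the p-value for Q (a chi-square test with k − 1 degrees of freedom for k studies) is not computed here.

```python
import math


def fixed_effects_meta(betas, ses):
    """Inverse variance-weighted fixed-effects meta-analysis for one SNP.

    betas, ses: per-study effect estimates and their standard errors.
    """
    weights = [1.0 / s ** 2 for s in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided p-value
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: percentage of variation attributable to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {"beta": beta, "se": se, "p": p, "Q": q, "I2": i2}
```

For example, two studies with effects 0.1 and 0.3 and equal standard errors of 0.1 pool to an effect of 0.2, with Q = 2 and I² = 50%.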

5) SNP SELECTION AND VALIDATION OF KNOWN LOCI

32 index SNPs from the recent GWAS for obesity [15] and 10 SNPs from 6 other loci discovered in GWAS that used European early-onset or extreme obesity samples [16, 17] were identified. SNPs that had QC issues in our datasets, or had been neither imputed nor genotyped, were not assessed (Supplementary table 5). Previously reported proxy SNPs [18], rs10913469 and rs7647305, were evaluated for the SEC16B and ETV5 regions, respectively, as the index SNPs [15] had poor call-rates (Supplementary table 5). 2 other SNPs, rs13107325 (SLC39A8 region at chromosome 4) and rs10508503 (PTER region at chromosome 10), were rare or monomorphic among the Singaporean populations and were excluded from analysis (Supplementary table 5).

In all, we evaluated 31 index SNPs from 29 obesity regions for BMI associations among our datasets. These regions correspond to FTO at chromosome 16; melanocortin 4 receptor (MC4R) at chromosome 18; glucosamine-6-phosphate deaminase 2 (GNPDA2) at chromosome 4; transmembrane protein 18 (TMEM18) at chromosome 2; glutaminyl-peptide cyclotransferase-like (QPCTL) and gastric inhibitory polypeptide receptor (GIPR) at chromosome 19; brain-derived neurotrophic factor (BDNF) at chromosome 11; ets variant 5 (ETV5) at chromosome 3; mitogen-activated protein kinase kinase 5 (MAP2K5) and SKI family transcriptional corepressor 1 (SKOR1) at chromosome 15; SEC16 homolog B (SEC16B) at chromosome 1; neuronal growth regulator 1 (NEGR1) at chromosome 1; transcription factor AP-2β (TFAP2B) at chromosome 6; FLJ35779 and 3-hydroxy-3-methylglutaryl-CoA reductase (HMGCR) at chromosome 5; nudix (nucleoside diphosphate linked moiety X)-type motif 3 (NUDT3) and high mobility group AT-hook 1 (HMGA1) at chromosome 6; rab and DnaJ domain containing (RBJ), adenylate cyclase 3 (ADCY3) and proopiomelanocortin (POMC) at chromosome 2; Fas apoptotic inhibitory molecule 2 (FAIM2) at chromosome 12; TNNI3 interacting kinase (TNNI3K) at chromosome 1; potassium channel tetramerisation domain containing 15 (KCTD15) at chromosome 19; mitochondrial carrier homolog 2 (MTCH2), NADH dehydrogenase (ubiquinone) Fe-S protein 3 (NDUFS3) and CUG triplet repeat, RNA binding protein 1 (CUGBP1) at chromosome 11; leucine rich repeat neuronal 6C (LRRN6C) at chromosome 9; polypyrimidine tract binding protein 2 (PTBP2) at chromosome 1; cell adhesion molecule 2 (CADM2) at chromosome 3; SH2B adaptor protein 1 (SH2B1), apolipoprotein B48 receptor (APOB48R), sulfotransferase family, cytosolic, 1A, phenol-preferring, member 2 (SULT1A2), AC138894.2, ataxin 2-like (ATXN2L) and Tu translation elongation factor, mitochondrial (TUFM) at chromosome 16; transmembrane protein 160 (TMEM160) and zinc finger CCCH-type containing (ZC3H4) at chromosome 19; ribosomal protein L27a (RPL27A) and tubby (TUB) at chromosome 11; tankyrase (TNKS) and methionine sulfoxide reductase A (MSRA) at chromosome 8; v-maf musculoaponeurotic fibrosarcoma oncogene homolog (MAF) at chromosome 16; Niemann-Pick type C1 (NPC1) at chromosome 18; prolactin (PRL) at chromosome 6; and serologically defined colon cancer antigen 8 (SDCCAG8) at chromosome 1.

All power calculations were carried out using QUANTO [19]. We further tested regions that replicated among the Singapore datasets for involvement in known pathways using Ingenuity Pathway Analysis (IPA) version 8.7 (Ingenuity® Systems, www.ingenuity.com).

References:

1. Teslovich TM, Musunuru K, Smith AV, Edmondson AC, Stylianou IM, Koseki M et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature 2010; 466(7307): 707-713.

2. Saw SM, Shankar A, Tan SB, Taylor H, Tan DT, Stone RA, et al. A cohort study of incident myopia in Singaporean children. Invest Ophthalmol Vis Sci. 2006; 47(5): 1839-1844.

3. Sabanayagam C, Shankar A, Chong YS, Wong TY, Saw SM. Breast-feeding and overweight in Singapore school children. Pediatr Int. 2009; 51(5): 650-656.

4. Tan JT, Ng DP, Nurbaya S, Ye S, Lim XL, Leong H et al. Polymorphisms identified through genome-wide association studies and their associations with type 2 diabetes in Chinese, Malays, and Asian-Indians in Singapore. J Clin Endocrinol Metab. 2010; 95(1): 390-397.

5. Tan JT, Dorajoo R, Seielstad M, Sim XL, Ong RT, Chia KS et al. FTO variants are associated with obesity in the Chinese and Malay populations in Singapore. Diabetes. 2008; 57(10): 2851-2857.

6. Lavanya R, Jeganathan VS, Zheng Y, Raju P, Cheung N, Tai ES et al. Methodology of the Singapore Indian Chinese Cohort (SICC) eye study: quantifying ethnic variations in the epidemiology of eye diseases in Asians. Ophthalmic Epidemiol. 2009; 16(6): 325-336.

7. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006; 38(8): 904-909.

8. International HapMap Consortium. The International HapMap Project. Nature 2003; 426(6968): 789-796.

9. Teo YY, Sim X, Ong RT, Tan AK, Chen J, Tantoso E et al. Singapore Genome Variation Project: a haplotype map of three Southeast Asian populations. Genome Res. 2009; 19(11): 2154-2162.

10. The Wellcome Trust Case Control Consortium. Genome-wide association study of 14000 cases of seven common diseases and 3000 shared controls. Nature 2007; 447: 661-675.

11. Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies via imputation of genotypes. Nat Genet 2007; 39: 906-913.

12. Chambers JC, Elliott P, Zabaneh D, Zhang W, Li Y, Froguel P et al. Common genetic variation near MC4R is associated with waist circumference and insulin resistance. Nat Genet 2008; 40: 716–718.

13. Elliott P, Chambers JC, Zhang W, Clarke R, Hopewell JC, Peden JF et al. Genetic loci associated with C-reactive protein levels and risk of coronary heart disease. JAMA 2009; 302(1): 37-48.

14. Zeggini E, Ioannidis JPA. Meta-analysis in genome-wide association studies. Pharmacogenomics 2009; 10(2): 191–201.

15. Speliotes EK, Willer CJ, Berndt SI, Monda KL, Thorleifsson G, Jackson AU et al. Association analyses of 249,796 individuals reveal 18 new loci associated with body mass index. Nat Genet 2010; 10 October 2010; doi:10.1038/ng.686.

16. Meyre D, Delplanque J, Chèvre JC, Lecoeur C, Lobbens S, Gallina S et al. Genome-wide association study for early-onset and morbid adult obesity identifies three new risk loci in European populations. Nat Genet 2009; 41(2): 157-159.

17. Scherag A, Dina C, Hinney A, Vatin V, Scherag S, Vogel CI et al. Two new loci for body-weight regulation identified in a joint analysis of genome-wide association studies for early-onset extreme obesity in French and German study groups. PLoS Genet 2010; 6(4): e1000916.

18. Thorleifsson G, Walters GB, Gudbjartsson DF, Steinthorsdottir V, Sulem P, Helgadottir A et al. Genome-wide association yields new sequence variants at seven loci that associate with measures of obesity. Nat Genet 2009; 41(1): 18-24.

19. Gauderman WJ, Morrison JM. QUANTO 1.1: A computer program for power and sample size calculations for genetic-epidemiology studies. 2006, http://hydra.usc.edu/gxe/.
