Genetic Variation in the FMO2 Gene: Evolution & Functional Consequences

Maha Saleh Al-Sulaimani
School of Biological and Chemical Sciences
Queen Mary, University of London
Submitted for the degree of Doctor of Philosophy
Supervisor: Prof. Ian R. Phillips

Declaration of Ownership

I, Maha Saleh Al-Sulaimani, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis.

Abstract

Flavin-containing monooxygenase 2 (FMO2) is involved in the metabolism of xenobiotics, including therapeutic drugs. FMO2 exists in two forms: a functional and a non-functional form. The functional allele is found only in Africa and in individuals of recent African origin. The aims of the project were to determine the frequency of functional FMO2 in Africa and to obtain insights into the evolutionary history of the FMO2 gene. Six hundred and eighty-nine samples from nine African population groups were genotyped for six high-frequency SNPs, and the genetic diversity within FMO2 was characterized by sequencing 3.44 kb of genomic DNA, encompassing the entire coding sequence and some flanking intronic sequences, in 48 African individuals. Haplotypes were inferred using Phase, and the relationship between mutations was revealed using reduced-median and median-joining networks. Test statistics were used to determine whether the genetic variation is compatible with neutral evolution. Genotyping indicated that deleterious SNPs occur mostly on a non-functional allele and that the frequencies of three of them were significantly different (P < 0.05) among populations. Resequencing identified 32 variants. Genetree was used to estimate the time to the most recent common ancestral sequence (~0.928 million years) and the ages of some of the mutations. Results indicate that the frequency of full-length 23238C alleles is relatively uniform across sub-Saharan Africa.
Interestingly, this is not the case for the inferred potentially functional 23238C alleles, whose frequency differed significantly (P < 0.05) across sub-Saharan Africa. The results also provide evidence that the frequency of functional FMO2 in East and West Africa is high (≥0.54), which has important implications for therapy with drugs that are substrates for FMO2. A Ka/Ks ratio > 1 and low nucleotide sequence diversity in the intronic regions of 23238C alleles indicate a possible selective sweep.

Abbreviations

ABI, Applied Biosystems
CEPH, Centre d'Étude du Polymorphisme Humain
DNA, Deoxyribonucleic Acid
DnaSP, DNA Sequence Polymorphism
ETA, Ethionamide
EGP, Environmental Genome Project
ELB, Estimation Likelihood Bayesian
EM, Expectation-Maximization
FMO2, Flavin-containing Monooxygenase 2
FAD-OOH, 4a-Hydroperoxyflavin
FEL, Fixed Effects Likelihood
HapMap-HCB, HapMap Han Chinese Panel
HapMap-Ceu, HapMap European Panel from Utah
HapMap-JPT, HapMap Japanese (Tokyo) Panel
HGDP, Human Genome Diversity Panel
HWE, Hardy-Weinberg Equilibrium
INDEL, Insertion-Deletion Polymorphism
KYR, Thousand Years
LD, Linkage Disequilibrium
MJ, Median-Joining
MKT, McDonald-Kreitman Test
MYR, Million Years
NCBI, National Center for Biotechnology Information
NIEHS, National Institute of Environmental Health Sciences
PCR, Polymerase Chain Reaction
REL, Random Effects Likelihood
RM, Reduced-Median
RNA, Ribonucleic Acid
SMOGD, Software for the Measurement of Genetic Diversity
SNP, Single-Nucleotide Polymorphism
SSCP, Single Strand Conformation Polymorphism
STRPs, Short Tandem Repeat Polymorphisms
TAZ, Thioacetazone
TCGA, The Centre for Genetic Anthropology
TMA, TMAU, Trimethylamine and Trimethylaminuria
TMRCA, Time to the Most Recent Common Ancestor
UCL, University College London
UK, United Kingdom
USA, United States of America

Acknowledgements

I owe special thanks to King Saud University of Riyadh, Saudi Arabia for funding me, and to all individuals involved with donating and
collecting samples, especially Dr Ayele Tarekegn, Dr Krishna Veeramah and Dr Sarah Browning.

I would like to thank my supervisor, Professor Ian Phillips, for his help and support throughout this PhD. He has provided a great deal of valuable advice, which I highly appreciate. I am also grateful for all the guidance that he gave me along the way. Thanks also go to my advisory panel, Professor Richard Nichols and Dr Steve Le Comber, for their valuable advice and continuous guidance. I am grateful to Dr Rosemary Ekong for all her help and support in getting me started with genotyping and for all her advice and kind words. I would also like to thank Dr Neil Bradman for providing the opportunity to use the samples available at the TCGA for my research, as well as for giving me some useful guidelines. I also appreciate the help of Dr Sarah Browning, Olivia Creemer, Ripu Bairns, Chris Plaster, Dr Krishna Veeramah and Dr Kate Ingram. Thank you to all at Queen Mary and UCL who have helped me.

I will be eternally grateful to my amazingly patient and caring husband Majed for his selfless support and encouragement throughout this journey, all of which have been a motivating force for me; without him this would not have been possible. He was always there, ready to advise and cheer me up when I needed it. To our children, Khalid, Deema and Dania, thank you for making me laugh when I least felt like it. Last but not least, thanks go out to my parents, Saleh (Assoc Prof, Consultant Surgeon, Dr Med Ret) and Ingeborg Al-Sulaimani, for teaching me that education is of the utmost importance in any person's life.

Table of Contents

1. Introduction
  1.1 FMOs
  1.2 Human FMO Gene Family
  1.3 Mechanism of Action of FMOs
  1.4 Structure of FMO
    1.4.1 The three-dimensional structure of yeast and bacterial FMOs
  1.5 Function of FMOs
  1.6 FMO gene expression
    1.6.1 FMO1 gene
    1.6.2 FMO3 gene
    1.6.3 FMO4 gene
    1.6.4 FMO5 gene
    1.6.5 FMO2 gene
  1.7 FMO substrate specificity
    1.7.1 Endogenous and xenobiotic substrates
      1.7.1.1 FMO1 protein
      1.7.1.3 FMO3 protein
      1.7.1.4 FMO4 protein
      1.7.1.5 FMO5 protein
      1.7.1.2 FMO2 protein
  1.8 Association of FMO with human disease
  1.9 FMO2 variants
  1.10 Natural selection
  1.11 Human origins and evolution
  1.12 Geographical features of sub-Saharan Africa
  1.13 Sub-Saharan African populations
    1.13.1 West-African populations
      1.13.1.1 Ethnic groups from Cameroon
        1.13.1.1.1 The Fulbe
        1.13.1.1.2 Shuwa Arabs
        1.13.1.1.3 Mambila
      1.13.1.2 Ethnic groups from Ghana
        1.13.1.2.1 The Asante
        1.13.1.2.2 The Bulsa
      1.13.1.3 Ethnic groups from Senegal
        1.13.1.3.1 Manjak
    1.13.2 Central-east African populations
      1.13.2.1 Ethnic groups from Tanzania
        1.13.2.1.1 The Chagga
    1.13.3 South-east African populations
      1.13.3.1 Ethnic groups from Malawi
        1.13.3.1.1 Malawi
      1.13.3.2 Ethnic groups from Mozambique
        1.13.3.2.1 Sena
    1.13.4 East-African populations
      1.13.4.1 Ethnic groups from Ethiopia
        1.13.4.1.1 The Afar
        1.13.4.1.2 The Amhara
        1.13.4.1.3 The Anuak
        1.13.4.1.4 The Gurage
        1.13.4.1.5 The Nuer
        1.13.4.1.6 The Oromo
    1.13.5 The Bantu Expansion
  1.14 Aims
2. Methods and Materials
  2.1 Selection of Samples
    2.1.1 Criteria for selecting samples for genotyping
    2.1.2 Criteria for selecting samples for resequencing
  2.2 Sample collection
    2.2.1 Genotyping samples
    2.2.2 Resequencing samples
      2.2.2.1 West-African ascertainment plate
      2.2.2.2 East-African ascertainment plate
  2.3 Genotyping
    2.3.1 Adjusting the concentration of genomic DNA samples
    2.3.2 Design of primers for genotyping
    2.3.3 Principles of TaqMan genotyping assay
    2.3.4 TaqMan protocol
      2.3.4.1 TaqMan Genotyping of g.23238 C>T
  2.4 DNA resequencing
  2.5 Data analysis
    2.5.1 Genotyping Data
      2.5.1.1 Haplotype inference
      2.5.1.2 Genotypes
        2.5.1.2.1 Hardy-Weinberg equilibrium (HWE)
        2.5.1.2.2 Pearson’s chi-square test
        2.5.1.2.3 Network diagrams