Identifying Envirogenomic Signatures for Predicting the Clinical Outcomes of Crohn's Disease

Identifying Envirogenomic Signatures for Predicting the Clinical Outcomes of Crohn's Disease Author Nasir, Bushra Farah Published 2013 Thesis Type Thesis (PhD Doctorate) School School of Medical Sciences DOI https://doi.org/10.25904/1912/1046 Copyright Statement The author owns the copyright in this thesis, unless stated otherwise. Downloaded from http://hdl.handle.net/10072/366228 Griffith Research Online https://research-repository.griffith.edu.au Identifying Envirogenomic Signatures for Predicting the Clinical Outcomes of Crohn’s Disease Bushra Farah Nasir M. Medical Research (Biomedical Science) School of Medical Sciences Griffith Institute of Health and Medical Research Genomics Research Centre Griffith University Submitted in fulfilment for the Degree of Doctor of Philosophy April 2013 2 3 4 ABSTRACT A complex interplay between genetic susceptibility, environmental factors and clinical indicators seem to cause the development of Crohn’s disease (CD). Few disorders in clinical medicine are associated with as much chronic morbidity as CD. Genetic factors are the predetermined cause, whereas non-genetic factors seem to further trigger the development of CD. From epidemiological data, based on concordance statistics in family studies, via linkage analysis to Genome Wide Association Studies (GWAS) and Whole Genome Analysis (WGA), robust evidence have been gathered, implicating distinct genomic loci involved in genetic susceptibility to CD. Most recently, a meta-analysis has been able to implicate 71 distinct genomic loci that seem to be associated with CD development. A study published in the American Journal of Human Genetics has also recently identified more than 200 genes associated with CD, which is more than what have been found for any other disease so far. Monozygotic twins show approximately 50-60% disease concordance, with much lower rates in dizygotic twins (~10%), highlighting the role of both environmental and genetic components in the development of CD. In earlier decades, CD patients suffered a lack of effective treatment preferences, and patients with moderate to severe CD were often consigned to prolonged systemic corticosteroid therapy and surgery as their only options. From the late 1990s onward, there has been a significant improvement in CD treatment therapy, specifically the widespread adoption of maintenance of immunomodulaters and the advent of biologic agents. The understanding of the natural history of those patients at-risk of severe and complicated CD patient subgroups, particularly an early identification of 5 patients with the potential for severe disease and its associated complications, would ultimately determine an optimal clinical approach that incorporates appropriate risk- benefit assessment for disease modifying therapy. The advent of chips for high-throughput genotyping of single nucleotide polymorphisms (SNPs) in 2004 provided researchers with a powerful new tool for unlocking the genetic basis of complex human diseases. GWAS using SNP chips have since identified risk loci for many complex disease traits, though the amount of disease heritability explained by any single SNP has been typically very minor (odds ratios < 1.5). Whilst the SNP chip technologies supporting GWAS are now scientifically well established (more than 1 million SNPs can currently be typed rapidly in a single assay), the bioinformatics methods for GWAS are still underdeveloped and require more empirical experimentation with model traits before widespread application is possible. In particular, for accurately identifying disease risk factors, there is an urgent need to move beyond the single SNP analysis, and into the statistical assessment of multiple SNPs acting independently and/or interactively with other environmental and clinical risk factors. Association studies are a major tool for identifying genes conferring susceptibility to complex disorders. These traits and diseases are being termed as “complex” because both genetic and environmental factors contribute to the susceptibility of risk development. Extensive experiments in genetic studies for many complex disorders have confirmed that many different genetic variants control disease risk, with each variant having only a subtle effect. Therefore genomic data needs to be integrated with patient environmental and clinical information so that it can be interpreted by trained clinicians who can provide personalised predictive profiles for patients, thus allowing for a more specifically tailored treatment regimen. Furthermore, this will 6 allow clinical researchers to translate novel disease biomarkers and drug responses into improved treatments, which in turn can be tailored to each patient’s individual genome. It is anticipated that such developments would be readily applicable to CD along with other multifaceted diseases. This research project adapts the basic GWAS design towards identifying envirogenomic signatures of disease, with a major focus on CD. By combining clinical, environmental and genetic factors and analysing them collectively, “envirogenomic signatures” can be created. Through the use of a multi-factor analysis procedure, it is expected that a greater proportion of the disease variance would be explained and therefore the coupled factor set would be of greater predictive value. This study would be a step forward in identifying envirogenomic signatures that may predict clinical outcomes in CD patients with an improved accuracy. CD associated SNPs play an important part in disease progression and severity, and information from clinical outcomes can be used to identify disease risk prospects for patients, when used with other clinical patient history data. A replicated multi- factorial model was developed and analysed in this section. This model was able to illustrate an 82% risk of requiring surgery for patients diagnosed with both the NOD2 gene and smoking variables. When clinical and/or environmental predictors and genetic predictors are combined to create envirogenomic signatures, high association to clinical severity of disease outcomes is demonstrated (Chapter Four) and this can potentially assist in creating individual risk profiles, useful for subsequent accurate treatment and early diagnosis of disease severity. 7 This project (Chapter Five) also involved the development of a novel Genomic Signature Analysis (GSA) bioinformatics method which is a comprehensive, step- wise approach for analysing genomic data in order to produce a precise genomic signature for predicting the level of disease risk. Originally outlaid in 2009 by Lea et al, further analyses has been able to provide progressive developments of the GSA method. With the inclusion of some modifications, the GSA process has been further improved. A 7-SNP genetic signature was identified using the GSA method by using the Wellcome Trust Case Control Consortium (WTCCC) CD dataset. The completed model was able to achieve an 86% risk of CD for those patients with the set of 7 SNPs identified from the GSA. The GSA method outlined in this study demonstrated a significantly important process to help classify high risk patients of CD and empower clinicians with the ability to diagnose such patients accurately and provide personalised treatment therapies accordingly. The field of medicine today is rapidly accelerating from inefficient and experimental practices to data-driven and practical clinical applications. Very soon, treatments, diagnosis, prognosis and disease prevention would be custom-made to each individual’s specific genotypic and phenotypic dataset. In the near future we anticipate that, through the incorporation of genomic research innovations, personalized medicine advances would no doubt transform the practice of medicine, thereby altering the global healthcare industry and with time, would pave the way for a lengthier and healthier quality of life. 8 ACKNOWLEDGEMENTS Throughout my academic career I have been blessed with countless opportunities that have come my way, making my career path more achievable. I am extremely grateful to all those who have walked along with me on every step of this journey, those who I am blessed enough to be able to take for granted. The past three years of my Doctoral research work have been extremely fulfilling and full of cooperation from many people in my life, some of whom I would like to appreciate here. It goes without saying, that I am eternally grateful to God for providing me this opportunity in life and bestowing me with the capabilities to accomplish this task. First and foremost, my supervisors Dr Rodney Lea and Dr Lyn Griffiths cannot be accredited enough, for all their support and guidance that they have provided so unassumingly. Making special considerations for me to attend meetings with my children, at my home and through teleconferences, I greatly appreciate the kindness and respect that they have always endowed upon me. Without these two people, I am genuinely positive that I would not have been able to complete such an immense undertaking as that is required for a Doctoral degree. I am sincerely thankful to both of these mentors for giving me the guidance and space to learn, so that I could really call this work my own. The past few years of my academic learning would also not have been possible without the abundant love, support and encouragement of my friends and family. I am extremely thankful to have been blessed with parents who have always encouraged me to excel and push my limits and always provided me with immense

Identifying Envirogenomic Signatures for Predicting the Clinical Outcomes of Crohn's Disease

Universidade Federal Do Rio Grande Do Sul Faculdade De Agronomia Programa De Pós-Graduaçao Em Zootecnia

Allele Frequency Difference AFD–An Intuitive Alternative to FST for Quantifying Genetic Population Differentiation

S41467-019-13480-Z OPEN Insights Into Malaria Susceptibility Using Genome- Wide Data on 17,000 Individuals from Africa, Asia and Oceania

Basic Principles and Laboratory Analysis of Genetic Variation Jesus Gonzalez-Bosquet and Stephen J

Allele Frequency Determination of Publicly Available Csnps in The

A Hapmap Harvest of Insights Into the Genetics of Common Disease Teri A

Novel Alzheimer's Disease Risk Variants Identified Based on Whole

The International Hapmap Project

Dna Sequencing – Methods and Applications

On Rare Variants in Principal Component Analysis of Population Stratifcation

Evaluating the Quality of the 1000 Genomes Project Data

A Map of Human Genome Variation from Population-Scale Sequencing