By DOROTHY ARABA HAMMOND
FUNCTIONAL ANNOTATION OF ALTERNATIVELY SPLICED ISOFORMS OF HUMAN G PROTEIN-COUPLED RECEPTORS (GPCRs)

by DOROTHY ARABA HAMMOND

(Under the Direction of YING XU)

ABSTRACT

The functional differences between alternatively spliced isoforms of a gene can be substantial: an isoform's function may be a variation of that of the “primary” full-length protein, may antagonize it, or may not be directly related to it. Regardless of their functions, spliced isoforms need proper annotation. G protein-coupled receptors (GPCRs) are trans-membrane proteins with important biological and pharmaceutical implications. Incorrect alternative splicing has been associated with a number of human diseases, including diabetes and cancers. This dissertation presents a framework and results for annotating transcripts of GPCR genes that originate from alternative splicing of their pre-mRNA. This was achieved by first assessing the genomic landscape of the GPCRs to determine the types of alternative splicing events that exist and where such events are localized. In addition, the influence of the exon-intron structures of GPCRs on the alternative splicing of the genes was evaluated. Then, molecular-level functions were ascertained via motif identification. Furthermore, the cellular state in which each isoform could function was determined using tissues of expression from RNA-sequencing data. Lastly, isoform function was inferred from cis-regulatory elements predicted to influence the transcription of the isoform in the specific cellular state. The results are presented here and made publicly available in a PHP/MySQL relational database. Results from the analyses indicate that the genomic structures of GPCRs are complex: their exon-intron structures are not arbitrarily arranged; rather, they are ordered in a manner that eases the alternative splicing of the gene for functional purposes.
The analyses also show that GPCR isoform expression patterns differ across tissues. Additionally, it was evident that GPCR isoforms may be involved in multiple cellular pathways, depending on the cellular conditions. Overall, 1044 human GPCR isoforms were annotated. These results are now publicly available via the Internet in the G protein-coupled receptor transcriptional annotation of genes (gpcrTAG) database. The results from these analyses offer the promise of transcriptional drug control, as well as a framework for studying functional diversity between isoforms of other gene families.

INDEX WORDS: GPCR, G protein-coupled receptor, RNA-seq, membrane protein, trans-membrane proteins, gene structure, alternative splicing, functional prediction, gene regulation, differential gene expression, next-generation sequencing, transcriptomic data analysis

FUNCTIONAL ANNOTATION OF ALTERNATIVELY SPLICED ISOFORMS OF HUMAN G PROTEIN-COUPLED RECEPTORS (GPCRs)

by DOROTHY ARABA HAMMOND
B.A., Kwame Nkrumah University of Science and Technology, Kumasi, Ghana, 2002
M.P.H., Texas A&M University, College Station, Texas, 2006

A Dissertation Submitted to the Graduate Faculty of The University of Georgia in Partial Fulfillment of the Requirements for the Degree
DOCTOR OF PHILOSOPHY
ATHENS, GEORGIA
2014

© 2014 DOROTHY ARABA HAMMOND
All Rights Reserved

Major Professor: YING XU
Committee: LIMING CAI, NATARAJAN KANNAN

Electronic Version Approved:
Julie Coffield
Interim Dean of the Graduate School
The University of Georgia
December 2014

DEDICATION

To my husband, for all his support, encouragement, love and patience
To my boys (Jack & Joel), just because I love you
To my father, for his encouragement, support, and for being my personal manuscript reviewer
To my mother, for all the love and support
To my Aunt Emelia, for your desire to see me succeed
To my grandparents, you said it. See what we did
Most importantly, to my HEAVENLY FATHER to whom I owe it all

ACKNOWLEDGEMENTS

My special, heartfelt appreciation goes to Prof. Ying Xu for his direction and supervision throughout my study here at the University of Georgia. Your work ethic is encouraging. I feel privileged to have benefited from your guidance and mentorship. Thank you for the knowledge and experience. Additionally, my sincere appreciation goes to my advisory committee members Liming Cai, Ph.D., David Puett, Ph.D., and Natarajan Kannan, Ph.D., for your support through it all. My sincere appreciation also goes to Jonathan Arnold, Ph.D., for his encouragement. Lastly, to the Institute of Bioinformatics and to Ms. Joan Yantko: you are a treasure. My heartfelt gratitude also goes to Victor Olman, Ph.D., a former member of the University of Georgia community, for all the statistical support.

Financial support came from:
• The National Institutes of Health, Supplementary Grant (R01 Grant # GM07533101)
• The National Institute of General Medical Sciences, National Institutes of Health, Ruth L. Kirschstein Predoctoral Fellowship F31 (Grant # GM089093)

Trust in the Lord with all your heart and lean not on your own understanding. In all your ways acknowledge Him, and He will make your path straight.
Proverbs 3:5-6

TABLE OF CONTENTS

Page
ACKNOWLEDGEMENTS .... v
LIST OF TABLES .... ix
LIST OF FIGURES .... x
CHAPTER
1 INTRODUCTION .... 1
    1.1 RESEARCH OBJECTIVE .... 1
    1.2 G PROTEIN-COUPLED RECEPTORS (GPCR) .... 1
        FUNCTION INFERRED FROM SEQUENCE .... 1
    1.3 ALTERNATIVE SPLICING .... 10
    1.4 STATE OF THE ART IN ANNOTATING ALTERNATIVELY SPLICED ISOFORMS AND CHALLENGES FACED .... 14
    1.5 OVERVIEW OF CHAPTERS: SUMMARY OF THE APPROACH PRESENTED IN THIS DISSERTATION .... 17
    REFERENCES .... 20
2 FUNCTIONAL UNDERSTANDING OF THE DIVERSE EXON-INTRON STRUCTURES OF THE HUMAN GPCR .... 27
    ABSTRACT .... 28
    2.1 INTRODUCTION .... 28
    2.2 DATA COLLECTION AND ANALYSIS .... 30
    2.3 RESULTS .... 34
    2.4 DISCUSSIONS .... 38
    2.5 CONCLUDING REMARKS .... 39
    REFERENCES .... 41
3 FUNCTIONAL ANNOTATION OF ALTERNATIVELY SPLICED ISOFORMS OF HUMAN G PROTEIN-COUPLED RECEPTORS .... 43
    3.1 INTRODUCTION .... 44
    3.2 GPCR DATA RETRIEVAL AND PROCESSING .... 45
    3.3 ANNOTATION OF MOLECULAR FUNCTION .... 46
    3.4 ANNOTATION OF CELLULAR FUNCTION .... 55
    3.5 INTEGRATION OF MOLECULAR- AND CELLULAR-LEVEL FUNCTION .... 74
    3.6 DISCUSSION .... 74
    3.7 CONCLUSION .... 79
    REFERENCES .... 80
4 THE gpcrTAG DATABASE .... 83
    4.1 INTRODUCTION .... 83
    4.2 DATABASE CONTENT .... 84
    4.3 DATABASE DESIGN .... 85
    4.4 DATABASE IMPLEMENTATION .... 87
    4.5 ACCESSING DATABASE .... 87
    4.6 DISCUSSION .... 90
    4.7 FUTURE IMPROVEMENTS .... 91
    REFERENCES .... 92
5 PERSPECTIVES AND CONCLUSIONS .... 93
    5.1 SUMMARY .... 93
    5.2 FUTURE DIRECTIONS .... 95
    5.5 CONCLUSIONS .... 96
APPENDICES
A Appendix Tables .... 97
B Links to Supplementary Tables ....