Developing the Cis-Regulatory Association Model (CRAM) to Identify Combinations of Transcription Factors in ChIP-Seq Data

Developing the Cis-Regulatory Association Model (CRAM) to Identify Combinations of Transcription Factors in ChIP-Seq Data

Thesis

Presented in Partial Fulfillment of the Requirements for the Degree Master of Science in the Graduate School of The Ohio State University

By Brian Kennedy, B.S.
Department of Computer Science and Engineering
The Ohio State University
2010

Thesis Committee:
Victor X. Jin, Advisor
Raghu Machiraju

Copyright by Brian Kennedy 2010

Abstract

There are approximately 2,600 human transcription factors which may cis-regulate the expression of proximal genes. These TFs may further interact with one another and exhibit different behavior in combination than individually, cis-regulatory modules (CRM). Even simple 2 and 3 TF combinations could form over 2.9 billion different cis-regulatory modules. Testing the functionality of these modules experimentally will be a massive undertaking. CRAM, the Cis-Regulatory Association Modeler, predicts functional regulatory modules in silico using TFs found in sequences searched for TF motifs defined by Position Weight Matrices.

This technique targets ChIP-seq data and finds CRMs which are over-represented in the target sequences compared to a random background, or another contrasting sample of sequences, by using contrast frequent item-set mining in the experimental ChIP-seq peaks and the control sample. The error with which these CRMs may be separated from the random background by a variety of features is used to determine which CRMs are truly specific to the experimental ChIP-seq sample under degree of motif matching, relative position, and genetic conservation. Feed-forward neural networks are used to learn the function which specifies the classifiability of each CRM and calculate the error with which they are compared. Several other programs use a comparable approach; however, the application of neural networks specifically and contrast item-set mining is novel.

Dedicated to my mother, father, and brother, for all of their love and support.

Acknowledgments

I have many people to thank for my making it this far: my advisor, Dr. Victor Jin, for everything he's done; Dr. Raghu Machiraju, for his counsel and support; all of my lab mates, for their knowledge, assistance, and encouragement; and the incredible Biomedical Informatics Department staff for everything they do.

Vita

2003            Memphis Central High School
2008            B.S. Computer Science, University of Memphis
2009            Transferred from M.S. Bioinformatics, University of Memphis
2009-Present    M.S. Computer Science & Engineering, The Ohio State University

Publications

Kennedy BA, Gao W, Huang TH, Jin VX (2009) HRTBLDb: an informative data resource for hormone receptors target binding loci. Nucleic Acids Res. 38:D676-681

Bapat SA, Jin V, Berry N, Balch C, Sharma N, Kurrey N, Zhang S, Fang F, Lan X, Li M, Kennedy BA, Bigsby RM, Huang TH, Nephew KP (2010) Multivalent epigenetic marks confer microenvironment-responsive epigenetic plasticity to ovarian cancer cells. Epigenetics 5(8):716-729

Fields of Study

Major: Computer Science & Engineering
Machine Learning applied in Bioinformatics

Table of Contents

Abstract
Acknowledgments
Vita
Table of Contents
List of Illustrations
List of Tables
Lists of Symbols & Abbreviations
Chapter 1: Introduction
    1.1 Biological Background of the Question
    1.2 Research Question & Goals
    1.3 Prior Art in CRM Prediction
    1.4 Overview of the Solution
    1.5 Overview of the Thesis
Chapter 2: Algorithms and Design
    2.1 Input Data
    2.2 Transcription Factor Representation
    2.3 Transcription Factor Search
    2.4 Genetic Conservation Measurement
    2.5 Transcription Factor Cartization
    2.6 Contrast Frequent Item-set Mining
    2.7 Artificial Neural Networks
    2.8 Output Data
Chapter 3: Comparing Transcription Factors in K562 cells
    3.1 Data source
    3.2 Data summary
    3.3 Biological background
    3.4 Method
    3.5 Analysis
    3.6 Discussion
Chapter 4: Contrasting TGF-β Treatment in A2780
    4.1 Data source
    4.2 Data summary
    4.3 Biological background
    4.4 Method
    4.5 Analysis
    4.6 Discussion
Chapter 5: Conclusions
    5.1 Address of the Research Question
    5.2 Discussion of CRAM
    5.3 Future Work
Bibliography
Appendix
    A. FASTA format
    B. BED format
    E. Output format
    F. JSON format
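
The two quantitative ideas in the abstract can be illustrated with a small sketch. It first checks the count of possible 2- and 3-TF combinations among roughly 2,600 TFs, then shows a toy version of contrast item-set mining: counting only the combinations that actually co-occur in a sample and keeping those that are frequent in the target but rare in the control. This is not the CRAM implementation. The function names, the thresholds (min_support, min_ratio, pseudo), and the TF_A to TF_D labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb

N_TFS = 2600  # approximate number of human TFs quoted in the abstract


def candidate_modules(n_tfs, sizes=(2, 3)):
    """Number of unordered TF combinations of the given sizes."""
    return sum(comb(n_tfs, k) for k in sizes)


def support(samples, sizes=(2, 3)):
    """Fraction of samples containing each TF combination seen at least once."""
    counts = Counter()
    for tfs in samples:
        for k in sizes:
            # Sorting gives each combination one canonical tuple form.
            counts.update(combinations(sorted(tfs), k))
    return {combo: n / len(samples) for combo, n in counts.items()}


def contrast_itemsets(target, control, min_support=0.5, min_ratio=2.0, pseudo=0.01):
    """Combinations frequent in target and over-represented versus control.

    The thresholds and the pseudocount are illustrative assumptions.
    """
    tgt, ctl = support(target), support(control)
    result = {}
    for combo, s in tgt.items():
        c = ctl.get(combo, 0.0)
        if s >= min_support and s / (c + pseudo) >= min_ratio:
            result[combo] = (s, c)
    return result


# Hypothetical toy data: each set lists the TFs with a motif hit in one sequence.
peaks = [{"TF_A", "TF_B", "TF_C"}, {"TF_A", "TF_B"},
         {"TF_A", "TF_B", "TF_D"}, {"TF_C", "TF_D"}]
background = [{"TF_C", "TF_D"}, {"TF_A", "TF_C"}, {"TF_B", "TF_D"}, {"TF_C"}]

print(f"{candidate_modules(N_TFS):,}")        # 2,929,332,900
print(contrast_itemsets(peaks, background))   # {('TF_A', 'TF_B'): (0.75, 0.0)}
```

Counting only observed combinations keeps the work proportional to the data rather than to the 2.9 billion possible modules, which is the practical reason an item-set mining step is attractive here.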
