bioRxiv preprint first posted online Nov. 11, 2018; doi: http://dx.doi.org/10.1101/467852. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license. Predicting trait regulators by identifying co-localization of DNA binding and GWAS variants in regulatory regions Gerald Quon1,2,5,*, Soheil Feizi1,2, Daniel Marbach1,2, Melina Claussnitzer1,2,3,4, Manolis Kellis1,2,* 1Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA 2Broad Institute of MIT and Harvard, MIT, Cambridge, MA, USA 3Gerontology Division, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Ave, Boston, MA 02215 USA 4Nutritional Medicine, Technical University Munich, Gregor-Mendel-Str. 2. 85350 Freising-Weihenstephan, Germany 5Current Address: Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA *Correspondence and material requests should be addressed to:
[email protected],
[email protected] Abstract Genomic regions associated with complex traits and diseases are primarily located in non-coding regions of the genome and have unknown mechanism of action. A critical step to understanding the genetics of complex traits is to fine-map each associated locus; that is, to find the causal variant(s) that underlie genetic associations with a trait. Fine-mapping approaches are currently focused on identifying genomic annotations, such as transcription factor binding sites, which are enriched in direct overlap with candidate causal variants. We introduce CONVERGE, the first computational tool to search for co-localization of GWAS causal variants with transcription factor binding sites in the same regulatory regions, without requiring direct overlap.