Update of the Database of Human and Mouse Transcription Co-Factors and Transcription Factor Interactions
Total Page:16
File Type:pdf, Size:1020Kb
TcoF-DB v2: update of the database of human and mouse transcription co-factors and transcription factor interactions Item Type Article Authors Schmeier, Sebastian; Alam, Tanvir; Essack, Magbubah; Bajic, Vladimir B. Citation Schmeier S, Alam T, Essack M, Bajic VB (2016) TcoF-DB v2: update of the database of human and mouse transcription co- factors and transcription factor interactions. Nucleic Acids Research: gkw1007. Available: http://dx.doi.org/10.1093/nar/ gkw1007. Eprint version Publisher's Version/PDF DOI 10.1093/nar/gkw1007 Publisher Oxford University Press (OUP) Journal Nucleic Acids Research Rights This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by-nc/4.0/ ), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] Download date 28/09/2021 23:27:07 Item License http://creativecommons.org/licenses/by-nc/4.0/ Link to Item http://hdl.handle.net/10754/622045 Nucleic Acids Research, 2016 1 doi: 10.1093/nar/gkw1007 TcoF-DB v2: update of the database of human and mouse transcription co-factors and transcription factor interactions Sebastian Schmeier1, Tanvir Alam2, Magbubah Essack2 and Vladimir B. Bajic2,* 1Massey University Auckland, Institute of Natural and Mathematical Sciences, Auckland, New Zealand and 2King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Thuwal, Kingdom of Saudi Arabia Received August 29, 2016; Revised September 29, 2016; Editorial Decision October 15, 2016; Accepted October 17, 2016 ABSTRACT teins work in unison to facilitate the recruitment of RNA Polymerase II to the site of transcription initiation, which Transcription factors (TFs) play a pivotal role in substantially increases the complexity of the regulatory pro- transcriptional regulation, making them crucial for cess (10). TcoFs have many functions, such as signal trans- cell survival and important biological functions. For duction, modulation of TF–DNA binding, and chromatin the regulation of transcription, interactions of differ- modification (11,12). However, so far research on the inter- ent regulatory proteins known as transcription co- action of TFs with other proteins in relation to transcription factors (TcoFs) and TFs are essential in forming nec- regulation has not been that intensive. essary protein complexes. Although TcoFs them- Protein–protein interactions (PPI) play a crucial role in selves do not bind DNA directly, their influence on transcriptional regulation and can take several forms, e.g. transcriptional regulation and initiation, although in- the modification of one protein by another protein, thefor- direct, has been shown to be significant, with the mation of a protein complex by two or more proteins, etc. PPIs of our interest include the physical interaction between functionality of TFs strongly influenced by the pres- two TFs (10,13), the physical interaction between a TF and ence of TcoFs. In the TcoF-DB v2 database, we col- a TcoF, as well as potentially, the physical interaction be- lect information on TcoFs. In this article, we describe tween a TF and other nuclear proteins that are not known updates and improvements implemented in TcoF-DB to function as TFs or TcoFs. A web-based tool to recognise v2. TcoF-DB v2 provides several new features that the type of TF binding partner has also been developed (14). enables exploration of the roles of TcoFs. The con- Several resources that collect information on mam- tent of the database has significantly expanded, and malian TFs exist (e.g. JASPAR (15), TRANSFAC (16), is enriched with information from Gene Ontology, HOCOMOCO (17), TFCat (18), DBD (19), TFCONES biological pathways, diseases and molecular signa- (20), TFcheckpoint (21), TFdb (22), COMPEL (23), tures. TcoF-DB v2 now includes many more TFs; has TRANSCompel (16), etc.). While the influence of TFs on substantially increased the number of human TcoFs many biological processes (24–27) and diseases (26,28–30) is well established and described, research into the influ- to 958, and now includes information on mouse (418 ence of TcoFs has not received the same level of atten- new TcoFs). TcoF-DB v2 enables the exploration of tion. However, the recent release of the FARNA database information on TcoFs and allows investigations into (http://cbrc.kaust.edu.sa/farna) utilised TcoFs (31) to infer their influence on transcriptional regulation in hu- functions of over 10 000 non-coding transcripts in humans, mans and mice. TcoF-DB v2 can be accessed at showing the importance of TcoFs for the function of their http://tcofdb.org/. target transcripts. To facilitate investigations into the influ- ence of TcoFs on the regulation of transcriptional initia- tion, we first developed in 2010 the TcoF-DB database that INTRODUCTION comprises human TFs and 529 TcoFs that interact with Transcriptional regulation in eukaryotes is a complex pro- them (31). In 2014, 415 human TcoFs as well as TcoFs from cess. It involves many proteins that assemble at the site of other species were published in AnimalTFDB 2.0 database transcription initiation (1,2) either binding DNA directly, (32). However, we still do not know the complete list of e.g. transcription factors (TFs) (3–5), or interacting with TFs and TcoFs in mammals. Thus, we developed an up- TFs, e.g. transcription co-factors (TcoFs) (6–9). These pro- dated version of TcoF-DB v2, which increases the number *To whom correspondence should be addressed. Tel: +966 12 808 2386; Fax: +966 2 802 0127; Email: [email protected] C The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact [email protected] 2 Nucleic Acids Research, 2016 of TFs and TcoFs substantially. Since the last TcoF-DB re- Out of the 7 437 (2 062) human (mouse) TF interacting lease, several important changes have been made to the re- genes, we have now identified those that can be considered sources on which TcoF-DB depends upon, e.g. Gene Ontol- to encode TcoFs. Similar to our former methodology (31), ogy (GO) updated its annotations and structure (33), and four conditions had to be met by a gene so that we consider PPI resources grew substantially. These changes made an it a TcoF (see Table 1). However, note that the GO in recent update of TcoF-DB necessary. In this article, we describe years has updated its annotations, thus, there was a need to new features and modifications we have made in the TcoF- adjust the methodology in this regard and it is reflected by DB v2 resource, and the updated version of TcoF-DB. new GO-terms within condition four (33). As before, we follow the classical definition of TcoFs from The update of transcription factors (9) and classify all TcoFs into four groups (31), based on the type of evidence that is present regarding fulfilment of We updated our lists of TFs that form the foundation of conditions c and d in Table 1 (conditions a and b from Table TcoF-DB v2. In TcoF-DB v2, we opted for a gene-centric 1 are compulsory for all TcoF groups): approach as opposed to a protein-centric one, to better sup- port genomics and transcriptomics studies, which in gen- r High-confidence TcoFs: All TcoFs that have experimen- eral are gene-centric. We extracted human genes that encode tal evidence for both, involvement in transcription regu- TFs (we denote them as TF genes) from a recent large-scale lation and for occurrence in the cell nucleus. study of promoter elements (4). After a hand-curation step r Hypothetical TcoFs (Class 1): All TcoFs that have experi- to remove obsolete entries, the numbers of human TF genes mental evidence for involvement in transcription regula- in TcoF-DB v2 increased to a total of 1 758. In addition, we tion, but only non-experimental evidence (e.g. ‘Inferred now expanded the database by including 1 507 mouse TF from Electronic Annotation’ or ‘Author statement’) for genes (4). r occurrence in the cell nucleus. Hypothetical TcoFs (Class 2): All TcoFs that have exper- The update of transcription co-factors imental evidence for occurrence in the cell nucleus, but We extracted genes and proteins that interact with TFs from only non-experimental evidence for involvement in tran- three databases: BioGRID v3.4.135 (34), IntAct (accessed r scription regulation. April 2016) (35) and Reactome (accessed April 2016) (36). Hypothetical TcoFs (Class 3): All TcoFs that have only BioGRID stores interactions in terms of gene-to-gene iden- non-experimental evidence for both, involvement in tran- tifiers, although the actual interaction is between proteins scription regulation and for occurrence in the cell nu- the genes encode for. To conform to our gene-centric ap- cleus. proach, and to be consistent, we converted the protein iden- tifiers in the PPIs in IntAct and Reactome to gene identi- This classification is done based on the type of evidence fiers. We excluded the formerly used database MINT (37), that is attached to the GO-terms of conditions c and d since from 2013 its data has been integrated into IntAct. in Table 1. We distinguish between experimental evidence All data from the above three databases are experimentally (GO codes: EXP, IDA, IMP, IGI, IEP and IPI) and non- supported. It was of particular importance that this infor- experimental evidence (all other codes; for more detail mation is presented in the PSI-MI format [molecular inter- on the codes see http://geneontology.org/page/guide-go- action standard of the Proteomics Standards Initiative (38)] evidence-codes).