Polani B.Ramesh Babu et al. / Journal of Pharmacy Research 2012,5(9),4863-4866
Review Article
ISSN: 0974-6943
Available online through http://jprsolutions.info

Applications of Bioinformatics Tools in Research: An Update

Polani B. Ramesh Babu and P. Krishnamoorthy
Department of Bioinformatics, School of Bio-Engineering, Bharath Institute of Science and Technology, Bharath University, Chennai, India.
Received on: 12-06-2012; Revised on: 17-07-2012; Accepted on: 26-08-2012

ABSTRACT
Bioinformatics originated as a cross-disciplinary field when the need arose for computational solutions to research problems in biomedicine. Stem cell driven regenerative systems are highly complex and dynamic, consisting of large numbers of different cells expressing many molecules that control their fates. Mathematical models and computational tools are therefore necessary, both to aid the interpretation of experimental data and to simulate the behavior of stem cell systems based on hypothetical assumptions about their complex cellular or molecular composition. Many long-standing questions in stem cell research remain unsolved. In this review, we describe recent developments and advancements in bioinformatics tools applicable to stem cell research.

Key words: Stem cells, Bioinformatics, Computation, Pluripotency

INTRODUCTION
Bioinformatics is the application of computer science and information technology to the field of biology and medicine. This rapidly developing branch of biology is highly interdisciplinary, using concepts and techniques from informatics, statistics, mathematics, and the retrieval and analysis of data [1,2]. Major research efforts in bioinformatics with reference to stem cells include DNA/protein sequence alignment, gene finding, gene assembly, protein structure alignment, protein structure prediction, prediction of gene expression, protein–protein interactions, genome-wide association studies, the modeling of various diseases, drug design and drug discovery.

Software tools in bioinformatics range from simple command-line tools to more complex graphical programs and standalone web-services [3]. Online/internet resources and graphical navigation software for the Internet (i.e. web browsers) were developed and came into use on a widespread scale. Production of high-throughput biological data in terms of 'genomics', 'microarrays' and 'proteomics' in stem cell biology has increased continually with a distinct acceleration [4,5,6]. Recently, various tools have been used to create or view 3-D models of adult tissue/organ development in regenerative medicine [7,8,9].

Stem cells are a class of undifferentiated cells that are able to differentiate into specialized cell types. Commonly, stem cells come from embryos formed during the blastocyst phase of embryological development (embryonic stem cells) and from adult tissue (adult stem cells) [7]. Both types are generally characterized by their potency, or potential to differentiate into different cell types (such as skin, muscle, bone, etc.) [7,10]. Pluripotent stem (PS) cells have the ability to differentiate into almost all cell types; examples include embryonic stem cells and cells that are derived from the mesoderm, endoderm, and ectoderm germ layers that are formed in the beginning stages of embryonic stem cell differentiation [11,12,13].

The discovery of embryonic stem (ES) cells came from the conjunction of studies in human pathology, mouse genetics, early mouse embryo development, cell surface immunology and tissue culture [7,11]. They have not only revolutionized experimental mammalian genetics but, with the advent of equivalent human ES cells, have now opened new vistas for regenerative medicine. Induced pluripotent stem cells (iPS or iPSC) are produced by nuclear reprogramming technology and they resemble ES cells in key elements [14]; they possess the potential to differentiate into any type of cell in the body. More importantly, the iPS platform has a distinct advantage over the ES system in the sense that iPS-derived cells are autologous, and therefore iPS-derived transplantation does not require immunosuppressive therapy. In addition, iPS research obviates the political and ethical quandary associated with embryo destruction and ES research. This remarkable discovery of cellular plasticity has important medical implications.

Both ES cells and iPS cells receive marked attention from scientists and clinicians for regenerative medicine because of their high proliferative and differentiation capacities. The most recent application of human ES and PS cells may, however, reside in their use as a tool in drug development, disease modeling and new modes of therapy [15,16,17]. The ease with which they can be grown in bulk and their differentiation controlled in vitro is of importance for their widespread adoption by industry and for their clinical efficacy. Small molecules have already had a positive impact on several areas of stem cell biology, from maintenance of pluripotency [10], the promotion of single cell survival and steering differentiation to involvement in reprogramming somatic cells. High-throughput technology has played an important role in identifying novel compounds; however, to date there are few published examples of medicinal chemistry input in this area.

The field of stem-cell biology has been catapulted forward by the startling development of reprogramming technology [14,18], and somatic cell nuclear transfer or therapeutic cloning has provided great hope for stem cell-based therapies (Table 1). Recent breakthrough studies using a combination of four factors to reprogram human somatic cells into PS cells without using embryos or eggs have led to an important revolution in stem cell research [14,18]. Comparative analysis of human iPS cells and human ES cells using assays for morphology, cell surface marker expression, gene expression profiling, epigenetic status, and differentiation potential has revealed a remarkable degree of similarity between these two pluripotent stem cell types. These advances in reprogramming will enable the creation of patient-specific stem cell lines to study various disease mechanisms. Furthermore, this reprogramming system provides great potential to design customized patient-specific stem cell therapies with economic feasibility. Disease-specific human ES cells were the first to provide a useful source for studying certain disease states. The recent demonstration that human somatic cells, derived from readily accessible tissue such as skin or blood, can be converted to embryonic-like induced pluripotent stem cells (hiPSCs) has opened new perspectives for modeling and understanding a larger number of human pathologies.

*Corresponding author: Polani B. Ramesh Babu, Department of Bioinformatics, School of Bio-Engineering, Bharath Institute of Science and Technology, Bharath University, Chennai, India.
Journal of Pharmacy Research Vol.5 Issue 9. September 2012, 4863-4866

Table 1: Therapeutic Potential of Stem cells in Regenerative Medicine

Disease | Stem Cell Potential
Heart disease | Adult bone marrow stem cells injected into heart arteries are believed to improve cardiac function in victims of heart attack or heart failure.
Leukemia and other cancers | In various studies leukemia patients treated with stem cells from bone marrow and umbilical cord emerged free of disease; donor blood stem cells have also reduced non-Hodgkin's lymphoma, and pancreatic and ovarian cancer in some patients.
Rheumatoid arthritis | Adult stem cells may be helpful in jump-starting repair of eroded cartilage. In human trials, joint pain lessened temporarily after donor stem cell therapy in some patients, and some then responded better to standard drug therapies.
Parkinson's disease | Since fetal tissue implants had mixed success in reducing neurological symptoms, some researchers say the best hope is that a patient's own neural stem cells may eventually be coaxed to mature into the dopamine-producing cells needed to treat the disease.
Type I diabetes | Basic research is focused on understanding how embryonic stem cells might be trained to become the type of pancreatic islet cells that secrete needed insulin. Recent developments using proteins to spur cell differentiation may speed progress.

Table 2: List of Bioinformatics tools used in Stem Cell Research

Name of the Bioinformatic tool | Description of the tool | Application in stem cell research
hESCreg | Online resource for the European human embryonic stem cell registry | A global registry providing comprehensive information on available hESC lines including their derivation, culture, genetics, potency and procurement/ethical provenance.
CellFinder | A stem cell navigation tool | Provides linkage of individual cell lines or groups of cells to genetic or functional characteristics from sources outside of hESCreg, e.g. expression profiles for differentiated or pluripotent cells.
Genomatix | Computational search tool and database | Suggests novel factors for stemness.
PluriNetWork | An electronic representation of the network and data repository | A large network of interaction and regulation links between genes/proteins involved in pluripotency.
StemBase | A simple web-interface | The largest online repository for human and mouse stem cell gene expression data.
Stem cell genome-to-systems biology | Computational and mathematical analysis | Determination of the molecular constituents that govern stem cell characteristics, conjointly with functional validations via genetic perturbation and protein location binding analysis.
SCOR | Stem Cell-Omics Repository – A web resource and queryable database | A resource to collate and display quantitative information across multiple planes of measurement, including mRNA, protein and post-translational modifications.

Bioinformatics tools and Pluripotency:
Computational methods and bioinformatics tools are constantly being used in regenerative medicine (Table 2). The ability to develop pluripotent cells from specific population phenotypes is an important advance in understanding human disease and developing new therapeutics. Given the large amounts of biological data available, it is necessary to use an open source similarity search engine to find pluripotent proteins while using a very expensive distance function [18-21].
However, the reprogramming protocol is currently an inefficient and lengthy process. It is common practice to use gene expression microarray assays as one of the tools to estimate the pluripotent capability of iPS cells. Using publicly available genome-wide microarray expression datasets from reprogramming experiments, a few genes were identified that define signatures characteristic of the donor cell type (human fibroblasts), of self-renewal (iPS, ES) and of partially reprogrammed cells (partially induced pluripotent cells, PiPSC) [20,21].

PluriNetWork is a large electronic representation of the network of interaction and regulation links between genes/proteins involved in pluripotency in mouse [22]. Node annotations (e.g. various gene/protein identifiers) and link annotations (e.g. pointers to the literature) enable easy exploration of the network. Moreover, it can be subjected to automated analyses, yielding Gene Ontology enrichment, network statistics, and much more. The Expressence software was applied to highlight links in networks, attempting to infer mechanisms from differential experimental data such as microarray time series. Mechanisms important for pluripotency may be discovered by this kind of data integration, and small molecules close to genes involved in these mechanisms may be predicted to enhance the induction of pluripotency.

To keep up with the growth of knowledge on the fundamental processes of pluripotency and reprogramming, Wiki and social networking software were combined into a community curation system that is easy to use and flexible, and tailored to provide a benefit for the scientist and to improve communication and exchange of research results [22]. In this network, 574 molecular interactions, stimulations and inhibitions were assembled based on a collection of research data from 177 publications until June 2010, involving 274 mouse genes/proteins, all in a standard electronic format, enabling analyses by readily available software such as Cytoscape and its plugins. In this network, each node represents a gene and its corresponding protein product. The network includes the core circuit of Oct4 (Pou5f1), Sox2 and Nanog, its periphery (such as Stat3, Klf4, Esrrb, and c-Myc), connections to upstream signaling pathways (such as Activin, WNT, FGF, BMP, Insulin, Notch and LIF), and epigenetic regulators as well as some other relevant genes/proteins, such as proteins involved in nuclear import/export.

Transcription factors of ES and PS cells
Molecular interactions between transcription factors are considered to be major control mechanisms for stem cell fate decisions [23,24]. By translating these interactions into an appropriate mathematical state space formulation it is possible to investigate the dynamics of cellular development on the molecular level and establish a conceptual understanding of stem cell fate decisions. A few mathematical models for the description of molecular switches in embryonic and hematopoietic stem cells have previously been demonstrated. Somatic cells can be reverted to an embryonic-like state simply by over-expressing a combination of four transcription factors (OCT4/POU5F1, SOX2, KLF4 and c-MYC, or OCT4/POU5F1, SOX2, NANOG and LIN28). In addition, several genetic markers have recently been identified that are associated with ES cells [23-26]. Among these ES cell markers are Pou5f1 (Oct-4), Nanog and Sox2, among others. Similar genetic markers were also found for some of the multipotent stem cells found in various organs and tissues.

Deciphering the transcriptional networks operative in human ES cells (hES), induced pluripotent stem cells (hiPS) and embryonal carcinoma cells (hEC) is essential for enhancing our understanding of self-renewal and pluripotency. It was earlier demonstrated that the transcription factor OCT4 is a master regulator of the transcriptional networks required for inducing and maintaining pluripotency [26]. Therefore, a systems biology approach that correlates the gene expression changes resulting from the ablation of OCT4 function in hES and hEC cells with potential OCT4-binding sites within the promoters of target genes allows a higher predictability of the motif-driven expression modules important for inducing pluripotency and maintaining self-renewal [26,27].

Genomatix software has contributed to some remarkable work on ES cells. This tool was recently used to identify novel transcription factors like B-Myb and Maz, which are implicated either in the maintenance of the undifferentiated stem cell state or in early steps of differentiation [28]. System-based and next generation sequencing technologies like RNA-seq and ChIP-seq were used to unravel gene transcription and regulatory mechanisms governing neural differentiation.

StemBase is a web-interface generated by the Canadian Stem Cell Network. It was built from gene expression data obtained from stem cells and their derivatives, mainly from mouse and human, using DNA microarrays and Serial Analysis of Gene Expression [29]. The database can be used to study the expression of particular genes in stem cells or to search for genes with particular expression profiles in stem cells, which could be associated with stem cell function or used as stem cell markers.
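
The StemBase use case described above, searching for genes with a particular expression profile, can be illustrated with a minimal sketch. This is not StemBase code or its query interface: the table `EXPRESSION`, the sample names, the function `genes_enriched_in` and the fold-change threshold are all hypothetical, and the expression values are invented for illustration. The sketch simply assumes that a candidate stem cell marker is a gene whose mean expression in stem cell samples is several-fold higher than in differentiated samples.

```python
from statistics import mean

# Toy expression table in arbitrary units. The numbers are invented for
# illustration and are NOT taken from StemBase or any real dataset.
EXPRESSION = {
    "Pou5f1": {"ES_1": 950.0, "ES_2": 1010.0, "Fib_1": 12.0, "Fib_2": 9.0},
    "Nanog": {"ES_1": 640.0, "ES_2": 700.0, "Fib_1": 8.0, "Fib_2": 11.0},
    "Sox2": {"ES_1": 400.0, "ES_2": 440.0, "Fib_1": 20.0, "Fib_2": 22.0},
    "Actb": {"ES_1": 2000.0, "ES_2": 2100.0, "Fib_1": 1900.0, "Fib_2": 2000.0},
    "Col1a1": {"ES_1": 15.0, "ES_2": 25.0, "Fib_1": 1500.0, "Fib_2": 1700.0},
}

# Hypothetical sample groups: embryonic stem cells versus fibroblasts.
ES_SAMPLES = ["ES_1", "ES_2"]
FIBROBLAST_SAMPLES = ["Fib_1", "Fib_2"]


def genes_enriched_in(expression, target_samples, reference_samples, min_fold=4.0):
    """Return (gene, fold_change) pairs, highest fold change first.

    A gene is reported when its mean expression in the target samples is at
    least `min_fold` times its mean expression in the reference samples.
    """
    hits = []
    for gene, values in expression.items():
        target = mean(values[s] for s in target_samples)
        reference = mean(values[s] for s in reference_samples)
        if reference <= 0:
            continue  # fold change is undefined without a positive reference
        fold = target / reference
        if fold >= min_fold:
            hits.append((gene, fold))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# List candidate ES cell markers in the toy table.
for gene, fold in genes_enriched_in(EXPRESSION, ES_SAMPLES, FIBROBLAST_SAMPLES):
    print(f"{gene}: {fold:.1f}-fold higher in ES samples")
```

A simple fold-change filter is used here only because it is easy to follow; analysis of real microarray or SAGE data would normally add normalization and a statistical test before any gene is called a marker.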

Cellfinder is a stem cell data repository under the Open Source model. Its functions are aimed at developing the human embryonic stem cell registry (hESCreg) into a stem cell navigation tool facilitating the linkage of individual cell lines or groups of cells to genetic or functional characteristics from sources outside of hESCreg, e.g. expression profiles for differentiated or pluripotent cells [30,31]. This tool allows comparison and analysis of cells within hESCreg on multiple layers, and also expansion of the data in the registry, which can ultimately be used to design the user's research projects. To further develop its utility, hESCreg has started to register human induced pluripotent stem cell lines (hiPSC). Currently 20 hiPSC lines from 3 providers in 3 countries are listed.

Hematopoietic Stem cells and transcription factors:
Hematopoietic stem cells have the capacity to differentiate along multiple lineages, potentially giving rise to all cells present in the blood. This process is controlled by cell-specific and ubiquitously expressed transcription factors and cofactors. Systems biology approaches including quantitative proteomics (isotope tagged methods), genomics (expression microarray, ChIP-sequencing) and other bioinformatics tools were employed to understand the regulation of gene expression at the level of transcription, epigenetics and chromatin structure. A few investigators attempted to isolate endogenous transcription factors at various stages of hematopoietic differentiation of healthy and/or leukemic cells and identified their interacting partners by mass spectrometry. Quantitative proteomics (ICAT, iTRAQ) methods were used to pinpoint the dynamics of protein interactions within the transcriptional regulatory network. For example, it was earlier shown that the bZIP protein MafK regulates β-globin expression by exchanging its heterodimerization partner from the repressor Bach1 to the activator p45 during erythroid differentiation [32].

The use of informatics in stem cell research has been increasing during the last three decades. Stem cell genomics and proteomics allowed the emergence of other high-throughput techniques of biological analysis that are of recent research focus in bioinformatics. Recent developments in bioinformatics research have resulted in the production of Web-based interactive tools and databases for the analysis of various transcription factors regulating the fate of stem cells.

ACKNOWLEDGEMENTS:
The encouragement and support from Bharath University, Chennai, India is gratefully acknowledged.

REFERENCES:
1. Carolina Perez-Iratxeta, Miguel A. Andrade-Navarro and Jonathan D. Wren. Evolving research trends in bioinformatics. Briefings in Bioinformatics. Oct 31, 2006. Vol 8. No 2. 88.
2. Roos DS. Computational biology. Bioinformatics--trying to swim in a sea of data. Science 2001;291:1260–1.
3. Ranganathan S. Bioinformatics education-perspectives and challenges. PLoS Comput Biol 2005;1:e52.
4. Nicki Tiffin, Miguel A Andrade-Navarro and Carolina Perez-Iratxeta. Linking genes to diseases: it's all in the data. Genome Medicine 2009, 1:77.
5. Brock A, Goh HT, Yang B, Lu Y, Li H. Cellular reprogramming: a new technology frontier in pharmaceutical research. Pharm Res. 2012 Jan;29(1):35-52.
6. Zuba-Surma EK, Józkowicz A, Dulak J. Stem cells in pharmaceutical biotechnology. Curr Pharm Biotechnol. 2011 Nov;12(11):1760-73.
7. Hwa A. Lim. Stem cells and regenerative medicine. www.asiabiotech.com. Vol 14; No. 3. 2010; 27.
8. Badylak SF, Weiss DJ, Caplan A, Macchiarini P. Engineered whole organs and complex tissues. Lancet. 2012 Mar 10;379(9819):943-52.
9. Jörg Galle, Peter Buske, Nicholas Barker, Hans Clevers, Markus Loeffler. A comprehensive model of the spatio-temporal stem cell and tissue organisation in the intestinal crypt. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
10. Andrews PD. Discovering small molecules to control stem cell fate. Future Med Chem. 2011 Sep;3(12):1539-49.
11. Evans M. Discovering pluripotency: 30 years of mouse embryonic stem cells. Nat Rev Mol Cell Biol. 2011 Sep 23;12(10):680-6.
12. J. Yu, M.A. Vodyanik, K. Smuga-Otto, J. Antosiewicz-Bourget, J. Frane, S. Tian, J. Nie, G.A. Jonsdottir, V. Ruotti, R. Stewart, I. Slukvin and J.A. Thomson. "Induced pluripotent stem cell lines derived from human somatic cells". Science, 318(5858), 2007, pp. 1917–1920.
13. K. Takahashi and S. Yamanaka. "Induction of pluripotent stem cells from mouse embryonic and adult fibroblast cultures by defined factors". Cell, 126(4), 2006, pp. 663–676.
14. Miguel Andrade, Nancy Mah. Study of stem cell reprogramming using profiles of gene expression. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
15. Robinton DA, Daley GQ. The promise of induced pluripotent stem cells in research and therapy. Nature. 2012 Jan 18;481(7381):295-305.
16. Maury Y, Gauthier M, Peschanski M, Martinat C. Human pluripotent stem cells for disease modelling and drug screening. Bioessays. 2012 Jan;34(1):61-71.
17. Shi Y. Induced pluripotent stem cells, new tools for drug discovery and new hope for stem cell therapies. Curr Mol Pharmacol. 2009 Jan;2(1):15-8.
18. Miguel Andrade, Nancy Mah. Study of stem cell reprogramming using profiles of gene expression. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
19. Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alvarez P, Brockman W, Kim TK, Koche RP, Lee W, Mendenhall E, O'Donovan A, Presser A, Russ C, Xie X, Meissner A, Wernig M, Jaenisch R, Nusbaum C, Lander ES, Bernstein BE. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007 Aug 2;448(7153):553-60.
20. Arnoldo J. Muller-Molina and Marcos Araúzo-Bravo. Efficiently finding homologous pluripotent proteins by using similarity search. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
21. Shin S, Sun Y, Liu Y, Khaner H, Svant S, Cai J, Xu QX, Davidson BP, Stice SL, Smith AK, Goldman SA, Reubinoff BE, Zhan M, Rao MS, Chesnut JD. Whole genome analysis of human neural stem cells derived from embryonic stem cells and stem and progenitor cells isolated from fetal tissue. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
22. Anup Som, Clemens Harder, Boris Greber, Marcin Siatkowski, Yogesh Paudel, Gregor Warsow, Clemens Cap, Hans Scho, Georg Fuellen. The PluriNetWork: an electronic representation of the network underlying pluripotency in mouse, and its applications. PLoS ONE, Dec 2010, Vol 5, 12, e15165.
23. Matsumoto A, Nakayama KI. Role of key regulators of the cell cycle in maintenance of hematopoietic stem cells. Biochim Biophys Acta. 2012 Jul 19.
24. Pandian GN, Sugiyama H. Programmable genetic switches to control transcriptional machinery of pluripotency. Biotechnol J. 2012 Jun;7(6):798-809.
25. Ezashi T, Telugu BP, Roberts RM. Model systems for studying trophoblast differentiation from human pluripotent stem cells. Cell Tissue Res. 2012 Mar 17.
26. James Adjaye. A data integration approach to mapping Oct4 gene regulatory networks required for sustaining self-renewal and pluripotency in embryonic stem cells. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
27. Timm Schroeder. Continuous single cell data as the basis for stem cell systems biology. 3rd satellite workshop on bioinformatics in stem cell research. July 15 2010, Dresden.
28. Tarasov KV, Testa G, Tarasova YS, Kania G, Riordon DR, Volkova M, Anisimov SV, Wobus AM, Boheler KR. Linkage of pluripotent stem cell-associated transcripts to regulatory gene networks. Cells Tissues Organs. 2008;188(1-2):31-45. Epub 2008 Feb 27.
29. Reatha Sandie, Gareth A Palidwor, Matthew R Huska, Christopher J Porter, Paul M Krzyzanowski, Enrique M Muro, Carolina Perez-Iratxeta and Miguel A Andrade-Navarro. Recent developments in StemBase: a tool to study gene expression in human and murine stem cells. BMC Research Notes 2009, 2:39.
30. Joeri Borstlap, Mai X. Luong, Heather M. Rooke, B. Aran, A. Damaschun, A. Elstner, Kelly P. Smith, Gary S. Stein and Anna Veiga. International stem cell registries. In Vitro Cellular & Developmental Biology - Animal. Volume 46, Numbers 3-4 (2010), 242-246.
31. Joeri Borstlap, Glyn Stacey, Andreas Kurtz, Anja Elstner, Alexander Damaschun, Begoña Arán & Anna Veiga. First evaluation of the European hESCreg. Nature Biotechnology 26, 859–860 (1 August 2008).

Source of support: Nil, Conflict of interest: None Declared
