Int J Pharma Bio Sci 2019 July; 10(3): (B) 224-239

Review Article

International Journal of Pharma and Bio Sciences ISSN 0975-6299

BIOINFORMATICS WITH COMPUTATIONAL BIOLOGY - A SYNERGISTIC EFFECT

MARGI GANDHI AND RAJASHREE MASHRU*

Faculty of pharmacy, Kalabhavan, The M.S.University of Baroda, Baroda, India

ABSTRACT

Bioinformatics and computational biology have emerged as an interdisciplinary field that develops and applies computational methods to analyse large collections of biological data. Earlier, the term bioinformatics had a different meaning: it referred to the study of information processes in biotic systems, as in biochemistry and biophysics. With the emergence of bioinformatics tools, protein sequencing methods were developed for a variety of organisms, and the availability of protein sequences helped in determining the sequence of insulin. The major challenge for bioinformatics was the manual handling of large numbers of protein sequences from different organisms. To overcome this limitation, new computer methods were developed. The objective of this review article is to address the limitations of the bioinformatics field and how these limitations were overcome when computational biology was combined with bioinformatics. The article also covers recent applications highlighting the synergistic effect of bioinformatics with computational biology in various fields, such as drug discovery and development, food, treatment of neglected tropical diseases like TB, HIV, leishmaniasis and malaria, influenza surveillance and vaccine strain selection, and development of precision medicine.

KEYWORDS: Biochemistry, Bioinformatics, Computational Biology, Synergistic effect, Drug discovery and development.

RAJASHREE MASHRU*

Faculty of pharmacy, Kalabhavan, The M.S. University of Baroda, Baroda, India

Received on: 18-06-2019 Revised and Accepted on: 26-07-2019 DOI: http://dx.doi.org/10.22376/ijpbs.2019.10.3.b224-239

Creative commons version 4.0

This article can be downloaded from www.ijpbs.net B-224


INTRODUCTION

Bioinformatics: Biologists who specialize in the use of computational tools and systems to answer problems of biology are bioinformaticians.¹ They tend to draw upon skills in software development, database development and management, and visualization methods to convey information contained within data sets.² Bioinformatics focuses more on the engineering side and the creation of tools that work with biological data to solve problems.²

Computational biology: It is about studying biology using computational techniques, which further the understanding of science. Computer scientists, mathematicians, statisticians and engineers who specialize in developing theories, algorithms and techniques for tools and systems developed by bioinformatics are called computational biologists.³

HISTORY FROM BIOINFORMATICS TO COMPUTATIONAL BIOLOGY

Figure 1⁵ Flow chart (box labels recoverable from the figure: High quality data for life sciences; Network and system biology; Computational system-level analysis; Data analysis; Databases; High-throughput data processing)

The term bioinformatics was coined in 1990 to define the use of computers in sequence analysis. But it had major limitations which restricted its application. These include:
1. Defining precisely the transcription unit (5' and 3' boundaries) of genes in metazoan genomes was too difficult with bioinformatics.
2. Identifying the correct sequence of mRNA (exon parsing) was also a major challenge for bioinformatics.


From 1990 to 1997, many software packages were developed to overcome the limitations in the field of bioinformatics. One of these was the state-of-the-art gene prediction software. It was based on two main principles:
1. Common statistical regularities
2. Plain sequence similarity
The drawback of this software was that it had no relationship to the actual molecular mechanism of gene expression.⁴
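As a rough illustration of the second principle, plain sequence similarity, the sketch below computes the percent identity between two sequences that are already aligned to the same length. The function name and the example sequences are hypothetical, and this is not the method of any specific gene prediction program of that period, which relied on far more elaborate alignment and statistics.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions at which two pre-aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring letter case.
    matches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a == b)
    return 100.0 * matches / len(seq_a)


if __name__ == "__main__":
    # Two made-up 8-base sequences that differ at two positions.
    print(percent_identity("ATGCCGTA", "ATGACGTT"))
```

A score like this says nothing about how a gene is actually expressed, which is the drawback noted above.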

Figure 2⁵ Flow chart showing the relationship between bioinformatics and computational biology

Thus bioinformatics was linked to computational biology. Their synergism involved using computer science to understand data collected by biologists and health science professionals. Together, as an interdisciplinary field, they aim to solve problems through an open-source approach to developing and sharing research. Both fields interact with a wide range of disciplines within biology, including genetics, biochemistry, biophysics, cell biology, and evaluation.² Figures 1⁵ and 2⁵ represent the flow chart of the process, starting with bioinformatics tools to collect databases and ending with computational software to analyse the collected data.

OVERVIEW ON BIOLOGY

Biology is all about the science of life and living organisms. The main vocabulary of interest here includes:
1. Cell
2. Prokaryotic and eukaryotic cells
3. Nucleus, chromosomes, DNA, DNA bases A,G,C,T
4. RNA
5. Gene

CELL

The basic unit of biological activity is called the cell. The cells of the living kingdom are divided into two categories, namely prokaryotic and eukaryotic cells.

EUKARYOTIC CELLS

These are advanced and complete cells. They are found in unicellular and multicellular plants and animals and contain a plasma membrane, nucleus, DNA, and cytoplasm with ribosomes and cellular organelles [Figure 3]⁵.

PROKARYOTIC CELLS

The term prokaryotic is derived from the Greek words "πρό" (pro), meaning before, and "κάρυον" (karyon), meaning nucleus. Prokaryotic cells lack a well-defined nucleus and possess a relatively simple structure [Figure 4]⁵.

Figure 3⁵ Eukaryotic cells


Figure 4⁵ Prokaryotic cells

NUCLEUS, CHROMOSOMES, DNA, DNA BASES A,G,C,T

The nucleus is the largest cellular organelle. It contains chromatin material, which condenses into two or more thick ribbon-like structures called chromosomes during cell division. These chromosomes are composed of a genetic substance called DNA. DNA along with the chromosomes is together known as the genome. The specific regions of the genome are called genes. These genes control protein synthesis through the mediation of RNA, a flow of information called the "central dogma of molecular biology" [Figure 5]⁶. In eukaryotic organisms, DNA is also present in mitochondria and chloroplasts. It performs the specified function of protein synthesis. DNA is a chain of four types of molecules, A, G, C and T, where A always links with T and C with G. Thus, the sequence of A, G, C and T gives a complete blueprint of our lives, including an indication of the diseases that are likely to occur. RNA contains four types of molecules, A, G, C and U. The last one replaces T in DNA.

Figure 5⁶ The Central Dogma of life

RNA

These are single-stranded molecules. They perform several functions, carried out by three types of RNA:
1. Messenger RNA [mRNA]: It specifies the sequence of amino acids in protein synthesis.
2. Ribosomal RNA [rRNA]: It is associated with the structure and function of ribosomes [factories of protein synthesis].
3. Transfer RNA [tRNA]: It delivers amino acids to ribosomes for protein synthesis.

Here is the list of bioinformatics tools [Refer Table 2]⁸ and computational biology software⁷ [Table 1].
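The base-pairing rule for DNA (A with T, C with G) and the replacement of T by U in RNA, as described in this section, can be written down in a few lines. This is only a minimal sketch: the function names are hypothetical, and the code handles just the four plain bases, not ambiguity codes or strand direction.

```python
# Pairing partners in double-stranded DNA: A links with T, C links with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(dna: str) -> str:
    """Return the base-paired partner of each base, position by position."""
    return "".join(DNA_COMPLEMENT[base] for base in dna.upper())


def to_rna_alphabet(dna: str) -> str:
    """Write a DNA sequence in the RNA alphabet by replacing T with U."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    print(complement_strand("AGCT"))
    print(to_rna_alphabet("AGCT"))
```

Any base outside A, G, C and T raises a `KeyError` in `complement_strand`, which keeps the sketch honest about what it supports.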


Table 1⁷ List of computational biology software in long-standing use

| Software | Purpose | Creators | Key capabilities | Year released | Citations |
|---|---|---|---|---|---|
| BLAST | Sequence alignment | Stephen Altschul, Warren Gish, Gene Myers, Webb Miller, David Lipman | First program to provide statistics for sequence alignment, combination of sensitivity and speed | 1990 | 35,617 |
| R | Statistical analyses | Robert Gentleman, Ross Ihaka | Interactive statistical analysis, extendable by packages | 1996 | N/A |
| ImageJ | Image analysis | Wayne Rasband | Flexibility and extensibility | 1997 | N/A |
| Cytoscape | Network visualization and analysis | Trey Ideker et al. | Extendable by plugins | 2003 | 2,374 |
| Bioconductor | Analysis of genomic data | Robert Gentleman et al. | Built on R, provides tools to enhance reproducibility of research | 2004 | 3,517 |
| Galaxy | Web-based analysis platform | Anton Nekrutenko, James Taylor | Provides easy access to high-performance computing | 2005 | 309 |
| MAQ | Short-read mapping | Heng Li, Richard Durbin | Integrated read mapping and SNP calling, introduced mapping quality scores | 2008 | 1,027 |
| Bowtie | Short-read mapping | Ben Langmead, Cole Trapnell, Mihai Pop, Steven Salzberg | Fast alignment allowing gaps and mismatches based on Burrows-Wheeler Transform | 2009 | 1,871 |
| Tophat | RNA-seq read mapping | , Lior Pachter, Steven Salzberg | Discovery of novel splice sites | 2009 | 817 |
| BWA | Short-read mapping | Heng Li, Richard Durbin | Fast alignment allowing gaps and mismatches based on Burrows-Wheeler Transform | 2009 | 1,556 |
| Circos | Data visualization | Martin Krzywinski et al. | Compact representation of similarities and differences arising from comparison between genomes | 2009 | 431 |
| SAMtools | Short-read data format and utilities | Heng Li, Richard Durbin | Storage of large nucleotide sequence alignments | 2009 | 1,551 |
| Cufflinks | RNA-seq analysis | Cole Trapnell, Steven Salzberg, Barbara Wold, | Transcript assembly and quantification | 2010 | 710 |
| IGV | Short-read data visualization | James Robinson et al. | Scalability, real-time data exploration | 2011 | 335 |

N/A: paper not available in Web of Science.


Table 2⁸ Role of bioinformatics software and computational methods in clinical research

LIST OF SOFTWARES USED IN CLINICAL TRIALS⁸

A few of the currently available off-the-shelf software packages for clinical trials are:⁸

E-CLINICAL⁸

Innovative e-Clinical technologies are now becoming essential for clinical data acquisition, aggregation, analysis, and decision-making for a new product. Two companies, e-Clinical solutions and e-Clinical works, are leaders in clinical solutions. The eClinical application (e.g. elluminate by e-Clinical solutions) provides a cost-cutting, broadly applicable and functionally rich solution to manage clinical trials and support EDC, all integrated into one system. The workflows developed in e-Clinical enable clinical operations to effectively plan each stage of a trial.⁸

ORACLE CLINICAL AND ORACLE REMOTE DATA CAPTURE⁸

It is also known as a relational database management system. It is used for managing database design and data acquisition for clinical studies. This allows objects to be reused for multiple studies. It saves time in study setup and ensures standardization and consistency of data collection and reporting. Oracle Clinical can be customized to contain views that allow the data to be browsed. System-generated error messages are programmed to conduct data validation. The Oracle Clinical application allows electronic data to be created, modified, maintained, and transmitted without compromising the authenticity, integrity, and confidentiality of the data.⁸

ELECTRONIC CASE REPORT FORM⁸

The electronic case report form (eCRF) is an electronic replica of the paper CRF in which the clinical trial subject data are captured in an electronic format. The clinical and nonclinical data of the subject (including medical procedures) are gathered directly into the interface of a central clinical database, thereby accelerating data transmission to the sponsor. In a multicentric trial, the importance of the eCRF greatly increases, as the data managers have continuous insight into data collection at one point. This makes data collection more efficient, which significantly contributes to the efficacy of the whole clinical trial.⁸
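The idea of system-generated error messages that validate data as it is entered into an electronic form can be pictured with a small sketch. It does not use or reproduce any Oracle Clinical or e-Clinical interface: the field names, the permitted ranges and the function name below are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List

# Hypothetical edit checks: field name -> (expected type, minimum, maximum).
EDIT_CHECKS = {
    "age_years": (int, 18, 99),
    "systolic_bp_mmhg": (int, 60, 260),
    "weight_kg": (float, 30.0, 300.0),
}


def validate_ecrf(record: Dict[str, Any]) -> List[str]:
    """Return one error message per failed check; an empty list means the record passes."""
    errors = []
    for field, (expected_type, low, high) in EDIT_CHECKS.items():
        value = record.get(field)
        if value is None:
            errors.append(f"{field}: value is missing")
        elif not isinstance(value, expected_type):
            errors.append(f"{field}: expected {expected_type.__name__}")
        elif not low <= value <= high:
            errors.append(f"{field}: {value} outside range {low}-{high}")
    return errors


if __name__ == "__main__":
    # A made-up subject record with one out-of-range and one missing value.
    for message in validate_ecrf({"age_years": 17, "systolic_bp_mmhg": 120}):
        print(message)
```

Keeping the checks in one table, separate from the checking loop, mirrors the point made above about reusing the same objects across multiple studies.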


Table 3⁸ Software and bioinformatics tools used in drug discovery

| Software | Description | Web URL |
|---|---|---|
| e-Drug3D | Database of 3D chemical structures of drugs that provides various ready-to-screen SD files of drugs and commercial drug fragments. It contains numerous compounds which act as natural inputs for various cheminformatics and virtual screening applications | http://chemoinfo.ipmc.cnrs.fr/e-drug3d.html |
| IDMap | Java-based software which predicts novel relationships between targets and chemicals with the aid of text mining and chemical structure information. IDMap has a huge implication in drug repositioning as it builds a convenient environment for identifying the possible lead (or even commercial chemical) and its drug target | http://www.equispharm.com/idmap |
| Promiscuous | The database is an exhaustive resource of protein-protein and drug-protein interactions intended to provide uniform data useful for drug repositioning and additional analysis. It provides three different types of entities: drugs, proteins, and side effects, as well as their relationships. The database consists of 25,000 drugs (experimental and withdrawn drugs) with commentary on relationships between drugs and protein or both | http://bioinformatics.charite.de/promiscuous |
| Predict | PREDICT is a computational method for inferring potential drug indications of both novel molecules and approved drugs | |
| QuantMap | QuantMap integrates and predicts the interrelationship between drug toxicity and other chemical entities to avoid toxicity. It is a connectivity map that explores and relates the biologically active chemicals to gene expression data | http://galaxy.predpharmtox.org |
| Connectivity mapping (Cmap) | Cmap is a catalog of genome-wide transcriptional expression data from cultured human cells treated with bioactive small molecules. It predicts the molecular response to a new chemical compound by linking its observed effect on gene expression changes with those of similar compounds. Cmap is a simple pattern-matching algorithm which establishes functional connections between drugs, genes, and disease | |
| DCDB | It is a database which consists of established drug combinations in a well-organized manner. The aim of DCDB is to facilitate systems-oriented new drug discovery. These organized drug combinations are selected from various clinical trials and the FDA electronic orange book | |
| SM2mR | SM2mR is a database which has a user-friendly interface to retrieve information about mRNA or small molecules. The database contains detailed information about mRNAs, their relationships, their expression pattern, FDA approval status, etc., which significantly supports the development of drugs, for example, mRNA therapeutics | http://bioinfo.hrbmuedu.cn/SM2miR/ |
| NCBI GEO | The GEO is a public repository containing data sets generated from high-throughput microarray and next-generation sequencing genomics. The data on expression of mRNA, genomic deoxyribonucleic acid, and protein provide support to the hypothesis, establish disease predictors, and provide inputs for algorithm development | http://www.ncbi.nlm.nih.gov/geo |
| DrugComboRanker | DrugComboRanker is a systematic computational tool to prioritize synergistic drug combinations and uncover their mechanisms of action | Available on request [email protected] |
| SaDA | SaDA is software targeted at the requirements of a laboratory engaged in huge collaborative projects. It is an integrated system comprising a database which can be customized for the management of collection, storage, and retrieval of information about study trial participants and biomedical samples of environmental origin | |

SOFTWARE AND BIOINFORMATICS TOOLS IN PHARMACOVIGILANCE

The following drug safety software is used by pharmaceutical companies:

ARISG⁸

It is the leading platform for both pharmacovigilance and clinical safety systems. This application provides an efficient, end-to-end solution for managing adverse event reporting in compliance with regulatory requirements and Food and Drug Administration 21 CFR Part 11. ARISg provides an integrated system for pharmacovigilance and risk management, thereby enabling pharmaceutical companies to monitor and evaluate their products for safety risk.⁸

ARGUS⁸

Oracle Argus is a software application for pharmacovigilance. It enables pharmaceutical companies to make faster and better safety decisions, optimize global compliance, and integrate risk management systems. Argus provides a customizable end-to-end safety process with automated case processing, periodic reporting, E2B intake and submission, detailed analytics, and safety operations integrated into a single system.⁸

APPLICATIONS OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY IN DRUG DISCOVERY AND DEVELOPMENT

Long-term use of antibiotics against bacterial infection has resulted in high levels of antibiotic resistance; hence there is a need to develop more effective drugs against bacterial pathogens.¹⁰ But the drug discovery process is time-consuming, expensive and laborious. The traditional drug discovery process includes:
(i) Identification of the disease and a specific drug target
(ii) Validation of the target
(iii) Assay development
(iv) Lead identification against the target
(v) Optimization of the lead candidate
(vi) Pre-clinical development including in-vivo and in-vitro studies
(vii) Filing for Investigational New Drug (IND) status with the Food and Drug Administration (FDA; USA)¹
(viii) Clinical development studies:
a. Phase I clinical trials (20–100 healthy volunteers)
b. Phase II clinical trials (100–500 volunteers)
c. Phase III clinical trials (1000 to 5000 volunteers)
(ix) NDA filed for FDA review
(x) Drug approved for marketing [Figure 6]¹⁰


Figure 6¹⁰ Drug discovery and development timeline

To overcome some of these problems, more sophisticated computational techniques are being used for the discovery of new drugs. These computational techniques play a significant role in the pharmacology sectors that cover worldwide drug development through the use of various software and tools.

Figure 7¹¹ Structure-based and ligand-based drug design approaches

COMPUTER AIDED DRUG DESIGN [CADD]

Two main groups of techniques are used, namely structure-based and ligand-based drug design approaches [Figure 7]¹¹.

STRUCTURE-BASED DRUG DESIGN (SBDD)

Structure-Based Drug Design (SBDD) exploits the three-dimensional structure of the biological target, obtained from X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy, and more rarely through homology modeling.¹¹ It includes:

VIRTUAL SCREENING APPROACH

The structure-based virtual screening approach uses the 3D structure of the target protein to screen the compounds present in chemical libraries through molecular docking studies.

MOLECULAR DOCKING

Molecular docking studies are used to understand the interaction between a ligand and a target protein at the atomic level.¹ This approach includes the search algorithm and the scoring function. The search algorithm is able to explore all the possible orientations and conformations of a small molecule within a target binding site.¹¹

SCORING FUNCTION

The scoring function is an approximate mathematical method used to predict and evaluate the strength of non-covalent interactions between small molecules and target proteins.¹¹ Software widely used for molecular docking studies includes Surflex-dock, AutoDock, GOLD, Glide, FlexX, DOCK and HADDOCK.⁹ The final goal of this approach is to predict whether chemical compounds are able to interact with a biological target, and with what affinity. The binding conformation of small molecules in their target, their intermolecular interactions and the structural changes of the drug/target complexes can be estimated through molecular mechanics and molecular dynamics (MD). Software packages used for MD simulations are GROMACS, NAMD, AMBER and CHARMM.⁹

LIGAND-BASED DRUG DESIGN (LBDD)

In contrast, Ligand-Based Drug Design (LBDD) exploits the knowledge of compounds able to interact with the biological target in order to identify a set of chemical features that ensure the molecules' activities. This model can be used to design new potent drug-like entities. The pharmacophore approach and quantitative structure-activity relationship (QSAR) are the most used ligand-based methods.¹¹

QSAR

Ligand-based virtual screening involves the use of QSAR studies. The QSAR method involves the development of mathematical models to correlate biological function with physicochemical characteristics. In the drug discovery process, QSAR methods are used for lead optimization. This is a major step in drug discovery. In 3D-QSAR, Comparative molecular field analysis (CoMFA) and Comparative molecular similarity indices analysis (CoMSIA) are the two techniques developed for ligand-based drug design.¹⁰ Software for the pharmacophore approach includes HipHop, DISCO, HypoGen and PHASE.⁹

EXAMPLES INCLUDE

TO FIND OUT BROAD SPECTRUM EFFECT OF NELFINAVIR¹²

The traditional approach to drug discovery of "one drug – one target – one disease" (which means one drug has its specific target to treat a specific disease) is insufficient, especially for complex diseases like cancer. One drug is likely to bind to multiple targets with varying affinity. However, identifying multiple targets for a drug is a complex and challenging task. To overcome this problem, a structural proteome-wide off-target determination pipeline was developed by integrating computational methods for high-throughput ligand binding site comparison and binding free energy calculations to predict potential off-targets for known drugs. This method was applied to identify human off-targets for Nelfinavir, an antiretroviral drug with anti-cancer behavior. The steps in the off-target pipeline are shown in Figure 8.¹² In the first step, the Nelfinavir binding pocket in the HIV protease dimer structure (PDB Id: 1OHR) was used to search 5,985 PDB structures of human proteins or homologs of human proteins using the SMAP software, which is based on a sensitive and robust ligand binding site comparison algorithm. Hits are considered significant if the SMAP p-value < 1.0e-3. In step 2, the binding poses and affinities of Nelfinavir to these putative off-targets are estimated using two docking methods, Surflex and eHiTs, starting from the superimposed binding sites. If the docking score indicates severe structural clashes between Nelfinavir and the predicted binding pocket, the protein is removed from the off-target list. After filtering by SMAP and the two docking programs, 92 putative off-targets remained for further analysis. Among them, the top 7 ranked off-targets belong to the aspartyl protease family that is the fusion form of the primary target HIV protease dimer. The remaining 85 proteins belong to different global folds from the primary target. These off-targets are dominated by protein kinases (PKs) (51 off-targets) and other ATP or nucleotide binding proteins (17 off-targets). Among the 51 protein kinases, the majority of predicted off-targets belong to the tyrosine kinase, cAMP-dependent, cGMP-dependent and protein kinase C families. The 12 top-ranked PKs with p-value smaller than 1.0e-4 were subjected to detailed protein-Nelfinavir docking, and 10 of them were further investigated through computationally intensive molecular dynamics simulations and MM/GBSA binding free energy calculations.¹² The results suggested that Nelfinavir inhibited multiple protein kinase targets. This led to the conclusion that broad-spectrum, low-affinity binding by a drug or drugs to multiple targets may lead to a collective effect which is important in treating complex diseases such as cancer.


Figure 8¹² Off-target pipeline flow diagram of Nelfinavir
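The two filtering steps of the Nelfinavir off-target pipeline (a significance cutoff on the binding-site comparison p-value, then removal of proteins whose docking shows severe structural clashes) can be sketched as below. This is a minimal illustration and not the SMAP, Surflex or eHiTs software: the `Candidate` record, its field names, the example identifiers, and the rule that a clash reported by either docking program removes the protein are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, List

SMAP_P_VALUE_CUTOFF = 1.0e-3  # significance threshold quoted in the text


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record for one protein structure screened against the drug."""
    protein_id: str
    smap_p_value: float      # step 1: binding-site comparison p-value
    clash_in_surflex: bool   # step 2: severe clash seen by the first docking program
    clash_in_ehits: bool     # step 2: severe clash seen by the second docking program


def filter_off_targets(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep significant binding-site hits that dock without severe clashes."""
    significant = [c for c in candidates if c.smap_p_value < SMAP_P_VALUE_CUTOFF]
    # Assumption: a clash in either docking program removes the protein.
    docked = [c for c in significant
              if not (c.clash_in_surflex or c.clash_in_ehits)]
    # Most significant hits first.
    return sorted(docked, key=lambda c: c.smap_p_value)


if __name__ == "__main__":
    # Made-up candidates; identifiers and values are purely illustrative.
    demo = [
        Candidate("protein_A", 5.0e-5, False, False),
        Candidate("protein_B", 2.0e-2, False, False),  # fails the p-value cutoff
        Candidate("protein_C", 1.0e-4, True, False),   # removed for a docking clash
        Candidate("protein_D", 8.0e-4, False, False),
    ]
    for hit in filter_off_targets(demo):
        print(hit.protein_id, hit.smap_p_value)
```

The later stages of the pipeline, detailed docking, molecular dynamics and MM/GBSA calculations, are deliberately left out, since they depend on specialised simulation software.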

TO FIND OUT PUTATIVE LEADS FOR DRUG REPOSITIONING THROUGH DRUG-TARGET INTERACTION PREDICTION¹³

The emergence of multi-resistant bacterial strains and the existing void in the discovery and development of new classes of antibiotics are a growing concern. Indeed, some bacterial strains are now resistant to last-line antibiotics and considered untreatable. Drug repositioning has been suggested as a strategy to minimize the time and cost required for a drug to reach the market, compared to traditional drug design. Drug-target interactions (DTIs) are the basis of rational drug design, and a computational approach that predicts DTIs solely from the primary sequence of the protein and the simplified molecular-input line-entry system (SMILES) representation of the ligand proved to be less time consuming. Computational techniques using molecular docking software helped to compare the binding affinities between a given ligand and a putative drug target. This strategy can greatly reduce the cost of lead screening and the time required for a drug to reach the market. Some examples of drugs successfully repositioned for uses different from their original indications include bupropion, fluoxetine, thalidomide and sildenafil. Sildenafil is probably the most popular example: it was initially used to treat hypertension and angina and is currently used for erectile dysfunction.¹³

DEVELOPMENT OF 'NETWORK-RECONSTRUCTION' METHODS FOR DRUG DISCOVERY¹⁴

Computational biology and bioinformatics have the potential to change the way drugs are designed. The review article by Alberto Ambesi-Impiombato and Diego di Bernardo gives detailed information about how computational methods like classification and network-based algorithms¹⁴ can be used to understand the mode of action and the efficacy of a given compound and to help elucidate the pathophysiology of a disease. Through the use of sophisticated computational biology and bioinformatics tools, a shift from single-target drugs to "network drugs" was achieved, where a network drug is defined as a compound or a set of compounds that is able to alter a biological pathway dysregulated by a disease in a predefined way so as to restore its normal physiological function.

OPEN SOURCE DRUG DISCOVERY [OSDD]

Open source drug discovery is a model based upon the open source movement within the computer software industry. It takes two primary attributes, namely collaboration of volunteers and free access to the results, and applies them to drug discovery. In drug discovery, open source means that all virtual and laboratory results are published with as much of the raw data available as possible. This should include enough data for someone knowledgeable in the topic to review and critique the data. Collaboration across organizational and geographical boundaries offers several benefits. If enough researchers can be incentivized to collaborate, even small contributions by many researchers can significantly progress a project. It also opens a project to new external ideas and approaches. It is anticipated that the majority of the researchers will contribute on a volunteer basis, thereby reducing the cost of the project.¹⁵ Open source software used in OSDD includes R, Bioconductor, Bio-Perl, Bio-Java, Bio-Python, etc.¹⁵

OSDD IN TREATING TB

Globally, over 1 billion people are affected by neglected tropical diseases [NTDs] including TB, HIV, leishmaniasis and malaria,¹⁶ and the global incidence of TB is increasing by 0.6% per annum. The current treatment relies on directly observed therapy [DOT], a combinatorial therapy that involves 4 medicines: isoniazid, rifampicin, pyrazinamide and ethambutol [first-line treatment].¹⁸ Wherever the first line of treatment fails, second-line therapy is given, for example cycloserine, , amikacin and para-aminosalicylic acid.¹⁷ The inadequacy and flawed administration of extended treatment have resulted in the emergence of MDR (multidrug-resistant) and XDR (extensively drug-resistant) TB. New, highly potent and fast-acting drugs with short treatment regimens are essentially required for the treatment of TB.¹⁸ The OSDD approach was adopted in 2006 in India for tackling TB, and the project was launched in 2008.¹⁷

OSDD IN DRUG DISCOVERY

Firstly, the Sysborg TB (System Biology of Organism) portal [Figure 9]¹⁸ was developed, where the entire community can share knowledge, data, procedures, methods, algorithms, etc., which can be used, reused and modified for further activities. Using this information, bioinformatics and computational biology tools are applied to discover the target structure, its mechanism of action, and its safety and efficacy. This reduced the time needed for a new chemical entity to come to market and also reduced the cost of the drug.¹⁸ Thus, by using OSDD, India aims to control neglected tropical diseases like TB [Figure 10]¹⁹.

Figure 9¹⁸ Sysborg TB portal

Figure 10¹⁹ Comparison between OSDD past and present status


APPLICATIONS OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY TO INFLUENZA SURVEILLANCE AND VACCINE STRAIN SELECTION²⁰

By applying bioinformatics tools, the WHO global influenza surveillance network provided a wealth of data. When these data were combined with a number of computational and mathematical advances, they increased the resolution at which antigenic surveillance data can be analyzed, provided methods for genetic analysis and prediction, and increased the understanding of the determinants of repeated influenza vaccination. These advances increase the information extracted from influenza surveillance and increase the quantitative data available for the vaccine strain selection process, which further helped in controlling influenza drastically.²⁰

IN DEVELOPMENT OF PRECISION MEDICINE²¹

In the development of precision medicine, some of the branches of bioinformatics collaborated with computational biology. These branches are called omics techniques and include genomics, proteomics, metabolomics and metagenomics. Of these, metabolomics is mainly used in the development of precision medicine. Metabolomics experiments have provided a vast amount of raw data that have to be processed and analyzed. Data processing, multivariate statistical analysis and machine learning, data integration and visualization of raw data were performed using computational open-source software tools. These open-source software tools include XCMS and XCMS Online, MZmine, xMSanalyzer and OpenMS. In combination with data analysis and multiple-platform data integration tools, such as MetaboAnalyst, Metabolite Set Enrichment Analysis (MSEA), Molecular Networking Approach and Metabolic Pathway Analysis (MetPA), these open-source software tools have been successfully used in metabolomics-driven precision medicine studies for biomarker discovery and validation as well as for patient stratification. Mass spectral libraries and publicly available reference biochemical and pathway databases are designed to facilitate metabolite identification and data integration in metabolomics. Many such databases are available for precision medicine metabotyping, the most commonly used of which are METLIN, Kyoto Encyclopedia of Genes and Genomes (KEGG), MassTRIX, Madison Metabolomics Consortium Database (MMCD), Human Metabolome Database, DrugBank, LIPID MAPS, PubChem and ChemSpider. This collaboration of bioinformatics tools and computational biology software has helped in the detection of disease at an early stage and has also helped in curing it through the development of precision medicine. Examples include the evidence-based software TREATMENTMAP, which used a panel of pharmacogenomic markers to probe the genomic sequences from pancreatic tumors. This diagnostic tool was shown to identify known driver mutations as well as biomarkers for effective treatment.²¹

IN FOOD²²

Bioinformatics is one of the rapidly expanding fields of technology in life science research. It provides details of the molecular basis of human health. The immediate benefits of this information help us to extend our understanding of the role of food in the health and well-being of consumers. Bioinformatics, along with computational tools, plays an important role in predicting and assessing the desired and undesired effects of microorganisms on food, and in genomics and proteomics studies to meet the requirements of food production, food processing, improving the quality and nutritive value of food sources, and many others. In addition, bioinformatics approaches can also be used in producing good-quality crops with high yield and disease resistance. There are also a variety of databases that contain data on foods, their constituents, nutritive value, chemistry and biology.²²

FOOD TASTE²³

Scientists have found out the molecular and genetic details of the taste receptors. These include:
• Sour: For sour taste, an ion channel identical to degenerin-1 is found to be the receptor.
• Bitter: A family of ~50 G protein-coupled receptors (GPCRs) has been identified in human taste cells.
• Umami: For umami, mGluR4, which is a 'splice variant' of the brain glutamate receptor, has been identified in rat taste cells.
• Sweet: The G protein-coupled receptor Tas1r3 has been identified as a sweetness receptor.
• Salt: The epithelial ion channel ENaC is responsible for over 80% of salt taste transduction.²²
These taste receptors can be used to discover the next generation of taste modifiers for foods. New developments in computational algorithms and software, together with the available known structures of these receptors, have made molecular modeling and simulations possible. Such simulations will make it possible to develop more intense tasting compounds as food additives. They also help in understanding the basis of taste persistence, antagonism and complementation. Bioinformatics sequence and similarity algorithms have been used to determine homology between sweet taste receptors and brain glutamate receptors, as well as in the identification of sour taste sensors in mammals.

ALLERGEN DETECTION

There are a few databases dedicated to food allergens, which include AllerMatch, Informall, the FARRP Allergen database and SDAP.²³

BIOACTIVE PEPTIDES

Bioactive peptides of food are peptides that are present within the food and exert biological activities such as antioxidative, antihypertensive, mineral-binding, opiate-like, antimicrobial, immuno-, and cytomodulating activity.²³ They exhibit a range of functional activities that exceed their fundamental nutritional role. Consequently, there

This article can be downloaded from www.ijpbs.net B-236

Int J Pharma Bio Sci 2019 July; 10(3): (B) 224-239 has been increased academic and commercial interest DATABASES IN FOOD SCIENCES24 in the use of foods enriched in bioactive peptides for FooDB promotion of health and alleviation of various conditions. FooDB is the comprehensive resource on food Thus, it has become apparent that the use of constituents, chemistry and biology. It provides bioinformatics can serve as a valuable strategy in the information on both macronutrients and micronutrients discovery of novel bioactive peptides. There have been including many of the constituents that give color, flavor, many approaches developed for the prediction of texture, taste and aroma to the food. The data are bioactive peptides. Examples include computational obtained from the literature and it includes data on the methods devised for prediction of antimicrobial peptides. compound’s description, nomenclature, chemical class, QSAR models which are types of regression model, information on its structure, its physicochemical data, its have also been used in prediction of bioactive peptides, 24 food source(s), its color, taste, aroma, physiological specifically antithrombotic peptides. effect, presumptive health effects and concentrations in

various foods. FOOD QUALITY AND SAFETY There is a growing appreciation for bioinformatics in the FOODWIKI DATABASE area of food quality and safety. As the genome It is the repository for food and nutritional information in sequencing projects are now focusing on the food borne a consensus style. It utilizes the immense amount of pathogens and innovative ways which will help in data that is possible and easily managed by determining the source of the food borne illnesses. In bioinformatics strategies and protocols. Such resources the future, molecular markers may help in identification will advance and develop food and nutritional sciences of the occurrence of spoilage and pathogenic bacteria with a view to improving the quality and nutritive value of and prediction of thermal preservation stress resistance. food sources. [Figure 11] 24 describes the use of Bioinformatics experts have developed a tool for computational tools in arranging all types of food data in detecting and identifying bacterial food pathogens. This homogenous format. Formatting and integrating food tool has been developed by FDA (Food and Drug and nutritional data in such a manner allows Administration) for molecular characterization of 23 comparative analysis, which is essential for the bacterial food borne pathogens using microarrays. discovery of new reactions and opportunities.24
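The idea of holding food data in one homogeneous format, as described for the FoodWiki database, can be illustrated with a small Python sketch. It is not the actual FoodWiki schema: the class names FoodEntry and HealthEffect, their fields and the helper foods_for_disease are hypothetical, chosen only to mirror the description of Figure 11 (a food source at the centre, linked to disease-lowering effects whose active ingredient may be known, unknown, or the entire food). The names "food A" and "condition X" are placeholders, not real data.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HealthEffect:
    """One disease-lowering effect recorded for a food source (hypothetical schema)."""
    disease: str
    # None means the active ingredient is not known.
    active_ingredient: Optional[str] = None
    # True means the effect is attributed to the entire food.
    whole_food: bool = False


@dataclass
class FoodEntry:
    """Central record: the name(s) of the food source(s) plus the linked effects."""
    food_sources: List[str]
    effects: List[HealthEffect] = field(default_factory=list)


def foods_for_disease(entries: List[FoodEntry], disease: str) -> List[str]:
    """Return the sorted names of all food sources linked to the given disease."""
    wanted = disease.lower()
    names = set()
    for entry in entries:
        # Case-insensitive match on the disease name.
        if any(effect.disease.lower() == wanted for effect in entry.effects):
            names.update(entry.food_sources)
    return sorted(names)


# Placeholder records only; they do not describe real foods or conditions.
entries = [
    FoodEntry(["food A"], [HealthEffect("condition X", active_ingredient="compound 1")]),
    FoodEntry(["food B", "food C"], [HealthEffect("Condition X", whole_food=True),
                                     HealthEffect("condition Y")]),
]
print(foods_for_disease(entries, "condition x"))
```

Because every record has the same shape, a comparative question such as "which food sources are linked to a given condition" becomes a simple filter over the records, which is the kind of comparative analysis that a homogeneous format is meant to support.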

Figure 11. Representation of the flow of information in the FoodWiki Database.24

The central part of each piece of information in the database is represented by the name(s) of the food source(s), shown in a circle in Figure 11. The colored boxes (red, blue, green, yellow) contain examples of what can be stored for a given disease-lowering effect of a given food source. The active ingredients may or may not be known, and in some cases can be represented by an entire food.

CONCLUSION

From this review it is concluded that advancement in the field of bioinformatics over the years has helped in many ways, including reducing the time and cost of the drug discovery and development process through the use of sophisticated computational techniques. This has helped in the development of new drugs at a faster rate. The collaboration of bioinformatics tools and computational biology software has helped in detecting disease at an early stage and has also helped in curing it through the development of precision medicine. In addition, applying bioinformatics tools and computational software to food provides details of the molecular basis of human health. The immediate benefit of this information is that it extends our understanding of the role of food in the health and well-being of consumers. This review will help researchers in the field of bioinformatics in areas including accurate RNA and protein structure prediction, precise models for understanding exactly how biological sequences evolve, and the development of new algorithm designs and analysis software.


AUTHORS CONTRIBUTION STATEMENT

Dr. Rajashree Mashru has constantly guided this manuscript and Margi Gandhi has prepared the manuscript.

CONFLICT OF INTEREST

Conflict of interest declared none.

REFERENCES

1. Achuthsankar S. Nair. Computational Biology & Bioinformatics: A Gentle Overview. Communications of the Computer Society of India. 2007 Jan. Available from: https://www.researchgate.net/profile/Achuthsankar_Nair/publication/231337374_Computational_Biology_Bioinformatics_A_Gentle_Overview/links/5527598a0cf2e486ae40fe75.pdf
2. Lewis University. A Catholic and Lasallian University, One University Parkway, Romeoville, IL 60446-2200. Lewis University online programs. Computational biology and bioinformatics: two fields changing the world. Available from: https://online.lewisu.edu/msds/resources/computational-biology-and-bioinformatics-two-fields-changing-the-world
3. The Ohio State University. College of Medicine. Biomedical Sciences Graduate Program. Computational Biology and Bioinformatics. Available from: https://medicine.osu.edu/bsgp/areas-of-research-emphasis/computational-biology-and-bioinformatics/pages/index.aspx
4. Gustau Camps-Valls and Alistair M Chalk. Bioinformatics and Computational Biology. Int J Curr Pharm Res. 2009. Available from: https://www.researchgate.net/publication/223131428_Bioinformatics_and_Computational_Biology
5. Marvin Johnston. Two Basic Types of Cells Prokaryotic Cells And Eukaryotic Cells. Available from: https://slideplayer.com/slide/10633599/
6. Ami McKenzie. The Central Dogma of Life. Available from: https://slideplayer.com/slide/5668245/
7. Stephen Altschul, Barry Demchak, Richard Durbin, Robert Gentleman, Martin Krzywinski, et al. The anatomy of successful computational biology software. Nat Biotechnol. 2013 Oct;31(10):894-7. DOI: 10.1038/nbt.2721
8. Supreet Kaur Gill, Ajay Francis Christopher, Vikas Gupta and Parveen Bansal. Emerging role of Bioinformatics tools and software in evolution of clinical research. Perspect Clin Res. 2016;7(3):115-22. DOI: 10.4103/2229-3485.184782
9. Kullappan Malathi and Sudha Ramaiah. Bioinformatics approaches for new drug discovery: a review. Biotechnol. Genet. Eng. Rev. 2018;34(2). DOI: 10.1080/02648725.2018.1502984
10. Rahul pharma. Drug discovery and development. [Internet]. 2011 Dec 27. Available from: https://www.slideshare.net/rahul_pharma/drug-discovery-and-development-10698574
11. Giorgio Cozza. The Development of CK2 Inhibitors: From Traditional Pharmacology to in Silico Rational Drug Design. Pharmaceuticals. 2017;10(1). DOI: https://doi.org/10.3390/ph10010026
12. Xie L, Evangelidis T, Xie L, Bourne PE. Drug Discovery Using Chemical Systems Biology: Weak Inhibition of Multiple Kinases May Contribute to the Anti-Cancer Effect of Nelfinavir. PLoS Comput Biol. 2011;7(4):e1002037. DOI: 10.1371/journal.pcbi.1002037
13. Coelho ED, Arrais JP, Oliveira JL. Computational Discovery of Putative Leads for Drug Repositioning through Drug-Target Interaction Prediction. PLoS Comput Biol. 2016;12(11):e1005219. DOI: 10.1371/journal.pcbi.1005219
14. Alberto Ambesi-Impiombato and Diego di Bernardo. Computational Biology and Drug Discovery: From Single-Target to Network Drugs. Curr Bioinform. 2006;1(1):3-13. DOI: 10.2174/157489306775330598
15. Christine Årdal and John-Arne Røttingen. Open Source Drug Discovery in Practice: A Case Study. PLoS Negl Trop Dis. 2012 Sep;6(9):e1827. DOI: 10.1371/journal.pntd.0001827
16. Nisha Chandran and Samir K. Brahmachari. A decade of OSDD for TB: role and outcomes. Current Science. 2018;115(10):1858
17. Anshu Bhardwaj, Vinod Scaria, Gajendra Pal Singh Raghava, Andrew Michael Lynn, Nagasuma Chandra, et al. Open source drug discovery - A new paradigm of collaborative research in tuberculosis drug development. Tuberculosis. 2011;91(5):479-86. DOI: 10.1016/j.tube.2011.06.004
18. Open Source Drug Discovery. Available from: http://www.osdd.net/research-development/why-tb-as-first-target
19. Miranda Winifred Butler. Open Source Drug Discovery (OSDD). 2015. Available from: https://slideplayer.com/slide/14685003/
20. Derek J. Smith. Applications of bioinformatics and computational biology to influenza surveillance and vaccine strain selection. Vaccine. 2003;21(16):1758-61. DOI: 10.1016/S0264-410X(03)00068-9
21. Rajeev K. Azad and Vladimir Shulaev. Metabolomics technology and bioinformatics for precision medicine. Brief Bioinform. 2018:1-15. DOI: 10.1093/bib/bbx170
22. K. Rani & Dr. K. Gomathi. Application of Bioinformatics in Food Quality Control. International Journal of Current Research and Modern Education. 2017:43-6. Available from: http://ijcrme.rdmodernresearch.com/wp-content/uploads/2017/03/ICSACSRA-013.pdf
23. Anil Kumar and Nikita Chordia. Bioinformatics Approaches in Food Sciences. J Food Microbiol Saf Hyg. 2017;2(2). DOI: 10.4172/2476-2059.1000e104
24. Therese A. Holton, Vaishnavi Vijaykumar and Nora Khaldi. Bioinformatics: current perspectives and future directions for food and nutritional research facilitated by a food-wiki database. Trends Food Sci Technol. 2013;34(1):5-17. DOI: 10.1016/j.tifs.2013.08.009
