View metadata, citation and similar papers at core.ac.uk brought to you by CORE Journal of Electronics and Systemsprovided by MAT Journals Volume 4 Issue 1

Research Scenario of Bio Informatics in Big Data Approach

S.Jafar Ali Ibrahim1, Dr.M.Thangamani2, D. Sarathkumar3 1Doctoral Research Fellow, School of and Communication Engineering, Anna University, Chennai, Tamil Nadu, India 2Assistant Professor, Department of Computer Technology, Kongu Engineering College, Perundurai, Tamil Nadu, India 3Assistant Professor, Department of Electrical & Electronics Engineering, Kongu Engineering College, Perundurai, Tamil Nadu, India Email: [email protected] DOI: http://doi.org/10.5281/zenodo.2596987

Abstract Big Data can unify all patient related data to get a 360-degree view of the patient to analyze and predict outcomes. This investigation examines the concepts and characteristics of Big Data, concepts about Translational Bio Informatics and some public available big data repositories and major issues of big data. This issue covers the area of medical and healthcare applications and its opportunities.

Keywords: Big Data, Bio Informatics, Drug Discovery, Computational Intelligence Methods, Health Informatics, Health care data mining

Big Data Concepts Big data is a blanket term for the non- Big data life cycle looks like traditional strategies and technologies So how is data really handled when needed to gather, organize, process, and managing with a big data framework? gather insights from large datasets. While ideas to exertion differ, there are Characteristics of big data can be some populace in the scenario and described us 6 V’s, that are following software that we can discuss for the most Volume, Velocity, Variety, Value, part. While the means exhibited Variability and Veracity [1, 2, 3] underneath won't not be valid in all cases, they are broadly utilized. Volume The general tier of task embroiled with big It refers to as terabytes, petabytes, and data processing is: zettabytes of data. This focus on near  Ingesting data into the system instant feedback has driven many big data  Persisting the data in storage practitioners away from a batch-oriented  Computing and Analyzing data approach and closer to a real-time  Visualizing the results streaming system. Data is constantly being In Big data technology, we will take a added, massaged, processed, and analyzed moment to talk about clustered computing, in order to keep up with the influx of new an important strategy employed by most information and to surface valuable big data solutions. information early when it is most relevant. CLUSTERED COMPUTING Variety Resource Pooling: Combining the While more traditional data processing available storage space to hold data is a systems might expect data to enter the clear benefit, but CPU and memory pipeline already labeled, formatted, and pooling is also extremely important. organized, big data systems usually accept and store data closer to its raw state.

18 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

High Availability: Clusters can provide health plan websites and smartphone, etc.) varying levels of fault tolerance and [10] availability guarantees to prevent hardware or software failures from affecting access Clinical reference and health to data and processing. publication data It refers to reference data for clinical, Easy Scalability: Clusters make it easy to claim, and business data to enable scale horizontally by adding additional interoperability, drive compliance, and machines to the group. improve operational efficiencies. There is often noisy data or false Text-based publications (journals articles, information in big data. The focus of Big clinical research and medical reference Data is on correlations, not causality [4]. material) and clinical text-based reference practice guidelines and health product CATEGORIES OF MEDICAL BIG (e.g., drug information) data [7, 12]. DATA Data in healthcare can be categorized as Administrative, Business and External follows. Data  Insurance claims and related financial Genomic Data data, billing and scheduling [10] Such data are gathered by a  Biometric data: Fingerprints, system or genomic data processing handwriting and iris scans, etc software. Data sequencing analysis  Other Important Data techniques and variation analysis are  Device data, adverse events and patient common processes performed on genomic feedback, etc. [9] data. The aim of genomic data analysis is  The content from portal or Personal to determine the functions of specific Health Records (PHR) messaging genes. It refers to genotyping, gene (such as e-mails) between the patient expression and DNA sequence [6, 7]. and the provider team; the data generated in the PHR Ingesting data Clinical Data into the system A term defined in the context of a clinical t  Persisting the data in storage rial for data pertaining to the health status  Computing and Analyzing data of a patient or subject [8]. About 80% of  Visualizing the results this type data are unstructured documents, images and clinical or transcribed notes [9] Big data in Health Informatics: Structured data (e.g., laboratory data, However, the scope of this study will be structured EMR/HER) research that uses data mining in order to

answer questions throughout the various Behaviour Data and Patient Sentiment levels of health[13]. Data

Behavioural data refers to information The scope of data used by the subfield produced as a result of actions, typically TBI, on the other hand, exploits data from commercial behaviour using a range of each of these levels, from the molecular devices connected to the Internet, such as a level to entire populations [14]. PC, tablet, or Smartphone. Behavioural data tracks the sites visited, the apps BIG DATA AND DRUG DISCOVERY downloaded, or the games played. • Web In today drug discovery environment, Big and social media data Search engines, Data plays a vital role due to its 5 V Internet consumer use and networking concepts. These databases provide sites (Facebook, Twitter, Linkedin, blog, information about the drugs, their adverse

19 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

reactions, 1chemical formula, information (protein/peptide) drugs, 112 nutraceuticals about metabolic pathways, drug targets, and over 5,125 experimental drugs. disease for which a particular drug is used Additionally, 4,924 non-redundant protein etc. None of the existing (i.e. drug pharmacogenomic databases carry the target/enzyme/transporter/carrier) complete integrated information and hence sequences are linked to these drug entries. there is a need to develop a database which Each Drug Card entry contains more than integrates data from all the widely used 200 data fields with half of the information databases [38]. being devoted to drug/chemical data and Integrating big data analytics and the other half devoted to drug target or validating drugs in silico has the potential protein data. to improve the cost-effectiveness of the drug development pipeline. Big data– CTD driven strategies are being increasingly The whole database is categorized in to 11 used to address these challenges. types: Computational prediction of drug toxicity Chemicals, genes, chemical-gene/protein and pharmacodynamic/pharmacokinetic interactions, diseases, gene-disease properties, based on integration of multiple associations, chemical-disease data types, helps prioritize compounds for associations, references, organisms, gene in vivo and human testing, potentially ontology, pathways and exposures. reducing costs[39]. Reactome DRUG DISCOVERY RELATED BIG It has cross-referenced to several other DATA SOURCES databases such as Ensembl [44] and Data sets and resources available on UniProt. The pathways within the database Related to drug discovery are scattered in especially those pertaining to those in various databases and online resources and humans may be used for research and most of these databases are interlinked analysis, pathways modelling, systems based on the information they carry. Some biology as well as pharmacogenomics of these databases include PharmGKB applications to analyze effects of drug [40], DrugBank [41], CTD [42], Reactome pathway alterations on drug response and [43], KEGG [46], STITCH [47], PACdb phenotypes [45]. [48], dbGaP [49] IGVdb, PGP [50]. Brief explanation of the databases are given in KEGG the following section and also tabulated in It is an integrated resource of systems table 2. information (KEGG Pathways, KEGG Brite, KEGG Module, KEGG Disease, PharmGKB KEGG Drug and KEGG Environ), PharmGKB is a pharmocogenomics genomics information (KEGG Orthology, database that carries all the clinical KEGG Genes, KEGG Genome, KEGG information along with the dosage DGenes and KEGG SSDB) and chemical guidelines, gene-drug associations and information (KEGG Compounds, KEGG genotype phenotype relationships. It also Glycans, KEGG Reaction, KEGG RPair, has information about Variant KEGG RClass and KEGG Enzyme). Annotations, Clinical Annotations and Very Important Pharmacogene (VIP) STITCH summaries, drug-centered pathways. STITCH (Search Tool for Interacting Chemicals) is a database of known and Drug Bank predicted interactions between chemicals Drug Bank database is the open resource and proteins. The interactions include for drug, drug targets, and chemo direct (physical) and indirect (functional) informatics. It contains 11,067 drug entries associations; they stem from including 2,525 approved small molecule computational prediction, from knowledge drugs, 960 approved biotech transfer between organisms, and from

20 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

interactions aggregated from other Pharmacogenetics-Cell line database for (primary) databases. It also includes data use as a central repository of on interactions between 210,914 small pharmacology-related phenotypes that molecules and 9’643’763 proteins from integrates genotypic, gene expression, and 2'031 organisms pharmacological data obtained via lymphoblastoid cell lines. 90 YRI LCLs as Other databases well as ExiqonmiRNA baseline data from dpGaP (Database of Genotypes and 60 unrelated CEU and 60 unrelated YRI Phenotypes) is database of genotype- have been deposited in the PACdb phenotype association studies, genome- database. wide association studies, as well as associations between genotype and non- IGVd (Indian Genome Variation database) clinical traits. It was developed to archive contains information about SNP, CNVs in and distribute the data and results from over studies that have investigated the 1000 genes of biomedical important interaction of genotype and phenotype in metabolic and genetic networks and also Humans. genes of pharmacogenetic relevance [51].

PACdb (Pharamacogenomics and Cell There are many other biological databases database) contains information on the such as Uniprot, GO, GenBank, PDB have relationships between SNPs, gene cross-reference to above databases whose expression and cellular sensitivity to drugs information may serve as essential source analyzed in cell-based models. It is a for drug and it related studies.

Table 1: Levels of Data Data level(s) Question level(s) Sections Subsections Questions to be answered Used answered 1. What sub-type of cancer does Using Micro Using Gene Expression a patient have? [18] 2. Will a Level Data – Molecular Data to Make Clinical Clinical patient have a relapse of cancer? Molecules Predictions [19] Creating a Connectivity Human-Scale Can a full connectivity map of Tissue Map of the Brain Using Using Tissue Biology the brain be made [20,21]? Brain Images Level Data Using MRI Data for Do particular areas of the brain Patient Clinical Clinical Prediction correlate to clinical events? [22] 1. Should a patient be released from the ICU, or would they Prediction of ICU benefit from a longer stay?[23- Readmission and 25] 2. What is the 5 Mortality Rate year expectancy of a patient Using Patient Patient Clinical over the age of 50? [26] Level Data 1. What ailment does a patient have (real-time prediction) Real-Time Predictions [27,28] 2. Is an infant Using Data Streams experiencing a cardiorespiratory spell (real-time)? [29] Using Message Board Can message post data be used Data to Help Patients Clinical for dispersing clinically reliable Obtain Medical information? [30,31] Information Can search query data be used to Using Population Tracking Epidemics accurately track epidemics Level Data – Population Epidemic-Scale Using Search Query Data throughout a population? Social Media [32,33] Can Twitter post data be used to Tracking Epidemics accurately Epidemic-Scale Using Twitter Post Data track epidemics throughout a population?[34,35]

21 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

Table 2: Some Bio Informatics related Big Data Resources Which is publicly available Category Name Description URL Literature mining PolySearch 2.0 Web-based text mining tool http://polysearch.cs.ualberta.ca Extensive library of machine learning Machine learning Weka algorithms with a user-friendly http://www.cs.waikato.ac.nz/ml/weka/ interface Database of drug chemical, structural, Drug Bank pharmacological, and target http://www.drugbank.ca Database information Comprehensive database of structural, PubChem pharmacological, and biochemical https://pubchem.ncbi.nlm.nih.gov/ activity data Protein Data Repository of protein structural data http://www.wwpdb.org Bank Web tool predicting pharmacological admetSAR and toxicology parameters based on http://lmmd.ecust.edu.cn:8000/ chemical structures The Drug Gene Interaction Database of known drug-gene http://dgidb.genome.wustl.edu/ Cheminformatics Database connections for selected genes (DGIdb) SIDER Database of drug adverse effects http://sideeffects.embl.de/ Database of functional cellular Library of responses to genetic and Integrated pharmacological perturbations http://lincsportal.ccs.miami.edu/dataset Cellular measured in multiple types of s/ Signatures biomolecules (eg,transcriptome and (LINCS) kinome) Database/knowledge base of high- throughput compound screens and ChemBank http://chembank.broadinstitute.org/ other small molecule–related information Searchable/downloadable database of DAVID https://david.ncifcrf.gov/ molecular pathway knowledge base Molecular pathway NDEx Biological network knowledge base http://www.home.ndexbio.org/ knowledgebase/ Molecular Repository of molecular signatures analysis tool Signatures from curated databases, publications, http://www.broadinstitute.org/msigdb Database and research studies (MSigDb) Gene Expression Repository of raw and processed omics http://www.ncbi.nlm.nih.gov/geo/ Omnibus data Sequence Read Repository of sequencing data http://www.ncbi.nlm.nih.gov/sra Archive Omics data Repository of raw and processed omics repositories Array Express https://www.ebi.ac.uk/arrayexpress/ data Repository of genomic, proteomic, The Cancer https://tcga-data.nci.nih.gov/tcga/ histological, and clinical data for a Genome Atlas tcgaHome2.jsp wide variety of cancers

CONCLUSION extremely recently, due to the increasing Big data is a broad, rapidly evolving topic. capability of both computational resources This survey discussed a number of recent and the algorithms which take advantage studies being done within the most popular of these resources. Research on using these sub branches of Health Informatics, using tools and techniques for Health Big Data from all accessible levels of Informatics is critical, because this domain human existence to answer questions requires a great deal of testing and throughout all levels. Analyzing Big Data confirmation before new techniques can be of this scope has only been possible applied for making real world decisions

22 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

across all levels. The fact that 75: 637-641. DOI: computational power has reached the 10.12968/hmed.2014.75.11.637. ability to handle Big Data through efficient 10. Terry, N.P., 2013. Protecting patient algorithms. The use of Big Data provides privacy in the age of big data. UMKC advantages to Health Informatics by Law Rev., 81: 385-415. allowing for more tests cases or more 11. Shrestha, R.B., 2014. Big data and features for research, leading to both cloud computing. Applied Radiology. quicker validations of studies. 12. Miller, K., 2012. Big data analytics in biomedical research. Biomedical REFERENCES Computation Review. 1. Eaton, C., D. Deroos, T. Deutsch, G. 13. Herland et al.: A review of data mining Lapis and P. Zikopoulos, 2012. using big data in health informatics. Understanding big data. McGraw-Hill Journal of Big Data2014 1:2. Companies. doi:10.1186/2196-1115-1-2 2. O’Reilly Radar Team, 2012. Planning 14. Chen J, Qian F, Yan W, Shen B (2013) for big data. O’Reilly. Translational biomedical informatics in 3. Zikopoulos, P., C. Eaton, D. de Roos, the cloud: present and future. BioMed 2012. Understanding big data: Res Analytics for enterprise class hadoop Int2013:8.[http://dx.doi.org/10.1155/20 and streaming data. McGraw-Hill, 13/658925] New York. 15. McDonald E, Brown CT (2013) 4. Bottles, K. and E. Begoli, 2014. khmer: Working with big data in Understanding the pros and cons of big Bioinformatics. CoRR abs/1303.2223: data analytics. Physician Exec., 40: 6- 1–18 12 16. Beltrame, F. and Koslow, S. H. 5. Zaslavsky, A., C. Perera and D. (1999). as a mega Georgakopoulos, 2012. Sensing as a science issue. IEEE Transactions on service and big data. Proceedings of in the International Conference on Biomedicine, 3(3):239-240. Advances in Cloud Computing (ACC’ PMID: 10719488. 12), Bangalore, India, pp: 1-8. 17. Luscombe, N. M., Greenbaum, D., and 6. Chen, H.C., R.H.L. Chiang and V.C. Gerstein, M.(2001). What is Storey, 2012. Business intelligence and bioinformatics? a proposed definition analytics: From big data to big impact. and overview of the field. Method. MIS Q., 36: 1165-1188. Inform. Med., 40(4):346-258. 7. Priyanka, K. and N. Kulennavar, 2014. PMID: 11552348. A survey on big data analytics in 18. Haferlach T, Kohlmann A, Wieczorek health care. Int. J. Comput. Sci. L, Basso G, Kronnie GT, Béné MC, Inform. Technologies, 5: 5865-5868. De Vos J, Hernández JM, Hofmann 8. Segen's Medical Dictionary. S.v. WK, Mills KI, Gilkes A, Chiaretti S, "clinical data." Retrieved April 13 Shurtleff SA, Kipps TJ, Rassenti LZ, 2018 Yeoh AE, Papenhausen PR, Wm Liu, from https://medicaldictionary.thefreed Williams PM, For (2010) Clinical ictionary.com/clinical+data utility of microarray-based gene 9. Yang, S., M. Njoku and C.F. expression profiling in the diagnosis Mackenzie, 2014. ‘Big data’ and subclassification of leukemia: approaches to trauma outcome report from the international prediction and autonomous microarray innovations in leukemia resuscitation. Brit. J. Hospital Med., study group. J Clin Oncol 28(15): 2529–

23 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

2537.[http://jco.ascopubs.org/content/2 [http://www.sciencedirect.com/science 8/15/2529.abstract] /article/pii/S0957417412008020] 19. Salazar R, Roepman P, Capella G, 25. Ouanes I, Schwebel C, Franais A, Moreno V, Simon I, Dreezen C, Bruel C, Philippart F, Vesin A, Soufir Lopez-Doriga A, Santos C, Marijnen L, Adrie C, Garrouste-Orgeas M, C, Westerga J, Bruin S, Kerr D, Timsit JF, Misset B (2012) A model to Kuppen P, van de Velde C, Morreau predict short-term death or readmission H, Van Velthuysen L, Glas AM, Van’t after intensive care unit discharge. J Veer LJ, Tollenaar R (2011)Gene Crit Care 27(4): 422.e1–422.e9. expression signature to improve [http://www.sciencedirect.com/science prognosis prediction of stage II and III /article/pii/S0883944111003790] colorectal cancer. J Clin Oncol29: 17– 26. Mathias JS, Agrawal A, Feinglass J, 24. Cooper AJ, Baker DW, Choudhary A [http://jco.ascopubs.org/content/29/1/1 (2013) Development of a 5 year life 7.abstract] expectancy index in older adults using 20. Annese J (2012) The importance of predictive mining of electronic health combining MRI and large-scale digital record data. J Am Med Inform Assoc histology in neuroimaging studies of 20(e1): e118–e124. brain connectivity and disease. Front [http://jamia.bmj.com/content/20/e1/e1 Neuroinform6:13.[http://europepmc.or 18.abstract] g/abstract/MED/22536182] 27. Ballard C, Foster K, Frenkiel A, Gedik 21. Van Essen DC, Smith SM, Barch DM, B, Koranda MP, Nathan S, Rajan D, Behrens TE, Yacoub E, Ugurbil K Rea R, Spicer M, Williams B, Zoubov (2013) The WU-Minn human VN (2011) IBM Infosphere Streams: connectome project: an overview. Assembling Continuous Insight in the NeuroImage 80(0): 62–79. Information Revolution. [http://www.sciencedirect.com/science [http://www.redbooks.ibm.com/abstrac /article/pii/ S1053811913005351]. ts/sg.pages=247970html] [Mapping the Connectome] 28. Zhang Y, Fong S, Fiaidhi J, 22. Yoshida H, Kawaguchi A, Tsuruya K Mohammed S (2012) Real-time (2013) Radial basis function-sparse clinical decision support system with partial least squares for application to data stream mining.J Biomed brain imaging data. Comput Math Biotechnol 2012: 8. Methods Med 2013: 7. [http://dx.doi.org/10.1155/2012/58018 [http://dx.doi.org/10.1155/2013/59103 6] 2] 29. Thommandram A, Pugh JE, Eklund 23. Campbell AJ, Cook JA, Adey G, JM, McGregor C, James AG (2013) Cuthbertson BH (2008) Predicting Classifying neonatal spells using real- death and readmission after intensive time temporal analysis of physiological care discharge. British J Anaesth data streams: Algorithm development 100(5): 656–662. In: IEEE Point-of-Care Healthcare [http://europepmc.org/abstract/MED/1 Technologies (PHT 2013). IEEE, 8385264] based in New York, USA, Bangalore, 24. Fialho AS, Cismondi F, Vieira SM, India, pp 240–243 Reti SR, Sousa JMC, Finkelstein SN 30. Ashish N, Biswas A, Das S, Nag S, (2012) Data mining using clinical Pratap R (2012) The Abzooba smart physiology at discharge to predict ICU health informatics platform (SHIP)™– readmissions. Expert Syst Appl frompatient experiences to big data to 39(18): 13158–13165. insights. CoRR abs/1203.3764: 1–3

24 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

31. Rolia J, Yao W, Basu S, Lee WN, IGIPD. Proceedings - 2014 IEEE Singhal S, Kumar A, Sabella S (2013) International Conference on Big Data, Tell me what i don’t know - making IEEE Big Data 2014. 1-9. the most of social health forums. Tech. 10.1109/BigData.2014.7004385. Rep: HPL-2013–43. Hewlett Packard 39. Wang Y, Xing J, Xu Y, et al. In silico Labs ADME/T modelling for rational drug [https://www.hpl.hp.com/techreports/2 design. Q Rev Biophys 2015; 48:488– 013/ HPL-2013-43.pdf] 515. 32. Yuan Q, Nsoesie EO, Lv B, Peng G, 40. M. Whirl-Carrillo, E.M. McDonagh, J. Chunara R, Brownstein JS (2013) M. Hebert, L. Gong, K. Sangkuhl, C.F. Monitoring influenza epidemics in Thorn, R.B. Altman and T.E. China with search query from Baidu. Klein. "Pharmacogenomics PLoS ONE 8(5): e64323. [doi: Knowledge for Personalized 10.1371/journal.pone.0064323] Medicine"Clinical Pharmacology & 33. Ginsberg J, Mohebbi MH, Patel RS, Therapeutics (2012) 92(4): 414-417. Brammer L, Smolinski MS, Brilliant L 41. Wishart DS, Feunang YD, Guo AC, (2009) Detecting influenza epidemics Lo EJ, Marcu A, Grant JR, Sajed T, using search engine query data. Nature Johnson D, Li C, Sayeeda Z, 457(7232): 1012–1014. Assempour N, Iynkkaran I, Liu Y, [http://dx.doi.org/10.1038/nature07634 Maciejewski A, Gale N, Wilson A, ] Chin L, Cummings R, Le D, Pon A, 34. Achrekar H, Gandhe A, Lazarus R, Yu Knox C, Wilson M. Drug Bank 5.0: a SH, Liu B (2012) Twitter improves major update to the Drug Bank seasonal influenza prediction In: database for 2018. Nucleic Acids Res. International Conference on Health 2017 Nov 8. doi: Informatics (HEALTHINF’12). Nature 10.1093/nar/gkx1037. Publishing Group, based in London, 42. Davis AP, Grondin CJ, Johnson RJ, UK, Vilamoura, Portugal, pp 61–70 Sciaky D, King BL, McMorran R, 35. Signorini A, Segre AM, Polgreen PM Wiegers J, Wiegers TC, Mattingly CJ. (2011) The use of twitter to track The Comparative Toxicogenomics levels of disease activity and public Database: update 2017. Nucleic Acids concern in the U.S. during the Res. 2016 Sep 19;[Epub ahead of influenza A H1N1 pandemic. PLoS print] PMID:27651457 ONE 6(5): e19467. 43. The Reactome Pathway doi:10.1371/journal.pone.0019467 Knowledgebase. Fabregat A, Jupe S, 36. Bennett C, Doub T (2011) Data mining Matthews L, Sidiropoulos K, Gillespie and electronic health records: selecting M, Garapati P, Haw R, Jassal B, optimal clinical treatments in practice. Korninger F, May B, Milacic M, Roca CoRR abs/1112: 1668 CD, Rothfels K, Sevilla C, Shamovsky 37. Yasnoff WA, O’Carroll PW, Koo D, V, Shorser S, Varusai T, Viteri G, Linkins RW, Kilbourne EM. Public Weiser J, Wu G, Stein L, Hermjakob health informatics: improving and H, D'Eustachio P. Nucleic Acids Res. transforming public health in the 2018 Jan 4;46(D1):D649- information age. J Public Health D655.doi:10.1093/nar/gkx1132.PMID: Manag Pract 2000; 6:67–75. 29145629 38. Kumar, Pavan & Ch, Janaki & 44. Daniel R. Zerbino. Et al, Ensembl Neeharika, N & Saluja, Payal & 2018. PubMed PMID: 29155950. Mangala, Natampalli & B.B, Prahlada doi:10.1093/nar/gkx1098. Rao. (2015). Information gateway for 45. Ayesha Pasha, Vinod Scaria, integrated pharmacogenomics data- "Pharmacogenomics in the Era of

25 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

Personal Genomics: A Quick Guide to Ammal Engineering College, Online Resources and Tools", Omics Ramanathapuram, affiliated to Anna for Personalized Medicine, pp. 187- University, Chennai, Tamilnadu, India, 211, 2013 and Master of Technology degree in 46. Kanehisa, Furumichi, M., Tanabe, M., Computer Systems and Networks from Dr. Sato, Y., and Morishima, K.; KEGG: MGR Educational and Research Institute, new perspectives on genomes, Chennai, Tamil Nadu, India. He possesses pathways, diseases and drugs. Nucleic nearly 5 years of Industrial experience in Acids Res. 45, D353-D361 (2017). the field of IT Security, Bio Metric 47. Kuhn, Michael et al. “STITCH: Deployment, IT Administration, Network Interaction Networks of Chemicals and Proteins.” Nucleic Acids Security, Web Services and Security, Research 36.Database issue (2008): Service Oriented Architecture environment D684–D688. PMC. Web. 26 Apr. etc. He has presented the paper in 11 2018. reputed International conferences and 48. Gamazon, Eric R. et al. “PACdb: A published twelve papers in reputed Database for Cell-Based journals. His research interest includes Pharmacogenomics.” Pharmacogenetic Cloud Computing, Clinical Research, and s and genomics 20.4 (2010): 269– Medical Informatics, Internet of Things, 273. PMC. Web. 26 Apr. 2018. Machine Learning, Ontology 49. Mailman MD et al, 'The NCBI dbGaP Development, Big Data, and Data mining. database of genotypes and He visited the countries like Japan, phenotypes", Nat Genet, vol. 39, no. Malaysia, Singapore, Srilanka, and Gulf 10, pp. 1181–1186, 2007. countries for his research activities. Right 50. PGP-UK: a research and citizen Now he is doing his Ph.D. as a full-time science hybrid project in support of Doctoral Research Fellow in the area of personalized medicine. Translational Clinical Informatics at Anna Stephan Beck et al University, Chennai, Tamilnadu, India. bioRxiv 288829; doi: https://doi.org/10 .1101/288829 51. The Indian Genome Variation Consortium Hum Genet (2005) 118: 1. https://doi.org/10.1007/s00439-005- 0009-9 52. Ibrahim, S. Jafar Ali, and M. Thangamani. "Prediction of Novel Dr. M. Thangamani Drugs and Diseases for Hepatocellular possesses nearly 23 years of experience in Carcinoma Based on Multi-Source research, teaching, consulting and practical Simulated Annealing Based Random application development to solve real- Walk." Journal of medical systems 42, world business problems using analytics. no. 10 (2018): 188. Her research expertise covers Medical data 53. telemedicine and applications 2018 mining, machine learning, cloud (2018). computing, big data, fuzzy, soft computing, ontology development, web Bibliography services, and open source software. She has published nearly 80 articles in refereed, indexed, SCI Journals, books, and book chapters and presented over 67 papers in national and international Jafar Ali Ibrahim. S has conferences in the above field. She has completed his Bachelor of Technology in delivered more than 60 Guest Lectures in Information Technology from Syed reputed engineering colleges and reputed

26 Page 18-27 © MAT Journals 2019. All Rights Reserved

Journal of Electronics and Communication Systems Volume 4 Issue 1

industries on various topics. She has got Erode from June 2014 onwards. He best paper awards from various education- graduated B.E (EEE) from Sri related social activities in India and Ramakrishna Institute of technology in the Abroad. She has received the many year 2012 and M.E (Power Systems National and International Awards. She Engineering) from Velammal Engineering continues to actively serve the academic College in the year 2014 under Anna and research communities and presently University, Chennai. He has published 6 guiding nine Ph.D. Scholars under Anna papers in International journals, 3 papers is University. She is on the editorial board national journals and presented 5 papers in and reviewing committee of leading international conferences and 3 papers in research SCI journals. She has on the national conferences. He organized more program committee of top international than 25 workshops, seminars and short data mining and soft computing term training programs in various fields. conferences in various countries. She is His Areas of interests include Data also Board member in Taylor & Francis Communication Networks, Soft Bio Group and seasonal reviewer in IEEE Informatics in Big data approach, Internet Transaction on Fuzzy System, the of things (IOT), Power System Analysis international journal of advances in Fuzzy and Stability, FACTS, Energy Utilization System and Applied mathematics and and conservation etc. His research interests information journals. She has an include Big data Analytics, IOT in organizing chair and keynotes speaker in Healthcare, Power System Stability and international conferences in India and power Quality Improvement using various countries like California, Dubai, Malaysia. Custom Power Devices.

Cite this Article as: S.Jafar Ali Ibrahim, Dr.M.Thangamani, & D. Sarathkumar. (2019). Research Scenario of Bio Informatics in Big Data Approach. Journal of Electronics and D.Sarathkumar, M.E., is Communication Systems, 4(1), 18–27. working as an Assistant Professor in http://doi.org/10.5281/zenodo.2596987 Department of Electrical and Electronics Engineering, Kongu Engineering College,

27 Page 18-27 © MAT Journals 2019. All Rights Reserved