A. Elumalai. / Journal of Science / Vol 2 / Issue 2 / 2012 / 71-80.

e ISSN 2277 - 3290 Print ISSN 2277 - 3282 Journal of science Biological Science

www.journalofscience.net

AN OVERVIEW OF BIOINFORMATICS

A. Elumalai*

Anurag Pharmacy College, Ananthagiri (V), Kodad (M), Nalgonda (Dt), Andhra Pradesh, India, 508 206.

ABSTRACT Bioinformatics is a branch of biological science which deals with the study of methods for storing, retrieving and analyzing biological data, such as nucleic acid (DNA/RNA) and protein sequence, structure, function, pathways and genetic interactions. It generates new knowledge that is useful in such fields as drug design and the development of new software tools to create that knowledge. This review deals with algorithms, databases and information systems, web technologies, artificial intelligence and soft computing, information and computation theory, structural biology, software engineering, data mining, image processing, modeling and simulation, discrete mathematics, control and system theory, circuit theory, and statistics.

Keywords: Bioinformatics, Nucleic acid, Algorithms.

INTRODUCTION

Building on the recognition of the importance of information transmission, accumulation and processing in biological systems, Paulien Hogeweg coined the term bioinformatics in 1978 to refer to the study of information processes in biotic systems. This definition placed bioinformatics as a field parallel to biophysics and biochemistry. Examples of relevant biological information processes studied in the early days of bioinformatics are the formation of complex social interaction structures by simple behavioral rules, and information accumulation and maintenance in models of prebiotic evolution.

At the beginning of the genomic revolution, the term bioinformatics was re-discovered to refer to the creation and maintenance of a database to store biological information such as nucleotide sequences and amino acid sequences. Development of this type of database involved not only design issues but also the development of complex interfaces whereby researchers could access existing data as well as submit new or revised data.

In order to study how normal cellular activities are altered in different disease states, the biological data must be combined to form a comprehensive picture of these activities. Therefore, the field of bioinformatics has evolved such that the most pressing task now involves the analysis and interpretation of various types of data. This includes nucleotide and amino acid sequences, protein domains, and protein structures. The actual process of analyzing and interpreting data is referred to as computational biology. Important sub-disciplines within bioinformatics and computational biology include:
• the development and implementation of tools that enable efficient access to, and use and management of, various types of information.
• the development of new algorithms (mathematical formulas) and statistics with which to assess relationships among members of large data sets. For example, methods to locate a gene within a sequence, predict protein structure and/or function, and cluster protein sequences into families of related sequences.

The primary goal of bioinformatics is to increase the understanding of biological processes. What sets it apart from other approaches, however, is its focus on developing and applying computationally intensive techniques to achieve this goal. Examples include pattern recognition, data mining, machine learning algorithms, and visualization. Major research efforts in the field include sequence alignment, gene finding, genome assembly, drug design, drug discovery, protein structure alignment, protein structure prediction, prediction of gene expression and protein–protein interactions, genome-wide

Corresponding Author:- A. Elumalai Email:- [email protected]


association studies and the modeling of evolution.

Interestingly, the term bioinformatics was coined before the genomic revolution. Paulien Hogeweg and Ben Hesper defined the term in 1978 to refer to the study of information processes in biotic systems. This definition placed bioinformatics as a field parallel to biophysics or biochemistry (biochemistry is the study of chemical processes in biological systems). However, its primary use since at least the late 1980s has been to describe the application of computer science and information sciences to the analysis of biological data, particularly in those areas of genomics involving large-scale DNA sequencing. Bioinformatics now entails the creation and advancement of databases, algorithms, computational and statistical techniques and theory to solve formal and practical problems arising from the management and analysis of biological data.

Over the past few decades, rapid developments in genomic and other molecular research technologies and developments in information technologies have combined to produce a tremendous amount of information related to molecular biology. Bioinformatics is the name given to these mathematical and computing approaches used to glean understanding of biological processes.

Common activities in bioinformatics include mapping and analyzing DNA and protein sequences, aligning different DNA and protein sequences to compare them, and creating and viewing 3-D models of protein structures.

There are two fundamental ways of modelling a biological system (e.g., a living cell), both coming under bioinformatic approaches [1].

Static
• Sequences – Proteins, Nucleic acids and Peptides
• Structures – Proteins, Nucleic acids, Ligands (including metabolites and drugs) and Peptides
• Interaction data among the above entities, including microarray data and networks of proteins and metabolites

Dynamic
• Systems biology comes under this category, including reaction fluxes and variable concentrations of metabolites.
• Multi-agent based modelling approaches capturing cellular events such as signalling, transcription and reaction dynamics

Since the Phage Φ-X174 was sequenced in 1977, the DNA sequences of thousands of organisms have been decoded and stored in databases. This sequence information is analyzed to determine genes that encode polypeptides (proteins), RNA genes, regulatory sequences, structural motifs, and repetitive sequences. A comparison of genes within a species or between different species can show similarities between protein functions, or relations between species (the use of molecular systematics to construct phylogenetic trees). With the growing amount of data, it long ago became impractical to analyze DNA sequences manually. Today, computer programs such as BLAST are used daily to search sequences from more than 260,000 organisms, containing over 190 billion nucleotides. These programs can compensate for mutations (exchanged, deleted or inserted bases) in the DNA sequence, to identify sequences that are related, but not identical. A variant of this sequence alignment is used in the sequencing process itself. The so-called shotgun sequencing technique (which was used, for example, by The Institute for Genomic Research to sequence the first bacterial genome, Haemophilus influenzae) does not produce entire chromosomes. Instead it generates the sequences of many thousands of small DNA fragments (ranging from 35 to 900 nucleotides long, depending on the sequencing technology). The ends of these fragments overlap and, when aligned properly by a genome assembly program, can be used to reconstruct the complete genome. Shotgun sequencing yields sequence data quickly, but the task of assembling the fragments can be quite complicated for larger genomes. For a genome as large as the human genome, it may take many days of CPU time on large-memory, multiprocessor computers to assemble the fragments, and the resulting assembly will usually contain numerous gaps that have to be filled in later. Shotgun sequencing is the method of choice for virtually all genomes sequenced today, and genome assembly algorithms are a critical area of bioinformatics research.

Another aspect of bioinformatics in sequence analysis is annotation. This involves computational gene finding to search for protein-coding genes, RNA genes, and other functional sequences within a genome. Not all of the nucleotides within a genome are part of genes. Within the genomes of higher organisms, large parts of the DNA do not serve any obvious purpose. This so-called junk DNA may, however, contain unrecognized functional elements. Bioinformatics helps to bridge the gap between genome and proteome projects, for example in the use of DNA sequences for protein identification [2].

Genome annotation
In the context of genomics, annotation is the process of marking the genes and other biological features in a DNA sequence. The first genome annotation software system was designed in 1995 by Dr. Owen White, who was part of the team at The Institute for Genomic Research that sequenced and analyzed the first genome of a free-living organism to be decoded, the bacterium Haemophilus influenzae. Dr. White built a software system to find the genes (fragments of genomic sequence that encode proteins) and the transfer RNAs, and to make initial assignments of function to those genes. Most current genome annotation systems work similarly, but the programs available for analysis of genomic DNA, such as the GeneMark program trained and used to find


protein-coding genes in Haemophilus influenzae, are constantly changing and improving.

Computational evolutionary biology
Evolutionary biology is the study of the origin and descent of species, as well as their change over time. Informatics has assisted evolutionary biologists in several key ways; it has enabled researchers to:
• trace the evolution of a large number of organisms by measuring changes in their DNA, rather than through physical taxonomy or physiological observations alone,
• more recently, compare entire genomes, which permits the study of more complex evolutionary events, such as gene duplication, horizontal gene transfer, and the prediction of factors important in bacterial speciation,
• build complex computational models of populations to predict the outcome of the system over time,
• track and share information on an increasingly large number of species and organisms.
The area of research within computer science that uses genetic algorithms is sometimes confused with computational evolutionary biology, but the two areas are not necessarily related.

Literature analysis
The growth in the volume of published literature makes it virtually impossible to read every paper, resulting in disjointed subfields of research. Literature analysis aims to employ computational and statistical linguistics to mine this growing library of text resources. For example:
• abbreviation recognition - identifying the long form and abbreviation of biological terms,
• named entity recognition - recognizing biological terms such as gene names,
• protein-protein interaction - identifying from text which proteins interact with which proteins.
The area of research draws from statistics and computational linguistics.

Analysis of gene expression
The expression of many genes can be determined by measuring mRNA levels with multiple techniques including microarrays, expressed cDNA sequence tag (EST) sequencing, serial analysis of gene expression (SAGE) tag sequencing, massively parallel signature sequencing (MPSS), RNA-Seq, also known as Whole Transcriptome Shotgun Sequencing (WTSS), or various applications of multiplexed in-situ hybridization. All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate signal from noise in high-throughput gene expression studies. Such studies are often used to determine the genes implicated in a disorder: one might compare microarray data from cancerous epithelial cells to data from non-cancerous cells to determine the transcripts that are up-regulated and down-regulated in a particular population of cancer cells.

Analysis of regulation
Regulation is the complex orchestration of events starting with an extracellular signal such as a hormone and leading to an increase or decrease in the activity of one or more proteins. Bioinformatics techniques have been applied to explore various steps in this process. For example, promoter analysis involves the identification and study of sequence motifs in the DNA surrounding the coding region of a gene. These motifs influence the extent to which that region is transcribed into mRNA. Expression data can be used to infer gene regulation: one might compare microarray data from a wide variety of states of an organism to form hypotheses about the genes involved in each state. In a single-cell organism, one might compare stages of the cell cycle, along with various stress conditions (heat shock, starvation, etc.). One can then apply clustering algorithms to that expression data to determine which genes are co-expressed. For example, the upstream regions (promoters) of co-expressed genes can be searched for over-represented regulatory elements.

Analysis of protein expression
Protein microarrays and high throughput (HT) mass spectrometry (MS) can provide a snapshot of the proteins present in a biological sample. Bioinformatics is very much involved in making sense of protein microarray and HT MS data; the former approach faces similar problems as with microarrays targeted at mRNA, while the latter involves the problem of matching large amounts of mass data against predicted masses from protein sequence databases, and the complicated statistical analysis of samples where multiple, but incomplete, peptides from each protein are detected [3].

Analysis of mutations in cancer
In cancer, the genomes of affected cells are rearranged in complex or even unpredictable ways. Massive sequencing efforts are used to identify previously unknown point mutations in a variety of genes in cancer. Bioinformaticians continue to produce specialized automated systems to manage the sheer volume of sequence data produced, and they create new algorithms and software to compare the sequencing results to the growing collection of human genome sequences and germline polymorphisms. New physical detection technologies are employed, such as oligonucleotide microarrays to identify chromosomal gains and losses (called comparative genomic hybridization), and single-


nucleotide polymorphism arrays to detect known point mutations. These detection methods simultaneously measure several hundred thousand sites throughout the genome, and when used in high-throughput to measure thousands of samples, generate terabytes of data per experiment. Again the massive amounts and new types of data generate new opportunities for bioinformaticians. The data is often found to contain considerable variability, or noise, and thus Hidden Markov model and change-point analysis methods are being developed to infer real copy number changes. Another type of data that requires novel informatics development is the analysis of lesions found to be recurrent among many tumors.

Comparative genomics
The core of comparative genome analysis is the establishment of the correspondence between genes (orthology analysis) or other genomic features in different organisms. It is these intergenomic maps that make it possible to trace the evolutionary processes responsible for the divergence of two genomes. A multitude of evolutionary events acting at various organizational levels shape genome evolution. At the lowest level, point mutations affect individual nucleotides. At a higher level, large chromosomal segments undergo duplication, lateral transfer, inversion, transposition, deletion and insertion. Ultimately, whole genomes are involved in processes of hybridization, polyploidization and endosymbiosis, often leading to rapid speciation. The complexity of genome evolution poses many exciting challenges to developers of mathematical models and algorithms, who have recourse to a spectrum of algorithmic, statistical and mathematical techniques, ranging from exact, heuristic, fixed-parameter and approximation algorithms for problems based on parsimony models to Markov Chain Monte Carlo algorithms for Bayesian analysis of problems based on probabilistic models.

Modeling biological systems
Systems biology involves the use of computer simulations of cellular subsystems (such as the networks of metabolites and enzymes which comprise metabolism, signal transduction pathways and gene regulatory networks) to both analyze and visualize the complex connections of these cellular processes. Artificial life or virtual evolution attempts to understand evolutionary processes via the computer simulation of simple (artificial) life forms.

High-throughput image analysis
Computational technologies are used to accelerate or fully automate the processing, quantification and analysis of large amounts of high-information-content biomedical imagery. Modern image analysis systems augment an observer's ability to make measurements from a large or complex set of images, by improving accuracy, objectivity, or speed. A fully developed analysis system may completely replace the observer. Although these systems are not unique to biomedical imagery, biomedical imaging is becoming more important for both diagnostics and research. Some examples are:
• high-throughput and high-fidelity quantification and sub-cellular localization (high-content screening, cytohistopathology, bioimage informatics)
• morphometrics
• clinical image analysis and visualization
• determining the real-time air-flow patterns in breathing lungs of living animals
• quantifying occlusion size in real-time imagery from the development of and recovery during arterial injury
• making behavioral observations from extended video recordings of laboratory animals
• infrared measurements for metabolic activity determination
• inferring clone overlaps in DNA mapping, e.g. the Sulston score [4].

Structural bioinformatic approaches
Protein structure prediction is another important application of bioinformatics. The amino acid sequence of a protein, the so-called primary structure, can be easily determined from the sequence on the gene that codes for it. In the vast majority of cases, this primary structure uniquely determines a structure in its native environment. (Of course, there are exceptions, such as the prion of bovine spongiform encephalopathy, a.k.a. Mad Cow Disease.) Knowledge of this structure is vital in understanding the function of the protein. For lack of better terms, structural information is usually classified as one of secondary, tertiary and quaternary structure. A viable general solution to such predictions remains an open problem. As of now, most efforts have been directed towards heuristics that work most of the time.

One of the key ideas in bioinformatics is the notion of homology. In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if the sequence of gene A, whose function is known, is homologous to the sequence of gene B, whose function is unknown, one could infer that B may share A's function. In the structural branch of bioinformatics, homology is used to determine which parts of a protein are important in structure formation and interaction with other proteins. In a technique called homology modeling, this information is used to predict the structure of a protein once the structure of a homologous protein is known. This currently remains the only way to predict protein structures reliably.

One example of this is the similar protein homology between hemoglobin in humans and the hemoglobin in legumes (leghemoglobin). Both serve the


same purpose of transporting oxygen in the organism. Though both of these proteins have completely different amino acid sequences, their protein structures are virtually identical, which reflects their near identical purposes. Other techniques for predicting protein structure include protein threading and de novo (from scratch) modeling.

Molecular Interaction
Efficient software is available today for studying interactions among proteins, ligands and peptides. Types of interactions most often encountered in the field include protein–ligand (including drug), protein–protein and protein–peptide. Molecular dynamic simulation of movement of atoms about rotatable bonds is the fundamental principle behind the computational algorithms, termed docking algorithms, for studying molecular interactions.

Docking algorithms
In the last two decades, tens of thousands of protein three-dimensional structures have been determined by X-ray crystallography and protein nuclear magnetic resonance spectroscopy (protein NMR). One central question for the biological scientist is whether it is practical to predict possible protein–protein interactions based only on these 3D shapes, without doing protein–protein interaction experiments. A variety of methods have been developed to tackle the protein–protein docking problem, though it seems that there is still much work to be done in this field.

Software and tools
Software tools for bioinformatics range from simple command-line tools, to more complex graphical programs and standalone web-services available from various bioinformatics companies or public institutions.

Open source bioinformatics software
Many free and open source software tools have existed and continued to grow since the 1980s. The combination of a continued need for new algorithms for the analysis of emerging types of biological readouts, the potential for innovative in silico experiments, and freely available open code bases have helped to create opportunities for all research groups to contribute to both bioinformatics and the range of open source software available, regardless of their funding arrangements. The open source tools often act as incubators of ideas, or community-supported plug-ins in commercial applications. They may also provide de facto standards and shared object models for assisting with the challenge of bioinformation integration.

The range of open source software packages includes titles such as Bioconductor, BioPerl, Biopython, BioJava, BioRuby, Bioclipse, EMBOSS, Taverna workbench, and UGENE. In order to maintain this tradition and create further opportunities, the non-profit Open Bioinformatics Foundation has supported the annual Bioinformatics Open Source Conference (BOSC) since 2000 [5].

Web services in bioinformatics
SOAP and REST-based interfaces have been developed for a wide variety of bioinformatics applications, allowing an application running on one computer in one part of the world to use algorithms, data and computing resources on servers in other parts of the world. The main advantages derive from the fact that end users do not have to deal with software and database maintenance overheads.

Basic bioinformatics services are classified by the EBI into three categories: SSS (Sequence Search Services), MSA (Multiple Sequence Alignment) and BSA (Biological Sequence Analysis). The availability of these service-oriented bioinformatics resources demonstrates the applicability of web-based bioinformatics solutions; these range from a collection of standalone tools with a common data format under a single, standalone or web-based interface, to integrative, distributed and extensible bioinformatics workflow management systems.

Bioinformatics workflow management systems
A bioinformatics workflow management system is a specialized form of a workflow management system designed specifically to compose and execute a series of computational or data manipulation steps, or a workflow, in a bioinformatics application. Such systems are designed to
• provide an easy-to-use environment for individual application scientists themselves to create their own workflows
• provide interactive tools for the scientists enabling them to execute their workflows and view their results in real-time
• simplify the process of sharing and reusing workflows between the scientists.
• enable scientists to track the provenance of the workflow execution results and the workflow creation steps.

Health informatics is a discipline at the intersection of information science, computer science, and health care. It deals with the resources, devices, and methods required to optimize the acquisition, storage, retrieval, and use of information in health and biomedicine. Health informatics tools include not only computers but also clinical guidelines, formal medical terminologies, and information and communication systems. It is applied to the areas of nursing, clinical care, pharmacy, public health, occupational therapy,


and (bio)medical research. Molecular bioinformatics and clinical informatics have converged into the field of translational bioinformatics.

Medical informatics in the United States
Even though the idea of using computers in medicine emerged as technology advanced in the early twentieth century, it was not until the 1950s that informatics began to make a significant impact in the United States. The earliest use of electronic digital computers for medicine was for dental projects in the 1950s at the United States National Bureau of Standards by Robert Ledley. During the mid-1950s, the United States Air Force (USAF) carried out several medical projects on its computers while also encouraging civilian agencies such as the National Academy of Sciences - National Research Council (NAS-NRC) and the National Institutes of Health (NIH) to sponsor such work. In 1959, Ledley and Lee B. Lusted published Reasoning Foundations of Medical Diagnosis, a widely read article in Science, which introduced computing techniques to medical workers. Ledley and Lusted’s article has remained influential for decades, especially within the field of medical decision making [6].

Guided by Ledley's late 1950s survey of computer use in biology and medicine (carried out for the NAS-NRC), and by his and Lusted's articles, the NIH undertook the first major effort to introduce computers to biology and medicine. This effort, carried out initially by the NIH's Advisory Committee on Computers in Research (ACCR), chaired by Lusted, spent over $40 million between 1960 and 1964 in order to establish dozens of large and small biomedical research centers in the US.

One early (1960, non-ACCR) use of computers was to help quantify normal human movement, as a precursor to scientifically measuring deviations from normal, and design of prostheses. The use of computers (IBM 650, 1620, and 7040) allowed analysis of a large sample size, and of more measurements and subgroups than had been previously practical with mechanical calculators, thus allowing an objective understanding of how human locomotion varies by age and body characteristics. A study co-author was Dean of the Marquette University College of Engineering; this work led to discrete Biomedical Engineering departments there and elsewhere.

The next steps, in the mid-1960s, were the development (sponsored largely by the NIH) of expert systems such as MYCIN and Internist-I. In 1965, the National Library of Medicine started to use MEDLINE and MEDLARS. Around this time, Neil Pappalardo, Curtis Marble, and Robert Greenes developed MUMPS (Massachusetts General Hospital Utility Multi-Programming System) in Octo Barnett's Laboratory of Computer Science at Massachusetts General Hospital in Boston, another center of biomedical computing that received significant support from the NIH. In the 1970s and 1980s it was the most commonly used programming language for clinical applications. The MUMPS operating system was used to support MUMPS language specifications. As of 2004, a descendant of this system is being used in the United States Veterans Affairs hospital system. The VA has the largest enterprise-wide health information system that includes an electronic medical record, known as the Veterans Health Information Systems and Technology Architecture (VistA). A graphical user interface known as the Computerized Patient Record System (CPRS) allows health care providers to review and update a patient’s electronic medical record at any of the VA's over 1,000 health care facilities.

During the 1960s, Morris Collen, a physician working for Kaiser Permanente's Division of Research, developed computerized systems to automate many aspects of multiphasic health checkups. These systems became the basis for the larger medical databases Kaiser Permanente developed during the 1970s and 1980s. The American College of Medical Informatics (ACMI) has since 1993 annually bestowed the Morris F. Collen, MD Medal for Outstanding Contributions to the Field of Medical Informatics.

In the 1970s a growing number of commercial vendors began to market practice management and electronic medical records systems. Although many products exist, only a small number of health practitioners use fully featured electronic health care records systems.

Homer R. Warner, one of the fathers of medical informatics, founded the Department of Medical Informatics at the University of Utah in 1968. The American Medical Informatics Association (AMIA) has an award named after him for the application of informatics to medicine.

Medical informatics in the UK
The broad history of health informatics has been captured in the book UK Health Computing: Recollections and reflections, Hayes G, Barnett D (Eds.), BCS, by those active in the field, predominantly members of BCS Health and its constituent groups. The book describes the path taken as ‘early development of health informatics was unorganized and idiosyncratic’. In the early 1950s it was prompted by those involved in NHS finance, and only in the early 1960s did solutions including those in pathology, radiotherapy, immunization, and primary care emerge. Many of these solutions, even in the early 1970s, were developed in-house by pioneers in the field to meet their own requirements. In part this was due to some areas of health services (for example the immunization and vaccination of children) still being provided by Local Authorities. Interestingly, this is a situation which the coalition government proposes broadly


to return to in the 2010 strategy Equity and Excellence: Liberating the NHS, stating:
• ‘We will put patients at the heart of the NHS, through an information revolution and greater choice and control’ with shared decision-making becoming the norm: ‘no decision about me without me’ and patients having access to the information they want, to make choices about their care. They will have increased control over their own care records.

These types of statements present a significant opportunity for health informaticians to come out of the back-office and take up a front-line role supporting clinical practice and the business of care delivery. The UK health informatics community has long played a key role in international activity, joining TC4 of the International Federation for Information Processing, which became IMIA. Under the aegis of BCS Health, Cambridge was the host for the first EFMI Medical Informatics Europe conference and London was the location for IMIA’s tenth global congress.

Argentina
Since 1997, the Buenos Aires Biomedical Informatics Group, a nonprofit group, has represented the interests of a broad range of clinical and non-clinical professionals working within the Health Informatics sphere. Its purposes are to:
• Promote the use of computer tools in healthcare, scientific research, health administration and all areas related to health sciences and biomedical research.
• Support, promote and disseminate activities related to the management of health information and the tools used for it, under the name of biomedical informatics.
• Promote cooperation and exchange of initiatives in the field of biomedical informatics, in both the public and private sectors and at national and international level.
• Interact with all scientists and recognized academics, stimulating the creation of new bodies that share the same goal and are inspired by the same purpose.
• Promote, organize, sponsor and participate in events and activities for training in computers and information, and disseminate developments in this area that might be useful for team members and health-related activities.

The Argentinian health system is very heterogeneous, and informatics developments are correspondingly at heterogeneous stages. Many private health care centers have developed systems, such as the German Hospital of Buenos Aires, which was one of the first to develop an electronic health records system.

Brazil
The first applications of computers to medicine and healthcare in Brazil started around 1968, with the installation of the first mainframes in public university hospitals and the use of programmable calculators in scientific research applications. Minicomputers, such as the IBM 1130, were installed in several universities, and the first applications were developed for them, such as the hospital census in the School of Medicine of Ribeirão Preto and patient master files in the Hospital das Clínicas da Universidade de São Paulo, respectively at the Ribeirão Preto and São Paulo campi of the University of São Paulo. In the 1970s, several Digital Equipment Corporation and Hewlett Packard minicomputers were acquired for public and Armed Forces hospitals, and were used more intensively for intensive-care units, cardiology diagnostics, patient monitoring and other applications. In the early 1980s, with the arrival of cheaper microcomputers, a great upsurge of computer applications in health ensued, and in 1986 the Brazilian Society of Health Informatics was founded, the first Brazilian Congress of Health Informatics was held, and the first Brazilian Journal of Health Informatics was published. In Brazil, two universities are pioneers in teaching and research in Medical Informatics: the University of São Paulo and the Federal University of São Paulo both offer highly qualified undergraduate programs in the area as well as extensive graduate programs [7].

Canada
Health Informatics projects in Canada are implemented provincially, with different provinces creating different systems. A national, federally funded, not-for-profit organization called Canada Health Infoway was created in 2001 to foster the development and adoption of electronic health records across Canada. As of December 31, 2008 there were 276 EHR projects under way in Canadian hospitals, other health-care facilities, pharmacies and laboratories, with an investment value of $1.5 billion from Canada Health Infoway.

Provincial and territorial programmes include the following:
• eHealth Ontario was created as an Ontario provincial government agency in September 2008. It has been plagued by delays, and its CEO was fired over a multimillion-dollar contracts scandal in 2009.
• Alberta Netcare was created in 2003 by the Government of Alberta. Today the netCARE portal is used daily by thousands of clinicians. It provides access to demographic data, prescribed/dispensed drugs, known allergies/intolerances, immunizations, laboratory test results, diagnostic imaging reports, the diabetes registry and other medical reports. netCARE interface capabilities are being included in electronic medical record products which are being funded by the provincial government.


United States
In 2004 the U.S. Department of Health and Human Services (HHS) formed the Office of the National Coordinator for Health Information Technology (ONCHIT). The mission of this office is widespread adoption of interoperable electronic health records (EHRs) in the US within 10 years. See quality improvement organizations for more information on federal initiatives in this area.

The Certification Commission for Healthcare Information Technology (CCHIT), a private nonprofit group, was funded in 2005 by the U.S. Department of Health and Human Services to develop a set of standards for electronic health records (EHR) and supporting networks, and to certify vendors who meet them. In July 2006, CCHIT released its first list of 22 certified ambulatory EHR products, in two different announcements.

Europe
The European Union's Member States are committed to sharing their best practices and experiences to create a European eHealth Area, thereby improving access to and quality of health care at the same time as stimulating growth in a promising new industrial sector. The European eHealth Action Plan plays a fundamental role in the European Union's strategy. Work on this initiative involves a collaborative approach among several parts of the Commission services. The European Institute for Health Records is involved in the promotion of high quality electronic health record systems in the European Union.

UK
There are different models of health informatics delivery in each of the home countries (England, Scotland, Northern Ireland and Wales) but some bodies like UKCHIP (see below) operate for those 'in and for' all the home countries and beyond.

England
The NHS in England has contracted out to several vendors for a national health informatics system, 'NPfIT', that originally divided the country into five regions and is to be united by a central electronic medical record system nicknamed the spine. As of 2010, the project is seriously behind schedule and its scope and design are being revised in real time. In 2010 a wide consultation was launched as part of a wider ‘Liberating the NHS’ plan. Many organisations and bodies responded to the consultation (most have made their detailed responses public on their own websites), and a new strategy is expected in the second quarter of 2011. The degree of computerisation in NHS secondary care was quite high before NPfIT, and that programme has had the unfortunate effect of largely stalling further development of the installed base. Almost all general practices in England and Wales are computerised, and patients have relatively extensive computerised primary care clinical records. Computerisation is the responsibility of individual practices and there is no single, standardised GP system. Interoperation between primary and secondary care systems is rather primitive. It is hoped that a focus on interworking (interfacing and integration) standards will stimulate synergy between primary and secondary care in sharing the information necessary to support the care of individuals.

Scotland has an approach to central connection under way which is more advanced than the English one in some ways. Scotland has the GPASS system, whose source code is owned by the State and controlled and developed by NHS Scotland. GPASS was accepted in 1984. It has been provided free to all GPs in Scotland but has developed poorly. Open sourcing it is being discussed as a remedy [5,6].

Emerging Direction
The European Commission's preference, as exemplified in the 5th Framework as well as in currently pursued pilot projects, is for Free/Libre and Open Source Software (FLOSS) for healthcare.

Asia and Oceania
In Asia and Australia-New Zealand, the regional group called the Asia Pacific Association for Medical Informatics (APAMI) was established in 1994 and now consists of more than 15 member regions in the Asia Pacific Region.

Australia
The Australasian College of Health Informatics (ACHI) is the professional association for health informatics in the Asia-Pacific region. It represents the interests of a broad range of clinical and non-clinical professionals working within the health informatics sphere through a commitment to quality, standards and ethical practice. Founded in 2002, ACHI is increasingly valued for its thought leadership, its trusted advisors and national and international experts in Health Informatics. ACHI is an academic institutional member of the International Medical Informatics Association (IMIA) and a full member of the Australian Council of Professions. ACHI is a sponsor of the e-Journal for Health Informatics, an indexed and peer-reviewed professional journal. ACHI has also supported the Australian Health Informatics Education Council (AHIEC) since its founding in 2009.

Although there are a number of health informatics organisations in Australia, the Health Informatics Society of Australia (HISA) is regarded as the major umbrella group and is a member of the International Medical Informatics Association (IMIA). Nursing informaticians were the driving force behind the formation of HISA, which is now a company limited by
guarantee of the members. The membership comes from across the informatics spectrum, that is, from students to corporate affiliates. HISA has a number of branches as well as special interest groups such as nursing (NIA), pathology, aged and community care, industry and medical imaging.

Hong Kong
In Hong Kong a computerized patient record system called the Clinical Management System (CMS) has been developed by the Hospital Authority since 1994. This system has been deployed at all the sites of the Authority (40 hospitals and 120 clinics), and is used by all 30,000 clinical staff on a daily basis, with up to 2 million transactions a day. The comprehensive records of 7 million patients are available on-line in the Electronic Patient Record (ePR), with data integrated from all sites. Since 2004 radiology image viewing has been added to the ePR, with radiography images from any HA site being available as part of the ePR.

The Hong Kong Hospital Authority paid particular attention to the governance of clinical systems development, with input from hundreds of clinicians being incorporated through a structured process. The Health Informatics Section of the Hong Kong Hospital Authority works closely with the Information Technology Department and clinicians to develop healthcare systems for the organization, supporting the service to all public hospitals and clinics in the region.

The Hong Kong Society of Medical Informatics (HKSMI) was established in 1987 to promote the use of information technology in healthcare. The eHealth Consortium has been formed to bring together clinicians from both the private and public sectors, medical informatics professionals and the IT industry to further promote IT in healthcare in Hong Kong.

New Zealand
Health Informatics is taught at five New Zealand universities. The most mature and established is the Otago programme, which has been offered for over a decade. Health Informatics New Zealand (HINZ) is the national organisation that advocates for Health Informatics. HINZ organises a conference every year and also publishes an online journal, Healthcare Informatics Review Online.

Saudi Arabia
The Saudi Association for Health Informatics (SAHI) was established in 2006 to work under the direct supervision of King Saud Bin Abdulaziz University for Health Sciences. It carries out public activities, develops theoretical and applied knowledge, and provides scientific and applied studies.

Health Informatics Law
Health informatics law deals with evolving and sometimes complex legal principles as they apply to information technology in health-related fields. It addresses the privacy, ethical and operational issues that invariably arise when electronic tools, information and media are used in health care delivery. Health informatics law also applies to all matters that involve information technology, health care and the interaction of information. It deals with the circumstances under which data and records are shared with other fields or areas that support and enhance patient care.

Clinical Informatics
Clinical informaticians transform health care by analyzing, designing, implementing, and evaluating information and communication systems that enhance individual and population health outcomes, improve patient care, and strengthen the clinician-patient relationship. Clinical informaticians use their knowledge of patient care combined with their understanding of informatics concepts, methods, and health informatics tools to:
• assess information and knowledge needs of health care professionals and patients,
• characterize, evaluate, and refine clinical processes,
• develop, implement, and refine clinical decision support systems, and
• lead or participate in the procurement, customization, development, implementation, management, evaluation, and continuous improvement of clinical information systems.

Clinicians collaborate with other health care and information technology professionals to develop health informatics tools which promote patient care that is safe, efficient, effective, timely, patient-centered, and equitable.

Translational bioinformatics
With the completion of the human genome and the recent advent of high throughput sequencing and genome-wide association studies of single nucleotide polymorphisms, the fields of molecular bioinformatics, biostatistics, statistical genetics and clinical informatics are converging into the emerging field of translational bioinformatics [7].

Clinical Research Informatics
Clinical Research Informatics (CRI) takes the core foundations, principles, and technologies related to Health Informatics and applies these to clinical research contexts. As such, CRI is a sub-discipline of sorts of Health Informatics, and interest and activities in CRI have increased greatly in recent years given the overwhelming problems associated with the explosive growth of clinical research data and information. There are a number of activities within clinical research that CRI supports, including:
• more efficient and effective data collection and acquisition
• optimal protocol design and efficient management
• patient recruitment and management
• adverse event reporting
• regulatory compliance
• data storage, transfer, processing and analysis

CONCLUSION
Computational health informatics is a branch of Computer Science that deals specifically with computational techniques that are relevant in healthcare. Computational health informatics is also a branch of Health Informatics, but is orthogonal to much of the work going on in health informatics because the computer scientist's interest is mainly in understanding fundamental properties of computation. Health informatics, on the other hand, is primarily concerned with understanding fundamental properties of medicine that allow for the intervention of computers. The health domain provides an extremely wide variety of problems that can be tackled using computational techniques, and computer scientists are attempting to make a difference in medicine by studying the underlying principles of computer science that will allow algorithms and systems that are meaningful to medicine to be developed. Thus, computer scientists working in computational health informatics and health scientists working in medical health informatics combine to develop the next generation of healthcare technologies.

REFERENCES
1. Hesper B, Hogeweg P. Bioinformatica: een werkconcept. Kameleon, 1(6), 1970, 28–29.
2. Hogeweg P. Simulating the growth of cellular forms. Simulation, 31(3), 1978, 90–96.
3. Hogeweg P. The Roots of Bioinformatics in Theoretical Biology. Searls DB (Ed.). PLoS Computational Biology, 7(3), 2011, e1002021.
4. Sanger F, Air GM, Barrell BG, Brown NL, Coulson AR, Fiddes CA, Hutchison CA, Slocombe PM, Smith M. Nucleotide sequence of bacteriophage phi X174 DNA. Nature, 265(5596), 1977, 687–95.
5. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Wheeler DL. GenBank. Nucleic Acids Res, 36, 2008, D25–30.
6. Fleischmann RD, Adams MD, White O, Clayton RA, Kirkness EF, Kerlavage AR, Bult CJ, Tomb JF, Dougherty BA, Merrick JM. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science, 269(5223), 1995, 496–512.
7. Carvajal-Rodríguez A. Simulation of Genes and Genomes Forward in Time. Current Genomics, Bentham Science Publishers, 11(1), 2010, 58–61.
