SABU M. THAMPI Assistant Professor Dept. of CSE LBS College of Engineering Kasaragod, Kerala-671542 [email protected] Introduction Bioinformatics is a new discipline that addresses the need to manage and interpret the data that in the past decade was massively generated by genomic research. This discipline represents the convergence of genomics, biotechnology and information technology, and encompasses analysis and interpretation of data, modeling of biological phenomena, and development of algorithms and statistics. Bioinformatics is by nature a cross-disciplinary field that began in the 1960s with the efforts of Margaret O. Dayhoff, Walter M. Fitch, Russell F. Doolittle and others and has matured into a fully developed discipline. However, bioinformatics is wide-encompassing and is therefore difficult to define. For many, including myself, it is still a nebulous term that encompasses molecular evolution, biological modeling, biophysics, and systems biology. For others, it is plainly computational science applied to a biological system. Bioinformatics is also a thriving field that is currently in the forefront of science and technology. Our society is investing heavily in the acquisition, transfer and exploitation of data and bioinformatics is at the center stage of activities that focus on the living world. It is currently a hot commodity, and students in bioinformatics will benefit from employment demand in government, the private sector, and academia. With the advent of computers, humans have become ‘data gatherers’, measuring every aspect of our life with inferences derived from these activities. In this new culture, everything can and will become data (from internet traffic and consumer taste to the mapping of galaxies or human behavior). Everything can be measured (in pixels, Hertz, nucleotide bases, etc), turned into collections of numbers that can be stored (generally in bytes of information), archived in databases, disseminated (through cable or wireless conduits), and analyzed. We are expecting giant pay-offs from our data: proactive control of our world (from earthquakes and disease to finance and social stability), and clear understanding of chemical, biological and cosmological processes. Ultimately, we expect a better life. Unfortunately, data brings clutter and noise and its interpretation cannot keep pace with its accumulation. One problem with data is its multi-dimensionality and how to uncover underlying signal (patterns) in the most parsimonious way (generally using nonlinear approaches. Another problem relates to what we do with the data. Scientific discovery is driven by falsifiability and imagination and not by purely logical processes that turn observations into understanding. Data will not generate knowledge if we use inductive principles. The gathering, archival, dissemination, modeling, and analysis of biological data falls within a relatively young field of scientific inquiry, currently known as ‘bioinformatics’, ‘Bioinformatics was spurred by wide accessibility of computers with increased compute power and by the advent of genomics. Genomics made it possible to acquire nucleic acid sequence and structural information from a wide range of genomes at an unprecedented pace and made this information accessible to further analysis and experimentation. For example, sequences were matched to those coding for globular proteins of known structure (defined by crystallography) and were used in high-throughput combinatorial approaches (such as DNA microarrays) to study patterns of gene expression. Inferences from sequences and biochemical data were used to construct metabolic networks. These activities have generated terabytes of data that are now being analyzed with computer, statistical, and machine learning techniques. The sheer number of sequences and information derived from these endeavors has given the false impression that imagination and hypothesis do not play a role in acquisition of biological knowledge. However, bioinformatics becomes only a science when fueled by hypothesis-driven research and within the context of the complex and ever- changing living world. The science that relates to bioinformatics has many components. It usually relates to biological molecules and therefore requires knowledge in the fields of biochemistry, molecular biology, molecular evolution, thermodynamics, biophysics, molecular engineering, and statistical mechanics, to name a few. It requires the use of computer science, mathematical, and statistical principles. Bioinformatics is in the cross roads of experimental and theoretical science. Bioinformatics is not only about modeling or data ‘mining’, it is about understanding the molecular world that fuels life from evolutionary and mechanistic perspectives. It is truly inter-disciplinary and is changing. Much like biotechnology and genomics, bioinformatics is moving from applied to basic science, from developing tools to developing hypotheses. The Genome The hereditary information that an organism passes to its offspring is represented in each of its cells. The representation is in the form of DNA molecules. Each DNA molecule is a long chain of chemical structures called nucleotides of four different types, which can be viewed abstractly as characters from the alphabet fA; C; T; Gg. The totality of this information is called the genome of the organism. In humans the genome consists of nucleotides. A major task of molecular biology is to _ Extract the information contained in the genomes of different organisms; _ Elucidate the structure of the genome; _ Apply this knowledge to the diagnosis and ultimately, treatment, of genetic diseases (about 4000 such diseases in humans have been identified); _ By comparing the genomes of different species, explain the process and mechanisms of evolution. These tasks require the invention of new algorithms. Genetics Genetics is the study of heredity. It began with the observations of Mendel on the inheritance of simple traits such as the color of peas. Mendel worked out the fundamental mathematical rules for the inheritance of such traits. The central abstraction in genetics is the concept of a gene. In classical genetics an organism’s genes were abstract attributes for which no biochemical basis was known. Over the past 40 years it has become possible to understand the mechanisms of heredity at the molecular level. Classical genetics was based on breeding experiments on plants and animals. Modern genetics is based on the molecular study of fungi, bacteria and viruses. Cells In the early 19th century development of the microscope enabled cells to be observed. A cell is an assortment of chemicals inside a sac bounded by a fatty layer called the plasma membrane. Bacteria and protozoa consist of single cells. Plants and animals are complex collections of cells of different types, with similar cells joined together to form tissues. A human has trillions of cells. In a eukaryotic cell the genetic material is inside a nucleus separated from the rest of the cell by a membrane, and the cell contains a number of discrete functional units called organelles. The part of the cell outside the nucleus is called the cytoplasm. In a prokaryotic cell there is no nucleus and the structure is more homogeneous. Bacteria are prokaryotes, plants, animals and protozoa are eukaryotes. A eukaryotic cell has diameter 10-100 microns, a bacterial cell 1-10 microns. The genetic material in cells is contained in structures called chromosomes. In most prokaryotes, each cell has a single chromosome. In eukaryotes, all the cells of any individual contain the same number of chromosomes, 8 in fruities, 46 in humans and bats, 84 in the rhinoceros. In mammals and many other eukaryotic organisms the chromosomes occur in homologous (similar in shape and structure) pairs, except for the sex chromosomes. The male has an X chromosome and a Y chromosome, and the female has two X chromosomes. Cells with homologous pairs of chromosomes are called diploid. Cells with unpaired chromosomes are called haploid.Inpolyploid organisms the chromosomes come in homologous triplets, quadruplets etc. Mitosis is the process of cell division. In mitosis each chromosome gets duplicated, the chromosomes line up in the central part of the cell, then separate into two groups which move to the ends of the cell, and a cell membrane forms separating the two sister cells. During mitosis chromosomes can be observed well under the microscope. They can be distinguished from one another by their banding structure, and certain abnormalities observed. For example, Down Syndrome occurs when there is an extra copy of chromosome 21. The sperm and egg cells (gametes) in sexually reproducing species are haploid. The process of meiosis produces four sperm/egg cells from a single diploid cell. We describe the main steps in meiosis. The original diploid cell has two homologous copies of each chromosome (with the exception of the X and Y chromosomes). One of these is called the paternal copy (because it was inherited from the father) and the other is called the maternal copy. Each of the two copies duplicates itself. The four resulting copies are then brought together, and material may be exchanged between the paternal and maternal copies. This process is called recombination or crossover. Then a cell division occurs, with the two paternal copies (possibly containing some material from the maternal copies by virtue of recombination) going to one of the daughter cells, and the two maternal copies to the other
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages155 Page
-
File Size-