Bioinformatics: Computational Analysis of Biological Information

RESEARCH at the University of Maryland

Bioinformatics, the use of advanced computational techniques for biological research, is accelerating scientific discovery and leading to new approaches to human disease. These computational methods enable researchers to tackle previously cumbersome analytical tasks, such as studying the entire genetic makeup of an organism. With the aid of the latest bioinformatics technology, researchers can interpret DNA sequences with greater accuracy, in less time, and at lower cost.

The University of Maryland’s Center for Bioinformatics and Computational Biology (CBCB) is at the forefront of bioinformatics research. Directed by Horvitz Professor of Computer Science Steven Salzberg, the center coordinates the expertise of researchers working in computer science, molecular biology, mathematics, physics, and biochemistry. These researchers reduce complex biological phenomena to information units stored in enormous data sets. Analyzing these data reveals new answers to biological problems.

CBCB projects include technological solutions for accelerating vaccine development, identifying the complex causes of epidemic diseases, and revealing previously unseen relationships between the biochemistry of our bodies and the symptoms of puzzling diseases.

Steven Salzberg and Carl Kingsford are using bioinformatics to transform influenza research. Their work will yield better methods for tracking the spread of influenza and for designing vaccines.

Mihai Pop uses computational tools to combat infant mortality in developing countries. Bioinformatics enables him to pinpoint causes with greater sophistication.

Najib El-Sayed and Mihai Pop use computational statistics to detect correlations between gut bacteria and the symptoms of poorly understood disorders, such as autism and Crohn’s disease.

Steven Salzberg and the CBCB research team develop open-source bioinformatics software.
They provide critical code components to researchers around the world.

Center for Bioinformatics and Computational Biology
http://www.cbcb.umd.edu/

Better Flu Vaccines through Bioinformatics

The accelerating evolution of the influenza virus poses significant public health problems. For example, the 2008 flu vaccine was largely ineffective, because it did not protect against newer viral strains that dominated the flu season. Steven Salzberg and Carl Kingsford use bioinformatics to develop better methods for predicting how the flu evolves and circulates through different populations. Their work will lead to better flu vaccines.

Salzberg and Kingsford have been studying the vastly expanded flu virus database, and they have developed new computational methods for recognizing patterns of mutation. They are now able to track flu developments across a range of formerly inaccessible criteria, such as age, location, and cultural and ethnic differences. Bioinformatics is particularly helpful in expediting flu phylogenetic analysis, which establishes “family trees” of genetically diverse and rapidly mutating flu strains.

With NIH colleague David Lipman, Salzberg and Kingsford have already generated 2,700 completely sequenced genomes of distinct flu strains. Such a database is critical for improving annual flu prevention measures and for increasing general knowledge about flu pandemics. Salzberg and Kingsford hope to use these data to determine whether an emerging strain will be mild or severe, which age groups will be most susceptible, how long a given vaccine will prove effective, and how flu immunities change over time.

Salzberg’s research specialties include comparative genomics, gene finding, and genome sequence assembly. Kingsford’s interests include protein structure prediction and the evolution of viral and bacterial genomes.

Steven Salzberg [email protected] http://www.cbcb.umd.edu/~salzberg/
Carl Kingsford [email protected] http://www.cbcb.umd.edu/~carlk/

Using Bioinformatics to Combat Infant Mortality

Using computational statistics and phylogenetic analysis, Mihai Pop analyzes bacteria in fecal samples from infants in Mali, Bangladesh, and other developing countries to diagnose the causes of diarrhea and terminal dehydration. By analyzing huge genetic data sets based on the bacteria in diarrheic and non-diarrheic patients, Pop hopes to identify previously unknown pathogens that cause diarrhea. Currently, forty percent of terminal dehydration cases cannot be diagnosed.

Bioinformatics enables Pop to expand the number of studied samples and to find data patterns with greater sophistication. Expanded research parameters allow him to correlate data across a range of factors, including age, location, and season. Pop’s ultimate aim is to use these data to develop new methods of treatment and prevention, thus addressing a major public health problem in the developing world.

Pop collaborates with researchers at the University of Maryland School of Medicine in Baltimore, and this project is funded by the Bill and Melinda Gates Foundation. Pop is an expert in genome assembly and metagenomics, which is sometimes referred to as environmental sequencing.

Mihai Pop [email protected] http://www.cbcb.umd.edu/~mpop/

Approaching Old Diseases with New Bioinformatic Analysis

Najib El-Sayed and Mihai Pop are developing new algorithms and databases to study how microbes in the human gut contribute to chemical imbalances associated with complex ailments. Their work could change the way we treat autism, Crohn’s disease, and other disorders.

The human gut hosts thousands of symbiotic and parasitic bacteria species. The sheer number of bacteria types can be a challenge for researchers studying the links between bacterial biochemistry and diseases with poorly defined causes.
Bioinformatics frees El-Sayed and Pop from the Petri-dish model of analysis, allowing them to approach mysterious diseases with new tools and perspectives. They are able to decode the genetic material of large microorganism populations, a process that can reveal biomarkers correlating specific bacteria to the symptoms of specific syndromes. This work will lead to a better understanding of the causes of cryptic diseases, enabling earlier diagnoses and new treatment options.

El-Sayed and Pop are especially eager to pursue new research suggesting a possible correlation between bacteriological imbalance and autism. The symptoms of autism, long perceived as an inherited neurological disorder, may be affected by other factors, including the bacteria population of the gut. If this proves true, some symptoms of autism could be inhibited with antibiotics or dietary changes.

El-Sayed’s research interests focus on genomic approaches (comparative genomics, functional genomics, and genome sequencing and analysis) to parasites and host-parasite interactions.

Najib El-Sayed [email protected] http://www.najibelsayed.org/
Mihai Pop [email protected] http://www.cbcb.umd.edu/~mpop/

Open Sourcing Genomics Software

The emergence of genomics, the study of an organism’s entire DNA sequence, has exponentially increased our understanding of the links between genetic codes and the expression of specific traits. However, the volume and complexity of genetic data continue to pose challenges. Despite advances in assembly algorithms, data often appear in partial and scrambled forms that are hard to verify for accuracy and even harder to synthesize into meaningful patterns.

Steven Salzberg and his CBCB team create software that will enable verification and synthesis of genomic information at larger scales. To expedite software development, they make it available to researchers around the world for free.
Popular in other software development environments, this “open source” model allows programmers to access and modify software code without paying royalties or licensing fees. Open source development accelerates research by removing the time-consuming and distracting barriers created by legal restrictions, allowing researchers to focus all their energy on scientific and technical problems.

One of CBCB’s main projects is A Modular, Open-Source (AMOS) Genome Assembler, a generic assembly foundation that researchers can adapt for specific projects. The program’s modular approach eliminates the need to go back to assembly basics for each project, saving valuable development time. Other bioinformatics researchers around the world are already participating in and expanding the AMOS project.

Steven Salzberg [email protected] http://www.cbcb.umd.edu/~salzberg/
