CMSE 520 Biomolecular Structure, Function and Dynamics
CMSE 520 BIOMOLECULAR STRUCTURE, FUNCTION AND DYNAMICS
(Computational Structural Biology)

OUTLINE
- Review: molecular biology
- Proteins: structure, conformation and function (5 lectures)
- Generalized coordinates; phi, psi angles
- DNA/RNA: structure and function (3 lectures)
- Structural and functional databases (PDB, SCOP, CATH, functional domain database, Gene Ontology)
- Use scripting languages (e.g. Python) to cross-reference between these databases: starting from sequence to find the function
- Relationship between sequence, structure and function
- Molecular modeling, homology modeling
- Conservation, ConSurf
- Relationship between function and dynamics
- Conformational changes in proteins (structural changes due to ligation, hinge motions, allosteric changes in proteins and the consequent change in function)
- Molecular Dynamics
- Monte Carlo
- Protein-protein interaction: recognition, structural matching, docking
- PPI databases: DIP, BIND, MINT, etc.

References:
- CURRENT PROTOCOLS IN BIOINFORMATICS (e-book) (http://www.mrw.interscience.wiley.com/cp/cpbi/articles/bi0101/frame.html). Andreas D. Baxevanis, Daniel B. Davison, Roderic D.M. Page, Gregory A. Petsko, Lincoln D. Stein, and Gary D. Stormo (eds.), 2003, John Wiley & Sons, Inc.
- INTRODUCTION TO PROTEIN STRUCTURE. Branden, C. & Tooze, J., 2nd ed., 1999, Garland Publishing.
- COMPUTER SIMULATION OF BIOMOLECULAR SYSTEMS. van Gunsteren, Weiner, Wilkinson.
- Internet sources

Ref: Department of Energy

Rapid growth in experimental technologies

Human Genome Projects
Two major goals:
1. DNA mapping
2. DNA sequencing

- Microarray technologies: serial gene expression patterns and mutations
- Time-resolved optical, rapid mixing techniques: folding & function mechanisms (→ ns)
- Techniques for probing single molecule mechanics (AFM, STM) (→ pN)
→ more accurate models/data for computer-aided studies

Weiss, S. (1999). Fluorescence spectroscopy of single molecules. Science 283, 1676-1683.
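The phi and psi angles listed in the outline are backbone dihedral (torsion) angles: phi is defined by the atoms C(i-1), N(i), CA(i), C(i) and psi by N(i), CA(i), C(i), N(i+1). The following is a minimal sketch of how a dihedral angle can be computed from four Cartesian points. The function name `dihedral` and the tuple-based coordinate format are assumptions made for illustration and are not taken from the course material.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four points.

    The angle is measured about the central bond p1 -> p2.
    Hypothetical helper for illustration only.
    """
    b0 = _sub(p0, p1)  # points from p1 back to p0
    b1 = _sub(p2, p1)  # central bond
    b2 = _sub(p3, p2)  # points from p2 on to p3

    # Normalize the central bond so the projections below are simple.
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)

    # Components of b0 and b2 perpendicular to the central bond.
    v = tuple(x - _dot(b0, b1) * y for x, y in zip(b0, b1))
    w = tuple(x - _dot(b2, b1) * y for x, y in zip(b2, b1))

    # Angle between the two perpendicular components, with sign.
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

For a residue i, phi would be `dihedral(C_prev, N, CA, C)` and psi would be `dihedral(N, CA, C, N_next)`, with the coordinates read from a structure file such as a PDB entry.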
Structural Biology/Molecular Biophysics
Most (all?) basic “life processes” are mediated by “machines” that represent the ultimate miniaturization achievable in a universe comprised of atoms and molecules.
The goal is to understand the underlying principles that govern the operation of these molecular machines.

What this course is about
- Overview of ways in which computers are used to solve problems in biology
- Supervised learning of illustrative or frequently-used algorithms and programs and databases
- Supervised learning of programming techniques and algorithms selected from these uses

Structure
- What do the molecules look like?
- How do we determine that experimentally?
- Are there general structural principles?
- How is this information organized?
- How do structural generalizations relate to simple physical/chemical principles?

Dynamics
- Time is of the essence in biological processes; therefore, how do we understand time-dependent processes at the molecular level?
- How do we do this experimentally?
- How do we do this computationally?
Promising Future for Computational Biology

Exponential growth in data
- Sequence and structure data from experiments
- Computational technology

- 12,665 structures as of July 11, 2000
- 22,810 structures as of October 7, 2003
- 35,026 structures as of February 7, 2006

Rost, B. (1998). Marrying structure and genomics. Structure 6, 259-263.

Large databases
Archival databanks of biological information
- Protein, DNA sequence databases
- Protein structure and nucleic acid databases
- Protein expression patterns
- Experimental techniques
Derived databanks
- Sequence motifs
- Mutations and variations in proteins
- Classifications and/or relationships
Databanks of web sites
- Databanks of databanks containing biological information
- Links between databanks

BIOINFORMATICS (definition)
Definition by Luscombe et al., Yale, Dept. of Molecular Biophysics and Biochemistry, 2001:
“Bioinformatics is conceptualizing biology in terms of macromolecules (in the sense of physical chemistry) and then applying ‘informatics’ techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale”

COMPUTATIONAL BIOLOGY (definition)
Definition by NIH (working definition):
The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.
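The three structure counts quoted under "Exponential growth in data" can be turned into a rough doubling time. The sketch below assumes purely exponential growth between each pair of consecutive dates, which is a simplification. The names `STRUCTURE_COUNTS` and `doubling_time_years` are illustrative and not part of the course material.

```python
import math
from datetime import date

# Structure counts and dates as quoted on the slide.
STRUCTURE_COUNTS = [
    (date(2000, 7, 11), 12665),
    (date(2003, 10, 7), 22810),
    (date(2006, 2, 7), 35026),
]


def doubling_time_years(d0, n0, d1, n1):
    """Doubling time in years, assuming N(t) = N0 * exp(k * t)
    between the two observations (an assumption, not a fit)."""
    years = (d1 - d0).days / 365.25
    rate = math.log(n1 / n0) / years  # growth constant k, per year
    return math.log(2) / rate


if __name__ == "__main__":
    # Compare each consecutive pair of observations.
    for (d0, n0), (d1, n1) in zip(STRUCTURE_COUNTS, STRUCTURE_COUNTS[1:]):
        t2 = doubling_time_years(d0, n0, d1, n1)
        print(f"{d0} -> {d1}: doubling time about {t2:.1f} years")
```

If the two intervals give similar doubling times, the exponential description is at least consistent with these three data points; three points are too few to confirm it.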
Information flow
- A major task in computational molecular biology is to “decipher” information contained in biological sequences
- Since the nucleotide sequence of a genome contains all information necessary to produce a functional organism, we should in theory be able to duplicate this decoding using computers
http://www-fp.mcs.anl.gov/~gaasterland/sg-review-slides.html

Two major challenges after completion of the HGP: Structural Genomics and Functional Genomics
Figure: Schematic representation of the universe of proteins in a given organism. Kim, S.H. (1998). Nature Struct. Biol. 5, 643-645.
Aim: “to construct the complete scheme of biological functions and cellular pathways for the entire organism”

What's E-Cell Project?
E-Cell Project is an international research project aiming to model and reconstruct biological phenomena in silico, and to develop the theoretical support, technologies and software platforms needed to allow precise whole-cell simulation.
Metabolism model of the model cell constructed with 127 genes

PROTEOMICS
Covers the following areas (but not limited to):
- Protein structure
  Primary structure: sequence of amino acids
  Secondary structure: local spatial arrangement
  Tertiary structure: three-dimensional native conformation
- Protein function: related to the 3-D shape of the protein
- Protein clusters: according to a specified characteristic
- Protein-protein interaction: interaction among a number of proteins
- Protein-DNA interaction: interaction between one protein and the genome
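The first step of the "decoding" described under Information flow, going from a nucleotide sequence to a primary structure (the sequence of amino acids), can be sketched in a few lines. This is a minimal illustration that assumes the standard genetic code, a single reading frame starting at the first base, and a coding-strand DNA input. The function name `translate` and its `to_stop` parameter are illustrative choices, not part of the course material.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, to_stop=True):
    """Translate a coding-strand sequence into one-letter amino acid codes.

    Reads codons from position 0 and ignores trailing bases that do not
    fill a complete codon. Unknown codons (e.g. containing N) become "X".
    """
    dna = dna.upper().replace("U", "T")  # accept RNA input as well
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and to_stop:
            break  # stop codon ends the protein
        protein.append(aa)
    return "".join(protein)
```

For example, `translate("ATGGCCTAA")` reads the codons ATG, GCC and TAA, giving methionine, alanine and a stop. Going further, from primary structure to tertiary structure and function, is the much harder part of the problem that the rest of the course addresses.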