
CMSE 520

BIOMOLECULAR STRUCTURE, FUNCTION AND DYNAMICS

(Computational Structural Biology)

OUTLINE

- Review:
  - Protein: structure, conformation and function (5 lectures)
    Generalized coordinates, phi and psi angles
  - DNA/RNA: structure and function (3 lectures)
- Structural and functional databases (PDB, SCOP, CATH, Functional domain database, ontology)
  - Use scripting languages (e.g. Python) to cross-reference between these databases: starting from sequence to find the function
- Relationship between sequence, structure and function
  - Molecular modeling, conservation, CONSURF
- Relationship between function and dynamics
  - Conformational changes in proteins (structural changes due to ligation, hinge motions, allosteric changes in proteins and the resulting change in function)
  - Molecular Dynamics
  - Monte Carlo
- Protein-protein interaction: recognition, structural matching, docking
  - PPI databases: DIP, BIND, MINT, etc.

References:

CURRENT PROTOCOLS IN BIOINFORMATICS (e-book), Andreas D. Baxevanis, Daniel B. Davison, Roderic D.M. Page, Gregory A. Petsko, Lincoln D. Stein, and Gary D. Stormo (eds.), 2003, John Wiley & Sons, Inc. (http://www.mrw.interscience.wiley.com/cp/cpbi/articles/bi0101/frame.html)

INTRODUCTION TO PROTEIN STRUCTURE, Branden C & Tooze J, 2nd ed., 1999, Garland Publishing

COMPUTER SIMULATION OF BIOMOLECULAR SYSTEMS, van Gunsteren, Weiner, Wilkinson

Internet sources

Rapid growth in experimental technologies

Human Genome Project (Ref: Department of Energy)
Two major goals:
1. DNA mapping
2. DNA sequencing

- Microarray technologies: serial gene expression patterns and mutations
- Time-resolved optical, rapid mixing techniques: folding & function mechanisms (→ ns)
- Techniques for probing single-molecule mechanics (AFM, STM) (→ pN)
→ more accurate models/data for computer-aided studies

Weiss, S. (1999). Fluorescence spectroscopy of single biomolecules. Science 283, 1676-1683.


Structural Biology/Molecular Biophysics

Most (all?) basic “life processes” are mediated by “molecular machines” that represent the ultimate miniaturization achievable in a universe comprised of atoms and molecules. The goal is to understand the underlying principles that govern the operation of these molecular machines.

What this course is about

- overview of ways in which computers are used to solve problems in biology
- supervised learning of illustrative or frequently-used algorithms and programs and databases
- supervised learning of programming techniques and algorithms selected from these uses

Structure

- What do the molecules look like?
- How do we determine that experimentally?
- Are there general structural principles?
- How is this information organized?
- How do structural generalizations relate to simple physical/chemical principles?

Dynamics

Time is of the essence in biological processes; therefore, how do we understand time-dependent processes at the molecular level?
- How do we do this experimentally?
- How do we do this computationally?

Promising Future for Computational Biology

- Exponential growth in data
- Sequence and structure data from experiments
- Computational technology

12,665 structures as of July 11, 2000
22,810 structures as of October 7, 2003
35,026 structures as of February 7, 2006
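The "exponential growth" claim can be checked against the three structure counts quoted above. The following is a minimal sketch, assuming a simple model N(t) = N0 * exp(r * t); the function and variable names are illustrative and not part of the course material.

```python
from datetime import date
from math import log

# Structure counts quoted on the slide: (date, number of structures)
counts = [
    (date(2000, 7, 11), 12665),
    (date(2003, 10, 7), 22810),
    (date(2006, 2, 7), 35026),
]


def growth_rate(start, end):
    """Continuous annual growth rate r, assuming N(t) = N0 * exp(r * t)."""
    (d0, n0), (d1, n1) = start, end
    years = (d1 - d0).days / 365.25
    return log(n1 / n0) / years


rate_first = growth_rate(counts[0], counts[1])    # 2000 -> 2003
rate_second = growth_rate(counts[1], counts[2])   # 2003 -> 2006
rate_overall = growth_rate(counts[0], counts[2])  # 2000 -> 2006
doubling_time = log(2) / rate_overall             # in years

print(f"rate 2000-2003: {rate_first:.3f} per year")
print(f"rate 2003-2006: {rate_second:.3f} per year")
print(f"overall rate:   {rate_overall:.3f} per year")
print(f"doubling time:  {doubling_time:.1f} years")
```

Both intervals give nearly the same rate (roughly 0.18 per year), which is what an exponential model predicts; three data points are of course too few to rule out other growth laws.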

Rost, B. (1998). Marrying structure and genomics. Structure 6, 259-263.

Large databases

Archival databanks of biological information
- Protein, DNA sequence databases
- Protein structure and databases
- Protein expression patterns
- Experimental techniques

Derived databanks
- Sequence motifs
- Mutations and variations in proteins
- Classifications and/or relationships

Databanks of web sites
- Databanks of databanks containing biological information
- Links between databanks

BIOINFORMATICS (definition)

Definition by Luscombe et al., Yale, Dept. of Molecular Biophysics and Biochemistry, 2001

“Bioinformatics is conceptualizing biology in terms of molecules (in the sense of physical chemistry) and then applying ‘informatics’ techniques (derived from disciplines such as applied maths, computer science, and statistics) to understand and organize the information associated with these molecules, on a large-scale”

COMPUTATIONAL BIOLOGY (definition)

Definition by NIH (working definition)

The development and application of data-analytical and theoretical methods, mathematical modeling and computational simulation techniques to the study of biological, behavioral, and social systems.

Information flow

A major task in computational molecular biology is to “decipher” information contained in biological sequences. Since the nucleotide sequence of a genome contains all information necessary to produce a functional organism, we should in theory be able to duplicate this decoding using computers.

http://www-fp.mcs.anl.gov/~gaasterland/sg-review-slides.html
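The simplest instance of such "decoding" is translating a coding DNA sequence into a protein sequence with the standard genetic code. The sketch below is illustrative only: the `translate` helper and the example sequence are assumptions made here, and it ignores everything that makes real gene finding hard (introns, alternative reading frames, regulation).

```python
from itertools import product

# Standard genetic code, codons enumerated in the order T, C, A, G
# at each position ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding DNA sequence in frame 0 up to the first stop codon.

    Hypothetical helper for illustration; characters other than A, C, G, T
    raise a KeyError.
    """
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


print(translate("ATGGCCTGGTAA"))  # codons: ATG GCC TGG TAA
```

Going from this one-letter string to structure and function is the part that the rest of the course addresses.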

Two major challenges after completion of the HGP: Structural Genomics and Functional Genomics

Schematic representation of the universe of proteins in a given organism

Kim, S.H. (1998). Nature Struct. Biol. 5, 643-645

Aim: “to construct the complete scheme of biological functions and cellular pathways for the entire organism”

What's E-Cell Project?

E-Cell Project is an international research project aiming to model and reconstruct biological phenomena in silico, and developing necessary theoretical supports, technologies and software platforms to allow precise whole cell simulation.

Metabolism model of the model cell constructed with 127

PROTEOMICS

Covers the following areas (but not limited to): ¾Protein structure Primary Structure: sequence of amino acids Secondary Structure: local spatial arrangement Tertiary Structure: three dimensional native conformation

¾Protein Function related to 3-D shape of the protein

¾Protein clusters according to a specified characteristic

¾Protein-Protein Interaction interaction among a number of proteins

¾Protein-DNA Interaction interaction between one protein and the genome