Biochem 218 Protein Structural Motifs

Computational Molecular Biology Biochem 218 – BioMedical Informatics 231 http://biochem218.stanford.edu/ Protein Structural Motifs Doug Brutlag Professor Emeritus Biochemistry & Medicine (by courtesy) Homework 5: Phylogenies • For this homework assignment take 20 to 30 protein sequences which are at least 30% similar or better and: o 1) make a multiple sequence alignment with them using ClustalW and o 2) make two phylogenies, one using UPGMA method and the other using the Neighbor Joining method • Describe the resulting alignments and include graphic images of the phylogenies in a message to [email protected] • Mention if the trees seem reasonable biologically or taxonomically by comparison with standard taxonomies • Do the two trees have the same topology? • Do the trees have the same branch lengths? • If the two trees do not have the same topology or branch lengths, describe the differences and indicate why you think the two trees differ. Are the differences significant? • Do the trees show evidence of paralogous evolution? Which nodes are orthologous and which are paralogous bifurcations? • Do the trees show evidence of either gene conversion or horizontal gene transfer? Final Projects Due March 12 • Examples of Previous Final Projects o http://biochem218.stanford.edu/Projects.html • Critical review of any area of computational molecular biology. o Area from the lectures but in more depth o Any other area of bioinformatics or genomics focused on computational approaches • Proposed improvement or novel approach • Can be a combined experimental/computational method. • Could be an implementation or just pseudocode. • Please do a MeSH literature search for Reviews on your topic. Some useful MeSH terms include: o Algorithms o Statistics o Molecular Sequence Data o Molecular Structure etc. • Please send a proposed final project topic to [email protected] by next Friday Protein Structure Computational Goals • Compare all known structures to each other • Compute distances between protein structures • Classify and organize all structures in a biologically meaningful way • Discover conserved substructure domain • Discover conserved substructural motifs • Find common folding patterns and structural/functional motifs • Discover relationship between structure and function. • Study interactions between proteins and other proteins, ligands and DNA (Protein Docking) • Use known structures and folds to infer structure from sequence (Protein Threading) • Use known structural motifs to infer function from structure • Many more… Structural Classification of Proteins (SCOP) http://scop.berkeley.edu/ • Class o Similar secondary structure content o All α, all β, alternating α/ β etc • Fold (Architecture) o Major structural similarity o SSE’s in similar arrangement • Superfamily (Topology) o Probable common ancestry o HMM family membership • Family o Clear evolutionary relationship o Pairwise sequence similarity > 25% Classes of Protein Structures • Mainly α • Mainly β α β alternating o Parallel β sheets, β-α-β units • α β o Anti-parallel β sheets, segregated α and β regions o helices mostly on one side of sheet Classes of Protein Structures • Others o Multi-domain, membrane and cell surface, small proteins, peptides and fragments, designed proteins Folds / Architectures • Mainly α • α/ βa nd α+β o Bundle • Closed o Non-Bundle • • Mainly β Barrel o Single sheet • Roll, ... o Roll • Open o Barrel • Sandwich o Clam • Clam, ... o Sandwich o Prism o 4/6/7/8 Propeller o Solenoid The TIM Barrel Fold A Conceptual Problem ... Fold versus Topology Another example: Globin vs. Colicin PDB Protein Database http://www.rcsb.org/pdb/ • Protein DataBase o Multiple Structure Viewers o Sequence & Structure Comparison Tools o Derived Data . SCOP . CATH . pFAM . Go Terms o Education on Protein Structure o Download Structures and Entire Database PDB Protein Database http://www.rcsb.org/pdb/ PDB Protein Database http://www.rcsb.org/pdb/ PDB Advanced Search for UniProt Entry http://www.rcsb.org/pdb/ PDB Search Results http://www.rcsb.org/pdb/ PDB E. coli Hu Entry http://www.rcsb.org/pdb/explore/explore.do?structureId=2O97 PDB SimpleViewer http://www.rcsb.org/pdb/ PDB Protein Workshop View http://www.rcsb.org/pdb/ PDB Derived Data http://www.rcsb.org/pdb/ Molecule of the Month: Enhanceosome http://www.rcsb.org/pdb/static.do?p=education_discussion/molecule_of_the_month/current_month.html NCBI Structure Database http://www.ncbi.nlm.nih.gov/Structure/ • Macromolecular Structures • Related Structures • View Aligned Structures & Sequences • Cn3D: Downloadable Structure & Sequence Viewer • CDD: Conserved Domain Database o CD-Search: Protein Sequence Queries o CD-TREE: Protein Classification Downloadable Application o CDART: Conserved Domain Architecture Tool • PubChem: Small Molecules and Biological Activity • Biological Systems: BioCyc, KEGG and Reactome Pathways • MMDB: Molecular Modeling Database • CBLAST: BLAST sequence against PDB and Related Structure Database • IBIS: Inferred Biomolecular Interaction Server • VAST Search: Structure Alignment Tool NCBI Structure Database http://www.ncbi.nlm.nih.gov/Structure/ NCBI Structure Database http://www.ncbi.nlm.nih.gov/Structure/ NCBI Cn3D Viewer http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3d.shtml PyMol PDB Structure Viewer http://www.pymol.org/ Databases of Protein Folds • SCOP (http://scop.berkeley.edu/) o Structural Classification of Proteins o Class-Fold-Superfamily-Family o Manual assembly by inspection • Superfamily (http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/) o HMM models for each SCOP fold o Fold assignments to all genome ORFs o Assessment of specificity/sensitivity of structure prediction o Search by sequence, genome and keywords • CATH (http://www.biochem.ucl.ac.uk/bsm/cath/) o Class - Architecture - Topology - Homologous Superfamily o Manual classification at Architecture level o Automated topology classification using SSAP (Orengo & Taylor) • FSSP (http://www2.embl-ebi.ac.uk/dali/fssp/) o Fully automated using the DALI algorithm (Holm & Sander) o No internal node annotations o Structural similarity search using DALI SCOP Database of Protein Folds http://scop.berkeley.edu/ SCOP Hierarchy http://scop.berkeley.edu/data/scop.b.html SCOP Alpha and Beta Proteins http://scop.berkeley.edu/data/scop.b.d.html SCOP TIM Barrels http://scop.berkeley.edu/data/scop.b.d.b.html SCOP Thiamin Phosphate Synthase http://scop.berkeley.edu/data/scop.b.d.b.d.A.html SCOP Thiamin Phosphate Synthase Entry http://scop.berkeley.edu/ SuperFamily HMM Fold Library http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/ SuperFamily Major Features http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/ Genome Assignments by Superfamily http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/ Databases of Protein Folds • SCOP (http://scop.berkeley.edu/) o Structural Classification of Proteins o Class-Fold-Superfamily-Family o Manual assembly by inspection • Superfamily (http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/) o HMM models for each SCOP fold o Fold assignments to all genome ORFs o Assessment of specificity/sensitivity of structure prediction o Search by sequence, genome and keywords • CATH (http://www.biochem.ucl.ac.uk/bsm/cath/) o Class - Architecture - Topology - Homologous Superfamily o Manual classification at Architecture level o Automated topology classification using SSAP (Orengo & Taylor) • FSSP (http://www2.embl-ebi.ac.uk/dali/fssp/) o Fully automated using the DALI algorithm (Holm & Sander) o No internal node annotations o Structural similarity search using DALI CATH Protein Structure Classification http://www.biochem.ucl.ac.uk/bsm/cath/ CATH Protein Structure Hierarchy http://www.biochem.ucl.ac.uk/bsm/cath/ CATH Protein Class Level http://www.biochem.ucl.ac.uk/bsm/cath/ CATH Orthogonal Bundle http://www.biochem.ucl.ac.uk/bsm/cath/ CATH Protein Summary http://www.biochem.ucl.ac.uk/bsm/cath/ CATH Protein Summary http://www.biochem.ucl.ac.uk/bsm/cath/ Databases of Protein Folds • SCOP (http://scop.berkeley.edu/) o Structural Classification of Proteins o Class-Fold-Superfamily-Family o Manual assembly by inspection • Superfamily (http://supfam.mrc-lmb.cam.ac.uk/SUPERFAMILY/) o HMM models for each SCOP fold o Fold assignments to all genome ORFs o Assessment of specificity/sensitivity of structure prediction o Search by sequence, genome and keywords • CATH (http://www.biochem.ucl.ac.uk/bsm/cath/) o Class - Architecture - Topology - Homologous Superfamily o Manual classification at Architecture level o Automated topology classification using SSAP (Orengo & Taylor) • FSSP (http://www2.embl-ebi.ac.uk/dali/fssp/) o Fully automated using the DALI algorithm (Holm & Sander) o No internal node annotations o Structural similarity search using DALI FSSP Database http://srs.ebi.ac.uk/srsbin/cgi-bin/wgetz?-page+LibInfo+-lib+FSSP Dali Server http://www.ebi.ac.uk/dali/ DALI Database (Liisa Holm) http://ekhidna.biocenter.helsinki.fi/dali/start Protein Fold Prediction: Swiss Model http://swissmodel.expasy.org/ • Amos Bairoch, Swiss Bioinformatics Institute, SBI • Threading and Template Discovery • Workspace for saving Template Results • Domain Annotation • Structure Assessment • Template Library • Structures & Models • Documentation and Tutorials Protein Fold Prediction: Swiss Model http://swissmodel.expasy.org/ Automatic Protein Fold Prediction http://swissmodel.expasy.org/ Automatic Protein Fold Prediction Results http://swissmodel.expasy.org/ Automatic Protein Fold Prediction Results http://swissmodel.expasy.org/ Automatic Protein Fold Prediction Results http://swissmodel.expasy.org/

Biochem 218 Protein Structural Motifs

Predicting and Characterising Protein-Protein Complexes

Functional Effects Detailed Research Plan

Development of Novel Strategies for Template-Based Protein Structure Prediction

Centre for Bioinformatics Imperial College London

Centre for Bioinformatics Imperial College London Second Report 31

Spinout Equinox Pharma Speeds up and Reduces the Cost of Drug Discovery

Virtual Screening of Human Class-A Gpcrs Using Ligand Profiles Built on Multiple Ligand-Receptor Interactions

A Composite Approach to Protein Structure Evolution

Drosophila Melanogaster

Erepo-ORP: Exploring the Opportunity Space to Combat Orphan Diseases with Existing Drugs

Annual Review 2003

ELIXIR UK Node