Data Resources and Tools

Introductory Bioinformatic Techniques 2012
Wits Bioinformatics
Shaun Aron
6/22/2012

Introductory Bioinformatics 2012 - Shaun Aron

Sequence, Structure, Function

  • A variety of protein resources online
  • Several websites/portals dedicated to providing a single interface to multiple resources
  • Important to differentiate between databases, websites and portals

  • Evolutionary relationships between proteins
  • Detection of local similarities between proteins to detect functional domains
  • Functional predictions
  • Protein structural predictions
  • Protein-protein interactions
  • Protein-nucleotide interactions
  • Protein engineering and design

Domain identification
  • Multiple domains in eukaryotic proteins
  • Domain prediction
    ▪ Sequence based
    ▪ Structure based

Protein classification
  • Families
  • Sequence based methods
  • Structure based methods
    ▪ SCOP
    ▪ CATH

Experimental determination of structures
  • X-ray crystallography (requires crystallisation of protein)
  • NMR (for smaller proteins)
  • EM (for large complex structures)

Structural comparisons

Protein folding
  • Folding simulations

Structure prediction
  • Secondary structure
  • Tertiary structure
  • Comparative modeling
  • De novo predictions

Sequence and information databases
  • NCBI Entrez Protein Database: contains protein sequences from GenBank and RefSeq, as well as records from Swiss-Prot, PIR, PRF, and PDB
  • UniProtKB: the “Protein knowledgebase”, a comprehensive set of protein sequences. Functional information on proteins, with accurate, consistent, and rich annotation: the amino acid sequence, protein name or description, taxonomic data and citation information. Divided into two parts: Swiss-Prot and TrEMBL
  • UniProt Swiss-Prot: the manually annotated, reviewed protein sequences in UniProtKB. High quality.
  • UniProt TrEMBL: the automatically annotated, unreviewed set of proteins (translated from EMBL-Bank). Varying quality.

PROSITE
  • Contains documentation entries describing protein domains, families and functional sites, together with the associated patterns and profiles for identifying them (single motif method)

Pfam
  • Collection of protein multiple sequence alignments and profile Hidden Markov Models (HMMs)
  • Uses HMM libraries to define domains in protein sequences (full domain method)

SMART
  • Simple Modular Architecture Research Tool: used for the identification and annotation of protein domains and domain architecture
  • Makes use of hand-curated models for the prediction of protein domains
  • Full domain method

BLOCKS
  • Ungapped multiple alignments corresponding to the most conserved regions of proteins
  • No longer updated

Panther
  • Predicts protein function based on phylogenetic analyses by comparison to proteins of known function

Methods for detecting motifs and domains
  • Single motif methods
    ▪ Regular expressions (PROSITE)
  • Full domain alignment methods
    ▪ Profiles (Profile Library)
    ▪ HMMs (Pfam, SMART, etc.)
  • Multiple motif methods
    ▪ Identity matrices (PRINTS)
Slide duplicated from a presentation by Alex Mitchell, University of Manchester

PRINTS
  • Houses a collection of protein family fingerprints
  • A fingerprint is a collection of motifs (multiple motif method)
  • Can be used to predict functional families in uncharacterised sequences
  • Hierarchical classification of protein superfamilies
  • Underpins the BLOCKS database
  • BLAST, fingerprint and text search
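
The single motif method (regular expressions, as used for PROSITE patterns) can be illustrated with a short script, since a PROSITE-style pattern is essentially a regular expression written in its own notation. The sketch below is a minimal illustration and not PROSITE's own scanning software: the function names `prosite_to_regex` and `find_motif`, the example pattern and the example sequence are all hypothetical and chosen only for demonstration. It covers just the basic syntax (`x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, repeat counts, and the `<` and `>` terminus anchors).

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    # Elements are separated by '-' and the pattern may end with a period.
    for element in pattern.strip().rstrip(".").split("-"):
        at_start = element.startswith("<")  # anchored at the N-terminus
        at_end = element.endswith(">")      # anchored at the C-terminus
        element = element.lstrip("<").rstrip(">")
        # Separate the residue specification from an optional repeat count,
        # for example x(2) or x(5,12).
        match = re.fullmatch(r"([^()]+)(?:\((\d+(?:,\d+)?)\))?", element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, count = match.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        # Single letters and [..] groups are already valid regex syntax.
        repeat = "{" + count + "}" if count else ""
        parts.append(("^" if at_start else "") + core + repeat + ("$" if at_end else ""))
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (1-based start position, matched text) for each match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


# Hypothetical pattern and sequence, for illustration only.
example_pattern = "C-x(2)-C-x(6)-C."
example_sequence = "MKACDDCAAAAAACLL"
print(prosite_to_regex(example_pattern))
print(find_motif(example_pattern, example_sequence))
```

Note that `finditer` reports non-overlapping matches only, so overlapping occurrences of a motif would need a lookahead-based search instead.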
Protein Data Bank (PDB)
  • Single worldwide archive of structural data of biological macromolecules
  • Experimentally validated

SCOP
  • Structural Classification of Proteins

SCOP classes
  • All alpha proteins
  • All beta proteins
  • Alpha and beta proteins (a/b)
    ▪ Mainly parallel beta sheets (beta-alpha-beta units)
  • Alpha and beta proteins (a+b)
    ▪ Mainly antiparallel beta sheets (segregated alpha and beta regions)
  • Multi-domain proteins (alpha and beta)
    ▪ Folds consisting of two or more domains belonging to different classes
  • Membrane and cell surface proteins and peptides
  • Small proteins
  • Coiled coil proteins
  • Designed proteins

SCOP lineage for 1dlw
  1. Root: scop
  2. Class: All alpha proteins
  3. Fold: Globin-like
  4. Superfamily: Globin-like
  5. Family: Truncated hemoglobin
  6. Protein: Protozoan/bacterial hemoglobin
  7. Species: Ciliate (Paramecium caudatum)

InterPro
  • Integrated documentation resource of protein families, domains and functional sites
  • Collection of data from PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIR
  • Integrated into one single resource for protein information

PIR
  • Protein Information Resource
  • Protein ontology
  • iProClass: reports for UniProtKB
  • iProLink: literature, text mining

PDB
  • Portal to the PDB database
  • Tools for searching PDB and related data

EBI
  • European Bioinformatics Institute
  • Tools and databases for primary sequence search, structural databases and UniProt

Expasy (Swiss Institute of Bioinformatics)
  • UniProt, PROSITE, homology modelling, docking, and many other tools covering protein sequences and identification, mass spectrometry and 2-DE data, protein characterisation and function, families, patterns and profiles, post-translational modification, protein structure, protein-protein interaction, similarity search/alignment, drug design and molecular modelling

Selected sequences used to build the PROSITE profile for the Zn(2)-C6 fungal-type DNA-binding domain. The sequence logo indicates the level of conservation of each residue in the alignment.
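
The height of each stack in a sequence logo is commonly computed as the information content of the corresponding alignment column. The following is a minimal sketch, assuming an ungapped protein alignment, a 20-letter alphabet and no small-sample correction; the function name `column_information` and the toy alignment are hypothetical and are not the sequences behind the PROSITE profile.

```python
import math
from collections import Counter


def column_information(alignment, alphabet_size=20):
    """Return the information content (in bits) of each alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    max_bits = math.log2(alphabet_size)  # about 4.32 bits for 20 amino acids
    information = []
    for column in zip(*alignment):
        total = len(column)
        # Shannon entropy of the residue frequencies observed in this column.
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in Counter(column).values()
        )
        # A fully conserved column has zero entropy and the maximum height.
        information.append(max_bits - entropy)
    return information


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACDC", "ACEC", "GCDC", "ACKC"]
for position, bits in enumerate(column_information(toy_alignment), start=1):
    print(position, round(bits, 2))
```

Fully conserved columns reach the maximum of log2(20) bits, while variable columns score lower, which is the pattern a sequence logo makes visible at a glance.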
