Intro to UniProt & InterPro

Since 2002, a merger and collaboration of three databases: Swiss-Prot, TrEMBL and PIR-PSD. Funded mainly by the NIH (US) to be the highest quality, most thoroughly annotated protein sequence database.

o A high quality protein sequence database
  A non-redundant protein database with maximal coverage, including splice isoforms, disease variants and PTMs. Sequence archiving is essential.
o Easy protein identification
  Stable identifiers and consistent nomenclature/controlled vocabularies.
o Thorough protein annotation
  Detailed information on protein function, biological processes, molecular interactions and pathways, cross-referenced to external sources.

UniProtKB/TrEMBL
  • 1 entry per nucleotide submission
  • Redundant, automatically annotated - unreviewed

UniProtKB/Swiss-Prot
  • 1 entry per protein
  • Non-redundant, high-quality manual annotation - reviewed

[Figure: data sources feeding UniProtKB/TrEMBL: ENA (EMBL) DNA database, PDB, Ensembl, VEGA (Sanger), FlyBase, WormBase, plus peptide, mRNA and patent data. Slide date: 02.06.2014]

[Figure: manual annotation of UniProtKB/Swiss-Prot. A UniProtKB entry brings together sequence, splice variants, sequence features, ontologies, annotations, nomenclature and references.]

www.uniprot.org
Beta.uniprot.org

Sequence curation, stable identifiers, versioning and archiving
For example: erroneous gene model predictions, frameshifts, premature stop codons, read-throughs, erroneous initiator methionines...
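The stable identifiers and the reviewed/unreviewed split can be seen directly in UniProtKB FASTA headers, where the first field is `sp` for Swiss-Prot and `tr` for TrEMBL. The following is a minimal sketch, not part of the original slides: the function name `parse_fasta_header` is hypothetical, the accession pattern is the one given in the UniProt help pages on accession numbers, and the example headers are for illustration only.

```python
import re

# Accession pattern as documented in the UniProt help pages
# (accessions are 6 or 10 characters long).
ACCESSION_RE = re.compile(
    r"^(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})$"
)


def parse_fasta_header(header):
    """Return (reviewed, accession, entry_name) for a UniProtKB FASTA header.

    Hypothetical helper: 'sp' marks a reviewed Swiss-Prot entry,
    'tr' an unreviewed TrEMBL entry.
    """
    if not header.startswith(">"):
        raise ValueError("not a FASTA header: %r" % header)
    parts = header[1:].split(None, 1)  # identifier block, then description
    if not parts:
        raise ValueError("empty FASTA header")
    fields = parts[0].split("|")
    if len(fields) != 3 or fields[0] not in ("sp", "tr"):
        raise ValueError("unexpected identifier block: %r" % parts[0])
    db, accession, entry_name = fields
    if not ACCESSION_RE.match(accession):
        raise ValueError("malformed accession: %r" % accession)
    return db == "sp", accession, entry_name


if __name__ == "__main__":
    # Illustrative headers only.
    for line in (
        ">sp|P04637|P53_HUMAN Cellular tumor antigen p53",
        ">tr|A0A024R161|A0A024R161_HUMAN Uncharacterized protein",
    ):
        reviewed, acc, name = parse_fasta_header(line)
        print(acc, name, "reviewed" if reviewed else "unreviewed")
```

Validating the accession before using it is a cheap way to catch truncated or mangled identifiers early, before they are passed to any lookup.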
Manual curation also covers:
o Splice isoforms
o Identification of amino acid variants and of PTMs
o Sequence annotation
o Protein nomenclature
o Controlled vocabularies, used whenever possible

Annotation comments
FUNCTION, CATALYTIC ACTIVITY, COFACTOR, ENZYME REGULATION, BIOPHYSICOCHEMICAL PROPERTIES, PATHWAY, SUBUNIT, INTERACTION, SUBCELLULAR LOCATION, ALTERNATIVE PRODUCTS, TISSUE SPECIFICITY, DEVELOPMENTAL STAGE, INDUCTION, DOMAIN, PTM, RNA EDITING, MASS SPECTROMETRY, POLYMORPHISM, DISEASE, DISRUPTION PHENOTYPE, ALLERGEN, TOXIC DOSE, BIOTECHNOLOGY, PHARMACEUTICAL, MISCELLANEOUS, SIMILARITY, CAUTION, SEQUENCE CAUTION, WEB RESOURCE

Automatic Annotation for UniProtKB/TrEMBL
• UniProtKB/Swiss-Prot: manually curated proteins, capturing information available in the literature (~545,000 entries)
• UniProtKB/TrEMBL: automated annotation only (~56,000,000 entries)
Most proteins whose sequence we now know have not been biochemically profiled in any laboratory, and probably never will be.
• InterPro: a database which integrates predictive information about proteins' function from partner resources, giving an overview of the families that a protein belongs to and the domains and sites it contains.
• Users who have novel nucleotide or protein sequences to functionally characterise can use the software package InterProScan to run the scanning algorithms from the InterPro database.

Automatic Annotation
o UniProtKB employs two prediction programs, which are referred to as UniRule and SAAS.
  • UniRule maintains a set of manually established and maintained annotation rules.
  • SAAS, the Statistical Automatic Annotation System, generates a new set of decision trees with every UniProtKB release using data mining.
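To make the idea of rule-based automatic annotation concrete, here is a toy sketch of how a manually written rule might be applied to an unreviewed entry. It is not UniRule's actual rule format or code: the classes `Protein` and `Rule`, the function `apply_rules`, and the signature and rule identifiers are all hypothetical, and the only assumption carried over from the slides is that a rule attaches annotation comments when its conditions are met.

```python
from dataclasses import dataclass, field


@dataclass
class Protein:
    accession: str
    signatures: set   # signature IDs reported by a scan (hypothetical IDs)
    lineage: set      # taxonomic lineage names
    annotations: dict = field(default_factory=dict)


@dataclass
class Rule:
    rule_id: str
    required_signatures: frozenset  # all of these must be present
    required_taxon: str             # must appear in the lineage
    annotations: dict               # comment topic -> text to add

    def matches(self, protein):
        return (self.required_signatures <= protein.signatures
                and self.required_taxon in protein.lineage)


def apply_rules(protein, rules):
    """Apply every matching rule; return the IDs of the rules that fired."""
    fired = []
    for rule in rules:
        if rule.matches(protein):
            for topic, text in rule.annotations.items():
                # Keep any annotation that is already present.
                protein.annotations.setdefault(topic, text)
            fired.append(rule.rule_id)
    return fired


if __name__ == "__main__":
    rules = [
        Rule("TOY_RULE_1", frozenset({"SIG0001"}), "Eukaryota",
             {"FUNCTION": "Predicted protein kinase (toy annotation)."}),
    ]
    entry = Protein("X00001", {"SIG0001", "SIG0042"}, {"Eukaryota", "Metazoa"})
    print(apply_rules(entry, rules), entry.annotations)
```

The sketch requires both the signature condition and the taxonomic condition to hold before a rule fires, so a rule does not propagate an annotation to organisms it was not written for.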
[Figure: Swiss-Prot and InterPro]

Proteomes
Definition: The complete proteome of an organism is all the proteins expressed by that organism.

Two types of Proteomes
o Complete proteomes
  Complete sets of proteins thought to be expressed by organisms whose genomes have been completely sequenced.
o Reference proteomes
  Some complete proteomes have been selected as reference proteome sets. These cover the proteomes of well-studied model organisms and other proteomes of interest for biomedical research.

Requirements for Complete Proteomes
• Completely sequenced genome
• Good gene prediction models
• Proteins are mapped to genome
• Good quality transcriptome/proteome data

Obtaining Proteomes

• Stuck? Just ask: there is an active help and support team.
• Feedback: if you find something incorrect, outdated, missing, etc., please tell us. [email protected]

Thanks for your attention.
