EBI patent related services
4th Annual Forum for SMEs
October 18-19th 2010
Jennifer McDowall
Senior Scientist, EMBL-EBI
EBI is an Outstation of the European Molecular Biology Laboratory.
Overview
• Patent sequence data
• Sequence archives
• Sequence searches
Sequence data from patent literature
Figure: patent offices and the sequence databases that receive their data (USPTO and GenBank, JPO and DDBJ, EPO and ENA)
EPO policy: data released to public (and to EMBL) 18 months after patent application date, independent of whether patent has been granted.
September 2010:
• nucl > 17.5m sequences
• prot > 4.9m sequences
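The 18-month rule can be worked through as simple date arithmetic. The sketch below is only an illustration: the function names are hypothetical, and counting in calendar months with the day clamped to the end of the month is an assumption, since the slide does not say how the period is counted in practice.

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day
    to the last day of the target month (assumed convention)."""
    total = d.year * 12 + (d.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def expected_release(application_date: date) -> date:
    """Earliest public release under the 18-month policy stated on the slide."""
    return add_months(application_date, 18)


print(expected_release(date(2009, 3, 15)))
```

An application dated 15 March 2009 would, under these assumptions, become public around 15 September 2010, whether or not the patent has been granted by then.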
Patent sequence records
European Nucleotide Archive (ENA, formerly EMBL-Bank)
Universal Protein Resource (UniProt)
Non-redundant Patent Sequence Databases
ENA
• ENA = ENA-Annotation (old EMBL-Bank) + raw data archives (Sequence Read Archive, Trace Archive)
• ENA-Annotation >124m sequences
• Includes patent class (PAT): EPO, USPTO, JPO, KIPO
• Dates include: date sequence went public, date of last revision
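As a rough illustration of how the two dates might be read from a record, the sketch below parses `DT` lines in the EMBL flat-file style. The sample entry is invented for illustration, the function name is hypothetical, and treating the "Created" line as the date the sequence went public is an assumption.

```python
import re
from datetime import date

MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

# Matches lines such as: DT   10-MAR-2003 (Rel. 75, Created)
DT_LINE = re.compile(r"^DT\s+(\d{2})-([A-Z]{3})-(\d{4})\s+\((.*)\)\s*$")


def parse_dates(flatfile_text: str) -> dict:
    """Return the first-public and last-updated dates found in DT lines."""
    dates = {}
    for line in flatfile_text.splitlines():
        match = DT_LINE.match(line)
        if not match:
            continue
        day, month, year, note = match.groups()
        parsed = date(int(year), MONTHS[month], int(day))
        key = "first_public" if "Created" in note else "last_updated"
        dates[key] = parsed
    return dates


# Invented sample entry; not a real database record.
sample = """ID   X00001; SV 1; linear; unassigned DNA; PAT; UNC; 120 BP.
DT   10-MAR-2003 (Rel. 75, Created)
DT   02-JUN-2005 (Rel. 83, Last updated, Version 2)
"""
print(parse_dates(sample))
```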
Patent sequence record in ENA
• Sequence version
• Download data
• Dates (first public and last updated)
• Navigate to related data, e.g. version archive, graphical viewer
• DNA source
• Navigate to external data sources, e.g. UniProt
• Patent reference
• Sequence
UniProt
• Composed of 4 sections:
• UniParc: non-redundant archive
• UniProtKB: Swiss-Prot / TrEMBL
• UniMES: metagenomic
• UniRef: sequence clusters
• UniParc >23m sequences
• Includes patent class (PRT): EPO, USPTO, JPO, KIPO
• Dates include: date sequence went public, date of last revision
Patent sequence record in UniProt
• Accession
• Download data
• List of databases containing sequence
• REMTREMBL (deprecated database)
• Navigate to individual entries
• Sequence
Non-redundant patent databases
ENA (redundant)
Remove sequence redundancy
Level-1 NR
Additional annotation, including priority dates for patent family
Remove patent family redundancy
Level-2 NR
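The two reduction steps above can be sketched as follows. This is only an illustration under stated assumptions: the record fields (`accession`, `patent`, `family`), the use of exact sequence identity at level 1, and keeping the first record per family at level 2 are all hypothetical choices, since the slide does not describe the actual pipeline criteria.

```python
from collections import defaultdict
from typing import NamedTuple


class PatentSeq(NamedTuple):
    accession: str  # hypothetical record identifier
    patent: str     # patent publication number (invented values below)
    family: str     # hypothetical patent family identifier
    sequence: str


def level1(records):
    """Level 1: group records that share an identical sequence."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec.sequence.upper()].append(rec)
    return dict(groups)


def level2(level1_groups):
    """Level 2: within each sequence group, keep one record per patent family."""
    result = {}
    for seq, recs in level1_groups.items():
        by_family = {}
        for rec in recs:
            by_family.setdefault(rec.family, rec)  # first record represents the family
        result[seq] = list(by_family.values())
    return result


records = [
    PatentSeq("X1", "EP0000001", "FAM1", "acgtacgt"),
    PatentSeq("X2", "US0000001", "FAM1", "ACGTACGT"),
    PatentSeq("X3", "JP0000001", "FAM2", "ACGTACGT"),
    PatentSeq("X4", "EP0000002", "FAM3", "TTGACC"),
]
l1 = level1(records)
l2 = level2(l1)
print(len(records), "records,", len(l1), "unique sequences")
for seq, recs in l2.items():
    print(seq, [r.accession for r in recs])
```

In this toy data, X1 and X2 carry the same sequence from the same family, so only one of them survives level 2, while X3 is kept because it belongs to a different family.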
Bulk Downloads
http://www.ebi.ac.uk/patentdata/
Patent proteins
Patent nucleotides
Non-redundant sequences
Sequence archives
• ENA nucleotide sequence version archive (SVA) www.ebi.ac.uk/embl/sva
• UniSave: UniProt sequence/annotation version archive www.ebi.ac.uk/uniprot/unisave
• Search by date: get specific record only
• Search by accession: get all records
Provides complete version list
• View old entries
• Compare different versions
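Comparing two versions of an entry comes down to a line-by-line diff of the record text. The sketch below uses Python's standard `difflib` on two invented flat-file fragments. It is not how the archives implement their comparison view, and the function name and sample text are hypothetical.

```python
import difflib


def compare_versions(old_text: str, new_text: str,
                     old_label: str = "version 1",
                     new_label: str = "version 2") -> list:
    """Return a unified diff between two versions of an entry, line by line."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_label,
        tofile=new_label,
        lineterm="",
    ))


# Invented entry fragments; not real archive records.
v1 = "DE   Sequence 1 from Patent WO0000001.\nKW   .\n"
v2 = "DE   Sequence 1 from Patent WO0000001.\nKW   hypothetical keyword.\n"

diff = compare_versions(v1, v2)
print("\n".join(diff))
```

Lines prefixed with `-` appear only in the older version and lines prefixed with `+` only in the newer one, which is enough to see what changed between revisions.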
EB-eye: text search
Fast, easy to use
Search for patent WO0146262
• Lists all entries associated with WO0146262
• Lists sequences associated with WO0146262
SRS: advanced text search
For more complex searches
Select resources to search, then create query
• Patent literature
• Patent DNA
• Patent proteins
http://srs.ebi.ac.uk/
Sequence Similarity & Analysis
Search for patent sequence
Iterative BLAST searches
Fragment FASTA searches
FASTA nucleotide patent search
Search ENA patent class or non-redundant patent datasets
FASTA protein patent search
Search individual patent offices or non-redundant patent datasets
Results: patent protein v UniProt
Provide UniProt records
Provide additional annotation
Additional annotation (protein searches)
• Nucleotide sequences
• Structures
• Molecular interactions
• GO mapping
• Enzyme data
• Literature
• Gene expression
• Genome information
• Chemical information
• Domain/family classification
• Reactions & pathways
Functional predictions (protein searches)
• Visual comparison
• InterPro classification
• Helps identify mis- or partial matches
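One way to picture how signature classification helps flag partial matches is to compare the set of signatures on the query with the set on each hit. The sketch below is a minimal illustration with invented signature names and hits. The ranking rule (shared signatures first, then percent identity) is an assumption, not the method used by the search service.

```python
def classify_hit(query_sigs: set, hit_sigs: set) -> str:
    """Label a hit by how many of the query's signatures it shares."""
    shared = query_sigs & hit_sigs
    if not shared:
        return "no shared signatures"
    if shared == query_sigs:
        return "full signature match"
    return "partial match"


def rank_hits(query_sigs: set, hits: list) -> list:
    """Sort hits by shared-signature count, then by percent identity."""
    ranked = sorted(
        hits,
        key=lambda hit: (len(query_sigs & hit["signatures"]), hit["identity"]),
        reverse=True,
    )
    return [
        (hit["id"], hit["identity"], classify_hit(query_sigs, hit["signatures"]))
        for hit in ranked
    ]


# Invented signature names and hits, for illustration only.
query = {"FAMILY_A", "DOMAIN_1", "DOMAIN_2"}
hits = [
    {"id": "hit3", "identity": 28.0, "signatures": {"DOMAIN_1"}},
    {"id": "hit1", "identity": 100.0, "signatures": {"FAMILY_A", "DOMAIN_1", "DOMAIN_2"}},
    {"id": "hit4", "identity": 24.0, "signatures": set()},
    {"id": "hit2", "identity": 34.0, "signatures": {"FAMILY_A", "DOMAIN_1"}},
]
ranked = rank_hits(query, hits)
for row in ranked:
    print(row)
```

A low-identity hit that still shares the family signature may deserve more attention than its score alone suggests, while a hit with no shared signatures is a candidate mismatch.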
Functional predictions (protein searches)
Extract information
Prioritize results
100% ID
• Matches: family signature, 4 domain signatures
34% ID
• Matches: family signature, 3 domain signatures
28% ID
• Matches: 1 domain signature
24% ID
• Matches: no signatures
Summary
Broad patent sequence coverage
• Protein/nucleotides: EPO, USPTO, JPO, KIPO
Comprehensive sequence databases
• ENA & UniParc (PAT / PRT class data)
• Non-redundant patent sequences: enriched, annotation-enhanced
Sequence archives
• ENA SVA & UniSave track changes
Multiple search engines
• EB-eye text search: >40 databases
• SRS advanced text searching: >100 databases
• Multiple sequence search tools
QUESTIONS?
Contacts: http://www.ebi.ac.uk/support/