EBI patent related services

4th Annual Forum for SMEs

October 18-19th 2010

Jennifer McDowall

Senior Scientist, EMBL-EBI

EBI is an Outstation of the European Molecular Biology Laboratory.

Overview

• Patent sequence data

• Sequence archives

• Sequence searches

www.ebi.ac.uk

Sequence data from patent literature

• USPTO → GenBank
• JPO → DDBJ
• EPO → ENA

EPO policy: data released to public (and to EMBL) 18 months after patent application date, independent of whether patent has been granted.

September 2010
• nucl > 17.5m sequences
• prot > 4.9m sequences
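The 18-month release rule can be illustrated with a small date calculation. This is a minimal sketch, assuming "18 months" means calendar months and that the day is clamped to the end of a shorter month; the function names are hypothetical, and the actual release date is set by the patent office.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    Assumption: if the target month is shorter than the start day,
    the day is clamped to the last day of the target month.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def earliest_public_release(application_date: date) -> date:
    """Earliest release date under the 18-month rule (illustrative only)."""
    return add_months(application_date, 18)


print(earliest_public_release(date(2009, 3, 15)))
```

Clamping the day is one possible convention for month-end dates; other conventions would shift such edge cases by a few days.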

Patent Sequence records

European Nucleotide Archive (ENA, formerly EMBL-Bank)

Universal Protein Resource (UniProt)

Non-redundant Patent Sequence

ENA

• ENA = ENA-Annotation (old EMBL-Bank) + raw data archives (Sequence Read Archive, Trace Archive)

• ENA-Annotation >124m sequences

• Includes patent class (PAT): EPO, USPTO, JPO, KIPO

• Dates include: date sequence went public, date of last revision

Patent sequence record in ENA

• Sequence version
• Download data
• Dates (first public and last updated)
• Navigate to related data, e.g. Version archive, Graphical viewer
• DNA source
• Navigate to external data sources, e.g. UniProt
• Patent reference
• Sequence

UniProt

• Composed of 4 sections
  • UniParc: non-redundant archive
  • UniProtKB: SwissProt / TrEMBL
  • UniMES: metagenomic
  • UniRef: sequence clusters

• UniParc >23m sequences

• Includes patent class (PRT): EPO, USPTO, JPO, KIPO

• Dates include: date sequence went public, date of last revision

Patent sequence record in UniProt

• Accession
• Download data
• List of databases containing sequence
• REMTREMBL (deprecated)
• Navigate to individual entries
• Sequence

Non-redundant patent databases

ENA (redundant)

Remove sequence redundancy

Level-1 NR

Additional annotation, including priority dates for patent family

Remove patent family redundancy

Level-2 NR
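The two reduction steps can be sketched in a few lines. This is only an illustration under stated assumptions, not the pipeline EBI uses: the record fields, the identifiers, and the choice of the first record as family representative are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class PatentSeq(NamedTuple):
    accession: str  # hypothetical record identifier
    patent: str     # hypothetical patent publication number
    family: str     # hypothetical patent family identifier
    sequence: str


def level1(records):
    """Level-1 sketch: group records that share an identical sequence."""
    groups = defaultdict(list)
    for rec in records:
        # Assumption: case differences do not make sequences distinct.
        groups[rec.sequence.upper()].append(rec)
    return dict(groups)


def level2(level1_groups):
    """Level-2 sketch: within each sequence, keep one record per patent family."""
    result = {}
    for seq, recs in level1_groups.items():
        representatives = {}
        for rec in recs:
            # Assumption: the first record seen represents its family.
            representatives.setdefault(rec.family, rec)
        result[seq] = list(representatives.values())
    return result


records = [
    PatentSeq("A1", "PATENT-1", "FAMILY-X", "ACGT"),
    PatentSeq("A2", "PATENT-2", "FAMILY-X", "acgt"),
    PatentSeq("A3", "PATENT-3", "FAMILY-Y", "ACGT"),
    PatentSeq("A4", "PATENT-4", "FAMILY-Z", "GGCC"),
]
nr1 = level1(records)
nr2 = level2(nr1)
for seq, recs in nr2.items():
    print(seq, [r.accession for r in recs])
```

In this toy input, Level-1 reduces four records to two distinct sequences, and Level-2 then drops the second member of FAMILY-X for the shared sequence.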

Bulk Downloads

http://www.ebi.ac.uk/patentdata/

Patent proteins

Patent nucleotides

Non-redundant sequences

Overview

• Patent sequence data

• Sequence archives

• Sequence searches

Sequence archives

• ENA nucleotide sequence version archive (SVA) www.ebi.ac.uk/embl/sva

• UniSave: UniProt sequence/annotation version archive www.ebi.ac.uk/uniprot/unisave

• Search by date → get specific record only

• Search by accession → get all records

Provides complete version list

View old entries

Compare different versions

View old entries

Compare different versions
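Comparing two versions of an entry amounts to a line-by-line diff. The sketch below uses Python's standard `difflib` on two locally saved versions; the entry text is invented for illustration, and the function name is hypothetical. It does not reproduce how the archive web pages render differences.

```python
import difflib


def compare_versions(old_text, new_text, old_label="version 1", new_label="version 2"):
    """Return a unified diff (list of lines) between two entry versions."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Invented flat-file-style text, for illustration only.
old_entry = "ID   EXAMPLE1\nDE   Sequence 1 from patent.\nSQ   acgtacgt"
new_entry = "ID   EXAMPLE1\nDE   Sequence 1 from patent (revised).\nSQ   acgtacgt"

diff = compare_versions(old_entry, new_entry)
print("\n".join(diff))
```

Lines starting with `-` appear only in the older version and lines starting with `+` only in the newer one.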

Overview

• Patent sequence data

• Sequence archives

• Sequence searches

EB-eye: text search

Fast, easy to use

Search for patent WO0146262

Lists all entries associated with WO0146262

Lists sequences associated with WO0146262

SRS: advanced text search

For more complex searches

Select resources to search, then create query

• Patent literature
• Patent DNA
• Patent proteins

http://srs.ebi.ac.uk/

Sequence Similarity & Analysis

Search for patent sequence

Iterative BLAST searches

Fragment FASTA searches

FASTA nucleotide patent search

Search ENA patent class or non-redundant patent datasets

FASTA protein patent search

Search individual patent offices or non-redundant patent datasets

Results: patent protein v UniProt

Provide UniProt records

Provide additional annotation

Additional annotation (protein searches)

• Nucleotide sequences
• Structures
• Molecular interactions
• GO mapping
• Enzyme data
• Literature
• Gene expression
• Genome information
• Chemical information
• Domain/family classification
• Reactions & pathways

Functional predictions (protein searches)

• Visual comparison
• InterPro classification
• Helps identify mis- or partial matches
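The idea of using signature matches to spot partial matches can be sketched as a set comparison. This is a hedged illustration only: the signature names, hit names, and identity values are invented, and the classification labels and ranking rule are assumptions rather than anything the search service computes.

```python
def classify_hit(query_signatures, hit_signatures):
    """Label a hit by how many of the query's signatures it shares."""
    if not query_signatures:
        raise ValueError("query has no signatures to compare against")
    shared = query_signatures & hit_signatures
    if shared == query_signatures:
        return "full match"
    if shared:
        return "partial match"
    return "no shared signatures"


def rank_hits(query_signatures, hits):
    """Sort hits by shared-signature count, then by percent identity."""
    ranked = []
    for name, (identity, signatures) in hits.items():
        shared = len(query_signatures & signatures)
        label = classify_hit(query_signatures, signatures)
        ranked.append((name, identity, shared, label))
    ranked.sort(key=lambda item: (item[2], item[1]), reverse=True)
    return ranked


# Hypothetical signature identifiers and hits, for illustration only.
query = {"FAMILY_SIG", "DOMAIN_SIG_1", "DOMAIN_SIG_2"}
hits = {
    "hit_c": (24.0, set()),
    "hit_a": (100.0, {"FAMILY_SIG", "DOMAIN_SIG_1", "DOMAIN_SIG_2"}),
    "hit_b": (34.0, {"FAMILY_SIG", "DOMAIN_SIG_1"}),
}

for name, identity, shared, label in rank_hits(query, hits):
    print(name, identity, shared, label)
```

Ranking on shared signatures before percent identity is a design choice made here so that a low-identity hit with the same domain architecture is not buried under unrelated hits.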

Functional predictions (protein searches)

Extract information

Prioritize results

• 100% ID: matches family signature and 4 domain signatures
• 34% ID: matches family signature and 3 domain signatures
• 28% ID: matches 1 domain signature
• 24% ID: no signatures

Summary

Broad patent sequence coverage
• Protein/nucleotides: EPO, USPTO, JPO, KIPO

Comprehensive sequence databases
• ENA & UniParc (PAT / PRT class data)
• Non-redundant patent sequences: enriched

Sequence archives
• ENA SVA & UniSave: track changes

Multiple search engines
• EB-eye text search: >40 databases
• SRS advanced text searching: >100 databases
• Multiple sequence search tools: annotation-enhanced

QUESTIONS?

Contacts: http://www.ebi.ac.uk/support/
