InterPro
InterPro Team
Sep 10, 2021

ABOUT INTERPRO

1 About InterPro
2 Citing InterPro
    2.1 Latest publications
    2.2 All previous publications
3 InterPro tutorials & Webinars
    3.1 Tutorials
    3.2 Webinars
4 Upcoming courses and webinars
    4.1 Structural bioinformatics course (virtual)
5 Previous courses
    5.1 Structural bioinformatics course (virtual)
    5.2 Bioinformatics Resources for Protein Biology
6 InterPro Entries: essential information
    6.1 InterPro entry types
    6.2 Other entry and page types
    6.3 Entry relationships
    6.4 Overlapping entries
7 InterPro website banner
    7.1 Navigation banner and menu
8 InterPro homepage
    8.1 InterPro resource overview
    8.2 Search box
    8.3 Data
    8.4 News and information
9 How to search the InterPro website?
    9.1 Quick search
    9.2 Sequence search
    9.3 Text search
    9.4 Domain architecture search
    9.5 Using Browse feature to search and filter InterPro
10 Protein sequence viewer
11 Browsing entries in the InterPro website
    11.1 InterPro entry page
    11.2 Member database page
    11.3 Protein entry page
    11.4 Structure entry page
    11.5 Taxonomy entry page
    11.6 Proteome entry page
    11.7 Set entry page
12 How to download InterPro data?
    12.1 Download page
    12.2 Export button
    12.3 Your downloads
    12.4 InterPro Application Programming Interface (API)
13 Release notes
    13.1 General information
    13.2 Other statistics
14 Frequently Asked Questions (FAQs)
    14.1 General Questions
    14.2 Sequence searches (InterProScan)
    14.3 Web Interface
    14.4 Application Programming Interface (API)
    14.5 Troubleshooting
    14.6 Additional help
15 InterProScan
    15.1 Documentation
    15.2 Web services
    15.3 Web based tools
    15.4 Source code
    15.5 Previous releases
    15.6 License
    15.7 Follow us & reporting bugs
16 InterPro consortium member databases
    16.1 CATH-Gene3D
    16.2 CDD
    16.3 HAMAP
    16.4 MobiDB Lite
    16.5 PANTHER
    16.6 Pfam
    16.7 PIRSF
    16.8 PRINTS
    16.9 PROSITE profiles
    16.10 SFLD
    16.11 SMART
    16.12 SUPERFAMILY
    16.13 TIGRFAMs
17 InterPro team
    17.1 Previous contributors
18 Funding
19 Privacy
20 Literature references

CHAPTER ONE
ABOUT INTERPRO

InterPro is a resource that provides functional analysis of protein sequences by classifying them into families and predicting the presence of domains and important sites. To classify proteins in this way, InterPro uses predictive models, known as signatures, provided by several collaborating databases (referred to as member databases) that collectively make up the InterPro consortium.

A key value of InterPro is that it combines protein signatures from these member databases into a single searchable resource, capitalising on their individual strengths to produce a powerful integrated database and diagnostic tool. We add further value to InterPro entries by providing detailed functional annotation as well as adding relevant GO terms that enable automatic annotation of millions of GO terms across the protein sequence databases.

InterPro integrates signatures from the following 13 member databases: CATH-Gene3D, CDD, HAMAP, MobiDB Lite, PANTHER, Pfam, PIRSF, PRINTS, PROSITE, SFLD, SMART, SUPERFAMILY and TIGRFAMs (the InterPro consortium section gives further information about the individual databases). The member databases use a variety of different methods to classify proteins, and each has a particular focus (e.g. protein domains defined from structure, or full-length protein families with shared function). We strive to integrate the signatures from the member databases into InterPro entries and to identify where different member database entries describe the same entity.

You can use the InterPro website to obtain information about individual protein families, domains and important sites, to perform a sequence search, or to browse through InterPro annotations. We have designed the website to be intuitive for new users, so reading this documentation is not essential. However, the following sections describe a wealth of specialised and powerful features that can easily be overlooked. You may also want to check out our list of training materials and webinars.

InterPro is updated approximately every 8 weeks. The release notes page contains information about what has changed in each release.

All information in InterPro is freely available. You can download InterPro data for local analyses from the Download page, or use the InterPro API.

Find out more about the project by exploring the latest papers.

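As a rough illustration of programmatic access through the InterPro API, the sketch below builds the address of a single InterPro entry and, optionally, retrieves it as JSON. The base URL, the `/entry/interpro/<accession>` path layout, and the helper names `interpro_entry_url` and `fetch_entry` are assumptions made for this example and are not taken from this chapter; the API section of this documentation is the authoritative reference.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumption: base URL of the public InterPro REST API (not stated in this chapter).
API_BASE = "https://www.ebi.ac.uk/interpro/api"


def interpro_entry_url(accession: str, base: str = API_BASE) -> str:
    """Build the URL of one InterPro entry (hypothetical helper).

    InterPro accessions have the form "IPR" followed by six digits,
    for example IPR000001.
    """
    accession = accession.strip().upper()
    is_valid = (
        len(accession) == 9
        and accession.startswith("IPR")
        and accession[3:].isdigit()
    )
    if not is_valid:
        raise ValueError(f"not an InterPro accession: {accession!r}")
    # Assumption: entries are addressed as /entry/interpro/<accession>.
    return f"{base}/entry/interpro/{quote(accession)}"


def fetch_entry(accession: str, timeout: float = 30.0) -> dict:
    """Download one entry as parsed JSON (requires network access)."""
    request = Request(
        interpro_entry_url(accession),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `interpro_entry_url("IPR000001")` only builds the address and needs no network connection; `fetch_entry("IPR000001")` performs the actual request. For bulk analyses, the files offered on the Download page are usually a better fit than many individual API calls.
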
CHAPTER TWO
CITING INTERPRO

2.1 Latest publications

If you find InterPro useful for your research, please cite the following publications:

2.1.1 InterPro

The InterPro protein families and domains database: 20 years on
Matthias Blum, Hsin-Yu Chang, Sara Chuguransky, Tiago Grego, Swaathi Kandasaamy, Alex Mitchell, Gift Nuka, Typhaine Paysan-Lafosse, Matloob Qureshi, Shriya Raj, Lorna Richardson, Gustavo A Salazar, Lowri Williams, Peer Bork, Alan Bridge, Julian Gough, Daniel H Haft, Ivica Letunic, Aron Marchler-Bauer, Huaiyu Mi, Darren A Natale, Marco Necci, Christine A Orengo, Arun P Pandurangan, Catherine Rivoire, Christian J A Sigrist, Ian Sillitoe, Narmada Thanki, Paul D Thomas, Silvio C E Tosatto, Cathy H Wu, Alex Bateman, Robert D Finn
Nucleic Acids Research (2020), gkaa977, PMID: 33156333

2.1.2 InterProScan

InterProScan 5: genome-scale protein function classification
Philip Jones, David Binns, Hsin-Yu Chang, Matthew Fraser, Weizhong Li, Craig McAnulla, Hamish McWilliam, John Maslen, Alex Mitchell, Gift Nuka, Sebastien Pesseat, Antony F. Quinn, Amaia Sangrador-Vegas, Maxim Scheremetjew, Siew-Yit Yong, Rodrigo Lopez, Sarah Hunter
Bioinformatics (2014), PMID: 24451626

2.2 All previous publications

InterPro in 2019: improving coverage, classification and access to protein sequence annotations
Alex L Mitchell, Teresa K Attwood, Patricia C Babbitt, Matthias Blum, Peer Bork, Alan Bridge, Shoshana D Brown, Hsin-Yu Chang, Sara El-Gebali, Matthew I Fraser, Julian Gough, David R Haft, Hongzhan Huang, Ivica Letunic, Rodrigo Lopez, Aurélien Luciani, Fabio Madeira, Aron Marchler-Bauer, Huaiyu Mi, Darren A Natale, Marco Necci, Gift Nuka, Christine Orengo, Arun P Pandurangan, Typhaine Paysan-Lafosse, Sebastien Pesseat, Simon C Potter, Matloob A Qureshi, Neil D Rawlings, Nicole Redaschi, Lorna J Richardson, Catherine Rivoire, Gustavo A Salazar, Amaia Sangrador-Vegas, Christian J A Sigrist, Ian Sillitoe, Granger G Sutton, Narmada Thanki, Paul D Thomas, Silvio C E Tosatto, Siew-Yit Yong, Robert D Finn
Nucleic Acids Research (2019) Database Issue 47:D351–D360, PMID: 30398656

InterPro in 2017 — beyond protein family and domain annotations
Robert D. Finn, Teresa K. Attwood, Patricia C. Babbitt, Alex Bateman, Peer Bork, Alan J. Bridge, Hsin-Yu Chang, Zsuzsanna