D84–D88 Nucleic Acids Research, 2019, Vol. 47, Database issue Published online 5 November 2018 doi: 10.1093/nar/gky1078 The European Nucleotide Archive in 2018 Peter W. Harrison *, Blaise Alako, Clara Amid, Ana Cerdeno-T˜ arraga,´ Iain Cleland, Sam Holt, Abdulrahman Hussein, Suran Jayathilaka, Simon Kay, Thomas Keane, Rasko Leinonen , Xin Liu, JosueMart´ ´ınez-Villacorta, Annalisa Milano, Nima Pakseresht, Jeena Rajan, Kethi Reddy, Edward Richards, Marc Rosello, Nicole Silvester , Dmitriy Smirnov, Ana-Luisa Toribio, Senthilnathan Vijayaraja and Guy Cochrane
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Received September 18, 2018; Revised October 18, 2018; Editorial Decision October 18, 2018; Accepted October 22, 2018
ABSTRACT tions. Our Helpdesk provides support to several thousand active data submitters and many times this number of data The European Nucleotide Archive (ENA; https://www. consumers. ebi.ac.uk/ena), provided from EMBL-EBI, has for As a member of the International Nucleotide Sequence more than three decades been responsible for archiv- Database Collaboration (1) (INSDC; http://www.insdc. ing the world’s public sequencing data and present- org/), we partner with the National Institute of Genetics’ ing this important resource to the scientific commu- DNA DataBank of Japan (2) and the United States Na- nity to support and accelerate the global research tional Center for Biotechnology’s GenBank and Sequence effort. Here, we outline ENA services and content in Read Archive (3). Together we provide globally comprehen- 2018 and provide an overview of a selection of focus sive coverage through routine data exchange, to build the areas of development work: extending data coordi- scientific standards for this exchange and to promote the nation services around ENA, sequence submissions timely sharing of well-structured sequence data. Throughout 2018, we have continued to provide ENA through template expansion, early pre-submission services for the submission, archiving, discovery and re- validation tools and our move towards a new browser trieval of globally comprehensive data across sequencing and retrieval infrastructure. platforms and scientific applications. In addition to opera- tion and support of ENA, a key focus for the year has been INTRODUCTION the extension of ENA core service provision (see Table 1) For over a third of a century, the European Nucleotide and involvement in data coordination activities around the Archive (ENA) has served as a cornerstone of the world’s services. bioinformatics infrastructure. In 2018, the resource contin- ues as a broad and heavily used platform for the sharing, SUMMARY OF ENA CONTENT, SERVICES AND COM- publication, safeguarding and reuse of globally comprehen- MUNITY SUPPORT sive public nucleotide sequence data and associated infor- mation. This year we have continued to provide through ENA a ENA content spans a spectrum of data types from raw rich set of open services and responsive support for the reads to asserted annotation and offers a diverse portfolio submission, archiving, enrichment and presentation of the of services to the scientific community. Submissions services world’s sequencing data across an ever increasing range of cater for our broad range of data providers, from the small- data types, scientific applications and user communities. scale submitting research laboratory to the major sequenc- Under the Webin submission services framework, we con- ing centre. Data access services, such as our search and re- tinue to operate both interactive and programmatic inter- trieval interfaces, support users from those browsing casu- faces for the submission of data of all accepted types. In ally to those integrating ENA content through program- the year we have processed 6600 studies, from some 800 matic access into their own analyses and software applica- 000 samples, comprising 600 000 libraries, 31 000 assem-
*To whom correspondence should be addressed. Tel: +44 1223 492589; Fax: +44 1223 494468; Email: [email protected] Present address: Peter W Harrison, European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK.