Brain-CODE: a Secure Neuroinformatics Platform for Management, Federation, Sharing and Analysis of Multi-Dimensional Neuroscience Data
Total Page:16
File Type:pdf, Size:1020Kb
fninf-12-00028 May 22, 2018 Time: 17:15 # 1 TECHNOLOGY REPORT published: 23 May 2018 doi: 10.3389/fninf.2018.00028 Brain-CODE: A Secure Neuroinformatics Platform for Management, Federation, Sharing and Analysis of Multi-Dimensional Neuroscience Data Anthony L. Vaccarino1,2*, Moyez Dharsee2, Stephen Strother2,3, Don Aldridge4, Stephen R. Arnott2,3, Brendan Behan1, Costas Dafnas4, Fan Dong2,3, Kenneth Edgecombe4, Rachad El-Badrawi2, Khaled El-Emam5, Tom Gee2,3, Susan G. Evans2, Mojib Javadi2, Francis Jeanson1, Shannon Lefaivre1, Kristen Lutz2, F. Chris MacPhee4, Jordan Mikkelsen2, Tom Mikkelsen1, Nicholas Mirotchnick1, Tanya Schmah6, Christa M. Studzinski1, Donald T. Stuss1,3,7, Elizabeth Theriault1 and Kenneth R. Evans2 1 Ontario Brain Institute, Toronto, ON, Canada, 2 Indoc Research, Toronto, ON, Canada, 3 Rotman Research Institute, Toronto, ON, Canada, 4 Centre for Advanced Computing, Kingston, ON, Canada, 5 Privacy Analytics Inc., Ottawa, ON, Canada, 6 Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, Canada, 7 Departments of Psychology and Medicine, University of Toronto, Toronto, ON, Canada Edited by: Neda Jahanshad, University of Southern California, Historically, research databases have existed in isolation with no practical avenue for United States sharing or pooling medical data into high dimensional datasets that can be efficiently Reviewed by: compared across databases. To address this challenge, the Ontario Brain Institute’s Alexander Fingelkurts, “Brain-CODE” is a large-scale neuroinformatics platform designed to support the BM-Science, Finland Sergey M. Plis, collection, storage, federation, sharing and analysis of different data types across The Mind Research Network (MRN), several brain disorders, as a means to understand common underlying causes of United States brain dysfunction and develop novel approaches to treatment. By providing researchers *Correspondence: Anthony L. Vaccarino access to aggregated datasets that they otherwise could not obtain independently, [email protected] Brain-CODE incentivizes data sharing and collaboration and facilitates analyses both within and across disorders and across a wide array of data types, including Received: 12 January 2018 Accepted: 03 May 2018 clinical, neuroimaging and molecular. The Brain-CODE system architecture provides Published: 23 May 2018 the technical capabilities to support (1) consolidated data management to securely Citation: capture, monitor and curate data, (2) privacy and security best-practices, and (3) Vaccarino AL, Dharsee M, interoperable and extensible systems that support harmonization, integration, and query Strother S, Aldridge D, Arnott SR, Behan B, Dafnas C, Dong F, across diverse data modalities and linkages to external data sources. Brain-CODE Edgecombe K, El-Badrawi R, currently supports collaborative research networks focused on various brain conditions, El-Emam K, Gee T, Evans SG, Javadi M, Jeanson F, Lefaivre S, including neurodevelopmental disorders, cerebral palsy, neurodegenerative diseases, Lutz K, MacPhee FC, Mikkelsen J, epilepsy and mood disorders. These programs are generating large volumes of data Mikkelsen T, Mirotchnick N, that are integrated within Brain-CODE to support scientific inquiry and analytics across Schmah T, Studzinski CM, Stuss DT, Theriault E and Evans KR (2018) multiple brain disorders and modalities. By providing access to very large datasets on Brain-CODE: A Secure patients with different brain disorders and enabling linkages to provincial, national and Neuroinformatics Platform for Management, Federation, Sharing international databases, Brain-CODE will help to generate new hypotheses about the and Analysis of Multi-Dimensional biological bases of brain disorders, and ultimately promote new discoveries to improve Neuroscience Data. patient care. Front. Neuroinform. 12:28. doi: 10.3389/fninf.2018.00028 Keywords: Brain-CODE, neuroinformatics, big data, electronic data capture, open data Frontiers in Neuroinformatics | www.frontiersin.org 1 May 2018 | Volume 12 | Article 28 fninf-12-00028 May 22, 2018 Time: 17:15 # 2 Vaccarino et al. Brain-CODE Neuroinformatics Platform INTRODUCTION Brain-CODE Design Principles Interoperability and Standardization to Support Data The principles of data sharing as a catalyst for scientific discovery Integration and Collaboration are widely recognized by international organizations such as the National Institutes of Health (2003), Wellcome Trust The types of data being collected in modern research are (2010) and Canadian Institutes of Health Research (2013). increasingly diverse, from larger numbers of sources and patient Historically, however, research databases have existed in isolation populations, and involving highly specialized technologies, with no practical avenue for sharing or pooling medical data from genomics and imaging, to wearable devices and surveys into high dimensional “big” datasets that can be efficiently delivered via mobile apps. There is also a growing need to compared across databases. Databases have their own sets of link and query data collected within a given research study data standards, software and processes, thus limiting their ability with data stored in disparate other locations and formats, to synthesize and share data with one another. To address such as public data repositories, health administration data this challenge, the Ontario Brain Institute (OBI) created Brain- holdings, electronic medical records, and legacy databases. As CODE – an extensible, neuroinformatics platform designed to a result, researchers deploy a broad range of tools to collect, support curation, sharing and analysis of different data types process and analyze their data, but the lack of interoperability across several brain disorders1. Brain-CODE allows researchers of these platforms serves as a barrier to data sharing and to collaborate and work more efficiently to understand the collaboration. Establishing standard software would address biological basis of brain disorders. these issues; however, available platforms each have their unique Ontario Brain Institute supports collaborative research advantages and there is a significant cost for researchers in networks focused on various brain conditions, including time and effort to move to new platforms. This complex neurodevelopmental disorders2, cerebral palsy3, epilepsy4, mood set of data integration needs cannot be addressed using disorders5, and neurodegenerative diseases6 (Figure 1). The inflexible systems working in isolation nor by the development creation of these programs has resulted in a “big data” of a “one size fits all” platform. Rather, to support this opportunity to support the development of innovative, impactful level of data integration, interoperability must be a core diagnostics and treatments for brain disorders (Stuss, 2014, requirement. This approach differs from most other data 2015). By providing researchers access to aggregated datasets platforms in which data are combined at the data analyses stage. that they otherwise could not obtain independently, Brain- Interoperability, however, enables large-scale data aggregation CODE incentivizes data sharing and collaboration, and facilitates and federation of systems and data across multiple data analyses both within and across disorders and across an array of types, allowing novel discoveries and analyses to be conducted. data types, including clinical, neuroimaging, and molecular. By Moreover, allowing researchers to decide which system to use collecting data elements across disorders Brain-CODE enables ensures greater researcher uptake, which facilitates collaboration deep phenotyping across data modalities within a brain disorder, and data sharing within and across the broader research as well as investigations across disorders. Moreover, linkages community. with provincial, national and international databases will allow From its inception, Brain-CODE architecture was designed scientists, clinicians, and industry to work together in powerful with interoperability in mind, such that it could support the new ways to better understand common underlying causes of integration and analysis of large volumes of complex data from brain dysfunction and develop novel approaches to treatment. diverse sources. With this approach each platform can maintain Using the FAIR Data Principles as guidance, Brain-CODE is its autonomy while still integrating into a much larger whole. being developed to support the principles of data being Findable, This can be a challenge as databases are often stored in individual Accessible, Interoperable, and Reusable (FAIR, Wilkinson et al., “silos” with their own sets of data standards, software and 2016). The Brain-CODE system architecture provides the processes which limit their ability to interact with one another. technical capabilities to support: Interoperability, therefore, requires the development of pipelines and processes between existing platforms; software to allow • Consolidated data management to securely capture, efficient and seamless exchange of data and information between monitor and curate data. systems and technologies, including application programming • Privacy and security best-practices. interfaces (APIs) to allow data flow between applications. In • Interoperable and extensible federation systems that addition, rigorous standardization processes are required that supports harmonization, integration and query across govern how information is recorded and exchanged in order to