Bunevicius A. Biomed Data J. 2015;1(1): 27-32 http://dx.doi.org/10.11610/bmdj.01104

ISSN 2367-5322 datajournals.eu: The European Open Data portal Policy Paper

The Need for Open-access, Structured Data in Clinical

Adomas Bunevicius

Neuroscience Institute and Department of Neurosurgery, Lithuanian University of Health Sciences, Kaunas, Lithuania http://lsmuni.lt/en/structure/medical-academy-/-institute/

ARTICLE INFO : A B S T R A C T

RECEIVED: 22 Sep 2014 Neurological and psychiatric disorders are considered major public health prob- REVISED: 02 Nov 2014 lems associated with a significant number of suffering patients and an enormous ACCEPTED: 14 Nov 2014 socioeconomic burden to societies. Therefore, significant research funds have ONLINE: 04 Jan 2015 been allocated towards improving outcomes of patients suffering from brain disorders. In recent years, researchers have been facing a constantly growing worldwide pressure to make their datasets freely and readily available to other K E Y W O R D S: researchers and the public. The major goal of this movement is to allow re-use of already collected data in order to facilitate medical progress. There are numer- ous ways that clinical neuroscience researchers and research organizations Dataset commonly employ to make their databases available. Institutional databases and Brain Diseases publicly available research and clinical data repositories and registries are the Disease Models most common ways to share data among researchers and organizations in order Data Sharing to facilitate collaboration. In addition, numerous traditional neuroscience peer reviewed journals have implemented various data sharing policies. Furthermore, a few traditional clinical neuroscience journals have recently started accepting dataset papers. Finally, there is a constantly growing number of data- base journals which accept clinical neuroscience research datasets for publica- tion. Potential for improving of citation metrics is among the important ad- vantages for authors considering publishing their databases in dataset journals.

Creative Commons BY-NC-SA 4.0 © 2015 Hosting by Procon Ltd. All rights reserved.

The human brain is a susceptible organ to numer- brovascular diseases, nervous system malignan- ous disorders that may be attributed to both ex- cies, infections, trauma and functional disorders. ternal and internal causes and develop across a Neurological disorders affect more than 1 billion lifespan. Brain disorders comprise a vast and het- people worldwide and account for 6.8 million erogeneous group of diseases affecting both cen- deaths, which is 12 percent of total deaths.1 Pop- tral and peripheral nervous systems at both struc- ulation based studies carried out in Western soci- tural and functional levels. Neurological disorders eties have estimated that more than one quarter and mental illness are considered two major of individuals met the diagnostic criteria for at groups of brain disorders. Neurological disorders least one mental disorder within a 12-month pe- include but are not limited to neurodevelopmen- riod.2,3 tal disorders, neurodegenerative disorders, cere-

 Corresponding Author: Eiveniugatve 2, LT-50009, Kaunas, Lithuania; E-mail: [email protected] Bunevicius A. Biomed Data J. 2015; 1(1): 27-32

From an individual patient perspective, neuro- search funding agencies 10 and political and li- logical and psychiatric disorders are considered censing authorities 11 to make their raw structured among the most serious and debilitating condi- scholarly research data openly and readily acces- tions and are often associated with severe func- sible for other researchers and public. For many tional disability, significant reduction of health researchers and research organizations it is no related quality of life and poor prognosis. Due to longer a question whether scientific data should the lack of effective treatment modalities, brain be shared and that the amount of shared data will disorders are often incurable disorders resulting in increase exponentially in the nearest future. Aside gradual and progressive functional decline. From a from well expected benefits to society via acceler- public health perspective, neurological and psy- ated improvement of outcomes of patients suf- chiatric disorders are considered the major public fering from brain disorders, other potentially im- health problem associated with significant socio- portant advantages of data sharing for individual economic burden for societies that are mainly at- researchers and/or research organizations should tributed to significant financial efforts required for be acknowledged. Importantly, data sharing is as- treatment of patients with brain disorders and sociated with greater citation count. For instance, progressive decline in patient productivity. For ex- it has been demonstrated that papers with pub- ample, the economic cost of brain disorders was licly available data have received significantly estimated at about 798 billion Euros in 2010 in Eu- greater number of citations relative to papers rope.4 According to the World Health Organiza- without publicly available datasets.12 Further- tion, neurological and neuropsychiatric disorders more, data sharing is expected to increase visibil- should be considered among the most serious ity of researcher and institutional research output, public health problems causing greater disability improve publication and data quality and trans- than infectious disorders, malignancies, ischemic parency, increase collaboration opportunities heart disease, respiratory diseases and digestive among researchers and research organizations, diseases.5 It has been estimated that neurological reduce research costs allowing to re-use already disorders contributed to 92 million of disability- collected data and fulfil data sharing requirements adjusted life years (DALYs) lost in 2005.5 The same imposed by certain funding agencies and peer-re- study has projected a 12% increase of DALYs ac- viewed journals. Also, data sharing can have an counted by neurological disorders in 2030. The important moral benefit making each individual global burden of neurological and psychiatric dis- member of the scientific community feel more as orders is expected to increase significantly in the a team player of a scientific community with an ul- future mainly due to increased longevity and im- timate goal of more expedited advancement of proved patient survival.5 As a result, studies aim- research progress that would eventually translate ing to elucidate the underlying biological mecha- into reduced disease burden and patient suffering. nisms of brain disorders and to improve outcomes However, data sharing is still in its infancy of patients suffering from neurological and psy- stage facing numerous technical, institutional, chiatric conditions remain an important research ethical and cultural hurdles that preclude from funding priority.6,7 universal implementation of data sharing cul- These research efforts have significantly ad- ture.13 From an individual researcher perspective, vanced our understanding about underlying bio- a paucity of peer-reviewed journals accepting raw logical mechanisms of brain disorders and im- scientific biomedical research datasets for publica- proved outcomes of patients suffering from neu- tion, lack of knowledge on how the raw data rological and psychiatric disorders. Nevertheless, should be handled and presented, and uncertainty much more remains to be done in order to im- about potential uses and/or misuses of the data prove recognition and expand treatment option of can be considered as important obstacles for data patients suffering from brain disorders. To this sharing. In addition, lack of knowledge of univer- end, there is a rising worldwide movement en- sally accepted ethical guidelines and policies re- couraging sharing of scholarly research data with garding raw research data sharing, especially investigators and other interested parties.8 Data when considering studies involving human sub- sharing is expected to be the next giant step al- jects, is another important barrier for data shar- lowing re-using already collected scholarly data in ing. Furthermore, storage of large datasets can order to re-validate current knowledge and gen- impose significant challenges to traditional pub- erate new hypotheses leading towards improved lishers since data storage and sharing is associated patient outcomes. Researchers as well as aca- with a need to improve technological infrastruc- demic research institutions are facing a consist- ture that can consequentially increase publishing ently growing worldwide push from societies,9 re- costs to the journal. 28 The Need for Open-access, Structured Data in Clinical Brain Research

To date, there are numerous ways that re- download under the Open Data Commons Attrib- searchers and clinicians working in the field of ution License. Finally, a number of psychiatry clini- neuroscience employ to share their research and cal trial datasets are available from the NIMH after clinical data (see Table 1). Deployment of clinical completion of required documentation. and research databases in institutional data re- There are numerous national and international positories is the most commonly employed way by clinical registries that prospectively collect clinical researchers and research organizations to store and outcome data of patients suffering from brain research data. Importantly, some organizations disorders. Important goals of such registries are and researchers’ groups are interested in data improving patient care and optimizing allocation sharing with other interested parties. The major of available healthcare resources. For example, goals of such data sharing are to improve visibility The National Neurosurgery Quality and Outcomes of organization, support reliability and transpar- Database (N²QOD) was established by the Neuro- ency of their research findings and attract new Point Alliance, a non-profit organization of the collaborators for possible future research endeav- American Association of Neurological Surgeons, ours. For example, the Clinical Research in Neu- and is currently managed by the Vanderbilt Insti- rology Registry of the Emory University is a pro- tute of Medicine and Public Health. The data from spectively collected longitudinal database of indi- such clinical data registries is usually available for viduals suffering from various neurological disor- participating institutions. Also, there is an in- ders that includes over 2500 clinical and biological creasingly growing interest towards establishing variables. According to the program’s website, the regional, national and/or international prospective program is interested in collaboration and inter- clinical registries or databases of individuals suf- ested parties are encouraged to contact the or- fering from rare neurological disorders that could ganization. Another example of similar approach is potentially allow to pool sufficient number of pa- the Clinical and Research Database for Persistent tients for clinical research studies and facilitate Schizophrenia of the Department Of Psychiatry of collaboration of scientists and physicians.16 Ac- University Of Cambridge. Researchers interested cording to the ORPHANET, the portal of rare dis- in schizophrenia research can contact a responsi- orders funded by the French government and the ble person. However, such data storage does not European Commission, there is a total of 641 reg- provide with immediate benefits for investigators istries of which 74 have been classified as global. via traditional citations metrics. These registries usually contain clinical data and Data from clinical trials can be obtained upon an access to such registries requires permission. request from study investigators once the dataset Contributors of clinical data registries can usually has been released. Sharing of clinical trial data is receive traditional authorship credits by partici- important from clinical perspective since findings pating in writing scientific papers stemming from from such studies are often used to approve new such registries. drugs and devices in standard patient care. Shar- Numerous promising public initiatives have ing of clinical trial data is strongly encouraged by been developed towards promoting clinical and funding and licensing authorities. An example of basic neuroscience research data sharing.17 For such clinical trial data sharing is the National Insti- example, the Neuroscience Information Frame- tute for Neurological Disorders (NINDS) Recombi- work (NIF) is a project funded by the NIH Blueprint nant Tissue Plasminogen Activator (rt-PA) Stroke for Neuroscience Research project and maintains Trial that was a pivotal trial published in 1995 the largest searchable collection of neuroscience demonstrating efficacy of intravenous rt-PA for data.18 The NIF is a user-friendly platform that management of acute stroke patients.14 Results of contains over 174 available datasets and data- this trial have led to intravenous rt-PA approval bases. However, it should be noted that the ma- for management of acute ischemic stroke pa- jority of datasets include basic neuroscience, tients. The original research data has become namely neuroimaging and genetics data. Aside available more than 10 years after the initial pub- from numerous important advantages of the NIF lication of major findings and anyone willing to and similar platforms, authors do not receive de- obtain original dataset should contact the NINDS served credits via traditional bibliographic metrics. and purchase the CD-ROM containing the data for Several traditional peer-reviewed clinical neu- US$79. However, some authors have recently roscience and psychiatry journals require authors noted that access to the trial data was complex to make their datasets publicly available for re- and some data appeared to be missing.15 Another viewers and readers upon acceptance of the example of such data sharing is the International manuscript. The Journal of Cognitive Neuroscience Stroke Trial database that is freely available for has pioneered the field by requesting authors of 29 Bunevicius A. Biomed Data J. 2015; 1(1): 27-32

Table 1. Common ways of clinical neuroscience research data sharing.

Sources Comment Example(s) Major advantages for authors

Institutional Continuously collected regis- Clinical Research in Neurology Improved collaboration and databases tries that include patient clinical Registry of the Emory univer- networking opportunities. En- and biological data. Data can be sity, Clinical and Research Da- hanced author and institu- made available for researchers tabase for Persistent Schizo- tional visibility and transpar- upon request. phrenia ency.

Raw clinical Usually includes clinical trial The National Institute of Neu- Enhances reliability and trial data data that can be obtained upon rological Disorders and Stroke transparency of research data. request. Analysis of such data rt-PA Stroke Study Group, In- Required by funding and li- can be complex. ternational Stroke Trial data- censing agencies. base

Publicly avail- Contains large number of da- The Neuroscience Information Enhanced author visibility and able research tasets and databases. Mostly FrameworkOpenfMRI, networking opportunities, re- data reposito- neuroimaging and neuro-ge- XNATCentral liability of research data ries netic data. Clinical data Registries that include clinical England’s Compendium of Neu- Collaboration and networking repositories and outcome data with a goal rology Data, The National Neu- opportunities and registries towards improving patient care. rosurgery Quality and Out- The data is usually available for comes Database(N²QOD), Reg- research purposes. Allows to istries of rare disorders of the aggregate larger sample sizes of brain patients suffering from rare disorders of the nervous system

Traditional Authors have to share datasets PLOS, European Journal of Neu- peer reviewed that can be either submitted as roscience, The BMJ journals with supplementary material or de- compulsory posited in public repositories data sharing policies Increasing transparency of re- search findings Peer reviewed Authors can optionally choose European Psychiatry, Biological journals with to share dataset upon manu- Psychiatry optional data script acceptance sharing poli- cies

Traditional Peer-review journals publishing BMC Neurology, BMC Research Increasing number of publica- peer reviewed traditional research articles. Da- Notes tions and improving citation journals ac- tabase articles describing da- metrics cepting da- tasets are also considered for taset papers publication. Datasets are stored in external repositories and should be made freely available for readers.

Specialized Consider only database papers. The Biomedical Data Journal, Increasing number of publica- database Papers describing dataset are Scientific Data, Journal of Open tions and citation metrics; journals published in the journal. Da- Public Health Data, Journal of user-friendly interface; facili- tasets are made freely available Open Psychology Data, GigaSci- tated mechanisms of data for download from the journal ence, Dataset Papers in Science sharing. web-page or from public data repository.

30 The Need for Open-access, Structured Data in Clinical Brain Research accepted manuscripts to make their datasets technological infrastructure that would provide freely available for reviewers and readers. The authors with user-friendly interface and with PLOS journals and the European Journal of more sophisticated user-friendly tools to share Neuroscience are examples of journals considering (i.e., upload and download) databases and brain research studies for publication that require metadata. Albeit the number of specialized bio- authors of accepted papers to make their medical database journals currently remains lim- databases used for manuscript writing freely ited, the field is rapidly expanding. Researchers available by depositing their databases in public working in the fields of clinical neuroscience and data repositories or submitting them as a psychiatry can consider the following journals: supporting document, depending on file size. Scientific Data Journal (published by the Nature Numerous other top journals in the fields of Group), Journal of Open Public Health Data and neurological and psychiatric research, such as the Journal of Open Psychology Data (both journals European Psychiatry and Biological Psychiatry, are published by the Ubiquity Press), GigaScience have adopted policies to deposit dataset used for (published by the BioMed Central and supported preparation of accepted manuscripts as an by BGI, a Chinese non-profit organization) and Da- optional requirement. The BMJ requires authors taset Papers in Science (published by Hindawi). of drug and device clinical trials to deposit their Despite the inter-journal variations in journal data in public data repositories. The major goal of scope, author guidelines and management of da- such policies is to enhance reliability and tasets, database journals have adopted open-ac- transparency of research findings presented in cess policy to databases and papers describing the traditional peer-reviewed journal papers. database acquisition and design. However, authors do not receive a citation credit In this context, we are delightful to launch The for the deposited databases. Biomedical Data Journal (BMDJ) that is an open- Only a few traditional peer-reviewed neurology access specialized biomedical dataset journal pub- and psychiatry journals included in the major bib- lished by the Procon Ltd. in collaboration with the liographic databases consider database articles for European Commission funded OpenScienceLink publication. Recently, the BioMedCentral group project. The BMDJ together with the Open- have started considering dataset articles for publi- ScienceLink platform is expected to provide indi- cations in their journals.19 For example, the BMC vidual researchers and research institutions with a Neurology and BMC Psychiatry accept database reliable outlet for sharing structured research articles describing “a novel biomedical database data. In addition, the OpenScienceLink platform likely to be of broad utility”. In the latter journals, will provide authors with numerous additional papers describing dataset are published in the services aiming to facilitate research collabora- journal while the datasets are stored in external tion, enhance the visibility of their research find- repositories and should be made readily accessi- ings and follow their research utilization by em- ble without restrictions for non-commercial re- ploying novel citation metrics. The BMDJ will con- searchers upon acceptance of the manuscript. sider data papers describing original datasets from Additional more direct advantages for authors in- various clinical neuroscience disciplines, including terested in publishing their dataset article in tradi- psychiatry, psychology, neurology, neurosurgery tional peer-reviewed journals are stemming from and neuroradiology. The journal will also consider the potential to improve traditional bibliographic original scientific investigations pertaining to brain and citation indexes, such as increasing the total health at all levels, ranging from epidemiological number of publications, and from the potential for research studies to clinical investigations con- citation count increase. However, the previously cerned with diagnosis, management and preven- mentioned potential technical issues are im- tion of brain disorders. Interdisciplinary studies on portant limitations for traditional publishers will- interrelated disciplines of genetics, immunology ing to consider dataset papers for publication. and endocrinology integrating knowledge about Specialized database journals are expected to brain health will also be considered for publica- fill in the majority of gaps imposed by the above tion. Papers from clinical investigations studying mentioned currently available research and clini- mental health and neurologic comorbidities in pa- cal data outlets. Most importantly, authors pub- tients with somatic disorders are also strongly en- lishing their datasets in specialized database jour- couraged. nals will receive immediate well-deserved credits, Computational modelling techniques have pro- such as increasing publication count and potential vided important tools that have significantly ex- for greater citation number. Also, specialized da- panded our understanding how electrical and tabase journals are expected to have adequate chemical impulses (and their interaction) affect 31 Bunevicius A. Biomed Data J. 2015; 1(1): 27-32 brain functioning under normal conditions and in 10 National Institute of Health. NIH Data Sharing Policy disease states.20,21 Computational modelling tech- and Implementation Guidance. Available from: niques have an enormous potential for various http://grants.nih.gov/grants/policy/data_sharing/ clinical applications in patients suffering from neu- data_sharing_guidance.htm. 11 rological and mental disorders.22,23 However, U.S. Food and Drug Administration. OpenFDA. Avail- able from: https://open.fda.gov/about. computational modelling studies require large 12 amounts of experimental and clinical data. To- Piwowar HA, Day RS and Fridsma DB. Sharing de- wards this end, an important goal of the BMDJ is tailed research data is associated with increased cita- to gather a sufficient amount of biological and ex- tion rate. PLoS ONE 2007 Mar 21;2(3):e308. DOI: 10.1371/journal.pone.0000308. perimental data that could subsequently be used 13 Wicherts JM, Borsbom D, Kats K and Molenaar D. as an important source for computational model- The poor availability of psychological research data for ling studies. reanalysis. Am Psychol. 2006 Oct;61(7):726-8. DOI: Acknowledgement 10.1037/0003-066X.61.7.726. 14 The National Institute of Neurological Disorders and The preparation of this manuscript was supported Stroke, rt-PA Stroke Study Group. Tissue plasminogen by the OpenScienceLink project (Grant agreement activator for acute ischemic stroke. N Eng J Med. 1995 #318652). Dec 14;333(24):1581–7. DOI: 10.1056/NEJM19951214 References 3332401. 15 Dachs RJ, Burton JH, Joslin J. A user's guide to the 1 The World Health Organization. The world health NINDS rt-PA stroke trial database. PLoS Med. 2008 May report 2007: a safer future: global public health secu- 20;5(5):e113. DOI: 10.1371/journal.pmed.00501 13. rity in the 21st century. Geneva, Switzerland: WHO 16 Rare Disease Registries in Europe. Orphanet Report Press; 2007. Available from: www.who.int/whr/2007/ Series. Rare Diseases collection, 2014 Jan. Available en. from: http://www.orpha.net/orphacom/cahiers/docs/ 2 Kessler RC, Chiu WT, Demler O, Merikangas KR, GB/Registries.pdf. Walters EE. Prevalence, severity, and comorbidity of 17 Poline JB, Breeze JL, Ghosh S, Gorgolewski K, Hal- 12-month DSM-IV disorders in the National Comorbid- chenko YO, Hanke M, et al. Data sharing in neuroimag- ity Survey Replication. Arch Gen Psychiatry 2005;62(6): ing research. Front Neuroinform. 2012 Apr 5; 6:9. DOI: 617–27. DOI: 10.1001/archpsyc.62.6.617. 10.3389/fninf.2012.00009. 3 Wittchen HU, Jacobi F. Size and burden of mental 18 The Neuroscience Information Framework. About disorders in Europe – a critical review and appraisal of NIF. Available from: https://neuinfo.org/about/ind 27 studies. EurNeuropsychopharmacol 2005; 4: 57–76. ex.shtm. PMID: 15961293. 19 Hrynaszkiewicz I. A call for BMC Research Notes 4 Olesen J, Gustavsson A, Svensson M, Wittchen HU, contributions promoting best practice in data stand- Jönsson B; CDBE 2010 study group; European Brain ardization, sharing and publication. BMC Res Notes. Council. The economic cost of brain disorders in Eu- 2010 Sep 2;3:235. DOI: 10.1186/1756-0500-3-235. rope. Eur J Neurol. 2012 Jan;19(1):155-62. DOI: 20 Nakagawa TT, Jirsa VK, Spiegler A, McIntosh AR, 10.1111/j.1468-1331.2011.03590.x. Deco G. Bottom up modeling of the connectome: link- 5 Dua T, GarridoCumbrera M, Mathers C, Saxena S. ing structure and function in the resting brain and their Global Burden of Neurological Disorders: Estimates and changes in aging. Neuroimage. 2013 Oct 15;80:318-29. Projections. In: Neurological Disorders: Public Health DOI: 10.1016/j.neuroimage.2013.04.05 5. Challenges. Geneva, Switzerland: WHO Press;2006. 21 Dauvermann MR, Whalley HC, Schmidt A, Lee GL, p. 26-39. Romaniuk L, Roberts N, Johnstone EC, Lawrie SM, 6 National Institutes of Health. Estimates of Funding Moorhead TW. Computational neuropsychiatry - schiz- for Various Research, Condition, and Disease Catego- ophrenia as a cognitive brain network disorder. Front ries (RCDC). 2014 March 7. Available from: Psychiatry. 2014 Mar 25;5:30. DOI: 10.3389/fpsyt.20 http://report.nih.gov/categorical_spending.aspx. 14.00030. 7 Sobocki P, Lekander I, Berwick S, Olesen J, Jönsson 22 Adaszewski S, Dukart J, Kherif F, Frackowiak R, Dra- B. Resource allocation to brain research in Europe ganski B. Alzheimer's Disease Neuroimaging Initiative. (RABRE). Eur J Neurosci. 2006 Nov;24(10):2691-3. DOI: How early can we predict Alzheimer's disease using 10.1111/j.1460-9568.2006.05116.x. computational anatomy? Neurobiol Aging. 2013 8 Uhlir PF, Schröder P. Open Data for Global Science. Dec;34(12):2815-26. DOI: 10.1016/j.neurobiolaging.20 Data Sci J. 2007;6:OD36–OD53. 13.06.015. 9 Hilts PJ. A Law Opening Research Data Sets Off De- 23 Friston KJ, Stephan KE, Montague R, Dolan RJ. Com- bate. The New York Times. 1999 Jul. Available from: putational psychiatry: the brain as a phantastic organ. www.nytimes.com/1999/07/31/us/a-law-opening- The Lancet Psychiatry 2014 Jul;1(2):148-158. DOI: research-data-sets-off-debate.html. 10.1016/S2215-0366(14)70275-5.

32