Brain Modelling : The Virtual Brain on EBRAINS

Michael Schirnera,b,c,d,e, Lia Domidef, Dionysios Perdikisa,b, Paul Triebkorna,b,g, Leon Stefanovskia,b, Roopa Paia,b, Paula Prodanf, Bogdan Valeanf, Jessica Palmera,b, Chloê Langforda,b, André Blickensdörfera,b, Michiel van der Vlagh, Sandra Diaz-Pierh, Alexander Peyserh, Wouter Klijnh, Dirk Pleiteri,j, Anne Nahmi, Oliver Schmidk, Marmaduke Woodmang, Lyuba Zehll, Jan Fousekg, Spase Petkoskig, Lionel Kuschg, Meysam Hashemig, Daniele Marinazzom,n, Jean-François Mangino, Agnes Flöelp,q, Simisola Akintoyer, Bernd Carsten Stahls, Michael Cepict, Emily Johnsont, Gustavo Decou,v,w,x, Anthony R. McIntoshy, Claus C. Hilgetagz, a, Marc Morgank, Bernd Schulleri, Alex Uptonb, Colin McMurtrieb, Timo Dickscheidl, Jan G. Bjaaliec, Katrin Amuntsl,d, Jochen Mersmanne, Viktor Jirsag, Petra Rittera,b,c,d,e,* aBerlin Institute of Health at Charité – Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany bCharité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt Universität zu Berlin, Department of Neurology with Experimental Neurology, Charitéplatz 1, 10117 Berlin, Germany cBernstein Focus State Dependencies of Learning & Bernstein Center for Computational Neuroscience, Berlin, Germany dEinstein Center for Neuroscience Berlin, Charitéplatz 1, 10117 Berlin, Germany eEinstein Center Digital Future, Wilhelmstraße 67, 10117 Berlin, Germany fCodemart, Str. Petofi Sandor, Cluj Napoca, Romania gInstitut de Neurosciences des Systèmes, Aix Marseille Université, Marseille, France hSimLab Neuroscience, Jülich Supercomputing Centre (JSC), Institute for Advanced Simulation, JARA, Forschungszentrum Jülich GmbH, Jülich, Germany iJülich Centre, Forschungszentrum Jülich, Jülich, Germany jInstitut für Theoretische Physik, Universität Regensburg, Regensburg, Germany kSwiss Federal Institute of Technology Lausanne (EPFL), Campus Biotech, Geneva, Switzerland lInstitute of Neuroscience and Medicine (INM-1), Research Centre Jülich, Jülich, Germany mDepartment of Data-Analysis, Faculty of Psychology and Educational Sciences, Ghent University, Belgium nSan Camillo IRCSS, Venice, Italy oUniversité Paris-Saclay, CEA, CNRS, Neurospin, Baobab, Gif-sur-Yvette, France pDepartment of Neurology, University Medicine Greifswald, Greifswald, Germany qGerman Center for Neurodegenerative Diseases (DZNE) Standort Greifswald, Greifswald, Germany rLeicester De Montfort Law School, De Montfort University, Leicester, United Kingdom sCentre for Computing and Social Responsibility, De Montfort University, Leicester, United Kingdom tDepartment of Innovation and Digitalisation in Law, Faculty of Law, University of Vienna, Austria uCenter for Brain and Cognition, Computational Neuroscience Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra, Barcelona, Spain vInstitució Catalana de la Recerca i Estudis Avançats, Barcelona, Spain wDepartment of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany xSchool of Psychological Sciences, Turner Institute for Brain and Mental Health, Monash University, Melbourne, Clayton, Australia yBaycrest Health Sciences, Rotman Research Institute, Toronto, ON, Canada zInstitute of Computational Neuroscience, University Medical Center Hamburg-Eppendorf, Hamburg, Germany aDepartment of Health Sciences, Boston University, 635 Commonwealth Ave., Boston, Massachusetts 02215 bCSCS Swiss National Supercomputing Centre, Lugano, Switzerland cInstitute of Basic Medical Sciences, University of Oslo, Norway dC. and O. Vogt Institute for Brain Research, University Hospital Düsseldorf, Heinrich-Heine University Düsseldorf, Germany eCodeBox GmbH, Stuttgart, Germany *Corresponding author: Petra Ritter, [email protected], Charitéplatz 1, 10117 Berlin, Germany

1

Keywords: brain modelling, , , neuroimaging, network model, high performance computing

The Virtual Brain (TVB) is now available as open-source cloud ecosystem on EBRAINS, a shared digital research platform for brain science. It offers services for constructing, simulating and analysing brain network models (BNMs) including the TVB network simulator; magnetic resonance imaging (MRI) processing pipelines to extract structural and functional ; multiscale co-simulation of spiking and large-scale networks; a domain specific language for automatic high-performance code generation from user- specified models; simulation-ready BNMs of patients and healthy volunteers; Bayesian inference of epilepsy spread; data and code for mouse brain simulation; and extensive educational material. TVB cloud services facilitate reproducible online collaboration and discovery of data assets, models, and software embedded in scalable and secure workflows, a precondition for research on large cohort data sets, better generalizability and clinical translation.

Scientific studies are often difficult to (bit.ly/3ogLYtb). To provide replicate and findings often do not supercomputing resources, the HBP offers generalize in light of additional data (Aarts as part of the Interactive Computing E- et al., 2015; Ioannidis, 2005). The data and Infrastructure project access to compute the computational steps that produced the and storage resources of the Fenix findings as well as the explicit workflow infrastructure (fenix-ri.eu), a network of describing how to generate the results five European supercomputing centres. were identified as the minimal TVB services use EBRAINS core services for components for independent regeneration deployment (Figure 1). The 'Collaboratory' of computational results (Stodden et al., (Supplementary Note: The EBRAINS 2016). EBRAINS (European Brain Research Collaboratory) hosts light-weight INfrastructureS) is an open brain research workspaces, called 'collabs', where platform that makes data, tools and results research teams can exchange data and accessible to everyone within an work together on Office or Wiki environment that promotes reproducible documents, secured with access control. work. It offers cloud services for 'Lab' provides sandboxed JupyterLab collaborative online research, databases instances for developing applications and with annotated and curated data of many running code. Lab notebooks provide a modalities, atlases of human and rodent powerful interface to EBRAINS services, for brains, data processing workflows, data transfer as well as configuring and supercomputing resources, neuromorphic executing jobs on . Data systems, and virtual robots. EBRAINS was can be found and accessed via the developed by the 'KnowledgeGraph' (KG), which provides a (HBP), a research initiative funded by the graphical user interface (GUI) and European Commission with the mission to Application Programming Interface (API) decode the human brain (Amunts et al., for searching, inserting, editing, querying, 2019, 2016). TVB cloud services (Tables 1, aggregating, filtering and visual 2) were developed by the HBP subproject exploration of the data base. KG uses "The Virtual Brain" in collaboration with controlled vocabularies and ontologies, the two HBP partnering projects TVB-Cloud mapped with existing neuroimaging and (virtualbraincloud-2020.eu) and TVB-CD brain simulation ontologies

2

(Supplementary Note: Metadata rat brain atlas (Osen et al., 2019; Papp et annotations). Professional curation, al., 2014). The Multilevel Human Brain persistent DOIs, licensing, versioning, data Atlas uses the Julich-Brain probabilistic sharing agreements and protected cytoarchitectonic maps (Amunts et al., workflows enable secure, interoperable 2020) to link with template spaces such as and reusable sharing of data and services. BigBrain (Amunts et al., 2013) at the Containerized workflows and backend micrometer scale and MNI (Das et al., supercomputing resources enable 2016) at millimeter scale, and combines reproducible research that scales to the them with imaging-based maps of function demands of the project. A RESTful API is (Evans et al., 2012) and connectivity used for connecting different cloud (Guevara et al., 2017) to link a growing set components, authentication, data transfer of multimodal feature descriptions of the and supercomputer job scheduling. Atlas human brain, in order to capture brain Services provide common spatial reference organization in its different facets. spaces including a multilevel atlas of the human brain as well as the Waxholm Space Results

The Virtual Brain End-to-end encryption drive.ebrains.eu Web GUI Graph database TVB Processing Pipeline wiki.ebrains.eu Sandboxing Container image openMINDS TVB-Multiscale metadata office.ebrains.eu iam.ebrains.eu REST-API Python notebooks Identity and access Fast_TVB management Visual graph Personal data lab.ebrains.eu explorer

(Py)Unicore Python library TVB-HPC OpenShift Authentication Interfaces Metadata and querying Autorisation Bayesian virtual epileptic patient KnowledgeGraph Access control HPC HPC backend Data privacy Core services Deployment TVB services

Figure 1. TVB on EBRAINS cloud ecosystem. Brain simulation and neuroimaging require personal data applicable to data protection regulation. Encryption, sandboxing and access control are used to protect personal data. EBRAINS provides core services: 'Drive' for hosting and sharing files; 'Wiki' and 'Office' to create workspaces and documents for collaborative research; 'Lab' for running live code in sandboxed JupyterLab instances; 'OpenShift' for orchestrating services and resource management; 'HPC' are supercomputers for resource- intensive computations. Services interact via RESTful and use UNICORE for communication with supercomputers. Services are deployed in the form of Web GUIs, container images, Python notebooks, Python libraries and high-performance machine codes. Input and output data can be loaded from and stored into the EBRAINS Knowledge Graph using openMINDS-compliant metadata annotations to enable efficient sharing and re-use. The connectors show interactions between different components (colours group connectors for different deployments).

Service Function URLs The Virtual Brain network Web-App Brain simulation thevirtualbrain.apps.hbp.eu Collab wiki.ebrains.eu/bin/view/Collabs/the-virtual-brain

3

End-to-end use case wiki.ebrains.eu/bin/view/Collabs/user-story-tvb Source code .com/the-virtual-brain/tvb-root Python libraries tvb-library tvb-framework Container image hub..com/r/thevirtualbrain/tvb-run

TVB Image Connectome Web-App Processing analysis tvb-pipeline.apps.hbp.eu Pipeline Collab wiki.ebrains.eu/bin/view/Collabs/tvb-pipeline Source codes github.com/BrainModes/tvb-pipeline-sc github.com/BrainModes/fmriprep github.com/BrainModes/tvb-pipeline-converter Container images hub.docker.com/r/thevirtualbrain/tvb-pipeline-sc hub.docker.com/r/thevirtualbrain/tvb-pipeline-fmriprep hub.docker.com/r/thevirtualbrain/tvb-pipeline-converter

Multiscale co- Two toolboxes Web-App (TVB-Multiscale) simulation for concurrent tvb-nest.apps.hbp.eu simulation of Collab (TVB-Multiscale) large-scale wiki.ebrains.eu/bin/view/Collabs/the-virtual-brain-multiscale and spiking Collab (Parallel CoSimulation) networks wiki.ebrains.eu/bin/view/Collabs/co-simulation-tvb-and-nest-high- computer Source code (TVB-Multiscale) github.com/the-virtual-brain/tvb-multiscale Source code (Parallel CoSimulation) github.com/multiscale-cosim/TVB-NEST Container image (TVB-Multiscale) hub.docker.com/r/thevirtualbrain/tvb-nest

TVB-HPC Automatic Collab code wiki.ebrains.eu/bin/view/Collabs/rateml-tvb/ generation Source code github.com/the-virtual-brain/tvb-root

Fast_TVB Parallelized Collab ReducedWong wiki.ebrains.eu/bin/view/Collabs/fast-tvb Wang Source code github.com/BrainModes/fast_tvb Container image hub.docker.com/r/thevirtualbrain/fast_tvb

Bayesian Epilepsy Collab Virtual modelling wiki.ebrains.eu/bin/view/Collabs/bayesian-virtual-epileptic-patient Epileptic Source code Patient github.com/ins-amu/BVEP

TVB Mouse Mouse brain Collabs Brains simulation wiki.ebrains.eu/bin/view/Collabs/tvb-mouse-brains wiki.ebrains.eu/bin/view/Collabs/mouse-stroke-brain-network-model/ TVB-ready SC, FC, and DOI dataset fMRI from 10.25493/1ECN-6SM tumour URL patients and kg.ebrains.eu/search/instances/Dataset/a696ccc7-e742-4301-8b43- controls d6814f3e5a44

4

openMINDS Metadata in Collab metadata for JSON-LD wiki.ebrains.eu/bin/view/Collabs/openminds-metadata-for-tvb- TVB-ready format ready-data data openMINDS schema github.com/HumanBrainProject/openMINDS

TVB atlas Brain atlas Collab (development version) adapter wiki.ebrains.eu/bin/view/Collabs/sga3-d1-1-showcase-1 Visualizer brainsimulation.org/atlasweb_multiscale

INCF TVB Education and URL training space training training.incf.org/collection/virtual-brain-simulation-platform Table 1. TVB cloud services and URLs leading to their main entry points.

Cloud service Publications The Virtual Brain (Ritter et al., 2013; Sanz-Leon et al., 2015, 2013) TVB Image Processing Pipeline (Proix et al., 2016; Schirner et al., 2015) Fast_TVB (Costa-Klein et al., 2020; Schirner et al., 2018; Shen et al., 2019; Zimmermann et al., 2018) Bayesian Virtual Epileptic Patient (Hashemi et al., 2020; Jirsa et al., 2017) TVB Mouse Brain (Melozzi et al., 2019, 2017) TVB ready datasets (Aerts et al., 2020, 2018) INCF TVB training space (Matzke et al., 2015) Table 2. Exemplary publications using software, workflow or data sets underlying different TVB cloud services.

to predict signals like functional MRI (fMRI) The Virtual Brain or electroencephalograms (EEG). Predicted signals can be compared with TVB (thevirtualbrain.org) is an open-source empirical measurements to evaluate and BNM simulator that combines optimize the model or to analyse the experimental data with neuron theory for underlying model mechanisms. TVB can be brain research (Supplementary Note: Brain directly used from a web GUI (Table 1), simulation with TVB) (Ritter et al., 2013; without the need to install further Sanz-Leon et al., 2013). A BNM is a software or to have a specific operating computer model for simulating brain system, computing environment or activity based on systems of differential hardware. In addition, compiled equations that are coupled by a standalone versions for Linux, Windows reconstruction of a brain's structural and Mac OS as well as execution-ready connectome (SC), the white-matter axon container images are freely available for fiber bundle network that interconnects download. TVB can also be used as a brain areas (Breakspear, 2017). Neural Python library for programming--locally, as populations are simulated by neural mass well as in the EBRAINS Lab (Figure 1). TVB models (NMM), i.e., mean field theories on usage is introduced through Jupyter coupled neurons. SCs can be reconstructed notebooks, explanatory videos and from diffusion-weighted MRI (dwMRI) data technical documentation. The main using the TVB Image Processing Pipeline or documentation is hosted at found with KG. docs.thevirtualbrain.org. BNMs simulate neural activity like membrane voltage fluctuations, synaptic TVB Image Processing Pipeline current flow, or spike-firing, which is used

5

The TVB Image Processing Pipeline takes in dependence of the instantaneous firing anatomical, functional and diffusion MRI rate of the NMM. Vice versa, the mean as input and generates SCs, region-average activity of a NEST neuron network may be fMRI time series, functional connectivity used to inform ongoing inputs to a TVB (FC), brain surface triangulations, NMM. Coupling may be unidirectional, projection matrices for predicting EEG, and e.g., to study effects of large-scale inputs brain parcellations as output. The outputs on small-scale spiking-network activity, or can be used for analysis or for direct bidirectional, to study how both scales upload to TVB. Processing steps are mutually interact. The TVB-multiscale performed by containerized workflows, project is under ongoing development orchestrated by Python scripts, for flexible currently focussing on postulating and adaptation. The pipeline combines three validating coupling scenarios between the BIDS Apps ("Brain Imaging Data Structure", scales, optimizing the user interfaces as (Gorgolewski et al., 2016)) that can be run well as optimizing performance via via command-line interface. The container parallelization and better interprocess codes, respectively images, received communication. TVB-multiscale can be version numbers and are hosted at GitHub, downloaded as standalone container respectively Docker Hub, for change image or used on EBRAINS from Jupyter tracking (Table 1). Jupyter notebooks notebooks (Table 1). explain how the TVB Image Processing Pipeline can be executed on a High-Performance implementations of TVB supercomputer from a web browser using access control, encryption and sandboxing Brain modelling often requires the for protection of personal data. exploration of high-dimensional parameter spaces and thus simulation Multiscale Co-Simulation software need to be efficient and able to run on parallel architectures. Numba, a Multiscale Co-Simulation are two Python compiler that translates a subset of Python toolboxes for simulating brain networks and NumPy into high-performance where large-scale NMMs interact with machine code, is used for faster execution models of individual neurons or neuron of TVB. However, the central integration networks (Table 1). TVB is used to simulate loop is implemented in Python for the large-scale NMMs, while NEST modularity and generality and therefore (Gewaltig and Diesmann, 2007) is used to constitutes a bottleneck for simulation simulate the small-scale neurons. The speed, which is why two dedicated high- toolboxes provide two interfaces to couple performance implementations were the two simulators, using the created. programmatic Python interface of TVB and TVB-HPC automatically produces high- PyNEST (Eppler et al., 2009), a Python performance codes for CPUs and GPUs wrapper for NEST. Populations interact using the domain-specific language with neurons by coupling NMM state RateML for model specification in order to variables with single neuron state variables simplify the process of implementing or parameters. For example, a TVB state optimized BNM simulation codes. RateML variable that simulates ongoing population is based on the domain-independent firing can be used to inject spikes into a language 'LEMS' (Vella et al., 2014), which NEST spiking network, e.g., by sampling allows for the declarative description of spike times from a probability distribution model components in a concise XML

6

representation. TVB-HPC is part of the healthy zones (HZ), where no seizures main TVB Python toolbox (Table 1). occur. Priors for the excitability parameter Fast_TVB is a specialized high-performance can express clinical hypotheses or implementation of the empirical observations (e.g., in a seizure ReducedWongWang model (Deco et al., region the HZ range can be excluded from 2014), written in C, that makes use of the prior). Simulation results are compared "Single Instruction, Multiple Data" with empirical measurements (e.g., EEG operations, explicit memory management from implanted electrodes) and model and C pointers to efficiently use CPU selection with cross-validation metrics are resources. Fast_TVB's high efficiency computed to evaluate clinical hypotheses (Supplementary Figure 6) enables to and to assess the model's ability to predict perform dense parameter space new data. The workflow was successfully explorations or to simulate high- used to infer the spatial maps of dimensional models with millions of nodes. epileptogenicity from ground-truth The code uses multithreading to synthetic data for symptomatic and simultaneously perform the processing of asymptomatic seizures (Hashemi et al., one BNM on multiple processors. Fast_TVB 2020). The currently running EPINOV is deployed as a standalone container clinical trial (epinov.com) investigates image (Table 1). informing clinical decisions with virtual patient studies to improve surgery Bayesian Virtual Epileptic Patient outcome. To run BVEP, users may follow the instructions in the collab or the source The Bayesian Virtual Epileptic Patient code repository (Table 1). (BVEP) uses Bayesian inference to compute posterior probability TVB Mouse Brains distributions for parameters of TVB's Epileptor NMM in order to study the The Virtual Mouse Brain (TVMB) extends spread of epileptic seizures (Jirsa et al., TVB with tractography-based as well as 2017, 2014). The approach applies Bayes' tracer-based mouse SC (Melozzi et al., theorem on prior distributions obtained 2017). Tracer-based SC was exported from from empirical data (e.g., a patient’s SC, or the Allen Mouse Brain Connectivity Atlas lesions detected in MRI) and model (Oh et al., 2014) using the Allen simulations, taking into account the Connectivity Builder. Two use cases of likelihood for these observations. TVMB are demonstrated on EBRAINS Estimating the excitability parameter of an (Table 1). The first use case performs a Epileptor-BNM for every brain region bifurcation analysis to show the existence yields a map of epileptogenicity to guide of multistability in mouse BNMs and clinical decision-making. The excitability compares FC simulated using tracer-based parameter controls whether an Epileptor SC versus FC using dwMRI-based SC NMM shows epileptogenic behavior and (Melozzi et al., 2019). The second use case the estimated values are used to classify ("Stroke Mouse Brain") constructs custom brain regions into three categories: mouse SC at different resolutions (the epileptogenic zones (EZ), which can maximum resolution of 50 μm yields 540 autonomously trigger seizures; brain regions), fits the working point of a propagation zones (PZ), which do not mouse BNM and explores ways to simulate trigger seizures autonomously but may be stroke and rehabilitation in mice (Allegra recruited during the seizure evolution; and Mascaro et al., 2020).

7

set of multimodal data features, including TVB-ready data transmitter receptor densities (Palomero- Gallagher and Zilles, 2019), cell The EBRAINS KG provides users with TVB- distributions, and physiological recordings, ready reference data sets in BIDS format based on the Julich-Brain cytoarchitectonic from tumor patients and matched control maps (Amunts et al., 2020). Aligned with participants. The data set contains region- standard brain templates, the Human Atlas average fMRI time series, FC, and SC from can be registered with individual brains to 31 brain tumor patients before and after export multimodal microstructural surgery, and 11 healthy controls (Aerts et "fingerprints" that can be used to set the al., 2019). Planning for tumor surgery parameters of BNM nodes. For example, involves delineating eloquent tissue to density measurements for 16 receptors spare by analyzing neuroimaging data like will be provided for each brain region, and fMRI and dwMRI data. Instead of analyzing high-resolution tractography maps (full these modalities independently, BNMs brain, post-mortem 200 μm isotropic provide a novel way to combine their diffusion MRI and 60 μm isotropic 3D information. Polarised Light Imaging) will increase the Optimized parameters in presurgical reliability of SC (Axer et al., 2016; Beaujoin models differentiated between regions et al., 2018). As data adapter, it is planned directly affected by a tumor, regions to link intracranial electrophysiology distant from a tumor and regions in a recordings with the respective Julich-Brain healthy brain, which may help to better regions to set parameters based on direct delineate eloquent tissue (Aerts et al., measurements of effective connectivity 2018). Furthermore, it was found that and transmission delays from stimulation "virtual neurosurgeries", where the experiments (Trebaul et al., 2018). A patients' actual surgeries were simulated, viewer was implemented to visualize improved the fit with postsurgical brain different atlas maps on the cortical surface dynamics, which may allow presurgical (Table 1). exploration of different surgical strategies (Aerts et al., 2020). INCF training space

TVB atlas and data adapters The INCF (International Coordination Facility) training space holds Adapters connect TVB with data and a dedicated collection for TVB with didactic software (Table 1; Supplementary Note: use cases, video tutorials, notebooks and TVB atlas and data adapters). Among example data sets (Supplementary Note: others, TVB has interfaces with MATLAB, INCF training space). INCF's TVB EduPack the Brain Connectivity Toolbox, and the module (Table 1) gives a thorough Allen Brain Atlas introduction into work with TVB in general (docs.thevirtualbrain.org). Adapters are and EBRAINS cloud services in particular, under development that connect TVB with helping users to reproduce several TVB the Human Brain Atlas, Knowledge Graph publications. Tutorials consist of short and Human Intracerebral EEG Platform to video lectures, scripting tutorials with inform BNM parameterization and to Jupyter notebooks and code. TVB Made compare simulation results with empirical Easy is a series of short lectures that data (Table 1). The Human Brain Atlas introduce working with TVB in a clinical characterizes brain regions with a growing context, e.g., predicting recovery after

8

stroke and simulating epilepsy patient processing of personal data including the brains. Brief lectures describe methods, storage and sharing of data. From the results and ways to replicate the principal outset of the project, at the development ideas of the articles with TVB. stages, Article 25 GDPR ("data protection by design and by default") requires Data protection in the TVB on EBRAINS partners to implement the appropriate cloud technical and organizational measures, to ensure adherence with the principles of Biomedical research is facing challenges data protection at set out in Article 5 GDPR because many methods lack technical and more generally a commitment to infrastructure to protect the privacy of GDPR compliant data processing to personal data. Problematically, biomedical safeguard the rights and freedoms of the data cannot be easily anonymized or data subjects as set out in the GDPR. A pseudonymized such that all potentially principle means of ensuring GDPR identifiable information are removed, and compliant data processing is the potential re-identification is excluded implementation of appropriate technical (Byrge and Kennedy, 2018; Gymrek et al., and organizational measures to ensure a 2013; Rocher et al., 2019). To use cloud level of security appropriate to the risk of services, personal data has to be the processing as set out in Article 32 transmitted over the and other GDPR. An assessment of the most suitable open networks and is stored and security measures must consider the processed on shared supercomputers, context and purposes of processing in which poses the risk for unauthorised relation to the risk posed to the rights and access of the data. The European Union's freedoms of the data subject. Having made General Data Protection Regulation this assessment, the appropriate security (GDPR) and similar international and measures are realized for TVB-on-EBRAINS national laws impose restrictions on the by implementing access control, encryption and sandboxing (Figure 2).

Personal data

Data Create Exchange Authenticate Encrypt Encrypted controller key pair public keys results

Forward EBRAINS Authenticate Upload Download public keys Sandbox

Start Create Exchange Pipeline Encrypt HPC Decrypt sandbox key pair public keys processing results

Figure 2. Securing personal data processing workflows in shared environments. Personal input data is encrypted with public key cryptography on the data controller's computer before upload to the cloud. The key pair for upload is generated within a sandboxed process at the final processing site and the private key never leaves the sandbox. All processing is performed in the sandbox and personal data is never written outside the sandbox in unencrypted form. A public key generated by the data controller is used for returning encrypted results.

9

EBRAINS access control uses passwords the last process exits. For upload to the and cryptographic keys to prevent TVB Web GUI personal data is first unauthorized access, to provide secure encrypted with a public key and delegated access for connecting different immediately after arrival in the cloud cloud services, as well as to provide single decrypted and again encrypted with a sign-on. Keycloak is used for identity and freshly created secret key to increase access management (IAM), i.e., user protection in case that the key for data registration, management and permission upload is leaked. The data stays encrypted control. The OpenID Connect protocol is until an authorised user opens the used for authentication--confirming the respective project in its private TVB Web- identity of a user--based on OAuth 2.0 GUI space. Each project is encrypted with a specifications for delegating and conveying different key and the public key for initial authorization decisions. OAuth 2.0 is a encryption is regularly changed. For widely adopted standard for access resource-intensive simulations the delegation and user authorization flows encrypted data is forwarded to without the need for sharing credentials, supercomputing systems and only which is realized by issuing cryptographic decrypted after the associated job on the tokens that provide ongoing access to compute nodes was started. Furthermore, protected resources on behalf of a user. the decryption key is sent from the TVB With this framework EBRAINS applications Web-GUI instance directly to the running can limit the scope of services accessible by backend compute process on the a user. supercomputer using an authenticated Encryption is a fundamental tool for data security token; the key is only kept in privacy, ensuring that data becomes memory, and never written to the file unintelligible without decryption key. system. See Supplementary Note: Data Therefore, data is better protected if it is protection in the TVB on EBRAINS cloud for encrypted at all times with the only more information. exception being during the time of the processing, where it is only decrypted for Shared responsibility & compliance the parties involved; additionally, during the processing it may exist in unencrypted In order to use TVB cloud services a user form only within isolated memory and file- must agree to terms that clarify their system locations that are invisible from the personal responsibility regarding the host (sandboxes). TVB-on-EBRAINS compliance with the GDPR workflows (Figure 2) require that personal (ebrains.eu/terms). These terms data must already be encrypted on the transparently clarify the nature of security data controller's computer before being precautions, contact persons, personal uploaded to the cloud. Public key responsibilities of the user, monitoring, cryptography is used in a way that the logging and passing of information to third secret key for data upload is created ad- parties. A detailed risk assessment and hoc at the beginning of the workflow and technical documentation are currently in remains the entire time in a sandboxed preparation to clarify inherent risks of process at the final processing site. open networks and shared computing Processing results are never written out in systems and the taken countermeasures. unencrypted form, except in temporary Security measures were put in place to filesystem trees that are invisible from the prevent that users are able to access host and automatically cleaned up when another user's data if not explicitly shared.

10

The system is designed such that the user mailing lists and support contacts provide is the only person that is in control over the efficient and didactic dissemination of data while in the cloud. From a GDPR point knowledge and support. EBRAINS core of view they are therefore considered to be services enable to map and organize data controllers (Article 26 GDPR), which is projects into a persistent and replicable the person determining the (essential) structure at a central and secure place, means and purposes of the processing, which makes it easier to pick up complex Hence, the parties using the cloud do not projects at a later time. The flexibility of cede their legal responsibilities for data the platform and its focus on community- protection—on the contrary they remain driven research enable rapid adoption of unreservedly with the user as the data advances in brain simulation and controller, because the technical and connectomics, as well as correction of organisational mechanisms, that were put errors. Technical and organisational in place to prevent unauthorized access, security mechanisms are designed to guarantee their sole and independent provide highest data protection standards, determination on the purposes and means while at the same time providing flexibility of the data processing operation. to enable state-of-the-art research. To keep the high quality of the cloud services, ongoing and future efforts are directed towards the continuous integration of Discussion improved community standards and best practices. The TVB on EBRAINS ecosystem TVB cloud services were developed to can be transferred to other Cloud lower the barriers to brain simulation and environments within the European Open connectome analysis workflows and to Science Cloud or beyond. Thus, it serves as enhance their reproducibility. All codes are a reference architecture for secure open source and available for download processing and simulation of neuroscience from GitHub. Software is packaged in data in the cloud (Figure 1 and platform-independent container images Supplementary Discussion). that can be directly used without the need to install dependencies. Most software and Methods data components were peer-reviewed, and results published in academic journals TVB Image Processing Pipeline (Table 2). To ensure long-term accessibility, software libraries from third- The EBRAINS TVB Image Processing party developers were forked and Pipeline combines three BIDS Apps (see archived. Codes and images are versioned Supplementary Note: BIDS Apps) to yield for reproducibility; legacy codes and an updated version of our brain network images remain available in their model construction workflow (Schirner et repositories. By reporting the application al., 2015). BIDS Apps are brain imaging name and version in a publication, it software packages that understand BIDS becomes possible to exactly replicate the datasets and that are deployed as portable used workflow, which counteracts container images (Gorgolewski et al., 2017, variability between software versions and 2016): lack of reporting. Comprehensive • thevirtualbrain/tvb-pipeline-sc for documentation in the form of manuals, diffusion MRI tractography, tutorials, lectures, Jupyter notebooks, demo data, workshops, videos, use cases,

11

• thevirtualbrain/tvb-pipeline- user namespaces that allow to give fmriprep for fMRI preprocessing, unprivileged users container features. and Bubblewrap creates a new mount • thevirtualbrain/tvb-pipeline- namespace where the root is on a converter for data privacy and temporary filesystem that is invisible from conversion of the results of the the host and that will be automatically other two containers into TVB and cleaned up when the last process exits. BIDS formats. Shortly before the upload of the encrypted All three container images are provided on data a pair of public and private keys is TVB's main Docker Hub Repository generated, and the public key is forwarded hub.docker.com/r/thevirtualbrain. The to the data controller to encrypt the code for the container tvb-pipeline-sc was password for the encrypted input data. The cloned and modified from the BIDS App data themselves is encrypted on the data MRtrix3_connectome (github.com/BIDS- controller's computer with AES-256 Apps/MRtrix3_connectome) and tvb- encryption using pyAesCrypt pipeline-fmriprep was cloned from the (pypi.org/project/pyAesCrypt). After the neuroimaging software fmriprep encrypted data and password were (github.com/poldracklab/fmriprep), which uploaded to the supercomputer, the is why we recommend to acknowledge password and data are decrypted into the MRtrix3, MRtrix3_connectome (Smith and sandboxed temporary file system. The Connelly, 2019; Tournier et al., 2019) and secret private key for decryption is only fmriprep (Esteban et al., 2019) when using held in the main memory of the sandboxed these workflows. Containerization makes it process and never written out. During the easier to deploy these workflows on entire processing intermediate results are different architectures, as they rely on a only written into the temporary filesystem number of dependencies. that is invisible from the host. The outputs The overall workflow is coordinated by a of the workflow are encrypted with a central program that orchestrates the public key that was generated on the data execution of the three containers on a controller's computer and all input and supercomputer and that ensures that intermediate data of the workflow are personal data is encrypted at all times, deleted, the sandbox stopped, and the except for the duration of the processing encrypted results returned to the data and then only in the main memory of a controller. sandboxed process (Figure 2). All Processing of functional or diffusion- processing of personal data on the weighted MRI requires at least one high- supercomputer happens inside the resolution T1-weighted structural MRI for sandbox and intermediary data is only parcellating the brain. If the input data was written out into a temporary file system not corrected for susceptibility distortions, that is invisible from the host. After a data then additionally phase-reversed images of controller authenticated with the EBRAINS the B0 field in dwMRI data and fieldmaps platform and initiated a processing for fMRI are required. Please see the workflow, EBRAINS authenticates with the Supplementary Note: Exemplary input supercomputer and starts a sandboxed data for more information on input data process that is isolated from the host using sets. Bubblewrap (github.com/containers/bubblewrap), thevirtualbrain/tvb-pipeline-sc which is a sandboxing tool based on Linux

12

Diffusion modelling and tractography are fiber orientation distributions, spherical carried out by MRtrix3, which is a powerful deconvolution is performed with dwi2fod. and actively developed tractography For inter-modal registration, the workflow toolbox (Tournier et al., 2019). Notably, it extracts the brain with ROBEX and includes methods to increase the accuracy performs a bias field correction on the T1 of tractography and reducing false image in its original space using ANTS positives by removing tracks that are N4BiasFieldCorrection. Contrast-matched anatomically implausible, called images for inter-modal registration Anatomically-Constrained Tractography between DWIs and T1 are generated using (Smith et al., 2012). In order to estimate mrhistmatch and the T1w MRI is registered region-to-region coupling strengths, the to DWI using mrregister. The MRtrix MRtrix3 programs SIFT, respectively SIFT2, program 5ttgen in conjunction with FSL or re-weight each track in the reconstructed FreeSurfer is then used to segment tissues tractogram such that streamline densities into cortical grey matter, sub-cortical grey become proportional to the cross- matter, white matter and cerebrospinal sectional area connecting each pair of fluid. Next, cortical gray matter brain regions (Smith et al., 2015). parcellations are obtained. For the atlases The workflow is controlled by a Python 'desikan', 'destrieux', and 'hcpmmp1', the script that can be modified according to parcellation information is taken from the research question. In the following we FreeSurfer's recon-all output. For the describe the default setup of the workflow atlases 'aal', 'aal2', 'craddock200', as it is currently implemented. As input, 'craddock400', and 'perry512', a non-linear the workflow requires T1-weighted and registration to the MNI152_T1_2mm diffusion-weighted MRI data, as well as at template is performed with optionally least one phase-reversed dwMRI of the B0 ANTS or FSL, and the registration result is field for susceptibility distortion used to transform the atlas information correction. Outputs are whole-brain into individual subject space. Based on this tractograms and SC. In between, the nonlinear registration any atlas defined on following processing steps are performed. the MNI152 template can be used to First, dwMRI is denoised by removing parcellate the subject's brain, e.g., noise-only principal components (Veraart cytoarchitectonic information from Julich- et al., 2016) using dwidenoise. Next, Gibbs Brain. Next, the central step in this ringing artifacts (Kellner et al., 2016) are workflow is performed: whole-brain fibre- removed using mrdegibbs, and distortions tractography using tckgen with ACT to are corrected with dwipreproc: eddy control for anatomical plausibility of current-induced distortion correction and constructed tracks using information from motion correction is performed using FSL the tissue-segmented image (Smith et al., eddy, and (optionally) susceptibility- 2012). As additional plausibility criteria, induced distortion correction is performed tracks are truncated and re-tracked if a using FSL topup. In the next step, B1 bias poor structural termination is field inhomogeneities are corrected using encountered, respectively cropped when dwibiascorrect and a brain mask for DWI is they hit the gray-matter-white-matter created using dwi2mask and maskfilter. interface and when they exceed a length of After computing fractional anisotropy 250 mm. Seed points are determined maps using dwi2tensor, dwi2response is dynamically using the SIFT model (Smith et used to estimate the response functions al., 2015). If the number of tracks to be for spherical deconvolution. To obtain generated is not manually specified, it is

13

set to 500*N*(N-1), where N is the number data, as well as noise components of brain regions in the parcellation. After a estimated with independent component whole-brain tractogram has been analysis, which are used to remove generated, SIFT2 (Smith et al., 2015) is associated variance from the fMRI data used next to filter tractography results to during the tvb-pipeline-converter determine streamline weights. SIFT2 workflow. The following processing steps optimises per-streamline cross-section are performed: T1w volumes are corrected multipliers to match a whole-brain for intensity non-uniformity using tractogram to fixel-wise fibre densities. As N4BiasFieldCorrection and skull-stripped last step of the MRtrix workflow, using antsBrainExtraction.sh. Brain tck2connectome is used to aggregate the surfaces are reconstructed using whole brain tractogram into a region-by- FreeSurfer's recon-all, and the temporary region connectome matrix using the brain mask is refined to reconcile ANTs- specified region parcellation. In addition to derived and FreeSurfer-derived outputting the sum of streamline weights segmentations of the cortical gray-matter for each region pair, also the mean of Mindboggle. Spatial normalization to streamline lengths between each region is the ICBM 152 Nonlinear Asymmetrical output. Optionally, the workflow also template version 2009c is performed outputs track density images, which are through nonlinear registration with the useful for visualising and evaluating the antsRegistration tool, using brain- results of tractography. extracted versions of both T1w volume and template. Brain tissue segmentation of thevirtualbrain/tvb-pipeline-fmriprep cerebrospinal fluid (CSF), white-matter (WM) and gray-matter (GM) is performed The fMRI workflow relies on fmriprep, on the brain-extracted T1w using FSL FAST. which is a preprocessing pipeline for fMRI Functional data is slice time corrected data that is relatively robust to variation in using 3dTshift from AFNI and motion scan acquisition protocols, sequence corrected using mcflirt from FSL. Distortion parameters, or presence of fieldmaps for correction is performed using an artifact correction (Esteban et al., 2020, implementation of the TOPUP technique 2019). A characteristic feature of fmriprep using 3dQwarp. This is followed by co- is its "glass " philosophy, according to registration to the corresponding T1w which reports for visual verification are using boundary-based registration with produced after important processing nine degrees of freedom, using bbregister steps. This is an important feature as many from FreeSurfer. Motion correcting steps of MRI preprocessing pipelines are transformations, field distortion correcting susceptible to errors and verification is warp, BOLD-to-T1w transformation and therefore generally recommended after T1w-to-template (MNI) warp are major steps. The workflow combines the concatenated and applied in a single step following software: FreeSurfer, FSL, AFNI, using antsApplyTransforms using Lanczos ANTS, BIDS validator and ICA-AROMA. interpolation. Physiological noise Auxiliary tools are: Python (miniconda), regressors are extracted applying pandoc, SVGO, neurodebian and git. As CompCor. Principal components are input the workflow requires data to be estimated for the two CompCor variants: in BIDS format, and it must include at least temporal (tCompCor) and anatomical one T1w structural image and one fMRI (aCompCor). A mask to exclude signal with series. Output contains preprocessed fMRI cortical origin is obtained by eroding the

14

brain mask, ensuring it only contained mapped to MNI space. Here, the converter subcortical structures. Six tCompCor container performs non-aggressive components are then calculated including denoising following the approach only the top 5% variable voxels within that described in Pruim et al. (2015) using the subcortical mask. For aCompCor, six independent components (ICs) together components are calculated within the with their classification obtained in the intersection of the subcortical mask and previous step. ICA-AROMA classifies ICs the union of CSF and WM masks calculated into movement-related and movement- in T1w space, after their projection to the unrelated ICs. It does not require manual native space of each functional run. Frame- training of the classifier--the classifier is wise displacement is calculated for each already trained using four theoretically functional run using the implementation of motivated spatial and temporal features: Nipype. ICA-based Automatic Removal Of high-frequency content, correlation with Motion Artifacts (AROMA) was used to realignment parameters, edge fraction, generate aggressive noise regressors as and CSF fraction of each IC (Pruim et al., well as to create a variant of data that is 2015). With aggressive denoising the non-aggressively denoised (see section entire temporal waveform of movement- thevirtualbrain/tvb-pipeline-converter for related ICs is regressed from the data, details on aggressive vs. non-aggressive analogous to the commonly done cleaning). For details, please refer to regression of nuisance parameter time fmriprep's documentation courses (like mean global signal, mean (https://fmriprep.readthedocs.io/). tissue class signal, frame-wise displacement, motion parameters) from thevirtualbrain/tvb-pipeline-converter the data. Critically, all variance associated with such "nuisance" time courses is The third container of the TVB image removed, including shared variance with processing pipeline takes the outputs of actual brain signals. This is problematic, the first two containers as input and because global fMRI time courses often outputs TVB-ready BNM connectomes in share a lot of variance with neural activity: BIDS format (Supplementary Table 1) and even if two signals originate from different in native TVB-ready format locations in the brain, they may still be (Supplementary Table 2). Outputs include highly correlated. Therefore, regressing SC, FC, EEG/MEG out global signals likely removes signals of (magnetoencephalography) projection interest (Liu et al., 2017; Murphy and Fox, matrices, and brain surface triangulations. 2017). As an alternative, the non- As first step, the workflow performs what aggressive approach based on ICA-AROMA fmriprep documentation calls "non- is more conservative by first performing a aggressive" cleaning of motion artefacts. regression on the full set of IC time series, Following its glass box philosophy, including both signal and noise ICs. Then, fmriprep involves the researcher in critical to clean the data, only the motion-related decisions about the denoising strategy, as regressors are subtracted from the data, it can have considerable effects on the which specifically removes variance ensuing analyses. To this end, fmriprep associated with motion-related-ICs that is provides both aggressively and non- not shared by the remaining ICs. aggressively denoised fMRI. The output of Problematically, with this non-aggressive the fmriprep workflow already contains approach motion-related dynamics that ICA-AROMA denoised 4D NIFTI files were not identified as such are specifically

15

retained in the data. Conversely, as obtain the region mapping. The following mentioned, the drawback of aggressive atlases are currently natively supported, denoising is that it may remove shared for other atlases the workflow needs to be variance between signal and noise. To adapted: "aal", "aal2", "craddock200", clean fMRI data non-aggressively the "craddock400", "desikan", "destrieux", workflow uses fsl_regfilt with the provided "hcpmmp1", "perry512". AROMA ICs and a list of (motion-related) Next, the MNE toolbox (Gramfort et al., ICs to reject. After this step, further 2013) is used to compute forward and nuisance regression may be performed (if inverse models for electromagnetic source it fits the study design) as well as removal imaging. First, surfaces are decimated to of linear trends and high-pass filtering. It is, 30,000 triangles and the region-mapping is however, important that during such a obtained by nearest neighbor second regression step variables interpolation in the original high- correlated with motion are not used as resolution surface. For the head model, regressors, as this may re-introduce BEM surfaces using the FreeSurfer motion-related waveform components watershed algorithm are constructed. To into the time series. make the workflow automatic, standard After denoising, the converter computes EEG montage locations are projected onto region-average fMRI time courses by the individual head surfaces. The outputs resampling the brain parcellation to fMRI of this step are a projection or lead-field resolution and averaging over all voxel matrix (LFM), EEG sensor location time series in each region. The converter coordinates, and a mapping between also outputs cortical surfaces, which are surface vertices and large-scale regions. used for detailed surface simulations and The LFM has the dimensions M x N, with M for EEG or MEG source modelling, e.g. in being the number of vertices on the order to extract region-average EEG or cortical surface and N the number of EEG MEG source activity, which can be used to sensors. Next, surfaces are exported: the constrain BNM dynamics (Schirner et al., downsampled pial surface from FreeSurfer 2018). The converter merges the left and (used to define source space) as well as the right hemisphere cortical surface BEM surfaces of the inside and the outside triangulations reconstructed by FreeSurfer of the skull and the scalp. Finally, the and generates a mapping between each converter outputs the SC weights and vertex and the large-scale regions of the distances matrices as well as region brain atlas. Each vertex of the cortical centroids, average orientations (average white matter and pial surface over all vertex-normals), surface areas, and triangulations are associated with the two files that indicate to which hemisphere region-label of the region at that location. each region belongs and whether it is a This step is carried out with HCP cortical or a subcortical region. All output Connectome Workbench for atlases that files are plaintext ASCII files that are zipped are defined in volumes (e.g., "aal", "aal2" for uploading to TVB. As last step, the "craddock200", "craddock400", converter generates metadata for the BIDS "perry512", atlas names as used in MRtrix output (see Supplementary Note: workflow source codes) and outputs a Metadata annotations). Supplementary GIFTI label file for each hemisphere. For Table 1 summarizes the folder and atlases defined on surfaces (e.g., filename structure in BIDS format and "desikan", "destrieux", "hcpmmp1"), the Supplementary Table 2 summarizes folder 'annot' files from FreeSurfer are used to and filename structure for outputs in TVB

16

format, which are all stored in the output only BNMs, it is possible to subsequently file "TVB_output.zip" that can be imported input simulated neural activity into into TVB using the "Upload Connectivity forward models to simulate typical ZIP" functionality of TVB. neuroimaging signals like fMRI or EEG. The TVB to NEST interface is based on the Multiscale Co-Simulation creation of TVB "proxy" nodes within the spiking network model. Proxy nodes are Multiscale Co-Simulation is currently NEST stimulation devices that are used to implemented in the form of two toolboxes: inject spikes or currents into NEST neurons the TVB-Multiscale toolbox and the Parallel in order to simulate inputs from the large CoSimulation toolbox (Table 1). Both are scale to the small scale at a particular brain based on common concepts and region. It is possible to generate currents architectures, but they are independently or spike trains with a desired first order developed to focus on different goals: TVB- (mean firing rate) or second order statistics Multiscale focusses on rapid prototyping of (correlations) such that the statistics of the scientific use cases while Parallel produced spikes or currents follow the CoSimulation focusses on the optimization corresponding statistics of the large-scale of co-simulation performance. population activity. Proxy nodes can be In order to couple the small scale (NEST) coupled to NEST spiking networks with with the large scale (TVB), TVB NMM user defined connection weights and equations are coupled with NEST single delays, in order to simulate the large-scale neuron equations. The coupling can be features of coupling on the small scale. To unidirectional, i.e., one scale provides compute inputs from NEST to TVB, NEST input to the other scale, or bidirectional, recording devices are used to aggregate i.e., both scales provide input to one the activity of spiking neuron populations. another. The bidirectional case enables to Multiscale Co-Simulation gives the user "substitute" large-scale NMMs by small- flexibility for custom configuration of co- scale spiking networks to simulate one or simulations regarding the network more specific populations of a BNM on a structures, the state variables used for finer level. Large-scale inputs to the coupling, the transformation functions and substituted populations will then be the devices for computing and applying forwarded to the small-scale network, inputs, the allocation of computational while small-scale activity will be averaged, resources, and the storage of results. or transformed with a custom-defined The Parallel CoSimulation toolbox is transformation function, and forwarded as currently under development to optimize input to the connected large-scale nodes. the performance of co-simulation as well For example, such a transformation as for integration with the TVB-Multiscale function may convert moment-to-moment toolbox. The idea for optimization is to firing rates of a large-scale NMM into spike reduce the number of costly trains by sampling from Poisson communication operations between NEST distributions centered at these firing rates, and TVB, because inputs often do not need which can then be used to drive spiking to be exchanged in every single time step network models. Vice versa, spike trains of of the model integration. Rather, it is often a population of neurons may be used to the case that the model does not require compute instantaneous population firing instantaneous interactions, depending on rates, which are then used to drive large- the axonal transmission delays in the scale NMMs. Like in the case of large-scale- network. The Parallel CoSimulation

17

toolbox therefore employs a strategy multiple models in parallel. RateML is where each simulator integrates their based on the domain-independent respective model equations independently language 'LEMS' (Vella et al., 2014), which and in parallel for a number of time steps, supports the declarative description of until again an exchange of inputs becomes model components in a concise XML unavoidable. The toolbox uses MPI representation. The PyLEMS expression (Message Passing Interface) for parser is used to check and parse communication and is implemented in a mathematical expressions (Vella et al., modular fashion: independent modules for 2014). Exemplary RateML ports of the TVB simulation and for input transformation NMMs Epileptor, Kuramoto, 2D Oscillator, enable scaling over multiple compute Montbrio and ReducedWongWang are nodes and to allocate different provided with the main TVB software computational resources to each module. package. The communication between simulators is Code with model equations is mapped to implemented using Message Passing generalized universal function (gufuncs) Interface (MPI), and synchronization is using Numba's guvectorize decorator, taken care by the mediating which compiles a pure Python function transformation modules, in which the directly into machine code that can be receiving data are also adapted to the used to operate on NumPy arrays. An appropriate inputs’ format of each example of a Numba generated simulator. The simulators and the guvectorize function is displayed in Listing transformers are encapsulated in 1. In this example the gufunc independent modules allowing for _numba_dfun_Epileptor accepts two n and flexibility regarding the components of any m sized float64 arrays and 19 scalars as particular instance of co-simulation, as input and returns a n sized float64 array. well as for scaling co-simulation to multiple The benefit of writing NumPy gufuncs with supercomputer nodes. A benchmark of the Numba's decorators, is that it toolbox is provided in Supplementary automatically uses parallel operations such Note: TVB Multiscale Co-Simulation as reduction, accumulation and benchmark. broadcasting to efficiently implement an algorithm. Please see Supplementary High-Performance implementations of TVB Methods: TVB-HPC for further notes on the implementation and a benchmark. For TVB-HPC the domain-specific language RateML was developed, which allows to @guvectorize([(float64[:], float64[:], (float64 * 19), specify custom NMMs without requiring float64[:])], knowledge about how to optimally '(n),(m)' + ',()'*19 + '->(n)', implement such models. Python code for nopython=True) CPU and CUDA code for GPU is def _numba_dfun_EpileptorT(vw, coupling, a, b, c, d, r, s, x0, Iext, automatically produced from RateML slope, model specifications. The resulting Python Iext2, tau, aa, bb, Kvf, Kf, Ks, tt, code uses Numba vectorization (Lam et al., modification, local_coupling, dx):

2015) to integrate the NMMs. The c_pop1 = coupling[0] resulting CUDA code uses the highly c_pop2 = coupling[1] parallel architecture of modern GPUs by c_pop3 = coupling[2] spawning one thread for every simulated c_pop4 = coupling[3] ... # calculate derivatives parameter set, allowing to simulate

18

return dx benchmark are provided in Supplementary Listing 1. Example of a gufunc header for Note: Fast_TVB. the epileptor model, where vw holds the input and dx the computed output Bayesian Virtual Epileptic Patient derivatives. Bayesian inference is used to compute Fast_TVB was developed in C and makes posterior probability distributions for extensive use of several optimization parameters of TVB's Epileptor NMM techniques to increase the speed of BNM (Hashemi et al., 2020; Jirsa et al., 2017, simulation. It uses Single Instruction, 2014; Proix et al., 2017). The Epileptor was Multiple Data instructions where the same developed on the basis of a taxonomy operation is concurrently applied to analysis of seizure dynamics that has multiple values contained in one large shown that a system of five linked state register. In the central integration loop no variables is sufficient to describe onset, function calls are made to avoid possible time course and offset of ictal-like overhead and instead all necessary code is discharges and their recurrence (Jirsa et al., either directly written into the loop or 2014). A detailed analysis over 2000 focal- inlined. Inner loops are unrolled, jammed onset seizures from multiple centers has and bound to avoid "end of loop" tests and shown that the Epileptor model is able to branch penalties. Intermediary results are realistically simulate the most dominant re-used in other parts of the algorithm and dynamics in seizure-like events (Saggio et not computed again. For example, some al., 2020). The target of inference is to terms and expressions occur multiple obtain posterior distributions of the times in the equations or they resolve to a excitability parameter for every brain constant and therefore only need to be region in order to assemble patient- computed once and not repeatedly in each specific epileptogenicity maps; time step. To reduce memory-related furthermore, initial conditions, global performance bottlenecks and cache coupling, and noise parameters are fitted failures, data to compute coupling inputs is (in total 3N+3 parameters for N brain organized in an efficient ring-buffer that regions). The excitability parameter stores related data in close locality. SC is controls whether the Epileptor shows stored in a sparse layout that scales linearly epileptogenic behavior or not and the with the number of connections (and not estimated values are used to classify brain quadratically as when stored as array), regions into three categories: which enables to hold large BNMs epileptogenic zones (EZ), which can efficiently in memory. The computation of autonomously trigger seizures; coupling inputs also requires that the propagation zones (PZ), which do not program steps through distant memory trigger seizures autonomously but may be locations (in order to fetch time-delayed recruited during the seizure evolution; and state variables), which is why pointer healthy zones (HZ), where no seizures tables that directly link to the required occur. addresses are used instead of array As exact posterior inference in such high- indexing tables, which saves one dimensional models is intractable, dereferencing operation for each memory approximate methods are used, e.g., access. Input and output operations Markov Chain Monte Carlo (MCMC), which happen only before or after the main enables to generate correlated samples simulation to avoid waiting time for that converge to a (potentially complex) devices. Further information and a

19

target distribution. Gradient-based MCMC where it was possible to accurately and algorithms like Hamilton Monte Carlo efficiently infer the spatial map of provide efficient convergence to high- epileptogenicity for all brain regions dimensional target distributions, but their (Hashemi et al., 2020). See Supplementary performance is highly sensitive to Note: Bayesian Virtual Epileptic Patient for hyperparameters, which often need to be more information. re-tuned to arrive at the desired target distribution, which is solved in BVEP by TVB Mouse Brains using No-U-Turn Samplers for adaptive self-tuning of hyperparameters (Hoffman For TVMB (Melozzi et al., 2019, 2017) and Gelman, 2014). Alternatively, tracer-based mouse SC at different Automatic Differentiation Variational resolutions was constructed from the Allen Inference (ADVI) is used to approximate Mouse Brain Connectivity Atlas (Oh et al., the posterior by first positing a family of 2014) with a maximum resolution of 50 densities and then finding a member of μm, yielding 540 brain regions. The that family that is closest to the target "Mouse Stroke" workflow associates distribution as measured by Kullback- structural alterations related to stroke with Leibler divergence (Kucukelbir et al., 2017). corresponding changes in FC and BVEP allows to integrate clinical population synchronization (Allegra information as priors, such as a patient’s Mascaro et al., 2020). To model stroke and SC, or lesions detected in MRI. Priors for subsequent rehabilitation, a pre-defined the excitability parameter are either fraction of incoming connections is uninformative (e.g., a uniform distribution removed, e.g., 0 % corresponds to no over the entire range for EZ, PZ and HZ), or stroke, while 100 % would correspond to a they can express clinical hypotheses (e.g., devastating stroke that destroyed all by extending a distribution only over one incoming connections to the region. category) or for including observations Similarly, recovery from stroke is modelled (e.g., when the region was recorded to by compensatory rewiring (Nudo, 2013) show seizure activity, the range for HZ can where the connection strengths of the be excluded). The fitting result is evaluated non-damaged connections are increased using information criteria and approximate relative to healthy connectivity. Different leave-one-out cross-validation metrics to stroke connectomes are produced in this assess the model's ability in predicting data manner and the resulting FC is then fitted it was not trained for. Model evidence is and compared with empirical data. By used to compare different clinical testing different re-wiring scenarios, the hypotheses and to further inform relative importance of compensatory physicians before therapeutic rewiring in different connections for re- interventions. BVEP is implemented in the establishing healthy FC is studied. For open-source probabilistic programming these BNMs the Kuramoto model language tools Stan (mc-stan.org) and (Kuramoto, 1984) was used, which is a PyMC3 (docs.pymc.io), which use phenomenological model for emergent automatic differentiation to compute group dynamics of weakly coupled gradients of specified model density oscillators that we used previously to link functions for NUTS and ADVI. The SC with oscillatory dynamics in different workflow was tested with ground-truth modalities (Petkoski et al., 2018; Petkoski synthetic data for two patients with and Jirsa, 2019). symptomatic and asymptomatic seizures

20

TVB-ready data found in the original journal articles using this dataset (Aerts et al., 2020, 2018) and Ethics statement. The data analyzed here in the dataset publication on EBRAINS have been reported in previous studies KnowledgeGraph (Aerts et al., 2019). (Aerts et al., 2020, 2018). Both studies were approved by the Ethics Committee at Data availability Ghent University Hospital. All participants received detailed study information and The datasets generated during and/or gave written informed consent before analysed during the current study are study enrolment. available in the EBRAINS KnowledgeGraph This dataset contains MRI derivatives from repository, search.kg.ebrains.eu. patients who were diagnosed with either a KnowledgeGraph is a DOI-minting glioma, developing from glial cells, or a repository where EBRAINS data and meningioma, developing in the meninges, software services are indexed and/or as well as healthy controls. Patients were archived. To share personal data among recruited from Ghent University Hospital researchers subject to national or (Belgium) between May 2015 and October international data protection laws (e.g., 2017 on the day before each patient's the General Data Protection Regulation of tumor surgery. Patients were eligible if the European Union) protected they (1) were at least 18 years old, (2) had environments and workflows as well as a supratentorial meningioma (WHO grade data sharing agreement templates were I or II) or glioma (WHO grade II or III) brain created (Supplementary Note: Data tumor, (3) were able to complete protection in the TVB on EBRAINS cloud). neuropsychological testing, and (4) were medically approved to undergo MRI Code availability investigation. Partners were also asked to participate in the study to constitute a All software codes presented in this article group of control subjects that suffer from have open-source licenses and can be emotional distress comparable to that of freely downloaded from GitHub (Table 1). the patients. Data from 11 glioma patients The EBRAINS database service (mean age 47.5 y, SD = 11.3; 4 females), 14 KnowledgeGraph (search.kg.ebrains.eu) is meningioma patients (mean age 60.4 y, SD a DOI-minting repository where all data = 12.3; 11 females), and 11 healthy and software services are indexed and/or partners (mean age 58.6 y, SD = 10.3; 4 archived. Source codes are deployed as females) was collected. From all cloud services that can be used on participants, three types of MRI scans were ebrains.eu and as standalone download obtained using a Siemens 3T Magnetom versions that can be pulled as container Trio MRI scanner: T1-MPRAGE anatomic images from Docker Hub (Table 1). images, resting-state functional echo- Acknowledgments planar imaging data, a multishell high angular resolution diffusion-weighted MRI We gratefully acknowledge the Swiss scan, and two DWI b = 0 s/mm2 images National Supercomputing Center CSCS for were collected with reversed phase- supporting this project by providing encoding blips for the purpose of computing time through the Interactive correcting susceptibility-induced Computing E-Infrastructure (ICEI) on the distortions. Further information on the Supercomputer PizDaint of the Fenix dataset, preprocessing and analysis can be Infrastructure (projects ich10, ich12) and

21

the Gauss Centre for Supercomputing e.V. 424778381); SPP Computational (www.gauss-centre.eu) for supporting this Connectomics RI 2073/6-1, RI 2073/10-2, project by providing computing time RI 2073/9-1. through the John von Neumann Institute for Computing (NIC) on the GCS Author Contributions Supercomputer JUWELS at Jülich Supercomputing Centre (JSC). We Conceptualization, PR, VJ, JM, KA, JGB, TD, acknowledge support by H2020 Research CCH, ARM, GD, MS; Data curation, DM, LS, and Innovation Action grants Human Brain RP, MS, LZ, OS; Formal analysis, MS, DM, Project SGA2 785907, SGA3 945539, LS, RP, PT; Funding acquisition, PR; VirtualBrainCloud 826421 and ERC 683049; Methodology, MS, DP, PT, MvdV, SD-P, AP, German Research Foundation CRC 1315, WK, OS, MW, LZ, JF, SP, LK, MH, J-FM, BS, CRC 936, CRC-TRR 295, RI 2073/6-1, RI TD, JM, PR; Project administration, PR; 2073/9-1 and RI 2073/10-2; Berlin Institute Resources, PR, VJ, JM, KA, JGB, TD, CMM, of Health & Foundation Charité, Johanna AU, MM, CCH, AF, J-FM, DM, DP, AN; Quandt Excellence Initiative. Several Software, MS, LD, DP, PT, PP, BV, MvdV, computations have also been performed SD-P, AP, WK, OS, MW, LZ, JF, SP, LK, MH, on the HPC for Research cluster of the BS, TD, JM, GD; Supervision, PR, VJ, JM, KA, Berlin Institute of Health. We acknowledge JGB, TD, CMM, MM, CCH, ARM, GD, BCS, the use of Fenix Infrastructure resources, SA, MC, EJ; Visualization, JP, CL, AB, PR; which are partially funded from the Writing – original draft, MS with European Union's Horizon 2020 research contributions from all co-authors. and innovation programme through the ICEI project under the grant agreement No. Competing interests 800858. German Research Foundation SFB 1436 (project ID 425899996); SFB 1315 The authors declare no competing (project ID 327654276); SFB 936 (project ID interests. 178316478); SFB-TRR 295 (project ID

References

Aarts, A.A., Anderson, J.E., Anderson, C.J., Attridge, P.R., Attwood, A., Axt, J., Babel, M., Bahník, Š., Baranski, E., Barnett-Cowan, M., Bartmess, E., Beer, J., Bell, R., Bentley, H., Beyan, L., Binion, G., Borsboom, D., Bosch, A., Bosco, F.A., Bowman, S.D., Brandt, M.J., Braswell, E., Brohmer, H., Brown, B.T., Brown, K., Brüning, J., Calhoun-Sauls, A., Callahan, S.P., Chagnon, E., Chandler, J., Chartier, C.R., Cheung, F., Christopherson, C.D., Cillessen, L., Clay, R., Cleary, H., Cloud, M.D., Conn, M., Cohoon, J., Columbus, S., Cordes, A., Costantini, G., Alvarez, L.D.C., Cremata, E., Crusius, J., DeCoster, J., DeGaetano, M.A., Penna, N.D., Den Bezemer, B., Deserno, M.K., Devitt, O., Dewitte, L., Dobolyi, D.G., Dodson, G.T., Donnellan, M.B., Donohue, R., Dore, R.A., Dorrough, A., Dreber, A., Dugas, M., Dunn, E.W., Easey, K., Eboigbe, S., Eggleston, C., Embley, J., Epskamp, S., Errington, T.M., Estel, V., Farach, F.J., Feather, J., Fedor, A., Fernández-Castilla, B., Fiedler, S., Field, J.G., Fitneva, S.A., Flagan, T., Forest, A.L., Forsell, E., Foster, J.D., Frank, M.C., Frazier, R.S., Fuchs, H., Gable, P., Galak, J., Galliani, E.M., Gampa, A., Garcia, S., Gazarian, D., Gilbert, E., Giner-Sorolla, R., Glöckner, A., Goellner, L., Goh, J.X., Goldberg, R., Goodbourn, P.T., Gordon-McKeon, S., Gorges, B., Gorges, J., Goss, J., Graham, J., Grange, J.A., Gray, J., Hartgerink, C., Hartshorne, J., Hasselman, F., Hayes, T., Heikensten, E., Henninger, F.,

22

Hodsoll, J., Holubar, T., Hoogendoorn, G., Humphries, D.J., Hung, C.O.Y., Immelman, N., Irsik, V.C., Jahn, G., Jäkel, F., Jekel, M., Johannesson, M., Johnson, L.G., Johnson, D.J., Johnson, K.M., Johnston, W.J., Jonas, K., Joy-Gaba, J.A., Kappes, H.B., Kelso, K., Kidwell, M.C., Kim, S.K., Kirkhart, M., Kleinberg, B., Knežević, G., Kolorz, F.M., Kossakowski, J.J., Krause, R.W., Krijnen, J., Kuhlmann, T., Kunkels, Y.K., Kyc, M.M., Lai, C.K., Laique, A., Lakens, D., Lane, K.A., Lassetter, B., Lazarević, L.B., Le Bel, E.P., Lee, K.J., Lee, M., Lemm, K., Levitan, C.A., Lewis, M., Lin, L., Lin, S., Lippold, M., Loureiro, D., Luteijn, I., MacKinnon, S., Mainard, H.N., Marigold, D.C., Martin, D.P., Martinez, T., Masicampo, E.J., Matacotta, J., Mathur, M., May, M., Mechin, N., Mehta, P., Meixner, J., Melinger, A., Miller, J.K., Miller, M., Moore, K., Möschl, M., Motyl, M., Müller, S.M., Munafo, M., Neijenhuijs, K.I., Nervi, T., Nicolas, G., Nilsonne, G., Nosek, B.A., Nuijten, M.B., Olsson, C., Osborne, C., Ostkamp, L., Pavel, M., Penton-Voak, I.S., Perna, O., Pernet, C., Perugini, M., Pipitone, R.N., Pitts, M., Plessow, F., Prenoveau, J.M., Rahal, R.M., Ratliff, K.A., Reinhard, D., Renkewitz, F., Ricker, A.A., Rigney, A., Rivers, A.M., Roebke, M., Rutchick, A.M., Ryan, R.S., Sahin, O., Saide, A., Sandstrom, G.M., Santos, D., Saxe, R., Schlegelmilch, R., Schmidt, K., Scholz, S., Seibel, L., Selterman, D.F., Shaki, S., Simpson, W.B., Sinclair, H.C., Skorinko, J.L.M., Slowik, A., Snyder, J.S., Soderberg, C., Sonnleitner, C., Spencer, N., Spies, J.R., Steegen, S., Stieger, S., Strohminger, N., Sullivan, G.B., Talhelm, T., Tapia, M., Te Dorsthorst, A., Thomae, M., Thomas, S.L., Tio, P., Traets, F., Tsang, S., Tuerlinckx, F., Turchan, P., Valášek, M., Van’t Veer, A.E., Van Aert, R., Van Assen, M., Van Bork, R., Van De Ven, M., Van Den Bergh, D., Van Der Hulst, M., Van Dooren, R., Van Doorn, J., Van Renswoude, D.R., Van Rijn, H., Vanpaemel, W., Echeverría, A.V., Vazquez, M., Velez, N., Vermue, M., Verschoor, M., Vianello, M., Voracek, M., Vuu, G., Wagenmakers, E.J., Weerdmeester, J., Welsh, A., Westgate, E.C., Wissink, J., Wood, M., Woods, A., Wright, E., Wu, S., Zeelenberg, M., Zuni, K., 2015. Estimating the reproducibility of psychological science. Science (80-. ). https://doi.org/10.1126/science.aac4716 Aerts, H., Schirner, M., Dhollander, T., Jeurissen, B., Achten, E., Van Roost, D., Ritter, P., Marinazzo, D., 2020. Modeling brain dynamics after tumor resection using The Virtual Brain. Neuroimage. https://doi.org/10.1016/j.neuroimage.2020.116738 Aerts, H., Schirner, M., Jeurissen, B., Dhollander, T., Van Roost, D., Achten, E., Ritter, P., Marinazzo, D., 2019. TVB time series and connectomes for personalized brain modeling in brain tumor patients [Data set]. https://doi.org/10.25493/1ECN-6SM Aerts, H., Schirner, M., Jeurissen, B., Van Roost, D., Achten, E., Ritter, P., Marinazzo, D., 2018. Modeling brain dynamics in brain tumor patients using the virtual brain. eNeuro. https://doi.org/10.1523/ENEURO.0083-18.2018 Allegra Mascaro, A.L., Falotico, E., Petkoski, S., Pasquini, M., Vannucci, L., Tort-Colet, N., Conti, E., Resta, F., Spalletti, C., Ramalingasetty, S.T., von Arnim, A., Formento, E., Angelidis, E., Blixhavn, C.H., Leergaard, T.B., Caleo, M., Destexhe, A., Ijspeert, A., Micera, S., Laschi, C., Jirsa, V.K., Gewaltig, M.O., Pavone, F.S., 2020. Experimental and Computational Study on Motor Control and Recovery After Stroke: Toward a Constructive Loop Between Experimental and Virtual Embodied Neuroscience. Front. Syst. Neurosci. 14, 1–34. https://doi.org/10.3389/fnsys.2020.00031 Amunts, K., Ebell, C., Muller, J., Telefont, M., Knoll, A., Lippert, T., 2016. The Human Brain Project: Creating a European Research Infrastructure to Decode the Human Brain. Neuron. https://doi.org/10.1016/j.neuron.2016.10.046 Amunts, K., Knoll, A.C., Lippert, T., Pennartz, C.M.A., Ryvlin, P., Destexhe, A., Jirsa, V.K., D’Angelo, E., Bjaalie, J.G., 2019. The Human Brain Project—Synergy between

23

neuroscience, computing, informatics, and brain-inspired technologies. PLoS Biol. https://doi.org/10.1371/journal.pbio.3000344 Amunts, K., Lepage, C., Borgeat, L., Mohlberg, H., Dickscheid, T., Rousseau, M.É., Bludau, S., Bazin, P.L., Lewis, L.B., Oros-Peusquens, A.M., Shah, N.J., Lippert, T., Zilles, K., Evans, A.C., 2013. BigBrain: An ultrahigh-resolution 3D human brain model. Science (80-. ). https://doi.org/10.1126/science.1235381 Amunts, K., Mohlberg, H., Bludau, S., Zilles, K., 2020. Julich-Brain: A 3D probabilistic atlas of the human brain’s cytoarchitecture. Science (80-. ). Axer, M., Strohmer, S., Gräßel, D., Bücker, O., Dohmen, M., Reckfort, J., Zilles, K., Amunts, K., 2016. Estimating fiber orientation distribution functions in 3D-Polarized Light Imaging. Front. Neuroanat. https://doi.org/10.3389/fnana.2016.00040 Beaujoin, J., Palomero-Gallagher, N., Boumezbeur, F., Axer, M., Bernard, J., Poupon, F., Schmitz, D., Mangin, J.F., Poupon, C., 2018. Post-mortem inference of the human hippocampal connectivity and microstructure using ultra-high field diffusion MRI at 11.7 T. Brain Struct. Funct. https://doi.org/10.1007/s00429-018-1617-1 Breakspear, M., 2017. Dynamic models of large-scale brain activity. Nat. Neurosci. https://doi.org/10.1038/nn.4497 Byrge, L., Kennedy, D.P., 2018. High-accuracy individual identification using a “thin slice” of the functional connectome. Netw. Neurosci. https://doi.org/10.1162/netn_a_00068 Costa-Klein, P., Ettinger, U., Schirner, M., Ritter, P., Falkai, P., Koutsouleris, N., Kambeitz, J., 2020. Investigating the Effect of the Neuregulin-1 Genotype on Brain Function Using Brain Network Simulations. Biol. Psychiatry. https://doi.org/10.1016/j.biopsych.2020.02.118 Das, S., Glatard, T., MacIntyre, L.C., Madjar, C., Rogers, C., Rousseau, M.E., Rioux, P., MacFarlane, D., Mohades, Z., Gnanasekaran, R., Makowski, C., Kostopoulos, P., Adalat, R., Khalili-Mahani, N., Niso, G., Moreau, J.T., Evans, A.C., 2016. The MNI data-sharing and processing ecosystem. Neuroimage. https://doi.org/10.1016/j.neuroimage.2015.08.076 Deco, G., Ponce-Alvarez, A., Hagmann, P., Romani, G.L., Mantini, D., Corbetta, M., 2014. How Local Excitation–Inhibition Ratio Impacts the Whole Brain Dynamics. J. Neurosci. 34, 7886–7898. Eppler, J.M., Helias, M., Muller, E., Diesmann, M., Gewaltig, M.O., 2009. PyNEST: A convenient interface to the NEST simulator. Front. Neuroinform. https://doi.org/10.3389/neuro.11.012.2008 Esteban, O., Ciric, R., Finc, K., Blair, R.W., Markiewicz, C.J., Moodie, C.A., Kent, J.D., Goncalves, M., DuPre, E., Gomez, D.E.P., Ye, Z., Salo, T., Valabregue, R., Amlien, I.K., Liem, F., Jacoby, N., Stojić, H., Cieslak, M., Urchs, S., Halchenko, Y.O., Ghosh, S.S., De La Vega, A., Yarkoni, T., Wright, J., Thompson, W.H., Poldrack, R.A., Gorgolewski, K.J., 2020. Analysis of task- based functional MRI data preprocessed with fMRIPrep. Nat. Protoc. https://doi.org/10.1038/s41596-020-0327-3 Esteban, O., Markiewicz, C.J., Blair, R.W., Moodie, C.A., Isik, A.I., Erramuzpe, A., Kent, J.D., Goncalves, M., DuPre, E., Snyder, M., Oya, H., Ghosh, S.S., Wright, J., Durnez, J., Poldrack, R.A., Gorgolewski, K.J., 2019. fMRIPrep: a robust preprocessing pipeline for functional MRI. Nat. Methods. https://doi.org/10.1038/s41592-018-0235-4 Evans, A.C., Janke, A.L., Collins, D.L., Baillet, S., 2012. Brain templates and atlases. Neuroimage. https://doi.org/10.1016/j.neuroimage.2012.01.024 Gewaltig, M.-O., Diesmann, M., 2007. NEST (NEural Simulation Tool). Scholarpedia. https://doi.org/10.4249/scholarpedia.1430

24

Gorgolewski, K.J., Alfaro-Almagro, F., Auer, T., Bellec, P., Capotă, M., Chakravarty, M.M., Churchill, N.W., Cohen, A.L., Craddock, R.C., Devenyi, G.A., Eklund, A., Esteban, O., Flandin, G., Ghosh, S.S., Guntupalli, J.S., Jenkinson, M., Keshavan, A., Kiar, G., Liem, F., Raamana, P.R., Raffelt, D., Steele, C.J., Quirion, P.O., Smith, R., Strother, S.C., Varoquaux, G., Wang, Y., Yarkoni, T., Poldrack, R.A., 2017. BIDS apps: Improving ease of use, accessibility, and reproducibility of neuroimaging data analysis methods. PLoS Comput. Biol. https://doi.org/10.1371/journal.pcbi.1005209 Gorgolewski, K.J., Auer, T., Calhoun, V.D., Craddock, R.C., Das, S., Duff, E.P., Flandin, G., Ghosh, S.S., Glatard, T., Halchenko, Y.O., Handwerker, D.A., Hanke, M., Keator, D., Li, X., Michael, Z., Maumet, C., Nichols, B.N., Nichols, T.E., Pellman, J., Poline, J.B., Rokem, A., Schaefer, G., Sochat, V., Triplett, W., Turner, J.A., Varoquaux, G., Poldrack, R.A., 2016. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci. Data. https://doi.org/10.1038/sdata.2016.44 Gramfort, A., Luessi, M., Larson, E., Engemann, D.A., Strohmeier, D., Brodbeck, C., Goj, R., Jas, M., Brooks, T., Parkkonen, L., Hämäläinen, M., 2013. MEG and EEG data analysis with MNE-Python. Front. Neurosci. https://doi.org/10.3389/fnins.2013.00267 Guevara, M., Román, C., Houenou, J., Duclap, D., Poupon, C., Mangin, J.F., Guevara, P., 2017. Reproducibility of superficial white matter tracts using diffusion-weighted imaging tractography. Neuroimage 147, 703–725. Gymrek, M., McGuire, A.L., Golan, D., Halperin, E., Erlich, Y., 2013. Identifying personal genomes by surname inference. Science (80-. ). https://doi.org/10.1126/science.1229566 Hashemi, M., Vattikonda, A.N., Sip, V., Guye, M., Bartolomei, F., Woodman, M.M., Jirsa, V.K., 2020. The Bayesian Virtual Epileptic Patient: A probabilistic framework designed to infer the spatial map of epileptogenicity in a personalized large-scale brain model of epilepsy spread. Neuroimage. https://doi.org/10.1016/j.neuroimage.2020.116839 Hoffman, M.D., Gelman, A., 2014. The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. J. Mach. Learn. Res. Ioannidis, J.P.A., 2005. Why most published research findings are false. PLoS Med. 2, e124. Jirsa, V.K., Proix, T., Perdikis, D., Woodman, M.M., Wang, H., Bernard, C., Bénar, C., Chauvel, P., Bartolomei, F., Guye, M., Gonzalez-Martinez, J., 2017. The Virtual Epileptic Patient: Individualized whole-brain models of epilepsy spread. Neuroimage. https://doi.org/10.1016/j.neuroimage.2016.04.049 Jirsa, V.K., Stacey, W.C., Quilichini, P.P., Ivanov, A.I., Bernard, C., 2014. On the nature of seizure dynamics. Brain 137, 2210–2230. https://doi.org/10.1093/brain/awu133 Kellner, E., Dhital, B., Kiselev, V.G., Reisert, M., 2016. Gibbs-ringing artifact removal based on local subvoxel-shifts. Magn. Reson. Med. https://doi.org/10.1002/mrm.26054 Kucukelbir, A., Blei, D.M., Gelman, A., Ranganath, R., Tran, D., 2017. Automatic Differentiation Variational Inference. J. Mach. Learn. Res. Kuramoto, Y., 1984. Chemical Oscillations, Waves, and Turbulence, Springer Series in Synergetics. Springer Berlin Heidelberg, Heidelberg. https://doi.org/10.1007/978-3-642- 69689-3 Lam, S.K., Pitrou, A., Seibert, S., 2015. Numba: a LLVM-based Python JIT compiler. Proc. Second Work. LLVM Compil. Infrastruct. HPC - LLVM ’15. https://doi.org/10.1145/2833157.2833162 Liu, T.T., Nalci, A., Falahpour, M., 2017. The global signal in fMRI: Nuisance or Information? Neuroimage. https://doi.org/10.1016/j.neuroimage.2017.02.036

25

Matzke, H., Schirner, M., Vollbrecht, D., Rothmeier, S., Llarena, A., Rojas, R., Triebkorn, P., Domide, L., Mersmann, J., Solodkin, A., 2015. TVB-EduPack—An Interactive Learning and Scripting Platform for The Virtual Brain. Front. Neuroinform. 9. Melozzi, F., Bergmann, E., Harris, J.A., Kahn, I., Jirsa, V.K., Bernard, C., 2019. Individual structural features constrain the mouse functional connectome. Proc. Natl. Acad. Sci. U. S. A. 116, 26961–26969. https://doi.org/10.1073/pnas.1906694116 Melozzi, F., Woodman, M.M., Jirsa, V.K., Bernard, C., 2017. The virtual mouse brain: A computational neuroinformatics platform to study whole mouse brain dynamics. eNeuro. https://doi.org/10.1523/ENEURO.0111-17.2017 Murphy, K., Fox, M.D., 2017. Towards a consensus regarding global signal regression for resting state functional connectivity MRI. Neuroimage. https://doi.org/10.1016/j.neuroimage.2016.11.052 Nudo, R.J., 2013. Recovery after brain injury: mechanisms and principles. Front. Hum. Neurosci. 7, 1–14. https://doi.org/10.3389/fnhum.2013.00887 Oh, S.W., Harris, J.A., Ng, L., Winslow, B., Cain, N., Mihalas, S., Wang, Q., Lau, C., Kuan, L., Henry, A.M., Mortrud, M.T., Ouellette, B., Nguyen, T.N., Sorensen, S.A., Slaughterbeck, C.R., Wakeman, W., Li, Y., Feng, D., Ho, A., Nicholas, E., Hirokawa, K.E., Bohn, P., Joines, K.M., Peng, H., Hawrylycz, M.J., Phillips, J.W., Hohmann, J.G., Wohnoutka, P., Gerfen, C.R., Koch, C., Bernard, A., Dang, C., Jones, A.R., Zeng, H., 2014. A mesoscale connectome of the mouse brain. Nature 508, 207–214. https://doi.org/10.1038/nature13186 Osen, K.K., Imad, J., Wennberg, A.E., Papp, E.A., Leergaard, T.B., 2019. Waxholm Space atlas of the rat brain auditory system: Three-dimensional delineations based on structural and diffusion tensor magnetic resonance imaging. Neuroimage. https://doi.org/10.1016/j.neuroimage.2019.05.016 Palomero-Gallagher, N., Zilles, K., 2019. Cortical layers: Cyto-, myelo-, receptor- and synaptic architecture in human cortical areas. Neuroimage. https://doi.org/10.1016/j.neuroimage.2017.08.035 Papp, E.A., Leergaard, T.B., Calabrese, E., Johnson, G.A., Bjaalie, J.G., 2014. Waxholm Space atlas of the Sprague Dawley rat brain. Neuroimage. https://doi.org/10.1016/j.neuroimage.2014.04.001 Petkoski, S., Jirsa, V.K., 2019. Transmission time delays organize the brain network synchronization. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 377, 20180132. https://doi.org/10.1098/rsta.2018.0132 Petkoski, S., Palva, J.M., Jirsa, V.K., 2018. Phase-lags in large scale brain synchronization: Methodological considerations and in-silico analysis. PLoS Comput. Biol. 14, 1–30. https://doi.org/10.1371/journal.pcbi.1006160 Proix, T., Bartolomei, F., Guye, M., Jirsa, V.K., 2017. Individual brain structure and modelling predict seizure propagation. Brain. https://doi.org/10.1093/brain/awx004 Proix, T., Spiegler, A., Schirner, M., Rothmeier, S., Ritter, P., Jirsa, V.K., 2016. How do parcellation size and short-range connectivity affect dynamics in large-scale brain network models? Neuroimage. Pruim, R.H.R., Mennes, M., van Rooij, D., Llera, A., Buitelaar, J.K., Beckmann, C.F., 2015. ICA- AROMA: A robust ICA-based strategy for removing motion artifacts from fMRI data. Neuroimage. https://doi.org/10.1016/j.neuroimage.2015.02.064 Ritter, P., Schirner, M., McIntosh, A.R., Jirsa, V.K., 2013. The virtual brain integrates computational modeling and multimodal neuroimaging. Brain Connect. 3, 121–145. https://doi.org/10.1089/brain.2012.0120

26

Rocher, L., Hendrickx, J.M., de Montjoye, Y.A., 2019. Estimating the success of re- identifications in incomplete datasets using generative models. Nat. Commun. https://doi.org/10.1038/s41467-019-10933-3 Saggio, M.L., Crisp, D., Scott, J.M., Karoly, P., Kuhlmann, L., Nakatani, M., Murai, T., Dümpelmann, M., Schulze-Bonhage, A., Ikeda, A., Cook, M., Gliske, S. V., Lin, J., Bernard, C., Jirsa, V.K., Stacey, W.C., 2020. A taxonomy of seizure dynamotypes. Elife 9, 1–56. https://doi.org/10.7554/eLife.55632 Sanz-Leon, P., Knock, S.A., Spiegler, A., Jirsa, V.K., 2015. Mathematical framework for large- scale brain network modeling in The Virtual Brain. Neuroimage 111, 385–430. Sanz-Leon, P., Knock, S.A., Woodman, M.M., Domide, L., Mersmann, J., McIntosh, A.R., Jirsa, V.K., 2013. The Virtual Brain: a simulator of primate brain network dynamics. Front. Neuroinform. 7. Schirner, M., McIntosh, A.R., Jirsa, V.K., Deco, G., Ritter, P., 2018. Inferring multi-scale neural mechanisms with brain network modelling. Elife. https://doi.org/10.7554/eLife.28927 Schirner, M., Rothmeier, S., Jirsa, V.K., McIntosh, A.R., Ritter, P., 2015. An automated pipeline for constructing personalized virtual brains from multimodal neuroimaging data. Neuroimage. Shen, K., Bezgin, G., Schirner, M., Ritter, P., Everling, S., McIntosh, A.R., 2019. A macaque connectome for large-scale network simulations in TheVirtualBrain. Sci. data. https://doi.org/10.1038/s41597-019-0129-z Smith, R., Connelly, A., 2019. MRtrix3_connectome: A BIDS Application for quantitative structural connectome construction., in: OHBM. p. W610. Smith, R., Tournier, J.D., Calamante, F., Connelly, A., 2015. SIFT2: Enabling dense quantitative assessment of brain white matter connectivity using streamlines tractography. Neuroimage. https://doi.org/10.1016/j.neuroimage.2015.06.092 Smith, R., Tournier, J.D., Calamante, F., Connelly, A., 2012. Anatomically-constrained tractography: Improved diffusion MRI streamlines tractography through effective use of anatomical information. Neuroimage. https://doi.org/10.1016/j.neuroimage.2012.06.005 Stodden, V., McNutt, M., Bailey, D.H., Deelman, E., Gil, Y., Hanson, B., Heroux, M.A., Ioannidis, J.P.A., Taufer, M., 2016. Enhancing reproducibility for computational methods. Science (80-. ). https://doi.org/10.1126/science.aah6168 Tournier, J.D., Smith, R., Raffelt, D., Tabbara, R., Dhollander, T., Pietsch, M., Christiaens, D., Jeurissen, B., Yeh, C.H., Connelly, A., 2019. MRtrix3: A fast, flexible and open software framework for medical image processing and visualisation. Neuroimage. https://doi.org/10.1016/j.neuroimage.2019.116137 Trebaul, L., Deman, P., Tuyisenge, V., Jedynak, M., Hugues, E., Rudrauf, D., Bhattacharjee, M., Tadel, F., Chanteloup-Foret, B., Saubat, C., Reyes Mejia, G.C., Adam, C., Nica, A., Pail, M., Dubeau, F., Rheims, S., Trébuchon, A., Wang, H., Liu, S., Blauwblomme, T., Garcés, M., De Palma, L., Valentin, A., Metsähonkala, E.L., Petrescu, A.M., Landré, E., Szurhaj, W., Hirsch, E., Valton, L., Rocamora, R., Schulze-Bonhage, A., Mindruta, I., Francione, S., Maillard, L., Taussig, D., Kahane, P., David, O., 2018. Probabilistic functional tractography of the human cortex revisited. Neuroimage. https://doi.org/10.1016/j.neuroimage.2018.07.039 Vella, M., Cannon, R.C., Crook, S., Davison, A.P., Ganapathy, G., Robinson, H.P.C., Angus Silver, R., Gleeson, P., 2014. libNeuroML and PyLEMS: Using Python to combine procedural and declarative modeling approaches in computational neuroscience. Front. Neuroinform.

27

https://doi.org/10.3389/fninf.2014.00038 Veraart, J., Novikov, D.S., Christiaens, D., Ades-aron, B., Sijbers, J., Fieremans, E., 2016. Denoising of diffusion MRI using random matrix theory. Neuroimage. https://doi.org/10.1016/j.neuroimage.2016.08.016 Zimmermann, J., Perry, A., Breakspear, M., Schirner, M., Sachdev, P., Wen, W., Kochan, N.A., Mapstone, M., Ritter, P., McIntosh, A.R., Solodkin, A., 2018. Differentiation of Alzheimer’s disease based on local and global parameters in personalized Virtual Brain models. NeuroImage Clin. https://doi.org/10.1016/j.nicl.2018.04.017

28