TECHNOLOGY REPORT published: 10 January 2019 doi: 10.3389/fnins.2018.00986

Optimizing Data for Modeling Neuronal Responses

Peter Zeidman 1*†, Samira M. Kazan 1†, Nick Todd 2, Nikolaus Weiskopf 1,3, Karl J. Friston 1 and Martina F. Callaghan 1

1 Wellcome Centre for Human Neuroimaging, UCL Institute of Neurology, University College London, London, United Kingdom, 2 Department of Radiology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, United States, 3 Department of Neurophysics, Max Planck Institute for Human Cognition and Brain Sciences, Leipzig, Germany

In this technical note, we address an unresolved challenge in neuroimaging statistics: how to determine which of several datasets is the best for inferring neuronal responses. Comparisons of this kind are important for experimenters when choosing an imaging protocol, and for developers of new acquisition methods. However, the hypothesis that one dataset is better than another cannot be tested using conventional statistics (based on likelihood ratios), as these require the data to be the same under each hypothesis. Here we present Bayesian data comparison (BDC), a principled framework for evaluating the quality of functional imaging data, in terms of the precision with which neuronal connectivity parameters can be estimated and competing models can be disambiguated. For each of several candidate datasets, neuronal responses are modeled using Bayesian (probabilistic) forward models, such as General Linear Models (GLMs) or Dynamic Causal Models (DCMs). Next, the parameters from subject-specific models are summarized at the group level using a Bayesian GLM. A series of measures, which we introduce here, are then used to evaluate each dataset in terms of the precision of (group-level) parameter estimates and the ability of the data to distinguish similar models. To exemplify the approach, we compared four datasets that were acquired in a study evaluating multiband fMRI acquisition schemes, and we used simulations to establish the face validity of the comparison measures. To enable people to reproduce these analyses using their own data and experimental paradigms, we provide general-purpose Matlab code via the SPM software.

Edited by: Daniel Baumgarten, UMIT–Private Universität für Gesundheitswissenschaften, Medizinische Informatik und Technik, Austria

Reviewed by: Alberto Sorrentino, Università di Genova, Italy; Dimitris Pinotsis, City University of London, United Kingdom

*Correspondence: Peter Zeidman, [email protected]

† These authors have contributed equally to this work

Specialty section: This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience

Keywords: DCM, fMRI, PEB, multiband

Received: 29 June 2018; Accepted: 10 December 2018; Published: 10 January 2019

Citation: Zeidman P, Kazan SM, Todd N, Weiskopf N, Friston KJ and Callaghan MF (2019) Optimizing Data for Modeling Neuronal Responses. Front. Neurosci. 12:986. doi: 10.3389/fnins.2018.00986

INTRODUCTION

Hypothesis testing involves comparing the evidence for different models or hypotheses, given some measured data. The key quantity of interest is the likelihood ratio—the probability of observing the data under one model relative to another—written p(y|m1)/p(y|m2) for models m1 and m2 and dataset y. Likelihood ratios are ubiquitous in statistics, forming the basis of the F-test and the Bayes factor in classical and Bayesian statistics, respectively. They are the most powerful test for any given level of significance by the Neyman-Pearson lemma (Neyman and Pearson, 1933). However, the likelihood ratio test assumes that there is only one dataset y, and so cannot be used to compare different datasets. Therefore, an unresolved problem, especially pertinent to neuroimaging, is how to test the hypothesis that one dataset is better than another for making inferences.
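As a toy illustration (not taken from the paper), the likelihood ratio for two competing Gaussian models of the same observation can be computed directly. The model means and the observed value below are invented for the example; the only point is that both models are scored on the same data y.

```python
import math

# Two competing models of a single observation y:
# m1: y ~ N(0, 1);  m2: y ~ N(1, 1).
# Both are evaluated on the SAME data point, as the likelihood-ratio test requires.
def gauss_pdf(y, mu, sigma=1.0):
    return math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

y = 0.2  # one measured data point (invented)
likelihood_ratio = gauss_pdf(y, mu=0.0) / gauss_pdf(y, mu=1.0)  # p(y|m1)/p(y|m2)
log_bayes_factor = math.log(likelihood_ratio)
```

Here the observation sits closer to m1's mean, so the ratio favors m1; the log of this ratio is the log Bayes factor that Bayesian model comparison works with.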

Frontiers in Neuroscience | www.frontiersin.org 1 January 2019 | Volume 12 | Article 986 Zeidman et al. Optimizing Data With DCM

Neuronal activity and circuitry cannot generally be observed directly, but rather are inferred from measured timeseries. In the case of fMRI, the data are mediated by neuro-vascular coupling, the BOLD response and noise. To estimate the underlying neuronal responses, models are specified which formalize the experimenter's understanding of how the data were generated. Hypotheses are then tested by making inferences about model parameters, or by comparing the evidence under different models. For example, an F-test can be performed on the General Linear Model (GLM) using classical statistics, or Bayesian Model Comparison can be used to select between probabilistic models. From the experimenter's perspective, the best dataset provides the most precise estimates of neuronal responses (enabling efficient inference about parameters) and provides the greatest discrimination among competing models (enabling efficient inference about models).

Here, we introduce Bayesian data comparison (BDC)—a set of information measures for evaluating a dataset's ability to support inferences about both parameters and models. While they are generic and can be used with any sort of probabilistic models, we illustrate their application using Dynamic Causal Modeling (DCM) for fMRI (Friston et al., 2003) because it offers several advantages. Compared to the GLM, DCM provides a richer characterization of neuroimaging data, through the use of biophysical models, based on differential equations that separate neuronal and hemodynamic parameters. This means one can evaluate which dataset is best for estimating neuronal parameters specifically. These models also include connections among regions, making DCM the usual approach for inferring effective (directed) connectivity from fMRI data.

By using the same methodology to select among datasets as experimenters use to select between connectivity models, feature selection and hypothesis testing can be brought into alignment for connectivity studies. Moreover, a particularly useful feature of DCM for comparing datasets is that it employs Bayesian statistics. The posterior density over neuronal parameters forms a multivariate normal distribution, providing expected values for each parameter, as well as their covariance. The precision (inverse covariance) quantifies the confidence we place in the parameter estimates, given the data. After establishing that an experimental effect exists, for example by conducting an initial GLM analysis or by reference to previous studies, the precision of the parameters can be used to compare datasets. DCM also enables experimenters to distinguish among models, in terms of which model maximizes the log model evidence ln p(y|m). We cannot use this quantity to compare different datasets, but we can ask which of several datasets provides the most efficient discrimination among models. For these reasons, we used Bayesian methods as the basis for comparing datasets, both to provide estimates of neuronal responses and to distinguish among competing models.

This paper presents a methodology and associated software for evaluating which of several imaging acquisition protocols or data features affords the most sensitive inferences about neural architectures. We illustrate the framework by assessing the quality of fMRI time series acquired from 10 participants, who were each scanned four times with a different multiband acceleration factor. Multiband is an approach for rapid acquisition of fMRI data, in which multiple slices are acquired simultaneously and subsequently unfolded using coil sensitivity information (Larkman et al., 2001; Xu et al., 2013). The rapid sampling enables sources of physiological noise to be separated from sources of experimental variance more efficiently; however, a penalty for this increased temporal resolution is a reduction of the signal-to-noise ratio (SNR) of each image. A detailed analysis of these data is published separately (Todd et al., 2017). We do not seek to draw any novel conclusions about multiband acceleration from this specific dataset, but rather we use it to illustrate a generic approach for comparing datasets. Additionally, we conducted simulations to establish the face validity of the outcome measures, using estimated effect sizes and SNR levels from the empirical multiband data.

The methodology we introduce here offers several novel contributions. First, it provides a sensitive comparison of data by evaluating their ability to reduce uncertainty about neuronal parameters, and to discriminate among competing models. Second, our procedure identifies the best dataset for hypothesis testing at the group level, reflecting the objectives of most cognitive neuroscience studies. Unlike a classical GLM analysis—where only the maximum likelihood estimates from each subject are taken to the group level—the (parametric empirical) Bayesian methods used here take into account the uncertainty of the parameters (the full covariance matrix) when modeling at the group level. Additionally, this methodology provides the necessary tools to evaluate which imaging protocol is optimal for effective connectivity analyses, although we anticipate many questions about data quality will not necessarily relate to connectivity.

We provide a single Matlab function for conducting all the analyses described in this paper, which is available in the SPM (http://www.fil.ion.ucl.ac.uk/spm/) software package (spm_dcm_bdc.m). This function can be used to evaluate any type of imaging protocol, in terms of the precision with which model parameters are estimated and the complexity of the generative models that can be disambiguated.

METHODS

We begin by briefly reprising the theory behind DCM and introducing the set of outcome measures used to evaluate data quality. We then illustrate the analysis pipeline in the context of an exemplar fMRI dataset and evaluate the measures using simulations.

Dynamic Causal Modeling

DCM is a framework for evaluating generative models of time series data. At the most generic level, neural activity in region i of the brain at time t may be modeled by a lumped or neural mass quantity z_i(t). Generally, the experimenter is interested in the neuronal activity of a set of interconnected brain regions, the activity of which can be written as a vector z. The evolution of neural activity over time can then be written as:

ż = f(z, u, θ^(n))    (1)
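To make the dynamics concrete, the sketch below integrates a toy two-region instance of Equation 1, using the bilinear fMRI form given in Equation 3 and a forward Euler scheme. All connection strengths, input timings, and the integration step are invented for illustration; they are not the priors or estimates used in the paper, and the paper's own estimation uses SPM's Matlab implementation rather than this sketch.

```python
import numpy as np

# Toy bilinear neural model: zdot = (A + u_mod(t)*B) z + C u_drv(t)
# All values below are invented for illustration only.
A = np.array([[-0.5, 0.0],
              [ 0.3, -0.5]])   # intrinsic/extrinsic connectivity (Hz)
B = np.array([[ 0.0, 0.0],
              [ 0.2, 0.0]])    # modulation of the region-1 -> region-2 connection
C = np.array([[1.0],
              [0.0]])          # driving input enters region 1 only

dt, T = 0.01, 10.0             # Euler step and simulation length (seconds)
n = int(T / dt)
z = np.zeros(2)                # neural states of the two regions
trace = np.zeros((n, 2))
for i in range(n):
    t = i * dt
    u_drv = 1.0 if 1.0 <= t < 3.0 else 0.0   # boxcar driving input
    u_mod = 1.0 if 2.0 <= t < 3.0 else 0.0   # boxcar modulatory input
    zdot = (A + u_mod * B) @ z + (C * u_drv).ravel()
    z = z + dt * zdot                         # forward Euler step
    trace[i] = z
```

Region 1 is driven directly by the input and region 2 responds via the inter-regional connection, whose strength is transiently increased by the modulatory input; this is the qualitative behavior that the A, B, and C parameters of Equation 3 control.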


Where ż is the derivative of the neural activity with respect to time, u are the time series of experimental or exogenous inputs, and θ^(n) are the neural parameters controlling connectivity within and between regions. Neural activity cannot generally be directly observed. Therefore, the neural model is combined with an observation model g, with haemodynamic/observation parameters θ^(h), specifying how neural activity is transformed into a timeseries, y:

y = g(z, θ^(h)) + ǫ^(1)    (2)

Where ǫ^(1) is zero-mean (I.I.D.) additive Gaussian noise, with log-precision specified by hyperparameters λ = (λ_1 ... λ_r) for each region r. The I.I.D. assumption is licensed by automatic pre-whitening of the timeseries in SPM, prior to the DCM analysis. In practice, it is necessary to augment the vector of neuronal activity with hemodynamic parameters that enter the observation model above. The specific approximations of functions f and g depend on the imaging modality being used, and the procedures described in this paper are not contingent on any specific models. However, to briefly reprise the basic model for fMRI—which we use here for illustrative purposes—f is a Taylor approximation to any nonlinear neuronal dynamics:

ż = (A + Σ_j u_j(t) B^(j)) z + C u(t)    (3)

There are three sets of neural parameters θ^(n) = (A, B, C) and j experimental inputs. Matrix A represents the strength of connections within (i.e., intrinsic) and between (i.e., extrinsic) regions—their effective connectivity. Matrix B^(j) represents the effect of time-varying experimental inputs on each connection (these are referred to as modulatory or condition-specific effects), and the corresponding vector u_j(t) is a timeseries encoding the timing of experimental condition j at time t. Matrix C specifies the influence of each experimental input on each region, which effectively drives the dynamics of the system, given u(t), which is the vector of all experimental inputs at time t.

The hemodynamics (the observation model g above) are modeled with an extended Balloon model (Buxton et al., 2004; Stephan et al., 2007), which comprises a series of differential equations describing the process of neurovascular coupling by which activity ultimately manifests as a BOLD signal change. The majority of the parameters of this hemodynamic model are based on previous empirical measurements; however, three parameters are estimated on a region-specific basis: the transit time τ, the rate of signal decay κ, and the ratio of intra- to extra-vascular signal ǫ^(h). The observation parameters θ^(h) are concatenated with the neural parameters θ^(n) and the hyperparameters λ, and a prior multivariate normal density is defined (see Table 1). The parameters are then estimated using a standard variational Bayes scheme called variational Laplace (Friston et al., 2003; Friston, 2011). This provides a posterior probability density for the parameters, as well as an approximation of the log model evidence (i.e., the negative variational free energy), which scores the quality of the model in terms of its accuracy minus its complexity.

TABLE 1 | Priors on DCM parameters.

Parameter(s)   Prior expectation   Prior variance
A              0                   1/64
B              0                   1
C              0                   1
τ              0                   1/256
κ              0                   1/256
ǫ^(h)          0                   1/256
λ_i            6                   1/128

Group Analyses With PEB

Having fitted a model of neuronal responses to each subject individually (a first level analysis), the parameters can be summarized at the group level (a second level analysis). We used a Bayesian GLM, implemented using the Parametric Empirical Bayes (PEB) framework for DCM (Friston et al., 2016). With N subjects and M connectivity parameters for each subject's DCM, the group-level GLM has the form:

θ = Xβ + ǫ^(2)    (4)

The dimensions of this GLM are illustrated in Figure 1. Vector θ ∈ R^(NM×1) contains the neuronal parameters from all the subjects' DCMs, consisting of all parameters from subject 1, then all parameters from subject 2, etc. The design matrix X ∈ R^(NM×M) was specified as:

X = 1_N ⊗ I_M    (5)

Where 1_N is a column vector of 1s of dimension N and I_M is the identity matrix of dimension M. The Kronecker product replicates the identity matrix vertically for each subject. The use of a Kronecker product at the between-subject level reflects the fact that between-subject effects can be expressed at each and every connection. In this instance, we are just interested in the group mean and therefore there is only one between-subject effect. The resulting matrix X has one column (also called a covariate or regressor) for each connectivity parameter (Figure 1). The regressors are scaled by parameters β ∈ R^(M×1), which are estimated from the data and represent the group average strength of each connection. Finally, the errors ǫ^(2) ∈ R^(NM×1) are modeled as zero-mean additive noise:

ǫ^(2) ~ N(0, Π^(-1))    (6)

Where the precision matrix Π ∈ R^(NM×NM) is estimated from the data. This captures the between-subject variability in the connection strengths, parameterised using a single parameter γ:

Π = I_N ⊗ (Q_0 + e^(-γ) Q_1)    (7)

This is a multi-component covariance model. Q_0 ∈ R^(M×M) is the lower bound on precision and ensures it is a positive number.
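The constructions in Equations 5 and 7 can be sketched with toy sizes (N = 3 subjects, M = 2 parameters each). The covariance components Q_0 and Q_1 below are invented placeholders, not the components used by the PEB scheme itself.

```python
import numpy as np

N, M = 3, 2  # toy sizes: 3 subjects, 2 DCM parameters per subject

# Equation 5: X = 1_N (Kronecker) I_M stacks one identity matrix per subject,
# so each column of X collects the same parameter across all subjects.
X = np.kron(np.ones((N, 1)), np.eye(M))

# Equation 7: precision of the between-subject errors. Q0 is a lower bound on
# precision; Q1 is a prior precision component scaled by exp(-gamma).
# These component matrices are illustrative placeholders only.
Q0 = 1e-3 * np.eye(M)
Q1 = np.eye(M)
gamma = 0.0
Pi = np.kron(np.eye(N), Q0 + np.exp(-gamma) * Q1)  # block-diagonal over subjects
```

With gamma = 0, each diagonal element of the precision equals Q_0 + Q_1, as stated in the text; more negative gamma inflates the exp(-gamma) factor and hence the precision.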


Matrix Q_1 ∈ R^(M×M) is the prior precision. When the parameter γ is zero, the precision is equal to Q_0 + Q_1. More negative values of γ equate to higher precision than the prior, and vice versa. The Kronecker product replicates the precision matrix for each subject, giving rise to the matrix of dimension NM×NM whose leading diagonal is the precision of each DCM parameter θ.

Prior (multivariate normal) densities are specified on the group-level parameters representing the average connection strengths across subjects (β) and the parameter controlling between-subject variability (γ):

β ~ N(μ_β, Σ_β)
γ ~ N(μ_γ, σ_γ)    (8)

Where μ_β ∈ R^(MN×1), Σ_β ∈ R^(MN×MN), μ_γ ∈ R^1, σ_γ ∈ R^1. The prior on β is set to be identical to the prior on the corresponding DCM parameter at the first (individual subject) level. In the example analysis presented here, parameters governing condition-specific neural effects (B), with prior p(B_i) = N(0,1) for each parameter i, were taken to the group level. The prior on γ, the log precision of the between-subject random effects, was set to p(γ) = N(0, 1/16). This expresses the prior belief that the between-subject variance is expected to be much smaller (16 times smaller) than the within-subject effect size.

To summarize, within- and between-subject parameters are estimated using a hierarchical scheme, referred to as Parametric Empirical Bayes (PEB). Model estimation provides the approximate log model evidence (free energy) of the group-level Bayesian GLM—a statistic that enables different models to be compared (see Appendix 1). We take advantage of the free energy below to compare models of group-level data. For full details on the priors, model specification, and estimation procedure in the PEB scheme, see Friston et al. (2016).

Readers familiar with Random Effects Bayesian Model Selection (RFX BMS) (Stephan et al., 2009) will note the distinction with the PEB approach used here. Whereas RFX BMS considers random effects over models, the PEB approach considers random effects at the group level to be expressed at the level of parameters; namely, parametric random effects. This means that uncertainty about parameters at the subject level is conveyed to the group level, licensing the measures described in the next section.

FIGURE 1 | Form of the General Linear Model (GLM). The M parameters from all N subjects' DCMs are arranged in a vector θ. This is modeled using a design matrix X that encodes which DCM parameter is associated with which element of θ. After estimation of the GLM, parameters β are the group average of each DCM connection. Between-subjects variability ǫ^(2) is specified according to Equation 6. In this figure, white = 1 and black = 0. Shading in parameters θ, β, ǫ is for illustrative purposes only.

Outcome Measures

Parameter Certainty
To measure the information gain (or reduction in uncertainty) about parameters due to the data, we take advantage of the Laplace approximation used in the DCM framework, which means that the posterior and prior densities over parameters are Gaussian. In this case, the confidence of the parameter estimates can be quantified using the negative entropy of the posterior multivariate density over interesting parameters:

S_θ = −0.5 ln |2πe Σ_β|    (9)

Equation 9 uses the definition of the negative entropy for the multivariate normal distribution, applied to the neuronal parameter covariance matrix Σ_β. This has units of nats (i.e., natural units) and provides a summary of the precision associated with group-level parameters—such as group means—having properly accounted for measurement noise and random effects at the between-subject level.

Datasets can be compared by treating the entropies as log Bayes factors (detailed in Appendix 1: Bayesian data comparison). In brief, this follows because the log Bayes factor can always be decomposed into two terms—a difference in accuracy minus a difference in complexity. The complexity is the KL-divergence between the posteriors and the priors, and it scores the reduction in uncertainty afforded by the data, in units of nats. Under flat or uninformative priors, the KL-divergence reduces to the negative entropy in Equation 9. A difference in entropy between 1.1 nats and 3 nats is referred to as "positive evidence" that one dataset is better than another, and corresponds to a difference in information gain between e^1.1 ≈ 3 fold and e^3 ≈ 20 fold (Kass and Raftery, 1995). Similarly, a difference in entropy between 3 and 5 nats is referred to as "strong evidence," and differences in entropy beyond this are referred to as "very strong evidence." These same labels apply for the measures below.

Information Gain (Parameters)
The data quality afforded by a particular acquisition scheme can be scored in terms of the relative entropy or KL-divergence between posterior and prior distributions over parameters. This measure of salience is also known as Bayesian surprise, epistemic value or information gain, and can be interpreted as the quantitative reduction of uncertainty after observing the data. In other words, it reflects the complexity of the model (the number of independent parameters) that can be supported by

the data. This takes into account both the posterior expectation and precision of the parameters relative to the priors, whereas the measure in part (a) considered only the posterior precision (relative to uninformative priors).

The KL-divergence for the multivariate normal distribution, between the posterior N_1 and the prior N_0, with means μ_1 and μ_0 and covariances Σ_1 and Σ_0, respectively, is given by:

D_KL(N_1 ‖ N_0) = 0.5 [ tr(Σ_0^(-1) Σ_1) + (μ_0 − μ_1)^T Σ_0^(-1) (μ_0 − μ_1) − k + ln(det Σ_0 / det Σ_1) ]    (10)

Where k = rank(Σ_0). This statistic increases when the posterior mean has moved away from the prior mean, or when the precision of the parameters has increased relative to the precision of the priors. Note that this same quantity also plays an important role in the definition of the free energy approximation to log model evidence, which can be decomposed into accuracy minus complexity, the latter being the KL-divergence between posteriors and priors.

The measures described so far are based on posterior estimates of model parameters. We now turn to the equivalent measure of posterior beliefs about the models per se.

Information Gain (Models)
The quality of the data from a given acquisition scheme can also be assessed in terms of their ability to reduce uncertainty about models. This involves specifying a set of equally plausible, difficult to disambiguate models that vary in the presence or absence of experimental effects or parameters, and evaluating which dataset best enables these models to be distinguished. Bayesian model comparison starts with defining a prior probability distribution over the models, P_0. Here, we assume that all models are equally likely, therefore P_0 = 1/p for each of p models. This prior is combined with the model evidence to provide a posterior distribution over the models, P. To quantify the extent to which the competing models have been distinguished from one another, we measure the information gain from the prior P_0 to the posterior P. This is given by the KL-divergence used above for the parameters. After describing how we specified these models, we provide an example of this KL-divergence in practice.

Typically with Bayesian inference (e.g., DCM), the experimenter embodies each hypothesis as a model and compares the evidence for different models. In the example dataset presented here, we did not have strong hypotheses about the experimental effects, and so we adopted the following procedure. We first estimated a "full" group-level Bayesian GLM with all relevant free parameters from the subjects' DCMs. Next, we identified a set of reduced GLMs that only differed slightly in log evidence (i.e., they were difficult to discriminate). To do this, we eliminated one connection or parameter (by fixing its prior variance to zero) and retained the model if the change in log evidence was > −3. This corresponds to a log odds ratio of approximately one in e^3 ≈ 20, meaning that the model was retained if it was no more than 20 times less probable than the full model. We repeated this procedure by eliminating another parameter (with replacement), ultimately obtaining the final model space. This procedure was performed rapidly by using Bayesian Model Reduction (BMR), which analytically computes the log evidence of reduced models from a full model (Friston et al., 2016).

Having identified a set of plausible but difficult to disambiguate models (GLMs) for a given dataset, we then calculated the posterior probability of each model. Under flat priors, this is simply the softmax function of the log model evidence, as approximated by the free energy (see Appendix 1). We then computed the KL-divergence between the posterior and prior model probabilities, which is defined for discrete probability distributions as:

D_KL(P ‖ P_0) = Σ_{i=1...k} P_i ln P_i + ln k    (11)

The behavior of the KL-divergence is illustrated in Figure 2, when comparing k = 10 simulated models. When one model has a posterior probability approaching one, and all other models have probability approaching zero, the KL-divergence is maximized and has the value D_KL = ln k = 2.30 (Figure 2A). As the probability density is shared between more models, so the KL-divergence is reduced (Figures 2B,C). It reaches its minimum value of zero when all models are equally likely, meaning that no information has been gained by performing the model comparison (Figure 2D).

Summary of Measures and Analysis Pipeline
A key contribution of the measures introduced in this paper is the characterization of information gain in terms of both the parameters, and the models that entail those parameters. Together they provide a principled means by which to characterize the optimality of a given scheme for acquiring data. These are intended for use where the presence or absence of experimental effects in the data is already known—for example, based on previous studies and/or the results of an initial analysis collapsed across datasets (e.g., a mass-univariate GLM analysis).

We now suggest a pipeline for applying these measures to neuroimaging data. Step 1 provides estimates of neuronal parameters from each dataset. Steps 2 and 3 use the estimated parameters from all datasets to automatically identify a suitable model architecture (and could be skipped if the experimenter has strong priors as to what the model architecture should be). Steps 4 and 5 provide estimates of the group-level parameters for each dataset and compare them using the measures described above. For convenience, steps 2–5 of this procedure can be run with a single Matlab function implemented in SPM (spm_dcm_bdc.m):

1) Model each subject's data using a Bayesian model (e.g., DCM). The objective is to obtain posterior estimates of neuronal parameters from each dataset. These estimates take the form of a multivariate probability density for each subject and dataset.
2) Identify the optimal group-level model structure. This step identifies a parsimonious model architecture, which can be
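Both KL measures are simple to implement. In the sketch below, the parameter prior/posterior and the model log evidences are invented toy values; the two model-space extremes are chosen to reproduce the endpoints shown in Figure 2 (ln 10 ≈ 2.30 nats, and zero).

```python
import numpy as np

# Equation 10: KL-divergence between two multivariate normals,
# posterior N1 = N(mu1, S1) vs prior N0 = N(mu0, S0).
def gaussian_kl(mu1, S1, mu0, S0):
    k = S0.shape[0]
    S0_inv = np.linalg.inv(S0)
    d = mu0 - mu1
    return 0.5 * (np.trace(S0_inv @ S1) + d @ S0_inv @ d - k
                  + np.log(np.linalg.det(S0) / np.linalg.det(S1)))

# Equation 11: KL-divergence from a flat prior over k models to a
# posterior P over models (0 * ln 0 is taken as 0).
def model_info_gain(P):
    P = np.asarray(P, dtype=float)
    nz = P > 0
    return float(np.sum(P[nz] * np.log(P[nz])) + np.log(P.size))

def softmax(log_evidence):
    # Posterior model probabilities under flat priors, from (approximate)
    # log evidences; subtracting the max gives numerical stability.
    le = np.asarray(log_evidence, dtype=float)
    w = np.exp(le - le.max())
    return w / w.sum()

# Toy prior and posterior over two parameters (invented values):
mu0, S0 = np.zeros(2), np.eye(2)
mu1, S1 = np.array([0.5, -0.2]), 0.1 * np.eye(2)
info_gain_params = gaussian_kl(mu1, S1, mu0, S0)

# The two extremes of Figure 2, for k = 10 models:
one_winner = np.eye(10)[0]             # all posterior mass on one model
flat = np.full(10, 0.1)                # all models equally likely
info_max = model_info_gain(one_winner) # ln 10, about 2.30 nats
info_min = model_info_gain(flat)       # 0 nats
```

The `softmax` helper illustrates the step described in the text: under flat priors, posterior model probabilities are the softmax of the free-energy approximations to the log evidences.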


FIGURE 2 | Illustration of the KL-divergence in four simulated model comparisons. The bars show the posterior probabilities of 10 models and the titles give the computed KL-divergence from the priors. (A) Model 1 has posterior probability close to 1. The KL-divergence is at its maximum of ln 10 = 2.3. (B) The probability density is shared between models 1 and 2, reducing the KL-divergence. (C) The probability density shared between 3 models. (D) The KL-divergence is minimized when all models are equally likely, meaning no information has been gained relative to the prior.

used to model all datasets (in the absence of strong hypotheses about the presence and absence of experimental effects). A Bayesian GLM is specified and fitted to the neuronal parameters from all subjects and datasets. To avoid bias, the GLM is not informed that the data derive from multiple datasets. The estimated GLM parameters represent the average connectivity across all datasets. This GLM is pruned to remove any redundant parameters (e.g., relating to the responses of specific brain regions) that do not contribute to the model evidence, using Bayesian Model Reduction (Friston et al., 2016). This gives the optimal reduced model structure at the group level, agnostic to the dataset.
3) Re-estimate each subject's individual DCM, having switched off any parameters that were pruned in step 2. This step equips each subject with a parsimonious model to provide estimates of neuronal responses. This is known as "empirical Bayes," as the priors for the individual subjects have been updated based on the group-level data. Again, this is performed analytically using Bayesian Model Reduction.
4) Fit separate Bayesian GLMs to the neuronal parameters of each dataset. This summarizes the estimated neuronal responses for each dataset, taking into account both the expected values and uncertainty of each subject's parameters.
5) Apply the measures outlined above to compare the quality or efficiency of inferences from each dataset's Bayesian GLM—in terms of parameters or models.

Collectively, the outcome measures that result from this procedure constitute an assessment of the goodness of different datasets in terms of inferences about connection parameters and models. Next, we provide an illustrative example using empirical data from an experiment comparing different fMRI multiband acceleration factors.

Multiband Example
For this example, we use fMRI data from a previously published study that evaluated the effect of multiband acceleration on fMRI data (Todd et al., 2017). We will briefly reprise the objectives of that study. For a given effect size of interest, the statistical power of an fMRI experiment can be improved by acquiring a greater number of sample points (i.e., increasing the efficiency of the design) or by reducing measurement noise. This has the potential to enable more precise parameter estimates and provide support for more complex models of how the data were generated. Acquiring data with high temporal resolution both increases the number of samples per unit time and allows physiologically-driven fluctuations in the time series to be more fully sampled and subsequently removed or separated from the task-related BOLD signal (Todd et al., 2017). One approach to achieving rapid acquisitions is the use of the multiband or simultaneous multi-slice acquisition technique (Setsompop et al., 2012; Xu et al., 2013), in which multiple slices are acquired simultaneously and subsequently unfolded using coil sensitivity information (Setsompop et al., 2012; Cauley et al., 2014). The penalty for the increased temporal resolution is a reduction of the signal-to-noise ratio (SNR) of each image. This is caused by increased g-factor penalties, dependent on the coil sensitivity profiles, and reduced steady-state magnetization arising from the shorter repetition time (TR) and concomitant reduction in excitation flip angle. In addition, a shorter TR can be expected to increase the degree of temporal auto-correlation in the time series (Corbin et al., 2018). This raises the question of which MB acceleration factor offers the best trade-off between acquisition speed and image quality.

We do not seek to resolve the question of which multiband factor is optimal in general. Furthermore, there are many potential mechanisms by which multiband acquisitions could improve or limit data quality, including better sampling of physiological noise, and increasing the number of samples in the data. Rather than trying to address these questions here, we instead use these data to exemplify comparing datasets. In these data, physiologically-driven fluctuations—that are better sampled with higher multiband acceleration factors due to the higher Nyquist sampling frequency—were removed from the data by filtering. Subsequently, the data were down-sampled so as to have equivalent numbers of samples across multiband factors,

Frontiers in Neuroscience | www.frontiersin.org 6 January 2019 | Volume 12 | Article 986 Zeidman et al. Optimizing Data With DCM as described in Todd et al. (2017). The framework presented had an effective TR of 2.8s (Corbin et al., 2018). The contrast of here could be used to test the datasets under many different scenes>objects was computed and used to select brain regions acquisition and pre-processing procedures. for the DCM analysis. Data Acquisition DCM Specification Ten healthy volunteers were scanned with local ethics committee We selected seven brain regions (Figure 3A) from the SPM approval on a Siemens 3T Tim Trio scanner. For each volunteer, analysis which are part of a “core network” that responds more fMRI task data with 3 mm isotropic resolution were acquired to viewing images of scenes rather than images of isolated objects four times with a MB factor of either 1, 2, 4, or 8 using (Zeidman et al., 2015). These regions were: Early Occipital cortex the gradient echo EPI sequence from the Center for Magnetic (OCC), left Lateral Occipital cortex (lLOC), left Parahippocampal Resonance Research (R012 for VB17A, https://www.cmrr.umn. cortex (lPHC), left Retrosplenial cortex (lRSC), right Lateral edu/multiband/). The TR was 2,800, 1,400, 700, and 350ms for Occipital cortex (rLOC), right Parahippocampal cortex (rPHC), MB factor 1, 2, 4, and 8, respectively, resulting in 155, 310, and right Retrosplenial cortex (rRSC). 620, and 1,240 volumes, respectively, leading to a seven and a We extracted timeseries from each of these regions as follows. half min acquisition time per run. The data were acquired with The group-level activation peak (collapsed across multiband the blipped-CAIPI scheme (Setsompop et al., 2012), without in- factor to prevent bias) was identified from the contrast of plane acceleration, and the leak-block kernel option for image scenes>objects (thresholded at p < 0.05 FWE-corrected) using reconstruction was enabled (Cauley et al., 2014). 
a one-way ANOVA as implemented in SPM. Subsequently, a spherical region of interest (ROI) with 8mm FWHM was fMRI Task centered on the peaks at the individual level that were closest to The fMRI task consisted of passive viewing of images, with image the group-level peaks. This size of the ROI sphere was arbitrary stimuli presented in 8 s blocks. Each block consisted of four and provided a suitable trade-off between including a reasonable images of naturalistic scenes or four images of single isolated number of voxels and not crossing into neighboring anatomical objects, displayed successively for 2 s each. There were two areas. Voxels within each sphere surviving at least p < 0.001 experimental factors: stimulus type (images of scenes or objects) uncorrected at the single-subject level were summarized by their and novelty (2, 3, or 4 novel images per block, with the remainder first principal eigenvariate, which formed the data feature used repeated). This paradigm has previously been shown to induce for subsequent DCM analysis. activation in a well-established network of brain regions that The neuronal model for each subject’s DCM was specified respond to perceiving, imagining or recalling scenes (Spreng as a fully connected network (Figure 3B). Dynamics within the et al., 2009; Zeidman et al., 2015). network were driven by all trials, modeled as boxcar functions, driving occipital cortex (the circle labeled a in Figure 3B). The Preprocessing experimental manipulations (scene stimuli, object stimuli, and All data were processed in SPM (Ashburner and Friston, 2005), stimulus novelty) were modeled as modulating each region’s version 12. This comprised the usual image realignment, co- self-inhibition (colored arrows in Figure 3B). 
These parameters registration to a T1-weighted anatomical image and spatial control the sensitivity of each region to inputs from the rest of normalization to the Montreal Neurological Institute (MNI) the network, in each experimental condition. Neurobiologically, template space using the unified segmentation algorithm, and they serve as simple proxies for context-specific changes in the smoothing with a 6 × 6 × 6 mm full width at half maximum excitatory-inhibitory balance of pyramidal cells and inhibitory (FWHM) Gaussian kernel. interneurons within each region (Bastos et al., 2012). These As described in Todd et al. (2016), all data were filtered using parameters, which form the B-matrix in the DCM neuronal a 6th-order low pass Butterworth filter with a frequency cut-off model (Equation 3), are usually the most interesting from the of 0.18 Hz (corresponding to the Nyquist frequency of the MB experimenters’ perspective—and we focused on these parameters = 1 data). This removed all frequency components between the for our analyses. cut-off frequency and the corresponding Nyquist frequency of the particular MB factor. In order to ensure equal numbers of Simulation samples per data set—regardless of MB factor used—the time We also conducted simulations to confirm the face validity of the series were decimated by down-sampling all datasets to the TR Bayesian data comparison approach presented here. Specifically, of the MB1 (TR = 2.8s) data. we wanted to ensure that subject-level differences in the precision After initial processing and filtering, all data sets were of parameters across datasets were properly reflected in the modeled with a general linear model (GLM), with a high group-level PEB parameters and the ensuing outcome measures. pass filter (cut-off period = 128 s) and regressors for motion, Note that these simulations were not intended to recapitulate and physiological effects. 
In addition to these confounding the properties or behavior of multiband fMRI data. Rather, effects, the stimulation blocks were modeled with boxcar our intention was to conduct a simple and reasonably generic functions convolved with the canonical hemodynamic response assessment of the outcome measures under varying levels of SNR. function. Temporal autocorrelations were accounted for with an We generated simulated neuroimaging data for 100 virtual autoregressive AR(1) model plus white noise. This was deemed experiments, each consisting of 100 datasets with differing SNR sufficient given that after filtering and decimation each time series levels, and applied the Bayesian data comparison procedure to

Frontiers in Neuroscience | www.frontiersin.org 7 January 2019 | Volume 12 | Article 986 Zeidman et al. Optimizing Data With DCM
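The low-pass filtering and decimation applied during preprocessing (see Preprocessing above) can be sketched as follows. This is an illustrative Python stand-in only: an FFT brick-wall filter replaces the 6th-order Butterworth of the original (SPM/Matlab) pipeline, while the 0.18 Hz cut-off and the TRs are taken from the text.

```python
import numpy as np

def lowpass_and_decimate(ts, tr, target_tr=2.8, cutoff_hz=0.18):
    """Remove frequencies above the cut-off, then down-sample to the MB1 TR.
    An FFT brick-wall filter stands in for the paper's 6th-order Butterworth."""
    spectrum = np.fft.rfft(ts)
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    spectrum[freqs > cutoff_hz] = 0.0          # zero all components above 0.18 Hz
    filtered = np.fft.irfft(spectrum, n=len(ts))
    step = int(round(target_tr / tr))          # e.g. MB8: 2.8 s / 0.35 s = 8
    return filtered[::step]

# An MB8-like series: 1,240 volumes at TR = 0.35 s reduce to 155 samples
t = np.arange(1240) * 0.35
ts = np.sin(2 * np.pi * 0.05 * t)              # slow (0.05 Hz) signal, kept by the filter
out = lowpass_and_decimate(ts, tr=0.35)
print(out.shape)                               # (155,)
```

After decimation, every MB factor yields the same number of samples (155) at the same effective TR, matching the equal-samples constraint described above.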

FIGURE 3 | DCM specification. (A) Locations of the seven brain regions included in the DCM, projected onto a canonical brain mesh. (B) Structure of the DCM estimated for each subject. The circles are brain regions, which were fully connected to one another (gray lines). The self-connection parameters (black arrows), which control each region's sensitivity to input from other regions, were modulated by each of the three experimental manipulations (colored arrows). (C) The optimal group-level (GLM) model after pruning away any parameters that did not contribute to the free energy. The numbered parameters correspond to the bar charts in Figure 4. Key: a, early visual cortex; b, left lateral occipital cortex; c, right lateral occipital cortex; d, left parahippocampal cortex; e, right parahippocampal cortex; f, left retrosplenial cortex; g, right retrosplenial cortex.

These data were generated and modeled using General Linear Models (GLMs), which enabled precise control over the parameters and their covariance, as well as facilitating the inversion of large numbers of models in reasonable time (minutes on a desktop PC). For each simulation i = 1...100, subject j = 1...16, and level of observation noise k = 1...100, we specified a GLM:

y^(ijk) = X β^(ij) + ε^(ijk)    (12)

There were three regressors in the design matrix X, matching the empirical multiband fMRI experiment reported in the previous section (corresponding to the scenes, objects, and novelty experimental conditions). For each virtual subject, the three corresponding parameters in the vector β^(ij) were sampled from a multivariate normal distribution:

β^(ij) ~ N(β̄, Σ_β),  β̄ = [0.89, 0.89, 0.45]^T,  Σ_β = I_3 σ_B^2    (13)

where I_3 is the identity matrix of dimension three. The vector β̄ contained the "ground truth" parameters, chosen based on the empirical analysis (the first two parameters were set to 0.89, the mean of the scene and object effects on occipital cortex, and the third parameter was set to half this value, to provide a smaller but still detectable effect). The between-subject variance σ_B^2 was set to 0.18, computed from the empirical PEB analyses and averaged over the multiband datasets. Finally, we added I.I.D. observation noise ε^(ijk) with a different level of variance in each dataset, chosen to achieve SNRs ranging between 0.003 (most noisy) and 0.5 (least noisy) in steps of 0.005. Here, SNR was defined as the ratio of the variance of the modeled signal to the variance of the noise (residuals); the median SNR in the empirical data was 0.5 across subjects and datasets.

Having generated the simulated data, we then fitted GLMs using a variational Bayesian scheme (Friston et al., 2007) with priors on the parameters set to:

p(β^(ij)) = N(0, I_3 σ_W^2)

where the within-subject prior variance was set to σ_W^2 = 1, to match the DCM parameters of interest in the empirical analysis above. To compute the information gain over models (the third outcome measure), we defined a model space with seven permutations of the parameters switched on or off (i.e., we specified a model for every possible permutation of the parameters, excluding the model with all three parameters switched off).

RESULTS

We followed the analysis pipeline described above (see Summary of measures and analysis) to compare data acquired under four levels of multiband acceleration. The group-level results below and the associated figures were generated using the Matlab function spm_dcm_bdc.m.
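The generative scheme of Equations (12) and (13) above can be sketched in numpy as follows. The design matrix X here is a random stand-in for the empirical design, and the variational Bayesian inversion itself is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

n_scans, n_subjects = 155, 16
beta_bar = np.array([0.89, 0.89, 0.45])     # group-level means (Equation 13)
sigma_b2 = 0.18                              # between-subject variance
X = rng.standard_normal((n_scans, 3))        # stand-in design matrix (3 regressors)

def simulate_dataset(snr):
    """Generate y = X @ beta + noise for each subject (Equation 12), with the
    noise variance set so that var(signal) / var(noise) equals the target SNR."""
    ys = []
    for _ in range(n_subjects):
        beta = rng.multivariate_normal(beta_bar, sigma_b2 * np.eye(3))
        signal = X @ beta
        noise_sd = np.sqrt(signal.var() / snr)
        ys.append(signal + noise_sd * rng.standard_normal(n_scans))
    return np.array(ys)

# 100 noise levels: SNR from 0.003 (most noisy) to ~0.5 (least noisy)
snr_levels = 0.003 + 0.005 * np.arange(100)
datasets = [simulate_dataset(snr) for snr in snr_levels]
```

Each simulated dataset is a subjects-by-scans array; repeating this 100 times yields the 100 virtual experiments described in the text.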


MB Leakage/Aliasing Investigation

While not the focus of this paper, we conducted an analysis to ensure that our DCM results were not influenced by a potential image acquisition confound. As with any accelerated imaging technique, the multiband acquisition scheme is vulnerable to aliased signals being unfolded incorrectly. This is important because activation aliasing between DCM regions of interest could lead to artificial correlations between regions (Todd et al., 2016). This analysis, detailed in the Supplementary Materials, confirmed that the aliased location of any given region of interest used in the DCM analysis did not overlap with any other region of interest.

Identifying the Optimal Group-Level Model

We obtained estimates of each subject's neuronal responses by fitting a DCM to their data, separately for each multiband factor (the structure of this DCM is illustrated in Figure 3B). Then we estimated a single group-level Bayesian GLM of the neuronal parameters and pruned any parameters that did not contribute to the model evidence (by fixing them at their prior expectations with zero mean and zero variance). This gave the optimal group-level model, the parameters of which are illustrated in Figure 3C. The main conditions of interest were scene and object stimuli. Redundant modulatory effects of object stimuli were pruned from bilateral PHC, while the effect of scene stimuli was pruned from right PHC only. Redundant effects of stimulus novelty were pruned from all regions. This was not surprising, as the experimental design was not optimized for this contrast—and the regions of interest were not selected on the basis of tests for novelty effects.

Modeling Each Dataset

Having identified a single group-level model architecture across all datasets (Figure 3C), we next updated each subject's DCMs to use this reduced architecture (by setting their priors to match the group-level posteriors and obtaining updated estimates of the DCM parameters). We then estimated a group-level GLM for each dataset. The parameters of these four group-level GLMs are illustrated in Figures 4A–D. The numbered parameters, which correspond to those in Figure 3C, describe the change in sensitivity of each region to its inputs. More positive values signify greater inhibition due to the task and more negative values signify disinhibition (excitation) due to the task. The results were largely consistent across multiband factors, with scene and object stimuli exciting most regions relative to baseline. Interestingly, modulation of early visual cortex by scenes and objects (parameters 1 and 7) had the largest effect sizes, and so contributed the most to explaining the network-wide difference between scene and object stimuli.

Figure 4E shows the precision (inverse variance) of each parameter from Figures 4A–D. Each group of bars relates to a neuronal parameter, and each of the four bars relates to one of the four datasets (i.e., the four multiband factors). It is immediately apparent that all parameters (with the exception of parameter 7) achieved the highest precision with dataset MB4 (i.e., multiband acceleration factor 4). However, examining each parameter separately in this way is limited, because we cannot see the covariance between the parameters. The covariance is important in determining the confidence with which we can make inferences about parameters or models. Next, we apply our novel series of measures to these data, which provide a simple summary of the qualities of each dataset while taking into account the full parameter covariance.

Comparing Datasets

In agreement with the analysis above, the dataset with multiband acceleration factor 4 (MB4) gave neuronal parameter estimates with the greatest precision or certainty (Figure 5A), followed by MB1 and MB2; MB8 had the least precision. The difference between the best (MB4) and worst (MB8) performing datasets was 1.64 nats, equivalent to an 84% probability of a difference (calculated by applying the softmax function to the plotted values). This may be classed as "positive evidence" for MB4 over MB8 (Kass and Raftery, 1995); however, the evidence was not strong enough to confidently claim that MB4 was better than MB1 or MB2.

The information gain over parameters (see Outcome measures) is the extent to which the parameters were informed by the data. It reflects the number of independent parameters in the model (its complexity) that the data can accommodate. The best dataset was MB4 (Figure 5B), followed closely by MB2 and then MB1 and MB8. The difference between the best (MB4) and worst (MB8) datasets was 1.82 nats, or an 86.06% probability of a difference (positive evidence). There was also positive evidence that MB4 was better than MB1 (1.31 nats = 78.75%), but MB4 could not be distinguished from MB2 (0.33 nats = 58.18%). Thus, not only were the parameters most precise in dataset MB4 (Figure 5A), but they also gained the most information from the data, relative to the information available in the priors. This effect was most pronounced in comparison to MB8 and, to a lesser extent, in comparison to MB1.

Next we computed the information gain over models (see Outcome measures, part c), which quantified the ability of the datasets to discriminate between similar models. Whereas the previous measures were relative to the worst performing dataset, this measure was relative to the prior that all models were equally likely (zero nats). The automated procedure described in the Methods section identified eight similar candidate models that differed only in their priors (i.e., which parameters were switched on or off). Because there were eight models, the maximum possible information gain over models was ln 8 = 2.08 nats. We found that MB4 afforded the best discrimination between models (Figure 5C), with an information gain of 1.34 nats relative to the prior that all models were equally likely (positive evidence, 79% probability). The other three datasets provided poorer discriminability: MB8 with 0.67 nats, MB2 with 0.62 nats, and MB1 with 0.34 nats.

To summarize, all three measures favored the dataset with multiband acceleration factor 4 (MB4). However, the magnitudes of these differences were generally not substantial, with "positive evidence" rather than "strong evidence" under all measures. MB4 consistently fared better than MB8—with positive evidence that it provided more confident parameter estimates (Figure 5A) and greater information gain (Figure 5B).
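The conversion from a difference in nats to a probability used above applies a softmax to the pair of plotted values. A sketch, using differences reported in the text:

```python
import math

def prob_best(diff_nats):
    """Posterior probability that one dataset is better than another, given the
    difference between their scores in nats (softmax over the two values)."""
    return math.exp(diff_nats) / (math.exp(diff_nats) + 1.0)

print(round(prob_best(1.64), 2))   # 0.84 -> MB4 vs. MB8, precision (Figure 5A)
print(round(prob_best(0.33), 2))   # 0.58 -> MB4 vs. MB2, information gain
```

A difference of zero nats gives 50% (no evidence either way), and the probability saturates toward 100% as the difference grows.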


FIGURE 4 | Parameters of the group-level General Linear Model fitted to each dataset. (A–D) Posterior estimates of each parameter from each dataset. The bars correspond to the parameters labeled in Figure 3C and, for clarity, are divided into regional effects of scene stimuli (solid red) and of object stimuli (chequered blue). These parameters scale the prior self-connection of each region and have no units. Positive values indicate greater inhibition due to the experimental condition and negative values indicate disinhibition (excitation). Pink error bars indicate 90% confidence intervals. MBx, multiband acceleration factor x. (E) The precision of each parameter, i.e., the inverse of the variance used to form the pink error bars in (A–D). Parameters 1–6 relate to the effects of scene stimuli (S) and parameters 7–11 relate to the effects of object stimuli (O). Each group of four bars denotes the four datasets in the order MB = 1, 2, 4, and 8 from left to right.

There was also positive evidence that MB4 offered greater information gain than MB1 (Figure 5B). Finally, MB4 supported greater information gain over models than any other dataset (Figure 5C). Given these results, if we were to conduct this same experiment with a larger sample, we would select multiband acceleration factor 4 (MB4) as our preferred acquisition protocol.

Simulation Results

To validate the software implementing the outcome measures, we simulated 100 experiments, each of which compared 100 datasets with varying levels of SNR. As expected, increasing SNR was accompanied by an increase in the certainty (precision) of the group-level parameters (Figure 6A). This showed an initial rise and then plateaued, reaching 5.62 nats (very strong evidence) for the dataset with the highest SNR (dataset 100) compared to the dataset with the lowest SNR (dataset 1). The information


FIGURE 5 | Proposed measures for comparing datasets, applied to empirical data. (A) The negative entropy of the neuronal parameters of each dataset, relative to the worst dataset (MB8), which is set to zero. (B) The information gain (KL-divergence) between the estimated neuronal parameters and the priors, relative to the worst dataset (MB8). (C) The information gain (KL-divergence) from the prior belief that all models were equally likely to the posterior probability over models. In each plot, the bars relate to the four datasets which differed in their multiband (MB) acceleration factor: MB1, MB2, MB4, and MB8.

FIGURE 6 | Bayesian data comparison of 100 simulated datasets. The datasets are ordered by increasing level of SNR at the individual-subject level (see Methods). (A) The negative entropy of the neuronal parameters of each dataset, relative to the worst dataset. (B) The information gain (KL-divergence) between the estimated parameters and the priors, relative to the worst dataset. (C) The information gain (KL-divergence) between the estimated probability of each model and the prior belief that all models were equally likely. In each plot, the line and dots indicate the mean across 100 simulations, and the shaded error indicates the 90% confidence interval across simulations.

gain over parameters, which quantified the KL-divergence from the priors to the posteriors relative to the first dataset, was very similar (Figure 6B). Finally, we compared the datasets in terms of their ability to distinguish seven models, one of which matched the model that generated the data. This measure could range from zero nats (the prior that all seven models were equally likely) to ln(7) = 1.95 nats (a single model having a posterior probability of 100%). We found that the information gain increased with increasing SNR, to a ceiling of 1.91 nats (Figure 6C). Inspecting the results revealed that in the datasets with the lowest SNR, the probability mass was shared between model one (the "full" model which generated the data) and model three, in which the parameter quantifying the small novelty effect was disabled (fixed at zero). As SNR increased, model one was correctly selected with increasing confidence.

To summarize, these simulations provided a basic check that the within-subject parameters were conveyed to the group level and were captured by the outcome measures proposed in this work. All three outcome measures showed a monotonic increase with SNR, which consisted of a large initial increase followed by diminishing returns as SNR further increased. Accompanying this paper, we provide a Matlab script for performing these simulations, which can be used to test the impact of differing effect sizes or levels of between-subject variability, under any choice of (Bayesian) forward model.

DISCUSSION

This paper introduced Bayesian data comparison, a systematic approach for identifying the optimal data features for inferring neuronal architectures. We proposed a set of measures, based on established Bayesian models of neuroimaging timeseries, in order to compare datasets for two types of analyses: inference about parameters and inference about models. We exemplified these measures using data from a published experiment, which investigated the performance of multiband fMRI in a cohort of 10 healthy volunteers.
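The information gain over models used in these simulations is the KL-divergence from the uniform prior over models to the posterior model probabilities, with a ceiling of ln(7) ≈ 1.95 nats for seven models. A sketch (the example posteriors are hypothetical):

```python
import math

def model_information_gain(posterior):
    """KL-divergence (nats) from a uniform prior over models to the posterior
    model probabilities. Zero = nothing learnt; ln(n) = one model is certain."""
    n = len(posterior)
    return sum(p * math.log(p * n) for p in posterior if p > 0)

flat = [1 / 7] * 7                 # prior: all seven models equally likely
confident = [0.94] + [0.01] * 6    # posterior concentrated on one model
print(round(model_information_gain(confident), 2))
```

A flat posterior returns zero nats, and a posterior that is certain of one model returns the ceiling ln(7), matching the range quoted in the text.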

This principled scheme, implemented in a software function in SPM (spm_dcm_bdc.m), can easily be applied to any experimental paradigm, any group of subjects (healthy or patient cohorts), and any acquisition scheme.

Comparing models based on their evidence is the most efficient procedure for testing hypotheses (Neyman and Pearson, 1933) and is employed in both classical and Bayesian statistics. The model evidence is the probability of the data y given the model m, i.e., p(y|m). Model comparison involves taking the ratio of the evidences for competing models. However, this ratio (known as the Bayes factor) assumes that each model has been fitted to the same data y. This means that when deciding which data to use (e.g., arbitrating between different multiband acceleration factors) it is not possible to fit models to each dataset and compare them based on their evidence. To address this, the measures we introduced here can be used in place of the model evidence to decide which of several datasets provides the best estimates of model parameters and best distinguishes among competing models.

The first step in our proposed analysis scheme is to quantify neuronal responses for each data acquisition. This necessarily requires the use of a model to partition the variance into neuronal, hemodynamic, and noise components. Any form of model and estimation scheme can be used, the only requirement being that it is probabilistic or Bayesian. In other words, it should furnish a probability density over the parameters. Here, we modeled each subject's neuronal activity using DCM for fMRI, in which the parameters form a multivariate normal distribution defined by the expected value of each parameter and their covariance (i.e., properly accounting for conditional dependencies among the parameters). Given that the main application of DCM is investigating effective (causal) connectivity, the method offered in this paper is especially pertinent for asking which acquisition scheme will offer the most efficient estimates of connectivity parameters. Alternatively, the same analysis approach could be applied to the observation parameters rather than the neuronal parameters, to ask which dataset provides the best estimates of regional neurovascular coupling and the BOLD response. More broadly, any probabilistic model could have been used to obtain parameters relating to brain activity, one alternative being a Bayesian GLM at the single-subject level (Penny et al., 2003).

Hypotheses in cognitive neuroscience are usually about effects that are conserved at the group level. However, the benefits of advanced acquisition schemes seen at the single-subject level may not be preserved at the group level, due to inter-subject variability (Kirilina et al., 2016). We were therefore motivated to develop a protocol to ask which acquisition scheme offers the best inferences at the group level, while appropriately modeling inter-subject variability. To facilitate this, the second step in our analysis procedure is to take the estimated neuronal parameters from every subject and summarize them using a group-level model. Here we use a Bayesian GLM, estimated using a hierarchical Parametric Empirical Bayes (PEB) scheme. This provides the average (expected value) of the connectivity parameters across subjects, as well as the uncertainty (covariance) of these parameters. It additionally provides the free energy approximation of the log model evidence of the GLM, which quantifies the relative goodness of the GLM in terms of accuracy minus complexity. The key advantage of this Bayesian approach, unlike the summary statistic approach used with the classical GLM in neuroimaging, is that it takes the full distribution over the parameters (both the expected values and the covariance) from the single-subject level to the group level. This is important in assessing the quality of datasets, where the subject-level uncertainty over the parameters is key to assessing their utility for parameter-level inference. Together, by fitting DCMs at the single-subject level and then a Bayesian GLM at the group level, one can appropriately quantify neuronal responses at the group level.

Having obtained the parameters and log model evidences of each dataset's group-level GLM, the final stage of our analysis procedure is to apply a set of measures to each dataset. These measures are derived from information theory and quantify the ability of the data to support two complementary types of inference. Firstly, inference about parameters involves testing hypotheses about the parameters of a model; e.g., assessing whether a particular neuronal response is positive or negative. A good dataset will support precise estimates of the parameters (where precision is the inverse variance) and will support the parameters being distinguished from one another (i.e., minimizing conditional dependencies). We evaluated these features in each dataset using the negative entropy of the parameters and the information gain. These provide a straightforward summary of the utility of each dataset for inference over parameters.

A complementary form of inference involves embodying each hypothesis as a model and comparing these models based on their log evidence ln p(y|m). This forms the basis of most DCM studies, where models differ in terms of which connections are switched on and off, or which connections receive experimental inputs (specified by setting the priors of each model). We assessed each dataset in terms of its ability to distinguish similar, plausible, and difficult-to-discriminate models from one another. This involved an automated procedure for defining a set of similar models, and the use of an information-theoretic quantity, the information gain, to determine how well the models could be distinguished from one another in each dataset. This measure can be interpreted as the amount we have learnt about the models by performing the model comparison, relative to our prior belief that all models were equally likely.

To exemplify the approach, we compared four fMRI datasets that differed in their multiband acceleration factor. The higher the acceleration factor, the faster the image acquisition. This affords the potential to better separate physiological noise from task-related variance, or to increase functional sensitivity by providing more samples per unit time. However, this comes with various costs, including reduced SNR and increased temporal auto-correlations. The datasets used were acquired in the context of an established fMRI paradigm, which elicited known effects in pre-defined regions of interest. The conclusion of the original study (Todd et al., 2017), which examined the datasets under the same pre-processing procedures used here, was that a multiband acceleration factor between 4 (conservative) and 8 (aggressive) should be used.
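For the multivariate normal densities used throughout, the two parameter-level measures described above have closed forms. The sketch below is plain numpy, not the SPM implementation, and the example means and covariances are hypothetical:

```python
import numpy as np

def neg_entropy(S):
    """Negative entropy (nats) of a multivariate normal with covariance S.
    Larger values correspond to more precise parameter estimates."""
    k = S.shape[0]
    _, logdet = np.linalg.slogdet(S)
    return -0.5 * (k * np.log(2 * np.pi * np.e) + logdet)

def information_gain(mu_q, S_q, mu_p, S_p):
    """KL-divergence (nats) of posterior N(mu_q, S_q) from prior N(mu_p, S_p)."""
    k = len(mu_q)
    iS_p = np.linalg.inv(S_p)
    d = mu_q - mu_p
    _, ld_q = np.linalg.slogdet(S_q)
    _, ld_p = np.linalg.slogdet(S_p)
    return 0.5 * (np.trace(iS_p @ S_q) + d @ iS_p @ d - k + ld_p - ld_q)

# A dataset yielding a tighter posterior scores higher on both measures
mu_prior, S_prior = np.zeros(3), np.eye(3)
mu_post = np.full(3, 0.5)
tight, loose = 0.1 * np.eye(3), 0.5 * np.eye(3)
assert neg_entropy(tight) > neg_entropy(loose)
assert information_gain(mu_post, tight, mu_prior, S_prior) > \
       information_gain(mu_post, loose, mu_prior, S_prior)
```

Because both quantities are computed from the full posterior covariance, they account for conditional dependencies between parameters, unlike inspecting each parameter's variance in isolation.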

In the present analysis, the dataset acquired with multiband acceleration factor 4 (MB4) afforded the most precise estimates of neuronal parameters and the largest information gain in terms of both parameters and models (Figure 5), although the differences between MB4 and MB2 were small. Our analysis of residual leakage artifact (Supplementary Material) showed that this result was not confounded by aliasing, a potential issue with multiband acquisitions (Todd et al., 2016). Given that these data were decimated so as to have equivalent numbers of samples, regardless of MB factor, our results suggest that the improved sampling of physiological effects provided by multiband acceleration counterbalanced the loss of SNR. Speculatively, MB4 may have been optimal in terms of benefiting from physiological filtering (a sufficiently high Nyquist frequency to resolve breathing effects), despite any reduction in SNR. MB2 may have performed slightly less well because it suffered the penalty of reduced SNR, without sufficient benefit from the filtering of physiological effects. Any advantage of MB8 in terms of physiological filtering may have been outweighed by the greater reduction in SNR.

One should exercise caution in generalizing this multiband result, which may not hold for different paradigms or imaging setups (e.g., RF coil types, field strength, resolution, etc.) or if using variable numbers of data points. Though congruent with a previous study (Todd et al., 2017), without these further investigations the conclusions presented here should not be generalized. Going forward, the effect of each of these manipulations could be framed as a hypothesis and tested using the procedures described here. One interesting future direction would be to investigate the contribution of the two pre-processing steps: filtering and decimation. Our data were filtered to provide improved sampling of physiological noise and were subsequently decimated in order to maintain a fixed number of data points for all multiband factors under investigation. This ensured a fair comparison of the datasets, with equivalent handling of temporal auto-correlations. The protocol described here could be used to evaluate different filtering and decimation options. One might anticipate that the increased effective number of degrees of freedom within the data would be tempered by increased temporal auto-correlations arising from more rapid sampling.

A further specific consideration for the application of multiband fMRI to connectivity analyses is whether differences in slice timing across different acquisition speeds could influence estimates of effective connectivity in DCM. An approach for resolving this is slice timing correction: adjusting the model to account for the acquisition time of each slice. DCM has an inbuilt slice timing model to facilitate this (Kiebel et al., 2007). Whether this is helpful for all application domains is uncertain. Following spatial realignment, coregistration, and normalization, the precise acquisition time of each slice is lost, so the modeled acquisition times can deviate from the actual acquisition times. On the other hand, if the modeled acquisition times are reasonably accurate, there may be some benefit, particularly for fast event-related designs. This uncertainty can be resolved using Bayesian model comparison: DCMs can be specified with different slice timing options and their evidence compared. In the example dataset presented here, we had a slow block design, which is unlikely to benefit significantly from slice timing correction, so for simplicity we used the default setting in DCM—aligning the onsets to the middle of each volume, thereby minimizing the error on average.

The procedure introduced here involves evaluating each dataset in terms of its ability to provide estimates of group-level experimental effects. An alternative approach would be to compare datasets at the individual-subject level—for instance, by comparing the variance of each model's residuals, parameterized in DCM on a region-by-region basis (hyperparameters λ, which control the log precision of the noise ε^(1); see Equation 2). However, this would only characterize the fit of each model as a whole, and would not evaluate the quality of inferences about neural parameters specifically, which are typically the quantities of interest in neuroimaging studies. Furthermore, neuroimaging studies typically evaluate hypotheses about groups of subjects rather than individuals, and thus assessing the quality of inferences is ideally performed using group-level models or parameters. For the example analysis presented here, we therefore chose to compare datasets in terms of the specific parameters of interest for the particular experiment (the DCM B-matrix), summarized by the group-level PEB model.

An important consideration, when introducing any novel modeling approach or procedure, is validation. Here, for our example analysis using empirical data, we used two extant models from the neuroimaging community: DCM for fMRI and the Bayesian GLM implemented in the PEB framework. The face validity of DCM for fMRI has been tested using simulations (Friston et al., 2003; Chumbley et al., 2007), its construct validity has been tested against extant modeling approaches (Penny et al., 2004; Lee et al., 2006), and its predictive validity has been tested using intracranial recordings (David et al., 2008a,b; Reyt et al., 2010). The PEB model and the associated Bayesian Model Reduction scheme are more recent and so far have been validated in terms of their face validity using simulated data (Friston et al., 2016), reproducibility with empirical data (Litvak et al., 2015), and predictive validity in the context of individual differences in electrophysiological responses (Pinotsis et al., 2016).

The next validation consideration regards the novel contribution of this paper: the application of a set of outcome measures to the probabilistic models discussed above. These measures are simply descriptions or summary statistics of the Bayesian or probabilistic models to which they are applied. The measures themselves depend on two statistics from information theory—the negative entropy and the KL-divergence—which do not require validation in and of themselves, just as the t-statistic does not need validation when used to compare experimental conditions using the GLM. Rather, the implementation of the statistical pipeline needs validation, and we have assessed this using simulations. These confirmed that the measures behaved as expected, increasing monotonically with increasing SNR until they reached a saturation point, after which further increases in SNR offered no additional benefit. It should be emphasized that although these simulations were based on the effect sizes from the empirical multiband fMRI data, they were not intended

to capture the detailed properties of multiband fMRI per se. For example, the simulated datasets differed in their level of additive Gaussian white noise, which cannot fully capture the complex noise properties specific to multiband fMRI. These simulations are therefore an illustration of any generic aspect of the imaging protocol that influences SNR. Additionally, we used general linear models (GLMs) to generate and model the 160,000 simulated datasets (100 repetitions × 100 datasets per repetition × 16 subjects), which would not have been tractable in reasonable time using DCMs. The use of GLMs did not recreate the nonlinearities and parameter covariance present in more complex models (e.g., DCM), which may be expected to reduce parameter identifiability. Nevertheless, the use of GLMs was sufficient for establishing the face validity of the measures, while emphasizing that the outcome measures are not specific to the choice of forward model.

A complementary approach to comparing data would be to assess their predictive validity—i.e., whether effect sizes are detectable with sufficient confidence to predict left-out data (Strother et al., 2002). We have not pursued this here because our objective is to select the data that maximize the confidence of hypothesis testing (the precision of inferences over parameters or models). However, in contexts where the objective is to select the dataset with the best predictive accuracy—such as when identifying biomarkers—this could be performed in the PEB software framework using the tools provided for leave-one-out cross-validation (Friston et al., 2016).

Practically, we envisage that a comparison of datasets using the methods described here could be performed on small pilot groups of subjects, the results of which would inform decisions about which imaging protocol to use in a subsequent full-scale study. Regions of interest that are known to show experimental effects for the selected task would be selected for inclusion in the model—based on an initial analysis (e.g., SPM analysis) and/or on previous studies. The pilot analysis would ideally have the same design—e.g., model structure—as intended for the full-scale study. This is because the quality measures depend on the neuronal parameters of the specific model(s) which will be used by the experimenter to test hypotheses. Following this, we do not expect that there exists a "best" acquisition protocol in general for any imaging modality. Rather, the best dataset for a particular experiment will depend on the specific hypotheses (i.e., models) being tested, and the ideal dataset will maximize the precision of the parameters and maximize the difference in evidence between models. We anticipate that the protocol introduced here, implemented in the software accompanying this paper, will prove useful for experimenters when choosing their acquisition protocols.

ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the Ethics Committee of University College London with written informed consent from all subjects. All subjects gave written informed consent in accordance with the Declaration of Helsinki. The protocol was approved by the Ethics Committee of University College London.

AUTHOR CONTRIBUTIONS

PZ and SK authored the manuscript, designed, and ran the analyses. NT, NW, and KF provided technical guidance and edited the manuscript. MC supervised the study and edited the manuscript.

FUNDING

The Wellcome Centre for Human Neuroimaging is supported by core funding from the Wellcome [203147/Z/16/Z]. The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no. 616905.

SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fnins.2018.00986/full#supplementary-material

REFERENCES

Ashburner, J., and Friston, K. J. (2005). Unified segmentation. Neuroimage 26, 839–851. doi: 10.1016/j.neuroimage.2005.02.018
Bastos, A. M., Usrey, W. M., Adams, R. A., Mangun, G. R., Fries, P., and Friston, K. J. (2012). Canonical microcircuits for predictive coding. Neuron 76, 695–711. doi: 10.1016/j.neuron.2012.10.038
Buxton, R. B., Uludag, K., Dubowitz, D. J., and Liu, T. T. (2004). Modeling the hemodynamic response to brain activation. Neuroimage 23 (Suppl. 1), S220–233. doi: 10.1016/j.neuroimage.2004.07.013
Cauley, S. F., Polimeni, J. R., Bhat, H., Wald, L. L., and Setsompop, K. (2014). Interslice leakage artifact reduction technique for simultaneous multislice acquisitions. Magn. Reson. Med. 72, 93–102. doi: 10.1002/mrm.24898
Chumbley, J. R., Friston, K. J., Fearn, T., and Kiebel, S. J. (2007). A Metropolis-Hastings algorithm for dynamic causal models. Neuroimage 38, 478–487. doi: 10.1016/j.neuroimage.2007.07.028
Corbin, N., Todd, N., Friston, K. J., and Callaghan, M. F. (2018). Accurate modeling of temporal correlations in rapidly sampled fMRI time series. Hum. Brain Mapp. 39, 3884–3897. doi: 10.1002/hbm.24218
David, O., Wozniak, A., Minotti, L., and Kahane, P. (2008a). Preictal short-term plasticity induced by intracerebral 1 Hz stimulation. Neuroimage 39, 1633–1646. doi: 10.1016/j.neuroimage.2007.11.005
David, O., Guillemain, I., Saillet, S., Reyt, S., Deransart, C., Segebarth, C., et al. (2008b). Identifying neural drivers with functional MRI: an electrophysiological validation. PLoS Biol. 6, 2683–2697. doi: 10.1371/journal.pbio.0060315
Friston, K. J. (2011). Functional and effective connectivity: a review. Brain Connect. 1, 13–36. doi: 10.1089/brain.2011.0008
Friston, K. J., Harrison, L., and Penny, W. (2003). Dynamic causal modelling. Neuroimage 19, 1273–1302. doi: 10.1016/S1053-8119(03)00202-7
Friston, K. J., Litvak, V., Oswal, A., Razi, A., Stephan, K. E., van Wijk, B. C. M., et al. (2016). Bayesian model reduction and empirical Bayes for group (DCM) studies. Neuroimage 128, 413–431. doi: 10.1016/j.neuroimage.2015.11.015
Friston, K. J., Mattout, J., Trujillo-Barreto, N., Ashburner, J., and Penny, W. (2007). Variational free energy and the Laplace approximation. Neuroimage 34, 220–234. doi: 10.1016/j.neuroimage.2006.08.035
Kass, R. E., and Raftery, A. E. (1995). Bayes factors. J. Am. Stat. Assoc. 90, 773–795. doi: 10.1080/01621459.1995.10476572
Kiebel, S. J., Kloppel, S., Weiskopf, N., and Friston, K. J. (2007). Dynamic causal modeling: a generative model of slice timing in fMRI. Neuroimage 34, 1487–1496. doi: 10.1016/j.neuroimage.2006.10.026
Kirilina, E., Lutti, A., Poser, B. A., Blankenburg, F., and Weiskopf, N. (2016). The quest for the best: the impact of different EPI sequences on the sensitivity of random effect fMRI group analyses. Neuroimage 126, 49–59. doi: 10.1016/j.neuroimage.2015.10.071
Larkman, D. J., Hajnal, J. V., Herlihy, A. H., Coutts, G. A., Young, I. R., and Ehnholm, G. (2001). Use of multicoil arrays for separation of signal from multiple slices simultaneously excited. J. Magn. Reson. Imaging 13, 313–317. doi: 10.1002/1522-2586(200102)13:2<313::AID-JMRI1045>3.0.CO;2-W
Lee, L., Friston, K., and Horwitz, B. (2006). Large-scale neural models and dynamic causal modelling. Neuroimage 30, 1243–1254. doi: 10.1016/j.neuroimage.2005.11.007
Litvak, V., Garrido, M., Zeidman, P., and Friston, K. (2015). Empirical Bayes for group (DCM) studies: a reproducibility study. Front. Hum. Neurosci. 9:670. doi: 10.3389/fnhum.2015.00670
Neyman, J., and Pearson, E. S. (1933). On the problem of the most efficient tests of statistical hypotheses. Philos. Trans. R. Soc. 231, 289–337. doi: 10.1098/rsta.1933.0009
Penny, W., Kiebel, S., and Friston, K. (2003). Variational Bayesian inference for fMRI time series. Neuroimage 19, 727–741. doi: 10.1016/S1053-8119(03)00071-5
Penny, W. D., Stephan, K. E., Mechelli, A., and Friston, K. J. (2004). Modelling functional integration: a comparison of structural equation and dynamic causal models. Neuroimage 23 (Suppl. 1), S264–274. doi: 10.1016/j.neuroimage.2004.07.041
Pinotsis, D. A., Perry, G., Litvak, V., Singh, K. D., and Friston, K. J. (2016). Intersubject variability and induced gamma in the visual cortex: DCM with empirical Bayes and neural fields. Hum. Brain Mapp. 37, 4597–4614. doi: 10.1002/hbm.23331
Reyt, S., Picq, C., Sinniger, V., Clarencon, D., Bonaz, B., and David, O. (2010). Dynamic causal modelling and physiological confounds: a functional MRI study of vagus nerve stimulation. Neuroimage 52, 1456–1464. doi: 10.1016/j.neuroimage.2010.05.021
Setsompop, K., Gagoski, B. A., Polimeni, J. R., Witzel, T., Wedeen, V. J., and Wald, L. L. (2012). Blipped-controlled aliasing in parallel imaging for simultaneous multislice echo planar imaging with reduced g-factor penalty. Magn. Reson. Med. 67, 1210–1224. doi: 10.1002/mrm.23097
Spreng, R. N., Mar, R. A., and Kim, A. S. (2009). The common neural basis of autobiographical memory, prospection, navigation, theory of mind, and the default mode: a quantitative meta-analysis. J. Cogn. Neurosci. 21, 489–510. doi: 10.1162/jocn.2008.21029
Stephan, K. E., Penny, W. D., Daunizeau, J., Moran, R. J., and Friston, K. J. (2009). Bayesian model selection for group studies. Neuroimage 46, 1004–1017. doi: 10.1016/j.neuroimage.2009.03.025
Stephan, K. E., Weiskopf, N., Drysdale, P. M., Robinson, P. A., and Friston, K. J. (2007). Comparing hemodynamic models with DCM. Neuroimage 38, 387–401. doi: 10.1016/j.neuroimage.2007.07.040
Strother, S. C., Anderson, J., Hansen, L. K., Kjems, U., Kustra, R., Sidtis, J., et al. (2002). The quantitative evaluation of functional neuroimaging experiments: the NPAIRS data analysis framework. Neuroimage 15, 747–771. doi: 10.1006/nimg.2001.1034
Todd, N., Josephs, O., Zeidman, P., Flandin, G., Moeller, S., and Weiskopf, N. (2017). Functional sensitivity of 2D simultaneous multi-slice echo-planar imaging: effects of acceleration on g-factor and physiological noise. Front. Neurosci. 11:158. doi: 10.3389/fnins.2017.00158
Todd, N., Moeller, S., Auerbach, E. J., Yacoub, E., Flandin, G., and Weiskopf, N. (2016). Evaluation of 2D multiband EPI imaging for high-resolution, whole-brain, task-based fMRI studies at 3T: sensitivity and slice leakage artifacts. Neuroimage 124, 32–42. doi: 10.1016/j.neuroimage.2015.08.056
Xu, J., Moeller, S., Auerbach, E. J., Strupp, J., Smith, S. M., Feinberg, D. A., et al. (2013). Evaluation of slice accelerations using multiband echo planar imaging at 3 T. Neuroimage 83, 991–1001. doi: 10.1016/j.neuroimage.2013.07.055
Zeidman, P., Mullally, S. L., and Maguire, E. A. (2015). Constructing, perceiving, and maintaining scenes: hippocampal activity and connectivity. Cereb. Cortex 25, 3836–3855. doi: 10.1093/cercor/bhu266

Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zeidman, Kazan, Todd, Weiskopf, Friston and Callaghan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


APPENDIX

APPENDIX 1: BAYESIAN DATA COMPARISON

In this appendix we illustrate why the entropies of the posteriors over model parameters, obtained from two different datasets, can be compared as if they were log Bayes factors. We first consider the comparison of different models of the same data. We then turn to the complementary application of Bayes theorem to the comparison of different data under the same model, in terms of their relative ability to reduce uncertainty about model parameters.

Bayesian Model Comparison
To clarify model comparison in a conventional setting, consider two models m_1 and m_2 fitted to the same data y. The Bayes factor in favor of model 1 is the ratio of the model evidences under each model:

    BF_1 = \frac{p(y \mid m_1)}{p(y \mid m_2)}    (A1)

In place of the model evidence p(y|m) we estimate an approximation (lower bound)—the negative variational free energy [this usually calls on variational Bayes, although other approximations can be used, such as harmonic means from sampling approximations and other (Akaike or Bayesian) information criteria]:

    F \approx \ln p(y \mid m)    (A2)

As we are working with the log of the model evidence, it is more convenient to also work with the log of the Bayes factor:

    \ln BF_1 = \ln p(y \mid m_1) - \ln p(y \mid m_2) \approx F_1 - F_2    (A3)

The log Bayes factor for two models is simply the difference in their free energies, in units of nats. Kass and Raftery (1995) assigned labels to describe the strength of evidence for one model over another. For example, "strong evidence" requires a Bayes factor of 20, or a log Bayes factor of ln(20) ≈ 3. We can also transform the log Bayes factor into a posterior probability for one model relative to the other, under uninformative priors over models: p(m_1) = p(m_2) = p(m). By using Bayes rule and re-arranging, we find that this probability is a sigmoid (i.e., logistic) function of the log Bayes factor:

    p(m_1 \mid y) = \frac{p(y \mid m_1)\, p(m)}{p(y)} = \frac{p(y \mid m_1)}{p(y \mid m_1) + p(y \mid m_2)} = \frac{1}{1 + \exp(-\ln BF_1)}    (A4)

The final equality shows that the logistic function of the log Bayes factor in favor of model 1 gives the posterior probability for model 1, relative to model 2. When dealing with multiple models, the logistic function becomes a softmax function (i.e., normalized exponential function).

Accuracy and Complexity
The log evidence or free energy can be decomposed into accuracy and complexity terms:

    F = \underbrace{\langle \ln p(y \mid \theta, m) \rangle_q}_{\text{accuracy}} - \underbrace{\mathrm{KL}\left[ q(\theta) \,\|\, p(\theta) \right]}_{\text{complexity}}    (A5)

where q(θ) ≈ p(θ|y) is the approximate posterior over parameters, p(θ) are the priors on the parameters and ⟨·⟩_q denotes the expectation under q(θ). This says that the accuracy is the expected log likelihood of the data under a particular model. Conversely, the complexity scores how far the parameters had to move from their priors to explain the data, as measured by the KL-divergence between the posterior and prior distributions. This divergence is also known as a relative entropy or information gain. In other words, the complexity scores the reduction in uncertainty afforded by the data. We can now use this interpretation of the complexity to compare the ability of different data sets to reduce uncertainty about the model parameters.

Comparing Across Datasets
Consider one model fitted to two datasets y_1 and y_2, with approximate posteriors over the parameters q_1(θ) and q_2(θ) for each dataset. The corresponding complexity difference is:

    \mathrm{KL}\left[ q_1(\theta) \,\|\, p(\theta) \right] - \mathrm{KL}\left[ q_2(\theta) \,\|\, p(\theta) \right] = \langle \ln q_1(\theta) \rangle_{q_1} - \langle \ln q_2(\theta) \rangle_{q_2} = H[q_2(\theta)] - H[q_1(\theta)] = \left\langle \ln \frac{q_1(\theta)}{q_2(\theta)} \right\rangle_q    (A6)

where H[q(θ)] is the entropy of q(θ). Therefore, the reduction in conditional uncertainty (i.e., the difference in the entropies of the approximate posteriors) corresponds to the difference in information gain afforded by the two sets of data. Because this difference is measured in nats, it has the same interpretation as a difference in log evidence—or a log odds ratio (i.e., Bayes factor). The last equality above shows that the difference in entropies (or complexities) corresponds to the expected log odds ratio of posterior beliefs about the parameters. The entropies are defined for the multivariate normal distribution in Equation 9 of the main text.
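To make these quantities concrete, the following Python sketch evaluates the posterior model probabilities of Equation A4 from a set of free energies (via a softmax), and compares two Gaussian posteriors by their entropies, in the spirit of Equation A6. This is a minimal illustration, independent of the Matlab/SPM tools accompanying the paper; the free energies and posterior covariances are invented values chosen only to demonstrate the arithmetic.

```python
import numpy as np

def model_posteriors(free_energies):
    """Posterior model probabilities from (negative variational) free
    energies under uniform model priors: the softmax of Equation A4."""
    F = np.asarray(free_energies, dtype=float)
    p = np.exp(F - F.max())            # subtract the max for numerical stability
    return p / p.sum()

def gaussian_entropy(Sigma):
    """Entropy (in nats) of a multivariate normal with covariance Sigma:
    H = 0.5 * ln det(2*pi*e*Sigma)."""
    Sigma = np.atleast_2d(Sigma)
    k = Sigma.shape[0]
    return 0.5 * (k * np.log(2 * np.pi * np.e) + np.linalg.slogdet(Sigma)[1])

# --- Model comparison (Equations A3-A4), with invented free energies ---
F = [-1000.0, -1003.0]                 # hypothetical free energies F1, F2
lnBF1 = F[0] - F[1]                    # log Bayes factor = 3 ~ ln(20): "strong evidence"
print(model_posteriors(F))             # approximately [0.95, 0.05]

# --- Comparing datasets (Equation A6) ---
# Two hypothetical posterior covariances over the same two parameters;
# dataset 1 constrains the parameters more tightly than dataset 2.
S1 = np.diag([0.05, 0.10])
S2 = np.diag([0.20, 0.40])
info_gain_diff = gaussian_entropy(S2) - gaussian_entropy(S1)   # H[q2] - H[q1], in nats
print(info_gain_diff)                  # positive: dataset 1 affords the greater information gain
```

Because both the entropy difference and the log Bayes factor are in nats, the same evidence thresholds (e.g., 3 nats for "strong evidence") can be applied to either quantity, which is the point of the derivation above.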
