<<

Are mental properties supervenient on brain properties? SUBJECT AREAS: Joshua T. Vogelstein1, R. Jacob Vogelstein2 & Carey E. Priebe1 MODELLING COMPUTATIONAL BIOLOGY 1Department of Applied Mathematics & Statistics, Johns Hopkins University, Baltimore, MD, 21218, 2National Security Technology STATISTICS Department, Johns Hopkins University Applied Physics Laboratory, Laurel, MD, 20723. BIOINFORMATICS

The ‘‘mind-brain supervenience’’ conjecture suggests that all mental properties are derived from the Received physical properties of the brain. To address the question of whether the mind supervenes on the brain, we 1 April 2011 frame a supervenience hypothesis in rigorous statistical terms. Specifically, we propose a modified version of supervenience (called e-supervenience) that is amenable to experimental investigation and statistical Accepted analysis. To illustrate this approach, we perform a thought experiment that illustrates how the probabilistic 1 September 2011 theory of pattern recognition can be used to make a one-sided determination of e-supervenience. The physical property of the brain employed in this analysis is the graph describing brain connectivity (i.e., the Published brain-graph or connectome). e-supervenience allows us to determine whether a particular mental property 26 September 2011 can be inferred from one’s connectome to within any given positive misclassification rate, regardless of the relationship between the two. This may provide further motivation for cross-disciplinary research between neuroscientists and statisticians. Correspondence and requests for materials uestions and assumptions about mind-brain supervenience go back at least as far as Plato’s dialogues should be addressed to circa 400 BCE1. While there are many different notions of supervenience, we find Davidson’s canonical 2 J.T.V. (joshuav@jhu. Q particularly illustrative : edu) [mind-brain] supervenience might be taken to mean that there cannot be two events alike in all physical respects but differing in some mental respect, or that an object cannot alter in some mental respect without altering in some physical respect. Colloquially, supervenience means ‘‘there cannot be a mind-difference without a physical-difference.’’ This philosophical conjecture has potentially widespread implications. For example, neural network theory and artificial intelligence often implicitly assume a local version mind-brain supervenience3, 4. Cognitive neuroscience similarly seems to operate under such assumptions5. Philosophers continue to debate and refine notions of supervenience6. Yet, to date, relatively scant attention has been paid to what might be empirically learned about supervenience. In this work we attempt to bridge the gap between philosophical conjecture and empirical investigations by casting supervenience in a probabilistic framework amenable to hypothesis testing. We then use the probabilistic theory of pattern recognition to determine the limits of what one can and cannot learn about supervenience through data analysis. The implications of this work are varied. It provides a probabilistic framework for con- verting philosophical conjectures into statistical hypotheses that are amenable to experimental investigation, which allows the philosopher to gain empirical support for her rational arguments. This leads to the construction of the first explicit proof (to our knowledge) of a universally consistent classifier on graphs, and the first demonstration of the tractability of answering supervenience questions. Supervenience therefore seems to per- haps be a useful but under-utilized concept for neuroscientific investigations. This work should provide further motivation for cross-disciplinary efforts across three fields—, statistics, and neuroscience—with shared goals but mostly disjoint jargon and methods of analysis.

Results Statistical supervenience: a definition. Let M~fgm1,m2, ... be the space of all possible minds and let B~fgb1,b2, ... be the set of all possible brains. M includes a mind for each possible collection of thoughts, memories, beliefs, etc. B includes a brain for each possible position and momentum of all subatomic particles within the skull. Given these definitions, Davidson’s conjecture may be concisely and formally stated thusly: m ? m9 ) b ? b9, where ðÞm,b ,ðÞm0,b0 [M|B are mind-brain pairs. This mind-brain supervenience relation does not imply an injective relation, a causal relation, or an identity relation (see Appendix 1 for more details and

SCIENTIFIC REPORTS | 1 : 100 | DOI: 10.1038/srep00100 1 www.nature.com/scientificreports

some examples). To facilitate both statistical analysis and empirical Hn0 ~fgðÞMnz1,Bnz1 , ...,ðÞMnzn0 ,Bnzn0 be a set of n9 hold-out investigation, we convert this local supervenience relation from a samples, each sampled iid from FMB. The hold-out approximation logical to a probabilistic relation. to the misclassification rate is given by X Let F indicate a joint distribution of minds and brains. 0 MB ^n ðÞ~ IfgðÞ; T =

SCIENTIFIC REPORTS | 1 : 100 | DOI: 10.1038/srep00100 2 www.nature.com/scientificreports number of cells in a person’s brain at some time t, or the collection of distribution specifies the brain-graph distribution given that the spike trains of all neurons in the brain during some time period t to t9. organism does (or does not) exhibit the behavior. Regardless of the details of the specifications of the mental property Let Auv be the number of chemical synapses between neuron u and and the brain property, given such specifications, one can assume a neuron v according to15. Then, let S be the set of edges deemed 16 model, F. We desire a classifier gn that is guaranteed to be consistent, responsible for odor-evoked behavior according to . If odor-evoked no matter which of the possible distributions FMB[F is the true behavior is supervenient on this signal subgraph S, then the distri- distribution. A classifier with such a property is called a universally bution of edges in S must differ between the two classes of odor 17 consistent classifier. Below, under a very general mind-brain model evoked behavior . Let Euvjj denote the expected number of edges F, we construct a universally consistent classifier. from vertex v to vertex u in class j. For class m0, let Euvj0 5 Auv 1 g, where g 5 0.05 is a small noise parameter (it is believed that the Gedankenexperiment 1. Let the physical property under con- 12 sideration be brain connectivity structure, so b is a brain-graph C. elegans connectome is similar across organisms ). For class m1, let (‘‘connectome’’) with vertices representing neurons (or collec- Euvj1 5 Auv 1 zuv, where the signal parameter zuv 5 g for all edges tions thereof) and edges representing synapses (or collections not in S, and zuv is uniformly sampled from [25, 5] for all edges thereof). Further let B, the brain observation space, be the collec- within S. For both classes, let each edge be Poisson distributed, F ~ ~Poisson E . tion of all graphs on a given finite number of vertices, and let M, Auv jM mj uvjj the mental property observation space, be finite. Now, imagine We consider kn-nearest neighbor classification of labeled multi- collecting very large amounts of very accurate identically and graphs (directed, with loops) on the 279 under Frobenius (the independently sampled brain-graph data and associated mental C. elegans somatic nervous system has only 279 neurons that make synapses with other neurons). The k -nearest neighbor classifier property indicators from FMB.Akn-nearest neighbor classifier n using a Frobenius norm is universally consistent (see Methods used here satisfies kn R ‘ as n R ‘ and kn/n R 0asn R ‘, ensuring for details). The existence of a universally consistent classifier universal consistency. (Better classifiers can be constructed for the guarantees that eventually (in n, n9) we will be able to conclude joint distribution FMB used here; however, we demand universal e consistency.) Figure 1 shows that for this simulation, rejecting M*FB for this mind-brain property pair, if indeed e-superveni- ence holds. This logic holds for directed graphs or multigraphs or (e 5 0.1)-supervenience at a 5 0.01 requires only a few hundred hypergraphs with discrete edge weights and vertex attributes, as training samples. well as unlabeled graphs (see ref. 8 for details). Furthermore, the Importantly, conducting this experiment in actu is not beyond proof holds for other matrix norms (which might speed up con- current technological limitations. 3D superresolution imaging18 vergence and hence reduce the required n), and the regression combined with neurite tracing algorithms19, 20, 21 allow the collection scenario where jjM is infinite (again, see Methods for details). of a C. elegans brain-graph within a day. Genetic manipulations, laser ablations, and training paradigms can each be used to obtain a non- Thus, under the conditions stated in the above Gedanken- 13 wild type population for use as M 5 m1 , and the class of each experiment, universal consistency yields: 22 organism (m0 vs. m1) can also be determined automatically . e Theorem 3. M*FB[b?1 as n, n9R ‘.

Unfortunately, the rate of convergence of LF(gn)toLF(g*) depends 9 on the (unknown) distribution F 5 FMB . Furthermore, arbitrarily slow convergence theorems regarding the rate of convergence of LF(gn)toLF(g*) demonstrate that there is no universal n, n9 which will guarantee that the test has power greater than any specified target b . a10. For this reason, the test outlined above can provide only a one-sided conclusion: if we reject we can be 100(1 2 a)% confident e that M*FB holds, but we can never be confident in its negation; e rather, it may be the case that the evidence in favor of M*FB is insufficient because we simply have not yet collected enough data. This leads immediately to the following theorem:

Theorem 4. For any target power bmin . a, there is no universal n, n9 that guarantees b $ bmin. Therefore, even e-supervenience does not satisfy Popper’s falsifia- bility criterion11.

The feasibility of a consistent statistical test for e-supervenience. Theorem 3 demonstrates the availability of a consistent test under certain restrictions. Theorem 4, however, demonstrates that con- vergence rates might be unbearably slow. We therefore provide an illustrative example of the feasibility of such a test on synthetic data. Caenorhabditis elegans is a species whose nervous system is Figure 1 | C. elegans graph classification simulation results. The ^n0 believed to consist of the same 302 labeled neurons for each organ- estimated hold-out misclassification rate LF ðÞgn (with n9 5 1000 testing ism12. Moreover, these animals exhibit a rich behavioral repertoire samples) is plotted as a function of class-conditional training sample size 13 e that seemingly depends on circuit properties . These findings motiv- nj 5 n/2, suggesting that for e 5 0.1 we can determine that M*F B holds ate the use of C. elegans for a synthetic data analysis14. Conducting with 99% confidence with just a few hundred training samples generated ^n0 such an experiment requires specifying a joint distribution F over from FMB. Each dot depicts LF ðÞgn for some n; standard errors are MB 0 0 1=2 L^n ðÞg 1{L^n ðÞg =n0 . For example, at n 5 180 we have brain-graphs and behaviors. The joint distribution decomposes into F pn ffiffiffiffiffi F n j ~ ~ ^n0 ~ the product of a class-conditional distribution (likelihood) and a kn t 8ns 53 (where t?s indicates the floor operator), LF ðÞgn 0:057, 0:1 prior, FMB 5 FBjMFM. The prior specifies the probability of any and standard error less than 0.01. We reject H0 : L* $ 0.1 at a 5 0.01. particular organism exhibiting the behavior. The class-conditional Note that L* < 0 for this simulation.

SCIENTIFIC REPORTS | 1 : 100 | DOI: 10.1038/srep00100 3 www.nature.com/scientificreports

Discussion corresponding minds, m(1), m(2),..., m(n), where parenthetical indices indicate rank This work makes the following contributions. First, we define stati- order among {di}ig[n]. The 1-NN algorithm predicts that the unobserved mind is of the same class as the closest brain’s class: m^ ~mðÞ1 .Thekn nearest neighbor is a stical supervenience based on Davidson’s canonical statement straightforward generalization of this approach. It says that the test mind is in the (Definition 1). This definition makes it apparent that supervenience same class as whichevernoP class is the plurality class among the kn nearest neighbors, ~ I kn ~ 0 implies the possibility of perfect classification (Theorem 1). We then m^ argmaxm0 i~1 mðÞi m . Given a particular choice of kn (the number of prove that there is no viable test against supervenience, so one can nearest neighbors to consider) and a choice of d(?,?) (the distance metric used to never reject a null hypothesis in favor of supervenience, regardless of compare the test datum and training data), one has a relatively simple and intuitive the amount of data (Theorem 2). This motivates the introduction of a algorithm. Let g be the k nearest neighbor (k NN) classifier when there are n training relaxed notion called e-supervenience (Definition 2), against which n n n samples. A collection of such classifiers {gn}, with kn increasing with n, is called a consistent statistical tests are readily available. Under a very general classifier sequence. A universally consistent classifier sequence is any classifier brain-graph/mental property model (Gedankenexperiment 1), a con- sequence that is guaranteed to converge to the Bayes optimal classifier regardless of sistent statistical test against e-supervenience is always available, no the true distribution from which the data were sampled; that is, a universally con- sistent classifier sequence satisfies LF(gn) R LF(g*)asn R ‘ for all FMB. In the main matter the true distribution FMB (Theorem 3). In other words, the text, we refer to the whole sequence as a classifier. 34 proposed test is guaranteed to reject the null whenever the null is The knNN classifier is consistent if (i) kn R ‘ as n R ‘ and (ii) kn/n R 0asn R ‘ . 34 false, given sufficient data, for any possible distribution governing In Stone’s original proof, b was assumed to be a q-dimensional vector, and the L2 P 2 0 ~ q { 0 mental property/brain property pairs. norm (dbðÞ,b j~1 bj bj , where j indexes elements of the q-dimensional Alas, arbitrary slow convergence theorems demonstrate that vector) was shown to satisfy the constraints on a distance metric for this collection of there is no universal n, n9 for which convergence is guaranteed classifiers to be universally consistent. Later, others extended these results to apply to 9 (Theorem 4). Thus, a failure to reject is ambiguous: even if the data any Lp norm . When brain-graphs are represented by their adjacency matrices, one can stack the columns of the adjacency matrices, effectively embedding graphs into a satisfy the above assumptions, the failure to reject may be due to vector space, in which case Stone’s theorem applies. Stone’s original proof also applied either (i) an insufficient amount of data or (ii) M may not be to the scenario when jjM was infinite, resulting in a universally consistent regression e-supervenient on B. Moreover, the data will not, in general, satisfy algorithm as well. the above assumptions. In addition to dependence (because each Note that the above extension of Stone’s original theorem to the graph domain human does not exist in a vacuum), the mental property measure- implicitly assumed that vertices were labeled, such that elements of the adjacency matrices could easily be compared across graphs. In theory, when vertices are ments will often be ‘‘noisy’’ (for example, accurately diagnosing psy- unlabeled, one could first map each graph to a quotient space invariant to iso- chiatric disorders is a sticky wicket23). Nonetheless, synthetic data morphisms, and then proceed as before. Unfortunately, there is no known poly- analysis suggests that under somewhat realistic assumptions, con- nomial time complexity algorithm for graph isomorphism35, so in practice, dealing 8 vergence obtains with an amount of data one might conceivably with unlabeled vertices will likely be computationally challenging . collect (Figure 1 and ensuing discussion). Thus, given measurements of mental and brain properties that we 1. Plato. Plato: complete works (Hackett Pub Co, 1997). believe reflect the properties of interest, and given a sufficient 2. Davidson, D. Experience and Theory, chap. Mental Eve (Duckworth, 1970). amount of data satisfying the independent and identically sampled 3. Haykin, S. Neural Networks and Learning Machines (Prentice Hall, 2008), 3rd edn. assumption, a rejection of He : L* $ e in favor of M*e B entails 4. Ripley, B. D. Pattern Recognition and Neural Networks (Cambridge University 0 F Press, 2008). that we are 100(1 2 a)% confident that the mental property under 5. Gazzaniga, M. S., Ivry, R. B. & Mangun, G. R. Cognitive Neuroscience: The Biology investigation is e-supervenient on the brain property under investi- of the Mind (Third Edition) (W. W. Norton & Company, 2008). gation. Unfortunately, failure to reject is more ambiguous. 6. Kim, J. , or Something Near Enough (Princeton Monographs in Interestingly, much of contemporary research in neuroscience and Philosophy) (Princeton University Press, 2007). cognitive science could be cast as mind-brain supervenience investi- 7. Bickel, P. J. & Doksum, K. A. Mathematical Statistics: Basic Ideas and Selected 24 Topics, Vol I (2nd Edition), vol. 1 (Prentice Hall, 2000). gations. Specifically, searches for ‘‘engrams’’ of memory traces or 8. Vogelstein, J. T. & Priebe, C. E. Shuffled Graph Classification: Theory and ‘‘neural correlates’’ of various behaviors or mental properties (for Connectome Applications. Submitted for publication (2011). example, consciousness25), may be more aptly called searches for 9. Devroye, L., Gyo¨rfi, L. & Lugosi, G. A Probabilistic Theory of Pattern Recognition the ‘‘neural supervenia’’ of such properties. Letting the brain pro- (Springer, 1996). perty be a brain-graph is perhaps especially pertinent in light of the 10. Devroye, L. On Arbitrarily Slow Rates of Global Convergence in Density 26, 27 Estimation. Utilitas Mathematica 483, 475–483 (1983). advent of ‘‘connectomics’’ , a field devoted to estimating whole 11. Popper, K. R. The logic of scientific discovery (Hutchinson, 1959). organism brain-graphs and relating them to function. Testing super- 12. Durbin, R. M. Studies on the Development and Organisation of the Nervous System venience of various mental properties on these brain-graphs will of Caenorhabditis elegans. Ph.D. thesis, University of Cambridge (1987). perhaps therefore become increasingly compelling; the framework 13. de Bono, M. & Maricq, A. V. Neuronal substrates of complex behaviors in developed herein could be fundamental to these investigations. For C. elegans. Annu Rev Neurosci 28, 451–501 (2005). 14. Gelman, A. & Shalizi, C. R. Philosophy and the practice of Bayesian statistics. example, questions about whether connectivity structure alone is Submitted for publication 1–36 (2011). sufficient to explain a particular mental property is one possible 15. Varshney, L. R., Chen, B. L., Paniagua, E., Hall, D. H. & Chklovskii, D. B. mind-brain e-supervenience investigation. The above synthetic data Structural Properties of the Caenorhabditis elegans Neuronal Network. PLoS analysis demonstrates the feasibility of e-supervenience on small Computational Biology 7, e1001066 (2011). e 16. Chalasani, S. H. et al. Dissecting a circuit for olfactory behaviour in Caenorhabditis brain-graphs. Note that -supervenience tests need not investigate elegans. Nature 450, 63–70 (2007). seemingly intractable problems, like consciousness. For example, 17. Vogelstein, J. T., Gray, W. R., Vogelstein, R. J. & Priebe, C. E. Graph Classification aspects of visual perception appear to supervene on visual cortical using Signal Subgraphs: Applications in Statistical Connectomics. Submitted for activity (for example, binocular rivalry28). Moreover, an inability to publication (2011). reject e-supervenience for small e is also potentially meaningful. For 18. Vaziri, A., Tang, J., Shroff, H. & Shank, C. V. Multilayer three-dimensional super resolution imaging of thick biological samples. Proceedings of the National example, perhaps auditory localization precision supervenes on a Academy of Sciences of the United States of America 105, 20221–6 (2008). rate code only to some e . c, the rest supervening on a spike timing 19. Helmstaedter, M., Briggman, K. L. & Denk, W. 3D structural imaging of the brain code29. Similar supervenience tests on increasingly complex mental with photons and electrons. Current opinion in neurobiology 18, 633–41 (2008). properties will potentially benefit from either higher-throughput 20. Mishchenko, Y. Automation of 3D reconstruction of neural tissue from large imaging modalities30, 31, more coarse brain-graphs32, 33, or both. volume of conventional serial section transmission electron micrographs. Journal of Neuroscience Methods 176, 276–289 (2009). 21. Lu, J., Fiala, J. C. & Lichtman, J. W. Semi-Automated Reconstruction of Neural Methods Processes from Large Numbers of Fluorescence Images. PLoS ONE 4, e5655 The 1-nearest neighbor (1-NN) classifier works as follows. Compute the distance (2009). g between the test brain b and all n training brains, di 5 d(b, bi) for all i [n], where 22. Buckingham, S. D. & Sattelle, D. B. Strategies for automated analysis of C. elegans [n] 5 1,2,..., n. Then, sort these distances, d(1) , d(2) , ... , d(n), and consider their locomotion. Invertebrate neuroscience : IN 8, 121–31 (2008).

SCIENTIFIC REPORTS | 1 : 100 | DOI: 10.1038/srep00100 4 www.nature.com/scientificreports

23. Kessler, R. C. et al. Lifetime prevalence and age-of-onset distributions of DSM-IV 35. Garey, M. R. & Johnson, D. S. Computers and intractability. A guide to the theory of disorders in the National Comorbidity Survey Replication. Archives of general NP-completeness. A Series of Books in the Mathematical Sciences (WH Freeman psychiatry 62, 593–602 (2005). and Company, 1979). 24. Lashley, K. S. In search of the engram. Symposia of the society for experimental biology 4, 30 (1950). 25. Koch, C. The Quest for Consciousness (Roberts and Company Publishers, 2010). Acknowledgments 26. Sporns, O., Tononi, G. & Kotter, R. The Human Connectome: A Structural The authors would like to acknowledge helpful discussions with J. Lande, B. Vogelstein, S. Description of the Human Brain. PLoS Computational Biology 1, e42 (2005). Seung, and K. Kording. This work was partially supported by the Research Program in 27. Hagmann, P. From diffusion MRI to brain connectomics. Ph.D. thesis, Institut de Applied Neuroscience. traitement des signaux (2005). 28. Tong, F., Meng, M. & Blake, R. Neural bases of binocular rivalry. Trends in Author contributions Cognitive Sciences 10, 502–511 (2006). JTV, RJV, and CEP conceived of the manuscript. JTV and CEP wrote it. CEP ran the 29. Chase, S. M. & Young, E. D. Spike-timing codes enhance the representation of experiment. multiple simultaneous soundlocalization cues in the inferior colliculus. Journal of Neuroscience 26, 3889–98 (2006). 30. Hayworth, K.J., Kasthuri, N., Schalek, R. & Lichtman, J. W. Automating the Additional information Collection of Ultrathin Serial Sections for Large Volume TEM Reconstructions. Supplementary information accompanies this paper at http://www.nature.com/ Microscopy and Microanalysis 12(Suppl. 02), 86–87 (2006). scientificreports 31. Bock, D. D. et al. Network anatomy and in vivo physiology of visual cortical Competing financial interests: The authors have no competing financial interests to neurons. Nature 471, 177–182 (2011). declare. 32. Palm, C. et al. Towards ultra-high resolution fibre tract mapping of the human brain–registration of polarised light images and reorientation of fibre vectors. License: This work is licensed under a Creative Commons Frontiers in Human Neuroscience 4 (2010). Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this 33. Johansen-Berg, H. & Behrens, T. E. Diffusion MRI: From quantitative license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ measurement to in-vivo neuroanatomy (Academic Press, 2009). How to cite this article: Vogelstein, J.T., Vogelstein, R.J. & Priebe, C.E. Are mental 34. Stone, C. J. Consistent Nonparametric Regression. The Annals of Statistics 5, 595– properties supervenient on brain properties? Sci. Rep. 1, 100; DOI:10.1038/srep00100 620 (1977). (2011).

SCIENTIFIC REPORTS | 1 : 100 | DOI: 10.1038/srep00100 5