Child Development Perspectives (In Press)

Total Page:16

File Type:pdf, Size:1020Kb

Child Development Perspectives (In Press)

Child Development Perspectives (In Press)

The theory theory 2.0: Probabilistic models and cognitive development

Commentary on Nora Newcombe Neoconstructivism

Alison Gopnik Dept. Of Psychology University of California at Berkeley, Berkeley, CA, 94704

[email protected]

Newcombe’s piece is a welcome attempt to reconcile the profusion of terrific empirical work in cognitive development with the unsatisfactory traditional theoretical positions. Most of her points are very well-taken, and define an emerging empirical consensus. However, she doesn’t capture one important theoretical dimension. Newcombe opposes nativism to everything else, combining connectionist and dynamic systems theories, information processing theories, the “theory theory”, statistical learning and Bayesian inference. But there is an alternative split that is equally important and that carves up the theoretical territory rather differently. This is a contrast between representational and non- representational accounts of development. While almost all nativist approaches are representational, among empiricists there is an important divide between those who, like nativists, embrace representation, and those who deny it.

A deep theoretical tension lies at the heart of developmental cognitive science, reflecting a deep tension in epistemology. Since Plato, two facts about human knowledge. have clashed. We seem to have abstract, structured and accurate representations of the world – representations that allow a wide range of new inferences. But all that reaches us from the world are concrete, probabilistic patterns of sensory stimuli. How do we get such powerful knowledge from such impoverished data? Nativists, from Plato to “core knowledge” theorists, argue that the abstract structure is actually innate, we only appear to learn it from experience. Empiricists, from Aristotle to the connectionists, argue that we only appear to have abstract, structured, representations – actually we just accumulate specific associations.

The empirical accomplishments of the last thirty years have actually made the theoretical tension worse. We might have thought, as Piaget did, that children begin with concrete particular “sensorimotor” schemes and somehow gradually construct more abstract representations. But, empirically, as Newcombe points out, one of the most important discoveries of the last thirty years is that children—even young infants—already have abstract, structured, representations of the world: intuitive theories and grammars, conceptual hierarchies and phonological and spatial maps. But, as she also points out, we have accumulated equally compelling evidence that those representations change as a result of experience and learning. In fact, we’ve discovered that even young infants learn in surprisingly complex and sophisticated ways. They transform their representations based on concrete experiences -- the contingent, probabilistic evidence of their senses. How is this possible?

Connectionist and dynamic theories, like their associationist precursors, allow learning but deny that there are abstract representations. Nativism allows representation but denies that there is substantive learning. Many empirically minded developmental psychologists, like those Newcombe describes, have been dissatisfied with both those options. In fact, for that very reason, many of us have advocated the “theory theory” – the idea that children’s learning is like theory change in science—because in science we also see both rich structure and significant learning. However, until recently, there were no precise computational accounts of theories or theory change. Connectionism, in particular, could characterize some types of learning but only at the cost of eschewing representation.

Fortunately, that situation has changed dramatically in the past decade. Newcombe mentions Bayesian learning as an alternative to association, but Bayesian learning is just one part of a broader approach, sometimes called the “probabilistic model” approach, that dominates machine learning and is increasingly influential in cognitive science (for recent reviews see Gopnik & Schulz, 2007 and Griffiths et al, 2010; for applications to cognitive development see Gopnik et al. 2004; Xu and Tenenbaum, 2005 and special sections of Developmental Science (2007) and Cognition (in press). )Probabilistic models promise a computationally precise developmental cognitive science that can integrate structure and learning. Like nativist approaches, but unlike traditional empiricist approaches, probabilistic models propose that even very young infants have abstract, structured, hierarchical representations -- representations that we can think of as hypotheses about objects, people or language. Unlike nativism, however, the probabilistic models approach allows for learning and even, in the most recent formulations, for radical theory change as a result of evidence (for a recent example see Kemp et al, 2010). Unlike the connectionist or dynamic picture, learning is rational. Probabilistic models stem from work in the philosophy of science and machine learning that outlines how a system could, in principle, make the best inferences from data. Unlike “core knowledge”, learning in probabilistic models doesn’t require external representations like language.

The central advance has been to formulate structured representations, such as causal graphical models, or “Bayes nets” that can be easily combined with probabilistic learning, such as Bayesian inference. “Theory theorists” proposed that children learn by constructing hypotheses and testing them against evidence. But if this is a deterministic process then the “poverty of the stimulus” problem becomes acute – there will never be enough data to definitively prove that one hypothesis is right and reject the rest. In contrast, if the child is a probabilistic learner, weighing the evidence to strengthen or reduce support for one hypothesis over another, we can help explain how children are gradually able to revise their initial theories in favor of better ones. Young children do indeed behave like probabilistic learners – entertaining multiple hypotheses, weighing new possibilities against prior beliefs, experimenting and explaining – rather than simply using associationist mechanisms to match the patterns in the data or fiddling with details of innate core knowledge.

The ultimate test of any perspective is whether it generates new and interesting empirical research. Researchers inspired by the probabilistic models approach have already begun to make important developmental discoveries, discoveries that don’t fit either a connectionist/dynamic or nativist picture. 9-month-olds, for example, can make causal inferences that go beyond association (Sobel and Kirkham, 2007) 20-month-olds can infer a person’s desire from a non-random sampling pattern (Kushnir, Xu and Wellman, in press) , and 4-year-olds discover new abstract variables and rules from only a few data points (Schulz et al. 2008; Lucas et al, 2010), integrate new evidence and prior knowledge (Schulz et al. 2007; Kushnir & Gopnik 2007, Sobel et al, 2004) and rationally experiment to uncover new causal structure (Schulz & Bonawitz, 2007).

Developmental evidence has also inspired computational advances. The computational framework began by using statistical patterns to infer simple underlying patterns, as in data-mining. Developmentalists emphasize the importance of framework theories, explanation and experimentation, and social context. Computationalists are starting to tackle those problems, too, with some success. Finally, but most important of all, the probabilistic models approach is broad enough that there are many theoretical specifics to be worked out and many exciting empirical questions to ask.

References

Gopnik, A,; Glymour, C.; Sobel, D.; Schulz, L.; Kushnir, T.; Danks, D. (2004). A theory of causal learning in children: Causal maps and Bayes nets. Psychological Review, 111(1),3-32

Gopnik, A. & Schulz L. (eds). (2007). Causal learning: Philosophy, psychology and computation. NY,: Oxford University Press.

Griffiths. T.; Chater, N.; Kemp, C.; Perfors, A., & Tenenbaum, J. (2010). Probabilistic models of cognition: Exploring representations and inductive biases. Trends in Cognitive Sciences,

Kemp, C.; Tenenbaum, J. B, Niyogi, S. & Griffiths, T. (2010). A probabilistic model of theory formation. Cognition, 114, 2 , 165-196.

Kirkham, N., & Sobel, D. (2006). Blicket and babies: The development of causal reasoning in toddlers and infants. Developmental Psychology. 42(6), 1103-1115. Kushnir, T., & Gopnik, A. (2007). Conditional probability versus spatial contiguity in causal learning: Preschoolers use new contingency evidence to overcome prior spatial assumptions. Developmental Psychology, Vol 43(1), 186-196

Kushnir, T., Xu, F., & Wellman, H. (in press). Young children use statistical sampling to infer the preferences of others. Psychological Science.

Lucas C.; Gopnik, A. & Griffiths, T. (2010). Developmental differences in learning the form of causal relationships. Proceedings of the Cognitive Science Society.

Schulz, L.; Goodman, N.; Tenenbaum, J.; & Jenkins, A. (2008). Going beyond the evidence: Abstract laws and preschoolers' responses to anomalous data. Cognition 109(2), 211-223.

Schulz, L.; Bonawitz, E.; & Griffiths, T. (2007). Can being scared cause tummy aches? Naive theories, ambiguous evidence, and preschoolers' causal inferences. Developmental Psychology, 43(5), 1124-1139.

Schulz, L. E. & Bonawitz, E. B. (2007). Serious fun: Preschoolers engage in more exploratory play when evidence is confounded. Developmental Psychology, 43, 1045- 1050

Xu, F. & Tenenbaum, J. B. (2007). Word learning as Bayesian inference. Psychological Review, 245-272

Recommended publications