A Fundamental Property of All-Or-None Models, Binomial Distribution of Responses Prior to Conditioning, with Application to Concept Formation in Children
The Stanford Institute for Mathematical Studies in the Social Sciences
Applied Mathematics and Statistics Laboratories, Stanford University
Reprint No. 70

PATRICK SUPPES AND ROSE GINSBERG
Stanford University

Reprinted from Psychological Review, 1963, Vol. 70, No. 2, 139-161

This research was performed pursuant to a contract with the United States Office of Education, Department of Health, Education, and Welfare.

A basic assumption of the simple all-or-none conditioning model is that the probability of a correct response remains constant over trials before conditioning. 4 implications of this assumption were tested: (a) prior to the last error there will be no evidence of learning, (b) the sequence of responses prior to the last error forms a sequence of Bernoulli trials, (c) responses prior to the last error exhibit a binomial distribution, and (d) specific sequences of errors and successes are distributed in accordance with the binomial hypothesis. These 4 tests were performed on the data from 7 experiments concerned with concept formation in children, paired-associate learning and probability learning in adults, and T maze learning in rats. The statistical evidence from these various experimental groups provided substantial support of the all-or-none model. However, when Vincent curves were constructed for responses prior to the last error, some of the learning curves showed significant departures from stationariness.

In the past year or two there has been extensive application of a single stimulus element conditioning model to paired-associate learning (Bower, 1961; Estes, 1961) and to concept formation in children (Suppes & Ginsberg, 1962). In a paired-associate experiment the single stimulus element represents a stimulus item from a list of paired associates; in a concept formation experiment the stimulus element represents a concept, or some aspect of a concept. The two essential assumptions of the model are the following. First, until the single stimulus element is conditioned, there is a constant guessing probability, $p$, that the subject responds correctly (the probability of an error on every trial is $q = 1 - p$). Second, on each trial there is a constant probability, $c$, that the single stimulus element will be conditioned to the correct response. We consider only those situations in which the subject is always informed of the correct response so that the correct association may be learned on any trial.

This all-or-none conditioning model may be viewed as resulting from imposing special restrictions on more general models of stimulus sampling theory. The statistics of this model have been analyzed in great detail in Bower (1961). Supplementary statistics for a finite number of trials at the end of which not all subjects are conditioned have been given by Estes (1961) and Suppes and Ginsberg (1962).

The point of the present paper is to make explicit a simple but fundamentally important fact about the all-or-none conditioning model: the assumption of a constant guessing probability on each trial before conditioning implies that there is a binomial distribution, with parameter $p$, of responses prior to the last error.² This observation has three important consequences for the analysis of experimental data. First, it implies that the sequence of responses prior to the last error forms a sequence of Bernoulli trials. This null hypothesis admits at once the possibility of applying the many powerful statistics that are not applicable in the usual learning situation for which the theory postulates dependence of responses from trial to trial. Second, the consideration of response sequences prior to the last error makes possible a deeper analysis of response data than do statistics which are averaged over subjects and are a function of the conditioning parameter $c$. When statistics are expressed as a function of $c$ and the data are analyzed in terms of all subjects regardless of whether or not they are conditioned, then it is often the case that the large number of correct responses occurring after conditioning bias the statistics very favorably in terms of the model. Third, the observation that the distribution of responses prior to the last error should be binomial permits generalization of the model to admit individual differences in the conditioning parameter $c$, while retaining a uniform guessing parameter $p$.

² It is easy to demonstrate that it is statistically incorrect actually to include the last error in the analysis of response data.

These points may be emphasized by considering just one example of a familiar statistic for the model. Let $P_n(11)$ be the joint probability of a success on Trial $n$ and on Trial $n + 1$. It is easily shown that

$$P_n(11) = 1 - \left[1 - p^2(1 - c) - pc\right](1 - c)^{n-1} \tag{1}$$

Consider now how much simpler this quantity is if we know whether or not the subject is conditioned. Let $U_n$ stand for the unconditioned state on Trial $n$ and $C_n$ for the conditioned state on that trial, etc. Then the conditional probabilities are simply³

$$P_n(11 \mid U_{n+1}) = p^2 \tag{2}$$

$$P_n(11 \mid U_n C_{n+1}) = p \tag{3}$$

$$P_n(11 \mid C_n) = 1 \tag{4}$$

³ There are only three cases to consider, namely, $U_{n+1}$, $U_n C_{n+1}$, and $C_n$, because $U_{n+1}$ implies $U_n$ with probability one and $C_n$ implies $C_{n+1}$ with probability one.

Moreover, except for a few trials after the last error when the subject may be unconditioned but guessing correctly, we know what state he is in. In particular, on all trials prior to the last error we know he is in the unconditioned state and thus that the probability of two successes in a row should be $p^2$. Relative to the third point above, it may be noted that if the data are summed over subjects, test of Equation 1 requires the assumption that all subjects have the same conditioning parameter $c$, whereas test of Equation 2 does not, and is compatible with the assumption of individual differences in conditioning "propensity."

STATISTICAL TESTS OF THE MODEL

Once the observation has been made that according to the model responses prior to the last error have a binomial distribution, it is possible to consider a variety of goodness of fit tests for this assumption. The virtue of these goodness of fit tests is that in contradistinction to the many statistics considered by Bower they permit a genuine statistical evaluation of the null hypothesis that the model fits the data. There are four goodness of fit tests we believe to be of particular importance. In introducing these four tests, we want to emphasize that we are not suggesting they are the only tests or that they are the only interesting ones. It seems to us, however, that they do ask the four most important questions suggested by the "guessing" assumption of the model. The statistical properties of these four tests are well known in the literature and do not need to be discussed here. A good reference for the first two on stationarity and order is Anderson and Goodman (1957).

Stationarity. Perhaps the most striking feature predicted is that if data summed only over responses made prior to the last error are considered, then there will be no evidence of learning over trials. Statistically this means the model predicts a binomial distribution of responses with the constant parameter $p$. From the standpoint of learning theory this is a particularly interesting prediction because of the classical emphasis on the mean learning curve. If the binomial assumption holds, the mean learning curve, when estimated over responses prior to the last error for each subject, will be a horizontal line. Empirical tests of this prediction in experiments concerned with children's concept formation, animal learning, probability learning, and paired-associate learning in human adults, are given below.

where $i = 0, 1$; $n_i(t)$ is the number of correct ($i = 1$) or incorrect ($i = 0$) responses in Block $t$; $n(t)$ is the total number of responses in Block $t$; $n_i$ is the number of correct (or incorrect) responses summed over all blocks; and $N$ is the total number of responses summed over all blocks. The $\chi^2$ statistic has the usual limiting distribution with $T - 1$ degrees of freedom, where $T$ is the number of blocks of trials. If there are $m > 2$ responses, the number of degrees of freedom is $(m - 1)(T - 1)$. Under the restriction to two responses, the expression for $\chi^2$ may be simplified to

$$\chi^2 = \sum_t \frac{\left[N n_1(t) - n_1 n(t)\right]^2}{n_0 n_1 n(t)}$$

thus eliminating the summation over $i$.

Order. The second property following from the guessing assumption which it is critical and significant to test is that the sequence of responses prior to the last error does indeed form a sequence of Bernoulli trials, that is, that there is statistical independence in the responses made from trial to trial. There are various ways of testing this assumption but it seems to us that the simplest and most direct is to test the null hypothesis that the dependence is zero order versus the hypothesis that the dependence is first order. Acceptance of the null hypothesis has the strong implication that we cannot predict
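The stationarity check described above can be illustrated with a minimal sketch. It simulates subjects under the two assumptions of the all-or-none model, keeps only the responses prior to the last error, pools them into blocks of trials, and computes the simplified two-response chi-square statistic. The function names, the block size, and the parameter values are hypothetical choices for illustration and do not come from the paper. The sketch also assumes that conditioning takes effect on the trial after it occurs, which is the reading consistent with Equation 1.

```python
import random


def simulate_subject(p, c, n_trials, rng):
    """Simulate one subject; returns a list of responses (1 = correct, 0 = error).

    Assumption: the subject starts unconditioned, guesses correctly with
    probability p, and after each unconditioned trial becomes conditioned
    with probability c (effective from the next trial on).
    """
    responses = []
    conditioned = False
    for _ in range(n_trials):
        if conditioned:
            responses.append(1)
        else:
            responses.append(1 if rng.random() < p else 0)
            if rng.random() < c:
                conditioned = True
    return responses


def responses_before_last_error(responses):
    """Return the responses strictly before the last error (the error is excluded)."""
    if 0 not in responses:
        return []
    last_error = len(responses) - 1 - responses[::-1].index(0)
    return responses[:last_error]


def pooled_blocks(subjects, block_size, n_blocks):
    """Pool pre-last-error responses over subjects into blocks of trials.

    Returns a list of (n_correct, n_total) pairs, one per non-empty block.
    """
    counts = [[0, 0] for _ in range(n_blocks)]
    for responses in subjects:
        for trial, r in enumerate(responses_before_last_error(responses)):
            t = trial // block_size
            if t < n_blocks:
                counts[t][0] += r
                counts[t][1] += 1
    return [(correct, total) for correct, total in counts if total > 0]


def stationarity_chi_square(blocks):
    """Simplified two-response stationarity statistic.

    blocks is a list of (n_correct, n_total) pairs, one per block t.
    Returns (chi_square, degrees_of_freedom), with T - 1 degrees of freedom.
    """
    n1 = sum(correct for correct, _ in blocks)   # correct responses, all blocks
    big_n = sum(total for _, total in blocks)    # all responses, all blocks
    n0 = big_n - n1                              # errors, all blocks
    if n1 == 0 or n0 == 0:
        raise ValueError("need at least one correct and one incorrect response")
    chi_square = sum(
        (big_n * correct - n1 * total) ** 2 / (n1 * n0 * total)
        for correct, total in blocks
    )
    return chi_square, len(blocks) - 1


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical seed and parameters
    subjects = [simulate_subject(0.5, 0.1, 50, rng) for _ in range(1000)]
    chi_square, df = stationarity_chi_square(pooled_blocks(subjects, 2, 4))
    print(f"chi-square = {chi_square:.3f} with {df} degrees of freedom")
```

Under the model, the pooled proportion of correct responses before the last error should stay close to the guessing probability in every block, so the statistic should usually be small relative to a chi-square distribution with the stated degrees of freedom. The sketch does not reproduce any analysis reported by the authors.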