Behavior Research Methods, Instruments, & Computers, 1999, 31 (4), 696-700

AppleTree: A multinomial processing tree modeling program for Macintosh computers

RAINER ROTHKEGEL
University of Trier, Trier, Germany

Multinomial processing tree (MPT) models are statistical models that allow for the prediction of categorical frequency data by sets of unobservable (cognitive) states. In MPT models, the probability that an event belongs to a certain category is a sum of products of state probabilities. AppleTree is a computer program for Macintosh for testing user-defined MPT models. It can fit model parameters to empirical frequency data, provide confidence intervals for the parameters, generate tree graphs for the models, and perform identifiability checks. In this article, the algorithms used by AppleTree and the handling of the program are described.

Many theoretical models in psychology imply that the cognitive system can be in a number of discrete cognitive states, and that behavior depends on which states a cognitive system is in. Because cognitive states themselves are unobservable, conclusions about these states are typically drawn from behavioral data.

Multinomial processing tree (MPT) models belong to a class of methods that allow for the testing of models that predict frequency data from unobservable cognitive states. The statistical properties of multinomial models are well understood (Hu & Batchelder, 1994; Riefer & Batchelder, 1988), and there is a substantial body of research carried out using these models (e.g., Ashby, Prinzmetal, Ivry, & Maddox, 1996; Batchelder & Riefer, 1990; Bayen, Murnane, & Erdfelder, 1996; Buchner, Erdfelder, & Vaterrodt-Plünnecke, 1995; Buchner, Steffens, Erdfelder, & Rothkegel, 1997; Buchner, Steffens, & Rothkegel, 1998; Riefer & Batchelder, 1995; Schweickert, 1993).

MPT models require that responses can be partitioned into a set of disjoint categories. The basic assumption of MPT models is that the joint occurrence or nonoccurrence of cognitive states determines whether a response falls into a certain category or not. MPT models can be viewed as binary tree models where each bifurcation determines whether the system is in a certain state or not. Each leaf (i.e., each end of a branch) is associated with a response category; that is, whenever the system is in the states indicated by the links along a branch, a response is produced that falls into the category that appears as the leaf of that branch.

Hu (1999) has developed a multinomial processing tree modeling software for MS-DOS, called MBT. AppleTree was developed because, to my knowledge, there was no program for MacOS capable of fitting MPT models. In addition, a program was needed that has no limitations on the size of the tree models.

Author Note: The work reported here was supported by Grant Bu 945/1-2 from the Deutsche Forschungsgemeinschaft. The author thanks Axel Buchner, Edgar Erdfelder, Xiangen Hu, and Martin Brand for help and feedback in the implementation process of AppleTree. For reviews of the manuscript the author thanks Axel Buchner and Karl F. Wender. Correspondence should be addressed to R. Rothkegel, FB I - Psychologie, Universität Trier, D-54286 Trier, Germany (e-mail: rainer@cogpsy.uni-trier.de).

A SIMPLE EXAMPLE

Suppose we want to examine the structure of interitem associations in serial learning. In an experiment, participants learn a list of items and are then given a cued serial recall task with one item (A) as cue and the task to provide the first (B) and second (C) successor of that item. Responses are grouped into four categories depending on whether item B and/or item C are reproduced correctly. Suppose we want to predict these frequencies by the interitem associations between items A, B, and C. A simple model, according to which only forward associations exist between these items, requires three states and therefore three parameters: pAB, pBC, and pAC are the probabilities of associations between items A and B, B and C, and A and C, respectively. Depending on which combination of states a person is in, different outcomes would be expected. Consider, for instance, the two sets of states that would lead to both items B and C being reproduced correctly (henceforth category Fcc). One possibility is that there is an association between items A and B on the one hand and an association between items B and C on the other. The other possibility is that there is an association between items A and B on the one hand and an association between items A and C on the other. Both sets of states would lead to correct responses for both items B and C. The probability for category Fcc is therefore pAB*pBC + pAB*(1-pBC)*pAC. In that way the entire model can be described by an equation system, and it can be illustrated by the binary tree model given in Figure 1. This example will be used later in this article to illustrate the handling of AppleTree.
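To make the resulting equation system concrete, the following minimal Python sketch (my own illustration, not part of AppleTree) evaluates the category probabilities of the example model; the expressions follow the model file given later in this article, and the numerical parameter values are arbitrary.

```python
# Category probabilities of the serial learning example as functions of the
# association probabilities pAB, pBC, and pAC (expressions as in the model
# file shown later in this article; an illustrative sketch only).
def category_probabilities(pAB, pBC, pAC):
    return {
        "Fcc": pAB * pBC + pAB * (1 - pBC) * pAC,   # B and C reproduced correctly
        "Fcf": pAB * (1 - pBC) * (1 - pAC),         # only B correct
        "Ffc": (1 - pAB) * pAC,                     # only C correct
        "Fff": (1 - pAB) * (1 - pAC),               # neither correct
    }

probs = category_probabilities(0.4, 0.3, 0.2)       # arbitrary illustrative values
print(probs)                                        # Fcc comes out to about .176
print(sum(probs.values()))                          # sums to 1 (up to rounding)
```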


[Figure 1. Tree graph picture generated by AppleTree.]

ALGORITHMS

The basic algorithms used in AppleTree were developed by Hu and Batchelder (1994). Therefore, they are only briefly described in this article, using the notation of Hu and Batchelder.

Branch and Category Probabilities
As noted, in a binary tree describing an MPT model, each link is associated with a probability. The probability for a system being in a certain set of states is the product of link probabilities along a branch:

\[
p_{ij}(\Theta) = c_{ij} \prod_{s=1}^{S} \theta_s^{a_{ijs}} (1 - \theta_s)^{b_{ijs}}, \tag{1}
\]

where Θ is a vector representing the S free parameters (link probabilities), a_ijs is the frequency with which parameter θ_s appears in branch i of category j, b_ijs is the frequency of its complement in this branch, and c_ij is the product of all constant link probabilities in the branch. (Beginning with Version 1.1, AppleTree uses a binary tree representation for the structural equations instead of the 3-D arrays a and b. This is usually more efficient, both with regard to computational complexity and memory requirements.)

A certain response category may be reached by more than one branch, and category probabilities are the sums of the branch probabilities of all branches leading to a certain category:

\[
p_j(\Theta) = \sum_{i=1}^{I_j} p_{ij}(\Theta). \tag{2}
\]

AppleTree is able to handle joint multinomial models, that is, models that consist of several processing trees that may use a common set of parameters. In this case, c_ij in Equation 1 is multiplied by the relative proportion of category frequencies for the tree to which the branch belongs.

Goodness of Fit
The fit of a parameter vector to a given set of category frequencies n_j is determined with the power divergence family of statistics (Read & Cressie, 1988). PD_λ defines a family of statistics (dependent on one parameter λ) that is asymptotically χ² distributed. For multinomial processing tree models, PD_λ can be defined as

\[
PD_\lambda = \frac{2}{\lambda(\lambda + 1)} \sum_{j=1}^{J} n_j \left[ \left( \frac{n_j}{N p_j(\Theta)} \right)^{\lambda} - 1 \right], \tag{3}
\]

where N is the total number of observations. Special cases are λ = 0 and λ = 1. If λ equals 1, PD_λ is equal to the Pearson χ². If λ equals 0, PD_λ is equal to the likelihood ratio G². In this case, Equation 3 has to be transformed by taking the limit lim_{λ→0} PD_λ.

Parameter Estimation
AppleTree uses the expectation-maximization (EM) algorithm (see Dempster, Laird, & Rubin, 1977) to obtain estimators for the link probabilities that minimize the PD_λ statistic. In the E-step, expected frequencies m_ij for each branch are computed given a parameter vector Θ and a set of category frequencies n_j:

\[
m_{ij}(\Theta) = \frac{n_j \, p_{ij}(\Theta)}{p_j(\Theta)}. \tag{4}
\]

In the M-step, revised estimates of the parameters are computed given the expected branch frequencies:

\[
\hat{\theta}_s = \frac{\displaystyle \sum_{j=1}^{J} \left[ \frac{n_j}{N p_j(\Theta)} \right]^{\lambda} \sum_{i=1}^{I_j} a_{ijs} \, m_{ij}(\Theta)}
                 {\displaystyle \sum_{j=1}^{J} \left[ \frac{n_j}{N p_j(\Theta)} \right]^{\lambda} \sum_{i=1}^{I_j} \left( a_{ijs} + b_{ijs} \right) m_{ij}(\Theta)}. \tag{5}
\]

The degree of change from one iteration to the next can be adjusted by a step width parameter ε. By setting ε to a value lower or greater than 1, the degree of change of the parameter vector Θ from iteration k - 1 to iteration k is reduced or increased to

\[
\Theta^{(k)} = \Theta^{(k-1)} - \varepsilon \left( \Theta^{(k-1)} - \hat{\Theta}^{(k)} \right), \tag{6}
\]

where Θ̂^(k) denotes the vector of estimates obtained from Equation 5.

The estimation begins by initializing the parameter vector with a set of starting values. Then, in each iteration, Equations 4 to 6 are applied until a stopping criterion is reached. The stopping criterion is reached when the change in the parameter vector and the PD_λ fit measure is below a critical level. After each iteration,

\[
\delta = \sup \left\{ \left| PD_\lambda^{(k)} - PD_\lambda^{(k-1)} \right|, \; \sup \left\{ \left| \theta_s^{(k)} - \theta_s^{(k-1)} \right| : s = 1, \ldots, S \right\} \right\} \tag{7}
\]

is computed. If δ is smaller than the specified Stop-Delta (see below), the estimation process stops.
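To make Equations 1-7 concrete, here is a compact Python sketch of the fitting loop for a single-tree model, applied to the example model and data used in this article. It follows the equations given above but is not AppleTree's code; the function names, the data structures, and the starting values of .5 are my own choices, and no confidence intervals or identifiability checks are computed.

```python
import math

def branch_prob(theta, a, b):
    """Probability of one branch (Equation 1, with c_ij = 1)."""
    p = 1.0
    for s, t in enumerate(theta):
        p *= (t ** a[s]) * ((1.0 - t) ** b[s])
    return p

def category_probs(theta, branches, n_categories):
    """Category probabilities (Equation 2)."""
    p = [0.0] * n_categories
    for cat, a, b in branches:
        p[cat] += branch_prob(theta, a, b)
    return p

def pd_lambda(n, p, lam):
    """Power divergence statistic (Equation 3); lam = 0 gives G2, lam = 1 gives Pearson X2."""
    N = sum(n)
    if abs(lam) < 1e-10:   # limit for lam -> 0
        return 2.0 * sum(nj * math.log(nj / (N * pj)) for nj, pj in zip(n, p) if nj > 0)
    return 2.0 / (lam * (lam + 1.0)) * sum(nj * ((nj / (N * pj)) ** lam - 1.0)
                                           for nj, pj in zip(n, p))

def em_step(theta, branches, n, lam):
    """One E-step (Equation 4) and M-step (Equation 5)."""
    N, S = sum(n), len(theta)
    p = category_probs(theta, branches, len(n))
    num, den = [0.0] * S, [0.0] * S
    for cat, a, b in branches:
        m = n[cat] * branch_prob(theta, a, b) / p[cat]   # expected branch frequency
        w = (n[cat] / (N * p[cat])) ** lam               # category weight from Equation 5
        for s in range(S):
            num[s] += w * a[s] * m
            den[s] += w * (a[s] + b[s]) * m
    return [num[s] / den[s] if den[s] > 0 else theta[s] for s in range(S)]

def fit(branches, n, n_params, lam=0.0, eps=1.0, stop_delta=1e-8,
        max_iter=100000, start=None):
    theta = list(start) if start is not None else [0.5] * n_params
    old_fit = pd_lambda(n, category_probs(theta, branches, len(n)), lam)
    for _ in range(max_iter):
        proposal = em_step(theta, branches, n, lam)
        theta_new = [t - eps * (t - h) for t, h in zip(theta, proposal)]    # Equation 6
        new_fit = pd_lambda(n, category_probs(theta_new, branches, len(n)), lam)
        delta = max(abs(new_fit - old_fit),
                    max(abs(x - y) for x, y in zip(theta_new, theta)))      # Equation 7
        theta, old_fit = theta_new, new_fit
        if delta < stop_delta:
            break
    return theta, old_fit

# Example model: parameters (pAB, pBC, pAC), categories Fcc, Fcf, Ffc, Fff.
# Each branch is (category index, a, b); a[s] and b[s] count how often
# parameter s or its complement appears on the branch.
branches = [
    (0, (1, 1, 0), (0, 0, 0)),   # Fcc: pAB*pBC
    (0, (1, 0, 1), (0, 1, 0)),   # Fcc: pAB*(1-pBC)*pAC
    (1, (1, 0, 0), (0, 1, 1)),   # Fcf: pAB*(1-pBC)*(1-pAC)
    (2, (0, 0, 1), (1, 0, 0)),   # Ffc: (1-pAB)*pAC
    (3, (0, 0, 0), (1, 0, 1)),   # Fff: (1-pAB)*(1-pAC)
]
theta, g2 = fit(branches, [25, 35, 40, 100], n_params=3)
print(theta, g2)   # saturated model: G2 approaches 0
```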
Confidence Intervals
To provide confidence intervals, AppleTree computes the observed Fisher information matrix (see Efron & Hinkley, 1978) for the parameters. For reasons of brevity, the algorithm is not presented here (see Hu & Batchelder, 1994, for details). The inverse of the Fisher information matrix is the expected variance-covariance matrix for the parameters. Since the parameter estimates can be shown to be asymptotically normally distributed, confidence intervals for the parameters can be computed from their variances in the usual way (i.e., as the estimate plus or minus the appropriate standard normal quantile times the standard error).

PROGRAM HANDLING

Given a description of a multinomial model and a set of frequency data, AppleTree can determine the fit of the model to the data. It provides estimates for the model parameters as well as their variance-covariance matrix and confidence intervals. It also provides a means to check the identifiability of the model and is able to generate a graphic representation of the tree model. Given that the memory partition for AppleTree is large enough, AppleTree has no limits of practical relevance on the size of the models (i.e., the number of parameters, categories, trees, and tree branches).

Entering Model and Data Descriptions
As a first step, two plain (ASCII) text files must be provided, one with a description of the model, the other with the data that shall be fitted by the model. Both files can be generated with any text editor, but one can also use AppleTree for this task, because the program has a built-in text editor. For both data and model files, AppleTree largely conforms to the format specifications used by Hu (1999) in his MBT program.

Format of Data Files
The first line of the data file must be the title for the data to follow in the subsequent lines. One line represents data from one response category. Each line begins with a unique name for the category. Category names may consist of any sequence of alphanumerical characters. The empirical frequency for this category follows after the category name, separated from the name by any sequence of blanks (tabulators or spaces). Categories can be entered in any order into the data file. Immediately after the last line of data there must be a line beginning with three consecutive equals signs. Additional data sets can be appended after this line in the same way as the first one.

In the example given in the introduction, the frequency distribution of the answers may comprise 25 trials in which B and C are reproduced correctly (Category Fcc), 35 trials in which only B is reproduced correctly (Category Fcf), 40 trials in which only C is reproduced correctly (Category Ffc), and 100 trials in which both answers are false (Category Fff). The data file for this example looks like this:

Hypothetical response frequencies after a serial learning task
Fcc 25
Fcf 35
Ffc 40
Fff 100

Format of Model Files
The first line of the model description file contains the total number of branches specified in the model. Beginning with the second line, descriptions of the tree branches follow, one line for each branch. The branch descriptions do not have to follow any particular order. Every line has to begin with the name of the tree the branch belongs to. Like category names, tree names can consist of any sequence of alphanumerical characters. Separated by one or more blanks, the name of the category follows to which the particular branch belongs. After the category name and one or more blanks, the expression appears that determines the branch probability. The branch probability is a product of parameters and/or their complements (1 - parameter), in the particular sequence in which they appear in the tree branch, starting from the root. For parameter names, the same rules apply as for category and tree names.

The model in the present example consists of one tree (called "Pairs") with five branches. The model file for the example is as follows:

5
Pairs Fcc pAB*pBC
Pairs Fcc pAB*(1-pBC)*pAC
Pairs Fcf pAB*(1-pBC)*(1-pAC)
Pairs Ffc (1-pAB)*pAC
Pairs Fff (1-pAB)*(1-pAC)
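Both formats can be read with a few lines of code. The sketch below merely illustrates the formats as described above; it is not AppleTree's parser, and the function names and the lenient whitespace handling are my own choices.

```python
def read_data_file(path):
    """Return a list of (title, {category: frequency}) pairs, one per data set."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    data_sets, i = [], 0
    while i < len(lines):
        title, freqs = lines[i], {}
        i += 1
        while i < len(lines) and not lines[i].startswith("==="):
            name, value = lines[i].split()[:2]      # category name, then frequency
            freqs[name] = int(value)
            i += 1
        i += 1                                      # skip the "===" separator line
        data_sets.append((title, freqs))
    return data_sets

def read_model_file(path):
    """Return a list of (tree, category, expression) triples, one per branch."""
    with open(path) as f:
        lines = [line.strip() for line in f if line.strip()]
    n_branches = int(lines[0])                      # first line: number of branches
    return [tuple(line.split(None, 2)) for line in lines[1:1 + n_branches]]

# e.g. read_model_file("SerLearn.eqn") -> [("Pairs", "Fcc", "pAB*pBC"), ...]
```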
Entering the Files into AppleTree
To fit a model to a set of data, users must first enter the model file and the data file into AppleTree. These files can be entered by using the appropriate menu commands or by dragging the files from the Finder to the Model and Data fields in the AppleTree window (Figure 2). The files can also be entered by dropping them on the AppleTree icon in the Finder. In this case, files with the extension ".eqn" are treated as model files, and files with the extension ".mdt" are treated as data files. Files without these extensions are shown in a text editor window.

Upon entering a model file into AppleTree, the program checks whether the equations can be transformed into binary trees. If AppleTree accepts a model file, one can be sure that the equations are consistent descriptions of binary trees. To check whether the equations describe the right kind of model, it is advisable to examine the tree graph, which can be obtained by selecting Show Tree Graph in the Model menu. Tree graphs (Figure 1) can be exported by copy and paste or by drag and drop.
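The article does not say how this check is implemented. As a rough illustration, one necessary (though not sufficient) condition that users can verify themselves is that, within each tree, the branch probabilities sum to 1 for any parameter values; the sketch below spot-checks this numerically for the example model (the helper name and the use of eval on locally defined, trusted expressions are my own choices, not AppleTree's algorithm).

```python
import random
import re

def check_trees(branches, tol=1e-10, trials=20):
    """Numeric sanity check: for every tree, the branch probabilities must sum
    to 1 for any parameter values drawn at random."""
    exprs = " ".join(expr for _, _, expr in branches)
    params = sorted(set(re.findall(r"[A-Za-z_]\w*", exprs)))
    trees = sorted(set(tree for tree, _, _ in branches))
    for _ in range(trials):
        values = {p: random.random() for p in params}
        for tree in trees:
            total = sum(eval(expr, {}, values)          # expressions come from a trusted local file
                        for t, _, expr in branches if t == tree)
            if abs(total - 1.0) > tol:
                return False
    return True

branches = [
    ("Pairs", "Fcc", "pAB*pBC"),
    ("Pairs", "Fcc", "pAB*(1-pBC)*pAC"),
    ("Pairs", "Fcf", "pAB*(1-pBC)*(1-pAC)"),
    ("Pairs", "Ffc", "(1-pAB)*pAC"),
    ("Pairs", "Fff", "(1-pAB)*(1-pAC)"),
]
print(check_trees(branches))   # True for the example model
```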


[Figure 2. AppleTree window.]

Fitting the Model
Models can be fitted by pressing the Run button in the AppleTree window. If the default settings in the Data, Fitting, and Results fields are left unchanged, AppleTree fits the model to the first data set in the data file. During the fitting process, AppleTree shows the parameter values in the Model field. The number of iterations and the PD_λ statistic are shown in the Fitting field. After the fitting process has stopped, a text window is opened, showing the results of the fitting process.

Selecting the Data Set
In the Data field, users can select the data set to which the model is fitted. If users want to fit the model to more than one data set, they can check the Batch Mode check box. In this case, AppleTree fits the model to all data sets in the data file, starting with the current selection in the Data Set pop-up menu and continuing to the last data set in the file.

Controlling the Fitting Process
The controls in the Fitting field can be used to influence the fitting process and to monitor its results. In the Lambda field, users can select the λ parameter for the PD_λ statistic (see Equation 3). In the Stepsize field, users can control the step width ε by which the parameters are changed in each iteration (see Equation 6). This is especially useful if the fitting process "gets stuck" in a local minimum or alternates constantly between two sets of values for the parameters. The stepsize can also be changed during the fitting process by pressing the "+" or "-" keys on the keyboard. If users set Stepsize to 0, the parameters are left unchanged. This is useful if one wants to see the fit of the model for the initial values of the parameters. The stopping criterion for the fitting process (see Equation 7) can be altered by changing the value of Stop-Delta.

Controlling the Output
The options that control the output of the results are listed in the Results field. By default, all output appears in a window. By selecting File..., all output is written to a file. In that way, the output does not occupy any RAM. This is helpful if users want to fit very large numbers of data sets (e.g., when performing Monte Carlo studies). If users select Nowhere, no output is generated beyond the output visible in the AppleTree window.

If the Parameters check box is checked, AppleTree lists the values of all parameters. For free parameters, confidence intervals are given. The Alpha field may be used to set the error probability for the confidence intervals. For instance, setting Alpha to .05 yields 95% confidence intervals for the parameters.

If the Frequencies check box is checked, AppleTree lists the empirical frequencies from the data file, together with the expected frequencies based on the model, as well as the ratios of these frequencies.

If Fisher Information is checked, the inverted Fisher information matrix is shown in the output. It provides an estimate of the variance-covariance matrix of the free parameters (see above).

The Identifiability option allows for a check of the identifiability of the model. A model is nonidentifiable if there is more than one set of parameter values that leads to a global minimum. If Identifiability is checked, AppleTree repeats the fitting process with randomly determined sets of initial parameter values. The number of repetitions can be set in the Runs field. For each run, the values of all free parameters, together with the fit values, are reported. After the last run, ranges for the parameters and fit values are given. If all runs result in equal fit values but some parameter values differ, the model is nonidentifiable (given that the global minimum was found).

In order to allow for a monitoring of the progress of the fitting procedure, the parameter values, the fit value, and the number of iterations are updated once every second.
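The logic of this option can be sketched as follows. This is an illustration of the procedure described above, not AppleTree's implementation; it assumes a fitting routine fit_from(start) that returns a parameter vector and a fit value (for instance, the fit function sketched in the Algorithms section, called with the start argument set to random values).

```python
import random

def identifiability_check(fit_from, n_params, runs=20, tol=1e-4):
    """Refit the model from random starting values; equal fit values combined
    with diverging parameter estimates suggest that the model is not
    identifiable (a sketch of the procedure described above)."""
    results = [fit_from([random.random() for _ in range(n_params)])
               for _ in range(runs)]
    fits = [fit_value for _, fit_value in results]
    fit_range = max(fits) - min(fits)
    param_ranges = [max(p[s] for p, _ in results) - min(p[s] for p, _ in results)
                    for s in range(n_params)]
    if fit_range < tol and any(r > tol for r in param_ranges):
        print("Equal fit values but different parameter estimates: "
              "the model may be nonidentifiable.")
    return fit_range, param_ranges

# Example, reusing the fit() sketch from the Algorithms section:
# identifiability_check(lambda s: fit(branches, [25, 35, 40, 100], 3, start=s),
#                       n_params=3)
```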
Changing the Model
Restrictions to the model specified in the model file can be added in the Model field. In the scroll field, every parameter is listed together with its value and its status. Initially all parameters are free parameters; that is, their values are estimated in the fitting process. If users select Fixed in the pop-up menu for a parameter, then this parameter is treated as a constant. Users can also set one parameter equal to another parameter by selecting the name of the other parameter in the pop-up menu associated with the parameter. At the bottom of the Model field, the degrees of freedom for the model with the current restrictions are displayed. This value can be negative if there are too many parameters relative to the number of empirical categories to be predicted. The degrees of freedom must not be negative, or else the parameter estimates cannot be interpreted.
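As an illustration (my own arithmetic, using the usual counting rule for a single-tree MPT model rather than a formula stated in the article): the degrees of freedom equal the number of free category proportions minus the number of free parameters, so

\[
df = (J - 1) - S, \qquad (4 - 1) - 3 = 0 \ \text{(unrestricted model)}, \qquad (4 - 1) - 2 = 1 \ \text{(with } pBC = pAB\text{)}.
\]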

For the example given above, fitting the unrestricted model reveals a perfect fit of the model to the data. This is trivial, because this model has no degrees of freedom. Nontrivial versions of the model can be obtained by adding restrictions. A viable restriction would be to allow only one parameter for associations between direct successors in the list. To add this restriction, select "=pAB" in the pop-up menu next to the parameter pBC in the Model field. In that way, parameters pAB and pBC are set to be equal. Another conceivable variant would be a model in which associations exist only between direct successors in the list. This can be achieved by fixing parameter pAC at a value near 0. For technical reasons, parameter values have to be inside the interval (.0001, .9999).

TECHNICAL REMARKS

AppleTree requires MacOS 7.5 or later. The program is a so-called fat binary; that is, it contains code for the 680x0 (68020 at minimum) and the PowerPC family of processors. It is highly recommended to run AppleTree on a PowerPC Macintosh because, for compatibility reasons, the 680x0 code does not contain any direct floating point unit (FPU) calls and is therefore much slower than the PowerPC code.

Availability
AppleTree is available free of charge on the World Wide Web. The version described in this article is 1.2.1. Information about recent changes can be obtained on the AppleTree Web page.

REFERENCES

Ashby, F. G., Prinzmetal, W., Ivry, R., & Maddox, W. T. (1996). A formal theory of feature binding in object perception. Psychological Review, 103, 165-192.
Batchelder, W. H., & Riefer, D. M. (1990). Multinomial processing models of source monitoring. Psychological Review, 97, 548-564.
Bayen, U. J., Murnane, K., & Erdfelder, E. (1996). Source discrimination, item detection, and multinomial models of source monitoring. Journal of Experimental Psychology: Learning, Memory, & Cognition, 22, 197-215.
Buchner, A., Erdfelder, E., & Vaterrodt-Plünnecke, B. (1995). Toward unbiased measurement of conscious and unconscious memory processes within the process dissociation framework. Journal of Experimental Psychology: General, 124, 137-160.
Buchner, A., Steffens, M., Erdfelder, E., & Rothkegel, R. (1997). A multinomial model to assess fluency and recollection in a sequence learning task. The Quarterly Journal of Experimental Psychology, 50A, 631-663.
Buchner, A., Steffens, M., & Rothkegel, R. (1998). On the role of fragmentary knowledge in a sequence learning task. The Quarterly Journal of Experimental Psychology, 51A, 251-281.
Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, 39, 1-38.
Efron, B., & Hinkley, D. V. (1978). Assessing the accuracy of the maximum likelihood estimator: Observed versus expected Fisher information. Biometrika, 65, 457-487.
Hu, X. (1999). Multinomial processing tree models: An implementation. Behavior Research Methods, Instruments, & Computers, 31, 689-695.
Hu, X., & Batchelder, W. H. (1994). The statistical analysis of general processing tree models with the EM algorithm. Psychometrika, 59, 21-47.
Read, T. R. C., & Cressie, N. A. C. (1988). Goodness-of-fit statistics for discrete multivariate data. New York: Springer.
Riefer, D. M., & Batchelder, W. H. (1988). Multinomial modeling and the measurement of cognitive processes. Psychological Review, 95, 318-339.
Riefer, D. M., & Batchelder, W. H. (1995). A multinomial modeling analysis of the recognition-failure paradigm. Memory & Cognition, 23, 611-630.
Schweickert, R. (1993). A multinomial processing tree model for degradation and redintegration in immediate recall. Memory & Cognition, 21, 168-175.

(Manuscript received January 26, 1998; revision accepted for publication May 28, 1998.)