JOURNAL OF VERBAL LEARNING AND VERBAL BEHAVIOR 6, 232-239 (1967)

Grammar and the Recall of Chains of Verbal Responses¹

KURT SALZINGER
Biometrics Research, New York State Department of Mental Hygiene and Polytechnic Institute of Brooklyn

AND

CAROL ECKERMAN
Columbia University, New York 10032

The validity of transformation for explaining differences in recall of differently structured series (a simple declarative sentence, D, vs. a passive negative query, PNQ) of 10 nonsense syllables and function words was tested. All the main variables, except for grammatical structure, gave rise to significant differences. Function words were more easily recalled than nonsense content words; sentences were more easily recalled than random arrays; whole presentation was superior to serial presentation; and the second series was recalled better than the first. Interactions showed that differences between sentence and random order, between D and PNQ, and between whole and serial learning were markedly reduced from first to second presentation. Function words depended more on contextual constraint than nonsense syllables, and D and PNQ differed most when presented serially in random order. The results could not be explained by a transformation grammar model but were consonant with the concept of frequency of occurrence of single words and combinations of words.

Chomsky (1957) has claimed that an approach to grammar in terms of frequency of single words and frequency of their combinations is inadequate, and he has promulgated a transformation grammar which psychologists such as Miller (1962) have used to investigate the psychological aspects of grammar.

The validity of the transformation model of grammar (namely, that there is a set of rules which allows the S to change or transform the simplest kind of sentence, the kernel,² into the more complicated structures) is, of course, an empirical problem. The simple declarative sentence in the active mood ("The boy threw the ball") would be an example of a kernel, and the passive negative query ("Wasn't the ball thrown by the boy?") an example of a complicated structure. Miller (1962) and Mehler (1963) suggest that memory for complex structures depends on the coding of the kernel sentence plus some indication of what transformation must be performed to rearrange the words into their correct and more complex order. It follows from this that when Ss are given kernel sentences to memorize, they should make fewer errors than when given more complicated structures, and this is in fact reported (Mehler, 1963). Unfortunately, the prompting method of recall used by Mehler, which consists of supplying S with discriminative stimuli for the lexical words and not for the function words which serve to differentiate the structures of the sentences, would be most helpful to kernel sentences and might therefore have produced an artifact in his experiment. Furthermore, the different number of words in the various structures, from 6 for the kernel to 8 in the passive negative query (PNQ), should have been controlled to assure that the smaller number of words alone, or in interaction with the prompting procedure or in interaction with the differences in structure, could not explain the differences in memory between the kernel and the other sentences.

In view of these somewhat equivocal results and in order to test the theory, it seemed reasonable to investigate the degree to which sentences of different grammatical structures can be recalled. Since Chomsky (1957) has claimed independence of grammar and meaning, it was decided to use a technique recently employed by Epstein (1961), namely, nonsense syllables with bound morphemes and function words so arranged as to provide the discriminative stimuli for different sentence structures. Two structures were employed which were described as most different in Miller's (1962) paper, i.e., the simple active declarative sentence vs. the passive negative query. Furthermore, the same number of word units was employed in both structures to eliminate the possible artifact of Mehler's experiment. Finally, by using a condition in which the units were presented in nonsentence order it was possible to measure the influence of grammatical structure by itself on memory and thus to replicate and extend (by including an additional structure) the experiment first done by Epstein (1961).

¹ This research was reported in part at the Eastern Psychol. Assoc. Meetings, Atlantic City, 1965, and was supported by NIMH Grant MH 07477. The authors wish to thank Judith Tanur and Verna Schmauder for their aid in the statistical analysis of the data and Joseph Zubin for his interest in and encouragement of this study.

² A recent article by Bever, Fodor, and Weksel (1965) states that psychologists like Miller (1962) and Mehler (1963) have made the mistake of identifying the kernel with the simple declarative sentence and have assumed that transformations are performed upon the kernel sentence when in fact the correct interpretation should have been that the simple declarative sentence is only one of the types of sentences which can be generated by a basic kernel structure. Whether one looks upon the kernel as a sentence or a "structure," the transformation grammar of Chomsky (1957) does suggest a difference in the number of steps necessary to generate the two sentence types discussed in this experiment and this is the concern of this study.

METHOD

Materials. Two grammatical types of 10-unit "sentences," each consisting of two different sets of function words and nonsense syllables, were constructed. One grammatical structure was a simple declarative sentence (D) and the other a passive negative query (PNQ). The nonsense syllables selected for the experiment had association values ranging from 73 to 89% (Glaze, 1928); dissyllables used varied between 1.26 and 1.28 in meaningfulness (Noble, 1952). The required characteristics of English were simulated by adding the prefixes "be" and "de" to indicate verbs, and the suffixes "s" for nouns, "y" and "er" for adjectives, "ly" for adverbs, and "ed" and "ing" for verbs, as well as by using function words. The frequency of occurrence of successive bigrams and trigrams of the dissyllables (Underwood and Schulz, 1960) was approximately equated for the two sentence sets (a and b in Table 1). Inspection of Table 1 shows that the same two basic sets of units were employed in all the word series. However, the following exceptions were made to keep the number of word units the same for the two grammatical sentences: the function word "and" was used in the D sentence to match the required function word "by" in the PNQ sentence, the verb ending "ing" in D was matched by the ending "ed," and the word "were" in D was matched by the word "weren't" in PNQ. The first word in each of the four sentences was capitalized; the two D sentences ended with a period; and the two PNQ sentences ended with a question mark. The order of the words in each of the sentences was randomized according to the same random order to produce four unstructured series. Both capitalization and punctuation were omitted from these random series.

TABLE 1
THE 10-WORD STIMULUS SERIES

Declarative grammatical
  Set a: And the piqy kews were beboving the hazer zumaps dygly.
  Set b: And the kavy bycs were derizing the tober latuks neply.

Declarative random
  Set a: beboving piqy hazer were the and zumaps kews dygly the
  Set b: derizing kavy tober were the and latuks bycs neply the

Passive-negative-query grammatical
  Set a: Weren't the hazer zumaps dygly beboved by the piqy kews?
  Set b: Weren't the tober latuks neply derized by the kavy bycs?

Passive-negative-query random
  Set a: beboved hazer the dygly by weren't piqy zumaps kews the
  Set b: derized tober the neply by weren't kavy latuks bycs the
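The Materials paragraph says that every sentence was scrambled "according to the same random order." As a minimal sketch, the Python below applies one fixed permutation to the Set a sentences and reproduces the random series listed in Table 1. The permutation itself is not stated by the authors; it is inferred here by comparing the grammatical and random rows of the table, and the nonsense-word spellings are the ones that recur most consistently across those rows. The names `RANDOM_ORDER` and `scramble` are illustrative only.

```python
# Permutation inferred from Table 1 (an assumption, not a value given in
# the paper): entry k is the 1-indexed position, in the grammatical
# sentence, of the word shown k-th in the random series.
RANDOM_ORDER = [6, 3, 8, 5, 7, 1, 9, 4, 10, 2]


def scramble(sentence, order=RANDOM_ORDER):
    """Return the unstructured series for a 10-word sentence.

    Capitalization and final punctuation are dropped, as the Materials
    paragraph describes for the random series.
    """
    # strip(".?") removes only the closing period or question mark,
    # so the apostrophe in "weren't" is preserved.
    words = [w.strip(".?").lower() for w in sentence.split()]
    if len(words) != len(order):
        raise ValueError("sentence length must match the permutation length")
    return " ".join(words[i - 1] for i in order)


D_SET_A = "And the piqy kews were beboving the hazer zumaps dygly."
PNQ_SET_A = "Weren't the hazer zumaps dygly beboved by the piqy kews?"

print(scramble(D_SET_A))    # beboving piqy hazer were the and zumaps kews dygly the
print(scramble(PNQ_SET_A))  # beboved hazer the dygly by weren't piqy zumaps kews the
```

Because both sentence types contain "the" twice, neither sentence alone pins down the permutation; requiring a single order to fit both the D and the PNQ rows does.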

The verbal material was prepared for presentation on a Lafayette memory drum (Model 303) by both a serial and a whole method. For the whole method of presentation, each series was typed in a single horizontal line across a separate 8.5 × 11-inch sheet of white paper. The series was centered on the page and one space separated successive words. For the serial method of presentation, the words of a given series were typed successively in the middle of the page with the second word centered under the first and so on. Two lines were skipped between successive words and any punctuation immediately followed the last word.

A random series of ten single digits was constructed with the aid of a table of random numbers, with the restrictions that all digits from 1 to 9 inclusive be used and that the same digit appear in both the second and seventh positions to follow the repetitions occurring in the D sentences. This series of digits was also prepared for both the serial and whole methods of presentation. Four spaces occurred between successive digits in the whole method.

Subjects. Eighty undergraduate and graduate students whose native language was English were employed. Their ages ranged from 18 to 28 years.

Procedure. The Ss were tested individually and paid $1.50 an hour for participating. They were assigned to one of 16 experimental conditions which consisted of all combinations of the variables listed below, except that a given S received both sets (a and b) of stimulus material under the exact same conditions. The 16 conditions were permuted on the basis of random orderings of the numbers 1 through 16. The Ss were then placed into each of the conditions in the order in which they volunteered for the experiment. When one set of 16 conditions was completed, Ss were then placed, in the order in which they volunteered, into the next set of 16 randomly arranged conditions. This process was continued until data had been collected from 80 Ss. The variables combined were: grammatical structure (D vs. PNQ); word order (sentence order vs. random arrangement of words); stage of learning (first vs. second presentation of a or b); method of stimulus presentation [entire series at once (whole) vs. one unit at a time (serial)]. For purposes of analysis, errors were counted separately for nonsense syllables and for function words, thus defining a fifth main factor used in the analysis of variance. All Ss were presented with the series of numbers first: the entire series (for 10 secs.) or one unit at a time in agreement with the presentation conditions of the material which followed.

For the whole method of presentation, Ss were told to learn the numbers in the order in which they were presented and to distribute their attention equally among them. After each stimulus presentation they had 30 secs. to write the numbers they recalled on a sheet of paper provided with ten blanks.

The Ss in the serial presentation condition were given substantially the same instructions except that they were told the numbers would appear successively in the window of the machine. The single digits appeared at a rate of 1 per sec.

In both the serial and whole methods the procedure was repeated until S wrote the digits correctly on three successive trials. Immediately after a trial, E checked the record. To be correct, all the digits had to appear in correct order. The S was given no knowledge of results until after the third successive correct trial, at which point E said, "good" and asked if S had any questions about the procedure. When it was ascertained that S understood the procedure, E proceeded to the presentation of the first series of nonsense and regular words.

The Ss receiving the grammatically structured series (D or PNQ) by the whole method were told that they would be shown a sentence composed of both nonsense and regular words for 10 secs. and were then given the same basic instructions, with appropriate changes, as for the numbers.

The Ss administered the same grammatically structured series as above by the serial method were given substantially the same instructions except that they were told that the words would appear successively in the window.

The Ss given the random series (whole or serial) received the same basic instructions but were told that E was presenting words originally arranged to make up a sentence, but which were now scrambled so that they no longer read like a sentence. The Ss were still required to learn the words in the order in which they were presented.

Words were counted as "correct" when they were written in correct order and spelled the same as the stimuli. For three Ss who made the same single spelling error for 10 successive trials, the experiment was terminated at that point, and the data included in this study. One S had the PNQ sentence (Set a) presented serially, one the D sentence (Set b) presented serially, and the third had the D sentence (Set b) presented whole.

After each S learned the first word series, E said, "good" and immediately proceeded to the second word series following the same procedure.

RESULTS AND DISCUSSION

Recall of Numbers. Statistical analysis showed that it took Ss almost twice as many trials to reach a criterion of one 10-unit sequence correctly recalled when using the serial method (M = 4.15) than when using the whole method (M = 2.13), t(78) = 5.19, p < .001. This difference remained for comparison of the serial and whole methods on a criterion of two (5.72 vs. 3.27, respectively) and three successively correct recalls (6.72 vs. 4.35, respectively), t(78) = 5.52, p < .001. It indicates that presentation of an entire sequence of units rather than one item at a time will result in faster learning even when the sequence is not arranged according to some underlying order. This difference is probably due to the fact that S can distribute his attention so that combinations more difficult to learn get more practice. Furthermore, S can arrange the items in larger units when all are exposed at once.

Recall of Nonsense Syllables and Function Words. The recall data were converted into two basic scores: number of errors per function word and number of errors per nonsense syllable. A five-way analysis of variance was then performed on these basic scores. Table 2 shows the mean error scores for all the conditions.

TABLE 2
MEAN NUMBER OF ERRORS PER WORD

                              D Series                   PNQ Series
                       Sentence       Random       Sentence       Random
Stage                 Whole Serial  Whole Serial  Whole Serial  Whole Serial

Nonsense syllables
  First                1.28   2.53   2.60   2.52   1.72   3.13   2.43   3.60
  Second               1.57   2.33   1.93   1.60    .98   1.73   1.32   1.98
Function words
  First                 .18    .53   1.20   1.55    .83    .85   1.45   3.35
  Second                .28    .05    .68    .10    .18    .05    .28    .75

The scores in Table 2 are for the criterion of 3 successive correct recalls. Analysis of variance to a criterion of 1 correct recall shows substantially the same results, the exception being that only for the lower criterion is there a significantly greater number of errors for the PNQ than for the D structure, F(1, 288) = 3.93, p < .05.

Error scores were used in preference to trials to criterion for this analysis in order to include the difference in recall between function and nonsense words within the same analysis as the other variables and, perhaps more important, in order to evaluate interactions of word type with the other variables. Analysis of variance applied to trials to criterion scores showed substantially the same results as the error scores analysis, i.e., no significant difference due to the grammatical structure variable, F(1, 128) < 1, p > .05; a significantly larger number of trials to criterion for the first than for the second learning stage, F(1, 128) = 7.65, p < .01; and nonsignificant trends in the same direction as the error score analysis for the word order variable (sentence vs. random) and the method of stimulus presentation variable (whole vs. serial).

Grammatical Structure. The difference in memory due to grammatical structure is at best equivocal, F(1, 288) = 3.07, p > .05. The Grammatical Structure-Stage of Learning interaction, F(1, 288) = 9.71, p < .005, shows a larger number of errors in the PNQ structure than in the D structure for the first presentation only, i.e., at the beginning of learning only. These results suggest that earlier relative unfamiliarity with the PNQ structure is simply overcome by presenting S with that structure. In terms of learning theory, the successive words of the PNQ sentences may be posited to evoke other than the called for words at the beginning of learning because of the higher frequency of occurrence of these competing responses. If Ss did in fact commit the kernel and the transformation rule to memory separately, then one should find more errors for the PNQ than for the D sentence for both presentations.

Accepting the equivocal difference between grammatical structures as real, further inspection of the conditions under which grammatical structure affects recall demonstrates that PNQ gives rise to more errors than D only in serial presentation, i.e., the interaction is significant, F(1, 288) = 5.10, p < .05. Following Epstein's reasoning (1962), we should expect such a difference only for whole presentation, since only this would allow S to learn the underlying structure. The triple interaction between grammatical structure, method of stimulus presentation, and word order shows still another reason for rejecting the notion of a plan or underlying structure, F(1, 288) = 5.42, p < .025, since the difference between D and PNQ stems almost wholly from the difference of the random arrangements of D and PNQ under serial conditions of presentation. The reversal under whole conditions of presentation clarifies the interaction between grammatical structure and method of presentation; apparently, whole presentation allows S so to distribute his attention over the items as to compensate for the differences in difficulty. It would seem most reasonable to conclude that the grammatical structures do not produce differences as structures but rather in terms of the frequency of occurrence of their words or combinations of words. The fact that the words of the PNQ structure produce a large number of errors when in random arrangement raises the possibility that they are generally combined in fewer different ways than the words of the D structure. In other words, the random arrangement of the PNQ structure results in a relatively greater number of chunks (Miller, 1956) or units (Salzinger, 1962) than the randomly arranged D words, since the words in the PNQ sentences are less frequently used in other combinations. It is certainly most difficult to explain this result in terms of transformation grammar, which presumably is irrelevant to random (ungrammatical) sequences.

Word Order. These results are in agreement with Epstein's findings (1961, 1962), that words in random order produce a greater number of errors than the same words in sentence form, F(1, 288) = 18.61, p < .005. It is interesting to note, however, that here, as for the grammatical structure variable, the difference due to word order is evident only early in learning, viz., word order-stage of learning interaction, F(1, 288) = 9.57, p < .005.

The triple interaction of word type, method of stimulus presentation, and word order indicates that the difference in recall due to word order does not manifest itself at all for nonsense syllables when they are presented serially, F(1, 288) = 5.35, p < .025. On the other hand, for function words the difference is even larger for serial than for whole method of presentation, suggesting that for function words the immediate associations are more important than they are for the nonsense syllables. The mere separate (serial) presentation, even in sentence form, is sufficient to give rise to less recall for the nonsense syllables so that randomization does not lead to further decrement. In other words, the nonsense syllables may depend more upon long-range associations between words. Thus, Epstein's failure to find a difference in memory due to word order when using the serial method may be ascribed largely to the difficulty in recalling nonsense syllables under these conditions. The variable of word order is also involved in the triple interaction with grammatical structure and method of stimulus presentation. The interaction roughly corroborates Epstein's finding (1962) that the difference due to word order manifests itself when using the whole but not the serial method of presentation, as long as the grammatical structure consists of the D sentence only. His conclusion is not warranted, however, for the PNQ sentence which shows the difference due to grammatical structure to be at least as great for the serial as for the whole method of presentation. It might be noted here that the serial method as used in this experiment differs in two respects from Epstein's study. First, Ss were told that they would be presented words from a sentence in sentence or in random order; Epstein's Ss were not. Secondly, Ss responded in the same way for the whole and serial presentation methods, i.e., by writing out the entire sentence after seeing all stimuli one at a time, while Epstein's Ss followed the usual anticipatory serial learning instructions. Our findings suggest that Epstein's conclusion, that "chains of immediate probabilistic associations within the structured sentences" cannot be used to explain the difference due to grammatical structure, may have been premature.

Stage of Learning. The Ss made a significantly larger number of errors on the first than on the second series of words learned, F(1, 288) = 45.59, p < .005. Furthermore it has already been indicated how the stage of learning interacts with word order and with grammatical structure.

Stage of learning is of course a frequency-of-occurrence variable, and the results suggest its importance for explaining differences in ability to recall various grammatical structures. Since the D structure most likely occurs quite frequently in verbal behavior (text as well as speech) to which the average S is exposed, one would expect more positive transfer to it than to the PNQ structure which occurs less frequently. It is unlikely that this can be a warm-up effect since all Ss were exposed to a learning task before this task, namely the learning of a series of numbers.

Method of Stimulus Presentation. The whole method of presentation is superior to the serial method, F(1, 288) = 13.51, p < .005, but the interaction with stage of learning indicates (as mentioned above) that this difference occurs only for the first presentation. Other interactions of this variable have already been discussed above.

Word Type. Nonsense syllables are almost three times more difficult to memorize than function words, F(1, 288) = 98.46, p < .005. This difference is involved in a significant triple interaction with method of stimulus presentation and word order. Although the nonsense syllables give rise to higher error scores than do function words for all conditions, the difference is greater in sentences than in random arrangements. Although the function words are easier to recall than nonsense syllables because of their higher frequency of occurrence, they also show more dependence upon context than the nonsense syllables. The relatively greater dependence of function words upon context is in agreement with a recent experiment by Glanzer (1962), who has maintained that function words are incomplete units but can be made complete by embedding them among nonsense syllables which alone are also incomplete units. This experiment showed that function words can be made part of a complete unit by embedding them in a nonsense syllable sentence (for more extensive discussion of the problem of unit see Salzinger, 1962).

DISCUSSION

Thus, in general, we are led to the conclusion that Miller's notion (1962), that an S commits a complex sentence to memory by storing its kernel plus a footnote concerning the selection of the appropriate transformation, is not corroborated by the data of this experiment. Notions of frequency of occurrence (a basic variable in behavior theory) of the words or their combinations contained in the different structures appear to agree more with our findings. The importance of the frequency-of-occurrence variable continues to be demonstrated in current research. A recent example was given by Baddeley (1964) who showed that series of nonsense syllables can be learned more rapidly if the last letter of each syllable and the first of the subsequent one are "compatible" than when they are not. Compatibility is defined merely in terms of the number of letters required before an S will guess the next letter, a simple frequency-of-occurrence relationship.

It might also be pointed out that other experimenters have provided alternative formulations to explain the influence of grammatical variables in verbal behavior. Braine (1963a, 1963b) has suggested that people acquiring a language learn the location of units within sentences and associations between pairs of words. Jenkins and Palermo (1964) suggest sequence and class of words as important aspects of language learning. Finally, Salzinger (1959, 1965) and Staats (1961) pointed out the importance of the concept of response class in operant conditioning, i.e., what response members do in fact combine to form a response class, and in another paper Salzinger (1962) showed the importance of unit size for understanding the properties of speech. In summary, a number of investigators have suggested concepts derived from learning theory which could explain at least those properties of grammar currently being investigated. In view of the equivocal nature of the behavioral evidence for a transformation grammar, it would seem advisable at this point to investigate further the application of frequency concepts to simple structures and then determine what concepts, if any, must be added to handle more complex structures. Such work has already begun in experiments by Martin and Jones (1965) and Martin, Davidson, and Williams (1965).

REFERENCES

BADDELEY, A. D. Language-habits, S-R compatibility, and verbal learning. Amer. J. Psychol., 1964, 77, 463-468.
BEVER, T. G., FODOR, J. A., AND WEKSEL, W. On the acquisition of syntax: a critique of "contextual generalization." Psychol. Rev., 1965, 72, 467-482.
BRAINE, M. D. S. On learning the grammatical order of words. Psychol. Rev., 1963, 70, 323-348. (a)
BRAINE, M. D. S. Learning language structures. Paper presented at the American Psychol. Assoc., Phila., Pa., 1963. (b)
CHOMSKY, N. Syntactic structures. 's Gravenhage: Mouton and Co., 1957.
EPSTEIN, W. The influence of syntactical structure on learning. Amer. J. Psychol., 1961, 74, 80-85.
EPSTEIN, W. A further study of the influence of syntactical structure on learning. Amer. J. Psychol., 1962, 75, 121-126.
GLANZER, M. Grammatical category: A rote learning and word association analysis. J. verb. Learn. verb. Behav., 1962, 1, 31-41.
GLAZE, J. A. The association value of nonsense syllables. J. genet. Psychol., 1928, 35, 255-267.
JENKINS, J. J., AND PALERMO, D. S. Mediation processes and the acquisition of linguistic structure. Monogr. Soc. Res. Child Developm., 1964, 29, 141-169.
MARTIN, J. G., DAVIDSON, JUDY R., AND WILLIAMS, MYRNA L. Grammatical agreement and set in learning at two age levels. J. exp. Psychol., 1965, 70, 570-574.
MARTIN, J. G., AND JONES, R. L. Size and structure of grammatical units in paired-associate learning at two age levels. J. exp. Psychol., 1965, 70, 407-411.
MEHLER, J. Some effects of grammatical transformations on the recall of English sentences. J. verb. Learn. verb. Behav., 1963, 2, 346-351.
MILLER, G. A. The magic number seven, plus or minus two: Some limits on our capacity for processing information. Psychol. Rev., 1956, 63, 81-97.
MILLER, G. A. Some psychological studies of grammar. Amer. Psychol., 1962, 17, 748-762.
MILLER, G. A., GALANTER, E., AND PRIBRAM, K. H. Plans and the structure of behavior. New York: Holt, 1960.
NOBLE, C. E. Analysis of meaning. Psychol. Rev., 1952, 59, 421-430.
SALZINGER, K. Experimental manipulation of verbal behavior: a review. J. gen. Psychol., 1959, 61, 65-94.
SALZINGER, K. Some problems of response measurement in verbal behavior: the response unit and intraresponse relations. Paper presented at the conference on Methods of Measurement of Change in Human Behavior, Montreal, Canada, 1962.
SALZINGER, K. The problem of response class in verbal behavior. Paper presented at the Conference on Verbal Behavior, New York City, 1965.
STAATS, A. W. Verbal habit-families, concepts, and the operant conditioning of word classes. Psychol. Rev., 1961, 68, 190-204.
UNDERWOOD, B. J., AND SCHULZ, R. W. Meaningfulness and verbal learning. Philadelphia: Lippincott, 1960.

(Received June 21, 1965)