
Is Writing As Much Phonological As Speaking?: Homophone Usage Across Speaking and Writing

Psychologia, 2004, 47, 1–9

IS WRITING AS MUCH PHONOLOGICAL AS SPEAKING?: HOMOPHONE USAGE ACROSS SPEAKING AND WRITING

Chang H. LEE1), Kichun NAM2)

1)Pusan National University, Korea, 2)Korea University, Korea

and

James W. PENNEBAKER3)

3)University of Texas at Austin, U.S.A.

Homophone usage was compared between speaking and writing across two sets of experiments. In the first set of studies, 101 introductory students talked into a video camera about an emotional attitudinal issue, once in a way that supported their view and once in a way that opposed it. In a different session, 44 different students performed the same task but wrote their views on paper. In the second set of studies, individuals interacted with a person whom they had never met, either orally in person (N=62) or on a computer chat system where they typed to one another (N=124). Similar amounts of homophones were used across the two verbal processes, although more high frequency homophones were used in speaking than in writing. These results suggest that phonological processing does play a major role in writing.

Key words: homophone, speaking, writing, lexical selection, context, LIWC

When we are writing, we sometimes phonologically preactivate and rehearse the text to be written. This conscious phenomenon is especially common when writing demands delicate composition. The product of writing itself, however, is an orthographic form. Thus, the question that naturally arises about the writing process is what role phonological information plays, and which form of information plays the main role in writing. This question is related to the traditionally important question about the role of phonology in language, and especially in reading. There have been debates on whether the main process of reading is phonological, orthographic, or both. Researchers supporting the phonological recoding hypothesis argue that phonological information is necessary to access the meaning of a word and is dominant over orthographic information in word recognition (e.g., Lee & Turvey, 2003; Lukatela, Eaton, Lee, & Turvey, 2001; Lukatela, Frost, & Turvey, 1999; Lukatela & Turvey, 1991, 1993, 1994, 2000; Perfetti & Bell, 1991). The researchers supporting the dual-route hypothesis argue

Correspondence concerning this article should be addressed to Chang H. Lee, Department of Psychology, Pusan National University, 30 Changjeon-Dong, Keumjeong-Ku, Pusan, 609-735, South Korea (e-mail: [email protected]) or Kichun Nam, Department of Psychology, Korea University, 5 Anam-Dong, SungBuk-Ku, 136-100, South Korea ([email protected]). Preparation of this manuscript was made possible by a grant from the National Institutes of Health, U.S.A. (MH59321).

that all but very low frequency words are processed orthographically and that orthographic information is dominant over phonological information (e.g., Coltheart, 1978; Coltheart, Curtis, Atkins, & Haller, 1993; Rastle & Coltheart, 1999; Seidenberg, Waters, Barnes, & Tanenhaus, 1984). Because reading and writing are like opposite sides of a coin, either type of processing could be the main form of information in writing.

Researchers have used homophones (e.g., break and brake) as the main stimuli when investigating whether reading proceeds on a phonological or an orthographic basis. Confusion between the two members of a homophone pair in semantic categorization, and facilitation of one homophone by its counterpart, have been regarded as the main evidence for phonological recoding in reading. Since homophones have been the main stimuli for investigating the role of phonology in reading, their usage in writing can serve as a clue to the role of phonology in writing. Because homophones elicit confusion when only the phonological form is activated, the use of homophones in writing would be discouraged to the degree that phonology plays a major role. In contrast, if orthographic information plays a dominant role relative to phonological information, the use of homophones in writing (as opposed to speaking) would pose no problem, because the two members of a homophone pair have different orthographic forms.

A simple test of the phonological basis of homophones would be to compare their usage in natural spoken language versus written language. If orthographic information plays a significant role in writing, homophones would be no obstacle in writing, allowing more homophone usage in writing than in speaking. This is because homophones have different orthographic forms, which makes them unproblematic in writing if writing depends on orthographic processing.
In contrast, if phonological information plays a significant role in writing, the base rate of homophone usage would be lower in writing than in speaking. This hypothesis rests on the assumption that phonological recoding is executed automatically in writing as well as in reading. Homophones would then elicit phonological confusion in writing, reducing the usage of such words. The current study counts the number of homophones in various speaking and writing samples to address the question of which information is used in these two verbal processes.

Homophones can be divided into high frequency and low frequency homophones. Pairs in which both members are high frequency (e.g., their and there) are the central focus of our study because the two members would create significant competition. In contrast, when one member of a homophone pair is low frequency (e.g., air and heir), use of the high frequency member (i.e., air) would be relatively easy because there is less competition between the two members. In addition, people might not even know the low frequency homophone (Star & Fleming, 2001), which would make this type of homophone scarce in daily life.

In the present study, data from previously conducted experiments were reanalyzed to determine the base rates of homophone usage in writing versus speaking. Permission to reanalyze the data came from Dr. Pennebaker, who holds the copyright and is one of the authors of the previous studies. In the first pair of studies, separate groups of students were asked either to write about their true and false views on a controversial topic (abortion) or to talk aloud about the same topic to another person while they were videotaped. Both groups of students, then, addressed the same topic and differed only in writing versus speaking about it.
In the second pair of studies, college students who had never met interacted with one another orally or over a laboratory-based computer chat system. The pairs of participants thus differed only in whether they got to know each other by talking (and seeing each other) or by writing.

For the present study, the percentage of homophones used as a function of the total number of words written or spoken was calculated using a computerized text analysis program, Linguistic Inquiry and Word Count, or LIWC (Pennebaker, Francis, & Booth, 2001). In addition to overall homophone usage, six categories of homophone subtypes were designated. Specifically, homophones were divided into categories by their relative frequency and by their orthographic similarity. In terms of relative frequency, the two members of a homophone pair can both be high frequency words, one can be high frequency and the other low frequency, or both can be low frequency words. Overall, 128 homophones were selected from a corpus of 3625 homophones constructed by Hobbs (1993). A further criterion was that only homophones used in previous language research were selected. Low frequency homophones were defined as those with frequencies of less than 32 per million (Pexman, Lupker, & Jared, 2001), and high frequency homophones as those with frequencies of more than 64 per million; frequency counts were based on the Kucera and Francis (1967) word frequency norms. Four categories of homophones were created according to relative frequency. In terms of orthographic similarity, some homophones differ from each other by one letter (e.g., feet and feat), whereas others differ by two or more letters (e.g., way and weigh).
If orthographic information played a dominant role in writing, it was expected to encourage greater usage of two-letter different homophones than of one-letter different ones.
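The two classification criteria described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the authors ran: the cutoffs (below 32 and above 64 per million) come from the text, but the function names are hypothetical, and measuring "letters different" with Levenshtein edit distance is an assumption, since the paper does not define the measure.

```python
from typing import Optional

LOW_MAX = 32   # counts per million below this are low frequency (cutoff from the text)
HIGH_MIN = 64  # counts per million above this are high frequency (cutoff from the text)


def frequency_band(per_million: float) -> Optional[str]:
    """Return 'H', 'L', or None for counts that fall between the two cutoffs."""
    if per_million > HIGH_MIN:
        return "H"
    if per_million < LOW_MAX:
        return "L"
    return None


def frequency_category(freq_a: float, freq_b: float) -> Optional[str]:
    """Label a pair 'H-H', 'H-L', or 'L-L'; None if either member is unclassified."""
    bands = (frequency_band(freq_a), frequency_band(freq_b))
    if None in bands:
        return None
    # Sorting puts 'H' before 'L', so a mixed pair is always reported as 'H-L'.
    return "-".join(sorted(bands))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed here as the measure of letter difference)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def letter_category(a: str, b: str) -> str:
    """'1-letter' if the spellings differ by one edit, otherwise '2-letter'."""
    return "1-letter" if edit_distance(a, b) == 1 else "2-letter"
```

With this sketch, `letter_category("feet", "feat")` gives `"1-letter"` and `letter_category("way", "weigh")` gives `"2-letter"`, matching the examples in the text. Frequency values would have to be supplied from a norm such as Kucera and Francis (1967); none are hard-coded here.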

STUDY 1: ATTITUDES ABOUT ABORTION IN WRITING AND IN SPEAKING

For the present study, data from a project on the detection of deception were reanalyzed (Newman, Pennebaker, Berry, & Richards, 2003). In the original project, participants from two separate experiments were asked to either talk about or write about their true attitudes as well as their false attitudes about abortion.

Method

Participants: In the first of two studies (the talking study), 101 introductory psychology students (54 males, 47 females) participated as part of an extra credit option. Overall, approximately 75% favored abortion. In the second experiment (the writing study), which was run one year later, 44 introductory psychology students (18 males and 26 females; approximately 70% reported favoring abortion) participated as part of a similar extra credit option. No one participated in both studies. Linguistic data from all participants are included in the analyses.

Table 1. Mean Percentages and Standard Deviations for the Six Homophone Categories Across Speaking and Writing in Study 1

                     Speaking         Writing
                     Mean (SD)        Mean (SD)
Homophone            3.96 (2.34)      3.79 (1.73)
H-H Frequency        3.37 (2.20)      2.75 (1.54)
H-L Frequency        <1               <1
L-H Frequency        <1               <1
L-L Frequency        <1               <1
1 Letter differ      <1               <1
2 Letter differ      2.67 (2.01)      2.04 (1.31)

Procedure: Students in the talking study volunteered for a study dealing with attitudes among college students. On arrival at the lab, half of the participants were asked to talk about their true views on abortion for two minutes. At the conclusion of the two minutes, they were asked to present the opposite views as if they were their own. The remaining students talked about the opposite views first. Participants presented their views orally into a video camera while a silent experimenter recorded their speeches. In the related writing study, participants were asked to write two brief essays: one presenting their own views on abortion and one counterattitudinal essay on the topic. Participants, who were also randomly assigned to a counterbalanced order, wrote for five minutes on each essay.

Data preparation: For each participant in each study, both text samples (the honest and deceptive transcribed speech samples or writing samples) were analyzed using the homophone dictionaries with the LIWC program. The two text samples were then aggregated by participant, resulting in one line of data per person. Note that all homophone means represent percentages based on the number of homophones divided by the total number of words within each text sample.
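The percentage measure described under Data preparation can be illustrated with a minimal Python sketch. It is not the LIWC program itself: the tokenizer, the function names, and the small word set are hypothetical stand-ins (the words are taken from the H-H column of the Appendix), and averaging the two samples per participant is an assumption, because the text does not say how the samples were aggregated.

```python
import re
from typing import List, Set

# Tiny illustrative word list; the study's dictionary contained 128 homophones.
HOMOPHONES: Set[str] = {"there", "their", "for", "four", "see", "sea", "right", "write"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def homophone_percentage(text: str, dictionary: Set[str] = HOMOPHONES) -> float:
    """Percentage of tokens in `text` that appear in the homophone dictionary."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in dictionary)
    return 100.0 * hits / len(tokens)


def participant_score(sample_a: str, sample_b: str) -> float:
    """Average the two per-sample percentages into one value per participant.

    Assumption: the paper does not specify the aggregation rule; a simple
    mean of the two percentages is used here for illustration.
    """
    return (homophone_percentage(sample_a) + homophone_percentage(sample_b)) / 2
```

For example, `homophone_percentage("I see their point")` counts two dictionary words out of four tokens. A dictionary lookup of this kind cannot tell which member of a pair the speaker intended, which matters for transcribed speech, where the transcriber chooses the spelling.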

Results and Discussion

The mean percentages and standard deviations for the six homophone categories are summarized in Table 1. A series of independent t-tests (two-tailed) was computed for each category. The main effect for overall homophone usage was not statistically significant, t<1. Low-low frequency, low-high frequency, and high-low frequency homophones were excluded from the analysis because their rates were below 1%. The main effects for high-high frequency homophones and 2-letter different homophones were statistically significant, t(143)=3.26, p<.01, and t(143)=3.71, p<.01, respectively. Counter to the hypothesis that orthographic processing plays the dominant role in writing, homophone usage was not greater in written than in spoken text.

Overall, these results suggest that, within the current context, speaking involves more homophones than writing. Setting aside the homophones that are rarely used (rates below 1%), high-high frequency homophones, the very ones most likely to be misunderstood by a listener, are significantly more likely to be used in speaking than in writing. These results do not support the argument that writing is an orthographic process. Rather, if phonological information plays a strong role in writing, these results would be explained by the difficulty of writing words with the same phonology but different meanings. An alternative interpretation is that homophones are processed in such a way that context weights the relevant meaning, suppresses the irrelevant meaning, or both, so that only one meaning is processed and becomes conscious in word processing (Ferrand & Grainger, 2003; Pexman & Lupker, 1999). Because speaking provides various cues about context, it would be easier to use homophones in speaking than in writing.

A finding of particular interest is that individuals use a fair amount of high frequency homophones (2.7%–3.3%) in speaking and writing.
Such a high rate of high frequency homophone usage in normal writing and speaking suggests that the context of word usage substantially overrides the phonological similarity between words. Of course, it also means that people rarely use low frequency homophones (less than 1%). These results indicate that the use of such homophones in many previous reading and writing studies may limit the external validity of their results when applied to actual language usage. For example, previous studies used low frequency homophones in the phonological priming task and argued that phonological recoding is a default process in reading (e.g., Lukatela & Turvey, 1994; Perfetti & Bell, 1991). If people rarely use this kind of homophone in daily life, the findings of those studies cannot be directly generalized to actual language processing and language usage. The findings of Study 1 also tend to confirm the argument of Star and Fleming (2001) that people generally do not know and certainly do not use low frequency homophones.
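For readers who want to see what an independent two-sample t-test of the kind reported in the Results computes, here is a minimal sketch. It assumes the pooled-variance (Student) form, which the paper does not state explicitly; that form gives n1 + n2 - 2 degrees of freedom, which for 101 speakers and 44 writers is 143, consistent with the reported t(143). The function name is hypothetical, and per-participant percentages would have to be supplied as input.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def independent_t(group_a: Sequence[float], group_b: Sequence[float]) -> Tuple[float, int]:
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Weight each group's sample variance by its own degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df
```

A positive t here means the first group (for instance, speaking) has the higher mean. Converting t to a two-tailed p value requires the t distribution, which is left to a statistics package.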

STUDY 2: COMPARING CHATTING AND NORMAL CONVERSATION

Whereas Study 1 compared spoken and written samples from students engaged in monologues about their attitudes toward a highly charged emotional topic, Study 2 involved interactions among pairs of complete strangers. The purpose of this study was to generalize the results of Study 1 to conversation, which accounts for a significant amount of language usage in daily life. In one study, participants met in a waiting room and, unknown to them, were tape recorded while they idly chatted with each other. Data for this experiment are from a data set provided by Ickes (1993). In a second study, strangers were introduced to one another on a computer chat board set up as part of an experiment. The only instruction given to the participants was to get to know one another. The written chat data were part of a study by Niederhoffer and Pennebaker (2002). As in Study 1, the purpose of the present study was to compare how participants used homophones while talking versus writing.

Method

Participants: In the talking study, 62 (31 mixed-sex pairs) introductory psychology students participated as part of an extra credit option. For the computer chat writing study, 124 (62 dyads: 48 males, 76 females; 10 all male dyads, 24 all female dyads, and 28 mixed-sex dyads) introductory students participated. Data from all interactions are included in the analyses.

Table 2. Mean Percentages and Standard Deviations for the Six Homophone Categories Across Speaking and Writing in Study 2

                     Speaking         Writing
                     Mean (SD)        Mean (SD)
Homophone            4.44 (0.86)      4.25 (0.76)
H-H Frequency        3.02 (0.80)      2.66 (0.67)
H-L Frequency        1.39 (0.52)      1.48 (0.39)
L-H Frequency        <1               <1
L-L Frequency        <1               <1
1 Letter differ      2.18 (0.66)      3.45 (1.33)
2 Letter differ      1.68 (0.64)      1.46 (0.49)

Procedure: The talking study used the standard stranger paradigm (see Ickes, Stinson, Bissonnette, & Garcia, 1990), wherein participants arrived at the lab at an appointed time. They were seated on a couch and completed a consent form that gave the experimenter permission to videotape their interaction. A sham video camera was placed in front of the participants but was “discovered” not to have a videotape. The experimenter left, purportedly to find a tape, and the participants were told to wait on the couch and, presumably, get to know one another before the study began. In fact, the interaction between the two students was videotaped by a hidden camera. At the conclusion of the 5-min interaction, the experimenter returned and explained the true purpose of the study.

The internet chat study involved groups of up to 40 students who signed up to participate in a study in two separate locations at the same time. Each computer in one room was connected to a computer in the second room. Interacting participants never saw one another. The experimenter explained that the study was similar to standard chat systems, where they would be interacting with a person at another location. They were told that they would interact with their partner for about 45 minutes and that their goal was simply to get to know the other person. Individuals interacted for 45 minutes and then completed a follow-up survey about their experience.

Data preparation: Data were analyzed at the dyad rather than the individual level. The oral interactions were transcribed. Spelling errors, typos, and abbreviations were corrected for Study 2. The entire record of words from each interaction was subjected to the LIWC analyses using the homophone dictionaries, resulting in one LIWC output line per interaction.
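The dyad-level data preparation described above, with one output line per interaction and a separate rate per homophone category, might look roughly like the following sketch. The function name and the data layout are hypothetical, the category sets are small subsets of the Appendix columns, and pooling both partners' words before counting is an assumption about what "the entire record of words from each interaction" means.

```python
import re
from collections import Counter
from typing import Dict, List, Set

# Illustrative subsets of the Appendix dictionary, keyed by category column.
CATEGORIES: Dict[str, Set[str]] = {
    "H-H": {"there", "their", "see", "sea"},
    "H-L": {"way", "break"},
    "L-H": {"weigh", "brake"},
}


def dyad_percentages(turns_a: List[str], turns_b: List[str]) -> Dict[str, float]:
    """Pool both partners' turns and return the percentage of words per category."""
    words = re.findall(r"[a-z']+", " ".join(turns_a + turns_b).lower())
    counts: Counter = Counter()
    for word in words:
        for name, members in CATEGORIES.items():
            if word in members:
                counts[name] += 1
    total = len(words)
    # One dictionary per dyad plays the role of one output line per interaction.
    return {name: (100.0 * counts[name] / total if total else 0.0)
            for name in CATEGORIES}
```

Treating the dyad as the unit keeps the two partners' word counts from being analyzed as if they were independent observations.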

Results and Discussion

The mean percentages and standard deviations for the six homophone categories are summarized in Table 2. A series of independent t-tests (two-tailed) was computed for each category. The main effect for overall homophone usage was not statistically significant, t(91)=1.05, p<.30. The main effects for high-high frequency homophones, t(91)=4.97, p<.01, 1-letter different homophones, t(91)=–2.30, p<.05, and 2-letter different homophones, t(91)=1.80, p<.07, were statistically significant or marginally significant. The main effect for high-low frequency homophones was not statistically significant, t(91)=–1.01, p>.05. Low-low frequency and low-high frequency homophones were excluded from the analysis because their rates were below 1%.

These results confirmed the results of Study 1. The general tendency in this study was that people used more high frequency homophones in speaking than in writing. A distinct result of this study was that more 1-letter different homophones were used in writing than in speaking. If orthographic processing were dominant in writing, there should be more 2-letter different homophones in writing than in speaking, because a larger number of differing letters provides additional orthographic information. The current results did not support the argument for orthographic processing in writing.

GENERAL DISCUSSION

Studies 1 and 2 showed that overall homophone usage did not differ across speaking and writing, and that high frequency homophones were used more in speaking than in writing. This question was addressed by using a recently developed linguistic analysis program, the LIWC. Samples of speaking and writing from daily life provide convergent evidence on homophone usage.

The two hypotheses of language processing predicted different results. If orthographic information plays a more significant role in writing than in speaking because writing is composed of visual orthographic forms, then more homophones should be used in writing than in speaking. In contrast, the assumption that phonological information is dominant in writing predicts less homophone usage in writing than in speaking. The results are generally compatible with the latter view. Less usage of homophones in writing indicates that activation of the phonological information associated with homophones may elicit confusion when writing them. The finding that the interference was more salient for high frequency homophones than for low frequency homophones is also compatible with the phonological recoding hypothesis.

An alternative view of the greater usage of high frequency homophones in speaking than in writing is the constraint of context in speaking. If the context in a speaking situation is rich enough to resolve the meaning of a word, words with a common phonology would not be a problem to speak. This interpretation suggests that orthographic or phonological processing might not play a major role in writing after lexical selection, compared with the significant role of social context. When we compare homophone usage in Studies 1 and 2, people generally used more homophones in Study 2 than in Study 1.
Although the difference was not statistically significant, and it is not strictly valid to compare data from two different studies, it gives us a clue that the social context of conversation might play a role in the use of homophones. Social context could be a confounding variable in the current study and needs further investigation in future studies.

Another limitation of this study is that the data we analyzed are based on the final product of speaking and writing. Thus, we could not tell precisely whether the online process of speaking and writing is phonological or orthographic. As discussed, the results of this study can be explained either by phonological recoding in the early stage of language processing or by the strong role of social context in speaking. This study does, however, provide deductive evidence for the importance of phonological processing in language generally.

REFERENCES

Coltheart, M. 1978. Lexical access in simple reading tasks. In G. Underwood (Ed.), Strategies of information processing (pp. 151–216). San Diego, CA: Academic Press.
Coltheart, M., Curtis, B., Atkins, P., & Haller, M. 1993. Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review, 100, 589–608.
Ferrand, L., & Grainger, J. 2003. Homophone interference effects in visual word recognition. Quarterly Journal of Experimental Psychology: Human Experimental Psychology, 56, 403–419.
Hobbs, J. B. 1993. Homophones and homographs: An American dictionary (2nd ed.). Jefferson, NC: McFarland & Company, Inc.
Ickes, W. 1993. Naturalistic interaction between strangers. Unpublished data, The University of Texas at Arlington.
Ickes, W., Stinson, L., Bissonnette, V., & Garcia, S. 1990. Naturalistic social cognition: Empathic accuracy in mixed-sex dyads. Journal of Personality and Social Psychology, 59, 730–742.
Kucera, H., & Francis, W. N. 1967. Computational analysis of present-day American English. Providence, RI: Brown University Press.
Lee, C. H., & Turvey, M. T. 2003. Silent letters and phonological priming. Journal of Psycholinguistic Research, 32, 313–333.
Lukatela, G., Eaton, T., Lee, C., & Turvey, M. T. 2001. Does visual word identification involve a subphonemic level? Cognition, 78, B41–B52.
Lukatela, G., Frost, S., & Turvey, M. T. 1999. Identity priming in English is compromised by phonological ambiguity. Journal of Experimental Psychology: Human Perception and Performance, 25, 775–790.
Lukatela, G., & Turvey, M. T. 1991. Phonological access of the lexicon: Evidence from associative priming with pseudohomophones. Journal of Experimental Psychology: Human Perception and Performance, 17, 951–966.
Lukatela, G., & Turvey, M. T. 1993. Similar attentional, frequency, and associative effects for pseudohomophones and words. Journal of Experimental Psychology: Human Perception and Performance, 19, 166–178.
Lukatela, G., & Turvey, M. T. 1994. Visual lexical access is initially phonological: 2. Evidence from phonological priming by homophones and pseudohomophones. Journal of Experimental Psychology: General, 123, 331–353.
Lukatela, G., & Turvey, M. T. 2000. Do spelling variations affect associative and phonological priming by pseudohomophones? Perception and Psychophysics, 62, 196–217.
Newman, M. L., Pennebaker, J. W., Berry, D. S., & Richards, J. M. 2003. Lying words: Predicting deception from linguistic styles. Personality & Social Psychology Bulletin, 29, 665–675.
Niederhoffer, K. G., & Pennebaker, J. W. 2002. Linguistic style matching in social interaction. Journal of Language and Social Psychology, 21, 337–360.
Pennebaker, J. W., Francis, M. E., & Booth, R. J. 2001. Linguistic Inquiry and Word Count (LIWC 2001): A computerized text analysis program. Mahwah, NJ: Erlbaum.
Perfetti, C. A., & Bell, L. 1991. Phonemic activation during the first 40 ms of word identification: Evidence from backward masking and priming. Journal of Memory and Language, 30, 473–485.
Pexman, P. M., & Lupker, S. J. 1999. Ambiguity and visual word recognition: Can feedback explain both homophone and polysemy effects? Canadian Journal of Experimental Psychology, 53, 323–334.
Pexman, P. M., Lupker, S. J., & Jared, D. 2001. Homophone effects in lexical decision. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 139–156.
Rastle, K., & Coltheart, M. 1999. Lexical and nonlexical phonological priming in reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 25, 461–481.
Seidenberg, M. S., Waters, G. S., Barnes, M. A., & Tanenhaus, M. K. 1984. When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23, 383–404.
Star, M. S., & Fleming, K. K. 2001. A rose by any other name is not the same: The role of orthographic knowledge in homophone confusion errors. Journal of Experimental Psychology: Learning, Memory, and Cognition, 27, 744–760.

(Manuscript received November 4, 2003; Revision accepted February 10, 2004)

Appendix

Dictionary in the LIWC for Homophones Used in the Analysis

H-H frequency   H-L frequency   L-H frequency   L-L frequency
there           but             butt            tow
their           due             dew             toe
four            air             heir            flour
for             night           knight          flower
two             beat            beet            ore
too             none            nun             oar
hear            blue            blew            alter
here            board           bored           altar
peace           rain            reign           bail
piece           sweet           suite           bale
scene           pain            pane            bridal
seen            break           brake           bridle
sea             course          coarse          serf
see             feet            feat            surf
right           horse           hoarse          clause
write           hair            hare            claws
sight           way             weigh           chord
site            hall            haul            cord
son             higher          hire            dough
sun             heart           hart            doe
war             latter          ladder          fairy
wore            least           leased          ferry
                sign            sine            fir
                made            maid            fur
                main            mane            flare
                mind            mined           flair
                morning         mourning        bow
                right           rite            bough
                read            reed            steak
                real            reel            stake
                cent            scent           foul
                seem            seam            fowl
                size            sighs           hay
                                                hey
                                                heal
                                                heel
                                                tee
                                                tea
                                                vane
                                                vein

Note: H-H: High-High; H-L: High-Low; L-H: Low-High; L-L: Low-Low.