The emergence of the modern concept of introspection: a quantitative linguistic analysis

I. Raskovsky, Department of Computer Science, University of Buenos Aires, Pabellón I, Ciudad Universitaria, Buenos Aires, C1428EGA, Argentina, [email protected]
D. Fernández Slezak, Department of Computer Science, University of Buenos Aires, Pabellón I, Ciudad Universitaria, Buenos Aires, C1428EGA, Argentina, [email protected]
C.G. Diuk, Department of Psychology, Princeton University, Princeton, NJ 08540, USA, [email protected]
G.A. Cecchi, Computational Biology Center, T.J. Watson IBM Research Center, Yorktown Heights, NY 10598, USA, [email protected]

Proceedings of the NAACL HLT 2010 Young Investigators Workshop on Computational Approaches to Languages of the Americas, pages 68–75, Los Angeles, California, June 2010. © 2010 Association for Computational Linguistics

Abstract

The evolution of literary styles in the western tradition has been the subject of extended research that arguably has spanned centuries. In particular, previous work has conjectured the existence of a gradual yet persistent increase of the degree of self-awareness or introspection, i.e. the capacity to expound on one's own thought processes and behaviors, reflected in the chronology of the classical literary texts. This type of question has traditionally been addressed by qualitative studies in philology and literary theory. In this paper, we describe preliminary results based on the application of computational linguistics techniques to quantitatively analyze this hypothesis. We evaluate the appearance of introspection in texts by searching for words related to it, and focus on simple studies of the Bible. These preliminary results are highly positive, indicating that it is indeed possible to statistically discriminate between texts based on a semantic core centered around introspection, chronologically and culturally belonging to different phases. In our opinion, the rigorous extension of our analysis can provide not only a stricter statistical measure of the evolution of introspection, but also means to investigate subtle differences in aesthetic styles and cognitive structures across cultures, authors and literary forms.

1 Introduction

The evolution of literary styles in the western tradition has been the subject of extended research that arguably has spanned centuries. In particular, previous work has conjectured the existence of a gradual yet persistent increase of the degree of self-awareness or introspection, i.e. the capacity to expound on one's own thought processes and behaviors, reflected in the chronology of the classical literary texts. This type of question has traditionally been addressed by qualitative studies in philology and literary theory. In this paper, we describe preliminary results based on the application of computational linguistics techniques to quantitatively analyze this hypothesis.

The striking differences between the Iliad and the Odyssey in the way the characters' behaviors are attributed to divine intervention, or to the individual's volition, have been pointed out by numerous scholars (Onians, 1988; Dodds, 1951; Adkins, 1970; De Jong and Sullivan, 1994). However, not until the highly influential work of Marshall McLuhan (McLuhan, 1962) and Julian Jaynes (Jaynes, 2000) was it pointed out that these changes may reflect not just artistic or even cultural tendencies, but profound alterations in the mental structure of those who wrote, collected and assimilated the stories. While McLuhan argued for a materialistic effect of the type of medium (the linearity of written language, the holistic nature of the moving image) on the organization of thoughts (linear or integrative, respectively), Jaynes proposed a more radical hypothesis: a relatively abrupt transition from a "bicameral mind", where one hemisphere produced god-like commands that the other followed blindly, to the modern mind with its capacity for self-awareness. Moreover, Jaynes boldly suggested that this transition may have been accompanied by a physical process that altered the relationship between the hemispheres, and changed culture permanently. Since its publication, The Origin of Consciousness in the Breakdown of the Bicameral Mind has been highly influential inside and outside scientific quarters, as well as a source of continuing controversy (Cavanna et al., 2007).

Whether brought about by nature or nurture, however, Jaynes presents compelling arguments about the effects of this transition, including stylistic changes throughout the other foundational text of the western world, the Bible. Simply put, a less radical version of Jaynes' hypothesis would state that, within the judeo-greco-christian cultural tradition, there exists an "arrow of time" pointing to increasing introspection. The question we set out to answer in the present manuscript is to what extent it is possible to analyze this hypothesis quantitatively.

The widespread availability of classic and modern literary texts has paved the road to a wide variety of linguistic studies. Matters of literary style and structure are necessarily more controversial, although the recent work of F. Moretti (Moretti, 2005) has shown that it is indeed possible to quantify the subtle variations in the structure of the novel over temporal periodizations and geographical locations. In any event, given that our intention is to complete a preliminary study of feasibility, we focus here on capturing the textual traces of words or lexical structures that can be reasonably argued to reflect introspective thinking on the part of the characters, using techniques from machine learning and computational linguistics.

2 Materials and methods

We downloaded selected texts representative of different ages in literature from the MIT classic texts archive (Daniel C. Stevenson, 2010), based on references in Jaynes' book (Jaynes, 2000). The selected texts are: the Iliad and the Odyssey (approx. 1200 BC to 900 BC), The Bible (approx. 1400 BC to AD 200), Lucretius' On the Nature of Things (99 BC - 55 BC), St. Augustine's Confessions (AD 397 - AD 398), Shakespeare's The Merchant of Venice (AD 1596 - AD 1598), Hamlet (approx. AD 1600), Macbeth (AD 1603 - AD 1607) and Othello (AD 1603), Cervantes' Quixote (AD 1605 - AD 1615), Jane Austen's Mansfield Park (AD 1814), Emma (AD 1815) and Persuasion (AD 1816) and Proust's Time Regained (AD 1927).

In this preliminary study, we focused on extremely simple techniques to test our hypothesis. We implemented a series of basic routines to analyze the frequency of certain hand-selected words related to introspection. We used very simple regular expressions to search over the text: think+, thought, myself, mind+, feel+ and felt. The search was conducted on 10,000-word windows, starting from the beginning of the text and moving towards the end in 2,000-word steps. We also measured the appearance of references to God in the Bible. In this case, we looked for: lord, god and almighty. All searches were case insensitive. In order to control for the possible increase of these selected words as a trivial consequence of an increase in the overall linguistic richness or expressiveness of the text, we also computed the total number of distinct words for each step.

As an alternative approach, we applied a data-driven method to extract the semantic structure of texts, namely topic modeling (Blei, 2009). We used the implementation in the mallet package (McCallum, 2002), an off-the-shelf tool, generating 100 topics through 10,000 Gibbs sampling rounds. The topics were then manually inspected for their semantic relevance to the issue at hand, i.e. introspection.

3 Results

The preliminary results are highly positive, indicating that it is indeed possible to statistically discriminate between texts based on a semantic core centered around introspection, chronologically and culturally belonging to different phases. In figure 1 we show the frequency of words related to introspection versus the amount of different words, for all the texts we chose. Each author is identified by a unique color; each text is identified by a unique symbol; each point represents a 10,000-word window. The frequency is calculated as the count of matching words over the number of different words in the 10,000-word window. To summarize this information and provide statistical value to our analysis, we present in figure 2 the mean and standard deviation for each of the selected texts.

[Figure 1 plot. Horizontal axis: number of distinct words (0 to 4000). Vertical axis: frequency of introspection related words (0 to 0.07). Legend: Bible, Iliad, Odyssey, Lucretius, Confessions, Quixote, Merchant, Othello, Macbeth, Hamlet, Emma, Persuasion, Mansfield Park, Time Regained.]

Figure 1: Frequency of words related to introspection versus the amount of different words. Each text is identified by a unique color; each point represents a 10,000-word window.

We clearly observe how different texts are disjoint in the graph, both in the amount of different words used in each window and in the frequency of introspection. This is the case for the Iliad and the Odyssey, confirming that our preliminary mea- [...]

[...] ble average of his work seems to fall in line with the global temporal order. It is beyond the scope of our manuscript to discuss the nuances of the work of The Bard, but our analytic approach may provide new tools to the ongoing Shakespearean scholarship. Finally, our analysis seems to break down for Proust, as one would intuitively expect a much higher measure of introspection, even more so considering that he displays significant richness in terms of the number of distinct words in the text. This failure is clearly an indication of the limitations of our current approach, which as it stands may not be applicable to modern or contemporary literature.

The Bible is of particular interest for this work, as it was written in parts along a wide time interval (taking into account the Old and New Testaments).
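The windowed word search described in Materials and methods, together with the normalization by distinct words used in the Results, can be sketched in Python as follows. This is only an illustrative sketch, not the authors' routines: the function names `window_stats` and `summarize` are hypothetical, the patterns think+, mind+ and feel+ are assumed to mean "any word starting with this stem", tokens are assumed to be runs of ASCII letters, and a trailing remainder shorter than one step is assumed to be dropped.

```python
import re
from statistics import mean, stdev

# Assumption: "think+", "mind+" and "feel+" are read as prefix patterns
# (think, thinks, thinking, ...); "thought", "myself" and "felt" are exact.
INTROSPECTION = re.compile(r"think\w*|thought|myself|mind\w*|feel\w*|felt")
# Assumption: a token is a run of ASCII letters in the lowercased text.
TOKEN = re.compile(r"[a-z]+")


def window_stats(text, window=10_000, step=2_000):
    """Return one (distinct_words, frequency) pair per sliding window.

    frequency = number of introspection-related tokens in the window
                divided by the number of distinct tokens in the window.
    """
    tokens = TOKEN.findall(text.lower())  # lowercasing makes the search case insensitive
    results = []
    last_start = max(len(tokens) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = tokens[start:start + window]
        if not chunk:  # empty text: nothing to measure
            break
        hits = sum(1 for tok in chunk if INTROSPECTION.fullmatch(tok))
        distinct = len(set(chunk))
        results.append((distinct, hits / distinct))
    return results


def summarize(results):
    """Mean and sample standard deviation of the per-window frequencies."""
    freqs = [freq for _, freq in results]
    if not freqs:
        return None
    spread = stdev(freqs) if len(freqs) > 1 else 0.0
    return mean(freqs), spread
```

Calling `window_stats` on the full text of one book gives the points plotted for that book, and `summarize` gives a per-text mean and standard deviation. The same sketch would cover the references to God in the Bible by swapping in a pattern for lord, god and almighty.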