<<

Detection of Longitudinal Development of Dementia in Literary Writing

A thesis presented to

the faculty of

the College of Arts and Sciences of Ohio University

In partial fulfillment

of the requirements for the degree

Master of Arts

Torri E. Raines

May 2018

© 2018 Torri E. Raines. All Rights Reserved.

This thesis titled

Detection of Longitudinal Development of Dementia in Literary Writing

by

TORRI E. RAINES

has been approved for

the Department of Linguistics

and the College of Arts and Sciences by

David Bell

Associate Professor of the Department of Linguistics

Robert Frank

Dean, College of Arts and Sciences

ABSTRACT

RAINES, TORRI E., M.A., May 2018, Linguistics

Detection of Longitudinal Development of Dementia in Literary Writing

Director of Thesis: David Bell

Past studies have suggested that the progression of dementia, especially Alzheimer’s disease, can be detected in the writing of literary authors through analysis of their lexical diversity patterns. However, those studies have used oversimplified measures and vague definitions of lexical diversity. This study uses a multi-faceted, computationally operationalized model of lexical diversity developed by Scott Jarvis to analyze a total of 129 novels by five authors (three with dementia and two without), with the purpose of identifying the lexical characteristics of dementia in literary writing. A total of 22 novels by two authors with suicidal depression were also analyzed in order to determine whether this condition also leads to changes in authors’ lexical diversity patterns.

Analyses were conducted with six individual lexical diversity measures and two supplementary lexicosyntactic measures. Results suggest that dementia as well as the effects of healthy aging manifest in different aspects of lexical diversity for different authors, and that this model of lexical diversity is a robust tool for detecting lexical decay indicative of dementia. The model achieves 100% classification accuracy in discriminating between dementia-affected and non-dementia-affected novels. Classification accuracy drops slightly with leave-one-out cross-validation but remains higher than 88% for all dementia group authors.


DEDICATION

To my fiancé and life partner, Zachary Thompson, and our ridiculous cats, Harley and Pickle.


ACKNOWLEDGMENTS

I am grateful to Dr. Scott Jarvis for his invaluable guidance, feedback, and encouragement throughout this project and for getting me interested in lexical diversity and computational linguistics in the first place.

I would like to thank my other committee members, Dr. David Bell and Dr. Michelle O’Malley, for their enthusiasm and input. I also extend my gratitude to Dr. Romy Ghanem for help with using SPSS and interpreting statistics.

As always, my fiancé Zachary Thompson has my endless gratitude for emotional and intellectual support that made the completion of this project possible. I also want to thank him for his help with various coding problems throughout the process.


TABLE OF CONTENTS

Abstract
Dedication
Acknowledgments
Table of Contents
List of Tables
List of Figures
Chapter 1: Introduction
Chapter 2: Literature review
    Alzheimer’s and Dementia Detection
    Measuring Lexical Diversity
    Depression Detection
    The Present Study
Chapter 3: Method
    Data
        Authors
    Analysis
        Lexical Diversity
        Other measures
Chapter 4: Results and Discussion
    Individual variables and linear regression
        Control Group
        Dementia Group
        Depression Group
    Exploratory – Hierarchical Cluster Analysis
    Confirmatory – Linear Discriminant Analysis
        Dementia Group
        Depression Group
    Returning to Research Questions
        RQ1 - Will complex measures of lexical diversity support the findings of Le et al.’s 2011 study?
        RQ2 - Will these measures provide more nuanced detection of the development of dementia?
        RQ3 - Can these measures be applied to depression, and particularly depression escalating into suicide?
    Limitations to the Study and Future Directions
Chapter 5: Conclusion
References
Appendix A: Sample Data
Appendix B: Significant Linear Regression Graphs
Appendix C: Dendrogram from Hierarchical Cluster Analysis


LIST OF TABLES

Table 1. Iris Murdoch novels used
Table 2. Agatha Christie novels used
Table 3. Terry Pratchett novels used
Table 4. P.D. James novels used
Table 5. Mary Higgins Clark novels used
Table 6. Virginia Woolf novels used
Table 7. Kurt Vonnegut novels used
Table 8. Raw LD and it-v-adj-to-that values for P.D. James
Table 9. Linear regression results for P.D. James
Table 10. Raw LD and it-v-adj-to-that values for Mary Higgins Clark
Table 11. Linear regression results for Mary Higgins Clark
Table 12. Raw LD and it-v-adj-to-that values for Iris Murdoch
Table 13. Linear regression results for Iris Murdoch
Table 14. Raw LD and it-v-adj-to-that values for Agatha Christie
Table 15. Linear regression results for Agatha Christie
Table 16. Raw LD and it-v-adj-to-that values for Terry Pratchett
Table 17. Linear regression results for Terry Pratchett
Table 18. Raw LD and it-v-adj-to-that values for Kurt Vonnegut
Table 19. Linear regression results for Kurt Vonnegut
Table 20. Raw LD and it-v-adj-to-that values for Virginia Woolf
Table 21. Linear regression results for Virginia Woolf
Table 22. Tests of equality of group means for Iris Murdoch
Table 23. Tests of equality of group means for Agatha Christie
Table 24. Tests of equality of group means for Terry Pratchett
Table 25. Tests of equality of group means for Kurt Vonnegut
Table 26. Tests of equality of group means for Virginia Woolf


LIST OF FIGURES

Figure 1. Evenness – P.D. James
Figure 2. Delta Prime – P.D. James
Figure 3. Delta Prime – Mary Higgins Clark
Figure 4. Special types – Iris Murdoch
Figure 5. Delta Prime – Iris Murdoch
Figure 6. Delta Prime – Iris Murdoch (linear regression)
Figure 7. Delta Prime – Agatha Christie
Figure 8. Delta Prime – Terry Pratchett
Figure 9. Clusters20 – Iris Murdoch
Figure 10. Clusters20 – Agatha Christie
Figure 11. Delta – Kurt Vonnegut
Figure 12. Clusters20 – Terry Pratchett
Figure 13. Special types – Terry Pratchett
Figure 14. Semantic disparity – Terry Pratchett
Figure 15. Evenness – Terry Pratchett
Figure C1. First third of dendrogram
Figure C2. Second third of dendrogram
Figure C3. Final third of dendrogram


CHAPTER 1: INTRODUCTION

Erosion of language ability caused by forms of dementia like Alzheimer’s disease (AD) often manifests in the lexical patterns of a patient’s writing. The term ‘dementia’ is used to describe several illnesses that cause physical changes in the brain and disrupt cognitive and behavioral faculties to the point of impeding the daily life of those affected (National Institute on Aging, 2016). AD is the most common form of dementia and causes clumps and tangles in the brain that result in the death of a massive number of neurons, causing symptoms such as memory problems, lexical recall deficiencies, reasoning issues, and visual/spatial impairment (National Institute on Aging, 2016). Most cases of AD are late-onset, with symptoms appearing in the patient’s mid-60s. Early-onset AD can begin anywhere between the patient’s 30s and mid-60s. AD can only be definitively diagnosed post-mortem, since the patient’s brain must be autopsied. Early detection of AD is thought to be possible through biological markers, such as those found in spinal fluid, but such techniques are not yet reliable (National Institute on Aging, 2016). Therefore, developing non-biological detection methods could prove important in detecting AD and other forms of dementia early. Since language is one of the main cognitive abilities that dementia impacts, and because that impact is usually in the form of lexical recall problems, lexical analysis of the language use of confirmed dementia patients is an important avenue of research.

Past studies have used computational analysis of changing linguistic behavior over time to trace the development of dementia in literary authors. Because the data sets are so extensive and span much of the authors’ lives, career novelists have proven to be ideal subjects for this type of analysis. Investigation of linguistic signs of AD in literary writing began with a study by Garrard, Maloney, Hodges, and Patterson (2005), who investigated three of Iris Murdoch’s novels. Murdoch is a prime candidate for such a study due to the trajectory of her career; her writing was regarded as being of consistently high quality until the publication of her last novel, Jackson’s Dilemma, which readers and critics did not receive favorably. After she finished writing the novel, she received a diagnosis of probable AD, which was confirmed after her death (Garrard et al., 2005). Since Garrard et al.’s analysis, several follow-up studies have been conducted on Murdoch as well as a widening circle of other authors who died with confirmed or suspected AD. Each replication study has been designed to overcome some of the shortcomings of its predecessors with more nuanced measures, a wider scope of novelists, or both (van Velzen & Garrard, 2008). The most recent studies in this chain, Le, Lancashire, Hirst, and Jokel (2011) and van Velzen, Nanetti, and de Deyn (2014), investigate AD with similar results, suggesting that the development of AD has a distinct lexical pattern. However, both studies use somewhat simple measures of what they refer to as lexical diversity, and their concept of lexical diversity consists only of type-token ratios in windows of 55,000 tokens. Whereas these studies and the ones that came before them have found convincing evidence of lexical signs of the development of AD, this study seeks not only to expand upon those results with a multi-faceted analysis of lexical diversity developed by Jarvis (2013), but also to widen the scope of lexical analysis of mental illness to examine chronic depression, particularly in authors who have attempted suicide, in an exploratory attempt to discover any lexical signs of an imminent suicide attempt.

CHAPTER 2: LITERATURE REVIEW

Alzheimer’s and Dementia Detection

Studies such as Tang-Wai and Graham (2008), Kemper, Thompson, and Marquis (2001), Hebert et al. (2000), Lyons et al. (1993), and Baddeley et al. (1991) have provided evidence that Alzheimer’s disease manifests in deficits in several aspects of language ability, including lexical and semantic decay, mild syntactic simplification, and apparent working memory deficits (see particularly Baddeley et al., 1991). Furthermore, past studies have illustrated especially noticeable decay of lexical abilities due to AD through word fluency tasks, picture naming tasks (Nebes, 1989), and picture description tasks (Bird, Patterson, & Hodges, 2000). However, those tasks have little validity in everyday life, so studies such as Snowdon, Greiner, and Markesbery (2000) began exploring lexical decay due to AD in prompted and spontaneous connected speech. Snowdon et al. (2000) and Riley et al. (2005) examined the writing of several nuns from a religious community over a period of about five decades and found consistent syntactic and semantic differences between the language use of those who developed dementia and that of those who did not. Bucks, Singh, Cuerden, and Wilcock (2000) computationally analyzed the spontaneous speech of dementia patients and healthy controls using what they described as three measures of lexical richness (type-token ratio, Brunet’s index, and Honoré’s statistic) and found that the lexical richness values of dementia patients were lower than those of the healthy controls. Studies of this kind prompted researchers to seek a more extensive set of data on the language of subjects who would eventually develop AD but who were unaware of the pending disease and thus not distressed or attempting to compensate for loss of ability in writing (see Bucks et al., 2000, for an overview of criticisms of ‘structured’ tasks). Writing by authors who had long literary careers and who developed AD before they stopped publishing has proven to be a suitable avenue for this line of research.

The study by Garrard, Maloney, Hodges, and Patterson (2005) represents the first foray into quantitative linguistic analysis of literary texts for evidence of the effects of dementia. Garrard et al. (2005) analyzed just three of Murdoch’s novels: Under the Net, her first novel; the first 100 pages from The Sea, the Sea, her most critically successful novel; and Jackson’s Dilemma, her final novel. Their lexical analyses found that the mean word frequency in Jackson’s Dilemma was higher than in the other two novels, and Jackson’s Dilemma had a lower type-token ratio (TTR) and new type introduction rate than its predecessors. These results coincided with the researchers’ expectations that Murdoch would have a more limited, higher-frequency vocabulary at her disposal at the time that her AD was beginning to manifest. Although the researchers had access to 2.2 of Murdoch’s novels, the study examined only five random samples of 100-word chunks from each novel for many of its analyses, including its analyses of the lexical difference between the novels. These samples were analyzed manually.

Van Velzen and Garrard (2008) sought to corroborate these results with data from another language. They analyzed three novels by Gerard Reve, a Dutch author with an AD trajectory very similar to Murdoch’s--he, too, developed AD late in life after a long literary career and only became aware of his disease after writing his final novel. The researchers analyzed the entirety of the three novels--Werther Nieland, Bezorgde ouders (Parents Worry), and Het hijgend hert (The Panting Heart)--by calculating the TTR of consecutive 1000-word chunks. They found that Reve’s final novel had a lower mean TTR with a higher standard deviation than the other novels. The study also compared the mean TTR of the first halves of the texts to that of the second halves and found that the final novel had a lower mean TTR in the second half (dropping from 0.425 to 0.404), whereas the other two novels showed no such change. In a case study of Agatha Christie, who was widely thought to have had dementia during the time she wrote her last several novels but was never diagnosed, Lancashire and Hirst (2009) expanded the longitudinal depth of this research paradigm, analyzing 14 of Christie’s novels. They used a type-token ratio of the first five consecutive 10,000-word segments of each novel as well as a count of maximal phrase types (repeated n-grams that are not contained in a longer n-gram) and counts of what they term ‘indefinite words,’ including ‘thing,’ ‘anything,’ and ‘something.’ Their results fell within expectations--Christie’s TTR declined with age, and her repetition of phrases and usage of indefinite words increased, with the exception of an outlier novel, Passenger to Frankfurt, for which Christie conducted extensive research.
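As a rough illustration of the segment-wise TTR that Van Velzen and Garrard (2008) describe, the following Python sketch computes the TTR of consecutive, non-overlapping windows and summarizes them with a mean and standard deviation. The tokenizer, the function names, and the decision to discard a short trailing segment are assumptions made for this example; they are not taken from the original study.

```python
import re
import statistics


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def segment_ttrs(tokens, window=1000):
    """Return the TTR of each consecutive, non-overlapping window of `window` tokens.

    A trailing segment shorter than `window` is dropped (an assumption of this
    sketch) so that every TTR is computed over the same number of tokens.
    """
    ttrs = []
    for start in range(0, len(tokens) - window + 1, window):
        segment = tokens[start:start + window]
        ttrs.append(len(set(segment)) / window)
    return ttrs


def summarize(ttrs):
    """Return the mean and sample standard deviation of a list of segment TTRs."""
    mean = statistics.mean(ttrs)
    sd = statistics.stdev(ttrs) if len(ttrs) > 1 else 0.0
    return mean, sd
```

Comparing the first and second halves of a novel, as the study did, would amount to calling `summarize` separately on the first and second halves of the list returned by `segment_ttrs`.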

Le et al.’s (2011) study, of which Le (2010) was the predecessor, widened the scope of authors as well as books per author, examining at least 15 novels each from an author who died with dementia (Murdoch), one who was suspected of having dementia but never received a confirmed diagnosis (Agatha Christie), and one who aged and later died in good mental health (P.D. James). Le et al. (2011) also expanded on the measures of its predecessors. For each novel, the researchers calculated lexical measures of the lemmatized texts, including the overall TTR; the word-type introduction rate (WTIR); global n-gram repetitions of two to 11 words within the first 55,000 tokens of the text (global repetition); the proportion of lemmas from open-class words repeated within 10 following open-class words to the total number of content words (local repetition); the proportions of “indefinite nouns” including thing, something, anything, and nothing and of frequent, low-imageability verbs (what they call lexical specificity); the proportion of word class types and tokens (word-class deficit); and the proportion of POS-tagged interjections and fillers. Their strongest results lie within the TTR and WTIR measures. The study found a decline in active vocabulary and an increase in lexical repetition in Murdoch’s and Christie’s writing, again with Christie’s novel Passenger to Frankfurt as an outlier. Signs of decline in Murdoch’s writing were especially apparent in her second-to-last novel, The Green Knight, which Murdoch wrote in 1993, five years before her diagnosis. Le et al. also observed a decline around Murdoch’s late 40s and early 50s followed by a recovery preceding the AD decline, which they attributed to personal crises Murdoch endured at the time. The study found signs of normal aging in James--a steady increase in vocabulary over time, then a very slight decline in her later years.

The most recent study in this line of research is van Velzen et al. (2014), which revisits the methods of Garrard et al. (2005), van Velzen and Garrard (2008), and Le et al. (2011) with refined data modelling methods in an attempt to ‘tell the story’ of the data with more accuracy. Their measures included type-token ratios of the first 55,000 tokens (to replicate Le et al.) and noun to pronoun ratios (with the idea that increased usage of pronouns as placeholders for more specific nouns indicates a decline in lexical ability). To model their data, they used the AIC() method (which compares multiple linear models to find the best fit by calculating an Akaike Information Criterion value for each--the lowest value indicates the best fit) in the programming language R to compare first- through sixth-order models (for more information on AIC(), see van Velzen et al., 2014, p. 194). The main difference from Le et al. (2011) that van Velzen et al. (2014) describe is that their data modelling reveals that Murdoch’s lexical decline is more sudden and occurs later than Christie’s. They also analyzed the novels of Gerard Reve, Hugo Claus, and Harry Mulisch, all of whom are contemporaries of Murdoch, Christie, and James. Reve and Claus, who had AD, showed decline patterns very similar to Murdoch’s, whereas Mulisch, who was added as another healthy control, showed consistency similar to James. Their noun-pronoun ratio measures did not have significant results for Murdoch and showed a gradual decline in noun-pronoun ratio for Christie.
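The model-comparison step described above, in which first- through sixth-order models are ranked by their Akaike Information Criterion, can be sketched in Python as follows. Van Velzen et al. (2014) worked in R; this NumPy version, the Gaussian-error likelihood it assumes, and the function names are illustrative assumptions rather than a reproduction of their code.

```python
import math

import numpy as np


def polynomial_aic(x, y, degree):
    """Fit a polynomial of the given degree by least squares and return its AIC.

    Assumes independent Gaussian errors, so the maximized log-likelihood
    depends only on the residual sum of squares (RSS).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Center and scale x so that high-order fits stay numerically stable.
    z = (x - x.mean()) / x.std()
    coeffs = np.polyfit(z, y, degree)
    residuals = y - np.polyval(coeffs, z)
    rss = float(np.sum(residuals ** 2))
    log_likelihood = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = degree + 2  # polynomial coefficients plus the error variance
    return 2 * k - 2 * log_likelihood


def best_order(x, y, max_degree=6):
    """Return (best_degree, {degree: AIC}) for polynomial orders 1..max_degree."""
    scores = {d: polynomial_aic(x, y, d) for d in range(1, max_degree + 1)}
    return min(scores, key=scores.get), scores
```

Here `x` would hold something like the author's age or the publication year of each novel and `y` the corresponding TTR. A sudden late decline such as the one reported for Murdoch would tend to favor a higher-order model over a straight line, while a gradual decline would not.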

Measuring Lexical Diversity

The most compelling results of the studies discussed above are the evidence of lexical decay in the authors who developed AD. However, the measures used by these studies are relatively simple, and they also use somewhat conflicting and loose, or entirely absent, construct definitions to justify their measures. Hazy definitions of LD have plagued multitudes of past studies that sought to use it for purposes from forensics (Colwell et al., 2002) to language acquisition (Singh, 2001) to stylistics (Smith and Kelly, 2002) (as cited in McCarthy & Jarvis, 2007). Many past studies have used TTR or some manipulation thereof as an index of lexical diversity, and several studies have used ‘lexical richness’ to mean what others use ‘lexical diversity’ to describe (Bucks et al., 2000). It is generally agreed that the overall TTR of an entire text is a poor measure of LD since it is deeply dependent on text length--as the length of a text increases, the likelihood that each new token represents a previously unused type decreases. TTR is thus more accurately described as a measure of lexical repetition rather than lexical diversity. While some researchers may suggest that lexical diversity is simply the opposite of repetition, Jarvis (2013b) argues that lexical diversity has more to do with the opposite of redundancy, which is subjective and based on “perception of excessive or unnecessary repetition” (p. 20), and furthermore he argues that the opposite of redundancy is only a component of lexical diversity rather than an encapsulation of the entire concept.
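The text-length dependence of TTR can be seen in a small simulation. The sketch below draws tokens from a hypothetical vocabulary whose frequencies follow a Zipf-like distribution and then computes the TTR of progressively longer prefixes of the same sample. The vocabulary size, the 1/rank weighting, and the function names are assumptions chosen only for illustration.

```python
import random


def ttr(tokens):
    """Type-token ratio: number of distinct types divided by number of tokens."""
    return len(set(tokens)) / len(tokens)


def zipfian_sample(n_tokens, vocab_size=5000, seed=0):
    """Draw tokens from a made-up vocabulary with Zipf-like frequencies (weight 1/rank)."""
    rng = random.Random(seed)
    vocab = [f"w{rank}" for rank in range(1, vocab_size + 1)]
    weights = [1 / rank for rank in range(1, vocab_size + 1)]
    return rng.choices(vocab, weights=weights, k=n_tokens)


tokens = zipfian_sample(50_000)
for length in (100, 1_000, 10_000, 50_000):
    # The same "author" and vocabulary throughout; only the sample length changes.
    print(length, round(ttr(tokens[:length]), 3))
```

Because the underlying vocabulary never changes, any fall in the printed values reflects sample length alone, which is why whole-text TTRs of novels of different lengths are not directly comparable.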

According to Jarvis’s construct definition, lexical diversity, which he refers to as ‘the variety of words found in a text,’ is a complex and nuanced construct that cannot be fully captured with TTR measures. Carroll (1938) was the first researcher to investigate lexical diversity under that term; his research started a tradition of viewing LD as an objective phenomenon that can be captured with mathematical equations which deal with the relationship between types and tokens. Like the simple type-token ratio, these formulae mostly result in indices of lexical repetition. More recently, Jarvis and his colleagues have argued that lexical diversity is a multidimensional phenomenon which requires multiple indices to be measured fully. In his 1935 book, George Kingsley Zipf discusses the “curious orderliness” (p. 215) of language and the human instinct for the proper amount of diversity in multiple aspects of life, which manifests in language use--not only in terms of word frequency ranks, but in less-researched properties like what Zipf terms ‘wavelengths,’ or characteristic frequencies at which individual words tend to be repeated in natural language.

In that vein, Jarvis (2013a) advocates for “viewing lexical diversity as a perceptual phenomenon with measurable objective properties that must be calibrated with elements of perception” (p. 96) rather than a wholly objective construct. Not only do past measures not fully capture the multifaceted essence of lexical diversity, but such studies have validated their measures according to their correlation with proficiency. However, the fact that proposed measures of LD correlate highly with proficiency does not prove their validity as measures of LD--it simply demonstrates that they, at least in part, measure something closely related to proficiency. Instead, as Jarvis explains, lexical diversity indices need to be validated according to how well they predict human ratings of LD.

Jarvis’s endeavor to create a theoretically-grounded construct definition of lexical diversity and develop corresponding, sophisticated measures for this construct started with his 2007 study with McCarthy in which they evaluated several existing measures of LD in an attempt to find one that is not significantly affected by text length. McCarthy and Jarvis evaluated 14 existing measures of LD, including vocd and multiple variations of TTR measures, and found that all 14 were significantly affected by text length. While showing that vocd is essentially a needlessly distorted measure of the sums of the probabilities (SOP) for occurrence of each individual word in a given text, McCarthy and Jarvis argue that “It is not enough to be able to show what an LD tool does measure; we also need a theoretically adequate explanation of what an LD tool should measure” (p. 471).

McCarthy and Jarvis expanded upon this investigation in their 2010 study, in which they introduced a measure that they termed “measure of textual lexical diversity” (MTLD). They intended MTLD to be a component measure rather than an independent index of LD and described its calculation as follows: the total number of tokens in a text divided by the total number of textual segments whose TTR is above .72. This calculation is performed starting from the beginning of the text going forward and from the end of the text going backward, and the average of these two values is used as the final index (see McCarthy & Jarvis, 2010, pp. 384-5 for more detail). MTLD circumvents the problem of text length sensitivity that has plagued almost all past measures of LD. The final value produced by MTLD represents the average of how many words in the text are required to reach the point right before the TTR stabilizes, or the point at which the addition of repeated or new types to the text cannot substantially affect the TTR. However, Jarvis’s most recent model of LD does not incorporate MTLD, and his later articles focus on LD as a perception-based phenomenon.
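The MTLD calculation outlined above can be sketched in Python as follows. This is a minimal reading of the procedure rather than McCarthy and Jarvis’s reference implementation: the use of `<=` at the .72 threshold, the partial-factor credit for the leftover segment at the end of a pass, and the handling of texts that never reach the threshold are assumptions of this sketch, and published implementations differ on such details.

```python
def mtld_one_direction(tokens, threshold=0.72):
    """Run one pass over the tokens and return the number of tokens per 'factor'.

    A factor is a stretch of text that ends when its running TTR falls to the
    threshold; the running TTR is then reset and counting starts again.
    """
    factors = 0.0
    types = set()
    count = 0
    for token in tokens:
        count += 1
        types.add(token)
        if len(types) / count <= threshold:
            factors += 1
            types = set()
            count = 0
    if count > 0:
        # Partial credit for the leftover segment, in proportion to how far
        # its TTR has fallen from 1.0 toward the threshold.
        leftover_ttr = len(types) / count
        factors += (1 - leftover_ttr) / (1 - threshold)
    if factors == 0:
        # No repetition at all: the value is undefined, reported here as infinity.
        return float("inf")
    return len(tokens) / factors


def mtld(tokens, threshold=0.72):
    """Average the forward and backward passes, as the two-directional description requires."""
    if not tokens:
        raise ValueError("MTLD is undefined for an empty text")
    forward = mtld_one_direction(tokens, threshold)
    backward = mtld_one_direction(list(reversed(tokens)), threshold)
    return (forward + backward) / 2
```

Higher values mean that, on average, more tokens are needed before the running TTR drops to the threshold, which is the sense in which MTLD reflects sustained lexical variety rather than overall text length.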

In his 2012 and 2013 articles, Jarvis discusses the first iterations of his six-part lexical diversity model, which he would later refine to have seven components. He defined the first iteration of this model as “consisting of the following six properties: variegation (i.e., the range of lexical types in a sample), volume (i.e., the total number of lexical tokens in the sample), balance (i.e., the degree to which the tokens in the sample are equally distributed across types), rarity (i.e., the degree of specialness or uncommonness found among those types), disparity (i.e., the degree of difference between those lexical types), and dispersion (i.e., the way that those types are ordered in the text vis-à-vis one another)” (Jarvis, 2012, p. 56). He tested five of these components, all but dispersion, in a lexical diversity rating task with 130 human raters and found that each of the five affects human perception of lexical diversity. Though he includes text length as a component of lexical diversity, Jarvis stresses that other measures’ dependence on text length is not redeemable because in such measures, text length is an intervening variable whose weight is not calibrated to reflect human judgments of lexical diversity. For example, text length is a negative factor in TTR--the longer the text, the lower the rating--but Jarvis has found that a longer text is perceived to be more lexically diverse by human raters. In his 2013 articles, Jarvis developed his model further by taking inspiration from measures of biodiversity. By 2015, his model consisted of the following seven variables: volume, or the total token count; abundance, or the total number of types; variety, or the type-token relationship; evenness, or how evenly tokens are distributed across types; dispersion, or the degree to which tokens of the same type are evenly distributed in a text; specialness, or how many, what kind, and to what degree particular words in the text stand out as adding to its lexical diversity (from the concept in biodiversity of “the uniqueness of a species whose loss could not easily be compensated by other species” (Jarvis, 2013a)); and disparity, or how different the words in a text are from each other. Jarvis validated these variables as significantly contributing to a measure which predicts human raters’ judgments of a text’s lexical diversity through several rounds of having participants with high English proficiency rate the LD of narratives written by Finnish, Swedish, and English speakers (Jarvis, 2013a; 2013b; 2015). The 2015 version of the model had an R² of 0.89, with five of the seven measures contributing significantly without multicollinearity problems.
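To make three of these components concrete, the sketch below computes volume, abundance, and one possible evenness index for a list of tokens. The evenness formula used here is Pielou’s index from the biodiversity literature (Shannon entropy divided by its maximum); it is an illustrative stand-in chosen for this example and is not claimed to be the calculation Jarvis or the present study uses.

```python
import math
from collections import Counter


def volume(tokens):
    """Volume: the total number of tokens."""
    return len(tokens)


def abundance(tokens):
    """Abundance: the total number of distinct types."""
    return len(set(tokens))


def shannon_evenness(tokens):
    """Pielou's evenness J = H / ln(S), used here as a hypothetical evenness index.

    H is the Shannon entropy of the type frequencies and S the number of types.
    J is 1.0 when every type occurs equally often and approaches 0 as a few
    types come to dominate the text.
    """
    counts = Counter(tokens)
    n_types = len(counts)
    if n_types < 2:
        return 1.0  # convention of this sketch: a single type is trivially even
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_types)
```

Two texts can share the same volume and abundance and still differ in evenness, which is one reason a single type-token index cannot stand in for the whole construct.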

As mentioned above, studies examining AD still seem to define LD, if they define it at all, as a variation on TTR. Garrard et al. (2005) refer to measures of lexical diversity in their abstract, but do not mention lexical diversity again in the paper or give an explicit definition of such measures. Later, they refer to TTR and “the lower rate of increase of this proportion over successive incremental samples” (p. 258), so they presumably operated with a working definition of LD as the TTR of the whole text and the WTIR of 10,000-word increment windows (10,000, 20,000, and so on). Van Velzen and Garrard (2008) mention the importance of refining measures of LD for the detection of AD, but then go on to assert that “measurement of lexical diversity can be achieved simply by calculating the type to token ratio (TTR) within a text sample of given length” (p. 281). They calculated their TTRs in 1000-word windows, from token 1-1000, then 1001-2000, and so on, also calculating the mean and standard deviation of these TTRs. This usage of TTR is slightly more sophisticated than their simplistic definition of LD makes it sound, although they still fall into the trap of attributing LD to a single index. This particular manipulation of TTR was initially proposed by Johnson (1939; 1944) under the term ‘mean segmental type-token ratio,’ and as Jarvis (2013b) points out, these kinds of measures, which amount to a different approach to the same inadequate measure, “take as their input far too little information to account for the diversity of word use in a text” (p. 18). This trend of simple measures continues with Le et al. (2011) and van Velzen et al. (2014). Le et al. (2011) also use TTR (of the first 55,000 words in the novel, since all but one of the novels they analyzed have at least that many tokens) and WTIR (for 10,000-word sequential windows), although they use a large variety of other lexical indices as well. Van Velzen et al. (2014) refer to LD as TTR and follow the lead of Le et al. (2011) in using a 55,000-token window. These studies suffer from something similar to the “terminological drift” (Jarvis, 2013a) that has impeded many studies that deal with lexical diversity. The present study will attempt to escape this pitfall by employing Jarvis’s multi-faceted model of lexical diversity. For mathematical details on each component’s calculation, see the Method section.

Depression Detection

Lexical diversity has been used with some degree of success in multiple linguistic fields: in second language acquisition research, to detect proficiency; in stylistics, to determine whether a text has one or multiple authors; and in forensics, to discover guilt or innocence (McCarthy & Jarvis, 2007). Given these uses, in addition to its uses in detecting AD, lexical diversity and other computationally measurable factors may have the potential to yield a profile for chronic depression and perhaps to detect an impending suicide attempt by the text’s author. There is a significant body of existing research on detecting clinical depression in language use, though very few past studies examine literary writing for lexical signs of depression.

Stirman and Pennebaker (2001) analyzed the writing of nine suicidal and nine nonsuicidal poets in an attempt to predict suicide attempts through themes and linguistic features of the subjects’ poetry. Their analysis was guided by two models of suicide: the integration/disengagement model (Durkheim, 1951, as cited in Stirman and Pennebaker, 2001) and the hopelessness model. “According to Durkheim’s model, the suicidal individual has failed to integrate into society sufficiently and is therefore detached from social life” (Stirman and Pennebaker, 2001, p. 518). Due to this and an increase of self-attentiveness related to fame, Stirman and Pennebaker expected to find an increase in self-references and a decrease in other-references in the writing of suicidal poets. The hopelessness model of suicide, on the other hand, theorizes suicide as happening “during extended periods of sadness and desperation” (p. 518). According to this model, the hopeless feelings that accompany these periods lead the person to suicide as a solution. Stirman and Pennebaker assume that linguistic predictors of suicide based on the hopelessness model would include negative affect and emotional terms, a corresponding lack of positive words, and evident preoccupation with death. For their analysis, the researchers used the text analysis program Linguistic Inquiry and Word Count (LIWC), which categorizes words according to what they indicate about emotion, cognition, style, and other aspects of a text, and calculates percentages of words in each category. Stirman and Pennebaker selected nine well-known poets who committed suicide and matched them individually with nine control poets who were similar to their counterparts in terms of nationality, era, sex, and education. The control poets were controls in the sense that they did not die by suicide; some of them also had mental illness like their suicidal counterparts. Therefore, significant differences in language use between the groups are likely more accurately attributable to suicidal tendencies than to mental illness (Stirman & Pennebaker, 2001). The suicidal group used fewer plural first-person references, more singular first-person references, and more death-related vocabulary than the control group. Surprisingly, the suicidal group also used significantly more sexual references and vocabulary than the control group throughout their writing careers.

The majority of other studies concerned with detecting depression in language use deal with social media, prompted essays and speech, and other non-literary writing, but corroborate the results of Stirman and Pennebaker (2001). Rude, Gortner, and Pennebaker (2004) found that depressed college students used the word “I” more in personal essays, and Bucci and Freedman (1981) found that this effect is also present in spoken language: when asked to talk for 10 minutes about a personal topic, people with depression use “I” more than healthy individuals. More recently, Baddeley, Daniel, and Pennebaker (2011) used this self-concerned language profile of depression to examine the personal and professional writing, including letters, field reports, and journals spanning seven years, of a land surveyor named Henry Hellyer, whose death was controversial because some suspected that it was not a suicide. Their results showed that Hellyer used “I” more and “we” less as time progressed toward his death, a pattern indicative of depression and suicide. Choudhury, Gamon, Counts, and Horvitz (2013) found similar language patterns when examining the Twitter posts of self-professed sufferers of major depressive disorder for the year prior to reported onset of depression. In addition to an increase in usage of “I,” their study found a decrease in usage of third-person pronouns. With data consisting of oral interviews and blog posts, Fineberg et al. (2016) found similar results, although they suggest that increased usage of “I” is a marker of many forms of mental illness rather than just depression. They also found results that corresponded with Stirman and Pennebaker’s (2001) findings regarding sexual language, though Fineberg et al. categorized it with language dealing with ‘biological processes,’ which they deemed another self-referential category. Bernard, Baddeley, Rodriguez, and Burke (2016) also saw higher “I” usage in depressed college students’ personal writing compared to non-depressed students. Most of the above depression studies (Baddeley, Daniel, & Pennebaker, 2011; Bernard et al., 2016; Choudhury et al., 2013; Fineberg et al., 2016; Stirman & Pennebaker, 2001) used LIWC to process their data.

The Present Study

The chain of past studies on lexical decay in literary authors suffering from Alzheimer’s disease has shown promising results that contribute to our understanding of what characterizes early linguistic manifestations of dementia. However, these studies have used somewhat simple measures, and their methodology has suffered from terminological imprecision regarding their definition and operationalization of lexical diversity. Their results need to be replicated with more sophisticated, clearly defined, and validated measures.

Furthermore, past studies provide a basis for text analysis as a viable tool in suicide prediction, but their most consistent results have been with the increasing or decreasing usage of individual words--“I” and “we,” respectively. Most of these have examined writing that involves some facet of authorial self-reference, which is, by definition of the genre, generally not present in fiction. Since LD has been successfully used in other fields of linguistics to profile language users for a variety of purposes, this study will explore whether the lexical diversity in a suicidal author’s writing changes preceding a suicide attempt. Although depression does not erode the language centers of the brain like AD, there may still be measurable changes in lexical diversity that could indicate the escalation of the depression toward suicide.

In this study, I use the computational model of lexical diversity developed by Jarvis, as well as a few theoretically-motivated fringe measures whose constructs may contribute to a text’s LD, to examine the writing of seven literary authors--three with confirmed or suspected AD, two controls, and two with chronic depression and suicide attempts. To those ends, this study seeks to answer the following research questions:

1. Will complex measures of lexical diversity support the findings of Le et al.’s 2011 study?
2. Will these measures provide more nuanced detection of the development of dementia?
3. Can these measures be applied to depression, and particularly depression escalating into suicide?


CHAPTER 3: METHOD

Data

The writing of seven authors was analyzed. In order to replicate the results of Le et al. (2011) and van Velzen et al. (2014) with more sophisticated measures, three of the authors whose work I examined are the English-language writers--Iris Murdoch, P.D. James, and Agatha Christie--examined in the two papers just mentioned. I added two more authors to the AD roster: the first, Mary Higgins Clark, as another healthy control, and the second, Terry Pratchett, as another case of confirmed AD. Pratchett’s case is an opportunity for interesting insight, as he continued writing and publishing novels for several years after his diagnosis of a rare, early-onset form of AD. For the depression/suicide analysis, I examined novels written by Kurt Vonnegut and Virginia Woolf, both of whom were chronically depressed. Woolf committed suicide, and Vonnegut had an unsuccessful suicide attempt 23 years before his natural death.

Each of the novels was acquired in digital format--usually epub, mobi, or pdf--and, if originally in a different format, converted to plain text (.txt) using the free ebook management software Calibre. The novels were then cleaned of most punctuation with the exception of single quotes used as apostrophes. I hand-checked each text file to ensure that the files were not rendered unusable by errors in file format conversion. The cleaned versions of the files were then lemmatized and part-of-speech tagged using TreeTagger. These cleaned, tagged files were then used as the input files for the rest of the computational analyses.
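As an illustration of the cleaning step described above, the following is a minimal sketch and not the script used in this study. The function name `clean_text` and the specific regular expressions are assumptions; the sketch keeps a single quote only when it sits between two word characters (as in “don't”), so trailing possessive apostrophes (as in “dogs'”) are dropped.

```python
import re


def clean_text(raw: str) -> str:
    """Strip punctuation but keep word-internal apostrophes (hypothetical helper)."""
    # Normalise curly apostrophes to the plain ASCII form.
    text = raw.replace("\u2019", "'")
    # Replace anything that is not a word character, whitespace, or apostrophe
    # (and also the underscore) with a space.
    text = re.sub(r"[^\w\s']|_", " ", text)
    # Remove single quotes that are not flanked by word characters on both
    # sides, i.e. quotes used as quotation marks rather than apostrophes.
    text = re.sub(r"(?<!\w)'|'(?!\w)", " ", text)
    # Collapse runs of whitespace.
    return " ".join(text.split())
```

Under these assumptions, a quoted clause such as `'Don't go,' she said.` would be reduced to the bare word sequence with the contraction left intact.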

For the healthy control authors, I used 20 of Clark’s novels (1,665,416 tokens) and all 18 of James’s crime novels (2,199,148 tokens), totaling 3,864,564 tokens in the healthy control group. For the dementia group, I used a selection of 36 of Christie’s novels (2,421,744 tokens), 20 of Murdoch’s novels (2,689,167 tokens), and 35 of Pratchett’s Discworld novels (3,315,986 tokens), totaling 8,426,897 tokens for the Alzheimer’s group. The number of novels for Christie is larger than that of Murdoch, James, and Clark because past studies’ analyses of Christie showed less consistent results than analyses of Murdoch, whose lexical decay from AD was more sudden and pronounced (Le et al., 2011; van Velzen et al., 2014). I did not use all 75 of Christie’s novels, but my dataset has at least one novel for every three years of her writing career, with a greater density of novels per year toward the end of her career to allow closer observation of how her lexical diversity patterns changed as she neared the end of her life. Similarly, I used a larger dataset for Pratchett than for Murdoch, James, and Clark due to the nature of his dementia, which he claimed affected him more bodily than mentally (Grice, 2010; Pratchett, 2008/2015). Since his dementia was expected to behave differently than Murdoch’s, whose dementia patterns were similar to those of Gerard Reve and Hugo Claus, both of whom had kinds of AD similar to hers (van Velzen et al., 2014), I included all of his adult novels in order to observe the development of his AD more closely. I left out Pratchett’s novels aimed at younger readers because initial analysis revealed that those novels had consistently lower levels of LD, and since there were only six children’s novels in the original dataset, there were not enough data points to do a separate analysis for his adult novels and children’s novels. For the depression group, the data depended on which texts I was able to acquire. I used 13 of Vonnegut’s novels (806,135 tokens) (the last of his 14 could not be acquired in a usable format) as well as Woolf’s nine novels (836,790 tokens), totaling 1,642,925 tokens for the depression group.

Overall, this study analyzes approximately 13,934,386 tokens from 151 novels by seven authors. There may be errors in the token counts due to formatting issues in the original files leading to some words being pressed together into one (e.g. “two words” appearing as “twowords”). Due to the volume of the data, not all of such errors were caught and corrected. Tables 1-7 below show the selected books, year of composition, and age of the author at time of composition for each novel. Each table is positioned below the corresponding author. To save space in later tables, name codes are indicated for each book.

For the purpose of linear discriminant analysis, each book for each of the authors with pathological conditions (the dementia group and the depression group) was coded with a zero or a one, with zeroes corresponding to no signs of a pathological condition expected to be apparent in the novel (-dementia or -suicidal) and ones corresponding to the opposite (+dementia or +suicidal). The coding was decided based on biographical information about the authors and reinforced by Delta Prime (Hoover, 2004a, 2004b) calculations and hierarchical cluster analysis, which are discussed below. Logic for pathological condition coding is discussed in further detail in the Results section.

Authors

Iris Murdoch was born on July 15th, 1919 and died on February 8th, 1999. She wrote 26 novels in her lifetime, composing and publishing her final novel, Jackson’s Dilemma (published 1995), shortly before her Alzheimer’s diagnosis at age 77. Her novels are all in the same general genre, with themes including morality, sexuality, and the nature of good and evil (Nicholls, 1999). Partially due to her critical acclaim, her manuscripts were not substantially modified by editors (Garrard et al., 2005). Only her last two novels were coded as +dementia, in line with past studies pointing out the sudden and drastic nature of her decline, which mostly manifests in Jackson’s Dilemma but shows weaker signs in The Green Knight (Le et al., 2011; van Velzen et al., 2014).

Table 1
Iris Murdoch novels used

| Novel | Year | Age | Dementia |
|---|---|---|---|
| Under the Net (M1) | 1954 | 35 | 0 |
| The Flight from the Enchanter (M2) | 1955 | 36 | 0 |
| (M3) | 1958 | 39 | 0 |
| (M4) | 1961 | 42 | 0 |
| (M5) | 1962 | 43 | 0 |
| (M6) | 1963 | 44 | 0 |
| (M7) | 1964 | 45 | 0 |
| (M8) | 1966 | 47 | 0 |
| (M9) | 1968 | 49 | 0 |
| Bruno's Dream (M10) | 1969 | 50 | 0 |
| A Fairly Honourable Defeat (M11) | 1970 | 51 | 0 |
| (M12) | 1973 | 54 | 0 |
| The Sacred and Profane Love Machine (M13) | 1974 | 55 | 0 |
| (M14) | 1976 | 57 | 0 |
| The Sea, The Sea (M15) | 1978 | 59 | 0 |
| The Philosopher's Pupil (M16) | 1983 | 64 | 0 |
| (M17) | 1985 | 66 | 0 |
| The Book and the Brotherhood (M18) | 1987 | 68 | 0 |
| The Green Knight (M19) | 1993 | 74 | 1 |
| Jackson’s Dilemma (M20) | 1995 | 76 | 1 |

Agatha Christie (born September 15th, 1890; died January 12th, 1976) was never diagnosed with AD, but was suspected by colleagues and loved ones to have had dementia while she was composing her last few novels (Lancashire & Hirst, 2009). Sixty-six of her novels were detective stories, but she also wrote romances under a pseudonym. Her first novel was published in 1920 (age 30) and her last in 1973 (age 83). Her last 10 novels were coded as +dementia, which coincides with past studies’ implications that her decline was more gradual (Le et al., 2011; van Velzen et al., 2014).

Table 2
Agatha Christie novels used

| Novel | Year | Age | Dementia |
|---|---|---|---|
| Mysterious Affair at Styles (AC1) | 1920 | 30 | 0 |
| The Secret Adversary (AC2) | 1922 | 32 | 0 |
| Murder on the Links (AC3) | 1923 | 33 | 0 |
| The Man in the Brown Suit (AC4) | 1924 | 34 | 0 |
| The Secret of Chimneys (AC5) | 1925 | 35 | 0 |
| The Big Four (AC6) | 1927 | 37 | 0 |
| Mystery of the Blue Train (AC7) | 1928 | 38 | 0 |
| The Seven Dials Mystery (AC8) | 1929 | 39 | 0 |
| Giant's Bread (AC9) | 1930 | 40 | 0 |
| The Murder at the Vicarage (AC10) | 1930 | 40 | 0 |
| The Sittaford Mystery (AC11) | 1931 | 41 | 0 |
| Peril at End House (AC12) | 1932 | 42 | 0 |
| Murder on the Orient Express (AC13) | 1934 | 44 | 0 |
| Death in the Clouds (AC14) | 1935 | 45 | 0 |
| Cards on the Table (AC15) | 1936 | 46 | 0 |
| Death on the Nile (AC16) | 1937 | 47 | 0 |
| Appointment with Death (AC17) | 1938 | 48 | 0 |
| And Then There Were None (AC18) | 1939 | 49 | 0 |
| Evil Under the Sun (AC19) | 1941 | 51 | 0 |
| Five Little Pigs (AC20) | 1942 | 52 | 0 |
| Towards Zero (AC21) | 1944 | 54 | 0 |
| Sparkling Cyanide (AC22) | 1945 | 55 | 0 |
| Taken at the Flood (AC23) | 1948 | 58 | 0 |
| They Came to Baghdad (AC24) | 1951 | 61 | 0 |
| A Pocket Full of Rye (AC25) | 1953 | 63 | 0 |
| Dead Man's Folly (AC26) | 1956 | 66 | 0 |
| Ordeal by Innocence (AC27) | 1958 | 68 | 1 |
| Cat Among the Pigeons (AC28) | 1959 | 69 | 1 |
| The Mirror Crack'd from Side to Side (AC29) | 1962 | 72 | 1 |
| A Caribbean Mystery (AC30) | 1964 | 74 | 1 |
| Endless Night (AC31) | 1967 | 77 | 1 |
| By the Pricking of My Thumbs (AC32) | 1968 | 78 | 1 |
| Passenger to Frankfurt (AC33) | 1970 | 80 | 1 |
| Nemesis (AC34) | 1971 | 81 | 1 |
| Elephants Can Remember (AC35) | 1972 | 82 | 1 |
| Postern of Fate (AC36) | 1973 | 83 | 1 |

Terry Pratchett was born on April 28th, 1948 and died on March 12th, 2015. He wrote 41 novels set in his fantasy universe Discworld as well as other fantasy novels and books for children. His adult-aimed writing career began with the novel The Dark Side of the Sun in 1976, when he was 28. He began writing the Discworld novels at age 35. In 2007, he was diagnosed with Posterior Cortical Atrophy, a form of early-onset Alzheimer’s disease which tends to manifest in deterioration of visual abilities, including impaired object, person, and place recognition as well as trouble reacting to visual stimuli (Pratchett, 2008/2015). The symptoms of his disease were mostly physical (Grice, 2010), and he continued to write until a few months before his death. His final novel, The Shepherd’s Crown, was published after his death. Though all of his Discworld novels take place in the same universe, six are intended for ‘younger readers,’ and unfortunately for this analysis his last novel, as well as another novel he published after his AD diagnosis, were children’s novels (making two out of his five post-AD novels children’s novels). As mentioned above, the children’s novels were left out of the analysis for this study due to their overall lower LD and lack of enough data points for a separate sub-analysis. His last three adult novels are coded as +dementia, as they were all released after his diagnosis.

Table 3
Terry Pratchett novels used

| Novel | Year | Age | Dementia |
|---|---|---|---|
| The Colour of Magic (P1) | 1983 | 35 | 0 |
| The Light Fantastic (P2) | 1986 | 38 | 0 |
| Equal Rites (P3) | 1987 | 39 | 0 |
| Mort (P4) | 1987 | 39 | 0 |
| Guards Guards (P5) | 1989 | 41 | 0 |
| Pyramids (P6) | 1989 | 41 | 0 |
| Sourcery (P7) | 1989 | 41 | 0 |
| Wyrd Sisters (P8) | 1989 | 41 | 0 |
| Eric (P9) | 1990 | 42 | 0 |
| Moving Pictures (P10) | 1990 | 42 | 0 |
| Reaper Man (P11) | 1991 | 43 | 0 |
| Witches Abroad (P12) | 1991 | 43 | 0 |
| Lords and Ladies (P13) | 1992 | 44 | 0 |
| Small Gods (P14) | 1992 | 44 | 0 |
| Men at Arms (P15) | 1993 | 45 | 0 |
| Interesting Times (P16) | 1994 | 46 | 0 |
| Maskerade (P17) | 1994 | 46 | 0 |
| Soul Music (P18) | 1994 | 46 | 0 |
| Feet of Clay (P19) | 1996 | 48 | 0 |
| Hogfather (P20) | 1996 | 48 | 0 |
| Jingo (P21) | 1997 | 49 | 0 |
| Carpe Jugulum (P22) | 1998 | 50 | 0 |
| The Last Continent (P23) | 1998 | 50 | 0 |
| The Fifth Elephant (P24) | 1999 | 51 | 0 |
| The Truth (P25) | 2000 | 52 | 0 |
| The Last Hero (P26) | 2001 | 53 | 0 |
| Thief of Time (P27) | 2001 | 53 | 0 |
| Night Watch (P28) | 2002 | 54 | 0 |
| Monstrous Regiment (P29) | 2003 | 55 | 0 |
| Going Postal (P30) | 2004 | 56 | 0 |
| Thud (P31) | 2005 | 57 | 0 |
| Making Money (P32) | 2007 | 59 | 0 |
| Unseen Academicals (P33) | 2009 | 61 | 1 |
| Snuff (P34) | 2011 | 63 | 1 |
| Raising Steam (P35) | 2013 | 65 | 1 |

P. D. James (born August 3rd, 1920; died November 27th, 2014) wrote 19 novels, 18 of which were crime novels, before her death at age 94. Fourteen of her murder mysteries have the same central character, Detective Adam Dalgliesh (Stasio, 2014). She published her first novel at age 42 and her last at age 91, three years before her death. She aged healthily and is one of the controls for this study.

Table 4
P.D. James novels used

| Novel | Year | Age |
|---|---|---|
| Cover Her Face (J1) | 1962 | 42 |
| A Mind to Murder (J2) | 1963 | 43 |
| Unnatural Causes (J3) | 1967 | 47 |
| Shroud for a Nightingale (J4) | 1971 | 51 |
| An Unsuitable Job for a Woman (J5) | 1972 | 52 |
| The Black Tower (J6) | 1975 | 55 |
| Death of an Expert Witness (J7) | 1977 | 57 |
| Innocent Blood (J8) | 1980 | 60 |
| The Skull Beneath the Skin (J9) | 1982 | 62 |
| Taste for Death (J10) | 1986 | 66 |
| Devices and Desires (J11) | 1989 | 69 |
| The Children of Men (J12) | 1992 | 72 |
| Original Sin (J13) | 1994 | 74 |
| A Certain Justice (J14) | 1997 | 77 |
| Death in Holy Orders (J15) | 2001 | 81 |
| The Murder Room (J16) | 2003 | 83 |
| The Lighthouse (J17) | 2005 | 85 |
| The Private Patient (J18) | 2008 | 88 |

Mary Higgins Clark is the other healthy control for this study. She was born on December 24th, 1927 and is 90 years old at the time of writing. Though she began writing short stories in her early 20s and sold her first at age 29, her first suspense novel was not published until she was 48, in 1975. She is the sole author of 35 suspense novels and has co-authored several more (“About Mary Higgins Clark,” n.d.). Her most recent novel was published on March 28th, 2017.

Table 5
Mary Higgins Clark novels used

| Novel | Year | Age |
|---|---|---|
| Where Are the Children (C1) | 1975 | 48 |
| A Stranger Is Watching (C2) | 1977 | 50 |
| The Cradle Will Fall (C3) | 1980 | 53 |
| A Cry in the Night (C4) | 1982 | 55 |
| Stillwatch (C5) | 1984 | 57 |
| Weep No More My Lady (C6) | 1987 | 60 |
| While My Pretty One Sleeps (C7) | 1989 | 62 |
| All Around the Town (C8) | 1992 | 65 |
| Let Me Call You Sweetheart (C9) | 1995 | 68 |
| Pretend You Don't See Her (C10) | 1997 | 70 |
| We'll Meet Again (C11) | 1999 | 72 |
| On the Street Where You Live (C12) | 2001 | 74 |
| Nighttime Is My Time (C13) | 2004 | 77 |
| I Heard That Song Before (C14) | 2007 | 80 |
| The Shadow of Your Smile (C15) | 2010 | 83 |
| The Lost Years (C16) | 2012 | 85 |
| I've Got You Under My Skin (C17) | 2014 | 87 |
| The Melody Lingers On (C18) | 2015 | 88 |
| As Time Goes By (C19) | 2016 | 89 |
| All by Myself Alone (C20) | 2017 | 90 |

Virginia Woolf (January 25th, 1882 - March 28th, 1941) suffered from mental illness from a young age, exacerbated by deaths of loved ones throughout her life. Her first novel was published in 1915, when she was 33, but she had been writing and revising it for a few years prior. Her first suicide attempt was in 1913. Biographers tend to assume that Woolf suffered from manic depression or bipolar disorder. She wrote in multiple genres throughout her life--mostly novels, short fiction, and essays, some of which were book-length. Although Woolf suffered from bouts of manic depression throughout her life, her worst breakdowns are reported to have happened before she started publishing her novels, between 1910 and 1915 (Reid, 2007). After her suicide attempt in 1913 and the accompanying mental fallout, Woolf recovered and kept her manic depression mostly in check thereafter, until the suicide attempt that took her life at age 59 (Reid, 2007). Further, in 1922 she connected with Vita Sackville-West, who became a close friend, confidant, and lover. Woolf wrote her novel Orlando, which is generally regarded as her most lighthearted publication, for Sackville-West in the sense that Orlando, the titular character, was heavily inspired by her. Woolf’s love for Sackville-West is supposedly evident in the style of the writing (Simkin, 2015). Though Woolf’s mental health was delicate throughout her life, it was relatively stable until 1941, when her husband became concerned enough to attempt to intervene (Simkin, 2015). Thus, only her last novel was coded as +suicidal. In March 1941, at age 59, she wrote a suicide note to her husband and drowned herself. The novel she had been working on before her death, Between the Acts, was published posthumously a few months later without any edits from third parties. Her final novel thematically deals with the value of art, which was one of the issues that troubled her amidst World War II and contributed to her suicide (Reid, 2007). Her fiction writing is generally considered to fall under the umbrella of modernism, and her style and intended audience are consistent except for Orlando.

Table 6
Virginia Woolf novels used

| Novel | Year | Age | Suicidal |
|---|---|---|---|
| The Voyage Out (W1) | 1915 | 33 | 0 |
| Night and Day (W2) | 1919 | 37 | 0 |
| Jacob's Room (W3) | 1922 | 40 | 0 |
| Mrs. Dalloway (W4) | 1925 | 43 | 0 |
| To the Lighthouse (W5) | 1927 | 45 | 0 |
| Orlando (W6) | 1928 | 46 | 0 |
| The Waves (W7) | 1931 | 49 | 0 |
| The Years (W8) | 1937 | 55 | 0 |
| Between the Acts (W9) | 1941 | 59 | 1 |

Kurt Vonnegut was born on November 11, 1922 and died on April 11, 2007 (age 84). His mother committed suicide when he was 22. He published his first novel in 1952 at age 30. He suffered from chronic depression throughout his life and unsuccessfully attempted suicide in 1984. Vonnegut kept writing after the suicide attempt, publishing his last book in 1997, 10 years before his death. His bibliography includes 14 novels and five nonfiction books, in addition to short stories and plays. His fiction and nonfiction deal with outsiders, politics, and the darker parts of human nature, and his fiction often incorporates science fiction elements (Grossman, 2007). The novels after Slaughterhouse Five up through the first novel published after his suicide attempt (Galapagos) were coded as +suicidal because critical reception of his first two post-Slaughterhouse Five novels was lukewarm. Further, during this period of his life, Vonnegut’s marriage ended in divorce (in 1971) and he began taking Ritalin after his son had a mental breakdown (in 1972).

Table 7
Kurt Vonnegut novels used

| Novel | Year | Age | Suicidal |
|---|---|---|---|
| Player Piano (V1) | 1952 | 30 | 0 |
| Mother Night (V2) | 1961 | 39 | 0 |
| Cat's Cradle (V3) | 1963 | 41 | 0 |
| God Bless You, Mr. Rosewater (V4) | 1965 | 43 | 0 |
| Slaughterhouse-Five (V5) | 1969 | 47 | 0 |
| Breakfast of Champions (V6) | 1973 | 51 | 1 |
| Slapstick or Lonesome No More (V7) | 1976 | 54 | 1 |
| Jailbird (V8) | 1979 | 57 | 1 |
| Deadeye Dick (V9) | 1982 | 60 | 1 |
| Galapagos (V10) | 1985 | 63 | 1 |
| Bluebeard (V11) | 1987 | 65 | 0 |
| Hocus Pocus (V12) | 1990 | 68 | 0 |
| Timequake (V13) | 1997 | 75 | 0 |

Analysis

All of the calculations and comparisons described below, as well as in the Results and Discussion sections, were performed on a within-author basis where applicable. Between-author comparisons were not made except in hierarchical cluster analysis due to the individualized nature of novel writing and the impossibility of ensuring the exact uniformity of genre and perspective that would be necessary to compare texts from different authors. Further, this study is concerned with change in individuals’ LD over time corresponding to the development of dementia or suicidal depression, not across-author generalizable signals of the onset of such conditions.

Lexical Diversity

The following calculations of Jarvis’s multi-faceted model of lexical diversity were performed on lemmatized text (.txt) files of the novels using scripts written in Python. Texts were lemmatized using the free software TreeTagger, with the setting enabled to use tokens in place of unknown lemmas. This results in more accurate calculations than the default TreeTagger setting, which is to label unknown lemmas (which can include unfamiliar proper names as well as mistakes in the text like two words accidentally left together without spaces, e.g. “twowords”) as simply “<unknown>.” Labeling unknown lemmas as “<unknown>” would likely result in falsely deflated LD indices, since each instance of “<unknown>” would be counted as a repetition. Instead, I chose to err on the side of caution, perhaps occasionally falsely inflating LD numbers with “new” types made by the accidental fusing of two tokens that would have already been in the text (“twowords” would be its own type rather than being counted as tokens of “two” and “words”). Due to the extremely large volume of data, these occasional mistakes in processing and tagging are not expected to have impacted the results of this study in a significant way. A small sample of a data file can be found in Appendix A.

The first two facets of Jarvis’s lexical diversity model are simple: Volume is the total number of tokens in a text, and Abundance is the total number of types. These two calculations were performed on the data but were not used in further statistical analyses due to their extreme sensitivity (or direct measuring, in the case of tokens) to text length and the lack of uniformity of length in the novels. In place of total types, the effective number of types was calculated (exponential of Shannon entropy) for each novel and used as a proxy for Abundance in further statistical calculations. For details on the theoretical basis of effective types, which is derived from ecology, see Jarvis (2013a). The other parts of Jarvis’s lexical diversity model are described below and were used in all statistical analyses in this study.
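The effective number of types mentioned above (the exponential of Shannon entropy) can be sketched as follows. This is an illustrative reimplementation, not the script used in this study; the function name `effective_types` and the use of natural logarithms are assumptions.

```python
import math
from collections import Counter


def effective_types(lemmas):
    """Exponential of Shannon entropy over the type frequencies (illustrative)."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Shannon entropy with natural logarithms.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return math.exp(entropy)
```

When every type occurs equally often, the value equals the raw number of types; the more the tokens are concentrated in a few types, the further it falls below the raw type count.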

Variety

Variety was operationalized as a moving-average type-token ratio, calculated by taking the number of types for words 1-50 of the text, then words 2-51, 3-52, and so on to the end of the text; these values are then averaged to arrive at the final Variety value for a text. This was done on each novel with windows of 50, 100, 250, 500, 1000, 2500, and 5000 words. Simple graphs of the Variety values for each token window size were compared, and a window size of 5000 tokens was selected for statistical analysis.
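The moving-window calculation described above might be sketched as follows. This is a minimal illustration rather than the original script; the function name and the handling of texts shorter than one window are assumptions. It returns the mean number of types per window, which can be divided by the window size to express the result as a ratio.

```python
from collections import Counter


def moving_average_types(tokens, window=50):
    """Mean number of distinct types over all overlapping windows (illustrative)."""
    n = len(tokens)
    if n == 0:
        return 0.0
    if n <= window:
        # Assumption: a text shorter than one window is treated as one window.
        return float(len(set(tokens)))
    counts = Counter(tokens[:window])   # type counts in the current window
    total_types = len(counts)           # running sum of per-window type counts
    n_windows = 1
    for i in range(window, n):
        leaving = tokens[i - window]    # token that slides out of the window
        counts[leaving] -= 1
        if counts[leaving] == 0:
            del counts[leaving]
        counts[tokens[i]] += 1          # token that slides into the window
        total_types += len(counts)
        n_windows += 1
    return total_types / n_windows
```

Updating the counts incrementally avoids rebuilding the type set for each window, which matters for novel-length texts and window sizes as large as 5000.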

Dispersion

Using the same text-window sizes as the variety measure, the Dispersion script counts the number of ‘clusters,’ or repetitions of the same type within a window. From this, the average number of clusters is calculated and relativized to clusters per 100 words. The five most common words in all novels combined (the, be, and, to, a) were left out of consideration for cluster words. Similar comparisons were made for Dispersion window sizes as for Variety, and a window size of 20 was selected for further analysis. In counting and averaging clusters of the same type within a window, Dispersion measures how well-dispersed throughout a text tokens of the same type are. Use of a wide variety of types would give an impression of higher lexical diversity if tokens of the same type did not all occur within the same paragraph or sentence, for example.
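One possible reading of the cluster count is sketched below. It is only an approximation under stated assumptions, since the text does not specify exactly how a cluster is delimited: here a cluster is counted each time a lemma recurs fewer than `window` tokens after its previous occurrence, and the count is relativized to the total number of tokens. The function name and the `IGNORED` set are illustrative.

```python
# The five most frequent lemmas, excluded from cluster counting as in the text.
IGNORED = frozenset({"the", "be", "and", "to", "a"})


def clusters_per_100(lemmas, window=20, ignored=IGNORED):
    """Recurrences of a lemma within `window` tokens, per 100 tokens (assumed reading)."""
    if not lemmas:
        return 0.0
    last_seen = {}   # lemma -> index of its most recent occurrence
    clusters = 0
    for i, lemma in enumerate(lemmas):
        if lemma in ignored:
            continue
        previous = last_seen.get(lemma)
        if previous is not None and i - previous < window:
            clusters += 1
        last_seen[lemma] = i
    return 100.0 * clusters / len(lemmas)
```

Under this reading, higher values mean more local repetition and therefore lower perceived lexical diversity.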

Evenness

Evenness is derived from Shannon’s entropy index to measure how evenly-balanced each type is relative to the number of tokens representing each type. A text can have a high type-token ratio and still have the majority of repeated words weighted toward one type. A text with a more even distribution of tokens across types (for example having one to three tokens per type in a paragraph having 10 types) is likely to be perceived as more lexically diverse than a text that has most of its tokens occupied by fewer types (for example 10 types with one type having 10 tokens and the other types having one to two tokens). Shannon’s entropy index is calculated by, for each type in a text, dividing the number of tokens of that type by the total tokens in the text to find the proportion of total tokens that a given type accounts for. That proportion is then multiplied by the natural logarithm (base e) of the proportion. The resulting number is added to the current Shannon index (which starts at 0). The resulting Shannon index is then multiplied by -1 and divided by the natural logarithm of the total types in the text, resulting in the final Evenness value for the text.
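The Evenness steps described above translate fairly directly into code. The following is a minimal sketch, not the original script; the function name and the convention of returning 0.0 for texts with fewer than two types (where the logarithm of the type count is zero) are assumptions.

```python
import math
from collections import Counter


def evenness(lemmas):
    """Shannon index normalised by the log of the type count (illustrative)."""
    counts = Counter(lemmas)
    total = sum(counts.values())
    n_types = len(counts)
    if n_types < 2:
        # Assumption: avoid dividing by log(1) == 0 for degenerate texts.
        return 0.0
    shannon = 0.0
    for count in counts.values():
        proportion = count / total
        shannon += proportion * math.log(proportion)
    return -shannon / math.log(n_types)
```

The value is 1 when all types are equally frequent and approaches 0 as the tokens pile up on a single type.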

Specialness

Jarvis’s Specialness calculation was a count of words identified as ‘special’--meaning they contribute more to the LD of a text than other words--by human judgments and statistical analysis (Jarvis, 2015). An automated approximation of this process was developed for this study and correlates well with Jarvis’s model (personal communication, 2017). Similar to what Le (2010) did, my Specialness measure uses WordNet’s hypernym depth function to approximate the specificity of a given word. If that word is a noun, it must have a hypernym depth of at least seven. For verbs, the minimum depth is two. Further, to be counted as a special word, nouns must not be in the 1000 most frequent words in English according to COCA, and verbs must not be in the 900 most frequent English words. Other words counted as special include any adjectives or adverbs not in the 1000 most frequent English words (no word classes besides verbs and nouns have depth in WordNet) and any modals and prepositions. These cutoffs and inclusions were decided based on Jarvis’s human-rated Specialness data. The final Specialness value for a text is the total number of special types in the text.
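The decision rule for special words can be sketched as below. This is an illustration under assumptions, not the four-script process used in this study: the part-of-speech labels (`"noun"`, `"verb"`, `"adj"`, `"adv"`, `"modal"`, `"prep"`), the function names, and the plain dictionaries standing in for COCA frequency ranks and WordNet hypernym depths are all hypothetical. In practice the depths would come from a WordNet interface and the ranks from a COCA frequency list.

```python
def is_special(pos, rank, depth):
    """Apply the cutoffs from the text to one lemma (illustrative).

    rank:  frequency rank (1 = most frequent) or None if not in the list
    depth: hypernym depth or None if the lemma has no depth
    """
    def beyond(cutoff):
        return rank is None or rank > cutoff

    if pos in ("modal", "prep"):
        return True
    if pos == "noun":
        return depth is not None and depth >= 7 and beyond(1000)
    if pos == "verb":
        return depth is not None and depth >= 2 and beyond(900)
    if pos in ("adj", "adv"):
        return beyond(1000)
    return False


def specialness(tagged_lemmas, ranks, depths):
    """Number of special types among (lemma, pos) pairs."""
    special = {
        (lemma, pos)
        for lemma, pos in tagged_lemmas
        if is_special(pos, ranks.get(lemma), depths.get((lemma, pos)))
    }
    return len(special)
```

Counting distinct `(lemma, pos)` pairs rather than tokens mirrors the statement that the final value is the number of special types.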

Disparity

Disparity “estimates the mean number of synonyms in each text based on the WordNet semantic sense index (wordnet.princeton.edu)” (Jarvis, 2013a, p. 102). Jarvis was not entirely satisfied with this measure, however, as it seems to be functioning as an index of synonym usage that correlates positively with LD. Even though the writers who use many synonyms are using semantically similar vocabulary, the writers’ use of synonyms allows them to avoid repetition, thus lowering redundancy and increasing the perceived LD of the text. This measure of Disparity was used for this study.

Of the calculations described above, all but Specialness were performed with slightly modified versions of Python scripts written by Jarvis or with my Python translations of scripts originally written by Jarvis in the programming language Perl. My modifications kept the original algorithms and adapted them to process text files in bulk. The Specialness scripts (it is a four-script process) were developed by me with Jarvis’s guidance. Additional calculations are described below and were performed with scripts written by me with programming help from Zak Thompson of Ohio University.

Other measures

Verb-Argument Constructions

A further measure was experimentally used that is not included in Jarvis’s model of lexical diversity, but theoretically aligns with Zipf’s concept of variegation. Instead of being limited to the individual word level, Zipf’s law seems to also apply within constructions. Specifically, Ellis, Römer, and O’Donnell (2016) have shown Zipfian distribution in the frequency ranks of specific verbs used to fill common verb-argument constructions (VACs). If verb-argument constructions behave in a Zipfian way, then choosing a less common verb to fill a verb-argument construction may increase the perceived lexical diversity of a text. Thus, VACs following the verb-preposition-noun pattern were extracted from each text. First, the Stanford Parser was run on each novel using Python and Java code from Kris Kyle of the University of Hawaii. The output was parsed with my own Python code, which constructed a frequency list of verbs used for each VAC, with the specific prepositions distinguishing the VACs as types (e.g. for the verb-across-noun type, a frequency list of verbs used to fill the construction was made). These verb frequency lists were then analyzed for Zipfian behavior. Due to a lack of interesting patterns for the VACs, they were not used further in the analysis and will not be discussed further in this study.
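For readers who want a concrete picture of the verb frequency lists, a minimal sketch follows. It assumes the verb-preposition-noun triples have already been extracted by a parser; the function names are hypothetical, and the log-log slope is only one simple way to inspect Zipf-like behavior, not necessarily the check used in this study.

```python
import math
from collections import Counter, defaultdict


def verb_frequency_lists(triples):
    """Map each preposition to its verbs, ordered from most to least frequent."""
    by_prep = defaultdict(Counter)
    for verb, prep, _noun in triples:
        by_prep[prep][verb] += 1
    return {prep: counts.most_common() for prep, counts in by_prep.items()}


def zipf_slope(frequencies):
    """Least-squares slope of log(frequency) against log(rank)."""
    if len(frequencies) < 2:
        raise ValueError("need at least two frequencies")
    ordered = sorted(frequencies, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(freq) for freq in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

A slope near -1 on such a plot is the classic Zipfian signature, although with the short verb lists that single constructions yield, the estimate would be quite noisy.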

It-v-adj-to/that

The last computational measure used in this study is lexicosyntactic. Inspired by a presentation given at the 2017 Georgetown University Roundtable conference by Sakol Suethanapornkul, I wrote a Python script to extract “it”-(modal)-verb-(adverb)-adjective-“to”/“that” constructions from each novel. Parentheses in the construction description indicate optional parts. Examples of constructions that would be extracted include “it seems important to,” “it would usually seem that,” and “it may be surprising that.” These constructions were then sorted into categories of every possible structural permutation, resulting in eight construction types (e.g. the base construction using “that”: “it”-verb-adjective-“that”). In this study, higher frequency of these constructions was considered to be indicative of greater linguistic complexity. A total count of all such constructions was calculated as well. Simple graphs of these counts were considered, and the overall count and the count for “it”-verb-adjective-“that” were relativized to novel length and used in further analysis. Below, these two counts will be generally referred to as it-v-adj-to/that values.
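A simplified version of this extraction can be sketched over part-of-speech tagged tokens. The sketch is illustrative only: the coarse tags `MD` (modal), `V` (verb), `RB` (adverb), and `JJ` (adjective) are stand-ins for whatever tagset the tagger produces, and the function names are hypothetical. It follows the pattern exactly as stated, with the optional modal before the verb and the optional adverb after it.

```python
from collections import Counter


def extract_it_constructions(tagged):
    """Count it-(modal)-verb-(adverb)-adjective-to/that patterns by structural type."""
    found = Counter()
    n = len(tagged)
    for i, (word, _tag) in enumerate(tagged):
        if word.lower() != "it":
            continue
        parts = ["it"]
        j = i + 1
        if j < n and tagged[j][1] == "MD":      # optional modal
            parts.append("modal")
            j += 1
        if j < n and tagged[j][1] == "V":       # obligatory verb
            parts.append("verb")
            j += 1
        else:
            continue
        if j < n and tagged[j][1] == "RB":      # optional adverb
            parts.append("adverb")
            j += 1
        if j < n and tagged[j][1] == "JJ":      # obligatory adjective
            parts.append("adjective")
            j += 1
        else:
            continue
        if j < n and tagged[j][0].lower() in ("to", "that"):
            parts.append(tagged[j][0].lower())
            found["-".join(parts)] += 1
    return found


def per_million(count, n_tokens):
    """Relative frequency per 1,000,000 tokens."""
    return 1_000_000 * count / n_tokens if n_tokens else 0.0
```

With two optional slots and two possible final words, the labels produced by this sketch fall into eight structural types, matching the number of categories described above.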

Delta Prime

For each novel, Delta Prime (Hoover, 2004a, 2004b) values were calculated by deriving z-scores for each LD measure and the it-v-adj-to/that values. Those z-scores were then added and averaged if they met the following criteria: 1) their direction, positive or negative, must correspond with the direction expected for lower overall lexical diversity and linguistic complexity. For LD measures, this corresponded to negative values for all measures except Dispersion (clusters). Dispersion would indicate lower LD for higher values because the resulting index indicates the amount of clusters, or repetition of the same types, within the given token window. For it-v-adj-to/that values, negative z-scores were also considered to represent lower LD/complexity. 2) the magnitude of the z-scores must have an absolute value greater than 0.5 to indicate a difference from the mean that is less likely to be coincidental. If the z-scores for a given measure failed to meet the above criteria, a zero was entered in their place for the final average calculation. The resulting Delta Prime values were considered to represent how far below the author’s ‘normal’ tendencies for complexity and lexical diversity each novel was.
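The filtering and averaging of z-scores can be illustrated with the sketch below. Several details are assumptions rather than statements of the original procedure: z-scores use the sample standard deviation, the qualifying scores are averaged by magnitude (so that larger values mean further below the author’s norm), and the measure names are placeholders.

```python
import statistics


def z_scores(values):
    """Within-author z-scores using the sample standard deviation (assumption)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd if sd else 0.0 for v in values]


def delta_prime(measures, higher_means_lower_ld=("dispersion",), threshold=0.5):
    """One value per novel from a dict of measure name -> per-novel values."""
    z = {name: z_scores(values) for name, values in measures.items()}
    n_novels = len(next(iter(measures.values())))
    results = []
    for i in range(n_novels):
        kept = []
        for name, scores in z.items():
            score = scores[i]
            # Direction check: positive for Dispersion, negative for the rest.
            if name in higher_means_lower_ld:
                toward_decline = score > 0
            else:
                toward_decline = score < 0
            if toward_decline and abs(score) > threshold:
                kept.append(abs(score))
            else:
                kept.append(0.0)   # failed a criterion: a zero is entered instead
        results.append(sum(kept) / len(kept))
    return results
```

For example, with made-up values `{"variety": [10, 10, 10, 4], "dispersion": [5, 5, 5, 11]}`, only the fourth novel deviates in the decline direction on both measures, so it alone receives a non-zero value.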

Statistical analysis

After the above-described LD and other measures were calculated for each novel, several inferential statistical tests were run for each author. All of the following statistical tests were conducted using the software SPSS. First, for each dependent variable (LD, it-v-adj-to/that, and Delta Prime) a linear regression was calculated with age as the independent variable. Next, exploratory analysis was conducted with all LD variables and the it-v-adj-to/that values as the variables in a hierarchical cluster analysis. Finally, confirmatory analysis was conducted via linear discriminant analysis with LD variables and it-v-adj-to/that values as independent variables and dementia or suicidal depression (depending on group) as the grouping variable. In the linear discriminant analysis, prior probabilities were set equal across groups rather than calculated from group size, since the purpose of this study is to determine if LD alone can separate +dementia novels from -dementia novels.


CHAPTER 4: RESULTS AND DISCUSSION

Individual variables and linear regression

The final selection of eight individual variables used in this study includes the following: LD factors – effective types (proxy for Abundance), MAT5000 (moving average of types per 5000 words; Variety), clusters20 (relativized per 100 words, number of types that recur within 20 tokens of each other; Dispersion), Evenness (described above), special types (types counted as “special” according to frequency, part of speech, and hypernym depth criteria described above; Specialness), semantic disparity (total number of sense tokens, i.e. word tokens sharing a sense according to WordNet’s sense index, divided by sense types; Disparity); and it-v-adj-to/that values – IT VERB ADJ THAT relative (relative frequency per 1,000,000 tokens of the construction “it [verb] [adjective] that”) and overall relative IT-V-A-T/T (relative frequency per 1,000,000 tokens of any possible construction from the pattern “it”-(modal)-verb-(adv)-adj-to/that). The raw results for each author are presented in the tables below, along with summary tables of p and R-square values from linear regression tests performed on each dependent variable with age as the independent variable.
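As a rough illustration of the moving-average idea behind MAT5000, the sketch below counts the distinct types in every window of a fixed number of tokens and averages those counts. It is only a sketch under assumptions: the function name is hypothetical, the window is assumed to advance one token at a time, and tokenization and any lemmatization are left to the caller, since those details of the original tool are not restated here.

```python
from collections import Counter


def moving_average_types(tokens, window=5000):
    """Mean number of distinct types over all windows of `window` tokens."""
    if len(tokens) < window:
        raise ValueError("text is shorter than the window")
    counts = Counter(tokens[:window])   # type frequencies in the current window
    total_types = len(counts)           # running sum of per-window type counts
    n_windows = 1
    for i in range(window, len(tokens)):
        outgoing = tokens[i - window]   # token leaving the window
        counts[outgoing] -= 1
        if counts[outgoing] == 0:
            del counts[outgoing]
        counts[tokens[i]] += 1          # token entering the window
        total_types += len(counts)
        n_windows += 1
    return total_types / n_windows


# Toy input with a window of 3 tokens instead of 5000.
toy_tokens = "a b a c b d".split()
print(moving_average_types(toy_tokens, window=3))
```

The sliding update keeps the cost linear in text length, which matters for novel-length inputs with a 5000-token window.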

A total of 63 individual linear regression tests were run: one per LD factor and it-v-adj-to/that variable (the latter will hereafter also be referred to as supplementary variables), per author, along with one linear regression test per author for Delta Prime values. To save space, rather than having separate columns for dementia and suicidal depression in the tables below, novels representing pathological conditions are indicated with a + beside their name code (e.g. AC35+ is Elephants Can Remember by Agatha Christie, which was coded +dementia). All scatterplots and accompanying regressions for variables with statistically significant trends not displayed in this section can be found in Appendix B. In all scatterplots, author age is on the x-axis. In all line graphs, novel name code is on the x-axis.

Control Group

P.D. James

As indicated in Tables 8 and 9 below and illustrated by Figure 1 below, the only significant decline in P.D. James’s LD over time is manifested in Evenness (p < .001). Semantic disparity and special types also showed significant trends, but their direction of change indicates an improvement in LD rather than a decline. Scatterplots for semantic disparity and special types can be found in Appendix B. The rest of the measures for James remained relatively stable, fluctuating lightly throughout her career but with no distinct upward or downward trend. Figure 2 is a line graph of James’s Delta Prime values, which clearly displays this fluctuation.


Table 8 Raw LD and it-v-adj-to-that values for P.D. James

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| J1 | 1962 | 42 | 461.49 | 1147.79 | 9.87 | 1567 | 0.711 | 1.232 | 229.65 | 791.03 |
| J2 | 1963 | 43 | 469.38 | 1150.15 | 9.44 | 1601 | 0.713 | 1.232 | 173.43 | 1065.35 |
| J3 | 1967 | 47 | 524.95 | 1240.24 | 8.89 | 1721 | 0.717 | 1.236 | 125.52 | 878.67 |
| J4 | 1971 | 51 | 523.99 | 1206.68 | 9.69 | 2042 | 0.706 | 1.246 | 199.08 | 787.27 |
| J5 | 1972 | 52 | 521.86 | 1236.43 | 10.11 | 1746 | 0.715 | 1.234 | 152.62 | 686.78 |
| J6 | 1975 | 55 | 576.68 | 1290.24 | 9.33 | 2239 | 0.712 | 1.256 | 178.58 | 780.1 |
| J7 | 1977 | 57 | 515.3 | 1223.93 | 10.06 | 1937 | 0.703 | 1.243 | 231.66 | 815.43 |
| J8 | 1980 | 60 | 561.5 | 1286.51 | 10.85 | 2229 | 0.703 | 1.259 | 183.69 | 709.71 |
| J9 | 1982 | 62 | 549.9 | 1261.11 | 9.93 | 2301 | 0.699 | 1.261 | 157.5 | 724.5 |
| J10 | 1986 | 66 | 527.34 | 1211.05 | 10.89 | 2498 | 0.687 | 1.266 | 164.24 | 713.58 |
| J11 | 1989 | 69 | 524.56 | 1201.16 | 10.74 | 2410 | 0.691 | 1.265 | 178.97 | 722.04 |
| J12 | 1992 | 72 | 556.77 | 1270.52 | 10.26 | 1889 | 0.711 | 1.247 | 95.57 | 753.91 |
| J13 | 1994 | 74 | 525.28 | 1191.52 | 10.63 | 2307 | 0.691 | 1.261 | 181.69 | 696.48 |
| J14 | 1997 | 77 | 497.75 | 1155.09 | 10.89 | 2131 | 0.69 | 1.258 | 121.6 | 691.21 |
| J15 | 2001 | 81 | 513.98 | 1202.39 | 9.83 | 2222 | 0.69 | 1.254 | 224.82 | 872.83 |
| J16 | 2003 | 83 | 509.21 | 1192.67 | 10 | 2145 | 0.693 | 1.254 | 216.33 | 802.5 |
| J17 | 2005 | 85 | 501.56 | 1178.98 | 9.66 | 2020 | 0.698 | 1.253 | 168.41 | 785.91 |
| J18 | 2008 | 88 | 503.96 | 1177.17 | 9.7 | 2034 | 0.697 | 1.255 | 138.9 | 731.06 |
| mean | | | 520.30 | 1212.42 | 10.04 | 2057.7 | 0.702 | 1.251 | 173.46 | 778.24 |

Table 9 Linear regression results for P.D. James; 18 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.025 | 0.528 | |
| Clusters20 | 0.102 | 0.196 | |
| Evenness | 0.545 | < .001 | decreasing |
| Semantic Disparity | 0.414 | 0.004 | increasing |
| Special types | 0.28 | 0.024 | increasing |
| Effective types | 0.005 | 0.788 | |
| IT VERB ADJ THAT relative | 0.022 | 0.558 | |
| overall relative IT-V-A-T/T | 0.132 | 0.138 | |
| Delta Prime | 0 | 0.966 | |

[Scatterplot: Evenness - P.D. James; x-axis: Age; y-axis: Evenness; linear trendline with R² = 0.5455]

Figure 1. Evenness scatterplot and linear trendline for P.D. James (p < .001).
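The Evenness regression for James can be approximated directly from the Age and Evenness columns of Table 8. The sketch below is an ordinary least-squares fit written in plain Python rather than the SPSS procedure used in the study; the function name is hypothetical, and no p value is computed because that would require a t-distribution routine outside the standard library.

```python
def linear_regression(x, y):
    """Ordinary least squares with one predictor.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Age and Evenness columns for P.D. James, copied from Table 8.
age = [42, 43, 47, 51, 52, 55, 57, 60, 62, 66, 69, 72, 74, 77, 81, 83, 85, 88]
evenness = [0.711, 0.713, 0.717, 0.706, 0.715, 0.712, 0.703, 0.703, 0.699,
            0.687, 0.691, 0.711, 0.691, 0.690, 0.690, 0.693, 0.698, 0.697]

slope, intercept, r_squared = linear_regression(age, evenness)
print(f"slope = {slope:.6f}, R-square = {r_squared:.3f}")
```

With the rounded values in Table 8, the fit has a negative slope and an R-square of about 0.545, in line with Table 9 and the trendline in Figure 1.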

[Line graph: Delta Prime - P.D. James; x-axis: novel name code (J1 to J18); y-axis: Delta Prime, scaled from 0 to 2]

Figure 2. Delta Prime values for P.D. James, showing deviation from author’s ‘normal.’ Units are average number of standard deviations (absolute z-score) from mean.

Mary Higgins Clark

Though four of the six LD factors for Clark show significant decline individually (Tables 10 and 11), her novels show a significant upward trend in the use of the IT VERB ADJ THAT construction, with the strongest R-square of any individual factor. Further, the overall linear pattern displayed by the Delta Prime values shown in Figure 3 is not statistically significant, and the data seem to suggest a slight rise in deviation from Clark’s baseline as she aged, with a return to relative normalcy in her most recent novel. Scatterplots with linear regressions for MAT5000, semantic disparity, special types, effective types, and IT VERB ADJ THAT relative can be found in Appendix B.

Table 10 Raw LD and it-v-adj-to-that values for Mary Higgins Clark

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| C1 | 1975 | 48 | 444.93 | 1128.97 | 11.22 | 1189 | 0.72 | 1.215 | 32.87 | 608.14 |
| C2 | 1977 | 50 | 464.72 | 1128.27 | 10.68 | 1129 | 0.721 | 1.211 | 27.93 | 586.63 |
| C3 | 1980 | 53 | 473.06 | 1159.91 | 11.1 | 1423 | 0.711 | 1.228 | 43.97 | 428.71 |
| C4 | 1982 | 55 | 433.31 | 1084.96 | 11.26 | 1367 | 0.706 | 1.225 | 43.18 | 345.43 |
| C5 | 1984 | 57 | 484.14 | 1200.87 | 9.89 | 1460 | 0.71 | 1.23 | 66.83 | 490.09 |
| C6 | 1987 | 60 | 473.38 | 1186.03 | 10.51 | 1461 | 0.711 | 1.236 | 58.42 | 432.29 |
| C7 | 1989 | 62 | 535.97 | 1269.28 | 9.26 | 1454 | 0.718 | 1.233 | 23.65 | 307.47 |
| C8 | 1992 | 65 | 467.28 | 1165.41 | 9.87 | 1301 | 0.713 | 1.224 | 82.46 | 506.56 |
| C9 | 1995 | 68 | 452.12 | 1139.81 | 10.59 | 1389 | 0.71 | 1.232 | 210.81 | 737.84 |
| C10 | 1997 | 70 | 434.71 | 1096.43 | 10.39 | 1132 | 0.713 | 1.216 | 127.18 | 546.88 |
| C11 | 1999 | 72 | 432.99 | 1090.06 | 10.92 | 1459 | 0.702 | 1.231 | 124.83 | 480.12 |
| C12 | 2001 | 74 | 473.29 | 1159.6 | 9.8 | 1333 | 0.715 | 1.226 | 92.27 | 426.76 |
| C13 | 2004 | 77 | 434.47 | 1076.88 | 10.7 | 1368 | 0.702 | 1.226 | 122.93 | 501.19 |
| C14 | 2007 | 80 | 406.62 | 1062.68 | 10.79 | 1197 | 0.706 | 1.221 | 100.27 | 579.31 |
| C15 | 2010 | 83 | 432.32 | 1083.69 | 10.43 | 1207 | 0.711 | 1.221 | 132.19 | 440.63 |
| C16 | 2012 | 85 | 407.7 | 1045.05 | 10.87 | 1137 | 0.711 | 1.214 | 181.75 | 690.66 |
| C17 | 2014 | 87 | 423.56 | 1076.94 | 10.61 | 1120 | 0.716 | 1.212 | 104.85 | 537.34 |
| C18 | 2015 | 88 | 425.38 | 1088.16 | 11.19 | 1108 | 0.716 | 1.209 | 101.18 | 650.47 |
| C19 | 2016 | 89 | 424.5 | 1037.45 | 11.32 | 1012 | 0.722 | 1.208 | 124.01 | 744.04 |
| C20 | 2017 | 90 | 436.45 | 1117.83 | 10.23 | 1042 | 0.721 | 1.214 | 111.43 | 604.9 |
| mean | | | 448.045 | 1119.91 | 10.58 | 1264.4 | 0.713 | 1.221 | 95.65 | 532.27 |


Table 11 Linear regression results for Mary Higgins Clark; 20 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.345 | 0.007 | decreasing |
| Clusters20 | 0.005 | 0.774 | |
| Evenness | 0.002 | 0.848 | |
| Semantic Disparity | 0.204 | 0.046 | decreasing |
| Special types | 0.318 | 0.01 | decreasing |
| Effective types | 0.344 | 0.007 | decreasing |
| IT VERB ADJ THAT relative | 0.435 | 0.002 | increasing |
| overall relative IT-V-A-T/T | 0.174 | 0.063 | |
| Delta Prime | 0.002 | 0.848 | |

[Line graph: Delta Prime - Mary Higgins Clark; x-axis: novel name code (C1 to C20); y-axis: Delta Prime, scaled from 0 to 2]

Figure 3. Delta Prime values for Mary Higgins Clark, showing deviation from author’s ‘normal.’ Units are average number of standard deviations (absolute z-score) from mean.

Overall, although the control group authors show some significant decreasing trends in individual LD variables and supplementary variables, their Delta Prime values and lack of significant trends therein do not seem to suggest any decline in lexical ability more drastic than the natural slight decline that comes with aging.

Dementia Group

If lexical diversity, and particularly those factors identified by Jarvis and used in this study, can predict dementia, then we should expect to see a drop in LD values (except clusters20, in which an increase would indicate lower diversity) and supplementary variable values in the novels that were composed when the author had dementia. However, different types of dementia may have different lexical manifestations (and the same type of dementia may manifest differently depending on the person) and therefore affect different individual variables. So, the Delta Prime value, which indicates how far below the mean the writing in a particular novel is, is likely to be more indicative of overall decline than a single factor. Delta Prime is expressed as the average of absolute z-scores across factors where the z-scores directionally correspond with decline in ability and their absolute magnitude is greater than 0.5. As such, some interesting or generally representative graphs for individual variables are presented below, but as illustrated by the regression table for Mary Higgins Clark above, Delta Prime carries more weight in distinguishing abnormal trends since it is a conglomerate value calculated from notable deviations from the means in the individual variables.

Iris Murdoch

As can be observed in Table 12 below, Murdoch’s final novel appears to be substantially different from its predecessors in most of the variables being examined, especially in special types, clusters, and overall relative IT-V-A-T/T. This is confirmed by the p values in Table 13. Notably, the trend for special types is upward, but the scatterplot (Figure 4) clearly shows Jackson’s Dilemma (M20) as an outlier, with The Green Knight (M19) showing less extreme, but still distinct, trend defiance. These Specialness results can be compared to the lexical specificity measures Le et al. (2011) used. They counted uses of indefinite nouns and high-frequency words and found that Murdoch’s use of both fluctuated but ultimately declined with age, which is similar to the overall Specialness in her writing increasing with her age. However, their data did not capture the extreme and trend-defying drop in this component of lexical diversity in Jackson’s Dilemma. Perhaps Specialness is a better measure of the aspect of LD that Le et al. (2011) were trying to capture.

Table 12 Raw LD and it-v-adj-to-that values for Iris Murdoch

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| M1 | 1954 | 35 | 455.75 | 1121.82 | 11.62 | 1705 | 0.696 | 1.256 | 187.2 | 817.77 |
| M2 | 1955 | 36 | 479.32 | 1115.53 | 11.45 | 1736 | 0.703 | 1.254 | 36.94 | 387.92 |
| M3 | 1958 | 39 | 502.4 | 1189.02 | 9.77 | 2108 | 0.703 | 1.261 | 171.26 | 573.73 |
| M4 | 1961 | 42 | 425.04 | 1135.98 | 13.04 | 1615 | 0.705 | 1.237 | 172.96 | 678.52 |
| M5 | 1962 | 43 | 482.99 | 1169.86 | 12.81 | 2110 | 0.698 | 1.259 | 88.75 | 505.87 |
| M6 | 1963 | 44 | 462.45 | 1108.18 | 11.76 | 1794 | 0.709 | 1.245 | 95.44 | 349.95 |
| M7 | 1964 | 45 | 429.65 | 1158.22 | 12.82 | 1273 | 0.723 | 1.215 | 82.07 | 615.52 |
| M8 | 1966 | 47 | 470.29 | 1153.97 | 13.25 | 1710 | 0.71 | 1.244 | 171.27 | 623.93 |
| M9 | 1968 | 49 | 491.26 | 1126.56 | 13.03 | 2112 | 0.699 | 1.255 | 136.68 | 611.04 |
| M10 | 1969 | 50 | 477.09 | 1126.71 | 13.65 | 1750 | 0.707 | 1.247 | 52.43 | 545.24 |
| M11 | 1970 | 51 | 469.53 | 1101.83 | 13.62 | 2289 | 0.688 | 1.262 | 67.35 | 599.43 |
| M12 | 1973 | 54 | 476.75 | 1129.98 | 14.32 | 2471 | 0.684 | 1.272 | 70.47 | 461.25 |
| M13 | 1974 | 55 | 487.94 | 1138.16 | 14.06 | 2354 | 0.691 | 1.263 | 28.81 | 338.54 |
| M14 | 1976 | 57 | 465.82 | 1111.88 | 14.45 | 2121 | 0.69 | 1.262 | 75.21 | 451.25 |
| M15 | 1978 | 59 | 482.81 | 1139.72 | 13.46 | 2765 | 0.679 | 1.279 | 102.83 | 553.34 |
| M16 | 1983 | 64 | 573.01 | 1209.89 | 12.87 | 3073 | 0.689 | 1.292 | 76.11 | 389.52 |
| M17 | 1985 | 66 | 483.46 | 1112.41 | 13.85 | 2591 | 0.682 | 1.275 | 53.76 | 337.25 |
| M18 | 1987 | 68 | 517.89 | 1151.72 | 13.53 | 2837 | 0.684 | 1.286 | 85.57 | 400.83 |
| M19+ | 1993 | 74 | 485.01 | 1098.69 | 14.1 | 2349 | 0.689 | 1.268 | 81.51 | 290.38 |
| M20+ | 1995 | 76 | 444.55 | 1043.14 | 14.19 | 1317 | 0.714 | 1.224 | 99.9 | 144.3 |
| mean | | | 478.15 | 1132.16 | 13.08 | 2104 | 0.697 | 1.258 | 96.83 | 483.78 |


Table 13 Linear regression results for Iris Murdoch; 20 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.105 | 0.163 | |
| Clusters20 | 0.481 | 0.001 | increasing |
| Evenness | 0.16 | 0.081 | |
| Semantic Disparity | 0.112 | 0.15 | |
| Special types | 0.201 | 0.047 | increasing |
| Effective types | 0.092 | 0.193 | |
| IT VERB ADJ THAT relative | 0.145 | 0.097 | |
| overall relative IT-V-A-T/T | 0.502 | < .001 | decreasing |
| Delta Prime | 0.36 | 0.005 | increasing |

[Scatterplot: Special types - Iris Murdoch; x-axis: Age; y-axis: Special types; linear trendline with R² = 0.2014; M20 labeled as an outlier]

Figure 4. Linear regression for special types for Iris Murdoch (p = .047).

Despite somewhat inconsistent results in the individual component variables in terms of significant trends, Murdoch’s Delta Prime values show a statistically significant trend (p = .005) indicating an overall deviation away from her baseline consistent with what is expected from individuals with Alzheimer’s disease (Le et al., 2011; van Velzen et al., 2014). These Delta Prime values are shown in a line graph in Figure 5 and a scatterplot in Figure 6. Even against an already upward trend, Jackson’s Dilemma stands out once more as an anomaly in Figure 6, and The Green Knight initiates the upward climb in abnormality that peaks in Murdoch’s final novel. This is consistent with past studies’ observations that her dementia symptoms seem to have manifested in her writing rapidly and drastically (Garrard et al., 2005; Le et al., 2011; van Velzen et al., 2014). The other scatterplots for Murdoch’s statistically significant regressions can be found in Appendix B.

Le et al. (2011) describe a ‘trough’ in the complexity and LD of Murdoch’s writing in her late forties and early fifties that then recovers before her final decline. They attribute this trough to emotional and mental difficulties in Murdoch’s life as revealed in her diaries. The Delta Prime values in Figure 5 somewhat corroborate this result, but point to her mid-late fifties (books M12-M15) as the ‘trough’ period.

[Line graph: Delta Prime - Iris Murdoch; y-axis: Delta Prime, scaled from 0 to 2]

Figure 5. Line graph for Delta Prime values for Iris Murdoch.

[Scatterplot: Delta Prime - Iris Murdoch; x-axis: Age; y-axis: Delta Prime; linear trendline with R² = 0.3605]

Figure 6. Scatterplot and linear regression line for Delta Prime for Iris Murdoch (p = .005).

Agatha Christie

Every individual variable used in this study, as well as Delta Prime, shows a statistically significant linear trend indicating declining lexical abilities in line with what would be expected for an individual suffering from Alzheimer’s disease (descriptive statistical information shown in Table 14; inferential statistical information shown in Table 15). As shown by the Delta Prime line graph below in Figure 7, lexical aspects of Agatha Christie’s writing began to noticeably deviate from her overall norm in the 27th novel included in this dataset (AC27, Ordeal by Innocence, which was actually her 57th novel out of the 71 published in her lifetime). The 28th novel, Cat Among the Pigeons (AC28), seems to return to her baseline, but the remaining eight novels in the dataset continue the trend of increasing deviation from Christie’s ‘normal’ LD values, with the final spike on Elephants Can Remember (AC35). The harsh fluctuation in Delta Prime values could possibly be attributed to the ephemeral nature of early dementia symptoms (Bradshaw, Saling, Hopwood, Anderson, & Brodtmann, 2004). That Postern of Fate (AC36) indicates a less severe decline than Elephants Can Remember (AC35) is expected, since Agatha Christie had editing help on Postern (Le et al., 2011). The highest spike in Delta Prime values, coinciding with Endless Night (AC31), is somewhat unexpected, although critics identify a shift in style toward a streamlined feeling in the novel (Richardson, 1967). Perhaps Endless Night represents a collision of Alzheimer’s impact on vocabulary and an attempt at an intentionally sparser style from Christie.

Overall, the more uniform and gradual decline indicated by these linear regressions is consistent with the suggestion of Le et al. (2011) and van Velzen et al. (2014) that Christie’s decline was not as drastic or sudden as Murdoch’s but was still distinct and mostly lexical in nature. Le et al. (2011) did not find significant syntactic signs of Christie’s decline. Insofar as the it-v-adj-to/that values can be considered somewhat syntactic, the results of this study lightly contradict Le et al., though the lexical variables (except Evenness) do still have higher predictive ability than the lexicosyntactic ones. Linear regressions for Christie can be found in Appendix B.


Table 14 Raw LD and it-v-adj-to-that values for Agatha Christie

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| AC1 | 1920 | 30 | 426.52 | 1086.72 | 10.84 | 1223 | 0.723 | 1.224 | 172.13 | 533.61 |
| AC2 | 1922 | 32 | 477.12 | 1150.85 | 9.65 | 1534 | 0.714 | 1.243 | 76.92 | 384.58 |
| AC3 | 1923 | 33 | 435.53 | 1122.38 | 10.19 | 1248 | 0.722 | 1.227 | 349.77 | 766.16 |
| AC4 | 1924 | 34 | 443.23 | 1135.18 | 11.28 | 1503 | 0.708 | 1.239 | 193.67 | 555.19 |
| AC5 | 1925 | 35 | 465.11 | 1144.23 | 10.16 | 1435 | 0.714 | 1.235 | 90.97 | 415.85 |
| AC6 | 1927 | 37 | 467.36 | 1169.22 | 10.17 | 1197 | 0.725 | 1.227 | 265.43 | 460.08 |
| AC7 | 1928 | 38 | 433.03 | 1072.73 | 10.89 | 1363 | 0.716 | 1.229 | 196.21 | 462.5 |
| AC8 | 1929 | 39 | 441.31 | 1088.62 | 10.98 | 1407 | 0.717 | 1.224 | 102.75 | 293.58 |
| AC9 | 1930 | 40 | 428.33 | 1037.59 | 13.07 | 1805 | 0.691 | 1.24 | 38.3 | 392.55 |
| AC10 | 1930 | 40 | 375.45 | 996.04 | 12.13 | 1244 | 0.706 | 1.212 | 124.67 | 526.37 |
| AC11 | 1931 | 41 | 415.57 | 1047.28 | 11.28 | 1225 | 0.716 | 1.219 | 90.36 | 496.99 |
| AC12 | 1932 | 42 | 412.29 | 1082.76 | 12.29 | 1183 | 0.72 | 1.209 | 124.19 | 461.29 |
| AC13 | 1934 | 44 | 415.4 | 1033.13 | 10.71 | 1128 | 0.725 | 1.21 | 250.71 | 802.29 |
| AC14 | 1935 | 45 | 457.57 | 1096.33 | 11.64 | 1228 | 0.725 | 1.216 | 210.91 | 584.05 |
| AC15 | 1936 | 46 | 425.12 | 1040.13 | 11.65 | 1220 | 0.72 | 1.206 | 86.1 | 464.93 |
| AC16 | 1937 | 47 | 458.6 | 1093.59 | 11.44 | 1588 | 0.713 | 1.228 | 186.56 | 646.73 |
| AC17 | 1938 | 48 | 432.31 | 1068.74 | 11.69 | 1277 | 0.724 | 1.215 | 197.3 | 609.82 |
| AC18 | 1939 | 49 | 435.07 | 1069.04 | 11.22 | 1172 | 0.727 | 1.208 | 73.03 | 310.38 |
| AC19 | 1941 | 51 | 412.05 | 1038.19 | 11.68 | 1178 | 0.719 | 1.212 | 167.39 | 535.64 |
| AC20 | 1942 | 52 | 378.58 | 1034.74 | 12.93 | 1342 | 0.704 | 1.214 | 74.27 | 490.17 |
| AC21 | 1944 | 54 | 415.33 | 1049.82 | 11.89 | 1242 | 0.72 | 1.201 | 33.78 | 253.34 |
| AC22 | 1945 | 55 | 423.58 | 1070.22 | 12.56 | 1316 | 0.714 | 1.22 | 46.07 | 383.91 |
| AC23 | 1948 | 58 | 440.33 | 1084.19 | 12.47 | 1325 | 0.717 | 1.216 | 108.33 | 278.56 |
| AC24 | 1951 | 61 | 515.16 | 1192.69 | 10.64 | 1596 | 0.717 | 1.235 | 97.26 | 416.82 |
| AC25 | 1953 | 63 | 398.22 | 1008.15 | 12.46 | 1177 | 0.717 | 1.202 | 132.27 | 479.49 |
| AC26 | 1956 | 66 | 434.65 | 1074.38 | 11.17 | 1259 | 0.72 | 1.208 | 171.46 | 377.21 |
| AC27+ | 1958 | 68 | 344.81 | 918.58 | 15.12 | 1154 | 0.701 | 1.196 | 41.97 | 433.69 |
| AC28+ | 1959 | 69 | 423.1 | 1035.35 | 12.49 | 1283 | 0.711 | 1.214 | 70.94 | 340.51 |
| AC29+ | 1962 | 72 | 384.43 | 982.39 | 13.46 | 1188 | 0.706 | 1.205 | 83.28 | 374.75 |
| AC30+ | 1964 | 74 | 369.59 | 938.46 | 13.09 | 1001 | 0.718 | 1.192 | 54.48 | 417.67 |
| AC31+ | 1967 | 77 | 297.44 | 866.03 | 17.4 | 924 | 0.693 | 1.192 | 76.7 | 429.51 |
| AC32+ | 1968 | 78 | 393 | 977.06 | 13.36 | 1156 | 0.705 | 1.207 | 95.5 | 313.8 |
| AC33+ | 1970 | 80 | 490.48 | 1112.46 | 13.68 | 1337 | 0.713 | 1.216 | 42.94 | 357.8 |
| AC34+ | 1971 | 81 | 374.5 | 943.99 | 14.57 | 1112 | 0.703 | 1.215 | 75.93 | 404.98 |
| AC35+ | 1972 | 82 | 306.74 | 826.55 | 16.46 | 780 | 0.702 | 1.188 | 95.2 | 571.18 |
| AC36+ | 1973 | 83 | 346.44 | 883.88 | 15.29 | 896 | 0.695 | 1.194 | 63.24 | 341.48 |
| mean | | | 416.48 | 1043.66 | 12.28 | 1256.8 | 0.713 | 1.215 | 121.14 | 454.65 |

Table 15 Linear regression results for Agatha Christie; 36 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.478 | < .001 | decreasing |
| Clusters20 | 0.637 | < .001 | increasing |
| Evenness | 0.23 | 0.003 | decreasing |
| Semantic Disparity | 0.488 | < .001 | decreasing |
| Special types | 0.304 | < .001 | decreasing |
| Effective types | 0.283 | 0.001 | decreasing |
| IT VERB ADJ THAT relative | 0.248 | 0.002 | decreasing |
| overall relative IT-V-A-T/T | 0.138 | 0.026 | decreasing |
| Delta Prime | 0.504 | < .001 | increasing |

[Line graph: Delta Prime - Agatha Christie; x-axis: novel name code (AC1 to AC36+); y-axis: Delta Prime, scaled from 0 to 2]

Figure 7. Line graph for Delta Prime values for Agatha Christie.

Terry Pratchett

Unlike Christie and Murdoch, the LD values and supplementary variable values for Terry Pratchett’s writing do not behave in patterns that straightforwardly suggest a decline in lexical ability late in life. Table 16 below shows the raw values for the individual variables, which fluctuate throughout his career. The numbers and graphs for his data look more like those of Mary Higgins Clark or P.D. James (see Figure 8 below), especially considering that his Delta Prime values show no significant linear trend. The final spike in Pratchett’s Delta Prime values corresponds with the first novel of his that was coded +dementia (Unseen Academicals – P33), but the Delta Prime values quickly fall back toward his baseline. Only four LD variables in the data for Pratchett have statistically significant trends, and two of those trends are in the ‘wrong’ direction to indicate declining lexical abilities, as indicated in Table 17 below. Though the overall ineffectiveness of Delta Prime in modelling Pratchett’s Alzheimer’s is unexpected, these results are not entirely surprising, since Pratchett’s dementia was a variety (posterior cortical atrophy) that primarily deteriorates mental faculties associated with vision rather than memory and language ability.


Table 16 Raw LD and it-v-adj-to-that values for Terry Pratchett

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| P1 | 1983 | 35 | 582.62 | 1321.74 | 8.44 | 1601 | 0.727 | 1.237 | 29.91 | 254.21 |
| P2 | 1986 | 38 | 483.93 | 1214.24 | 9.77 | 1147 | 0.729 | 1.207 | 43.26 | 302.81 |
| P3 | 1987 | 39 | 500.7 | 1219.93 | 9.68 | 1374 | 0.719 | 1.226 | 29.68 | 371 |
| P4 | 1987 | 39 | 593.71 | 1331.42 | 9.54 | 1515 | 0.721 | 1.229 | 54.05 | 310.79 |
| P5 | 1989 | 41 | 548.71 | 1247.54 | 10.27 | 1843 | 0.705 | 1.25 | 59.71 | 248.81 |
| P6 | 1989 | 41 | 535.81 | 1251.34 | 10.26 | 1660 | 0.709 | 1.241 | 100.5 | 491.32 |
| P7 | 1989 | 41 | 530.62 | 1263.69 | 9.76 | 1644 | 0.714 | 1.238 | 111.93 | 572.09 |
| P8 | 1989 | 41 | 512.55 | 1230.63 | 10.25 | 1557 | 0.708 | 1.235 | 57.95 | 370.86 |
| P9 | 1990 | 42 | 469.58 | 1208.07 | 9.92 | 1016 | 0.734 | 1.199 | 140.75 | 506.69 |
| P10 | 1990 | 42 | 528.3 | 1193.64 | 10.66 | 1759 | 0.704 | 1.233 | 50.33 | 392.57 |
| P11 | 1991 | 43 | 574.11 | 1267.78 | 10.31 | 1506 | 0.718 | 1.228 | 12.44 | 335.82 |
| P12 | 1991 | 43 | 476.55 | 1138.33 | 11.03 | 1455 | 0.704 | 1.221 | 35.98 | 431.79 |
| P13 | 1992 | 44 | 501.61 | 1174.67 | 10.97 | 1444 | 0.704 | 1.224 | 32.91 | 351.02 |
| P14 | 1992 | 44 | 479.17 | 1155.76 | 11.24 | 1497 | 0.701 | 1.232 | 75.44 | 441.86 |
| P15 | 1993 | 45 | 537.74 | 1193.24 | 10.84 | 1530 | 0.707 | 1.23 | 62.01 | 392.71 |
| P16 | 1994 | 46 | 546.94 | 1218.22 | 10.78 | 1652 | 0.71 | 1.233 | 126.44 | 463.61 |
| P17 | 1994 | 46 | 513.39 | 1184.82 | 10.22 | 1438 | 0.71 | 1.229 | 57.39 | 367.29 |
| P18 | 1994 | 46 | 551.35 | 1217.84 | 10.76 | 1543 | 0.707 | 1.229 | 71.82 | 513.03 |
| P19 | 1996 | 48 | 540.12 | 1211.48 | 10.25 | 1536 | 0.705 | 1.234 | 72.6 | 404.48 |
| P20 | 1996 | 48 | 599.01 | 1276.94 | 10.18 | 1562 | 0.711 | 1.232 | 61.91 | 319.89 |
| P21 | 1997 | 49 | 534.23 | 1202.07 | 9.98 | 1695 | 0.701 | 1.236 | 56.02 | 326.78 |
| P22 | 1998 | 50 | 509.62 | 1190.35 | 10.1 | 1614 | 0.699 | 1.233 | 9.92 | 238.14 |
| P23 | 1998 | 50 | 515.63 | 1188.34 | 9.62 | 1623 | 0.703 | 1.235 | 61.87 | 422.75 |
| P24 | 1999 | 51 | 526.82 | 1193.49 | 9.69 | 1686 | 0.701 | 1.239 | 45.49 | 345.76 |
| P25 | 2000 | 52 | 532.08 | 1199.12 | 10.52 | 1676 | 0.699 | 1.241 | 28.12 | 431.19 |
| P26 | 2001 | 53 | 457.86 | 1180.75 | 10.48 | 873 | 0.734 | 1.193 | 49.19 | 319.76 |
| P27 | 2001 | 53 | 546.74 | 1205.22 | 10.42 | 1651 | 0.703 | 1.235 | 86.41 | 403.24 |
| P28 | 2002 | 54 | 490.24 | 1147.9 | 10.22 | 1575 | 0.697 | 1.241 | 34.13 | 187.73 |
| P29 | 2003 | 55 | 504.67 | 1161.84 | 10.33 | 1650 | 0.697 | 1.243 | 50.58 | 286.62 |
| P30 | 2004 | 56 | 530.18 | 1194.09 | 10.22 | 1797 | 0.697 | 1.241 | 41.62 | 416.18 |
| P31 | 2005 | 57 | 523.1 | 1174.5 | 10.04 | 1672 | 0.701 | 1.238 | 70.79 | 530.94 |
| P32 | 2007 | 59 | 529.69 | 1217.81 | 9.79 | 1855 | 0.698 | 1.248 | 52.12 | 347.49 |
| P33+ | 2009 | 61 | 500.81 | 1142.09 | 10.57 | 1993 | 0.689 | 1.252 | 86.43 | 453.78 |
| P34+ | 2011 | 63 | 500.06 | 1154.05 | 9.4 | 1932 | 0.694 | 1.256 | 29.78 | 245.68 |
| P35+ | 2013 | 65 | 522.25 | 1202.9 | 8.33 | 1914 | 0.694 | 1.263 | 54.23 | 371.84 |
| mean | | | 523.73 | 1207.88 | 10.14 | 1585.3 | 0.707 | 1.234 | 58.39 | 376.30 |

Table 17 Linear regression results for Terry Pratchett; 35 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.307 | 0.001 | decreasing |
| Clusters20 | 0.017 | 0.456 | |
| Evenness | 0.444 | < .001 | decreasing |
| Semantic Disparity | 0.257 | 0.002 | increasing |
| Special types | 0.232 | 0.003 | increasing |
| Effective types | 0.041 | 0.246 | |
| IT VERB ADJ THAT relative | 0.008 | 0.601 | |
| overall relative IT-V-A-T/T | 0.004 | 0.704 | |
| Delta Prime | 0.002 | 0.782 | |

[Line graph: Delta Prime - Terry Pratchett; x-axis: novel name code (P1 to P35+); y-axis: Delta Prime, scaled from 0 to 2]

Figure 8. Line graph of Delta Prime values for Terry Pratchett.

Dementia group overall

Both dementia-affected authors whose writing displayed significant declines in lexical diversity according to Delta Prime values also displayed a significant, fairly robust (R² = .637 and p < .001 for Christie; R² = .481 and p = .001 for Murdoch) upward trend in clusters20, which measures Dispersion. Clusters20 is calculated by finding the average number of types repeated within 20 words of each other in a text, relativized to clusters per 100 words; higher clusters20 corresponds to lower LD. As an individual variable (rather than a conglomerate variable like Delta Prime), clusters20 seems to have the strongest predictive power for dementia of any variable tested in this study. This corresponds with the results of Le et al. (2011) for their local repetition measure, which functions similarly to clusters20 but only considers open-class words when counting repetitions. Le et al. found that Christie had a strong increase in local lexical repetition toward the end of her career, while Murdoch’s increasing repetition was less pronounced in overall trend but showed a “sharp rise” (p. 455) in her 50s and was very strong in her last two novels. Below are the scatterplots for clusters20 for Murdoch and Christie (Figures 9 and 10 respectively).

[Scatterplot: Clusters20 - Iris Murdoch; x-axis: Age; y-axis: Clusters20; linear trendline with R² = 0.4813]

Figure 9. Linear regression for clusters20 for Iris Murdoch (p = .001).

[Scatterplot: Clusters20 - Agatha Christie; x-axis: Age; y-axis: Clusters20; linear trendline with R² = 0.6366]

Figure 10. Linear regression for clusters20 for Agatha Christie (p < .001).
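The general idea of a local repetition count can be sketched as follows. This is a deliberately simplified stand-in and should not be read as the clusters20 algorithm itself: it assumes that a repetition is any token whose type already occurred within the preceding `span` tokens, and it applies no filtering by word class, so its values will not match those in the tables. The function name and the toy input are hypothetical.

```python
def local_repetition_per_100(tokens, span=20):
    """Count tokens whose type recurred within `span` tokens, per 100 tokens."""
    if not tokens:
        return 0.0
    last_seen = {}   # type -> index of its most recent occurrence
    repeats = 0
    for i, token in enumerate(tokens):
        previous = last_seen.get(token)
        if previous is not None and i - previous <= span:
            repeats += 1
        last_seen[token] = i
    return 100.0 * repeats / len(tokens)


# Toy input with a span of 2 tokens instead of 20.
toy_tokens = ["a", "b", "a", "c", "d", "a"]
print(local_repetition_per_100(toy_tokens, span=2))
```

In the toy input only the second "a" counts as a repetition, because the third "a" is three tokens away from the previous one. Restricting the count to open-class words, as Le et al. (2011) did, would require a part-of-speech tagger and is left out here.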

When Delta Prime transformations are applied to the z-scores of LD variables and supplementary variables used in this study, the resulting model is a significant and fairly strong (Christie: R² = .504, p < .001; Murdoch: R² = .36, p = .005) predictor of age at the time of composition in the writing of literary authors suffering from dementia that primarily affects memory. Since Delta Prime values of LD and supplementary measures do not have a significant or robust relationship with age in the writing of authors without a pathological condition, predicting age in this case seems to be functioning instead as predicting dementia effects. In the case of Terry Pratchett, the model was not effective for predicting age, perhaps because he suffered from posterior cortical atrophy and claimed not to have mental struggles because of the disease beyond vision-related problems like object recognition.

Depression Group

Since what to expect regarding lexical diversity in the case of suicidal depression is less clear than in the case of dementia, Delta values were considered for the depression group authors in addition to Delta Prime. Delta values are simply the average absolute z-scores for each novel across all factors.
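Delta, as defined here, applies no direction or magnitude filter. A minimal sketch, with a hypothetical function name and made-up z-scores:

```python
def delta(z_scores):
    """Mean absolute z-score across all factors for one novel."""
    return sum(abs(z) for z in z_scores) / len(z_scores)


# Made-up z-scores for a single novel (illustrative only, not thesis data).
example_z = [-1.0, 1.5, 0.8, -0.3]
print(delta(example_z))  # (1.0 + 1.5 + 0.8 + 0.3) / 4
```

Every factor contributes its full magnitude, so Delta reflects total deviation from the author's mean in either direction.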

Kurt Vonnegut

Since Vonnegut’s suicide attempt was in the middle of his life (at age 62) and he continued to write and publish novels afterward until a few years before his natural death, his LD patterns are not expected to be linear if suicidal depression correlates with any specific LD tendencies, especially if LD values would be expected to return to the author’s baseline after the suicidal period had passed. Indeed, as shown in Table 18 below, none of the individual variables, nor Delta or Delta Prime, have significant linear relationships with Vonnegut’s age. However, it seems that none of the variables have any particular relationship with his age, whether linear, polynomial, or otherwise. Figure 11 shows Vonnegut’s Delta values, which should illustrate total deviation from the author’s ‘normal’ values, positive or negative. The fluctuations in the Delta values do not coincide with the novels coded as +suicidal. Raw values for Vonnegut’s novels for LD and supplementary variables are shown in Table 19.

Table 18 Linear regression results for Kurt Vonnegut; 13 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.027 | 0.589 | |
| Clusters20 | 0.232 | 0.096 | |
| Evenness | 0.012 | 0.725 | |
| Semantic Disparity | 0.075 | 0.365 | |
| Special types | 0.07 | 0.383 | |
| Effective types | 0 | 0.965 | |
| IT VERB ADJ THAT relative | 0.002 | 0.873 | |
| overall relative IT-V-A-T/T | 0.005 | 0.826 | |
| Delta Prime | 0.121 | 0.244 | |
| Delta | 0.074 | 0.368 | |


[Line graph: Delta - Vonnegut; x-axis: novel name code (V1 to V13, with V6+ to V10+ coded +suicidal); y-axis: Delta, scaled from 0 to 1]

Figure 11. Delta values for Vonnegut.

Table 19 Raw LD and it-v-adj-to-that values for Kurt Vonnegut

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| V1 | 1952 | 30 | 616.02 | 1286.32 | 10.95 | 1998 | 0.709 | 1.247 | 58.07 | 319.39 |
| V2 | 1961 | 39 | 460.36 | 1186.7 | 16.18 | 1079 | 0.719 | 1.197 | 41.29 | 103.22 |
| V3 | 1963 | 41 | 545.34 | 1235.25 | 13.1 | 1184 | 0.727 | 1.207 | 73.46 | 330.55 |
| V4 | 1965 | 43 | 582.07 | 1306.06 | 11.75 | 1298 | 0.733 | 1.216 | 40.03 | 240.2 |
| V5 | 1969 | 47 | 498.65 | 1234.19 | 11.79 | 1096 | 0.725 | 1.201 | 59.15 | 177.46 |
| V6+ | 1973 | 51 | 545.39 | 1234.93 | 12.08 | 1126 | 0.731 | 1.204 | 0 | 173.22 |
| V7+ | 1976 | 54 | 513.06 | 1276.1 | 11.79 | 1000 | 0.737 | 1.187 | 77.45 | 361.42 |
| V8+ | 1979 | 57 | 531.39 | 1239.42 | 11.92 | 1448 | 0.712 | 1.222 | 80.18 | 320.72 |
| V9+ | 1982 | 60 | 489.79 | 1171.44 | 12.01 | 1138 | 0.721 | 1.201 | 85.51 | 222.32 |
| V10+ | 1985 | 63 | 526.7 | 1222.05 | 10.28 | 1409 | 0.721 | 1.222 | 43.6 | 261.63 |
| V11 | 1987 | 65 | 479.84 | 1182.72 | 12.18 | 1280 | 0.711 | 1.21 | 42.84 | 157.09 |
| V12 | 1990 | 68 | 533.54 | 1243.36 | 11.15 | 1346 | 0.709 | 1.219 | 72.69 | 266.54 |
| V13 | 1997 | 75 | 636.21 | 1413.19 | 10.15 | 1243 | 0.741 | 1.199 | 41.57 | 228.65 |
| mean | | | 535.26 | 1248.60 | 11.95 | 1280.39 | 0.723 | 1.210 | 55.06 | 243.26 |

Virginia Woolf

Since Virginia Woolf’s life and writing career ended in suicidal depression and suicide, the LD and supplemental variable values for her writing are expected to have a more linear relationship if LD is affected by depression escalating into suicidal intensity. As the raw values in Table 20 below suggest, Woolf’s writing seems to have increased in several facets of LD in her final novel. MAT5000, clusters20, and Evenness all show noticeable jumps from their means as well as from the values for the novels that immediately precede it. However, as shown in Table 21, no variables but overall relative IT-V-A-T/T have significant linear relationships with Woolf’s age.

Table 20 Raw LD and it-v-adj-to-that values for Virginia Woolf

| Novel | Year | Age | Effective Types | MAT5000 | Clusters20 | Special Types | Evenness | Sem. Disp. | IT VERB ADJ THAT relative | overall relative IT-V-A-T/T |
|---|---|---|---|---|---|---|---|---|---|---|
| W1 | 1915 | 33 | 510.34 | 1176.76 | 10.17 | 1965 | 0.7 | 1.262 | 200.2 | 700.72 |
| W2 | 1919 | 37 | 502.76 | 1191.47 | 10.69 | 2279 | 0.691 | 1.281 | 176 | 539.73 |
| W3 | 1922 | 40 | 627.61 | 1355.25 | 8.58 | 1339 | 0.742 | 1.222 | 35.69 | 339.04 |
| W4 | 1925 | 43 | 519.36 | 1221.66 | 11.93 | 1296 | 0.725 | 1.232 | 77.23 | 432.5 |
| W5 | 1927 | 45 | 440.41 | 1119.28 | 12.31 | 1321 | 0.711 | 1.23 | 85.22 | 269.86 |
| W6 | 1928 | 46 | 584.99 | 1362.96 | 9.42 | 1651 | 0.716 | 1.252 | 213.22 | 426.44 |
| W7 | 1931 | 49 | 583.18 | 1350.28 | 11.64 | 1603 | 0.727 | 1.239 | 102.28 | 255.71 |
| W8 | 1937 | 55 | 443.52 | 1060.25 | 12.6 | 1542 | 0.694 | 1.234 | 0 | 274.84 |
| W9+ | 1941 | 59 | 602.63 | 1339.14 | 10.2 | 999 | 0.745 | 1.212 | 64.71 | 194.13 |
| mean | | | 534.98 | 1241.89 | 10.84 | 1555 | 0.717 | 1.240 | 106.06 | 381.44 |


Table 21 Linear regression results for Virginia Woolf; 9 novels total

| Factor | R-square | p | Direction, if significant trend |
|---|---|---|---|
| MAT5000 | 0.01 | 0.797 | |
| Clusters20 | 0.1 | 0.407 | |
| Evenness | 0.131 | 0.339 | |
| Semantic Disparity | 0.436 | 0.053 | |
| Special types | 0.432 | 0.055 | |
| Effective types | 0.008 | 0.822 | |
| IT VERB ADJ THAT relative | 0.327 | 0.107 | |
| overall relative IT-V-A-T/T | 0.729 | 0.003 | decreasing |
| Delta Prime | 0.25 | 0.17 | |
| Delta | 0 | 0.986 | |

The implications for the relationship, if there is one, between lexical diversity and suicidal depression are unclear from the descriptive statistics and linear regression results.
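As an illustration of the regression step, the one significant result in Table 21 can be checked directly from the raw values in Table 20. The sketch below is not the SPSS procedure used in this study; it is a minimal ordinary least-squares calculation in plain Python, and the function name simple_regression is hypothetical. It regresses overall relative IT-V-A-T/T on Woolf's age at publication.

```python
# Woolf's age at publication and overall relative IT-V-A-T/T, copied from Table 20.
ages = [33, 37, 40, 43, 45, 46, 49, 55, 59]
it_v_a_t = [700.72, 539.73, 339.04, 432.5, 269.86, 426.44, 255.71, 274.84, 194.13]


def simple_regression(x, y):
    """Return (slope, intercept, r_squared) for an ordinary least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a single predictor, R-square is the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


slope, intercept, r_squared = simple_regression(ages, it_v_a_t)
print(f"slope = {slope:.2f}, R-square = {r_squared:.3f}")
```

The slope is negative and the R-square comes out at about 0.729, which agrees with the "decreasing" trend reported for this variable in Table 21. The sketch does not compute the p value.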

Delta Prime graphs for each depression group author, which look more like the control group than the dementia group, are in Appendix B. In the following sections, I will discuss the results of hierarchical cluster analysis and linear discriminant analysis for all authors.

Exploratory – Hierarchical Cluster Analysis

Hierarchical cluster analysis was conducted in SPSS on all novels by all authors with all LD and supplementary measures as variables. The resulting dendrogram is reproduced in sections in Appendix C because of its size. Novels that are similar on the variables used should cluster together, so novels by the same author are expected to cluster together unless something unusual about a novel separates it from the others.
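To make the clustering procedure concrete, the following is a minimal sketch of agglomerative clustering in plain Python. It is not the SPSS analysis reported here. The linkage method (average linkage), the distance (Euclidean on z-scores), and the function names are all assumptions chosen for illustration, and the input is only a four-novel, four-variable subset of Tables 19 and 20 rather than the full set of novels and measures.

```python
from itertools import combinations
from math import dist

# Illustrative subset of Tables 19 and 20:
# MAT5000, clusters20, evenness, semantic disparity.
novels = {
    "V6": [545.39, 12.08, 0.731, 1.204],
    "V7": [513.06, 11.79, 0.737, 1.187],
    "W8": [443.52, 12.6, 0.694, 1.234],
    "W9": [602.63, 10.2, 0.745, 1.212],
}


def standardize(rows):
    """Convert each column to z-scores so no variable dominates by scale."""
    columns = list(zip(*rows))
    means = [sum(col) / len(col) for col in columns]
    sds = [(sum((v - m) ** 2 for v in col) / (len(col) - 1)) ** 0.5
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def average_linkage(labels, rows):
    """Repeatedly merge the two closest clusters; return the merge history."""
    points = dict(zip(labels, rows))
    clusters = [(label,) for label in labels]

    def gap(pair):
        # Average pairwise distance between members of the two clusters.
        a, b = pair
        return sum(dist(points[p], points[q]) for p in a for q in b) / (len(a) * len(b))

    history = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=gap)
        history.append((a, b, gap((a, b))))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return history


for left, right, height in average_linkage(list(novels), standardize(list(novels.values()))):
    print(left, "+", right, f"at distance {height:.2f}")
```

Each printed line corresponds to one join in a dendrogram: the earlier a pair merges and the smaller the distance, the more similar the novels are on the chosen variables. A novel that joins only at a late level is isolated in the sense used in this section.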

Overall, books by the same author generally clustered together except for Woolf novels, which spread throughout clusters from other authors. The novels also seem to have tended to cluster together according to general genre. Though there is a mixture of Clark, Christie, Pratchett, and Vonnegut novels in the first large cluster (shown in Figure C1), Vonnegut novels often clustered with Pratchett novels in other cases. Both Vonnegut’s and Pratchett’s writing tends toward the satirical and humorous, and perhaps humorous or satirical writing has somewhat distinct LD patterns that would cause these novels to form clusters.

However, if such genre-specific lexical diversity patterns exist, they are perhaps not extremely distinct since novels from other authors are sporadically mixed into the same clusters. Christie and Clark novels intermixed in large clusters like the one in Figure C3, and both authors specialize in suspense writing. Interestingly, although James has likewise penned mostly detective novels, her writing clustered mostly by itself without as much intermingling with Christie and Clark as may be expected.

The most notable clusters are those that seem to indicate anomalous characteristics of a certain novel or group of novels compared to others by the same author. Importantly, seven of Christie’s last 10 novels (all 10 of which were coded +dementia) in this dataset cluster together in two first-level clusters, with Ordeal by Innocence (AC27), The Mirror Crack’d from Side to Side (AC29), By the Pricking of My Thumbs (AC32), and Nemesis (AC34) together, connected at the next level to a first-level cluster composed of Endless Night (AC31), Postern of Fate (AC36), and A Caribbean Mystery (AC30). The missing novels out of the last 10 are Cat Among the Pigeons (AC28), which was also an outlier in its Delta Prime value, as described above; Passenger to Frankfurt (AC33), which past studies (Garrard et al., 2005 & Le et al., 2011) considered an outlier and left out of their analyses due to Christie having conducted background political research that they postulate likely affected her vocabulary usage; and Elephants Can Remember (AC35), which is Christie’s second-to-last novel and the one most perceivably affected by dementia (as mentioned above, Christie had editorial help for her last novel). Passenger to Frankfurt and Cat Among the Pigeons can be found in the next first-level cluster down along with earlier Christie novels and some Clark novels. This supports the idea that the writing in Passenger and Cat more closely resembles Christie’s earlier work than her Alzheimer’s-affected work, perhaps because of background research in the case of the former and the fluctuating nature of dementia allowing for a reprieve in the case of the latter.

As the most dementia-affected novel of the author with the earliest-appearing and most consistent lexical signals of AD in this study, Elephants Can Remember is unsurprisingly the most isolated novel in the entire dendrogram, joining no cluster until the seventh level.

A large group of Murdoch’s novels cluster at the bottom of the dendrogram (Figure C3). Four of these Murdoch novels (The Bell – M3, The Nice and the Good – M9, An Unofficial Rose – M5, and Henry and Cato – M14) cluster with Pratchett’s third to last novel (Unseen Academicals – P33, the first one of his labeled +dementia) and the P.D. James novel with the highest Delta Prime value (A Certain Justice – J14). This seems to say more about the Pratchett novel and the James novel being different from the other novels by their respective authors than it does about the Murdoch novels, given that M3, M5, and M9 have among the smallest Delta Prime values of all Murdoch novels. The other main Murdoch novel cluster (in Figure C3, directly below the one just discussed) connects seven of her mid-to-late-career novels via two levels of sub-clusters, the first level being in pairs. The first-level sub-clusters connect The Sacred and Profane Love Machine (M13) with The Green Knight (M19), The Black Prince (M12) with The Good Apprentice (M17), and The Sea, the Sea (M15) with The Book and the Brotherhood (M18), leaving The Philosopher’s Pupil (M16) alone in the first level. At the second level, the first two sub-clusters come together, as do the third sub-cluster and M16. The entire group is united in the third level. The inclusion of The Green Knight (M19) in this cluster group seems to suggest that it bears enough similarity to other Murdoch novels to not be as noticeably tainted by dementia. Le et al. (2011) describe finding early but less distinctive signs of the impact of dementia in The Green Knight, which is Murdoch’s second to last novel. The Delta Prime values discussed above show that The Green Knight does indeed show signs of the decline that peaks in Jackson’s Dilemma. However, the dendrogram from this study seems to suggest that The Green Knight is still similar enough to Murdoch’s past novels that the effects of dementia were perhaps not yet fully pronounced. It does seem that Iris Murdoch wrote The Green Knight when Alzheimer’s disease was already beginning to manifest in her lexical abilities. However, since the novel alludes to and somewhat parodies the medieval story Sir Gawain and the Green Knight, maybe its LD behaved similarly to that of Passenger to Frankfurt by Agatha Christie, which was written with the help of background research that may have affected vocabulary usage and therefore LD.

The composition of these clusters perhaps challenges the idea put forth by Le et al. (2011) that Murdoch’s writing went through a trough of complexity in her late forties and early-mid fifties before improving just before final decline, since the dendrogram shows novels from her mid-fifties, which would presumably be part of the trough, clustering with novels from her sixties, which would presumably be part of the post-trough recovery.

However, since the clusters in question are only pairs on the first level, perhaps the similarities between the novels are not strong enough to be taken as evidence against the trough described by Le et al. (2011).

Murdoch’s last novel, Jackson’s Dilemma (M20), clusters at the first level only with Bluebeard by Vonnegut, and at the second level with a variety of Christie and Clark novels. This meets expectations of Murdoch’s final novel behaving anomalously in relation to the rest of her work and corroborates the implications of the LD and supplementary variable results as well as the Delta Prime values. Echoing the results discussed in the previous section of this study, Pratchett’s last three novels (Unseen Academicals – P33, Snuff – P34, and Raising Steam – P35), which were coded +dementia, did not cluster with each other and away from the rest of his work. Only Unseen Academicals was relatively on its own, appearing with Murdoch novels.

For the depression group, the most interesting cluster results are those of Virginia Woolf’s novels. Supporting the notion that Orlando (W6) is stylistically distinct from Woolf’s other novels, it is relatively isolated in the dendrogram, having no first-level clusters and connecting at the second level to multiple dense clusters containing Pratchett, Vonnegut, and other Woolf novels. Between the Acts (W9), Woolf’s final novel and the only one of hers coded +suicidal, is also relatively isolated on the dendrogram. Between the Acts clusters at the second level only with Vonnegut novels and a Pratchett novel. Its relative isolation in the dendrogram seems to support the observation based on LD results that her final novel is somewhat distinct from her others, although in general, Woolf’s novels do not cluster with each other like novels from the other authors. In this way, Woolf seems to display the highest stylistic variation (at least in terms of LD properties and supplementary measures).

Hierarchical cluster analysis results for Vonnegut’s novels are, like the LD, supplementary variable, and Delta Prime results for his novels, inconclusive. The novels coded +suicidal (Breakfast of Champions – V6, Slapstick or Lonesome No More – V7, Jailbird – V8, Deadeye Dick – V9, and Galapagos – V10) do not all cluster together. Jailbird and Galapagos appear in the same cluster (Figure C1), but Deadeye Dick appears with his earlier novels and Slapstick or Lonesome No More is relatively isolated, clustering at the first and second levels with a single Pratchett novel in each level.

Overall, hierarchical cluster analysis generally supported the results of the raw LD and supplementary values, Delta Prime and Delta values, and linear regressions discussed in the previous section of this study. Further, hierarchical cluster analysis reinforced pathological condition coding decisions for Iris Murdoch and Agatha Christie, neither of whom had a diagnosis during her career (unlike Terry Pratchett) or a distinct event that manifested her condition (unlike Kurt Vonnegut and Virginia Woolf). Below, I discuss the results of linear discriminant analysis performed on each author with a pathological condition.

Confirmatory – Linear Discriminant Analysis

Linear discriminant analysis was performed on each author affected by a pathological condition (all but Clark and James) individually with dementia or suicidal as the grouping variable and all LD and supplemental variables as independent variables. In the tables below, each variable is listed in the order of its contribution to the model. Variable weights (reported in SPSS as the Structure Matrix) in the tables are expressed as pooled within-groups correlations between discriminating variables and the standardized canonical discriminant function. Wilks’ Lambda, F, and two degrees of freedom values are listed for each variable.
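For readers unfamiliar with these statistics, the per-variable Wilks' Lambda and F values in the tables below are two views of the same test. In a univariate test of equality of group means, Lambda is the within-groups sum of squares divided by the total sum of squares, so F = ((1 - Lambda) / df1) / (Lambda / df2). The sketch below applies that standard identity to three rows from Tables 22 to 24; the function name f_from_wilks is hypothetical, and small differences from the reported F values are expected because Lambda is rounded to three decimals in the tables.

```python
def f_from_wilks(wilks_lambda, df1, df2):
    """F statistic implied by a univariate Wilks' Lambda."""
    return ((1 - wilks_lambda) / df1) / (wilks_lambda / df2)


# (Wilks' Lambda, df1, df2, F as reported) copied from Tables 22, 23, and 24.
reported = {
    "Murdoch, MAT5000": (0.655, 1, 18, 9.5),
    "Christie, Clusters20": (0.385, 1, 34, 54.415),
    "Pratchett, Semantic disparity": (0.729, 1, 33, 12.253),
}

for label, (lam, df1, df2, f_reported) in reported.items():
    implied = f_from_wilks(lam, df1, df2)
    print(f"{label}: F implied by Lambda = {implied:.2f}, reported F = {f_reported}")
```

With two groups, df1 is 1 and df2 is the number of novels minus 2, so a smaller Lambda (less within-group variation relative to the total) always corresponds to a larger F for a given author.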

Dementia Group

Overall, discrimination accuracy between novels coded as -dementia and novels coded as +dementia was very high: 100 percent for Murdoch, Christie, and Pratchett.

Iris Murdoch

The eight-variable model used in this study correctly classified 100 percent of Iris Murdoch’s novels (canonical correlation = 0.831, p = .036), though only two of her novels (The Green Knight – M19 and Jackson’s Dilemma – M20) were coded as +dementia. When the discriminant analysis was run with leave-one-out cross-validation, in which each case is classified by the functions derived from all cases other than that case, classification accuracy was 90 percent. One novel from each group was misclassified: The Green Knight (M19) and The Unicorn (M6). The misclassification of The Green Knight is unsurprising given the subtlety of the signs of dementia in the novel’s LD. It is unclear why The Unicorn would be misclassified.
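The leave-one-out procedure can be sketched in a few lines. The example below is not the SPSS discriminant function. As a deliberately simplified stand-in, it classifies each held-out case by the nearest group mean on a single variable, and both the function names and the toy values are hypothetical. What it shares with the analysis described above is the loop: every case is set aside in turn and classified by a rule built from the remaining cases only.

```python
from statistics import mean


def nearest_mean_label(value, training):
    """Assign the label whose training-group mean is closest to value."""
    groups = {}
    for x, label in training:
        groups.setdefault(label, []).append(x)
    # A label with no remaining training cases cannot be predicted at all.
    return min(groups, key=lambda label: abs(value - mean(groups[label])))


def leave_one_out_accuracy(cases):
    """Classify each case from all other cases; return the share classified correctly."""
    hits = 0
    for i, (x, label) in enumerate(cases):
        training = cases[:i] + cases[i + 1:]
        hits += nearest_mean_label(x, training) == label
    return hits / len(cases)


# Hypothetical one-variable scores with group codes, for illustration only.
toy_cases = [(1.0, "-"), (1.2, "-"), (0.9, "-"), (3.0, "+"), (3.1, "+")]
print(leave_one_out_accuracy(toy_cases))
```

The loop also shows why a small group is fragile under cross-validation: when a group has only two members, holding one out leaves a single example to represent that group, which is the situation for Murdoch’s two +dementia novels.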

As shown in Table 22 below, the variable weights and significance for the linear discriminant analysis contrast with the significance and strength of the individual variables in the linear regression results. From the linear regressions, it appeared that overall relative IT-V-A-T/T was the strongest predictor of Murdoch’s decline, followed by clusters20. In the discriminant model, overall relative IT-V-A-T/T is still significant, but clusters20 is not. In this model, only MAT5000 and overall relative IT-V-A-T/T achieve significance in tests of equality of group means. Had Murdoch written more novels during and after the onset of her dementia, perhaps more of the variables would have reached significance and performed more cohesively as they do in linear discriminant analysis for Christie.


Table 22 Tests of equality of group means for Iris Murdoch. Variable weights from structure matrix.

| Variable | Wilks' Lambda | F | df1 | df2 | p | Variable Weight |
|---|---|---|---|---|---|---|
| MAT5000 | 0.655 | 9.5 | 1 | 18 | 0.006 | 0.486 |
| overall relative IT-V-A-T/T | 0.666 | 9.011 | 1 | 18 | 0.008 | 0.473 |
| Clusters20 | 0.903 | 1.93 | 1 | 18 | 0.182 | -0.219 |
| Semantic Disparity | 0.956 | 0.833 | 1 | 18 | 0.373 | 0.144 |
| special types | 0.965 | 0.653 | 1 | 18 | 0.43 | 0.127 |
| Effective Types | 0.979 | 0.387 | 1 | 18 | 0.542 | 0.098 |
| Evenness | 0.984 | 0.285 | 1 | 18 | 0.6 | -0.084 |
| IT VERB ADJ THAT relative | 0.998 | 0.036 | 1 | 18 | 0.852 | 0.03 |

Agatha Christie

All of Christie’s novels (100 percent) were correctly classified (canonical correlation = 0.875; p < .001). All but one of the variables used achieved statistical significance in the model, which is shown in Table 23 below. With cross-validation, the classification accuracy dropped to 91.7 percent, and one -dementia novel (Giant’s Bread – AC9) as well as two +dementia novels (Cat Among the Pigeons – AC28 and A Caribbean Mystery – AC30) were misclassified. Even the variable that was not significant (overall relative IT-V-A-T/T) was close to statistical significance, though its contribution to the model was minimal. All other factors contributed to the classification accuracy of the model. In classifying Christie novels, clusters20 seems to be the most important factor. The roles of clusters20 and overall relative IT-V-A-T/T in the linear discriminant analysis correspond with the linear regression results, wherein clusters20 had the highest R-square value of any individual factor and overall relative IT-V-A-T/T had the lowest R-square.


Table 23 Tests of equality of group means for Agatha Christie. Variable weights from structure matrix.

| Variable | Wilks' Lambda | F | df1 | df2 | p | Variable Weight |
|---|---|---|---|---|---|---|
| Clusters20 | 0.385 | 54.415 | 1 | 34 | < .001 | -0.7 |
| MAT5000 | 0.494 | 34.818 | 1 | 34 | < .001 | 0.56 |
| Effective types | 0.658 | 17.633 | 1 | 34 | < .001 | 0.398 |
| Semantic Disparity | 0.664 | 17.168 | 1 | 34 | < .001 | 0.393 |
| Evenness | 0.666 | 17.03 | 1 | 34 | 0.001 | 0.392 |
| Special types | 0.698 | 14.745 | 1 | 34 | < .001 | 0.364 |
| IT VERB ADJ THAT relative | 0.806 | 8.181 | 1 | 34 | 0.007 | 0.271 |
| overall relative IT-V-A-T/T | 0.921 | 2.936 | 1 | 34 | 0.096 | 0.163 |

Interestingly, although Cat Among the Pigeons (AC28) is an outlier in Delta Prime values, in several raw LD values, and in the hierarchical cluster analysis results, the regular (not cross-validated) model still correctly classifies it as +dementia. Despite the novel appearing to match Christie’s baseline, then, the component measures of LD and the supplementary measures must still differ enough from her past writing to identify the novel as being written under the effects of dementia. The results of the other analyses in this study point to Cat as an outlier, but linear discriminant analysis suggests that it should nonetheless be counted as part of the group of novels written under the effects of dementia. Its membership in the dementia group is, however, tenuous enough that the novel is misclassified in leave-one-out cross-validation. It seems likely that Christie’s dementia was still detectable by nuanced measures of LD but was in a waning, less severe state when she wrote the novel.

The other misclassified +dementia novel in leave-one-out cross-validation discriminant analysis is A Caribbean Mystery (AC30). Like Cat Among the Pigeons, this novel represents a brief return toward Christie’s baseline after a spike in Delta Prime values, so perhaps it too was written during a waning phase of Christie’s dementia. The misclassified -dementia novel is Giant’s Bread (AC9), which is the only novel included in this study that Christie wrote under the pseudonym Mary Westmacott. Intentional style differences on Christie’s part may account for the LD differences that led to misclassification.

Terry Pratchett

Surprisingly, though the individual linear regression results, Delta Prime values, and hierarchical cluster analysis results for Terry Pratchett seem to suggest that there are much less distinct LD patterns, if any, that separate his +dementia novels from his -dementia novels, the linear discriminant analysis model still achieved 100 percent classification accuracy for his novels. When discriminant analysis was conducted with cross-validation, the classification accuracy dropped to 88.6 percent. The cross-validated models misclassified one +dementia novel (Unseen Academicals – P33) and three -dementia novels (The Colour of Magic – P1, Night Watch – P28, and Thud – P31). There are no immediately apparent reasons for the misclassifications of these Pratchett novels unlike those of the Christie and Woolf novels.

In classifying Pratchett’s novels, the model had a canonical correlation of 0.744 and significance of .003. Table 24 shows that four LD variables were statistically significant, and the supplementary variables were extremely ineffective for classification of Pratchett’s novels. Three of the four significant variables (evenness, semantic disparity, special types) from the linear discriminant analysis were significant in linear regression as well, though evenness had a higher R-square than semantic disparity and special types. In addition to the predictive power of evenness being outclassed by semantic disparity and special types in the linear discriminant analysis, the other main difference from the linear regression results in variable strength and significance is that clusters20 has taken the place of MAT5000.

Table 24 Tests of equality of group means for Terry Pratchett. Variable weights from structure matrix.

| Variable | Wilks' Lambda | F | df1 | df2 | p | Variable Weight |
|---|---|---|---|---|---|---|
| Semantic disparity | 0.729 | 12.253 | 1 | 33 | 0.001 | 0.548 |
| Special types | 0.766 | 10.097 | 1 | 33 | 0.003 | 0.498 |
| Evenness | 0.829 | 6.813 | 1 | 33 | 0.014 | -0.409 |
| Clusters20 | 0.876 | 4.668 | 1 | 33 | 0.038 | -0.338 |
| MAT5000 | 0.919 | 2.912 | 1 | 33 | 0.097 | -0.267 |
| Effective Types | 0.977 | 0.766 | 1 | 33 | 0.388 | -0.137 |
| overall relative IT-V-A-T/T | 0.996 | 0.146 | 1 | 33 | 0.705 | -0.06 |
| IT VERB ADJ THAT relative | 1 | 0.009 | 1 | 33 | 0.924 | -0.015 |

The raw LD results shown in the first part of the results section show that clusters20 in Pratchett’s novels fluctuates throughout his life, but in each of his last two novels the clusters20 value is distinctly less than that of the preceding novel. A similar pattern occurs in his last three novels for special types values, whereas a corresponding increase can be observed in semantic disparity and somewhat in evenness. These patterns are shown in Figures 12-15 below with the pattern in question circled. Increasing evenness seems to correspond with decreasing special types, with a pronounced change between the third to last and second to last novels that levels out at the last novel. Likewise, clusters20’s decrease seems to correspond to semantic disparity’s increase, each displaying a more consistent diagonal pattern. Perhaps it is this pattern, with decreasing values in clusters20 and special types corresponding to increasing values in semantic disparity and evenness, that helps the model achieve its high accuracy despite the lack of significant overall linear relationships between the variables and Pratchett’s age.

Figure 12. Clusters20 for Terry Pratchett, with +dementia novels circled.

Figure 13. Special types for Terry Pratchett, with +dementia novels circled.

Figure 14. Semantic disparity for Terry Pratchett, with +dementia novels circled.

Figure 15. Evenness for Terry Pratchett, with +dementia novels circled.

Dementia group overall

The model successfully discriminated between all dementia-affected and dementia-unaffected novels for all three authors. When leave-one-out cross-validation was used, classification accuracy dropped to 90, 91.7, and 88.6 percent for Murdoch, Christie, and Pratchett respectively. That Christie had the highest accuracy when cross-validation was used is unsurprising given that the signs of dementia in her writing seem to be more distinct than for Murdoch and Pratchett, as evidenced by all but one of the variables reaching statistical significance in the discriminant model for Christie. Different variables contributed more to the accuracy of the model for different authors, suggesting that dementia affects individuals’ writing differently, or at least that different types of dementia affect writing differently (Agatha Christie’s specific dementia type is not confirmed). Semantic disparity is the only factor that contributed significantly to the model for all three authors in the dementia group. The supplementary variables were generally weak and insignificant except for overall relative IT-V-A-T/T for classifying Murdoch novels, in which case it was one of only two significant variables.
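The cross-validated percentages follow directly from the counts reported above. As a quick arithmetic check, the sketch below recomputes them in Python. The novel totals are inferred from the degrees of freedom in Tables 22 to 24 (df2 + 2, since there are two groups), the misclassification counts come from the author sections above, and the names used are hypothetical.

```python
# (novels analyzed, novels misclassified under leave-one-out cross-validation)
# Totals inferred from df2 + 2 in Tables 22-24; misclassifications as reported in the text.
loo_counts = {
    "Murdoch": (20, 2),
    "Christie": (36, 3),
    "Pratchett": (35, 4),
}


def accuracy_percent(total, misclassified):
    """Percentage of novels classified correctly, rounded to one decimal."""
    return round(100 * (total - misclassified) / total, 1)


for author, (total, misclassified) in loo_counts.items():
    print(author, accuracy_percent(total, misclassified))
```

This gives 90.0, 91.7, and 88.6 percent. With groups this small, a single additional misclassified novel shifts an author's accuracy by roughly three to five percentage points, which is worth keeping in mind when comparing the three authors.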

Depression Group

As with the raw values, linear regression, and hierarchical cluster analysis results for Vonnegut and Woolf, linear discriminant analysis yielded sporadic results for the depression group authors. Classification accuracy was 76.9% for Vonnegut and 100% for Woolf, but the model did not achieve statistical significance for Vonnegut (p = .67) or Woolf (p = .675). As such, less space is devoted to discussing the results for the depression group since none were significant. Tables 25 and 26 below show variable information for Vonnegut and Woolf respectively. Leave-one-out cross-validation was not used in analysis of the depression group due to the poor performance of the base model.


Table 25 Tests of equality of group means for Kurt Vonnegut. Variable weights from structure matrix.

| Variable | Wilks' Lambda | F | df1 | df2 | p | Variable Weight |
|---|---|---|---|---|---|---|
| overall relative IT-V-A-T/T | 0.931 | 0.816 | 1 | 11 | 0.386 | -0.24 |
| MAT5000 | 0.934 | 0.781 | 1 | 11 | 0.396 | 0.235 |
| Effective Types | 0.95 | 0.58 | 1 | 11 | 0.462 | 0.202 |
| special types | 0.967 | 0.378 | 1 | 11 | 0.551 | 0.163 |
| Clusters20 | 0.967 | 0.378 | 1 | 11 | 0.551 | 0.163 |
| Semantic Disparity | 0.975 | 0.286 | 1 | 11 | 0.604 | 0.142 |
| Evenness | 0.984 | 0.174 | 1 | 11 | 0.684 | -0.111 |
| IT VERB ADJ THAT relative | 0.994 | 0.071 | 1 | 11 | 0.795 | -0.071 |

Table 26 Tests of equality of group means for Virginia Woolf. Variable weights from structure matrix.

| Variable | Wilks' Lambda | F | df1 | df2 | p | Variable Weight |
|---|---|---|---|---|---|---|
| IT VERB ADJ THAT relative | 0.686 | 3.211 | 1 | 7 | 0.116 | 0.389 |
| Semantic Disparity | 0.784 | 1.933 | 1 | 7 | 0.207 | 0.302 |
| special types | 0.824 | 1.497 | 1 | 7 | 0.261 | 0.266 |
| Clusters20 | 0.946 | 0.398 | 1 | 7 | 0.548 | -0.137 |
| MAT5000 | 0.956 | 0.323 | 1 | 7 | 0.588 | 0.123 |
| overall relative IT-V-A-T/T | 0.733 | 2.547 | 1 | 7 | 0.155 | -0.069 |
| Effective Types | 0.99 | 0.069 | 1 | 7 | 0.8 | 0.057 |
| Evenness | 0.994 | 0.043 | 1 | 7 | 0.841 | -0.045 |

Returning to Research Questions

RQ1 - Will complex measures of lexical diversity support the findings of Le et al.’s 2011 study?

The results of this study generally coincide with the findings of Le et al. (2011) in that lexical diversity was a strong predictor of age in the writing of authors who developed memory-affecting dementia. Since LD did not predict age in authors who aged healthily, it can be inferred that LD is functioning as a predictor of dementia and the associated language-related memory deterioration. Further, Le et al. (2011) suggested that syntactic measures had less consistent predictive results than lexical measures and fluctuated more widely for Christie than for Murdoch. Indeed, the two lexicosyntactic measures used in this study had consistently less predictive power in analyses of dementia-affected authors except for overall relative IT-V-A-T/T for Murdoch. The main contradiction of this study relative to Le et al. (2011) is that Le et al. suggested that Murdoch’s writing went through a trough in her mid-forties and early fifties, but the results of this study do not corroborate that suggestion.

RQ2 - Will these measures provide more nuanced detection of the development of dementia?

Importantly, the results of this study seem to suggest that decline in lexical diversity associated with healthy aging is distinct from decline in lexical diversity due to dementia primarily quantitatively rather than qualitatively. There was no consistent pattern of individual aspects of LD declining in dementia group authors that did not decline in control group authors, nor was there much consistency between authors in the same group. Thus, declining LD associated with dementia seems to be mostly distinguishable from aging-related LD changes in the severity of the overall decline. The specific qualities that suffered the most from the authors’ disease are different for each author.

The Delta Prime values that resulted from the LD variables and supplementary variables provide interesting insight into the generalized pattern of decline of each dementia group author. Le et al. (2011) describe a general decline throughout the entirety of their Christie dataset, but the LD model used in this study, as well as the Delta Prime transformations thereof, suggests that her decline most clearly manifested beginning around her late sixties, when Ordeal by Innocence was published. Le et al. (2011) also identify Passenger to Frankfurt as an extreme outlier, but this study suggests that, while Passenger is indeed a bit more lexically diverse than its immediate chronological neighbors, it still shows clear signs of the effects of dementia. This is supported by linear discriminant analysis and hierarchical cluster analysis.

Individual LD variables in their raw form also offered nuanced insight into the development of dementia. Le et al. (2011) posit that Murdoch’s decline from AD began to manifest in The Green Knight. While the hierarchical cluster analysis does not seem to suggest that The Green Knight is symptomatic of dementia, raw LD values, Delta Prime, and linear discriminant analysis support Le et al. (2011). The Green Knight’s MAT5000 value is distinctly lower than those of her previous novels. Delta Prime values show Murdoch’s decline beginning in The Green Knight and peaking in Jackson’s Dilemma, and linear discriminant analysis classifies The Green Knight as a +dementia novel. However, the leave-one-out cross-validated linear discriminant analysis, with Jackson’s Dilemma as the model’s only example of dementia to use for classification, misclassifies The Green Knight. In that regard, perhaps Murdoch’s decline indeed began to manifest in The Green Knight, but less distinctly than in Jackson’s Dilemma.

The case of Terry Pratchett in this study is one of the most interesting in that linear discriminant analysis still classified his novels correctly and achieved statistical significance even though the other analyses in the study suggested otherwise. Delta Prime showed no linear relationship with Pratchett’s age, unlike the other two dementia authors, and hierarchical cluster analysis did not group his +dementia novels in a way that would suggest that they were distinct from his other novels in terms of LD. The success of the linear discriminant analysis seems to be tied to distinct increases in certain LD variables corresponding to decreases in other LD variables in Pratchett’s three +dementia novels.

RQ3 - Can these measures be applied to depression, and particularly depression escalating into suicide?

The results of this study for authors in the depression group were sporadic at best. Though some of the LD and supplementary variables for Woolf seemed to show distinct changes in her last novel, no statistical analyses other than linear regression for one lexicosyntactic variable were statistically significant. Likewise, Vonnegut’s LD did not yield patterns that were useful for predicting the presumed intensification of his depression before his suicide attempt. However, the pathological condition coding for the depression group, and especially for Vonnegut, was much less straightforward than that of the dementia group. It is hard to be certain when depression has reached suicidal intensity, and biographical information is likely not enough to make this distinction when coding the data. Especially for someone like Virginia Woolf, whose mental health underwent rapid declines and recoveries at multiple points in her life, the lack of a recent suicide attempt does not necessarily mean that the person is not suicidally depressed.

Limitations to the Study and Future Directions

As mentioned in the previous section, in reference to the depression group, the coding for +suicidal and -suicidal could very well be mistaken, and it is hard to verify without detailed and explicit information from the subject whether they were suicidally depressed while writing any given text. Future studies pursuing detection of suicidal depression in literary writing should seek authors with extensive diaries available, if possible.

Limitations in data preparation include the possibility of errors in the raw text of the novels that I did not detect, as I was the only person quality-checking the text files for all 151 novels. As mentioned in the Method section, the text cleaning process resulted in errors such as some words being accidentally combined with neighboring words due to unintentional removal of spaces, which in turn affected tokenization, lemmatization, and all following analyses.

It is possible that, as past studies mention (Le et al., 2011 & Garrard et al., 2005), dialogue in novels may warp the usefulness of LD measures since the author may intentionally simplify vocabulary and other features of their writing to create characters. For this study, I did not separate dialogue from narration due to the impracticality given the volume of data. Hand-separating dialogue from narration would be infeasible for this project, and due to inconsistencies in the ways dialogue is represented in books (e.g. sometimes double quotes, sometimes single quotes, sometimes none at all, sometimes no opening quote if the dialogue starts a chapter), it was impractical to create a Python script to attempt to separate the dialogue. Future studies should compare samples of writing from an author that contain both dialogue and narration, samples that contain only narration, and samples that contain only dialogue to assess whether significant differences in LD are present.
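To illustrate why automated separation is unreliable, consider the simplest possible approach: treating everything between a pair of double quotes as dialogue. The sketch below is hypothetical and was not part of this study's pipeline; the function name and example sentences are invented for illustration. It works on a cleanly quoted sentence and fails as soon as an opening quote is missing.

```python
import re

# Matches text between one pair of straight double quotes.
DOUBLE_QUOTED = re.compile(r'"([^"]*)"')


def split_dialogue(text):
    """Return (dialogue_spans, narration), assuming balanced double quotes only."""
    dialogue = DOUBLE_QUOTED.findall(text)
    # Remove quoted spans and collapse leftover whitespace.
    narration = " ".join(DOUBLE_QUOTED.sub(" ", text).split())
    return dialogue, narration


# Cleanly quoted text: the quoted span is recovered as dialogue.
print(split_dialogue('He frowned. "Go away," he said. She left.'))

# Missing opening quote: the narration between two quote marks is
# mistaken for dialogue and the real dialogue is left in the narration.
print(split_dialogue('Go away," he said. "Now."'))
```

In the second call, the only span returned as dialogue is the narrator's "he said", which is the opposite of the intended result. Handling single quotes, apostrophes, and unquoted dialogue would each add further ambiguity of this kind.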

Character perspective in the novels may influence LD and was not considered in this study. My logic for ignoring character perspective is that the primary bottom-level difference between first-person writing and third-person writing is that first-person writing would likely use first-person pronouns for the narrator whereas third-person writing would use third-person pronouns for the same character. However, there may be higher-level style differences that would affect LD. Future studies should compare first-person novels to third-person novels by the same author to test whether character perspective affects LD.

Though Jarvis’s lexical diversity model has been thoroughly validated by human raters for short texts written by adolescent native and non-native English speakers about Charlie Chaplin clips (Jarvis, 2017), the model has not been validated for literary texts. In the future, I plan to follow the human rating and model testing steps outlined by Jarvis (2017) to confirm its validity on other genres and make any necessary adjustments.


CHAPTER 5: CONCLUSION

In this study, I presented the largest longitudinal computational analysis of literary texts for the purpose of detecting dementia to date. I applied a complex, six-measure model of lexical diversity proposed by Dr. Scott Jarvis, as well as two theoretically motivated supplementary measures, to 151 novels by seven authors. My goal was to follow up and expand on the analyses conducted by Le et al. (2011). All variables behaved mostly as expected, with increases in clusters20 and decreases in MAT5000, effective types, special types, semantic disparity, evenness, IT VERB ADJ THAT relative, and overall relative IT-V-A-T/T equating to lower lexical diversity and occurring in novels written later in life. My results generally supported those of Le et al. (2011), with some exceptions; for example, I did not find evidence of a trough in Murdoch’s lexical diversity in her mid-forties and early fifties.

Using a multi-faceted model of lexical diversity allowed nuanced insights into the development of dementia in Agatha Christie, Iris Murdoch, and Terry Pratchett. Through linear regression of individual variables, calculation of Delta Prime values, hierarchical cluster analysis, and linear discriminant analysis, my model successfully predicted age in authors with dementia and showed no significance for healthy controls. I found that Christie’s dementia seems to have started in her late sixties. Iris Murdoch’s dementia is most strongly apparent in her final novel, Jackson’s Dilemma, but signs also appear in The Green Knight. Terry Pratchett’s dementia, though it primarily affected him physically rather than lexically, seems to manifest most clearly through two pairs of mirrored, opposite rises and declines in four LD variables in his last three novels. Different LD variables contributed more to the discriminant model for different authors, implying that dementia affects individuals differently, or at least that the three authors examined had different types of dementia. Further, this suggests that the effects of dementia on LD do not have a distinct qualitative pattern and that the decline associated with dementia is distinct from the decline associated with age primarily in its severity. Applying this LD model to novels by authors who attempted suicide led to null results in attempting to predict suicidal depression but provided more evidence that the LD model does not produce false positives for authors who do not suffer from dementia.

Future iterations of this project will include even more novels from Christie and Murdoch so that more precise timelines of the progression of their dementia can be developed. I also plan to gather human ratings of LD in passages of the novels used in this study to validate the computational LD model and guide any necessary adjustments.

In the long term, this study will hopefully contribute to a body of knowledge that could eventually grow into a powerful diagnostic tool for detecting dementia among the still-living. For this LD model to be useful and practical in alerting a patient to encroaching dementia, it needs to be tested on the type of text that possible patients would produce consistently enough to be used as data. The necessity of language data in a consistent medium over the course of the patient’s life likely points toward social media, blogs, and communicative writing like email and text messages as the next frontier in experimental linguistic research on dementia detection.


REFERENCES

About Mary Higgins Clark (n.d.). Retrieved from http://maryhigginsclark.com/about-mary/

Baddeley, A. D., Bressi, S., Della Sala, S., Logie, R., & Spinnler, H. (1991). The decline of working memory in Alzheimer's disease. Brain, 114(6), 2521-2542.

Baddeley, J. L., Daniel, G. R., & Pennebaker, J. W. (2011). How Henry Hellyer’s use of language foretold his suicide. Crisis.

Bernard, J. D., Baddeley, J. L., Rodriguez, B. F., & Burke, P. A. (2016). Depression, Language, and Affect: An Examination of the Influence of Baseline Depression and Affect Induction on Language. Journal of Language and Social Psychology, 35(3), 317-326.

Bird, H., Ralph, M. A. L., Patterson, K., & Hodges, J. R. (2000). The rise and fall of frequency and imageability: Noun and verb production in semantic dementia. Brain and Language, 73(1), 17-49.

Bradshaw, J., Saling, M., Hopwood, M., Anderson, V., & Brodtmann, A. (2004). Fluctuating cognition in dementia with Lewy bodies and Alzheimer’s disease is qualitatively distinct. Journal of Neurology, Neurosurgery & Psychiatry, 75(3), 382-387.

Bucci, W., & Freedman, N. (1981). The language of depression. Bulletin of the Menninger Clinic, 45(4), 334.

Bucks, R. S., Singh, S., Cuerden, J. M., & Wilcock, G. K. (2000). Analysis of spontaneous, conversational speech in dementia of Alzheimer type: Evaluation of an objective technique for analysing lexical performance. Aphasiology, 14(1), 71-91.

Castañeda-Jiménez, G., & Jarvis, S. (2014). Exploring lexical diversity in second language Spanish. In K. Geeslin (Ed.), The handbook of Spanish second language acquisition (pp. 498-513). Boston: Wiley.

De Choudhury, M., Gamon, M., Counts, S., & Horvitz, E. (2013, July). Predicting Depression via Social Media. In ICWSM (p. 2).

Ellis, N. C., Römer, U., & O’Donnell, M. B. (2016). Language usage, acquisition, and processing: Cognitive and corpus investigations of construction grammar. Language Learning, 66 (Supplement 1).

Fineberg, S. K., Leavitt, J., Deutsch-Link, S., Dealy, S., Landry, C. D., Pirruccio, K., & Corlett, P. R. (2016). Self-reference in psychosis and depression: a language marker of illness. Psychological Medicine, 46(12), 2605-2615.

Garrard, P., Maloney, L. M., Hodges, J. R., & Patterson, K. (2005). The effects of very early Alzheimer's disease on the characteristics of writing by a renowned author. Brain, 128(2), 250-260.

Grice, E. (2012, September 10). Sir Terry Pratchett: "I thought my Alzheimer's would be a lot worse than this by now". The Telegraph. Retrieved from http://www.telegraph.co.uk/lifestyle/9532983/Sir-Terry-Pratchett-I-thought-my-Alzheimers-would-be-a-lot-worse-than-this-by-now.html

Grossman, L. (2007, April 12). Kurt Vonnegut, 1922-2007. Time. Retrieved from http://content.time.com/time/arts/article/0,8599,1609650,00.html

Hebert, L. E., Wilson, R. S., Gilley, D. W., Beckett, L. A., Scherr, P. A., Bennett, D. A., & Evans, D. A. (2000). Decline of language among women and men with Alzheimer's disease. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences, 55(6), 354-361.

Hoover, D. L. (2004a). Testing Burrows's delta. Literary and Linguistic Computing, 19(4), 453-475.

Hoover, D. L. (2004b). Delta prime? Literary and Linguistic Computing, 19(4), 477-495.

Jarvis, S. (2002). Short texts, best-fitting curves and new measures of lexical diversity. Language Testing, 19(1), 57-84.

Jarvis, S. (2012). Lexical challenges in the intersection of applied linguistics and ANLP. In C. Boonthum-Denecke, P. M. McCarthy, & T. Lamkin (Eds.), Cross-disciplinary advances in applied natural language processing: Issues and approaches (pp. 50-72). Hershey, PA: IGI Global.

Jarvis, S. (2013a). Capturing the diversity in lexical diversity. Language Learning, 63 (Supplement 1), 87-106.

Jarvis, S. (2013b). Defining and measuring lexical diversity. In S. Jarvis & M. Daller (Eds.), Vocabulary knowledge: Human ratings and automated measures (pp. 13-44). Amsterdam: Benjamins.

Jarvis, S. (2015, August). Lexical diversity. Plenary presented at EuroSLA 25 (the 25th annual meeting of the European Second Language Association), Aix-en-Provence, France, August 27-29, 2015.

Jarvis, S. (2017). Grounding lexical diversity in human judgments. Language Testing, 34(4), 537-553.

Kemper, S., Thompson, M., & Marquis, J. (2001). Longitudinal change in language production: Effects of aging and dementia on grammatical complexity and propositional content. Psychology and Aging, 16, 600.

Lancashire, I., & Hirst, G. (2009, March). Vocabulary changes in Agatha Christie’s mysteries as an indication of dementia: a case study. In 19th Annual Rotman Research Institute Conference, Cognitive Aging: Research and Practice (pp. 8-10).

Le, X. (2010). Longitudinal Detection of Dementia through Lexical and Syntactic Changes in Writing. Science.

Le, X., Lancashire, I., Hirst, G., & Jokel, R. (2011). Longitudinal detection of dementia through lexical and syntactic changes in writing: a case study of three British novelists. Literary and Linguistic Computing, 26(4), 435-461.

Lyons, K., Kemper, S., LaBarge, E., Ferraro, F. R., Balota, D., & Storandt, M. (1993). Oral language and Alzheimer's disease: A reduction in syntactic complexity. Aging and Cognition, 50, 81-86.

McCarthy, P. M., & Jarvis, S. (2007). vocd: A theoretical and empirical evaluation. Language Testing, 24(4), 459-488.

McCarthy, P. M., & Jarvis, S. (2010). MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods, 42(2), 381-392.

National Institute on Aging. (2016, August 17). Alzheimer’s disease fact sheet. Retrieved from https://www.nia.nih.gov/health/alzheimers-disease-fact-sheet

Nebes, R. D. (1989). Semantic memory in Alzheimer's disease. Psychological Bulletin, 106(3), 377.

Nicholls, R. (1999, February 9). Iris Murdoch, novelist and philosopher, is dead. New York Times. Retrieved from http://www.nytimes.com/learning/general/onthisday/bday/0715.html

Pratchett, T. (2015, March 14). ‘A butt of my own jokes’: Terry Pratchett on the disease that finally claimed him. The Guardian. Reprinted from the Alzheimer’s Society. (2008). Retrieved from https://www.theguardian.com/books/2015/mar/15/a-butt-of-my-own-jokes-terry-pratchett-on-the-disease-that-finally-claimed-him

Reid, P. (2007, May 30). Virginia Woolf. Encyclopedia Britannica. Retrieved from https://www.britannica.com/biography/Virginia-Woolf

Richardson, M. (1967, November 5). The Observer. 9.

Riley, K. P., Snowdon, D. A., Desrosiers, M. F., & Markesbery, W. R. (2005). Early life linguistic ability, late life cognitive function, and neuropathology: findings from the Nun Study. Neurobiology of Aging, 26(3), 341-347.

Rude, S., Gortner, E. M., & Pennebaker, J. (2004). Language use of depressed and depression-vulnerable college students. Cognition & Emotion, 18(8), 1121-1133.

Simkin, J. (2015, August). Virginia Woolf. Spartacus Educational. Retrieved from http://spartacus-educational.com/Jwoolf.htm

Snowdon, D. A., Greiner, L. H., & Markesbery, W. R. (2000). Linguistic ability in early life and the neuropathology of Alzheimer’s disease and cerebrovascular disease. Findings from the Nun Study. Annals of the New York Academy of Sciences, 903, 34-38.

Stasio, M. (2014, November 27). P. D. James, creator of the Adam Dalgliesh mysteries, dies at 94. New York Times. Retrieved from https://www.nytimes.com/2014/11/28/arts/international/p-d-james-mystery-novelist-known-as-queen-of-crime-dies-at-94.html

Stirman, S. W., & Pennebaker, J. W. (2001). Word use in the poetry of suicidal and nonsuicidal poets. Psychosomatic Medicine, 63(4), 517-522.

Suethanapornkul, S. (2017, March). ‘It is clear… that? or to?’: Second language users’ variable knowledge of the introductory-IT construction. Presentation given at Georgetown University Roundtable conference 2017. Washington DC, USA, March 10-12.

Tang-Wai, D. F., & Graham, N. L. (2008). Assessment of language function in dementia. Geriatrics and Aging, 11(2), 103.

Van Velzen, M. H., Nanetti, L., & de Deyn, P. P. (2014). Data modelling in corpus linguistics: How low may we go? Cortex, 55, 192-201.

Van Velzen, M., & Garrard, P. (2008). From hindsight to insight–retrospective analysis of language written by a renowned Alzheimer's patient. Interdisciplinary Science Reviews, 33(4), 278-286.

Zipf, G. K. (1935). The Psycho-Biology of Language. Boston: Houghton Mifflin.

APPENDIX A: SAMPLE DATA

Data sample: first four sentences from Jackson’s Dilemma by Iris Murdoch.

Token       Tag   Lemma
ONE         CD    one
Edward      NP    Edward
Lannion     NP    Lannion
was         VBD   be
sitting     VVG   sit
at          IN    at
his         PP$   his
desk        NN    desk
in          IN    in
his         PP$   his
pleasant    JJ    pleasant
house       NN    house
in          IN    in
London      NP    London
in          IN    in
Notting     NP    Notting
Hill        NP    Hill
The         DT    the
sun         NN    sun
was         VBD   be
shining     VVG   shine
It          PP    it
was         VBD   be
an          DT    an
early       JJ    early
morning     NN    morning
in          IN    in
June        NP    June
not         RB    not
quite       RB    quite
midsummer   NN    midsummer
Edward      NP    Edward
was         VBD   be
good        RB    good
looking     VVG   look
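The data sample above can be read as a stream of whitespace-separated triples: word form, part-of-speech tag, and lemma. The following sketch groups such a stream into triples and counts the distinct lemmas. It is an illustration rather than the code used in this study; the function name parse_triples is hypothetical, and it assumes that no token contains internal whitespace.

```python
def parse_triples(tagged):
    """Group a whitespace-separated stream into (token, tag, lemma) triples."""
    fields = tagged.split()
    if len(fields) % 3 != 0:
        raise ValueError("expected a multiple of three fields")
    return [tuple(fields[i:i + 3]) for i in range(0, len(fields), 3)]


# Second sentence of the data sample above.
sample = "The DT the sun NN sun was VBD be shining VVG shine"
triples = parse_triples(sample)

# Collect the lemma column and count distinct lemma types.
lemmas = [lemma for _, _, lemma in triples]
print(triples[2])
print(len(set(lemmas)))
```

Working from the lemma column rather than the word forms means that inflected forms such as "was" and "shining" are counted under "be" and "shine".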

For questions about the Python code used in this study, contact the author at [email protected].

APPENDIX B: INDIVIDUAL VARIABLE GRAPHS

In all of the linear regressions below, the x-axis shows the author's age and the y-axis shows the measure named in the graph title.
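For reference, the R² value reported with each graph is the proportion of variance in the measure that a straight-line fit on age accounts for. A minimal sketch of that calculation follows. The function name linear_fit_r2 is hypothetical, and the ages and scores are invented for illustration; they are not data from this study.

```python
def linear_fit_r2(x, y):
    """Least-squares slope, intercept, and R² for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares give R² = 1 - SS_res / SS_tot.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented ages and scores, for illustration only (not thesis data).
ages = [40, 50, 60, 70, 80]
scores = [1250, 1240, 1200, 1190, 1150]
slope, intercept, r2 = linear_fit_r2(ages, scores)
print(round(slope, 2), round(r2, 3))
```

A negative slope corresponds to a measure that declines with age, and an R² closer to 1 indicates that age accounts for more of the variation in that measure.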

[Graph: Special types - P.D. James; R² = 0.28]

Linear regression for special types for P.D. James (p = .024).

[Graph: Semantic disparity - P.D. James; R² = 0.4143]

Linear regression for semantic disparity for P.D. James (p = .004).

[Graph: MAT5000 - Mary Higgins Clark; R² = 0.3446]

Linear regression for MAT5000 for Mary Higgins Clark (p = .007).

[Graph: Effective types - Mary Higgins Clark; R² = 0.3436]

Linear regression for effective types for Mary Higgins Clark (p = .007).

[Graph: Special types - Mary Higgins Clark; R² = 0.3178]

Linear regression for special types for Mary Higgins Clark (p = .01).

[Graph: Semantic disparity - Mary Higgins Clark; R² = 0.2038]

Linear regression for semantic disparity for Mary Higgins Clark (p = .046).

[Graph: IT VERB ADJ THAT - Mary Higgins Clark; R² = 0.4354]

Linear regression for IT VERB ADJ THAT for Mary Higgins Clark (p = .002).

[Graph: overall relative IT-V-A-T/T - Iris Murdoch; R² = 0.5024]

Linear regression for overall relative IT-V-A-T/T for Iris Murdoch (p < .001).

[Graph: MAT5000 - Agatha Christie; R² = 0.4778]

Linear regression for MAT5000 for Agatha Christie (p < .001).

[Graph: Effective types - Agatha Christie; R² = 0.2831]

Linear regression for effective types for Agatha Christie (p = .001).

[Graph: Evenness - Agatha Christie; R² = 0.2297]

Linear regression for evenness for Agatha Christie (p = .003).

[Graph: Semantic disparity - Agatha Christie; R² = 0.4883]

Linear regression for semantic disparity for Agatha Christie (p < .001).

[Graph: Special types - Agatha Christie; R² = 0.304]

Linear regression for special types for Agatha Christie (p < .001).

[Graph: IT VERB ADJ THAT relative - Agatha Christie; R² = 0.2475]

Linear regression for IT VERB ADJ THAT relative for Agatha Christie (p = .002).

[Graph: overall relative IT-V-A-T/T - Agatha Christie; R² = 0.1383]

Linear regression for overall relative IT-V-A-T/T for Agatha Christie (p = .026).

[Graph: Delta Prime - Agatha Christie; R² = 0.47]

Linear regression for Delta Prime for Agatha Christie (p < .001).

[Graph: MAT5000 - Terry Pratchett; R² = 0.3066]

Linear regression for MAT5000 for Terry Pratchett (p = .001).

[Graph: Evenness - Terry Pratchett; R² = 0.444]

Linear regression for evenness for Terry Pratchett (p < .001).

[Graph: Semantic disparity - Terry Pratchett; R² = 0.2569]

Linear regression for semantic disparity for Terry Pratchett (p = .002).

[Graph: Special types - Terry Pratchett; R² = 0.2321]

Linear regression for special types for Terry Pratchett (p = .003).

[Graph: Delta Prime - Kurt Vonnegut]

Delta Prime values for Kurt Vonnegut.

[Graph: Delta Prime - Virginia Woolf]

Delta Prime values for Virginia Woolf.

APPENDIX C: DENDROGRAM FROM HIERARCHICAL CLUSTER ANALYSIS

The dendrogram below is the result of a hierarchical cluster analysis conducted on all authors with all LD and supplementary variables. It has been split into three parts and reproduced across two pages to improve readability within space constraints.

Figure C1. First third of dendrogram.

Figure C2. Second third of dendrogram.

Figure C3. Final third of dendrogram.
