Style in Science Fiction and Fantasy

Studies in Stylometry

by

Naomi K. Fraser B.A. (Hons)(Newcastle)

A thesis submitted in fulfilment of the requirements for the award of the degree of Doctor of Philosophy

The University of Newcastle

Australia

November, 2017

This research was supported by an Australian Government Research Training Program (RTP) Scholarship

I hereby certify that the work embodied in the thesis is my own work, conducted under normal supervision. The thesis contains no material which has been accepted, or is being examined, for the award of any other degree or diploma in any university or other tertiary institution and, to the best of my knowledge and belief, contains no material previously published or written by another person, except where due reference has been made in the text. I give consent to the final version of my thesis being made available worldwide when deposited in the University’s Digital Repository, subject to the provisions of the Copyright Act 1968 and any approved embargo.

Signed

Naomi Fraser

Acknowledgments

My first encounter with stylometry was due to Professor Hugh Craig, who has subsequently backed this project with every resource necessary to see it to completion. Associate Professor Caroline Webb has brought her expertise to bear on this project with flair and close attention to detail. I am thankful to both my supervisors for their patient endurance. This project would not have gotten far off the ground without the technical assistance of Dr. Alexis Antonia, Dr. Jack Elliott, Dr. Bill Pascoe and Emeritus Professor John Burrows from the Centre for Literary and Linguistic Computing at the University of Newcastle, Australia. I also want to thank Ben Brawn for his mathematical help, Ruth Tapp for her editing services and Dr. Jill McKeowen, who has offered much support throughout the process.

The development of this project was aided by the members of the McMullin Building who have always offered me friendship and encouragement. In particular, I wish to thank the HDR cohort and those who occupy the once empty desks of MC148 — my fellow travellers, you have brought life and fellowship to what was once a lonely road. Thank you to my family and friends for your support and for reminding me of my purpose.

The final acknowledgment is to my husband, John, who made it his mission to read the books I talk about, to understand my arguments and who has supported the vision for this project since before I could articulate it myself.

Table of Contents

List of Tables and Figures
Abstract
Preface

Style in the Service of Genre
Introduction
Style
Genre Theory
Conclusion

Stylometry in the Service of Style
Introduction
The Scope of Stylometry
2.2.1 Stylometry and the Digital Humanities
2.2.2 Stylometry and Authorial Attribution
2.2.3 Stylometry and the Questions “Beyond” Authorship
2.2.4 The Interpretational Gap
Introduction to Principal Component Analysis
2.3.1 Applying PCA
2.3.2 Proportional Frequencies
2.3.3 Word Variables
2.3.4 Correlation versus Covariance
2.3.5 Interpreting PCA Results
Methodology
2.4.1 Selecting Texts
2.4.2 Counting Words
2.4.3 Applying PCA
2.4.4 Interpreting Results
Conclusion

Contextualising the Style of Early Science Fiction and Fantasy
3.1 Introduction
3.2 A Computational Study of Style
3.2.1 The Styles at the Extremes of PC1
3.2.2 The Style of Lilith
3.2.3 The Style of The Time Machine
3.2.4 Utopian Styles
3.3 Conclusion

The Case for
4.1 Introduction
4.2 Stapledon and Wells
4.3 Stapledon and Woolf
4.3.1 The Closest: Sirius and Orlando
4.3.2 The Extremes: and The Years
4.4 Conclusion

Stylistic Variation in the Harry Potter Sequence
Introduction
New Measures of Variation
5.2.1 Three Stylistic Variations
5.2.2 Measuring Change in Chapters
5.2.3 Measuring Change in Direct Speech and Narration
Variations in other Series
5.3.1 The Complete Chronicles of Narnia
5.3.2 Young Wizards
Conclusion

Conclusion: Expanding the Verbal Universe

Appendices
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Appendix F
Appendix G
Appendix H
Appendix I
Appendix J

Works Cited

List of Tables and Figures

Table 5.1 4-grams with Frequency >5 in Philosopher's Stone
Table 5.2 4-grams with Frequency >5 in Half-Blood Prince
Table 5.3 Context of "for a moment" in Philosopher's Stone
Table 5.4 Context of "for a moment" in Half-Blood Prince
Table 5.5 Significance of Correlation Coefficients in the PCA Results on the Original and the New Millennium Editions (NM) of Young Wizards

Figure 2.1 PCA on Harry Potter Corpus Using Covariance Matrix
Figure 2.2 PCA on Harry Potter Corpus Using Correlation Matrix
Figure 2.3 The Order of 92 Words in Harry Potter and the Deathly Hallows (HP7)
Figure 3.1 PCA Scores of 31 Texts from 1871-1900 (PC1 vs. PC2)
Figure 3.2 PCA Loadings of 99 Words in 31 Texts from 1871-1900 (PC1 vs. PC2)
Figure 3.3 Manual PCA Scores for Lilith Chapters with The Time Machine and The Eustace Diamonds (PC1 vs. PC2)
Figure 3.4 Cluster Dendrogram of The Time Machine using Squared Euclidean Distance Measure
Figure 3.5 Manual PCA Scores for The Time Machine Chapters with The Eustace Diamonds (PC1 vs. PC2)
Figure 4.1 PCA Scores of Stapledon and Wells (PC1 vs. PC2)
Figure 4.2 PCA Loadings of 100 Words in Stapledon and Wells (PC1 vs. PC2)
Figure 4.3 PCA Scores Woolf and Stapledon (PC1 vs. PC2)
Figure 4.4 PCA Loadings of 99 Words in Woolf and Stapledon (PC1 vs. PC2)
Figure 5.1 Grade Levels of all seven Harry Potter Books
Figure 5.2 Book Length of all seven Harry Potter Books
Figure 5.3 PCA Scores of the Seven Harry Potter Novels (PC1 vs. PC2)
Figure 5.4 PCA Loadings of 92 Common Words in the Seven Harry Potter Books (PC1 vs. PC2)
Figure 5.5 Manual PCA Scores for 199 Chapters in Harry Potter Sequence (PC1 vs. PC2)
Figure 5.6 Rate of Direct Speech in the Seven Harry Potter Books
Figure 5.7 PCA Scores for Harry Potter Books Divided into Seven Sections of Direct Speech and Seven Sections of Narration (PC1 vs. PC2)
Figure 5.8 PCA Loadings for 92 Words on Harry Potter Books Divided into Seven Sections of Direct Speech and Seven Sections of Narration (PC1 vs. PC2)
Figure 5.9 PCA Scores of Narnia (PC1 vs. PC2)
Figure 5.10 PCA Loadings for 93 Words in Narnia (PC1 vs. PC2)
Figure 5.11 PCA Scores of Young Wizards Series (PC1 vs. PC2)
Figure 5.12 PCA Scores for Young Wizards (The New Millennium Edition) (PC1 vs. PC2)

Figure 5.13 PCA Loadings for 97 Words in Young Wizards (Original Versions) (PC1 vs. PC2)

Abstract

This thesis explores style in science fiction and fantasy by applying stylometry to three case studies. These are genres that are traditionally analysed for the potency of their themes and tropes rather than for their language and style. In each case study, the multivariate method Principal Component Analysis (PCA) is applied as a data-reduction technique to corpora containing examples of science fiction and fantasy texts. The data-set analysed by PCA includes the proportional frequencies of no more than one hundred of the most common words in each corpus. As such, this thesis analyses the underlying style of texts through common words that are generally overlooked by readers but are ubiquitous and integral to the structure of language. Style is a multifarious term but is defined broadly in this thesis to mean patterned variation in the way language is used, often to artistic effect, such as in genres or between authors. By studying the base strata of style this thesis relates the function of underlying features in language to prominent features of texts and genres.

The statistical study of style is presented in three case studies: the contextualisation of style in early science fiction and fantasy; the case of Olaf Stapledon’s style; and the stylistic variation across J.K. Rowling’s Harry Potter sequence (1997-2007). The first case study examines late-Victorian texts and asks how H.G. Wells’s and George MacDonald’s early science fiction and fantasy texts, The Time Machine (1895) and Lilith (1895), stylistically relate to contemporaneous works. The results indicate that Wells’s style is similar to adventure fiction while MacDonald’s fantasy, though it appears highly eccentric to the contemporary reader, does not stand out among peers. The second case study presents an argument for the effectiveness of the style of Olaf Stapledon through computational comparisons with his predecessor, H.G. Wells, and his modernist contemporary, Virginia Woolf. Although Stapledon is considered a poor stylist and is celebrated on the basis of his visionary imagination rather than his storytelling prowess, the results indicate that his style is relatively distinctive but also consistent with his artistic goals.

The third case study explores whether computational evidence can be found for underlying stylistic variation in Rowling’s Harry Potter sequence. The results are then compared with those from two other fantasy book series, C.S. Lewis’s Complete Chronicles of Narnia (1950-1956) and Diane Duane’s Young Wizards (1983-2016), in order to ascertain the nature of style in other series. The statistical results demonstrate progressive change in Rowling’s sequence occurring at a level previously unknown to critics.

Together these three case studies offer new perspectives on the role of style in science fiction and fantasy through a computational method that has not previously been employed in relation to either genre.

Preface

Upon my first introduction to stylometry, the quantitative study of style, there was one question that immediately came to mind: can the styles of genres be distinguished by only the most common elements of language, frequently used words? Common words, often skipped over by both writer and reader, are integral to the formation of sentences and the creation of meaning. Yet how useful are they as indicators of artistic expression in the context of science fiction and fantasy, two genres that generally rely on iconic tropes for distinction? This thesis does not seek a definitive list of stylistic markers to distinguish a science fiction text from fantasy; classification is not the goal. Rather, the research has been undertaken with the goal of exploring the extent to which the study of style can lead to new interpretations of the genres and further our understandings of some specific texts. Prior to the discussion on style, genre and methodology, this prefatory section offers a brief note on the organisation of the thesis, the choice of texts selected and an introductory statement about the role of stylometry as literary criticism.

The results of this project are presented in three case studies: the first offers a snapshot from early science fiction and fantasy, the second traces the evolution of early science fiction and the last offers a glance into some more contemporary forms of fantasy fiction. The methodology of stylometry requires the texts to be in a machine-readable, electronic format. For legal reasons, the only works available are those that are either in the public domain or accompanied by express permission from publishers. In the public domain, there are a variety of works by early science fiction authors including H.G. Wells, Mary Shelley, Edward Bellamy and Edward Bulwer-Lytton. Similarly, there are several early fantasy writers with work in the public domain including Lord Dunsany and William Morris. For the first case study (Chapter 3), George MacDonald’s work Lilith (1895) was selected as a recognisable example of the early form of portal and quest fantasy (Mendlesohn, Rhetorics 48). Alongside Lilith, H.G. Wells’s The Time Machine (1895), a celebrated work of early science fiction, has been selected to be a counterweight. As coeval works, the texts are ideal for comparison. Not only was Wells aware of MacDonald’s fantasy but he was also impressed with MacDonald’s solutions to some of the artistic problems they both faced by exploring their shared interests in multiple dimensions and realities (Hein 390). In order to gain a robust picture of any shared or distinctive stylistic traits between MacDonald and Wells, the first case study puts The Time Machine and Lilith against a backdrop of twenty-nine contemporaneous novels of other genres found in the public domain from the late-Victorian era. Observed similarities and differences in style expose the different strategies Wells and MacDonald employed for similar artistic goals as well as the stylistic relationship between their works and other genres.

The second case study (Chapter 4) is a study of Olaf Stapledon whose nine works of early science fiction are in the public domain and available through Project Gutenberg. Writing during the interwar period, Stapledon’s science fiction has been celebrated by critics and some have even claimed that his influence on the genre is “probably second only to that of H.G. Wells” (Ashley and Clute). Stapledon is, however, a neglected writer of early science fiction, particularly on the topic of style. Chapter 4 is a study of Stapledon’s fictional work in relation to two other authors: Wells and Virginia Woolf. The computational studies prove a rich source of literary enquiry into Stapledon’s visionary and experimental science fiction works and present a stylometric perspective on a pertinent question asked by Stapledon scholars: is there a traceable lineage from Wells to Stapledon? (Crossley, Olaf 198).

The third case study (Chapter 5) explores the stylistic variation in J.K. Rowling’s fantasy sequence, Harry Potter (1997-2007). Critics have suggested that the Harry Potter sequence experiences a shift from children’s fantasy to young adult fantasy (Levy and Mendlesohn 133). However, change has not previously been studied at a deep level of language use. Using stylometry, Chapter 5 identifies a chronological shift in Rowling’s style that is present at a deep and consistent level. As the book series is a dominant form in fantasy fiction (Keen 725), this chapter also analyses two additional fantasy series, C.S. Lewis’s The Complete Chronicles of Narnia (1950-1956) and Diane Duane’s Young Wizards (1983-2016). Narnia is used with express permission from the estate of C.S. Lewis, and both Rowling’s and Duane’s books are available for purchase through their respective websites without security encryptions.1 Although stylistic variations do occur across the books of Narnia and Young Wizards, they are not chronologically progressive changes as in Harry Potter. The stylometric account of the three fantasy series demonstrates the relationship between latent variations and salient changes.

1 The Harry Potter books were purchased through www.pottermore.com and Young Wizards through ebooksdirect.dianeduane.com. As the copyright law in Australia includes a fair use clause for academic purposes, the only question of legality arises because of the necessity to bypass security encryptions on eBooks in order to change the format to an XML or text file. Texts sold as eBooks without Digital Rights Management (DRM) allow the format of the file to be changed – a necessary step before counting words – without breaching the conditions of purchase.

The choice to use these two genres comes down to three main factors: style is problematic in the two genres, as are questions of genre boundaries, and as very popular genres they embody one of the main problems facing twenty-first century literary criticism: the excessive number of volumes available for reading. Since the late 1980s, computational literary criticism has been a recognised field but it has gained most traction in the study of authorship, particularly focusing on eras where public domain works are numerous such as Early Modern drama (Craig and Kinney), Victorian novels (Tabata, “Stylometry”; Heuser and Le-Khac; Hoover, “Corpus”) and some modernist texts (Rybicki and Heydel). With direct permission, the stylometric study of popular category romance has also yielded interesting insights (Elliott). The two relatively modern genres, science fiction and fantasy, are leading the way in publishing with DRM-free formats.2 There are specific imprints from major publishers, such as Macmillan’s Tor and Forge imprints, as well as publishers such as Baen Books, that are dedicated to publishing science fiction and fantasy electronic books without security encryption. In addition, the rate of independent self-published books is rapidly increasing and most self-published eBooks are DRM-free.3 Although these trends have the promise of making more science fiction and fantasy available for stylometric studies, identifying which works to read remains difficult, as a large volume of quality literature is surfacing with only underground methods of promotion.4 By selecting recognisable examples of science fiction and fantasy and focusing on three key areas of interest in the genres (their emergence, their evolution and the stylistic characteristics of one of fantasy fiction’s dominant forms), the thesis aims to progress the study of stylometry with the view that, ultimately, statistical studies in style can aid those desiring to analyse the new titles that emerge at an ever increasing rate of production.

2 In addition, there have been moves away from “hard” DRM to “soft” DRM, which more closely resembles a watermark, thus allowing users to “float” their electronic books between devices rather than being bound to the one. This measure has already been adopted in Germany (Anderson). However, most moves to change copyright laws in Australia are met with opposition (Stinson).

3 According to the Author Earnings February 2017 Report, self-publishing independent authors are capturing between 24% and 34% of all eBook sales in five English-language markets – Australia, US, UK, Canada and NZ (“February 2017”). Self-publishing sites, such as Lulu, have moved away from DRM (Tom).

4 George R.R. Martin has predicted that more self-published books will be recognised by The Hugo Awards while acknowledging the difficulty in determining which books to read (Andres).

In part, the pressure of increasing rates of literary production can be interpreted as one of the driving forces behind the development of computational methods. Franco Moretti borrows the term “the great unread” from Margaret Cohen to explain how no one literary critic can ever claim to work with an entire subset of literature but tends to work only with the “canonical fraction, which is not even one per cent of published literature” (“Conjectures” 55). According to Stephen Ramsay, it is an ancient calculus; the tradition of written literature has produced so much material that most libraries contain “more than five hundred human lifetimes worth of reading material” (113). Indeed the number of new titles published each year currently exceeds the amount a single person can expect to read in their lifetime.5 However, Ramsay suggests the real pressure is due not to the amount of unreadable material itself but to the fact that it “exceeds our ability to create reliable guides to it” (113): bereft of a guide, we are left to stumble upon literature, relying on serendipitous chance to govern the selection of classics. As more machine-readable texts become available for academic study, the access to the “great unread” increases.

5 This is calculated by assigning a very generous reading rate of one book per day for the entirety of an average adult lifetime. Using the life expectancy average of 80 and considering the adult lifetime to be 62 years of that, 22,630 books (365 × 62) could feasibly be read in a lifetime. In comparison, figures from IPA’s Annual Report indicate that 184,000 titles were published in the UK in 2013 (Ingenta). Therefore, the number of books published annually is greater than eight times the number of books an adult can expect to read in their lifetime.

The present research contributes to the body of scholarship of computational approaches to literary criticism, yet the extent to which the methodology applied in this thesis will eventually come to develop an automated guide to the unread will only be revealed in time. The methodology currently offers unprecedented access to the underlying structure of texts through the statistical study of language. Moretti has coined the term “distant reading” to describe the field but I am uncomfortable ascribing the act of reading to the methodology of stylometry (Moretti, Distant). A computer can recognise words but not read, in the same way that I can recognise symbols of the Kanji script but not read Japanese. I can successfully link a sound, even some limited meaning, to certain symbols but I cannot link the symbols together in a fluid motion where each one alters the meaning of the string of symbols. I certainly do not possess the fluency to manipulate the symbols in order to create new meaning. It has been noted that one of the effects of “distant reading” is that visualisations of data have come to replace block quotations (Liu). Graphs are read closely while texts are held at a distance. As a study rooted in the home-discipline of literary studies, this thesis mixes methods with the intention of offering statistical pathways to new readings.

Rather than offering definitive guides to accessing the unread, or reliable pathways for rendering unknown material known, each case study focuses on texts that are, for the most part, widely read works of fantasy and science fiction. Using familiar texts makes the journey from data to interpretation more accessible. Each case study offers a snapshot of the underlying structure of genre and the understandings that can be gleaned from a study of style.

Therefore, from the outset, it is important to acknowledge that stylometry is a search for knowledge “not to solve problems but to make them worse” by answering questions with more questions (McCarty 8). With that said, we don’t always end up with wholly speculative answers; it is as Karl Popper once said of the scientific method:

The empirical basis of objective science has thus nothing ‘absolute’ about it. Science does not rest upon solid bedrock. The bold structure of its theories rises, as it were, above a swamp. It is like a building erected on piles. The piles are driven down from above into the swamp, but not down to any natural or ‘given’ base; and if we stop driving the piles deeper, it is not because we have reached firm ground. We simply stop when we are satisfied that the piles are firm enough to carry the structure, at least for the time being. (Popper 111)

After all, “who has ever said the last word on a problem of literary criticism? And do we even want the last word said?” (Murry 4). As for the choice to study science fiction and fantasy, style is incredibly important to these genres despite being a neglected area of scholarship. As Ursula Le Guin has argued, when the text is presenting a wholly other or alternate world the voice of the creator cannot falter:

A world where no voice has ever spoken before; where the act of speech is the act of creation. The only voice that speaks there is the creator’s voice. And every word counts. (Le Guin 154)

The question the thesis seeks to address is, what can we learn from studying the frequencies of only the one hundred or so most common words in any given collection of texts? Through three case studies the dissertation proposes that the underlying structures of language directly correspond with higher order variations in style that can be related to elements of genre, such as modes and norms. The point of entry is the counting of words.


Style in the Service of Genre

Introduction

Underpinning the concepts of a text is the language of a text. The ability of a science fiction or fantasy text to reveal thematic concerns and establish atmospheres (Mandala 1); to maintain the tone of a work (Wolfe, Evaporating 35); and create “new narrative worlds” (Slusser 3, emphasis in original) is grounded in the style of the language. Here the term “grounded” is used loosely. Rather than searching for a solid bedrock of what features constitute the style of either genre, this thesis explores how the underlying patterns of language reveal key strategies in the genres. As Le Guin has said of fantasy fiction, “every word counts” (154), and counting words is the entry point for this thesis to the discussion of the structures underlying key concepts of genre. Not every word is counted, however: only the most common words, the base strata of style.

This thesis represents the first attempt to bring stylometry to bear on the lively critical traditions of the genres of science fiction and fantasy. It is, however, chiefly an act of literary criticism and therefore explores questions pertinent to criticism, including the relationship of style and genre, of genre norms and forms and the relationship between language and the verisimilitude of a text according to chief strategies of the genres.

Although this thesis is a brand-new approach to the study of style in science fiction and fantasy, it is not the first study to offer a systematic exploration of language in either genre. The works of literary-linguists Peter Stockwell and Susan Mandala both demonstrate how style serves science fiction and fantasy to “remarkable effect in thematically relevant ways” (Mandala 1) and how the “organising and processing of language” underlies the experience of a genre (Stockwell 2). The study of style in the present thesis follows Mandala’s use of Geoffrey N. Leech and Michael H. Short’s definition of style as “the way language is used in literature for ‘artistic function’” (Mandala 1).

Overall, stylometry tends to differ from most linguistic approaches and more closely resembles stylistics. Leech and Short explain that “[l]inguistics places literary uses of language against the background of more ‘ordinary’ uses of language” (5). According to stylistics, “[s]tyle can be seen as variation in language use” (Wales 436): variations that occur within a single text, genre, author’s corpus or variation in the use of language across several texts. The three case studies of this thesis explore variations in literary language against the backdrop of other literary language. As it is a study in stylometry, the present research investigates quantitatively identifiable “patterns that are linked to the processes of writing and reading” (Craig, “Stylistic”). The method employed, Principal Component Analysis (PCA), is particularly suited to allow a corpus of texts to “array themselves according to their respective affinities and disaffinities” (Burrows, “PCA”). More will be said on this in Chapter 2; for the moment, it will suffice to explain that the patterns identified using stylometry are explored in the thesis according to the artistic function of language, rather than an exclusively linguistic function.
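The general shape of such an analysis can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in this thesis: the four toy texts, the whitespace tokeniser, the function names `proportional_frequencies` and `pca`, and the choice to standardise each word variable (a correlation-style PCA) are all hypothetical. The sketch counts the most common words in a small corpus, converts the counts to proportional frequencies and projects each text onto the first two principal components.

```python
from collections import Counter

import numpy as np

# Hypothetical toy corpus, invented for illustration only.
TEXTS = {
    "A": "the traveller looked at the machine and the machine looked at him",
    "B": "i saw the raven and i knew that i was not in the house of my father",
    "C": "the time of the voyage was the time of the machine and of the traveller",
    "D": "i was in the garden and i saw that the raven was not of the garden",
}


def proportional_frequencies(texts, n_words=100):
    """Return (words, matrix); matrix[i, j] is word j's share of the tokens in text i."""
    tokenised = [text.lower().split() for text in texts]  # naive whitespace tokeniser
    totals = Counter(token for tokens in tokenised for token in tokens)
    words = [word for word, _ in totals.most_common(n_words)]
    rows = []
    for tokens in tokenised:
        counts = Counter(tokens)
        rows.append([counts[word] / len(tokens) for word in words])
    return words, np.array(rows)


def pca(matrix, n_components=2):
    """Return (scores, loadings, explained variance ratios) via SVD of the standardised matrix."""
    centred = matrix - matrix.mean(axis=0)
    std = centred.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # a word with constant frequency stays at zero after centring
    z = centred / std
    u, s, vt = np.linalg.svd(z, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]  # one row of scores per text
    loadings = vt[:n_components].T  # one row of weights per word
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


if __name__ == "__main__":
    words, freqs = proportional_frequencies(list(TEXTS.values()), n_words=5)
    scores, loadings, explained = pca(freqs)
    for name, (pc1, pc2) in zip(TEXTS, scores):
        print(f"{name}: PC1={pc1:.2f} PC2={pc2:.2f}")
```

Texts with similar profiles of common-word use receive similar scores, which is the sense in which a corpus can be said to array itself by affinity. With a real corpus the word list would be far longer and the tokenisation considerably more careful.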

Significantly, PCA will summarise the variance of a dataset in new, composite variables but will not present the totality of style as “[a]ny findings are relative to the features chosen rather than absolute” (Craig and Greatley-Hirsch 34). Researchers Hugh Craig and Brett Greatley-Hirsch go on to note that “a profile of the usage of the one hundred most common words…may be revealing, but only in specific ways” (35). It is important to remember that the term style is used to refer to much more than just frequency data. Furthermore, PCA can only answer closed questions. For instance, are the styles of these texts similar or not? While PCA allows room for interpretation, such as how closely related some texts are, these are still mostly closed questions particularly in relation to the open questions more commonly explored in literary criticism. The task of the scholar adapting PCA as literary criticism is to further the exploration by asking of the results both how the texts are and are not distinguished and why.

This final question, the why, moves the empirical approach more firmly into the realm of critical scholarship. Craig has asked previously, in the title of a paper, “If you can tell authors apart have you learnt anything about them?” His answer to this integral question was to demonstrate how the same profile of word frequencies can serve as both a test on the authorship of a work and as stylistic descriptions of the work (Craig, “Authorial” 109–112). Therefore, frequency data can assist in defining the qualities of an author’s style as well as characteristics of style in the works themselves.

It is also important to note that the quantitative analysis of style does not preclude or disqualify qualitative interpretations. J. Berenike Herrmann, Karina van Dalen-Oskam and Christof Schöch offer a new definition of style in light of emerging computational studies of style (27). After analysing three different literary traditions, German, Dutch and French, they arrived at a new definition: “[s]tyle is a property of texts constituted by an ensemble of formal features which can be observed quantitatively or qualitatively” (44). The paper is the first systematic approach for re-defining style in light of recent developments in stylometry, but the analysis provided does not encompass much of Anglo-Saxon literary traditions. Therefore, the present chapter of this thesis first demarcates the use of the word “style” as it pertains to the present study and relevant discussion concerning science fiction and fantasy.

The second section of this chapter explores the definition of genre, subgenre, mode and norms according to developments in genre theory, from the freezing of genres in neoclassical theory to the flexible Wittgenstein family-resemblance theory. The aim in most theories of genre is to avoid these two extremes, the concepts of an immutable frozen form and of the instability of the constantly changing genre. Genre theory is a pertinent discussion to the definition of science fiction and fantasy, both of which have unstable origins. Often grouped under broader terms of “fantastic” or “speculative” fiction, along with horror, these three genres “have been unstable literary isotopes virtually since their evolution into identifiable narrative modes – or at least into identifiable market categories – a process that began a century or more ago and is still going” (Wolfe, Evaporating 14). Despite the instability, Gary K. Wolfe notes, “at times they have seemed in such bondage to formula and convention that they were in danger of fossilization” (Evaporating 14). Due to this history of instability in both science fiction and fantasy, the sections in this chapter on style and genre actively explore these two concepts as they relate to theories of the two genres and practical criticism.

With threads of science fiction and fantasy criticism woven throughout the chapter, it is pertinent to offer an initial introduction of critical approaches to each genre. Arriving at a critical definition of science fiction and fantasy is not without its difficulties. Science fiction has been defined so frequently that there is “little critical consensus” concerning which texts “might be included or excluded” (Wolfe, Critical 108). Similarly, new critical definitions of fantasy tend to crop up in “[n]early every critical text in the field” (Attebery, Strategies 12) which has made the genre “tremendously difficult to pin down” (Mendlesohn and James, “Introduction” 1). It is in this critical environment that Mendlesohn, as one of the editors of The Cambridge Companion to Science Fiction, proposed in her introduction that “[s]cience fiction is less a genre … than an ongoing discussion” (Mendlesohn, “Introduction” 1). Along with Edward James, Mendlesohn is also the editor of The Cambridge Companion to Fantasy Literature where together they note that most of the definitions generated by major theorists “exclude most of what general readers think of as fantasy” (“Introduction” 1). Their volume, therefore, treats the body of fantasy on a “multiplicity of terms” that recognises “academic, reader and commercial understandings of fantasy as equally valuable” (Mendlesohn and James, “Introduction” 2). Although science fiction can be viewed as an “ongoing discussion” and fantasy fiction through a “multiplicity of terms”, such definitions afford little stability for a discussion of the genres.

On the other hand, some early critical definitions of the two genres proved to be excessively narrow. Tzvetan Todorov’s definition of the fantastic is of a genre that exists only in the reader’s moment of hesitation, between “total faith or total incredulity” (31). As soon as a decision is made, Todorov considers the text to belong to either the uncanny or the marvellous categories (31). Such a definition narrows the field of the fantastic to a particular category which includes texts such as Franz Kafka’s The Metamorphosis (1915) but not the bulk of what critics would consider to be fantasy fiction. In the same vein, a narrow category of fiction emerges from Darko Suvin’s definition of science fiction: “SF is, then, a literary genre whose necessary and sufficient conditions are the presence and interaction of estrangement and cognition, and whose main formal device is an imaginative framework alternative to the author’s own empirical environment” (Suvin, Metamorphoses 7–8, emphasis in original). Suvin himself asserts that “80 to 90 percent of the works” labelled as science fiction “are sheer confectionery” (Metamorphoses 36), that is, works that do not sufficiently meet his definition of the genre. In addition to cognitive estrangement, Suvin also demands that science fiction “be wiser than the world it speaks to” (Metamorphoses 36).

One way of dealing with narrow definitions is to follow John Rieder and interpret the category of literature that falls under “cognitive estrangement” as “a specific, late-twentieth-century, academic genre category that has to be understood partly in the context of its opposition to the commercial genre practices Suvin deplored” (“On Defining” 193). In which case, “Suvin’s definition becomes part of the history of sf, not the key to unravelling sf’s confusion with other forms” (Rieder, “On Defining” 193). However, Adam Roberts has adopted Suvin’s definition on the basis of its usefulness for demarcating pertinent characteristics of the genre. Roberts explains the particular usefulness of Suvin’s additional term, novum: “the fictional device, artefact or premise that focuses the difference between the world the reader inhabits and the fictional world of the SF text” (History 1). Examples of a novum include time travel and artificial intelligence. Estrangement is similarly useful in that it is a “creative approach…for exploring the novum” (Suvin, Metamorphoses ix). It is also a feature that differentiates science fiction “from the ‘realistic’ literary mainstream”, while the cognitive element of Suvin’s definition is what differentiates science fiction from myth; it is the element that is scientific and empirical (Metamorphoses 8). Simon Spiegel has pointed out that “[a]lthough everyone seems to agree that sf renders the content of its stories somehow ‘strange,’ there are upon closer inspection considerable differences in the way that sf scholars make use of Suvin’s own definition” (369). Spiegel goes on to note that Suvin’s own use of the term “estrangement” differed greatly from his predecessors (372). Spiegel’s definition holds that “the effect of estrangement does not arise only from making things strange, but from the naturalization of the marvellous” (375, emphasis in original). Therefore, while we may say that science fiction is any literary work (written or otherwise), that includes the cognitive elements of science and technology, it is also a genre marked by its ability to make ordinary things strange and strange things natural.

In order to encapsulate such a genre we can turn to Samuel Delany’s approach to the genre as a set of codified reading strategies. Delany suggests that the genre will mean different things, depending on whether it is approached as science fiction or as realist fiction: “most of our specific SF expectations will be organized around the question: what in the portrayed world of the story, by statement or implication, must be different from ours in order for this sentence to be normally uttered?” (Silent 31). Roberts summarised Delany’s approach as one which treats science fiction as a “reading strategy” as much as anything else (History 2, emphasis in original).

As will be evident in the discussion that unfolds, extrapolation is a key element in Delany’s reading strategy. Furthermore, it can be argued that while strategies of estrangement may serve to construct a framework in any given work of science fiction, the effect of the narrative genre is felt primarily through the ability of the reader to extrapolate from the written word to imagine a new world. As Suvin argued, science fiction “demands from the author and reader, teacher and critic, not merely specialized, quantified positivistic knowledge … but a social imagination” (Metamorphoses 36). At this point, it seems unlikely that these concepts of a socially engaged genre can be related to questions of style and underlying patterns in language use. However, there are some additional concepts of science fiction that are relevant to a study of style.


Firstly, for the purposes of the present research science fiction is applied to works that were written prior to the emergence of the term science fiction. John Rieder has argued:

In order for a text to be recognized as having generic features, it must allude to a set of strategies, images, or themes that has already emerged into the visibility of a conventional or at least repeatable gesture. Genre, therefore, is always found in the middle of things, never at the beginning of them. (“On Defining” 196)

One of the early science fiction writers, H.G. Wells, labelled his stories as “scientific fantasies” (also referred to as Scientific Romances, the title of a 1933 collection of his fiction). Although his stories could be seen as a distinct category of fiction, Wells’s own awareness of his work is revealed in the preface he wrote to the 1933 collection:

In all this type of story the living interest lies in their non-fantastic elements and not in the invention itself… the fantastic element, the strange property or the strange world, is used only to throw up and intensify our natural reactions of wonder, fear or perplexity. The invention is nothing in itself and when this kind of thing is attempted by clumsy writers who do not understand this elementary principle nothing could be conceived more silly and extravagant. Anyone can invent human beings inside out or worlds like dumb-bells or a gravitation that repels. The thing that makes such imaginations interesting is their translation into commonplace terms and a rigid exclusion of other marvels from the story. Then it becomes human. (The Scientific viii)

In the explanation of his own work, Wells refers to “invention”, a concept similar to Suvin’s cognitive estrangement and the novum. Wells further demonstrates his awareness of the unique form when he states that an invention explored in science fiction is only interesting when it is unique in the strange world and adds: “For the writer of fantastic stories to help the reader to play the game properly, he must help him in every possible unobtrusive way to domesticate the impossible hypothesis” (Wells, The Scientific viii, emphasis in original). Wells’s awareness of the form is evidence of an intentional use of strategies of the genre, such as naturalising the strange, that were present even before the label of science fiction was ascribed to the work. Therefore, Wells’s works can be, and have been, treated as science fiction. Present in Wells’s own concept of the form is one of the central aspects of science fiction: the extrapolation from empirical knowledge of reality in a manner that is “supposedly factual” (Suvin, Metamorphoses 6). As a result, a science fiction work includes the “factual reporting of fiction” in a way that “is confronting a normative system…with a point of view or look implying a new set of norms” (Suvin, Metamorphoses 6). In this way, questions of style more directly relate to formations of genre definitions: the manner in which an extrapolative concept is reported directs the reader’s engagement with the ideas contained in the literature.

Definitions of fantasy are, likewise, built around reader response. Todorov’s category of the fantastic exists only in a liminal moment of “hesitation”, while broader definitions necessitate the presence of wonder, secondary belief or some form of narrative transaction.

Corresponding to science fiction’s concept of estrangement, the narrative transaction required in fantasy has been labelled by J.R.R. Tolkien as “arresting strangeness” (45). This concept is based on the notion that “[t]he human mind is capable of forming mental images of things not actually present” (44). Tolkien argues that fantasy often “remains undeveloped”: “[a]nyone inheriting the fantastic device of human language can say the green sun…But that is not enough…To make a Secondary World inside which the green sun will be credible, commanding Secondary Belief, will probably require labour and thought” (46). While Tolkien’s musings on fantasy provide a basis for a functional definition of the genre, they offer only one approach. Brian Attebery has argued that the problem with defining fantasy is finding an approach that can encompass both A Midsummer Night’s Dream and Conan the Barbarian (Strategies 1). Hence, the genre can be understood as a mode, a formula or a “fuzzy set”. In Attebery’s discussion on mode and formula, it is apparent that mode threatens to destabilise the concept of genre, something that theorists attempt to avoid, while formula is precisely the type of “genre freezing” that is also undesirable. “It is difficult”, writes Attebery, “to say anything meaningful about either the mode, which is so vast, or the formula, which tends toward triviality. The task would be easier if there were an in-between category, something varied and capable of artistic development and yet limited to a particular period and a discernible structure” (Strategies 2).

Although clearly desiring an ideal category, Attebery is aware of the artifice in his division between formula and mode, arguing that “some notion of genre is needed” (Strategies 11). He settles on the concept of “fuzzy-sets” (Strategies 12); of which more will be said in Section 3 of this chapter.

Significantly, Attebery’s genre definitions do not preclude the role of wonder: “[t]he concept of wonder, as a key to fantasy’s impact, may best be understood as an alternative formulation of the idea of estrangement” (Strategies 16). To further understand how fantasy generates wonder through estrangement we can turn briefly to another definition of fantasy, offered first by Kathryn Hume, and which holds that one of the key aspects of the genre is the element of the fantastic or impossible; an element that falls into the category of being a departure from “consensus reality” (8). Defining “consensus reality”, or its reverse form, the impossible, proves difficult. Wolfe has suggested that the resources available to literary scholars cannot adequately answer, in its totality, the question of how something is recognised as being impossible (Evaporating 64). Wolfe, therefore, proposes a scale: with private imaginations at one end and universal myth systems at the other and fantasy fiction found “somewhere toward the middle” (Evaporating 64–5). He explains that

contemporary fantasy must engage in an implied compact between author and reader – an agreement that whatever impossibilities we encounter will be made significant to us, but will retain their idiosyncratic nature that we still recognize them to be impossible. (Evaporating 65)


“Significant” and “idiosyncratic” can be considered key features of fantasy fiction: fantasy must include an impossible element that is significant to the reader in a way they recognise without conforming to anything too formulaic – thus maintaining an idiosyncratic nature.

Formula is not “necessarily bad” (Attebery, Strategies 10); as Attebery has argued, “[e]very element of the formula may be present in a tale of sparkling originality” (Strategies 10). A similar notion to Wolfe’s “implied compact” can be found even in E.M. Forster’s early essay on fantasy: “What does fantasy ask of us? It asks us to pay something extra. It compels us to an adjustment that is different to an adjustment required by a work of art, to an additional adjustment” (75). The adjustment requires the reader to suspend disbelief and to entertain what Tolkien would consider to be a secondary belief, despite the impossibilities and unknowns that accompany the genre’s key elements. A definition of fantasy fiction as a popular genre, one that encompasses both Shakespeare’s fairies and high heroic tales of wizards and soldiers, requires an active relationship between reader and text. As Attebery has argued, “[a] fantasy writer is not writing to a faceless posterity but to a knowledgeable, coherent, and demanding group. This real audience may actually be encoded in the text as a component of the postulated reader to such an extent that an outsider, someone new to the genre, may find the narrative impenetrable” (“Fantasy” 35). The state of the genre thus encourages a definition akin to Delany’s approach to science fiction, one that interprets the genre as a set of reading strategies, encoded to present the impossible in such a manner as to induce in the reader a sense of wonder or belief.

Therefore, as well as creating verisimilitude in fictional worlds, occupying a particular place on the moving scale of the impossible, projecting a plausible world from a scientific notion, the work of a science fiction or fantasy text is required to prompt the reader to apply particular reading strategies, to suspend disbelief for both the unlikely and the outright impossible and to extrapolate, often from very few cues, the existence of an alternate future, past or present. As will be demonstrated throughout the chapter, the notion of transaction, whether through secondary belief or the active entreaty of extrapolative reading, is a recurring element for discussions that deal with the artistic function of language.

Style

A foremost concern is, what exactly is meant by the term style? In a lecture presented at Oxford in the Summer Term of 1921, John Middleton Murry stated:

A favourite habit with a term of criticism is to have two quite easily separable and distinct meanings, and to have besides, an existence in a kind of limbo, where it partakes a little of these two distinct meanings, even though they are irreconcilable. (1–2)

He was chiefly discussing the problem of style, a term used so widely that “it is very difficult to define” (Wales 435). Yet, as highlighted by Murry, it can at any one moment refer to two distinct meanings or even be in a state of limbo between meanings. In his own attempt to elucidate the multiple strands of meaning, Murry proposed three distinct uses of the word style: “Style, as personal idiosyncrasy”, “Style, as the highest achievement of literature” and “Style, as technique of exposition” (8). Treating only two meanings as relevant to studies of literature, personal styles and style as literary merit, Murry found that attention to an author’s distinctive style was often employed “as a term of praise” (9). The confusion between recognisable authorial styles and styles denoting literary merit is understandable given the highly idiosyncratic emphasis on style. Murry wrote that “a great writer is never more intensely and recognizably himself than in his greatest passages” (Murry 9). William Strunk Jr. and E.B. White later declared that all writing is “the Self escaping into the open” (67) to the extent that “[w]ith some writers, style not only reveals the spirit of the man but reveals his identity, as surely as would his fingerprints” (68). Indeed, stylometry has discovered that “authorial style is detectable in texts to a degree which surprises even traditional author-centred scholars” (Craig and Greatley-Hirsch).1 Yet a semantic difficulty remains when referencing an author’s style. Murry puts it thus: “when we say that Marlowe had style, we are referring to a quality which transcends all personal idiosyncrasy, yet needs – or seems to need – personal idiosyncrasy in order to be manifested” (7). Almost one hundred years on from Murry’s lecture and the term style is still often obscured by the two main meanings he identified.

Turning to style in science fiction and fantasy, the notion of style as both idiosyncrasy and literary aesthetic is problematic, because styles of a highly idiosyncratic nature appear to be uncommon in the two genres. To further complicate the matter, science fiction and fantasy styles do not tend to conform to the codes of literary style as handed down by the narrative tradition of the nineteenth century. We can turn to two examples of literary criticism, one of science fiction and the other of fantasy, to further illustrate.

“Dick typically produces a style that is serviceable”, wrote Carl Freedman about science fiction writer Philip K. Dick (34). In fact, Freedman objected to Philip K. Dick’s popular standing as a renowned science fiction stylist on the basis that the prose was just “functionally adequate” to accommodate the “fast-moving narratives” but lacked the polished elegance of the “allusive resonance of incontestably literary prose” (34). Therefore, Dick’s style fails what Freedman considered to be “the most prestigious test of literary significance – style” (34).

Similarly in fantasy criticism, J.R.R. Tolkien’s The Lord of the Rings (LOTR) was celebrated by an early critic as “a magnificent performance” despite not qualifying as “literature” (Raffel 218). Burton Raffel’s argument, presented in four parts, lists his criteria for literature: style, characterisation, incident, and morality. The majority of the essay covers the matter of style in LOTR, which Raffel argues is limited in scope due to the limits of Tolkien’s narrative goals. Accordingly, the prose is limited to being “no more than Faërie props”, merely “a cog” in the machine of the narrative (Raffel 226) and Tolkien is only concerned with “narrative realities” (Raffel 222).

1 Stylometric studies are still finding more empirical evidence that “the authorial signal is spread throughout the whole frequent and not-so-frequent word spectrum” so as to be found in almost every stratum of language (Eder, “Visualization” 53).

In both instances, functional styles were perceived as unliterary. Perhaps if Murry had been privy to these critical discussions he would have cast science fiction and fantasy into his third category of style, “the technique of exposition” that he reserved for reporting styles. The mode of storytelling in science fiction is readily recognised as a “reporting” style (Stockwell 76).

The origins of the modern genre are said to have come out of the journalistic drive among scientists to report findings without “rhetorical ornamentation”, birthing a new style that can be described simply as “‘scientific’ plainness” (Parrinder, Science 106). Fantasy is similarly marked by a tendency to adopt a reporting register of language: “Rather than saying, ‘If only I had wings,’ the fantastic asserts that I do” (Attebery, Strategies 6). Brian Attebery even suggests that a linguistic definition of fantasy would be “the use of the verb forms of reporting for events that in ordinary discourse would require more conditional forms” (Attebery, Strategies 6).

For the purposes of literary criticism, the term style tends to exist on two planes: the individual and the corporate. The confusing aspect is that “style naturally comes to be applied to a writer’s idiosyncrasy” (Murry 19). In addition, science fiction critic George Slusser observes that the role of style in world creation is traditionally regarded as secondary, as style is a matter of “individual utterance rather than general rhetorical structure” (4). Slusser states, “[d]espite these traditional expectations, however, style plays a significant role in SF world creation. It may in fact be central to the process many see as the primary function of SF and what gives it its generic identity: the creation not just of narrative worlds but of new narrative worlds” (3). If we are to explore the concepts of style on the second plane, the implication of style as integral to world creation, then we ought to take Le Guin’s statement literally: “each act of speech is the act of creation” (Le Guin 154).

Unlike Murry, who dismissed the category of “technique of exposition” out of hand, the study of stylistic variation seeks the root of technique, in terms of writing and reading. In the present thesis, when applied to questions of genre development and stylistic progression, a key question to arise is not just how the language functions to achieve certain aims but whether or not we can trace developments of stylistic and genre strategies through variations in language use. Such an enquiry demands attention be drawn to the elements of science fiction and fantasy that may not necessarily fall into Murry’s “true style” that must be unique (15) and is “the complete realization of a universal significance in a personal and particular expression” (8). In addition, science fiction and fantasy have had to overcome issues such as definitions of ‘literariness’.

In the same essay where he allots Dick a fail grade for style, Freedman also argues that the notion of “subliterary” as it is applied to science fiction ought to be rethought (37). Against the view, common among critics, that style must be treated as a highly individualistic process and so cannot be evaluated on the basis of its context, Freedman states “it is hardly feasible to perform any stylistic analysis on a passage from an SF work without considering the specifically SF characteristics of the style” (35). Affording no such leniency to the form of fantasy, Raffel’s argument about Tolkien’s non-literary style was refuted ten years later by Elizabeth Kirk, who argued that Raffel’s “modern” approach was built on the assumption that the “awareness” of the artist is unique and unavailable to ordinary folk (9). Accordingly, Raffel’s approach treats “the function of language in any work of art” as a force to remove the reader from their “habitual derivative consciousness” and induce them to “participate in a new one” (9). Offering some context, Kirk argues that Tolkien’s interest in the “spatial and chronological dimensions” of an “entire world” was a requisite strategy allowing Tolkien to fulfil his primary objective: to populate a fictional world with languages of his own invention (6). Rather than examining the artistic goals of the entity known as ‘literature’, Kirk places primacy on Tolkien’s artistic goals.

In doing so, she shifts the focus of style from the individual’s expression in language to the individual’s goal in the work. However, as will be demonstrated when we more closely examine science fiction and fantasy, the two genres are largely reliant on shared strategies for both reading and writing. Artistic goals tend to be understood in terms of how they are used, whether that is to recreate, subvert or even surpass existing goals and norms.

Thus far, the concept of style has eluded a formal definition that would explain at which point in the expression of language style begins to emerge. In stylometry, the study of style takes place across all strata of word frequencies, but that has not always been the case for studies of style. The multiplicity in the use of terms that Murry initially discussed resurfaces in this context, except that now it concerns the point at which the “allusive resonance of incontestably literary prose” (Freedman 34) can be found in a sentence, paragraph or text. Even where the notion of ‘literary’ has been set aside, the question emerges when critics claim that a ‘fake’ style is immediately obvious, whether it be the faked authenticity of the idiosyncratic style (Murry 15–16) or a faked aesthetic such as plainness (Le Guin 153). Yet, how is it measured? Murry invokes a test that requires feeling rather than measurement: “The test of a true idiosyncrasy of style is that we should feel it to be necessary and inevitable; in it we should be able to catch an immediate reference back to the whole mode of feeling that is consistent with itself” (16). Le Guin offers an example in which, by substituting only four words, a quotation originally from a fantasy novel can be made to resemble a fragment from a realist novel set in the political arena of Washington DC (146). She specifically selected a passage of dialogue for this purpose, claiming “style in a novel is often particularly visible in dialogue” (147). Fantasy in particular is exposed, as Le Guin explains in her conclusion: “[t]here is no comfortable matrix of the commonplace to substitute for the imagination, to provide ready-made emotional response, and to disguise flaws and failures of creation” (154). Often, the opposite is considered true of science fiction, where the concepts distract the reader from the style and the style is so plain as to disappear from view (Stockwell 76). Nevertheless, Samuel R. Delany has argued that a careful reader will notice style from the opening sentence but that as the reader progresses they will begin to notice other facets of an author’s mastery such as plot.

In the first instance, style “may glare out from the opening sentence” while other components of the writing are revealed later in the process of reading (Jewel-Hinged 31).

Presently there are two possible tests to determine when style begins to be identifiable, one by feeling and the other by substitution. Regarding the first approach we can turn once again to the critical discourse that occurred surrounding Tolkien’s LOTR trilogy. In 1938, Tolkien wrote that “[l]iterature works from mind to mind and is thus more progenitive” and is “universal”:

If it speaks of bread or wine or stone or tree, it appeals to the whole of these things, to their ideas; yet each hearer will give them a peculiar personal embodiment in his imagination…If a story says ‘he climbed a hill and saw a river in the valley below,’ the illustrator may catch, or nearly catch, his own vision of such a scene; but every hearer of the words will have his own picture, and it will be made out of all the hills and rivers and dales he has ever seen, but especially out of The Hill, The River, The Valley which were for him the first embodiment of the word. (70)

Quoting this same segment, Raffel has disagreed with Tolkien’s position on language in literature. To Raffel, the story of climbing a hill and seeing a river does not “evoke any kind of scene” (226), and he argues that the phrases Tolkien exemplifies are ingredients of narrative and therefore have nothing to do “with what words as words can communicate” nor with “the question of style” (227). In response, Kirk argued that Tolkien was raising “issues central to the twentieth century’s critical debate over the role of language” (12). Conceding that “[p]lain words lose their capacity to evoke vision and experience”, Kirk suggests that Tolkien’s answer to this is “to provide a context in which the reader’s capacity to recreate is itself recreated” (11).

Kirk notes that “Tolkien flies in the face of most modern poetics” but when viewed “in the larger patterns of western history”, his opinions are not that eccentric (12). In support of this Kirk invokes the argument, as old as Aristotle, that “the function of art is to make possible experiences that are not isolated or eccentric” (12).

Delany paints a similar portrait for science fiction when he breaks down the sentence, “[t]he red sun is high, the blue low” to support his argument that a “sixty thousand word novel is one picture corrected fifty-nine thousand, nine hundred and ninety-nine times” (Jewel-Hinged 29). As each successive word is read, the image, which began with “the”, is altered. Delany outlines what Tolkien would term “his own peculiar personal embodiment” of the image: “The is a grayish ellipsoid about four feet high that balances on the floor perhaps a yard away. Yours is no doubt different” (Jewel-Hinged 29). The particularly science fiction quality of the style in Delany’s example is in the extrapolated world procured by the image, one of a world that “crawls with long red shadows and stubby blue ones, joined by purple triangles” capable of being conjured in only the “quarter of a second” it would usually take to read the line (Jewel-Hinged 31). It has been noted by Damien Broderick that Delany’s outline of the reading process can be understood as a parable that “nicely allegorises some of the processes of coding and decoding called up by genres” (68). Certainly, Delany’s reading process is an exaggeration and, according to Broderick, follows an “additive algorithm” that “flies in the face of contemporary linguistics and semiotics”: words correspond to sentences (68).

Furthermore, the extrapolation in Delany’s reading evokes Tolkien’s oft-cited green sun. However, a sense of an alternate world is not just found in the assembly of words that would not otherwise collocate, as Tolkien himself suggested and which is quoted here at length:


The human mind, endowed with the powers of generalisation and abstraction, sees not only green-grass, discriminating it from other things (and finding it fair to look upon), but sees that it is green as well as being grass. But how powerful, how stimulating to the very faculty that produced it, was the invention of the adjective: no spell or incantation in Faerie is more potent. And that is not surprising: such incantations might indeed be said to be only another view of adjectives, a part of speech in a mythical grammar. The mind that thought of light, heavy, grey, yellow, still, swift, also conceived of magic that would make heavy things light and able to fly, turn grey lead into yellow gold, and the still rock into swift water. If it could do the one, it could do the other; it inevitably did both. When we take green from grass, blue from heaven, and red from blood, we have already an enchanter’s power upon one plane; and the desire to wield that power in the world external to our mind awakes. (24–25, emphasis in original)

At this point, some scholars discuss the role of grammatical modifiers in reference to the sub- creation of a secondary world. For instance, Kayla Snow argues that Tolkien’s concept of fantasy is as a “means for altering or distorting primary reality so as to reawaken in his readers a sense of wonder in the Primary World” (120). She goes on to suggest that objects such as green-grass “suddenly take on more depth when viewed through the lens of a subcreated

Secondary World” (120). This interpretation is unconvincing as it does not, as Frank Scafella does, consider the way natural objects become enchanted: “the materials from which the tale is formed are given in reality. Contrary to popular opinion, the fairy story is not a tale about fairies per se… but about ‘the adventures of men’ in the Faerie, the Perilous Realm” (313). In this realm, ordinary objects such as trees, birds, stone and water, even human beings, are enchanted. It is possible for this to exist as a mode, whereby as Delany has suggested, “No matter how naturalistic the setting, once the witch has taken off on her broomstick the most realistic of trees, cats, night clouds, or the moon behind them become infected” (Jewel-Hinged

33). On this point, Attebery has suggested that there are disadvantages to approaching the fantasy genre as a mode, and while we will return to the discussion of genre definition, part of his discussion concerns whether fantasy can be treated as a function of language (Strategies 4).

Attebery argues that Tolkien’s approach, through the function of language, is just one theory that could be applied to fantasy and argues that the linguistic function of fantasy is “based on our ability to separate modifier from substantive and recombine them to produce green suns and flying serpents” (Strategies 5). Although Attebery points out that not all fantasy is written or expressed in language, narrative fantasy employs the “arbitrary system” of language — the assignation of meaning to signs in an arbitrary manner. Attebery’s example of this is the

“arbitrary choice to call a cow a cow and not a lilac” where “once that choice is made” in the organisation of a system, the phrase “I am going to milk the lilacs” is not understood (Strategies

6). Slusser similarly argues that science fiction relies on choices, entered and agreed upon by a community of readers, when he explains that the metaphors of science fiction “are both individual statements and at the same time aware of an SF ‘community’” (16). Therefore, the generic identity of either science fiction or fantasy, as understood through the function of language, could be said to rely on the re-organisation of society, which is expressed first and foremost through language.

Others have explored the role of modifiers in fantasy to explicitly “demonstrate the

Tolkienesque stylistics” in descriptive language (Randall 176). There is a strong argument made by Neil Randall that, “the adjective has, in its ability to transform nouns, the power of enchantment” (176). His analysis of the stylistics in Tolkien’s “green grass” is closer to

Mandala’s literary-linguistic study of deictic noun phrases that induce secondary belief rather than other responses to Tolkien’s theory of enchantment through language. Instead of the collocation of words with modifiers, Mandala’s methodology investigates the function of larger structures in language to demonstrate how a reader is linguistically positioned to accept the

“unknown and unreal world as actual and taken for granted” (98). Mandala argues that it

“largely comes down to subtle manipulations of style” (98) which she demonstrates through a close reading of the opening paragraph to George R.R. Martin’s A Game of Thrones. The fictional elements in Martin’s world that do not, or cannot, exist in our own world are “assumed to exist” and “presented as utterly familiar” through several grammatical features including the deictic use of definite noun phrases and the suppression of explanation (Mandala 99). A deictic expression contains words such as this and that, which require context; “the hearer must know who the speaker is, and in what context (time, place, etc.) his or her utterance was made”

(Mandala 99). For example, in the expression, “it’s over there”, the speaker assumes that the intended recipient will understand both what it refers to and the position indicated by there.

Thus, “narrators who introduce people or places or things with the are assuming that readers already know about them” (Mandala 100). Mandala argues that narrators who introduce imaginary features of the world using the definite article, the, “cast” the reader, “linguistically at least, as already believing the incredible” (101). Moreover, the entire structure of the introductory paragraph requires the reader not only to automatically accept the fantastic element where it appears but to reorganise their notions of their cognitive reality as more information is added. Consider the paragraph, reproduced here as it was by Mandala:

The morning had dawned clear and cold, with a crispness that hinted at the end of

summer. They set forth at daybreak to see a man beheaded, twenty in all, and Bran rode

among them, nervous with excitement. This was the first time he had been deemed old

enough to go with his lord father and his brothers to see the king’s justice done. It was

the ninth year of summer, and the seventh of Bran’s life. (11)

The order of the information is important. The first sentence betrays no fantastic elements about the morning or the summer and readers must supply information from their own experience of reality. It is three sentences before a definite noun phrase introduces a single element of fantasy in “the ninth year of summer”. This delay induces a questioning of the assumptions made from the impression of the initial, naturalist sentences. Similarly, there is a delay between the

information concerning the company travelling “to see a man beheaded” and Bran being old enough “to see the king’s justice done”. The first information about the political and legal system of the fictional world is generalised with the indefinite article a; it is the second mention of the power system that is more concrete and uses the definite article. Although the phrase

“the king’s justice” can be interpreted literally and does not require knowledge of the fictional world to decode, it is a hint that there is a system of justice that the reader does not yet fully understand. There is also a delay in establishing the point of view; a deictic expression introduces the company of twenty, “[t]hey set forth”, and Bran is introduced as one who “rode among them”. This emphasises the fictional world rather than any one individual; a fitting structure for the opening sequence given that the point of view shifts between numerous characters throughout the rest of the novel and the series. Linguistically, the imaginary elements of the world are introduced in such a way that the reader is cast “as already believing the incredible” but structurally, the reader’s initial assumptions are disrupted by the introduction of the imaginary elements (Mandala 101). Were we to attempt a Le Guin-style substitution here, it would render the fantastic in this paragraph inert; all that is required is to substitute “ninth week of summer” for “ninth year of summer”.
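The ordering of information described above can be made concrete with a simple tally. What follows is a minimal Python sketch, offered as an illustration only; it is not Mandala’s method, nor the procedure used in this thesis. It splits the quoted paragraph into sentences and lists the word that follows each definite or indefinite article. The function names, the naive sentence splitter, and the use of “the word after the article” as a crude stand-in for a noun phrase are all assumptions introduced here. The paragraph is typed with straight apostrophes for convenience.

```python
import re

# The opening paragraph as quoted above (straight apostrophes used for simplicity).
PARAGRAPH = (
    "The morning had dawned clear and cold, with a crispness that hinted at the end of "
    "summer. They set forth at daybreak to see a man beheaded, twenty in all, and Bran rode "
    "among them, nervous with excitement. This was the first time he had been deemed old "
    "enough to go with his lord father and his brothers to see the king's justice done. It was "
    "the ninth year of summer, and the seventh of Bran's life."
)


def split_sentences(text):
    """Naive split on sentence-final punctuation followed by whitespace (an assumption)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def article_phrases(sentence):
    """Return (article, following_word) pairs, e.g. ("the", "king's")."""
    pattern = r"\b(the|an|a)\s+([\w'’]+)"
    return [
        (m.group(1).lower(), m.group(2).lower())
        for m in re.finditer(pattern, sentence, flags=re.IGNORECASE)
    ]


def tally(text):
    """For each sentence, list the words introduced by 'the' and by 'a'/'an'."""
    rows = []
    for number, sentence in enumerate(split_sentences(text), start=1):
        pairs = article_phrases(sentence)
        definite = [word for article, word in pairs if article == "the"]
        indefinite = [word for article, word in pairs if article in ("a", "an")]
        rows.append((number, definite, indefinite))
    return rows


if __name__ == "__main__":
    for number, definite, indefinite in tally(PARAGRAPH):
        print(number, "definite:", definite, "indefinite:", indefinite)
```

Run on the paragraph above, the tally shows “a man” in the second sentence, “the king’s” in the third, and “the ninth” (of “the ninth year of summer”) only in the fourth, which is the delay the close reading describes. A word-after-article count cannot tell which of these phrases is fantastic; that judgement remains the reader’s.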

However, fantasy can be fantasy without fantasy. Gary Wolfe explains: “[a]lmost before fantasy came to be defined as a genre…one of its classic texts had already violated the terms of that genre, creating a classic fantasy novel without material fantasy” (Evaporating 35). He is referring here to Mervyn Peake’s Titus Groan (1946), published before Tolkien’s trilogy, which, despite its exaggerations of the grotesque and however unlikely the world of Gormenghast may be, contains no explicitly impossible or supernatural feature or event (Wolfe, Evaporating 35). Wolfe concludes, “[t]he fact that few readers seem to notice this, or be bothered by it, suggests that the overwhelming tone of the novel carries enough of the fantasy effect to override mere concerns of plot and setting” (Evaporating 35). He then goes on to elucidate a more recent example, Geoff

Ryman’s Was (1992), where the structure of the novel contains “many of the elements of classic fantasy” which “are consistently undercut by the intrusion of realism” (Evaporating 36). “None of this”, Wolfe concludes, “prevented Was from being nominated for the World Fantasy Award in 1993 as best fantasy novel of the year” (Evaporating 37). One critical point in Wolfe’s argument is that by the 1990s, “fantasy as a genre had begun to evaporate into the broader spectrum of literature” (Evaporating 37).

Therefore, substitution may have revealed to Le Guin a “fake” fantasy style but her test does not reveal the function of underlying structures of language as does Mandala’s analysis of

Martin’s opening sequence. There is a larger structure working here, as in Wolfe’s two examples. The paragraph carries an “otherworldly tone” through the constant undercutting of the realistic image a reader is encouraged to acquire before being led to question their assumptions about the fictional landscape. Together with the structure of deictic noun phrases and the layering of information, just a single word can dramatically rearrange the entire world.

Years. This leads to the question: what other underlying features of language are operating to achieve key thematic effects in the two genres?

To further apply Tolkien’s logic to a particular aspect of Raffel’s argument against his style, we can analyse Raffel’s appraisal of style that “belong[s] to literature” (225). Comparing

Tolkien’s description of “low and comfortable chairs” in the Prancing Pony Inn to a description from Thomas Wolfe, Raffel argues that Wolfe “allows the reader to experience the chairs and tables for himself” (225). To Wolfe’s description “She replaced the disreputable furniture of the house by new shiny Grand Rapids chairs and tables”, Raffel responds, “‘Grand Rapids’ furniture is mass-produced, factory-made furniture – but what is a ‘low and comfortable’ chair?” (225). Raffel, a native of Wolfe’s America, is apparently blind to the limitations inherent in the cultural reference, “Grand Rapids”. Though a name that carries discernible meaning in the context of Wolfe’s artistic aims, this one reference is far more limiting than the effect of

‘mythical grammar’ in Tolkien’s prose. Peppered with references to external cultural artefacts,


Raffel’s concept of ‘literature’ offers no space for adjectives and adverbs which, according to the modern aesthetic, are only tasteful when used sparingly (see Strunk and White). The difficulty with fantasy, as mentioned above, is that each word is an act of creation in what is otherwise a void. In science fiction, each word requires extrapolation. With each utterance, the reader is moving away from the consensus reality and, without readerly knowledge, the nature of the exchange occurring in these two genres can be easily misjudged. Raffel complained that “low and comfortable” told the reader exactly what to feel about the chairs, which restricts the reading experience to what Tolkien desired the reader to experience (225). Yet such a complaint only reveals the extent to which Raffel missed what Tolkien saw in the power of adjectives, the abstracted essence of an object: “We may put a deadly green upon a man’s face and produce a horror; we may make the rare and terrible blue moon to shine; or we may cause woods to spring with silver leaves and rams to wear fleeces of gold, and put hot fire into the belly of the cold worm” (25).

Structure as well as word selection is important in stylistic elocution. David Sandner suggests that the structure of fantasy is, on occasion, obviously linked to the structure of the language—as in Lewis Carroll’s Alice in Wonderland, where the language is a “dizzying metonym” that “needn’t mean anything” and perhaps demonstrates that there is “nothing” real to the fantasy world at all (4). As a contrast, Sandner points to Tolkien who “reveals another way”; fantasy that is surprisingly “full” with the details of the world’s history, landscape, myths, characters all provided in the text, in appendices and in his “endless miscellaneous works” (4).

Provided there is a recognisable structure, literary language can be nonsensical yet still communicate meaning. Stanley Fish explores Noam Chomsky’s much cited illustration of this concept using two phrases that both lack any semantic meaning: “colorless green ideas sleep furiously”; “furiously sleep ideas green colorless”. Fish explains that the latter “exhibits

no logical relationships whatsoever” but in the first example the units of the segment are in a known grammatical order. “You can’t do anything”, Fish argues, “with ‘furiously sleep ideas green colorless,’ not because it is without meaning, but because it is without form” (27). The first sentence, however, is a prime example of the kind of “arresting strangeness” that is inherent in fantasy (Tolkien 45). The same concept can be extended to account for the extrapolative reading of science fiction, where celestial objects not yet discovered and technology not yet invented are named and described. Delany has argued that one of the barriers to understanding science fiction is the “inability to create the alternate world that gives the story’s incidents all their sense” (Starboard 50). Science fiction demands a certain level of literacy in scientific concepts and language while fantasy demands a willingness to accompany the semantically strange to the logical end point.

Delany offers yet another approach to interpreting the function of language in science fiction and fantasy when he suggests that the main distinctions between the two can be seen in terms of the subjunctive, or at least in “distinct level[s] of subjunctivity” where subjunctivity is the tension or “mood” that “informs the whole series of words” (Jewel-Hinged 10). For example, fantasy fiction is defined by objects and situations that “could not have happened” and Delany argues that “immediately it informs all the words in the series” as though the impossibility infects other aspects of the setting (Jewel-Hinged 11, emphasis in original). Science fiction, however, is defined by events, objects and situations that “have not happened”, a category which governs subcategories of science fiction including events that “might happen”, “will not happen”,

“have not happened yet” and “have not happened in the past” (Jewel-Hinged 11, emphasis in original).

According to Delany, the subjunctive mood of a text also informs the corrective process and in science fiction a reader is attuned to assumptions drawn from “what we know of the physically explainable universe” (Jewel-Hinged 11–12).


What, so far, have we learnt of style? Style, the manner of expression, is inescapably individual, but seeking highly idiosyncratic styles as markers of literary style is not necessarily compatible with the aims of science fiction and fantasy. Adopting a reporting register, fantasy fiction generates worlds with words: a simple statement that encompasses the arrestingly strange collocations of words. As powerful as it is to generate meaning, the green sun is only successful when in the correct form. In this regard, the style of fantasy is limited to particular forms. Furthermore, fantasies set in wholly other worlds cannot have chairs described by external referents but only in terms of what is consistent with the world being portrayed. Thus the mythical power imbued in the term “green grass” can be transferred to an as yet unimagined element, such as a “green sun”. The same is true in science fiction, with the additional burden that the writer of science fiction must be cautious not to become antiquated as technology ceases to be futuristic or as science proves certain theorems wrong. That said, the style of science fiction and fantasy, simple and unadorned as it is, can still be faked. It is not enough, therefore, to simply swap a cat for an elf, a car for a broomstick.

Through cues and inferences, the knowledgeable reader knows whether or not to interpret unfamiliar collocations as literal or figurative. Unable to rely on knowledge from a consensual reality, readers without broader knowledge of the genres do not have their own reference point to draft pictures that embody the expression of language: the world of myth, science and forms requires its own attention.

Genre Theory

To organise a set of texts into a genre is to also offer readers a guide as to the cues and inferences common to a particular genre. Frustrated with colleagues dismissing the genre of science fiction out of hand, Delany investigated the cause of their dismissal and concluded that in doing so “it becomes clear that their difficulty is almost entirely in their inability to create the alternate world that gives the story’s incidents all their sense” (Starboard 50). Delany goes on to state that “these readers have no trouble imagining a Balzac provincial printing office, a


Dickens boarding school, or an Austen sitting room” and yet “they are absolutely stymied by, say, the contemporary SF writer’s most ordinary ‘monopole magnet mining operations in the outer asteroid belt of Delta Cygni’” (Starboard 50). This phrase Delany calls “most ordinary” can project an entire futuristic setting and society in the mind of the knowledgeable reader. For

Delany, the phrase conjures an image of a time when in this world “a completely new kind of magnet has been discovered” which in turn suggests “a whole new branch of electromagnetic technology at work” affecting transport, communication and the entire organisation of society

(Starboard 50). From this single fragment, the scale is set of a world where human activities are occurring in a distant constellation with at least four suns and multiple asteroid belts (Starboard

51). Science fiction requires readers to reorganise what they know about humanity and society, technology and laws, and read between the lines in order to extrapolate from shorthand to an image of the world as it could be. Without the requisite reading strategies collecting around the notion of genre, readers are in danger of resembling Delany’s colleagues:

Such readers, used to the given world of mundane fiction, tend to lay the fabulata of

science fiction over that given world – and come up with confusion. They do not yet

know that these fabulata replace, displace, and reorganize the elements of that given

world into new worlds. (Starboard 52)

As an old concept, genre has collected its own multiplicity of definitions and associated theories. This section demarcates the trends in genre theory, particularly as they pertain to the application of genre theory in critical science fiction and fantasy.

According to Alastair Fowler, neoclassical genre theorists of the eighteenth century held to “fixed” genre rules that governed new works (Kinds 27). New works were either held in strict obedience to the old rules or had the task of proving that they nevertheless “embody some worthwhile additional kind happily exempt from the old criteria” (Kinds 27).

Neoclassicists have not been the only ones to attempt a freeze on genre rules. Thomas Pavel

points to the seventeenth-century Renaissance poetic movement in France which “dogmatically assumed tragedy to be the genre described by Aristotle’s Poetics” and which failed to account for the “innumerable” tragedies from the sixteenth-century European traditions that did not observe

Aristotle’s three unities of action, time and place (201). As Pavel notes, such exclusions occur whenever genre rules become fixed; the genre is frozen in time when reduced to “immutable formulas” (201). However, genres generally resist such immutability for two reasons that Pavel points out: “they change with time” and “often possess an internal flexibility that makes them mobile and unpredictable at any given time” (201). All in all, genres can be quite unstable.

Of the genre freezing that occurred in the eighteenth century, Fowler wrote in 1982,

“we are only now recovering”; for in theory, the rules went out of vogue but in practice they were often enforced (Kinds 28). Decades earlier, we can see in the critical practice of C.S.

Lewis a hierarchical attitude toward the frozen genres: “It is the smaller poets who invent forms, in so far as forms are invented” (Preface 3). In Lewis’s A Preface to Paradise Lost, form included the pre-existing types of epic, tragedy, the novel or “what not” (Preface 3).

Nevertheless, his practical study of form is useful, for he holds that every text can be considered from two perspectives: “[f]rom the one point of view it is an expression of opinions and emotions; from the other, it is an organization of words which exist to produce a particular kind of patterned experience in the readers” (Preface 2-3). In the context of Lewis’s criticism, the

“patterned experience” in the form of the epic is found in the repetition of clauses which allow the reader to anticipate the direction of the work and savour the content of a work delivered orally. However, a collection of norms around a genre similarly produces a conceptual adherence, leading the reader to engage with “a particular kind of patterned experience”

(Preface 3).

Some further particulars concerning the operation of the novel as a form require elucidation, but first the disjunct in the usage of terms ought to be addressed. When a concept

such as genre has been discussed for as long as it has, there is bound to be a multiplicity in the definition of the term. The problem is clearly seen in the contrast between Pavel’s reference to tragedies and Lewis labelling tragedy as a form. This is complicated further when compared to another theorist who asserts that tragedy can be a mode (Frow 77). What then is a tragedy?

To start with the last term first, understanding genres through modes can be useful as

Attebery demonstrates in the first chapter of Strategies of Fantasy. According to Frow, “modes start their life as genres but over time take on a more general force which is detached from particular structural embodiments” (77). For instance, “tragedy moves from designating only a dramatic form and comes to refer to the sense of the tragic in any medium whatsoever” and

“pastoral modulates from the georgic or the eclogue into a broader form which can be applied to any genre that deals with an idealised countryside populated by simple folk” (Frow 77).

When viewed in this way, modes can be invoked in an “adjectival” sense. That is, a mode denotes the tonal features or “colouring” of a genre rather than the structure of formal qualities

(Frow 78-9). Furthermore, when a genre becomes “exhausted”, such as the gothic romance, it can continue to survive in a modal form as it becomes attached to other ‘structural embodiments’ (Frow 77).

Approaching genre with a multiplicity of concepts can be useful, for as Fowler explains,

“[l]iterary works can always be grouped in different ways” (Kinds 54). Confusion abounds where generic types are treated as a single category, which is why Fowler demarcates the following: “kind or historical genre, subgenre, mode, and constructional type” (Kinds 55).

Fowler ascribes the term ‘kind’ to the fixed genre, arguing that “there is a substantial basis of agreement about many historical kinds” (Kinds 54). “Fixed” genres are not limited to the kinds known as “the natural forms”, i.e. Aristotle’s division of literary works into the epic, the dramatic and the lyric. In a later essay, Fowler outlines how kinds, also termed the historical genres, include categories such as the treatise, the Renaissance epistle, the anthology, and the


“poetics” genre - the social basis of which was, at the time, “the novel activity of literary criticism” (“Formation” 186). He remarks that there was a revival of the “pure” genres during the Renaissance then turns his attention to “describing kinds actually practiced” (“Formation”

186). Significantly, Fowler’s extensive explanation of what generic features constitute “a kind’s typical repertoire” (Kinds 73) includes both what is said and how it is said. On the topic of subject Fowler writes, “[w]e have inherited a strong suspicion of the idea that subject may be limited generically” (Kinds 64). He argues that “individual constraints on subject have undoubtedly relaxed. However, their place has simply been taken by others, although these remain unformulated” (Kinds 66). Therefore, while there is no “precise range of subjects” to characterise a kind, Fowler claims that “no kind is indifferent to subject” (Kinds 66).

Furthermore, as Frow writes, subject matter “corresponds” to “stylistic choices” and therefore to the structural organisation of a text and texts that cluster together in genres. For instance, the ‘low style’ of the New Testament produces a more casual effect of “everydayness” which is distinct from the “elevated tone” of the prophetic books in the Old Testament (Frow

86). The stylistic shift marks a theological shift: the subject impacts the stylistic choices. Such inextricable connections are what fuel the form-and-content debate. For use in the present thesis, however, there are similarities between the terms ‘form’ and ‘kind’, with the distinction that Fowler’s definition of kinds is broader in order to encompass his particular interests in historical genres. In part this is an important aspect of genre theory as it encourages flexibility in scholarship, rather than maintaining that some forms are uninvented and therefore “pure” while mixed forms or new forms are the domain of “smaller poets”.

“In the formation of kinds, it seems usual for subgenres to emerge before genres”

(Fowler, “Formation” 187). Fowler anticipates that this statement seems counterintuitive, going on to explain, “At first there is no name for the broader type, only for the particular

‘subject’. The absence of a genre label is of course no argument against the genre’s existence”


(“Formation” 188). It is this habit of genres, of being identifiable only retrospectively, that forms the crucible of recent developments in film genre theory. Daniel Chandler explains,

“[s]ome genres are defined only retrospectively, being unrecognized by the original producers and audiences. Genres need to be studied as historical phenomena” (4). When treating genres as historical phenomena it can be concluded, as Jane Feuer notes, that “[a] genre is ultimately an abstract conception rather than something that exists empirically in the world” (qtd. in

Chandler 1). When genres are treated as artefacts of a particular time there is a tendency to revert to the immutable rules to find what can and cannot be included in the historical phenomena. The usefulness of genre as an interpretive aid, regardless of whether or not the collection of literary works occurs around an abstract label, is eloquently iterated by Pavel who argues:

Genre is a crucial interpretive tool because it is a crucial artistic tool in the first place.

Literary texts are neither natural phenomena subject to scientific dissection, nor

miracles performed by gods and thus worthy of worship, but fruits of human talent

and labor. To understand them, we need to appreciate the efforts that went into their

production. Genre helps us figure out the nature of a literary work because the person

who wrote it and the culture for which that person labored used genre as a guideline

for literary creation. (202)

Genres and subgenres are therefore important tools to aid the reader, both casual and critical.

As mentioned above, Lewis stated that forms “produce a particular kind of patterned experience in the readers” (3). In addition, Fowler argues that “the category subgenre helps resolve the old problem of whether genre is governed by subject or form” (Kinds 112). However, subject alone does not determine the genre even if it does determine the grouping of subgenre

(Kinds 112). For instance, within the form of the novel, there are offshoots defined according to their content, such as the Bildungsroman and the historical novel (Duff xvi).


In order to account for the various forms a literary work may take, not to mention the countless offshoots, twentieth-century theorists turned to organic analogies from the biological sciences. The method of collecting texts around a core entity has followed flexible concepts such as family-resemblance where “representatives of a genre” can make up entire genres whose

“septs” (clans) and “individual members are related in various ways, without necessarily having any single feature shared in common by all” (Fowler, Kinds 41). Family-resemblance theory derives from Wittgenstein and was adopted by the theorists Fowler and Frow; science fiction and fantasy critical theory has similarly turned to it to explain the blurred edges of the respective genres. Regarding the adoption of family-resemblance theory, Fowler explained the problem with genre as such:

Each subgenre has too much variety too elusively and mutably distributed for

definition to be feasible. We can specify features that are often present and felt to be

characteristic, but not features that are always present…And so with other subgenres of

tragedy and other genres of literature. They never have enough necessary elements

common to all members for them to be regarded as classes. Either defining

characteristics are absent altogether, or they are limited to meagre distinctions that do

no more than subdivide the genre. (Kinds 40)

Rather than a model of taxonomy that “assumes there can be something like an exhaustive classification”, family-resemblance theory addresses “the fuzziness and open-endedness of the relation between texts and genres” (Frow 65). Frow addresses one of the key concerns that arose in the wake of Fowler’s adaptation of the Wittgenstein theory: the question of where “the line of dissimilarity is to be drawn” (66). In refining the theory, Frow turns to developments in

“cognitive psychology of classification by prototype”. Prototype theory postulates that we “take a robin or a sparrow to be more central to [the category of bird] than an ostrich, and a kitchen chair to be more typical of the class of chairs than a throne or a piano stool” (66). Thus, “classes

defined by prototypes have a common core and then fade into fuzziness at the edges” rather than having a list of shared properties, whether via strict rules or the looser connection to a family tree. Although studies in stylometry can explore the correlations between any given set of texts without giving mathematical weight to any one text, the researcher can still invoke the presence of a prototype by focusing on the distance from the one text, the point of origin if you will, to all other texts in the study. It is, therefore, important to retain the focus of genre theory, which John Frow helpfully returns us to when he states

our concern should not be with matters of taxonomic substance (‘What classes and sub-

classes are there? To which class does this text belong?’) – to which there are never any

‘correct’ answers – but rather with questions of use: ‘What models of classification are

there, and how have people made use of them in particular circumstances?’ (67)

Frow turned to his primary interests, “the ‘ordinary’ uses of these models”, which led him to trace the formal models and critical traditions that inform the contemporary, ordinary use. This leaves us free to trace the impact of family-resemblance and the refined, “classification by prototype” in the critical traditions of science fiction and fantasy.
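The idea of invoking a prototype by measuring the distance from one text to all others can be sketched in a few lines. The following Python example is a minimal illustration under stated assumptions and is not the procedure used in this thesis: it represents each text by the relative frequencies of a short, arbitrarily chosen list of common words and ranks the other texts by Euclidean distance from the chosen prototype. The word list, the function names and the toy corpus are all hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative word list only; a real study would use a much longer, principled list.
FUNCTION_WORDS = ["the", "and", "of", "a", "to", "in", "it", "was"]

# Hypothetical toy corpus; the strings are placeholders, not quotations.
TOY_CORPUS = {
    "prototype": "the king rode to the wall and the wall was cold",
    "near": "the king rode to the wall and the wall was cold",
    "far": "ships burn in orbit it was a war of machines",
}


def profile(text, words=FUNCTION_WORDS):
    """Relative frequency of each listed word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty texts
    return [counts[w] / total for w in words]


def distance(p, q):
    """Euclidean distance between two frequency profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_by_prototype(prototype_name, corpus):
    """Rank every other text by its distance from the chosen prototype."""
    proto = profile(corpus[prototype_name])
    scores = {
        name: distance(proto, profile(text))
        for name, text in corpus.items()
        if name != prototype_name
    }
    return sorted(scores.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, score in rank_by_prototype("prototype", TOY_CORPUS):
        print(name, round(score, 4))
```

Texts nearest the prototype appear first in the ranking; those further away sit toward the fuzzy edge of the set. Choosing a different prototype changes the ranking, which is the practical face of Frow’s point that the interesting questions concern how a classification is used rather than whether it is correct.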

As a starting point, Attebery’s “fuzzy sets” have become a popular entry point for discussions of both fantasy and science fiction genre definitions. “A fuzzy set to logicians”, explains Attebery, “is a grouping defined by a set of core examples or prototypes; other entities belong more or less, depending on their degree of resemblance to the core. A fuzzy set has a center but no perimeter” (“Elizabeth” 122). Therefore, “[t]he genre one defines depends on the prototypes one picks” (“Elizabeth” 123). In one application of fuzzy sets, John Pennington determined Tolkien’s LOTR trilogy to be “that center by which we judge other fantasies” (81).

Pennington’s analysis of the influences on Rowling’s fantasy sequence, Harry Potter, argues that the “fuzzy set of influence constantly shifts” to the extent that “[s]oon we are not even in any fantasy fuzzy set” as it ranges from Nancy Drew and Roald Dahl to Lewis Carroll (82). This is

one example of what Greer Watson argues is the “problem of relying on core examples of fuzzy sets”. Watson suggests that the problem occurs when “[c]ore examples differ too much, in too many ways” (166). Similar to Frow, Watson turns to the sciences to refine Attebery’s fuzzy sets:

In the sciences, both the physical and social sciences, there is a method of reducing experimental error by limiting the range of variation. This is known as ‘controlling the variables.’ Experimental subjects are so selected as to be maximally similar in all ways except the ones under examination. Differences between them are therefore attributable only to the variable in which the researcher is interested. Unfortunately, in literary analysis, the high level of control possible in, for example, medical research simply cannot be attained. Books are not lab rats. To make things a little easier, however, the stories do not have to parallel one another throughout, since readers are usually able to identify the type of text they are reading within the first few chapters. (166)

Interestingly, the stylometric technique employed throughout the thesis is occasionally used in other fields to control variables (Varmuza and Filzmoser). The PCA method exposes the variation in any given set of data, a process which highlights outliers, the extreme edges of the dataset, that can then be ‘controlled’, that is, eliminated prior to further testing. That is not, however, the application of PCA in stylometry, as explained by John Burrows: “Most statistical methods assign the members of specimens to one or another of several pre-determined classes”, and this is certainly one of the goals in authorship attribution – where a control group is preferred to compare the anonymous texts with known works by the potential author (Burrows, “PCA”). In exploratory studies, however, stylometrists “use PCA in such a way as to allow the specimens to array themselves according to their respective affinities and disaffinities, whatever these may be. The outcome enables us to form inferences about the overall patterns that have emerged and the possible reasons for any aberrations” (Burrows, “PCA”). Because, as Watson rightly declares, “[b]ooks are not lab rats”, the methodology employed in the thesis does not align with classificatory genre theory. While scientists may be able to exclude some outliers from studies on the grounds of maintaining their research aims, literary critics are not at the same liberty.

While lab rats are replicates and are bred to be genetically identical, books are not and thus cannot be treated the same way – discarded if they are aberrant. Furthermore, a book that emerges as an outlier in a statistical study does not automatically indicate that it does not belong to a particular generic group. As such, it is necessary to tread lightly with the interpretation of statistical results.

Although borrowing a popular method from the statistical infrastructure of the sciences, the case studies explore genre not in an attempt to identify the position of a particular work in the concentric circles of a genre or subgenre, but rather as patterns of linguistic features that adhere to conventions of reading particular to the norms found in any given collection of works. Such a concept of genre is identified by Frow as one of the other “analogies through which twentieth-century critics have conceptualised the literary genres” (64). The approach can be summarised as “genre as the social institution” (64). Revisiting the social concept of genre, Frow quotes Todorov to argue that “genres are ‘only the classes of texts that have been historically perceived as such’: historical rather than theoretical entities” (Frow 81). This is not to say that genres are just arbitrary names prescribed to clusters of texts – “since they have properties that can be described. The mode of existence of genres is social” (Frow 81): “In a given society the recurrence of certain discursive properties is institutionalised, and individual texts are produced and perceived in relation to the norm constituted by that codification. A genre, whether literary or not, is nothing other than the codification of discursive properties” (Todorov qtd. in Frow 81). Such an approach is more common in science fiction criticism.

Monique R. Morgan invokes Attebery’s approach of fuzzy sets to begin with, but swiftly aligns it with Carl Freedman’s approach, quoting Freedman directly to explain “a genre is not a classification but an element or, better still, a tendency that, in combination with other relatively autonomous generic elements or tendencies, is active to a greater or lesser degree within a literary text” (Freedman qtd. in Morgan 266, emphasis in original). A broader approach has been offered by John Rieder who argues that genres can be treated as historical processes.

The historical approach views genres as mutable: the genre can be stretched to include or defended against inclusion rather than comparing new works against an established set of rules (“On Defining” 193-194). Rieder explores the shortcomings of both the family resemblance approach and fuzzy sets and defends his proposition that “sf has no essence, no single unifying characteristic, and no point of origin” (“On Defining” 193) by arguing that genre is “always found in the middle of things, never at the beginning of them” (196). This line was quoted at the beginning of this chapter to defend a definition of the genre that encompasses texts that surfaced before the term science fiction was coined. However, the notion that genres have no set origin is integral to the first case study in Chapter 3, and so Rieder’s further comments on the matter are illuminating:

Studying the beginnings of the genre is not at all a matter of finding its points of origin but rather of observing an accretion of repetitions, echoes, imitations, allusions, identifications, and distinctions that testifies to an emerging sense of a conventional web of resemblances. It is the gradual articulation of generic recognition, not the appearance of a formal type, that constitutes the history of early sf. (“On Defining” 196)

Although this project compares emerging formal types of early science fiction and fantasy with features of contemporaneous novels, it is important to note that this thesis does not claim to discover any traits unique to the origins of either genre. The traits discussed are of early forms that offer insights into the relationship between the style of these works and those of the same era. The identification of an accretion of traits is not within the scope of the stylometric studies in this thesis. Instead, this project approaches genres from a perspective more akin to Delany’s description of science fiction, as a collection of reading strategies rather than the academic perspective of defensible taxonomic definitions. However, rather than directly appropriating Delany’s reading strategies, this study adopts the language of genre norms in order to explore the relationship between style and genre.

Thomas Pavel’s concept of genre norms proposes that genres are a collection of successful artistic solutions that are organised around shared representational problems. Pavel recognised that although some generic requirements are formal, such as the features of a sonnet, most genres are characterised by more abstract requirements such as tragedy which, as a notion, has moral and existential implications (Pavel 203–204). As such, the more abstract qualities of a tragic novel can be normative, but in a way that is essentially rooted within a cultural norm: normative in the sense that they are striving to meet the cultural conventions or expectations placed on a literary work. “As cultural phenomena, most works of fiction have behind them a tradition of successful models and ossified procedures, which make it easier for the public to focus its expectations and for the writers of fiction to meet them” (Pavel 205). In addition to the abstracted content of a literary work, Pavel considers that genres have “an internal set of normative requirements independent from social customs” (206, emphasis in original). These “requirements” are selected where they best serve the representational purposes of a particular artistic goal or generic aim:

Used in this way, the term ‘norm’ means a successful artistic solution to a representational problem. Such norms are not obligatory rules of behavior, they are just effective recipes worthy of being imitated by writers who have similar artistic concerns and want to obtain similar results. (Pavel 209)

With this concept of norms comes more artistic freedom. Rather than following an immutable formula, the writer choosing a genre can search for the artistic solution of best fit, to borrow terminology from statistics. Following “obligatory rules of behavior” may be the safer option for guaranteeing commercial success, but imitation ought to yield more in a new work than just a reproduction of the recipe. Since the middle of the twentieth century, the success of Tolkien’s fantasy trilogy has demonstrated the effectiveness of his recipe for high fantasy. Yet it is only now, in the twenty-first century, that this category of fiction is starting to recover from the influence of Tolkien’s trilogy. It has taken numerous generations and almost countless iterations before twenty-first century writers such as Brandon Sanderson and Patrick Rothfuss began to reinterpret the genre.

However, not everyone was so affected. In an interview, Le Guin expressed her gratitude for coming to Tolkien late: “I am grateful that I was in my twenties when I first read Tolkien and had gone far enough towards finding my own voice and way as a writer that I could learn from him (endlessly) without being overwhelmed, overinfluenced by him” (qtd. in Marcus 97). A distinction can be drawn, therefore, between two practices. The first is recognising the successful elements of a recipe for a similar representational problem and appropriating them as a genre norm to suit a specific representational problem. The second is simply following the whole of the recipe as a blind list of rules.

It could be said that science fiction and fantasy share a representational problem: each text generates a unique world, a “world where no voice has ever spoken before” (Le Guin, “From Elfland” 154). From there the artistic solutions to problems branch out according to the unique subject matter of each genre, subgenre, category and individual work. Nevertheless, while “every word counts” in a world where “the act of speech is the act of creation”, no author works in a true void (Le Guin 154). As Le Guin herself observed, she could learn “endlessly” from Tolkien (qtd. in Marcus 97).

Conclusion

Assuming that science fiction and fantasy are each a set of artistic solutions found to be successful for representational problems shared by authors, how do we come to understand the operation of language as an artistic function of the genres? For instance, a key component of science fiction is the extrapolative vision of humanity’s future, thus the success of a science fiction work could be said to require a verisimilitude of vision and competence of language to guide a reader to project a futuristic world. In this sense, genre can be approached as Delany has regarded it: as a set of reading strategies. On the other hand, the requirement of some form of departure from reality in a fantasy work has led to a multiplicity of types in the fantasy genre. Mendlesohn’s categories of fantasy helpfully work backwards from the representational problems to explore how the artistic solutions achieve certain effects. Her questing after distinction in form is preceded, however, by a caveat:

This book is very much grounded in a love of forms, but form cannot be wholly abstracted from content or ideology. Furthermore, I have come to believe that form may act to constrain ideological possibilities. Consequently consideration is given to interpretation where the issue is how a particular mode of writing helps to generate, intensify, or twist meaning. (Rhetorics 18, emphasis in original)

Mendlesohn touches on one of the hesitations experienced when critically exploring forms: the loss of the socio-political commentary, culturally relevant and ideologically challenging critique which is a highly valued aspect of contemporary criticism. Furthermore, the post-romantic view of the artist, “as a privileged sensibility whose experiences are not only more intense than those of ordinary men but ‘original’” (Kirk 9), is a function of art based on a “comparatively modern assumption that the function of language in any work of art is to force the reader out of the reactions, awareness, associations of ideas, and value judgments which he shares with others and to substitute for them sharper, more distinctive individual, and ‘original’ modes of awareness” (Kirk 9). Nevertheless, it is “perfectly possible for an artist’s concern with language to be concern for language as a medium of communal consciousness and of certain modes of awareness and evaluation to which its existence vis à vis other languages testifies” (Kirk 10).

In the 1980s, popular literature was fighting an ongoing battle about the artistry of functional language. Orson Scott Card recounts his dilemma as a science fiction writer toward the end of the twentieth century: having “deliberately avoided all the little literary games and gimmicks that make ‘fine’ writing so impenetrable to the general audience” Card claims he wrote Ender’s Game (1985) in its “simplest, purest form” (xv). Nevertheless, Card found that “the people that hated it really hated it” even though “[a]ll the layers of meaning are there to be decoded” and a straightforward narrative structure does not take away from abstract meanings of a text (xv). Card’s experience exemplifies the common assumption that accessible literature is inferior to literature written in a manner that is “impenetrable to the general audience”. The debate is ongoing (McKenna) and the aesthetics of science fiction and fantasy are still shifting.

Although studies of style deal with form they do not, as Stockwell and Sara Whiteley have emphasised, disregard “matters of performance, utterance, artistic design and aesthetic effect” (1). In other words, discussions that examine the formal features of a text, including the organisation and choice of linguistic elements, are not inherently formalist discussions (see Stockwell and Whiteley 1–9). However, several events converged which caused disciplines dealing with form, such as stylistics, to develop “elsewhere”, as Stockwell and Whiteley explain: “[w]hile literary criticism was having a crisis of theory and methodology, especially in the 1980s, stylistics remained largely distant from these debates” (3). The distance of systematic studies of aesthetics from mainstream literary scholarship has not gone unnoticed. Introducing her work on the rhetoric of fantasy, Mendlesohn observed, “[t]here is almost nothing dealing with the language of the fantastic that goes beyond aesthetic preference” (Rhetorics 15). She adds, in a footnote, “A cursory consideration of the contents of the Journal of the Fantastic in the Arts will confirm this impression” (Rhetorics 304). The discussion on linguistic features of fantasy appears to halt in the early 1970s only to return in the late 2000s.2 Not only were “waves of theoretical argument” disturbing the convergence of stylistics and literary criticism (Stockwell and Whiteley 3), the development of systemic-functional linguistics in the 1970s and 1980s “replaced almost exclusively” other approaches in the stylistician’s toolkit (Stockwell and Whiteley 2). Paul Simpson confirms that “[s]tylistics in the early twenty-first century is very much alive and well” despite the attacks that came against the “ailing” field in the late twentieth century (2). While the theory wars of the late twentieth century prompted only back room critical interpretations of style, stylistic-literary work has quietly resumed.

2 This is an observation I have made after conducting a literature review in which I found that the last stylistic discussion of LOTR in the twentieth century appeared in 1971 and was next followed by an article in 2009. It should be noted that linguistic work on Tolkien’s languages continued, particularly among medieval scholars who Reid suggested have more training in linguistics than twenty-first century literary scholars (536).

Certainly, it is not only science fiction and fantasy criticism that has experienced a gap in stylistic interpretation. Steve Guthrie writes of the gap in critical approaches to Chaucer, where Bakhtin’s theory of dialogics, which is strongly influenced by linguistic and stylistic analysis, has only recently joined the toolkit of critics alongside Bakhtin’s more subject-central theories, such as the carnivalesque (94). The retro-fitting of Bakhtin’s theory to medieval literature has progressed “despite Bakhtin’s own doubts about the presence of true novelistic discourse before the Renaissance” and the strong focus on the socio-political aspects of the period (94). Guthrie’s comments on the integration of dialogics with practical Chaucer criticism are illuminating:

Bakhtin seems to have the added advantage of a user-friendly elasticity of mind, and his writings open lines of communication between structuralist and poststructuralist thought and among contextualists and close readers of many kinds…Bakhtin’s theory is built on a very concrete linguistic and stylistic analysis that may offer insights to scholars in and of itself, not apart from the wide political implications of dialogic analysis but fundamental to them. (94)

Therefore, despite the destructive wave of theoretical argument, Guthrie demonstrates the benefits of theory. New terms and approaches open lines of communication between various perspectives. This can be seen in the application of Suvin’s definition of science fiction; although there is general agreement that Suvin’s working definition of the genre is excessively narrow, his terms “novum” and “cognitive estrangement” are still largely adopted due to their usefulness.

The lines of communication have been opened between the study of form and meaning, and throughout the thesis, the link between representational problems and artistic solutions will be demonstrated in the context of science fiction and fantasy works. For some, the problem of form may remain, and thus I reiterate the sentiments of both Mendlesohn and Guthrie: the question of form is an avenue to illumination in and of itself and, while it is never wholly abstracted from content, it is, nevertheless, fundamental to the implications of theme and ideology.

Subsequently, it is argued throughout the thesis that there ought to be a return to the study of language’s artistic function. A thesis concerned with the artistic function of language must also consider the function of art. Thus, the function of art is treated as an access point for more than just ideological content. To borrow a notion from fantasy writer Madeleine L’Engle, “[i]n art, we are able to do all the things we have forgotten; we are able to walk on water; we speak to the angels who call us; we move, unfettered, among the stars” (57). Science fiction moves us among galaxies, fantasy achieves things we did not know were possible. There is no reason why, as Mendlesohn has suggested, the study of forms cannot inform understandings of how these concerns are outworked in a text. Indeed, it is the domain of the present thesis to demonstrate how the stylistic patterns in language reveal higher order processes in a text, how the vehicle for the ideas of science fiction and fantasy is the underlying structures of language, the writing style.


Stylometry in the Service of Style

Introduction

John Burrows has confirmed that statistical analysis “can never yield absolute support for any proposition” but “it can embrace all members of a chosen class of phenomena – all instances, even, of word-types like and and the” (“Style” 182, emphasis in original). Hence, it is useful to consider stylometry as the study of features that are otherwise outside the scope of the researcher. Such features make up the underlying structure of texts. The frequencies of only one hundred words can account for half of an entire text and yet words such as and and the elude scrutiny even in the closest readings. Nevertheless, it is words such as these that can reveal patterns related to larger constructs of meaning in texts, from individual authorial traits to signals of a genre. A stylometric study of just the one hundred most frequently used words is a methodological approach that serves Samuel Delany’s call to “get away from the distracting concept of SF and examine precisely what sort of word-beast sits before us” (Jewel-Hinged 36). The analysis specifically focuses on the underlying patterns of language in the word-beasts studied.
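The claim that a short list of very frequent words covers a large share of a text can be checked directly on any sample. The following is a minimal sketch in Python and not the procedure used in this thesis; the function name `top_word_coverage` and the crude regular-expression tokeniser are assumptions made purely for illustration.

```python
import re
from collections import Counter


def top_word_coverage(text, n=100):
    """Return the n most frequent word-types and the share of tokens they cover."""
    # Assumption: a crude tokeniser that lower-cases the text and keeps
    # runs of letters and apostrophes as word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return [], 0.0
    counts = Counter(tokens)
    top = counts.most_common(n)
    covered = sum(count for _, count in top)
    return top, covered / len(tokens)


# Toy sample; a real study would pass the full text of a novel with n=100.
sample = "the cat and the dog and the bird"
top, share = top_word_coverage(sample, n=2)
print(top, share)
```

Applied to a full novel with `n=100`, the returned share indicates how much of the text the chosen word list accounts for, although the exact figure will depend on the tokenisation rules adopted.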

To the aid of the researcher attempting to turn a list of one hundred word frequencies into useful information comes the multivariate technique, PCA. This method is designed to examine the relationships between variables (words) and observations (texts) and when applied to literary texts it returns components that explain variations in the data (words and texts). The most variance is found in the first component, and each subsequent component explains less variance than the one before it. Even though results tend to be plotted on only two-dimensional graphs the new components explain the data in its “multivariate space” (Holmes 113). As well as being a data-reduction technique, PCA is particularly useful for literary studies because, as David I. Holmes explains,

No mathematical assumptions are necessary; the data ‘speaks for itself’. Clusterings of points, each representing a sampled text, are clearly visible, as are outliers or points which do not conform to any pattern. (113)
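To make the mechanics concrete, the sketch below computes principal components for a small table in which rows are texts and columns are word frequencies. It is an illustrative Python implementation under stated assumptions, not the software used in this thesis: it uses covariance-based PCA with power iteration and deflation, and the function names (`center_columns`, `covariance`, `leading_eigenpair`, `pca`) and the toy data are hypothetical.

```python
import math
import random


def center_columns(rows):
    """Subtract each column (word) mean so every variable has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of the word variables (needs at least two texts)."""
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=500, seed=0):
    """Approximate the largest eigenvalue and its eigenvector by power iteration."""
    rng = random.Random(seed)
    p = len(matrix)
    vec = [rng.random() + 0.1 for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:  # no variance left to explain
            break
        vec = [x / norm for x in nxt]
    value = sum(vec[i] * matrix[i][j] * vec[j]
                for i in range(p) for j in range(p))
    return value, vec


def pca(rows, n_components=2):
    """Return text scores, component loadings and explained-variance ratios."""
    centered = center_columns(rows)
    cov = covariance(centered)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))
    components, variances = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        components.append(vec)
        variances.append(value)
        # Deflation: remove the variance already explained by this component.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(x * w for x, w in zip(row, vec)) for vec in components]
              for row in centered]
    ratios = [v / total if total else 0.0 for v in variances]
    return scores, components, ratios


# Hypothetical frequencies of two words in three texts.
rows = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]
scores, components, ratios = pca(rows, n_components=2)
```

In this toy table the two word variables rise and fall together, so the first component carries essentially all of the variance. With real frequency tables the ratios decline more gradually, and the first two columns of `scores` supply the coordinates for the familiar two-dimensional plot of texts.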

This chapter introduces the field of stylometry, traces how advances in the digital humanities and the methodological pursuits of stylometry contributed to the emergence of PCA as a prominent tool in the field, and then outlines the method as it has been applied in this thesis.

The Scope of Stylometry

What is stylometry? Stylometry is the study of quantifiable features of style of a written or spoken text… (Kenny, A Stylometric 1)

Computational stylistics aims to find patterns in language that are linked to the processes of reading and writing… (Craig, “Stylistic”)

Instead of the traditional practice of ‘close reading’ in literary analysis, stylometry does not set out from a single direct reading; instead, it attempts to explore large text collections in order to find relationships and patterns of similarity and difference invisible to the eye of the human reader. (Eder et al., “Stylometry” 108)

These three definitions of stylometry paint a picture of a nascent field that has already experienced a series of name changes. Writing in 1986, Anthony Kenny offered a broad definition of stylometry, including both spoken and written texts. The core of Kenny’s definition remains; modern stylometry is “the study of quantifiable features of style”, but the focus has tended to be on written literary material rather than spoken.1 In the year following Kenny’s stylometric study of the New Testament, John Burrows’s seminal work, Computation into Criticism: A Study of Jane Austen’s Novels and an Experiment in Method (1987) was published without any mention of the term ‘stylometry’. In the same year Burrows published a paper which made a clear distinction between “broader purposes of computer assisted literary criticism” and the “stylometric” pursuit of authorial questions. As well as research aims, Burrows outlined how his method differed when employed for stylometrics: “It is more holistic in emphasis and it makes no firm distinction between the grammatical and the lexical elements of vocabulary” (Burrows, “Word-Patterns” 62). Hugh Craig, along with others (Tabata, “Dickens’s”; Dalen-Oskam), followed Burrows in adopting the term computational stylistics to define what Kenny had originally labelled as stylometry. Some researchers, including Maciej Eder, used the two terms interchangeably (“Mind”). From the late 1980s until recently stylometry as a term has generally been reserved for authorial studies. However, Eder and colleagues, Jan Rybicki and Mike Kestemont have recently employed just the one term, stylometry, to cover both authorial attribution studies and text analysis (Eder et al., “Stylometry” 108).

1 Within literary studies, stylometry deals with written texts, both literary texts and others such as letters and personal correspondence (Burrows, “Word-Patterns”). When applied to computational linguistics, stylometry has come to mean authorial attribution but is still applied to written communication, such as research on instant text communication (Cristani et al.).

Although a relatively minor matter, tracing the name changes is a path to understanding the field’s complex relationship with traditional literary scholarship. It is not unlikely that in the intervening years when practitioners preferred computational stylistics over stylometry the choice was intentional; an attempt to soften the mathematical ring to the name and align the field to the more familiar one, stylistics. After all, “most people who make a profession of the study of literature do so because they have an artistic rather than a scientific bent” (Stanley Wells qtd. in Holmes 111).

In the late 1990s there was concern about the perception of the field as practitioners sought to dispel the “distrust of the intrusion of statistical methods into humanities scholarship” (Holmes 113). Statistical techniques have since been applied to literary questions from various traditions and in multiple languages. The sharing of code and data tends to be open and transparent, with much gained from mailing lists and the real-time blogging of projects.2 Ultimately, there are fewer barriers to resources for the modern scholar than when Burrows and Kenny first started. A review of Burrows’s monograph, Computation, declared it improbable that such a modern study of Jane Austen would appear out of Newcastle, Australia given the tyranny of distance from major northern hubs of scholarship (Cullen). Technology aids connectivity among researchers but other difficulties are still present, particularly in the intersection between maths and training in humanities scholarship.3 Furthermore, tools and techniques are sometimes insufficiently documented, and scholars attempting to replicate results or apply new techniques often require technical support from other practitioners.

2 See http://www.matthewjockers.net/ as an example of the ongoing testing and development of Jockers’s “syuzhet” package for R, an open source statistics program.

3 Resources are still being developed for training humanities students in relevant math and computer skills. A book length guide was recently written by Patrick Juola and Stephen Ramsay, Six Septembers: Mathematics for the Humanist (2017).

It is not only in the application of computational tools that stylometry presents difficulties for humanities scholarship. The search for “patterns in language that are linked to the processes of reading and writing” (Craig, “Stylistic”) is an unusual focus for literary criticism given its breadth and similarity to linguistics. Computational linguist Walter Daelemans has argued that stylometry appears to “describe and explain the causal relations between psychological and sociological properties of authors on the one hand, and their writing style on the other” (451). With interests in language psychology and literary sciences, Daelemans’s critique of stylometry called for the systematic testing of these causal links.

However, as a project interested in deepening scholarly understanding of literary texts, this thesis does not explore the cognitive aspects of reading or writing. Rather, in studying how the meaning of texts is grounded in language this thesis explores how artistic strategies are deployed at the deeper levels of language, the frequency of words. Therefore, the scope of stylometry in the present thesis is the quantification of stylistic qualities to explore patterns of word usage in literature and what these patterns reveal about variation in the texts.

The change in name, from stylometry to computational stylistics and back to stylometry, is not necessarily a sign of discord within the field but rather is the result of a nascent but already globalised field. Stylometry, once reserved for the categorical questions of authorship, was part of the field of computational stylistics, a name that downplayed the mathematical components and placed prominence on discussions of stylistics. Only recently have practitioners been embracing the term stylometry once again.

The following review explores stylometry in its reverse trajectory, beginning with the alignment of the field with distant reading and big data, through the stage of developing and testing methods for authorship attribution, back to the original vision, which aligns with the modern pursuit of stylometry “beyond” authorship attribution, stylometry “as one tool among many” for literary research (Kenny, The Computation 14).

2.2.1 Stylometry and the Digital Humanities

Stylometry aligns with the digital humanities and can be defined in light of developments in that field. In part, the digital humanities is an umbrella term for the intersection of computer technology with humanities research and as such has “an interdisciplinary core” (Schreibman et al.). In the past, the digital humanities have been celebrated for having a “big tent” mentality; gathering many disciplines and humanists under its banner and fostering equality between those who practice computing humanities scholarship and those who do not. The extent to which this was a reality has been questioned and criticised and in 2016, the editors of Debates in the Digital Humanities, Lauren F Klein and Matthew Gold, stated that

the challenges currently associated with the digital humanities involve a shift from congregating in the big tent to practicing DH at a field-specific level, where DH work confronts disciplinary habits of mind.

This shift, they claim, is evident in the discourse of digital humanists who once worked anxiously to communicate their research to their home discipline but are now exploring more direct avenues for how digital humanities might influence and be influenced by the home disciplines. Moving the digital humanities back to the home disciplines of the humanities includes disseminating results through field-specific avenues rather than solely digital humanities avenues and, in this thesis, involves melding close reading with interpreting statistical results. Accordingly, the research aims of literary criticism are forefront rather than attempts to expand the methodologies of digital humanities. Although this shift has only recently occurred in the digital humanities, the field of stylometry has been working towards influencing field-specific scholarship since its inception. “The method of analysis employed in this paper”, wrote Burrows in 1987, “can finally, be brought to bear not only on questions of authorship and chronological change but on questions nearer to the leading interests of literary critics and narrative theorists” (Burrows, “Word-Patterns” 69).

Nevertheless, the earlier focus of the digital humanities is shared by early work in stylometry where priority was given to developing and testing methods. In the search for the gold standard of a stylometric technique, practitioners sought the “holy grail” of authorship attribution testing (Holmes 111). The work carried out in developing and testing methods of authorship has provided the foundation for other questions to be explored using computational tools. The “big tent” digital humanities movement also prompted a focus on the packaging of tools so that humanists without a background in the computer sciences could access digitally enhanced research practices. In stylometry, the packaging of tools has come in the form of free downloads such as the Java-based program Intelligent Archive, developed by Prof. Hugh Craig at the Centre for Literary and Linguistic Computing. Other tools have been packaged for use in the statistical environment, R. The Computational Stylistics Group created “stylo” for R which has a graphical user interface and includes PCA, Burrows’s delta and various distance measures. Matthew Jockers has released “syuzhet” for R, a package that “extracts sentiment and sentiment-derived plot arcs from text using three sentiment dictionaries” (Jockers, “syuzhet”).

However, there is a constant requirement for the stocktake of tools and transparency among researchers concerning which tools they implement. Even within a reputable statistics environment such as R, there are multiple ways to apply PCA to data and each one has slightly different results.4
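One source of such differences can be illustrated with a minimal sketch. It assumes a hypothetical two-by-two covariance matrix for the frequencies of two words (the numbers are invented for illustration) and uses the closed-form solution for a symmetric two-by-two matrix rather than any particular package's routine. The point is only that a principal direction and its mirror image are equally valid answers, so two programs may report components with opposite signs while describing the same pattern.

```python
import math

def leading_eigenpair(a, b, c):
    """Largest eigenvalue and a unit eigenvector of [[a, b], [b, c]].

    Closed form for a symmetric 2x2 matrix; assumes b != 0.
    """
    lam = ((a + c) + math.sqrt((a - c) ** 2 + 4 * b * b)) / 2
    vx, vy = b, lam - a
    norm = math.hypot(vx, vy)
    return lam, (vx / norm, vy / norm)

def times(a, b, c, v):
    """Multiply the symmetric matrix [[a, b], [b, c]] by the vector v."""
    return (a * v[0] + b * v[1], b * v[0] + c * v[1])

# Hypothetical covariance matrix of two word frequencies (invented values).
a, b, c = 4.0, 1.5, 1.0
lam, v = leading_eigenpair(a, b, c)
flipped = (-v[0], -v[1])  # the same direction with every sign reversed

def is_unit_eigenvector(vec):
    """True if vec has length 1 and the matrix maps it to lam * vec."""
    cv = times(a, b, c, vec)
    return (math.isclose(math.hypot(*vec), 1.0)
            and all(math.isclose(x, lam * y, abs_tol=1e-12)
                    for x, y in zip(cv, vec)))

# Both checks succeed: the sign of a component is a matter of convention.
print(is_unit_eigenvector(v), is_unit_eigenvector(flipped))
```

In practice this means that a plot from one program may appear as the mirror image of a plot from another without any difference in the underlying analysis.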

4 The R manual mentions that the signs of the columns of the rotation matrix are arbitrary and that differences can therefore occur between different programs for PCA and even between different versions of R (“Principal Components Analysis”).

As part of the digital humanities, stylometry contributes to a broader discussion concerning how traditional humanities scholarship can engage in research that is only made possible by the advent of powerful computers. One of these conversations concerns a new frontier for literary criticism in “big data”. Large-scale data analysis is possible with developments in computing power in the last two decades, allowing scholars to ask questions that “were previously inconceivable” (Jockers, Macroanalysis 4). This trend is epitomised in the work of Franco Moretti and outlined by Matthew Jockers in Macroanalysis: Digital Methods and Literary History. The power of analysing large amounts of literary data is concerned with literary history rather than textual analysis and thus can be treated as a distinct branch of stylometry.

In contrast to the big-data trends, each case study in the present thesis focuses on only a handful of books. The largest corpus is thirty-one texts, of which only two are closely examined while the rest offer contextualisation. The smallest corpus is seven texts, all seven of which are investigated closely. A corpus of seven books represents over one million words but, while this is a large amount of data, it does not necessarily qualify as ‘big data’.

Rather than trawling, data mining or excavating new literary histories, the vision for the present study is to unearth the complex layers of the subtleties of style by investigating whether patterns of language can be found deeper than what we can see or feel. The Stanford Literary Lab found markers that suggest the “logic” underpinning genres is deeper than previously thought and in a way that is similar to the unseen depths of an iceberg (Allison, Ryan, et al. 25).

When the obvious signs of a genre are stripped away, the words relating to witches and wizards and time machines, what remains are the necessary foundations of tone and mood, the patterns of which are infused with the overall tone of a text. It is recognised that different genres demand certain tones and moods. In the frequencies of very common words, this thesis picks up the challenge to plumb the depths of logic. However, in keeping with the current climate of the digital humanities, the challenge to this thesis has been to apply stylometry as a tool without allowing the hermeneutical framework of the literary critical question to be overrun by methodological limitations. In doing so, it is necessary to outline the influence of authorial attribution on the development of stylometry’s toolkit.

2.2.2 Stylometry and Authorial Attribution

Stylometry’s cousin, stylistics, has always had an interest in authorial attribution with one of the earliest examples being Lorenzo Valla in 1457 who employed textual analysis to prove that the Donation of Constantine was a forgery. The modern iteration of stylometry similarly has roots in the study of authorship and it is generally considered to have origins in Augustus de Morgan’s 1851 suggestion that the authorship of the writings of St. Paul could be settled by the measurement of linguistic features. This suggestion was taken up by T.C. Mendenhall who used variations in word-lengths as a measure of difference in the works of Charles Dickens, J.S. Mill and William Thackeray and later in a study of William Shakespeare and Francis Bacon (Kenny, The Computation 1–2).5

Stylometric methods continued developing in the twentieth century as statisticians experimented with different stylistic and statistical measures, such as sentence lengths (Yule) and both univariate and multivariate approaches (Holmes). In the early 1960s two studies appeared which have come to be regarded as the model of computational statistical investigation (Kenny, The Computation 7). These are Alvar Ellegård’s study of the Junius Letters (Ellegård) and Frederick Mosteller and David Wallace’s work on the Federalist Papers (Mosteller and Wallace). Where Ellegård determined characteristic style on the basis of nearly five hundred words, Mosteller and Wallace did not find enough distinctive words and instead found it useful to study words that were used with high frequency including prepositions, conjunctions and articles – sometimes known as “function words” or words with little to no lexical meaning (Kenny, The Computation 8).

5 Anthony Kenny’s book The Computation of Style: An Introduction to Statistics for Students of Literature and Humanities (1982) is a useful review of the developments in stylometry up until the 1980s but is problematic due to its insufficient referencing. However, it has been considered a “lucid” and “invaluable guide” (Burrows, Computation 11). The eloquent outline of stylometry from the 1800s to the 1980s makes Kenny’s introduction a valuable resource regardless, and rather than dismiss his work many have verified his sources.

As Master of Balliol College, Oxford and initially a “mathematical ignoramus with a purely humanistic background”, Anthony Kenny was in a prime position to write an “elementary statistical text” illustrating the application of statistics to literary material (The Computation vi). The state of stylometry by the 1980s is thus summarised by Kenny:

At present it seems possible to identify statistical features of style objectively measurable, which are unique to particular authors in the sense that they appear in all the writings of that author and not in any other so far studied … Stylometry can hope in time to fulfil the aspirations of those who take it up, as one tool among many, for literary and historical research. It cannot fulfil the hopes or fears of those who see it as an extension to the mental sphere of the individuating techniques we use in the physical identification of bodies. (The Computation 13–14)

As Kenny predicted, stylometry has discovered “that authorial style is detectable in texts to a degree which surprises even traditional author-centred scholars” (Craig and Greatley-Hirsch 14). Scholars are still attempting to determine the smallest possible sample size for authorship identification (Eder, “Does Size”). Although it cannot be verified if there is a database of “stylometric profiles of all citizens” on file with the FBI or Scotland Yard as Kenny predicted (Kenny, The Computation 13), linguists are working toward automated authorial attribution for Tweets, emails and other forms of social media communication (Schwartz et al.).

The degree to which authorial style can be determined using function words has brought empirical evidence to bear not only on specific questions of authorship but also on the premises underlying the study of authorial style. By quantifying an author’s style and identifying elements that are characteristic of an author according to word frequencies, the following assumption prevails: even when intentionally imitating another’s style, authors do not have control over function words and thus ubiquitous words can serve as markers of authorial style.

However, in an echo of Kenny’s warning about extrapolating physical identifiers to a mental sphere, Hugh Craig and Brett Greatley-Hirsch note:

Not all the features that serve to distinguish authors will necessarily prove to be stylistically interesting – just as a fingerprint may identify an individual with a high degree of accuracy but tell us nothing about that person’s behaviour or predispositions – but it is likely that some of the features will have a literary interest. (17)

During the development and testing of methods for the statistical analysis of authorship, the field moved from univariate approaches to multivariate. Univariate techniques involve only one variable at a time and do not consider the relationships between variables, while multivariate techniques analyse multiple variables together to uncover the relationships among them. PCA is “a standard technique in multivariate statistical data analysis” (Holmes 113). The aim of PCA is “to transform the observed variables to a new set of variables” (Holmes 113).
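The transformation Holmes describes can be sketched in a few lines. The following is an illustrative outline only, not the procedure of Intelligent Archive or of any R package: it assumes a small invented table of word frequencies (five hypothetical texts, four words, expressed as percentages) and finds the first two components by power iteration with deflation, a simple textbook method chosen here because it needs no external libraries. The function names and the data are assumptions made for the example.

```python
import math

def centre(rows):
    """Subtract each column mean so every variable is centred on zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]

def covariance(centred):
    """Sample covariance matrix of already-centred data."""
    n, p = len(centred), len(centred[0])
    return [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(p)]
            for i in range(p)]

def power_iteration(matrix, steps=1000):
    """Approximate the dominant eigenvalue and unit eigenvector."""
    p = len(matrix)
    v = [float(j + 1) for j in range(p)]  # arbitrary non-zero starting vector
    for _ in range(steps):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
              for i in range(p))
    return lam, v

def pca(rows, k=2):
    """Return text scores, word weightings and variance shares for k components."""
    centred = centre(rows)
    cov = covariance(centred)
    p = len(cov)
    total = sum(cov[i][i] for i in range(p))  # total variance in the data
    weightings, explained = [], []
    for _ in range(k):
        lam, v = power_iteration(cov)
        weightings.append(v)
        explained.append(lam / total)
        # Deflation: remove the component just found before seeking the next.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(p)]
               for i in range(p)]
    scores = [[sum(r[j] * v[j] for j in range(p)) for v in weightings]
              for r in centred]
    return scores, weightings, explained

# Hypothetical percentages of "the", "and", "of" and "I" in five texts.
table = [
    [6.1, 3.0, 3.2, 0.4],
    [5.8, 2.9, 3.0, 0.6],
    [4.2, 3.4, 1.9, 2.8],
    [4.0, 3.6, 1.8, 3.1],
    [5.0, 3.1, 2.5, 1.5],
]
scores, weightings, explained = pca(table)
for name, row in zip("ABCDE", scores):
    print(name, [round(x, 3) for x in row])
print("share of variance:", [round(x, 3) for x in explained])
```

Each text receives a score on the new variables and each word a weighting; it is these scores and weightings that are plotted and interpreted. The sketch works on the covariance matrix, whereas some implementations standardise the variables first, a choice that can change the results.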

2.2.3 Stylometry and the Questions “Beyond” Authorship

Although authorship studies are one of the most popular applications of stylometry (Eder et al., “Stylometry” 108), more scholarship has recently turned to questions “beyond” authorship, looking to expand the amount of work that is concerned with what Burrows originally termed questions of “interest” to literary critics (Burrows, “Word-Patterns” 69).

Burrows was not the only one to have an early vision for the statistical study of literary questions. In 1986, Kenny wrote of the promise of stylometry, highlighting several areas of humanities scholarship that could be impacted by the development of the field in addition to authorship attribution:

One may wish to study the statistics of word usage or word order with a view to understanding a text better, to catch nuances of meaning and perhaps to render them into a different language. Or one may be interested in the history of the development of a language, and study the speech habits of particular authors as an indication of linguistic change. Or one may hope to use the quantifiable features of a text as an indication of the authorship of a text when this is in question. (Kenny, A Stylometric 1)

Despite the strong focus on authorship attribution studies, there have been studies on the style of translations (Rybicki and Heydel), the stylistic markers of early modern plays (Craig, “Authorial”) and studies on the stylistic changes in an author’s oeuvre (Tabata, “Dickens’s”; Hoover, “Corpus”).

In John Burrows’s Computation into Criticism, the idiolects of Austen’s characters were the subject of study. Burrows argued that the styles of Austen’s narrative and personal letters were distinct, and identified differences between her narrative and dialogue styles and those of other authors. His arguments made use of the statistical study of very common words and he has subsequently been credited with being “the first to see the potential of these words for literary analysis” (Craig and Greatley-Hirsch 18).

The region of text occupied by common words is so vast that it defies “the most accurate memory and the finest powers of discrimination” (Burrows, Computation 3). Statistical analysis grants access in concise and measured ways. Since Burrows’s initial foray into the idiolects of Jane Austen’s characters in Computation into Criticism, the techniques have become more sophisticated, and as Burrows describes in his chapter in The Cambridge Companion to Jane Austen,

[the results] show how the common words of the present set of texts respond to principal component analysis. Those words which behave most like each other lie towards the four extremities and are opposed by the words that behave least like them. (To behave alike, in this context, is to exhibit concomitant frequencies by occurring more – or less – frequently in the same texts as each other). (“Style” 183)
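What it means for words to “behave alike” can be made concrete with a correlation coefficient. The sketch below is an illustration rather than Burrows’s own calculation: it uses Pearson’s correlation on invented frequencies (per hundred words) for three words across five hypothetical texts. Words that rise and fall together across texts correlate positively, and words that move in opposite directions correlate negatively.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of frequencies."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross / (spread_x * spread_y)

# Invented frequencies per hundred words in five hypothetical texts.
frequencies = {
    "the": [6.1, 5.8, 4.2, 4.0, 5.0],
    "of": [3.2, 3.0, 1.9, 1.8, 2.5],
    "I": [0.4, 0.6, 2.8, 3.1, 1.5],
}

alike = pearson(frequencies["the"], frequencies["of"])    # move together
opposed = pearson(frequencies["the"], frequencies["I"])   # move apart
print(round(alike, 2), round(opposed, 2))
```

In this invented example the and of would lie towards the same extremity of a component, with I towards the opposite one.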

Very common words belong to the verbal universe of any text and yet generally belong to the background of the universe. One assumption in literary criticism about such words, as Burrows explains, is that they “constitute a largely inert medium while all the real activity emanates from more visible and more energetic bodies” (Computation 2). The most common words from any given novel include function words, those with primarily grammatical rather than semantic meanings: articles, auxiliary verbs, prepositions, and conjunctions, all word-types that form the base-strata of language and generally a significant portion of a text. Another assumption of literary criticism is that the language of literature is distinct from everyday language. The ubiquitous nature of such words in all manner of discourse formed the rationale for their use in authorial attribution tests but even in studies of broader literary interests, common words are valuable. The subtle distinctions in the use of very common words are revealing due to their “closely constrained functions”, and the fact that “their relative frequencies across a range of texts mark subtle but remarkably consistent differences of reference, of syntax, and of emphasis” (“Style” 183). For instance, in a study of early modern drama, Craig found a relationship between the use of indefinite articles a and an and the progression of style in the era (“A and an”). Writing on the function of articles, Craig suggested that “indefinite” as opposed to “definite” is not the most accurate description of the function of a and an. Rather, these articles can refer to abstract members of possible well-populated categories rather than concrete instances that can be referred to using the. With the abstract function of the indefinite article more roundly defined, Craig demonstrated how the frequencies of a and an indicate that later plays have an increased level of detachment (“A and an” 287–9).

Burrows was the first to make a case for the neglected regions of the verbal universe, the base-strata of language, arguing that it “has light of its own to shed” and exploring throughout Computation into Criticism how exact evidence of such words “does have a distinct bearing on questions of importance in the territory of literary interpretation” (Computation 2).

Burrows’s statistical evidence and subsequent conclusions about Jane Austen’s style have not only influenced new generations of stylists but also Jane Austen scholars. Bharat Tandon deems Burrows’s statistical analysis to be an “invaluable” contribution, particularly for demonstrating that “it is not always the most visible divisions that are the most pertinent” and that the subtlety of an author’s style can depend “upon the work done by particles and auxiliaries” (219).

As such “the heart” of stylometry is the frequencies of common words (Craig and Kinney 12). Yet, over fifty-one thousand instances of the in J.K. Rowling’s Harry Potter series hardly seems to constitute the basis of a study of literary language. A statistical analysis of all instances of the in the Harry Potter series on its own is not particularly interesting, to statisticians or literary critics. The core interest for both fields comes in the form of differences. A key component of stylistics is comparison: to say that Jane Austen’s fiction contains sentences of a certain length is not particularly interesting, nor relevant to wider literary discourses, unless there is a comparison, for example, that the sentences in Jane Austen’s fiction are longer than those in the work of Henry James. Consequently, over fifty-one thousand instances of the across a series does not adequately communicate issues central to the meaning and interpretation of the novels in the series. However, an analysis of multiple word types forms a rich picture of the base-strata of a novel’s verbal universe.

As well as authorial signals, PCA is also adept at distinguishing chronological changes in style, the idiolects of characters, the trends of publishing houses, and distinctions between literary genres and form.6 In following earlier works in textual and genre analysis, the modern field of stylometry “take[s] advantage of authorship work on particular questions and of the methods that have been developed and tested there” (Craig and Greatley-Hirsch 14). However, there are a few key methodological and philosophical distinctions that are necessary to point out when applying these methods to questions that are wider than the closed questions of authorship, including the selection of common words as stylistic markers, the overwhelming author-signal in PCA results and differences of interpretation between open and closed questions.

6 In this context literary genres indicate differences in novels: science fiction, fantasy, detective fiction, romances, and so on, even down to sub-genres. Form indicates the type of work: novel, play, poem and other subcategories, including first person novel and second person novel.

Part of the work developing and testing methods of authorship attribution has included testing for accurate linguistic markers of authorial style (Hoover, “Frequent”; Hoover, “Statistical”). Even so, it was early in the development of method when Burrows argued that there was no need to distinguish between function words and lexical words in studies interested in wider questions (Burrows, “Word-Patterns” 62). PCA has since been applied to a combination of common words and markers of punctuation (Allison, Ryan, et al.), to counts of clauses, verb tenses, and mood (Allison, Gemma, et al.), to counts of frequent collocations of words (Hoover, “Frequent”), and even to rhyming words in medieval texts (Kestemont).

Arguably, however, the frequencies of common words remain at the heart of stylometry. Nevertheless, this position ought to be closely examined, rather than assumed, as a basic premise in each stylometric study. Even when common words are the foundation of a study, the words included vary depending on the question being asked. For instance, Tomoji Tabata excluded personal pronouns and other deictic words, such as finite verbs, in his study of Charles Dickens’s style “so as to diminish the overshadowing effect of what is already known” (“Dickens’s”). Tabata’s paper was testing for a chronological change in Dickens’s style and included the example of Bleak House (1853), which has two alternating narrators and two distinct narrative styles: first and third person. Tabata justified his omission of specific word types on the grounds that “differences in point of view are obvious between first-person narrative and third person narratives”. In all stylometric studies, the exclusion of certain word types “deprives [the] data of some interesting subjects”, a consideration that needs to be weighed according to the aims of the study and with the notion that different words will reveal different nuances of stylistic differences (Tabata, “Dickens’s”).

Generally, some words are omitted from a study when applying PCA. The choice to include frequently used words, such as the, and to discard rarer words comes down to the statistics. The aim of the method is to reduce the dimensionality of a dataset, with the results projected in two dimensions; occasionally several dimensions are analysed if a significant proportion of variance is summarised across multiple components. Interpretations are drawn from the two-dimensional graphs. Even so, several thousand data points are too many for accurate interpretation, and the nuances of stylistic differences are also affected when all word types are included. Although some researchers prefer to count the frequencies of up to six hundred words (Hoover, “Multivariate” 344), the method employed most commonly by Burrows involves counting anywhere between fifty and one hundred words. The process of selecting which features to count is a recognised part of stylometry’s methodology (Craig and Greatley-Hirsch 34–5). Although it is a somewhat arbitrary choice, a list of one hundred words can make up anywhere between 30% and 50% of a text, making the list well suited to explaining stylistic distinctions in the underlying style of a corpus. However, as Tabata modelled in his study of Dickens, the exclusion of certain word types ought to be justified.

Although common words are easily overlooked by readers, as they are difficult to count in the normal reading process and are generally meaningless in isolated instances, they are essential to the structure of language. As such, they are an ideal subject for an empirical study where the counting process is automated, where a multivariate analysis, such as PCA, is applied to the frequencies to explore the complex relationships between words and texts, and where a close analysis is offered for words that are otherwise read over quickly and yet form roughly half of any given novel.

It is important to note that the use of common words as measures of style offers access to the base-strata of language, the underlying patterns of language in a text which are distinct from thematic and conceptual patterns. Nevertheless, it is possible to work upwards from the base-strata analysis to examine how particular stylistic markers influence the thematic and conceptual components of a corpus. For instance, a high number of prepositions denoting movement speaks of action, and adverbs such as then indicate the progression of narrative time.

In this thesis, the features studied are the proportional frequencies of the one hundred most common words in the corpus rather than rarer words or word collocations. It is usually the themes and tropes that are the focus of scholarly study in the works selected for this thesis.

Similarly, rather than form it is the tropes and themes that are usually used to distinguish the two genres. In cases where form is a question, such as in Farah Mendlesohn’s typologies of fantasy (Rhetorics), there is no attention given to the very common words, the underlying structures of language. These words appear with such frequency that it is difficult to make a meaningful study of them without the aid of statistics.

Yet ubiquitous words reveal more than just the author’s linguistic fingerprint; they have also been shown to reveal stylistic descriptions (Craig, “Authorial”), the stylistic differences between genres (Allison, Ryan, et al.) and stylistic changes in an author’s oeuvre (Hoover, “Corpus”). Furthermore, it is not just stylometrists who advocate for the study of common words. Trauma psychologist, James W. Pennebaker, found that function words were useful for studying changes in writing styles which in turn provided indicators of mental health. One of the markers discovered was a change in the use of pronouns over the course of people writing about their traumatic experiences, “the more people changed in their use of first-person singular pronouns … compared with other pronouns … the better their health later becomes” (12). Pennebaker’s study also included emails and correspondence in which function words are analysed in order to interpret the state of mind and personality of the composer. In a later study, Pennebaker and Yla R. Tausczik treated function words as “style words” which they opposed to lexical words or “content words” (29, emphasis in original).

In this thesis, the features studied are the most common words, words that are foundational to meaning and style. In each case study, the one hundred most common words of every corpus were counted and then any proper names and titles (such as Mrs and Professor) were omitted. Whatever number of words remained was the number of words used in the study. This is similar to the process Tabata used in his study of Dickens’s style: the omitted words were left off the list but no new words were used to replace them (Tabata, “Dickens’s”).

By leaving out proper names and titles, the results of the PCA are not overshadowed by differences in protagonists, which is useful in the case of Narnia where protagonists come and go throughout the series. In addition to these protocols, no distinction was made between homographic forms such as to as a preposition and to as an infinitive marker, and contracted forms, such as didn’t, were not expanded into did and not. The choice to expand contractions has been made by other researchers (McKenna and Antonia) and homographs have been counted separately, particularly in the study of early modern drama (Hoover, “Statistical” 422; Craig and Kinney 221). However, in the present project the choice to count contractions as distinct word tokens allows for patterns of formality to emerge as contracted forms can be markers of informal language and therefore valuable to study as word-tokens (Huddleston and Pullum 91, 800). One of Burrows’s early studies on Jane Austen’s novels compiled a list of words that was “entirely literal”, where contractions were not lengthened and were incorporated with the individual word-tokens and there were no distinctions made between homographic forms (“Word-Patterns” 62). Burrows noted that when the homographic forms were distinguished the results were very similar (“Word-Patterns” 66). Therefore, the protocols in this study follow on from Tabata’s and Burrows’s approaches: titles and names are removed from the list, contractions are not expanded and no formal distinction between homographic forms is made in the process of counting. Throughout the thesis then, attention remains on the base-strata of words; the depths of style which have not yet been closely studied in such breadth in either science fiction or fantasy.
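The counting protocol described above can be outlined in code. This is a hedged sketch and not the software used for the thesis: the tokenising rule, the function names, the miniature texts and the exclusion list are all assumptions made for illustration, and the list length is set to six rather than one hundred so that the example stays readable. It keeps contractions as single tokens, makes no attempt to separate homographs, and removes names and titles without topping the list back up.

```python
import re
from collections import Counter

# Letters, optionally joined by an apostrophe, so "didn't" stays one token.
TOKEN = re.compile(r"[a-z]+(?:['’][a-z]+)*")

def tokenise(text):
    """Lower-case the text and split it into word tokens."""
    return TOKEN.findall(text.lower())

def build_word_list(texts, top_n=100, excluded=frozenset()):
    """Most common words across the corpus, minus names and titles.

    Excluded words are dropped and NOT replaced, so the list may end up
    shorter than top_n.
    """
    totals = Counter()
    for text in texts.values():
        totals.update(tokenise(text))
    most_common = [word for word, _ in totals.most_common(top_n)]
    return [word for word in most_common if word not in excluded]

def proportional_frequencies(text, word_list):
    """Share of a text's tokens taken up by each listed word."""
    tokens = tokenise(text)
    counts = Counter(tokens)
    if not tokens:
        return {word: 0.0 for word in word_list}
    return {word: counts[word] / len(tokens) for word in word_list}

# Invented miniature corpus and a hypothetical list of names and titles.
texts = {
    "text_a": "Harry didn't know what to do. "
              "Harry went to the door and did not knock.",
    "text_b": "The professor went to the door. "
              "She didn't want to go to the tower.",
}
names_and_titles = frozenset({"harry", "professor", "mrs"})

word_list = build_word_list(texts, top_n=6, excluded=names_and_titles)
for name, text in texts.items():
    print(name, proportional_frequencies(text, word_list))
```

Here the six most common words include one name, so five words remain in the list. A table of such proportional frequencies, one row per text and one column per word, is the kind of input on which a PCA would then be run.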


Previous studies show that PCA is a powerful tool for distinguishing between authorial styles. Rather than controlling word choice to diminish the overwhelming author-effects, stylometrists wishing to explore questions beyond authorship are required to consider corpus composition:

If we want to understand the nature and degree of important non-authorial considerations such as genres and era, then we must ensure that we account for any authorial effects. (Craig and Greatley-Hirsch 15)

In the present thesis, two approaches have been undertaken to ensure that authorial effects are accounted for. The first is to make sure that no one author is dominant in any corpus. In Chapter 3 the corpus intentionally consists of thirty-one texts by thirty-one different authors for this purpose. The second approach is to separately analyse the works of three authors, as in Chapter 5 where Harry Potter, Narnia and Young Wizards are treated as three different tests despite exploring the same question of chronological stylistic variation in a book series. Chapter 4 consists of two tests, one between Stapledon and Wells and the other between Stapledon and Woolf. These tests are specifically seeking stylistic markers to separate Wells and Woolf from Stapledon.

A final distinction between questions of authorship and other questions is that between the closed nature of a category question and the open-ended, exploratory approach that permits PCA to be applied as a tool of stylometry to a series of literary questions. Closed questions include those with answers that are either yes or no. Thus, the question, does this text resemble texts by a particular author, is closed. Open questions do not pursue categorical answers such as whether or not texts belong to class A or class B. Rather, they are explorations of broader questions and this difference in approach requires a different understanding of how results are interpreted.

2.2.4 The Interpretational Gap

Although the PCA method produces legible graphs that require “no mathematical assumptions” (Holmes 113), there is still a significant theoretical and methodological challenge to researchers: the gap between signal and interpretation. Hugh Craig points to the “methodological blank” in stylometry that exists between statistics and conclusions regarding style:

Precious few theoretical models bridge the gap between, on one hand, the counting and analysing of linguistic features, and, on the other, the eclectic, holistic, impressionistic, and yet indispensable business of capturing the impact and flavour of a group of texts. (Craig, “Authorial” 103)

Thus the researcher must “leap” from the frequencies to the meanings (Craig, “Authorial” 103–4). In addressing the methodological gap, Craig has suggested a model for computational stylistics which ought to “[work] with tendencies rather than rules.” He explains that not every aspect of a text can be reduced to the language of statistics:

There must be room both for the insight that any text, and any collection of texts, has elements which will never be reducible to tabular form, as well as for the knowledge that many of the elements of the individual case will form part of a pattern. There is a strong instinct in human beings to reduce complexity and to simplify: this is a survival mechanism. (Craig, “Stylistic”)

There is also a tendency among humans to explain differences between objects in binary terms, reducing complex differences to a polarity of two extremes. In addition, PCA often reinforces dichotomous readings. The word weightings are plotted along the principal component that explains the styles distinguished by each end of the new composite variable. Yet, it is only the words weighted high and low that are indicative of the variation accounted for by the component. Although this can reinforce a binary projection of the data, exploring only the extremes of the new variables does not imply that there is only a binary of stylistic variation in any given text or collection of texts. Rather, as Craig’s above quote argues, each text treated in a study must be afforded the perspective that there are elements which cannot be explained in a binary reading.

In Computation into Criticism, Burrows diverges from his main field of inquiry, the idiolects of Austen’s characters, to investigate a framework for the merging of literary criticism with linguistics and statistics. The import of his fourteen-page excursus remains relevant for computational stylistic studies. Drawing on Russian formalism, Burrows relates the concepts of “foregrounding” and “defamiliarization” to his statistical analysis of character idiolects. Ultimately however, these elements were developed to distinguish literary language from others. Burrows argued that, although useful in other forms of inquiry, concepts developed to distinguish literary qualities in language, when applied to the pattern of usage in very common words, can be examined according to the same meaning as the more overt element and thus “[t]he language of literature will have spread through the whole field of meaning and ceased, in another way, to be peculiarly literary at all” (Computation 116).

Similarly, linguistic methodology appears in stylistics to support literary critical interpretations. Robin Anne Reid brings Roger Fowler’s discussion on stylistics to the fore in her investigation of language in Tolkien’s fantasy fiction and argues that the discipline of linguistics avoids the assumption that underlies literary criticism: “that the language, the style of writing, used in those texts identified as literary by literature faculty is somehow special and bears no relation to ordinary language” (518). Accordingly, linguistic methodology is not only appropriate for Reid’s purposes but also useful for bypassing the problems often associated with fantasy fiction which dismiss the language of such works on aesthetic grounds. Drawing on a field that investigates all language, not just language which is deemed to have literary value, Reid makes a case for a range of aesthetic effects in Tolkien’s work.


Studying the frequency of common words has implications that range out into wider phenomena of the same work, a broader tradition, or even in the matter of language use more globally. Rather than this being a problem of the methodology, Burrows treats it as the nature of language and asserts that each facet of the statistical evidence he studied “can find a place within a larger framework of communication-theory” (Computation 116); the aim of such a framework is to “define the conditions under which a message, verbal or otherwise, is most likely to be accurately conveyed” (Computation 118). The application of formalist concepts to the interpretation of statistical patterning among such elements of language leads to crowding: “the foreground will become so densely populated that little or nothing can be truly foregrounded” (Computation 116).

Arguably, what makes critical interpretations interesting is the element of difference.

The same principle governs the statistical tests, is even built into them, and so it ought also to govern the framework of interpretation. In 1987 John Burrows suggested that there was scope “for a closer working relationship” between three figures: the literary critic, the statistician and the linguist (Computation 106). The framework Burrows explored in his monograph was imbued with literary theories from the Russian Formalists and David Lodge and the linguistic theories of Ferdinand de Saussure. Burrows argues, however, that certain ideas, notably ‘foregrounding’ and ‘defamiliarization’, ought to “take their place among other potent forms of emphasis” and proposes an approach to computational stylistics through “a larger framework of communication-theory” (Burrows, Computation 116). Within such a framework, the stylistic differences that create patterns between different texts or features of texts can be understood as incorporating “everything from idiosyncratic tropes and figures, through characteristic ‘key-words’ and images, syntactical form, and local instances of ‘foregrounding’, to unobtrusive (but statistically demonstrable) habits of expression” (Burrows, Computation 119). The key objective of the communication-theory framework is “to define the conditions under which a message, verbal or otherwise, is most likely to be accurately conveyed” (Burrows, Computation 118). Burrows argued that the conditions of a message are not necessarily found in the salient features, the foregrounding of particular features and other “language-games” (Computation 115). The ‘message’, as understood here, is the effect that language has, the meaning that is conveyed through tone or colour, comparable in spoken language to the inflection that conveys additional meaning to spoken words. Gravity, humour or disgust are all effects which can inform interpretations.

Burrows’s interpretive framework sought to bring the statistical evidence to bear on matters of literary criticism but he recognised the “formidable obstacle” that the evidence is to interpretation: “the evidence, not ordinarily so visible as statistical analysis makes it, of frequency-patterning in the very common words. If patterning of the kind examined earlier is not to be regarded as peculiarly literary, it will be difficult, if not impossible, to draw the line between literary and non-literary effects” (Computation 115, emphasis in original).

Introduction to Principal Component Analysis

Yet neither an X-ray photograph, representing the inner structures, nor an ‘Identikit’ composite, representing the visible features, discriminates as precisely among people of roughly the same physical type as correlation-matrices (and the ‘maps’ derived from them) discriminate among the idiolects of Jane Austen’s characters. (Computation 101)

Such was Burrows’s conclusion concerning the results he published in 1987. His study of very common words examined language as though it were “a human organism” with hidden components and a visible skin, both of which contribute to uniqueness (Computation 101). In a passage of dialogue, 560 words long, from Pride and Prejudice, Burrows suppressed all words other than the thirty most common words in order to demonstrate how “the vestigial shape of the original passage still leaves traces enough for its speaker to be identified” (Computation 102-103). Having already compiled a hierarchy of words used by each of the major characters in Pride and Prejudice, Burrows was able to correlate the skeletal structure of the passage with the stylistic markers of Mr Collins (Computation 103). This example supported Burrows’s argument that the “skeleton” of literary language, or “the latent resources” of language, is like the bare bones that, when studied, can reveal “the grammatical identity of many of the excised words” (Computation 102).

Creating a correlation matrix from a list of variables is the first step of PCA in the form used in this thesis. Following all the steps of PCA allows patterns between texts and words to be correlated and produces a similar skeletal picture of styles, with the power to discriminate between texts and text segments based on word usage. Eventually Burrows’s methodology developed from just creating ‘maps’ from the eigenvectors of correlation matrices to applying PCA. Although PCA originated early in the twentieth century, computers now allow researchers in numerous fields to process the data of hundreds of variables almost instantly. As Craig and Greatley-Hirsch explain, Karl Pearson invented the procedure in 1901 and Harold Hotelling independently outlined the same procedure again in 1933, giving it the name it is known by today, “Principal Components Analysis” (20). PCA is a data reduction technique that assists researchers in comprehending the otherwise fathomless depths of a dataset comprising multiple variables. In stylometry, each word is treated as a variable. The results of PCA arrange each word variable along a vector, a principal component. The position of each word expresses the reciprocal relationships between all the variables, and the percentage of variance for the component gives the researcher an idea of how important these positions are in the scheme of stylistic variation.


By 1992 Burrows’s application of PCA to literary material more closely resembled the “conventional” applications of PCA than his research in the 1980s (463).7 Conventional or not, Burrows’s application of PCA was as an end in itself, a tool that produced interpretable results. In other fields, however, PCA is more often used to prepare data for further analysis (Everitt and Hothorn 349). After experimenting with different ways to implement PCA, Burrows has recently offered an anecdotal explanation of the main differences between a stylometric application of PCA and other applications:

Many statistical methods assign the members of a set of specimens to one or another of several pre-determined classes. PCA itself is often used in this way in disciplines like engineering and also for the purposes of quality control, where aberrant specimens are rejected. But we use PCA in such a way as to allow the specimens to array themselves according to their respective affinities and disaffinities, whatever these may be. The outcome enables us to form inferences about the overall patterns that have emerged and the possible reasons for any aberrations. Such inferences can then be tested. (“PCA”)

The testing of inferences can occur in the course of discussing the results: returning to the literary text in question or, in the instance of classification questions, corroborating the results of PCA with the results from other tests.

It should be noted that Burrows’s use of PCA as an end in itself is not contrary to statistical literature. Statisticians recognise that in some applications PCA might “be amenable to interpretation” (Everitt and Hothorn 349). Although PCA is a long-established statistical method, there are several variations in its application. These will be made clear as part of the following explanation of the different steps in applying PCA to stylometric questions. The discussion of the method in this chapter is not a mathematical explanation but an outline of the steps in the process of generating and interpreting results.8

7 For a mathematical explanation of developments in Burrows’s application of PCA see Binongo and Smith; Oakes 31–45; and for the story of how Burrows came to first apply PCA to literary questions see Craig and Greatley-Hirsch 17-23.

2.3.1 Applying PCA

As mentioned above, the first step of PCA is to measure the correlation between each of the variables. Where one hundred words are selected, PCA first correlates each word with the other ninety-nine words. The resulting correlation matrix is then subjected to eigen analysis. Tomoji Tabata explains: “[b]y eigen analysis, the principal components of the matrix are extracted, and it is possible to project the most powerful components in a scatter diagram” (Tabata, “Dickens’s”). The components are new variables, uncorrelated with each other, each returning the greatest amount of variance that can be summarised in one composite variable. In R, the number of components returned is the smaller of the two dimensions of the data. Where n is the number of word variables and p the number of texts, if n = 100 and p = 31 then thirty-one components are found, with the first accounting for the largest amount of variance. If n = 100 and p = 299 then one hundred components are returned, the first still accounting for the greatest percentage of variance found in the dataset. The results from the extraction of components are portrayed in a graph, which “gives a picture of the reciprocal relationships” among all the words selected, as Tabata further illustrates when explaining how the graph can be interpreted:

Relative distance between the entries reflects similarity or contrast among these words in their concomitant variation over the [texts]. Words located towards the east and those located towards the west of the graph tend to be mutually opposed: when the frequency of one set of words rises high in a given text-sample, the frequency of the other tends to fall low. The same applies to the vertical axis. (Tabata, “Dickens’s”)

8 A mathematical account of the underpinning processes of PCA (those which are automated in statistical packages) is presented in Binongo and Smith 449-549.

It is then possible to multiply the eigen-matrix that produced the weights back through the original table of word frequencies. This produces PCA scores, which are also plotted as a scatterplot. In this second graph, the texts are the data points and are distributed according to “the word-pattern” of the first set of results (Tabata, “Dickens’s”). Tabata explains how to interpret the PCA results:

Since this graph is a product of the previous one, they correspond to each other. When one compares the two graphs, one can see that easterly entries of words in [the word graph] occur more frequently in text-entries lying towards the east of [the text graph] than in those lying towards the west (and vice versa), while northern entries of words in [the word graph] predominate in the text-entries situated in the north of [the text graph] and are outnumbered in texts that find their place in the south (and vice versa). Additionally, words that contribute little to the horizontal and vertical differentiations of texts lie around the middle of the graph. (“Dickens’s”)

In other words, the two graphs are mutually supportive avenues to understanding PCA results. Most of these steps are automated in statistics packages such as R or Matlab. R is the statistics environment used in all three case studies of this thesis and has two inbuilt PCA commands as well as numerous packages that include PCA. The command used in this thesis is prcomp().9 There are three steps which must be governed by the researcher, relating to data collection, data input and an additional argument when invoking prcomp() in R. These three steps are essential for using PCA in such a way as to link the results to literary concerns and to ensure that the patterns in the results accurately portray the literary material.
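The sequence outlined in this section (correlation matrix, eigen analysis, word weights, text scores) can be sketched in a few lines. The following is an illustration only, written in Python with NumPy rather than the R environment used in this thesis; the function name, the random table and its dimensions are hypothetical. It also assumes that the weights are multiplied through the standardised table of frequencies, which is how a correlation-matrix PCA is usually computed.

```python
import numpy as np

def pca_on_correlation(freqs):
    """Sketch of PCA by eigen analysis of a correlation matrix.

    freqs: rows are texts, columns are word variables (proportional frequencies).
    Returns (variances, weights, scores).
    """
    X = np.asarray(freqs, dtype=float)
    # Mean-correct and standardise each word column (divisor N - 1).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 1: correlate every word with every other word.
    R = np.corrcoef(X, rowvar=False)
    # Step 2: eigen analysis; eigh is suited to a symmetric matrix.
    eigenvalues, eigenvectors = np.linalg.eigh(R)
    order = np.argsort(eigenvalues)[::-1]   # largest variance first
    keep = order[: min(X.shape)]            # smaller of the two dimensions
    variances = eigenvalues[keep]
    weights = eigenvectors[:, keep]         # one column of word weights per component
    # Step 3: multiply the weights back through the standardised table.
    scores = Z @ weights                    # one row of scores per text
    return variances, weights, scores

# Hypothetical data: 5 texts, 8 word variables.
rng = np.random.default_rng(0)
table = rng.random((5, 8))
variances, weights, scores = pca_on_correlation(table)
```

With five texts and eight words the sketch returns five components, the smaller of the two dimensions. The first two columns of weights would supply the word graph and the first two columns of scores the text graph; the sign of any component is arbitrary and may differ between implementations.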

2.3.2 Proportional Frequencies

The first step is to ensure that the word counts are turned into proportional frequencies prior to subjecting them to PCA. This ensures that differences in size between texts or text segments do not disturb the findings. As Burrows pointed out, where unequal segments of text are being compared, raw word counts must be standardized, that is, turned into frequencies as a proportion of the text (“Not Unless” 92). The word ‘the’ is used 3,577 times in J.K. Rowling’s Harry Potter and the Philosopher’s Stone, a novel of 223 pages, but it is used 10,454 times in Harry Potter and the Deathly Hallows, a novel of 607 pages and more than twice as long. In order to compare word counts where the texts are of unequal size, as most texts are, a like-for-like comparison is needed; otherwise comparison by statistical analysis is rendered useless. To convert a raw count to a percentage of the text, the count is divided by the total size of the text: ‘the’ occurs at a rate of 4.6% of all words in Philosopher’s Stone and 5.2% in Deathly Hallows. This difference is not as large as it appeared to be when only the raw counts were compared. Hence, for two texts of unequal size to be meaningfully compared, the word counts must be in the form of proportions.
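The conversion described above amounts to a single division. As a minimal sketch in Python (not the software used in this thesis), the counts of ‘the’ below are those reported above, whereas the two total word counts are hypothetical round figures chosen only to be consistent with the stated percentages; they are not taken from the texts.

```python
def proportional_frequency(count, total_words):
    """Express a raw word count as a percentage of the text's total word count."""
    return 100.0 * count / total_words

# Counts of 'the' come from the passage above.
# The totals are HYPOTHETICAL, back-solved from the reported percentages.
STONE_TOTAL = 77_800
HALLOWS_TOTAL = 201_000

stone_rate = proportional_frequency(3577, STONE_TOTAL)
hallows_rate = proportional_frequency(10454, HALLOWS_TOTAL)

# Raw counts suggest a nearly threefold difference between the two novels.
raw_ratio = 10454 / 3577
# As proportions, the two rates differ by well under one percentage point.
rate_gap = hallows_rate - stone_rate
```

Under these assumed totals the raw counts differ by a factor of almost three while the proportional rates differ by roughly 0.6 of a percentage point, which is the contrast the paragraph above draws.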

9 princomp() is another inbuilt command available for calculating PCA in R; it computes the variances by dividing by N, while prcomp() follows the usual approach of computing the variances with the divisor N – 1. prcomp() is also preferred because it performs the calculation by a singular value decomposition of the centred (and, where requested, scaled) data matrix, which the online R handbook reports is “the preferred method for numerical accuracy” (“Principal Components Analysis”).

2.3.3 Word Variables

The second step that a stylometrist must take is to treat the words, rather than the texts, as variables. Burrows’s early applications of PCA reversed this order, treating the texts as variables rather than the words (Computation). However, word variables were quickly adopted because, as Binongo and Smith explain, when the words are treated as variables “they tend to occur at less homogenous rates in different texts” (463). The preferred method means that the PC scores provide positions for the texts as data points along an axis that is explained by the PC weights for the words. Not only is this process more “amenable to interpretation”, as per Binongo and Smith (463), but it is also the more intuitive option. Rather than finding the variance in the words, the variance is found between the texts and is explained by the frequencies of the words.

2.3.4 Correlation versus Covariance

The final step is to adopt a correlation matrix rather than a covariance matrix. Using a correlation matrix has become standard practice; practitioners report that the best results in classification tests (authorial attribution tests) are achieved when using correlation rather than covariance (Baayen et al.; Hoover, “Frequent”). Mathematically, the only difference is that a correlation matrix is standardised (Gentle 295). Both matrices are mean-corrected, but for the correlation matrix the mean-corrected values are then divided by the standard deviation.10 In the software program R, the argument scale subjects a matrix to both these processes. Thus, to ensure that an analysis using the prcomp() function in R is carried out on a correlation matrix, the scale argument must be set to TRUE (Hothorn and Everitt 352).

10 For a mathematical explanation of mean-correcting and standardising by dividing mean-corrected values by the standard deviation see Binongo and Smith 446, 448.
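The relationship between the two matrices can be checked on a small scale. The sketch below is in Python rather than R, and the two columns of percentages are invented for illustration. It mean-corrects and standardises each word column and then confirms that the covariance of the standardised columns is the same number as the correlation of the raw columns.

```python
from statistics import mean, stdev

def standardise(column):
    """Mean-correct a column, then divide by its standard deviation (divisor N - 1)."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]

def covariance(a, b):
    """Sample covariance of two equally long columns (divisor N - 1)."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def correlation(a, b):
    """Pearson correlation: covariance divided by both standard deviations."""
    return covariance(a, b) / (stdev(a) * stdev(b))

# HYPOTHETICAL percentages for a very frequent and a less frequent word in five texts.
the = [4.6, 5.2, 4.9, 5.0, 4.4]
wand = [0.10, 0.45, 0.12, 0.15, 0.08]

# Raw variances: the more frequent word varies on a larger scale.
raw_variances = (covariance(the, the), covariance(wand, wand))

# After standardising, every column has variance 1 ...
z_the, z_wand = standardise(the), standardise(wand)
unit_variance = covariance(z_the, z_the)
# ... and the covariance of standardised columns equals the raw correlation.
cov_of_standardised = covariance(z_the, z_wand)
corr_of_raw = correlation(the, wand)
```

In this invented example the raw variance of the frequent word is several times that of the rarer word, so it would weigh more heavily in an analysis of the covariance matrix; once standardised, each word contributes on an equal footing.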

Although the correlation matrix is standard practice on the basis of improved results, the literature offers little explanation of why correlation matrices work best for stylometry. Statisticians argue that when “variables are on very different scales principal component analysis is usually carried out on the correlation matrix rather than the covariance matrix” (Hothorn and Everitt 350). Arguably, the frequency of word use can differ greatly, even in a list of one hundred of the most common words. Thus, the word frequencies can be on different scales. Binongo and Smith explain the situation for stylometrists as follows:

In stylometry, authors’ different rates of use of words that occur infrequently are generally thought to be less reliable than smaller differences in those that occur much more often. However, the latter usually exhibit larger variances and, in PCA, can swamp the effect of the less frequent words. Thus the stylometrist is placed in an invidious position: to standardize gives equal weight to all words; to use the actual numbers of occurrences can mean that many words with power to discriminate may be overwhelmed by less effective high-frequency words. (458–9)

Binongo and Smith arrived at this conclusion after comparing PCA using standardized variables and non-standardized variables. Standardised was defined in their paper as “dividing the mean-corrected values by the standard deviation” (446). Although they do not invoke the terms “covariance” and “correlation” in this comparison, the difference between the two matrices is that the columns of a correlation matrix are standardised, that is, divided by their standard deviations (Gentle 295). D.L. Hoover quotes most of this same passage to explain the differences between correlation and covariance in PCA and goes on to note that Binongo and Smith create some uncertainty in their recommendations. For instance, in the analysis that follows the above explanation, Binongo and Smith do not make clear which approach they used (Hoover, “Frequent” 269). Hoover himself only performed a brief comparison of the two approaches, concluding that, although outside the scope of his article, it appeared as though “the correlation matrix more often produces recognizably correct clusters of texts of known authorship than does the covariance matrix” (“Frequent” 269). In the experience of Harald Baayen, Hans van Halteren and Fiona Tweedie, correlation matrices almost always “led to greatly improved classifications” when using PCA for authorship attribution (129). There was only one instance in their study where PCA was not improved by using the correlation matrix (128). Interestingly, this application of PCA was the only one analysing the “relative frequency of hapax legomena”, that is, “the extent to which the syntactic creativity unique to a particular author (or text sample) manifests itself” (Baayen et al. 128). The authors do not offer an explanation for this but, as they were asking whether “robust clues to authorship identity should also emerge on the basis of the hapax legomena” (127), one could assume that, as a relative measure of rare words, the variables were not on very different scales.

Another clue from the literature is in the manual to the R package “stylo”, produced by the Computational Stylistics Group. In creating “stylo”, the Computational Stylistics Group included both methods for PCA, covariance and correlation, with the note that the correlation matrix “is possibly the more reliable option of the two, at least for English” while covariance is good for “a single iteration (or just a few)” (Eder et al., Stylo). Finally, it is a well-established practice, as Binongo and Smith outline in their history of PCA in stylometry, which dates the use of correlation matrices to Burrows and A.J. Hassall in 1988 (Binongo and Smith 446).

The reasons given in the literature do not encourage a blind adoption of the correlation matrix. Therefore a brief comparative experiment was conducted. Taking the corpus of seven Harry Potter texts and the list of ninety-two frequent words generated for use in Chapter 5, two PCA results were produced, one using a covariance matrix and the other a correlation matrix. To simplify the comparison of these two methods the results were projected in biplots, with the word graph and text graph overlaid in one plot. Biplots are usually too crowded to be of much use but suffice for the current purpose of comparing methods.

Figure 2.1 PCA on Harry Potter Corpus Using Covariance Matrix


Figure 2.2 PCA on Harry Potter Corpus Using Correlation Matrix

Immediately apparent is the fact that there are differences. As Binongo and Smith warned, in Figure 2.1 the results are “overwhelmed” by words that are presumably exhibiting larger variances (459). The prominent words in Figure 2.1 are among the ten most frequent words in the corpus, including the, and, a, said, and he. However, not all the words jutting out in Figure 2.1 are from the top end of the list of word frequencies. For instance, the word wand can be observed close to he and yet wand is ranked ninety-second on the list while he is the seventh most frequently used word in Harry Potter.

The explanation is that wand is ranked forty-second on the list of most frequent words in the seventh book of the series, Harry Potter and the Deathly Hallows (2007). In other words, wand has a much higher proportional frequency, and therefore a much higher rank, in the last book than it has in the corpus overall.


Figure 2.3 The Order of 92 Words in Harry Potter and the Deathly Hallows (HP7)

Ranked ninety-second overall, the word wand has a sharp increase in usage in Deathly Hallows, jumping to forty-second in this text alone. In this way it, like all the other words jutting out in Figure 2.1, is an outlier. Methodologically speaking, one may wish to suppress outliers such as these to diminish some of what are already obvious trends in the data and allow other, less obvious ones to come to the fore. Where the purpose is to remove outliers prior to subjecting the data to other analyses, the covariance matrix may be preferred so that extreme variables can be excised from further processing. In the present thesis, however, the aim of the research is to explore the underlying patterns in word usage and these patterns are more clearly expressed through a PCA computed using a correlation matrix. As Figure 2.2 demonstrates, the spread of texts within the two-dimensional space can be explained by a multitude of patterns in word usage rather than just fifteen or so outlying words. Therefore, a correlation matrix has been adopted throughout all the PCAs conducted for this thesis.


In addition to the above three methodological choices there are other considerations to be made. The type and number of words selected, the corpus, whether the texts are kept whole or broken into segments, whether the segments are dictated by word size or according to chapter, are all external choices which influence the outcome of the research. As such, they are explored in further detail in the following sections: Chapter 2 Section 1.3 above discusses the various approaches to word selection while Chapter 2 Section 3.2 specifies the method used in this thesis to compile word lists; Chapter 2 Section 3.1 explains how the texts were selected and prepared; and Chapter 2 Section 3.3 gives details for how PCA was specifically applied in the different case studies of this thesis.

2.3.5 Interpreting PCA Results

Interpreting the results, as discussed earlier, can be theoretically problematic. In fact, Hoover has claimed that PCA is “more difficult to interpret” than some other statistical techniques but at the same time it is useful in spite of this difficulty because the two plots provide more information (Hoover, “Frequent” 268). To reduce the difficulty and increase transparency the following is a brief outline of the approach taken throughout this thesis.

The two graphs are closely related: as Tabata explains, the text graph is a “product” of the first one (“Dickens’s”). What was not explained in the outline of PCA above is that Tabata was speaking of a two-dimensional graph, one of the easiest ways to portray data in a text document. PCA returns too many components to graph in two-dimensional space, and only the first few contain the majority of the variance. Occasionally most of the variance is captured within just the first two, and at other times a researcher will wish to look at the third and fourth components to complete the picture of the variance found. Determining which components to examine is yet another subjective choice. In the present thesis, the approach was to plot only the first two, as the rate of variance from the third onwards is dramatically lower than the variance found by the first two components. This protocol is employed throughout the thesis because, with literature, minor variations in style could be found in all the returned components but none of the components carries the same significance as the first. Plotting the second is useful as a comparison to the first and is informative in that it carries some variance, in this thesis usually above 10%.
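The percentage of variance attached to each component follows directly from the component standard deviations that a PCA routine reports (prcomp() in R returns them as sdev). The following Python sketch shows the arithmetic; the five standard deviations are invented for illustration and do not come from any test in this thesis.

```python
def percent_variance(sdevs):
    """Convert component standard deviations into percentages of total variance."""
    variances = [s * s for s in sdevs]
    total = sum(variances)
    return [100.0 * v / total for v in variances]

# HYPOTHETICAL standard deviations for five components, largest first.
sdevs = [6.0, 4.0, 2.0, 1.5, 1.0]
shares = percent_variance(sdevs)

# Axis labels for a two-dimensional plot of the first two components.
axis_labels = [f"PC{i + 1} ({share:.1f}%)" for i, share in enumerate(shares[:2])]
```

With these invented values the first two components carry about 61% and 27% of the variance and the third less than 7%, the kind of drop that would justify plotting only the first two.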

Methodology

2.4.1 Selecting Texts

Science fiction and fantasy have both had an incredible influence on today’s society. Science fiction is on par with current developments in technology. As Rob Latham has noted, “[t]he scope of [science fiction’s] sociocultural influence has never been greater, almost keeping pace with the magnitude of technological change itself” (Latham 6). The immersive nature of many science fiction and fantasy works has enveloped new generations in a plethora of futuristic and fantastic worlds and continued the tradition of extrapolative reading. J.K. Rowling’s Harry Potter series, which is analysed in this thesis, infiltrated the imaginations of entire generations to the extent that some have attributed reversals in declining reading trends among children to Rowling’s fantasy series (Keen 734-5).

Not all the texts studied in this thesis can claim the same impact and more than half belong to an older era of extrapolative literature, one that immediately preceded the explosion of pulp science fiction in the 1920s. The reason for this is simply a matter of legality.

The steel wall of copyright law stands in the way of computational stylistics and other fields from the Digital Humanities. As a result, the domain of computational stylistics has long been confined to works that have ceased to be covered by statutory copyright protections and are thus in the public domain. Although Google has been digitising and collecting books in libraries around the world, scholars and institutions are hesitant to work with books within copyright protection because it is a grey area of the law. However, there is a provision in Australia’s copyright law for “fair use” which includes in its definition “academic research”. Prima facie, academic research is privileged by statutory copyright law. However, it remains a grey area because it has not yet been tested by the courts.

There is a second hesitation for using electronic copies of books related to the contract between purchaser and provider. Publishing houses include terms and conditions that users automatically agree to upon purchasing electronic books. Invariably, publishing houses will include a clause that prohibits a user from bypassing their security encryptions and copying any material from the purchased text.

In conclusion, the statute law would indicate that an academic dealing with more than ten percent of a book for research purposes is permitted, according to the fair use clause. Nevertheless, a researcher who bypasses the security encryption of an electronic book in violation of a legally binding contract has no statutory defence for such an action. At least, not at the moment. Although special permissions were sought from publishing houses for this research only one response was received, from the Estate of C.S. Lewis.

Therefore, the current thesis investigates the study of style in early science fiction and fantasy and in more contemporary children’s fantasy fiction. The public-domain works included for analysis are by H.G. Wells, George MacDonald and Olaf Stapledon. Access to texts by these authors came from Project Gutenberg and careful attention was given to the quality of the text found on the public platform.11 The second part of the thesis analyses works by J.K. Rowling and C.S. Lewis. The publishing house for Rowling makes electronic versions of her fantasy series, Harry Potter, available for purchase without security encryption. The use of these books is governed by the fair use clause in Australian copyright law.

11 Project Gutenberg refers to the Australian website: http://gutenberg.net.au/. The texts available on this site have been prepared by volunteers.

The difficulty arising from assembling a set of texts this diverse is that a wildly non-homogenous corpus yields wild results. The quantitative analysis is designed to demonstrate the variations between objects, and the variations between all the above authors who together span a hundred years of writing would return numerous patterns of variation, making the task of interpretation incredibly difficult.

Thus, the research aim is to search for a link between observable patterns in language and the artistic function of observed patterns. It is in the interest of this research to investigate variations within the context of the text, both the literary historical context and also the context of the text’s construction.

2.4.2 Counting Words

The process for counting the frequency of words was carried out using Intelligent Archive, a software program created by the Centre for Literary and Linguistic Computing at The University of Newcastle, Australia. Intelligent Archive allows the user to create repositories of texts and search the repositories for lists of words or the most frequent words either in the complete texts or in segments of the text, i.e. per chapter or per thousand words. For each corpus, a word frequency list of one hundred words was generated and any proper nouns and titles were excluded. A new list was generated for each individual corpus studied in this thesis. This means that for every test, the list of most frequent words is representative of the most frequently used words in the present corpus. Therefore, the list of words used to analyse The Time Machine against a backdrop of thirty other texts is the one hundred most frequent common words found in all thirty-one texts together, with any titles and names removed. The word lists for each corpus are all provided in the Appendices. Once counted, the proportional frequency for each of the words was saved in a CSV file ready to be processed.
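The logic of this counting procedure can be illustrated outside Intelligent Archive. The Python sketch below is not that program and makes no claim about how it works internally; the tokenising rule, the function names, the two miniature ‘texts’ and the exclusion list are all hypothetical. It pools the counts for a corpus, drops excluded names and titles, keeps the most frequent words, and expresses each as a percentage of a text.

```python
import re
from collections import Counter

def tokenise(text):
    """Lower-case a text and split it into word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z']+", text.lower())

def top_words(texts, n, excluded=frozenset()):
    """Return the n most frequent words across the whole corpus, minus exclusions."""
    counts = Counter()
    for text in texts.values():
        counts.update(tokenise(text))
    ranked = [word for word, _ in counts.most_common() if word not in excluded]
    return ranked[:n]

def proportions(text, word_list):
    """Percentage of the text's tokens accounted for by each word on the list."""
    tokens = tokenise(text)
    counts = Counter(tokens)
    return {word: 100.0 * counts[word] / len(tokens) for word in word_list}

# HYPOTHETICAL two-text corpus and exclusion list, for illustration only.
corpus = {
    "a": "Harry said the wand chose the wizard",
    "b": "The wizard said the door was open",
}
words = top_words(corpus, 3, excluded={"harry"})
props_a = proportions(corpus["a"], words)
```

Each row of the resulting table (one dictionary per text) could then be written to a CSV file, with the words as column headings, ready for the PCA.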


2.4.3 Applying PCA

PCA is a very common statistical function and as such is built into several programs including Matlab and R. The open source statistics environment, R, is the program used throughout this thesis. The word frequencies are imported to a data-frame through a CSV file, scaled, and then subjected to the inbuilt prcomp() function. The full script is available in Appendix A.

In Chapters 3 and 5, PCA is applied not only to word frequencies from the texts but to word frequencies from chapters of the selected texts. This measure responds to specific questions that emerge in the course of analysis in both these case studies and allows the initial loadings from the PCA on texts to calculate PC scores along the same new variable for each chapter studied. Results for each chapter are determined manually by multiplying the frequency of the words in each chapter by the loadings for each word according to the principal component result from the PCA test by whole text. This allows the analysis of the texts by an internal measure, the chapters, but still according to the same component of variance determined by the initial test. The patterns of the chapters allow inferences to be drawn regarding the internal stylistic difference of the narratives in relation to the overall distinction between the texts. In addition, the extremes of the principal components can be explored by a closer reading of the chapters at each end.
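The manual calculation of chapter results described above is a weighted sum: each word’s frequency in the chapter is multiplied by that word’s loading and the products are added together. The Python sketch below illustrates this with invented loadings and chapter frequencies. Whether the chapter frequencies are first centred and scaled using the whole-text means and standard deviations is not specified here, so the sketch treats both as optional arguments; that option is an assumption of the sketch, not a description of the script in Appendix A.

```python
def chapter_score(chapter_freqs, loadings, centre=None, scale=None):
    """Project one chapter onto a component found by the whole-text PCA.

    chapter_freqs, loadings: dicts keyed by word.
    centre, scale: optional dicts of whole-text means and standard deviations.
    """
    score = 0.0
    for word, loading in loadings.items():
        value = chapter_freqs[word]
        if centre is not None:
            value -= centre[word]   # mean-correct with the text-level mean
        if scale is not None:
            value /= scale[word]    # standardise with the text-level deviation
        score += value * loading
    return score

# HYPOTHETICAL loadings on the first component and chapter percentages.
loadings = {"the": 0.6, "and": -0.3, "said": 0.7}
chapters = {
    "ch1": {"the": 5.0, "and": 3.0, "said": 1.0},
    "ch2": {"the": 4.0, "and": 3.5, "said": 0.5},
}
scores = {name: chapter_score(freqs, loadings) for name, freqs in chapters.items()}
```

In this invented case the first chapter scores higher than the second on the component, so it would sit further towards the end of the axis associated with the and said.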

2.4.4 Interpreting Results

In the same way that science fiction texts offer a plurality in the language, one of a concrete world and the other of an imaginative world, the results of any given PCA offer concrete results that are sensible on their own but with the act of interpretation can offer a clearer visualisation of the original text. Samuel R. Delany has contemplated the reading strategies brought to science fiction texts and argued that there is a distinct language of science fiction which leads the reader to extrapolate sparse information to imply the outlines of entirely different societies and scientific developments (Starboard 53). In the language of statisticians, the interpretation of statistical results always requires “the subjective judgment of a human being” and although the techniques can “help a person to think” they do not “write the interpretation for the same” (L.L. Thurstone qtd. in Di Franco and Marradi 125). Although the empirical method applied in the present thesis is a robust statistical analysis, the act of forming inferences regarding the overall patterns seen to have emerged is both a necessary step in the analytical process and a subjective aspect of the test.

As mentioned above, the results of the PCA are of processed data and several stages removed from the original source. The first stage was to count the words; the second was to extract the frequencies of only the one hundred most common words. Already removed from their original context, the words are represented in this stage as numerical percentages of the whole text. The third stage removes the data several more times before returning a set of principal components to account for the variance between the texts. There is one component for each text, or segment of text, analysed in the test. The first step of interpretation is identifying how many components to analyse. Although all the components convey information about the variance, the first component is generally the only measure interpreted (Di Franco and Marradi 125). As a data-reduction technique, PCA is useful for explaining the main differences between a set of items, in our case a set of texts, with the first component always summarising the major distinctions. Experience has shown, however, that the second component also returns a fair amount of variance, and to justify the use of two components the percentage of variance is included in the labels on the axes of each graph. Occasionally the variance between the two is quite similar, for instance in the corpus of C.S. Lewis’s The Complete Chronicles of Narnia (in Chapter 5) where there is a difference of only 3% between the first component and the second. In this test, the second component (always represented as the vertical y-axis of a graph) explains almost as much of the variation as the first component but, surprisingly, it is only one text, The Lion, The Witch and the Wardrobe, that is distinguished from the six others by the second component. The explanation for this distinction offers interesting insights into some major stylistic differences between The Lion and the rest of Lewis’s series.

Although all the returned components of the PCA contain information pertaining to the stylistic differences between texts, it is only the first and occasionally the second that is the subject of interpretation in the present case. The component accounting for a significant proportion of variance between the texts, when interpreted, can offer qualitative explanation for stylistic differences observed by statistical analysis.
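The proportion of variance attributed to each component can be derived from the component standard deviations that a PCA routine typically reports. The sketch below is a hypothetical Python illustration of that arithmetic: the function name and the four standard deviations are invented for the example and are not results from the thesis.

```python
def variance_explained(sdevs):
    """Return the percentage of total variance carried by each component."""
    variances = [s ** 2 for s in sdevs]  # variance = standard deviation squared
    total = sum(variances)
    return [100.0 * v / total for v in variances]


# Hypothetical component standard deviations, largest first.
sdevs = [4.0, 3.0, 2.0, 1.0]
shares = variance_explained(sdevs)

# Axis labels of the kind used on the score and loading graphs.
axis_labels = [f"PC{i + 1} ({share:.1f}%)" for i, share in enumerate(shares[:2])]
```

Comparing the first two entries of `shares` shows at a glance whether the second component carries enough variance to merit interpretation alongside the first.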

The first step of interpretation is to explain the patterns between texts and the words. This can be achieved by studying the graph containing the word loadings for each principal component. Beyond the obvious patterns on the graph it is useful to take at first the top ten words at either end of the principal component. That is, the ten words most associated with each extreme, the negative end of the continuum and the positive. Working with only twenty words allows the researcher to explore in more depth the way these words are used in the texts that have been correlated more strongly with these particular words. These words can be used in a concordance search through the texts to contextualise the usage of the words or to identify the chapters where these words are used most frequently, and a closer reading of the text carried out to further illustrate the narrative function of these words.
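Selecting the words at either end of a component amounts to sorting the loadings. The following Python sketch illustrates the idea with a hypothetical function and invented loadings for eight words; it is not the procedure recorded in the thesis appendix.

```python
def extreme_words(loadings, n=10):
    """Return the n most negative and the n most positive words on a component.

    `loadings` maps each word to its loading. If the vocabulary holds fewer
    than 2 * n words, the two lists will overlap.
    """
    ranked = sorted(loadings, key=loadings.get)  # most negative first
    negative_end = ranked[:n]
    positive_end = ranked[-n:][::-1]  # most positive first
    return negative_end, positive_end


# Hypothetical loadings on one component.
loadings = {
    "the": -0.31, "of": -0.22, "in": -0.15, "like": -0.05,
    "but": 0.12, "not": 0.20, "would": 0.24, "do": 0.30,
}

negative_end, positive_end = extreme_words(loadings, n=2)
```

The two lists can then be fed into a concordance search to see how each word is used in the texts that score at the corresponding extreme.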

To draw conclusions about the patterns and what they imply about the stylistic differences between texts and genres is a subjective process that is informed by the frameworks brought to the study. To a certain extent Chapter 1 elucidates my own frameworks for understanding and interpreting science fiction and fantasy. As a literary critic (rather than a linguist) one is chiefly interested in the artistic function of the texts at the level of these common words, which means that final conclusions as to the patterns observed in PCA results are informed through further contextual evidence of those words. Thus, it is only after observing the use of words in context that an explanation can be offered for why the higher usage of the in early science fiction and adventure novels appears to indicate a higher rate of spatial description as opposed to social and psychological descriptions in the realist texts in the same corpus.

Conclusion

The following case studies offer statistical findings that explore the style of science fiction and fantasy works by characterising the styles of various works and identifying the main stylistic differences between genres and texts at the base-strata of language.

The subsequent chapters of this thesis are structured according to the different levels of analysis that have been involved in the computational stylistic study. Analysis at the ‘computational’ level is the first step in producing results using word frequencies and PCA; it offers a view of the complex relationships that exist between words, between texts and between words and texts. Then analysis at the ‘textual’ level is still largely statistical but is the first plunge back into the text according to the new perspective offered by the multivariate data. This level can include searching the text for particular words or groupings of words using a concordance or it can involve further statistical tests. This level is needed to demonstrate the connections between the data and the texts. Finally, there is analysis at the ‘semantic’ level which is needed to illustrate the precise functions of language in the texts and the effects they produce. A common analysis is threaded throughout all three stages. In order to be a work of literary criticism rather than a series of observations about word frequencies, in each level of the discussion the specific details are related to the generic ones, from the evidence to an explanation. This happens at each level so that multiple explanations can be gained. Therefore, not only do the levels work as a progressive interpretation of the texts but also as an accumulative interpretation.


Contextualising the Style of Early Science Fiction and Fantasy

3.1 Introduction

The genealogies for the genres of science fiction and fantasy are not fixed entities. Scholars have variously argued that the origins of science fiction can be found in 1818 with Mary Shelley’s Frankenstein (Aldiss 8), in 1926 when Hugo Gernsback first described the category under the label “scientification” (Delany, Silent 25–6), or circa 80 with Plutarch’s Peri tou prosôpou (Roberts, History 346). Fantasy fiction, on the other hand, has global roots in fairy tale and cultural myth, but for fantasy as a twentieth-century commercial genre the start date is often considered to be the publication of J.R.R. Tolkien’s The Lord of the Rings trilogy (1954). The seeds of the twentieth-century genre can be found in the nineteenth-century fantasy works of George MacDonald as well as ancient Norse mythology and many other sources. Cultural moments and the early works which germinated the seeds for numerous science fiction and fantasy tropes are discussed in various histories and encyclopedias that chronicle the development of the genres, including Brian Aldiss’s Billion Year Spree (1973), John Clute and Peter Nicholls’s Encyclopedia of Science Fiction (2011), and John Clute and John Grant’s Encyclopedia of Fantasy (1997).

However, the stylistic seed, the linguistic influence on the language accommodating the ideas and tropes of the genres, is not generally the subject of critical scholarship. This chapter presents a new approach to studying early science fiction and fantasy and is the first attempt to contextualise the style of these two genres with the aid of stylometry. This question was approached through the study of coeval texts, H.G. Wells’s The Time Machine (1895) and George MacDonald’s Lilith (1895), within a corpus of thirty-one novels, all of which were published between 1868 and 1900. The stylometric analysis determines the most extreme stylistic differences that are present in the corpus of texts. This provides a continuum of stylistic distinction upon which the styles of Wells’s science fiction work and MacDonald’s fantasy novel can be located and contextualised. For the purposes of this chapter, the term “early” is applied to these two works, which both precede the formal demarcations of the genre and are considered early examples by leading authorities. Still, the decision to use works from the middle of the 1890s in order to contextualise the styles of these two genres is a choice that deserves some attention since the definition of what can be considered “early” can alter the definition of the genre itself. “Stress the relative youth of the mode”, writes Adam Roberts, “and you are arguing that SF is a specific artistic response to a very particular set of historical and cultural phenomena” (Science 37).

Alternatively, if you “[s]tress the antiquity of SF”, as Roberts puts it, you argue that science fiction is instead “a common factor across a wide range of different histories and cultures, that it speaks to something more durable, perhaps something fundamental in the human make-up, some human desire to imagine worlds other than the one we actually inhabit” (Roberts, Science 37-38). The same can be said for fantasy fiction and the differing approaches to defining the genre demonstrate as much. To define the genre as the “non-mimetic” whenever it is present in a fictionalised or dramatised form is to suggest that the genre is ancient and universal. To define the start date as the commercial breakthrough of a trilogy form such as Lord of the Rings is to imply that the genre emerges from a particular cultural moment. Indeed, Farah Mendlesohn and Edward James state that “[f]antasy and not realism has been a normal mode for much of history” and go on to note, “[a]rguably however, fantasy as a genre only emerges in response (and contemporaneous to) the emergence of mimesis (or realism) as a genre” (7, emphasis in original). The history of fantasy, they suggest, is found throughout the history of written fiction and yet the mid-nineteenth century “saw the emergence of a new kind of fantasy” (18) and by the end of the nineteenth century fantasy works appeared that were to “provide emergent genre fantasy with many of its core concepts and tropes” (20). Thus, even those who trace the history of fantasy through the earliest forms of the ancient Greek and Roman novels note a shift in the nineteenth century. It is among the fictional works at the end of this century that the present study aims to contextualise the styles of early science fiction and fantasy works. On the matter of genre definitions, the approach taken in this thesis, as discussed in Chapter 1, is a blend of the two approaches: genres can develop from “a common factor across a wide range of different histories and cultures” prior to becoming an established set of artistic solutions and reading strategies that are encoded in the language and tropes of a genre.1 This thesis highlights the underlying stylistic features that mark such solutions.

Even with established codes and conventions, there is no guarantee that any given genre will experience a prolonged existence. Hence, the decision to focus this study on the late-Victorian period was prompted by the literary context at the time: many novelistic genres had emerged and already disappeared and it was at the turn of the century that the “super” genres of the twentieth century began to coalesce. The late 1800s is key to understanding twentieth-century novelistic genres that emerged from the fabric of the Victorian novel because traces of the old genres can often be found in the dominant genres. As discussed in Chapter 1, while some of the exhausted genres can be defined as modes, others survived as genre fiction in categories such as science fiction, or as “super-niches”, to use Franco Moretti’s term. Moretti argues that super-niches were created by the growth of the book market that created a demand for “all sorts of niches for ‘specialist’ readers and genres (nautical tales, sporting novels, school stories, mystères) … which culminates at the turn of the century in the super-niches of detective fiction and then science fiction” (Graphs 8). Indeed, Moretti’s quantitative analysis of forty-four novelistic genres from 1740-1900 found that the genres tended to emerge and disappear in clusters, with “major bursts of creativity” occurring in six different clusters of time (18). In summarising the results, Moretti notes that “[i]nstead of changing all the time and a little at a time, then, the system stands still for decades, and is then ‘punctuated’ by brief bursts of invention” (18). Moretti demarcated two of those periods of invention as the early 1870s and the mid-late 1880s (18). In the present study, the corpus of late Victorian novels is a sample of texts that were published across both these bursts, with all thirty-one works published between 1871 and 1900. In this literary context, where the book market was steered by reading interests causing reading specialisations to converge and create bricolage genres, prototypical examples of twentieth-century science fiction and fantasy first emerged from a cultural milieu.

1 It is important to note that ascribing the label “science fiction” to works that were written prior to the term’s coinage is not necessarily indicative of a change in the definition of the genre. Therefore, although Wells described his works as “scientific romances”, the term “science fiction” is used throughout the chapter in keeping with the classification of The Time Machine as science fiction. Similarly, Lilith was written prior to the establishment of fantasy fiction as a commercial category, yet can be described as an early example of the portal-type of fantasy fiction (Mendlesohn, Rhetorics 51).

There are, of course, more than just two authors from this period who were writing in prototypical forms of science fiction and fantasy (Suvin, Victorian; Mendlesohn and James, A Short). The selection of H.G. Wells and George MacDonald, and their coeval texts, was determined by the representational problems the authors shared, and which they communicated following the publication of The Time Machine and Lilith. In these works, both authors tackle the representational problem of how to investigate worlds beyond the familiar dimensions. Following the publication of Lilith, Wells wrote to MacDonald saying as much: “[c]uriously enough, I have been at work on a book based on essentially the same idea, namely that, assuming more than three dimensions, it follows that there must be wonderful worlds nearer us than breathing and closer than hands and feet” (qtd. in Hein 390). Their respective artistic solutions differed, which Wells acknowledged, “I have wanted to get into such kindred worlds for the purposes of romance for several years, but I’ve been bothered by the way. Your polarization and mirror business struck me as neat in the extreme” (qtd. in Hein 390). Despite the differences between Wells’s invention and MacDonald’s mirror-portal, the two works do share in one artistic solution: a framed narrative.

Wells’s narrative is framed with a device known as the “club story”, which provides a contrast between the domestic and the fantastic. The fantastic tale is told by the Time Traveller but is introduced, closed and reported verbatim by another, who is, according to critics, most likely named Mr Hillyer (Parrinder, Shadows 36). As such, the two aspects of the narrative are clearly differentiated. Furthermore, the setting of the frame narrative, the Time Traveller’s cosy smoking-room, and the cast of everyday late-Victorian professional characters, including the Psychologist, the Provincial Mayor, the Medical Man and the Editor, domesticate the club story so that it functions as the familiar contrasted with the fantastic. As well as a contrast to highlight the fantastic elements, Nicholas Ruddick has argued that the framed narrative functions for satirical purposes by illuminating the “triviality of human concerns when viewed from outside the exclusive perspective of the historical present” (342).

In contrast, the frame narrative in Lilith is subtler: not only is the whole narrated by a single protagonist, Mr Vane, but the initial domestic setting of the library is made strange from the outset, and, upon return to the library, Vane is unsure if he is dreaming or if he has returned. Although, or perhaps because, this is an early example of the portal fantasy story, Mendlesohn notes that the relationship between structure and style in Lilith does not conform to the conventions of the portal type (Rhetorics 51). Where the modern portal fantasy has come to rely on “the contrast with the frame world”, Mendlesohn argues, MacDonald’s instead “makes the present world strange” (Rhetorics 51). This is understandable as MacDonald was writing before such conventions were established (Rhetorics 51). Nevertheless, to contemporary readers and scholars such as Mendlesohn, Lilith is a rather strange example of fantasy fiction. A question, therefore, to consider in the computational analysis is whether the underlying style of Lilith presents as an outlier given the strangeness of its worlds and stylistic components.

Another question considered throughout the following analysis is the relationship between Wells’s style in The Time Machine and the style of nineteenth-century utopian narratives, for The Time Machine can be interpreted as Wells’s response to the utopian narratives of the same era. Parrinder, for instance, points to evidence from the Time Traveller’s tale and the two instances where the artificiality of utopian narratives is mocked (Shadows 44). Rather than resembling utopian narratives, Parrinder suggests, “Wells's tale is a violent adventure story as well as something resembling a fieldwork report” (Parrinder, Shadows 45). Meanwhile, although The Time Machine emphasises the Time Traveller’s deductive reasoning rather than “the exposition of a superior utopian philosophy” (Parrinder, Shadows 44), Brian Aldiss describes Wells’s writing as “an analytical fiction, not the sort of thing craved by broad public taste” (138). Instead of “analytical fiction”, the late-Victorian readers, according to Aldiss, preferred the exotic narratives of escape such as those by H. Rider Haggard (138). Aldiss argues that Haggard’s adventure novels have had a strong influence on the development of science fiction.

In terms of a precursor for twentieth-century trends, however, it has been argued that it was Joseph Conrad’s narratives that “led the Victorian novel of adventure into modernist territory” (Jolly 388). Although well-established as a forerunner of twentieth-century science fiction, it has been argued that The Time Machine is not necessarily proto-modern. Ruddick has suggested that the experimental aspects of the text that may resemble the themes of modernism – including the liberation from spatial and temporal frameworks – are Wells’s attempt to “raise the reader’s awareness of a temporal frame too large to be measured … and indifferent to human concerns” (340). The position of the late-Victorian narrative in terms of its influence on both modernists and the evolution of twentieth-century genres is already an area of interest. However, one neglected aspect is the stylistic relationship between the works of the nascent “super” genres found in this period and the stylistic context of the era. The corpus used in this chapter includes work from both Conrad and Haggard, as well as examples of nineteenth-century utopian narratives, and therefore offers insight into the underlying style of the interconnected forms of utopian, adventure and early science fiction.

Included in the corpus are three utopian texts: The Coming Race (1871) by Edward Bulwer Lytton, Erewhon (1872) by Samuel Butler and News from Nowhere (1890) by William Morris. It is usually held that utopian fiction forms a sub-genre of science fiction, although some, such as Adam Roberts, suggest that “utopian fiction must be discussed as a parallel development to SF” (History vii). Thus, this stylometric analysis examines the nature of the relationship between the style of The Time Machine and those of the three utopian narratives included in the study.

As well as three utopian novels, the corpus included adventure novels The Nigger of the Narcissus (1897) by Joseph Conrad and King Solomon’s Mines (1885) by H. Rider Haggard, as well as an early detective novel, Wilkie Collins’s The Moonstone (1868). Also present are twenty-three novels that can be categorised as social realist novels, such as Anthony Trollope’s The Eustace Diamonds (1871).2 As such, the twenty-nine background texts of the corpus can be treated as a stratified perspective of style in the late-Victorian era, not too vast as to overshadow the two texts that are the focal point of the research but still large enough to offer a snapshot that includes the three related utopian texts and the dominant form, the social realist novel.

The corpus of late-Victorian novels was originally assembled by a different researcher for a separate study and was intended to provide a snapshot of the style of the late-Victorian novel.3 More than just a snapshot, the corpus was designed to ensure that each work is by a different author thus making it as representative of a late-Victorian narrative style as possible.

2 The full list is: William Black’s McLeod of Dare (1879); R.D. Blackmore’s Mary Anerley (1880); Mary Elizabeth Braddon’s Vixen (1879); Rhoda Broughton’s Nancy (1873); Marie Corelli’s Thelma (1887); Benjamin Disraeli’s Endymion (1880); George Du Maurier’s The Martian (1896); George Eliot’s Daniel Deronda (1876); George Gissing’s New Grub Street (1891); Sarah Grand’s The Beth Book (1897); Thomas Hardy’s Jude the Obscure (1895); Richard Jefferies’s Amaryllis At The Fair (1887); Rudyard Kipling’s The Light that Failed (1890); A.E.W. Mason’s The Philanderers (1897); George Meredith’s Beauchamp’s Career (1875); Margaret Oliphant’s Sir Tom (1893); Ouida’s Waters of Edera (1900); James Payn’s Bred in the Bone (1872); Charles Reade’s The Woman Hater (1876); Robert Louis Stevenson’s The Master of Ballantrae (1889); Mary Augusta Ward’s Marcella (1894); and Charlotte M. Yonge’s Lady Hester (1895). The texts are found in Appendix B.

3 This corpus (minus Lilith) was initially compiled by Prof. Hugh Craig at the Centre for Literary and Linguistic Computing, University of Newcastle, Australia for a study on the late-Victorian novel.

By reducing the effect of authorial signals, the strongest stylistic signal that is usually found in computational studies, the thirty-one texts by thirty-one authors provide a corpus suitable for exploring other stylistic signals, including genre.

The focus of the following computational study is the position of these two works of early science fiction and fantasy respectively, The Time Machine and Lilith, in relation to twenty-nine other works from the late-Victorian period. In addition to the position of these texts within this corpus, this thesis examines what is revealed through stylometric study about the underlying style in these samples of early science fiction and fantasy.

3.2 A Computational Study of Style

The first step in the computational study was to count the one hundred most common words in the corpus of thirty-one texts. According to the methodology outlined in Chapter 2, any titles or names are removed from the list. In this case the only word removed was the title, Mr. The remaining ninety-nine words make up 55% of the tokens of the corpus even though, on average, they only make up 1% of word types found in each text. Thus, the following computational study of ninety-nine words covers more than half of the word tokens in the corpus and reveals the underlying patterns in how those words are used across the thirty-one texts.
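The counting step can be illustrated with a short Python sketch. It is a hypothetical stand-in for the software actually used in the thesis: the tokenizer (lower-cased runs of ASCII letters and straight apostrophes), the function names and the two toy "texts" are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-case word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def common_word_profile(texts, n=100, excluded=()):
    """Find the n most common words in a corpus and profile each text.

    Returns the retained word list, the percentage of corpus tokens those
    words cover, and each text's percentage frequency for every word.
    """
    per_text = {title: Counter(tokenize(body)) for title, body in texts.items()}
    corpus_counts = Counter()
    for counts in per_text.values():
        corpus_counts.update(counts)

    # Take the top n words first, then drop titles and names from that list.
    words = [w for w, _ in corpus_counts.most_common(n) if w not in excluded]

    total_tokens = sum(corpus_counts.values())
    coverage = 100.0 * sum(corpus_counts[w] for w in words) / total_tokens
    freqs = {
        title: {w: 100.0 * counts[w] / sum(counts.values()) for w in words}
        for title, counts in per_text.items()
    }
    return words, coverage, freqs


# Two toy texts standing in for whole novels.
toy_corpus = {
    "Text A": "Mr Fox saw the dog",
    "Text B": "Mr Fox fed the dog",
}
words, coverage, freqs = common_word_profile(toy_corpus, n=4, excluded=("mr",))
```

Expressing each count as a percentage of its own text keeps long and short novels comparable, which is what allows the frequencies to be placed side by side in a single table.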

Many of the words counted for this study are often overlooked in the general course of reading, even close, critical reading. For instance, included in the list of ninety-nine words are many prepositions (out, in, up, down), articles (the, a, an), main verbs (be, have) and modals (must, could, would). The list also includes the pronouns they, their, you, he, her, she, his and him. A full list of the ninety-nine words is found in Appendix C.

The corpus offers a dataset rich in dimensions: the proportional frequencies of ninety-nine words for each of the thirty-one texts leads to a matrix that has 3,069 data points. When subjected to a PCA the variables are correlated, an Eigen analysis is performed and then new components are provided which account for the maximum amount of variance found in the original dataset. In this particular study, the PCA returned thirty-one uncorrelated components where the first component accounts for the largest amount of variance and each subsequent component accounts for a diminishing amount of variance. The loadings of the variables constitute the components, and each text has a score for each component. The components are plotted against each other, once for the scores and again for the loadings.
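To make the sequence of operations concrete, the sketch below standardises a small frequency table, forms the correlation matrix of the words, and extracts the first component. It is an illustration only, written in plain Python with power iteration swapped in for the full eigen-analysis that a statistics package would perform. The four-by-three table is invented, and the code assumes that no column is constant.

```python
import math


def standardise(rows):
    """Centre each column and scale it to unit standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def first_component(rows, iterations=500):
    """Return (loadings, eigenvalue, scores) for the first component."""
    z = standardise(rows)
    n, p = len(z), len(z[0])
    # Correlation matrix of the variables (the words).
    corr = [[sum(z[i][a] * z[i][b] for i in range(n)) / (n - 1)
             for b in range(p)] for a in range(p)]
    # Power iteration: repeated multiplication converges on the eigenvector
    # with the largest eigenvalue, provided the start is not orthogonal to it.
    v = [1.0 / (j + 1) for j in range(p)]
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * sum(corr[a][b] * v[b] for b in range(p))
                     for a in range(p))
    # Each text's score is its standardised profile weighted by the loadings.
    scores = [sum(z[i][j] * v[j] for j in range(p)) for i in range(n)]
    return v, eigenvalue, scores


# Hypothetical percentage frequencies: rows are texts, columns are words.
table = [
    [6.9, 3.1, 0.5],
    [6.0, 2.8, 0.8],
    [4.5, 2.2, 1.2],
    [4.0, 2.0, 1.4],
]
loadings, eigenvalue, scores = first_component(table)
```

The sign of a component is arbitrary, so which end of the axis counts as "left" or "right" depends on the software. Only the relative positions of texts and words carry meaning.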

Figure 3.1 PCA Scores of 31 Texts from 1871-1900 (PC1 vs. PC2)

Figure 3.1 shows the scores for all thirty-one texts along PC1 and PC2. The texts at the extremes of PC1 (the horizontal axis) are Conrad’s sea adventure, The Nigger of the Narcissus, and Trollope’s The Eustace Diamonds. Conrad’s novel is also the low extreme along PC2, where the highest texts are utopian narratives: Bulwer Lytton’s The Coming Race and Butler’s Erewhon. Wells’s The Time Machine is the second text to appear on the left-hand side of PC1 while MacDonald’s Lilith is in the middle of the horizontal axis. Both Lilith and The Time Machine are in the cluster dominating the middle of PC2. The fifth text on the left-hand side of PC1 is also an adventure narrative, King Solomon’s Mines by Haggard. The pattern of texts along either PC1 or PC2 is not correlated to chronological change in the corpus. A correlation test on both components and the order of publication for all thirty-one novels returns p values that are outside the threshold of a significant correlation.4 Instead, the first component can be interpreted as a distinction between adventure-based narratives on the left-hand side and more domestic or social realism on the right-hand side. The second component, on the other hand, is distinguishing between utopian fiction at the top and adventure fiction down the bottom. The style of The Time Machine emerges as similar to those of the adventure narratives, while the style of Lilith does not stand out on the extremes of either PC1 or PC2 but is positioned in the midst of the social realist dominated centre.
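A check of this kind can be sketched as follows. The text does not specify which correlation test was used, so the Python example below is an assumption: it computes Pearson's r between component scores and publication order and estimates a two-sided p-value by permutation. The six scores are invented and the function names are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(xs, ys, trials=10000, seed=0):
    """Estimate a two-sided p-value by shuffling ys and recomputing r."""
    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / trials


# Hypothetical data: publication order of six texts and their PC1 scores.
publication_order = [1, 2, 3, 4, 5, 6]
pc1_scores = [0.4, -1.2, 0.9, -0.3, 1.1, -0.9]

r = pearson_r(publication_order, pc1_scores)
p = permutation_p_value(publication_order, pc1_scores, trials=2000)
significant = p < 0.05  # threshold stated in the accompanying footnote
```

A p-value above the threshold means the ordering of texts along the component cannot be distinguished from a chance arrangement with respect to date, which is the basis for setting chronology aside as an explanation.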

Figure 3.2 PCA Loadings of 99 Words in 31 Texts from 1871-1900 (PC1 vs. PC2)

The word loadings produced by the PCA provide further information on the features which characterise the extremes of both PC1 and PC2. Plotted in Figure 3.2, the word loadings reveal that the extremes of PC1 are the on the left-hand side and do on the right-hand side. In addition to the definite article, prepositions of, in, on, and from are all found on the left-hand side along with a, by and some. These are words denoting movement, most likely pertaining to action sequences. The prominence of these words also indicates a style of narration with a high usage of articles and possibly similes, as like is also located on the left-hand side of PC1.

4 The threshold is p < 0.05. The correlation coefficient for PC1 and the order of text by publication is r = -0.23 which returned p = 0.1. The correlation coefficient for PC2 and the order of the texts by publication is r = 0.26 and the p-value returned was p = 0.08.

In contrast, the texts located on the right-hand extreme have relatively higher uses of the modal verbs would and should; the conjunctions but and if; and the main verbs be and have. In addition, the texts at the right-hand end of PC1 appear to employ relatively more negation, with the presence of not, and focus on quantities, expressed through words such as much. Although most of the prepositions are located at the opposite end of PC1, there are words that denote movement on the right-hand extreme as well, such as come and do. The style indicated by the words at the high end of PC1 is one that focuses on quantification, the expression of possibilities, and the relationships between objects and ideas.

Significantly, neither The Time Machine nor Lilith is found at the extremes of either end of PC1 or as an outlier along PC2. These results indicate that the styles of Wells’s science fiction and MacDonald’s fantasy are quite consistent with those of contemporary late-Victorian novels.

Thus, while science fiction “could begin to exist as a literary form only when a different future became conceivable by human beings” (Scholes and Rabkin 7), a different style was not necessarily a requirement for the nascent literary form to develop. While Wells is celebrated “as the creator of modern science fiction” (Parrinder, “Introduction” 1), the PCA results reveal that The Time Machine is not a stylistic outlier among other late-Victorian writers; Wells’s text is, however, found as the second most extreme text on the low end of PC1, suggesting some distinctive features. MacDonald’s Lilith, on the other hand, is in the middle of the graph, indicating a style that is average in its use of these ninety-nine words along these two measures and when compared to the extremes of social or domestic realism and adventure.

In order to explore the nuances of the stylistic distinctions and similarities between The Time Machine, Lilith and the rest of the corpus, we must first examine the styles of the extremes of PC1. Context is key to interpreting the implication of these results, and particular attention is required to highlight the words along PC1 that are usually unnoticed by the reader: the ubiquitous, the small and the unmemorable words such as the and do. The discussion consists of four parts. The first part focuses on the style of the texts located at the extremes of PC1, which are unpacked with reference to the words found at the high end (the right-hand side) and the words located at the low end (the left-hand side); the second section examines the style of The Time Machine in relation to these extremes; the third section does the same for Lilith; the final part of the discussion examines the second component (PC2) and explores the distinction between the style of utopian texts and the rest of the corpus.

3.2.1 The Styles at the Extremes of PC1

Located at the high extreme of PC1 is The Eustace Diamonds. The narration of this text, set in the heart of London society, makes frequent use of not. Indeed, negation is a device Trollope employs to introduce his characters. The characterisation of individuals is explored according to what they are “not in the habit” of doing. Consider these descriptions of two characters: “a more good-natured old soul than the dean's wife did not exist”, “she [Lizzie] was not in the habit of concealing her hatred for Lady Linlithgow” (ch. 1). Both characters, the dean’s wife and the main protagonist Lizzie, are introduced with the aid of negation. For the dean’s wife negation implies she stands alone among old souls, and for Lizzie what is revealed is an unbecoming, negative character trait that is demonstrated without the narrator being required to define it directly. The conjunction but is also used frequently in Trollope’s description of characters: “She was small, but taller than she looked to be”; “Her chin was perfect in its round … But it lacked a dimple, and therefore lacked feminine tenderness” (ch. 2). The conjunction but connects two opposing descriptions, which in some instances of Trollope’s descriptions serves to contrast Lizzie’s external beauty with unfavourable character traits such as the description of her teeth, “without flaw or blemish … but perhaps they were shown too often”.

When we contrast such style with the extreme on the other end of PC1, Conrad’s Nigger of the Narcissus, we find that the conjunction but is also used: “aft, but halfway”, “no longer in splashing clusters of three or four together, but dropped alongside singly”, “with glances critical but friendly” (ch. 1). To place these fragments in their context reveals the presence of the word found at the extreme left-hand side of PC1, the: “The main deck was dark aft, but halfway from forward, through the open doors of the forecastle, two streaks of brilliant light cut the shadow of the quiet night that lay upon the ship” (ch. 1). Accordingly, the relatively higher use of the in adventure narratives indicates a style with more deictic references as the is employed to refer to objects that are known to the listener (Craig, “A and an in English Plays, 1580-1639” 275). In contrast, the relatively higher use of but on the right-hand side of PC1 indicates that the texts located at the same end of the component employ contrasting clauses with relatively more frequency.

Wells’s The Time Machine actually has the highest rate of the in the entire corpus – 6.9% of the whole text is made up of the while the next closest, Bulwer Lytton’s The Coming Race, is 6.6%. Other articles, a and an, are also among the top ten words located on the left-hand side of the graph, and Conrad’s sea novel has the highest rate of a in the corpus at 3% of the text.
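The percentages quoted above are proportional frequencies: a word's count divided by the total number of word tokens in the text. As a rough illustration only, the sketch below shows one way such a rate could be computed. The tokenisation rule (lower-cased runs of letters and apostrophes) and the function name are assumptions made for this example and are not the counting procedure used in the study; in particular, the sketch does not separate homographs.

```python
import re
from collections import Counter


def proportional_frequencies(text, words):
    """Return each target word's share of all tokens, as a percentage."""
    # Assumption: a token is any lower-cased run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    return {w: 100.0 * counts[w] / total for w in words}


# A fragment of the Conrad sentence quoted earlier, used as toy input.
sample = (
    "The main deck was dark aft, but halfway from forward, "
    "through the open doors of the forecastle"
)
freqs = proportional_frequencies(sample, ["the", "but", "a"])
for word, pct in freqs.items():
    print(f"{word}: {pct:.2f}% of tokens")
```

On a fragment this short the rates are far higher than in a whole novel; the point is only the shape of the calculation.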

Together these articles are part of the “system of determiners in English, which allows precise and rich structure of references” (Craig, “A and an in English Plays, 1580-1639” 275). Although a and an are often interpreted as the indefinite article in implicit opposition to the definite article, the, Hugh Craig has outlined three other characterisations of the dialectic between the definite and the indefinite articles: the familiar or unfamiliar, the identified or unidentified, and exclusiveness or inclusiveness (Craig, “A and an in English Plays, 1580-1639” 275–277). In short, a and an are used when an object is unfamiliar; when a non-identifiable element is being introduced; and, to express inclusivity, in other words, a or an are used to imply that there are other objects that have been excluded from the description. Thus, when Wells’s Time Traveller speaks of “a solitary white, ape-like creature running rather quickly up the hill”, it is assumed from the context preceding this description that the listener understands the specific hill is referred to and that the “ape-like creature” is unfamiliar (ch. 5). In addition to the articles, the homograph, some, also scores low on the left-hand side of PC1 and is employed as a determiner in The Time Machine: “some greyish animal” (ch. 5). Some occurs relatively more frequently in The Time Machine than any other text in the corpus and together with the, a and an, all weighted at the low end of PC1, indicates that the texts also weighted at that end, the adventure narratives, have styles that require relatively more identification and reference to objects, both definite and indefinite objects, than the texts at the other end of PC1.

In contrast to a dominance of articles, the words weighted at the social realist extreme of PC1 indicate that the narration of these texts refers to objects through words such as that, would, could and did:

There were pieces in verse that she could read,—things not wondrously good in

themselves,—so that she would ravish you; and she would so look at you as she did it

that you would hardly dare either to avert your eyes or to return her gaze. (ch. 2)

Both the modal verbs and did and do refer to aspects of Lizzie’s past: “she could read”, “as she did it”. Similarly, in another instance, another character is described as someone “who could easily do anything to which she might put her hand” (ch. 3).

Rather than providing explanations through modality, Conrad’s style in his adventure-based sea novel, as well as articles, employs more prepositions, such as of and from. Indeed, there are several prepositions that are frequently found in Conrad’s narrative, highlighted below:

… at any moment the masts were likely to be jumped out or blown overboard … The

watch then on duty, led by Mr. Creighton, began to struggle up the rigging … a sudden

gust, pin all up the shrouds the whole crawling line in attitudes of crucifixion. The

other watch plunged down on the main deck to haul up the sail. Men's heads bobbed

up as the water flung them irresistibly from side to side. Mr. Baker grunted

encouragingly in our midst, spluttering and blowing amongst the tangled ropes like an

energetic porpoise. (ch. 3)

The prepositions describe the activities of sailors both in physical space, “in our midst”, and through metaphor, “in attitudes of crucifixion”. Thus, adventure narratives such as Conrad’s may be fixated on the movement of protagonists through the temporal and physical space of a narrative.

Conrad regularly employs like to create comparisons: “silhouettes of moving men appeared for a moment, very black, without relief, like figures cut out of sheet tin” (ch. 1). In contrast, Trollope rarely employs like for comparisons but regularly uses it to explain the preferences of different characters: “she did not like them”, “[s]he did like music”, “[s]he did like reading” (Eustace, ch. 2). This is not to say that similes do not occur in The Eustace Diamonds: Lucy’s presence is at one point described as being “like sunshine” (ch. 3). However, a realistic setting of the domestic and familiar, at least familiar to the implied late-Victorian reader, does not require a constant comparison nor the evocation of powerful imagery in order to assist the implied reader in understanding the events and the space in which the events are narrated. An exotic setting is far more likely, therefore, to lean on the narrative devices of metaphor and simile, particularly direct analogies provided by the device of the simile. Conrad even employs imagery of animals found in and around the sea which emphasises his exotic, nautical setting: “like an energetic porpoise”, “his shoulders were peaked and drooped like the broken wings of a bird” (ch. 1).

Summarising the styles indicated by PC1 has required looking at the extremes of the component. The Eustace Diamonds represents the right-hand extreme and the style associated with domestic novels, while The Nigger of the Narcissus stands in for the texts at the left-hand end of PC1 and represents a style shared by other adventure-based narratives including The Time Machine and King Solomon’s Mines. Having established key aspects accounting for the stylistic split along PC1, articles and prepositions on the left-hand side and a narration style with a high rate of negation and modals on the right-hand side, we can now consider why Lilith blends in with the majority of the corpus and further contextualise the style of The Time Machine according to the style indicated by the low end of PC1.

3.2.2 The Style of Lilith

One of the surprising results in the PCA study of thirty-one late-Victorian novels is that Lilith is not distinctive on either measure plotted in Figure 3.1. MacDonald’s style in Lilith is known to be distinct from contemporaneous texts due to features such as archaic diction in the direct speech of some characters: “Thou art beautiful because God created thee” (ch. 29). However, none of these old word forms were counted in the stylometric study. Instead the focus of the study was on the words forming the skeleton of the language in the thirty-one texts. This underlying structure of the language as a measure of style employs ubiquitous elements, very common words, and is thus a measure that reveals the similarities and differences in deep structures of style. According to this measure, Lilith is not distinguished by either the first or second components but is in the middle, along with many other novels in the corpus. Although this is a negative finding, in terms of statistics, when explored in closer detail, we find that the style does share common elements of the late-Victorian novels, a mixture of the styles found at the extremes of PC1: the style of the exotic adventure narrative and the style of the domestic novel of late-Victorian society.

Words weighted on the low end of PC1, with the adventure narratives, appear frequently in Lilith. As the narrator, Vane, moves through the library that is in the process of becoming an estranged landscape, strange to the protagonist and thus to the implied reader, the style employs articles as well as prepositions found low on PC1. In the quote below, the words in bold are located on the left of PC1 and underlined are those that appear on the right-hand side of Figure 3.2:

The same moment I saw the back of a slender old man, in a long, dark coat, shiny as

from much wear, in the act of disappearing through the masked door into the closet

beyond. I darted across the room, found the door shut, pulled it open, looked into the

closet, which had no other issue, and, seeing nobody, concluded, not without

uneasiness, that I had had a recurrence of my former illusion, and sat down again to

my reading. (ch. 1)

In this example, the prepositions relate to aspects of the narration that are generating suspense, narrating the movement of Vane across the room: “I darted across the room”, “and sat down again”. The articles add to the strangeness by hinting at the definite moment, that there was something seen, “the back”, even though it is an unfamiliar and, for the moment, an unidentifiable man, “a slender old man, in a long, dark coat”. Furthermore, specific components of the library are definite, “the room”, “the door”, “the closet” but the sense that the illusion is more than an illusion is increased by the article a: “I had had a recurrence”. The article here refers to the non-exclusive occurrence of the vision and the uncertainty of whether it can even be identified as an illusion or as something else. Negation, through one of the words weighted high on PC1, not, adds to the unravelling of the perfectly normal, non-fantastic explanation for the vision of a man: “not without uneasiness”. Accounting for volume, much, also weighted high on PC2, offers description, “shiny as from much wear”. Furthermore, there is in this last example, at least one distinctive element of MacDonald’s style: the combination “as from” which occurs several times in Lilith but is not a form used at all by Trollope or Wells, both of whom prefer “as though” – a combination that does not occur in Lilith at all. However, although this is another idiosyncratic aspect that distinguishes MacDonald’s style from his peers, it is not a deep pattern of the sort uncovered by the PCA on ninety-nine common words. Rather, the elements weighted both high and low on PC1 are found in Lilith in a manner that, according to the multivariate statistical analysis, does not distinguish this early fantasy work. The variance

uncovered by PC1 is between the adventure and the domestic narratives and the fantasy does not sit high or low on such a component.

However, some chapters in Lilith may resemble the style of the adventure narratives, while others may resemble the high extreme of PC1. To investigate this, the book was approached in chapters. This approach has the additional benefit of separating the narrative of the frame world, the strange library, from that of the fantasy world reached through the mirror portal. There are no obvious stylistic distinctions in how the two worlds are narrated: even though the entry to the fantastic world is preceded by mounting suspense through the odd occurrences within the old house, there does not appear to be any dramatic stylistic shift once the narrator emerges into the unknown. This hypothesis can be tested by determining whether there is a split between the frame chapters, Chapters 1 and 2, and the rest of the narrative.

Exploring the variation of styles within Lilith required counting the proportional frequencies for all ninety-nine common words in all forty-seven chapters of Lilith and then calculating PC scores for all forty-seven chapters. In order to graph the chapters of Lilith according to the same measure already found, new PCA scores were manually calculated for each of the segments. That is, rather than finding a new component – which would be a different measure to the one found in the PCA above – new scores were calculated for the same two components that were originally graphed in Figures 3.1 and 3.2. The calculations followed this formula:

$$\sum_{i=1}^{n} f_i \, l_i$$

where $n$ is the number of words used in the original PCA, $f_i$ is the standardised proportional frequency of the $i$-th word in the segment and $l_i$ is the original PC loading for the corresponding word. This formula was repeated for each of the segments until new scores for all forty-seven segments were calculated. This process was completed for both PC1 and PC2. For perspective, chapters of a text from the far right of PC1, Trollope’s The Eustace Diamonds, were also included, along with a text to represent the left-hand extreme, H.G. Wells’s The Time Machine. Together all 140 segments are plotted in Figure 3.3.
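The projection described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used for Figure 3.3: the text does not specify how the frequencies were standardised, so the sketch assumes a z-score against the mean and sample standard deviation of the original corpus, and every number below (corpus values, segment values, loadings) is invented for the example.

```python
from statistics import mean, stdev


def standardise(value, corpus_values):
    """z-score one segment frequency against the original corpus values.

    Assumption: standardisation uses the corpus mean and sample standard
    deviation for that word; the source does not state the exact method.
    """
    return (value - mean(corpus_values)) / stdev(corpus_values)


def pc_score(std_freqs, loadings):
    """Sum of f_i * l_i over the word list, for one segment and one component."""
    if len(std_freqs) != len(loadings):
        raise ValueError("one loading is needed per word")
    return sum(f * l for f, l in zip(std_freqs, loadings))


# Hypothetical two-word illustration (the study used ninety-nine words).
corpus = {"the": [5.0, 6.0, 7.0], "but": [0.4, 0.6, 0.8]}  # % per text, invented
segment = {"the": 6.5, "but": 0.4}                          # % in one chapter, invented
pc1_loadings = {"the": -0.30, "but": 0.25}                  # invented loadings

words = ["the", "but"]
f = [standardise(segment[w], corpus[w]) for w in words]
l = [pc1_loadings[w] for w in words]
score = pc_score(f, l)
print(f"PC1 score for the segment: {score:.3f}")
```

Because the loadings are reused rather than re-estimated, a segment scored this way lands on the same axes as the original texts, which is what allows chapters to be compared with whole novels.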

Figure 3.3 Manual PCA Scores for Lilith Chapters with The Time Machine and The Eustace Diamonds (PC1 vs. PC2)

In answer to the second hypothesis proposed above, it is immediately evident from Figure 3.3 that there is no distinction between the chapters set in the fantasy world and the chapters that precede Vane’s entry to the spirit world, Chapters 1 and 2. Along both PC1 and PC2 there is no stylistic distinction found in the narration of the frame world for the portal-fantasy. With this question answered, we can move on to the more complex hypothesis: the question of whether some chapters of Lilith are closer in style to one of the extremes along PC1.

Firstly, it is evident that the chapters of Lilith are more at home with the chapters of The Eustace Diamonds than with the chapters of The Time Machine. The chapter furthest to the left is numbered 46 and is titled “The City”. It describes Vane, Lona and the Little Ones entering a strange city. At the end of the chapter, Vane finds himself returned suddenly to his library. This chapter has one of the highest proportions of the and one of the lowest proportions of but in the entire novel. Conversely, the chapter furthest to the right, Chapter 13, has one of the highest proportions of but, a relatively low proportion of the and the second lowest proportion of a.

Interestingly, the chapter with the highest proportion of but, Chapter 30, is also on the right-hand side. Titled “Adam Explains”, this chapter is full of dialogue that employs the conjunction but to connect contrasts: “… for ages we knew nothing of her fate. But she was divinely fostered”, “… Eve longed after the child, and would have been to her as a mother to her first-born, but we were then unfit to train her …”. Highlighted in the second quote are other words weighted high on PC1, including the modal verb would. This modal is employed by Adam to relate Eve’s past: “she would have died”, “she would have been raging”. Also present at a relatively higher rate in Chapter 30 are second person pronouns, you and your. These are also markers of a chapter heavy with dialogue: “You have saved the life of her and their enemy; therefore your life belongs to her and them”. Therefore, the chapters of Lilith that scored low on PC1 resemble the adventure narratives but also have a relatively low proportion of dialogue compared to the chapters that scored higher on PC1, which consist mostly of dialogue.

Indeed, the chapters found at the left-hand side of Figure 3.3 are mostly narrating Vane’s travels across the strange land and thus feature the articles and prepositions found at the same end of Figure 3.2: “In the middle of the afternoon I came out of the wood” (ch. 11); “Now appeared a woman, with glorious eyes looking out of a skull; now an armed figure on a skeleton horse” (ch. 26); “At length we drew near the cloud, which hung down the steps like the borders of a garment, passed through the fringe, and entered the deep folds” (ch. 46). While many of the chapters found on the right-hand side are made up mostly of direct speech, including chapters numbered 5, 47, 9, 28, 21 and 4, the chapters on the left-hand side are devoid of direct speech, including chapters 2, 45 and 26. Chapters 11 and 35 each include only two lines of speech. This split between the dialogue-heavy chapters and the descriptive chapters indicates that the words found with the adventure narratives are readily present in Lilith as Vane traverses the countryside of the strange world, while the words associated with domestic realism are employed in the conversations between characters, including negation, modality and verbs: “‘You need not be frightened,’ I said” (ch. 21); “Why did you not tell me? That I should have been so near him, and not know!” (ch. 9). Therefore, although Lilith is not distinguished by PC1 or PC2 in Figure 3.1, different styles are employed according to the purpose of the chapter: a style with more conjunctions, negation, personal pronouns and modal verbs is found in chapters where there is more direct speech and explanations offered to Vane from figures such as Adam. On the other hand, chapters where Vane is moving across the landscape of the fantasy world, hardly conversing with any characters but narrating movement and description, are chapters marked by the features weighted with the adventure narratives: prepositions and articles. Primarily, however, the PCA finding is a negative one as it places Lilith in the middle, undistinguished by the underlying patterns of word use in a late-Victorian stylistic backdrop.

3.2.3 The Style of The Time Machine

In contrast to MacDonald’s style, the style of H.G. Wells is distinguished by PC1 as it is ranked second lowest on the horizontal axis of Figure 3.1. The Time Machine is therefore stylistically distinct from Trollope’s The Eustace Diamonds and similar in style to Conrad’s sea novel.

Looking at some of the key words discussed in Section 3.2.1, we find that the conjunction but is also employed by Wells in the course of the narrative. However, like Conrad’s style in The Nigger of the Narcissus, Wells’s style includes words weighted high on PC1 but has a relatively higher use of prepositions and other words weighted low on PC1. In the following example, the words found on the left-hand side of PC1 are bolded and the contrasting words are underlined:

They merged at last into a kind of hysterical exhilaration. I remarked indeed a clumsy

swaying of the machine, for which I was unable to account. But my mind was too

confused to attend to it, so with a kind of madness growing upon me, I flung myself

into futurity. (ch. 3)

This style of The Time Machine, marked by third person pronouns, articles and prepositions is what causes Wells’s text to sit low on PC1 along with other adventure based narratives.

However, different articles and prepositions are used in different contexts within The Time Machine, depending on the narrative purpose. A closer look at the context of like, the, a, an and of reveals that although these words are all weighted at the low end of PC1, they are used at different rates depending on the aim of a particular section of Wells’s narrative.

The preposition like is used in The Time Machine, as in The Nigger of the Narcissus, to invoke similes for aspects that are unknown and unfamiliar to the reader. The narrator relies on this device to contrast between the known and unknown. This includes Wells’s descriptions of the futuristic races, such as the Eloi who are described as being “like children” (ch. 4) and the Morlocks who are described as being “like a human spider” (ch. 5). The personal reflections of the narrator are also described, at times, through similes, “I felt like a schoolmaster amidst children” (ch. 4).

Although similes are not the only device Wells employs to describe the unfamiliar, they are effective for communicating concepts that are wholly unknown to the implied reader. In one instance, the Time Traveller employs the closest words appropriate rather than a simile to describe time travel: “So long as I travelled at a high velocity through time, this scarcely mattered…” (ch. 3). Even though it is an odd expression to travel at “high velocity” towards a point in time rather than a place, this is not as strange to the reader as the rest of the sentence: “I was, so to speak, attenuated—was slipping like a vapour through the interstices of intervening substances!” (ch. 3). Both “velocity” and “slipping like a vapour” relate aspects of the impossible event of time travel; however there is a stylistic distinction between the Time Traveller’s descriptions of his fantastic invention and descriptions of his unfamiliar experiences.

Indeed, the Time Traveller does not use similes at all in the first chapter, where he expounds most of the technical aspects of the machine as well as the notion of time as the fourth dimension. In the second chapter, the Time Traveller uses like once to draw a comparison between aspects of his story and a lie, “[m]ost of it will sound like lying”. It is not until the third chapter that the Time Traveller employs a number of similes related to the impossible event of time travel: “shot like a rocket”, “night came like the turning out of a lamp”, “a feeling exactly like that one has upon a switchback” (ch. 3). There is a shift that occurs in the style of the Time Traveller’s descriptions: experience is narrated with more abstraction than the mechanical and technical details.

It is not only similes that aid with abstraction, the indefinite articles can also be employed to relay ideas with abstraction and generalisation (Craig, “A and an” 275). In contrast, the can be used to narrate concrete situations. The difference between how the definite and indefinite articles are used in The Time Machine is illustrated in a 1,110-word segment that occurs at the end of Chapter 4. This segment of text is where the Time Traveller expounds his first theory of the futuristic society that is based on his initial interpretation of observations made in the London of the far future. In this section of the narrative, the definite articles are used more than twice as much as both a and an. Indeed, the proportional frequencies of the and the preposition, of, are higher in this section than the text of The Time Machine as a whole as they are used by the Time Traveller to explain his theories: “the outcome of”, “the work of”, “the conditions of”, “the agriculture of”, “the subjugation of”, “the ideal of”, “the reaction of”, “the love of”, “the outcome of”, “the flourish of”, “the fate of”, “the problem of”, “the whole secret of”, “the increase of”. Thus, the Time Traveller’s theorising contains more certainty and definiteness than abstraction or generalisation. Rather than “an increase” it is firmly put, “the increase”. The certainty of the Time Traveller’s deductions is felt in the extensive use of the definite article and reinforced by evidence that is linked using the preposition of, despite his admission at the end of the chapter that his initial explanation was “plausible enough—as most wrong theories are!” This admission that these theories were eventually proved wrong does not contradict the initial certainty felt by the Time Traveller. As John S. 
Partington has argued, drawing on examples from Chapter 4 to demonstrate, the Time Traveller’s initial analysis of the future includes direct extrapolations from his own time that constitute a cautionary warning to Wells’s readers (Partington 58). In all the examples Partington uses, however, the indefinite article is found but not the definite article. On closer inspection, in most of the instances where the Time Traveller directly extrapolates theories of the future society from his own society the indefinite article is used to speak of “a sentiment arising”, of “discords in a refined and pleasant life” and how “[e]ven in our own time certain tendencies and desires, once necessary to survival, are a constant source of failure”. There are other instances where a is employed to refer to more generalised outcomes of the Time Traveller’s theories: “it strengthened my belief in a perfect conquest of Nature”. Nevertheless, a appears more frequently when Wells is turning the analysis of the futuristic world to a social critique of his own society. Therefore, the indefinite article is used for direct extrapolations to provide contrasts and the is used to relay concrete deductions supported by evidence, linked with the preposition of. In terms of artistic strategies, the unfamiliar experiences are relayed using comparisons and abstractions while the technical specifications of Wells’s invention of time travel and the social theorising are narrated with a heavier use of the definite article.

However, comparison is not only employed to domesticate the unfamiliar but can also be used to estrange the familiar. For instance, the other narrator of Wells’s text, who narrates Chapters 1, 2, part of 12 and the epilogue, estranges the Time Traveller from the everyday man. Part of the estrangement occurs through denying him a name, but it also occurs through comparison: “[h]e sat back in his chair at first, and spoke like a weary man” (ch. 2). By comparing him to something familiar with a device that generally implies a comparison to something different, the narrator implies that the Time Traveller is not an ordinary man. He is, however, a man capable of behaving in a familiar way. In the same chapter, Wells employs a rather colourful simile in the course of describing the Time Traveller: “… they were somehow aware that trusting their reputations for judgment with him was like furnishing a nursery with egg-shell china …” (ch. 2). Darko Suvin has noted that when the Time Traveller is in the year 802,701, in what Suvin terms the “Eloi-Morlock episode”, he “can not simply be a representative Man” but is “a complex Victorian gentleman-inventor who displays various fin-de-siècle attitudes when faced with shifting situations and interpretations of the Eloi and of the Eloi-Morlocks relationship” (“A Grammar” 108). However, when faced with the final sunset – the eclipse, encounters with large crabs and monsters from the sea – the Time Traveller becomes “a generic representative of Homo sapiens, an Everyman defined in terms of biological rather than theological classification, as a species-creature and not a temporarily embodied soul” (107-8).

Without the estranging of the Time Traveller that occurs through the framing narrative, the shifting from a specific late-Victorian framework for interpretation to a generic representative of humankind would not have been accommodated as easily or as smoothly. The distance already attained through the defamiliarisation of the Time Traveller allows the Time Traveller to be both a representative of a set of specific values as well as a more generalised set of features, being the last of a dominant race that “disappeared in the depth of geological time” (Suvin, “A Grammar” 108).

Returning to the question of distinctive styles within The Time Machine, there is evidence that Wells employs the words found at the extreme right-hand side of PC1 at a higher frequency in the chapters consisting of the frame narrative. Consider the quote below, where the words that are found at the right-hand side of Figure 3.2 are underlined and the words that appear on the left-hand side are bolded:

Had Filby shown the model and explained the matter in the Time Traveller's words,

we should have shown him far less scepticism. For we should have perceived his

motives; a pork butcher could understand Filby. But the Time Traveller had more than

a touch of whim among his elements, and we distrusted him. Things that would have

made the frame of a less clever man seemed tricks in his hands. (ch. 2)

This style of narration features words found on the far-right of PC1 including verbs that hint at the outcome of the tale, “should have”, and words speaking of the Time Traveller in the past tense, “had more than”. This section also employs words found on the left-hand side including the conjunction and, prepositions and articles. Nevertheless, the frame narrative appears to have a higher density of words associated with the narratives found at the right-hand side of PC1. This suggests that the framing chapters are stylistically distinct from the Time Traveller’s tale of the future.

In order to test whether there is indeed a distinct style for the framing narrative, the ninety-nine words were counted again within the thirteen discrete segments of The Time Machine – the segments include twelve chapters and an epilogue. Of these segments, Chapters 1, 2 and the epilogue constitute the frame narrative along with 72% of Chapter 12. The dataset of proportional frequencies was then subjected to a distribution test in R using squared Euclidean distance, which John Burrows found to “yield the most accurate results” as it “avoids any undue smoothing of data whose inherent roughness reflects the complexities of the language itself” (“Textual Analysis” 326). The results are plotted in a dendrogram in Figure 3.4.
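The distance measure named above is simple to state: for two segments, take the difference in frequency for each word, square it, and sum over all words. The sketch below illustrates that measure only. It is written in Python for illustration although the study itself used R, the three-word profiles are invented, and the clustering step that turns such distances into a dendrogram is not shown because the text does not specify how the clusters were joined.

```python
from itertools import combinations


def squared_euclidean(u, v):
    """Sum of squared differences between two equal-length frequency profiles."""
    if len(u) != len(v):
        raise ValueError("profiles must cover the same word list")
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pairwise_distances(profiles):
    """Map each unordered pair of segment labels to its squared distance."""
    return {
        (a, b): squared_euclidean(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


# Invented three-word profiles (% frequencies); real profiles would have
# ninety-nine entries per segment.
profiles = {
    "ch1": [5.0, 1.0, 0.5],
    "ch2": [5.2, 1.1, 0.4],
    "ch5": [7.0, 0.4, 0.1],
}
distances = pairwise_distances(profiles)
closest = min(distances, key=distances.get)
print("closest pair:", closest)
```

Squaring the differences gives extra weight to the words on which two segments differ most, which is consistent with the quoted rationale of not smoothing over the roughness of the data.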

Figure 3.4 Cluster Dendrogram of The Time Machine using Squared Euclidean Distance Measure

Figure 3.4 demonstrates a separation between the chapters constituting the frame narrative and those that are the Time Traveller’s tale. The segment numbered (13) is the epilogue, which is grouped next to Chapters 1 and 2 and followed by Chapter 12. This result suggests that there is indeed a stylistic difference between the four chapters constituting the frame narrative and the Time Traveller’s tale, and indicates that it might be fruitful to investigate where these chapters sit among the rest of the corpus. Such an investigation allows us to ask which of the thirteen segments of The Time Machine would appear closest to the higher end of PC1 and thus resemble the style of the social realist novels at that end and which are the most distinct from that style.

In order to explore this, The Time Machine was graphed by chapter using the PC scores manually calculated for Figure 3.3 in Section 3.2.2. As well as the thirteen chapters of The Time Machine, the eighty chapters of Trollope’s The Eustace Diamonds were also included. Since Trollope’s text is furthest to the right-hand side along PC1 in Figure 3.1 it offers perspective, allowing us to see how far across the component any segment of The Time Machine really appears through the contrast with a text that is furthest to the right. The results appear in Figure 3.5.

Figure 3.5 Manual PCA Scores for The Time Machine Chapters with The Eustace Diamonds (PC1 vs. PC2)

According to Figure 3.5, the first chapter in The Time Machine is closest to the right-hand side of the graph, though still further left than any chapter of The Eustace Diamonds. Thus, PC1 remains the major distinguishing measure between the styles of the two narratives. This test also confirms that the style in Chapter 1 is closest to the style found to distinguish the texts at the right-hand extreme of PC1. Chapter 1 does, therefore, have a relatively higher usage of words such as that, but, not, have and the modal verb, would. However, PC1 does not distinguish between the elements of the frame narrative. Rather PC2 is found to place the Epilogue, along with Chapter 4, at the top which is distinguished from the rest of the frame narrative, Chapters 2, 12 and 1 that are the three lowest chapters on PC2. The position of Chapter 4 high on PC2, confirms that, as discussed above, there is a relatively higher usage of the preposition of in this chapter and also reveals that there is a relatively higher usage of which in this chapter, as which and of are the highest scoring words on PC2 according to Figure 3.2. As discussed above, the

Time Traveller’s first theory of the futuristic society is contained within Chapter 4 and accounts for 29% of that chapter. In this segment, which is indeed employed in the context of extrapolating from the Time Traveller’s own reality: “an odd consequence of the social effort in which we are at present engaged”, “across which my machine had leaped”, “that commerce which constitutes the body of our world, was gone”, “conditions under which the”, “conditions under which”, “that triumph which began the last great peace”. In contrast, the Epilogue only employs which once, which is not necessarily surprising given its length: “Or did he go forward, into one of the nearer ages, in which men are still men, but with the riddle of our own time answered and its wearisome problems solved? Into the manhood of the race …”. In this example are other elements shared with Chapter 4 such as of and the. Also in this quote is or, another high scoring word on PC2. Or introduces a speculative alternative and is used at a relatively higher frequency in the Epilogue than any of the previous twelve chapters of The Time

Machine as the narrator speculates where the Time Traveller travelled: “or among the grotesque saurians … or beside the lonely saline lakes of the Triassic Age”. Curiously, the Epilogue also employs but at a relatively higher rate than any other chapter in Wells’s text. But is weighted high on PC1 and in the middle of PC2 but is nevertheless, an important conjunction in the

Epilogue that is full of contrasting statements and speculations: “One cannot choose but wonder.”, “But to me the future is still black and blank …”. The Epilogue presents the frame

111 narrator’s speculations and yet resembles the style of the concrete assertions of the Time

Traveller’s theories in Chapter 4. The more speculative between the two, the Epilogue, is marked by the relatively higher use of but and or. What unites the two sections then, isn’t the conjunctions but rather the use of the definite article, the use of which and the words that are not employed as frequently including prepositions denoting movement, such as up, and out, the adverb then and the pronouns he, him and you. These words all score low on PC2.

The two chapters positioned close together at the lowest extreme of PC2 represent the bookends to the Time Traveller’s tale. Unlike the scientific discussion of the first chapter and the speculations of the Epilogue, Chapters 2 and 12 have several features that distinguish them from the rest of the frame narrative, as well as the Time Traveller’s tale. These include the expression of disbelief voiced by the dinner companions, the change in the Time Traveller’s direct speech, which is much shorter than in the first chapter, and the additional description offered by the narrator of the Time Traveller’s appearance. These are just some of the thematic features that separate Chapters 2 and 12 from the rest of The Time Machine. According to the deeper stylistic elements, however, the words scoring low on PC2 indicate a style dominated by prepositions. The use of out, the word lowest on PC2, accounts for the domestic aspects of the setting: “held out his glass for more”, “as he went out”, “He reached out his hand for a cigar” (ch. 2). Similarly, the prepositions down and up are used in narrating everyday actions: “went on down the corridor … I got up and went down the passage” (ch. 12). Also weighted low on PC2 are the personal pronouns he and him, and the adverb, then. These are key words in the narration provided by the frame narrator: “Then he came”, “Then he spoke”. The second person pronoun you occurs mostly in direct speech: “'Do you really travel through time?’” (ch. 12). However, the narrator occasionally addresses the implied reader: “You cannot know how his expression followed the turns of his story!” (ch. 12). In a similar manner, the Time Traveller addresses his listeners: “'I know,' he said, after a pause, 'that all this will be absolutely incredible to you” (ch. 12). This example is clearly narrated by the frame narrator but there are moments throughout the Time Traveller’s soliloquy where he addresses his audience directly: “I have already told you of …” (ch. 11), “Can you imagine …” (ch. 5). Overall, the bookend chapters to the Time Traveller’s tale are stylistically distinct from the verbatim report of his adventures in the future on account of the usage of you, your, then, he, down, out and up. The chapter closest to the style distinguished by the right-hand side is the first as it contains relatively more modal verbs and the conjunction but as the external narrator frames the Time Traveller’s theories, and the reception of said theories, all within a domestic setting.

Nevertheless, although Chapter 1 in The Time Machine is closest in style to the social narrative of The Eustace Diamonds, no part of Wells’s narrative actually overlaps with any part of Trollope’s. Therefore, the style of The Time Machine can be said to resemble adventure fiction rather than the more realistic social narratives set in domestic rather than exotic locations. Ranked second lowest on PC1, Wells’s narrative of the future is split along PC2 when studied by chapter. This is also the component that distinguished the utopian narratives from the rest of the corpus in Figure 3.1, prompting the question of how the utopian narratives are stylistically distinct from The Time Machine.

3.2.4 Utopian Styles

Ostensibly, The Time Machine shares many features with the nineteenth-century utopian narrative. In The Time Machine what first appears to be a utopian society is found to be more dystopian and has been termed a “failure of utopia” (Partington 57). Parrinder places The Time Machine alongside utopian literature in his monograph Utopian Literature and Science. Parrinder also argues that the utopian narratives of Butler and Bulwer Lytton are gradually revealed to have more dystopian characteristics than utopian (Utopian 84). Thus, The Time Machine, although not strictly a work of utopian fiction, is in good company with these other narratives. Nevertheless, the computational analysis has determined that Wells’s narrative is stylistically distinct from these nineteenth-century utopian narratives.


According to the second principal component (PC2), there is a stylistic distinction between the utopian narratives and the rest of the corpus. At the top of Figure 3.1 are Bulwer Lytton’s The Coming Race and Butler’s Erewhon, and the third utopian text in the corpus, Morris’s News From Nowhere, is located slightly lower than these but above the cluster of texts in the middle of the graph. The texts contrasted by this component and that appear at the low extreme of PC2 include Conrad’s Nigger of the Narcissus and Kipling’s Light that Failed. Although Kipling wrote books set in varied and exotic settings, Light that Failed is mostly set in London and is a story of unrequited love. When compared to Kim (1901) and the stories in The Jungle Book (1894), which include tales of fantasy and adventure, Light that Failed is relatively mundane. Therefore, the occurrence of Kipling and Conrad together at the bottom of PC2 cannot be automatically attributed to both being adventure fiction. However, they do share elements that have already been shown to characterise adventure-based narration, for the split of words along PC2 reveals that Conrad and Kipling’s texts share the words out, up, at, go, then and down. This is contrasted to the words weighted at the top of the vertical axis: which, of, more, than, were and their. In the context of the utopian narratives these words are frequently found in the descriptions of other races and more advanced societies.

The highest word on PC2 is which. It is no surprise, then, that the three utopian novels actually have the highest proportional frequencies of which in the entire corpus. Which frequently collocates with by and of. For instance, in the opening lines of Erewhon the narrator declares, “I will say nothing of my antecedents, nor of the circumstances which led me to leave my native country”, and then explains only that he left with the intention of purchasing land in a colony “by which means I thought that I could better my fortunes more rapidly than in England” (ch. 1). In this example, more and than are also used to compare and contrast. Both these words appear at the top of PC2 and occur throughout the utopian narratives of Bulwer Lytton and Butler as the narrators of each relate the utopian society with their empirical world: “The Vril-ya unite in a conviction of a future state, more felicitous and more perfect than the present” (The Coming ch. 13); “They are believed to be extremely numerous, far more so than mankind” (Erewhon ch. 14). As well as referring to the fictitious races by their given names, Vril-ya and Erewhonians respectively, the narrators also make frequent use of third person plural pronouns including they and their as well as the plural form of was: were. Characterising aspects of the fictitious civilisations are described thus: “In other respects they were more like the best class of Englishmen than any whom I have seen in other countries” (Erewhon, ch. 17); “Myths of that world were still preserved in their archives, and in those myths were legends of a vaulted dome in which the lamps were lighted by no human hand” (The Coming, ch. 4). In the second quote, the myths of the land are told in similar style to the experience the narrator has of the strange land itself.

The utopian narrator occasionally lapses into a present tense discussion: “I think that the Erewhonians are beginning to become aware of these things” (ch. 22). The “things” the narrator is referring to are the benefits of pursuing practical training rather than purely hypothetical training, which at the time of the narrator’s visit to Erewhon was the primary education occurring through universities of “Unreason”. In continuing the discussion, the narrator employs the auxiliary will, a word invoking the future: “I am sure that if they will have the courage to carry it through they will never regret it” (ch. 22). For the most part, however, these nineteenth-century utopias are primarily past-tense, first-person narrations featuring plural forms to describe civilisations.

Other also occurs at the high end of PC2 and indicates a thematic distinction between the utopian novels and the rest of the corpus: the focus on other civilisations. Although Butler’s Erewhonians are quite similar to humans, there is a subset of the civilisation that is treated as other even by the Erewhonians themselves. These are the sick, for it is illegal to have a disease in Erewhon. Other is employed as the narrator refers to the sick, “two other people, who were the first I had seen looking anything but well and handsome” (ch. 7), but not when he first encounters the Erewhonian form, which occurs when he finds himself in a ring of statues: “They were barbarous--neither Egyptian, nor Assyrian, nor Japanese—different from any of these, and yet akin to all” (ch. 5). It is this moment that Patrick Parrinder argues is “the first instance” where the utopian nature of the world is “displaced” and the character of the Erewhonian world is set up as being “monstrous and alien” (Utopian 14).

The narrative of the future is one of the key elements of science fiction and is a shared component between the utopian narratives and Wells’s early science fiction, The Time Machine. Furthermore, Parrinder has noted previously that The Time Machine is comparable to the nineteenth-century utopian narratives in relation to the “sensational narratives of entry and exit” where the narrator makes a “hair’s-breadth escape” (Parrinder, Utopian 84). Parrinder argues that “[t]he difficulty and increasing urgency of escape from Bulwer Lytton’s and Butler’s new worlds emphasise their dystopian character” (84). How do these exits compare when considered according to style?

In both utopian narratives the narrators have pertinent reasons to make a hasty, if difficult, escape from the strange worlds, and yet both are narrated in a way that lacks suspense. In The Coming Race, for instance, the narrator is rescued to make his escape in the dead of night: “Soft as were our footsteps, their sounds vexed the ear, as out of harmony with the universal repose” (ch. 29). The journey to the place of exit is told in the same narrative style that is indicated by the words high on PC2: “from which we descended…along the broad upward-road which wound beneath the rocks … toward the place from which I had descended … through that closed wall of rock before which I had last stood with Taee …” (ch. 29). In Butler’s narrative, the escape from Erewhon is longer but similarly lacks a building of suspense to the final exit. Rather than being rescued, the narrator executes his own escape: “My plan was this …” (ch. 28). The lengthy explanation of the plan includes the words found high on PC2: “knowing that we had no other chance of getting away from Erewhon, I drew inspiration from the extremity in which we were placed, and made a pattern from which the Queen's workmen were able to work successfully” (ch. 28). The escape in The Time Machine is also narrated in first person, “Very calmly I tried to strike the match. I had only to fix on the levers and depart then like a ghost”. Rather than which, Wells’s narrative of escape includes the preposition, at, and the adverb, then, both of which are found lower on PC2: “I made a sweeping blow in the dark at them with the levers, and began to scramble into the saddle of the machine. Then came one hand upon me and then another. Then I had simply to fight against their persistent fingers for my levers …” (ch. 10). In contrast to the utopian escape narratives, the Time Traveller’s tale moves at a much faster pace with events linked together through the adverb then and the direct actions of the escape told with prepositions such as at. Also present in this narrative of escape are the elements found low on PC1, the words associated with the adventure narratives, such as the articles, a and the, and prepositions such as into and of.

Indeed, when the “last scramble” of Wells’s Time Traveller is compared to the exit of Haggard’s protagonists in King Solomon’s Mines, we find the elements of the adventure narrative, the words low on PC1, rather than any words weighted either high or low on PC2. The narration of the escape from the mine makes frequent use of the conjunction and, which is pronounced in one of the artistic strategies employed in Haggard’s narrative, repetition: “A squeeze, a struggle, and Sir Henry was out, and so was Good, and so was I” and “we were rolling over and over and over through grass and bushes and soft, wet soil” (ch. 18). The use of repetition serves to heighten the emotions that the company were feeling upon finally exiting the mine. No such exuberance exists in the narrative of the Time Traveller’s escape nor the exits from the utopian worlds of Butler and Bulwer Lytton. In terms of the PCA results, the elements of repetition would be counted in the conjunction and, and the article a, both of which are weighted low on PC1.

Hence, in a corpus of late-Victorian texts it is the utopian narratives that are stylistic outliers and not the texts of early fantasy and science fiction. Characterised by a high proportion of which, more, than and they, the styles of the utopian narratives are distinguished by narratives with a focus on wholly different societies that, at least prima facie, appear to be more advanced societies, even a utopia. Hence, aspects of these fictional worlds are compared to the empirical world as being “more than” the narrator’s reality. Although The Time Machine includes many of these elements—comparison to the narrator’s world, a society that at first presents as a utopia, and a new culture the narrator is discovering—Wells’s style is not distinguished by these elements but is positioned by PC2 in the middle and by PC1, on the left with the adventure narratives.

3.3 Conclusion

This chapter set out to determine whether there are notable stylistic variations in the early examples of science fiction and fantasy. The analysis revealed that when studied against twenty-nine other texts from the late-Victorian era neither of the focus texts, The Time Machine and Lilith, is an outlier. The multivariate analysis on the corpus showed that the main stylistic variation between the thirty-one texts is between narratives with exotic settings and those with domestic settings, between the action narrative and the social narrative. Along this continuum the early science fiction work scored second lowest at the action end, but Lilith was not distinguished along the measure. The second component measured another aspect of stylistic variation in the corpus that highlighted a distinction between the utopian narratives and the rest of the texts. The main difference was in the style of exposition in the utopian works that narrated the comparison between an advanced society and the narrator’s own reality. The rest of the corpus, particularly works lowest on the second component, such as Conrad’s The Nigger of the Narcissus, contained a contrasting style of prepositions and adverbs connecting events rather than an expositional style presenting contrasts. Thus, even along the second component, neither The Time Machine nor Lilith is an outlier or even located at either extreme of the vertical axis.

These results are surprising on several counts. Firstly, Lilith is known to be stylistically different from contemporaneous texts due to its use of archaic diction. As well as containing easily distinguishable stylistic markers, MacDonald’s early fantasy novel is a unique mixture of tropes, including the portal and the frame narrative, and a style that does not necessarily support such tropes, at least not in the manner familiar to a contemporary reader. However, according to the deeper measure of style, the usage of the ninety-nine most common words in a corpus of late-Victorian texts does not distinguish Lilith from any of its peers.

The second finding was not as surprising. The stylistic relationship between The Time Machine and the adventure fiction of Conrad and Haggard on the left-hand side is stylistic evidence that can be interpreted as supporting the proposed links between late-Victorian sensational adventure fiction and the development of science fiction (Aldiss 138). Rather, the surprising result concerning The Time Machine is that there is a variation between the Time Traveller’s own narrative and the frame narrative. The stylistic variation is not necessarily due to a different diction or ‘voice’ but rather to a difference between the concrete narration of the mechanical technicalities and the more abstract narration of sensations experienced by the Time Traveller. Within the Time Traveller’s own narrative there are shifts between the adventure-paced narrative, linked with prepositions, and the more concrete narration of deductions and extrapolations where a heavier use of the definite article can be found.

Finally, rather than MacDonald’s odd fantasy or Wells’s adventures with science, it is the utopian narratives that are found as stylistic outliers in the late-Victorian corpus. This is the first stylometric study on the stylistic context of The Time Machine and as such is the first study to provide evidence that the style of the related, possibly even parallel, genre of utopian narratives is stylistically more distinct than examples of early science fiction. However, the three examples of utopian narratives examined were not necessarily the template for a future genre. As Parrinder has argued, there is a distinction between classic utopias and modern utopias where the modern, emerging from the twentieth century onwards, are “fictions not of ‘nowhere’ but of ‘not yet’” (Utopian 3). Accordingly, the late nineteenth-century utopian narratives are positioned in a unique literary and stylistic context: they certainly influenced the future writers of utopian fiction and the science fiction writers who were to incorporate utopian tropes, but they did not spawn a super-genre. The style of these works, mostly expositional, does not contain the same elements of the fast-paced action and the emotive language of the texts positioned lower on the second component. It is perhaps not a surprise then that Wells’s reaction to the utopian narratives in The Time Machine stylistically resembles the commercially successful adventure writings of the late-Victorian era. Although The Time Machine shares tropes with the three utopian narratives and is often discussed as a utopian narrative it is firmly, stylistically, distinct from late nineteenth-century utopian fiction.

Although the scope of this study has not extended to a complete list of early science fiction and fantasy texts, the implication of the results in this chapter is that the style of The Time Machine is similar to the popular, contemporaneous adventure narratives, and that the underlying style of Lilith is not distinguished from a corpus of social realist narratives. Both Wells and MacDonald were experimenting with new concepts of other worlds, accessed by moving differently in the dimension of time and the world of spirits. However, in the context of other late-Victorian texts neither The Time Machine nor Lilith has an underlying style that can be considered “new”.


The Case for Olaf Stapledon

4.1 Introduction

Considered one of the most original practitioners of early science fiction, William Olaf Stapledon is celebrated for his ideas rather than his literary style. This is not necessarily because Stapledon’s style is overlooked; rather, it is known for being idiosyncratic to the point of being unclassifiable. To some it is “repulsive” while to others it is “fascinating” (Jameson 124). Stapledon’s style merits closer attention. How is it that the idiosyncratic style of an original practitioner is not valued as an important aspect of his contribution to the genre?

To explore this puzzle, the computational analysis in this chapter considers nine of Stapledon’s works in relation to seventeen by H.G. Wells, his direct predecessor, and eight by Virginia Woolf, his modernist peer. Although Wells and Woolf are not the only authors who have influenced Stapledon, comparison with their work has an added interest because they in turn were influenced by Stapledon’s fiction (Crossley, “Famous” 625; Henry 111). Contextualising Stapledon’s style allows greater insight as to whether his literary style is indeed idiosyncratic to the point of being unlike any other genre. In two separate stylometric studies, one of Stapledon and Wells and the other of Stapledon and Woolf, this chapter demonstrates how, on the one hand, Stapledon’s literary style operates in the tradition of science fiction and on the other hand, it functions as a style emerging from the cultural milieu of modernism.

Further to the problem of Stapledon’s style are three specific questions that are explored in the following quantitative studies. These concern Stapledon’s position in the tradition of science fiction, claims concerning his idiosyncratic style in relation to his novel, Star Maker (1937), and his literary-artistic achievements. Stapledon’s literary achievements are usually discussed in relation to his visionary imagination; however, the execution of his vision in his nine novels has received little attention. Yet his literary achievement ought to include his unique artistry – as some commentators have suggested (Waugh; Huntington). This study calls for a renewed critical response to Stapledon’s art on the basis of quantitative evidence linking Stapledon’s style to his unique representational problems.

Scholars have claimed that Stapledon is one of the most isolated figures in science fiction (McCarthy 30; Priest 192). Although there is evidence of his connection to one illustrious figure, Wells, it is recognised that Stapledon preferred to downplay the influence of his predecessor, carefully insisting that his debt was to the “Wellsian mind rather than to specific texts” and constantly questioning Wells’s ideas, both “privately and in print” (Crossley, “Famous” 623, 624). As Wells is conventionally considered one of the forefathers of science fiction, it is logical to consider Stapledon as a second-generation science fiction writer. Yet Stapledon’s style does not appear to be an imitation of, or even an attempt to imitate, Wells’s literary style or his narrative formulas. In a lecture prepared and presented in the mid to late 1930s, Stapledon outlines his stylistic preferences for science fiction: “unemotional, unrhetorical, dry, concise, abstract” (qtd. in Crossley, “Olaf Stapledon” 27). A reader familiar with Stapledon’s canon will recognise these descriptors as apt stylistic markers, distinguishing Stapledon as an original practitioner of science fiction – sharing some artistic aims with predecessors while emerging from the first generation with new stylistic approaches to the shared representational problems. A question that remains unexplored is, just how isolated is Stapledon in terms of style?

A specific example of Stapledon’s unique style is found in his work, Star Maker. Noted as Stapledon’s most “inventive” work (Kinnaird 517), Star Maker has been described by Fredric Jameson as being idiosyncratic to the point of not resembling any genre, neither science fiction nor utopian, and not even resembling traditional notions of art (124). Indeed, Star Maker has been classified by Curtis C. Smith as a “cosmic history” (“Books” 299) and is treated by Gerry Canavan as “cosmological sf”, an attempt to “schematize all possible systems for social organisation that might ever exist in the universe” (310). It is one of the first science fiction novels to decentre the human experience, taking the reader beyond earth to increasingly unfamiliar worlds. According to Jameson, Star Maker achieves this through a past-tense narration that is “a kind of imitation of historical discourse” albeit “running on empty” (125), where vast temporal progressions are narrated in a manner that precludes the ability of the reader to imaginatively conceptualise the passing of a billion years – or two. Reiterating Stapledon’s own admissions in the preface to one of his minor works, Last Men in London (1932), Jameson explains that Stapledon’s narratives of the future move “whole societies around as though they were characters” (203).1 Despite it being quite evident that Stapledon’s canon includes some works which fall into the category of “future history” and others that do not, commentators, including Jameson and Peter Stockwell, restrict their remarks on Stapledon’s stylistic qualities to just the one category, the future history. This overlooks the protagonist-based narratives such as Odd John (1935), Sirius (1944), The Flames: A Fantasy (1947) and A Man Divided (1950). If Star Maker is indeed as idiosyncratic as Jameson claims, then a stylometric study might expect to find it as an outlier in Stapledon’s corpus. However, Star Maker is routinely treated alongside Stapledon’s other future histories. In making his argument, Jameson uses examples from (1930), another future history, to illustrate his claims about Stapledon’s style. Such treatment of Stapledon’s canon suggests that rather than containing a single outlier, the seemingly aberrant Star Maker, there are several works which fall into the same category. Therefore, the problem to explore is whether all Stapledon’s future history works are distinct, idiosyncratic to the point of defying categorisation – even distinct from one another – or whether those with shared representational problems also share artistic solutions.

1 Stapledon claims “this is a work of fiction, it does not pretend to be a novel…It has no hero but Man…There is no plot” (London; Preface).

Finally, it is often said that while Stapledon has a reputation as a singular mythmaker, he was “not a great poet, nor even in some conventional respects a very good novelist” (Davenport xiv). Among the numerous accounts of Stapledon’s extraordinary vision, there are only a few suggestions as to how Stapledon’s style served his visionary content as a solution, whether artistic or not, to their representational problems (Huntington; Waugh). Nevertheless, Stapledon’s vision required a linguistic vehicle. Written by a philosopher turned amateur novelist and lacking the story-telling talents of Wells and Woolf, each of Stapledon’s nine science fiction works can be approached as its own experiment in writing fiction. Together they form a “composite fiction” (Rabkin), outlining the development in Stapledon’s philosophy and progressions in his ideology (Moskowitz), even tracing the development of his literary style and narrative technique (Goodwin). This raises further questions, such as whether Stapledon’s literary style developed as his literary career progressed, or if the two distinct forms he developed attracted distinct literary styles according to the artistic goals of each work. If so, Stapledon’s accomplishments as a unique stylist ought to be re-evaluated accordingly.

4.2 Stapledon and Wells

Although Stapledon is usually placed in the next generation of science fiction writers to follow Wells, there was an overlap in their careers leading to a cross-pollination of ideas. Not only was Wells still publishing fiction when Stapledon commenced in 1930 but his second-last novel references Stapledon’s first novel: “that man Olaf Stapledon has already tried something of the sort in a book called Last and First Men” (Star-Begotten, ch. 2). Robert Shelton has argued that Last and First Men is, at least in part, Stapledon’s response to Wells’s The War of the Worlds (1898). In turn, Wells’s Star-Begotten is a response to Last and First Men (Shelton 1). Although Stapledon accrued “a profound debt” to Wells (Roberts, History 169), Wells’s later fiction, in turn, owes a debt to Stapledon.

It has been argued that Wells’s artistic experiments in the late stages of his career were not as successful as his earlier works and perhaps not even as successful as Stapledon’s works. Wells’s major works are generally placed in two distinct groups, with the first characterised as “a fiction of wonder” and the second “a fiction of social concern and commitment” (Scholes and Rabkin 19). The first group consists of works published from 1895 to 1901 and the second, overlapping slightly with the first, includes works published between 1899 and 1908. In general, the remainder of his twentieth century fiction is distinguished by “its heightened political seriousness” (Wagar 577). This change was met with disappointment from fans and critics, which Wells responded to indirectly when he noted in 1934 the “incurable habit with literary critics to lament some lost artistry and innocence in my early work and accuse me of having become polemical in my later years” (qtd. in Wagar 577). In addition, Steven McLean has observed a significant change to Wells’s narrative techniques in 1901, starting with The First Men in the Moon (117). By contrast, Peter Stockwell does not distinguish between Wells’s nineteenth and twentieth century fiction but rather explores what he calls prototypical science fiction narrative that concerns “events in the future” (Stockwell 34).

From a more aesthetic viewpoint, Crossley has argued that Wells’s late fictions, including Star-Begotten, attempted but failed to adopt a modernist tone in imitation of

Stapledon (“Famous” 630). According to Crossley, Stapledon’s experimental style is a

“stunning” example of distinctive modernist traits (“Famous” 629). The features Crossley points to are difficult to quantify: the absence of satire in the mechanical narration of utopias and a combination of “high comedy and genuine eroticism” (“Famous” 628, 629). Despite these stylistic discrepancies, critics have claimed that there is a traceable lineage from Wells through

Stapledon to genre science fiction (Crossley, Olaf 198; Moskowitz 264). Although no one has thoroughly explored the direct stylistic lineage between Wells and Stapledon, the latter’s

“special form”, developed in order to address “the aesthetic problem he solves” (Huntington

349), has been addressed briefly. The question explored in this computational study is whether these solutions are truly unique to Stapledon or can be related in some way to Wells’s earlier attempts.

The relation between the two authors is explored by considering the differences as well as the similarities in the underlying structures of language through an analysis of the relationships between one hundred words and twenty-six texts. Of the twenty-six texts in the corpus, nine belong to Stapledon. These include Stapledon’s five major works, Last and First

Men, Odd John, Star Maker, Sirius and The Flames, and four minor texts which are also considered early examples of science fiction: Last Men in London, A Man Divided, Darkness and the Light and Death into Life (Smith, “Books”). Not included in the corpus are Stapledon’s short stories, poetry and philosophical volumes and the first draft of Star Maker which was posthumously published as

Nebula Maker (1976). Publication details for all nine works are found in Appendix 4.1 along with information regarding the seventeen texts by Wells. Included in the Wells corpus are sixteen science fiction novels and one that is debatable, The Dream (1924), which is included on account of its utopian features – the story follows a man from a future utopia experiencing a dream of the Victorian period. Not included are Wells’s science fiction short stories and other novels that are not classed as science fiction publications (Wagar).

One hundred words were selected, based on frequency. This list represents an extreme difference in usage: the most frequently used word, the, appears a total of 116,118 times while the one hundredth most common word, seemed, occurs only 2,226 times. The list also represents fifty-two percent of all word occurrences in the corpus and includes some key lexical words such as time and world. One recent study by Craig and Greatley-Hirsch restricted the studied word-list to function words, the “skeleton” of language as they appear regularly “regardless of topic or genre” (124). This study uses the most common words, regardless of whether or not they are function words, in keeping with the classic Burrows method (Burrows, “Word-

Patterns”). A full list of the one hundred words is found in Appendix E.
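The selection step can be illustrated with a minimal sketch. It assumes a very simple tokenizer and an invented two-text corpus; the function names (`tokenize`, `top_words`) are hypothetical and do not reproduce the software used for this study. The sketch ranks words by raw frequency across the whole corpus and reports the share of all tokens that the selected list covers, the figure given above as fifty-two percent for the real corpus.

```python
from collections import Counter
import re

def tokenize(text):
    """Lowercase a text and split it into word tokens (deliberately simple)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def top_words(texts, n=100):
    """Return the n most frequent words in the corpus and their share of all tokens."""
    counts = Counter()
    for text in texts.values():
        counts.update(tokenize(text))
    total = sum(counts.values())
    most_common = counts.most_common(n)
    coverage = sum(count for _, count in most_common) / total
    return most_common, coverage

# Invented toy corpus, not the thesis data.
toy = {
    "text_a": "The men of the world were led by the few.",
    "text_b": "He got up and the door seemed to open.",
}
words, coverage = top_words(toy, n=3)
print(words[0], round(coverage, 2))
```

Because the list is chosen by frequency alone, lexical words such as time and world enter it on the same footing as function words.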

The proportional frequency of each word in the list of one hundred was analysed using

PCA, which first correlates the word-variables. The resulting analysis finds a combination of weightings that accounts for the largest amount of variance in the data. It gives similar weightings to words that tend to appear in the same texts and opposite weightings to those words which did not tend to occur together. The analysis returned twenty-six of these combinations with each component progressively explaining lower amounts of variance. The first component (PC1) accounts for 29% of the variance in the data and the second component

(PC2) 15%. These two components are studied in the following discussion and are graphed in

Figure 4.1 which serves as a map of the patterns found between the one hundred words as used in the 26 texts. In Figure 4.1, the initials (OS, HGW) identify the author, the value in brackets supplies the date of publication and a legend to the titles is found in Appendix D. For instance

OS_Darkness(1942) is Stapledon’s Darkness and the Light, first published in 1942.
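The logic of the first component can be sketched on a small scale. The example below is an illustration only: the rates are invented, only three words and five texts are used, and power iteration is an assumption made to keep the sketch free of external libraries (in practice a numerical library would perform the full decomposition). It correlates the word-variables and then finds the single set of weightings that captures the most variance.

```python
from math import sqrt
from statistics import mean, stdev

def standardise(column):
    """Convert one word's rates to z-scores across the texts."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]

def correlation_matrix(columns):
    """Pearson correlation between every pair of word-variables."""
    z = [standardise(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def first_component(corr, iterations=200):
    """Power iteration: weights of the component with the largest variance."""
    k = len(corr)
    w = [1.0] + [0.0] * (k - 1)  # arbitrary starting vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
    # The Rayleigh quotient gives the variance captured; dividing by k gives its share.
    variance = sum(w[i] * corr[i][j] * w[j] for i in range(k) for j in range(k))
    return w, variance / k

# Hypothetical rates per 1,000 words in five texts (invented for illustration).
rates = {
    "by":    [6, 5, 4, 2, 1],
    "which": [5, 6, 4, 1, 2],
    "up":    [1, 3, 2, 6, 5],
}
weights, share = first_component(correlation_matrix(list(rates.values())))
for word, weight in zip(rates, weights):
    print(f"{word:6s}{weight:+.2f}")
print(f"PC1 share of variance: {share:.0%}")
```

In this toy case by and which rise and fall together and so receive weights of the same sign, while up, which moves against them, receives a weight of the opposite sign.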

Figure 4.1 PCA Scores of Stapledon and Wells (PC1 vs. PC2)

The first component (PC1) returned a clear distinction between the two authors. The texts on the right-hand side of the y-axis all belong to Wells while the left-hand side of the graph is dominated by Stapledon’s texts, with the exception of two by Wells, A Modern Utopia

(1905) and The Shape of Things to Come (1933). This result indicates that there are indeed stylistic differences between the two authors. However, these results are complicated by the relationship between Wells’s two outliers and the split in Stapledon’s corpus: a cluster of four at the bottom left and the texts closer to the y-axis speak to two distinct styles within

Stapledon’s corpus.

The far-left cluster of Stapledon’s texts are his four future history texts. Separated slightly is Last Men in London, the sequel to Last and First Men and a social commentary viewing

Stapledon’s present-day London through the lens of a visitor from two billion years in the future. It is plausible that a reflection on present day affairs requires a different stylistic form than narratives devoted to speculating futures. It is this style that is closer to Wells’s The Shape of

Things to Come.

Turning to the results along the second principal component (PC2), there are distinctions according to date on the right-hand side of the y-axis. Wells’s early works, from The

Time Machine (1895) to The War in the Air (1908), appear together in a group within the bottom right quadrant, with the exception of A Modern Utopia which is at the extreme top of PC2. Indeed, all of Wells’s texts that were written in the nineteenth century fall below the horizontal x-axis. The only texts to appear above the x-axis were written in the twentieth century. The trend occurs only in one direction: while Wells’s later fictions may stylistically resemble his earlier work, none of his nineteenth century works come close to the top of the graph.

Therefore, in characterising PC1 it can be said that this component has mostly separated the authors based on their styles, but also a sub-style in the corpora of both, a distinction that is more evident in Stapledon’s corpus. Separated from Stapledon’s main cluster are the stories of human interest, characters and dialogue. PC2, on the other hand, primarily distinguishes between the majority of the corpus and a group of outliers including Stapledon’s The Flames and

Wells’s later books, A Modern Utopia and Star-Begotten. These results prompt a further question: how are Wells’s two outlying texts related to Stapledon’s style?

The word weightings produced by PCA offer a way of understanding the patterns detected in terms of word use. Graphed in Figure 4.2, the proximity of certain words to

Stapledon’s four future history works – these are the words that appear on the left-hand side of the graph – indicates that they are found relatively more frequently in these texts and can be said to be characteristic of his particular style.

Figure 4.2 PCA Loadings of 100 Words in Stapledon and Wells (PC1 vs. PC2)

Indeed, the words found on the left-hand side of PC1 (the horizontal axis) are found in both Wells’s The Shape of Things to Come and Stapledon’s future history and confirm that they share a style as they pursue other, known similarities. In the future history narratives, for instance, both authors treat entire civilisations as the characters of their narratives, both cover vast cultural epochs in single broad-sweeping summaries, and both reference entire trends and features thus: “So within a century the appearance of the human crowd changed” (Wells, Shape,

ch. 1, sec. 6); “So strong were the influences of that time that even up to 2020 the tendency of architectural design was to crouch” (Wells, Shape, ch. 1, sec. 6); “The crowds that streamed along these footpaths were as variegated as our own” (Stapledon, Star, ch. 3, sec. 2). The emphasised words are those found on the far left-hand side of Figure 4.2, but they are not the most extreme words found on that side of the axis.

According to the word weightings in Figure 4.2, the word at the extreme left-hand side of PC1 is by and is opposed to up at the right-hand end. In terms of usage, the texts on the left of Figure 4.1 have a high frequency of by and a low frequency of up. Although both by and up are preposition forms, they are employed mostly as adverbs and evidently have distinct functions in language. In terms of style, the question is what sort of language has a high usage of by and a lower usage of up and what manner of expression has the reverse?

By indicates relationships between concepts and agents performing actions: “led at first by the British, but later by the North Americans” (Stapledon, Darkness, ch. 3, sec. 1). This example is from Stapledon’s Darkness and the Light, the text that falls furthest to the left in Figure

4.1 and contains the highest frequency of by in the corpus – occurring 492 times and accounting for 0.61% of the text in Darkness and the Light.

By contrast, the text with the smallest frequency of by, Wells’s Star-Begotten, has a total of ninety-two occurrences (accounting for 0.24% of the text) and 10 of these occurrences are in the same sub-chapter, Chapter 1, Part 2. This chapter is distinct from the rest of the narrative on account of being an introduction to the central figure, Mr Joseph Davis. Interrupting the narrative “to tell the reader a few things about him”, by occurs in this section to relate events of

Davis’s past, “a shadow story which was told less by positive statement than by hints”; tell of his abilities, “this was a thing that came to him by such imperceptible degrees”; explain his relationship to others, “inspired by Paul de Kruif’s Microbe Hunters”; and to pinpoint narrative time, “By this time he was at Oxford”. Accordingly, in the text with the fewest occurrences of by, the heaviest concentration of the word is in the one section of the novel that could be considered past narration while the rest of the narrative consists of dialogue between characters.
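The two measurements used here, a word’s share of all tokens in a text and its concentration within one section, can be sketched as follows. The section labels and sentences are invented, the tokenizer is an assumed simplification, and the function names are hypothetical. As a rough check on the scale involved, 492 occurrences at 0.61% would imply a text of roughly 80,000 tokens.

```python
import re

def word_rate(text, word):
    """Return (count, percentage of all tokens) for one word in one text."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    count = tokens.count(word)
    return count, 100 * count / len(tokens)

def densest_section(sections, word):
    """Return the label of the section with the most occurrences of the word."""
    return max(sections, key=lambda label: word_rate(sections[label], word)[0])

# Invented mini-sections, not the text of Star-Begotten.
sections = {
    "ch1_sec1": "They talked and talked until the light failed.",
    "ch1_sec2": "He was inspired by books, by hints and by degrees.",
}
count, percent = word_rate(sections["ch1_sec2"], "by")
print(count, percent, densest_section(sections, "by"))
```

Comparing raw counts between sections is only a first pass; sections of very different lengths would need to be compared by rate rather than by count.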

Comparatively, an example of a style heavy in the preposition form up but light in by is found in The Invisible Man, the text with the highest rate of up in the corpus (accounting for 0.4% of all word tokens in the text). The highest counts of up occur in Chapter 27, a chase scene:

“he got up”, “he went up”, “led the way up”, “tried to struggle up”, “stood up”, “threw up”,

“given up”. The language is describing action and movement.

A brief look at up in styles heavier in by reveals that up can serve a very different function than primarily indicating movement. The following examples are from Stapledon’s

Death into Life, the text with the lowest proportion of up in the entire corpus. Rather than physical movements, the direction implied by up is metaphorical: “Greedily he licked up all the sweetness of his precious numbered days, and spat out all their bitterness”, “lapped up nothing, spewed nothing out”, “in many other betrayals, up and down his life, he had given poison to his own soul”, “New thoughts welled up in him”, “He took up once more the thread of his meditation”. Here other prepositions of movement, out and down, are similarly employed

– to express not the physical motion of a protagonist but a metaphysical movement through temporal or cosmic landscapes. Indeed, even the preposition in the title of the text, Death into

Life, alerts the reader to the metaphysical rendering of what is otherwise a material concept.

While into can be expressing movement, it can also be expressing a change of state – from a dead state to a lively state.

The word up – albeit not used as frequently – can still be used to indicate physical movement in Stapledon’s future histories. For instance, in Star Maker the direction of the narrator’s disembodied consciousness is said to “soar up from the planet” (ch. 4, sec. 1). Out of context, this fragment could be another example of metaphorical movement, but within the context of Stapledon’s imaginary framework, in which the narrator experiences an entirely

disembodied journey across the cosmos, it is clear that this instance refers to a physical movement in a literal direction – away from the planet. In this way, therefore, one stylistic difference between styles of future history texts and the style of more adventure-based texts can be observed in the usage of just two words.

Observation of other patterns in Figure 4.2 fleshes out the reporting, past-narration and historical modes. Indeed, the patterns are quantitative evidence for the interpretations of

Jameson and Stockwell concerning the form of future history narratives: that the core narrative, of simple past narration covering vast time leaps, is interrupted by a reporting style, editorial interludes, as Stockwell calls them (36), or “Verne-like fait divers” according to Jameson (126).

Stockwell’s example of the editorial interlude is taken from Wells’s The Shape of Things to Come and contains both simple past and progressive past narration as well as a non-past form which is the narrator’s comment, “what our historian calls here…” (36). In this instance, not only is the interlude inserting an authoritative voice for the historical account but also the superior consciousness of the narrator. Stockwell points out that instances such as these are some of the only instances of simple present forms and it is not until Isaac Asimov, Ray Bradbury and

Arthur C. Clarke attempted the future history form that there is “any significant introduction of the direct speech of characters” (37) and thus more narration of individual characters and their situations. In the present case, the style of narration is marked not only with a heavy usage of the preposition by, but also a relatively higher rate of the words in, of, their, which, and were.

In order to cover vast periods of time, entire epochs, the narration of future events also refers to entire generations or variant species of humanity using the third person plural: “The

Third Men were very subject to a craving for personal immortality. Their lives were brief, their love of life intense” (Last); “though their faces were inhuman, the basic pattern of their minds was not unlike our own. Their senses were much like ours…” (Star-Maker 49). In addition, which appears frequently in past-narration, operating below as a relative pronoun:

Only the faint incoming light of other stars was seen. This afforded the perception of a

surrounding heaven of flashing constellations, which were set not in blackness but in

blackness tinged with the humanly inconceivable color of the cosmic rays. (Star-Maker,

ch. 6, sec. 3)

These words also appear together in the style of Wells’s future history, The Shape of Things to

Come:

The most difficult thing in our understanding of the past is to realize, even in the most

elementary form, the mental states of those men and women, who seem so deceptively

like ourselves… It is only when we compare their conduct with ours that we realize

that, judged by their contents and their habits of reaction, those brains might almost

have belonged to another species of creature. (ch. 3, sec. 7)

This style of past narration tends to invoke the plural pronoun their in order to indicate entire civilisations and epochs. The writers also tend to refer to entire subsections of civilisations as men: “It was perfectly sane men who made the World War” (Wells, Shape, ch. 3, sec. 7). Thus, it can be said that one artistic solution to narrating the far distant future of humanity relies entirely on past narration and particularly on the articulation of a point of reference through prepositions, whether physical or metaphysical.

In contrast, the words located on the right-hand side of Figure 4.2 indicate that Wells’s cluster of mostly early works has more first-person narration and dialogue and a heavier use of the pronouns I, my and me, as well as the verb said. The cluster of prepositions (up, out, over, down), the adverbs denoting temporal positions (before, again, then), the verbs came and seemed, and the adjective little all indicate texts containing both description and dialogue.

Furthermore, the style of texts at this end of the graph also includes a higher rate of the indefinite article a. The following samples from chapter 27 in The Invisible Man indicate how a is used:

“read a strange missive”, “on a greasy sheet of paper”, “from a locked drawer”, “took a little revolver”, “wrote a number of brief notes”, “added a mental reservation”, “a familiar voice”, “a silvery glimpse”, “a second smash”, “a snap like a pistol”. In this example, the indefinite article operates to create a sense of suspense. As well as being “indefinite”, “a or an can be used to introduce a particular individual item to a discourse, one that is familiar and uniquely identifiable for the speaker, but not for the hearer” (Craig, “A and an” 276). As such, when the item is referenced again the writer or speaker “can use the definite article or a personal pronoun” (Craig, “A and an” 276). However, in the above example from The Invisible Man, most individual items are not mentioned again. The introduction of so many unfamiliar items, items that may not have a recurring relevance in the narrative, sets a scene of suspense with many unknowns remaining unknowable, placed in view of the reader only briefly as the action speeds up throughout the chapter.

If Wells’s texts feature a relatively higher usage of a, which texts are more associated with the determiner, the? Located bottom left of Figure 4.2, the has a relatively higher usage in

Stapledon’s Darkness and the Light, Wells’s War of the Worlds and The Shape of Things to Come than any of the other texts in the corpus – although these three are closely followed by Stapledon’s other future histories. Exploring the distinction between Wells’s future history, The Shape of

Things to Come, and Stapledon’s four main future histories will provide a deeper understanding of how the, and other nuanced distinctions, operate in the future history narratives of Wells and

Stapledon.

Although similar in terms of narrating future and/or distant civilisations, when it comes to the more unfamiliar concepts, such as the group mind, or the “communal mind” (Star Maker, ch. 8), Wells maintains a more confident tone, and this is seen in the words associated closer to the position of Shape in Figure 4.2 (particularly along PC1):

The body of mankind is now one single organism of nearly two thousand five hundred

million persons, and the individual differences of every one of these persons is like an

exploring tentacle thrust out to test and learn, to savour life in its fullness and bring in

new experiences for the common stock. We are all members of one body. (Shape, ch. 5,

sec. 9)

In this instance, the determiner the offers definiteness along with other elements of Wells’s style including the familiar features of past narration, prepositions of and in. Along PC1, the words more associated with The Shape of Things to Come than Stapledon’s future histories also include now, are, is and to. Where Wells’s narrator relays what “is”, the description of the “communal mind” in Stapledon’s Star Maker is less certain:

Seemingly we had attained such a deep mental accord that, when conflict arose, it was

more like dissociation within a single mind than discord between two separate

individuals. (Star Maker, ch. 4, sec. 1)

These stylistic distinctions are in keeping with documented authorial traits; as Crossley explains, Stapledon “was neither as noisy nor self-assured as his mentor Wells” (“Introduction” x). This is not to say that Stapledon’s narratives always lack certainty. For instance, in narrating the perspective of an alien race from Mars in Last and First Men, Stapledon’s narration carries more authority:

Experiment had shown that these creatures died when pulled to pieces, and that

though the sun’s radiation affected them by setting up action in their visual organs,

they had no really direct sensitivity to radiation. Obviously, therefore, they must be

unconscious. (ch. 4, sec. 1)

Must is found halfway along the left-hand side of PC1, indicating that it is not as strongly correlated with Last and First Men as the other words on the far left-hand side. Similarly, the above segment of text sits somewhat outside the core narrative, presented as a report by

Martian intelligence on humanity. This section has been interpreted by John Rieder as an

example of post-colonial themes in science fiction, an early dramatisation of the “potentially catastrophic consequences of understanding the exotic in terms of the one’s own experiences”

(Colonialism 83).

The narration of the exotic, or the “Other,” is a recurring theme in Stapledon’s fiction as

Sherryl Vint and Joan Gordon note in relation to Sirius. Notably, Vint argues that Stapledon contemplates a future where humans are capable of “sharing the planet with beings radically different from ourselves” (204). Sirius, positioned close to the y-axis in Figure 4.1, is stylistically distinct from Last and First Men, suggesting that Stapledon’s style varies, even where themes may be similar. Such a reading of the PCA results is particularly persuasive given interpretations of

Stapledon’s nine works as a composite whole (Rabkin) with a progression of philosophical ideas (Moskowitz) and an increasingly unreliable narrator (Goodwin). The uncertainty of

Stapledon’s authorial voice, interrupted with the occasional editorial report, as in the Martian example above, can also be understood in the context of Stapledon’s broader aims.

Departing from the computational study briefly, Stapledon’s spiritual quest can be found embodied in his notions of the communal or cosmic mind. Robert Branham has argued that Stapledon’s vision is achieved by his unique literary style of “unreliability” through

“indirection” which maintains “the sense of mystery that inspired and sustained his spiritual quest” (Branham 249). However, the elements of unreliability and indirection could be linked to the futility of Stapledon’s quest rather than a sense of mystery. For instance, Gerry Canavan has claimed that Star Maker was Stapledon’s attempt to “break through the incomprehensibility and incoherence of the present to find the hidden system lurking underneath that finally explains everything”, which Canavan then links to Wells’s criticism of Stapledon (311). “Essentially”,

Wells wrote to Stapledon, “I am more positivist and finite than you are. You are still trying to get a formula for the whole universe. I gave up trying to swallow the Whole years ago” (qtd. in

Crossley, "Correspondence" 41).

Rather than a lack of confidence or inability to articulate his vision, might Stapledon’s hesitation with regard to elements such as communal mind be linked to the artistic solutions he sought for the philosophical problems he tackles, such as the limitations of language? Indeed, the narrator of Star Maker directly addresses the concept of language as it relates to the group mind: “Human speech has no accurate terms to describe our peculiar relationship” (ch. 7). It has been noted by scholars that the very nature of encountering beings with superior consciousness incurs the representational difficulty of narrating encounters from the perspective of the less developed consciousness (Huntington 352-3; Branham 250). Stapledon’s artistic solution to this limitation in relation to the communal mind in Star Maker is self-reflexive:

In the following chapters, which deal with the cosmical experiences of this communal

“I”, it would be logically correct to refer to the exploring mind always in the

singular…nevertheless the pronoun “we” will still be generally employed so as to

preserve the true impression of a communal enterprise, and to avoid the false

impression that the explorer was just the human author. (ch. 8)

Indeed, first person plural pronouns are employed more frequently in Star Maker than almost every other work included in the corpus – exceeded only by Wells’s A Modern Utopia. The segment of Star Maker with the highest ratio of first person plurals is the first section of Chapter

4, where the narrator first attempts to explain the communal mind through the relationship between his disembodied consciousness and his host mind, Bvalltu: “In time each of us came to feel…”. In Death into Life the language available to Stapledon is almost exhausted. “Nothing in my world is identical with anything in yours. Not a tree, not a word, not a person. Is redness, even, to me just what it is to you?” (Second Interlude). Through the narrator, Stapledon writes:

“What matter? Such a difference would be eternally insignificant for us, since it would be forever indiscernible” (Second Interlude). Just as he once explained to Wells, Stapledon was committed to the notion of “thinking about the universe as a whole even though

philosophically it’s futile” (qtd. in Crossley, "Correspondence" 41-42). Perhaps this is one explanation for their stylistic differences: Wells was certain of his positivism and finite ideas but

Stapledon knew his attempts to formulate the universe as a whole were futile.

The PCA results have confirmed the presence of key stylistic markers that distinguish the style of Stapledon from that of Wells, where even Stapledon’s future history texts are distinct from both his other fictions and Wells’s own attempt to narrate the future as past history. The form of the future history text is past-tense description of entire societies rather than individual characters; progressions – as well as declines – are described by the transition between epochs rather than individual years. With movements through metaphysical as well as physical spaces, the style of a future history text is dense with by, were, their, in and of, which are all positioned on the left-hand side of Figure 4.2, corresponding to the position of the future histories on the left-hand side of Figure 4.1. Although these elements are found in both Stapledon’s future histories and Wells’s The Shape of Things to Come, the PCA results indicate a stylistic distinction between them that can be linked to Wells’s more confident tone.

We turn now to further nuances, those between Stapledon’s other works, Odd John,

Sirius, A Man Divided and The Flames, all of which are separated from the historical narratives of the future and are closer in style to Wells’s adventure-based narratives. While still containing past narration forms, these four works are more associated with the words needed to form dialogue than they are with the words prominent in the future history works.

In relaying the story of an augmented super dog, the narrator of Sirius recreates the past from a scientist’s papers, interspersed with personal encounters with the protagonists, “Sirius at a later date told me” (ch. 14) and “freely” filled out with the narrator’s “imagination” (ch. 1).

The combination is a narrative of past account, dialogue between characters and more personal details than are found in any of Stapledon’s future history works:

Elizabeth mothered him and cleaned up his foot with a certain well-known disinfectant.

The smell of it was repugnant to him, but it now acquired a flavour of security and

kindliness which was to last him all his life. (ch. 3)

As well as being described as the “most aesthetically satisfying” of all Stapledon’s fictions

(Kinnaird 517), Sirius has also been considered as Stapledon’s best novelistic attempt to “create an ‘objective observer’” (Barron 119). Similarly, Odd John and A Man Divided contain dialogue, the narration of a main protagonist and didactic elements resembling Stapledon’s reporting style of the future histories. However, where the future histories move whole eons and societies around as though they were characters, Stapledon’s works closest to the middle of

Figure 4.1 are more concerned with individuals.

The position of The Flames so high in Figure 4.1 is explained by the split of words along the vertical axis (PC2) in Figure 4.2, which points to a comparatively high use of the verbs will and be. This strongly suggests that the two words appear together in the form will + be, especially because English has no future tense as such but only the form will + be.

Thus, science fiction stories narrating the future rely on the past tense to maintain verisimilitude

(Stockwell 45). However, the position of these words at the top of the graph indicates that the language of Wells’s A Modern Utopia and Star-Begotten, and of Stapledon’s The Flames, invokes the future rather than asserting the occurrence of the future as historical fact. A passage from

Wells’s A Modern Utopia confirms that the style of speculative predictions is quite different to the more certain forms of past-narration, “[t]he language of Utopia will no doubt be one and indivisible…” (ch. 1, sec. 5). With more certainty, the narrator concludes, “the language they will speak will still be a living tongue, an animated system of imperfections, which every individual man will infinitesimally modify” (ch. 1, sec. 5). This more concrete vision is due to use of which, a word that occurs in the bottom-left quadrant along with the future history narratives.

Furthermore, modals would and could are split along the y-axis, with would appearing close to the top and could occurring lower, associated more with Wells’s The Shape of Things to

Come. The difference between these two modals comes down to degrees of certainty: while would offers more certainty, could is speculative. Consider the difference between “that would be dangerous” as opposed to “that could be dangerous”. The modal could tends to create a subjunctive mood, “the mood of non-fact, expressing the uncertain, hypothetical, or desirable”

(Wales 307). Even the verb form will + have expresses more certainty than could, “The whole world will surely have a common language… Indeed, should we be in Utopia at all, if we could not talk to everyone?” (Wells, Modern, ch. 1, sec. 5). Comparatively, the use of modality in

Wells’s The Shape of Things to Come is in keeping with the past-narrative forms: “It would take three or four generations to convert the world to a forward-looking attitude” (Wells, Shape, ch.

3, sec. 9). A concordance search of Shape reveals that would collocates most with have, and could with be. In contrast, a concordance search of A Modern Utopia shows that could and would both collocate most frequently with be. Apart from these distinctions, there seem to be few reasons for an author to choose between would and could. Yet the choice is one distinguishing factor between a utopian form and a future history form.
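A collocation count of this kind can be sketched briefly. The example assumes that a collocate is simply the word immediately to the right of the node word, which is an assumption made here for illustration and not a description of the concordance settings used in this study; the sample sentences and the function name `right_collocates` are invented.

```python
from collections import Counter
import re

def right_collocates(text, node, top=3):
    """Count the words that immediately follow each occurrence of the node word."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    following = Counter(
        nxt for cur, nxt in zip(tokens, tokens[1:]) if cur == node
    )
    return following.most_common(top)

# Invented sample sentences, not quotations from Wells.
sample = (
    "It would have been easy. It would be dangerous. "
    "That could be true, and it would have failed."
)
print(right_collocates(sample, "would"))
print(right_collocates(sample, "could"))
```

A wider window, or counts on both sides of the node word, would give a fuller picture, at the cost of admitting collocates that are not part of the verb phrase.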

The modal verbs serve as a specific solution to a representational problem in the future history forms. As Stockwell noted in discussing an excerpt from Wells’s Shape, there are moments where the narrator interjects information with a pronoun such as our. However, hypothetical situations expressed through modality also interrupt the narrative core: “If the reader were sent back only for the hundred and seventy years between now and 1933, he would still feel a decided uneasiness about what people might or might not do next” (Wells, Shape, ch.

3, sec. 7). Stapledon’s future histories also employ modals most often when the core narrative is interrupted with editorial interludes:

It might have been expected that, after the downfall of the First World State, recovery

would have occurred within a few generations. Historians have, indeed, often puzzled

over the cause of this surprisingly complete and lasting degradation. (Last, ch. 5, sec. 1)

Might is also employed in this example of hindsight, while would is used throughout Last and First Men to express the shape of the future that could have occurred or that could have been avoided. Hindsight and speculation in the future history narrative are a vehicle not only for the authors to offer social criticism but also to assist the reader in accessing the experience of the speculative vision of the future.

It is worth noting that the split of prepositions across PC1 is a surprising feature and not common in computational stylistics. The split indicates two very distinct styles of narration, one of adventure and the other of history. Even within styles narrating the future there are two distinct forms: the narration of a speculative future and the narration of the future as past history. Distinct from the utopian narrative, the future history novel requires different solutions for different representational problems. Among the artistic solutions available, Stapledon attempted a distant account of history – as in the case of Last and First Men – and a truncated encyclopaedic guide to alternate species in distant galaxies – as in Star Maker. By confirming that there are distinct literary styles in Stapledon’s corpus, this stylometric study suggests that the distinct representational problem of narrating the future necessitates a particular stylistic solution – one that even Wells employed in The Shape of Things to Come.

Although they have distinct literary styles, Wells and Stapledon share some aims and thus some solutions. For instance, Stapledon uses more dialogue while maintaining past narration in Sirius and his other protagonist-centred works. Stapledon’s most “ambiguous” work (Kinnaird 517), The Flames, is closest to the uncertain modalities of the utopian literature.

Star Maker has an idiosyncratic style, as Jameson claimed, in that it sits apart from these four works and does not come close to the form used by the utopian genre nor to Wells’s more traditional science fictions, The Time Machine and The War of the Worlds. But it is no outlier, surrounded as it is by three other future history texts. Although unique in form even among Stapledon’s entire corpus, this category of fiction forms its own distinct stylistic grouping. If any of Stapledon’s works ought to be deemed his most unusual it could be The Flames – indeed it is the text that is stylistically closer to Virginia Woolf’s most experimental work, The Waves.

4.3 Stapledon and Woolf

In their letters to one another, Woolf and Stapledon expressed mutual admiration. But while Stapledon admired Woolf’s artistry, writing to her of his “despair at the thought of the contrast between your art and my own pedestrian method” (qtd. in Henry 111), Woolf congratulated Stapledon on his ability to grasp ideas that she herself “tried to express, much more fumblingly, in fiction” (qtd. in Crossley, “Olaf Stapledon” 29). Given that Stapledon praised Woolf’s artistry and Woolf in turn praised Stapledon’s achievements, it is likely that they were influenced by each other’s artistic solutions to the problems that they faced. A computational study of their styles can help show how close their styles are and what elements are distinct.

Included in the corpus are the nine Stapledon texts used in the test of Section 4.2 and nine of Woolf’s, including Mrs Dalloway (1925), To the Lighthouse (1927), Orlando (1928), The Waves (1931) and The Years (1937). A full list is found in Appendix D. The one hundred most frequently used words in the corpus of seventeen texts were counted and the title Mrs was removed, which leaves the proportional frequencies of ninety-nine words to analyse. In addition to function words, the list, found in Appendix F, includes world, man, time and people.
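The word-selection step described above can be sketched as follows. This is an illustrative reconstruction only: the helper names (tokenize, top_words, proportional_frequencies), the simple tokenising rule and the two toy "texts" are assumptions, and the study's own software may differ in how it counts and normalises words.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(texts, n, exclude=()):
    """Take the n most frequent words across all texts, then drop any in `exclude`."""
    totals = Counter()
    for text in texts.values():
        totals.update(tokenize(text))
    ranked = [word for word, _ in totals.most_common(n)]
    return [word for word in ranked if word not in exclude]


def proportional_frequencies(text, words):
    """Express each word's count as a percentage of all tokens in one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return [100.0 * counts[word] / len(tokens) for word in words]


# Invented miniature corpus, standing in for the full novels.
texts = {
    "a": "the man of the world",
    "b": "mrs brown and mrs green saw the man",
}
words = top_words(texts, n=3, exclude={"mrs"})
freqs = proportional_frequencies(texts["a"], words)
print(words, freqs)
```

With n=100 and the title Mrs excluded, the same procedure would leave ninety-nine words, each represented by one proportional frequency per text.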

Figure 4.3 PCA Scores Woolf and Stapledon (PC1 vs. PC2)

The PCA test returned the results graphed in Figure 4.3. The works of Woolf and Stapledon are separated along PC1 – Stapledon’s are found to the left, Woolf’s to the right. Three of Stapledon’s works come closer to the middle of the graph, and therefore closer to Woolf’s texts. These are Sirius, Odd John and A Man Divided. The closest text of Woolf’s to the middle of the graph is Orlando and the text furthest to the right is The Years. These results indicate that there are distinct authorial styles which separate Stapledon and Woolf. There is, however, once again a split in Stapledon’s works which is evident in the gap between Stapledon’s future histories, which, along with Last Men in London, are furthest to the left, and Stapledon’s more human-centred novels. In the middle of the gap is The Flames which, along PC2, is closest to one of Woolf’s works, The Waves.
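For readers unfamiliar with how a principal component separates texts, the following minimal sketch extracts a first component from a small table of word frequencies. It is an illustration under stated assumptions, not the procedure used to produce Figure 4.3: the function first_principal_component is hypothetical, it works on the covariance matrix by power iteration (the study's software may standardise the variables or use a different algorithm), and the four rows of frequencies are invented.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) for PC1 of `rows` (texts x words), via power iteration."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix of the word variables.
    cov = [
        [sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(p)]
        for a in range(p)
    ]
    # Repeatedly multiplying by the covariance matrix converges on its leading
    # eigenvector, provided the starting vector is not orthogonal to it.
    vec = [1.0] * p
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * vec[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    scores = [sum(c[j] * vec[j] for j in range(p)) for c in centred]
    return vec, scores


words = ["of", "up"]
# Hypothetical percentage frequencies for four invented texts.
rows = [[4.0, 0.2], [3.8, 0.3], [2.0, 0.9], [2.2, 1.0]]
loadings, scores = first_principal_component(rows)
print(dict(zip(words, loadings)))
print(scores)
```

In this toy case the two words receive loadings of opposite sign, and the first two texts fall on the opposite side of the axis from the last two. The sign of a component is arbitrary, so only the relative positions of texts and words carry meaning.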

It is unsurprising that along PC2 Woolf’s most experimental novel, The Waves, is an outlier from the rest of her novels. Following six characters through different stages of their lives, The Waves is presented through soliloquies which contain high symbolism and an “artificial style of language that is not found in ordinary communication” (Balossi 1). Interestingly, this novel has already attracted a corpus linguistic study in which the personality traits and individual styles of the characters were quantitatively and qualitatively studied (Balossi).

As well as a clear authorial split along PC1, the horizontal axis also indicates that Woolf’s novels appear to cluster closer together in style than Stapledon’s, indicating a more homogeneous style between her texts. A standard deviation test confirms that Stapledon’s texts are more dispersed along PC1 with a standard deviation of 4.1 as opposed to Woolf’s texts at 2.9. However, along PC2 Woolf’s outlying novel is responsible for a higher rate of dispersal, with a standard deviation of 4 along the vertical axis compared to Stapledon’s standard deviation of 3.6. Although Woolf’s corpus is slightly more consistent stylistically, her most experimental work is more an outlier than any of Stapledon’s. As there is again a distinction between Stapledon’s two narrative styles, it appears as though the PCA has once again found differences between the authors that can be related to their narrative styles. Exploring the stylistic similarities of the authors’ two closest works, Woolf’s Orlando and Stapledon’s Sirius, and the two extremes on PC1, Stapledon’s Star Maker and Woolf’s The Years, offers insight into the significance of these patterns in Figure 4.3. First, however, we can turn to the word plot, Figure 4.4, in order to explore which words underlie the stylistic similarities and differences in the two authors.
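The dispersion comparison reported above amounts to taking the standard deviation of each author's scores on a component. A minimal sketch follows; the scores are invented placeholders rather than the values behind Figure 4.3, and the use of the sample (n - 1) standard deviation is an assumption, since the study does not specify which form was applied.

```python
from statistics import stdev

# Hypothetical PC1 scores for two invented authors; not the thesis data.
pc1_scores = {
    "Author A": [-6.0, -2.0, 0.0, 2.0, 6.0],
    "Author B": [3.0, 4.0, 5.0, 6.0, 7.0],
}

# Sample standard deviation of each author's scores along the component.
dispersion = {author: stdev(scores) for author, scores in pc1_scores.items()}
for author, value in dispersion.items():
    print(f"{author}: {value:.2f}")
```

A larger value indicates that an author's texts are spread more widely along that component, which is how the figures of 4.1 and 2.9 are read in the paragraph above.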

As in the test of Stapledon and Wells, the first component is split by prepositions. The extreme on the left is of and, on the right, up. A rather pronounced distinction in this test is that most of the pronouns appear on the right-hand side and dominate the bottom half of the second component.

Figure 4.4 PCA Loadings of 99 Words in Woolf and Stapledon (PC1 vs. PC2)

Taking a closer look at PC2 – the vertical component – the standout for The Waves is a higher use of first person pronouns, which is unsurprising as the narrative is told through first person soliloquies. Similarly, The Flames is told in first person, presented as a letter rather than soliloquies, and is framed with brief narration either side. Also located at the bottom of PC2 is the main verb have, which is opposed to had, located at the top of the graph. Other verbs confirm a present/past split along PC2: is/was and are/were. In addition, the pronouns you and our, words which indicate dialogue, are also located fairly low along the vertical axis.

4.3.1 The Closest: Sirius and Orlando

The two texts closest in style according to the PCA are Orlando and Sirius. Although research to date indicates they have not previously been discussed together, they share elements of the impossible: from the metamorphoses and chronological oddities in Orlando to the talking dog in Sirius. They also share a similar style of narration as both are written as ‘biographies’.

These similarities are apparent in shared stylistic traits. For instance, there is a heavy usage of seemed, particularly when narrating the opinion of the eponymous protagonists: “The frenzy of the Moor seemed to him his own frenzy” (Woolf, Orlando, ch. 1); “But it was Elizabeth herself who snatched Sirius from the jaws of death (as it seemed to him)” (Stapledon, Sirius, ch. 3). The bracketing “as it seemed” indicates a hesitation to narrate the inner thoughts of characters. Indeed, Stapledon’s narrator appears more comfortable narrating Plaxy’s thoughts than those of the talking dog, with more occurrences of “seemed to her”. The usage of so and upon is relatively higher in Orlando than in most of Woolf’s other texts and higher than in Sirius. In terms of conjunctions, Stapledon favours but, which does not appear so frequently in Woolf’s texts. This is far from an exhaustive list of similarities and differences in the styles of Woolf and Stapledon, yet when these words are considered alone we miss one critical difference in the narrative styles of Orlando and Sirius: the difference made by verbs.

The narrative style of the two texts is more fully illustrated by diverting briefly from the list of one hundred words to consider two excerpts from the novels. The first excerpt from Orlando is a fast-paced sequence, appearing very early on:

Orlando looked no more. He dashed downhill. He let himself in at a wicket gate. He tore

up the winding staircase. He reached his room. He tossed his stockings to one side of the

room, his jerkin to the other. He dipped his head. He scoured his hands. He pared his

finger nails. With no more than six inches of looking-glass and a pair of old candles to

help him, he had thrust on crimson breeches, lace collar, waistcoat of taffeta, and shoes

with rosettes on them as big as double dahlias in less than ten minutes by the stable clock

(ch. 1).

Nowhere in Sirius is there such a fast-paced narration. Instead, one of the most significant scenes in Sirius, an instance where Sirius is shot at, is narrated as follows:

Matters were brought to a head by a serious incident. This I recount on the evidence of

Pugh, who had the story from Sirius himself. The man-dog was out on the hills with

one of his canine pupils. Suddenly a shot was fired, and Sirius's companion leapt into

the air, then staggered about yelping. The charge, no doubt, was meant for Sirius; it

winged the other dog. Sirius at once turned wolf. Getting wind of the man, he charged

in his direction. The second barrel of the shot-gun was fired, but the assailant had lost

his nerve; he missed again, and then he dropped his gun and ran to some steep rocks.

(ch. 17)

Where the flight of Orlando is heralded by “Orlando looked no more”, Stapledon’s transition is far longer because, as well as the protagonist, the narrator is also present in the story: “I recount”. The narrator in Sirius interrupts the flow of the narrative in order to give evidential weight for the story. In Orlando, however, the sequence progresses rapidly with use of verbs: “looked”, “dashed”, “reached”, “tossed”, “dipped”, “scoured”, “pared”. Although both are told in the past tense, the narrator in Sirius spends more time positioning the protagonist in a setting, “was out”, and the events within a timeframe, “suddenly”, “at once”, “and then”.

Woolf’s setting in both space and time is accomplished through the participial clause, “He dashed downhill”. This style of narration, short sentences and punchy verbs, mimics the behaviour of the protagonist. Woolf’s sustained pace over the entire paragraph is an attempt to stylistically affect the intensity experienced by Orlando throughout the scene. The paragraph culminates with a final string of short sentences which accentuates the drama of a scene that is predominantly just detailing a protagonist getting dressed: “He was ready. He was flushed. He was excited. But he was terribly late” (ch. 1). In contrast, the climactic events in Chapter 17 of Sirius are rich in detail but are narrated in a delayed fashion; the narrator’s interjections of “no doubt” and of explanation that precede events, “the assailant had lost his nerve; he missed again”, culminate in very little emotion attached to the significant scene.

If we leave the principal component analysis for a moment, it is worth noting that, on occasion, there is concise, stylistic beauty in Sirius. Late in the final chapter, the morning following Sirius’s death is narrated as follows:

Presently the sirens sounded, far down in the villages, steady, sad and thankful. A sheep

called mournfully. Very far away a dog barked. Behind Arenig Fawr the dawn was

already like the glow of a great fire. (ch. 17)

In this moment, the narrator finds freedom to apply personification to the sounds of the sirens and the sheep call; to allude to Sirius’s death in the sound of a dog barking “[v]ery far away”; and to include a simile. Hitherto, the narrator had been bound by the rigours of a factual account of Sirius’s origins, life and death. It is as though Sirius’s death not only freed Robert, the narrator, to pursue his lover Plaxy, but also to be more poetic.

The narrator of Orlando is similarly bound by the “duty of a biographer” to report factually and “plod, without looking to right or left, in the indelible footprints of truth; unenticed by flowers” (Orlando, ch. 2). However, at least stylistically, the narrator of Orlando appears freer with the narration of private moments than does Robert in Sirius and lacks the same level of evidence that Robert continues to claim. Indeed, by Chapter 6, Orlando’s biographer lacks enough information to recount anything other than the months as they go by: “November, December, January” and so on. After narrating a year in this manner, the biographer explains that though this method is “bare” it “has its merits”. Far from being “indelible” or rigorous, the chronology in Orlando is actually impossible (Richardson 388). For not only does the narrative break several laws of nature but the narrator turns from the “duty” of biographer and starts to notice the flowers: “The only resource now left us is to look out of the window. Let us go, then, exploring, this summer morning, when all are adoring the plum blossom and the bee” (ch. 6). Where Woolf’s inclusion of impossible elements results in the impossible narration of private moments and the passing of time, the impossible in Sirius appears to require an additional rigour to the fidelity of reporting a scientific experiment and its outcomes. The outcomes have intrinsically human implications and thus Stapledon strays into what is ‘seeming’ only to illustrate the struggles of the half-human, half-dog protagonist.

4.3.2 The Extremes: Star Maker and The Years

The Years was Woolf’s attempt to “call in all the cosmic immensities” (qtd. in Henry 111). According to Holly Henry, this aim, which was recorded in Woolf’s diary, is present in the novel’s “multiple references to the moon and stars, and the far reaches of intergalactic space” (111). Henry goes on to compare The Years to Stapledon’s Star Maker, both of which were published in 1937, and argues that the emergence of the two novels in the same year was part of a wider cultural response to scientific advances. Visages of telescopic and microscopic worlds are two particular scientific advances Henry argues are present in both novels (116). Henry further links both novels to the work of popular astronomer, James Jeans, which she argues is an “important intersection” between Woolf and Stapledon (117). The link between Jeans and Stapledon is also noted by Crossley (Olaf Stapledon 117). Although they emerged from the same cultural milieu, the PCA found that these two works are the least similar according to the first principal component, the horizontal axis in Figure 4.3.

As already mentioned, the extremes of PC1 are marked by a split in prepositions and also by personal pronouns on the right and impersonal, formal language on the left. The pronoun furthest to the left is the third person plural, their, which fits the scope of Stapledon’s future history works and their intergalactic scale of societies and epochal scales of time. Travelling through time and immense distances in space is achieved using the words found furthest to the left: “But in time I came to be able to live through the experiences of my host with vividness and accuracy, while yet preserving my own individuality, my own critical intelligence, my own desires and fears” (Star Maker, ch. 3, sec. 1). Indeed, the narrator of Star Maker is unclear as to exactly how much time has passed: “I have spent several years on the Other Earth” (ch. 4, sec. 1). Furthermore, although Star Maker encompasses intergalactic communities, describing entire races and their physicality, sociality and evolutionary progress, the narrator is concerned also with these elements in himself: in the preservation of his “own individuality”. Throughout the cosmic exploration, the pursuit remains “the understanding of worlds or human rank, though diverse” (ch. 6).

In comparison, The Years traverses fifty years, not billions, and the protagonists do not experience disembodied space travel. This is not to say that they do not travel at all or that the cosmic reach of the novel is non-existent. On the right-hand side of Figure 4.4, the PCA results indicate that both looked and up are associated with The Years. But these words are not only used to describe heavenward perspectives. Where the words appear together in the text, “looked up” refers to the direction the dog looks, to the frequent process of various characters looking up from their books or their tasks, such as the folding of newspapers, and, occasionally, to looking up to the sky: “She looked up; a slice of yellow light lay round the moon” (1907).

Also on the right-hand side of Figure 4.4 are words required to form internal and external speech: said and thought. Both feature heavily in the novel. The feminine pronouns she and her are fittingly associated with The Years – a novel concerning the inner lives of women. A group of prepositional forms, which are often employed as adverbs, are also clustered at the right-hand end of PC1, a seemingly confusing position for a domestic novel rather than an adventure-based narrative. However, up, out, down, on and over are associated with The Years due to the frequency with which the movement of characters is narrated: characters are regularly moving, whether it is “going up the iron staircase” or “down the stairs”; their positions are often described, “sitting on the terrace”, “[s]he leant over the flaps of the cab looking out”.

As the novel concerns the inner lives of characters as well as their relationships, it is no surprise that it contains a mixture of dialogue, internal thoughts, and description.

As well as in descriptions and the direction of characters’ gazes, the “cosmic immensities” in The Years can be found in the imaginations of the protagonists. Consider, for instance, Sara’s ability to imagine the moon and its terrain:

Stretched flat on her bed, she saw the moon; it seemed immensely high above her. Little

vapours were moving across the surface. Now they parted and she saw engravings

chased over the white disc. What were they, she wondered--mountains? valleys? And if

valleys, she said to herself half closing her eyes, then white trees; then icy hollows, and

nightingales, two nightingales calling to each other, calling and answering each other

across the valleys. (1907)

Sara not only saw the moon through her window but ‘saw’ the moon’s terrain in her mind’s eye. In the same way, Kitty Malone conjures up an image of a naval experiment reported in The Times: “She saw the bright light from the ships on the drawing-room chair” (1880). Eleanor similarly sees her brother: “She found it difficult to write to Edward, seeing him before her, when she took up the pen, when she smoothed the notepaper on the writing table” (1880).

Towards the end of The Years, Peggy reduces the celestial elements she is observing through a window to more familiar items:

There was a row of chimney-pots against the sky. Then the stars. Inscrutable, eternal,

indifferent--those were the words; the right words. But I don't feel it, she said, looking at

the stars. So why pretend to? What they're really like, she thought, screwing up her eyes

to look at them, is little bits of frosty steel. And the moon--there it was--is a polished dish-

cover. But she felt nothing, even when she had reduced moon and stars to that. (Present

Day)

Still, even when likening the stars to everyday items Peggy feels nothing. After an awkward encounter with a man she considers “wretched” Peggy returns her gaze to the stars, only now they “seemed pricked haphazard in the sky”. It is only when her cousin praises Peggy in the presence of Peggy’s father that her view of the celestial bodies is altered:

There, said Peggy, that's pleasure. The nerve down her spine seemed to tingle as the

praise reached her father. Each emotion touched a different nerve. A sneer rasped the

thigh; pleasure thrilled the spine; and also affected the sight. The stars had softened; they

quivered. (Present Day)

Embedded in a novel consisting of domestic affairs that are described and discussed by various characters, the cosmic imaginings of Woolf’s protagonists are devices to focus emotion and to channel contemplation. The celestial bodies are altered by Peggy’s emotions, as though humans have such control over the immensities of space. On the other hand, the cosmological immensities of Stapledon’s future histories are repeatedly altering humanity, from the impact of a smaller planet on the evolution of centaur-like beings in Star Maker to the ultimate end of humanity by destruction of our sun in Last and First Men. Even the emotions of the disembodied narrator of Star Maker are impacted and altered by the immensities of the cosmos.

Thus, although the focus on the cosmos in both Star Maker and The Years is influenced by a shared cultural milieu, the presentation of the cosmos achieves two different effects and thus employs distinct styles. The contemplative protagonists of The Years are in constant dialogue both internally and externally; their domestic lives and familial ties are explored in depth by Woolf. Star Maker, on the other hand, follows the report of its disembodied narrator, and contemplation occurs in past narration rather than dialogue. These styles can be summarised as follows: the cosmic immensity of internal thought corresponding to external events in the dialogue-heavy The Years; and the cosmic immensity of galactic exploration to redefine concepts of individuality through a reporting style of narration that naturally lends itself to a more didactic narrative in Star Maker.

To conclude, it was known prior to this study that Stapledon envied the artistry of Woolf’s style and that in return Woolf admired Stapledon’s ability to achieve his imaginative vision in his fiction. Furthermore, it was Woolf’s aim at one point to write her own “Essay-Novel” (Diary 129), an aim she later abandoned to pursue the subtler expression of her explorations in a more traditional narrative. She recorded the inception of her novel-essay in her diary: “I have entirely remodelled my ‘Essay’. Its [sic] to be an Essay-Novel, called the Pargiters…” (Diary 129). The culmination of these ideas is the novel The Years; the didactic essay elements abandoned (later turned into the book-length essay Three Guineas (1938)), the final version reflects Woolf’s designation of narrative art as not having a “palpable design upon the reader but . . . rather whisper[ing] its truths” (Briggs 81). Indeed, the PCA study of Stapledon and Woolf’s styles found that The Years is the furthest stylistically from Stapledon’s attempts at “essay[s] in myth creation” (Stapledon, Last, Preface). Those closest together were the biographical narratives of the impossible, Orlando’s metamorphosis and the life of Sirius the super dog. The readings informed by these results suggest that the cause for these distinctions and similarities comes down to the dominance of didactic past narration in Stapledon’s fiction, a feature also found in Orlando. It has been noted that Orlando could be interpreted as “a series of brilliant essays on history, fashions, literary periods and sexuality” (Lee 96). Therefore, the closest to Stapledon’s style is one of Woolf’s more essay-like fictions. In contrast, there is an absence of overt didactic prose in The Years where Woolf prioritises dialogue and suggestive allusions and metaphors.

4.4 Conclusion

Through stylometry, what have we learnt about Stapledon’s style as it pertains to his position in the tradition of science fiction, in relation to claims of extreme idiosyncrasy, and the effectiveness of Stapledon’s solutions?

Firstly, it can be said that Stapledon shares some stylistic traits with his predecessor, Wells. However, Stapledon’s most distinct style is found in the form of his future history narratives and in that instance Wells comes closer to imitating Stapledon’s style with The Shape of Things to Come. As well, Stapledon’s protagonist-based narratives also seem to share stylistic elements with Woolf’s fiction, providing quantitative evidence to the suggestion that, as a second-generation science fiction writer, Stapledon is the modernist to Wells’s late-Victorian.

Nevertheless, he is not as truly idiosyncratic as some have claimed.

Stapledon certainly makes stylistic choices that are highly unusual, such as having more didactic elements in even his most human novel, Sirius. His artistic solutions are some examples of his unique literary style as well as evidence of how he, as a second-generation science fiction writer, turned to new solutions rather than rehashing solutions from Wells.

Facing a unique representational problem as a second-generation writer of science fiction – there were no established genre norms – Stapledon ought to be celebrated on account of his solutions. According to Scholes and Rabkin, “science fiction could begin to exist as a literary form only when a different future became conceivable by human beings” (7). The preceding requirement is “perceiving the world in new ways” and, to make it convincing, the imaginative vision must be “radically different from the familiar patterns of the past and present” (Scholes and Rabkin 6, 7). Stapledon certainly had the imaginative scope but did not have a well-established literary template. Nevertheless, his friend Naomi Mitchison, in celebrating Star Maker, offers a clue as to part of the solution: “The thing that I believe you are so immensely good at is convincing detail – almost mechanical detail – about something one knows nothing about and hasn’t even imagined, but which yet you can make absolutely clear” (qtd. in Crossley, “Olaf Stapledon” 28; emphasis in original). This “mechanical detail” was laid out in Star Maker in a deliberate order, as the narrator explains:

I should not have been so surprised by the strangely human character of this creature

had I at this early stage understood the forces that controlled my adventure. Influences

which I shall describe doomed me to discover first such worlds as were most akin to

my own. (ch. 3, sec. 1)

The wandering consciousness of the narrator travels to increasingly alien worlds, as yet unimagined by anyone before Stapledon. As a solution to the increasing distance between the human and the non-human, Stapledon employs a unique device. The narrator first comes across a “quasi-human kind” that is said to resemble “a centaur, with four legs and two capable arms” (ch. 5, sec. 2). Following this discovery, the narrator finds a world “rather smaller than the rest” where “man was not a centaur” (ch. 5, sec. 2). Stapledon’s introduction to this alien is a manner of distancing, describing an entire race as not being the centaurs which were previously described. We are now twice removed from the humanoid creature of the first distant world, therefore three times removed from the reader’s own experience. At this point, Stapledon throws in a familiar image, “with very large rumps, reminiscent of the Victorian bustle” (ch. 5, sec. 2), before moving to describe a race that is most removed from what is known: a race descended from “a sort of five-pronged marine animal” (ch. 5, sec. 2). It is interesting to note that Stapledon begins his descriptions with indirect references: what something is not or what something is descended from. The mechanical detail, at least as it relates to the description of alien forms – a feature for which Stapledon is celebrated (Jameson 124) – is in the tight control of Stapledon’s narrative progression, always aiding the reader as the worlds progressively become more distant from our own. Although cumbersome to contemporary readers, it was an important strategy for Stapledon’s generation.

Although much can still be explored in relation to Stapledon, Wells and Woolf, looking to future stylometric investigation there is one particularly important question: the link between Stapledon and the genre science fiction writers who followed him. A number of commentators argue that Stapledon is responsible for “seeding” the genre with ideas: Sam Moskowitz, for instance, points to several tropes that did not appear in the pulp science fiction stories until after the publication of Last and First Men and Star Maker (Moskowitz 270). Indeed, the invented, or the popularised, tropes that have been attributed to Stapledon include invading alien life forms (Jameson 124), wars that occur on a galactic scale, the organisation of galactic empires (Moskowitz 270) and notions of symbiosis and community (Smith, “Introduction” ix). In addition, Stapledon appears to have been one of the first to explore virtual reality technology in his 1937 novel, Star Maker: “It worked not through the sense organs, but direct stimulation of the appropriate brain-centers” (ch. 3, sec. 2). Thus, the scope of Stapledon’s imagination certainly lends itself to the claim that his fiction contains some of the earliest examples of tropes and this in turn “seeded” the genre with a plethora of ideas. However, such claims are yet to be thoroughly investigated.

Scholars often nod to Stapledon’s influence. Take for example Eric Rabkin’s brief reference to the “seed” for Arthur C. Clarke’s novel, Childhood’s End (1953), which Rabkin argues can be found in Stapledon’s 1935 novel, Odd John (Rabkin 247). Further stylometric studies could therefore incorporate works by authors such as Clarke, Isaac Asimov and even Ursula Le Guin to more thoroughly explore links between Stapledon’s style and the subsequent generations of science fiction writers.

As can be expected from a key figure of the genre, Stapledon was prescient. His fiction is still relevant to contemporary science fiction interests, such as studies concerning how the “Other” is represented in science fiction (Rieder Colonialism; Vint; Gordon). This chapter has demonstrated how Stapledon’s style communicates his vision of other beings, oscillating from certainty to ambiguity depending on the particular vision. This chapter has also considered Stapledon’s stylistic relation to Wells and Woolf. Pitted against a forefather of science fiction and a preeminent modernist, Stapledon’s attempt to be a novelist of science fiction in an era of modernist experimentation and vision gave rise to a style which remained determinedly separate from that of both of his mentors. As well as being “unemotional, unrhetorical, dry, concise, [and] abstract” (Stapledon qtd. in Crossley, “Olaf Stapledon” 27), Stapledon’s style is more didactic than the style of both Wells and Woolf – expositional and full of details. Rather than Wells’s positivism and confidence, and Woolf’s emotional and internal probing, Stapledon’s fiction travels deeper into cosmic mysteries, boldly attempting to stand outside the usual human experience in order to report on what could be experienced beyond ourselves – these are ideas that were previously unseen and are otherwise unknowable. His style, however, does not venture far beyond his penchant for unemotional and dry language, which is perhaps for the best. Without the security of Wells’s more finite utterances, Stapledon’s attempt to tackle the whole of what is unknown rests on the consistency of his unrhetorical and concise style. The stylometry in this chapter gives us some fixed points in determining just how consistent, and distinctive, that style is.


Stylistic Variation in the Harry Potter Sequence

Introduction

It is widely acknowledged that from the first book to the seventh, J.K. Rowling’s fantasy sequence, Harry Potter (1997-2007), undergoes significant stylistic changes. These changes occur on several levels but are generally related to the increasing complexity of the novels, changes to plot structure and narrative strategies, and even variations in reading levels. This chapter investigates whether these changes can be observed in the underlying structure of Rowling’s language, specifically whether there is stylistic variation to be found in the pattern of use among common words.

This problem is explored through three questions on the nature of Rowling’s style. The first is also the most straightforward: does the style in the sequence change according to chronological order? Other computational studies have found that an author’s style can change, in a chronological progression, over the course of their career (Hoover, “Corpus”; Craig, “Contrast”), but as far as research shows there have been no stylometric studies concerning variations in book series. As certain aspects of change in the Harry Potter sequence were planned from the outset, this question takes on another interesting dimension. It is known that Rowling planned for there to be seven books, commenting later that seven is a “magical number” (qtd. in Nel 32). Through seven volumes, Rowling accommodates the seven years in the British high school system thus allowing the protagonists to age one year in each book. Furthermore, some commentators have noted that the increasing complexity of the novels indicates Rowling’s desire to both maintain and educate her readers as they age (Webb, Fantasy 43). Not only do the protagonists age seven years from the first to the last book but the implied reader does too. The question is whether or not these chronological changes can also be observed as a chronological change in style. This first question will be answered by the second, which is a more complex query about Rowling’s style: would a shift in style speak to Rowling’s maturing, developing authorial voice or to other changes, such as changes in narrative aim and themes? In order to explore this question the quantitative study looks at changing stylistic preferences from the first to the last book – do Rowling’s stylistic choices undergo a measurable change that sits apart from other changes such as the maturing of her characters? To explore this the sequence is studied by chapter as well as by book, which provides a larger data-set (199 chapters compared to seven books) and thus an avenue for exploring the relationships between style and narrative purpose in more depth. The sequence is also broken into sections of direct speech and the rest of narration to explore further variations in Rowling’s style. Finally, this study interrogates whether any changes in Rowling’s sequence can also be found in other book series by applying the same computational study to C.S. Lewis’s fantasy series The Complete Chronicles of Narnia (1950-1956) and Diane Duane’s fantasy series Young Wizards (1983-2016). This last analysis demonstrates that the stylometric study of Rowling’s style is not an artefact of the methodology by showing a contrasting finding and a comparable finding, which together indicate that stylistic variation in a book series is not an automatic trait of the form but relies on other textual elements and artistic purposes.

Concerning Harry Potter, the variations from book to book have been attributed to a shift in genre, from children’s fiction to young adult fiction (Levy and Mendlesohn 133). Some commentators have offered explanations for how Rowling’s strategies change to accommodate this increased complexity. Kate Behr, for instance, argues that there are transformations throughout the Harry Potter sequence, both thematically and in terms of narrative structure. As Harry’s knowledge of the wizarding world increases, new details become less frequent and are not as absorbing in the late books where “the reader’s attention has shifted from Harry’s environment to Harry himself” (Behr 265). Farah Mendlesohn takes this a step further to argue that the changes across the sequence are a progression in the type of fantasy: “[the] sequence becomes less ‘portal’ with each book, and by the end has effectively reversed the trajectory so that it is ‘our’ world that seems to be the portal into which Harry occasionally enters” (“Fantasy” 31). Mendlesohn also observed that similar changes occur in individual volumes of the sequence (Rhetorics 32). This suggests that a similar shift may occur in the language of each book as well as the sequence as a whole. Employing the concept of the mirror rather than portal, Behr similarly argues that there is a reverse trajectory: “[a]lthough the wizarding world seems to be the shadowside of the Muggle, mimicking its concerns, bureaucratic structure, and weaknesses, narrative transformation makes it more concrete than the culture whose discourse, traditions, and customs it mirrors” (261, emphasis in original). Caroline Webb suggests that Rowling’s moral and socio-political vision for the sequence is illuminated through both the “development of the action” and “Rowling’s changing narrative strategy” (“‘Abandoned Boys’” 15). Part of this narrative strategy is a shift from the external narrating of facts to the projection of the reader “directly into Harry’s consciousness” (Webb, “‘Abandoned Boys’” 16). Karin E. Westman accounts for this shift by explaining Rowling’s use of a “limited omniscient point of view” in an increasingly complex fictional world (147). Rather than a truly omniscient narrator, the narrative is focalised through Harry, whose understanding of the world becomes more complex as he becomes old enough to better comprehend the forces at work in the magical world. According to Westman, the limited omniscience can explain “the varying tones and emphases of each book” because the implied reader’s “frame of reference for each book is dynamic, expanding to account for Harry’s new knowledge of his world rather than fixed from the start of the series” (147, 148). In terms of tone, Westman points to the relatively joy-filled narrative in Philosopher’s Stone as compared with the later books (147), and Behr similarly suggests that the tone changes from “wonder, innocence, and comedy to fear, experience, and tragedy as the series progresses” (Behr 263). Indeed, the first three books are often described as “cosy” (Webb, “‘Abandoned Boys’” 15) and “comic” (Behr 262), an atmosphere which Rowling transforms into “disturbing” (“‘Abandoned Boys’” 15) and “menacing” (Behr 262) from the fourth book onwards. Such narrative strategies and corresponding differences in tone explain changes that occur on the level of narrative structure, but what about the underlying structures of Rowling’s language – the usage of common words?

This chapter identifies further measures of stylistic change in the Harry Potter sequence that are uncovered in the course of the computational study. However, before moving onto any new measure, let us examine the variation in two familiar measures: grade level and length.

The Flesch-Kincaid Grade Level test is employed here as an indicator of the reading level of each book as it takes into account the length of both sentences and words, though not of whole texts.1 Figure 5.1 plots the resulting grade level of each Harry Potter book along an axis, clearly showing that although there is an increase in grade level from level 4.9 to level 6.3 the increase is not in chronological reading order.2 The simplest book in terms of grade level is Harry Potter and the Philosopher’s Stone (1997) and the most difficult book is the sixth volume, Harry Potter and the Half-Blood Prince (2005). The lengths of the novels are plotted in Figure 5.2 which shows that the shortest volume is the first book and there is a gradual increase until the fourth book, Harry Potter and the Goblet of Fire (2000), which experiences a sudden jump in length. Although the length of the novels does not increase in a strictly chronological order, the length and the order are correlated.3 Further, when compared, the two measures are correlated indicating that there is a relationship between length and grade level.4 Therefore, although neither measure indicates a consistent chronological change the statistical correlation between the two measures indicates a link between length and grade level.

1 In order to arrive at the final grade level, the formula incorporates constants and is as follows: 0.39 × (Total Words/Total Sentences) + 11.8 × (Total Syllables/Total Words) – 15.59 (Kincaid et al.).

2 However the correlation between the grade level and the order of the books is within a significant threshold (p < 0.05): r = 0.006, p < 0.02. Where r is the correlation coefficient and p is the directional probability.

3 r = 0.8, p < 0.02.

4 r = 0.7, p < 0.05. It should be noted that length does not impact the Flesch-Kincaid measure and therefore this relationship is a correlation between otherwise unconnected factors.
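The grade-level formula cited from Kincaid et al. can be illustrated with a minimal Python sketch. The sentence splitter and the vowel-group syllable counter below are simplifying assumptions made for illustration only; they are not the tool used to produce Figure 5.1, and a dictionary-based syllable count would give somewhat different grade levels.

```python
import re


def count_syllables(word: str) -> int:
    """Rough estimate: one syllable per group of consecutive vowels (an assumption)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59."""
    # Naive sentence split on terminal punctuation (assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)
```

Because the formula uses only the two ratios, doubling the length of a text without changing its average sentence and word length leaves the score unchanged, which is why length and grade level are treated as otherwise unconnected measures.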

Figure 5.1 Grade Levels of all seven Harry Potter Books

Figure 5.2 Book Length of all seven Harry Potter Books

The following computational study more thoroughly explores whether Rowling’s crafting of Harry Potter, in a sequence of seven books, is expressed at deep levels of the language by asking what other measures of style can account for variation in the sequence. For instance, are there some common words that are used more frequently in the late novels? If so, what artistic goals do these words achieve?


New Measures of Variation

To begin, the one hundred most frequently used words in the entire sequence were counted and salutations, titles and names were removed, including Harry, Ron, Hermione, Dumbledore, Professor, Hagrid, Snape and Weasley. This left ninety-two common words which are listed in full in Appendix B.4. Using the proportional frequency of these ninety-two words, the PCA then returned a first component that accounts for 35.5% of the variance. On this component, the largest difference in the sequence is between Philosopher’s Stone (the first book) and Half-Blood Prince (the sixth book). The results are plotted in Figure 5.3 along with PC2 – which accounts for 21.4% of the variance in the dataset. In between the two extremes, the books are plotted along the horizontal axis (PC1) in the following order: Prisoner of Azkaban (3); Chamber of Secrets (2); Goblet of Fire (4); Order of the Phoenix (5); Deathly Hallows (7). Notably, the first four books are on the left-hand side of the vertical axis and the last three are on the right-hand side, which indicates that the stylistic shift identified on this component of the PCA is correlated to the chronological progression of the sequence.5 The middle book, Goblet of Fire, is found on the left-hand side of the y-axis and is not particularly distinct from the early books but positioned centrally along the component. For all the changes that first occur in Goblet of Fire such as a sudden increase in length, a divergence from the plot structure of the first three and the first death of a character, this volume is a mid-way point in terms of style – it forms part of the progression observed in the style of the sequence but is not dramatically distinguished from the first three books.
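To make the procedure concrete, the following Python sketch computes a first principal component from a texts-by-words table of proportional frequencies. It is an illustration only: it assumes an unstandardised, covariance-based PCA solved by power iteration, and the function name `first_component` is hypothetical. The software and scaling options actually used for Figures 5.3 and 5.4 are not specified here and may differ.

```python
import math


def first_component(rates, iterations=200):
    """Return (loadings, scores, explained) for the first principal component.

    rates: one row per text, one column per word (proportional frequencies).
    """
    n, m = len(rates), len(rates[0])
    means = [sum(row[j] for row in rates) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in rates]
    # Sample covariance between word columns (unstandardised: an assumption).
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    # Power iteration converges on the eigenvector with the largest eigenvalue.
    # The start vector is non-uniform because a uniform vector can lie in the
    # null space when every row of proportions sums to the same constant.
    vec = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0:
            raise ValueError("no variance in the data")
        vec = [x / norm for x in nxt]
    eigenvalue = sum(vec[i] * sum(cov[i][j] * vec[j] for j in range(m))
                     for i in range(m))
    total_variance = sum(cov[i][i] for i in range(m))
    # A text's score is its centred row projected onto the loadings.
    scores = [sum(r[j] * vec[j] for j in range(m)) for r in centred]
    return vec, scores, eigenvalue / total_variance
```

In this sketch the loadings play the role of the word positions in a loadings plot and the scores the role of the book positions in a scores plot. The sign of a component is arbitrary, so which end counts as "low" or "high" carries no meaning in itself.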

Even though the variation in style does not occur in a strict chronological change, the underlying style of Rowling’s Harry Potter sequence, according to PC1, does change according to the progression of the novels. This leads us to the question: which of the ninety-two words included in this study have caused the stylistic split between the first and the sixth book and how do those words function in the narrative? In order to explore this question, we can consider the word loadings for the ninety-two common words, which are plotted in Figure 5.4.

5 r = 0.94, p < 0.001.

Figure 5.3 PCA Scores of the Seven Harry Potter Novels (PC1 vs. PC2)


Figure 5.4 PCA Loadings of 92 Common Words in the Seven Harry Potter Books (PC1 vs. PC2)

According to the left-hand side of the graph, the low end of PC1 is dominated by prepositions and other words which denote position and movement in time and space. The seven lowest scoring words along PC1 are back, up, out, off, got, was and get which indicate that the texts scoring low on PC1 in Figure 5.3 have a relatively higher usage of prepositions denoting movement, verbs indicating possession and events that have occurred in the past. For example, “they started to run back up the passage… it was coming from the chamber… Harry got to his feet” (Philosopher’s, ch. 10); “how to get past Fluffy…anything to get rid of Norbert” (Philosopher’s, ch. 14); “Harry saw at once that it was a diary… it was fifty years old…was looking over Harry’s shoulder” (Chamber, ch. 13). Along with prepositions, the third person pronouns they, their and them are also located on the left-hand side of Figure 5.4. These results imply that the style of the books on the left-hand side, the early volumes in the sequence, has a relatively high concentration of words related to the narration of action sequences and groups.


In contrast, the right-hand side is dominated by a larger variety of word types which include, but are not limited to, modal, auxiliary and main verbs: would, that, not, did, is, to and who. The homograph that is used in several ways although it is frequently found in the direct speech of Dumbledore: “given everything that has happened to you … never forget that what the prophecy said is … It is essential that you understand this … only protection that can possibly work against … a mirror that reflected your heart’s desire” (Half-Blood, ch. 23). The ellipses in this quote represent extensive explanations that Dumbledore is offering to Harry concerning not only the discoveries contained in the book, Half-Blood Prince, but also concerning events from the first novel such as the Mirror of Erised from Philosopher’s Stone. As well as that, other words from the high end of PC1 are found in the above fragments of Dumbledore’s language such as you, your and to. In between these fragments, other words from the right-hand end of PC1 can be found: “the one who would challenge him”, “not immortality or riches. Harry, have you any idea how few wizards could have seen what you saw in that mirror”. The words have, that and modal verbs are not only found in direct speech but also in Harry’s internal speech: “Harry had the impression that the words shocked Slughorn himself … Harry had been sure Slughorn would be one of those wizards who could not bear to hear Voldemort’s name spoken aloud” (Half-Blood, ch. 4). As such, the results in Figure 5.4 indicate that the books weighted on the right-hand side of PC1 have a relatively higher usage of words used in deictic reference, “that mirror”, in the expression of possibilities, “one who would”, hypothetical situations, “few wizards could have seen” and negation, “not immortality”. Therefore, a preliminary finding from the stylometric study of Harry Potter indicates that stylistic variations do occur from the first books to the later books and that the change appears to be due to a shift from language concerning movement and possession to language concerning specific information and the speculation of possibilities; from the past narration that concerns groups to past narration and speech that relays discussions and Harry’s inner thoughts.


These preliminary observations can be confirmed by further examining the context of these words in the texts. This is approached in the following discussion through the study of words at the extreme ends of PC1 in Figure 5.4 and is in three parts: the study of prepositions, the patterns of personal pronouns across PC1, and finally, the different usage rates of contractions and uncontracted forms.

5.2.1 Three Stylistic Variations

5.2.1.1 Changes in the usage of prepositions

Weighted on the low end of PC1 is the preposition out which can be contrasted with a high scoring preposition on the right-hand side, for. In order to investigate the contexts in which these words are used, a concordance search was conducted for clusters of four words that include either out or for and occur at a frequency of five times or more in each book studied, which in this instance were the books at the two extremes of PC1, Philosopher’s Stone and Half-Blood Prince.6
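The cluster search described above can be approximated in a few lines of Python. This is a hedged sketch and not a reproduction of the concordance tool's behaviour: the tokenisation rule, the decision to let clusters run across sentence boundaries, and the function name `clusters` are all assumptions introduced for illustration.

```python
import re
from collections import Counter


def clusters(text, keyword, size=4, min_freq=5):
    """Return (cluster, count) pairs for word clusters that contain `keyword`.

    Only clusters occurring at least `min_freq` times are kept, most
    frequent first.
    """
    # Lower-case word tokens; internal apostrophes are kept (assumption).
    tokens = re.findall(r"[a-z]+(?:['’][a-z]+)?", text.lower())
    counts = Counter(
        " ".join(tokens[i:i + size])
        for i in range(len(tokens) - size + 1)
        if keyword in tokens[i:i + size]
    )
    return [(gram, n) for gram, n in counts.most_common() if n >= min_freq]
```

Called as `clusters(book_text, "out")` on the full text of a volume, where `book_text` is a hypothetical variable holding that text, the function returns a ranked list comparable in form to the cluster tables in this section.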

Considering first the context for the preposition out in Philosopher’s Stone, the search returned four clusters which appear in Table 5.1. The two most frequent are “out of the way” and “turned out to be”. The latter phrase is used primarily in the narration of Harry’s initial encounters with the wizarding world: “the Potions lesson turned out to be the worst thing”, “Quirrell’s lesson turned out to be a bit of a joke” (ch. 8). Indeed, Harry’s entry to the wizarding world includes “finding out” a lot of new information: “There was a lot more to magic, as Harry quickly found out, than waving your wand and saying a few funny words” (ch. 8). The other two frequent collocations with out indicate movement, specifically out of windows and beds. This is not surprising as other prepositions denoting movement, up, off, on, are also scored at the low end of PC1.

6 The concordance tool used was Wordsmith.

Table 5.1 4-grams with Frequency ≥ 5 in Philosopher's Stone

Rank   4-gram               Frequency
1      out of the way       12
2      turned out to be     7
3      out of the window    6
4      out of bed and       5

A search of out in Half-Blood Prince revealed that this preposition also collocates most frequently in the phrase “out of the way” but only once in the phrase “turned out to be”: “they grudgingly started Snape’s homework. This turned out to be so complex that they still had not finished when Hermione joined them” (ch. 9). Although the phrase “out of the way” appears fourteen times in Half-Blood, this volume is more than twice as long as Philosopher’s Stone which means that the phrase is used proportionally more frequently in the first volume. The PCA results, confirmed by concordance searches, indicate that out occurs relatively more frequently in the first book than the sixth but is found in similar contexts throughout the series. Therefore, the style of narration around certain events does not change greatly but the proportion of text that calls for this style of narration does change from the first volume to the sixth.

Turning to the preposition for, a concordance search in Half-Blood Prince shows that for occurs most often in the phrase “for the first time”. Table 5.2 shows the six highest ranked clusters of four words with for. This table indicates that for is used most often in reference to time as four out of six clusters include the word “moment” and all ten of the occurrences of “for a very long” are followed by “time”: “Harry lay awake for a very long time in the darkness” (ch. 14). Before any subsequent conclusions can be drawn about the style of Half-Blood, it must be noted that these rankings of collocations are, once again, not unique to the sixth volume: “for the first time” is also the highest ranking cluster in Philosopher’s Stone. However, “for a moment” occurs only seven times in the first volume which indicates that although the context of usage is similar between the first and the sixth books, one primary difference is the frequency, which is not solely attributed to the increasing book length as the proportional frequencies also differ: the sixth volume has a higher rate of for, particularly in the phrase “for a moment”. This trend can, however, be linked to an increasing narrative focus on the behaviour of characters and Harry’s inner dialogue, as when searched in the texts of Half-Blood Prince, the phrase “for a moment” is often preceded by verbs such as “thought” and “wondered”.

Table 5.2 4-grams with Frequency ≥ 5 in Half-Blood Prince

Rank   4-gram                 Frequency
1      for the first time     20
2      for a moment then      14
3      for a very long        10
4      him for a moment       6
5      thought for a moment   5
6      for a moment he        5

Table 5.3 Context of "for a moment" in Philosopher's Stone

No.   Context
1     Dudley thought for a moment. It looked like hard work.
2     … For a moment it looked as though she might faint …
3     Her eyes lingered for a moment on Neville’s cloak …
4     … could I borrow Wood for a moment?’ Wood? thought Harry, bewildered …
5     For a moment, he was sure he’d …
6     ‘Don’t talk to me for a moment,’ said Ron …
7     … couldn’t feel them – for a moment he could see nothing but dark fire …

In contrast, in the seven times the phrase “for a moment” appears in Philosopher’s Stone, there is only one instance where it is in the context of a character’s inner speech – and that character is not Harry but his cousin, Dudley. Table 5.3 lists all seven occurrences appearing in context in Philosopher’s Stone and is included to illustrate a contrast to seven occurrences, selected from a random starting point in Half-Blood Prince, included in Table 5.4.


Table 5.4 Context of "for a moment" in Half-Blood Prince

No.   Context
1     Hermione sat in thought for a moment and then said …
2     … and Harry wondered for a moment whether he was dead.
3     Harry stared at these words for a moment. Hadn’t he once, long ago, heard of bezoars?
4     Harry wondered, for a moment, whether he was going to shout at him.
5     Harry and Slughorn watched him. For a moment, Ron beamed at them.
6     … after muttering incomprehensibly for a moment he merely started snoring.
7     For a moment Harry thought Voldemort was not going to let go of it …

One of the main differences apparent in the contexts from the two volumes is the increased number of references to Harry’s thoughts in Half-Blood Prince. This is not to say that Harry’s thoughts are not narrated in the first book. Example No. 4 in Table 5.3 shows clearly a moment of Harry’s inner speech being indirectly narrated, “Wood? thought Harry,” and a less direct instance can be seen in example No. 5. In Table 5.4, however, three of the seven examples, No. 2, 3 and 7, are narrating Harry’s inner speech. The first example in Table 5.4 narrates Hermione’s action of sitting “in thought” but not her inner speech. The concordance search of both Philosopher’s Stone and Half-Blood Prince indicates that both novels employ “for a moment” mostly to narrate brief passages of time that indicate hesitations and pauses in the narrative; however, the phrase is used with a higher proportional frequency in the sixth volume. Furthermore, the contexts of the phrase in Half-Blood Prince suggest that one difference between the first and the sixth book is an increased focus on Harry’s inner speech.

In summary, studying just two examples of prepositions, out and for, demonstrates broader trends in Rowling’s stylistic choices. Although there is change, as demonstrated by the PCA results in Figure 5.3, Rowling also has some consistent stylistic choices. For instance, the expression, “out of the corner of his” appears throughout the series whether it be the corner of a mouth or an eye. There are other phrases that drop off in frequency, such as the phrases “turned out to be” and “for the first time”. These phrases are frequently employed in the first book when Harry is a native to suburban London and not to the wizarding world. The longer phrase, “for the first time in his life”, occurs four times in the first novel but only once in the sixth and when it does it is in relation to more advanced magic: “Harry realised that he had just Apparated for the first time in his life”. Therefore, it could be said that these features indicate that as the sequence progresses the word usage also changes to account for an increasing focus on the behaviour and inner thoughts of characters. Furthermore, the results indicate that such a focus occurs with relatively more frequency in Half-Blood Prince than does the action of protagonists climbing out windows or beds, or other actions, as suggested by the PCA as dominant features of the first book. However, all seven of the Harry Potter books are action-based narratives, which accounts for why prepositions are present throughout the sequence.

5.2.1.2 Changes in the usage of personal pronouns

Another indicator of change in the series may be observed in the variation of personal pronouns. Weighted on the left-hand side of PC1 are three third person plural pronouns: they, their, them. In contrast, the personal pronouns scoring high on the right-hand side of PC1 are a mixture of first and second person pronouns, singular and plural, together with the feminine third person pronouns: I, she, her, we, you, your. In the middle of the graph are personal pronouns that are employed in all seven volumes in a way that is not distinguished by PC1 including: me, him, my, his, and he. It is interesting that the male gendered pronouns are not distinguished by the measure of PC1 but the feminine pronouns scored highly. This suggests that there is a more constant usage of the male pronoun throughout the sequence than feminine pronouns which is not a surprising result for a sequence with a male protagonist. The other interesting pattern in this spread of pronouns is that pronouns appearing in direct speech are found at the right, I, we, you, and your. However, me and my are not used in a distinctive manner at either end of the component. Investigating the context of these personal pronouns reveals the reason why PC1 has split the third person pronouns to the left, the first and second person pronouns to the right and the feminine pronouns to the right as well.

Focusing first on the left extreme, the third person pronouns that are employed in the first book refer to characters in groups such as the Dursleys, or Harry and his fellow first-year classmates, Ron and Hermione, who together attempt to solve a mystery: “They had indeed been searching books for Flamel’s name … because how else were they going to find out what Snape was trying to steal?” (Philosopher’s, ch. 12). The third person plural pronouns are not used as relatively frequently as the sequence progresses because as Harry ages he increasingly acts alone. For instance, in the fourth book, Goblet of Fire, Harry faces the challenges of the Triwizard Tournament alone and finds himself, at times, socially ostracised from Ron and his other classmates. Ostracised, Harry attempts to defend himself and uses me and my in direct speech: “‘Great,’ said Harry bitterly. ‘Really great. Tell him from me I’ll swap any time he wants. Tell him from me he’s welcome to it … people gawping at my forehead everywhere I go …” (Goblet, ch. 18). However, rather than Harry’s, it is actually Voldemort’s speech that appears to have a heavier concentrated frequency of me. The following fragments are from chapter one in Goblet of Fire: “Move me closer to the fire”, “fetch me a substitute”, “you don’t want me to spoil the surprise”. In this instance, the study of pronouns translates to a study of character – Voldemort is more concerned with himself than the other characters. In addition, Goblet of Fire has the highest rate of my in the corpus, which can also be explained by two features of the book: Harry’s first time in the sequence as a socially ostracised adolescent and therefore a higher focus on himself in his inner and direct speech, and one of the first prolonged glimpses of Voldemort, where his servants refer to him as “my Lord”.

The words on the right-hand side that are associated with direct speech can be accounted for by the increased need for explanation as the sequence progresses. In particular, authority figures such as Dumbledore increase the amount of direct instruction and explanation they offer Harry. An excerpt from one of Harry’s private lessons with Dumbledore in Half-Blood Prince demonstrates the frequency of second and first person pronouns in the direct speech:

‘– which, naturally, made you forget all about trying to retrieve the memory; I would have expected nothing else, while your best friend was in danger … I would have hoped that you returned to the task I set you. I thought I made it clear to you how very important that memory is. Indeed, I did my best to impress upon you that it is the most crucial memory of all and that we will be wasting our time without it.’ (ch. 20)

Also present in this excerpt is first person plural, we, and, although not counted as one of the one hundred most common words, the pronoun our. When compared to the explanations Dumbledore offers Harry earlier in the sequence it is evident that second person pronouns are employed even in the first book: “Your mother died to save you. If there is one thing Voldemort cannot understand, it is love. He didn’t realise that love as powerful as your mother’s for you leaves its own mark …” (Philosopher’s, ch. 17). The difference between the first and the sixth book in this regard is the amount of explanation delivered by direct speech: in Half-Blood Prince it forms a considerable portion of the text, while the explanation Dumbledore offers to Harry at the end of the first five volumes is considerably shorter.

In terms of the gendered pronouns, when the ratio of the feminine pronouns to male pronouns is examined it is revealed that the rate of feminine pronouns does increase as the sequence progresses. In Philosopher’s Stone the female pronouns account for only 12% of all gendered pronouns but in the second and third books it increases slightly to 15%. The fourth book increases again to 18% before there is a significant jump in the usage of female pronouns to 25% in the fifth book, Order of the Phoenix. In fact this is the highest rate of the entire sequence: the sixth book has a rate of only 20% feminine and the final book has a rate of 21%. The rate of feminine pronouns doubles as the series progresses, a fact that can be explained as more female characters have more prominent roles. Some of these characters include Ginny, who arrives at Hogwarts in the second volume, Rita Skeeter, a journalist in the fourth book, and Luna Lovegood, who appears from the fifth book onwards. However, there is quite an obvious reason for why the feminine pronouns suddenly account for a quarter of all gendered pronouns in the fifth book, and that is the role of the female villain – Dolores Umbridge.
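The percentages above express feminine pronouns as a share of all gendered pronouns. A minimal Python sketch of that calculation follows; the two pronoun sets are an assumption made for illustration, since the exact word forms counted for these figures are not listed here.

```python
import re

# Assumed pronoun inventories; the forms actually counted may differ.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}


def feminine_share(text: str) -> float:
    """Feminine pronouns as a proportion of all gendered pronouns in `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    feminine = sum(token in FEMININE for token in tokens)
    masculine = sum(token in MASCULINE for token in tokens)
    if feminine + masculine == 0:
        raise ValueError("no gendered pronouns found")
    return feminine / (feminine + masculine)
```

Expressing the count as a share of gendered pronouns, rather than of all words, keeps the measure independent of book length.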


Therefore, the split of pronouns along PC1 indicates a variation in the Harry Potter sequence related to the increasing alienation of Harry from his peers, an increase in the amount of speech and explanations and the introduction of a female villain and more broadly a larger focus on female characters in the narratives.

5.2.1.3 Changes in the usage of contractions

Another measure of change in the sequence that is evident in Figure 5.4 is the position of the contraction don’t at the middle of PC1, while the components of an uncontracted form, did and not, sit at the far right-hand end of the horizontal axis. As contractions are generally considered informal (Huddleston and Pullum 91, 800), a higher presence of the uncontracted form “did not” could indicate more formal language in the later books. However, the evidence from Figure 5.4 alone is not very compelling as did and not were counted separately and not together, the equivalent contraction, didn’t, was not counted at all and the contraction don’t appears in the middle of PC1 indicating that it does not account for stylistic variation in the sequence. However, a brief study of more contractions and the uncontracted forms reveals that there is indeed an increase in formality and suggests that did and not frequently appear together in the late books, primarily in Dumbledore’s speech.

In Philosopher’s Stone, the contraction didn’t occurs 194 times while did not appears only five times. Out of all occurrences in the first book, only 2.5% are the more formal version, did not. In Half-Blood Prince, however, the uncontracted version accounts for more than half of all occurrences (59%): where either didn’t or did not could occur in Half-Blood Prince, it is the uncontracted form that is preferred. A brief survey of other contractions and their non-contracted counterparts reveals that this pattern is not limited to didn’t/did not.
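The percentage quoted for Philosopher’s Stone can be checked with a few lines of arithmetic. The following is a minimal sketch only: the function name `uncontracted_share` is hypothetical and is not part of the tools used for this study; the two counts are those reported above.

```python
def uncontracted_share(contracted: int, uncontracted: int) -> float:
    """Percentage of all occurrences of a pair that use the uncontracted form."""
    total = contracted + uncontracted
    if total == 0:
        raise ValueError("no occurrences of either form")
    return 100.0 * uncontracted / total


# Counts reported above for Philosopher's Stone: didn't = 194, did not = 5.
share = uncontracted_share(contracted=194, uncontracted=5)
print(f"{share:.1f}%")  # prints 2.5%
```

The same function could be applied to any other pair (could not/couldn’t, would not/wouldn’t) once the two counts for a volume are known.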

The ratio of non-contracted to contracted forms does not increase until the fifth novel, Order of the Phoenix. From Goblet of Fire to Order of the Phoenix the ratio of could not increases from 14% to 31% and the ratio of would not from 13% to 43%. Turning to other forms, the changes occur more gradually and are seen when comparing the first book with the last. For instance, the ratio of do not in Philosopher’s Stone is 2% and by Deathly Hallows it is 10%.

Contractions are often still the preferred form overall but the ratio in which the uncontracted form is used increases as the series progresses, suggesting an increase in the formality of the language.

Significantly, in the first four books, the uncontracted forms will not and do not are only used by adult characters and always appear in direct speech – that is, until the end of the fourth novel when Voldemort attempts to control Harry with a curse. Harry’s inner struggle is narrated briefly:

I will not, said a stronger voice, in the back of his head, I won’t answer …

Just answer ‘no’ …

I won’t do it, I won’t say it …

Just answer ‘no’ …

‘I WON’T!’

And these words burst from Harry’s mouth. (Goblet, ch. 34)

The “stronger voice” places emphasis on his resistance to the controlling curse, “I will not”, and it is repeated as “I won’t”. Not only is it rare for one of the children to use an uncontracted form, it is rare for Dumbledore to use contractions. Although some adults tend to revert to a less formal speech when talking to the children – “‘You’ll see me very soon, Harry,’ said Sirius” (Goblet, ch. 36) – Dumbledore retains a formal diction in most settings, with one notable exception.

At the end of the sixth book, Dumbledore drinks a potion that tortures him and his speech is, for the first time, riddled with contractions:

‘I don’t want … don’t make me …’

Harry stared into the whitened face he knew so well, at the crooked nose and half-moon spectacles, and did not know what to do.

‘No, no, no … no … I can’t … I can’t, don’t make me, I don’t want to …’

[…] ‘Don’t hurt them, don’t hurt them, please, please, it’s my fault, hurt me instead …

(Half-Blood, ch. 26)

Dumbledore’s first words after this ordeal return immediately to his usual diction: “Quite understandable … One alone could not have done it” (ellipsis in original), and although Harry is concerned with “how slurred Dumbledore’s voice had become”, Dumbledore nevertheless reassures Harry with his familiar emphasis through uncontracted forms: “‘I am not worried, Harry,’ said Dumbledore, his voice a little stronger despite the freezing water. ‘I am with you.’”

The change in Dumbledore’s language is a stylistic choice that signifies to the reader a sudden change in Dumbledore’s function: he is no longer the one in control of the situation, and his burden falls to Harry who, the narrator informs us, “did not know what to do”. Even though Harry does not have the answers, the use of the uncontracted form reinforces the brief reversal of roles while Dumbledore is rendered insensible by the potion. When the ordeal ends, the reader is assured that Dumbledore has full control of his faculties, even if he is weakened, a point reinforced by the stylistic return to uncontracted forms. “I am with you” is reassuring to both Harry and the implied reader. The ordeal Dumbledore experiences marks the first time in six volumes that both Harry and the reader witness a side of Dumbledore that is neither composed nor in control but is uncertain and terrorised by a significant trauma. Although Dumbledore returns to his normal composure and attempts to reassure the reader, this section foreshadows the huge shift that occurs immediately afterwards, when Dumbledore dies at the hands of Snape. It also reveals an inner turmoil of Dumbledore’s that is uncovered in the final book. In this instance stylistic variation heralds significant change in the sequence: Dumbledore’s weakening.

There are, of course, several times when Rowling narrates using an uncontracted form in the early books. In such cases there is an emphasis achieved through the more formal phrasing:

Snape was looking right at him, and the bell which rang ten minutes later could not have been more welcome. (Chamber, ch. 11)

Gryffindor were not out of the running after all, although they could not afford to lose another match. (Prisoner, ch. 10)

In these two examples, the uncontracted forms emphasise the relief provided by the bell and the importance of the upcoming Quidditch match. As with the other measures of stylistic variation so far analysed, there is evidence of both contracted and uncontracted forms throughout the sequence.

Thus far, we can summarise stylistic variation in the sequence as a shift from action to relationships, as demonstrated through the prepositions out and for; a shift from narrating the activities of groups with the third person plural pronouns to an increased amount of direct speech to convey important information; and a shift from informal to more formal language. When considered all together, these three variations in the use of language can be related to Harry’s increasing familiarity with his new found identity and the wizarding world. As the sequence progresses not only does Harry’s knowledge of the world increase, so too does the reader’s familiarity and knowledge which lessens the need for “absorbing” details (Behr 265).

5.2.2 Measuring Change in Chapters

If indeed there are shared styles of narration throughout the sequence and the variation observed in Figures 5.3 and 5.4 is a matter of proportions – where the earlier books have a higher proportion of external narration and the later books have a higher proportion of internal narration, but they share the style of narration – then a study of the style of each chapter ought to reveal patterns of internal variation in each book. It was hypothesised that when plotted along the same two principal components, the 199 chapters that make up the seven-volume series would spread out according to the style of narration in each chapter; some of the books would have some chapters with very similar styles and some chapters from the sixth and seventh books may be close to the style of chapters from the first three books.

The process of plotting the chapters along the same principal components as plotted in Figures 5.3 and 5.4 requires new PC scores to be calculated manually for each chapter in order to keep the loadings constant. These are calculated according to the following formula:

$$\sum_{i=1}^{n} f_i \, l_i$$

where $f_i$ is the standardised proportional frequency of the $i$th word in the chapter, $l_i$ is the original PC loading for the corresponding word and $n$ is the number of common words used in the original PCA experiment. The sum of all the proportional frequencies multiplied by the original PC loadings for each word is what constitutes the new component (as signified by the sigma above). This formula was repeated for each of the 199 chapters along both PC1 and PC2 and the results are graphed in Figure 5.5.
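The manual calculation described above amounts to a dot product between a chapter’s standardised frequencies and the fixed loadings of one component. The sketch below is illustrative only: the function name `manual_pc_score` is hypothetical, the three-word vectors are toy values rather than data from this study, and the frequencies are assumed to be standardised already.

```python
from typing import Sequence


def manual_pc_score(freqs: Sequence[float], loadings: Sequence[float]) -> float:
    """Sum of f_i * l_i over the n common words, for one chapter and one component."""
    if len(freqs) != len(loadings):
        raise ValueError("need exactly one loading per word frequency")
    return sum(f * l for f, l in zip(freqs, loadings))


# Toy values for three words (illustrative only, not taken from the thesis).
chapter_z = [1.0, -0.5, 2.0]        # standardised proportional frequencies
pc1_loadings = [0.5, 0.25, -0.25]   # loadings held constant from the original PCA
pc2_loadings = [0.0, 1.0, 0.5]

pc1 = manual_pc_score(chapter_z, pc1_loadings)
pc2 = manual_pc_score(chapter_z, pc2_loadings)
print(pc1, pc2)  # prints -0.125 0.5
```

Because the loadings are passed in unchanged, every chapter is placed on the same axes as the book-level plot, which is the point of calculating the scores manually rather than running a fresh PCA.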

Figure 5.5 Manual PCA Scores for 199 Chapters in Harry Potter Sequence (PC1 vs. PC2)

The chapters from the first three books are plotted in warm colours and are mostly found on the far left side of the graph. The fourth novel – the book in the middle of Figure 5.3 – is plotted in grey and the final three are in cool colours. Although the warm colours cluster together towards the left-hand side and the cool colours cluster together more to the right, there are some chapters that are interspersed. The only chapters labelled in Figure 5.5 are the right-hand extremes of all seven volumes and the left-hand extremes of the last four volumes. As the majority of the warm coloured chapters, those from the first three volumes, fall to the left, no chapter has been isolated as the far-left extreme for the purposes of the following discussion.

Comparing first the chapters that are furthest to the right from each volume reveals that all seven are chapters where Harry has the main events and circumstances of the novel explained to him by a figure of authority. The chapters from the first three books are not as far right as those from the last four volumes but nonetheless, these chapters – which are 17 from Philosopher’s Stone, 18 from Chamber of Secrets and 19 from Prisoner of Azkaban – constitute the right-hand extreme for their respective volumes and as such have a more frequent use of the words that were found to be highly scored on PC1. Using examples from chapter 17 in Philosopher’s Stone to unpack this finding requires examining the explanations from two figures who assume authority over Harry: Quirrell and Dumbledore.

The words from the right-hand side of Figure 5.4 that are used in this chapter include the modal verb could, verbs such as have, prepositions such as for, and the homograph that. For instance, part of Dumbledore’s explanation to Harry in chapter 17 of Philosopher’s Stone employs a modal verb and a preposition: “could not touch you for this reason”. Dumbledore’s explanations are also marked by a frequent use of both present tense and past tense, “If there is one thing Voldemort cannot understand, it is love … It was agony to touch a person marked by something so good” (Philosopher’s, ch. 17). Quirrell’s explanation of his role in the preceding narrative events makes frequent use of that in anaphoric references to past events: “that Quidditch match”, “that broom”, “after all that”, “that three-headed dog”. Dumbledore also employs that frequently: “I do believe he worked so hard to protect you this year because he felt that would make him and your father quits”. In this example, the direct speech also includes personal pronouns you and your which are also weighted highly on PC1.

One of the main differences between the explanations offered in the first novel and those offered in the later novels is the length of the explanations. In chapter 17 of Philosopher’s Stone, Dumbledore’s answers to Harry’s questions are brief and indulgent; there is no sense of urgency to fill Harry in on the details and so Dumbledore’s manner is soft and disarming, which is a contrast to the danger Harry had just faced. Smiling, beaming, twinkling and responding “dreamily” as though from afar, Dumbledore returns to the present tense with an “Ah-” several times before responding to Harry. Dumbledore even evades one of Harry’s questions, and although Dumbledore appears to encourage Harry’s desire for answers by politely waiting for Harry to continue asking questions while humming softly or intently looking out the window, Dumbledore is firm on the point that there are certain explanations that must wait even longer: “Alas, the first thing you ask me, I cannot tell you.” Dumbledore’s explanations in the later books are longer and, in the case of Half-Blood Prince, are more frequent.

The question Dumbledore evades in the first novel, “But why would [Voldemort] want to kill me in the first place?” is finally answered in Chapter 37 of Order of the Phoenix, which happens to be the chapter that is farthest to the right-hand side of Figure 5.5. Here Dumbledore not only explains events from the preceding novels but refers back to the first four novels. That is once again employed frequently in anaphoric reference: “I was sure that”, “I was right to think that”, “I knew at once that”. Dumbledore’s use of that is characteristic of his register in the later books as he demystifies, fills in the gaps and expands the overarching narratives of the sequence. In comparison to Dumbledore, Harry hardly uses that even where it could be used: “I tried to check [that] he’d really taken Sirius”. Evident in this example is, again, the difference in formality between Dumbledore’s diction and Harry’s speech.

Dumbledore is not the only character to speak in this style. Chapter 2 of Half-Blood Prince, which is on the right-hand side of Figure 5.5, although it is not labelled, consists of a secret conversation between Snape, Bellatrix and Narcissa in which Snape has similar stylistic characteristics to Dumbledore with frequent uses of that, “it became apparent to me very quickly that he had no extraordinary talent at all” and modality, “I would have been a fool to risk it”. As a marker of formal language and dialogic explanation, these words are found frequently in the direct speech of characters in authority who possess knowledge others lack. As it happens, Dumbledore fills this role most frequently.

Therefore, the chapters that congregate towards the right-hand side share a narrative purpose: to offer explanations to Harry. These chapters are stylistically united by sharing a relatively large amount of direct speech from characters who are older and more knowledgeable than Harry. The narrative purpose of the chapters necessitates modal verbs, second person pronouns, linking prepositions and anaphoric references.

In contrast, the chapters from the later books that are furthest to the left are full of action scenes and share a style with the bulk of the early books featuring prepositions and other elements of external, past tense narration. This includes Chapter 7 from Order of the Phoenix, which relays the dramatic events of the Quidditch World Cup, Chapter 4 of Deathly Hallows which is the eventful departure of Harry and his decoys from Privet Drive, and chapter 32 from the fourth novel, Goblet of Fire. This late chapter in Goblet of Fire is a pivotal scene in the sequence and although it is a relatively short chapter it contains several important events and a memorable climax for the sequence as a whole: a character dies and Voldemort returns. The narration in this chapter involves many prepositions and third person plural pronouns: “They pulled out their wands”, “Whoever they were, they were short, and wearing a hooded cloak pulled up over their head to obscure their face”. This chapter also has the highest frequency of was in the volume, “was looking around”, “was on the ground”, “what he was about to see”, “He was dead”, “and a nose that was as flat as a snake’s”. The narration style throughout this short chapter is the backdrop to the dramatic close of the chapter: “Lord Voldemort had risen again”.

The study of the sequence in chapters reinforces the theory that the stylistic variation between the first books and the later instalments is due to the proportion of the narrative style contained in each volume. The early books spend more time narrating action sequences as compared to the later books where explanations dominate the narrative. This is not to say that action does not occur in all seven volumes. Indeed, the presence of action-heavy chapters to the left-hand side of Figure 5.5 indicates that the style of the first three books can still be found in the later books. Similarly, the chapters that include explanations in the first three books are positioned closest to the bulk of chapters from the last three books. Throughout the analysis thus far, there has been a recurring element of variation: the style of dialogue. Some of the words on the right-hand side, those weighted with the later volumes, are words found primarily in dialogue: present tense verbs is and have, and second person pronouns. Furthermore, the explanations that distinguish some chapters from others are conducted through direct speech. The variation from the start of the sequence to the end could therefore be related to changes in direct speech – such as the rate of direct speech or the content of the speech, i.e. the increased use of modal verbs to accommodate Dumbledore’s explanations. In order to investigate this, PCA can be applied to a study of the style of direct speech and the rest of narration, providing another measure of variation in Harry Potter.

5.2.3 Measuring Change in Direct Speech and Narration

Given that the late books seem to be distinguished by words that tend to occur in direct speech, it is possible that the variation explained by the PCA is due to a larger ratio of dialogue in the late books. By counting the length of direct speech in each book and comparing it to the rest of narration we can graph the ratio of dialogue throughout the sequence, seen in Figure 5.6.
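The counting step described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used for Figure 5.6: the function name `direct_speech_rate` is hypothetical, length is measured in whitespace-separated tokens, and speech is assumed to be marked with double curly quotation marks (an edition that marks speech with single quotation marks would need extra handling for apostrophes).

```python
import re

# Assumption: direct speech is wrapped in double curly quotation marks.
SPEECH = re.compile(r"“([^”]*)”")


def direct_speech_rate(text: str) -> float:
    """Fraction of whitespace-separated tokens that fall inside quotation marks."""
    total = len(text.split())
    if total == 0:
        raise ValueError("empty text")
    spoken = sum(len(segment.split()) for segment in SPEECH.findall(text))
    return spoken / total


# Invented sample sentence, used only to show the calculation.
sample = "Harry looked up. “I will not,” he said. “Not now.”"
rate = direct_speech_rate(sample)
print(f"{rate:.0%}")  # prints 50%
```

Applied to the full text of each volume in turn, a function of this kind would yield one rate per book, which is the quantity plotted against volume order.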

Figure 5.6 Rate of Direct Speech in the Seven Harry Potter Books

The rate of dialogue does not increase as the sequence progresses even though it does vary from book to book.7 The first and last books have the same percentage of direct speech, 37%, and the second and sixth books have similar rates, 46% and 45% respectively. Consequently, the distribution of words related to speech at the right-hand side of Figure 5.4 is not due to an increase in dialogue from early to late books. It could be, however, that the style of speech in Harry Potter changes over the course of the sequence and that this aspect of the narrative is the main cause of the split of books, words and chapters along PC1. A further PCA study can ascertain whether segments of direct speech and the rest of narration vary from book to book in chronological order.
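The check for a chronological trend in the rate of dialogue rests on a correlation coefficient between volume order and the rate for each volume. A minimal sketch of Pearson’s r is given below; the function name `pearson_r` is hypothetical, the two series are toy values chosen to show the two extreme cases (they are not the rates measured in this study), and the accompanying significance test (the p value) is not computed here.

```python
import math
from typing import Sequence


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length, at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant series has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


order = [1, 2, 3, 4, 5, 6, 7]                # volume order
rise_and_fall = [1, 2, 3, 4, 3, 2, 1]        # toy series: varies, but no trend
steady_rise = [10, 12, 14, 16, 18, 20, 22]   # toy series: perfect upward trend

r_flat = pearson_r(order, rise_and_fall)
r_rise = pearson_r(order, steady_rise)
print(round(r_flat, 3), round(r_rise, 3))
```

A series that varies from volume to volume without drifting in one direction gives a coefficient near zero, which is the situation reported for the rate of dialogue; a series that moves steadily with volume order gives a coefficient near one.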

A third PCA study was conducted using the proportional frequencies of the same ninety-two words but in two segments for each book: direct speech and the rest of the narrative. The results are below in Figure 5.7.

7 The correlation coefficient between the rate of dialogue and the order of the sequence does not return a significant p value: r = 0.014, p = 0.97 where the threshold is p < 0.05.

Figure 5.7 PCA Scores for Harry Potter Books Divided into Seven Sections of Direct Speech and Seven Sections of Narration (PC1 vs. PC2)

Accounting for 74.6% of the variance in the dataset, PC1 definitively splits the segments of direct speech (on the left) from the rest of narration. PC2, plotted along the vertical axis, accounts for only 8%. However, this component demonstrates a very significant trend: both the speech and the rest of narration experience stylistic changes from the early books to the late. The first book is at the bottom of the graph, on both the direct speech and the narration side of the y-axis, followed closely by the second and third books, although the style of direct speech in Prisoner of Azkaban (the third) is closer to the first book than the style of dialogue in Chamber of Secrets. At the very centre of PC2 are both segments of the fourth book, followed by the fifth. Finally, the last two books are reversed in terms of direct speech but the stylistic variation in the segments of narration follows in complete chronological order along PC2.

Thus, the progression in the series from an early style to a late style occurs in the style of both segments. The narration, remarkably, is arranged in exact volume order. The dialogue also moves early to late as the PC2 axis increases. There are some perturbations to the order but the correlation between volume order and PC2 score for these segments is strong.8 Therefore, the correlation between stylistic variation and the chronology of Rowling’s sequence is evident in both parts of the narrative: direct speech and the rest of narration. Even when separated, stylistic change occurs from the first book to the end of the sequence.

The distribution of words plotted in Figure 5.8 shows the words which separated the segments of direct speech from the segments of the rest of narration. Most of the word loadings are clustered tightly in the middle of the y axis, even though they are split from left to right, so for ease of reading only the top and bottom ten words for both PC1 and PC2 are labelled, i.e. the highest and lowest at each end of both the horizontal and vertical axes (modal verb could is also labelled as it is pertinent to the following discussion).

8 r = 0.93, p < 0.002

Figure 5.8 PCA Loadings for 92 Words on Harry Potter Books Divided into Seven Sections of Direct Speech and Seven Sections of Narration (PC1 vs. PC2)

The words face, eyes and looked appear on the right-hand side while see and know are on the left-hand side, suggesting that the narration is occupied with features and the direction a protagonist is looking while the characters discuss more directly what they have viewed and understood. The first and second person pronouns, I, you, are on the left-hand side of the graph, associated with direct speech, while plural third person, they, is down the bottom right-hand side and singular third persons she, her and his are at the top (for the feminine pronouns) and middle (for the male) on the right-hand side. The position of did and that on the left-hand side of the graph, and at the top, confirms that the rate of usage of these words is relatively higher in direct speech – particularly with the speech found in the later books. The prepositions up, on and out are on the right-hand side of the graph, indicating that the description of movement is correlated more strongly with functions of narration rather than direct speech and is used relatively more in the early books. The position of these words confirms the discussion above – the style of Harry Potter changes to include more elements of formality and explanations with less description of action as the series progresses. However, the style changes in a similar manner in both the direct speech and the rest of the narration, shifting from an early preoccupation with movement, the location of objects in space and the plural pronoun to the focus in later books on individuals (and more females) and explanatory dialogue and narration, which is occupied with the expression of relationships between objects rather than the movement of objects.

Interestingly, there is a split between the modals – would is on the left-hand side, associated with speech in the late volumes, and could is on the right-hand side, associated with narration in the late volumes. Could is used in the present-tense narration while would expresses modality of the past or future, as the following examples from Deathly Hallows demonstrate:

‘But Dumbledore would have found it, Harry!’ … ‘A face,’ muttered Harry, every time. ‘The same face. The thief who stole from Gregorovitch.’ And Ron would turn away, making no effort to hide his disappointment. … He could not entirely blame them, when they were so desperate for a lead on the Horcruxes. (Deathly, ch. 15)

As well as tense, would is often associated with the opinions of the speaker: “‘I personally would be utterly bamboozled’” (Deathly, ch. 3). As such, it does occur at times in the narration of Harry’s thoughts, which were counted as part of the narration: “He had thought that he would feel elated if they managed to steal back the Horcrux”, “Would the elf keep silent or would he tell the Death Eater everything he knew? Harry wanted to believe that Kreacher had changed …” (Deathly, ch. 14). Nevertheless, the results from the PCA in Figures 5.7 and 5.8 indicate that would is more strongly associated with direct speech than the rest of the narration – even if it is employed occasionally when the narration shifts in tense and to Harry’s inner thoughts. Both these modal verbs are associated with the late books and account for the increased narrative focus on opinions and the discussion of possibilities and hypotheses, as demonstrated in the examples above.

A change that is in keeping with the chronology of story progression and can be found in the usage of only ninety-two common words in Harry Potter may be a subtle change – occurring in only 8% of the variance to be found according to a PCA. Yet, by counting the books according to distinct parts, direct speech and the rest of narration, a chronological change is indeed found in the quantitative study of both segments. These results show a stylistic change at a level previously unknown and that contradicts some of the early critical appraisals of Rowling’s style.

One early critic of Rowling’s fantasy books, John Pennington, complained: “Rowling is not sensitive to the integrity of style” (86). Although not commenting on chronological change, Pennington points to instances where Harry is said to have “hissed” in Goblet of Fire, which Pennington considers to be an inappropriate characteristic for the “developing hero of the series, [who] should not particularly hiss, for that is a characteristic of snakes and other evil creatures” (86). The leap from this evidence to the conclusion that Rowling is not sensitive about her stylistic choices leads to a precarious argument and the PCA results above now provide evidence to the contrary. The correlation between changes in Rowling’s styles of narration and direct speech with a progressive chronological variation reveals a deep consistency in her stylistic choices. While critics may not agree with the morality of the stylistic choices or perhaps the aesthetics of the choice, the variation in style across the sequence indicates a depth of stylistic choices previously unnoticed by critics. The analysis of these results has also shown how Rowling’s style changed to accommodate the variation in narrative purposes: as Harry ages there is a change in the usage of specific words, such as modal verbs that are used to express uncertainties and explain theories. As the narratives become more complex there is an increased need for explanations which are accompanied by verbs such as have, and the word that – a particularly prevalent word in Dumbledore’s speech. The change to Rowling’s representational problems, such as narrating the inner speech of an adolescent and discussing more background information instead of the description of the world, is reflected by consistent changes in the underlying patterns of word use. These results are demonstrated in Rowling’s stylistic choices in both direct speech and narration.

There remains the question of how much of the observed variation is an artefact of the method employed and whether Rowling’s sequence stands alone as an example of stylistic variation in a chronological fashion. Is it common to find variation in an author’s style across a seven part book series, or does the above stylometry of Rowling’s sequence sit apart? In order to answer these questions we can cross-check the findings by conducting the same analysis on two other fantasy book series that were also intended for young readers.

5.3 Variations in other Series

In order to arrive at a more robust conclusion regarding stylistic variation in the Harry Potter sequence, two counterpoints are provided: C.S. Lewis’s The Complete Chronicles of Narnia (1950-1956) and Diane Duane’s Young Wizards series (1983-2016). Unlike Harry Potter, the narration of Lewis’s series does not progress in a linear timeline – it was published in a different order to that in which it is often read today. Furthermore, the focal characters are not consistent across the series, since as the protagonists reach a certain age they are told that they can no longer return to Narnia. On the other hand, Duane’s series is an open-ended story – still being published, although its beginning predates Rowling’s sequence: the first novel was published fourteen years before Philosopher’s Stone. Although the protagonists do seem to age in Young Wizards the timeline is not as clearly demarcated as that of Harry Potter. For instance, one of the main protagonists, Nita, is thirteen in the first volume, So You Want to be a Wizard (1983) and has aged one year by the eighth volume Wizards at War (2005). This is despite at least two summer holidays having supposedly passed, one in the second volume, Deep Wizardry (1985) and a second in the fourth volume, A Wizard Abroad (1993). In 2012, Duane released “The New Millennium Edition” which aimed to update the computer and internet technology in the series as well as fix the timeline issues (Duane). In the new editions, the storyline in the first nine books occurs over a period of three years.

5.3.1 The Complete Chronicles of Narnia

Studying the stylistic change in The Complete Chronicles of Narnia draws on the same methodology that was used to study the seven volumes of Harry Potter. Once again, the first step was to count one hundred of the most frequent words in the corpus. Then all names were removed, which included Lucy, Aslan, Narnia, king, Edmund, Caspian and Jill. The proportional frequencies of the remaining ninety-three words were used as the dataset and subjected to a PCA, the results of which were graphed in Figure 5.9. The first principal component (PC1) distinguishes between Prince Caspian (1951) at one end and The Magician’s Nephew (1955) at the other. The spread of books along the horizontal axis does not follow the publication order of the books which is: The Lion, the Witch and the Wardrobe (1950), Prince Caspian, The Voyage of the Dawn Treader (1952), The Silver Chair (1953), The Horse and His Boy (1954), Magician’s, and The Last Battle (1956). Neither does the ordering of Narnia along PC1 correspond to the reading order that is suggested by HarperCollins: The Magician’s Nephew, The Lion, the Witch and the Wardrobe, The Horse and His Boy, Prince Caspian, The Voyage of the Dawn Treader, The Silver Chair, and The Last Battle. Significantly, The Lion, the Witch and the Wardrobe is dramatically different to all the other texts along the second principal component (PC2), expressed as the vertical y-axis. The variation in the series along both PC1 and PC2 is not correlated with either the order of publication or the chronology of the story, for there is no significant correlation to be found when the two reading orders are correlated with the PC scores (for either the first or second components).9 Nevertheless, PC2 can be summarised as a measure that found one variation according to chronology: the first volume published is distinct from all that followed. However, it is not a variation that was followed by successive volumes in either the order of publication or storyline.
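The preparation of the dataset described at the start of this section (count the most frequent words across the corpus, remove names, then take proportional frequencies per text) can be sketched as follows. This is an illustration only, not the software used for the study: the function names, the simple tokeniser and the two invented one-line "books" are all assumptions made for the example.

```python
import re
from collections import Counter
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lower-case the text and keep runs of letters and apostrophes as words."""
    return re.findall(r"[a-z’']+", text.lower())


def common_word_list(texts: Dict[str, str], top_n: int, names: Set[str]) -> List[str]:
    """Most frequent words across the whole corpus, with names removed afterwards."""
    counts: Counter = Counter()
    for text in texts.values():
        counts.update(tokenize(text))
    return [word for word, _ in counts.most_common(top_n) if word not in names]


def proportional_frequencies(text: str, words: List[str]) -> List[float]:
    """Frequency of each listed word as a proportion of all tokens in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in words]


# Invented miniature corpus, used only to show the steps.
texts = {
    "Book A": "Lucy saw the lion and the lamp",
    "Book B": "The lion saw Lucy and Edmund",
}
words = common_word_list(texts, top_n=5, names={"lucy", "edmund"})
table = {title: proportional_frequencies(text, words) for title, text in texts.items()}
print(words)
print(table)
```

Note that the names are removed after the top words are selected, so the final list is shorter than `top_n`, just as one hundred words reduce to ninety-three above. Each row of `table` would then be one observation in the PCA.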

Figure 5.9 PCA Scores of Narnia (PC1 vs. PC2)

The variations in Narnia can be unpacked by studying the corresponding graph of the word loadings, plotted in Figure 5.10. According to this graph, the extremes of PC1 are present tense words, such as are, is and have, which scored low on the left-hand side, and past tense words including had, was and did that scored high on the right-hand side. As well as indicating that the books on the left, such as Prince Caspian, contain relatively more present tense dialogue and narration, the graph shows that the right-hand side is also marked by the presence of that and there – words that determine which objects are being referred to and the position of objects. Therefore, the texts on the right, such as The Magician’s Nephew, are stories told with relatively more past tense narration and with a relatively higher need for identifying objects and explanations. On the left-hand side, however, the texts are also weighted with adjectives, great and little, and the number two. These texts include descriptions as well as relatively more direct speech between characters. When considered together, these patterns suggest that the main stylistic distinction between the texts along PC1 could be said to be a distinction in tense, between a story told through a series of events and including discussions between characters – such as in Prince Caspian – and a story that is reported primarily in the past tense, as in The Magician’s Nephew. In order to illustrate the styles at either end, we can turn to examples from these two texts.

9 Correlation of published order and PC1: r = 0.18, p = 0.35; of reading order and PC1: r = -0.41, p = 0.18; of published order and PC2: r = 0.59, p = 0.08; and of reading order and PC2: r = 0.08, p = 0.43. None of the p scores are within the threshold p < 0.05. The closest one is PC2 and the published order, which is the component distinguishing Lion from the rest of the series.

Figure 5.10 PCA Loadings for 93 Words in Narnia (PC1 vs. PC2)


The words found at the high end of PC1 in Figure 5.10 are used to link events in a reporting style of narration. For instance, in The Magician’s Nephew, upon Digory’s return from Narnia with the Witch clutching his ear, there is a brief moment of suspense that is made almost comical by the narrator’s use of both the words found at the high end of PC1 and short sentences:

In a moment they found themselves in Uncle Andrew’s study; and there was Uncle Andrew himself, staring at the wonderful creature that Digory had brought back from beyond the world. And well he might stare. Digory and Polly stared too. (Magician’s, ch. 6)

The style in The Magician’s Nephew is a mixture of linear narration, as seen in the above quote (“there was”, “had brought back from”), and speculation at other moments where the narrator makes use of the conjunction if, a word also weighted high on PC1. If is found throughout The Magician’s Nephew: “as if the room had been intended for a much larger collection of images”, “if you ever met people who looked like that”, “If only Digory had remembered what he himself has said a few minutes ago” (ch. 4). The style of The Magician’s Nephew accommodates the new details concerning Narnia that are offered in this volume even though it is the final book published in the series. Although the story in this novel is set prior to the other stories, the narrator speaks in past tense not only about the events of The Magician’s Nephew but also the events of The Lion, the Witch and the Wardrobe: “and when, many years later, another child from our world got into Narnia, on a snowy night, she found the light still burning” (ch. 15). In this final volume published in the Narnia series, the narrator describes the world of Narnia with both speculation about certain aspects of the world and a concrete reporting style using the other words weighted high on PC1: had, that, there, was, back.

In contrast, the words weighted low on PC1 include present tense words, is and are, the pronouns who, we, his and us, and the conjunction and. This conjunction is found in abundance throughout the narrative of Prince Caspian: “Caspian and other captains”, “deeply into it and sometimes …”, “and he and his army arrived”, “and both his party and Caspian’s”, “those of us who have most need of cover and are most accustomed …” (ch. 7). The present tense words are found primarily in direct speech, which conveys a lot of information to the reader. For instance, the introduction to Aslan’s How is explained through dialogue: “it is a huge mound”, “the Mound is all hollowed out”, “the Stone is in the central cave”, “There is room in the mound for all our stores” (ch. 7). Even when the children first return to Narnia it is a Dwarf who begins to explain the changes that have occurred since they were last there. The narrator eventually takes over to introduce the story of Prince Caspian, asserting that “it would take too long and be confusing” for the full account to be relayed as a conversation between the children and the Dwarf (ch. 3). The narrator in Prince Caspian progresses through the story by means of both dialogue and narration, in which spatial shifts and temporal progressions are clearly marked: “For now something was happening”, “a little to their right” (ch. 3). Despite a tendency for reporting in narrative that occurs in texts at both extremes of the horizontal axis, the style of narration is distinct between Prince Caspian and The Magician’s Nephew, as demonstrated and explained by PC1.

The second principal component is another matter as it distinguishes between The Lion, the Witch and the Wardrobe and the rest of Lewis’s series. At the top of PC2 are the words I, me, and my, and in contrast at the bottom of PC2 are the third person plural pronouns they and them, and even the word all. One of the distinctions between The Lion and the rest of Narnia is a relatively higher focus on groups rather than individuals in Lewis’s first published fantasy. The increased focus on individuals in the later books can be found in changes to the narrator’s own style. If we compare The Lion to the last book published, The Magician’s Nephew, the stylistic differences are apparent in the delivery of information. The narrator of The Magician’s Nephew tends to offer the protagonist’s reflections on the events at the same time as reporting the events, for example “as Polly said afterwards” (ch. 6). Indeed, the narrator is more reflective in this last book published than the narrator in the first. There are several points where the narrator in The Magician’s Nephew offers more information (words weighted high on PC2 are highlighted): “As Digory said afterward” (ch. 2), “she never mentioned it again or afterward. I think (and Digory thinks too) that her mind was …” (ch. 6); “Of course Digory did not realize the truth quite clearly either, or not till later” (ch. 3). In contrast, there are very few mentions of the outcomes and the later reflections of the protagonists in The Lion. There is one instance where additional information is provided regarding a gift given to Lucy: “He gave her a little bottle of what looked like glass (but people said afterward that it was made of diamond) and a small dagger” (ch. 10). One of the differences between the first and the last novel of Narnia is, therefore, an increase in retrospection that necessitates an increase in the personal pronoun, I, as the narrator refers more to his own opinion and the use of both as and was to refer to the events that took place. Perhaps writing the last volume in the series as a sort of retrospective prologue can account for the retrospection of Lewis’s narrator. However, the reflections reinforce to the reader the fact that the narration of The Magician’s Nephew takes place long after the events that are being told, after the earlier volumes, The Lion and Prince Caspian.

The narration of The Lion, the Witch and the Wardrobe is also marked by the past participle been, while the verb be is associated with the top of PC2, and therefore with the rest of the series. Also found at the bottom of PC2 are the preposition into and the adverb then, which link events together. Although they are found throughout the narration, they are especially easy to find in relation to the magical wardrobe:

… she thought she would have time to have one look into the wardrobe and then hide somewhere else … then there was nothing for it but to jump into the wardrobe … it is very silly to shut oneself into a wardrobe … and he came into the room just in time to see Lucy vanishing into the wardrobe. He at once decided to get into it himself … (ch. 3)


The split of words along PC2 reveals that the language in The Lion is focused relatively more on describing the protagonists as a group, with reported speech that is not as heavy with first person pronouns and a reporting style that is firmly past tense and links events together with words such as then. The PCA suggests that the style of The Lion is distinguished from the rest of the Narnia series by the usage of these words.

Peter Schakel has also noted a distinct style of narration in The Lion, the Witch and the Wardrobe which he argues can support the position of this novel as the first in the reading order. His observations confirm the distinctiveness of The Lion. Schakel argues that The Lion includes careful details as Lucy first enters the world, causing gaps, “details that need to be clarified” (96). The first mention of the name Narnia creates a gap, the inexplicable presence of a lamp-post in the middle of a forest creates another gap and the first mention of the name Aslan creates the “most important” gap according to Schakel (96). The gaps are filled slowly; as partial information is offered, the mystery of the inexplicable is heightened. By comparison, The Magician’s Nephew offers a very different experience. The introduction to the story mentions Narnia with a deictic reference, “the land of Narnia”, which assumes the reader has a degree of shared knowledge with the narrator not only of the world but of “all the comings and goings between our world and the land of Narnia”. Schakel further notes that the word Narnia is not mentioned again until the title of chapter 9 (98) as the reader is invited into the story of how it “first began”. Schakel argues, “The only reason for putting The Magician’s Nephew first is to have the reader encounter events in chronological order, the order in which they happened, and that, as every storyteller knows, is quite unimportant as a reason” (94). However, to read the series in chronological order is to miss the particularly potent first entry to Narnia as experienced in The Lion. The presence of these artistic strategies in The Lion adds weight to the finding that The Lion is the only outlier along PC2; the only book in the series that varies significantly in style is also the book that varies in terms of artistic strategy when entering Narnia.


In terms of the style of language, The Lion, the Witch and the Wardrobe is distinct from the rest of the series but this distinction does correspond with one of the available reading orders as The Lion was the first volume published. Furthermore, although it is not the first volume in terms of story chronology, it has been argued as possessing narrative elements that cause it to be an appropriate entry to the world of Narnia. Compared to the Harry Potter sequence, however, the differences are not correlated with the ongoing chronologies of either the story or publishing order of the books. However, there is another contrast between the findings of Narnia and of Harry Potter that can be identified as relating to the focalisation of characters.

Throughout Rowling’s seven-volume sequence, the story is narrated from Harry’s perspective. This accounts for part of the stylistic variation from book to book: as Harry ages the narration changes to accommodate his expanding perspective, his adolescent feelings and the complexity of the tasks set before him. However, Narnia does not develop according to character arcs, and as characters age they are excluded from Narnia and from the books.

Although all seven books have an omniscient narrator there are differences in how the narrative is focalised. For instance, The Silver Chair is told from the perspective of Jill Pole, with direct inner speech occurring frequently and a distinctive opening to the novel: “It was a dull autumn day and Jill Pole was crying behind the gym. She was crying because they had been bullying her” (9). The other novels open with self-reflexive introductions: “This is the story of” (Horse 11), “This story is about” (Magician’s 9). Prince Caspian and The Lion open with the exact same line, “Once there were four children whose names were Peter, Susan, Edmund and Lucy” but the narrator qualifies in Prince Caspian that “it has been told in another book called The Lion, the Witch and the Wardrobe how they had a remarkable adventure” (Prince 12). The opening to The Magician’s Nephew speaks directly to the reader and explains the setting of the story far back in the childhood of “your grandfather” (Magician’s 9). In contrast, the opening to The Silver Chair draws the reader directly into the story through the perspective of Jill, which immediately distinguishes the novel as having a distinct strategy in focalising the narrative through a specific character. However, The Silver Chair is not distinguished as an outlier or an extreme on either PC1 or PC2. Therefore, the difference of narration in The Silver Chair has not impacted the stylistic variation found by the PCA. This contrasts with the stylistic variation found in Rowling’s novels, where the narration of the first books to the last in Harry Potter is distinguished by an increase in modality expressing possibilities and desires, and in words employed to refer to facts and features of the world that have become assumed knowledge for Harry and the reader.

In summary, the above study of The Complete Chronicles of Narnia serves to confirm that a series in seven volumes – each book with a closed progression of narrative events – does not necessarily progress stylistically in relation to the progression of the story or the order in which the volumes are written. In contrast, the changes to the styles of the Harry Potter books occur with more consistency as the books progress in publication order.

5.3.2 Young Wizards

The Young Wizards series shares two significant traits with the Harry Potter sequence: the protagonists age and the books are published over at least a decade, more in the case of Young Wizards. One significant difference is the open setting of Young Wizards compared to the largely closed world of Harry Potter. The characters in Young Wizards are receiving a magical education but not through a school environment, which means that there is a slightly different setting in each novel ranging from New York to Ireland and all the way to Mars. Despite additional materials and subsequent stories by Rowling that are set in the same world, the original Harry Potter sequence is a closed seven-part story: there was a set number of books in the original sequence and the setting is consistent, with the only significant departure occurring in the last book. On the other hand, Duane’s series is ongoing, with a tenth book published in 2016 although not yet published when the data was being collected for this study. Therefore, the first nine books in the Young Wizards series provide a useful comparison to the Harry Potter sequence primarily due to the similarities in the maturing of the books as the series progresses.

The nine books are: So You Want to Be a Wizard (1983), Deep Wizardry (1985), High Wizardry (1990), A Wizard Abroad (1993), The Wizard's Dilemma (2001), A Wizard Alone (2002), Wizard's Holiday (2003), Wizards at War (2005) and A Wizard of Mars (2010). In 2011 Duane announced that she would revise the first four books to update the technology for twenty-first century readers. She has since published revised editions of all nine original books in a set called The New Millennium Edition. The original nine versions are used first in the following study, and those initial results are then compared to the revised editions of the nine volumes.

The PCA performed on the original works in Duane’s Young Wizards series considers only the most common words, few of which are lexical words and none of which betrays the fantastic world of Duane’s fiction. The one hundred most frequent words were counted and three names discarded from the list, leaving the proportional frequency of ninety-seven common words in nine original books as the dataset for this test.10 The first two components returned by the subsequent PCA are graphed against each other in Figure 5.11. The numbers denote the published order of the books, which is also the chronological order of the story.

10 These are the names of the three main characters, Nita, Kit and Dairine.
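The construction of a dataset of this kind can be sketched in code. The following is a minimal illustration only, not the software used for this study: the function name, the tokenising rule and the toy texts are assumptions introduced for the example. It counts the most frequent word forms across a set of texts, discards any listed character names from that list, and expresses each remaining count as a percentage of the words in each text.

```python
import re
from collections import Counter

def proportional_frequencies(texts, n_words=100, exclude=()):
    """Return (words, rows); rows[i][j] is the percentage of the tokens
    in texts[i] that are words[j]. Hypothetical helper for illustration."""
    excluded = {w.lower() for w in exclude}
    # Assumed tokenising rule: lower-case runs of letters and apostrophes.
    tokenised = [re.findall(r"[a-z']+", t.lower()) for t in texts]
    totals = Counter()
    for tokens in tokenised:
        totals.update(tokens)
    # Take the n most frequent words first, then drop the excluded names,
    # so 100 words minus 3 names would leave 97 variables.
    words = [w for w, _ in totals.most_common(n_words) if w not in excluded]
    rows = []
    for tokens in tokenised:
        counts = Counter(tokens)
        size = len(tokens) or 1  # guard against an empty text
        rows.append([100.0 * counts[w] / size for w in words])
    return words, rows

words, rows = proportional_frequencies(
    ["Nita saw the cat and the dog", "Kit and the wizard"],
    n_words=3,
    exclude=("Nita", "Kit", "Dairine"),
)
print(words)  # the word variables that remain after the names are dropped
```

Each row of the resulting table describes one text and each column one word, which is the shape of input that a PCA expects.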

Figure 5.11 PCA Scores of Young Wizards Series (PC1 vs. PC2)

The first component (PC1) has distributed the Young Wizards series along the horizontal axis in an order that correlates to the chronological order of the volumes.11 As in the Harry Potter sequence, the extremes along the horizontal axis are not the first and the last novels but rather the first and the sixth. On the left-hand side of the y-axis are the first four novels while the last five books are on the right-hand side. The only books to sit outside of the published order are the fifth and the sixth book. Significantly, there is an eight-year gap between when the fourth book was published and when the fifth book was released: all the books on the left-hand side of the graph were written over a decade in the twentieth century between 1983 and 1993 and all the books on the right-hand side were published in the twenty-first century over the course of nine years (2001-2010). In the intervening years between the fourth and the fifth books, Duane wrote other fiction works such as two titles in a series by the name Feline Wizards (1997, 1998). She also wrote in category series such as Star Trek, Spider-Man, X-Com, X-Men and The Harbinger Trilogy.12 Although Duane was writing fiction for established series prior to her break in the Young Wizards sequence, the time spent focused on other projects could have sparked a shift in Duane’s style that can account for the split between the twentieth century books and the twenty-first century books. Certainly, Duane, at some point, became aware of the datedness of her 1980s and 1990s books as this prompted her release of The New Millennium Edition in 2012. Indeed, it was only the first four books that Duane initially declared required updating “for this millennium’s generation” (Duane), although all nine have since been revised and republished in the new edition. As well as updating the technology to include mobile phones and the internet, Duane also attempted to address what she called “timeline issues” that are due to “longish periods elapsing between books” (Duane). The question is, do these updates alter the stylistic split between the early books and the late books?

11 The correlation between the order of the books and the order of the books along PC1 is significant: r = 0.89, p = 0.002.

Analysed using a PCA, The New Millennium Edition, for all the changes to the temporal setting and technology, is essentially the same as the original in terms of common word usage. When counted, the same list of one hundred most frequent words was returned. In keeping with the methodological protocols outlined in Chapter 2, the character names, Nita, Kit and Dairine, were excluded. The remaining ninety-seven words were used as the dataset and the PCA accounted for a similar amount of variance in PC1 (30.6%) and in PC2 (16.4%). The texts were distributed along both PC1 and PC2 in the same order as in the PCA of the original books and in essentially the same pattern: comparing Figure 5.11 and Figure 5.12 reveals a slight re-positioning of books 8 and 9 in terms of pattern but the volumes are still in the same order along both the horizontal and vertical axes. When the corresponding components of the two analyses are correlated, a highly significant correlation is found for both the text scores and the word loadings (see Table 5.5). Therefore, the updates Duane made to her Young Wizards series do not extend to the syntactical choices made using the very common words in her corpus of nine volumes.

12 Among the titles produced between 1993 and 2000 are The Venom Factor (1994); UFO Defense (1997); Intellivore (1997); Empire’s End (1997); Starrise At Corrivale (1998).

Figure 5.12 PCA Scores for Young Wizards (The New Millennium Edition) (PC1 vs. PC2)


Table 5.5 Significance of Correlation Coefficients in the PCA Results on the Original and the New Millennium Editions (NM) of Young Wizards

Original and NM Correlations    R-value    P-value
PC1 Text Scores                 0.996      <.0001
PC2 Text Scores                 0.974      <.0001
PC1 Word Loadings               0.982      <.0001
PC2 Word Loadings               0.960      <.0001

The nature of stylistic variation in Young Wizards, as found along PC1, can be briefly explored through the word loadings from the PCA using the results from the original versions, which are plotted in Figure 5.13. The extremes along the horizontal axis are and, of, the, with, well and a, which scored low on the left-hand side of the graph, and just, way, get, been, to and how, which scored high on the right-hand side. There are more words on the right-hand side of the graph than the left and among them are going, got and go, all of which indicate a focus on the movement of protagonists in the later volumes. The right-hand side is also dominated by words used in direct speech such as first person pronouns, me and I. Accordingly, the main difference between the books published in the twentieth century and those published in the twenty-first century is an increased complexity. The early books appear to have a narrative style that prefers the conjunction and, which is contrasted to several types of words on the right-hand side. Among the words on the right-hand side are how, when, but and the highest scoring word: just. Indeed, when just is explored in the context of the sixth book, A Wizard Alone, it is found in narration, “had just come into the dining room again”, and in dialogue, “Just do this from now on” (ch. 1). In the first book, So You Want to Be a Wizard, just occurs at a proportional frequency of only 2% which increases to 3.8% in the sixth book. Together with the other words on the right-hand side of Figure 5.13, there is a distinct usage of words that distinguishes Duane’s twenty-first century novels from her four earlier novels.


Figure 5.13 PCA Loadings for 97 Words in Young Wizards (Original Versions) (PC1 vs. PC2)
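The relationship between text scores, as plotted in a figure such as Figure 5.11, and the word loadings plotted here can be illustrated with a small sketch. It is offered under stated assumptions rather than as the procedure used in this study: the word variables are standardised (a correlation-matrix PCA), only the first component is extracted, and power iteration stands in for whatever routine a statistics package would use. The function name and the toy numbers are hypothetical.

```python
import math

def first_component(rows, iterations=200):
    """Return (loadings, scores, variance_share) for PC1.

    rows[i][j] is the frequency of word j in text i. Every word must vary
    across the texts, otherwise it cannot be standardised.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    # Assumption: standardise each word so that all words weigh equally.
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

    # Power iteration: repeatedly apply Z^T Z to an arbitrary start vector.
    v = [float(j + 1) for j in range(p)]
    for _ in range(iterations):
        s = [sum(zi[j] * v[j] for j in range(p)) for zi in z]
        w = [sum(z[i][j] * s[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("start vector is orthogonal to the data")
        v = [x / norm for x in w]

    # A text's score is the loading-weighted sum of its standardised words.
    scores = [sum(zi[j] * v[j] for j in range(p)) for zi in z]
    eigenvalue = sum(s * s for s in scores) / (n - 1)
    # Standardised variables have a total variance of p.
    return v, scores, eigenvalue / p

# Toy data: four texts, two words whose frequencies move in opposition.
rows = [[1.0, 8.0], [2.0, 6.0], [3.0, 4.0], [4.0, 2.0]]
loadings, scores, share = first_component(rows)
print(loadings, scores, share)
```

In this toy case the two words receive loadings of opposite sign and the texts are spread along the component in their original order. Because the sign of a component is arbitrary, only the relative positions of words and texts along the axis carry meaning.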

Of the ninety-seven words used in the PCA test on the Young Wizards series, seventy-nine of them are the same words used in the study of Harry Potter.13 To compare the similarities between the spread of words along PC1 in the Young Wizards test and the first component found in the initial Harry Potter test (Figure 5.4), the PC loadings for the seventy-nine words were correlated. The resulting correlation coefficient was r = -0.041 which returned a p-value of p = 0.36. As such, there is no linear correlation between the variation of seventy-nine common words as found by the PCA on Harry Potter and the PCA on Young Wizards.
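The comparison just described reduces to a Pearson correlation over the loadings of the words that both analyses share. The sketch below is illustrative only: the function names and the loading values are invented for the example, and the p-value reported above is not computed here.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def shared_loading_correlation(loadings_a, loadings_b):
    """Correlate two {word: loading} mappings over their shared words."""
    shared = sorted(set(loadings_a) & set(loadings_b))
    r = pearson_r([loadings_a[w] for w in shared],
                  [loadings_b[w] for w in shared])
    return shared, r

# Invented PC1 loadings for two series; names unique to one are ignored.
series_a = {"the": 0.1, "and": 0.2, "just": 0.3, "harry": 0.9}
series_b = {"the": -0.2, "and": -0.4, "just": -0.6, "nita": 0.5}
shared, r = shared_loading_correlation(series_a, series_b)
print(shared, round(r, 3))
```

Since the direction of a principal component is arbitrary, the magnitude of r is more informative than its sign when loadings from two separate analyses are compared.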

13 The seventy-nine shared words are: a, about, again, all, an, and, are, around, as, at, back, be, been, but, by, could, do, don't, down, for, from, get, got, had, have, he, her, him, his, how, I, if, in, into, is, it, its, just, know, like, looked, me, more, no, not, now, of, on, one, or, out, over, right, said, see, she, so, that, the, their, them, then, there, they, this, through, time, to, up, was, we, well, were, what, when, with, would, you, your.


Both Young Wizards and Harry Potter experience a stylistic variation but in Duane’s sequence the change can be effectively described as the difference between her style up to 1993 and her style since 2001. After an eight-year break the style of the last five books in the Young Wizards sequence is distinct from the first four. For the Harry Potter sequence, however, the change is not as neatly categorised but is more consistent across the books, indicating a more progressive stylistic shift in Rowling’s style.

Conclusion

The aim of this chapter was to discover whether variation in the Harry Potter sequence could be observed in a stylometric study. Returning to this original aim, the question has been affirmatively answered: according to a PCA study there is stylistic variation in the Harry Potter sequence and it is a variation that occurs in a chronological fashion. From the first book to the last, stylometry has demonstrated that there are changes in the narrative style and the style of dialogue. The stylistic variation in Harry Potter occurs at such a deep level that it can be found in several PCA measures.

Stylistically, the Harry Potter sequence undergoes a shift from a narrative style that has a relatively high focus on action, movement and groups to a focus on individuals, discussions and explanations. The later books have a relatively higher focus on explanation but still contain language associated with action. However, the action is not as dominant in the later novels as is the discussion of motives, the narration of inner thoughts, and the speculation and extrapolation of narrative events and circumstances. The first PCA results in this chapter indicated that dialogue was one of the main contributors to this distinct style at the end of the sequence. Yet it was established later that the ratio of dialogue to narration does not increase according to the chronology of the story. This is just one of the surprising findings in this chapter.


One of the unforeseen findings was the spread of texts along the PC1 continuum in the first test, for although Goblet of Fire is readily seen as the first book in Harry Potter to herald a significant variation in the sequence, it was not significant according to PC1. Goblet of Fire may be the first book in the sequence to depart dramatically from the plot structure and the length of the first three volumes, the first to move away from the childish ways of boys and girls to narrate more adolescent interactions, and the first in which Harry is more isolated from his peers, set apart by the events of the Triwizard Tournament. However, it does not diverge significantly in style.

Another surprising finding is that the words counted for the PCA test on Harry Potter did not necessarily include any words that can be described as directly related to tone, yet they uncovered stylistic variation anyway. The ninety-two words counted are fundamental elements of the language of Harry Potter and therefore of the creation of the narratives. The early books, with their characteristic use of prepositions denoting movement and pronouns relaying the activities of a group of protagonists, are more adventure story than the later books. Without much focus on the emotions, adolescent yearnings and confusion of the older protagonists, the style of the early books can be described as a younger narration. As Westman has argued, the stories are all focalised through Harry’s perspective (147), which implies an aging narration as Harry ages. The PCA findings reveal that as Harry ages there is also an increase in responsibility which is seen in the style through the words used for explanations offered to Harry, mostly by Dumbledore, including that, not, would and to: words that are used in language to refer to evidence, explain theories and rationalise conclusions. Accompanied by more formal language, the increase in explanations in the late books is a marked shift from the informal narration of the early books that are dominated by prepositions. Even Dumbledore’s explanations in the early books were shown to have fewer of the late stylistic markers. These stylistic changes are related to the changes in Harry, the main focaliser throughout the sequence. Harry’s increased awareness of the fantasy world, for instance, results in an increase in modal verbs as the characters discuss, and Harry meditates on, the variety of solutions and options for action.

The stylistic variations that characterise Harry Potter are not necessarily shared by the other book series. Although there is stylistic change in the other series that were studied, it is not due to the same reasons as Rowling’s stylistic variation in Harry Potter. In the case of Narnia, the variation is related to the different narrative styles in books such as Prince Caspian, which is narrated through dialogue, as opposed to a more reflective narration in books such as The Magician’s Nephew. Along the second component the significant difference between Lewis’s first Narnia book, The Lion, the Witch and the Wardrobe, and the rest of the series indicates that the strategies Lewis used in The Lion are stylistically distinct at a level previously unknown.

In contrast to Narnia but comparable to Harry Potter, Duane’s Young Wizards series does progress in a chronological order with a strong distinction between her twentieth century novels and those she published, after an eight-year pause, in the twenty-first century. When the results are compared directly with the PCA results of the Harry Potter series, however, there is no correlation between the PC1 loadings of the shared seventy-nine common words. The distinctions between the three series lead to the conclusion that although the styles of all three writers change within the series, there is no correlation between the variations found in the three studies. Therefore, the stylistic variation in Harry Potter is not present in either Narnia or Young Wizards. The changes in all three book series can be related to the individual circumstances of the writers, their subjects of choice and the development of different artistic strategies to accommodate the changing vision from book to book. The PCA method found stylistic variations in all three but the most consistently progressive variations were found in Rowling’s Harry Potter sequence.

The most unexpected finding of this study is the fact that there is a correlation between the progression of the Harry Potter sequence and the styles of both the direct speech and the rest of the narrative. This finding offers a new perspective on one of the early criticisms that was levelled against the sequence, epitomised by Pennington’s claims. Evoking Ursula Le Guin’s Poughkeepsie test, Pennington stated: “Harry Potter continues to reside in Poughkeepsie … [m]aybe Harry is supposed to be grounded in Poughkeepsie” (86). Considering the surprising computational results, this position ought to be re-evaluated. For the correlation between the chronological progression of the sequence and changes in both Rowling’s direct speech and the rest of the narration reinforces the finding that stylistic variation in the Harry Potter sequence occurs chronologically and at a deep and consistent level.


Conclusion: Expanding the Verbal Universe

This thesis set out to explore the links between underlying stylistic patterns in language and higher order processes in texts that function as vehicles for the ideas of either science fiction or fantasy. The main question asked throughout this thesis was what could be learnt about science fiction and fantasy through stylometric studies of word frequencies. The three case studies have addressed the chief concern of this thesis by exploring the relationship of style to three areas of scholarship: the critical traditions of the genres themselves, the literary study of style more generally, and stylometry as literary criticism.

Concerning the genres themselves, in the past the topic of style in science fiction and fantasy has attracted only a handful of studies. In one of those Peter Stockwell said that though science fiction is “the most conceptually experimental of genres … the style of its language has traditionally been very pedestrian, conservative, unimaginative and unspectacular … It is writing that is about conceptual alternativity, rather than an exemplification in its styles of fantasy and unrealism” (Stockwell 50). The results in the first case study (Chapter 3) confirm the latter aspect of Stockwell’s statement: the studied science fiction text, H.G. Wells’s The Time Machine, makes use of concrete language with only a mixture of the abstract. The style of The Time Machine is not readily distinguishable from the styles of other genres in the same period. In contrast, the utopian narratives included in the corpus were outliers in terms of the underlying patterns of language use. In the second case study, the analysis of science fiction works, and notably those categorised as future histories, challenges Stockwell’s suggestion that science fiction is unimaginative and unspectacular. Olaf Stapledon’s works in particular demonstrate a style distinct from both Wells and Virginia Woolf. Stapledon’s artistic solutions were imaginative and resulted in a distinctive style. The packaging of science fiction has, in the past, been a point of some discomfort in the scholarly community. Although it is a genre that is lauded for the ideas it contains and the ability of those ideas to offer social commentary, this thesis has revealed the language underlying these aspects of science fiction. The effects of a science fiction work can be found in the usage of common words. The defamiliarisation of Wells’s Time Traveller occurs through a single preposition, like. Effective artistic solutions are found in Stapledon’s unconventional style, and the discovery of Stapledon’s two writing styles indicates different solutions to accommodate two different goals: there is a style for writing the future as history and a style for extrapolation through fictional biography. The latter of these styles is more versatile than the former and comes closest in style to both Woolf’s modernist writings and Wells’s adventure style. Science fiction may be known as the “literature of ideas” (Broderick 66), but by focusing on style this thesis reveals new information about how ideas are conveyed in the style of the ubiquitous and common words.

On the other hand, fantasy fiction, according to Ursula Le Guin, “is nearer to poetry, to mysticism, and to insanity than naturalistic fiction is” (84). It is a surprising result, then, that the first case study (Chapter 3) found that the early fantasy novel Lilith was at home among the more naturalistic fiction. The style of George MacDonald’s strange fantasy did not belong with its science fiction counterpart, The Time Machine. Nor does the style of Lilith belong with the styles of the other extremes found: the domestic realist and the utopian narratives. This is not to deny that the language of Lilith is strange, for MacDonald includes passages of archaic grammar in some sections of direct speech and “archaic manner is indeed a perfect distancer” in fantasy fiction (Le Guin 90). The archaic in Lilith does not, however, contribute to any deep stylistic distinctions between the underlying style of the work and the other styles discovered in the case study. This situation is an illustration of what was acknowledged at the outset: the stylometric method employed throughout this thesis cannot touch on all the stylistic qualities that constitute a work of genre. Nevertheless, the underlying style of Lilith was found to be at home among the rest of the corpus, as MacDonald’s novel has neither the relatively heavy use of prepositions that marks the adventure styles nor the high quantity of modal verbs and second person pronouns that marks the domestic realist novels. Lilith makes use of both and is therefore in the mid-range of the spectrum defined by the principal component, remote from either extreme.

In contrast, the third case study (Chapter 5) revealed that the style of Harry Potter progresses chronologically. Significantly, there is consistent change in Rowling’s use of ninety-two common words and this change is still found within two distinct levels of discourse: direct speech and the rest of narration. It was noted previously that since the middle of the twentieth century scholars have considered the style of fantasy texts as “unliterary”. Although it was beyond the scope of this thesis to address all these claims, the results of this thesis provide evidence against at least one of the claims concerning Rowling’s style, particularly Rowling’s apparent lack of sensitivity toward style. Through stylometry, this thesis provides evidence that Rowling’s style changes in Harry Potter in a manner that is consistently progressive according to the chronology of the novels. Furthermore, this change is not repeated in other fantasy series: it is not a trend that occurs whenever an author writes multiple novels set in the same fantasy world. C.S. Lewis’s style in Narnia changed for a different reason: The Lion, the Witch and the Wardrobe achieves what no other novel in the series does by portraying a world without backstory and this necessitates a different style. Diane Duane’s style in Young Wizards undergoes a shift that can be related to a writing hiatus rather than changes to novelistic purposes. The stylistic variations in Harry Potter, now confirmed with statistical evidence, are related to narrative purpose, and it is a purpose that runs deep, measured and identified in the usage of very common words.

The three case studies shed new light on how to formulate concepts of literary style in literature more broadly. For style to be celebrated there has to be a mixture of particular aesthetics and idiosyncrasies and yet the first case study (Chapter 3) revealed that the examples of early science fiction and fantasy contain neither of these elements against the backdrop of contemporaneous works. The Time Machine aligned particularly well with the underlying language of adventure fiction and it is argued in Chapter 3 that Wells employed strategies of adventure fiction as solutions to his otherwise unique representational problems, observed mainly through the defamiliarisation of the familiar, the concrete mixed with the abstract to extrapolate speculations, and the packaging of the new form, the scientific romance, in the style of the adventure form. This packaging achieves Wells’s goal to domesticate the impossible: the impossible is more readily contained when the author can draw on well-known solutions for packaging the exotic.

The second case study (Chapter 4) presented several tiers of evidence to highlight how even an unconventional writing style can be effective in delivering an imaginative vision. The first was a study of Stapledon and Wells where it was revealed that Stapledon’s style separates into two categories. One of these categories, the future history narrative, shares both representational aims and stylistic characteristics with one of Wells’s texts. The second tier was a comparison between Stapledon and Virginia Woolf and the results complicate the notion that shared representational aims can result in shared styles as some of their works that shared broad representational aims pursued very distinct solutions. Yet other works, the works with similar solutions but different representational problems, were stylistically closer. Hence, a correlation has emerged from the first two case studies: shared artistic solutions can lead to similar underlying styles. The artistic solutions may be shared even when the representational problems and narrative purposes are not shared.

The third, and final, case study extends this conclusion. The study on stylistic variation in Harry Potter (Chapter 5) proposed that stylistic choices exist at a level far deeper than previously thought. Although some critics may claim that an author was insensitive to the demands of style, the depth of change in Harry Potter, observable in different types of discourse within the sequence, indicates that the decisions made at a higher level influence more of the verbal universe than previously conceived by critics of fantasy fiction. Significantly, as the narrative purposes of Harry Potter change, so too does the style. The statistical study reveals measures of style in Harry Potter that include: an increasing use of that, marking a shift from the casual explanations of the early books to the in-depth explanations of the late books; a prevalence of prepositions in the early books; and an increasing frequency of singular pronouns and decreasing use of third person plural pronouns. As the need for more information increases, Dumbledore’s rate of dialogue increases, and as the frequency of Dumbledore’s explanations increases the style changes. As Harry ages his inner dialogue increases, and as the reader is given more insights into Harry’s thoughts the style changes. As Harry increasingly acts alone, he is in conflict with his peers and the style changes. Thus, the stylistic shift from Philosopher’s Stone to Deathly Hallows can be observed in the frequencies of only ninety-two words. Therefore, a change in narrative purpose can attract different artistic solutions and these solutions can even be observed changing in different types of discourses, in both direct speech and the rest of narration. Although Thomas Pavel first argued that genre norms consist of artistic strategies, his comments concerned salient genre norms such as the rhyming couplets of a sonnet and, more broadly, the identifiable artistic solutions found to representational problems. This thesis has revealed that the usage of very common words can be linked to observable artistic solutions.

The results in Chapter 5, in particular, demonstrate stylistic variation in more than one type of discourse operating in the Harry Potter novels. Such depth of artistic solution, observable even in the frequencies of ninety-two words, is a surprising finding: not only was it unanticipated, it suggests that the salient features of genre fiction can be found in patterns of common and ubiquitous features.

Concerning the relationship between stylometry and literary criticism, this thesis offers an insight into the role of the Digital Humanities in one of the home disciplines. Stylometry had never before been applied to the genres of science fiction and fantasy, yet the methodology used in this thesis, PCA, has been applied to literary questions since the late twentieth century. John Burrows has argued since the late 1980s that the most neglected region of the verbal universe consists of the most common words and that the study of these regions in literature has both a “light of its own to shed” as well as a “distinct bearing on questions of importance in the territory of literary interpretation” (Computation 2). The results in this thesis present evidence to support both these outcomes: the results offer insight into the fascinating patterns in word usage and together are evidence that genre fiction can be effectively studied through the measuring and identification of stylistic variations in the usage of one hundred words or fewer.

Overall, this thesis demonstrates that the study of style is informative in multiple scenarios: the study of style is illuminating even where it aligns science fiction works with styles found in popular adventure fiction; the study of style is informative even in the study of science fiction works deemed unliterary and therefore problematic; and the study of style reveals the depths of stylistic choices in popular fantasy fiction. In all three cases, the quantified study of style revealed avenues of study in previously neglected and critically dismissed areas of these two genres. The higher-order choices, to write a narrative chronicling time travel, to write stories that span billions of years, to write in a sequence of seven novels, together with all the accompanying choices, filter down to the lower-order choices, even the use of the one hundred or so most common words. One of the first quotes to appear in this thesis was Le Guin’s: “every word counts” (154). Through stylometry this thesis demonstrates how just one hundred words counted is enough to reveal new insights about texts, authors, genres and styles.


Appendices

Appendix A

R Script:

# reads csv file and standardises each column (word) to mean 0, sd 1
my.data <- as.data.frame(scale(read.csv("word-frequencies.csv", comment.char = "", row.names = 1)))

# performs pca on dataset
my.pca <- prcomp(my.data, scale. = TRUE)

# prints summary of component variances
summary(my.pca)

# Test One: to ensure that sum of sdev^2 equals number of variables
sum((my.pca$sdev)^2)

# Test Two: sum((my.pca$rotation[,1])^2) must return 1
my.pca$rotation[, 1]
sum((my.pca$rotation[, 1])^2)

# Test Three: calculates PC scores to compare to prcomp$x
calcpc <- function(variables, loadings) {
  # keep the coerced data frame (the coercion result was previously discarded)
  variables <- as.data.frame(variables)
  numsamples <- nrow(variables)
  pc <- numeric(numsamples)
  numvariables <- length(variables)
  for (i in 1:numsamples) {
    valuei <- 0
    for (j in 1:numvariables) {
      valueij <- variables[i, j]
      loadingj <- loadings[j]
      valuei <- valuei + (valueij * loadingj)
    }
    pc[i] <- valuei
  }
  return(pc)
}
calcpc(my.data, my.pca$rotation[, 1])
my.pca$x[, 1]

# Test Five: graphs percentage of variance in new components to examine which component has variance drop-off
screeplot(my.pca, type = 'lines')

# Outputs: rotation gives word weightings, x gives text scores
write.csv(my.pca$rotation, "loadings.csv")
write.csv(my.pca$x, "scores.csv")
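The checks labelled Test One to Test Three in the script rest on standard properties of PCA carried out on standardised data. As a brief sketch only, and using notation that does not appear in the thesis and is assumed here for illustration (X for the n-by-p matrix of standardised word frequencies, with n texts and p words; R for its correlation matrix; \lambda_k and v_k for the k-th eigenvalue and unit-length eigenvector of R; t_{i1} for the score of text i on the first component), the identities being tested can be written as:

```latex
% Assumed notation (not from the thesis): X is n x p, columns have mean 0 and sd 1.
% prcomp reports sdev_k = sqrt(lambda_k) and rotation[, k] = v_k.
\[
  R = \frac{1}{n-1}\, X^{\top} X ,
  \qquad R\, v_k = \lambda_k v_k ,
  \qquad \lambda_k = \mathrm{sdev}_k^{2}
\]

% Test One: every diagonal entry of a correlation matrix is 1, so its trace is p.
\[
  \sum_{k} \lambda_k = \operatorname{tr}(R) = p
\]

% Test Two: each loading vector has unit length.
\[
  \sum_{j=1}^{p} v_{j1}^{2} = 1
\]

% Test Three: a text's score is the loading-weighted sum of its standardised frequencies.
\[
  t_{i1} = \sum_{j=1}^{p} x_{ij}\, v_{j1}
\]

% Scree plot: share of total variance carried by component k.
\[
  \frac{\lambda_k}{\sum_{m} \lambda_m} = \frac{\lambda_k}{p}
\]
```

On this reading, a ninety-nine-word list such as the one in Appendix C should give 99 for Test One, up to floating-point rounding and provided no word has zero variance across the corpus, while the values printed by calcpc should match my.pca$x[,1].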

Appendix B

Corpus list of thirty-one books by thirty-one authors used in Chapter 3. Publication date is original date. All versions used in study are those released on Project Gutenberg.

Author                    Title                     Publication Date  Abbreviation
Black, William            McLeod of Dare            1878              Black_McLeodofDare
Blackmore, R D            Mary Anerley              1880              Blackmore_MaryAnerley
Braddon, Mary Elizabeth   Vixen                     1879              Braddon_Vixen
Broughton, Rhoda          Nancy                     1873              Broughton_Nancy
Butler, Samuel            Erewhon                   1872              Butler_Erewhon
Conrad, Joseph            Nigger of the Narcissus   1897              Conrad_NiggeroftheNarcissus
Corelli, Marie            Thelma                    1887              Corelli_Thelma
Disraeli, Benjamin        Endymion                  1880              Disaraeli_Endymion
Du Maurier, George        The Martian               1896              DuMaurier_TheMartian
Eliot, George             Daniel Deronda            1876              Eliot_DanielDeronda
Gissing, George           New Grub Street           1891              Gissing_NewGrubStreet
Grand, Sarah              The Beth Book             1897              Grand_TheBethBook
Haggard, H. Rider         King Solomon’s Mines      1898              Haggard_KingSolomonsMines
Hardy, Thomas             Jude the Obscure          1895              Hardy_JudetheObscure
Jefferies, Richard        Amaryllis At The Fair     1887              Jefferies_AmaryllisAtTheFair
Kipling, Rudyard          The Light that Failed     1890              Kipling_LightthatFailed
Bulwer-Lytton, Edward     The Coming Race           1871              Bulwer-Lytton_TheComingRace
MacDonald, George         Lilith                    1895              MacDonald_Lilith
Mason, A.E.W.             The Philanderers          1897              Mason_ThePhilanderers
Meredith, George          Beauchamp’s Career        1875              Meredith_BeauchampsCareer
Morris, William           News from Nowhere         1890              Morriws_NewsfromNowhere
Oliphant, Margaret        Sir Tom                   1893              Oliphant_SirTom
Ouida                     Waters of Edera           1900              Ouida_WatersofEdera
Payn, James               Bred in the Bone          1872              Payn_BredInTheBone
Reade, Charles            The Woman Hater           1876              Reade_TheWomanHater
Stevenson, Robert Louis   The Master of Ballantrae  1889              Stevenson_MasterofBallantrae
Trollope, Anthony         The Eustace Diamonds      1871              Trollope_TheEustaceDiamonds
Ward, Mary Augusta        Marcella                  1894              Ward_Marcella
Wells, H.G.               The Time Machine          1895              Wells_TimeMachine
Yonge, Charlotte M        Lady Hester               1874              Yonge_LadyHester

Appendix C

Word list for Chapter 3, 31 Novels 1871-1900, Ninety-Nine Most Frequently Used Words – listed in alphabetical order:

a, about, after, all, am, an, and, any, are, as, at, be, been, before, but, by, can, come, could, did, do, down, for, from, go, good, had, has, have, he, her, him, his, how, I, if, in, into, is, it, know, like, little, made, man, me, more, much, must, my, never, no, not, now, of, old, on, one, only, or, other, out, own, said, say, see, she, should, so, some, such, than, that, the, their, them, then, there, they, think, this, time, to, up, upon, very, was, we, well, were, what, when, which, who, will, with, would, you, your

Appendix D

A list of texts used in Chapter 4 including those by Olaf Stapledon, H.G. Wells and Virginia Woolf.

Author            Title                         Publication Date  Abbreviation
Stapledon, Olaf   A Man Divided                 1950              OS_Man
Stapledon, Olaf   Darkness and the Light        1942              OS_Darkness
Stapledon, Olaf   Death Into Life               1946              OS_Death
Stapledon, Olaf   Last Men in London            1933              OS_London
Stapledon, Olaf   Last and First Men            1930              OS_Last
Stapledon, Olaf   Odd John                      1935              OS_John
Stapledon, Olaf   Sirius                        1944              OS_Sirius
Stapledon, Olaf   Star Maker                    1937              OS_Star
Stapledon, Olaf   The Flames                    1947              OS_Flames
Wells, H.G.       A Modern Utopia               1905              HGW_Modern
Wells, H.G.       The First Men in the Moon     1901              HGW_First
Wells, H.G.       The Invisible Man             1897              HGW_Invisible
Wells, H.G.       The Island of Dr Moreau       1896              HGW_Moreau
Wells, H.G.       Star Begotten                 1937              HGW_Star
Wells, H.G.       The Shape of Things to Come   1933              HGW_Shape
Wells, H.G.       The Sleeper Awakes*           1910              HGW_Sleeper
Wells, H.G.       The Wonderful Visit           1895              HGW_Visit
Wells, H.G.       The Time Machine              1895              HW_TTM
Wells, H.G.       The War of the Worlds         1898              HGW_War
Woolf, Virginia   Between the Acts              1941              Woolf_Between
Woolf, Virginia   Jacob’s Room                  1922              Woolf_JacobsRoom
Woolf, Virginia   Mrs Dalloway                  1925              Woolf_MrsDalloway
Woolf, Virginia   Night and Day                 1919              Woolf_NightandDay
Woolf, Virginia   Orlando                       1928              Woolf_Orlando
Woolf, Virginia   The Voyage Out                1915              Woolf_TheVoyageOut
Woolf, Virginia   The Waves                     1931              Woolf_TheWaves
Woolf, Virginia   The Years                     1937              Woolf_TheYears
Woolf, Virginia   To the Lighthouse             1927              Woolf_TotheLighthouse

*The revised edition of When the Sleeper Awakes (1899).

Appendix E

Word List for Chapter 4, One Hundred Most Frequently Used Words in Stapledon and Wells Corpus – listed in alphabetical order:

a, about, after, again, all, an, and, any, are, as, at, be, been, before, but, by, came, could, did, do, down, even, first, for, from, great, had, have, he, her, him, his, I, if, in, into, is, it, its, life, like, little, made, man, me, men, more, must, my, new, no, not, now, of, on, one, only, or, other, our, out, over, own, people, said, seemed, she, so, some, still, than, that, the, their, them, then, there, these, they, things, this, through, time, to, up, upon, us, very, was, we, were, what, when, which, who, will, with, world, would, you


Appendix F

Word List for Chapter 4, Ninety-Nine Most Frequently Used Words in Stapledon and Woolf Corpus – listed in alphabetical order:

a, about, after, again, all, an, and, are, as, at, be, been, but, by, could, did, do, down, even, for, from, great, had, have, he, her, him, himself, his, how, I, if, in, into, is, it, its, life, like, little, looked, made, man, me, men, more, must, my, no, not, now, of, old, on, one, only, or, other, our, out, over, own, people, said, see, seemed, she, so, some, than, that, the, their, them, then, there, these, they, this, though, thought, through, time, to, too, up, upon, very, was, we, were, what, when, which, who, with, world, would, you

Appendix G

A list of texts used in Chapter 5 including those by J.K. Rowling*, C.S. Lewis†, and Diane Duane‡:

Author          Title                                               Publication Date  Abbreviation
Rowling, J.K.   Harry Potter and the Philosopher’s Stone            1997              HP1
Rowling, J.K.   Harry Potter and the Chamber of Secrets             1998              HP2
Rowling, J.K.   Harry Potter and the Prisoner of Azkaban            1999              HP3
Rowling, J.K.   Harry Potter and the Goblet of Fire                 2000              HP4
Rowling, J.K.   Harry Potter and the Order of the Phoenix           2003              HP5
Rowling, J.K.   Harry Potter and the Half-Blood Prince              2005              HP6
Rowling, J.K.   Harry Potter and the Deathly Hallows                2007              HP7
Lewis, C.S.     The Lion, the Witch and the Wardrobe                1950              Lion
Lewis, C.S.     Prince Caspian                                      1951              Caspian
Lewis, C.S.     The Voyage of the Dawn Treader                      1952              Voyage
Lewis, C.S.     The Silver Chair                                    1953              Silver
Lewis, C.S.     The Horse and His Boy                               1954              Horse
Lewis, C.S.     The Magician’s Nephew                               1955              Magician
Lewis, C.S.     The Last Battle                                     1956              Last
Duane, Diane    So You Want to be a Wizard                          1983              YW_1
Duane, Diane    Deep Wizardry                                       1985              YW_2
Duane, Diane    High Wizardry                                       1990              YW_3
Duane, Diane    A Wizard Abroad                                     1993              YW_4
Duane, Diane    The Wizard's Dilemma                                2001              YW_5
Duane, Diane    A Wizard Alone                                      2002              YW_6
Duane, Diane    Wizard's Holiday                                    2003              YW_7
Duane, Diane    Wizards at War                                      2005              YW_8
Duane, Diane    A Wizard of Mars                                    2010              YW_9
Duane, Diane    So You Want to be a Wizard New Millennium Edition   2012              YW__NM_1
Duane, Diane    Deep Wizardry New Millennium Edition                2012              YW_NM_2
Duane, Diane    High Wizardry New Millennium Edition                2012              YW_NM_3
Duane, Diane    A Wizard Abroad New Millennium Edition              2013              YW_NM_4
Duane, Diane    The Wizard's Dilemma New Millennium Edition         2013              YW_NM_5
Duane, Diane    A Wizard Alone New Millennium Edition               2013              YW_NM_6
Duane, Diane    Wizard's Holiday New Millennium Edition             2013              YW_NM_7
Duane, Diane    Wizards at War New Millennium Edition               2013              YW_NM_8
Duane, Diane    A Wizard of Mars New Millennium Edition             2013              YW_NM_9

* All J.K. Rowling texts are eBooks published by Pottermore Limited and purchased DRM-free.

† All the C.S. Lewis texts are eBooks published by HarperCollins and used with permissions.

‡ All the Diane Duane texts are eBooks published by Errantry Press and purchased DRM-free.

Appendix H

Word List for Chapter 5, Ninety-Two Most Frequently Used Words in J.K. Rowling’s Harry Potter Sequence – listed in alphabetical order:

a, about, again, all, an, and, are, around, as, at, back, be, been, but, by, could, did, do, don't, down, eyes, face, for, from, get, got, had, have, he, her, him, his, how, I, if, in, into, is, it, its, just, know, like, looked, looking, me, more, my, no, not, now, of, off, on, one, or, out, over, right, said, see, she, so, still, that, the, their, them, then, there, they, think, this, though, through, time, to, up, very, wand, was, we, well, were, what, when, which, who, with, would, you, your

Appendix I

Word List for Chapter 5, Ninety-Three Most Frequently Used Words in C.S. Lewis’s The Complete Chronicles of Narnia – listed in alphabetical order:

a, about, all, an, and, are, as, at, back, be, been, before, but, by, came, can, come, could, did, do, don't, down, for, from, go, good, great, had, have, he, her, him, his, how, I, if, in, into, is, it, just, know, like, little, long, looked, me, more, my, no, not, now, of, on, one, only, or, out, right, said, say, see, she, so, than, that, the, their, them, then, there, they, think, this, time, to, two, up, us, very, was, we, well, were, what, when, which, who, will, with, would, you, your

Appendix J

Word List for Chapter 5, Ninety-Seven Most Frequently Used Words in Diane Duane’s Young Wizards series – listed in alphabetical order:

a, about, again, all, an, and, are, around, as, at, away, back, be, been, but, by, can, could, do, don't, down, even, for, from, get, go, going, got, had, have, he, her, here, him, his, how, I, if, in, into, is, it, its, just, know, like, little, long, look, looked, me, more, no, not, now, of, on, one, or, other, out, over, right, said, see, she, so, some, something, than, that, the, their, them, then, there, they, this, thought, through, time, to, too, up, was, way, we, well, went, were, what, when, where, with, would, you, your


Works Cited

Aldiss, Brian W. Billion Year Spree: The True History of Science Fiction. Doubleday & Company, Inc., 1973.

Allison, Sarah, Ryan Heuser, et al. Quantitative Formalism: An Experiment. 1, Stanford Literary Lab, 2011, https://litlab.stanford.edu/LiteraryLabPamphlet1.pdf.

Allison, Sarah, Marissa Gemma, et al. Style at the Scale of the Sentence. 5, 2013, http://litlab.stanford.edu/LiteraryLabPamphlet5.pdf.

Anderson, Porter. “With Booksellers’ Pressure: DRM Is Now Soft in Germany.” The Bookseller, 2015, https://www.thebookseller.com/futurebook/certain-digital-undoing-drm-germany-309359.

Andres, Holly. “Who Won Science Fiction’s Hugo Awards, and Why It Matters.” Wired, Aug. 2015, https://www.wired.com/2015/08/won-science-fictions-hugo-awards-matters.

Ashley, Mike, and John Clute. “Stapledon, Olaf.” The Encyclopedia of Science Fiction, edited by John Clute et al., Gollancz, 2015.

Attebery, Brian. “Elizabeth Enright and the Family Story as Genre.” Children’s Literature, vol. 37, no. 1, 2009, pp. 114–136, doi:10.1353/chl.0.0811.

---. “Fantasy and the Narrative Transaction.” Style, vol. 25, no. 1, 1991, pp. 28–41, http://www.jstor.org/stable/42945882.

---. Strategies of Fantasy. Indiana University Press, 1992.

Baayen, Harald, et al. “Outside the Cave of Shadows: Using Syntactic Annotation to Enhance Authorship Attribution.” Literary and Linguistic Computing, vol. 11, no. 3, 1996, pp. 121–131, doi:10.1093/llc/11.3.121.

Balossi, Giuseppina. Corpus Linguistic Approach to Literary Language and Characterization: Virginia Woolf’s The Waves. John Benjamins Publishing Company, 2014.

Barron, Neil. Anatomy of Wonder: A Critical Guide to Science Fiction. R.R. Bowker Company, 1981.

Behr, Kate. “Philosopher’s Stone to Resurrection Stone: Narrative Transformations across the Harry Potter Series.” Critical Perspectives on Harry Potter, edited by Elizabeth Heilman, Routledge, 2008, pp. 257–271, https://ebookcentral-proquest-com.ezproxy.newcastle.edu.au/lib/newcastle/detail.action?docID=355844#.

Binongo, José Nilo G., and M. W. A. Smith. “The Application of Principal Component Analysis to Stylometry.” Literary and Linguistic Computing, vol. 14, no. 4, 1999, pp. 445–465, doi:10.1093/llc/14.4.445.

Branham, Robert. “Stapledon’s ‘Agnostic Mysticism.’” Science Fiction Studies, vol. 9, no. 3, 1982, pp. 249–256, http://www.jstor.org/stable/42395.

Briggs, Julia. “The Novels of the 1930s and the Impact of History.” The Cambridge Companion to Virginia Woolf, edited by Susan Sellers, Cambridge University Press, 2010, pp. 70–88.

Broderick, Damien. Reading by Starlight: Postmodern Science Fiction. Taylor and Francis, 2005.

Bulwer Lytton, Edward. The Coming Race. Project Gutenberg, 2006, http://www.gutenberg.org/1/9/5/1951/.

Burrows, John. Computation into Criticism: A Study of Jane Austen and an Experiment in Method. Clarendon Press, 1987.

---. “Not Unless You Ask Nicely: The Interpretative Nexus Between Analysis and Information.” Literary and Linguistic Computing, vol. 7, no. 2, 1992, pp. 91–109, doi:10.1093/llc/7.2.91.

---. “PCA”. Received by Naomi Fraser, 5 April 2016.

---. “Style.” The Cambridge Companion to Jane Austen, edited by Edward Copeland and Juliet McMaster, Cambridge University Press, 1997.

---. “Textual Analysis.” A Companion to Digital Humanities, edited by Susan Schreibman et al., Blackwell, 2004, doi:10.1002/9780470999875.ch23.

---. “Word-Patterns and Story-Shapes: The Statistical Analysis of Narrative Style.” Literary and Linguistic Computing, vol. 2, no. 2, 1987, pp. 61–70, doi:10.1093/llc/2.2.61.

Butler, Samuel. Erewhon. Project Gutenberg, 2005, http://www.gutenberg.org/dirs/1/9/0/1906.

Canavan, Gerry. “‘A Dread Mystery, Compelling Adoration’: Olaf Stapledon, Star Maker, and Totality.” Science Fiction Studies, vol. 43, no. 2, 2016, pp. 310–330, doi:10.5621/sciefictstud.43.2.0310.

Card, Orson Scott. Ender’s Game. Orbit, 2011.

Chandler, Daniel. “An Introduction to Genre Theory.” Www.aber.ac.uk/, 2000, http://www.aber.ac.uk/media/Documents/intgenre/chandler_genre_theory.pdf.

Conrad, Joseph. The Nigger of the Narcissus. Project Gutenberg, 2006, http://www.gutenberg.org/1/7/7/3/17731/.

Craig, Hugh. “A and an in English Plays, 1580-1639.” Texas Studies in Literature and Language, vol. 53, no. 3, 2011, pp. 273–293, doi:10.1353/tsl.2011.0013.

---. “Authorial Attribution and Computational Stylistics: If You Can Tell Authors Apart, Have You Learned Anything about Them?” Literary and Linguistic Computing, vol. 14, no. 1, 1999, pp. 103–113, doi:10.1093/llc/14.1.103.

---. “Contrast and Change in the Idiolects of Ben Jonson Characters.” Computers and the Humanities, vol. 33, 1999, pp. 221–240, http://www.jstor.org/stable/30200502.

---. “Stylistic Analysis and Authorship Studies.” A Companion to Digital Humanities, edited by Susan Schreibman et al., Blackwell, 2004, doi:10.1002/9780470999875.ch20.

Craig, Hugh, and Brett Greatley-Hirsch. Style, Computers, and Early Modern Drama: Beyond Authorship. Cambridge University Press, 2017.

Craig, Hugh, and Arthur F. Kinney. Shakespeare, Computers, and the Mystery of Authorship. Cambridge University Press, 2009.

Cristani, Marco, et al. “Conversationally-Inspired Stylometric Features for Authorship Attribution in Instant Messaging.” Proceedings of the 20th ACM International Conference on Multimedia, ACM, 2012, pp. 1121–1124, doi:10.1145/2393347.2396398.

Crossley, Robert. “Famous Mythical Beasts: Olaf Stapledon and H.G. Wells.” Georgia Review, vol. 36, no. 3, 1982, pp. 619–635, http://www.jstor.org/stable/41398486.

---. “Introduction.” An Olaf Stapledon Reader, edited by Robert Crossley, Syracuse University Press, 1997.

---. Olaf Stapledon: Speaking for the Future. Liverpool University Press, 1994.

---. “Olaf Stapledon and the Idea of Science Fiction.” Modern Fiction Studies, vol. 32, no. 1, 1986, pp. 21–42, doi:10.1353/mfs.0.1241.

---. “The Correspondence of Olaf Stapledon and H. G. Wells, 1931-1942.” Science Fiction Dialogues, edited by Gary Wolfe, Academy Chicago, 1982.

Cullen, Alma. “Letter.” London Review of Books, vol. 9, no. 14, 1987, http://www.lrb.co.uk/v09/n12/marilyn-butler/jane-austens-word-process.

Daelemans, Walter. “Explanation in Computational Stylometry.” Computational Linguistics and Intelligent Text Processing: 14th International Conference, CICLing 2013, Samos, Greece, March 24-30, 2013, Proceedings, Part I, edited by A. Gelbukh, Springer, 2013, pp. 451–462, doi:10.1007/978-3-642-37247-6.

Dalen-Oskam, Karina van. “Names in Novels: An Experiment in Computational Stylistics.” Literary and Linguistic Computing, vol. 28, no. 2, 2013, pp. 359–370, doi:10.1093/llc/fqs007.

Davenport, Basil. “The Vision of Olaf Stapledon.” To The End of Time, edited by Olaf Stapledon, Gregg Press, 1975, pp. vii–xiv.

Delany, Samuel R. Silent Interviews: On Language, Race, Sex, Science Fiction, and Some Comics: A Collection of Written Interviews. Wesleyan University Press, 1994.

---. Starboard Wine: More Notes on the Language of Science Fiction. Wesleyan University Press, 2012, ProQuest Ebook Central, http://ebookcentral-proquest-com.ezproxy.newcastle.edu.au/lib/newcastle/detail.action?docID=956196.

---. The Jewel-Hinged Jaw: Notes on the Language of Science Fiction. Revised ed., Wesleyan University Press, 2009, ProQuest Ebook Central, http://ebookcentral-proquest-com.ezproxy.newcastle.edu.au/lib/newcastle/detail.action?docID=776841.

Di Franco, Giovanni, and Alberto Marradi. Factor Analysis and Principal Component Analysis. FrancoAngeli, 2013.

Duane, Diane. “Young Wizards New Millennium Editions: A Little More Info.” Out of Ambit: Diane Duane’s Weblog, http://dianeduane.com/outofambit/2011/05/30/young-wizards-new-millennium-editions-a-little-more-info/. Accessed 6 Oct. 2017.

Duff, David. “Introduction.” Modern Genre Theory, Longman, 2000, pp. 1–24.

Eder, Maciej. “Does Size Matter? Authorship Attribution, Small Samples, Big Problem.” Digital Scholarship in the Humanities, vol. 30, no. 2, 2015, pp. 167–182, doi:10.1093/llc/fqt066.

---. “Mind Your Corpus: Systematic Errors in Authorship Attribution.” Literary and Linguistic Computing, vol. 28, no. 4, 2013, pp. 603–614, doi:10.1093/llc/fqt039.

---. Stylo: A Package for Stylometric Analyses. 2013, pp. 1–36, https://sites.google.com/site/computationalstylistics/stylo/stylo_howto.pdf?attredirects=0&d=1.

---. “Stylometry with R: A Package for Computational Text Analysis.” The R Journal, vol. 8, no. 1, 2016, pp. 107–121, https://journal.r-project.org/archive/2016/RJ-2016-007/index.html.

---. “Visualization in Stylometry: Cluster Analysis Using Networks.” Digital Scholarship in the Humanities, vol. 32, no. 1, 2017, pp. 50–64, doi:10.1093/llc/fqv061.

Ellegård, Alvar. A Statistical Method for Determining Authorship: The Junius Letters, 1769-1772. University of Gothenburg, 1962.

Elliott, Jack. “Patterns and Trends in Harlequin Category Romances.” Advancing Digital Humanities: Research, Methods, Theories, edited by Paul Longley Arthur and Katherine Bode, Palgrave Macmillan, 2014, pp. 54–67.

“February 2017 Big, Bad, Wide & International Report: Covering Amazon, Apple, B&N, and Kobo Ebook Sales in the US, UK, Canada, Australia, and New Zealand.” Author Earnings, 2017, http://authorearnings.com/report/february-2017/.

Fish, Stanley. How to Write a Sentence: And How to Read One. Harper, 2011.

Fowler, Alastair. Kinds of Literature: An Introduction to the Theory of Genres and Modes. Harvard University Press, 1982.

---. “The Formation of Genres in the Renaissance and After.” New Literary History, vol. 34, no. 2, 2003, pp. 185–200.

Freedman, Carl. “Style, Fiction, Science Fiction: The Case of Philip K. Dick.” Styles of Creation: Aesthetic Technique and the Creation of Fictional Worlds, edited by George Slusser and Eric S. Rabkin, University of Georgia Press, 1992, pp. 30–43.

Frow, John. Genre. Routledge, 2006.

Gentle, James E. Matrix Algebra: Theory, Computations, and Applications in Statistics. Springer, 2007.

Goodwin, Jonathan. “Telepathy and Cosmic Horror in Olaf Stapledon’s The Flames.” Journal of

the Fantastic in the Arts, vol. 25, no. 1, 2014, pp. 78–92.

Gordon, Joan. "Animal Studies." The Routledge Companion to Science Fiction, edited by Mark

Bould, Andrew M. Butler, Adam Roberts, and Sherryl Vint. Routledge, 2009.

Guthrie, Steve. “Dialogics and Prosody in Chaucer.” Bakhtin and Medieval Voices, edited by

Thomas J. Farrell, University Press of Florida, 1995, pp. 94–108.

Haggard, H. Rider. King Solomon’s Mines. Project Gutenberg, 2005,

http://www.gutenberg.org/2/1/6/2166/.

Hein, Rolland. George MacDonald: Victorian Mythmaker. Wipf & Stock, 2013.

Henry, Holly. Virginia Woolf and the Discourse of Science: The Aesthetics of Astronomy. Cambridge

University Press, 2003.

Herrmann, J. Berenike, et al. “Revisiting Style, a Key Concept in Literary Studies.” Journal of

Literary Theory, vol. 9, no. 1, 2015, pp. 25–52, doi:10.1515/jlt-2015-0003.

Heuser, Ryan, and Long Le-Khac. A Quantitative Literary History of 2,958 Nineteenth-Century

British Novels: The Semantic Cohort Method. 4, 2012,

https://litlab.stanford.edu/LiteraryLabPamphlet4.pdf.

Holmes, David I. “The Evolution of Stylometry in Humanities Scholarship.” Literary and

Linguistic Computing, vol. 13, no. 3, 1998, pp. 111–117, doi:10.1093/llc/13.3.111.

Hoover, David L. “Corpus Stylistics, Stylometry, and the Styles of Henry James.” Style, vol. 41,

no. 2, 2007, pp. 174–203, http://www.jstor.org/stable/10.5325/style.41.2.174.

---. “Frequent Collocations and Authorial Style.” Literary and Linguistic Computing, vol. 18, no. 3,

2003, pp. 261–286, doi:10.1093/llc/18.3.261.

---. “Multivariate Analysis and the Study of Style Variation.” Literary and Linguistic Computing,

vol. 18, no. 4, 2003, pp. 341–360, doi:10.1093/llc/18.4.341.

---. “Statistical Stylistics and Authorship Attribution: An Empirical Investigation.” Literary and

Linguistic Computing, vol. 16, no. 4, 2001, pp. 421–444, doi:10.1093/llc/16.4.421.

Hothorn, Torsten, and Brian S. Everitt. A Handbook of Statistical Analyses Using R. 3rd ed.,

CRC Press, 2014.

Huddleston, Rodney D., and Geoffrey K. Pullum. The Cambridge Grammar of the English

Language. Edited by Rodney D. Huddleston and Geoffrey K. Pullum, Cambridge University

Press, 2002.

Hume, Kathryn. Fantasy and Mimesis: Responses to Reality in Western Literature. Methuen, 1984.

Huntington, John. “Olaf Stapledon and the Novel About the Future.” Contemporary Literature,

vol. 22, no. 3, 1981, pp. 349–365, http://www.jstor.org/stable/1208284.

Ingenta. “IPA Report Says Global Publishing Productivity Is Up But Growth Is Down.” Blog,

2014, http://www.ingenta.com/blog-article/ipa-report-says-global-publishing-

productivity-is-up-but-growth-is-down-2/.

Jameson, Fredric. Archaeologies of the Future: The Desire Called Utopia and Other Science Fictions.

Verso, 2005.

Jockers, Matthew. Macroanalysis: Digital Methods and Literary History. University of Illinois Press,

2013.

---. “Syuzhet: Extracts Sentiment and Sentiment-Derived Plot Arcs from Text.” R-Project, 2016,

https://cran.r-project.org/package=syuzhet.

Jolly, Roslyn. “Postcolonial Readings.” A Companion to the Victorian Novel, edited by William

Baker, Greenwood Publishing Group, 2002.

Keen, Suzanne. “The Series Novel: A Dominant Form.” Cambridge History of the English Novel,

edited by Robert L. Caserio and Clement Hawes, Cambridge University Press, 2012, pp.

724–739.

Kenny, Anthony. A Stylometric Study of the New Testament. Clarendon Press, 1986.

---. The Computation of Style: An Introduction to Statistics for Students of Literature and Humanities.

Pergamon Press, 1982.

Kestemont, Mike. “Stylometry for Medieval Authorship Studies.” Digital Philology, vol. 1, no. 1,

2012, pp. 42–72, doi:10.1353/dph.2012.0002.

Kincaid, J. Peter, et al. “Derivation of New Readability Formulas (Automated Readability Index,

Fog Count and Flesch Reading Ease Formula) for Navy Enlisted Personnel.” Research

Branch Report 8-75, 1975.

Kinnaird, John. “Stapledon, (William) Olaf.” Twentieth-Century Science-Fiction Writers, edited by

Curtis C. Smith, St. Martin’s Press, 1981, pp. 516–517.

Kirk, Elizabeth D. “‘I Would Rather Have Written in Elvish’: Language, Fiction and ‘The Lord

of the Rings.’” Novel: A Forum on Fiction, vol. 5, no. 1, 1971, pp. 5–18.

L’Engle, Madeleine. Walking on Water. WaterBrook Press, 2001.

Latham, Rob. “Introduction.” The Oxford Handbook of Science Fiction, edited by Rob Latham,

Oxford University Press, 2014, pp. 1–19.

Le Guin, Ursula. “From Elfland to Poughkeepsie.” Fantastic Literature: A Critical Reader, edited

by David Sandner, Praeger, 2004, pp. 144–155.

Lee, Hermione. “Virginia Woolf’s Essays.” The Cambridge Companion to Virginia Woolf, edited by

Susan Sellers, Cambridge University Press, 2010, pp. 89–106.

Leech, Geoffrey N., and Michael H. Short. Style in Fiction. Longman, 1981.

Levy, Michael, and Farah Mendlesohn. Children’s Fantasy Literature: An Introduction. Cambridge

University Press, 2016.

Lewis, C. S. A Preface to Paradise Lost. Oxford University Press, 1960.

---. Prince Caspian. HarperCollins, 2010.

---. The Horse and His Boy. HarperCollins, 2010.

---. The Magician’s Nephew. HarperCollins, 2010.

Liu, Alan. “Where Is Cultural Criticism in the Digital Humanities.” Debates in the Digital

Humanities, edited by Matthew K. Gold and Lauren F. Klein, University of Minnesota

Press, 2012, http://dhdebates.gc.cuny.edu/debates/text/22.

MacDonald, George. Lilith. Project Gutenberg, 1999, http://www.gutenberg.org/1/6/4/1640/.

Mandala, Susan. Language in Science Fiction and Fantasy: The Question of Style. Continuum

International Publishing, 2010.

Marcus, Leonard S., editor. The Wand in the Word: Conversations with Writers of Fantasy.

Candlewick Press, 2006.

Martin, George R. R. A Game of Thrones. HarperCollins, 2011.

McCarthy, Patrick A. Olaf Stapledon. Twayne, 1982.

McCarty, Willard. Humanities Computing. Palgrave Macmillan, 2005, https://ebookcentral-

proquest-com.ezproxy.newcastle.edu.au/lib/newcastle/detail.action?docID=257474.

McKenna, C. W. F., and A. Antonia. “The Statistical Analysis of Style: Reflections on Form,

Meaning, and Ideology in the ‘Nausicaa’ Episode of Ulysses.” Literary and Linguistic

Computing, vol. 16, no. 4, 2001, pp. 353–373, doi:10.1093/llc/16.4.353.

McKenna, Juliet. “The Genre Debate: Science Fiction Travels Farther Than Literary Fiction.” The

Guardian, 18 Apr. 2014.

McLean, Steven. The Early Fiction of H.G. Wells: Fantasies of Science. Palgrave Macmillan, 2009,

http://0-

www.palgraveconnect.com.library.newcastle.edu.au/pc/doifinder/10.1057/9780230236639.

Mendlesohn, Farah. “Fantasy in Children’s Fiction.” Modern Children’s Literature: An

Introduction, edited by Catherine Butler and Kimberley Reynolds, Palgrave Macmillan,

2014, pp. 24–37.

---. “Introduction.” The Cambridge Companion to Science Fiction, edited by Farah Mendlesohn and

Edward James, Cambridge University Press, 2006.

---. Rhetorics of Fantasy. Wesleyan University Press, 2008.

Mendlesohn, Farah, and Edward James. A Short History of Fantasy. Middlesex University Press,

2009.

---. “Introduction.” The Cambridge Companion to Fantasy Literature, edited by Edward James and

Farah Mendlesohn, Cambridge University Press, 2012.

Moretti, Franco. “Conjectures on World Literature.” New Left Review, vol. 1, 2000, pp. 54–68,

https://newleftreview.org/II/1/franco-moretti-conjectures-on-world-literature.

---. Distant Reading. Verso, 2013.

Morgan, Monique R. “Madness, Unreliable Narration, and Genre in The Purple Cloud.”

Science Fiction Studies, vol. 36, no. 2, 2009, pp. 266–283,

http://www.jstor.org/stable/pdf/40649959.

Moskowitz, Sam. Explorers of the Infinite: Shapers of Science Fiction. Hyperion Press, Inc., 1974.

Mosteller, Frederick, and David Wallace. Inference and Disputed Authorship: The Federalist.

Addison-Wesley, 1964.

Murry, J. Middleton. The Problem of Style. Oxford University Press, 1952.

Nel, Philip. J.K. Rowling’s Harry Potter Novels: A Reader’s Guide. A&C Black, 2001.

Oakes, Michael P. Literary Detective Work on the Computer. John Benjamins Publishing Company,

2014.

Parrinder, Patrick. “Introduction.” H.G. Wells: The Critical Heritage, edited by Patrick Parrinder,

Routledge and Kegan Paul, 1972, pp. 1–31.

---. Science Fiction: Its Criticism and Teaching. Methuen, 1980.

---. Shadows of the Future: H.G. Wells, Science, Fiction, and Prophecy. Liverpool University Press,

2004.

---. Utopian Literature and Science: From the Scientific Revolution to Brave New World and Beyond.

Palgrave Macmillan, 2015.

Partington, John S. “The Time Machine and A Modern Utopia: The Static and Kinetic Utopias of

the Early H.G. Wells.” Utopian Studies, vol. 13, no. 1, 2002, pp. 57–68,

www.jstor.org/stable/20718409.

Pavel, Thomas. “Literary Genres as Norms and Good Habits.” New Literary History, vol. 34, no.

2, 2003, pp. 201–210, www.jstor.org/stable/20057776.

Pennebaker, James W. The Secret Life of Pronouns: What Our Words Say About Us. Bloomsbury

Press, 2011.

Pennington, John. “From Elfland to Hogwarts, or the Aesthetic Trouble with Harry Potter.” The

Lion and the Unicorn, vol. 26, no. 1, 2002, pp. 78–97, https://muse.jhu.edu/article/35536.

Popper, Karl. The Logic of Scientific Discovery. Routledge, 1999.

Priest, Christopher. “British Science Fiction.” Science Fiction: A Critical Guide, edited by Patrick

Parrinder, Longman, 1979, pp. 187–202.

“Principal Components Analysis.” R-Manual, https://stat.ethz.ch/R-manual/R-

devel/library/stats/html/prcomp.html.

Rabkin, Eric S. “The Composite Fiction of Olaf Stapledon.” Science Fiction Studies, vol. 9, no. 3,

1982, pp. 238–248, http://www.jstor.org/stable/pdf/4239499.pdf.

Raffel, Burton. “‘The Lord of the Rings’ as Literature.” Tolkien and the Critics: Essays on J.R.R.

Tolkien’s “The Lord of the Rings,” edited by Neil D. Isaacs and Rose A. Zimbardo, University

of Notre Dame Press, 1968, pp. 218–246.

Ramsay, Stephen. “The Hermeneutics of Screwing Around; or What You Do with a Million

Books.” Pastplay: Teaching and Learning History with Technology, edited by Kevin Kee,

University of Michigan Press, 2014, pp. 110–121.

Randall, Neil. “Shoeless Joe: Fantasy and the Humor of Fellow-Feeling.” Modern Fiction Studies,

vol. 33, no. 1, 1987, pp. 173–182.

Reid, Robin Anne. “Mythology and History: A Stylistic Analysis of The Lord of the Rings.”

Style, vol. 43, no. 4, 2009, pp. 517–615, www.jstor.org/stable/10.5325/style.43.4.517.

Richardson, Brian. “Unnatural Narrative Theory.” Style, vol. 50, no. 4, 2016, pp. 385–405.

Rieder, John. Colonialism and the Emergence of Science Fiction. Wesleyan University Press, 2008.

---. “On Defining SF, or Not: Genre Theory, SF, and History.” Science Fiction Studies, vol. 37, no.

2, 2010, pp. 191–209, http://www.jstor.org/stable/25746406.

Roberts, Adam. Science Fiction. Taylor and Francis, 2005.

---. The History of Science Fiction. Palgrave Macmillan, 2006, doi:10.1057/9780230554658.

Rowling, J. K. Harry Potter and the Philosopher’s Stone. Pottermore Limited, 2012.

---. Harry Potter and the Chamber of Secrets. Pottermore Limited, 2012.

---. Harry Potter and the Prisoner of Azkaban. Pottermore Limited, 2012.

---. Harry Potter and the Goblet of Fire. Pottermore Limited, 2012.

---. Harry Potter and the Order of the Phoenix. Pottermore Limited, 2012.

---. Harry Potter and the Half-Blood Prince. Pottermore Limited, 2012.

---. Harry Potter and the Deathly Hallows. Pottermore Limited, 2012.

Ruddick, Nicholas. “‘Tell Us All about Little Rosebery’: Topicality and Temporality in H.G.

Wells’s ‘The Time Machine.’” Science Fiction Studies, vol. 28, no. 3, 2001, pp. 337–354.

Rybicki, Jan, and Magda Heydel. “The Stylistics and Stylometry of Collaborative Translation:

Woolf’s Night and Day in Polish.” Literary and Linguistic Computing, vol. 28, no. 4, 2013,

pp. 708–717, doi:10.1093/llc/fqt027.

Sandner, David. “Introduction.” Fantastic Literature: A Critical Reader, edited by David Sandner,

Praeger, 2004, pp. 1–13.

Scafella, Frank. “Tolkien, The Gospel, and The Fairy Story.” Soundings: An Interdisciplinary

Journal, vol. 64, no. 3, 1981, pp. 310–325, http://www.jstor.org/stable/pdf/41178192.

Scholes, Robert, and Eric S. Rabkin. Science Fiction. Oxford University Press, 1977.

Schreibman, Susan, et al. “The Digital Humanities and Humanities Computing: An

Introduction.” A Companion to Digital Humanities, edited by Susan Schreibman et al.,

Blackwell, 2004.

Schwartz, Roy, et al. “Authorship Attribution of Micro-Messages.” Proceedings of the 2013

Conference on Empirical Methods in Natural Language Processing, Association for

Computational Linguistics, 2013, pp. 1880–1891, http://aclweb.org/anthology//D/D13/D13-

1193.pdf.

Shelton, Robert. “The Mars-Begotten Men of Olaf Stapledon and H.G. Wells (Les Martiens

d’Olaf Stapledon et H.G. Wells).” Science Fiction Studies, vol. 11, no. 1, 1984, pp. 1–14.

Simpson, Paul. Stylistics: A Resource Book for Students. Routledge, 2004.

Slusser, George. “Reflections on Style in Science Fiction.” Styles of Creation: Aesthetic Technique

and the Creation of Fictional Worlds, edited by George Slusser and Eric S. Rabkin, University

of Georgia Press, 1992, pp. 3–23.

Smith, Curtis C. “Introduction.” To the End of Time, edited by Olaf Stapledon, Gregg Press, 1975,

pp. v–xi.

---. “The Books of Olaf Stapledon: A Chronological Survey.” Science Fiction Studies, vol. 1, no. 4,

1974, pp. 297–299.

Snow, Kayla. “What Hath Hobbits to Do with Prophets?: The Fantastic Reality of J. R. R. Tolkien

and Flannery O’Connor.” Logos: A Journal of Catholic Thought and Culture, vol. 17, no. 4,

2014, pp. 108–129, doi:10.1353/log.2014.0040.

Spiegel, Simon. “Things Made Strange: On the Concept of ‘Estrangement’ in Science Fiction

Theory.” Science Fiction Studies, vol. 35, no. 3, 2008, pp. 369–385,

http://www.jstor.org/stable/25475174.

Stapledon, Olaf. A Man Divided. Project Gutenberg, 2006,

http://gutenberg.net.au/ebooks06/0601331h.html.

---. Darkness and the Light. Project Gutenberg, 2006,

http://gutenberg.net.au/ebooks06/0601311h.html.

---. Death into Life. Project Gutenberg, n.d., http://gutenberg.net.au/ebooks06/0601281h.html.

Retrieved June 2014.

---. Last and First Men. Project Gutenberg, 2006, http://gutenberg.net.au/ebooks06/0601101h.html.

---. Last Men in London. Project Gutenberg, 2006,

http://gutenberg.net.au/ebooks06/0601271h.html.

---. Odd John. Project Gutenberg, 2006, http://gutenberg.net.au/ebooks06/0601111h.html.

---. Sirius. Project Gutenberg, 2006, http://gutenberg.net.au/ebooks06/0601151h.html.

---. Star Maker. Project Gutenberg, 2006, http://gutenberg.net.au/ebooks06/0601841.txt.

---. The Flames. Project Gutenberg, 2006, http://gutenberg.net.au/ebooks06/0601131h.html.

Stinson, Emmett. “Protect Australian Stories! The Campaign against PIR Reform.” Overland,

2016, https://overland.org.au/2016/05/protect-australian-stories-the-campaign-against-pir-

reform/.

Stockwell, Peter. The Poetics of Science Fiction. Longman, 2000.

Stockwell, Peter, and Sara Whiteley. “Introduction.” The Cambridge Handbook of Stylistics, edited

by Peter Stockwell and Sara Whiteley, Cambridge University Press, 2014, pp. 1–9.

Strunk Jr., William, and E. B. White. The Elements of Style. Allyn and Bacon, 2000.

Suvin, Darko. “A Grammar of Form and a Criticism of Fact: The Time Machine as a Structural

Model for Science Fiction.” H.G. Wells and Modern Science Fiction, Associated University

Presses, Inc., 1977, pp. 90–115.

---. Metamorphoses of Science Fiction: On the Poetics and History of a Literary Genre. Yale University

Press, 1979.

---. Victorian Science Fiction in the UK: The Discourses of Knowledge and Power. G.K. Hall, 1983.

Tabata, Tomoji. “Dickens’s Narrative Style: A Statistical Approach to Chronological Variation.”

Revue, Informatique et Statistique Dans Les Sciences Humaines, vol. 30, 1994, pp. 165–182.

---. “Stylometry of Dickens’s Language.” Advancing Digital Humanities: Research, Methods,

Theories, edited by Paul Longley Arthur and Katherine Bode, Palgrave Macmillan, 2014,

pp. 28–53.

Tandon, Bharat. Jane Austen and the Morality of Conversation. Anthem Press, 2003.

Tausczik, Yla R., and James W. Pennebaker. “The Psychological Meaning of Words: LIWC and

Computerized Text Analysis Methods.” Journal of Language and Social Psychology, vol. 29,

no. 1, 2010, pp. 24–54, doi:10.1177/0261927X09351676.

Todorov, Tzvetan. The Fantastic: A Structural Approach to a Literary Genre. Cornell University

Press, 1975.

Tolkien, J. R. R. Tree and Leaf. Unwin Hyman Limited, 1988.

Tom. “Lulu Says Goodbye to DRM.” Lulu Blog, 2013, http://www.lulu.com/blog/2013/01/drm-

update/#sthash.KuQNBnXL.3yzoypaX.dpbs.

Trollope, Anthony. The Eustace Diamonds. Project Gutenberg, 2003,

http://www.gutenberg.org/dirs/7/3/8/7381.

Varmuza, Kurt, and Peter Filzmoser. Introduction to Multivariate Statistical Analysis in

Chemometrics. CRC Press, 2009.

Vint, Sherryl. Animal Alterity: Science Fiction and the Question of the Animal. Liverpool University

Press, 2012.

Wagar, W. Warren. “WELLS, H(erbert) G(eorge).” Twentieth-Century Science-Fiction Writers,

edited by Curtis C. Smith, St. Martin’s Press, 1981, pp. 574–577.

Wales, Katie. A Dictionary of Stylistics. Longman, 1989.

Watson, Greer. “Assumptions of Reality: Low Fantasy, Magical Realism, and the Fantastic.”

Journal of the Fantastic in the Arts, vol. 11, no. 2, 2000, pp. 164–172,

http://www.jstor.org/stable/43308437.

Waugh, Robert H. “Spirals and Metaphors: The Shape of Divinity in Olaf Stapledon’s Myth.”

Extrapolation, vol. 38, no. 3, 1997, pp. 207–221.

Webb, Caroline. “‘Abandoned Boys’ and ‘Pampered Princes’: Fantasy as the Journey to Reality

in the Harry Potter Sequence.” Papers: Explorations into Children’s Literature, vol. 18, no. 2,

2008, pp. 15–21.

---. Fantasy and the Real World in British Children’s Literature: The Power of Story. Routledge, 2014.

Wells, H. G. A Modern Utopia. Project Gutenberg, 2004,

http://www.gutenberg.org/files/6424/6424-h/6424-h.htm.

---. Star-Begotten. Project Gutenberg, 2013, http://gutenberg.net.au/ebooks07/0701231h.html.

---. The Invisible Man. Project Gutenberg, 2004,

http://www.gutenberg.org/files/5230/5230-h/5230-h.htm.

---. The Scientific Romances of H.G. Wells. Victor Gollancz, 1933.

---. The Shape of Things to Come. Project Gutenberg, 2003,

http://gutenberg.net.au/ebooks03/0301391h.html.

---. The Time Machine. Project Gutenberg, 2004, http://www.gutenberg.net/3/35/.

Westman, Karin E. “Perspective, Memory, and Moral Authority: The Legacy of Jane Austen in J.

K. Rowling’s Harry Potter.” Children’s Literature, vol. 35, no. 1, 2007, pp. 145–165.

Wolfe, Gary K. Critical Terms for Science Fiction and Fantasy: A Glossary and Guide to Scholarship.

Greenwood Press, 1986.

---. Evaporating Genres: Essays on Fantastic Literature. Wesleyan University Press, 2011, http://0-

site.ebrary.com.library.newcastle.edu.au/lib/newcastle/docDetail.action?docID=10468478.

Woolf, Virginia. Orlando. Project Gutenberg, 2002,

http://gutenberg.net.au/ebooks02/0200331h.html.

---. The Diary of Virginia Woolf. Edited by Anne Oliver Bell and Andrew McNeillie, Vol. 4, The

Hogarth Press, 1982.

---. The Waves. Project Gutenberg, 2002, http://gutenberg.net.au/ebooks02/0201091h.html.

---. The Years. Project Gutenberg, 2003, http://gutenberg.net.au/ebooks03/0301221h.html.

Yule, G. Udny. “On Sentence-Length as a Statistical Characteristic of Style in Prose: With

Application to Two Cases of Disputed Authorship.” Biometrika, vol. 30, no. 3/4, 1939, pp.

363–390, http://www.jstor.org/stable/2332655.
