Stephen Crane and the New-York Tribune: a Case Study in Traditional and Non-Traditional Authorship Attribution

Computers and the Humanities 35: 315–331, 2001. © 2001 Kluwer Academic Publishers. Printed in the Netherlands.

Stephen Crane and the New-York Tribune: A Case Study in Traditional and Non-Traditional Authorship Attribution

DAVID I. HOLMES, MICHAEL ROBERTSON and ROXANNA PAEZ
The College of New Jersey, USA

Abstract. This paper describes how traditional and non-traditional methods were used to identify seventeen previously unknown articles that we believe to be by Stephen Crane, published in the New-York Tribune between 1889 and 1892. The articles, printed without byline in what was at the time ’s most prestigious newspaper, report on activities in a string of summer resort towns on New Jersey’s northern shore. Scholars had previously identified fourteen shore reports as Crane’s; these possible attributions more than double that corpus. The seventeen articles confirm how remarkably early Stephen Crane set his distinctive writing style and artistic agenda. In addition, the sheer quantity of the articles from the summer of 1892 reveals how vigorously the twenty-year-old Crane sought to establish himself in the role of professional writer. Finally, our discovery of an article about the New Jersey National Guard’s summer encampment reveals another way in which Crane immersed himself in nineteenth-century military culture and helps to explain how a young man who had never seen a battle could write so convincingly of war in his soon-to-come masterpiece, The Red Badge of Courage. We argue that the joint interdisciplinary approach employed in this paper should be the way in which attributional research is conducted.

Key words: authorship, New York Tribune, Stephen Crane, stylometry

1. Introduction

The past forty years have witnessed a revolution in authorship attribution. When Erdman and Fogel (1966) assembled their massive collection of the best work in the field to date, not one of the articles they selected employed computer-assisted statistical methodologies. Even fifteen years later, a guide to literary research that is still regarded as a standard in the field (Altick, 1981) devoted its entire chapter on “Problems in Authorship” to the traditional methods treated by Erdman and Fogel’s contributors: the use of “external” evidence such as letters and other contemporary testimony and the “internal” evidence provided by a work’s content and style. However, two years before Erdman and Fogel published their collection, Mosteller and Wallace (1964) completed a groundbreaking study of the vexed problem of authorship in The Federalist Papers, using sophisticated statistical methodology. The example of Mosteller and Wallace, combined with the late twentieth-century revolution in computing, inaugurated a new era for “non-traditional” statistically based studies of authorship; Holmes (1998) offers a comprehensive survey of the flood of non-traditional scholarship that followed Mosteller and Wallace.

The best-known studies of authorship attribution, both traditional and non-traditional, have centered on a relatively limited body of texts, notably British works from the Renaissance through the eighteenth century. However, Stephen Crane, the nineteenth-century American writer best known for The Red Badge of Courage, affords an interesting case study in attribution. Crane’s early unsigned journalism, written from the New Jersey shore, has been studied by a number of scholars using traditional methods (Berryman, 1950; Bowers, 1973; Elconin, 1948; Kwiat, 1953; Williams and Starrett, 1948). In addition, O’Donnell (1966) used computer-aided discriminant analysis in his non-traditional study of the posthumously published novel The O’Ruddy, begun by Crane and finished by . However, no one had combined traditional and non-traditional methods in determining Crane’s authorship of disputed texts. This essay, a collaboration between a literary scholar and two statisticians, is the first to do so.

2. Stephen Crane’s New Jersey Shore Journalism

Stephen Crane began his career as a professional writer in the summer of 1888, when he was sixteen (Wertheim and Sorrentino, 1988). His assignment was to assist his brother J. Townley Crane, Jr., almost twenty years older than Stephen, who had established Crane’s New Jersey Coast News Bureau in 1880 when he arranged to serve as correspondent for the and the New-York Tribune. For three-quarters of the year, Townley Crane’s duties must have been light as he ferreted out news in the sparsely populated shore towns of Monmouth County. However, during the summer months the news bureau’s duties exploded. New York City newspapers of the 1880s and 1890s devoted remarkable amounts of space to chronicling the summer vacations of the city’s upper and upper-middle classes. Every Sunday edition of most New York newspapers and, during July and August, most daily editions as well carried news articles from the summer resorts popular with the more affluent citizens of Gilded Age New York: Saratoga Springs, Newport, the Adirondacks, Cape May, and the northern New Jersey shore. The format of these articles was standardized: a lead proclaimed the resort’s unique beauties and the unprecedented success of the current summer season; a few brief paragraphs recounted recent events, such as a fund-raising carnival or the opening of a new hotel; and the article concluded with a lengthy list of names of recent arrivals and where they were staying.

Stephen Crane’s best-known New Jersey shore article, published in the Tribune on August 21, 1892, explodes this traditional format. His assignment was to report on a parade of the Junior Order of United American Mechanics, a working-class nativist organization that came annually to Asbury Park for a patriotic fest known as “American Day.” Other newspapers, mindful of the group’s political power, covered the parade with a few flattering sentences.
Crane saw it as an opportunity for . He began by observing that the spectacle of an Asbury Park crowd confronting the working-class marchers was “an interesting sight,” then proceeded to juxtapose ironically the three groups brought together by the scene: the marchers, “bronzed, slope-shouldered, uncouth and begrimed with dust”; the spectators, “composed of summer gowns, lace parasols, tennis trousers, straw hats and indifferent smiles”; and the native Asbury Parker, “a man to whom a dollar, when held close to his eye, often shuts out any impression he may have had that other people possess rights” (Bowers, 1971, pp. 521–522). Crane, who always reserved his sharpest barbs for his own class, admired the “sun-beaten honesty” in the faces of the marchers; however, it was the United American Mechanics who wrote a letter of complaint to the Tribune, which led the newspaper to fire both Stephen and Townley Crane (Wertheim and Sorrentino, 1994).

This ignominious episode in the early career of one of America’s greatest writers was commented upon in letters and memoirs by many of his contemporaries, providing ample external evidence for Crane’s authorship of the article. In the 1940s, literary scholars Elconin (1948) and Williams and Starrett (1948) examined the files of the New-York Tribune for the summer of 1892, searching for additional articles by Crane. Using internal evidence of both content and style, they attributed eight other articles to Crane. The fact that these articles were strikingly different in content and tone from the Tribune’s usual New Jersey shore articles and their close resemblance in subject matter and style to the fiction Crane wrote in 1892 – plus their identification by two different sets of Crane scholars, working independently – made these attributions so convincing that they have been accepted without question for over fifty years.
Kwiat (1952) found internal evidence as solid and compelling as that used by Elconin and Williams and Starrett to attribute one additional 1892 Tribune article to Crane. Berryman (1950) used definitive external evidence from a Crane contemporary to attribute an 1891 article. Thus, when the highly respected textual scholar Fredson Bowers began to assemble his complete edition of Stephen Crane’s works, there were a total of eleven articles in the canon of Stephen Crane’s New Jersey shore journalism. Convinced that there were more to be found, Bowers set his corps of graduate student assistants to work combing the files of the Tribune. They found three articles which treated topics that Crane later developed into lengthy signed articles; Bowers sensibly regarded this evidence as sufficient for attribution. His edition of Crane’s journalism (1973) thus established the canon of Jersey shore articles at a total of fourteen. In addition, Bowers’ researchers flagged twenty-eight articles that, on the basis of internal evidence of style and content, seemed to be by Stephen Crane. Bowers reprinted these articles in his edition as “Possible Attributions.”

3. Discovery and “Traditional” Attribution

The eleven articles definitively attributed to Crane in the 1940s and 1950s bore datelines from three adjoining towns on the New Jersey shore: Asbury Park, Ocean Grove, and Avon-by-the-Sea. When Bowers set his researchers to work to find possible attributions, he evidently decided to limit his search to articles with datelines from those three towns. No scholar questioned his decision. However, during research for a book on Stephen Crane’s journalism (Robertson, 1997), we came across an item in the Schoberlin Collection at the Library that revealed limitations in Bowers’ search. In a folder labeled “Crane–1891,” part of the materials that Melvin Schoberlin assembled for his never published biography, a one-page prospectus for Crane’s New Jersey Coast News Bureau was found, evidence of an attempt by Townley Crane to expand his business. The document’s subheading, printed just below the news bureau’s name, is “Sandy Hook to Barnegat Bay.” The body of the prospectus lists the shore towns bounded by those two prominent geographical features, including some of the most prominent resorts on the Jersey shore – notably Long Branch, which was visited by every U.S. President from Grant to Harrison and vied with Cape May for the distinction of being New Jersey’s most fashionable summer destination; and Spring Lake, a small but elegant resort.

With this new external evidence of the Crane news bureau’s wide geographical range, we questioned Bowers’ decision to limit his search for possible attributions to articles originating from Asbury Park and the two towns just south of it. Would it not make sense for Townley to send his teenaged brother to cover news in the resorts a few miles distant from their home base of Asbury Park and save himself the trouble? Wouldn’t he need Stephen’s help to cover the news at Long Branch, which was even larger and livelier than Asbury Park?
Shortly after finding the prospectus, we came across an article from Spring Lake in the New-York Tribune of June 26, 1892. It begins:

This town has taken on its usual garb of lurid summer hue. The beach, the hotel verandas and the lakeside are now all alive with the red and white and gold of the town’s summer revellers, who make merry in a nice, mild sort of way. The hotel proprietors have removed the sackcloth and ashes which is said to be their dress during the dreary winter months, and have appeared in gentle, expansible smiles and new clothes, for everything points to a most prosperous season.

Surely this was by the same author who wrote a week later from Asbury Park:

Pleasure seekers arrive by the avalanche. Hotel-proprietors are pelted with hailstorms of trunks and showers of valises. To protect themselves they do not put up umbrellas, nor even prices. They merely smile copiously. The lot of the baggageman, however, is not an easy one. He manipulates these various storms and directs them. He is beginning to swear with a greater enthusiasm. It will be a fine season. (Bowers, 1973, p. 509)

The second article was attributed to Stephen Crane by both Elconin (1948) and Williams and Starrett (1948). We had little doubt that the first was his also. Both passages are marked throughout by Crane’s distinctive ironic tone; both contain witty hyperbole; and both employ striking lexical juxtapositions, such as the hotel proprietors who wear “expansible smiles and new clothes” in the first passage and who refrain in the second from putting up either umbrellas or prices. It seemed likely that the Tribune contained additional Stephen Crane articles from Spring Lake, Long Branch, and other locations not examined by Bowers and other scholars. We determined to search for them.

However, our first step was to analyze Townley Crane’s prose. We searched the New-York Tribune for the summer of 1886, when Crane’s New Jersey Coast News Bureau was already well established but Stephen had not yet begun his journalistic career, and collected articles with a dateline from the New Jersey shore towns named in Townley’s prospectus. We found a total of twenty-two articles. Although in accordance with journalistic practice of the time none of the articles was signed, all bore an identical byline: “From the Regular Correspondent of the Tribune.” In addition, the relatively small number of articles published that summer – a fraction of the total published each summer during the early 1890s – made it likely that Townley wrote all the articles himself. Their style is remarkably consistent. Townley Crane seems to have been a completely straightforward writer, an unimaginative but sincere booster of the New Jersey shore towns where he made his living. In contrast, Stephen Crane is noted for his gleefully scorching , evident throughout his journalism and fiction.

To locate articles that might be by Stephen, we searched the New-York Tribune for the summers of 1888, when Stephen claimed he began assisting Townley, through 1892, when he was fired.
We read every issue from the last Sunday in May, the earliest date when resort news was likely to appear, through the second Sunday in September, when the last of the summer visitors departed, searching for articles with a dateline from the New Jersey shore towns named in Townley Crane’s prospectus.

The results of our search were striking. The 1886 articles were uniformly pallid and inoffensive in their style. However, in 1889, when Stephen was seventeen, a distinctive new voice suddenly emerged in the Tribune. On July 30 the newspaper published an article that takes ironic aim at the visitors to a summer institute for Protestant clergy:

After spending half a day in discussing the question “Is There Any Other Science Than Physical Science? If So, What & Why?” it was a curious sight to see a number of the reverend intellectual giants of the American Institute of Christian Philosophy seated in a boat fishing for crabs and gravely discussing the question “Is there any better bait for crabs than fish tails? If so, what and where is it to be found?” Other eminent lecturers went in bathing, and as they bobbed up and down in the waves they solemnly argued about immersion.

The internal evidence of its playfully ironic style strongly suggested that this article was Stephen’s. Content provided additional evidence for the attribution; Stephen wrote about the American Institute of Christian Philosophy the following summer in an article definitively attributed and reprinted by Bowers (1973). Using the traditional attributional tools of content and style, we found sixteen other articles published between 1889 and 1892 that we identified as possibly by Stephen Crane.
As a whole, the seventeen possible attributions that we identified, written when Crane was seventeen to twenty years old, confirm how remarkably early he set his distinctive writing style and artistic agenda; more than a century after their original newspaper publication they remain delightful reading. In addition, the sheer quantity of articles from the summer of 1892 – fourteen of our seventeen attributions, which supplement dozens of other articles and short stories that he wrote in 1892 – reveals how vigorously the twenty-year-old Crane sought to establish himself in the role of professional writer. Finally, our discoveries include an 1892 article about the New Jersey National Guard summer encampment at Sea Girt. Like all of Crane’s work, the article is witty and ironic. Its larger significance is that it shows Crane was familiar with the military culture of his state’s national guard; thus, it constitutes an important piece in completing the puzzle of how a young man who had never seen war could write so convincingly about it in The Red Badge of Courage, which Crane began the year after he left the Tribune.

Our initial attributions were limited to articles that were so stylistically distinctive in their irony and verbal inventiveness that they clearly looked to be from Stephen’s hand rather than Townley’s. For an alternative and objective statistical analysis, we turned to the science of stylometry.

4. ‘Non-Traditional’ Attribution: Stylometry

4.1. SAMPLING AND TEXTUAL PREPARATION

The stylometric task facing us was to examine the seventeen articles and attribute them to either Stephen or Townley Crane, who so far as is known were the only writers contributing New Jersey shore articles to the Tribune. Suitable control samples in more than one genre are required, so, within the genre of fiction, several textual samples of about 3,000 words were obtained from The Red Badge of Courage and Joseph Conrad’s The Nigger of the “Narcissus”, the latter being chosen because we know that Crane and Conrad read and admired each other’s novels. For journalistic controls, we turned to Richard Harding Davis and Jacob Riis, who were, along with Crane, the most prominent American journalists of the 1890s. We know that Crane was familiar with their work, which paralleled his own war correspondence (in the case of Davis) and New York City journalism (in Riis’s case). Accordingly, samples of text were taken from Davis’s A Year from a Reporter’s Notebook and Riis’s How the Other Half Lives.

Examples of Stephen Crane’s New Jersey shore reports, his signed New York City journalism, and his war correspondence, also signed, were taken from the University of Virginia edition of Crane’s work; samples of Townley Crane’s journalism were taken from the New-York Tribune. The seventeen anonymous articles were first merged, the resultant text then being split into two halves of approximately 1800 words each. All samples were either typed, scanned or downloaded from an internet resource. Table I lists the texts and samples used in this investigation along with their dates of composition.

Table I. Textual samples

Author                  Title                               Date        Samples   Number of words (in sample order)
Stephen Crane           The Red Badge of Courage            1895        1–5       3022, 3036, 3037, 3009, 3006
Joseph Conrad           The Nigger of the “Narcissus”       1897        1–5       3000, 3000, 2999, 2996, 3014
Richard Harding Davis   A Year from a Reporter’s Notebook   1897        1–3       3000, 3000, 2999
Jacob Riis              How the Other Half Lives            1890        1–3       3000, 2992, 3032
Townley Crane           Journalism                          1886        1–3       1660, 1660, 1658
Stephen Crane           New York City journalism            1894        1–3       3000, 3000, 3000
Stephen Crane           Shore journalism                    1890–1892   1–3       2304, 2304, 2306
Stephen Crane           War correspondence                  1897–1898   1–3       2888, 3447, 3406
Anonymous articles                                          1889–1892   1–2       1814, 1802

4.2. STYLOMETRIC METHODOLOGY

A number of studies have recently appeared in which the features used as indicators are not imposed by the prior judgement of the analyst but are found by straightforward procedures from the texts under scrutiny (see Burrows, 1989, 1992; Binongo, 1994; Burrows and Craig, 1994; Holmes and Forsyth, 1995; Forsyth and Holmes, 1996; Tweedie et al., 1998; Forsyth et al., 1999). Such textual features have been used not only in authorship attribution but also to distinguish among genres. This approach involves finding the most frequently used words and treating the rate of usage of each such word in a given text as a feature. The exact number of common words used varies by author and application but generally lies between 50 and 75, the implication being that they should be among the most common in the language, and that content words should be avoided. Multivariate statistical techniques are then applied to the vector of occurrence rates to search for patterns.

Each phase of the analysis (see below) employs different text selections, so only the most frequently occurring non-contextual function words for those particular texts under consideration are used. Special computer software identifies these words from the corpus of texts and computes their occurrence rates for each individual text in that corpus.
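The feature-extraction step described above can be illustrated with a minimal Python sketch. It is not the specialist software used in this study; the function names, the tokenization rule, the per-1,000-token scaling and the two toy samples are all assumptions made for illustration. The sketch also ranks every word by frequency and does not screen out content words, which a real analysis would need to do.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(texts, n):
    """Return the n most frequent words across the whole corpus of texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    # Sort by descending count, then alphabetically, so that ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


def rate_vector(text, words):
    """Occurrence rate of each listed word per 1,000 tokens of one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    return [1000.0 * counts[word] / len(tokens) for word in words]


# Two invented toy samples (hypothetical, not drawn from the Tribune corpus).
samples = {
    "A": "the hotel and the beach and the lake are alive",
    "B": "the guests of the hotel are on the beach",
}
words = top_words(samples.values(), 3)
rates = {name: rate_vector(text, words) for name, text in samples.items()}
```

Each entry of `rates` is the vector of occurrence rates for one sample; stacking these vectors row by row gives the data array to which multivariate techniques are applied.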

4.3. HIERARCHY OF ANALYSES

(a) Fiction only: Stephen Crane and Joseph Conrad

The first phase in the investigation was designed to establish the validity of the technique discussed above, within the context of this research. Known texts should appear to be internally consistent within author but distinct from those by other authors. Using the textual samples from Stephen Crane’s The Red Badge of Courage and Conrad’s The Nigger of the “Narcissus”, the fifty most frequently occurring words were identified and the occurrence rates of these words used as input to a principal components analysis. The positions of the samples in the space of the first two principal components are plotted in Figure 1.

Figure 1 shows that the five Crane text samples are tightly clustered, having positive values on the first principal component, whereas the five Conrad text samples all lie to the left of the plot with negative values on the first principal component. The horizontal axis (PC1) is the dominant axis, explaining 39.2% of the variation in the original data, with the vertical axis (PC2) explaining only an additional 15.3%. In looking for patterns, therefore, it is in order to project the points downwards onto this first axis. We can see which words are highly associated with Crane and Conrad by looking at the associated scaled loadings plot in Figure 2, which helps to explain the clusterings observed in the main plot. We may imagine this to be superimposed on top of Figure 1. Words on the right of this plot such as “himself”, “youth” and “from” have high usages by the author on the right of the previous plot, namely Crane, while words to the left such as “on”, “up” and “out” are words favored by Conrad. These plots confirm the validity of the “Burrows” technique within this context, showing the Crane and Conrad samples to be clearly distinguishable from each other.

Figure 1. PCA fiction: Crane vs. Conrad.
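To make the principal components step concrete, the following is a minimal pure-Python sketch that projects a small array of word rates onto its first two principal components and reports the share of variation each explains. It is an illustration only: the use of the covariance matrix (rather than the correlation matrix), the power-iteration solver, the function name `pca_scores` and the four invented rows of rates are assumptions, not details taken from the study.

```python
import math


def pca_scores(rows, n_components=2, iterations=200):
    """Project rows onto their leading principal components.

    Returns (scores, explained), where explained[k] is the share of the
    total variance carried by component k.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in rows]
    # Sample covariance matrix of the word rates (p x p).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
           for i in range(p)]
    total_variance = sum(cov[i][i] for i in range(p))

    components, explained = [], []
    for _component in range(n_components):
        # Power iteration from a fixed, asymmetric, unit-length start vector.
        start = [1.0 / (j + 1) for j in range(p)]
        length = math.sqrt(sum(x * x for x in start))
        vec = [x / length for x in start]
        for _step in range(iterations):
            nxt = [sum(cov[i][j] * vec[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break  # no variance left to extract
            vec = [x / norm for x in nxt]
        # Rayleigh quotient: the variance along the direction just found.
        eigenvalue = sum(vec[i] * cov[i][j] * vec[j]
                         for i in range(p) for j in range(p))
        components.append(vec)
        explained.append(eigenvalue / total_variance)
        # Deflate so that the next pass finds the next component.
        cov = [[cov[i][j] - eigenvalue * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]

    scores = [[sum(r[j] * c[j] for j in range(p)) for c in components]
              for r in centered]
    return scores, explained


# Invented rates per 1,000 words for three words (hypothetical data):
# rows 0-1 stand in for one author, rows 2-3 for another.
rows = [
    [60.0, 30.0, 10.0],
    [62.0, 28.0, 11.0],
    [40.0, 50.0, 12.0],
    [38.0, 52.0, 9.0],
]
scores, explained = pca_scores(rows)
```

In this toy array the two pairs of rows fall on opposite sides of zero on the first component, which is the kind of separation read off the horizontal axis of a plot such as Figure 1. The sign of a component is arbitrary, so only the grouping, not the left or right placement, is meaningful.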

Figure 2. Scaled loadings plot fiction: Crane vs. Conrad.

(b) Genre comparison: Crane’s fiction and journalism

In this phase, we discard the Conrad samples and bring in the textual samples of Stephen Crane’s journalism both from the shore (labeled S) and from New York City (labeled N). The samples from The Red Badge of Courage are labeled R. Using the fifty most frequently occurring words from this corpus, Figure 3 shows the textual samples plotted in the space of the first two principal components, which together explain 54.5% of the variation in the original data set. This plot clearly shows that Crane’s shore journalism differs markedly in his use of function words from his fiction writing. Projection onto the first principal component also reveals that his New York City journalism has a style that differs from his shore journalism but is similar in word usage to the style of his fiction. Looking at the dates of composition of these textual samples, it is interesting to note that the New York City journalism is also closer in chronological terms to his novel than are the textual samples from the shore. It is not impossible, therefore, that the first principal component may have captured date of composition and not genre, but the time scale here spans just five years and date of composition may not be an important factor. The associated scaled loadings plot in Figure 4, which, again, may be superimposed on Figure 3, tells us that words such as “and”, “is”, “which”, “of”, “on” and “are” occur more frequently in his shore journalism than in his other writings.

Figure 3. PCA Crane: Journalism vs. Fiction.

(c) Stephen Crane’s journalism

Having noted the stylometric difference between Crane’s New York City journalism and his shore journalism, we can now discard the genre of fiction, which has served its purpose as a control, and add Crane’s third mode of journalism to the analysis, namely his war correspondence. Accordingly the three textual samples obtained from his war dispatches from the Greco-Turkish War (1897) and from the Spanish-American War (1898) were added to the other samples of his journalism, and a principal components analysis run on the occurrence rates of the fifty most frequently occurring words in this corpus, in the usual manner. Figure 5 shows the samples plotted in the space of the first two principal components, which together explain 50% of the variation in the data set.

This plot clearly illustrates how even Crane’s non-contextual function words differ in their rate of usage among the three sub-genres of his journalism, along the first principal component. Examination of the dates of composition of the textual samples indicates that this principal component may once again be capturing “time”, although there is a maximum span of just eight years between his earliest shore journalism and his latest war correspondence. Clearly, when looking at the disputed texts in a forthcoming analysis, we must be careful to compare them only against the appropriate mode of journalism from our known writings and we must also be aware of possible chronological factors.

Figure 4. Scaled loadings plot Crane: Journalism vs. Fiction.

(d) Journalism controls

We now proceed to the next phase by bringing in the samples of journalistic writing from Townley Crane, Richard Harding Davis and Jacob Riis, and discarding the samples of Stephen Crane’s war journalism, which have served their purpose. By comparing writing styles solely within the genre of journalism, we hope to add further weight to the validation of the method of analysis. Figure 6 shows these textual samples plotted in the space of the first two principal components derived from the occurrence rates of the fifty most frequently occurring words. The groupings are very evident, the most interesting being the tight clustering of the three Townley Crane samples (labeled T), which all lie well to the left along the first principal component, which explains 32.7% of the variation in the original data set. It is the second principal component, which explains an additional 17.0% of the variation, that separates out the Davis (labeled D) and Riis (labeled R) textual samples from the others, although it is hard to distinguish between these two writers with just three samples from each. Nevertheless, the clear distinction between Townley’s shore journalism and Stephen’s shore journalism means that we may now confidently proceed to the final stage of the investigation involving the anonymous articles from the New Jersey shore.

Figure 5. PCA Stephen Crane journalism.

Figure 6. PCA all journalism controls.

(e) The Crane brothers and the anonymous articles

Having validated the technique on the control samples, we may now focus exclusively on the main task, namely the attribution of the seventeen anonymous articles in the New-York Tribune, assumed to be from the hand of either Stephen or Townley Crane. The only textual samples used in this final phase of analysis are the shore journalism extracts from both Stephen and Townley, and, of course, the two samples containing the anonymous articles. The samples of Stephen Crane’s New York City journalism will be discarded, since we are now looking solely at journalism originating from the shore. These shore textual samples are also closest in chronological terms to the anonymous articles.

The number of high-frequency function words used in this attributional phase was maintained at 50. The occurrence rates of these words for the texts under consideration were computed and, once again, a principal components analysis conducted on the data array. Figure 7 shows the textual samples plotted in the space of the first two principal components, which together explain 53.7% of the variation in the data set. Projection onto the first principal component in Figure 7 shows the two disputed samples (labeled D) to be remarkably internally consistent and to lie clearly on the left of the axis, the “Stephen” side. They do, however, appear to be somewhat distinctive since they are pulled away by the second principal component (which explains 16.6% of the variation). It is possible that this distinction in vocabulary between Crane’s previously published shore articles and the newly attributed articles arises because all of the latter are short news articles, whereas the previously identified pieces include both news reports and several long feature articles that have a somewhat different generic status.

Since the evidence provided by Figure 7 is not compelling, an alternative analysis may be made using the technique of cluster analysis. Dendrograms represent a more reliable depiction of the data since we do not lose a significant proportion of the original variability when using cluster analysis. Figure 8 shows the resulting dendrogram, using the occurrence rates of the 50 words as raw variables, squared Euclidean distance as the metric and average linkage as the clustering algorithm. Looking at the clustering, we can see that the two disputed samples first merge together, then join into the “Stephen” cluster. The “Townley” cluster remains distinct. The results of the cluster analysis and principal components analysis are now mutually supportive, confirming the “traditional” attribution of these seventeen articles to the youthful ironist Stephen Crane.
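The clustering settings named above (squared Euclidean distance, average linkage) can be sketched in a few lines of pure Python. This is a hedged illustration rather than the procedure actually run for Figure 8: the function names, the sample labels and the two-dimensional toy vectors are invented, whereas the study used fifty word rates per sample.

```python
def squared_euclidean(a, b):
    """Squared Euclidean distance between two equal-length rate vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_linkage(vectors):
    """Agglomerative clustering; returns the merge history.

    Each history entry is (members_a, members_b, distance), where the
    distance between two clusters is the mean squared Euclidean distance
    over all cross-cluster pairs of samples.
    """
    clusters = [(name,) for name in vectors]
    history = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [squared_euclidean(vectors[a], vectors[b])
                         for a in clusters[i] for b in clusters[j]]
                dist = sum(pairs) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        dist, i, j = best
        history.append((clusters[i], clusters[j], dist))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return history


# Invented two-word rate vectors (hypothetical): S = known Stephen samples,
# D = disputed samples, T = known Townley samples.
vectors = {
    "S1": [10.0, 5.0], "S2": [11.0, 5.0],
    "D1": [12.0, 7.0], "D2": [12.0, 7.5],
    "T1": [30.0, 2.0], "T2": [32.0, 2.0],
}
history = average_linkage(vectors)
```

Reading `history` from first entry to last reproduces the order in which a dendrogram joins its branches. In this toy data the two D samples merge first, the D pair later joins the S pair, and the T pair joins only at the final step, which mirrors the pattern described for Figure 8.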

Figure 7. PCA journalism and the disputed articles.

Figure 8. Dendrogram Crane brothers and the disputed articles.

5. Conclusion

The “non-traditional” analysis has supplied objective, stylometric evidence that supports the “traditional” scholarship on the problem of authorship of these seventeen articles. However, we do not wish to claim that our dual approach to attribution offers proof positive of Stephen Crane’s authorship of each of the articles; indeed, we regard such assertions of authorship of disputed texts, in the absence of conclusive external evidence, as remnants of an outmoded positivist epistemology. Postmodern inquiry suggests that we be sceptical of truth claims in authorship attribution. In this, it agrees with poet John Keats, who argued that the mark of the highest intellect is “negative capability,” the capacity to accept the limits of our knowledge and to remain in “uncertainties, Mysteries, doubts, without any irritable reaching after fact and reason” (Rollins, 1958).

A postmodern approach to authorship attribution avoids positivist claims, yet it need not remain adrift in a sea of signifiers. If, in the absence of definitive external evidence, no attributional claim can be absolute, some methodologies will nevertheless be more reliable than others. In blending a traditional approach to the attribution of these seventeen articles with a non-traditional, stylometric approach, we agree with the viewpoint of Hänlein (1999), who argues that the most reliable results in authorship recognition studies take into account both “intuitive” findings – i.e., the traditional scholar’s inherently subjective recognition of an author’s distinctive style – and computational methods. A sequential approach to attribution is recommended by Rudman (1998), who stresses, “Any non-traditional study should only be undertaken after an exhaustive traditional study. The non-traditional is a tool for the traditional authorship scholar, not a proving ground for statisticians and others to test statistical techniques.” We believe that this joint interdisciplinary approach should be the way in which attributional research is conducted.

Acknowledgements

Michael Robertson’s research was supported by a FIRSL grant from The College of New Jersey. David Holmes’ and Roxanna Paez’s research was supported by the New Jersey Minority Academic Career fellowship program. We wish to thank Dr Richard Forsyth of the University of Luton, UK, for the use of his specialist computer software in the analysis phase of this investigation.

References

Altick, R.D. The Art of Literary Research, 3rd edn. New York: Norton, 1981.
Berryman, J. Stephen Crane: A Critical Biography. New York: William Sloane, 1950.
Binongo, J.N.G. “Joaquin’s Joaquinesquerie, Joaquinesquerie’s Joaquin: A Statistical Expression of a Filipino Writer’s Style”. Literary and Linguistic Computing, 9 (1994), 267–279.
Bowers, F., ed. Tales, Sketches and Reports. Vol. 8 of The University of Virginia Edition of the Works of Stephen Crane. Charlottesville: University Press of Virginia, 1973.
Burrows, J.F. “ ‘An Ocean Where each Kind ...’: Statistical Analysis and Some Major Determinants of Literary Style”. Computers and the Humanities, 23 (1989), 309–321.
Burrows, J.F. “Not Unless You Ask Nicely: The Interpretive Nexus Between Analysis and Information”. Literary and Linguistic Computing, 7 (1992), 91–109.
Burrows, J.F. and D.H. Craig. “Lyrical Drama and the ‘Turbid Mountebanks’: Styles of Dialogue in Romantic and Renaissance Tragedy”. Computers and the Humanities, 28 (1994), 63–86.
Elconin, V.A. “Stephen Crane at Asbury Park”. , 20 (1948), 275–289.
Erdman, D.V. and E.G. Fogel, eds. Evidence for Authorship: Essays on Problems of Attribution. Ithaca: Cornell University Press, 1966.
Forsyth, R.S. and D.I. Holmes. “Feature-Finding for Text Classification”. Literary and Linguistic Computing, 11 (1996), 163–174.
Forsyth, R.S., D.I. Holmes and E.K. Tse. “Cicero, Sigonio and Burrows: Investigating the Authenticity of the ‘Consolatio’ ”. Literary and Linguistic Computing, 14 (1999), 1–26.
Hänlein, H. Studies in Authorship Recognition – A Corpus-based Approach. European University Studies, Series XIV, Vol. 352. Frankfurt am Main: Peter Lang, 1999.
Holmes, D.I. “The Evolution of Stylometry in Humanities Scholarship”. Literary and Linguistic Computing, 13 (1998), 111–117.
Holmes, D.I. and R.S. Forsyth. “The ‘Federalist’ Revisited: New Directions in Authorship Attribution”. Literary and Linguistic Computing, 10 (1995), 111–127.
Kwiat, J.J. “The Newspaper Experience: Crane, Norris, and Dreiser”. Nineteenth-Century Fiction, 8 (1953), 99–117.
Mosteller, F. and D.L. Wallace. Applied Bayesian and Classical Inference: The Case of the Federalist Papers. Reading, MA: Addison-Wesley, 1964.
O’Donnell, B. “Stephen Crane’s ‘The O’Ruddy’: A Problem in Authorship Discrimination”. In The Computer and Literary Style. Ed. Jacob Leed. Kent, OH: Kent State University Press, 1966.
Robertson, M. Stephen Crane, Journalism, and the Making of Modern American Literature. New York: Press, 1997.
Rollins, H.E., ed. The Letters of John Keats, Vol. 1. Cambridge: Harvard University Press, 1958.
Rudman, J. “Non-Traditional Authorship Attribution Studies in the Historia Augusta: Some Caveats”. Literary and Linguistic Computing, 13 (1998), 151–157.
Tweedie, F.J., D.I. Holmes and T.N. Corns. “The Provenance of ‘De Doctrina Christiana’, Attributed to John Milton: A Statistical Investigation”. Literary and Linguistic Computing, 13 (1998), 77–87.
Wertheim, S. and P. Sorrentino, eds. The Correspondence of Stephen Crane, 2 Vols. New York: Columbia University Press, 1988.
Wertheim, S. and P. Sorrentino. The Crane Log: A Documentary Life of Stephen Crane. New York: G. K. Hall, 1994.
Williams, A.W. and V. Starrett. Stephen Crane: A Bibliography. Glendale, CA: John Valentine, 1948.