
Hartmut Ilsemann: “Un-Shakespearian” Shakespeare Plays

In the library of Charles II a volume is labelled "Shakespeare. Vol. I". It contains three plays, namely Mucedorus, Fair Em, the Miller's Daughter of Manchester, and The Merry Devil of Edmonton. The attribution of these plays to Shakespeare has not achieved much acclaim, and Eric Sams complained bitterly in similar circumstances that such testimony was treated as ‘gossip’, ‘stories’ or ‘quite wrong’ and ‘muddled’.1 Tucker Brooke, who published these plays together with many other apocryphal plays,2 comments: ‘The remaining members of the group [i.e. the above-mentioned plays] belong distinctly to a lower order, that is, except on the theory of apprentice work or the hastiest of retouching, modern criticism can hardly admit their claim of Shakespearian origin to be even plausible’ (Brooke, p. vi). Even worse: ‘There is a curious dramatic irony in the fact that Mucedorus and Fair Em have been attributed by serious and respectable critics to the pen of Shakespeare.’ (Brooke, p. vii) Quite bewildered, he notes that as many as six quarto editions of The Merry Devil of Edmonton are recorded between 1608 and 1655, and he denounces the above-mentioned Shakespeare library volume as a commentary on the knowledge of Shakespeare after the Restoration. Similar dismissals can be found in his introductions to the plays themselves, and Shakespeare attributions often attract criticism, scorn, contempt, if not hatred. It is no wonder that traditional stylometry has never been in a position to substantiate any Shakespearian claim to these stylistically “un-Shakespearian” plays. In 2013, however, a version of Rolling Delta became available which had recourse to Burrows’s Delta (2002)3 and opened up new possibilities of authorship attribution.
Whereas Burrows’s Delta could only deal with whole texts and sole-authored plays, the improved version of Maciej Eder, Jan Rybicki and Mike Kestemont made use of windows of a particular size that rolled through the text with an overlap and could in this way indicate collaborations.4 Recent studies have thus been able to tackle problematic authorship issues and offer solutions. Locrine, which had the initials “W.S.” on the title page, is in fact not an early play by Shakespeare, as Tieck believed and Eric Sams confirmed (Sams, p. 165), but a play by Christopher Marlowe (Brooke: ‘… which I feel a large degree of confidence in attributing as a whole to the pen of Robert Greene.’ p. xvii). Traditional stylometry was dependent on a rather small number of strong discriminators and was endangered by an insufficient number of variables producing statistical havoc, but it was nevertheless accepted in general as a signal, a hint or an indication. Burrows’s Delta makes use of weak discriminators, but in such numbers that statistically sound results are the outcome. Hoover quotes Burrows’s original definition of Delta as ‘the mean of the absolute differences between the z-scores for a set of word-variables in a given text-group and the z-scores for the same set of word-variables in a target text.’5 When Locrine was analysed with a set of reference texts, the lowest delta values of each measuring point, indicating the smallest stylistic difference between reference texts and exploration text, referred to Marlowe’s Tamburlaines (see: http://www.shak-stat.engsem.uni-hannover.de/esurvey.html). Other investigations dealt conclusively with Shakespeare’s role in and .6 Another paper, called “Stylometry approaching Parnassus,” was submitted to DSH7 and proved that John Marston and Thomas Nashe were the authors of The Parnassus Plays.
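Burrows’s definition of Delta quoted above can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions and not the implementation used in Stylo: the function names burrows_delta and nearest_reference, the toy frequencies, and the use of the population standard deviation of the reference group for the z-scores are all choices made for this example.

```python
from statistics import mean, pstdev


def burrows_delta(reference_freqs, target_freqs):
    """Return a dict mapping each reference text to its Delta from the target.

    reference_freqs: dict of text name -> list of relative frequencies,
                     one entry per feature (hypothetical input format).
    target_freqs:    list of relative frequencies for the same features.
    """
    names = list(reference_freqs)
    n_features = len(target_freqs)
    # Mean and standard deviation of each feature across the reference group
    # (population standard deviation is an assumption of this sketch).
    means = [mean([reference_freqs[n][j] for n in names]) for j in range(n_features)]
    sds = [pstdev([reference_freqs[n][j] for n in names]) for j in range(n_features)]

    def z_scores(freqs):
        # A feature with no variation in the reference group contributes 0.
        return [(f - m) / s if s > 0 else 0.0 for f, m, s in zip(freqs, means, sds)]

    z_target = z_scores(target_freqs)
    deltas = {}
    for name in names:
        z_ref = z_scores(reference_freqs[name])
        # Delta: mean of the absolute differences between the z-scores.
        deltas[name] = mean([abs(a - b) for a, b in zip(z_ref, z_target)])
    return deltas


def nearest_reference(deltas):
    """Return the reference text with the lowest Delta."""
    return min(deltas, key=deltas.get)


# Toy relative frequencies for two features in two invented reference texts.
references = {"ref_A": [0.05, 0.02], "ref_B": [0.03, 0.04]}
target = [0.05, 0.02]
deltas = burrows_delta(references, target)
closest = nearest_reference(deltas)
```

With real data each list would hold the relative frequencies of the most frequent words or character n-grams of one text, and the reference with the lowest Delta would be the nearest stylistic neighbour of the text under examination.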
For this reason it makes sense to apply the new non-traditional stylometry features to the texts in question, bearing in mind that at least Fair Em, the Miller's Daughter of Manchester and Mucedorus can lay claim to the theory of apprentice work that Brooke mentioned. Q 1 of Fair Em is undated, and the only known copy of this edition is in the Bodleian; the other quarto, Q 2, was printed for John Wright and gives the year 1631. Eric Sams saw Fair Em attacked by Greene in the Preface to his Farewell to Folly, printed in 1591, but already registered in 1587 (Sams, p. 163ff.). This would make Fair Em a really early play that belongs to the same period as The Famous Victories of Henry V.8 William Knell, who had played the role of Henry, was killed in spring 1587, and it is a safe conclusion that the play was extant in 1586, if not earlier.

Ever since Delta came into being, various attempts have been made to find the best possible parameters in the process of authorship detection. Hoover achieved excellent results when the culling value was set to 70 % (automatic removal of words too characteristic of individual texts), which led to a harmonization of the word lists and their comparability.9 Jack Grieve tested variable types and found the most frequent character trigrams (MF3C) superior to character bigrams (MF2C), word frequencies (MF1W) and other variables.10 Another important point is the choice of reference texts and their quality. It is obvious that modern spelling editions and the highly individual spellings of Renaissance drama editions (most of them quartos) do not really match, even though MF3C does remarkably well in such situations. There is a certain degree of uncertainty when a best-fitting reference text is missing, and the scarcity of texts by a particular author can be a severe problem, which Rolling Delta shares with traditional stylometric approaches. 
If reference texts are not sole-authored, the outcome is always doubtful, and whatever the results of text examinations are, they have to be seen as current evidence within a complex interdependent system of attributions where the exchange of one cornerstone requires a rebuild in parts of the edifice.
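The culling step mentioned above can be pictured with a short sketch. It assumes that a culling value of 70 % means that a feature is kept only if it occurs in at least 70 % of the texts, which is how words too characteristic of individual texts drop out of the list. The function name cull and the toy counts are hypothetical and serve only as an illustration.

```python
def cull(freq_tables, culling_percent):
    """Return the sorted features found in at least culling_percent % of the texts.

    freq_tables: dict of text name -> dict of feature -> count
                 (hypothetical input format for this sketch).
    """
    n_texts = len(freq_tables)
    # Collect every feature that occurs in any of the texts.
    all_features = set()
    for table in freq_tables.values():
        all_features.update(table)
    kept = []
    for feature in sorted(all_features):
        # Count in how many texts the feature occurs at least once.
        present = sum(1 for table in freq_tables.values() if table.get(feature, 0) > 0)
        if 100 * present >= culling_percent * n_texts:
            kept.append(feature)
    return kept


# Invented counts: "leir" and "em" each occur in only one of three texts.
tables = {
    "text_1": {"the": 10, "and": 5, "leir": 3},
    "text_2": {"the": 8, "and": 4},
    "text_3": {"the": 9, "em": 2},
}
shared_features = cull(tables, 70)
```

In this toy case only features present in all three texts survive a 70 % threshold, since two out of three texts amount to less than 70 %.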

Happily the attribution with Shakespeare texts is less hazardous, as there are enough pure texts without collaborative passages. Moreover, Rolling Delta seems to be very sensitive to temporal circumstances, while traditional stylometry worked with whole text corpora stretching over decades, as if there were no development of style over time. It is obvious that the choice of reference texts has to undergo a careful selection process in which a large number of samples (i.e. their lowest delta values) determine whether they are suitable. In the case of Fair Em the following texts were used as reference texts after the elimination of a fair number of unsuitable references.

The folder with the primary_set contained:

Anon. King Leir (c. 1594) (21306 words)
Greene. Friar Bacon and Friar Bungay (16828 words)
Kyd. Soliman and Perseda (17867 words)
Munday. John a Kent and John a Cumber (13514 words)
Peele. The Old Wives’ Tale (7713 words)
Shakespeare. The Two Gentlemen of Verona (17255 words)

The folder with the secondary_set comprised:

Anon. Fair Em (11607 words)

The surprise on this list is certainly the anonymous play King Leir, which a mixed acting company of Queen Elizabeth’s and Sussex’s Men performed on 6 and 8 April 1594 at the Rose Theatre, as recorded by Philip Henslowe. This was before Shakespeare joined the Lord Chamberlain’s Men in June. Rolling Delta identified King Leir clearly as a play by Shakespeare, and this claim can be substantiated when we look into the chart that examines Fair Em with a 5000-word window, evaluating the most frequent character trigrams with a window overlap of 250 words and a culling percentage of 70.

Figure 1 Reference texts and lowest delta figures in the evaluation of Fair Em

Of all texts King Leir has the smallest stylistic distance from Fair Em, immediately followed by The Two Gentlemen of Verona, a play that has made it into the , but is seen by many as a weak Shakespeare play that has many imperfections and was written around 1590. The idea that Shakespeare’s apprentice plays lack the complexity and maturity of the official corpus is somehow mirrored in the various window sizes tested with Rolling Delta. The smaller windows are normally unreliable, as Eder has shown in his paper “Small Samples, Big Problems”.11 When he examined 62 European novels he found that a window size of 5000 words was suitable, and this is also the default value of the Rolling Delta feature in R Stylo. In the majority of Shakespeare plays one can indeed not tell much below 5000-word windows whether the play is by Shakespeare or a collaborative piece of work. This is often due to the material that Shakespeare used, and at the beginning of his career there seems to be a lot of plagiarism, if not pilfering, as attested by Greene and Nashe in many of their writings. Yet often these early “un-Shakespearian” plays give away their author at a very early stage of tested window sizes. The large volume of data, however, has made it almost impossible to grasp the overall results. For this reason the lowest delta value of each measuring point, indicating the smallest stylistic difference between reference texts and exploration text, was now noted and furnished with a capitalised letter representing the author of the respective text. The outcome can be seen in Table 1, a matrix that displays text segments of 250 words vertically (column A), beginning at 500 words where B6 is the first measuring point of the 1000-word window. The next window covers 1500 words and its first measuring point (C7) is at 750 words. 
750 words is also the second measuring point of the 1000-word window marking words 251-1250 of the text. As the window sizes grow horizontally, as indicated in their vertical keys above the table, the first measuring points move their coordinates one column to the right and one line further down (towards the end of the text one line further up), and each coordinate has an attribution. In this way all attributions can be seen at one glance. Theoretically the windows could grow up to the point where the remaining coordinate encompasses the whole text. This is Burrows’s original delta, as already noted. But here the window sizes stop growing at 5000 words to allow for the presentation of three different variables, namely the word frequencies MF1W (columns B to J), character bigrams MF2C (columns K to S) and character trigrams MF3C (columns T to AB), of which the latter is most reliable due to a higher number of variables. The letters L and S (L = Leir, S = Shakespeare) dominate even in the smaller windows, indicating their stylistic closeness to Fair Em.
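The layout of measuring points described above can be sketched in a few lines. The sketch assumes that successive windows are shifted by 250 words, as the row spacing of the table suggests, and that a measuring point is the middle of its window. The function name measuring_points is hypothetical.

```python
def measuring_points(text_length, window, step=250):
    """Return (measuring point, first word, last word) for every full window.

    Assumptions of this sketch: windows are shifted by `step` words, only
    windows that fit entirely into the text are used, and the measuring
    point is the centre of the window.
    """
    points = []
    start = 0
    while start + window <= text_length:
        first_word = start + 1
        last_word = start + window
        points.append((start + window // 2, first_word, last_word))
        start += step
    return points


# Fair Em has 11607 words according to the reference list.
points_1000 = measuring_points(11607, 1000)
points_1500 = measuring_points(11607, 1500)
first_of_1000 = points_1000[0]   # (500, 1, 1000)
second_of_1000 = points_1000[1]  # (750, 251, 1250)
first_of_1500 = points_1500[0]   # (750, 1, 1500)
```

Under these assumptions the first measuring point moves down by one row of 250 words for every 500 words added to the window, which corresponds to the staircase shape of the table.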

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AAAB AC AD 1 Window Sizes and attributions: Fair Em 2 T 1 1 2 2 3 3 4 4 5 1 1 2 2 3 3 4 4 5 1 1 2 2 3 3 4 4 5 3 H 0 5 0 5 0 5 0 5 0 0 5 0 5 0 5 0 5 0 0 5 0 5 0 5 0 5 0 Scenes/words 4 T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 500 K K G I,1 619 7 750 K G m f 1 w K L m f 2 c K K m f 3 c 8 1000 G K K L L S L L L 9 1250 L L L G S S S K L S L L 10 1500 S L L L L S S S S S S S S L L I,2 1392 11 1750 S S S L K K S S S S L K S S S S L L 12 2000 S S S L L K K S S S S S L K S S S S L L L I,3 2047 13 2250 S S S S L K K L S S S S S L S K S S S S S L L L 14 2500 L S S S S L L L L S S S S S S S S S S S S L S S S L L I,4 2599 15 2750 K K S S S L L L L K K S S S S S S S L K S L S S S L L 16 3000 K L K S L L L L L K K L S S S S S S K L K S S L S L L 17 3250 K K K K L L L L L K L L L S S S S S K L L L L S L S L 18 3500 K K L L L L L L L K L L L L L S S S K L S L L L S L S 19 3750 L G K L L L L L L L L L L L L S S S S L L L L L L L L II,1 3837 20 4000 L L L K L L L L L S L L L L L L S L S S L L L L L L L 21 4250 L L L L L L L L L S L L L L L L L L S S S L L L L L L 22 4500 L L L L L L L L L S L S L L L L L L S S S S L L L L L II,2 4468 23 4750 L L L L L L L L L S S L L L L L L L S S S S L L L L L 24 5000 L L L L L L L L L S L S L L L L L L S S S S L L L L L II,3 4922 25 5250 L L L L L L L L L S S L L L L L L L S S S L L L L L L 26 5500 L L L L L L L L L S S L L S L L L L L S L L L L L L L 27 5750 L L L L L L L L L S L L L L S L L L L L L L L L L L L 28 6000 L L L L L L L L L L L L L L L L L L L L L L L L L L L III,1 6087 29 6250 L L L L L L L L L L L S L L L L L L L L L L L L L L L III,2 6424 30 6500 L L L L L L L L L L L L L L L L L S L L L L L L L L L III,3 6582 31 6750 L L L L L L L L L L L L L L L L L L L L L L L L L L L 32 7000 L L L L L L L L L L L L L L L L L L L L L L L L L L L III,4 7166 33 7250 K L L L L L L L L L L L L L L L L L L L L L L L L L L 34 
7500 K S L L L L L L L L L L L L L L L L L L L L L L L L L III,5 7529 35 7750 S L L L L L L L L L L L L L L L L L L L L L L L L L L 36 8000 L S L S L L L L L L L L L L L L L L L L L L L L L L L III,6 7903 37 8250 L L S L L L L L L L L L L L L L L L L L L L L L L L L IV,1 8365 38 8500 S S L L L L L L L L L L L L L L L L L L L L L L L L L IV,2 8571 39 8750 S S L L L L L L L L L L L L L L L L L L L L L L L L L 40 9000 S L S S L L L L L L L L L L L L L L S L L L L L L L L 41 9250 L S S S L L L L L L L L L L L L L L L L L L L L 42 9500 K S S L L L L L L L L L L L L L L L L L L IV,3 9442 43 9750 K K L L L L L L L L L L L L L L L L 44 10000 K L L L L L L L L L L L L L L 45 10250 S L L L L L L L L L L L 46 10500 L L L L L L L L L 47 10750 L S L L L L 48 11000 L L L 49 11250 50 11500 V,1 11596

Table 1 Window sizes and attributions in Fair Em

Mucedorus, “a most pleasant comedie”, as the title page of the 1598 edition proclaims, was most successfully performed, also by strolling players, well into the 18th century. It was published in seventeen editions, as W. W. Greg found out, and Q 3 of 1610 stated that it was in the repertoire of the . Edward Archer’s play list of 1656 named Shakespeare as the author of the play, but even though the play has some interesting generic features with regard to the fool, its pastoral and folktale background and its romantic tone in general, Shakespeare scholars would not endorse attributions to Shakespeare and classified the comedy as apocryphal. Testing the play involved once again a set of reference texts and the adapted version of etext no. 1548 of the Gutenberg project.

The folder with the primary_set contained:

Anon. King Leir (c. 1594) (21306 words)
Chettle. The Tragedie of Hoffman (19955 words)
Kyd. Soliman and Perseda (17867 words)
Peele. The Old Wives’ Tale (7713 words)
Rowley. When You See Me You Know Me (24731 words)
Shakespeare. The Taming of the Shrew (20902 words)

The folder with the secondary_set comprised:

Anon. Mucedorus (12377 words)

As before, a large number of possible reference text authors were excluded after exhaustive testing, among them Greene, Lyly, Lodge, Marlowe, Munday, and Nashe. The MF3C chart with a window size of 5000 words gives the smallest stylistic distance from Mucedorus as follows:

Figure 2 Reference texts and lowest delta figures in the evaluation of Mucedorus

Undoubtedly the white curves representing Shakespeare reference texts have the lowest delta values and therefore possess the smallest stylistic distance from Mucedorus. It may be due to generic dispositions that The Taming of the Shrew is in many instances favoured over King Leir, which as a chronicle history is closer to tragedy. If we proceed to the attribution table, the effect of different window sizes becomes visible, and similarly there is a distinction between the variables and their attributive power.

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AAAB AC AD 1 Window sizes and attributions: Mucedorus 2 T 1 1 2 2 3 3 4 4 5 1 1 2 2 3 3 4 4 5 1 1 2 2 3 3 4 4 5 3 H 0 5 0 5 0 5 0 5 0 0 5 0 5 0 5 0 5 0 0 5 0 5 0 5 0 5 0 scenes/words 4 T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 500 C K K 7 750 K K m f 1 w K K m f 1 w K K m f 3 c I,0 755 8 1000 K K K K K K K K K 9 1250 C K K K C K K K K K K K I,1 1175 10 1500 C K K K K C K K K K K K K K K I,2 1296 11 1750 C K K K K K C K K K K K C K K S K S I,3 1819 12 2000 K K K K K K K K K K K K K K S S S S S S S 13 2250 K S S K K K K K K S K K K K K K S S S S S S S K 14 2500 C S S K S K K K K S S S K S K K K K S S S S S S S S S 15 2750 S S S S K K K K K S S S S K K K K K S S S S S S S S S I,4 2784 16 3000 S S S S K K K K K S S S K K K K K K S S S S S S S S S II,1 3100 17 3250 S S S S K K K K K S S S S K K K K K S S S S S S S S S 18 3500 S S S S K K K K K S S S K K K K K K S S S S S S S S S 19 3750 R K S R S K K K S R K S R K K K K S S K S S S S S S S 20 4000 K K K R S K S S K K K K R C C S S K K S S S S S S S S II,2 4105 21 4250 K K K K K S S K K K K K K C C S S K K K S S S S S S S II,3 4379 22 4500 M K K K K S S S K R K K K C S S S S K K S S S S S S S 23 4750 M K K K S K K S S C K K C S S K S S K K S S S S S S S 24 5000 C K C K K K K S S C K C K K K K S S S S S S S S S S S II,4 5075 25 5250 C C C K K K K S S C C S K K K S S S S S S S S S S S S 26 5500 C C C K K K K K S C S S K K K K S S S S S S S S S S S 27 5750 S K K K K K K K S S S K K K K K S S S S S S S S S S S III,1 5795 28 6000 S K K K K K K K K S S S S S K K C S S S S S S S S S S 29 6250 K K S K K K K K K S S S S S S S S K S S S S S S S S S III,2 6194 30 6500 S S S S K K K K K S S S S S S S S K S S S S S S S S S III,3 6657 31 6750 K K S K S K K K K K S S S S S S S K S S S S S S S S S III,4 6757 32 7000 S K K K K K K K K S K S S S S S S S S K S S S S S S S 33 7250 R K K K K K K K K S K K S S S S S S 
S S S S S S S S S III,5 7386 34 7500 S R K K K K K K K S R S K S S S S S S S S S S S S S S 35 7750 K C K K K K K K K K S S K K K S S S S S S S S S S S S IV,1 7685 36 8000 C K K K K K K K S C K S K K K S S S P S S S S S S S S 37 8250 K K K K K K K K S K K K K S K K S S P K S S S S S S S IV,2 8359 38 8500 C K K K K K K K S S K K K K S S S S S K K S S S S S S 39 8750 K K K K K P K K K K K K K K S S S S K K S S S S S S S 40 9000 K K K K K K S K K K K K K S S S S S K S S S S S S S S 41 9250 P K K K K K K S S S K K S S S S S S S S S S S S S S S IV,3 9221 42 9500 C S P K K K K K S C S S S S S S S S P S S S S S S S S IV,4 9501 43 9750 P P S K K K K K K S S S S S S S S S P P S S S S S S S 44 10000 P P P K S K K K S S S S S S S S S S S S S S S S 45 10250 S P P S S S K S S S S S S S S S S S S S S 46 10500 S S S S S S S S S S S S S S S S S S 47 10750 S S S S K S S S S S S S S S S V,1 10868 48 11000 S S S S S S S S S S S S 49 11250 S S S S S S S S S 50 11500 S K S K S S 51 11750 S S S 52 12000 53 12250 V,2 12247

C = Chettle, K = Kyd, P = Peele, R = Rowley, S = Shakespeare

Table 2 Window sizes and attributions in Mucedorus

The Shakespeare attribution rests upon MF3C, and with MF2C for the larger part of the text on window sizes of 4500 and 5000 words. There are also Shakespeare attributions with MF1W, but here, as at the beginning of MF2C, Kyd plays an important role. It is not impossible that his text was taken over by Shakespeare and subsequently reshaped by him. It is worth while looking at the nature of the types of variables that were used in this analysis.
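The three variable types used in this analysis (MF1W, MF2C and MF3C) can be illustrated with a short sketch. It is a simplified picture and not the feature extraction of Stylo: the function names char_ngrams and most_frequent are hypothetical, and writing every blank as “_” is merely a convention of this example.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return all overlapping character n-grams, writing each blank as '_'."""
    marked = text.replace(" ", "_")
    return [marked[i:i + n] for i in range(len(marked) - n + 1)]


def most_frequent(items, k):
    """Return the k most frequent items (words or n-grams) with their counts."""
    return Counter(items).most_common(k)


phrase = "around the corner"
words = phrase.split()             # MF1W counts whole words
bigrams = char_ngrams(phrase, 2)   # MF2C counts pairs of characters
trigrams = char_ngrams(phrase, 3)  # MF3C counts triples of characters

# A word surrounded by blanks yields n-grams that reach across its borders.
corner_bigrams = char_ngrams(" corner ", 2)
the_trigrams = char_ngrams("x the x", 3)
```

A three-word phrase thus yields three word tokens but a much larger number of character bigrams and trigrams, including those that span the blanks between words.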

Word-frequency analyses can be doubtful when compared with other variables. The higher degree of reliability must be seen in the capability of MF2C to evaluate the text between function words as well. In the phrase “around the corner”, MF1W would account for the definite article “the”, which is among the most prominent words in all texts, and probably for the preposition “around”, but not necessarily for the word “corner”. MF2C divides this word into “_c | co | or | rn | ne | er | r_”, where “_” denotes a blank. These bigrams are among the most frequent and will undergo an evaluation. The implication is that MF2C is more effective than MF1W, and as many investigations in the past were only carried out with function words, the assumption gains weight that faulty attributions are likely. MF3C has another potential. Recurrent sentence patterns mirror the artistic singularity of the author’s mind. Even when only the definite article “the” is divided into its components, we encounter “x_t | _th | the | he_ | e_x”, where “_” stands for the blank and “x” represents a letter in the preceding or following word. The number of counted features is thus always higher than the single feature counted with MF1W. In terms of methodology it becomes impossible to hold on to relative word frequencies alone. In Table 2, MF3C can possibly attest Kyd’s opening of the play, but the overall text bears the mark of Shakespeare.

The Merry Devil of Edmonton is definitely not an early play, as tests with reference texts by Greene, Peele, Lodge, Lyly, Munday, Nashe and Wilson clearly show. But equally there were some problems with Shakespearian and other reference texts after the building of the Globe in 1599. The play is first mentioned in 1604 and finds its entry into the Stationers’ Register in 1607. A year later it was printed (Q 1), and there were five more quartos up to 1655. It is necessary to recall the extensive testing with the reference texts of the time. 
Otherwise the texts that were eventually chosen would cause some head-shaking. In most cases Shakespeare’s was linked with low delta values, but then the various sets of combinations of reference texts ruled out Fletcher, Dekker’s Old Fortunatus and A Shoemaker’s Holiday, Heywood, Middleton, Shakespeare’s King Leir and , and to some degree Day, but favoured Webster and Wilkins, Beaumont and Rowley. It was only in these circumstances that Shakespeare’s The Winter’s Tale, Cymbeline and The Tempest yielded even lower delta values than any of the other reference texts. That is why the final folder with the primary_set contained:

Beaumont. The Knight of the Burning Pestle (20712 words)
Dekker. Satiro-Mastix (22299 words)
Shakespeare. Cymbeline (27699 words)
Shakespeare. The Tempest (16557 words)
Shakespeare. The Winter’s Tale (25575 words)

In the secondary_set was:

Anon. The Merry Devil of Edmonton (12015 words)

It is only logical that reference texts that come from only one author will also indicate that author as the nearest stylistic equivalent to the text in question. Here 60 % of the reference texts are by Shakespeare. But due to the preliminary selection process this does not mean that the chart below has a natural predilection for Shakespeare. His predominance is a matter of the lowest delta figures sifted out from a large number of reference text combinations.

Figure 3 Reference texts and lowest delta figures in the evaluation of The Merry Devil of Edmonton

It is certainly of interest that using only one Shakespeare reference text yields exactly the same result, as can be seen in Figure 4.

Figure 4 Reference texts and lowest delta figures in the evaluation of The Merry Devil of Edmonton

The results of the 6000-word windows of Figures 3 and 4 are mirrored in cells AF16 to AF20 of Table 3. With the larger windows of 5500 and 6000 words it is apparently Dekker who began the play, which would then have been continued by Shakespeare. But one should also note the differences in MF1W and MF2C, which fully endorse Shakespeare from III,2 onwards.


A B C D E F G H I J K L M N O P Q R S T U V W X Y Z AAABACADAEAF AG AH 1 Window sizes and attributions: The Merry Devil of Edmonton 2 T 1 1 2 2 3 3 4 4 5 5 6 1 1 2 2 3 3 4 4 5 1 1 2 2 3 3 4 4 5 5 6 3 H 0 5 0 5 0 5 0 5 0 5 0 0 5 0 5 0 5 0 5 0 0 5 0 5 0 5 0 5 0 5 0 Scenes/Words 4 T 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 O 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 500 S S S 7 750 D D m f 1 w S S m f 2 c D D m f 3 c 8 1000 D D D S S S D D D I,0 971 9 1250 D D D D S S S S D D D D 10 1500 D D D S S D D D S S D D D D S 11 1750 S D D D S S D D S S S S D D D D S S I,1 1801 12 2000 D D D D D S D D S S S S S S D S S D S D S 13 2250 D D S D D D S S B S S S S S S S D S S S D D S S I,2 2234 14 2500 D S S D D D S S S S S S S S S S S S S S S S S D S S S 15 2750 S D D D D S S S S B S S S S S S S S S S S S S S S S S S D 16 3000 S D D D S S S S D D B S S S S S S S S S S S S S S S S D D D D 17 3250 S S S S S S S B D D D S S B S S S S S S S S S S S S S D D D D 18 3500 D S S S B B B B B D D B B S S S S S S S S S S S S S S D D D D I,3 3433 19 3750 D B S S B B B B B B D B B S S S S S S S B S S S S S D D D D D 20 4000 B B S B B B B B B B B S S S S S S B B S S S S S S S B D B B D II,1 4078 21 4250 S S B B B B B B B B D S S S S S S S B B S B B B B B B B B S S 22 4500 S S B B B D B B B B D S S S S B S S S B S S B B B B B B S S S 23 4750 S S D B D B B B D D B S S S B S B B S S S S B B B B B S S S S II,2 4876 24 5000 S D D D B B B B D B D S S S S S S S S S B S B B B B B S S S S 25 5250 D B D D B B B B B B B D S S S S S S S S D B S S B B S S S S S 26 5500 B D D D D B B B B B B D S S S S S S S S D D S S S S S S S S S II,3 5447 27 5750 D D D D D S B B B B B D S S S S S S S S D S S S S S S S S S S 28 6000 D D S S S S B B B B B D B S S S S S S S S S S S S S S S S S S 29 6250 B S S S S S S B B B B B D S S S S S S S S S S S S S S S S S S III,1 6296 30 6500 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 31 6750 S S S S S S S S S S S S S S S 
S S S S S S S S S S S S S S S S 32 7000 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 33 7250 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 34 7500 S S D S S S S S S S S S S S S S S S S S S S S S S S S S S S S III,2 7575 35 7750 D D S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 36 8000 D D D S S S S S S S S B S S S S S S S S S D S S S S S S S S S 37 8250 S D D S S S S S S S S S S S S S S S S S S D S S S S S S S S S 38 8500 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 39 8750 D D S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 40 9000 S D S S S S S S S S S S S S S S S S S S S S S S S S S S S S S IV,1 8847 41 9250 S S S S S S S S S S S S S S S S S S S S S S S S S S S S S 42 9500 S S S S S S S S S S S S S S S S S S S S S S S S S S S IV,2 9526 43 9750 S S S S S S S S S S S S S S S S S S S S S S S S 44 10000 S S S S S S S B S S S S S S S S S S S S S 45 10250 S S B S S S B S S S S S B S S S S S 46 10500 S B B S S B B S S S B B S S S V,1 10510 47 10750 S S B S S S S S S D B S 48 11000 S S S S S S S S S 49 11250 S D S S S S 50 11500 D S S 51 11750 V,2 11742

D = Dekker. Satiro-Mastix
B = Beaumont. The Knight of the Burning Pestle
S = Shakespeare. The Tempest
S = Shakespeare. Cymbeline
S = Shakespeare. The Winter's Tale

Table 3 Window sizes and attributions in The Merry Devil of Edmonton

The problem with traditional attributions is that there is only little certainty, and many ascriptions remain broad estimates. According to Brooke, Satiro-Mastix ‘is probably by Dekker’ (Brooke, p. x). A closer look at Satiro-Mastix with Rolling Delta reveals that another Dekker play, A Shoemaker’s Holiday, has the highest delta values within the set of reference texts, and is accordingly miles away in its style from Satiro-Mastix. Acts II,1 to IV,2 of the play match the style of John Day’s The Blind Beggar of Bethnal Green exactly, and only the beginning and end of the play correspond to Dekker’s Old Fortunatus. 
Day’s play is said to have been co-authored by Chettle. Rolling Delta does not confirm this assumption, but rather favours Middleton and Heywood in the first three acts. The end is much in the style of Shakespeare’s Cymbeline.

If one were to summarise the general state of ascriptions as laid down in secondary literature, one would have to say that concrete authorship knowledge of the past was not so much a matter of individual thorough investigations with irrefutable tools and unflawed reasoning, but rather a matter of hearsay, impressions and subjective conjectures of so-called experts who were just children of their time and believed what they had been told. Much suspense is generated when it is anticipated that a body of learning of this kind will disintegrate when new and fresh approaches become available.

To make sure that no premature conclusions are drawn on Rolling Delta evidence alone, it is indispensable to underpin the findings with other approaches. Machine-learning algorithms are normally well suited to build up classifiers that can identify the author of the text in question. Rolling classify, too, makes use of classifiers like delta (classic Burrowsian), svm (support vector machine) and nsc (nearest shrunken centroid). These supervised machine-learning classifications are combined with the idea of a sequential analysis in which text segments of 500 words are consecutively attributed.12 In fact, it is a 5000-word window that is evaluated, and consecutive windows overlap by 4500 words (a step of 500 words), so that the second window covers words 501 to 5500, the third 1001 to 6000, etc. The reference texts were the same as in the Rolling Delta analyses. Similarly, the three texts were tested with delta, svm and nsc on the basis of MF1W, MF2C and MF3C. Due to the distinct mathematical kernels of the procedures, identical results could not be expected.

Fair Em

In the list of results apo stands for Shakespeare’s King Leir and shak represents Shakespeare’s The Two Gentlemen of Verona.
> delta.mf1w$classification.results apo apo apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> svm.mf1w$classification.results apo apo apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> nsc.mf1w$classification.results apo apo apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> delta.mf2c$classification.results kyd apo apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> svm.mf2c$classification.results apo apo apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> nsc.mf2c$classification.results kyd shak shak apo apo apo apo apo shak shak apo apo apo apo (total number of elements: 14)

> delta.mf3c$classification.results apo shak apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> nsc.mf3c$classification.results apo apo shak apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

> svm.mf3c$classification.results apo apo apo apo apo apo apo apo apo apo apo apo apo apo (total number of elements: 14)

Of 126 evaluated text segments 124 indicate Shakespeare. The 2 outliers refer to Thomas Kyd. It is interesting that Kyd is returned in MF2C evaluations, and here particularly with delta and nsc, whereas svm appears to be very reliable.

Mucedorus

In the list of results apo stands for Shakespeare’s King Leir and shak represents Shakespeare’s The Taming of the Shrew.

> deltawords$classification.results (total number of elements: 15) shak shak shak shak shak shak shak shak apo shak shak shak shak shak apo

> svmwords$classification.results shak shak shak shak shak shak shak apo apo apo apo shak shak shak shak

> nscwords$classification.results apo apo apo apo shak shak shak kyd kyd shak shak shak shak shak shak

> delta.mf2c$classification.results apo apo apo shak shak shak apo apo apo apo apo apo apo apo apo

> svm.mf2c$classification.results apo apo apo apo apo apo apo apo apo apo apo apo apo apo apo

> nsc.mf2c$classification.results apo apo apo apo apo apo apo apo apo apo apo shak apo apo apo

> delta.mf3c$classification.results kyd apo apo shak shak shak shak shak kyd apo shak apo shak apo apo

> nsc.mf3c$classification.results kyd kyd apo shak shak shak shak shak shak shak shak shak shak apo apo

> svm.mf3c$classification.results apo apo apo shak shak shak shak shak apo shak shak shak shak shak shak

Of 135 evaluated text segments 129 indicate Shakespeare. The 6 outliers refer to Thomas Kyd, but it is once again the svm evaluation which returns Shakespeare exclusively.

The Merry Devil of Edmonton

In the list of results beau refers to Beaumont’s The Knight of the Burning Pestle and dek stands for Dekker’s play Satiro-Mastix. Shakespeare’s plays are represented by shak.

> delta.mf1w$classification.results (total number of elements: 14) shak shak shak shak shak shak shak beau shak shak beau shak shak beau

> svm.mf1w$classification.results shak shak shak shak shak shak shak shak shak shak shak shak shak shak

> nsc.mf1w$classification.results shak shak shak shak shak shak shak shak shak shak shak shak shak shak

> delta.mf2c$classification.results shak shak beau beau beau beau beau shak shak shak shak shak shak shak

> svm.mf2c$classification.results shak shak shak shak shak shak shak shak shak shak shak shak shak shak

> nsc.mf2c$classification.results dek shak dek shak shak shak shak shak shak shak shak shak shak shak

> delta.mf3c$classification.results shak shak dek beau beau shak beau beau shak shak beau shak shak shak

> svm.mf3c$classification.results shak shak shak shak shak shak shak shak shak shak shak shak shak shak

> nsc.mf3c$classification.results shak dek shak shak shak shak shak shak shak shak shak shak shak shak

Of 126 evaluated text segments 109 refer to Shakespeare. The remaining 17 correspond in some instances to the initial phase of the play, roughly those 13 % that were probably written by somebody else. Once again svm returns exclusively Shakespeare.

When stylometric tools that are programmed to detect collaborations return an overwhelming majority of Shakespeare assignments, one can also expect that traditional multivariate analyses will not come up with worthless results.

Cluster Analysis

Figure 5 Cluster Analysis placement of The Merry Devil of Edmonton, Fair Em and Mucedorus

Because of their reliability, MF3C frequencies were used in the analyses, which give a visual impression of the distances between the plays. Once again Fair Em is twinned with King Leir, followed by Mucedorus. The Merry Devil of Edmonton (on the left) is here depicted together with Rowley’s When You See Me You Know Me, which was only in part supported by Rolling Delta. That The Merry Devil of Edmonton was not given a clearer placement may derive from its initial collaborative nature, which made itself felt in various stages of the analysis, but it may also come from the doubtful nature of the surrounding plays which, in all probability, are also not sole-authored.

The bootstrap consensus tree, which is a compromise between the underlying cluster analyses, also operated with MF3C and a culling percentage of 70. It placed Mucedorus and Fair Em between Shakespeare’s King Leir and The Taming of the Shrew. The Merry Devil of Edmonton was recorded as independent.
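The two settings named here, MF3C and a culling percentage of 70, can be made concrete with a small sketch. It assumes that MF3C denotes the most frequent overlapping character 3-grams and that a culling value of 70 retains only those features attested in at least 70 % of the texts in the corpus. The function names, the toy corpus and the Python implementation are illustrative assumptions; the analyses reported in this paper were carried out with the R package stylo, not with this code.

```python
from collections import Counter


def char_trigrams(text):
    """Return relative frequencies of overlapping character 3-grams."""
    # Lower-case the text and collapse all whitespace to single spaces.
    normalized = " ".join(text.lower().split())
    grams = [normalized[i:i + 3] for i in range(len(normalized) - 2)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {gram: count / total for gram, count in counts.items()}


def cull(profiles, culling_percent):
    """Keep the features that occur in at least culling_percent % of the texts."""
    n_texts = len(profiles)
    # For every feature, count the number of texts in which it is attested.
    presence = Counter(gram for profile in profiles.values() for gram in profile)
    return sorted(
        gram for gram, n in presence.items()
        if 100 * n >= culling_percent * n_texts
    )


# Hypothetical toy corpus; the strings are placeholders, not play texts.
texts = {
    "play_a": "the king and the queen",
    "play_b": "the miller and the daughter",
    "play_c": "the merry devil and the host",
}
profiles = {name: char_trigrams(text) for name, text in texts.items()}
kept = cull(profiles, 70)
print(kept)
```

With three texts a threshold of 70 % can only be met by trigrams that occur in all three, so a trigram such as "e m", which is found in two of the three toy texts, is discarded. In a larger corpus the same threshold is less strict in absolute terms but still removes features that are tied to individual plays.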

Figure 6 Bootstrap Consensus Tree with placement of Mucedorus and Fair Em

The crucial question emerging from analyses with Rolling Delta and Rolling Classify concerns the body of knowledge of literary history. Naturally, this is composed of basic assumptions and evidence and follows a substantiated methodological course. In authorship attributions the starting point was Shakespeare’s First Folio of 1623, the gold standard so to speak, as far as the Shakespeare corpus is concerned. From here critics and scholars have ventured backwards in time to establish the fixed points of how the works had come into being. It is this perspective that led to the notion of apocryphal plays, memorial reconstructions, secondary-class plays that Shakespeare had exploited and improved, and plays of doubtful origin that printers and booksellers had furnished with the letters “W.S.” to make a profit from the sales.

The new non-traditional tools in stylometry draw attention to a paradigmatic change in which an apprentice stage in Shakespeare’s career becomes visible, and the perspective changes to a natural moving forward in time, taking developments into account and allowing the young playwright a phase of simple tastes, of testing and experimenting, as in Fair Em and Mucedorus. Later, when he became established as a dramatist, theatre director and co-owner of The Globe, he was certainly interested in financially successful plays that drew large audiences, like The Merry Devil of Edmonton. He was apparently not someone who would haughtily look down at low-level entertainment when this turned out to be lucrative. People who have only a genius in mind find this hard to accept.

Footnotes

1 Eric Sams, The Real Shakespeare: Retrieving the Early Years, 1564-1594, New Haven, London: Yale Univ. Press, 1995, p. xi

2 C.F. Tucker Brooke, The being a collection of fourteen plays which have been ascribed to Shakespeare. Edited with introd., notes and bibliography, Oxford: Clarendon Press, 1908.

3 J. F. Burrows (2002). ‘Delta’: a measure of stylistic difference and a guide to likely authorship. Literary and Linguistic Computing, 17(3): 267–87

4 Eder, M., Kestemont, M. and Rybicki, J. (2016). “Stylometry with R: A package for computational text analysis.” R Journal, 16(1): 107-121, http://journal.r-project.org/archive/2016-1/eder-rybicki-kestemont.pdf

5 in “Testing Burrows’s Delta”, Literary and Linguistic Computing, vol. 19, no 4, 2004, 453-475.

6 Hartmut Ilsemann, “More News on Sir Thomas More,” Digital Scholarship in the Humanities, pre-print 2017, 1-13. doi:10.1093/llc/fqx013 and “The Two Oldcastles of London,” Digital Scholarship in the Humanities, fqw039, https://doi.org/10.1093/llc/fqw039

7 Hartmut Ilsemann, “Stylometry Approaching Parnassus,” submitted to: Digital Scholarship in the Humanities, August 2017

8 See: http://www.shak-stat.engsem.uni-hannover.de/eauthorfamvicth5.html, which notes a more than considerable Shakespeare contribution.

9 Hoover, David L. (2004). “Testing Burrows’s Delta”, Literary and Linguistic Computing, vol. 19, no 4, 453-475.

10 Jack Grieve, “Quantitative Authorship Attribution: An Evaluation of Techniques,” Literary and Linguistic Computing, vol. 22, no 3, 2007, 251-270

11 Maciej Eder, “Does size matter? Authorship attribution, small samples, big problem”, Digital Scholarship in the Humanities, 30(2), 2015: 167-182

12 In their stylo_howto.pdf-description of June 2017 Eder, Rybicki and Kestemont give their explanation of Rolling Classify as follows: In the first step, the traceable differences between samples produce a set of rules, or a classifier, for discriminating authorial “uniqueness” in style. The second step is of predictive nature – using the trained classifier, the machine assigns other text samples to the authorial classes established by the classifier; any disputed or anonymous samples will be assigned to one of the classes as well, provided that such a classification is usually based on probabilistic grounds. The procedure described above relies on an organized corpus of texts. To be precise, the clue is to divide all the available texts into two groups: primary (training) set and secondary (test) set. The first set, being a collection of texts written by known authors (“candidates”), serves as a sub-corpus for finding the best classifier, or discrimination rules, while the second set is a pool of texts of known authors, anonymous texts, disputed ones, and so on. The better the classifier, the more samples from the test set are attributed (“guessed”) correctly and the more reliable is the attribution of the disputed samples.
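The two-step procedure quoted in note 12 can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: it uses Burrows’s Delta (the mean of the absolute differences between z-scores) as the distance, assigns each test segment to the nearest training profile, and works with invented frequency values for three hypothetical candidate classes. It is written in Python for illustration only and does not reproduce the internals of stylo’s classify or rolling.classify functions.

```python
from statistics import mean, pstdev


def feature_stats(train_profiles, features):
    """Step 1: derive per-feature mean and standard deviation from the training set."""
    stats = {}
    for feature in features:
        values = [profile.get(feature, 0.0) for profile in train_profiles.values()]
        sd = pstdev(values)
        # Guard against division by zero for features that never vary.
        stats[feature] = (mean(values), sd if sd > 0 else 1.0)
    return stats


def delta(profile_a, profile_b, stats):
    """Burrows's Delta: mean absolute difference between the z-scores of two profiles."""
    differences = []
    for feature, (mu, sd) in stats.items():
        z_a = (profile_a.get(feature, 0.0) - mu) / sd
        z_b = (profile_b.get(feature, 0.0) - mu) / sd
        differences.append(abs(z_a - z_b))
    return mean(differences)


def classify(test_profile, train_profiles, stats):
    """Step 2: assign a test segment to the training class with the smallest Delta."""
    return min(
        train_profiles,
        key=lambda label: delta(test_profile, train_profiles[label], stats),
    )


# Invented relative frequencies for illustration; not measured values.
train = {
    "shak": {"the": 0.040, "and": 0.030, "of": 0.020},
    "kyd": {"the": 0.060, "and": 0.020, "of": 0.030},
    "beau": {"the": 0.050, "and": 0.040, "of": 0.010},
}
test = {
    "segment_1": {"the": 0.041, "and": 0.031, "of": 0.019},
    "segment_2": {"the": 0.059, "and": 0.021, "of": 0.029},
}

stats = feature_stats(train, ["the", "and", "of"])
predictions = {name: classify(profile, train, stats) for name, profile in test.items()}
print(predictions)
```

A nearest-profile rule of this kind always returns one of the candidate classes, which is why the composition of the training set matters: a segment by an author who is not among the candidates will still be assigned to whichever candidate is stylistically closest.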