Katharsis – a Tool for Computational Drametrics
Thomas Schmidt, University of Regensburg, Germany
Manuel Burghardt, University of Leipzig, Germany
Katrin Dennerlein, University of Würzburg, Germany
Christian Wolff, University of Regensburg, Germany

Please cite as: Schmidt, T., Burghardt, M., Dennerlein, K. & Wolff, C. (2019). Katharsis - A Tool for Computational Drametrics. In: Book of Abstracts, Digital Humanities Conference 2019 (DH 2019). Utrecht, Netherlands.

1. Introduction

With his idea of 'Distant Reading', Moretti (2000) introduced an important leitmotif in the Digital Humanities that has led to an ongoing discussion about quantitative methods in literary and cultural studies (Clement et al., 2008; Crane, 2006). We believe that the literary genre of drama is particularly well suited for quantitative analyses and hence adopt the concept of "Drametrics" (as proposed by Romanska, 2015) as a term for the distant reading of dramatic texts. In addition to the actual dialogs, dramatic texts contain other structural elements that can be easily quantified, such as the characters of the play and an explicit act and scene structure. Given these features, it is hardly surprising that a number of recent studies are dedicated to the quantitative analysis of drama (e.g. Ilsemann, 2013; Wilhelm et al., 2013; Nalisnick and Baird, 2013; Trilcke et al., 2015; Dennerlein, 2015; Xanthos et al., 2016; Willand and Reiter, 2017; Krautter, 2018). At the same time, quantitative approaches to the analysis of drama date far back into the pre-digital age. One early example is Marcus’ (1973) mathematical poetics, which contains interesting approaches to quantitative drama analysis.

2. Solomon Marcus’ Mathematical Poetics

Marcus suggests the scenic presence of characters as a basic computable measure of a play, which, for each dramatic text, can be visualized by means of a configuration matrix (Marcus, 1973). The matrix (cf. figure 1) contains one row for each character of the play and one column for each scene. Whenever a character appears on stage in a scene, the value 1 is entered into the corresponding cell; if the character is not present in that scene, 0 is entered.

Figure 1: An example configuration matrix visualizing the appearances of characters (A-G) throughout the 15 scenes of a play.

Configuration matrices can be used to compute various quantitative aspects of a drama, for instance the scenic distance and proximity of characters, specific relationships between characters (e.g. dominance, alternation, independence or concomitance) and the overall configuration density of a play (Marcus, 1973). The configuration density is calculated by dividing the number of cells holding a 1 by the total number of cells. In other words, the configuration density indicates how many of the potential character appearances have actually been realized. It can be understood as a measure of a play’s 'population density'. When every character appears on stage in every scene, the play reaches the theoretical maximum configuration density of 1.

During the 1970s and early 1980s, several studies applied Marcus’ mathematical approach to the analysis of texts, each dealing with only a few text samples (cf. Marcus, 1974; Marcus, 1977; Marcus, 1984). In these studies, configuration matrices proved useful for text analysis, as they speed up and simplify gaining an overview of a character’s first or last appearance and of a character's co-presence with, or avoidance of, other characters. Some years later, Ilsemann (1998) took up the ideas of Solomon Marcus to explore Shakespeare’s plays in a quantitative way.
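To make the configuration matrix and the configuration density defined in this section concrete, the following minimal Python sketch builds a matrix from a list of scenes and computes the density as the share of cells holding a 1. The function names and the toy play with characters A-C are assumptions chosen for illustration; they are not taken from Marcus or from the Katharsis code base.

```python
from typing import Dict, List, Set


def configuration_matrix(scenes: List[Set[str]]) -> Dict[str, List[int]]:
    """Build one row per character and one column per scene.

    A cell holds 1 if the character is on stage in that scene, else 0.
    """
    characters = sorted(set().union(*scenes))
    return {c: [1 if c in scene else 0 for scene in scenes] for c in characters}


def configuration_density(matrix: Dict[str, List[int]]) -> float:
    """Number of cells holding a 1 divided by the total number of cells."""
    total_cells = sum(len(row) for row in matrix.values())
    if total_cells == 0:
        return 0.0  # empty play: no potential appearances at all
    filled_cells = sum(sum(row) for row in matrix.values())
    return filled_cells / total_cells


# Hypothetical toy play: each set lists the characters on stage in one scene.
scenes = [{"A", "B"}, {"B", "C"}, {"A", "B", "C"}, {"C"}]
matrix = configuration_matrix(scenes)
density = configuration_density(matrix)

for character, row in matrix.items():
    print(character, row)
print(f"configuration density: {density:.2f}")
```

In this toy example, 8 of the 12 potential appearances are realized, so the density is about 0.67; a play in which every character is present in every scene would yield exactly 1.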
Ilsemann (1998) used the frequency and length of characters’ speeches as further parameters and found that the configuration density is an important aspect of genre-distinct quantitative patterns for comedies, romances, tragedies and history plays. In 2005 and 2008, Ilsemann used the frequencies and distributions of speech lengths to discuss authorship attribution in Shakespeare’s plays.

https://dev.clariah.nl/files/dh2019/boa/0584.html

3. The Katharsis Tool

In order to automatically analyze quantitative aspects of dramatic texts according to Marcus’ character configurations and Ilsemann’s analysis of speech lengths and frequencies, we have created Katharsis, a tool for computational drametrics. The Katharsis tool comprises a parsing component that extracts and calculates various quantitative parameters as suggested by Marcus (1973) and an analysis component that lets users search for dramatic texts by author, genre, time frame, etc. Currently, a test corpus of approx. 100 German drama texts from the TextGrid Repository is available for analysis. The texts are available as TEI-XML, allowing for the extraction of metadata (title, author, year etc.) and of speeches with the corresponding speaker and structural information. Note that the tool can be extended with further plays by other authors and from other genres if the texts are encoded in TEI-XML. Furthermore, the quantitative metrics are independent of the language.

Figure 2 shows the Katharsis results for a search for dramatic texts by Friedrich Schiller. Users can download any quantitative information displayed in the screenshot in JSON format for individual analysis.

Figure 2: Summary of quantitative information calculated by Katharsis for dramatic texts by Friedrich Schiller.

With the help of Katharsis, researchers are also able to examine a specific drama in more detail.
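Examining a single play starts from the parsed TEI-XML. As a rough illustration of the extraction step described above, the following Python sketch reads the title, the author and the speeches with their speaker, act, scene and word count from a TEI document. It is not the Katharsis parser: the sample document is invented, and the assumed markup (`div` elements with `type="act"` and `type="scene"`, `sp` elements with a `speaker` child) is only one common TEI encoding that real corpus files may deviate from.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
TEI_PREFIX = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical miniature play, invented for this example.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title>Beispiel</title>
    <author>Erika Mustermann</author>
  </titleStmt></fileDesc></teiHeader>
  <text><body>
    <div type="act">
      <div type="scene">
        <sp><speaker>ANNA.</speaker><p>Wer ist da?</p></sp>
        <sp><speaker>BERT.</speaker><p>Ich bin es, dein Bruder.</p></sp>
      </div>
      <div type="scene">
        <sp><speaker>ANNA.</speaker><l>So komm herein</l><l>und setz dich.</l></sp>
      </div>
    </div>
  </body></text>
</TEI>"""


def parse_play(tei_xml: str) -> dict:
    """Extract metadata and a flat list of speeches from a TEI string."""
    root = ET.fromstring(tei_xml)
    title = root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=TEI_NS)
    author = root.findtext(".//tei:titleStmt/tei:author", default="", namespaces=TEI_NS)

    speeches = []
    acts = root.findall(".//tei:div[@type='act']", TEI_NS)
    for act_no, act in enumerate(acts, start=1):
        scenes = act.findall("tei:div[@type='scene']", TEI_NS)
        for scene_no, scene in enumerate(scenes, start=1):
            for sp in scene.findall("tei:sp", TEI_NS):
                speaker = sp.findtext("tei:speaker", default="", namespaces=TEI_NS)
                speaker = speaker.strip().rstrip(".")
                words = []
                for child in sp:
                    # Skip the speaker label and stage directions when counting words.
                    if child.tag in (TEI_PREFIX + "speaker", TEI_PREFIX + "stage"):
                        continue
                    words.extend("".join(child.itertext()).split())
                speeches.append(
                    {"act": act_no, "scene": scene_no, "speaker": speaker, "words": len(words)}
                )
    return {"title": title, "author": author, "speeches": speeches}


play = parse_play(SAMPLE_TEI)
print(play["title"], "by", play["author"])
for speech in play["speeches"]:
    print(speech)
```

The flat list of speeches is a deliberate simplification: grouping it by scene yields the sets of speakers needed for a configuration matrix, and the word counts feed speech-length statistics.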
The tool provides an interactive configuration matrix to explore character appearances and speech statistics for each configuration (figure 3).

Figure 3: Katharsis snippet of the interactive configuration matrix for the play Maria Stuart by Friedrich Schiller.

Furthermore, Katharsis produces a table and several interactive bar charts to analyze the distribution of speakers and speech statistics on the structural levels (act and scene) and the progression of these metrics over the course of a play (for an example see figure 4).

Figure 4: Average length of speeches (measured in number of words) throughout all acts of the play Maria Stuart by Friedrich Schiller.

Another segment of the tool compares speakers with respect to speech statistics and the distribution of scenic presence. Furthermore, following Marcus’ (1973) approach, specific character relations derived from the configuration matrix can be explored. For each character of the play, the tool displays relations to other characters, which may be of the type dominating/dominated, alternative, independent or concomitant.

The last component for the analysis of individual dramatic texts follows Ilsemann’s (2005; 2008) idea of examining the distribution of speech lengths in a play. Speech length is calculated by counting the number of words. Users can analyze an interactive histogram and a curve chart. The range of speech lengths included in the visualization can be adjusted dynamically for more in-depth analysis (see figure 5 for an example with a comparison).

Finally, Katharsis can be used to analyze and compare self-created collections of plays with respect to various quantitative aspects. Comparisons of different genres and authors are pre-configured.
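The speech-length analysis described above can be sketched in a few lines of Python: speech length is the number of words, the relative distribution gives the share of speeches per length, and the most frequent length is the mode of that distribution. The function names, the optional range parameters and the sample speeches are assumptions made for this example and do not reproduce the Katharsis implementation.

```python
from collections import Counter
from typing import Dict, List, Optional


def speech_lengths(speeches: List[str]) -> List[int]:
    """Length of each speech, measured in whitespace-separated words."""
    return [len(speech.split()) for speech in speeches]


def relative_distribution(
    lengths: List[int], min_len: int = 1, max_len: Optional[int] = None
) -> Dict[int, float]:
    """Share of speeches per length, restricted to [min_len, max_len].

    Shares stay relative to all speeches, so narrowing the range only
    hides bars of the histogram and does not rescale the remaining ones.
    """
    if not lengths:
        return {}
    counts = Counter(lengths)
    upper = max(lengths) if max_len is None else max_len
    total = len(lengths)
    return {n: counts[n] / total for n in sorted(counts) if min_len <= n <= upper}


def most_frequent_length(lengths: List[int]) -> int:
    """Mode of the speech lengths; ties are resolved toward the shorter length."""
    if not lengths:
        raise ValueError("at least one speech is required")
    counts = Counter(lengths)
    return min(counts, key=lambda n: (-counts[n], n))


# Hypothetical speeches, invented for illustration.
speeches = ["Yes.", "No, never.", "Come in.", "I will not go there today.", "Why not?"]
lengths = speech_lengths(speeches)
distribution = relative_distribution(lengths)
narrowed = relative_distribution(lengths, max_len=3)
mode = most_frequent_length(lengths)

print(lengths)
print(distribution)
print(narrowed)
print(mode)
```

Keeping the shares relative to all speeches is one possible design choice; an alternative would be to renormalize within the selected range, which changes the height of the bars but not the position of the most frequent length.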
Figure 5 illustrates a comparison of speech lengths for Goethe and Schiller, showing that Goethe’s most frequent speech length is seven words, while Schiller’s is lower at only four words. This might be one reason why the plays of Goethe were never as successful on stage as those of Schiller.

Figure 5: Comparison of the relative distribution of speech lengths for the plays of Goethe and Schiller.

The Katharsis tool is available online and can be tested as a live demo in any current web browser: http://lauchblatt.github.io/Katharsis/index.html

4. Case Studies on Quantitative Drama Analysis

In this section, we illustrate the usefulness of Katharsis by means of short case studies. An important computable aspect of dramatic texts is the encounter of characters on stage in different configurations. A case study that used Katharsis on 13 tragedies, 17 comedies, one tragicomedy and one Schauspiel by the German authors Andreas Gryphius, Christian Weise, and Gotthold Ephraim Lessing confirmed the hypothesis that there is a trend for comedies to have higher configuration densities than tragedies (Dennerlein, 2015). For German dramatic texts from 1600 to 1800, the mean length of speeches in comedies is lower than in tragedies (see figure 6), whereas the total number of speeches is higher (see figure 7), which suggests that characters in comedies interact in a more dialogic manner.

Figure 6: Average length of speeches in comedies and tragedies of the corpus.

Figure 7: Average number of speeches in comedies and tragedies of the corpus.

This seems plausible with regard to some characteristics of tragedies and comedies already known: Tragedies more often feature monologues because they provide the ideal occasion to