Can a Troubled Political Era Be Detected by Machine Learning Methods? an Application on the End of Year Speeches of the Italian Presidents
Total Page:16
File Type:pdf, Size:1020Kb
Can a troubled political era be detected by machine learning methods? An application on the End of Year speeches of the Italian Presidents Franco M. T. Gatti1, Arjuna Tuzzi2 1Università degli Studi di Padova – [email protected] 2Università degli Studi di Padova – [email protected] Abstract President Oscar Luigi Scalfaro took office in 1992, when the Italian political and judicial climate was highly turbulent. As his speeches may be considered a good source of information for grasping the transition between the so called "First" and the "Second" Republic, this study aims at analysing the 71 End of Year speeches pronounced by the 11 Presidents of the Italian Republic in order to check whether we are able to detect a clear-cut fracture within the corpus by means of supervised machine learning methods applied to texts. For this purpose, our corpus has been arranged in three different configurations to assess the performance of different classification tasks within the training set (cross-validation procedure) and to observe in which one of the three arrangements the algorithms perform better. For the final attribution task we chose the configuration that showed the highest accuracy and we split the set of speeches in a training set and a test set to observe how the algorithms assign the text chunks of the test set to the two classes (First or Second Republic). A comparison between the performance of the two different algorithms used in this work, i.e. Support Vector Machine (SVM) and Random Forest (RF) has been discussed throughout the whole study. With reference to previous studies, Scalfaro's speeches confirmed their role as transition messages and the maximum precision was reached by both algorithms when they were discarded by the two classes. Keywords: machine learning, Support Vector Machine, Random Forest, presidential speeches, End of Year speeches Riassunto Il Presidente Oscar Luigi Scalfaro ha iniziato il suo mandato nel 1992, quando il clima politico e giudiziario italiano era fortemente turbolento. Dal momento che i suoi discorsi possono essere considerati una buona fonte informativa per cogliere la transizione tra le cosiddette "Prima" e "Seconda" Repubblica, questo lavoro mira a studiare i 71 messaggi di fine anno pronunciati dagli 11 Presidenti della Repubblica Italiana per stabilire se è possibile rilevare una frattura netta all’interno del corpus, con l’ausilio di metodi di supervised machine learning applicati ai testi. Per raggiungere questo obiettivo, il corpus è stato preparato in tre configurazioni diverse al fine di saggiare la performance di diversi processi di classificazione all’interno del training set (procedura di cross- validation) e di osservare in quale delle tre configurazioni gli algoritmi raggiungono la migliore performance. Per l’ultima procedura di attribuzione abbiamo scelto la configurazione che ha mostrato la precisione più alta abbiamo suddiviso l'insieme dei discorsi in un training set e un test set, al fine di osservare come i frammenti del test set vengono assegnati dagli algoritmi alle due classi (Prima o Seconda Repubblica). Un confronto tra le performance dei due diversi algoritmi utilizzati in questo lavoro, Support Vector Machine (SVM) e Random Forest (RF), è stato discusso nel corso della ricerca. I discorsi di Scalfaro hanno confermato il loro ruolo di messaggi di transizione già evidenziato in lavori precedenti e la massima precisione è stata raggiunta da entrambi gli algoritmi quando sono stati esclusi dalle due classi. Parole chiave: machine learning, Support Vector Machine, Random Forest, discorso presidenziale, discorsi di fine anno! JADT 2020 : 15es Journées internationales d’Analyse statistique des Données Textuelles 2 FRANCO M. T. GATTI, ARJUNA TUZZI 1. Introduction In 1992, a few months before Oscar Luigi Scalfaro took office as the 9th President of the Italian Republic, the climate within Italian society grew troubled. Some judiciary investigations uncovered a thick network of bribe affairs between almost all the main political parties and many influential representatives of the business and financial world. Furthermore, the assassination of magistrate Giovanni Falcone - who was conducting some investigations on the links between Cosa Nostra (a well-known mafia-type Italian criminal organization) and some political leaders of the country - took place on May 23, 1992, precisely on the same days when the sessions for the election of the President of the Republic were taking place. This tense period was exploited by the new President Scalfaro as a chance to strengthen one of the key functions of the Head of State office: a role of super partes guarantor. In order to keep the balance in a troubled country he took the opportunity to deliver speeches that were outdistanced from those of his predecessors and that appear quite different from those of his successors (Cortelazzo 2007); even in some of those institutional ceremonies well known for characterizing the role of the President with a certain rituality. This is the case of the End of Year speeches that represent a traditional and relevant social event in the year lifetime of many Countries when the President (or the King or Queen of a Country) gives a wishes speech to the whole citizens (Cortelazzo and Tuzzi, 2007). The corpus collected and arranged for this research might be a good testing field in order to check whether the fracture opened in 1992 within the Italian society can be detected also in the President’s End of Year speeches. And more specifically whether it is possible to do it by means of machine learning methods aimed at text classification. All the analyses included in this work have been processed within the version 3.6.1 of R (R Core Team, 2019) environment for statistical computing. 2. Corpus The corpus of the End of Year addresses of the Presidents of the Italian Republic (Table 1) is 120,531 word tokens in size and includes all the 71 speeches delivered by all the 11 Presidents: Luigi Einaudi (1948-1955), Giovanni Gronchi (1955-1962), Antonio Segni (1962-1964), Giuseppe Saragat (1964-1971), Giovanni Leone (1971-1978), Sandro Pertini (1978-1985), Francesco Cossiga (1985-1992), Oscar Luigi Scalfaro (1992-1999), Carlo Azeglio Ciampi (1999-2006), Giorgio Napolitano (2006-2015), and Sergio Mattarella (2015-current). We have not included Enrico De Nicola (Provisional Head of State and first President of the Italian Republic in the time span 1946-48) because his successor, Luigi Einaudi, inaugurated the tradition of the Italian End of Year speech in Italy and his first discourse took place during the second year of his office in 1949. The Italian Constitution envisages a presidential office of 7 years and our corpus includes seven speeches for most Presidents. Luigi Einaudi (6 speeches), Antonio Segni (2 speeches, since he resigned from his position after two years), Giorgio Napolitano (9 speeches, since in 2013 he obtained a second office and, then, resigned after two years), and Sergio Mattarella (5 speeches, as he is currently in office) represent exceptions. ! JADT 2020 : 15es Journées internationales d’Analyse statistique des Données Textuelles CAN A TROUBLED POLITICAL ERA BE DETECTED BY MACHINE LEARNING METHODS? 3 Table 1. Basic measures of the End of Year speeches of the Italian Presidents size in word tokens Office No. disc. total min max mean Luigi Einaudi 1948-1955 6 1,203 150 260 201 Giovanni Gronchi 1955-1962 7 5,829 388 1,252 833 Antonio Segni 1962-1964 2 1,795 738 1,057 898 Giuseppe Saragat 1964-1971 7 8,476 465 1,929 1,211 Giovanni Leone 1971-1978 7 7,379 262 1,604 1,054 Sandro Pertini 1978-1985 7 15,592 1,340 3,746 2,227 Francesco Cossiga 1985-1992 7 13,890 418 3,345 1,984 Oscar Luigi Scalfaro 1992-1999 7 24,675 2,085 5,012 3,525 Carlo Azeglio Ciampi 1999-2006 7 12,597 1,193 2,129 1,800 Giorgio Napolitano 2006-2015 9 20,389 1,717 2,643 2,265 Sergio Mattarella 2015-current 5 8,706 1,102 2,127 1,741 Corpus 71 120,531 150 5,012 1,698 Figure 1 highlights the changing length in word tokens of speeches over time. As already illustrated in previous studies (cfr. Cortelazzo and Tuzzi 2007) the size of speeches is highly mutable: at the very beginning short speeches were delivered at the radio (in 1959 the fifth address by Gronchi was broadcast on television), some Presidents decided to be very concise in the first addresses (e.g. Gronchi, Saragat, Leone), in 1991 Cossiga chose to deliver a very short last message that was only an anticipation of his resignation from office a few months later (and 1992 emerges again as a troubled time in the history of democratic institutions in Italy); on the contrary, Scalfaro's discourses are wide and eloquent. Previous studies have also already explained that the text genre is only apparently flat and homogeneous. The individual traits of the President are more relevant than other variables, for example, more relevant than the temporal proximity among Presidents, as well as ideological proximity, generation proximity, etc. (Pauli and Tuzzi 2009, Bernardi and Tuzzi 2007, Cortelazzo and Tuzzi 2016). An analysis of chronological data highlighted that the time effect is not as important as Presidents’ individual choices and offered a portrait of some Presidents as representatives that were able to affirm their personality beyond contingent issues (Trevisani and Tuzzi 2012). As shown in the correspondence analysis in Figure 2, two Presidents clearly emerges from the others: Sandro Pertini and Oscar Luigi Scalfaro. Pertini can be considered unique as he often delivered speeches without following a written canvas word by word. Scalfaro’s differentiation shows that he can’t be easily assimilated to any other presidential style.