
The Finnish Civil War and partisan bias: an experimental study of expressive voting

Noora Pirneskoski

Department of Economics

Hanken School of Economics

Helsinki

2019

HANKEN SCHOOL OF ECONOMICS

Department of: Economics

Type of work: Master’s thesis

Author: Noora Pirneskoski

Date: 9.4.2019

Title of thesis:

The Finnish Civil War and partisan bias: an experimental study of expressive voting

Abstract: My study contributes to the literature on expressive voting by demonstrating how the voting behavior of partisans differs from the market choices of decisive individuals. The experiment follows an experimental protocol similar to that of Robbett and Matthews (2018), but with a completely new subject pool and context. By randomly assigning the respondents to a group of 1 or 5 individuals, the experiment tries to establish whether the self-identified descendants of the partisans of the Finnish Civil War give more expressive answers when voting than decisive individuals do.

My experiment successfully replicates the main findings of Robbett and Matthews (2018). The results show that the answers of the voters become significantly more partisan in comparison to those of decisive individuals. Moreover, the same result is found both for questions relating to the Civil War and for questions on contemporary politics. However, the likelihood of a correct answer did not seem to change between treatment and control.

My results demonstrate that, alongside material preferences, affirmation of partisan identity can be a major driver of voter behavior. Further, the persistence of the partisan gap regarding both contemporary and Civil War-related facts suggests inheritability of voter choice over generations. The results show that there are benefits to further investigating expressive voting behavior with experimental methods and to widening views on how political partisanship is understood.

Keywords: Voting, expressivity, partisanship, civil war, experimental economics, voting behavior, expressive voting, elections

CONTENTS

1 Introduction
2 The Finnish Civil War and its significance today
2.1. Preconditions of the Civil War
2.2. The Finnish Civil War
2.3. Memories of the war
3 Economic theory on voting and partisan bias
3.1. Partisanship and partisan bias
3.2. The expressive voting theory
3.3. Formalizing partisanship and expressive voting
3.3.1. Experimental evidence of expressive voting
4 Experimental design and empirical strategy
4.1. Experimental design
4.2. Predictions
4.3. Statistical models
4.4. Sample
4.4.1. Determining sample size: power and sample size analysis
4.5. Pilot experiments
5 Results
5.1. Descriptive statistics
5.2. Partisan gap
5.3. Likelihood of a correct response
5.4. Robustness checks
5.4.1. Bonferroni adjustment
5.4.2. Post-study probabilities
6 Discussion
6.1. Implications for the expressive voting theory
6.2. Implications for understanding partisanship
6.3. Limitations
6.4. Internal and external validity
7 Conclusions
References
Appendix 1 Survey Form
Appendix 2 Pilots
First pilot
Results
Modifications to the experimental design based on the first pilot results
Second pilot
Results
Modifications to the experimental design based on the second pilot results
Appendix 3 Individual questions, regression results
Appendix 4 Robustness checks

TABLES

Table 1 Casualties of the Civil War. Adapted from VNK (2001)
Table 2 Experimental questions
Table 3 Two types of errors. Adapted from Casella and Berger (2002)
Table 4 Descriptive statistics
Table 5 Support for Civil War forces
Table 6 Voting and the partisan gap, linear regression models
Table 7 Likelihood of correct answer, linear likelihood estimation
Table 8 PSPs with zero or one successful replication
Table 9 Pilot 1 results
Table 10 Pilot 2 results
Table 11 Partisan results from both pilots
Table 12 All tested questions
Table 13 Linear regression results, individual questions
Table 14 Alternative dummyfications for Table 6
Table 15 Alternative dummyfications for Table 7
Table 16 Question type’s effect on partisan gap
Table 17 Probit model, likelihood of a correct answer, challenging questions

FIGURES

Figure 1 Power and sample size
Figure 2 Partisan gap in treatment and control groups
Figure 3 Partisan gap in different question categories
Figure 4 Voting and the likelihood of correct answer


1 INTRODUCTION

The year 2018 marked the centenary of the Finnish Civil War. Despite the temporal distance, the debate on the meaning of the war is still lively, as shown by the plethora of articles, TV programs and commemorative events that took place throughout 2018. Even though the contemporary official discourse tries to transcend the old dichotomies of the war and instead focus on the effects of human suffering on society, the old divisions sometimes resurface. In a recent poll by YLE, 7 % of Finns stated that the Civil War still divides the nation “strongly”, and 68 % saw the divisions as “somewhat strong” (Palmolahti, 2018). These views stem from the interpretations that the two sides of the war – the Whites and the Reds – have made of the conflict. According to Kinnunen (2014), on the one hand, the Civil War is seen as an independence struggle in which the White army saved the country from Soviet rule. On the other hand, the war serves as a symbol of a fight for a more equal society, in which the imprisoned and executed Red soldiers are seen as victims and martyrs (Kinnunen, 2014).

Partisan identities seem to affect how individuals answer factual questions: studies on survey responses of political partisans have found that partisans tend to give answers that are consistent with their partisan identities (Prior et al., 2015). Partisanship is also strongly related to voting behavior. Whereas traditional economic accounts of voting have usually considered voter choice in elections as a choice over material outcomes based on voter preferences, the expressive voting theory argues that voters derive “expressive utility from acts or decisions that substantiate or confirm personal identity” (Hillman, 2010, p. 404). From this perspective, political partisans can be seen to derive intrinsic benefit from casting a vote that confirms partisan identity, independent of the actual outcome of the elections. So far, the expressive voting behavior of partisans has not been widely studied with experimental methods, even though experimental evidence on, for example, ethical bias in voting already exists (Shayo & Harel, 2012; Feddersen et al., 2009).

My study follows the lines of Robbett and Matthews (2018) and contributes to the literature on expressive voting by demonstrating how the voting behavior of partisans might differ from the market choices of decisive individuals. The aim of my thesis is to study whether individuals who identify themselves as descendants of the White or Red forces of the 1918 Finnish Civil War engage in partisan voting behavior. With some restrictions, my experiment falls into the second replication category of Levitt and List (2009): it follows an experimental protocol similar to that of Robbett and Matthews (2018) but with a completely new subject pool and context. In the experiment, each respondent is randomized into a group of 1 or 5 people, where they are incentivized to answer (control group) or vote for (treatment group) the correct answer to ten questions related to the Finnish Civil War and contemporary Finnish politics. Therefore, my experiment assesses not only whether the self-disclosed family history of an individual affects views on the Civil War in a voting situation, but also whether the partisan bias is present when voting on contemporary political issues. The main prediction of my study is that individuals who self-identify as descendants of the Red or White partisans of the Civil War engage in more expressive behavior when voting for a correct answer than when answering individually. Consequently, the prediction entails that the partisan gap, defined as the difference between the responses of the self-identified Red and White respondents, should be larger for voters than for decisive individuals. The increased expressivity should also decrease the likelihood of a correct answer in the treatment group in comparison to the control group.
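The partisan gap comparison described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each answer is rescaled to a score between 0 and 1 on which higher values are more congenial to the Red interpretation. The function name, the field names and the toy data are illustrative assumptions and are not taken from the experiment or its results.

```python
from statistics import mean


def partisan_gap(responses):
    """Mean score of Red identifiers minus mean score of White identifiers."""
    red = [r["score"] for r in responses if r["side"] == "red"]
    white = [r["score"] for r in responses if r["side"] == "white"]
    return mean(red) - mean(white)


# Hypothetical toy data, NOT experimental results.
# Control: decisive individuals (group of 1).
control = [
    {"side": "red", "score": 0.5},
    {"side": "red", "score": 0.5},
    {"side": "white", "score": 0.5},
    {"side": "white", "score": 0.5},
]
# Treatment: voters (group of 5).
treatment = [
    {"side": "red", "score": 0.75},
    {"side": "red", "score": 0.5},
    {"side": "white", "score": 0.5},
    {"side": "white", "score": 0.5},
]

gap_control = partisan_gap(control)
gap_treatment = partisan_gap(treatment)

# The main prediction is that this difference is positive.
gap_increase = gap_treatment - gap_control
print(gap_control, gap_treatment, gap_increase)
```

The thesis itself estimates the gap with regression models (section 4.3); the sketch only illustrates the quantity that is being compared between the two conditions.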

My experiment finds confirming evidence of expressive voting similar to what Robbett and Matthews (2018) found in their study. The results show that the answers of voters become significantly more partisan in comparison to those of decisive individuals: in the control group, no partisan gap could be observed, but in the treatment condition the gap increases to around 8 percentage points. Interestingly, the absence of a partisan gap in the control group would imply that, if the incentives were high enough, the partisan gap could disappear from the voting condition as well. Moreover, similar results are found both for questions relating to the Civil War and for questions on contemporary politics, which suggests that political divisions might be transmitted over generations. However, the findings regarding the change in the likelihood of a correct answer are inconclusive. Unlike in Robbett and Matthews (2018), after robustness checks it seems that the likelihood of a correct answer in the treatment condition does not significantly differ from that of the control group. This could be explained by the control group using a “heuristic of moderation”, where in an information-poor situation submitting answers close to mid-range options is a reasonable strategy for dealing with the lack of information.

There are several ways in which my experiment contributes to the existing literature. Following Robbett and Matthews (2018), it demonstrates that affirmation of partisan identity can be a driver of voter behavior alongside material preferences. Furthermore, experimental evidence on expressive voting as a way to confirm an individual’s partisan identity is limited: the evidence of partisan bias in past studies is mostly from surveys done in an Anglo-American context, where the electoral systems can be seen to operate in a way that induces partisan behavior. A further issue is that survey methods for studying partisanship suffer from several problems such as omitted variable bias and reverse causality (Gerber et al., 2010). Experiments have been argued to be a good solution for issues typical of observational studies: experiments use random assignment to the treatment and control groups, making the assignment to treatment independent of potential outcomes (Angrist & Pischke, 2009). Therefore, utilizing an experimental research design, which is still not widely used for studying elections, suggests high reliability of the findings.

It has also been argued that the concept of partisanship is not well suited for countries with multi-party systems such as Finland (Nyhan & Reifler, 2018), but the results of my experiment seem to suggest otherwise: the results provide novel evidence of partisanship in the Finnish context. The experiment shows that expressive partisan voting behavior can be found in multi-party systems when the right identities are “activated”, and that the partisan gap does not necessarily need to be party-political in nature. Instead, it seems that affirmation of partisan identity can be caused by other divisions in society. Therefore, the results of the experiment suggest that shifting the focus of future research toward these divisions, especially in multi-party systems, might help us better understand voter behavior.

Apart from contributing to the literature on expressive voting, the study also adds to the discussions on political socialization and the effects of civil wars. Political scientists have long been interested in the inheritability of voter choice, but this topic has been relatively underexplored in Finland (Tiihonen, Kestilä-Kekkonen, Westinen, & Rapeli, 2016). In the Finnish context, my experiment contributes to research on the effects of the Civil War on society as well as on the inheritability of voter choice by providing unique evidence that could suggest that old divisions are transmitted over generations: the partisan gap was observed in the responses of the voters regarding both contemporary and Civil War-related facts. Therefore, it provides a novel opening to the debate on voter choice and political socialization in Finland. Further, the results also raise interesting new questions on how partisan identities are formed, and on the causes and effects of self-identification.

My thesis is structured in the following way: first, it briefly introduces the Finnish Civil War and highlights the lasting divisive effect of the war on Finnish society. The second chapter also serves as a background for the questionnaire that was used in the pilot experiment, providing answers to the factual questions about the Civil War that are used in the experiment. Next, the third chapter presents the theoretical foundations of this study by discussing previous research on partisan bias and expressive voting and shows how partisan and expressive voting behavior can be modelled. The chapter also introduces how previous experimental studies have tried to assess expressivity. The fourth chapter presents the pre-registered experiment protocol and empirical strategy of the study, and it also discusses factors that have determined the sample of the experiment.

The fifth chapter discusses the results of the experiment. First, section 5.1 presents some descriptive statistics of the sample, and then section 5.2 shows that the partisan gap in the treatment group is significantly different from the control group. Section 5.3 looks at whether the likelihood of a correct answer is lower in the treatment than in the control group but does not find conclusive evidence on this especially after the robustness checks performed in section 5.4. Apart from the Bonferroni adjustment used as a robustness check, section 5.4 also looks at post-study probabilities of the results. The discussion in chapter 6 considers what implications the results have for the expressive voting theory as well as for the conversation on political socialization in Finland. The sixth chapter also presents the limitations of the study and discusses its internal and external validity. The last chapter concludes the thesis by presenting some ideas on possible paths for future research and by emphasizing the importance of understanding expressive voting behavior.


2 THE FINNISH CIVIL WAR AND ITS SIGNIFICANCE TODAY

“This cannot simply be swept away. We must have the courage to be honest about history because only honesty creates a foundation for trust. A strong society is able to face up to painful things as well. We must try to reconcile the past.” (Niinistö, 2018)

The year 2018 marked the centenary of the Finnish Civil War that started on the 27th of January 1918. In his annual new year’s speech broadcast to the whole country, President Sauli Niinistö opened the year by reminding the viewers and listeners that “in the early days of independence we were not ‘together’, but very badly apart”. He then continued to emphasize the importance of connection between people with different backgrounds and the ability to respect different opinions. There is a reason behind this rhetoric – Finns are still not unanimous on the consequences of the turmoil in 1918. A recent poll by YLE shows that 7 % of Finns still think that the Civil War divides the nation “strongly” and 68 % see the divisions to be “somewhat strong” (Palmolahti, 2018).

This chapter gives a brief introduction to the history of the Finnish Civil War. The first part of this chapter presents some background on what are thought to be the causes of the conflict, highlighting the forces that were dividing the nation. The second section then moves on to describe the development of the war and its immediate consequences. For the purposes of the experiment, some aspects of the war were identified as having partisan political significance: prison camps, Red and White terror, the number of Red casualties, and the involvement of Russian soldiers in the war. These aspects are further discussed in the second sub-chapter. The last part of this chapter describes the different interpretations of the conflict and how different actors in Finnish society have used the conflict as an active part of memory politics. It also sheds light on what the Civil War means to Finns today.

2.1. Preconditions of the Civil War

As Haapala (2014) has described, the development of the Finnish Civil War was shaped by a mix of internal and external conditions. The conflict required a complex interplay of deepening class and political divisions as well as the conditions imposed upon Finland by the First World War and the crisis of the Russian Empire (Haapala, 2014). Even though different academic accounts of the war describe the extent, development and intensity of the class divide in Finland and its importance for the development of the war differently (Szpunar, 2012), early 20th century Finland can be described as a class society. The political divisions of the Civil War followed the typical class divisions of the time: people were divided into the Reds and the Whites depending on whether they belonged to the non-owning or owning class (Haapala, 2014). Finnish society was very agrarian at this time, so a big part of the Reds were landless farmers called “crofters” (torppari) from southern Finland. However, the Red forces were led by the more organized urban workers concentrated in the bigger cities (Haapala, 2014). Similarly, the owning class of the Whites was mostly composed of farmers, who usually did not own more than 20 hectares of land, as well as the small middle class of lower-level civil servants and small entrepreneurs (Haapala, 2014). However, it should be noted that many Finns did not participate in the conflict at all or were drafted against their will (Hoppu, 2009a; Osinsky & Eloranta, 2014).

In a sense, the Finnish Civil War is an interesting example of the development of a civil conflict. Under Russian rule, even before its independence in 1917, the country had strong institutions in all fields of life besides the military (Haapala, 2014). The February Revolution of 1917 in Russia did not cause strong disturbance to economic and political life: the main political institutions such as the parliament, political parties and local administration were not strongly affected when the rule of the Romanov monarchy was overturned in Russia (Osinsky & Eloranta, 2014). However, the events of 1917, as well as the First World War, still played a big role in the development of the conflict. Because of the First World War, Finland as a part of Russia lost important markets in the West, and even though some of this loss was compensated by increased trade with Russia, the country slowly slid into an economic crisis (Arosalo, 1998; Heikkinen, 2017). As the World War dragged on, the economic conditions of the urban population and industrial workers in particular started to worsen significantly, which led to riots and strikes in the big cities and highlighted the class divides (Osinsky & Eloranta, 2014). Perhaps the most critical factor for the development of the war was the lack of state institutions to maintain public order: until the end of 1917, Finland did not have a properly constituted police force or army capable of protecting individual rights or property (Kissane, 2004).

Instead, to fill this power vacuum, militias started forming in order to either protect the workers’ rights or defend bourgeois interests against a radicalized labor movement. These militias would eventually form the two sides of the Civil War: the Red Guards and the Civil Guards. By the end of 1917, there were already 40,000 Civil Guards and 30,000 Red Guards in Finland (Kissane, 2004). From January 1918 onwards, the opposing guards started attempting to seize each other’s arms, which resulted in small shootouts in many municipalities (Siltala, 2014). In the unstable situation, the local Red Guards started gaining more power, and as a last attempt to prevent the situation from getting out of hand, the then bourgeois Senate of Finland declared the Civil Guards the official state army on the 25th of January 1918 (Haapala, 2014). However, the Helsinki Red Guards had already engulfed the previously more moderate Social Democratic Party, and its leading committee declared a seizure of power in the night between the 27th and 28th of January 1918 (Siltala, 2014).

2.2. The Finnish Civil War

According to Tikka (2014), the Finnish Civil War was predominantly a domestic dispute. The main parties to the conflict were the Finnish Red Guards and the White Army, but both sides had support from foreign forces: the Whites from Germany and Sweden, the Reds from Russia (Tikka, 2014). This ties the Civil War to its international context – even though the causes of the war were domestic in nature, the war also relates to the First World War and developments in Russia (Haapala, 2009). Tikka (2014) notes that the headcount of the two armies is not entirely clear, but both forces have been estimated to have reached almost 100,000 soldiers by the end of the war. However, the number of active forces has been estimated to be significantly smaller, with a maximum of 80,000 Red and 60,000 White soldiers (Tikka, 2014). Similar issues arise with the numbers of foreign soldiers in the armies. The estimates of Russian soldiers in the Red forces range from 1,000 to 10,000 depending on the source but, according to Haapala (2018), the number of Russians on the Red side is likely to be closer to 1,000 than 10,000. The estimates of German soldiers are more consistent: at the end of the war, around 14,500 Imperial German troops are said to have supported the White forces (Tikka, 2014).

Most of the battles of the conflict were fought during February 1918 on the frontline formed between Pori, Vilppula, Mäntyharju, Lappeenranta and Rautu, with the south of this line being predominantly Red and the north predominantly White (Hentilä, 2018). Initially, the Reds were quite successful in their efforts, but the Whites soon started gaining an advantage due to their relatively superior power and training. The Red south started to collapse completely in early April as the White forces launched an all-out attack


on the traditional stronghold of the industrial workers, the city of Tampere (Tikka, 2014). The war ended with the triumph of the Whites in May 1918 after the cities of and Vyborg were successfully seized from the Reds at the end of April (Hoppu, 2009b).

The Civil War is one of the bloodiest chapters of Finnish history. The conflict lasted only three and a half months, but the number of casualties tells a story of a conflict that spiraled out of control fast: the death toll is estimated to be around 37,000 people, which is a shockingly high number for a country with a population of just over 3 million (Tepora & Roselius, 2014). In proportion, the number of casualties is comparable to the death toll of the Spanish Civil War, and the deaths caused by the Civil War also exceed the number of fallen soldiers and civilian casualties of the Winter War by over 10,000 (Hentilä, 2018; Tepora, 2014b). When all the casualties of the war as well as other causes of death are counted, a total of 95,000 people died in 1918 – this is the largest number of deaths within one year in the whole history of Finland (Tilastokeskus, 2007). The War Victims in Finland 1914–22 project, published in 2001, is one of the biggest efforts to estimate the actual number of deaths in 1918 as well as to identify as many of the victims as possible (VNK, 2001). The project is also important in that it attempted to identify the causes of death of the victims of the war. Some of its results are reported in Table 1.

Table 1 Casualties of the Civil War. Adapted from VNK (2001)

The number of casualties was high overall, but the Reds in particular suffered heavy losses and were subjected to retaliation missions by the Whites. Out of the nearly 36,000 Finnish casualties, approximately 27,000 are considered to have been part of the Red forces – this amounts to roughly 75 % of all the Finnish casualties (Tikka, 2014). Notably, only a fourth of the overall casualties are reported to be the consequence of formal battles: instead, around 9,700 dead are reported as victims of terror and roughly 13,500 died at prison camps in the aftermath of the conflict (VNK, 2001). This reflects the nature of the war. Terror, guerilla warfare and violent retaliation operations formed a major part of the conflict (Tikka, 2014). As Tikka (2014) notes, since the soldiers often had problems distinguishing allies from foes, both armies resorted to systems of surveillance, segregation and even killing of civilians if the situation so required. Red and White terror, as well as the number of terror victims, is still a highly controversial topic. Even though both sides tended to resort to “take no prisoners” tactics during the conflict, Reds were overrepresented as victims of terror (Tepora, 2014b) – around 1,650 of the terror victims were White whereas approximately 10,000 were Red (Tikka, 2014).

Prison camps are one of the most divisive parts of the whole conflict and are often used as a symbol of the brutality of the war. According to Suodenjoki (2009), at the end of the Civil War, the number of Red prisoners in White prison camps was around 80,000. Most of these prisoners had been captured when the Red forces surrendered in May (Suodenjoki, 2009). It is impossible to tell the exact number of prisoners in the camps: there were many unreported prisoners in the Finnish countryside, and the number changes depending on whether short-term prisoners are counted or not (Suodenjoki, 2009). Almost all of the individuals who died at the camps were thought to be part of the Red forces, and the deaths at the prison camps amount to roughly 40 % of all Red casualties (VNK, 2001).

2.3. Memories of the war

“The president also said that there’s a right to disagree. Therefore, War of Liberation, Mutiny etc. are also ok [as the name of the war].” (Minister of Defense Jussi Niinistö, 2018)1

Civil wars often give rise to complex memory politics where the memories of the conflict are actively managed, suppressed and polarized in order to serve different political purposes (Kantola, 2014). As Kantola (2014) notes, divisions and partisanship between the victors and the defeated can endure for a long time and be passed on to the next generations. In the current society, the Finnish Civil War forms a major part of the public

1 Loosely translated from the original Finnish tweet by the Minister of Defense Jussi Niinistö. The tweet caused a wide discussion on whether the Minister is trying to repoliticize the Civil War terminology, as the War of Liberation and Mutiny are considered politically loaded. See, for example, Saukkomaa, H. (2018, January 1st). Puolustusministeri tarttui sisällissotatermiin – "Myös vapaussota ja kapina". Turun Sanomat, Retrieved from www.ts.fi


narrative and remembrance of the nation (Tepora & Roselius, 2014). Kinnunen (2014) argues that three distinct narratives of the conflict can be identified, and these narratives are strongly tied to the names that are used for the conflict. The narratives are usually referred to as the White, Red and reconciling interpretations of the Civil War. The different interpretations of the war become evident in the different names that have been given to the conflict, which highlight the political underpinnings of these interpretations.

The most recent narrative of the Finnish Civil War – as reflected in President Niinistö’s speech earlier in this chapter – tends to emphasize the importance of reconciliation with the past and strives to create a collective shared memory of the war (Kinnunen, 2014). According to Kinnunen (2014), this discourse is characterized by its attempt to acknowledge the events of 1918 as multifaceted and the divide between “the guilty” and “the righteous” as problematic in the context of a civil conflict. Since the 1990s, the term sisällissota (similar to the English “civil war”) has been used for the conflict, giving it a name that is seen as neutral in comparison to the older terminology (Tepora & Roselius, 2014). Often the different interpretations of the war are described as temporally distinct conversations, where the reconciliation process is seen as the most recent development. However, Tepora (2014a) notes that to some extent it can be argued that the reconciliation had already started before the Second World War. The need to unify the country during the Second World War had a great impact on the collective memories, helping to merge the stories of the two opposing sides into one national narrative (Tepora, 2014a).

Nevertheless, old divisions still often enter the public discourse as can be seen from Defense Minister Jussi Niinistö’s comment on the president’s speech cited at the beginning of this section. According to Peltonen (2003), for a long time fostering the memory of the conflict was split between two public expressions: the White and the Red sides had their own commemorative publications, celebrations and monuments. Especially in the local communities two separate views of the events persisted through the simultaneous existence of the civic guard hall and labor hall, separate sporting societies for the left- and right-wing youth, and even the left and “neutral” co-operative stores (Peltonen, 2003).

This situation was intensified by the dominance of the White interpretation of the conflict. Especially right after the war, the official history of the conflict was written by its victors, which is reflected in the terminology of the war. According to Tepora and Roselius (2014), during the interwar period the conflict was typically referred to as “the War of Liberation” (vapaussota), a term chosen to reflect the White interpretation of the conflict as a fight for freedom from Russia and the Bolsheviks. The War of Liberation interpretation was important for the White winners because the conflict was portrayed as a natural development of Finland finally gaining its independence from the Russians – even though Finland had already been recognized as a sovereign state by other countries before the war even broke out (Hentilä, 2018). Hentilä (2018) notes how the massive histories produced right after the war, other War of Liberation literature, as well as collections of memories depicting the cruelties of the Reds, were a way to hegemonize the White interpretation. These accounts usually downplayed the fact that the conflict was, essentially, a civil war, and tended to emphasize the involvement of the Russian Bolsheviks (Tepora & Roselius, 2014). Until the 1960s, the Red interpretations and memories of the Civil War remained mostly marginalized: academic history of the Civil War belittled the scope and purpose of White terror as well as the importance of class conflict as a backdrop for the war (Tepora, 2014a).

However, throughout this time, the labor movement and leftist parties rejected the War of Liberation interpretation of the Civil War. Until the end of the Second World War, “the Red truth” lived both orally and in written form in the labor movement, transmitting an opposing view of the events with descriptions of the cruelties of the Whites that were suppressed in the official account of the war (Peltonen, 2009). As a response to the official discourse, social democrats tended to refer to the war as kansalaissota (similar to the English term “civil war”, but with a political connotation in Finnish), whereas communists and radical leftists tended to refer to the events by the terms vallankumous (revolution) or luokkasota (class war) (Haapala, 2009). The communists often emphasized the image of the Red soldiers as revolutionary fighters of the working class, whereas social democrats usually focused on the suffering of the Reds in the prison camps (Kinnunen, 2014). In the 1960s, the Reds’ views of the conflict were brought into the general discussion, especially through Väinö Linna’s famous Under the North Star novel trilogy, sparking a debate that at least to a certain extent accelerated the change in academic research toward a more detached and varied approach (Tepora, 2014a).

The controversy over the name of the war might seem redundant, but in reality the chosen terminology reveals the political underpinnings associated with the war as well as the role the memory of the conflict plays in the identity politics of different social groups (Kinnunen, 2014, p. 407). In a recent survey conducted by YLE, 47 % of the respondents preferred the term sisällissota, whereas 23 % said they would use the term kansalaissota and 10 % the term vapaussota (Palmolahti, 2018). Similarly, Torsti’s (2012) study shows that these three terms tend to dominate. According to her, especially the younger generations tend to use the term sisällissota – in the age group 15-19, 60 % of the subjects preferred this term. On the other hand, in the age group 60-69 only 17 % would use this word; instead, depending on the subject’s political orientation, the terms kansalaissota or vapaussota were more popular (Torsti, 2012). Similar discussions still take place in academia as well, even though sisällissota has become more or less standard in publications – in 2008 Helsingin Sanomat published an article in which 20 professors out of 28 accepted the term sisällissota (Kinnunen, 2014).

3 ECONOMIC THEORY ON VOTING AND PARTISAN BIAS

Since the aim of this thesis is to study partisan voting behavior, the theoretical background of this study covers mainly economic literature on partisanship and expressive voting. The theoretical basis that has been largely dominant in the field of voting studies is the rational choice paradigm. According to Evans (2004), the rational choice models of voting combine theories of social action and the economic theory of rationality. In simplified terms, rational choice theory states that an individual’s decision on whether to vote and how to vote is based on a calculation of the likely benefits from the preferred option, i.e. voters base their actions on what they expect to gain materially from an election and what the potential costs are (Evans, 2004). This means that voters vote for the electoral outcome that they think will leave them best off, and therefore the decision of the voter is a “direct analogue to consumer choice in the market place” (Brennan & Hamlin, 1998). This allows the researcher to use the methods and analytic apparatus of modern economic theory in the field of politics and voting (Brennan & Buchanan, 1984).

However, these theories have received heavy criticism as they do not seem to reflect the reality of voting behavior very well – the “buyers’ remorse” that has been reported after the Brexit vote and the last U.S. presidential election cannot be explained by rational choice models (Robbett & Matthews, 2018). These criticisms have led to theories of political “cheerleading” in which voters engage in expressive rather than instrumental behavior. Expressive behavior has also been given as an explanation for the existence of the so-called partisan gap, i.e. the differences in views observed among political partisans when they are asked to answer factual questions. First, this chapter looks at how previous studies have tried to assess the existence of divergence in the views of political partisans and why this divergence might exist. Second, it presents theories of expressive voting that can provide an account of why partisan divergences might arise. Third, the chapter presents some examples of how partisan and expressive voting behavior have been theoretically formalized and how these choices can be modelled. The last part of the chapter discusses how previous experimental studies have tried to assess the existence of expressive voting behavior.

3.1. Partisanship and partisan bias

Firstly, it should be pointed out that there is still debate on the definition, nature, origin and measurement of partisanship even though it is agreed that it has a powerful influence on political behavior (Huddy & Bankert, 2017). As White and Ypi (2016) note, partisanship is generally described as collective behavior in academic literature. “Partisans unite to oppose those with whom they are at odds” and thus partisanship involves some form of coordination between individuals that share similar political views (White & Ypi, 2016, p. 9). Yet, partisanship also requires some form of “other”, something that partisans unite against. Usually, partisanship is linked with political parties, but other group affiliations and allegiances, such as , religious identities, and ethnic kinships, have also been proven to affect attitudes and behavior (Gerber, Huber, & Washington, 2010). Consequently, partisanship does not require the individual to formally be a member of a party organization (White & Ypi, 2016).

There are two accounts of what causes partisanship: instrumental and expressive. From the instrumental perspective, political partisanship is thought to be related to party performance, ideological beliefs and how well the party achieves the individual’s preferred policies (Achen, 2002). This conceptualization has its roots in the rational choice paradigm introduced earlier: political decision-making is seen to be based on the individual’s utility maximization, which considers only extrinsic, material factors (Huddy & Bankert, 2017). The problem with this model is that, in reality, changing political and economic conditions seem to have very little effect on party attachments (Green, Palmquist, & Schickler, 2002). The expressive view, in contrast, explains this stability by seeing partisanship as “an enduring identity strengthened by social affiliations to gender, religion, or ethnic and racial groups” (Huddy & Bankert, 2017, pp. 3-4). Since political partisanship shares features with, and is often connected to, other social identities such as religion and ethnicity, voter choice and partisan identification tend to be stable over time (Green et al., 2002). Therefore, the influence of short-term political events remains small, and they do not seem to affect voter choice.

Many academics argue that partisan loyalties also influence how individuals view the political world. Perhaps one of the most famous formulations of this argument comes from the authors of The American Voter, who emphasized the “role of enduring partisan commitments in shaping attitudes toward political objects” (Campbell, Converse, Miller and Stokes, 1960 cited in Bartels, 2002, p. 117). In academic literature, this phenomenon is often referred to as partisan bias. Prior research has mainly focused on partisan bias in survey responses: for example, in the American setting, studies often find that Democratic and Republican partisans tend to give different answers to factual questions (Robbett & Matthews, 2018). Interestingly, these differences seem to increase with political awareness (Bartels, 2012). In recent studies, the differences have been found to be persistent. For example, using cross-sectional and panel survey data, McCabe (2016) found that while personal experiences (e.g. a change in the individual’s insurance status) influenced how individuals viewed the Affordable Care Act (ACA), partisan identities still played a strong role in the results. In general, Democrats attributed positive changes in their insurance status to the ACA, whilst Republicans saw negative changes as a result of the new policy (McCabe, 2016, p. 880).

However, there are some limitations to studies that assess partisanship with survey data. A long-standing concern in existing research has been that the observed correlation between partisanship and politically relevant outcomes actually originates from unobserved variables that are correlated with both partisan identities and beliefs (Fiorina, 2002, p. 105). Furthermore, it is possible that causality might flow in both directions: partisanship is reflected in political attitudes and events, but also causes them. Gerber et al. (2010) identify several threats to unbiased measurement of causal effects in existing research. The most important ones include omitted variable bias due to unobserved differences between individuals and endogeneity in partisanship (Gerber et al., 2010). To combat these issues, some studies have tried to assess partisanship in an experimental setting. For example, Ramirez and Erickson (2014) conducted an experiment in which they show that even when their subjects received factual and neutral information on a specific aspect of the economy, partisanship still continued to bias the subjects’ views on the economy. Their study also showed that, to some extent, people might be aware of partisanship affecting their judgements: when the subjects received cues that challenged their views, they became more insecure than the subjects in the control condition, who did not receive the treatment (Ramirez & Erickson, 2014).

Two different explanations have been given for what might cause such bias. One explanation for the partisan gap in the answers of e.g. Democrat and Republican partisans is the view that “partisan loyalties have pervasive effects on perceptions of the political world” (Bartels, 2002, p. 138). According to Bartels (2002), partisan bias impedes a natural tendency toward convergence in political views that, without the bias, would be caused by shared political experience. The partisan bias is argued to arise because partisanship affects how individuals acquire and process information – it is seen to create a “perceptual screen” that causes individuals to seek out information that confirms their personal opinions and ignore information that contradicts these beliefs (Gerber et al., 2010).

Alternatively, partisan bias could arise from a combination of different motivations. Instead of giving their responses solely because they believe in them, individuals may use partisan responses as an opportunity to support their “team” (Bullock, Hill, Gerber, & Huber, 2015). For example, Prior, Khanna, and Sood (2015) found in an experimental study that when individuals were incentivized either by appeals to accuracy or by monetary rewards, partisan bias in people’s views about economic conditions was reduced. However, when there were no incentives, people tended to answer survey questions in the same way as they would answer opinion questions (Prior et al., 2015). The results of Prior et al. (2015) complement those of Bullock et al. (2015), who look at the differences in the answers of partisans instead of the rates at which people answer correctly. Similarly, they found that incentives reduced partisan responding, and therefore partisan biases in surveys might arise because surveys provide a low-cost opportunity for expressing partisan affinities (Bullock et al., 2015). These results imply that people actually know more about politics than results from unincentivized surveys would suggest. However, in an unincentivized situation, they choose to act in a way that confirms their partisan identity, which intensifies the partisan bias (Bullock et al., 2015).

3.2. The expressive voting theory

The expressive voting theory and, more generally, expressive behavior have been given as a rationale for why the above-mentioned partisan biases exist. In their famous article, Brennan and Buchanan (1984) draw an analogy between cheering for a favorite sports team and voting. For them, voting resembles the behavior that individuals engage in when they cheer for their favorite sports team. In large majoritarian elections, neither the act of voting nor the decision to vote in the first place can be explained as a way to achieve a particular political outcome – just as it cannot be argued that spectators would attend a game only to secure a victory for their team (Brennan & Buchanan, 1984, p. 187). The likelihood of being pivotal in a large majoritarian election is almost non-existent, which creates a paradox for the traditional rational choice framework. If instrumental voters are rational, they are predicted not to vote, as the cost of voting (e.g. time invested) exceeds the expected benefit of voting given how unlikely an individual vote is to be decisive (Hillman, 2010). Brennan (2008) argues that this reasoning is a simple example of relative price logic, where the magnitude of the relative change in the price is unusually large. When voting in order to receive an instrumental material benefit, “a dollar’s worth of instrumental benefit shrinks to one dollar times the probability of being decisive” – a change so big in large majoritarian elections that the change in the price can be described as qualitative in nature (Brennan, 2008, pp. 479-480).

This qualitative change in the price of the instrumental benefits of voting allows for the possibility that behavior in the realm of politics can be very different from behavior in markets without any departure from the logic of rational agency (Brennan, 2008). The expressive voting theory argues that voters seek “expressive utility from acts or decisions that substantiate or confirm personal identity” from voting and derive intrinsic utility from this behavior (Hillman, 2010, p. 404). Hamlin and Jennings (2008) also describe expressive voting behavior as direct and exclusive. They argue that expressive voting is direct in the sense that the individual chooses an action in order to gain direct or intrinsic benefits from the act of voting, instead of considering the general consequences of that choice. Expressive voting is also exclusive in the sense that the decision to vote is made based on a specific subset of possible benefits rather than an overall assessment of the situation (Hamlin & Jennings, 2008). However, as Hillman (2010) notes, expressive behavior is not non-rational or altruistic – in engaging in expressive behavior, individuals derive utility through acts and declarations that confirm their identity.

Hillman’s (2010) definition is built on positive confirmation of identity. Conversely, Hamlin and Jennings (2017) note that identity is usually defined in reference to the “other” group; this is why expressive behavior can also cause inter-group conflict. Therefore, expressive behavior also provides a rationale for negative campaigning. Hamlin and Jennings (2017) argue that when faced with expressive behavior, candidates and political parties have a clear incentive to appeal to groups that would provide them with a level of support that causes them to win, and through negative campaigning and attacks on character the other group can be made to seem like an unattractive “team” to identify with. Therefore, expressive voting behavior should be understood relative to an audience: in some instances this could mean a direct audience such as members of the individual’s social group, but in the case of voting, this audience can also be the individual herself, in the sense of demonstrating a certain affiliation through voting (Hamlin & Jennings, 2008). Wagner and Tyran (2016) suggest that the motivations for expressive behavior need to be investigated further in this respect. They argue that even though it might be natural to maintain that voting in a large majoritarian election is mainly about self-image concerns, expressing values and identity to others might also be a highly relevant motivation for voting.

3.3. Formalizing partisanship and expressive voting

The expressive accounts of voting share many similarities with the modern theoretical literature on voter motivations. There are two types of relevant theoretical models that should be discussed here: models of political partisanship that build on the assumption that the individual always considers herself pivotal, and models that explicitly incorporate some form of expressive or ethical benefit from voting that is caused by the change in pivotality. Moreover, these models tend to try to answer one of two separate questions: whether to vote, and how to vote. This section discusses how voting behavior, and specifically expressive voting behavior, has been theoretically modelled in academic literature.

Some models of strategic voting explicitly include political partisanship. The game theoretic models of Feddersen and Pesendorfer (1996, 1997) are some of the most influential examples of this. Feddersen and Pesendorfer (1996, 1997) show that traditional approaches in which individuals always assume their vote is pivotal do not represent an equilibrium in a corresponding voting game. In both papers, they model a two-candidate election where a part of the population consists of political partisans who always vote for their preferred candidate, whereas some fraction of voters change their choice depending on a private signal they receive. Feddersen and Pesendorfer (1997, p. 1030) in particular is heavily influenced by the theory of common value auctions: similar to how bidders condition on their bid being the highest, voters must “condition their beliefs about the quality of the alternatives on the event that one vote can change the election outcome, i.e. a vote is pivotal”. The main result of Feddersen and Pesendorfer (1997) is that the information environment of the vote can be a significant determinant of whether an election is an efficient information aggregation mechanism – therefore, the argument that societies would be collectively better at making good decisions does not always hold.

The second relevant strand of theorizing consists of the non-instrumental or partially instrumental models of voter turnout and voter choice. The models of Riker and Ordeshook (1968), Brennan and Buchanan (1984), Schuessler (2000), Feddersen and Sandroni (2006), Feddersen, Gailmard, and Sandroni (2009) and Robbett and Matthews (2018) have an expressive component in which individuals derive benefit from performing civic duty, voting for the ethically better option or affirming partisan identity. One of the earliest treatments of the choice on whether to vote, where the individual is assumed to receive some form of benefit from voting other than a material one, is the model proposed by Riker and Ordeshook (1968). In their model of the “calculus of voting”, the agent receives a “duty” payoff when she votes and thus “does their part” in the election. Their modification of the standard voting calculus was the addition of the D term, and it is written as

$$R = PB - C + D$$

where R is the reward, in utiles, that the individual voter receives from voting; P is the probability of receiving benefit B through voting; B is the benefit the individual receives from the success of the preferred candidate over the less preferred one; C is the cost of voting; and D contains elements of positive satisfaction from e.g. compliance with the ethic of voting, affirming allegiance to the political system and affirming a partisan preference (Riker & Ordeshook, 1968). In simplified terms, the model of Riker and Ordeshook (1968) shows that R can be positive and therefore there can exist a positive fraction of individuals who vote in an election if for them 1) $D > C$, or 2) $D \le C$ in an election where $PB > C - D$. This means that especially if the expressive benefit D is large enough, even in large elections the benefit of voting may be high enough for an individual to vote. Feddersen and Sandroni (2006) have reached a similar result in a strategic voting model and demonstrated that the expected turnout of an election is a function of the costs of voting, level of disagreement within the electorate, importance of the election, and agents’ ethical payoff from “doing their part”.
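The turnout logic of the calculus of voting can be illustrated with a minimal Python sketch. The function names and all numerical values are hypothetical and chosen only for illustration; they are not taken from Riker and Ordeshook (1968). The sketch assumes that the agent votes whenever R is strictly positive.

```python
def voting_reward(p, b, c, d):
    """Reward R = P*B - C + D, all terms in the same (arbitrary) utility units."""
    return p * b - c + d


def votes(p, b, c, d):
    """Assumption: the agent turns out whenever the reward from voting is positive."""
    return voting_reward(p, b, c, d) > 0


# Hypothetical large election: a tiny pivot probability makes P*B negligible.
large_election = dict(p=1e-7, b=1000.0, c=1.0)

print(votes(d=0.0, **large_election))  # no duty term: the cost dominates
print(votes(d=1.5, **large_election))  # duty term larger than the cost (D > C)
```

With these illustrative numbers the instrumental term PB is far smaller than the cost of voting, so turnout depends almost entirely on whether D exceeds C.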

Generally, the other models mentioned above primarily modify the Riker and Ordeshook (1968) model by altering the D term – instead of defining it as a benefit of voting per se, they typically attach it to a particular choice of the voter (Feddersen et al., 2009). Many of these models share similarities with the Andreoni (1990) model, where the individual receives a warm-glow benefit from acting in a particular way. However, different approaches have arisen as well. For example, instead of modelling the “warm-glow” received from “doing their part” as in the Riker and Ordeshook (1968) model, Lee and Murphy (2017) have updated the classic expressive model to account for why voters might choose to inflict harm upon others. They model a situation where the voter can harm other voters, and the model is based on a utility function over the anger and shame the voter experiences, which are defined as functions of P (the likelihood of pivotality) and the amount of harm the voter can impose.

An example of the classic expressive model in a more strategic framework is Feddersen et al. (2009). The basic setting of this model is similar to the Feddersen and Sandroni (2006) formalization of voter turnout: two types of agents, A and B, need to decide whether to vote for option A, vote for option B, or abstain in a two-candidate election. I will not discuss the implications of the choice of abstention here, as the relevant implications for expressivity can be explained through the choice between two candidates. In the Feddersen et al. (2009) model, agents of type B are further divided into active and inactive agents, where $n_\beta$ represents the fraction of active type B agents. Type A agents receive a higher monetary payoff if candidate A is selected ($1 - c$ if A is selected, 0 if B is selected); type B agents receive a higher monetary payoff if candidate B is selected ($1 + x - c$ if B is selected, $1 - c$ if A is selected and the agent votes). Note that $x$ represents the monetary premium that B types receive if candidate B is selected, and $c$ is the cost of voting. Feddersen et al. (2009) assume that there are more type A agents than type B agents and that $\frac{1}{2} > x > 2c > 0$. This ensures that candidate A can be defined as the ethical outcome, as its selection maximizes the sum of the monetary rewards to the agents and it maximizes the minimum monetary reward received (for a full discussion of the monetary rewards, see Feddersen et al., 2009, pp. 178-179).

Following Feddersen et al. (2009), the payoff profiles for voting for option A or for option B are as follows (the payoff structure for abstention is not included here):

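$$\pi_A(n_\beta, x) = \frac{1}{n_\beta}(1 + \delta) + \left(1 - \frac{1}{n_\beta}\right) \times \left(1 + \delta + q^*(x - \delta)\right) - c + d - \varepsilon_A$$

$$\pi_B(n_\beta, x) = \frac{1}{n_\beta}(1 + x) + \left(1 - \frac{1}{n_\beta}\right) \times \left(1 + \delta + q^*(x - \delta)\right) - c - \varepsilon_B$$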

The payoffs are dependent on the number of active type B agents, $n_\beta$; the cost of voting, $c$; the monetary premium, $x$, for B types conditional on candidate B winning; the probability that alternative B is selected when the decision maker’s vote is not pivotal, $q^*$; and the stochastic payoff disturbance $\varepsilon_k$ (Feddersen et al., 2009). In this model, the pivot probability is represented by the term $1/n_\beta$. An interesting difference from the Riker and Ordeshook (1968) model – besides the introduction of different voter types – is the addition of two different forms of subjective payoffs: the “ethical instrumental” payoff $\delta$ that is obtained when candidate A wins, and the “ethical expressive” payoff $d$, similar to the D term of Riker and Ordeshook (1968), obtained when voting for candidate A, the ethically better choice (Feddersen et al., 2009, p. 179). Conditional on voting, the voter will vote for A instead of B if $E(\pi_A(n_\beta, x)) \ge E(\pi_B(n_\beta, x))$. With some algebra, this reduces to:

$$d \ge \frac{x - \delta}{n_\beta}$$
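The algebra behind this condition can be spelled out as follows. This is a sketch under the assumption, not stated explicitly above, that the two payoff disturbances have the same expected value; the common non-pivotal term and the cost of voting $c$ then cancel when the two expected payoffs are compared:

$$
\begin{aligned}
E(\pi_A(n_\beta, x)) - E(\pi_B(n_\beta, x)) &= \frac{1}{n_\beta}\left[(1 + \delta) - (1 + x)\right] + d \\
&= d - \frac{x - \delta}{n_\beta},
\end{aligned}
$$

which is non-negative exactly when $d$ is at least $(x - \delta)/n_\beta$.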

This means that for an individual with neither type of expressive payoff ($d = \delta = 0$), the voting choice will always be B. Note, however, that an individual who receives even a small expressive benefit will vote for A rather than B if the pivot probability, $1/n_\beta$, becomes sufficiently small, i.e. the size of the election becomes sufficiently large and the number of active B type voters increases (Feddersen et al., 2009).
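The switching behavior implied by this condition can be illustrated with a short Python sketch. The function name and the parameter values are hypothetical and serve only as an illustration; they are not the parameters used by Feddersen et al. (2009). The sketch simply evaluates whether the expressive payoff d is at least (x − δ) divided by the number of active type B agents.

```python
def prefers_ethical_option(d, delta, x, n_beta):
    """True if the voter chooses A, i.e. if d >= (x - delta) / n_beta."""
    return d >= (x - delta) / n_beta


# Hypothetical values: premium x = 0.4, no ethical instrumental payoff.
x, delta = 0.4, 0.0

for n_beta in (1, 5, 25, 100):
    no_expressive = prefers_ethical_option(0.0, delta, x, n_beta)
    small_expressive = prefers_ethical_option(0.05, delta, x, n_beta)
    print(n_beta, no_expressive, small_expressive)
```

In this illustration a voter with d = 0 never chooses A, whereas a voter with a small expressive payoff switches from B to A once the pivot probability has fallen far enough.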

The specification entails that voters may have a tendency to vote for alternative B when the pivot probabilities are high but switch to option A when the pivot probabilities decrease, i.e. voters may change their choice toward the ethical option as the pivot probabilities decrease (Feddersen et al., 2009). This leads to an interesting implication: Feddersen et al. (2009) note that as the size of the electorate increases and the pivot probability decreases enough, the choice of which candidate to vote for becomes hypothetical, since this choice does not have a great effect on the material payoffs. The decrease in the pivot probability decreases the potential selfish benefit from x, whereas the expressive benefit d is not affected by the change in the pivot probability (see the inequality above), which directs behavior toward greater expressiveness as pivot probabilities decrease (Feddersen et al., 2009, p. 179). Feddersen et al. (2009) have empirically tested this model, and the results of this experiment are discussed in the following section.

3.3.1. Experimental evidence of expressive voting

This section discusses how expressive voting behavior has been assessed in experimental studies. There is a fairly large pool of experimental research on expressive voting, and as a simplification, these experiments can be thought to assess expressivity in three different contexts. This section discusses some examples of these three forms of expressivity that previous experiments have assessed: the experiments based on Tullock’s (1971) famous thought experiment; experiments that have tried to assess expressivity caused by ethical bias; and lastly, the Robbett and Matthews (2018) study on expressivity as partisan bias. As we will see, even though the replications of Tullock’s “Charity of the Uncharitable” have not necessarily produced the expected results, there exists experimental evidence of expressivity in the realms of ethical bias and partisan bias.

One of the most famous examples of expressivity and voting is Tullock's (1971) article “Charity of the Uncharitable”, in which he argues through a thought experiment that instead of donating a sum of money on their own, most individuals prefer to vote for a tax to raise the money. The results of replicating this thought experiment in classroom and laboratory environments have been mixed. The experiment of Carter and Guerette (1992) did not find conclusive support for the theory due to experimental design problems, but Fischer (1996), using a reiterated version of the Carter and Guerette (1992) design, managed to find evidence for expressive voting. In line with the standard prediction, Fischer (1996) found that as the chances of pivotality decreased, the subjects’ voting behavior became more expressive.

However, in later studies, these results have not been so clear-cut. For example, Kamenica and Brad (2014) found little evidence of expressiveness, contrary to the basic prediction of the model. Similarly, Bischoff and Krauskopf (2015) found in their study that even though the individual decision to donate their endowment seemed to cause a feeling of “warm-glow”, the same effect was not found when the subjects needed to decide on this collectively. In an experiment very similar to Tullock’s Charity of the Uncharitable, Tyran (2004) asked the subjects to vote on a proposal to tax everyone and to donate the tax revenues to a charity. Though his results were also mixed, one of his main findings from this experiment was the “bandwagon effect” – if many other voters were expected to accept the proposal, the subjects were more likely to vote for donations themselves (Tyran, 2004).

Evidence for ethical voting bias, on the other hand, has been reported in multiple experiments. For instance, Shayo and Harel (2012) found that so-called non-consequentialist concerns, such as moral motivations, were a driver for at least 12.5 % of their subjects in an experimental election. Similarly, Feddersen et al. (2009) found that their experimental results supported a bias toward more unselfish outcomes as the size of the election increased: as the probabilities of being pivotal decreased, the collective choices departed from individual material preferences toward moral considerations. As predicted by their model introduced in the previous section, Feddersen et al. (2009) found that as the pivot probabilities increased, the propensity to vote for option B (“selfish choice”) increased and the propensity to vote for option A (“ethical choice”) decreased.

However, experimental evidence on expressive voting as a means to confirm an individual’s partisan identity is limited, and the study of Robbett and Matthews (2018) is the first to report this. The experimental design of Robbett and Matthews (2018) has much in common with the Bullock et al. (2015) and Prior et al. (2015) studies introduced earlier in its use of factual questions, but unlike in those studies, the respondents in Robbett and Matthews’s design are always incentivized. Another difference is that the study is designed to assess whether individuals give more biased answers when they think they are voting for the answer than when they are answering the question themselves (Robbett & Matthews, 2018). In their experiment, Robbett and Matthews (2018) asked Democratic and Republican partisans to answer seven multiple-choice questions that had objective answers but could still challenge their partisan views. Their respondents were randomly assigned either to answer the questions as decisive individuals or to vote for the correct answer in a group of 5 or 25. Regarding partisan bias, the main findings of their experiment were that when individuals are voting for the right answer, there is a significant increase in the partisan gap and a significant decrease in the likelihood of giving a right answer when the question challenges partisan views (Robbett & Matthews, 2018).

4 EXPERIMENTAL DESIGN AND EMPIRICAL STRATEGY

This chapter presents the experiment protocol of the randomized controlled trial conducted as a part of the study. Over recent years, randomized controlled trials have become a standard tool in the economist’s toolkit. Digitalization has significantly reduced the cost of running experiments, and therefore their use might even increase in the years to come (Athey & Imbens, 2017). Randomized controlled trials have one advantage over many other strategies that economists employ – namely, they are a powerful tool for causal inference. Since randomized controlled trials, by definition, use random assignment to the treatment and control groups and therefore make the assignment to treatment independent of potential outcomes, they solve the self-selection problem that is a common threat to observational studies (Angrist & Pischke, 2009).

This thesis is based on an experiment conducted in late December 2018 and early January 2019. The next sections introduce my experimental design, the main predictions of the experiment and other factors that have affected the experiment design. This chapter also explores the preregistered statistical models that were used to analyze the obtained data. The preregistration of the empirical strategies was done on January 9, 2019, at OSF.io, and the preregistration will be made public once the thesis is public. Last, this chapter discusses how the sample was collected and describes the power and sample size calculations that were conducted to predetermine the required sample size.

4.1. Experimental design

The replicability of the experimental methodology is one of its main advantages as it allows researchers to “test whether the results can be independently verified” and this “not only serves to generate a deeper collection of comparable data but also provides incentives for the experimenter to collect data carefully” (Levitt & List, 2009). Levitt and List (2009) have defined three levels at which replication studies operate: they can either use existing data from the original experiment to reanalyze it and confirm the results; run an experiment that follows the same protocol as the original; or test the hypotheses of the original study using a new research design. Since the experimental design of this study follows closely the experiment of Robbett and Matthews (2018) and it tests two of the main predictions of the original study, this study can be considered to fall into the second replication category of Levitt and List (2009).

The aim of this experiment is to assess whether voting for an answer to a factual question, instead of answering as a decisive individual, affects the individual’s propensity to respond expressively. The dependent variable of interest is the size of the partisan gap, which is defined as the difference between the answers of the self-identified Red and White respondents. The treatment variable is the group size: each participant is randomly assigned to a group of 1 (control) or 5 (treatment), which varies how pivotal the answer of an individual is. The importance of pivotality was discussed in chapter 3.2, and the decisions of an individual participating in the study are formalized in section 4.2. The main background variable of interest is the individual’s self-identified status as White or Red: the respondents are asked to estimate what percentage of their relatives were on the White or Red side during the Civil War. If the respondent reports that more than 50 % of their relatives were Red, they are coded as Red partisans, and vice versa. Individuals who do not self-identify as Civil War descendants or who report a 50-50 family background are not included in the sample.
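
The coding rule and the random assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the fixed seed and the example shares are hypothetical, and the sketch assumes that each respondent's answer has already been reduced to a single percentage of Red relatives, which may differ from how the survey tool stored the responses.

```python
import random

def code_partisan(red_share_percent):
    """Code a respondent as 'red', 'white' or None (excluded from the sample).

    red_share_percent is the reported share of Red relatives (0-100),
    or None when the respondent reports no Civil War family background.
    """
    if red_share_percent is None:
        return None                 # not a self-identified descendant
    if red_share_percent > 50:
        return "red"
    if red_share_percent < 50:
        return "white"
    return None                     # a 50-50 background is dropped

def assign_group_size(rng):
    """Randomly assign control (group of 1) or treatment (group of 5)."""
    return rng.choice([1, 5])

rng = random.Random(2019)           # fixed seed only to make the sketch reproducible
reported_shares = [70, 50, 20, None, 55]   # hypothetical answers
coded = [(code_partisan(s), assign_group_size(rng)) for s in reported_shares]
sample = [(side, size) for side, size in coded if side is not None]
```

Because the group size is drawn independently of the reported family background, the assignment to treatment is unrelated to partisanship by construction.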

Table 2 Experiment questions

The experiment was conducted online with a survey that randomized the instructions given to the respondents: the individuals in the treatment group were incentivized to vote for the correct answers, whereas the control group received incentives to answer what they thought to be the correct answer (for the survey template, see Appendix 1). The questions of the experiment were factual questions based on the historical events of the Civil War relating to, for example, the numbers of deaths in the prison camps, as well as factual questions related to modern-day politics. The political questions were chosen from topics that have been divisive on the left-right axis in Finnish politics over recent years, such as education, inequality and the provision of healthcare services. There is a correct, factual answer to each question, but also a divisive, partisan feature. For instance, regarding the question “Casualties” in Table 2, a supporter of the White forces might be more likely to underestimate the number of deaths of Red prisoners, whereas a Red supporter might overestimate them. As an additional control, a politically neutral question was also added to the question battery. The questions used in the experiment are listed in Table 2. The time to answer each question was restricted to 30 seconds in order to limit the opportunity for individuals to look up the correct answer.

In order to control for variation that could be caused by something other than the treatment, some additional background questions were added in order to obtain control variables. These include questions regarding individual characteristics such as age and gender, as well as how informed the individual is about the Civil War, how much they follow the news, and whether they voted in the previous parliamentary elections in 2015. Apart from age, these variables are coded as indicators taking value 1 if the respondent is female, if they voted in the 2015 parliamentary elections, if they follow the news (values “a lot” and “quite a lot” of the relevant question), or if the individual reports being informed about the events of the Civil War (values 4 and 5 of the relevant question).2

In experimental economics, standard procedures require that the subjects of the experiments are provided adequate incentives for them to act in a similar fashion as they would in a natural setting. Experiments conducted online have obvious advantages over laboratory experiments such as low cost, large subject pools and a more natural setting than a usual lab environment (Vinogradov & Shadrina, 2013). However, providing sufficient incentives can be tricky in online experiments: distribution of cash can be difficult or expensive in large-scale experiments, and furthermore, to preserve anonymity, the researcher should not pay the subjects directly (Duersch, Oechssler, & Schipper, 2009).

2 The results of the relevant statistical tests using alternative dummy codings are reported in Appendix 4. The alternative dummy for the variable “Follow News” took value 1 if the individual answered “a lot” to the relevant question, and for “Know History” the alternative dummy took value 1 if the answer to the relevant question was 3, 4 or 5. The alternative dummy codings did not have an effect on the main results of the experiment.

In order to incentivize the respondents, they were able to win small monetary rewards. The reward was based on how many questions the group answered correctly. When the group size is 1 (control), the respondents should understand that their answer is decisive and not a matter of voting. The respondents in this group were rewarded for each question they answered correctly, whereas when the group size is 5 (treatment), the participants were rewarded if the majority of the group voted for the correct answer. Similar to a real election, the individuals in the treatment group could not communicate with the other individuals in their voting group nor observe the actions of the other voters. This meant that the individuals in the treatment group were rewarded if at least 3 out of 5 members of the group answered the question correctly, based on the individual answers given. Considering the amount of time it took to answer the experiment questions (around 10 minutes), the amount of money rewarded for each correct question did not need to be high in order to ensure sharp incentives. The flat fee for participating in the study was 2.50 € and the respondents had the possibility to gain an additional 0.50 € for each correct answer.
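
The reward rule can be written down compactly, since a group of one is simply the degenerate case of majority rule. The following Python sketch is illustrative only: the function names are hypothetical, the fee and bonus are the amounts stated above, and the way the actual survey platform computed the payments is not described here.

```python
FLAT_FEE = 2.50             # euros, participation fee stated in the text
BONUS_PER_QUESTION = 0.50   # euros per rewarded question, stated in the text

def question_rewarded(group_correct):
    """Return True if a question is rewarded for the members of a group.

    group_correct holds one boolean per group member (True = correct answer).
    With one member the respondent's own answer decides; with five members
    at least three correct answers are needed.
    """
    return sum(group_correct) > len(group_correct) / 2

def total_payment(rewarded_flags):
    """Flat fee plus the bonus for every rewarded question."""
    return FLAT_FEE + BONUS_PER_QUESTION * sum(rewarded_flags)
```

For example, `question_rewarded([False, True, True, True, False])` is true for every member of that group of five, including the two who answered incorrectly, which is exactly what weakens the link between an individual vote and the individual payoff.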

Even though the experimental design is, by and large, the same as in Robbett and Matthews (2018), some aspects of the design were modified or left out to suit the purposes of this thesis. For instance, their second treatment variable, availability of information, was left out. Additionally, the voting group size was kept at 5 individuals instead of varying the size of the groups as in the original paper. I made this decision since the partisan gap in Robbett and Matthews’ results did not differ significantly between groups of 5 and 25 individuals (Robbett & Matthews, 2018). Also, there is previous evidence that population size does not have much of an effect on either voter choice or the decision to acquire more information on the question at hand, which would suggest that individuals respond to the act of voting itself rather than to the likelihood of being pivotal (Elbittar, Gomberg, Martinelli, & Palfrey, 2019). For these reasons, I decided to use only one treatment condition in order to maximize statistical power.

4.2. Predictions

The formalized model of participant choice in the experiment has its roots, essentially, in the Riker and Ordeshook (1968) model introduced in section 3.3. Following Robbett and Matthews (2018), voter choice in the experiment can be formalized as

$$EU(r) = P \cdot B \cdot b(r) + c \cdot e(r)$$

where the expected utility $EU(r)$ from submitting a response $r$ is determined by $P$, the likelihood of being pivotal; $B$, the financial benefit from the correct response receiving the majority of votes; and $b(r)$, the respondent’s belief that $r$ is correct. The utility of an answer is further determined by $c$, which captures the individual’s personal taste for giving a partisan response, and $e(r)$, which represents the extent to which the submitted response $r$ is in line with the respondent’s political tastes (Robbett & Matthews, 2018). As Robbett and Matthews (2018) note, the first term in the equation represents the instrumental benefit that the voter can receive from the choice, and the second term the expressive benefit of the response. Therefore, when the financial benefit, the likelihood of being pivotal or the individual’s belief in their answer increases, the instrumental considerations of the voting choice become more important, whereas when these factors decrease, the expressive considerations strengthen. In this model, the usual cost term present in the Riker and Ordeshook (1968) model has been omitted, as it is assumed that voting in the experiment is costless and mandatory for the participants.
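
A small numerical illustration may help to see how the same respondent can switch from an accurate to a partisan answer when pivotality falls. The Python sketch below is not part of the original model: all parameter values and the two response labels are hypothetical, and the pivotality of 0.375 is only an illustrative figure (the chance that four other voters split 2-2 if each is correct with probability 0.5).

```python
def expected_utility(P, B, belief, c, expressiveness):
    """EU(r) = P * B * b(r) + c * e(r)."""
    return P * B * belief + c * expressiveness

def best_response(P, B, c, options):
    """Pick the response with the highest expected utility.

    options maps a response label to the pair (b(r), e(r)).
    """
    return max(
        options,
        key=lambda r: expected_utility(P, B, options[r][0], c, options[r][1]),
    )

# Hypothetical respondent: believes the accurate answer is more likely to be
# correct, but only the partisan answer carries expressive value.
options = {"accurate": (0.6, 0.0), "partisan": (0.3, 1.0)}

decisive = best_response(P=1.0, B=0.50, c=0.1, options=options)   # group of 1
voter = best_response(P=0.375, B=0.50, c=0.1, options=options)    # group of 5
```

With these assumed values the decisive individual chooses the accurate response, while the voter chooses the partisan one, because the instrumental term shrinks with $P$ and the expressive term does not.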

Based on the expressive voting theory, some predictions can be made in the context of this study. These predictions are based on Robbett and Matthews (2018, p. 112) predictions 1A and 1B:

A: the difference between the Red and White answers (partisan gap) is larger in the treatment condition (voters) than in the control condition (decisive individuals).

B: the likelihood of a correct response is greater for decisive individuals (control) than for voters (treatment).

Based on the results of Robbett and Matthews (2018) and academic literature on the Civil War, it is reasonable to predict that A would hold and that partisans engage in significantly more “cheerleading” in voting groups than as decisive individuals. The predictions entail that I) the partisan gap should be larger under the “voting” (treatment) condition than under the “individual” (control) condition and II) the likelihood of giving a correct answer should be greater when respondents act as decisive individuals than when they are voting for the answer.

Interestingly, my experiment design also allows me to study whether the Red/White family background influences not only the individual’s views on the Civil War but also their views on contemporary political issues. As discussed in chapter 2, the Finnish Civil War was related to class divisions between the political left and the political right in Finnish society, and the divisions were carried forward after the war. In particular, the factors discussed in chapter 2 that were analyzed by Peltonen (2003) could indicate that, to some extent, the individual’s family history influences their position on the left-right axis. If this is the case, the partisan gaps for both sets of questions should be similar.

There is also a possibility that no partisan gap exists under the treatment or control conditions. This would mean there is no observable partisan division regarding questions of the Civil War or contemporary political issues among the self-identified descendants of the Civil War families. This would indicate that the respondents have adopted a narrative of reconciliation instead of the Red or White interpretations of the events of the war (see section 2.1). It should be noted that there might be a difference between generations in this regard: it could be that younger generations do not have such a strong connection to the Civil War and have learned about the war mostly through history classes. Consequently, they might not engage in expressive voting as much as older generations. YLE’s recent poll is indicative of this: only 5 % of the age group 18-24 state that their views on the Civil War have been strongly affected by their family’s experiences, whereas in the age group 65-79 this number is 26 % (Palmolahti, 2018).

4.3. Statistical models

This section of my thesis discusses the statistical models and other relevant parts that were preregistered on OSF.io on January 9, 2019. The goal of preregistration is to lessen the use of bad statistical practices, such as performing multiple tests on data and only reporting test results that are statistically significant (Fricker Jr., 2016). With preregistration, the researcher commits to certain analytic steps without prior knowledge of the research outcomes, and therefore separates the exploratory (or hypothesis-generating) work from confirmatory (hypothesis-testing) research (Nosek, Ebersole, DeHaven, & Mellor, 2018). The models discussed here form the confirmatory analysis of this thesis and are aimed at testing the predictions introduced in the previous section. Two pilots were conducted prior to data collection, and the pilot data was not used in the final analysis. Data collection was completed on January 7, 2019, but the data file was not accessed prior to preregistration.

The first consideration for the study is to see to what extent descendants of Reds and Whites give different answers when voting than when answering on their own, as opposed to merely incorrect answers (prediction A). To do that, this study follows the Bullock et al. (2015) and Robbett and Matthews (2018) treatment of multiple choice questions. In order to measure the partisan gap, both of these studies rescaled the responses to each question to range linearly from 0 to 1. In my experiment, each experiment question has 5 answer options: the answer most favorable to Whites is coded 0 and the answer most favorable to Reds is coded 1. The intermediate answers are coded 0.25, 0.5 and 0.75, respectively. Then, following Robbett and Matthews (2018), the partisan gap can be defined as the difference in the rescaled responses provided by the Reds and the Whites. A good way to provide initial evidence on the existence of a partisan gap is to plot the scaled Red-White difference for individuals and voters, averaged across all the partisan questions, similar to the figure provided in Robbett and Matthews (2018). Due to the normalization, no difference would indicate that there is no partisan bias, whereas a positive difference shows that there is a bias, where individuals answer the questions in a way that coincides with their background.
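
The rescaling and the resulting partisan gap can be illustrated with a short Python sketch. The function names and the example answers are hypothetical; the sketch assumes that each answer option has already been ranked from 0 (most favorable to Whites) to 4 (most favorable to Reds).

```python
from statistics import mean

def rescale(option_rank, n_options=5):
    """Map an answer rank (0 = most favorable to Whites,
    n_options - 1 = most favorable to Reds) linearly onto [0, 1]."""
    return option_rank / (n_options - 1)

def partisan_gap(red_scores, white_scores):
    """Mean rescaled response of Reds minus that of Whites."""
    return mean(red_scores) - mean(white_scores)

# Hypothetical answers to one question
red = [rescale(k) for k in (4, 3, 2)]
white = [rescale(k) for k in (1, 2, 0)]
gap = partisan_gap(red, white)
```

A gap of zero corresponds to no partisan bias, and a positive gap means that each side leans towards the answers favorable to its own background.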

Since this study is a randomized controlled trial, where the random assignment to the treatment and control group solves the self-selection problem by making the assignment to treatment independent of potential outcomes, the effect of the treatment can be captured with a linear regression model (Angrist & Pischke, 2009). According to Angrist and Pischke (2009), since randomization eliminates selection bias, the difference across the treatment groups can be thought to capture the average causal effect of the group size the individual is assigned to. The estimation of this treatment effect can be done with regression models, but as Athey and Imbens (2017) note, these models were not initially developed to analyze data from randomized experiments, and the authors therefore suggest caution when using them. Specifically, in order to preserve many of the finite sample properties that simple comparisons of means possess and to keep the interpretations of the regression estimates clear, Athey and Imbens (2017) recommend using only indicator variables rather than multivalued variables as covariates in the regression function. Therefore, most of the control variables used are coded as indicators, as discussed earlier in this chapter.

The first part of the analysis tries to establish whether self-identified Red and White descendants give expressive answers when voting for the correct answer and whether the resultant partisan gaps in the voting condition are larger than when participants are answering on their own. The below regression models are based on Robbett and Matthews (2018, p.113) Table 2 models 1 and 2:

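(1) $R_{i,q} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{voting} \times \mathit{red} + \beta_3\,\mathit{red} + X_i\gamma + \mathit{red} \times X_i\mu + \varepsilon_{i,q}$

(1.1) $R_{i,q}^{\text{civil war}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{voting} \times \mathit{red} + \beta_3\,\mathit{red} + X_i\gamma + \mathit{red} \times X_i\mu + \varepsilon_{i,q}$

(1.2) $R_{i,q}^{\text{contemp}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{voting} \times \mathit{red} + \beta_3\,\mathit{red} + X_i\gamma + \mathit{red} \times X_i\mu + \varepsilon_{i,q}$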

In these models $R_{i,q}$ is the response of person $i$ to question $q$, and the responses are coded on the interval [0,1] following the Bullock et al. (2015) and Robbett and Matthews (2018) treatment of multiple choice questions (1 representing the “reddest” answer). $\mathit{voting}$ is an indicator variable taking value 1 if the individual was assigned to the treatment group and $\mathit{red}$ is another indicator variable taking value 1 if the individual self-reported Red family ties (>50 % of the perceived relatives were Red). Individuals who report 50-50 Red/White family backgrounds are dropped. $X_i$ includes individual characteristics such as gender, age, if the individual was a voter in the previous 2015 parliamentary elections, if the individual follows the news (values 3 and 4 in the relevant question), and if the individual is informed about the Civil War (values 4 and 5 in the relevant question). These controls are uncorrelated with the treatment, and therefore do not affect the estimate of $\beta_1$ or $\beta_2$ (Angrist & Pischke, 2009).

Athey and Imbens (2017) note that random assignment to treatment and control does not imply that the error term is independent of the treatment indicator. Instead, it is likely that there is heteroscedasticity and therefore, in this case, it is necessary to use the Eicker-Huber-White robust standard errors in order to obtain valid confidence intervals (Athey & Imbens, 2017). Moreover, since it is reasonable to assume that the background of an individual is going to affect her responses, it is likely that the responses are going to be correlated. Therefore, as Angrist and Pischke (2009) note, it is reasonable to assume that the error terms are correlated on the level of the individual and thus the error terms need to be clustered at this level. The specifications 1.1 and 1.2 run the same regression only to a sub-group of questions: the ones related to civil war, or the ones related to modern political issues.
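
To make the interpretation of the coefficients concrete, the following Python sketch computes them for a simplified version of specification (1) that omits the covariates $X_i$ and their interactions with $\mathit{red}$. In that saturated case the OLS coefficients are just differences of the four cell means, so no regression routine is needed. The function name and the data are hypothetical, and the sketch does not compute standard errors; the clustered, heteroscedasticity-robust standard errors discussed above would require a regression package.

```python
from statistics import mean

def saturated_coefficients(rows):
    """Coefficients of R = b0 + b1*voting + b2*voting*red + b3*red.

    rows is a list of (response, voting, red) tuples with voting and red
    in {0, 1}. Without covariates the OLS estimates equal cell-mean
    differences.
    """
    cell = {
        (v, r): mean([y for y, vv, rr in rows if (vv, rr) == (v, r)])
        for v in (0, 1)
        for r in (0, 1)
    }
    b0 = cell[(0, 0)]                        # Whites, decisive individuals
    b1 = cell[(1, 0)] - cell[(0, 0)]         # effect of voting on Whites
    b3 = cell[(0, 1)] - cell[(0, 0)]         # partisan gap in the control group
    b2 = (cell[(1, 1)] - cell[(0, 1)]) - b1  # change in the gap due to voting
    return b0, b1, b2, b3

# Hypothetical rescaled responses: (response, voting, red)
rows = [
    (0.50, 0, 0), (0.50, 0, 0),
    (0.50, 0, 1), (0.75, 0, 1),
    (0.25, 1, 0), (0.50, 1, 0),
    (0.75, 1, 1), (1.00, 1, 1),
]
b0, b1, b2, b3 = saturated_coefficients(rows)
gap_control = b3
gap_treatment = b3 + b2
```

In this reading, prediction A corresponds to a positive $\beta_2$: the partisan gap among voters, $\beta_3 + \beta_2$, exceeds the gap among decisive individuals, $\beta_3$.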

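Following Robbett and Matthews (2018, p. 113), the following regressions are also run to assess prediction A:

(2) $\bar{R}_i = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{voting} \times \mathit{red} + \beta_3\,\mathit{red} + X_i\gamma + \mathit{red} \times X_i\mu + \varepsilon_i$

(2.1) $\bar{R}_{i,\text{civil war}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{voting} \times \mathit{red} + \beta_3\,\mathit{red} + X_i\gamma + \mathit{red} \times X_i\mu + \varepsilon_i$

(2.2) $\bar{R}_{i,\text{contemporary}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{voting} \times \mathit{red} + \beta_3\,\mathit{red} + X_i\gamma + \mathit{red} \times X_i\mu + \varepsilon_i$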

These specifications are very much the same as the previous ones, with the difference that $\bar{R}_i$ is the scaled response averaged for each individual. Again, the same analysis is performed for the whole question set as well as the two subgroups of questions. The term $X_i\gamma$ includes individual characteristics such as gender, age, place of birth, political affiliation, how much the individual follows the news, and how informed the individual is about the Civil War.

The second consideration is to see whether individuals are more likely to answer incorrectly when they are voting on a question than when they are answering for themselves, and whether they are more likely to answer the neutral questions correctly than the ones that conflict with or confirm their personal views (prediction B). The linear probability models below are based on Robbett and Matthews (2018, p. 115) Table 3 models 1-4:

(3) $\mathit{Correct}_{i,q} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{red} + X_i\gamma + \varepsilon_{i,q}$

(3.1) $\mathit{Correct}_{i,q,\text{neutral}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{red} + X_i\gamma + \varepsilon_{i,q}$

(3.2) $\mathit{Correct}_{i,q,\text{confirming}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{red} + X_i\gamma + \varepsilon_{i,q}$

(3.3) $\mathit{Correct}_{i,q,\text{conflicting}} = \beta_0 + \beta_1\,\mathit{voting} + \beta_2\,\mathit{red} + X_i\gamma + \varepsilon_{i,q}$

$\mathit{Correct}$ is an indicator taking value 1 when the individual’s answer is correct. $\mathit{voting}$ is an indicator variable taking value 1 if the individual was assigned to the treatment group, $\mathit{red}$ is another indicator variable taking value 1 if the individual reports having Red family ties, and $X_i$ is the set of individual characteristics discussed earlier. For the reasons stated earlier, standard errors need to be clustered at the individual level.

In addition to these statistical tests, to check for the robustness of the results, the findings are tested using Bonferroni adjusted statistical significance. Furthermore, to see how a single successful replication affects the likelihood of the findings being true associations in the absence of researcher bias, the study also reports post-study probabilities following Maniadis, Tufano, and List (2017). These robustness checks are discussed further in chapter 5.4.
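
As a rough illustration of the two robustness checks, the Python sketch below computes a Bonferroni-adjusted significance threshold and a baseline post-study probability. It assumes the standard single-study formula without researcher bias, PSP = (1 − β)π / ((1 − β)π + α(1 − π)), where π is the prior probability that the association is true; the update for a successful replication that is reported in chapter 5.4 is not reproduced here. The function names, the number of tests and the prior are hypothetical.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests

def post_study_probability(prior, power, alpha):
    """Baseline PSP without researcher bias:
    true positives divided by all positive findings."""
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)

threshold = bonferroni_threshold(alpha=0.05, n_tests=5)            # hypothetical: 5 tests
psp = post_study_probability(prior=0.10, power=0.80, alpha=0.05)   # hypothetical prior
```

With these assumed inputs the per-test threshold is 0.01 and the post-study probability is about 0.64, which shows how strongly the result depends on the assumed prior.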

4.4. Sample

After some consideration of the pilot results, and for practical reasons, I decided that the sample of partisans should be collected from the students of the University of Tampere and the University of Turku. Neither of these subject pools was contacted during the piloting of the study. The participants from the University of Tampere were recruited through the university's e-mailing lists, and those from the University of Turku were recruited through an ORSEE-operated (Greiner, 2015) participant pool of the PCRClab. The data collection was completed on January 7, 2019.

Since the aim of the study was to contact individuals that had family members that had been part of the Civil War, the question on how to obtain a suitable sample for the study poses a problem as the population in question is somewhat hidden. According to Petersen and Valdez (2005), hidden populations are subsets of a population whose membership is hard to distinguish only based on existing knowledge and using conventional sampling techniques might not be feasible or applicable for producing data. These groups might be hard to identify because of low social visibility due to e.g. stigmatized or illegal behavior (Petersen & Valdez, 2005).

To solve this, the experiment had to rely on self-identification, as there exists no database of Civil War descendants. The introductory message of the experiment did not disclose that the research was specifically looking for Civil War partisans, and suitability for the study was determined when the individual started answering the questionnaire. This means individuals without a family history in the Civil War are not part of the sample. The procedure also prevented these individuals from gaming the survey. However, due to the sensitive and traumatic nature of the Civil War, some individuals might not be willing to share their family histories openly, or they might not be aware of them. Therefore, it is likely that the sample obtained represents a very specific subset of descendants of the Civil War partisans who, firstly, are aware of their family histories and, secondly, feel so strongly about them that they are willing to share them openly. Moreover, the subjects of the experiment are students, and therefore the sample cannot be considered representative of all individuals with Civil War related family backgrounds.

4.4.1. Determining sample size: power and sample size analysis

Determining the required minimum sample size is an important step in the planning of an experimental study. When determining the sample size, the researcher may run into several issues: on the one hand, it is important to have a large enough sample so that the study is able to detect the effect of interest, but on the other hand, oversampling might also be problematic. Oversampling uses unnecessary resources and time, and depending on the study, it might also be ethically problematic to subject too many individuals to a certain treatment. However, with an underpowered study, the problems are even more significant: if the study is underpowered, the researcher cannot observe whether the effect of interest exists or not, and therefore underpowered studies are a waste of society’s resources as well as the participants’ time and effort (Bausell & Li, 2012). To solve this issue, the objective of power and sample-size analysis (PSS) is to help the researcher design a study such that the chosen statistical method has high enough power to identify the effect of interest in case the effect exists (Dattalo, 2008).

The main elements that are needed for PSS analysis are the study design, the statistical method, the significance level $\alpha$, the power $1 - \beta$, the effect size and the sample size $N$ (see for example Kraemer & Blasey, 2016). As defined in the previous section, the relevant statistical test is multiple linear regression. The significance level $\alpha$ describes the upper bound for type I error, i.e. the probability with which the researcher incorrectly rejects the null hypothesis when in reality the null hypothesis holds, or $\Pr(\text{reject } H_0 \mid H_0 \text{ is true})$ (Casella & Berger, 2002). The convention is to set the significance level to a small probability, such as 5 %, to protect the null hypothesis (Cohen, 1992). Power, on the other hand, is related to type II error, i.e. accepting the null hypothesis when the alternative hypothesis is true (Hogg & Tanis, 2010). Power is therefore defined as the probability of correctly rejecting the null hypothesis when the null hypothesis is false, i.e. $\Pr(\text{reject } H_0 \mid H_0 \text{ is false})$. In the social sciences, the common requirement for power is at least 80 %. For a summary of the two types of errors, see Table 3.

Table 3 Two types of errors. Adapted from Casella and Berger (2002)

The third element of power analysis is effect size. In this context, the effect size is a measure of the size of the mean differences among the study groups (Bausell & Li, 2012). Expected effect sizes can be estimated using previous research, but the chosen test type affects which method of effect size estimation can be used (Cohen, 1992). In this case, the relevant effect size statistic is Cohen’s $f^2$, calculated from the Robbett and Matthews (2018) study, as the statistical method used is regression analysis. Cohen’s $f^2$ (Cohen, 1992) is defined as

$$f^2 = \frac{R^2}{1 - R^2}$$

where $R^2$ denotes the squared multiple correlation. However, since the measure reported in Robbett and Matthews (2018) is the adjusted $R^2$, according to Draper and Smith (1998) the correct measure can be obtained from

$$R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p}$$

$$R^2 = 1 - \frac{(1 - R^2_{adj})(n - p)}{n - 1}$$

Plugging in the values $R^2_{adj} = 0.138$, $n = 199$, and $p = 12$ (total number of parameters, including $\beta_0$) from Table 2 of Robbett and Matthews (2018, p. 113) gives $R^2 = 0.186$. Therefore, the correct value for the effect size is $f^2 = 0.2285$.
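
The arithmetic behind these two values can be checked with a few lines of Python. The function names are hypothetical; the inputs are the values quoted above from Robbett and Matthews (2018). The sketch also shows that the reported effect size follows from rounding $R^2$ to three decimals before computing $f^2$; without the intermediate rounding the value is about 0.2283, a difference that is immaterial for the sample size calculation.

```python
def r2_from_adjusted(r2_adj, n, p):
    """Recover R^2 from adjusted R^2 with n observations and p parameters
    (including the intercept): R^2 = 1 - (1 - R^2_adj)(n - p)/(n - 1)."""
    return 1 - (1 - r2_adj) * (n - p) / (n - 1)

def cohens_f2(r2):
    """Cohen's f^2 = R^2 / (1 - R^2)."""
    return r2 / (1 - r2)

r2 = r2_from_adjusted(0.138, n=199, p=12)
f2_rounded = cohens_f2(round(r2, 3))   # R^2 rounded to 0.186 first
f2_unrounded = cohens_f2(r2)           # no intermediate rounding
```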

The sample size calculation was done for a linear multiple regression with $\alpha = 0.05$, required level of power $1 - \beta = 0.80$, effect size $f^2 = 0.2285$, and number of predictors = 18. The power calculation was done using the G*Power 3.1 program, which is a power analysis software created for the social, behavioral and biomedical sciences (Faul, Erdfelder, Buchner, & Lang, 2009). According to Faul et al. (2009), the G*Power software can be used for power calculations for commonly used statistical tests, and it has five types of power analyses: a priori, compromise, criterion, post hoc and sensitivity analysis. In this case, the a priori analysis of G*Power was used to calculate the required sample size for a two-tailed test. The expressive voting theory assumes that the differences in means should be larger in the group that votes for the answers, which would indicate that a one-sided t-test is sufficient, but to be on the safe side, the calculation was done with a two-tailed test. For the two-tailed test, the required total sample size would be only 39. If the level of power is increased to 0.9, $n$ increases to 50. Figure 1, created with the G*Power program, presents how, when holding $\alpha$ constant, an increase in the sample size increases the level of power that the study achieves.

It should also be noted that the sample size calculation serves as the minimum requirement for the sample size, as it is possible that the researcher loses subjects during the study for multiple reasons. Since PSS is generally used to determine the lower limit for sample size, I decided that the experiment would need at least approximately 120 partisan responses. The cut-off point of the data collection was set to 200 responses for budgetary reasons. The margin above the calculated minimum is meant to make sure that, in case there is any attrition or other problems with the obtained data, the achieved level of power would still be high enough.

Figure 1 Power and sample size

[Figure: G*Power plot of total sample size (vertical axis, 0 to 60) against power (1-β err prob; horizontal axis, 0.6 to 0.95). t tests - Linear multiple regression: Fixed model, single regression coefficient; Tail(s) = Two, Number of predictors = 18, α err prob = 0.05, Effect size f² = 0.2285.]

4.5. Pilot experiments

Pilot experiments are usually the only way to test if the planned experimental design is actually feasible: they are a useful way to see whether there are unclear instructions, missing or leaking information, or too little time for certain activities (Friedman & Shyam, 1994). Therefore, in order to test whether or not the future study would be feasible, two pilot studies were conducted at the end of July and October 2018. The aim of the pilot experiments was to assess if there were clear flaws within the experiment design overall in terms of, for example, the instructions or the formulation of the questions. The pilots were also an opportunity to scout out questions that do not work for the experimental design: both sides need to be able to answer the questions expressively, so it is problematic if both sides tend to answer a question in a similar fashion.

The pilots were fairly small in size; the first having 36 participants and the second 27 participants. Both studies were unincentivized and only had the control condition, i.e. all of the respondents answered the pilots as decisive individuals. Therefore, the results obtained can be only thought to indicate a possible lower bound of the expressiveness. Appendix 3 reports the results of the two pilots. As the sample sizes on both pilots were fairly low, no formal statistical tests were conducted, and the Appendix rather reports descriptive statistics of the pilot experiments.

The pilots led to some modifications to the experimental design and, in particular, determined which questions were selected for the final experiment. Some questions had to be dropped from the final experiment because the pilot subjects did not seem to answer them very differently from each other, because the questions were perhaps not understood correctly, or because they were too easy. All the questions tested during the piloting phase of the study are reported in Table 12 in Appendix 3.

5 RESULTS

This chapter introduces the results of the experiment and determines whether the predictions introduced in section 4.2 hold. The statistical analyses performed in this chapter follow the empirical strategy that was specified in section 4.3 and preregistered on OSF.io. The first section of this chapter introduces some descriptive statistics and describes how the coding of the experimental questions was done. Then, following Robbett and Matthews (2018), to test prediction A, section 5.2 examines to what extent Red and White partisans give different responses from each other. The focus of this analysis is on whether the respondents provide partisan responses when they are incentivized to vote for the correct answers to factual questions, and whether the resultant partisan gap is larger in the treatment group (“voters”) than in the control group (“decisive individuals”). Section 5.2 presents figures and tables that provide evidence of a partisan gap in the treatment group but, quite surprisingly, no evidence of such a gap in the control group.

Secondly, this chapter assesses if the likelihood of a correct answer changes between the treatment and control groups. Section 5.3 examines prediction B and determines whether the likelihood of a correct response is greater for decisive individuals (control) than for voters (treatment). The results show that when the self-identified Civil War descendants are voting on questions that politically challenge their views, the likelihood of a correct answer decreases. The last part of this chapter presents some additional robustness checks of the results. First, it assesses how the Bonferroni adjustment affects the results. Second, it estimates post-study probabilities as suggested by Maniadis, Tufano, and List (2014) in order to establish if the findings here can be considered true associations.

5.1. Descriptive statistics

This section introduces some descriptive statistics of the sample. The experiment was conducted during mid-December 2018 and early January 2019, and the responses were collected from two universities: the University of Tampere and the University of Turku. The experiment had 212 respondents in total, but some of the responses had to be discarded due to missing data (2 observations) or self-reported 50-50 Civil War family membership (26 observations). Interestingly, this means that only 12 % of the respondents reported a uniformly mixed background. After these exclusions, the size of the sample that was used for the statistical analysis was 184 individuals, out of which 93 had been assigned to treatment and 91 to control.

Table 4 reports some descriptive statistics of the relevant control variables over the treatment and control groups as well as the overall sample. As can be seen from the table, the means of all the variables are very similar between the treatment and the control groups. The average age of the respondents was 26 years and around 70 percent of the subjects were female. These are typical of the population from which the sample was obtained: for example, Finnish tertiary education overall tends to have a higher rate of female than male students, and only 17 % of students study in programs that have an equal ratio of males and females (Keski-Petäjä & Witting, 2018).

Table 4 Descriptive statistics

The variables “Follow News”, “Know History” and “Past Voter” are indicators. “Follow News” takes value 1 if the individual reports following the news (values “a lot” and “quite a lot” of the relevant question), “Know History” takes value 1 if the individual reports being informed about the events of the Civil War (values 4 and 5 of the relevant question), and “Past Voter” takes value 1 if the individual reports having voted in the 2015 parliamentary elections. As can be seen from Table 4, a high share of the respondents reported following the news and voting, but fewer reported good knowledge of the Civil War: only around 20 % of the individuals reported a higher-than-average knowledge of the war.

As stated earlier, “Red” is an indicator variable taking value 1 in case the individual reports a primarily Red Civil War family history. As can be seen from Table 4, the division of the individuals who self-identified as Red or White descendants was very evenly distributed in the sample: 50.5 % of the respondents stated a primarily Red family background (>50 % of relatives from the Civil War were Red). The respondents were also asked a question on who they thought had the correct cause during the war. Table 5 presents these results: about 40 % chose the “cannot say” option, but for the rest of the options the results are fairly equally balanced with around 20 % answers in each of the other categories. According to these results, support for the Reds’ cause seems to be slightly less than for Whites, but most of the respondents seem to choose an option that does not require them to take a side on the matter.

Table 5 Support for Civil War forces

As stated earlier, the experiment had altogether 9 experimental questions, and the responses to these questions are the dependent variable of interest. On average, 33 % of the responses were correct. As discussed earlier, the treatment of the multiple-choice questions follows Robbett and Matthews (2018) and Bullock et al. (2015): the experiment questions had five answer options and the “reddest” answer was coded to have value 1, the “whitest” as 0, and the other options as 0.25, 0.5 and 0.75, respectively. This coding was based on the observed trends of the responses3, and not on the initial assumptions that were made during the design and piloting phase of the study. The exact order of coding for each question can be found in Appendix 1.

3 The scale of the question on the Russian involvement (“Russians”) in the Civil War was originally coded so that the “whitest” answer was the one that indicated the highest Russian involvement. The original coding was based on the Civil War literature and academic accounts of how heavy Russian involvement in the Civil War was a major element of the “War of Liberation” accounts of the conflict. However, in the observed answers the mean of the Red voters was lower than that of the White voters, i.e. it was as if the Reds were giving more “white” answers. As the original coding of the variables was based on the pilot results and an educated guess, drawn from previous literature, about how the partisan differences might play out, it was decided that reversing the order can be justified.

5.2. Partisan gap

This section explores whether a partisan gap can be found in the responses to the experimental questions, and how the treatment affects this gap. Following Robbett and Matthews (2018), examining prediction A first requires establishing to what extent Red and White partisans give different responses from each other; section 5.3 then explores the likelihood of a correct answer. The focus of this section is on whether the respondents provide partisan responses when they are incentivized to vote for the correct answer to factual questions, and whether the resulting partisan gap is larger in the treatment group (“voters”) than in the control group (“decisive individuals”). The aim of this section is, therefore, to determine whether there is evidence that supports the predictions of the expressive voting theory introduced in chapter 3: do the respondents become more expressive when their answer is less likely to be pivotal?

One of the main results of Robbett and Matthews (2018) illustrates this effect: the partisan gap they observed increased when the individuals were voting for a correct answer. They found that the partisan gap was around 13 percentage points in both of their voting conditions and that this gap shrank to a little less than 5 percentage points when the individuals were instructed to answer the questions on their own rather than vote. When assessing both voting groups together, Robbett and Matthews (2018) found a 10 percentage point increase in the partisan gap in the treatment condition. According to them, this increase was mostly caused by the increased expressiveness of Republican partisans when voting for the answers.

To explore descriptively whether a partisan gap can be found in this experiment, Figure 2 depicts the average partisan gap in the treatment and control groups. The partisan gap here should be understood as the difference between the average answers of the self-identified Red and White subjects, computed separately for the voting groups of five and for the decisive individuals answering on their own, and averaged over all nine partisan questions. Given the normalization of each question on the [0,1] interval explained in the previous section, a difference of zero would indicate that there is no partisan bias in the responses. A positive difference, on the other hand, would imply that participants, on average, answer the questions in a manner that supports their partisan views. As can be seen from the right side of Figure 2, an average partisan gap of around six percentage points exists in the treatment group. This gap is significant at all conventional levels (p<0.001). Therefore, there is preliminary evidence that a partisan gap can indeed be found in the treatment group.
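As a rough illustration of how the gap is defined, the sketch below computes the Red-minus-White difference in mean responses separately for voters and for decisive individuals. The data, the tuple layout and the function name `partisan_gap` are hypothetical and serve only to show the arithmetic; they are not the study's data or code.

```python
from statistics import mean


def partisan_gap(rows, voting):
    """Mean response of Reds minus mean response of Whites within one group.

    rows: iterable of (voting, red, avg_response) tuples, where avg_response
    is an individual's mean answer on the [0, 1] scale (1 = "reddest").
    """
    reds = [resp for v, red, resp in rows if v == voting and red == 1]
    whites = [resp for v, red, resp in rows if v == voting and red == 0]
    return mean(reds) - mean(whites)


# Made-up illustrative data, NOT the experimental data.
sample = [
    (1, 1, 0.60), (1, 1, 0.56), (1, 0, 0.50), (1, 0, 0.54),  # voters
    (0, 1, 0.52), (0, 1, 0.50), (0, 0, 0.51), (0, 0, 0.53),  # decisive
]

gap_voters = partisan_gap(sample, voting=1)
gap_decisive = partisan_gap(sample, voting=0)
```

A gap near zero means Reds and Whites answer alike on average, while a positive gap means each side leans toward its own partisan end of the scale.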

Figure 2 Partisan gap in treatment and control groups. The figure presents the average difference in the responses of Reds and Whites. Standard error bars are reported. The individual is the level of observation.

As discussed earlier in section 3.2, there are two different views on what might be causing such a gap in the answers of the partisans. If the subjects had genuine but different beliefs about the factual questions, possibly caused by the “perceptual screen” theorized by Bartels (2002), then finding a partisan gap in the voting condition would not prove that there is expressivity in the responses. However, if the self-identified Reds and Whites had genuinely different views of the world, the responses in the treatment and control groups should remain similar. This does not seem to be the case. As Figure 2 shows, the partisan gap disappears in the control group: the gap is close to zero or even negative, indicating that the respondents in the control group give answers that do not necessarily fit their partisan groups. Additionally, in the control group the gap is not significant at any conventional level (p=0.3370), so no difference between the answers of the Reds and Whites can be detected in the control group. Therefore, the effect observed here can be interpreted as an increase in “real” expressivity in the treatment group rather than a perceptual difference between Reds and Whites. Interestingly, the absence of a gap in the control group differs from what Robbett and Matthews (2018) found in the American context. Even though their result was not quite significant, they still observed a partisan gap of a little less than five percentage points in the control group as well. In the Finnish context, this gap does not seem to exist in the control group.

Figure 3 Partisan gap in different question categories. The figure presents the average difference in the responses of Reds and Whites. Standard error bars are reported. The individual is the level of observation.

Furthermore, these results remain more or less similar regardless of the question type. Figure 3 depicts the partisan gap separately for the Civil War related questions and the contemporary politics related questions. As can be seen from Figure 3, in both categories the average partisan gap in the treatment group remains around six percentage points, and in the control group the gap disappears in both categories. Therefore, there does not seem to be evidence that the partisan gap is caused solely by certain types of questions: the individuals in the treatment group seem to engage in expressiveness both on questions related to the Civil War and on questions related to contemporary politics.

To formalize this analysis, models 1 – 1.2 and 2 – 2.2 in Table 6 match the correspondingly numbered regression models presented in section 4.3. In these regressions, the subject’s response to the experiment questions (coded on the [0,1] interval) is regressed on indicators of whether the participant is voting for the answer (Voting), whether the individual has self-identified as a descendant of a predominantly Red family (Red), and their interaction (Voting x Red). In models 1 and 2 the dependent variable is the individual’s response to all the experimental questions, whereas in models 1.1 and 2.1 the dependent variable consists of only the Civil War related questions, and in models 1.2 and 2.2 only the contemporary questions. In the first three columns the unit of observation is the response and the errors are clustered at the level of the individual, whereas in the fourth to sixth columns the unit of observation is the individual, i.e. the average response of the individual. Similar to Robbett and Matthews (2018), the results remain almost identical regardless of the level of observation. Despite small changes in the coefficients, the main results of this table remain unaffected by an alternative dummy coding of the variables Follow News and Know History (see Appendix 4 Table 14).
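The verbal description of these regressions corresponds to an interacted linear model. The following is a sketch written out from the description in this section, not a reproduction of the equations in section 4.3. The symbols $y_{iq}$ (response of individual $i$ to question $q$), $X_i$ (control variables, including any interactions of controls with Red used in the estimated models) and the coefficient names are notation assumed here.

```latex
y_{iq} = \beta_0 + \beta_1\,\mathit{Voting}_i + \beta_2\,\mathit{Red}_i
       + \beta_3\,(\mathit{Voting}_i \times \mathit{Red}_i)
       + X_i'\gamma + \varepsilon_{iq}
```

In this notation, $\beta_1$ is the effect of voting on the responses of Whites, $\beta_1 + \beta_3$ is the effect of voting on the responses of Reds, and $\beta_3$ is the change in the partisan gap caused by voting.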

Table 6 Voting and the partisan gap, linear regression models

Interestingly, the results obtained in models 1 and 2 are very much the same as the ones Robbett and Matthews (2018) observed. The significant positive coefficient on Voting x Red in models 1 and 2 indicates that, with controls in place, there is approximately an 8 percentage point increase in the partisan gap when participants are incentivized to vote for the correct answer. This result is highly significant regardless of the level of observation (in both model 1 and model 2, p=0.003). Looking at the other models, the result appears to be driven slightly more by the Civil War related questions: the coefficient on Voting x Red is somewhat smaller for the contemporary questions (models 1.2 and 2.2) than for the Civil War questions (models 1.1 and 2.1), which suggests slightly more expressiveness on the questions regarding the Civil War.4 Still, all in all, a clear increase in the partisan gap can be observed in all the models. Therefore, consistent with Robbett and Matthews (2018), the responses seem to become significantly more partisan when the respondents are voting for the correct answer. Moreover, partisanship influences the subjects’ responses when voting not only on matters related to the Civil War but also on questions of contemporary politics. Note that in models 1.1 and 2.1, the significant negative coefficients on Red do not indicate that the self-identified Reds would be giving more “white” answers in the control group when answering questions related to the Civil War.5

There are also interesting divergences from the results of Robbett and Matthews (2018). In the American context, Robbett and Matthews (2018) found that the increase in the partisan gap was mostly driven by the Republican respondents. In the U.S. context, present-day Republican voters are considered the political right and Democrats the political left. In the Finnish context, it seems that it is instead the self-identified Reds, historically connected to the political left, who increase the expressiveness of their answers more in the voting situation. The point estimates on Voting, which indicate the change of behavior if the individual self-identifies as White, are small and insignificant. This would imply that Whites are not changing their behavior when they are voting in comparison to answering as decisive individuals, even though the small negative change in the responses indicates slightly more “white” responses (the “whitest” answer was coded 0). This means that the increase in the partisan gap is caused by the increased expressivity in the answers of the Reds: Reds seem to give significantly more “red” answers when voting for the correct response than when answering individually. For example, in model 2, the change in the Reds’ answers can be obtained from the linear combination of the coefficients on Voting and Voting x Red. This combination indicates an approximately 7 percentage point increase (p<0.001)6 in the “redness” of the answers of Red participants in the treatment condition. However, it should be noted that the comparison between the Finnish and U.S. contexts is not straightforward, since the political left-right axis is a simplification of the Finnish multiparty system, whose political institutions differ drastically from the American two-party system. Moreover, even though the Whites have had connections to the political right and the Reds to the political left, the self-identified Whites and Reds do not, per se, represent voters of a political party.

4 This question is further explored in Table 16 in Appendix 4, but the results seem to suggest that the partisan gap does not change due to the question type, as can be seen from the small and insignificant coefficient on the interaction of the indicators Civil War, Voting and Red.

5 Instead, a more representative approximation is given by the linear combination Red + (Age x Red) × 26. This linear combination gives the coefficient for a Red individual of the average age of 26, with all the control indicators equal to 0. The coefficient for this case is -0.0565 (p=0.390), showing that no observable difference between the Reds and the Whites can be found in the control group. The linear combination was calculated with Stata’s lincom command.
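The linear combination of the coefficients on Voting and Voting x Red can be sketched as follows: the estimate is the sum of the two coefficients, and its variance is the sum of the two variances plus twice their covariance. The numbers below are hypothetical placeholders, not the estimates from Table 6, and the function name is illustrative only.

```python
import math


def linear_combination(b1, b2, var1, var2, cov12):
    """Estimate and standard error of the sum b1 + b2.

    Var(b1 + b2) = Var(b1) + Var(b2) + 2 * Cov(b1, b2).
    """
    estimate = b1 + b2
    std_error = math.sqrt(var1 + var2 + 2.0 * cov12)
    return estimate, std_error


# Hypothetical coefficients on Voting (b1) and Voting x Red (b2) with a
# hypothetical covariance; these are NOT the values estimated in the thesis.
est, se = linear_combination(
    b1=-0.01, b2=0.08, var1=0.0004, var2=0.0009, cov12=-0.0004
)
z_stat = est / se  # test statistic for H0: b1 + b2 = 0
```

The covariance term matters because the two coefficients are estimated from the same regression and are typically negatively correlated, so ignoring it would overstate the standard error.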

In order to avoid endogeneity, Table 6 includes all nine partisan questions even if an individual question does not produce a partisan gap. To further examine how well each question generated partisan responding, individual regressions on each experimental question are reported in Table 12 in Appendix 3. Even though the results on individual questions are not all statistically significant, some interesting observations can be made. Firstly, it seems that the results are somewhat driven by the questions regarding the deaths of POWs in the prison camps (question “POW”), the Russian involvement in the Civil War (question “Russians”), income inequality (question “inequality”) and the percentage of NEET youth, for which the voting condition seems to generate a large change in the partisan gap. Some of the questions did not seem to produce a clear gap, and especially the question on greenhouse gas emissions (question “Greenhouse”) did not seem to work in the expected manner. However, the consistency of the results across individual questions seems to indicate that the gap observed in the treatment group is not caused by, for example, the respondents misunderstanding the questions.

Secondly, a very interesting observation is that the Russian involvement in the Civil War seems to generate a large and statistically significant partisan gap. As discussed earlier, the order of the coding of some questions had to be changed in order to accommodate expressivity that appeared in unexpected ways. One of the main changes was the coding of the question on the Russian involvement in the war: Reds seemed to give higher estimates of the number of Russian soldiers than Whites. This is a very surprising result, as a big part of the “War of Liberation” interpretation advocated by the White winners of the conflict was that the war was waged against Russian socialist forces, whereas the Red interpretation has emphasized that the war was a class conflict fought among Finns. Thus, one would expect an upward partisan bias in the estimate of the Russian involvement among the Whites and an opposite bias among the Reds. As discussed earlier, this unexpected result does not seem to be caused by the respondents misunderstanding the questions: the other questions that were able to capture expressivity follow the expected trends. Furthermore, when coded in the altered manner, the partisan gap in this question is fairly large (15 percentage points) and the result is statistically significant, as can be seen from Table 12 in Appendix 3.

6 Calculated with Stata’s lincom command. The coefficient takes value 0.0696, with a standard error of 0.1885 and p<0.001.

5.3. Likelihood of a correct response

This section discusses whether the treatment changes the likelihood of answering questions correctly. In their analysis, Robbett and Matthews (2018) found that individuals who were voting for the correct answer to a question were more likely to answer the question incorrectly than those who were giving the answers as individuals, especially if the question challenged their partisan views. In general, when individuals answered for themselves instead of voting, Robbett and Matthews observed a 5 % increase in the likelihood of a correct answer. Voting also seemed to produce a sharp decrease in the likelihood of answering correctly if the question challenged the individual’s views: Robbett and Matthews observed a 12 percentage point drop in the likelihood of a correct answer if the question challenged the respondent’s partisan affiliation.

The results in Figure 4 show that the trend Robbett and Matthews (2018) found can also be observed, at least to some extent, among the Civil War partisans. Overall, about one third of the answers the respondents gave were correct. As the left-hand side of Figure 4 indicates, this share remains approximately the same in the treatment and control groups when all questions are assessed together: the decrease in the likelihood of a correct answer for voters is very slight and not statistically significant.

However, these results look different when assessing only questions that conflict with the respondent’s partisan affiliation. The right-hand side of Figure 4 plots the same likelihood for questions for which the correct answer falls on the opposite side of the [0,1] interval from the participant’s preferred partisan response. The neutral question and the two questions for which the correct answer was coded 0.5 were left out of this comparison. As shown on the right side of Figure 4, the likelihood of a correct answer is 7 percentage points lower in the voting group when answering conflicting questions. This would indicate that the likelihood of answering correctly decreases slightly for voters in comparison to decisive individuals.
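The definition of a conflicting question can be made concrete with a small sketch. It assumes the coding described in section 5.1 (1 = “reddest”, 0 = “whitest”) and treats the politically neutral question as handled separately. The function name and return labels are hypothetical.

```python
def question_type(correct_value, red):
    """Classify a partisan question relative to a respondent's side.

    correct_value: coded position of the correct answer on [0, 1]
                   (1 = "reddest", 0 = "whitest").
    red: True for a self-identified Red descendant, False for White.
    Questions whose correct answer sits at the midpoint 0.5 favor
    neither side and are excluded.
    """
    if correct_value == 0.5:
        return "excluded"
    correct_is_on_red_side = correct_value > 0.5
    return "confirming" if correct_is_on_red_side == red else "conflicting"
```

For example, a question whose correct answer is coded 0.25 is conflicting for a Red respondent and confirming for a White respondent.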

Figure 4 Voting and the likelihood of correct answer. The left panel reports all questions, while the right panel reports only questions for which the correct answer is on the opposite side of the respondent’s affiliation. Standard error bars are reported. The individual is the level of observation.

Table 7 reports very similar results from a set of linear probability models comparable to those of Robbett and Matthews (2018). In these models, the dependent variable is an indicator defined as 1 in the case of a correct response, and 0 otherwise. In the first column, the dependent variable consists of the responses to all 10 questions, irrespective of whether the question is politically neutral or not. Despite small changes in the coefficients, the main results of this table remain unaffected by alternative dummy codings of the variables Follow News and Know History (see Appendix 4 Table 15). Consistent with the above, no clear trend can be observed: the change in the likelihood of giving a correct response caused by the treatment is minimal and statistically insignificant (p=0.949). However, gender and political activity seem to affect the likelihood of answering correctly: women seem to be around 5 percentage points less likely to give correct responses (p=0.055), whereas individuals who voted in the previous parliamentary election are around 5.5 percentage points more likely to answer correctly (p=0.064).

Table 7 Likelihood of correct answer, linear likelihood estimation

In the second column, the same model is estimated only for the neutral question about the Finnish population during the Civil War. Consistent with what Robbett and Matthews (2018) found, neither the treatment nor the individual’s background seems to affect the likelihood of answering this question correctly. The small and insignificant coefficients in model 3.1 indicate that for a neutral question, the treatment does not change the likelihood of a correct response. Similarly, the third column looks only at the questions that confirm the respondent’s partisan identity. Again, the treatment does not seem to affect the likelihood of answering these questions correctly, but some other observations can be made. Firstly, women seem to be almost 7 percentage points (p=0.082) less likely to answer confirming questions correctly, which could be the driver of the gender effect in the first column. Secondly, if the individual reports high or moderately high news following, their likelihood of answering confirming questions correctly increases by around 11 percentage points (p=0.004). This seems to give some support for the idea that people tend to follow and remember news that does not challenge their own values and ideas. Similarly, if the individual has voted in the previous elections, they are more likely to answer confirming questions correctly (p=0.006).

Model 3.3 estimates the linear probability model for the conflicting questions only (again, defined as questions for which the correct answer falls on the opposite side of the [0,1] interval from the participant’s preferred partisan response). Consistent with Figure 4, the negative and weakly significant coefficient (p=0.095) on Voting indicates that, when controlling for other factors, the likelihood of a correct response decreases by 7 percentage points when the respondents have to vote for the correct answer instead of answering themselves.7 This is slightly lower than what Robbett and Matthews (2018) observed in their study. Furthermore, the two-limit Tobit regression on conflicting questions only, reported in the last column with the individual as the level of observation, qualifies the result of model 3.3. The decrease in the likelihood of a correct response associated with Voting remains almost the same, but it is not significant, which indicates that the weakly significant coefficient on Voting in model 3.3 should not be taken as conclusive evidence of a decrease in the likelihood of correct answers.

Additionally, as discussed earlier, the increased expressiveness seems to be mainly driven by the Reds, and some results in model 3.3 seem to support this notion. The negative and significant (p<0.001) coefficient on Red signals that the Reds are almost 19 percentage points less likely to correctly answer questions that conflict with their views. Interestingly, the Reds’ higher tendency to answer the challenging questions incorrectly is almost completely offset by their tendency to correctly answer questions that confirm their views: when controlling for other factors, Red respondents are 20 percentage points (p<0.001) more likely to answer confirming questions correctly. The Tobit model in column 3.4 supports the finding that Reds are more likely to answer conflicting questions incorrectly, as indicated by the large and statistically highly significant coefficient on Red.

7 Appendix 4 Table 17 reports the probit results for the challenging questions. The probit model gives a very similar result to the linear probability model. The treatment lowers the probit index (z-score) by 0.203, which means that when moving from control to treatment, the probability of a correct response decreases by 0.0713 (p=0.078).

5.4. Robustness checks

All in all, the results reported in the two previous sections confirmed, to some extent, both of the predictions made in section 4.2. Consistent with prediction A, the results showed an increase in the partisan gap when the respondents are told to vote for the correct answer instead of answering the questions themselves. These results were significant at conventional significance levels. Additionally, consistent with prediction B, the likelihood of a correct answer was higher for decisive individuals than for voters when the questions challenged their partisan views. However, this result was only weakly significant and should be interpreted with caution.

It should be noted that relying on statistical significance as the sole criterion for evaluating results has been criticized on several grounds. One of the major critiques has been that relying only on standard significance levels can lead to an excessive number of false positives, i.e. so-called type I errors (Maniadis et al., 2014). Since many published findings cannot be reproduced in subsequent studies, this has sparked a lively debate on whether a sizable fraction of published results are actually type I errors (Ioannidis, 2005, 2012). The extent to which this so-called credibility crisis in research affects experimental economics is not yet fully understood (Maniadis et al., 2017), but, for instance, Camerer et al. (2016) found that the reproducibility of economic experiments was better than that of psychological studies. In itself, my experiment can be considered a successful replication of the experiment of Robbett and Matthews (2018), although in a highly novel context. In subsection 5.4.1, I address the potential for false positives by applying Bonferroni corrections, which tighten the significance criterion to adjust for the fact that multiple tests are performed simultaneously. In subsection 5.4.2, following Maniadis et al. (2014, 2017), I give post-study probability (PSP) estimates of the effect of this replication on the confidence that the findings correspond to a true causal association.

5.4.1. Bonferroni adjustment

Social sciences typically set the significance level of statistical tests at the conventional α = 0.05, but since multiple tests are performed on the same hypothesis, the relevant level of significance needs to be adjusted. The simplest method for this is the Bonferroni procedure, which adjusts the significance level of hypothesis tests when multiple statistical tests are performed on a single hypothesis (Perrett & Mundfrom, 2010). The purpose of the Bonferroni adjustment is to reduce the probability of Type I errors, i.e. the risk of incorrectly rejecting the null hypothesis: when k tests are performed on a single hypothesis, the probability of incorrectly rejecting at least one true null hypothesis can rise to as much as α × k (Bland & Altman, 1995). According to Bland and Altman (1995), the Bonferroni procedure corrects for this by dividing the relevant level of significance by the number of tests performed, α/k. In this study, there are three tests relevant to the hypotheses, namely models 1, 2 and 3.3. This means that k = 3 and the relevant level of significance should be adjusted to 0.05/3 ≈ 0.0167. For weak significance, the Bonferroni adjustment entails 0.1/3 ≈ 0.0333.
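The adjustment is simple enough to verify directly. The sketch below computes the Bonferroni thresholds for k = 3 and compares them with the p-values reported in this chapter for models 1, 2 and 3.3. The function and variable names are illustrative only.

```python
def bonferroni_threshold(alpha, k):
    """Per-test significance level when k tests address one hypothesis."""
    return alpha / k


def familywise_error_bound(alpha, k):
    """Upper bound on the chance of at least one false rejection when
    each of k tests is run at level alpha (Boole's inequality)."""
    return min(1.0, alpha * k)


k = 3
strict = bonferroni_threshold(0.05, k)  # adjusted conventional level
weak = bonferroni_threshold(0.10, k)    # adjusted weak-significance level

# p-values reported in the text for the three main tests.
p_values = {"model 1": 0.003, "model 2": 0.003, "model 3.3": 0.095}
significant = {name: p < strict for name, p in p_values.items()}
weakly_significant = {name: p < weak for name, p in p_values.items()}
```

With these inputs, models 1 and 2 pass the adjusted threshold, while model 3.3 passes neither the adjusted conventional level nor the adjusted weak level.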

Based on these calculations, some conclusions can be drawn about the main predictions of this experiment. In Table 6, the main results on the coefficient on Voting x Red remain significant even with the Bonferroni adjustment. As stated earlier, the main hypotheses were tested in models 1 and 2, and for these two models the results are significant with the Bonferroni adjustment since the obtained p-value is 0.003. The same is true for the Civil War related questions in models 1.1 and 2.1 (p=0.011 in both). However, for models 1.2 and 2.2, the results on the contemporary questions are not significant under the Bonferroni adjustment, as they are only weakly significant at conventional significance levels (p=0.050 and p=0.053, respectively).

For model 3.3, the third main test, reported in Table 7, the Bonferroni adjustment suggests that prediction B does not hold. The coefficient on Voting, which indicates a decrease in the likelihood of a correct answer for voters on questions that challenge their views, is only weakly significant (p=0.095). With the Bonferroni adjustment, this significance disappears, indicating that the likelihood of a correct answer in the treatment group cannot be shown to differ from that in the control group. This conclusion is supported by the insignificant coefficient on Voting obtained from the Tobit regression of model 3.4. However, the coefficients on Red in both models 3.2 and 3.3 are significant even with the Bonferroni adjustment, as both were highly significant (p<0.001). Therefore, as stated earlier, it seems that individuals who report having Red family members are more likely to answer confirming questions correctly, but this effect seems to be offset by their increased likelihood of answering challenging questions incorrectly.

5.4.2. Post-study probabilities

Lastly, this section assesses the likelihood that the findings of the experiment are true associations. Building on the Bayesian framework in Ioannidis (2005) and Maniadis et al. (2014), Maniadis et al. (2017) have developed a framework that can be used to assess whether an association can be considered true. This post-study probability (PSP) framework highlights how the common p-value benchmark can easily lead to results that are not robust, especially in the case of novel and surprising findings (Maniadis et al., 2014). Instead, to be able to determine whether a finding is true, the researcher should also consider the power of the study as well as the prior information on the validity of the hypothesis (Azmat, Bagues, Cabrales, & Iriberri, 2018). Accordingly, the basic PSP framework is based on the observed p-value, the power of the design, the prior probability of the hypothesis, and the tolerance for false positives (Maniadis et al., 2014).

Since, following Levitt and List (2009), this study is essentially a replication, this factors into the PSP calculation. Replication studies can be very valuable: as Maniadis et al. (2014) and Moonesinghe, Khoury, and Janssens (2007) have noted, if there is no research bias, just a few replications will lead beliefs to converge toward the truth. Therefore, to show how this replication effort has affected the likelihood that the findings are true, this paper uses the Maniadis et al. (2017, p. F220) equation for calculating the posterior belief that an association is true after observing all the evidence from conducted studies:

$$PSP_{rep} = \frac{\Pr(\text{the association is true and there are } r \text{ successes in } n \text{ trials})}{\Pr(r \text{ successes in } n \text{ trials})}$$

$$PSP_{rep} = \frac{b(1-\beta, r, n)\,\pi}{b(1-\beta, r, n)\,\pi + b(\alpha, r, n)\,(1-\pi)}$$

where the probability of observing $r$ successful replications in $n$ replication trials conditional on a true association is

$$\Pr\{X = r\} = b(1-\beta, r, n) = \binom{n}{r}(1-\beta)^{r}\beta^{\,n-r}$$

and conditional on a false association is

$$\Pr\{X = r\} = b(\alpha, r, n) = \binom{n}{r}\alpha^{r}(1-\alpha)^{\,n-r}.$$

Further, PSP also depends on the level of power $(1-\beta)$ the study achieves, the chosen level of $\alpha$-error probability, and the level of prior belief $\pi$ that the association is true, which in this case is the posterior belief of the original study (Maniadis et al., 2017). It should be noted that this equation assumes the absence of researcher bias. In the case that no replications have been performed, the researcher should use equation (1) from Maniadis et al. (2017, p. F211).

Following Maniadis et al. (2014, 2017), Table 8 depicts the posterior likelihoods of the findings being true associations after the original Robbett and Matthews (2018) study. The calculations were done for two cases: no replication (i=0) and one replication in which the replication is successful (i=1). In the table, the power is set to the standard 0.8 level, the α-error probability is 0.05, and the level of the prior is varied. As the table shows, assuming the absence of researcher bias, a single successful replication already improves the post-study probabilities considerably. Even with a very small initial prior probability, the PSPs after one replication are higher than 70 percent, meaning that the likelihood of the findings being false positives is already quite small. Therefore, it can be concluded that this replication study provides further evidence that the results of Robbett and Matthews (2018) can be considered true associations. However, the results of these calculations should be regarded as indicative: for example, the power of the Robbett and Matthews (2018) study was higher than the standard requirement of 0.8. Note also that the calculations do not take into account the possibility of the different types of researcher bias that are discussed in Maniadis et al. (2017).

Table 8 PSPs with zero or one successful replication
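The calculation behind these post-study probabilities can be sketched in a few lines of Python. This is a minimal illustration rather than the exact procedure used for Table 8: the function names are hypothetical, the no-replication case is assumed to take the standard form PSP = (1 − β)π / ((1 − β)π + α(1 − π)), and the grid of priors is purely illustrative and does not reproduce the table's values.

```python
from math import comb


def binom_pmf(p: float, r: int, n: int) -> float:
    """b(p, r, n): probability of r successes in n trials with success rate p."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


def psp_initial(prior: float, power: float = 0.8, alpha: float = 0.05) -> float:
    """PSP after the original study only (assumed standard no-replication form)."""
    return power * prior / (power * prior + alpha * (1 - prior))


def psp_replication(prior: float, r: int, n: int,
                    power: float = 0.8, alpha: float = 0.05) -> float:
    """PSP after observing r successful replications in n replication trials."""
    true_branch = binom_pmf(power, r, n) * prior        # b(1 - beta, r, n) * pi
    false_branch = binom_pmf(alpha, r, n) * (1 - prior)  # b(alpha, r, n) * (1 - pi)
    return true_branch / (true_branch + false_branch)


# Illustrative priors (ASSUMED values, not those of Table 8).
for prior in (0.01, 0.10, 0.50):
    psp0 = psp_initial(prior)
    # The prior for the replication step is the posterior of the original study.
    psp1 = psp_replication(psp0, r=1, n=1)
    print(f"prior={prior:.2f}  PSP(i=0)={psp0:.3f}  PSP(i=1)={psp1:.3f}")
```

Chaining the two steps mirrors the description above, in which the prior belief for the replication is the posterior belief from the original study. With power 0.8 and α = 0.05, each successful study multiplies the odds of a true association by 16, which is why a single replication moves even a small prior a long way.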

In conclusion, these additional robustness checks seem to indicate that both the results obtained here and the results of the original Robbett and Matthews (2018) study should be taken seriously. Most importantly, the main results of this study on models 1 and 2 remain significant after the Bonferroni adjustment. Secondly, the post-study probabilities indicate that a single successful replication substantially increases the likelihood of the findings of this and the Robbett and Matthews (2018) study being true associations, which also highlights the importance of this replication attempt. However, there are also some caveats: one of the main predictions of Robbett and Matthews (2018) was that the likelihood of correct answers should decrease in the treatment condition. The results of the Bonferroni adjustment indicate that no such change can be observed between the treatment and the control group in this experiment. Instead, there seems to be a trend for Red respondents to have a lower likelihood of answering conflicting questions correctly, and a higher likelihood of correctly answering questions that confirm their views.

6 DISCUSSION

This chapter discusses the implications of the results of my experiment for both the expressive voting theory and the concept of partisanship, and considers some of the possible limitations of the experiment. As stated earlier, even though there is a fairly large pool of experimental studies on expressive voting, the Robbett and Matthews (2018) study was the first to provide experimental evidence of expressive voting as a means to confirm an individual’s partisan identity. The results introduced in the previous section indicate similar findings: as the partisan gap appears only under the treatment condition and not in the control, there seems to be evidence that voters derive expressive benefit from choosing responses that allow them to express their affiliation with either side of the Finnish Civil War. Additionally, an interesting aspect of my experiment in comparison to Robbett and Matthews (2018) is the inclusion of the historical questions. The persistence of the partisan gap over both historical and contemporary facts is a finding that, to my knowledge, has not been reported elsewhere. Moreover, the fact that a partisan gap arises in the answers of the self-identified descendants of Civil War partisans over both historical and contemporary issues could suggest that political divisions in society can be transmitted over generations through political socialization.

The chapter is divided into four sections. The first section discusses the implications of the results of the experiment for the expressive voting theory and how the results relate to the findings of Robbett and Matthews (2018). The second section looks at other interesting aspects of the study and their possible implications with a particular focus on partisanship and the transmission of political values and divisions in society. The third section is concerned with some of the limitations of the experiment, and it also considers how these limitations might affect the conclusions that can be drawn from the results. The last section looks at the internal and external validity of this study and what factors can be seen to affect especially its external validity. As a conclusion, it also proposes some ideas on how partisanship and expressivity could be assessed in further studies especially in multi-party contexts.

6.1. Implications for the expressive voting theory

First and foremost, the results of my experiment support the conclusions of Robbett and Matthews (2018) on expressive voting in small electorates. My results seem to suggest that their findings are not unique to the American two-party system, and that voters engage in expressive partisan behavior also in other political contexts. As analyzed earlier in section 5.2, the evidence in Figures 2 and 3 (pp. 42-43) and Table 6 (p. 44) shows that self-identified Civil War descendants gave more partisan responses when voting for the answer than when answering individually. Section 3.2.1 introduced different expressive voting models that assumed that individuals receive direct intrinsic benefit from voting choices that were, for instance, consistent with their partisan identities. The results from my experiment seem to be in line with the assumptions of these models: according to the results in Table 6 (p. 44), the decrease in the likelihood of pivotality increased the expressivity of the answers in the voting group in comparison to the control group. Furthermore, if self-identified Reds and Whites had genuinely different views of the world, the partisan gap in the treatment and control groups should remain similar. A partisan gap was absent, however, in the control group, as shown in Figures 2 and 3 (pp. 42-43). My findings would consequently suggest that, for the respondents in the treatment group, voting is a way to confirm their partisan identity.

Consistent with Brennan and Buchanan’s (1984) analogy between voting and cheering for a favorite sports team, the results of this experiment seem to imply that voter behavior cannot be understood purely through preferences over different material policy outcomes. Instead, the results indicate that a complicated mix of different group affinities, which are not necessarily even affiliations with certain political parties, affects people’s behavior when voting. Even though Brennan and Buchanan (1984) linked their analysis to party affiliation in particular, my results seem to confirm their conclusion that the results of majoritarian elections should not be presumed to reflect citizens’ preferences over alternative electoral outcomes. Instead, it seems partisanship can lead people to vote in ways that do not actually reflect their preferred policy outcome but rather express some form of partisan affiliation. As Feddersen et al. (2009) found in their experiment on ethical expressive bias, individuals might even vote in ways that go against their material self-interest.

However, it cannot be concluded that expressivity is something that needs to be eliminated completely. As Riker and Ordeshook (1968) have shown in their theoretical model, expressivity can be a major driver for people to vote. Similarly, in their experiment Feddersen et al. (2009) found that when pivot probabilities decrease, individuals with relatively strong expressive preferences either continue voting or switch from abstention to voting. Hence, it seems that this type of behavior might be highly important for voter turnout. Yet, as anecdotal evidence from, for example, the Brexit vote shows, expressivity can also lead to “voter’s remorse”: for instance, the Guardian has reported that many of the people who voted for the UK to leave stated that they were casting protest votes and did not think that the vote would lead to the UK leaving the EU (see footnote 8). Politicians and officials involved in policy creation and implementation might face an inherent dilemma: how to balance expressive and material self-interests in a way that allows for the creation of policies that serve the common interest but also ensures sufficiently high rates of participation in elections.

A very interesting difference between the results obtained in my experiment and in the Robbett and Matthews (2018) study is that whereas Robbett and Matthews (2018) observed a partisan gap also in the control group, my study found no evidence of a partisan gap in the responses of the decisive individuals. This would suggest that, at least for the self-identified Civil War partisans, the observed partisan gap in the treatment condition is caused only by the change in pivotality, and that decisive individuals seem to attempt to answer the questions based on their material preferences. The expected utility function of Robbett and Matthews (2018), already introduced in section 4.2, was:

$$EU(r) = P \cdot B \cdot b(r) + c \cdot e(r)$$

where the expected benefit from submitting response $r$ depends on the material gain $B$, weighted by the likelihood of pivotality $P$, and on the intrinsic expressive benefit $c$, which is unaffected by changes in pivotality. The results seem to indicate that the Civil War partisans in the control group do not, effectively, put any weight on the second part of the equation, which captures the expressive benefit that an individual would gain from submitting a certain response. The expressive part of the equation becomes relevant only when $P$, the likelihood of being pivotal, becomes sufficiently small and therefore reduces the instrumental material benefit that can be obtained from a certain choice. This would also mean that expressive behavior could appear whenever the instrumental benefits are so small, irrespective of $P$, that the expressive benefits of voting become relevant.
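The trade-off in this utility function can be made explicit with a short worked comparison. As a sketch, suppose a respondent chooses between two responses: $r_a$, her best guess at the accurate answer, and $r_p$, the response that best fits her partisan identity. Both labels, and the assumptions $b(r_a) > b(r_p)$ and $e(r_p) > e(r_a)$, are introduced here for illustration and are not part of the Robbett and Matthews (2018) exposition.

```latex
\begin{align*}
EU(r_p) > EU(r_a)
  &\iff P \cdot B \cdot b(r_p) + c \cdot e(r_p) > P \cdot B \cdot b(r_a) + c \cdot e(r_a) \\
  &\iff c \,\bigl[e(r_p) - e(r_a)\bigr] > P \cdot B \,\bigl[b(r_a) - b(r_p)\bigr]
\end{align*}
```

Under these assumptions the left-hand side does not depend on $P$, while the right-hand side shrinks as either $P$ or $B$ falls. The partisan response is therefore chosen once the product $P \cdot B$ is small enough, which is consistent with a gap that appears in the voting condition but not among decisive individuals.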

8 Dynskey, L. (2017, Nov 25th). ‘I thought I’d put in a protest vote’: the people who regret voting leave. The Guardian, Retrieved from http://www.theguardian.com

The fact that no partisan gap was found in the control group could also suggest that if the likelihood of pivotality or the incentives, i.e. the possible material benefits of the vote, had been high enough, the self-identified Civil War partisans would not have answered in an expressive manner in the voting condition. Even though the “electorate” in the experiment was small, meaning that the pivot probabilities were actually quite high relative to real elections, the incentives were also fairly low. It could be that in a different setting where the stakes were high enough, the respondents in the voting group would not engage in expressivity. One can speculate to what extent this result is transferable to real elections. In the experiment the material benefits of a vote were small but clear and easy to understand, but in a real election, even if the instrumental benefits were truly high for a voter, they might be hard to assess or put into proportion. For instance, in a study assessing the 2015 parliamentary elections, the major drivers for party choice seemed to be education level, profession and social class rather than particular policy proposals made by the parties (Westinen, 2016).

Interestingly, even though the partisan gap did not exist under the control condition, the likelihood of a correct answer did not seem to increase in the control group in comparison to the treatment. Availability of information would seem an obvious fix to this problem: in their slightly more extensive experiment design, Robbett and Matthews (2018) found that when individuals are given the option of acquiring free information to guide their answer, the partisan gap is nearly eliminated. However, easy access to quality information might not prove to be a cure-all for expressivity. Firstly, it should be noted that the questions in the experiment were factual and narrow, whereas the choices voters need to make in actual elections are much more complex. Secondly, as the results of the experiment showed, following the news affects the likelihood of a correct answer only when the questions are in line with partisan views. This phenomenon is called selective exposure (i.e. the tendency to follow news that is in line with our pre-existing views), and it can greatly affect how we consume news and media. Multiple studies have found that selective exposure affects media consumption; for instance, a meta-study by Hart et al. (2009) found that people are almost two times more likely to select information that affirms rather than refutes their pre-existing attitudes, beliefs, and behaviors. The prevalence of social media has been speculated to magnify this effect, and it seems that segregation of media consumption is higher for the Internet than for television news, magazines and newspapers (Gentzkow & Shapiro, 2011).

The question remains why the likelihood of a correct answer in the control group was not significantly different from that in the voting group, even though the existence of the partisan gap would suggest that it should be. If the respondents in the voting group are giving more partisan and hence also more likely incorrect answers, why do the decisive individuals not seem to have a higher likelihood of a correct answer? The answer might be found in the alternative interpretations of their results that Robbett and Matthews (2018) have proposed. Firstly, it could be that the voters are using their partisan affiliation as a heuristic when they are not pivotal in order to lower cognitive costs. As Robbett and Matthews (2018) note, this explanation does not have big implications for the interpretation of the results, and the difference is more or less motivational. Secondly, an explanation for the observed behavior could also be the respondents’ dependence on two separate heuristics: the “partisan heuristic” when voting and the “heuristic of moderation” when being decisive. For the decisive individuals, supplying moderate, middle-ground responses can be a reasonable strategy to tackle an incentivized but low-information situation (see Robbett & Matthews, 2018, p. 119 for full discussion).

To some extent, the second heuristic rationale could explain why there is very little improvement in the accuracy of the responses in the control group in comparison to the treatment. If the decisive individuals are using the “heuristic of moderation”, as Robbett and Matthews (2018) call it, the individuals in the control group would provide answers closer to the middle range of the options. These answers are not necessarily correct, and therefore a drop in the likelihood of a correct answer cannot be observed. Furthermore, since the partisan gap is still observable in the treatment group, this indicates that, consistent with the second rationale, the voters are using the “partisan heuristic” in the treatment group. However, it may also be that the increased differences between the answers are caused by “real” expressivity. With the current results, it is impossible to determine conclusively to what extent the voters are relying on this partisan heuristic, and to what extent they are engaging in “real” expressive behavior.

6.2. Implications for understanding partisanship

In addition to confirming the findings of Robbett and Matthews (2018), my experiment also has unique and novel aspects. For example, the inclusion of the questions about historical facts brings important new findings for understanding partisanship and expressive behavior. As Figures 2 and 3 (pp. 42-43) showed, the fact that the partisan gap was found over both contemporary and historical facts suggests that partisanship is a more complex phenomenon than just identification with a certain political party. Moreover, the fact that a similar partisan gap is found over both historical and contemporary issues suggests that, for the self-identified descendants of Civil War partisans, the political divisions of the past are transmitted over generations. Since Civil War partisanship seems to be reflected also on the left-right political axis regarding contemporary questions, this implies that these political views are transferred in families over generations. Even though political socialization has not been studied widely in Finland, Myllyniemi (2012), for example, found that political values on the left-right axis are often inherited from parent to child. In addition to this, the results of this experiment would imply that the process does not only happen from parent to child but also over multiple generations. However, it should also be noted that, due to the reliance of my design on self-identification, it is hard to be certain of this effect. The implications of self-identification are discussed further in the next section.

Furthermore, it seems that expressivity can also arise in ways that are at first counterintuitive, as the results of the question on the Russian involvement in the Civil War show. A large part of the academic historical literature on the Civil War suggests that the Whites would be more likely to give large estimates on the question of the Russian involvement in the war. A large part of the War of Liberation discourse was that the conflict was essentially a fight for Finnish independence against the Russian socialists, as discussed at length in section 2.3. However, my results show that the self-identified Red descendants were more likely to overestimate the Russian involvement in the conflict. This would indicate that, contrary to what most of the academic literature on the “War of Liberation” interpretation proposes, at least nowadays self-identified descendants of the Whites are actually quite aware of the fact that Russian involvement in the war was low, and instead the self-identified descendants of the Reds seem to be overestimating this number.

What, then, could be the reason that the descendants of the Reds seem to overestimate the Russian involvement? It is impossible to provide conclusive answers, but there are multiple possible explanations. Firstly, it could be that the general understanding of the Civil War among the descendants of the Reds relates to the general notion of class conflict in Europe. For example, the communist interpretations of the Civil War tended to picture the Reds as revolutionary fighters of the working class and linked the conflict to the general uprising of the working classes in Europe (Saarela, 2014). Since the results of my experiment seem to suggest that the Red descendants are more leftist than the White descendants, they might see the conflict through this lens of class struggle. This could indicate that the Red descendants believe the Finnish Civil War had stronger ties to the Russian Bolshevik movement than it actually did, which in turn could cause them to overestimate the number of Russian soldiers that participated in the conflict.

A second explanation could arise from the so-called system justification hypothesis, according to which “people who suffer the most from a given state of affairs are paradoxically the least likely to question, challenge, reject or change it” (Jost, Pelham, Sheldon, & Ni Sullivan, 2003, p. 32). It could be that the descendants of the Reds still suffer from having been the losing side of the war: even though the current accounts of the Civil War recognize the suffering that the White forces caused, the self-identified Red descendants, as the losing side, could see their families as more responsible for the start of the conflict. Since the dominant White accounts of the war at first emphasized that the Reds were responsible for the war, the shame of being the “bad side” could have been transmitted over generations. This could make the Reds more prone to believing the accounts in which the Russians were heavily involved in the conflict and the fight was over Finnish independence, which would manifest as the Reds overstating the Russian participation in the war.

6.3. Limitations

My experiment constitutes a valuable replication of parts of the study of Robbett and Matthews (2018). As discussed earlier, the credibility crisis that has affected a number of fields in the social sciences should be taken seriously, and one way to maintain the credibility of science is to produce efficient and unbiased replication studies (Ioannidis, 2012). Even though this is not necessarily a limitation, it should be noted that my experiment did not use an experimental design that was exactly the same as the one that Robbett and Matthews (2018) used. Consequently, my study cannot be considered a full replication of their work. A major part of the Robbett and Matthews (2018) study was that, in addition to varying the likelihood of pivotality, they had availability of information as a second treatment variable. By offering some participants the option to receive information that helped in answering the question, either for free or against a reduction in the possible bonus from a correct answer, they found that voters, as opposed to decisive individuals, often declined to purchase information (Robbett & Matthews, 2018). Since similar procedures on information acquisition were not implemented, this study cannot assess whether this finding of rational ignorance is replicable. A second replication attempt could resolve this issue by implementing the Robbett and Matthews (2018) experiment design completely. Further, other existing experiments, such as Nyhan and Reifler (2018), have found results very similar to those of Robbett and Matthews (2018), which would indicate that similar results would be found in a replication as well.

It is easy to maintain that relying on self-identification to establish Civil War descent is a major limitation of my study design: there is no way of knowing whether the individuals actually had White or Red relatives during the Civil War. Yet, it could also be argued that their real status is not that important. Rather, self-identification is the factor that changes behavior; whether or not the identification is based on real family relationships might not matter so much. Even though it is impossible to assess this claim within the context of this study, there is some evidence that indicates that the above argument might be true. In the context of studies of social class, Evans and Mellon (2016) have found that irrespective of their “true” social class based on their occupational background, the respondents’ self-identification as working class or middle class had a major impact on their political attitudes. Their results showed that those who self-identified as working class despite their middle-class occupation were less likely to be classified as right-wing. A very similar effect might be at play with Civil War family backgrounds. An individual who views her relatives as either Red or White might be more affected by this identification than an individual whose relatives were involved in the war but who either is not aware of this or does not have a strong personal connection with the events of the conflict.

In the context of this study, however, it is impossible to determine the order of causality in this identification process. It is clear that the change in pivotality is creating expressivity, especially for the individuals who consider themselves Red descendants, but it is difficult to assess what causes the self-identification. There are many possibilities: real Civil War family ties, political partisanship or ideology (for example on the left-right axis), or just identification with either of the forces for other personal reasons. Some studies indicate that Civil War identification could be directly connected to social class and political affiliations. For example, in her study assessing how Finns understand their own history, Torsti (2012) found that social class was a driver of some of the differences in the views on the Civil War. Individuals who had stated a working-class background were more likely to state that the Reds were silenced in the aftermath of the conflict, that the prison camps are the greatest misery in Finnish history, and that the war was mainly waged due to economic inequalities (Torsti, 2012). Furthermore, studies have found that social class still affects voter choice in parliamentary elections in Finland as well (Westinen, 2016). For example, in their assessment of the 2011 parliamentary elections, Westinen and Kestilä-Kekkonen (2015) found that the traditional socioeconomic divides (defined as left-wing, blue-collar and bourgeois) were still identifiable in the voting behavior of Finns. Further studies would be needed to untangle this identification process.

6.4. Internal and external validity

There are two important aspects to discuss when assessing experimental studies: their internal and external validity. Internal validity is concerned with the experiment’s ability to “estimate causal effects within the study population” (Athey & Imbens, 2017, p. 79). Since the effects that experiments estimate come from the forced manipulation of the treatment condition, Shadish, Cook, and Campbell (2002) argue that internal validity is achieved automatically if randomization has been well executed. As can be seen from the summary statistics in Table 4 in section 5.1 (p. 39), the characteristics of the individuals in the treatment and control conditions are more or less balanced. Also, nothing in the implementation, such as experimenter effects or unclear instructions, should have affected the observed effect: the implementation of the study was exactly the same for both the control and treatment group apart from the instructions they were given, and no changes to the implementation were made during the study. As Table 13 in Appendix 3 suggests, the respondents understood the questions of my experiment in a consistent manner. The divisions indicated by the coefficient Voting x Red seem to increase for most of the questions, even though not all of them produced a statistically significant partisan gap.

External validity poses a trickier question for experiments. External validity is concerned with how well the causal inferences of a study generalize to alternative settings where the population, outcomes and contexts of the study may change (Athey & Imbens, 2017). As Shadish et al. (2002) note, experiments tend to be conducted in a very restricted range of settings and with a limited number of versions of the treatment. This means that there is no guarantee that the results of an experiment can be extrapolated outside their own contexts (Athey & Imbens, 2017). As a replication, my experiment speaks to the external validity of the original Robbett and Matthews (2018) study. Based on my results, it can be argued that the original study has external validity because its results held in a very different political context. Since voters in multi-party systems change their party choice from election to election, partisanship has been criticized as an unfitting concept for multi-party contexts (Holmberg, 2009). My results would suggest otherwise. The main findings of Robbett and Matthews (2018) held in a political context very different from that of the United States, showing that partisan bias and expressive voting behavior are not only relevant in two-party contexts.

However, it should be noted that the subject pool of this experiment consists of students and, therefore, it can be questioned to what extent the results are representative outside of the population from which the sample was obtained. Economics has long been criticized for the use of students as experimental subjects. Students are argued to be a very specific group with fairly narrow socio-demographic characteristics, and therefore the results obtained with them tell very little about the behavior of other, more diverse social groups (Ortmann, 2005). This holds true also for this subject pool: the average age of respondents was low and there were significantly more female respondents than male. Thus, it might be that the obtained results hold only for a very specific population of individuals: self-identified Civil War descendants in university-level education. Therefore, for future research, it would be important to conduct a similar experiment with a more diverse sample of individuals and see whether the results obtained in this study hold for the overall Finnish population. Age in particular could be an important factor that might affect how strongly partisan the views are: older people might have a more personal connection to the conflict through relatives who might actually have been part of the war. This could cause the older respondents to answer more expressively, or it might be that, as the events of the Civil War seem closer for older generations, they behave less expressively to avoid bringing forth old divisions. The findings of Torsti (2012) would indicate that age could actually strengthen the expressivity of answers: older generations were more inclined to state that if the Whites had lost the war, Finland would have become a part of the Soviet Union.

In general, one should be careful when discussing what the findings of this study could mean in terms of the overall population of Finland. Obtaining a sample more representative of the general population might yield very different results. Additionally, in order to understand expressivity better, performing a similar experiment in a multi-party context along multiple political divisions could also bring forth interesting new evidence on expressive voting behavior. This would, however, require a different way of thinking about partisanship and voter choice, since contrary to the classical definitions of partisanship, in multi-party systems people do not commit to one political party for lifelong periods of time (Holmberg, 2009). As the results of this experiment show, instead of focusing on affiliation with political parties, especially in multi-party systems it could be more beneficial to look at other stable divisions in society. A good starting point in the Finnish context could, for instance, be ideological divides or the different voter “blocs” (the working class bloc, the bourgeois bloc etc.) that Westinen and Kestilä-Kekkonen (2015) observed in their assessment of the 2011 Finnish parliamentary elections.

7 CONCLUSIONS

My experiment demonstrates, following Robbett and Matthews (2018), that affirmation of partisan identity can be a driver of voter behavior alongside material preferences. The results show that expressing affiliation with either side of the Finnish Civil War affected how the voters answered the experiment questions. This provides interesting new evidence that not only party-political affinities but also other related group affiliations can affect voter choice in elections. Additionally, my experiment has shown that studying expressive voting behavior has traction in multi-party systems. As discussed in the previous chapter, an interesting new application of this experimental framework would be to assess the behavior of Finnish voters by understanding partisanship through different voter blocs, and not necessarily only through party affiliation.

My research project shows, essentially, that the concept of expressivity still has interesting and novel applications to different realms of political behavior. The implications of self-identification with either side of the Finnish Civil War for voting behavior have not been studied with a rigorous experimental framework before. Experimental methods have many advantages over surveys in studying partisan biases, such as the elimination of omitted variable bias. Thus, expanding the experimental research on expressiveness to political and local contexts other than the U.S. and Finland could provide new openings for the study of societal divisions. For example, similar civil war related rifts are present in many countries, and places like Northern Ireland, where partisan divisions have been strong, could prove to be fruitful ground for showing how partisan biases affect people’s choices in voting situations.

The findings regarding expressivity are important, but so is the evidence suggesting that political divides might be transmitted from generation to generation through political socialization. Especially in the Finnish context, the evidence on how family background affects people’s political choices is still sparse, and to my knowledge, previous studies have not assessed the transmission of political values over multiple generations. These results are by no means conclusive and would need further research, provided that suitable data are available. Similarly, the single finding that the descendants of the Reds, contrary to what historical research suggests, provided higher estimates of the Russian involvement in the Civil War could benefit from further investigation. In this regard especially, sociologists or historians might have better tools to study where the effect comes from.

Lastly, the credibility crisis in social sciences should be taken seriously, and economics has also been questioned for inflated results in empirical and experimental analyses (Camerer et al., 2016). Replications are a fundamental part of maintaining the credibility of science (Ioannidis, 2012), and therefore being able to replicate the main results of Robbett and Matthews (2018) is a very positive finding. Furthermore, as the post-study probability (PSP) calculations showed, replicating existing research significantly increases the likelihood that the findings are true associations. My results show that, with current information, the findings of Robbett and Matthews (2018) are quite likely to be true associations. This further supports the view that expressivity should have a place in future research.
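The logic of the PSP argument can be illustrated with a minimal sketch. It assumes the simplest form of the post-study probability used in the replication literature (Maniadis et al., 2014), without the terms for research bias or competing research teams, and it treats each successful replication as an independent significant result. The function names and the prior, significance level and power used here are illustrative assumptions, not the values used in this thesis.

```python
def post_study_probability(prior, alpha=0.05, power=0.80):
    """Probability that a statistically significant finding is a true association.

    prior: assumed probability that the association is true before the study
    alpha: significance level (probability of a false positive)
    power: 1 - beta (probability of detecting a true association)
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


def psp_after_replications(prior, n_significant, alpha=0.05, power=0.80):
    """Update the prior sequentially, once per significant independent study."""
    probability = prior
    for _ in range(n_significant):
        probability = post_study_probability(probability, alpha, power)
    return probability


if __name__ == "__main__":
    # Illustrative priors only; they are not taken from the thesis.
    for prior in (0.01, 0.10, 0.50):
        original = psp_after_replications(prior, 1)
        replicated = psp_after_replications(prior, 2)
        print(f"prior={prior:.2f}  original={original:.3f}  replicated={replicated:.3f}")
```

Under these assumptions, a single successful replication moves a finding with a low prior much closer to being a likely true association, which is the intuition behind the statement above.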

In this day and age, characterized by increased political volatility, uncertainty, and concerns about the functioning of democracy, understanding voting and voter behavior is of paramount importance. In my view, the field of expressivity in partisan voting behavior can still open novel perspectives on why and how people vote. Through understanding how the need to derive expressive benefit from affirming partisan and other identities influences voter behavior, we can better understand the functioning and the possible downfalls of elections in general. The results of my experiment show that there are benefits to further investigating expressive voting behavior with experimental methods and widening our views on how political partisanship is understood.

REFERENCES

Achen, C. H. (2002). Parental socialization and rational party identification. Political Behavior, 24(2), 151-170.

Andreoni, J. (1990). Impure altruism and donations to public goods: A theory of warm-glow giving. The Economic Journal, 100, 464-477.

Angrist, J., & Pischke, J.-S. (2009). Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press.

Arosalo, S. (1998). Social conditions for political violence: red and white terror in the Finnish Civil War of 1918. Journal of Peace Research, 35(2), 147-166.

Athey, S., & Imbens, G. W. (2017). The econometrics of randomized field experiments. In A. Banerjee & E. Duflo (Eds.), Handbook of Economic Field Experiments, Volume 1, pp. 74-135. Amsterdam: Elsevier.

Azmat, G., Bagues, M., Cabrales, A., & Iriberri, N. (2018). What you don’t know…Can’t hurt you? A natural experiment on relative performance feedback in higher education. Management Science, Forthcoming, 1-58.

Bartels, L. (2002). Beyond the running tally: Partisan bias in political perceptions. Political Behavior, 24(2), 117-150.

Bartels, L. (2012). The New Gilded Age. New Jersey: Princeton University Press.

Bausell, B., & Li, Y.-F. (2012). Power Analysis for Experimental Research: A Practical Guide for the Biological, Medical and Social Sciences. Cambridge: Cambridge University Press.

Bischoff, I., & Krauskopf, T. (2015). Warm glow of giving collectively - an experimental study. Journal of Economic Psychology, 51(2015), 210-218.

Bland, J. M., & Altman, D. G. (1995). Multiple significance tests: the Bonferroni method. British Medical Journal, 310(6973), 170-171.

Brennan, G. (2008). Psychological dimensions in voter choice. Public Choice, 137, 475-489.

Brennan, G., & Buchanan, J. (1984). Voter choice: Evaluating political alternatives. American Behavioral Scientist, 28(2), 185-201.

Brennan, G., & Hamlin, A. (1998). Expressive voting and electoral equilibrium. Public Choice, 95, 149-175.

Bullock, J. G., Hill, S. J., Gerber, A. S., & Huber, G. A. (2015). Partisan bias in factual beliefs about politics. Quarterly Journal of Political Science, 10(4), 519-578.

Camerer, C., Dreber, A., Forsell, E., Ho, T.-H., Huber, J., Johannesson, M., ... Wu, H. (2016). Evaluating replicability of laboratory experiments in economics. Science, 351(6280), 1433-1436.

Carter, J. R., & Guerette, S. D. (1992). An experimental study of expressive voting. Public Choice, 73, 251-260.

Casella, G., & Berger, R. L. (2002). Statistical Inference. Pacific Grove: Duxbury.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112(1), 155-159.

Dattalo, P. (2008). Determining Sample Size: Balancing Power, Precision and Practicality. Oxford: Oxford University Press.

Draper, N. R., & Smith, H. (1998). Applied regression analysis. New York: Wiley & Sons.

Duersch, P., Oechssler, J., & Schipper, B. C. (2009). Incentives for subjects in internet experiments. Economics Letters, 105(1), 120-122.

Elbittar, A., Gomberg, A., Martinelli, C., & Palfrey, T. R. (Forthcoming). Ignorance and bias in collective decisions. Journal of Economic Behavior & Organization.

Evans, G., & Mellon, J. (2016). Social class: Identity, awareness and political attitudes: Why are we still working class? British Social Attitudes, 33, 1-19.

Evans, J. (2004). Voters and Voting: An Introduction. London: Sage Publications.

Faul, F., Erdfelder, E., Buchner, A., & Lang, A.-G. (2009). Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behavior Research Methods, 41(4), 1149-1160.

Feddersen, T., Gailmard, S., & Sandroni, A. (2009). Moral bias in large elections: Theory and experimental evidence. American Political Science Review, 103(2), 175-192.

Feddersen, T., & Pesendorfer, W. (1996). The swing voter’s curse. American Economic Review, 86(3), 408-424.

Feddersen, T., & Pesendorfer, W. (1997). Voting behavior and information aggregation in elections with private information. Econometrica, 65(5), 1029-1059.

Feddersen, T., & Sandroni, A. (2006). A Theory of Participation in Elections. The American Economic Review, 96(4), 1271-1282.

Fiorina, M. (2002). Parties and partisanship: a 40-year retrospective. Political Behavior, 24(2), 93-114.

Fischer, A. (1996). A further experimental study of expressive voting. Public Choice, 88(1), 171-184.

Fricker Jr., R. D. (2016). False positives are statistically inevitable. Science, 351(673), 569-570.

Friedman, D., & Sunder, S. (1994). Experimental Methods: A Primer for Economists. Cambridge: Cambridge University Press.

Gentzkow, M., & Shapiro, J. M. (2011). Ideological segregation online and offline. The Quarterly Journal of Economics, 126(4), 1799-1839.

Gerber, A. S., Huber, G. A., & Washington, E. (2010). Party affiliation, partisanship, and political beliefs: a field experiment. American Political Science Review, 104(04), 720-744.

Green, D., Palmquist, B., & Schickler, E. (2002). Partisan Hearts and Minds: Political Parties and the Social Identities of Voters. Yale: Yale University Press.

Greiner, B. (2015). Subject pool recruitment procedures: organizing experiments with ORSEE. Journal of the Economic Science Association, 1(1), 114-125.

Haapala, P. (2009). Tie sotaan. In P. Haapala & T. Hoppu (Eds.), Sisällissodan pikkujättiläinen, pp. 10-31. Helsinki: WSOY.

Haapala, P. (2014). The expected and non-expected roots of chaos: preconditions of the Finnish Civil War. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War 1918: History, Memory, Legacy, pp. 21-50. Leiden: Brill.

Hamlin, A., & Jennings, C. (2008). Expressive political behaviour: Foundations, scope and implications. British Journal of Political Science, 41(3), 646-670.

Hamlin, A., & Jennings, C. (2017). Expressive voting. In R. Congleton, B. Grofman, & S. Voigt (Eds.), Oxford Handbook of Public Choice. Oxford: Oxford University Press.

Hart, W., Albarracin, D., Eagly, A. H., Brechan, I., Lindberg, M. J., & Merrill, L. (2009). Feeling validated versus being correct: a meta-analysis of selective exposure to information. Psychological Bulletin, 135(4), 555-588.

Heikkinen, S. (2017). War, inflation and wages: The labor market in Finland, 1910-1925. Essays in Economic & Business History, 35(1), 63-96.

Hentilä, S. (2018). Pitkät varjot: Muistamisen historia ja politiikka. Helsinki: Siltala.

Hillman, A. L. (2010). Expressive behavior in economics and politics. European Journal of Political Economy, 26(4), 403-418.

Hogg, R., & Tanis, E. (2010). Probability and Statistical Inference. New Jersey: Pearson.

Holmberg, S. (2009). Partisanship reconsidered. Oxford: Oxford University Press.

Hoppu, T. (2009a). Taistelevat osapuolet ja johtajat. In P. Haapala & T. Hoppu (Eds.), Sisällissodan pikkujättiläinen, pp. 112-144. Helsinki: WSOY.

Hoppu, T. (2009b). Valkoisten voitto. In P. Haapala & T. Hoppu (Eds.), Sisällissodan pikkujättiläinen, pp. 119-225. Helsinki: WSOY.

Huddy, L., & Bankert, A. (2017). Political partisanship as social identity. Oxford Research Encyclopedia of Politics.

Ioannidis, J. P. (2005). Why most published research findings are false. PLoS Med, 2(8), e124.

Ioannidis, J. P. (2012). Why Science Is Not Necessarily Self-Correcting. Perspectives on Psychological Science, 7(6), 645-654.

Jost, J. T., Pelham, B. W., Sheldon, O., & Ni Sullivan, B. (2003). Social inequality and the reduction of ideological dissonance on behalf of the system: evidence of enhanced system justification among the disadvantaged. European Journal of Social Psychology, 33(1), 13-36.

Kantola, A. (2014). The therapeutic imaginary in memory work: Mediating the Finnish Civil War in Tampere. Memory Studies, 7(1), 92-107.

Keski-Petäjä, M., & Witting, M. (2018). Alle viidennes opiskelijoista opinnoissa joissa tasaisesti naisia ja miehiä – koulutusalojen eriytyminen jatkuu. Helsinki: Tilastokeskus. Retrieved from http://www.stat.fi/tietotrendit/artikkelit/2018/alle-viidennes-opiskelijoista-opinnoissa-joissa-tasaisesti-naisia-ja-miehia-koulutusalojen-eriytyminen-jatkuu/

Kinnunen, T. (2014). The post-Cold War memory culture of the civil war: old-new patterns and new approaches. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War: History, Memory, Legacy, pp. 401-439. Leiden: Brill.

Kissane, B. (2004). Democratization, state formation, and Civil War in Finland and Ireland. Comparative Political Studies, 37(8), 969-985.

Kraemer, H. C., & Blasey, C. (2016). How Many Subjects?: Statistical Power Analysis in Research. California: SAGE.

Lee, D. R., & Murphy, R. H. (2017). An expressive voting model of anger, hatred, harm and shame. Public Choice, 173(2017), 307-323.

Levitt, S. D., & List, J. A. (2009). Field experiments in economics: The past, the present, and the future. European Economic Review, 53(1), 1-18.

Maniadis, Z., Tufano, F., & List, J. (2014). One swallow doesn’t make a summer: New evidence on anchoring effects. The American Economic Review, 104(1), 277-290.

Maniadis, Z., Tufano, F., & List, J. A. (2017). To replicate or not to replicate? Exploring reproducibility in economics through the lens of a model and pilot study. The Economic Journal, 127(October), F209-F235.

McCabe, K. T. (2016). Attitude responsiveness and partisan bias: direct experience with the affordable care act. Political Behavior, 38, 861-882.

Moonesinghe, R., Khoury, M. J., & Janssens, A. C. (2007). Most published research findings are false - but a little replication goes a long way. PLoS Med, 4(2), e28.

Myllyniemi, S. (2012). Monipolvinen hyvinvointi: Nuorisobarometri 2012. Helsinki: Opetus- ja kulttuuriministeriö, Nuorisotutkimusverkosto.

Niinistö, S. (2018). New Year speech by President of the Republic Sauli Niinistö on 1 January 2018: Office of the President of the Republic of Finland. https://www.sttinfo.fi/tiedote/new-year-speech-by-president-of-the-republic-sauli-niinisto-on-1-january-2018?publisherId=3981&releaseId=65473551

Nosek, B. A., Ebersole, C. R., DeHaven, A. C., & Mellor, D. T. (2018). The preregistration revolution. PNAS, 115(11), 2600-2606.

Nyhan, B., & Reifler, J. (2018). The roles of information deficits and identity threat in the prevalence of misperceptions. Journal of Elections, Public Opinion and Parties(May), 1-25.

Ortmann, A. (2005). Field experiments in economics: some methodological caveats. In J. Carpenter, G. Harrison, & J. List (Eds.), Field Experiments in Economics, pp. 51-70. Amsterdam: Elsevier.

Osinsky, P., & Eloranta, J. (2014). Why did the communists win or lose? A comparative analysis of the revolutionary civil wars in Russia, Finland, Spain, and China. Sociological Forum, 29(2), 318-341.

Palmolahti, H. (2018). Vuoden 1918 sota jakaa yhä kansaa, vaikka enemmistö seisoo juoksuhautojen välissä - Ylen kysely kertoo mitä Suomi ajattelee sisällissodasta. Helsinki: YLE. Retrieved from https://yle.fi/uutiset/3-10014896

Peltonen, U.-M. (2003). Muistin paikat: vuoden 1918 sisällissodan muistamisesta ja unohtamisesta. Helsinki: Suomen Kirjallisuuden Seura.

Peltonen, U.-M. (2009). Sisällissodan muistaminen. In P. Haapala & T. Hoppu (Eds.), Sisällissodan pikkujättiläinen, pp. 464-474. Helsinki: WSOY.

Perrett, J. J., & Mundfrom, D. J. (2010). Bonferroni Procedure. In N. J. Salkind (Ed.), Encyclopedia of research design, pp. 98-101. California: Sage.

Petersen, R. D., & Valdez, A. (2005). Using snowball-based methods in hidden populations to generate a randomized community sample of gang-affiliated adolescents. Youth Violence and Juvenile Justice, 3(2), 151-167.

Prior, M., Khanna, K., & Sood, G. (2015). You cannot be serious: The impact of accuracy incentives on partisan bias in reports of economic perceptions. Quarterly Journal of Political Science, 10(4), 489-518.

Ramirez, M. D., & Erickson, N. (2014). Partisan bias and information discounting in economic judgements. Political Psychology, 35(3), 401-415.

Riker, W. H., & Ordeshook, P. C. (1968). A theory of the calculus of voting. The American Political Science Review, 62(1), 25-42.

Robbett, A., & Matthews, P. H. (2018). Partisan bias and expressive voting. Journal of Public Economics, 157, 107-120.

Saarela, T. (2014). The Finnish Labor movement and the memory of the Civil War. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War 1918: History, Memory, Legacy, pp. 331-363. Leiden: Brill.

Schuessler, A. A. (2000). Expressive voting. Rationality and Society, 12(1), 87-119.

Shadish, W., Cook, T., & Campbell, D. (2002). Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Boston: Houghton Mifflin Company.

Shayo, M., & Harel, A. (2012). Non-consequentialist voting. Journal of Economic Behavior & Organization, 81(1), 299-313.

Siltala, J. (2014). Being absorbed into an unintended war. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War: History, Memory, Legacy, pp. 51-89. Leiden: Brill.

Suodenjoki, S. (2009). Vankileirit. In P. Haapala & T. Hoppu (Eds.), Sisällissodan pikkujättiläinen, pp. 335-355. Helsinki: WSOY.

Szpunar, P. M. (2012). Collective memory and the stranger: Remembering and forgetting the 1918 Finnish Civil War. International Journal of Communication, 5, 1200-1221.

Tepora, T. (2014a). Changing perceptions of 1918: World War II and the post-war rise of the left. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War: History, Memory, Legacy, pp. 365-400. Leiden: Brill.

Tepora, T. (2014b). Coming to terms with Violence: Sacrifice, collective memory and reconciliation in inter-war Finland. Scandinavian Journal of History, 39(4), 487-509.

Tepora, T., & Roselius, A. (2014). Introduction: The Finnish civil war, revolution and scholarship. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War 1918: History, Memory, Legacy, pp. 1-20. Leiden: Brill.

Tiihonen, A., Kestilä-Kekkonen, E., Westinen, J., & Rapeli, L. (2016). Puoluekannan periytyminen vanhemmilta lapsille. In K. Grönlund & H. Wass (Eds.), Poliittisen osallistumisen eriytyminen: Eduskuntavaalitutkimus 2015, pp. 298-320. Helsinki: Oikeusministeriö.

Tikka, M. (2014). Warfare and terror in 1918. In T. Tepora & A. Roselius (Eds.), The Finnish Civil War: History, Memory, Legacy, pp. 90-118. Leiden: Brill.

Tilastokeskus. (2007). Väestönkehitys itsenäisessä Suomessa - kasvun vuosikymmenistä kohti harmaantuvaa Suomea. Helsinki: Tilastokeskus. Retrieved from https://www.stat.fi/tup/suomi90/joulukuu.html

Torsti, P. (2012). Suomalaiset ja historia. Helsinki: Gaudeamus.

Tullock, G. (1971). Charity of the uncharitable. Economic Inquiry, 9(4), 379-392.

Tyran, J.-R. (2004). Voting when money and morals conflict: an experimental test of expressive voting. Journal of Public Economics, 88, 1646-1664.

Vinogradov, D., & Shadrina, E. (2013). Non-monetary incentives in online experiments. Economics Letters, 119(3), 306-310.

VNK. (2001). Suomen sotasurmat 1914-1922. Finland: Valtioneuvoston kanslia. Retrieved from http://vesta.narc.fi/cgi-bin/db2www/sotasurmaetusivu/stat2

Westinen, J., & Kestilä-Kekkonen, E. (2015). Perusduunarit, vihervasemmisto ja porvarit: Suomalaisen äänestäjäkunnan jakautuminen ideologisiin blokkeihin 2011 eduskunta-vaaleissa. Politiikka, 57(2), 94-114.

Westinen, J. (2016). Puoluevalinta Suomessa 2000-luvulla. In K. Grönlund & H. Wass (Eds.), Poliittisen osallistumisen eriytyminen: Eduskuntavaalitutkimus 2015, pp. 249-272. Helsinki: Oikeusministeriö.

White, J., & Ypi, L. (2016). The Meaning of Partisanship. Oxford: Oxford University Press.

APPENDIX 1 SURVEY FORM

Survey template

[e-mail]

Hi,

Would you like to take part in a study examining the Finnish Civil War and the current situation in Finland?

In my Master's thesis I study the effects of the Finnish Civil War on contemporary Finnish society. The purpose of the study is to examine Finns' attitudes toward the Civil War and toward present-day political questions. As part of my research, a survey is conducted that consists of various questions measuring knowledge of the Civil War and of the current political situation. The study is carried out in cooperation with the Helsinki Graduate School of Economics. The thesis supervisors are Topi Miettinen (Helsinki GSE) and Peter H. Matthews (Middlebury College).

If you are found eligible for the study, the first 200 respondents will be paid a small compensation for participating. The reward will be paid to you via the MobilePay app. If you are not yet a MobilePay user, you can also install the app after the reward has been paid.

Answering the whole survey takes about 10 minutes, and the survey is conducted sensitively and in accordance with good research ethics. If you are interested in participating, you can answer the survey via the link below:

XXXXXXXXXXX

Further information about the study can be requested from Noora Pirneskoski ([email protected]).

Kind regards,

Noora Pirneskoski

------

[page 1]

Thank you for deciding to take part in this study on the Finnish Civil War. This survey is carried out as part of a research project on the Finnish Civil War. The figures and facts presented in the survey are drawn mainly from the Suomen sotasurmat 1914-22 project published by the Prime Minister's Office (Valtioneuvoston kanslia) in 2002 and from Statistics Finland (Tilastokeskus).

The survey consists of three parts, which ask about your views on and knowledge of the Finnish Civil War and the current situation in Finland. Answering the whole survey takes about 10 minutes, and you may not use any aids when answering. The survey is conducted completely anonymously and confidentially.

If you are found eligible for the study, the first 200 respondents will be paid a small compensation for participating. The reward will be paid to you via the MobilePay app. Once you have answered all the questions, you will be directed to a new form where you can leave your contact details for payment of the reward.

------

[page 2, starred questions compulsory]

Are there people in your family who were in some way involved in the Finnish Civil War of 1918?*

Yes

No → [end survey when moving to next page] Thank you for your interest in the study, but unfortunately you are not eligible to respond to the survey. Further information about the research project can be requested from Noora Pirneskoski ([email protected])

If you answered yes to the previous question, what percentage of your relatives who were involved in the Civil War do you estimate were Reds/Whites?*

Please note that the numbers must sum to 100.

Reds (%)

Whites (%)

------

[page 3]

Place of residence:

Place of birth:

Date of birth: dd/mm/yyyy

Gender:

Female

Male

Did you vote in the 2015 parliamentary elections?

Yes

No

Which party do you feel closest to?

Centre Party (Keskusta)

National Coalition Party (Kokoomus)

Finns Party (Perussuomalaiset)

Social Democratic Party (SDP)

Green League (Vihreä liitto)

Left Alliance (Vasemmistoliitto)

Swedish People's Party (RKP)

Christian Democrats (Kristillisdemokraatit)

Other

Choose a four-digit code for yourself. Write your code down, as you will need it later when entering your details for payment of the reward.

Four-digit number 0000-9999:

------

[page 4]

Experiment questions

[Control] Thank you for your answers so far.

The following questions concern the events of the Civil War and the current situation in Finland. In this section you will be shown ten multiple-choice questions one at a time, and your task is to assess which of the answer options is true. You have 30 seconds to answer each question.

You will receive a reward of 2 euros for participating in this section. In addition, for each correct answer to a multiple-choice question you can earn a bonus of 50 cents on top of your monetary reward.

You will first be shown the questions on the Civil War and then those on the current situation in Finland. If you do not answer a question within 30 seconds, you will automatically be moved to the next question and cannot receive the bonus for that question. If you do not know the correct answer, choose the option you think is closest to the truth. Do not use any aids when answering.

[Treatment] Thank you for your answers so far.

The following questions concern the events of the Civil War and the current situation in Finland. In this section you will be shown ten multiple-choice questions one at a time, and your task is to vote on which of the answer options is true. You have 30 seconds to vote on each question.

You will receive a reward of 2 euros for participating in this section. Your answers will be combined with the answers of four other respondents. If the majority of your group (three out of five or more) answers a multiple-choice question correctly, you earn a 50-cent bonus on top of your monetary reward. Please note that this means that regardless of whether you yourself answer an individual question correctly or incorrectly, you may still either win or lose the bonus.

None of the members of your group will learn what you answered to the questions or how many questions your group got right. You will only be told how large a reward you have earned in total as a member of your group.

If you do not answer a question within 30 seconds, you will automatically be moved to the next question and lose the chance of the bonus. Do not use any aids when answering.

------

[Each question on their own page, 30 second time limit, bolded option correct answer]

What share of the Finnish prisoners regarded as Reds who were held in prison camps died during their captivity?

The figure includes Finnish Red Guard members and civilians regarded as suspicious who died in the prison camps of illness, epidemics and malnutrition or by violence.

0 – 15 % [0]

15 – 30 %

30 – 45 %

45 – 60 %

60 – 75 % [1]

Both White and Red troops committed acts of terror during the war. Which side's acts of terror claimed more victims?

Terror here refers to political violence outside actual combat operations, such as murders and executions outside battles.

Red terror claimed 10 times more victims than White terror [0]

Red terror claimed 5 times more victims than White terror

Both claimed equally many

White terror claimed 5 times more victims than Red terror

White terror claimed 10 times more victims than Red terror [1]

What was the population of Finland at the start of the Civil War?

2 – 2.5 million

2.5 – 3 million

3 – 3.5 million

3.5 – 4 million

4 – 4.5 million

What share of all Finns who died in the Civil War are considered to have belonged to the Red forces?

This includes those killed in battle, those who died as a result of terror and those who perished in prison camps. Terror here refers to persons who perished outside official combat operations.

0 – 20 % [0]

20 – 40 %

40 – 60 %

60 – 80 %

80 – 100 % [1]

The strength of both the Red and the White forces has been estimated at approx. 80,000 - 90,000 persons in total. What share of the Red forces' manpower is estimated to have been Russian?

Note that the figures refer to all soldiers who took part in the conflict overall, not to the troops at the front at any one time.

0 – 5 % [0]

5 – 10 %

10 – 15 %

15 – 20 %

20 – 25 % [1]

------

[Control] Thank you for your answers so far. The following multiple-choice questions concern the current situation in Finland. As before, you have 30 seconds to answer, and for each correct answer you can earn a 50-cent bonus on top of your reward.

[Treatment] Thank you for your answers so far. The following multiple-choice questions concern the current situation in Finland. As before, you have 30 seconds to vote, and if the majority of your group chooses the correct answer, you receive a 50-cent bonus on top of your reward.

------

What share of publicly funded social and health care services in Finland is produced by private service providers under the current model?

0 – 5 % [0]

5 – 10 %

10 – 15 %

15 – 20 %

20 – 25 % [1]

People are classified as low-income if their disposable income over the year falls below 60 percent of the Finnish median income. What share of the Finnish population was classified as low-income in 2016?

0 – 5 % [0]

5 – 10 %

10 – 15 %

15 – 20 %

20 – 25 % [1]

The Sipilä government has set itself the goal of raising the employment rate during its term. What was the employment rate in August 2018?

65 – 67 % [1]

67 – 69 %

69 – 71 %

71 – 73 %

73 – 75 % [0]

How have Finland's greenhouse gas emissions developed from 1990 to 2016?

Decreased by 20 – 30 % [0]

Decreased by 10 – 20 %

Remained roughly the same (growth/decline of approx. 0 – 10 %)

Increased by 10 – 20 %

Increased by 20 – 30 % [1]

The Sipilä government's goal has been to reduce the number of marginalized young people. Young people who are not in employment, education or military service are regarded as marginalized or at risk of marginalization. What share of young people aged 15 – 24 was not in employment, education or military service in 2017?

7 – 8 % [0]

8 – 9 %

9 – 10 %

10 – 11 %

11 – 12 % [1]

[last page]

Background questions

Thank you for your answers. You are now asked to answer a few background questions.

Which side, in your opinion, was more in the right in the Civil War?*

Whites

Reds

Both equally

I don't know

How much do you follow the news?*

A lot

Quite a lot

A little

Not at all

How much have you followed recent news coverage of the Civil War?*

A lot

Quite a lot

A little

Not at all

On a scale from one to five, how well do you feel you know the events of the Finnish Civil War?*

1 = "not at all"

5 = "very well"

[end survey and save answers]

Thank you for your answers! Your answers have now been saved.

By clicking the link below you will get to the reward payment form. You must fill in the form in order to claim your reward.

[link to separate survey]

Reward form

Please fill in the details below for payment of the reward. A reward form will later be sent to you by e-mail, which you must sign in order to claim the reward.

Please note that the reward is paid via the MobilePay app. If you do not yet have the app, a reminder to install the app will be sent to you in connection with the payment. If you install the app within 30 days of the payment, the payment will be delivered to your account as normal.

In case of problems, please contact Noora Pirneskoski ([email protected])

Thank you for participating!

Your four-digit code:

Name:

Date of birth:

E-mail address:

Phone number:

[end survey and save answers]

APPENDIX 2 PILOTS

First pilot

In the first pilot, the subjects were asked to answer an online survey that consisted of different background questions as well as factual questions related to the Civil War. The experimental design of the pilot study followed the procedures of the planned study described above with some exceptions: to simplify the design and to make sure that a smaller sample size would be enough to provide meaningful results, the treatment voting condition was left out of the pilot and all the subjects were instead asked to answer the questions as decisive individuals. Therefore, the results from the pilot provide only the lower limit of expressive voting behavior. The pilot also had no monetary incentives, which affects the reliability of the results, but the participants were given the possibility to enter a lottery for a gift card to a local bookstore. All the questions that were tested in both the first and the second pilot are summarized in Table 7.

The pilot was conducted at the end of July 2018, and it was sent to the Facebook groups of two youth chapters of Finnish parties: Youth of the National Coalition (Kokoomusnuoret) and Left Youth (Vasemmistonuoret). The decision to run the pilot with these two groups was based on the assumption that the members of these parties would be more likely to have distinctly white or red family backgrounds. The sample obtained consisted of 36 individuals between the ages of 16 and 56, of whom 28 identified the National Coalition Party as closest to them. Initially, the hope was that the answers would be spread more evenly between the two parties. It is impossible to pinpoint the reason why more National Coalition members answered the survey, but the researcher’s association with Hanken School of Economics may have been a reason for Left party members not to answer. It is also noteworthy that both of the Facebook groups are open, so anyone with or without membership of the youth groups can enter the sites: out of the 36 answers, two were given by Green Party supporters, one by a Social Democrat party supporter and one by a True Finn supporter. There are also considerable differences in the ages of the respondents: the youngest person to participate was 16, and the oldest 56. However, this is not a major concern, as the goal was to attract respondents who had political affiliations, which seemed to be the case for persons who browse these two Facebook sites.

Results

As noted above, the respondents in the pilot answered as decisive individuals, and the results therefore provide only a lower bound on expressiveness. The pilot results were analyzed mainly by looking at descriptive statistics of the answers and by checking that the answers were “going in the right direction”, i.e. that respondents with, for example, white family backgrounds answered in a consistently similar manner. Generally, the pilot worked well. All the questions were answered systematically, and no question had many missing values. For example, out of the 36 respondents, only 5 stated that the Civil War had not affected any of their relatives, and the question on the percentages of affected family members seems to have been understood well. One interesting observation, which could indicate that a partisan bias will be found in the actual experiment, is that when asked which side “had the right cause during the civil war”, the respondents did not answer “I don’t know” or “both equally”. Instead, out of the 28 respondents from the National Coalition answered that the whites had the right cause, and 4 out of 5 Left Party or Social Democrat supporters stated that the reds were right. Similarly, when asked about their family background, the respondents tended to answer either 100 % white or 100 % red: 7 respondents stated that their family background was completely red and 10 that it was completely white.

In order to create two partisan groups, the respondents were coded as belonging to either the left/red or the right/white group. Respondents who self-reported white families (>50% of family white) or, if they did not report family ties, identified as National Coalition supporters were coded as right/white. Respondents with red families (>50% of family red) or who identified as Left Party, Social Democrat or Green Party supporters were coded as left/red. The answers of all respondents were standardized to run from 0 to 1, with 0 being the “whitest” and 1 the “reddest” answer. For the questions to work as intended, the mean answer of respondents with red family backgrounds therefore needs to be greater than the mean answer of respondents with white family backgrounds.
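The coding rule and the 0 to 1 standardization described above can be sketched as follows. This is only an illustration: the function names, the party label strings, the use of party affiliation purely as a fallback, and the min-max rescaling are assumptions, since the text does not specify how ties or the rescaling were handled.

```python
from typing import Optional

# Party groupings follow the coding rule described above; the label strings
# themselves are hypothetical.
LEFT_PARTIES = {"Left Party", "Social Democrats", "Green Party"}
RIGHT_PARTIES = {"National Coalition"}


def code_group(white_share: Optional[float],
               red_share: Optional[float],
               party: Optional[str]) -> Optional[str]:
    """Return "red", "white", or None when the respondent cannot be coded.

    Shares are self-reported percentages (0-100) of the family on each side;
    None means that no family ties were reported.
    """
    if red_share is not None and red_share > 50:
        return "red"
    if white_share is not None and white_share > 50:
        return "white"
    # Assumption: party affiliation is used only as a fallback.
    if party in LEFT_PARTIES:
        return "red"
    if party in RIGHT_PARTIES:
        return "white"
    return None


def standardize(answer: float, lowest: float, highest: float,
                higher_is_redder: bool = True) -> float:
    """Rescale a raw answer to [0, 1], where 0 is the "whitest" answer."""
    scaled = (answer - lowest) / (highest - lowest)
    # Flip the scale for questions where a higher raw answer is "whiter".
    return scaled if higher_is_redder else 1.0 - scaled
```

The `higher_is_redder` flag makes the direction of each question explicit, so that reversing the coding of a question only requires changing one argument.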

Table 9 Pilot 1 results

To formalize the analysis, I performed a t-test of two independent means for each of the questions intended for the experimental part of the final survey. The results are promising and can be seen in Table 9. For most of the questions, the coding worked as intended: the mean answers were higher for respondents who identified with parties further to the left or who had red families. The first question, regarding red prisoners of war, gave a significant result, and the one regarding Russian involvement is very close to being significant. These results are understandable, as they concern some of the most contested aspects of the war: the situation at the prison camps has been called a human disaster, and the overstatement of Russian involvement has been an important part of the white narrative of the Civil War. Even though the results for the questions on terror and on the overall casualties of the Civil War are not significant, the means still differ enough that the effect can be expected to be intensified by the treatment in the final experiment.
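For reference, the test statistic behind a t-test of two independent means can be sketched as below. The text does not state which variant of the test was used, so the unequal-variance (Welch) form here is an assumption, and the function name is hypothetical. The p-value would then be read from a t distribution with the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(red_answers, white_answers):
    """Return (t, df) for the difference mean(red) - mean(white).

    Uses Welch's unequal-variance t statistic with Welch-Satterthwaite
    degrees of freedom. Each group needs at least two observations.
    """
    n_red, n_white = len(red_answers), len(white_answers)
    # Squared standard errors of the two group means.
    se2_red = variance(red_answers) / n_red
    se2_white = variance(white_answers) / n_white
    t = (mean(red_answers) - mean(white_answers)) / sqrt(se2_red + se2_white)
    df = (se2_red + se2_white) ** 2 / (
        se2_red ** 2 / (n_red - 1) + se2_white ** 2 / (n_white - 1)
    )
    return t, df
```

A positive t value means that the red group gave the higher (redder) mean answer, which is the direction the questions are intended to produce.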

However, two questions did not produce satisfactory results: the one regarding the casualties of white soldiers during official battles and the one on German involvement in the war. For white casualties, the difference in means is very small and far from significant, whereas for German involvement the difference is small and hard to interpret. Rather unexpectedly, the whites seem to give higher estimates of German involvement in the war than the reds. This could be explained by the fact that white Finland has always had close connections to Germany (e.g. the Jäger Movement during the First World War), so the whites may feel proud of the German support.

Modifications to the experimental design based on the first pilot results

The pilot results show that some of the questions are clearly problematic. In the pilot, two neutral questions were added to the question battery. The idea of these questions is to check whether partisans answer them differently from the political questions, to make sure that the observed bias is caused by partisan views rather than other biases. One of the neutral questions asked who wrote the famous “Under the North Star” novel trilogy, which brought the treatment of the red POWs into public discussion. Out of 36 respondents, 33 gave the right answer, so this question should be dropped from the final version of the experiment as too easy. By contrast, the question on the population of Finland at the start of the Civil War seems to work well: the spread of the answers is wide, 9 respondents got the question right, and there seems to be no difference between white and red respondents. The question of how many family members were affected by the Civil War may also need to be reframed: the responses range from 0 to 1000 with a mean of 37.28, which indicates that the question is badly phrased.

The third question that seems problematic concerns the number of White soldiers who died during the official battles of the Civil War. The differences in this question were small and insignificant, and it is possible to come up with explanations for why either the Whites or the Reds might over- or understate the number. Based on the pilot results, a few decisions were made regarding the questions used in the experiment. Some questions were left out or revised: the neutral question concerning Väinö Linna and the question on White soldiers who died in battle were removed, and the question on the number of relatives affected by the war was rephrased as a simple yes/no question.

Since the experiment needs to have enough questions, I decided to include some questions that measure expressiveness on the left-right axis on contemporary issues, similar to those in the original Robbett and Matthews (2018) study. A second pilot was needed to test these new questions and the other changes made, as well as to gather more responses from individuals with red family backgrounds.

Second pilot

The second pilot was conducted at the end of October at the department of social sciences. The sample was obtained from a class of world politics students and consisted of 21 individuals aged between 19 and 37. Like the first pilot, the second pilot did not include the treatment condition but otherwise followed the final study design. The pilot was not incentivized, so the results should be taken as indicative and as a lower bound on the expressiveness expected in the final experiment. The aim of the second pilot was to test the new questions added to the experimental design, and it therefore included five new questions related to the current political situation in Finland. These questions concerned the employment level, greenhouse gas emissions, income inequality, at-risk youth, and health and social services. The new version of the questionnaire also contained the Civil War questions that were kept after the first pilot. All the questions that were tested in the first and the second pilot are summarized in Table 12.

Results

Some general remarks should first be made about the second pilot. Firstly, the number of individuals who self-reported white or red family backgrounds was fairly low: only 8 out of 21 respondents indicated that they had relatives in the Civil War, and only one individual declared a red background. To some extent this is to be expected: not all Finns chose a side in the Civil War, and since the subjects are young, they may be less aware of the topic. Secondly, the group was fairly homogeneous in regard to political partisanship: 15 out of 21 respondents were Green Party supporters, with the rest stating affiliation with either the National Coalition or the Left Party.

Since the new questions are meant to be divisive on the left-right political axis, the respondents were divided into two groups based on their party affiliation or family background, with Green and Left Party voters as well as individuals who reported Red family backgrounds in one group, and National Coalition supporters and individuals with White family backgrounds in the other. To see whether these two groups answered the questions differently, a t-test of two independent means was performed. The results can be seen in Table 10. For the first three questions, the means are as expected: the means of the political left are higher than the means of the political right. In addition, the results are very close to being significant, which is promising since these results are supposed to provide the lower bound of expressiveness. For the last two questions, regarding at-risk youth and healthcare privatization, the results are again very close to being significant, but the differences in means were not as expected. The means for these questions were higher for the white respondents, but this may simply result from how the questions were coded and how I interpreted what the partisan responses would be. Therefore, for the final version of the experiment, the coding can be reversed if necessary to reflect the differences observed in the pilot.

Table 10 Pilot 2 results

In order to tie the first and second pilot results together and to see how the questions perform with individuals who reported being Red or White descendants, I combined the data from the two pilots for the questions that were used in both. Table 11 shows the summary statistics of each question for individuals with either a red or a white background. The questions seem to work in the expected manner, with the means of the red answers being slightly higher than the means of the white answers. To test whether the means differed significantly, a t-test of two independent means was conducted, but none of the differences were significant. However, as the results are assumed to be a lower bound, the effect is expected to be intensified by the treatment.

Table 11 Partisan results from both pilots

Modifications to the experimental design based on the second pilot results

The most important purpose of the pilot was to test the questions in the experimental part of the survey. The final selection of questions was based on the pilot results and can be seen in Table 12. In the end, 10 questions were selected, one of which is neutral. Of the Civil War questions, the one on German involvement was left out, as it remained as problematic as in the first pilot. By contrast, I decided to keep all the tested contemporary questions: even though they did not all give the expected results, it would be interesting to see how they perform in the final experiment. Also, in the Robbett and Matthews (2018, see online appendix) experiment, not all of the questions gave significant differences in the means of the answers. Moreover, in the final analysis the differences can be aggregated across all questions to better analyse the differences between the treatment and control groups, so no individual question is critical to the overall study design.

Table 12 All tested questions

APPENDIX 3 INDIVIDUAL QUESTIONS, REGRESSION RESULTS

Table 13 Linear regression results, individual questions

APPENDIX 4 ROBUSTNESS CHECKS

Table 14 Alternative dummyfications for Table 6

Table 15 Alternative dummyfications for Table 7

Table 16 Question type’s effect on partisan gap

The model in Table 16 estimates approximately how much the question type affects the partisan gap. The variable “Civil War” is an indicator taking the value 1 if the question was about the Civil War. The coefficient on “CW x Voting x Red” indicates how much the question type affected the partisan gap. As the coefficient is small and insignificant, the question type does not seem to affect the partisan gap, and the Civil War questions cannot be concluded to be the main driver of the overall partisan gap.
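One way to read a triple-interaction coefficient such as “CW x Voting x Red” is as a difference of differences of cell means. The sketch below assumes a fully saturated linear model with three 0/1 indicators and no further controls, in which case the triple-interaction coefficient equals the quantity computed here. The model in Table 16 may include other terms, so this is an interpretation aid rather than a reproduction of the estimate. The field names `civil_war`, `voting`, `red` and `answer` are hypothetical.

```python
from statistics import mean


def partisan_gap(rows, civil_war, voting):
    """Mean red answer minus mean white answer in one question-type x condition cell."""
    cell = [r for r in rows
            if r["civil_war"] == civil_war and r["voting"] == voting]
    red = [r["answer"] for r in cell if r["red"] == 1]
    white = [r["answer"] for r in cell if r["red"] == 0]
    return mean(red) - mean(white)


def triple_difference(rows):
    """Effect of voting on the partisan gap for Civil War questions,
    minus the same effect for the other questions."""
    effect_civil_war = partisan_gap(rows, 1, 1) - partisan_gap(rows, 1, 0)
    effect_other = partisan_gap(rows, 0, 1) - partisan_gap(rows, 0, 0)
    return effect_civil_war - effect_other
```

A value close to zero means that voting widens the partisan gap by about the same amount for both question types.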

Table 17 Probit model, likelihood of a correct answer, challenging questions

Table 17 reports probit results for challenging questions, i.e. questions for which the correct answer falls on the opposite side of the [0,1] interval from the participant’s preferred partisan response. The dependent variable takes the value 1 if the individual’s answer is correct and 0 otherwise. The probit model gives a result very similar to that of the linear probability model. The coefficient on the treatment is -0.203, meaning that moving from control to treatment lowers the probit index (z-score) by 0.203, which corresponds to a decrease of 0.0713 in the probability of a correct response (p=0.078). The change in the probability of a correct response was calculated with Stata’s “margins, dydx(*)” command.
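The step from a probit coefficient to a change in probability can be illustrated as follows. This is a minimal sketch and not a reproduction of the Stata computation. It treats the treatment as a 0/1 variable and averages the discrete change in the standard normal CDF over respondents. The function names and any index values passed in are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def discrete_effect(index_control: float, coef: float) -> float:
    """Change in P(correct) when a dummy with probit coefficient `coef`
    switches from 0 to 1, given the probit index under control."""
    return _STD_NORMAL.cdf(index_control + coef) - _STD_NORMAL.cdf(index_control)


def average_effect(control_indices, coef: float) -> float:
    """Average the discrete change over a sample of control-condition indices."""
    effects = [discrete_effect(z, coef) for z in control_indices]
    return sum(effects) / len(effects)
```

Because the normal CDF is steepest at zero, the same coefficient translates into a larger change in probability for respondents whose baseline probability of a correct answer is near one half than for those near zero or one. This is why the probability change is smaller in absolute value than the coefficient itself.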