PS 139: Analysis of the 2013 Armenian Presidential Election

James Chang, Jerry Feng, Karthik Siva, Benjamin Wu

April 18, 2014

Introduction

Historical Background

Armenia is a landlocked country located at the crossroads between Western Asia and Eastern Europe. It is a primarily mountainous country, bordered by Turkey to the west, Georgia to the north, the Nagorno-Karabakh Republic and Azerbaijan to the east, and Iran to the south. Until the end of the 19th century, present-day Armenia was divided between the Ottoman and Russian empires. During World War I, Armenians living in the then-Ottoman Empire were exterminated in the Armenian Genocide, but the country finally gained independence in 1918. The First Republic of Armenia was surrounded by hostile countries that ended its independence in 1920, and Armenia was absorbed by the Soviet Union shortly thereafter. The modern Republic of Armenia gained its independence in 1991 and has remained a unitary multi-party democracy. Ninety-six percent of the population speaks the Armenian language, while 76% also speaks Russian. For this reason, Armenia can be considered culturally homogeneous for the most part, and political parties are divided by differences in governing ideology rather than racial ones.

Government

The current Armenian government, dating from independence in 1991, is only 23 years old. Armenia is a representative democracy and places its president as the head of government and of the multi-party system. Armenian presidential elections follow a majority rule in which a candidate must amass more than 50% of the total votes to win the presidency. If multiple candidates run for office and no candidate amasses enough votes, a runoff is held between the two most popular candidates from the first round.

Presidential Elections

Over the course of its history, Armenia has had three presidents. The first was Levon Ter-Petrosyan, who was popularly elected the first President of the newly independent Republic of Armenia on October 16, 1991 and re-elected on September 22, 1996. After Ter-Petrosyan’s resignation, Robert Kocharyan was elected Armenia’s second President on March 30, 1998, defeating his main rival, Karen Demirchyan, in an early presidential election marred by irregularities and violations by both sides, as reported by international electoral observers. Serzh Sargsyan, the Prime Minister of Armenia at the time and backed by President Kocharyan, was viewed as the strongest contender for the presidency in the February 2008 presidential election; he won both the 2008 and 2013 presidential elections amid some accusations of fraud. In this paper, we therefore apply statistical methods to several aspects of the precinct-level data for the 2008 and 2013 presidential elections in order to detect and classify possible forms of fraud in either year.

The 2008 and 2013 Presidential Elections

In both the 2008 and 2013 presidential elections, Serzh Sargsyan won by a large margin, garnering 52.89% of the votes in 2008 and 56.67% in 2013. The runner-up in 2008, Levon Ter-Petrosyan, won just 21.51% of the votes, and the runner-up in 2013, Raffi K. Hovhannisyan, won 35.51%. The data in Tables 1 and 2 suggest that while Ter-Petrosyan had a few close contenders in 2008, namely Artur Baghdasaryan and Vahan Hovhannisyan, the 2013 election could essentially be viewed as a two-candidate race between Serzh Sargsyan and Raffi K. Hovhannisyan, as the third-place candidate won only about one-seventeenth as many votes as Hovhannisyan (31643 versus 539691).

Candidate              Number of Votes  Percent of Votes
Sargsyan, Serzh        862369           52.89
Ter-Petrosyan, Levon   351222           21.51
Baghdasaryan, Artur    272427           16.68
Hovhannisyan, Vahan    100966           6.18
Manukyan, Vazgen       21075            1.29
Karapetyan, Tigran     9792             0.60
Geghamyan, Artashes    7524             0.46
Melikyan, Arman        4399             0.27
Harutyunyan, Aram      2892             0.18

Table 1. National election data for the 2008 Armenian presidential election won by Serzh Sargsyan with runner-up Levon Ter-Petrosyan.

Candidate              Number of Votes  Percent of Votes
Serzh Sargsyan         861155           56.67
Raffi K. Hovhannisyan  539691           35.51
Hrant Bagratyan        31643            2.08
Paruyr Hayrikyan       18096            1.19
Andrias Ghukasyan      8329             0.54
Vardan Sedrakyan       6210             0.41
Arman Melikyan         3516             0.23

Table 2. National election data for the 2013 Armenian presidential election won by Serzh Sargsyan with runner-up Raffi K. Hovhannisyan.

Constituency Background

The electorate of Armenia consists of 41 regional blocks (or constituencies) of varying size, illustrated in Figure 1. Each constituency is further divided into 20-50 precincts. Each constituency is assigned an arbitrary number from 1 to 41, and each precinct is numbered from 1 up to the number of precincts in its constituency. For example, Precinct 35/1 is located in the Shirak constituency, within the Azatan region, and all precincts labelled 35/X are located in the Shirak constituency. The data, as provided by the Armenian Central Electoral Commission, contain the number of electors, the total number of valid votes, the votes apportioned to each candidate, and the number of invalid ballots, all at precinct-level resolution. Precincts range in size from about 15 to about 2200 electors (eligible voters).

Figure 1. Constituency map for Armenia in the 2013 presidential election.

Benford’s Law: Least Significant Digit Analysis

To begin the search for fraud in the electoral process, we started with the simplest method of detection, a least significant digit analysis in the spirit of Benford’s Law. The idea is that the last digits of genuine vote counts should be approximately uniformly distributed, whereas if precinct-level data were tampered with by directly changing the number of votes per candidate, human-invented numbers would tend to have least significant digits that follow a nonuniform distribution. In the 2013 data, the distributions of least significant digits for the 5 candidates with the lowest number of total votes are heavily concentrated on the low digits, as one might expect. At the precinct level, these candidates often received less than 1% of the votes; as a result, it should be expected that in many precincts they received as few as 2 or 3 votes. Therefore, the skewness of these data is not a valid basis for accusations of fraud. The two most popular candidates, Raffi Hovhannisyan and Serzh Sargsyan, have relatively uniform least significant digit distributions, as shown in Figure 2, and there does not seem to be any significant conclusion we can draw from this analysis.

Figure 2. Benford’s Law analysis for the precinct-level election data for the winner of the 2013 Armenian election, Serzh Sargsyan, and his runner-up, Raffi K. Hovhannisyan.
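The paper compares the last-digit histograms by eye. As a rough illustration only, the same comparison could be made quantitative with a chi-square statistic against a uniform distribution. The function names, the `min_count` cutoff, and the 50-precinct minimum below are assumptions made for this sketch, not part of the original analysis.

```python
from collections import Counter

# Critical value of the chi-square distribution with 9 degrees of freedom
# at the 5% significance level (standard table value).
CHI2_CRIT_DF9_05 = 16.919


def last_digit_chi2(counts):
    """Chi-square statistic of the last digits against a uniform distribution."""
    digits = Counter(c % 10 for c in counts)
    expected = len(counts) / 10
    return sum((digits.get(d, 0) - expected) ** 2 / expected for d in range(10))


def looks_nonuniform(counts, min_count=100):
    """Flag vote counts whose last digits deviate from uniform at the 5% level.

    Counts below `min_count` (a hypothetical cutoff) are dropped, because very
    small counts have naturally skewed last digits, as the text points out.
    """
    large = [c for c in counts if c >= min_count]
    if len(large) < 50:  # assumed minimum sample size for a meaningful test
        return False
    return last_digit_chi2(large) > CHI2_CRIT_DF9_05
```

Here `counts` would be one candidate's vote totals, one entry per precinct. A flagged candidate is only a prompt for closer inspection, not evidence of fraud in itself.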

We also investigated the least significant digit distributions for the 2008 election, in which Serzh Sargsyan was likewise the victor. In general, there appears to be a moderate preference for polling stations to report a number ending in zero, regardless of the candidate, as shown in the histograms in Figures 3 and 4. However, this information alone is not indicative of any fraud in the election, since it is likely that the third and fourth place candidates received zero votes in a nonnegligible number of precincts. Again, there is little that we can conclude from this analysis.

Figure 3. Benford’s Law analysis for the precinct-level election data for top two candidates in the 2008 Armenian election, Serzh Sargsyan and Levon Ter-Petrosyan.

Figure 4. Benford’s Law analysis for the precinct-level election data for third and fourth place candidates in the 2008 Armenian election, Artur Baghdasaryan and Vahan Hovhannisyan.

Turnout Distribution

While a simple histogram of turnout is not a formal statistical analysis, there is much we can learn from inspecting these histograms.

Figure 5. Percent (of electorate) turnout distributions in the 2008 (top) and 2013 (bottom) Armenian presidential elections.

In a fair election, precinct-level turnout is expected to follow an approximately Gaussian distribution, but from a simple inspection of Figure 5, it is obvious that the 2013 turnout distribution is greatly skewed toward the higher end. The 2008 turnout distribution, on the other hand, is much closer to a normal distribution. Because the incumbent Serzh Sargsyan was both the favorite and the actual winner in 2013, there is no obvious reason for turnout to shift drastically between the two election years, and so we expect that some extra activity prompted the deviation in the 2013 distribution. At this point, we can cautiously suggest that there may have been some kind of ballot stuffing, which would make turnout in some precincts appear higher than the true numbers without simply shifting a normal distribution. If there was ballot stuffing, we should expect to see some anomalous patterns in a plot of turnout against number of votes.
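The departure from a Gaussian shape is judged visually above. One way to put a number on it, offered here only as a sketch, is the sample skewness of precinct turnout, which is close to zero for a symmetric distribution. The function names and the definition of turnout as valid plus invalid ballots over electors are assumptions for illustration.

```python
from statistics import mean, pstdev


def turnout_percent(valid_votes, invalid_ballots, electors):
    """Percent of registered electors who cast a ballot in one precinct."""
    return 100.0 * (valid_votes + invalid_ballots) / electors


def skewness(values):
    """Population skewness: near 0 for a symmetric (e.g. Gaussian) sample,
    positive when the sample has a long tail toward high values."""
    mu = mean(values)
    sigma = pstdev(values)
    return mean([((v - mu) / sigma) ** 3 for v in values])
```

Computing `skewness` separately on the 2008 and 2013 turnout lists would give a single figure per year to set beside the histograms in Figure 5.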

Winning Margin Analysis

We next turn our attention to the distribution of winning margins. Figure 6 shows, for each constituency, the average turnout minus 50% (for ease of viewing) and the winning margin (votes for Sargsyan minus votes for Hovhannisyan). Although the constituencies are numbered arbitrarily, the two variables are visibly correlated, which suggests a loose relationship between turnout and the number of votes Sargsyan receives and reinforces the impression that Sargsyan is asymmetrically favored by high turnout.

Figure 6. Average turnout (minus 50%) and average winning margin between Serzh Sargsyan and Raffi K. Hovhannisyan, with each bar representing an average over one of the 41 constituencies.
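The correlation read off Figure 6 could be checked numerically. The following is a minimal sketch, assuming the precinct data are available as `(constituency, turnout_pct, margin)` tuples; that layout and both function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def constituency_averages(precincts):
    """Average turnout and winning margin per constituency.

    `precincts` is an iterable of (constituency, turnout_pct, margin) tuples,
    a layout assumed for this sketch.
    """
    groups = defaultdict(list)
    for constituency, turnout_pct, margin in precincts:
        groups[constituency].append((turnout_pct, margin))
    return {
        c: (mean([t for t, _ in rows]), mean([m for _, m in rows]))
        for c, rows in groups.items()
    }


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

Passing the 41 average turnouts and the 41 average margins to `pearson_r` would give one coefficient summarizing the relationship that the bar chart shows.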

Turnout vs. Votes

Here we try two types of scatter plots. The first plots percent turnout in each precinct against the percent of registered electors who voted for each candidate. In the analysis of these plots that follows, we assume an effective mean field theory, namely that no net correlation exists between voting preferences and geographic location. This assumption of homogeneity allows us to base our analyses on the expectation that a candidate should not win a larger fraction of a precinct’s votes as turnout increases within the precinct, and to use the national percentages of votes won by each candidate to search for anomalous behavior in each precinct. It should be noted that these assumptions could be verified or rejected using political opinion polls that preceded the election, but such polls were not available in the public domain.

Type 1

In a fair election, we might expect that the percent of the electorate in each precinct that votes for each candidate has a positive linear relationship with the turnout. For example, if candidate A received 20 votes in a precinct with 1000 eligible voters and 10% turnout, then we should expect that in a precinct with 20% turnout and 10000 eligible voters, approximately 400 voters will vote for candidate A. In other words, as more people show up to vote, both candidates gain more votes on average. In practice this may not be entirely accurate, as some political analysts have claimed that the first 20% and the last 20% of voters actually vote differently, but these differences should not change the expectation of a positively sloped regression line for all candidates. Figure 7 shows the two plots of this type for Serzh Sargsyan and Raffi Hovhannisyan, with a linear regression line in blue as well as a 100% vote line (where the turnout equals the number of votes for the candidate) for Sargsyan.

Figure 7. Percent of the electorate that voted for Serzh Sargsyan (left) and Raffi K. Hovhannisyan (right) in the 2013 election in each precinct as a function of percent turnout. We expect that, in general, as turnout increases, both candidates should gain larger shares of the total electorate’s votes, but this is clearly not the case for the runner-up. The line of best fit for Sargsyan’s votes follows V = 1.35 × T − 45.09, while Hovhannisyan’s follows V = −0.30 × T + 37.3, suggesting that the latter loses votes as turnout increases.
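The Type 1 expectation can be written as a one-line model. The symbols V_c, T, and s_c are notation introduced here for illustration, and the model holds only under the homogeneity assumption stated earlier.

```latex
% V_c: percent of a precinct's electorate voting for candidate c
% T:   percent turnout in that precinct
% s_c: candidate c's share of the votes cast (assumed equal in every precinct)
\[
  V_c = s_c \, T
\]
% Worked example from the text: 20 votes from 1000 electors at 10% turnout
\[
  s_A = \frac{20}{0.10 \times 1000} = 0.20,
  \qquad
  0.20 \times (0.20 \times 10000) = 400 \text{ votes}.
\]
```

Under this idealization, each candidate's points would scatter around a line through the origin whose slope equals that candidate's vote share, which is why a positive slope is expected for every candidate.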

As we see from the above two plots, Hovhannisyan’s vote-to-turnout distribution is somewhat unusual. As a larger proportion of the electorate arrives to vote, he seems to become less popular. A plausible explanation for this phenomenon might be that the masses genuinely support the more popular incumbent, Sargsyan. However, if this were the case, we would expect the scatter points to be more obviously linear. Instead, we have a strangely shaped triangular region in which the low-turnout precincts differ wildly in their support for Hovhannisyan and high-turnout precincts increasingly disapprove of him as turnout increases. Correspondingly, the plot of Sargsyan’s votes shows an increasing linear trend, as we might reasonably expect, but there appears to be some anomalous activity in the low turnout precincts and in the very high (>80%) turnout precincts, in which Sargsyan receives nearly all of the valid votes. We do the same analysis in Figures 8 and 9 for the top four candidates of the 2008 election, again with the inclusion of 100% vote lines as dotted lines:

Figure 8. Percent of the electorate that voted for Serzh Sargsyan (left) and Levon Ter-Petrosyan (right) in the 2008 election in each precinct as a function of percent turnout.

Figure 9. Percent of the electorate that voted for Artur Baghdasaryan (left) and Vahan Hovhannisyan (right) in the 2008 election in each precinct as a function of percent turnout.

We note that the regression line for Serzh Sargsyan using the 2008 election data is very close to that using the 2013 election data, and the second and third place candidates in the 2008 election also tended to get a smaller share of a precinct’s electorate as its turnout increased. The data for Levon Ter-Petrosyan also share the same triangular shape as those for Raffi K. Hovhannisyan. In the second type of scatter plot, we thus check for a basis to argue for or against accusations of fraud against Sargsyan.

Type 2

The following plots show the percent of valid votes for each candidate as a function of turnout. This time, we expect a flat trend (a line with zero slope), because in a fair election under our assumptions the percent of votes that a candidate wins should not change with turnout unless, contrary to our assumptions, places with large turnout genuinely and strongly preferred one candidate. We also plot the respective residuals to look for suspicious patterns of deviation from the regression lines.

Figure 10. Percent of the valid votes for Serzh Sargsyan in the 2013 election as a function of turnout in each precinct. The line clearly has positive slope (V = 1.11 × T − 9.10), suggesting that either there was correlation between high turnout and increasing preference for Sargsyan over the other candidates or fraud.

Figure 11. Percent of the valid votes for Raffi K. Hovhannisyan in the 2013 election as a function of turnout in each precinct. The line clearly has negative slope (V = −0.91 × T + 89.73), suggesting either the opposite trend for Hovhannisyan or the impact of vote stealing.
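The regression lines and residuals described for Figures 10 and 11 could be computed along the following lines. This is a sketch using ordinary least squares in plain Python; the function names and the 30-point outlier threshold are illustrative assumptions, not values taken from the analysis.

```python
from statistics import mean


def fit_line(turnout, share):
    """Ordinary least-squares fit: share = slope * turnout + intercept."""
    mt, ms = mean(turnout), mean(share)
    sxx = sum((t - mt) ** 2 for t in turnout)
    sxy = sum((t - mt) * (s - ms) for t, s in zip(turnout, share))
    slope = sxy / sxx
    return slope, ms - slope * mt


def residuals(turnout, share, slope, intercept):
    """Observed share minus fitted share for each precinct."""
    return [s - (slope * t + intercept) for t, s in zip(turnout, share)]


def flag_outliers(turnout, share, threshold=30.0):
    """Indices of precincts whose residual exceeds `threshold` points.

    The 30-point threshold is an arbitrary illustration, not from the paper.
    """
    slope, intercept = fit_line(turnout, share)
    res = residuals(turnout, share, slope, intercept)
    return [i for i, r in enumerate(res) if r > threshold]
```

Because the line is fitted to all precincts, including any affected by fraud, flagged indices are best read as candidates for closer inspection rather than as findings.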

In these two plots and their residuals, there are several factors to note. The first is the symmetry of the two residual plots, which can be explained as follows. Raffi K. Hovhannisyan and Serzh Sargsyan were the two most popular candidates in the 2013 election, and all other candidates received trivial numbers of votes in almost every precinct. So for the most part it is to be expected that whatever votes Sargsyan did not receive, Hovhannisyan did.

The second thing to note is the unusual trailing tail of points where Sargsyan wins nearly 100% of the votes in the low turnout regime. This tail continues into the high turnout regime as well; there, however, the points are much closer to the line of best fit and do not constitute suspicious deviations. In the low turnout regime, by contrast, the tail represents precincts in which Sargsyan not only received close to 100% of the valid votes but which also deviate very significantly from the regression line, thus raising suspicion, at least in light of the assumptions stated earlier. Of course, it should be noted that the line itself is fit to data that are affected by possible fraud, but in this analysis we first assume that overall the fraud is limited or nonexistent. In other words, the general trend that high-turnout regions genuinely strongly preferred Sargsyan is assumed true, and further large deviations from this trend in favor of Sargsyan in precincts with a medium number of votes are suspicious.

Next, we need to address the fact that a large percent turnout does not tell us about the absolute number of votes. In fact, the truly suspicious points are those that deviate significantly from the line and have a large number of votes. As it turns out, size of precinct electorate and percent turnout are not strongly correlated for this election, and a linear fit to such data has a near zero slope. The low-to-middle range of percent turnout contains many precincts with fewer than 500 votes that often give all of their votes to Sargsyan; this anomaly is investigated next. There are also a few points in this range of turnout that have between 1500 and 2000 votes, and we suspect that some fraud occurred here, as they, too, contribute a disproportionate fraction of their votes to Sargsyan. The middle-to-high range of percent turnout has a greater mix of raw numbers of votes, and even so, the deviations from the regression line in Figure 10 are not very significant.

Thus, because fraud is typically easier to commit without immediate detection in less urban areas than in cities, we next narrow the search to the smaller precincts, again looking for large deviations from the regression line wherein Sargsyan wins nearly all of the votes. As it turned out, several precincts had very few voters or votes, with the smallest precinct reporting 14 valid votes. The following is a plot of the distribution of the raw number of accepted valid votes by precinct.

Figure 12. Distribution of the raw number of valid votes counted per precinct. Evidently, many precincts recorded relatively few valid votes.

Because the distribution of precinct size is so unusual and because vote stealing is generally more difficult in large cities, we look for fraud in the smaller precincts, but not in ones so small that winning 100% of the votes is unremarkable. Thus, we consider percent turnout plotted against percent of votes for each candidate for precincts that recorded between 100 and 500 votes; these plots maintain the same large-scale characteristics as we noted before.

Figure 13. Percent of the valid votes for Serzh Sargsyan in the 2013 election as a function of turnout in each precinct in the low-to-middle range of turnout.

Figure 14. Percent of the valid votes for Raffi K. Hovhannisyan in the 2013 election as a function of turnout in each precinct in the low-to-middle range of turnout.

The trailing tail does not disappear entirely until we limit our consideration to precincts with 500 or more votes (1764 votes being the highest number for any precinct). This suggests that in many smaller precincts, vote stealing may have occurred and that the long chain of precincts won by Sargsyan with nearly 100% of the votes was not merely a result of his popularity. There are also some unusual phenomena near the high end of the spectrum, but these do not deviate significantly enough from the line of best fit to definitively suggest fraud of any kind. Looking at these same plots for the top four candidates in the 2008 election, once more it is clear that the largest deviations from the line of best fit for the data on the votes for Sargsyan often correspond to points where he won nearly 100% of the votes, and these points also correspond to points where Ter-Petrosyan and Baghdasaryan won almost no votes.

Figure 15. Percent of the valid votes for Serzh Sargsyan in the 2008 election as a function of turnout in each precinct. The line clearly has positive slope (V = 0.8068 × T − 1.2989), suggesting that either there was correlation between high turnout and increasing preference for Sargsyan over the other candidates or fraud.

Figure 16. Percent of the valid votes for Levon Ter-Petrosyan in the 2008 election as a function of turnout in each precinct. The line clearly has negative slope (V = −0.3179 × T + 41.9130), suggesting either the opposite trend for Ter-Petrosyan or the impact of vote stealing.

Figure 17. Percent of the valid votes for Artur Baghdasaryan in the 2008 election as a function of turnout in each precinct. The line clearly has negative slope (V = −0.4314 × T + 46.5317), suggesting either the opposite trend for Baghdasaryan or the impact of vote stealing.

Figure 18. Percent of the valid votes for Vahan Hovhannisyan in the 2008 election as a function of turnout in each precinct. The line has a very slight positive slope (V = 0.0062 × T + 5.9245), which indicates neither that his election was fraudulent nor that it was the victim of fraud from other candidates.

On the other hand, the voting pattern for Vahan Hovhannisyan gives the cleanest record, with a line with very little slope and mostly randomly scattered residuals. Unlike the data for the other three candidates, these data appear to be both free of fraud and only marginally impacted by the fraud potentially committed by others in the election. However, in the vast majority of the constituencies, Hovhannisyan won less than 15% of the votes, so the impact of another candidate’s fraudulent activity would be difficult to detect.

At this point, the analysis cautiously suggests that Sargsyan is guilty of vote stealing in the 2008 and 2013 elections, especially in the low-to-middle range percent turnout precincts with raw turnout between 100 and 500 votes. In addition, there are a few very suspicious points in the low percent turnout regime that correspond to larger raw turnout and that are reported to have contributed almost all of their votes to Sargsyan. Finally, the trend in both years of increasingly large fractions of votes going to Sargsyan as percent turnout increases is not by itself a definitive indicator of fraud, as the correlation between raw numbers of votes and percent turnout is not strong.

CDF Analysis

The scatter plots in the previous section provided a qualitative idea of where to look for fraud, so we now consider the following statistical test as a supplement to find fraud quantitatively. Suppose that each precinct has N voters (counting valid ballots only) and that each voter can be described as a Bernoulli trial in which he or she votes either for Sargsyan or against him. Given that Sargsyan won the election with 58.63% of the valid votes (the number in Table 2 includes invalid votes), the probability that any given voter votes for him can be approximated by p0 = 0.5863, using the assumptions of homogeneity. Out of N voters, the probability that exactly n of them vote for Sargsyan (where n ≤ N) is given by the binomial distribution:

\[
  P(p_0, n) = \binom{N}{n} \, p_0^{\,n} \, (1 - p_0)^{N-n}
\]

Then for each precinct, we use the CDF to compute the probability that n is greater than or equal to vi, the actual number of votes Sargsyan received, and compare this probability to a significance level α. In other words:

H0: The true distribution of voter approval of Sargsyan is 58.63%.

We fail to reject the null hypothesis if P(n ≥ vi) ≥ α, and we reject it if P(n ≥ vi) < α. If the null hypothesis held in every precinct, at most about a fraction α of precincts would reject it by chance, so if the percent of all precincts that reject the hypothesis is large compared to α, then fraud is likely. We performed this statistical test for decreasing α levels (by factors of 10) and found that in close to half of all precincts, it is extraordinarily unlikely at the α = 1% level that Sargsyan received such a high proportion of votes, and we therefore suspect that fraud was committed in a large percent of these precincts.

In Figure 19, the red curve uses the actual national percent of votes won by Sargsyan (58.63%) and illustrates the percent of all precincts that rejected H0 for various α. The x-axis uses log10(α) because this test was performed for α = 0.01, 0.001, 0.0001, etc. Reading the graph from right to left, it is immediately clear that the ratio of the percent of precincts that reject H0 at α = 1% to α itself is much greater than 1. This point is the grounds for the claims made above. A more interesting aspect of this plot is that as α decreases, this ratio increases, meaning that, compared to the α level, larger proportions of the precincts reject H0, again providing evidence that fraud occurred in more than 30% of precincts. Finally, to give a clearer picture of just how significant this result is, the same curves are given for hypothetical p0 values greater than 58.63%, the actual p0 value.

Figure 19. Plot of the number of precincts that reject the null hypothesis for various α levels and p0. The red curve corresponds to the p0 estimated by the national election result, and the other curves depict just how statistically significant the indicator of fraud is.
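The test behind Figure 19 can be sketched in a few lines of Python. This is an illustrative reimplementation, not the code used for the paper: the function names and the `(valid_votes, votes_for_candidate)` input layout are assumptions. The upper tail is summed in log space because p0 raised to a power near 2000 underflows ordinary floating point.

```python
from math import exp, fsum, lgamma, log


def binom_upper_tail(n_voters, votes, p0):
    """P(n >= votes) for n ~ Binomial(n_voters, p0), summed in log space."""
    log_p, log_q = log(p0), log(1.0 - p0)
    terms = []
    for n in range(votes, n_voters + 1):
        log_pmf = (
            lgamma(n_voters + 1) - lgamma(n + 1) - lgamma(n_voters - n + 1)
            + n * log_p + (n_voters - n) * log_q
        )
        terms.append(exp(log_pmf))
    return min(1.0, fsum(terms))


def percent_rejecting(precincts, p0, alpha):
    """Percent of precincts with P(n >= v_i) < alpha.

    `precincts` is a list of (valid_votes, votes_for_candidate) pairs,
    a layout assumed for this sketch.
    """
    rejected = sum(
        1 for n_voters, votes in precincts
        if binom_upper_tail(n_voters, votes, p0) < alpha
    )
    return 100.0 * rejected / len(precincts)
```

Calling `percent_rejecting(precincts, 0.5863, 10 ** -k)` for k = 2, 3, 4, and so on would trace out one curve of the kind shown in Figure 19, and repeating the loop with larger `p0` values would give the comparison curves.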

Conclusion

We believe that there was a large amount of fraud involved in the 2013 Armenian presidential election and support this claim with an analysis of the data for the 2008 and 2013 elections provided on the Armenian Central Electoral Commission’s website.

First, we examined the winning margin between the incumbent (Serzh Sargsyan) and his challenger (Raffi K. Hovhannisyan). We found that as voter turnout increased, the margin by which Sargsyan won increased.

Next, we investigated the turnout distribution at the precinct level for both 2008 and 2013. In 2008, the distribution was close to a Gaussian distribution, while in 2013, the distribution appears similar to that of 2008 but shifted to the right, with a suspicious bump at the higher turnout percentages. Because of this bump, we suspect some kind of fraud, possibly ballot stuffing.

We therefore continued this line of investigation by scrutinizing percent turnout against the percent of the electorate that voted for each candidate. In the 2013 data, we discovered that as percent voter turnout increased, the percent of the electorate that voted for Sargsyan increased, while his opponent’s share of the votes decreased, a phenomenon that would be expected in a case of ballot stuffing. To strengthen our claims, we examined the percent of valid votes each candidate won against percent turnout, and we could confirm that as percent voter turnout increased, Sargsyan indeed received increased percentages of the votes. These plots gave good reason to believe that votes were stolen in many of the precincts with 100-500 valid votes.

However, these findings alone do not constitute clear proof of voting fraud. There are many valid reasons why Sargsyan’s winning margin might have increased as voter turnout increased. For instance, as the popular candidate in both elections, Sargsyan did have the support of the masses as well as the endorsement of the previous president. In that case, high turnout precincts may in fact correspond to precincts in which Sargsyan was extremely popular, while lower ones suffered from voter indifference.

As a result, we considered a simple binomial statistical test, through which we discovered overwhelming evidence for fraud, under the assumption that Sargsyan’s popularity should be roughly homogeneous throughout the precincts. This is, however, a bold assumption, so we end by qualifying our allegations of vote stealing and ballot stuffing in the 2013 election with the point that this statistical test, based as it is on our assumptions, may be biased toward overstating the amount of fraud.

Works Cited

[1] Central Electoral Committee of the Republic of Armenia. Elections of the President of the Republic, 2008. Central Electoral Committee of the Republic of Armenia. Web. 09 Mar. 2014.

[2] Central Electoral Committee of the Republic of Armenia. Elections of the President of the Republic, 2013. Central Electoral Committee of the Republic of Armenia. Web. 09 Mar. 2014.

[3] Central Electoral Committee of the Republic of Armenia. The electoral constituencies. Central Electoral Committee of the Republic of Armenia. Web. 13 Mar. 2014.

[4] The Office to the President of the Republic of Armenia. The Constitution of the Republic of Armenia - Library - The President of The Republic of Armenia: Chapter 3. The President of the Republic. The Office to the President of the Republic of Armenia. Web. 16 Mar. 2014.

[5] Hayastan. USSR (From 1920s to 1980s) - History of Armenia. Hayastan. Web. 10 Mar. 2014.

[6] Hayastan. Independance (From 1990) - History of Armenia. Hayastan. Web. 10 Mar. 2014.
