
The Effect of the 1918 Influenza Pandemic on Income Inequality: Evidence from Italy

Sergio Galletta1, Tommaso Giommoni2

1University of Bergamo 2ETH Zürich

Abstract

In this paper, we estimate the effect of the 1918 influenza pandemic on income inequality in Italian municipalities. Our identification strategy exploits the exogenous diffusion of influenza across municipalities due to the presence of infected soldiers on leave from World War I operations at the peak of the pandemic. The measures of income inequality come from newly digitized historical administrative records on Italian taxpayer incomes. We show that in the short-/medium-run (i.e., after five years), income inequality is higher in Italian municipalities more afflicted by the pandemic. The effect is mostly explained by an increase in the share of income held by the rich to the detriment of the other strata of the population.

Keywords: Inequality, Pandemic, Italy

JEL Classification: I14, D31, N44

1. Introduction

The impact of pandemics on income inequality is an issue of vital importance, both from a positive and a normative perspective. While a pandemic has widespread negative economic effects, the poor are likely to be hit harder. For example, the probability of being infected during an epidemic is potentially higher for poorer than for richer individuals. Moreover, the lower-income strata of the population may not have accumulated wealth to smooth the severity of the crisis and better recover once the pandemic has passed.1 To date, there is very little evidence regarding whether and to what extent global pandemics can affect income distribution (Karlsson et al., 2014; Alfani and Ammannati, 2017; Furceri et al., 2020; Alfani, 2020b). The limited number of empirical results is mostly due to the relative rarity of recent pandemics, as well as to the scant availability of economic data for less recent periods.

The main goal of this paper is to address this gap by providing empirical evidence of the redistributive effect of the 1918 influenza pandemic, also known as the Spanish flu, in Italian municipalities. Our empirical analysis employs unique data and exploits the heterogeneous diffusion of the influenza across localities in a cross-sectional framework. The identification and estimation of the pandemic’s effect on inequality are challenging due to potential reverse causality and correlated omitted variables: for example, one might expect the pandemic to have a greater impact in areas with higher inequality. In addition, there may be unobservable municipal characteristics associated with both inequality and the severity of the pandemic. To address such identification issues, we rely on a unique natural experiment in Italy during World War I (WWI). At that time, pandemic diffusion was linked with the movement of Italian troops in the national territory.
1 See, for example, some recent studies suggesting that the COVID-19 pandemic may have stronger negative effects on the more vulnerable categories of individuals (Adams-Prassl et al., 2020; Alon et al., 2020). Similarly, recent literature studies the effects of natural disasters (Bui et al., 2014; Yamamura, 2015) and economic crises (Atkinson and Morelli, 2011) on income inequality and poverty.

According to historical accounts, the relocation of contagious soldiers played a decisive role in the diffusion of the flu from the war front to Italian communities (Tognotti, 2015; Cutolo, 2020). Drawing on these accounts, we exploit the plausibly exogenous variation in the number of WWI soldiers who returned infected to their hometown on leave of absence during the pandemic’s peak. To construct a proxy for soldier-related flu exposure across municipalities, we use the Italian “Honor Roll of World War I Dead” (“Albo dei Caduti Italiani della Grande Guerra”). This comprehensive publication reports detailed biographical information for more than half a million Italian soldiers who died from 1915 to 1920. We use these data to create our treatment variable, which captures the number of soldiers returning from the front who died of illness in their hometown during the peak of the epidemic. We validate the pandemic treatment variable by showing that it captures both geographic and time variation in the severity of the disease. First, we provide evidence that, at the regional level, the per capita number of soldiers who died of illness in their hometown during the peak of the epidemic is significantly and positively correlated with the number of deaths due to influenza. This correlation does not emerge for other years of the war or for the per capita soldier deaths from the broader region outside the hometown. Thus, we can argue that we are picking up the variation in the severity of the disease rather than the general local effects of WWI.
Second, we show that the national monthly variation in excess mortality is highly correlated with the monthly variation in the number of soldiers who passed away because of illness. This suggests that soldiers who died of illness during the epidemic peak are likely victims of the Spanish flu.

Our identification assumption is that the number of infected soldiers who returned to their hometown and eventually died of illness is exogenous, conditional on controls. We report evidence in support of this assumption. First, we show that our treatment is significantly affected by the idiosyncratic risk of contagion of the regiments where soldiers returning to their homes had served. Second, we show that our proxy is not correlated with pre-determined municipal characteristics such as demographic and geographical features as well as local policies that might be correlated with both local inequality and influenza outcomes. Third, we show that omitted variable bias is likely to be limited, as the addition of pre-determined municipal characteristics and a broad set of fixed effects only marginally alters the estimated coefficients while substantially increasing the model fit (Altonji et al., 2005; Oster, 2019).

Besides this new identification strategy for pandemic exposure, our second empirical contribution is a new historical measure of income inequality for Italian municipalities. We collected and digitized income declaration reports published by the Italian Ministry of Finance in 1924 for the main tax on income. In the early 1900s, this tax was the single largest source of government tax revenue, accounting for more than half of the revenues collected by the Italian Treasury from direct taxes. These publications include individual income data on the universe of taxpayers with income from commercial, industrial, or other professional activities. Using these income data, we calculate several income inequality measures for around 2,000 municipalities across Italy.

Our preferred estimate indicates that a one-standard-deviation increase in our proxy for pandemic exposure at the municipal level caused a 2.2% increase in inequality, as measured by the Gini index five years later. These estimates are stable across alternative specifications and different approaches to measuring inequality. Further analyses, conducted both at the municipal and at a more fine-grained (i.e., individual) level, suggest that the effect is mostly driven by an increase in the share of income held by richer people to the detriment of poorer strata of the population. The magnitude of the effect does not appear to be significantly altered by local conditions such as geographical location, income level, or public policies. Moreover, we find suggestive evidence, mostly through correlations, hinting at the persistence of these effects over time, as the most severely afflicted municipalities still have marginally more unequal incomes even a century later (i.e., in 2018). Finally, we report several robustness checks that validate our results.
To the best of our knowledge, our paper is one of the first to study the causal effects of pandemics on income inequality focusing on local jurisdictions. Moreover, we go beyond the analysis of GDP by showing how the 1918 influenza pandemic affected the income distribution.

These results contribute to several strands of literature. First, we add to the research on the effect of pandemics on inequality. Karlsson et al. (2014) find that the 1918 epidemic increased the share of poor people in Sweden. Alfani (2015) and Alfani and Ammannati (2017) focus on the effect of the fourteenth-century Black Death on income inequality in a set of Italian regions. They show that the plague reduced inequality over time. Furceri et al. (2020) apply a panel cross-country analysis to the few epidemics that have occurred since 2000 and find an increase in inequality in those countries that were more severely affected. Galasso (2020) uses survey evidence to show that the COVID-19 epidemic worsened the labor market outcomes of low-income individuals immediately after the introduction of lockdown measures. Alfani (2020b) reviews the relationship between pandemics and inequality from a long-term perspective and shows that, except for the fourteenth-century Black Death, more recent pandemics did not reduce inequality.2

More generally, our paper complements the rapidly growing body of research aiming to understand the economic and social consequences of pandemics. Carillo and Jappelli (2020) show that Italian regions with the highest mortality rates during the 1918 epidemic reported a significant decrease in GDP compared to less affected regions and that the GDP growth rate returned to normal levels four years after the pandemic. The presence of a negative effect of the Spanish flu on GDP is also emphasized by other studies (Karlsson et al., 2014; Correia et al., 2020; Dahl et al., 2020). In particular, Correia et al. (2020) exploit variation in the mortality rate due to the 1918 influenza across major U.S. cities to analyze economic outcomes. Aside from the negative effect on GDP, they find that non-pharmaceutical interventions did not harm the economy.

Finally, we contribute to the literature on the long-term consequences of pandemics.
There is evidence hinting at the long-term effect of the 1918 influenza on individuals’ outcomes such as health and economic status as well as attitudes and trust (Almond, 2006; Lin and Liu, 2014; Percoco, 2016; Beach et al., 2018; Guimbeau et al., 2020; Aassve et al., 2021); moreover, many studies look at the long-term impact of other pandemics in history (Voigtländer and Voth, 2013; Alfani and Ammannati, 2017). However, the only papers linking pandemics and inequality in the long run are Alfani (2015) and Alfani and Ammannati (2017), which study the effect of the Black Death and find that inequality decreased in the following centuries.3

2 This study is also related to the papers providing early evidence on the consequences of the COVID-19 pandemic on inequality (Blundell et al., 2020; O’Donoghue et al., 2020).

The remainder of the paper is structured as follows. Section 2 discusses the history of the 1918 influenza in Italy. Section 3 describes the data used for the analysis. Section 4 discusses the identification strategy. Section 5 presents the results. Section 6 reports several additional results and robustness checks. Section 7 concludes.

2. Historical background

The 1918 influenza is considered one of the deadliest pandemics experienced in modern history. Estimates suggest that 500 million people were infected worldwide and that between 20 million and 50 million people died as a consequence of the disease (Johnson and Mueller, 2002). Italy reported one of the highest mortality rates in Europe, with 600,000 victims of the flu (16.74 per thousand of the overall population, according to the 1911 census).4 Three different waves occurred between spring 1918 and early 1919. The first and third waves were of moderate intensity, while most of the casualties were a result of the second wave, in fall 1918, particularly in October and November 1918. Figure 1 reports the total number of deaths from influenza over time, and it clearly shows the outbreak of 1918.

3 More broadly, our study also relates to the literature on the long-term evolution of income inequality (Piketty, 2003; Piketty and Saez, 2003; Dell, 2005; Piketty, 2005).

4 Tognotti (2015) reports an accurate description of the Italian experience with the influenza epidemic and its interplay with WWI. What we describe in this section is based on this source, unless noted otherwise. Alfani and Melegaro (2010) and Cutolo (2020) provide additional evidence on the Spanish flu in Italy.

The severity of the epidemic was a result of several factors. First, the standard of living was very low for most of the population. Hygienic conditions were generally inadequate to contain the spread of the virus. For example, just a quarter of the population had access to running water. Similarly, not everyone lived in houses with private toilets or access to sewer lines. Second, the non-pharmaceutical interventions adopted by public authorities were ineffective: they were implemented too late, with a lack of coordination between the responsible offices, and were generally not enforced. Some argue that the initial lack of involvement was part of a strategy that aimed not to further demoralize citizens and soldiers in a crucial moment for the resolution of WWI. This is also revealed in the few mentions of the influenza by elected representatives in the Italian Parliament during the worst months of the crisis. By inspecting parliamentary speeches, we find only one request to introduce social distancing measures in late November 1918.5

Finally, it is generally recognized that WWI played a major role in the proliferation of the virus (Crosby, 1989; Winter, 2010; Tognotti, 2015). This is true when considering the spread of the disease on a global scale as well as for the Italian setting. It has been suggested that the disease was brought to Europe during the spring of 1918 by the U.S. army when 200,000 American soldiers crossed the Atlantic to join the Allies on the battlefields. Moreover, it has been documented that the end of the war and the following demobilization in the autumn of 1918 caused the situation to deteriorate further, helping the disease to spread (Oxford et al., 2005; Taubenberger and Morens, 2006; Herring and Sattenspiel, 2011). Similarly, several historical sources have reported the presence of the disease among Italian soldiers in camps and trenches and have documented the central role played by the troops in spreading the flu on the internal front. Italy had an active internal front in the national territory, which made it impossible to avoid interactions between civilians and soldiers.
In fact, the frequent relocation of troops, as well as the presence of soldiers on leave, have been considered important factors in the propagation of the disease across the Italian regions (Tognotti, 2015).6 In this setting, the virus was able to spread prior to the demobilization of the Italian Army, which took place mostly in 1919, weeks after the official cessation of warfare between Italy and Austria-Hungary on November 3rd 1918.

5 Francesco Rota, November 21st 1918: Atti Parlamentari - Camera dei Deputati.

6 Soldiers could ask for a period of leave for a variety of reasons. In addition to the ordinary leave that each soldier was entitled to receive after a certain period at the front, it was also possible to ask for sick leave as well as for work-related leave (e.g., to work in the agricultural sector). The duration of the leave was typically between ten and fifteen days. The management of leave changed over the years of the war. Leave was granted very rarely when the Italian Army was under the command of General Luigi Cadorna. When General Armando Diaz became Commander in Chief, at the end of 1917, the policy changed so that soldiers received their rightful time of leave. The aim of the policy was to raise the morale of the troops after the Caporetto disaster.

For instance, considering the second wave, the first outbreak among soldiers had been reported as early as mid-August in a military camp near Parma. The health inspector of the camp indicated that this outbreak was associated with soldiers on leave returning from Northern provinces. A local newspaper reported that in the week from August 19 to 25, 77 people died because of the flu, 37 of whom were soldiers. Inspecting the data we gathered about military casualties, we believe these to be conservative numbers. We found that in August 1918, in Parma, 95 soldiers died because of a disease, 90 of them between August 16 and 31.

Such episodes warned the military administration of the health risks and persuaded it to intervene to limit the spread of the flu among soldiers, unfortunately without much success. Cutolo (2020) reports that general control measures decided by the Ministry of War had been implemented starting from early September 1918, requiring strict medical controls for soldiers returning to the front after a period of leave. Initially, some military units attempted to stop soldiers from going on leave, but the complaints raised by soldiers and the lack of coordination among the various commands halted any nationwide limitation on leave. In fact, the soldiers were able to request sick leave to recover from the flu.7 Civilians, meanwhile, were also worried about the possible spread of the virus from soldiers.8 To provide an insightful anecdote about the situation, we include an extract of a letter of the American philanthropist Evangeline Whipple, resident in the Italian city of Bagni di Lucca, sent on October 28th 1918 to the Rector of New York’s Grace Church:

There is a mountain village of this Commune, about ten miles up the valley, from which point one must climb on foot for about an hour and a half before reaching.... Most of the men, all of fighting age, are at the Front. But they brought the Spanish fever in its most virulent form to this remote place, on their ten days’ leave of absence. In an isolated place like this sky village, with no water except rain water, of course the contagious sickness has full sway. The priest, the only one in authority, felt the sickness coming on, and ran away....

Overall, anecdotal and historical evidence points to a causal link between infected soldiers and contagion among civilians. This evidence becomes of primary relevance in motivating the estimation strategy detailed in Section 4.

7 See, for example, Arturo Radici Valenti’s description of his experience with the flu (Capodarca, 1991).

8 Cutolo (2020) reports an article from the newspaper Il Tempo describing an episode of popular unrest in a city in the South of Italy in which two soldiers, one on leave, were killed because they were considered to be virus spreaders.

3. Data

This paper relies on a variety of sources of information. First, we have data on the victims of influenza and casualties in World War I, and second, we have income data at the individual level from the income declaration of 1924. Finally, we complement these data with additional variables from a variety of sources. Descriptive statistics are reported in Appendix Table A.1.

3.1. Influenza and general mortality

We gathered yearly information about casualties from influenza in the period 1915 to 1919 at the regional level from the publication “Cause di morte: 1887−1955” issued by the national statistical office. Following the literature, we counted deaths related to influenza and pneumonia together. The lowest level of geographic aggregation of such data is the region. The average number of casualties for the period 1915-1920 is 9,617 (Appendix Table A.1, Panel A). As Figure 1 shows, there is a large jump in 1918, corresponding to the peak of the pandemic. Moreover, we gathered national-level data with the monthly excess of mortality in the period 1915 − 1920 from Mortara (1925). This variable reports the ratio between the number of deaths in each month and the average number of deaths in the same month for the triennium 1911 − 1913.
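The excess-mortality ratio defined above can be illustrated with a minimal sketch. It assumes the monthly death counts are held in a dictionary keyed by (year, month); the function name `excess_mortality` and the toy numbers are hypothetical and are not taken from Mortara (1925).

```python
from collections import defaultdict


def excess_mortality(monthly_deaths, baseline_years=(1911, 1912, 1913)):
    """Ratio of deaths in each month to the mean deaths in the same
    calendar month over the baseline years (here the triennium 1911-1913).

    monthly_deaths maps (year, month) -> number of deaths.
    Returns (year, month) -> ratio for every non-baseline observation.
    """
    by_month = defaultdict(list)
    for (year, month), deaths in monthly_deaths.items():
        if year in baseline_years:
            by_month[month].append(deaths)
    baseline_mean = {m: sum(v) / len(v) for m, v in by_month.items()}

    return {
        (year, month): deaths / baseline_mean[month]
        for (year, month), deaths in monthly_deaths.items()
        if year not in baseline_years and month in baseline_mean
    }


# Made-up counts, for illustration only.
toy_deaths = {
    (1911, 10): 50_000,
    (1912, 10): 52_000,
    (1913, 10): 48_000,
    (1918, 10): 200_000,
}
toy_ratios = excess_mortality(toy_deaths)
```

A ratio above one indicates more deaths than in the same month of the pre-war baseline; in the toy data, October 1918 has four times the baseline deaths.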

3.2. Income declaration data

The main dependent variable of the analysis consists of a measure of inequality at the municipality level. To generate such an indicator, we rely on a unique source of information on Italian incomes. We collected and digitized a series of publications listing individual income declarations for 24 Italian provinces composed of around 2,000 municipalities in 1924.9 These tabulations were issued by the Italian tax authority in application of a law enacted in 1922 (Regio Decreto, 16 dicembre 1922 n. 1631) with the clear goal of assisting state enforcement action against tax evasion. These reports cover only taxpayers who were required to declare income from commercial and industrial activities (category B) or from liberal professions (category C). The majority of taxpayers from these two categories are owners of small businesses, mostly individual companies, or small stores and artisans. The same tax was also levied on income deriving from capital (category A) and from salaries and pensions of public employees (category D), for which, however, no disaggregated information was released.10

In Appendix Table A.2 we report the number of submitted tax declarations and the total amount of income corresponding to the tax base for each income category. The groups under consideration, categories B and C, represent the majority of the Italian taxpayers subject to this tax (69.2%), while the other two groups represent a smaller fraction. Similarly, if we focus on the aggregate declared income, the categories under consideration account for nearly 7 billion Lire, equal to 75.2% of all declared income in that year. As we only have information on a sub-sample of Italian provinces, our final dataset includes 221,139 taxpayers from a total of 955,198 (categories B and C), therefore accounting for about 23% of the national sample.11

For the purpose of our analysis, we generated a series of indicators to measure the income inequality of Italian municipalities. To prevent outliers from distorting these measures, we decided to exclude the 0.5% richest Italians from our analysis.12 First, we computed the municipal Gini index: this variable has an average value of 0.41 (Appendix Table A.1, Panel B). Appendix Figure A.1, Plot A, shows the distribution of this variable and Figure 2, Plot A, shows the distribution of this indicator over the Italian territory. Two remarks can be made: first, as already mentioned, the income data at our disposal cover provinces scattered across different regions. Second, municipalities in the sample show large variations in the level of income inequality, even within the same province. Next, we computed the fraction of income owned by the top 20% and the bottom 20% of taxpayers, whose average values are 0.47 and 0.06, respectively. The distributions of these variables are shown in Appendix Figure A.1, Plots B and C: the top 20% shows a bell-shaped distribution, while the bottom 20% has a small right tail. Finally, we generated a series of additional inequality indicators such as the Gini index net of taxes, the income standard deviation, the Theil index, and the coefficient of variation. The average value of these indicators is reported in Appendix Table A.1, Panel B.

9 It is worth noting that the selection of such provinces is not due to any specific reason. These are the publications that we were able to collect and digitize thus far as part of a larger data collection effort with the final goal of digitizing this publication for all Italian provinces. For our analysis, it is important to note that these provinces are scattered across Italian regions. For the province of Rome, we have not been able to get access to the publications containing the data of the complete set of municipalities, but only of a sub-sample.

10 The Italian income tax was called “Imposta di ricchezza mobile” and it was in effect between 1864 and 1973.

11 We are aware that our data might be affected by taxpayers’ false declarations. This should not be an issue as long as there is no systematic correlation between a municipality’s tendency to evade or avoid taxes, pre-existing levels of inequality, and our treatment variable. As all individuals within a district are subject to the same tax-enforcement office and face the same probability of audit and detection, the inclusion of district fixed effects should account for most potential concerns.

12 It is worth mentioning that our findings also emerge if we use the entire sample.
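The municipal indicators described in this subsection follow standard formulas, which the sketch below illustrates for a single municipality. It is not the authors' code: the function names, the rounding rule used to pick the top and bottom 20% of taxpayers, and the use of the population (rather than sample) variance are all assumptions, and the trimming of the 0.5% richest taxpayers is not reproduced.

```python
import math


def gini(incomes):
    """Gini index from individual incomes, using the sorted-rank formula."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    # G = 2 * sum_i(i * x_i) / (n * sum_i(x_i)) - (n + 1) / n, ranks i = 1..n
    ranked_sum = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * ranked_sum / (n * total) - (n + 1) / n


def income_share(incomes, fraction, top=True):
    """Share of total income held by the top (or bottom) fraction of taxpayers."""
    xs = sorted(incomes, reverse=top)
    k = max(1, round(fraction * len(xs)))  # assumption: simple rounding rule
    return sum(xs[:k]) / sum(xs)


def theil(incomes):
    """Theil T index; requires strictly positive incomes."""
    n = len(incomes)
    mean = sum(incomes) / n
    return sum((x / mean) * math.log(x / mean) for x in incomes) / n


def coefficient_of_variation(incomes):
    """Standard deviation (population formula) divided by the mean."""
    n = len(incomes)
    mean = sum(incomes) / n
    variance = sum((x - mean) ** 2 for x in incomes) / n
    return math.sqrt(variance) / mean


# Made-up incomes for one hypothetical municipality.
toy_incomes = [10, 20, 30, 40, 100]
toy_gini = gini(toy_incomes)
toy_top20 = income_share(toy_incomes, 0.2, top=True)
toy_bottom20 = income_share(toy_incomes, 0.2, top=False)
```

In the toy municipality the richest of five taxpayers holds half of total income, so the top-20% share is 0.5 and the bottom-20% share is 0.05. All four indicators equal zero when every taxpayer declares the same income.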

3.3. World War I casualties

To construct the main explanatory variable, we collected detailed information on the universe of Italian military victims in WWI. These data come from the “Albo dei Caduti Italiani della Grande Guerra”, a publication containing details on all 540,401 Italian military victims identified from 1915 to 1920. The number of Italian military victims in WWI has been estimated at around 651,000 (Mortara, 1925), suggesting that the coverage of the Albo dei Caduti is rather high. As reported by General Fulvio Zugaro, head of the Royal Army’s Statistics Office and chief scientific advisor of the operation, the publication contains information on soldiers who: a) died in combat or due to war injuries, b) went missing due to war-related causes, c) died or went missing in captivity (except deserters), d) died of illness related to war service, e) died of an accidental cause related to war service, and f) died by suicide whose cause was related to war service (Zugaro, 1926).

The dataset contains information on the demographics of the deceased soldiers, such as their name, date of birth, and city of origin. Moreover, information on the circumstances of the death, such as the date and the place of death, is also available. Importantly, it also reports the cause of death. For instance, we know whether a soldier died while fighting, due to injuries sustained during military combat, or due to illness. We use this information to create variables that proxy the severity of the Spanish flu at the municipality level, in light of the historical evidence suggesting that an important cause of the diffusion of the disease was the movement of soldiers. We create municipal variables counting the number of soldiers who died, the number of soldiers who died because of some illness, and the number of soldiers who died of illness in their hometown, which measures those victims of illness who came back and passed away in their hometown.
Therefore, this is the subset of victims whose place of death coincides with their city of origin. As argued in the historical background section, these soldiers were mostly on leave. To refine this variable as a proxy for the severity of the influenza, we focus only on those casualties that occurred during the months of the peak of the Spanish flu, that is, August-December 1918. The definition of the time window is crucially related to the historical facts of WWI. On the one hand, we start from August as we know from Tognotti (2015) that in that month the epidemic started in the military district of Parma. On the other hand, we consider December as the last month of our period to account for the fact that the war ended on November 3rd 1918, and that our empirical strategy (identification assumptions) hinges mostly on the idea that the diffusion of the Spanish flu is related to soldiers on leave of absence while the war was still active.13 This latter variable is the main regressor in our central analysis and measures the extent to which soldiers brought the Spanish flu back to their hometown.

13 We present several robustness checks using alternative time windows in Section 6.

Figure 3 shows the number of Italian casualties in WWI over time. The total number of victims (light blue bars) shows a positive trend in the years 1915-1918 and a peak in 1918, with around 160,000 casualties; it then decreased drastically in 1919 and 1920.14 The number of victims due to illness (blue bars) shows a different pattern, as it is low until 1917 and sharply increases in 1918 by around 350% compared to the previous year, with around 113,000 victims. It is important to note that in 1918 more than two-thirds of the war deaths were due to illness and, as already discussed, this peak is mostly due to the diffusion of the Spanish flu in the trenches. Finally, the number of those who died of illness in their hometowns (dark blue bars) in 1918 is around 8,500. This measure increased by around 160% from 1917 to 1918. During the months of the peak of the epidemic (i.e., August-December 1918) we count a total of 3,113 soldiers who died of illness in their hometowns. For the sample of cities included in our analysis (24 provinces in which fiscal data are available), the count is 1,504. In the analysis, we use these variables in per capita terms. Figure 2, Panel B, displays the spatial distribution of this variable, per 100,000 inhabitants, for all Italian municipalities. Interestingly, we find relevant variation both within and between provinces across Italian regions. Appendix Table A.1, Panel C, shows the descriptive statistics of these casualty measures and Appendix Table A.3 shows the monthly evolution of these indicators in the time span July 1918-June 1919.

14 Italy entered the war on May 24th 1915 and the conflict ended on November 3rd 1918. A limited number of casualties in the dataset, only 5%, is attributed to the years 1919 and 1920: these are soldiers who passed away after the end of the war for war-related reasons, 88% of whom due to illness.

Finally, we exploited the information contained in the Albo dei Caduti about the regiment of membership of military victims. Our aim is to measure the degree of health risk of each regiment, with respect to the probability that its members were infected by the Spanish flu. The general purpose is to capture an important “push factor” in the probability that a soldier contracts the disease, which is external to any characteristic of his hometown.
In particular, we measure the probability of dying from a disease, as opposed to dying for other reasons, for each municipality-regiment during the peak of the pandemic.15 Therefore, the variable infection risk - regiments attributes to each municipality the average health risk of the regiments where the soldiers of the city served. The average value of this indicator is 0.727, suggesting that, on average, a municipality had soldiers in regiments where 72.7% of soldiers’ deaths were due to illness during the pandemic peak of August-December 1918 (Appendix Table A.1, Panel C).

15 This measure is municipality-regiment specific as we exclude those soldiers that came from the municipality of interest and belonged to the regiment in analysis. This adjustment allows us to reduce any potential endogeneity concern of this health risk indicator, as it relies on infection records of soldiers coming from other municipalities.
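The two casualty-based variables of this subsection, the per capita count of soldiers who died of illness in their hometown during the peak and the leave-one-out regiment infection risk, can be sketched as follows. The record fields (`hometown`, `death_place`, `regiment`, `cause`, `death_date`) are hypothetical names rather than the actual layout of the Albo dei Caduti, and the unweighted average across regiments is an assumption, since the text does not specify the weighting.

```python
from collections import defaultdict
from datetime import date

# Peak window used in the text: August-December 1918.
PEAK_START, PEAK_END = date(1918, 8, 1), date(1918, 12, 31)


def in_peak(record):
    return PEAK_START <= record["death_date"] <= PEAK_END


def hometown_illness_deaths_per_100k(records, population):
    """Soldiers who died of illness in their hometown during the peak,
    per 100,000 inhabitants of that municipality."""
    counts = defaultdict(int)
    for r in records:
        if in_peak(r) and r["cause"] == "illness" and r["death_place"] == r["hometown"]:
            counts[r["hometown"]] += 1
    return {town: 1e5 * counts[town] / pop for town, pop in population.items()}


def regiment_infection_risk(records):
    """For each municipality, average over its regiments of the share of
    peak-period deaths due to illness, excluding the municipality's own
    soldiers from each regiment (leave-one-out)."""
    total, ill = defaultdict(int), defaultdict(int)          # per regiment
    own_total, own_ill = defaultdict(int), defaultdict(int)  # per (town, regiment)
    for r in filter(in_peak, records):
        key = (r["hometown"], r["regiment"])
        total[r["regiment"]] += 1
        own_total[key] += 1
        if r["cause"] == "illness":
            ill[r["regiment"]] += 1
            own_ill[key] += 1

    risks = defaultdict(list)
    for (town, regiment), n_own in own_total.items():
        others = total[regiment] - n_own
        if others == 0:
            continue  # no soldiers from other municipalities to learn from
        risks[town].append((ill[regiment] - own_ill[(town, regiment)]) / others)
    # Assumption: unweighted mean across the regiments of each municipality.
    return {town: sum(v) / len(v) for town, v in risks.items()}


def rec(hometown, death_place, regiment, cause, death_date):
    return dict(hometown=hometown, death_place=death_place, regiment=regiment,
                cause=cause, death_date=death_date)


# Made-up records for three hypothetical municipalities A, B, C.
toy_records = [
    rec("A", "A", "R1", "illness", date(1918, 10, 10)),
    rec("B", "front", "R1", "illness", date(1918, 10, 12)),
    rec("B", "front", "R1", "combat", date(1918, 9, 1)),
    rec("C", "C", "R1", "illness", date(1918, 11, 1)),
    rec("A", "front", "R2", "combat", date(1918, 8, 15)),
    rec("C", "hospital", "R2", "illness", date(1918, 10, 20)),
    rec("B", "B", "R1", "illness", date(1917, 5, 1)),  # outside the peak window
]
toy_treatment = hometown_illness_deaths_per_100k(
    toy_records, {"A": 5_000, "B": 10_000, "C": 2_000})
toy_risk = regiment_infection_risk(toy_records)
```

Excluding a municipality's own soldiers means that its risk measure depends only on what happened to soldiers from other places serving in the same regiments, which is the logic footnote 15 describes.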

3.4. Other data

To conduct our analysis, we collected a set of additional municipal characteristics. First, we tracked municipal population size making use of the decennial census. In particular, we recovered these data from the censuses conducted in 1901 and in 1911. Moreover, we collected the share of males in the population as well as the literacy rate of the entire population and by gender. From the censuses, we also recovered important information on Italian municipalities such as the area in square kilometers, the administrative importance (i.e., whether the city is the capital of a province), and the geographic features (i.e., whether the city is in a mountain region, in the hills, or on a plain, as well as whether it is on the coast or not). We then computed municipality population density as the ratio between city population (in 1911) and city area. The average population of Italian cities in 1911 is 5,033 inhabitants, with an average growth rate of 5.8% between 1901 and 1911 and an average share of male population of 48%. The literate population is, on average, 55%, with substantial differences between the male (47%) and female population (62%). Finally, the average population density is 1.66 inhabitants per square kilometer (Appendix Table A.1, Panel D).

Second, we collected relevant local fiscal policy measures from the 1912 municipal budgets by digitizing the volume “Bilanci Comunali per l’anno 1912”, published by the Italian Statistical Office. We use data about total expenditure, total surplus, and various categories of spending: police and sanitation/hygiene services, justice and security, education, and public works. These characteristics help us understand the potential differences among municipalities characterized by different inflows of ill soldiers, our main explanatory variable. Total expenditures amount to 23.9 Lire per inhabitant in 1912.
On average, spending on police and sanitation/hygiene services represents 20.5% of total expenditures, spending on justice and security accounts for 0.9%, education for 17.9%, and public works for 14.3% (Appendix Table A.1, Panel D). Moreover, we measured the value of the Gini index in 2018, exploiting the municipal data on income declarations from the Italian Ministry of Economy and Finance.16 This indicator has an average value of 0.40 in the sample under analysis and of 0.39 in the entire sample of Italian municipalities (Appendix Table A.1, Panel B). This dataset does not contain individual incomes but only information about grouped income; we therefore rely on these group-specific data to construct the Gini index.17 Finally, our analysis is conducted on a sample of 1726 municipalities for which all the required information is available. Appendix Table A.4 reports the number of municipalities in the sample for each Italian region (Panel A) and province (Panel B).

16 See Giommoni (2019) for the use of this dataset to construct other types of income inequality indicators.
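As a rough illustration of how a Gini index can be built from grouped rather than individual data, the sketch below joins the group-level points of the Lorenz curve with straight lines and applies the trapezoid rule. The function name and the bracket figures are hypothetical, and this is not necessarily the exact procedure used for the 2018 index. Because linear interpolation ignores dispersion within each bracket, the result should be read as a lower bound.

```python
def gini_from_groups(groups):
    """Approximate Gini index from grouped income data.

    `groups` is a list of (number_of_taxpayers, total_income) pairs,
    one pair per income bracket (hypothetical input format).
    """
    # Drop empty brackets and order the rest from poorest to richest mean income.
    groups = sorted((g for g in groups if g[0] > 0), key=lambda g: g[1] / g[0])
    n_total = sum(n for n, _ in groups)
    y_total = sum(y for _, y in groups)

    cum_pop = 0.0   # cumulative population share
    cum_inc = 0.0   # cumulative income share
    area = 0.0      # area under the piecewise-linear Lorenz curve
    for n, y in groups:
        next_pop = cum_pop + n / n_total
        next_inc = cum_inc + y / y_total
        # Trapezoid between two consecutive Lorenz points.
        area += (next_pop - cum_pop) * (cum_inc + next_inc) / 2.0
        cum_pop, cum_inc = next_pop, next_inc

    return 1.0 - 2.0 * area


# Invented brackets: (taxpayers, total income declared by the bracket).
brackets = [(50, 100.0), (30, 200.0), (20, 700.0)]
gini = gini_from_groups(brackets)
```

With individual-level records, as in the 1924 data, the same formula applies with one "bracket" per taxpayer.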

4. Empirical approach

4.1. Estimation strategy

Our primary aim is to identify the short-/medium-run effect of the severity of the 1918 influenza on income inequality in Italian local jurisdictions. In an ideal setting in which the geographic spread of influenza was orthogonal to ex ante local inequality, we could simply estimate the following equation via OLS:

$$\text{Inequality}_{idp} = \beta\,\text{Influenza severity}_{idp} + \gamma X_{idp} + \eta_{idp} \tag{1}$$

where $\text{Inequality}_{idp}$ is a measure of income inequality in municipality $i$, part of district $d$ and province $p$, while $\text{Influenza severity}_{idp}$ denotes the harshness of the influenza at the municipality level. To improve the precision of the estimates, we can also add a set of fixed effects and pre-determined municipal characteristics as controls $X_{idp}$, while $\eta_{idp}$ denotes the error term. $\beta$ would be the coefficient of interest, identifying the nature of the relationship between inequality and influenza severity.

Unfortunately, we face data limitations when assessing the actual level of influenza severity. In fact, as already reported in Section 3, while our data collection allows us to measure income inequality at the local level, the only available information about deaths related to the influenza pandemic is reported at a higher level of aggregation, that is, the regional level. Moreover, even assuming that we could avail ourselves of influenza casualties at the municipal level, the use of an OLS estimator without accounting for potential endogeneity concerns would most likely produce biased coefficients. Indeed, there are a number of reasons to expect this variable to be, either directly or indirectly, related to pre-existing local economic conditions. For instance, in areas with many poor individuals, it was common for several families to share the same home, limiting the possibility of social distancing. Similarly, in poorer localities the absence of minimal hygienic conditions might have worsened the effect of the disease compared to richer areas.

To overcome these issues, we focus on a different variable that works as a proxy for the severity of the influenza and that we argue is less subject to endogeneity concerns. We take advantage of the well-documented role played by WWI soldiers in the diffusion of the 1918 influenza, as discussed in Section 2. Specifically, we link inequality with the municipal exposure to infected soldiers returning home from the war front.18 Therefore, we base our main results on the following equation:

$$\text{Inequality}_{idp} = \beta\,\text{Victims WWI for illness hometown}_{idp} + \gamma X_{idp} + \eta_{idp} \tag{2}$$

where all terms are as previously defined, while the main regressor Victims WWI for illness hometown is equal to the number of soldiers who returned to their hometowns from the war frontline and died due to a disease complication during the peak of the epidemic (August–December 1918). This variable is expressed in per capita terms (with 1911 population). Again, $\beta$ is our coefficient of interest. $X_{idp}$ includes a set of fixed effects and controls in order to improve the precision of the estimates. In particular, the full specification includes geography fixed effects (capturing whether the city is in a mountain region, on a hill, on a plain, or on the coast), district fixed effects, quartiles of taxpayers fixed effects (capturing the number of taxpayers) and the interaction between district and quartiles of taxpayers fixed effects. Moreover, the set of municipal-specific controls includes a province capital dummy, population density, share of literate population (total and by gender), share of male population and a set of budget variables, in per capita terms, from the 1912 municipal budget: total expenditures, budget surplus, spending on police and sanitation/hygiene services, spending on justice and security, spending on education and spending on public works. Finally, to account for correlation across cities within a district, standard errors are clustered at the district level.

Overall, our strategy hinges on two assumptions. First, our proxy is indeed correlated with the number of deaths due to influenza. Second, the proportion of ill soldiers returning to their hometown is plausibly exogenous, conditional on controls.19

17 This is a common practice when individual-level data are not available. See, for example, Sala-i-Martin (2006) and Davies et al. (2011).

18 This approach is similar in spirit to Hilt and Rahn (2020) and Correia et al. (2020), who exploit the distance to military camps in the US as an instrument for the severity of the 1918 influenza.
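To make the mechanics of Equation 2 concrete, the following is a minimal sketch of a regression with district fixed effects and district-clustered standard errors. It is deliberately stripped down and rests on several assumptions: a single regressor, only district fixed effects (absorbed by demeaning within district), no other controls, no small-sample correction, and invented data. The function names are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from math import sqrt


def within_demean(values, groups):
    """Subtract the group mean from each value (absorbs group fixed effects)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        sums[g] += v
        counts[g] += 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def fe_slope_clustered(y, x, district):
    """Slope of y on x with district fixed effects and a cluster-robust SE.

    The SE uses the plain sandwich formula, summing scores within each
    district, with no finite-sample adjustment.
    """
    y_w = within_demean(y, district)
    x_w = within_demean(x, district)
    sxx = sum(v * v for v in x_w)
    beta = sum(a * b for a, b in zip(x_w, y_w)) / sxx
    resid = [a - beta * b for a, b in zip(y_w, x_w)]

    # Sum x * residual within each cluster before squaring.
    score = defaultdict(float)
    for xi, ei, g in zip(x_w, resid, district):
        score[g] += xi * ei
    se = sqrt(sum(s * s for s in score.values())) / sxx
    return beta, se


# Invented data: log Gini rises by 0.5 per unit of standardized exposure,
# on top of a district-specific intercept.
district = ["A", "A", "A", "B", "B", "B"]
exposure = [0.0, 1.0, 2.0, 0.0, 1.0, 2.0]
log_gini = [1.0, 1.5, 2.0, 5.0, 5.5, 6.0]
beta, se = fe_slope_clustered(log_gini, exposure, district)
```

In practice one would rely on an econometrics package to handle the full set of controls, interacted fixed effects, and degrees-of-freedom corrections; the sketch only shows where the within-district variation and the clustering enter.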

4.2. Assessing the quality of the proxy

In this section, we test whether our main explanatory variable is a good measure of the severity of the epidemic at the municipal level. In particular, we want to show that the number of soldiers who died in their hometown during the pandemic as a result of some disease is positively correlated with the number of deaths due to influenza, with both variables expressed in per capita terms. To this end, we exploit geographic and yearly variation in the regional-level data and monthly variation in the national-level data.

First, we focus on regional data, as only at this level of aggregation do we have actual numbers for the casualties of influenza. Figure 4 and Appendix Table A.5 present the results graphically and numerically, respectively. Panel A of Figure 4 shows a positive relationship between the number of soldiers who died of a disease in their hometown in 1918 and the total number of deaths due to influenza. Therefore, we confirm the expected relationship between the two variables of interest.

One potential concern is that our proxy is informative not just of the link between casualties of infected soldiers who came back and the influenza severity in a specific locality, but more generally of the connection between the number of WWI victims and the influenza severity. If that were the case, our proxy would likely also detect variation across localities in the severity of WWI. To account for this aspect, we display in Figure 4, Panel B (and in Column 1 of Appendix Table A.5) the relationship between the number of deaths due to influenza and the total number of military victims outside their hometown. Reassuringly, we see that the sign of the relationship is the opposite of that of our interest: it seems that regions with more victims are less affected by the pandemic.

An additional possible issue is that the evidence shown so far could be mostly explained by pre-existing relationships with the variable of interest and could therefore be unrelated to the 1918 influenza outbreak. This seems unlikely as, in Panels C and D of Figure 4 and in Column 2 of Appendix Table A.5, we show that there is no significant relationship between the number of WWI victims (both by illness in hometown and total outside of hometown) and influenza casualties in 1917. Appendix Table A.5 reports the actual coefficient estimates of the scatter plots: Columns 1 and 2 show the analysis conducted for 1918 and 1917, while Columns 3 and 4 confirm that our evidence also holds when considering post-pandemic (1918–1920) and pre-pandemic (1915–1917) years.

Second, we exploit the time variation. Figure 5 displays the Italian monthly excess mortality from the beginning of 1915 to the end of 1920, compared to the same month for the period 1911–1913 (Panel A), and the monthly number of deaths of ill soldiers in their hometown (Panel B). The two time series appear to follow a very similar trend. It is comforting to see that both peak in October 1918 and that the two variables are highly correlated (0.874).20 Overall, our evidence tends to support the quality of our proxy variable.

19 To put it differently, one can consider the estimates from Equation 2 to be similar to a reduced form effect of an instrumental variables strategy, where the number of soldiers returning to their hometown who died due to a disease complication is the instrument for the endogenous variable, the count of influenza casualties in a municipality.

4.3. Exogeneity of the proxy variable

In this section, we discuss the exogeneity of our proxy variable and provide supporting evidence. Our causal identification rests on the idea that the probability for a soldier to return to his municipality of origin and eventually die there because of a disease, during the peak of the pandemic, is unrelated to the pre-existing local level of income inequality.

This condition may be violated, for instance, when soldiers from a more/less unequal municipality are more/less likely to get infected by the virus while at the front. Considering this possibility, it is worth stressing that the diffusion of the disease among soldiers across the country was mostly exogenous to municipal characteristics, including local inequality, for two reasons: 1) the spread of the virus among soldiers mainly occurred within the same regiment; 2) soldiers belonging to a given regiment were natives of municipalities located in different areas of Italy. We document these two aspects using our data. In Appendix Figure A.2 we display the relationship between our proxy variable and the variable infection risk - regiments, which captures for each municipality the average health risk of the regiments where its soldiers served (estimates are reported in Appendix Table A.6). The figure highlights a positive and significant correlation between the two variables. We find a coefficient of 0.276 (standard error=0.0253), suggesting that a one-standard-deviation increase in the infection risk variable raises our proxy by 27% of its standard deviation. In fact, cities more affected by the virus were also the ones whose soldiers happened to be enlisted in regiments with a higher circulation of the virus. In addition, using the data from the Albo d’oro, we find that on average there is a 90% probability that a soldier would be the only one from his hometown in a regiment.21 This suggests that there was very limited sorting of soldiers from the same municipality into specific regiments, and therefore that the infection risk indicator is unlikely to be correlated with the characteristics of the municipalities of origin of the regiment members. This also implies that clusters of virus diffusion among soldiers from the same municipality while at the front were unlikely.

With these results, however, we cannot categorically exclude that at least some soldiers got infected while at home rather than while in service. Nonetheless, given the previous results and the historical evidence, we believe this may represent a small fraction of our sample of victims. Therefore, the variation in our treatment seems unlikely to be determined by a spread of the influenza in a municipality that pre-dated the movements of soldiers.

So far, we have reported evidence that the spread of the virus is determined by idiosyncratic components. However, one might be concerned that there are local conditions, or municipal policies, that correlate with local inequality and might raise or lower the survival rate of returning sick soldiers. We address these concerns in two ways. First, we rely on a set of balance checks to determine whether cities with different inflows of infected soldiers showed systematic differences in initial characteristics. In particular, we perform a set of regressions to check whether the proxy is correlated with pre-1918 or time-invariant municipal features. Second, we are comforted by the fact that the inclusion of additional covariates and fixed effects only marginally affects the size of the effect of the treatment variable on inequality, hinting at limited omitted variable bias (more details in Section 5).

Table 1 reports the results of the balance checks. The main regressor is the number of ill soldiers who died in their hometown during the peak of the pandemic in per capita terms, while the dependent variable is a different covariate each time. Panel A focuses on demographic and geographic features, Panel B accounts for local policies and Panel C includes municipal literacy rates and the share of male population. All regressions include district fixed effects and, when possible, geographic fixed effects, number of contributors (quartile) fixed effects and the interaction between district and number of contributors (quartile) fixed effects, to be as close as possible to the most saturated specification that we use in the main analysis. It is reassuring to see that none of the estimated coefficients is statistically different from zero at conventional levels. The first set of regressions suggests that our treatment is not correlated with population growth, population density, the geographic characteristics of a municipality, or the number of contributors. In addition, the administrative status of a city does not matter (Province capital). Notably, the number of ill soldiers who died in their hometown is not correlated with population density, a variable that we should expect to have an independent effect on the number of influenza casualties. In the second panel, we show that the provision of local public goods is not correlated with the number of soldiers returning and eventually dying in their hometown. We account for total expenditure, budget surplus/deficit, expenditure on police and sanitation/hygiene services, expenditure on justice, expenditure on education and expenditure on public works. Importantly, our proxy variable is not associated with expenditure on sanitation and hygiene services. Finally, the table shows that our treatment variable is not correlated with the municipal literacy rate (whether we consider the entire population, the male population, or the female population) or with the share of male population.

20 This also suggests that the majority of soldiers who died of illness during the peak died of influenza.

21 For instance, considering the August 1918 outbreak in Parma, the 90 soldiers who died were from the same unit, but came from 83 different municipalities.

5. Results

We now present our central results. Table 2 reports the estimates of our main analysis. The explanatory variable is the number of soldiers who returned ill to their hometown and died there during the peak of the epidemic, August–December 1918, which captures the local severity of influenza.22 The dependent variable is the Gini index, computed from the municipal tax declarations in 1924 and expressed in logarithm.23 In each of the proposed specifications of Table 2, the estimates suggest that the cities hit more severely by the epidemic display higher levels of income inequality five years later, as measured by the Gini index. In Column (1) we show the effect without including any covariates, while from Column (2) to (5) we progressively add a rich set of fixed effects and municipal controls. In particular, a one-standard-deviation increase in our proxy variable leads to an increase in the Gini index of between 2.1% and 3%. The size of this effect is small, but not negligible: for instance, it is higher than that found by Furceri et al. (2020), who estimate the average effect of an epidemic on the Gini index to be between 0.75% and 1.25%. These relationships are also shown graphically via binned scatterplots in Appendix Figure A.3.24

It is reassuring for the causal interpretation of our results that the coefficient remains rather stable as we gradually include covariates. More specifically, in the spirit of Oster (2019), we estimate how large the influence of unobservables must be, in comparison to that of observables, to nullify the effect that we find. This may help to evaluate how likely the presence of omitted variable bias (OVB) is in the estimates. Larger values are associated with smaller chances that OVB would substantially affect our findings. Moving from the estimate of Column (1) to that of Column (5) there is a decrease in the effect of 0.8% and an increase in the $R^2$ from 0.006 to 0.351 ($= \tilde{R}$), which implies that the impact of unobservables must be no less than 6.14 ($= \delta$) times larger than that of observables to cancel out the effect.25

In Table 3, we explore which part of the income distribution is affected by the epidemic. We focus on the share of resources held by the top 20% (Panel A) and by the bottom 20% (Panel B) of the population. These estimates show that in municipalities in which the epidemic was most severe, (1) the fraction of resources held by the top 20% increases, and (2) the share of resources held by the bottom 20% decreases. A one-standard-deviation increase in the treatment increases the income share of the top 20% by 0.7%, while it reduces the income share of the bottom 20% by 0.2%. If we interpret the size of the coefficients relative to the mean of the dependent variables, we find a 1.3% increase for the top 20% and a 2.6% reduction for the bottom 20%. These findings suggest that the increase in inequality is mostly driven by an increase in the share of income produced by richer taxpayers as well as by an impoverishment of poorer groups of society.26 Further analysis confirming this evidence is presented in the next section (Figure 6).

Overall, these results show that the diffusion of the Spanish flu across Italian municipalities increased income inequality. Though data limitations make it difficult to clearly study the mechanisms underpinning our central evidence, it is worth highlighting the plausible channels suggested in the economic history literature. First, this increase in inequality may be due to the asymmetry in the contagion risk between poor and rich individuals, which may depend on several factors such as the difference in living standards and the unequal access to healthcare. This may have implied larger economic losses for poor people, aggravating inequalities. In fact, there is evidence that the poor face a significantly higher infection risk (and mortality rate) during epidemics, compared to the other socio-economic strata (Alfani, 2020a): this has been documented for the Spanish flu (Fourie and Jayes, 2021) but also for older epidemics such as the Black Death (Cummins et al., 2016; Alfani and Murphy, 2017) and the 19th century cholera pandemic (Ambrus et al., 2020). A second possible mechanism may be that poor individuals worked in more fragile business sectors as well as in low-skilled occupations. Therefore, the economic crisis that followed the epidemic may have hit those contributors more heavily. Many recent studies document this asymmetric effect on real wages for pre-industrial epidemics (Pamuk and Schatzmiller, 2014; Rota and Weisdorf, 2020).

An additional point to mention is that our results do not account for income deriving from salaries and from capital. We can expect the effects of the Spanish flu on these types of income, and therefore on income inequality, to be mixed. On the one hand, employees might face negative effects on their income either because they could become unemployed or because they may experience wage reductions, especially if employed in more vulnerable occupations or sectors. Therefore, we could also expect an exacerbation of inequality for this category of workers. On the other hand, capital rents are usually residual incomes for taxpayers. These contributors are typically located in the right tail of the income distribution, and therefore, the expected negative effect of the Spanish flu on this kind of income, if any, is likely to reduce income inequality.

22 The measure is expressed in per capita terms (with the 1911 population as a benchmark) and standardized.

23 The main results also emerge if we use the Gini index not expressed in logarithm.

24 Appendix Table A.7 also shows the coefficients and standard errors of the full set of control variables.

25 This is well above the cutoff value of $\delta = 1$ (Altonji et al., 2005; Oster, 2019). When $\delta = 1$, the observables are as important as the unobservables. To compute these measures, we follow Oster (2019) and use $R_{max} = 1.3 \times \tilde{R}$. We also tested alternative values of $R_{max}$. For instance, when we use $R_{max} = 2 \times \tilde{R}$, the effect of the unobservables must be no less than 1.78 times larger than that of observables to cancel out the effect.

26 One potential concern regarding these findings may arise if, in the cities most affected by the Spanish flu, a fraction of the poor taxpayers went out of business and no longer appeared in the tabulations in 1924. Even in this case, this would work against our findings, as it would mechanically increase the resources of the poor people in the cities heavily affected by the flu. Importantly, we show in the next section that the number of taxpayers in a city, after five years, is not affected by the severity of the flu. Overall, we find no evidence that our sample is censored.
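The coefficient-stability exercise discussed earlier in this section can be illustrated with the widely used linear approximation from Oster (2019), $\beta^* \approx \tilde{\beta} - \delta(\beta^\circ - \tilde{\beta})(R_{max} - \tilde{R})/(\tilde{R} - R^\circ)$, solved for the $\delta$ that drives $\beta^*$ to zero. The sketch below is only that shortcut: the reported values of $\delta$ presumably come from the full estimator, so the approximation should not be expected to reproduce them. The function name and the input numbers are hypothetical.

```python
def oster_delta_approx(beta_unc, beta_ctrl, r2_unc, r2_ctrl, r2_max):
    """Approximate delta that would push the bias-adjusted effect to zero.

    beta_unc, r2_unc   : coefficient and R-squared without controls
    beta_ctrl, r2_ctrl : coefficient and R-squared with the full controls
    r2_max             : assumed R-squared if unobservables were included
    """
    movement = beta_unc - beta_ctrl      # how much controls move the estimate
    headroom = r2_max - r2_ctrl          # variance left for unobservables
    if movement == 0 or headroom == 0:
        raise ValueError("delta is undefined when the coefficient or R-squared cannot move")
    return beta_ctrl * (r2_ctrl - r2_unc) / (movement * headroom)


# Hypothetical inputs chosen for easy arithmetic, not taken from Table 2.
delta = oster_delta_approx(beta_unc=0.03, beta_ctrl=0.02,
                           r2_unc=0.0, r2_ctrl=0.5, r2_max=0.75)
```

A value of delta above one means that, under this approximation, unobservables would need to matter more than the included controls to explain away the estimate.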

6. Additional analysis and robustness checks

6.1. Analysis with individual data

In this section we discuss a set of additional analyses and robustness checks. First, we aim to complement the main results by exploiting the availability of fine-grained data on individuals’ incomes. The main objective of this exercise is to provide evidence that the effect is economically significant and to better understand its distributional impact.27 As a first step, we start from the municipal-level dataset and compute residuals after regressing the treatment variable on our set of controls and fixed effects, as defined in Column (5) of Table 2. Next, we label municipalities depending on whether they belong to the first or third tertile of the distribution of the residuals. Finally, we merge this information with individual-level incomes. As a result, we have an individual-level dataset in which taxpayers are grouped based on the level of exposure to the flu of their municipality of residence.

These data allow us to compute the Lorenz curves of the two groups of taxpayers and verify whether there is a visible difference. We display this result in Panel A of Figure 6. Consistent with the main findings, the Lorenz curve derived from individuals belonging to the first tertile of the distribution is always above the one computed from individuals belonging to the third tertile. This suggests that income inequality in the latter group is more pronounced than in the former. Moreover, the distance between the two curves visibly decreases towards the top part of the cumulative distribution, suggesting that most of the differences are in the mid/lower part of the distribution.

To further investigate the distributional effect of the pandemic, we compute the difference in the fraction of income declared by each decile of the distribution between the groups of taxpayers in the third and first tertile of the treatment. The result is displayed in Panel B of Figure 6. Also in this case, we find validation of our main evidence: the top deciles gained and the bottom ones lost. Interestingly, we show that the reduction in income share is not confined to the poorest part of the population, as there is a persistent negative difference up until the sixth decile.
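The comparison behind the two panels of Figure 6 can be sketched as follows: compute a Lorenz curve for each exposure group, then difference the income shares decile by decile. The incomes below are invented, the function names are hypothetical, and the step that assigns municipalities to tertiles of the residualized treatment is not shown.

```python
def lorenz_curve(incomes, points=10):
    """Cumulative income share of the poorest k/points taxpayers, k = 0..points."""
    ordered = sorted(incomes)
    total = sum(ordered)
    n = len(ordered)
    curve = [0.0]
    for k in range(1, points + 1):
        cutoff = (k * n) // points   # number of taxpayers up to this quantile
        curve.append(sum(ordered[:cutoff]) / total)
    return curve


def decile_shares(incomes):
    """Share of total income accruing to each decile, poorest first."""
    curve = lorenz_curve(incomes, points=10)
    return [curve[k] - curve[k - 1] for k in range(1, 11)]


# Invented incomes for taxpayers in low- and high-exposure municipalities.
low_exposure = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
high_exposure = [1, 1, 1, 1, 1, 1, 1, 1, 1, 46]

lorenz_low = lorenz_curve(low_exposure)
lorenz_high = lorenz_curve(high_exposure)

# High-exposure minus low-exposure share, decile by decile.
share_gap = [h - l for h, l in zip(decile_shares(high_exposure),
                                   decile_shares(low_exposure))]
```

In this toy example the low-exposure Lorenz curve lies weakly above the high-exposure one, and the gap in shares is negative for the lower deciles and positive for the top decile, which is the qualitative pattern the section describes.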

6.2. Alternative treatment definition, dependent variables, and heterogeneous effects

In Table 4 we report coefficient estimates for alternative definitions of the treatment variable. In Column (1) of Table 4 we use an explanatory variable expressed in binary terms: in particular, we define the variable Victims WWI illness (hometown) - binary as equal to one if a municipality has a positive inflow of ill soldiers who died there during the pandemic peak, and zero otherwise. The impact of this treatment is larger than that of the main variable: treated cities experienced an average increase in the Gini index of 4.4%. In Column (2) of Table 4, we use the variable Victims WWI illness (place of death): this variable assigns to a municipality the number of soldiers who died there of illness (in the months August–December 1918), regardless of whether that locality was their hometown, and it is larger, on average, than our main explanatory variable. This alternative treatment has a positive and significant effect on the Gini index, but the magnitude is smaller than the main one. This is reasonable, as a soldier on leave is more likely to have multiple interactions with the community, and this may spread the contagion more than infected troopers who arrived in a non-hometown city. In Columns (3) and (4) we test two placebo treatment variables. In Column (3) we evaluate the impact of the number of soldiers from a municipality who died of illness outside their hometown (in the months August–December 1918), while in Column (4) we test the overall exposure of a locality to the war by measuring the total number of WWI victims from a specific municipality, excluding those victims considered in our main treatment. In both cases, we find no statistically significant effect, and the size of the coefficient is significantly smaller than the main effect estimated in Table 2. Also, it is worth noting that, in unreported estimates, when we add these two placebo treatment variables as covariates to the preferred specification, our main result is not affected and these placebos remain statistically insignificant, suggesting a likely independence between these variables.

Moreover, in Appendix Table A.8 we report results using alternative dependent variables. In Columns (1) to (4) we use alternative definitions of inequality, while in the last two columns (5 and 6) we use as dependent variables two additional economic measures constructed from the tax declaration data. In Column (1) we focus on the Gini net, constructed with individual incomes net of taxes, in Column (2) on the income standard deviation, in Column (3) on the Theil index and in Column (4) on the coefficient of variation. These indicators are expressed in logarithm. In all these cases, a positive and significant effect emerges. Column (5) of Appendix Table A.8 shows the impact on the total income declared (per capita): no significant effects emerge. Furthermore, Column (6) of Appendix Table A.8 shows that the Spanish flu does not affect the number of taxpayers, also in per capita terms. These results suggest that the epidemic did not have a substantial effect on economic activities, as proxied by declared income, in the medium term.

Finally, we test for potential heterogeneous effects. In Appendix Table A.9 we report split-sample estimates. In Columns (1) and (2) we focus, respectively, on the samples of municipalities from the North and the South of Italy. In Columns (3) and (4), instead, we split the sample depending on the level of GDP, limiting the analysis to municipalities belonging to either the poorest or the richest Italian regions, that is, those with per capita GDP below and above the median level, respectively. Overall, we find that the effect does not seem to depend on these specific dimensions, as in all four columns we find a positive and significant effect.28 Next, we evaluate whether there is a heterogeneous impact of the pandemic depending on the pre-epidemic local policies. We focus on those policies that are more likely to modify the effects of the flu. In particular, we consider these components of the budget: total expenditures, spending on hygiene and police, and spending on public works. To capture such heterogeneity, we estimate models in which we add to the main specification, one at a time, an interaction term between the treatment variable and a public finance indicator (expressed in per capita terms). The results, reported in Appendix Table A.10, show that the interaction term is never statistically significant, suggesting a limited role of local policies in altering the effect of the pandemic on inequality.

27 In addition, this analysis implicitly reduces the potential pitfalls due to individual outliers that one could be worried about when running the analysis at the municipal level.

6.3. Long-term effects

In this section we attempt to relate our findings to the existing economic literature on persistence. While the main analysis addresses the short-/medium-term effect of the Spanish flu on income inequality, here we provide some initial, and mostly suggestive, evidence of its long-term effects. Despite the potential limitations that are typical of this kind of empirical exercise (see, for example, Voth (2020) for a detailed survey on the topic), these results complement existing research on the long-term effect of the Spanish flu that, unlike our paper, typically focuses on individual-level data (Almond, 2006; Lin and Liu, 2014; Percoco, 2016; Beach et al., 2018; Guimbeau et al., 2020; Aassve et al., 2021).

To conduct this analysis, we constructed a measure of municipal inequality in 2018, which allows us to study the effect of Spanish flu severity on income inequality after 100 years. Specifically, we replicate the main analysis with the Gini index in 2018 as the dependent variable, using, alternatively, the continuous and the binary version of our treatment variable.29 This allows us to measure inequality for each municipality in Italy. Figure 7 displays a simplified version of the results graphically, while the actual estimates are presented in Appendix Table A.11. Panel A of Figure 7 shows the difference in the average Gini index in 1924 between municipalities that had at least one soldier who died of illness in the hometown during the peak of the pandemic (August–December 1918) and those that did not, while Panel B shows the same relationship using the Gini index in 2018. In both cases, there is a significant difference between the two groups, with the municipalities in which the treatment is zero reporting a lower level of income inequality. From the estimated coefficients, when using the saturated specification and the binary index, we find that treated municipalities have a Gini index in 2018 that is 1.4% (significant at the 1% level) higher using the whole sample of Italian municipalities, and 1.2% (significant at the 5% level) higher with the sample of municipalities used in the main analysis.30 When using the continuous treatment variable, instead, the coefficient is still positive, but not always statistically significant.

These results provide some initial evidence of the long-lasting effect of the 1918 influenza pandemic on inequality. Nevertheless, these findings should be taken with due caution: first, they should be interpreted as simple correlations, without any causal claim; second, this link is merely suggestive, as the analysis clearly overlooks the important changes experienced by Italian society in the last century; finally, this analysis documents the persistence of an already narrow effect.

28 In unreported estimates, we find that the interaction terms from an interaction analysis are not significantly different from zero, in both cases.

29 As mentioned, for 2018 we do not have individual income declarations; therefore, the inequality index has been constructed from grouped data. One drawback of this approach is that we cannot remove from the sample the 0.5% richest taxpayers in 2018, as we did in the main analysis on the 1924 data.

30 The fact that the long-term effect is very similar when considering the broader sample of Italian municipalities and our restricted one weakens the concern that our main results are merely driven by the restricted sample of municipalities for which we have income data.

6.4. Robustness checks

In this section, we briefly discuss a series of robustness checks that are reported and described in Appendix B. First, we test the main model dividing the treatment into four-month periods, and we show that the effect emerges when the pandemic was particularly severe, i.e., in the last quarter of 1918 (Appendix Figure A.4). Second, we show that the main results are robust to different definitions of the treatment in terms of the time window under consideration (Appendix Table A.12). Third, we conduct a sensitivity analysis by excluding one region at a time, and the main results are always confirmed (Appendix Table A.13). Fourth, we test the stability of the main findings to the exclusion of groups of municipalities depending on the number of taxpayers (Appendix Figure A.5). Fifth, we show that the results are robust to the exclusion of those municipalities with high rates of enrollment in the army (Appendix Table A.14, Columns 1-2). Sixth, we conduct the analysis using the main treatment (of Table 2) and the placebo variables (of Table 4) in non-standardized form in order to capture the effect in absolute terms (Appendix Table A.14, Columns 3-5). Seventh, we confirm that the main findings also emerge using a Generalized Linear Model (GLM) with a Poisson regression (Appendix Table A.14, Column 6). Finally, we show that the main effects do not depend on i) the presence of prisoners’ camps, ii) the administrative importance (i.e., the status of province capital), and iii) the proximity to the Italian war front (Appendix Table A.14, Columns 7-10).
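The leave-one-region-out check mentioned above can be illustrated with a short sketch. It is deliberately simplified: a bivariate OLS with an intercept stands in for the full specification with controls and fixed effects, and the names `ols_slope` and `leave_one_group_out` are hypothetical. The data in the usage lines are synthetic.

```python
import numpy as np


def ols_slope(y: np.ndarray, x: np.ndarray) -> float:
    """Slope of a bivariate OLS regression of y on x with an intercept."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[1])


def leave_one_group_out(y, x, groups) -> dict:
    """Re-estimate the slope after dropping each group (e.g. region) in turn."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups)
    results = {}
    for g in np.unique(groups):
        keep = groups != g  # drop every observation belonging to group g
        results[g.item()] = ols_slope(y[keep], x[keep])
    return results


# Synthetic example: outcome, treatment and a region label per municipality.
rng = np.random.default_rng(0)
treatment = rng.normal(size=300)
log_gini = -0.9 + 0.02 * treatment + rng.normal(scale=0.1, size=300)
region = rng.choice(["North", "Centre", "South"], size=300)
for name, slope in leave_one_group_out(log_gini, treatment, region).items():
    print(name, round(slope, 4))
```

If the estimated slope keeps the same sign and a similar magnitude whichever region is dropped, no single region is driving the result, which is the logic behind Appendix Table A.13.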

7. Conclusion

In this paper, we provide some initial evidence for the effect of pandemics on income inequality using the Italian experience with the 1918 influenza as a case study. Our analysis is based on the geographic variation in the severity of the disease, induced by the plausibly exogenous presence of ill soldiers in local jurisdictions. The main results suggest that pandemics of this type, i.e., highly widespread but associated with a relatively low mortality rate, increase income inequality.31 In the short-/medium-term, this effect mainly comes from a reduction in the share of income generated by the low- and middle-income population and from an increase in the income of the top earners. We are clearly aware that our study focuses on a specific historical setting and that it is not directly comparable with the pandemics of previous centuries, nor with the most recent ones. This implies that our results are difficult to generalize and raises external validity issues. Still, our findings are in line with existing evidence focusing on epidemics of the current century (Furceri et al., 2020), supporting the possibility that a similar pattern may occur as a result of future pandemics. Therefore, also in light of potential long-term effects of pandemics, this study’s outcomes would call for interventions aimed at attenuating potential distributive consequences that may increase inequalities also for future generations.

There are many possible lines of research that future studies may follow. First, the main channels at the basis of our results require further exploration: understanding under which conditions pandemics may trigger inequality is important, both from a positive and a normative perspective. Second, the study of which policies are more successful in limiting the distributive effects of epidemics is of vital importance to avoid a surge in future levels of inequality.

31 The overall mortality rate of 1918 influenza in Italy was around 1% (Aassve et al., 2021). This figure is much lower compared to older pandemics in history: the Black Death, for example, had a mortality rate of around 50% and led to opposite distributional effects (Alfani, 2020b). This suggests that the mortality rate could play a crucial role in affecting the economic consequences of pandemics, including those on inequality.

References

Aassve, A., Alfani, G., Gandolfi, F., Le Moglie, M., 2021. Epidemics and trust: the case of the Spanish flu. Health Economics 30, 840–857.
Adams-Prassl, A., Boneva, T., Golin, M., Rauh, C., 2020. Inequality in the impact of the coronavirus shock: Evidence from real time surveys. Journal of Public Economics 189, 104245.
Alfani, G., 2015. Economic inequality in northwestern Italy: A long-term view (fourteenth to eighteenth centuries). The Journal of Economic History 75, 1058–1096.
Alfani, G., 2020a. The economic history of poverty, 1450-1800, in: The Routledge History of Poverty, c. 1450–1800. Routledge.
Alfani, G., 2020b. Epidemics, inequality, and poverty in preindustrial and early industrial time. Journal of Economic Literature forthcoming.
Alfani, G., Ammannati, F., 2017. Long-term trends in economic inequality: the case of the Florentine state, c. 1300–1800. The Economic History Review 70, 1072–1102.
Alfani, G., Melegaro, A., 2010. Pandemie d’Italia: dalla peste nera all’influenza suina: l’impatto sulla società. Cultura e società, Egea.
Alfani, G., Murphy, T., 2017. Plague and lethal epidemics in the pre-industrial world. Journal of Economic History 77, 314–43.
Almond, D., 2006. Is the 1918 influenza pandemic over? Long-term effects of in utero influenza exposure in the post-1940 US population. Journal of Political Economy 114, 672–712.
Alon, T.M., Doepke, M., Olmstead-Rumsey, J., Tertilt, M., 2020. The impact of COVID-19 on gender equality. Technical Report. National Bureau of Economic Research.
Altonji, J.G., Elder, T.E., Taber, C.R., 2005. An evaluation of instrumental variable strategies for estimating the effects of Catholic schooling. Journal of Human Resources 40, 791–821.
Ambrus, A., Field, E., Gonzalez, R., 2020. Loss in the time of cholera: Long-run impact of a disease epidemic on the urban landscape. American Economic Review 110, 475–525.
Atkinson, A.B., Morelli, S., 2011. Economic crises and inequality. UNDP-HDRO Occasional Papers 6.
Beach, B., Ferrie, J.P., Saavedra, M.H., 2018. Fetal shock or selection? The 1918 influenza pandemic and human capital development. Technical Report. National Bureau of Economic Research.
Blundell, R., Costa Dias, M., Joyce, R., Xu, X., 2020. Covid-19 and inequalities. Fiscal Studies 41, 291–319.
Bui, A.T., Dungey, M., Nguyen, C.V., Pham, T.P., 2014. The impact of natural disasters on household income, expenditure, poverty and inequality: evidence from . Applied Economics 46, 1751–1766.
Capodarca, V., 1991. Le ultime voci della grande guerra. Tra le righe.
Carillo, M.F., Jappelli, T., 2020. Pandemic and local economic growth: Evidence from the Great Influenza in Italy. Technical Report. Covid Economics 10: 1-23.
Correia, S., Luck, S., Verner, E., 2020. Pandemics depress the economy, public health interventions do not: Evidence from the 1918 flu. Technical Report.
Crosby, A., 1989. America’s forgotten pandemic: The influenza of 1918. Cambridge University Press, Cambridge.
Cummins, N., Kelly, M., Ó Gráda, C., 2016. Living standards and plague in London, 1560-1665. Economic History Review 69, 3–34.
Cutolo, F., 2020. L’influenza spagnola del 1918-1919. La dimensione globale, il quadro nazionale e un caso locale. I.S.R.Pt Editore.
Dahl, C.M., Hansen, C.W., Jensen, P.S., 2020. The 1918 epidemic and a V-shaped recession: Evidence from municipal income data. Technical Report.
Daniele, V., Malanima, P., 2007. Il prodotto delle regioni e il divario nord-sud in Italia (1861-2004). Rivista di politica economica 97, 267–316.
Davies, J.B., Sandström, S., Shorrocks, A., Wolff, E.N., 2011. The level and distribution of global household wealth. The Economic Journal 121, 223–254.
Dell, F., 2005. Top incomes in Germany and Switzerland over the twentieth century. Journal of the European Economic Association 3, 412–421.

Fourie, J., Jayes, J., 2021. Health inequality and the 1918 influenza in . World Development 141, 105407.
Furceri, D., Loungani, P., Ostry, J.D., Pizzuto, P., 2020. Will Covid-19 affect inequality? Evidence from past pandemics. Technical Report. Covid Economics 12: 138-57.
Galasso, V., 2020. Covid: not a great equaliser. Covid Economics 19 (2020): 241-265.
Giommoni, T., 2019. Does progressivity always lead to ? the impact of local redistribution on tax manipulation. CESifo Working Paper Series No. 7588.
Guimbeau, A., Menon, N., Musacchio, A., 2020. The Brazilian bombshell? The long-term impact of the 1918 influenza pandemic the South American way. Technical Report. National Bureau of Economic Research.
Herring, D.A., Sattenspiel, L., 2011. Death in winter: Spanish flu in the Canadian subarctic, in: Killingray, D., Phillips, H. (Eds.), The Spanish Influenza Pandemic of 1918-1919: New Perspectives. Routledge Studies in the Social History of Medicine. chapter 10.
Hilt, E., Rahn, W., 2020. Financial asset ownership and political partisanship: Liberty bonds and Republican electoral success in the 1920s. The Journal of Economic History 80, 746–781.
Istituto Centrale di Statistica, 1958. Istituto Centrale di Statistica: Cause di Morte (1887-1955).
Johnson, N.P., Mueller, J., 2002. Updating the accounts: global mortality of the 1918-1920 “Spanish” influenza pandemic. Bulletin of the History of Medicine, 105–115.
Karlsson, M., Nilsson, T., Pichler, S., 2014. The impact of the 1918 Spanish flu epidemic on economic performance in : An investigation into the consequences of an extraordinary mortality shock. Journal of Health Economics 36, 1–19.
Lin, M.J., Liu, E.M., 2014. Does in utero exposure to illness matter? The 1918 influenza epidemic in Taiwan as a natural experiment. Journal of Health Economics 37, 152–163.
Mortara, G., 1925. La salute pubblica in Italia durante e dopo la guerra. v. 9, G. Laterza & figli.
O’Donoghue, C., Sologon, D.M., Kyzyma, I., , McHale, J., 2020. Modelling the distributional impact of the covid-19 crisis. Fiscal Studies 41, 321–336.

Oster, E., 2019. Unobservable selection and coefficient stability: Theory and evidence. Journal of Business & Economic Statistics 37, 187–204.
Oxford, J.S., Lambkin, R., Sefton, A., Daniels, R., Elliot, A., Brown, R., Gill, D., 2005. A hypothesis: the conjunction of soldiers, gas, pigs, ducks, geese and horses in northern France during the Great War provided the conditions for the emergence of the “Spanish” influenza pandemic of 1918–1919. Vaccine 23, 940–945.
Pamuk, S., Schatzmiller, M., 2014. Plagues, wages, and economic change in the Islamic Middle East, 700–1500. The Journal of Economic History 74, 196–229.
Percoco, M., 2016. Health shocks and human capital accumulation: the case of Spanish flu in Italian regions. Regional Studies 50, 1496–1508.
Piketty, T., 2003. Income inequality in France, 1901–1998. Journal of Political Economy 111, 1004–1042.
Piketty, T., 2005. Top income shares in the long run: An overview. Journal of the European Economic Association 3, 382–392.
Piketty, T., Saez, E., 2003. Income inequality in the , 1913-1998. Quarterly Journal of Economics 118, 1–39.
Rota, M., Weisdorf, J., 2020. Italy and the little divergence in wages and prices: New data, new results. Journal of Economic History.
Sala-i-Martin, X., 2006. The world distribution of income: falling poverty and… convergence, period. The Quarterly Journal of Economics 121, 351–397.
Santos Silva, J.M.C., Tenreyro, S., 2006. The log of gravity. The Review of Economics and Statistics 88, 641–658.
Taubenberger, J.K., Morens, D.M., 2006. 1918 influenza: the mother of all pandemics. Emerging Infectious Diseases 12(1), 15–22.
Tavernini, L., 2001. Prigionieri austro-ungarici nei campi di concentramento italiani 1915–1920. Annuali Museo Storico Italiano della Guerra. Roverete-Museo della Guerra 9/10: 11.
Tognotti, E., 2015. La “spagnola” in Italia. Storia dell’influenza che fece temere la fine del mondo (1918-1919). FrancoAngeli.
Tortato, A., 2004. La prigionia di guerra in Italia: 1915-1919. Ugo Mursia Editore.
Voigtländer, N., Voth, H.J., 2013. The three horsemen of riches: Plague, war, and urbanization in early modern Europe. Review of Economic Studies 80, 774–811.
Voth, H.J., 2020. Persistence: myth and mystery. Available at SSRN.
Winter, J., 2010. L’influenza spagnola. in: La prima guerra mondiale, a cura di S. Audoin-Rouzeau, J. Becker, Einaudi, Torino 2010, pp. 288.
Yamamura, E., 2015. The impact of natural disasters on income inequality: analysis using panel data during the period 1970 to 2004. International Economic Journal 29, 359–374.
Zugaro, F., 1926. Bollettino dell’ufficio storico. L’albo d’oro dei caduti per l’Italia nella guerra mondiale.

Figure 1: Number of deaths from Influenza in Italy

Notes: The plot shows the number of deaths from Influenza over time, according to the publication Cause di Morte (1887-1955) from Istituto Centrale di Statistica (1958). The count includes the number of victims for flu and pneumonia.

Figure 2: Distribution of Gini index and war-related casualties by illness (hometown)

(A) Gini index - 1924 (B) Death by Illness (hometown) - 1918

Notes: Panel A shows the geographic distribution of the Gini index across Italian municipalities. Panel B shows the number of soldiers who died of illness in their hometown per 100,000 inhabitants. The cutoffs are set according to the variables’ quintiles. The areas marked by grey lines indicate the regions of Italy that were not part of the country in 1918 (the region of Trentino-Alto Adige and the provinces of Gorizia and Trieste).

Figure 3: Italian victims of WWI - over time

Notes: The plot shows the numbers of Italian victims in WWI according to the Albo dei Caduti Italiani della Grande Guerra over time considering all causes of death, illness and illness in the hometown of the victims.

Figure 4: WWI victims and influenza

(A) Death by Illness (hometown) - 1918 (B) Other kinds of deaths - 1918

(C) Death by Illness (hometown) - 1917 (D) Other kinds of deaths - 1917

Notes: Panel A shows the correlation between the number of soldiers who died of illness in their hometown and the number of victims of influenza for Italian regions in 1918. Panel B shows the correlation between the number of military victims outside their hometown and the number of victims of influenza for Italian regions in 1918. Panel C shows the correlation between the number of soldiers who died of illness in their hometown and the number of victims of influenza for Italian regions in 1917. Panel D shows the correlation between the number of military victims outside their hometown and the number of victims of influenza for Italian regions in 1917. All the variables are expressed in per capita terms, with 1911 population as the benchmark. The number of victims of influenza has been residualized with the control variables of Table A.5 (Columns 1 and 2).

Figure 5: Death rate vs Italian victims of WWI - monthly data

(A) Death rate Italian population (B) Victims WWI-illness (hometown)

Notes: The left plot shows the Italian death rate over time, according to Mortara (1925). The average in the triennium 1911-1913 is the benchmark, with value of 100. The right plot shows the number of victims of illness who died in their hometown from “Albo caduti”. Correlation= 0.874.

Figure 6: Individual level data analysis

(A) Lorenz curves (B) Difference in income shares (by decile)

Notes: The plot in Panel (A) shows the Lorenz curves constructed using the income data at the individual level. “1st treatment tertile” indicates the taxpayers from cities with residuals in the first tertile, while “3rd treatment tertile” indicates the taxpayers from cities with residuals in the third tertile. The value of the Gini index for individuals in the former group is 0.507, while the one for individuals in the latter group is 0.536. The plot in Panel (B) shows the difference in the fraction of income, by decile, between the groups of taxpayers in the third tertile of the treatment and in the first tertile.
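The quantities plotted in Figure 6 (Lorenz curves, the group-level Gini index, and decile income shares) can be computed from individual incomes along the following lines. This is a minimal sketch with hypothetical function names; the incomes in the usage lines are simulated and do not reproduce the values 0.507 and 0.536 reported in the notes.

```python
import numpy as np


def lorenz_curve(incomes):
    """Return (population share, cumulative income share) points."""
    x = np.sort(np.asarray(incomes, dtype=float))
    if x.size == 0 or x.sum() <= 0:
        raise ValueError("need at least one income and a positive total")
    cum_income = np.concatenate([[0.0], np.cumsum(x)]) / x.sum()
    cum_pop = np.linspace(0.0, 1.0, x.size + 1)
    return cum_pop, cum_income


def gini(incomes) -> float:
    """Gini index as 1 minus twice the area under the Lorenz curve."""
    pop, inc = lorenz_curve(incomes)
    area = np.sum((pop[1:] - pop[:-1]) * (inc[1:] + inc[:-1]) / 2.0)
    return float(1.0 - 2.0 * area)


def decile_shares(incomes) -> np.ndarray:
    """Share of total income held by each population decile (poorest first)."""
    pop, inc = lorenz_curve(incomes)
    cuts = np.interp(np.linspace(0.0, 1.0, 11), pop, inc)
    return np.diff(cuts)


# Simulated incomes for two groups of taxpayers (illustrative only).
rng = np.random.default_rng(1)
low_exposure = rng.lognormal(mean=8.0, sigma=0.9, size=5000)
high_exposure = rng.lognormal(mean=8.0, sigma=1.0, size=5000)
print(gini(low_exposure), gini(high_exposure))
# Analogue of Panel (B): difference in decile shares between the two groups.
print(decile_shares(high_exposure) - decile_shares(low_exposure))
```

Since each vector of decile shares sums to one, the differences across deciles sum to zero: a gain at the top must be matched by losses elsewhere in the distribution.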

Figure 7: Inequality in the long-run

(A) Gini index in 1924 (B) Gini index in 2018

Notes: The plots show the Gini index in 1924 (left plot) and in 2018 (right plot) for Italian municipalities, distinguishing between cities where no ill soldiers came back from the front and those with a positive number of ill soldiers who returned in the period August-December 1918.

Table 1: Correlation with municipal characteristics

Panel A. Municipal demographic and geographical features
Dependent variables: (1) Population growth rate (1901-1911); (2) Population density 1911; (3) Geographical zone; (4) N taxpayers quartile; (5) Province capital.

                                        (1)       (2)       (3)       (4)       (5)
Victims WWI - illness (hometown)        -0.002    -0.029    0.010     0.015     0.003
                                        (0.002)   (0.031)   (0.028)   (0.025)   (0.003)
N                                       1710      1726      1726      1726      1726
R2                                      0.366     0.302     0.635     0.344     0.198
Geography FE                            Yes       Yes       No        Yes       Yes
N taxpayers quartile FE                 Yes       Yes       Yes       No        Yes
District FE                             Yes       Yes       Yes       Yes       Yes
N taxpayers quartile FE × District FE   Yes       Yes       Yes       No        Yes

Panel B. Municipal budget (1912)
Dependent variables: (1) Total expenditures; (2) Budget surplus; (3) Police, Hygiene expenditure; (4) Justice expenditure; (5) Education expenditure.

                                        (1)       (2)       (3)       (4)       (5)
Victims WWI - illness (hometown)        -1.297    -0.013    -0.026    -0.031    -0.354
                                        (0.817)   (0.039)   (0.204)   (0.019)   (0.226)
N                                       1726      1726      1726      1726      1726
R2                                      0.277     0.190     0.175     0.247     0.207
Geography FE                            Yes       Yes       Yes       Yes       Yes
N taxpayers quartile FE                 Yes       Yes       Yes       Yes       Yes
District FE                             Yes       Yes       Yes       Yes       Yes
N taxpayers quartile FE × District FE   Yes       Yes       Yes       Yes       Yes

Panel C. Municipal literacy and male population
Dependent variables: (1) Public work expenditure; (2) Share literate; (3) Share male literate; (4) Share female literate; (5) Share male population.

                                        (1)       (2)       (3)       (4)       (5)
Victims WWI - illness (hometown)        -0.511    -0.000    -0.001    0.002     -0.000
                                        (0.395)   (0.003)   (0.003)   (0.003)   (0.001)
N                                       1726      1726      1726      1726      1726
R2                                      0.239     0.791     0.712     0.787     0.455
Geography FE                            Yes       Yes       Yes       Yes       Yes
N taxpayers quartile FE                 Yes       Yes       Yes       Yes       Yes
District FE                             Yes       Yes       Yes       Yes       Yes
N taxpayers quartile FE × District FE   Yes       Yes       Yes       Yes       Yes

Notes: The dependent variable is reported in the column head. Victims WWI - illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. Robust standard errors clustered at the district level are in parentheses: * p < 0.1, ** p < 0.05 and *** p < 0.01.

Table 2: Impact of WWI victims on inequality indicators: Gini index.

                                        (1)       (2)       (3)       (4)       (5)
Victims WWI - illness (hometown)        0.030***  0.026***  0.021***  0.022***  0.022***
                                        (0.008)   (0.008)   (0.007)   (0.007)   (0.007)
N                                       1726      1726      1726      1726      1726
R2                                      0.006     0.044     0.240     0.292     0.351
Municipal controls                      No        Yes       Yes       Yes       Yes
Province FE                             No        No        Yes       No        No
Geography FE                            No        No        Yes       Yes       Yes
N taxpayers quartile FE                 No        No        Yes       Yes       Yes
District FE                             No        No        No        Yes       Yes
N taxpayers quartile FE × District FE   No        No        No        No        Yes

Notes: The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. Municipal controls include a dummy for whether the city is a province capital, population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public work (all budget variables are in per capita terms and refer to the year 1912), share of literate population, share of male literate, share of female literate and share of male population. Fixed effects are described in section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.
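Because the dependent variable in Table 2 is the log of the Gini index and the treatment is standardized, a coefficient can be read as the approximate proportional change in the Gini index associated with a one-standard-deviation increase in the treatment. The short sketch below makes the arithmetic explicit for the coefficient in Column (5); the helper name `pct_change_from_log_coef` is hypothetical, and the conversion is the standard log-level transformation rather than a calculation taken from the paper.

```python
import math


def pct_change_from_log_coef(beta: float, delta: float = 1.0) -> float:
    """Percentage change in the outcome implied by a log-level coefficient.

    beta  -- coefficient from a regression of log(outcome) on the regressor
    delta -- change in the regressor (1.0 = one standard deviation when the
             regressor is standardized)
    """
    return 100.0 * (math.exp(beta * delta) - 1.0)


# Column (5) of Table 2: a coefficient of 0.022 on the standardized treatment.
effect = pct_change_from_log_coef(0.022)
print(f"{effect:.2f}% higher Gini index per standard deviation of treatment")
```

For coefficients this small the exact transformation and the usual approximation (100 times the coefficient) nearly coincide, so the estimate corresponds to a Gini index roughly 2.2% higher per standard deviation of the treatment.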

Table 3: Impact of WWI victims on inequality indicators: income shares.

                                        (1)       (2)       (3)       (4)       (5)
Panel A: Income share of top 20%
Victims WWI - illness (hometown)        0.008***  0.007***  0.006**   0.006**   0.007**
                                        (0.003)   (0.003)   (0.003)   (0.003)   (0.003)
N                                       1726      1726      1726      1726      1726
R2                                      0.006     0.036     0.191     0.251     0.318

Panel B: Income share of bottom 20%
Victims WWI - illness (hometown)        -0.003*** -0.002*** -0.002**  -0.002**  -0.002**
                                        (0.001)   (0.001)   (0.001)   (0.001)   (0.001)
N                                       1726      1726      1726      1726      1726
R2                                      0.007     0.038     0.228     0.297     0.367

Municipal controls                      No        Yes       Yes       Yes       Yes
Province FE                             No        No        Yes       No        No
Geography FE                            No        No        Yes       Yes       Yes
N taxpayers quartile FE                 No        No        Yes       Yes       Yes
District FE                             No        No        No        Yes       Yes
N taxpayers quartile FE × District FE   No        No        No        No        Yes

Notes: The dependent variable is the income share of top 20% in 1924 (Panel A) and the income share of bottom 20% in 1924 (Panel B). The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. Municipal controls include a dummy for whether the city is a province capital, population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public work (all budget variables are in per capita terms and refer to the year 1912), share of literate population, share of male literate, share of female literate and share of male population. Fixed effects are described in section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table 4: Alternative treatment definition

Dep. var.: Gini index (log)
                                                    (1)       (2)       (3)       (4)
Victims WWI - illness (hometown) - binary           0.044***
                                                    (0.015)
Victims WWI - illness (place of death)                        0.015*
                                                              (0.008)
Victims WWI illness (others) August-December 1918                       0.006
                                                                        (0.013)
Victims WWI all (1915-1920)-                                                      0.009
illness hometown (August-December 1918)                                           (0.014)

N                                                   1726      1726      1726      1726
R2                                                  0.350     0.349     0.348     0.348
Municipal controls                                  Yes       Yes       Yes       Yes
Geography FE                                        Yes       Yes       Yes       Yes
N taxpayers quartile FE                             Yes       Yes       Yes       Yes
District FE                                         Yes       Yes       Yes       Yes
N taxpayers quartile FE × District FE               Yes       Yes       Yes       Yes

Notes: The dependent variable is municipal Gini index in 1924 (expressed in log). The variable Victims WWI-illness (hometown) - binary is a binary variable equal to one if there is at least one soldier who died of illness in his hometown between August and December 1918. The variable Victims WWI-illness (place of death) captures the number of soldiers who died of illness and focuses on the municipality of death (rather than their birthplace) between August and December 1918. The variable Victims WWI-illness (others) August-December 1918 captures the number of soldiers who died of illness outside their hometown between August and December 1918. Victims WWI all (1915-1920) is the total number of soldiers who died during WWI. All these variables are standardized and the last three variables are in per capita terms (with population of 1911). Municipal controls include a dummy for whether the city is a province capital, population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public work (all budget variables are in per capita terms and refer to the year 1912), share of literate population, share of male literate, share of female literate and share of male population. Fixed effects are described in section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.
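The table notes describe treatment variables that are expressed in per capita terms (with 1911 population), standardized, and, in one case, turned into a binary indicator. A minimal sketch of how such variables could be built from raw counts follows. The function name `build_treatments` is hypothetical, and the use of the population standard deviation (`ddof=0`) is an assumption of this sketch, since the notes do not say which convention was used.

```python
import numpy as np


def build_treatments(deaths, pop_1911):
    """Build per capita, standardized and binary versions of a treatment.

    deaths   -- soldiers who died of illness in their hometown, by municipality
    pop_1911 -- municipal population in the 1911 census
    """
    deaths = np.asarray(deaths, dtype=float)
    pop = np.asarray(pop_1911, dtype=float)
    per_capita = deaths / pop

    sd = per_capita.std(ddof=0)  # assumption: population standard deviation
    if sd == 0:
        raise ValueError("treatment has no variation and cannot be standardized")
    standardized = (per_capita - per_capita.mean()) / sd

    # Equal to one if at least one soldier died of illness in the hometown.
    binary = (deaths >= 1).astype(int)
    return per_capita, standardized, binary


# Made-up counts for four municipalities (illustrative only).
per_capita, standardized, binary = build_treatments(
    deaths=[0, 1, 3, 0], pop_1911=[1000, 2000, 1500, 500]
)
print(per_capita, standardized, binary)
```

The sketch returns the raw 0/1 indicator; since the notes state that all the variables in Table 4 are standardized, the same centering and scaling step could be applied to it as well.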

The Effect of the 1918 Influenza Pandemic on Income Inequality: Evidence from Italy

APPENDIX

A. Appendix - Additional Figures and Tables

Figure A.1: Distribution of inequality indices by municipality (1924)

(A) Gini index (B) Top 20% income share (C) Bottom 20% income share

Notes: The figure shows the distribution of the income measures: municipal Gini index (Plot A), fraction of income owned by the top 20% taxpayers (Plot B) and the fraction of income owned by the bottom 20% taxpayers (Plot C).

Figure A.2: Infection risk level in regiments and victims of illness (hometown)

Notes: The binned scatterplot shows the output of the regression $\text{Victims WWI illness hometown}_{idp} = \beta \, \text{Infection risk regiments}_{idp} + \gamma X_{idp} + \eta_{idp}$, where the variable $\text{Infection risk regiments}_{idp}$ measures the average share of victims of illness in the regiments in which the soldiers of city $i$ served. All the other terms are defined as in equation 2. The covered period is August-December 1918. The variable Victims WWI illness hometown is expressed in per capita terms and is standardized, and the variable Infection risk regiments is standardized. The estimates are reported in Appendix Table A.6, Column 5.

Figure A.3: Effect of pandemic exposure on the Gini index: binned scatterplot

Notes: The binned scatterplot shows the output of the regression according to Equation 2 and replicates the output of Table 2, Column (5). The dependent variable is municipal Gini index (in log) in 1924 and the explanatory variable is the number of soldiers who returned infected and died in their hometown in the period August-December 1918 (per capita and standardized).

Table A.1: Descriptive statistics

                                                          Average value   Standard deviation   N
Panel A: Influenza (region)
Casualties Influenza 1915-1920                            9,617.8         11,055.4             96

Panel B: Inequality indicators (municipality)
Gini index (1924)                                         .410            .120                 1726
Gini index (1924) - net                                   .393            .119                 1726
Top 20% share (1924)                                      .478            .108                 1726
Bottom 20% share (1924)                                   .060            .028                 1726
Income standard deviation (1924)                          2161.13         1894.87              1726
Theil index (1924)                                        .3592           .2445                1726
Coefficient of variation (1924)                           1.01            .49                  1726
Gini index (2018)                                         .4020           .0350                1688
Gini index (2018) [entire sample]                         .3958           .0403                6705

Panel C: Victims WWI in per capita terms (municipality)
Casualties 1915-1920                                      .015            .006                 1726
Casualties illness 1915-1920                              .005            .002                 1726
Casualties illness (hometown) 1915-1920                   .0006           .0007                1726
Casualties illness (hometown) Aug-Dec 1918                .00013          .00024               1726
Casualties illness (hometown) Jan-Jul 1918                .00007          .00021               1726
Casualties illness (other) Aug-Dec 1918                   .0016           .0012                1726
Casualties illness (place of death) Aug-Dec 1918          .00032          .00094               1726
Casualties illness hometown (binary) Aug-Dec 1918         .3319           .4710                1726
Casualties illness (hometown) 1918q1                      .00003          .0001                1726
Casualties illness (hometown) 1918q2                      .00003          .0001                1726
Casualties illness (hometown) 1918q3                      .00004          .0001                1726
Casualties illness (hometown) 1918q4                      .00009          .0002                1726
Casualties illness (hometown) 1919q1                      .00005          .0001                1726
Casualties illness (hometown) 1919q2                      .00002          .0001                1726
Infection risk - Regiments                                .727            .240                 1726

Panel D: Additional information (municipality)
Population 1911                                           5033.37         11421.62             1726
Population density 1911                                   1.66            4.31                 1726
Population growth rate 01-11                              .0580           .1052                1726
Province capital                                          .0162           .1263                1726
Total expenditures                                        23.98           24.85                1726
Budget surplus                                            .6669           1.96                 1726
Expenditures in police and sanitation/hygiene services    4.92            8.51                 1726
Expenditures in justice and security                      .2266           1.09                 1726
Expenditures in education                                 4.30            6.28                 1726
Expenditures in public works                              3.45            12.10                1726
Literacy (share)                                          .5522           .1893                1726
Literacy male (share)                                     .4721           .1765                1726
Literacy female (share)                                   .6221           .2078                1726
Male population (share)                                   .4802           .0399                1726

Notes: Casualties Influenza measures the regional and yearly number of victims of Influenza and Pneumonia. Gini index (1924) measures the municipal Gini index in 1924; it is constructed with individual incomes of taxpayers in the categories B and C. Gini index (1924) - net measures the municipal net Gini index in 1924; it is constructed with after-tax individual income data. Top 20% share (1924) and Bottom 20% share (1924) measure the fraction of wealth of, respectively, the top 20% and the bottom 20% taxpayers in 1924. Gini index (2018) measures the municipal Gini index in 2018; it is constructed with bracket-specific income data. Casualties 1915-1920, Casualties illness 1915-1920 and Casualties illness (hometown) 1915-1920 measure, respectively, the number of military victims, the number of military victims by illness, and the number of military victims by illness (in their hometown) in the reference period. The variables Casualties illness (hometown) Aug-Dec 1918/Jan-Jul 1918 measure the number of military victims by illness (in their hometown) in the reference period. The variable Casualties illness (other) Aug-Dec 1918 measures the number of military victims by illness (outside their hometown) in the reference period. The variable Casualties illness (place of death) Aug-Dec 1918 measures the number of military victims by illness in the reference period, attributed to the municipality where they passed away. The variable Casualties illness hometown (binary) Aug-Dec 1918 is the binary version of the variable Casualties illness (hometown) Aug-Dec 1918. The variable Infection risk - Regiments measures the share of victims of illness in the regiments in which the soldiers of the reference city served for the period August-December 1918. Population 1911 is municipal population in 1911; Population density 1911 is the ratio between municipal population and city area in 1911. Population growth rate 01-11 is the growth rate from the 1901 to the 1911 census. Mountain city, Coastal city and Province capital are dummy variables capturing whether the city, respectively, is in a mountain area, is on the coast and is a province capital. The variables Total expenditures, Budget surplus, Expenditures in police and health services, Expenditures in justice and security, Expenditures in education and Expenditures in public works are expressed in per capita terms. Literacy (share), Literacy male (share) and Literacy female (share) capture, respectively, the literacy rate of the total population, of the male population and of the female population. Male population is the fraction of males in the municipal population.

Table A.2: Tax payers category, 1924

Category of income                              N. Tax payers   Sum income declared   Average Income
Capital (A)                                     487,057         936,057,618           1,921
Commercial and industrial (B)                   790,697         4,637,308,934         5,864
Profession of liberal arts (C)                  164,501         2,300,934,863         13,987
Salaries and pensions of public employees (D)   21,077          1,342,912,893         63,714
Total                                           1,379,871       9,217,214,308         6,679

Notes: This table reports information about the different categories of tax-payers that were subject to the income tax (Italian Ministry of Finance).

Table A.3: Descriptive statistics: monthly indicators (per 1,000)

Casualties illness (hometown)                    Average value   Standard deviation   N
Victims WWI-illness (hometown) 1918 July         .0085           .0548                1726
Victims WWI-illness (hometown) 1918 August       .0082           .0528                1726
Victims WWI-illness (hometown) 1918 September    .0247           .1007                1726
Victims WWI-illness (hometown) 1918 October      .0454           .1336                1726
Victims WWI-illness (hometown) 1918 November     .0285           .1225                1726
Victims WWI-illness (hometown) 1918 December     .0241           .0938                1726
Victims WWI-illness (hometown) 1919 January      .0234           .1048                1726
Victims WWI-illness (hometown) 1919 February     .0131           .1013                1726
Victims WWI-illness (hometown) 1919 March        .0147           .0915                1726
Victims WWI-illness (hometown) 1919 April        .0108           .0631                1726
Victims WWI-illness (hometown) 1919 May          .0090           .0530                1726
Victims WWI-illness (hometown) 1919 June         .0096           .0635                1726

Notes: The variables Victims WWI-illness (hometown) capture the number of soldiers (per 1,000) who died of illness in their hometown in a specific year/month, in per capita terms (with population of 1911).

Table A.4: Descriptive statistics: geographical distribution of cities in the sample

                          Frequency   Percentage
Panel A: Distribution across Italian regions
Abruzzi-Molise            203         11.76
Basilicata                117         6.78
Calabria                  187         10.83
Campania                  249         14.43
Emilia-Romagna            134         7.76
Liguria                   35          2.03
Lombardia                 60          3.48
Marche                    57          3.30
Sardegna                  307         17.79
Sicilia                   169         9.79
Toscana                   38          2.20
Umbria                    75          4.35
Veneto                    95          5.50

Panel B: Distribution across Italian provinces
Aquila degli Abruzzi      102         5.91
Benevento                 63          3.65
Bologna                   34          1.97
Cagliari                  216         12.51
Caltanissetta             25          1.45
Catanzaro                 111         6.43
Chieti                    101         5.85
Messina                   79          4.58
Modena                    43          2.49
Napoli                    53          3.07
Padova                    95          5.50
Palermo                   65          3.77
Parma                     40          2.32
Perugia                   75          4.35
Pesaro e Urbino           57          3.30
Pisa                      38          2.20
Porto Maurizio            35          2.03
Potenza                   117         6.78
Ravenna                   17          0.98
Reggio di Calabria        76          4.40
Salerno                   133         7.71
Sassari                   91          5.27
Sondrio                   60          3.48

Notes: The table shows the number and the percentage of municipalities in the effective sample across the Italian regions/provinces.

Table A.5: Soldiers casualties and influenza deaths

| N victims of Influenza | Year 1918 (1) | Year 1917 (2) | Post-pandemic 1918-1920 (3) | Pre-pandemic 1915-1917 (4) |
|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.418** | -0.0890 | 0.296** | -0.131 |
| | (0.174) | (0.126) | (0.132) | (0.112) |
| Victims WWI - others | -0.988*** | -0.0902 | -0.962*** | -0.0811 |
| | (0.289) | (0.109) | (0.307) | (0.0732) |
| Year FE | No | No | Yes | Yes |
| N | 16 | 16 | 48 | 48 |
| R2 | 0.513 | 0.163 | 0.935 | 0.195 |

Notes: Each column represents a single regression. All the variables are in per capita terms (with population of 1911) and standardized. The units of observation are Italian regions. Robust standard errors clustered at the region level are in parentheses: * p < 0.1, ** p < 0.05 and *** p < 0.01.

Table A.6: Infection risk level in regiments and victims of illness (hometown)

| Dep. var.: Victims WWI - illness (hometown) | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Infection risk - Regiments | 0.254*** | 0.251*** | 0.259*** | 0.268*** | 0.273*** |
| | (0.019) | (0.020) | (0.024) | (0.025) | (0.027) |
| N | 1726 | 1726 | 1726 | 1726 | 1726 |
| R2 | 0.028 | 0.040 | 0.080 | 0.097 | 0.186 |
| Municipal controls | No | Yes | Yes | Yes | Yes |
| Province FE | No | No | Yes | No | No |
| Geography FE | No | No | Yes | Yes | Yes |
| N taxpayers quartile FE | No | No | Yes | Yes | Yes |
| District FE | No | No | No | Yes | Yes |
| N taxpayers quartile FE × District FE | No | No | No | No | Yes |

Notes: The dependent variable, Victims WWI-illness (hometown), captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. The variable Infection risk - Regiments captures the share of victims of illness in the regiments in which the soldiers of the reference city were located in the period August-December 1918, standardized. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table A.7: Impact of WWI victims on inequality indicators: Gini index - with control variables

| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.026*** | 0.021*** | 0.022*** | 0.022*** |
| | (0.008) | (0.007) | (0.007) | (0.007) |
| Province capital | 0.124* | 0.071 | 0.072 | 0.042 |
| | (0.062) | (0.049) | (0.052) | (0.050) |
| Population density (1911) | 0.008** | 0.002*** | 0.002*** | 0.002*** |
| | (0.003) | (0.001) | (0.001) | (0.001) |
| Tot expenditures (1912) | 0.001 | 0.000 | 0.000 | 0.000 |
| | (0.001) | (0.001) | (0.001) | (0.001) |
| Surplus (1912) | -0.007 | 0.002 | 0.001 | -0.000 |
| | (0.005) | (0.004) | (0.004) | (0.004) |
| Expenditures police/hygiene (1912) | 0.001 | 0.000 | -0.000 | 0.000 |
| | (0.001) | (0.001) | (0.001) | (0.001) |
| Expenditures justice/security (1912) | -0.003 | -0.005* | -0.002 | -0.001 |
| | (0.003) | (0.003) | (0.002) | (0.003) |
| Expenditures education (1912) | -0.002 | 0.000 | 0.000 | -0.000 |
| | (0.002) | (0.002) | (0.002) | (0.002) |
| Expenditures public work (1912) | -0.004*** | -0.002* | -0.002* | -0.002* |
| | (0.001) | (0.001) | (0.001) | (0.001) |
| Share literate | -0.888** | -0.599** | -0.633** | -0.591*** |
| | (0.379) | (0.252) | (0.253) | (0.213) |
| Share literate male | 0.377 | 0.252 | 0.266 | 0.305 |
| | (0.244) | (0.204) | (0.204) | (0.194) |
| Share literate female | 0.368* | 0.304** | 0.381*** | 0.270* |
| | (0.189) | (0.140) | (0.142) | (0.136) |
| Share male | -0.049 | 0.033 | 0.046 | 0.059 |
| | (0.413) | (0.235) | (0.224) | (0.227) |
| N | 1726 | 1726 | 1726 | 1726 |
| R2 | 0.044 | 0.240 | 0.292 | 0.351 |
| Municipal controls | Yes | Yes | Yes | Yes |
| Province FE | No | Yes | No | No |
| Geography FE | No | Yes | Yes | Yes |
| N contributors quartile FE | No | Yes | Yes | Yes |
| District FE | No | No | Yes | Yes |
| N contributors quartile FE × District FE | No | No | No | Yes |

Notes: The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table A.8: Alternative dependent variables

| Dep. var.: | Net Gini index (log) (1) | Income standard deviation (log) (2) | Theil index (log) (3) | Coefficient of variation (log) (4) | Total declared income (per capita) (5) | N of taxpayers (per capita) (6) |
|---|---|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.024*** | 0.047** | 0.044*** | 0.023** | 1.053 | 0.0002 |
| | (0.007) | (0.018) | (0.015) | (0.009) | (0.658) | (0.0002) |
| N | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 |
| R2 | 0.355 | 0.411 | 0.338 | 0.343 | 0.563 | 0.435 |
| Municipal controls | Yes | Yes | Yes | Yes | Yes | Yes |
| Geography FE | Yes | Yes | Yes | Yes | Yes | Yes |
| N taxpayers quartile FE | Yes | Yes | Yes | Yes | Yes | No |
| District FE | Yes | Yes | Yes | Yes | Yes | Yes |
| N taxpayers quartile FE × District FE | Yes | Yes | Yes | Yes | Yes | No |

Notes: The dependent variable is the municipal Gini index (log) in 1924 net of taxes (Column 1), the income standard deviation (log) in 1924 (Column 2), the Theil index (log) in 1924 (Column 3), the coefficient of variation (log) in 1924 (Column 4), the municipal total declared income in thousands of Lire per capita (Column 5) and the number of taxpayers per capita (Column 6). The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.
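The inequality measures named in the notes above can be illustrated with a minimal sketch. It uses textbook formulas for the Gini index, the Theil T index and the coefficient of variation. Whether these match the exact definitions used to build the dependent variables is an assumption, and the income values are hypothetical.

```python
import math

def gini(incomes):
    """Gini index via the sorted-rank formula (0 means perfect equality)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # G = 2 * sum_i(i * x_i) / (n * sum_i x_i) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

def theil(incomes):
    """Theil T index: mean of (x / mu) * ln(x / mu); needs positive incomes."""
    mu = sum(incomes) / len(incomes)
    return sum((x / mu) * math.log(x / mu) for x in incomes) / len(incomes)

def coefficient_of_variation(incomes):
    """Population standard deviation divided by the mean."""
    n = len(incomes)
    mu = sum(incomes) / n
    var = sum((x - mu) ** 2 for x in incomes) / n
    return math.sqrt(var) / mu

# Hypothetical taxpayer incomes for one municipality (illustrative only).
incomes = [1.0, 2.0, 3.0, 4.0, 10.0]
measures = {
    "gini": gini(incomes),
    "theil": theil(incomes),
    "cv": coefficient_of_variation(incomes),
}
# The regressions use the logarithm of each measure as dependent variable.
log_measures = {name: math.log(value) for name, value in measures.items()}
```

All three measures equal zero when every taxpayer has the same income, so the log transformation is only defined for municipalities with some income dispersion.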

Table A.9: Heterogeneity analysis

| Dep. var.: Gini index (log) | North (1) | South (2) | Rich cities (3) | Poor cities (4) |
|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.017* | 0.023** | 0.019*** | 0.025* |
| | (0.009) | (0.010) | (0.007) | (0.014) |
| N | 494 | 1232 | 898 | 828 |
| R2 | 0.466 | 0.313 | 0.403 | 0.290 |
| Municipal controls | Yes | Yes | Yes | Yes |
| Geography FE | Yes | Yes | Yes | Yes |
| N taxpayers quartile FE | Yes | Yes | Yes | Yes |
| District FE | Yes | Yes | Yes | Yes |
| N taxpayers quartile FE × District FE | Yes | Yes | Yes | Yes |

Notes: The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. The sample includes cities in the North of Italy in Column (1), cities in the South of Italy in Column (2), cities in regions with a GDP per capita in 1911 higher than the median level in Column (3) and cities in regions with a GDP per capita in 1911 lower than the median level in Column (4). The source for regional GDP per capita in 1911 is Daniele and Malanima (2007). Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table A.10: Heterogeneity analysis - Budget variable 1912

| Interaction term (Above median): | Total expenditures (1) | Hygiene, police expenditures (2) | Public work expenditures (3) |
|---|---|---|---|
| Victims WWI - illness (hometown) | 0.029** | 0.021* | 0.027** |
| | (0.012) | (0.012) | (0.011) |
| Victims WWI - illness (hometown) × Above median | -0.011 | 0.004 | -0.008 |
| | (0.020) | (0.022) | (0.020) |
| N | 1726 | 1726 | 1726 |
| R2 | 0.350 | 0.349 | 0.351 |
| Municipal controls | Yes | Yes | Yes |
| Geography FE | Yes | Yes | Yes |
| N taxpayers quartile FE | Yes | Yes | Yes |
| District FE | Yes | Yes | Yes |
| N taxpayers quartile FE × District FE | Yes | Yes | Yes |

Notes: The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. The variable Above median is a dummy variable identifying whether a municipality is above the median in the distribution of the variable specified in each column head. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), the share of literate population, the share of literate males, the share of literate females and the share of male population. Unlike in the standard control set, budget variables are not included in this analysis. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table A.11: Long-term effect on income inequality

| | All municipalities (1) | All municipalities (2) | All municipalities (3) | All municipalities (4) | 1924 sample (5) | 1924 sample (6) | 1924 sample (7) | 1924 sample (8) |
|---|---|---|---|---|---|---|---|---|
| Panel A: Continuous treatment | | | | | | | | |
| Victims WWI - illness (hometown) | 0.003** | 0.003** | 0.002 | 0.001 | 0.002 | 0.001 | 0.001 | 0.002 |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.003) | (0.003) | (0.003) | (0.003) |
| N | 6705 | 6705 | 6705 | 6705 | 1688 | 1688 | 1688 | 1688 |
| R2 | 0.001 | 0.031 | 0.271 | 0.338 | 0.000 | 0.046 | 0.306 | 0.368 |
| Panel B: Binary treatment | | | | | | | | |
| Victims WWI - illness (hometown) | 0.029*** | 0.023*** | 0.014*** | 0.013*** | 0.019*** | 0.014** | 0.012** | 0.012** |
| | (0.004) | (0.004) | (0.003) | (0.003) | (0.006) | (0.006) | (0.005) | (0.005) |
| N | 6705 | 6705 | 6705 | 6705 | 1688 | 1688 | 1688 | 1688 |
| R2 | 0.017 | 0.041 | 0.274 | 0.341 | 0.011 | 0.052 | 0.310 | 0.371 |
| Municipal controls | No | Yes | Yes | Yes | No | Yes | Yes | Yes |
| Province FE | No | No | Yes | Yes | No | No | Yes | Yes |
| Geography FE | No | No | Yes | Yes | No | No | Yes | Yes |
| District FE | No | No | No | Yes | No | No | No | Yes |

Notes: The dependent variable is municipal Gini index in 2018. The sample includes all Italian municipalities in Columns (1-4) and only those used in the main analysis (Municipalities in 1924 sample) in Columns (5-8). The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. The variable is continuous in Panel A and binary in Panel B (0 in cities with no returning soldiers and 1 in those with a positive number). Municipal controls include a dummy for whether the city is a province capital and the population density of the city (defined as the ratio between 1911 population and city area). Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p < 0.1, ** p < 0.05 and *** p < 0.01.
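The two treatment definitions described in the notes above (continuous and binary) can be sketched as follows. The death counts and populations are hypothetical, and dividing by the population standard deviation (rather than the sample one) is an assumption, since the notes only say the variable is standardized.

```python
import math

def standardize(values):
    """Return z-scores: subtract the mean, divide by the population std. dev."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]

# Hypothetical municipalities: soldiers who died of illness in their hometown
# (August-December 1918) and resident population in 1911.
deaths = [0, 2, 0, 1, 5]
pop_1911 = [1200, 3400, 800, 2500, 4100]

# Panel A style: per capita count, then standardized.
per_capita = [d / p for d, p in zip(deaths, pop_1911)]
continuous = standardize(per_capita)

# Panel B style: 1 if at least one such death was recorded, 0 otherwise.
binary = [1 if d > 0 else 0 for d in deaths]
```

With the standardized version, a coefficient reads as the change in the outcome for a one-standard-deviation increase in exposure; with the binary version, it reads as the difference between exposed and unexposed municipalities.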

B. Appendix - Robustness checks

In this Appendix, we present a series of robustness tests to show the validity of the main results of the paper. In Appendix Figure A.4, we test the impact of the main treatment over the time span January 1918–August 1919 by accounting for its value in four-month periods. The graph shows that the inflow of ill soldiers led to an increase in the Gini index only when the pandemic was particularly severe (i.e., in the last four months of 1918), while the impact was marginal in the other periods. As an additional test, we show that our results are robust to different definitions of the treatment in terms of the time window considered. These tests are reported in Appendix Table A.12, Panel A. First, we add January 1919 or the first quarter of 1919 to the time window of the main treatment (Columns 1 and 2). In both cases, we find a significant effect of very similar magnitude (0.021 in both cases). Second, we run the main analysis considering the long period January 1918–June 1919 (Column 3). In this case too, the coefficient is positive and significant. Third, we repeat the analysis of Column (3) excluding one quarter at a time (Columns 4 to 9). We find that the coefficients are highly significant only when we exclude the quarters before the autumn wave of the flu. In Appendix Table A.12, Panel B, we conduct the same analysis with the treatment defined using the number of soldiers who died of illness by place of death (as in Column 2 of Table 4). In this case, we find a significant effect in all estimates except when excluding the last quarter of 1918. Overall, this evidence suggests that most of the variation allowing us to identify the effect of the Spanish flu comes from the last quarter of 1918. It also suggests that extending the analysis to earlier months would just add “noise” to the treatment, mechanically reducing the coefficient.
Also, our approach of using the narrow definition of soldiers who died in their hometown provides conservative estimates compared to the broader definition based on place of death.

Subsequently, we perform a number of tests to evaluate the sensitivity of our results to the exclusion of portions of the sample. We start by re-estimating our main specification removing one region at a time. The results are reported in Appendix Table A.13. We find that no specific region drives the effect. Importantly, excluding from the sample either Lombardia or Veneto, two regions that were located at the WWI front, does not affect our results.1

Furthermore, we check the stability of our results to the exclusion of municipalities depending on their number of taxpayers. One potential issue with the municipal-level analysis is that in small municipalities, with few taxpayers, the Gini index would be very sensitive to extreme values. Therefore, we report in Appendix Figure A.5 the coefficients of the analyses in which we drop, each time, the sub-sample of municipalities that represents a specific 5% group in terms of number of taxpayers. The coefficient is always positive, spanning between 0.16 and 0.26, and statistically significant. This suggests that the main results are not driven by specific subsets of municipalities with a particular number of taxpayers.

Finally, in Appendix Table A.14 we report a number of additional robustness tests. In Columns (1) and (2), we account for the possibility that the treatment indicator may be systematically higher in cities where people found it more appealing to join the army, which may be correlated with important local economic characteristics (e.g., local unemployment). To control for this potential confounding factor, we use the total number of military victims during WWI in a given municipality as a proxy for the number of drafted soldiers.2 We conduct two analyses using this proxy: in Column (1) we include this variable as a control, while in Column (2) we use as the main treatment the number of victims of illness in the hometown during the peak of the epidemic over the total number of military victims in that city, instead of the per capita measure. In both cases, the main results are confirmed.

Next, we estimate the main model using the non-standardized treatment in order to capture the effect in absolute terms. Column (3) of Appendix Table A.14 shows this analysis: a one-unit increase in the treatment has a very large effect on local inequality. In addition, we conduct the same analysis with the two placebo treatments used in Table 4. In Columns (4) and (5) we use as a regressor, respectively, the number of soldiers who died of illness outside their hometown (during the peak of the epidemic) and the total number of WWI victims, excluding those considered in our main treatment. In both cases the coefficient is positive, insignificant and very small in magnitude. As expected, the death of soldiers from influenza in their hometown has a larger impact, as it is probably associated with the contagion and death of other citizens, unlike the death of soldiers elsewhere. This finding further justifies our definition of the treatment.

In Column (6) we conduct the main analysis estimating a Generalized Linear Model (GLM) using a Poisson regression. This may yield more efficient and less biased estimates in the presence of a dependent variable expressed in logarithms (Santos Silva and Tenreyro, 2006). The main result is confirmed and the coefficient is similar in magnitude to that of the main analysis. Moreover, we control for the presence of prisoner camps according to Tavernini (2001) and Tortato (2004). Indeed, the municipalities in which these camps were located may have had a higher exposure to the virus due to the massive military presence. Therefore, we conduct two analyses: in Column (7) we exclude from the sample those municipalities in which the prisoner camps were located, and in Column (8) we include in the regression a dummy variable capturing the presence of a camp. The main results are confirmed in both cases. Finally, we show that the main results hold if we exclude the municipalities that are province capitals (Column 9) and those located in regions close to the Italian war front, i.e., Lombardia and Veneto (Column 10).

1 In Column 10 of Appendix Table A.14, we report estimates where we exclude municipalities from both regions.
2 In this regard, it is important to mention that the control set always includes the municipal share of male population, which is another local proxy for the number of drafted soldiers in a given municipality.
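The Poisson estimation mentioned for Column (6) can be sketched as follows, under strong simplifying assumptions. The outcome enters in levels and its conditional mean is modeled as exp(a + b·x), in the spirit of Santos Silva and Tenreyro (2006). The sketch has a single regressor and no controls or fixed effects, and the function name `poisson_pml` is hypothetical, so this is not the paper's specification.

```python
import math

def poisson_pml(x, y, iterations=50):
    """Fit E[y | x] = exp(a + b * x) by Poisson pseudo-maximum likelihood.

    Uses Newton steps on the Poisson log-likelihood; y need not be a count.
    """
    a, b = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        # Score vector (first derivatives of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for yi, mi, xi in zip(y, mu, x))
        # Information matrix (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for mi, xi in zip(mu, x))
        h11 = sum(mi * xi * xi for mi, xi in zip(mu, x))
        det = h00 * h11 - h01 * h01
        # Newton update: solve the 2x2 system analytically.
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

# Hypothetical data: outcome in levels generated exactly from the model.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [math.exp(0.5 + 0.2 * xi) for xi in x]
a_hat, b_hat = poisson_pml(x, y)
```

Because the model is multiplicative, the slope b is read as an approximate proportional change in the outcome per unit of x, which is why it can be set beside the coefficient from the log-linear regression.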

Figure A.4: Effect of pandemic exposure on the Gini index

Notes: The plot shows the estimates according to the equation

$$\text{Inequality}_{idp} = \sum_{t=1918t1}^{1919t2} \beta_t \, \text{Victims WWI for illness hometown}_{idpt} + \gamma X_{idp} + \eta_{idp}.$$

The values of the coefficients and the standard errors are reported next to each dot. For each coefficient, 95% (delimited by horizontal bars) and 90% (bold line) confidence intervals are included. The dependent variable is municipal Gini index (in log) in 1924 and the explanatory variable is the number of soldiers who returned infected and died in their hometown in the corresponding four-month period (per capita and standardized). The specification is the same as in Column (5) of Table 2.
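The period-specific regressors in the equation above can be built by aggregating monthly counts into four-month periods. The sketch below assumes that labels such as 1918t1 denote the first four-month period of 1918, which is an interpretation of the summation limits rather than something stated in the notes. The monthly counts are hypothetical.

```python
from collections import defaultdict

def four_month_period(year, month):
    """Return a label such as '1918t1' (Jan-Apr), 't2' (May-Aug), 't3' (Sep-Dec)."""
    return f"{year}t{(month - 1) // 4 + 1}"

def period_totals(monthly_deaths):
    """Sum monthly deaths, keyed by (year, month), into four-month periods."""
    totals = defaultdict(int)
    for (year, month), deaths in monthly_deaths.items():
        totals[four_month_period(year, month)] += deaths
    return dict(totals)

# Hypothetical monthly counts for one municipality.
monthly = {(1918, 9): 2, (1918, 12): 1, (1919, 1): 4}
by_period = period_totals(monthly)
```

Under this reading, January 1918–August 1919 spans five periods (1918t1 to 1919t2), each of which would enter the regression with its own coefficient after per capita scaling and standardization.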

Figure A.5: Excluding municipalities by size

Notes: The plot shows the outputs of the regression according to Equation 2 in which the sample has been reduced, excluding each time the cities in the corresponding 5% group in terms of number of taxpayers. The dependent variable is municipal Gini index (in log) in 1924 and the explanatory variable is the number of soldiers who returned infected and died in their hometown in the period August-December 1918 (per capita and standardized). The dotted gray line indicates the value of the coefficient in the main analysis (Column 5 of Table 2).
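The exclusion exercise described in these notes can be sketched as a loop that drops one 5% size group at a time and re-estimates the coefficient. The sketch replaces Equation 2 with a bivariate OLS slope (no controls or fixed effects) and uses simulated data, so both the estimator and the numbers are assumptions made for illustration.

```python
import random

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def drop_size_groups(rows, n_groups=20):
    """Re-estimate the slope after dropping each size group in turn.

    rows: list of (n_taxpayers, treatment, log_gini) tuples.
    With n_groups=20, each group holds about 5% of the municipalities.
    """
    ordered = sorted(rows, key=lambda r: r[0])
    n = len(ordered)
    slopes = []
    for g in range(n_groups):
        lo, hi = g * n // n_groups, (g + 1) * n // n_groups
        kept = ordered[:lo] + ordered[hi:]  # everything except group g
        slopes.append(ols_slope([r[1] for r in kept], [r[2] for r in kept]))
    return slopes

# Simulated municipalities (illustrative only).
rng = random.Random(0)
rows = []
for _ in range(400):
    treatment = rng.gauss(0, 1)
    log_gini = 0.02 * treatment + rng.gauss(0, 0.1)
    rows.append((rng.randint(10, 5000), treatment, log_gini))

slopes = drop_size_groups(rows)
```

If no single size group drives the result, the twenty slopes stay close to the full-sample estimate, which is what the dotted reference line in the figure is meant to convey.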

Table A.12: Impact of WWI victims on inequality indicators - different time windows

| | Period included: Aug18-Jan19 (1) | Period included: Aug18-Mar19 (2) | Period included: 1918Q1-1919Q2 (3) | Quarter excluded: 1918Q1 (4) | Quarter excluded: 1918Q2 (5) | Quarter excluded: 1918Q3 (6) | Quarter excluded: 1918Q4 (7) | Quarter excluded: 1919Q1 (8) | Quarter excluded: 1919Q2 (9) |
|---|---|---|---|---|---|---|---|---|---|
| Panel A: victims - hometown | | | | | | | | | |
| Victims WWI - illness (hometown) | 0.021** | 0.021** | 0.013* | 0.016** | 0.019*** | 0.011 | 0.005 | 0.012* | 0.010 |
| | (0.008) | (0.008) | (0.007) | (0.007) | (0.007) | (0.007) | (0.007) | (0.006) | (0.007) |
| N | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 |
| R2 | 0.351 | 0.351 | 0.349 | 0.350 | 0.350 | 0.349 | 0.348 | 0.349 | 0.349 |
| Panel B: victims - place of death | | | | | | | | | |
| Victims WWI - illness (place of death) | 0.017* | 0.017* | 0.020* | 0.019* | 0.021* | 0.023** | 0.014 | 0.018* | 0.018* |
| | (0.009) | (0.009) | (0.011) | (0.010) | (0.011) | (0.011) | (0.014) | (0.010) | (0.010) |
| N | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 |
| R2 | 0.349 | 0.349 | 0.349 | 0.349 | 0.349 | 0.349 | 0.348 | 0.349 | 0.349 |
| Province FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Geography FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N contributors decile FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| District FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Municipal controls | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N contributors decile FE × District FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

Notes: The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. The variable Victims WWI-illness (place of death) captures the number of soldiers who died of illness and focuses on the municipality of death (rather than their birthplace) between August and December 1918, in per capita terms (with population of 1911) and standardized. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table A.13: Impact of WWI victims on inequality indicators - excluding Italian regions

| Excluded region | Abruzzo (1) | Basilicata (2) | Calabria (3) | Campania (4) | Emilia-Romagna (5) | Liguria (6) | Lombardia (7) |
|---|---|---|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.020*** | 0.021*** | 0.022*** | 0.022*** | 0.022*** | 0.021*** | 0.022*** |
| | (0.007) | (0.008) | (0.008) | (0.008) | (0.008) | (0.007) | (0.008) |
| N | 1523 | 1609 | 1539 | 1477 | 1592 | 1691 | 1666 |
| R2 | 0.370 | 0.353 | 0.357 | 0.354 | 0.338 | 0.354 | 0.351 |

| Excluded region | Marche (1) | Sardegna (2) | Sicilia (3) | Toscana (4) | Umbria (5) | Veneto (6) | |
|---|---|---|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.020*** | 0.023*** | 0.024*** | 0.023*** | 0.023*** | 0.024*** | |
| | (0.007) | (0.008) | (0.007) | (0.007) | (0.007) | (0.007) | |
| N | 1669 | 1419 | 1557 | 1688 | 1651 | 1631 | |
| R2 | 0.349 | 0.338 | 0.359 | 0.351 | 0.331 | 0.352 | |
| Province FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Geography FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N contributors decile FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| District FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Municipal controls | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N contributors decile FE × District FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

Notes: The region of Abruzzo also includes the municipalities in the region of Molise. The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.

Table A.14: Robustness checks

| | Controlling for N of victims (1) | Treatment in relative terms (2) | Treatment non standardized (3) | Placebo 1 non standardized (4) | Placebo 2 non standardized (5) | GLM estimation (6) | Without prisoners camps (7) | Controlling for prisoners camps (8) | Without province capitals (9) | Without regions at the front (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| Victims WWI - illness (hometown) | 0.022*** | | 85.856*** | | | 0.018*** | 0.022*** | 0.022*** | 0.022*** | 0.024*** |
| | (0.007) | | (28.194) | | | (0.007) | (0.007) | (0.007) | (0.007) | (0.008) |
| Victims WWI - illness (hometown) over Total N of victims | | 0.015** | | | | | | | | |
| | | (0.007) | | | | | | | | |
| Victims WWI illness (others) August-December 1918 | | | | 4.233 | | | | | | |
| | | | | (8.690) | | | | | | |
| Victims WWI all (1915-1920) - illness hometown (August-December 1918) | | | | | 0.955 | | | | | |
| | | | | | (1.447) | | | | | |
| N | 1726 | 1726 | 1726 | 1726 | 1726 | 1726 | 1668 | 1726 | 1696 | 1571 |
| R2 | 0.351 | 0.350 | 0.351 | 0.348 | 0.348 | - | 0.347 | 0.351 | 0.350 | 0.352 |
| Municipal controls | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Geography FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N contributors quartile FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| District FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| N contributors quartile FE × District FE | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |

Notes: The dependent variable is municipal Gini index (log) in 1924. The variable Victims WWI-illness (hometown) captures the number of soldiers who died of illness in their hometown between August and December 1918, in per capita terms (with population of 1911) and standardized. The variable Victims WWI-illness (hometown) over Total N of victims captures the number of soldiers who died of illness in their hometown between August and December 1918 over the total number of WWI victims from that city, standardized. The variable Victims WWI-illness (others) August-December 1918 captures the number of soldiers who died of illness outside their hometown between August and December 1918, per capita. Victims WWI all (1915-1920) is the total number of soldiers who died during WWI, per capita. In Column (10) the municipalities that belong to the regions of Lombardia and Veneto have been excluded from the sample. Municipal controls include a dummy for whether the city is a province capital, the population density of the city (defined as the ratio between 1911 population and city area), total expenditure, budget surplus and municipal spending on police, health service, justice, education and public works (all budget variables are per capita and refer to the year 1912), the share of literate population, the share of literate males, the share of literate females and the share of male population. Fixed effects are described in Section 4. Robust standard errors clustered at the district level are in parentheses: * p<0.10, ** p<0.05, *** p<0.01.