Killer Incentives: Awards, Status Competition, and Pilot Performance during World War II

Philipp Ager (University of Southern Denmark), Leonardo Bursztyn (University of Chicago & NBER), Hans-Joachim Voth* (University of Zurich)

Abstract. We examine the role of awards and of status competition in a high-stakes setting – aerial combat during World War II. We collect and compile new, detailed data on monthly victory scores of over 5,000 German pilots for the period 1939-45. Our results suggest that awards may have been an important incentive. Crucially, we find evidence of status competition: When the daily bulletin of the German armed forces mentioned the accomplishments of a particular fighter pilot, his former peers performed markedly better. Outperformance is differential across skill groups. The best pilots try harder, score more, and die no more frequently when a former peer is mentioned; average pilots win only very few additional victories, but die at a markedly higher rate. Our results suggest that the incentive effects of non-financial rewards can lead to ambiguous results in settings where both risk and output affect aggregate performance.

* Corresponding author ([email protected]). We are grateful to Jacob Miller and Lukas Leucht for outstanding research assistance. We thank David Card, Ernesto Dal Bó, David Dorn, Barry Eichengreen, Armin Falk, Ernst Fehr, Paul Gertler, Patrick Kline, Jonathan Leonard, Martha Olney, Christy Romer, Richard Sutch, Noam Yuchtman and Fabrizio Zilibotti for helpful comments, and seminar audiences at the Academy of Behavioral Economics, Haas Business School (UC Berkeley), the Economic History Seminar (UC Berkeley), and the Labor Lunch (UC Berkeley) for constructive criticisms.

I. INTRODUCTION

Humans crave recognition. In the Theory of Moral Sentiments, Adam Smith defined “bettering our condition” as “to be observed, to be attended to, to be taken notice of with sympathy… and approbation.” From baseball’s Hall of Fame to the Congressional Medal of Honor, from the fellowships of scholarly societies to the employee-of-the-month award, symbolic rewards exploit the human need for approbation as a motivating force. By creating an artificial scarcity, awards are meant to spur effort and increase productivity (Besley and Ghatak 2008; Moldovanu et al. 2007). Both career concerns and a desire for reciprocity may be behind the incentive effects of symbolic awards (Fehr and Schmidt 1999; Dewatripont et al. 1999). On the whole, the empirical evidence suggests that non-pecuniary rewards work. Markham, Scott, and McKee (2002) demonstrate improved attendance after a public recognition program was introduced at a US firm. Chan et al. (2014) find that academics perform better after receiving prizes or prestigious fellowships. Kosfeld and Neckermann (2011) examine students working on a database project and show that they increase their performance if they are part of the group that can receive a congratulatory card for best performance.1 Ashraf et al. (2016) show that condom sales in Zambia can be promoted effectively by non-financial rewards.2

At the same time, “[a] medal glitters, but it also casts a shadow” (Winston Churchill). Competing for awards is similar to other forms of status competition, from conspicuous consumption to prominent donations (Veblen 1899; Glazer and Konrad 1996). Just like awards, these are a way to signal status and success. Competition for positional goods is zero-sum in nature, and may lead to lower satisfaction or excessive effort (Layard 2005; Frank 1999). The overall effects of competing for symbolic awards may also be ambiguous: They may crowd in effort, giving rise to social multiplier effects, or they may crowd it out. Effects could depend on relative rank or the level of innate skill. While some theoretical work takes the zero-sum nature of symbolic rewards into account (Besley and Ghatak 2008; Moldovanu et al. 2007; Frey 2007), there is little empirical work on the social dimension of awards and their potential effect on others in the organization.3

In this paper, we examine the performance of German fighter pilots during World War II. These include the highest-scoring aces of all time. Fighter pilots constitute an ideal setting for analyzing the effects of awards – the stakes are high, awards were almost entirely symbolic, and principal-agent problems are strong. This is a favourable setting compared to existing research on symbolic awards, which is typically laboratory-based or occurs in a setting with easily observable effort and low stakes. In single-engine fighters, pilots operate alone; once battle is joined, there is no effective control of individual planes by superior officers. Each man had to decide when to pursue victory or break off contact. A pilot’s number of sorties in a day was as much determined by combat conditions as by ‘fighting spirit’. In such a setting, individual motivation was key: Germany’s most decorated pilot during World War II was Hans-Ulrich Rudel. He completed 2,500 missions, destroying over 500 Russian tanks and three capital ships in the process. Rudel insisted on flying continuously until the end of the war despite having a leg amputated, and despite being ordered by Hitler himself to stop flying and transfer to a training school (Rudel 1949). Awards were also purely symbolic, consisting of medals worth only pennies. Medals did not come with financial rewards, nor did they automatically lead to promotion with higher rates of pay.4

1 Neckermann et al. (2014) demonstrate a positive effect of awards for voluntary work behaviours on core productivity in a call center.

2 In contrast, Malmendier and Tate (2009) show that superstar CEOs significantly underperform after receiving awards.

3 Notable exceptions include Ashraf et al. (2014), who find that explicit rankings undermine effort, while awards increase it.

The German armed forces operated an elaborate system of awards and medals. The main ones – the Knight’s Crosses – were awarded cumulatively, with each lower award being a prerequisite for the next step on the ladder. Personal valor in the face of the enemy was a key consideration; privates, NCOs and officers had similar chances to be recognized, and there were almost no awards for ‘desk warriors’ (Van Creveld 1980). Recipients of the Knight’s Cross were widely considered heroes, starred on the front pages of magazines, and were frequently asked for autographs, or to tour the ‘home front’ to boost morale. Military hierarchy was partly inverted for recipients – officers had to salute a Knight’s Cross holder first, even if a mere private was wearing the medal. While the symbolic value was high, so were the stakes involved in aerial combat – by spring 1944, fully a quarter of pilots became casualties every month.5 The vast majority of German fighter pilots serving in World War II were shot down and died.

We first examine the effects of awards and of public recognition on the performance of recipients. Pilots scored more victories in the run-up to receiving an award, and then went back to their earlier performance level. These effects were carefully taken into account by the leadership. Germany began the war with an established system of awards. After the massive successes against France and, initially, against the Soviet Union, many aces already held the top award – the Knight’s Cross. To keep pilots motivated, and after much pressure from the Luftwaffe especially, the German armed forces introduced new medals: the Oak Leaves to the Knight’s Cross, the Swords, and the Diamonds.6 This is in line with the predictions from models of fashion cycles, where the status value of high-end products is continuously devalued as they are adopted by more and more people – to which the optimal response is the introduction of a new, “superior” product (Pesendorfer 1995).

4 Pilots typically did not want to be promoted, for fear of being given a desk job. The air force was careful not to ‘reward’ its best pilots by promotion and an effective transfer to administrative duty.

5 In the first six months of 1944, the Luftwaffe lost 250% of its fighter strength – i.e. without new planes arriving from the factories, it would have found itself without planes very quickly. Loss rates for planes were higher than for pilots since many pilots crash-landed or parachuted to safety, even when their plane had been completely destroyed.

Our data also allow us to examine how the social dimension of awards shaped their effects. While a priori, status competition could sharpen or dull incentives, there is ample anecdotal evidence that it was a key motivating force. For example, during the Battle of Britain in the summer of 1940, two German pilots – Werner Mölders and Adolf Galland – were neck-and-neck in terms of total victories. When Mölders was ordered to confer with the head of the Luftwaffe, Hermann Göring, he went for three days of meetings – but on the condition that Galland would also be grounded for the same number of days. Remarkably, at a time when the air battle against Britain hung in the balance, Göring (himself a WWI fighter ace) accepted that one of his top-scoring pilots would be grounded gratuitously.

Combining data on social distance in pre-existing networks with the timing of awards, we show that public recognition actually crowded in effort: Peers of pilots who were publicly recognized tried harder, increasing their score of aerial victories significantly during the months in question. This is particularly true among the best pilots (i.e., those more likely to be successful in being publicly recognized after they tried harder), and of those who were closest in terms of social distance to the pilot receiving the award. Pilots who had flown in the same squadron in the past as the new award holder raised their score substantially; those who had only served on the same airbase, only slightly. The top panel of Figure 1 shows the difference in performance for aces and non-aces, for periods with or without a peer being mentioned in dispatches. Aces always scored more, but they increased their monthly victory rate by 1.2 victories compared to a base of 1.9 when someone they had flown with in the past was mentioned in the daily armed forces bulletin. Non-aces also scored more, by 0.13 victories, relative to a baseline of 0.22.

6 There was also the Golden Oak Leaves and Diamonds to the Knight’s Cross, the highest distinction save for the Great Cross of the Knight’s Cross. Both were awarded only once – the former, to Hans-Ulrich Rudel, a bomber pilot; the latter, to Hermann Göring.

[Figure 1, top panel (Victories): victories per month for aces and non-aces, in periods with and without a peer mention; annotated differences: 1.24*** (aces), 0.13*** (non-aces). Bottom panel (Exit): exit rate per month for aces and non-aces, in periods with and without a peer mention; annotated difference: 0.013**.]

Note: Ace is defined as being in the top 20% of victories over all of World War II. Peers are pilots who in the past served in the same squadron but no longer do so. Mentions are in the German Armed Forces Daily Bulletin (Wehrmachtsbericht).

Figure 1: Victory and Exit Rates for Fighter Pilots, Aces and Non-Aces, during Periods of Peer Mentions.

Similarly, when two pilots came from the same region, an award for one spurred greater efforts from the other. This suggests that the German award system during World War II had important incentive effects not only for pilots who received a medal, but for their peers as well. We interpret these effects as being driven by competition for status, in the spirit of “keeping up with the Joneses”, and provide evidence ruling out alternative interpretations such as learning about one’s own type, common shocks to peers, social learning, and differences in equipment available to pilots.

Importantly, there is some evidence that due to these social multiplier effects, incentives may have been too sharp: While the rate of aerial victories by peers per month increased, so did the casualty rate. The best pilots reacted to awards for their competitors by increasing their own performance, seeing their exit rate change only a little; more mediocre pilots tried harder, but paid a high price in the form of much greater casualty rates (Panel B, Figure 1). Overall, our results suggest that the Luftwaffe’s reward system may have increased the performance of its best aces, but as a result degraded its overall effectiveness as a fighting force.

In addition to work on symbolic rewards, we also relate to a literature on “keeping up with the Joneses” and positional goods that shows that status competition can be welfare-reducing. On the theory side, Frank (1985) established that status goods may lead to welfare losses through wasteful spending. Empirically, Agarwal et al. (2016) show that close neighbors of winners of a Canadian lottery are more likely to later file for bankruptcy.7 Our findings also contribute to a literature on relative position and satisfaction that shows that higher levels of one’s peers’ incomes have a negative effect on one’s satisfaction levels (e.g., Luttmer 2005 and Card et al., 2012).

More broadly, we also relate to the literature on peer effects and the consequences of collaborating with others in an organization (Mas and Moretti 2008; Falk and Ichino 2006; Bandiera et al. 2010). Peer effects are typically driven by knowledge spillovers, task complementarities, or social pressure. In our setting, we can rule out the first two. Also, evidence for peer pressure is relatively strong amongst the low-skilled, but it is distinctly mixed for highly skilled individuals (Waldinger 2012; Azoulay et al. 2010; Jackson and Bruegman 2009). We add an example where social interactions create greater incentives to perform amongst highly skilled (and motivated) individuals.

While the setting of our study is highly specific, the results are of general interest. The fact that symbolic awards are effective in a high-stakes setting with major risks is important. CEOs and traders also have to trade off risk against returns, even if their downside is more limited than that of fighter pilots. The insight that symbolic rewards operate in a social environment is even more general. Our results show that status competition can lead to a crowding-in of effort, a dimension that has so far not been demonstrated in the literature. Finally, we show that these high-powered incentives in the form of awards can backfire precisely because relative status concerns can lead to too much risk-taking. Here, there are obvious analogies with incentives in compensation schemes in financial institutions, for example, where the desire to be the “best” trader can lead to catastrophic losses.

7 Relatedly, Bertrand and Morse (2016) provide evidence of “trickle-down” consumption. When households that are not rich are exposed to those with top incomes and high consumption levels, they increase the budget shares allocated to more visible goods. Kuhn et al. (2011) find that nonparticipants of a Dutch lottery who are neighbors of lottery winners have higher levels of car consumption than other nonparticipants.

We proceed as follows. Section II provides background on the German air force during World War II and the data we use. Section III presents our main findings, and Section IV discusses alternative channels and interpretations.

II. HISTORICAL BACKGROUND AND DATA

In this section, we discuss the setting of our study – the organization of the German air force in World War II, and its rise and fall as a fighting force. We also explain where our data come from, and the limitations of our source.

A. The German air force during World War II

Aerial combat began during World War I. Initially, planes had been unarmed. They quickly evolved into specialised types, ranging from single-seat fighters to bombers. During World War I, the highest-scoring ace, the Baron von Richthofen, had scored 80 victories. By the time World War II started, both fighters and bombers had become faster and more powerful. The German air force had sent planes and men to participate in the Spanish Civil War on Franco’s side, gaining valuable experience. There, it carried out the first mass bombing of a civilian target at Guernica in 1937. German air support was important for the ultimate victory of the fascist forces.

The air force was organized into air fleets composed of several flying corps. The flying corps contained several wings (Geschwader), typically containing three groups, which in turn consisted of three to four squadrons (Staffeln) each. Each squadron had an establishment strength of 12 aircraft, but actual strength could be as high as 16 or as low as 4 or 5.

Germany began the war in 1939 with 4,000 planes, including 1,200 fighters, and 880,000 men. The Luftwaffe had initially been designed for joint arms operations. During the initial “Blitzkrieg” campaigns, it mostly operated as close air support for the army. The wars against Poland, France, and Russia opened with successful attacks on enemy air forces, destroying many planes on the ground. This ensured that the Luftwaffe operated with air superiority during the initial phase of the war. Before 1943, the only exception was defeat in the Battle of Britain, when the German air force failed to dominate the skies – and the planned invasion of the British Isles was consequently called off. By 1943, personnel and the number of planes had roughly doubled, to 2,000,000 men and some 7,000 planes. As the Allied bomber offensive against the German cities gathered pace, ever more fighter units were called back to defend the Reich. Air attacks on hydrogenation plants, airframe and aero-engine factories were a particular threat to the Luftwaffe, and from 1943 onwards, its main effort was trying to stem the growing tide of bomber incursions. Despite these efforts, German cities quickly turned to rubble as the strength of British and American air forces continued to increase.

Having started the war with modern planes and a large air fleet, Germany first lost its quantitative edge. Once the war in Russia began, and the US joined the fight, the Luftwaffe was heavily outnumbered in all theatres of war. Eventually, it also fell behind in terms of quality of equipment, keeping the outdated BF-109 plane flying as a main fighter until the end of the war. New planes with advanced technology, such as the ME-262 jet, arrived too late to make a difference. Pilot training suffered as well. Until 1942, German pilots received at least as much training as Allied pilots; by 1944, German pilots received less than half of the flying hours of UK or US pilots before being sent off into combat.

Loss rates became staggeringly high. In January 1942, the air force lost only 1.8% of its fighter pilots; by May 1944, it was losing 25% of them every month. Destruction of planes was even more rapid. The Luftwaffe had lost 785 planes in combat during the six months between May and December 1940 (and another 300 in accidents, etc.); between January and June 1944, it lost 2,855 (plus another 1,345 in accidents). Actual planes available relative to authorized strength fell from 95% in January 1942 to 45% in September 1944. Air attacks against German cities may not have dented morale as much as British planners had hoped; ‘precision’ daylight bombing by the US air force destroyed much less of the industrial capacity than planned. Nonetheless, it is clear that the Anglo-American air offensive was highly effective in degrading the capabilities of the German air force, to the point that the Normandy landings in the summer of 1944 were largely unopposed in the air.

While the Luftwaffe lost air superiority in the West from 1942/43 onwards, it continued to be a match for the Red Air Force almost until the end. Outdated planes like the Stuka dive bomber had to be withdrawn from the Western theatres after 1942; they continued to operate in the East until 1945. Better training and better equipment gave the German units an edge against Russian planes and pilots; when it made an effort, the Luftwaffe could establish temporary air supremacy on the Eastern Front. It was not until late 1944 that this ability decreased, as more and more units were switched to the Western Front.

B. Medals and other awards

The German armed forces used a large number of different awards. Some were widely distributed, such as “campaign medals” handed out to every soldier who participated in a particular operation – such as the Russia 1941 campaign medal, commonly known as the “Order of the Frozen Flesh”. Some recognized particular skills or feats of arms, such as the close combat badge and destruction badges. The major medals for valor were the Iron Cross 1st and 2nd class, and the Knight’s Crosses. By the end of the war, there were five different Knight’s Crosses, awarded cumulatively. The total number of recipients varied greatly – there were approximately 3.3 million Iron Crosses awarded, but only one of the highest Knight’s Cross (Golden Oak Leaves and Diamonds – cf. Table A1). Awards were not handed out based on fixed rules, but informal quotas existed. For example, in the Luftwaffe, the first aerial victory typically brought an award of the Iron Cross 2nd class, and the fifth, the Iron Cross 1st class. Quotas for the Knight’s Cross varied during the war, starting at 20 victories and being raised later.

Commanding officers had to nominate their men for Knight’s Crosses. The nomination included details of the nominated soldier’s deeds, rank, service number, and experience, and was then sent up the chain of command, requiring support at every step. Commanders could not nominate themselves. The head of each branch of the armed services had to endorse the nomination. Final authority lay with the Army’s Personnel Office, which would make a recommendation to Hitler himself.

The second type of important recognition was a mention in dispatches. The Daily Bulletin was distributed widely throughout German-held territory, and displayed prominently at the command post of every unit, so that everyone had a high chance of reading it on the day it was issued. Individual pilots, soldiers, and sailors would on occasion be mentioned for particular feats of arms. For example, on August 6, 1944, the daily bulletin highlighted the destruction of 11 tanks in one day by Stuka pilot Hans-Ulrich Rudel (bringing his total tally to 300).

C. Data

The Air Force High Command (Oberkommando der Luftwaffe – OKL) received fighter claims throughout World War II. The records contain, like a log book, detailed information on every reported aerial victory of German fighter pilots during WWII: the wing (Geschwader), unit (Gruppe), and squadron (Staffel); the pilot’s first and last name together with his rank; and the day, location, type of damage, witnesses, and type of the claimed aircraft. German rules for counting a claim as an aerial victory were relatively demanding (Caldwell 2013). Each claim had to be accompanied by a witness confirming the destruction of the enemy aircraft (impact or explosion in the air), or the enemy pilot had to be seen bailing out. Many claims were not accepted, and rightly so – comparison with combat records from the other side suggests that there is some evidence of over-claiming, sometimes by 100% or so on average, by both Western air forces and the German air force – cf. the case studies in Caldwell (2013).

Our database of German fighter pilots during WWII draws on two principal sources: Jim Perry’s and Tony Wood’s Oberkommando der Luftwaffe (OKL) combat claims list, and the Kracker Luftwaffe Archive.8 The OKL fighter claims lists were extracted from microfilm records stored at the German Federal Archives (Bundesarchiv) in Freiburg. As some OKL fighter claims records did not survive WWII, Tony Wood supplemented the list with claims from other published sources, such as Donald Caldwell’s (1996) JG26 war diary, to obtain a comprehensive list of German fighter claims for the years 1939 to 1945.

We clean the Perry-Wood fighter claims records of typos, such as misspelled names, ranks, and units, and then construct a monthly panel by aggregating the information for every pilot by month and year. This panel contains the number of monthly victories per pilot together with the pilot’s first and last name, rank, wing, unit and squadron. We then match the panel data with additional information from the Kracker Luftwaffe Archive. Kracker’s archive collects detailed personal data on German fighter pilots, such as the type of awards together with the award date (e.g., the Iron Cross 2nd and 1st class, Knight’s Cross, …), the pilot’s war status (e.g., whether he was killed in action, a prisoner of war, or a WWII survivor), and for some pilots the starting point of their Luftwaffe career. With this information at hand, we have for every pilot in the sample the date when he received an award, his war status, and how long he was active during WWII. We typically observe pilots for the first time when they file their first claim with the OKL; our database does not contain pilots who never scored during combat.
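The aggregation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used: the record layout (pilot name plus claim date), the sample rows, and the choice to fill months without a claim with zero between a pilot's first and last claim are all hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical claim records: (pilot name, date of claimed victory).
# The actual claims lists carry more fields (rank, wing, unit, squadron, ...).
claims = [
    ("Pilot A", date(1941, 6, 22)),
    ("Pilot A", date(1941, 6, 30)),
    ("Pilot A", date(1941, 8, 2)),
    ("Pilot B", date(1941, 7, 15)),
]

def monthly_panel(records):
    """Count victories per pilot and calendar month.

    Assumption: months without a claim are filled with zero between each
    pilot's first and last observed claim.
    """
    counts = Counter((name, d.year, d.month) for name, d in records)
    span = {}
    for name, d in records:
        idx = d.year * 12 + (d.month - 1)  # running month index
        lo, hi = span.get(name, (idx, idx))
        span[name] = (min(lo, idx), max(hi, idx))
    panel = []
    for name in sorted(span):
        lo, hi = span[name]
        for idx in range(lo, hi + 1):
            year, month0 = divmod(idx, 12)
            month = month0 + 1
            panel.append((name, year, month, counts.get((name, year, month), 0)))
    return panel

panel = monthly_panel(claims)
for row in panel:
    print(row)
```

Each row of `panel` is one pilot-month; award dates and war status from a second source could then be attached by matching on the pilot's name.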

We also use information on the official introduction of medals, such as the Oak Leaves to the Knight’s Cross, quotas for medals from Obermaier (1989), and the total number of aerial victories of Legion Condor pilots that participated in the Spanish Civil War (from Jim Perry’s and Tony Wood’s list of Spanish Civil War combat claims).

For our analysis we only use information on day fighter pilots. That is because the skills required of night fighter pilots were substantially different. While day fighters mostly battled against other fighter pilots, night fighters were principally used to intercept bombers.

Our sample is unbalanced and consists of over 5,000 fighter pilots of the German Luftwaffe who made at least one combat claim during WWII. On average, pilots are in observation for 19 months, giving a total of 96,127 observations. Of the 5,081 pilots in our sample, 3,242 (equivalent to 64%) exit our sample – meaning they are not in the dataset in the next month, and that month does not coincide with the end of the war. We count these as casualties. Spot checks suggest that the vast majority of cases thus identified do indeed refer to pilots either killed or severely wounded.9

8 For more information about Tony Wood’s combat claims and the Kracker Luftwaffe Archive, see https://web.archive.org/web/20130928070316/http://lesbutler.co.uk/claims/tonywood.htm and http://www.aircrewremembered.com/KrackerDatabase/.
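The exit definition can be written as a simple rule. The sketch below assumes, hypothetically, that each pilot is summarized by the last month in which he appears and that the sample ends in April 1945 (the last month covered by Figure 3); the function name and data layout are illustrative and do not come from the underlying data set.

```python
# Assumed final month of the sample as (year, month); Figure 3 runs to 1945/4.
END_OF_SAMPLE = (1945, 4)

def is_exit(last_observed, end_of_sample=END_OF_SAMPLE):
    """True if the pilot drops out of the data before the sample itself ends."""
    return last_observed < end_of_sample

# Hypothetical last-observed months for three pilots.
last_seen = {"Pilot A": (1943, 10), "Pilot B": (1945, 4), "Pilot C": (1944, 6)}
exits = {name: is_exit(month) for name, month in last_seen.items()}

# Share of exits reported in the text: 3,242 of 5,081 pilots.
reported_share = 3242 / 5081
print(exits, round(reported_share, 2))
```

The last line reproduces the 64% exit share quoted in the text from the two reported counts.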

D. Pilot performance

The German air force in WWII counted amongst its ranks the highest-scoring aces of all time. 409 pilots scored 40 victories or more in WWII; 379 were German, 10 from the USSR, 7 from Japan, 6 from Finland, 1 from the US, and zero from the UK. The best-scoring ace in history was Erich Hartmann, with 352 aerial victories. The highest-scoring non-German ace was Ilmari Juutilainen from Finland, with 94 victories; the best-performing Soviet, British and American pilots were credited with 66, 40 and 38 kills, respectively. Figure 1 gives the distribution and nationality of aces in WWII:

[Figure: total WWII aerial victories (vertical axis) plotted against rank (horizontal axis), shown separately for Germany and other countries.]

Figure 1: Aerial victories, total for WWII, by rank (gaps=ties)

9 A more difficult issue is aces who were given desk jobs. They stay in our data set, but are effectively inactive for extended periods. Adolf Galland, for example, became General Inspector of Fighters in 1941, and clocked zero victories from December 1941 to April 1945. In April 1945, after being sacked by Göring, he scored another 5. In cases such as this one, we add an additional variable for inactivity, designed to capture vacations, recuperation periods in hospital, and transfer to staff duties.


The high concentration of aces in the German air force reflects three factors: (i) the “fly till you die” rule – Western air forces rotated pilots out of active duty after a fixed number of sorties, but German pilots continued to fly until they were shot down; (ii) the very poor quality of planes and training in the USSR at the start of WWII; and (iii) greater experience vis-à-vis enemy air forces during the early stages of the conflict, driven in part by German participation in the Spanish Civil War (Simkin and Roychowdhury 2008). Overall, the German air force records document 53,008 claims of aerial victories. These are credited kills, not only claims. In an average month, an average German pilot scored 0.55 victories and faced a risk of 3.4% of exiting the sample permanently, synonymous in almost all cases with death. In the East (West), the victory rate was 0.97 (0.32) and the exit rate 0.029 (0.037). In other words, the exchange ratio (victories per exit) was about 33 in the East and 8.6 in the West.
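The exchange ratios follow from dividing the monthly victory rate by the monthly exit rate. A short check in Python, using only the rates reported in the text (the function and variable names are illustrative):

```python
# Monthly victory and exit rates per pilot, as reported in the text.
rates = {
    "East": {"victory": 0.97, "exit": 0.029},
    "West": {"victory": 0.32, "exit": 0.037},
}

def exchange_ratio(victory_rate, exit_rate):
    """Enemy aircraft claimed per German pilot exiting the sample."""
    return victory_rate / exit_rate

ratios = {front: exchange_ratio(r["victory"], r["exit"]) for front, r in rates.items()}
for front, ratio in ratios.items():
    print(f"{front}: {ratio:.1f}")
```

This gives roughly 33 for the East and 8.6 for the West, matching the figures in the text.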

Scores were highly unevenly distributed. The top-scoring 110 pilots achieved as many aerial victories as the bottom 4,900 put together. In an average month, the vast majority of pilots (over 80%) failed to score even a single victory. Some, on the other hand, notched up large numbers of victories quickly: Emil Lang shot down 68 enemy planes in October 1943, and Hans-Joachim Marseille scored 17 victories in a single day, on 1 September 1942. Figure 2 shows the number of monthly victories per pilot, plotted against the quantiles of the distribution:


Figure 2: Cumulative Distribution of Monthly Victory Scores

There was a large seasonal component to air combat. The summer season – when ground operations were common, and hours of daylight were long – also saw substantial spikes in aerial activity; the winter months saw a lull in fighting. The next figure plots mean victory and exit (death) rates over time. Peaks in both series mostly coincide, except for the end of the war when the victory rate plummeted and the exit rate spiked.

[Figure: mean monthly victory rate and mean monthly exit rate per pilot, by year and month, 1939m1 to 1945m1.]

Figure 3: Mean Victory Rate per Pilot and Month – 1939/9 to 1945/4

E. Organization and training

The Luftwaffe was divided into air fleets (Luftflotten). These were responsible for particular geographical areas. Their number rose from four to seven during World War II. Air corps within each air fleet controlled the planes and men; air districts were responsible for infrastructure.

The air corps were composed of wings (Geschwader), with approximately 100 planes each. The wings were organized by function, with different Geschwader for fighter planes, long-range bombers, dive bombers, reconnaissance, etc. Each wing would contain several groups, all dedicated to the same specialized function. Each group had 3-4 squadrons (Staffeln), with an established strength of 12 planes each.

Pilots would first learn to fly, before receiving training in more specialized skills such as aerial combat. They would begin with “boot camp”, with an emphasis on physical fitness and military discipline. After some basic training in aeronautics, they would then move on to an elementary flying school. Once they had their pilot’s wings (after 100-150 hours), prospective fighter pilots were sent to an air combat school. Upon completion, they were attached to a squadron or group in an operational training unit at the front, where they learned from experienced pilots before transferring to combat units.

F. Illustrative evidence on the incentive effects of awards

Soldiers received awards on the recommendation of the commanding officer, supported by other superiors. There was no mechanical rule that would have determined when a pilot had “done enough.” Using actual awards is problematic because of reverse-causality issues – pilots may have (by sheer luck) a good month, with many victories, and receive a medal. Simple reversion towards the mean would then ensure that the effect goes away in the next month.

To illustrate the effect of awards, we use the informal quotas established by the Luftwaffe instead. Pilots knew that if they “got close” to the target numbers, their chances of receiving an award would go up substantially. Figure 4 illustrates the magnitudes involved:

[Figure: probability of KCR award (vertical axis) against score relative to quota (horizontal axis, -100 to 100); local polynomial smooth with 95% CI]

Figure 4: Cumulative Probability of Receiving a Knight’s Cross, relative to quota (0=quota)

The chance of receiving a Knight’s Cross is essentially zero unless a pilot is within 10-15 victories of the quota. The likelihood is highest at the quota, or just above it, and then falls gradually. By the time a pilot exceeded the quota by 50, he was very likely to already have the award. In an average month, for pilots whose score is not within ±15 victories of the quota, the probability of receiving a Knight’s Cross (KCR) is 0.0015. Within this window, the probability rises to 0.043, about 29 times higher.
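The ratio quoted above can be checked with one line of arithmetic. The sketch below assumes that both figures are monthly probabilities on the same scale; the variable names are ours and purely illustrative.

```python
# Monthly probability of receiving a Knight's Cross, as quoted in the text.
# Assumption: both numbers are expressed on the same scale.
p_outside_window = 0.0015  # score more than 15 victories away from the quota
p_within_window = 0.043    # score within +/-15 victories of the quota

# Relative likelihood of an award inside the window versus outside it.
ratio = p_within_window / p_outside_window
print(f"Relative likelihood within the window: {ratio:.1f}x")
```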


We use the quota system to establish the incentive effect of medals. This allows us to side-step the reverse causation problem. There are important incentives to score even without getting close to a quota: pilots are held in higher esteem by their colleagues, and survival in some situations may depend on aerial victory. In most situations and with most cumulative scores, scoring does not lead to receiving a medal. However, the same pilot with the same inherent skill and score-invariant incentives knows, when he is near the quota, that the next N victories will get him over the threshold. If pilots discount the future sufficiently, they will make greater efforts as they get close to the quota.10 They will also rationally relax and exert less effort once they have reached the quota.
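Footnote 10 gives a sense of how heavily pilots may have discounted the future. As a rough sketch, and assuming a constant monthly exit probability (a simplification introduced here only for illustration), a median survival time of 21 months can be converted into an implied monthly exit rate:

```python
# Assumption: exit follows a constant-hazard process in discrete months.
MEDIAN_SURVIVAL_MONTHS = 21  # from footnote 10

# Solve (1 - h) ** 21 = 0.5 for the monthly exit probability h.
monthly_exit_rate = 1 - 0.5 ** (1 / MEDIAN_SURVIVAL_MONTHS)


def survival_probability(months, h=monthly_exit_rate):
    """Probability of still being in the sample after `months` months."""
    return (1 - h) ** months


print(f"Implied monthly exit rate: {monthly_exit_rate:.4f}")
print(f"Probability of surviving 12 months: {survival_probability(12):.3f}")
```

Under this assumption the implied exit rate is a little over 3% per month, which is one way to read the claim that rewards in the distant future carried little weight.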

Hartmann Grasser is a good illustration of the incentive effects of awards that we see in the aggregate (Figure 5). He averaged 2.5 victories per month after receiving the Knight’s Cross in September of 1941. Then, as he approached the quota for the Knight’s Cross with Oak Leaves (80 victories), he suddenly scored 10, 17, and 9 in three consecutive months, until reaching the quota threshold for the Oak Leaves in September of 1942.

Figure 5: Cumulative Score of Hartmann Grasser (0 designates the month Grasser passed the quota for the Oak Leaves.)

To examine results systematically, we regress monthly victories on whether a pilot is “close” to the quota (+/- 15 victories), controlling for a variety of factors including pilot quality, squadron fixed effects, front, and aircraft type. Table 1 gives the results of estimating regressions of monthly victory counts on being close to the quota, for both the Knight’s Cross and the Oak Leaves (the two dominant award categories). We find that being close to the quota increased performance by 1.1 to 2.2 victories per month, depending on specification and award. Results are robust to our controls, and do not consistently decline in size as we add additional regressors. The most demanding specification, with a full set of controls including front, squadron fixed effects, and time fixed effects, yields a coefficient estimate that is very similar to the simple OLS estimate. As the case of Hartmann Grasser illustrates, once pilots got above the quota, they typically reduced their level of effort.

10 Given the average exit rate in our data, half of all pilots could expect to die within 21 months.
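To make the close-to-quota regressor concrete, the following minimal sketch builds the ±15 dummy and computes the coefficient from a regression on that dummy alone, which equals the difference in mean monthly victories between the two groups. The pilot-month records and field names are hypothetical, and the sketch omits the controls and fixed effects used in Table 1.

```python
from statistics import mean

QUOTA_WINDOW = 15  # +/- 15 victories, as in the text


def close_to_quota(cumulative_score, quota, window=QUOTA_WINDOW):
    """Return True if the cumulative score lies within the window around the quota."""
    return abs(cumulative_score - quota) <= window


def dummy_coefficient(rows, quota):
    """OLS slope on a single 0/1 regressor, i.e. the difference in group means."""
    near = [r["victories"] for r in rows if close_to_quota(r["score"], quota)]
    far = [r["victories"] for r in rows if not close_to_quota(r["score"], quota)]
    return mean(near) - mean(far)


# Hypothetical pilot-months: cumulative score at start of month, victories that month.
toy_rows = [
    {"score": 30, "victories": 1},
    {"score": 50, "victories": 0},
    {"score": 70, "victories": 3},
    {"score": 85, "victories": 2},
    {"score": 120, "victories": 1},
]
coefficient = dummy_coefficient(toy_rows, quota=80)
print(f"Extra victories per month when close to the quota: {coefficient:.2f}")
```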

There is a risk of reverse causation in Table 1: pilots could have a positive “run” of victories, and hence find themselves close to the quota. This would be captured by the close-to-quota dummy. To side-step this issue, we instrument being close to the quota with quota changes. The identifying assumption is that, for an individual pilot, the factors that led to a change in quota are uncorrelated with performance. The first stage is strong and highly significant (cf. Appendix A.#). We obtain an F-statistic of 26.3, and a p-value of 0.05 on the Anderson-Rubin test. For pilots who were close to the quota for exogenous reasons, we estimate an effect that is larger than the OLS coefficients in col. 1-4. The reason is that the share of compliers is markedly smaller: 4.5% vs. 7.7% for periods without change. Once we scale by the share of compliers, the coefficients are indistinguishable.
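The logic of the instrumental-variable step can be sketched with the simplest IV estimator, the Wald ratio of the reduced form to the first stage. This is only an illustration under strong assumptions: the instrument is treated as binary (quota changed or not), whereas the quota change used in the paper need not be, and the data below are invented so that the true effect is exactly 2 victories.

```python
from statistics import mean


def wald_estimate(y, x, z):
    """IV estimate with a binary instrument z: reduced form divided by first stage."""
    y1 = mean([yi for yi, zi in zip(y, z) if zi == 1])
    y0 = mean([yi for yi, zi in zip(y, z) if zi == 0])
    x1 = mean([xi for xi, zi in zip(x, z) if zi == 1])
    x0 = mean([xi for xi, zi in zip(x, z) if zi == 0])
    first_stage = x1 - x0
    if first_stage == 0:
        raise ValueError("instrument does not move the treatment")
    return (y1 - y0) / first_stage


# Hypothetical data: z = quota change, x = close-to-quota dummy, y = monthly victories.
z = [0, 0, 0, 0, 1, 1, 1, 1]
x = [0, 0, 0, 1, 0, 1, 1, 1]
y = [1 + 2 * xi for xi in x]  # constructed so the effect of x on y is 2

effect = wald_estimate(y, x, z)
print(f"Wald IV estimate: {effect:.2f}")
```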

Table 1: Monthly victory score of pilots (conditional on getting close to the quota, +/- 15 victories)

Next, we examine the effect of the introduction of new awards, such as the Swords in the autumn of 1941. The idea is to examine changes in the performance of pilots who had already received the highest award, and who were now given something new to compete for. We compare them with pilots who are just as good, based on their average monthly victory score after controlling for peer effects,11 but who still have lower awards to look forward to. Specifically, we form comparison groups amongst pilots who will all eventually obtain a Knight’s Cross, and are in the upper half of the skill distribution. We then examine how pilots who have the Oak Leaves already perform after the introduction of the Swords, comparing them with all pilots on similar trajectories taken from the elite comparison group.

Figure 6 illustrates the results. Before September 1941, pilots with the Oak Leaves are on a very similar trajectory to (elite) pilots without the Oak Leaves. Both groups score on average 3-4 victories per month, far above the sample average of 0.57. Immediately after the introduction of the Swords, the Oak Leaves holders’ performance jumps. They score an extra 14 victories in October, and continue to perform at a higher rate than the pilots in their comparison group thereafter.

Figure 6: Synthetic control method. Introduction of the Swords to the Knight’s Cross with Oak Leaves

11 Using the method in Mas and Moretti (2009).

III. MAIN RESULTS: INDIRECT EFFECTS OF PEER RECOGNITION

To examine the effect of peer recognition, we focus on mentions in daily bulletins. This is for several reasons. First, mentions in the Wehrmachtbericht were highly visible, and distributed throughout German-held territory quickly. News of major awards travelled more slowly, and with more variability. Second, mentions in the daily bulletin were largely unexpected. This contrasts with expectations of an award driven by the quota system. Third, the large number of medals awarded means that someone was almost always receiving a medal somewhere. This complicates the analysis since there are effectively almost no “quiet periods.”

We first examine whether a good pilot being mentioned in the Wehrmachtbericht changed the performance of other “good pilots” at the same time. We control for time variation, and for the fact that the mention month may itself have been unusual, by using a dummy variable. The variable of interest is the interaction between the pilot quality percentile dummy and the mention period dummy. If it is strongly positive, pilots in the nth percentile outperformed strongly in a month when another ace was receiving public recognition.

Table 2 gives the results. We find that pilots in all ranks from the 80th percentile upwards show greater performance in periods when an ace is mentioned in Wehrmacht dispatches. The additional outperformance, on top of their average outperformance, is substantial. At the opposite end of the skill distribution, we actually find underperformance during these months. Pilots who are not in the top 20% see their monthly score decline, but this may in part reflect their higher rate of exit, which will directly affect the victory scores recorded in the month of the mention. Below, we show that the exit rate of low-ranked pilots increases sharply during periods when peers get mentioned.12
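The interaction term can be illustrated in its simplest form. With one quality dummy and one mention-period dummy and no further controls, the interaction coefficient of the saturated regression equals a difference-in-differences of cell means. The sketch below uses invented pilot-months and a single top-20% dummy, which is a simplification of the percentile dummies used in Table 2.

```python
from statistics import mean


def interaction_effect(rows):
    """Interaction coefficient in a saturated 2x2 model (difference-in-differences)."""
    def cell(top, mention):
        return mean([r["victories"] for r in rows
                     if r["top"] == top and r["mention"] == mention])

    top_change = cell(1, 1) - cell(1, 0)    # top pilots: mention vs. quiet months
    rest_change = cell(0, 1) - cell(0, 0)   # other pilots: mention vs. quiet months
    return top_change - rest_change


# Hypothetical pilot-months (top = 1 for pilots at or above the 80th percentile).
toy_rows = [
    {"top": 1, "mention": 0, "victories": 3},
    {"top": 1, "mention": 0, "victories": 4},
    {"top": 1, "mention": 1, "victories": 6},
    {"top": 1, "mention": 1, "victories": 5},
    {"top": 0, "mention": 0, "victories": 1},
    {"top": 0, "mention": 0, "victories": 0},
    {"top": 0, "mention": 1, "victories": 0},
    {"top": 0, "mention": 1, "victories": 0},
]
did = interaction_effect(toy_rows)
print(f"Differential outperformance of top pilots in mention months: {did:.2f}")
```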

12 An alternative interpretation is that poor pilots become demotivated, when confronted with the

Table 2: Percentile of Pilot’s Rank and Mention in Dispatches of Other Pilots – By Overall Performance

The definition of peer in Table 2 is based on performance tiers only. Next, instead of calling everyone a peer who flies at the same time and who is also highly ranked, we focus on pilots who flew with the mentioned pilot in the past (but have since been posted to another unit). Figure 7 illustrates our identification strategy, using the case of two pilots: Walter Nowotny, one of the highest-scoring aces of World War II, and Horst Ademeit. During the spring of 1941, they served together in the same squadron. Then Ademeit remained while Nowotny was transferred to various units; eventually Ademeit was moved, too. In September 1943, Nowotny is mentioned in the Wehrmachtbericht. We classify Ademeit as a ‘past squadron peer’, and compare his performance in the month of Nowotny’s citation with other months.
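The classification of past squadron peers can be sketched as a lookup over pilot-month squadron assignments. The data layout, the squadron labels, and the pilot "Wingman" below are hypothetical and serve only to illustrate the rule: a past peer shared a squadron with the mentioned pilot in an earlier month and is in a different squadron in the mention month.

```python
def past_squadron_peers(assignments, mentioned, mention_month):
    """Pilots who flew with `mentioned` before `mention_month` but are now elsewhere.

    `assignments` maps (pilot, (year, month)) to a squadron label.
    """
    current = assignments.get((mentioned, mention_month))
    shared_before = set()
    for (pilot, month), squadron in assignments.items():
        if pilot == mentioned or month >= mention_month:
            continue
        # Same squadron as the mentioned pilot in that earlier month?
        if assignments.get((mentioned, month)) == squadron:
            shared_before.add(pilot)
    # Keep only pilots observed in the mention month in a different squadron.
    return {p for p in shared_before
            if assignments.get((p, mention_month)) not in (None, current)}


# Hypothetical assignments; squadron labels are placeholders.
assignments = {
    ("Nowotny", (1941, 4)): "sq_A",
    ("Ademeit", (1941, 4)): "sq_A",
    ("Wingman", (1941, 4)): "sq_A",
    ("Nowotny", (1943, 9)): "sq_B",
    ("Ademeit", (1943, 9)): "sq_C",
    ("Wingman", (1943, 9)): "sq_B",  # still flying with Nowotny: not a past peer
    ("Other", (1943, 9)): "sq_C",    # never shared a squadron with Nowotny
}
peers = past_squadron_peers(assignments, "Nowotny", (1943, 9))
print(sorted(peers))
```

Excluding pilots who currently share a unit with the mentioned pilot is what keeps contemporaneous squadron-level shocks out of the comparison.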


Figure 7: Identification strategy. Note: red dashed line indicates mention in the Wehrmachtbericht for Walter Nowotny

Our strategy has the advantage of not being contaminated by correlated shocks that may increase victory scores for everyone in the same unit (such as a major offensive, or good flying weather). Table 3 shows first that past peers from the same squadron react markedly during periods when a pilot is mentioned in the Wehrmachtbericht. The coefficient survives adding a large set of controls, as well as squadron and time fixed effects (col. 2-4). Using the most demanding specification, we also examine if pilots in the same group saw a bump in performance. Groups were larger units comprising 3 or 4 squadrons. They would typically be operating from the same airfield, but not necessarily so. We find outperformance for ‘group peers’, but a smaller effect. Finally, we form even broader comparison groups by tracking the location of every airfield from which German fighter units operated during World War II. We use this information to create a dummy for ‘base peers’. Many of these would have been in the same group, but other groups would also sometimes have operated from the same airfield. We continue to find an effect, but it is smaller: as the intensity of (past) interaction declines, we obtain smaller effects from the mention in the daily bulletin.


Table 3: Peer Effects during Mention Periods

G. Birthplace peers

We interpret the effects of peer recognition as driven by a desire to keep up with one’s peers. In other words, the increase in the number of victories is compatible with an interpretation emphasizing competition over status.

To examine further if status competition is a likely explanation for our findings, we test if those born in towns close to each other react more to a mention in dispatches. For a total of 352 aces, we were able to determine their birthplace. We already know that among the aces, the average effect is relatively large. How much bigger is the increase in the number of victories when someone from the same region is mentioned? For those born close to each other, the effect of a mention in dispatches is particularly large. At a distance of less than 20km, the extra “bump” during the mention month amounts to 3.5 extra victories. Further away, at a distance of 400km, say, the increase is “only” 2.5. This suggests that the social multiplier effect is particularly large when someone of a more similar background is mentioned.
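The distance measure behind this comparison can be sketched as follows. The paper does not state how birthplace distance is computed, so the great-circle (haversine) formula and the natural logarithm are assumptions; the axis label lnpeerdistance in Figure 8 suggests a log scale. The coordinates for Berlin and Munich are approximate and are used only as an example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def log_peer_distance(km):
    """Natural log of the birthplace distance (assumed transformation)."""
    return math.log(km)


# Approximate coordinates, for illustration only.
berlin_munich_km = haversine_km(52.52, 13.405, 48.137, 11.575)
print(f"Berlin-Munich: {berlin_munich_km:.0f} km")

# The two distances discussed in the text, on the log scale.
print(f"ln(20 km)  = {log_peer_distance(20):.2f}")
print(f"ln(400 km) = {log_peer_distance(400):.2f}")
```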

[Figure: vic (vertical axis, 2 to 4) against lnpeerdistance (horizontal axis, 3 to 7)]

Figure 8: Differential outperformance during mention periods, as a function of birthplace distance

H. Risk

Fighter pilots faced high risks of death: every aerial victory had to be bought at the cost of possible death. We do not have exact data on death, but we can construct a proxy based on exit from our sample. In those cases where it was possible to check, the pilot had actually died in the overwhelming majority of cases. Any analysis of the efficiency of the Luftwaffe incentive system has to consider both dimensions, the number of enemy fighters downed and the number of own men lost.

To examine the effects, we repeat the analysis in Tables 2 and 3, using the risk of exit from the sample as the dependent variable. We use survival analysis, performing Cox regressions on our data. Table 4 shows the hazard ratios for pilots, conditional on whether or not a past squadron peer is mentioned in the daily bulletin. Experience has only a limited effect on exit rates. On the Eastern Front, rates of death are clearly much lower.

For the sample as a whole, we find a markedly higher hazard during such periods, with the risk of exit being twice as high. The effect is larger for pilots below the 80th percentile of overall war performance (col. 2). Using the estimates with controls from Panel B, for pilots above the 80th percentile, there is a 53% increase in the hazard rate; for those below, an increase of 141%. For the best pilots (90th percentile or above), the effect is even smaller and not statistically significant.

Table 4: Risk of Death – Peer Performance Effects

                         (1)           (2)         (3)         (4)
Performance percentile   Full sample   <=80        80+         90+

Panel A: Estimates without controls
mentionperiod            0.986         1.032       0.984       0.707***
                         (-0.30)       (0.54)      (-0.21)     (-2.87)
pastpeerofmentioned      2.325***      2.647***    1.619**     1.607*
                         (5.81)        (5.34)      (1.98)      (1.67)
N                        95877         76772       19105       9741

Panel B: Estimates with controls
mentionperiod            0.976         1.011       0.981       0.768**
                         (-0.52)       (0.19)      (-0.24)     (-2.08)
pastpeerofmentioned      2.136***      2.408***    1.533*      1.489
                         (5.14)        (4.72)      (1.73)      (1.40)
Experience               0.998*        1.002       0.994**     1.002
                         (-1.78)       (1.06)      (-2.46)     (0.67)
Front                    0.872***      0.672***    0.922       0.792***
                         (-3.25)       (-6.79)     (-1.32)     (-2.59)
N                        95877         76772       19105       9741

Exponentiated coefficients; t statistics in parentheses. * p < .1, ** p < .05, *** p < .01
Note: Estimates of hazard ratios from Cox regressions. Panel B estimates also control for aircraft type (dummy variables for FW and BF types).
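The percentage increases quoted in the text follow directly from the hazard ratios in Panel B. As a short worked example (the helper names are ours), a hazard ratio converts into a percentage change in the hazard, and into the underlying Cox coefficient, as follows:

```python
import math


def pct_change_in_hazard(hazard_ratio):
    """Percentage change in the hazard implied by a hazard ratio."""
    return (hazard_ratio - 1.0) * 100.0


def cox_coefficient(hazard_ratio):
    """Log-hazard coefficient corresponding to an exponentiated estimate."""
    return math.log(hazard_ratio)


# pastpeerofmentioned, Table 4, Panel B.
panel_b = {"Full sample": 2.136, "<=80": 2.408, "80+": 1.533, "90+": 1.489}
for group, hr in panel_b.items():
    print(f"{group:12s} HR={hr:.3f}  change={pct_change_in_hazard(hr):6.1f}%  "
          f"coef={cox_coefficient(hr):.3f}")
```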

The estimates in Table 4 put the performance improvements in context. While performance increased for pilots on average in months when a former peer was mentioned, this increase was bought at the cost of higher exit rates. The effect differs by overall ability. Pilots below the 80th percentile of overall scores saw their victory rate decline by 0.36 victories in a month when a former peer was mentioned; at the same time, their exit rate increased by a factor of 2.4. In contrast, top pilots (above the 90th percentile) saw their victory rate increase by 0.25-2.5 victories, while their exit rate is not significantly higher during the month in question.


IV. ALTERNATIVE INTERPRETATIONS AND ROBUSTNESS

Next, we attempt to rule out potential confounding mechanisms, and we examine the robustness of our findings.

A. Correlated shocks and social learning

A natural confounding factor is the presence of unobserved, correlated shocks simultaneously affecting the outcomes of peer groups. To deal with the most likely correlated shocks, we focus on peers who are not part of the same squadron at the time of measurements. This approach deals with contemporaneous, squadron-level shocks. Nonetheless, we need to consider whether other omitted variables may be driving our results.

Aircraft equipment. Since performance in aerial combat is a function of both pilot ability and the quality of equipment, changes in performance could in principle be explained by changes in planes. It is possible that sudden increases in the number of aerial victories could be driven by similar aces receiving simultaneous upgrades in their planes, for example.

This is unlikely to drive our findings. We have information on the type of aircraft used for 83,000 observations out of 96,000 in total. Figure # in the appendix shows the distribution of aircraft types used. The greatest part of missions was flown in just four aircraft types: the BF 109 E, F, and G, and the FW 190. Combined, they accounted for 86% of all aircraft types used.

Did correlated upgrades of equipment contribute to the increase in performance during mention months? First, we define as peers those pilots who shared the same squadron in the past. The Luftwaffe typically upgraded entire squadrons, to facilitate maintenance and training. Typically, squadrons would be recalled to Germany, re-equipped, and sent back to the front. There is no anecdotal evidence of aces being given special treatment. To the contrary, in one case at least, an ace (Hans-Joachim Marseille) was forced to ‘upgrade’ to the BF 109G. This occurred despite his protests because his entire squadron was being re-equipped. Marseille promptly died when the (more powerful, but not very reliable) new engine failed on one of his first missions.

Second, we can directly control for the effect of aircraft type. Table 3, col. 3 includes dummy variables for the different aircraft types. Any systematic increase in performance as a result of aircraft upgrades should be captured in our data. Finally, we examine directly if the probability of flying a similar type of aircraft is systematically higher in months during which one ace is mentioned in the Wehrmachtbericht.

Learning. There could still be correlated shocks affecting pilots who belonged in a squadron in the past. Assume that pilots learned some specific skills from other pilots while flying together in the past, and that that skill became particularly useful in a specific period. Then, we would expect outstanding pilots to do particularly well, leading to a mention in the daily bulletin. At the same time, pilots with past exposure whom they trained, or who developed similar skills in the same environment, do better, too. Then we would find higher performance by past peers, in periods when aces get mentioned in the daily bulletin – but the reason would be correlated learning on the job, rather than motivation effects.

We again do not believe this alternative mechanism is likely to drive our results. First, we can control for squadron-level shocks (even in the past) by including fixed effects for squadrons that the pilot has been part of.

Nonetheless, it could be that what matters is having been part of a squadron at a specific point in time -- for example, during a particular campaign, or at the same time as a particular pilot. First, we examine directly the hypothesis that past exposure to a particular pilot (the one that will be mentioned in dispatches) improves performance in all periods. We add a dummy variable for the peer in question that takes the value of 1 for all periods after having served jointly. Adding this variable to our estimation does not change results. The variable of interest – mentionedperiod*pastpeer – is unaffected. Also, the fixed effects of having served together with a pilot who will eventually be mentioned vary from -0.7 to 1.01, with an average of 0.11. In other words, in 44% of all cases, the fixed effect of having flown with an ace who will be mentioned in dispatches is actually negative.

This leaves open the possibility that pilots may have picked up skills by flying together that become useful in particular, novel situations. If the mentioned pilot has a particularly good month – and gets mentioned in dispatches – his former peer could possibly also put his acquired skill to good use. If correlated performance across peers is to be explained by these alternative mechanisms, we should expect correlated performance to also occur when dispatch mentions were not made.

To examine this possibility, we first restrict the sample to former peers, i.e. pilots who flew at some point in the past with a pilot who will be mentioned in the daily bulletin during the war. We then regress the log of victories by pilot i (+.01) on the log of victories of the mentioned pilot (+.01), allowing us to estimate elasticities of performance directly:

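\[
\log(V_{i,t} + 0.01) = C + \alpha \log(V_{m,t} + 0.01) + \beta\, \mathrm{Mentionperiod}_{t} + \gamma\, \mathrm{Mentionperiod}_{t} \times \log(V_{m,t} + 0.01) + \varepsilon_{i,t}
\]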

Where C is a constant, α measures the correlation of victory scores between pilot i and his mentioned-in-dispatches peer m, β is the average change in log victories in a mention month for pilot i, and γ is the coefficient of interest: the change in the co-movement between pilot i's victory score and that of his former peer who is mentioned. An increase in the correlation during the mention period is a very demanding test. By definition, the mentioned pilot had an exceptionally good month when he is cited in the daily bulletin. For his former peer to show an even greater correlation in mention periods requires a very sharp change in fortunes.
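The specification described here can be sketched with a small pure-Python OLS routine. Everything in the example is illustrative: the data are invented, the outcome is generated without noise from assumed coefficients so that the estimator should recover them, and the controls and fixed effects of Table 5 are omitted.

```python
import math


def log_victories(v, shift=0.01):
    """Log of the monthly victory count, shifted by 0.01 so that zeros are defined."""
    return math.log(v + shift)


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


# Assumed "true" values of C, alpha, beta, gamma, used only to generate toy data.
TRUE = (-3.0, 0.1, -0.1, 0.05)
# (victories of the mentioned pilot, mention-period dummy); hypothetical months.
sample = [(0, 0), (1, 0), (2, 0), (5, 0), (3, 1), (6, 1), (10, 1)]

rows, y = [], []
for v_m, mention in sample:
    x = log_victories(v_m)
    rows.append([1.0, x, float(mention), mention * x])
    y.append(TRUE[0] + TRUE[1] * x + TRUE[2] * mention + TRUE[3] * mention * x)

C, alpha, beta, gamma = ols(rows, y)
print(f"C={C:.3f} alpha={alpha:.3f} beta={beta:.3f} gamma={gamma:.3f}")
```

Because both sides are in logs, alpha and alpha plus gamma can be read as elasticities in quiet and mention periods respectively.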

This is exactly what we find: In non-mention periods, the correlation between former peers (one of whom will eventually be mentioned) is 0.117; in mention periods, it is 0.17, or about 45% higher (col. 1). The effect holds up when controlling for front, experience, and aircraft type (col. 2) as well as time FE (col. 3). In the latter specification, the results suggest that the correlation during mention periods is more than twice as strong as during quiet periods. If we exclude pilots from the same squadron, because they might be subject to correlated shocks, we find a large increase in correlations during mention periods, compared to a relatively small baseline correlation (col. 4). The implied increase in correlations is now by a factor of five. If we also exclude pilots from the same group, the baseline correlation (in quiet times, without mentions) is statistically indistinguishable from zero, while the increase during mention periods remains positive and significant (col. 5).13
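The implied ratios can be reproduced from the point estimates in Table 5 by dividing the mention-period elasticity (baseline plus interaction) by the baseline elasticity. Column 5 is left out of this sketch because its baseline estimate is negative and close to zero, so a ratio is not meaningful there.

```python
# Point estimates from Table 5: (baseline elasticity, interaction with mention period).
table5 = {
    "(1) Baseline": (0.117, 0.0530),
    "(2) Controls": (0.0949, 0.0573),
    "(3) +TimeFE": (0.0463, 0.0614),
    "(4) Samesq=0": (0.0190, 0.0806),
}


def mention_ratio(baseline, interaction):
    """Elasticity in mention periods relative to quiet periods."""
    return (baseline + interaction) / baseline


ratios = {col: mention_ratio(a, g) for col, (a, g) in table5.items()}
for col, r in ratios.items():
    print(f"{col:14s} mention-period elasticity is {r:.2f} times the baseline")
```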

13 Another potential confound, though less likely, is that observing that a pilot they knew has just been recognized could update pilots’ beliefs about their own skill and potential, if they consider the rewarded pilot as someone similar to themselves. This in turn could lead pilots to exert more effort and/or take more risk. However, since we observe increases in performance for both aces and non-aces (the latter not being similar in quality to the rewarded pilots), we believe that this alternative story is not the main driver of our findings, whereas competition over status might be more salient at the moment a peer gets recognized. The fact that the social multipliers are strongest when a peer gets mentioned also suggests that updates in beliefs about one’s own skill are not the main force behind our results.

Table 5: Correlation of Pilot Performance, Mention Periods

                              (1)          (2)          (3)          (4)          (5)
                              Baseline     Controls     +TimeFE      Samesq=0     Samegroup=0
                                                                     +SquadFE

Log(Victory+.01)_mentioned    0.117***     0.0949***    0.0463***    0.0190***    -0.00349
                              (31.13)      (25.50)      (11.91)      (4.12)       (-0.66)

Mentionperiod                 -0.134**     -0.123**     -0.114**     -0.0966      -0.0544
                              (-2.37)      (-2.20)      (-2.03)      (-1.40)      (-0.72)

Mentionperiod*                0.0530***    0.0573***    0.0614***    0.0806***    0.0690**
Log(Victory+.01)_mentioned    (2.72)       (2.97)       (3.22)       (3.29)       (2.57)

Front                                      0.483***     0.473***     0.460***     0.437***
                                           (22.19)      (21.21)      (17.19)      (14.40)

Experience                                 -0.0158***   -0.0146***   -0.0123***   -0.0101***
                                           (-21.53)     (-18.27)     (-12.52)     (-9.36)

Constant                      -3.117***    -3.053***    -2.115***    -2.480***    -3.082***
                              (-215.24)    (-96.23)     (-7.24)      (-6.65)      (-7.33)

N                             49451        49451        49451        35006        26882

t statistics in parentheses. * p < .1, ** p < .05, *** p < .01

B. Learning about one’s own ability vs status competition

The results in Table 3 show that peers of a mentioned pilot outperform during the mention period. This could also reflect peers’ learning about their own type and ability, instead of status competition. In both cases, during the month of the peer’s mention, former colleagues would try harder and score more.

One way to tackle this problem empirically is to separate our data into two categories: mentions of a former peer with a monthly victory score that exceeds the treated pilot’s own past performance, and those that are within the range of the treated pilot’s own accomplishments. Put differently, if Nowotny is mentioned with a monthly score that far exceeds Ademeit’s, Ademeit may be learning about his own type, having been through similar experiences himself during his time as a squadron peer of Nowotny. If, however, Ademeit had already scored as much as Nowotny does in September 1943, the month in which the Wehrmacht bulletin mention occurs, then it is more likely that status competition motivated Ademeit to do better.


Table 6: Correlation of Pilot Performance, Mention Periods

Panel A: mentioned pilot score above treated pilot max

Panel B: mentioned pilot score equal or below treated pilot max

The variables of interest here are on the second to fourth lines. As we can see, the effect is there in both samples – but it is clearly stronger among pilots who had already performed at the same level as the mentioned peer.

V. CONCLUSION

Social comparisons can lead to lower job satisfaction and demotivation (Card et al. 2012). There is also strong evidence that status competition is a potent force in human behavior, especially when it comes to consumption (Bertrand and Morse 2015; Kuhn et al. 2011).

In this paper, we examine if status competition can be a spur to greater effort. We examine data from the German Air Force during World War II. We focus on pilots who flew together in the past, and were then assigned to different squadrons. When one of them is mentioned in the daily bulletin of the German Armed Forces for his outstanding accomplishments, the former colleague on average ‘tries harder’ and scores more. The effect varies by skill group. The performance gains are concentrated amongst highly skilled pilots. Poor pilots also score more victories, but their gains are small (and may even be negative).

Importantly, we find that

Appendix

Table A.1: Pilots with more than 50 victories by war-end only

Award                                                         Number of recipients
Iron Cross 2nd class                                          Ca. 3 million
Iron Cross 1st class                                          Ca. 300,000
Knight’s Cross                                                7,364
Knight’s Cross with Oak Leaves                                890
Knight’s Cross with Oak Leaves and Swords                     160
Knight’s Cross with Oak Leaves, Swords and Diamonds           27
Knight’s Cross with Golden Oak Leaves, Swords and Diamonds    1

Table A.2: Higher awards in the German armed forces, World War II


[Figure: aircraft types (Bf 109 D/E/F/G/K, Fw 190 A/D, Me 262A and mixed categories) by number of man-months (0 to 30,000) and by estimated fixed effect (-2 to 2)]

Figure A1: Aircraft type, usage and fixed effect (95 and 99% CI) on victory scores

[Figure: close_kcr (vertical axis, 0 to .2) against quotachange (horizontal axis, -50 to 50)]

Figure A2: Quota change and probability of being close to the Knight’s Cross quota

[Figure: histogram of Mentioned Pilot FE (horizontal axis, -1 to 1); vertical axis: Fraction]

Figure A3: Mentioned pilots fixed effect

[Figure: vic (vertical axis, 0 to 80) against Inverse Normal (horizontal axis, -10 to 10)]

Figure A4: Distribution of Victory Scores

[Figure: logvic (vertical axis, -15 to 5) against Inverse Normal (horizontal axis, -15 to 5)]

Figure A5: Distribution of Log(Victory+0.01)