Simple Contracts with Adverse Selection and Moral Hazard∗

Daniel Gottlieb and Humberto Moreira†

October, 2015

Abstract

We study a principal-agent model with moral hazard and adverse selection. Risk-neutral agents with limited liability have arbitrary private information about the distribution of outputs and the cost of effort. We obtain conditions under which the optimal mechanism offers a small number of contracts. When output is binary, the optimal mechanism always offers the same contract to all types. More generally, under a multiplicative separability condition, the optimal mechanism offers at most two contracts. If, in addition, the principal’s payoff must be monotone (free disposal) and the distribution of outputs has the monotone likelihood ratio property, the mechanism offers a single debt contract. Our model suggests that offering just one or two contracts may be optimal in environments with adverse selection and moral hazard, where rich menus of contracts provide gaming opportunities to the agent. Therefore, we give an explanation for an old puzzle in contract theory: the scarcity of large menus of contracts.

∗We thank Eduardo Azevedo, Dirk Bergemann, Vinicius Carrasco, Sylvain Chassang, Gonzalo Cisternas, Alex Edmans, Eduardo Faingold, Leandro Gorno, Faruk Gul, Jason Hartline, Tibor Heumann, Bengt Holmström, Johannes Hörner, Ohad Kadan, Lucas Maestri, George Mailath, David Martimort, Stephen Morris, Roger Myerson, Larry Samuelson, Yuliy Sannikov, Jean Tirole, Rakesh Vohra, John Zhu, and seminar audiences at Arizona State University, FGV, HEC Montreal, Johns Hopkins University, Princeton University, PUC-Rio, Universidad de Chile, University of Pennsylvania, University of Pittsburgh/Carnegie Mellon University, the Wharton School, Yale University, and the BYU Computational Public Economics, 2013 LAMES, 2013 SBE, 2014 IWGTS, 2014 ESEM, 2015 AEA meetings, and 2015 ESWC for comments and suggestions. Rafael Mourão provided outstanding research assistance. Gottlieb gratefully acknowledges financial support from the Dorinda and Mark Winkelman Distinguished Scholar Award. Moreira acknowledges CNPq for financial support.

†Gottlieb: John M. Olin School of Business, Washington University in St. Louis [email protected]. Moreira: FGV/EPGE, [email protected].

Contents

1 Introduction
2 Two Outputs
   2.1 Statement of the Problem
   2.2 Benchmarks
   2.3 Contract Simplicity
3 Multiple Outputs
4 Free Disposal
   Debt Contracts
5 Procurement and Regulation
6 Conclusion
Appendix
   Proofs
      Theorem 1
      Theorem 2
      Proposition 1
      Proposition 2
   Examples
   Multiplicative Separability
References
Online Appendix

1 Introduction

Most real-world contracts are much simpler than theory predicts. In contrast to the predictions of standard adverse selection models, contracting parties offer a limited number of contracts, often a single one. In contrast to the predictions of standard moral hazard models, similar contracts are offered in fundamentally different environments. As Hart and Holmstrom (1987) and Chiappori and Salanie (2003) argue in their surveys of the literature:

The extreme sensitivity to informational variables that comes across from this type of modeling is at odds with reality. Real world schemes are simpler than the theory would dictate and surprisingly uniform across a wide range of circumstances. (Hart and Holmstrom, 1987, p. 105)

The recent literature (...) provides very strong evidence that contractual forms have large effects on behavior. As the notion that “incentives matter” is one of the central tenets of economists of every persuasion, this should be comforting to the community. On the other hand, it raises an old puzzle: if contractual form matters so much, why do we observe such a prevalence of fairly simple contracts? (Chiappori and Salanie, 2003, p. 34)

In this paper, we propose an answer to this puzzle based on the interaction between adverse selection, moral hazard, and limited liability. Most contracting situations have both adverse selection and moral hazard. Managers, for example, take actions that affect the firm’s profitability. At the same time, they usually have better knowledge about the efficacy of each action. Moreover, virtually all contracting parties have limited liability. Entrepreneurs raising capital from investors, for example, enjoy limited liability as the value of their equity cannot fall below zero. Also, anti-slavery laws enforce limited liability in employment contracts.

We consider a principal-agent relationship with bilateral risk neutrality and limited liability. The agent selects an unobserved “effort,” which may consist of a single or multiple tasks. The agent also has private information, in an arbitrary way, about the distribution of outputs and about effort costs, resulting in a model where types and efforts are multidimensional (possibly infinite dimensional) and unordered.

We show that the interaction between adverse selection, moral hazard, and limited liability imposes severe screening costs. With binary outcomes, the optimal mechanism offers a single contract to all agents regardless of the type space or the distribution of types. We generalize this result to settings with multiple outputs under a multiplicative separability condition. This condition, which is satisfied, for example, under the spanning condition of Grossman and Hart (1983), is equivalent to assuming that agents rank the “power” of all contracts equally. Under this condition, the optimal mechanism has at most two contracts. Additionally, if the output distribution is ordered by first-order stochastic dominance and the principal’s payoff must be monotone (“free disposal”), the optimal mechanism offers a single contract. If the output distribution also satisfies the monotone likelihood ratio property, this optimal contract consists of the principal taking a single debt contract or, equivalently, giving all agents the same call option.

More broadly, our paper identifies an important downside from offering flexibility to agents through menus of contracts: gaming. This is particularly stark in the model considered here, where both the principal and the agent are risk neutral. Then, the agent always selects the contract with the highest expected payment conditional on his effort, which is precisely the contract with the highest cost to the principal. That is, holding the agent’s effort fixed, reducing the number of contracts always increases the principal’s profits. In particular, when the principal can identify the contracts with the highest and lowest powers in a mechanism (i.e. when multiplicative separability holds), she can simultaneously reduce informational rents and increase efficiency by removing all intermediate contracts.

Although the framework we study has been widely applied to financial contracting, it has many other applications. One such application is procurement and regulation. Despite the central role that menus of contracts play in the theory of procurement and regulation, they are rarely observed in practice.¹ Accordingly, many papers try to identify conditions for simple procurement contracts to be close to optimal.² We generalize the classic model of Laffont and Tirole (1986, 1993) by allowing effort to affect the regulated firm’s costs stochastically and assuming that the firm has limited liability. We then determine conditions for the optimal mechanism to offer one or two contracts, and for the optimal contract to be a price cap. Since limited liability constraints are a key aspect of most procurement contracts (see, e.g., Burguet et al., 2012), our model provides an explanation for the absence of menus of contracts in procurement.

Our result on the optimality of simple contracts is related to the robustness intuition of Holmstrom and Milgrom (1987). However, the notion of robustness in our static model is different from the one in their seminal paper. Here, offering a single contract is robust in that it reduces the agents’ incentives to misrepresent their private information about the environment. In Holmstrom and Milgrom’s model, linear contracts are robust in the sense that they prevent the agent from readjusting effort over time.³ Moreover, as in their work, we also contribute to the applied literature by identifying assumptions under which researchers can focus on a simpler set of contracts when solving their models. For example, under the same assumptions as Innes (1990) (and, when there are more than two outputs, a multiplicative separability and an ordering assumption), there is no loss of generality in assuming that the optimal mechanism involves a single debt contract even if there is adverse selection, making it easy to obtain comparative statics results.

¹For example, Bajari and Tadelis (2001) argue that “the descriptive engineering and construction management literature (...) suggests that menus of contracts are not used. Instead, the vast majority of contracts are variants of simple fixed-price (FP) and cost-plus (C+) contracts.”

²Using the Laffont-Tirole framework, Rogerson (2003) and Chu and Sappington (2007) show that a pair of simple contracts can achieve a large fraction of the surplus under a certain range of parametric settings (75 or 73 percent when costs follow either uniform or power distributions, respectively) for quadratic costs. Bajari and Tadelis (2001) assume that there is a fixed cost of specifying each state of nature in the contract to rationalize the simplicity of observed contracts.

³Edmans and Gabaix (2011) extend the linearity results to a model in which the realization of noise occurs before the action. Holmstrom’s idea of robustness is different from the one used in the more recent literature that identifies contracts that perform well for large classes of priors. Notable contributions include Chassang (2013), who studies a class of calibrated contracts that are detail-free and approximate the performance of the best linear contract in dynamic environments when players are patient, and Carroll (Forthcoming), who shows that the best contract for a principal who faces an agent with uncertain technology and evaluates contracts in terms of their worst-case performance is linear.

Related Literature

We consider a principal-agent relationship with bilateral risk neutrality and limited liability, as is commonly studied in corporate finance (cf. Tirole, 2005). We build on this standard environment by adding adverse selection in an arbitrary way and allowing effort to be multidimensional.

Our work is related to a literature that identifies conditions for contracts to take the form of debt and for equilibria to have complete pooling. In a single-task moral hazard setting where contracts must be monotone, Innes (1990) and Poblete and Spulber (2012) show that contracts take the form of debt if the distribution of output satisfies the monotone likelihood ratio property. Our main focus is on the lack of menus of contracts, which, of course, can only be addressed by introducing adverse selection. Nevertheless, our Theorem 3 is reminiscent of their main result.⁴

In a signaling model of financial contracting, Nachman and Noe (1994) show that there is complete pooling if and only if firm types are strictly ordered by conditional stochastic dominance. When firms are ordered by conditional stochastic dominance, capitalists face a lemons problem: while they would like to offer better terms to healthier firms, those who are more likely to accept the contract are precisely the least healthy firms. Our papers emphasize different forces that may lead to pooling. In Nachman and Noe (1994), pooling occurs when the distribution of types induces a market breakdown for all but the worst contract. In our model, pooling happens because of moral hazard and limited liability: giving flexibility to agents requires the principal to leave excessive rents. For example, when output is binary, complete pooling occurs in our model for any parameters of the model (i.e. regardless of whether types are ordered).⁵

Similarly, DeMarzo and Duffie (1999) consider a signaling model of security design and show that, under a uniform worst-case condition and a monotonicity constraint, equilibrium contracts take the form of debt. Using this model, several authors studied whether intermediaries pool different assets in equilibrium. The conclusion depends on whether the security is designed before or after firms learn about the asset’s profitability (cf. DeMarzo (2005), Biais and Mariotti (2005), and Farhi and Tirole (Forthcoming)).

In a one-dimensional adverse selection setting, Guesnerie and Laffont (1984) showed that optimal mechanisms are “non-responsive” when the first-best allocation is decreasing. This occurs because optimality clashes with incentive compatibility, which requires allocations to be non-decreasing. The reason for pooling in our model is quite different from non-responsiveness. For example, if the agent only has private information about the distribution of output, the first best is increasing and therefore implementable in a pure adverse selection environment. Nevertheless, with multiplicative separability, the principal offers at most two contracts (see footnote 13).

More related to our work, Ollier and Thomas (2013) replace the traditional (interim) participation constraint with an ex-post constraint in a one-dimensional model with binary outcomes. They show that, under conditions that ensure that the first-order approach holds, there is no benefit from screening.

As argued previously, our application to procurement and regulation builds on Laffont and Tirole (1986, 1993). In their model, there is both adverse selection and moral hazard. However, because the link between effort, types, and output is deterministic, the model can be reduced to a pure adverse selection model. For this reason, they are often referred to as models with ‘false moral hazard’ (cf. Laffont and Martimort, 2002). We allow effort to affect the regulated firm’s costs stochastically so the problem cannot be reduced to a pure adverse selection model.

Picard (1987), Melumad and Reichelstein (1989), and Caillaud et al. (1992) also introduce noise in the relationship between output and effort and show that, under certain conditions, the principal can achieve the same utility as in the absence of noise. Therefore, unlike in our model, they find that there is no cost from moral hazard. Our model differs from theirs in two ways. First, we also allow the agent to have private information about the distribution of output, while they assume that all private information concerns the cost of effort. Introducing private information about the distribution makes moral hazard costly. Second, they do not assume that agents have limited liability. Limited liability also prevents the principal from eliminating moral hazard at no cost.

In Section 2, we present the model with two outputs and discuss the benchmark cases of pure adverse selection and pure moral hazard. In Section 3, we generalize the results for multiple outputs. In Section 4, we introduce the free disposal constraint and obtain conditions for the optimality of debt. In Section 5, we present the application of our model to procurement and regulation. Then, Section 6 concludes. All proofs are in the appendix.

⁴See Jewitt et al. (2008) for a general analysis of moral hazard models with limited liability. Moral hazard models with bilateral risk neutrality, limited liability, and monotonicity include, for example, Matthews (2001), Dewatripont et al. (2003), Poblete and Spulber (2012), and Chaigneau et al. (2014). Adverse selection models in this setting include Nachman and Noe (1994), DeMarzo and Duffie (1999), DeMarzo (2005), and DeMarzo et al. (2005).

⁵See footnote 20 for a discussion of Nachman and Noe’s requirement that types be ordered by conditional stochastic dominance in the context of our model.

2 Two Outputs

2.1 Statement of the Problem

We start with a two-output model. There is a risk-neutral principal and a risk-neutral agent with limited liability. The agent has private information about the environment, captured by a type $\theta \in \Theta$. From the principal’s perspective, types are distributed according to a distribution $\mu$. We will discuss the assumptions on $\Theta$ and $\mu$ below.

The agent exerts an effort $e \in E$, which costs $c_e^\theta$. The space of possible efforts $E$ is a compact metric space. Effort can consist of a single task ($E \subset \mathbb{R}$) or multiple tasks ($E \subset \mathbb{R}^N$). The least-costly effort has a non-positive cost: $\min_{e \in E} c_e^\theta \le 0$. This condition is satisfied in standard frameworks where the lowest effort costs zero, as well as in more general multi-task frameworks that allow the agent to derive private benefits from certain actions.⁶

The principal does not observe the effort chosen by the agent. She does, however, observe the output from the partnership $x \in \{x_L, x_H\}$, which is stochastically affected by the agent’s effort. We refer to $x_H$ as a high output or as success, to $x_L$ as a low output or failure, and to $\Delta x := x_H - x_L > 0$ as the incremental output. Given effort $e$, a high output happens with probability $p_e^\theta$.

The type space $\Theta$ may be discrete or continuous, and types may be finite- or infinite-dimensional. Each type is fully characterized by the pair of functions $(p_\cdot^\theta, c_\cdot^\theta)$ specifying the probability of success and the cost of each effort.⁷ For example, if effort is binary, each type can be described by the four-dimensional vector $(p_0^\theta, p_1^\theta, c_0^\theta, c_1^\theta)$. If effort is continuous, each type is described by the infinite-dimensional function $(p_\cdot^\theta, c_\cdot^\theta) : E \to \mathbb{R}^2$. Our model does not require the agent to have private information about all of these dimensions, of course. The case where the cost of effort $e$ is common knowledge, for example, is accommodated by letting $c_e^\theta$ be constant in $\theta$. Note that we do not impose any order on the space of types and efforts. Success probabilities and costs may be non-monotone functions and, moreover, types and effort may be complements, substitutes, or neither in terms of probabilities and costs.

By the revelation principle, we can focus on direct mechanisms. Let $\mathcal{B}(\Theta)$ denote the Borel $\sigma$-field of $\Theta$. A direct mechanism is a triple of $\mathcal{B}(\Theta)$-measurable functions $(s, b, e) : \Theta \to \mathbb{R}^2 \times E$, consisting of fixed payments $s$ (or salaries), bonuses $b$, and effort recommendations $e$. An agent who reports type $\theta$ agrees to exert effort $e(\theta)$ and receives $s(\theta)$ in case of failure and $s(\theta) + b(\theta)$ in case of success. A pair of payments $s(\theta)$ and $b(\theta)$ is called a contract.

⁶Allowing the cost of the lowest effort to be positive makes participation random as in Rochet and Stole (2002) and is beyond the scope of this paper.

⁷We write $p_\cdot^\theta$ to denote the function $e \mapsto p_e^\theta$ that keeps $\theta$ constant and varies $e$. Similarly, $p_e^\cdot$ refers to $\theta \mapsto p_e^\theta$. The same notation is used for other functions.

Given a mechanism $(s, b, e)$, a type-$\theta$ agent gets payoff

$$U(\theta) := s(\theta) + p^\theta_{e(\theta)} b(\theta) - c^\theta_{e(\theta)}. \tag{1}$$

The mechanism must satisfy the following incentive compatibility (IC) and participation (IR) constraints:

$$U(\theta) \ge s(\hat\theta) + p^\theta_{\hat e}\, b(\hat\theta) - c^\theta_{\hat e}, \quad \forall \theta, \hat\theta, \hat e, \tag{IC}$$

$$U(\theta) \ge 0, \quad \forall \theta. \tag{IR}$$

The agent is protected by limited liability (LL), which prevents payments from being negative:⁸

$$s(\theta) \ge 0 \ \text{ and } \ s(\theta) + b(\theta) \ge 0, \quad \forall \theta. \tag{LL}$$

An optimal mechanism maximizes the principal’s expected profit

$$\int_\Theta \left\{ p^\theta_{e(\theta)} \left[ x_H - \left( s(\theta) + b(\theta) \right) \right] + \left( 1 - p^\theta_{e(\theta)} \right) \left[ x_L - s(\theta) \right] \right\} d\mu(\theta) \tag{2}$$

among mechanisms that satisfy IC, IR, and LL. Two mechanisms are equivalent if, for almost all types, they give the same payoffs to both the principal and the agent. To ensure the existence of an optimal mechanism, we make the following technical assumptions, which are satisfied by all standard agency models:

Assumption 1. $\Theta$ is a complete separable metric space. $\mu$ is a probability measure on $\mathcal{B}(\Theta)$. For each $\theta \in \Theta$, $p_\cdot^\theta$ and $c_\cdot^\theta$ are continuous functions and, for each $e \in E$, $p_e^\cdot$ and $c_e^\cdot$ are $\mathcal{B}(\Theta)$-measurable functions.

2.2 Benchmarks

We first consider the benchmark cases of pure moral hazard and pure adverse selection. We show that, in both cases, the principal typically offers a different contract to each type. Therefore, the uniform contract result that we will obtain in Section 2.3 requires both moral hazard and adverse selection. For simplicity, we focus on the single-task case ($E \subset \mathbb{R}$) and assume that the probability of high output $p_e^\theta$ and the cost of effort $c_e^\theta$ are both non-decreasing in $e$ (i.e. effort increases the probability of success at a cost). We normalize the lowest effort to zero.

⁸Setting the reservation utility equal to zero is a normalization. The important assumption is that there exists an effort with low enough cost, so that any mechanism satisfying IC and LL also satisfies the IR (see footnote 6 and Lemma 2).

Pure Moral Hazard

Suppose the principal observes the agent’s type but does not observe effort. Without limited liability, the principal can implement the first best by “selling the firm” to each agent, i.e., paying a bonus equal to the incremental output $b(\theta) = \Delta x$ and offering a fixed payment that extracts the entire surplus $s(\theta) = c^\theta_{e(\theta)} - p^\theta_{e(\theta)} \Delta x$, where $e(\theta)$ is the first-best effort. With limited liability, the principal needs to leave rents to the agent if she wants to sell the firm. Then, it is profitable to distort the bonus downward, reducing the agent’s effort.⁹ Moreover, limited liability binds, so the agent gets a zero fixed payment.

Optimal contracts with and without limited liability vary in opposite dimensions: while, without limited liability, they have the same bonus ($b = \Delta x$) and different salaries, optimal contracts with limited liability have the same salary ($s = 0$) and different bonuses.¹⁰ In both cases, however, the principal offers different contracts to different types.¹¹

Moreover, these mechanisms are no longer feasible if types are unobservable. If offered contracts with the same bonus and different salaries, all types would select the one with the highest salary. Similarly, if offered the same salary and different bonuses, they would all pick the contract with the highest bonus. The principal can still screen unobservable types by varying both the salary and the bonus. In fact, without limited liability, this is typically optimal. Our main result shows that, with limited liability, the principal prefers not to offer a menu of contracts. Instead, the optimal mechanism offers a single contract despite the presence of many different types.¹²

⁹To see this, note that if the principal wants to recommend the lowest effort, she will pay $s(\theta) = b(\theta) = 0$ and get expected payoff $p_0^\theta x_H + (1 - p_0^\theta) x_L > x_L$. Suppose the principal pays $b(\theta) \ge \Delta x$ and the agent exerts effort $e$. The principal’s payoff is then $p_e^\theta [x_H - b(\theta)] + (1 - p_e^\theta) x_L \le p_e^\theta (x_H - \Delta x) + (1 - p_e^\theta) x_L = x_L$. Comparing these two inequalities, we can see that offering $b(\theta) \ge \Delta x$ is dominated by paying $s = b = 0$ and recommending $e = 0$. Then, because agents are paid a bonus lower than the incremental output, the efforts of all types $e(\theta)$ lie (weakly) below the first-best effort.

¹⁰For example, when there are only two efforts (say, 0 and 1) and the agent has limited liability, the principal offers a bonus $b(\theta) = \frac{c_1^\theta}{p_1^\theta - p_0^\theta}$ if she recommends the high effort ($e(\theta) = 1$). This bonus is strictly increasing in $c_1^\theta$ and $p_0^\theta$ and strictly decreasing in $p_1^\theta$.

¹¹As we will see in Section 2.3, with both adverse selection and moral hazard, the principal offers the same contract to all types. This optimal contract depends on the principal’s prior distribution over types. This is expected, since the general model converges to the pure moral hazard model as the principal’s prior becomes concentrated at a single type.

¹²If one showed that LL binds for all types when types are not observable, IC would imply that a single contract is offered. However, it is not obvious that the principal would refrain from using different salaries and bonuses to screen different types. In fact, with risk aversion, LL typically does not bind for all types and the optimal mechanism involves menus of contracts (Gottlieb and Moreira, 2014).

Pure Adverse Selection

Now suppose the principal observes the agent’s effort but not his type. If effort costs are common knowledge (that is, $\theta$ only affects the conditional probability of success), the principal can implement the first best by fully reimbursing the cost of each effort. The agent would be indifferent between all efforts and would therefore agree to choose the principal’s preferred one.¹³ The first best can no longer be implemented when the agent has private information about effort costs: if the principal offered to fully reimburse the effort costs of all types, they would all pretend to be the types with the highest costs.

Conversely, if the probability of success is common knowledge ($\theta$ only affects the cost of effort), the optimal mechanism posts a payment for each (observed) effort and agents choose their efforts based on their privately-known costs. As the next example illustrates, offering a menu of contracts can be optimal if the agent has private information about both probabilities and costs (but effort is still observable):

Example 1. There are two efforts (0 and 1, or “low” and “high”) and two types (A and B). The effort costs are $c_1^A = 1$, $c_1^B = \frac{2}{3}$, and $c_0^A = c_0^B = 0$. Given a high effort, the probability of success for type A is $p_1^A = \frac{2}{3}$ and for type B is $p_1^B = \frac{1}{3}$. We assume that the project fails with a high enough probability if they exert low effort and take $x_H - x_L$ to be large enough for the principal to want to implement high effort from both types.

In the appendix, we show that the optimal mechanism offers the following payments: $s(A) = 0$, $b(A) = \frac{3}{2}$, $s(B) = \frac{2}{3}$, and $b(B) = 0$. Notice that the principal uses salaries and bonuses to screen types: type A, which has a higher probability of success, gets zero salary and a positive bonus, whereas type B, which has a lower probability of success, gets a positive salary but no bonus. Moreover, this mechanism is no longer feasible if effort is not observable, since both types would choose $e = 0$.
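The payments in Example 1 can be checked with exact arithmetic. The sketch below is only an illustration: the function and variable names are ours, not the paper's, and it verifies feasibility (binding IR, slack IC under observable high effort) rather than optimality. It also shows why the menu breaks down once effort is unobservable: contract B pays its salary regardless of output, so any type that picks it and shirks earns at least that salary.

```python
from fractions import Fraction as F

# Primitives of Example 1 (low effort costs zero for both types).
cost_high = {"A": F(1), "B": F(2, 3)}   # c_1^A, c_1^B
p_high = {"A": F(2, 3), "B": F(1, 3)}   # p_1^A, p_1^B
# Contracts (salary s, bonus b) as stated in the example.
contract = {"A": (F(0), F(3, 2)), "B": (F(2, 3), F(0))}

def payoff_high(true_type, reported_type):
    """Payoff of true_type who reports reported_type and exerts the
    (observable) high effort: s + p_1 * b - c_1."""
    s, b = contract[reported_type]
    return s + p_high[true_type] * b - cost_high[true_type]

def observable_effort_check():
    """For each type, return (truthful payoff, payoff from misreporting)."""
    other = {"A": "B", "B": "A"}
    return {t: (payoff_high(t, t), payoff_high(t, other[t])) for t in ("A", "B")}

def shirking_lower_bound(reported_type):
    """With unobservable effort, low effort costs 0 and the salary is paid
    regardless of output, so shirking yields at least the salary."""
    s, _ = contract[reported_type]
    return s

checks = observable_effort_check()
print(checks)                      # truthful payoffs are 0; deviations are negative
print(shirking_lower_bound("B"))   # 2/3 > 0: shirking under contract B is profitable
```

Under observable effort both types earn zero and strictly lose by misreporting. Under unobservable effort, either type can secure at least 2/3 by choosing contract B and exerting low effort, so the mechanism no longer implements high effort.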

2.3 Contract Simplicity

We can now state the simplicity result with binary outcomes, which establishes that the principal offers a single contract:

Theorem 1. There exists an optimal mechanism that offers a single contract $(s, b)$ to all types, with $s = 0$ and $b < \Delta x$. Moreover, any optimal mechanism is equivalent to a mechanism that offers the same contract to all types.

The proof is based on three lemmas. The first one shows that IC and LL imply that IR never binds. This follows from the fact that the agent can always guarantee himself a non-negative payoff by picking the lowest effort and collecting the non-negative payments. The second lemma shows that any mechanism that pays a bonus greater than the incremental output to some type cannot be optimal. Any such mechanism gives the principal a payoff that is lower than if she offered all types a constant payment of zero.

¹³Note that, unlike in Guesnerie and Laffont (1984), the first-best allocation in this case is non-decreasing and is therefore feasible when the agent is not subject to moral hazard. Complete pooling in our model is due to the interaction between adverse selection, moral hazard, and limited liability, not because of non-responsiveness.

The last lemma establishes that for any mechanism with bonuses lower than the incremental output, if multiple contracts are being offered, the principal can improve by offering all types the contract with the highest bonus. Since IR never binds, all agents pick this single contract once other contracts are removed. There are two effects from this migration to the highest-powered contract: a reduction of rents and an increase in efficiency. First, since agents are risk neutral, they pick the contract that maximizes expected payments conditional on their effort. Thus, holding effort fixed, reducing the set of contracts being offered decreases expected payments to the agents (rent extraction effect). Second, because agents now face a higher bonus, they choose an effort with a higher probability of success. This raises the principal’s profit by the increased probability of success times the incremental output net of the bonus paid to the agent:

$$(p_{\tilde e} - p_e)(\Delta x - b),$$

where $e$ is the agent’s old effort, $\tilde e$ is the agent’s new effort, and $p_{\tilde e} \ge p_e$. Since the incremental output exceeds the bonus, both terms are positive. Hence, the efficiency effect is also positive. The proof then concludes by showing that an optimal mechanism exists.

Limited liability and risk neutrality play an important role in Theorem 1. Limited liability ensures that agents do not leave the mechanism if their contract is removed. Without it, the participation constraint would bind for some type. Then, removing low-powered contracts would induce some types to prefer not to participate. Risk neutrality implies that, holding effort fixed, the principal and the agent split a pie of a fixed size. Since the agent always picks the contract with the highest expected payment, providing more freedom of choice to the agent can only hurt the principal (holding effort fixed). With risk aversion, different bonuses also affect the size of the pie since lower bonuses insure the agent better. Then, removing all but the highest-powered contract improves efficiency but worsens risk sharing.

Theorem 1 greatly simplifies the analysis of the optimal mechanism by allowing us to rewrite the principal’s program as a standard optimization problem with a single instrument $b \in [0, \Delta x]$. It is then straightforward to obtain comparative statics results. For example, using a supermodularity argument, we can show that the optimal bonus and the success probabilities of all types are increasing in the incremental output $\Delta x$. Moreover, the probability of success is distorted downwards relative to the first best. Formally, letting $e^{FB}$ denote a first-best effort, we have:

$$e^{FB}(\theta) \in \arg\max_e \; x_L + p_e^\theta \Delta x - c_e^\theta \quad \text{and} \quad e(\theta) \in \arg\max_e \; p_e^\theta b - c_e^\theta.$$

Since $b < \Delta x$, it follows by a revealed-preference argument that $p^\theta_{e(\theta)} \le p^\theta_{e^{FB}(\theta)}$, with strict inequality for some type if the type space is sufficiently rich. Finally, notice that Theorem 1 does not depend on the distribution of types or other parameters of the model.

It is straightforward to generalize the analysis above to the case of multiple outputs when contracts are restricted to two-part tariffs, where the fixed part corresponds to a wage that is paid independent of the output and the variable part corresponds to equity payments that are linear in the firm’s output. The restriction to two-part tariffs can be motivated by arbitrage opportunities when the principal deals with multiple agents who can costlessly redistribute outputs between themselves. In the next section, we study optimal contracts without this restriction.
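To illustrate how Theorem 1 reduces the principal's program to a search over a single bonus $b \in [0, \Delta x]$ with $s = 0$, the following sketch does a grid search in a binary-effort setting. Everything numerical here is hypothetical and not taken from the paper: the outputs, the two types and their weights, the grid, and the tie-breaking rule (an indifferent agent exerts high effort) are assumptions made only for illustration.

```python
from fractions import Fraction as F

# Hypothetical outputs (not from the paper): x_L = 0, x_H = 10.
X_LOW, X_HIGH = F(0), F(10)
DELTA_X = X_HIGH - X_LOW

# Hypothetical types: (prior weight, p_0, p_1, c_1); c_0 is normalized to 0.
TYPES = [
    (F(1, 2), F(1, 5), F(4, 5), F(6, 5)),
    (F(1, 2), F(1, 5), F(3, 5), F(2)),
]

def success_prob(bonus, p0, p1, c1):
    """Success probability chosen by the agent under contract (0, bonus).
    Assumed tie-break: an indifferent agent exerts high effort."""
    return p1 if (p1 - p0) * bonus >= c1 else p0

def expected_profit(bonus):
    """Principal's expected profit from offering (s, b) = (0, bonus) to all types."""
    return sum(
        weight * (X_LOW + success_prob(bonus, p0, p1, c1) * (DELTA_X - bonus))
        for weight, p0, p1, c1 in TYPES
    )

def best_single_bonus(grid_points=100):
    """Grid search over b in [0, DELTA_X]; returns the profit-maximizing grid point."""
    grid = [DELTA_X * k / grid_points for k in range(grid_points + 1)]
    return max(grid, key=expected_profit)

b_star = best_single_bonus()
print(b_star, expected_profit(b_star))
```

With these made-up numbers the search selects the smallest bonus that induces high effort from the first type only; raising the bonus enough to also motivate the second type lowers expected profit. The optimal bonus therefore depends on the prior over types even though the same contract is offered to everyone.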

3 Multiple Outputs

We now generalize the model to allow for multiple outputs. Let X := {x1, ..., xN } be the set of possible (real-valued) outputs with x1 < ... < xN . As before, the agent’s private information is described by a type θ ∈ Θ. Types are distributed according to a probability measure µ on B(Θ). Notice that our formulation allows types to be finite- or infinite-dimensional, and their distribution may be discrete or continuous. The agent chooses an unobservable effort e from the compact metric space E. A type-θ agent θ θ who exerts effort e produces output xi with probability pe (xi) := Pr(x = xi|θ, e). Let ce denote type θ’s cost of of effort e. As before, we assume that the least-costly effort has a non-positive θ cost: mine ce ≤ 0 for all θ. A contract is a function that specifies a transfer to the agent conditional on each possible output. A mechanism specifies a contract and an effort recommendation for each type. That is, a mechanism is a pair of measurable functions w :Θ × X → R and e :Θ → E, so that a type-θ agent is recommended effort e (θ) and gets paid wθ (x) in case of output x. Given a mechanism (w, e), a type-θ agent gets expected payoff

$$U(\theta) := \sum_{i=1}^{N} w^\theta(x_i)\, p^\theta_{e(\theta)}(x_i) - c^\theta_{e(\theta)}.$$

As in the two-output case, the mechanism has to satisfy the following IC, IR, and LL constraints:

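$$U(\theta) \ge \sum_{i=1}^{N} w^{\hat\theta}(x_i)\, p^\theta_{\hat e}(x_i) - c^\theta_{\hat e}, \quad \forall \theta, \hat\theta, \hat e, \tag{IC}$$

$$U(\theta) \ge 0, \quad \forall \theta, \tag{IR}$$

$$w^\theta(x_i) \ge 0, \quad \forall \theta, i. \tag{LL}$$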

An optimal mechanism maximizes the principal’s expected proﬁt

$$\int_\Theta \sum_{i=1}^{N} \left[x_i - w^\theta(x_i)\right] p^\theta_{e(\theta)}(x_i)\, d\mu(\theta) \tag{3}$$

among mechanisms that satisfy IC, IR, and LL. The main question we address in this section is whether there exists an optimal mechanism that offers at most two contracts to all types, that is, whether there exist $w_L$ and $w_H$ such that, for each θ, either $w^\theta(x) = w_L(x)$ for all x or $w^\theta(x) = w_H(x)$ for all x. The example below shows that, without additional restrictions, the answer is no.

Example 2. There are N ≥ 3 types and N + 1 states: Θ = {1, ..., N} and X = {x0, x1, ..., xN }. There are two efforts, E = {0, 1}, and all types have the same effort costs: $c^\theta_0 = 0$ and $c^\theta_1 = 1$ for all θ. The conditional probability of each output is

$$p^\theta_1(x) = \begin{cases} \frac{1}{2} & \text{if } x = x_\theta \\ \frac{1}{2N} & \text{if } x \neq x_\theta \end{cases} \quad \text{and} \quad p^\theta_0(x) = \begin{cases} 1 & \text{if } x = x_0 \\ 0 & \text{if } x \neq x_0 \end{cases}.$$

That is, output always equals x0 if the agent exerts low effort. Any output is possible with high effort, although each type’s “own output” ($x_\theta$) is the most likely outcome. Suppose that x0 is low enough, so that it is optimal for the principal to implement high effort from all types. As we show in the appendix, the optimal mechanism offers the following contracts:

$$w^\theta(x) = \begin{cases} 2 & \text{if } x = x_\theta \\ 0 & \text{if } x \neq x_\theta \end{cases}.$$

Each type’s contract pays 2 if the output matches the agent’s own type and zero otherwise. Therefore, the optimal mechanism consists of N different contracts, one for each type.
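The feasibility claims in Example 2 can be checked by direct enumeration. The sketch below is only an illustration and not part of the original analysis: the function name `example2_check` and the indexing of outputs (index 0 for x0, index k for xk) are assumptions made here. It verifies that the type-specific contracts satisfy IC, IR, and LL and counts the distinct contracts. It does not verify optimality, which is shown in the appendix.

```python
from fractions import Fraction

def example2_check(N):
    """Check IC, IR, and LL for the contracts of Example 2 with N types."""
    outputs = range(N + 1)        # index 0 stands for x_0, index k for x_k
    types = range(1, N + 1)
    cost = {0: Fraction(0), 1: Fraction(1)}

    def prob(theta, effort, k):
        # Low effort always yields x_0; high effort favors the type's own output
        if effort == 0:
            return Fraction(1) if k == 0 else Fraction(0)
        return Fraction(1, 2) if k == theta else Fraction(1, 2 * N)

    def wage(report, k):
        # Contract designed for type `report`: pays 2 only at output x_report
        return Fraction(2) if k == report else Fraction(0)

    def payoff(theta, report, effort):
        expected_wage = sum(wage(report, k) * prob(theta, effort, k) for k in outputs)
        return expected_wage - cost[effort]

    truthful = {t: payoff(t, t, 1) for t in types}  # recommended effort is high
    ic = all(truthful[t] >= payoff(t, r, e)
             for t in types for r in types for e in (0, 1))
    ir = all(u >= 0 for u in truthful.values())
    ll = all(wage(r, k) >= 0 for r in types for k in outputs)
    contracts = {tuple(wage(r, k) for k in outputs) for r in types}
    return {"ic": ic, "ir": ir, "ll": ll, "distinct_contracts": len(contracts)}

result = example2_check(4)
```

In this sketch each type is exactly indifferent between high effort under its own contract and low effort under any contract, so IC and IR hold with equality there, while misreporting with high effort yields a strictly negative payoff.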

The example illustrates the main problem in generalizing Theorem 1 to multiple outputs. With two outputs, the only way to incentivize effort is to pay a higher bonus. The power of any contract is determined by its bonus.14 With multiple outputs, there is one bonus associated with each incremental output, so contract power has, in general, only a partial order. A contract that convinces one type to exert effort may be ineffective in incentivizing another type. In the example, the cheapest way to incentivize type θ to exert effort was to pay a bonus in case of output xθ. To rule out cases such as the one in Example 2, where types disagree over the effectiveness of incentives, we need to ensure that types order the power of different contracts in the same way. Formally, we need the following property to hold. Let w and w˜ be two contracts satisfying

LL. If there exist e0, e˜0 ∈ E and θ0 ∈ Θ for which

$$\sum_{i=1}^{N} w(x_i)\left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right] = \sum_{i=1}^{N} \tilde w(x_i)\left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right],$$

14Similarly, when the contract space is restricted to two-part tariﬀs, we can unequivocally rank the power of two contracts by their slopes.

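then, for all θ ∈ Θ and all e, ẽ ∈ E,

$$\sum_{i=1}^{N} w(x_i)\left[p^{\theta}_{e}(x_i) - p^{\theta}_{\tilde e}(x_i)\right] = \sum_{i=1}^{N} \tilde w(x_i)\left[p^{\theta}_{e}(x_i) - p^{\theta}_{\tilde e}(x_i)\right].$$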

This condition states that if one type has the same incentives to exert two efforts under contracts w and w̃, so do all other types for any two efforts. In other words, all types agree on the incentives provided by each contract. Of course, they may still pick different efforts depending on their distributions and effort costs. However, this condition rules out cases such as Example 2, where the principal increases effort from each type by offering a contract that pays a higher bonus in a different state. Although intuitive, this is a somewhat convoluted assumption. As we show in Appendix C, however, this assumption is equivalent to the following multiplicative separability (MS) condition:15

Definition 1. A cumulative distribution $F^\theta_e$ satisfies multiplicative separability (MS) if there exist functions H : X → R and I : E × Θ → R such that

$$F^\theta_e(x) + I(e, \theta)H(x) = F^\theta_{\tilde e}(x) + I(\tilde e, \theta)H(x) \quad \forall e, \tilde e, \theta, x. \tag{4}$$
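To see why MS makes all types agree on the power of a contract, it may help to spell out the incentive difference implied by (4). The following is a sketch under notation introduced here for illustration only: the increments $h_i := H(x_i) - H(x_{i-1})$ and the conventions $H(x_0) := 0$ and $F^\theta_e(x_0) := 0$ are assumptions, not objects defined in the text.

```latex
% From (4): F^\theta_e(x) - F^\theta_{\tilde e}(x) = [I(\tilde e,\theta) - I(e,\theta)] H(x).
% Taking first differences in x gives the difference in probability mass functions:
p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i)
  = \bigl[ I(\tilde e,\theta) - I(e,\theta) \bigr]\, h_i ,
\qquad h_i := H(x_i) - H(x_{i-1}).
% Hence the incentive difference of any contract w factors into a
% type-and-effort term and a contract-specific term:
\sum_{i=1}^{N} w(x_i) \bigl[ p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i) \bigr]
  = \bigl[ I(\tilde e,\theta) - I(e,\theta) \bigr] \sum_{i=1}^{N} w(x_i)\, h_i .
```

Under this reading, the scalar $\sum_i w(x_i) h_i$ summarizes the power of a contract: two contracts with the same value of this sum give every type the same incentive difference between any two efforts.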

Multiplicative separability always holds when there are only two outputs. It also holds under the Linearity of the Distribution Function Condition (Grossman and Hart 1983; Hart and Holmstrom 1987), which is obtained by taking E = [0, 1], I(e, θ) ∈ [0, 1], and $H(x) = F_0(x) - F_1(x)$ for some distributions $F_0$ and $F_1$:

$$F^\theta_e(x) = I(e, \theta)F_1(x) + [1 - I(e, \theta)]F_0(x). \tag{5}$$
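As a short check, offered here as a worked step rather than as part of the original text, substituting (5) and $H(x) = F_0(x) - F_1(x)$ into the left-hand side of (4) shows that the effort-dependent terms cancel:

```latex
F^\theta_e(x) + I(e,\theta) H(x)
  = I(e,\theta) F_1(x) + [1 - I(e,\theta)] F_0(x) + I(e,\theta) [F_0(x) - F_1(x)]
  = F_0(x).
% The right-hand side does not depend on e, so both sides of (4) equal F_0(x)
% for every pair of efforts e and \tilde e.
```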

This condition is commonly used in pure moral hazard models, along with the convexity of costs, to justify the ﬁrst-order approach. Importantly, however, none of our results assume the validity of the ﬁrst-order approach.16 The following technical conditions, which generalize Assumption 1, guarantee the existence of an optimal mechanism:17

Assumption 2. i) $c^\theta_e$ and $p^\theta_e(x_i)$ are continuous functions of (e, θ); and
ii) $p^\theta_e(x_i) \ge p$ for some p > 0.

15In Appendix C, we also give a graphical characterization of distributions satisfying MS.
16There is no relationship between MS and Holmstrom’s (1979) sufficient statistic result. For example, MS always holds with binary outputs. However, as long as the likelihood ratio $p^\theta_{e_H}(x) / p^\theta_{e_L}(x)$ is not constant in θ for some efforts $e_L$ and $e_H$, x will not be a sufficient statistic for e given (x, θ). Moreover, because types also affect the cost of effort, the optimal pure-moral-hazard contract is also a function of θ even if types do not affect the likelihood ratio (see footnote 10).
17Assumption 2. i requires joint continuity of c and p on (e, θ), which is needed in the proof of Theorem 2. For our other results, only continuity with respect to e and measurability with respect to θ are required.

The next theorem shows that whenever MS holds, the optimal mechanism offers at most two contracts.18 Types who are offered the same contract may still choose different efforts, however, depending on their distributions and effort costs.

Theorem 2. Suppose MS holds. There exists an optimal mechanism that offers at most two contracts to all types. Moreover, for any optimal mechanism, there exists an equivalent mechanism that offers at most two contracts to all types.

Recall that with two outputs, substituting all contracts by the one with the highest bonus had two effects. First, holding effort fixed, it reduced the expected payment to all types. Second, it increased the probability of success, which raised the principal’s profit because the bonus was lower than the incremental output. The first of these effects remains unchanged with multiple outputs: holding effort fixed, the principal and the agent split a pie of fixed size (since they are both risk neutral). Therefore, for any fixed effort, removing a contract from the mechanism cannot hurt the principal.

The main difficulty with multiple outputs is with the second effect. High-powered contracts may be unprofitable for two reasons. First, a high-powered contract may incentivize an inefficient effort. Second, because some bonuses may exceed the incremental output (i.e., $x_{i+1} - x_i < w_{i+1} - w_i$ for some i), improving the distribution of outputs may decrease the principal’s profits. Addressing these issues requires the principal to offer two contracts: a high-powered contract to types whose effort she wants to encourage and a low-powered contract to those whose effort she wants to discourage. We show that if the principal removes from a mechanism all contracts except for the ones with the highest and the lowest incentives, types whose effort is profitable pick the high-powered contract and types whose effort is unprofitable pick the low-powered contract. Therefore, the second effect is also positive.

4 Free Disposal

We now introduce the following free disposal (FD) constraint in the multiple-output environment from Section 3:

$$y - w^\theta(y) \ge x - w^\theta(x) \tag{FD}$$

for all θ ∈ Θ and all x, y ∈ X with y ≥ x. Because FD requires the principal’s profit to be non-decreasing, we follow Matthews (2001) and call it a monotonicity constraint.19 As Innes (1990)

18MS can be slightly weakened since it does not need to hold for all efforts, only the ones that the principal may want to implement. For example, Theorem 2 remains unchanged if (4) fails at points where $F^\theta_{\cdot}(x)$ and $c^\theta_{\cdot}$ are locally concave (since such efforts are not implementable).
19FD still allows the agent’s payment $w^\theta(x)$ to be non-monotonic. All the results from this section continue to hold if we also impose monotonicity on the agent’s side (as well as limited liability on the principal). In

argues, FD can be seen as an additional incentive constraint if the principal can costlessly reduce output or if the agent can borrow from outside lenders to inflate output. An optimal monotone mechanism maximizes the principal’s expected profit (3) among mechanisms that satisfy IC, IR, LL, and FD.

Let $F^\theta := \{F^\theta_e : e \in E\}$ be a family of distributions satisfying MS. We say that $F^\theta$ is ordered by first-order stochastic dominance (FOSD) if the distributions can be written as in equation (4) and $F^\theta_e(x)$ first-order stochastically dominates $F^\theta_{\tilde e}(x)$ if and only if I(e, θ) ≤ I(ẽ, θ).20 That is, I(e, θ) orders the output distributions in terms of FOSD. Economically, in a family ordered by FOSD, the order of expected payments under different efforts is the same for any non-decreasing payments. This is a standard assumption in single-task models, where we typically assume that effort orders the output distribution in terms of FOSD. This is a less compelling assumption in multi-task models. It fails, for example, if the agent allocates effort to a safe and a risky task, and effort allocated to the risky task increases both the mean and the variance of output.

Following the same steps as Theorem 2, we can show that the optimal monotone mechanism offers at most two contracts. We now show that if distributions are ordered by FOSD, the optimal monotone mechanism offers a single contract:21

Lemma 1. Suppose MS holds and $F^\theta$ is ordered by FOSD for all θ. Then, there exists an optimal monotone mechanism that offers the same contract to all types. Moreover, any optimal monotone mechanism is equivalent to a mechanism that offers the same contract to all types.

Recall that, in general, the principal does not want to encourage effort from all types. But when distributions are ordered by FOSD, the ranking of payments is the same for any non-decreasing payments. Since FD requires the principal’s payoff to be non-decreasing, offering the contract with the highest incentives is always profitable. Therefore, a single contract suffices to implement the optimal mechanism.

While Lemma 1 shows that MS and FOSD are sufficient conditions for the optimal mechanism to involve a single contract, they are not necessary. Choosing outputs appropriately in Example 2 (so that FD does not bind) shows that the optimal mechanism may involve as many contracts as types if the distribution is ordered by FOSD but does not satisfy MS (see the supplementary appendix). However, as we show in Appendix D, offering a single contract is optimal under a slightly more general condition than MS.

particular, when debt contracts are optimal (Theorem 3), adding either of these constraints would not modify the optimal contract.
20In the pure adverse selection model of Nachman and Noe (1994), project types are assumed to be ranked in terms of first-order stochastic dominance, which is fundamentally different from assuming that, for each type θ, effort orders outputs in terms of FOSD. We do not require types to be ordered in any way.
21Lemma 1 and Theorem 3 only require that, for each e ∈ E, $p^{\cdot}_e$ and $c^{\cdot}_e$ are measurable functions in Assumption 2 (ii).

Debt Contracts

Next, we consider the optimality of debt contracts. The principal gets a debt contract if her payments x − w(x) equal $\min\{x, \bar{x}\}$ for some face value $\bar{x}$, or, equivalently, if the agent is given a call option $w(x) = \max\{x - \bar{x}, 0\}$. Incentive-compatible mechanisms cannot offer more than one debt contract, since agents would always pick the one with the lowest face value. Therefore, for a mechanism to offer debt contracts only, it needs to offer the same contract to all types. Accordingly, we assume that MS holds and $F^\theta$ is ordered by FOSD for all θ, so the principal offers a single contract.

The existing literature established that, in the single-task version of the model (E ⊂ R) with pure moral hazard, the monotone likelihood ratio property (MLRP) is sufficient for the optimality of debt. In this one-dimensional effort model, a probability mass function $p^\theta_e$ satisfies MLRP if, for any $e_L$, $e_H$ with $e_L < e_H$, the ratio $p^\theta_{e_L}(x) / p^\theta_{e_H}(x)$ is decreasing in x. Intuitively, MLRP means that the “evidence” in favor of higher efforts increases with output. MLRP, which implies FOSD, plays an important role in the monotonicity of contracts (Holmstrom (1979); Grossman and Hart (1983)). Accordingly, we work with the following notion of MLRP:

Definition 2. A distribution satisfying MS has the monotone likelihood ratio property if it can be written as in equation (4) and $p^\theta_{e_H}(x) / p^\theta_{e_L}(x)$ is increasing in x for any $e_L$, $e_H$, and θ with $I(e_H, \theta) > I(e_L, \theta)$.

Our next result establishes that when, in addition to the conditions from Lemma 1, the distributions satisfy MLRP, not only is it optimal to offer only one contract, but this contract takes the form of debt:

Theorem 3. Suppose MS holds and the distributions satisfy MLRP. There exists an optimal monotone mechanism that gives the principal a single debt contract. Moreover, any optimal monotone mechanism is equivalent to a mechanism that gives the principal a single debt contract.

The intuition for the optimality of debt is reminiscent of Innes (1990). With MLRP, higher outputs are more “indicative” of higher effort. Therefore, transferring payments from lower to higher outputs relaxes the IC constraints. In the canonical pure moral hazard model, there is a single IC, preventing the agent from choosing a lower effort. Here, because of adverse selection, there is one IC for each type, which prevents each type from picking a different effort. However, because of multiplicative separability and ordering by FOSD, the ICs of all types are aligned, in the sense that if a perturbation raises one type’s incentives to exert effort, it must also raise the incentives of all types. This observation allows us to summarize all the constraints of the program into a single one, as in the pure moral hazard model.
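The payoff split under a debt contract can be made concrete with a small sketch. This is an illustration only: the function names `debt_contract` and `satisfies_ll_and_fd` and the sample outputs are hypothetical and chosen here. It computes the agent's call option max{x − x̄, 0} and the principal's debt payoff min{x, x̄}, and checks LL and FD on a finite output grid.

```python
def debt_contract(outputs, face_value):
    """Split each output between the agent (call option) and the principal (debt)."""
    agent = [max(x - face_value, 0) for x in outputs]
    principal = [x - w for x, w in zip(outputs, agent)]
    return agent, principal

def satisfies_ll_and_fd(outputs, agent):
    """LL: agent payments are non-negative. FD: x - w(x) is non-decreasing in x."""
    pairs = sorted(zip(outputs, agent))
    principal = [x - w for x, w in pairs]
    ll = all(w >= 0 for _, w in pairs)
    fd = all(a <= b for a, b in zip(principal, principal[1:]))
    return ll and fd

# Hypothetical output grid and face value, used only for illustration
outputs = [0, 1, 2, 3, 5]
agent, principal = debt_contract(outputs, face_value=2)
feasible = satisfies_ll_and_fd(outputs, agent)
```

With these numbers the agent receives nothing until output exceeds the face value, and the principal's payoff is flat at the face value from then on, so FD binds exactly on the region where the agent is paid.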

5 Procurement and Regulation

In this section, we adapt our main framework to a setting of procurement and regulation that builds on the classic framework of Laffont and Tirole (1986, 1993). Their model can be reduced to a pure adverse selection model because effort affects the regulated firm’s cost deterministically. Our model incorporates moral hazard in their model by allowing effort to affect the firm’s cost stochastically.

A regulated firm produces an indivisible good that generates a consumer surplus of S > 0 at a random monetary cost C ∈ {C1, ..., CN } with C1 < ... < CN . The firm’s manager chooses a cost-reducing effort e ∈ E, which is not observed by the regulator. Let $p^\theta_e(C)$ denote the probability that the firm’s cost equals C conditional on type θ and effort e.

The firm’s manager has cost of effort $c^\theta_e$, with $\min_e c^\theta_e \le 0$. As argued previously, this is satisfied if the lowest effort costs zero or if the manager gets private benefits out of some activities. The firm’s manager has private information about both his ability to cut costs (i.e., the conditional distribution of costs $p^\theta_e$) and the cost of effort $c^\theta_e$. The regulator observes the monetary cost C incurred by the firm but not the manager’s effort e. As an accounting convention, assume that the regulator reimburses the firm’s monetary costs in addition to paying the firm an amount conditional on the observed cost C.

A contract is a function that specifies a transfer to the firm conditional on each possible cost C. A mechanism is a pair of measurable functions w : {C1, ..., CN } × Θ → R and e : Θ → E specifying, for each reported type, a recommended effort and a transfer for each cost realization. Given a mechanism (w, e), a type-θ manager gets payoff

$$U(\theta) := \sum_{i=1}^{N} w^\theta(C_i)\, p^\theta_{e(\theta)}(C_i) - c^\theta_{e(\theta)}. \tag{6}$$

As usual, the mechanism must satisfy the IC and IR constraints

$$U(\theta) \ge \sum_{i=1}^{N} w^{\hat\theta}(C_i)\, p^{\theta}_{\hat e}(C_i) - c^{\theta}_{\hat e}, \quad \forall \theta, \hat\theta, \hat e, \tag{IC}$$

U (θ) ≥ 0, ∀θ. (IR)

The manager is protected by limited liability, so that payments are non-negative:

$$w^\theta(C) \ge 0, \quad \forall C. \tag{LL}$$

We impose the technical conditions from Assumption 2 (with C instead of x). Since, by the accounting convention described above, the regulator fully reimburses the ﬁrm’s

cost realization, the regulator’s expected payment to type θ equals

$$\sum_{i=1}^{N} \left[C_i + w^\theta(C_i)\right] p^\theta_{e(\theta)}(C_i).$$

Because the government uses distortionary taxation to raise public funds, the regulator faces a shadow cost of public funds λ > 0. The net surplus of consumers/taxpayers is

$$S - (1 + \lambda) \sum_{i=1}^{N} \left[C_i + w^\theta(C_i)\right] p^\theta_{e(\theta)}(C_i). \tag{7}$$

A utilitarian regulator maximizes the sum of the expected utility of the ﬁrm’s manager (6) and the consumers’ net surplus (7):

$$S - (1 + \lambda) \sum_{i=1}^{N} C_i\, p^\theta_{e(\theta)}(C_i) - c^\theta_{e(\theta)} - \lambda \sum_{i=1}^{N} w^\theta(C_i)\, p^\theta_{e(\theta)}(C_i).$$
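The regulator's objective follows from adding (6) and (7) term by term. The intermediate step is spelled out below as a worked sketch, where $p_i$ is shorthand introduced here for $p^\theta_{e(\theta)}(C_i)$:

```latex
% Manager's payoff (6) plus consumers' net surplus (7):
\Bigl[ \sum_{i=1}^{N} w^\theta(C_i)\, p_i - c^\theta_{e(\theta)} \Bigr]
  + \Bigl[ S - (1+\lambda) \sum_{i=1}^{N} \bigl( C_i + w^\theta(C_i) \bigr) p_i \Bigr]
% The transfer enters with weight 1 in (6) and weight -(1+\lambda) in (7),
% leaving a net weight of -\lambda:
  = S - (1+\lambda) \sum_{i=1}^{N} C_i\, p_i - c^\theta_{e(\theta)}
      - \lambda \sum_{i=1}^{N} w^\theta(C_i)\, p_i .
```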

As the last term of this expression shows, because taxation is distortionary (λ > 0), leaving rents to the regulated firm is costly. Moreover, because each dollar reimbursed to the firm has an additional cost of λ due to distortionary taxation, cutting the firm’s cost increases social surplus by 1 + λ. The first-best effort minimizes $(1 + \lambda) \sum_{i=1}^{N} C_i\, p^\theta_e(C_i) + c^\theta_e$.22

The main difference between this model and the principal-agent model considered previously is that, while the principal only cares about her own payoff, a utilitarian regulator also cares about the manager’s payoffs. Therefore, the regulator also internalizes the manager’s effort cost so their preferences are not perfectly misaligned. However, to avoid distortionary taxation, the regulator would still like to leave as little rents as possible to the firm’s manager. We can now state the simplicity result for the case of binary outcomes:

Proposition 1. Suppose there are two possible monetary costs (N = 2). There exists an optimal mechanism that oﬀers the same contract to all types. Moreover, for any optimal mechanism, there exists an equivalent mechanism that oﬀers the same contract to all types.

The proof of Proposition 1 is similar to the one from Theorem 1, with one key diﬀerence. Because the regulator internalizes the cost of eﬀort, to show that the bonus must be lower than

22If types were observable and the firm did not have limited liability, the regulator would implement the first best by making the firm the residual claimant of the social gain from cutting costs and extracting the entire surplus: $w^\theta(C_i) = (1 + \lambda) \left[ \sum_{j=1}^{N} C_j\, p^\theta_{e^{FB}(\theta)}(C_j) - C_i \right]$, where $e^{FB}(\theta)$ is the first-best effort. This violates LL because w is non-degenerate and has mean zero.

the one that implements the first best, we substitute contracts by the one that “sells the firm” to the manager (rather than the contract that always pays zero, as in Theorem 1). Next, we consider the case of multiple outputs N > 2. As in Section 4, contracts must satisfy the following free disposal constraint:

$$w^\theta(C_j) - w^\theta(C_i) \le C_i - C_j \tag{FD}$$

for all j and i with i > j. FD requires the firm’s compensation for cutting costs not to exceed the amount cut. It must be satisfied, for example, if the firm’s manager can secretly borrow from an outside party to inflate firm earnings. An optimal monotone mechanism maximizes the principal’s expected profit (3) among mechanisms that satisfy IC, IR, LL, and FD. As in Section 4, we assume that the cost distributions satisfy MS and are ordered by FOSD.

Notice that, differently from output in the principal-agent model, a higher cost decreases the principal’s payoff. We therefore say that the cost distribution satisfies MLRP if, for any $e_L$, $e_H$, and θ with $I(e_H, \theta) > I(e_L, \theta)$, the ratio $p^\theta_{e_H}(C) / p^\theta_{e_L}(C)$ is decreasing in C. That is, higher costs are indicative of “less effort” (more precisely, they are indicative of a lower I(e, θ)). Adapting Lemma 1 and Theorem 3, we obtain the following results:

Proposition 2. Suppose MS holds and the family of distributions is ordered by FOSD.
a) There exists an optimal monotone mechanism that offers the same contract to all types. Moreover, for any optimal monotone mechanism, there exists an equivalent mechanism that offers the same contract to all types.
b) Suppose, in addition, that the distributions satisfy MLRP. Then, there exists an optimal monotone mechanism that offers to all types the contract $w(C) = \max\{\bar{C} - C, 0\}$, for some $\bar{C}$. Moreover, for any optimal monotone mechanism, there exists an equivalent mechanism that offers to all types the contract $w(C) = \max\{\bar{C} - C, 0\}$, for some $\bar{C}$.

The optimal contract in part (b) specifies a “reasonable cost” C¯ and reimburses the regulated firm for any cost cuts beyond C¯. Since, by our accounting convention, the regulator pays the firm’s cost directly, the firm’s revenues under this contract equal:

$$w(C) + C = \max\{C, \bar{C}\}.$$

This reimbursement rule consists of a standard price cap except that, because of limited liability, the regulator must bail out firms with cost realizations above the cap. Price caps are the most common form of incentive regulation. They are used, for example, by the U.S. Federal Communications Commission (FCC) to regulate the telephone industry. Price caps are often used in procurement as well. For example, prospective reimbursement systems commonly used in health care specify an amount C¯ based on what a service should cost, and let providers keep

cost savings C¯ − C to themselves. They are used, for example, by Medicare.
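The firm's revenue under the contract in part (b) of Proposition 2 can be read off case by case. The split below is a short worked step added for illustration, and the verbal labels in parentheses are an interpretation rather than terminology from the text:

```latex
w(C) + C = \max\{\bar{C} - C,\, 0\} + C =
\begin{cases}
  \bar{C} & \text{if } C \le \bar{C} \quad \text{(the firm keeps the savings } \bar{C} - C \text{)}, \\
  C       & \text{if } C > \bar{C} \quad \text{(the regulator covers the overrun)}.
\end{cases}
```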

6 Conclusion

The observation that many contracts are simple and relatively uniform across different sectors is an old puzzle in contract theory. While standard adverse selection models predict that agents will be offered large menus of contracts, contracting parties typically offer a limited number of contracts, often a single one. While standard moral hazard models predict that contracts should be fine tuned to the likelihood ratio of output, similar contracts are offered in different environments. We argue that these two features endogenously emerge in a general model of moral hazard and adverse selection if contracts must satisfy limited liability. With binary outcomes, the principal always offers a single contract regardless of any parameters of the model. The joint presence of moral hazard and adverse selection is key for this result. When either types or effort are observed, the principal typically prefers to offer different contracts to different types. With multiple outputs, it is optimal to offer two contracts if the distribution of output satisfies a separability condition. Moreover, if the marginal distribution satisfies MLRP, this optimal monotone contract is a debt contract for the principal. Our paper shows that gaming may be an important downside from giving flexibility to agents by offering menus of contracts. This is particularly stark with bilateral risk neutrality where, holding effort fixed, the agent always selects the most expensive contract to the principal. Then, reducing the number of contracts offered to the agent always increases the principal’s profits (for a fixed effort). If, in addition, we can identify a most efficient contract from the menu (such as when output is binary or when the distribution is multiplicatively separable and ordered), the principal can always improve by eliminating other contracts. The simplicity results rely on the presence of limited liability and bilateral risk neutrality. 
Limited liability ensures that increasing the power of a contract will not force the agent to abandon the mechanism. Bilateral risk neutrality means that, holding effort fixed, principal and agent are perfectly misaligned so that the principal always benefits by reducing the agent’s flexibility. With risk aversion, their misalignment is no longer perfect because there are potential gains from risk-sharing. While risk neutrality is a reasonable assumption in many settings (such as the optimal compensation of wealthy managers, procurement contracts, or the regulation of large companies), there are many other settings where it is not (for example, insurance contracts or sharecropping). In these cases, our results no longer hold. In our companion paper (Gottlieb and Moreira 2014), we study optimal mechanisms in the binary-outcome model when agents are risk averse and when there are no limited liability constraints. While we obtain some simplicity results, optimal mechanisms are considerably more complex than they are here.

Appendix

A. Proofs

Proof of Theorem 1

We ﬁrst verify that participation is implied by incentive compatibility and limited liability:

Lemma 2. Let (s, b, e) be a mechanism that satisﬁes IC and LL. Then, it satisﬁes IR.

Proof. The IC preventing θ from deviating to e states that $U(\theta) \ge s(\theta) + p^\theta_e b(\theta) - c^\theta_e$. Then, because s(θ) and b(θ) + s(θ) are non-negative (LL) and $\min_{e \in E} c^\theta_e \le 0$, it follows that U(θ) ≥ 0.

The proof of the theorem will use the following result, which states that the bonus does not exceed the incremental output:

Lemma 3. Consider a mechanism (s, b, e) in which b (θ) > ∆x for some θ. Then, there exists another mechanism that gives the principal a greater payoﬀ than (s, b, e).

Proof. In any optimal mechanism, the limited liability constraint must bind for some type. Otherwise, reducing the fixed payment to all types by a uniform amount maintains feasibility and increases the principal’s payoff. Let (s, b, e) be an optimal mechanism with b(θ) > ∆x for some type θ. If all types have bonuses greater than the incremental output (b(θ) > ∆x), the principal’s payoff is strictly lower than the one she obtains by offering s = b = 0 to all types, which contradicts optimality. Suppose there exists a type θ˜ who picks a contract with b(θ˜) ≤ ∆x. Because LL must bind for some type θH , this type must pick a contract with bH := b(θH ) > ∆x (otherwise, the contract

$(0, b_H)$ with $b_H \le \Delta x$ would be first-order stochastically dominated by the contract that pays s(θ) ≥ 0 and b(θ) > ∆x, which is offered by assumption). Therefore, the mechanism includes at least one contract $(s_L, b_L)$ with $b_L \le \Delta x$ and a contract $(0, b_H)$ with $b_H > \Delta x$. We claim that this mechanism gives the principal a lower profit on all types than offering s = b = 0 to all types.

For each θ, let $\bar{e}(\theta) \in \arg\min_{e \in E} c^\theta_e$. The principal’s profit from a type $\hat\theta$ who chooses contract $(0, b_H)$ and exerts effort $e_H \in E$ is

$$x_L + p^{\hat\theta}_{e_H} (\Delta x - b_H).$$

Replacing this contract by s = b = 0 changes the principal’s payoff to $x_L + p^{\hat\theta}_{\bar{e}(\hat\theta)} \Delta x$, increasing profits by

$$p^{\hat\theta}_{\bar{e}(\hat\theta)} \Delta x + p^{\hat\theta}_{e_H} (b_H - \Delta x) > 0.$$

The principal’s profit from a type θ who chooses $(s_L, b_L)$ and exerts effort $e_L \in E$ is

$$x_L - s_L + p^\theta_{e_L} (\Delta x - b_L).$$

By the incentive compatibility constraint of a type who picks $(s_L, b_L)$ and the fact that $b_H > \Delta x$,

$$p^\theta_{e_L} \Delta x < p^\theta_{e_L} b_H \le s_L + p^\theta_{e_L} b_L.$$

Adding $x_L$ to all terms in this inequality and rearranging gives

$$x_L + p^\theta_{e_L} \Delta x - \left(s_L + p^\theta_{e_L} b_L\right) < x_L.$$

Add $p^\theta_{\bar{e}(\theta)} \Delta x \ge 0$ to the expression on the right to obtain:

$$x_L + p^\theta_{e_L} \Delta x - \left(s_L + p^\theta_{e_L} b_L\right) < x_L + p^\theta_{\bar{e}(\theta)} \Delta x.$$

The term on the right is the principal’s profit from the constant-payment contract (s = b = 0) whereas the term on the left is the profit from the original contract. Hence, this replacement also raises profits from any type who chooses a contract with b(θ) ≤ ∆x.

The next lemma, which is the main step for proving Theorem 1, shows that any feasible mechanism is weakly dominated by a mechanism that oﬀers a single contract to all types:

Lemma 4. Let (s, b, e) be a mechanism satisfying IC, IR, and LL. There exists a mechanism that oﬀers a single contract (0, b∗) to all types and gives the principal a (weakly) greater payoﬀ than (s, b, e).

Proof. Let (s, b, e) be a mechanism that satisfies IC, IR, and LL. By Lemma 3, for any mechanism that offers a bonus greater than ∆x for some type, there exists another mechanism offering bonuses lower than ∆x to all types that gives the principal a higher payoff. Thus, there is no loss of generality in assuming that $b(\hat\theta) \le \Delta x$ for all $\hat\theta \in \Theta$. Fix a type θ ∈ Θ.

Let $b^* \equiv \sup\{b(\hat\theta) : \hat\theta \in \Theta\}$ and $s^* \equiv \inf\{s(\hat\theta) : \hat\theta \in \Theta\}$ denote the “highest” bonus and the “lowest” fixed payment in the mechanism. If $s^* > 0$, reducing all fixed payments uniformly by $s^*$ would keep all the constraints satisfied and increase the principal’s payoff. Therefore, we can assume that $s^* = 0$. Moreover, either there exists $\hat\theta$ such that $s(\hat\theta) = 0$, $b(\hat\theta) = b^*$ (i.e., $(0, b^*)$ is offered in the mechanism), or $(0, b^*)$ is a limit point of $\{(s(\hat\theta), b(\hat\theta)) : \hat\theta \in \Theta\}$. Consider the alternative mechanism that offers the contract $(0, b^*)$ to all types and let

$$e^*(\theta) \in \arg\max_{e \in E} \; p^\theta_e b^* - c^\theta_e,$$

which exists because the objective function is continuous and E is compact. The principal’s payoff from type θ in the original mechanism is

$$x_L - s(\theta) + p^\theta_{e(\theta)} [\Delta x - b(\theta)]. \tag{8}$$

Her payoff in the alternative mechanism is

$$x_L + p^\theta_{e^*(\theta)} (\Delta x - b^*). \tag{9}$$

Since the contract $(0, b^*)$ either belongs to, or is a limit point of, the original mechanism, no agent can be better off by switching to $(0, b^*)$ while holding the recommended effort fixed:

$$s(\theta) + p^\theta_{e(\theta)} b(\theta) - c^\theta_{e(\theta)} \ge p^\theta_{e(\theta)} b^* - c^\theta_{e(\theta)}.$$

(This inequality follows from type θ’s incentive constraint while holding e(θ) fixed.) Adding $c^\theta_{e(\theta)}$ to both sides, it follows that the expected payment in the alternative mechanism cannot exceed the one from the original mechanism if effort is held constant:

$$s(\theta) + p^\theta_{e(\theta)} b(\theta) \ge p^\theta_{e(\theta)} b^*. \tag{10}$$

That is, if the agent chooses not to change effort (e∗(θ) = e(θ)), the principal obtains a higher payoff in the alternative mechanism (9) than in the original one (8). Allowing effort to change, the principal’s payoff in the alternative mechanism minus the payoff in the original mechanism becomes

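$$\begin{aligned} p^\theta_{e^*(\theta)} (\Delta x - b^*) - p^\theta_{e(\theta)} [\Delta x - b(\theta)] + s(\theta) &= \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - b^*) + s(\theta) + p^\theta_{e(\theta)} b(\theta) - p^\theta_{e(\theta)} b^* \\ &\ge \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - b^*), \end{aligned} \tag{11}$$

where the inequality follows from (10).

By assumption, $\Delta x \ge b^*$. We claim that $p^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)}$. To wit, by the incentive constraint of type θ in the original mechanism,

$$s(\theta) + p^\theta_{e(\theta)} b(\theta) - c^\theta_{e(\theta)} \ge s(\theta) + p^\theta_{e^*(\theta)} b(\theta) - c^\theta_{e^*(\theta)},$$

and, by the definition of $e^*(\theta)$,

$$p^\theta_{e^*(\theta)} b^* - c^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)} b^* - c^\theta_{e(\theta)}.$$

Rearranging both inequalities, we can write them as:

$$\left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] b^* \ge c^\theta_{e^*(\theta)} - c^\theta_{e(\theta)} \ge \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] b(\theta)$$

$$\therefore \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] [b^* - b(\theta)] \ge 0.$$

Since $b^*$ is the supremum of bonuses, $b^* \ge b(\theta)$. Therefore, $p^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)}$. Hence, (11) implies that the principal’s payoff from each type θ is higher in the alternative mechanism even when the agent chooses a different effort.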
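The argument in the proof of Lemma 4 can be illustrated numerically in the binary-output model. The sketch below is not the authors' procedure: the helper names (`best_effort`, `menu_vs_single`, `random_instance`), the finite effort set, and the randomly drawn parameters are all assumptions made here. Each type picks its preferred contract and effort from a finite menu with s ≥ 0 and 0 ≤ b ≤ ∆x, so the menu is incentive compatible by construction; the menu is then replaced by the single contract (0, b∗), where b∗ is the highest bonus chosen by any type.

```python
import random

def best_effort(p, c, bonus):
    """Effort index maximizing p[e] * bonus - c[e]; the fixed payment does not affect it."""
    return max(range(len(p)), key=lambda e: p[e] * bonus - c[e])

def menu_vs_single(menu, types, x_low, delta_x):
    """Average principal profit under the menu and under the single contract (0, b*)."""
    menu_profit, chosen_bonuses = 0.0, []
    for p, c in types:
        # Each type picks its favorite contract and effort, so IC holds by construction
        options = []
        for s, b in menu:
            e = best_effort(p, c, b)
            options.append((s + p[e] * b - c[e], s, b, e))
        _, s, b, e = max(options, key=lambda o: o[0])
        menu_profit += x_low - s + p[e] * (delta_x - b)
        chosen_bonuses.append(b)
    b_star = max(chosen_bonuses)  # highest bonus actually chosen by some type
    single_profit = 0.0
    for p, c in types:
        e = best_effort(p, c, b_star)
        single_profit += x_low + p[e] * (delta_x - b_star)
    return menu_profit / len(types), single_profit / len(types)

def random_instance(rng, n_types=6, n_efforts=4, n_contracts=5, delta_x=1.0):
    """Hypothetical random types and a menu with s >= 0 and 0 <= b <= delta_x."""
    types = []
    for _ in range(n_types):
        p = [rng.uniform(0.05, 0.95) for _ in range(n_efforts)]
        # The first effort costs zero, so the least-costly effort has non-positive cost
        c = [0.0] + [rng.uniform(0.0, 0.5) for _ in range(n_efforts - 1)]
        types.append((p, c))
    menu = [(rng.uniform(0.0, 0.3), rng.uniform(0.0, delta_x)) for _ in range(n_contracts)]
    return menu, types

rng = random.Random(0)
results = [menu_vs_single(*random_instance(rng), x_low=0.0, delta_x=1.0)
           for _ in range(200)]
```

In every draw the second entry of the pair (the profit from the single contract) should be at least as large as the first, up to floating-point rounding, which mirrors the two steps of the proof: the expected payment falls for fixed effort, and the success probability weakly rises when effort adjusts.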

The proof of existence follows arguments similar to Page (1992) and is given in the supplementary appendix. To conclude the proof, we now establish that any optimal mechanism is equivalent to a mechanism that offers a single contract.

Lemma 5. Any optimal mechanism is equivalent to a mechanism that offers the same contract to all types.

Proof. Let (s, b, e) be an optimal mechanism. Using the notation of the proof of Lemma 4, consider a mechanism $(0, b^*, e^*)$ that gives the principal a (weakly) greater payoff than (s, b, e) and in which $b^*$ is a constant function. Using the same argument as in the proof of Lemma 4, we can show that

$$s(\theta) + p^\theta_{e(\theta)} b(\theta) - c^\theta_{e(\theta)} \ge p^\theta_{e^*(\theta)} b^* - c^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)} b^* - c^\theta_{e(\theta)} \tag{12}$$

and

$$\begin{aligned} p^\theta_{e^*(\theta)} (\Delta x - b^*) - p^\theta_{e(\theta)} [\Delta x - b(\theta)] + s(\theta) &= \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - b^*) + s(\theta) + p^\theta_{e(\theta)} b(\theta) - p^\theta_{e(\theta)} b^* \\ &\ge \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - b^*) \ge 0. \end{aligned} \tag{13}$$

If the first inequality of (12) is strict, then the first inequality of (13) is also strict. Since (s, b, e) is an optimal mechanism, we must have

$$s(\theta) + p^\theta_{e(\theta)} b(\theta) - c^\theta_{e(\theta)} = p^\theta_{e^*(\theta)} b^* - c^\theta_{e^*(\theta)}$$

and

$$p^\theta_{e(\theta)} [\Delta x - b(\theta)] - s(\theta) = p^\theta_{e^*(\theta)} (\Delta x - b^*)$$

for almost all θ ∈ Θ, i.e., the agent’s and principal’s payoffs in mechanisms (s, b, e) and $(0, b^*, e^*)$ are the same almost surely.

Proof of Theorem 2

Let $\Delta x \equiv x_N - x_1$. The following result will be important for establishing existence of an optimal mechanism, since it will allow us to restrict the set of possible contracts to a compact set.

Lemma 6. Let $(w, e)$ be a mechanism satisfying IC and LL. Suppose that $w^\theta(x_i) \ge \Delta x / \underline{p}^2$ for some type $\theta$ and output $x_i$. Then $(w, e)$ is not optimal.

Proof. The proof has two steps. First, it shows that if an incentive-compatible mechanism offers a high enough payment in one state, then every other contract must also have a high enough payment in some state (otherwise, everyone would prefer the former contract). Second, it shows that any contract that makes a high enough payment in some state is dominated by the null contract.

Step 1. Suppose the mechanism offers a contract $\tilde w = (\tilde w_1, ..., \tilde w_N)$ with $\tilde w_i \ge \Delta x / \underline{p}^2$ for some output $i$. Let $\theta$ be a type that picks $w$ and exerts effort $e$. Then, this type's incentive-compatibility constraint gives:

$$E[w \mid \theta, e] - c^\theta_e \ge E[\tilde w \mid \theta, e] - c^\theta_e.$$

Since $\max\{w_1, ..., w_N\} \ge E[w \mid \theta, e]$ and $\tilde w_i \ge 0$ for all $i$, this inequality implies the following:

$$\max_{i \in \{0, ..., N\}} \{w_i\} \ge \underline{p}\, \tilde w_i \ge \frac{\Delta x}{\underline{p}},$$

where the last inequality uses $\tilde w_i \ge \Delta x / \underline{p}^2$.

Step 2. We now show that this mechanism gives the principal a lower payoff than offering the contract that always pays zero to all types. Since the probability of each output is bounded below by $\underline{p}$ and payments are non-negative, the principal's payoff from offering $w = (w_1, ..., w_N)$ to type $\theta$ satisfies

$$E[x - w \mid \theta, e(\theta)] \le x_N - \underline{p}\, w_i, \quad \forall i.$$

Let $e_0 \in \arg\max_e \{-c^\theta_e\}$. The principal's payoff from offering type $\theta$ a zero payment in all states is $E[x \mid e_0] > x_0$. Combining these two inequalities, we obtain the following necessary condition for $w$ to give a higher payoff to the principal than the null contract:

$$w_i < \frac{\Delta x}{\underline{p}}, \quad \forall i.$$

Thus, the principal can profit from offering the null contract instead of $w$ if

$$w_i \ge \frac{\Delta x}{\underline{p}} \quad \text{for some } i.$$

The proof will also use the fact that any contract that satisfies IC and LL also satisfies IR (because $\min_e c^\theta_e \le 0$). Before presenting the proof, it is useful to introduce some terminology. For simplicity, we abuse notation and write $p^\theta_{e,i}$ for $p^\theta_e(x_i)$ and $w_i$ for $w^\theta(x_i)$. The payoff of a type-$\theta$ agent who exerts effort $e$ and gets contract $w$ equals

$$v^\theta_e(w) := \sum_{i=1}^N w_i p^\theta_{e,i} - c^\theta_e. \tag{14}$$

Analogously, the principal's payoff from such a type is

$$u^\theta_e(w) := \sum_{i=1}^N [x_i - w_i] p^\theta_{e,i}. \tag{15}$$

By MS, a type-$\theta$ agent who switches from effort $\tilde e$ to $e$ while keeping the same contract $w$ gains

$$v^\theta_e(w) - v^\theta_{\tilde e}(w) = -[I(e, \theta) - I(\tilde e, \theta)] \sum_{i=1}^N w_i h_i + c^\theta_{\tilde e} - c^\theta_e, \tag{16}$$

where we write $h_i := H(x_i) - H(x_{i-1})$ for $i > 1$, $H(x_0) = 0$, and $h_1 := H(x_1)$. In turn, this switch changes the principal's payoff by

$$u^\theta_e(w) - u^\theta_{\tilde e}(w) = -[I(e, \theta) - I(\tilde e, \theta)] \sum_{i=1}^N [x_i - w_i] h_i. \tag{17}$$

If $I(e, \theta) \ge I(\tilde e, \theta)$, the principal (weakly) gains from shifting effort from $\tilde e$ to $e$ if and only if

$$\sum_{i=1}^N [x_i - w_i] h_i \le 0. \tag{18}$$

Similarly, the agent's expected payment (weakly) increases from this shift in efforts if and only if $\sum_{i=1}^N w_i h_i \le 0$.

The proof shows that the principal can (weakly) profit by removing all but two contracts from any feasible menu of contracts. Contracts for which the principal would like to "encourage effort" (i.e., equation (18) holds) are substituted by the contract with the highest incentives. Contracts for which the principal wants to "discourage effort" are substituted by the one with the lowest incentives. Establishing this result requires two steps. First, we verify that this substitution increases the principal's profits if types pick the contract intended for them. Second, we verify that each type prefers to pick the contract intended for him.
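Identities (16) and (17) can be checked on a small numerical example. The sketch below is only an illustration under assumed primitives: a single type, two effort labels, a hypothetical baseline distribution `base`, increments `h` that sum to zero, and the MS parametrization $p_{e,i} = \text{base}_i - I(e) h_i$. None of these numbers come from the paper.

```python
# Hypothetical outputs, contract, and MS primitives (assumed for illustration).
x = [0.0, 1.0, 2.0, 4.0]
w = [0.0, 0.2, 0.5, 1.5]
base = [0.4, 0.3, 0.2, 0.1]      # output probabilities when I(e) = 0
h = [0.2, 0.1, -0.1, -0.2]       # increments h_i of H; they sum to zero
I = {"low": 0.0, "high": 1.0}    # effort index I(e, theta) for one type

def pmf(effort):
    """Output probabilities under MS: p_{e,i} = base_i - I(e) * h_i."""
    return [b - I[effort] * hi for b, hi in zip(base, h)]

def principal_payoff(effort):
    return sum((xi - wi) * pi for xi, wi, pi in zip(x, w, pmf(effort)))

def expected_payment(effort):
    return sum(wi * pi for wi, pi in zip(w, pmf(effort)))

d_index = I["high"] - I["low"]

# Equation (17): change in the principal's payoff from moving low -> high.
gain_principal = principal_payoff("high") - principal_payoff("low")
predicted_principal = -d_index * sum((xi - wi) * hi for xi, wi, hi in zip(x, w, h))

# Payment part of equation (16): change in the agent's expected payment.
gain_payment = expected_payment("high") - expected_payment("low")
predicted_payment = -d_index * sum(wi * hi for wi, hi in zip(w, h))

print(gain_principal, predicted_principal, gain_payment, predicted_payment)
```

In this example both $\sum_i [x_i - w_i] h_i$ and $\sum_i w_i h_i$ are negative, so the principal's payoff and the agent's expected payment both rise when effort moves to the level with the higher index, as stated around (18).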

Proof of the theorem. Let $(w, e)$ be a feasible mechanism. Let $M := \{w^\theta(\cdot) : \theta \in \Theta\}$ denote the set of all contracts in this mechanism. By the previous lemma, there is no loss of generality in assuming that $M$ is bounded. Its closure, $\bar M$, is compact. For each type $\theta$, there are two possibilities:

$$\sum_{i=1}^N [x_i - w^\theta_i] h_i \ge 0 \quad \text{or} \quad \sum_{i=1}^N [x_i - w^\theta_i] h_i \le 0.$$

Case 1) $\sum_{i=1}^N [x_i - w^\theta_i] h_i \ge 0$. Let $w^+$ be a solution of the following maximization program:

$$\max_{w \in \bar M} \sum_{i=1}^N w_i h_i \quad \text{subject to} \quad \sum_{i=1}^N [x_i - w_i] h_i \ge 0. \tag{19}$$

The set of contracts that satisfy the constraint is non-empty by assumption. Because M¯ is compact, the objective function is a continuous linear functional and the constraint defines a closed set, (19) has a solution w+ ∈ M¯ . Let e+(θ) be an effort that maximizes the agent’s payoff under contract w+ (see the online appendix for existence). In particular, the agent’s payoff with the effort in the original mechanism e(θ) cannot exceed the payoff with effort e+(θ):

$$v^\theta_{e^+(\theta)}(w^+) \ge v^\theta_{e(\theta)}(w^+),$$

which, by MS, can be written as

$$\left[I(e(\theta), \theta) - I(e^+(\theta), \theta)\right] \sum_{i=1}^N w^+_i h_i \ge c^\theta_{e^+(\theta)} - c^\theta_{e(\theta)}. \tag{20}$$

Similarly, because $e(\theta)$ is the agent's effort choice with contract $w^\theta$,

$$\left[I(e(\theta), \theta) - I(e^+(\theta), \theta)\right] \sum_{i=1}^N w^\theta_i h_i \le c^\theta_{e^+(\theta)} - c^\theta_{e(\theta)}. \tag{21}$$

Combining (20) and (21), we obtain

$$\left[I(e(\theta), \theta) - I(e^+(\theta), \theta)\right] \sum_{i=1}^N w^\theta_i h_i \le c^\theta_{e^+(\theta)} - c^\theta_{e(\theta)} \le \left[I(e(\theta), \theta) - I(e^+(\theta), \theta)\right] \sum_{i=1}^N w^+_i h_i. \tag{22}$$

Because $w^+$ solves program (19),

$$\sum_{i=1}^N w^+_i h_i \ge \sum_{i=1}^N w^\theta_i h_i.$$

Therefore, it follows from (22) that $I(e(\theta), \theta) \ge I(e^+(\theta), \theta)$.

We now establish that replacing contract $w^\theta$ by $w^+$ increases the principal's payoff from type $\theta$. As in the proof of Theorem 1, we first show that, holding effort fixed, the principal is better off with the substitution of contracts. Since $w^+$ is the limit of a sequence in $M$, the agent's utility is continuous in the contract, and the original mechanism is incentive compatible, it follows that

$$v^\theta_{e(\theta)}(w^\theta) \ge v^\theta_{e(\theta)}(w^+). \tag{23}$$

Substitute the expression for the agent's payoff, multiply both sides by $-1$, and add $\sum_{i=1}^N x_i p^\theta_{e(\theta),i}$ to both sides to write:

$$\sum_{i=1}^N \left(x_i - w^+_i\right) p^\theta_{e(\theta),i} \ge \sum_{i=1}^N \left(x_i - w^\theta_i\right) p^\theta_{e(\theta),i}, \tag{24}$$

which states that, holding effort $e(\theta)$ fixed, the principal gets a higher profit with contract $w^+$ than with $w^\theta$. To show that the change in effort also benefits the principal, notice that since $I(e(\theta), \theta) \ge I(e^+(\theta), \theta)$ and $w^+$ satisfies the constraint of program (19), the following inequality holds:

$$\left[I(e(\theta), \theta) - I(e^+(\theta), \theta)\right] \sum_{i=1}^N \left(x_i - w^+_i\right) h_i \ge 0.$$

Use MS to rewrite this inequality as

$$\sum_{i=1}^N \left(x_i - w^+_i\right) p^\theta_{e^+(\theta),i} \ge \sum_{i=1}^N \left(x_i - w^+_i\right) p^\theta_{e(\theta),i}, \tag{25}$$

which states that the principal gains from the change in effort. Combining (24) and (25) establishes that the principal's profit from $\theta$ with the new contract exceeds her profit with the original contract:

$$u^\theta_{e^+(\theta)}(w^+) \ge u^\theta_{e(\theta)}(w^\theta).$$

Case 2) $\sum_{i=1}^N [x_i - w^\theta_i] h_i \le 0$. Let $w^-$ be a solution of the minimization program:

$$\min_{w \in \bar M} \sum_{i=1}^N w_i h_i \quad \text{subject to} \quad \sum_{i=1}^N [x_i - w_i] h_i \le 0. \tag{26}$$

As in case 1, incentive compatibility yields

$$\left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N w^\theta_i h_i \le c^\theta_{e^-(\theta)} - c^\theta_{e(\theta)} \le \left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N w^-_i h_i. \tag{27}$$

Since $w^-$ solves program (26),

$$\sum_{i=1}^N w^-_i h_i \le \sum_{i=1}^N w^\theta_i h_i,$$

so that, by inequality (27), $I(e^-(\theta), \theta) \ge I(e(\theta), \theta)$. As in case 1, incentive compatibility implies that, holding effort $e(\theta)$ fixed, the principal's profit is higher with contract $w^-$ than with $w^\theta$:

$$\sum_{i=1}^N \left(x_i - w^-_i\right) p^\theta_{e(\theta),i} \ge \sum_{i=1}^N \left(x_i - w^\theta_i\right) p^\theta_{e(\theta),i}. \tag{28}$$

Next, we show that the change in effort also benefits the principal. Because $I(e^-(\theta), \theta) \ge I(e(\theta), \theta)$, and because $w^-$ satisfies the constraint of program (26), the following inequality holds:

$$\left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N \left(x_i - w^-_i\right) h_i \ge 0.$$

Use MS to rewrite this inequality as

$$\sum_{i=1}^N \left[x_i - w^-_i\right] p^\theta_{e^-(\theta),i} \ge \sum_{i=1}^N \left(x_i - w^-_i\right) p^\theta_{e(\theta),i}, \tag{29}$$

which shows that the principal gains from the change in effort. Combining (28) and (29) establishes that the principal's profit from $\theta$ with the new contract exceeds her profit with the original contract: $u^\theta_{e^-(\theta)}(w^-) \ge u^\theta_{e(\theta)}(w^\theta)$.

Consider the mechanism $(\bar w, \bar e)$ determined by

$$\bar w^\theta(x) = \begin{cases} w^+(x), & \text{if } \sum_{i=1}^N [x_i - w^\theta_i] h_i \ge 0 \\ w^-(x), & \text{if } \sum_{i=1}^N [x_i - w^\theta_i] h_i < 0 \end{cases}$$

and

$$\bar e(\theta) = \begin{cases} e^+(\theta), & \text{if } \sum_{i=1}^N [x_i - w^\theta_i] h_i \ge 0 \\ e^-(\theta), & \text{if } \sum_{i=1}^N [x_i - w^\theta_i] h_i < 0, \end{cases}$$

where, as before, $e^\pm(\theta) \in \arg\max_{e \in E} v^\theta_e(w^\pm)$. From the previous argument, this mechanism raises the principal's payoff point-wise (i.e., it raises the principal's payoff conditional on each type) and, by construction, it satisfies LL.

It only remains to show that this mechanism is incentive compatible. Since $e^+(\theta)$ and $e^-(\theta)$ maximize type $\theta$'s payoff conditional on each contract, we only need to verify that types with recommended contract $w^+$ do not benefit from picking $w^-$:

$$\sum_{i=1}^N [x_i - w^\theta_i] h_i > 0 \implies v^\theta_{e^+(\theta)}(w^+) \ge v^\theta_{e^-(\theta)}(w^-), \tag{30}$$

and types with recommended contract $w^-$ do not deviate to $w^+$:

$$\sum_{i=1}^N [x_i - w^\theta_i] h_i < 0 \implies v^\theta_{e^-(\theta)}(w^-) \ge v^\theta_{e^+(\theta)}(w^+). \tag{31}$$
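The construction of $(\bar w, \bar e)$ can be sketched for a finite menu, where the closure $\bar M$ coincides with $M$. The code below is a hedged illustration only: the function names, the menu, and all numbers are hypothetical, and it covers just the selection of $w^+$ and $w^-$ from programs (19) and (26) and the assignment of types. It does not verify incentive compatibility, which the remainder of the proof addresses.

```python
def incentive_index(w, h):
    """Objective of programs (19) and (26): sum_i w_i h_i."""
    return sum(wi * hi for wi, hi in zip(w, h))

def principal_index(x, w, h):
    """Constraint expression: sum_i (x_i - w_i) h_i."""
    return sum((xi - wi) * hi for xi, wi, hi in zip(x, w, h))

def reduce_menu(x, h, menu):
    """Return (w_plus, w_minus, assignment) for a finite menu {type: contract}."""
    contracts = list(menu.values())
    plus_set = [w for w in contracts if principal_index(x, w, h) >= 0]
    minus_set = [w for w in contracts if principal_index(x, w, h) <= 0]
    # Program (19): maximize sum_i w_i h_i over contracts with index >= 0.
    w_plus = max(plus_set, key=lambda w: incentive_index(w, h)) if plus_set else None
    # Program (26): minimize sum_i w_i h_i over contracts with index <= 0.
    w_minus = min(minus_set, key=lambda w: incentive_index(w, h)) if minus_set else None
    assignment = {
        t: ("plus" if principal_index(x, w, h) >= 0 else "minus")
        for t, w in menu.items()
    }
    return w_plus, w_minus, assignment

# Hypothetical example with four types and four outputs.
x = [0.0, 1.0, 2.0, 4.0]
h = [0.2, 0.1, -0.1, -0.2]
menu = {
    "t1": [0.0, 0.2, 0.5, 1.5],
    "t2": [0.0, 0.0, 0.0, 3.0],
    "t3": [0.0, 2.0, 3.0, 4.5],
    "t4": [0.0, 0.0, 2.5, 5.0],
}
w_plus, w_minus, assignment = reduce_menu(x, h, menu)
print(w_plus, w_minus, assignment)
```

In this example the four-contract menu collapses to two contracts: types `t3` and `t4` are assigned the contract of `t3`, and types `t1` and `t2` are assigned the contract of `t2`.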

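In order to verify condition (30), let $\theta$ and $\hat\theta$ be types with

$$\sum_{i=1}^N [x_i - w^\theta_i] h_i \ge 0 \ge \sum_{i=1}^N [x_i - w^{\hat\theta}_i] h_i.$$

Since type $\theta$ chose effort $e(\theta)$ in the original mechanism, incentive compatibility of the original mechanism gives

$$v^\theta_{e(\theta)}(w^\theta) \ge v^\theta_{e^-(\theta)}(w^{\hat\theta}).$$

Let $(\hat\theta_n)$ be a sequence such that $\sum_{i=1}^N [x_i - w^{\hat\theta_n}_i] h_i \le 0$ for all $n$ and $(w^{\hat\theta_n})$ converges to $w^-$. Again, incentive compatibility of the original mechanism gives

$$v^\theta_{e(\theta)}(w^\theta) \ge v^\theta_{e^-(\theta)}(w^-), \tag{32}$$

where we are using the continuity of $v^\theta_{e^-(\theta)}(\cdot)$.

Now let $(\theta_n)$ be a sequence such that $\sum_{i=1}^N [x_i - w^{\theta_n}_i] h_i \ge 0$ for all $n$ and $(w^{\theta_n})$ converges to $w^+$. Since $E$ is a compact metric space, $(e(\theta_n))$ has a converging subsequence. Let $\tilde e \in E$ denote its limit and, with some abuse of notation, let $(e(\theta_n))$ denote the subsequence itself. We claim that

$$\lim_{n \to \infty} v^{\theta_n}_{e(\theta_n)}(w^{\theta_n}) = v^\theta_{\tilde e}(w^+).$$

Indeed, by the continuity of $c^{\cdot}_{\cdot}$, we only need to show the convergence of the expected-payment term in (14). But notice that

$$
\begin{aligned}
\sum_{i=1}^N w^{\theta_n}_i p^{\theta_n}_{e(\theta_n),i} - \sum_{i=1}^N w^+_i p^\theta_{\tilde e,i}
&= \sum_{i=1}^N w^{\theta_n}_i \left[p^{\theta_n}_{e(\theta_n),i} - p^{\theta_n}_{\tilde e,i}\right] \\
&\quad + \sum_{i=1}^N \left(w^{\theta_n}_i - w^+_i\right) p^{\theta_n}_{\tilde e,i} + \sum_{i=1}^N w^+_i \left(p^{\theta_n}_{\tilde e,i} - p^\theta_{\tilde e,i}\right).
\end{aligned}
$$

By MS, the first term on the right-hand side equals

$$\left[I(\tilde e, \theta_n) - I(e(\theta_n), \theta_n)\right] \sum_{i=1}^N w^{\theta_n}_i h_i,$$

which converges to zero because $I(\cdot, \cdot)$ is a continuous function. The second and third terms on the right-hand side also converge to zero because $(w^{\theta_n})$ converges to $w^+$ and $p^{\theta_n}_{\tilde e,i}$ converges to $p^\theta_{\tilde e,i}$ by the continuity of $p^{\cdot}_{\tilde e,i}$.

Since $e^+(\theta)$ is an optimal effort choice for type $\theta$ under contract $w^+$,

$$v^\theta_{e^+(\theta)}(w^+) \ge v^\theta_{\tilde e}(w^+) = \lim_{n \to \infty} v^{\theta_n}_{e(\theta_n)}(w^{\theta_n}).$$

Using (32) for $\theta = \theta_n$, we obtain

$$v^{\theta_n}_{e(\theta_n)}(w^{\theta_n}) \ge v^{\theta_n}_{e^-(\theta_n)}(w^-) \ge v^{\theta_n}_{e^-(\theta)}(w^-).$$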

Substituting this last inequality in the previous one and taking the limit, we verify that (30) holds. The proof of (31) is analogous. Hence, the mechanism $(\bar w, \bar e)$ is incentive compatible.

We still need to verify that an optimal mechanism exists. The proof, which follows arguments similar to Page (1992), is given in the supplementary appendix.

Next, we establish the equivalence claim. Let $(w, e)$ be an optimal mechanism and construct $(\bar w, \bar e)$ as done previously. We need to show that, for almost all types, the principal and the agent get the same payoffs in both mechanisms. As before, there are two possible cases for each $\theta \in \Theta$: $\sum_{i=1}^N [x_i - w^\theta_i] h_i \ge 0$ and $\sum_{i=1}^N [x_i - w^\theta_i] h_i \le 0$. We will consider the first case (the second one is analogous).

By construction, $u^\theta_{e^+(\theta)}(w^+) \ge u^\theta_{e(\theta)}(w^\theta)$ for all $\theta$. By the optimality of $(w, e)$, this inequality cannot be strict on a set of types with positive measure. Therefore, the principal must be indifferent between these mechanisms except for a set of types with measure zero.

Now consider the agent's payoff. Since $w^+$ is the limit of a sequence in $M$, we can proceed as in (23) to obtain:

$$v^\theta_{e(\theta)}(w^\theta) \ge v^\theta_{e^+(\theta)}(w^+) \tag{33}$$

for all $\theta$. Therefore, we must have

$$v^\theta_{e(\theta)}(w^\theta) \ge v^\theta_{e^+(\theta)}(w^+) \ge v^\theta_{e(\theta)}(w^+)$$

for all $\theta$ (where the last inequality follows from incentive compatibility of $(\bar w, \bar e)$). Let $\tilde\Theta$ be the set of types for which

$$v^\theta_{e(\theta)}(w^\theta) > v^\theta_{e(\theta)}(w^+).$$

Using the expression for the agent's utility and rearranging, we obtain

$$u^\theta_{e(\theta)}(w^+) = \sum_{i=1}^N \left(x_i - w^+_i\right) p^\theta_{e(\theta),i} \ge \sum_{i=1}^N \left(x_i - w^\theta_i\right) p^\theta_{e(\theta),i} = u^\theta_{e(\theta)}(w^\theta),$$

with strict inequality exactly on $\tilde\Theta$. Use MS to write

$$u^\theta_{e^+(\theta)}(w^+) - u^\theta_{e(\theta)}(w^+) = \left[I(e(\theta), \theta) - I(e^+(\theta), \theta)\right] \sum_{i=1}^N [x_i - w^+_i] h_i \ge 0,$$

where, as in the previous part of the proof, the inequality follows from $I(e(\theta), \theta) \ge I(e^+(\theta), \theta)$ and $\sum_{i=1}^N [x_i - w^+_i] h_i \ge 0$. Combine these two inequalities to obtain $u^\theta_{e^+(\theta)}(w^+) \ge u^\theta_{e(\theta)}(w^\theta)$ for all $\theta$, with strict inequality exactly on $\tilde\Theta$. Again, by the optimality of mechanism $(w, e)$, $\tilde\Theta$ must have zero measure, which concludes the proof.

Proof of Lemma 1

By FOSD, without loss of generality, we can assume that $H(x) \ge 0$ for all $x$. Use summation by parts to write:

$$\sum_{i=0}^{N} [x_i - w_i] h_i = -\sum_{i=1}^{N+1} (\Delta x_i - \Delta w_i) H(x_i),$$

where $h_i \equiv H(x_{i+1}) - H(x_i)$, $\Delta x_i \equiv x_i - x_{i-1}$, $\Delta w_i \equiv w_i - w_{i-1}$, $x_0 = w_0$ and $x_{N+1} = w_{N+1}$.

By FD, $\Delta x_i \ge \Delta w_i$ and, by FOSD, $H(x_i) \ge 0$. Thus, only case 2 is possible in the proof of Theorem 2. Hence, the optimal mechanism can be implemented with a single contract.
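The summation-by-parts identity used above can be checked numerically. The sketch below is illustrative only: the lists are indexed $0, ..., N+1$ with padding entries satisfying $x_0 = w_0$ and $x_{N+1} = w_{N+1}$, and all numbers, including the boundary value $H(x_{N+1}) = 0$, are assumptions made for the example.

```python
def weighted_sum(x, w, H):
    """Left-hand side: sum_{i=0}^{N} (x_i - w_i) * (H_{i+1} - H_i)."""
    n = len(x) - 2  # lists are indexed 0..N+1
    return sum((x[i] - w[i]) * (H[i + 1] - H[i]) for i in range(n + 1))

def by_parts(x, w, H):
    """Right-hand side: -sum_{i=1}^{N+1} (dx_i - dw_i) * H_i."""
    n = len(x) - 2
    return -sum(
        ((x[i] - x[i - 1]) - (w[i] - w[i - 1])) * H[i] for i in range(1, n + 2)
    )

# Hypothetical data with N = 3; entries 0 and N+1 are the padding terms.
x = [0.0, 1.0, 2.0, 4.0, 5.0]
w = [0.0, 0.5, 1.0, 2.0, 5.0]
H = [0.0, 0.2, 0.3, 0.2, 0.0]   # H >= 0, as FOSD allows us to assume

lhs = weighted_sum(x, w, H)
rhs = by_parts(x, w, H)
print(lhs, rhs)
```

Here the contract satisfies $\Delta x_i \ge \Delta w_i$ for $i = 1, ..., N$ and $H \ge 0$, so the common value of both sides is non-positive, which is the condition defining case 2.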

Proof of Theorem 3

Before presenting the proof, let us introduce some notation. As in the proof of Lemma 1, let $h_i \equiv H(x_{i+1}) - H(x_i)$, and write $p^\theta_{e,i}$ for $p^\theta_e(x_i)$ and $w_i$ for $w^\theta(x_i)$. Notice that whenever MS holds and the family of distributions is ordered by FOSD, we can write

$$p^\theta_{e,i} + I(e, \theta) h_i = p^\theta_{\tilde e,i} + I(\tilde e, \theta) h_i, \quad \forall e, \tilde e, \theta, i.$$

Rearranging this expression gives

$$\frac{p^\theta_{\tilde e,i}}{p^\theta_{e,i}} - 1 = -[I(\tilde e, \theta) - I(e, \theta)] \frac{h_i}{p^\theta_{e,i}}.$$

Thus, MLRP holds if and only if $h_i / p^\theta_{e,i}$ is decreasing in $i$ for any $e, \theta$.

Since, by Lemma 1, the principal offers a single contract, we can use MS to rewrite IC as

$$[I(e, \theta) - I(e(\theta), \theta)] \sum_{i=1}^N w_i h_i \ge c^\theta_{e(\theta)} - c^\theta_e, \quad \forall e.$$

In particular, if it is optimal for the agent to pick effort $e(\theta)$ when offered contract $w$, it is also optimal to do so when offered any contract $\tilde w$ with $\sum_{i=1}^N w_i h_i = \sum_{i=1}^N \tilde w_i h_i$. We are now ready to present the proof:

Proof of the theorem. Let $w^*$ be an optimal contract. Then, for $K = \sum_{i=1}^N w^*_i h_i$, it must also solve the following program:

$$\min_{\{w_i\}} \sum_{i=1}^N w_i \int_\Theta p^\theta_{e(\theta),i} \, d\mu(\theta)$$

subject to

$$\sum_{i=1}^N w_i h_i = K, \tag{IC'}$$

$$w_i \ge 0, \tag{LL}$$

$$x_i - x_{i-1} \ge w_i - w_{i-1}. \tag{M}$$

As argued above, the first constraint ensures that effort $e(\theta)$ is still optimal for the agent. The second and third constraints are LL and FD. This is a restricted program: any contract that satisfies these constraints is feasible but not every feasible contract satisfies these constraints. Therefore, since $w^*$ is optimal among all feasible contracts, it must also solve this more restricted program that only includes a subset of feasible contracts.

Let $\bar p_i \equiv \int_\Theta p^\theta_{e(\theta),i} \, d\mu(\theta)$ denote the marginal distribution of output $x_i$. It is convenient to rewrite the program in terms of increments. Let $\Delta w_0 \equiv w_0$, $\Delta w_i \equiv w_i - w_{i-1}$, $\Delta x_0 \equiv x_0$, and $\Delta x_i \equiv x_i - x_{i-1}$ for $i \ge 1$. The program can then be rewritten as

$$\min_{\{\Delta w_i\}} \sum_{j=0}^N \bar p_j \sum_{i=0}^j \Delta w_i \tag{34}$$

subject to

$$\sum_{j=0}^N h_j \sum_{i=0}^j \Delta w_i = K, \tag{IC'}$$

$$\sum_{i=0}^j \Delta w_i \ge 0, \quad \forall j \tag{LL$_j$}$$

$$\Delta x_j - \Delta w_j \ge 0, \quad \forall j. \tag{M$_j$}$$

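We claim that if LL$_j$ holds with equality, then M$_j$ holds with strict inequality and vice-versa.

To see this, notice that if LL$_j$ holds with equality, then

$$\sum_{i=0}^{j} \Delta w_i = \underbrace{\sum_{i=0}^{j-1} \Delta w_i}_{\ge 0 \text{ by LL}_{j-1}} + \Delta w_j = 0 \quad \therefore \quad \Delta w_j \le 0 < \Delta x_j,$$

showing that M$_j$ holds with strict inequality. Conversely, if M$_j$ holds with equality, then

$$\Delta x_j = \Delta w_j \quad \therefore \quad \Delta w_j + \underbrace{\sum_{i=0}^{j-1} \Delta w_i}_{\ge 0 \text{ by LL}_{j-1}} \ge \Delta x_j > 0,$$

so LL$_j$ holds with strict inequality.

The necessary first-order conditions associated with program (34) are

$$-\sum_{j=i}^N \bar p_j + \lambda^{IC} \sum_{j=i}^N h_j + \sum_{j=i}^N \mu^{LL}_j - \mu^M_i = 0, \quad \forall i \tag{35}$$

along with the usual complementary slackness conditions.

Let $\xi_j \equiv \bar p_j - \lambda^{IC} h_j$ and notice that

$$\xi_j > 0 \iff \frac{1}{\lambda^{IC}} > \frac{h_j}{\bar p_j}.$$

Notice that MLRP implies that $p^\theta_{e(\theta),j} / h_j$ is increasing in $j$ for all $\theta$, so that $\int_\Theta \frac{p^\theta_{e(\theta),j}}{h_j} \, d\mu(\theta) = \frac{\bar p_j}{h_j}$ is also increasing in $j$. Hence, $h_j / \bar p_j$ is decreasing in $j$, implying that there exists $k \in \{0, ..., N\}$ such that $\xi_j > (\le)\, 0$ for all $j > (\le)\, k$.

Substituting $\xi_i$ in (35), we obtain

$$\xi_i = \mu^{LL}_i - \mu^M_i + \mu^M_{i+1}, \quad i < N$$

$$\xi_N = \mu^{LL}_N - \mu^M_N.$$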

There are three cases: (1) ξN > 0, (2) ξN < 0, and (3) ξN = 0.

Case 1) ξN > 0. In this case, LLN binds and MN does not. Because ξi is decreasing, ξi > 0 for all i and, by induction, none of the monotonicity constraints bind. As a result, the solution is wi = 0 for all i.

Case 2) $\xi_N < 0$. In this case, M$_N$ binds. For $N - 1$, we have

$$\xi_{N-1} = \mu^{LL}_{N-1} - \mu^M_{N-1} - \xi_N \ge \mu^{LL}_{N-1} - \mu^M_{N-1}.$$

If $N - 1 \ge k$ so that $\xi_{N-1} \le 0$, it then follows that $\mu^{LL}_{N-1} - \mu^M_{N-1} \le 0$ so that M$_{N-1}$ binds (and LL$_{N-1}$ doesn't). Inductively, it follows that M$_i$ binds for all $i \ge k$.

Let $j$ be such that LL$_j$ binds (and, therefore, $\mu^M_j = 0$). By the previous argument, it must be the case that $j < k$ so that $\xi_j > 0$. We have that

$$0 < \xi_{j-1} = \mu^{LL}_{j-1} - \mu^M_{j-1} + \mu^M_j = \mu^{LL}_{j-1} - \mu^M_{j-1}.$$

Then, we must have that LL$_{j-1}$ binds and M$_{j-1}$ doesn't. Hence, there exists $i^* \in \{0, ..., N\}$ such that LL$_i$ binds on all $i \le i^*$ and M$_i$ binds on all $i > i^*$. The contract must therefore be an option with a strike price between $x_{i^*}$ and $x_{i^*+1}$.

Case 3) $\xi_N = 0$. In this case, $\mu^{LL}_N = \mu^M_N$. Since we have previously shown that we cannot have both constraints holding with equality, we must have $\mu^{LL}_N = \mu^M_N = 0$. Substituting into the condition for the previous output gives

$$\xi_{N-1} = \mu^{LL}_{N-1} - \mu^M_{N-1} > 0,$$

where we used the fact that $\xi_i$ is decreasing and $\mu^M_N = 0$. Thus, $\mu^{LL}_{N-1} > 0 = \mu^M_{N-1}$ so that LL$_{N-1}$ binds. Substituting inductively shows that all other LL constraints also bind. Hence, the solution in this case is an option contract with a strike price $x^* \in \{x_{N-1}, x_N\}$.
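The shape of the contract obtained in cases 2 and 3 can be illustrated with a small sketch. The code below builds an option contract $w_i = \max\{x_i - \text{strike}, 0\}$ on a hypothetical output grid and reports where the LL and M constraints bind (M is checked for $i \ge 1$ only); the helper names and the numbers are assumptions for illustration.

```python
def option_contract(x, strike):
    """Agent's payment: a call option on output with the given strike."""
    return [max(xi - strike, 0.0) for xi in x]

def binding_constraints(x, w, tol=1e-12):
    """Indices where LL (w_i = 0) and M (dx_i = dw_i, i >= 1) hold with equality."""
    ll = [i for i, wi in enumerate(w) if abs(wi) <= tol]
    m = [
        i
        for i in range(1, len(x))
        if abs((x[i] - x[i - 1]) - (w[i] - w[i - 1])) <= tol
    ]
    return ll, m

# Hypothetical output grid and strike price.
x = [0.0, 1.0, 3.0, 6.0]
strike = 2.0

w = option_contract(x, strike)
ll_binding, m_binding = binding_constraints(x, w)
principal_share = [xi - wi for xi, wi in zip(x, w)]  # equals min(x_i, strike)
print(w, ll_binding, m_binding, principal_share)
```

In this example LL binds at the low outputs, M binds at the top output, and neither binds at the output adjacent to the strike. The principal's share is $\min\{x_i, \text{strike}\}$, which is the payoff profile of a debt contract.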

Proof of Propositions 1 and 2

In order to rewrite the procurement model using similar terminology as in Section 3, perform the change of variables:

xi := S − (1 + λ)Ci.

We will write contracts in terms of the taxpayer's net surplus $x$, instead of the firm's production cost $C$, by letting $W^\theta(x) := w^\theta\left(\frac{S - x}{1 + \lambda}\right)$. The distribution of the taxpayer's net surplus $x$ is determined by $G^\theta_e(x_i) := F^\theta_e\left(\frac{S - x_i}{1 + \lambda}\right)$. Notice that if $F^\theta_e$ satisfies MS, so does $G^\theta_e$; moreover, if $\{F^\theta_e ; e \in E\}$ is ordered by FOSD, so is $\{G^\theta_e ; e \in E\}$. With some abuse of notation, we will write $p^\theta_e$ for the probability distribution function associated with the cumulative distribution function $G^\theta_e$.

The regulator's payoff is

$$\sum_{i=1}^N \left(x_i - \lambda W^\theta_i\right) p^\theta_{e(\theta),i} - c^\theta_{e(\theta)}, \tag{36}$$

whereas the type-$\theta$ manager's payoff is

$$\sum_{i=1}^N W^\theta_i p^\theta_{e(\theta),i} - c^\theta_{e(\theta)}.$$

Proof of Proposition 1

Let s(θ) ≡ W1(θ) and b(θ) ≡ W2(θ) − W1(θ). As in Lemma 2, it can be verified that any mechanism that satisfies IC and LL also satisfies IR. The proof will follow from two lemmas, which are similar to Lemmas 3 and 4. The first one shows that any optimal mechanism pays a bonus lower than the one that implements the first-best effort, distorting the probability of success downward:

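Lemma 7. Consider a mechanism $(s, b, e)$ in which $b(\theta) \ge \frac{\Delta x}{1 + \lambda}$ for some $\theta$. Then, there exists another mechanism that gives the principal a greater payoff than $(s, b, e)$.

Proof. The regulator's payoff from offering contract $(s, b) = \left(0, \frac{\Delta x}{1 + \lambda}\right)$ to type $\theta$ is

$$x_1 + p^\theta_{e^*(\theta)} \frac{\Delta x}{1 + \lambda} - c^\theta_{e^*(\theta)}, \tag{37}$$

where $e^*(\theta) \in \arg\max_{e \in E} \; x_1 + p^\theta_e \Delta x - (1 + \lambda) c^\theta_e$ is a first-best effort.

We claim that (37) exceeds (36) when $b(\theta) \ge \frac{\Delta x}{1 + \lambda}$. To see this, notice that

$$
\begin{aligned}
b(\theta) \ge \tfrac{\Delta x}{1 + \lambda}
&\iff p^\theta_{e(\theta)} \tfrac{\Delta x}{1 + \lambda} - c^\theta_{e(\theta)} \ge p^\theta_{e(\theta)} [\Delta x - \lambda b(\theta)] - c^\theta_{e(\theta)} \\
&\implies p^\theta_{e^*(\theta)} \tfrac{\Delta x}{1 + \lambda} - c^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)} [\Delta x - \lambda b(\theta)] - c^\theta_{e(\theta)} \\
&\implies x_1 + p^\theta_{e^*(\theta)} \tfrac{\Delta x}{1 + \lambda} - c^\theta_{e^*(\theta)} \ge x_1 - \lambda s(\theta) + p^\theta_{e(\theta)} [\Delta x - \lambda b(\theta)] - c^\theta_{e(\theta)},
\end{aligned}
$$

where the first line follows from algebraic manipulations, the second follows from $e^*(\theta)$ being optimal for the agent when offered a bonus of $\frac{\Delta x}{1 + \lambda}$, and the third line adds $x_1$ to both sides and subtracts $\lambda s(\theta) \ge 0$ from the expression on the right-hand side.

This would conclude the proof if $(s, b, e)$ were such that $b(\theta) \ge \frac{\Delta x}{1 + \lambda}$ for all $\theta$. Therefore, suppose there also exist types with $b(\theta) < \frac{\Delta x}{1 + \lambda}$. From the same argument as above, the principal profits from substituting the contract of any type with $b(\theta) \ge \frac{\Delta x}{1 + \lambda}$ by $(0, \Delta x / (1 + \lambda))$. We now verify that this substitution is also profitable for those with $b(\theta) < \frac{\Delta x}{1 + \lambda}$.

Let $\theta_L$ and $\theta_H$ be types with $b(\theta_L) < \frac{\Delta x}{1 + \lambda} \le b(\theta_H)$ and, without loss of generality, suppose that $s(\theta_H) = 0$. Then

$$
\begin{aligned}
b(\theta_H) \ge \tfrac{\Delta x}{1 + \lambda}
&\iff p^{\theta_L}_{e(\theta_L)} \tfrac{\Delta x}{1 + \lambda} - c^{\theta_L}_{e(\theta_L)} \ge p^{\theta_L}_{e(\theta_L)} [\Delta x - \lambda b(\theta_H)] - c^{\theta_L}_{e(\theta_L)} \\
&\implies p^{\theta_L}_{e^*(\theta_L)} \tfrac{\Delta x}{1 + \lambda} - c^{\theta_L}_{e^*(\theta_L)} \ge p^{\theta_L}_{e(\theta_L)} [\Delta x - \lambda b(\theta_H)] - c^{\theta_L}_{e(\theta_L)} \\
&\implies p^{\theta_L}_{e^*(\theta_L)} \tfrac{\Delta x}{1 + \lambda} - c^{\theta_L}_{e^*(\theta_L)} + \lambda \left[s(\theta_L) + p^{\theta_L}_{e(\theta_L)} b(\theta_L)\right] \ge p^{\theta_L}_{e(\theta_L)} \Delta x - c^{\theta_L}_{e(\theta_L)} \\
&\iff x_1 + p^{\theta_L}_{e^*(\theta_L)} \tfrac{\Delta x}{1 + \lambda} - c^{\theta_L}_{e^*(\theta_L)} \ge x_1 - \lambda s(\theta_L) + p^{\theta_L}_{e(\theta_L)} [\Delta x - \lambda b(\theta_L)] - c^{\theta_L}_{e(\theta_L)},
\end{aligned}
$$

where the first line follows from algebraic manipulations; the second from the fact that $e^*(\theta_L)$ is the agent's optimal effort when given a bonus of $\frac{\Delta x}{1 + \lambda}$; the third follows from IC (holding effort $e(\theta_L)$ fixed, type $\theta_L$ must prefer $(s(\theta_L), b(\theta_L))$ to $(0, b(\theta_H))$); and the last line follows from algebraic manipulations. Therefore, the regulator profits from substituting the contracts of all types by $\left(0, \frac{\Delta x}{1 + \lambda}\right)$ if there exists a type with $b(\theta) \ge \frac{\Delta x}{1 + \lambda}$.

The second lemma shows that removing all but the contract with the highest bonus increases the regulator's payoff: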

Lemma 8. Let (s, b, e) be a mechanism satisfying IC and LL. There exists a mechanism that oﬀers a single contract (0, b∗) to all types and gives the principal a weakly greater payoﬀ than (s, b, e).

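Proof. Let $(s, b, e)$ be a mechanism that satisfies IC and LL. By the previous lemma, there is no loss of generality in assuming that $b(\theta) < \frac{\Delta x}{1 + \lambda}$ for all types.

Fix a type $\theta \in \Theta$. Let $b^* \equiv \sup\{b(\hat\theta) : \hat\theta \in \Theta\}$ and $s^* \equiv \inf\{s(\hat\theta) : \hat\theta \in \Theta\}$. Because a uniform reduction in fixed payments increases the regulator's payoff while retaining IC and LL, we can assume, without loss of generality, that $s^* = 0$. Moreover, either there exists $\hat\theta$ such that $s(\hat\theta) = 0$, $b(\hat\theta) = b^*$, or $(0, b^*)$ is a limit point of $\{(s(\hat\theta), b(\hat\theta)) ; \hat\theta \in \Theta\}$. Consider the alternative mechanism that offers the contract $(0, b^*)$ to all types and let

$$e^*(\theta) \in \arg\max_{e \in E} \; p^\theta_e b^* - c^\theta_e. \tag{38}$$

Incentive compatibility establishes that $b^* \ge b(\theta)$ implies $p^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)}$. Since, by the previous lemma, $\frac{\Delta x}{1 + \lambda} \ge b^*$, we have

$$\left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - (1 + \lambda) b^*) \ge 0. \tag{39}$$

By (38), it follows that

$$p^\theta_{e^*(\theta)} b^* - c^\theta_{e^*(\theta)} \ge p^\theta_{e(\theta)} b^* - c^\theta_{e(\theta)}.$$

Thus, adding $p^\theta_{e^*(\theta)} b^* - c^\theta_{e^*(\theta)} - p^\theta_{e(\theta)} b^* + c^\theta_{e(\theta)} \ge 0$ to the term on the left-hand side of (39) establishes that

$$\left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - \lambda b^*) + \left[p^\theta_{e(\theta)} - p^\theta_{e^*(\theta)}\right] b^* + p^\theta_{e^*(\theta)} b^* - c^\theta_{e^*(\theta)} - p^\theta_{e(\theta)} b^* + c^\theta_{e(\theta)} \ge 0,$$

which can be simplified to

$$\left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - \lambda b^*) + c^\theta_{e(\theta)} - c^\theta_{e^*(\theta)} \ge 0. \tag{40}$$

As in Lemma 4, use IC to write:

$$s(\theta) + p^\theta_{e(\theta)} b(\theta) \ge p^\theta_{e(\theta)} b^*. \tag{41}$$

The regulator's gain from substituting mechanisms equals

$$
\begin{aligned}
& p^\theta_{e^*(\theta)} (\Delta x - \lambda b^*) - p^\theta_{e(\theta)} [\Delta x - \lambda b(\theta)] + \lambda s(\theta) + c^\theta_{e(\theta)} - c^\theta_{e^*(\theta)} \\
&\quad = \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - \lambda b^*) + \lambda \left[s(\theta) + p^\theta_{e(\theta)} (b(\theta) - b^*)\right] + c^\theta_{e(\theta)} - c^\theta_{e^*(\theta)} \\
&\quad \ge \left[p^\theta_{e^*(\theta)} - p^\theta_{e(\theta)}\right] (\Delta x - \lambda b^*) + c^\theta_{e(\theta)} - c^\theta_{e^*(\theta)} \ge 0,
\end{aligned}
$$

where the first inequality follows from (41) and the second follows from (40). Thus, the principal improves by substituting all contracts by $(0, b^*)$.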

The proposition follows from Lemma 8.

Proof of Proposition 2

Part a) Let $(w, e)$ be a feasible mechanism. It is straightforward to adapt Lemma 6 to show that there is no loss of generality in assuming that contracts are uniformly bounded. As in the proof of Theorem 2, let $M := \{W^\theta(\cdot) : \theta \in \Theta\}$ denote the set of all contracts in this mechanism, and let $\bar M$ denote its closure, which is compact. Let

$$W^- \in \arg\min_{W \in \bar M} \sum_{i=1}^N W_i h_i, \tag{42}$$

and, for each type, let

$$e^-(\theta) \in \arg\max_{e} \sum_{i=1}^N W^-_i p^\theta_{e,i} - c^\theta_e. \tag{43}$$

Existence of $W^-$ and $e^-(\theta)$ follows from the arguments in Theorem 2.

Use IC to write

$$\left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N W^\theta_i h_i \le c^\theta_{e^-(\theta)} - c^\theta_{e(\theta)} \le \left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N W^-_i h_i. \tag{44}$$

Since $W^-$ solves (42), it satisfies

$$\sum_{i=1}^N W^-_i h_i \le \sum_{i=1}^N W^\theta_i h_i,$$

so that, by (44), $I(e^-(\theta), \theta) \ge I(e(\theta), \theta)$. That is, offering $W^-$ yields a distribution of net surplus $x$ that first-order stochastically dominates the distribution under any other contract in $\bar M$.

We first show that, holding effort $e(\theta)$ fixed, the regulator's payoff is higher with contract $W^-$ than with $W^\theta$. Since $W^-$ is the limit of a sequence in $M$, the agent's utility is continuous, and the original mechanism is incentive compatible, it follows that

$$\sum_{i=1}^N W^\theta_i p^\theta_{e(\theta),i} - c^\theta_{e(\theta)} \ge \sum_{i=1}^N W^-_i p^\theta_{e(\theta),i} - c^\theta_{e(\theta)}.$$

With some algebraic manipulations, we can rewrite this inequality as

$$\sum_{i=1}^N \left(x_i - \lambda W^-_i\right) p^\theta_{e(\theta),i} - c^\theta_{e(\theta)} \ge \sum_{i=1}^N \left(x_i - \lambda W^\theta_i\right) p^\theta_{e(\theta),i} - c^\theta_{e(\theta)}, \tag{45}$$

which shows that, holding effort $e(\theta)$ fixed, the regulator obtains a higher payoff with $W^-$ than with $W^\theta$.

Next, we show that changing effort from $e(\theta)$ to $e^-(\theta)$ also increases the regulator's payoff. Let $\Delta x_i = x_i - x_{i-1}$ and $\Delta W^-_i = W^-_i - W^-_{i-1}$, so that

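$$\sum_{i=0}^{N} \left(\frac{x_i}{1 + \lambda} - W^-_i\right) h_i = -\sum_{i=1}^{N+1} \left(\frac{\Delta x_i}{1 + \lambda} - \Delta W^-_i\right) H(x_i) \le 0, \tag{46}$$

where the equality uses summation by parts, $x_0 = W^-_0$, $x_{N+1} = W^-_{N+1}$, and the inequality uses $H(x) \ge 0$ for all $x$ and FD. Use (46) to obtain:

$$
\begin{aligned}
\left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N x_i h_i
&\ge (1 + \lambda) \left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N W^-_i h_i \\
&\ge c^\theta_{e^-(\theta)} - c^\theta_{e(\theta)} + \lambda \left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N W^-_i h_i,
\end{aligned} \tag{47}
$$

where the first inequality follows from $I(e^-(\theta), \theta) \ge I(e(\theta), \theta)$ and some algebraic manipulations, whereas the second inequality follows from the fact that $e^-(\theta)$ maximizes type $\theta$'s payoff under contract $W^-$ (program (43)). Rearranging (47), we obtain:

$$\left[I(e(\theta), \theta) - I(e^-(\theta), \theta)\right] \sum_{i=1}^N \left(x_i - \lambda W^-_i\right) h_i \ge c^\theta_{e^-(\theta)} - c^\theta_{e(\theta)}.$$

Using MS, this inequality can be written as

$$\sum_{i=1}^N \left(x_i - \lambda W^-_i\right) p^\theta_{e^-(\theta),i} - c^\theta_{e^-(\theta)} \ge \sum_{i=1}^N \left(x_i - \lambda W^-_i\right) p^\theta_{e(\theta),i} - c^\theta_{e(\theta)}, \tag{48}$$

which establishes that the change in effort from $e(\theta)$ to $e^-(\theta)$ increases the regulator's payoff. Combining inequalities (45) and (48) concludes the proof.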

Part b) As in the proof of Theorem 3, let $h_i \equiv H(x_{i+1}) - H(x_i)$ and, with a slight abuse of notation, write $p^\theta_{e,i}$ for $p^\theta_e(x_i)$ and $W_i$ for $W(x_i)$. If $(W, e)$ is an optimal monotone mechanism, it must solve the following program

$$\min_{\{W_i\}} \sum_{i=1}^N W_i \int_\Theta p^\theta_{e(\theta),i} \, d\mu(\theta)$$

subject to

$$\sum_{i=1}^N W_i h_i = K, \tag{IC'}$$

$$W_i \ge 0, \tag{LL}$$

$$\frac{x_i - x_{i-1}}{1 + \lambda} \ge W_i - W_{i-1}. \tag{M}$$

Using the same arguments as in the proof of Theorem 3, it follows that there exists $\bar x$ such that

$$w(x) = \begin{cases} 0 & \text{if } x \le \bar x \\ \frac{x - \bar x}{1 + \lambda} & \text{if } x > \bar x. \end{cases}$$

Rewriting in terms of costs, we obtain $w(C) = \max\{\bar C - C ; 0\}$, where $\bar C \equiv \frac{S - \bar x}{1 + \lambda}$.
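The last step, rewriting the contract in terms of costs, follows from the change of variables $x = S - (1 + \lambda) C$ and can be checked directly. The sketch below is a minimal illustration; the values of `S`, `lam`, and `x_bar` are hypothetical.

```python
def payment_from_surplus(x, x_bar, lam):
    """Contract in terms of net surplus: max(x - x_bar, 0) / (1 + lam)."""
    return max(x - x_bar, 0.0) / (1.0 + lam)

def payment_from_cost(C, C_bar):
    """Same contract in terms of cost: max(C_bar - C, 0)."""
    return max(C_bar - C, 0.0)

# Hypothetical parameters (assumed for illustration only).
S, lam, x_bar = 10.0, 0.25, 4.0
C_bar = (S - x_bar) / (1.0 + lam)

costs = [0.0, 2.0, 4.8, 6.0]
pairs = []
for C in costs:
    x = S - (1.0 + lam) * C          # change of variables from cost to surplus
    pairs.append((payment_from_surplus(x, x_bar, lam), payment_from_cost(C, C_bar)))
print(C_bar, pairs)
```

Both expressions give the same payment at every cost level: the manager is paid the cost savings relative to the threshold $\bar C$ and nothing when the cost exceeds it.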

B. Examples

Screening with Pure Adverse Selection (continuation)

Given a high effort, the probability of success for type A is $p^A_1 = \frac{2}{3}$ and for type B is $p^B_1 = \frac{1}{3}$. We assume that the project fails with a high enough probability if they exert low effort and we take $x_H - x_L$ to be large enough for the principal to want to implement high effort from both types.

Let $w^i_j$ denote type $i$'s payment in state $j$. The optimal mechanism must solve the following program:

$$\min_{w^i_H, w^i_L \ge 0} \; 2 w^A_H + w^A_L + w^B_H + 2 w^B_L$$

subject to

$$
\begin{aligned}
2 w^A_H + w^A_L &\ge 2 w^B_H + w^B_L \\
w^B_H + 2 w^B_L &\ge w^A_H + 2 w^A_L \\
2 w^A_H + w^A_L &\ge 3 \\
w^B_H + 2 w^B_L &\ge 2.
\end{aligned}
$$

The first two constraints require A and B to prefer to report their types truthfully (IC constraints). Because effort is observable, the principal does not need to worry about deviations on effort. Then, it is no longer the case that LL implies IR. The last two constraints are precisely the IR constraints.

Example 2 (continuation)

Consider the relaxed program: N i X µiw min i wj ≥0 2 i i=1 subject to j ! 1 P w wi + j6=i i ≥ 1. 2 i N

This is a relaxed program in that it only considers the "pure moral hazard" ICs. The solution is

$$w^\theta(x_i) = \begin{cases} 2 & \text{if } x_i = x_\theta \\ 0 & \text{if } x_i \ne x_\theta. \end{cases}$$

The proof concludes by verifying that this solution satisfies the omitted ICs. The ICs preventing each type from deviating in both effort and contracts are satisfied with equality:

1 0 = · 2 − 1 = wi = 0. 2 0

The ICs preventing each type from picking a different contract while exerting high effort hold with strict inequality:

$$0 = \frac{1}{2} \cdot 2 - 1 > \frac{1}{N} \cdot 2 - 1.$$

Thus, all omitted ICs are satisfied.

C. Multiplicative Separability

This appendix examines the multiplicative separability condition (MS). We provide two results. The ﬁrst is a characterization of MS in terms of the ordering of incentives. The second one is a graphical interpretation of how distributions satisfying MS change with eﬀort. Recall the condition presented in the text, which allows contracts to be ranked by their incentives:

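Definition 3. A distribution of outputs is ordered in terms of incentives if, for any $w$ and $\tilde w$ satisfying LL, the equality

$$\sum_{i=1}^N w(x_i) \left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right] = \sum_{i=1}^N \tilde w(x_i) \left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right]$$

for some $e_0, \tilde e_0 \in E$ and $\theta_0 \in \Theta$ implies

$$\sum_{i=1}^N w(x_i) \left[p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i)\right] = \sum_{i=1}^N \tilde w(x_i) \left[p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i)\right]$$

for all $\theta \in \Theta$ and $e, \tilde e \in E$.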

Substitution shows that any distribution that satisﬁes MS must be ordered in terms of incentives. The next proposition shows that the reverse is also true, so these are equivalent conditions:

Proposition 3. The following statements are equivalent:

1. The distribution of outputs $p^\theta_e(x)$ is ordered in terms of incentives.

2. The distribution of outputs $p^\theta_e(x)$ satisfies MS.

3. For all $i, j$, there exist constants $\phi_{i,j}$ (that depend on the outputs $x_i$ and $x_j$ but not on type or effort) such that

$$\frac{p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i)}{p^\theta_e(x_j) - p^\theta_{\tilde e}(x_j)} = \phi_{i,j} \tag{49}$$

whenever $p^\theta_e(x_j) - p^\theta_{\tilde e}(x_j) \ne 0$.

Proof. We first show the equivalence between (2) and (3). From Definition 1, a distribution satisfies MS if and only if, for all $(x, \theta, e, \tilde e)$,

$$p^\theta_e(x) - p^\theta_{\tilde e}(x) = [I(\tilde e, \theta) - I(e, \theta)] H(x). \tag{50}$$

Since (2) and (3) trivially hold if $p^\theta_e(x_i) = p^\theta_{\tilde e}(x_i)$ for all $x_i$, $\theta$, $e$, and $\tilde e$, it suffices to consider the case where $p^\theta_e(x_j) \ne p^\theta_{\tilde e}(x_j)$ for some $(x_j, \theta, e, \tilde e)$.

Suppose MS holds. Then, we must have $H(x_j) \ne 0$. Evaluating equation (50) at $x_j$ and multiplying both sides by $\frac{H(x_i)}{H(x_j)}$, we obtain:

$$\left[p^\theta_e(x_j) - p^\theta_{\tilde e}(x_j)\right] \frac{H(x_i)}{H(x_j)} = [I(\tilde e, \theta) - I(e, \theta)] H(x_i) = p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i).$$

Setting $\phi_{i,j} \equiv \frac{H(x_i)}{H(x_j)}$ establishes the equation in the statement. Conversely, when property (3) does not hold, it is impossible to find $H$ and $I$ that satisfy (50).

Next, we establish that (3) implies (1). To do so, suppose that (3) holds, and let $w$ and $\tilde w$ be two payment vectors satisfying LL such that

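$$\sum_{i=1}^N w(x_i) \left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right] = \sum_{i=1}^N \tilde w(x_i) \left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right] \tag{51}$$

for some $e_0, \tilde e_0 \in E$ and $\theta_0 \in \Theta$. Without loss of generality, suppose that $p^{\theta_0}_{e_0}(x_j) \ne p^{\theta_0}_{\tilde e_0}(x_j)$ for some $(x_j, \theta_0, e_0, \tilde e_0)$ (the distribution would immediately be ordered in terms of incentives if $p^\theta_e(x_i) = p^\theta_{\tilde e}(x_i)$ for all $(x_i, \theta, e, \tilde e)$). For notational simplicity (relabeling if needed), let $j = 1$. Use equation (49) to write:

$$\sum_{i=1}^N w(x_i) \left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right] = \left[p^{\theta_0}_{e_0}(x_1) - p^{\theta_0}_{\tilde e_0}(x_1)\right] \sum_{i=1}^N w(x_i) \phi_{i,1},$$

and

$$\sum_{i=1}^N \tilde w(x_i) \left[p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde e_0}(x_i)\right] = \left[p^{\theta_0}_{e_0}(x_1) - p^{\theta_0}_{\tilde e_0}(x_1)\right] \sum_{i=1}^N \tilde w(x_i) \phi_{i,1}.$$

Substitute these equations in (51):

$$\left[p^{\theta_0}_{e_0}(x_1) - p^{\theta_0}_{\tilde e_0}(x_1)\right] \sum_{i=1}^N w(x_i) \phi_{i,1} = \left[p^{\theta_0}_{e_0}(x_1) - p^{\theta_0}_{\tilde e_0}(x_1)\right] \sum_{i=1}^N \tilde w(x_i) \phi_{i,1},$$

which, since $p^{\theta_0}_{e_0}(x_1) \ne p^{\theta_0}_{\tilde e_0}(x_1)$, implies:

$$\sum_{i=1}^N w(x_i) \phi_{i,1} = \sum_{i=1}^N \tilde w(x_i) \phi_{i,1}.$$

For each $\theta$, $e$, and $\tilde e$, multiply both sides by $p^\theta_e(x_1) - p^\theta_{\tilde e}(x_1)$ to obtain:

$$\left[p^\theta_e(x_1) - p^\theta_{\tilde e}(x_1)\right] \sum_{i=1}^N w(x_i) \phi_{i,1} = \left[p^\theta_e(x_1) - p^\theta_{\tilde e}(x_1)\right] \sum_{i=1}^N \tilde w(x_i) \phi_{i,1}.$$

Then, using condition (49) again gives:

$$\sum_{i=1}^N w(x_i) \left[p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i)\right] = \sum_{i=1}^N \tilde w(x_i) \left[p^\theta_e(x_i) - p^\theta_{\tilde e}(x_i)\right],$$

which establishes (1).

It remains to show that (1) implies (3). Notice that if a distribution is ordered in terms of incentives, then either $p^\theta_e(\cdot) - p^\theta_{\tilde e}(\cdot)$ is identical to zero for each $\theta$, $e$, and $\tilde e$, which immediately implies condition (49), or there exists a non-null vector $p^{\theta_0}_{e_0}(\cdot) - p^{\theta_0}_{\tilde e_0}(\cdot)$, for some $\theta_0$, $e_0$ and $\tilde e_0$, such that

$$\left\{w \in \mathbb{R}^N ; \; w \cdot \left(p^{\theta_0}_{e_0} - p^{\theta_0}_{\tilde e_0}\right) = 0\right\} \subset \left\{w \in \mathbb{R}^N ; \; w \cdot \left(p^\theta_e - p^\theta_{\tilde e}\right) = 0\right\},$$

for all $\theta$, $e$ and $\tilde e$, where $\cdot$ represents the canonical inner product of $\mathbb{R}^N$. Therefore, the orthogonal complement of $p^\theta_e - p^\theta_{\tilde e}$ in $\mathbb{R}^N$ contains the orthogonal complement of $p^{\theta_0}_{e_0} - p^{\theta_0}_{\tilde e_0}$, which means that $p^\theta_e - p^\theta_{\tilde e}$ is collinear to $p^{\theta_0}_{e_0} - p^{\theta_0}_{\tilde e_0}$ for all $\theta$, $e$ and $\tilde e$. Therefore,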

$$p^{\theta}_{e}(x_i) - p^{\theta}_{\tilde{e}}(x_i) = \phi_{i,j}\left[p^{\theta}_{e}(x_j) - p^{\theta}_{\tilde{e}}(x_j)\right],$$

where

$$\phi_{i,j} = \frac{p^{\theta_0}_{e_0}(x_i) - p^{\theta_0}_{\tilde{e}_0}(x_i)}{p^{\theta_0}_{e_0}(x_j) - p^{\theta_0}_{\tilde{e}_0}(x_j)}$$

for $p^{\theta_0}_{e_0}(x_j) - p^{\theta_0}_{\tilde{e}_0}(x_j) \neq 0$.

Proposition 3 shows the equivalence between three properties of a probability distribution. As discussed in the main text, property (1) states that if one type has the same incentives to exert two efforts under contracts $w$ and $\tilde{w}$, so do all other types. In other words, all contracts can be ranked in terms of the incentives that they provide. Property (2) is the MS condition used in the text. Property (3) provides a straightforward graphic interpretation of what it means for a distribution to be multiplicatively separable. It means that, for all types, changing effort has a proportional effect on the probability of each outcome. For example, suppose there are three outputs so that the probabilities conditional on each type and effort lie on the two-dimensional simplex. Because probabilities add up to 1, one of the equalities in (49) is redundant, and a distribution satisfies MS if and only if

$$p^{\theta}_{e}(x_2) - p^{\theta}_{\tilde{e}}(x_2) = \phi\left[p^{\theta}_{e}(x_3) - p^{\theta}_{\tilde{e}}(x_3)\right]$$

for some constant $\phi$. This condition is represented in Figure 1. Each point in the graph corresponds to the probability distribution conditional on a type and an effort. We draw two points with the same color if they correspond to the same type but different efforts. MS requires the lines connecting these points to have the same slope ($\phi$).

Figure 1: Multiplicative Separability with three outputs.
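As an illustration of property (3), the following minimal sketch tests whether all effort-induced shifts in the output distribution are collinear. It is not part of the original argument: the function names (`effort_shifts`, `collinear`, `satisfies_ms`) and both sets of distributions are hypothetical and chosen only for the example. Exact rational arithmetic is used so that collinearity can be tested with equality.

```python
from fractions import Fraction as F
from itertools import combinations


def effort_shifts(dist):
    """Return every vector p^theta_e - p^theta_e' (one per type and pair of efforts).

    dist[theta][e] is the list of output probabilities of type theta under effort e.
    """
    shifts = []
    for by_effort in dist.values():
        for e, e_alt in combinations(sorted(by_effort), 2):
            shifts.append([a - b for a, b in zip(by_effort[e], by_effort[e_alt])])
    return shifts


def collinear(u, v):
    """True if u and v are linearly dependent, i.e. all 2x2 minors vanish."""
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n) for j in range(i + 1, n))


def satisfies_ms(dist):
    """Property (3): all non-null effort shifts point along a common line."""
    nonzero = [s for s in effort_shifts(dist) if any(s)]
    return all(collinear(u, v) for u, v in combinations(nonzero, 2))


# Hypothetical distributions over three outputs (x1, x2, x3); not from the paper.
ms_example = {
    "a": {0: [F(1, 2), F(1, 4), F(1, 4)], 1: [F(1, 4), F(5, 12), F(1, 3)]},
    "b": {0: [F(1, 3), F(1, 3), F(1, 3)], 1: [F(5, 24), F(5, 12), F(3, 8)]},
}
# Same as above, except that type b's high-effort distribution is perturbed.
non_ms_example = {
    "a": ms_example["a"],
    "b": {0: [F(1, 3), F(1, 3), F(1, 3)], 1: [F(1, 6), F(1, 2), F(1, 3)]},
}

print(satisfies_ms(ms_example))      # shifts (3,-2,-1)/12 and (3,-2,-1)/24 are collinear
print(satisfies_ms(non_ms_example))  # shifts (3,-2,-1)/12 and (1,-1,0)/6 are not
```

In the first example the ratio of the shift in the probability of $x_2$ to the shift in the probability of $x_3$ equals 2 for both types, which corresponds to a common slope $\phi$ in the three-output condition.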

D. Generalized Multiplicative Separability

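Definition 4. Let $(\mathcal{T}, \omega)$ be a finite probability space. The distribution of outputs satisfies generalized multiplicative separability (GMS) if there exist a function $H : X \times \mathcal{T} \to \mathbb{R}_+$ and a measurable function $I : E \times \Theta \times \mathcal{T} \to \mathbb{R}$ such that

$$F^{\theta}_{\tilde{e}}(x) + \sum_{t=1}^{T} H(x,t)\, I(\tilde{e},\theta,t)\,\omega_t = F^{\theta}_{e}(x) + \sum_{t=1}^{T} H(x,t)\, I(e,\theta,t)\,\omega_t, \quad \forall e, \tilde{e}, \theta, x, \tag{GMS}$$

and $I(\tilde{e},\theta,t) \geq I(e,\theta,t) \iff I(\tilde{e},\theta,\tilde{t}) \geq I(e,\theta,\tilde{t})$ for all $e, \tilde{e}, t, \tilde{t}, \theta$.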

Example 3. We say that the distribution of outputs satisfies the Generalized Linearity of the Distribution Function Condition if

$$F^{\theta}_{e}(x) = \sum_{t=0}^{T} I(e,\theta,t)\,\bar{F}_t(x) = -\sum_{t=1}^{T} I(e,\theta,t)\,H(x,t) + \bar{F}_0(x), \tag{52}$$

where $\{\bar{F}_t(x);\ t = 1, \ldots, T\}$ is a family of cumulative distributions, $I(e,\theta,t) \geq 0$, $\sum_{t=1}^{T} I(e,\theta,t) = 1$ and $H(x,t) = \bar{F}_0(x) - \bar{F}_t(x)$. If $E = [0,1]$, $\bar{F}_t$ first-order stochastically dominates $\bar{F}_0$ and $I(\cdot,\theta,t)$ is non-decreasing for all $t$ and $\theta$, then the distribution of outputs satisfies GMS.

Theorem 4. Suppose that GMS holds. There exists an optimal mechanism that offers a single contract to all types. Moreover, for any optimal mechanism, there exists an equivalent mechanism that offers a single contract.

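Proof. By Lemma 6, there is no loss of generality in assuming that all contracts are uniformly bounded. By GMS, a type-$\theta$ agent who switches from effort $\tilde{e}$ to $e$ while keeping the same contract $w$ gains

$$v^{\theta}_{e}(w) - v^{\theta}_{\tilde{e}}(w) = \sum_{t=1}^{T}\left[I(\tilde{e},\theta,t) - I(e,\theta,t)\right]\sum_{i=1}^{N} w_i h_i(t)\,\omega_t + c^{\theta}_{\tilde{e}} - c^{\theta}_{e}. \tag{53}$$

In turn, this switch changes the principal's payoff by

$$u^{\theta}_{e}(w) - u^{\theta}_{\tilde{e}}(w) = \sum_{t=1}^{T}\left[I(\tilde{e},\theta,t) - I(e,\theta,t)\right]\sum_{i=1}^{N} \left[x_i - w_i\right] h_i(t)\,\omega_t. \tag{54}$$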

Let $(w, e)$ be a feasible mechanism. The set of all bonuses in this mechanism,

$$M := \left\{w^{\theta} : \theta \in \Theta\right\},$$

is well defined and is composed of uniformly bounded functions. Therefore, its closure $\bar{M}$ is a compact set. Let $w^*$ be a solution of the following maximization program:

$$\max_{w \in \bar{M},\ t \in \mathcal{T}} \ \sum_{i=1}^{N} w_i h_i(t). \tag{55}$$

Since $\bar{M}$ is compact, $\mathcal{T}$ is finite, the objective function is a continuous linear functional, and the constraint defines a closed set, (55) has a solution $w^* \in \bar{M}$. Let $e^*(\theta)$ be an effort that maximizes the agent's payoff under contract $w^*$:

$$v^{\theta}_{e^*(\theta)}(w^*) \geq v^{\theta}_{e(\theta)}(w^*),$$

which, by GMS, can be written as

$$\sum_{t=1}^{T}\left[I(e(\theta),\theta,t) - I(e^*(\theta),\theta,t)\right]\sum_{i=1}^{N} w^*_i h_i(t)\,\omega_t \geq c^{\theta}_{e^*(\theta)} - c^{\theta}_{e(\theta)}. \tag{56}$$

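Similarly, because $e(\theta)$ is his effort choice with contract $w^{\theta}(x)$,

$$\sum_{t=1}^{T}\left[I(e(\theta),\theta,t) - I(e^*(\theta),\theta,t)\right]\sum_{i=1}^{N} w^{\theta}_i h_i(t)\,\omega_t \leq c^{\theta}_{e^*(\theta)} - c^{\theta}_{e(\theta)}. \tag{57}$$

Combining (56) and (57), we obtain

$$\sum_{t=1}^{T}\left[I(e(\theta),\theta,t) - I(e^*(\theta),\theta,t)\right]\sum_{i=1}^{N} w^{\theta}_i h_i(t)\,\omega_t \ \leq\ c^{\theta}_{e^*(\theta)} - c^{\theta}_{e(\theta)} \ \leq\ \sum_{t=1}^{T}\left[I(e(\theta),\theta,t) - I(e^*(\theta),\theta,t)\right]\sum_{i=1}^{N} w^*_i h_i(t)\,\omega_t. \tag{58}$$

Since $w^*$ solves program (55), we have

$$\sum_{i=1}^{N} w^*_i h_i(t) \geq \sum_{i=1}^{N} w^{\theta}_i h_i(t), \quad \text{for all } t \in \mathcal{T}.$$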

Therefore, it follows from GMS and (58) that $I(e(\theta),\theta,t) \geq I(e^*(\theta),\theta,t)$, for all $t \in \mathcal{T}$.

We now establish that replacing contract $w^{\theta}$ by $w^*$ increases the principal's payoff from type $\theta$. As in the proof of Theorem 2, we first show that, holding effort fixed, the principal is better off with the substitution of contracts. Since $w^*$ is the limit of a sequence in $M$, the agent's utility is continuous, and the original mechanism is incentive compatible, it follows that

$$v^{\theta}_{e(\theta)}(w^{\theta}) \geq v^{\theta}_{e(\theta)}(w^*). \tag{59}$$

Multiply both sides by $-1$ and add $\sum_{i=0}^{N} x_i\, p^{\theta}_{e(\theta),i}$ to both sides of this inequality to obtain:

$$\sum_{i=1}^{N}\left[x_i - w^*_i\right] p^{\theta}_{e(\theta),i} \geq \sum_{i=1}^{N}\left[x_i - w^{\theta}_i\right] p^{\theta}_{e(\theta),i}, \tag{60}$$

which states that, holding effort $e(\theta)$ fixed, the principal gets a higher profit with contract $w^*$ than with $w^{\theta}$.

Next, we show that the change in effort also benefits the principal. Notice first that FD implies that the function $x \mapsto x - w^*(x)$ is non-decreasing. Therefore, applying integration by parts and using GMS and FD, we conclude that

$$\sum_{i=0}^{N}\left[x_i - w^*_i\right] h_i(t) = -\sum_{i=1}^{N+1}\left[\Delta x_i - \Delta w^*_i\right] H(x_i,t) \leq 0, \quad \text{for all } t,$$

where $\Delta x_i = x_i - x_{i-1}$, $\Delta w^*_i = w^*_i - w^*_{i-1}$ and $x_0 = w^*_0$ and $x_{N+1} = w^*_{N+1}$. Hence, since $I(e(\theta),\theta,t) \geq I(e^*(\theta),\theta,t)$, for all $t$, we have that

$$\left[I(e(\theta),\theta,t) - I(e^*(\theta),\theta,t)\right]\sum_{i=1}^{N}\left[x_i - w^*_i\right] h_i(t) \leq 0, \quad \text{for all } t.$$

Using again GMS, we can rewrite this last inequality as

$$\sum_{i=1}^{N}\left[x_i - w^*_i\right] p^{\theta}_{e^*(\theta),i} \geq \sum_{i=1}^{N}\left[x_i - w^*_i\right] p^{\theta}_{e(\theta),i}, \tag{61}$$

which shows that the principal gains from the change in effort. Combining (60) and (61) establishes that the principal's profit from $\theta$ with the new contract exceeds her profit with the original contract:

$$u^{\theta}_{e^*(\theta)}(w^*) \geq u^{\theta}_{e(\theta)}(w^{\theta}).$$

Consider the mechanism $(\bar{w}, \bar{e})$: $\bar{w}^{\theta}(x) = w^*(x)$ and $\bar{e}(\theta) = e^*(\theta)$, where $e^*(\theta) \in \arg\max_{e} v^{\theta}_{e}(w^*)$, for all $\theta$. As previously shown, this mechanism raises the principal's payoff pointwise and satisfies LL. Notice that the expected utility of a type-$\theta$ agent who chooses effort $e$ under contract $w^*$ is

$$v^{\theta}_{e}(w^*) = \sum_{i=1}^{N} w^*_i\, p^{\theta}_{e,i} - c^{\theta}_{e}.$$

Since $w^* \geq 0$ and the agent must weakly prefer the recommended effort $e^*$ to the one with $c^{\theta}_{e} \leq 0$, the IR constraint must hold. Since the mechanism $(\bar{w}, \bar{e})$ is constituted by only one contract, it trivially satisfies IC. The proof of the second part is analogous to the proof of Theorem 2.

References

Bajari, P. and S. Tadelis (2001): “Incentives Versus Transaction Costs: A Theory of Pro- curement Contracts,” RAND Journal of Economics, 32, 287–307.

Biais, B. and T. Mariotti (2005): “Strategic Liquidity Supply and Security Design,” Review of Economic Studies, 72, 615–649.

Burguet, R., J.-J. Ganuza, and E. Hauk (2012): “Limited Liability and Mechanism Design in Procurement,” Games and Economic Behavior, 76, 15–25.

Caillaud, B., R. Guesnerie, and P. Rey (1992): “Noisy Observation in Adverse Selection Models,” Review of Economic Studies, 59, 595–615.

47 Carroll, G. (Forthcoming): “Robustness and Linear Contracts,” American Economic Review.

Chaigneau, P., A. Edmans, and D. Gottlieb (2014): “The Value of Informativeness for Contracting,” Tech. rep., HEC Montreal, LBS, and Wharton.

Chassang, S. (2013): “Calibrated Incentive Contracts,” Econometrica, 81, 1935–1971.

Chiappori, P.-A. and B. Salanie (2003): “Testing Contract Theory: A Survey of Some Recent Work,” in Advances in Economics and Econometrics, ed. by M. Dewatripont, L. P. Hansen, and S. T. Turnovsky, Cambridge: Cambridge University Press, vol. 1.

Chu, L. Y. and D. Sappington (2007): “Simple Cost-Sharing Contracts,” American Eco- nomic Review, 97, 419–428.

Demarzo, P. and D. Duffie (1999): “A Liquidity-based Model of Security Design,” Econo- metrica, 67, 65–99.

DeMarzo, P. M. (2005): “The Pooling and Tranching of Securities: A Model of Informed Intermediation,” Review of Financial Studies, 18, 1–35.

DeMarzo, P. M., I. Kremer, and A. Skrzypacz (2005): “Bidding with Securities: Auctions and Security Design,” American Economic Review, 95, 936–959.

Dewatripont, M., P. Legros, and S. A. Matthews (2003): “Moral Hazard and Capital Structure Dynamics,” Journal of the European Economic Association, 1, 890–930.

Edmans, A. and X. Gabaix (2011): “Tractability in Incentive Contracting,” Review of Fi- nancial Studies, 24, 2865–94.

Farhi, E. and J. Tirole (Forthcoming): “Liquid Bundles,” Journal of Economic Theory.

Gottlieb, D. and H. Moreira (2014): “Simultaneous Adverse Selection and Moral Hazard,” Tech. rep., Wharton School and EPGE/FGV.

Grossman, S. J. and O. D. Hart (1983): “An Analysis of the Principal-Agent Problem,” Econometrica, 51, 7–45.

Guesnerie, R. and J.-J. Laffont (1984): “A Complete Solution to a Class of Principal- Agent Problems with an Application to the Control of a Self-Managed Firm,” Journal of Public Economics, 25, 329–369.

Hart, O. D. and B. Holmstrom (1987): “The Theory of Contracts,” in Advances in Economic Theory, Fifth World Congress, ed. by T. Bewley, Cambridge: Cambridge University Press.

48 Holmstrom, B. (1979): “Moral Hazard and Observability,” The Bell Journal of Economics, 10, 74–91.

Holmstrom, B. and P. Milgrom (1987): “Aggregation and Linearity in the Provision of Intertemporal Incentives,” Econometrica, 55, 303–28.

Innes, R. D. (1990): “Limited Liability and Incentive Contracting with Ex-Ante Action Choices,” Journal of Economic Theory, 52, 45–67.

Jewitt, I., O. Kadan, and J. M. Swinkels (2008): “Moral Hazard with Bounded Payments,” Journal of Economic Theory, 143, 59–82.

Laffont, J.-J. and D. Martimort (2002): The Theory of Incentives: The Principal-Agent Model, Princeton University Press.

Laffont, J.-J. and J. Tirole (1986): “Using Cost Observation to Regulate Firms,” Journal of Political Economy, 94, 614–641.

——— (1993): A Theory of Incentives in Procurement and Regulation, MIT press.

Matthews, S. A. (2001): “Renegotiating Moral Hazard Contracts under Limited Liability and Monotonicity,” Journal of Economic Theory, 97, 1–29.

Melumad, N. D. and S. Reichelstein (1989): “Value of Communication in Agencies,” Journal of Economic Theory, 47, 334–368.

Nachman, D. C. and T. H. Noe (1994): “Optimal design of securities under asymmetric information,” Review of Financial Studies, 7, 1–44.

Ollier, S. and L. Thomas (2013): “Ex Post Participation Constraint in a Principal–Agent Model with Adverse Selection and Moral Hazard,” Journal of Economic Theory, 148, 2383– 2403.

Page, F. H. (1992): “Mechanism Design for General Screening Problems with Moral Hazard,” Economic Theory, 2, 265–281.

Picard, P. (1987): “On the Design of Incentive Schemes under Moral Hazard and Adverse Selection,” Journal of Public Economics, 33, 305–31.

Poblete, J. and D. Spulber (2012): “The Form of Incentive Contracts: Agency with Moral Hazard, Risk Neutrality, and Limited Liability,” RAND Journal of Economics, 43, 215–34.

Rochet, J.-C. and L. A. Stole (2002): “Nonlinear Pricing with Random Participation,” Review of Economic Studies, 69, 277–311.

49 Rogerson, W. P. (2003): “Simple Menus of Contracts in Cost-Based Procurement and Regu- lation,” American Economic Review, 93, 919–26.

Tirole, J. (2005): The Theory of Corporate Finance, Princeton University Press.

Online Appendix

Existence

This proof covers the model with two outputs as a particular case. By Lemma 6, there is no loss of generality in taking uniformly bounded contracts. Let $L > 0$ denote this uniform bound so that $\mathcal{W} := [0, L]^N \subset \mathbb{R}^N$ denotes the space of contracts. Let $\mathcal{B}(\mathcal{W})$ be the Borel $\sigma$-field in $\mathcal{W}$.

Let $v^{\theta}_{e}(w) := \sum_{i=1}^{N} w(x_i)\, p^{\theta}_{e}(x_i) - c^{\theta}_{e}$. By Assumption 2, $v^{\theta}_{\cdot}(\cdot)$ is continuous on $\mathcal{W} \times E$, and, for each $(w, e) \in \mathcal{W} \times E$, $v^{\cdot}_{e}(w)$ is $\mathcal{B}(\Theta)$-measurable. Since $E$ is compact and $v^{\theta}_{\cdot}(w)$ continuous, the agent's effort choice problem has a solution. Let $V^{\theta}(w) := \sup_{e \in E} v^{\theta}_{e}(w)$ denote the utility of type $\theta$. Let the non-empty, compact-valued mapping $(\theta, w) \mapsto e^*(\theta, w)$ denote the "reaction function" of type $\theta$:

$$e^*(\theta, w) := \left\{e \in E : v^{\theta}_{e}(w) \geq V^{\theta}(w)\right\}.$$

By Berge's Maximum Theorem, $w \mapsto V^{\theta}(w)$ is continuous for each $\theta$ and $w \mapsto e^*(\theta, w)$ is upper semicontinuous on $\mathcal{W}$ for each $\theta$.

Let $u^{\theta}_{e}(w) := \sum_{i=1}^{N}\left[x_i - w(x_i)\right] p^{\theta}_{e}(x_i)$ denote the principal's payoff. By Assumption 2, for each $\theta \in \Theta$, $u^{\theta}_{\cdot}(\cdot)$ is continuous on $\mathcal{W} \times E$, and for each $(w, e) \in \mathcal{W} \times E$, $u^{\cdot}_{e}(w)$ is $\mathcal{B}(\Theta)$-measurable. Notice that $p$ is $\mu$-integrable on $\Theta \times E$ because $|p^{\theta}_{e}| \leq 1$ on $\Theta \times E$.

Since $e^*(\theta, w)$ is compact and $u^{\theta}_{\cdot}(w)$ continuous, for each $(\theta, w)$, the problem $\sup_{e \in e^*(\theta, w)} u^{\theta}_{e}(w)$ has a solution, say $e^{\theta}_{w}$. Thus, for each contract $w$, the principal can provide the agent with a list, $\{e^{\theta}_{w};\ \theta \in \Theta\}$, of recommended efforts. Since $e^{\theta}_{w} \in e^*(\theta, w)$ for each $\theta \in \Theta$, a type-$\theta$ agent has no incentive to take an effort other than the one requested by the principal $e^{\theta}_{w}$ (i.e., the agent is obedient). To choose the best list of requests, the principal must therefore solve the problem

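$$\sup_{e \in e^*(\theta, w)} u^{\theta}_{e}(w)$$

for each $(\theta, w) \in \Theta \times \mathcal{W}$. Let $U^{\theta}(w) := \sup_{e \in e^*(\theta, w)} u^{\theta}_{e}(w)$. Since $u$ is $\mathcal{B}(\Theta) \times \mathcal{B}(\mathcal{W}) \times \mathcal{B}(E)$-measurable and continuous on $E$, and since $e^*(\cdot, \cdot)$ is $\mathcal{B}(\Theta) \times \mathcal{B}(\mathcal{W})$-measurable (see Nowak (1984), Lemma 1.10) and compact-valued, it follows that $U$ is $\mathcal{B}(\Theta) \times \mathcal{B}(\mathcal{W})$-measurable (see Himmelberg et al. (1976), Theorem 2). Moreover, since $e^*(\theta, \cdot)$ is upper semicontinuous on $\mathcal{W}$, it follows from Theorem 2 of Berge (1963) that $U^{\theta}(\cdot)$ is upper semicontinuous on $\mathcal{W}$ for each $\theta$. Finally, since the principal's payoff $u$ is integrably bounded, $U : \Theta \times \mathcal{W} \to \mathbb{R}$ is also integrably bounded on $\Theta \times \mathcal{W}$. With these observations, we can write the principal's program as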

$$\sup_{w \in \mathcal{W}} \int U^{\theta}(w)\, d\mu(\theta),$$

which has a solution since $\mathcal{W}$ is compact.

Remark 1. Notice that Theorems 2 and 3 still hold with a continuum of outputs if we impose that the space of feasible contracts is uniformly bounded. The maximization and minimization problems defined in the proof of Theorem 2 have solutions if we consider the $(L^{\infty}(X), L^{1}(X))$-weak* topology on the space of feasible contracts. Indeed, by the Banach-Alaoglu theorem (see Rudin, 1991), the set $M$ defined for those problems has a weak*-compact closure, and their objective functions are continuous with respect to the weak* topology. The rest of the proof is a simple adaptation of the arguments from Theorems 2 and 3.
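The obedience step of the existence argument (compute the reaction set $e^*(\theta, w)$, then let the principal pick her preferred effort within it) can be sketched for a finite effort set as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the output levels `x = (0, 20, 40)` are invented for the example, and the distributions and contract are borrowed from Example 4 in the next section, with outputs and payments ordered (L, M, H).

```python
from fractions import Fraction as F


def agent_payoff(w, p, cost):
    """v^theta_e(w): expected payment minus effort cost."""
    return sum(wi * pi for wi, pi in zip(w, p)) - cost


def principal_payoff(w, p, x):
    """u^theta_e(w): expected output net of payments."""
    return sum((xi - wi) * pi for xi, wi, pi in zip(x, w, p))


def reaction_set(w, dists, costs):
    """e*(theta, w): all efforts attaining the agent's maximal payoff."""
    payoffs = {e: agent_payoff(w, dists[e], costs[e]) for e in dists}
    best = max(payoffs.values())
    return {e for e, v in payoffs.items() if v == best}


def recommended_effort(w, dists, costs, x):
    """e^theta_w: the principal's preferred effort among the agent's best responses."""
    candidates = sorted(reaction_set(w, dists, costs))
    return max(candidates, key=lambda e: principal_payoff(w, dists[e], x))


# Type A of Example 4; outputs and payments ordered (L, M, H).
dists_A = {0: [F(1, 3), F(1, 3), F(1, 3)], 1: [F(1, 5), F(2, 5), F(2, 5)]}
costs = {0: 0, 1: 1}
w = [0, 6, 9]     # single contract reported in Example 4
x = [0, 20, 40]   # hypothetical output levels, not from the paper

print(reaction_set(w, dists_A, costs))           # the agent is indifferent between efforts
print(recommended_effort(w, dists_A, costs, x))  # the principal breaks the tie
```

Here the agent obtains 5 under either effort, so both efforts belong to the reaction set, and the principal recommends the high effort because it yields her 18 rather than 15 under the assumed output levels.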

Example from Section 3: FOSD without MS

This appendix presents an example illustrating that the single-contract result from Lemma 1 (b) requires MS. That is, if the family of distributions is ordered by FOSD but does not satisfy MS, it may be optimal to offer more than one contract.

Example 4. There are two types $\Theta = \{A, B\}$ in equal proportion, two efforts $E = \{0, 1\}$, and three outcomes $X = \{L, M, H\}$ with $L < M < H$. Both types have the same effort costs: $c^{\theta}_{0} = 0$ and $c^{\theta}_{1} = 1$ for all $\theta$. Their conditional probabilities are represented in the following table:

| Type | Effort | x = L | x = M | x = H |
|------|--------|-------|-------|-------|
| A    | e = 0  | 1/3   | 1/3   | 1/3   |
| A    | e = 1  | 1/5   | 2/5   | 2/5   |
| B    | e = 0  | 1/3   | 1/3   | 1/3   |
| B    | e = 1  | 1/6   | 1/2   | 1/3   |

Table 1: Conditional probabilities of each type.

Both types have the same probability distribution when they exert low effort. However, A observes both high and low outputs more frequently than B when he exerts high effort, whereas B observes the intermediate output more frequently than A. Notice that, for both types, the distribution of output with high effort first-order stochastically dominates the distribution with low effort. We assume that M and H are large enough for it to be optimal to recommend a high effort to both types. As we show below, the optimal mechanism offers contract wA = (0, 0, 15) to type A and wB = (0, 6, 6) to type B. Referring to the incremental payment in each state as “the bonus,” this mechanism pays type A a bonus of 15 in state x = H and zero in other states. Type B gets a bonus of 6 in state M and zero in states L and H. The principal finds it profitable to separate them because their likelihood ratios are not ordered. Each type is paid a bonus in the state where his likelihood ratio is the highest (x = H for type A and x = M for type B).
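The two distributional claims about Table 1 (high effort dominates low effort in the FOSD sense for both types, yet MS fails) can be checked numerically. The sketch below is illustrative only: the helper names are hypothetical, outputs are ordered (L, M, H), and MS is tested through the collinearity of the effort-induced shifts, as in property (3) of Proposition 3.

```python
from fractions import Fraction as F
from itertools import accumulate

# Table 1, outputs ordered (L, M, H); keys are (type, effort).
P = {
    ("A", 0): [F(1, 3), F(1, 3), F(1, 3)],
    ("A", 1): [F(1, 5), F(2, 5), F(2, 5)],
    ("B", 0): [F(1, 3), F(1, 3), F(1, 3)],
    ("B", 1): [F(1, 6), F(1, 2), F(1, 3)],
}


def fosd(p_high, p_low):
    """True if p_high first-order stochastically dominates p_low (its CDF is pointwise lower)."""
    return all(a <= b for a, b in zip(accumulate(p_high), accumulate(p_low)))


def effort_shift(theta):
    """Vector p^theta_1 - p^theta_0 for type theta."""
    return [a - b for a, b in zip(P[(theta, 1)], P[(theta, 0)])]


def collinear(u, v):
    """True if u and v are linearly dependent (all 2x2 minors vanish)."""
    n = len(u)
    return all(u[i] * v[j] == u[j] * v[i] for i in range(n) for j in range(i + 1, n))


fosd_holds = all(fosd(P[(theta, 1)], P[(theta, 0)]) for theta in ("A", "B"))
ms_holds = collinear(effort_shift("A"), effort_shift("B"))

print(fosd_holds)  # FOSD ordering between high and low effort, for both types
print(ms_holds)    # collinearity of the two effort shifts
```

The shift for type A is $(-2/15, 1/15, 1/15)$ and the shift for type B is $(-1/6, 1/6, 0)$; they are not proportional, which is why the example lies outside the MS class even though FOSD holds.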

Let $w^{\theta} = (w^{\theta}_H, w^{\theta}_M, w^{\theta}_L)$ denote the vector of payments to the agent. There are six ICs.

• ICs preventing deviations in effort for a fixed contract:

$$\frac{2w^{A}_H + 2w^{A}_M + w^{A}_L}{5} - 1 \geq \frac{w^{A}_H + w^{A}_M + w^{A}_L}{3}, \tag{62}$$

$$\frac{2w^{B}_H + 3w^{B}_M + w^{B}_L}{6} - 1 \geq \frac{w^{B}_H + w^{B}_M + 3w^{B}_L}{3}. \tag{63}$$

• ICs preventing deviations in both contracts and efforts:

$$\frac{2w^{A}_H + 2w^{A}_M + w^{A}_L}{5} - 1 \geq \frac{w^{B}_H + w^{B}_M + w^{B}_L}{3}, \tag{64}$$

$$\frac{2w^{B}_H + 3w^{B}_M + w^{B}_L}{6} - 1 \geq \frac{w^{A}_H + w^{A}_M + w^{A}_L}{5}. \tag{65}$$

• ICs preventing deviations in contract for a fixed effort:

$$2w^{A}_H + 2w^{A}_M + w^{A}_L \geq 2w^{B}_H + 2w^{B}_M + w^{B}_L, \tag{66}$$

$$2w^{B}_H + 3w^{B}_M + w^{B}_L \geq 2w^{A}_H + 3w^{A}_M + w^{A}_L. \tag{67}$$

The principal must choose payments $w^{\theta}_i \geq 0$ to minimize her expected cost

$$\frac{2w^{A}_H + 2w^{A}_M + w^{A}_L}{5} + \frac{2w^{B}_H + 3w^{B}_M + w^{B}_L}{6}$$

subject to the ICs above. This is a standard linear program, and it is straightforward to check that the solution is $w^A = (0, 0, 15)$ and $w^B = (0, 6, 6)$, which gives the principal a cost of 11.

To verify that we cannot implement the optimal mechanism with a single contract, consider the principal's program if we require her to offer the same contract $(w_H, w_M, w_L)$ to both types. The principal's cost is then

$$\frac{2w_H + 2w_M + w_L}{5} + \frac{2w_H + 3w_M + w_L}{6} = \frac{11w_L + 27w_M + 22w_H}{30}.$$

Since there is only one contract, the only relevant IC is the one preventing each type from choosing a different effort:

$$\frac{2w_H + 2w_M + w_L}{5} - 1 \geq \frac{w_H + w_M + w_L}{3},$$

and

$$\frac{3w_M + w_L}{6} - 1 \geq \frac{w_M + 3w_L}{3}.$$

It is straightforward to verify that the solution is $w_L = 0$, $w_M = 6$, $w_H = 9$, which gives the principal a cost of 12. Thus, offering the best single-contract mechanism yields a higher cost than offering the optimal menu of contracts in this example.
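The arithmetic reported in this example can be reproduced with exact fractions. The sketch below is a check of the stated numbers only, under explicit assumptions: the reported contracts are read in the order (L, M, H), so that $w^A$ pays 15 in state H; the helper names are hypothetical; and low-effort payoffs use the uniform row of Table 1. It does not solve the linear program and does not examine the cross-contract ICs (64) to (67).

```python
from fractions import Fraction as F


def pay_A_high(wL, wM, wH):
    """Expected payment to type A under high effort (probabilities 1/5, 2/5, 2/5)."""
    return F(wL + 2 * wM + 2 * wH, 5)


def pay_B_high(wL, wM, wH):
    """Expected payment to type B under high effort (probabilities 1/6, 1/2, 1/3)."""
    return F(wL + 3 * wM + 2 * wH, 6)


def pay_low(wL, wM, wH):
    """Expected payment to either type under low effort (uniform distribution)."""
    return F(wL + wM + wH, 3)


# Contracts reported in the text, read as (w_L, w_M, w_H).
w_A, w_B = (0, 0, 15), (0, 6, 6)
w_single = (0, 6, 9)

menu_cost = pay_A_high(*w_A) + pay_B_high(*w_B)
single_cost = pay_A_high(*w_single) + pay_B_high(*w_single)

# Effort ICs under the single contract: slack of (high-effort payoff) - (low-effort payoff).
slack_A = pay_A_high(*w_single) - 1 - pay_low(*w_single)
slack_B = pay_B_high(*w_single) - 1 - pay_low(*w_single)

print(menu_cost, single_cost)  # cost of the menu and of the single contract
print(slack_A, slack_B)        # both effort ICs bind when the slack is zero
```

Under these assumptions the menu costs 6 + 5 = 11 and the single contract costs 6 + 6 = 12, and both effort ICs hold with equality at the single contract, in line with the figures stated in the text.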

References

Berge, C. (1963). Topological Spaces. MacMillan: New York.

Himmelberg, C. J., Parthasarathy, T. and Van Vleck, F. S. (1976), "Optimal plans for dynamic programming problems," Mathematics of Operations Research, 1, pp. 390-394.

Nowak, A. S. (1984), "On zero-sum stochastic games with general state space. I," Probability and Mathematical Statistics, 4(1), pp. 13-32.

Rudin, W. (1991). Functional Analysis, 2nd edition. McGraw-Hill: New York.