Digital Platforms and Consumer Search

Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By

Benjamin Casner, MSc.

Graduate Program in Economics

The Ohio State University

2020

Dissertation Committee:

Huanxing Yang, Co-Advisor James Peck, Co-Advisor Paul J. Healy c Copyright by

Benjamin Casner

2020 Abstract

In the first chapter I explore why market platforms do not expel low-quality sellers when screening costs are minimal. I model a platform market with consumer search.

The presence of low-quality sellers reduces search intensity, softening competition

between sellers and increasing the equilibrium market price. The platform admits

some low-quality sellers if competition between sellers is intense. Recommending a

high-quality seller and this form of search obfuscation are complementary strategies.

The low-quality sellers enable the recommended seller to attract many consumers at a

high price and the effect of the recommendation is strengthened as low-quality sellers

become more adept at imitating high-quality sellers.

The second chapter looks at consumer search behavior when consumers are in-

experienced with the market. In many consumer search environments, searchers are

learning about the distribution of offers in the market. I conduct an experiment

exploring a broad class of search problems with learning about the distribution of

payoffs. My results support the common theoretical prediction that learning results in

declining reservation values, providing evidence that learning may be an explanation

for recall seen in field data. While the theory predicts a “one step” reservation value

, many subjects instead choose to set a high reservation value in order to learn

about the distribution before adjusting based on their observations. I provide evidence

that under-searching in search experiments may stem from a reinforcement heuristic.

ii In the third chapter I use a model with a platform facilitating interaction between consumers, advertisers, and content creators to explore the effects of introducing a subscription which allows consumers to avoid ads. The subscription increases provision of niche content, but increased advertising and a high subscription price may reduce the welfare of consumers who enjoy mass market content. The effect on total welfare depends on how much the platform increases payments to content creators as a result of the subscription

iii Acknowledgments

Thank you to my family for always supporting me, Parker Wheatley for teaching me how to study, Soyoung Lee for years of friendship and for being one of the best sounding boards for research ideas, Paul Healy and the micro theory group at Ohio

State for guiding me through the softer skills in academia, and most of all to James

Peck and Huanxing Yang for teaching me how to recognize when to drop a bad idea, pursue a good one, and how to recognize the difference.

iv Vita

2011 ...... B.A. Economics and Math, Saint John’s University 2012 ...... MSc. Behavioural Economics, Univer- sity of Nottingham 2014 ...... M.A. Economics, The Ohio State Uni- versity 2014-present ...... Graduate Teaching Associate, The Ohio State University.

Fields of Study

Major Field: Economics

v Table of Contents

Page

Abstract...... ii

Acknowledgments...... iv

Vita...... v

List of Tables...... ix

List of Figures...... x

1. Introduction...... 1

2. Seller Curation In Platforms...... 2

2.1 Introduction...... 2 2.2 Related Literature...... 5 2.3 The Model...... 9 2.3.1 Equilibrium of the Market Subgame...... 12 2.3.2 Consumer Surplus...... 19 2.4 Search Obfuscation Via Low Quality Sellers...... 20 2.4.1 Equilibrium Curation Decision...... 24 2.5 Recommending a Seller...... 28 2.6 Discussion...... 39 2.7 Conclusion...... 42

3. Learning While Shopping: An Experimental Investigation into the Effect of Learning on Consumer Search...... 44

3.1 Introduction...... 44 3.2 Related literature...... 48

vi 3.3 Theory...... 52 3.4 Design of the Experiment...... 54 3.4.1 Anatomy of an experimental search spell...... 55 3.4.2 Treatments...... 58 3.4.3 Supplemental data...... 60 3.5 Hypotheses...... 62 3.6 Main Results...... 66 3.6.1 The reservation value path...... 71 3.6.2 The information demand heuristic...... 78 3.7 Feedback and under-search...... 82 3.8 Distance from the theoretical optimum...... 88 3.9 High Cost Sessions...... 92 3.9.1 Results...... 92 3.10 Conclusion...... 96

4. Media Provision With Outsourced Content Production...... 98

4.1 Introduction...... 98 4.2 The Baseline Model...... 105 4.2.1 Baseline with no subscription...... 105 4.2.2 Baseline with subscription...... 111 4.3 Welfare Comparison...... 116 4.3.1 Total Welfare...... 119 4.3.2 Platform and Creator Profits...... 121 4.3.3 Consumer Welfare...... 122 4.3.4 Numerical Example...... 124 4.4 Extensions...... 127 4.4.1 Heterogenous Utility...... 127 4.4.2 Enriched Advertising Sector...... 133 4.5 Conclusion...... 137

Appendices 147

A. Omitted Proofs...... 147

B. Duopoly platforms...... 167

C. Experimental Instructions...... 173

C.0.1 Learning Last Experimental Instructions...... 173

vii C.0.2 Learning First Instructions...... 178

D. Media Provision Numerical Example...... 185

D.0.1 Analytical Solutions for the Triangular Distribution Example. 190

viii List of Tables

Table Page

3.1 Summary statistics for each treatment...... 66

3.2 Fixed effects OLS predicting the reservation value path...... 74

3.3 Random effects ordered probit regression predicting the direction of change in reservation values...... 76

3.4 Probit predicting whether a subject would be classified as a high information demand type...... 79

3.5 OLS regression predicting payoffs for the search task in the learning treatment...... 80

3.6 Random effects probit predicting the probability of reservation value increasing...... 83

3.7 Random effects ordered probit regression predicting the direction of change in initial reservation values...... 87

3.8 Summary statistics for each treatment in the high cost sessions.... 92

3.9 Random effects ordered probit regression predicting the direction of change in reservation values in the high cost sessions...... 95

ix List of Figures

Figure Page

2.1 The process by which consumers learn about seller quality and their match value...... 11

2.2 A parameter space showing the values where the platform will admit a positive proportion of low-quality sellers in this numerical example.. 23

2.3 Comparative statics using the uniform numerical example...... 27

2.4 A diagram showing the comparison between α∗ and αR∗ as a function of the parameters...... 36

2.5 A parameter space showing when the platform will admit a positive proprortion of low-quality sellers when it has the ability to recommend a high-quality seller in this numerical example...... 38

2.6 The proportion of low-quality sellers with no screening, at the profit maximizing level, if there were 0 screening cost, and if the platform treated prices as independent of the proportion of low quality sellers. 39

3.1 An example of the user interface subjects saw in the full information treatment...... 58

3.2 An example of the user interface subjects saw in the learning treatment. 59

3.3 The density plots for two possible distributions in the learning treatment. 60

3.4 Kernel density representation of the length of the search spells..... 67

3.5 The reservation values by depth in search spell and treatment..... 68

x 3.6 Initial reservation values by treatment...... 69

3.7 The kernel density and approximate paths of the normalized reservation values...... 71

3.8 The normalized reservation values in the learning treatment by depth in search spell, broken down by the true mean of the distribution... 73

3.9 The difference between the final observation and the corresponding reservation value by whether they lower, maintain, or raise the initial reservation value in the next search spell...... 85

3.10 The percentage error in reservation values compared to the theoretical optimum for a Bayesian learner by depth in search spell and treatment. 88

3.11 Mean absolute error for the first three reservation values of each search spell...... 90

3.12 The kernel density and approximate paths of the normalized reservation values in the high cost sessions...... 93

4.1 The sides and network externalities in the model...... 100

4.2 Content market structure...... 106

4.3 Market capture...... 118

4.4 Variation in total welfare and utility by advertising sensitivity..... 125

4.5 Change in total welfare and utility by curvature of market size..... 126

4.6 Consumer decisions as a function of type...... 128

D.1 Variation in subscription price and creator payment by advertising sensitivity...... 185

D.2 Variation in t∗ by advertising sensitivity...... 186

D.3 Variation in creator participation by curvature of market size function. 187

D.4 Variation in subscription price and creator payment by curvature of g(·).188

xi D.5 Variation in ad views...... 189

xii Chapter 1: Introduction

The advent of the internet has resulted in new business models and an increased

importance of understanding consumers search for products. The economics of

platforms are particularly important as an increasing proportion of market interaction

takes place on them. The implications of a profit maximizing market designer are

complex and have a significant effect on market outcomes, particularly when these

markets have search frictions.

In my research, I try to understand consumer search behavior and how platform-

profit maximizing market design impacts social welfare. Given that platform based

industries tend to be highly concentrated, this is a subject of particular interest to

anti-trust regulators as well as researchers trying to understand market power in the

context of market designing firms.

The first chapter combines the themes of search and platform design in an examina-

tion of when market platforms have an incentive to garble the consumer search process.

The second chapter focuses on consumer search when consumers are learning about

the distribution of offers in the market. This is important for understanding consumer

behavior in markets where there are many new consumers or when consumers shop

infrequently. The final chapter examines the welfare impact when a media platform which attracts content creators rather than making its own content offers an ad

avoidance subscription.

1 Chapter 2: Seller Curation In Platforms

2.1 Introduction

Platforms which connect buyers and sellers suffer a loss of reputation if they

host too many low-quality sellers. Reputational effects can have a significant impact

on customer retention (Nosko and Tadelis 2015). However, many platforms seem

unwilling to curate the quality of sellers on their marketplace. For example, many

sellers of low-quality exercise nutritional supplements on Amazon use fake reviews

to imitate high-quality sellers. Amazon seems reluctant to respond to this problem

even though many of these fake reviews are easy to detect algorithmically.1 While the

cost of screening could potentially explain some of this behavior, costs cannot explain why this reluctance appears even in cases when simple measures could significantly

increase the average quality of products on offer. The computer game platform Steam

has notoriously low barriers to entry, with one notable example where users found

that software they had purchased was just an empty folder. It would be a relatively

simple matter to ensure that a game must at least run before the platform allows it

1. See https://www.npr.org/sections/money/2018/06/27/623990036/episode-850-the-fake-review- hunter.

2 to be sold.2 It therefore seems likely that these platforms have an incentive to not

screen out low-quality sellers.

To address the question of why a platform might want a lax screening policy I

present a model where a monopoly market platform hosts sellers in exchange for a

percentage of their revenue.3 Sellers can either be low- or high-quality, and the platform

sets the proportion of low-quality sellers on its market. Consumers participating in the

platform’s market must search before they can purchase from a seller. The presence

of the low-quality sellers creates two countervailing effects: The obfuscation effect

reduces consumer search intensity, softening competition between sellers and raising

equilibrium market prices. The lemon effect reduces consumer confidence in the quality

of goods on the market which reduces their willingness to pay. The platform will

admit a positive proportion of low-quality sellers if the obfuscation effect is stronger

than the lemon effect. If the platform is able to recommend a high quality seller, then

that seller will benefit from obfuscation without being subject to the lemon effect which further increases platform profits and increases the likelihood that it will adopt

a lax screening policy.

The obfuscation effect is so named because I show that the low-quality sellers

can act as a search obfuscation mechanism (Ellison and Ellison 2009). Consumers

never knowingly purchase from a low-quality seller, so if a consumer searches and

encounters a low-quality seller then they will pay the search cost to visit another

seller, and they will continue searching until they encounter a seller they believe to

be high-quality. The possibility of encountering a low-quality seller upon searching

2. See https://www.gamerevolution.com/news/373453-steam-poorly-moderated-selling-empty- game-folder 3. I discuss a model with competition between platforms in Appendix B

3 decreases the value of searching as the proportion of low-quality sellers increases, which in turn leads to higher equilibrium prices.4 Although decreasing the value of

search drives some consumers away from the market, the platform increases seller

profits, and consequently its own revenue per consumer, by softening competition

between sellers.

The low-quality sellers in my model attempt to fool consumers into believing that

they are high-quality sellers. Consumers know the probability that low-quality sellers

successfully imitate high-quality sellers, but cannot distinguish between a high-quality

seller and a low-quality seller who is successful in their deception. The lemon effect

reflects the fact that the presence of low-quality sellers reduces consumers’ confidence

in the quality of any prospect they are considering because it is possible that it is a

low-quality seller in disguise. Consumers will have diminished willingness to pay for

sellers’ products and are less eager to participate in the platform’s market because

of the possibility that they may be scammed. The lemon effect is harmful to the

platform, so if it is significantly stronger than the search obfuscation effect then the

platform does not admit any low-quality sellers.

In independent work, Barach, Golden, and Horton (2019) show that highlighting

recommended sellers can be a powerful tool for platforms. At first glance this practice

might initially seem to be contradictory to the obfuscation effect in my model. In fact,

by combining the literatures on search obfuscation and seller recommendations, we

can see that recommending a high-quality seller and obfuscating search via low-quality

sellers are complementary strategies for the platform.5 It is profit maximizing for

4. One could equivalently think of this as an increase in the effective search cost. 5. An example of this kind of highlighting on Amazon includes “Amazon’s choice” products. On Steam, new software is often highlighted in a “top new releases” section on the front page.

4 the platform to recommend a high quality seller, so the recommendation is credible

and the recommended product is not subject to the lemon effect. Consumers are

therefore willing to pay a higher price for the recommended seller’s product than the

product of a random seller of uncertain quality. Consumers, the recommended seller,

and the platform are better off with the recommendation for any fixed proportion of

low-quality sellers. However, by mitigating the lemon effect the recommended seller

reduces the platform’s incentive to screen. The ability to recommend a seller can even

cause the platform to allow low-quality sellers under parameter sets where it would

not absent the ability to recommend. Consumer gains from the recommendation may

thus be diminished or even eliminated by an increase in the proportion of low-quality

sellers.

2.2 Related Literature

The idea behind search obfuscation starts with the observation that increasing

search costs often lead to higher prices and larger profits, with the extreme case being

Diamond’s (1971) result that firms will price as local monopolies if consumers can

be completely prevented from searching. Ellison and Ellison (2009) and Ellison and

Wolitzky (2012) demonstrate that coordination on increasing search costs can survive

even with sophisticated consumers. Although as Ellison and Wolitzky discuss, it is

not clear why firms would not simply coordinate on price if they can coordinate on

increasing search costs.6 Armstrong and Zhou (2015) explain the phenomenon of

“exploding offers” in both labor and consumer search markets as an attempt by firms

to deter search.

6. de Roos (2018) shows that limited product comparability can aid price . Therefore, price collusion and obfuscation together may be easier to justify than either alone.

5 These models all rely on individual firm actions affecting the market as a whole.

Most of the previous papers which have considered obfuscation by monopolists (e.g.

Petrikait˙e(2018) or Gamp (2019)) have done so in the context of multi-product firms which also have control over prices. The obfuscation in my model is more akin to

the wholesaler recommended prices in Lubensky (2017) in that it is a way for the

platform to regain some of the price control it sacrifices by not selling directly to

consumers.7 This control comes from softening competition between sellers, meaning

that I also contribute to the literature on within group effects in platforms (Weyl 2010;

Belleflamme and Peitz 2019).

The search obfuscation in my model is mechanically similar to the noisy search

in Yang (2013) or the relevance of sellers in Eliaz and Spiegler (2011) and Chen

and Zhang (2018). In all three models higher search quality is represented by a

reduced probability of encountering a product in which the consumer is uninterested

(i.e. 0 match utility). Encountering a low-quality seller in my model is equivalent to

an irrelevant search in those models, which means that reducing the proportion of

low-quality sellers is similarly equivalent to increasing the precision of search.

Eliaz and Spiegler (2011) is one of the most similar papers to mine in that it

focuses on a platform obfuscating search by controlling the average quality of sellers.

However, their platform receives a payment per search so it is thus a much more direct

conclusion in their case that the platform benefits from extending the search process.

By specifying a model in which the platform benefits directly from higher seller revenue

rather than indirectly via additional searches, I am able to connect obfuscation to the

7. I do not address the question of why the business chooses to operate as a platform instead of selling directly. See Hagiu and Wright (2015) for a discussion of the tradeoffs between selling as a platform and direct sales.

6 seller recommendation and search prominence literature. My results on recommended

sellers depend in part on the recommended seller holding a prominent search position.

All consumers who buy from the recommended seller search only once, but because

the platform in my model is skimming a portion of seller profits rather than charging

a price per search, its profits increase.

Chen and Zhang (2018) find similar effects in terms of the effects of including

low-quality sellers on search persistence and pricing, but their main focus is on the

trade-off between increasing the variety of products available and allowing more

low-quality sellers to enter the market. They do not derive analytical results for the

effects of including low-quality sellers on industry profit when sellers are horizontally

differentiated, nor do they consider the impact of average seller quality on consumer

participation.

Importantly, neither Eliaz and Spiegler (2011) nor Chen and Zhang (2018) include

deceptive sellers, whereas the imperfectly observable quality in my model drives the

pricing advantage held by the recommended seller. There is a significant literature

looking at false advertising and deceptive sellers (e.g. Corts (2014) Piccolo, Tedeschi,

and Ursino (2015), Rhodes and Wilson (2018)) but these papers focus on determining when the socially optimal policy allows some false advertising instead of platform

profit. Gamp and Kraehmer (2018) come closer in that they examine the interaction

between deceptive sellers and competition in a search environment. But again they

are not interested in platform design, instead focusing on seller incentives to provide

high or low-quality products.

Athey and Ellison (2011) and Chen and He (2011) examine prominence in the

context of sponsored search auctions, whereby a seller pays for prominence in a search

7 engine’s results. The primary interest of both papers is their respective auction

mechanisms and showing that higher quality sellers will bid for more prominent

positions. The platforms in these models are largely passive, primarily interacting with

consumers and sellers by modifying the auctions (i.e. setting a reserve price) rather

than the search environment itself. Mamadehussene (2019) considers obfuscation

in the context of a price comparison platform which is selling prominence, but the

obfuscation comes from sellers rather than the platform, which combined with his

platform’s revenue model does not allow for the complementarity I find between

obfuscation and the seller recommendation.

Unlike these previous papers, the benefit of the recommendation to the platform in

my model does not come from selling prominence. Prominence is important because it

directs consumers to the recommended seller, but it does not drive the seller’s pricing

behavior. The recommended seller’s higher price comes instead from the credibility

of the platform’s recommendation allowing the recommended seller to raise its price without driving away consumers. This result differs significantly from Yea’s (2018)

model where a platform’s recommendation is only credible if all of its

messages are revenue equivalent in equilibrium.

My finding that the recommendation can improve consumer welfare has empirical

support from Chen and Yao’s (2016) finding that following platform recommended

search order significantly reduces search costs with little loss in expected match

utility. Additionally, I provide an explanation for the the effectiveness of “cheap talk”

recommendations in Barach, Golden, and Horton’s (2019) field experiment. Finally, while not explicitly modeled as such, one could think of a platform’s first party content

8 as a form of recommendation, so this paper is also relevant to the literature on platform

” (Zhu and Liu 2018; Zhu 2019).

Hagiu and Jullien (2011) and White (2013) examine motivations for market

platforms to reduce search quality via mechanisms other than search obfuscation. The

“search diversion” described by these papers is similar to my model in that garbling

search drives away some consumers, but the benefit to the platform takes the form of

increased trading volume for high margin products. In my model the benefit instead

arises from higher market prices for all products.

Search design is an important consideration for market platforms, Dinerstein et

al. (2018) show that search design has a significant impact on seller pricing decisions in

platform markets. They describe how a change in eBay’s search design which lowered

search costs reduced price dispersion and equilibrium prices. Teh (2019) independently

derives some platform search design results similar to those I show in Section 2.4,

although his paper examines questions of platform governance more generally and

does not consider deceptive sellers.

2.3 The Model

The agents in the basic model consist of a monopoly platform, a unit mass

of consumers, each with unit demand, a continuum of high-quality sellers, and a

continuum of low-quality sellers, both with strictly positive total mass.8

8. In equilibrium all sellers charge the same price, so the platform only cares about the mass of transactions. Unit demand implies that consumers only consider the proportion of high vs. low-quality sellers on the platform and its effect on the search process. The model does not include an entry or operating cost, so the scaling effect from relative mass of sellers to consumers on seller profits does not change any seller decisions. For all of these reasons, the exact mass of sellers in the game or on the platform does not matter, but for the sake of notational simplicity I assume that the total mass of sellers on the platform is 1.

9 Consumers decide whether to shop on the platform, those who do not participate in

the platform’s market have access to an outside option. The population distribution of

utility for this outside option follows the continuous and concave CDF Q(·). Let α be

the proportion of low quality sellers in the platform’s market, and V (α) the expected value to a consumer of participating in the platform’s market given α. Consumers will participate in the platform’s market if V (α) is greater than their value for the

outside option, so the mass of participating consumers is Q (V (α)).9

Those consumers who do decide to participate face an environment similar to

Wolinsky (1986) or Anderson and Renault (1999). Consumer i’s utility when purchasing

from seller j charging price pj is

ij − pj (2.1)

ij is a consumer-seller specific match utility term. Consumers cannot observe p or

 before visiting a seller and can only purchase from sellers they have visited. For

high-quality sellers  is distributed according to the log-concave distribution function

f() with corresponding CDF F (), and support [, ¯] where  ≥ 0 and ¯ ∈ (, ∞).10

For low-quality sellers  = 0, so consumers never purchase from a seller they know

to be of low-quality. Low-quality sellers attempt to imitate high-quality sellers and

successfully fool a visiting consumer with exogenous probability β > 0.11 Thus, upon

9. For the purposes of this model I will be assuming that a positive proportion of consumers participate in the platform’s market. This is a fairly trivial assumption as the platform makes 0 profit with no consumers and a positive profit otherwise, but it allows me to ignore uninteresting equilibria where the sellers charge unreasonably high prices because they are indifferent over all prices with no consumers. 10. Strictly speaking the assumption that  ≥ 0 is not necessary for my results, but I include it to prevent confusion regarding the distinction between low- and high-quality sellers. 11. I use this reduced form process here because endogenous deception would likely make the model intractable. See Smirnov and Starkov (2018) for a detailed model of low-quality sellers imitating high

10 Participation decision Do not participate Participate

Consumer searches, visits a seller and observes price

Probability 1 − α Probability α

Seller is high-quality Seller is low-quality

Probability β Probability 1 − β

Seller fools Seller does not fool consumer consumer

Consumer receives Consumer receives  draw Consumer receives (fake)  Consumer searches value of outside and decides whether to draw and decides whether again option search again to search again

Figure 2.1: The process by which consumers learn about seller quality and their match value. Rounded corners indicate consumer actions, right angled corners indicate random events. The dashed line connects two nodes within an information set.

visiting a low-quality seller, consumers do not purchase with probability 1 − β and with probability β the consumer incorrectly believes the low-quality seller to be a

high-quality seller. Upon successfully deceiving a consumer, the low-quality seller

is indistinguishable from a high-quality seller, but this does not necessarily mean

that the consumer will purchase from that seller. A consumer who is fooled by a

low-quality seller receives a fake match utility signal which follows f() and represents

the extent to which the sales tactics which successfully deceive a consumer appeal

to that consumer. Consumers cannot distinguish between a true match utility draw

from a high-quality seller and a fake match utility signal from a successful low-quality

quality sellers. I discuss comparative statics on β in Section 2.4 and the possibility of the platform influencing β in Section 2.5

11 seller. The consumer purchases from a low-quality seller only if that low-quality seller

successfully fools the consumer and the perceived match utility is sufficiently high.

This process is shown as a flowchart in Figure 2.1.

The platform receives an exogenous proportion of sellers’ revenue and sets the

proportion of low-quality sellers α to maximize profit. While the obfuscation in

this model is essentially a story about failing to screen, I assume that the platform

can perfectly and costlessly determine seller quality and so does not face any sort

of screening cost when directly deciding the proportion of low-quality sellers. This

assumption allows me to focus on the primary mechanisms behind my results, see

Section 2.6 for a discussion of screening costs and their implications. I assume that

this proportion of low-quality sellers is publicly observable.12 Sellers’ only decision is what price to charge given the proportion α.

2.3.1 Equilibrium of the Market Subgame

As is standard in the search literature I use symmetric weak perfect Bayesian

equilibrium for my and assume that searchers’ beliefs about other

sellers’ prices are not affected by observing one seller deviate from equilibrium. The

timing of the model proceeds as follows:

1. The platform sets the proportion of high-quality and low-quality sellers.

2. Consumers commit to either participating in the platform’s market or remaining

with the outside option.13

12. Public observability here represents the reputation costs to the platform of allowing low quality sellers. 13. Commitment is not a binding assumption in the benchmark model since the search environment is stationary, so if consumers find participation initially worthwhile they will always find it worthwhile.

12 3. Sellers admitted by the platform set prices simultaneously.

4. Consumers participating in the platform’s market search among sellers and make

purchasing decisions.

Consumer Search Behavior

Consumers are aware of the presence of low-quality sellers, so if the proportion of

low-quality sellers on the platform is α, then a consumer who has searched and found

a prospect −pj + ij will evaluate the value Xb of this current option as

Ψij − pj ≡ Xb (2.2) where

1 − α Ψ = (2.3) βα + 1 − α

The term Ψ is the mathematical representation of the lemon effect, so I call it the lemon coefficient. It represents the fact that, conditional on a consumer evaluating

a product as high-quality, the probability that the product they are observing is

actually high quality is the probability 1 − α that they encountered a high-quality

seller, divided by the probability βα + 1 − α of the event that they evaluate a product

as high quality. This coefficient is decreasing in β because the more able low-quality

sellers are to successfully imitate high-quality sellers, the less the consumer will believe

that the product they are considering is actually high-quality.

The intuition behind the main results should still hold if search is not stationary. See Burdett and Vishwanath (1988), Choi and Smith (2016), or Casner (2020a) for a discussion of the effects of a non-stationary search environment.

13 Search is undirected, so a searching consumer will randomly select a new seller to visit.14 If a consumer chooses to search for another prospect, they find a low-quality

seller with probability α, and conditional on visiting a low-quality seller they correctly

determine that this seller is low-quality with probability 1 − β, so with probability

α(1 − β) the consumer must search at least once more. Those consumers who search

twice need to search a third time with probability α(1 − β) and so on. Let c denote

the (strictly positive) cost of searching, the expected cost of finding an additional

prospect is then

∞ X c [α(1 − β)]kc = 1 − α(1 − β) k=0 (2.4) c = βα + 1 − α

Denote by p the price that sellers set in a symmetric equilibrium and 0 the hypothetical match utility draw at a new prospect. Consumers expect that all unobserved sellers set price p so the expected payoff Xs from finding another prospect is Z ¯ 0 0 0 c Xs ≡ max [Ψij − pj, Ψ − p] f( )d − (2.5)  βα + 1 − α

A searcher can recall past prospects freely, so the new prospect is preferred only if

0 p−pj 15  > ij + Ψ . Setting the utility of searching (Xs) equal to the utility of staying with the current prospect (Xb), consumers are indifferent between staying with the

current prospect and searching again if

Z ¯ 0 0 0 c Ψij − pj = max [Ψij − pj, Ψ − p] f( )d − (2.6)  βα + 1 − α

14. High-quality sellers in this environment face a symmetric problem and will hence set symmetric prices. Low-quality sellers only make a profit if they successfully imitate a high-quality seller, so they must set the same price as the high quality sellers. Since all sellers are ex ante identical, consumers have no incentive to engage in ordered search if doing so requires even the slightest effort. 15. I use this assumption of free recall because it matches the previous literature and it slightly eases derivation. In equilibrium no consumer exercises recall so there are no qualitative differences between allowing recall vs. no recall for this model.

14 Combining terms and simplifying

Z ¯   0 p − pj 0 0 c = (1 − α)  − ij − f( )d (2.7) p−pj Ψ ij + Ψ

Define U as the ij which solves Equation (2.7). In equilibrium all sellers set the same

price, so the stopping rule is given by

Z ¯ c = (1 − α) (0 − U) f(0)d0 (2.8) U

and consumers will stop at a seller if they draw  > U. U is consumers’ reservation

match value and depends on α only through the latter’s effect on Equation (2.8).

Proposition 1. In a symmetric equilibrium the reservation match value U is decreas- ing in the proportion of low quality sellers α.

Proof. All proofs are provided in Appendix A.

The right hand side of Equation (2.8) is decreasing in U. From the assumption

of symmetry in prices, U does not change with α except through its effect on the

stopping rule, so the RHS of Equation (2.8) is also decreasing in α. If α increases, then

U must decrease in order to maintain equality. Intuitively, as the cost of searching

increases relative to the marginal value, consumers are willing to stop after observing

a lower match value. The possibility of encountering a low-quality seller and having

to search again increases the effective search cost and hence reduces consumer search

intensity. This decrease in search persistence as the proportion of low-quality sellers

increases is the mathematical expression of the search obfuscation effect defined in

Section 2.1.

It is worth noting that β has no effect on search behavior once a consumer is

participating in the platform’s market. This is because the probability that the

15 consumer is being scammed is the same across all prospects. Thus, while the lemon

effect reduces the consumers’ perceived valuation of each prospect, it reduces their valuation equally across all prospects. It does not influence the relative valuation of

an additional search relative to c because an increase in the probability of being fooled

reduces the expected search cost to find an additional prospect as well as reducing

consumers’ confidence in that prospect. These factors exactly cancel out and so β does

not influence search behavior in equilibrium. This result relies on ex-ante symmetry

across sellers and does not hold for the recommended seller in Section 2.5.

Seller pricing

Sellers’ only decision once they join a platform is what price to charge. Since

consumers’ beliefs about other sellers’ prices are not affected by observing deviations,

they use the stopping rule derived above. It is easy to show that an equivalent definition

R ¯ 0 0 0 c of U is the  which solves Ψ − p =  max [Ψ − p, Ψ − p] f( )d − βα+1−α (=

Xs in equilibrium). U is the critical match value determining whether a consumer

stops or continues to search after visiting seller j only if pj = p. The precise implication

of this stopping rule is that consumers purchase any observed prospect whose net

expected utility exceeds the difference ΨU − p.16 Consumers stop at a seller charging

price pj if ij makes value of consumption greater than value of an additional search

or:

Ψij − pj > ΨU − p (2.9)

16. While it might be more intuitive to derive equilibrium prices if the stopping rule were expressed in terms of this net expected utility (e.g. Bar-Isaac, Caruana, and Cu˜nat(2012) do so), the gains from ease of intuition would be more than offset by unwieldy expressions later on.

16 A consumer will stop at a seller who sets pj > p only if the match utility draw is high

enough so that the net expected utility from purchasing exceeds this cutoff. Solving

Equation (2.9) for ij, the probability that a given consumer who visits a seemingly

high-quality seller j charging price pj buys from that seller is the probability that

p−pj p−pj  ij > U + Ψ , or more directly: 1 − F U + Ψ . Let γ denote the probability that a consumer stops after visiting an arbitrary seller.

In equilibrium this probability is γ = (αβ +1−α)(1−F (U)), but since that probability

is based on other sellers’ prices I introduce γ here to emphasize that seller j’s price

is disciplined only by the proportion of consumers who stop after visiting j. Given

participation Q(V (α)), a total mass Q(V (α))γ consumers will search once and then

purchase. With probability 1 − γ consumers leave the after visiting a seller they visit

and so a mass Q (V (α)) (1 − γ) will evaluate at least two prospects, Q (V (α)) (1 − γ)2

Q(V (α)) three prospects and so on. Taking this to the limit, each seller will receive γ consumer visits and this mass will not be affected by an individual seller’s deviation

since each seller is infinitesimal compared to the size of the market.17 The platform

takes an exogenous percentage ξ of seller revenue, and the marginal cost of production

17. Technically the mass should be this fraction divided by the total mass of sellers, but as mentioned in Footnote8 I assume this latter mass to be 1 since the ratio of buyers to sellers is irrelevant for all agents’ decisions.

17 is 0.18 high-quality seller profit is then    Q(V (α)) pj − p π = (1 − ξ) pj 1 − F U + | {z } γ Ψ (2.10) Seller’s | {z } | {z } revenue Number of Price*purchase probability share consumer visits and low-quality seller profit is βπ. It is a matter of simple calculus to find that the profit maximizing price for both high- and low-quality sellers is determined implicitly by

p−pj  1 − F U + Ψ pj =Ψ (2.11) p−pj  f U + Ψ

Symmetry then implies pj = p so

1 − F (U) p =Ψ (2.12) f (U)

Log concavity of f(·) ensures sufficiency of the first order condition (Bagnoli and

Bergstrom 2005).19 The lemon coefficient appears in the price as well as consumers’ evaluation of their match utility. As consumers’ faith in the product they are purchasing decreases, sellers must lower their prices to compensate. On the other hand, the search obfuscation effect reduces the appeal of moving on to a different seller, which reduces

18. An exogenous platform commission is not essential for my results. Both Steam and Amazon set the same commission across many product categories with heterogeneous competitive environments, suggesting that factors exogenous to the market do play an important role in determining their pricing. I could follow Teh (2019) and introduce a positive marginal cost of production for sellers. In which case the commission would affect seller pricing, but my environment is sufficiently close to his that adapting his result that the platform will allow low-quality sellers if their marginal costs are sufficiently low would not be overly difficult. Given that much of my motivation comes from software platforms, it would arguably be more sensible to assume no marginal cost and instead endogenize the commission by assuming it is set using Nash bargaining. In this case ξ would continue to be an independent multiplier on profits and would not affect pricing, curation, or participation decisions. 19. Note that while symmetric costs mean that the profit maximizing prices are identical, this does not allow for the possibility that high-quality sellers might want to signal their quality via prices. However, even if I were to allow for different marginal costs, the cost of the low-quality sellers would surely be lower, meaning that the low-quality sellers could imitate any strategy the high-quality sellers might adopt which allows for positive profits. Given that consumers never knowingly purchase from a low-quality seller, the result would always be pooling on the high-quality sellers’ price.

18 U and pushes equilibrium prices upward. If the obfuscation effect dominates the

lemon effect, which happens when low-quality sellers are not too adept at imitating

high-quality sellers (β small), then softened competition causes prices to increase with

the proportion of low-quality sellers α. This logic is formalized in Proposition 2

Proposition 2. Equilibrium prices are increasing in the proportion of low-quality sellers if their probability of fooling any given consumer is sufficiently low. Formally:

∂p For any α, ∂α > 0 if β is sufficiently small.

2.3.2 Consumer Surplus

A consumer who participates on the market will stop after observing  > U. With

probability Ψ this observation comes from a high-quality seller, so the expected match

R ¯ f() 1 utility is Ψ U  1−F (U) d. The expected number of prospects observed is 1−F (U) , with

1 an average of βα+1−α searches required to find each prospect, so the expected search

c cost is (1−F (U))(βα+1−α) . Finally, all sellers will charge the same price p, so Z ¯ f() c V (α) = Ψ  d − − p (2.13) U 1 − F (U) (1 − F (U))(βα + 1 − α) Using Equation (2.8) and Equation (2.11), then simplifying

 1 − F (U) V (α) = Ψ U − (2.14) f(U)

20 1−F (U) Both the lemon effect and search obfuscation reduce V (α). The term U − f(U) is the consumer surplus retained by consumers who purchase from high-quality sellers.

20. This implies that unlike Rhodes and Wilson’s (2018) “price effect”, the lemon effect’s impact on price is never sufficient to create a net benefit to consumers from the presence of these low-quality sellers. The main differences that lead to these divergent results are: 1. the seller in their model is a monopolist, whereas I am considering a competitive setting and 2. Their price effect comes from a change in deception probability (equivalent to changing β in my model) whereas my model focuses on changes in the proportion of low-quality sellers. I discuss comparative statics on β after Proposition 4.

19 Conditional on purchasing from a high-quality seller, this surplus is not influenced by

the lemon effect. However, search obfuscation reduces U so consumers will receive

less surplus as a result of reduced search intensity if the proportion of low-quality

sellers increases. Because receiving positive surplus is conditional on purchasing from

a high-quality seller, the expected value of participating in the platform’s market is

scaled down by the lemon coefficient. The more likely consumers are to be ripped off

by a low-quality seller, the lower their expected payoff.

This clearly illustrates the difference between the obfuscation and lemon effects.

The obfuscation effect allows the sellers to capture more of the expected surplus

available in the market. This reduces participation but can be beneficial to the

platform because it receives a share of the increase in sellers’ profits. On the other

hand, the lemon effect reduces the total amount of expected surplus, which benefits

no one.

2.4 Search Obfuscation Via Low Quality Sellers

Every consumer who shops on the platform will purchase at the symmetric equi-

librium price p. The platform operates costlessly, so its profit is

Π = ξpQ(V (α)) (2.15)

Π Because ξ is an exogenous scalar, maximizing Π is equivalent to maximizing ξ = pQ(V (α)), so to simplify notation I suppress ξ unless it is relevant.

Proposition 3. The platform admits a positive proportion of low-quality sellers if the search cost is sufficiently low and low-quality sellers are not too adept at imitating high-quality sellers: α > 0 in equilibrium if both conditions1 and2 hold.

20 1. β is sufficiently small

0  ∂p  Q (V (0)) ∂U 2. c is sufficiently small and f(¯) > 0 or Q(V (0)) ∈ o ∂p as U → ¯ 1+ ∂U

Lower search costs create more intense competition between sellers, so intuitively

Proposition 3 says that the platform will admit low-quality sellers if the competition

between sellers on the platform is sufficiently intense in their absence (so that the

increased prices from softening competition compensate for the reduction in consumer

participation from search obfuscation) but only if the loss in consumer confidence

stemming from the possibility of being fooled by the low-quality sellers is not too large

(so that the the reduction in consumer participation from the lemon effect does not

eliminate the benefits to the platform from search obfuscation). Requiring positive

f(¯) or highly concave Q(V (0)) ensures that the net effect of softening competition

overcomes the reduction in demand from the search obfuscation effect when the

proportion of low-quality sellers α = 0 and competition without search obfuscation is

sufficiently intense.21

Example

Because analytic solutions for price, reservation match value, and participation

are not generically available, the proof of Proposition 3 relies on limiting arguments.

The reader might then quite reasonably wonder whether the platform only admits

low-quality sellers in a small portion of the relevant parameter space. I introduce a

numerical example here to demonstrate that this not the case. Suppose  ∼ U[0, 1]

for the high-quality sellers and Q(V (α)) = V (α) (equivalent to assuming that the

platform is a monopoly at one end of a Hotelling line with transport cost 1). Then

21. This requirement is sufficient but not necessary. I discuss the intuition behind this condition more thoroughly in the proof of Proposition 3

21 p c we can solve Equation (2.8) explicitly to find that U = 1 − 1−α . Plugging this into Equation (2.11) and Equation (2.14) we get that

r c p = Ψ (2.16) 1 − α  r c  V (α) = Ψ 1 − 2 (2.17) 1 − α

c 1 Consumer participation is decreasing in both c and α, and for 1−α > 4 no consumers find it worthwhile to participate in the platform’s market at all. Platform profit is

then

r c  r c  Π = ξΨ2 1 − 2 (2.18) 1 − α 1 − α

Figure 2.2 shows the region where derivative of platform profit with regard to

the proportion of low-quality sellers at α = 0 is positive. The marginal profit from

admitting low-quality sellers is monotonically decreasing in both β and c, and the

platform will only admit low-quality sellers if this marginal profit is positive, which

holds only if both c and β are small (the lower left corner of the graph). For c > 0.06

or β > 0.2 the platform will never admit a positive proportion of low-quality sellers.

Note that for c ≥ 0.25 no consumers would participate in the platform’s market

even at α = 0, and market platforms tend to have search costs that are quite low

relative to the purchase price . The prices in this example tend to be in the region of

0.2 (see Figure 2.3 in the discussion of comparative statics below). Hong and Shum

(2006), Blake, Nosko, and Tadelis (2016), and Ursu (2018), all find price to search

cost ratios below 10% in online search, so search costs above 0.05 seem unrealistic in

this example. Depending on how tightly one narrows the search cost, the platform

is admitting low-quality sellers in 10 − 20% of the relevant parameter space, which

22 0.14

0.12

0.10 * c 0.08 α >0

0.06 α*=0

0.04

0.02

0.0 0.1 0.2 0.3 0.4 β

Figure 2.2: A parameter space showing the values where the platform will admit a positive proportion of low-quality sellers in this numerical example.

When parameters lie in the shaded region the platform sets α > 0 in equilibrium, but it admits no low-quality sellers in the white region.

23 is not the majority but hardly a special case. This is reinforced further when one

considers that β close to 1 is probably unreasonable. This result is robust to various

distributions, and is further reinforced with the introduction of recommended sellers

(see Section 2.5).

2.4.1 Equilibrium Curation Decision

Assuming that conditions1 and2 of Proposition 3 are satisfied is equivalent to

assuming that the platform will set α ∈ (0, 1), because if the platform sets α = 1

then no consumer will participate in the platform’s market, so the only possible

corner solution is at α = 0, which is precluded by Proposition 3. Define an interior equilibrium as any equilibrium with α ∈ (0, 1). The following first order condition for

the platform’s curation decision is a necessary condition in any interior equilibrium.

f(U)2+f 0(U)(1−F (U)) 0 2 Q (V (α)) f(U) − βB(α) =  2 0  p 1 + f(U) +f (U)(1−F (U)) Q(V (α)) (2.19) f(U)2 | {z } | {z } Consumers leaving the platform Net marginal impact of low-quality sellers on price Where B(α) is an algebraic construct introduced to reduce unnecessary mathematical

clutter. The definition of B(α) can be found in the proof of Lemma 1.

Roughly speaking, the first term on the left hand side of Equation (2.19) represents

the elasticity of market prices with regard to the proportion of low-quality sellers, and

βB(α) is a mitigation in this price increase caused by the lemon effect. It is the result

of β and consumers’ expectations about β. The right hand side is the elasticity of

consumers’ participation in the platform. The following condition is stronger than

needed to give sufficiency of the first order condition, but is nevertheless not terribly

restrictive22:

22. See the proof of Lemma 1 for details of why this assumption is sufficient for single peakedness of platform profit in the curation decision.

24 Assumption 1. The variance of f(·) is large

Let α∗ denote the proportion of low-quality sellers in equilibrium.

Lemma 1. Under Assumption 1, α∗ is determined uniquely by Equation (2.19) for any interior equilibrium.

Lemma 1 allows me to use Equation (2.19) to evaluate the comparative statics of

α∗.

Proposition 4. Under Assumption 1, for any interior equilibrium

1. The proportion of low-quality sellers decreases in search cost: α∗ is decreasing

in c.

2. If Q(V (α)) is linear, then the proportion of low-quality sellers decreases in the

deception probability: α∗ is decreasing in β for linear Q(V (α)).

The obfuscation effect from the low-quality sellers effectively increases the search

cost, but comes at the cost of lost consumer participation and lower consumer confi-

dence. Increasing c does not contribute to the lemon effect, but does increase prices, so

if the search cost increases, then the platform has less incentive to increase the effective

cost further via low quality sellers. Increasing the search cost does not contribute to

the lemon effect so directly increasing this cost seems to be a preferable option for the

platform rather than obfuscating search via low-quality sellers. However, as Eliaz and

Spiegler (2011) note, an action which is so hostile to consumers would likely incite a

strong negative reaction from consumers and potentially even regulatory scrutiny.

Even if consumers and regulators were to remain complacent about an increase

in search costs, many platforms have a number of different markets which share a

25 common search environment. For example, the market for high-end headphones on

Amazon is likely much less competitive than the market for exercise supplements.

Thus, allowing more low-quality sellers in the supplement market would allow Amazon

to increase the search cost in that market without also increasing search costs in the

headphone market. Additionally, if screening costs are a factor (see Section 2.6) then

obfuscation via low-quality sellers may be more cost effective than direct obfuscation via increased search costs.23

Potential low-quality sellers who do not get to participate in the platform are worse off with a higher search cost, but the platform and all sellers (both low- and

high-quality) participating in the platform are better off. Sellers are able to charge a

higher price because the proportion of low-quality sellers is lower which reduces the

lemon effect and increases total profits. Interestingly, the consumers can be better off with a higher search cost as well. Because α∗ is decreasing in c, an increase in search

costs reduces the probability that they encounter and then trade with a low-quality

seller, and the effect of this increased confidence relative to an equilibrium with lower

search cost increases market participation.24

As β increases, the lemon effect becomes stronger compared to the search obfusca-

tion effect, so the marginal cost of admitting additional low-quality sellers increases

and α∗ decreases. Linearity of Q(·) ensures that the effects of consumers’ decreased

23. While the platform in this model is choosing the proportion of low-quality sellers, this is abstracting away from the idea that there would be some base rate of α and the platform would need to screen sellers in order raise or lower it. Increasing c would require an active and visible change in platform design, but increasing α only involves reducing screening effort. 24. Using the numerical solution above, holding β constant at 0.15, α∗ ≈ 0.19 when c = 0.01, giving V (0.19) ≈ 0.85. When c = 0.018 α∗ = 0 and V (0) = 0.86. It is worth noting that while the range of c where increasing search costs benefit consumers is narrow in this example, the reduction in α means that even when consumers are worse off with the increased search cost, it is often not a significant loss. When c = 0.03 (tripling search costs compared to the base case) V (0) = 0.82.

26 (a) Deception probability (c = 0.02) (b) Search cost (β = 0.1)

Figure 2.3: Comparative statics using the uniform numerical example.

willingness to pay are outweighed by their decreased market participation. If Q(·) is

highly concave, then Q0(·) might be small enough relative to Q(·) that the price reduc-

tion from an increase in β can be stronger than the effect on demand and an increase

in β could increase α∗ because the platform would rather serve fewer consumers at a

higher market price. However, no matter what the effect on α∗, both the platform

and sellers are worse off as β increases because it causes consumers’ confidence to

decrease more quickly as α increases. As with c, the effect of β on consumer welfare

is ambiguous. Directly, an increase in β reduces welfare by making consumers more vulnerable to the low-quality sellers, but if it reduces α∗ sufficiently then consumers

can be made better off in equilibrium because the number of low-quality sellers they

encounter decreases.

Figure 2.3 shows comparative statics from the uniform example above. Increasing

the deception probability makes consumers better off as a result of increased screening,

27 and the platform’s profit decreases slightly. Increasing search costs makes consumers worse off and increases platform profit.

2.5 Recommending a Seller

At face value, recommending certain sellers to consumers seems like behavior

contradictory to increasing search costs by admitting low-quality sellers. Why make it

easier for consumers to distinguish high quality sellers when the point of obfuscation is

to reduce comparison shopping? However, if the platform can credibly recommend a

high-quality seller to consumers then this behavior is in fact a complementary strategy

to admitting low-quality sellers. The low-quality sellers strengthen the effect of the

recommendation, allowing the recommended seller to charge a higher price, and the

recommendation mitigates the impact of the lemon effect which increases consumers’

participation on the platform. Consequently platform profits are always higher with a

recommended seller when α > 0, and the set of parameters where the platform will

admit a positive proportion of low-quality sellers is wider with a recommended seller.

I model the recommendation process by having the platform choose a high-quality

seller to be prominent in the sense of Armstrong, Vickers, and Zhou (2009). All

consumers visit the recommended seller first at no search cost, but if they choose

to move on and visit other sellers then their search process is undirected sequential

search as in the previous section. As will become apparent later, the platform is better

off recommending a high-quality seller than a low-quality seller, so consumers view

this recommendation as credible and believe with probability 1 that the recommended

seller is high-quality. Denote the recommended seller by R, in this case consumer i’s

evaluation of the utility from the recommended seller given match value draw iR and

28 price pR is simply

−pR + iR (2.20)

The value of searching for another prospect is quite similar to the basic monopoly

model

Z ¯ 0 0 0 c max [−pR + iR, Ψ − p] f( )d − (2.21)  βα + 1 − α

Where p is the equilibrium price of the non-recommended sellers. The new prospect is

preferable if

 + p − p 0 > iR R (2.22) Ψ

By logic almost identical to that in the basic model, the stopping rule is then given by

Z ¯   UR + p − pR c = (1 − α) 0 − f(0)d0 (2.23) UR+p−pR Ψ Ψ

Where UR is the iR which solves Equation (2.23). As in Section 2.3, UR as derived

above is based on the recommended seller charging the expected price pR. Equa-

tion (2.23) implies that consumers will stop at the recommended seller if the net utility

they receive from doing so exceeds UR − pR. Suppose the recommended seller deviates

0 0 to price pR, then consumers stop at the recommended seller if iR − pR > UR − pR

0 and the probability that these consumers stay after visiting is 1 − F (UR + pR − pR). The mass of consumers participating in the platform’s market is Q(V (α)), so the

recommended seller’s profit is given by

0 0 πR = pR(1 − ξ)Q(V (α))(1 − F (UR + pR − pR)) (2.24)

29 Consumers have rational expectations over firm prices and do not observe deviations, so the recommended seller takes consumer participation as given.25 The profit maximizing price is thus given by

0 0 1 − F (UR + pR − pR) pR = 0 (2.25) f(UR + pR − pR)

In equilibrium consumers’ expectations about the price are correct so

1 − F (UR) pR = (2.26) f(UR)

Note the absence of the lemon coefficient. Consumers believe with certainty that the recommended seller is high-quality, so α influences the recommended seller’s price only through UR. Consumers who draw iR < UR and visit other sellers never return to the recommended seller, so the other sellers are only competing with each other and price as in the previous section; This leads consumers who do not purchase from the recommended seller to use the stopping rule defined by Equation (2.8). Consumers’ greater confidence in the recommended seller’s product gives the recommended seller a competitive advantage which is formalized in Proposition 5.

Proposition 5. In the equilibrium with a recommended seller

1. Consumers visiting sellers after the recommended seller have the same reservation

match value U as in Section 2.3 and these other sellers charge the same price p.

2. The net expected utility above which the consumer stops searching is the same at

the recommended seller and the other sellers: UR − pR = ΨU − p.

25. The commitment assumption has more effect than in the baseline because the search environment is no longer completely stationary. Relaxing this assumption would mean that all consumers visit the recommended seller, but a subset of consumers would then leave the platform after finding that both the recommended seller’s product and the value of continuing to search are less appealing than the outside option. This creates broadly similar results as I present here with almost identical intuition at the cost of greatly increasing the difficulty of the analysis. Further details are available upon request.

30 3. The recommended seller sets a higher price than the other sellers if the proportion

of low-quality sellers is positive: pR > p if α > 0.

4. The recommended seller sets an identical price to the other sellers and the

reservation match values are identical if there are no low-quality sellers: UR = U

and pR = p if α = 0.

One of the less obvious implications of part2 in Proposition 5 is that the benefit

the recommended seller accrues from search obfuscation is exactly the same as the

benefit gained by the other sellers. Its pricing advantage comes entirely from the

lemon effect. If we were to set β = 0 then the recommended seller and the other sellers would all set the exact same price. The recommended seller’s profit actually increases

as the lemon effect gets stronger as long as consumer participation doesn’t decrease

too much.

Define V R(α) as the expected value to consumers of participating in the platform’s

market when there is a recommended seller. Solving explicitly for this value

Z ¯    R f() 1 − F (U) V (α) = (1 − F (UR))  d() − pR + F (UR)Ψ U − (2.27) UR 1 − F (UR) f(U)

Corollary 1. Consumers’ value of participation in the market is greater with a recommended seller than in the equilibrium where all sellers are symmetric: V R(α) >

V (α) for all α > 0.

The consumer value increases with the recommendation for two reasons: First,

the consumers observe the recommended seller with no search cost.26 Second, while

26. The assumption that the prominent seller is observed at no cost can be significantly relaxed and the result will still hold. However it seems natural that search costs would be significantly reduced when seeing the recommended seller and imposing this assumption drastically simplifies the analysis.

31 the recommended seller’s price is higher than the symmetric equilibrium price, the

recommended seller does not capture all of the surplus gained from mitigating the

lemon effect, with the remainder going to consumers.

An immediate consequence of Corollary 1 is that consumers ex ante utility is higher

at the recommended seller despite its higher price, so visiting the recommended seller

first is incentive compatible. If consumers never visit the recommended seller then

the search problem is stationary. If consumers do not wish to visit the recommended

seller in the first search, then visiting it cannot be more appealing than visiting a

random seller in any subsequent search, implying that consumers would never visit the

recommended seller. In that case, the value of participation would be exactly equal

to participation in the equilibrium with no recommended seller, but by Corollary 1

they can do better by visiting the recommended seller in the first search. Therefore

(1 − F (UR)) of the consumers who visit the platform will stop at the recommended

seller and pay pR, the rest will purchase from one of the other sellers and pay p. The

platform’s profit with a recommended seller is thus

R R Π = ξQ(V (α)) [(1 − F (UR))pR + F (UR)p] (2.28)

From Proposition 5 pR > p, so (1 − F (UR))pR + F (UR)p > p, and from Corollary 1

Q(V R(α)) > Q(V (α)) for any proportion of low-quality sellers. The platform’s profits

must be strictly higher with the recommended seller at any fixed α > 0, and at least weakly higher when α = 0. Given that the platform can still pick any α ∈ [0, 1] with

the recommended seller, its profit must be greater with the ability to recommend.

This proves Proposition 6.

32 Proposition 6. The platform’s profits are higher in the equilibrium of the model where it can recommend a seller than in the model with no recommended seller. This inequality is strict whenever the platform-profit maximizing proportion of low-quality sellers is positive in the model with the recommended seller.

Also note that if the platform recommends a low quality seller, then the share of

the platform’s profits coming from the recommended seller are scaled down by at least

β even if consumers’ ex ante beliefs do not change, so the platform is strictly better

off recommending a high quality seller.27

Unlike in the previous section, the platform’s profit is not necessarily decreasing

in β. Because a stronger lemon effect increases the strength of the recommendation,

the additional profit from a stronger recommended seller can more than compensate

for the lost profit of the other sellers and reduced participation. Therefore, while a

platform which is able to exert influence over β (e.g. by monitoring sellers or providing

stronger review/certification systems) would always reduce it in the base model, this

is not necessarily the case when recommendations are available.28

The positive effect of β on platform profits, and the profitability of the recom-

mendation in general, would be further strengthened if the platform were to auction

the recommendation using a second price auction. The recommendation is more

profitable for a high-quality seller, so it would maintain credibility and capture all of

27. As is common with this sort of recommendation model (e.g. Athey and Ellison (2011)) there is an alternate equilibrium where consumers don’t believe the platform’s recommendation, in which case the platform is indifferent over which type of seller to recommend. However, given that recommending a high-quality seller is always at least weakly incentive compatible, it seems sensible to focus on the more interesting equilibrium.

∂Π 28. Using the uniform example from before, ∂β > 0 with a recommended seller at c = 0.05, β = 0.2 and consequent choice of α = 0.27. While the necessity of solving for α at every parameter point investigated makes numerical search fairly slow, it appears that the parameter regions where the platform finds β beneficial are narrow, but extant. I thank an anonymous referee for suggesting this avenue of inquiry.

33 the supply-side benefits of the recommendation. I leave detailed consideration of this

process for future work.

Proposition 7 summarizes comparisons of the marginal effect of the low-quality

sellers between the symmetric monopoly model and the model with the recommended

seller. Let αR∗ denote the equilibrium proportion of low-quality sellers in the model with the recommended seller.

Proposition 7. For any interior equilibrium:

1. If the proportion of low-quality sellers is sufficiently small, then the reservation

match value at the recommended seller is smaller than at the other sellers and

it is also more responsive to changes in the proportion of low-quality sellers:

∂UR ∂U ∂α < ∂α for small α and UR < U for small α > 0.

2. The equilibrium proportion of low-quality sellers is higher with the recommended

seller if participation is less sensitive to the curation decision than price: αR∗ >

α∗ if Q(·) is sufficiently concave.

Part1 implies that if the equilibrium proportion of low quality sellers is small without a recommended seller, then it will be weakly higher with a recommended

seller. The most interesting case being the parameter ranges where the platform would

fully screen with no recommended seller but admits a small proportion of low quality

sellers if it can make a recommendation. Corollary 2 formalizes this intuition.

Corollary 2. The set of parameters where the platform admits a positive proportion of low-quality sellers is larger when it can recommend a seller than when it cannot. If the equilibrium proportion is small but positive in the baseline model, then the platform admits more low-quality sellers when it can make a recommendation:

34 1. The set of β and c such that α∗ > 0 is strictly contained in the set where αR∗ > 0.

2. The set of β and c such that αR∗ > α∗ strictly contains the set such that α∗ = 0

and αR∗ > 0

The recommendation increases the marginal benefit and reduces the marginal cost

of admitting low-quality sellers when α is small. From Proposition 4 α∗ will be small

but positive when β and/or c are large but not too large. So αR∗ > α∗ > 0 in a band

of parameters contained within the set where α∗ > 0 but near the border with the

set where α∗ = 0 and αR∗ > 0. If either parameter is very large then the equilibrium

proportion of low-quality sellers is 0 in both models.

αR∗ can be smaller than α∗ when c and β are small because of the higher average

price being paid by consumers so in general the comparison between αR∗ and α∗ is

ambiguous in this case. Price is more responsive to α with the recommended seller,

but because the average price paid by consumers is higher, the revenue lost from

consumers moving to the outside option is also greater. Part2 of Proposition 7 says

that for Q(·) sufficiently concave the increased ability to raise prices has more effect

than the revenue loss from consumers leaving the platform, in which case αR∗ ≥ α∗

under all parameter sets. Figure 2.4 summarizes these conclusions visually.

The fact that the recommendation increases the willingness of the platform to admit

low-quality sellers also speaks to the complementary nature of the recommendation

and obfuscation when it comes to the platform’s profits. Proposition 6 shows that

adding the recommendation on top of admitting low-quality sellers is more profitable

than admitting low-quality sellers alone. The platform is free to set α = 0 in the model with a recommendation, so the fact that it does not means the two strategies together

must be more profitable than just the recommendation. The previous literature (e.g.

35 c

α R ∗ = α ∗ α R = 0 ∗ > α α R ∗ ∗ > α = 0

α R ∗ α ∗ > ∗ > 0 > 0 0

β

Figure 2.4: A diagram showing the comparison between α∗ and αR∗ as a function of the parameters.

Armstrong and Zhou (2011)) on prominence has assumed that the benefit to the

platform from providing a recommendation comes from the fee sellers pay to the

platform. My platform is not charging such a fee, and the increase in profits from the

recommendation is a result of the presence of the low-quality sellers. Indeed, if the

platform were to recommend a seller without admitting low-quality sellers then this would have almost no impact on the platform’s profits. Participation would increase

slightly given that I assume the recommended seller is observed at no cost, but my

model would otherwise be equivalent to that of Armstrong, Vickers, and Zhou (2009).

In that paper, a prominent seller charges an identical price to all other sellers when

there are a continuum of sellers. The price difference for the recommended seller in

my model comes from the lemon effect, which is not present when α = 0.

36 Example

Continuing the example from the symmetric equilibrium, we can use Proposition 5

to find

 r c  U − p = Ψ 1 − − p (2.29) R R 1 − α

1−F (UR) Using p from the symmetric equilibrium example and the fact that pR = = f(UR)

1 − UR we can solve this equation to find

1 βα + 2(1 − α) r c  U = Ψ − 2 (2.30) R 2 1 − α 1 − α

Plugging these solutions into Equations (2.27) and (2.28), I again compute the

derivative of platform profits at α = 0. The results are displayed in Figure 2.5, and

they show that admitting low-quality sellers is much more profitable for the platform

in the model with the recommended seller than in the model without.

Comparing these results to Figure 2.2, the set of parameters where the platform

admits low-quality sellers in equilibrium is roughly quadrupled with a recommended

seller. For example: at c = 0.07 and β = 0.1 the platform would not admit any

low-quality sellers without a recommended seller, but the marginal profit from the

initial low-quality sellers is comfortably positive when a recommendation is possible.

37 0.14

0.12

0.10 * c 0.08 α >0

0.06 α*=0

0.04

0.02

0.0 0.1 0.2 0.3 0.4 β

Figure 2.5: A parameter space showing when the platform will admit a positive proprortion of low-quality sellers when it has the ability to recommend a high-quality seller in this numerical example.

When parameters lie in the shaded region the platform sets α > 0 in equilibrium, but it admits no low-quality sellers in the white region.

38 2.6 Discussion

In this section I discuss several topics which do not merit a full extension of the model but which nevertheless have potentially important implications for my results.

Screening Costs

In the model I assume that the platform can freely set the proportion of low-quality sellers in order to demonstrate the search obfuscation mechanism. This is obviously not the case for real world platforms and so the reader might reasonably question how to interpret this model. To answer that question, consider a situation where there is a base rate of low-quality sellers and that the platform must pay a cost to find and expel low quality sellers.

1

Base rate

Profit Maxmizing α

α∗

Profit Maximizing α (Ignoring Obfuscation) 0

Figure 2.6: The proportion of low-quality sellers with no screening, at the profit maximizing level, if there were 0 screening cost, and if the platform treated prices as independent of the proportion of low quality sellers.

39 In Figure 2.6 I show an example of a situation that could then occur. The base

5 29 rate of low quality sellers if the platform does not screen is approximately 6 . If

∗ 1 the platform had no screening costs then it would set α = 2 , but because screening

2 out the low-quality sellers is costly, its profit maximizing proportion is 3 . However, if the platform ignored the search obfuscating effect of the low-quality sellers (i.e.

if it assumed that getting rid of these sellers would have no effect on price) then it

1 would instead screen to a proportion of 6 . An outside observer who fails to take

2 1 obfuscation into account might then wonder why the platform sets α = 3 instead of 6 . If instead α∗ were greater than the base rate then we might not see any screening at

all, as screening out high-quality sellers might incite a significant negative reaction by

consumers or even legal action against the platform.

This example illustrates the point that the phenomena I demonstrate would show

up in real world behavior as inaction on the part of the platform. An empirical use

for this model could be to explain gaps between predicted optimal levels of screening

and actual screening in platform markets.

Alternative Platform Revenue Structures

Platforms could extract revenue by other means. For example: introducing a fixed

fee per transaction paid by the sellers would directly raise the market price without

impacting consumer confidence. In cases where a platform is charging such a fee we would expect to see more intensive curation, but as pointed out above, the impact

on competition in the platform would still act as an additional screening cost for the

29. These numbers are chosen to optimize readability of the figure rather than coming from any data.

40 platform. Therefore we would still expect to see α above the level which would be

implied by a naive screening policy which ignored competitive effects.

On the other hand if the platform were to charge an entrance fee to consumers, it would instead have an incentive to maximize the value of consumers’ participation in

the market and then extract this value via the entrance fee. We can see evidence of

this pattern in the fact that Apple (which makes a great deal of money from the sale

of iPhones) screens the apps on its platform much more intensively than Google does

for the Google Play store.

Cost of Returns

I do not model the possibility that unsatisfied consumers might wish to return the

products they have purchased. Assuming that returns are costly to the platform, then

the main effect of adding returns would be to magnify the negative impact of β on

platform profits. Consumers who purchase a product only to later return it not only

do not contribute to profits, but increase costs. However, if β is sufficiently low then

the volume of returns should be low enough that the increased profits from search

obfuscation would still overcome the combined negative effects of the lemon effect

and return costs. Although the equilibrium proportion of low-quality sellers would be

lower than if returns were not a factor.30

The impact of returns on platform profit may be mitigated or even positive if

consumers engage in the search process again after returning a low-quality product.

While even a return policy that fully reimbursed the purchase price could not eliminate

the negative impact of being scammed on consumers as they will still have expended

30. Customers often complain that Amazon makes return products unnecessarily difficult, either by making the process obtuse or requiring them to visit one of Amazon’s physical locations. It may be the case that this is an intentional strategy intended to mitigate the costs of accepting returns.

41 search costs on purchasing a dud product, it would mitigate their welfare loss. Conse-

quently, consumers value of participation V (α) and therefore Q(V (α)) would increase

at any α with the introduction of a return policy. See Hinnosaar and Kawai (2018)

for a more thorough discussion of sellers’ optimal behavior when returns are a factor.

2.7 Conclusion

This project explores the effects of low-quality sellers in platform markets and why

platforms might be reluctant to curate the selection of sellers who list in their markets.

In this model, platforms choose the proportion of high and low quality sellers on their

markets with no screening cost. Sellers and consumers then participate in a market

search game similar to Wolinsky (1986).

The impact of low-quality sellers can be broken down into a search obfuscation

effect and a lemon effect. The search-obfuscation effect reduces consumers’ willingness

to search and so softens competition between sellers on the platform. The lemon

effect reflects consumers’ belief that the product they are purchasing might be a

low-quality product successfully imitating a high-quality product, which reduces their willingness to pay. Both effects reduce consumers’ participation in the market, but

only the former is beneficial to the platform, so if the low-quality sellers are too

adept at fooling consumers then the platform would prefer to admit no low-quality

consumers. However, if the obfuscation effect dominates the lemon effect and the

market is sufficiently competitive then the platform-profit maximizing proportion of

low-quality consumers is positive despite the lack of screening cost.

If the platform can recommend a high-quality seller, then that seller will not be

subject to the lemon effect and so charges a higher price. Recommending a high-quality

42 seller is more profitable for the platform than simply obfuscating search as in the baseline model. The ability to recommend may cause the platform to admit low-quality sellers when it would not without a recommended seller. This recommended seller does not capture all of the benefits of the recommendation, so consumer welfare is higher with a recommended seller at any proportion of low-quality sellers. However, the net effect of recommendation may be negative for consumers if the ability to recommend causes the platform to admit more low-quality sellers.

43 Chapter 3: Learning While Shopping: An Experimental Investigation into the Effect of Learning on Consumer Search

3.1 Introduction

Much of the literature on consumer search behavior assumes that searchers know the

distribution of offers available in the market. In many markets the consumers who are

most familiar with this distribution are the consumers who search the least (Moorthy,

Ratchford, and Talukdar 1997). Therefore, it may be unreasonable to assume that

consumers know the distribution in circumstances where they purchase infrequently

or where a large percentage of consumers are new to the market. Uncertainty about

the distribution can have many implications for search behavior depending on the

form uncertainty takes. In this article I focus on circumstances where theory predicts

that learning will result in declining reservation values similar to those frequently seen

in field and experimental search behavior (e.g. Brown, Flinn, and Schotter (2011)).

Most papers in the theoretical literature agree that with well behaved priors,

searchers who are uncertain about the distribution of values will use a “one step”

reservation (or cutoff) strategy similar to those common in standard search models.31

“One step” here means that the searcher is deciding between accepting the best offer

31. “Well behaved” here meaning that all distributions allowed by the priors should have a shared support, and one observation should not move the posteriors too much.

44 currently available and one additional search(Burdett and Vishwanath 1988; Dana

1994; Bikhchandani and Sharma 1996; Dubra 2004; Mauring 2017; Kaya and Kim

2018).32 Furthermore, the initial reservation value of a purely Bayesian learner should

be above the reservation value which would be used for the expected distribution from

the priors, and reservation values will generally decline as a search spell continues

(Burdett and Vishwanath 1988; Bikhchandani and Sharma 1996; Dubra 2004). The

high initial reservation value comes not from a desire to gather information about the

distribution, but from the fact that a high observation will lead to more optimistic

posteriors and therefore a more persistent searcher. It takes a very high draw to cause

the Bayesian learner to stop after just one search. Declining reservation values stem

from the fact that a low observation is viewed not just as an unlucky draw, but as

bad news about the distribution. Consequently this low draw will lead to posteriors which are more pessimistic than the priors. A high draw may be sufficient to make

the searcher more optimistic, but any draw which would be high enough to raise

the reservation utility above its previous value will also be above that reservation value, meaning that the search spell ends before the reservation value can increase.33

I refer to this elimination of searchers who would raise their reservation values from

the search process as the selection effect, whereby searchers who would increase their

reservation values are filtered out of the data because they cease searching.

This selection effect is important because it could serve as a potential explanation

for real world behavior which is inconsistent with standard sequential search theory without learning. Most notably, the decline in reservation values means that a rational

32. The term from the search literature is “myopic”, but I use “one step” here to avoid accidentally conveying pejorative connotations. Especially given that the strategy is theoretically optimal. 33. Note that this refers to reservation values after the initial draw. It takes a high initial draw to get the Bayesian learner to stop after just one draw, but subsequent reservation values must decline.

45 searcher will sometimes exercise recall or exit the market in an environment with

learning where they would not if they had full information. In a policy context this

has important implications for labor market search. In particular, the increasing

pessimism of job searchers could exacerbate the standard supply/demand dynamics

that lead to low wages in a loose labor market. For managers of firms, particularly

firms which are acting as marketplaces, learning means that consumers need to be

shown appealing products early in their search process. Otherwise they are likely to

believe that the products on offer are low-quality or overpriced, which can lead to

consumers exiting the market entirely (e.g. Nosko and Tadelis (2015)).

The selection effect has been thoroughly explored in the theoretical literature, but

it has not previously been tested in an experimental environment. In this article I

attempt to fill this gap in the experimental literature by presenting an experiment which compares subjects’ behavior in a simulated consumer search environment with

a known distribution (the “full information treatment”) to their behavior when they

are updating over a set of priors (the “learning treatment”). I elicit reservation values

using a method similar to Jhunjhunwala (2018) and Brown, Flinn, and Schotter

(2011). Subjects exhibit declining reservation utility in both treatments, consistent with the previous experimental literature, but the rate of decline is much stronger in

the learning treatment, suggesting that learning is at least partially responsible for

declining reservation utility and recall seen in real world search data.34

While the selection effect appears to work well in this experimental environment,

the prediction that subjects will use a one step reservation value does not. The change

34. Subjects were 50% more likely to exercise recall in the learning treatment than the full information treatment. The p-value on a Mann-Whitney test for similarity of the treatments is less than 0.1%.

46 in the initial reservation values across treatments is inconsistent across subjects.

Higher ability subjects (as measured by numeracy metrics and academic data) set higher reservation values than in the full information treatment and often maintained these values for several searches. Lower ability subjects tended to decrease their initial reservation values in the face of uncertainty. In a post-session questionnaire, those subjects who raised their initial reservation values expressed a desire to gain information about the distribution. This heuristic (gathering information and then deciding how to react) requires much less computation than the theoretically optimal strategy and performs quite well in terms of search payoffs. Additionally, it results in behavior which is quite similar to the “search funnel” described in the marketing literature, whereby consumers first engage in broad searches within a market to learn about the products on offer and their prices before narrowing their search once they have this information. Previous discussions of this search funnel have suggested that it may result from searchers’ imperfect knowledge of their own tastes (Blake, Nosko, and Tadelis 2016). However, taste variation is not a factor in my experiment, so the similarity of behavior between the subjects in this article and the description of the search funnel suggests that learning about the market may be sufficient explanation for this phenomenon.

This information heuristic is likely to be of interest to researchers studying learning in complex environments. The general form of separating the information “exploration” from payoff relevant “exploit” behavior seems like it would apply to many economic contexts. For example, subjects in a market experiment who are setting prices against uncertain demand might prefer to test out several prices to get a sense of the market before they seriously attempt to maximize profits. This heuristic performs well in this

47 search environment, but the penalty for deviating from optimal behavior might be

much larger in a different setting.

Despite some subjects setting higher initial reservation values, most subjects in

this experiment drastically under-search relative to the theoretical optimum. I show

that this is at least partially a result of the structure of the game failing to give them

negative feedback when they set a reservation value which is too low. Subjects who

draw a value close to their reservation value are much more likely to increase their

reservation value in the subsequent search. This draw makes the possibility of settling

for a low value more salient and encourages them to become more persistent. Stopping with a value high above the reservation value does not intuitively feel like negative

feedback, but stopping with a value close to or below the elicited reservation value (the

latter due to exercising recall) does. Therefore the only significant change between

search spells is subjects who reduce their initial reservation values in the following

spell after receiving this “negative” feedback. This pattern is suggestive of a situation

similar to Charness and Levin (2005), where the obvious reinforcement heuristic leads

to actions opposite of those which would be indicated by Bayesian rationality.

3.2 Related literature

My experimental design emulates models where the selection effect causes reserva-

tion values to decline (Burdett and Vishwanath 1988; Bikhchandani and Sharma 1996),

but it also contributes to a broader literature on search and learning. Dubra (2004)

emphasizes the importance of priors in search behavior, and shows that with some pri-

ors a specific observation can cause the reservation value to increase drastically. Dana

(1994) explores the implications of strategic firms in a consumer search environment

48 with learning, but consumers’ rational expectations about firm pricing strategies mean

the selection effect is not present. Mauring (2017) and Kaya and Kim (2018) describe

models with alternative forms of learning which allow reservation values to increase

or decrease. Yang and Ye (2008) create a theoretical model explaining “rocket and

feather” pricing in markets where consumers learn about the value of search which

has subsequently been used to explain gasoline pricing (Chandra and Tappata 2011;

Lewis 2011). However Yang and Ye’s model is similar to that of Varian (1980) in that

searchers know all the prices in the market, meaning that the learning dynamics do

not allow for a reservation value. These examples illustrate that, while quite robust within the standard sequential search environment, the selection effect is dependent

on the primary source of learning being consumers’ own observations.

This article also contributes to a small but growing experimental search literature.

The seminal experimental economics work in search was conducted by Schotter and

Braunstein (1981), whose main finding was that consumers tend to under-search

relative to the theoretical optimum for a risk neutral agent. Most experiments have

been similar to the design of Schotter and Braunstein in that they model the search

process by allowing subjects to either continue searching or stop after observing an

offer rather than eliciting reservation values (Rapoport and Tversky (1970), Kogut

(1990), Zwick et al. (2003), and Caplin, Dean, and Martin (2011)). To my knowledge

Jhunjhunwala (2018) and Brown, Flinn, and Schotter (2011) are the only other

experiments which elicit reservation values from subjects. Both papers find that

reservation values decline over the course of search spells despite subjects having full

information about the distribution from which they are drawing prospects. Some

of this decline may be due to the fact that subjects do not fully understand the

49 distribution even after having it explained to them. Rapoport and Tversky (1970)

show that subjects fail to converge to the theoretical optimum level of search even after weeks of training in a distribution. However, Brown, Flinn, and Schotter show that

time spent searching has a stronger effect on the decline in values than the number of

searches, so their fatigue hypothesis seems more plausible.

A few other experiments have considered uncertainty. Falk, Huffman, and Sunde

(2006) consider search in a labor market frame with uncertainty as to the probability of

getting an offer upon searching.35 Cox and Oaxaca (2000) investigate search behavior

in an environment with uncertainty and unknown distributions. Their environment is

one of the most similar from the previous literature to that in this article. However,

they use discrete uniform distributions with a finite search horizon. Importantly, their

distributions often allow for observations which fully reveal the distribution. Because

of the nature of their priors, they find that behavior is inconsistent with a one step

reservation value strategy. Indeed, one of their main research questions is whether

theoretical models can successfully predict behavior in environments where theory

predicts that a reservation value strategy will not be used, while the priors in my

experiment are selected specifically to ensure that the theory predicts a reservation value strategy. Because they do not elicit reservation values, the majority of their

analysis consists of tests evaluating whether observed stopping behavior is significantly

different from theoretical predictions.

I am not aware of any other experiments which elicit reservation values from

participants in a setting with uncertainty about the distribution of match values. This

35. This situation is strategically more similar to uncertainty about the cost of search than the uncertainty about the distribution described in the articles which form the theoretical basis for my experiment. See Yang (2013) or Casner (2020b) for a more in depth discussion of search environments with a probability of failure and the similarity to an increase in search costs.

50 is significant since most of the theoretical predictions I test are stated in terms of

reservation values, so I can approach this problem more directly than previous work.

Eliciting reservation values does come at the sacrifice of making the assumption that

subjects would use a reservation value strategy given freedom to choose. While I show

that there is some evidence that this may not be true with an uncertain distribution, I

believe that it is generally a safe assumption. Sonnemans (1998) shows that reservation value procedures are among the most frequently chosen when subjects are given the

freedom to design their own strategies. Additionally, even if consumers would not

use a one step reservation value strategy of their own accord, the elicitation method

provides a more nuanced picture of their perceived value for an additional search than

implicit estimation methods using only stopping behavior would. For example, the

information demand heuristic described in Section 3.6.2 would be much harder to

study and might even have gone unnoticed without the elicited reservation values in

the learning treatment.

There is a small empirical literature on search with learning. De los Santos,

Horta¸csu, and Wildenbeest (2017) is the only empirical paper I know of which

addresses the selection effect directly. They use data from online MP3 player purchases

to demonstrate that ignoring the possibility that consumers learn while they search

can lead to severe statistical bias in estimations of search cost. Weisbuch, Kirman,

and Herreiner (2000) describe consumer search behavior in a Marseille fish market,

and show that consumers expend significant effort making sure that they have a

general idea of the prices being charged on the market even though they generally

only purchase from one store. Nosko and Tadelis (2015) show that consumers learning

51 about the reputation of sellers on eBay exhibit increasing pessimism similar to the

selection effect.

3.3 Theory

The class of search models explored in this paper is a one-sided, undirected, and

sequential consumer search environment with recall, unit demand, and non-strategic

firms.36 Consumers’ utility from consuming a product is a stochastic consumer-firm

match value u which is drawn randomly from some distribution and which consumers

only observe upon visiting a firm. Consumers pay a search cost c to visit a firm, and

they can only purchase from a firm they have visited. Since the firms are non-strategic, we can think of them as all charging the same price, or equivalently we can think of

match values as being consumption value for a firm’s product net of its price, in which

case with the distribution of values stems from different prices being charged in the

market. Regardless of interpretation, a consumer who searches k times and stops with

match value u receives final utility

u − c ∗ k (3.1)

From Weitzman (1979), the optimal strategy in this environment is to use a one-step

reservation value strategy. This means that the searcher will choose a reservation value, and if the highest value observed so far is below that strategy then they will

search once more and make a new decision based on the new draw. If their draw is

above the reservation value then they will stop searching and consume the highest

36. Non-strategic firms are not strictly necessary for the selection effect, but for the purposes of this experiment I am interested in consumer rather than firm behavior, so considering firm strategy would introduce counterproductive complexity. This model is also equivalent to one sided labor market search models with non-strategic wages.

52 option they have observed. Given a distribution of utility values F (·) with support

[u, u¯] and density f(·), the reservation utility x will be the minimum utility at which

the cost of searching (c) equals or exceeds the expected value of an additional search.

For a continuous distribution with free recall this is the x which solves37

Z u¯ (u − x) dF (u) = c (3.2) x

In a stationary environment, neither the cost of searching nor the distribution of values changes as the consumer continues to search, so the reservation value should be

constant, rather than declining as we see in actual behavior. If consumers are learning

about the distribution of prospects as they search, then the stopping rule changes

slightly. In the experiment all uncertainty is about the mean of the distribution, so I

focus on that environment here. Let µ be the mean of the distribution, µ∗ ∈ [µ, µ¯] the

∗ searchers’ prior mean and g(µ|µ , x1, x2, x3, ..., xn) the posterior density assigned to µ

given the priors and previous observations x1, x2, x3..., xn. The stopping rule for a one

∗ step searcher is then stopping after observing any u > x(µ , x1, x2, x3, ..., xn) where

∗ x(µ , x1, x2, x3, ..., xn) solves Z µ¯ Z u¯ ∗ ∗ (u − x(µ , x1, x2, x3, ..., xn)) dF (u|µ)g(µ|µ , x1, x2, x3, ..., xn)dµ = c ∗ µ x(µ ,x1,x2,x3,...,xn) (3.3)

The selection effect says that there is no observation which could place a high posterior weight on a high µ without also being above the reservation value. Consumers who

continue to search must become increasingly pessimistic as long as the priors are

∗ well behaved. Therefore x(µ , x1, x2, x3, ...) must decline as the number of searches

increases. My experiment is designed to test if the selection effect appears in actual

37. Recall in my experiment is costly, but the stopping condition with costly recall is more complex and the intuition is identical, so I use free recall for the explanation of the one step stopping rule.

53 behavior, but previous experimental work (Brown, Flinn, and Schotter 2011) has

shown that reservation values decline even in environments with a known distribution,

so it is clear that learning about the distribution cannot be the only factor leading to

declining reservation values.

Two other common explanations for declining reservation values include failure

to account for the sunk cost fallacy (Kogut 1990), and fatigue (Brown, Flinn, and

Schotter 2011). Failure to account for the sunk cost fallacy means that the searcher

fails to treat expended search costs as sunk, instead adding the them to the right

hand side of Equation (3.2). Fatigue is the idea that continuing to search is tiring

and/or boring, so that the cost of searching (c) actually increases over time. All

three processes should result in a monotonically declining reservation value and the

effects are difficult to separate in an empirical setting, but they are distinguishable

in a laboratory environment. I control for fatigue in my experiment by tracking how

long subjects have been searching when they set reservation values. I do not explicitly

control for sunk cost, but Kogut’s description of sunk cost mostly focuses on the

induced search cost rather than effort. Since the induced search cost is constant whether or not subjects know the distribution, any difference between the treatments

should be entirely the result of the difference in information structures.

3.4 Design of the Experiment

The experiments for this article were conducted in the Experimental Economics

Laboratory at the Ohio State University using zTree software (Fischbacher 2007).

The design of the experiment was within-subject, with subjects participating in seven

search spells with a known distribution and seven search spells with an uncertain

54 distribution. I refer to the search spells where they knew the distribution as the full information treatment and the search spells with an uncertain distribution as the learning treatment. To test for order effects, some subjects experienced the learning

treatment first and some the full information treatment first. The 81 participants were recruited through The Ohio State University Economics Department’s “Online

Recruitment System for Economic Experiments” (ORSEE) (Greiner 2015). 38 of these

subjects experienced the full information treatment first, and 43 saw the learning

treatment first. The average payoff for the experiment was $13.68, with a minimum

of $7.55 and maximum of $18.75. Each session took approximately 1 hour.

3.4.1 Anatomy of an experimental search spell

To simulate the search process, subjects drew values denoted in experimental

currency units from a truncated normal distribution with support [0,700], a standard

deviation of 100, and a mean which depended on the treatment.38 To simulate the cost

of searching each draw required subjects to pay a cost of 5 experimental currency units.

Subjects’ final payoffs for a search spell were the maximum match value observed

minus the sum of search costs accrued over the course of the spell. For example, a

subject who searched 5 times and stopped with a match value of 400 would have a

final payoff of 375 experimental currency units for that search spell. This payment

structure does in effect create a finite horizon for search. I did impose a constraint that

the payout for a search spell could not go below 0 so a subject who searched sufficiently

many times could theoretically have accrued a final payoff of 0 and also included a

notification at 25 searches warning subjects of this possibility. The longest search

38. For clarity, the standard deviation of the original distribution was 100. It is therefore slightly smaller for truncated distribution, but the amount of mass removed by the truncation is quite small, so the difference is negligible.

55 spell in all of the data was 19 searches, so this warning was not seen by any subject.

Even at 20 searches, the total accumulated search costs would be 100, meaning that

a subject would have to be incredibly unlucky to not receive a positive payoff after

that many searches even when facing the least favorable distribution in the learning

treatment. Given the ubiquity of under-search in previous experiments and the small

search cost it does not seem likely that this finite horizon had any impact on search

behavior.

The theory makes specific predictions about how the path of reservation values

should differ when the distribution of values is and is not known. The main outcome of

interest in this experiment is whether subjects’ reservation values decline within search

spells and if that decline is steeper when they are learning about the distribution (see

Section 3.5 for a more detailed description of the hypotheses in the context of this

experimental design). In order to elicit reservation values at every point in a search

spell, subjects were asked to input a “minimum acceptable value” (i.e. a reservation value) before each draw. If the draw was above the subject’s minimum acceptable value, then that would be interpreted as an “acceptable offer” and the search spell

ended with the payoff as described above. If the draw was below this reservation value,

then subjects had the option to either pay the search cost to accept their highest draw

so far, equivalent to exercising costly recall in a traditional search environment, or

they could state a new reservation value and pay the search cost to receive another

draw. Each search spell continued until the subject either drew a match value above

their current minimum acceptable value, or accepted their highest draw. Costly recall

ensures strict incentive compatibility of stating the true reservation value for every

search. A value below the true reservation value only changes the payoff if it causes

56 the subject to stop after observing a value that is, by definition, lower than the payoff at which they would wish to stop. Stating a value above their true reservation value and recalling leads to a strictly lower payoff than stating the true reservation value in the first place because of the cost of recall. Each treatment consisted of two practice search spells with no payoff, and five payoff relevant spells. Payment for the search task was calculated by randomly selecting the results of one of the five payoff relevant search spells in each treatment.

57 3.4.2 Treatments

Figure 3.1: An example of the user interface subjects saw in the full information treatment.

The horizontal axis of the graph is the range of possible value draws and the histogram shows 10,000 simulated draws from the distribution (an approximation of the density function). The dots on the axis are values that the subject drew themselves.

In the “full information” treatment, mean of the truncated normal distribution was

350 for every search spell and subjects were aware of this fact. In addition, subjects were shown a histogram with the results of several thousand simulated draws from

the distribution. Once subjects began to search the values they drew appeared as red

dots on the axis of the histogram.

58 Figure 3.2: An example of the user interface subjects saw in the learning treatment.

It is mostly identical to the full information treatment, but the subjects did not receive full information about the distribution of payoffs and only saw their own draws as dots on the axis.

In the learning treatment, the support of the distribution was still 0 to 700 units, but the mean of the distribution was determined randomly at the beginning of each search spell. The mean was drawn from a U[150,550] distribution, and subjects were aware of both the range and shape of the prior distribution, but did not know the realized value. Figure 3.3 shows the density plots for two possible distributions that subjects might have faced in the learning treatment. Importantly, the endpoints of the parameter distribution were chosen to be sufficiently far away from the endpoints of the truncation that the effect of a high or low mean on the shape of the distribution

59 was minimal. Subjects guessed the actual value of the mean after each search spell

in the learning treatment. If their guess was within 100 of the actual value then

they had 100 minus the absolute value of the difference between the actual value

and their guess added to their payoff for that search spell. This data is used in

Section 3.8 to evaluate the determinants of the distance between subjects’ guess and

the actual value. Experimental currency units were converted to dollars at a rate of

1ECU = $0.014USD.

0.004

0.003

0.002

0.001

100 200 300 400 500 600 700

Figure 3.3: The density plots for two possible distributions in the learning treatment.

The solid line shows the density for a mean of 300 and the dashed line shows a mean of 450.

3.4.3 Supplemental data

The experiment included several additional tasks to gather supplemental data.

Subjects underwent a risk aversion assessment similar to Holt and Laury (2002) to

track risk preference. To track numeric competency they completed a symbolic number

mapping line task (Peters and Bjalkebring 2015). For the number line task, subjects

60 were shown an otherwise unmarked number line with endpoints between 0 and 1,000

and they were asked to mark the position of a sequence of numbers on that line.

The closer their marks to the actual positions of the numbers on the line, the more

numerate they were considered to be. Numeracy is correlated with better ability to

process information, and so the expectation was that more numerate subjects would

exhibit behavior closer to the optimal strategy. Additionally, Schley and Peters (2014)

show that higher symbolic number mapping numeracy is associated with more risk

neutral behavior. This comes from the association between symbolic number mapping

numeracy and ability to form accurate intuitive estimates of numeric information.

Subjects with low numeracy scores tend to estimate the value of experimental lotteries

more conservatively and so exhibit more risk averse behavior. There was also a

subjective numeracy questionnaire which assessed subjects’ self-perception of their

ability with numbers. Falco (2019) hypothesizes that subjective numeracy may have

effects on behavior separate from its correlation with actual numeric ability.39 Those

subjects who have higher subjective numeracy have been shown to have a desire

to use more numeric information than those who are less confident with numbers.

The hypothesis behind measuring subjective numeracy was that this behavior could

manifest as greater persistence in search spells with an unknown distribution due to a

desire for more information about the distribution.

39. The data from this project was shared with David Falco for a separate research project. I would like to thank him both for suggesting the subjective numeracy assessment and providing the questionnaire. See his paper (Falco 2019) for a psychological investigation of the relationship between numeracy and consumer search.

61 3.5 Hypotheses

Applying Equation (3.2) and Equation (3.3) to the experimental environment. The

optimal reservation value in the full information treatment is 475, which implies that

the average search spell should be 9.5 searches long. The optimal initial reservation value in the learning treatment is 593, although the expected length would not be trivial

(or indeed particularly informative) to calculate because of path dependence between

the reservation values and observations. According to the theory, reservation values

should be constant in the full information treatment, while the learning treatment will lead to declining reservation values. The rate of decline is also path dependent as

a searcher who sees middling values will be less pessimistic than one who has seen a

number of low draws.

From the previous literature we tend not to see constant reservation values with

a known distribution. However, the selection effect from the theory should only

be present in the learning treatment. Therefore the main comparison of interest is whether the rate of decline is significantly steeper in the learning treatment than in

the full information treatment. This is summarized in Hypothesis 1:

Hypothesis 1. Reservation values may or may not decline within search spells in the full information treatment, but they will decline in the learning treatment. If reservation values decline within search spells in both treatments, then the rate of decline will be steeper in the learning treatment and this will be driven by increasing

pessimism about the distribution of match values.

Based on the predictions of the theory and behavior in previous experiments with

full information, the design of this experiment assumes that subjects will using a one

62 step reservation value strategy in both treatments to elicit reservation values. While

the previous literature has provided ample evidence that this is true when subjects

know the distribution (e.g. Sonnemans (1998) or Brown, Flinn, and Schotter (2011)),

it is important to check that this is indeed the case when interpreting the results of

the learning treatment.

Hypothesis 2. Subjects will use a one step reservation value strategy in the learning treatment.

Testing Hypothesis 2 is not quite as straightforward as Hypothesis 1, and is partly

a qualitative exercise. The key feature of a one step strategy is that the Bayesian

searcher will be using the information conveyed by searches, but will not search

specifically to gain information. If a rational Bayesian agent observes an initial draw

of 550 in the learning treatment, then they will continue searching not because they want to gather more information but because their posterior beliefs put high weight on

favorable distributions. If the first observed value is above 563 then the posteriors are

still quite optimistic, but the expected marginal value of one additional search is still

smaller than the search cost. Assuming the searcher observes 550 and then 520, then

the posterior beliefs would still be quite high, but the reservation value would then be

around 540 and the searcher would exercise recall. An example of behavior in this

experiment that is inconsistent with a one step strategy is entering a reservation value

of 700 because there is no possibility of drawing a value which would exit the search

spell. This ensures that the searcher must either exercise costly recall or search at

least once more, meaning that there is no possibility of completing the search process

in one step. Another strategy that some subjects use is that they will set a high but

not extreme reservation value, say 600. This reflects the reasoning that if they get a

63 draw above 600 then it does not matter so much what the actual distribution is, but

they will then maintain that reservation value for several draws before adjusting based

on their observations. The fact that they are waiting to make use of the informational

content of the draws is again inconsistent with deciding between accepting the current

highest value or drawing once more.

I reject Hypothesis 2 in Section 3.6.2 primarily based on subjects’ descriptions of

their own strategies in the learning treatment combined with finding behavior like

that described in the example above which is consistent with the alternative strategies

being described and inconsistent with a one step strategy. The fact that they are

engaging in initial searches with the primary goal of gathering information and the

intention of making decisions about acceptance only after this information gathering

step is explicitly not a one step strategy. Result 1 and Result 2 in Section 3.6 give

more detailed descriptions of how Hypothesis 1 and Hypothesis 2 fare when evaluated

against the data.

Section 3.7 was a reaction to finding under-searching at levels similar to those in

other experiments. One possible explanation for this under-searching is that, similar

to most overbidding in second price auctions, subjects are usually not punished in an

obvious way for setting a reservation value below the optimal level. Even when they

are forced to settle for a low value as a result of entering a low reservation value they

still receive a positive payout. If on the other hand, a subject experiences a “near

miss”, where they receive a draw that is close to, but still below their sub-optimal

reservation value, then this may make them aware of the possibility that they could

exit the search spell with a low draw and cause them to increase their reservation value.

64 Hypothesis 3. Subjects will be more likely to increase their reservation value if they see a draw which is below, but close to their reservation value.

Charness and Levin (2005) show that subjects tend to respond to a high payout by

repeating the strategy that led to that payout even when doing so is not optimal.

If subjects use a similar reinforcement heuristic in this environment, then we would

expect a subject who sets a low reservation value and receives a draw far above that

reservation value to imitate the strategy that received this result in the following

search spell.

Hypothesis 4. Subjects who draw a value high above their reservation value in one search spell will set an initial reservation value in the following search spell which is similar to the initial reservation value in the search spell which received the high draw.

Result 3 and Result 4 summarize my evaluations of how hypotheses3 and4 fared in

the data.

In Section 3.8 I analyze subjects’ behavior in comparison to the theoretically optimal

behavior. I did not enter this project with a strong prior about how behavior would

differ from the optimum, so Result 5 in that section does not have a corresponding

hypothesis here.

Finally, Section 3.9 describes a robustness check evaluating whether subjects

behavior is similar given a higher search cost. Result 6 does not have a corresponding

hypothesis apart from the implicit null prediction that behavior will not be significantly

different from behavior in the low search cost sessions except that a higher search cost will lead to shorter search spells.

65 3.6 Main Results

Because the practice spells were not payoff relevant, I drop those spells from all

analysis. The results in the rest of the paper use the data from the ten payoff relevant

spells.

Full Information Learning Average Search 2.785 3.217 Length (0.127) (0.158) Average Search 418.17 409.77 Profits (3.876) (6.376)

Table 3.1: Summary statistics for each treatment.

Search profits are stated in experimental currency units. Standard errors are shown in parentheses.

Table 3.1 shows summary statistics for the full information and learning treatments.

The average search length is longer with the learning treatment, but the average payout

for the searchers is lower. This is because the average length of search spells in the

learning treatment is inversely correlated with the mean of the distribution (as can be

seen in Figure 3.8). I discuss the implications of this correlation in Section 3.6.1.

Figure 3.4 shows a kernel density representation of the number of searches in each

search spell. Most search spells lasted less than 5 searches, and 97% of all spells

lasted less than 10 searches, I therefore drop search spells longer than 10 from the

graphical analysis in this section. If risk neutral subjects were searching according to

66 Figure 3.4: Kernel density representation of the length of the search spells.

the theoretical optimum in the full information treatment then the expected number

of searches would be 9.5. Subjects are therefore under-searching.

Figure 3.5 shows the reservation values of subjects at each point in their search

spells aggregated across all search spells. Thus the left most box and whisker describes

the initial reservation values in all of the full information search spells, whereas the

2nd from the left collects 2nd reservation values from the full information spells for all

subjects who searched twice and so on. The boxes represent the middle 50% of the

data, with the upper end of each box being the 75th percentile of the data and the

lower end the 25th percentile. The whiskers contain all data that is less than 1.5 times

the inter-quartile range away from the upper or lower quartile of the data.40 Dots

represent outliers. The optimum reservation value in the full information treatment is

475 and the optimum initial reservation value for a risk neutral searcher using Bayes’

rule in the learning treatment is 593. Consistent with under-searching in the previous

40. The inter-quartile range is the distance between the 75th and 25th percentile.

67 Figure 3.5: The reservation values by depth in search spell and treatment.

experimental search literature, most subjects stated reservation values well below

the optimum in both treatments, although there is more dispersion in the learning

treatment than in the full information treatment.41 It is possible that some of this

dispersion can be attributed to differences in values observed by different subjects in

the learning treatment causing different inferences about the mean, but this cannot

explain the difference in initial reservation values.

Figure 3.6 shows that subjects had disparate responses to the change in environment.

There is a much wider spread in the distribution of initial reservation values in the

learning treatment. Some subjects increased their reservation values, while others reduced their initial reservation values. Of particular interest is the cluster of initial

reservation values at or near the top of the possible outcomes in the learning treatment,

including three observations that were above the support of possible draws. This

41. The average distance from the optimum was approximately 85 in the full information treatment and 185 in the learning treatment. I discuss distance from the optimum more thoroughly in Section 3.8.

68 Figure 3.6: Initial reservation values by treatment.

is the most extreme version of the information demand heuristic described in the

introduction, with subjects setting reservation values that will almost certainly not be

drawn in order to get a better sense of the distribution. This behavior is completely

inconsistent with a one step strategy as the one step strategy is only concerned with the

expected marginal benefit—conditional on beliefs—of one additional draw. Drawing

a value with the intent of making an additional draw afterward should not occur

according to the theory.

Subjects also had differing reports about their own reaction to the learning environ-

ment. In a post session questionnaire asking subjects to describe their strategies in the

learning treatment, the two most common responses were variations on either “Chose

high until enough draws were done to see a distribution” or “The unknown distribution was harder so I chose to be even more conservative and opt for a lower minimum

because I didn’t want to draw multiple times” (both quotes are actual responses).

69 Thus there appear to be two behavioral factors at play. The first is a desire to learn

about the distribution before making a decision, which increases reservation values.

The second is aversion to the increased complexity in the uncertain environment which

leads to more conservative search behavior.

The demand for information from the first factor contrasts with the high reservation values in the one step strategy from the literature in that many of the subjects will

pick a high reservation value and stick with it until they have enough information

to begin adjusting, rather than adjusting with every draw. While this demand for

information is not strictly optimal, it is a well performing heuristic and much easier

to calculate than the true optimal reservation value. For context, processing the

data to calculate the optimal reservation values given subjects’ observations took

nearly an hour. Thus, this heuristic is likely boundedly optimal given that the

theoretically optimal strategy is too computationally intense to actually implement.

It separates the complex problem of simultaneously exploring the environment and

exploiting previously gathered information into an “explore” stage and an “exploit”

stage. This explore stage typically lasted for 2-3 searches, but the length of the exploit

stage was much more variable since it depended on the information gathered in the

initial searches. A subject who had unusually high initial draws from an unfavorable

distribution would be much more likely to continue searching for a long time than

one who had low draws from a favorable distribution. I discuss this heuristic more

thoroughly in Section 3.6.2.

70 3.6.1 The reservation value path

(a) The kernel density of the normalized (b) The normalized reservation values by reservation values in each treatment. The depth in search spell and treatment. Note initial searches have been dropped from that the data starts at the 2nd search this graph. since the first value is always 0 due to the normalization.

Figure 3.7: The kernel density and approximate paths of the normalized reservation values.

The main interest in this project is in the change in reservation value paths when subjects are learning as opposed to when they have full information about the

distribution. Just using the unmodified reservation values masks patterns in the

reservation value paths. For example, the apparent flat or upward trend of reservation values in the full information treatment in Figure 3.5 is an illusion caused by the fact

that subjects with low reservation values end their searches earlier. This creates a

survivorship bias that appears to give a positive correlation between depth in a search

spell and reservation values. To correct for this bias, much of my analysis in this paper

uses subjects’ normalized reservation values. That is, I subtract the initial reservation

71 value from the later reservation values in each search spell so that only the change in

reservation values remains. Formally, for the kth search by subject i in the j th search

spell, the normalized reservation value is given by

Normalized Reservation Valueijk = Reservation Valueijk − Reservation Valueij1

I drop the first search from graphs involving the normalized reservation values since

this value is always 0. As Figure 3.7b shows, the selection effect is much more apparent when looking at the normalized reservation value. There is a declining trend in both

treatments, but it is much stronger when subjects are learning about the distribution.

In fact, the reservation values are so strongly concentrated around 0 in the full

information treatment (many search spells did have constant reservation values) that

the means and the upper end of the inter-quartile ranges very nearly coincide—making

them difficult to distinguish in the graph. Similarly, Figure 3.7a shows the kernel

density of the normalized reservation values in each treatment after the initial search.

The peak of the density for the full information treatment is firmly over 0, with

little deviation, again reflecting the fact that many subjects picked a reservation value and did not vary far from it. In the learning treatment, not only is there a

long tail reflecting declining reservation values, but the peak of the kernel density is

noticeably leftward of 0, reflecting the much stronger decline in reservation values

along search spells in the learning treatment. 90% (76 out of 81) subjects lowered their

reservation values more often than they raised them in both treatments, with most

of the remainder having flat reservation value paths in at least one treatment. The

difference is more striking when we look at behavior spell by spell, with approximately

half of the spells in the full information treatment having a decreasing trend in the

reservation values and most of the remaining being flat. However, in the learning

72 treatment approximately 70% of the reservation value paths showed a decreasing trend.

Figure 3.8: The normalized reservation values in the learning treatment by depth in search spell, broken down by the true mean of the distribution.

A low mean is less than 280, a high mean is greater than 415.

The graphical evidence strongly indicates that this decline is primarily due to the selection effect. We can see in Figure 3.8 that subjects with high draws for the mean tend to have much shorter search spells, and even then the tendency is for the reservation value to decline since continuing with a high mean is correlated with a high initial reservation value. When the distribution is less favorable search spells tend to be longer and reservation values exhibit a much stronger decline. This pattern demonstrates that welfare implications for searching from an unknown distribution are doubly negative. Not only is the optimal strategy more difficult to approximate, but a poor distribution leads to increased expenditure of search costs in addition to a

73 low expected payoff. This pattern is strikingly similar to Blake, Nosko, and Tadelis’s

(2016) finding that successful search spells on eBay tend to be significantly shorter

than unsuccessful ones.

Dependent Variable: Normalized Reservation Value Covariates Coefficients (Std. Error) Learning 9.831 ( 21.0405 ) # of searches 1.1402 ( 1.8340 ) Learning*# of searches -3.8343 ( 3.3525 ) Time -0.5268** ( 0.2230 ) Learning*Time -0.9405** ( 0.3956 ) Period 3.3311* (1.7622) Number of Subjects 81 Number of Observations 2,428 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.2: Fixed effects OLS predicting the reservation value path.

Fixed effects grouped on subject and period.

The results of a fixed effects ordinary least squares regression of the normalized

reservation values on predictive variables are shown in Table 3.2.4243 “# of searches”

is the number of times a subject has searched within a given search spell. The number

of searches is not predictive of changes in the reservation value, but both time spent

42. I do not include subject level controls here because the fixed effects regression makes them redundant. 43. All regressions in this article use robust standard errors (clustered on subject where appropriate). Insignificant covariates are dropped from regression tables to aid readability. A copy of de-identified data and a commented version of the STATA code used to run all regressions is available from the author upon request.

74 searching and the interaction of time with the learning treatment are significant and

negative. Time is measured in seconds since the beginning of the search spell.44 This

result is consistent with Brown, Flinn, and Schotter’s (2011) result that time spent

searching has a greater impact on reservation values than the number of searches.

“Learning” is a dummy for the learning treatment, and it is not significant on its own,

but the interaction between time spent searching and learning is significant, consistent with the selection effect on reservation values. However, this interaction could also be

caused by greater fatigue in the more complex learning environment. Period refers the

number of search spells completed, so data associated with a subject’s third search

spell would have a period value of 3. This coefficient is marginally significant and

positive, which in the context of this regression is evidence suggesting that subjects

became more persistent as they gained more experience with the environment.

To separate the fatigue explanation from the selection effect and to explore further

the factors driving the changes in reservation values, I create an “update” variable, which takes on a value of 1 if a subject increases their reservation value above the

previous one, 0 if the subject does not change their reservation value, and -1 if the

subject lowers their reservation value.45 Table 3.3 shows the results of a random effects

ordered probit on this update variable. “Previous Observation” is the last value the

subject drew before setting a new reservation value, if subjects are learning using an

approximation of Bayes’ rule, then this value should be the only information they are

using to update their beliefs since all previous observations are already incorporated

into the priors. The previous observation only has a significant effect in the learning

44. A regression using time since the beginning of the experiment rather than since the beginning of the search spell yielded broadly similar results. 45. Although the theory predicts strictly declining reservation values, I avoid making this restriction since subjects did occasionally raise their reservation value, albeit rarely.

75 Dependent Variable: Change in Reservation Value Covariates Coefficients (Std. Error) (1) (2) Previous Observation -0.0004 -0.000 ( 0.0004 ) ( 0.0005 ) Learning* Previous Observation 0.0019*** 0.0018*** ( 0.0003 ) (0.0005) SMAP Numeracy Score 0.0090*** 0.0047 ( 0.0044 ) (0.0083) Time -0.0082* -0.0109** ( 0.0046 ) (0.0050) Learning*Time -0.0085** -0.0081* ( 0.0040 ) (0.0045) Academic Controls No Yes Upper Cutoff 0.9268 0.5974 Lower Cutoff -0.7165 -1.2226 Number of Subjects 69 53 Number of Observations 1,670 1,220 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.3: Random effects ordered probit regression predicting the direction of change in reservation values.

None of the academic controls (ACT, GPA etc.) in the 2nd regression were significant, therefore I omit their coefficients to save space.

treatment, and the coefficient is positive, supporting the idea that the decline in

reservation values in the learning treatment is driven significantly by changes in beliefs

as a result of the observed values. In the context of an ordered probit a positive

coefficient implies a positive correlation with the event of raising the reservation value and a negative correlation with lowering the reservation value.46 Therefore the

significant positive coefficient on the previous observation means that subjects who

observe a high (low) value are more (less) likely to raise their reservation value in

the following search and less (more) likely to lower their reservation value, while the

46. The implications for the middle event (maintaining the same reservation value) are ambiguous. I thank an anonymous referee for pointing this out.

76 impact on the probability of maintaining the same reservation value is ambiguous.

While this does support the selection effect, time spent searching has a significant

effect both on its own and when interacted with the learning treatment. Subjects are

more likely to lower and less likely to raise their reservation value as they spend more

time searching. Therefore, it is likely that fatigue is playing a role as well.

The SMAP numeracy score is the mean absolute error of the subjects’ performance

on the number line task described above, so a higher SMAP score indicates a less

numerate subject.47 Numeracy is significantly predictive of behavior when there are no

other proxies for ability, but insignificant when GPA and ACT scores are introduced to

the regression. This is likely an increase in standard error caused by loss in efficiency

of the regression because the academic data and the SMAP score are both serving

as proxies for numeric ability.48 Less numerate subjects are less likely to adjust their

reservation values downward and more likely to raise their reservation values with any

given search, which is partly due to the fact that they tended to set lower reservation values in the first place.

Result 1. The difference in reservation value paths between treatments is supportive of the selection effect. Reservation values decline within search spells in both the full information and learning treatments, but the rate of decline is much steeper in the learning treatment. The informational content of the draws in the learning treatment is one of the main drivers of this additional decline, which is consistent with the

prediction of increasing pessimism within search spells.

47. The number line task involved completing the task 10 times, so SMAP = 10 1 P 10 |ith input value − ith actual value| i=1 48. The academic controls were not significant when regressed without numeracy. Also note that academic records were not available for all subjects, so the second regression also has fewer subjects.

77 3.6.2 The information demand heuristic

The second main prediction—that searchers will use a one step reservation value

strategy with well behaved priors—does not fare as well as the selection effect. I refer

to the strategy of setting a high initial reservation value and adjusting only after a few

observations in the learning treatment as the “information demand heuristic”. I define

any search spell with an initial reservation value over 400 which is maintained at least

through the second search as using this strategy. I define any subject who used this

strategy in at least two of their payoff relevant learning treatment search spells as a

high information demand subject or “information demander”; 25 out of the 81 subjects

fit this description. While these subjects are increasing their reservation values in

response to uncertainty about the environment as a Bayesian searcher would, this

heuristic is not a one step strategy. Therefore the reason why the initial reservation values are higher differs drastically from the factors driving the high reservation value

in the theory. The initial reservation value is high in the theory because a high initial

draw means that the one step Bayesian searcher is more optimistic about continued

search, with subsequent draws leading to declining values. With this heuristic, the

initial value is high because the searcher wants to search more than once before

deciding whether or not to stop. More restrictive definitions for the heuristic or type

(e.g. maintaining the same reservation value through the 3rd search or using the

strategy for three of the learning search spells) yield similar qualitative results to those

described here.

Table 3.4 shows the results of a probit predicting whether a subject is classified as

a high information demand subject. “SNS Score” represents how subjects rated their

comfort and ability with numeric information in the subjective numeracy survey, with

78 Dependent Variable: Information Demand Type Covariates Coefficients (Std. Error) (1) (2) SNS Score -0.4679** -0.5453** ( 0.2361 ) ( 0.2652 ) ACT Score - 0.1936*** - (0.0728) Other Academic Controls No Yes Number of Subjects 81 55 Number of Observations 81 55 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.4: Probit predicting whether a subject would be classified as a high information demand type.

a higher score indicating more comfort with numbers and a higher self-assessment of

numeric ability. Interestingly, numeric competency is not correlated with a subject

being an information demander, and subjective numeracy is negatively correlated with

use of the heuristic, which is the opposite of what Falco’s (2019) hypothesis would

predict. One potential explanation for this discrepancy is that subjects with greater

subjective numeracy may have a greater tendency to use numeric information, and

so they feel a desire to adjust their reservation values based on their observations.

Though this hypothesis is merely speculation without further research to test it. While

numeric competency does not predict information demand, academic ability metrics

are positively correlated with its use, which is not terribly surprising given that it

seems to be a well performing strategy (see below).

Use of the heuristic within a given search spell does not predict a higher search

payoff, but frequent use of it across search spells does. Table 3.5 shows the results

of a regression of payoffs from the search task against numeracy and the categories

79 Dependent Variable: Search Task Payoffs Covariates Coefficients (Std. Error) (1) (2) (3) Information Demand Type 36.2192** 34.9088* 29.6689* (17.3214) ( 18.1015 ) (17.4039) Information Demand Heuristic 17.5541 17.1551 12.0834 (23.4486) ( 25.8476 ) (21.0770) Learning 23.6106** 116.8047* 49.6895 (11.5433) ( 68.8906 ) (71.4848) SNS Score - 20.1837** 11.4456 - ( 10.1061 ) (11.9451) SMAP Numeracy Score - -0.9528 0.3558 - ( 0.0704 ) (.3271) Learning* SMAP Numeracy Score - -0.4788** 0.7491 - ( 0.1503 ) (0.6669) ACT Score - - 7.8939*** - - (2.2524) Other Academic Controls No No Yes Number of Subjects 81 81 61 Number of Observations 477 477 367 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.5: OLS regression predicting payoffs for the search task in the learning treatment.

described above.49 The learning treatment is positively correlated with payoffs in the

first regression, but this correlation disappears with the introduction of ability proxies,

suggesting that it is mainly driven by high ability subjects exploiting search spells with favorable distributions. Subjects with more numeric ability are better able to use

information in the learning treatment, and also tend to set higher reservation values

overall.50 The heuristic is not significantly correlated with payoffs from the search

task, but the information demander type is significant and positive. This difference

49. I use ordinary OLS here rather than a panel estimator as a Hausman test rejects use of random effects, and a fixed effects estimator would cause the control variables (which are the variables of interest) to drop out. I therefore believe this specification to be more informative. 50. There are some endogeneity concerns with the regression in Table 3.5 as subjective numeric ability is determined at least partially by actual ability, so SMAP numeracy will partially determine the SNS Score. Three stage least squares analysis does suggest that this is valid, but the qualitative results of the 3SLS regressions are essentially identical to those presented here except that significant

80 likely comes from sub-optimality of rigidly sticking to the heuristic. The information

demander types set higher reservation values even when they do not stick to the

heuristic (an average of 446 as opposed to 390 for the non-demanders) and they were

less likely to stick to the heuristic if their first draw was quite low. Being classified as

an information demander type is only borderline significant when proxies for ability

are introduced. Ability proxies, especially academic data, are much more predictive

of payoffs than frequent use of the heuristic. The heuristic is a reasonable strategy

used by high ability subjects to deal with the uncertainty in the learning treatment,

but these subjects do not stick to this strategy rigidly if the information in their first

draw is sufficiently compelling.

Result 2. Many subjects do not use a one step reservation value strategy in the environment with learning. They instead use a heuristic where they separate the search

process into an “explore” stage and an “exploit” stage. While the use of the strategy itself is not significantly predictive of payoffs, those subjects who regularly use the heuristic (i.e. the information demanders) perform significantly better on the search task than subjects who did not.

This heuristic is quite similar to descriptions of real world search behavior. Blake,

Nosko, and Tadelis (2016) find that search patterns on eBay follow a pattern quite

similar to the heuristic described in this paper. Blake, Nosko, and Tadelis describe

a “search funnel”, which is a concept from the marketing literature that involves

initial exploratory searches to get a sense of the products and prices available in the

market, followed by more directed precise search strings once this information has variables become more significant. I use the simple OLS here because it conveys the same information in an easier to read format.

81 been gathered. Previous research has assumed that the search funnel is a result of

consumers’ imperfect knowledge of their own taste. According to this framework

consumers leave the information gathering stage once they have decided what it is they want to buy. The similarity of the search pattern in my results absent any uncertainty

about taste suggests that the search funnel may also be a way for consumers to deal with the complexity of learning prices in an unfamiliar market.

3.7 Feedback and under-search

In theory, risk aversion might be an explanation for the ubiquity of under-search,

the reservation value is determined by a comparison of a certain cost to a stochastic

benefit, so a more risk averse searcher should search less. In practice the degree of

risk aversion as measured by the Holt and Laury (2002) risk preference elicitation was

not significant for any regression, including regressions predicting the length of search

spell by subject, so it seems unlikely that this is the primary explanation.

One potential alternative explanation is the lack of negative feedback for suboptimal

strategies. A subject setting a constant reservation value of 350 in the full information

treatment will still stop with a positive payoff, and therefore may not realize that their

reservation value is too low. An exception to this rule is a situation where a subject

draws a value near, but not above their reservation value. This event will make the

possibility of settling for a low reservation value more salient, and can cause subjects

to revise their reservation values upward. Table 3.6 describes a random effects probit

analyzing the event that a subject’s reservation value increases. Only 11% of the 1724

observed changes in reservation values were increases, but nevertheless the mechanism

hypothesized above appears to work. I create the “Distance” variable by subtracting

82 Dependent Variable: Event that reservation value increases Covariates Coefficients (Std. Error) (1) (2) (3) Distance -0.0036*** -0.0036*** -0.0030*** (0.0005) ( 0.0005 ) (0.0007) Learning* Distance -0.0043*** -0.0045*** -0.0034*** (0.0006) ( 0.0006 ) (0.0007) SMAP Numeracy Score - 0.0135** 0.0159 - ( 0.0063 ) (0.0125) Learning* SMAP Numeracy Score - -0.0144* 0.0016 - ( 0.0076 ) (0.0113) ACT Score - - -0.0969* - - (0.0577) Numeracy Measures No Yes Yes Other Academic Controls No No Yes Number of Subjects 69 69 53 Number of Observations 1,724 1,724 1,257 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.6: Random effects probit predicting the probability of reservation value increasing.

“Distance”=last reservation value-last observed value.

the last observed value from the last reservation value for every search. It is significant

and negative, indicating that the farther the last observation is below its corresponding

reservation value the less likely a subject is to raise their subsequent reservation value.

This effect is stronger in the learning treatment because of the informational content

in the observations. Receiving a draw close to the reservation value in the learning

treatment not only increases the salience of settling for a low draw, but also indicates

that the subject may be facing a favorable distribution. Subjects with higher ACT

scores are somewhat less likely to raise their reservation values, but this is in large part

a mechanical effect because higher ability subjects generally set higher reservation values in the first place.

83 Result 3. While other factors may explain the observed behavior, lack of negative

feedback for sub-optimal strategies appears to be a plausible explanation for the preva- lence of under-search and lack of inter-spell convergence toward the optimum value.

When the possibility of settling for a low draw is made salient subjects are likely to raise their reservation values in response.

If we look at feedback between search spells, the picture is slightly different. If a

subject consistently stops with values far above their reservation value, then this is

feedback that their reservation values are too low. However, subjects may also view

this as positive feedback because they are receiving a high payoff relative to their

reservation value, so this may act as reinforcement (in the psychological sense) and

encourage them to maintain or even lower their reservation values. As Charness and

Levin (2005) establish, people have difficulty updating when Bayesian rationality and

reinforcement heuristics are in opposition. They show that people tend to stick with

strategies that yield positive payoffs even when the information conveyed by that

payoff suggests that they should change their action.

This experiment is a particularly interesting environment in which to explore this

possibility because we would expect any response in the full information treatment

to be significantly dampened in the learning treatment. A subject who receives a

draw significantly above their reservation value in the full information treatment could

interpret it as positive reinforcement, but would likely just attribute it to a favorable

distribution in the learning treatment. If instead they were interpreting the feedback

according to standard rationality, then for the most part they should generally still

raise their initial reservation value in the following search spell in both treatments

given that subjects are almost universally under-searching.

84 The data support this reinforcement hypothesis, but in a very specific way. As

Figure 3.9 shows, the only significant relationship between this difference and the initial reservation value is that subjects tend to lower their initial reservation value if their observations are too close to their observed reservation values or if they find themselves exercising recall and settling for a value below their reservation value. This is particularly true of high ability subjects.51

(a) Low Ability Subjects (b) High Ability Subjects

Figure 3.9: The difference between the final observation and the corresponding reservation value by whether they lower, maintain, or raise the initial reservation value in the next search spell.

Similar to the update variable, I create an “initial update” variable which takes a value of 1 if the initial reservation value is higher than the one in the previous period, 0 if it is the same, and -1 if it decreases. The “Difference” variable is the final observation minus the final reservation value in the previous search spell. As Table 3.7 shows, this difference between the final reservation value and the final observation is

51. High ability is defined as having an effective ACT Composite over 28, or an average error on the numeracy task less than 25 if this data was not available. These values are the integers closest to the median.

85 consistently the most important factor determining whether the initial reservation value changes relative to the previous search spell. Ability controls and the number of

searches in the previous search spell (which controls for potential frustration with an

unluckily long search spell) also play a role. Naive interpretation of the coefficient for

the difference variable would suggest that subjects who saw a value high above their

reservation are more likely to raise their initial reservation value in the following search

spell while those who stop with a low final observation relative to their reservation value are likely to lower their initial reservation value in the following spell. However

Figure 3.9 suggests that this correlation is driven entirely by the latter effect, and

even then only in the full information treatment. While I do not include additional

regressions here, if I instead run a regression against a dummy that takes a value of 1 whenever the reservation value increases then distance is never significant. Restricting

the regression to the learning treatment similarly suppresses any significance. This

behavior is similar to Charness and Levin (2005), but instead of sticking with a

strategy after positive feedback, subjects are adjusting their strategy after negative

feedback.

Result 4. Subjects appear to be using a reinforcement heuristic similar to that proposed by Charness and Levin (2005) when choosing their initial reservation values. Subjects rarely become more persistent, but often adjust their reservation values to under-search even more in response to a search spell with an unfavorable result. This adjustment does not occur in the learning treatment, suggesting that subjects are aware that information does not transfer across search spells as readily when the distribution is uncertain.

Based on the responses to the post-session questionnaires, it appears that subjects

often equated low reservation values with caution. It is therefore likely that the main

86 Dependent Variable: Change in Initial Reservation Value Covariates Coefficients (Std. Error) (1) (2) (3) Difference 0.0013*** 0.0025*** 0.0026*** (0.0005) (0.0007) (0.0009) Learning* Difference 0.0000 -0.0008 -0.0008 (0.0004) (0.0008) (0.0008) High Ability*Difference - 0.0010* 0.0015* - (0.0005) (0.0008) Learning*High Ability*Difference - 0.0003 -0.0001 - (0.0004) (0.0005) Previous Initial Reservation Value -0.0014 -0.0015 -0.0017** (0.0009) (0.0009) (0.0007) # Searches in previous spell -0.0242 0.0246 -0.0335* (0.0148) (0.0154) (0.0173) SMAP Numeracy Score - -0.0036*** -0.0057 - (0.0009) (0.0038) SNS Score - 0.1145 0.2542*** - (0.0714) (0.0760) ACT Score - - 0.0149 - - (0.0160) Learning*ACT Score - - 0.0448** - - (0.0192) Academic Controls No No Yes Upper Cutoff -0.1301 0.3126 1.4040 Lower Cutoff -1.3396 -0.901 0.1258 Number of Subjects 81 81 61 Number of Observations 805 805 610 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.7: Random effects ordered probit regression predicting the direction of change in initial reservation values.

mechanism behind this result is a desire to increase caution in response to negative

feedback. Given how important feedback and information design has been shown to

be for optimal auction behavior (e.g. Kagel, Harstad, and Levin (1987)), the design

of feedback in search environments seems like it may be fertile ground for further

research.52

52. See Jhunjhunwala (2018) for a more detailed exploration of the role feedback plays in search behavior.

87 3.8 Distance from the theoretical optimum

Figure 3.10: The percentage error in reservation values compared to the theoretical optimum for a Bayesian learner by depth in search spell and treatment.

In this section I compare subject’s behavior to that of a risk neutral rational

Bayesian agent using a one step stopping rule. In the full information treatment this simply involves solving Equation (3.2) using the given distribution. The process for the learning treatment is roughly equivalent, except that F (·) is a function of the priors and posterior beliefs as a result of observing x. The rational agent sets a reservation value by anticipating what the effects of observing x would be on their beliefs, then calculating the marginal benefit of an additional search given those posteriors. Anticipating the effect of an observation on future beliefs is not an easily intuited concept, so it is unlikely that most subjects are engaged in this level of forward thinking. This complexity may explain why subjects’ reservation values are farther

88 from the optimum in the learning treatment than in the full information treatment

(as shown in Figure 3.10).

The optimal reservation value in the full information treatment is 475, and the

optimal initial reservation value in the learning treatment is 593. Later optimal

reservation values in the learning treatment depend on the observations and therefore

cannot be summarized simply. I calculate the percentage error using the formula

Reservation Value − Optimal Value Percentage Error = (3.4) Optimal Value

so a negative value indicates that subjects are setting a reservation value lower than

the theoretical optimum. It is apparent from Figure 3.10 that most subjects are setting

reservation values below the optimum, and they are farther below the optimum in the

learning treatment. The amount of error decreases as subjects continue to search, and

this rate of decrease is faster with the learning treatment, such that the percentage

errors are roughly equivalent between treatments by the time subjects reach their 3rd

search. Note that some of the convergence to the optimal reservation value occurs

because subjects who input low reservation values drop out early. However, even with this fact we do see evidence for convergence toward the optimal value later in

search spells for the learning treatment. This convergence has a number of potential

causes: First, as a search spell continues, the optimal reservation value decreases.

Subjects are setting reservation values too low for the most part, so there is less room

to deviate from the optimum. Second, the optimal reservation values decrease more

quickly than actual reservation values, which could be caused by subjects using a

non-Bayesian updating process, or just updating more slowly. Finally, subjects are

closer to the optimum when they know the distribution, and as they continue to

search within a search spell in the learning treatment they will eventually have a

89 reasonably good estimate of the true distribution.53 For later draws within a spell

they are approximating the stationary search problem, which is much easier to solve.

(a) Learning last (b) Learning first

Figure 3.11: Mean absolute error for the first three reservation values of each search spell.

Result 5. The absolute value of error decreases as subjects continue to draw values within a given search spell, and this within-spell convergence between actual and optimum behavior is slightly stronger in the learning treatment. This suggests that subjects do use information about the distribution to inform their strategies, and that better information leads to a strategy closer to the theoretical optimum.

While there is strong intra-spell convergence, the convergence between spells is

not nearly as strong, but we can still see some interesting patterns. Figure 3.11 shows

53. The average absolute value of error for the initial search is 85 in the full information treatment and 185 in the learning treatment. A Mann-Whitney test for difference of behavior in the two treatments is significant at the 0.001% level. The single most important determinant (as determined by OLS regression) of how close subjects’ elicited beliefs about the mean of the distribution were to the true mean was the number of searches.

90 progression of the mean absolute error in the first 3 searches of each spell.54 Formally,

I define the variable M3Error by

N 3 1 X X M3Error = |Reservation Value − Optimal Value | k N ∗ 3 nki nki k n=1 i=1

Where k is the kth search spell, Nk the number of subjects who searched at least 3

times within the kth search spell, and i the depth in the search spell. I use the mean

of the absolute value because it provides a clearer interpretation of the graph (whether

or not there is convergence to the optimum across search spells) than simply using

the mean error would. Just using the initial reservation value for each search spell yields similar results.

We can see two trends in the error:

1. There is a downward trend in the error in the first few search spells of each

session, regardless of the order.

• While this may simply indicate that the experimental design should have al-

lowed for more practice spells at the beginning, it is noteworthy that we see

some evidence for gradual mastery of the environment in the full-information

treatment. Subjects are told the distribution in the full information treat-

ment, but they need practice with the problem in order to understand how

to use this information.

2. Once this initial environmental learning curve is accounted for, there does not

appear to be a strong trend in the error across periods.

54. I use mean absolute error rather than mean absolute percentage error here to give the reader a sense of scale for the deviations from the theoretical optimum. The graphs with percentage error are nearly identical.

91 Regressions looking at the driving factors behind error largely replicate Result 5.

Additional results, such as a negative correlation between ability measures and error,

are largely unsurprising, so I do not include these regressions here.

3.9 High Cost Sessions

I ran additional sessions with a search cost of 25 rather than 5 to test the robustness

of these findings. The results are largely, though not entirely similar. The 72 subjects were again recruited from the Ohio State ORSEE system with 42 seeing the learning

treatment last and 30 seeing it first.55 The average payout was $12.13 with a minimum

of $6.45 and a maximum of $18.75. Aside from the difference in search cost, all tasks

in the high cost sessions were identical to the main part of the experiment.

3.9.1 Results

Full Information Learning Average Search 2.200 2.079 Length (0.098) (0.079) Average Search 415.461 388.737 Profits (3.785) (6.511)

Table 3.8: Summary statistics for each treatment in the high cost sessions.

Search profits are stated in experimental currency units. Standard errors are shown in parentheses.

55. The imbalance in learning-last vs. learning-first here came from declining show-up rates by the time the learning-first sessions were run.

92 It is immediately obvious from Table 3.8 that there is much less differentiation

in search behavior between treatments in the high cost search environment. The

average search length is actually shorter in the learning treatment as opposed to

longer, though this average is not significantly different from the average search spell

length in the high cost full information treatment.

(a) The kernel density of the normalized (b) The normalized reservation values by reservation values in each treatment for depth in search spell and treatment for the high cost sessions. The initial searches the high cost sessions. Note that the data have been dropped from this graph. starts at the 2nd search since the first search is always 0 due to the normaliza- tion.

Figure 3.12: The kernel density and approximate paths of the normalized reservation values in the high cost sessions.

Despite the similarity in search lengths between treatments, we can still see some

evidence for the selection effect. The kernel density of the normalized reservation values

in Figure 3.12a still shows the same peak over 0 for the full information treatment and

left skew for the learning treatment which is indicative of learning causing reservation values to decline, though the difference is much less striking than in the low cost data.

In Figure 3.12b we can see a weak decline as subjects get further into their search

93 spells in the learning treatment. A repetition of the regression in Table 3.2 yields

no significant coefficients, but based on the graphical evidence above I believe that

this does not demonstrate absence of a decline but instead comes from the increased

search cost diminishing the effect size.

This belief is reinforced by examining the results of an ordered probit similar to

Table 3.3 looking at the factors influencing the direction in which subjects update their

reservation values in the high cost sessions. We can see in Table 3.9 that the previous

observation is still one of the most influential factors in the learning treatment. The

informational content of the observations in the learning treatment is still playing a

significant role. There is a negative correlation between the previous observation and

the direction of update in the full information treatment, which would suggest that

subjects are lowering (raising) their reservation values after observing high (low) values.

However the correlation is only borderline significant when I restrict the regression to

search spells where subjects did not exercise recall, which suggests that the correlation

is at least partly driven by frequent use of recall when the observed value is close

to the stated reservation value. In other words, subjects appear to be engaging in

satisficing behavior in the face of higher search costs (See Caplin, Dean, and Martin

(2011) for a discussion of search and satisficing behavior).

While many subjects described strategies similar to the information demand

heuristic, only 4 out of the 72 high cost subjects satisfied the definition of information

demander compared to 25 out of 81 in the low cost sessions. While this is too few

subjects for any statistical analysis to be meaningful, the difference in use of the

heuristic is itself an interesting result. The information gathering stage of the heuristic

involves exchanging search costs for additional information, and it appears that the

94 Dependent Variable: Change in Reservation Value Covariates Coefficients (Std. Error) (1) (2) Previous Observation -0.0015** -0.0016** ( 0.0006 ) ( 0.0006 ) Learning* Previous Observation 0.0029*** 0.0033*** ( 0.0007 ) (0.0008) SMAP Numeracy Score 0.0070 0.0055 ( 0.0053 ) (0.0063) SNS Score 0.3274*** 0.4249*** ( 0.1105 ) (0.1271) Learning*SNS Score -0.5170*** -0.5710*** ( 0.1882 ) (0.2107) # of Searches 0.0432 0.0332 ( 0.0314 ) (0.0362) Learning * # of Searches 0.1210** 0.1376*** ( 0.0473 ) (0.0519) Time -0.0007 0.0000 ( 0.0027 ) (0.0026) Learning*Time -0.0033 -0.0058 ( 0.0045 ) (0.0055) GPA - -0.2496* - (0.1355) Other Academic Controls No Yes Upper Cutoff 2.2706 2.5359 Lower Cutoff 1.1296 1.5507 Number of Subjects 63 47 Number of Observations 841 675 Note: *,**,*** indicates significance at the 10, 5, and 1 percent levels respectively. Standard errors are shown in parentheses.

Table 3.9: Random effects ordered probit regression predicting the direction of change in reservation values in the high cost sessions.

additional expense in the high cost session made this information gathering stage too

expensive for most subjects.

Result 6. The main results from the low-cost treatments are somewhat robust to a higher cost environment. However, subjects are more prone to satisficing behavior when the search cost is larger. Additionally, the high cost of gathering information reduced use of the information demand heuristic.

95 3.10 Conclusion

Search with learning about the distribution is an topic which has heretofore been under-explored in the experimental literature. There are many different ways to incorporate learning into a search model, but for this experiment I restrict the setting to learning through observations from the distribution where the priors have overlapping support. This environment is quite common in the theoretical literature, and has some of the most consistent predictions. Namely, that searchers will use a one step stopping rule a la Weitzman (1979), and that the reservation value for this stopping rule will monotonically decline due to the fact that any observation high enough to cause an increase in the reservation value will also be above that reservation value. I show that this selection effect does indeed hold for experimental subjects in a simulated search environment, but I also have evidence that they may not be using a one step stopping rule in the learning treatment.56 Instead, many subjects seem to have a desire to learn about the distribution before stopping. The complexity and unintuitive nature of the optimal strategy may make this heuristic optimal in real world situations with limited computational power. Additionally, this heuristic may be closer to being unboundedly optimal in situations where it is possible that a searcher may encounter the same distribution again because additional information about the distribution would carry over to the next search spell. Subjects consistently under-search, and the persistence of this behavior appears to be at least in part due to the unintuitive nature of the feedback which the search game provides.

56. Given that the elicitation method assumes a one step strategy I had intended to run an additional treatment suggested by a referee to see if behavior is broadly similar in a setting without it. Unfortunately, because of the 2020 Covid-19 outbreak, the Ohio State University laboratory suspended experiments during the spring of 2020 and so I was not able to run these additional sessions before leaving that institution.

96 Given Jhunjhunwala’s (2018) result that under-searching can be counteracted by the mere prospect of additional feedback, this area seems like a fruitful avenue for future research.

97 Chapter 4: Media Provision With Outsourced Content Production

4.1 Introduction

The ability to stream media via the internet has facilitated the creation and

dissemination of many new types of content. The new forms of content range from

extremely popular new genres, such as video makeup tutorials, to extremely niche

concepts like speedruns57. Part of the reason for this rise in variety is that with

streaming, each consumer on such a platform can potentially watch content which

differs from that consumed by every other consumer, so a platform does not have to

sacrifice mass market consumers when adding niche content. The other reason for this

rise is a democratization of content production. Platforms such as YouTube or Twitch

host content created by third parties in exchange for a share of the revenue generated

by that content. This lowers the barrier to entry for creators with innovative ideas to

put their content in front of an audience.

If content ideas arrive randomly within a large population, then it makes little

sense for a platform like YouTube to hire creators within the firm. Creating a medium with very low barriers for independent creators to post content allows the platform

57. Speedruns are videos, usually live-streamed, of a player attempting to complete a computer game as quickly as possible.

98 to avoid the costs of finding and producing the content itself. The downside is that,

in order to add this content, the platform must attract the content creator who

makes it. This creates an effect that has heretofore been largely underexplored in the

theoretical literature: attracting new content creators allows the platform to provide

new varieties of content, and the new content expands the set of consumers interested

in the platform which, in turn, will attract more revenue.

I present a model where a monopoly platform brings together content creators,

advertisers, and consumers. The influence of each group on the others is shown in

Figure 4.1. Consumers can view content in exchange for either a subscription fee or watching advertising. The creators provide the content on which advertisers advertise

in exchange for a share of the subscription or ad revenue from each consumer attracted

by their content. Advertisers purchase ad views in order to generate trade with

consumers, but the presence of the ads imposes a nuisance cost on the consumers. The

platform acts as a clearinghouse which matches consumers to content creators and

advertisers to content. It decides what advertising level to set, how much to charge for

ads and the subscription, and how much revenue to share with the content creators.

Because the motivating example for this project is online video, I refer to the general

action of consuming content as ”watching”.

99 Consumers

views nuisance subscription fees cost product content purchases

ad views Advertisers Creators

ad revenue

Figure 4.1: The sides and network externalities in the model.

The objective of this article is to examine the welfare effects of a subscription which allows consumers to avoid advertising in the context of a three sided market.

I begin with a highly simplified benchmark model where consumers’ gross utility of watching is homogeneous and advertisers’ value per ad is homogeneous in order to

focus on the implications of adding content creators to a multi-sided media market. I

compare an equilibrium where the platform can offer an ad avoidance subscription to

one where it cannot.

When the platform increases the advertising level in the model without the

subscription, the consumers who watch will see more ads (and so generate more

revenue for the platform and content creators) but some consumers will choose to

leave the platform due to the increase in nuisance. The platform will choose the

advertising level which maximizes the total number of ad views. If the platform can

offer the subscription, then the consumers who are most sensitive to advertising will

100 choose to subscribe, and the platform’s tradeoff when increasing the advertising level is

reduced or eliminated. Because the subscription gives the consumer the ability to avoid

advertising, more consumers will watch in the equilibrium when it is available. The

content creators receive a payment from the platform based on how many consumers watch their content. This implies that in the advertising-only equilibrium only creators with a relatively large audience size will receive enough revenue to cover the costs of

producing content. Because the subscription increases both viewership and revenue

per consumer, the platform will increase these payments in order to attract relatively

niche content creators who will in turn attract consumers with niche tastes.

The subscription transforms consumers’ nuisance cost from advertising (which is

purely wasteful) into a transfer to the platform. Additionally, the new content creators

are all at least weakly better off with the subscription as a result of producing, so

in most circumstances the subscription will increase total welfare. The profit of the

platform and content creators will always increase, but the effect on consumer welfare

is more nuanced. Consumers attracted to the platform by the new content creators who produce only when the subscription is present are better off with the subscription,

but low nuisance cost consumers who enjoy content which would be created regardless

of the presence of the subscription are either exposed to more advertising than in

the advertising-only equilibrium or the cost of the subscription fee is higher than the

nuisance cost they would receive in the advertising-only equilibrium. On the other

hand, the high nuisance cost consumers who enjoy content which would be produced

in either equilibrium are better off with the subscription because the price of the

subscription is below the nuisance they would have borne in the advertising-only

equilibrium.

101 The potential for market expansion after the addition of new content creates a

tradeoff, albeit an imperfect one, between the welfare of consumers who enjoy mass

market media and those who prefer more niche media. If the platform’s ability to

capture consumers’ value increases (whether through advertising or the subscription),

then consumers who watch content on the platform are by definition worse off, but

the value of additional content to the platform increases, meaning that it increases

creator payments. This attracts creators with smaller audiences and, as discussed with the subscription above, the new consumers attracted by this content are better

off for having new media to watch. The imperfection of this tradeoff comes from the

fact that the platform is capturing additional value from the niche consumers as well

as the mass market consumers.

Finally, market expansion from new content creators is also significant when

considering the effect of the subscription on advertisers’ profits58. Advertising in this

model is informative, with each ad view informing the consumer about the existence

of an advertiser’s product and generating the possibility of trade. Intuitively, one would expect that the subscription would make advertisers worse off since they are

able to reach fewer consumers and thus generate fewer trades. But the platform will want to increase the advertising level to drive more subscriptions, so it will reduce the

price of advertising. Advertiser profits may increase even though they are trading with

fewer consumers. Additionally, the ad views lost to the subscription are compensated

for by the presence of ad-viewing consumers on the newly added content markets, so

58. This point mostly applies to Section 4.4.2 since the platform is able to extract all advertiser value in the baseline model, but the important elements are present in the baseline comparison as well.

102 the number of ad views and trades with consumers may increase as a result of the

subscription.

Previous models featuring ad-avoidance subscriptions (e.g. T˚ag(2009) or Prasad,

Mahajan, and Bronnenberg (2003)) have highlighted the fact that the subscription

gives the platform the ability to extract value directly from consumers rather than

relying on the value the advertisers place on advertising, reducing net consumer surplus

(T˚agin particular discusses this effect in detail). While this factor is present in my

model, it is counterbalanced by increased welfare of the consumers attracted by the

new content creators who come to the platform as a result of the subscription. The

net effect of these two factors on consumer welfare is ambiguous, but the addition of

the content creators makes the subscription much more favorable to consumers in my

model than similar subscriptions in these earlier papers.

As Crampes, Haritchabalet, and Jullien (2009) discuss, platforms in most two sided

models have a limited capacity to relay content which stems from the assumption that

the platform has only a single distribution channel, so if it wishes to add more of one

type of content it must reduce provision of a different type. The key difference in my

model is that I emulate platforms where this restriction does not apply59. While a

few papers have considered single firms with multiple channels for content provision

(most notably Anderson and Coate (2005)) and others (e.g. Anderson and Gans

(2011)) have considered horizontal differentiation in content provision, I am unaware

of any that have allowed the number of channels to be an endogenous decision to the

extent this paper does. This is not the first paper in the literature to acknowledge

59. While this framework is most applicable to three sided platforms like YouTube and Twitch because of the niche nature of independently created content, many of the conclusions in this paper extend to a cable provider adding channels to its television package or a streaming firm such as Netflix purchasing additional content at a cost.

103 the importance of the supply side on content provision. Maldonado (2017) conducts

an empirical investigation into the effects of Twitch.tv’s favorable treatment of large

creators, but he is mainly concerned with using that platform’s unequal treatment to

explore the implications of content neutrality on networks, and does not explore the welfare implications of this market size effect. Crampes, Haritchabalet, and Jullien

(2009) and Choi (2006) both consider media markets with free entry, but in both cases

the markets consist of a number of single channel two-sided platforms competing with

each other rather than a monopoly platform adding channels. If entry consists of

multiple platforms competing then it will be in each platform’s interest to discourage

entry, but if instead a monopoly platform is hosting all of the new content then

attracting additional content creators is beneficial to the platform.

Much of the previous literature on ad avoidance has assumed that the avoidance

technology is external to the platform. Anderson and Gans (2011) find that the

introduction of ad avoidance increases the advertising level, consistent with the

findings of this paper. However they also find that ad avoidance reduces the platform’s

ability to extract value from viewers of niche content, causing the platform to shift to

greater provision of mass market media. In contrast with my model, the subscription

fee allows the platform to capture viewer surplus more effectively and makes production

of niche content more worthwhile. The major reason for this difference is that when

the ad avoidance technology is separate from the platform, the costs consumers pay to

avoid advertising are purely deadweight loss. In my model, the costs become revenue

for the platform/content creators and the ad avoidance allows consumers to avoid the

deadweight loss of nuisance costs which are in excess of the advertiser’s value of an ad view. Therefore, increased avoidance is generally beneficial for total surplus.

104 4.2 The Baseline Model

The set of network externalities is complex with a three sided market. Therefore, to focus on the effects of adding the content creators I begin with a baseline model using strong simplifying assumptions. Namely, homogeneous gross utility of watching for consumers and homogeneous advertisers. The main results from the baseline model are robust to relaxing these assumptions as explored in Section 4.4.

4.2.1 Baseline with no subscription

The structure of the content space is shown in Figure 4.2. There are a continuum of content types (I use the terms ”content type” and ”content market” interchangably) indexed by t ∈ [0, t¯] where t¯ can be +∞. Each consumer is exogenously assigned precisely one type of content and they only enjoy that type (similar to Bergemann and Bonatti (2011) or more broadly Yang (2013)). Each content type holds only one content creator and has a market size g(t) which denotes the mass of consumers in that market. Because each type has only one creator, t is used to index creators as well as types. Without loss of generality I assume the content types are ordered so that the market size is monotonically decreasing, and I make the technical assumption that g(·) is differentiable and invertable. Advertisers purchase ads which will be disseminated among views on all markets according to the advertising level chosen by the platform.

Consumers

Without the subscription, consumers’ only choice is whether to watch or not.

Consumer utility is 0 if they choose not to watch and utility if they do watch is given

105 g(t)

Type t mar-

market size ket contains mass g(t) of consumers and one creator.

t

Figure 4.2: Content market structure

by

u − aη (4.1)

Gross utility of watching is u > 0 if the content creator for that type decides to produce and 0 otherwise. Consumers within a given content type are differentiated by their nuisance cost η > 0 from advertising60. η ∈ [η, η¯] is exogenously assigned and distributed across consumers independently from content type according to the log concave distribution function61 f(η). If η¯ < u then all consumers watch at every advertising level, meaning that the platform would not face a meaningful decision when changing the advertising level. Hence for the rest of the paper I assume η¯ > u.

60. The assumption that all consumers find advertising to be strictly harmful net of trade is potentially contentious (See Crampes, Haritchabalet, and Jullien (2009) for a discussion of the implications of positive advertising externalities) but one that is supported empirically by the findings of Wilbur (2008) among many others. Even so, adding a set of consumers with negative nuisance costs would not substantially change my results since these consumers would be unresponsive to both the presence of the subscription and changes in the advertising level. 61. The total measure of consumers can be greater than 1 when summed across content markets. f(·) should be thought of as a population distribution.

106 The nuisance cost faced by consumers is scaled by the advertising level a ∈ [0, 1] set by

the platform. For the purposes of this article I view a as the mass of firms allowed by

the platform to purchase advertising, each of whom reaches every watching consumer with a single ad with probability 162.

Consumers choose to watch if they receive weakly positive utility from doing so (I

assume indifferent consumers watch). This means that only consumers with relatively

low nuisance costs watch. Since the utility of the outside option is equal to 0 it is easy

to see that consumers will watch only if the content creator of their type is producing

(so the gross utility of watching is positive) and

u − aη ≥ 0 u =⇒ ≥ η a u so the critical η below which consumers will choose to watch is a and the proportion

u of consumers who watch in equilibrium in the advertising-only model is F ( a ), where R η F (η) = η f(n)dn. This proportion will be constant across markets due to the independence of g(·) and f(·).

Content Creators

Content creators receive a payment from the platform per ad view. Their profit is

based on the total number of ad views their content receives and the fee set by the

platform. The creators face only one decision: produce or not. If they choose not to

62. The assumption that a consumer views multiple ads is somewhat debatable as most online videos feature only a single advertisement before viewing. However, many longer videos also feature mid-roll ads, and even shorter videos will often have display advertising. There will often be ads next to the video or inserted between the video and its description, and the intrusiveness of the ads themselves can vary. Some ads allow viewers to opt out after a few seconds, while others run for as long as several minutes without the option to skip them. Therefore this continuous representation of the amount of advertising viewed by a consumer is more appropriate than it might initially appear.

107 produce they receive a payoff of 0. If they do produce they face a cost c and receive a

payment wa from the platform per ad view on their content. Given a market size g(t),

u u viewership proportion F ( a ), and advertising level a, total ad views will be aF ( a )g(t), so the profit on production with advertising only is

u w aF ( )g(t) − c (4.2) a a

The content creators will produce if they earn positive profit from doing so. This

means that there will be a critical g(t) above which the content creator will produce.

Setting the above expression equal to 0, it is then simple to solve for the creator who

is indifferent between producing and not in equilibrium:

  ∗ −1 c t = g u (4.3) waaF ( a )

t∗ denotes the critical content type. This is the content market where g(t∗) times the

average creator revenue per consumer is exactly equal to the cost of production. All

creators with larger audiences will produce and no creator with a smaller audience will make content.

Advertisers

In order to focus on the implications of adding the content creator side to the

market, I assume a particularly simple advertising sector for the baseline model.

Advertisers have value z per ad view, which reflects the expected surplus from trade.

All consumers have the same value for trade, so the advertisers can extract full value

108 from the consumers and consumers receive 0 surplus from trade6364. The platform

charges a price pa per ad view. The advertiser value per ad is then

πad = z − pa (4.4)

Since advertisers are homogeneous, the platform sets pa = z and extract all of the

surplus from the advertisers.

The Platform

The platform earns profit on the margin between the fee it receives per ad view

and the creator payment per ad view. The platform’s total profits will be the sum of

these margins across all ad views. It will select pa, a, and wa to maximize

u π = (p − w )aF ( )G(t∗) (4.5) plat a a a

Where

Z t∗ G(t∗) = g(t)dt 0 pa was determined in the advertiser’s problem, so all that remains is to evaluate the

platform’s choice of a and wa. I impose the following assumption

Assumption 2. F (η) − f(η)η is strictly increasing and crosses from negative to

positive exactly once as η increases.

63. This is the setup of Anderson and Coate (2005). Alternatively one can think of the nuisance cost as being net of expected trade surplus. My results do not change so long as the total effect of advertising on utility is negative. 64. This linear returns advertising technology is similar to Choi (2006). While Section 4.4.2 shows that the homegeneity assumption has a significant impact in this model, changing the returns to scale on advertising would not significantly change my main results.

109 This assumption is not equivalent to log-concavity (for example, the uniform

distribution does not satisfy it). For many log-concave distributions F (η) − f(η)η is

increasing in η, so this assumption is equivalent to assuming that F (η) < f(η)η in

those cases, but monotonicity is not necessary so long as this single crossing property

is satisfied. I use the accent ”∧” to refer to the equilibrium values for endogenous variables when the subscription is unavailable. The equilibrium is given in the following

lemma (all proofs are relegated to the appendix):

Lemma 2. If f(·) satisfies Assumption 2 and g(·) is concave, then there is a unique equilibrium in the baseline model with no subscription. The price of advertising is

pˆa = z. The advertising level solves Equation (4.6) if the solution can be reached with

aˆ ∈ (0, 1] u u u F ( ) = f( ) (4.6) aˆ aˆ aˆ

Otherwise the equilibrium aˆ will be 1. wˆa solves Equation (4.7):

1 −c2 (z − wˆ ) = G(tˆ∗) (4.7) 0 ˆ∗ u 2 a g (t ) wˆa(w ˆaaFˆ ( aˆ ))

Note that tˆ∗ is fully determined by Lemma 2 and Equation (4.3). Equation (4.7)

represents the platform’s tradeoff between attracting additional creators and increasing

the payment to existing creators. A solution to this first order condition always exists

since no creators will produce when the payment is 0 and marginal benefit of additional

markets is decreasing and reaches 0 as wˆa approaches pˆa. Concavity of g(x) ensures

that platform profit is concave in the creator payment and gives sufficiency of the first

order condition for profit maximization.

110 When the platform increases the advertising level, it increases the number of

advertisements but drives away some consumers due to the increased nuisance cost.

Equation (4.6) captures this tradeoff, and Assumption 2 ensures a unique solution which will maximize total ad views.

The platform and content creators’ profits are both maximized when the number of

ad views is maximized, so their interests are aligned when it comes to advertising levels,

but the inability of the platform to set different payments for different creators means

that it is paying the creators with the largest audiences too much from both a social welfare and profit maximizing perspective. If the platform were able to differentiate

among creators when deciding revenue shares, then it could reduce per-view payments

to creators with a large audience and attract more niche creators by increasing the

payments on small markets until niche creators are just willing to produce content.

Total welfare would increase since the audience of these niche creators must collectively value this additional content more than the cost of creating it (otherwise the platform would not be able to extract enough revenue to be willing to pay the creator) and the

consumers of the mass market content would be no worse off.

4.2.2 Baseline with subscription

I now evaluate the baseline model when the platform can offer the subscription.

Let ps be the subscription price, I assume utility is quasilinear, so utility from watching with the subscription is

u − ps

u does not vary across consumers, so if ps > u, no consumer purchases the subscription

and the remaining decisions are equivalent to the baseline with no subscription. If

111 ps ≤ u, then the minimum possible utility from watching when the relevant content

creator produces is u − ps ≥ 0. This means that for ps ≤ u, all consumers watch if

their associated content creator produces and their only relevant decision is whether

to purchase the subscription. Proposition 8 below shows that the equilibrium with

the subscription is always more profitable than the one without, so the platform sets

ps ≤ u and the proportion of consumers who view content in the baseline model with

the subscription is 1. Consumers purchase the subscription if

p u − p ≥ u − ηa =⇒ η ≥ s s a

This condition states that the subscription is more appealing than watching with

ads if the price is less than or equal to the nuisance cost from advertising (assuming

consumers subscribe when indifferent). The proportion of consumers who watch ads

ps is F ( a )

Let ws be the creator payment received per premium view. Then creator revenue

is now the revenue from advertising per viewer, plus the revenue from the subscription

per viewer, multiplied by the number of viewers of each type. Therefore, creator profit when the subscription is available is given by

 p p  w aF ( s ) + w (1 − F ( s )) g(t) − c (4.8) a a s a

and so the indifferent content creator in the model with the subscription is

  ∗ −1 c t = g ps ps (4.9) waaF ( a ) + ws(1 − F ( a ))

The platform’s revenue with the new income stream is the weighted average of

the margin on advertising views and the margin on premium views, multiplied by the

112 total number of views.

h p p i π = (p − w )aF ( s ) + (p − w )(1 − F ( s )) G(t∗) (4.10) plat a a a s s a

The advertiser’s problem is unaffected by the presence of the subscription, so the price

of advertising is still pa = z. The platform has 4 undetermined decision variables:

ws, wa, ps, a.

Similar to the ”∧” accent in the previous section, I use ”∼” to distinguish equilib-

rium values of endogenous variables when the subscription is available. To understand

p˜s p˜s the platform’s decision making, it is helpful to define p˜ = p˜aaF˜ ( a˜ ) + p˜s(1 − F ( a˜ ))

p˜s p˜s andw ˜ =w ˜aaF˜ ( a˜ ) +w ˜s(1 − F ( a˜ )) Then in the equilibrium with the subscription

∗ π˜plat = (˜p − w˜) G(t˜ ) and

 c  t˜∗ = g−1 (4.11) w˜ p˜ is the platform’s revenue per watching consumer and w˜ is the content creators’

revenue per consumer who watches their content. A lemma follows:

Lemma 3. There exist a continuum of platform profit maximizing pairs (w˜s, w˜a), but they are all equivalent to a single optimal value of w˜. This w˜ solves

−c2 p˜ − w˜ = G(t˜∗) (4.12) g0 (t˜∗) w˜3

If g(·) is concave, this solution exists and is unique for a given value of p˜a, p˜s, and a˜.

The intuition behind this lemma is straightforward, if the platform increases w˜a

p˜s aF ( a˜ ) by  and decreases w˜s by p˜s , then both producer and platform profits are (1−F ( a˜ )) 113 unchanged. This means that there is no point prediction for the creator payments, but

rather a prediction that equates the marginal benefit of adding an additional market

to the cost of giving existing producers a greater proportion of total revenue. The

equilibrium in this market will then be given by the following lemma:

Proposition 8. In the baseline model with the subscription available, the price of advertising is p˜a = z. If Assumption 2 is satisfied and g(·) is concave then:

1. If z > u:

• p˜s = u

• aˆ < a˜ and a˜ solves Equation (4.13) unless this solution would put it above

the corner at 1, in which case a˜ = 1.

u u u  u2 u z f( ) − F ( ) = f( ) (4.13) a a˜ a˜ a˜ a˜

2. If z < u:

• a˜ = 1

• p˜s solves

1 − F ( p˜s ) a˜ =p ˜ − z (4.14) p˜s s f( a˜ )

if the solution is less than u, or p˜s = u if it is not.

3. w˜ solves Equation (4.12)

If z > u, then an advertising view is more valuable than a subscriber view, but a

subscriber is more valuable than a consumer leaving the platform, so the platform

sets the highest subscription price where consumers will still purchase it to encourage

114 only the consumers who would otherwise not watch to subscribe. The platform could

set a˜ = aˆ and p˜s = u, have the exact same number of ad viewing consumers as it had

previously, and still make more profit than when there was no subscription due new

consumers coming on to the platform. However the addition of the subscription means

that the opportunity cost of raising the advertising level is lower since consumers

are switching to the subscription rather than away from the platform, so the profit

maximizing advertising level is higher than when advertising is the only revenue

stream.

When a subscription view is more valuable than an advertising view (i.e. when

u > z), the platform’s optimal advertising level is the corner solution. The subscription

fee will always be weakly below u, so increasing the advertising level drives additional

subscriptions rather than driving away viewership. In this case, the opportunity

cost of increasing the advertising level has not only gone down, but disappeared

entirely. However, because subscription views are now more valuable, increasing the

subscription price means trading off gaining additional revenue from subscribers with

losing revenue as marginal consumers move from the subscription to watching with

ads. Equation (4.14) balances these effects.

The case for which z > u does not seem to apply in actual media markets. Estimates

for CPM (revenue per thousand ads) for YouTube can be as low as USD$2-$10 (Green

2015; Gutelle 2014) whereas YouTube’s subscription service is approximately $10

per month at the time of writing, meaning that a free viewer would need to watch

1,000-5,000 ads per month to give as much revenue as a single subscriber. Comparable

numbers hold for similar platforms like Twitch and Hulu, so for the remainder of this

115 paper I will focus primarily on equilibria where consumer value is significantly greater

than advertiser value. This is formalized in Assumption 3:

Assumption 3. z < u

4.3 Welfare Comparison

Platform and creator profits unambiguously increase with the introduction of the

subscription fee, leading to an increase in the number of creators and content markets

served. When evaluating the effect on consumer welfare, it is helpful to separate the

content markets which are served in both equilibria from the markets which are only

served when the subscription is available. I therefore define the former markets as intensive markets and the latter as extensive markets. With this definition I am ready

to evaluate the welfare effects of the subscription

Proposition 9. If Assumption 2, Assumption 3 and concavity of g(·) are satisfied, then

1. t˜∗ ≥ tˆ∗ and creators receive more revenue when the subscription is available than

when the subscription is not available.

2. Intensive market consumers for whom η < p˜s will watch content with ads in

both the advertising-only equilibrium and the equilibrium with the subscription.

Their net utility will be lower in the equilibrium with the subscription than in

the advertising-only equilibrium.

p˜s 3. Intensive market consumers for whom p˜s ≤ η ≤ aˆ will purchase the subscription when it is available, but their net utility of watching will be lower in the equilibrium

with the subscription than in the advertising-only equilibrium.

116 p˜s 4. Intensive market consumers for whom aˆ < η purchase the subscription when it is available. Their net utility is greater in the equilibrium with the subscription

than in the advertising-only equilibrium.

5. The net utility of extensive market consumers is at least weakly greater in the

equilibrium with the subscription than in the advertising-only equilibrium

Creator profits increase when the subscription is available for two reasons: i.) The option to subscribe increases viewership which means that the creators would get additional revenue even ifw ˜ were to remain constant, and ii.) The marginal value of an additional content market is larger with the subscription, meaning that the the platform will be willing to set w˜ > wˆ in order to serve more markets even though this means increasing payments to creators who already produce. Figure 4.3 shows market coverage in the two versions of the baseline model and provides a more detailed breakdown of the cosumer groups from Proposition 9.

Figure 4.3a shows market capture in the advertising-only model. The vertical axis is the size of a given content market, but it is helpful to think of the consumers as being ordered in ascending η, so that consumers closest the horizontal axis have

u the lowest nuisance cost. Consumers above the F ( aˆ )g(t) line find the nuisance cost too high to consider watching, and those to the right of tˆ∗ prefer content for which the effective audience size is too small for the creator to be willing to produce. The consumers in the lower left region are both willing to watch and have content provided for them.

Figure 4.3b shows market coverage in the model with the addition of the subscrip- tion. All consumers who were watching previously continue to watch, but whether they benefit from the presence of the subscription welfare depends on their nuisance

117 g(t) g(t) Region 4 u u F ( aˆ )g(t) F ( aˆ )g(t) Region 3 p˜s F ( aˆ )g(t)

market size Watching market size Region 2 Consumers F (˜ps)g(t) Region 5 Region 1 tˆ∗ tˆ∗ t˜∗

t t (a) Market coverage in the advertising-only (b) Market coverage with the addition of the model premium

Figure 4.3: Market capture

cost. The highest nuisance cost consumers who were watching in the advertising-only

equlibrium (those in Region 3) are better off with the subscription because the price

is lower than the nuisance cost they would suffer if the subscription were unavailable.

p˜s Consumers for whom p˜s ≤ η ≤ aˆ purchase the subscription, but pay more in the subscription fee than they would in nuisance cost in the advertising-only equilibrium.

They are therefore worse off than they were in that equilibrium. These consumers lie

in Region 2. Finally, the consumers who continue to watch with ads (Region 1) are worse off with the subscription available because they are viewing the same content with more advertising.

Two sets of new consumers begin watching content as a result of the subscription.

Consumers who previously suffered too much nuisance cost to find the platform worth- while now purchase the subscription and watch, these lie in Region 4. Additionally,

118 because t˜∗ > tˆ∗, consumers on previously un-served markets are now watching as well.

This is the extensive change in market coverage and is represented by Region 5. All

consumers who begin to watch as a result of the subscription when they would not in

the advertising-only model are at least weakly better off with the subscription.

While it is easy to identify the consumers who benefit and lose from the subscription,

the changes in aggregate welfare and how much of the benefit from the subscription is

captured by the platform depends on f(·) and g(·).

4.3.1 Total Welfare

Since the platform captures all of the advertiser surplus and content creator revenue

is a fraction of platform revenue, the total non-consumer welfare in the advertising-only

model is

u u u (ˆp − wˆ)F ( )G(tˆ∗) +wF ˆ ( )G(tˆ∗) − tˆ∗c =pF ˆ ( )G(tˆ∗) − tˆ∗c aˆ aˆ aˆ

The first term on the left hand side represents platform revenue, the second the creator

revenue, and the third the cost of production borne by the creators. Adding them

together gives the total revenue minus the cost of production as shown on the right

hand side. Similarly, non-consumer welfare in the model with the subscription is

pG˜ (t˜∗) − t˜∗c. Total welfare in the advertising-only model is

" u # u Z aˆ zaFˆ ( ) + (u − aηˆ )dF (η) G(tˆ∗) − tˆ∗c aˆ η

The first term in the brackets is non-consumer revenue, and the second is the

average net utility of consumers. Adding together gives the total social benefit to

content production, and tˆ∗c is the cost. Total welfare in the model with the subscription

is

119 " # Z p˜s ∗ ∗ p˜aF (˜ps) + u − ηdF (η) G(t˜ ) − t˜ c η

Note that with the addition of the subscription, subscribing consumers no longer

bear the nuisance cost of advertising. This nuisance cost must be greater than the

65 social benefit, since the benefit of advertising to a single consumer is equal to p˜a = z ,

and the consumers who subscribe are willing to pay p˜s > z (where the inequality

comes from Assumption 3) to avoid the ad. The price of the subscription cancels out

of total welfare because it is a transfer from consumers to the platform. Taking the

difference of the two welfare equations, the change in total welfare is

u  Z aˆ Z p˜s   u  u ∗ ∆Ω = z F (˜ps) − aFˆ ( ) + (1 − F ( ))u +a ˆ ηdF (η) − ηdF (η) G(tˆ ) aˆ aˆ η η | {z } Welfare change for content covered in both models " # Z p˜s ∗ ∗ ∗ ∗ + zF (˜ps) + u − ηdF (η) (G(t˜ ) − G(tˆ )) − (t˜ − tˆ )c η | {z } Content produced with addition of subscription (4.15)

The first term in ∆Ω is the change in welfare on the intensive markets, which can be

either positive or negative depending on whether the increased profits and ability to

avoid advertising outweigh the losses for low nuisance cost consumers. The second

and third terms are the extensive welfare change, which is strictly positive since all

creators with market sizes larger than g(t˜∗) make positive profits, and the benefits to

consumers and the platform are non-negative. Note that while both the consumers

from region 1 and region 2 in Figure 4.3b are worse off with the subscription, the

region 2 consumers are worse off because they are giving a substantial transfer to the

65. From Assumption 3,a ˜ = 1.

120 platform in the form of a subscription fee, so their contribution to the change in total welfare is positive. Therefore, total welfare can only decrease with the subscription

if the increase in nuisance costs faced by region 1 consumers is large relative to the welfare change of all other groups in the model. The circumstances where this occurs will be rare as it requires a high subscription price (suggesting that the platform

is able to extract a substantial amount of consumer surplus) and a small extensive

change in market coverage (which would require that the platform doesn’t extract

much consumer surplus relative to the cost of acquiring new markets).

4.3.2 Platform and Creator Profits

Separating the welfare change into the change in platform/producer welfare and

the change in consumer welfare. The change in platform/producer welfare is h u i ∆π = z(F (˜p ) − a ∗ F ( )) +p ˜ (1 − F (˜p )) G(tˆ∗) s aˆ s s | {z } Content covered in both models (4.16) ∗ ∗ ∗ ∗ + [zF (˜ps) +p ˜s(1 − F (˜ps))] (G(t˜ ) − G(tˆ )) + (tˆ − t˜ )c | {z } Content produced with addition of subscription

Both the intensive and extensive changes are positive, so profit is unambiguously

higher in the subscription model. This is sensible since the advertising-only structure

remains available with the addition of the subscription, so if the subscription were

not more profitable the platform would not offer it. This increase in profit is also

necessary to have the extensive welfare change for consumers, since otherwise the

platform would not increase creator payments and no new creators would choose to

produce content.

121 4.3.3 Consumer Welfare

u !  Z p˜s Z aˆ  u ∗ ∆U = (1 − F ( ))u − (1 − F (˜ps))˜ps − ηdF (η) − aˆ ηdF (η) G(tˆ ) aˆ η η | {z } Content covered in both models " # Z p˜s ∗ ∗ + u − (1 − F (˜ps))˜ps − ηdF (η) (G(t˜ ) − G(tˆ )) η | {z } Content produced with addition of subscription (4.17)

The intensive market change in consumer welfare (the first term on the right hand side of Equation (4.17)) has ambiguous sign. As previously described, low nuisance cost consumers are worse off and high nuisance cost consumers better off with the subscription. The aggregate change in intensive welfare depends on the distribution of η. If there is a relatively high mass of consumers with high nuisance costs, then the benefits of the subscription to the high nuisance cost consumers outweigh the losses of those with low nuisance costs, but if the proportion of high nuisance cost consumers is too high, then this effect is counteracted by a high subscription price and all of the benefits of the subscription are captured by the platform and content creators. Therefore, aggregate intensive market consumer welfare decreases with the subscription if either the proportion of consumers who are highly averse to advertising is too low, or too high. The extensive market welfare change (the second term on the right hand side of Equation (4.17)) is positive since those content types were previously not produced so consumers can be no worse off than if there was no content.

If the extensive change is sufficiently large, it may be enough to counteract a decrease in welfare on the intensive markets, raising aggregate consumer welfare even if the intensive market consumers are worse off with the subscription.

122 This highlights a tradeoff between the welfare of consumers on the intensive markets

and the extensive market consumers’ welfare. The higher the platform’s profit, the

more it is willing to pay content creators and the greater the increase in content (i.e.

the greater G(t˜∗) − G(tˆ∗)). However, once the additional consumers who join the

platform with the addition of the subscription are accounted for, an increase in the

platform’s profit must imply an increased ability to extract value from consumers,

so the consumers who would watch anyway benefit less from the subscription. This

tradeoff is imperfect in that the more surplus the platform can extract, the less the

extensive market consumers benefit from being served as more of their value is being

captured by the platform as well.

As the mass of consumers with high nuisance cost increases, total surplus in the

equilibrium with the subscription will increase relative to advertising-only on all

markets (including the intensive ones). As the size of the content markets becomes

more even, t∗ becomes more responsive to changes in w, which would increase the

(positive) extensive change, but then tˆ∗ will be higher in this case as well, so relative

size of the extensive change might decrease even though total coverage would be higher

compared to a more concentrated market. While an increase in total welfare from the

premium is the most likely outcome, it is possible for the subscription to reduce welfare

if the increased nuisance costs from consumers who continue to watch ads outweighs

the benefits to other agents. This can happen if consumers are highly insensitive to

ads and the distribution of consumers is such that the size of the extensive market

change is small compared to the intensive market consumers’ welfare decrease.

123 4.3.4 Numerical Example

To demonstrate how changing the distributions changes the results I compute two

numerical examples using different nuisance cost distribution functions: 1. a right

triangular distribution and 2. a half-normal distribution. The triangular distribution

2(¯η−η) has domain [η, η¯] with mode η. The density function is then (¯η−η)2 . While not often used, the triangular distribution is a good approximation of many distributions, and

has the benefit of being analytically tractable for this model66. The half normal

distribution (density function 2φ(x), x ≥ 0 where φ(·) is the standard normal pdf) is

an example of a distribution where Assumption 2 does not hold. Most of the results

from earlier in the paper are robust to this fact. Now set

g(t) = (1 − t(α−1)) where α ≥ 1 is a concavity parameter for the market size function. The higher α,

the less concave the function, meaning that t∗ will be more responsive to changes in

w67. I use this function for the example because it is tractable, and concavity is a

function of a single parameter which facilitates demonstrating the effects of variation

in curvature of market size68.

66. Analytical solutions for the advertising level and subscription price with the triangular distribu- tion are available in Appendix D.0.1. Numerical calculations for both distributions were performed using Matlab R2016b. Code available from the author upon request. 67. In a slight abuse of notation, I use w to refer to creator payments when statements can refer to either equilibrium. Unaccented p should be interpreted similarly for platform revenue. 68. This function is not normalized to keep total consumer mass constant, but doing so produces qualitatively identical results.

124 (a) Triangular distribution of nuisance costs (b) Half-normal distribution

Figure 4.4: Variation in total welfare and utility by advertising sensitivity. Parameter values: η = 0.25, z = 0.2, c = 0.05, α = 3, u = 1

The curves in Figure 4.4a show the variation in ∆U and ∆Ω as η¯ changes in the

triangular distribution. As the consumers become more sensitive to advertising, the

platform is able to capture more of their surplus by increasing the subscription price,

so more of the benefits of the subscription are captured by the platform. As η¯ becomes

sufficiently high, p˜s approaches the corner solution at p˜s = u = 1 and all of the benefits

of the subscription are captured by the platform. The kink in the curves occurs when

the price reaches a corner. Eventually, all intensive consumers are weakly worse off with the subscription due to either the increase in nuisance cost from consumers who

do not purchase the subscription or the high subscription cost capturing all of their

surplus. The same holds true for subscribers on the extensive markets, but the low

nuisance cost extensive market consumers who watch the content with ads are still

strictly better off in the subscription equilibrium.

125 The welfare changes depicted in Figure 4.4b show that the half normal distribution

is qualitatively different from the triangular distribution. Since Assumption 2 doesn’t

hold, a = 1 in both models. The consumers are all weakly better off with the presence

of the subscription and initially strictly better off as the average nuisance cost increases.

When σ increases more consumers purchase the subscription, and the consumers who

use the free version are no worse off since the advertising level was already as high as

it could get, but as consumers become more sensitive to advertising, the subscription

price increases and the platform begins to capture more consumer surplus, which cuts

into the increase in consumer welfare from the subscription.

(a) Triangular distribution of nuisance costs (b) Half-normal distribution

Figure 4.5: Change in total welfare and utility by curvature of market size. Parameter values: η = 0.25, µ = 0, z = 0.2, c = 0.05, η¯ = 2.3, σ = 0.5, u = 1

Figure 4.5a shows the welfare changes as the market size function becomes less

curved. The change in utility and the increase in welfare both become more intense69

69. While not visible on the graph, ∆U is decreasing in α.

126 when nuisance costs follow a triangular distribution. In other parameter settings where the utility change is positive, it becomes more positive (albeit at a similarly

shallow rate) as α increases. This is because as the market size curve becomes flatter,

the marginal benefit of additional content markets becomes larger relative to the cost

of increasing payments to creators on intensive markets, so coverage will increase in

both equilibria. But the increase in t∗ as a result of the subscription diminishes since with a flatter curve more creators were already producing under the advertising-only

model to begin with. Thus the intensive change in utility becomes more important, if

the intensive change is initially so negative that the addition of the extensive markets

cannot counterbalance it, then this disparity becomes more severe as the size of the

markets becomes more even.

With the half normal distribution, consumer welfare increases as the market size

function flattens. Again, since ads are already at the corner in the advertising-only

model, there is no mechanism for consumer welfare to decrease with the introduction of

the subscription, so consumers on the intensive markets are no worse off and flattening

the market curvature very slightly enhances the increase in market coverage from the

introduction of the subscription. more detailed results for this example are available

in Appendix D.

4.4 Extensions

4.4.1 Heterogenous Utility

This section shows that many of the results from the baseline model are robust to

a consumer base with heterogeneous gross utility of watching. I allow the gross utility

127 of viewing in a content market where the content creator is producing to vary along the interval [u, u¯] with population density function j(u) independent of f(·) and g(·).

Non-viewing Paying Non-Viewing Consumers Consumers Consumers

η η p˜s a˜ Non-viewing Consumers Ad-viewing Ad-viewing Viewing Consumers Consumers Consumers

p˜s u u (a) Advertising-only (b) With premium

Figure 4.6: Consumer decisions as a function of type

Without the subscription, the critical level of η below which consumers will choose to watch now depends on u. As u increases, consumers with increasingly high η find it worthwhile to watch. From Equation (4.1), the critical η varies linearly with u, as illustrated in Figure 4.6a70.

The major changes occur when the platform can provide the subscription. Con- sumers whose utility is below the subscription price are choosing between not watching and the free version, whereas consumers whose utility is above it choose between the free and paid versions. Figure 4.6b shows the decisions of consumers as a function of

70. More complex utility functions or choice spaces such as allowing the consumers to choose how much time they spend watching content yield results that are qualitatively similar to those presented here.

128 their type when the subscription is available. For consumers whose utility of viewing is

below p˜s, they behave as in the advertising-only version of the model, and when they value viewing more than the price of the subscription they behave as in Section 4.2.2

because watching will always be better than the outside option. Consumers with low

nuisance cost watch content with ads, consumers with high utility and high nuisance

cost watch with the subscription, and consumers with low utility and high nuisance

cost do not watch at all. Putting this in mathematical terms, it follows that the

proportion of consumers who watch (denoted here by ψ) is  u¯ ψˆ = R F u  j(u)du advertising-only  u aˆ  Z p˜s  ˜ u ψ = F j(u)du + (1 − J(˜ps)) with subscription (4.18)  u a˜ | {z }  | {z } Subscribing consumers  Ad viewing consumers As with earlier notation, ψˆ denotes the viewership proportion in the advertising-

only model and ψ˜ the proportion in the model when the subscription is available.

The presence of non-viewing consumers in the subscription model means that the

percentage of consumers who watch is not guaranteed to rise with the addition of

the subscription. If the subscription price and advertising level are sufficiently high, viewership may even decrease. The platform’s profit under advertising-only is almost

ˆ u  identical to that in the baseline, substituting ψ for F aˆ , however with the addition of the subscription it becomes

 Z p˜s    u p˜s πplat = F j(u)du + F (1 − J(˜ps)) a˜(˜pa − w˜a) u a˜ a˜ | {z } Advertising revenue (4.19)  p˜   + 1 − F s (1 − J(˜p )) (˜p − w˜ ) G(t˜∗) a˜ s s s | {z } Subscription revenue The platform’s advertising revenue comes from low utility consumers and high utility

consumers with a low nuisance cost. When it increases the advertising level, it will

129 drive some of the low utility consumers away from the platform instead of onto the

subscription. If it increases the price, then some of the high utility, high nuisance cost

consumers will leave the platform rather than switching to the version with advertising.

In both cases the increased tradeoff means that the platform will be able to extract

less of the consumers’ value than in the baseline model. Since Assumption 3 does not

make sense with heterogeneous utility, I impose the following assumption instead.

Assumption 4. z < u

Assumption 4 ensures that p˜s > p˜a. The platform will set pa, a, wa, w˜s, andp ˜s to

maximize profit. Define

Z p˜s  0 u p˜s p˜s w˜ =w ˜aa˜ j(u)F ( du + (1 − J(˜ps))F ( ) +w ˜s(1 − F ( )) (4.20) u a˜ a˜ a˜

and

Z p˜s  0 u p˜s p˜s p˜ =p ˜aa˜ j(u)F ( du + (1 − J(˜ps))F ( ) +p ˜s(1 − F ( )) (4.21) u a˜ a˜ a˜

These are the analogues of p˜ and w˜ in the model with with heterogeneous gross

utility of watching. The equilibria when the subscription is and is not available are

characterized by Lemma 4

Lemma 4. Under Assumption 2, Assumption 4 and if g(·) is concave, then

1. The price of advertising is pˆa = z =p ˜a.

ˆ u  2. In the advertising-only case wˆ solves Equation (4.7) substituting ψ for F aˆ . The advertising level aˆ is the solution to

Z u¯  u u u F − f j(u)du = 0 (4.22) u aˆ aˆ aˆ 130 if a ∈ (0, 1] which solves this equation exists, otherwise aˆ = 1.

3. When the platform can offer the subscription, a˜ ≥ aˆ and the equilibrium adver-

tising level is interior if there exists a˜ ∈ (0, 1] solves the following equation: p˜ p˜ p˜ s (1 − J(˜p ))(˜p − a˜p˜ )f( s ) + (˜p − w˜ )F ( s )J(˜p ) a˜2 s s a a˜ a a a˜ s (4.23) Z p˜s u u u  = f( ) − F ( ) j(u)du(˜pa − w˜a) u a˜ a˜ a˜ otherwise advertising will reach the corner solution at 1.

The profit maximizing subscription price solves

p˜s f( a ) p˜s − a˜p˜a j(˜ps) 1 = + (˜ps − w˜s) (4.24) p˜s a 1 − J(˜p ) 1 − F ( a ) s

if the solution is greater than u. Otherwise p˜s = u. Lemma 3 continues to hold,

substituting p˜0 and w˜0 for p˜ and w˜, so w˜0 is determined by Equation (4.12).

Equation (4.22), which determines the advertising level in the advertising-only model with heterogeneous utiliity, should be interpreted identically to the baseline equilibrium. The platform is balancing additional advertising against the lost view- ership due to the increased nuisance cost, but it is now averaging the effects of the change in advertising across the utility space.

The terms on the left hand side of Equation (4.23) represent the net marginal benefit of increasing advertising from the consumers with u ≥ p˜s since these consumers are either watching more ads or shifting onto the subscription. The term on the right hand is the negative of the change in ad views from the consumers who do not find the subscription worthwhile. In equilibrium, the advertising level is above that in the advertising-only model even if the solution to Equation (4.23) is interior.

Increasing the advertising level will cause some of the marginal consumers to switch

131 to the subscription rather than leaving the platform. The tradeoff to increasing the

advertising level, while not eliminated, is reduced.

The subscription pricing equation is similar to the one in Section 4.2.2, but the

platform faces an additional tradeoff in that some consumers will switch to not watching from purchasing the subscription rather than switching to ad-supported to version when the price increases. Because of this factor, the subscription price will be

lower than the price in Section 4.2.2 would be for the same distribution of nuisance

costs.

Most of the results from Section 4.3 are robust to the addition of heterogeneous

utility.

Proposition 10. If Assumption 2, Assumption 4 and concavity of g(·) hold, then when consumers have heterogeneous utility

1. t˜∗ ≥ tˆ∗ and creators receive more revenue when the subscription is available than

when the subscription is not available.

2. The welfare comparison between advertising-only and the equilibrium with the

subscription for intensive market consumers who have u ≥ p˜s will depend on

their nuisance cost exactly as in points 2-4 of Proposition 9.

3. Intensive market consumers for whom u < p˜s have weakly lower net utility with

the presence of the subscription.

4. All extensive market consumers have weakly greater net utility when the sub-

scription is available.

Since the platform is free to set p˜s arbitrarily high and keep the advertising solution

from the advertising-only model, platform profit, creator payment and content coverage

132 must weakly increase with the addition of the subscription. Total welfare will increase

under most distributions and parameter sets. The most significant difference comes

in the form of comparative statics. With homogeneous utility, the more sensitive

consumers are to advertising (i.e. the more mass is concentrated at high η), the

more consumer surplus the platform can capture and the better off it will be. With

heterogeneous utility the optimal subscription price may be high enough that most

consumers are still choosing between not watching and the ad-supported version of

the service. But in this case a significant portion of the profit is still coming from

advertising, therefore the platform may wish to set a lower advertising level as ad

sensitivity increases, reducing the number of consumers purchasing the subscription.

The set of consumers who are better off with the subscription is reduced compared

to the baseline model. Consumers on the intensive markets are better off if they have

a high nuisance cost and a high enough utility to purchase the subscription. All

other intensive consumers are weakly worse off. The limitations which heterogeneous

utility puts on the platform’s ability to extract consumer value will lead to a smaller

extensive effect, but the extensive effect will still be positive. Consumers on the

extensive markets are better off if they will watch the content, and those who do not watch are no worse off than when no content was produced.

4.4.2 Enriched Advertising Sector

The welfare results from the baseline model are also robust to an enriched adver-

tising sector. Suppose we have a unit mass of advertisers with value of trade z ∈ [z, z¯] which follows the log-concave population distribution h(z). Then each advertiser will

buy advertising if they have positive expected profit from doing so

133 z − pa ≥ 0

In this section, I adhere strictly with the interpretation that the advertising level is

the proportion of firms which are purchasing ads that go to all consumers. For a given

price, the demand for advertising will be (1 − H(pa)). This means that the platform

can no longer set the advertising level independently of the advertising price. The

advertising level is therefore i(pa) ≡ (1 − H(pa)) and, for the advertising-only model,

platform profit is analogous to Section 4.2.1

∗ u πplat = (ˆpa − wˆa)i(ˆpa)G(tˆ )F ( ) (4.25) i(ˆpa)

Platform profit in the premium model is similar to Section 4.2.2.

      p˜s p˜s ∗ πplat = (˜pa − w˜a)i(˜pa)F + (˜ps − w˜s)(1 − F ) G(t˜ ) (4.26) i(˜pa) i(˜pa)

Consumers behave as in Section 4.2.2 replacing a with i(pa). The enriched advertising

sector does not directly change the behavior of the consumers or content creators from

the baseline model except insofar as it changes the advertising level. Before solving

for equilibrium I impose the following assumption

Assumption 5. In the model with heterogenous advertisers

• z¯ < u

2 f x  x  ( 1−H(x) ) • 1−H(x) x is increasing in x F ( 1−H(x) )

134 The first part of Assumption 5 ensures that the price of the subscription is greater than the price of advertising. As in Section 4.4.1 I focus on equilibria with this property because it is most relevant to the motivating examples. The second part of the assumption ensures that the platform’s profit is quasi-concave in the advertising price when it can offer the subscription. Note that Assumption 2 requires that

f x  x  ( 1−H(x) ) 1−H(x) x is decreasing in x, so the two assumptions together place upper F ( 1−H(x) ) and lower bounds on the log-concavity of f(·) and a lower bound on η.

Lemma 5. Under Assumption 2, Assumption 5 and if g(·) is concave, then

1. wˆ is determined by Equation (4.7) and w˜ by Equation (4.12), substituting a

with i(pa) in both equations. Similarly, the subscription price is determined by

Equation (4.14) using the same substitution.

2. With advertising only the price of advertising solves

  1 − H(ˆpa) u u u u F ( ) = F ( ) − f( ) pˆa (4.27) h(ˆpa) i(ˆpa) i(ˆpa) i(ˆpa) i(ˆpa)

if a solution such that z¯ ≥ pˆa ≥ z exists. Otherwise it will be at the corner where

i(ˆpa) = 1 and pˆa = z.

3. In the model with the subscription, p˜a ≤ pˆa and the advertising price solves

 2   1 − H(˜pa) p˜s p˜s p˜s p˜s p˜s p˜s F ( ) = f( ) + pa F ( ) − f( ) (4.28) h(˜pa) i(˜pa) i(˜pa) i(˜pa) i(˜pa) i(˜pa) i(˜pa)

if a solution such that z¯ ≥ p˜a ≥ z exists. Otherwise it will be at the corner where

i(˜pa) = 1 and p˜a = z.

135 Ad views are maximized when F ( u ) = f( u ) u , but when ad views are max- i(pa) i(pa) i(pa) imized the marginal profit from increasing the ad price is strictly positive. Therefore

in equilibrium the number of total ad views must be below that in the baseline model

so that the right hand side of Equation (4.27) is positive.

Since the platform is not directly controlling the advertising level, reducing the

price may have such a minute effect on advertising that the additional subscriptions it would drive do not make up for the lost advertising revenue. In this case, the intensive

consumers are benefiting from price insensitivity of advertisers, but this benefit comes

at the cost of reduced welfare gains for the extensive markets. This limit on the

advertising level will mitigate the increase in content coverage and some consumers will not have content produced for them when they would if the advertising level were

higher.

Proposition 11. If Assumption 2, Assumption 5 and concavity of g(·) hold, then in the model with the enriched advertising sector

1. Platform profits, creator payments and content coverage are all higher when the

platform can offer the subscription.

2. Consumers’ welfare changes with the presence of the subscription in the same

manner as Proposition 9, replacing aˆ with i(ˆpa)

3. Advertiser profits can increase or decrease as a result of the subscription

The ambiguity of the subscription’s effect on advertiser profits is surprising. In-

tuitively, the advertisers are losing ad views on the intensive market since some

consumers will be using the premium version of the content rather than viewing ads.

136 However, because the platform wishes to increase the advertising level, it will lower

the advertising price, meaning that the expected profit for each ad view increases.

Additionally, content on the extensive markets brings in new ad-viewing consumers to

the platform which mitigates the loss of the ad-viewing consumers on the intensive

markets. The total ad views for each advertiser may even increase.

4.5 Conclusion

I find that the provision of an ad avoidance subscription can increase aggregate

consumer welfare and will generally increase total welfare. T˚ag’s (2009) result that

consumer welfare decreases holds if either the consumers are either too sensitive

to advertising so that the subscription price is high, or too insensitive so that few

consumers take advantage of the subscription. On the other hand, if consumers’

advertising sensitivity lies in between these two extremes then the subscription will

increase consumer welfare. The more sensitive consumers are to advertising, the more

the benefits of the subscription will be captured by the platform and content producers.

This capture creates a tradeoff between the welfare of consumers on the intensive

markets and those on the extensive markets. As platform profits increase, so too do

payments to creators and therefore the number of creators who produce. Since the

extensive market consumers are always at least weakly better off when their content

producer is active, the welfare effect from the addition of this content will be strictly

positive. Although it may be negligible if the platform is able to capture too much

consumer welfare.

While the intuitive conclusion is that introducing the subscription would reduce

the profits of advertisers by reducing their potential audience, under many parameter

137 specifications the opposite holds true. In Section 4.4.2 I show that the platform has an

incentive to increase the amount of advertising in order to drive more consumers on

to the subscription, which lowers the price of purchasing ad views. Additionally, the

increase in content coverage means that the number of total ad views may increase

rather than decrease, consequently the advertisers may be better off with the presence

of a subscription than without it.

There are a number of other potential implications from separation of content from

the platform which are not covered in this model. In a monopoly, the platform would

prefer to pay the creators on the largest markets less per view and smaller creators

more in order to increase the amount of content and bring in more consumers. If the

platform is competing with another platform, then the largest creators become the

most valuable way of attracting consumers. The ability to differentiate might increase

competition for these creators and increase the payments they receive71. Additionally,

in this model most of the benefit of the subscription to content creators is accrued

by the large content creators who have a significant profit margin over their cost of

production. From discussions with a content creator who makes a living on YouTube

(Personal Communication 2017), the most significant benefit of the subscription is to

smaller creators. This is because in reality most YouTube channels face a dynamic

problem where they have to decide how much time and money to invest in improving

their content, and the revenue they earn from advertising is highly variable and

unpredictable. The subscription not only increases the revenue these creators earn,

but is also significantly more reliable, thus reducing their uncertainty when deciding

how much to invest in their channel. Finally, the platform in this model has no

71. Assuming that creators, following the terminology of Armstrong (2006), are single homing.

138 search frictions. In reality the size of a given creator’s audience is endogenous, and when additional creators join a platform consumers may not be aware of it because

the new creator is difficult to differentiate from myriad other small creators making

similar content. Small creators have an incentive to appear similar to those with larger

audiences in order to divert some of the consumers who search for content from those

large creators. This search diversion can be good for the platform because it allows

smaller creators to grow their audience, but it reduces the profits of creators with a

large viewer base.

139 Bibliography

Anderson, Simon P., and Stephen Coate. 2005. Market provision of broadcasting: a welfare analysis. The Review of Economic Studies 72 (4): 947–972. issn: 00346527, 1467937X.

Anderson, Simon P, Andre De Palma, and Yurii Nesterov. 1995. Oligopolistic com- petition and the optimal provision of products. Econometrica: Journal of the Econometric Society: 1281–1301.

Anderson, Simon P., and Joshua S. Gans. 2011. Platform siphoning: ad-avoidance and media content. American Economic Journal: Microeconomics 3, no. 4 (): 1–34.

Anderson, Simon P., and Regis Renault. 1999. Pricing, Product Diversity, and Search Costs: A Bertrand-Chamberlin-Diamond Model. RAND Journal of Economics 30 (4): 719–735.

Armstrong, Mark. 2006. Competition in two-sided markets. The RAND Journal of Economics 37 (3): 668–691.

Armstrong, Mark, John Vickers, and Jidong Zhou. 2009. Prominence and consumer search. The RAND Journal of Economics 40 (2): 209–233.

Armstrong, Mark, and Jidong Zhou. 2011. Paying for prominence. The Economic Journal 121 (556): F368–F395.

. 2015. Search deterrence. The Review of Economic Studies 83 (1): 26–57.

Athey, Susan, and Glenn Ellison. 2011. Position auctions with consumer search. The Quarterly Journal of Economics 126 (3): 1213–1270.

Bagnoli, Mark, and Ted Bergstrom. 2005. Log-concave probability and its applications. Economic Theory 26 (2): 445–469.

140 Barach, Moshe A, Joseph M Golden, and John J Horton. 2019. Steering in online markets: the role of platform incentives and credibility. Working Paper, Working Paper Series 25917. National Bureau of Economic Research.

Bar-Isaac, Heski, Guillermo Caruana, and Vicente Cu˜nat.2012. Search, design, and market structure. American Economic Review 102 (2): 1140–60.

Belleflamme, Paul, and Martin Peitz. 2019. Managing competition on a two-sided platform. Journal of Economics & Management Strategy 28 (1): 5–22.

Bergemann, Dirk, and Alessandro Bonatti. 2011. Targeting in advertising markets: implications for offline versus online media. The RAND Journal of Economics 42 (3): 417–443. issn: 07416261.

Bikhchandani, Sushil, and Sunil Sharma. 1996. Optimal search with learning. Journal of Economic Dynamics and Control 20 (1-3): 333–359.

Blake, Thomas, Chris Nosko, and Steven Tadelis. 2016. Returns to consumer search: evidence from ebay. In Proceedings of the 2016 acm conference on economics and computation, 531–545. ACM.

Brown, Meta, Christopher J. Flinn, and Andrew Schotter. 2011. Real-time search in the laboratory and the market. American Economic Review 101 (2): 948–74.

Burdett, Kenneth, and Tara Vishwanath. 1988. Declining reservation wages and learning. The Review of Economic Studies 55 (4): 655–665.

Caplin, Andrew, Mark Dean, and Daniel Martin. 2011. Search and satisficing. American Economic Review 101 (7): 2899–2922.

Casner, Ben. 2020a. Learning while shopping: an experimental investigation into the effects of learning on recall in consumer search. Working Paper.

. 2020b. Seller curation in platforms. Working Paper.

Chandra, Ambarish, and Mariano Tappata. 2011. Consumer search and dynamic price dispersion: an application to gasoline markets. The RAND Journal of Economics 42 (4): 681–704.

Charness, Gary, and Dan Levin. 2005. When optimal choices feel wrong: a laboratory study of bayesian updating, complexity, and affect. American Economic Review 95, no. 4 (): 1300–1309.

141 Chen, Yongmin, and Chuan He. 2011. Paid placement: advertising and search on the internet. The Economic Journal 121 (556): F309–F328.

Chen, Yongmin, and Tianle Zhang. 2018. Entry and welfare in search markets. The Economic Journal 128 (608): 55–80.

Chen, Yuxin, and Song Yao. 2016. Sequential search with refinement: model and application with click-stream data. Management Science 63 (12): 4345–4365.

Choi, Jay Pil. 2006. Broadcast competition and advertising with free entry: subscription vs. free-to-air. Information Economics and Policy 18 (2): 181–196. issn: 0167-6245.

Choi, Michael, and Lones Smith. 2016. Optimal sequential search among alternatives. Available at SSRN 2858097.

Corts, Kenneth S. 2014. Finite optimal penalties for false advertising. The Journal of Industrial Economics 62 (4): 661–681.

Cox, James C, and Ronald L Oaxaca. 2000. Good news and bad news: search from unknown wage offer distributions. Experimental Economics 2 (3): 197–225.

Crampes, Claude, Carole Haritchabalet, and Bruno Jullien. 2009. Advertising, compe- tition and entry in media industries. The Journal of Industrial Economics 57 (1): 7–31. issn: 1467-6451.

Dana, James. 1994. Learning in an equilibrium search model. International Economic Review 35 (3): 745–71.

De los Santos, Babur, Ali Horta¸csu,and Matthijs R Wildenbeest. 2017. Search with learning for differentiated products: evidence from e-commerce. Journal of Business & Economic Statistics 0 (0): 1–16. de Roos, Nicolas. 2018. Collusion with limited product comparability. The RAND Journal of Economics 49 (3): 481–503.

Diamond, Peter. 1971. A model of price adjustment. Journal of Economic Theory 3 (2): 156–168.

Dinerstein, Michael, Liran Einav, Jonathan Levin, and Neel Sundaresan. 2018. Con- sumer price search and platform design in internet commerce. American Economic Review 108 (7): 1820–59.

Dubra, Juan. 2004. Optimism and Overconfidence in Search. Review of Economic Dynamics 7, no. 1 (): 198–218.

142 Eliaz, Kfir, and Ran Spiegler. 2011. A simple model of search engine pricing. The Economic Journal 121 (556): F329–F339.

Ellison, Glenn, and Sara Fisher Ellison. 2009. Search, obfuscation, and price elasticities on the internet. Econometrica 77 (2): 427–452.

Ellison, Glenn, and Alexander Wolitzky. 2012. A search cost model of obfuscation. The RAND Journal of Economics 43 (3): 417–441.

Falco, David. 2019. Economic search and numeracy. Working Paper.

Falk, Armin, David Huffman, and Uwe Sunde. 2006. Self-confidence and search.

Fischbacher, Urs. 2007. Z-tree: zurich toolbox for ready-made economic experiments. Experimental economics 10 (2): 171–178.

Fudenberg, Drew, and . 1991. . Cambridge, MA: MIT Press.

Gamp, Tobias. 2019. Guided search. Working paper.

Gamp, Tobias, and Daniel Kraehmer. 2018. Deception and competition in search markets. Working Paper.

Green, Hank. 2015. The $1,000 cpm. Accessed January 16, 2017. https://medium. com/@hankgreen/the-1-000-cpm-f92717506a4b#.3lxhobg3q.

Greiner, Ben. 2015. Subject pool recruitment procedures: organizing experiments with orsee. Journal of the Economic Science Association 1 (1): 114–125.

Gutelle, Sam. 2014. The average youtube cpm is $7.60, but making money isn’t easy. Accessed April 24, 2018. https://www.tubefilter.com/2014/02/03/youtube- average-cpm-advertising-rate/.

Hagiu, Andrei, and Bruno Jullien. 2011. Why do intermediaries divert search? The RAND Journal of Economics 42 (2): 337–362.

Hagiu, Andrei, and Julian Wright. 2015. Multi-sided platforms. International Journal of Industrial Organization 43:162–174.

Hinnosaar, Toomas, and Keiichi Kawai. 2018. Robust pricing with refunds. Working Paper.

Holt, Charles A., and Susan K. Laury. 2002. Risk aversion and incentive effects. American Economic Review 92, no. 5 (): 1644–1655.

143 Hong, Han, and Matthew Shum. 2006. Using price distributions to estimate search costs. RAND Journal of Economics 37 (2): 257–275.

Jhunjhunwala, Tanushree. 2018. Searching to avoid regret: experimental evidence and an application to charitable giving. Working Paper.

Kagel, John H, Ronald M Harstad, and Dan Levin. 1987. Information impact and allocation rules in auctions with affiliated private values: a laboratory study. Econometrica: Journal of the Econometric Society: 1275–1304.

Kaya, Ayca, and Kyungmin Kim. 2018. Trading dynamics with private buyer signals in the market for lemons. The Review of Economic Studies: rdy007.

Kogut, Carl A. 1990. Consumer search behavior and sunk costs. Journal of Economic Behavior and Organization 14 (3): 381–392. issn: 0167-2681.

Lewis, Matthew S. 2011. Asymmetric price adjustment and consumer search: an examination of the retail gasoline market. Journal of Economics & Management Strategy 20 (2): 409–449.

Lubensky, Dmitry. 2017. A model of recommended retail prices. The RAND Journal of Economics 48 (2): 358–386.

Maldonado, Jose Francisco Tudon. 2017. Congestion v content provision in net neu- trality: the case of amazon’s twitch.tv. Working paper.

Mamadehussene, Samir. 2019. The interplay between obfuscation and prominence in price comparison platforms. Management Science.

Mauring, Eeva. 2017. Learning from trades. The Economic Journal 127 (601): 827–872.

Moorthy, Sridhar, Brian T. Ratchford, and Debabrata Talukdar. 1997. Consumer information search revisited: theory and empirical analysis. Journal of Consumer Research 23 (4): 263–277.

Nosko, Chris, and Steven Tadelis. 2015. The limits of reputation in platform markets: an empirical analysis and field experiment. Working Paper.

Peters, Ellen, and Par Bjalkebring. 2015. Multiple numeric competencies: when a number is not just a number. Journal of personality and social psychology 108 (5): 802.

Petrikait˙e,Vaiva. 2018. Consumer obfuscation by a multiproduct firm. The RAND Journal of Economics 49 (1): 206–223.

144 Piccolo, Salvatore, Piero Tedeschi, and Giovanni Ursino. 2015. How limiting deceptive practices harms consumers. The RAND Journal of Economics 46 (3): 611–624.

Prasad, Ashutosh, Vijay Mahajan, and Bart Bronnenberg. 2003. Advertising versus pay-per-view in electronic media. International Journal of Research in Marketing 20, no. 1 (): 13–30. issn: 0167-8116.

Rapoport, Amnon, and . 1970. Choice behavior in an optional stopping task. Organizational Behavior and Human Performance 5 (2): 105–120.

Rhodes, Andrew, and Chris M Wilson. 2018. False advertising. The RAND Journal of Economics 49 (2): 348–369.

Schley, Dan R., and Ellen Peters. 2014. Assessing ”economic value”: symbolic-number mappings predict risky and riskless valuations. PMID: 24452604, Psychological Science 25 (3): 753–761.

Schotter, Andrew, and Yale M Braunstein. 1981. Economic search: an experimental study. Economic Inquiry 19 (1): 1–25.

Smirnov, Aleksei, and Egor Starkov. 2018. Bad news turned good: reversal under censorship. Working Paper.

Sonnemans, Joep. 1998. Strategies of search. Journal of Economic Behavior & Orga- nization 35 (3): 309–332. issn: 0167-2681.

T˚ag,Joacim. 2009. Paying to remove advertisements. Information Economics and Policy 21 (4): 245–252. issn: 0167-6245.

Teh, Tat-How. 2019. Platform governance. Working Paper.

Ursu, Raluca M. 2018. The power of rankings: quantifying the effect of rankings on online consumer search and purchase decisions. Marketing Science 37 (4): 530–552.

Varian, Hal R. 1980. A Model of Sales. American Economic Review 70, no. 4 (): 651–59.

Weisbuch, Gerard, Alan Kirman, and Dorothea Herreiner. 2000. Market Organisation and Trading Relationships. Economic Journal 110, no. 463 (): 411–36.

Weitzman, Martin L. 1979. Optimal search for the best alternative. Econometrica: Journal of the Econometric Society: 641–654.

145 Weyl, E Glen. 2010. A price theory of multi-sided platforms. American Economic Review 100 (4): 1642–72.

White, Alexander. 2013. Search engines: left side quality versus right side profits. International Journal of Industrial Organization 31 (6): 690–701.

Wilbur, Kenneth C. 2008. A two-sided, empirical model of television advertising and viewing markets. Marketing Science 27 (3): 356–378.

Wolinsky, Asher. 1986. True monopolistic competition as a result of imperfect infor- mation. The Quarterly Journal of Economics 101 (3): 493–511.

Yang, Huanxing. 2013. Targeted search and the long tail effect. RAND Journal of Economics 44 (4): 733–756.

Yang, Huanxing, and Lixin Ye. 2008. Search with learning: understanding asymmetric price adjustments. The RAND Journal of Economics 39 (2): 547–564.

Yea, Sangjun. 2018. Quality disclosure in online marketplaces. Working Paper.

Zhu, Feng. 2019. Friends or foes? examining platform owners’ entry into complementors’ spaces. Journal of Economics & Management Strategy 28 (1): 23–28.

Zhu, Feng, and Qihong Liu. 2018. Competing with complementors: an empirical look at amazon. com. Strategic Management Journal 39 (10): 2618–2642.

Zwick, Rami, Amnon Rapoport, Alison King Chung Lo, and AV Muthukrishnan. 2003. Consumer sequential search: not enough or too much? Marketing Science 22 (4): 503–519.

146 Appendix A: Omitted Proofs

Proof of Proposition 1

Because U is determined by Equation (2.8), the derivatives of both sides with regard to α must be equal. Z ¯ ∂U Z ¯ 0 = − (0 − U) f(0)d0 − (1 − α) f(0)d0 U ∂α U ¯ (A.1) ∂U R (0 − U) f(0)d0 =⇒ = − U ∂α (1 − α)(1 − F (U)) R ¯ 0 0 since U f( )d = (1 − F (U)). All of the elements of the fraction on the right hand

∂U side are positive, so the negative of the fraction must be negative and ∂α < 0 .

Proof of Proposition 2 Directly taking the derivative of Equation (2.11) ∂p −(βα + 1 − α) + (1 − β)(1 − α) 1 − F (U) ∂U −f(U)2 − f 0(U)(1 − F (U)) = + Ψ ∂α (βα + 1 − α)2 f (U) ∂α f(U)2 (A.2) −β 1 − F (U) ∂U −f(U)2 − f 0(U)(1 − F (U)) = + Ψ (βα + 1 − α)2 f (U) ∂α f(U)2 In a symmetric equilibrium, U does not depend on β, so as β → 0 the first term in the parentheses on the right hand side approaches 0 and the second term approaches

∂U −f(U)2−f 0(U)(1−F (U)) ∂U ∂α f(U)2 . From Proposition 1 ∂α < 0, and Anderson, De Palma, and

f(U)2+f 0(U)(1−F (U)) Nesterov (1995) demonstrate that f(U)2 > 0 for a log concave distribution,

∂U −f(U)2−f 0(U)(1−F (U)) ∂p ∂p so ∂α f(U)2 > 0. Therefore limβ→0 ∂α > 0 and by inspection ∂α is

∂p continuous in β, so it must be the case that ∂α > 0 for β sufficiently small. 147 Proof of Proposition 3 Taking the derivative of Π with regard to α

∂Π ∂p = Q(V (α)) + pQ0(V (α))V 0(α) ∂α ∂α  −β 1 − F (U) ∂U −f(U)2 − f 0(U)(1 − F (U))  = + Ψ Q(V (α)) (βα + 1 − α)2 f (U) ∂α f(U)2  −β  1 − F (U)  ∂U  f(U)2 + f 0(U)(1 − F (U))  + pQ0(V (α)) U − + Ψ 1 + (A.3) (βα + 1 − α)2 f(U) ∂α f(U)2 −β  1 − F (U)  1 − F (U)  = Q(V (α)) + pQ0(V (α)) U − (βα + 1 − α)2 f(U) f(U) ∂U   f(U)2 + f 0(U)(1 − F (U))  f(U)2 + f 0(U)(1 − F (U))  + Ψ pQ0(V (α)) 1 + − Q(V (α)) ∂α f(U)2 f(U)2 The first term in the brackets represents the change in profits due to the lemon effect.

The lemon effect pushes prices down and drives consumers away from platform, so this

term is always negative. The second term represents the change in profits resulting

from search obfuscation; this effect can be either positive or negative depending on whether the increased prices make up for reduced participation. When α = 0 the

second term does not depend on β at all. If this second term is positive then for β

sufficiently small the entire derivative must be positive and platform profit is increasing

in α at α = 0, this gives condition1.

The second term is positive at α = 0 if  f(U)2 + f 0(U)(1 − F (U)) f(U)2 + f 0(U)(1 − F (U)) pQ0(V (0)) 1 + < Q(V (0)) f(U)2 f(U)2 (A.4) In other words, admitting low-quality sellers is profitable if the revenue loss from

consumers leaving the platform (the left hand side) is less than the increase in revenue

from the obfuscation effect. As c → 0, p will approach 0, and Q0(V (0)) will be

f(U)2+f 0(U)(1−F (U)) decreasing. From Anderson, De Palma, and Nesterov (1995), f(U)2 > 0, so the left hand side will go to 0 as the search cost decreases. However, it is theoretically

f(U)2+f 0(U)(1−F (U)) possible that in the limit f(U)2 could tend to 0 sufficiently quickly that the revenue loss from reduced participation is greater than the revenue increase from

148 f(U)2+f 0(U)(1−F (U)) obfuscation for all values of c. If f(¯) > 0, then limc→0 f(U)2 > 0, which eliminates this concern. Alternatively we can rearrange the inequality to be

 f(U)2+f 0(U)(1−F (U))  0 Q (V (0)) 1 + f(U)2 1 < (A.5) Q(V (0)) f(U)2+f 0(U)(1−F (U)) p f(U)2

f(U)2+f0(U)(1−F (U)) ! 0 Q (V (0)) f(U)2 If ∈ o 2 0 —equivalent to highly concave Q(·)—then the Q(V (0)) 1+ f(U) +f (U)(1−F (U)) f(U)2 density of marginal consumers will become so thin as search costs decrease that the

revenue increase from obfuscation must overcome the change in participation for c

sufficiently small.

Proof of Lemma 1

Sufficiency of the first order condition follows immediately if Π is single peaked in

α. Π is single-peaked in α if

f(U)2+f 0(U)(1−F (U)) 0 f(U)2 Q (V (α)) − − βB(α) (A.6)  f(U)2+f 0(U)(1−F (U))  Q(V (α)) p 1 + f(U)2

Where

1−F (U) 0  1−F (U)  (1 − F (U)) Q(V (α)) f(U) + pQ (V (α)) U − f(U) B(α) = (A.7) (1 − α) R ¯( − U)f()d 1−F (U)  f(U)2+f 0(U)(1−F (U))  U Q(V (α)) f(U) 1 + f(U)2

Q0(V (α)) is decreasing in α. Because Q(·) is concave and V (α) is decreasing in α, Q(V (α)) must be increasing in α. β must be small enough under for Proposition 2 to apply, otherwise

the platform would not admit any low-quality sellers since Π is strictly decreasing in

1 α if p is not increasing. Therefore p must decrease in α, so the only possible sources f(U)2+f0(U)(1−F (U)) f(U)2 of multiple maxima are  2 0  and βB(α). If the variance of f(·) is 1+ f(U) +f (U)(1−F (U)) f(U)2

149 1 sufficiently large (so that f(·) is small) then the effect of 1−α in the first term of B(α) dominates and B(α) is increasing in α. Finally, f(U)2+f 0(U)(1−F (U)) f 0(U)(1−F (U)) 2 1 + 2 f(U) = f(U)  2 0  f 0(U)(1−F (U)) (A.8) f(U) +f (U)(1−F (U)) 2 + 1 + f(U)2 f(U)2 is the ratio of the change in price as U decreases to the change in V (α) as U decreases.

This ratio can increase or decrease in U, but if f 0(U) is sufficiently close to 0 (e.g.

1 with a uniform distribution or high variance distribution) then the effects of p and

Q0(V (α)) Q(V (α)) dominate the other terms and profit is single-peaked in α.

Proof of Proposition 4

Proof of part1: First note that Π is increasing in α if the left hand side of

Equation (2.19) minus the right hand side is positive.

As c increases, U decreases and p increases, meaning that V (α) decreases. Since

the direct effect of α on p decreases prices via the lemon coefficient, Assumption 1

implies that f(U)2+f 0(U)(1−F (U)) 0 f(U)2 Q (V (α)) − − βB(α) (A.9)  f(U)2+f 0(U)(1−F (U))  Q(V (α)) p 1 + f(U)2 is increasing in U, so if c decreases then marginal profits in α decrease for all values

of α. Concavity of the objective function then further implies that the α which solves

Equation (2.19) must decrease, so α∗ decreases. V (α) Proof of part2 If Q(V (α)) = t t > 0, then from Equation (A.3) ∂Π 1  −2β(1 − α)  1 − F (U)  1 − F (U)  = U − ∂α t (βα + 1 − α)3 f(U) f(U) ∂U  1 − F (U)  f(U)2 + f 0(U)(1 − F (U))   1 − F (U)  f(U)2 + f 0(U)(1 − F (U))   + (Ψ)2 1 + − U − ∂α f(U) f(U)2 f(U) f(U)2 (1 − α)  −2β  1 − F (U)  1 − F (U)  = U − t(βα + 1 − α)2 βα + 1 − α f(U) f(U) ∂U  1 − F (U)  f(U)2 + f 0(U)(1 − F (U))   1 − F (U)  f(U)2 + f 0(U)(1 − F (U))   + (1 − α) 1 + − U − ∂α f(U) f(U)2 f(U) f(U)2 (A.10)

150 The terms outside of the square brackets do not matter for determining the effect

of β on α∗. The first term in brackets is negative and decreasing in β, the second

term in the brackets is positive in any interior equilibrium and does not vary with β.

From Assumption 1 the sum inside the brackets must be decreasing in α, so if the

first term becomes more negative then α∗ must decrease.

Proof of Proposition 5

Consumers only move on from the recommended seller if iR is low, which means

that any consumer who visits a different seller will never go back to the recommended

seller. Deriving the stopping rule for these consumers then follows precisely the same

steps as in Section 2.3. Because consumers are using the Section 2.3 stopping rule,

the non-recommended sellers will charge the Section 2.3 price.

For the rest of the lemma, begin by comparing Equation (2.8) and Equation (2.23),

it must be the case that

U + p − p R R = U Ψ (A.11)

=⇒ UR = ΨU + pR − p

This equation can be rearranged to give part2. Suppose pR ≤ p, then pR − p ≤ 0 and

1−F (x) UR < U as Ψ < 1 when α > 0. But f(·) is log concave so f(x) is decreasing in x and

1 − F (UR) 1 − F (U) 1 − F (U) pR = > > Ψ = p (A.12) f(UR) f (U) f (U)

151 Which contradicts the assumption that pR ≤ p and proves part3. The proof of part4

uses nearly identical logic to show that an inequality in either direction produces a

contradiction, noting that Ψ = 1 when α = 0.

Proof of Corollary 1

With probability 1 − F (UR) consumers purchase from the recommended seller and

R ¯ f() receive expected utility  d() − pR. With probability F (UR), consumers UR 1−F (UR) move on from the recommended seller, in which case their search problem is iden-

tical to the equilibrium without a recommended seller and their expected payoff is

 1−F (U)  Ψ U − f(U) , so

Z ¯ f()   1 − F (U) V (α) =  d() − pR (1 − F (UR)) + F (UR)Ψ U − UR 1 − F (UR) f(U) (A.13)

Focusing on the first term

Z ¯ f() Z ¯ f()  d() − pR = ( − pR) d() UR 1 − F (UR) UR 1 − F (UR) Z ¯ f() > (U − p ) d() R R 1 − F (U ) UR R (A.14)

=UR − pR  1 − F (U) =Ψ U − f(U)

Where the last equality comes from Proposition 5. Plugging this inequality into the value function  1 − F (U) V (α) >((1 − F (U )) + F (U ))Ψ U − R R f(U) (A.15)  1 − F (U) =Ψ U − f(U)

152 Since U is identical to the reservation utility from the equilibrium without the rec-

ommended seller, Equation (A.15) implies that consumers must be strictly better off with the recommended seller.

Proof of Proposition 7 Part1: Since part2 of Proposition 5 holds for all α, the derivatives of both sides of the equation must be equal

∂U  f(U )2 + f 0(U )(1 − F (U ))  ∂U  f(U)2 + f 0(U)(1 − F (U))  β  1 − F (U)  R 1 + R R R = Ψ 1 + − U − 2 2 2 ∂α f(UR) ∂α f(U) (βα + 1 − α) f(U) (A.16)

Rearranging

 2 0    1 + f(U) +f (U)(1−F (U)) U − 1−F (U) ∂U ∂U f(U)2 β f(U) R (A.17) = Ψ  2 0  −  2 0  ∂α ∂α f(UR) +f (UR)(1−F (UR)) (βα + 1 − α)2 f(UR) +f (UR)(1−F (UR)) 1 + 2 1 + 2 f(UR) f(UR)

∂UR ∂U Since UR = U and Ψ = 1 for α = 0, this then implies that ∂α < ∂α for α sufficiently close to 0, which in turn implies that UR < U for α positive but near 0.

Part2: Taking the derivative of platform profit ∂Π R =Q0(V (α))V 0(α) [(1 − F (U )p + F (U )p] ∂α R R R  ∂p ∂p ∂U  (A.18) + Q(V (α)) (1 − F (U )) R + F (U ) − R f(U )(p − p) R ∂α R ∂α ∂α R R Evaluating the derivative of the recommended seller’s profit

∂p ∂U f(U )2 + f 0(U )(1 − F (U )) R R R R R (A.19) = − 2 ∂α ∂α f(UR)

From Equation (A.17) this derivative must be positive. From part1 of this proposition

for small α it is larger than the change in p with α, but the comparison is ambiguous

for large α. Additionally, because pR > p for α > 0, the marginal costs of increasing α

(the first term on the right hand side of Equation (A.18)) of can potentially be larger when the platform recommends a seller as well. However, from Corollary 1, Q0(V (α))

153 must be closer to 0 in the equilibrium with the recommended seller because V (α) is

∂ΠR ∂Π larger. For Q(·) sufficiently concave, this effect dominates and ∂α > ∂α for all α, so from Assumption 1, we must have αR∗ > α∗.

Proof of Corollary 2

From Proposition 3 and Proposition 4, for β or c sufficiently large, α∗ is close to 0. But then from Proposition 7 and Corollary 1 a change in α must result in a larger increase in seller profits must and a smaller decrease in participation when the platform can recommend a seller. It follows immediately that αR∗ > α∗ for any β and c pair in this set such that α∗ is positive but small. Further, comparing Equation (A.18) and Equation (A.3), it follows from inspection and part4 of Proposition 5 that that

∂pR ∂p the former will be strictly greater at α = 0 when ∂α > ∂α . Part1 of Proposition 7 implies that this last condition will hold at any c and β pair such that the optimal

α∗ is exactly 0. Thus αR∗ > 0 at those pairs, and as Equation (A.18) will changes smoothly with β and c there must be parameters outside the set where α∗ is positive such that αR∗ > 0.

Proof of Lemma 2

Begin by noting the following identities:

∂t∗ 1 −c = 0 ∗ 2 u > 0 (A.20) ∂wa g (t ) wa aF ( a )

∂F ( u ) ∂t∗ −c w F ( u ) + aw a a a a ∂a (A.21) = 0 ∗ 2 u 2 > 0 ∂a g (t ) (wa aF ( a )) 154 and

∂F ( u ) u u a = −f( ) (A.22) ∂a a a2

Substituting Equation (A.20) and Equation (A.22) into Equation (A.21) gives

∗ u u u ∂t −cwa F ( a ) − f( a ) a = 0 ∗ 2 u 2 > 0 (A.23) ∂a g (t ) (wa aF ( a ))

The first term in the product is positive since g0 (t) < 0. Thus the sign of the derivative

u is positive iff the increased revenue from having more ad views (F ( a )) outweighs the

u u loss in viewership caused by the increase in nuisance (f( a ) a ). Take the FOC with regard to advertising level:

∂π  u ∂t∗ u ∂F ( u ) plat =(p − w ) G(t∗)F ( ) + g(t∗) aF ( ) + aG(t∗) a ∂a a a a ∂a a ∂a   u u u ∂t∗ u  =(p − w ) G(t∗) F ( ) − f( ) + g(t∗) aF ( ) a a a a a ∂a a Using the identities established above and the definition of t∗, then pulling out

common factors:

 u 2  ∂πplat  u u u ∗ aF ( a ) c = (pa − wa) F ( ) − f( ) G(t ) − 0 ∗ u 3 (A.24) ∂a a a a g (t ) (waaF ( a ))

Since g(·) is decreasing all terms in this product are positive with the exception of

u u u  F ( a ) − f( a ) a , so this is the only relevant part of the platform’s decision. So long as Assumption 2 is satisfied, there will be a unique solution. This solution must be

profit maximizing because (from Assumption 2 and the fact that f(·) is log concave)

u u u  F ( a ) − f( a ) a is positive for a close to 0 and negative for a close to 1. 155 Now take the derivative of platform profit with regard to creator payment, substi-

tuting appropriately using the identities above:

 ∗  ∂πplat u g(t ) −c ∗ = aF ( ) 0 ∗ 2 u (pa − wa) − G(t ) (A.25) ∂wa a g (t ) wa aF ( a )

Using the definition of t∗

 2  ∂πplat u 1 −c ∗ = aF ( ) 0 ∗ u 2 (pa − wa) − G(t ) (A.26) ∂wa a g (t ) wa(waaF ( a ))

∗ The derivative will be positive when G(t ) = 0 and negative at wa = pa, it is also

continuous, so a solution always exists. Checking for concavity, take the second

derivative

2  ∗ 2 ∂ πplat u −1 ∂t −c 2 = aF ( ) 00 ∗ u 2 (pa − wa) ∂ wa a g (t ) ∂wa wa(waaF ( a )) 3 −c2 − 0 ∗ 4 u 2 (pa − wa) g (t ) wa (aF ( a )) 2 ∗  1 −c ∗ ∂t − 0 ∗ u 2 − g(t ) g (t ) wa(waaF ( a )) ∂wa So long as the first term is negative and/or sufficiently small, the platform’s profit is

concave in prices. Thus g00 (t) ≤ 0 is sufficient, but not necessary to ensure that there

is a unique profit maximizing creator payment for every advertising level so long as

there are a positive number of views.

Proof of Lemma 3

The following identities are useful:

∗ p˜s ∂t −c 1 − F ( a˜ ) = 0 ∗ 2 > 0 (A.27) ∂w˜s g (t )  p˜s p˜s  w˜aaF˜ ( a˜ ) +w ˜s(1 − F ( a˜ ))

∗ p˜s ∂t −c aF ( a˜ ) = 0 ∗ 2 > 0 (A.28) ∂w˜a g (t )  p˜s p˜s  w˜aaF˜ ( a˜ ) +w ˜s(1 − F ( a˜ )) 156 Using the above identities, take the derivative of platform profit with regard to the share of subscription fees (w ˜s)

  ∗ ∂πplat p˜s p˜s ∗ ∂t = (pa − w˜a)˜aF ( ) + (˜ps − w˜s)(1 − F ( )) g(t ) ∂w˜s a˜ a˜ ∂w˜s (A.29) p˜ − (1 − F ( s ))G(t∗) a˜ and advertising revenue (w ˜a)

  ∗ ∂πplat p˜s p˜s ∗ ∂t = (pa − w˜a)˜aF ( ) + (˜ps − w˜s)(1 − F ( )) g(t ) ∂w˜a a˜ a˜ ∂w˜a (A.30) p˜ − aF˜ ( s )G(t∗) a˜ Both of these derivatives give the same FOC.

2 p˜s p˜s −c (pa − w˜a)aF ( a˜ ) + (˜ps − w˜s)(1 − F ( a˜ )) ∗ 0 3 = G(t ) (A.31) g (t∗)  p˜s p˜s  w˜aaF ( a˜ ) +w ˜s(1 − F ( a˜ ))

p˜s aF ( a˜ ) For any pair (w˜a, w˜s) that solves this FOC, (w˜a + , w˜s − p˜s ) will also be a (1−F ( a˜ )) solution for any  ∈ R, thus there is a continuum of solutions. From the definition of p˜ andw ˜, I can rewrite this FOC as

−c2 p˜ − w˜ = G(t˜∗) (A.32) g0 (t˜∗) w˜3

By almost the exact same derivation as in the proof of Lemma 2, concavity of g(·) gives uniqueness and sufficiency of this FOC for profit maximization.

Proof of Proposition 8

The value of p˜a was derived in text. The value of w˜ comes from Lemma 3. To see the increase in advertising level, take the derivative of platform profit with regard to advertising level

157 ∂π   p˜ p˜ p˜  plat = (˜p − w˜ ) F ( s ) − af˜ ( s ) s ∂a a a a˜ a˜ a˜2 p˜ p˜  + (˜p − w˜ )f( s ) s G(t˜∗) s s a˜ a˜2 ∂t˜∗ + (˜p − w˜)g(t˜∗) ∂a Using Equation (4.9) and taking the derivative of t˜∗ w.r.t. a

∂π   p˜ p˜ p˜  plat = (˜p − w˜ ) F ( s ) − af˜ ( s ) s ∂a a a a˜ a˜ a˜2 p˜ p˜  + (˜p − w˜ )f( s ) s G(t˜∗) s s a˜ a˜2 2 p˜s p˜s −c w˜aF ( a˜ ) + a˜2 (w ˜s − a˜w˜a) + (˜p − w˜) 0 3 g (t˜∗)  p˜s p˜s  w˜aaF˜ ( a˜ ) +w ˜s(1 − F ( a˜ ))

Using Lemma 3, I can seta ˜w˜a =w ˜s. Then from Equation (A.31)

∂π  p˜ p˜ p˜ p˜  plat = (˜p − w˜ )F ( s ) + (˜p − ap˜ ) s f( s ) +w ˜ F ( s ) G(t∗) ∂a a a a˜ s a a2 a˜ a a˜ ∂π  p˜ p˜ p˜  ⇒ plat = p˜ F ( s ) + (˜p − ap˜ ) s f( s ) G(t∗) ∂a a a˜ s a a2 a˜

This derivative is unambiguously positive if p˜s > az˜ = a˜p˜a, so if that inequality holds advertisement will to a corner solution at 1. p˜s < az˜ only if p˜s = u < az˜ (see below) then we can rearrange the above to get

∂π   u u u  u2 u  plat = p˜ F ( ) − f( ) + f( ) G(t∗) (A.33) ∂a a a˜ a a˜ a˜ a˜

u u u Since aˆ sets F ( a˜ ) − a f( a˜ ) = 0, this derivative will be positive at a˜ = aˆ, so the profit maximizing advertising level must be above aˆ. Furthermore, Assumption 2 and log concavity of f(·) will be enough to give sufficiency of this equation for profit maximization.

158 To find the subscription fee, take the first order derivative of platform profit

  ∗ ∂πplat p˜s 1 p˜s p˜s 1 ∗ ∗ ∂t = (˜pa − wa)˜af( ) +(1 − F ( )) − (˜ps − w˜s)f( ) G(t ) + (˜p − w˜)g(t ) ∂p˜s a˜ a˜ a˜ a˜ a˜ ∂p˜s

∗ Using Equation (4.9) and taking the derivative of t˜ w.r.t.p ˜s witha ˜w˜a =w ˜s

  ∂πplat ∗ p˜s p˜s 1 =G(t ) (1 − F ( )) + (˜paa˜ − p˜s)f( ) (A.34) ∂p˜s a˜ a˜ a

If za˜ = p˜aa˜ > p˜s then this derivative will be strictly positive, so the only circum-

stance in which the platform will set the subscription price less than az˜ is when az˜ > u

and the subscription price is p˜s = u. If that inequality does not hold, the derivative

above gives gives a relatively straightforward FOC

1 − F ( p˜s ) p˜ − p˜ a˜ a˜ = s a (A.35) p˜s a˜ f( a˜ )

If z > u, then the platform will always set p˜s = u because for any advertising level

in which the subscription price is greater than az˜ , the platform will want to increase

the advertising level. But then as a˜ increases az˜ will surpass p˜s, and the platform will want to increase the subscription price. For z > u, the subscription price will reach

the corner solution where it can no longer be increased after az˜ increases. For u > z,

the advertising level will reach the corner first and the pricing FOC becomes

1 − F (˜ps) =p ˜s − p˜a (A.36) f(˜ps)

Profit is increasing in price if the inverse hazard function on the left hand side is

greater than the margin on subscribing consumers on the right. It follows from log

concavity of f(·) that the inverse hazard rate will be decreasing in the subscription

159 price while the margin is increasing so the FOC can be satisfied at most once. For

ps < p˜s profit is increasing in ps, while it will be decreasing in ps if the opposite

inequality holds. Therefore this FOC will be sufficient for profit maximization.

Proof of Proposition 9

To show that the payment to creators will increase, consider Equation (A.32)

from Lemma 3. The same first order condition can be used in the advertising only

u u equilibrium can be used substituting wˆ = wˆaaFˆ ( aˆ ) for w˜ and pˆ = pˆaaFˆ ( aˆ ) for p˜ Since the platform solves this problem in terms of the respective w and p for both the

advertising-only model and the model with the subscription. It is trivial to show that

p˜ > pˆ, and since

 c  t∗ = g−1 (A.37) w

in both problems then the solution to Equation (A.32) can be thought of as an

implicit function of p, and we can see that the left hand side increases with p, meaning

that the right hand side must as well. Thus G(t∗) must increase with the subscription,

meaning that both creator payments and variety of content will increase.

If consumers do not purchase the subscription when it is available, then they face

more nuisance cost than they would by watching in the advertising-only equlibrium

due to the higher advertising level from Proposition 8. Therefore they would have watched in that equilibrium, and they would have received greater net utility from

doing so.

160 Consumers are better off in the equilibrium with the subscription only if they

purchase the subscription and the price of the subscription is less than the nuisance

cost they would face in the advertising-only equilibrium. i.e. if

aηˆ > p˜s p˜ =⇒ η > s aˆ

But consumers will purchase the subscription if η > p˜s, and aˆ ≤ 1, so unless aˆ = 1

p˜s aˆ > p˜s, and there is a set of consumers who purchase the subscription, but the price of the subscription is greater than the nuisance cost they would have faced in the

advertising-only equilibrium.

The extensive market consumers must be at least weakly better off from the

presence of the subscription because they all choose to watch when their relevant

content creator produces. Since the outside option is still available, they can be no worse off than in the advertising-only equilibrium.

Proof of Lemma 4

As before, a consumer deciding between the free version and not watching will watch iff

u ≥ aη

u which means that for a given value of u, only those consumers for whom η ≤ a will watch. There will be a density j(u) of consumers with any given u, and the

independence of the two population distributions leads to Equation (4.18). To get

Equation (4.22) note that Equation (A.24) can be generalized to

161 ! ∂πˆ ∂ψˆ h i plat = (ˆp − wˆ ) ψˆ − aˆ G(tˆ∗) +ag ˆ (tˆ∗)ψˆ (A.38) ∂aˆ a a ∂aˆ

with the middle term being the only one of ambiguous sign. In the case of

ˆ ∂ψˆ R u¯ u  u u  heterogeneous utility ψ − aˆ ∂aˆ = u F aˆ − aˆ f aˆ j(u)du. Assumption 2 ensures that this derivative will have only one solution since it means that the middle term will be decreasing in a.

Equation (4.23) is simply derived by taking the derivative of profit with respect to

a˜, and setting it equal to 0 ∂π˜ p˜ p˜ p˜ plat = s (1 − J(˜p ))(˜p − a˜p˜ )f( s ) + (˜p − w˜ )F ( s )J(˜p ) ∂a˜ a˜2 s s a a˜ a a a˜ s Z p˜s  u u u  + F ( ) − f( ) j(u)du(˜pa − w˜a) u a˜ a˜ a˜

R p˜s u u u  Since aˆ sets u F ( a˜ ) − a˜ f( a˜ ) j(u)du = 0, the derivative above will be strictly positive at aˆ, and it is easy to show that the second derivative is negative given

Assumption 2 and Assumption 4. The solution to Equation (4.23) is sufficient for

profit maximization anda ˜ must be greater thana ˆ.

Equation (4.24) is derived by taking the derivative of the profit function with

respect top ˜s:    ∂πplat p˜s 1 p˜s p˜s = F ( )j(˜ps) + f( )(1 − J(˜ps)) − F ( )j(˜ps) a˜(˜pa − w˜a) ∂p˜s a˜ a˜ a˜ a˜ 1 p˜ p˜ p˜  − f( s )(1 − J(˜p )) − j(˜p )(1 − F ( s ))(˜p − w˜ ) + (1 − F ( s ))(1 − J(˜p )) G(t˜∗) a˜ a˜ s s a˜ s s a˜ s (A.39)

Focusing on the inside of the parentheses and setting the equation equal to 0 p˜ (1 − F ( s ))(1 − J(˜p )) a˜ s 1 p˜ p˜ = f( s )(1 − J(˜p ))(˜p − a˜p˜ ) + j(˜p )(1 − F ( s ))(˜p − w˜ ) a˜ a˜ s s a s a˜ s s 162 p˜s Divide through by (1 − F ( a˜ ))(1 − J(˜ps))

p˜s f( a˜ ) p˜s − a˜p˜a j(˜ps) 1 = + (˜ps − w˜s) (A.40) p˜s a˜ 1 − J(˜p ) 1 − F ( a˜ ) s

Profit is increasing in p˜s if the left hand side is greater than the right, and decreasing

if the opposite holds true. From log-concavity of f(·) and j(·) the right hand side

is increasing in ps while the left hand side is constant, meaning that the first order

condition, if it can be satisfied, will be sufficient for profit maximization. If this

condition cannot be satisfied then the price will be a corner at u because the right

hand side approaches infinity as ps approaches u¯, meaning that the derivative of profit will be negative in ps at that point.

Proof of Proposition 10

The proofs of parts 1, 2, and 4 are essentially identical to the proofs of the equivalent

points in Proposition 9.

For part 3, note that all consumers for whom u < p˜s will either watch with

advertising or not watch at all when the subscription is available. Consumers in this

subset who do not watch in either equilibrium have identical payoffs. Consumers in

this subset who watch in the advertising-only equilibrium either watch with more ads when the subscription is available or they switch to not watching, and their welfare

must weakly decrease with the introduction of the subscription.

Proof of Lemma 5

Take the derivative of profit with regard to the price of advertising in the advertising-

only model:

163 ∂πplat ∗ p˜s =(1 − H(pad)G(t )F ( ) ∂pa i(pad)   p˜s p˜s p˜s − h(pad)(pa − wa) F ( ) − f( ) (A.41) i(pad) i(pad) i(pad) " p˜s # (1 − H(pad))F ( ) 2 ∗ i(pad) c ∗ G(t ) − wa 0 ∗ p˜s 3 g (t ) (wa(1 − H(pad))F ( )) i(pad) Setting this derivative equal to 0 gives the following FOC

1 − H(p ) p˜ ad F ( s )G(t∗) = h(pad) i(pad) " p˜s #   (1 − H(pad))F ( ) 2 pa − wa p˜s p˜s p˜s ∗ i(pad) c F ( ) − f( ) G(t ) − wa 0 ∗ p˜s 3 v i(pad) i(pad) i(pad) g (t ) (wa(1 − H(pad))F ( )) i(pad) (A.42)

Using Equation (4.12) to rearrange the advertising price FOC

∗   1 − H(pad) p˜s ∗ G(t ) p˜s p˜s p˜s F ( )G(t ) = F ( ) − f( ) [pa − wa + wa] h(pad) i(pad) v i(pad) i(pad) i(pad) (A.43)

Canceling like terms

  1 − H(pad) p˜s p˜s p˜s p˜s F ( ) = F ( ) − f( ) pad (A.44) h(pad) i(pad) i(pad) i(pad) i(pad)

Divide through by F ( p˜s ) i(pad)

p˜s ! 1 − H(p ) f( ) p˜ ad = 1 − i(pad) s p (A.45) p˜s ad h(pad) F ( ) i(pad) i(pad)

From log concavity of h(·),the left hand side will approach 0 as pa increases, and

Assumption 2 and log-concavity of f(·) imply that the right hand side will be strictly

positive and increasing. Profit is increasing if the left hand side is greater than the

right, and decreasing if it is less, so these conditions are sufficient for quasi-concavity.

There must be at most one solution to the FOC above and it is sufficient for profit

164 maximization. If the profit maximizing price is at a corner, it must be at z and not z¯

since pa =z ¯ would imply 0 mass of advertisers which would in turn lead to 0 profit.

In the premium model the advertising pricing equation will again be given by

taking the derivative of profit with regard to the ad price   ∂πplat p˜s p˜s p˜s p˜s = i(pa)F ( ) − h(˜pa) paF ( ) + (˜ps − i(pa)pa) 2 f( ) ∂pa i(pad) i(pad) i(pa) i(pad) (A.46) and setting it equal to 0

 2   1 − H(˜pa) p˜s p˜s p˜s p˜s p˜s p˜s F ( ) = f( ) + pa F ( ) − f( ) (A.47) h(˜pa) i(˜pa) i(˜pa) i(˜pa) i(˜pa) i(˜pa) i(˜pa)

Assumption 2 and Assumption 5 together ensure that increasing pa will decrease

the left hand side relative to the right, which again gives quasi-concavity of profits in

the advertising price and sufficiency of the first order condition for profit maximization.

This is almost identical to the advertising-only FOC, except for the additional term

2 p˜s f( p˜s ) which represents that consumers who switch to free viewing as a result i(pad) i(pad) of the increased advertising price lowering the advertising level. Therefore, with an

interior solution the platform will have a lower advertising price and higher advertising

level than when the subscription was not available.

Proof of Proposition 10

The proofs of parts 1 and 2 are essentially identical to the proof of Proposition 9

For part 3, each advertiser’s total profit is

∗ u G(tˆ )F ( )(z − pˆa) i(ˆpa) in the advertising-only equilibrium and

∗ p˜s G(t˜ )F ( )(z − p˜a) i(˜pa) 165 ∗ ∗ in the equilibrium with the subscription. From Lemma 5, p˜a > pˆa and G(t˜ ) > G(tˆ ), but F ( p˜s ) < F ( u ). The relative size of the effects depends on the f(·) and g(·), i(˜pa i(hatpa so profits can increase or decrease.

166 Appendix B: Duopoly platforms

Suppose that instead of a single monopoly platform, two platforms (denoted 1 and 2) compete in a duopoly setting with single homing consumers and multi-homing sellers. I use the subscript 1 to denote the variables relevant to platform 1 and 2 to those on platform 2. For this section I focus on symmetric equilibrium where all sellers on platform i set the same price given αi, and where α1 = α2. With the additional platform, the timing of the model changes slightly:

1. The platforms set α1 and α2 simultaneously.

2. Consumers commit to participating in one of the platforms’ markets or remaining

with the outside option.

3. Sellers set prices simultaneously on each platform.

4. Consumers participating a platform’s market search among sellers and make

purchasing decisions.

Prices are not observable to consumers so multi-homing sellers set the prices independently on each platform by best responding to consumer search behavior.

Similarly, as consumers commit to a platform before searching, the proportion of

167 low-quality sellers on the other platform does not influence their search behavior. For

i = 1, 2 the derivations from the monopoly model apply:

(1 − αi) 1 − F (Ui) pi = βαi + 1 − αi f (Ui)   (B.1) (1 − αi) 1 − F (Ui) V (αi) = Ui − βαi + 1 − αi f(Ui)

Where Ui is determined by ??, substituting αi for α. The mass of consumers who commit to platform i is given by the continuous and differentiable function

Qi(V (α1),V (α2)), where Qi(V (αi),V (α−i)) is increasing and concave in V (αi) and

decreasing in V (α−i). Furthermore, I assume that

dQ (x, x) dQ (x, x) i = −i ≥ 0 for i = 1, 2 (B.2) dx dx

This assumption implies that neither platform has an inherent competitive advantage.

If both platforms provide the same expected value, and this value increases while

remaining symmetric, then any change in demand must come from the outside

option. Behavior is identical to the monopoly once consumers have committed to

a platform, so the major implication of introducing platform competition is that

consumer participation in a platform’s market will depend on both platforms’ curation

decisions.

Lemma 6. In the symmetric duopoly model, for β and c sufficiently small and if

0 max f (·) sufficiently small then αi = α−i = 0 cannot be an equilibrium.

Proof. Following the same derivation steps as in the proof of Proposition 3, a platform’s

profit is increasing if β is sufficiently small and at αi = α−i = 0

 2 0  ∂Qi(V (0),V (0)) f(Ui) +f (Ui)(1−F (Ui)) 1 + 2 f(Ui) 1 ∂V (αi) (B.3) 2 0 < Q (V (0),V (0)) f(Ui) +f (Ui)(1−F (Ui)) p i 2 i f(Ui) 168 By precisely the same logic as in the proof of Proposition 3, if

 2 0  ∂Qi(V (0),V (0)) f(Ui) +f (Ui)(1−F (Ui)) 1 + f(U )2 lim ∂V (αi) i < ∞ (B.4) f(U )2+f 0(U )(1−F (U )) Ui→¯ Q (V (0),V (0)) i i i i 2 f(Ui) then the above inequality must hold for c sufficiently small.

The logic of the proof of Lemma 6 is similar to that of Proposition 3, and the

intuition is similar as well. If the the platform markets are sufficiently competitive,

and the lemon effect not too strong, then each platform will have an incentive to

obfuscate search even if the other platform has no low-quality sellers.

Assumption 6. β and c are sufficiently small so that αi = α−i = 0 is not an equilibrium.

By nearly the same reasoning as in the monopoly model, platform profit is guaran-

teed to be concave for f(·) sufficiently small across its range, so Assumption 7 gives

sufficiency of the first order condition for platform i0s curation decision.

2 0 f(Ui) +f (Ui)(1−F (Ui)) 2 f(Ui) Assumption 7. The variance of f(·) is sufficiently large so that  2 0  − f(Ui) +f (Ui)(1−F (Ui)) pi 1+ 2 f(Ui) ∂Qi(V (αi),V (α−i)) ∂V (αi) − βBi(αi, α−i) is decreasing in αi Qi(V (αi),V (α−i))

∗ Lemma 6 describes equilibrium curation levels in this environment. Let αD denote the symmetric equilibrium curation decision in the duopoly model, then

Lemma 7. Under Assumption 6 and Assumption 7, if Qi(V (αi),V (α−i)) is continu- ously differentiable then an equilibrium with symmetric α > 0 exists and for i = 1, 2,

∗ αD is determined implicitly by

2 0 ∗ ∗ f(Ui) +f (Ui)(1−F (Ui)) ∂Qi(V (αD),V (αD)) 2 f(Ui) = ∂V (αi) + βB (α∗ , α∗ )  2 0  ∗ ∗ i D D (B.5) f(Ui) +f (Ui)(1−F (Ui)) Qi(V (α ),V (α )) pi 1 + 2 D D f(Ui)

169 Where

∗ ∗ ∗ ∗ 1−F (Ui) ∂Qi(V (αD ),V (αD ))  1−F (Ui)  (1 − F (U )) Qi(V (αD),V (αD)) f(U ) + pi ∂V (α ) Ui − f(U ) ∗ ∗ i i i i (B.6) Bi(αD, αD) = ¯  2 0  (1 − α) R ( − U )f()d ∗ ∗ 1−F (Ui) f(Ui) +f (Ui)(1−F (Ui)) U i Qi(V (α ),V (α )) 1 + 2 i D D f(Ui) f(Ui)

∗ Proof. Bi(αi, αD) approaches ∞ as αi approaches 1, so no platform will ever set αi = 1.

Assumption 6 ensures that αi = α−i = 0 is never an equilibrium. Assumption 7,

Assumption 6 and continuous differentiability of Qi(V (αi),V (α−i)) together ensure

that a solution to Equation (B.5) in αi exists for some range of α−i > 0 and that this

solution is both necessary and sufficient for profit maximization if the optimal response

to α−i is positive. Since V (α) is continuous in α and αi, α−i ∈ [0, 1], the payoff

functions are continuous and (by assumption) single-peaked in the platform’s own

strategies, and the strategy spaces are compact. Therefore by theorem 1.2 in Fudenberg

and Tirole (1991) an equilibrium in pure strategies exists and by Assumption 6 it

must have at least one α > 0.

Furthermore, there must be an equilibrium which has symmetric α > 0 . Define

α(α−i) as platform i’s reaction function. Since Qi(V (αi),V (α−i)) is continuously

differentiable, the solution to Equation (B.5) must vary with α−i continuously. α(α−i)

is the solution to Equation (B.5) if this solution is non-negative or 0 if it is not, so

α(α−i) must be a continuous function. As the domain of α(·) is [0, 1], and the range

is contained in this compact, convex set, Brouwer’s fixed point theorem implies that

there exists α−i such that α(α−i) = α−i. Furthermore, from Assumption 6 this α−i

must be positive, and so the symmetric solution is determined by Equation (B.5).

Lemma 6 states that under relatively mild conditions a symmetric equilibrium

exists where both platforms admit a positive number of low-quality sellers if their

retail markets would be highly competitive without them. This does not rule out the

170 possibility of an asymmetric equilibrium, but I leave analysis of such equilibria for

future work. Proposition 12 compares α∗, the equilibrium curation decision in the

∗ monopoly model, to αD, the equilibrium decision in the duopoly.

Proposition 12. Under Assumptions 1 to 3, if the equilibrium in the multi-platform

∗ ∗ model is symmetric then αD > α if and only if

∂Q (V (α∗),V (α∗)) i Q0(V (α∗)) ∂V (αi) (B.7) ∗ ∗ < ∗ Qi(V (α ),V (α )) Q(V (α ))

That is, the equilibrium proportion of low-quality sellers increases with the addition of a second platform if and only if demand is less elastic in the duopoly platform case.

Proof. Assumptions 1 to 3 ensure sufficiency and necessity of the first order conditions.

The conclusion comes directly from comparing Equation (B.5) to Equation (2.19) and

∗ ∗ ∗ the assumption of concavity in both cases. The presence of Bi(αD, αD) and B(α ) means that the comparison is not immediately obvious. However, it is easy to show

that

∗ ∗ (1 − F (Ui)) 1 Bi(α , α ) = 2 0 D D R ¯ f(Ui) +f (Ui)(1−F (Ui)) (1 − α) ( − Ui)f()d 1 + 2 Ui f(Ui) ∗ ∗  1−F (U )  (B.8) ∂Qi(V (αD),V (αD)) i ! pi Ui − f(U ) + ∂V (αi) i ∗ ∗  2 0  Qi(V (α ),V (α )) 1−F (Ui) f(Ui) +f (Ui)(1−F (Ui)) D D 1 + 2 f(Ui) f(Ui) and

(1 − F (U)) 1 B(α∗) = (1 − α) R ¯( − U)f()d f(U)2+f 0(U)(1−F (U)) U 1 + f(U)2  1−F (U)  (B.9) 0 ∗ ! Q (V (α )) p U − f(U) + Q(V (α∗)) 1−F (U)  f(U)2+f 0(U)(1−F (U))  f(U) 1 + f(U)2

∗ ∗ ∗ Given that both B(α ) and Bi(αD, αD) are positive, the only difference between the ∂Qi(V (αi),V (α−i)) 0 ∗ Q (V (α )) ∂V (αi) two first order conditions is the comparison of ∗ to . If the latter Q(V (α )) Qi(V (αi),V (α−i)) 171 ∗ ∗ ∗ is smaller at αi = α−i = α , then by Assumption 7 αD must be larger than α for Equation (B.5) to hold.

Because the effects of α on price are identical in the two models, any differences in the curation decision must come from differences in the elasticity of the participation

∗ ∗ ∗ function. It would be unreasonable to expect Q(V (α )) < Qi(V (α ),V (α )), however it is entirely possible that consumer participation might be less elastic in the equilibrium of a competitive market. The larger set of choices available to consumers in a duopoly environment means that consumers are more likely to have a strong preference for the platform they choose in equilibrium, (or a strong aversion to both options if they choose the outside option) so a consumer will attract fewer new consumers if it lowers

∂2Q (V (α∗ ),V (α∗ )) α. This effect is strengthened if i D D > 0 because this implies that as ∂V (αi)∂V (α−i) platform −i allows more low-quality sellers, platform i loses fewer consumers when it increases αi.

172 Appendix C: Experimental Instructions

C.0.1 Learning Last Experimental Instructions

Welcome to our decision making study! Thank you for your participation. Please

turn off and put away your cell phones, and please refrain from talking to other

participants during the study. You will receive a payment for your time based on the

results of the experiment. The experiment should take approximately 1 hour. The

average payment will be about $17 but your earnings will depend on your decisions

and on elements of chance.

Drawing Values

Imagine you are searching for a product online. When you find what you’re looking

for, you have the option of taking that deal immediately, or continuing to search.

Searching takes time and effort, but there’s always the possibility of finding an item

that you like more or finding the same item at a lower price. We will be simulating this

search process by having you draw values from a random distribution. You will begin

by entering a “minimum acceptable value” and making a draw from the distribution.

This minimum acceptable value represents the deal with the lowest payoff that is still

high enough you would stop searching after finding it. Each draw costs 5 experimental

currency units, and if you get a draw above your minimum acceptable value then you

173 will be done with that search spell and your payoff will be the value you drew minus

the cost of the draw.

If your draw is below your minimum acceptable value, then you will have two options, you can either accept the highest value you have observed so far at a cost of 5 ex-

perimental currency units, or enter a new minimum acceptable value (which can be

higher, lower, or equal to your previous minimum acceptable value) and draw a new value. If you put a low minimum acceptable value, then you are likely to get a value

above this cutoff in your next draw, but you may end up settling for a very low payoff.

If your minimum acceptable value is very high, then you won’t stop after drawing a

lower value, but it may take a lot of draws to get a value that high.

The distribution you will be drawing from is a “censored normal distribution” with

a low end of 0, a high end of 700 and a mean of 350. That means that every number will be between 0 and 700 but numbers closer to 350 are more likely than numbers

near the end. For example, if you input a minimum acceptable value of 300, then you will have a 70% probability of drawing a value above that number, but the probability

of drawing a value above 400 is 30%. You will be shown a graph with 10,000 random

draws from this distribution and your draws will be displayed on this graph as red

174 dots on the axis. You will be repeating part 1 seven times. The first two times will be

for practice and then your payment for this section will be randomly selected from

among the payoffs in the other five periods.

Drawing Values With an Unknown Distribution

This task will be similar to the first task, but you will be drawing values from a

normal with an unknown mean. The computer will randomly determine a number

anywhere between 150 and 550 to be the mean of the distribution at the start of every

period, but you will not know what that mean is. Each number between 150 and 550

is equally likely to be chosen for the mean. Once the computer has determined the

mean you will choose a minimum acceptable value and draw values just like in the

last section. It is still possible to draw every number between 0 and 700, and numbers

closer to the mean are still more likely. For example, if the mean is 200, then you will

be more likely to draw numbers close to 200 but it is still possible to draw values close

to 700.

Once you have finished drawing values, you will be asked to guess the mean of the

distribution for that search spell. If your guess is within 100 units of the actual value,

then 100 minus the amount by which your guess differs from the actual value will be

added to your payoff for that period. This section will be repeated seven times just as

before, and the computer will draw a new number for the mean of the distribution at

the beginning of each period. The first two periods will be for practice again and then your payment for this section will again be randomly selected from among the payoffs

in the other five periods.

175 Number line task

Once you have finished drawing values, we will ask you to complete a number line

task. You will be shown a number line ranging between 0 and 1000 and a sequence

of numbers. Your task for this part of the experiment is to use your mouse to drag

the diamond as close to where you think each number lies on the line as possible.

We will pick one number at random out of the sequence and if the position of the

diamond is within 100 units of the number shown, then you will be awarded a number

of experimental units equal to 100 minus the amount by which your placement differs

from the number.

Lottery Choice

The next part of the experiment will involve choosing between pairs of lotteries.

Lottery B on the right side has a probability of giving a much higher payout than

lottery A on the left, but also has a possibility of giving a much smaller payout. The

lotteries have an increasing probability of giving the high payout, and we will ask you to pick the minimum probability of the high payout at which you would prefer

the high variance of lottery B to the low variance of lottery A. Once you choose that

minimum probability, the program will automatically select lottery B for all higher

probabilities of receiving the high payout. We will be selecting one of your chosen

lotteries at random to determine your payment for this part of the experiment.

Survey

The last part of the experiment consists of a few questions and a brief survey

assessing how you deal with numeric information. In order to ensure accuracy of this

176 information we ask that you be especially sure not to use any calculators in this part

of the experiment. The answers to these questions will not affect your payment.

Calculating Payment

The computer will randomly select your earnings from one of the five repetitions

after the practice periods from part 1 and one of the five repetitions after the practice

periods from part 2, then add these to your earnings from parts 3 and 4 to determine your total payment. Experimental currency units will be converted to dollars at a rate

of 1ECU to 0.014USD.

Questionnaire

Once you have written down your payment the experiment will conclude with a

brief questionnaire. After you have completed the questionnaire the experimenter will

call out your name and you can collect your payment.

177 Practice Questions

Please fill in the blanks in the following examples. This is just to get you used

to the type of thinking you will be doing in the experiment. Your answers to these

questions will not affect your payment or your standing with the experimenter.

Example 1: You have drawn 3 values from the distribution in part 1

1. If your current highest value is 180 and you decide to accept this value, then

you will get a payout of [ ] for this period.

2. If you decide to draw again with a minimum acceptable value of 250 and you

draw 320, then your final payout for this period will be [ ]

Example 2: You have drawn 7 times from the distribution in part 2

1. If your current highest value is 120 and you decide to accept this value, then

you will get a payout of [ ] for this period.

2. If you decide to draw again with a minimum acceptable value of 400 and you

draw 480, then your final payout for this period will be [ ]

C.0.2 Learning First Instructions

Welcome to our decision making study! Thank you for your participation. Please

turn off and put away your cell phones, and please refrain from talking to other

participants during the study. You will receive a payment for your time based on the

results of the experiment. The experiment should take approximately 1 hour. The

average payment will be about $17 but your earnings will depend on your decisions

and on elements of chance.

178 Drawing Values With an Unknown Distribution

Imagine you are searching for a product online. When you find what you’re looking

for, you have the option of taking that deal immediately, or continuing to search.

Searching takes time and effort, but there’s always the possibility of finding an item

that you like more or finding the same item at a lower price. We will be simulating this

search process by having you draw values from a random distribution. You will begin

by entering a “minimum acceptable value” and making a draw from the distribution.

This minimum acceptable value represents the deal with the lowest payoff that is still

high enough you would stop searching after finding it. Each draw costs 5 experimental

currency units, and if you get a draw above your minimum acceptable value then you will be done with that search spell and your payoff will be the value you drew minus

the cost of the draw.

If your draw is below your minimum acceptable value, then you will have two op-

tions, you can either accept the highest value you have observed so far at a cost of 5

experimental currency units, or enter a new minimum acceptable value (which can

be higher, lower, or equal to your previous minimum acceptable value) and draw

a new value. If you put a low minimum acceptable value, then you are likely to

get a value above this cutoff in your next draw, but you may end up settling for

a very low payoff. If your minimum acceptable value is very high, then you won’t

stop after drawing a lower value, but it may take a lot of draws to get a value that high.

The distribution you will be drawing from is a “censored normal distribution” with a

low end of 0, a high end of 700 and an unknown mean. This means that it is possible to

179 draw every number between 0 and 700, but numbers closer to the mean are more likely.

As an illustrative example: If you input a minimum acceptable value of 300 and

the mean is 350, then you will have a 70% probability of drawing a value above that

number, but the probability of drawing a value above 400 is 30%. On the other hand,

if the mean is 200, then you would only have a 16% probability of drawing a value

above 300, but it is still possible to draw values close to 700. The computer will

randomly determine a number anywhere between 150 and 550 to be the mean of the

distribution at the start of every period, but you will not know what that mean is.

Each number between 150 and 550 is equally likely to be chosen for the mean. Once

the computer has determined the mean you will choose a minimum acceptable value

and begin drawing values. You will be shown the values you draw as red dots on a

number line as in the picture below.

Once you have finished drawing values, you will be asked to guess the mean of

the distribution for that search spell. If your guess is within 100 units of the actual value, then 100 minus the amount by which your guess differs from the actual value will be added to your payoff for that period. This section will be repeated seven

times, and the computer will draw a new number for the mean of the distribution

at the beginning of each period. The first two times will be for practice and then

180 your payment for this section will be randomly selected from among the payoffs in the

other five periods.

Drawing Values With a Known Distribution

This task will be similar to the first task, but you will be drawing values from

a censored normal distribution with a known mean of 350. That means that every

number will be between 0 and 700 but numbers closer to 350 are more likely than

numbers near the end. You will be shown a graph with 10,000 random draws from

this distribution and your draws will be displayed on this graph as red dots on the

axis. The picture above shows an example of what this graph will look like. Since you

know the distribution in this part of the experiment you will not be asked to guess

the mean. You will be repeating this part seven times just as before. The first two

periods will again be for practice and then your payment for this section will again be

randomly selected from among the payoffs in the other five periods.

Number line task

Once you have finished drawing values, we will ask you to complete a number line

task. You will be shown a number line ranging between 0 and 1000 and a sequence

of numbers. Your task for this part of the experiment is to use your mouse to drag

the diamond as close to where you think each number lies on the line as possible.

We will pick one number at random out of the sequence and if the position of the

181 diamond is within 100 units of the number shown, then you will be awarded a number

of experimental units equal to 100 minus the amount by which your placement differs

from the number.

Lottery Choice

The next part of the experiment will involve choosing between pairs of lotteries.

Lottery B on the right side has a probability of giving a much higher payout than

lottery A on the left, but also has a possibility of giving a much smaller payout. The

lotteries have an increasing probability of giving the high payout, and we will ask you to pick the minimum probability of the high payout at which you would prefer

the high variance of lottery B to the low variance of lottery A. Once you choose that

minimum probability, the program will automatically select lottery B for all higher

probabilities of receiving the high payout. We will be selecting one of your chosen

lotteries at random to determine your payment for this part of the experiment.

Survey

The last part of the experiment consists of a few questions and a brief survey

assessing how you deal with numeric information. In order to ensure accuracy of this

information we ask that you be especially sure not to use any calculators in this part

of the experiment. The answers to these questions will not affect your payment.

Calculating Payment

The computer will randomly select your earnings from one of the five repetitions

after the practice periods from part 1 and one of the five repetitions after the practice

periods from part 2, then add these to your earnings from parts 3 and 4 to determine

182 your total payment. Experimental currency units will be converted to dollars at a rate

of 1ECU to 0.014USD.

Questionnaire

Once you have written down your payment the experiment will conclude with a

brief questionnaire. After you have completed the questionnaire the experimenter will

call out your name and you can collect your payment.

183 Practice Questions

Please fill in the blanks in the following examples. This is just to get you used

to the type of thinking you will be doing in the experiment. Your answers to these

questions will not affect your payment or your standing with the experimenter.

Example 1: You have drawn 3 values from the distribution in part 1

1. If your current highest value is 180 and you decide to accept this value, then

you will get a payout of [ ] for this period.

2. If you decide to draw again with a minimum acceptable value of 250 and you

draw 320, then your final payout for this period will be [ ]

Example 2: You have drawn 7 times from the distribution in part 2

1. If your current highest value is 120 and you decide to accept this value, then

you will get a payout of [ ] for this period.

2. If you decide to draw again with a minimum acceptable value of 400 and you

draw 480, then your final payout for this period will be [ ]

184 Appendix D: Media Provision Numerical Example

(a) Triangular distribution of nuisance costs (b) Half-normal distribution

Figure D.1: Variation in subscription price and creator payment by advertising sensitivity. Parameter values: η = 0.25, µ = 0, z = 0.2, c = 0.05, α = 3, u = 1

Figure D.1a shows the influence of η¯ on the subscription price and the creator payments. While the subscription price is higher when consumers are more ad sensitive, the creator payments increase almost imperceptibly. This is because p grows quite slowly with η so the value of additional markets does not significantly increase as

η¯ increases. The subscription price is high, and once it reaches the corner the rate at which additional consumers are driven onto the subscription as η¯ increases is low.

185 While the subscription price in Figure D.1b is lower than that in the triangular model,

it follows the same general pattern of increasing in ad sensitivity while the creator

payments are almost constant.

(a) Triangular distribution of nuisance costs (b) Half-normal distribution

Figure D.2: Variation in t∗ by advertising sensitivity. Parameter values: η = 0.25, µ = 0, z = 0.2, c = 0.05, α = 3, u = 1

Figure D.2a shows the equilibrium content coverage under the advertising-only

and subscription models. As consumers become more sensitive to ads, platform and

creator profits decrease in the advertising-only model since the number of ad views will decrease, decreasing the number of creators who produce. With the addition of

the subscription, increased sensitivity to advertising becomes an advantage for the

platform as more consumers choose to pay the subscription fee rather than watch ads.

Since w is increasing in p (albeit not by much), and more consumers choose to watch

rather than staying with the outside option, the creator revenue increases as average

nuisance cost increases, and this increases the amount of content produced under the

186 subscription. Content coverage follows a similar pattern with the half-normal nuisance

cost distribution, which can be seen in Figure D.2b as the disadvantage the platform

faces when it cannot offer the subscription once again becomes a strength when it can.

(a) Triangular distribution of nuisance costs (b) Half-normal distribution

Figure D.3: Variation in creator participation by curvature of market size function. Parameter values: η = 0.25, µ = 0, z = 0.2, c = 0.05, η¯ = 2.3, σ = 0.5, u = 1

Figure D.3a shows content coverage in the two models as curvature decreases. The

advertising-only coverage is more responsive to the change in curvature than coverage with the subscription. This results from the fact that the increase in creators’ share

of revenue as a result of the subscription is smaller as α increases (see Figure D.4a).

The pattern in Figure D.3b is roughly similar, except that the subscription coverage

level is more responsive to introduction of the subscription as α increases (albeit not visibly) when nuisance costs follow a half-normal distribution. The difference is due to

the fact that the increase in creator payment as a result of the subscription is much

smaller with the half normal nuisance cost distribution and therefore does not have

187 much room to decrease as α increases, meaning that the increase in viewership has

more impact on creator revenue.

(a) Triangular distribution of nuisance costs (b) Half-normal distribution

Figure D.4: Variation in subscription price and creator payment by curvature of g(·). Parameter values: η = 0.25, µ = 0, z = 0.2, c = 0.05, η¯ = 2.3, σ = 0.5, u = 1

Figure D.4a and Figure D.4b depict qualitatively similar variation in the equilibrium

prices as a function of α. The subscription price does not depend on α since subscription

rates depend only on nuisance cost which is distributed independently of content type.

The creator payment in the premium model decreases with α. Market coverage under

the advertising-only model is increasing and p does not vary with α. The marginal

benefit of gaining an additional content market decreases relative to the marginal cost

because an increase in creator payments will involve paying more creators who are

already producing, so the amount by which the platform is willing to increase the

creators’ share of revenue as a result of the subscription decreases.

188 (a) Variation by advertising sensitivity. (b) Variation by curvature of g(·). Triangular distribution of nuisance costs.

(c) Variation by advertising sensitivity. (d) Variation by curvature of g(·). Half normal distribution.

Figure D.5: Variation in ad views. Parameter values: η = 0.25, µ = 0, z = 0.2, c = 0.05, u = 1 and α = 3 or η¯ = 2.3, σ = 0.5

Figure D.5a and Figure D.5b depict the total number of ad views under the two

business models with the triangular distribution of nuisance costs, and Figure D.5c and

Figure D.5d show ad views with the half normal distribution. Total ad views decrease with nuisance cost since more consumers will either not watch or subscribe, and

increase as g(·) flattens since market coverage is increasing. However the effect of the

subscription is ambiguous, with the triangular distribution the consumers who continue

to watch ads are exposed to more advertising since a is moving from an interior solution

189 to the corner at 1, which mitigates the loss from subscribing consumers in addition to the extensive market effect. With the half normal distribution the advertising level is already at the corner so this mitigating effect is not present. Additionally, the magnitude of the extensive change in market coverage is much smaller, so the increase from free viewers on the extensive markets is not sufficient to compensate for the loss of ad views on the intensive markets.

D.0.1 Analytical Solutions for the Triangular Distribution Example.

u  u  u The advertising level solves F a = f a a or

2 η¯ − 1  2 η¯ − 1  1 1 − a = a (D.1) (¯η − η)2 (¯η − η)2 a

1 Impose the technical restriction that η¯ − η ≥ η which ensures that the a which solves the above is between 0 and 1. Then the advertising level in the ad-only model is

∗ 1 a = q (D.2) 2¯ηη − η2

The advertising level is decreasing as both the upper and lower ends of the support of the nuisance cost increase. This is because the higher the average nuisance cost, the fewer consumers will watch at a given advertising level and the lower the ad-view maximizing a will be.

With the addition of the subscription, the advertising level goes to 1 and the subscription price solves

190 1 − Fη(˜ps) =p ˜s − p˜a (D.3) fη(˜ps)

which gives

2 η¯   p˜ = min +p ˜ , 1 (D.4) s 3 2 a

The price is increasing as consumers become more sensitive to ads since this means that for any given price fewer consumers back will use the free version, but also increases in the advertising price. The higher the advertising price, the less the platform loses from converting a subscriber back into a free viewer.

191