The Geneva Papers on Risk and Insurance Theory, 24: 139–158 (1999) © 1999 The Geneva Association

Experimental Tests of Self-Selection and Screening in Insurance Decisions

ZUR SHAPIRA Stern School of Business, New York University, New York, NY 10012-1126

ITZHAK VENEZIA School of Business, Hebrew University, Jerusalem, Israel

Abstract

A major characteristic of insurance markets is information asymmetry, which may lead to phenomena such as adverse selection and moral hazard. Another aspect of markets with asymmetric information is self-selection, which refers to the pattern of choices that individuals with different personal characteristics make when facing a menu of contracts or options. To combat problems of asymmetric information, insurance firms can use screening. That is, they can offer the clients a menu of choices and infer their characteristics from their choices. This article reports the results of several studies that examined the degree to which people behave according to the notions of self-selection and screening. Subjects played the role of either insurance buyers or sellers. The results of these studies provide partial support for the hypothesis that subjects use self-selection and screening in insurance markets. Our study also points to the importance of learning in experimental studies. In one-stage experiments where subjects did not get feedback, screening was not detected. When multistage experiments were conducted, and the subjects learned from experience and were also taught the relevant theories, their decisions were more aligned with screening.

Key words: self-selection, screening, information asymmetry, insurance markets

1. Introduction

Adverse selection is an incentive problem that emerges from informational asymmetry. It refers to the conjecture that people who purchase insurance are not a random sample, but rather a group of individuals with private information about their personal situations that may lead them to obtain higher than average benefits from the insurer under the policy.1 The consequences of adverse selection may be detrimental to competitive markets (cf. Riley [1979], Rothschild and Stiglitz [1976], Spence [1973], Stiglitz and Weiss [1981]). Akerlof’s [1970] classic paper demonstrated that adverse selection, caused by asymmetric information, can eliminate markets. Since then, economists have searched for ways to minimize the handicaps of asymmetric information. For instance, the use of reputation or warranties can help in used car markets, while the introduction of deductibles may reduce adverse selection in insurance markets. In some markets, however, this problem is hard to overcome. Milgrom and Roberts [1992] employ a health care example, that of pregnancy and delivery. If such an item were offered in an insurance policy, the consumers most likely to purchase it would be those planning to bear children in the near future. The unobserved characteristics (or private information) of these potential consumers may have large effects on the cost of insurance policies. Indeed, private insurance companies in the U.S. rarely offer such policies.

Several authors (Beliveau [1984], Dionne and Doherty [1991], Jaynes [1978], Venezia [1991], Wilson [1977]) discussing equilibrium models of insurance markets have suggested that adverse selection can be mitigated by using screening. That is, the insurer can provide a menu of alternative contracts (differing in prices and deductibles) to potential insureds in order to induce self-selection.
The insurer then infers the category of the insureds by their selection of a particular policy.2 For screening to work, some self-selection constraints are necessary (see Milgrom and Roberts [1992]). These should be designed to render it disadvantageous to riskier insureds to buy the same policies as do less risky insureds. Clearly, the behavior of economic agents under the self-selection constraints is rational. In the analysis of insureds’ behavior, an assumption is usually made that the insureds understand these constraints, or that they behave as if they understand them. Is this assumption of rationality borne out by empirical findings on behavior in insurance markets? The answer to this question is not entirely clear. For example, in a survey of over 3000 homeowners living in either flood-prone or earthquake-prone areas of the United States, Kunreuther et al. [1978] found several striking facts that contradict rationality as a basis for insureds’ behavior. Following Kunreuther and Slovic [1978], one can argue that a better understanding of the market failure phenomenon in insurance may be achieved by examining both the attitudes and the information-processing limitations of agents in insurance markets.

The present article reports the findings of a few experiments focusing on the issues of self-selection and screening. The purpose of these experiments was to investigate whether people behave in a way that reflects the instrumentality of these concepts. If people’s behavior does not reveal an understanding of and a belief in the instrumentality of self-selection and screening, perhaps other means for fighting market failure may be needed.

The structure of the article is as follows. In Section 2 we provide a methodological overview of the design of the screening tests. In Section 3 we present some pilot studies, which were performed both to test our hypotheses and to prepare for the main study.
The main experiments are described in Section 4, and a conclusion follows in the last section.

2. Methodological overview

We first tested whether or not subjects self-select. We chose to begin with this test because self-selection is a necessary requirement for screening. Sellers will not initiate screening unless they believe that buyers will self-select.

We presented subjects with the task of purchasing an insurance policy. They were provided with information classifying them into two subgroups, namely, L and H (for Low and High risk, respectively). These subgroups were distinguished by differing probability distributions of damages. We then offered the subjects two alternative insurance policies, one with a deductible and one providing full coverage (i.e., without a deductible). Subjects were requested to buy one policy. The policies and distribution of claims were so designed that the L subjects would prefer the policy with a deductible and the H ones the no-deductible policy, and so that any risk-averse person would prefer buying one of the policies to remaining without insurance. Since an L client is less likely to suffer a loss, his/her chances of having to “pay” the deductible are lower than those of an H client. Therefore, the L client should be less willing to pay the extra price needed to purchase the full coverage policy instead of the policy with a deductible. We then examined whether or not the subjects indeed self-selected.

To test for screening, we presented subjects with the task of selling an insurance policy. Each subject played the role of an insurer (seller of insurance). The sellers were different subjects from those in the buying experiment so as not to give the (self-selection) idea away. The insurer knew the probability distribution of claims of each type of buyer, as well as the proportion of each type of buyers in the particular market. The insurer, however, could not identify a priori which potential insured was L and which was H.
The insurer could offer two insurance policies: one with a deductible (predetermined by us) and one without a deductible. The insurer’s task was to price these policies. This is a simplified version of a situation where screening can be employed. In this experiment, we constructed the distributions of claims so that self-selection would be easier to detect than in the buying experiment. We simplified the screening task; had we not done so, the screening task here would have been much more difficult than in the buying experiment, since the screening task already includes an extra step (i.e., the seller must conjecture that the buyers will self-select, and price accordingly). We tested the screening hypothesis in two ways. First, we randomly assigned each subject to one of two groups, or “markets.” These two groups were presented with descriptions of similar clients, L and H. The two groups differed, however, in the proportions of these two types in their markets. One group (Group A) had predominantly type H clients (75% type H clients and 25% type L clients), and the other group (Group B) had predominantly type L clients (75% type L and 25% type H clients). Under complete screening, prices should be the same in the two groups. This occurs since with complete separation only type L clients will buy the policy with a deductible, and only type H clients will buy the full coverage policy. The insurer in any group will therefore price the full coverage policy according to projected losses and demand of the type H clients only. Similarly, the deductible policy will be priced according to the projected losses and demand of type L clients only. The percentage of clients of each type in the group should therefore not affect pricing. 
We thus tested for screening by checking whether prices differed between the two groups, thereby analyzing whether the proportions of each type of clients affected pricing.3 This method may be quite a stringent test, since it requires the assumption of complete self-selection. Sellers, however, may expect that some of the buyers will not detect the self-selection opportunity and that both policy F (full coverage) and policy D (with a deductible) will be purchased by the two types of clients (that is, a pooling equilibrium may emerge). In this case, the prices will depend on the proportions of type L and type H clients in the market.

Another way of testing the screening hypothesis is by examining how many sellers choose prices that induce self-selection. However, whether or not a set of prices induces self-selection depends on the utility functions of the potential clients, and these were unknown to the sellers. Therefore, we assumed “reasonable” risk-aversion measures, namely, a utility function with constant absolute risk aversion of 0.001 to 0.0025.4 We defined a set of prices as inducing “theoretical” self-selection if for all individuals with the above utilities, the conditions for self-selection hold. We then counted the number of subjects for whom self-selection held.5 This provided a further (albeit incomplete) indication of whether or not screening was intended. It should be noted, however, that some prices may induce self-selection as described above without the seller expressly intending to do so.

The scenario presented to the subjects was a simplified version of the Rothschild–Stiglitz [1976] model. In that model, sellers could determine any price/quantity (deductible) combination. We, however, restricted the possible deductibles in order to simplify the tasks of the subjects and allow them to focus on screening.
The nature of equilibrium in such a complex market depends on the competitive behavior of the participants, on the available information, and on what each seller anticipates from other competitors. Rothschild and Stiglitz [1976] argued that a pooling Nash equilibrium (i.e., an equilibrium where both types of insureds buy the same contract) is impossible. Thus, the only possible equilibrium is a separating one, even though this does not always exist. If one moves from Nash equilibrium to other definitions of equilibrium, then a pooling equilibrium may or may not exist. According to Wilson’s [1977] definition, a pooling equilibrium may exist under the conditions specified above. However, Miyazaki [1977] and Spence [1978], expanding on and refining Wilson’s definition, showed that a pooling equilibrium is impossible, though a separating one always exists (see also Crocker and Snow [1985]). Thus, the nature and existence of equilibrium depends on some factors we cannot control. Screening, though, is a very likely outcome according to theory. It remains to be determined empirically what kind of equilibrium will emerge in our market.
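The “theoretical” self-selection criterion described earlier in this section can be illustrated with a minimal sketch. It is not the authors' procedure (the exact conditions are given in a footnote not reproduced here); it assumes an exponential (CARA) utility u(w) = -exp(-a·w), checks only the two preference constraints (L prefers the deductible policy D, H prefers the full coverage policy F), ignores the option of buying no insurance, and samples the risk-aversion range 0.001 to 0.0025 on a grid. The loss distributions and the $99/$120 menu are those of Tables 1A and 1B; the function names are hypothetical.

```python
import math

# Loss distributions as (loss, probability) pairs, taken from Tables 1A and 1B.
DIST_H = [(0, 0.61), (100, 0.30), (1000, 0.09), (10000, 0.00)]
DIST_L = [(0, 0.89), (100, 0.10), (1000, 0.00), (10000, 0.01)]


def certainty_equivalent_cost(dist, premium, deductible, a):
    """Sure cost that a CARA(a) buyer finds exactly as bad as the policy.

    The buyer's outlay is the premium plus the retained part of the loss,
    min(loss, deductible). CARA utility has no wealth effects, so initial
    wealth can be left out.
    """
    expected_exp = sum(p * math.exp(a * (premium + min(loss, deductible)))
                       for loss, p in dist)
    return math.log(expected_exp) / a


def induces_self_selection(price_d, price_f, deductible=100,
                           a_low=0.001, a_high=0.0025, steps=16):
    """True if, for every a on the grid, L prefers D and H prefers F.

    Assumption: a finite grid stands in for "all individuals" in the range.
    """
    for i in range(steps):
        a = a_low + (a_high - a_low) * i / (steps - 1)
        l_prefers_d = (certainty_equivalent_cost(DIST_L, price_d, deductible, a)
                       < certainty_equivalent_cost(DIST_L, price_f, 0, a))
        h_prefers_f = (certainty_equivalent_cost(DIST_H, price_f, 0, a)
                       < certainty_equivalent_cost(DIST_H, price_d, deductible, a))
        if not (l_prefers_d and h_prefers_f):
            return False
    return True


print(induces_self_selection(price_d=99, price_f=120))  # menu used in Experiment 1
print(induces_self_selection(price_d=99, price_f=100))  # full coverage priced too low
```

Under these assumptions the first menu separates the two types, while the second does not, because the L type then also prefers full coverage.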

3. A pilot study

To ascertain whether self-selection and screening could be induced in an experimental setting, we ran two experiments: one to replicate buyers’ behavior and the other to simulate sellers’ behavior.

3.1. Experiment 1: Buyer’s behavior

Method: Subjects played the role of insurance buyers. They were presented with a distribution of damages and were told that they could (but were not required to) buy insurance to cover these damages. The subjects were divided into two groups that differed in the distribution of damages they were presented with. One of the distributions entailed more risk than the other. We offered the subjects two policies: one with a deductible, and one without (i.e., full insurance). The test consisted of examining whether the subjects with the riskier distribution of damages would buy the full coverage policy, and whether those with a less risky distribution would buy the policy with a deductible.

Subjects: One hundred and seventy-seven Master of Business Administration students from New York University participated in the experiment, which was run during a regularly scheduled class session. Their ages ranged from 23 to 35, with a median of 27.

Procedure: Subjects were presented with one page describing two renters’ insurance policies. The subjects were requested to choose one of the two policies, which offered either full coverage for $120 or coverage with a deductible of $100 and a premium of

Table 1A. Buying insurance.a

Suppose you wish to purchase a renter’s insurance policy to insure your belongings (in the apartment you rent) against fire and theft. Assume that the probabilities of your incurring losses (based on the building and area of the city where you rent your apartment) for the duration of one year are as follows:

H

Loss ($)    Probability
0           0.61
100         0.30
1,000       0.09
10,000      0

There are only two policies that you are offered:
Policy 1: Annual cost is $99 and the deductible is $100.
Policy 2: Annual cost is $120 and the deductible is $0.
Which policy do you choose (circle one: 1 2)?
Please explain your choice.

a High-risk condition.

$99. These prices were on par with the prices of renters’ insurance policies at the lower end of household items value. Two different distributions were used, which were denoted by H and L, respectively (see Tables 1A and 1B). Distribution H had a mean of 120 and distribution L a mean of 110. The expected losses with a deductible of 100 were 99 for L and 81 for H. Each subject received only one description, and the subjects were not aware of our classification of distributions into H and L.6

Results and Discussion: The numbers of subjects choosing policy 1 (with a deductible) and policy 2 (with full coverage) are presented in Table 2. The self-selection hypothesis suggests that subjects, when faced with a menu of options, will select the one that “fits” them best. In other words, high-risk consumers will choose the policy offering full coverage, while low-risk consumers will choose the policy with the deductible. A chi-square test run on the data provided partial support for the self-selection hypothesis (χ²(1) = 2.83, p < .08). The choices of the subjects who received the H distribution appeared to be more in line with the self-selection hypothesis than those of the subjects who received the L distribution. The former clearly preferred the full coverage policy over the deductible policy, whereas the latter were evenly divided between the policies. Why did the subjects faced with the H distribution choose the “right” policy (full coverage), whereas those faced with the L distribution did not? One possibility is risk aversion. The “right” policy for the latter group was the one with a deductible, but if they were risk-averse enough they would have preferred full coverage. To test this, we calculated the optimal decisions according to the expected

Table 1B. Buying insurance.a

Suppose you wish to purchase a renter’s insurance policy to insure your belongings (in the apartment you rent) against fire and theft. Assume that the probabilities of your incurring losses (based on the building and area of the city where you rent your apartment) for the duration of one year are as follows:

L

Loss ($)    Probability
0           0.89
100         0.10
1,000       0
10,000      0.01

There are only two policies that you are offered:
Policy 1: Annual cost is $99 and the deductible is $100.
Policy 2: Annual cost is $120 and the deductible is $0.
Which policy do you choose (circle one: 1 2)?
Please explain your choice.

a Low-risk condition.

Table 2. Number of subjects buying different policies (Experiment 1).

Choice           High risk    Low risk    Total
Full coverage    27           23          50
Deductible       14           27          41
Total            41           50          91

utility of hypothetical clients with distribution of damages L and exponential utility function with a measure of risk aversion ranging from .001 to .0025. We found that such consumers should choose the deductible policy. This led us to believe that reasons other than risk aversion caused the deviation from self-selection. One possibility is misunderstanding of, or aversion to, the consequences of choosing a policy with a deductible. Such an aversion to deductibles has been suggested before (see Schoemaker and Kunreuther [1979]). Also, we determined later (Experiment 4A) that in some cases subjects did not know how to calculate the expected receipts from the insurer under the deductible policy, and hence may have erred in their choices involving such policies.
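For readers who wish to recompute the chi-square statistic from the counts in Table 2, the following is a minimal sketch. The text does not say whether a continuity correction was applied; the sketch computes the statistic both ways, and it is the Yates-corrected value that rounds to the reported 2.83. The function name is hypothetical, and no p-value is computed here.

```python
def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (1 degree of freedom).

    With yates=True, the standard continuity correction subtracts n/2
    from |ad - bc| before squaring.
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Table 2 counts: rows are full coverage and deductible,
# columns are high risk and low risk.
uncorrected = chi_square_2x2(27, 23, 14, 27)
corrected = chi_square_2x2(27, 23, 14, 27, yates=True)
print(round(uncorrected, 2), round(corrected, 2))  # 3.59 2.83
```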

3.2. Experiment 2: Seller’s behavior

The goal of this experiment was to examine whether sellers use screening. Subjects were told they were insurers competing in an insurance market. The potential buyers were described as one of two types: “careful” (type L) or “negligent” (type H), with different potential damage distributions. Subjects were told that they could offer two policies, one providing full coverage (denoted contract F) and the other providing coverage with a deductible (denoted D). Their task was to set a price for each contract.

Method: The subjects were randomly divided into two groups. Subjects in one group (faced with predominantly negligent clients) were told that 75% of the clients were negligent and 25% were careful, but the subjects could not foretell which client was negligent and which was careful. Subjects in the second group (predominantly careful clients) were told that only 25% of the clients were negligent and the rest were careful, and again, the subjects could not identify the character of a certain potential client. The task was framed in the context of selling renters’ insurance policies and is described in Table 3. Subjects were asked to determine prices for two possible policies, F (providing full coverage) and D (with a $100 deductible). We explained how deductibles work and reminded the subjects that lower prices induce higher demand but are less likely to cover losses and provide a profit. If sellers achieve complete self-selection (and hence screening exists), all negligent clients buy contract F and all careful buyers buy contract D. As argued in Section 2, the proportions of careful and negligent customers in the population would not affect pricing in this case. We could then test for screening by analyzing whether prices depend on the given proportion of careful and negligent customers.
Subjects and Procedure: Two hundred and twenty-seven Master of Business Administration students at New York University participated in the experiment. Their ages ranged from 23 to 36, with a median age of 27. The experiment was run in a regularly scheduled class. Subjects were provided with two pages similar to those presented in Table 3. They were requested to set their prices for the two policies and provide an explanation.

Results and Discussion: The median prices set for the two policies by the subjects, according to the different groups they were assigned to, are presented in Table 4. There were significant differences between the prices set for similar policies by the two groups. The prices of contract F were higher, on average, for the group with the greater percentage of high-risk clients than for the group with a smaller percentage of high-risk clients (t = 2.18, p < .05). The same is true for the prices of contract D (t = 2.14, p < .05). The sellers’ prices did not provide support for the screening hypothesis.

4. The main study

The results of the two pilot experiments supported the hypotheses that buyers self-selected but that sellers did not screen. In the main study, we ran four experiments (3A, 3B, 4A, and 4B), with some changes based on the feedback gathered from the pilot studies. The changes made were as follows:

1. We introduced (in Experiments 3B, 4A, and 4B) more explicit incentive schemes to better motivate the subjects to make good decisions.
2. To increase the external validity of the experiment, we conducted some studies (Experiments 3A and 3B) with insurance practitioners. This allows greater generalizability of the results with reference to actual behavior in insurance markets.

Table 3. Selling insurance (Group I).a

Assume that you are an insurance agent. You were offered an opportunity of making a bid for insuring rental apartments through a large organization in the city (N = 1000). Basically, if your bid is accepted you’ll be able to sell policies to these 1000 employees (who will buy personal insurance from you) covering their personal belongings in the apartments they rent, against fire and theft. Assume that the probabilities of damages that these employees may incur (based on their previous insurance records) come from the following two distributions:

A                              B
Loss ($)    Probability        Loss ($)    Probability
0           0.89               0           0.0725
100         0.10               100         .90
1,000       0.00               1,000       .025
10,000      0.01               10,000      .0025

You cannot know which distribution a particular employee “comes” from; the company told you that 75% of the employees “come” from distribution A and 25% from distribution B. What would be the prices you’d charge? Recall that there is competition (other agents can come with more attractive offers). At the same time, in setting the price of the policy you should not forget the potential claims. Expected claims are affected by the policy an employee buys as well as the distribution he “comes” from. Employees are free to choose between the offered policies and may also decide not to buy any policy. Please note that if you price the policy(ies) too high you may have no demand. On the other hand, if you price them too low you may eventually lose money. This potential deal is very important to you as insurance business is declining. Think and decide!
Policy 1: A deductible of $100
Policy 2: A deductible of $0

Decision:
Policy 1 sell/no sell at price $____ each
Policy 2 sell/no sell at price $____ each
Please explain your decision:

a For subjects in Group II, the table reads that 25% of the employees come from A and 75% from B.

Table 4. Median prices set for insurance policies (in $) (Experiment 2).

              High risk    Low risk
Contract F    229.9        149.5
Contract D    148.6        117.8

Table 5. Distribution functions of losses for high- and low-risk clients (Experiments 3B and 4B).

H: High risk                   L: Low risk
Loss ($)    Probability        Loss ($)    Probability
0           0.7                0           0.9
100         0.2                100         0
1200        0.1                1200        0.1

3. To further increase the internal validity of the experiment, we ran a multistage procedure in which the subjects had ample opportunity to learn from feedback and improve their behavior over time (Experiments 4A and 4B).
4. We replaced (in Experiments 3B and 4B) the distributions H and L of Table 3 with the distributions H and L presented in Table 5. We made this change so that the distribution L would stochastically dominate H. Also, with these distributions, the intuition behind labeling them as H (high risk) and L (low risk) is clearer, and the calculation of expected claims is easier. The expected value of H is $140 and that of L is $120. If a $100 deductible is used, the expected values of payments are $110 for both the H and the L distributions.
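The expected values quoted in item 4 can be checked directly from Table 5. The following is a minimal sketch; it assumes that the insurer pays the loss in excess of the deductible and nothing when the loss is at or below it, and the function name is hypothetical.

```python
# Loss distributions from Table 5 as (loss, probability) pairs.
TABLE5_H = [(0, 0.7), (100, 0.2), (1200, 0.1)]
TABLE5_L = [(0, 0.9), (100, 0.0), (1200, 0.1)]


def expected_payment(dist, deductible=0):
    """Expected insurer payment: the part of each loss above the deductible."""
    return sum(p * max(loss - deductible, 0) for loss, p in dist)


for name, dist in (("H", TABLE5_H), ("L", TABLE5_L)):
    full = expected_payment(dist)          # full coverage
    with_ded = expected_payment(dist, 100)  # $100 deductible
    print(name, round(full, 2), round(with_ded, 2))
# H 140.0 110.0
# L 120.0 110.0
```

The two types differ only in the probability of the $100 loss, which a $100 deductible removes entirely from the insurer's payments; this is why the expected payments under the deductible policy coincide.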

We ran two sets of experiments. One set (Experiments 3A and 3B) utilized insurance practitioners as subjects. In the second set (Experiments 4A and 4B), nonprofessional subjects took part in multistage experiments.

4.1. Seller’s behavior, practitioners

4.1.1. Experiment 3A. Method: The method essentially replicated the method of Experiment 2.

Subjects and Procedure: Twenty-seven insurance practitioners whose ages ranged from 30 to 55 years participated in the study. All subjects were enrolled in advanced classes at the College of Insurance in Tel Aviv. They completed the questionnaires during the first 30 minutes of a regular class session and were told that “top performers will be noted.” All the practitioners had at least five years’ experience in the industry. Some were insurance agents, some were insurance agents’ inspectors, some were underwriters, and some owned insurance agencies.

Results and Discussion: The mean prices set for the two policies by the subjects according to the different subgroups they were randomly assigned to (predominantly negligent or predominantly careful clients) are presented in Table 6. As in Experiment 2, there were significant differences between the prices set by the two groups (for both the F and the D policies, the differences were significant at p < .005). This outcome does not provide support for the screening hypothesis according to the first method of testing. We calculated the number of subjects who chose prices that induce theoretical self-selection (that is, prices

Table 6. Prices set for insurance policies (in $) (Experiment 3A).

             High-risk market (75% negligent)    Low-risk market (25% negligent)
             Policy D     Policy F               Policy D     Policy F
Mean         79.40        141.90                 154.79       223.88
Median       69           132.50                 119          210
Std. dev.    37.92        70.46                  73.81        93.54
Range        30–166       59–311                 86–300       117–400
N            12                                  15

a p = 0.004
b p = 0.026
Note: The p values denote the significance level of the difference between mean prices of the same policy (D or F) between the two markets. They are marked a p for policy D and b p for policy F.

inducing self-selection for the wide range of risk-averse clients discussed in Section 2). Out of 27 subjects, 9 offered such prices. Again, this outcome does not provide strong evidence in favor of screening. The results of Experiment 3A are therefore qualitatively similar to those of Experiment 2. It appears that experience as a practitioner in the insurance industry did not induce a higher tendency to behave in line with the screening conjecture.

4.1.2. Experiment 3B. Method: This experiment was similar to Experiment 3A except for the following two modifications: 1) the distributions of claims of potential clients used in this experiment are those presented in Table 5, and 2) the incentives were made clearer and more significant (as described below). We carefully explained to the subjects that their decisions would enter into a simulated market, reflecting as closely as possible a “real” market. The clients in the simulated market had damages and claims as described in the questionnaire. Subjects were told that, based on their prices, the prices of the competitors (the other subjects in the experiment), and the decisions of the simulated clients, we would compute profits for each subject. The profits were calculated as the difference between total revenues, computed as the number of policies sold of each type multiplied by their respective prices, and total claims, simulated according to the number of clients of each type buying each policy and their distribution of claims (see Appendix B for the details of the simulation). Subjects were encouraged to ask clarification questions. The subjects were told that the top five performers (i.e., those with highest profits) would receive prizes (in the form of book coupons that could be redeemed at the largest bookstore chain in Israel). The coupons’ face values were 150, 120, 90, 60, and 30 IS (3.5 IS = $1 US). Prizes were allocated to the subjects according to their profit ranking. In addition to making their decisions, subjects were asked to respond to a few questions after making their choices. As well as offering us some insight into the subjects’ decision process, these questions served as manipulation checks (see Appendix C).

Subjects: Twenty-six insurance practitioners and advanced students of insurance whose ages ranged from 22 to 50 years participated in the study. All subjects were enrolled in

Table 7. Prices set for insurance policies (in $) (Experiment 3B).

             High-risk market (75% negligent)    Low-risk market (25% negligent)
             Policy D     Policy F               Policy D     Policy F
Mean         155.98       210.14                 109.95       155.66
Median       128.75       159.25                 111.75       132.50
Std. dev.    121.66       119.08                 61.08        72.52
Range        25–500       120–550                27.5–200     60–280
N            16                                  10

a p = 0.11
b p = 0.10
Note: The p values denote the significance level of the difference between mean prices of the same policy (D or F) between the two markets. They are marked a p for policy D and b p for policy F.

advanced classes at the College of Insurance in Tel Aviv. They completed the questionnaires during the first 30 minutes of a regular class session. All the subjects except for two had experience in the insurance industry. On average, they had 3.2 years of practical experience.

Results and Discussion: The manipulation in the experiment consisted of dividing the subjects into two groups and providing different proportions of H and L clients to the two groups. In the postexperimental questionnaire, subjects were requested to recall the percentages of H and L in their task. All the subjects correctly reported these percentages. In addition, they were asked what change in premium they would make if the percentages were changed to 50% H and 50% L. Again, all the subjects’ responses indicate that they were perfectly aware of the initial percentages of H and L presented to them in the experimental task and that their decision would change if these percentages changed. The mean prices set for the two policies by the subjects according to the two subgroups they were assigned to (predominantly H or L) are presented in Table 7. As in Experiment 3A, there were significant differences between the prices set by the two groups (in this experiment, p < .1 for the F and for the D policies). Of the 26 subjects, only 9 offered prices that would induce theoretical self-selection. The results are, therefore, qualitatively similar to those of Experiment 3A and do not support the screening hypothesis.
The different distributions and the introduction of a more explicit incentive scheme did not change the conclusions drawn from the results of Experiment 3A.

Subjects’ written answers show that, in making their decisions, almost all subjects calculated some kind of expectations of claims. For the most part, they first calculated the expectations of claims of each type and then calculated an average weighted by the proportions of type H and type L in the population. This outcome indicates that the subjects did not expect self-selection and did not try to screen. In response to the question “How would you change your decision if the population consisted of 50% careful and 50% negligent clients?” most subjects either calculated new prices or said they would change prices, including the direction of price change. In setting their prices, subjects apparently believed that the proportions of high-risk and low-risk clients buying policy D were the same as those buying policy F, and that these proportions equaled the proportions of high- and low-risk clients in the population. They did not take into account the possibility that clients who differ in their risk may also differ in their preference between policies D and F. Moreover, subjects did not consider the possibility of influencing and guiding the different types of clients into buying different policies. This result indicates that the subjects did not try to screen. In the next set of experiments, we analyzed the effect of learning by conducting the experiments over multiple stages.
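The contrast between the weighted-average calculation that most subjects reported and the calculation implied by complete separation can be sketched as follows, using the Table 5 distributions. This is an illustration only: it computes expected claims per policy (a break-even cost, with no loading, demand response, or competition), and the function names are hypothetical.

```python
# Loss distributions from Table 5 as (loss, probability) pairs.
TABLE5_H = [(0, 0.7), (100, 0.2), (1200, 0.1)]
TABLE5_L = [(0, 0.9), (100, 0.0), (1200, 0.1)]
DEDUCTIBLE_D = 100  # policy D; policy F has no deductible


def expected_claim(dist, deductible):
    """Expected insurer payment: the part of each loss above the deductible."""
    return sum(p * max(loss - deductible, 0) for loss, p in dist)


def pooled_cost(share_h, deductible):
    """Expected claim per policy if both types buy it in population proportions."""
    return (share_h * expected_claim(TABLE5_H, deductible)
            + (1 - share_h) * expected_claim(TABLE5_L, deductible))


def separated_cost():
    """Expected claim per policy if only H clients buy F and only L clients buy D."""
    return {"F": expected_claim(TABLE5_H, 0),
            "D": expected_claim(TABLE5_L, DEDUCTIBLE_D)}


for share_h in (0.75, 0.25):
    print(share_h,
          "pooled F:", round(pooled_cost(share_h, 0), 2),
          "pooled D:", round(pooled_cost(share_h, DEDUCTIBLE_D), 2),
          "separated:", {k: round(v, 2) for k, v in separated_cost().items()})
```

Under the weighted-average view the expected claim on policy F moves with the share of H clients (135 at 75% H, 125 at 25% H), whereas under complete separation it is 140 in both markets. With the Table 5 distributions the expected claim on policy D is 110 in every case, so only policy F distinguishes the two views on cost grounds.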

4.2. Seller’s behavior, multiple stages, nonpractitioners

Proponents of the assertion that consumers behave in a rational manner argue that agents in markets have to be able to learn or else suffer a competitive disadvantage. To examine the potential role of learning, we ran an experiment with multiple stages. Two experiments were run in two different locations, as described below.

4.2.1. Experiment 4A. Method: The design of this experiment was similar to that of Experiment 3B except that four rounds of decisions were made, and subjects were provided with additional information following each round. To induce the subjects to think carefully about the problem, they were promised a reward. They were told that the policies they offered would enter into a simulated market and that those who performed best in terms of profits would get extra credit toward their course grade. Since MBA students are quite competitive, this reward was significant. We provided the same explanation of the nature of competition as in the former experiments. In the present experiment, however, we provided a more detailed analysis of how a deductible works, since subjects were not practitioners.

Subjects: Twenty-seven Master of Business Administration students enrolled in a “Risk Management and Insurance” class at the Hebrew University participated in the experiment. Prior to taking this class, the students had completed several courses in economics and statistics and at least one course in finance. Except for one student, none of them had had previous exposure to self-selection models or to the economics of information. The number of subjects varied among rounds, since they were requested to perform the task during class and some missed a class session. The number of subjects ranged from 27 in the first round to 13 in the fourth.

Procedure: The first round of this experiment was identical to Experiment 3A. It was run in the first session of the semester so that prior to that, subjects had no training in insurance. Prior to round 2, the students were taught the theory of asymmetric information and self-selection and how to calculate the expected payments of an insurer that sells a contract with or without a deductible. We also provided them with the business results of their decisions in the first round. They were given the list of all competitors and, for each of the competitors, the following information: prices charged, number of policies sold of each type, total claims made, and profits. Prior to rounds 3 and 4, the students again went over the calculation of expected payments with and without a deductible. The instructor worked through an explicit example. Students were also provided with the business results of the previous rounds.

Table 8. Prices set for insurance policies (in $) (Experiment 4A).

                 High-risk market          Low-risk market
                 (75% negligent)           (25% negligent)
                 Policy D    Policy F      Policy D    Policy F

First round
  Mean           148.21      182.40        75.42       149.08
  Median         117         140           55          152.50
  Std. dev.      90.95       83.62         41.62       15.15
  Range          75–366      125–430       21–132      120–170
  N              15                        12
  ᵃp = 0.02, ᵇp = 0.16

Second round
  Mean           123.4       162.6         160.6       199.3
  Median         130         162           140         172.5
  Std. dev.      23.29       11.93         69.67       82.28
  Range          56–180      150–181       120–366     130–430
  N              5                         10
  ᵃp = 0.20, ᵇp = 0.24

Third round
  Mean           140         163.3         175.83      197
  Median         140         170           137.5       161
  Std. dev.      10.83       14.22         85.47       88.72
  Range          121–150     135–172       125–366     140–430
  N              5                         12
  ᵃp = 0.20, ᵇp = 0.24

Fourth round
  Mean           145.43      170.71        134.17      158.83
  Median         148         175           135         161
  Std. dev.      8.58        7.94          5.21        6.54
  Range          130–155     158–182       126–140     150–165
  N              7                         6
  ᵃp = 0.20, ᵇp = 0.25

Note: The p values denote the significance level of the difference between mean prices of the same policy (D or F) between the two markets. They are marked ᵃp for policy D and ᵇp for policy F.

Results and Discussion: The summary of prices of the policies set by the subjects is presented in Table 8. The difference between the two groups in the first round suggests that there was no screening at this stage. The standard deviation of prices (i.e., the dispersion of prices within each group for each policy) generally declined with rounds. This apparent convergence in prices may be a result of anchoring on a correct calculation of expected values following discussions in class. In the first round, many subjects not only did not consider screening but also erred in calculating expected claims (especially for contract D). The many ways in which the subjects miscalculated expected claims generated a higher variance between subjects than seen in later rounds, where all subjects had the same information concerning expected values. The effect of competition also became better known in later rounds. As the game progressed, subjects became aware that competitors are likely to price around (the commonly known) expected values. They were careful not to deviate too much from these expected values, and hence prices converged over the rounds.

The prices of policy D generally increased over the stages in both markets, but this effect was less marked for policy F. A partial explanation of this is that initially subjects did not know how to calculate the expected damages for policy D. In some cases they determined the price of policy D (which has a deductible of $100) as the price of policy F minus 100 (whereas the true difference between the expected damages was much lower, especially for negligent buyers). Prices of policy D were much lower than the actuarial value of policy D in rounds 1 and 2, given that these policies were to be purchased by negligent buyers. Hence many students suffered “losses” due to these policies. Comparing the prices set by the practitioners in Experiment 3B, who were presented with the same distributions, with those set by the current subjects in round 1, we also note that the practitioners selected much higher prices (and in the simulated market earned higher profits).7 The reason for this may be that the practitioners added a margin of 35%–50%, according to the Israeli industry standards, whereas the students were content with much lower margins. As in Experiments 3A and 3B, we also examined the frequency with which subjects chose prices that induced theoretical self-selection. In the first round, only 8 of 27 subjects selected prices that induced self-selection. The proportion of induced self-selections in this case (.30) is thus slightly lower than in the practitioners group (.34), but not significantly so. In the second round, however, 12 of 15 (80%) selected prices inducing self-selection, and in the third and fourth rounds all subjects set such prices. This result and the small differences between the prices of the two groups in the last rounds indicate that in these rounds screening probably did occur.
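The first-round pricing error described above (setting the price of policy D at the price of policy F minus the $100 deductible) can be illustrated with a small sketch. The loss distribution below is a hypothetical assumption chosen only to make the arithmetic visible; it is not one of the distributions given to the subjects.

```python
# Hypothetical loss distribution as (loss, probability) pairs; an assumption
# for illustration only, not the experimental distribution.
LOSSES = [(0.0, 0.5), (200.0, 0.3), (500.0, 0.2)]
DEDUCTIBLE = 100.0


def expected_claim(loss_dist, deductible=0.0):
    """Expected insurer payment: E[max(X - deductible, 0)]."""
    return sum(prob * max(loss - deductible, 0.0) for loss, prob in loss_dist)


def expected_retained_loss(loss_dist, deductible):
    """Expected amount the client pays out of pocket: E[min(X, deductible)]."""
    return sum(prob * min(loss, deductible) for loss, prob in loss_dist)


full_cover = expected_claim(LOSSES)                # expected claims under F
with_deductible = expected_claim(LOSSES, DEDUCTIBLE)  # expected claims under D
naive_d_value = full_cover - DEDUCTIBLE            # the "F minus 100" shortcut

if __name__ == "__main__":
    print("Expected claims, policy F:", full_cover)
    print("Expected claims, policy D:", with_deductible)
    print("Naive 'F minus 100' figure:", naive_d_value)
    print("True difference F - D:", expected_retained_loss(LOSSES, DEDUCTIBLE))
```

Because the deductible is applied only when a loss occurs, the difference in expected claims equals E[min(X, 100)], which falls below 100 whenever some outcomes involve no loss or a loss under $100. The shortcut therefore understates the expected claims of policy D.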

4.2.2. Experiment 4B. In this experiment, we replicated Experiment 3A with the following differences: 1) the distribution of claims used was that of Table 5 (instead of Table 4), and 2) the incentive scheme was made more precise (as described below).

Subjects: Fourteen Master of Business Administration students enrolled in a “Risk Management and Insurance” class at the Tel Aviv branch of the University of Manchester participated in the experiment. Prior to taking this class, the students had completed several courses in economics and statistics, and at least one course in finance. None had had previous exposure to self-selection models or to the economics of information. The number of subjects varied among rounds, since students were requested to perform the task during class and some missed a class session. They were told that in each group (the predominantly H and predominantly L groups) the three subjects with highest cumulative profits would receive 4, 2, and 1 points, respectively, toward their final grade (a 100-point scale). Since the students are highly motivated to get good grades, these rewards were quite significant.

Results and Discussion: The prices set by the two groups were quite similar in all three rounds (see Table 9), with differences between the groups diminishing over time. This result is quite favorable to the screening hypothesis. However, in computing the percentage of subjects that induced “theoretical” self-selection, we notice that this number changed from 69% in the first round to 54% in the second round and to 57% in the third round. As in Experiment 4A, the variance in prices decreased over the rounds, and prices converged. Overall, the behavior of subjects in this experiment was quite similar to the behavior of subjects in Experiment 4A.

Table 9. Prices set for insurance policies (in $) (Experiment 4B).

                 High-risk market          Low-risk market
                 (75% negligent)           (25% negligent)
                 Policy D    Policy F      Policy D    Policy F

First round
  Mean           115.56      149.33        121.50      102.25
  Median         126         149           116         149.50
  Std. dev.      30.30       15.95         13.68       27.22
  Range          35–143      125–175       110–144     141–209
  N              9                         4
  ᵃp = 0.33, ᵇp = 0.24

Second round
  Mean           135.75      156.13        131         153
  Median         132         156           130         154
  Std. dev.      12.77       11.29         12.13       8.12
  Range          120–150     135–173       116–150     141–165
  N              8                         5
  ᵃp = 0.27, ᵇp = 0.30

Third round
  Mean           138.67      164.44        133.60      154.80
  Median         135         167           128         154
  Std. dev.      10.04       7.20          11.16       4.49
  Range          125–154     154–177       121–147     150–160
  N              9                         5
  ᵃp = 0.23, ᵇp = 0.01

Note: The p values denote the significance level of the difference between mean prices of the same policy (D or F) between the two markets. They are marked ᵃp for policy D and ᵇp for policy F.

5. General discussion and conclusion

Testing for the existence of screening and self-selection is not simple. Nevertheless, a combination of laboratory experiments and field studies would seem to be a good way to go about collecting behavioral data to supplement the economic analysis of insurance markets. The present studies provide clear data about a rather simple case: whether or not screening is present in experimental insurance markets where theory predicts screening is likely to exist. We draw conclusions from this study on two levels, that of methodology and that of insurance aspects. From the methodological perspective, we note that had we concluded Experiments 4A and 4B after the first stage, or considered only the other one-stage studies, we would have rejected the screening hypothesis. It is possible that subjects do not behave in some experimental situations as the theory predicts not because they are irrational or because the theory is wrong, but because of the complexity and ambiguity of the task (see Kunreuther, Hogarth, and Meszaros [1993] and Shapira [1993]). One possible way to account for the subjects’ behavior is to consider the ease with which expected values can be calculated. In the cases presented, the calculation of expected damages was difficult. It is possible that when the situation is ambiguous, subjects base their decisions on a subset of the parameters and hence end up with biased estimates. Such a line of explanation is consistent with Tversky and Kahneman’s [1986] suggestion that expected utility is a good descriptive model when the situation is transparent. If, however, the situation is opaque, an alternative model may do a better job of describing subjects’ behavior. This hypothesis is supported by the effect of learning in this study (see also Einhorn [1980]).
The pricing decisions in Experiments 4A and 4B showed some tendency to converge over time to the screening prices predicted by theory, as the subjects learned more about the relevant parameters and as the problem became better understood. Our findings about the effect of learning agree with those of Friedman [1998] concerning the “Price is Right” anomaly. Friedman found that this anomaly, which has been detected in one-stage experiments, is mitigated or even eliminated if subjects are allowed to learn over several stages of the experiment. He suggests that this outcome may also be true for other anomalies detected in one-stage experiments.

The fact that practitioners, who had just one stage, did not detect the screening possibilities better than the students in the one-stage studies was surprising. We propose that this is another manifestation of a phenomenon that was noted elsewhere (e.g., Camerer [1995]), namely, practitioners either do not bring their real-life expertise into laboratory situations or have difficulties adapting their real-life habits to the slightly different scenarios of experimental settings.

From the insurance point of view, the last stage of Experiments 4A and 4B best replicates reality, since at this stage subjects had experience, feedback, and rewards that depend on performance. Since the evidence for screening at this stage was weakly positive, the screening hypothesis obtains only mild support in this study. This result agrees with the mixed empirical evidence for screening in the literature (see Dionne and Doherty [1993] for evidence in favor of screening, and Browne and Doerpinghaus [1993] and Cutler [1996] for evidence against screening). The results of the experiments also indicate that the clearer the scenario becomes, the more likely screening becomes. This outcome suggests that also in practice, the more transparent the situation, the higher the chances of screening and self-selection.
Further research is necessary to pinpoint and better understand the factors that enhance and those that hinder screening and self-selection.

Appendix A: Conditions for self-selection

Suppose all clients have the same preferences, described by a utility function $U$. Let $W_0$ denote the common initial wealth of each client. Self-selection is achieved if

$$E_G[U(\tilde{W}) \mid F] > E_G[U(\tilde{W}) \mid D] \tag{1}$$

and

$$E_C[U(\tilde{W}) \mid F] < E_C[U(\tilde{W}) \mid D].$$

$E_G$ and $E_C$ denote expectations with respect to the distribution of negligent and careful clients, respectively. $\tilde{W}$ denotes the random final wealth of the insured after buying the insurance (whether D or F). $\tilde{W}$ is therefore given by

$$\tilde{W} = W_0 - \tilde{X} + I_F(\tilde{X}) - P_F, \quad \text{if contract F is chosen, and}$$

$$\tilde{W} = W_0 - \tilde{X} + I_D(\tilde{X}) - P_D, \quad \text{if contract D is chosen,}$$

where $\tilde{X}$ denotes the damages, $I(\tilde{X})$ denotes the payments of the insurer, and $P$ denotes the price of the contract. The indices F or D denote whether the payments and prices are under the full coverage contract or the deductible contract, respectively.
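The two conditions above can be checked numerically for a given pair of prices. The sketch below is illustrative only: it assumes an exponential utility function U(w) = -exp(-a w) (the notes mention exponential utility, but the risk-aversion value used here is an assumption), a $100 deductible, and hypothetical loss distributions and initial wealth. The function names are likewise hypothetical.

```python
import math

# Hypothetical loss distributions as (loss, probability) pairs; these are
# illustrative assumptions, not the distributions used in the experiments.
NEGLIGENT = [(0.0, 0.5), (200.0, 0.3), (500.0, 0.2)]
CAREFUL = [(0.0, 0.8), (200.0, 0.15), (500.0, 0.05)]


def expected_utility(loss_dist, price, deductible, w0, risk_aversion):
    """E[U(W)] with W = W0 - X + I(X) - P and I(X) = max(X - deductible, 0).

    Since X - I(X) = min(X, deductible), final wealth is
    W0 - min(X, deductible) - P. Full coverage corresponds to deductible = 0.
    """
    total = 0.0
    for loss, prob in loss_dist:
        wealth = w0 - min(loss, deductible) - price
        total += prob * -math.exp(-risk_aversion * wealth)
    return total


def induces_self_selection(price_f, price_d, deductible=100.0,
                           w0=1000.0, risk_aversion=0.01):
    """True if negligent clients prefer F and careful clients prefer D."""
    eu = lambda dist, price, ded: expected_utility(
        dist, price, ded, w0, risk_aversion)
    negligent_prefers_f = (eu(NEGLIGENT, price_f, 0.0)
                           > eu(NEGLIGENT, price_d, deductible))
    careful_prefers_d = (eu(CAREFUL, price_f, 0.0)
                         < eu(CAREFUL, price_d, deductible))
    return negligent_prefers_f and careful_prefers_d


if __name__ == "__main__":
    for p_f, p_d in [(150.0, 105.0), (150.0, 140.0), (150.0, 50.0)]:
        print(p_f, p_d, induces_self_selection(p_f, p_d))
```

Under these assumptions, self-selection requires the price gap between F and D to be large enough to attract careful clients to D but small enough that negligent clients still prefer F; a gap that is too small sends both types to F, and one that is too large sends both to D.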

Appendix B: Simulating demands and business results

According to the scenario given in the subjects’ task, there were 1000 potential customers in the market (groups H and L were considered different markets). These 1000 customers were allocated between the subjects according to their prices. The probability that a negligent customer (G) will buy a policy from seller i is given by

$$\mathrm{Prob}(GB_i) = \mathrm{Prob}(G \text{ buys } i) = \frac{\alpha(\bar{P}_D / P_{D,i}) + \beta(\bar{P}_F / P_{F,i})}{\sum_{j=1}^{n} \left[\alpha(\bar{P}_D / P_{D,j}) + \beta(\bar{P}_F / P_{F,j})\right]} \tag{2}$$

where

$\bar{P}_D$ = average price of contract D (coverage with a deductible) in the market,

$P_{D,i}$ = price of contract D offered by seller $i$,

$\bar{P}_F$ and $P_{F,i}$ are defined similarly for contract F (full coverage), and

$n$ = number of sellers in the market.

$\alpha$ and $\beta$ are parameters denoting the relative importance the client attaches to contracts D and F. For a negligent client, $\alpha$ will be smaller than $\beta$. After a seller is selected by a negligent client, the client will choose D or F according to which contract provides the higher expected utility, that is, the insured will choose D if

$$E_G[U(\tilde{W}) \mid D, P_{D,i}] - E_G[U(\tilde{W}) \mid F, P_{F,i}] > 0 \tag{3}$$

and F otherwise. The process of generating demand for a careful client is similar, except that we have interchanged the parameters $\alpha$ and $\beta$ in determining the probability, $\mathrm{Prob}(CB_i)$, of a careful customer buying a policy from seller $i$. We have done so since the careful client is, a priori, more interested in the D contracts of the various sellers. In determining whether a careful buyer that has already chosen seller $i$ will buy policy D or F, we used the same procedure as with the negligent buyer, except that we used in (3) the distribution of the careful insureds.

Thus, the number of clients of any seller $i$ was generated by first determining the probability of negligent clients turning to this seller and multiplying that probability by the number of negligent clients in the market (750 in market H and 250 in market L), determining the number of careful clients turning to this seller, and then splitting those buying D or F according to the above procedure. The profits of each seller $i$ were given by computing

$$\pi_i = P_{Di} T_{Di} + P_{Fi} T_{Fi} - T_{GDi} E_G(\tilde{X} \mid D) - T_{CDi} E_C(\tilde{X} \mid D) - T_{GFi} E_G(\tilde{X} \mid F) - T_{CFi} E_C(\tilde{X} \mid F)$$

where $E(\tilde{X} \mid D)$ and $E(\tilde{X} \mid F)$ denote the expected claims under policies D and F, respectively. $T_{GDi}$, $T_{CDi}$, $T_{GFi}$, and $T_{CFi}$ denote, respectively, the number of negligent clients buying contract D, the number of careful clients buying contract D, the number of negligent clients buying contract F, and the number of careful clients buying contract F.
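The demand allocation and profit computation described in this appendix can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' simulation code: the relative-importance parameters are assumed to enter as multiplicative weights (called `weight_d` and `weight_f` here, both hypothetical names), the total number of D or F policies sold by a seller is assumed to be the sum over the two client types, and all prices, client counts, and expected claims in the example are made up.

```python
def purchase_probabilities(prices_d, prices_f, weight_d, weight_f):
    """Probability that a client of a given type buys from each seller.

    Each seller's score is a weighted sum of the market-average price divided
    by the seller's own price, for contracts D and F; scores are normalized so
    the probabilities across sellers sum to one. Lower prices raise the score.
    """
    mean_d = sum(prices_d) / len(prices_d)
    mean_f = sum(prices_f) / len(prices_f)
    scores = [weight_d * mean_d / p_d + weight_f * mean_f / p_f
              for p_d, p_f in zip(prices_d, prices_f)]
    total = sum(scores)
    return [score / total for score in scores]


def seller_profit(price_d, price_f, counts, expected_claims):
    """Premium income minus expected claims for one seller.

    `counts` and `expected_claims` are dicts keyed by (client_type, contract),
    with client_type in {"G", "C"} (negligent, careful) and contract in
    {"D", "F"}. Policies sold per contract are assumed to be the sum of the
    negligent and careful buyers of that contract.
    """
    sold_d = counts[("G", "D")] + counts[("C", "D")]
    sold_f = counts[("G", "F")] + counts[("C", "F")]
    revenue = price_d * sold_d + price_f * sold_f
    claims = sum(counts[key] * expected_claims[key] for key in counts)
    return revenue - claims


if __name__ == "__main__":
    # Two hypothetical sellers; a negligent client weights F more than D.
    probs = purchase_probabilities([100.0, 120.0], [150.0, 150.0],
                                   weight_d=0.3, weight_f=0.7)
    print("Purchase probabilities:", probs)

    counts = {("G", "D"): 0, ("C", "D"): 100, ("G", "F"): 300, ("C", "F"): 0}
    claims = {("G", "D"): 110.0, ("C", "D"): 35.0,
              ("G", "F"): 160.0, ("C", "F"): 55.0}
    print("Profit:", seller_profit(105.0, 150.0, counts, claims))
```

For a careful client, the same function would be called with the two weights interchanged, reflecting the a priori greater interest of careful clients in the D contracts.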

Appendix C: Questions asked after the decisions were made

1. Describe your decision process and how you determined your prices.
2. How did the deductible affect pricing?
3. How would you change your decisions if the population consisted of 50% careful and 50% negligent clients?
4. What are your chances of winning one of the prizes?
5. Does the population in your world consist mainly of careful or negligent clients?

Acknowledgments

The authors would like to thank Sasson Bar-Yosef, Sari Carp, Lawrence White, and two anonymous referees for helpful comments and suggestions. The financial support of the Galanter foundation, The Krueger Center of Finance, and the Stern School of Business is also gratefully acknowledged. We also thank Yehuda Kahane and the College of Insurance in Tel Aviv for support in data gathering, and Berna Sifonte for technical assistance.

Notes

1. Hemenway [1990] discusses also the possibility of propitious (the opposite of adverse) selection.
2. Dionne and Doherty [1993] provide evidence for screening by automobile insurers in California, whereas Browne and Doerpinghaus [1993] found evidence for adverse selection in medical insurance and Cutler [1996] found evidence for adverse selection in life and .
3. In general, the prices determined would be the results of some competitive game between the participants. The appropriate equilibrium concept to apply to this case is, however, not easy to determine.
4. The choice of the appropriate risk measure is based on the risk measures found and used by Friedman [1974], Keeler, Newhouse, and Phelps [1977], and Venezia [1983]. Since we consider a wide range of risk-aversion measures, and since the exponential utility is a commonly used one, we believe our results are quite reasonable with respect to the choice of a utility function.
5. The exact conditions for self-selection are presented in Appendix A. They impose that all “careful” subjects prefer the deductible policy to full insurance, and that the opposite holds for “negligent” buyers.
6. For these distributions, the higher mean distribution provides lower expected losses when a $100 deductible is given. However, this should not interfere with the examination of whether or not screening and self-selection will occur.
7. We compare the practitioners only to round 1 of the current experiment because the practitioners had only one round.

References

AKERLOF, G. [1970]: “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism,” Quarterly Journal of Economics, 84, 488–500.
BELIVEAU, B. [1984]: “Theoretical and Empirical Aspects of Implicit Information in the Market for Life Insurance,” Journal of Risk and Insurance, 51, 286–307.
BROWNE, M.J. and DOERPINGHAUS, H.I. [1993]: “Informational Asymmetries and Adverse Selection in the Market for Individual Medical Expense Insurance,” Journal of Risk and Insurance, 60, 300–312.
CAMERER, C. [1995]: “Individual Decision Making,” in Handbook of Experimental Economics, J. Kagel and A. Roth (Eds.), New York: Oxford, 587–703.
CROCKER, K.J. and SNOW, A. [1985]: “The Efficiency of Competitive Equilibrium in Insurance Markets with Adverse Selection,” Journal of Public Economics, 26, 207–219.
CUTLER, D. [1996]: “Public Policy for Health Care,” NBER Working Paper 5591, Cambridge, MA.
DIONNE, G. and DOHERTY, N. [1993]: “Adverse Selection, Commitment and Renegotiation: Extension to and Evidence from Insurance Markets,” Working Paper #9301, Université de Paris X-Nanterre, Thema.
DIONNE, G. and DOHERTY, N. [1991]: “Adverse Selection in Insurance Markets: A Selective Survey,” Working Paper #9105, Department of Economic Sciences, Université de Montréal.
EINHORN, H. [1980]: “Learning from Experience and Sub-optimal Rules in Decision Making,” in Cognitive Processes in Choice and Decision Behavior, T. Wallsten (Ed.), Hillsdale, NJ: Erlbaum.
FRIEDMAN, B. [1974]: “Risk Aversion and the Consumer Choice of Health Insurance Option,” Review of Economics and Statistics, 55, 209–214.
FRIEDMAN, D. [1998]: “Monty Hall’s Three Doors: Construction and Deconstruction of a Choice Anomaly,” American Economic Review, 88, 933–946.
HEMENWAY, D. [1990]: “Propitious Selection,” Quarterly Journal of Economics, 105, 1063–1069.
JAYNES, G. [1978]: “Equilibrium in Monopolistically Competitive Insurance Markets,” Journal of Economic Theory, 19, 394–422.
KEELER, E., NEWHOUSE, J., and PHELPS, C. [1977]: “Deductibles and Demand for Medical Care Services: The Theory of a Consumer Facing a Variable Price Schedule under Uncertainty,” Econometrica, 45, 641–655.
KUNREUTHER, H. et al. [1978]: Disaster Insurance Protection: Public Policy Lessons, New York: Wiley.
KUNREUTHER, H. and SLOVIC, P. [1978]: “Economics, Psychology and Protective Behavior,” American Economic Review, 68, 64–69.
KUNREUTHER, H., HOGARTH, R., and MESZAROS, J. [1993]: “Insurer Ambiguity and Market Failure,” Journal of Risk and Uncertainty, 7, 35–52.
MILGROM, P. and ROBERTS, J. [1992]: Economics, Organization and Management, Englewood Cliffs, NJ: Prentice-Hall.
MIYAZAKI, H. [1977]: “The Rat Race and Internal Labor Markets,” Bell Journal of Economics, 8, 394–418.
RILEY, J.G. [1979]: “Information Equilibrium,” Econometrica, 47, 331–359.
ROTHSCHILD, M. and STIGLITZ, J.E. [1976]: “Equilibrium in Competitive Insurance Markets,” Quarterly Journal of Economics, 90, 629–649.
SHAPIRA, Z. [1993]: “Ambiguity and Risk Taking in Organizations,” Journal of Risk and Uncertainty, 7, 89–94.
SHAPIRA, Z. and VENEZIA, I. [1992]: “Size and Frequency of Prizes as Determinants of the Demand for Lotteries,” Organizational Behavior and Human Decision Processes, 52, 307–318.
SCHOEMAKER, P.J. and KUNREUTHER, H. [1979]: “An Experimental Study of Insurance Decisions,” Journal of Risk and Insurance, 46, 603–618.
SPENCE, M. [1973]: Market Signaling: Information Transfer in Hiring and Related Processes, Cambridge, MA: Harvard University Press.
SPENCE, M. [1978]: “Product Differentiation and Performance in Insurance Markets,” Journal of Public Economics, 10, 427–447.
STIGLITZ, J. and WEISS, A. [1981]: “Credit Rationing in Markets with Imperfect Information,” American Economic Review, 71(3), 393–410.
TVERSKY, A. and KAHNEMAN, D. [1986]: “Choice and the Framing of Decisions,” Journal of Business, 59(4, Part II), S259–S278.
VENEZIA, I. [1991]: “Tie-in Arrangements of Life Insurance and Savings: An Economic Rationale,” Journal of Risk and Insurance, 58, 383–396.
VENEZIA, I. [1983]: “Aspects of Optimal Automobile Insurance,” Journal of Risk and Insurance, 51, 63–79.
WILSON, C. [1977]: “A Model of Insurance Markets with Asymmetric Information,” Journal of Economic Theory, 16, 167–207.