University of Amsterdam Amsterdam Business School

MSc Business Economics, Finance track Master Thesis

Asset Pricing and Behavioral Finance Evidence from a Betting Exchange

Rob Clowting June 2017

Supervisor: Florian Peters

Statement of Originality

This document is written by Rob Clowting who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.

Abstract

In this thesis I use sports betting exchanges as a laboratory setting for testing behavioral theories of asset pricing. A betting exchange is a betting market where individual bettors come together and make the market, effectively forming a simple financial market where single-payoff contracts are traded in the form of bets. I use a unique dataset from the betting exchange platform Betfair, from which I collect 601,543 betting contracts spanning the period 2012-2016. I test how prices respond to news in betting markets and whether this response is rational or irrational. I also construct measures of betting volume and of the time the market for a bet was open, and test how these affect the price response. I find that throughout the betting market the price response is inefficient and that overreaction is the leading cause. Examining the effects of betting volume and market opening time, I find that market efficiency increases when betting volume increases. For market time, the results are more ambiguous: in the short to medium run overreaction seems to decrease, but it increases sharply again in the long run. I then present several behavioral theories that may explain these results.

Contents

1 Introduction

2 Literature Review
2.1 Sports Betting Markets
2.2 Behavioral Finance
2.3 Effects of Volume and Time on Market Efficiency

3 Methodology & Hypotheses

4 Data & Summary Statistics

5 Results

6 Additional Analysis

7 Conclusion and Discussion

8 References

9 Appendix

1 Introduction

In his renowned paper on market efficiency and behavioral finance, Fama (1998) counters most research on behavioral explanations of the anomalies found in classical asset pricing theory. He states that no behavioral model has yet produced more convincing results than the rational theory he puts forward. Rational theories of market efficiency state that return premia are linked to aggregate systematic risk, whereas behavioral theory states that returns are generated by investors' cognitive errors and biases in a market that is not completely efficient. One of the biggest obstacles when testing for market efficiency is the joint hypothesis problem as stated by Fama (1971): any test of market efficiency is simultaneously a test of the underlying equilibrium asset pricing model. Regular capital markets are an environment in which such tests are virtually impossible, as aggregate systematic risk is unobservable and the termination date of the asset is often not known. Ideally, an environment should be found where these fundamental problems of testing for market efficiency are absent. I propose that such an environment exists, and that it comes from a perhaps unexpected source: sports betting markets. Sports betting markets have been the subject of some study in the literature because they present a laboratory setting of a simple financial market where single-payoff contracts are traded in the form of bets on sports events. As Moskowitz (2015) argues, either our asset pricing models should explain returns from all markets, or we would have to make separate models for different asset types. Sports betting markets offer the opportunity to test behavioral asset pricing models, as they have three unique features that set them apart from the rational asset pricing framework. First, sports betting markets are fully idiosyncratic, having no relation to aggregate risk in the economy. Second, sports bets have a short and observable termination date at which the true value of the underlying contract is revealed.
Third, the outcome of a bet is completely independent of betting activity. The observable true value of the bet after the betting event has taken place allows for the detection of mispricing. The dataset I use also allows me to observe market opening prices and the price of the bet at market close. I use prices at these three points in time (open, close, terminal value) to test behavioral theories of asset pricing.

Almost all literature on betting markets studies bookmaker betting markets, the traditional and best-known form of betting market, where bookmakers set prices for the bets and act as market makers. Testing any asset pricing theory in these markets will therefore always be biased by bookmaker choices and preferences. However, recent years have seen the rise of so-called betting exchanges, where bookmakers are replaced by individual bettors posting their preferred limit and market orders for bets, creating a trading mechanism similar to that of regular stock exchanges. This effectively creates an environment where individual bettors come together and trade betting contracts, creating an isolated financial market in which asset pricing theories can be tested. This paper takes this betting exchange environment and uses it to test several behavioral theories of asset pricing.

I use a unique dataset from Betfair, currently the largest betting exchange platform worldwide with an annual turnover of 55.3 billion pounds. There is still little research on these betting exchanges, and to the best of my knowledge this paper is the first to test behavioral theories in this environment. Another unique and interesting feature of the Betfair dataset is that it contains measures of market time and market size, allowing me to test how these affect the pricing of bets. More specifically, I study price movements and corrections and apply several behavioral theories to these. Barberis et al. (1998) present a model of underreaction followed by overreaction. Daniel et al. (1998) come up with a model of overreaction followed by further overreaction, and Hong and Stein (1999) present another model of overreaction. Prices can move from open to close for information and non-information reasons and might respond rationally or irrationally to information. From the literature I derive the main objective of this paper, which is to test whether prices respond rationally or irrationally to news, and in the irrational case, whether this is due to pure noise, overreaction or underreaction. Next I test if and how the time a bet was traded and the amount of money that was bet influence these results.

I construct a dataset from the Betfair betting platform that contains all football matches on which bets were placed in the period 2012-2016. This gives me a total of 601,543 betting contracts that cover 250,382 football matches.
I also construct a subset of contracts which were traded for more than 24 hours and on which more than 1000 GBP was bet, and I divide the data into intervals of market time and betting volume.

The results of the first test of rational versus irrational price movements point in one clear direction: prices respond irrationally and this leads to overreaction. For both the full dataset and the subsample the coefficients found are significant and imply overreaction. The difference between the two results is that for the full dataset the hypothesis that price movements are caused by pure noise is not rejected, whereas for the subset it is. These results point to a relationship between overreaction on the one hand and market time and betting volume on the other. To study this apparent relationship, the dataset is then further split up into different market time and betting volume intervals. The effect of betting volume on the efficiency of price movements is clear: increasing betting volume leads to an increase in efficiency. For the highest intervals of betting volume the hypothesis of a rational, efficient response is not rejected. Turning to market time, the results are more ambiguous. In the short to medium run, overreaction appears to decrease the longer a market has been open. However, for the highest intervals, markets open for more than 150 hours, overreaction increases strongly again. Additional analysis using different criteria for market time and betting volume confirms these results. There is no evidence that underreaction plays a role in price movements in betting exchanges.

The first main finding of this paper is that in betting exchanges prices generally respond irrationally to news, leading to overreaction. The second main finding is that this overreaction decreases in betting volume and responds ambiguously to the time a market is open. Linking these results to the literature, Chordia et al. (2008) find that liquidity increases market efficiency, which offers an explanation for the increase in efficiency when betting volume increases. The general overreaction could be evidence for the model of Daniel et al. (1998), who present overconfident investors (or bettors) as a source of mispricing. The model by Hong and Stein (1999) offers an explanation of the long-term increase in overreaction, as it accounts for momentum traders who amplify overreaction when they interact with one another. It should be noted that these are theoretical explanations for the phenomena observed, as this paper does not yet present empirical tests of these theories.

The rest of this thesis is structured as follows. Section 2 discusses the current literature on betting markets, market efficiency and behavioral finance. Section 3 presents the hypotheses of the thesis and the methodology required to test them. Section 4 introduces the data and provides descriptive statistics. Section 5 discusses the main results. Section 6 presents additional analyses of the results found. Finally, Section 7 contains the conclusion and discussion.

2 Literature Review

One of the oldest problems in modern financial theory is that testing for market efficiency is close to impossible because of the joint hypothesis problem, as formulated by Malkiel and Fama (1970) and Fama (1991). The joint hypothesis problem implies that any test of market efficiency is also a test of the underlying asset pricing model. Empirical findings on supposed market inefficiencies can therefore never be fully distinguished from the effects of an incorrect model. Sports betting markets have been the subject of quite extensive research for some years because they can be used as a laboratory for testing asset pricing. Moskowitz (2015) argues that our asset pricing models should explain returns for all asset markets, or we would need different models for different asset markets. Sports betting markets offer an opportunity to circumvent the joint hypothesis problem stated by Fama (1991) for three reasons. (i) Betting markets are fully idiosyncratic: they have no relation to aggregate risk or risk premia in the economy. (ii) The terminal value of a bet is always known and observable after a bet settles, allowing mispricing to be detected. (iii) The outcome of a bet is completely independent of betting activity (assuming absence of ‘match-fixing’). These three characteristics make betting markets an interesting laboratory for testing asset pricing models.

Considering the first feature, Moskowitz (2015) argues that aggregate risk preferences and risk premia might affect the betting market as a whole but should have no effect on the cross-section of games. Rational asset pricing theories should therefore have nothing to say about return predictability for betting contracts. However, sports betting contracts should be subject to the behavioral explanations that have been proposed for anomalous returns in regular financial markets.
Since the betting market is idiosyncratic, contracts are independent of aggregate risk and can therefore tell us more about the role of bettor behavior and information. The second and third features of sports betting markets imply that betting contracts have a short and known termination date, namely the end of the sports event on which the contract was bought. This means that any uncertainty about the value of the contract is resolved at the termination date, providing a ‘true’ value of the contract. Since bettor behavior has no influence on the outcome of the event¹, the outcome is fully exogenous, and as a result any mispricing can be detected. Another hypothesis is that these markets are in fact efficient and that there is no mispricing, which implies that there is no return predictability. The combination of these features, idiosyncrasy and the known true terminal value, makes sports betting markets a useful laboratory for testing behavioral theories.

2.1 Sports Betting Markets

In their paper, Franck et al. (2012) describe a betting market as a simple speculative market, where contracts on some future cash flow are traded. The outcome of a certain event determines the direction of the cash flow. For example, in sports betting this could be ‘Team X wins’, or ‘Player Y scores’. This paper looks at fixed-odds betting, which means the size of the cash flow is determined by the odds. In European online sports betting, odds are generally presented as so-called ‘decimal odds’. For example, odds of 1.58 mean that for every euro bet, the bettor receives 1 × 1.58 euro, i.e. 1.58 times his stake, if he wins the bet. Although odds may change over time, the bettor’s claim is determined by the odds initially taken and is not altered by subsequent price changes.

There are two distinct market forms in sports betting: bookmaker markets and betting exchange markets. Most of the literature on betting markets has focused on bookmaker markets, the traditional form of betting market, where bookmakers act as market makers and set the price, and individual bettors buy the bets from the bookmakers. Bookmakers make money by charging a commission, which is reflected in their odds. There is already some literature on efficiency and mispricing in these markets. Inefficiency in betting markets implies that the odds quoted by a bookmaker do not reflect the true probability of the outcome of the underlying event. Vlastakis et al. (2009), for example, find evidence for the so-called favorite-longshot bias, meaning that betting on favorites (low payoffs with high probability) yields higher returns than betting on ‘longshots’ (high payoffs with low probability). They also find evidence for the overestimation of the home team advantage: bettors tend to have an unjustified belief that the home team has a bigger advantage than it empirically has. Bookmakers seem to be able to exploit these biases with their price setting. Looking at betting strategies, Vlastakis et al. (2009) find that the most profitable betting strategy is the ‘away-favorite’, although it should be noted that this strategy still yields negative average returns due to bookmakers’ commission and price setting.

Why bookmakers may purposely set prices inefficiently is examined in several papers. Kuypers (2000), Franck et al. (2012) and Vlastakis et al. (2009) all provide evidence that bookmakers may maximize profits by setting market-inefficient odds. They bring forward the behavioral argument that bettor biases, such as the favorite-longshot bias and the overestimation of the home team advantage mentioned above, are exploited by bookmakers to maximize profits. This works in two ways. First, bookmakers can exploit bettor biases by quoting market-inefficient odds that are advantageous to their own profits (the bettor takes the loss on efficiency, the bookmaker the gains). Second, because bookmakers charge a commission on bets, higher trading volume can sometimes offset the potential losses from quoting inefficient odds. Another, more intuitive reason described by Franck et al. (2012) is that bookmakers may temporarily set inefficient odds for promotional or advertising reasons. By taking losses on the inefficient odds or promotions during the advertising period, the bookmaker hopes to attract and retain new customers on whom it will make a profit in the long run, since bettors face transaction costs when switching bookmakers.

As the literature on bookmaker betting markets shows, the problem with testing for market efficiency in bookmaker markets is that bookmakers have incentives to purposely set odds inefficiently for profit-maximizing reasons. A test of true market efficiency will therefore always be biased by bookmaker preferences.

¹ This assumes absence of ‘match-fixing’ or large-scale influencing of football matches by (criminal) persons or organizations. Although several instances of match-fixing have come to light over the past years, mostly in lower-level European leagues, I assume that these instances are so rare that they do not influence the betting market in any significant way. To challenge this assumption would be beyond the scope of this paper and beyond the field of economics.
This paper examines a different type of betting platform, the so-called betting exchange, where individual bettors come together and quote the odds at which they are willing to trade. On the betting exchange, the market makers are the bettors themselves and the bookmakers are eliminated from the equation. Betting exchange markets are a relatively new phenomenon in the (online) sports betting community. The biggest betting exchange platform is currently Betfair.com, launched in 2000 and with an annual turnover of £55.3 billion in 2015 (Betfair, 2015). On a betting exchange, prices are set by an auction process of supply and demand, similar to regular stock exchanges. The platform offers the individual bettor the possibility to either buy or sell a bet, similar to going long or short on a stock. The bettor can therefore take the position Team X wins/draws/loses, but also the position Team X does not win/draw/lose. In other words, instead of the bookmaker taking the other side of the contract, in betting exchange markets it is another individual bettor. This creates a continuous double auction process on the betting platform. If bettors with opposing views agree on a price, the platform executes their transaction. The bettor can either submit a limit order and wait for another person to match his price, or place a market order, which will be matched with bets already offered. The order book, which shows the most attractive odds and the corresponding available betting volumes, is displayed publicly on the platform’s website. Betting exchanges earn money by charging a commission on the cash flow of winning bets (ex post) instead of including the commission in the quoted odds (ex ante) as bookmakers do.

Smith et al. (2006, 2009) are amongst the first to study these betting exchanges and compare them to bookmaker markets. They find that betting exchanges offer a significant increase in efficiency compared to bookmaker markets, and that the favorite-longshot bias is less prominent in betting exchange markets than in traditional bookmaker markets. They also test information-based models against risk preference models and find that the information-based model predicts the favorite-longshot bias better than the risk preference model. Franck et al. (2012) study arbitrage opportunities that arise when combining betting on exchange markets with bookmaker markets for the top five European football leagues. They find that with this strategy arbitrage opportunities arise in 19.2% of all matches, resulting in an average positive return on these bets of 1.4%. They conclude that these are not caused by random price differences but by different levels of informational efficiency. Most arbitrage opportunities arise when bookmakers offer inefficiently low-priced bets which can then be sold on the betting exchange for a higher price. They find that bookmaker markets are the main cause of the arbitrage opportunities, as they set prices less efficiently than the betting exchange.
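As a minimal sketch of the contract mechanics described in this section, the Python snippet below computes the profit of the bettor who buys (backs) a bet and of the bettor on the other side of the same bet under decimal odds, and applies an exchange-style commission to positive net profit only. The function names and the 5% commission rate are illustrative assumptions, not taken from any platform's documentation; the stake and odds reuse the 1.58 example from the text.

```python
def back_profit(stake: float, odds: float, won: bool) -> float:
    """Profit of the bettor who buys the bet at decimal odds.

    Decimal odds include the stake, so the net gain on a win is
    stake * (odds - 1); on a loss the stake is forfeited.
    """
    return stake * (odds - 1.0) if won else -stake


def lay_profit(stake: float, odds: float, won: bool) -> float:
    """Profit of the bettor who sells the same bet: the mirror image."""
    return -back_profit(stake, odds, won)


def after_commission(profit: float, commission: float) -> float:
    """Apply an ex-post commission, charged on positive net profit only."""
    return profit * (1.0 - commission) if profit > 0 else profit


# Hypothetical example: 1 euro at decimal odds of 1.58, 5% commission (assumed).
win = back_profit(1.0, 1.58, won=True)
print(round(win, 4))                          # net gain of the backer on a win
print(round(after_commission(win, 0.05), 4))  # the same gain after commission
print(back_profit(1.0, 1.58, won=False))      # the stake is lost otherwise
```

Before commission the two sides of a matched bet always sum to zero, which is what distinguishes the exchange from a bookmaker market where the margin is built into the quoted odds.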

2.2 Behavioral Finance

Since betting markets form a suitable laboratory for testing asset pricing theories and models, they can also be used to study the field of behavioral finance. In his paper studying psychological influences on asset pricing, Hirshleifer (2001) describes heuristics as rules of thumb that affect individual decision making and may therefore also affect decision making for a group of individuals acting in an economic environment. As the literature on betting markets shows, heuristics seem to play a role in betting markets, as there is ample evidence of the favorite-longshot bias and of the overestimation of the home team advantage (Vlastakis et al. (2009), Levitt (2004), Smith et al. (2006, 2009)).

In the behavioral finance literature there are several theories that try to explain behavioral phenomena based on investor beliefs. The focus of this paper will be mainly on over- and underreaction. One of the most cited papers in this area is by Barberis et al. (1998), who present a model that tries to explain two pervasive empirical phenomena: short-term underreaction of stock prices to news and long-term overreaction to a series of good or bad news. Their model is based on systematic errors that investors make in reaction to public news announcements. They argue that conservatism makes people respond insufficiently to initial good or bad news, pushing prices up or down too little. After a series of good or bad news, however, representativeness causes investors to overreact and push prices up or down too far.

Daniel et al. (1998) present a different model in which overconfident economic agents overweight their private signals, leading to overreaction in pricing. They argue that this is reinforced by the so-called self-attribution bias: public news that confirms the investor’s belief increases his confidence, but disconfirming news does not affect the investor’s confidence in his own beliefs. Following this reasoning, initial overconfidence is amplified and leads to even more overconfidence, generating momentum.

Another explanation for the existence of over- and underreaction and momentum is provided by Hong and Stein (1999). Their model does not look at individual investor psychology, but rather focuses on a market where different groups of traders interact. The model assumes there are two types of traders, ‘newswatchers’ and ‘momentum traders’. Newswatchers base their forecasts on private information, and momentum traders base their actions only on the most recent price change. The newswatchers trade on private information, which diffuses slowly through the population of newswatchers. This slow diffusion leads to underreaction of prices in the short run. The momentum traders then start engaging in positive feedback trading. For them, rising prices imply that information is slowly diffusing through the market.
However, because momentum traders cannot observe the extent to which news has diffused through the economy, they keep buying even after prices have reached fundamental value. This generates an overreaction that is only later reversed. Hong et al. (2000) test this model of momentum and information diffusion by looking at firms of different size and with different levels of analyst coverage. They find that momentum is stronger in smaller companies and in companies with lower levels of analyst coverage, consistent with their hypothesis of slow information diffusion.

Linking these behavioral theories to betting markets, Moskowitz (2015) is amongst the first to use sports betting markets as an asset pricing laboratory to test these models of over- and underreaction and momentum. Moskowitz uses data from bookmaker markets and finds that price movements in betting markets are consistent with overreaction models, as described by the model of Daniel et al. (1998). He then examines what may cause this overreaction and finds that momentum exhibits significant predictability for returns, that value exhibits significant but weaker predictability, and that there is no evidence that size predicts returns in any way. However, because he uses bookmaker markets instead of betting exchange markets, the argument remains that bookmaker preferences may affect the results.

2.3 Effects of Volume and Time on Market Efficiency

The dataset used in this paper also features a measure of betting volume for each contract and allows me to construct a measure of how long a certain bet was traded. Betting volume is related to trading activity, market depth and liquidity, and market time may be linked to the response to news arrival and the effective diffusion of news. Unfortunately, there is no literature that focuses on trading volume and liquidity in betting markets. However, the literature on liquidity in regular financial markets provides sufficient guidance. Pagano (1989) presents the relationship between trading volume and liquidity as a feedback loop where the former amplifies the latter and vice versa. He argues that trading volume is positively linked to liquidity, as both speculators and informed traders enter the market as volume increases. Admati and Pfleiderer (1988) point out that “liquidity begets liquidity”. Chordia et al. (2008) are amongst the first to study the link between liquidity and market efficiency and find that an increase in liquidity leads to an increase in efficiency, caused by a rise in arbitrage trading. Chung and Hrazdil (2010) confirm the findings of Chordia et al. in a more extensive study.

The literature on the relationship between time and market efficiency mainly focuses on the arrival of news and how fast it is incorporated into prices. In one of the earlier papers on this subject, Patell and Wolfson (1984) find that the price response takes place within five to ten minutes. Busse and Green (2001) sum up several studies that look at the speed at which prices react and report that prices incorporate news within five to 15 minutes in regular financial markets. However, the papers Busse and Green (2001) cite are somewhat outdated, and in their own research they subsequently find that prices are corrected within one minute for positive news, whereas for negative news this takes around 15 minutes.
To my knowledge there has not been any research on news arrival in betting markets or on market time in general. This makes it difficult to draw conclusions on how prices respond to news in a sports betting market. On the one hand, one could argue that news about sports is often more uncertain and rumor-based and therefore does not have the same power as an earnings announcement would have for a stock. On the other hand, the literature on sports betting markets shows that these markets are for a large part efficient, so one may ask why the response to news would not be efficient as well.

Summarizing, to the best of my knowledge this thesis is the first paper to use a betting exchange to test behavioral models of over- and underreaction. Using a unique dataset from Betfair, the largest betting exchange in the world, I study behavioral models in betting exchanges and how they differ from bookmaker markets. The nature of the betting exchange effectively rules out the argument that bookmaker preferences cause inefficient price setting and thus affect the results. Another unique feature of this paper is that I am able to construct measures of the time the market was open for every contract in my dataset and of how much money was bet on each of them. This allows me to study the effects of time and traded volume on price response and market efficiency, and their implications for behavioral models. Because of the similarities between the functioning of a betting exchange and that of regular financial markets, empirical evidence found in this paper can contribute to the growing literature on behavioral theories in financial markets.

3 Methodology & Hypotheses

This paper uses a unique dataset from the betting exchange platform Betfair. The platform makes all its data available through its website for members who actively bet on the platform. Betting exchanges form a market where individual bettors come together and set prices for betting contracts through a continuous auction process, similar to regular stock exchanges. Instead of a bookmaker quoting odds and acting as a market maker, prices are determined by the individual bettors on the platform. The betting exchange platform charges a commission fee on the bettor’s net profits. For Betfair the commission ranges between 2% and 5%, depending on the bettor’s individual betting activity and volume. Theoretically, odds could range from any number > 1 up to (near-)infinity. In practice betting platforms do not provide odds smaller than 1.01 (two decimals) and cap their odds at around 1000 for the events deemed most unlikely, such as an underdog beating a favorite team by a 10-goal margin in football².

Following Franck et al. (2012), when a bet on the outcome $e$ of a certain event has been matched, bettors hold a contract on some future cash flow. The size of the payoff of the contract is determined by the odds $o_e$, and the direction of the cash flow is determined by the outcome of the underlying event. If he wins, the bettor has to pay the commission $c$ ($0 < c < 1$) on his net winnings; if he loses, he loses his stake. The expected return on a unit stake is then

$$E[R_e] = \pi_e (o_e - 1)(1 - c) + (1 - \pi_e)(-1) = \pi_e \left[ o_e (1 - c) + c \right] - 1, \qquad (1)$$

where $\pi_e$ is the true probability of the outcome $e$ occurring.

The structure of the betting process is as follows. The first time odds for a certain bet are matched and a transaction is executed gives the opening price $P_0$. The market closes at the start of the event, which gives the closing price $P_1$. The event then starts and finishes with the game outcome $P_T$, at which time the true terminal value of the contract is revealed. The figure below shows this timeline of prices and returns, as presented by Moskowitz (2015):

[Figure: timeline of prices and returns, from market open ($P_0$) through market close ($P_1$) to game outcome ($P_T$)]

² Odds higher than 1000 do occur, but not for single-match betting events, which are the focus of this paper. Odds over 1000 are often seen for betting events that span a full tournament or season. For example, at the start of the 2015-2016 Premier League season, bookmakers quoted odds of up to 5000 for Leicester City winning the Premier League title, an event deemed highly unlikely by bookmakers. Leicester famously won the title, leaving bookmakers with big losses because of their price setting (http://www.telegraph.co.uk/news/2016/05/02/leicester-city-win-premier-league-and-cost-bookies-biggest-ever/).
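As a quick numerical check of equation (1), the sketch below evaluates both forms of the expected return for a unit stake and confirms that they coincide. The function names and the example values (a true probability of 0.5, odds of 2.0 and a 5% commission) are illustrative assumptions only.

```python
def expected_return(pi: float, odds: float, commission: float) -> float:
    """First form of equation (1): win (odds - 1) net of commission with
    probability pi, lose the unit stake with probability 1 - pi."""
    return pi * (odds - 1.0) * (1.0 - commission) + (1.0 - pi) * (-1.0)


def expected_return_compact(pi: float, odds: float, commission: float) -> float:
    """Second form of equation (1): pi * [odds * (1 - c) + c] - 1."""
    return pi * (odds * (1.0 - commission) + commission) - 1.0


# Hypothetical inputs: fair odds for a 50% event, with and without commission.
print(expected_return(0.5, 2.0, 0.0))              # no commission
print(round(expected_return(0.5, 2.0, 0.05), 4))   # 5% commission (assumed)
```

With fair odds ($o_e = 1/\pi_e$) and no commission the expected return is zero; any positive commission pushes it below zero, which is why a bettor needs odds above the fair level to break even on an exchange.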

11 The time between the opening price and the closing price may vary between a few minutes up to a few weeks. Since this paper only looks at football data, the time between closing and game outcome is approximately 105 minutes (2 halves of 45 minutes, 15 minutes half-time break and additional injury time)3. As time progresses between opening and close, prices may change for similar reasons as they would in a regular ﬁnancial market. Prices could change if bettors enter the market who think the contract is mispriced or because new information arrives, for example the injury of a key player. Prices can move from open to close for information and non-information reasons and might respond rationally or irrationally to information. Take for example the situation where somewhere between market open and close, a team’s key player is injured. As a response to this, the odds of the bet will change. If this happens for information reasons and rationally, the closing price will be a better predictor of the outcome of the game than the opening price. Also, if price setting happens in a fully rational manner, there will be no return predictability from market close to the end of the game, as the closing price equals the expectation of the terminal value,

$P_1 = E[P_T]$. Intuitively this makes sense, since at market close $P_1$ (the start of the event), all information about the bet’s underlying event should be known and included in the price, from starting line-ups and, for instance, how injury prone the respective players are, to stadium attendance, weather conditions and other factors that might influence the game. Following the same reasoning, price movement from market open to close should have no predictive value for the close-to-end return under the rational hypothesis. This brings up the base regression:

$$R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon \qquad (2)$$

and the following predictions regarding the rational response to information:

3 In playoff rounds, football matches that end in a tie go into 30 minutes of overtime and end with a penalty shoot-out if the overtime does not bring a winner. However, ‘match odds’ betting contracts only apply to the result after the regular playing time of 90 minutes, regardless of overtime being played. Overtime and penalty bets have their own separate contracts on betting platforms.

1. If prices move ($P_0 \neq P_1$) for information reasons and markets respond rationally to the news, then $\beta_1 = 0$

The next prediction follows from the idea that prices could move from market opening to closing for purely non-information reasons, such as investor sentiment or pure noise. In this case the closing price is wrong and the price will be corrected as the game ends and the true price is revealed. The open-close return should then negatively predict the close-end return as prices move back to true value at the terminal date. If there then was no information content in the price movement, prices will fully revert to the original opening price, leading to the second prediction:

2. If prices move ($P_0 \neq P_1$) for non-information reasons, then $\beta_1 = -1$

Another scenario is that prices move for information reasons, but the markets respond irrationally to the news, overreacting or underreacting to news concerning the underlying sports event. This idea of under- and overreaction comes from the theories and models presented by Daniel et al. (1998), Barberis et al. (1998) and Hong and Stein (1999). If this were the case, closing prices would still be wrong, but there would also be predictability of the close-end return from the open-close return. The third prediction then becomes:

3. If prices move ($P_0 \neq P_1$) for information reasons but markets respond irrationally to the news, then

a. $\beta_1 > 0$ if underreaction

b. $\beta_1 < 0$ if overreaction

These three predictions correspond to the information, non-information and irrational information response hypotheses respectively, and these will be tested further in this paper. The three hypotheses are summarized in the figure below, with the remark that if $\beta_1 = -1$, the purely non-information hypothesis is confirmed.
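The three predictions can be illustrated with a small simulation. The following is only a minimal sketch under simplifying assumptions that are not part of the thesis: it works with price changes rather than the return definitions used later, all shocks are standard normal, and the `reaction` parameter (the fraction of the news that is impounded in the closing price) is a hypothetical device for illustration. Under these assumptions the regression slope should be close to 0 for a rational response, close to -1 for pure noise, negative for overreaction and positive for underreaction.

```python
import random


def ols_slope(x, y):
    """Slope of a least-squares regression of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


def simulate_beta(reaction, noise_only=False, n=20000, seed=1):
    """Slope of the close-end price change on the open-close price change.

    reaction:   fraction of the news reflected in the closing price
                (1.0 = rational, above 1 = overreaction, below 1 = underreaction).
    noise_only: if True, there is no news and the open-close move is pure noise.
    """
    rng = random.Random(seed)
    open_close, close_end = [], []
    for _ in range(n):
        news = 0.0 if noise_only else rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0) if noise_only else 0.0
        shock = rng.gauss(0.0, 1.0)      # uncertainty resolved during the game
        move = reaction * news + noise   # P1 - P0
        terminal_move = news + shock     # PT - P0
        open_close.append(move)
        close_end.append(terminal_move - move)  # PT - P1
    return ols_slope(open_close, close_end)


if __name__ == "__main__":
    print("rational      ", simulate_beta(1.0))
    print("pure noise    ", simulate_beta(1.0, noise_only=True))
    print("overreaction  ", simulate_beta(2.0))
    print("underreaction ", simulate_beta(0.5))
```

In this stylized setting the overreaction slope equals -(k)/(1+k) when the closing price moves by (1+k) times the news, so it lies between 0 and -1 unless overreaction is combined with noise.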

4 Data & Summary Statistics

This paper uses a unique dataset from the betting platform Betfair. Data is collected through the website data.betfair.com, which becomes available after the user has reached a certain amount of Betfair points, which are granted based on betting activity and volume. In 2015, the Betfair betting platform had an annual turnover of £55.3 billion (Betfair, 2015). Betfair posts its historical betting data in weekly files, going back as far as June 2004. This paper uses data from the calendar years 2012-2016, with 260 weekly data files as its base. The datasets contain all betting data for all betting events that are available on the Betfair website, ranging from sports events to political elections and other miscellaneous events. Table A1 in the Appendix provides an overview of the variables these datasets contain.

This paper examines the football market and uses the single betting contract that concerns whether a team wins, draws or loses, also called “Match Odds”. As Franck et al. (2012) also describe, the underlying reason is that these bets are the most popular and most frequently traded on betting platforms. When a betting website is opened, these odds are often the first to be displayed. I also drop all bets that were made during the sporting event instead of before the game, as in-play betting is not the focus of this paper. Next I drop all observations for which “VOLUME MATCHED” is smaller than 1, since these observations have no real economic value.

The next step is to determine the odds at market opening and at market close. For each match I observe all odds that have been traded, implying that both a buyer and a seller of the contract were found at those odds. By taking the first time a transaction took place (“FIRST TAKEN” in the dataset) I find the market opening odds.
The market closing odds are determined by taking the latest time at which a transaction took place, given by the “LATEST TAKEN” variable in the dataset. I also construct a measure for market time, which is simply the time between the first and the last time the contract was traded, expressed in hours. I then make sure that the market opening and closing odds are matched with the respective match, so that each single betting contract has a single opening and closing price and there are no duplicates in the dataset. Next I drop all observations where the market time equals exactly 0, meaning that in the data “FIRST TAKEN” equals “LATEST TAKEN”. Doing this drops 216,457 observations, more than a quarter of my total dataset. Since this is a significant amount, I analyze the characteristics of these dropped observations and find that for the vast majority of them only one transaction took place (“NUMBER BETS” equals 2, implying one single matched transaction) with a volume matched lower than GBP 10.

Next I calculate prices and returns from equation (1) and the actual outcome of the match. Following Franck et al. (2012), I assume a Betfair base commission of 5%. This provides me with the return measures for open-close, open-end and close-end. Following Moskowitz (2015), I drop observations where the return from market open to close is smaller than -300% or larger than 300%. Comparing these outliers with the volume matched and market time measures, it seems these values are most likely errors in the data or in its time stamps: for these outliers the market time is mostly shorter than 1 hour and volume matched is low. It does not make sense to include these observations in the data, since they have no real economic significance. I am then left with my cleaned full dataset, which contains 601,543 ‘match odds’ betting contracts that cover 250,382 football matches from the years 2012-2016. Table 1 shows summary statistics on the full dataset.
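The return calculation and the sample filters described above can be sketched as follows. This is an illustrative sketch rather than the code used for the thesis: the function and argument names are hypothetical, the realized return is simply the win and loss branch of equation (1) with the assumed 5% commission, and the open-to-close return is taken as given because its construction is not restated here.

```python
COMMISSION = 0.05  # Betfair base commission assumed in the text


def realized_return(odds, won, commission=COMMISSION):
    """Net return per unit staked when backing a contract at decimal odds.

    A winning bet pays (odds - 1) net of commission; a losing bet loses the stake.
    """
    if won:
        return (odds - 1.0) * (1.0 - commission)
    return -1.0


def expected_return(odds, win_probability, commission=COMMISSION):
    """Expected return from equation (1) for a given true win probability."""
    return win_probability * (odds * (1.0 - commission) + commission) - 1.0


def keep_contract(volume_matched, market_time_hours, open_close_return):
    """Apply the three sample filters described in the text.

    open_close_return is expressed as a fraction (3.0 corresponds to 300%).
    """
    if volume_matched < 1:
        return False  # no real economic value
    if market_time_hours == 0:
        return False  # first and latest transaction coincide
    if abs(open_close_return) > 3.0:
        return False  # outside the -300% to 300% range
    return True


if __name__ == "__main__":
    # A winning contract at odds of 650 returns 616.55 times the stake (61,655%).
    print(realized_return(650, won=True))
    print(keep_contract(volume_matched=250.0, market_time_hours=5.0,
                        open_close_return=-0.02))
```

The first call reproduces the maximum return of 61,655% that the summary statistics report for a winning contract at odds of 650.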

Table 1: Summary Statistics of the full Betfair dataset

This table shows the most important variables that are included in the dataset. Odds are the price at which a contract was traded. The bettor’s stake multiplied by the odds forms the contract’s payoff if the bettor wins the bet. Matched Bets is the variable indicating the number of bets that were matched on a certain contract, with 2 being the minimum as that indicates there is a buyer and a seller of the contract. Volume Matched is the total amount of money matched on a contract in GBP. Market Time is the time in hours between the first time a transaction took place and the last time the contract was traded. Win Flag is a variable taking on the value 1 if the contract’s underlying bet was won and 0 if it was lost.

                  Count      Mean     SD       Min      Max
Odds              601,543    4.34     11.38    1.01     1000
Matched Bets      601,543    33.82    89.67    2        9331
Volume Matched    601,543    2,641    24,509   1        3,795,204
Market Time       601,543    12.12    33.32    0.0003   3,627
Win Flag          601,543    0.34     0.47     0        1
N                 601,543

I then look at how the observations are distributed over volume matched and market time by dividing the full sample of contracts into 5 subsamples of different market time. Table 2 shows descriptive statistics of these subsamples. One of the main points of interest of this table is the fact that more than two-thirds of the observations in the dataset have a market time of less than 6 hours and have relatively low volume matched and number of matched bets. The economic interpretation of these numbers is that a large majority of the data concerns betting contracts where there was only little time between the first time a bet was taken and the last time. In other words, the market was open for only a short time and as a result only a few bets were matched with a relatively low value. As these factors may influence how well a market functions, I also construct a subset of the full dataset that contains contracts that span longer and higher volume markets. Smith et al. (2006) use a minimum of 2000 GBP, as they argue that any observation below that mark would not have enough liquidity to be treated as representative. The criteria I use are a market time of at least 24 hours and at least 1000 pounds in matched volume. I will refer to these datasets as the full dataset and the long-market high-volume subset (or LM-HV dataset).
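As a small illustration of the sample splits just described, the sketch below assigns a contract to one of the five market time subsamples and checks the LM-HV criteria. The function names are hypothetical, and treating each upper interval bound as exclusive is an assumption, since the text does not state how boundary values are handled.

```python
def market_time_bucket(hours):
    """Label of the market time subsample (upper bounds assumed exclusive)."""
    edges = [(6, "0-6h"), (24, "6-24h"), (72, "24-72h"), (150, "72-150h")]
    for upper, label in edges:
        if hours < upper:
            return label
    return ">150h"


def is_lm_hv(market_time_hours, volume_matched_gbp):
    """Long-market high-volume criteria: at least 24 hours and at least GBP 1000."""
    return market_time_hours >= 24 and volume_matched_gbp >= 1000


if __name__ == "__main__":
    print(market_time_bucket(1.67), is_lm_hv(1.67, 144))
    print(market_time_bucket(43.0), is_lm_hv(43.0, 4622))
```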

Table 2: Summary statistics of betting contracts over different market time horizons

This table shows the average market time in hours, volume matched in GBP and the number of matched bets, with standard deviations in parentheses. The total dataset of 601,543 observations is divided into 5 subsets of different market times. Longer markets have higher volume matched and more matched bets on average. More than two-thirds of the observations of the full dataset are contracts with a market time of less than 6 hours.

                  Market Time  Market Time  Market Time  Market Time  Market Time
                  0-6h         6-24h        24-72h       72-150h      >150h
Market Time       1.231        12.81        43.14        102.3        241.1
                  (1.56)       (5.44)       (14.08)      (21.10)      (220.80)
Volume Matched    1,245        2,465        4,622        14,340       67,698
                  (13,785)     (20,659)     (23,389)     (65,268)     (160,637)
Matched Bets      18.32        42.32        73.43        134.6        362.6
                  (45.20)      (86.79)      (107.4)      (207.5)      (471.0)
N                 412,969      60,933       51,137       32,266       28,420

Next I turn to the return distributions, which are calculated using equation (1), summarized in Table 3 and visualized in Figure B1 in the Appendix. For the full dataset, the open-to-end mean return is -7.26% and the close-to-end mean return is -6.07%. This implies that a random bet placed at the time of market opening yields a return of -7.26% on average, and -6.07% if the bet is placed at the time of market closing. The minimum return of a bet is -100%, as this is simply a bettor’s cash flow when the bet is lost. The maximum return in the dataset is 61,655%, corresponding to odds of 650 for the underlying betting contract. Although this might seem like a very high number, a highly unlikely betting outcome can be expected to occur in a total of more than 600,000 betting contracts. Checking the dataset for a possible excess of extreme returns, I find that the second highest return is 4,655% (corresponding to odds of 50), after which the returns gradually decrease in size. The mean of the open-to-close return of the full sample is -1.19% and the minimum and maximum are limited at -300% and 300%, as described in the data cleaning section above. Another noticeable fact is that standard deviations of returns seem to be large, which may be caused by the large part of the dataset that has a fixed loss of -100%.

Examining the LM-HV subsample of the dataset, the results are very similar. Both open-to-end and close-to-end returns are negative, -4.58% and -4.08% respectively. The maximum return for the LM-HV subsample is 2,565% and standard deviations of returns are still relatively large, at 149.34% and 149.51%, but lower than in the full dataset.

Table 3: Summary statistics of returns

This table shows return distributions over different parts of the timeline of betting contracts.

Panel A: Full Dataset
                    Count      Mean      SD        Min        Max
$R_{open:end}$      601,543    -7.26%    171.77%   -100%      61,655%
$R_{close:end}$     601,543    -6.07%    173.75%   -100%      61,655%
$R_{open:close}$    601,543    -1.19%    26.1%     -299.25%   299.25%

Panel B: LM-HV subsample
                    Count      Mean      SD        Min        Max
$R_{open:end}$      43,942     -4.58%    149.34%   -100%      256.50%
$R_{close:end}$     43,942     -4.08%    149.51%   -100%      256.50%
$R_{open:close}$    43,942     -0.50%    19.10%    -298.30%   294.50%

Figure B1 in the Appendix plots the distribution of returns, with the full dataset in the left column and the LM-HV subsample in the right column. Both open-to-end and close-to-end returns have a mass at -1 in both the full sample and the LM-HV sample, representing the fraction of lost bets. For open-to-close returns, both the full dataset and the subset have similar distributions where returns are centered at zero, implying that for the majority of contracts, prices did not move or moved only slightly. The full dataset and the LM-HV subset are further summarized in Table B1 in the Appendix. This table shows skewness, kurtosis and percentiles for matched bets, volume matched and market time. The table shows that in the full dataset the majority of the observations have a relatively low number of matched bets, a low amount of money matched and a short time the market was open. The bottom 50% of the dataset has a maximum of 12 matched bets (6 transactions), with GBP 144 matched during a market that was open for 1.67 hours (100 minutes). From an economic perspective, the question can be asked whether there ever was a functioning market for these observations.

Table 4 reports return correlations for the three return horizons. As expected, open-to-end returns are very highly correlated with close-to-end returns: 0.99 for both the full dataset and the LM-HV subsample. The economic interpretation of this is that prices on average move only slightly between market opening and closing. For the full dataset, open-to-close returns have almost zero correlation with open-to-end returns and are negatively correlated with close-to-end returns. This is different for the LM-HV subsample, where the open-to-close return is slightly positively correlated with the open-to-end return but negatively correlated with close-to-end returns. Comparing these results to the literature, the distributions of returns and correlations seem to be in line with the results found by Moskowitz (2015). While Moskowitz finds return distributions with slightly higher returns and smaller standard deviations, he also finds that returns on the contracts are on average small and negative. The correlations found for the LM-HV subsample also come close to the results found in Moskowitz’ paper.

Table 4: Return Correlations

This table shows correlations of returns for the full dataset (Panel A) and the LM-HV subsample (Panel B).

Panel A: full dataset
                    $R_{open:end}$   $R_{close:end}$   $R_{open:close}$
$R_{open:end}$      1
$R_{close:end}$     0.989            1
$R_{open:close}$    -0.001           -0.151            1

Panel B: LM-HV sample
                    $R_{open:end}$   $R_{close:end}$   $R_{open:close}$
$R_{open:end}$      1
$R_{close:end}$     0.992            1
$R_{open:close}$    0.054            -0.073            1

5 Results

The goal of this paper is to test whether there is predictability of returns through price movements and whether this is affected by market time and betting volume. I will test whether price movements from market close to end can be predicted from market opening to closing prices. Recalling the regression as stated in equation (2):

$$R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon \qquad (3)$$

Table 5 shows the results of testing price movements for the full dataset and the LM-HV sample. For the full dataset, the $\beta_1$ coefficient is -1.0068, very close to -1. Testing the three hypotheses, I reject H1 that the coefficient is zero, I do not reject H2 that $\beta_1$ is equal to -1, and for H3 I reject the underreaction hypothesis ($\beta_1 > 0$) but do not reject the overreaction hypothesis ($\beta_1 < 0$). Turning to the subset of the data, the LM-HV sample has a $\beta_1$ coefficient of -0.5727, which is significant at the 0.1% level. I reject hypotheses 1, 2 and 3a but do not reject hypothesis 3b.

Table 5: Testing price movements for full dataset and LM-HV sample

This table shows the results for the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$, with the full dataset in the left column and the LM-HV sample in the right column. The t-scores are presented in parentheses below the $\beta_1$ coefficient. Also reported are the results for testing the three hypotheses on information versus sentiment in price movements.

                        Full Dataset     LM-HV Sample
$\beta_1$               -1.0068***       -0.5727***
                        (-46.21)         (-4.61)
H1: $\beta_1 = 0$       Reject           Reject
H2: $\beta_1 = -1$      Do not reject    Reject
H3a: $\beta_1 > 0$      Reject           Reject
H3b: $\beta_1 < 0$      Do not reject    Do not reject

* p<0.05, ** p<0.01, *** p<0.001
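The H2 decisions in Table 5 can be roughly retraced from the reported coefficients and t-scores alone. The sketch below is an approximation under two assumptions that the text does not confirm: that each reported t-score is the coefficient divided by its standard error, and that a standard normal approximation is adequate given the large samples. The function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def wald_test(beta, t_vs_zero, null_value):
    """Test H0: beta1 = null_value.

    The standard error is backed out from the reported t-score against zero.
    Returns the z-statistic and its two-sided p-value.
    """
    std_err = abs(beta / t_vs_zero)
    z = (beta - null_value) / std_err
    return z, two_sided_p(z)


if __name__ == "__main__":
    reported = [("Full dataset", -1.0068, -46.21),
                ("LM-HV sample", -0.5727, -4.61)]
    for label, beta, t_score in reported:
        z, p = wald_test(beta, t_score, null_value=-1.0)
        decision = "reject" if p < 0.05 else "do not reject"
        print(f"{label}: z = {z:.2f}, H2 (beta1 = -1): {decision}")
```

Under these assumptions the full dataset coefficient lies well within one standard error of -1, while the LM-HV coefficient lies more than three standard errors above it, which matches the pattern of decisions reported for H2.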

Comparing the results of the full dataset with the LM-HV subsample, for both there is evidence that overreaction plays a role in price movements and that prices do not move for purely informational reasons. The difference between the two that should be noted here is that I cannot reject H2 for the full dataset, while I can for the LM-HV sample. This implies that for the full dataset, I do not reject the hypothesis that price movements are caused purely by noise. In the LM-HV sample, by contrast, I can reject this hypothesis and I am left with not rejecting only H3b, the overreaction hypothesis. As a $\beta_1$ coefficient closer to 0 implies more efficient or rationally responding markets, the LM-HV sample appears to be more efficient than the full sample. This supports the idea that the time a market was open for a contract and its underlying betting volume influence price movements, which makes sense from both an intuitive and an economic perspective.

Comparing these first regression results to the literature, Moskowitz (2015) finds $\beta_1$ coefficients between -1.43 and -0.19 for the match odds contracts in his dataset of American sports. He finds different degrees of efficiency for different contracts and sports. While Moskowitz differentiates between different contracts and sports⁴, his dataset does not allow him to examine the effects of market time and volume like the Betfair dataset does. This paper only examines one type of contract (match odds), but will further examine the influence of market time and betting volume on price movements in betting exchanges.

4 Moskowitz examines contracts for different US sports: NBA (basketball), NFL (American football), MLB (baseball) and NHL (ice hockey). He also studies different contracts: Point spread (difference between points scored between teams), Moneyline (winner/loser/tie, same as match odds used in this paper) and Over/Under contracts (total points scored).

To further test this influence of market size and market time, I run the same regression using different intervals for market size and time and summarize these results in Table 6. Panel A shows results for different market time intervals, and Panel B shows different intervals for volume matched. Perhaps the most striking finding of Table 6 is that the $\beta_1$ coefficient is never significantly positive for any interval of either market time or volume matched, thus effectively ruling out the underreaction hypothesis. Although not all are significantly so, all coefficients found are negative and together form evidence of the non-information hypothesis or the overreaction hypothesis. Figure C1 in the Appendix plots these coefficients with their respective 95% confidence intervals.

Focusing on Panel A, the results found are slightly ambiguous. Intuitively I would expect markets that are open longer to be more efficient, but the results found do not fully confirm this idea. For the shortest time intervals in the first two columns, representing market times of less than 6 hours and 6-12 hours, I fail to reject the pure noise hypothesis that $\beta_1 = -1$. Moving further right in the table I find results for longer market time intervals. Between 24 hours and 250 hours, there seems to be some evidence that markets become more efficient, as the $\beta_1$ coefficient ranges between 0 and -1 for these market time intervals. However, only for the 48-96 hour interval can I safely reject the second hypothesis that price movements are caused by pure noise. For the 150-250 hour interval, I find an insignificant coefficient of -0.294, for which I fail to reject any hypothesis. For markets longer than 250 hours, something remarkable happens. The $\beta_1$ coefficients in the last two columns are -2.131 and -2.044 and are both significant at the 0.1% level.

Turning to Panel B, the results found seem to be more in line with what is expected from a theoretical or intuitive viewpoint. As betting volume increases, overreaction seems to decrease for the intervals of 1 GBP up to 100,000 GBP. For the fourth column, the interval of betting contracts with 10,000-100,000 GBP matched, I do not reject the first hypothesis that $\beta_1 = 0$, which implies that prices move for information reasons and markets respond rationally to the news. Also, this fourth column is the only column in both Panel A and B where I can simultaneously reject H2 and not reject H1, meaning that for this subset of data price movements could be rational and are not caused by pure noise. Generally speaking, the results found in Panel B seem to confirm the idea that as betting volume increases, the betting market becomes more efficient as $\beta_1$ approaches zero.

While the results in Table 6 and Figure C1 at first glance appear to show evidence that efficiency increases for longer market times, there is an anomaly in these results that cannot be ignored. For the two longest market time intervals of 250-350 hours and more than 350 hours, overreaction increases sharply to a $\beta_1$ of around -2. A similar pattern is seen for the contracts with more than 100,000 GBP matched in Panel B, where overreaction also increases again after first decreasing for larger volume matched intervals, although this effect seems to be smaller.

In order to gain a better understanding of these numbers, I perform two additional analyses. I combine Panel A and Panel B from Table 6 in Table C1 in the Appendix, splitting the data further into more detailed market time and volume matched intervals to see if this sheds more light on the results found. Further right in the table are larger market time intervals, further down are larger volume matched intervals. For each interval the table reports $\beta_1$, the t-score and the number of observations in the interval. Every coefficient reported represents a unique interval, so that the results only apply to observations within the specified market time and volume matched interval. For example, the result of the first row and first column in the top left corner only reports outcomes for observations for which market time was 0-6 hours and which had 1-100 GBP volume matched. Similar to Table 6, almost all $\beta_1$ coefficients found are negative, implying overreaction. Reading the table from top to bottom, for most columns it seems that increasing betting volume leads to a decrease in overreaction, as was also found in Panel B of Table 6. I

Table 6: Market Time and Volume Matched Intervals

This table shows the results for the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$ for different market time and volume matched intervals. T-scores are presented in parentheses below the $\beta_1$ coefficients and below that the number of observations is shown. The bottom four rows of each panel present results for testing the four hypotheses. All tests are significant at the 0.1% level unless stated otherwise with asterisks. Market time is displayed in hours, volume matched in GBP.

Panel A: Market Time
            < 6h        6-12h       12-24h      24-48h      48-96h      96-150h     150-250h    250-350h    > 350h
$\beta_1$   -0.998***   -1.138***   -1.312***   -0.904***   -0.533**    -0.775**    -0.294      -2.131***   -2.044***
            (-42.62)    (-13.28)    (-11.01)    (-6.72)     (-2.84)     (-3.08)     (-0.56)     (-5.05)     (-4.56)
N           412,969     60,933      51,137      32,266      28,420      12,247      2,801       456         318
H1          reject      reject      reject      reject      reject**    reject**    no reject   reject      reject
H2          no reject   no reject   reject**    no reject   reject*     no reject   no reject   reject**    reject**
H3a         reject      reject      reject      reject      reject      reject      no reject   reject      reject
H3b         no reject   no reject   no reject   no reject   no reject** no reject** no reject   no reject   no reject

Panel B: Volume Matched
            1-100       100-1,000   1k-10k      10k-100k    >100k
$\beta_1$   -1.249***   -0.977***   -0.440***   -0.0446     -0.607**
            (-38.78)    (-27.41)    (-7.69)     (-0.33)     (-3.22)
N           263,205     220,951     95,236      19,759      2,487
H1          reject      reject      reject      no reject   reject**
H2          reject      no reject   reject      reject      reject*
H3a         reject      reject      reject      no reject   reject**
H3b         no reject   no reject   no reject   no reject   no reject

* p<0.05, ** p<0.01, *** p<0.001

fail to reject the market efficiency hypothesis $\beta_1 = 0$ for most of the relatively short intervals with high volume matched. It should be noted that for these higher volume results I also often fail to reject the pure noise hypothesis that $\beta_1 = -1$, due to insignificance of the coefficients. One observation stands out, which is the fourth row of the first column (0-6 hours, 10k-100k GBP). This coefficient is the only one that is significantly positive, at 0.375 (significant at the 5% level), and for it I reject both the overreaction hypothesis and the efficiency hypothesis, therefore implying underreaction. Although isolated and possibly due to chance, a possible explanation for this is that due to the short time the market was open and the high volume that was bet on the contract, prices did not get a chance to sufficiently react to news, leading to underreaction.

Reading the table from left to right, it is more difficult to draw a conclusion or find a pattern in the outcomes. For the first two rows with low betting volume, $\beta_1$ coefficients are mostly negative around -1 and become insignificant as market time increases. This implies that overreaction is strong and often insignificantly different from -1, the pure noise hypothesis. Again, economically one can ask whether there ever was a truly functioning market when betting lines were open for several days but only a small amount of money was matched on these bets. The bottom two rows do not show a particular pattern for the first market time intervals, as most coefficients are insignificant. Interestingly, as market time increases, overreaction seems to increase too in these rows, as I find significant coefficients that signal a high degree of overreaction, with $\beta_1$ between -1.177 and -2.130.

Summarizing the results from Table 6, Table C1 and Figure C1, I can draw two conclusions. First, overreaction decreases in volume matched, implying increasing market efficiency as betting volume increases. Second, overreaction appears at first to decrease in market time, but then increases again. Slightly generalizing the results, overreaction decreases for the intervals up to 96 hours, but then increases again for the intervals longer than 96 hours. It seems that after a certain threshold, the longer a market is open and the more is bet, the more overreaction increases again.

Linking the overreaction found to the literature, the results are in line with several theories and predictions found in the behavioral finance literature. Daniel et al. (1998) predict that overconfident investors overweight their private signals, leading to overreaction in prices. Looking at gambling markets in general, it could be argued that these markets are characterized by a larger ratio of ‘overconfident’ agents versus purely rational or institutional agents compared to regular financial markets. This would explain the general tendency for prices to overreact, but not the pattern I observe in the results of decreasing overreaction followed by an increase in overreaction for markets that are open longer. A stronger case can be made for the argument of positive feedback trading, brought forward by De Long et al. (1990) and Hong and Stein (1999) with their model of newswatchers and momentum traders. Newswatchers trade based on their private information, which leads momentum traders to start trading, as this is a sign that good private information is entering the market. Momentum traders however do not observe when the information is fully reflected in the prices, as their trades keep prices moving. This mechanism leads to newswatchers slowly ceasing to trade on the exchange until only momentum traders remain in the betting market, amplifying each other’s trading behavior and increasing overreaction.

Regarding the increase of efficiency when betting volume increases, this significant result could be linked to the findings of Chordia et al. (2008) and Chung and Hrazdil (2010), who conclude that an increase in liquidity also increases market efficiency. Assuming, like Pagano (1989), that a higher betting volume indicates that the underlying market was more liquid, the same conclusion could be drawn for the results found in this paper: an increase in liquidity leads to an increase in efficiency. Chordia et al. (2008) find that liquidity stimulates arbitrage activity, which leads to an increase in the market’s efficiency. They do note that illiquidity does not per se imply an inefficient market.

An alternative explanation of why apparently obvious mispricing is not exploited in betting exchanges is related to implementation costs. Merton (1987) and D’Avolio (2002) bring forward arguments why transaction costs may make it less attractive to exploit a mispricing. The general Betfair commission charged on a winning bet is 2-5%, which is high compared to regular financial markets (D’Avolio, 2002). An arbitrageur who spots a mispricing will first have to overcome this 2-5% commission before exploiting the mispricing becomes profitable. Following this reasoning, betting exchanges are also subject to the limits of arbitrage as described in behavioral finance theory. Although this mechanism may play a role in betting exchange markets, the degree of overreaction found cannot be explained by the existence of this commission alone.

Concluding, I propose with due caution that there are possibly two major forces at work driving the efficiency of price movements in betting exchange markets. On the one hand, increasing betting volume and activity drives market efficiency up, reducing overreaction. This is in line with the conclusions of Chordia et al. (2008) and Chung and Hrazdil (2010), who find that increasing liquidity increases market efficiency. At the same time, behavioral forces are at work in betting exchange markets that drive efficiency down. As overreaction appears to be present all throughout the betting exchange, behavioral finance provides a possible explanation for these phenomena. General overreaction may be caused by overconfident agents, as proposed by Daniel et al. (1998). This mispricing may remain partly unexploited because of the existence of transaction costs, limiting arbitrageurs from profiting (Merton, 1987; D’Avolio, 2002). The pattern where overreaction increases further in the long run may be caused by the mechanism of newswatchers and momentum traders, as proposed by the model of Hong and Stein (1999).
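The commission hurdle in the limits-of-arbitrage argument can be made concrete with equation (1): setting the expected return to zero gives the smallest true win probability at which backing a contract breaks even. The sketch below is only an illustration of that algebra; the function names are hypothetical and the odds of 2 in the usage example are an arbitrary choice.

```python
def break_even_probability(odds, commission):
    """Smallest true win probability with non-negative expected return.

    Follows from setting equation (1) to zero:
    probability * (odds * (1 - commission) + commission) - 1 = 0.
    """
    return 1.0 / (odds * (1.0 - commission) + commission)


def required_edge(odds, commission):
    """Gap between the break-even probability and the odds-implied 1 / odds."""
    return break_even_probability(odds, commission) - 1.0 / odds


if __name__ == "__main__":
    for commission in (0.0, 0.02, 0.05):
        print(commission,
              round(break_even_probability(2.0, commission), 4),
              round(required_edge(2.0, commission), 4))
```

Without commission the break-even probability at odds of 2 is exactly 1/2; with a 5% commission it rises to 1/1.95, so a bettor needs the true probability to exceed the odds-implied probability by more than one percentage point before a back bet has a positive expected return.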

6 Additional Analysis

In this section, I take alternative intervals for market time and betting volume that might make more sense from an economic and market eciency viewpoint. As betting volume seems to be an important factor that determines the degree of eciency in my dataset, I take three betting volume cuto↵points. One of the issues of splitting the data into intervals as I did in Table C1 is that the outcomes become insignificant because there are too few observations. Regarding betting volume, I take three di↵erent intervals in this section: more than 1,000 GBP, more than 10,000 GBP and more than 100,000 GBP. It should be noted that all observations of the >100,000 interval are included in the other two subsets, and similarly the >10,000 observations in the first subset. For market time, I take several intervals that include enough observations to get significant results but also make sense from an economic point of view. As Busse and Green (2002) argue, price response takes place in on average five to fifteen minutes in regular financial markets. Although it could be argued that price response occurs more slowly in betting markets, price changes in markets that were opened shortly should not be discarded, as they may still contain useful information. I therefore take four intervals from 0 hours to 96 hours. These intervals are: 0-12 hours, 0-24 hours, 0-48 hours and 0-96 hours. One of the hypotheses derived from the results section is that there is a significant increase in overreaction somewhere around the 96 hour mark, so I also include the intervals 96-150 hours, 96-250 hours, 96-350 hours and >96 hours. Table D1 and Figure D1 in the Appendix report results when these intervals are chosen. The di↵erence is striking: the four intervals up to 96 hours show overreaction that is far less than the four intervals of longer than 96 hours, for all three betting volume intervals. 
Panel A of Figure D1 shows that overreaction in the first four intervals is significantly lower than in the four longer market time intervals. Panels B and C show that for high betting volumes, the market efficiency hypothesis is not rejected while the pure noise hypothesis is rejected for the first four intervals. Overall, this additional analysis again shows that markets that were open longer exhibit a higher degree of overreaction than markets that were open for a shorter time. These results consistently confirm the earlier pattern of increasing overreaction in the long run.

I perform an additional test from the perspective of the football market. I take the UEFA country coefficient list, which ranks the best performing country leagues (UEFA, 2017), and select the top 5 leagues together with the most popular European international competition: Spain’s Primera Division, Germany’s Bundesliga, England’s Premier League, Italy’s Serie A, France’s Ligue 1, and the Champions League. The top 5 leagues have a market share of more than 50% (Franck et al., 2012) in European football, and the Champions League is renowned for being the biggest and best paid football competition in the world. As these are the biggest and most popular leagues worldwide, they could produce different outcomes due to different levels of informational efficiency. For example, higher exposure, greater public interest or more news coverage could mean that more and better information is available for these leagues, leading to higher market efficiency.

I create a subsample of the 85,641 betting contracts in the full dataset that belong to the top 5 leagues or the Champions League. I run the same regressions on the full subsample and on the observations meeting the LM-HV criteria (>24 hours, >1000 GBP) and present the results in Table 7. The first row of the table repeats the original regression results from Table 5 in the Results section for comparison. The second row shows the regression results for the top 5 leagues and Champions League subsample, with the first column showing the full subsample and the second column showing the results when applying the LM-HV criteria. The coefficient for the new subsample (-1.0317) is very close to the one from the full dataset (-1.0068), and the same is true when applying the LM-HV criteria: -0.6145 for the top league subsample and -0.5727 for the original LM-HV subsample. Testing the hypothesis that the coefficients are statistically different from the original results, I reject it in both cases. Linking this result to the rest of the paper, the results for the most popular and best reported leagues are not significantly different from those for the full dataset, providing further evidence of the robustness of the results.
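As a rough plausibility check on this comparison, the sketch below backs out standard errors from the coefficients and t-scores reported in Table 7 and computes a simple z-statistic for the difference between two coefficients. It is a back-of-the-envelope illustration only: the text does not state which test was used, the function names are hypothetical, and the formula assumes the two estimates are independent, which does not hold strictly here because the top league subsample is contained in the full dataset.

```python
from math import sqrt


def implied_se(beta, t_score):
    """Standard error implied by a reported coefficient and its t-score."""
    return abs(beta / t_score)


def z_difference(beta_a, t_a, beta_b, t_b):
    """z-statistic for beta_b - beta_a, treating the estimates as independent."""
    se_a = implied_se(beta_a, t_a)
    se_b = implied_se(beta_b, t_b)
    return (beta_b - beta_a) / sqrt(se_a ** 2 + se_b ** 2)


# Coefficients and t-scores as reported in Table 7.
z_full = z_difference(-1.0068, -46.21, -1.0317, -14.09)  # full sample column
z_lmhv = z_difference(-0.5727, -4.61, -0.6145, -2.47)    # LM-HV column

print(round(z_full, 2), round(z_lmhv, 2))  # roughly -0.33 and -0.15
```

Both statistics are well inside the conventional 1.96 bound, which agrees with the conclusion that the top league coefficients do not differ significantly from the original ones.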

Table 7: Testing price movements for the top 5 and Champions League subsample

This table shows the results for the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$, with the full dataset in the left column and the LM-HV sample in the right column. The first row presents the results from all observations, which are equal to the results of Table 5. The second row shows the results for the top 5 leagues and Champions League subsample. The t-scores are presented in parentheses below the $\beta_1$ coefficient. The third row shows the results for testing the hypothesis that the $\beta_1$ coefficient of the original dataset is different from that of the top 5 + CL subsample.

| | Full Sample | LM-HV Sample |
|---|---|---|
| $\beta_1$: All observations | -1.0068*** | -0.5727*** |
| | (-46.21) | (-4.61) |
| $\beta_1$: Top 5 + CL | -1.0317*** | -0.6145* |
| | (-14.09) | (-2.47) |
| $H_a$: $\beta_{1:\text{All obs}} \neq \beta_{1:\text{Top5+CL}}$ | Reject | Reject |

* p<0.05, ** p<0.01, *** p<0.001

7 Conclusion and Discussion

In this paper I have used betting exchange markets as a laboratory setting for studying price movements and market efficiency. Betting markets form an interesting testing environment: they are simple financial markets in which a contract with a fixed payoff is traded and whose true terminal value is revealed once the betting event has been completed. While betting markets have been studied in the literature, most studies concern so-called bookmaker markets, where bookmakers act as market makers. As bookmakers may have an incentive to set prices inefficiently to maximize their own profits, a test for efficiency in these markets will always be influenced by bookmaker preferences and choices. Betting exchanges are a relatively new type of betting platform where individual bettors come together and make the market by trading betting contracts, as would happen on a regular stock exchange. Prices on these betting exchange platforms are not affected by bookmaker preferences and have been shown in the literature to be more efficient than prices in bookmaker markets. Using a unique dataset from the betting platform Betfair, this paper is, to my knowledge, the first to study these exchange markets from a behavioral finance perspective to explain the phenomena found. The dataset also allows me to construct accurate measures of the time a bet’s underlying market was open and of how much money was bet, so that I can study the effects of time and volume on price movements. Similar to Moskowitz (2015), I use opening and closing prices to test behavioral theories on the efficiency of price movements. Prices can move from open to close for information and non-information reasons and might respond rationally or irrationally to information. The three hypotheses I derive make predictions about the efficiency of price movements and about over- and underreaction.
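How an estimated $\beta_1$ maps onto these hypotheses can be illustrated with a short sketch. It assumes the benchmark values $\beta_1 = 0$ for an efficient price response and $\beta_1 = -1$ for pure noise, which this section does not restate, and it uses a normal-approximation 95% confidence interval built from the reported coefficient and t-score. The function name, the labels and the 1.96 critical value are illustrative choices; the example inputs are taken from Table D1, and only the magnitude of each t-score is used.

```python
EFFICIENT = 0.0    # assumed benchmark: close-to-end return unrelated to the price move
PURE_NOISE = -1.0  # assumed benchmark: the price move is fully reversed


def classify_beta(beta, t_score, critical=1.96):
    """Label a coefficient by which benchmark values its confidence interval contains."""
    se = abs(beta / t_score)  # standard error implied by the reported t-score
    lower = beta - critical * se
    upper = beta + critical * se
    keeps_efficiency = lower <= EFFICIENT <= upper
    keeps_noise = lower <= PURE_NOISE <= upper
    if keeps_efficiency and keeps_noise:
        return "inconclusive"
    if keeps_efficiency:
        return "efficiency not rejected"
    if keeps_noise:
        return "pure noise not rejected"
    # Both benchmarks rejected: the sign of beta gives the direction.
    return "overreaction" if beta < 0 else "underreaction"


# Table D1 examples: (>1000 GBP, 0-12h), (>10,000 GBP, 0-12h), (>100,000 GBP, 96-150h).
print(classify_beta(-0.259, 4.38))  # overreaction
print(classify_beta(0.196, 1.38))   # efficiency not rejected
print(classify_beta(-1.513, 4.83))  # pure noise not rejected
```

The function requires a nonzero t-score, so cells without a reported standard error cannot be classified this way.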
Using the full dataset and splitting it into different intervals of market time and betting volume, I obtain results that support some strong conclusions and others that require further study. The strongest conclusion, which I can draw without reservations, is that betting exchange markets show strong evidence of overreaction in price movements. All but one of the significant coefficients that I find are negative, implying overreaction. Splitting the dataset into different intervals of market time and betting volume, I find large differences in the degree of overreaction across intervals. For many of these intervals, I cannot rule out that price movements are caused by pure noise, as the $\beta_1$ coefficient is not significantly different from the value implied by the second hypothesis, pure noise. To explain what causes these differences, I further examine how market time and betting volume affect the results. I find that overall, overreaction decreases in betting volume. Stated otherwise, higher betting volume is associated with more efficient betting markets. For several of the intervals with the highest betting volume, I do not reject the market efficiency hypothesis. Turning to market time, the relation between overreaction and market time is ambiguous. I find evidence that overreaction decreases in the short to medium run, but increases sharply again in the long run. This pattern recurs and is significant when I choose several additional types of intervals. What causes it is a question this paper does not answer empirically and is material for further study.

I propose three forces that may explain the results. The increase in efficiency for higher betting volume is in line with the results of Chordia et al. (2008) and Chung and Hrazdil (2010), who find that an increase in liquidity increases market efficiency. The general overreaction found may be caused by overconfident agents who overweight their private signals, in line with Daniel et al. (1998). The pattern in which overreaction increases in the long run may be evidence for the model of Hong and Stein (1999), in which newswatchers and momentum traders interact and momentum traders drive prices away from fundamentals as they amplify each other’s trading behavior. Again, this is a theoretical rather than an empirical explanation. Further research could focus on isolating and testing each of the three forces that I identify as potential causes. As this paper does not study the nature or direction of the news or event to which prices react, I cannot conclusively state what actually causes overreaction.
One option would be to construct a subset of the Betfair dataset that contains only betting contracts for which a significant news event occurred while the market was open. For instance, a subset of matches where a key player was injured in the days before the match would probably provide a cleaner environment for testing the response to news. Doing so could perhaps also shed light on the increase in overreaction in the long run, which I observe but for which this paper does not offer an empirical explanation. Another subject for further research could be other sports or different types of betting contracts, for example goal scorer bets or total goals scored bets. As these bets could have different underlying ‘fundamentals’5 compared to the match odds bets that I use, these betting contracts could also produce different results.

Linking the results to the regular asset pricing literature, this paper provides a different angle on the ongoing discussion on behavioral finance and the limitations of classical asset pricing theories. This paper provides evidence that in betting exchanges, prices frequently respond inefficiently to news, and that this takes the form of overreaction. Although a betting market is not a standard asset market, theory on the latter should also apply to the former and vice versa, or else we would need different models for different asset markets. As Moskowitz (2015) argues, the forces at play in the betting market should then also play a role in standard financial asset markets. In a standard financial market the joint hypothesis problem stands in the way of a definitive conclusion on market efficiency, but betting markets circumvent this issue because bets are idiosyncratic and the true value of the betting contract is revealed after the event. Because most of the results point toward overreaction, I feel reasonably safe in concluding that overreaction should also play a role in regular financial markets, adding another piece of evidence to the growing literature on behavioral theories in financial markets and asset pricing.

5 Fundamentals in this case should be thought of as the different team or player characteristics that drive the bet’s outcome. Where match odds bets reflect the probabilities of one team beating the other (or tying), goal scorer bets could have an individual player’s skill as their main driver and, following the same reasoning, total goals scored could reflect each team’s offensive skill relative to the defensive skill of the opponent.

8 References

Admati, A. R. and Pfleiderer, P. (1988) A theory of intraday patterns: Volume and price variability, Review of Financial Studies, 1, 3-40

Barberis, N. and Thaler, R. (2003) A Survey of Behavioral Finance, Handbook of the Economics of Finance, 1053-1123

Barberis, N., Shleifer, A. and Vishny, R. (1998) A model of investor sentiment, Journal of Financial Economics, 49, 307-343

Betfair (2015) Betfair Annual Report 2015, http://corporatebetfaircom//media/Files/B/Betfair-˜ Corporate/pdf/annual-report-2015pdf

Busse, J. A. and Green, T. C. (2002) Market efficiency in real time, Journal of Financial Economics, 65, 415-437

Chordia, T., Roll, R. and Subrahmanyam, A. (2008) Liquidity and market efficiency, Journal of Financial Economics, 87, 249-268

Chung, D. and Hrazdil, K. (2010) Liquidity and market efficiency: a large sample study, Journal of Banking and Finance, 34, 2346-2357

D’Avolio, G. (2002) The market for borrowing stock, Journal of Financial Economics, 66, 271-306

Daniel, K., Hirshleifer, D. and Subrahmanyam, A. (1998) Investor Psychology and Security Market Under- and Overreactions, Journal of Finance, 53, 1839-1885

De Long, J. B., Shleifer, A., Summers, L. H. and Waldmann, R. J. (1990) Positive Feedback Investment Strategies and Destabilizing Rational Speculation, Journal of Finance, 45, 379-395

Fama, E. F. (1991) Efficient Capital Markets: II, Journal of Finance, 46, 1575-1617

Fama, E. F. (1998) Market Efficiency, Long-term Returns, and Behavioral Finance, Journal of Financial Economics, 49, 283-306

Franck, E., Verbeek, E. and Nüesch, S. (2012) Inter-market Arbitrage in Betting, Economica, 80, 300-325

Hirshleifer, D. (2001) Investor Psychology and Asset Pricing, Journal of Finance, 56, 1533-1597

Hong, H. and Stein, J. C. (1999) A Unified Theory of Underreaction, Momentum Trading, and Overreaction in Asset Markets, Journal of Finance, 54, 2143-2184

Hong, H., Lim, T. and Stein, J. C. (2000) Bad News Travels Slowly: Size, Analyst Coverage, and the Profitability of Momentum Strategies, Journal of Finance, 55, 265-295

Kuypers, T. (2000) Information and Efficiency: an Empirical Study of a Fixed-Odds Betting Market, Applied Economics, 32, 1353-1363

Kyle, A. S. (1985) Continuous Auctions and Insider Trading, Econometrica, 53, 1315-1335

Levitt, S. D. (2004) Why Are Gambling Markets Organized so Differently From Financial Markets?, The Economic Journal, 114, 223-246

Malkiel, B. G. and Fama, E. F. (1970) Efficient Capital Markets: a Review of Theory and Empirical Work, Journal of Finance, 25, 383-417

Merton, R. C. (1987) A Simple Model of Capital Market Equilibrium with Incomplete Information, Journal of Finance, 42, 483-510

Moskowitz, T. J. (2015) Asset Pricing and Sports Betting, Working paper, University of Chicago Booth School of Business

Pagano, M. (1989) Trading Volume and Asset Liquidity, The Quarterly Journal of Economics, 104, 255-274

Patell, J. M. and Wolfson, M. A. (1984) The intraday speed of adjustment of stock prices to earnings and dividend announcements, Journal of Financial Economics, 13, 223-252

Smith, M. A., Paton, D. and Vaughan Williams, L. (2006) Market Efficiency in Person-to-Person Betting, Economica, 73, 673-689

Smith, M. A., Paton, D. and Vaughan Williams, L. (2009) Do bookmakers possess superior skills to bettors in predicting outcomes?, Journal of Economic Behavior and Organization, 71, 539-549

UEFA (2017, June 21) UEFA rankings for club competitions, http://www.uefa.com/memberassociations/uefarankings/country/

Vlastakis, N., Dotsis, G. and Markellos, R. N. (2009) How Efficient is the European Football Betting Market? Evidence from Arbitrage and Trading Strategies, Journal of Forecasting, 28, 426-444

9 Appendix

Table A1: Dataset Variables List

This table lists all variables in the Betfair dataset, with descriptions and examples.

| Variable Name | Description |
|---|---|
| SPORTS ID | Identification number for underlying sport |
| EVENT ID | Event-specific identification number |
| FULL DESCRIPTION | Full description of the sports event containing country, league and week. Example: “Australian Soccer/A-League 2011/12/Fixtures 04 January/Wellington v Sydney” |
| EVENT | Description of betting event. Example: “Match Odds”, “Goal Scorer”, “Over/Under 25” |
| SELECTION | Description of the selection of the bet on the betting event. Example: “Sydney FC”, “The Draw”, “Wellington” |
| SELECTION ID | Identification number of “SELECTION” |
| ODDS | Odds corresponding to the betting contract |
| NUMBER BETS | Number of individual bets that were matched at these odds |
| VOLUME MATCHED | Sum of the stakes of both back and lay bets in GBP |
| WIN FLAG | Describes whether the betting event was won (“1”) or not (“0”) |
| IN PLAY | Describes whether the bet was settled before the event started (PE) or during the event (IP) |
| SETTLED DATE | Date and time at which the betting event was settled |
| SCHEDULED OFF | The time the betting market was scheduled to close, corresponding to the planned starting time of the event |
| ACTUAL OFF | The actual time the betting market closed, corresponding to the actual starting time of the event |
| FIRST TAKEN | Time and date at which the odds on the selection were first matched |
| LATEST TAKEN | Time and date at which the odds on the selection were last matched |
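To illustrate how the variables above could feed the market time and betting volume measures, the sketch below collapses the odds-level rows of one selection into a single contract summary. It is a hypothetical construction, not necessarily the procedure used in the thesis: it assumes that market time runs from the earliest FIRST TAKEN to ACTUAL OFF, that betting volume is the sum of VOLUME MATCHED over all rows, and that timestamps are already parsed into `datetime` objects. The function name and the example rows are invented for illustration; the LM-HV thresholds (>24 hours, >1000 GBP) are those stated in the Additional Analysis section.

```python
from datetime import datetime

LM_MIN_HOURS = 24.0     # "long market" threshold of the LM-HV criteria
HV_MIN_VOLUME = 1000.0  # "high volume" threshold in GBP


def summarize_contract(rows):
    """Collapse all odds-level rows of one selection into one summary.

    Assumption: each row is a dict keyed by the Table A1 variable names,
    with parsed datetimes and VOLUME MATCHED as a float in GBP.
    """
    first_taken = min(row["FIRST TAKEN"] for row in rows)
    actual_off = rows[0]["ACTUAL OFF"]  # identical for all rows of one event
    hours_open = (actual_off - first_taken).total_seconds() / 3600.0
    volume = sum(row["VOLUME MATCHED"] for row in rows)
    return {
        "market_time_hours": hours_open,
        "volume_matched": volume,
        "lm_hv": hours_open > LM_MIN_HOURS and volume > HV_MIN_VOLUME,
    }


# Invented example: two odds levels matched on the same selection.
rows = [
    {"FIRST TAKEN": datetime(2015, 3, 1, 12, 0),
     "ACTUAL OFF": datetime(2015, 3, 3, 15, 0),
     "VOLUME MATCHED": 800.0},
    {"FIRST TAKEN": datetime(2015, 3, 2, 9, 30),
     "ACTUAL OFF": datetime(2015, 3, 3, 15, 0),
     "VOLUME MATCHED": 450.0},
]
summary = summarize_contract(rows)
print(summary)
```

The sketch does not separate pre-event (PE) from in-play (IP) rows; a fuller version would need to decide how the IN PLAY flag enters the measures.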

Figure B1: Return Distributions for Full Sample and LM-HV sample

The left column of this figure shows the return distributions for open-to-end, close-to-end and open-to-close returns for the full dataset. The right column shows these distributions for the LM-HV sample. The Y-axis shows the fraction of total observations; the X-axis shows absolute returns, not in percentages.

[Figure with six panels: (a) Full Sample open-to-end returns, (b) LM-HV open-to-end returns, (c) Full Sample close-to-end returns, (d) LM-HV close-to-end returns, (e) Full Sample open-to-close returns, (f) LM-HV open-to-close returns]

Table B1: Additional Summary Statistics

This table shows additional summary statistics, including skewness, kurtosis and percentiles, for the full dataset and the LM-HV subset. Matched bets is in absolute numbers, volume matched is measured in GBP and market time is displayed in hours.

Panel A: Full dataset (N = 601,543)

| | Mean | SD | Skew | Kurt | 1st% | 5th% | 10th% | 25th% | 50th% | 75th% | 90th% | 95th% | 99th% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Matched Bets | 33.82 | 89.67 | 16.82 | 716 | 3 | 3 | 4 | 6 | 12 | 30 | 72 | 123 | 352 |
| Volume Matched | 2642 | 24,510 | 42.51 | 3343 | 2.54 | 5.82 | 9.56 | 30.72 | 144.02 | 662.4 | 2709 | 6845 | 43,287 |
| Market Time | 12.12 | 33.31 | 33.31 | 21.47 | 0.00 | 0.01 | 0.04 | 0.21 | 1.68 | 8.99 | 32.52 | 67.34 | 132.63 |

Panel B: LM-HV Sample (N = 43,942)

| | Mean | SD | Skew | Kurt | 1st% | 5th% | 10th% | 25th% | 50th% | 75th% | 90th% | 95th% | 99th% |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Matched Bets | 163.15 | 228.55 | 7.48 | 129 | 19 | 31 | 40 | 61 | 98 | 175 | 332 | 505 | 1116 |
| Volume Matched | 17,885 | 71,378 | 17.04 | 51 | 1025 | 1135 | 1284 | 1883 | 3764 | 10,119 | 31,378 | 65,106 | 269,609 |
| Market Time | 80.75 | 83.56 | 15.00 | 448 | 24.34 | 26.13 | 28.77 | 42.76 | 65.65 | 101.07 | 140.56 | 165.92 | 291.85 |

Figure C1: $\beta_1$ Coefficients and 95% CI for Market Time and Volume Matched

This figure shows the $\beta_1$ coefficients of Table 6 with the corresponding 95% confidence intervals. Panel A shows market time intervals in hours, Panel B shows volume matched intervals in GBP.

[Figure with two panels. Panel A: Market Time, with $\beta_1$ on the vertical axis and the intervals <6h, 6-12h, 12-24h, 24-48h, 48-96h, 96-150h, 150-250h, 250-350h and >350h on the horizontal axis. Panel B: Volume Matched, with $\beta_1$ on the vertical axis and the intervals 1-100, 100-1000, 1k-10k, 10k-100k and >100k on the horizontal axis.]

Table C1: $\beta_1$ Coefficients for Market Time and Volume Matched Intervals

This table shows the results for the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$. The table further splits the dataset into market time and volume matched intervals. Further right in the table are longer market time intervals, further down are larger volume matched intervals. For each interval the table reports $\beta_1$, the t-score and the number of observations in the interval. Every coefficient reported represents a unique interval, so that the results only apply to observations within the specified market time and volume matched interval. For example, the result in the first row and first column (top left corner) only covers observations for which market time was 0-6 hours and volume matched was 1-100 GBP.

| Volume matched (GBP) | | <6h | 6-12h | 12-24h | 24-48h | 48-96h | 96-150h | 150-250h | >250h |
|---|---|---|---|---|---|---|---|---|---|
| 1-100 | $\beta_1$ | -1.239*** | -1.452*** | -1.234*** | -1.691** | -1.243 | 1.993 | -1.925 | 0 |
| | t | (-37.61) | (-9.27) | (-3.54) | (-3.27) | (-1.74) | (1.86) | (-0.44) | (.) |
| | N | 227175 | 19076 | 10976 | 3641 | 1852 | 437 | 35 | 14 |
| 100-1000 | $\beta_1$ | -0.898*** | -1.134*** | -1.669*** | -1.038*** | -0.989** | -0.375 | -1.175 | -6.545*** |
| | t | (-23.41) | (-8.86) | (-10.23) | (-4.99) | (-2.84) | (-0.53) | (-0.59) | (-3.58) |
| | N | 139340 | 29260 | 25767 | 14002 | 9520 | 2772 | 234 | 59 |
| 1k-10k | $\beta_1$ | -0.248*** | -0.885*** | -0.889*** | -0.653** | -0.433 | -0.618 | -0.823 | -1.825** |
| | t | (-3.63) | (-4.49) | (-4.58) | (-3.25) | (-1.78) | (-1.63) | (-0.69) | (-3.29) |
| | N | 39085 | 10391 | 12908 | 12766 | 3241 | 5766 | 912 | 167 |
| 10k-100k | $\beta_1$ | 0.374* | -0.327 | 0.0215 | -0.612 | 0.263 | -1.200** | 0.136 | -2.130*** |
| | t | (2.06) | (-1.25) | (0.05) | (-1.57) | (0.46) | (-2.89) | (0.23) | (-5.05) |
| | N | 6773 | 1957 | 1357 | 1747 | 3448 | 2894 | 1213 | 370 |
| >100k | $\beta_1$ | 0.0415 | -0.46 | -0.821 | 0.492 | -1.177* | -1.513*** | -0.541 | -1.548 |
| | t | (0.2) | (-0.91) | (-0.89) | (0.76) | (-2.57) | (-4.83) | (-0.72) | (-1.47) |
| | N | 687 | 250 | 131 | 110 | 359 | 379 | 407 | 164 |

* p<0.05, ** p<0.01, *** p<0.001

Table D1: Additional Analysis Results

This table shows the results for the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$. The table has three intervals for volume matched: >1,000 GBP, >10,000 GBP and >100,000 GBP. The columns represent different market time intervals. Volume matched is displayed in GBP, market time in hours.

| Volume matched (GBP) | | 0-12h | 0-24h | 0-48h | 0-96h | 96-150h | 96-250h | 96-350h | >96h |
|---|---|---|---|---|---|---|---|---|---|
| >1000 | $\beta_1$ | -0.259*** | -0.316*** | -0.350*** | -0.344*** | -0.899*** | -0.768** | -0.839*** | -0.889*** |
| | t | (-4.38) | (-5.65) | (-6.52) | (-6.44) | (-3.38) | (-3.19) | (-3.68) | (-4.03) |
| | N | 59,137 | 73,533 | 88,156 | 105,204 | 9,038 | 11,570 | 11,987 | 12,271 |
| >10,000 | $\beta_1$ | 0.196 | 0.169 | 0.112 | 0.121 | -1.256*** | -0.900** | -0.934** | -1.044*** |
| | t | (1.38) | (1.27) | (0.89) | (0.9) | (-3.66) | (-2.92) | (-3.17) | (-3.81) |
| | N | 9,667 | 11,155 | 13,012 | 16,819 | 3,273 | 4,893 | 5,210 | 5,427 |
| >100,000 | $\beta_1$ | -0.0432 | -0.134 | -0.0835 | -0.221 | -1.513*** | -1.320*** | -1.435*** | -1.344*** |
| | t | (-0.22) | (-0.61) | (-0.39) | (-1.13) | (-4.83) | (-4.32) | (-4.85) | (-4.57) |
| | N | 937 | 1,068 | 1,178 | 1,537 | 379 | 786 | 870 | 950 |

* p<0.05, ** p<0.01, *** p<0.001

Figure D1: $\beta_1$ Coefficients and 95% CI for Market Time and Volume Matched

This figure shows the $\beta_1$ coefficients from the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$ and the corresponding 95% confidence intervals, corresponding to Table D1.

[Figure with three panels: Panel A: >1000 GBP, Panel B: >10,000 GBP, Panel C: >100,000 GBP. Each panel shows $\beta_1$ on the vertical axis and the market time intervals 0-12h, 0-24h, 0-48h, 0-96h, 96-150h, 96-250h, 96-350h and >96h on the horizontal axis.]