University of Amsterdam Amsterdam Business School
MSc Business Economics, Finance track Master Thesis
Asset Pricing and Behavioral Finance Evidence from a Betting Exchange
Rob Clowting June 2017
Supervisor: Florian Peters

Statement of Originality
This document is written by Rob Clowting who declares to take full responsibility for the contents of this document. I declare that the text and the work presented in this document is original and that no sources other than those mentioned in the text and its references have been used in creating it. The Faculty of Economics and Business is responsible solely for the supervision of completion of the work, not for the contents.
Abstract
In this thesis I use sports betting exchanges as a laboratory setting for testing behavioral theories on asset pricing. A betting exchange is a betting market where individual bettors come together and make the market, effectively forming a simple financial market where single-payoff contracts are traded in the form of bets. I use a unique dataset from the betting exchange platform Betfair, from which I collect 601,543 betting contracts spanning the period 2012-2016. I test how prices respond to news in betting markets and if this happens rationally or irrationally. I also construct measures for betting volume and the time the market for a bet was open and test how these affect price response. I find that all throughout the betting market, price response is inefficient and overreaction is the leading cause. Examining the effects of betting volume and market opening time, I find that market efficiency increases when betting volume increases. For market time, the results are more ambiguous: in the short to medium run, overreaction seems to decrease, but increases sharply again in the long run. I then present several behavioral theories to theoretically explain the results found.
Contents
1 Introduction
2 Literature Review
2.1 Sports Betting Markets
2.2 Behavioral Finance
2.3 Effects of Volume and Time on Market Efficiency
3 Methodology & Hypotheses
4 Data & Summary Statistics
5 Results
6 Additional Analysis
7 Conclusion and Discussion
8 References
9 Appendix
1 Introduction
In his renowned paper on market efficiency and behavioral finance, Fama (1998) counters most research on the behavioral theories of anomalies found in classical asset pricing theory. He states that no behavioral model has yet come up with more convincing results compared to the rational theory he puts forward. Rational theories of market efficiency state that return premia are linked to aggregate systematic risk, whereas behavioral theory states that returns are generated by investor cognitive errors and biases in a market that is not completely efficient. One of the biggest obstacles when testing for market efficiency is the joint hypothesis problem as stated by Fama (1971): any test of market efficiency is simultaneously a test of the underlying equilibrium asset pricing model. Regular capital markets present an empirical environment that is virtually impossible to test, as aggregate systematic risk is unobservable and the termination date of the asset is often not known. Ideally, an environment should be found where these fundamental problems of testing for market efficiency are absent. I propose such an environment exists, and it comes from a perhaps unexpected source: sports betting markets. Sports betting markets have been the subject of some study in the literature because they present a laboratory setting of a simple financial market where single-payoff contracts are traded in the form of bets on sports events. As Moskowitz (2015) argues, either our asset pricing models should explain returns from all markets, or we would have to make separate models for different asset types. Sports betting markets offer the opportunity to test behavioral asset pricing models, as they have three unique features that set them apart from the rational asset pricing framework. First, sports betting markets are fully idiosyncratic, having no relation to aggregate risk in the economy.
Second, sports bets have a short and observable termination date at which the true value of the underlying contract is revealed. Third, the outcome of a bet is completely independent of betting activity. The observable true value of the bet after the betting event has taken place allows for the detection of mispricing. The dataset I use allows me to also observe market opening prices and the price of the bet at market close. I use prices at these three points in time (open, close, terminal value) to test behavioral theories of asset pricing. Almost all literature on betting markets studies bookmaker betting markets, the traditional and best known form of betting markets where bookmakers set prices for the bets and act as market makers. Testing any asset pricing theory in these markets will therefore always be biased by bookmaker choices and preferences. However, recent years have seen the rise of so called betting exchanges, where bookmakers are replaced by individual bettors posting their preferred
limit and market orders for bets, creating a trading mechanism similar to that of regular stock exchanges. This effectively creates an environment where individual bettors come together and trade betting contracts, creating an isolated financial market in which asset pricing theories can be tested. This paper will take this betting exchange environment and use it to test several behavioral theories of asset pricing. I use a unique dataset from Betfair, currently the largest betting exchange platform worldwide with an annual turnover of 55.3 billion pounds. There is still little research on these betting exchanges, and to the best of my knowledge this paper is the first to test behavioral theories in this environment. Another unique and interesting feature of the Betfair dataset is that it contains measures of market time and market size, allowing me to test what the effects of these are on the pricing of bets. More specifically, I study price movements and corrections and apply several behavioral theories to these. Barberis et al. (1998) present a model of underreaction followed by overreaction. Daniel et al. (1998) come up with a model of overreaction followed by overreaction and Hong and Stein (1999) present another model of overreaction. Prices can move from open to close for information and non-information reasons and might respond rationally or irrationally to information. From the literature I derive the main objective of this paper, which is to test whether prices respond rationally or irrationally to news, and in the irrational case, whether this is due to pure noise, overreaction or underreaction. Next I test if and how the time a bet was traded and the amount of money that was bet influence these results. I construct a dataset from the Betfair betting platform that contains all football matches on which bets were placed in the period 2012-2016. This gives me a total of 601,543 betting contracts that cover 250,382 football matches.
I also construct a subset of contracts which were traded for more than 24 hours and on which more than 1000 GBP was bet and I divide the data into intervals of market time and betting volume. The results of the first test of rational versus irrational price movements point to one clear direction: prices respond irrationally and this leads to overreaction. For both the full dataset and the subsample the coefficients found are significant and imply overreaction. The difference between the two results is that for the full dataset, the hypothesis that price movements are caused by pure noise is not rejected, but for the subset it is. These results point to a relationship between overreaction and market time and betting volume. To study this apparent relationship, the dataset is then further split up into different market time and betting volume intervals. The effect of betting volume on the efficiency of price movements is clear: increasing betting volume leads to an increase in efficiency. For the highest intervals of betting volume the rationally efficient response hypothesis is not rejected. Turning to market time, the results are more ambiguous. In
the short to medium run, overreaction appears to decrease in markets that are open longer. However, for the highest intervals of markets opened for more than 150 hours, overreaction increases again strongly. Additional analysis of the results using different criteria for market time and betting volume confirms these results. There is no evidence that underreaction plays a role in price movements in betting exchanges. The first main finding of this paper is that in betting exchanges, prices generally respond irrationally to news, leading to overreaction. The second main finding is that this overreaction decreases in betting volume, and responds ambiguously to the time a market is open. Linking these results to the literature, Chordia et al. (2008) find that liquidity increases market efficiency, which offers an explanation for the increase in efficiency when betting volume increases. The general overreaction could be evidence for the model of Daniel et al. (1998), who present overconfident investors (or bettors) as a source of mispricing. The model by Hong and Stein (1999) presents an explanation of the long-term increase in overreaction, as it accounts for momentum traders who amplify overreaction when they interact with one another. It should be noted that these are theoretical explanations for the phenomena observed, as this paper does not yet present empirical data on testing these theories. The rest of this thesis is structured as follows. Section 2 will discuss the current literature on betting markets, market efficiency and behavioral finance. Section 3 will present the hypotheses of the thesis and present the methodology required to test these. Section 4 introduces the data and provides descriptive statistics. In Section 5, the main results are discussed. Section 6 presents additional analyses of the results found. Finally, Section 7 is the conclusion and discussion.
2 Literature Review
One of the oldest problems in modern financial theory is that testing for market efficiency is practically impossible because of the Joint Hypothesis problem, as formulated by Malkiel and Fama (1970) and Fama (1991). This Joint Hypothesis problem implies that testing for market efficiency will always be a test of the underlying asset pricing model. Therefore any empirical findings on supposed market inefficiencies can never fully be ruled out as being caused by an incorrect model. Sports betting markets have been subject to quite extensive research for some years because they can be used as a laboratory for testing asset pricing. Moskowitz (2015) argues that our asset pricing models should explain returns for all asset markets, or we would need different models for different asset markets. Sports betting markets offer an opportunity to circumvent the Joint Hypothesis problem stated by Fama (1991) for three reasons. (i) Betting markets are fully idiosyncratic: they have no relation to aggregate risk or risk premia in the economy. (ii) The terminal value of a bet is always known and observable after a bet settles, allowing for mispricing to be detected. (iii) The outcome of a bet is completely independent of betting activity (assuming absence of 'match-fixing'). These three characteristics make betting markets an interesting laboratory for testing asset pricing models. Considering the first feature, Moskowitz (2015) argues that aggregate risk preferences and risk premia might affect the betting market as a whole but should have no effect on the cross-section of games. Rational asset pricing theories should therefore have nothing to say about return predictability for betting contracts. However, sports betting contracts should be subject to behavioral explanations that have been proposed to cause anomalous returns in regular financial markets.
Since the betting market is idiosyncratic, contracts are independent of aggregate risk and therefore can tell us more about the role of bettor behavior and information. The second and third feature of sports betting markets imply that betting contracts have a short and known termination date, namely the end of the sports event on which the contract was bought. This means that any uncertainty about the value of the contract is resolved at the termination date of the contract, providing a 'true' value of the contract. Since bettor behavior
has no influence on the outcome of the event¹, the outcome is fully exogenous and as a result this allows for any mispricing to be detected. Another hypothesis is that these markets are in fact efficient, and that there is no mispricing, thus implying that there is no return predictability. The combination of these features, idiosyncrasy and the known true terminal value, makes sports betting markets a useful laboratory for testing behavioral theories.
2.1 Sports Betting Markets
In their paper, Franck et al. (2012) describe a betting market as a simple speculative market, where contracts on some future cash flow are traded. The outcome of a certain event determines the direction of the cash flow. For example, in sports betting this could be 'Team X wins', or 'Player Y scores'. This paper looks at fixed-odds betting, which means the size of the cash flow is determined by the odds. In European online sports betting, odds are generally presented as so-called 'decimal odds'. For example, odds of 1.58 mean that for every euro bet, the bettor receives 1.58 times his stake if he wins the bet. Although odds may change over time, the bettor's claim is determined by the odds initially taken and is not altered by subsequent price changes.

There are two distinct market forms in sports betting markets: bookmaker markets and betting exchange markets. Most of the literature on betting markets has focused on bookmaker markets, the traditional form of betting markets, where bookmakers act as market makers and set the price, and the individual bettors buy the bets from the bookmakers. Bookmakers make money by charging a commission, which is reflected in their odds.

There is already some literature on efficiency and mispricing in these markets. Inefficiency in betting markets implies that the odds quoted by a bookmaker do not reflect the true probability of the outcome of the underlying event. Vlastakis et al. (2009), for example, find evidence for the so-called favorite-longshot bias, meaning that betting on favorites (low returns with high probability) yields higher returns than betting on 'longshots' (high returns with low probabilities). They also find evidence for the overestimation of the home team advantage. Bettors tend to have an unjustified belief that the home team has a bigger advantage than it empirically has. Bookmakers seem to be able to exploit these biases with their price setting. Looking at betting strategies, Vlastakis et al. (2009) find that the most profitable betting strategy is the 'away-favorite', although it should be noted that this strategy still yields negative average returns due to bookmakers' commission and price setting.

¹ This assumes absence of 'match-fixing' or large scale influencing of football matches by (criminal) persons or organizations. Although several instances of match-fixing have come to light over the past years in mostly lower level European leagues, I assume that these instances are so rare they do not influence the betting market in any significant way. To challenge this would be beyond the scope of this paper and would also be beyond the field of economics.

Why bookmakers may purposely set prices inefficiently is examined in several papers. Kuypers (2000), Franck et al. (2012) and Vlastakis et al. (2009) all provide evidence that bookmakers may maximize profits by setting market-inefficient odds. They bring forward the behavioral argument that bettor biases, such as the mentioned favorite-longshot bias and home team advantage overestimation, are exploited by bookmakers to maximize profits. This works in two ways. On the one hand, bookmakers can exploit bettor biases by quoting market-inefficient odds that are advantageous to their own profits (the bettor takes the loss on efficiency, the bookmaker the gains). On the other hand, bookmakers charge a commission on bets, meaning that higher trading volume can sometimes offset the potential losses on quoting inefficient odds. Another more intuitive reason, described by Franck et al. (2012), is that bookmakers may set inefficient odds temporarily for promotional or advertising reasons to attract new customers. By taking losses on the inefficient odds or promotions during the advertising period, the bookmaker hopes to attract and retain new customers on whom they will make a profit in the long run, since bettors face transaction costs when switching bookmakers.

As the literature on bookmaker betting markets shows, the problem with testing for market efficiency in bookmaker markets is that bookmakers have incentives to purposely set odds inefficiently for profit-maximizing reasons. A test of true market efficiency will therefore always be biased by bookmaker preferences.
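To make the decimal-odds convention and the idea of commission being "reflected in the odds" concrete, the following is a minimal sketch. It assumes the common convention that the probability implied by decimal odds is 1/odds; the three match odds used (1.58, 4.20, 6.00) and the function names are hypothetical and chosen only for illustration, not taken from the dataset.

```python
def payout(stake: float, decimal_odds: float) -> float:
    """Total amount returned to a winning bettor, stake included."""
    return stake * decimal_odds


def implied_probabilities(odds):
    """Raw implied probability of each outcome, assuming p = 1 / odds."""
    return [1.0 / o for o in odds]


def overround(odds):
    """Amount by which the implied probabilities exceed 1 (the built-in margin)."""
    return sum(implied_probabilities(odds)) - 1.0


def normalized_probabilities(odds):
    """Implied probabilities rescaled so that they sum to one."""
    raw = implied_probabilities(odds)
    total = sum(raw)
    return [p / total for p in raw]


if __name__ == "__main__":
    # Hypothetical home / draw / away odds for a single match.
    match_odds = [1.58, 4.20, 6.00]
    print("payout on a 1 euro stake at 1.58:", payout(1.0, 1.58))
    print("implied probabilities:", implied_probabilities(match_odds))
    print("overround:", overround(match_odds))
    print("normalized probabilities:", normalized_probabilities(match_odds))
```

If the implied probabilities of all mutually exclusive outcomes sum to more than one, the excess can be read as the margin embedded in the quoted odds. Simple rescaling is only one of several ways to remove it.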
This paper examines a different type of betting platform, the so-called betting exchange, where individual bettors come together and quote the odds at which they are willing to trade. On the betting exchange, the market makers are the bettors themselves and the bookmakers are eliminated from the equation. Betting exchange markets are a relatively new phenomenon in the (online) sports betting community. The biggest betting exchange platform is currently Betfair.com, launched in 2000 and with an annual turnover of £55.3 billion in 2015 (Betfair, 2015). On a betting exchange, the bookmaker is replaced by individual bettors and prices are set by an auction process of supply and demand, similar to regular stock exchanges. The betting exchange platform offers the individual bettor the possibility to either buy or sell a bet, similar to going long or short on a stock. The bettor can therefore take the position Team X wins/draws/loses, but also the position Team X does not win/draw/lose. In other words, instead of the bookmaker taking the other side of the contract, in betting exchange markets it is another individual bettor. This
creates a continuous double auction process on the betting platform. If bettors with opposing views agree on a price, the platform executes their transaction. The bettor can either submit a limit order and wait for another person to match his price, or place a market order which will be matched with already offered bets. The order book is shown publicly on the platform's website, showing the most attractive odds and the corresponding available betting volumes. Betting exchanges earn money by charging a commission on the cash flow of winning bets (ex-post) instead of including the commission in the quoted odds (ex-ante) like bookmakers. Smith et al. (2006, 2009) are amongst the first to study these betting exchanges and compare them to the bookmaker market. They find that betting exchanges offer a significant increase in efficiency compared to bookmaker markets. They find that the favorite-longshot bias is less prominent in betting exchange markets than in traditional bookmaker markets. They also test information-based against risk preference models and find that the information-based model predicts the favorite-longshot bias better than the risk preference model. Franck et al. (2012) study arbitrage opportunities when combining betting on exchange markets with bookmaker markets for the top five European football leagues. They find that using this strategy, arbitrage opportunities arise in 19.2% of all matches, resulting in an average positive return on these bets of 1.4%. They conclude that these are not caused by random price differences but are caused by different levels of informational efficiency. Most arbitrage opportunities arise when bookmakers offer inefficiently low-priced bets which can then be sold on the betting exchange for a higher price. They find that bookmaker markets are the main cause of the arbitrage opportunities arising, as they set prices less efficiently than the betting exchange.
2.2 Behavioral Finance
Since betting markets form a suitable laboratory for testing asset pricing theories and models, they could also be used to study the field of behavioral finance. In his paper studying psychological influences on asset pricing, Hirshleifer (2001) describes heuristics as rules-of-thumb that affect individual decision making and may therefore also affect decision making for a group of individuals acting in an economic environment. As the literature on betting markets shows, heuristics seem to be playing a role in betting markets as there is ample evidence of the favorite-longshot bias and home team advantage overestimation (Vlastakis et al., 2009; Levitt, 2004; Smith et al., 2006, 2009). In the behavioral finance literature there are several theories that try to explain behavioral phenomena based on investor beliefs. The focus of this paper will be mainly on over- and underreaction. One of the most cited papers in these areas is by Barberis et al. (1998), who present a model that tries to explain two pervasive empirical phenomena: short term underreaction of stock prices to news and long term overreaction to a series of good or bad news. Their model is based on systematic errors that investors make in reaction to public news announcements. They argue that conservatism makes people judge initial good or bad news insufficiently, pushing prices up or down too little. After a series of good or bad news however, representativeness causes investors to overreact and push prices up or down too far. Daniel et al. (1998) present a different model where overconfident economic agents overweigh their private signals, leading to overreaction in pricing. They argue that this is caused by so-called self-attribution bias: public news that confirms the investor's belief increases his confidence but disconfirming news does not affect the investor's confidence in his own beliefs. Following this reasoning, initial overconfidence is then amplified and leads to even more overconfidence, generating momentum. Another explanation for the existence of over- and underreaction and momentum is provided by Hong and Stein (1999). Their model does not look at individual investor psychology, but rather focuses on a market where different groups of traders interact. The model assumes there are two types of traders, 'newswatchers' and 'momentum traders', who interact with each other. Newswatchers base their forecasts on private information, and momentum traders only base their actions on the most recent price change. The newswatchers trade based on private information, which then diffuses slowly through the population of newswatchers. This slow diffusion leads to underreaction of prices in the short run. The momentum traders then start engaging in positive feedback trading. For them, rising prices imply that information is slowly diffusing through the market.
However, because momentum traders cannot observe the extent to which news has diffused through the economy, they keep buying even after prices have reached fundamental value. This generates an overreaction that is only later reversed. Hong et al. (2000) test this model of momentum and information diffusion by looking at firms of different size and different levels of analyst coverage. They find that momentum is stronger in smaller companies and companies with lower levels of analyst coverage, consistent with their hypothesis of slow information diffusion. Linking these behavioral theories to betting markets, Moskowitz (2015) is amongst the first to use sports betting markets as an asset pricing laboratory to test these models of over- and underreaction and momentum. Moskowitz uses data from bookmaker markets and finds that price movements in betting markets are consistent with overreaction models, as described by the
model of Daniel et al. (1998). He then examines what may cause this overreaction and finds that momentum exhibits significant predictability for returns, that value exhibits significant but weaker predictability, and that there is no evidence that size predicts returns in any way. However, by using bookmaker markets instead of betting exchange markets there is still the argument of bookmaker preferences that may cause results to be affected.
2.3 Effects of Volume and Time on Market Efficiency
The dataset that is used in this paper also features a measure of betting volume for each contract and allows me to construct a measure for how long a certain bet was traded. Betting volume is related to trading activity, market depth and liquidity, and market time may be linked to the response to news arrival and effective diffusion of news. Unfortunately, there is no literature that focuses on trading volume and liquidity in betting markets. However, the literature on liquidity in regular financial markets provides sufficient guidance. Pagano (1989) presents the relationship between trading volume and liquidity as a feedback loop where the former amplifies the latter and vice versa. He argues that trading volume is positively linked to liquidity as both speculators and informed traders enter the market as volume increases. Admati and Pfleiderer (1988) point out that "liquidity begets liquidity". Studying these phenomena, Chordia et al. (2008) are amongst the first to study the link between liquidity and market efficiency and find that an increase in liquidity leads to an increase in efficiency, caused by a rise in arbitrage trading. Chung and Hrazdil (2010) confirm the findings of Chordia et al. in a more extensive study. The literature on the relationship between time and market efficiency mainly focuses on the arrival of news and how fast it is incorporated into prices. In one of the earlier papers on this subject, Patell and Wolfson (1984) find that price response takes place within five to ten minutes. Busse and Green (2001) sum up several studies that look at the speed at which prices react and find that prices incorporate news within five to 15 minutes in regular financial markets. However, the papers Busse and Green (2001) cite are somewhat outdated, and they subsequently find in their own research that for positive news prices are corrected within one minute, whereas for negative news this takes around 15 minutes.
To my knowledge there has not been any research done on news arrival in betting markets or market time in general. This makes it difficult to draw conclusions on how prices respond to news in a sports betting market. On the one hand, one could present the argument that news about sports is often more uncertain and rumor-based and therefore does not have the same power as an earnings announcement for a stock would
have. On the other hand, the literature on sports betting markets shows that these markets are for a large part efficient, so one can ask why the response to news would not be.
Summarizing, to the best of my knowledge this thesis is the first paper to use a betting exchange to test behavioral models of over- and underreaction. By using a unique dataset from Betfair, the largest betting exchange in the world, I will study behavioral models in betting exchanges and how they differ from bookmaker markets. The nature of the betting exchange effectively rules out the argument of bookmaker preferences causing inefficient price setting and thus affecting results. Another unique feature of this paper is the fact that I am able to construct measures for the time the market was open for every contract in my dataset and how much money was bet on each of these. This allows me to study the effects of time and traded volume on price response and market efficiency and the implications of these on behavioral models. Because of the similarities between the functioning of a betting exchange and regular financial markets, empirical evidence found in this paper can contribute to the growing literature on behavioral theories in financial markets.
3 Methodology & Hypotheses
This paper will use a unique dataset from the betting exchange platform Betfair. The platform makes all its data available through its website for members who actively bet on the platform. Betting exchanges form a market where individual bettors come together and set prices for betting contracts through a continuous auction process, similar to regular stock exchanges. Instead of a bookmaker quoting odds and acting as a market maker, prices are determined by the individual bettors on the platform. The betting exchange platform charges a commission fee on the bettor's net profits. For Betfair the commission ranges between 2% and 5%, depending on the bettor's individual betting activity and volume. Theoretically, odds could range from any number greater than 1 up to (near-)infinity. In practice betting platforms do not provide odds smaller than 1.01 (two decimals) and limit their odds around 1000 for the events deemed most unlikely, such as an underdog beating a favorite team by a 10 goal margin in football².

Following Franck et al. (2012), when a bet on the outcome $e$ of a certain event has been matched, bettors hold a contract on some future cash flow. The underlying payoff of the contract is determined by the odds $o_e$ and the direction of the cash flow is determined by the outcome of the underlying event. If he wins, the bettor has to pay the commission $c$ ($0 < c < 1$) on his net winnings. The expected return per unit staked is

$$E[R_e] = \phi_e (o_e - 1)(1 - c) + (1 - \phi_e)(-1) = \phi_e \left[ o_e (1 - c) + c \right] - 1, \tag{1}$$

where $\phi_e$ is the true probability of the outcome $e$ occurring.

The structure of the betting process is as follows: for a certain bet, the first time odds are matched and a transaction is executed gives the opening price $P_0$. The closing time of the market is the start of the event, which gives the closing price $P_1$. The event then starts and finishes at the game outcome $P_T$, at which time the true terminal value of the contract is revealed.
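As a quick numerical check on the expected return in equation (1), the sketch below evaluates both forms of the expression and solves for the odds at which a bet breaks even. The function and argument names (`phi` for the true win probability, `odds` for the decimal odds, `c` for the commission rate) are assumptions made for this illustration, and the input values are hypothetical.

```python
def expected_return(phi: float, odds: float, c: float) -> float:
    """Expected net return per unit staked: win (odds - 1) net of commission, or lose the stake."""
    return phi * (odds - 1.0) * (1.0 - c) + (1.0 - phi) * (-1.0)


def expected_return_compact(phi: float, odds: float, c: float) -> float:
    """Algebraically equivalent compact form: phi * [odds * (1 - c) + c] - 1."""
    return phi * (odds * (1.0 - c) + c) - 1.0


def breakeven_odds(phi: float, c: float) -> float:
    """Odds at which the expected return is zero, from phi * [o * (1 - c) + c] = 1."""
    return (1.0 / phi - c) / (1.0 - c)


if __name__ == "__main__":
    # Hypothetical inputs: an even-money bet with a 5% commission on net winnings.
    phi, odds, c = 0.5, 2.0, 0.05
    print("expected return:", expected_return(phi, odds, c))
    print("compact form:   ", expected_return_compact(phi, odds, c))
    print("break-even odds:", breakeven_odds(phi, c))
```

With a positive commission, a bet priced exactly at its fair odds (1 divided by the true probability) has a negative expected return, so the break-even odds lie slightly above the fair odds.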
The figure below shows this timeline of prices and returns, as presented by Moskowitz (2015):

[Figure: timeline of prices and returns, from opening price $P_0$ through closing price $P_1$ to game outcome $P_T$]

² Higher odds than 1000 do occur, but not for single-match betting events, which is the focus of this paper. Odds over 1000 are often seen for betting events that span a full tournament or season. For example, at the start of the 2015-2016 Premier League season, bookmakers quoted odds of up to 5000 for Leicester City winning the Premier League title, an event deemed highly unlikely by bookmakers. Leicester famously won the title, leaving bookmakers behind with big losses because of their price setting (http://www.telegraph.co.uk/news/2016/05/02/leicester-city-win-premier-league-and-cost-bookies-biggest-ever/).

The time between the opening price and the closing price may vary between a few minutes up to a few weeks. Since this paper only looks at football data, the time between closing and game outcome is approximately 105 minutes (2 halves of 45 minutes, 15 minutes half-time break and additional injury time)³. As time progresses between opening and close, prices may change for similar reasons as they would in a regular financial market. Prices could change if bettors enter the market who think the contract is mispriced or because new information arrives, for example the injury of a key player.

Prices can move from open to close for information and non-information reasons and might respond rationally or irrationally to information. Take for example the situation where somewhere between market open and close, a team's key player is injured. As a response to this, the odds of the bet will change. If this happens for information reasons and rationally, the closing price will be a better predictor of the outcome of the game than the opening price. Also, if price setting happens in a fully rational manner, there will be no return predictability from market close to the end of the game, as the closing price equals the expectation of the terminal value, $P_1 = E[P_T]$.
Intuitively this makes sense, since at market close (the start of the event, when $P_1$ is set) all information about the bet’s underlying event should be known and included in the price, from starting line-ups and, for instance, how injury-prone the respective players are, to stadium attendance, weather conditions and other factors that might influence the game. Following the same reasoning, the price movement from market open to close should have no predictive value for the close-to-end return under the rational hypothesis.

³ In playoff rounds, football matches that end in a tie go into 30 minutes of overtime and end with a penalty shoot-out if the overtime does not produce a winner. However, ‘match odds’ betting contracts only apply to the result after the regular playing time of 90 minutes, regardless of overtime being played. Overtime and penalty bets have their own separate contracts on betting platforms.

This brings up the base regression:

$$R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon \qquad (2)$$

and the following predictions regarding the rational response to information:

1. If prices move ($P_0 \neq P_1$) for information reasons and markets respond rationally to the news, then $\beta_1 = 0$.

The next prediction follows from the idea that prices could move from market opening to closing for purely non-information reasons, such as investor sentiment or pure noise. In this case the closing price is wrong and the price will be corrected as the game ends and the true price is revealed. The open-close return should then negatively predict the close-end return as prices move back to true value at the terminal date. If there was no information content at all in the price movement, prices will fully revert to the original opening price, leading to the second prediction:

2. If prices move ($P_0 \neq P_1$) for non-information reasons, then $\beta_1 = -1$.

Another scenario is that prices move for information reasons, but the markets respond irrationally to the news, overreacting or underreacting to news concerning the underlying sports event.
This idea of under- and overreaction comes from the theories and models presented by Daniel et al. (1998), Barberis et al. (1998) and Hong and Stein (1999). If this were the case, closing prices would still be wrong, but the close-end return would also be predictable from the open-close return. The third prediction then becomes:

3. If prices move ($P_0 \neq P_1$) for information reasons but markets respond irrationally to the news, then
   a. $\beta_1 > 0$ if underreaction
   b. $\beta_1 < 0$ if overreaction

These three predictions correspond to the information, non-information and irrational information response hypotheses respectively, and they will be tested further in this paper. The three hypotheses are summarized in the figure below, with the remark that if $\beta_1 = -1$, the purely non-information hypothesis is confirmed.

4 Data & Summary Statistics

This paper uses a unique dataset from the betting platform Betfair. Data are collected through the website data.betfair.com, which becomes accessible once the user has accumulated a certain number of Betfair points, which are granted based on betting activity and volume. In 2015, the Betfair betting platform had an annual turnover of £55.3 billion (Betfair, 2015). Betfair posts its historical betting data in weekly files, going back as far as June 2004. This paper uses data from the calendar years 2012-2016, with 260 weekly data files as its base. The datasets contain all betting data for all betting events that are available on the Betfair website, ranging from sports events to political elections and other miscellaneous events. Table A1 in the Appendix provides an overview of the variables these datasets contain.

This paper will examine the football market and will use the single betting contract that concerns whether a team wins, draws or loses, also called “Match Odds”. As Franck et al.
(2012) also describe, the underlying reason is that these bets are the most popular and most frequently traded on betting platforms. When a betting website is opened, these odds are often the first to be displayed. I also drop all bets that were made during the sporting event instead of before the game, as these are not the focus of this paper. Next I drop all observations for which “VOLUME MATCHED” is smaller than 1, since these observations have no real economic value.

The next step is to determine the odds at market opening and at market close. For each match I observe all odds that have been traded, meaning that both a buyer and a seller of the contract were found at those odds. The market opening odds are those of the first transaction that took place (“FIRST TAKEN” in the dataset). The market closing odds are those of the latest transaction, given by the “LATEST TAKEN” variable in the dataset. I also construct a measure for market time, which is simply the time between the first and the last trade, expressed in hours. I then make sure that the opening and closing odds are matched with their respective match, so that each betting event has a single opening and a single closing price and there are no duplicates in the dataset.

Next I drop all observations where the market time equals exactly 0, meaning that in the data “FIRST TAKEN” is also “LATEST TAKEN”. Doing this drops 216,457 observations, more than a quarter of my total dataset. Since this is a significant amount, I analyze the characteristics of these dropped observations and find that for the vast majority only one transaction took place (“NUMBER BETS” equals 2, implying one single matched transaction) and the volume matched was lower than GBP 10. Next I calculate prices and returns from equation (1) and the actual outcome of the match. Following Franck et al.
(2012), I assume a Betfair base commission of 5%. This provides me with the return measures for open-close, open-end and close-end. Following Moskowitz (2015), I drop observations where the return from market open to close is smaller than -300% or larger than 300%. Comparing these outliers with the volume matched and market time measures suggests that they are most likely errors in the data or in its time stamps: for these observations the market time is mostly shorter than 1 hour and volume matched is low. I therefore exclude them, since they have no real economic significance. I am then left with my cleaned full dataset, which contains 601,543 ‘match odds’ betting contracts covering 250,382 football matches from the years 2012-2016. Table 1 shows summary statistics on the full dataset.

Table 1: Summary Statistics of the full Betfair dataset
This table shows the most important variables that are included in the dataset. Odds are the price at which a contract was traded. The bettor’s stake multiplied by the odds forms the contract’s payoff if the bettor wins the bet. Matched Bets is the number of bets that were matched on a certain contract, with 2 being the minimum as that indicates there is one buyer and one seller of the contract. Volume Matched is the total amount of money matched on a contract in GBP. Market Time is the time in hours between the first time a transaction took place and the last time the contract was traded. Win Flag is a variable taking the value 1 if the contract’s underlying bet was won and 0 if it was lost.
                  Count      Mean     SD       Min      Max
Odds              601,543    4.34     11.38    1.01     1000
Matched Bets      601,543    33.82    89.67    2        9331
Volume Matched    601,543    2,641    24,509   1        3,795,204
Market Time       601,543    12.12    33.32    0.0003   3,627
Win Flag          601,543    0.34     0.47     0        1
N                 601,543

I then look at how the observations are distributed over volume matched and market time by dividing the full sample of contracts into 5 subsamples of different market time. Table 2 shows descriptive statistics of these subsamples. The main point of interest is that more than two-thirds of the observations in the dataset have a market time of less than 6 hours and have relatively low volume matched and few matched bets. Economically, this means that a large majority of the data concerns betting contracts with only little time between the first and the last matched bet. In other words, the market was open for only a short time and as a result only a few bets were matched, with a relatively low value.

As these factors may influence how well a market functions, I also construct a subset of the full dataset that contains contracts from longer and higher-volume markets. Smith et al. (2006) use a minimum of GBP 2,000, as they argue that any observation below that mark would not have enough liquidity to be treated as representative. The criteria I use are a market time of at least 24 hours and at least GBP 1,000 in matched volume. I will refer to these datasets as the full dataset and the long-market high-volume subset (or LM-HV dataset).

Table 2: Summary statistics of betting contracts over different market time horizons
This table shows the average market time in hours, volume matched in GBP and the number of matched bets with standard deviations in parentheses. The total dataset of 601,543 observations is divided into 5 subsets of different market times. Longer markets have higher volume matched and more matched bets on average.
More than two-thirds of the observations of the full dataset are contracts with a market time of less than 6 hours.

                  Market Time  Market Time  Market Time  Market Time  Market Time
                  0-6h         6-24h        24-72h       72-150h      >150h
Market Time       1.231        12.81        43.14        102.3        241.1
                  (1.56)       (5.44)       (14.08)      (21.10)      (220.80)
Volume Matched    1,245        2,465        4,622        14,340       67,698
                  (13,785)     (20,659)     (23,389)     (65,268)     (160,637)
Matched Bets      18.32        42.32        73.43        134.6        362.6
                  (45.20)      (86.79)      (107.4)      (207.5)      (471.0)
N                 412,969      60,933       51,137       32,266       28,420

Next I turn to the return distributions, which are calculated using equation (1), summarized in Table 3 and visualized in Figure B1 in the Appendix. For the full dataset, the open-to-end mean return is -7.26% and the close-to-end mean return is -6.07%. This implies that a random bet placed at the time of market opening yields a return of -7.26% on average, and -6.07% if the bet is placed at the time of market closing. The minimum return of a bet is, logically, -100%, as this is simply a bettor’s cash flow when he loses his bet. The maximum return in the dataset is 61,655%, corresponding to odds of 650 on the underlying betting contract. Although this might seem like a very high number, a highly unlikely betting outcome can be expected to occur among more than 600,000 betting contracts. Checking the dataset for a possible excess of extreme returns, I find that the second-highest return is 4,655% (corresponding to odds of 50), after which the returns gradually decrease in size. The mean of the open-to-close return of the full sample is -1.19% and the minimum and maximum are limited at -300% and 300%, as described in the data cleaning section above. Another noticeable fact is that standard deviations of returns seem to be large, which may be caused by the large part of the dataset that has a fixed loss of -100%. Examining the LM-HV subsample of the dataset, the results are very similar.
Both open-to-end and close-to-end returns are negative, at -4.58% and -4.08% respectively. Maximum return for the LM-HV subsample is 2,565% and standard deviations of returns are still relatively large at 149.34% and 149.51%, though lower than in the full dataset.

Table 3: Summary statistics of returns
This table shows return distributions over different parts of the timeline of betting contracts.

Panel A: Full Dataset
                  Count      Mean      SD        Min        Max
R_open:end        601,543    -7.26%    171.77%   -100%      61655%
R_close:end       601,543    -6.07%    173.75%   -100%      61655%
R_open:close      601,543    -1.19%    26.1%     -299.25%   299.25%

Panel B: LM-HV subsample
                  Count      Mean      SD        Min        Max
R_open:end        43,942     -4.58%    149.34%   -100%      256.50%
R_close:end       43,942     -4.08%    149.51%   -100%      256.50%
R_open:close      43,942     -0.50%    19.10%    -298.30%   294.50%

Figure B1 in the Appendix plots the distribution of returns with the full dataset in the left column and the LM-HV subsample in the right column. Both open-to-end and close-to-end returns have a mass at -1 in both the full sample and the LM-HV sample, representing the fraction of lost bets. For open-to-close returns, the full dataset and the subset have similar distributions, with returns centered at zero, implying that for the majority of contracts prices did not move or moved only slightly.

The full dataset and the LM-HV subset are further summarized in Table B1 in the Appendix. This table shows skewness, kurtosis and percentiles for matched bets, volume matched and market time. It shows that in the full dataset the majority of the observations have a relatively low number of matched bets, a low amount of money matched and a short market time. The median values are 12 matched bets (6 transactions), GBP 144 matched and a market time of 1.67 hours (100 minutes). From an economic perspective, the question can be asked whether there ever was a functioning market for these observations.
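The sample selection rules described in this section can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names are hypothetical and do not correspond to the column labels in the Betfair files, returns are expressed as fractions, the removal of bets placed during the event is not modeled, and the handling of observations that fall exactly on a bucket boundary is a guess, since the text does not specify it.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    # Hypothetical field names, chosen for this sketch only.
    volume_matched: float      # GBP matched on the contract
    market_time_hours: float   # hours between first and last matched trade
    r_open_close: float        # open-to-close return as a fraction (3.0 = 300%)


def keep_in_full_sample(c: Contract) -> bool:
    """Filters for the cleaned full dataset, as described in the text."""
    return (
        c.volume_matched >= 1.0             # drop volume matched below 1
        and c.market_time_hours > 0.0       # drop market time of exactly 0
        and -3.0 <= c.r_open_close <= 3.0   # drop returns beyond +/-300%
    )


def in_lm_hv_subset(c: Contract) -> bool:
    """LM-HV criteria: at least 24 hours of market time and GBP 1,000 matched."""
    return (
        keep_in_full_sample(c)
        and c.market_time_hours >= 24.0
        and c.volume_matched >= 1000.0
    )


def market_time_bucket(hours: float) -> str:
    """Market time horizons of Table 2 (boundary handling is an assumption)."""
    if hours < 6.0:
        return "0-6h"
    if hours < 24.0:
        return "6-24h"
    if hours < 72.0:
        return "24-72h"
    if hours <= 150.0:
        return "72-150h"
    return ">150h"
```

Under these rules a contract with the median characteristics reported above (GBP 144 matched over 1.67 hours) stays in the full dataset, falls in the 0-6h bucket and is excluded from the LM-HV subset.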
Table 4 reports return correlations for the three return horizons. As expected, open-to-end returns are very highly correlated with close-to-end returns: 0.99 for both the full dataset and the LM-HV subsample. The economic interpretation of this is that prices on average move only slightly between market opening and closing. For the full dataset, open-to-close returns have almost zero correlation with open-to-end returns and are negatively correlated with close-to-end returns. This is different for the LM-HV subsample, where the open-to-close return is slightly positively correlated with the open-to-end return but negatively correlated with the close-to-end return.

Comparing these results to the literature, the distributions of returns and correlations seem to be in line with the results found by Moskowitz (2015). Although Moskowitz finds return distributions with slightly higher returns and smaller standard deviations, he also finds that returns on the contracts are on average small and negative. The correlations found for the LM-HV subsample also come close to the results in Moskowitz’s paper.

Table 4: Return Correlations
This table shows correlations of returns for the full dataset (Panel A) and the LM-HV subsample (Panel B).

Panel A: full dataset
                  R_open:end   R_close:end   R_open:close
R_open:end        1
R_close:end       0.989        1
R_open:close      -0.001       -0.151        1

Panel B: LM-HV sample
                  R_open:end   R_close:end   R_open:close
R_open:end        1
R_close:end       0.992        1
R_open:close      0.054        -0.073        1

5 Results

The goal of this paper is to test whether returns are predictable from price movements and whether this predictability is affected by market time and betting volume. I will test whether the return from market close to end can be predicted from the return from market opening to closing. Recall the regression as stated in equation (2):

$$R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon \qquad (3)$$

Table 5 shows the results of testing price movements for the full dataset and the LM-HV sample.
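To make the testing procedure concrete, the following Python sketch estimates equation (3) by ordinary least squares and computes t-statistics of $\beta_1$ against the null values 0 and -1. It is an illustration only: it uses simulated data with a true slope of -0.5 (not the Betfair data), plain OLS standard errors that assume homoskedastic errors, and hypothetical function names. The text does not state which standard errors are used for Table 5.

```python
import math
import random


def ols_slope(x, y):
    """OLS of y on a constant and x; returns (alpha, beta, se_beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(rss / (n - 2) / sxx)  # homoskedastic standard error
    return alpha, beta, se_beta


def t_stat(beta, se_beta, null_value):
    """t-statistic for the null hypothesis beta = null_value."""
    return (beta - null_value) / se_beta


# Simulated returns only: an overreaction scenario with true beta_1 = -0.5.
rng = random.Random(0)
r_open_close = [rng.gauss(0.0, 0.2) for _ in range(5000)]
r_close_end = [-0.05 - 0.5 * x + rng.gauss(0.0, 0.5) for x in r_open_close]

alpha, beta, se = ols_slope(r_open_close, r_close_end)
t_vs_zero = t_stat(beta, se, 0.0)        # prediction 1: beta_1 = 0
t_vs_minus_one = t_stat(beta, se, -1.0)  # prediction 2: beta_1 = -1
```

In this simulated case both t-statistics are large in absolute value, so predictions 1 and 2 would both be rejected, while the negative slope is consistent with overreaction (prediction 3b).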
For the full dataset, the $\beta_1$ coefficient is -1.0068, very close to -1. Testing the three hypotheses, I reject H1 that $\beta_1$ is zero, I do not reject H2 that $\beta_1$ is equal to -1, and for H3 I reject the underreaction hypothesis ($\beta_1 > 0$) but do not reject the overreaction hypothesis ($\beta_1 < 0$). Turning to the subset of the data, the LM-HV sample has a $\beta_1$ coefficient of -0.5727, which is significant at the 0.1% level. I reject hypotheses 1, 2 and 3a but do not reject hypothesis 3b.

Table 5: Testing price movements for full dataset and LM-HV sample
This table shows the results for the regression $R_{close:end} = \alpha + \beta_1 R_{open:close} + \epsilon$, with the full dataset in the left column and the LM-HV sample in the right column. The t-scores are presented in parentheses below the $\beta_1$ coefficient. Also reported are the results for testing the three hypotheses on information versus sentiment in price movements.

Full Dataset LM-HV Sample