Judgment and Decision Making, Vol. 5, No. 2, April 2010, pp. 124–132

Gambler’s , hot hand belief, and the time of patterns

Yanlong Sun∗ and Hongbin Wang University of Texas Health Science Center at Houston

Abstract

The gambler’s fallacy and the hot hand belief have been classified as two exemplars of human misperceptions of random sequential events. This article examines the times of pattern occurrences where a fair or biased coin is tossed repeatedly. We demonstrate that, due to different pattern composition, two different statistics (mean time and waiting time) can arise from the same independent Bernoulli trials. When the coin is fair, the mean time is equal for all patterns of the same length but the waiting time is the longest for streak patterns. When the coin is biased, both mean time and waiting time change more rapidly with the probability of heads for a streak pattern than for a non-streak pattern. These facts might provide a new insight for understanding why people view streak patterns as rare and remarkable. The statistics of waiting time may not justify the prediction by the gambler’s fallacy, but paying attention to streaks in the hot hand belief appears to be meaningful in detecting the changes in the underlying process. Keywords: gambler’s fallacy; hot hand belief; heuristics; mean time; waiting time.

1 Introduction is the representativeness heuristic, which attributes both the gambler’s fallacy and the hot hand belief to a false The gambler’s fallacy and the hot hand belief have been belief of the “law of small numbers” (Gilovich, et al., classified as two exemplars of human misperceptions of 1985; Tversky & Kahneman, 1971). By this account, random sequential events and widely studied in multi- people tend to believe that a local sample should resem- ple disciplines such as , sports, behavioral ble the underlying population and chance is perceived as economics and neuroeconomics (e.g., Camerer, Loewen- “a self-correcting process in which a deviation in one di- stein, & Prelec, 2005; Gilovich, Griffin, & Kahneman, rection induces a deviation in the opposite direction to 2002; Gilovich, Vallone, & Tversky, 1985; Kahneman, restore the equilibrium” (Tversky & Kahneman, 1974, p. 2002; Malkiel, 2003; Rabin, 2002; Rabin & Vayanos, 1125). Thus, in the gambler’s fallacy, a tail is due to re- 2010). Often manifested in more intricate forms, these verse a streak of heads. In the hot hand belief, a streak two phenomena can be demonstrated by independent and of successes may indicate the existence of a hot hand by identically distributed Bernoulli trials. Suppose that a fair which the streak tends to be prolonged (see also Tversky coin with equal probabilities of coming up a head (h) & Gilovich, 1989). and a tail (t) is tossed repeatedly and the first three out- However, the representativeness account has been criti- comes produce three heads (h, h, h). In predicting the cized for its incompleteness and testability (e.g., Ayton & next outcome, one with the gambler’s fallacy would pre- Fischer, 2004; Falk & Konold, 1997; Gigerenzer, 1996; dict (h, h, h, t) — a reversal of the streak. In contrast, Kubovy & Gilden, 1991). Ayton and Fischer (2004) sug- one with the hot hand belief would predict (h, h, h, h) — gest that the gambler’s fallacy arises from the experience a continuation of the streak. of negative recency in sequences of natural events such The fact that people exhibit two opposing expectations as roulette games, but the hot hand belief arises from the upon the same past information — negative recency in experience of positive recency in serial fluctuations in hu- the gambler’s fallacy and positive recency in the hot hand man performance. Similarly, it has been proposed that the belief — has been the center of attention in the research hot hand belief can arise when people evaluate the perfor- on of , pattern detection and judg- mance of a mutual fund manager rather than the fluctua- ment of uncertainty (for reviews, see, Ayton & Fischer, tions of the portfolio (Rabin, 2002; Rabin & Vayanos, 2004; Oskarsson, Van Boven, McClelland, & Hastie, 2010), or, the gambler’s luck rather than the outcomes 2009). Among existing theories, a prevailing account of a roulette game (Croson & Sundali, 2005; Sundali & Croson, 2006). Moreover, Burns and Corpus (2004) ∗ Address: Yanlong Sun ([email protected]), or, Hongbin show that subjects assume positive recency for forecast- Wang ([email protected]), School of Health Information Sciences, University of Texas Health Science Center at Houston, 7000 ing scenarios they rated as “nonrandom” and negative re- Fannin St. Suite 600, Houston, TX 77030. cency for scenarios they rated as “random”. Burns (2004)

124 Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 125

further argues that the hot hand belief is a fast and fru- It is the goal of this paper to demonstrate a plausible link gal heuristic to detect changes in the shooting accuracy between the statistics of pattern times and people’s per- of basketball players. This argument is consistent with ception of randomness. the finding of “residual nonstationarity” in Sun (2004), in It is important to note that the concept of waiting which it is suggested that the fluctuations in players’ per- time has recently received attention in psychology liter- formance can be obscured by real-time adjustments based ature (Hahn & Warren, 2009; Sun, Tweney, & Wang, on the detection of a hot hand. For example, after making 2010a, 2010b). Hahn and Warren (2009) show that, in several shots in a row, a player might try a more difficult a global sequence of moderate length, streak patterns shot or the opponent players may increase the defense ef- such as (h, h, h, h) have higher “probabilities of nonoc- fort. (For a review on the hot hand study, see Bar-Eli, currence” than (h, h, h, t). Base on this result, they argue Avugos, & Raab, 2006.) that, given people’s limited exposure to the environment Compared to the representativeness account, the alter- (e.g., the number of coin tosses is limited), mispercep- native interpretations distinguish the hot hand belief from tions of randomness such as the gambler’s fallacy might the gambler’s fallacy by deviations from a random pro- actually emerge as apt reflections of these environmen- cess. When the underlying process is truly random (or tal statistics. Sun, Tweney, and Wang (2010a) criticize statistically impossible to tell apart from independent and Hahn and Warren’s interpretation by clarifying the rela- stationary Bernoulli trials), both beliefs are considered tionship between the probability of nonoccurrence and as or misperceptions of randomness. In particu- waiting time. In particular, Sun et al. argue that the proba- lar, both beliefs appear to share a common intuition that bility of nonoccurrence is a manifestation of waiting time, streak patterns are “rare” and “remarkable” — a streak which is independent of the length of the global sequence, of heads is unlikely to occur if the coin is fair, or, a bas- and neither statistic would justify the prediction of revert- ketball player is unlikely to make shots in streaks unless ing of a streak by the gambler’s fallacy (also see Sun, et he or she has a hot hand. However, the independence al., 2010b). Notwithstanding the debate, the argument assumption of Bernoulli trials states that, for a fair coin, of treating waiting time as a part of the environmental a streak will occur as often as any other patterns of the statistics appears to be quite plausible. Given that dif- same length in its exact order (i.e., the equiprobability of ferent statistics can arise from the same process of coin “n-grams”, Falk & Konold, 1997, p. 306). Then, what tossing (or basketball shooting), it is likely that they have is so special about streak patterns that people normally been actually experienced by people and have different tend to avoid them and only expect them when they feel effects on people’s perception of randomness. In the fol- “hot”? In the present paper, we show that streak patterns lowing, we examine these statistics in detail and discuss do possess a set of properties that set them apart from their psychological implications. other patterns, and these properties may provide an alter- native explanation for the particular role of streak patterns in people’s perception and judgment of randomness. 2 Mean time, waiting time, and We exemplify by comparing two patterns (h, h, h, t) variance of interarrival times and (h, h, h, h). When a fair coin is tossed repeat- edly, both patterns have the same probability of occur- Let us call the occurrence of a pattern an arrival of the pat- rence in any four successive trials. However, it takes tern when a coin (fair or biased) is tossed repeatedly. We on average 16 tosses to encounter the first occurrence of can define a counting process N (n) , n ≥ 1, where N (n) (h, h, h, t) but 30 tosses to encounter the first occurrence denote the number of arrivals of a pattern by the time n of (h, h, h, h). In other words, streak pattern (h, h, h, h) (i.e., by the nth toss). The process has parameters µ and has been “delayed” for its first occurrence. The expected σ2 as the mean and variance of the time between suc- number of trials required for the first occurrence of a par- cessive arrivals. (A more detailed treatment is provided ticular pattern is a statistical property known as “waiting in the Appendix. The results presented in this paper are time”, which can be different among patterns due to dif- verified by simulations conducted in the R statistics envi- ferent pattern compositions (see Gardner, 1988; Graham, ronment and the scripts are available in this issue of the Knuth, & Patashnik, 1994). While the probability of oc- journal.) currence (or frequency) describes how often a pattern oc- For a particular pattern, its interarrival time T is de- curs, the waiting time describes when a pattern will occur fined as the number of tosses between any two successive from the time at which monitoring begins. Interestingly, occurrences of the pattern, and the first arrival time T ∗ is these are different statistical properties and clearly bear defined as the number of tosses until the first occurrence different psychological relevance. For example, for a pas- since the beginning of the counting process. The mean senger who is waiting for a bus, when the first bus arrives of interarrival times E [T ], hence referred to as “mean probably is more relevant than how often the bus arrives. time”, is determined by the individual probabilities of the Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 126

elements in the pattern. Assume a fair coin with equal Figure 1: Self-overlap within patterns. Pattern probabilities of heads and tails, p = p = 1/2, h t (h, h, h, h) overlaps with itself at 3 positions when a copy −4 is shifted to the right end each time by one position, and E [Th,h,h,t] = E [Th,h,h,h] = µ = (ph) = 16. (1) pattern (h, h, h, t) has no shifted overlap. Overlapped el- That is, patterns (h, h, h, t) and (h, h, h, h) have the ements are underlined. same mean time (16 tosses) between successive arrivals. h h h h h h h t This is equivalent to the statement that (h, h, h, t) and h h h h h h h t (h, h, h, h) have the same probability of occurrence. Re- gardless of the number of coin tosses, a gambler will en- h h h h h h h t counter either pattern equally often. After n tosses, the h h h h h h h t expected number of encounters for either pattern is the same as in Table 1: Mean and variance of the first arrival time, and E [N (n)] = (n − 4 + 1) /16. (2) mean and variance of interarrival times for patterns of length 4 when a fair coin is tossed repeatedly. Recipro- However, the mean of the first arrival time E [T ∗], the cal patterns are listed only once, for example, (h, h, h, t) waiting time, can be different due to the different amount is equivalent to (t, t, t, h). For non-overlapping patterns of “self-overlap” within a particular pattern (see Figure such as (h, h, h, t), the two pairs of statistics are identical. 1). The amount of self-overlap (s) can be defined as the maximum length of a sub-pattern that has to occur twice Patterns E [T ∗] Var (T ∗) E [T ] Var (T ) (with or without overlap) to start and finish one occur- rence of the original pattern. For example, among all (h, h, h, t) 16 144 16 144 patterns of length 4, pattern (h, h, h, h) has the largest (h, h, t, t) 16 144 16 144 amount of self-overlap (s = 3), and (h, h, h, t) is non- (h, t, t, t) 16 144 16 144 overlapping (s = 0). A direct consequence of self- (h, t, h, h) 18 210 16 208 overlap is that the pattern’s first occurrence will be de- (h, h, t, h) 18 210 16 208 layed when s > 0. Imagine that one is waiting for an occurrence of (h, h, h, h) and has already obtained three (h, t, t, h) 18 210 16 208 heads — a sub-pattern of length 3, (h, h, h) — if the 4th (h, t, h, t) 20 276 16 272 toss is a tail, the waiting has to start from scratch and the (h, h, h, h) 30 734 16 592 waiting time spent on the sub-pattern (h, h, h) is wasted. In contrast, when one is waiting for pattern (h, h, h, t) and has already obtained three heads, if the 4th toss is a completely separated from each other thus more evenly head, the waiting continues but it still has three heads to distributed, consecutive reoccurrences of (h, h, h, h) can start with. It can be shown that, for a fair coin, among all overlap with each other and tend to be clustered (see Fig- patterns of length 4, (h, h, h, h) and (h, h, h, t) have the ure 1). As a consequence, among all possible patterns of longest and shortest waiting times, respectively (also see length 4, these two patterns have the smallest and largest Table 1), variance of interarrival times, respectively:

£ ∗ ¤ £ ∗ ¤ Var (Th,h,h,t) = 144,SD (Th,h,h,t) = 12; E Th,h,h,h = E Th,h,h + E [Th,h,h,h] = 14 + 16 = 30 (3) Var (Th,h,h,h) = 592,SD (Th,h,h,h) = 24.33.

£ ∗ ¤ E Th,h,h,t = E [Th,h,h,t] = 16. (4) 3 Frequency versus delay Moreover, it can be shown that waiting time E [T ∗] is almost perfectly correlated to the variance of interar- In essence, the contrast between mean time and waiting rival times Var (T ) for patterns of the same length (see time lies in the contrast between “frequency” and “de- Table 1 and Appendix). Intuitively, both E [T ∗] and lay”. On one hand, the mean time estimates the average Var (T ) are direct consequences of the self-overlapping distance between consecutive occurrences and equals to property, and the amount of self-overlap in the pattern the inversion of the probability of occurrence, therefore, it determines the minimum distance by which a consecutive is a measure of frequency. When the coin is fair, patterns occurrence can follow (i.e., the shortest interarrival time). of the same length have the same mean time thus the same While consecutive reoccurrences of (h, h, h, t) have to be frequency to occur (see Equations 1 and 2). On the other Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 127

hand, waiting time estimates when a pattern will occur Figure 2: Probabilities of occurrence at least once for pat- since one starts counting and the time of occurrence is de- terns (h, h, h, t) and (h, h, h, h) when a fair coin is tossed layed on the basis of the pattern’s mean time: Equations N times. (3) and (4) show that the amount of delay for an over- lapping pattern (s > 0) equals the waiting time for the repeating sub-pattern of length s; for a non-overlapping pattern (s = 0), no delay is incurred and its waiting time always equals its mean time (also see Table 1).1 If one assumes that people’s perception of random- ness is shaped by the environment (e.g., Ayton & Fischer, 2004; Lopes & Oden, 1987; Pinker, 1997), it is likely that people have actually experienced different statistics from the same process, although they might not be aware of the exact distinction. The contrast, either between mean time and waiting time, or between frequency and delay, might have important implications regarding people’s percep- tion of sequential patterns, particularly in the gambler’s Probability of Occurrence At Least Once h h h t fallacy and the hot hand belief. In the following, we first h h h h examine the (ex-ante) perception or expectation of pat- 0.0 0.2 0.4 0.6 0.8 1.0 terns as an integrated sequence, then, the prediction of a 10 20 30 40 single outcome based on the perception of patterns. N (number of tosses) First, due to the largest amount of self-overlap, a streak pattern is the most delayed pattern for its first occur- rence, comparing to all other patterns of the same length. ity in the pattern composition (Figure 2 shows the com- The amount of delay is considerably large even for short parison between (h, h, h, h) and (h, h, h, t)). This fact streaks (see Table 1), and it will grow exponentially as might from another prospective explain why streak pat- the length of the streak grows. For example, for a streak terns are under-represented in people’s perception. That of 10 heads in tossing a fair coin, its mean time is 1024 is, because of its clustering tendency, overlapped reoc- tosses, and its waiting time is 2046 tosses, 1022 tosses currences of a streak pattern may be counted only once away from the mean time (which is the waiting time for or replaced by one count of a longer streak. Such specu- a streak of 9 heads). Given that the mean time remains lation appears to be consistent with the finding in a recent the same for all patterns of the same length, it is possi- study by Olivola and Oppenheimer (2008): when partici- ble that people’s sense of rareness about streak patterns pants recalled the studied binary sequence, the lengths of have stemmed from their experiences of the long waiting streaks imbedded in the original sequence were underes- times. timated. Moreover, the waiting time statistic can manifest itself Nevertheless, although the long waiting time might in many other forms. One example is the probability of provide a statistical basis to justify people’s perception occurrence at least once — the probability that a partic- of streak patterns as rare events, it does not justify the ular pattern occurs at least once when a coin is tossed N prediction of a single outcome to reverse (or, to avoid) times — which is complementary of the probability of a streak pattern by the gambler’s fallacy. By the in- nonoccurrence — the probability that a particular pattern dependence assumption of Bernoulli trials, given that will not occur at all in N tosses. The latter probability has one has already obtained three heads in a row, the ad- been discussed by Hahn and Warren (2009), and Sun et ditional time to encounter (h, h, h, h) is E [Th,h,h,h], and al. (2010a) provide an analytical solution to both proba- the additional time to encounter (h, h, h, t) is E [Th,h,h,t], bilities. It can be shown that among all patterns of length which is the same in the case of a fair coin (see Equa- 4, the streak pattern (h, h, h, h) has the lowest probabil- tions 3 and 4). That is, the statement that the streak ity of occurrence at least once for any N > 4, which pattern(h, h, h, h)’s first occurrence is “delayed” is an ex- is another consequence of the self-overlapping probabil- ante expectation when the pattern is treated as a whole as 1We have been discussing the case in which individual patterns are one starts tossing the coin from scratch. However, such counted separately. When multiple patterns are counted simultaneously, statement does not mean that the “streak-reversal” pat- a statistical property known as “nontransitivity” will arise in which a tern (h, h, h, t)’s first occurrence is “expedited” since its pattern with a longer waiting time may occur earlier than a pattern with waiting time cannot be shorter than its mean time. In a shorter waiting time (Sun, et al., 2010a). However, it can be shown that streak patterns remain the most delayed patterns in the presence of other words, although waiting time (or probability of oc- nontransitivity. currence at least once) may depict (h, h, h, t) as the most Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 128

“representative” pattern of the coin tossing process (for Figure 3: Mean time and waiting time as the function its waiting time is equal to its mean time, or, its occur- of the probability of heads (p ) for patterns (h, h, h, t) rences are most evenly distributed), it does not predict h and (h, h, h, h). To illustrate the difference in detail, the that a streak of heads will soon be reversed by a tail, thus, function is plotted in the range of p = [0.3, 0.7]. it does not vindicate the error in the gambler’s fallacy. h Then, what about the prediction of a single outcome to h h h t continue a streak by the hot hand belief? The debate over h h h h the statistical validity of the hot hand belief has lasted more than twenty years (e.g., Bar-Eli et al., 2006), and it is not likely to be ended by simply introducing a new set of statistics. However, pattern time statistics do seem to support some of the existing theories. In particular, it has been suggested that the hot hand belief arises when peo-

ple are evaluating human performance, and people pay Mean Time particular attention to streak patterns in order to detect a change in the performance, for example, fluctuations in the shooting accuracy of basketball players (e.g., Ayton & Fischer, 2004; Burns, 2004; Burns & Corpus, 2004; Sun, 2004). By such account, the prediction to continue a streak is actually valid on the basis of a higher probabil- 0 50 100 150 ity of a single outcome (e.g., a higher shooting accuracy, 0.3 0.4 0.5 0.6 0.7 a higher probability of heads in case of a biased coin). It Probability of Heads can be shown that by the measure of either mean time or waiting time, streak patterns are indeed a good indicator h h h t for detecting the changes in the probability of single out- h h h h comes. Figure 3 shows mean time and waiting time as the function of the probability of heads (ph), respectively. It shows that with a small change in ph, both mean time and waiting time change more rapidly for pattern (h, h, h, h) than for pattern (h, h, h, t). For example, when ph in- creases from .5 to .6, E [Th,h,h,hh ] dropsi from 16 tosses to

∗ Time Waiting 7.7 tosses (∆ = 8.3) and E Th,h,h,h drops more rapidly from 30 tosses to 16.8h tossesi (∆ = 13.2). In contrast, ∗ E [Th,h,h,t] and E Th,h,h,t only drops from 16 tosses to 11.6 tosses (∆= 4.4) (note that for pattern (h, h, h, t), its mean time and waiting time are identical at all levels 0 50 100 150 of ph). Furthermore, Figure 3 also shows that the mean time 0.3 0.4 0.5 0.6 0.7 and waiting time depict different pictures regarding the Probability of Heads occurrences of streak pattern (h, h, h, h) at various lev- els of ph. For example, at ph = .5, (h, h, h, h) and (h, h, h, t) are indifferent by the measure of mean time As mentioned before, waiting time E [T ∗] is in ef- but quite distinguishable by the measure of waiting time, fect an indicator of the variance of the interarrival times a fact we have mentioned before. It also shows that Var (T ) as both are direct consequences of the inter- (h, h, h, h) will occur more frequently than (h, h, h, t) overlapping property of the pattern. From this prospec- (shorter mean time) as soon as ph > .5, but remains de- tive, the contrast between frequency (mean time) and layed (longer waiting time) for its first occurrence until delay (waiting time) actually reflects the contrast be- ph > .7. Then, one may wonder if the mean time and tween the mean and variance of the same random vari- waiting time have different effects on people’s percep- able interarrival times, E [T ] and Var (T ). To evaluate tion, or, if people perceive the properties of frequency the combined effect of frequency and delay on people’s and delay differently, how these effects can be integrated perception, a quantitative measure may be provided by to produce a single response in the subjective preference the mean-variance paradigm in the modern portfolio the- over patterns? ory (Markowitz, 1952; Sharpe, 1994) or the coefficient Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 129

of variation in risk perception (Weber, Shafir, & Blais, Figure 4: Frequency-Delay ratio as a function of the 2004).2 In this paradigm, to describe the desirability of probability of a head (ph) for patterns (h, h, h, t) and an option in the tradeoff between return and risk, a Sharpe (h, h, h, h). To illustrate the difference in detail, the func- Ratio (Sharpe, 1994) is calculated as the ratio between the tion is plotted in the range of p = [0.3, 0.7]. expected return and the variance of the returns. Similarly h for pattern occurrences, we can calculate a “Frequency- h h h t Delay ratio” (100/µσ) to describe the tradeoff between h h h h the frequency of occurrence (1/µ) and the delay (σ), where µ = E [T ] and σ = SD (T ) are the mean and standard deviation of interarrival times, respectively, and the constant 100 is to express the ratio as a percentage. Analogous to the Sharpe Ratio by which people would prefer an option with a higher return and a lower risk, the assumption underlying the Frequency-Delay ratio is that people would be more willingly to make a prediction on a pattern with a higher frequency of occurrence and a lower amount of delay. In other words, a person can possess both the gambler’s fallacy and the hot hand belief, and Ratio 100/(mu*sigma) Frequency−Delay which belief arises would depend on how frequency and

delay are weighted separately and how they are incorpo- 0 1 2 3 4

rated. Figure 4 shows the Frequency-Delay ratio at var- 0.3 0.4 0.5 0.6 0.7

ious levels of ph, where pattern (h, h, h, t) has a higher Probability of Heads ratio when ph < .61, and pattern (h, h, h, h) has a higher ratio when ph > .62. Described by such ratio, a person would be more willingly to predict on (h, h, h, t) when in shaping people’s perception of random patterns. It ph < .61, and more willingly to predict on (h, h, h, h) has long been argued that people may have an accurate when ph > .62. That is, “a streak of heads is unlikely to sense of randomness but fail to reveal it in their behav- occur if the coin is fair, and a basketball player is unlikely ior (e.g., Pinker, 1997; for an overview of different opin- to make shots in streaks unless he or she (really) has a hot ions, see Rapoport & Budescu, 1992). The distinction hand.” between mean time and waiting time may provide a new prospective in the studies on people’s perception of ran- dom events. For instance, the mean time does not differ- 4 Conclusion entiate any patterns in the case of tossing a fair coin; if one assumes people have acquired accurate experiences We presented a set of statistics on the time of pattern oc- from the environment, it is quite possible that people’s currences in Bernoulli trials. In particular, we demon- notion of streak patterns as rare events is guided by the strated that, due to the different pattern composition, dif- waiting time. ferent statistical properties can arise. The mean time It should be noted that the statistics of pattern times can measures the frequency of pattern occurrences, and the manifest themselves in many forms and each manifesta- waiting time measures the amount of delay in pattern tion may receive different interpretations, depending on occurrence times. While previous research on percep- the specific assumptions about human perception of se- tion of random patterns has focused on the mean time quential events and the specific task environment. For ex- or the frequency of occurrences (e.g., Budescu, 1987; ample, regarding the distinction between frequency and Falk & Konold, 1997; Gilovich, et al., 1985; Lopes & delay, people may be more sensitive to one property than Oden, 1987; Nickerson, 2002; Oskarsson, et al., 2009), to the other; or one property is more important than the the waiting time and the property of delay have not been other in different situations. For a passenger waiting for addressed until recently. It is likely that people are not a bus, if he or she is concerned only with catching the able to precisely differentiate these statistics. However, first bus (or, at least one bus), the waiting time (delay) given that these statistics can be observed from the same is certainly more important. However, if the passenger process, it is possible that they all have played roles is interested in estimating the number of bus arrivals in a certain period of time, the mean time should be the statis- 2 It should be noted that although we adopt a paradigm in risk analy- tic of choice. Another example is that, if we assume the sis, our intention here is to present a quantitative measure for the com- bined effect of frequency and delay, and we withhold claims about risk actual basketball shooting as tossing a coin with a fixed preferences in the hot hand belief or gambler’s fallacy. probability of heads (as the null hypothesis), the hot hand Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 130

belief would be judged as a fallacy. However, if we as- iors: The adaptiveness of the “hot hand”. Cognitive sume that the basketball player’s shooting accuracy is ini- Psychology, 48, 295–331. tially unknown and can fluctuate, paying attention to the Burns, B. D., & Corpus, B. (2004). Randomness and in- occurrence of streaks may actually be an effective way to ductions from streaks: “Gambler’s fallacy” versus “hot detect the change. hand”. Psychonomic Bulletin & Review, 11, 179–184. Even in simple cases like coin tossing, people’s per- Camerer, C., Loewenstein, G., & Prelec, D. (2005). ception of randomness surely cannot be reduced to a cer- Neuroeconomics: How neuroscience can inform eco- tain set of statistics. Many other perceptual and cognitive nomics. Journal of Economic Literature, 43, 9–64. mechanisms can come into play, such as perception of Croson, R., & Sundali, J. (2005). The gambler’s fallacy proportion and symmetry (Rapoport & Budescu, 1997), and the hot hand: Empirical data from casinos. Journal subjective complexity (Falk & Konold, 1997), or, work- of Risk and Uncertainty, 30, 195–209. ing memory capacity (Kareev, 1992). Nevertheless, the Falk, R., & Konold, C. (1997). Making sense of ran- statistics of pattern times appear to qualify for a useful domness: Implicit encoding as a basis for judgment. toolset since they provide objective measures in situa- Psychological Review, 104, 301–318. tions where sequential information is essential. For ex- Gardner, M. (1988). Time travel and other mathematical ample, people respond differently depending on whether bewilderments. New York: Freeman. they view sequences all at once on paper or they actually Gigerenzer, G. (1996). On narrow norms and vague observe sequences unfolding over time (e.g., Olivola & heuristics: A reply to Kahneman and Tversky. Psy- Oppenheimer, 2008). They habitually look for sequen- chological Review, 103, 592–596. tial patterns and their perception of patterns influence their responses in single experimental trials even when Gilovich, T., Griffin, D., & Kahneman, D. (Eds.). (2002). Heuristics and biases: The psychology of inuitive judg- the sequence of trials is completely independent (Barr- ment. New York: Cambridge University Press. aclough, Conroy, & Lee, 2004; Huettel, 2006; Huettel, Mack, & McCarthy, 2002). Moreover, the dissociation Gilovich, T., Vallone, R., & Tversky, A. (1985). The hot of frequency and delay might have important implica- hand in basketball: On the misperception of random tions in studies on people’s intertemporal choices. And it sequences. Cognitive Psychology, 17, 295–314. has been suggested that people are sensitive to time dis- Graham, R. L., Knuth, D. E., & Patashnik, O. (1994). counting while the behavioral and neural effects of delays Concrete mathematics. Reading MA: Addison- are independent of probability (e.g., Luhmann, Chun, Yi, Wesley. Lee, & Wang, 2008; McClure, Laibson, Loewenstein, & Hahn, U., & Warren, P. A. (2009). of ran- Cohen, 2004). In these aspects, we conjecture that ex- domness: Why three heads are better than four. Psy- amination of the pattern time statistics, combined with chological Review, 116, 454–461. empirical experiments, might be a fruitful approach in Huettel, S. A. (2006). Behavioral, but not reward, risk future investigations on pattern detection, perception of modulates activation of prefrontal, parietal and insu- randomness, and judgment and decision-making under lar cortices. Cognitive, Affective, & Behavioral Neuro- uncertainty. science, 6, 141–151. Huettel, S. A., Mack, P. B., & McCarthy, G. (2002). Per- ceiving patterns in random series: Dynamic processing References of sequence in prefrontal cortex. Nature Neuroscience, 5, 485–490. Ayton, P., & Fischer, I. (2004). The hot hand fallacy and Kahneman, D. (2002). Maps of bounded rationality: A the gambler’s fallacy: two faces of subjective random- perspective on intuitive judgment and choice. In T. ness? Memory & Cognition, 32, 1369–1378. Frangsmyr (Ed.), Les Prix Nobel 2002 [Nobel Prizes Bar-Eli, M., Avugos, S., & Raab, M. (2006). Twenty 2002]. Stockholm, Sweden: Almquist & Wiksell In- years of “hot hand” research: Review and critique. ternational. Psychology of Sport & Exercise, 7, 525–553. Kareev, Y. (1992). Not that bad after all: Generation of Barraclough, D. J., Conroy, M. L., & Lee, D. (2004). Pre- random sequences. Journal of Experimental Psychol- frontal cortex and decision making in a mixed-strategy ogy: Human Perception and Performance, 18, 1189– game. Nature Neuroscience, 7, 404–410. 1194. Budescu, D. V. (1987). A Markov model for generation Kubovy, M., & Gilden, D. (1991). Apparent randomness of random binary sequences. Journal of Experimental is not always the complement of apparent order. In G. Psychology: Human Perception and Performance, 13, R. Lockhead & J. R. Pomerantz (Eds.), The perception 25–39. of structure (pp. 115–127). Washington, DC: Ameri- Burns, B. D. (2004). Heuristics as beliefs and as behav- can Psychological Association. Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 131

Lopes, L. L., & Oden, G. C. (1987). Distinguishing be- Sundali, J., & Croson, R. (2006). Biases in casino bet- tween random and nonrandom events. Journal of Ex- ting: The hot hand and the gambler’s fallacy. Judgment perimental Psychology: Learning Memory and Cogni- and Decision Making, 1, 1–12. tion, 13, 392–400. Tversky, A., & Gilovich, T. (1989). The cold facts about Luhmann, C. C., Chun, M. M., Yi, D.-J., Lee, D., & the “hot hand” in basketball. Chance, 2, 16–21. Wang, X.-J. (2008). Neural dissociation of delay and Tversky, A., & Kahneman, D. (1971). Belief in the law of uncertainty in intertemporal choice. Journal of Neuro- small numbers. Psychological Bulletin, 76, 105–110. science, 28, 14459–14466. Tversky, A., & Kahneman, D. (1974). Judgment un- Malkiel, B. G. (2003). A random walk down Wall Street. der uncertainty: Heuristics and biases. Science, 185, New York: W. W. Norton. 1124–1131. Markowitz, H. M. (1952). Portfolio selection. Journal of Weber, E. U., Shafir, S., & Blais, A.-R. (2004). Predict- Finance, 7, 77–91. ing risk sensitivity in humans and lower animals: Risk McClure, S. M., Laibson, D. I., Loewenstein, G., & Co- as variance or coefficient of variation. Psychological hen, J. D. (2004). Separate neural systems value im- Review, 111, 430–445. mediate and delayed monetary rewards. Science, 306, 503–507. Appendix. The mean of the first ar- Nickerson, R. S. (2002). The production and perception of randomness. Psychological Review, 109, 330–357. rival time, and the mean and variance Olivola, C. Y., & Oppenheimer, D. M. (2008). Random- of interarrival times ness in retrospect: Exploring the interactions between memory and randomness cognition. Psychonomic Bul- Let X1,X2, ... be independent variables with letin & Review, 15, 991–996. P {Xi = j} = p (j), j ≥ 0 . In the case of coin Oskarsson, A. T., Van Boven, L., McClelland, G. H., & tossing, j = 0, 1, p(0) = ph and p(1) = pt represent Hastie, R. (2009). What’s next? Judging sequences of the probabilities of a head and a tail, respectively. binary events. Psychological Bulletin, 135, 262–285. For a pattern (x1, ..., xr) of length r, we say that Pinker, S. (1997). How the mind works. New York: Nor- an arrival (renewal) occurs at time n, n ≥ r, if ton. (Xn−r+1, ..., Xn) = (x1, ..., xr). LetN (n) denote Rabin, M. (2002). Inference by believers in the law of the number of arrivals by time n. N (n) , n ≥ 1, is a small numbers. Quarterly journal of Economics, 117, counting process in which the first arrival time has a 775–816. different distribution than the other interarrival times. Rabin, M., & Vayanos, D. (2010). The gambler’s and Then, N (n) , n ≥ 1 is said to be a general or delayed hot-hand : Theory and applications. Review of renewal process with parameters µ and σ2 as the mean Economic Studies, 77, 730–778. and variance of the time between successive arrivals. Rapoport, A., & Budescu, D. V. (1992). Generation Define indicator variables I (i), I (i) = 1 if there is an of random series in two-person strictly competitive arrival at time i and I (i) = 0 otherwise, i ≥ r. I (i) are games. Journal of Experimental Psychology: General, Bernoulli random variables with parameter p, where 121, 352–363. Y Rapoport, A., & Budescu, D. V. (1997). Randomization r p = p (xi). (5) in individual choice behavior. Psychological Review, i=1 104, 603–617. Then, the mean interarrival time is given by Sharpe, W. F. (1994). The Sharpe Ratio. Journal of Port- folio Management, 21, 49–58. µ = 1/p, (6) Sun, Y. (2004). Detecting the hot hand: An alternative model. In K. Forbus, D. Gentner & T. Regier (Eds.), and the variance of interarrival times is given by Proceedings of the 26th annual conference of the cog- Xr−1 nitive science society (pp. 1279–1284): Chicago, IL: σ2 = p−2 (1 − p) + 2p−3 Cov (I (r) ,I (r + j)). Lawrence Erlbaum. j=1 Sun, Y., Tweney, R. D., & Wang, H. (2010a). Occurrence (7) and nonoccurrence of random sequences: Comment on An overlap index s is defined as the maximum number Hahn and Warren (2009). Psychological Review, 117, of elements at the end of the pattern that can be used as in press. the beginning part of the next arrival, Sun, Y., Tweney, R. D., & Wang, H. (2010b). Postscript: Untangling the gambler’s fallacy. Psychological Re- view, 117, in press. s = max {j < r :(ir−j+1, ..., ir) = (i1, ..., ij)} . (8) Judgment and Decision Making, Vol. 5, No. 2, April 2010 Gambler’s fallacy hot hand, and timing 132

We further define s = 0 when no equality can be found of each other, it follows that I (r) I (r + j) = 0 if 1 ≤ in Equation (8). Then, s = 0 for (h, h, h, t), s = 2 for j ≤ (r − s − 1). Therefore, from Equation (7), we have (h, t, h, t), and s = 3 for (h, h, h, h). 2 Since the first arrival time can have a different distri- Var (T ) = σ bution, let T denote the interarrival time of the process = p−2 − (2r − 1) p−1 (11) ∗ when reoccurrences are included, and T denote the first Xr−1 arrival time. For pattern (h, h, h, t), r = 4 and s = 0, so + 2p−3 E [I (r) I (r + j)] . that N (n) , n ≥ 1 is an ordinary renewal process. There- j=r−s fore, for a fair coin, p = p = 1/2, from Equation (6), | {z } h t for overlapped arrivals £ ∗ ¤ E [Th,h,h,t] = E Th,h,h,t = µ = 1/p = 16. The overlapped arrivals can happen in the following ways, Since two arrivals of (h, h, h, t) cannot occur within a distance less than r (r = 4) of each other, it follows that 1 E [I (4) I (5)] = P {h, h, h, h, h} = ; I (r) I (r + j) = 0 when 1 ≤ j ≤ r − 1, and, 32 1 E [I (4) I (6)] = P {h, h, h, h, h, h} = ; Cov (I (r) ,I (r + j)) 64 1 = −E [I (r)] E [I (r + j)] = −p2. E [I (4) I (7)] = P {h, h, h, h, h, h, h} = . 128 Assume a fair coin, from Equation (7), we obtain Thus, ¡ ¢ ∗ Var (T ) = (16)2 − 7(16) Var (Th,h,h,t) = Var Th,h,h,t µ ¶ 1 1 1 = σ2 = p−2 − (2r − 1) p−1 = 144. + 2(16)3 + + 32 64 128 For pattern (h, h, h, h), r = 4 and s = 3, and = 592.

∗ ∗ From Equations (10) and (11), it can be seen the over- Th,h,h,h = Th,h,h + Th,h,h,h, (9) lapped arrivals will extend both the mean of the first ar- ∗ rival time and the variance of interarrival times. Thus, where Th,h,h,h is the first arrival time for pattern ∗ these two variables are correlated, and both are positively (h, h, h, h), Th,h,h is the first arrival time for pat- tern (h, h, h), and Th,h,h,h is the interarrival time for determined by the overlap index s. The procedure of cal- (h, h, h, h). Since the coin is tossed independently, we culating the variance of the first arrival times is omitted have here.

£ ∗ ¤ £ ∗ ¤ E Th,h,h,h = E Th,h,h + E [Th,h,h,h] . (10)

For a fair coin, from Equation (6), we have

E [Th,h,h,h] = µ = 16. h i ∗ Then, E Th,h,h,h can be obtained by recursively apply- ing Equation (10) s times, starting from the shortest non- overlap pattern (h). That is,

∗ E [Th] = E [Th ] = 1/ph = 2;

£ ∗ ¤ ∗ E Th,h = E [Th ] + E [Th,h] = 2 + 4 = 6; ...

£ ∗ ¤ E Th,h,h,h = 2 + 4 + 8 + 16 = 30. For the variance of interarrival times, since no two ar- rivals of (h, h, h, h) can occur within a distance (r−s−1)