<<

Statistics, Probability and Chaos Author(s): L. Mark Berliner Source: Statistical Science, Vol. 7, No. 1 (Feb., 1992), pp. 69-90 Published by: Institute of Mathematical Statistics Stable URL: http://www.jstor.org/stable/2245991 . Accessed: 15/08/2011 11:09

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at . http://www.jstor.org/page/info/about/policies/terms.jsp

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship. For more information about JSTOR, please contact [email protected].

Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to Statistical Science.

http://www.jstor.org Statistical Science 1992, Vol. 7, No. 1, 69-122 Statistics, Probabilityand Chaos L. Mark Berliner

Abstract.The studyof chaotic behavior has receivedsubstantial atten- tionin manydisciplines. Although often based on deterministicmodels, chaos is associated with complex,"random" behavior and formsof unpredictability.Mathematical models and definitionsassociated with chaos are reviewed.The relationshipbetween the mathematicsof chaos and probabilisticnotions, including ergodic theoryand uncertainty modeling,are emphasized.Popular data analyticmethods appearing in the literatureare discussed.A major goal of this articleis to present some indicationsof how probabilitymodelers and statisticianscan contributeto analyses involvingchaos. Key wordsand phrases: Dynamicalsystems, , nonlinear series,stationary processes, prediction.

1. INTRODUCTION tions of chaos, I offera reviewof the basic notions ofchaos withemphasis on thoseaspects of particu- unpre- Chaos is associated with complex and lar interestto statisticiansand probabilists.Many dictable behavior of phenomenaover time. Such of the referencesgiven here provideindications of in behaviorcan arise deterministicdynamical sys- the breath of interestin chaos. Jackson (1989) tems. Many examples are based on mathematical providesan introductionand an extensivebibliog- models for (discrete)time series in which, after raphy [also, see Shiraiwa (1985)]. Berge, Pomeau startingfrom some ,the value of and Vidal (1984),Cooper (1989) and Rasband(1990) the series at any time is a specified,nonlinear discuss applicationsof chaos in the physical sci- functionof the previousvalue. (Continuoustime ences and .Valuable sourcesfor work processesare discussed in Section 2.) These proc- on chaos in biologicaland medical scienceinclude esses are intriguingin that the realizationscorre- May (1987), Glass and Mackey (1988) and Basar spondingto different,although extremelyclose, (1990). Wegman(1988) and Chatterjeeand Yilmaz initial conditionstypically diverge. The practical (1992) present reviews of particular interestto implicationof this phenomenonis that,despite the statisticians.Finally, useful, "general audience" underlyingdeterminism, we cannot predict,with introductionsto chaos includeCrutchfield, Farmer, any reasonableprecision, the values of the process. Packard and Shaw (1986), Gleick (1987), Peterson for large time values; even the slightesterror in (1988) and Stewart(1989). specifyingthe initialcondition eventually ruins our Section 2 presents discussion of the standard attempt.Later in this article,indications that real- mathematicalsetup of nonlinear dynamical sys- izations of such dynamical can display tems.Definitions of chaos are reviewedand chaotic characteristicstypically associated with random- behavioris explainedmathematically, as well as by ness are presented.A majortheme of this studyis example. Next, I review relationshipsbetween thatthis connection with suggests that chaos and probability.Two key pointsin this dis- statisticalreasoning may play a crucialrole in the cussionare: (i) the roleof ergodic theory and (ii) the analysis ofchaos. suggestionof uncertainty modeling and analysisby Stronginterest has recentlybeen shown in nu- probabilisticmethods. Statistical analyses related merous literaturesin the areas of nonlineardy- to chaos are discussed in Sections 3 and 4. In namical systemsand chaos. However,rather than Section3, the emphasisis on some "data analytic" attemptingto providean overviewof the applica- methodsfor analyzingchaotic data. The goals of these techniquesbasically involve attempts at un- derstandingthe structureand qualitative aspects of models and data displayingchaotic behavior. L. MarkBerliner is AssociateProfessor, Department [Specifically,the notionsof (i) estimationof dimen- of Statistics,Ohio State University,141 Cockins sion, (ii) Poincaremaps and (iii) reconstructionby Hall, 1958 Neil Avenue,Columbus, Ohio 43210. time delays are reviewed.]Although statisticians

69 70 L. M. BERLINER are now beginningto make contributionsalong specifyingthe initial conditionis in the sixth these lines, the methods describedin Section 3 decimalplace. This sortof behavior, known as sen- have been developedprimarily by mathematicians sitivity to initial conditions, is one of the key and physicists.In Section4, I discusssome possible componentsof chaos. To amplifyon this phe- strategiesfor methods of chaotic data analysisbased nomenon,the firstframe of Figure 2 presentsdot on main streamtechniques for statistical modeling plots, at selected time values, of the dynamical and inference.Finally, Section 5 is devotedto gen- correspondingto 18 initial conditions eral remarksconcerning statistics and chaos. equally spaced in the interval [0.2340, 0.2357]. There are two messages in this plot. First, note that the images of these 18 points are quickly 2. ,PROBABILITY AND CHAOS attractedto the unit interval.Second, the initial 2.1 The Complexityof Nonlinear Dynamical conditionsappear to get "mixed" up in an almost noncontinuousmanner. (However,for the logistic Systems , xt is, of course,a continuousfunction of xo A simpledeterministic may be forall t). The secondframe of Figure 2 is a scatter- definedas follows.For a discretetime index , plot of the values of the logisticmap after2000 T = {0, 1, 2, ... }, considera timeseries { x,; te T}. iterates against the correspondinginitial condi- Assume that xo is an initial conditionand that tions for4000 initials equally spaced in the inter- xt+1= f(xt), forsome functionf that maps a do- val [0.10005, 0.3]. There is clearly essentiallyno main D into D. (D is typicallya compactsubset of meaningfulstatements about the relationshipbe- a metricspace). Chaoticbehavior may arise when f tween x2000and xo, even though x2000is a well- is a nonlinearfunction. definedpolynomial (of admittedlyhigh To begin,some numerical examples for one ofthe order) of xo. (Note that presentingthis graph more popular examples of dynamicalsystems, the for 2000 iterates is a bit of "overkill." Corre- logisticmap, are given. The dynamicalsystem is spondingscatterplots after even a 100 or so iterates obtainedby iteratingthe function f( x) = ax(l - x), wouldlook quite the same.) where a is a fixedparameter in the interval[0,4]. Let xo be an initialpoint in the interval[0,1]; note 2.1.1 Some mathematicsfor nonlinear dynamical that then all futurevalues ofthe systemalso lie in systems [0,1].To get a bit ofthe flavorof this map,example This discussionis intendedto providesome flavor computationsare presentedfor an importantvalue of the mathematicsconcerning the appearance of of a: namely,a = 4.0. Figure 1 presentstime series complexor chaoticbehavior in nonlineardynami- plots of the first500 iterates of the logisticmap cal systems.The presentationis a bit quick, and correspondingto the initial values 0.31, 0.310001 untilSection 2.1.3, considersone-dimensional maps and 0.32. The firstthing to notice about these only.More completedetails maybe foundin Collet series is that their appearance is "complex." In- and Eckmann(1980), Rasband (1990) and Devaney deed, one mightbe temptedto suggestthese series (1989). We begin by consideringthe long-run are "random." Also, despitethe similarityin the behavior of a dynamicalsystem generated by a initial conditions,visual inspectionof the series nonlinearfunction f. The studybegins with the indicatesthat they are not quantitativelysimilar. considerationof fixedpoints of f; namely,those To make the ,I have includedscatterplots of pointsthat are solutionsto f(x) = x. The key re- these series,matched by time. The first25 iterates sult in this contextis the followingproposition. ofthe maps in these plots are indicatedby a differ- Using conventionalnotation, let ffn(.) denotethe ent symbol fromthe rest. Points falling on the 450 n-foldcomposition of f. line in these plotssuggest time values at whichthe correspondingvalues ofthe systemsare quite close. PROPOSITION 2.1. Let p be a fixed point of f. If We see that quite early in time, the three series I f'(p) I < 1, then thereexists an open intervalU "predict"each otherreasonably well. However,the about p such that, forall x in U, limn-+o fn(x) = p. similarityin the series diminishesrapidly as time increases. (The rate of this "separation" is in fact Under the conditionsof this proposition,p is an exponentialin time.) Note' that, except for very attractingfixed point and the set U is a stable set. early , the series correspondingto xo = It is also true that if if'( p) i > 1, p is a repelling 0.310001 is really no betterat predictingthe 0.31 fixed point. (In these two cases, p is said to be series than is the 0.32 series. Also, predictions hyperbolic;if I f'(p) I = 1, p acts as a based on xo = 0.31 when the correctvalue is and moredelicate analyses are required.The rest 0.310001 are verypoor, even thoughthe errorin ofthis discussionfocuses on hyperbolicpoints.) CHAOS 71

0.6 o.6 A 0.21 0.

100 200 300 400 100 200 300 400 series series

1.0 -

0.6

0.4 C

100 200 300 400

series (i)

1 .0 0 0 .0 x x x1 x x 0 X 0 0 0.8 - 0.8

0.6 - X 0.6 x | R O~~~~04| 0~~~~x ? X 04 A 0.4 0 A t x

0.2 -- 0 x 0.2 op x Qo0 x x , X 0191t' XX-tx 00X0

0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0

B C

1 .0 1 10@ .'T,,.. '*-. ,.::' ,'

0.8 .. . 0.8

0.6 . '*; 0.6

A ...... A

0.2 t. 0.2 t.;

0 0 0...... 0.2 0.4 0.6 0.8 1.0 0.2 0.4 0.6 0.8 1.0

B (ii) C FIG. 1. Examples ofthe : a = 4.0. Initial conditions: (A) x0 = .31; (B) X0 = .310001;(C) X0 = .32. (i) Time seriesplots for times 26 - 500. (ii) Scatterplotsmatched by time. In topplots, 0 denotes times 1 - 15 and x denotes times 15 - 25; bottomplots show times 26 - 500. 72 L. M. BERLINER

1.0 8 0 0 ?+

* x 0.8 4 X > + 0~~~~~~ x o x X +

0 ~~~x+

0.6 0O+ 0 x + 0~~~~~~~~~~~~~~ t + 0 X x~~~~~~ 0.4- +o o 0 O + 0 0 + + + 0.2 ~~0 x x

20 30 40 50 100 200 1.0 d.~~~~~~~~~~~~

0.0~~~~~~~~~~~~~~~~~~~5 (i)

%~~~~~~~~~~~~~~%.s

8. 0.40.2t/ *|;. -fi;!rX.:; .,a.

P I~~~~~~3~~.' .

6 * _

0.15 0.20 0.25 0.30 xo (ii) FIG. 2. Mixingbehavior of the logistic map: a = 4;0. (i) Dot plotsfor 18 initialconditions. 0 denotesxo = .2340, .2341, .2342, .2343, .2344, .2345; x denotesxo = .2346, .2347, .2348, .2349, .2350, .2351; + denotesxo = .2352, .2353, .2354, .2355, .2356, .2357. (ii) Scatterplotof logistic map at time2000 againstxo for4000 xo's in [.10005,.3].

Consider now the logistic map definedearlier. ing fixed point. Furthermore,the iterates x,,= The fixedpoints of this map, i.e., solutionsto the ff(x) are monotonically decreasing for all x in x = ax(1 - x), are easily seen to be 0 and (0,1],but boundedfrom below by zero,and so, since Pa = 1 - 1 /a. Notethat f'(x) = a(l - 2 x). We now f is continuous,limp ,f (x) = 0. Note that this considerthe behaviorof iteratesof the logisticfor argumentalso happensto coverthe nonhyperbolic various a. case when a = 1. 1. a < 1: Because f'(O) = a < 1, 0 is an attract- 2. 1 < a < 3: First,note that now 0 is a repelling CHAOS 73 fixedpoint. The transitionfrom attraction to repul- Further,we applyProposition 1.1 to f2 to inferthe sion occurredas f'(O)increased to one at a = 1 and existenceof a stable set, U, correspondingto p, became greaterthan one for a > 1. This is a flag such that for all x in U, lim j, nXf2n(X) = p, An that thereis a potentialchange in behavior.Con- analogous claim is true for Pl A bit more work siderthe otherfixed point Pa. Notethat I f'(Pa) I = thenyields that for all x (0,1),except for the preim- 12 - al < 1, and so Pa is an attractingfixed point ages of 0 and pa, and for3 < a < 1 + V1, xn is if 1 < a < 3. Thus, for all 0 < x < 1 there is a asymptoticallyattracted to oscillate between pi chance that limn ,f'fn(x) = Pa. This conclusionis and Pl Note that,theoretically, xn need not actu- alreadyguaranteed for the stable set of Pa. It is not ally ever reach pmor Pi exactly,although numeri- hard to demonstratethat iteratesof any point in cal computationstypically indicate exact (0,1) eventuallyenter the stable set. The conclusion on this two-pointattractor. is that for almost all, in the sense of Lebesgue Because I f2't(p)I = 1 at a = 1 + 6 and , x in [0,1], limn,fn(x) = Pa. The only I f2/(pu)I > 1 fora > 1 + V", we (correctly)antic- exceptionsare the endpoints0 and 1. ipate anotherbifurcation at a = 1 + V in which 3. a > 3: Now both 0 and Pa are repellingfixed the period two attractorsplits into a period four points.Because f'(Pa) > 1, Pa can no longerattract . However, the analysis based on the iteratesof the map exceptfor x's such that there analogs of(2.1) and (2.2) forthis, and later,bifurca- exists an n such that ffn(x) = Pa. This set is the tions is no longerfeasible. More delicatetools are collectionof preimagesof Pa. This set is countable, needed to mathematicallydescribe the behavior. because there is a "denumerable" for The resultis that the perioddoubling bifurcations findingthe preimages:Namely, first find the solu- ofperiods 4 to 8, etc.,continue as a increases,on to tionsto f( x) = Pa. In this step the solutionsare Pa "period 2X" at a,. = 3.56994.... The fundamen- and 1 - Pa. The next step is to "invert" 1 - Pa to tal mathematicsexplaining the "period doubling obtaintwo morepreimages, etc. Also, notethat for cascade" is due to the physicist,M. Feigenbaum all a, the preimagesof 0 are 0 and 1, but 1 has no (see Feigenbaum,1978). For a > a,, the asymp- preimages.To see what happens to otherpoints, totic behavior of the logisticbecomes even more considerthe fixedpoints of f2(x). (Fixed pointsof complexthan perioddoubling cascade and is still f2 are periodicpoints of period 2 forf). Now, we notcompletely understood. We observemore period mustsolve doublingas well as aperiodicbehavior. For some a xn appears to wander on a "singular" attractor (2.1) a(ax(l - x))(1 - ax(l - x)) - x = 0, (i.e., a Cantor set), whereas forother a, particu- larly, a = 4, as x0 varies, xn wanderson a "con- a fourthdegree polynomial. However, we can recog- tinuous" set. One interestingphenomenon occurs nize that both 0 and Pa are solutions.Accounting in a small intervalof a's near 3.83... . In this forthese solutions,we can findthe remainingtwo region, a period 3 attractorappears, and then by solvingthe qi!adratic quicklyunder goes its ownperiod doubling cascade. a a3x2 + (Pa - 2)a x This factimplies further complications in that,if is such nontrivialthree-cycle behavior is possible, (2.2) + a3(1 + Pa(Pa - 2)) + a2 = 0. then all other cycles have solutions. That is, if f3(x) = x has nontrivial solutions, so does f n(X) = The solutionsare x, for all n. This result is a consequence of Sarkovskii'sTheorem [see Devaney (1989), Section = + 1 + `a2- - 3 and Pu [a 2 a ]/2a 1.10]. Also, see Li and Yorke (1975). On the other mostone attract- Pi= [a+ 1 - Va2 -2 a -3 ]/2a. hand,for each fixeda, thereis at ing periodic cycle. That is, except for a set of Note that, of course, f(pu) = Pi. First,these roots Lebesgue measure zero, all initial values lead to are real and distinctif a2 - 2 a - 3 > 0 or if a > 3. paths that are attractedto the same periodiccycle, At a = 3, Pa = Pu = p1. That is, we observe a bifur- ifthere is attractingcycle. We have seen a version cationat a = 3, in whichthe attractingfixed-point of this propertyabove: for 3 < a < 1 + V, there Pa splits into two pieces. Next, we ask forwhat a are nontrivial solutions to both f(x) = x and are these rootsattracting fixed-points of f2? Note f2(x) = x. However,only the period2 cycle is at- that tracting.The periodone cycleis onlyobservable for of Pa and 0. = - - - the preimages f21(x) a2(1 2x)(1 2ax(1 x)), To illustratesome of the ideas concerningperi- = = and so f21(pu) f21(pl) = 1 - (a2 - 2a - 3). odic behavior,consider the case of a 3.83001. Tlru I , - t .2 _ . qj I . i y This value of a correspondsto an attractingthree- 74 L. M. BERLINER

cycle with attractor 0.1561 ... -. 0.5046... - . 0.9574.... Figure 3 presentsa scatterplotof the values of x500 against xo for 800 xo's equally spaced in the interval[0.17, 0.3]. After500 iter- ates, all of these initial points lead to paths that 0.64 approximatelycycle through the attractor;but the phase of the cyclingvaries in a complicatedfash- ion. This suggestsa formof sensitivityto initial 0 .4 conditions,at least forpoints outside of stable sets . correspondingto,G,, values in ,.the attractor,!'G0,;. in the 0 2-..I- :.,,,., periodiccase. By exploringother initial conditions,

we can in factestimate the stable sets correspond- @ ? | ~~~~~~~~~~~I ~ ~I ~ 'Sl ing to the threepoints of the attractor;that is, the 3.5 3.6 3.7 3.8 3.9 4 threeattracting fixed points of f3. For example,my a computationsindicate that the stable set corre- FIG. 4. Iteratesof the logistic map. spondingto the point 0.1561... is approximately = [0.14545,0.163571. This estimatecan be veri- I, tivityto initialconditions can occurwith Lebesgue fiedby checkingthat f3(X) E11 for xE 1I. This cri- terionwas numericallysatisfied for 2000 equally measure0. Maps forwhich all pointsmust separate underiteration are said to be expansive.However, spacedpoints in I,. To summarizethe asymptoticbehavior of the expansivenessis typicallytoo restrictivefor most logisticmap, considerthe plot in Figure 4. This maps. (For example,consider the logisticmap when plot is intendedto indicate the attractorof the a = 4. The initial conditions xo and 1 - xo result since logisticmap as a functionof a. For a gridof values in identicalrealizations of the map. Therefore, of a from3.45 to 4, iterates 101 to 221 of the xo can be chosenarbitrarily close to 0.5, expansive- set of all x logistichave been plotted.The result,known as an ness cannotbe claimed).However, the a = 4 is a orbitdiagram, is an interesting,complex object. leading to any periodicbehavior when set of Lebesgue measure zero. This sort of phe- 2.1.2 Whatis mathematicalchaos? nomenonis related to anothercomponent of the mathematicaldefinition of chaos, namely the set of to be a universallyac- There does not appear x's leadingto periodicbehavior is dense in D. That chaos. Thereare cepted,mathematical definition of is, complex,aperiodic behavior can arise despite what one mightmean by differentways to quantify the existenceof denselydistributed opportunities complexor unpredictablebehavior. The primary for well-orderedbehavior. This is implicitin the to ini- conceptappears to b& the notionsensitivity claim of Li and Yorke (1975) that "period three tial conditions,typically quantified as: implieschaos." I thinkthe way mostpeople would like to interpretsensitivity is as if the map were -+ DEFINITION2.1. f: D D. displays sensitivityto almost everywhere,typically, in the sense of initialconditions if there exists 6 > 0 such that for Lebesgue measure,expansive. any x in D and any neighborhoodV of x, there exists a y in V and n 2 0 such that fn(x)- DEFINITION 2.2. f:D - D is almost everywhere > fn(y)I 6 expansiveif there exists 6 > 0 such that foralmost all x in D and almost all y in D, there exists t'his definitionsuggests that thereexist pointsar- n 2 0 such that I fn(x)-fn(y) I > 6. bitrarilyclose to x that separatefrom x duringthe time evolutionof the dynamicalsystem. However, AlthoughI have not located discussionof such a the definitiondoes notsay all pointsmust separate, definition,there may be relationshipsto some of apparentlyleaving open the possibilitythat sensi- the definitionsrelating chaos and randomnessdis- cussedbelow. Anothermathematical concept associated with definitionsof chaos intuitivelyinvolves the rich- I...... I...... ness ofchaotic paths.

DEFINITION 2.3. f:D-+ D is topologically transi- 0.21 0.24 0.27 0.30 0.18 of sets U, V in D there FIG. 3. Periodic behaviorof the logisticmap: a = 3.83001. tive if forany pair open Scatterplotof X500 versusx0 for800 pointsin [.11, .3]. exists n > 0 suchthat f (U) ln v is not empty. CHAOS 75 This definitionsuggests that some paths of the dynamicalsystem generated by f eventuallyvisit

all regions of D. In fact, it turns out that f is 0.2 topologicallytransitive if and only if the map pos- sesses a path that is dense in D. To thispoint, I have onlylisted properties associ- 0.0 ated with chaos, but have not given an explicit definitionof chaos. Indeed,different definitions ex- ist. The popularizedview of chaos revolvesaround -0.2 sensitivityto initialconditions as in Definition2.1. A now standard and more complete,mathemati- , I- given in Devaney cally motivateddefinition is -1 0 1 (1989, page 50). Namely, a map is "chaotic" if it has the propertiesof: (i) sensitivityto initial condi- (i) tions (Definition2.1); (ii) topologicaltransitivity (Definition2.3); and (iii) periodicpoints are dense. Other related definitionsof chaos (positive Lia- punov exponents,as introducedin Section 3.1.1, and the existence of continuousergodic distribu- tions, as introducedin Section 2.2.3) involve no- 0 I1100 200 300 400 500 tions of ergodictheory. See Collet and Eckmann (ii) (1980) forfurther discussion. FIG. 5. Plotsfor Henon map. (i) Scatterplotof first 2000 iter- ates. (ii) First500 iteratesof x. 2.1.3 Dissipative systemsand chaos Much ofthe complexbehavior of the logisticis a resultof its noninvertibility.Indeed, noninvertibil- Carleson(1991).] To get a feelfor this map,Figure ityis requiredto observechaos forone-dimensional 5 presentsa scatterplotof the first2000 iterations dynamicalsystems, as definedhere. However,ev- (the attractor)of the map, as well as a time series erywhereinvertible maps in two or more dimen- plot of the "x" series. These computationswere sions can also exhibitchaotic behavior. Among the based on the conditionsa = 1.4, b = 0.3, xO= 0.4 many interestingfacets of dynamicalsystems, one and yo = 0.3. area that receivesmuch attentionis the studyof This exampleillustrates two important aspects of strangeattractors. The basis issue is the long-run chaoticbehavior. First, note that the complexgeo- behaviorof the system.As time proceeds,the tra- metricalstructure of the Henon attractor.This ob- jectoriesof systemsmay becometrapped in certain ject is of dimension.Such objectsappear in boundedregions of the of the system. the studyof manydynamical systems. The mathe- As notedeven forthe logisticmap, these trapping matics that suggestsuch behaviorare as follows. regionsor attractorscan displayremarkable oddi- The Henon map, viewed as a transformationfrom ties. An importantexample in two dimensionsis R2 to R2, has Jacobianequal to -b. If 0 < b < 1, the Henon map. This map can displaythe property we make the geometricalobservation that Henon ofhaving a strangeattractor; that is, the attractor map contracts the areas of sets to whichit is ap- "appears to be locallythe productof a two-dimen- plied. More generally,such maps are said to be sional manifoldby a Cantorset." This quote,along dissipative. (Maps that maintainarea underitera- with a motivationof the map, may be foundin tion are conservative.) Intuitively,the complex Henon (1976). Also, see the previousreferences for limitingbehavior of chaotic,dissipative dynamical discussion.The Henon map is givenby the follow- systemsis the resultof two competingmathemati- ing : cal trends. Dissipativenesssuggests that iterates tend to collapse to sets of Lebesgue measure zero. (2.3) xt1 = + yt - ax2 and Yt+1 = bxt However,an effectof chaos is to prohibitperiodic behavior.The naturalresults consistent with these forfixed values of a and b 'and t = 0, 1, .... This two phenomenais forthe systemto be attractedto invertiblemap can not onlypossess strangeattrac- an inflnite,singular set of Lebesgue measurezero tors,but also display strongsensitivity to initial (in an appropriatemanifold of Rk). Such attracting conditions.as encounteredearlier. [The rigorous sets are known as strange . (The above verificationof many of the propertiesof the Henon heuristicsare notcomplete. For example,conserva- map is actually very difficult;see Benedicks and tive systemscan displaychaotic behavior without 76 L. M. BERLINER being attracted to singular attractors. Rather, here. The reader can find a valuable discussion of conservative systems can have attractors that are the basics in Press, Flannery, Teukolsky and Vet- colorfully called "fat "; that is, complex terling (1986). geometrical objects that are essentially "space-fill- The followingexample of (2.5) is used in Section ing." Such systems will not be considered further 4. The differentialequation, known as the Lorenz .n this article. The reader may consult the refer- system, is extremely popular in the literature on ences for discussion.) chaos. The system,given component-wise,is Second, note that there is a relationship between the roles of time and dimension in the definitionof dx/dt = a(y - x), dynamical systems. In particular, (2.3) can be writ- (2.6) dy/dt = -xz + rx - y, ten as a one dimensional relationship if we allow dz/dt = xy - bz, consideration of more lags of time: where a, r and b are constants. Lorenz (1963) (2.4) x = 1 + bxt1 - ax. considered this system as a rough approximation to Statisticians familiar with more conventional time aspects of the dynamics of the Earth's atmosphere. series modeling might see a kinship between (2.4) Figure 6 presents plots in of a numeri- and a "nonlinear autoregressive model of order 2." cal approximation of a solution to (2.6) where a = I will return to such concepts in Section 4. 10, r = 28 and b = 8/3. (I tried to indicate why the attractor is nicknamed "The Butterfly.") For this 2.1.4 Continuous time dynamical systems:Differ- famous choice of the parameters, the solutions (see ential equations Figure 7) display sensitive dependence to initial Continuous time dynamical systems arise natu- conditions and "unpredictable" fluctuations. For rally in many applications in which the time almost all initial conditions,orbits are attracted to the evolution of the quantities of interest, composing a object displayed in the various panels of Figure 6. k-dimensionalvector x(t), are modeled via differen- Note that I have plotted points fromthe discrete time numerical tial equations. In particular, consider a initial value approximation in this figure. problem where x(O) is an initial condition and the "True" solutions to (2.6) are continuous and are dynamics of the system are quantified by the differ- attracted to a "continuous," yet strange attractorof ential equation Lebesgue measure zero, since the is dissipative. (2.5) dx(t)/dt = F(x(t)), t > 0. 2.2 Randomness and Chaos The value x(t) describes the state of the system at This section reviews various relationships be- time t; the domain of possible values of x( ) is tween chaos and randomness. The key ideas called the phase space. A specific solution to (2.5), involve the interrelations between sensitivity to in "plotted" the phase space, is known as an . initial conditions, uncertainty modeling and er- As in the case of discrete time, solutions to (2.5) godic theory. The discussion emphasizes ideas, but, can be chaotic in the sense that some views of the forthe sake of brevity,not rigor. solutions may appear "random," solutions display sensitivity to initial conditions and, in the case 2.2.1 Uncertainty,chaos and randomness of dissipative systems, now indicated when The main topic of this section is how uncertainty, = aFi1 /a xi < 0, orbits are attracted to strange especially in the presence of attractors. ,naturally leads to the use of random or probabilistic methods. Traditional methods for numerically solving dif- I will begin with a historical perspective. An early ferential equations typically involve discrete time and persuasive suggestion that deterministicmod- approximations. The simplest method, in the con- els may be of limited value is the followingdiscus- text of (2.5) uses the approximation = x(tn+l) sion of Laplace, circa 1800, (from Laplace, 1951, h + where the stepsize, h, is small F(x(tn)) x(tJ) page 4): and tn+1 = h + tn. Thus, numerical solutions to differentialequations are themselves typically dis- Given for one instant an intelligence which crete time dynamical systems. For the actual could comprehend all the forces by which na- computations in this article, I used a more sophisti- ture is animated and the respective situation of cated approximation known as the four-point the beings who compose it-an intelligence suf- Runge-Kutta method. This method appears to be ficiently vast to submit these data to analy- considered a standard method for solving differen- sis-it would embrace in the same formula the tial equations. Further details are not relevant movements of the greatest bodies of the uni- CHAOS 77

10 - > '. ~~~~~~1-.. 1**' j W: ~ ~~IZ )

r~~oV l ~~~ S[ljii l*jl j-u l ji ~ ji9

10

l l l

30~~~~~~~~~~~~~r2000 4000 6000 80000 TINE 20

*. ..'- .. ...* -... *. --e.. . -. .

40- -i 1 -tl, t "'

10 - ' E X ' - il i i* ! '-'*' i - @ | *

IN, %e-eZ iZv

: -314t-it-- ~ ~ ~ ~ ~ ~ * 2000 4000 6000 8000 TIME (i) FIG. 7. Time series plots forthe Lorenz system.

verse and those of the lightestatom; for it, nothingwould be uncertainand the future,as the past,would be presentto its eyes. Afterbriefly reviewing some of the success of sci- ence,Laplace continues: All these effortsin the searchfor truth tend to lead [human intelligence]back continuallyto the vast intelligencewhich we have just men- ..~.: A tioned,but fromwhich it will always remain infinitelyremoved. ..~~~~~~~~~~~~~~~~~~. Laplace beautifullyitemizes the need for perfect knowledgeof the natural laws and initial condi- tions in deterministicanalysis. However,he also clearlyquestions the relevanceof his "vast intelli- gence" vis-a-vishuman efforts.[Laplace has often been misunderstoodin this regard. Readers of Laplace who emphasize the firstportion of the above quote seem to believe that Laplace was a strictdeterminist. For example,see Stewart(1989, pages 11-12). For a particularlyunjust appraisal of Laplace, see Gleick (1987, page 14) Laplace also wrote, "It is remarkable that ... [the theory of probabilities]should be elevatedto the rank ofthe most importantsubjects of human knowledge" the FIG. 6. The Lorenzattractor: Four viewsof the Lorenz attractor (Laplace, 1951, page 195). These are hardly based on iterates1001-8000 of the four-pointRunge-Kutta algo- wordsof a strictdeterminist.] rithmwith stpsize = .01. A primaryexample of the use of probabilistic 78 L. M. BERLINER methodsto partiallyovercome difficulties in a com- mane to my currentthesis, however, is Poincare's plex, deterministicsetting is statisticalmechanics. explicitsuggestion that we can perceive"chance" The key issue is the studyof the motionsof a large to be at works as a result of our uncertainty. number(on the orderof 1024) of particlesfloating Poincarewent further by suggestingan operational aroundin a box. Suppose that the systemis such approachfor dealing with such problemsthat basi- that we can assume that the motion of all the cally involvethe treatmentof unknown initial con- particlesare governedby the deterministiclaws of ditions as random. (This idea is known as the motionof classical .In principle,we should "Method of ArbitraryFunctions" and will be dis- then be able to compute,given all of the requisite cussedfurther in Section5.) initial conditions,the exact futurebehavior of the entiresystem. However, we immediatelyrecognize 2.2.2 Defining chaotic to be random a problem:can we ever claim, for such a large I will only give a simple description,based on system,that we actually know,-with sufficient ac- Jackson(1989), ofthe basic idea of what is known curacyto performthe calculations,all ofthe initial as . Consider a discrete time conditions?At least for the last 140 years, the dynamicalsystem such as the logisticmap. At each commonlyagreed upon answeris no. Indeed,in the iterate,xn, n > 0, ofthe process,we will associate presence of uncertaintyconcerning these initial a simpleindicator function based on the locationof conditions,the motionof the (or, at least, xn. In particular,suppose we let Yn = 1 if Xn ES observablefunctions of these motions)not onlyap- and Y,,= 0 otherwise,where S is somesubset of D. pear "random,"but are successfullymodeled via That is, each initial conditionof the systemis now stochastictechniques. associatedwith an infinitesequence of 0's and l's. The secondpertinent class ofhistorical examples directlyrelates to nonlineardynamical chaos. For DEFINITION 2.4. A map is chaoticif forany se- example,probabilistic methods have longbeen sug- 0's and l's, thereexists an initial condi- gestedin the area of fluiddynamics, particularly, quence of the same sequence of defined ;see Grenanderand Rosenblatt(1984, tion yielding Yn's fixedS. Chapter5) forpertinent discussion. The genius of above forsome Poincare anticipateda great deal of the current interestin nonlineardynamics. Consider the now The idea here is that if this definitionis satisfied, frequentlycited commentsof Poincare,circa 1900 the deterministicdynamical system is, in a sense, (fromPoincare, 1946, pages 397-398): at least as richas Bernoullicoin tossing. For exam- ple, it turnsout that the logisticmap is chaoticas A very small cause, which escapes us, deter- long as a is large enough(a > 3.83 ... ) to permit mines a considerableeffect which we cannot periodthree (and, thus,all higherperiods) behav- help seeing,and thenwe say that the effectis ior. There is a directrelationship between Defini- due to chance. If we could know exactlythe tion 2.4 and the mathematicaldefinitions relating laws of nature and the situationof the uni- to periodicpoints being dense,as describedin Sec- verse at the Initial instant,we shouldbe able tion 2.1. I will returnto symbolicdynamics in the to predictthe situationof this same universeat followingsubsection. a subsequentinstant. But even whenthe natu- ral laws should have no furthersecret for us, 2.2.3 Ergodic theoryand chaos we could know the initial situationonly ap- Perhaps the strongestrelationships between de- If that permitsus to foreseethe proximately. terministic chaos and randomness are found situationwith the' same succeeding degreeof throughconsideration of ergodictheory. [Only a that is all we require,we say approximation, cursorypresentation is givenhere. Some valuable the had been predicted,that it is phenomenon references include Breiman (1968); Cornfield, laws. But it is not always the case; it ruledby Fomin and Sinai (1982); Eckmann and Ruelle may happen that slightlydifferences in the (1985); and Ornstein(1988).] To motivatethe cen- initial conditionsproduce very great differ- tral ideas, imaginethat some arbitrary system (sto- ences in the final a slighterror in phenomena; chastic or deterministic)is under study.Consider the formerwould make an enormouserror in one experimentin whichwe will observethe evolu- becomes and the latter. Prediction impossible, tion of some variable of the systemsover time. In we have the fortuitousphenomenon. anotherexperiment, we will observethe same vari- This eloquentpassage capturesthe keypoints for able as in the first,but forseveral "similar"repli- our discussion.It includesthe mathematicalprob- cates of the same systemat a given time point. lem of sensitivityto initial conditions.More ger- Ergodictheory seeks to answerthe question, "When CHAOS 79

can we expectthe average ofthe data, overtime, in 400 the firstexperiment, to be the same (in expecta- the average ofthe data, overthe replicates tion)as 300 at a fixedtime?" There is an intriguingrelation- shipbetween the questionof ergodic theory and the familiarquestion of "independentversus repeated 200 measures" observations.Of particularinterest to Bayesians, the role of exchangeabilityin ergodic theorydeserves attention. Pursuit of these issues is 100 beyondthe scopeof this paper. To see the role ofergodic theory in deterministic chaos,we will need a bit of formalism.Consider a 0.0000 0.1 625 0,3250 0.4875 0.6500 0.81 25 0.9750 probabilitytriple (X, F, P), where X is a sample (i) space for some experiment,F is a a-algebra of 300 subsetsof X and P is a probabilitymeasure. As- sume X is a subset of R1 and, thus,permit P to representa probabilitymeasure or corresponding 200 distributionfunction. Next, we introducea func- tion, f which maps X to X. For X distributed accordingto P, considerthe randomvariable f(X). The function,or transformation,f is said to be 100 measure preserving(or ) with respect to P ifthe randomvariable f(X) has distributionP. To solidifythis definitionby example, let X = [0,11 and let P be the arc-sine[or beta(0.5, 0.5)] distribu- 0.0000 0.1625 0.3250 0.4875 0.6500 0.81 25 0.9750 tion, with probabilitydensity function p( x) = (ii) {wr/x(1 - x) It is easy to checkthat the logis- FIG. 8. Example of ergodic behavior: Logistic map, a = 4.0. (i) }I-. Histogram of 4000 iteratesof = .20005. (ii) Histogram of the tic function, f(x) = 4 x(1 - x), is invariant with xo logistic map at time 2000 for 4000 xo's in [.10005, .30005]. respectto P. A fewmore definitions are neededto relate these ideas to dynamical systems. For an invertible show that exceptfor X, forwhich P(X) = 1, any transformationf, a subset AeF is invariantif invariant subset must be denumerableand thus f `(A) = A. For a noninvertiblef, invariance of A have P-measure0. Ergodicbehavior is suggested means f t(A) = A forall t > 0. Further,f is said to by consideringsample paths ofthe logistic.Figure be ergodic if for every invariant subset, AeF, 8 (i) presentsa histogramof 4000 iteratesof this P( A) = 0 or 1. That is, if f is ergodic,sample paths map, beginning at the initial condition xo = of the dynamicalsystem obtained from f do not 0.20005. Figure8 (ii) presentsthe histogramof the becometrapped in propersubsets of the supportof values of the logisticmap at time 2000 for4000 P, but rathermix overits support.(Note the corre- differentinitial conditions.In both cases, we see spondenceto topologicaltransitivity in Definition the appearanceof somethinglike the arc-sineden- 2.3.) Furthermore,if f is ergodicwith respectto sity.In the constructionof Figure 8 (ii) I have used two probabilitymeasures P, and P2 on the same what is perhapsthe mostimportant feature of the measurespace, then either P, = P2 or P1 is orthog- applicationof the ergodictheorem to deterministic onal to P2. With this ,we can state a dynamicalsystems. The key point is to note the simpleversion of the ergodictheorem: powerof the "almostsure" convergenceof the theo- rem. That is, the initialcondition xo ofthe system If f is measure-preservingand ergodic on (X, F, need notbe randomlygenerated according to P for P) and Y is any random variable such that E( I Y I) ergodicityto applyto the resultingdynamical sys- < oo, then tem. Indeed,the initialcondition need notbe "ran- domlygenerated." We must only avoid sets of P- 1n measure zero. (For the arc-sinedistribution, this E Y(f '(xo)) -+Ep(Y) a.s.(P), asn -oo. for n i=l means Lebesgue measure zero.) Furthermore, any logisticwith a large (a > 3.83 ... ) enoughto To illustratethis result,again considerthe logis- admit periodiccycles of all orders,each periodic tic map with a = 4 and let P correspondto the arc cyclegenerates a discreteergodic distribution that sin law. To see that f is ergodic,it is not hard to assigns equal mass to the componentsof the cycle. 80 L. M. BERLINER Note that all these distributionsare mutuallyor- the strong law of large numbers. Pursuing the thogonal. An implicationof these considerations deterministicargument, I will leave it to philoso- suggest that, to observe ergodic behavior corre- phersto debatethe meaningof the claim,"Starting spondingto the arc sine forthe logisticwith a = 4, at initial conditionxo = .2958672..., the proba- we mustparticularly avoid the preimages(see Sec- bility that Yn = 1 and Yn+1 = 0 is 0.52 if tion 2) ofall periodicpoints. n = 101oo.''Although there is nothing"random" Actually,the above argumentsoffer only a par- here,the probabilitystatement still makes sense to tial explanationof why we see ergodic,apparently me. randombehavior in -generateddynamical systems.Electronic can onlyrepresent a 3. DATA ANALYSIS AND CHAOS although,fortunately, quite large numberof finite, In this sectionI will reviewsome ofthe problems numbersand thus no numericallycomputed dy- associatedwith chaotic models and data that are of namicalsystem can trulybe aperiodic.In the logis- particularto statisticians. tic map example of Figure 8, the initial conditions were necessarilyonly truncated real numbers,and 3.1 Measuring Chaos thus, lie in a set of Lebesgue measure zero. A possible explanation of why we still can observe Importantquestions arise in attemptingto char- behavior expectedunder the ergodictheorem in- acterizewhat a chaotictime series of data should volves the so-called shadowing property.The idea look like. Intuitively,a chaoticseries should look is that the computer-generatedorbit of the system "random,"but thisintuition is notnecessarily easy is, in a sense, an approximationto sometrue orbit. to quantifyin termsof the mathematicalor proba- The result is that we can be reasonablyconfident bilisticdefinitions discussed in Section2. that,if proper care is taken,computer results, espe- 3.1.1 Liapunov exponents cially aggregatedresults such as the histogramsof Figure 8, do in factcapture the correctqualitative One of the most popular measures of chaos, featuresof the system.The requiredcare alludedto Liapunov exponents,are based on mathematics as- in the previous statementrefers to the roundoff sociated with the sensitivityto initial conditions errorpresent in the numericalcomputation of the conceptdescribed in Section 2. [Nearly all of the nonlinearfunction f. Furtherdiscussion of compu- general referencesgiven here presentdiscussions; tationalissues fordynamical systems is beyondthe especially see Eckmann and Ruelle (1985).] Con- scope ofthis paper [see Guckenheimerand Holmes sider a univariate,discrete time dynamicalsystem (1983); Hammel, Yorke and Grebogi (1987); and where xn fn( xo). To studythe impact of varying Parkerand Chua (1989)]. initial conditions,it is natural to considerderiva- The storyof ergodictheory, chaos and random- tives dxn/dxo. The Liapunov exponent,say X(so), ness is still notcomplete. A mostintimate relation- is definedas (x0) = limn- (1/n) log[ Idxn/ dxo ]. accrues fromthe followingresult. Under the Note thatvia the chainrule, dxn/dx0 can be repre- assumptionsof the ergodictheorem, if the initial sentedas a productand so X((xo),under appropriate conditionxo is generatedaccording to P, the se- conditions,may be subjectto the ergodictheorem. quence { Y(fi(xo)) = Yi(xo), i > 0} is a stationary If so, then X(x0)= X, independentof x0, almost stochasticprocess. (The probabilisticstructure of surelywith respect to an appropriateergodic distri- this stochasticprocess varies with the ergodicdis- bution.Under such circumstancesX is a quantita- tributionused to generatexo.) Thus,if we choose Y tive measure of the dynamicalsystem's degree of to be the identitytransformation, Y(x) = x, the sensitivityto initial conditions.In particular,the deterministicdynamical system with initial condi- approximation somehow with care to avoid sets of tion chosen, (3.1) dxn z eXndxo P-measurezero, is a realizationof a stochasticproc- ess. This result is also importantin symbolicdy- suggeststhat forlarge X,small changes in initial namics; indeed,the previousstatement provides a conditionsresult in separationof paths at an expo- naturalgeneralized definition of chaos in the spirit nentialrate as n grows. of Definition2.4. Considerthe symbolicdefinition Note that,in the ergodiccase, Xcan be estimated of Section2.3 forthe logisticwhen a = 4 and S = froma singletime series. This meanswe can assess [O,.5].In this case, if the initial conditionhas the sensitivityto initial conditionseven though the arc sine distribution,it can be shownthat the Y's data is based on a single x0. Of course,this as- are actually independent,identically distributed sumes that ergodicityapplies and that the xo that ("equally likely")Bernoulli random variables. [For generatedthe data is in supportof the appropriate related discussion,see Breiman (1968, page 108).] ergodicdistribution. (Recall the ergodicdistribu- In such a case, the ergodictheorem coincides with tions, althoughorthogonal, need not be unique.) CHAOS 81 Furthermore,by using an appropriateindependent (Kolmogorov)capacity of A as sample of initial conditions,we can potentially l combineestimates of Xto obtainmore precise esti- (A, ) mates and associatedstandard errors. (3.3) dc = limsup I have definedX fora univariatedynamical sys- tem;extensions to higherdimensions can be found A secondmeasure of the size of a set is the Haus- in the references.Also, the workof Nychka, McCaf- dorff dimension,dH. Althoughin a sense it is frey,Ellner and Gallant (1990) on estimatingLia- mathematicallymore pleasing than versions of punov exponentswith nonparametricregression (3.3), it definitionrequires additional development techniquesis ofspecial interestto statisticians. and will be omitted.It shouldbe notedthat various authors defined dc (or one of its cousins) as the 3.1.2 The geometricstructure of chaos fractal dimension of A, while others, including A secondclass of measuresof chaos involvesthe Mandelbrot,define the fractaldimension to be dH. notion of attractorsof dynamical systems men- This is unfortunatebecause these measures can tioned in Section 2.4. Attemptsat characterizing differ:in general, dH s dc. For the Cantor set in complexgeometrical objects have a long historyin one dimension, dH= dc = log2/log3, suggesting mathematics.Spurred on by the modernwork of B. that the set lies in some meaningfulfraction, al- Mandelbrotand others,as well as the relationships thoughof Lebesgue measure zero, of the unit inter- to chaoticdynamical systems, there has been con- val. The fractaldimension of the Lorenz attractor siderable research in the general area of fractal introducedin Section 2 is about 2.04, suggesting geometryand its applications.Valuable references, that the attractorlies entirelyin a manifoldof R3, at various levels of mathematicalsophistication, but not R2, and is a fractal. include Mandelbrot(1982), Barnsley (1988), Peit- There are othermeasures of dimensionin addi- gen and Saupe (1988), Kaye (1989) and Edgar tion to those mentionedabove. Some of the meas- (1990). The presentationhere followsthose in Ru- ures oftenstudied in the chaos literaturemay be .elle (1989), Baker and Gollub (1990) and Rasband motivatedby the suggestionthat one relate the (1990). Also, see Farmer, Ott and Yorke (1983), geometryof attractors, the structureof ergodic dis- Guckenheimer(1984) and Takens (1985). tributionsand the mathematical propertiesof The startingpoint is an attemptgeneralize no- chaos. To motivatethe potentialof such interrela- tions of geometric"size" of sets lyingin Rk, from tionships,pretend we are facedwith a problemin the conventionalideas of "length" (k = 1), "area" whicha discretetime, chaotic and ergodicsystem and "arc length" (k = 2), and "volume" and evolvesfrom one of two possibleinitial conditions. "surfacearea" (k > 2), in cases in whichthe com- As we watchthe evolution,we actuallygain infor- plexityof the sets of interestprohibits meaningful mationabout whichinitial conditionwas the true categorizationby these familiar measures. (The sort one,because sensitivityto initialconditions implies ofset to keep in mindis the Cantorset; this "large" that the paths separate quickly,no matter how set has Lebesgue measure zero.) The mostreadily close the two candidatesare. In this sense some understoodclass ofmeasures involves the notionof authorsclaim that chaoticpaths "create informa- tryingto "cover"the set ofinterest, say A, lyingin tion" about initialconditions. [Berliner (1991) pre- a compactsubset of Rk, with k-dimensionalboxes sentsan analogousargument in termsof statistical withsides oflength e, small number.If k = 1 and information.]Alternatively, sensitivity to initial A is simplyan interval of length L, clearly the conditionsalso suggeststhat we have decreasing "number"of "boxes" used to coverthe intervalis informationabout the futureof a dynamicalsystem approximately,ignoring integer part corrections, as time increases.Consider the followingheuristic N(A, e) = L/E. For A, a k-dimensionalcube with argument.Suppose that the compactphase space of side L, we have that N(A, E) = (L/E)k. For such a univariate dynamicalsystem is discretized,at nice sets A, a little algebra suggests the usual least forour observationof it, into J cells of equal interpretationof the dimensionof A: and small size, say dx. Further,we will agree to summarizeour uncertaintyin the state ofthe sys- log N(A, E) tem througha discretedistribution on the J cells. (3.2) ?-0 log (1/E) A familiar measure of our uncertaintyabout a randomvariable distributedaccording a probabil- Differentmeasures of dimensionare based on this ity distributionis the entropyfunction; for any notionof "covering"A. [For general A, the limit distribution,P, on the J cells above,the entropyof in (3.2) may not exist.] For example,for an arbi- P is Ent(P) = -iJ 1Pi log Pi, where Pi is the trary,compact set A lying entirelyin Rk to be probabilityof the ith cell. Next,suppose that at a covered by k-dimensional cubes, define the givenpoint in timewe knowthat the systemlies in 82 L. M. BERLINER a particularcell. The correspondingP has mationmeasures are considered: zero. (Actually,a betterargument would quantify our uncertaintyin the-initialcondition via a proba- lPi over the cell knownto contain (3.5) lq(P) = bilitydistribution q - 1 6-o loge the system.That is, our entropy,as a measure of uncertainty,is virtuallynever zero. The basic idea Note that q( ) is nonincreasingin q. For q = 1, a can be conveyedwithout this correction.) At a large limitingargument yields the representation futuretime point,say t, chaos suggeststhat the processwill be in one of approximately,ignoring integerpart corrections,exp(Xt) cells, where X is (3.6) 01(P) = -lim Ent (P()) 6--+o log e the positiveLiapunov exponentof the system[re- call (3.1)]. If a uniformdistribution on thesecells is which is known as the information dimension. appropriate,the resultingdistribution has entropy [21(P) can also be relatedto Hausdorffdimension; increases approximatelyequal to Xt.Thus, entropy see Ruelle (1989).] Furthermore,the capacity d, as the forecasttime increases. Providingrigor for coincideswith tO(P). The key notionof the rescal- this argumentis not easy; a key point is that the ing involvesconsideration of the rates at whichthe futuredistribution need not be uniformon the attractorfills the appropriatespace. In particular, candidatecells, but rathershould involvethe ap- note that,unlike (3.4), (3.2), (3.3), (3.5) and (3.6), propriateergodic distribution.Nevertheless, this respectively,all involvelimits that are scaled by intuition suggests that the asymptoticrate of loge. For example, the informationdimension is change in entropy,represented by the information not equal to the entropyof P. Distributionswith dimension,introduced below, provides information differententropies, but similarstructure, can have about the structureof the problem.Further, en- the same informationdimension. tropyor the informationdimension, may, in some To estimate ,( P) froma finiteset of data, con- settings,be directlyrelated to the Liapunov expo- sidera decreasingsequence of ej's. For each ej and nentsof the system.The theoryis incomplete,but correspondingpartition of the phase space into J(ej) discussioncan be foundin the referencesunder the "boxes," we need to estimateall of the Pi. For a topicof the Kaplan-Yorke Conjecture. discretetime dynamicalsystem, it is natural to We now turn to descriptionsof other measures estimatethe 'i's by the correspondingproportions and to problemsof estimation of these measuresof of the data in the boxes. Most proceduresproceed dimension.These two aspects are intimatelyre- by graphing log(Z!Ay) (5i(6j))q) versus logej and lated in the sense that a given measure is often estimating q(P) by the slope of the least-squares definedas the result of a given estimationproce- linear fittedline throughthese points. This ap- dure. The typical set-up is one in which "long" proach seems to reasonable in principle,but some realizationsof the systemunder study are avail- art is involved.In particular,Ej's which are too able, but the true attractorand correspondinger- large will not capturethe structureof the hypothe- This godic distributionare unknown. suggests sized attractor.Also, ej's whichare too small lead problemsof statisticalestimation based on data. to too many emptycells, resultingin a loss of The standardmethods are based on the assumption structure.After all, the true dimensionfor a finite thatthe processis ergodicand that the data covers set ofpoints of zero. a sufficientlylong time periodto be representative In general,large values of q are usefulin relat- of the ergodicdistribution. For example, suppose ing geometricstructure to probabilisticstructure. we are to estimate a functional; say (, such as An importantexample, known as the correlation entropy,of the ergodicdistribution, P. The first dimensionand due to Grassbergerand Procaccia step is to partitionthe phase space of the system (1983), is given by 02(P). To motivatethis dimen- into J(E) "boxes," each of side E. Ignoringevents sion define,for a given ergodicdistribution, the on the boundariesof the boxes,define Pi(E) as the function probabilityunder the ergodicdistribution of the ith box; let ^(E) representthe corresponding(multi- (3.7) Pr(IX- YI cr), nomial) approximationto P. Assumingthe attrac- torlies in a compactset, it i'sreasonable to assume for r > 0, and where X and Y are independent, that identicallydistributed random variables generated fromthe ergodicdistribution under study, provides (3.4) (P) = lim (p(E)). some probabilistic informationconcerning the structureof this distribution.To empiricallyesti- To unifyvarious measures discussedso far,the mate (3.7) based on a set of data, { Yi}i=1,N' froma followingclass ofrescaled, generalized Renyi infor- discreteergodic distribution, Grassberger and Pro- CHAOS 83 caccia (1983) considerthe quantity that can be informativeabout any underlying 1 structure.

C(r) = N 2Z.[IO,r](^,j I YiYiI) 3.2.1 Poincare Maps where I representsthe usual indicatorfunction. Suppose a particularcontinuous time dynamical The correlationdimension is defined as systemis understudy with the intentof tryingto understandsome featuresof the dynamics.The log C(r) particularsystem may be a knownchaotic system dGP = HAL" log r being studiedvia computerexperiments or an un- known systemdisplaying complex, "random" be- In conclusion,most of the traditionalprocedures havior being observed(without error!) in nature. describedin the literaturestrike me as what a Faced with complexbehavior, perhaps in high di- statistician would describe as "nonparametric mensions,we studya subset of the data carefully methodof moments" procedures. Also, the effectsof chosento provideinformation about the underlying observationerror, arising fromvarious plausible dynamics.Specifically, one definesa Poincaresur- error models, is not completelyunderstood. For faceof section, as somemanifold in the phase space, newermethods and furtherdiscussion along these whichthe realizationof the systemstrikes "trans- lines, see Wolff(1990) and Smith (1992) in the versally." We then considerthe successivevalues "statisticsliterature" and Ellner (1988), M6ller et ofsome subsetof variables of the systemeach time al. (1989) and Ramsey and Yuan (1989) in the the systempasses throughthe section. "physicsliterature." The above suggestionis easily illustratedfor the 3.2 Data Analysis: The Search forStructure Lorenz system.We considerthe Poincare section consistingof the two-dimensionalplane, {( x, y, z): To set the stage for the topics of this section, x = y}. Next, simply constructa "time series" consider the following"experiment." Suppose I { .. ., x(r), x(r + 1). ... } of the values, in time providedyou witha data set { xnln= 1,1000consisting order,of x each time the solutionpasses through of computer-generatediterates of the logisticmap the section.The fundamentalpoint is thatthe new, for a randomlychosen initial conditionand with lowerdimensional series is a discretetime dynami- a = 4, but toldyou nothingabout howthe data was cal system,although on a differenttime scale from generated.As a data analyst,you mightbegin by the original,that inheritsimportant qualitative lookingat a simpletime series plot ofthe data. As featuresfrom the original.Indeed, there actually indicated earlier, the resulting time series plot existsa function,say f,such that the iteratesof the would suggest that the data are indeed random. new series may be representedas x(r + 1) = However,no reader-of this paper would be fooled f(x(r)); f is knownas a PoincareMap. The Poincare intomaking such a conclusion.For example,if you Map forour example is numericallysuggested in consideredfitting an autoregressivemodel to the Figure 9. To obtainthis map,I simplyrecorded the data, you wouldprobably first look at a scatterplot values of x at each intersectionof the systemwith of xn+1 versus xn. Of course,this plot would look the chosensection and thenscatterplotted x(r + 1) like a plot of the function y = 4x(1 - x). That is, against x(r). Althoughnumerical errors are of by simplyconsidering the "rightway" to look at course present,note that the pointsappear to fall the data, the deterministicstructure of the logistic on some functionquite tightly.(The data forthis map would simplyappear despitethe randomap- figureare based on 20,000iterates of the numerical pearanceof the originaldata. On-theother hand, if solution,which resultedin 429 intersections.)To instead ofthe originaldata, I providedyou with a indicatethe natureof the map,I also scatterplotted time series of the correspondingsymbolic dynamic x(r + 2) against x(r) and x(r + 10) against x(r). ofthe data (i.e., { Ynln=11000 where Yn= 1, if xn > The resultsare simply(numerical estimates of) the 0.5, and 0, otherwise),then there is nothingthat appropriatecompositions of the PoincareMap. you could do to findthe now hiddendeterministic The value of the above type of analysis may be structureof the underlyinglogistic map. more than theoreticalin suggestinginteresting Whereasthis example falls far short of indicating ideas fordata analysis. For example,imagine ana- the complexityassociated With chaotic data analy- lyzingdata froman unknown,yet chaotic-looking sis, it does suggest one of the fundamentalques- system. If we were fortunateenough to find a tions: What operationscan we performto chaotic PoincareMap such as the above throughdata anal- data to "find" any underlyingdeterminism? The ysis, we made great stridesin understandingthe second part of the example presents a warning system.In particular,some deterministicelement concerningthe "design of experiments"for chaotic of the systemis identified.Further, in the Lorenz data analysis: we must try to observe variables case, notethat the PoincareMap is a noninvertible 84 L. M. BERLINER

15t X2nd\ X

R 10 R 10 E E T T0 \ U 0- U O RU R

-10 -10

-10 0 10 -10 0 10 I I

3rd loth

R 10 - R 10 I! E T )T U 0 *,, U 0 RR NN

- -10-\ ' -' -IO '--;., ',il

-10 0 10 -10 0 10

FIG. 9. Poincare Map and its iterates. univariatemap, as was the logisticmap, thereby suggestinga sourceof the apparentchaotic behav- 37.5 ior. Note that there appears to be some potential ; 9~~~~~~~~~~~j for limited predictions(also see Nese, 1989). If intersectionsof the systemwith the Poincare sec- 30 -. tionare practicallyinteresting in the contextof the z problem under study, the empiricallyobtained 22.5 PoincareMap offersvery precise predictions of suc- cessive values of some variable at pointsof inter- 15 section.To amplifythe predictiveidea, considera bivariate time series { . . ., (x(r), z(r)), . .. } in the Lorenzsystem example. A scatterplotof this series -10 0 10 I is given in Figure 10. Note the very "tight" rela- FIG. 10. Scatterplotof z versusx on the Poincare section. tionshipbetween x and z on the Poincaresection. This suggeststhat predictionsof the value of z at an intersectiontime can be made given only the underlying,but unknown,dynamics driving the correspondingvalue of x. Finally, to enhance the processof interest,offers challenges of interestto value of either of these types of prediction,we the statistician.In particular,one may not be able mightcombine forecasts with data analyticallyob- to observeall ofthe importantdynamical variables; tained informationconcerning the waiting times, indeed,we may noteven knowwhich variables are measuredon the originaltime scale, until intersec- important.A potentialapproach to data analysisin tions. A plot of the values of x at returnsare such cases is usually knownas reconstructionby plottedas a functionof returnnumber in Figure timedelays. Primary originators of the main ideas 11. Histogramsof the waiting times forthe data are Packard,Crutchfield, Farmer and Shaw (1980), used to obtain the Poincare Map are presentedin Takens (1981) and D. Ruelle (see Ruelle, 1989). Figure 12. To motivatethe idea, considera simpledynami- cal systemwith a two-dimensionalphase space (see 3.2.2 Reconstructionby time delays Section2.1.4). The firstcoordinate of the systemis The studyof experimental time series data, in an some functionover time, say x1(t). The othercoor- attemptto understandimportant features of the dinateof the systemis x2(t) = dx1(t)/dt. In such a CHAOS 85

I o...... '''. ' .. ..' i a } ......

-10- .. ,..:....'

100 200 300 400 RETURN NUMBER FIG. 11. Time series plot of returnedx by returnnumber.

40 sidera k-dimensionaldynamical system x(t) evolv- 30 ing in continuoustime. Define a univariatesystem 20 IIh { Y,} where the y process is given by Yt = h(x(t)) fora functionh: Rk -+ R. A discrete,multivariate 10 X } --- LJl I f. L. - I time series is now constructedby definingvectors

30 44 58 72 86 100 vt as vt = (Yt, Yt+T ..*, Yt+(n-l)7) forsome choice Ist WAITING TIME of r and n. The claim, based on the theoryof Takens (1981), is that analysisof a sequenceof say Nt's ,at time t, t +, t +2 -...t +Nr, permits certain inferences,asymptotically, concerning the 30 qualitative,especially geometric, behavior of the

20 originalsystem. Theoretical justification of this idea stemsfrom the topologicalresult knownas Whit- 10 ney's EmbeddingTheorem. Of course, there are assumptionsunder which the methodologyworks. rLUdnELL6 n Lr6 63 83 103 123 143 163 183 203 223 The crucialone ofthese is that n ? 2 dH + 1, where ZniLdWAITING TIME dH is the Hausdorffdimension of the attractorof the originalsystem. There are also interestingsta- tisticalissues relatedto the use of time delays. In 25 particular,reasonable choices for the designparam- 20 eters r, N, n (in practice,dH is unknown)and the 15 embeddingfunction h are neededfor data analysis. 10 r It is typicallythe case that h is chosento simply "pick off' one of the componentsof x. Depending on h and assuming N musthave some reasonable 315 380 445 510 575 bound, X-should be chosen to be large enough to 10th WAITING TIME overcomethe verystrong local "correlation"in the FIG. 12. Histograms of waiting times until return to Poincare process,yet be small enoughto capturethe impor- section. tant dynamics of the process [see Liebert and Schuster(1989) forpertinent discussion]. The im- case, we quickly recognizethat we can learn a pact ofthe presenceof various types of observation great deal about the structureof the behaviorof errorson analysesseems to be onlypartially under- this process,in particular,characteristics of the stood. [See Abarbanel,Brown and Kadtke (1989) phase space plotof x2(t) plottedagainst x1(t),even for some work in this direction.]Beyond the ref- if we can onlyobserve x1( ) at a discretecollection erences already given, the reader may consult of time points.This is possible since,for small h, Broomheadand King (1986) and Rasband (1990) X2(t) [xl(t + h) - x1(t)1/h. Therefore,a large forfurther discussion. Nicolis and Prigogine(1989) numberof observations,close to each othertempo- discuss the implementationof the methodin an rally,of the firstcoordinate of the systemtell us examplebased on climatedata. about some characteristicsof the entiresystem. A second example is based on the Henon map. Al- 3.2.3 Other methodologies thoughthe map is definedin (2.3) is in R2, (2.4) It seems quite natural to attemptto try to fit suggests-that all the dynamicsof the processare convenient,flexible functions to chaoticdata. The containedin the single "x" coordinateif only we goals are similarto thosein nonparametricregres- view the data appropriately. sion[see Wahba (1990) forgeneral discussion]. Such The basic idea extendsto general systems.Con- goals include data smoothing,interpolation be- 86 L. M. BERLINER tweenobservation times and short-timeprediction. cal regularityby Hopf.See Engel (1987) fora very The potentialhere is actuallyvery large. I cannot valuable discussion.To quicklycommunicate the do justiceto the directionhere, but offerthe follow- idea, recall the ergodictheory review of Section ing referencesto the interestedreader: Lumley 2.2.3 and the correspondingexample of the logistic (1970), Crutchfleldand McNamara (1987), Farmer map with a = 4. In this case the arc-sinedistri- and Sidorowich (1987), Casdagli (1989), and bution is a continuous, ergodic distribution. Kostelich and Yorke (1990). Furthermore,the Invarianceimplies that if xo is generatedaccord- referenceby Nychka,McCaffrey, Ellner and Gal- ing to the arc-sine,then at everytime t, xt also has lant (1990) is especiallyrecommended to statisti- an arc-sinedistribution. However, some intuition cians forits new results,as well as reviewin this suggestedthat xo need not actuallybe generated direction. accordingto the ergodic distributionfor ergodic behaviorto manifestitself. Hopf's notion of statisti- 4. STATISTICAL ANALYSES FOR CHAOS cal regularityis a rigorousresult along these lines. The result is that, under mild regularitycondi- 4.1 ParametricStatistical Analysis forChaotic tions,including a continuousergodic distribution, Models say P, if xo has any distributionthat is absolutely Geweke (1989) and Berliner(1991) are the pri- continuouswith respectto P, then as t tends to maryreferences for this section.A natural class of infinity,xt convergesto distributionto P. That is, modelsfor which the statisticianfeels "at home" the initial distribution"washes out." (Note the are based on the specificationof a dynamicalfunc- natural correspondencebetween this notion and tion drivingthe systemunder study.Assume that the conceptof stationary,ergodic distributions in the functionis knownup to a finitecollection of Markov processes.) The application to Bayesian parameters.Specifically, the dynamicalsystem is forecastingis immediate.Under the appropriate assumedto be drivenby the relationship conditions,if we computea posteriordistribution forunknown initial conditions and thatposterior is (4.1) xt+1= f(xt; -0 abso-lutely continuous with respect to a continuous ergodic distribution,then our implied predictive where f is specifled.The parameter - and the distributionfor xt as t growsmust collapse to the initial value xo may both be unknown.Further, ergodic distribution.This is a strong,statistical assume that, at some time points,we observethe reflectionof the notionof unpredictability of chaotic x-processwith observation error. Depending on the processes.The strengthof the resultis thatthere is model we use for the errors,we an constructa nothingphilosophical to debate beforeaccepting likelihoodfunction based on data forthe unknown implicationto forecasting.The theoremsays that quantities.Even in verysimple examples, such as underperfect conditions in whichthe prior is agreed the logisticmap observedwith independent Gauss- to be known and correctlyspecified, and thus, ian errors,the resultinglikelihood functions can be Bayesian computationsare uncontroversialappli- extremelycomplex, intractable objects. Berliner cationsof , long-term predictions, (1991) offerssome heuristics concerning the behav- moreprecise than those associatedwith a continu- ior ofsuch "chaoticlikelihoods." For example,it is ous ergodicdistribution, of chaotic processesare possibleto relate chaos as measuredvia Liapunov impossible. exponentswith Fisher informationconcerning un- known initial conditions.Chaotic processes ob- 4.2 Statistics and Dynamical Systems served with errorproduce statistical information There is a huge literaturedevoted to statistical concerninginitial conditions. However, the value of analysesfor dynamical systems. Statisticians regu- this informationfor prediction is limited.Specifi- larlyconsider the modelwith "system equation" cally, if maximumlikelihood estimates of initial conditionsare sought,one is firstfaced with a very (4.2) xt+1 = f(xt;') + zt, difflcultproblem of finding such estimates.Even if where { is itselfa stochasticprocess, and "ob- one were able to obtain a good estimate of xo, zt} sensitivityto initialconditions moderates the value servationequation" for the observabley ofsuch estimatesin the contextof prediction. (4.3) Yt = h( xt) + et, Berliner (1991) considersBayesian forecasting based on modelsas suggestedin the previouspara- where h is some functionand e representsmeas- graph. Bayesian forecasting,in the presence of urementerror. The inclusionof z in (4.2) is sug- unknowninitial conditions,is intimatelyrelated gestedas a natural,more realistic version of (4.1), to ergodictheory via a phenomenoncalled statisti- whichallows some notionof errorin the specifica- CHAOS 87 tion of f, as well as the possibilityof unknown 5.1 Impact of Chaos on Statistics environmentaleffects influencing the evolutionof the process. Study of (4.2) and (4.3), especially First, I suggestsome philosophicalimplications when f (and h) is a linear function,has been of chaos forstatistics. Section 2 containeddiscus- popular for at least 30 years under the general sion of two centralpoints that relate mathemati- topicof Kalman filtering.See Jazwinski(1970) and cal chaos to probabilitytheory. These were (i) Meinhold and Singpurwalla (1983) for review. Poincare's notionsrelating uncertainty about as- Studyof generalizationsof (4.2) [typicallywithout pects of deterministicprocess to randomness,and (4.3)] that allow for dependenceupon more time (ii) the various relationshipsbetween ergodic the- lags ofthe x process,but assume f to be linear and ory,stochastic processes and chaos. The firstpoint z to be white noise is a subset of classical time ofdiscussion involves foundational impacts of these series analysis as in Box and Jenkins(1976). Also notionson statisticalphilosophy. For example,con- fordiscussion of related ideas involvingstate space sider the "sacred cow" example of elementary modeling,see Aoki (1987). Key recentreferences statisticscourses: the probabilisticstructure of fair on nonlineartime series include Kitagawa (1987) coin tossing.What is meantby "the probabilityof and Priestly(1988); also, see Gallant (1987). A key 'heads' is 0.5?" The typical explanationrevolves referenceon Bayesian analyses of dynamicmodels aroundthe frequencyinterpretation of probability, is West and Harrison(1989). but relies on some primitivenotion of randomness Few would argue with the claim that relatively on the part of students.[Many works of I. J. Good littleof the mainstreamstatistical time series liter- are verypertinent here; see Good(1983).] Ofcourse, ature expresslydeals withissues in chaos, such as Bayesian teacherstypically question the validityof the problemsdiscussed in Section3. [An important the frequencydefinition and remarkon the "sub- recentwork that is, at least partially,motivated by jective" assertionof the coin's fairness.However, notionsof chaos is Tong (1990). This highlyrecom- one mightraise the questionof the existenceof any mendedbook offers substantial review of chaos and randomnessin this context.If we knew all the related work in nonlineartime series.] However, pertinentinitial conditionsfor the coin toss, we chaoticdata analysis shouldnot be viewed as fun- could apply the laws of physicsto determinethe damentally distinct from mainstream statistical outcomeof the toss [see Engel (1987) forprecisely timeseries analysis. Consider(4.2) and (4.3) with- such computations;also, see Ford (1983)]. It can out any error(z and e) terms. If the analyst as- thenbe claimedthat our use ofnotions of random- sumes, as is customary,that (4.2) generates an ness in coin tossingis really a reflectionof uncer- ergodicprocess (i.e., past its transienceperiod), the taintiesabout initialconditions. mathematicsdiscussed in Section2 implythat the Note thatthis discussioncoincides with a portion data drivenby (4.3) forma realizationof a station- ofthe Bayesian philosophyof statistics. In particu- ary stochasticprocess. This conclusionapplies to lar, a key componentof the Bayesian paradigmis the original series as well as data based on any the modelingof uncertain quantities as ifthey are (measurable) h function.Thus, the inferencebase random.Perhaps the moral of deterministicchaos fordata analyticprocedures, such as the methodof and its modelingvia probabilitycan be exemplified timedelays or symbolicdynamics, is the same as in witha simplechallenge to the reader:can you come general stationarytime series analysis. This com- up with a physicalmodel or a real statisticsprob- mon foundationfor chaologistsand statisticians lem that resultsin randomnessthat does not stem can hold in the presenceof observationerrors as from some deterministicuncertainty? (The only well. Suppose that the et's are iid. In modelswith- "rule" in thismind game is thatyour cannot resort out the z terms,if the x processis ergodic,(4.3) to quantummechanics.) If the answer is no, or at correspondsto a stationaryprocess. More gener- least not withouta great deal of work,then per- ally,if (4.2) yieldsa Markovprocess that is station- haps we are led to one of the commonanswers ary and ergodic, (4.3) still yields a stationary given by Bayesians to B. Efron'squestion, "Why stochasticprocess. isn't everyonea Bayesian?" (Efron, 1986). The ''answer" is that the questionis vacuous: everyone is a Bayesian. (Of course,I am referringto model- 5. DISCUSSION ing. I am aware that not everyoneapplies Bayes' Two discussion points concerningchaos and Theoremin reachingconclusions.) My pointis that statisticsare consideredhere. First,what does the frequentiststatisticians may find it even morediffi- emergenceof chaos suggestabout statisticalmodel- cultto adhereto theirtraditional view ofstatistics. ing?Second, what can statisticiansand probabilists Specifically,the notionof a finedistinction between bringto the practicalstudy of chaos? random and unknown,but deterministic,quanti- 88 L. M. BERLINER ties seemsuntenable in practice.A furtherproblem subjectivistmodels. However,no scientistshould forthe frequentistinvolves the notionof "repeata- wish to be restrictedin the incorporationof addi- ble and identical"experiments. Most statisticians tional, case-specificinformation. Also, I question would agree to the nonexistence,at least at a the practical importanceof the notionof "objec- verypure level, of such experimentsdue to ever- tively interpretedprobabilities," since one must changingenvironmental effects. Beyond environ- firstsubjectively decide that the laws underwhich mental effects,however, chaos suggests that be- such probabilitiesare derivedactually apply and cause one couldnever truly claim identicalsettings are completein a givensetting. foreven controllablefactors and, since even slight Beyondphilosophy, statisticians should be inter- errorsmay have great influence,repeatable and ested in chaos in lightof the numerousopportuni- identicalexperiments are not possible. This prob- ties for potentialresearch contributions.Also, I lem deprives the frequentistany foundationfor believe that the techniques,such as reviewedin inference. Section3, beingdeveloped to deal withchaos should These pointsalso bear on discussionswithin the be considereda part ofmainstream statistical time Bayesian community.Specifically, I referto the series methods.As chaos becomes more popular, subjectiveversus objectiveor necessarian contro- such techniqueswill be useful and "expected"by versy.The notionsof ergodic distributions, statisti- our colleagues,in day-to-daystatistical consulting cal regularityand Poincare's methodof arbitrary and collaborativeresearch. Finally, many of the functionsall containrelevant messages in this re- notions and modelingstrategies associated with gard. [Engel (1987) includesan excellentreview of chaos may prove useful in other statisticalprob- the methodof arbitraryfunctions.] Consider the lems. As an example,it is interestingto speculate followingremarks of Von Plato (1983): on the value of ideas about spatially distributed chaotic processesin statisticalimaging problems. It will be interestingto see what kind of an For an introductionto "spatio-temporalchaos," see escape fromthe recentlyestablished rigorous Crutchfieldand Kaneko (1987) and Crutchfield resultsin ergodictheory the subjectivistschool (1988). on the foundationsof probabilitytheory will take. 5.2 Impact of Statistics on the Analysis of Chaos It has been traditional to think that de- One possibleavenue forscientific inquiry in vari- terministicsystems allow only subjectively ous settingsrevolves around the questionof decid- interpretableprobabilities. There is howevera ing whetheror not a particularphenomenon or differenttradition, in which the 'stable data set is deterministic,yet chaotic,or random. frequencyphenomena of nature' (to use an (For example,see Berge,Pomeau and Vidal, 1984; expressionof Hopf)are accountedfor by objec- Brock,1986; Bartlett,1990; and Sugihara and May, tively interpretedprobabilities, despite the 1990.) It is importantto first decide if such a view that these phenomenaare supposedto be questionmakes sense. I have alreadyasked whether governedby laws ofnature of the classical sort. or not "ordinary"randomness is simplya resultof uncertaintyin the presence of determinism.Re- I claimthat the "escape" alludedto by Von Plato is gardless of the foundationalanswer, statisticians unnecessary.There is nothingto escape from.As should be able to make operational contributions Savage (1973) explains: along these lines. Note that such issues are chal- Personal probabilityworks well in science lenging,because deterministic,chaotic models and whereverprobability seems at all relevant;in stochasticmodels typically both can be made to fit particular,the personalisticposition does no data. (Again, I referin part to the relationship violenceto any genuine objectivityof science; between ergodic,chaotic systemsand stationary and finally,the personalisticposition does not stochasticprocesses.) Specifically,recall the gen- neglectany appropriaterole offrequency or of eral structuresuggested by (4.2) and (4.3). When f symmetryin the applicationof probability. is chaoticand z is omittedfrom (4.2), the interpre- tation of the model xt+1 = f(xt; 77)as deterministic The finalphrase ofthis quote impliesthat no sub- seems to be irrelevantin termsof prediction, since jectivistwould claim that fundamentalknowledge, meaningfulpredictions must still be statistical.Al- such as containedin ergodicproperties, concerning ternatively,the inclusionof z in (4.2) need not be a systemunder study should be ignored.Indeed, based on an assertionthat the phenomenonunder ergodicdistributions and other "objectivist"sug- study is innatelyrandom, but rather that f, al- gestions are natural candidates for benchmark thoughof some value, is not capable of capturing analyses and startingpoints upon which to build all the deterministicaspects of the process.Impor- CHAOS 89

tantcontributions from statisticians revolve around BARNSLEY, M. (1988). Fractals Everywhere.Academic, San the constructionof probabilistic/statisticalpredic- Diego, Calif. tions and estimatesof interestingquantities. The BARTLETT, M.S. (1990). Chance or chaos? (with discussion). J. Roy. Statist. Soc. Ser. A 153 321-347. performanceand flexibilityof statisticallycorrect BASAR, E., ed. (1990). Chaos in Brain Function.Springer, New predictionprocedures should form powerful criteria York. forchoosing among alternative models. For general BENEDICKS, M. and CARLESON, L. (1991). The dynamicsof Henon discussions and referencesconcerning statistical the map. Ann. of Math. (2) 133 73-169. modeling,see Cox (1990), Hill (1990) and Lehmann BERGE, P., POMEAU, Y. and VIDAL C. (1984). Order Within Chaos. Wiley, New York. (1990). BERLINER, L. M. (1991). Likelihood and Bayesian prediction of Thereis also a need forcarefully designed experi- chaotic systems. J. Amer. Statist. Assoc. 86 938-952. ments in the contextof chaotic phenomena.This Box, G. E. P. and JENKINS, G. M. (1976). Time Series Analysis: avenue ofresearch may be particularlyinteresting Forecasting and Control. Holden-Day, San Francisco. and challengingto both statisticiansand proba- BREIMAN, L. (1968). Probability.Addison-Wesley, Reading Mass. BROCK, W. A. (1986). Distinguishing random and deterministic bilists,in that at least some of the basis of infer- systems. J. Econom. Theory40 168-195. ence involvesergodic theory. BROOMHEAD, D. S. and KING G. P. (1986). Extractingqualitative To summarize,I believe that statisticiansand dynamics fromexperimental data. Phys. D 20 217-236. probabilistscan make importantcontributions in CASDAGLI, M. (1989). Nonlinear predictionof chaotic time series. the area of chaos. First,in view ofthe universally Phys. D 35 335-356. CHATTERJEE, S. and YILMAZ, M. R. (1992). Chaos, fractalsand acceptedview that long-term,precise prediction of statistics (with discussion). Statist. Sci. 7 49-121. the path of a chaoticprocess is impossible,useful COLLET, P. and ECKMANN, J.-P. (1980). IteratedMaps on the conclusionsor predictionsmust be statistical in Interval as Dynamical Systems. Birkhauser, Boston. nature. For example, one may be able to assess COOPER, N. G., ed. (1989). FromCardinals to Chaos. Cambridge probabilitiesfor interestingevents, or subsets of Univ. Press. CORNFELD, I. P., FOMIN, S. V. and SINAI, YA.G. (1982). Ergodic the phase space of the dynamical systemunder Theory. Springer, New York. study. The computationof such estimates must Cox, D. R. (1990). Role of models in statistical analysis. Statist. typicallyaccount for uncertainties in initial condi- Sci. 5 169-174. tions,model specification, unknown parameters and CRUTCHFIELD, J. P. (1988). Complexityin nonlinear image proc- the implicationsof ergodictheory. Second, assess- essing. IEEE Trans. Circuits and Systems 35 770-780. CRUTCHFIELD, J. P., FARMER, J. D., PACKARD, N. H. and SHAW, ing the effectsof one dynamicalsystem that serves R. S. (1986). Chaos. ScientificAmerican 255 46-57. as an input to a second systemof interestis a CRUTCHFIELD, J. P. and KANEKO, K. (1987). Phenomenologyof natural statisticalproblem (see Tong, 1990, page spatio-temporalchaos. In Directions in Chaos (H. Bai-Lin, 429). All of these points are importantfor active ed). World ScientificPress, Singapore. analyses of dynamicalsystems when the goals of CRUTCHFIELD, J. P. and McNAMARA, B. S. (1987). Equationsof motion froma data series. Complex Systems 1 417-452. analysis includeprediction and "control"(see Ott, DEVANEY, R. L. (1989). Introductionto Chaotic Dynamical Sys- Grebogiand Yorke, 1990, and Sinha, Ramaswamy tems,2nd ed. Benjamin-Cummings,Menlo Park, Calif. and Subba Rao, 1990, for relevant discussionof ECKMANN, J.-P.and RUELLE, D. (1985). Ergodictheory of chaos controlproblems). When conclusionsare soughtin and strange attractors. Rev. Modern Phys. 57 617-656. practicalproblems based on data and uncertainty EDGAR, G. E. (1990). Measure, Topology, and Fractal . Springer, New York. modeling,the analyses are exactlythe business of EFRON, B. (1986). Why isn't everyone a Bayesian? (with discus- statisticiansand probabilists. sion). Amer. Statist. 40 1-11. ELLNER, S. (1988). Estimating attractordimensions fromlimited ACKNOWLEDGMENTS data: A new method, with error estimates. Phys. Lett. A 133 128-133. I thankSteve MacEachernfor both listening and ENGEL E. (1987). A road to randomness. Technical Report 278, sharing his insights concerningthis article. My Dept. Statistics, StanfordUniv. sincerethanks go to the refereesfor their excellent FARMER, J. D., Orr, E. and YORKE, J. A. (1983). The dimension help. This work was partiallysupported by NSF of chaotic attractors. Phys. D 7 153-180. GrantDMS-89-06787. FARMER, J. D. and SIDOROWICH, J. J. (1987). Predictingchaotic time series. Phys. Rev. Lett. 59 845-848. REFERENCES FEIGENBAUM, M.J. (1978). Quantitative universality in a class of nonlinear transformations.J. Statist. Phys. 19 25-52. ABARBANEL, H. D. I., BROWN, R. and KADTKE, J. B. (1989). FORD, J. (1983). How randomis a coin toss? PhysicsToday 1 Predictionand systemidentification in chaoticnonlinear 40-47. systems:Time series with broadband spectra. Phys. Lett. A GALLANT, A. R. (1987). NonlinearStatistical Models. Wiley, 138 401-408. New York. AOKI, M. (1987). State Space Modeling of Time Series. Springer, GEWEKE, J. (1989). Inference and forecastingfor deterministic New York. nonlinear time series observed with measurement errors. In BAKER, G. L. and GOLLUB, J. P. (1990). Chaotic Dynamics. Nonlinear Dynamics and Evolutionary (R. Day CambridgeUniv. Press. and P. Chen, eds.). To appear. 90 L. M. BERLINER

GLASS, L. and MACKEY, M. C. (1988). From Clocks to Chaos. ric regression.Inst. StatisticsMimeograph Series No. 1977, PrincetonUniv. Press. Dept. Statistics,North Carolina State Univ. GLEICK, J. (1987). Chaos: Making a New Science.Viking Pen- ORNSTEIN, D. S. (1988). Ergodic theory,randomness, and guin,New York. "chaos." Science243 182-187. GOOD, I. J. (1983). Good Thinking.Univ. MinnesotaPress, OTT,E., GREBOGI,C. and YORKE,J. A. (1990). Controllingchaos. Minneapolis. Phys. Rev. Lett. 64 1196-1199. GRASSBERGER, P. and PROCACCIA, I. (1983). Measuring the PACKARD, N. H., CRUTCHFIELD,J. P., FARMER, J. D. and SHAW, strangeness of strange attractors. Phys. D9 189-208. R. S. (1980). Geometryfrom a time series. Phys. Rev. Lett. GRENANDER, U. and ROSENBLATT, M. (1984). StatisticalAnalysis 45 712-716. ofStationary Time Series, 2nd ed. Chelsea,New York. PARKER,T. S. and CHUA, L. 0. (1989). Practical Numerical GUCKENHEIMER, J. (1984). Dimensionestimates for attractors. Algorithmsfor Chaotic Systems. Springer, New York. Contemp.Math. 28 357-367. PEITGEN, H.-O. and SAUPE,D., eds. (1988). The Science of Frac- GUCKENHEIMER, J. and HOLMES, P. (1983). NonlinearOscilla- tal Images. Springer, New York. tions,Dynamical Systems, and Bifurcationsof Vector Fields. PETERSON, I. (1988). The Mathematical Tourist. W. H. Freeman, Springer,New York. New York. HAMMEL, S. M., YORKE, J. A. and GREBOGI, C. (1987). Do POINCARE', H. (1946). The Foundations of Science. Science Press, numericalorbits of chaoticdynamical processes represent Lancaster, U.K. trueorbits? J. Complexity3 136-145. PRESS, W. H., FLANNERY,B. P., TEUKOLSKY,S. A. and VETTER- HEINON M. (1976). A two dimensionalmapping with a strange LING, W. T. (1986). Numerical Recipes. Cambridge Univ. attractor.Comm. Math. Phys.50 69-77. Press. HILL, B. M. (1990). A theoryof Bayesian data analysis. In PRIESTLEY,M. B. (1988). Nonlinear and Non-stationaryTime Bayesian and LikelihoodMethods in Statisticsand Econo- Series Analysis. Academic, New York. metrics(S. Geisser,J. S. Hodges,S. J. Press and A. Zellner, RAMSEY,J. B. and YUAN, H.-J. (1989). Bias and error bars in eds.) 49-74. North-Holland,Amsterdam. dimension calculations and their evaluation in some simple JACKSON, E. A. (1989). Perspectivesof NonlinearDynamics. models. Phys. Lett. A 134 287-297. CambridgeUniv. Press. RASBAND,S. N. (1990). Chaotic Dynamics of Nonlinear Systems. JAZWINSKI,A. H. (1970). StochasticProcesses and FilteringThe- Wiley, New York. ory.Academic, New York. RUELLE, D. (1989). Chaotic Evolution and Strange Attractors. KAYE, B. H. (1989). A Random Walk ThroughFractal Dimen- Cambridge Univ. Press. sions.VCH, New York. SAVAGE,L. J. (1973). Probability in science: A personalistic KITAGAWA, G. (1987). Non-Gaussianstate-space modelling of account. In Proceedings of the Fourth InternationalCongress nonstationarytime series (with discussion). J. Amer. for Logic, Methodology,and Philosophy of Science (P. Sup- Statist.Assoc. 82 1032-1063. pes, L. Henkin, A. Joja, and GR. C. Mois, eds.) 417-428. KOSTELICH, E. J. and YORKE, J. A. (1990). Noise reduction: North-Holland,Amsterdam. Findingthe simplestdynamical system consistent with the SHIRAIWA,K. (1985). Bibliographyfor Dynamical Systems. Dept. data. Phys. D 41 183-196. Mathematics, Nagoya Univ., Furocho, Chikusa-Ku, Nagoya, LAPLACE, P.S. (1951). A PhilosophicalEssay on Probabilities. Japan. Dover,New York. SINHA,S., RAMASWAMY,R. and SUBBARAO, J. (1990). Adaptive LEHMANN, E. L. (1990). Modelspecification: The viewsof Fisher control in nonlinear dynamics. Phys. D 43 118-128. and Neyman and later developments.Statist. Sci. 5 SMITH, R. L. (1992). Estimating dimensions in noisy chaotic 160-168. time series (with discussion). J. Roy. Statist. Soc. Ser. B Li, T.-Y. and YORKE, J. A. (1975). Period threeimplies chaos. To appear. Amer.Math. Monthly 82 985-992. STEWART, I. (1989). Does God Play Dice? Blackwell, London. LIEBERT, W. and SCHUSTER, H. G. (1989). Properchoice of the SUGIHARA, G. and MAY, R. M. (1990). Nonlinear forecastingas a time delay forthe analysis of chaotictime series. Phys. way of distinguishingchaos frommeasurement errors in Lett. A 142 107-111. time series. Nature 344 734-741. LORENZ, E. N. (1963). Deterministnon-periodic . J. Atmo- TAKENS, F. (1981). Detectingstrange attractors in turbulence. sphericSci. 20 130-141. Dynamical Systems and Turbulence. Lecture Notes in Math. LUMLEY, J. L. (1970). StochasticTools in Turbulence.Academic, 898 366-381. Springer,New York. New York. TAKENS, F. (1985). On the numericaldetermination of the di- MANDELBROT, B. B. (1982). The Fractal Geometryof Nature. W. mension of an attractor. Dynamical Systems and Bifurca- H. Freeman,New York. tions. Lecture Notes in Math. 1125 99-106. Springer, New MAY,,R.M. (1987). Chaos and the dynamicsof biological popula- York. tions.In DynamicalChaos (M. V. Berry,I. C. Percivaland TONG,H. (1990). Non-linear Time Series: A Dynamical System N. 0. Weiss,eds.) 27-43. PrincetonUniv. Press. Approach.Oxford Univ. Press. MEINHOLD, R. J. and SINGPURWALLA,N. D. (1983). Understand- VON PLATO, J. (1983). The method ofarbitrary functions. British ing the Kalman filter.Amer. Statist. 37 123-127. J. Philos. Sci. 34 37-47. MOLLER, M., LANGE, W., MITSCHKE,F., ABRAHAM, N. B. and WAHBA, G. (1990). Spline Models for Observational Data. SIAM, HtIBNER, U. (1989). Errors fromdigitizing and noise in Philadelphia. estimatingdimensions. Phys. Lett. A 138 176-182. WEGMAN, E. J. (1988). On randomness,determinism, and com- NESE, J. M. (1989). Quantifyinglocal predictabilityin phase putability. J. Statist. Plann. Inference20 279-294. space. Phys. D 35 237-250. WEST, M. and HARRISON J. (1989). Bayesian Forecasting and NICHOLIS, G. and PRIGOGINE, I. (1989). ExploringComplexity. W. Dynamic Models. Springer, New York. H. Freeman,New York. WOLFF, R. C. L. (1990). A note on the behavior of the correlation NYCHKA, D. MCCAFFREY, D., ELLNER, S. and GALLANT, A. R. integral in the presence of a time series. Biometrika 77 (1990). EstimatingLyapunov exponents with nonparamet- 689-697.