Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrology and © Author(s) 2010. This work is distributed under Earth System the Creative Commons Attribution 3.0 License. Sciences

HESS Opinions “A random walk on water”

D. Koutsoyiannis Department of Water Resources and Environmental Engineering, School of Civil Engineering, National Technical University of Athens, Greece

Invited contribution by D. Koutsoyiannis, recipient of the EGU Henry Darcy Medal 2009. Received: 5 October 2009 – Published in Hydrol. Earth Syst. Sci. Discuss.: 29 October 2009 Revised: 19 March 2010 – Accepted: 19 March 2010 – Published: 25 March 2010

Abstract. According to the traditional notion of randomness 1 What is randomness? and uncertainty, natural phenomena are separated into two mutually exclusive components, random (or stochastic) and In his foundation of the modern axiomatic theory of proba- deterministic. Within this dichotomous logic, the determin- bility, A. N. Kolmogorov (1933) avoided defining random- istic part supposedly represents cause-effect relationships ness. He used the notions of random events and random vari- and, thus, is physics and science (the “good”), whereas ables in a mathematical sense but without explaining what randomness has little relationship with science and no randomness is. Later, in about 1965, A. N. Kolmogorov and relationship with understanding (the “evil”). Here I argue G. J. Chaitin independently proposed a definition of random- that such views should be reconsidered by admitting that ness based on complexity or absence of regularities or pat- uncertainty is an intrinsic property of nature, that causality terns (which could be reproduced by an algorithm). Specifi- implies dependence of natural processes in time, thus cally, a of numbers is random if the smallest algorithm suggesting predictability, but even the tiniest uncertainty capable of specifying it to a computer has about the same (e.g. in initial conditions) may result in unpredictability number of bits of information as the series itself (Chaitin, after a certain time horizon. On these premises it is possible 1975; Kolmogorov, 1963, 1965; Kolmogorov and Uspenskii, to shape a consistent stochastic representation of natural 1987; from Shiryaev, 1989). Interestingly, Chaitin proved processes, in which predictability (suggested by determin- that, although randomness can be precisely defined in this istic laws) and unpredictability (randomness) coexist and manner and can even be measured, there cannot be a proof are not separable or additive components. Deciding which that a given real number (regarded as a series of its digits) is of the two dominates is simply a matter of specifying the random. time horizon and scale of the prediction. Long horizons of prediction are inevitably associated with high uncertainty, The move from this mathematical abstraction of a real whose quantification relies on the long-term stochastic number to the realm of real physical phenomena is not properties of the processes. straightforward. Here, commonly, randomness is contrasted to determinism. The movement of planets is a typical exam- Αἰών παῖς ἐστι παίζων πεσσεύων. Παιδός ἡ βασιληίη. ( Time is plea child of aplaying, deterministic throwing phenomenon, dice. whereas that of dice is (Time is a child playing, throwing dice. The ruling power is thought to be random. This reflects a dichotomous logic, ac- The ruling power is a child's ; Heraclitus; ca. 540-480 BC; Fragment 52) a child’s; Heraclitus; ca. 540–480 BC; Fragment 52) cording to which there exist two mutually exclusive types of events or processes – deterministic and random (or stochas- I am convinced that He does not throw dice. (Albert Einstein, in a letter to Max Born in 1926) I am convinced that He does not throw dice. (Albert tic). Such dichotomy is perceived either on ontological or on Einstein, in a letter to Max Born in 1926) epistemological grounds. In the former perception the natu- 1 What is randomness? ral events are thought to belong, in their essence, to these two different types, whereas in the latter it is regarded convenient Correspondence to: D. Koutsoyiannis to separate them into these types, where processes that we In his foundation of the modern([email protected]) axiomatic theory of , A.do N. not Kolmogorov understand or explain(1933) are considered random. When a avoided defining randomness. He used the notions of random events and random variables in a mathematicalPublished sense by but Copernicus without Publicationsexplaining onwhat behalf randomness of the European is. Later, Geosciences in about Union. 1965, A. N. Kolmogorov and G. J. Chaitin independently proposed a definition of randomness based on complexity or absence of regularities or patterns (which could be reproduced by an algorithm). Specifically, a series of numbers is random if the smallest algorithm capable of specifying it to a computer has about the same number of bits of information as the series itself (Chaitin, 1975; Kolmogorov, 1963, 1965, Kolmogorov and Uspenskii, 1987, from Shiryaev, 1989). Interestingly, Chaitin proved that, although randomness can be precisely defined in this manner and can even be measured, there cannot be a proof that a given real number (regarded as a series of its digits) is random.

The move from this mathematical abstraction of a real number to the realm of real physical phenomena is not straightforward. Here, commonly, randomness is contrasted to determinism. The movement of planets is a typical example of a deterministic phenomenon, whereas that of dice is thought to be random. This reflects a dichotomous logic, according to which there exist two mutually exclusive types of events or processes—deterministic and random (or stochastic). Such dichotomy is perceived either on ontological or on epistemological grounds. In the former perception the natural events are thought to belong, in their essence, to these two different types, whereas in the latter it is regarded convenient to separate them into these types, where processes that we do not understand or explain are considered random. When a classification of a specific process into one of these two types fails—and it usually does, except in a few cases such as the above examples of planets and dice—then a separation of the process into two different, usually additive, parts is typically devised. This perception has been dominant in geosciences, including hydrology. This thinking proceeds so as to form a reductionist hierarchy. Thus, each of the parts may be further subdivided into subparts (e.g., deterministic subparts such as periodic and aperiodic or trends). This dichotomous logic is typically combined with a manichean perception, in which the deterministic part supposedly represents causeeffect relationships and reason and thus is 2

586 D. Koutsoyiannis: A random walk on water classification of a specific process into one of these two types processes that we may understand, we may explain, but we fails – and it usually does, except in a few cases such as the cannot predict2. In other words, randomness and determin- above examples of planets and dice – then a separation of the ism (which, in turn, could be identified with predictability) process into two different, usually additive, parts is typically coexist in the same process, but are not separable or addi- devised. This perception has been dominant in geosciences, tive components. It is a matter of specifying the time horizon including hydrology. This thinking proceeds so as to form a and scale of prediction to decide which of the two dominates. reductionist hierarchy. Thus, each of the parts may be further This view, which will be illustrated in the sequel, is consis- subdivided into subparts (e.g. deterministic subparts such as tent with Kolmogorov’s and Chaitin’s views of mathematical periodic and aperiodic or trends). This dichotomous logic is randomness, as well as with K. Popper’s (1992) indetermin- typically combined with a manichean perception, in which istic world view. the deterministic part supposedly represents cause-effect re- In identifying randomness with unpredictability, the latter lationships and reason and thus is physics and science (the is meant in deterministic terms: any specific prediction algo- “good”), whereas randomness has little relationship with sci- rithm is not able to accurately calculate the future state of a ence and no relationship with understanding (the “evil”). The system and thus there is uncertainty. If, in our attempt to pre- random part is also characterized as “noise”, in contrast to the dict the future, we are not able to know the precise state, we deterministic “signal”. “Noise” is a contaminant that causes could lower our target and try to know how high or low the uncertainty, a kind of illness that should be remedied or elim- uncertainty is. Quantification of uncertainty is a useful target inated. and a feasible one. Naturally, it is achieved by means of prob- and statistics, which traditionally pro- ability. The classical definition of probability, i.e. the ratio vided the tools for dealing with randomness and uncertainty, of the favourable outcomes of a specified event to the num- have been regarded by some as the “necessary evil”, but not ber of possible outcomes, has several logical problems and as an essential part of physical sciences. This view has also does not help to study natural processes. Rather, we should affected hydrology and geophysics, particularly in the last follow the Kolmogorov (1933) system, in which probability couple of decades. Some tried to banish probability from hy- is a normalized measure, i.e. a function that maps sets (ar- drology, replacing it with deterministic sensitivity analysis eas where the unknown quantities lie) to real numbers. An and fuzzy-logic representations. Others attempted to demon- event A is just a set (of elements called elementary events) strate that irregular fluctuations observed in natural processes to which a probability, i.e. a number P (A) in the interval are au fond manifestations of underlying deterministic dy- [0, 1], is assigned. The notion of a random variable, i.e. a namics with low dimensionality, thus rendering probabilistic single-valued function x of the set of all elementary events descriptions unnecessary. Some of the above views and re- (so that to each elementary event ξ it maps a real number cent developments are simply flawed because they make er- x(ξ)), is central in this system, and is associated with a prob- roneous use of probability and statistics, which, remarkably, ability distribution function (F (x):=P {x≤ x})3 and a prob- provide the tools for such analyses (Koutsoyiannis, 2006). ability density function (f (x):=dF (x)/dx). The notion of a The entire logic of contrasting determinism with random- random variable allows probabilization of uncertainty, typi- ness is just a false dichotomy. To see this, it suffices to re- cal in Bayesian statistics (not to be confused with the lately call that P.-S. Laplace, perhaps the most famous proponent abused term of “Bayesian beliefs”). of determinism in the history of philosophy of science (cf. Laplace’s demon), is, at the same time, one of the founders of probability theory. According to Laplace (1812), “probabil- 2 Emergence of randomness from determinism ity theory is, au fond, nothing but common sense reduced to calculus1”. This harmonizes with J. C. Maxwell’s view that To illustrate that randomness coexists with determinism and “the true logic for this world is the of ” that the two do not imply different types of mechanisms, (Maxwell, 1850). Recently, the same view was epigrammat- or different parts or components in the time evolution, we ically formulated in the title of E. T. Jaynes’s (2003) book will study a toy model of a caricature hydrological system. “Probability Theory: The Logic of Science”. The system, shown in Fig. 1, and its toy model are designed Here I argue that the dominant dichotomous logic reflects 2According to Niels Bohr, “Prediction is difficult, especially of a na¨ıve and inconsistent view of randomness. It cannot help the future”. us see the unity of Nature. Are the movement of planets 3Some texts distinguish random variables, which are functions, and that of dice qualitatively different natural phenomena? from their values, which are numbers, by denoting them with upper Do they not obey the same physical laws? Abandoning this case and lower case letters, respectively. Since this convention has logic and seeking a more consistent view, I propose to iden- several problems (e.g. the Latin x and the Greek χ, if put in upper tify randomness with unpredictability. Randomness exists in case, are the same symbol X), other texts do not distinguish the two at all, thus creating other type of ambiguity. Here we follow 1Original in French: “la theorie´ des probabilites´ n’est, au fond, another convention, in which random variables are underscored and que le bon sens reduit´ au calcul”. their values are not.

Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/ Figures D. Koutsoyiannis: A random walk on water 587

Evidently, it is more interesting to study our system when v : it is “alive”, that is, out of the equilibrium. To this aim, as the Vegetation system is 2-D, we need one equation additional to Eq. (1) to τ : cover Transpiration model it, which we seek in conceptualizing the dynamics of vegetation. As x=0 represents the state where the vegetation φ : does not change, we may assume that soil water in excess, Infiltration x >0, will result in increase of vegetation and soil water in deficit, x <0, will result in decrease of the vegetation cover. The graph in Fig. 2 was constructed heuristically, according x : to this logic, and is described by the following equation: Soil water 3 max(1+(x − /β) ,1)v − Datum v = i 1 i 1 i 3 3 (2) max(1−(xi−1/β) ,1)+(xi−1/β) vi−1 Figure 1: A caricatureFig. 1. hydrologicalA caricature hydrologicalsystem for systemwhich the for whichtoy model the toy is modelconstructed. is where β=100 mm is a standardizing constant to make the constructed. equation dimensionally consistent. Summarizing, we have a 2-D toy model in discrete time i 1 whose state is described by the state variables (x , v ) =:x intentionally to be simple. A large piece of land is con- i i i x i 1 = 400 (with xi in bold denoting a vector) and whose dynamics vsidered,i on which water infiltrates and is stored in the soil is represented by Eqs. (1) (water balance) and (2) (vegeta- (without distinction fromx i 1 groundwater), = 200 from where it can 0.8 tion cover dynamics). To avoid approximation errors, the transpire though vegetation. Except infiltration, transpiration dynamics is expressed in discrete time by construction of Increased and water storage in the soil, no other hydrological processes vegetation the model (rather than derived as an approximation of con- High soil water are0.6 considered. To simplify the system, no change is im- x i 1 = 100 tinuous time differential equations) and thus is fully accu- posed to its external “forcings”. That is, the rates of infiltra-(above datum) 0 0 rate as far as the toy model is concerned. The system pa- tion φ and potential transpiration τp are assumed constant in x = 0 rameters are four and are assumed to be known precisely: time. The toy model is constructedi 1 assuming discrete time, Unchanged 0.4 Normal soil φwater=250 mm, τ =1000 mm, α=750 mm and β=100 mm. The denoted as i = 1, 2, ..., and that in each time unit 1t (say, p vegetation 0 (at datum) model is easy to program on a hand calculator or a spread- “year”), the input is φ:=φ 1tx i 1 =250= 100 mm and the potential out- 4 0 sheet . The system dynamics is graphically demonstrated in put τ :=τ 1t=1000 mm. The internal state variables, which 0.2 p p Fig. 3, where interesting geometrical surfaces appear, show- are allowed to vary in time, are two, thus shaping a two- Decreased x i 1 = 200 Low soil water ing that the transformation xi = f (xi−1) is not invertible. It vegetation dimensional (2-D) dynamical system: the fraction of the(below land datum) x i 1 = 400 should be stressed that no explicit “agent” of randomness that0 is covered by vegetation, v (0≤ v ≤1) and the soil wa- i i (e.g. perturbation by a random number generator) has been ter storage0xi. 0.2 The latter 0.4 is measured 0.6 above 0.8 a certain 1 datum, v i 1 introduced into the system. so that it can take positive values up to some upper bound Moreover, there is no external forcing imposing change Figure 2: Conceptualα (assuming dynamics that of watervegetation above forα thespills caricature as runoff) hydrological or negative system. onto the system. If any change occurs, it is caused by inter- values without a bound (i.e. −∞ ≤ xi ≤ α). The constant α is assumed to be 750 mm. If the vegetation at time i is v , the nal reasons, that is, by an “imbalance” of the vegetation cover i and water stored. To see this, let us assume that at time i=0 actual output through transpiration will be τi = vi τp. Thus, the water balance equation is the system state is somewhat different from the equilibrium state, setting initial conditions x0=100 mm (6=0) and v0=0.30 xi = min(xi−1 +φ −viτp,α) (1) (6=0.25). Using Eqs. (1) and (2) we can calculate the system state (xi, vi) at times i >0. The trajectories of x and v for We can observe that, if at some time i, time i=1 to 100 are shown in Fig. 4. Apparently, the system vi=φ/τp=250/1000=0.25, then the water balance re- remains “alive”, i.e. it exhibits change all the time, and its sults in xi = xi−1+φ −viτp=xi−1. Assuming that the system state does not converge to the equilibrium. The trajectories, dynamics is fully deterministic, continuity demands that albeit produced by simple deterministic dynamics, are inter- there should be some specific value of xi−1 for which esting and30 seem periodic; we will discuss their properties in v = v − . Without loss of generality, we set this value x=0; i i 1 Sect. 5. that is, we define the datum in such a way that the vegetation As the dynamics is fully deterministic, one may be remains unchanged if water is stored up to the datum. Thus, tempted to cast predictions for arbitrarily long time hori- the state (v=0.25, x=0) represents an equilibrium state: if zons. For example, for time 100, iterative application at some time the system happens to be at this state, it will of the simple dynamics allows to calculate the prediction remain there for ever. In other words, once the system x100=−244.55 mm, v100=0.7423 (as plotted in the right end reaches its equilibrium state, it becomes a “dead” system, exhibiting no change. 4The interested reader can find a spreadsheet with the toy model in http://www.itia.ntua.gr/en/docinfo/923/.

www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010 Figures

v : Vegetation τ : cover Transpiration

φ : Infiltration

x : Soil water Datum

Figure 1: A caricature hydrological system for which the toy model is constructed. 588 D. Koutsoyiannis: A random walk on water dynamics, which is fully consistent with this understanding, very simple, fully deterministic, 1 dynamics,nonlinear andwhich chaotic. is fully consistent with this understanding, very simple, fully deterministic, v x i 1 = 400 i nonlinear and chaotic. However,x i 1 = 200 science is not identical to understanding. As R. Feynman (1965) stated, “I dynamics, which is fully0.8 consistent with this understanding, very simple, fully deterministic, However, science is not identical to understanding. As R. Feynman (1965) stated, “I Increased think I can safely say that nobody understands quantum mechanics”—and this does not nonlinearvegetation and chaotic. High soil water 0.6 think I can safely say that nobody understands quantum mechanics”—and this does not dynamics, which is fully consistentpreclude with thisx i 1 under= quantum 100 standing, mechanics very simple, from (abovefully being deterministic, datum) science. Literally, the name science points to However, science is not identicalpreclude to understanding quantum mechanics. As R. Feynman from being (1965) science. stated, “I Lite rally, the name science points to nonlinear and chaotic. overstanding *, and understanding is not identical, nor a prerequisite, to overstanding. Perhaps think I can safely say that nobodyoverstanding understandsx i 1 =* ,0 quanand tumunderstanding mechanics”—and is not identical, this does nor not a prerequisite, to overstanding. Perhaps Unchanged However, science0.4 is not identicalan imbalance to understanding of understanding,. As R. Feynman whichNormal (1965) soil pertains stated,water “I to observing the details of a system, and preclude quantum mechanics from being science. Literally, the name science points to vegetation an imbalance of understanding, which(at datum) pertains to observing the details of a system, and think I can safely say that nobody understandsx = 100 quantum mechanics”—and this does not overstanding *, and understanding is notoverstanding, identical,i 1 nor which a prerequi aims site,at an to overall overstanding. image ofPerhaps the system, the “forest” rather than the “tree”, 0.2 overstanding, which aims at an overall image of the system, the “forest” rather than the “tree”, preclude quantum mechanics fromkeeps being science science. alive. Lite rally, the name science points to an imbalanceDecreased of understanding, which pertains to obx serving = 200 the details of a system, and * i 1 Low soil water overstandingvegetation , and understanding keepsis not identical, science alive.nor a prerequi site, to overstanding.(below datum) Perhaps overstanding, which aims at an overall image of the system, thex “forest”i 1 = 400 rather than the “tree”, an imbalance of understanding,0 which pertains We can to thus observing hope the that, details despite of a system, achieving and a good understanding of the system keeps science alive. 0 0.2 We 0.4 can 0.6 thus hope 0.8 that, 1 despite achieving a good understanding of the system overstanding, which aims at an overallmechanisms image of the and system, a precise the “forest” formulationv i 1 rather than of its the dynamic “tree”, s, science may have not come to an end, mechanisms and a precise formulation of its dynamic s, science may have not come to an end, Fig. 2. Conceptual Figurekeeps We dynamics science 2: can Conceptual thus of alive. vegetation hope dynamics for that, the caricature despiteof asvegetation far hydrological achievingas our for toy the system. a modelcaricature good is un concerned. hydrologicalderstanding Let system. of us the now system focus on predictions, especially of the future, as far as our toy model is concerned. Let us now focus on predictions, especially of the future, mechanisms We and can a thusprecise hope formulation that, despitewhich of its achieving isdynamic a crucial s, a science good target unmay derstanding of have science—with not ofcome the to systemevenan end, high er importance in engineering. Does, which is a crucial target of science—with even higher importance in engineering. Does, as farmechanisms as our toy andmodel a precise is concerned. formulation really,Let ofus its nowdeterministic dynamic focuss, on science predictions, dynamics may have allowespecially not acome reliable of tothe an future, preend,diction at an arbitrarily long time horizon, really, deterministic dynamics allow a reliable prediction at an arbitrarily long time horizon, whichas far is as a our crucial toy model target is of concerned. science—with800 Let us now even fo highcus oner predictions, importance especially in engineering. of the future, Does, as inx i our above example? In the previous section, 1in vconstructingi our prediction for time 100 really, deterministic dynamics allowas600 a inreliable our above prediction example? at an Inarbitrarily the previous long timesection, horizon, in0.9 constructing our prediction for time 100 which is a crucial target of science—with600800 even higher importance in engineering. Does, (400x100 = –244.55 mm, v100 = 0.7423), we, explicitly0.8 or implicitly,0.91 assumed that we know the 400600 0.80.9 as in our above example? In the previous(200x100 = section, –244.55 in mm,constructing v100 = 0.7423),our prediction we, explicitly for time 100 or0.7 implicitly, assumed that we know the really, deterministic dynamics allow a reliable prediction at an arbitrarily long time horizon, 0.70.8 200400 0.6 parameters0 and initial state with full precision. However,0.60.7 these are real numbers. It is now (x100as = in –244.55 our above mm, example? v100 = 0.7423),In the previousparameters we, explicitly 0200section, and i norinitial constructing implicitly, state withourassumed prediction full tprecision.hat forwe timeknow H 100owever, the0.5 these are real numbers. It is now 200 0.50.6 2000 0.4 0.0 well known that1.0 not only can real numbers not be kn0.40.5own with full precision, but (with (x100 = –244.55 mm, v100 = 0.7423),400 we, 400200explicitly or implicitly, assumed that we know the 0.2 parameters and initial state with fullwell precision. known H thatowever,0.8 not these only canare real real numbers. numbers It notis now be0.3 kn0.30.4own with full precision, but (with 0.4 600 600400 0.6 0.2 0.20.3 probability 1), they0.4 are not computable (Chaitin, 2004). Therefore, our further investigations well0.6 parameters known that and notinitial only state can with real fullprobability numbers precision.800600 not1), H owever, bethey kn areown these not with computable are fullreal precision,numbers. (Chaitin, It but is now2 (with004). 30 Therefore,0.10.2 our further investigations 800 0.2 0.1 0.8 00.1 1000800 0.0 0 v i 1 1.0well known that not only can realwill1000 numbers incorporate not bev ithe kn 1 ownpremise with that full precision,a continuous but (with(rea l) variable cannot ever be described with 0 probability 1), they are 0 not computablewill (Chaitin,incorporate 2004). the Therefore,premise that our afurther continuous investigations (real) variable cannot ever be described with 200 400 600 200 400 600 800 600 400 200 800 600 400 200 1000 probability1000 1), they are not computablex full (infinite) (Chaitin, 2precision,004). Therefore, particularly our further if it investigations varies in time. This premise, which will be called the will incorporate the premise thati 1 a fullcontinuous (infinite) (rea precision,l) variable particularly cannot ever if beit variedescribeds xini 1 time. with This premise, which will be called the will incorporate the premise that a continuous (real) variable cannot ever be described with Fig.Figure 3. Graphicalfull 3: Graphical(infinite) depiction precision, of depiction the toy model particularly of dynamics. the premisetoyifpremise it variemodel of sof in incomplete incompletedyna time.mics. This precision,premise, precision, which is is consistent consistent will be called with with mathematicsthe mathematics (cf. (cf. Chatin’s Chatin’s results), results), as aswell well full (infinite) precision, particularly if it varies in time. This premise, which will be called the premise of incomplete precision, is consistentasas with with physics physics with mathematics (cf. (cf. W. W. Heisenberg’s, Heisenberg’s, (cf. Chatin’s 1927, 1927, results), uncerta uncerta as wellintyinty principle). principle). of Fig. 4). Furthermore,premise of incomplete one may beprecision, tempted is to consistent think that with themathematics name science (cf. pointsChatin’s to results),overstanding as well5, and understand- 800as with physics (cf. W. Heisenberg’s, 1927, uncertainty principle). 0.9 the “primitive science”Soil that water, this system x represents, if It any, is Equilibrium: reasonable,ing is not identical, then,x = 0 to nor assume a prerequisite, that there to overstanding. is some small Per- uncertainty, at least in the x as with physics (cf. W. Heisenberg’s, 1927, It isuncerta reasonable,inty principle). then, to assume that there is some smallv uncertainty, at least in the hasi come to an end withVegetation the above discourse.cover, v We have al- Equilibrium:haps an imbalance v = 0.25 of understanding, which pertains toi ob- ready achieved It an is understanding reasonable, then, of the to system, assumeinitial its that driving conditions there isserving so (initialme the small details values uncertainty, of aof system, state atvariabl and least overstanding, es). in the Perhaps which it aimswould be reasonable to assume 600 It is reasonable, then, to assumeinitial thatconditions there is (initial some small values uncertainty, of state variabl at least es). in thePerhaps 0.8 it would be reasonable to assume mechanismsinitial and conditions the causative (initial relationships: values of (a) state there variabl is wa-es). Perhapsat an overall it would image be of thereasonable system, theto “forest”assume rather than the ter balance (conservationinitial conditions of mass); (initial (b) values excessive ofthat thatstate soil there therevariabl water is is es).uncertainty uncertainty “tree”,Perhaps keeps it also alsowould science in in thebe the alive. reasonableparameters parameters to anassume and din in model model equation equation (2) (2) (but (but not not in in(1), (1), which which causes400 increasethat there of vegetation;is uncertainty (c) deficient also in the soil parameters water causes and in model equation (2) (but not in (1), which 0.7 that there is uncertainty also in therepresents parametersrepresents an preservation preservationd in modelWe can equation of thus of mass). mass). hope (2) that,(but However, However, not despite in (1), toachieving to which keep keep our a our good study study under- simple, simple, we we will will restrict restrict our our decreaserepresents of vegetation; preservation (d) excessive of vegetation mass). However, causes de- to keepstanding our study of the simple, system mechanisms we will restrict and a precise our formulation crease200 of soilrepresents water; and preservation (e) deficient of vegetation mass). investigationinvestigation However, causes in- to keepto toof the the itsour dynamics,uncertainty uncertainty study simple, science of of initialwe initial may will haveconditi conditi restrict notons. come ons. our Figure to Figure an0.6 end, 5 shows5 as shows far the the trajectory trajectory of ofthe the soil soil crease ofinvestigation soilinvestigation water. And to wetheto the haveuncertainty uncertainty completely of of initial and initial precisely conditi conditions.ons.as FigureFigure our toy 55 modelshows isthe concerned. trajectorytrajectory of Letof the the us soil nowsoil focus on pre- formulated the system dynamics, which is fullywaterwater consistent x x for for the the dictions,already already especiallyexamined examined of initial theinitial future, conditions conditions which is( x a(0x crucial=0 =100 100 targetmm, mm, ofv0 v=0 0.30)= 0.30) and and for for five five more more water0 water x for x forthe thealready already examined examined initial initial conditions conditions ( x( x0 = = 100100 mm,mm, v0 == 0.30)0.30) andand for for five five more more 0.5 with this understanding, very simple, fully deterministic,setssets ofof initial initial conditions conditionsscience0 – with only only even 0slightly slightly higher (< importance (< 1%) 1%) dif dif inferentferent engineering. from from the Does,the basic basic set. set. At At short short times, times, the the nonlinearsets and setsof chaotic. initialof initial conditions conditions only only slightly slightly (< (< 1%) 1%) dif differentferentreally, fromfrom deterministic thethe basic set.set. dynamics AtAt shortshort allow times, times, a reliable the the prediction at However,200 science is not identical to understanding. As 0.4

R. Feynman (1965) stated, “I think I can safely say that no- 5 ** Science < Latin Scientia < translation of Greek Episteme * * ScienceScience < < Latin Latin Scientia Scientia < < translation translation of of Greek Greek Ep Epistemeisteme (Επιστήη) (Επιστήη) < Epistasthai< Epistasthai (Επίστασθαι) (Επίστασθαι) = to= ktonow know body400 understands Science Science < quantum Latin < Latin Scientia mechanics”Scientia < translation< translation – and of this ofGreek Greek does Ep Ep notistemeisteme (Επιστήη) (Επιστήη) << EpistasthaiEpistasthai (Επίστασθαι)(Επίστασθαι) = == to toto k knownowknow how 0.3 to do < [epi precludehow quantum howto do to < mechanicsdo [epi < [epi (επί) (επί) = fromover] = over] being+ [histasthai + [histasthai science. (ίστασhowhow (ίστασ Literally, to toθαι) doθαι) do =< <= to[epi to[epi stand] stand] (επί) (επί) = == to=to over] overstand.over]overstand. + + +[histasthai [histasthai [histasthai (ίστασ (ίστασθαι)θαι) = to= to stand]stand] stand] == to to= overstand. tooverstand. overstand. 7 7 600 7 7 0.2 Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/

800 0.1

1000 0 0 10 20 30 40 50 60 70 80 90 100 i

Figure 4: System trajectory for 100 time steps assuming initial conditions x0 = 100 mm and v0 = 0.30.

31

800 x i 1 v i 600 0.9 600800 400 0.8 0.91 400600 0.80.9 200 0.7 0.70.8 200400 0.6 0 0.60.7 0200 0.5 200 0.50.6 2000 0.4 0.0 1.0 0.40.5 400 400200 0.2 0.8 0.3 0.30.4 0.4 600 600400 0.6 0.2 0.20.3 0.6 0.4 800 800600 0.1 0.10.2 0.8 0.2 1000800 0.0 00.1 v i 1 1.0 1000 v i 1 0 0 0 200 400 600 200 400 600 800 600 400 200 800 600 400 200 1000 1000 x D. Koutsoyiannis: Ai random1 walk on waterx i 1 589 Figure 3: Graphical depiction of the toy model dynamics.

800 0.9 800 Soil water, x Equilibrium: x = 0 x v x i Vegetation cover, v Equilibrium: v = 0.25 i i 600 0.8 600

400 0.7 400

200 0.6 200

0 0.5 0

200 0.4 200

400 0.3 400

600 0.2 600

800 0.1 800

1000 0 1000 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 i i

Figure 5: Evolution of soil water x as in Figure 4 but with uncertainty in initial conditions: FigureFig. 4.4: SystemSystem trajectory trajectory for 100 time for steps 100 assu timeming steps initial assumingconditions x0 initial= 100 mm condi- and v0 Fig. 5. Evolution of soil water x as in Fig. 4 but with uncertainty Bold blue line corresponds to initial conditions s = 100 mm, v = 0.30 and the other five lines =tions 0.30. x0=100 mm and v0=0.30. in initial conditions: bold blue line0 corresponds0 to initial conditions represents0=100 initial mm, conditionsv0=0.30 slightly and (< the 1%) other different. five lines represent initial condi- tions slightly (<1%) different. an arbitrarily long time horizon, as in our above example? In the previous section, in constructing our prediction for be thought of as the emergence of randomness from deter- time 100 (x100=−244.55 mm, v100=0.7423), we, explicitly or implicitly, assumed that we know the parameters and initial minism. Later, in Sect. 5, we will study the inverse case, state with full precision. However, these are real numbers. i.e. whether randomness could give rise to determinism (at a It is now well known that not only can real numbers not be31 macroscopic level). known with full precision, but (with probability 1), they are We can easily imagine that if the system dynamics were not computable (Chaitin, 2004). Therefore, our further inves- different, so as to drive it to its “dead” equilibrium state, there tigations will incorporate the premise that a continuous (real) would not be uncertainty in the future. The nonlinear type of variable cannot ever be described with full (infinite) preci- dynamics we used is the agent that made the system “alive”, sion, particularly if it varies in time. This premise, which i.e. changing and not dying. Apparently, what makes the sys- will be called the premise of incomplete precision, is consis- tem alive is the same agent that creates the uncertainty. Only tent with mathematics (cf. Chatin’s results), as well as with dead systems are certain – and this might be useful to recall physics (cf. W. Heisenberg’s, 1927, uncertainty principle). when thinking about eliminating uncertainty. 32 It is reasonable, then, to assume that there is some small The type of uncertainty we observe here could be hardly uncertainty, at least in the initial conditions (initial values of classified in categories typically used in hydrology. It is not state variables). Perhaps it would be reasonable to assume model uncertainty, i.e. incomplete representation of reality, that there is uncertainty also in the parameters and in model because our system is artificial. It is not parameter uncer- Eq. (1) (but not in Eq. (1), which represents preservation of tainty, because we assumed that the parameters are com- mass). However, to keep our study simple, we will restrict pletely known. It is not even data uncertainty, as our inputs our investigation to the uncertainty of initial conditions. Fig- and outputs are assumed fully known and constant, and in ure 5 shows the trajectory of the soil water x for the already fact we have not assumed any measurement error. The un- examined initial conditions (x0=100 mm, v0=0.30) and for certainty in the initial conditions should be thought of as a five more sets of initial conditions only slightly (<1%) dif- consequence of the premise of incomplete precision, rather ferent from the basic set. At short times, the differences in than as a measurement error. One may think that the assumed the trajectories are not visible in Fig. 5. At about time 20, uncertainty 1% is too high to represent this premise. But we the differences become visible and slightly later (time ∼30) used this number just for better illustration. One may easily they become large. Soon thereafter, the different trajectories experiment with lower uncertainties in initial conditions to become unrelated to each other. This shows that a tiny un- see that the behaviour does not change. Only the time span certainty in initial conditions gets amplified after some time, of predictability changes. For example, reducing uncertainty − a fact well known in chaotic systems since H. Poincare’s´ dis- from 1% to 10 6 will extend the predictability time span, but covery of chaos (e.g. Poincare,´ 1908). As a result, the de- not more than double it. terministic dynamics can produce good predictions only for All alive natural systems behave more or less this way, and short time horizons. For longer time horizons, the determin- only the predictability time span changes. This view unifies istic predictions become extremely inaccurate and useless. phenomena as diverse as the movement of dice and planets, In other words, at long times the system behaviour is unpre- although in the former the time span of predictability is less dictable, that is, random, whereas at short times it is very than a second, whereas in the latter it is several millions of well described by the deterministic dynamics. This could years. As strange as it seems, even the solar system is chaotic www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010 590 D. Koutsoyiannis: A random walk on water and unpredictable in such long horizons (Laskar, 1989). For 800 x i example, it has been shown that it may never be possible to 600 accurately calculate the location of the Earth in its orbit 100 400 million years in the past or into the future (Duncan, 1994; 200 Lissauer, 1999), whereas an accurate solution for the orbital and precessional motion of the Earth is limited between 35 0 to 50 million years (Laskar, 1999). 200

400

600 3 From determinism to stochastics exact trajectory 800 envelope of 5 model runs envelope of 1000 model runs As simple and obvious as the premise of incomplete preci- 1000 0 10 20 30 40 50 60 70 80 90 100 sion may seem, it implies a radically different perception i and study of physical phenomena. First of all, the proper visualization of the trajectory of a system’s evolution can no 800 x i longer be a line or a thread. Rather it should be a stream tube 600

(a notion familiar to hydrologists) of nonzero size (distance 400 between its imaginary walls). The path this tube follows is 200 important to know, but the size of the tube is equally impor- tant. This size is not constant, but varies in time. When an 0 200 observation is made at a time io, the size of the tube at this time becomes tiny. In our hypothetical system this small size 400 represents an error related to the premise of incomplete pre- 600 cision but in real-world systems it represents a (usually much exact trajectory 800 envelope of 5 model runs higher) observation error. However, for future times of pre- envelope of 1000 model runs 1000 diction (i > io), as well as past times (i < io) at which no 0 10 20 30 40 50 60 70 80 90 100 i observation had been made, the size becomes much larger. Initially, we can try to define the imaginary walls of this FigureFig. 6.6: StreamStream tube representation tube representation of the evolution of of thesoil water evolution x (with initial of soilconditions water as stream tube as the envelope curves of several model runs inx Figure(with 4 initialand 1% uncertainty) conditions in a as deterministic in Fig. 4setting and using 1% envelope uncertainty) curves constructed in a de- terministic setting using envelope curves constructed by 5 or 1000 with perturbed initial conditions within the assumed uncer- by 5 or 1000 simulation runs: (upper) when only x0 is observed; (lower) when x0 to x30 are observed.simulation runs: (upper) when only x0 is observed; (lower) when x0 tainty bounds. The tube visualization of this type for xi is to x30 are observed. 33 shown in Fig. 6 in two cases, when only x0 is observed and when x to x are observed, and with only 5 model runs (as 0 30 Here comes probability and stochastics, which will give us in Fig. 5) as well as with 1000 model runs. At small times a good description of the tube size (independent of the num- the tube has a size too tiny to be seen and a rough shape. The ber of model runs) as well as a profile of the likelihood that latter is typical in systems with discrete-time dynamics (in the system state is at any specified position, a profile remind- continuous time it would be smooth, but the dynamics and ing of the profile of longitudinal velocity across a stream tube the calculations of a continuous-time model would be too in real flows. One may wonder: Is it permissible to use prob- complicated to serve our purpose). At long times, the size ability in a system that is purely deterministic, as the system gets much larger and increases with the number of model we investigate here is? The answer I propose is a categorical runs that are used to construct the envelope curves. It can “yes”. This answer is consistent with the unified notion of be expected that, as this number tends to infinity, the stream randomness discussed above, as well as with the concept of tube (the zone between the envelopes) will tend to cover all probabilization of uncertainty, that is, the axiomatic reduc- available space; in the case of x, this is the interval between tion from the notion of an uncertain quantity to the notion of −∞ and α=750 mm. Apparently, the dependence of the size a random variable (Robert, 2007). To my perception, nothing on the number of model runs and the large or infinite size in the Kolmogorov (1933) axiomatic system prohibits this of tube as this number increases are deficiencies of the enve- reduction. In a probability-theoretic context, an unknown lope method. Because of these deficiencies, a deterministic value x is a realization of a random variable x and is as- approach of this type, i.e. based on lower and upper physical i i sociated with a probability density function f (x ). A family bounds (cf. the probable maximum precipitation concept), i of random variables x , (arbitrarily, usually infinitely, large) cannot help to effectively describe the stream tube size and i is a whereas a realization of the stochastic the uncertainty. 6 process, i.e. a series of numbers xi is a time series .

6In some texts the two terms are used as synonymous, but dis- tinction helps avoid ambiguity and misunderstanding.

Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/ D. Koutsoyiannis: A random walk on water 591

1 Probability along with the related fields of statistics and Time i = 0 f(x ) stochastic processes are currently described by the collec- Time i = 100, estimated by Monte Carlo 7 tive term stochastics . The current meaning of this scientific 0.1 Gaussian fitted to time i = 100 term is no different from that first given by Jakob Bernoulli Approximation for time i = 100 based on past data assuming ergodicity (1713 – Ars Conjectandi, written 1684–1689). Specifically, 0.01 lowerlower and and upper upper physical physical bounds bounds (cf. (cf. the the probable probable m aximummaximum precipitation precipitation concept), concept), cannot cannot Bernoulli defined stochastics as the science of prediction, or help to effectively describe the stream tube size and the uncertainty. help to effectively describe the stream tube sizethe and science the uncertainty.lower of measuring and upper as physical exactly bounds as possible (cf. the probable proba- maximum0.001 precipitation concept), cannot bilities of events. (In this respect, stochastics should not be Here comes probability and stochastics, which willhelp giveto effectively us a good describe description the stream of thetube size and the uncertainty. Here comes probability and stochastics, identifiedwhich will with give the us very a common good description ARMA or of similar the types of 0.0001 models.) tubetube size size (independent (independent of of the the number number of of simulations simulations) as) aswell well as Here asa profilea profilecomes of probability ofthe the likelihood likelihood and that stochastics, that which will give us a good description of the To make up a stochastic formulation of the evolution of the 0.00001 thethe system system state state is isat atany any specified specified position, position, a system,pra profileofile reminding first, remindingtube we size fully of (independent ofutilizethe the profile theprofile known ofof ofthelongitudinal deterministiclongitudinal number of simulations dynam- ) as well1000 as a profile 800 600 of the 400 likelihood 200 0 that 200 400 600 800 1000 x velocity across a stream tube in real flows.ics x Onei=S ( maxthei−y1 )system, wonder: where stateS Isis is the it at permissible vectorany specified function to position, representing use a profile reminding of the profile of longitudinal velocity across a stream tube in real flows. One may wonder: Is it permissible to use FigureFig. 7: Probability 7. Probability densitydensity functions functions f (x) of thef system(x) of state the x system (soil water) state forx times(soil i = 0 the deterministic dynamics. In our system, the state xi is i i velocity across a stream tube in real flows. Oneand 100. mawater)y wonder: for times Isi=0 it and permissible 100. to use probabilityprobability in in a systema system that that is ispurely purely deterministi deterministithec, vector c,as asthe the (x systemi, systemvi) and we we the investigate investigate transformation here here is?S is? isThe describedThe by answer I propose is a categorical “yes”. ThisEqs. answe (1) andr isprobability (2)consistent and is graphicallyin with a system the depictedunifiedthat is purely notion in Fig. deterministi 3.of Second,c, as 800the system we investigate here is? The answer I propose is a categorical “yes”. This answer is consistent with the unified notion of Low uncertainty High uncertainty we assume that the probability density at time 0, f (x0), is x i In the iterative application of the stochastic description of answer I propose is a categorical “yes”. This answer 600is consistent with the unified notion of randomnessrandomness discussed discussed above, above, as as well well as as with with known.the the con con Third,ceptcept of we ofprobabilization useprobabilization the following of of conceptuncertainty uncertainty from, the, theory the system evolution we encounter two difficulties. First, de- of dynamicalrandomness systems: givendiscussed the probability above, as densitywell as functionwith the con400spitecept of being probabilization simple, the functionof uncertaintyS is not, invertible and the thatthat is, is, the the axiomatic axiomatic reduction reduction from from the the notion notion of of an an uncertain uncertain quantity quantity to tothe the notion notion of ofa a −1 at time i−1, f (xi−1), that of next time i, f (xi), is given integral over the counterimage S (A) needs to be evalu- that is, the axiomatic reduction from the notion of an200 uncertain quantity to the notion of a randomrandom variable variable (Robert, (Robert, 2007). 2007). To To my my perception, perception,by the Frobenius-Perron n othing nothing in in the operator the Kolmogorov KolmogorovFP, i.e. f (1933)( x (1933)i)=FP f (xi−1), ated numerically. Second, as the deterministic formulation is random variable (Robert, 2007). To my perception,quite nothing satisfactory in the for Kolmogorov short time horizons,(1933) the stochastic for- axiomatic system prohibits this reduction. uniquely In a probabilitytheoretic defined by an integral context, equation an unknown (e.g. Lasota and 0 axiomatic system prohibits this reduction. In a probabilitytheoretic context, an unknown mulation gets more meaningful for long ones. Iterative appli- Mackey, 1991),axiomatic which system in our prohibits case takes this the reduction. following In sim- a probabilitytheoretic context, an unknown 200 valuevalue x i xisi isa arealization realization of of a arandom random variable variable plifiedxi xandi and form,is isassociated associated with with a probabilitya probability density density cation of Eq. (4) over time will result in multiple integrations, value xi is a realization of a random variable xi and isso associated that eventually, with a for probability long time horizons,density we need to perform function f(x ). A family of random variables x , (arbitrarily, usually infinitely, large) is a 400 function f(xi).i A family of random variables xi, i (arbitrarily,∂2 Z usually infinitely, large) is a a high dimensional numerical integration. This is difficult, FPf (x) = function f(xi). f A ( u family)du of random variables (3)xi, (arbitrarily, usually infinitely, large) is a − 600 exact trajectory stochasticstochastic process process whereas whereas a realizationa realization of of the the sto stochasticchastic process,∂x∂v process,S i.e.,1 i.e.,(A) a seriesa series of ofnumbers numbers xi isxi is unless a stochastic integration method is used. Specifically, stochastic process whereas a realization of the stochasticit isprocess, easily shown95%i.e., prediction a series (e.g. limits Metropolis of numbers and xi is Ulam, 1949; Niederre- * 800 a timea time series series.* . −1 where A :=a {timex; x series≤ (x.,* v)} and S (A) is the counterim- iter,0 1992) 10 that 20 for a 30 number 40 of 50dimensions 60 70d >4, 80 a stochas- 90 100 age of A, i.e. the set containing all points x whose mappings tic (Monte Carlo) integration method (in which the functioni ProbabilityProbability along along with with the the related related fields fields of ofstatistics statistics and and stochastic stochastic processes processes are are S(x) belong to AProbability. Iterative application along with of the the related equation fields can ofFigure statisticsevaluation 8: Stream tube and points representation stochastic are taken of the processes evoluti at random)on of are soil iswater more x (with accurate initial conditions (for as † † f x ) i currentlycurrently described described by by the the collective collective term term stochastics stochasticsdetermine. .The thecurrently The densitycurrent current described meaning( meaningi for by anyof the ofthis time collectivethis scientific .scientific term stochasticsin Figure.†the The 4 same and current 1% total uncertainty) meaning number in a ofstochasticof evaluationthis scientificsetting using points) Monte than Carlo classical prediction nu-limits for This shows that the stochastic representation has an an- 95% probabilitymerical of integration, bracketing the basedtrue state on betwee a gridn therepresentation stream tube “walls” of(1000 the simulation in- termterm is isno no different different from from that that first first given given by by Jakob Jakob Bernoulli Bernoulliterm (1713—Ars is (1713—Ars no different Conjectandi, Conjectandi,from that first written written given by Jakob Bernoulli (1713—Ars Conjectandi, written alytical expression, as has the deterministic. However, the runs).tegration space. 16841689). Specifically, Bernoulli defined stochastic stochastics representation as the science refers of to prediction the evolution, or the in time of In our example, the Monte Carlo method does not in- 16841689). Specifically, Bernoulli defined stochastics 16841689). as the science Specifically, of prediction Bernoulli, or the defined stochastics as the science of prediction , or the 34 admissible sets and densities (the stream tube visualization), volve other calculations than those we did to construct the science of measuring as exactly as possible the probabilities of events. (In this respect, science of measuring as exactly as possible rather the pro thanbabilitiesscience to trajectories of of measuring events. of points (In as (the this exactly thread respect, as visualization). possible the proenvelopesbabilities above. of events. It is (In so verythis respect, simple that it even bypasses − stochasticsstochastics should should not not be be identified identified with with the theFrom very very the common deterministic,commonstochastics ARMA ARMA should “exact” or or not similar but similar be inaccurate, identified types types of thread-likewith of the very thecommon calculation ARMA of S or 1 similar(A). Results types for of the density function trajectory x =S(x − ), we have moved to the tube-like tra- f (x) of the system state x (soil water) for time i=100, in models.)i i 1 i models.)models.) jectory: comparison to that for time i=0, are shown in Fig. 7. The To make up a stochastic formulation of the evolutiMonteon Carlo of the integration system, first, was performed we fully assuming f (x0) to To To make make up up a a stochastic stochastic formulation formulation of of the the evoluti evoluti∂2 onZ on of of the the system, system, first, first, we we fully fully be uniformly extending 1% around the value x0=(100 mm, fi(x) = fi−1(u)du x(4) S x S x S xutilize− 1 the knownS deterministic dynamics i = ( i – 1), where is the vector function utilizeutilize the the known known deterministic deterministic dynamics dynamics x i =i =S (x(i∂x∂v – i1 ),– 1 ), where S where(A) S is is the the vector vector function function 0.30) and using 1000 simulations. The Monte Carlo method is very powerful, x yet so easy that we may fail to notice that representing x x the deterministic dynamics. In our system, the state i is the vector (xi, vi) and the representingrepresenting the the deterministic deterministic dynamics. dynamics. In In our ourwhich sys system,tem, has the beenthe state state writteni isi isthe in the avector mannervector (x i(, slightly xvi,i) vandi) and differentthe the from we are doing numerical integration. Technically, its appli- Eq. (3). Clearly,transformation the stochastic S is described formulation by equations does not (1) disre- and (2) and is graphically depicted in Figure 3. transformationtransformation S Sis isdescribed described by by equations equations (1) (1) and and (2) (2) and and is isgraph graphicallyically depicted depicted in inFigure Figure 3. 3. cation to construct Fig. 7 would be done even intuitively, gard the deterministic dynamics: it is included in the coun- without knowing that there is some concrete mathematical terimage S − 1 (A) . However, it can be easily extended to background (Eq. 3) behind our simulations. However, being describe non-deterministic* In some texts the dynamics two terms by generalizing are used as synonymous the FP , conscious but distinction of the helps theoretical avoid ambiguity background and and the associated * * In In some some texts texts the the two two terms terms are are used used as as synonymous synonymousoperator, but(Lasota, but distinction distinction and Mackey, helps helps avoid 1991). avoid ambiguity ambiguity and and misunderstanding. strengths and limitations, results in most efficient use of the misunderstanding.misunderstanding. method, in acceleration of scientific progress, and in avoid- 7 Stochastics† Stochastics< Greek < Greek Stochasticos Stochasticos (Στοχαστικός) << SStoc-tochazesthai (Στοχάζεσθαι) < Stochos (Στόχος); Stochos = † † ance of confusions and potential errors. Stochastics Stochastics < Greek< Greek Stochasticos Stochasticos (Στοχαστικός) (Στοχαστικός) < S (b) to guess or conjecture (the target) > It is observed in Fig. 7 that moving from time i=0 to i=100, target;target; Stochazesthai Stochazesthai = (a)= (a) to toaim, aim, point, point, or orshoot shoot (an (an arrow) arrow)Stochazesthai at aat target a target => (b) (a)> (b) to to toguess aim, guess point,or orconjectur conjectur or shoote (thee (the (an target) arrow)target) > > at a target > (b) to guess(c) to or imagine, conjecture think (the deeply, target) bethink,> (c) contemplate, to imagine, cogitate, think meditate.the density changes from concentrated to broad and from (c)(c) to toimagine, imagine, think think deeply, deeply, bethink, bethink, contemplate, contemplate, cogitate, cogitate,deeply, meditate. meditate. bethink, contemplate, cogitate, meditate. uniform to Gaussian; the theoretical10 Gaussian curve is also 1010 www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010 1 Time i = 0 f(x ) Time i = 100, estimated by Monte Carlo

0.1 Gaussian fitted to time i = 100

Approximation for time i = 100 based on past data assuming ergodicity 0.01

0.001

0.0001

0.00001 1000 800 600 400 200 0 200 400 600 800 1000 x

592integration method (in which the function evaluationFigure points7: Probability ar densitye D. taken functions Koutsoyiannis: at fi(x ) random)of the system A state random is x (soil more water) walk for on times water i = 0 accurate (for the same total number of evaluation points)and 100. than classical numerical integration, plotted. Knowing a priori, for theoretical reasons, that the 800 x i Low uncertainty High uncertainty probabilitybased on distribution, a grid representation after a long time, of the will integration be Gaussian space. 600 is very important and substantially simplifies the solution of problem. In But our are example, there theoretical the Monte reasons Carlo implying method Gaus- does not400 involve other calculations than those sian distribution? Jaynes (2003) lists a number of them. The 200 we did to construct the envelopes above. It is so very simple that it even bypasses the most widely known is the . In its most 0 common formulation,S–1 which involves sums of random vari- ables,calculation it seems inapplicableof (A). Results here as there for the are nodensity such sums.function fi(200x) of the system state x (soil water) for However,time i if= interpreted100, in comparison as a statement to about that the for properties time i of= 0, are400 shown in Figure 7. The Monte Carlo density functions under convolution (multiple integrals of a 600 exact trajectory numberintegration of density was functions performed tends assuming to the Gaussian f(x0 density)) to be uniformly it extending95% prediction 1% limits around the value x0 = 800 may give the required explanation. However, it is more con- 0 10 20 30 40 50 60 70 80 90 100 (100 mm , 0.30) and using 1000 simulations. The Monte Carlo method is very powerful, yet i venient to express our reasoning in terms of the principle of maximumso easy entropy that :we for may fixed meanfail to and notice variance, that the we distribu- are doinFiguregFig. numerical 8.8: StreamStream tube tuberepresentation integration. representation of the evoluti Technically, ofon of the soilevolution water x (with its ofinitial soil conditions water asx tion that maximizes entropy is the (or the in(with Figure initial 4 and 1% conditions uncertainty) in as a instochastic Fig. 4 set andting 1%using uncertainty) Monte Carlo prediction in a stochas- limits for truncatedapplication normal, to if construct the domain Figure of the variable7 would is anbe interval done eve95%ntic intuitively, settingprobability using of bracketing Montewithout the Carlotrue state knowing prediction between the stream limitsthat tube forthere “walls” 95% (1000 probability simulation in the real line; Papoulis, 1991). Entropy8 is a probabilistic runs).of bracketing the true state between the stream tube “walls” (1000 concept,is some which concrete for a continuous mathematical random variablebackgroundx is defined (equationsimulation (3)) behind runs). our simulations. However, 34

asbeing conscious of the theoretical background and the associated strengths and limitations, φ[x] := E[−lnf (x)] (5) Coming back to our example, we can use the same simula- results in most efficient use of the method, in acctioneleration method of – in scientific fact the same progress, simulation and runs in – to trace the E g x g x where [ ( )] denotes the expectation of any function ( ), probability densities for all times. The propagation of uncer- i.e.avoidance of confusions and potential errors. tainty in time is typically visualized through prediction inter- ∞ Z vals, whose limits define the stream tube “walls” according E[ g(x)] := It is observedg(x)f (x)d xin Figure 7 that moving from(6) timeto the i = probabilistic 0 to i = 100, approach. the density An example changes is shown in Fig. 8, from concentrated−∞ to broad and from uniform to Gausforsian; a certain the probability, theoretical say Gaussian 95%, of the curve true state is being con- Entropy is a typical measure of uncertainty, so its maximiza- tained in the stream tube. Now the stream tube size does not tionalso indicates plotted. that Knowing the uncertainty a priori becomes, for astheoretical high as possi- reasons,depend that the on theprobability number of dis simulationstribution, and after does a not diverge ble. Entropy could be used to quantify the notions of ran- to infinity (except if the probability chosen to visualize the domnesslong time, (high entropy) will be and Gaussian determinism is very (lowest important entropy). and stream substantially tube is 100%). simplifies Thus, the the probabilistic solution stream of tube vi- sualization is more meaningful, convenient and informative Givenproblem. that information But are andthere entropy theoretical are more reasons or less the implying same Gaussian distribution? Jaynes (2003) lists quantity, this quantification agrees with Kolmogorov’s and than the envelope representation of Fig. 6. It is observed that Page- Now appears as Chaitin’sChangea number viewto read of of them. mathematical(changes The highlighted) most randomness. widely known is the Centforral long Limit time Theorem. horizons the In streamits most tube common becomes less rough Column The principle of maximum entropy was introduced by and its size, i.e. the uncertainty, tends to stabilize to a maxi- 3L Jaynestionformulation, φ΄ (1957) and potential as which a method transpiration involves to infer τ ΄sumsp unknown are assumed of random probabilities constant variable in mums, it value.seems Thisinapplicable defines another here typeas there of equilibrium, are a sta- 3L from“year”), known the information.input is φ := φ However,΄Δt = 250 muchmm and earlier, the potential in statis- out- tistical thermodynamic equilibrium of maximum entropy. To no such sums. However, if interpreted as a statement distinguishabout the itproperties from the static of ordensity “dead” functions equilibrium of Sect. 2, 3L ticalput thermophysics,τp := τ΄p Δt = 1000 entropy mm. The maximization internal state had variables, constituted which we can call it the “alive” equilibrium. 5L theasunder theoreticalwell asconvolution with basis physics of the (multiple(cf.Second W. Heisenberg’s, Law integrals of thermodynamics 1927,of a uncertaintynumber, o f density functions tends to the Gaussian 6L where–∞ and the α optimization= 750 mm. Apparently, constraints the of dependence fixed mean andof the vari- size on ancedensity) correspond it may to thegive conservation the required of momentum explanation. and en-However, it is more convenient to express our 6R a good description of the tube size (independent of the number 4 The power of data ergy,ofreasoning model respectively runs) in as terms (assumingwell as of a profile the that principle the of the microscopic likelihood of maximum descrip-that entropy : for fixed mean and variance, the tion is in terms of momenta). In this case, it is used to de- 7L sity function at time i – 1, f(xi – 1), that of next time i, f(xi), is As our prediction horizon increases and we approach the termine macroscopic thermodynamic states of physical sys- 7L ***Deletedistribution space that after maximizes the f in end ofentropy the first is line the so normalthat f(x d)istribution “alive” equilibrium, (or the truncated we may find normal, it natural if to the raise this ques- tems. Some regard the application of the principle of maxi- i - 1 goes to the second line altogether tion: Do we really need the deterministic dynamics to make mumdomain entropy of in statistical the variable thermophysics is an intervalfundamentally in the dif- real line; Papoulis, 1991). Entropy * is a 7L 2 a long-term prediction? Intuitively, from Fig. 8, the answer ferent from its∂ application in inference, as used here. In f x)(FP = f d)( uu seems to bex negative. But, then, what can replace the dynam- myprobabilistic opinion, though,∂∂ concept,∫S − the1 A)( two which must for be thea continuous same. Observ- random variable is defined as vx ics? Recalling that the form of the density function may be ing that Nature spontaneously maximizes entropy, i.e. uncer- *** (x) in parenthesis in left hand side known a priori, as discussed above, what we need to com- tainty, why not use just the same principle in logic for infer- 8L tical thermophysics, entropy maximization hadφ[ constitutedx :] = E[− ln f (pletelyx)] express the density function are the two (5) parameters ence about natural systems? (For a profound discussion of 8L ance correspond to the conservation of momentum and energy, of the Gaussian curve, namely its mean and standard devi- the concept of entropy and its controversial views in thermo- respectively (assuming that the microscopic description is in ation. But these could be estimated from data. Hence, for physics, the interested reader is referred to Robertson, 1993). terms of momenta). In this case, it is used to determine macro- long horizons past data render knowledge of dynamics un- 8L_f 8* 8 EntropyEntropyEntropy << GreekGreekGreek εντροπίαεντροπία << entrepesthaientrepomaientrepesthai (εντρέποαι)(εντρέπεσθαι =) =to to turn necessary. into. For illustration, Fig. 9 shows a record of 100 past toturn turn into. into. values of xi corresponding to times i=−100 to −1. Here, 12 9L important theorem by G. D. Birkhoff (1931) says that for an er- 9L_f Hydrol.9 Ergodicity Earth < Syst. Greek Sci., εργοδικός 14, 585– <601 [ergon, 2010 (έργον) = work] + www.hydrol-earth-syst-sci.net/14/585/2010/ [odos (οδός) = path].

11_F11 Fig. 11: Graphical depiction of the application of the algorithm

to recover the system dynamics (with emphasis of its dimensionality) from a time series of 10 000 points; numbers in the legend box are trial embedding dimensions m. maximum entropy. To distinguish it from the static or “dead” equilibrium of Sect. 2, we can D. Koutsoyiannis: A random walk on water 593 callmaximum it the “alive” entropy. equilibrium. To distinguish it from the static or “dead” equilibrium of Sect. 2, we can becausecall it our the caricature “alive” systemequilibrium. is imaginary, the past data are 800 synthetic, generated by the same model, but in a real system x i 4 The power of data 600 with really unknown dynamics these would be past obser- maximum entropy. To distinguish it from the staticvations 4or “dead” of The the power systemequilibrium state. of data Toof generateSect. 2, thewe datacan here we as- 400 call it the “alive” equilibrium. sumeAs initialour prediction conditions: xhorizon−100=(73.99 increases mm, 0.904), and forwe which approach the200 “alive” equilibrium, we may find it the resulting state at time i=0 is x =(99.5034≈100 mm, As our prediction horizon increases0 and we approach the0 “alive” equilibrium, we may find it 0.3019natural≈0.30). to raise This this state question: is compatible Do (withinwe really precision need the deterministic dynamics to make a long 4 The power of data 1%) with the rounded off initial state x =(100, 0.30) that 200 termnatural prediction? to raise this Intuitively, question: from Do0 we Figure really 8, need the antheswer deterministic seems to bedynamics negative. to But,make then, a long what we used in earlier investigations. Interpreting past data as a 400 statisticalterm prediction? sample, weestimate Intuitively, a sample from mean Figureµ=− 2.528, the mm answer seems to be negative. But, then, what As our prediction horizon increases and we approachcan the replace “alive” theequilibrium, dynamics? we Recallingmay find it that the form of600 the density function may be known a andcan a sample replace standard the dynamics? deviation σ =209.13Recalling mm. that With the these form of the density function may be known a natural to raise this question: Do we really need valuesthe deterministic we can obtain dynamics the complete to make density a long function for time 800 priori , as discussed above, what we need to completely100 express 90 80the density 70 60 function 50 40 are 30 the 20 10 0 i=100,priori which, as isdiscussed plotted in above, Fig. 7 along what with we theneed results to completely ob- express the density function are the i term prediction? Intuitively, from Figure 8, the answertwo seems parameters to be negative. of the GaussianBut, then, what curve, namely its mean and standard deviation. But these tained by the Monte Carlo simulation, in which the deter- Fig. 9. A sample of past data of soil water x for times i=−100 to two parameters of the Gaussian curve, namelyFigure its 9: meA samplean and of past standard data of soil deviation.water xi for times But i = i–100 these to –1. can replace the dynamics? Recalling that the formministic couldof the dynamics density be estimated wasfunction explicitly from may taken be data. known into Hence, account. a for It can long horiz−1. ons past data render knowledge of be seen that the empirical result without considering the dy- 500 priori , as discussed above, what we need to completelycould ex bepress estimated the density from function data. are Hence, the for long e i horizonsBetter skill past of data render knowledgeBetter skill of naïve of 450 namicsdynamics is a good unnecessary. approximation. For illustration, Figure 9 shdoneowsdeterministic initially a record forecast in Fig. of 4; 100 (b) a past na¨ıve statistical values statistical forecast of prediction, xi ac- two parameters of the Gaussian curve, namely its Despitedynamics mean and being unnecessary.standard empirical, deviation. this For result illustration, But and, these more generally, Figure 9 sh400cordingows a to record which ofthe 100 future past equals values the average of xi of past data. thecorresponding use of past data into prediction,times i = can–100 find to a theoretical–1. Here, jus- becauseStochastics our caricature provides system the tool tois compareimaginary the two, the predictions, could be estimated from data. Hence, for long horizons past data render knowledge9 of 350 tificationcorresponding in the concept to times of ergodicity i = –100, an to important –1. Here, concept becausewhich our caricature is the standard system (root meanis imaginary square –, RMS) the error, inpast dynamical data systems are synthetic, and stochastics. generated By definition by the (e.g. same La- mode300 l, but in a real system with really dynamics unnecessary. For illustration, Figure 9 shpastows data a record are synthetic,of 100 past generated values of byxi the same model,r but in a real system with really sota and Mackey, 1994, p. 59), a transformation is ergodic 250 h 2i unknown dynamics these would be past observationse io:=f theE system(xi −x ˆi )state. To generate the data (9) corresponding to times i = –100 to –1. Here, becauseif allunknown our its invariantcaricature dynamics sets system are these trivial is imaginarywould (have zero be, probability).pastthe observations In of the system state. To generate the data x 200 otherhere words, we assume in an ergodic initial transformation conditions: starting –100 from= (73.99 any mm,where 0.904),xi denotes for which the random the resulting variable representing state at the state past data are synthetic, generated by the same model, but in a real system with really 150 here we assume initial conditions: x–100 = (73.99 mm, 0.904), for which the resulting state at point, a trajectory will visit all other points, without being and xˆi denotes the specific prediction for time i provided by time i = 0 is x = (99.5034 ≈ 100 mm, 0.3019 ≈ 0.30). This state is compatibleDeterministic (within forecast error unknown dynamics these would be past observationstrapped of the to asystem certain subset.state.0 To (In generate contrast, inthe non-ergodic data trans- 100either method (a) or (b). It is easily seen that for method time i = 0 is x0 = (99.5034 ≈ 100 mm, 0.3019 ≈ 0.30). This state is compatibleNaïve (within statistical forecast error formations there are invariant subsets, such that a trajectory x50(b), in which xˆi=µ (mean, estimatedA priori at deterministic−2.52 mm), forecast the error here we assume initial conditions: x–100 = (73.99 mm,precision 0.904), for 1%) which with the the resulting rounded state off at initial state 0 = (100, 0.30) that we used in earlier startingprecision from within 1%) awith subset the will rounded never depart off from initial it). state An x0standard = (100, error 0.30) equals that the we standard used deviation inAverage earlier deterministicσ (estimated forecast at error x 0 time i = 0 is 0 = (99.5034 ≈ 100 mm, 0.3019important ≈investigations. 0.30). theorem This state by Interpreting G. is D. compatible Birkhoff past (1931) (within data says as that a statist for anical 209.13sample,0 mm), 10 we and 20 estimate is 30 constant 40a sample for 50 all i . 60mean In method 70 = 80 (a), e 90i is 100 ergodicinvestigations. transformation InterpretingS and for anypast integrable data as functiona statistgical sample, we estimate a sample mean = i x different for different times i and can be evaluated again by precision 1%) with the rounded off initial statethe –0 following2.52 = (100, mm property 0.30) and a that holdssample we true: used standard in earlier deviation σ = 209.13 mm. With these values we can obtain –2.52 mm and a sample standard deviation σ =Figure 209.13 10:Monte Comparison mm. Carlo With of integration, thethese RMS values prediction now we of error Eq.can of (9). soilobtain water The x results for the are deterministic investigations. Interpreting past data as a statistical sample, we estimate∞ a sample mean = shown in Fig. 10. Clearly, in short lead times (<∼30) the de- the completen−1 density function for time i = 100,prediction which and theis naïveplotted statistical in prediction.Figure 7 along with the 1 X   Z ∼ limthe completeg Si(x )density= g( functionx)f (x)d xfor time i = 100,(7) whichterministic is plotted forecast in Figure is better, 7 butalong in long with lead the times (> 45) –2.52 mm and a sample standard deviation σ = 209.13n→∞ mm. With these values we can obtain resultsn = obtained by the Monte Carlo simulation, inthe which na¨ıve statistical the deterministic forecast is superior. dynamics was resultsi 0 obtained by−∞ the Monte Carlo simulation, in which the deterministic dynamics was the complete density function for time i = 100, which is plotted in Figure 7 along with the However, this is not a surprise. Actually, stochastics can explicitly taken into account. It can be seen that the empirical result without considering the withexplicitly the right-hand taken into side account. representing It can the be expectationseen that the giveempirical us an aresult priori without estimate considering of the deterministic the model er- results obtained by the Monte Carlo simulation,E in[g (whichx)]. For the example, deterministic for g( x dynamics)=x, setting wasx the initial ror, applicable near the “alive” equilibrium, where the uncer- dynamics is a good approximation. 0 explicitly taken into account. It can be seen that thesystemdynamics empirical state, observingisresult a good without thatapproximation. theconsidering sequence thex0, x1=S(x0), tainty has stabilized. Thus, from Eq. (9) we obtain x =S2(x ), ..., represents a trajectory of the system, and tak- 2 0 q r h i 35 dynamics is a good approximation. ing the equality Despite in the being limit empirical,as an approximation this result with finite and, moree := genEerally,(x −x ˆ )2 the = useE  of(x − pastµ)+ data(µ− x ˆ in) 2 (10) Despite being empirical, this result and, morei generally,i i the use of pasti data in i n) ( prediction,terms, we obtain can findthat the a timetheoretical average equalsjustification the true in the concept of ergodicity ,* an important Despite being empirical, this result and, (ensemble) moreprediction, gen averageerally, can E the [x find]: use a of theoretical past data justification in in thewhich concept after typical of ergodicity manipulations,* an results important in concept ∞in dynamical systems* and stochastics. By definitionq (e.g., Lasota and Mackey, 1994, prediction, can find a theoretical justification in thenconcept−1 concept in dynamical of ergodicity systems, an important and stochastics. By definition2 (e.g., Lasota2 and Mackey, 1994, 1 X Z ei = σ +(µ−x ˆi) (11) x ≈ xf (x)dx (8) concept in dynamical systems and stochastics. Byn dep.p.finition 59),59),i a (e.g.,transformation Lasota and isisMackey, ergodicergodic 1994, ifif allall its its inva invariantriant sets sets are are trivial trivial (have (have zero zero probability). probability). In In i=0 −∞ When it happens that xˆi=µ, then ei=σ , as in the statistical p. 59), a transformation is ergodic if all its invariantother othersets are words,words, trivial in(have an ergodiczero probability). transformationtransformation In startingstarting from fromprediction; any any point, otherwisepoint, a atrajectory trajectory the error inwill will the visit deterministic visit all all predic- Thus, ergodicity allows estimation of the system properties tion is obviously greater than σ. The a priori error estimates other words, in an ergodic transformation startingusing fromotherother past any points,data point, only. withouta The trajectory question being being thenwill arises:trappedvisit trapped all If the to to dynam- a a certain certainare su sualsobset.bset. plotted (In (In in contrast, Fig. contrast, 10 and in agree in nonergodic nonergodic well with those obtained ics is known, should a long-term prediction be better based other points, without being trapped to a certain subset. (In contrast, in nonergodic by Monte Carlo simulation. By treating xˆi also as a random ontransformationstransformations the data or on the dynamics?there areare Toinvariantinvariant explore thissubsets,subsets, question, such such let t hatthat avariable atrajectory trajectory we easily starting starting obtain from from that within the within average a subseta errorsubset of the deter- us compare two different types of predictions: (a) a typical √ transformations there are invariant subsets, such that a trajectory starting from within a subset ministic forecast over all times (above ∼45) will be ei=σ 2 deterministic prediction, based on applying the dynamics as (also plotted in Fig. 10), which shows that, on the average,

**9 ErgodicityErgodicityErgodicity < GreekGreek εργοδικός << [ergon[ergon (έργον) (έργον) = = wo work]work]rk] + + [odos [odosthe (οδός) (οδός) na¨ıve = statistical=path]. path]. prediction√ outperforms the deterministic * Ergodicity < Greek εργοδικός < [ergon (έργον) = work] + [odos (οδός) = path]. prediction by a factor of 2. 1414 14 www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010

800

x i 600

400

200

0

200

400

600

800 100 90 80 70 60 50 40 30 20 10 0 i 594 D. Koutsoyiannis: A random walk on water

Figure 9: A sample of past data of soil water xi for times i = –100 to –1. 500 Takens (1981) embedding theorems. The recovery is based e i Better skill of Better skill of naïve m 450 deterministic forecast statistical forecast on time-delay vectors xi := (xi, xi−1,. . . , xi−m+1) of a single 400 observable xi, with the required vector size m being no more

350 than 2d+1, where d is the system dimensionality, which can

300 also be estimated from the time series. Forming time-delay vectors xm with trial size m, we are able to calculate the mul- 250 i tidimensional entropy φm(ε)=E[−lnp], where ε is a scale 200 length (side length of hypercube) related to a grid covering 150 the m-dimensional space, on which the empirical probability 100 Deterministic forecast error m Naïve statistical forecast error p that a data point xi belongs to each hypercube is calcu- 50 A priori deterministic forecast error lated (notice the difference from definition Eq. (5), i.e. the Average deterministic forecast error 0 replacement of the probability density f with the probability 0 10 20 30 40 50 60 70 80 90 100 i mass p). The entropy φm(ε) is a decreasing function of ε and tends to infinity as ε tends to zero. The limit of φ (ε)/(−lnε) Figure 10: Comparison of the RMS prediction error of soil water x for the deterministic m Fig. 10. Comparison of the RMS prediction error of soil water x for as ε tends to zero, which (according to de l’Hopital’sˆ rule) is prediction and the naïve statistical prediction. the deterministic prediction and the na¨ıve statistical prediction. also equal to the limit of the slope dm(ε):=−1φm(ε)/1lnε, gives the dimension of the subspace of the m-dimensional m This example shows that for long horizons the use of de- space where the set of xi lies. For small ε, dm(ε) cannot ex- terministic dynamics gives misleading results and a danger- ceed m nor d. Application of a standard algorithm that imple- ous illusion of exactness. Unless a stochastic framework is ments this idea for increasing trial values of m (Grassberger used, neglecting deterministic dynamics in long-term predic- and Procaccia, 1983; Koutsoyiannis, 2006) is demonstrated 35 tion is preferable. In very complex systems, the same be- in Fig. 11, where it can be seen that dm does not exceed d=2, haviour could emerge also in the smallest prediction hori- thus capturing the system dimensionality, which is 2. Note, zons. This justifies, for example, the so-called ensemble however, that a large data set is required for the application forecasting in precipitation and flood prediction. In essence, of this technique. Our toy model can easily give us arbitrarily it does not differ from this stochastic framework discussed, long time series (here we used a time series of 10 000 points and is much more effective and reliable than a single deter- rather than 100 used before). But shortness of data or poor ministic forecast. attentiveness in the statistical properties of the data may re- In seeking a more informative prediction than the na¨ıve sult in erroneous conclusions (Koutsoyiannis, 2006). Other prediction, a natural question is: Is reduction of uncertainty stochastic tools that can recover deterministic controls from possible for long time horizons? In our simple example the data are discussed in the next section. answer is categorical: No way! For there is no margin for better knowledge of dynamics (we have assumed full knowl- edge already). And there is indifference of potentially im- 5 Exploration of the long term stochastic properties of proved knowledge of initial conditions. As mentioned above, the system reduction of initial uncertainty from 1% to 10−6 results in no Arguably, when we are interested in a prediction for a long reduction of final uncertainty at i=100. Therefore, a more time horizon, we wish to know not the exact system state at a informative prediction cannot be a prediction with reduced specified time but an average behaviour around that time and uncertainty. Rather, it must be a point prediction accompa- perhaps a measure of the dispersion of the extremes around it. nied by quantified uncertainty. This has already been done in This implies a different perspective, macroscopic rather than Fig. 7. detailed, of long-term prediction and predictability. In this In summary, for long time horizons, the stochastic infer- we can disregard the “instantaneous” system state, which in ence using (a) past data, (b) ergodicity, and (c) maximum en- atmospheric sciences is referred to as the weather, and try to tropy, provides an informative prediction. Knowledge of dy- predict the long-term average for a future period, commonly namics does not improve this prediction. For short time hori- referred to as the climate. According to a common defini- zons, the stochastic framework also incorporates the deter- tion, climate is “the long-term average of conditions in the ministic dynamics and uses it in a Monte Carlo framework. atmosphere, ocean, and ice sheets and sea ice described by Thus, the stochastic representation is an all-times solution, statistics, such as means and extremes (US Global Change good for both short and long horizons, and helps figure out Research Program, 2009). The usefulness of the notion of when the deterministic dynamics should be considered or ne- climate as a long-term average extends also to hydrological glected. It also highlights the usefulness and importance of processes. Actually, in studies of “climate change” and its data (which may have been missed today as indicated by the impacts, including those in hydrology and water resources, ongoing dismantling of hydrological monitoring networks). long-term predictions always refer to long-term average con- In theory, a good data set allows even the recovery of dy- ditions. namics, if it is unknown, employing the Whitney (1936) and

Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/ D. Koutsoyiannis: A random walk on water 595

12 8 1 2 3 4

) 1 2 3 4 ) ε ε ( ( 5 6 7 8 7

m 5 6 7 8 m 10 d φ 6 8 5

6 4

3 4 2 2 1

0 0 0.001 0.01 0.1 1 0.001 0.01 0.1 1 ε ε

Fig. 11. Graphical depiction of the application of the algorithm to recover the system dynamics (with emphasis of its dimensionality) from a time seriesFigure of 10 11: 000 points; Graphical numbers depiction in the legend ofbox theare trial application embedding dimensions of the m algorithm. to recover the system dynamics (with emphasis of its dimensionality) from a time series of 10 000 points. To study this macroscopic notion of prediction and pre- purpose. A plot of the first 1000 terms of the time series dictability, we need long simulations and data series that of yi is shown in Fig. 12 (lower panel) and reveals a different enable observation of long-term behaviours. In all follow- behaviour than xi (upper panel). Here the variability is high, ing illustrations we use time series with lengths of 10 000. not only at a short time scale but also at a long one. The long The first thousand terms of the time series of the soil water “excursions” of the moving average of 30 values (“the cli- xi, generated with the same initial conditions as before, are mate”) from the global average (of 10 000 values) are quite shown in Fig. 12 (upper panel). The plot shows high variabil- characteristic. ity at the shortest time scale (i.e. 1), with peculiar variation To explore the long-term stochastic properties of our sys- patterns not visible in the plots of fewer data points presented tem, including periodicity and time dependence or persis- earlier. Yet it shows relatively flat time average at scale 30 tence, we use three stochastic tools. The first is the pe- (“climate”). Despite high variability at scale 1, the trajectory riodogram, i.e. the square absolute value of the Discrete of the system state does not resemble a purely irregular or Fourier Transform of the time series. It is a real function random pattern. Rather the trajectory seems cyclical, but a q(ω), where ω is frequency. The quantity q(ω)dω is propor- more careful investigation reveals that the time series differs tional to the fraction of variance explained by ω and thus ex- from that of a typical periodic deterministic system. Specifi- cessive values of q(ω) indicate cyclicity with period 1/ω. The cally, as shown in Fig. 13 (upper panel) there is no constant periodogram of 10 000 terms xi is shown in Fig. 13 (lower periodicity, but the time δ between successive peaks of the panel). time series xi varies between 4 and 10 time steps, with a pe- The second is the empirical autocorrelation function (au- riod of 6 time steps being the most frequent. tocorrelogram), i.e. the Finite Fourier Transform of the pe- The relatively flat average of the soil water xi makes pre- riodogram (divided by variance). It is a sequence of values diction of the long-term average rather trivial in this carica- ρj , where j is a lag. It is alternatively defined and more ture system. However, other problems, in which variability easily determined as ρj =Cov[xi, xi−j ] / Var[xi], where plays a role (e.g. long term behaviour of extremes), are less Cov[xi, xi−j ]:=E[(xi−µ)(xi−j −µ)], Var[xi]:=Cov[xi, xi]= 2 trivial and more interesting, even in this very simple system. E[(xi −µ) ], and µ=E[ x]. The autocorrelograms of both xi To study the peculiar variability of xi, we introduce the ran- and yi are shown in Fig. 14. dom variable yi:=|xi−xi+6|, where the time lag 6 was chosen The third tool aims at a multi-scale stochastic representa- to be equal to the most frequent δ. We call y the variability i tion. Based on the process xi at scale 1, we define a process index and we study its long-term behaviour in comparison to (k) ≥ xi at any scale k 1 as: that of xi. It can be easily verified that yi represents the sam- ple standard deviation of the size-two sample xi and xi+6. ik Apparently, the standard deviation of a number of consecu- 1 X x(k) := x (12) tive xi (e.g. the seven terms xi to xi+6) would give a more i k l representative variability index, but we choose this simpler l=(i−1)k+1 definition to avoid artificial dependence between successive time series terms or else to avoid the need to change scale. A key multi-scale characteristic is the standard deviation σ (k) (k) (k) ≥ Besides, the simple definition serves well our exploration of xi . The quantity σ is a function of the scale k 1, here www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010 36

596 D. Koutsoyiannis: A random walk on water to avoid the need to change scale. Besides, the simple definition serves well our exploration 800 0.35

x i ν purpose.600 A plot of the first 1000 terms of the time series of yi is shown in Figure 12 (lower to avoid the need to change scale. Besides, the simple definition serves well our exploration 0.3 to400 avoid the need to change scale. Besides, the simple definition serves well our exploration panel) and reveals a different behaviour than xi (upper panel). 0.35Here the variability is high, not purpose. A plot of the first 1000 terms of the time series200 of yi is shown in Figure 12 (lower 0.25 ν purpose. A plot of the first 1000 terms of the time series of yi is0.3 shown in Figure 12 (lower only0 at a short time scale but also at a long one. The long “excursions” of the moving average panel) and reveals a different behaviour than xi (upper panel). Here the variability is high, not 0.2 0.25 200panel) and reveals a different behaviour than xi (upper panel). Here the variability is high, not of 30 values (“the climate”) from the global average (of0.2 10 000 values) are quite only at a short time scale but also at a long one. The long400 “excursions” of the moving average 0.15 600only at a short time scale but also at a long one. The long “excursions”0.15 of the moving average characteristic. 0.1 of 30 values (“the climate”) from the global average (of 10 000 values) are quite 0.1 800of 30 Annual values storage (“the climate”) from the global average (of 10 000 values) are quite Moving average of 30 values 0.05 1000 0.05 characteristic. characteristic.0 To 100 explore 200 300 the 400longterm 500 600 stochastic 700 800 properties 900 1000 of our system, including periodicity and i 0 0 1 2 3 4 5 6 7 8 9 10 111213 14 2000 1 2 3 4 5 6 7 8 9 10 111213 14 time dependenceVariability index or persistence, we use three stochastic tools. The first is the periodogram, δ To explore the longterm stochastic propertiesy i of our system, including periodicity and δ ToMoving explore average of the30 values longterm stochastic properties of our system, including periodicity and 1800 1/ ω i.e., the squareAverage of 10 absolute 000 values value of the Discrete Fourier Transform100 of30 20 the15 10 time8 7 series.6 5 4It is a real 3 2 time dependence or persistence, we use three stocha1600stic tools. The first is the periodogram, 100000000 1/ ω time dependence or persistence, we use three stochastic tools.100 30 The 20 15 first10 is8 7the6 periodogram, 5 Original4 periodogram 3 2 q e Filtered, moving average of 100 values 1400 100000000 c function q(ω), where ω is frequency. The quantity q(ω)d10000000ω is proportionaln to the fractionOriginal periodogram of i.e., the square absolute value of the Discrete Fourier Transformi.e., the square of the absolute time series. value Itof isthe a realDiscrete Fourier Transform of the timete series. It is a real 1200 q ise Filtered, moving average of 100 values rsc P 10000000 1000000 e n ers variance1000 explained by ω and thus excessive values of q(ω) indicatep cyclicity with period 1/ ωi.s t function q(ω), where ω is frequency. The quantity q(functionω)d ω is qproportional(ω), where ωto is the frequency. fraction ofThe quantity q(ω)d ω is proportionaltite to the fraction of en ns ce 800 A i 100000 rs Pe variance explained by ω and thus excessive values ofThe qvariance(ω periodogram) indicate explained cyclicity of by 10 ω 000with and terms periodthus excessivex i1/ isω shown. values in Figure of1000000 q(ω 13) indicate (lowere panel).cyclicity with period 1/ ω. rsi 600 ip ste 10000 t nce 400 n The periodogram of 10 000 terms xi is shown in Figure The 13 (lowerperiodogram The second panel). ofis 10the 000 empirical terms x i autocorrelationis shown in Figure functi100000 13 (loweron (autocorrelogram),A panel). i.e., the Finite 200 1000

0 The second is the empirical autocorrelation functiFourier on The(autocorrelogram), Transform second is of the the empirical i.e., periodogram. the Finiteautocorrelation It is a seque functi10000nceon (autocorrelogram), of100 values ρj, where i.e., j isthe a Finite lag. It is 0 100 200 300 400 500 600 700 800 900 1000 i 10 Fourier Transform of the periodogram. It is a sequeFigurealternatively nce 12:Fourier Evolution of values Transform of the soildefined ρ j water, where ofxi, thethe and variabilityj periodogram. ismore a index lag. easilyyi It, as is well It as determined is their a moving seque nce 1000 as of ρ valuesj = Cov[ ρj, x wherei, xi – j] is/ Var[ a lag.xi], It where is 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 averagesFig. 12.of lengthEvolution 30, for time of i up the to soil1000. water xi, the variability index yi, as 2 ω alternatively defined and more easily determinedwellCov[ alternatively as as theirxρij, x= movingi –Cov[ j] := definedaverages xEi, [( xii – –of j] and length/ )Var[ (x morei 30,x– ij], – for where easily time)], Var[i up determined toxi 1000.] := Cov[ asx i, ρxji ]= = Cov[E[( xxi i–, xi) – ],j] /and Var[ x i],= E where[ x]. The Figure100 13: Graphical depiction of periodic properties of the system based on (upper): a 2 2 Cov[ xi, xi – j] := E[( xi – ) (xi – j – )], Var[ xi] := Cov[autocorrelogramsxCov[i, xi]x =i, xEi [(– j]x i :=– E )[(of],x i both and– ) x(i x =andi – Ej [– yx i ].are)], The Var[shown xi] :=in FigureCov[ xhistogrami, Fig.14.xi] = 13. of E the[(Graphical timexi – between ) depiction], successive and peaks, of= periodicE δ[, ofx]. the The soil properties water xi, where of theν is the system fraction 10 referred to as the climacogram and typically depicted on ofbased occurrences10 on (upper): of a certain a δ histogram over the total of number the time of occurrences; between (lower): successive the periodogram peaks, autocorrelograms of both xi and yi are shown in Figure 14. 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35 0.4 0.45 0.5 autocorrelograms of both xi and yi are shown in Figurea double 14. logarithmic plot. While the periodogram and the ofδ the, of time the series soil of waterxi, wherex ω, whereis frequencyν is and the q is fraction the spectral of density; occurrences 10 000 terms of a of The third tool aims at a multiscale stochastic representation. Basedi on the process xi at ω autocorrelogram are related to each other through a Fourier timecertain series wereδ over used for the the total construction number of the of plots. occurrences; (lower): the peri- The third tool aims at a (multiscalek) stochastic representation. Based on the process xi at The third tool aims at a multiscale stochastictransform, rescalepresentation. 1, thewe climacogram define Based a process on is the related processxi to at the any x autocorrelogrami at scale kFigure ≥ 1 as: 13: Graphicalodogram of depiction the time of series periodic of x i properti, whereesω ofis the frequency system and basedq is on (upper): a (k) the spectral density; 10 000 terms of time series were used for the (k) by ascale simpler 1, we transformation, define a process i.e. xi at any scale k ≥histogram 1 as: of the time between successive peaks, δ, of the soil water xi, where ν is the fraction scale 1, we define a process xi at any scale k ≥ 1 as: 37 construction of the plots. ik k−1  of occurrences of a certain δ over the total number of occurrences; (lower): the periodogram (k) σ √ X j (k ) 1 σ =√ αk, αk = 1+2 1− ρj ↔ xi := of theik timex lseries of x , where ω is frequency and q is the (12) spectral density; 10 000 terms of ik k (k ) 1 ∑ i 38 (k ) 1 k j=1 x := k l=(i− )1 xk +1 a mathematical model to describe this (12) behaviour. This be- x := x (12)i time∑ seriesl were used for the construction of the plots. i ∑ l k l=(i− )1 k +1 k l=(i− )1 k +1 j+1 j−1 haviour has been known with several names including Hurst ρj = αj+1−jαj + αj−1 (13) A key2 multiscale characteristic2 is the standard deviationphenomenon σ(k) of, long-termx (k). The persistence quantity , σlong(k) is range a dependence, and Hurst-Kolmogorov(k) (k)i (HK) behaviour(k) or HK (stochastic) ItA is key directly ( multiscalek) verified(k) (actually characteristic this is the is(k) most the classicalstandard sta- deviation σ of xi . The quantity σ is a A key multiscale characteristic is the standard deviationfunction σ of of the xi scale. The k quantity≥ 1, here σ referred is a to as the climacogramdynamics.* At and the typically same time, depicted Eq. (15) on defines a a simple tistical law) that in a purely random process * function of* the scale k ≥ 1, here referred to as the climacogramstochastic modeland typically that reproduces depicted this on behaviour, a known as a function of the scale k ≥ 1, here referred to as the climacogramdouble logarithmic and typically plot. While depicted the onperiodogram a and the autocorrelogram are related to each (k) σ simple scaling stochastic model or fractional Gaussian noise σ double= √ logarithmic plot. While the periodogram(14) and the autocorrelogram are related to each 38 double logarithmic plot. While the periodogram and otherthe autocorrelogram throughk a Fourier are related transform, to each the climacogram is(due related to Mandelbrot to the autocorrelogram and van Ness, 1968), by or a the HK model. other through a Fourier transform, the climacogram is Climacograms related to the for autocorrelogram the time series x byi and a yi are shown in other through a Fourier transform, the climacogramHowever,simpler is relatedthis transformation, law to the may autocorrelogram not bei.e., verified in by natural a systems. Fig. 15, where the departure from the classical law (Eq. 14) Thesimpler simplest transformation, alternative is i.e., simpler transformation, i.e., of a purely random process is evident. σ A purely random process would have a flat periodogram, σ (k) = k −1 (15) 1−H (k ) σ k −1  j  j +1 j −1 k (k ) σ  j  butj Fig.+1 13 (lower panel)j −1 indicates a different behaviour for k −1 σ = αk , αk =1+ 2 1− ρ j ↔ ρ j = α j +1 − αj j + α j −1 (13) σ  j  j +1σ = αk j,− 1 αk =1+ 2 ∑1− ρ j ↔ ρ j = α j +1 − αj j + α j −1 (13) (k ) k ∑j =1 k xi. Furthermore,2 fixed periodicities2 would be manifested in σ = α , α =1+ 2 1− ρ ↔where ρ =H is aα constant− kαj (0+< H <1)α known (13) asj =1 the  Hurstk   co- 2 2 k k ∑ j j j +1 j j −1 the periodogram as high impulses in the specific periods, but k j =1  k  efficient, after2 Hurst (1951) who2 first analyzed statistically the long-term behaviour of geophysical time series. Earlier, in the periodogram of Fig. 13 no impulses exist. Rather, the figure indicates relatively higher densities q at a broad band Kolmogorov (1940), in studying turbulence, had proposed of periods 1/ω, between 5–12 time units (agreeing with the 10* *ClimacogramClimacogram Climacogram << < GreekGreek ΚλιακόγρααΚλιακόγραα < << [climax [climax (κλια (κλιαξ)ξ) = =scale] scale]simpler + [gramma+ [gramma representation (γράα) (γράα) = written]. in = thewritten]. upper panel of Fig. 13). The * Climacogram < Greek Κλιακόγραα < [climax (κλιαξ) == scale] scale] + + [gramma [gramma (γράα) = written]. most noticeable characteristics in the periodogram are the 19 19 19

Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/

1 ρ j Soil water, x 0.8 Variability index, y

0.6 Consistently positive autocorrelation 0.4 Persistence (not nonstationarity)

0.2

0

0.2 Alternating positivenegative autocorrelation 0.4 Antipersistence (not periodicity)

0.6

0.8 0 10 20 30 40 50 60 70 80 90 100 j

D. Koutsoyiannis: A random walk on water 597 Figure 14: Empirical autocorrelation functions of 10 000 terms xi and yi.

1 1000 ρ j Soil water, x 0.8 Variability index, y ( k ) 0.6 σ p

ers

Consistently positive autocorrelation iste 0.4 Persistence (not nonstationarity) Slop nc ↓ ↓ e = e 0.34 0.2 100 ≈3 time 0 s hig her 0.2 Sl ≈ ope ↓ 3 0. Alternating positivenegative autocorrelation t 5 0.4 Antipersistence (not periodicity) im e 0.6 a s 10 n lo t S w 0.8 ip lo er e pe 0 10 20 30 40 50 60 70 80 90 100 rs 0 j i .9 Soil water, x s 6 te Figure 14:Fig. Empirical 14. Empirical autocorrelation autocorrelation functions of functions10 000 terms of x 10i and 000 yi. terms xi and Variability index, y n ce yi. Random series 1000 1 1 1030 100 1000 ( k ) k σ p

increasing spectral densities for lowe frequenciesrs (1/ω >8)

iste (k) Slop nc Fig. 15. Climacograms (plots of standard deviation(k) σ vs. averag- ↓ ↓ e and the decreasing ones for high frequencies = (1/eω <8). The 0.34 Figure 15: Climacograms (plots of standard deviation σ vs. averaging scale k) of the time 100 ≈3 ing scale k) of the time series xi and yi (where standard deviations two different behaviours are indicatorstim ofe antipersistence and s hig series xi and yi (wherewere estimated standard by deviations the classical were statistical estimated estimator by the c oflassical time series statistical estimator persistence, respectively. her Slo averaged on each scale k). ≈ pe The autocorrelogram in Fig.↓ 143 depicts0.5 theof same time be-series averaged on each scale k). tim haviours in a different manner. It is observede that the auto- a s 10 n lo correlation for lag one is positive,t whichS isw expected because ip lo er The most characteristic and useful plot of all three is the e pe of physical consistence (states in neighbouringrs 0 times should x i .9 climacogram in Fig. 15. For the soil water series i and Soil water, x s 6 te be positively correlated because the changesn in small time for scales 1–3, the slope formed by the empirical points is Variability index, y c should be small)Random and indicatesseries short-terme persistence. For very low, reflecting short-term persistence. For large scales 1 higher lags the autocorrelation of xi oscillates between neg- (k >10) the empirical points are arranged in a straight line 39 1 1030 100k 1000 ative and positive values. The existence of negative autocor- with steep slope, −0.98. From Eq. (15) we can see that this (k) Figure relations15: Climacograms is an (plots indication of standard of antipersistence.deviation σ vs. averaging In the scale simplest k) of the time slope equals H−1, so that in this case H=0.02 asymptoti- series xcase,i and yi an(where antipersistent standard deviations process were estimated should by have the c alllassical its autocorre-statistical estimator cally (for large scales). Likewise, the plot for the variability of time lationsseries averaged negative: on each it scale can k be). verified that in an HK process de- index yi indicates a slope of −0.34 for large scales, which fined by Eq. (15) with H <0.5, the autocorrelation function corresponds to H=0.66. The slope for a purely random pro- is negative everywhere. But this cannot be the case in na- cess, also shown in figure, is −0.5, which corresponds to ture because short-term persistence demands that some auto- H=0.5. Generally, an H between 0.5 and 1 characterizes correlations should be positive. Therefore, here we broaden long-term persistence, whereas an H between 0 and 0.5 indi- this more common but too simplistic and unrealistic notion of 39 cates antipersistence. Thus, this figure verifies the antipersis-

antipersistence so as to include alternating negative and pos- tent and persistent behaviour already detected for xi and yi, itive autocorrelations without a constant period. A strictly respectively. periodic behaviour would also result in autocorrelation os- Most importantly, this figure provides insights on the pre- cillating between positive and negative values, which creates dictability issue. It is well known that for a one-step-ahead the risk of misinterpretation of antipersistence as periodicity. prediction at scale 1, a purely random process xi is the most However, the intervals between different peaks or troughs in unpredictable. Dependence enhances one-step-ahead pre- Fig. 14 have not constant length, so here we have antipersis- dictability. For example, in a process with ρ1=0.5 (compa- tence (sometimes called “quasi-periodic” behaviour). rable to that of our series xi and yi) the standard deviation,√ Figure 14 also shows the autocorrelation of the variability − 2 conditional on known present state, is a fraction (1 ρ1 ) index yi. In this case, the autocorrelation is always posi- of the unconditional one, i.e. 13% smaller. However, in the tive, indicating persistence both in short and long term. Pro- climatic-type predictions, in which the average behaviour cesses with consistently positive autocorrelation functions rather than exact values is studied, the situation is differ- lead to large and long “excursions” from the mean as shown ent. In our example, as clearly shown in Fig. 15, at the cli- in Fig. 12 (lower panel), which often tends to be interpreted matic scale of 30 time steps, predictability is deteriorated by as nonstationarity. The latter, however, would require that the a factor of 3 for the persistent process yi (thus eliminating system’s dynamics changes in time in a deterministic man- and largely exceeding the 13% reduction due to conditioning ner, which does not happen here (and in most of the cases). on the present state). On the other hand, for the antipersis- tence process xi, the long-term predictability is improved by

www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010 598 D. Koutsoyiannis: A random walk on water √ 2000 −12 Variability index, rounded initial conditions standard deviation σ, is multiplied by 1/ NA=1.3×10 y i 1800 Moving average of 30 values, rounded initial conditions Variability index, exact initial conditions when macroscopic properties are calculated. This tiny num- 1600 Moving average of 30 values, exact initial conditions ber explains why we do not need an accurate representation 1400 at the molecular level to have the macroscopic quantity cor- 1200 rect; in other words it justifies the emergence of determin- 1000 ism. However, even with that high averaging scale k=NA, if 800 there is substantial positive dependence between the differ-

600 ent elements, the uncertainty may still be high. For exam-

400 ple, application of the HK law (Eq. 15) with H=0.99 yields

200 that the ratio of macroscopic to microscopic uncertainty is −0.01 0 1/NA =0.58, i.e. more than 11 orders of magnitude higher 0 100 200 300 400 500 600 700 800 900 1000 than in classical statistics. Evidently, then, we cannot speak i of determinism in the latter case. As an additional exam- FigureFig. 16: 16. PlotPlot of 1000 of terms 1000 ofterms the time of series the yi time (variability series index)yi (variabilityat scales 1 and 30 index) and for ple, when k=30 and H=0.95 the ratio of macroscopic to mi- twoat sets scales of initial 1 and conditions 30 and deferring for two lesssets than of1%. initial conditions deferring less croscopic uncertainty is 1/30−0.05=0.84. Again, we cannot than 1%. speak of emergence of determinism. In contrast, in processes with antipersistence the emergence of determinism is much a factor of about 3. In summary (and perhaps contrary to more relevant (here, indeed, positive and negative tendencies what is believed), long-term persistence substantially deteri- cancel each other). orates predictability over long time scales – but antipersis- tence improves it. Figure 16 provides further demonstration of the unpre- 6 From the toy model to the real world dictability of persistent processes. The plot shows 1000 items of the time series yi (variability index) at the annual and the In comparison to our toy model, a natural system, such as climatic scale and for the two sets of initial conditions dis- the atmosphere, a river basin, etc.: (a) is much more com- cussed above, the exact and rounded off, which differ by plex; (b) has time-varying inputs and outputs; (c) has spatial less than 1%. After about 30 time steps (one time unit of extent, variability and dependence (in addition to temporal); “climate”), the departures in the two cases are pronounced. (d) has greater dimensionality (virtually infinite); (e) has dy- Thus, even a completely deterministic system is completely namics that to a large extent is unknown and difficult or im- unpredictable at a large (climatic) time scale, when there is possible to express deterministically; and (f) has parameters persistence. that are unknown. Hence uncertainty and unpredictability This may seem counterintuitive to many. The conven- are naturally even more pronounced in a natural system. The tional argument is that averaging at a large temporal or spa-40 role of stochastics is then even more crucial: (a) to infer dy- tial scale results in “elimination of local noise” or “cancel- namics (laws) from past data; (b) to formulate the system lation of positive and negative tendencies” and gives rise equations; (c) to estimate the involved parameters; and (d) to to the emergence of determinism from randomness. While test any hypothesis about the dynamics. Data offer the only the latter is an indisputable truth and is the basis of macro- solid ground for all these tasks, and failure to rely upon, and scopic determinism in thermodynamic systems, extreme cau- test against, evidence from data renders inferences about hy- tion is required not to generalize unjustifiably. Theoretical pothesized dynamics worthless. justification of the emergence of determinism from random- Despite the huge difference of the toy model and natu- ness relies on probability theory and on application of its ral reality, we may hope that the above discourse can help laws, already mentioned above. Practically, we can speak address several questions, from philosophical to technical. of emergence of determinism in a macroscopic quantity if Given the current dominant trend in hydrological (and other the uncertainty of an averaged quantity is so low that we geophysical) sciences for physically-based modelling, rele- can neglect it. Two things are most important to derive the vant questions are: What is physically-based modelling? Is uncertainty of the macroscopic quantity: the scale of aver- physics a synonym for determinism? Is physically-based aging k (number of microscopic elements) and the depen- synonymous to mechanistic? Are first principles mechanis- dence among the microscopic elements. An example from tic principles? Is not statistical physics part of physics? Is thermodynamics of ideal gases is enlightening: in a mole not entropy maximization a first principle? Is not stochas- of a gas the macroscopic quantities are averages of a num- tic modelling part of physical modelling? Will it ever be 23 ber of k = NA=6.022×10 molecules (NA is the Avogadro possible to achieve such a physically-based modelling of hy- number) and independence between different molecules can drological systems that will not depend on data or stochastic be assumed (e.g. Stowe, 2007). Simple calculations us- representations? Can detailed representations and reduction ing the classical statistical law (Eq. 14) yield that the un- to first principles render hydrologic measurements unneces- certainty associated to a single molecule, expressed by the sary? What level of detail is needed in such reductionist

Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/ D. Koutsoyiannis: A random walk on water 599 modelling for a catchment of, say, 1000 km2? How far can indication that climate is predictable in deterministic terms? the current research trend toward detailed models advance For the latter question, we can observe that the last numer- hydrology and water resources science and technology? ical example in Sect. 5 is relevant to climate (k=30 as typi- Hydrological uncertainty and its reduction is currently a cally assumed in climate studies and H =0.95, a value that is core research issue and also very important from a water re- not uncommon for annual temperature). The resulting very sources engineering and management perspective. Relevant little reduction of uncertainty as we move from the annual questions are: To what extent can hydrological uncertainty be to the 30-year scale indicates that the standard perception of reduced? Can uncertainty be eliminated by uncovering the climate as less variable and less uncertain, in comparison to system’s deterministic dynamics? Is uncertainty epistemic the annual fluctuation, may be flatly wrong. or inherent in nature? When there is potential for reduction A number of philosophical questions related to the ex- of uncertainty, what is the most effective means for reduc- posed view of uncertainty and its implications on society, tion? Is it better understanding, better deterministic mod- economy and life could be relevant to this study, but I re- elling, more detailed discretization, or better data? When the frain from presenting here. However, as scientific texts often limits of uncertainty reduction have been reached, what is the emphasize the negative aspects of uncertainty or generally appropriate scientific and engineering attitude? Is it confes- imply a negative meaning of it, it may be useful to refer to sion of failure and no action, or quantification of uncertainty it as a positive quality. As demonstrated here, uncertainty and risk through stochastics? Are current stochastic meth- is tightly related to change and, hence, to creativity, evolu- ods consistent with observed natural behaviours? Is there po- tion, and progress, in life and in science. As far as science tential to improve current stochastic methods in hydrology? is concerned, recognizing and studying uncertainty, wherever Can deterministic methods provide solid scientific grounds there is uncertainty, should be interpreted not as a pessimistic for water resources engineering and management? Are there conceding of defeat, but just as an important scientific task, physical upper and lower limits (deterministic envelopes) in consistent with the aim of science for the pursuit of truth. extreme hydrological phenomena, such as precipitation and flood, whose determination could constitute the basis of hy- drological design? Can there be risk-free hydraulic engineer- 7 Conclusions ing and water management? The following summarizing questions can represent the con- In the last decades, financial support for research in hy- clusions of this article: drology and water resources engineering and management has been strongly linked to research on climate, a practice – Can natural processes be divided in deterministic and that does not favour hydrology as an autonomous scientific random components? discipline. Therefore, questions related to climate research – Are probabilistic approaches unnecessary in systems and the relationship between hydrology and climate are quite with known deterministic dynamics? important: Are deterministic climate models a proper means to assessing future climate and are there alternatives to them? – Is stochastics a collection of mathematical tools, unable Is the current interface of climate and hydrology satisfactory? to give physical explanations? Should hydrology and water resources planning rely on cli- mate model outputs? Are climate models properly validated – Are deterministic systems deterministically predictable and is the meaning of model validation and prediction the in all time horizons? same in hydrology and in climate modelling? Is the evolu- – Do stochastic predictions disregard deterministic dy- tion of climate and its impacts on water resources determin- namics in all time horizons? istically predictable? With respect to the last questions, we can observe that climate modellers do not hesitate to offer ar- – Can uncertainty be eliminated (or radically reduced) by bitrarily long predictions, with time horizons from 2100 AD discovering a system’s deterministic dynamics? (Battisti and Naylor, 2009), to 3000 AD (Solomon et al., – Does positive autocorrelation (i.e. dependence) improve 2009), to 100 000 AD (Shaffer et al., 2009) – to mention a long term predictions? few of the most recent publications in the most reputable journals. Given the definition of climate (detailed above) as – Are deterministic predictions of climate possible? a long-term average state, the behaviours illustrated by the – Are the popular climate “predictions” or “projections” toy model are quite relevant. The high Hurst exponents es- trustworthy and able to support decisions on water man- timated in several instrumental and proxy climatic time se- agement, hydraulic engineering, or even “geoengineer- ries, especially for temperature (H >0.90; Cohn and Lins, ing” to control Earth’s climate? 2005; Koutsoyiannis and Montanari, 2007; Koutsoyiannis et al., 2009), support the view of most climate processes The most common answer to all these questions is “yes”. as persistent ones and, hence, far more unpredictable than a Hopefully, the above discourse explained why my answers purely random process. This raises the question: Is there any to all of them are “no”. www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010 600 D. Koutsoyiannis: A random walk on water

Acknowledgements. I heartily thank Tim Cohn and Harry Lins who Kolmogorov, A. N.: Grundbegrijfe der Wahrscheinlichkeitsrech- nominated me for the Henry Darcy Medal, and the EGU committee nung, Ergebnisse der Math. (2), Berlin, 1933; 2nd English Edi- members Gunter¨ Bloschl,¨ Alberto Montanari, Hubert Savenije, tion: Foundations of the Theory of Probability, 84 pp. Chelsea Lars Gottschalk and Andras´ Bardossy´ who honoured me with the Publishing Company, New York, 1956. Medal, also giving me the opportunity to organize the thoughts I Kolmogorov, A. N.: Wienersche Spiralen und einige andere inter- presented in the medal lecture and now in this paper. I also thank essante Kurven in Hilbertschen Raum, Dokl. Akad. Nauk URSS, all colleagues who, after hearing my lecture, viewing the slides of 26, 115–118, 1940. my presentation, or reading the Discussion version of this article Kolmogorov, A. N.: On tables of random numbers, Sankhya¨ Ser. A, in HESSD, honoured me with their thoughtful and encouraging 25, 369–375, 1963. comments and discussions in person, in blogs and by emails. In Kolmogorov, A. N.: Three approaches to the definition of quantity particular, I thank Steven Weijs, Willie Soon, Antonis Koussis of information, Probl. Inf. Transm., 1(1), 1–7, 1965. and Alberto Montanari for their online interactive commentaries Kolmogorov, A. N., and Uspenskii, V. A.: Algorithms and random- and reviews of the Discussion version of this article, which ness, Theor. Probab. Appl., 32, 389–412, 1987. helped me to clarify my opinions and theses in a series of replies Koutsoyiannis, D.: On the quest for chaotic attractors in hydrologi- as well as in the final version of the paper. I regard the online cal processes, Hydrolog. Sci. J., 51(6), 1065–1091, 2006. commentaries and replies essential elements appended to this paper. Koutsoyiannis, D. and Montanari, A.: Statistical analysis of hy- droclimatic time series: Uncertainty and insights, Water Resour. Edited by: H. H. G. Savenije Res., 43(5), W05429, doi:10.1029/2006WR005592, 2007. Koutsoyiannis, D., Montanari, A., Lins, H. F., and Cohn, T. A.: Climate, hydrology and freshwater: towards an interactive in- corporation of hydrological experience into climate research – References DISCUSSION of “The implications of projected climate change for freshwater resources and their management”, Hydrolog. Sci. Battisti, D. S. and Naylor, R. L.: Historical warnings of future food J., 54(2), 394–405, 2009. insecurity with unprecedented seasonal heat, Science, 323, 240– Laplace, P.-S.: Theorie´ Analytique des Probabilites´ / par M. le 244, 2009. Comte Laplace. 1812. Oeuvres completes` de Laplace VII. 1-2. Bernoulli, J.: Ars Conjectandi, Thurnisii fratres, Basel, 306+35 pp., Publiees´ sous les auspices de l’Academie´ des sciences, par MM. 1713. les secretaires´ perpetuels,´ Paris, Gauthier-Villars, 1878–1912. Birkhoff, G. D.: Proof of the ergodic theorem, Proc. Nat. Acad. Sci., Laskar, J.: A numerical experiment on the chaotic behaviour of the 17, 656–660, 1931. Solar System, Nature, 338, 237–238, 1989. Chaitin, G. J.: Randomness and mathematical proof, Sci. Am., Laskar, J.: The limits of Earth orbital calculations for geological 232(5), 47–52, 1975. time-scale use, Philos. Tr. R. Soc. Lond. A, 357, 1735–1759, Chaitin, G. J.: How real are real numbers?, http://arxiv.org/abs/ 1999. math.HO/0411418, last access: March 2010, 2004. Lasota, A. and Mackey, M. C.: Chaos, Fractals and Noise, Springer- Cohn, T. A. and Lins, H. F.: Nature’s style: Naturally trendy, Geo- Verlag, New York, 1994. phys. Res. Lett., 32, L23402, doi:10.1029/2005GL024476, 2005. Lissauer, J. L.: Chaotic motion in the Solar System, Rev. Mod. Duncan, M. J.: Orbital stability and the structure of the solar sys- Phys., 71(3), 835–845, 1999. tem, in 1994, in: Circumstellar Dust Disks and Planet Forma- Mandelbrot, B. B. and van Ness J. W.: Fractional Brownian Motion, tion, Proceedings of the 10th IAP Astrophysics Meeting, Institut fractional noises and applications, SIAM Review, 10, 422–437, d’Astrophysique de Paris, edited by: Ferlet, R. and Vidal-Madjar, 1968. A., Editions Frontiers, Gif sur Yvette, France, 245–256, 1994. Maxwell, J. C.: A letter to Lewis Campbell, 1850; quoted in: Camp- Feynman, R.: The Character of Physical Law, MIT Press, Cam- bell, L. and W. Garnett, The Life of James Clerk, Maxwell, Lon- bridge, MA, 1965. don, 1882, Digitally reproduced by Sonnet Software, Inc., p. 80, Grassberger, P. and Procaccia, I.: Characterization of strange attrac- 1997. tors, Phys. Rev. Lett., 50(5), 346–349, 1983. Metropolis, N. and Ulam, S.: The Monte Carlo method, J. Am. Stat. Heisenberg, W.: Uber¨ den anschaulichen Inhalt der quanten- Assoc., 44(247), 335–341, 1949. theoretischen Kinematik und Mechanik, Z. Phys., 43, 172– Niederreiter, H.: Random Number Generation and Quasi-Monte 198, doi:10.1007/BF01397280, 1927, English translation by: Carlo Methods, Society for Industrial and Applied Mathematics, Wheeler, J. A. and Zurek, H., Quantum Theory and Measure- Philadelphia, 1992. ment, Princeton Univ. Press, 62–84, 1983. Papoulis, A.: Probability, Random Variables, and Stochastic Pro- Hurst, H. E.: Long term storage capacities of reservoirs, Trans. Am. cesses, 3rd ed., McGraw-Hill, New York, 1991. Soc. Civil Engrs., 116, 776–808, 1951. Poincare,´ H.: Science et Methode,´ 1908; reproduced in Chance, The Jaynes, E. T.: Information theory and statistical mechanics. Phys. World of Mathematics, Simon & Schuster, New York, p. 1382, Rev., 106(4), 620–630, 1957. 1956. Jaynes, E. T.: Probability Theory: The Logic of Science, Cam- Popper, K.: Quantum Theory and the Schism in Physics, Routledge, bridge Univ. Press, 728 pp., 2003. London, 230 pp., 1992.

Hydrol. Earth Syst. Sci., 14, 585–601, 2010 www.hydrol-earth-syst-sci.net/14/585/2010/ D. Koutsoyiannis: A random walk on water 601

Robert, C. P.: The Bayesian Choice, From Decision-Theoretic Stowe, K.: Thermodynamics and Statistical Mechanics, 2nd ed., Foundations to Computational Implementation, Springer, New Cambridge Univ. Press, Cambridge, UK, 2007. York, 2 edn., 602 pp., 2007. Takens, F.: Detecting strange attractors in turbulence, in: Dynami- Robertson, H. S.: Statistical Thermophysics, Prentice Hall, Engle- cal Systems and Turbulence, edited by: Rand, D. A. and Young, wood Cliffs, NJ, USA, 1993. L.-S., Lecture Notes in Mathematics no. 898, Springer-Verlag, Shaffer, G., Olsen, S. M., and Pedersen, J. O. P.: Long-term ocean New York, USA, 336–381, 1981. oxygen depletion in response to carbon dioxide emissions from US Global Change Research Program: Climate Literacy: The fossil fuels, Nat. Geosci., 2, 105–109 doi:10.1038/NGEO420, Essential Principles of Climate Sciences, Second Version, 2009. March 2009, www.climatescience.gov/Library/Literacy/default. Shiryaev, A. N.: Kolmogorov, Life and creative activities, The An- php, last access: March 2010, 2009. nals of Probability, 17(3), 866–944, 1989. Whitney, H.: Differentiable manifolds, Ann. Math., 37, 645–680, Solomon, S., Plattner, G.-K., Knutti, R., and Friedlingstein, P.: Ir- 1936. reversible climate change due to carbon dioxide emissions, Proc. Natl. Acad. Sci., 106(6), 1704–1709, 2009.

www.hydrol-earth-syst-sci.net/14/585/2010/ Hydrol. Earth Syst. Sci., 14, 585–601, 2010