
arXiv:1701.06734v9 [cs.IT] 24 May 2017

Remote Estimation of the Wiener Process over a Channel with Random Delay

Yin Sun, Yury Polyanskiy, and Elif Uysal-Biyikoglu
Dept. of ECE, the Ohio State University, Columbus, OH
Dept. of EECS, Massachusetts Institute of Technology, Cambridge, MA
Dept. of EEE, Middle East Technical University, Ankara, Turkey

May 22, 2017

Abstract—In this paper, we consider a problem of sampling a Wiener process, with samples forwarded to a remote estimator via a channel that is modeled as a work-conserving FIFO queue with random delay. The estimator reconstructs a real-time estimate of the signal from the causally received samples. Motivated by recent research on age-of-information, we study the optimal sampling strategy that minimizes the mean square estimation error subject to a sampling frequency constraint. We prove that the optimal sampling strategy is a threshold strategy, and find the optimal threshold. This threshold is determined by the sampling frequency constraint and by how much the Wiener process varies during the channel delay. An interesting consequence is that, even in the absence of the sampling frequency constraint, the optimal strategy is not zero-wait sampling in which a new sample is taken once the previous sample is delivered; rather, it is optimal to wait for a non-zero amount of time after the previous sample is delivered, and then take the next sample. Further, if the sampling times are independent of the observed Wiener process, the optimal sampling problem reduces to an age-of-information optimization problem that has been solved recently. Our comparisons show that the estimation error of the optimal sampling policy is much smaller than those of age-optimal sampling, zero-wait sampling, and classic uniform sampling.

I. INTRODUCTION

Consider a system with two terminals (see Fig. 1): An observer measuring a Wiener process Wt, and an estimator, whose goal is to provide the best guess Ŵt for the current value of Wt. These two terminals are connected by a channel that transmits time-stamped samples of the form (Si, W_{Si}). The observer can choose the sampling times Si, where 0 ≤ S1 ≤ S2 ≤ ..., causally, subject to an average sampling frequency constraint

liminf_{n→∞} (1/n) E[Sn] ≥ 1/fmax,

where fmax is the maximum allowed sampling frequency. The channel is modeled as a work-conserving¹ FIFO queue with i.i.d. random transmission times (i.e., channel delays) Yi ≥ 0, where Yi is the channel delay of sample i. Unless it arrives at an empty system, sample i needs to wait in the queue until its transmission starts. Let Gi be the transmission starting time of sample i, which satisfies Gi ≥ Si; the delivery time of sample i is Di = Gi + Yi. At any time t, there may exist generated samples that are not yet delivered to the estimator; the estimator forms its estimate using the causally received samples, as illustrated in Fig. 2.

Fig. 1: System model (sampler, channel, estimator; the estimator acknowledges each delivered sample with an ACK).

By minimum mean square error (MMSE) estimation, at time t the estimator forms

Ŵt = E[ Wt | (Sj, W_{Sj}, Dj) : Dj ≤ t ] = W_{Si}, if t ∈ [Di, Di+1).  (1)

The initial value W0 = 0 is known by the estimator for free. We measure the quality of remote estimation via the MMSE

limsup_{T→∞} (1/T) E[ ∫_0^T (Wt − Ŵt)² dt ].

In this paper, we study the optimal sampling strategy that achieves the fundamental tradeoff between the MMSE and the sampling frequency fmax.

¹ By "work-conserving," we mean that the channel is kept busy whenever there is a sample waiting in the queue.

[Funding note] The research was supported in part by the Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-09-39370, by ONR grant N00014-17-1-2417, and by TUBITAK.


The contributions of this paper are summarized as follows:

• The optimal sampling problem for minimizing the MMSE subject to the sampling frequency constraint is solved exactly. We prove that the optimal sampling strategy is a threshold policy, and find the optimal threshold.

• An unexpected consequence of our study is that even in the absence of the sampling frequency constraint (i.e., fmax = ∞), the optimal strategy is not zero-wait sampling, in which a new sample is generated once the previous sample is delivered; rather, it is optimal to wait a positive amount of time after the previous sample is delivered, and then take the next sample.

• If the sampling times are independent of the observed Wiener process, the optimal sampling problem reduces to an age-of-information optimization problem solved in [11], [12]. The asymptotics of the MMSE-optimal and age-optimal sampling policies at low/high channel delay or low/high sampling frequencies are studied.

• Our theoretical and numerical comparisons show that the MMSE of the optimal sampling policy is much smaller than those of age-optimal sampling, zero-wait sampling, and classic uniform sampling.

Our threshold policy has an important difference from the threshold policies studied in, e.g., [1]–[10]: In our model, we have proven that it is suboptimal to take a new sample when the channel is busy, because such a sample would wait in the queue for its transmission opportunity and unnecessarily become stale. Consequently, the threshold-based control should be disabled whenever there is a packet in transmission.

II. RELATED WORK

On the one hand, the results in this paper are closely related to the recent age-of-information studies, e.g., [11]–[20], where the focus was on queueing and channel delay, without a signal model. The discovery that the zero-wait policy is not always optimal for minimizing the age-of-information can be found in [11]–[13]. The sub-optimality of a work-conserving scheduling policy was also observed in [19], which considered scheduling updates to different users with unreliable channels. One important observation in our study is that the behavior of the optimal update policy changes dramatically after adding a signal model.

On the other hand, the paper can be considered as a contribution to the rich literature on remote estimation, e.g., [1]–[10], [21], by adding a queueing model. Optimal transmission scheduling of sensor measurements for estimating a stochastic process was recently studied in [9], [10], where the samples are transmitted over a channel with additive noise. In the absence of channel delay and queueing (i.e., Yi = 0), the problems of sampling the Wiener process and Gauss-Markov sources were addressed in [1], [7], [8], where the optimality of threshold policies was established. To the best of our knowledge, [7] is the closest study to this paper. Because there is no queueing and channel delay in [7], the problem analyzed therein is a special case of ours.

Fig. 2: Sampling and remote estimation of the Wiener process. (a) The Wiener process Wt and its samples. (b) The estimate process Ŵt using causally received samples.

III. MAIN RESULT

Let π = (S0, S1, ...) represent a sampling policy, and let Π be the set of causal sampling policies that satisfy the following conditions: (i) The information that is available for determining the sampling time Si includes the history of the Wiener process (Wt : t ∈ [0, Si]), the history of channel states (It : t ∈ [0, Si]), and the sampling times of previous samples (S0, ..., Si−1), where It ∈ {0, 1} is the idle/busy state of the channel at time t. (ii) The inter-sampling times {Ti = Si+1 − Si, i = 0, 1, ...} form a regenerative process [22, Section 6.1]: there exist integers 0 ≤ k1 < k2 < ... such that the post-kj process {T_{kj+i}, i = 0, 1, ...} has the same distribution as the post-k1 process {T_{k1+i}, i = 0, 1, ...} and is independent of the pre-kj process {Ti, i = 0, 1, ..., kj − 1}; in addition, E[S_{k1}] < ∞ and 0 < E[(S_{kj+1} − S_{kj})²] < ∞ for j = 1, 2, ...

[Footnote 2: Really, we assume that {Ti} is a regenerative process because we analyze the time-average MMSE in (6), but operationally a nicer definition is limsup_{n→∞} E[∫_0^{Dn} (Wt − Ŵt)² dt] / E[Dn]. These two definitions are equivalent when {Ti} is a regenerative process.]

We assume that the Wiener process Wt and the channel delays Yi are determined by two external processes, which are mutually independent and do not change according to the sampling policy π ∈ Π. We also assume that the Yi's are i.i.d. with E[Yi²] < ∞.

A sampling policy π ∈ Π is said to be signal-independent (signal-dependent) if π is (not) independent of the Wiener process {Wt, t ≥ 0}. Example policies in Π include:

1. Uniform sampling [23], [24]: The inter-sampling times are constant, such that for some β ≥ 0,

Si+1 = Si + β.  (2)

2. Zero-wait sampling [11]–[14]: A new sample is generated once the previous sample is delivered, i.e.,

Si+1 = Si + Yi.  (3)

3. Threshold policy in signal variation: The sampling times are given by

Si+1 = inf{ t ≥ Si + Yi : |Wt − W_{Si}| ≥ √β },  (4)

which is illustrated in Fig. 3.

Fig. 3: Illustration of the threshold policy (4). (a) If |W_{Si+Yi} − W_{Si}| ≥ √β, sample i+1 is taken at time Si+1 = Si + Yi. (b) If |W_{Si+Yi} − W_{Si}| < √β, sample i+1 is taken at the time t that satisfies t ≥ Si + Yi and |Wt − W_{Si}| = √β.

If |W_{Si+Yi} − W_{Si}| ≥ √β, sample i+1 is generated at the time Si+1 = Si + Yi when

sample i is delivered; otherwise, if |W_{Si+Yi} − W_{Si}| < √β, sample i+1 is generated at the earliest time t such that t ≥ Si + Yi and |Wt − W_{Si}| reaches the threshold √β. It is worthwhile to emphasize that even if there exists a time t ∈ [Si, Si + Yi) such that |Wt − W_{Si}| ≥ √β, no sample is taken at such a time t, as depicted in Fig. 3. In other words, the threshold-based control is disabled during [Si, Si + Yi) and is reactivated at time Si + Yi. This is a key difference from previous studies on threshold policies [1]–[10].

4. Threshold policy in time variation [11]–[13]: The sampling times are given by

Si+1 = inf{ t ≥ Si + Yi : t − Si ≥ β }.  (5)

The optimal sampling problem for minimizing the MMSE subject to a sampling frequency constraint is formulated as

min_{π∈Π} limsup_{T→∞} (1/T) E[ ∫_0^T (Wt − Ŵt)² dt ]  (6)
s.t. liminf_{n→∞} (1/n) E[Sn] ≥ 1/fmax.  (7)

Problem (6) is a constrained continuous-time Markov decision problem with a continuous state space. Somewhat to our surprise, we were able to exactly solve (6):

Theorem 1. There exists β ≥ 0 such that the sampling policy (4) is optimal to (6), and the optimal β is determined by solving

E[max(β, W_Y²)] = max{ 1/fmax, E[max(β², W_Y⁴)] / (2β) },  (8)

where Y is a random variable with the same distribution as the Yi. The optimal value of (6) is then given by

mmse_opt ≜ E[max(β², W_Y⁴)] / (6 E[max(β, W_Y²)]) + E[Y].  (9)

Proof. See Section IV.

[Footnote 3: If β → 0, the last terms in (8) and (13) are determined by L'Hopital's rule.]

The optimal policy in (4) and (8) is called the "MMSE-optimal" policy. Note that one can use the bisection method or another one-dimensional search method to solve (8) with quite low complexity. Interestingly, this optimal policy does not suffer from the "curse of dimensionality" issue encountered in many Markov decision problems.

Notice that the feasible policies in Π can use the complete history of the Wiener process (Wt : t ∈ [0, Si+1]) to determine Si+1. However, the MMSE-optimal policy in (4) and (8) only requires recent knowledge of the Wiener process (Wt − W_{Si} : t ∈ [Si + Yi, Si+1]) to determine Si+1. Moreover, according to (8), the threshold √β is determined by the maximum sampling frequency fmax and the distribution of the signal variation W_Y during the channel delay Y. It is worth noting that W_Y is a random variable that tightly couples the source process and the channel delay. This is different from the traditional wisdom of information theory where source coding and channel coding can be treated separately.

A. Signal-Independent Sampling and the Age-of-Information

Let Π_sig-independent ⊂ Π denote the set of signal-independent sampling policies, defined as

Π_sig-independent = { π ∈ Π : π is independent of {Wt, t ≥ 0} }.

For each π ∈ Π_sig-independent, the MMSE (6) can be written as (see Appendix A for its proof)

limsup_{T→∞} (1/T) E[ ∫_0^T Δ(t) dt ],  (10)

where

Δ(t) = t − Si, t ∈ [Di, Di+1), i = 0, 1, 2, ...,  (11)

is the age-of-information [14], that is, the time difference between the generation time of the freshest received sample and the current time t. If the policy space in (6) is restricted from Π to Π_sig-independent, (6) reduces to the following age-of-information optimization problem [11], [12]:

min_{π∈Π_sig-independent} limsup_{T→∞} (1/T) E[ ∫_0^T Δ(t) dt ]  (12)
s.t. liminf_{n→∞} (1/n) E[Sn] ≥ 1/fmax.

Problem (6) is significantly more challenging than (12), because in (6) the sampler needs to make decisions based on the evolution of the signal process Wt, which is not required in (12). More powerful techniques than those in [11], [12] are developed in Section IV to solve (6).

Theorem 2. [11], [12] There exists β ≥ 0 such that the sampling policy (5) is optimal to (12), and the optimal β is determined by solving

E[max(β, Y)] = max{ 1/fmax, E[max(β², Y²)] / (2β) }.  (13)

The optimal value of (12) is then given by

mmse_age-opt ≜ E[max(β², Y²)] / (2 E[max(β, Y)]) + E[Y].  (14)

The sampling policy in (5) and (13) is referred to as the "age-optimal" policy. Because Π_sig-independent ⊂ Π,

mmse_opt ≤ mmse_age-opt.  (15)

In the following, the asymptotics of the MMSE-optimal and age-optimal sampling policies at low/high channel delay or low/high sampling frequencies are studied.

B. Low Channel Delay or Low Sampling Frequency

Let

Yi = d Xi  (16)

represent the scaling of the channel delay Yi with d ≥ 0, where the Xi's are i.i.d. positive random variables. If d → 0 or fmax → 0, we can obtain from (8) that (see Appendix B

for its proof)

β = 1/fmax + o(1/fmax),  (17)

where f(x) = o(g(x)) as x → a means that lim_{x→a} f(x)/g(x) = 0. Hence, the MMSE-optimal policy in (4) and (8) becomes

Si+1 = inf{ t ≥ Si : |Wt − W_{Si}| ≥ √(1/fmax) },  (18)

and as shown in Appendix B, the optimal value of (6) becomes

mmse_opt = 1/(6 fmax) + o(1/fmax).  (19)

The sampling policy (18) was also obtained in [7] for the case that Yi = 0 for all i. If d → 0 or fmax → 0, one can show that the age-optimal policy in (5) and (13) becomes uniform sampling (2) with β = 1/fmax + o(1/fmax), and the optimal value of (12) is mmse_age-opt = 1/(2 fmax) + o(1/fmax). Therefore,

lim_{d→0} mmse_opt / mmse_age-opt = lim_{fmax→0} mmse_opt / mmse_age-opt = 1/3.  (20)

C. High Channel Delay or Unbounded Sampling Frequency

If d → ∞ or fmax → ∞, as shown in Appendix C, the MMSE-optimal policy for solving (6) is given by (4) where β is determined by solving

2β E[max(β, W_Y²)] = E[max(β², W_Y⁴)].  (21)

Similarly, if d → ∞ or fmax → ∞, the age-optimal policy for solving (12) is given by (5) where β is determined by solving

2β E[max(β, Y)] = E[max(β², Y²)].  (22)

In these limits, the ratio between mmse_opt and mmse_age-opt depends on the distribution of Y.

When the sampling frequency is unbounded, i.e., fmax = ∞, one logically reasonable policy is the zero-wait policy in (3) [11]–[14]. This zero-wait policy achieves the maximum throughput and the minimum queueing delay of the channel. Surprisingly, this zero-wait policy does not always minimize the age-of-information in (12) and almost never minimizes the MMSE in (6), as stated below:

Theorem 3. If fmax = ∞, the zero-wait policy is optimal for solving (6) if and only if Y = 0 with probability one.

Proof. See Appendix D.

Theorem 4. [12] If fmax = ∞, the zero-wait policy is optimal for solving (12) if and only if

E[Y²] ≤ 2 (ess inf Y) E[Y],  (23)

where ess inf Y = sup{ y ∈ [0, ∞) : Pr[Y < y] = 0 }.

IV. PROOF OF THE MAIN RESULT

We establish Theorem 1 in four steps: First, we employ the strong Markov property of the Wiener process to simplify the optimal sampling problem (6). Second, we study the Lagrangian dual problem of the simplified problem, and decompose the Lagrangian dual problem into a series of mutually independent per-sample control problems. Each of these per-sample control problems is a continuous-time Markov decision problem. Third, we utilize optimal stopping theory [25] to solve the per-sample control problems. Finally, we show that the Lagrangian duality gap is zero. By this, the original problem (6) is solved. The details are as follows.

A. Problem Simplification

We first provide a lemma that is crucial for simplifying (6).

Lemma 1. In the optimal sampling problem (6) for minimizing the MMSE of the Wiener process, it is suboptimal to take a new sample before the previous sample is delivered.

Proof. See Appendix E.

In recent studies on age-of-information [11], [12], Lemma 1 was intuitive and hence was used without a proof: If a sample is taken when the channel is busy, it needs to wait in the queue until its transmission starts, and becomes stale while waiting. A better method is to wait until the channel becomes idle, and then generate a new sample, as stated in Lemma 1. However, this lemma is not intuitive in the MMSE minimization problem (6): The proof of Lemma 1 relies on the strong Markov property of the Wiener process, which may not hold for other signal processes.

By Lemma 1, we only need to consider a sub-class of sampling policies Π1 ⊂ Π such that each sample is generated and submitted to the channel after the previous sample is delivered, i.e.,

Π1 = { π ∈ Π : Si = Gi ≥ Di−1 for all i }.  (24)

This completely eliminates the waiting time wasted in the queue, and hence the queue is always kept empty. The information that is available for determining Si includes the history of signal values (Wt : t ∈ [0, Si]) and the channel delays (Y1, ..., Yi−1) of previous samples. To characterize this statement precisely, let us define the σ-fields F_t = σ(Ws : s ∈ [0, t]) and F_t⁺ = ∩_{s>t} F_s. Then, {F_t⁺, t ≥ 0} is the filtration (i.e., a non-decreasing and right-continuous family of σ-fields) of the Wiener process Wt. Given the transmission durations (Y1, ..., Yi−1) of previous samples, Si is a stopping time with respect to the filtration {F_t⁺, t ≥ 0} of the Wiener process Wt, that is,

[ {Si ≤ t} | Y1, ..., Yi−1 ] ∈ F_t⁺ for all t ≥ 0.  (25)

[Footnote: Note that the condition in Theorem 4 is weaker than that of Theorem 5 in [12].]
[Footnote: Note that the generation times (S1, ..., Si−1) of previous samples are also included in this information.]
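Theorem 4's condition E[Y²] ≤ 2 (ess inf Y) E[Y] is easy to check numerically for a given delay distribution. A small sketch of our own (the essential infimum is supplied analytically, since it is awkward to estimate from samples; the two delay laws are illustrative choices):

```python
import random

def zero_wait_is_age_optimal(samples, ess_inf):
    """Check E[Y^2] <= 2 * essinf(Y) * E[Y] (Theorem 4) from Monte Carlo
    samples of the channel delay Y."""
    m1 = sum(samples) / len(samples)
    m2 = sum(y * y for y in samples) / len(samples)
    return m2 <= 2.0 * ess_inf * m1

rng = random.Random(0)
const = [1.0] * 10_000                                  # deterministic delay Y = 1
expo = [rng.expovariate(1.0) for _ in range(10_000)]    # exponential delay, essinf = 0

print(zero_wait_is_age_optimal(const, ess_inf=1.0))  # True: zero-wait is age-optimal
print(zero_wait_is_age_optimal(expo, ess_inf=0.0))   # False: waiting helps
```

Any delay law with ess inf Y = 0 (exponential, log-normal, ...) fails the condition, so zero-wait sampling is then suboptimal even for the age objective.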

Then, the policy space Π1 can be alternatively expressed as

Π1 = { Si : [ {Si ≤ t} | Y1, ..., Yi−1 ] ∈ F_t⁺, Si = Gi ≥ Di−1 for all i, Ti = Si+1 − Si is a regenerative process }.  (26)

Let Zi = Si+1 − Di ≥ 0 represent the waiting time between the delivery time Di of sample i and the generation time Si+1 of sample i+1. Then, Si = Z0 + Σ_{j=1}^{i−1} (Yj + Zj) and Di = Σ_{j=0}^{i−1} (Zj + Yj+1). If (Y1, Y2, ...) is given, (S0, S1, ...) is uniquely determined by (Z0, Z1, ...). Hence, one can also use π = (Z0, Z1, ...) to represent a sampling policy.

Because {Ti} is a regenerative process, by following the renewal theory in [26] and [22, Section 6.1], one can show that (1/n) E[Sn] is a convergent sequence and

limsup_{T→∞} (1/T) E[ ∫_0^T (Wt − Ŵt)² dt ]
= lim_{n→∞} E[ ∫_0^{Dn} (Wt − Ŵt)² dt ] / E[Dn]
= lim_{n→∞} ( Σ_{i=0}^{n−1} E[ ∫_{Di}^{Di+1} (Wt − W_{Si})² dt ] ) / ( Σ_{i=0}^{n−1} E[Yi + Zi] ),

where in the last step we have used E[Dn] = E[ Σ_{i=0}^{n−1} (Zi + Yi+1) ] = E[ Σ_{i=0}^{n−1} (Yi + Zi) ]. Hence, (6) can be rewritten as the following Markov decision problem:

mmse_opt ≜ min_{π∈Π1} lim_{n→∞} ( Σ_{i=0}^{n−1} E[ ∫_{Di}^{Di+1} (Wt − W_{Si})² dt ] ) / ( Σ_{i=0}^{n−1} E[Yi + Zi] )  (27)
s.t. lim_{n→∞} (1/n) E[ Σ_{i=0}^{n−1} (Yi + Zi) ] ≥ 1/fmax,  (28)

where mmse_opt is the optimal value of (27).

In order to solve (27), let us consider the following Markov decision problem with a parameter c ≥ 0:

p(c) ≜ min_{π∈Π1} lim_{n→∞} (1/n) Σ_{i=0}^{n−1} E[ ∫_{Di}^{Di+1} (Wt − W_{Si})² dt − c (Yi + Zi) ]  (29)
s.t. lim_{n→∞} (1/n) E[ Σ_{i=0}^{n−1} (Yi + Zi) ] ≥ 1/fmax,

where p(c) is the optimum value of (29).

Lemma 2. The following assertions are true:
(a). mmse_opt ⋛ c if and only if p(c) ⋛ 0, where the three relations hold correspondingly.
(b). If p(c) = 0, the solutions to (27) and (29) are identical.

Proof. See Appendix F.

Hence, the solution to (27) can be obtained by solving (29) and seeking a c_opt ≥ 0 such that

p(c_opt) = 0.  (30)

B. Lagrangian Dual Problem of (29)

Although (29) is a continuous-time Markov decision problem with a continuous state space (not a convex optimization problem), it is possible to use the Lagrangian dual approach to solve (29) and show that it admits no duality gap. Define the following Lagrangian:

L(π; λ, c) = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} E[ ∫_{Di}^{Di+1} (Wt − W_{Si})² dt − (c + λ)(Yi + Zi) ] + λ/fmax.  (31)

Let

g(λ, c) ≜ inf_{π∈Π1} L(π; λ, c).  (32)

Then, the Lagrangian dual problem of (29) is defined by

d(c) ≜ max_{λ≥0} g(λ, c),  (33)

where d(c) is the optimum value of (33). Weak duality [27], [28] implies that d(c) ≤ p(c). In Section IV-D, we will establish strong duality, i.e., d(c_opt) = p(c_opt) = 0, at the optimal choice of c = c_opt.

In the sequel, we solve (32). By considering the sufficient statistics of the Markov decision problem (32), we obtain:

Lemma 3. For any λ ≥ 0, there exists an optimal solution (Z0, Z1, ...) to (32) in which Zi is independent of (Wt : t ∈ [0, Si]) for all i = 1, 2, ...

Proof. See Appendix G.

Using the stopping times and martingale theory of the Wiener process, we obtain the following lemma:

Lemma 4. Let τ ≥ 0 be a stopping time of the Wiener process Wt with E[τ²] < ∞; then

E[ ∫_0^τ Wt² dt ] = (1/6) E[ W_τ⁴ ].  (34)

Proof. See Appendix H.

If Zi is independent of (Wt : t ∈ [0, Si]), by using Lemma 4, we can show that for every i = 1, 2, ...,

E[ ∫_{Di}^{Di+1} (Wt − W_{Si})² dt ] = (1/6) E[ (W_{Si+Yi+Zi} − W_{Si})⁴ ] + E[Yi + Zi] E[Yi],  (35)

which is proven in Appendix I.

Define the σ-fields F_t^s = σ(W_{s+v} − W_s : v ∈ [0, t]) and F_t^{s+} = ∩_{v>t} F_v^s, as well as the filtration {F_t^{s+}, t ≥ 0} of the time-shifted Wiener process {W_{s+t} − W_s, t ∈ [0, ∞)}. Define M_s as the set of square-integrable stopping times of {W_{s+t} − W_s, t ∈ [0, ∞)}, i.e.,

M_s = { τ ≥ 0 : {τ ≤ t} ∈ F_t^{s+}, E[τ²] < ∞ }.
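Lemma 4 can be sanity-checked by Monte Carlo for a concrete stopping time, e.g. the two-sided hitting time τ = inf{t : |Wt| ≥ a}, which has E[τ²] < ∞. A sketch under our own assumptions (step size and path count are arbitrary; a small discretization bias is expected):

```python
import math
import random

rng = random.Random(3)
dt, a, n_paths = 1e-3, 1.0, 2000
step = math.sqrt(dt)

lhs_sum = rhs_sum = 0.0
for _ in range(n_paths):
    w, integral = 0.0, 0.0
    while abs(w) < a:              # tau = inf{t : |W_t| >= a}
        integral += w * w * dt     # accumulates  int_0^tau W_t^2 dt
        w += rng.gauss(0.0, step)
    lhs_sum += integral            # estimates E[int_0^tau W_t^2 dt]
    rhs_sum += w ** 4              # estimates E[W_tau^4]
lhs = lhs_sum / n_paths
rhs = rhs_sum / (6.0 * n_paths)
print(lhs, rhs)   # Lemma 4 predicts the two averages agree (near a**4 / 6)
```

The per-path difference between the two estimators is a mean-zero martingale term, so their averages track each other much more tightly than either tracks its own mean.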

By using (35) and considering the sufficient statistics of the Markov decision problem (32), we obtain:

Theorem 5. An optimal solution (Z0, Z1, ...) to (32) is that each Zi attains the minimum in

min_{τ ∈ M_{Si+Yi}} E[ (1/2)(W_{Si+Yi+τ} − W_{Si})⁴ − β (Yi + τ) | W_{Si+Yi} − W_{Si}, Yi ],  (36)

where β is given by

β = 3 (c + λ − E[Y]).  (37)

Proof. See Appendix J.

Fig. 4: MMSE vs. fmax tradeoff for i.i.d. exponential channel delay.

C. Per-Sample Optimal Stopping Solution to (36)

We use optimal stopping theory [25] to solve (36). Let us first pose (36) in the language of optimal stopping. A continuous-time two-dimensional Markov chain Xt on a probability space (R², F, P) is defined as follows: Given the initial state X0 = x = (s, b), the state Xt at time t is

Xt = (s + t, b + Wt),  (38)

where {Wt, t ≥ 0} is a standard Wiener process. Define P_x(A) = P(A | X0 = x) and E_x Z = E(Z | X0 = x), respectively, as the conditional probability of event A and the conditional expectation of random variable Z for given initial state X0 = x. Define the σ-fields F_t^X = σ(X_v : v ∈ [0, t]) and F_t^{X+} = ∩_{v>t} F_v^X, as well as the filtration {F_t^{X+}, t ≥ 0} of the Markov chain Xt. A random variable τ : R² → [0, ∞) is said to be a stopping time of Xt if {τ ≤ t} ∈ F_t^{X+} for all t ≥ 0. Let M be the set of square-integrable stopping times of Xt, i.e.,

M = { τ ≥ 0 : {τ ≤ t} ∈ F_t^{X+}, E[τ²] < ∞ }.

Our goal is to solve the following optimal stopping problem:

sup_{τ∈M} E_x g(X_τ),  (39)

where the function g : R² → R is defined as

g(s, b) = βs − (1/2) b⁴,  (40)

with parameter β ≥ 0, and x is the initial state of the Markov chain Xt. Notice that (39) becomes (36) if the initial state is x = (Yi, W_{Si+Yi} − W_{Si}), and Wt is replaced by W_{Si+Yi+t} − W_{Si+Yi}.

Theorem 6. For all initial states (s, b) ∈ R² and β ≥ 0, an optimal stopping time for solving (39) is

τ* = inf{ t ≥ 0 : |b + Wt| ≥ √β }.  (41)

In order to prove Theorem 6, let us define the function u(x) = E_x g(X_{τ*}) and establish some properties of u(x).

Lemma 5. u(x) ≥ g(x) for all x ∈ R², and

u(s, b) = βs − (1/2) b⁴, if b² ≥ β;
u(s, b) = βs + (1/2) β² − β b², if b² < β.  (42)

Proof. See Appendix K.

A function f(x) is said to be excessive for the process Xt if [25]

E_x f(X_t) ≤ f(x), for all t ≥ 0, x ∈ R².  (43)

By using the Ito-Tanaka-Meyer formula, we can obtain:

Lemma 6. The function u(x) is excessive for the process Xt.

Proof. See Appendix L.

Now, we are ready to prove Theorem 6.

Proof of Theorem 6. In Lemma 5 and Lemma 6, we have shown that u(x) = E_x g(X_{τ*}) is an excessive function and u(x) ≥ g(x). In addition, it is known that P_x(τ* < ∞) = 1 for all x ∈ R² [29, Theorem 8.5.3]. These conditions, together with the Corollary to Theorem 1 in [25, Section 3.3.1], imply that τ* is an optimal stopping time of (39). This completes the proof.

An immediate consequence of Theorem 6 is:

Corollary 1. An optimal solution to (36) is

Zi = inf{ t ≥ 0 : |W_{Si+Yi+t} − W_{Si}| ≥ √β }, if β ≥ 0;
Zi = 0, if β < 0.  (44)

D. Zero Duality Gap between (29) and (33) at c = c_opt

Theorem 7. The following assertions are true:
(a). The duality gap between (29) and (33) is zero, such that d(c) = p(c).
(b). A common optimal solution to (6), (27), and (29) with c = c_opt is given by (4) and (8).

Proof Sketch of Theorem 7. We first use Theorem 5 and Corollary 1 to find a geometric multiplier [27] for the primal problem (29). Hence, the duality gap between (29) and (33) is zero, because otherwise there exists no geometric multiplier [27, Section 6.1-6.2]. By this, part (a) is proven. Next, we consider the case of c = c_opt and use Lemma 2 to prove part (b). See Appendix M for the details.

Hence, Theorem 1 follows from Theorem 7.
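As noted after Theorem 1, the optimal threshold can be computed by bisection on the fixed-point equation (8). A sketch using Monte Carlo estimates of the moments of W_Y; the exponential delay with mean 1 and fmax = 1 are our illustrative assumptions:

```python
import random

rng = random.Random(42)
N, fmax = 20_000, 1.0
# W_Y = sqrt(Y) * G with G ~ N(0, 1), so W_Y^2 = Y * G^2; here Y ~ exp(1).
wy2 = [rng.expovariate(1.0) * rng.gauss(0.0, 1.0) ** 2 for _ in range(N)]
wy4 = [x * x for x in wy2]

def mean_max(c, xs):
    """Sample estimate of E[max(c, X)]."""
    return sum(x if x > c else c for x in xs) / len(xs)

def h(beta):
    """LHS minus RHS of equation (8); negative below the root."""
    lhs = mean_max(beta, wy2)
    rhs = max(1.0 / fmax, mean_max(beta * beta, wy4) / (2.0 * beta))
    return lhs - rhs

lo, hi = 1e-6, 100.0               # h(lo) < 0 < h(hi) brackets a root
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if h(mid) < 0:
        lo = mid
    else:
        hi = mid
beta = 0.5 * (lo + hi)

# Optimal MMSE from equation (9), with E[Y] = 1 for this delay law.
mmse_opt = mean_max(beta * beta, wy4) / (6.0 * mean_max(beta, wy2)) + 1.0
print(beta, mmse_opt)
```

With a fixed Monte Carlo sample, h is continuous in beta, so the bisection drives the residual of (8) to essentially zero; the resulting values are only as accurate as the moment estimates.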

Fig. 5: MMSE vs. the scale parameter σ of i.i.d. log-normal channel delay for fmax = 0.8.
Fig. 6: MMSE vs. the scale parameter σ of i.i.d. log-normal channel delay for fmax = 1.5.
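The qualitative comparisons in the next section can be reproduced in outline by simulating zero-wait sampling (3) against the threshold policy (4) under i.i.d. exponential delay with mean 1 and fmax = ∞. A sketch of our own (step size, horizon, and the threshold β = 1.5 are arbitrary demo choices, not the optimal value from (21)):

```python
import math
import random

random.seed(1)
dt, T = 0.01, 1000.0
n = int(T / dt)

# One discretized path of a standard Wiener process.
W = [0.0]
for _ in range(n):
    W.append(W[-1] + random.gauss(0.0, math.sqrt(dt)))

def empirical_mmse(policy, beta=1.5):
    """Time-average of (W_t - What_t)^2 under sample-and-hold estimation,
    with i.i.d. exponential channel delay of mean 1 (an assumed delay law)."""
    rng = random.Random(7)                 # same delay sequence for both policies
    def delay():
        return max(1, int(rng.expovariate(1.0) / dt))
    s, est, err = 0, 0, 0.0
    d = delay()                            # delivery index of the sample at s = 0
    for t in range(n):
        if t == d:                         # sample i delivered; estimator updates
            est = s
            if policy == "zero-wait":      # policy (3): S_{i+1} = S_i + Y_i
                s = t
            else:                          # threshold policy (4)
                u, root = t, math.sqrt(beta)
                while u < n and abs(W[u] - W[s]) < root:
                    u += 1
                s = u
            d = s + delay() if s < n else n + 1
        err += (W[t] - W[est]) ** 2 * dt
    return err / T

m_zw = empirical_mmse("zero-wait")
m_th = empirical_mmse("threshold")
print(m_zw, m_th)
```

By a calculation along the lines of (35), the zero-wait MMSE for this delay law is E[Y²]/(2E[Y]) + E[Y] = 2; for a well-chosen β the threshold policy's empirical MMSE should come out below that, echoing Theorem 3.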

V. NUMERICAL RESULTS

In this section, we evaluate the estimation performance achieved by the following four sampling policies:

1. Uniform sampling: The policy in (2) with β = 1/fmax.
2. Zero-wait sampling [11]–[14]: The sampling policy in (3), which is feasible when fmax ≥ 1/E[Yi].
3. Age-optimal sampling [11], [12]: The sampling policy in (5) and (13), which is the optimal solution to (12).
4. MMSE-optimal sampling: The sampling policy in (4) and (8), which is the optimal solution to (6).

Let mmse_uniform, mmse_zero-wait, mmse_age-opt, and mmse_opt be the MMSEs of uniform sampling, zero-wait sampling, age-optimal sampling, and MMSE-optimal sampling, respectively. According to (15), as well as the facts that uniform sampling is feasible for (12) and zero-wait sampling is feasible for (12) when fmax ≥ 1/E[Yi], we can obtain

mmse_opt ≤ mmse_age-opt ≤ mmse_uniform,
mmse_opt ≤ mmse_age-opt ≤ mmse_zero-wait, when fmax ≥ 1/E[Yi],

which fit with our numerical results below.

Figure 4 depicts the tradeoff between MMSE and fmax for i.i.d. exponential channel delay with mean E[Yi] = 1/µ = 1. Hence, the maximum throughput of the channel is µ = 1. In this setting, mmse_uniform is characterized by eq. (25) of [14], which was obtained using a D/M/1 queueing model. For small values of fmax, age-optimal sampling is similar to uniform sampling, and hence mmse_age-opt and mmse_uniform are of similar values. However, as fmax approaches the maximum throughput 1, mmse_uniform increases to infinity. This is because the queue length in uniform sampling is large at high sampling frequencies, and the samples become stale during their long waiting times in the queue. On the other hand, mmse_opt and mmse_age-opt decrease with respect to fmax. The reason is that the set of feasible policies satisfying the constraints in (6) and (12) becomes larger as fmax grows, and hence the optimal values of (6) and (12) are decreasing in fmax. Moreover, the gap between mmse_opt and mmse_age-opt is large for small values of fmax. The ratio mmse_opt/mmse_age-opt tends to 1/3 as fmax → 0, which is in accordance with (20). As expected, mmse_zero-wait is larger than mmse_opt and mmse_age-opt when fmax ≥ 1.

Figure 5 and Figure 6 illustrate the MMSE under i.i.d. log-normal channel delay for fmax = 0.8 and fmax = 1.5, respectively, where Yi = e^{σXi}/E[e^{σXi}], σ > 0 is the scale parameter of the log-normal distribution, and (X1, X2, ...) are i.i.d. Gaussian random variables with zero mean and unit variance. Because E[Yi] = 1, the maximum throughput of the channel is 1. In Fig. 5, since fmax < 1, zero-wait sampling is not feasible and hence is not plotted. As the scale parameter σ grows, the tail of the log-normal distribution becomes heavier and heavier. We observe that mmse_uniform grows quickly with respect to σ, much faster than mmse_opt and mmse_age-opt. In addition, the gap between mmse_opt and mmse_age-opt increases as σ grows. In Fig. 6, because fmax > 1, mmse_uniform is infinite and hence is not plotted. We can find that mmse_zero-wait grows quickly with respect to σ and is much larger than mmse_opt and mmse_age-opt.

VI. CONCLUSION

In this paper, we have investigated optimal sampling and remote estimation of the Wiener process over a channel with random delay. The optimal sampling policy for minimizing the mean square estimation error subject to a sampling frequency constraint has been obtained. We prove that the optimal sampling policy is a threshold policy, and find the optimal threshold. Analytical and numerical comparisons with several important sampling policies, including age-optimal sampling, zero-wait sampling, and classic uniform sampling, have been provided. The results in this paper generalize recent research on age-of-information by adding a signal model, and can also be considered a contribution to the rich literature on remote estimation by adding a channel that consists of a queue with random delay.

ACKNOWLEDGEMENT

The authors are grateful to Ness B. Shroff and Roy D. Yates for their careful reading of the paper and valuable suggestions.

REFERENCES

[1] K. J. Astrom and B. M. Bernhardsson, "Comparison of Riemann and Lebesgue sampling for first order stochastic systems," in IEEE CDC, 2002.

[2] B. Hajek, K. Mitzel, and S. Yang, “Paging and registration in cellular APPENDIX A networks: Jointly optimal policies and an iterative algorithm,” IEEE PROOF OF (10) Trans. Inf. Theory, vol. 54, no. 2, pp. 608–622, Feb 2008. [3] O. C. Imer and T. Bas¸ar, “Optimal estimation with limited measure- ments,” International Journal of Systems Control and Communications, If π is independent of Wt,t [0, ) , the Si’s and Di’s vol. 2, no. 1-3, pp. 5–29, 2010. { ∈ ∞ } are independent of Wt,t [0, ) . Hence, [4] G. M. Lipsa and N. C. Martins, “Remote state estimation with commu- { ∈ ∞ } nication costs for first-order LTI systems,” IEEE Trans. Auto. Control, Di+1 2 vol. 56, no. 9, pp. 2013–2025, Sept 2011. E (Wt Wˆ t) dt − [5] A. Nayyar, T. Bas¸ar, D. Teneketzis, and V. V. Veeravalli, “Optimal (ZDi ) strategies for communication and remote estimation with an energy Di+1 harvesting sensor,” IEEE Trans. Auto. Control, vol. 58, no. 9, 2013. (a) 2 = E E (Wt WS ) dt Si,Di,Di+1 [6] J. Wu, Q. Jia, K. H. Johansson, and L. Shi, “Event-based sensor i ( ( Di − )) data scheduling: Tradeoff between communication rate and estimation Z quality,” IEEE Trans. Auto. Control, vol. 58, no. 4, 2013. Di+1 (b) 2 =E E (Wt WS ) Si ,Di,Di dt [7] K. Nar and T. Bas¸ar, “Sampling multidimensional Wiener processes,” − i | +1 in IEEE CDC, Dec. 2014, pp. 3426–3431. (ZDi ) [8] J. Chakravorty and A. Mahajan, “Distortion-transmission trade-off in D  c i+1 real-time transmission of Gauss-Markov sources,” in IEEE ISIT, 2015. ( )E E 2 = (Wt WSi ) dt [9] X. Gao, E. Akyol, and T. Bas¸ar, “Optimal estimation with limited ( Di − ) measurements and noisy communication,” in IEEE CDC, 2015. Z Di+1   [10] ——, “On remote estimation with multiple communication channels,” (d)E in American Control Conference, 2016, pp. 5425–5430. = (t Si)dt ( Di − ) [11] Y. Sun, E. Uysal-Biyikoglu, R. D. Yates, C. E. Koksal, and N. B. Shroff, Z D +1 “Update or wait: How to keep your data fresh,” in IEEE Infocom, 2016. 
(e) i [12] ——, “Update or wait: How to keep your data fresh,” submitted to IEEE =E ∆(t)dt , Trans. Inf. Theory, 2016, https://arxiv.org/abs/1601.02284. (ZDi ) [13] R. D. Yates, “Lazy is timely: Status updates by an energy harvesting where step (a) is due to the law of iterated expectations, step source,” in IEEE ISIT, 2015. [14] S. Kaul, R. D. Yates, and M. Gruteser, “Real-time status: How often (b) is due to Fubini’s theorem, step (c) is because Si,Di,Di+1 should one update?” in Proc. IEEE INFOCOM, 2012. are independent of the Wiener process, step (d) is due to E 2 [15] C. Kam, S. Kompella, and A. Ephremides, “Age of information under Wald’s identity [WT ]= T [30, Theorem 2.48] and the strong random updates,” in IEEE ISIT, 2013. Markov property of the Wiener process [30, Theorem 2.16], [16] B. T. Bacinoglu, E. T. Ceran, and E. Uysal-Biyikoglu, “Age of infor- mation under energy replenishment constraints,” in ITA, 2015. and step (e) is due to (11). By this, (10) is proven. [17] A. M. Bedewy, Y. Sun, and N. B. Shroff, “Optimizing data freshness, throughput, and delay in multi-server information-update systems,” in IEEE ISIT, 2016. [18] ——, “Age-optimal information updates in multihop networks,” in IEEE APPENDIX B ISIT, 2017. PROOFS OF (17) AND (19) [19] I. Kadota, E. Uysal-Biyikoglu, R. Singh, and E. Modiano, “Minimizing the age of information in broadcast wireless networks,” in Allerton If f 0, (8) tells us that Conference, 2016. max → [20] B. T. Bacinoglu and E. Uysal-Biyikoglu, “Scheduling status updates to E 2 1 minimize age of information with an energy harvesting sensor,” in IEEE [max(β, WY )] = , ISIT, 2017. fmax [21] R. Mehra, “Optimization of measurement schedules and sensor designs which implies for linear dynamic systems,” IEEE Trans. Auto. Control, 1976. [22] P. J. Haas, Stochastic Petri Nets: Modelling, Stability, Simulation. New 1 E 2 E β β + [WY ]= β + [Y ]. York, NY: Springer New York, 2002. ≤ fmax ≤ [23] H. 
Nyquist, “Certain topics in telegraph transmission theory,” Transac- tions of the American Institute of Electrical Engineers, vol. 47, no. 2, Hence, pp. 617–644, April 1928. 1 E 1 [24] C. E. Shannon, “Communication in the presence of noise,” Proceedings [Y ] β . of the IRE, vol. 37, no. 1, pp. 10–21, Jan 1949. fmax − ≤ ≤ fmax [25] A. N. Shiryaev, Optimal Stopping Rules. New York: Springer-Verlag, If fmax 0, (17) follows. Because Y is independent of the 1978. Wiener process,→ using the law of iterated expectations and the [26] S. M. Ross, Stochastic Processes, 2nd ed. John Wiley& Sons, 1996. Gaussian distribution of the Wiener process, we can obtain [27] D. P. Bertsekas, A. Nedi´c, and A. E. Ozdaglar, Convex Analysis and E 4 E 2 E 2 E Optimization. Belmont, MA: Athena Scientific, 2003. [WY ]=3 [Y ] and [WY ]=3 [Y ]. Hence, [28] S. Boyd and L. Vandenberghe, Convex Optimization. Cambridge, U.K.: β E[max(β, W 2 )] β + E[W 2 ]= β + E[Y ], Cambridge Univerisity Press, 2004. ≤ Y ≤ Y 2 E 2 4 2 E 4 2 E 2 [29] R. Durrett, Probability: Theory and Examples, 4th ed. Cambridge β [max(β , WY )] β + [WY ]= β +3 [Y ]. Univerisity Press, 2010. ≤ ≤ [30] P. Morters and Y. Peres, . Cambridge Univerisity Therefore, Press, 2010. β2 E[max(β2, W 4 )] β2 +3E[Y 2] [31] D. P. Bertsekas, Dynamic Programming and Optimal Control, 3rd ed. Y (45) E E 2 . Belmont, MA: Athena Scientific, 2005, vol. 1. β + [Y ] ≤ [max(β, WY )] ≤ β [32] P. R. Kumar and P. Varaiya, Stochastic Systems: Estimation, Identifica- tion, and Adaptive Control. Englewood Cliffs, NY: Prentice-Hall, Inc., By combining (9), (17), and (45), (19) follows in the case of 1986. f 0. max → 9
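The two Gaussian moment identities invoked above can be checked by conditioning on $Y$; a brief derivation, using that $W_Y \sim \mathcal N(0, Y)$ given $Y$:

```latex
% Conditioned on Y = y, W_Y is N(0, y), whose second and fourth moments are y and 3y^2.
E[W_Y^2] = E\bigl\{ E[W_Y^2 \mid Y] \bigr\} = E[Y],
\qquad
E[W_Y^4] = E\bigl\{ E[W_Y^4 \mid Y] \bigr\} = E[3Y^2] = 3\,E[Y^2].
```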

If $d \to 0$, then $Y \to 0$ and $W_Y \to 0$ with probability one. Hence, $E[\max(\beta, W_Y^2)] \to \beta$ and $E[\max(\beta^2, W_Y^4)] \to \beta^2$. Substituting these into (8) and (45) yields
\[
\lim_{d\to 0} \beta = \frac{1}{f_{\max}}, \qquad
\lim_{d\to 0} \left\{ \frac{E[\max(\beta^2, W_Y^4)]}{6\,E[\max(\beta, W_Y^2)]} + E[Y] \right\} = \frac{1}{6 f_{\max}}.
\]
By this, (17) and (19) are proven in the case of $d \to 0$. This completes the proof.

APPENDIX C
PROOF OF (21)

If $f_{\max} \to \infty$, the sampling frequency constraint in (6) can be removed. By (8), the optimal $\beta$ is determined by (21). If $d \to \infty$, let us consider the equation
\[
E[\max(\beta, W_Y^2)] = \frac{E[\max(\beta^2, W_Y^4)]}{2\beta}. \tag{46}
\]
If $Y$ grows by $a$ times, then $\beta$ and $E[\max(\beta, W_Y^2)]$ in (46) should both grow by $a$ times, and $E[\max(\beta^2, W_Y^4)]$ in (46) should grow by $a^2$ times. Hence, if $d \to \infty$, it holds in (8) that
\[
\frac{1}{f_{\max}} \le \frac{E[\max(\beta^2, W_Y^4)]}{2\beta} \tag{47}
\]
and the solution to (8) is given by (21). This completes the proof.

APPENDIX D
PROOFS OF THEOREMS 3 AND 4

Proof of Theorem 3. The zero-wait policy can be expressed as (4) with $\beta = 0$. Because $Y$ is independent of the Wiener process, using the law of iterated expectations and the Gaussian distribution of the Wiener process, we can obtain $E[W_Y^4] = 3E[Y^2]$. According to (21), $\beta = 0$ if and only if $E[W_Y^4] = 3E[Y^2] = 0$, which is equivalent to $Y = 0$ with probability one. This completes the proof.

Proof of Theorem 4. In the one direction, the zero-wait policy can be expressed as (5) with $\beta \le \operatorname{ess\,inf} Y$. If the zero-wait policy is optimal, then the solution to (22) must satisfy $\beta \le \operatorname{ess\,inf} Y$, which further implies $\beta \le Y$ with probability one. From this, we can get
\[
2 \operatorname{ess\,inf} Y \, E[Y] \ge 2\beta E[Y] = E[Y^2]. \tag{48}
\]
By this, (23) follows.

In the other direction, if (23) holds, we will show that the zero-wait policy is age-optimal by considering the following two cases.

Case 1: $E[Y] > 0$. By choosing
\[
\beta = \frac{E[Y^2]}{2E[Y]}, \tag{49}
\]
we can get $\beta \le \operatorname{ess\,inf} Y$ and hence
\[
\beta \le Y \tag{50}
\]
with probability one. According to (49) and (50), such a $\beta$ is the solution to (22). Hence, the zero-wait policy expressed by (5) with $\beta \le \operatorname{ess\,inf} Y$ is the age-optimal policy.

Case 2: $E[Y] = 0$, and hence $Y = 0$ with probability one. In this case, $\beta = 0$ is the solution to (22). Hence, the zero-wait policy expressed by (5) with $\beta = 0$ is the age-optimal policy.

Combining these two cases, the proof is completed.

APPENDIX E
PROOF OF LEMMA 1

Suppose that in policy $\pi$, sample $i$ is generated when the channel is busy sending another sample, and hence sample $i$ needs to wait for some time before being submitted to the channel, i.e., $S_i < G_i$. Let us consider a virtual sampling policy $\pi' = \{S_0, \ldots, S_{i-1}, G_i, S_{i+1}, \ldots\}$. We call policy $\pi'$ a virtual policy because the generation time of sample $i$ in policy $\pi'$ is $S_i' = G_i$, and it may happen that $S_i' > S_{i+1}$. However, this will not affect our proof below. We will show that the MMSE of policy $\pi'$ is smaller than that of policy $\pi = \{S_0, \ldots, S_{i-1}, S_i, S_{i+1}, \ldots\}$.

Note that the Wiener process $\{W_t : t \in [0,\infty)\}$ does not change according to the sampling policy, and the sample delivery times $\{D_1, D_2, \ldots\}$ remain the same in policy $\pi$ and policy $\pi'$. Hence, the only difference between policies $\pi$ and $\pi'$ is that the generation time of sample $i$ is postponed from $S_i$ to $G_i$. The MMSE estimator of policy $\pi$ is given by (1) and the MMSE estimator of policy $\pi'$ is given by
\[
\hat W_t = \begin{cases} 0, & t \in [0, D_1); \\ W_{G_i}, & t \in [D_i, D_{i+1}); \\ W_{S_j}, & t \in [D_j, D_{j+1}), \ j \ne i, \ j \ge 1. \end{cases} \tag{51}
\]
Because $S_i \le G_i \le D_i \le D_{i+1}$, by the strong Markov property of the Wiener process [30, Theorem 2.16], $\int_{D_i}^{D_{i+1}} 2(W_t - W_{G_i})\, dt$ and $W_{G_i} - W_{S_i}$ are mutually independent. Hence,
\begin{align}
E\left\{\int_{D_i}^{D_{i+1}} (W_t - W_{S_i})^2 \, dt\right\}
&= E\left\{\int_{D_i}^{D_{i+1}} (W_t - W_{G_i})^2 + (W_{G_i} - W_{S_i})^2 \, dt\right\}
 + E\left\{\int_{D_i}^{D_{i+1}} 2(W_t - W_{G_i})(W_{G_i} - W_{S_i}) \, dt\right\} \\
&= E\left\{\int_{D_i}^{D_{i+1}} (W_t - W_{G_i})^2 + (W_{G_i} - W_{S_i})^2 \, dt\right\}
 + E\left\{\int_{D_i}^{D_{i+1}} 2(W_t - W_{G_i}) \, dt\right\} E[W_{G_i} - W_{S_i}].
\end{align}
Note that the channel is busy whenever there exist some generated samples that are not delivered to the estimator. Hence, during the time interval $[S_i, G_i]$, the channel is busy sending some samples generated before $S_i$ in policy $\pi$. Because $E[Y^2] < \infty$, we can get $E[Y_j] < \infty$ and
\[
E[G_i - S_i] \le E\left[\sum_{j=1}^{i-1} Y_j\right] < \infty.
\]

By Wald's identity [30, Theorem 2.44], we have $E[W_{G_i} - W_{S_i}] = 0$ and hence
\[
E\left\{\int_{D_i}^{D_{i+1}} (W_t - W_{S_i})^2 \, dt\right\} \ge E\left\{\int_{D_i}^{D_{i+1}} (W_t - W_{G_i})^2 \, dt\right\}. \tag{52}
\]
Therefore, the MMSE of policy $\pi'$ is smaller than that of policy $\pi$. By repeating the above arguments for all samples $i$ satisfying $S_i < G_i$, one can show that policy $\pi'' = \{S_0, G_1, \ldots, G_{i-1}, G_i, G_{i+1}, \ldots\}$ is better than policy $\pi = \{S_0, S_1, \ldots, S_{i-1}, S_i, S_{i+1}, \ldots\}$. This completes the proof.

APPENDIX F
PROOF OF LEMMA 2

Part (a) is proven in two steps:

Step 1: We will prove that $\mathrm{mmse}_{\mathrm{opt}} \le c$ if and only if $p(c) \le 0$. If $\mathrm{mmse}_{\mathrm{opt}} \le c$, then there exists a policy $\pi = (Z_0, Z_1, \ldots) \in \Pi_1$ that is feasible for both (27) and (29), which satisfies
\[
\lim_{n\to\infty} \frac{\sum_{i=0}^{n-1} E\left[\int_{D_i}^{D_{i+1}} (W_t - W_{S_i})^2 \, dt\right]}{\sum_{i=0}^{n-1} E[Y_i + Z_i]} \le c. \tag{53}
\]
Hence,
\[
\lim_{n\to\infty} \frac{\frac{1}{n}\sum_{i=0}^{n-1} E\left[\int_{D_i}^{D_{i+1}} (W_t - W_{S_i})^2 \, dt - c(Y_i + Z_i)\right]}{\frac{1}{n}\sum_{i=0}^{n-1} E[Y_i + Z_i]} \le 0. \tag{54}
\]
Because the inter-sampling times $T_i = Y_i + Z_i$ are regenerative, renewal theory [26] tells us that the limit $\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E[Y_i + Z_i]$ exists and is positive. By this, we get
\[
\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E\left[\int_{D_i}^{D_{i+1}} (W_t - W_{S_i})^2 \, dt - c(Y_i + Z_i)\right] \le 0. \tag{55}
\]
Therefore, $p(c) \le 0$. In the reverse direction, if $p(c) \le 0$, then there exists a policy $\pi = (Z_0, Z_1, \ldots) \in \Pi_1$ that is feasible for both (27) and (29), which satisfies (55). From (55), we can derive (54) and (53). Hence, $\mathrm{mmse}_{\mathrm{opt}} \le c$. By this, we have proven that $\mathrm{mmse}_{\mathrm{opt}} \le c$ if and only if $p(c) \le 0$.

Step 2: Similarly, we can prove that $\mathrm{mmse}_{\mathrm{opt}} > c$ if and only if $p(c) > 0$. This completes the proof of part (a).

Part (b): We first show that each optimal solution to (27) is an optimal solution to (29). By the claim of part (a), $p(c) = 0$ is equivalent to $\mathrm{mmse}_{\mathrm{opt}} = c$. Suppose that policy $\pi = (Z_0, Z_1, \ldots) \in \Pi_1$ is an optimal solution to (27). Then, $\mathrm{mmse}_\pi = \mathrm{mmse}_{\mathrm{opt}} = c$. Applying this in the arguments of (53)-(55), we can show that policy $\pi$ satisfies
\[
\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E\left[\int_{D_i}^{D_{i+1}} (W_t - W_{S_i})^2 \, dt - c(Y_i + Z_i)\right] = 0.
\]
This and $p(c) = 0$ imply that policy $\pi$ is an optimal solution to (29). Similarly, we can prove that each optimal solution to (29) is an optimal solution to (27). By this, part (b) is proven.

APPENDIX G
PROOF OF LEMMA 3

Because the $Y_i$'s are i.i.d., $Z_i$ is independent of $Y_{i+1}, Y_{i+2}, \ldots$, and by the strong Markov property of the Wiener process [30, Theorem 2.16], the term related to $Z_i$ in the Lagrangian $L(\pi; \lambda)$ is
\[
E\left[\int_{S_i + Y_i}^{S_i + Y_i + Z_i + Y_{i+1}} (W_t - W_{S_i})^2 \, dt - (c + \lambda)(Y_i + Z_i)\right], \tag{56}
\]
which is determined by the control decision $Z_i$ and the recent information of the system $\mathcal I_i = (Y_i, (W_{S_i + t} - W_{S_i}, t \ge 0))$. According to [31, p. 252] and [32, Chapter 6], $\mathcal I_i$ is the sufficient statistic for determining $Z_i$ in (32). Therefore, there exists an optimal policy $(Z_0, Z_1, \ldots)$ in which $Z_i$ is determined based on only $\mathcal I_i$, which is independent of $(W_t : t \in [0, S_i])$. This completes the proof.

APPENDIX H
PROOF OF LEMMA 4

According to Theorem 2.51 and Exercise 2.15 of [30], $W_t^4 - 6\int_0^t W_s^2 \, ds$ and $W_t^4 - 6tW_t^2 + 3t^2$ are two martingales of the Wiener process $\{W_t, t \in [0,\infty)\}$. Hence, $\int_0^t W_s^2 \, ds - tW_t^2 + t^2/2$ is also a martingale of the Wiener process.

Because the minimum of two stopping times is a stopping time and constant times are stopping times [29], it follows that $t \wedge \tau$ is a bounded stopping time for every $t \in [0,\infty)$, where $x \wedge y = \min[x, y]$. Then, it follows from Theorem 8.5.1 of [29] that for every $t \in [0,\infty)$
\begin{align}
E\left[\int_0^{t\wedge\tau} W_s^2 \, ds\right] &= \frac{1}{6}\, E\left[W_{t\wedge\tau}^4\right] \tag{57} \\
&= E\left[(t\wedge\tau)\, W_{t\wedge\tau}^2 - \frac{1}{2}(t\wedge\tau)^2\right]. \tag{58}
\end{align}
By the monotone convergence theorem,
\[
\lim_{t\to\infty} E\left[\int_0^{t\wedge\tau} W_s^2 \, ds\right] = E\left[\int_0^{\tau} W_s^2 \, ds\right].
\]
The remaining task is to show that
\[
\lim_{t\to\infty} E\left[W_{t\wedge\tau}^4\right] = E\left[W_\tau^4\right]. \tag{59}
\]

Towards this goal, we combine (57) and (58) and apply the Cauchy-Schwarz inequality to get
\begin{align}
E\left[W_{t\wedge\tau}^4\right] &= E\left[6(t\wedge\tau)\, W_{t\wedge\tau}^2 - 3(t\wedge\tau)^2\right] \\
&\le 6\sqrt{E\left[(t\wedge\tau)^2\right] E\left[W_{t\wedge\tau}^4\right]} - 3\,E\left[(t\wedge\tau)^2\right].
\end{align}
Let $x = \sqrt{E[W_{t\wedge\tau}^4] / E[(t\wedge\tau)^2]}$; then $x^2 - 6x + 3 \le 0$. By the roots and properties of quadratic functions, we obtain $3 - \sqrt 6 \le x \le 3 + \sqrt 6$ and hence
\[
E\left[W_{t\wedge\tau}^4\right] \le (3 + \sqrt 6)^2\, E\left[(t\wedge\tau)^2\right] \le (3 + \sqrt 6)^2\, E\left[\tau^2\right] < \infty.
\]
Then, we use Fatou's lemma [29, Theorem 1.5.4] to derive
\[
E\left[W_\tau^4\right] = E\left[\lim_{t\to\infty} W_{t\wedge\tau}^4\right] \le \liminf_{t\to\infty} E\left[W_{t\wedge\tau}^4\right] \le (3 + \sqrt 6)^2\, E\left[\tau^2\right] < \infty. \tag{60}
\]
Further, by (60), Doob's maximal inequality [30, Theorem 12.30], and [29, Theorem 5.4.3],
\[
E\left[\sup_{t\in[0,\infty)} W_{t\wedge\tau}^4\right] = E\left[\sup_{t\in[0,\tau]} W_t^4\right] \le \left(\frac{4}{3}\right)^4 E\left[W_\tau^4\right] < \infty. \tag{61}
\]
Because $W_{t\wedge\tau}^4 \le \sup_{t\in[0,\infty)} W_{t\wedge\tau}^4$ and $\sup_{t\in[0,\infty)} W_{t\wedge\tau}^4$ is integrable, (59) follows from the dominated convergence theorem [29, Theorem 1.5.6]. This completes the proof.

APPENDIX I
PROOF OF (35)

By using (26) and the condition that $Z_i$ is independent of $(W_t, t \in [0, S_i])$, we obtain that for given $Y_i$ and $Y_{i+1}$, $Y_i$ and $Y_i + Z_i + Y_{i+1}$ are stopping times of the time-shifted Wiener process $\{W_{S_i + t} - W_{S_i}, t \ge 0\}$. Hence,
\begin{align}
E\left\{\int_{S_i+Y_i}^{S_i+Y_i+Z_i+Y_{i+1}} (W_t - W_{S_i})^2 \, dt\right\}
&= E\left\{\int_{Y_i}^{Y_i+Z_i+Y_{i+1}} (W_{S_i+t} - W_{S_i})^2 \, dt\right\} \\
&\overset{(a)}{=} E\left\{E\left\{\int_{Y_i}^{Y_i+Z_i+Y_{i+1}} (W_{S_i+t} - W_{S_i})^2 \, dt \,\Big|\, Y_i, Y_{i+1}\right\}\right\} \\
&\overset{(b)}{=} E\left\{E\left\{\frac{1}{6}(W_{S_i+Y_i+Z_i+Y_{i+1}} - W_{S_i})^4 - \frac{1}{6}(W_{S_i+Y_i} - W_{S_i})^4 \,\Big|\, Y_i, Y_{i+1}\right\}\right\} \\
&\overset{(c)}{=} \frac{1}{6}\, E\left[(W_{S_i+Y_i+Z_i+Y_{i+1}} - W_{S_i})^4\right] - \frac{1}{6}\, E\left[(W_{S_i+Y_i} - W_{S_i})^4\right], \tag{62}
\end{align}
where step (a) and step (c) are due to the law of iterated expectations, and step (b) is due to Lemma 4. Because $S_{i+1} = S_i + Y_i + Z_i$, we have
\begin{align}
&E\left[(W_{S_i+Y_i+Z_i+Y_{i+1}} - W_{S_i})^4\right] \\
&= E\left[\left[(W_{S_i+Y_i+Z_i} - W_{S_i}) + (W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})\right]^4\right] \\
&= E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]
 + 4\,E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^3 (W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})\right] \\
&\quad + 6\,E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^2 (W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^2\right]
 + 4\,E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^3\right] \\
&\quad + E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^4\right] \\
&= E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]
 + 4\,E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^3\right] E\left[W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}}\right] \\
&\quad + 6\,E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^2\right] E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^2\right]
 + 4\,E\left[W_{S_i+Y_i+Z_i} - W_{S_i}\right] E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^3\right] \\
&\quad + E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^4\right],
\end{align}
where in the last equation we have used the fact that $Y_{i+1}$ is independent of $Y_i$ and $Z_i$, and the strong Markov property of the Wiener process [30, Theorem 2.16]. Because
\[
E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^3 \,\big|\, Y_{i+1}\right] = E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}}) \,\big|\, Y_{i+1}\right] = 0,
\]
by the law of iterated expectations, we have
\[
E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^3\right] = E\left[W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}}\right] = 0.
\]
In addition, Wald's identity tells us that $E[W_\tau^2] = E[\tau]$ for any stopping time $\tau$ with $E[\tau] < \infty$. Hence,
\[
E\left[(W_{S_i+Y_i+Z_i+Y_{i+1}} - W_{S_i})^4\right]
= E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right] + 6\,E[Y_i + Z_i]\, E[Y_{i+1}]
 + E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^4\right]. \tag{63}
\]
Finally, because $(W_{S_i+t} - W_{S_i})$ and $(W_{S_{i+1}+t} - W_{S_{i+1}})$ are both Wiener processes, and the $Y_i$'s are i.i.d.,
\[
E\left[(W_{S_i+Y_i} - W_{S_i})^4\right] = E\left[(W_{S_{i+1}+Y_{i+1}} - W_{S_{i+1}})^4\right]. \tag{64}
\]
Combining (62)-(64) yields (35). This completes the proof.
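The fourth-moment decomposition in (63) can be sanity-checked numerically in a simplified setting. The sketch below is not part of the proof: it assumes the two durations are *independent of the Wiener path* (so that, given the durations, the two increments are independent Gaussians), and the discrete duration distributions `T1`, `T2` are illustrative choices. Under these assumptions, conditioning on the durations gives $E[(X+X')^4 \mid T_1, T_2] = 3T_1^2 + 6T_1T_2 + 3T_2^2$, and averaging should match the right-hand side of (63).

```python
import itertools

# Illustrative discrete distributions for two independent durations (assumed).
T1 = {1.0: 0.5, 2.0: 0.5}        # P(T1 = 1) = P(T1 = 2) = 1/2
T2 = {0.5: 0.25, 3.0: 0.75}

def expect(dist, f):
    """E[f(T)] for a discrete distribution given as {value: probability}."""
    return sum(p * f(t) for t, p in dist.items())

# Left-hand side: average the conditional fourth moment of the summed
# increments, E[(X + X')^4 | T1, T2] = 3*T1^2 + 6*T1*T2 + 3*T2^2.
lhs = sum(p1 * p2 * (3 * t1**2 + 6 * t1 * t2 + 3 * t2**2)
          for (t1, p1), (t2, p2) in itertools.product(T1.items(), T2.items()))

# Right-hand side, as in (63): E[X^4] + 6 E[T1] E[T2] + E[X'^4],
# using E[N(0,t)^4] = 3 t^2 for each increment separately.
rhs = (expect(T1, lambda t: 3 * t**2)
       + 6 * expect(T1, lambda t: t) * expect(T2, lambda t: t)
       + expect(T2, lambda t: 3 * t**2))

print(lhs, rhs)  # the two sides agree
```

The equality hinges on $E[T_1 T_2] = E[T_1]E[T_2]$, mirroring the role of the independence of $Y_{i+1}$ from $(Y_i, Z_i)$ in the proof.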

APPENDIX J
PROOF OF THEOREM 5

By (35), (56) can be rewritten as
\begin{align}
&E\left[\int_{S_i+Y_i}^{S_i+Y_i+Z_i+Y_{i+1}} (W_t - W_{S_i})^2 \, dt - (c + \lambda)(Y_i + Z_i)\right] \\
&= E\left[\frac{1}{6}(W_{S_i+Y_i+Z_i} - W_{S_i})^4 - \frac{\beta}{3}(Y_i + Z_i)\right] \\
&= E\left[\frac{1}{6}\left[(W_{S_i+Y_i} - W_{S_i}) + (W_{S_i+Y_i+Z_i} - W_{S_i+Y_i})\right]^4 - \frac{\beta}{3}(Y_i + Z_i)\right]. \tag{65}
\end{align}
Because the $Y_i$'s are i.i.d. and by the strong Markov property of the Wiener process [30, Theorem 2.16], the term in (65) is determined by the control decision $Z_i$ and the information $\mathcal I_i' = (W_{S_i+Y_i} - W_{S_i},\, Y_i,\, (W_{S_i+Y_i+t} - W_{S_i+Y_i}, t \ge 0))$. According to [31, p. 252] and [32, Chapter 6], $\mathcal I_i'$ is the sufficient statistic for determining the waiting time $Z_i$ in (32). Therefore, there exists an optimal policy $(Z_0, Z_1, \ldots)$ in which $Z_i$ is determined based on only $\mathcal I_i'$. By this, (32) is decomposed into a sequence of per-sample control problems (36). In addition, because the $Y_i$'s are i.i.d. and by the strong Markov property of the Wiener process, the $Z_i$'s in this optimal policy are i.i.d. Similarly, the $(W_{S_i+Y_i+Z_i} - W_{S_i})$'s in this optimal policy are i.i.d.

APPENDIX K
PROOF OF LEMMA 5

Case 1: If $b^2 \ge \beta$, then (41) tells us that
\[
\tau^* = 0 \tag{66}
\]
and
\[
u(x) = E[g(X_0) \mid X_0 = x] = g(x) = \beta s - \frac{1}{2} b^4. \tag{67}
\]

Case 2: If $b^2 < \beta$, then $\tau^* > 0$ and $(b + W_{\tau^*})^2 = \beta$. Invoking Theorem 8.5.5 in [29] yields
\[
E_x \tau^* = (\sqrt\beta - b)(\sqrt\beta + b) = \beta - b^2. \tag{68}
\]
Using this, we can obtain
\begin{align}
u(x) &= E_x\, g(X(\tau^*)) = \beta(s + E_x\tau^*) - \frac{1}{2}\, E_x\left[(b + W_{\tau^*})^4\right] \\
&= \beta(s + \beta - b^2) - \frac{1}{2}\beta^2 = \beta s + \frac{1}{2}\beta^2 - b^2\beta. \tag{69}
\end{align}
Hence, in Case 2,
\[
u(x) - g(x) = \frac{1}{2}\beta^2 - b^2\beta + \frac{1}{2} b^4 = \frac{1}{2}(b^2 - \beta)^2 \ge 0.
\]
By combining these two cases, Lemma 5 is proven.

APPENDIX L
PROOF OF LEMMA 6

The function $u(s,b)$ is continuously differentiable in $(s,b)$. In addition, $\frac{\partial^2}{\partial b^2} u(s,b)$ is continuous everywhere but at $b = \pm\sqrt\beta$. By the Itô-Tanaka-Meyer formula [30, Theorem 7.14 and Corollary 7.35], we obtain that almost surely
\begin{align}
u(s + t, b + W_t) - u(s, b) &= \int_0^t \frac{\partial}{\partial b} u(s + r, b + W_r)\, dW_r + \int_0^t \frac{\partial}{\partial s} u(s + r, b + W_r)\, dr \\
&\quad + \frac{1}{2}\int_{-\infty}^{\infty} L^a(t)\, \frac{\partial^2}{\partial b^2} u(s + t, b + a)\, da, \tag{70}
\end{align}
where $L^a(t)$ is the local time that the Wiener process spends at the level $a$, i.e.,
\[
L^a(t) = \lim_{\epsilon \downarrow 0} \frac{1}{2\epsilon} \int_0^t 1_{\{|W_s - a| \le \epsilon\}}\, ds, \tag{71}
\]
and $1_A$ is the indicator function of event $A$. By the property of local times of the Wiener process [30, Theorem 6.18], we obtain that almost surely
\begin{align}
u(s + t, b + W_t) - u(s, b) &= \int_0^t \frac{\partial}{\partial b} u(s + r, b + W_r)\, dW_r + \int_0^t \frac{\partial}{\partial s} u(s + r, b + W_r)\, dr \\
&\quad + \frac{1}{2}\int_0^t \frac{\partial^2}{\partial b^2} u(s + r, b + W_r)\, dr. \tag{72}
\end{align}
Because
\[
\frac{\partial}{\partial b} u(s,b) = \begin{cases} -2b^3, & \text{if } b^2 \ge \beta; \\ -2\beta b, & \text{if } b^2 < \beta, \end{cases}
\]
we can obtain that for all $t \ge 0$ and all $x = (s,b) \in \mathbb R^2$
\[
E_x\left\{\int_0^t \left[\frac{\partial}{\partial b} u(s + r, b + W_r)\right]^2 dr\right\} < \infty. \tag{73}
\]
This and Theorem 7.11 of [30] imply that $\int_0^t \frac{\partial}{\partial b} u(s + r, b + W_r)\, dW_r$ is a martingale and
\[
E_x\left[\int_0^t \frac{\partial}{\partial b} u(s + r, b + W_r)\, dW_r\right] = 0, \quad \forall\, t \ge 0. \tag{74}
\]
By combining (38), (72), and (74), we get
\[
E_x[u(X_t)] - u(x) = E_x\left\{\int_0^t \left[\frac{\partial}{\partial s} u(X_r) + \frac{1}{2}\frac{\partial^2}{\partial b^2} u(X_r)\right] dr\right\}. \tag{75}
\]
It is easy to compute that if $b^2 > \beta$,
\[
\frac{\partial}{\partial s} u(s,b) + \frac{1}{2}\frac{\partial^2}{\partial b^2} u(s,b) = \beta - 3b^2 \le 0;
\]
and if $b^2 < \beta$,
\[
\frac{\partial}{\partial s} u(s,b) + \frac{1}{2}\frac{\partial^2}{\partial b^2} u(s,b) = \beta - \beta = 0.
\]
Hence,
\[
\frac{\partial}{\partial s} u(s,b) + \frac{1}{2}\frac{\partial^2}{\partial b^2} u(s,b) \le 0 \tag{76}
\]
for all $(s,b) \in \mathbb R^2$ except for $b = \pm\sqrt\beta$. Since the Lebesgue measure of those $r$ for which $b + W_r = \pm\sqrt\beta$ is zero, we get from (75) and (76) that $E_x[u(X_t)] \le u(x)$ for all $x \in \mathbb R^2$ and $t \ge 0$. This completes the proof.

APPENDIX M
PROOF OF THEOREM 7

Theorem 7 is proven in three steps:

Step 1: We will show that the duality gap between (29) and (33) is zero, i.e., $d(c) = p(c)$. To that end, we need to find $\pi^\star = (Z_0, Z_1, \ldots)$ and $\lambda^\star$ that satisfy the following conditions:
\begin{align}
&\pi^\star \in \Pi, \quad \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E[Y_i + Z_i] - \frac{1}{f_{\max}} \ge 0, \tag{77} \\
&\lambda^\star \ge 0, \tag{78} \\
&L(\pi^\star; \lambda^\star, c_{\mathrm{opt}}) = \inf_{\pi\in\Pi_1} L(\pi; \lambda^\star, c_{\mathrm{opt}}), \tag{79} \\
&\lambda^\star \left\{\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E[Y_i + Z_i] - \frac{1}{f_{\max}}\right\} = 0. \tag{80}
\end{align}
According to Theorem 5 and Corollary 1, the solution $\pi^\star$ to (79) is given by (44), where $\beta = 3(c_{\mathrm{opt}} + \lambda^\star - E[Y])$. In addition, as shown in the proof of Theorem 5, the $Z_i$'s in policy $\pi^\star$ are i.i.d. From (77), (78), and (80), $\lambda^\star$ is determined by considering two cases: If $\lambda^\star > 0$, then
\[
\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E[Y_i + Z_i] = E[Y_i + Z_i] = \frac{1}{f_{\max}}. \tag{81}
\]
If $\lambda^\star = 0$, then
\[
\lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} E[Y_i + Z_i] = E[Y_i + Z_i] \ge \frac{1}{f_{\max}}. \tag{82}
\]
Hence, such $\pi^\star$ and $\lambda^\star$ satisfy (77)-(80). By [27, Prop. 6.2.5], $\pi^\star$ is an optimal solution to the primal problem (29) and $\lambda^\star$ is a geometric multiplier [27] for the primal problem (29). In addition, the duality gap between (29) and (33) is zero, because otherwise there exists no geometric multiplier [27, Sections 6.1-6.2].

Step 2: We will show that a common optimal solution to (6), (27), and (29) with $c = c_{\mathrm{opt}}$ is given by (4), where $\beta \ge 0$ is determined by solving
\[
E[Y_i + Z_i] = \max\left\{\frac{1}{f_{\max}},\ \frac{E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]}{2\beta}\right\}, \tag{83}
\]
where the last term is determined by L'Hôpital's rule if $\beta \to 0$.

We consider the case that $c = c_{\mathrm{opt}}$. In Step 1, we have shown that policy $\pi^\star$ in (44) with $\beta = 3(c_{\mathrm{opt}} + \lambda^\star - E[Y])$ is an optimal solution to (29). According to the definition of $c_{\mathrm{opt}}$ in (30), $p(c_{\mathrm{opt}}) = 0$. By Lemma 2(b), this policy $\pi^\star$ is also an optimal solution to (27). In addition, $p(c_{\mathrm{opt}}) = 0$ and Lemma 2(a) imply $\mathrm{mmse}_{\mathrm{opt}} = c_{\mathrm{opt}}$. Substituting policy $\pi^\star$ and (35) into (27) yields
\begin{align}
c_{\mathrm{opt}} &= \lim_{n\to\infty} \frac{\sum_{i=0}^{n-1} E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4 + 6(Y_i + Z_i)\, E[Y]\right]}{6\sum_{i=0}^{n-1} E[Y_i + Z_i]} \\
&= \frac{E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]}{6\,E[Y_i + Z_i]} + E[Y], \tag{84}
\end{align}
where in the last equation we have used that the $Z_i$'s are i.i.d. and the $(W_{S_i+Y_i+Z_i} - W_{S_i})$'s are i.i.d., which were shown in the proof of Theorem 5. According to (84), $c_{\mathrm{opt}} \ge E[Y]$. Hence, $\beta = 3(c_{\mathrm{opt}} + \lambda^\star - E[Y]) \ge 0$, in which case policy $\pi^\star$ in (44) is exactly (4).

The value of $\beta$ can be obtained by considering the following two cases:

Case 1: If $\lambda^\star > 0$, then (84) and (81) imply that
\[
E[Y_i + Z_i] = \frac{1}{f_{\max}}, \tag{85}
\]
\[
\beta > 3(c_{\mathrm{opt}} - E[Y]) = \frac{E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]}{2\,E[Y_i + Z_i]}. \tag{86}
\]

Case 2: If $\lambda^\star = 0$, then (84) and (82) imply that
\[
E[Y_i + Z_i] \ge \frac{1}{f_{\max}}, \tag{87}
\]
\[
\beta = 3(c_{\mathrm{opt}} - E[Y]) = \frac{E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]}{2\,E[Y_i + Z_i]}. \tag{88}
\]
Combining (85)-(88), (83) follows. By (84), the optimal value of (27) is given by
\[
\mathrm{mmse}_{\mathrm{opt}} = \frac{E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right]}{6\,E[Y_i + Z_i]} + E[Y]. \tag{89}
\]

Step 3: We will show that the expectations in (83) and (89) are given by
\begin{align}
E[Y_i + Z_i] &= E[\max(\beta, W_Y^2)], \tag{90} \\
E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right] &= E[\max(\beta^2, W_Y^4)]. \tag{91}
\end{align}
According to (44) with $\beta \ge 0$, we have
\[
W_{S_i+Y_i+Z_i} - W_{S_i} = \begin{cases} W_{S_i+Y_i} - W_{S_i}, & \text{if } |W_{S_i+Y_i} - W_{S_i}| \ge \sqrt\beta; \\ \pm\sqrt\beta, & \text{if } |W_{S_i+Y_i} - W_{S_i}| < \sqrt\beta. \end{cases}
\]
Hence,
\[
E\left[(W_{S_i+Y_i+Z_i} - W_{S_i})^4\right] = E\left[\max\left(\beta^2, (W_{S_i+Y_i} - W_{S_i})^4\right)\right]. \tag{92}
\]
In addition, from (66) and (68) we know that if $|W_{S_i+Y_i} - W_{S_i}| \ge \sqrt\beta$, then
\[
E[Z_i \mid Y_i] = 0;
\]
otherwise,
\[
E[Z_i \mid Y_i] = \beta - (W_{S_i+Y_i} - W_{S_i})^2.
\]

Hence,
\[
E[Z_i \mid Y_i] = \max\left[\beta - (W_{S_i+Y_i} - W_{S_i})^2,\ 0\right].
\]
Using the law of iterated expectations, the strong Markov property of the Wiener process, and Wald's identity $E[(W_{S_i+Y_i} - W_{S_i})^2] = E[Y_i]$, yields
\begin{align}
E[Z_i + Y_i] &= E\left\{E[Z_i \mid Y_i] + Y_i\right\} \\
&= E\left[\max\left(\beta - (W_{S_i+Y_i} - W_{S_i})^2,\ 0\right) + Y_i\right] \\
&= E\left[\max\left(\beta - (W_{S_i+Y_i} - W_{S_i})^2,\ 0\right) + (W_{S_i+Y_i} - W_{S_i})^2\right] \\
&= E\left[\max\left(\beta, (W_{S_i+Y_i} - W_{S_i})^2\right)\right]. \tag{93}
\end{align}
Finally, because $W_t$ and $W_{S_i+t} - W_{S_i}$ are of the same distribution, (90) and (91) follow from (93) and (92), respectively. This completes the proof.
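The chain of equalities in (93) can be checked by Monte Carlo in a simplified setting. The sketch below is illustrative, not part of the proof: the delay distribution $Y \sim \mathrm{Exp}(1)$, the value $\beta = 1$, and the direct substitution $Z = \max(\beta - W_Y^2, 0)$ for the expected waiting time are all assumptions made for the check. The agreement of the two averages rests on Wald's identity $E[W_Y^2] = E[Y]$.

```python
import math
import random

# Monte Carlo sanity check of (93): with Z = max(beta - W_Y^2, 0),
#   E[Z + Y] = E[max(beta, W_Y^2)],
# which holds because max(beta - x, 0) + x = max(beta, x) and E[W_Y^2] = E[Y].

def check_identity(beta=1.0, n=200_000, seed=7):
    rng = random.Random(seed)
    lhs = rhs = 0.0
    for _ in range(n):
        y = rng.expovariate(1.0)             # sampled delay Y (assumed Exp(1))
        w = math.sqrt(y) * rng.gauss(0, 1)   # W_Y ~ N(0, Y) given Y
        z = max(beta - w * w, 0.0)           # expected waiting time from the threshold rule
        lhs += z + y
        rhs += max(beta, w * w)
    return lhs / n, rhs / n

lhs, rhs = check_identity()
print(lhs, rhs)  # the two averages agree up to Monte Carlo error
```

Note that the two averages are computed on the same samples, so their difference is exactly the empirical mean of $Y - W_Y^2$, which vanishes in expectation by Wald's identity.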

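The exit-time identity (68) used in Appendix K, $E_x\tau^* = \beta - b^2$ for a Brownian motion started at $b$ leaving $(-\sqrt\beta, \sqrt\beta)$, has an exact discrete analogue that can be verified numerically. The sketch below is an illustrative check, not from the paper: a symmetric random walk on $\{-k, \ldots, k\}$ has expected exit step count $m(j) = k^2 - j^2$ from state $j$, and rescaling space by $h$ and time by $h^2$ recovers $\beta - b^2$ with $\sqrt\beta = kh$, $b = jh$.

```python
# Solve m(j) = 1 + (m(j-1) + m(j+1))/2 with m(-k) = m(k) = 0 for the
# interior states j = -k+1, ..., k-1, using the Thomas algorithm for the
# resulting tridiagonal system, and compare with the closed form k^2 - j^2.

def expected_exit_steps(k):
    """Expected number of steps for a symmetric walk to exit {-k, ..., k}."""
    n = 2 * k - 1
    sub = [-0.5] * n     # subdiagonal coefficients of the system
    diag = [1.0] * n     # main diagonal
    sup = [-0.5] * n     # superdiagonal coefficients
    rhs = [1.0] * n      # one unit of time elapses per step
    for i in range(1, n):             # forward elimination
        w = sub[i] / diag[i - 1]
        diag[i] -= w * sup[i - 1]
        rhs[i] -= w * rhs[i - 1]
    m = [0.0] * n
    m[n - 1] = rhs[n - 1] / diag[n - 1]
    for i in range(n - 2, -1, -1):    # back substitution
        m[i] = (rhs[i] - sup[i] * m[i + 1]) / diag[i]
    return m

k = 20
m = expected_exit_steps(k)
print(m[k - 1])  # state j = 0: expected exit steps = k^2 = 400
```

Index $i$ corresponds to state $j = i - (k - 1)$, so `m[k - 1]` is the center state $j = 0$, matching $\beta - b^2$ at $b = 0$ after the $h^2$ time rescaling.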