2.10 Entropy Rate of a Stationary Source

Discrete-time Information Source

• In most communication systems, communication takes place continually instead of over a finite period of time. Examples: TV broadcast, the Internet, cellular systems.
• The information source can be modeled as a discrete-time random process $\{X_k, k \ge 1\}$.
• $\{X_k, k \ge 1\}$ is an infinite collection of random variables indexed by the set of positive integers. The index $k$ is referred to as the "time" index.
• The random variables $X_k$ are called letters.
• Assume that $H(X_k) < \infty$ for all $k$.

Total Entropy of $\{X_k\}$

• For a finite subset $A$ of the index set $\{k : k \ge 1\}$, the joint entropy $H(X_k, k \in A)$ is finite because
$$H(X_k, k \in A) \le \sum_{k \in A} H(X_k) < \infty.$$
• However, the joint entropy of an infinite collection of letters is infinite except for very special cases.
• For example, let $X_k$ be i.i.d. with $H(X_k) = h$ for all $k$, where $h > 0$. Then
$$H(X_k, k \ge 1) = \sum_{k=1}^{\infty} H(X_k) = \sum_{k=1}^{\infty} h = \infty.$$
• In general, it is not meaningful to discuss $H(X_k, k \ge 1)$.
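The inequality above is easy to check numerically. The sketch below (plain Python; the joint pmf is made up purely for illustration) computes the joint entropy of a pair of correlated letters and compares it with the sum of the marginal entropies.

from math import log2

def entropy(probs):
    # Entropy in bits of a collection of probabilities.
    return -sum(p * log2(p) for p in probs if p > 0)

# A made-up joint pmf of two correlated letters (X1, X2).
joint = {(0, 0): 0.4, (0, 1): 0.1,
         (1, 0): 0.1, (1, 1): 0.4}

# Marginal pmfs of X1 and X2.
p1, p2 = {}, {}
for (x1, x2), p in joint.items():
    p1[x1] = p1.get(x1, 0.0) + p
    p2[x2] = p2.get(x2, 0.0) + p

H12 = entropy(joint.values())
H1, H2 = entropy(p1.values()), entropy(p2.values())
print(f"H(X1,X2) = {H12:.4f} bits <= H(X1) + H(X2) = {H1 + H2:.4f} bits")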

Entropy Rate

We are motivated to define the entropy rate of an information source, which gives the average entropy per letter of the source.

Definition 2.54 The entropy rate of an information source $\{X_k\}$ is defined as
$$H_X = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n)$$
when the limit exists.

Entropy Rate May Exist

Example 2.55 Let $\{X_k\}$ be an i.i.d. source with generic random variable $X$. Then
$$\lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n) = \lim_{n \to \infty} \frac{nH(X)}{n} = \lim_{n \to \infty} H(X) = H(X),$$
i.e., the entropy rate of an i.i.d. source is the entropy of any of its single letters.
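A brute-force check of Example 2.55 (a minimal sketch; the biased binary letter below is an arbitrary choice): for an i.i.d. source, the joint entropy of the first $n$ letters computed directly from the product pmf equals $nH(X)$, so the normalized entropy is $H(X)$ for every $n$.

from itertools import product
from math import log2, prod

def entropy(probs):
    # Entropy in bits of a collection of probabilities.
    return -sum(p * log2(p) for p in probs if p > 0)

# Made-up generic letter X: binary with P(X=0) = 0.3, P(X=1) = 0.7.
px = {0: 0.3, 1: 0.7}
HX = entropy(px.values())

for n in range(1, 7):
    # Joint pmf of (X_1, ..., X_n) for an i.i.d. source: product of marginals.
    joint_probs = [prod(px[x] for x in xs) for xs in product(px, repeat=n)]
    Hn = entropy(joint_probs)
    print(f"n = {n}:  H(X_1,...,X_n)/n = {Hn / n:.6f}   H(X) = {HX:.6f}")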

Entropy Rate May Not Exist

Example 2.56 Let $\{X_k\}$ be a source such that the $X_k$ are mutually independent and $H(X_k) = k$ for $k \ge 1$. Then
$$\frac{1}{n} H(X_1, X_2, \ldots, X_n) = \frac{1}{n} \sum_{k=1}^{n} H(X_k) = \frac{1}{n} \sum_{k=1}^{n} k = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{1}{2}(n+1),$$
which does not converge as $n \to \infty$ although $H(X_k) < \infty$ for all $k$. Therefore, the entropy rate of $\{X_k\}$ does not exist.
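One concrete way to realize $H(X_k) = k$ bits is to let the $X_k$ be mutually independent with $X_k$ uniform over $2^k$ values (an illustrative construction, not one from the text). The snippet below just evaluates the normalized entropy $\frac{1}{n}\sum_{k=1}^{n} k = \frac{n+1}{2}$ and shows that it grows without bound.

# X_k uniform over 2**k values gives H(X_k) = k bits (illustrative construction).
for n in (1, 10, 100, 1000):
    bn = sum(range(1, n + 1)) / n    # (1/n) * sum of H(X_k) for k = 1..n
    print(f"n = {n:4d}:  (1/n) H(X_1,...,X_n) = {bn:7.1f} bits  (= (n+1)/2)")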

The Limit $H'_X$

• Toward characterizing the asymptotic behavior of $\{X_k\}$, it is natural to consider the limit
$$H'_X = \lim_{n \to \infty} H(X_n \mid X_1, X_2, \ldots, X_{n-1})$$
if it exists.
• The quantity $H(X_n \mid X_1, X_2, \ldots, X_{n-1})$ is the conditional entropy of the next letter given that we know all the past history of the source.
• $H'_X$ is the limit of this quantity after the source has been run for an indefinite amount of time.
• Is $H'_X = H_X$?
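To make this quantity concrete, the sketch below evaluates $H(X_n \mid X_1, \ldots, X_{n-1})$ by brute force for a two-state Markov chain started from its stationary distribution (an illustrative source; the transition probabilities are an arbitrary choice). For such a source the quantity settles immediately at its limiting value, since the past beyond the previous letter carries no extra information.

from itertools import product
from math import log2

# Illustrative stationary binary Markov chain: transition matrix P and a
# starting distribution pi chosen so that pi P = pi.
P = {0: {0: 0.9, 1: 0.1},
     1: {0: 0.3, 1: 0.7}}
pi = {0: 0.75, 1: 0.25}

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

def H_joint(n):
    # H(X_1, ..., X_n) for the chain started from pi, by explicit enumeration.
    probs = []
    for xs in product((0, 1), repeat=n):
        p = pi[xs[0]]
        for a, b in zip(xs, xs[1:]):
            p *= P[a][b]
        probs.append(p)
    return entropy(probs)

prev = 0.0
for n in range(1, 9):
    Hn = H_joint(n)
    cond = Hn - prev   # H(X_n | X_1, ..., X_{n-1}) by the chain rule
    print(f"n = {n}:  H(X_n | X_1, ..., X_{{n-1}}) = {cond:.6f} bits")
    prev = Hn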

Stationary Information Source

A stationary information source is one such that any finite block of random variables and any of its time-shifted versions have exactly the same joint distribution.

Definition 2.57 An information source $\{X_k\}$ is stationary if
$$X_1, X_2, \ldots, X_m$$
and
$$X_{1+l}, X_{2+l}, \ldots, X_{m+l}$$
have the same joint distribution for any $m, l \ge 1$.

[Diagram: a block of $m$ consecutive letters $X_1, \ldots, X_m$ and the same block shifted by $l$, namely $X_{1+l}, \ldots, X_{m+l}$.]
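As a small illustration of Definition 2.57 (not an example from the text), consider again a two-state Markov chain started from its stationary distribution. The sketch below computes the joint pmf of the block $(X_{1+l}, X_{2+l})$ for several shifts $l$ and shows that it does not depend on $l$; the transition matrix is an arbitrary choice.

# Illustrative two-state Markov chain started from its stationary distribution.
P = [[0.9, 0.1],
     [0.3, 0.7]]
pi = [0.75, 0.25]        # stationary distribution: pi P = pi

def step(dist):
    # One-step evolution of a marginal distribution under P.
    return [sum(dist[a] * P[a][b] for a in (0, 1)) for b in (0, 1)]

def pair_pmf(start):
    # Joint pmf of two consecutive letters when the earlier one has pmf `start`.
    return {(a, b): start[a] * P[a][b] for a in (0, 1) for b in (0, 1)}

dist = pi
for l in range(4):       # shifts l = 0, 1, 2, 3
    pmf = {k: round(v, 6) for k, v in pair_pmf(dist).items()}
    print(f"pmf of (X_{1 + l}, X_{2 + l}) = {pmf}")
    dist = step(dist)    # marginal pmf one time index later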

Stationarity and $H'_X$

Recall that
$$H'_X = \lim_{n \to \infty} H(X_n \mid X_1, X_2, \ldots, X_{n-1}).$$

Lemma 2.58 Let $\{X_k\}$ be a stationary source. Then $H'_X$ exists.

Proof
1. Since $H(X_n \mid X_1, X_2, \ldots, X_{n-1})$ is lower bounded by zero for all $n$, it suffices to prove that this quantity is non-increasing in $n$ to conclude that the limit $H'_X$ exists.
2. Toward this end, for $n \ge 2$, consider
$$H(X_n \mid X_1, X_2, \ldots, X_{n-1}) \le H(X_n \mid X_2, X_3, \ldots, X_{n-1}) = H(X_{n-1} \mid X_1, X_2, \ldots, X_{n-2}),$$
where the inequality holds because conditioning does not increase entropy, and the last step is justified by the stationarity of $\{X_k\}$.
3. Therefore, $H(X_n \mid X_1, X_2, \ldots, X_{n-1})$ is non-increasing in $n$. The lemma is proved.

Cesàro Mean

• Consider a sequence $\{a_n, n \ge 1\}$.
• Construct a sequence $\{b_n, n \ge 1\}$, where $b_n = \frac{1}{n} \sum_{i=1}^{n} a_i$.
• $b_n$ is the average of the first $n$ terms in $\{a_n\}$.
• The $b_n$, $n \ge 1$, are called the Cesàro means of $\{a_n\}$.
• It can be shown that if $a_n \to a$, then $b_n \to a$.

Lemma 2.59 (Cesàro Mean) Let $a_k$ and $b_k$ be real numbers. If $a_n \to a$ as $n \to \infty$ and $b_n = \frac{1}{n} \sum_{i=1}^{n} a_i$, then $b_n \to a$ as $n \to \infty$.

Proof
1. Since $a_n \to a$ as $n \to \infty$, for every $\epsilon > 0$, there exists $N(\epsilon)$ such that $|a_n - a| < \epsilon$ for all $n > N(\epsilon)$.
2. For $n > N(\epsilon)$, consider
$$
\begin{aligned}
|b_n - a| &= \left| \frac{1}{n} \sum_{i=1}^{n} a_i - a \right| \\
&= \left| \frac{1}{n} \sum_{i=1}^{n} a_i - \frac{1}{n} \sum_{i=1}^{n} a \right| \\
&= \frac{1}{n} \left| \sum_{i=1}^{n} (a_i - a) \right| \\
&\le \frac{1}{n} \sum_{i=1}^{n} |a_i - a| \\
&= \frac{1}{n} \left( \sum_{i=1}^{N(\epsilon)} |a_i - a| + \sum_{i=N(\epsilon)+1}^{n} |a_i - a| \right) \\
&< \frac{1}{n} \sum_{i=1}^{N(\epsilon)} |a_i - a| + \frac{1}{n} \sum_{i=N(\epsilon)+1}^{n} \epsilon \\
&= \frac{1}{n} \sum_{i=1}^{N(\epsilon)} |a_i - a| + \frac{n - N(\epsilon)}{n} \, \epsilon \\
&< \frac{1}{n} \sum_{i=1}^{N(\epsilon)} |a_i - a| + \epsilon.
\end{aligned}
$$
3. The first term tends to 0 as $n \to \infty$.
4. Therefore, for any $\epsilon > 0$, by taking $n$ to be sufficiently large, we can make $|b_n - a| < 2\epsilon$.
5. Hence $b_n \to a$ as $n \to \infty$, proving the lemma.
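A tiny numerical illustration of Lemma 2.59 with a made-up sequence: $a_n = 2 + \frac{(-1)^n}{n}$ converges to $2$, and its Cesàro means $b_n$ converge to the same limit, though more slowly.

a = 2.0                                  # the limit of the (made-up) sequence
running_sum = 0.0
for n in range(1, 10001):
    a_n = a + (-1) ** n / n              # a_n -> a
    running_sum += a_n
    b_n = running_sum / n                # Cesaro mean of the first n terms
    if n in (1, 10, 100, 1000, 10000):
        print(f"n = {n:5d}:  a_n = {a_n:.6f}   b_n = {b_n:.6f}")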

Theorem 2.60 The entropy rate $H_X$ of a stationary source $\{X_k\}$ exists and is equal to $H'_X$.

Idea of Proof: Use the stationarity of $\{X_k\}$ and the Cesàro mean.

Proof
1. We have proved in Lemma 2.58 that $H'_X$ always exists for a stationary source $\{X_k\}$.
2. In order to prove the theorem, we only have to prove that $H_X = H'_X$.
3. For $n \ge 1$, let
$$a_n = H(X_n \mid X_1, X_2, \ldots, X_{n-1}) \quad \text{and} \quad b_n = \frac{1}{n} H(X_1, X_2, \ldots, X_n).$$
4. By the chain rule for entropy,
$$\frac{1}{n} H(X_1, X_2, \ldots, X_n) = \frac{1}{n} \sum_{k=1}^{n} H(X_k \mid X_1, X_2, \ldots, X_{k-1}),$$
or
$$b_n = \frac{1}{n} \sum_{k=1}^{n} a_k.$$
5. Therefore, $\{b_n, n \ge 1\}$ are the Cesàro means of $\{a_n\}$.
6. From the definition of $H'_X$, we have $a_n \to H'_X$.
7. By Lemma 2.59, $b_n \to H'_X$, and so
$$H_X = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n) = H'_X.$$
8. Hence, $H_X = H'_X$, as was to be proved.
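The following sketch puts Lemma 2.58, the Cesàro-mean argument, and Theorem 2.60 together on a stationary source that is not i.i.d. It uses the illustrative process $X_k = Z_k + Z_{k+1}$, where the $Z_k$ are i.i.d. fair bits (a construction chosen here for illustration, not taken from the text): enumerating the underlying $Z$ sequences gives the exact joint pmf of $X_1, \ldots, X_n$, from which $a_n = H(X_n \mid X_1, \ldots, X_{n-1})$ and $b_n = \frac{1}{n} H(X_1, \ldots, X_n)$ are computed. The $a_n$ are non-increasing, and both sequences converge to the same entropy rate, which is 1 bit per letter for this process.

from itertools import product
from math import log2

def entropy(pmf):
    # Entropy in bits of a pmf given as a dict mapping outcomes to probabilities.
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def joint_entropy(n):
    # H(X_1, ..., X_n) for X_k = Z_k + Z_{k+1} with Z_k i.i.d. fair bits.
    pmf = {}
    for z in product((0, 1), repeat=n + 1):               # enumerate Z_1..Z_{n+1}
        x = tuple(z[k] + z[k + 1] for k in range(n))      # induced letters X_1..X_n
        pmf[x] = pmf.get(x, 0.0) + 0.5 ** (n + 1)
    return entropy(pmf)

prev = 0.0
for n in range(1, 13):
    Hn = joint_entropy(n)
    a_n = Hn - prev      # H(X_n | X_1, ..., X_{n-1}) by the chain rule
    b_n = Hn / n         # (1/n) H(X_1, ..., X_n), the Cesaro mean of a_1..a_n
    print(f"n = {n:2d}:  a_n = {a_n:.4f}   b_n = {b_n:.4f}")
    prev = Hn
# a_n is non-increasing (Lemma 2.58) and b_n converges to the same limit
# (Lemma 2.59 / Theorem 2.60): 1 bit per letter for this source.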


an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). n ···

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· an 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ______! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ______! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ______! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Theorem 2.60 The entropy rate H of a stationary 5. Therefore, b ,n 1 are the Ces´aro means of a . X n { n} source X exists and is equal to H . { k} X0 6. From the definition of H , we have a H . X0 n ! X0 7. By Lemma 2.59, bn H0 , and so Proof ! X 1. We have proved in Lemma 2.58 that H always ex- X0 1 ists for a stationary source X . k HX =lim H(X1,X2, ,Xn)=HX0 . { } n n ··· 2. In order to prove the theorem, we only have to prove !1 that HX = HX0 . 3. For n 1, let 8. Hence, HX = HX0 , as to be proved.

an = H(Xn X1,X2, ,Xn 1) | ··· 1 bn = H(X1,X2, ,Xn). ··· H0 =limH(X X ,X , ,X ) n X n n| 1 2 ··· n 1 !1

4. By the chain rule for entropy,

1 H(X1,X2, ,Xn) n ··· 1 n = H(Xk X1,X2, ,Xk 1), n | ··· kX=1 or 1 n bn = ak. n kX=1 Remarks Remarks
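The short Python sketch below is not part of the original notes; it illustrates Theorem 2.60 numerically for a two-state stationary Markov source, with an illustrative transition matrix P and sequence lengths. It computes $a_n = H(X_n \mid X_1, \ldots, X_{n-1})$ via the chain rule and $b_n = \frac{1}{n} H(X_1, \ldots, X_n)$ by brute-force enumeration, and compares both with the limit $H'_X$, which for a stationary Markov source equals $H(X_2 \mid X_1)$.

```python
# Minimal numerical sketch of Theorem 2.60 (illustrative, not from the notes):
# for a stationary two-state Markov source, both
#   a_n = H(X_n | X_1, ..., X_{n-1})  and  b_n = (1/n) H(X_1, ..., X_n)
# approach the same limit H'_X, here H(X_2 | X_1).
import itertools
import numpy as np

P = np.array([[0.9, 0.1],
              [0.4, 0.6]])            # row-stochastic transition matrix (illustrative)

# Stationary distribution pi solves pi P = pi (left eigenvector of P for eigenvalue 1).
eigvals, eigvecs = np.linalg.eig(P.T)
pi = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1))])
pi = pi / pi.sum()

def joint_entropy(n):
    """H(X_1, ..., X_n) in bits, by enumerating all length-n binary sequences."""
    H = 0.0
    for seq in itertools.product(range(2), repeat=n):
        p = pi[seq[0]]
        for i in range(1, n):
            p *= P[seq[i - 1], seq[i]]
        if p > 0:
            H -= p * np.log2(p)
    return H

# Limit predicted by Theorem 2.60 for a stationary Markov source: H'_X = H(X_2 | X_1).
H_rate = -sum(pi[i] * P[i, j] * np.log2(P[i, j])
              for i in range(2) for j in range(2) if P[i, j] > 0)

for n in range(2, 13):
    a_n = joint_entropy(n) - joint_entropy(n - 1)   # chain rule for entropy
    b_n = joint_entropy(n) / n                      # Cesàro mean of a_1, ..., a_n
    print(f"n={n:2d}  a_n={a_n:.4f}  b_n={b_n:.4f}  H'_X={H_rate:.4f}")
```

Because of the Markov property, $a_n$ equals $H'_X$ exactly for every $n \ge 2$, while $b_n$, being the Cesàro mean of the $a_k$, approaches the same value only gradually, in line with Steps 5-7 of the proof.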

Remarks

• Theorem 2.60 says that
  1. the entropy rate of an information source $\{X_k\}$ exists under the fairly general assumption that $\{X_k\}$ is stationary;
  2. $H'_X$ is an alternative definition/interpretation of the entropy rate of $\{X_k\}$ when $\{X_k\}$ is stationary.

• However, the entropy rate of a stationary source $\{X_k\}$ may not carry any physical meaning unless $\{X_k\}$ is also ergodic. See Section 5.4.
