University of Texas Rio Grande Valley ScholarWorks @ UTRGV

Mathematical and Statistical Sciences Faculty Publications and Presentations College of Sciences

2020

The quantization of the standard triadic Cantor distribution

Mrinal Kanti Roychowdhury The University of Texas Rio Grande Valley


Recommended Citation Roychowdhury, Mrinal K. 2020. “The Quantization of the Standard Triadic Cantor Distribution.” HOUSTON JOURNAL OF MATHEMATICS 46 (2): 389–407.

This Article is brought to you for free and open access by the College of Sciences at ScholarWorks @ UTRGV. It has been accepted for inclusion in Mathematical and Statistical Sciences Faculty Publications and Presentations by an authorized administrator of ScholarWorks @ UTRGV. For more information, please contact [email protected], [email protected].

arXiv:1809.07913v7 [math.DS] 21 Nov 2019

To appear, Houston Journal of Mathematics

THE QUANTIZATION OF THE STANDARD TRIADIC CANTOR DISTRIBUTION

MRINAL KANTI ROYCHOWDHURY

Abstract. The quantization scheme in probability theory deals with finding a best approximation of a given probability distribution by a probability distribution that is supported on finitely many points. For a given $k \ge 2$, let $\{S_j : 1 \le j \le k\}$ be a set of $k$ contractive similarity mappings such that $S_j(x) = \frac{1}{2k-1}x + \frac{2(j-1)}{2k-1}$ for all $x \in \mathbb{R}$, and let $P = \frac{1}{k}\sum_{j=1}^{k} P \circ S_j^{-1}$. Then, $P$ is a unique Borel probability measure on $\mathbb{R}$ such that $P$ has support the Cantor set generated by the similarity mappings $S_j$ for $1 \le j \le k$. In this paper, for the probability measure $P$, when $k = 3$, we investigate the optimal sets of $n$-means and the $n$th quantization errors for all $n \ge 2$. We further show that the quantization coefficient does not exist though the quantization dimension exists.

2010 Mathematics Subject Classification. 60Exx, 28A80, 94A34.

Key words and phrases. Cantor set, probability distribution, optimal sets, quantization error, quantization dimension, quantization coefficient.

1. Introduction

One of the main mathematical aims of the quantization problem is to study the error in the approximation of a given probability measure with a probability measure of finite support. We refer to [GL1, GL3, GL4, GL5, P] for more theoretical results, and to [P1, P2] for promising applications of quantization theory. One may see [GG, GN, Z] for its deep background in information theory and engineering technology. Let $\mathbb{R}^d$ denote the $d$-dimensional Euclidean space, let $\|\cdot\|$ denote the Euclidean norm on $\mathbb{R}^d$ for any $d \ge 1$, and let $n \in \mathbb{N}$. Then, the $n$th quantization error for a Borel probability measure $P$ on $\mathbb{R}^d$, denoted by $V_n := V_n(P)$, is defined by
$$V_n(P) = \inf\Big\{\int \min_{a \in \alpha} \|x - a\|^2 \, dP(x) : \alpha \subset \mathbb{R}^d, \ \mathrm{card}(\alpha) \le n\Big\}.$$
For a finite set $\alpha \subset \mathbb{R}^d$, the number $\int \min_{a \in \alpha} \|x - a\|^2 \, dP(x)$ is often referred to as the cost or distortion error for $\alpha$, and is denoted by $V(P; \alpha)$. A set $\alpha$ for which the infimum occurs and which contains no more than $n$ points is called an optimal set of $n$-means. It is known that for a continuous probability measure an optimal set of $n$-means always has exactly $n$ elements (see [GL1]). To see some work in the direction of optimal sets of $n$-means, one is referred to [CR, DR1, GL2, RR, R1–R5]. The number
$$D(P) := \lim_{n \to \infty} \frac{2 \log n}{-\log V_n(P)},$$
if it exists, is called the quantization dimension of $P$. For any $s \in (0, +\infty)$, the number $\lim_{n \to \infty} n^{2/s} V_n(P)$, if it exists, is called the $s$-dimensional quantization coefficient for $P$. Given a finite subset $\alpha \subset \mathbb{R}^d$, the Voronoi region generated by $a \in \alpha$ is defined by
$$M(a|\alpha) = \{x \in \mathbb{R}^d : \|x - a\| = \min_{b \in \alpha} \|x - b\|\},$$
i.e., the Voronoi region generated by $a \in \alpha$ is the set of all elements in $\mathbb{R}^d$ which are closer to $a$ than to any other element in $\alpha$. Let us now state the following proposition (see [GG, GL1]).

Proposition 1.1. Let $\alpha$ be an optimal set of $n$-means, $a \in \alpha$, and let $M(a|\alpha)$ be the Voronoi region generated by $a \in \alpha$. Then, for every $a \in \alpha$, (i) $P(M(a|\alpha)) > 0$, (ii) $P(\partial M(a|\alpha)) = 0$, and (iii) $a = E(X : X \in M(a|\alpha))$.

From the above proposition, we can say that if $\alpha$ is an optimal set of $n$-means for $P$, then each $a \in \alpha$ is the conditional expectation of the random variable $X$ given that $X$ takes values in the Voronoi region of $a$. Sometimes, we also refer to such an $a \in \alpha$ as the centroid of its own Voronoi region. In this regard, interested readers can see [DFG, DR2, R1].

Let $k \ge 2$ be a positive integer, and let $\{S_j : 1 \le j \le k\}$ be a set of contractive similarity mappings such that $S_j(x) = \frac{1}{2k-1}x + \frac{2(j-1)}{2k-1}$ for all $x \in \mathbb{R}$, and let $P = \frac{1}{k}\sum_{j=1}^{k} P \circ S_j^{-1}$. Then, $P$ is a unique Borel probability measure on $\mathbb{R}$, and $P$ has support the Cantor set $C$ generated by the similarity mappings $S_j$ for $1 \le j \le k$, and $C$ satisfies the invariance equality $C = \bigcup_{j=1}^{k} S_j(C)$ (see [F, H]). The Cantor set $C$ generated by the $k$ similarity mappings is called the $k$-adic Cantor set, more specifically the standard $k$-adic Cantor set, and the probability measure $P$ is called the $k$-adic Cantor distribution, more specifically the standard $k$-adic Cantor distribution. If $k = 2$, then we have two similarity mappings given by $S_1(x) = \frac13 x$ and $S_2(x) = \frac13 x + \frac23$ for all $x \in \mathbb{R}$, and then the probability measure $P$ is given by $P = \frac12 P \circ S_1^{-1} + \frac12 P \circ S_2^{-1}$, which has support the classical Cantor set $C$ satisfying $C = S_1(C) \cup S_2(C)$. For this dyadic Cantor distribution, in [GL2], Graf and Luschgy determined the optimal sets of $n$-means and the $n$th quantization errors for all $n \ge 2$. They also showed that the quantization dimension $D(P)$ of $P$ exists, and equals the Hausdorff dimension of the invariant set $C$, but the quantization coefficient does not exist.

By a word $\sigma$ of length $n$, where $n \ge 1$, over the alphabet $\{1, 2, 3\}$, it is meant that $\sigma := \sigma_1\sigma_2\cdots\sigma_n$, where $\sigma_j \in \{1, 2, 3\}$ for all $1 \le j \le n$. By $\{1, 2, 3\}^n$, we denote the set of all words over the alphabet $\{1, 2, 3\}$ of finite length $n \ge 0$. Notice that the empty word $\emptyset$ has length zero. For $\sigma := \sigma_1\sigma_2\cdots\sigma_n \in \{1, 2, 3\}^n$, by $S_\sigma$ it is meant that $S_\sigma := S_{\sigma_1} \circ \cdots \circ S_{\sigma_n}$. For the empty word $\emptyset$, by $S_\emptyset$ it is meant the identity mapping on $\mathbb{R}$. Write
$$\{1, 2, 3\}^* := \bigcup_{n=0}^{\infty} \{1, 2, 3\}^n,$$
i.e., $\{1, 2, 3\}^*$ denotes the set of all words over the alphabet $\{1, 2, 3\}$ including the empty word $\emptyset$. Let $X$ be a random variable with probability distribution $P$. For words $\beta, \gamma, \cdots, \delta$ in $\{1, 2, 3\}^*$, by $a(\beta, \gamma, \cdots, \delta)$ we mean the conditional expectation of the random variable $X$ given $J_\beta \cup J_\gamma \cup \cdots \cup J_\delta$, i.e.,
$$a(\beta, \gamma, \cdots, \delta) = E(X \mid X \in J_\beta \cup J_\gamma \cup \cdots \cup J_\delta) = \frac{1}{P(J_\beta \cup \cdots \cup J_\delta)} \int_{J_\beta \cup \cdots \cup J_\delta} x \, dP(x).$$

Definition 1.2. For $n \in \mathbb{N}$ with $n \ge 3$ let $\ell(n)$ be the unique natural number with $3^{\ell(n)} \le n < 3^{\ell(n)+1}$. Write $\alpha_2 := \{a(1, 21), a(22, 23, 3)\}$ and $\alpha_3 := \{a(1), a(2), a(3)\}$. For $n \ge 3$, define $\alpha_n := \alpha_n(I)$ as follows:
$$\alpha_n(I) = \begin{cases} \{a(\omega) : \omega \in \{1,2,3\}^{\ell(n)} \setminus I\} \cup \bigcup_{\omega \in I} S_\omega(\alpha_2) & \text{if } 3^{\ell(n)} \le n \le 2 \cdot 3^{\ell(n)}, \\ \bigcup \{S_\omega(\alpha_2) : \omega \in \{1,2,3\}^{\ell(n)} \setminus I\} \cup \bigcup_{\omega \in I} S_\omega(\alpha_3) & \text{if } 2 \cdot 3^{\ell(n)} < n < 3^{\ell(n)+1}, \end{cases}$$
where $I \subseteq \{1,2,3\}^{\ell(n)}$ with $\mathrm{card}(I) = n - 3^{\ell(n)}$ in the first case and $\mathrm{card}(I) = n - 2 \cdot 3^{\ell(n)}$ in the second case, so that $\alpha_n(I)$ always has exactly $n$ elements.
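The construction in Definition 1.2 can be sketched in a few lines of exact rational arithmetic. The code below is only an illustration under stated assumptions: it uses the triadic mappings $S_j(x) = \frac15 x + \frac25(j-1)$ from Section 2, the fact $a(\sigma) = S_\sigma(\frac12)$ from Remark 2.4, and it picks $I$ as the first $\mathrm{card}(I)$ words in lexicographic order, which is a hypothetical choice (any subset of the right cardinality is admissible). The helper names `S`, `a`, `ell` and `alpha_n` are not from the paper.

```python
from fractions import Fraction
from itertools import product

def S(word, x):
    """Apply S_word = S_{w1} o ... o S_{wn} to x, where S_j(x) = x/5 + 2(j-1)/5."""
    for j in reversed(word):            # the innermost map acts first
        x = x / 5 + Fraction(2 * (j - 1), 5)
    return x

def a(*words):
    """Conditional expectation a(w1, ..., wm) of X given the union of the J_w."""
    mass = sum(Fraction(1, 3 ** len(w)) for w in words)
    first = sum(Fraction(1, 3 ** len(w)) * S(w, Fraction(1, 2)) for w in words)
    return first / mass

ALPHA2 = [a((1,), (2, 1)), a((2, 2), (2, 3), (3,))]
ALPHA3 = [a((1,)), a((2,)), a((3,))]

def ell(n):
    """The unique l with 3**l <= n < 3**(l+1)."""
    l = 0
    while 3 ** (l + 1) <= n:
        l += 1
    return l

def alpha_n(n):
    """One admissible set alpha_n(I); I = first card(I) words (illustrative choice)."""
    l = ell(n)
    words = list(product((1, 2, 3), repeat=l))
    if n <= 2 * 3 ** l:                 # first case of Definition 1.2
        k = n - 3 ** l
        coarse = lambda w: [a(w)]
        fine = lambda w: [S(w, p) for p in ALPHA2]
    else:                               # second case of Definition 1.2
        k = n - 2 * 3 ** l
        coarse = lambda w: [S(w, p) for p in ALPHA2]
        fine = lambda w: [S(w, p) for p in ALPHA3]
    points = []
    for i, w in enumerate(words):
        points += fine(w) if i < k else coarse(w)
    return sorted(points)

print(ALPHA2)        # the two points a(1, 21) and a(22, 23, 3)
print(alpha_n(4))    # one of the sets alpha_4(I)
```

Using `Fraction` keeps every point exact, so the cardinality check `len(alpha_n(n)) == n` is a genuine check of the counting in the definition rather than a floating-point approximation.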

In this paper, in Section 3 we show that for all $n \ge 2$, the sets $\alpha_n$ given by Definition 1.2 form the optimal sets of $n$-means for the standard triadic Cantor distribution $P$. In Section 4, we show that the quantization coefficient does not exist though the quantization dimension $D(P)$ exists. Notice that the probability measure $P$ is symmetric about the point $\frac12$, i.e., if two intervals of equal lengths are equidistant from the point $\frac12$, then they have the same probability. Thus, it seems, as is true in the case of the dyadic Cantor distribution (see [GL2]), that if the closed interval $[0, 1]$ is partitioned in the middle, then the conditional expectations of the left half $[0, \frac12]$ and the right half $[\frac12, 1]$ will form the optimal set of two-means. In Proposition 2.6, we show that this is not true in the case of the triadic Cantor distribution. The result in this paper extends the well-known result for the dyadic Cantor distribution given by Graf-Luschgy, and we are grateful to say that the work in this paper was motivated by their work (see [GL2]).

2. Preliminaries

Let $S_j$ for $1 \le j \le 3$ be the contractive similarity mappings on $\mathbb{R}$ given by $S_j(x) = \frac15 x + \frac25 (j-1)$ for all $x \in \mathbb{R}$. For $\sigma := \sigma_1\sigma_2\cdots\sigma_n \in \{1,2,3\}^n$, set $J_\sigma := S_\sigma([0,1])$, where $S_\sigma := S_{\sigma_1} \circ \cdots \circ S_{\sigma_n}$. For the empty word $\emptyset$, write $J := J_\emptyset = S_\emptyset([0,1]) = [0,1]$. Then, the set $C := \bigcap_{n \in \mathbb{N}} \bigcup_{\sigma \in \{1,2,3\}^n} J_\sigma$ is known as the Cantor set generated by the mappings $S_j$, and equals the support of the probability measure $P$ given by $P = \sum_{j=1}^{3} \frac13 P \circ S_j^{-1}$. For $\sigma = \sigma_1\sigma_2\cdots\sigma_n \in \{1,2,3\}^*$, $n \ge 0$, write $p_\sigma := \frac{1}{3^n}$ and $s_\sigma := \frac{1}{5^n}$.

Let us now give the following lemmas. The proofs are similar to those of the analogous lemmas in [GL2].

Lemma 2.1. Let $f : \mathbb{R} \to \mathbb{R}^+$ be Borel measurable and $k \in \mathbb{N}$. Then,
$$\int f \, dP = \frac{1}{3^k} \sum_{\sigma \in \{1,2,3\}^k} \int (f \circ S_\sigma)(x) \, dP(x).$$

Lemma 2.2. Let $X$ be a random variable with probability distribution $P$. Then, $E(X) = \frac12$ and $V := V(X) = \frac19$, and for any $x_0 \in \mathbb{R}$, $\int (x - x_0)^2 \, dP(x) = V + (x_0 - \frac12)^2$.

From Lemma 2.1 and Lemma 2.2 the following corollary is true.

Corollary 2.3. Let $\sigma \in \{1,2,3\}^*$ and $x_0 \in \mathbb{R}$. Then,

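$$\int_{J_\sigma} (x - x_0)^2 \, dP(x) = p_\sigma \Big( s_\sigma^2 V + \big(S_\sigma(\tfrac12) - x_0\big)^2 \Big). \qquad (1)$$

Remark 2.4. From the above lemma it follows that the optimal set of one-mean is the expected value and the corresponding quantization error is the variance $V$ of the random variable $X$. For $\sigma \in \{1,2,3\}^n$, $n \ge 1$, we have $a(\sigma) = E(X : X \in J_\sigma) = S_\sigma(\frac12)$.

Proposition 2.5. Let $\alpha_n := \alpha_n(I)$ be the set given by Definition 1.2. If $3^{\ell(n)} \le n \le 2 \cdot 3^{\ell(n)}$, then the number of such sets is ${}^{3^{\ell(n)}}C_{n - 3^{\ell(n)}}$, and the corresponding distortion error is given by
$$V(P; \alpha_n(I)) = \frac{1}{75^{\ell(n)}} \Big( (2 \cdot 3^{\ell(n)} - n) V + (n - 3^{\ell(n)}) V(P; \alpha_2) \Big).$$
If $2 \cdot 3^{\ell(n)} < n < 3^{\ell(n)+1}$, then the number of such sets is ${}^{3^{\ell(n)}}C_{n - 2 \cdot 3^{\ell(n)}}$, and the corresponding distortion error is given by
$$V(P; \alpha_n(I)) = \frac{1}{75^{\ell(n)}} \Big( (3^{\ell(n)+1} - n) V(P; \alpha_2) + (n - 2 \cdot 3^{\ell(n)}) V(P; \alpha_3) \Big).$$

Proof. If $3^{\ell(n)} \le n \le 2 \cdot 3^{\ell(n)}$, then the subset $I$ of $\{1,2,3\}^{\ell(n)}$ with $\mathrm{card}(I) = n - 3^{\ell(n)}$ can be chosen in ${}^{3^{\ell(n)}}C_{n - 3^{\ell(n)}}$ different ways, and by Lemma 2.1 and Corollary 2.3 the corresponding distortion error is
$$V(P; \alpha_n(I)) = \sum_{\sigma \in \{1,2,3\}^{\ell(n)} \setminus I} \int_{J_\sigma} (x - a(\sigma))^2 \, dP + \sum_{\sigma \in I} \int_{J_\sigma} \min_{a \in S_\sigma(\alpha_2)} (x - a)^2 \, dP$$
$$= \sum_{\sigma \in \{1,2,3\}^{\ell(n)} \setminus I} \frac{1}{3^{\ell(n)}} \frac{1}{25^{\ell(n)}} V + \sum_{\sigma \in I} \frac{1}{3^{\ell(n)}} \frac{1}{25^{\ell(n)}} V(P; \alpha_2) = \frac{1}{75^{\ell(n)}} \Big( (2 \cdot 3^{\ell(n)} - n) V + (n - 3^{\ell(n)}) V(P; \alpha_2) \Big).$$
Similarly, the other part of the proposition can be derived. Thus, the proof of the proposition is complete. □

The following proposition is helpful to find the optimal set of two-means given in Lemma 3.1. It also shows that though the triadic Cantor distribution is uniform and symmetric about the point $\frac12$, the set consisting of the conditional expectations of the left half $[0, \frac12]$ and the right half $[\frac12, 1]$ does not form an optimal set of two-means.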

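Proposition 2.6. Let $a_1 := E(X : X \in [0, \frac12])$, and $a_2 := E(X : X \in [\frac12, 1])$. Then, the set $\gamma := \{a_1, a_2\}$ does not form an optimal set of two-means for $P$.

Proof. By the hypothesis, we have
$$a_1 = E(X : X \in [0, \tfrac12]) = E\big(X : X \in J_1 \cup J_{21} \cup J_{221} \cup \cdots\big), \text{ and}$$
$$a_2 = E(X : X \in [\tfrac12, 1]) = E\big(X : X \in J_3 \cup J_{23} \cup J_{223} \cup \cdots\big),$$
yielding
$$a_1 = 2 \sum_{n=1}^{\infty} \frac{1}{3^n} \cdot \frac12 \cdot \frac{5^n - 4}{5^n} = \frac{3}{14}, \text{ and } a_2 = 2 \sum_{n=1}^{\infty} \frac{1}{3^n} \cdot \frac12 \cdot \frac{5^n + 4}{5^n} = \frac{11}{14},$$
and the corresponding distortion error is given by
$$V(P; \gamma) = 2 \int_{J_1 \cup J_{21} \cup J_{221} \cup J_{2221} \cup \cdots} \Big(x - \frac{3}{14}\Big)^2 dP,$$
implying
$$V(P; \gamma) = 2 \Big( \sum_{n=1}^{\infty} \frac{1}{75^n} V + \sum_{n=1}^{\infty} \frac{1}{3^n} \Big( \frac12 \cdot \frac{5^n - 4}{5^n} - \frac{3}{14} \Big)^2 \Big) = \frac{13}{441}.$$
Let us now consider the set $\beta := \{a(1, 21), a(22, 23, 3)\}$. Since $\frac{11}{25} = S_{21}(1) < \frac12 (a(1, 21) + a(22, 23, 3)) = \frac{117}{250} < \frac{12}{25} = S_{22}(0)$, the Voronoi regions of the two points of $\beta$ contain $J_1 \cup J_{21}$ and $J_{22} \cup J_{23} \cup J_3$, respectively, and formula (1) gives $V(P; \beta) = \frac{821}{28125}$. Since $V(P; \gamma) = \frac{13}{441} > \frac{821}{28125} = V(P; \beta)$, the set $\gamma$ does not form an optimal set of two-means, yielding the proposition. □

3. Optimal sets of n-means and the nth quantization errors for all n ≥ 2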

Let αn be the set given by Definition 1.2. In this section, we show that for all n ≥ 2, the sets αn form the optimal sets of n-means for the standard triadic Cantor distribution P . To calculate the quantization error we will frequently use the formula given by (1).
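Since formula (1) is used repeatedly below, it may help to see it evaluated mechanically. The following is a minimal sketch, not part of the paper: it assumes only Lemma 2.2 ($V = \frac19$), Remark 2.4 ($a(\sigma) = S_\sigma(\frac12)$) and formula (1), and computes the distortion error of a union of basic intervals about its own conditional expectation. The function names `center`, `moments` and `distortion` are hypothetical.

```python
from fractions import Fraction

V = Fraction(1, 9)                      # variance of P (Lemma 2.2)

def center(word):
    """S_word(1/2), i.e. the conditional expectation a(word) of J_word."""
    x = Fraction(1, 2)
    for j in reversed(word):
        x = x / 5 + Fraction(2 * (j - 1), 5)
    return x

def moments(words):
    """Mass, first moment and second moment of P restricted to the union of the J_word."""
    m = m1 = m2 = Fraction(0)
    for w in words:
        p, s, c = Fraction(1, 3 ** len(w)), Fraction(1, 5 ** len(w)), center(w)
        m += p
        m1 += p * c
        m2 += p * (s * s * V + c * c)   # formula (1) with x0 = 0
    return m, m1, m2

def distortion(words):
    """Integral of (x - a)^2 over the union, where a is its conditional expectation."""
    m, m1, m2 = moments(words)
    return m2 - m1 * m1 / m

left = [(1,), (2, 1)]                   # J_1 and J_21
right = [(2, 2), (2, 3), (3,)]          # J_22, J_23 and J_3
v_beta = distortion(left) + distortion(right)
print(v_beta)                           # distortion error of {a(1,21), a(22,23,3)}
print(v_beta < Fraction(13, 441))       # comparison with V(P; gamma) from Proposition 2.6
```

The same helper can be pointed at any of the unions $A_1, A_2, \dots$ appearing in the proof of Lemma 3.1 to re-derive the lower bounds stated there.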

Lemma 3.1. The set $\alpha_2 := \{a(1, 21), a(22, 23, 3)\}$ forms an optimal set of two-means, and the corresponding quantization error is given by $V_2 = \frac{821}{28125}$.

Proof. Consider the set $\beta$ of two points given by $\beta := \{a(1, 21), a(22, 23, 3)\}$. The distortion error due to the set $\beta$ is given by
$$\int \min_{b \in \beta} (x - b)^2 \, dP = \int_{J_1 \cup J_{21}} (x - a(1, 21))^2 \, dP + \int_{J_{22} \cup J_{23} \cup J_3} (x - a(22, 23, 3))^2 \, dP = \frac{821}{28125}.$$
Since $V_2$ is the quantization error for two-means, we have $V_2 \le \frac{821}{28125}$. Let $\alpha := \{a, b\}$ be an optimal set of two-means. Since the optimal quantizers are the expected values of their own Voronoi regions, we have $0 < a < b < 1$. Since $P$ is symmetric about the point $\frac12$, we can assume that $\frac12(a + b) < \frac12$. Suppose that the Voronoi region of $a$ does not contain any point from $J_2$. Then,
$$V_2 \ge \int_{J_2 \cup J_3} (x - a(2, 3))^2 \, dP = \frac{4}{135} > V_2,$$

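which leads to a contradiction. Hence, we can assume that the Voronoi region of $a$ contains points from $J_2$, i.e., $S_2(0) = \frac25 < \frac12(a + b) < \frac12$.

Assume that $S_2(0) = \frac25 < \frac12(a + b) < S_{2112}(0)$. Then, writing $A_1 := J_{2112} \cup J_{2113} \cup J_{212} \cup J_{213} \cup J_{22} \cup J_{23} \cup J_3$, we have
$$V_2 \ge \int_{J_1} (x - a(1))^2 \, dP + \int_{A_1} (x - a(2112, 2113, 212, 213, 22, 23, 3))^2 \, dP = \frac{452551888}{15092578125} > V_2,$$
which leads to a contradiction. Hence, $S_{2112}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{2112}(0) < \frac12(a + b) < S_{2113}(0)$. Then, estimating $V_2$ from below in the same way by the distortion errors of $J_1 \cup J_{2111}$ and of $A_2 := J_{2113} \cup J_{212} \cup J_{213} \cup J_{22} \cup J_{23} \cup J_3$ about their conditional expectations, we obtain a lower bound that is larger than $V_2$, which gives a contradiction. Hence, $S_{2113}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{2113}(0) < \frac12(a + b) < S_{212}(0)$. Then, writing $A_3 := J_{212} \cup J_{213} \cup J_{22} \cup J_{23} \cup J_3$, we have
$$V_2 \ge \int_{J_1 \cup J_{2111} \cup J_{2112}} (x - a(1, 2111, 2112))^2 \, dP + \int_{A_3} (x - a(212, 213, 22, 23, 3))^2 \, dP = \frac{4180404512}{140389453125},$$
yielding $V_2 > V_2$, which is a contradiction. Hence, $S_{212}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{212}(0) < \frac12(a + b) < S_{2122}(0)$. Then, estimating $V_2$ from below by the distortion errors of $J_1 \cup J_{211}$ and of $A_4 := J_{2122} \cup J_{2123} \cup J_{213} \cup J_{22} \cup J_{23} \cup J_3$ about their conditional expectations, we again obtain a lower bound that is larger than $V_2$, which is a contradiction. Hence, $S_{2122}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{2122}(0) < \frac12(a + b) < S_{2123}(0)$. Then, writing $A_5 := J_{2123} \cup J_{213} \cup J_{22} \cup J_{23} \cup J_3$, we have
$$V_2 \ge \int_{J_1 \cup J_{211} \cup J_{2121}} (x - a(1, 211, 2121))^2 \, dP + \int_{A_5} (x - a(2123, 213, 22, 23, 3))^2 \, dP = \frac{12733641776}{432558984375},$$
yielding $V_2 > V_2$, which is a contradiction. Hence, $S_{2123}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{2123}(0) < \frac12(a + b) < S_{213}(0)$. Then, estimating $V_2$ from below by the distortion errors of $A_6 := J_1 \cup J_{211} \cup J_{2121} \cup J_{2122}$ and of $A_7 := J_{213} \cup J_{22} \cup J_{23} \cup J_3$ about their conditional expectations, we obtain a lower bound that is larger than $V_2$, which gives a contradiction. Hence, $S_{213}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{213}(0) < \frac12(a + b) < S_{21313}(0)$. Then, writing $A_8 = J_1 \cup J_{211} \cup J_{212}$, $a_8 = E(X : X \in A_8)$, $A_9 = J_{21313} \cup J_{2132} \cup J_{2133} \cup J_{22} \cup J_{23} \cup J_3$, and $a_9 = E(X : X \in A_9)$, we have
$$V_2 \ge \int_{A_8} (x - a_8)^2 \, dP + \int_{A_9} (x - a_9)^2 \, dP = \frac{489226058779}{16680146484375} > V_2,$$
which leads to a contradiction. Hence, $S_{21313}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{21313}(0) < \frac12(a + b) < S_{2132}(0)$. Then, with the corresponding unions $A_{10}$ and $A_{11}$ of basic intervals and their conditional expectations $a_{10}$ and $a_{11}$, the lower bound $\int_{A_{10}} (x - a_{10})^2 \, dP + \int_{A_{11}} (x - a_{11})^2 \, dP$ for $V_2$ is larger than $V_2$, which leads to a contradiction. Hence, $S_{2132}(0) < \frac12(a + b) < \frac12$.

Assume that $S_{2132}(0) < \frac12(a + b) < S_{2133}(0)$. We partition the interval $[S_{2132}(0), S_{2133}(0)]$ into the following three subintervals: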

$$[S_{21321}(0), S_{21322}(0)], \quad [S_{21322}(0), S_{21323}(0)], \quad [S_{21323}(0), S_{2133}(0)].$$
Then, $\frac12(a + b)$ belongs to one of the above three subintervals. First, assume that $\frac12(a + b) \in [S_{21321}(0), S_{21322}(0)]$. Then, as before we can show that the distortion error is larger than $0.0291911 \ge V_2$, which leads to a contradiction. To show that for $\frac12(a + b) \in [S_{21321}(0), S_{21322}(0)]$ the distortion error is larger, we may need to partition the subinterval $[S_{21321}(0), S_{21322}(0)]$ into the following three sub-subintervals:
$$[S_{213211}(0), S_{213212}(0)], \quad [S_{213212}(0), S_{213213}(0)], \quad [S_{213213}(0), S_{21322}(0)],$$
and then we can separately consider the cases
$$\tfrac12(a + b) \in [S_{213211}(0), S_{213212}(0)], \ [S_{213212}(0), S_{213213}(0)], \text{ or } [S_{213213}(0), S_{21322}(0)].$$
If needed, we can further partition the above sub-subintervals to check that the distortion error is larger. Similarly, we can show that if $\frac12(a + b)$ belongs to either $[S_{21322}(0), S_{21323}(0)]$ or $[S_{21323}(0), S_{2133}(0)]$, then a contradiction arises. Therefore, $S_{21313}(0) < \frac12(a + b) < S_{2132}(0)$ cannot happen. Using similar arguments, we can show that neither $S_{2132}(0) < \frac12(a + b) < S_{2133}(0)$, nor $S_{2133}(0) < \frac12(a + b) < S_{2133}(1)$ can happen. Hence, we can assume that $S_{21}(1) < \frac12(a + b) < \frac12$.

Assume that $S_{22222}(0) < \frac12(a + b) < \frac12$. Then, writing $A_{12} = J_1 \cup J_{21} \cup J_{221} \cup J_{2221} \cup J_{22221}$ and $a_{12} = E(X : X \in A_{12})$, and by Proposition 2.6, noting the fact that $\int_{[\frac12, 1]} (x - E(X : X \in [\frac12, 1]))^2 \, dP = \frac{13}{882}$, we have
$$V_2 \ge \int_{A_{12}} (x - a_{12})^2 \, dP + \frac{13}{882} = \frac{61346429393}{2093027343750} > V_2,$$
which gives a contradiction. Hence, $S_{21}(1) < \frac12(a + b) < S_{22222}(0)$.

Assume that $S_{222212}(0) < \frac12(a + b) < S_{22222}(0)$. Then, with the corresponding unions $A_{13}$ and $A_{14}$ of basic intervals and their conditional expectations $a_{13}$ and $a_{14}$, the lower bound $\int_{A_{13}} (x - a_{13})^2 \, dP + \int_{A_{14}} (x - a_{14})^2 \, dP$ for $V_2$ is larger than $V_2$, which is a contradiction. Hence, $S_{21}(1) < \frac12(a + b) < S_{222212}(0)$. Proceeding in this way, and using the similar technique of partitioning the intervals into subintervals, as mentioned in the previous paragraph, we can show that $S_{22}(0) < \frac12(a + b) < S_{222212}(0)$ cannot happen. Hence, $S_{21}(1) < \frac12(a + b) < S_{22}(0)$. Thus, the set $\alpha_2 := \{a(1, 21), a(22, 23, 3)\}$ forms an optimal set of two-means, and the corresponding quantization error is given by $V_2 = \frac{821}{28125}$. Hence, the proof of the lemma is complete. □
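The case analysis above can be cross-checked numerically, though not replaced, by a brute-force scan. The sketch below is an illustration only and makes two assumptions that are not in the paper: the boundary between the two Voronoi regions is restricted to gaps between basic intervals of a fixed level, and each side is represented by its conditional expectation. Under these assumptions it evaluates every left/right split exactly with formula (1); the names `center` and `best_two_split` are hypothetical.

```python
from fractions import Fraction
from itertools import product

V = Fraction(1, 9)  # variance of P (Lemma 2.2)

def center(word):
    """Midpoint S_word(1/2) of the basic interval J_word."""
    x = Fraction(1, 2)
    for j in reversed(word):
        x = x / 5 + Fraction(2 * (j - 1), 5)
    return x

def best_two_split(level):
    """Scan every split of the level-`level` basic intervals into a left and a right block."""
    p, s2 = Fraction(1, 3 ** level), Fraction(1, 25 ** level)
    cells = []
    for w in product((1, 2, 3), repeat=level):   # lexicographic order = left to right
        c = center(w)
        cells.append((p, p * c, p * (s2 * V + c * c)))  # mass, 1st and 2nd moment
    total = [sum(cell[i] for cell in cells) for i in range(3)]
    best, acc = None, [Fraction(0)] * 3
    for cell in cells[:-1]:
        acc = [u + v for u, v in zip(acc, cell)]
        rest = [t - u for t, u in zip(total, acc)]
        # distortion of each block about its own conditional expectation
        err = sum(m2 - m1 * m1 / m for m, m1, m2 in (acc, rest))
        if best is None or err < best[0]:
            best = (err, acc[0])                 # keep the first minimiser found
    return best

error, left_mass = best_two_split(4)
print(error, left_mass)
```

If Lemma 3.1 holds, the scan should stop at the split $J_1 \cup J_{21}$ versus $J_{22} \cup J_{23} \cup J_3$, i.e. left mass $\frac49$; the mirror-image split has the same error by symmetry and is found later in the scan.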

Corollary 3.2. Let $\alpha_2$ be an optimal set of two-means. Then, for $1 \le j \le 3$, we have
$$\int_{J_j} \min_{a \in S_j(\alpha_2)} (x - a)^2 \, dP = \frac{1}{75} V_2.$$

Proof. We have
$$\int_{J_j} \min_{a \in S_j(\alpha_2)} (x - a)^2 \, dP = \frac13 \int_{J_j} \min_{a \in S_j(\alpha_2)} (x - a)^2 \, d(P \circ S_j^{-1}) = \frac13 \int \min_{a \in S_j(\alpha_2)} (S_j(x) - a)^2 \, dP$$
$$= \frac13 \int \min_{a \in \alpha_2} (S_j(x) - S_j(a))^2 \, dP = \frac{1}{75} \int \min_{a \in \alpha_2} (x - a)^2 \, dP = \frac{1}{75} V_2.$$
Thus, the proof of the corollary is complete. □

Lemma 3.3. The set $\alpha_3 := \{S_1(\frac12), S_2(\frac12), S_3(\frac12)\}$ forms an optimal set of three-means, and the corresponding quantization error is given by $V_3 = \frac{1}{225}$.

Proof. Let $\beta$ be a set of three points such that $\beta := \{S_j(\frac12) : j = 1, 2, 3\}$. Then,
$$\int \min_{b \in \beta} (x - b)^2 \, dP = \sum_{j=1}^{3} \int_{J_j} \big(x - S_j(\tfrac12)\big)^2 \, dP = \frac{1}{225}.$$
Since $V_3$ is the quantization error for three-means, we have $V_3 \le \frac{1}{225}$. Let $\alpha := \{a_1, a_2, a_3\}$ be an optimal set of three-means. Since the elements in an optimal set are the centroids of their

own Voronoi regions, without any loss of generality, we can assume that $0 < a_1 < a_2 < a_3 < 1$. We now prove that $\alpha \cap J_j \ne \emptyset$ for all $1 \le j \le 3$. Suppose that $\alpha \cap J_1 = \emptyset$. Then,
$$V_3 \ge \int_{J_1} \Big(x - \frac15\Big)^2 dP = \frac{13}{2700} > V_3,$$
which is a contradiction. So, we can assume that $\alpha \cap J_1 \ne \emptyset$. Similarly, $\alpha \cap J_3 \ne \emptyset$. We now show that $\alpha \cap J_2 \ne \emptyset$. Suppose that $\alpha \cap J_2 = \emptyset$. Then, either $a_2 < \frac25$, or $a_2 > \frac35$. Assume that $a_2 < \frac25$. Then, as $\frac12(\frac25 + \frac45) = \frac35$, we have
$$V_3 \ge \int_{J_2} \Big(x - \frac25\Big)^2 dP + \int_{J_3} \big(x - S_3(\tfrac12)\big)^2 dP = \frac{17}{2700} > V_3,$$
which is a contradiction. Similarly, we can show that if $\frac35 < a_2$, then a contradiction arises. Thus, we can assume that $a_2 \in J_2$, i.e., $\alpha \cap J_2 \ne \emptyset$. Now, if the Voronoi region of $a_1$ contains points from $J_2$, we have $\frac12(a_1 + a_2) > \frac25$, implying $a_2 > \frac45 - a_1 \ge \frac45 - \frac15 = \frac35$, which is a contradiction as $a_2 \in J_2$. Thus, the Voronoi region of $a_1$ does not contain any point from $J_2$. Similarly, the Voronoi region of $a_2$ does not contain any point from $J_1$ and $J_3$. Likewise, the Voronoi region of $a_3$ does not contain any point from $J_2$. Since the optimal quantizers are the centroids of their own Voronoi regions, we have $a_1 = S_1(\frac12)$, $a_2 = S_2(\frac12)$, and $a_3 = S_3(\frac12)$, and the corresponding quantization error is given by $V_3 = \frac{1}{225}$. Thus, the proof of the lemma is complete. □

Remark 3.4. By Lemma 3.3, we see that the set $\{S_j(\frac12) : 1 \le j \le 3\}$ forms an optimal set of three-means. Similarly, we can show that the sets $\{S_1(\frac12), S_2(\frac12)\} \cup S_3(\alpha_2)$, $\{S_1(\frac12)\} \cup S_2(\alpha_2) \cup S_3(\alpha_2)$, $S_1(\alpha_2) \cup S_2(\alpha_2) \cup S_3(\alpha_2)$, $S_1(\alpha_3) \cup S_2(\alpha_2) \cup S_3(\alpha_2)$, and $S_1(\alpha_3) \cup S_2(\alpha_3) \cup S_3(\alpha_2)$ form optimal sets of $n$-means for $n = 4, 5, 6, 7, 8$, respectively. Due to the technicality of the proofs, we do not show them in the paper.
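The distortion errors of the sets listed in Remark 3.4 follow from the scaling argument of Corollary 3.2: placing a scaled copy of an optimal set of $k$-means inside one $J_j$ contributes $\frac{1}{75}V_k$. The sketch below tabulates them under that assumption; the dictionary `configs` simply records how many points each listed set places in $J_1, J_2, J_3$, and the names `COST`, `configs` and `distortion` are illustrative rather than taken from the paper.

```python
from fractions import Fraction

V1 = Fraction(1, 9)            # V, quantization error for one-mean (Remark 2.4)
V2 = Fraction(821, 28125)      # Lemma 3.1
V3 = Fraction(1, 225)          # Lemma 3.3
COST = {1: V1, 2: V2, 3: V3}

# Number of points that each set in Remark 3.4 places in J_1, J_2, J_3.
configs = {
    3: (1, 1, 1),
    4: (1, 1, 2),
    5: (1, 2, 2),
    6: (2, 2, 2),
    7: (3, 2, 2),
    8: (3, 3, 2),
}

def distortion(config):
    """Sum of the per-interval contributions V_k / 75 (scaling as in Corollary 3.2)."""
    return sum(COST[k] for k in config) / 75

for n, config in configs.items():
    print(n, distortion(config))

# The two lower bounds used in the proof of Lemma 3.3 indeed exceed V3.
print(Fraction(13, 2700) > V3, Fraction(17, 2700) > V3)
```

For $n = 3$ the table reproduces $V_3 = \frac{1}{225}$, which is a quick consistency check between Lemma 3.3 and the scaling rule.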

Proposition 3.5. Let $\alpha_n$ be an optimal set of $n$-means for any $n \ge 3$. Then, $\alpha_n \cap J_j \ne \emptyset$ for all $1 \le j \le 3$, and $\alpha_n$ does not contain any point from the open intervals $(\frac15, \frac25)$ and $(\frac35, \frac45)$. Moreover, the Voronoi region of any point in $\alpha_n \cap J_j$ does not contain any point from $J_i$, where $1 \le i \ne j \le 3$.

Proof. Due to Lemma 3.3 and Remark 3.4, the proposition is true for $3 \le n \le 8$. Let us now prove that the proposition is true for $n \ge 9$. Let $\alpha_n := \{a_1, a_2, \cdots, a_n\}$ be an optimal set of $n$-means for $n \ge 9$. Since the points in an optimal set are the centroids of their own Voronoi regions, without any loss of generality, we can assume that $0 < a_1 < a_2 < \cdots < a_n < 1$. Consider the set of nine elements $\beta := \{S_\sigma(\frac12) : \sigma \in \{1, 2, 3\}^2\}$. Then,
$$\int \min_{a \in \beta} (x - a)^2 \, dP = \sum_{\sigma \in \{1,2,3\}^2} \int_{J_\sigma} (x - a(\sigma))^2 \, dP = \frac{1}{25^2} V = \frac{1}{5625}.$$
Since $V_n$ is the quantization error for $n$-means for $n \ge 9$, we have $V_n \le V_9 \le \frac{1}{5625}$. Suppose that $\frac15 < a_1$. Then,
$$V_n \ge \int_{J_1} \Big(x - \frac15\Big)^2 dP = \frac{13}{2700} > V_n,$$
which is a contradiction. So, we can assume that $a_1 \le \frac15$. Similarly, $\frac45 \le a_n$. Thus, $\alpha_n \cap J_1 \ne \emptyset$, and $\alpha_n \cap J_3 \ne \emptyset$. We now show that $\alpha_n \cap J_2 \ne \emptyset$. For the sake of contradiction, assume that $\alpha_n \cap J_2 = \emptyset$. Let $a_j := \max\{a_i : a_i < \frac25 \text{ for } 1 \le i \le n - 1\}$. Then, $a_j < \frac25$. As $\alpha_n \cap J_2 = \emptyset$, we

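have $\frac35 < a_{j+1}$. Thus, using the symmetry and the formula given by (1), and writing $2^{k-1}1$ for the word consisting of $k - 1$ twos followed by a one, we have
$$V_n \ge 2 \sum_{k=2}^{\infty} \int_{J_{2^{k-1}1}} \Big(x - \frac25\Big)^2 dP = 2 \sum_{k=2}^{\infty} \frac{1}{3^k} \Big( \frac{1}{25^k} V + \big(S_{2^{k-1}1}(\tfrac12) - \tfrac25\big)^2 \Big)$$
$$= 2 \sum_{k=2}^{\infty} \Big( \frac{1}{75^k} V + \frac{1}{3^k} \Big( \big(S_{2^{k-1}1}(\tfrac12)\big)^2 - \frac45 S_{2^{k-1}1}(\tfrac12) + \frac{4}{25} \Big) \Big)$$
$$= 2 \sum_{k=2}^{\infty} \frac{1}{75^k} V + 2 \sum_{k=2}^{\infty} \frac{1}{3^k} \Big( \frac{5^k - 4}{5^{k-1} \cdot 10} \Big)^2 - \frac85 \sum_{k=2}^{\infty} \frac{1}{3^k} \cdot \frac{5^k - 4}{5^{k-1} \cdot 10} + \frac{8}{25} \sum_{k=2}^{\infty} \frac{1}{3^k}$$
$$= \frac{19}{18900} > V_n,$$
which gives a contradiction. Hence, we can conclude that $\alpha_n \cap J_2 \ne \emptyset$. Next, suppose that $\alpha_n$ contains a point from the open interval $(\frac15, \frac25)$. Let $a_j := \max\{a_i : a_i < \frac15 \text{ for } 1 \le i \le n - 2\}$. Then, $a_{j+1} \in (\frac15, \frac25)$, and $a_{j+2} \in J_2$. The following cases can arise:

Case 1. $\frac15 < a_{j+1} \le \frac{3}{10}$.
Then, $\frac12(a_{j+1} + a_{j+2}) > \frac25$, implying $a_{j+2} > \frac45 - a_{j+1} \ge \frac45 - \frac{3}{10} = \frac12$, implying, as before,
$$V_n \ge \sum_{k=2}^{\infty} \int_{J_{2^{k-1}1}} \Big(x - \frac12\Big)^2 dP = \sum_{k=2}^{\infty} \frac{1}{75^k} V + \sum_{k=2}^{\infty} \frac{1}{3^k} \Big( \frac{5^k - 4}{5^{k-1} \cdot 10} \Big)^2 - \sum_{k=2}^{\infty} \frac{1}{3^k} \cdot \frac{5^k - 4}{5^{k-1} \cdot 10} + \frac14 \sum_{k=2}^{\infty} \frac{1}{3^k} = \frac{1}{1350} > V_n,$$
which leads to a contradiction.

Case 2. $\frac{3}{10} \le a_{j+1} < \frac25$.
Then, $\frac12(a_j + a_{j+1}) < \frac15$, implying $a_j \le \frac25 - a_{j+1} \le \frac25 - \frac{3}{10} = \frac{1}{10}$. Then,
$$V_n \ge \int_{J_{13}} \Big(x - \frac{1}{10}\Big)^2 dP = \frac{37}{50625} > V_n,$$
which yields a contradiction.

Thus, by Case 1 and Case 2, we can conclude that $\alpha_n$ does not contain any point from the open interval $(\frac15, \frac25)$. Reflecting the situation with respect to the point $\frac12$, we can conclude that $\alpha_n$ does not contain any point from the open interval $(\frac35, \frac45)$ as well. To prove the last part of the proposition, we proceed as follows: let $a_j := \max\{a_i : a_i < \frac15 \text{ for } 1 \le i \le n - 2\}$. Then, $a_j$ is the rightmost element in $\alpha_n \cap J_1$, and $a_{j+1} \in \alpha_n \cap J_2$. Suppose that the Voronoi region of $a_j$ contains points from $J_2$. Then, $\frac12(a_j + a_{j+1}) > \frac25$, implying $a_{j+1} > \frac45 - a_j \ge \frac45 - \frac15 = \frac35$, which yields a contradiction as $a_{j+1} \in J_2$. Thus, the Voronoi region of any point in $\alpha_n \cap J_1$ does not contain any point from $J_2$, and so none from $J_3$ as well. Similarly, we can prove that the Voronoi region of any point in $\alpha_n \cap J_2$ does not contain any point from $J_1$ and $J_3$, and the Voronoi region of any point in $\alpha_n \cap J_3$ does not contain any point from $J_1$ and $J_2$. Thus, the proof of the proposition is complete. □

The following lemma is similar to Lemma 4.5 in [GL2].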

Lemma 3.6. Let $n \ge 3$, and let $\alpha_n$ be an optimal set of $n$-means such that $\alpha_n \cap J_j \ne \emptyset$ for all $1 \le j \le 3$, and $\alpha_n$ does not contain any point from the open intervals $(\frac15, \frac25)$ and $(\frac35, \frac45)$. Further assume that the Voronoi region of any point in $\alpha_n \cap J_j$ does not contain any point from $J_i$, where $1 \le i \ne j \le 3$. Set $\beta_j := \alpha_n \cap J_j$, and $n_j := \mathrm{card}(\beta_j)$ for $1 \le j \le 3$. Then, $S_j^{-1}(\beta_j)$ is an optimal set of $n_j$-means, and $V_n = \frac{1}{75}(V_{n_1} + V_{n_2} + V_{n_3})$.

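Proof. By the hypothesis, $\beta_j \ne \emptyset$ for all $1 \le j \le 3$, and $\alpha_n$ does not contain any point from the open intervals $(\frac15, \frac25)$ and $(\frac35, \frac45)$. Hence, $\alpha_n = \bigcup_{j=1}^{3} \beta_j$. Since $\alpha_n$ is an optimal set of $n$-means,
$$V_n = \sum_{j=1}^{3} \int_{J_j} \min_{a \in \alpha_n} \|x - a\|^2 \, dP = \sum_{j=1}^{3} \int_{J_j} \min_{a \in \beta_j} \|x - a\|^2 \, dP.$$
Now, using Lemma 2.1 we have
$$V_n = \frac{1}{75} \sum_{j=1}^{3} \int \min_{a \in \beta_j} \|x - S_j^{-1}(a)\|^2 \, dP = \frac{1}{75} \sum_{j=1}^{3} \int \min_{a \in S_j^{-1}(\beta_j)} \|x - a\|^2 \, dP. \qquad (2)$$
If $S_1^{-1}(\beta_1)$ is not an optimal set of $n_1$-means, then we can find a set $\gamma_1 \subset \mathbb{R}$ with $\mathrm{card}(\gamma_1) = n_1$ such that
$$\int \min_{a \in \gamma_1} \|x - a\|^2 \, dP < \int \min_{a \in S_1^{-1}(\beta_1)} \|x - a\|^2 \, dP.$$
But, then $S_1(\gamma_1) \cup \beta_2 \cup \beta_3$ will be a set of cardinality $n$, and
$$\int \min\{\|x - a\|^2 : a \in S_1(\gamma_1) \cup \beta_2 \cup \beta_3\} \, dP$$
$$\le \int_{J_1} \min_{a \in S_1(\gamma_1)} \|x - a\|^2 \, dP + \frac{1}{75} \sum_{j=2}^{3} \int \min_{a \in S_j^{-1}(\beta_j)} \|x - a\|^2 \, dP$$
$$= \frac{1}{75} \int \min_{a \in S_1(\gamma_1)} \|x - S_1^{-1}(a)\|^2 \, dP + \frac{1}{75} \sum_{j=2}^{3} \int \min_{a \in S_j^{-1}(\beta_j)} \|x - a\|^2 \, dP$$
$$= \frac{1}{75} \int \min_{a \in \gamma_1} \|x - a\|^2 \, dP + \frac{1}{75} \sum_{j=2}^{3} \int \min_{a \in S_j^{-1}(\beta_j)} \|x - a\|^2 \, dP$$
$$< \frac{1}{75} \int \min_{a \in S_1^{-1}(\beta_1)} \|x - a\|^2 \, dP + \frac{1}{75} \sum_{j=2}^{3} \int \min_{a \in S_j^{-1}(\beta_j)} \|x - a\|^2 \, dP.$$
Thus by (2), we have $\int \min\{\|x - a\|^2 : a \in S_1(\gamma_1) \cup \beta_2 \cup \beta_3\} \, dP < V_n$, which contradicts the fact that $\alpha_n$ is an optimal set of $n$-means.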

Without any loss of generality, we can assume that $n_1 \ge n_2 \ge n_3$. Let $p, q, r \in \mathbb{N}$ be such that
$$3^p \le n_1 \le 2 \cdot 3^p, \quad 3^q \le n_2 \le 2 \cdot 3^q, \quad \text{and} \quad 3^r \le n_3 \le 2 \cdot 3^r. \qquad (4)$$
We will show that $p = q = r = m - 1$. Since $n_1 \ge n_2 \ge n_3$, we have $n_1 \ge 3^{m-1}$, and $n_3 \le 2 \cdot 3^{m-1}$, implying $r \le m - 1 \le p$. If $\tilde V_n$ is the distortion error due to the set $\{a(\sigma) : \sigma \in \{1, 2, 3\}^m \setminus I\} \cup (\cup_{\sigma \in I} S_\sigma(\alpha_2))$, by Proposition 2.5, we have
$$\tilde V_n = \frac{1}{75^m} \big( (2 \cdot 3^m - n) V + (n - 3^m) V_2 \big).$$

Thus, by the induction hypothesis, (3) implies $\tilde V_n \ge V_n$, yielding
$$\frac{1}{75^m} \big( (2 \cdot 3^m - n) V + (n - 3^m) V_2 \big) \ge \frac{1}{75^{p+1}} \big( (2 \cdot 3^p - n_1) V + (n_1 - 3^p) V_2 \big)$$
$$+ \frac{1}{75^{q+1}} \big( (2 \cdot 3^q - n_2) V + (n_2 - 3^q) V_2 \big) + \frac{1}{75^{r+1}} \big( (2 \cdot 3^r - n_3) V + (n_3 - 3^r) V_2 \big),$$
which upon simplification gives
$$3 \Big( (2V - V_2) - \frac{n}{3^m} (V - V_2) \Big) \ge 25^{m-p-1} \Big( (2V - V_2) - \frac{n_1}{3^p} (V - V_2) \Big)$$
$$+ 25^{m-q-1} \Big( (2V - V_2) - \frac{n_2}{3^q} (V - V_2) \Big) + 25^{m-r-1} \Big( (2V - V_2) - \frac{n_3}{3^r} (V - V_2) \Big). \qquad (5)$$
Hence, using the bounds of $\frac{n}{3^m}$, $\frac{n_1}{3^p}$, $\frac{n_2}{3^q}$, and $\frac{n_3}{3^r}$, i.e., putting $\frac{n}{3^m} = 1$, and $\frac{n_1}{3^p} = \frac{n_2}{3^q} = \frac{n_3}{3^r} = 2$, from the above inequality, we obtain
$$11.419 = \frac{3V}{V_2} \ge 25^{m-p-1} + 25^{m-q-1} + 25^{m-r-1}. \qquad (6)$$
Recall that $r \le m - 1 \le p$. Moreover, $p > m$ is not possible. Thus, from (6), to obtain the values of $p$, $q$, and $r$, we proceed as follows:

(i) If $p = r = m - 1$, then (6) implies that $q = m - 1$.

(ii) If $p = m - 1$ and $r = m - 2$, then $11.419 \ge 1 + 25^{m-q-1} + 25$, which gives a contradiction; in fact, a contradiction arises for $p = m - 1$ and any $r < m - 1$.

(iii) If $p = m$ and $r = m - 1$, then (6) implies that $q \ge m - 1$. Notice that if $q = m$, then $n > n_1 + n_2 \ge 3^m + 3^m = 2 \cdot 3^m$, which is a contradiction. Also, $q > m$ is not possible. So, $q = m - 1$ is the only choice, but then also a contradiction arises as shown below: for $p = m$, $q = m - 1$, and $r = m - 1$, (5) implies
$$3 \Big( (2V - V_2) - \frac{n}{3^m} (V - V_2) \Big) \ge (2V - V_2) \Big( \frac{1}{25} + 2 \Big) - (V - V_2) \Big( \frac{1}{25} \frac{n_1}{3^m} + \frac{n_2}{3^{m-1}} + \frac{n_3}{3^{m-1}} \Big),$$
i.e.,
$$\frac{24}{25} (2V - V_2) \ge (V - V_2) \Big( \frac{n}{3^{m-1}} - \frac{n_2}{3^{m-1}} - \frac{n_3}{3^{m-1}} - \frac{1}{25} \frac{n_1}{3^m} \Big),$$
which, since $n - n_2 - n_3 = n_1$, upon simplification yields
$$\frac{n_1}{3^m} \le \frac{12}{37} \cdot \frac{2V - V_2}{V - V_2} = \frac{5429}{7104} < 1,$$
which is a contradiction because $3^p \le n_1 \le 2 \cdot 3^p$, and $p = m$.

(iv) If $p = m$ and $r = m - 2$, then (6) implies $11.419 \ge \frac{1}{25} + 25^{m-q-1} + 25$, which is not true. In fact, a contradiction arises for $p = m$ and any $r < m - 1$.

Hence, we can conclude that $p = q = r = m - 1$. Since by Lemma 3.6, $S_j^{-1}(\alpha_n \cap J_j)$ is an optimal set of $n_j$-means, where $3^{m-1} \le n_j \le 2 \cdot 3^{m-1}$, we have
$$S_j^{-1}(\alpha_n \cap J_j) = \{a(\sigma) : \sigma \in \{1, 2, 3\}^{m-1} \setminus I_j\} \cup \big( \cup_{\sigma \in I_j} S_\sigma(\alpha_2) \big),$$
where $I_j \subseteq \{1, 2, 3\}^{m-1}$ with $\mathrm{card}(I_j) = n_j - 3^{m-1}$ for $1 \le j \le 3$. Hence,
$$\alpha_n := \alpha_n(I) = \bigcup_{j=1}^{3} S_j\big( S_j^{-1}(\alpha_n \cap J_j) \big) = \{a(\sigma) : \sigma \in \{1, 2, 3\}^{\ell(n)} \setminus I\} \cup \big( \cup_{\sigma \in I} S_\sigma(\alpha_2) \big),$$
where $I \subseteq \{1, 2, 3\}^m$ with $\mathrm{card}(I) = n - 3^m$, is an optimal set of $n$-means. The corresponding quantization error is
$$V_n = \frac{1}{75^m} \big( (2 \cdot 3^m - n) V + (n - 3^m) V_2 \big) = V(P; \alpha_n(I)),$$
where $V(P; \alpha_n(I))$ is given by Proposition 2.5. Thus, the theorem is true if $3^m \le n \le 2 \cdot 3^m$. Similarly, we can prove that the theorem is true if $2 \cdot 3^m < n < 3^{m+1}$.
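The recursive structure behind this argument, $V_n = \frac{1}{75}(V_{n_1} + V_{n_2} + V_{n_3})$ from Lemma 3.6, can be compared with the closed formula by a small exact computation. The sketch below is illustrative only and rests on explicit assumptions: it minimises over all splits $n = n_1 + n_2 + n_3$ with $n_j \ge 1$, starting from $V_1 = V = \frac19$ and $V_2 = \frac{821}{28125}$, and for the range $2 \cdot 3^{\ell(n)} < n < 3^{\ell(n)+1}$ it uses the expression $\frac{1}{75^{\ell(n)}}\big((3^{\ell(n)+1} - n)V_2 + (n - 2 \cdot 3^{\ell(n)})V_3\big)$, which is obtained here by counting the pieces in Definition 1.2. The names `v_rec` and `v_closed` are hypothetical.

```python
from fractions import Fraction
from functools import lru_cache

V1 = Fraction(1, 9)            # V_1 = V (Remark 2.4)
V2 = Fraction(821, 28125)      # Lemma 3.1
V3 = Fraction(1, 225)          # Lemma 3.3

@lru_cache(maxsize=None)
def v_rec(n):
    """Best value of (V_{n1} + V_{n2} + V_{n3}) / 75 over all splits n = n1 + n2 + n3."""
    if n == 1:
        return V1
    if n == 2:
        return V2
    return min(
        v_rec(n1) + v_rec(n2) + v_rec(n - n1 - n2)
        for n1 in range(1, n - 1)
        for n2 in range(1, n - n1)
    ) / 75

def v_closed(n):
    """Closed formula for V_n in the two ranges of Definition 1.2."""
    l = 0
    while 3 ** (l + 1) <= n:
        l += 1
    if n <= 2 * 3 ** l:
        return ((2 * 3 ** l - n) * V1 + (n - 3 ** l) * V2) / 75 ** l
    return ((3 ** (l + 1) - n) * V2 + (n - 2 * 3 ** l) * V3) / 75 ** l

for n in (3, 4, 7, 9, 10, 19, 27):
    print(n, v_rec(n), v_rec(n) == v_closed(n))
```

Agreement of the two functions on a range of $n$ does not prove the theorem, but it is a cheap way to catch arithmetic slips in the constants $V$, $V_2$ and $V_3$.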

$$f(x) = x^{2/\beta} \big( (2V - V_2) - x(V - V_2) \big) = \frac{1}{28125} (5429 - 2304x) x^{2/\beta}.$$
Then,
(i) $1 < \frac{5429}{1152(\beta+2)} < 2$, and
(ii) $f\big(\frac{5429}{1152(\beta+2)}\big) > f(2) > f(1)$, and $f([1, 2]) = \big[f(1), f\big(\frac{5429}{1152(\beta+2)}\big)\big]$.

Proof. Notice that $f$ is a continuous function on the closed interval $[1, 2]$, and
$$f'(x) = \frac{x^{\frac{2}{\beta} - 1} \big(10858 - 2304(\beta + 2)x\big)}{28125\beta}.$$
$f'(x) = 0$ implies that $x = \frac{5429}{1152(\beta+2)}$. Moreover, $f'(1) = -\frac{2(1152\beta - 3125)}{28125\beta} > 0$, and $f'(2) = \frac{4^{1/\beta}(821 - 2304\beta)}{28125\beta} < 0$. Thus, $f$ is maximum at $x = \frac{5429}{1152(\beta+2)}$. Again, $f(2) = \frac{821}{28125} 4^{1/\beta} > \frac19 = f(1)$. Hence, $1 < \frac{5429}{1152(\beta+2)} < 2$, and $f\big(\frac{5429}{1152(\beta+2)}\big) > f(2) > f(1)$, and $f([1, 2]) = \big[f(1), f\big(\frac{5429}{1152(\beta+2)}\big)\big]$. Thus, the proof of the lemma is complete. □

Theorem 4.2. The $\beta$-dimensional quantization coefficient does not exist.

Proof. Let $M$ be the point of $[1, 2]$ at which $f$ attains its maximum over $[1, 2]$. Then, by Lemma 4.1, we have $M = \frac{5429}{1152(\beta+2)}$. Let $(n_k)_{k \in \mathbb{N}}$ be a subsequence of the set of natural numbers such that $3^{\ell(n_k)} \le n_k < 2 \cdot 3^{\ell(n_k)}$. The assertion of the theorem will follow if we show that the set of accumulation points of the sequence $(n_k^{2/\beta} V_{n_k})_{k \ge 1}$ is $[f(1), f(M)]$. Let $y \in [f(1), f(M)]$; then $y = f(x)$ for some $x \in [1, 2]$. Set $n_{k_\ell} = \lfloor x 3^\ell \rfloor$, where $\lfloor x 3^\ell \rfloor$ denotes the greatest integer less than or equal to $x 3^\ell$. Then, $n_{k_\ell} < n_{k_{\ell+1}}$ and $\ell(n_{k_\ell}) = \ell$, where by $\ell(n_{k_\ell}) = \ell$ it is meant that $3^\ell \le n_{k_\ell} < 2 \cdot 3^\ell$. Notice that then there exists $x_{k_\ell} \in [1, 2]$ such that $n_{k_\ell} = x_{k_\ell} 3^\ell$. Recall that $3^{1/\beta} = 5$, and if $3^{\ell(n)} \le n \le 2 \cdot 3^{\ell(n)}$, then by Theorem 3.7, we have
$$V_n = \frac{1}{75^{\ell(n)}} \big( (2 \cdot 3^{\ell(n)} - n) V + (n - 3^{\ell(n)}) V_2 \big).$$

Thus, putting the values of $n_{k_\ell}$ and $V_{n_{k_\ell}}$, we obtain
$$n_{k_\ell}^{2/\beta} V_{n_{k_\ell}} = n_{k_\ell}^{2/\beta} \frac{1}{75^\ell} \big( (2 \cdot 3^\ell - n_{k_\ell}) V + (n_{k_\ell} - 3^\ell) V_2 \big),$$
yielding
$$n_{k_\ell}^{2/\beta} V_{n_{k_\ell}} = x_{k_\ell}^{2/\beta} \big( (2V - V_2) - x_{k_\ell} (V - V_2) \big) = f(x_{k_\ell}). \qquad (7)$$
Again, $x_{k_\ell} 3^\ell \le x 3^\ell < x_{k_\ell} 3^\ell + 1$, which implies $x - \frac{1}{3^\ell} < x_{k_\ell} \le x$, and so, $\lim_{\ell \to \infty} x_{k_\ell} = x$. Since $f$ is continuous, we have
$$\lim_{\ell \to \infty} n_{k_\ell}^{2/\beta} V_{n_{k_\ell}} = f(x) = y,$$
which yields the fact that $y$ is an accumulation point of the subsequence $(n_k^{2/\beta} V_{n_k})_{k \ge 1}$ whenever $y \in [f(1), f(M)]$. To prove the converse, let $y$ be an accumulation point of the sequence $(n_k^{2/\beta} V_{n_k})_{k \ge 1}$. Then, there exists a subsequence $(n_{k_i}^{2/\beta} V_{n_{k_i}})_{i \ge 1}$ of $(n_k^{2/\beta} V_{n_k})_{k \ge 1}$ such that $\lim_{i \to \infty} n_{k_i}^{2/\beta} V_{n_{k_i}} = y$. Set $\ell_{k_i} = \ell(n_{k_i})$ and $x_{k_i} = \frac{n_{k_i}}{3^{\ell_{k_i}}}$. Then, $x_{k_i} \in [1, 2]$, and as shown in (7), we have
$$n_{k_i}^{2/\beta} V_{n_{k_i}} = f(x_{k_i}).$$
Let $(x_{k_{i_j}})_{j \ge 1}$ be a convergent subsequence of $(x_{k_i})_{i \ge 1}$; then we obtain
$$y = \lim_{i \to \infty} n_{k_i}^{2/\beta} V_{n_{k_i}} = \lim_{j \to \infty} n_{k_{i_j}}^{2/\beta} V_{n_{k_{i_j}}} = \lim_{j \to \infty} f(x_{k_{i_j}}) \in [f(1), f(M)].$$
Thus, the set of accumulation points of the sequence $(n_k^{2/\beta} V_{n_k})_{k \ge 1}$ is $[f(1), f(M)]$, i.e., the $\beta$-dimensional quantization coefficient for $P$ does not exist. Hence, the proof of the theorem is complete. □

References

[CR] D. Comez and M.K. Roychowdhury, Quantization for uniform distributions on stretched Sierpinski triangles, Monatshefte für Mathematik, Volume 190, Issue 1, 79-100 (2019).
[DFG] Q. Du, V. Faber and M. Gunzburger, Centroidal Voronoi Tessellations: Applications and Algorithms, SIAM Review, Vol. 41, No. 4 (1999), pp. 637-676.
[DR1] C.P. Dettmann and M.K. Roychowdhury, Quantization for uniform distributions on equilateral triangles, Real Analysis Exchange, Vol. 42(1), 2017, pp. 149-166.
[DR2] C.P. Dettmann and M.K. Roychowdhury, An algorithm to compute CVTs for finitely generated Cantor distributions, to appear, Southeast Asian Bulletin of Mathematics.
[F] K.J. Falconer, Techniques in Fractal Geometry, Chichester: Wiley, 1997.
[GG] A. Gersho and R.M. Gray, Vector quantization and signal compression, Kluwer Academic Publishers: Boston, 1992.
[GL1] S. Graf and H. Luschgy, Foundations of quantization for probability distributions, Lecture Notes in Mathematics 1730, Springer, Berlin, 2000.
[GL2] S. Graf and H. Luschgy, The Quantization of the Cantor Distribution, Math. Nachr., 183 (1997), pp. 113-133.
[GL3] S. Graf and H. Luschgy, Asymptotics of the quantization error for self-similar probabilities, Real Anal. Exchange 26, 795-810 (2001).
[GL4] S. Graf and H. Luschgy, Quantization for probability measures with respect to the geometric mean error, Math. Proc. Cambridge Philos. Soc. 136, 687-717 (2004).
[GL5] S. Graf, H. Luschgy and G. Pagès, The local quantization behavior of absolutely continuous probabilities, Ann. Probab. 40, 1795-1828 (2012).
[GN] R. Gray and D. Neuhoff, Quantization, IEEE Trans. Inform. Theory, 44 (1998), pp. 2325-2383.
[H] J. Hutchinson, Fractals and self-similarity, Indiana Univ. Math. J., 30 (1981), pp. 713-747.
[P] K. Pötzelberger, The quantization dimension of distributions, Math. Proc. Cambridge Philos. Soc., Volume 131, Issue 3, November 2001, pp. 507-519.
[P1] G. Pagès, A space quantization method for numerical integration, J. Comput. Appl. Math. 89, 1-38 (1998).
[P2] G. Pagès and J. Printems, Functional quantization for numerics with an application to option pricing, Monte Carlo Methods Appl. 11, 407-446 (2005).
[R1] M.K. Roychowdhury, Quantization and centroidal Voronoi tessellations for probability measures on dyadic Cantor sets, Journal of Fractal Geometry, 4 (2017), 127-146.
[R2] M.K. Roychowdhury, Optimal quantizers for some absolutely continuous probability measures, Real Analysis Exchange, Vol. 43(1), 2017, pp. 105-136.
[R3] M.K. Roychowdhury, Optimal quantization for the Cantor distribution generated by infinite similitudes, Israel Journal of Mathematics 231 (2019), 437-466.
[R4] M.K. Roychowdhury, Least upper bound of the exact formula for optimal quantization of some uniform Cantor distributions, Discrete and Continuous Dynamical Systems, Series A, Volume 38, Number 9, September 2018, pp. 4555-4570.
[R5] M.K. Roychowdhury, Center of mass and the optimal quantizers for some continuous and discrete uniform distributions, to appear, Journal of Interdisciplinary Mathematics.
[RR] J. Rosenblatt and M.K. Roychowdhury, Optimal quantization for piecewise uniform distributions, Uniform Distribution Theory 13 (2018), no. 2, 23-55.
[Z] P.L. Zador, Development and Evaluation of Procedures for Quantizing Multivariate Distributions, Ph.D. Thesis (Stanford University, 1964).

School of Mathematical and Statistical Sciences, University of Texas Rio Grande Valley, 1201 West University Drive, Edinburg, TX 78539-2999, USA. E-mail address: [email protected]