Discrete Math. Appl. 2016; 26 (1):13–29
Yuriy S. Kharin and Egor V. Vecherko Detection of embeddings in binary Markov chains
DOI: 10.1515/dma-2016-0002 Received March 31, 2015
Abstract: The paper is concerned with problems in steganography on the detection of embeddings and statistical estimation of positions at which message bits are embedded. Binary stationary Markov chains with known or unknown matrices of transition probabilities are used as mathematical models of cover sequences (container files). Based on the runs statistics and the likelihood ratio statistic, statistical tests are constructed for detecting the presence of embeddings. For a family of contiguous alternatives, the asymptotic power of statistical tests based on the runs statistics is found. An algorithm of polynomial complexity is developed for the statistical estimation of positions with embedded bits. Results of computer experiments are presented.
Keywords: steganography, model of embeddings, Markov chain, statistical test, power, total number of runs
Note: Originally published in Diskretnaya Matematika (2015) 27, №3, 123–144 (in Russian).
1 Introduction
The paper is concerned with a topical problem in steganographic information security: the problem of embedding detection, that is, the construction of statistical tests for the existence of embeddings and of statistical estimates of the positions (points) of embeddings. The problem of detection of embeddings was studied in [1, 2, 3, 4] under the assumption that the probabilistic model of a cover sequence is completely known. In [1], statistical tests for the embedding existence were constructed in the case when the initial (cover) sequence is modeled by a Bernoulli scheme of independent trials; it was also shown that embedding detection is impossible if the fraction of embeddings tends to 0 as the length of the initial sequence tends to ∞. A similar fact was proved in [3]. In [2] a most powerful statistical test for the embedding existence was constructed for the model based on a Bernoulli scheme of independent trials, and statistical estimates of the fraction of embeddings were put forward. Statistical estimates of the model parameters of the embedding in a binary Markov chain were constructed and examined in [5]; they allow one to draw preliminary conclusions on the fraction of embeddings. It is worth mentioning that the majority of studies on the detection of embeddings are based on empirical characteristics of sequences and use methods of discriminant analysis for testing the embedding existence. We also note that the above problems of recognition of embeddings are close to those on the detection of deviations of output sequences of cryptographic generators from uniformly distributed random sequences [6]. Our purpose in this paper is to continue the studies initiated in [5]: we construct and analyse statistical tests for the embedding existence and develop algorithms for the statistical estimation of embedding points. The paper is organized as follows.
In § 2 we describe the mathematical (q, r)-block model of embedding in a binary Markov chain. In § 3 we construct statistical tests for the embedding existence based on the runs statistics and on the short runs statistics, and in § 4 we consider tests based on the likelihood ratio statistics. In § 5, we put forward an algorithm of polynomial complexity for statistical estimation. The results of numerical experiments are given in § 6.
Yuriy S. Kharin: Belarusian State University, e-mail: [email protected]
Egor V. Vecherko: Belarusian State University, e-mail: [email protected]
2 Mathematical model of embedding
We define the generalized $(q, r)$-block model of embedding, a particular case of which was proposed by the authors of the present paper in [5]. Throughout, $(\Omega, F, \mathbf{P})$ is the underlying probability space, $V = \{0, 1\}$ is the binary alphabet, $V_T$ is the space of binary $T$-dimensional vectors, $O(\cdot)$ is the 'big O' notation introduced by Landau, $\mathbb{N}$ is the set of natural numbers, $I\{A\}$ is the indicator of an event $A$, $u_{t_1}^{t_2} = (u_{t_1}, \ldots, u_{t_2}) \in V_{t_2 - t_1 + 1}$ ($t_1, t_2 \in \mathbb{N}$, $t_1 \le t_2$) is a binary string of $t_2 - t_1 + 1$ successive symbols of some sequence $\{u_t : t \in \mathbb{N}\}$, $w(\cdot)$ is the Hamming weight, $\mathcal{L}\{\xi\}$ is the probability distribution of a random variable $\xi$, $B(\theta)$ denotes the Bernoulli distribution with parameter $\theta \in [0, 1]$: $\mathbf{P}\{\xi = 1\} = 1 - \mathbf{P}\{\xi = 0\} = \theta$, and $\Phi(\cdot)$ is the distribution function of the standard normal law $N(0, 1)$.

According to [5], an adequate model of the cover sequence for embedding a message is a binary sequence $x_1^T = (x_1, x_2, \ldots, x_T) \in V_T$, $x_t \in V$, $t = 1, \ldots, T$, of length $T$, which is a homogeneous first-order binary Markov chain with symmetric matrix of one-step transition probabilities $P$:

\[
P = P(\varepsilon) = \frac{1}{2}\begin{pmatrix} 1 + \varepsilon & 1 - \varepsilon \\ 1 - \varepsilon & 1 + \varepsilon \end{pmatrix}, \qquad \mathbf{P}\{x_t \oplus x_{t+1} = 1\} = \frac{1}{2}(1 - \varepsilon), \qquad |\varepsilon| < 1. \tag{1}
\]
Here, $\varepsilon$ is the parameter of the model: the case $\varepsilon = 0$ corresponds to a scheme of independent trials, which was examined in [1]. The case $\varepsilon > 0$ takes into account an attraction-type dependence, and the case $\varepsilon < 0$, a repulsion-type dependence. We note that the Markov chain (1) satisfies the ergodicity conditions [7] and has the uniform stationary distribution $(1/2, 1/2)$. In what follows, we shall assume that the Markov chain (1) is stationary, and so its initial probability distribution agrees with the uniform distribution.

In practical applications [5] a message is subject to a cryptographic transformation before being embedded in the cover sequence, and hence we assume in what follows that a message $\xi_1^M = (\xi_1, \ldots, \xi_M) \in V_M$, $M \le T$, is a sequence of independent Bernoulli random variables:

\[
\mathcal{L}\{\xi_t\} = B(\theta_1), \qquad \mathbf{P}\{\xi_t = j\} = \theta_j, \quad j \in V, \qquad \theta_1 = 1 - \theta_0, \qquad t = 1, \ldots, M. \tag{2}
\]
The stego-key $\gamma_1^T = (\gamma_1, \ldots, \gamma_T) \in V_T$ specifies the points (time instants) at which the message bits $\xi_1^M$ are embedded in the sequence $x_1^T$. We introduce a special $(q, r)$-block model of the stego-key $\gamma_1^T$ ($q, r \in \mathbb{N}$, $r \le q$), assuming that the length of the sequence $x_1^T$ is a multiple of $q$: $T = Kq$.

Let $\zeta_k \in V$, $\mathcal{L}\{\zeta_k\} = B(\delta)$, $k = 1, \ldots, K$, be auxiliary independent random variables, which govern the choice of the blocks $\{x_{(k)} = x_{(k-1)q+1}^{kq}\}$ for embedding the message $\xi_1^M$: if $\zeta_k = 1$, then $r$ successive bits of the message are embedded in $r$ randomly chosen bits of the block $x_{(k)}$; if $\zeta_k = 0$, then no embedding in the block $x_{(k)}$ is performed. Further, $G^{(q,r)} = \{g_1^{(q,r)}, \ldots, g_{C_q^r}^{(q,r)}\} = \{u_1^q \in V_q : w(u_1^q) = r\}$ is the set consisting of $C_q^r$ lexicographically ordered binary vectors of length $q$ with Hamming weight $r$; $g_1, g_2, \ldots$ are independent random variables, and $g_k$ has the uniform probability distribution on the set $\{1, \ldots, C_q^r\}$:

\[
\mathbf{P}\{\gamma_{(k)} = g_i^{(q,r)} \mid \zeta_k = 1\} = \mathbf{P}\{g_k = i\} = \frac{1}{C_q^r}.
\]
In the $(q, r)$-block model of embedding, the sequence $\gamma_1^T$ consists of blocks of length $q$: $\gamma_{(1)} = \gamma_1^q$, $\gamma_{(2)} = \gamma_{q+1}^{2q}$, $\ldots$, $\gamma_{(K)} = \gamma_{(K-1)q+1}^{Kq}$,

\[
\gamma_{(k)} = \begin{cases} (0, \ldots, 0), & \zeta_k = 0, \\ g_i^{(q,r)} \in G^{(q,r)}, & \zeta_k = 1,\ g_k = i, \end{cases} \qquad k = 1, \ldots, T/q, \tag{3}
\]

where the zero vector has length $q$ and the parameter $\delta$ characterizes the fraction of embeddings. We note that for the $(q, r)$-block model of embedding the maximum capacity of the stego system is $Tr/q$ bits, while the cardinality of the set of all possible stego-keys

\[
\Gamma^{(q,r)} = \{G^{(q,r)} \cup \{(0, \ldots, 0)\}\}^{T/q}
\]

is $|\Gamma^{(q,r)}| = (1 + C_q^r)^{T/q}$. In the case $q = r = 1$, we have the classical model [5] of a bit-wise embedding, $|\Gamma^{(1,1)}| = 2^T$.

For the embedding methods most commonly encountered in steganography (the 'LSB replacement' and the '± embedding' [8]) the random stego-sequence $Y_1^T = (Y_1, \ldots, Y_T)$ is generated by the sequences $\{x_t\}$, $\{\xi_t\}$, $\{\gamma_t\}$ via the function transform

\[
Y_t = x_t \oplus \gamma_t x_t \oplus \gamma_t \xi_{\tau_t} = \begin{cases} x_t, & \gamma_t = 0, \\ \xi_{\tau_t}, & \gamma_t = 1, \end{cases} \tag{4}
\]

where $\tau_t = \sum_{j=1}^{t} \gamma_j$. The sequences $\{x_t\}$, $\{\xi_t\}$, $\{\gamma_t\}$ are assumed to be jointly independent.

We note that for $r = 1$ the model presented here coincides with the $q$-block model considered in [5].

From the practical point of view, the case $\theta_0 = \theta_1 = 1/2$ in (2), which presents the greatest challenge for embedding detection, is the most noteworthy in the framework of the Markov model of embedding (1)–(4). In this case the one-dimensional probability distribution is not distorted by an embedding in (4):

\[
\mathbf{P}\{Y_t = 1\} = \mathbf{P}\{Y_t = 0\} = \mathbf{P}\{x_t = 1\} = \mathbf{P}\{x_t = 0\} = 1/2, \qquad t = 1, 2, \ldots, T. \tag{5}
\]

Another justification of the relevance of the case considered in the present paper is the practical use of a preliminary cryptographic transformation of a message, which removes the nonuniformity in the probability distribution of symbols.
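The model (1)–(4) can be illustrated by a short simulation. The following Python sketch is not taken from the paper; the function names `simulate_cover`, `simulate_key` and `embed` are hypothetical, and the message is assumed to consist of uniform bits ($\theta_0 = \theta_1 = 1/2$).

```python
import random

def simulate_cover(T, eps, rng):
    """Stationary chain (1): a symbol changes with probability (1 - eps)/2."""
    x = [rng.randrange(2)]                       # uniform initial distribution
    for _ in range(T - 1):
        flip = rng.random() < (1 - eps) / 2
        x.append(x[-1] ^ int(flip))
    return x

def simulate_key(T, q, r, delta, rng):
    """(q, r)-block key (3): each block is zero or a random weight-r vector."""
    assert T % q == 0 and 1 <= r <= q
    gamma = []
    for _ in range(T // q):
        block = [0] * q
        if rng.random() < delta:                 # zeta_k = 1: block carries r bits
            for i in rng.sample(range(q), r):    # uniform over the C(q, r) supports
                block[i] = 1
        gamma.extend(block)
    return gamma

def embed(x, gamma, xi):
    """Transform (4): Y_t = x_t if gamma_t = 0, otherwise the next message bit."""
    bits = iter(xi)
    return [next(bits) if g else xt for xt, g in zip(x, gamma)]

# Illustrative run (parameter values are arbitrary assumptions).
rng = random.Random(0)
x = simulate_cover(24, 0.5, rng)
gamma = simulate_key(24, 4, 2, 0.5, rng)
xi = [rng.randrange(2) for _ in range(sum(gamma))]   # uniform message bits
y = embed(x, gamma, xi)
print(x, gamma, y, sep="\n")
```

Drawing the support of a block with `rng.sample` is one simple way to realize the uniform choice among the $C_q^r$ vectors of $G^{(q,r)}$; the lexicographic index $g_k$ itself is not needed for simulation.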
3 Embedding detection based on the runs statistics
3.1 Using the total number of runs statistics
We introduce two hypotheses concerning the fraction $\delta \in [0, 1]$ of embeddings:

\[
H_0 : \{\delta = 0\}, \qquad H_1 : \{\delta > 0\}. \tag{6}
\]

The hypothesis $H_0$ means that there are no embeddings and the stego-sequence $Y_1^T$ agrees with the cover sequence $x_1^T$. The composite alternative $H_1$ means that there exist embeddings with some unknown fraction $\delta > 0$. If the parameter $\varepsilon$ of the cover sequence is known, then the null hypothesis, which will be denoted by $H_{0,\varepsilon}$, is simple; otherwise, $H_0$ is also a composite hypothesis. If the hypothesis $H_0$ holds, then the probability measure $\mathbf{P}$ will be denoted by $\mathbf{P}_0$; otherwise, by $\mathbf{P}_\delta$. The moments of random variables are denoted similarly. The distributions $\mathbf{P}_0$ and $\mathbf{P}_\delta$ were found in [5].

Lemma 1. Under the hypothesis $H_{0,\varepsilon}$, the probability distribution of the stego-sequence $Y_1^T$ is as follows:

\[
\mathbf{P}_0\{Y_1^T = y_1^T\} = \mathbf{P}_0\{x_1^T = y_1^T\} = 2^{-T}(1 - \varepsilon)^{B_T - 1}(1 + \varepsilon)^{T - B_T},
\]

where

\[
B_T = B_T(y_1^T) = 1 + \sum_{t=1}^{T-1} y_t \oplus y_{t+1}
\]

is the minimal sufficient statistic under $H_{0,\varepsilon}$.

The statistic $B_T$ is called the 'runs test' in [9] (it is the total number of runs). By virtue of (1), under the hypothesis $H_0$ the sequence of indicators $I\{Y_t \oplus Y_{t+1} = 1\}$ consists of independent random variables with Bernoulli distribution $B(2^{-1}(1 - \varepsilon))$. Using the exact binomial probability distribution of the statistic $B_T$ with the known value of $\varepsilon$, one may construct a randomized statistical test for the embedding existence with a given probability $\alpha$ of the first kind error. However, for practical purposes, it is more convenient to use its asymptotic variant as $T \to \infty$, which is given by the critical region
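\[
\begin{aligned}
X_{1\alpha}^{B+} &= \Big\{y_1^T : B_T \ge 1 + \tfrac{1}{2}T(1 - \varepsilon) - \tfrac{1}{2}t_\alpha\sqrt{T(1 - \varepsilon^2)}\Big\} \quad \text{for } \varepsilon > 0, \\
X_{1\alpha}^{B-} &= \Big\{y_1^T : B_T \le 1 + \tfrac{1}{2}T(1 - \varepsilon) + \tfrac{1}{2}t_\alpha\sqrt{T(1 - \varepsilon^2)}\Big\} \quad \text{for } \varepsilon < 0,
\end{aligned} \tag{7}
\]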
where $t_\alpha$ is the $\alpha$-quantile of the standard normal distribution: $\Phi(t_\alpha) = \alpha$.

Theorem 1. Let the model of embedding (4) hold. Then as $T \to \infty$ the asymptotic size of test (7) for the hypotheses $H_{0,\varepsilon}$, $H_1$ based on the total number of runs statistic $B_T$ coincides with a preassigned significance level $\alpha \in (0, 1)$. The asymptotic expression for the power of this test in the case of the $(1, 1)$-model of embedding and of the family of simple contiguous alternatives $H_{1,\delta} : \{\delta = \rho / T^\beta\}$, $\beta > 0$, is as follows:
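\[
W_1^{B+} = W_1^{B-} \to \begin{cases} 1, & 0 < \beta < 1/2, \\ \Phi\Big(t_\alpha + 2\rho\,\dfrac{|\varepsilon|}{\sqrt{1 - \varepsilon^2}}\Big), & \beta = 1/2, \\ \alpha, & \beta > 1/2. \end{cases} \tag{8}
\]

Proof. Under the hypothesis $H_0$, the De Moivre–Laplace limit theorem implies that

\[
\mathcal{L}_0\left\{\frac{B_T - 1 - \frac{1}{2}T(1 - \varepsilon)}{\frac{1}{2}\sqrt{T(1 - \varepsilon^2)}}\right\} \to N(0, 1) \quad \text{as } T \to \infty. \tag{9}
\]

Hence, using (9) we have, as $T \to \infty$,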
\[
\mathbf{P}_0\{X_{1\alpha}^{B+}\} \to \alpha, \qquad \mathbf{P}_0\{X_{1\alpha}^{B-}\} \to \alpha.
\]

In the case $q = r = 1$, under the alternative $H_1$, it follows from (1), (2), (4), (5) that the initial first moment of the random variable $B_T$ is equal to
\[
\mathbf{E}_\delta\{B_T\} = 1 + \sum_{t=1}^{T-1}\mathbf{E}_\delta\{Y_t \oplus Y_{t+1}\} = 1 + 2^{-1}(T - 1)(1 - (1 - \delta)^2\varepsilon).
\]

Using similar arguments, we calculate the initial second moment under the alternative $H_1$. We have
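\[
\begin{aligned}
\mathbf{E}_\delta\{B_T^2\} &= \mathbf{E}_\delta\Big\{\Big(1 + \sum_{t_1=1}^{T-1} Y_{t_1} \oplus Y_{t_1+1}\Big)\Big(1 + \sum_{t_2=1}^{T-1} Y_{t_2} \oplus Y_{t_2+1}\Big)\Big\} \\
&= \mathbf{E}_\delta\Big\{\sum_{t_1, t_2 = 1}^{T-1}(Y_{t_1} \oplus Y_{t_1+1})(Y_{t_2} \oplus Y_{t_2+1})\Big\} + 2\mathbf{E}_\delta\{B_T\} - 1 \\
&= 3\mathbf{E}_\delta\{B_T\} - 2 + 2\sum_{t=1}^{T-2}\sum_{h \in V}\mathbf{P}_\delta\{Y_t = h, Y_{t+1} = 1 - h, Y_{t+2} = h\} \\
&\quad + 2\sum_{\tau=2}^{T-2}\sum_{t=1}^{T-\tau-1}\sum_{h_1, h_2 \in V}\mathbf{P}_\delta\{Y_t = h_1, Y_{t+1} = 1 - h_1, Y_{t+\tau} = h_2, Y_{t+\tau+1} = 1 - h_2\} \\
&= 3\mathbf{E}_\delta\{B_T\} - 2 + 2^{-1}(T - 2)(1 + \varepsilon(\varepsilon - 2)(1 - \delta)^2) \\
&\quad + 4\sum_{\tau=2}^{T-2}(T - \tau - 1)\Big(\mathbf{P}_\delta\{Y_t = 0, Y_{t+1} = 1, Y_{t+\tau} = 0, Y_{t+\tau+1} = 1\} \\
&\qquad\qquad + \mathbf{P}_\delta\{Y_t = 0, Y_{t+1} = 1, Y_{t+\tau} = 1, Y_{t+\tau+1} = 0\}\Big) \\
&= 1 + 2^{-1}\cdot 3(T - 1)(1 - \varepsilon(1 - \delta)^2) + 2^{-1}(T - 2)(1 + \varepsilon(\varepsilon - 2)(1 - \delta)^2) \\
&\quad + 2^{-2}(T - 2)(T - 3)(1 + \varepsilon(\varepsilon\delta^2 - 2\varepsilon\delta - 2 + \varepsilon)(1 - \delta)^2).
\end{aligned}
\]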
For the variance, we have
\[
\begin{aligned}
\mathbf{D}_\delta\{B_T\} &= \tfrac{1}{4}T\big(1 - (1 - \delta)^2\varepsilon^2(1 - 6\delta + 3\delta^2)\big) - \tfrac{1}{4}\big(1 - (1 - \delta)^2\varepsilon^2(1 - 10\delta + 5\delta^2)\big) \\
&= T\cdot\tfrac{1}{4}\big(1 - (1 - \delta)^2\varepsilon^2(1 - 6\delta + 3\delta^2)\big)(1 + o(1)), \qquad T \to \infty.
\end{aligned}
\]
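As a sanity check on the algebra, the variance can be recomputed from the first and second moments in exact rational arithmetic. The sketch below is an illustration only: the function names are hypothetical, the three formulas are transcribed from the expressions for $\mathbf{E}_\delta\{B_T\}$, $\mathbf{E}_\delta\{B_T^2\}$ and $\mathbf{D}_\delta\{B_T\}$ above, and the sample values of $T$, $\delta$, $\varepsilon$ are arbitrary.

```python
from fractions import Fraction as F

def mean_B(T, d, e):
    """First moment of B_T under the alternative (q = r = 1)."""
    return 1 + F(T - 1, 2) * (1 - (1 - d) ** 2 * e)

def second_moment_B(T, d, e):
    """Second moment of B_T, last line of the derivation above."""
    s = (1 - d) ** 2
    return (1
            + F(3, 2) * (T - 1) * (1 - e * s)
            + F(1, 2) * (T - 2) * (1 + e * (e - 2) * s)
            + F(1, 4) * (T - 2) * (T - 3)
              * (1 + e * (e * d ** 2 - 2 * e * d - 2 + e) * s))

def variance_B(T, d, e):
    """Closed form of the variance of B_T stated in the text."""
    a = (1 - d) ** 2 * e ** 2
    return (F(T, 4) * (1 - a * (1 - 6 * d + 3 * d ** 2))
            - F(1, 4) * (1 - a * (1 - 10 * d + 5 * d ** 2)))

# Exact comparison for a few arbitrary rational parameter values.
for T in (5, 50, 500):
    for d, e in ((F(1, 10), F(1, 3)), (F(2, 5), F(-1, 2))):
        lhs = second_moment_B(T, d, e) - mean_B(T, d, e) ** 2
        print(T, d, e, lhs == variance_B(T, d, e))
```

Using `Fraction` avoids floating-point rounding, so equality of the two sides is an exact statement for the chosen values.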
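By the construction (4), the random sequence $\{Y_t\}$ satisfies the strong mixing property [10, 11], and hence the central limit theorem for weakly dependent random variables holds:

\[
\mathcal{L}_\delta\left\{\frac{B_T - 1 - \frac{1}{2}T(1 - (1 - \delta)^2\varepsilon)}{\sqrt{\mathbf{D}_\delta\{B_T\}}}\right\} \to N(0, 1). \tag{10}
\]

In the case $\varepsilon > 0$, we take into account (10) and substitute $\delta = \rho / T^\beta$ as $T \to \infty$ into the expression for the power. As a result, we have

\[
\begin{aligned}
\lim W_1^{B+} &= \lim \mathbf{P}_\delta\{X_{1\alpha}^{B+}\} = \lim \mathbf{P}_\delta\Big\{B_T \ge 1 + \tfrac{1}{2}T(1 - \varepsilon) - \tfrac{1}{2}t_\alpha\sqrt{T(1 - \varepsilon^2)}\Big\} \\
&= \lim \mathbf{P}_\delta\left\{\frac{B_T - \mathbf{E}_\delta\{B_T\}}{\sqrt{\mathbf{D}_\delta\{B_T\}}} \ge \frac{1 + \frac{1}{2}T(1 - \varepsilon) - \mathbf{E}_\delta\{B_T\} - \frac{1}{2}t_\alpha\sqrt{T(1 - \varepsilon^2)}}{\sqrt{\mathbf{D}_\delta\{B_T\}}}\right\} \\
&= \Phi\left(\lim \frac{T\varepsilon\delta(2 - \delta) + t_\alpha\sqrt{T(1 - \varepsilon^2)}}{\sqrt{T(1 - (1 - \delta)^2\varepsilon^2(1 - 6\delta + 3\delta^2))}}\right).
\end{aligned}
\]

Analyzing various values of $\beta$ in this expression, we arrive at (8). The case $\varepsilon < 0$ is dealt with similarly.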
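The test (7) and the limit (8) are straightforward to evaluate numerically. The following Python sketch is one possible reading of these formulas, not the authors' implementation; the names `total_runs`, `runs_test` and `limiting_power` are hypothetical, and the quantile $t_\alpha$ is taken from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def total_runs(y):
    """B_T = 1 + number of sign changes in the binary sequence y."""
    return 1 + sum(a ^ b for a, b in zip(y, y[1:]))

def runs_test(y, eps, alpha):
    """Asymptotic test (7) for known eps != 0; True means H_0 is rejected."""
    if eps == 0:
        raise ValueError("test (7) is stated for eps > 0 or eps < 0 only")
    T = len(y)
    t_alpha = NormalDist().inv_cdf(alpha)            # Phi(t_alpha) = alpha
    centre = 1 + 0.5 * T * (1 - eps)
    half_width = 0.5 * t_alpha * sqrt(T * (1 - eps ** 2))
    b = total_runs(y)
    if eps > 0:
        return b >= centre - half_width              # region X^{B+}
    return b <= centre + half_width                  # region X^{B-}

def limiting_power(eps, alpha, rho, beta):
    """Limit (8) of the power for alternatives delta = rho / T**beta."""
    if beta < 0.5:
        return 1.0
    if beta > 0.5:
        return alpha
    t_alpha = NormalDist().inv_cdf(alpha)
    return NormalDist().cdf(t_alpha + 2 * rho * abs(eps) / sqrt(1 - eps ** 2))

# Illustrative values (arbitrary assumptions).
print(total_runs([0, 0, 1, 1, 1, 0]))
print(limiting_power(eps=0.5, alpha=0.05, rho=1.0, beta=0.5))
```

Since $t_\alpha < 0$ for $\alpha < 1/2$, the threshold for $\varepsilon > 0$ lies above the null mean of $B_T$, in line with the fact that embeddings increase the expected number of runs in that case.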
3.2 Using the short runs statistics
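Let us construct the sequence of indicators of sign changes in the sequence $Y_1, \ldots, Y_T \in V$:

\[
z_t = Y_t \oplus Y_{t+1} \in V, \qquad t = 1, \ldots, T - 1. \tag{11}
\]

Next, we define the set of patterns in sequence (11):

\[
\{b_1, b_2, \ldots\}, \qquad b_\tau = (1, \underbrace{0, \ldots, 0}_{\tau}, 1), \qquad \tau \in \mathbb{N} \cup \{0\};
\]

here $b_\tau$ is the chain of $\tau$ successive 0's bounded from the left and right by 1's. Such patterns specify series of 0's and 1's of length $\tau + 1$ in the stego-sequence $\{Y_t\}$. Further, we consider the disjoint random events $C_\tau$, $\tau \in \mathbb{N} \cup \{0\}$:

\[
C_\tau = \{(z_t, z_{t+1}, \ldots, z_{t+\tau+1}) = b_\tau\}.
\]

Lemma 2. Let the model of embedding (4) hold with $q = r = 1$. Then under the alternative $H_1$ the probability distribution of the random events $C_\tau$ is given by

\[
\mathbf{P}_\delta\{C_\tau\} = \mathbf{P}_0\{C_\tau\} + a_\tau(\delta, \varepsilon) = 2^{-(\tau+2)}(1 + \varepsilon)^\tau(1 - \varepsilon)^2 + a_\tau(\delta, \varepsilon), \tag{12}
\]

where $a_\tau(\delta, \varepsilon) \to 0$ as $\delta \to 0$, $|\varepsilon| < 1$.

Proof. Using the law of total probability for the model of embedding under consideration, we find that

\[
\begin{aligned}
\mathbf{P}_\delta\{C_\tau\} &= \sum_{u_1^{\tau+2} \in V_{\tau+2}} \mathbf{P}_\delta\{\gamma_t^{t+\tau+1} = u_1^{\tau+2}\}\,\mathbf{P}_\delta\{(z_t, z_{t+1}, \ldots, z_{t+\tau+1}) = b_\tau \mid \gamma_t^{t+\tau+1} = u_1^{\tau+2}\} \\
&= (1 - \delta)^{\tau+2}\mathbf{P}_0\{C_\tau\} + \delta\sum_{u_1^{\tau+2} \in V_{\tau+2} :\, w(u_1^{\tau+2}) > 0} \delta^{w(u_1^{\tau+2}) - 1}(1 - \delta)^{\tau+2-w(u_1^{\tau+2})} \\
&\qquad \times \mathbf{P}_\delta\{(z_t, z_{t+1}, \ldots, z_{t+\tau+1}) = b_\tau \mid \gamma_t^{t+\tau+1} = u_1^{\tau+2}\} \xrightarrow[\delta \to 0]{} 2^{-(\tau+2)}(1 + \varepsilon)^\tau(1 - \varepsilon)^2.
\end{aligned}
\]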
Theorem 2. Under the hypotheses of Lemma 2, the function $a_\tau(\delta, \varepsilon)$ has the asymptotic expansion

\[
a_\tau = \delta a_\tau^{(1)}(\varepsilon) + O(\delta^2), \qquad a_\tau^{(1)}(\varepsilon) = \begin{cases} 2^{-1}\varepsilon(2 - \varepsilon), & \tau = 0, \\ 2^{-2}\varepsilon(1 - \varepsilon)(1 + \varepsilon), & \tau = 1, \\ 2^{-\tau-1}\varepsilon(1 - \varepsilon)(1 + \varepsilon)^{\tau-2}(\varepsilon^2 + (\tau + 1)\varepsilon - \tau + 2), & \tau \ge 2. \end{cases} \tag{13}
\]
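The first-order coefficients in (13) can be checked by brute force for small $\tau$. The sketch below is an illustration under stated assumptions, not the authors' proof: it enumerates all covers, keys and message bits on a window of $\tau + 3$ stego symbols (the number of symbols needed to determine $z_t, \ldots, z_{t+\tau+1}$), takes $q = r = 1$ and uniform message bits, and compares a finite-difference slope at $\delta = 0$ with $a_\tau^{(1)}(\varepsilon)$. The names `path_prob`, `pattern_prob` and `a1` are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

def path_prob(x, eps):
    """Probability of the path x under the stationary chain (1)."""
    p = F(1, 2)
    for a, b in zip(x, x[1:]):
        p *= (1 + eps) / 2 if a == b else (1 - eps) / 2
    return p

def pattern_prob(tau, delta, eps):
    """Exact P_delta{C_tau} by enumeration (q = r = 1, theta = 1/2)."""
    n = tau + 3                                  # symbols Y_t, ..., Y_{t+tau+2}
    target = (1,) + (0,) * tau + (1,)            # pattern b_tau
    total = F(0)
    for gamma in product((0, 1), repeat=n):
        w = sum(gamma)
        p_key = delta ** w * (1 - delta) ** (n - w)
        for x in product((0, 1), repeat=n):
            p_x = path_prob(x, eps)
            for xi in product((0, 1), repeat=w):  # embedded message bits
                bits = iter(xi)
                y = [next(bits) if g else xt for xt, g in zip(x, gamma)]
                z = tuple(a ^ b for a, b in zip(y, y[1:]))
                if z == target:
                    total += p_key * p_x * F(1, 2 ** w)
    return total

def a1(tau, eps):
    """First-order coefficient from (13)."""
    if tau == 0:
        return eps * (2 - eps) / 2
    if tau == 1:
        return eps * (1 - eps) * (1 + eps) / 4
    return (eps * (1 - eps) * (1 + eps) ** (tau - 2)
            * (eps ** 2 + (tau + 1) * eps - tau + 2) / 2 ** (tau + 1))

eps, delta = F(1, 2), F(1, 10 ** 6)
for tau in range(4):
    slope = (pattern_prob(tau, delta, eps) - pattern_prob(tau, 0, eps)) / delta
    print(tau, float(slope), float(a1(tau, eps)))
```

The finite difference differs from $a_\tau^{(1)}$ only by a term of order $\delta$, so for $\delta = 10^{-6}$ the two printed columns should agree to several decimal places.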
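Proof. We partition the set $U_{\tau+2,1} = \{u_1^{\tau+2} = (u_1, \ldots, u_{\tau+2}) \in V_{\tau+2} : w(u_1^{\tau+2}) = 1\}$, $|U_{\tau+2,1}| = \tau + 2$, of binary vectors of length $\tau + 2$, $\tau \ge 3$, with unit Hamming weight into three disjoint subsets:

\[
\begin{aligned}
U_{\tau+2,1} &= U_{\tau+2,1}^{(0)} \cup U_{\tau+2,1}^{(1)} \cup U_{\tau+2,1}^{(2)}, \\
U_{\tau+2,1}^{(j)} &= \{u_1^{\tau+2} \in U_{\tau+2,1} : u_{j+1} + u_{\tau+2-j} = 1\}, \qquad j \in \{0, 1\}, \\
U_{\tau+2,1}^{(2)} &= \Big\{u_1^{\tau+2} \in U_{\tau+2,1} : \sum_{j=3}^{\tau} u_j = 1\Big\}.
\end{aligned}
\]

Arguing as in the proof of Lemma 2, we have

\[
\begin{aligned}
\mathbf{P}_\delta\{C_\tau\} &= \mathbf{P}_0\{C_\tau\} - \delta(\tau + 2)\mathbf{P}_0\{C_\tau\} \\
&\quad + \delta\sum_{j \in \{0,1,2\}}\ \sum_{u_1^{\tau+2} \in U_{\tau+2,1}^{(j)}} \mathbf{P}_\delta\{(z_t, z_{t+1}, \ldots, z_{t+\tau+1}) = b_\tau \mid \gamma_t^{t+\tau+1} = u_1^{\tau+2}\} + O(\delta^2).
\end{aligned} \tag{14}
\]

Let us consider the case $\tau \ge 2$. The subset $U_{\tau+2,1}^{(j)}$, $j \in \{0, 1, 2\}$, contains sequences $u_1^{\tau+2} \in U_{\tau+2,1}$ such that the events $C_\tau \cap \{\gamma_t^{t+\tau+1} = u_1^{\tau+2}\}$ are equiprobable under the alternative $H_1$:

\[
\mathbf{P}_\delta\{C_\tau \cap \{\gamma_t^{t+\tau+1} = u_1^{\tau+2}\}\} = \begin{cases} \delta(1 - \delta)^{\tau+2}2^{-\tau-3}(1 - \varepsilon)(1 + \varepsilon)^\tau, & u_1^{\tau+2} \in U_{\tau+2,1}^{(0)}, \\ \delta(1 - \delta)^{\tau+2}2^{-\tau-3}(1 - \varepsilon)^2(1 + \varepsilon)^\tau, & u_1^{\tau+2} \in U_{\tau+2,1}^{(1)}, \\ \delta(1 - \delta)^{\tau+2}2^{-\tau-3}(1 - \varepsilon)^2(1 + \varepsilon^2)(1 + \varepsilon)^{\tau-2}, & u_1^{\tau+2} \in U_{\tau+2,1}^{(2)}. \end{cases} \tag{15}
\]

Now (13) with $\tau \ge 2$ follows by substitution of (15) into (14). The case $\tau < 2$ in (13) is considered similarly.

Theorem 3. Under the hypotheses of Lemma 2, the function $a_\tau(\delta, \varepsilon)$ has the second-order asymptotic expansion

\[
a_\tau = \delta a_\tau^{(1)}(\varepsilon) + \delta^2 a_\tau^{(2)}(\varepsilon) + O(\delta^3),
\]

where

\[
\begin{aligned}
a_0^{(2)}(\varepsilon) &= 2^{-2}\varepsilon(-2 + \varepsilon), \qquad a_1^{(2)}(\varepsilon) = 2^{-3}\varepsilon(-1 + 4\varepsilon + \varepsilon^2), \\
a_2^{(2)}(\varepsilon) &= 2^{-4}\varepsilon^2(-7 + 10\varepsilon + \varepsilon^2), \qquad a_3^{(2)}(\varepsilon) = 2^{-5}\varepsilon(1 - 12\varepsilon + 2\varepsilon^2 + 16\varepsilon^3 + \varepsilon^4), \\
a_\tau^{(2)}(\varepsilon) &= 2^{-\tau-2}\varepsilon(1 + \varepsilon)^{\tau-4}\big(\tau - 2 + \varepsilon(2\tau^2 - 14\tau + 13) + 2\varepsilon^2(-2\tau^2 + 7\tau - 8) \\
&\qquad + 2\varepsilon^3(\tau^2 - 3\tau + 9) + \varepsilon^4(5\tau + 2) + \varepsilon^5\big), \qquad \tau \ge 4.
\end{aligned} \tag{16}
\]

The proof is similar to that of Theorem 2, the set of stego-keys $U_{\tau+2,2}$ being split into classes of equiprobable events.

From Theorems 2, 3 it follows that under the alternative $H_1$ (existence of embeddings) the probability distribution of the total number of runs of a given length differs from that distribution under the hypothesis $H_0$. In particular, for $\varepsilon > 0$ the probabilities of the events $C_0$, $C_1$, $C_2$ increase as $\delta$ increases from 0 to 1, whereas for $\tau > \tau_\varepsilon = 2 + \varepsilon(3 + \varepsilon)(1 - \varepsilon)^{-1}$ the probability $\mathbf{P}_\delta\{C_\tau\}$ decreases with increasing $\delta$. This being so, we consider the statistics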
\[
B_{T,1} = \sum_{t=1}^{T-2} z_t, \qquad B_{T,2} = \sum_{t=1}^{T-2} z_t z_{t+1}, \tag{17}
\]

where the statistic $B_{T,2}$ is the total number of series of 0's and of 1's of length 1 in the sequence $\{y_t\}$, and the statistic $B_{T,1}$ is related to the total number of runs statistic $B_T$ by the relation $B_{T,1} = B_T - z_{T-1} - 1$.

Using Theorem 1 from [5] one may show that under the alternative $H_1$ the initial first-order moments of the bivariate statistic $(B_{T,1}, B_{T,2})$, as given by (17), read as

\[
\begin{aligned}
\mathbf{E}_\delta\{B_{T,1}\} &= (T - 2)\tfrac{1}{2}(1 - (1 - \delta)^2\varepsilon) = \mathbf{E}_0\{B_{T,1}\} + T\tfrac{1}{2}\delta(2 - \delta)\varepsilon + o(T), \qquad T \to \infty, \\
\mathbf{E}_\delta\{B_{T,2}\} &= (T - 2)\tfrac{1}{4}(1 - (1 - \delta)^2\varepsilon(2 - \varepsilon)) = \mathbf{E}_0\{B_{T,2}\} + T\tfrac{1}{4}\delta(2 - \delta)\varepsilon(2 - \varepsilon) + o(T), \qquad T \to \infty.
\end{aligned} \tag{18}
\]

From (18) it is seen that for $\varepsilon > 0$ the mean number of sign changes or of two neighbouring sign changes is larger when the embeddings exist than in the opposite case.
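For concreteness, the statistics (17) and the means (18) can be computed as follows. This is a minimal Python sketch with hypothetical function names `short_run_stats` and `expected_stats`; list indices are 0-based, so `z[t]` corresponds to $z_{t+1}$ in the text.

```python
def short_run_stats(y):
    """Return (B_{T,1}, B_{T,2}) from (17) for a binary sequence y of length T >= 3."""
    T = len(y)
    z = [a ^ b for a, b in zip(y, y[1:])]        # z_1, ..., z_{T-1}
    b1 = sum(z[:T - 2])                          # sum of z_t, t = 1..T-2
    b2 = sum(z[t] * z[t + 1] for t in range(T - 2))
    return b1, b2

def expected_stats(T, delta, eps):
    """Exact means of (B_{T,1}, B_{T,2}) from (18), case q = r = 1."""
    s = (1 - delta) ** 2
    return ((T - 2) * (1 - s * eps) / 2,
            (T - 2) * (1 - s * eps * (2 - eps)) / 4)

# Illustrative values (arbitrary assumptions).
y = [0, 1, 0, 0, 1, 1, 0]
print(short_run_stats(y))
print(expected_stats(102, 0.0, 0.5), expected_stats(102, 0.2, 0.5))
```

For $\varepsilon > 0$ both expected values grow with $\delta$, which is the effect the bivariate test below exploits.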
Theorem 4. Let the model of embedding (4) hold. Then, as $T \to \infty$, the statistical test for the hypotheses $H_{0,\varepsilon}$, $H_1$ of the asymptotic significance level $\alpha \in (0, 1)$ based on the bivariate statistic (17) is given by the critical region

\[
X_{1\alpha}^{B_{1,2}} = \{y_1^T : (B_{T,1}, B_{T,2}) \in D_{1,2}\}, \tag{19}
\]

where the region $D_{1,2}$ is as follows:
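\[
\begin{aligned}
D_{1,2} = \Bigg\{(B_{T,1}, B_{T,2}) :\ & (B_{T,1} - T\mu_{0,1})\varepsilon \ge 0, \quad (B_{T,2} - T\mu_{0,1}^2)\varepsilon \ge 0, \\
& \begin{pmatrix} B_{T,1} - T\mu_{0,1} \\ B_{T,2} - T\mu_{0,1}^2 \end{pmatrix}' \begin{pmatrix} \frac{(5 - 3\varepsilon)(1 - \varepsilon)}{16} & -\frac{1 - \varepsilon}{4} \\ -\frac{1 - \varepsilon}{4} & \frac{1}{4} \end{pmatrix} \begin{pmatrix} B_{T,1} - T\mu_{0,1} \\ B_{T,2} - T\mu_{0,1}^2 \end{pmatrix} \ge Tc_{1,2}\Bigg\},
\end{aligned} \tag{20}
\]

\[
\mu_{0,1} = \tfrac{1}{2}(1 - \varepsilon), \qquad c_{1,2} = 2^{-5}(1 - \varepsilon^2)^2 \ln\frac{\pi - \arccos\Big(2\sqrt{\frac{1 - \varepsilon}{5 - 3\varepsilon}}\Big)}{2\pi\alpha},
\]

that is,

\[
\mathbf{P}_0\{X_{1\alpha}^{B_{1,2}}\} = \mathbf{P}_0\{(B_{T,1}, B_{T,2}) \in D_{1,2}\} \to \alpha.
\]

Proof. Using the fact that under the hypothesis $H_{0,\varepsilon}$ the random variables $\{z_t\}$ are independent and have the Bernoulli distribution $B(2^{-1}(1 - \varepsilon))$, and since the random variables $z_t z_{t+1}$ and $z_s z_{s+1}$ are independent if $|t - s| > 1$, we find that

\[
\begin{aligned}
\mathbf{E}_0\{B_{T,1}\} &= T\tfrac{1}{2}(1 - \varepsilon)(1 + o(1)), \qquad \mathbf{E}_0\{B_{T,2}\} = T\tfrac{1}{4}(1 - \varepsilon)^2(1 + o(1)), \\
\mathbf{D}_0\{B_{T,1}\} &= (T - 2)\mathbf{D}_0\{z_t\} = T\tfrac{1}{4}(1 - \varepsilon^2)(1 + o(1)), \\
\mathbf{D}_0\{B_{T,2}\} &= (T - 2)\mathbf{D}_0\{z_t z_{t+1}\} + 2\sum_{1 \le t\,\ldots} \operatorname{cov}_0\{z_t z_{t+1}, z_s z_{s+1}\} = \ldots
\end{aligned}
\]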
\[
\Sigma_0 = (1 - \varepsilon^2)\begin{pmatrix} \frac{1}{4} & \frac{1 - \varepsilon}{4} \\ \frac{1 - \varepsilon}{4} & \frac{(5 - 3\varepsilon)(1 - \varepsilon)}{16} \end{pmatrix}. \tag{21}
\]

In Fig. 1 the region $D_{1,2}$ for the case $\varepsilon > 0$ is marked by the '+' sign. Such a form of the domain follows from the asymptotic normality of the bivariate statistic $(B_{T,1}, B_{T,2})$ and from expressions (18). To calculate the probability of the first kind error, we use a linear transform of the region $D_{1,2}$. We apply the matrix $\Sigma_0^{-1/2}$ to the unit vectors $(1, 0), (0, 1) \in \mathbb{R}^2$ and construct the Gram matrix:
\[
u_1 = \Sigma_0^{-1/2}(1, 0)', \qquad u_2 = \Sigma_0^{-1/2}(0, 1)', \qquad \begin{pmatrix} u_1'u_1 & u_1'u_2 \\ u_2'u_1 & u_2'u_2 \end{pmatrix} = \Sigma_0^{-1} = \frac{2^6}{(1 - \varepsilon^2)^2}\begin{pmatrix} \frac{(5 - 3\varepsilon)(1 - \varepsilon)}{16} & -\frac{1 - \varepsilon}{4} \\ -\frac{1 - \varepsilon}{4} & \frac{1}{4} \end{pmatrix}.
\]
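The critical region (20) can be evaluated directly from the observed pair $(B_{T,1}, B_{T,2})$. The Python sketch below follows the formulas for $\mu_{0,1}$, $c_{1,2}$ and the quadratic form as written above; the function names `c12` and `in_region` are hypothetical, and the numerical inputs in the example are arbitrary.

```python
from math import acos, log, pi, sqrt

def c12(eps, alpha):
    """Threshold constant c_{1,2} from (20)."""
    angle = pi - acos(2 * sqrt((1 - eps) / (5 - 3 * eps)))
    return 2 ** -5 * (1 - eps ** 2) ** 2 * log(angle / (2 * pi * alpha))

def in_region(b1, b2, T, eps, alpha):
    """True if (B_{T,1}, B_{T,2}) = (b1, b2) falls into D_{1,2}, i.e. H_0 is rejected."""
    mu = (1 - eps) / 2
    d1 = b1 - T * mu                 # deviation of B_{T,1} from its null mean
    d2 = b2 - T * mu ** 2            # deviation of B_{T,2} from its null mean
    if d1 * eps < 0 or d2 * eps < 0:
        return False                 # sign constraints of (20) violated
    m11 = (5 - 3 * eps) * (1 - eps) / 16
    m12 = -(1 - eps) / 4
    m22 = 1 / 4
    quad = m11 * d1 * d1 + 2 * m12 * d1 * d2 + m22 * d2 * d2
    return quad >= T * c12(eps, alpha)

# Illustrative values (arbitrary assumptions): T = 1000, eps = 0.5, alpha = 0.05.
print(c12(0.5, 0.05))
print(in_region(250, 62.5, 1000, 0.5, 0.05), in_region(300, 100, 1000, 0.5, 0.05))
```

The matrix of the quadratic form equals $2^{-6}(1 - \varepsilon^2)^2\Sigma_0^{-1}$, so the inequality in (20) is a threshold on the Mahalanobis-type distance of the deviation vector, restricted to the quadrant selected by the sign of $\varepsilon$.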
[Fig. 1: the region $D_{1,2}$ for $\varepsilon > 0$; axis label $B_{T,2}$.]