<<

arXiv:1301.7564v1 [cs.IT] 31 Jan 2013 nsm ye fpce ewrs h rpsdcdsaea are codes proposed The so-called networks. the packet pac of of of delivery generalization types of some out of in effect permuta the studying from for comes channels motivation transmissi The information channels. reliable of problem the to relevant npoetv pcsa dqaecntut o uhachann a codes such for and constructs networks, adequate as coded appropria spaces network an projective as linear in channel random operator of the by of model presented definition ga one which the the for [3], to as paper codes rise same seminal of their the in definition is Kschischang idea the K¨otter and This underlying channel. more idea a a consider, basic such we give the that will as model we channel well the section of following description sen the detailed networks, wireless In networks, ad-hoc etc. tolerant networks, mobile delay as popular networks, recently vehicular such of technologies number t the a of networking include permutation have Examples random sent. etc., essentially packets an packets, delivering erroneous of effect drop delivering to addition packets, in on and some [2], guarantees networks. packets no of packet delivery provide in-order of the protocols types network certain some Namely, in transmission end-to-end channels. inform permutation reliable over for constructs transmission appropriate be to argued are called rmwr olw lsl h n rpsdb K by proposed one coding-theoreti the presented The closely code c . follows of Manhattan framework to properties equivalence the basic their under which the among of established, Some are authors. the by rmwr o uhcanl,wihi ae ntenotion the on based is which channels, of such for framework channels permutation oesaesmlri ayrset,adtebscie is for idea coding on basic view the unified a and channels. admits respects, of which types many way these mathematica a in two in similar The presented channels. are operator models the for Kschischang eto.Fnly eto rvdstosml,btfairly but codes. simple, of types two thi both provides in of and examples described V general codes also Section multiset are Finally, codes of section. subset properties code over basic extended advantages multiset Some their then so-called IV. is the Section introducing approach in by This generalized [1]. and in presented approach emtto hnesaie o xml,a oesfran for models as example, for arise, channels Permutation nti ae,w td h rbe ferrcreto nthe in correction error of problem the study we paper, this In nScinIIw ilgv noeve ftesbe coding subset the of overview an give will we III Section In Abstract utstcodes multiset ustcodes subset Ti ae nrdcstento of notion the introduces paper —This utstCdsfrPruainChannels Permutation for Codes Multiset hs oe r eeaiaino h so- the of generalization a are codes These . eetypooe yteatos[] and [1], authors the by proposed recently .I I. eamt rsn coding-theoretic a present to aim We . NTRODUCTION eateto oe,Eetois n omncto Engin Communication and Electronics, Power, of Department ustcodes subset nvriyo oiSd 10 oiSd Serbia Sad, Novi 21000 Sad, Novi of University ldnKvˇeicadDjnVukobratovi´c Kovaˇcevi´c Dejan Mladen and eetyproposed recently , Emails: utstcodes multiset { te and otter ¨ mae,dejanv kmladen, nover on ation ping odes kets tion sor el. he ve as as te s c s s l stepruaincanlwt netos eein,and deletions, insertions, with channel pape substitutions. permutation this in the considered channel is can the ) conclude, permuted, original To the deduced. being (in be symbol is erased the sequence of position routers. transmitted the the in erasures as overflows that, network buffer by Note scenari they consequent caused networking be and where a can congestion in situations deletions example, packet other For above, receiver’ mentioned [4]). various the e.g., of also (see, or timing occur are twice, incorrect read There the of is of clock. thought symbol because be a skipped, can where is deletions errors, usu and synchronization symbols, Insertions as of noise. alterations by random caused are errors) (i.e., tions admpruain ftesmos n ouinrle on relies solution One symbols? the w deal of one does how haverandom but literature; substitutions) the in and studied thoroughly deletions, (insertions, sections. consider subsequent in given are definitions precise the channel permutation the for Coding layer). A. physical and (link cod layers error-correcting channel lower and the the error-detecting er in that by understood occur addressed is can are it assumpt Namely, deletions permutations). frequent only the from a that (apart is that scenario It this layer. out o in network application done pointed or is end-to-end coding transport be that the an consequently, should and assumes model, it transmission here network, proposed packet framework a els te eeeiu fet ntetasitdsqec,suc sequence, any transmitted the insertions in on effects delivered deleterious being other them on rely are order. particular cannot and There network. receiver separately the packets in routed the routes the are different which over channels message sent in frequently such networks single of packet Introduction, a permutation of the comprising types random in some a in noted arise outputs As sequence. sequence, this input any for hs nusaesqecso ybl from symbols of are inputs whose ya By ihu oso eeaiy easm that assume we generality, of loss Without eak1: Remark oe o aiu ye fcanlipimnsta we that impairments channel of types way; various informal somewhat for a Codes in idea main the state now We nadto orno emttos h hne a have can channel the permutations, random to addition In Let } emtto channel permutation @uns.ac.rs S r setal h aea eein,bcuethe because deletions, as same the essentially are , eafiieapae with alphabet finite a be deletions ntecs hntepruaincanlmod- channel permutation the when case the In I T II. and , ECANLMODEL CHANNEL HE eering, substitutions over S eudrtn h channel the understand we | S | fsmos Substitu- symbols. of = S > q = S { n which, and , 1 0 , 2 symbols. q , . . . , been sat es fore, as h rors ally ion not ith } o n s r . the following simple idea: Information should be encoded in for X, Y ∈P(S), where △ denotes the an object which is invariant under permutations. An example between sets which is defined as X △Y = (X \Y )∪(Y \X). of such an object is a . Based on this observation, the We also have: authors have introduced the so-called subset codes as relevant d(X, Y )= |X ∪ Y | − |X ∩ Y | for the above channel model [1], the codewords of which are taken to be subsets of an alphabet S. An appropriate metric = |X| + |Y |− 2|X ∩ Y | (2) is specified on the space of all subsets of S, after which the =2|X ∪ Y | − |X| − |Y |. definition of codes and their parameters follows familiar lines. The distance is the length of the shortest path between In the present paper we further generalize this idea by noting d(X, Y ) and in the Hasse diagram [5] of the of subsets that there exists an even more general object invariant under X Y of ordered by inclusion. This diagram plays a role similar permutations – a multiset. Informally, a multiset is a set with S to the Hamming hypercube for the classical codes in the repetitions of elements allowed. Clearly, for a given alphabet Hamming metric, and it is in fact isomorphic to the Hamming S, there are more of certain than there hypercube, as discussed below. For constant-cardinality codes, are sets of the same cardinality, and hence, this approach can the distance between codewords is always even. increase the code rate, among other advantages. To conclude this section, we note that we have adopted the The minimum distance of the code can now be defined as: above principle of taking codewords to be objects invariant min d(X, Y ). (3) under the channel transformation, from the work of K¨otter X,Y ∈C, X6=Y and Kschischang [3]. These authors have noticed that this Other important parameters of the code are its size |C|, principle can be applied to the channels arising in networks maximum cardinality of the codewords: which are based on random linear network coding (RLNC). In such networks, random linear of the injected max |X|, (4) packets are delivered to the receiver, and hence, the only X∈C property preserved by such a channel is the vector space and the cardinality of the ambient set, |S|. The code C⊆P(S) 1 spanned by those packets . From this observation, the authors with minimum distance d and codewords of cardinality at most of [3] have developed the notion of subspace codes, i.e., ℓ is said to be of type [log |S|, log |C|, d; ℓ] (we assume that codes in projective spaces, where codewords are taken to be the logarithms are to the base 2, i.e., that the lengths of the vector subspaces of some ambient vector space (the space of messages are measured in bits). The setting we have in mind is all packets). There are many parallels between subspace and the following: The source maps a k-bit information sequence subset/multiset codes, as will be evident from the exposition in to a set of ℓn-bit symbols/packets which are sent through the the subsequent sections. In fact, multiset codes can be thought channel. In the channel, these symbols are permuted, some of of as a generalization of subspace codes in the sense that any them are deleted, some of them are received erroneously, and basis of a subspace of the set of all packets is also a subset of possibly some new symbols are inserted. The receiver collects the set of all packets. This is a consequence of the fact that all these symbols and attempts to reconstruct the information the RLNC channel is more restrictive that the permutation sequence. channel. Namely, permuting the packets is a special case of Having the above scenario in mind, we can also define the delivering multiple linear combinations of the packets. rate of an [n,k,d; ℓ] subset code as: III. SUBSET CODES k R = . (5) Let S be a nonempty finite set representing the alphabet of nℓ the given permutation channel. If this channel models a packet A. of subset codes and binary codes network, we can think of S as the set of all possible packets. Let P(S) denote the power set of S, i.e., the set of all subsets When the ambient set S is specified, the subsets of S of S, and P(S,ℓ) the set of all subsets of S of cardinality ℓ. are uniquely determined by their characteristic functions (also Definition 1: A subset code over an alphabet S is a non- called indicator functions). The characteristic of a set

empty subset of P(S). If C ⊆P(S,ℓ), we say that C is a X ⊆ S is a mapping ½X : S →{0, 1}, defined by: constant-cardinality code. As usual, in order to enable the receiver to recover from 1 x ∈ X ½X (x)= (6) errors, erasures, etc., codewords of the code should be chosen (0 x∈ / X. to differ from each other as much as possible. A measure of

”dissimilarity” of sets is needed for this purpose. A natural If S = {1,...,q}, these functions can be identified with binary ½ one, which is in fact a metric on P(S), is given by: sequences of length q, namely (½X (1),..., X (q)). All set operations (unions, intersections, differences, etc.) on P(S) d(X, Y )= |X △ Y | (1) can be expressed in terms of the corresponding characteristic 1 Actually, it is preserved only if the transformation applied to the packets functions. For example, it is easy to see that:

is full-rank, but this happens with high if the linear combinations

½ ½ ½ ½ are indeed random. ½X△Y = X ⊕ Y = | X − Y | (7) where ⊕ denotes the XOR operation (addition modulo 2). in the subset case, what we send through the channel are the Also, the cardinality of the set X can be expressed as: positions of ones in this codeword.

|X| = ½X (x), (8) IV. MULTISET CODES x In this section, we generalize subset codes by allowing which is just the HammingX weight of the binary sequence

the codewords to contain multiple copies of their elements. ½ (½ (1),..., (q)). From the above one concludes that:

X X This feature is quite natural, because any interesting classical ½ d(X, Y )= |X △ Y | = |½X (x) − Y (x)| , (9) code over a finite alphabet contains codewords with multiple x occurrences of some symbols. In our case, the codewords i.e., the distance between sets X and Y is equal to the Ham- are sets, and the objects we need – sets with repetitions of

ming distance between the binary sequences corresponding elements allowed – are known as multisets [6]. A multiset ½ to ½X and Y . The above reasoning implies the following is defined with a set of elements it contains, and numbers interesting fact: The subset codes in P(S) are just a different of occurrences of each in the set. The number of S representation of binary codes in the space {0, 1}| | under the occurrences of an element, called its multiplicity, is assumed Hamming metric. Every subset code of type [n,k,d; ℓ] has a to be finite. Finally, we note that multisets are also invariant n binary counterpart with parameters (2 ,k,d) and maximum under permutations and hence are suitable for the permutation codeword weight ℓ, and vice versa. channel. Example 1: Let S = {1, 2, 3, 4, 5}. Any subset of S can Let M(S) denote the collection of all multisets over an be identified by a binary sequence of length 5; for example alphabet S. Operations on M(S), such as union, intersection, {1, 2} ↔ 11000, {2, 4} ↔ 01010, etc. Consider now some difference, etc., are straightforward extensions of the corre- 5 code in {0, 1} , e.g., C = {11000, 01010, 01110, 00111}. sponding operations on sets. It is easiest to illustrate them on The subset counterpart of this code is then CS = a simple example. {{1, 2}, {2, 4}, {2, 3, 4}, {3, 4, 5}}. The distance between two Example 2: Let X = {1, 2, 2, 2, 3} and Y = subsets of S is the Hamming distance between the correspond- {1, 2, 2, 3, 3, 4} be two multisets over S = {1, 2, 3, 4}. ing binary sequences, for example: Then X ∩ Y = {1, 2, 2, 3}, X ∪ Y = {1, 2, 2, 2, 3, 3, 4}, X \ Y = {2}, Y \ X = {3, 4}. The cardinality of X and Y d ({1, 2}, {2, 4})= |{1, 4}| =2= d (11000, 01010) (10) H is |X| =5, |Y | =6, respectively. so that all properties of C directly translate into equivalent Codes in the space M(S) are defined analogously to properties of the subset code CS. the codes in P(S). In the following, M(S,ℓ) denotes the An important consequence of this isomorphism is that collection of all multisets of cardinality ℓ. subset codes can be constructed by using the familiar con- Definition 2: A multiset code over S is a nonempty subset structions of the codes for binary channels. Apart from the of M(S). If C ⊆M(S,ℓ), we say that C is a constant- construction itself, the analogy can be used for the analysis cardinality code. of the transmission of a subset through a channel. Namely, Note that M(S) is an infinite space. It is always assumed,

an equivalent way of describing that X was sent and Y however, even if not explicitly stated, that a multiset code is ½ was received, is that the binary word (½X (1),..., X (q)) finite. In particular, we have in mind multiset codes with an

was sent (through the corresponding binary channel) and upper bound on the cardinality of the codewords, which is a ½ (½Y (1),..., Y (q)) was received. Insertion of an element reasonable constraint from the ”practical” point of view. In

i∈ / X to X corresponds to the 0 → 1 transition in the any case, we shall mostly deal with constant-cardinality codes ½ binary channel, i.e., ½X (i)=0 and Y (i)=1. Similarly, where this issue does not arise. deletion of an element i from X corresponds to the 1 → 0 It is easy to see that the function d from (1) is a metric transition, and a substitution corresponds to both transitions on M(S), and hence we can define the minimum distance of (at different positions) as it is essentially a of a multiset code in the same way as for subset codes. Other an insertion and a deletion. Consider further the special case code parameters are also defined in the same way as for subset when only deletions can occur in the channel (recall that this codes and those definitions will not be repeated here. is a frequent model of an end-to-end transmission over packet We now prove a simple, but basic fact about the correcting networks). It is easy to conclude from the above discussion that capabilities of multiset codes. The analogous statement for the this channel is equivalent to the so-called Z-channel in which special case of subset codes is proven in [1]. the crossover 1 → 0 occurs with probability p (the probability Theorem 3: A multiset code C with minimum distance d is of deletion), while the crossover 0 → 1 never occurs. The capable of correcting any pattern of s insertions, ρ deletions, analysis of subset codes and the corresponding permutation and t substitutions, as long as 2(s + ρ +2t) < d. channel with deletions is thus reduced to the analysis of Proof: Let X ∈ C be the multiset which is transmitted binary codes and the binary Z-channel, respectively. Note that, through the channel. Let Y be the received multiset. If ρ for both these channels, we can design a binary code with packets from X have been deleted, and s new packets have appropriate parameters. The difference is that, in the binary been inserted, then we easily deduce that |X ∩ Y | ≥ |X|− ρ channel we send a codeword (binary sequence) itself, while and |Y | = |X|− ρ + s. Since each substitution is essentially Zq a combination of one deletion and one insertion, the actual description of the codes in ≥0 under the Manhattan metric. number of deletions and insertions is ρ + t and s + t, respec- Constant-cardinality codes are then equivalent to the codes on Z tively, wherefrom one concludes that |X ∩ Y | ≥ |X|− ρ − t the ”sphere” {(x1,...,xq): xi ∈ ≥0, i xi = ℓ}. and |Y | = |X|− ρ + s, and that V. EXAMPLES OF CODES FOR THEP PERMUTATION d(X, Y )= |X| + |Y |− 2|X ∩ Y |≤ s + ρ +2t. (11) CHANNEL Now, if the assumption 2(s + ρ + 2t) < d holds, then In this section, we describe a simple way to construct subset d−1 d(X, Y ) ≤ ⌊ 2 ⌋ and therefore X can be recovered from and multiset codes, and discuss some of the properties of the Y by the minimum distance decoder. obtained codes. If only deletions can occur in the channel, then d(X, Y )= ρ d−1 A. Example of subset codes and the sent codeword is recoverable whenever ρ ≤⌊ 2 ⌋. An obvious advantage that multiset codes have over subset A straightforward way of obtaining codes for the permu- codes is the code rate improvement which is a consequence tation channel is to use some classical error-correcting code, of them being defined in a bigger space: and add a sequence number to every symbol of the codeword so that the order of symbols can be restored at the receiving q + ℓ − 1 q |M(S,ℓ)| = > = |P(S,ℓ)| . (12) side. This approach is illustrated below. ℓ ℓ     Let A be a finite alphabet with |A| = q. Observe some code Further, when |S| = q is ”small”, it is necessary to use C over A with parameters (ℓ,k,d), meaning that |C| = qk, multiset codes because, unlike subset codes, they allow the the codewords of C are q-ary sequences of length ℓ, and the cardinality of the codewords to be larger than the cardinality Hamming distance between any two codewords is at least d. of the alphabet. For example, multiset codes with arbitrary For any codeword p = (p1,...,pℓ) ∈ C, create a sequence minimum distance (and hence, arbitrary correction capability) (t1,...,tℓ), where ti = i ◦ pi is a new symbol obtained by can be defined even over a binary alphabet. prepending a sequence number to the symbol pi (◦ denotes the concatenation of strings). This mapping is clearly injective and A. Isomorphism of multiset codes and integer codes the set of all sequences thus obtained defines a code C′ over The isomorphism between subset codes and binary codes, an alphabet S = {1,...,ℓ} × A with parameters (ℓ,k,d). which has many important consequences, as discussed in The codewords of C′ are invariant under permutations, i.e., Section III-A, also has an appropriate generalization in the any permutation of (t1,...,tℓ) has the same meaning to the multiset framework. Namely, multiset codes turn out to be receiver because it can recover (p1,...,pℓ) from the sequence equivalent to integer codes under the so-called Manhattan numbers. Therefore, one can imagine the carrier of information metric, and this equivalence is illustrated next. being a set {t1,...,tℓ}, and hence this simple construction Multisets over an alphabet can be described by their S yields an example of a subset code CS over S. The code has multiplicity functions in the same way subsets are described qk codewords, each of cardinality ℓ. The minimum (subset) by their characteristic functions (in fact, that is how multisets distance of the code is easily determined by observing two are usually defined formally [6]). The multiplicity function of codewords: Z a multiset over is a mapping Ñ 0, where 1 ◦ p1 2 ◦ p2 ... ℓ ◦ pℓ X S X : S → ≥ (16)

ÑX(x) is the number of occurrences of x in X. Clearly, a 1 ◦ r1 2 ◦ r2 ... ℓ ◦ rℓ. multiset is a set if and only if the range of its multiplicity It is evident that the cardinality of the intersection of the subset function is {0, 1}. Operations on multisets can be expressed codewords P = {1◦p1,...,ℓ◦pℓ} and R = {1◦r1,...,ℓ◦rℓ}

in terms of their multiplicity functions, for example: is equal to the number of positions where the sequences p =

Ñ Ñ ÑX∪Y = max{ X, Y }, (p1,...,pℓ) and r = (r1,...,rℓ) agree, which is, on the other

hand, equal to ℓ minus the Hamming distance of these two

Ñ Ñ ÑX∩Y = min{ X, Y }, (13)

sequences. Therefore,

Ñ Ñ ÑX\Y = max{0, X − Y }, while the cardinality of a multiset is expressed as: d(P, R)=2dH(p, r), (17)

and hence the minimum (subset) distance of CS is 2d. To |X| = ÑX(x). (14) x conclude, this construction yields a subset code CS of type If the alphabet is S = {1X, 2,...,q}, the multiplicity func- [log qℓ,k log q, 2d; ℓ]. tion of a multiset X is uniquely specified by a sequence Note that the decoding procedure for CS is the same as for

Zq Ñ (ÑX(1),..., X(q)) ∈ ≥0. Therefore, the space M(S) is C once the codeword of C is recovered by using sequence Zq essentially equivalent to the space ≥0. Further, the distance numbers. Note also that recovering (p1,...,pℓ) from {1 ◦ between multisets is: p1,...,ℓ ◦ pℓ} reduces deletions to erasures, while insertions

and substitutions are reduced to errors. Namely, if i ◦ pi has Ñ d(X, Y )= |X △ Y | = | ÑX(x) − Y (x)| , (15) x been deleted, the receiver will be able to deduce that the which is the familiar ℓ1 distance,X also known as the Manhattan symbol at the ith position is missing. Similarly, if j ◦ pj has metric. Therefore, multiset codes are basically just another been inserted and the receiver now possesses two symbols with the sequence number j, it will choose one at random, this way is greater than or equal to the Levenshtein distance possibly resulting in an error at the jth position. Hence, when between the original sequences, and therefore the code CM is subset codes constructed in this way are used, the permutation of type [log qℓ,k log q, dM ; ℓ], where dM ≥ d. channel (over S) with insertions, deletions, and substitutions, As noted above, one possible decoding procedure for CM is reduces to the classical discrete memoryless channel (over A) to first use the sequence numbers to obtain the right ordering with errors and erasures. of symbols, and then apply the decoding algorithm for C to the The codes described above are, to the best of our knowledge, resulting sequence. If this procedure is used, then one easily the only type of error-correcting codes for the permutation concludes that the number of insertions and deletions which d−1 channel described in the literature (see, e.g., the construction can be corrected is at most ⌊ 2 ⌋, and therefore, the ”effective of the ”outer” code in [4]). As we have illustratred, they are in minimum distance” of the code is d. fact only a special case of the more general notion of subset As a final note here, we would like to stress that the above codes. We note that better subset codes can be constructed construction merely serves as an illustration of a constant- via the isomorphism given in Section III-A, i.e., by using the cardinality mutiset code. The general method of construction familiar constructions of binary codes (see also [1]). that can be used is via the corresponding constant-weight integer codes in the Manhattan metric, as explained in Section B. Example of multiset codes IV-A. It appears, however, that these codes have not been We next describe a simple construction which yields an studied thoroughly before, and it remains an interesting prob- example of a multiset code (which is not a subset code). It lem for future research to explore further their properties, and is also is based on ”classical” codes and sequence numbers. obtain explicit constructions and decoding algorithms. Again, let A be a finite alphabet with |A| = q symbols, and VI. CONCLUSION C a code over A. For any codeword p = (p1,...,pℓ) ∈ C, We have presented a framework for forward error correction we create a sequence (t1,...,tℓ) by prepending sequence numbers to the symbols of p, but in such a way that runs of in the permutation channels. We have introduced multiset identical symbols in p are given the same sequence number. codes as relevant constructs for correcting insertions, deletions, For example, the sequence (a,a,b,b,c,b), where a,b,c ∈ A, and substitutions in such channels. Some basic properties of is mapped to (1 ◦ a, 1 ◦ a, 2 ◦ b, 2 ◦ b, 3 ◦ c, 4 ◦ b). The multiset codes have been established. The framework pre- obtained sequence is invariant under permutations, and it is sented is analogous to the one introduced recently by K¨otter easily concluded, similarly to the example from the previous and Kschischang for the operator channels, and can be viewed as its extension. As a consequence, a unified view on coding subsection, that this procedure yields a multiset code CM over for RLNC networks and multipath routed packet networks is S. The decoding procedure for CM is again the same as for C once the codeword is recovered from the sequence numbers. obtained. In this case, however, recovering p from {i1 ◦ p1,...,iℓ ◦ pℓ} ACKNOWLEDGMENT reduces deletions to deletions, insertions to either insertions The authors would like to acknowledge the financial support or substitutions, and substitutions to substitutions (i.e., errors). of the Ministry of Science and Technological Development of Namely, if the symbol ij ◦ pj has been deleted, the receiver cannot deduce (in general) which symbol has been deleted the Republic of Serbia (grants No. TR32040 and III44003). because there could have been multiple copies of this or REFERENCES some other symbols. Similar reasoning applies for the other [1] M. Kovaˇcevi´cand D. Vukobratovi´c, ”Subset Codes for Packet Networks,” cases. Therefore, the code C has to be resilient to insertions, IEEE Commun. Lett., submitted for publication, revised Jan. 2013, deletions, and substitutions. available at arXiv:1210.7341. Finally, let us determine the parameters of from those of [2] D. P. Bertsekas and R. Gallager, Data Networks, 2nd ed., Prentice Hall, CM 1992. C. Let C be of type (ℓ,k,d), where k and ℓ are as before, and d [3] R. K¨otter and F. R. Kschischang, ”Coding for Errors and Erasures in is the minimum Levenshtein distance [7], which is the relevant Random Network Coding,” IEEE Trans. Inf. Theory, vol. 54, no. 8, pp. 3579–3591, Aug. 2008. distance measure for insertion/deletion channels (it is defined [4] L. J. Schulman and D. Zuckerman, ”Asymptotically Good Codes Correct- as the minimum number of insertions and deletions that ing Insertions, Deletions, and Transpositions,” IEEE Trans. Inf. Theory, transform one sequence to the other). Observe two multiset vol. 45, no. 7, pp. 2552–2557, Nov. 1999. [5] G. Birkhoff, Lattice Theory, 3rd ed., American Mathematical Society, codewords P and R: 1967. [6] M. Aigner, Combinatorial Theory, Springer, 1979. i1 ◦ p1 i2 ◦ p2 ... iℓ ◦ pℓ (18) [7] V. I. Levenshtein, ”Binary Codes Capable of Correcting Deletions, j1 ◦ r1 j2 ◦ r2 ... jℓ ◦ rℓ, Insertions, and Reversals,” Sov. Phys.–Dokl., vol. 10, no. 8, pp. 707–710, Feb. 1966. where, (im) and (jm) are nondecreasing integer sequences, as explained above. Unfortunately, the distance between P and R in general cannot be expressed via the Levenshtein (or Hamming) distance between p and r, and hence the minimum distance of CM cannot be inferred from d. It is easy to conclude, however, that the distance between two multisets obtained in