Computing k-th Lyndon Word and Decoding Lexicographically Minimal de Bruijn Sequence
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter
University of Warsaw
CPM 2014 Moscow, June 18, 2014
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 1/17 Examples: 0010011, 000001, 1. Non-examples: 010011 (010011 > 001101) 001001 (001001 = (001)2)
Lyndon words
Definition (Lyndon, 1954) A Lyndon word is a word that is strictly smaller than all its nontrivial cyclic rotations.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 2/17 Non-examples: 010011 (010011 > 001101) 001001 (001001 = (001)2)
Lyndon words
Definition (Lyndon, 1954) A Lyndon word is a word that is strictly smaller than all its nontrivial cyclic rotations.
Examples: 0010011, 000001, 1.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 2/17 Lyndon words
Definition (Lyndon, 1954) A Lyndon word is a word that is strictly smaller than all its nontrivial cyclic rotations.
Examples: 0010011, 000001, 1. Non-examples: 010011 (010011 > 001101) 001001 (001001 = (001)2)
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 2/17 |Lynd(001110)| = 6, |Lynd(011010)| = 8.
We define
Lynd(w) = {x ∈ L|w| : x ≤ w}. Example: Σ = {0, 1},
L6 = {000001, 000011, 000101, 000111, 001011 001101, 001111, 010111, 011111},
Enumerating Lyndon words
Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ .
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 |Lynd(001110)| = 6, |Lynd(011010)| = 8.
Example: Σ = {0, 1},
L6 = {000001, 000011, 000101, 000111, 001011 001101, 001111, 010111, 011111},
Enumerating Lyndon words
Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define
Lynd(w) = {x ∈ L|w| : x ≤ w}.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 |Lynd(001110)| = 6, |Lynd(011010)| = 8.
Enumerating Lyndon words
Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define
Lynd(w) = {x ∈ L|w| : x ≤ w}. Example: Σ = {0, 1},
L6 = {000001, 000011, 000101, 000111, 001011 001101, 001111, 010111, 011111},
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 |Lynd(011010)| = 8.
Enumerating Lyndon words
Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define
Lynd(w) = {x ∈ L|w| : x ≤ w}. Example: Σ = {0, 1},
L6 = {000001, 000011, 000101, 000111, 001011 001101, 001111, 010111, 011111}, |Lynd(001110)| = 6,
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 Enumerating Lyndon words
Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define
Lynd(w) = {x ∈ L|w| : x ≤ w}. Example: Σ = {0, 1},
L6 = {000001, 000011, 000101, 000111, 001011 001101, 001111, 010111, 011111}, |Lynd(001110)| = 6, |Lynd(011010)| = 8.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 Technical assumptions: word-RAM model, Σ = {0, 1, . . . , σ − 1}, σ fits in a machine word.
Corollary
The k-th lexicographically smallest Lyndon word in Ln can be computed in O(n4 log σ) time.
Proof. Binary search over Σn.
Our results (Lyndon words)
Theorem Given a word w of length n the value |Lynd(w)| can be computed in O(n3) time.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 4/17 Corollary
The k-th lexicographically smallest Lyndon word in Ln can be computed in O(n4 log σ) time.
Proof. Binary search over Σn.
Our results (Lyndon words)
Theorem Given a word w of length n the value |Lynd(w)| can be computed in O(n3) time.
Technical assumptions: word-RAM model, Σ = {0, 1, . . . , σ − 1}, σ fits in a machine word.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 4/17 Our results (Lyndon words)
Theorem Given a word w of length n the value |Lynd(w)| can be computed in O(n3) time.
Technical assumptions: word-RAM model, Σ = {0, 1, . . . , σ − 1}, σ fits in a machine word.
Corollary
The k-th lexicographically smallest Lyndon word in Ln can be computed in O(n4 log σ) time.
Proof. Binary search over Σn.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 4/17 no polynomial-time algorithm known previously for Ln.
combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object;
Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications:
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln.
text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object;
Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications: combinatorics of words,
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln.
algebra. Generating k-th smallest element is a natural question for any combinatorial object;
Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications: combinatorics of words, text algorithms,
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln.
Generating k-th smallest element is a natural question for any combinatorial object;
Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln. Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object;
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object;
no polynomial-time algorithm known previously for Ln.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 Applications to decoding the minimal de Bruijn sequence.
Motivation
Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object;
no polynomial-time algorithm known previously for Ln. Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 Motivation
Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object;
no polynomial-time algorithm known previously for Ln. Generalization of the known formula:
1 X n d |Ln| = n µ( d )σ . d|n
Applications to decoding the minimal de Bruijn sequence.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 1 0 1
0
0 0
0 0 0 0
0 0 0 0
De Bruijn sequences
Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
n = 4, Σ = {0, 1}
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1
0
0 0
0 0 0 0
0 0 0 0
De Bruijn sequences
Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
n = 4, Σ = {0, 1}
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1
0
0 0
0 0 0 0
0 0 0 0
De Bruijn sequences
Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
n = 4, Σ = {0, 1}
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1
0
0 0
0 0 0 0
0 0 0 0
De Bruijn sequences
Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
n = 4, Σ = {0, 1}
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1
0
0 0
0 0 0 0
0 0 0 0
De Bruijn sequences
Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
n = 4, Σ = {0, 1}
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 For Σ = {0, 1} and n = 6 the sequence dB6 is:
0000001000011000101000111001001011 001101001111010101110110111111
Theorem (Fredricksen, Maiorana, 1978)
The sequence dBn is the concatenation, in lexicographic order, of all Lyndon words over Σ whose length divides n.
Lexicographically minimal de Bruijn sequence
Notation: Σ fixed alphabet,
dBn the lexicographically minimal de Bruijn sequence over Σ of rank n.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 7/17 Theorem (Fredricksen, Maiorana, 1978)
The sequence dBn is the concatenation, in lexicographic order, of all Lyndon words over Σ whose length divides n.
Lexicographically minimal de Bruijn sequence
Notation: Σ fixed alphabet,
dBn the lexicographically minimal de Bruijn sequence over Σ of rank n.
For Σ = {0, 1} and n = 6 the sequence dB6 is:
0000001000011000101000111001001011 001101001111010101110110111111
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 7/17 Lexicographically minimal de Bruijn sequence
Notation: Σ fixed alphabet,
dBn the lexicographically minimal de Bruijn sequence over Σ of rank n.
For Σ = {0, 1} and n = 6 the sequence dB6 is:
0000001000011000101000111001001011 001101001111010101110110111111
Theorem (Fredricksen, Maiorana, 1978)
The sequence dBn is the concatenation, in lexicographic order, of all Lyndon words over Σ whose length divides n.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 7/17 Applications: 1 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Applications: 1 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
1 1 1 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Applications: 1 1 0 1 1 1 position sensing schemes.
0 1 Previous work: 0 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
1 1 1 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Applications: 1 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
0 0 1 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Applications: 1 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
0 0 1 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Applications: 1 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
0 1 0 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Applications: 1 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
1 1 0 1 1
0 1 0
0
0
1
0
0
1 0 1
0 1 0 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 1 1 1 0 1 1
0 1 Previous work: 0 0
0 polynomial-time decoding
0 0 0 0 0
1 1 1 1 1 schemes for several types of de
0 0 0 0 0
0 0 0 0
0 Bruijn sequences:
1 1 1 1 1
0 0 0 0 0
1 1 1 1 1 Paterson & Robshaw, 1995 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
Applications: 1 1 0 1 1 position sensing schemes.
0 1 0
0
0
1
0
0
1 0 1
0 1 0 0
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 1 1 1 0 1 1
0 1 0
0 0
0 0 0 0 0
1 1 1 1 1
0 0 0 0 0
0 0 0 0 0
1 1 1 1 1
0 0 0 0 0 1 1 1 1 1
Decoding de Bruijn sequences
A decoding algorithm finds the position of an arbitrary word from Σn in a given de Bruijn sequence in polynomial time.
Applications: 1 1 0 1 1 position sensing schemes.
0 1 Previous work: 0
0 polynomial-time decoding
0
1 schemes for several types of de
0
0 Bruijn sequences:
1
0 1 Paterson & Robshaw, 1995 0 1 0 0 Mitchell, Etzion and Paterson, 1996 Tuliani, 2001
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 8/17 Our results (minimal de Bruijn sequences)
Using the correspondence between Lyndon words and lex. min. de Bruijn sequences we show the following: Theorem 3 There exists an O(n )-time decoding algorithm for dBn.
Theorem
For any n the k-th symbol of dBn can be computed in O(n4 log σ) time.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 9/17 : auxiliary reduction
For a word w ∈ Σ+, denote the lexicographically minimal cyclic rotation of w by hwi. Lemma For any word w ∈ Σn the lexicographically maximal w 0 ∈ Σn such that hw 0i = w 0 ≤ w can be found in O(n2) time.
Lynd(w 0) = Lynd(w), so we can assume that hwi = w.
Main algorithm
We sketch the proof of the following result: Theorem For any word w of length n, the value |Lynd(w)| can be computed in poly(n) time.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 10/17 Lynd(w 0) = Lynd(w), so we can assume that hwi = w.
Main algorithm: auxiliary reduction
We sketch the proof of the following result: Theorem For any word w of length n, the value |Lynd(w)| can be computed in poly(n) time.
For a word w ∈ Σ+, denote the lexicographically minimal cyclic rotation of w by hwi. Lemma For any word w ∈ Σn the lexicographically maximal w 0 ∈ Σn such that hw 0i = w 0 ≤ w can be found in O(n2) time.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 10/17 Main algorithm: auxiliary reduction
We sketch the proof of the following result: Theorem For any word w of length n, the value |Lynd(w)| can be computed in poly(n) time.
For a word w ∈ Σ+, denote the lexicographically minimal cyclic rotation of w by hwi. Lemma For any word w ∈ Σn the lexicographically maximal w 0 ∈ Σn such that hw 0i = w 0 ≤ w can be found in O(n2) time.
Lynd(w 0) = Lynd(w), so we can assume that hwi = w.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 10/17 Example. Let w = 010111. CS(w[1..1]) = {0}, CS(w[1..2]) = {00, 01, 10}, CS(w[1..3]) = {000, 001, 010, 100}, |CS(w)| = 54 1 |Lynd(w)| = 6 · µ(1) |CS(w)|+µ(2) |CS(w[1..3])|+µ(3) |CS(w[1..2])| + 1 µ(6) |CS(w[1..1])| = 6 · (54 − 4 − 3 + 1) = 8.
Formula for |Lynd(w)|
Define CS(w) = {x ∈ Σ|w| : hxi ≤ w}.
Lemma If hwi = w then
1 X n |Lynd(w)| = n µ( d ) |CS(w[1..d])| d|n
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 11/17 Formula for |Lynd(w)|
Define CS(w) = {x ∈ Σ|w| : hxi ≤ w}.
Lemma If hwi = w then
1 X n |Lynd(w)| = n µ( d ) |CS(w[1..d])| d|n
Example. Let w = 010111. CS(w[1..1]) = {0}, CS(w[1..2]) = {00, 01, 10}, CS(w[1..3]) = {000, 001, 010, 100}, |CS(w)| = 54 1 |Lynd(w)| = 6 · µ(1) |CS(w)|+µ(2) |CS(w[1..3])|+µ(3) |CS(w[1..2])| + 1 µ(6) |CS(w[1..1])| = 6 · (54 − 4 − 3 + 1) = 8.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 11/17 z ≤ w
y 0
y 2 ∈ L(w), |y| = n y 0 ≤ w, hence y ∈ CS(w)
Fact p n If√hwi = w then CS(w) = L(w) ∩ Σ , where L = {y : y 2 ∈ L}.
y y
CS(w) as a language
Define a language L(w) as follows: x ∈ L(w) if there exists a subword z of x such that z ≤ w but z is not a proper prefix of w.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 12/17 z ≤ w
y 0
y 2 ∈ L(w), |y| = n y 0 ≤ w, hence y ∈ CS(w)
y y
CS(w) as a language
Define a language L(w) as follows: x ∈ L(w) if there exists a subword z of x such that z ≤ w but z is not a proper prefix of w. Fact p n √If hwi = w then CS(w) = L(w) ∩ Σ , where L = {y : y 2 ∈ L}.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 12/17 y 0
y 0 ≤ w, hence y ∈ CS(w)
CS(w) as a language
Define a language L(w) as follows: x ∈ L(w) if there exists a subword z of x such that z ≤ w but z is not a proper prefix of w. Fact p n √If hwi = w then CS(w) = L(w) ∩ Σ , where L = {y : y 2 ∈ L}.
z ≤ w
y y
y 2 ∈ L(w), |y| = n
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 12/17 CS(w) as a language
Define a language L(w) as follows: x ∈ L(w) if there exists a subword z of x such that z ≤ w but z is not a proper prefix of w. Fact p n √If hwi = w then CS(w) = L(w) ∩ Σ , where L = {y : y 2 ∈ L}.
z ≤ w
y 0 y y
y 2 ∈ L(w), |y| = n y 0 ≤ w, hence y ∈ CS(w)
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 12/17 w(i+1) if c = w[i + 1] and i 6= n − 1,
w(0) if c > w[i + 1], AC otherwise. a b a b a a a b a a b a b b b b Fact Let a word w be its own minimal rotation, i.e. hwi = w. If w[1..j] is a border of w[1..i], then w[j + 1] ≤ w[i + 1].
Deterministic automaton recognizing L(w)
A has n + 1 states: one for each proper prefix of w, and an auxiliary accepting state AC. The transitions are defined as follows: δ(AC, c) = AC for any c ∈ Σ and δ(w(i), c) =
a
start
b
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 13/17 w(0) if c > w[i + 1], AC otherwise. a b a b a b b b b Fact Let a word w be its own minimal rotation, i.e. hwi = w. If w[1..j] is a border of w[1..i], then w[j + 1] ≤ w[i + 1].
Deterministic automaton recognizing L(w)
A has n + 1 states: one for each proper prefix of w, and an auxiliary accepting state AC. The transitions are defined as follows: δ(AC, c) = AC for any c ∈ Σ and w if c = w[i + 1] and i 6= n − 1, (i+1) δ(w(i), c) =
a
start a a b a a b a b
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 13/17 AC otherwise. a a a b
Fact Let a word w be its own minimal rotation, i.e. hwi = w. If w[1..j] is a border of w[1..i], then w[j + 1] ≤ w[i + 1].
Deterministic automaton recognizing L(w)
A has n + 1 states: one for each proper prefix of w, and an auxiliary accepting state AC. The transitions are defined as follows: δ(AC, c) = AC for any c ∈ Σ and w if c = w[i + 1] and i 6= n − 1, (i+1) δ(w(i), c) = w(0) if c > w[i + 1],
b a b start a a b a a b a b b b b
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 13/17 Fact Let a word w be its own minimal rotation, i.e. hwi = w. If w[1..j] is a border of w[1..i], then w[j + 1] ≤ w[i + 1].
Deterministic automaton recognizing L(w)
A has n + 1 states: one for each proper prefix of w, and an auxiliary accepting state AC. The transitions are defined as follows: δ(AC, c) = AC for any c ∈ Σ and w if c = w[i + 1] and i 6= n − 1, (i+1) δ(w(i), c) = w(0) if c > w[i + 1], AC otherwise. a b a a b a start a a b a a b a b b b b b
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 13/17 Deterministic automaton recognizing L(w)
A has n + 1 states: one for each proper prefix of w, and an auxiliary accepting state AC. The transitions are defined as follows: δ(AC, c) = AC for any c ∈ Σ and w if c = w[i + 1] and i 6= n − 1, (i+1) δ(w(i), c) = w(0) if c > w[i + 1], AC otherwise. a b a a b a start a a b a a b a b b b b b Fact Let a word w be its own minimal rotation, i.e. hwi = w. If w[1..j] is a border of w[1..i], then w[j + 1] ≤ w[i + 1].
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 13/17 Fact
For any automaton A = (Q, q0, F , δ) we have
p n X 0 n | L(A) ∩ Σ | = |LA(q0, q) ∩ LA(q, q ) ∩ Σ |. q∈Q,q0∈F
0 00 m One can compute |LA(q0, q) ∩ LA(q , q ) ∩ Σ | for all q, q0, q00 ∈ Q and m ≤ n in O(poly(n)) time.
To obtain O(n3) time complexity, we use an alternative method that exploits the structure of A.
Dynamic programming
0 For A = (Q, q0, F , δ) and q, q ∈ Q define
0 ∗ 0 LA(q, q ) = {x ∈ Σ : δ(q, x) = q }.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 14/17 0 00 m One can compute |LA(q0, q) ∩ LA(q , q ) ∩ Σ | for all q, q0, q00 ∈ Q and m ≤ n in O(poly(n)) time.
To obtain O(n3) time complexity, we use an alternative method that exploits the structure of A.
Dynamic programming
0 For A = (Q, q0, F , δ) and q, q ∈ Q define
0 ∗ 0 LA(q, q ) = {x ∈ Σ : δ(q, x) = q }.
Fact
For any automaton A = (Q, q0, F , δ) we have
p n X 0 n | L(A) ∩ Σ | = |LA(q0, q) ∩ LA(q, q ) ∩ Σ |. q∈Q,q0∈F
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 14/17 To obtain O(n3) time complexity, we use an alternative method that exploits the structure of A.
Dynamic programming
0 For A = (Q, q0, F , δ) and q, q ∈ Q define
0 ∗ 0 LA(q, q ) = {x ∈ Σ : δ(q, x) = q }.
Fact
For any automaton A = (Q, q0, F , δ) we have
p n X 0 n | L(A) ∩ Σ | = |LA(q0, q) ∩ LA(q, q ) ∩ Σ |. q∈Q,q0∈F
0 00 m One can compute |LA(q0, q) ∩ LA(q , q ) ∩ Σ | for all q, q0, q00 ∈ Q and m ≤ n in O(poly(n)) time.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 14/17 Dynamic programming
0 For A = (Q, q0, F , δ) and q, q ∈ Q define
0 ∗ 0 LA(q, q ) = {x ∈ Σ : δ(q, x) = q }.
Fact
For any automaton A = (Q, q0, F , δ) we have
p n X 0 n | L(A) ∩ Σ | = |LA(q0, q) ∩ LA(q, q ) ∩ Σ |. q∈Q,q0∈F
0 00 m One can compute |LA(q0, q) ∩ LA(q , q ) ∩ Σ | for all q, q0, q00 ∈ Q and m ≤ n in O(poly(n)) time.
To obtain O(n3) time complexity, we use an alternative method that exploits the structure of A.
Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 14/17 1 Find λk−1 and λk+1 (using the FKM algorithm).
2 Perform pattern matching for w in λk−1λk λk+1.
3 The occurrence of λk in dBn ends at position n/|λk | |CS(λk )|.