Computing K-Th Lyndon Word and Decoding Lexicographically Minimal De Bruijn Sequence
Total Page:16
File Type:pdf, Size:1020Kb
Computing k-th Lyndon Word and Decoding Lexicographically Minimal de Bruijn Sequence Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter University of Warsaw CPM 2014 Moscow, June 18, 2014 Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 1/17 Examples: 0010011, 000001, 1. Non-examples: 010011 (010011 > 001101) 001001 (001001 = (001)2) Lyndon words Definition (Lyndon, 1954) A Lyndon word is a word that is strictly smaller than all its nontrivial cyclic rotations. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 2/17 Non-examples: 010011 (010011 > 001101) 001001 (001001 = (001)2) Lyndon words Definition (Lyndon, 1954) A Lyndon word is a word that is strictly smaller than all its nontrivial cyclic rotations. Examples: 0010011, 000001, 1. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 2/17 Lyndon words Definition (Lyndon, 1954) A Lyndon word is a word that is strictly smaller than all its nontrivial cyclic rotations. Examples: 0010011, 000001, 1. Non-examples: 010011 (010011 > 001101) 001001 (001001 = (001)2) Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 2/17 jLynd(001110)j = 6; jLynd(011010)j = 8: We define Lynd(w) = fx 2 Ljwj : x ≤ wg: Example: Σ = f0; 1g; L6 = f000001; 000011; 000101; 000111; 001011 001101; 001111; 010111; 011111g; Enumerating Lyndon words Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 jLynd(001110)j = 6; jLynd(011010)j = 8: Example: Σ = f0; 1g; L6 = f000001; 000011; 000101; 000111; 001011 001101; 001111; 010111; 011111g; Enumerating Lyndon words Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define Lynd(w) = fx 2 Ljwj : x ≤ wg: Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 jLynd(001110)j = 6; jLynd(011010)j = 8: Enumerating Lyndon words Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define Lynd(w) = fx 2 Ljwj : x ≤ wg: Example: Σ = f0; 1g; L6 = f000001; 000011; 000101; 000111; 001011 001101; 001111; 010111; 011111g; Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 jLynd(011010)j = 8: Enumerating Lyndon words Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define Lynd(w) = fx 2 Ljwj : x ≤ wg: Example: Σ = f0; 1g; L6 = f000001; 000011; 000101; 000111; 001011 001101; 001111; 010111; 011111g; jLynd(001110)j = 6; Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 Enumerating Lyndon words Notation: Σ a fixed finite totally-ordered alphabet, L all Lyndon words in Σ+, n Ln all Lyndon words in Σ . We define Lynd(w) = fx 2 Ljwj : x ≤ wg: Example: Σ = f0; 1g; L6 = f000001; 000011; 000101; 000111; 001011 001101; 001111; 010111; 011111g; jLynd(001110)j = 6; jLynd(011010)j = 8: Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 3/17 Technical assumptions: word-RAM model, Σ = f0; 1; : : : ; σ − 1g, σ fits in a machine word. Corollary The k-th lexicographically smallest Lyndon word in Ln can be computed in O(n4 log σ) time. Proof. Binary search over Σn. Our results (Lyndon words) Theorem Given a word w of length n the value jLynd(w)j can be computed in O(n3) time. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 4/17 Corollary The k-th lexicographically smallest Lyndon word in Ln can be computed in O(n4 log σ) time. Proof. Binary search over Σn. Our results (Lyndon words) Theorem Given a word w of length n the value jLynd(w)j can be computed in O(n3) time. Technical assumptions: word-RAM model, Σ = f0; 1; : : : ; σ − 1g, σ fits in a machine word. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 4/17 Our results (Lyndon words) Theorem Given a word w of length n the value jLynd(w)j can be computed in O(n3) time. Technical assumptions: word-RAM model, Σ = f0; 1; : : : ; σ − 1g, σ fits in a machine word. Corollary The k-th lexicographically smallest Lyndon word in Ln can be computed in O(n4 log σ) time. Proof. Binary search over Σn. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 4/17 no polynomial-time algorithm known previously for Ln. combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object; Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln. text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object; Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: combinatorics of words, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln. algebra. Generating k-th smallest element is a natural question for any combinatorial object; Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: combinatorics of words, text algorithms, Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln. Generating k-th smallest element is a natural question for any combinatorial object; Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 no polynomial-time algorithm known previously for Ln. Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object; Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object; no polynomial-time algorithm known previously for Ln. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 Applications to decoding the minimal de Bruijn sequence. Motivation Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object; no polynomial-time algorithm known previously for Ln. Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 Motivation Lyndon words have numerous applications: combinatorics of words, text algorithms, algebra. Generating k-th smallest element is a natural question for any combinatorial object; no polynomial-time algorithm known previously for Ln. Generalization of the known formula: 1 X n d jLnj = n µ( d )σ : djn Applications to decoding the minimal de Bruijn sequence. Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 5/17 1 0 1 0 0 0 0 0 0 0 0 0 0 0 De Bruijn sequences Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once. 1 1 0 1 1 0 1 0 0 0 1 0 0 1 0 1 n = 4; Σ = f0; 1g Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1 0 0 0 0 0 0 0 0 0 0 0 De Bruijn sequences Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once. 1 1 0 1 1 0 1 0 0 0 1 0 0 1 0 1 n = 4; Σ = f0; 1g Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1 0 0 0 0 0 0 0 0 0 0 0 De Bruijn sequences Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once. 1 1 0 1 1 0 1 0 0 0 1 0 0 1 0 1 n = 4; Σ = f0; 1g Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1 0 0 0 0 0 0 0 0 0 0 0 De Bruijn sequences Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once. 1 1 0 1 1 0 1 0 0 0 1 0 0 1 0 1 n = 4; Σ = f0; 1g Tomasz Kociumaka, Jakub Radoszewski, Wojciech Rytter Lyndon Words and de Bruijn Sequences 6/17 1 0 1 0 0 0 0 0 0 0 0 0 0 0 De Bruijn sequences Definition A de Bruijn sequence of rank n is a cyclic sequence in which every word from Σn appears as a subword exactly once.