Binary De Bruijn Sequences Via Zech's Logarithms
Total Page:16
File Type:pdf, Size:1020Kb
SN Computer Science manuscript No. (will be inserted by the editor) Binary de Bruijn Sequences via Zech's Logarithms Zuling Chang · Martianus Frederic Ezerman · Adamas Aqsa Fahreza · San Ling · Janusz Szmidt · Huaxiong Wang Received: date / Accepted: date Abstract The focus of this work is to show how to tially used, for ease of generation. The process can be combine Zech's logarithms and each of the cycle joining repeated on each of the resulting de Bruijn sequences. and cross-join pairing methods to construct binary de Most prior constructions in the literature measure Bruijn sequences. Basic implementations are supplied the complexity of the corresponding bit-by-bit algo- as proofs of concept. rithms. Our approach is different. We aim first to build The cycles, in the cycle joining method, are typi- a connected adjacency subgraph that is certified to con- cally generated by a linear feedback shift register. We tain all of the cycles as vertices. The ingredients are prove a crucial characterization that determining Zech's computed just once and concisely stored. Simple strate- logarithms is equivalent to identifying conjugate pairs gies are offered to keep the complexities low as the order shared by any two distinct cycles. This speeds up the grows. task of building a connected adjacency subgraph that Keywords Binary de Bruijn sequence · cycle struc- contains all vertices of the complete adjacency graph. ture · conjugate pair · cross-join pair · Zech's logarithms Distinct spanning trees in either graph correspond to cyclically inequivalent de Bruijn sequences. As the cy- Mathematics Subject Classification (2000) 11B50 · cles are being joined, guided by the conjugate pairs, we 94A55 · 94A60 track the changes in the feedback function. We show how to produce certificates of existence for spanning trees of certain types to conveniently handle large or- 1 Introduction der cases. A binary de Bruijn sequence of order n 2 has pe- The characterization of conjugate pairs via Zech's N riod N = 2n in which each n-tuple occurs exactly once. logarithms, as positional markings, is then adapted to n−1 Up to cyclic equivalence, there are 22 −n of them [5]. identify cross-join pairs. A modified m-sequence is ini- A naive approach capable of generating all of these se- quences, e.g., by generating all Eulerian cycles in the de Bruijn graph, requires an exponential amount of space. Z. Chang School of Mathematics and Statistics, Zhengzhou University, Adding a 0 to the longest string of 0s in a maximal Zhengzhou 450001, China length sequence, also known as an m-sequence, of pe- E-mail: zuling [email protected] riod 2n − 1 produces a de Bruijn sequence of order n. M. F. Ezerman ( ) · A. A. Fahreza · S. Ling · H. Wang 1 n There are Λn := n φ(2 − 1) such m-sequences where Division of Mathematical Sciences, School of Physical and φ(·) is the Euler totient function. There is a bijection Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link, Singapore 637371 between the set of all m-sequences and the set of primi- E-mail: ffredezerman,adamas,lingsan,[email protected] tive polynomials of degree n in F2[x]. As n grows large, 2n−1−n J. Szmidt Λn soon becomes miniscule in comparison to 2 . Military Communication Institute, ul. Warszawska 22 A, 05- In terms of applications, there is quite a diversity in the 130 Zegrze, Poland community of users. This results in varied criteria as to E-mail: [email protected] what constitute a good construction approach. 2 Z. Chang et al. Mainly for cryptographic purposes, there has been a bits of storage in n units of time. Etzion and Lempel sustained interest in efficiently generating a good num- in [14] considered a shift register that produced many ber of de Bruijn sequences of large orders. For stream short cycles which were then joined together. They pro- cipher applications (see, e.g., the 16 eSTREAM final- posed two constructions based on, respectively, pure cy- ists in [35]), it is desirable to have high unpredictability, cling registers PCRs and pure summing registers PSRs. i.e., sequences with high nonlinearity, while still being For a PCR, the number of produced sequence is expo- somewhat easy to implement. This requires multiple nential with base 2 (see [14, Theorem 2] for the exact measures of complexity to be satisfied. Due to their lin- formula), with time complexity O(n) to produce the earity, modified m-sequences are unsuitable for crypto- next bit. It was quite expensive in memory, requiring graphic applications. approximately 3n plus the exponent of 2 in the exact Bioinformaticians want large libraries of sequences formula. For PSR, the number of produced sequence n2 to use in experiments [26] and in genome assembly [8]. is ≈ 2 4 with its bit-by-bit algorithm costing O(n) in Their design philosophy imposes some Hamming dis- time and O(n2) in memory. tance requirements, since a library must not have se- Huang showed how to join the pure cycles of the quences which are too similar to one another or exhibit- complementing circulating register f(x1; : : : ; xn) = x1 ing some taboo patterns. Secrecy or pseudorandomness in [24]. Its bit-by-bit algorithm took 4n bits of stor- is often inconsequential. Designers of very large scale age and 4n units of time. The resulting sequences tend integrated circuits often utilize binary de Bruijn mul- to have a good local 0 and 1 balance. In the work of tiprocessor network [36]. Important properties include Jansen et al. [28], the cycles were generated by an LFSR the diameter and the degree of the network. The objec- with feedback function f(x) that can be factored into tive here is to have a small layout that still allows for r irreducible polynomials of the same degree m. The many complete binary trees to be constructed system- number of produced de Bruijn sequences was approxi- atically for extensibility, fault-tolerance, and self diag- mately O(22n= log(2n)). Its bit-by-bit complexity was 3n nosability. bits of storage and at most 4n FSR shifts. Numerous subsequent works, until very recently, are listed with 1.1 Prior Literature the details on their input parameters and performance complexity in [7, Table 4]. Given its long history and numerous applications, the The cross-join pairing method begins with a known quest to efficiently generate de Bruijn sequences has de Bruijn sequence, usually a modified m-sequence. One produced a large literature and remains very active. then identifies cross-join pairs that allow the sequence, Most known approaches fall roughly into three types, seen as a cycle of period 2n, to be cut and reconnected each to be briefly discussed below. In terms of com- into inequivalent de Bruijn sequences from the initial plexity, it is customary to specify the memory and time one. Two representative works on this method were requirements to generate the next bit, i.e., the last en- done by Helleseth and Kløve in [23] and by Mykkeltveit try in the next state given a current state. Such an and Szmidt in [34]. algorithm is called a bit-by-bit algorithm. Readers who The third method, where we collect rather distinct are interested in the hardware implementation of de approaches, is less clear-cut than the first two. The Bruijn sequence generators can find a recent in-depth defining factor is that they follow some assignment rules treatment in [42]. or preference functions in producing one symbol at a As the name suggests, the cycle joining approach time using the previous n symbols. Some clarity can builds de Bruijn sequences by combining disjoint cy- perhaps be given by way of several recent examples. cles of smaller periods [18]. One begins, for example, They typically yield only a few de Bruijn sequences. with a linear feedback shift register (LFSR) with an ir- A necklace is a string in its lexicographically small- reducible but nonprimitive characteristic polynomial of est rotation. Stevens and Williams in [40] presented degree n. It produces disjoint cycles, each with a pe- an application of the necklace prefix algorithm to con- riod that divides 2n − 1. One identifies a conjugate pair struct de Bruijn sequences. Their loopless algorithm between any two distinct cycles and combines the cy- took O(n) time. They significantly simplified the FKM cles into a longer cycle. The process is repeated until algorithm, attributed to Fredricksen, Kessler, and Maio- all disjoint cycles have been combined into a de Bruijn rana, that concatenated the aperiodic prefixes of neck- sequence. The mechanism can of course be applied on laces in lexicographic order. In [37], Sawada et al. gave a all identifiable conjugate pairs. simple shift-rule, i.e., a function that maps each length Fredricksen applied the method on pure cycling reg- n substring to the next length n substring, for a de ister of length n in [17]. The bit-by-bit routine took 6n Bruijn sequence of order n. Its bit-by-bit algorithm ran Binary de Bruijn Sequences via Zech's Logarithms 3 in costant time using O(n) space. Instead of concate- demonstrated by examples. This simple strategy keeps nating necklaces in lexicographic order, Dragon et al. in the complexities low as the order grows. [12] used the co-lexicographic order to construct a de We then show how to adapt finding conjugate pairs Bruijn sequence in O(n) time. Natural subsequences of shared by two smaller cycles in the cycle joining method the lexicographically least sequence had been shown to to finding two conjugate pairs in a given de Bruijn se- be de Bruijn sequences for interesting subsets of strings.