
A Simple Parallel Cartesian Tree Algorithm and its Application to Suffix Tree Construction∗

Guy E. Blelloch†    Julian Shun‡

∗ This work was supported by generous gifts from Intel, Microsoft and IBM, and by the National Science Foundation under award CCF1018188.
† Carnegie Mellon University, E-mail: [email protected]
‡ Carnegie Mellon University, E-mail: [email protected]

Abstract

We present a simple linear work and space, and polylogarithmic time parallel algorithm for generating multiway Cartesian trees. As a special case, the algorithm can be used to generate suffix trees from suffix arrays on arbitrary alphabets in the same bounds. In conjunction with parallel suffix array algorithms, such as the skew algorithm, this gives a rather simple linear work parallel algorithm for generating suffix trees over an integer alphabet Σ ⊆ [1, ..., n], where n is the length of the input string. More generally, given a sorted sequence of strings and the longest common prefix lengths between adjacent elements, the algorithm will generate a pat tree (compacted trie) over the strings.

We also present experimental results comparing the performance of the algorithm to existing sequential implementations and a second parallel algorithm. We present comparisons for the Cartesian tree algorithm on its own and for constructing a suffix tree using our algorithm. The results show that on a variety of strings our algorithm is competitive with the sequential version on a single processor and achieves good speedup on multiple processors.

1 Introduction

For a string s of length n over a character set Σ ⊆ {1, ..., n}¹, the suffix-tree data structure stores all the suffixes of s in a pat tree (a trie in which maximal branch-free paths are contracted into a single edge). In addition to supporting searches in s for any string t ∈ Σ* in O(|t|) expected time², suffix trees efficiently support many other operations on strings, such as longest common substring, maximal repeats, longest repeated substrings, and longest palindrome, among many others [Gus97]. As such it is one of the most important data structures for string processing. For example, it is used in several bioinformatic applications, such as REPuter [KS99], MUMmer [DPCS02], OASIS [MPK03] and Trellis+ [PZ08]. Both suffix trees and a linear time algorithm for constructing them were introduced by Weiner [Wei73] (although he used the term position tree). Since then various similar constructions have been described [McC76] and there have been many implementations of these algorithms. Although originally designed for fixed-sized alphabets with deterministic linear time, Weiner's algorithm can work on an alphabet {1, ..., n}, henceforth [n], in linear expected time simply by using hashing to access the children of a node.

¹ More general alphabets can be used by first sorting the characters and then labeling them from 1 to n.
² Worst-case time for constant-sized alphabets.

The algorithm of Weiner and its derivatives are all incremental and inherently sequential. The first parallel algorithm for suffix trees was given by Apostolico et al. [AIL+88] and was based on a quite different doubling approach. For a parameter 0 < ε ≤ 1 the algorithm runs in O((1/ε) log n) time, O((n/ε) log n) work and O(n^(1+ε)) space on the CRCW PRAM for arbitrary alphabets. Although reasonably simple, this algorithm is likely not practical since it is not work efficient and uses superlinear memory (by a polynomial factor). The parallel construction of suffix trees was later improved to linear work and space by Hariharan [Har94], with an algorithm taking O(log^4 n) time on the CREW PRAM, and then by Farach and Muthukrishnan to O(log n) time using a randomized CRCW PRAM [FM96] (high-probability bounds). These later results are for a constant-sized alphabet, are "considerably non-trivial", and do not seem to be amenable to efficient implementations.

One way to construct a suffix tree is to first generate a suffix array (an array of pointers to the lexicographically sorted suffixes), and then convert it to a suffix tree. For binary alphabets, and given the length of the longest common prefix (LCP) between adjacent entries, this conversion can be done sequentially by generating a Cartesian tree in linear time and space. The approach can be generalized to arbitrary alphabets using multiway Cartesian trees without much difficulty (the sketch below illustrates the suffix array and LCP inputs to such a conversion). Using suffix arrays is attractive since in recent years there have been considerable theoretical and practical advances in the generation of suffix arrays (see e.g. [PST07]). The interest is partly due to their need in the widely used Burrows-Wheeler compression algorithm, and also as a more space-efficient alternative to suffix trees. As such there have been dozens of papers on efficient implementations of suffix arrays. Among these, Kärkkäinen and Sanders have developed a quite simple and efficient parallel algorithm for suffix arrays [KS03, KS07] that can also generate LCPs.
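For concreteness, the following is a minimal sketch of the two inputs to such a conversion: a suffix array obtained by directly sorting the suffix indices, and the LCPs between lexicographically adjacent suffixes. It is only an O(n^2 log n)-style illustration of the definitions, not one of the efficient suffix array algorithms discussed in this paper; the function names are illustrative only.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Naive suffix array: indices of all suffixes of s in lexicographic order.
// SA[i] = j means the suffix starting at position j is the i'th smallest.
std::vector<int> naiveSuffixArray(const std::string& s) {
  int n = (int)s.size();
  std::vector<int> sa(n);
  std::iota(sa.begin(), sa.end(), 0);
  std::sort(sa.begin(), sa.end(), [&](int a, int b) {
    // Compare the suffix starting at a with the suffix starting at b.
    return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
  });
  return sa;
}

// LCP[i] = length of the longest common prefix of the suffixes at
// SA[i-1] and SA[i]; LCP[0] is defined to be 0 here.
std::vector<int> adjacentLCPs(const std::string& s, const std::vector<int>& sa) {
  int n = (int)s.size();
  std::vector<int> lcp(n, 0);
  for (int i = 1; i < n; i++) {
    int a = sa[i - 1], b = sa[i], l = 0;
    while (a + l < n && b + l < n && s[a + l] == s[b + l]) l++;
    lcp[i] = l;
  }
  return lcp;
}
```

For example, for s = "banana" the suffix array is [5, 3, 1, 0, 4, 2] and the adjacent LCPs are [0, 1, 3, 0, 0, 2].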
The story with generating Cartesian trees in parallel is less satisfactory. Berkman et al. [BSV93] describe a parallel algorithm for the all nearest smaller values (ANSV) problem, which can be directly used to generate a binary Cartesian tree for fixed-sized alphabets. However, it cannot directly be used for non-constant-sized alphabets, and the algorithm is very complicated. Iliopoulos and Rytter [IR04] present two much simpler algorithms for generating suffix trees from suffix arrays, one based on merging and one based on a variant of the ANSV problem that allows for multiway Cartesian trees. However, they both require O(n log n) work.

In this paper we describe a linear work, linear space, and polylogarithmic time algorithm for generating multiway Cartesian trees. The algorithm is based on divide-and-conquer and we describe two versions that differ in whether the merging step is done sequentially or in parallel. The first, based on a sequential merge, is very simple, and for a tree of height d, it runs in O(min{d log n, n}) time on the CREW PRAM. The second version is only slightly more complicated and runs in O(log^2 n) time on the CREW PRAM. They both use linear work and space.

Given any linear work and space algorithm for generating a suffix array and corresponding LCPs in O(S(n)) time, our results lead to a linear work and space algorithm for generating suffix trees in O(S(n) + log^2 n) time. For example, using the skew algorithm [KS03] on a CRCW PRAM we have O(log^2 n) time for constant-sized alphabets and O(n^ε), 0 < ε ≤ 1, time for the alphabet [n]. We note that a polylogarithmic time, linear work and linear space algorithm for the alphabet [n] would imply a stable radix sort on [n] in the same bounds, which is a long-standing open problem.

For comparison we also present a technique for using the ANSV problem to generate multiway Cartesian trees on arbitrary alphabets in linear work and space. The algorithm runs in O(I(n) + log n) time on the CRCW PRAM, where I(n) is the best time bound for a linear-work stable sorting of integers from [n].

We have implemented the algorithm and present experimental results on a 32-core parallel machine on a variety of inputs. First, we compare our Cartesian tree algorithm with a simple stack-based sequential implementation. On one core our algorithm is about 3x slower, but we achieve about 30x speedup on 32 cores and about 45x speedup with 32 cores using hyperthreading (two threads per core). We also analyze the algorithm when used as part of code to generate a suffix tree from the original string. We compare the code to the ANSV-based algorithm described in the previous paragraph and to existing sequential implementations of suffix trees. Our algorithm is always faster than the ANSV algorithm. The algorithm is competitive with the sequential code on a single processor (core), and achieves good speedup on 32 cores. Finally, we present timings for searching multiple strings in the suffix tree we generate. Our times are always faster than the sequential suffix tree on one core and always more than 50x faster on 32 cores using hyperthreading.
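The stack-based sequential baseline mentioned above is not spelled out in this section; as a rough illustration of that style of construction, the sketch below builds a binary Cartesian tree over an array of values (for the suffix tree application, the LCP values) with the standard right-spine stack method in linear time. It is an assumed illustration of such a baseline, not the code evaluated in the experiments and not the parallel algorithm of this paper.

```cpp
#include <vector>

struct Node {
  int value;   // e.g., an LCP value
  int left;    // index of left child, or -1 if none
  int right;   // index of right child, or -1 if none
};

// Builds a binary Cartesian tree over 'values': min-heap ordered by value,
// with an inorder traversal giving back the original sequence.
// Returns the index of the root (-1 for an empty input). Runs in O(n) time,
// since each node is pushed onto and popped from the stack at most once.
int buildCartesianTree(const std::vector<int>& values, std::vector<Node>& nodes) {
  nodes.clear();
  std::vector<int> rightSpine;  // stack of node indices along the right spine
  for (int i = 0; i < (int)values.size(); i++) {
    nodes.push_back({values[i], -1, -1});
    int lastPopped = -1;
    // Pop nodes with strictly larger value; they move into i's left subtree.
    while (!rightSpine.empty() && nodes[rightSpine.back()].value > values[i]) {
      lastPopped = rightSpine.back();
      rightSpine.pop_back();
    }
    nodes[i].left = lastPopped;
    if (!rightSpine.empty()) nodes[rightSpine.back()].right = i;
    rightSpine.push_back(i);
  }
  return rightSpine.empty() ? -1 : rightSpine.front();
}
```

In the suffix tree setting, nodes connected by equal values would then be grouped to form the multiway Cartesian tree referred to above; that grouping step is omitted from this sketch.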
2 Preliminaries

Given a string s of length n over an ordered alphabet Σ, the suffix array, SA, represents the n suffixes of s in lexicographically sorted order. To be precise, SA[i] = j if and only if the suffix starting at the j'th position in s appears in the i'th position in the suffix-sorted order. A pat tree [Mor68] (or patricia tree, or compacted trie) of a set of strings S is a modified trie in which (1) edges can be labeled with a sequence of characters instead of a single character, (2) no node has a single child, and (3) every string in S corresponds to the concatenation of labels for a path from the root to a leaf. Given a string s of length n, the suffix tree for s stores the n suffixes of s in a pat tree.

In this paper we assume an integer alphabet Σ ⊆ [n], where n is the total number of characters. We require that the pat tree and suffix tree support the following queries on a node in constant expected time: finding the child edge based on the first character of the edge, finding the first child, finding the next and previous sibling in the character order, and finding the parent. If the alphabet is constant sized, all these operations can easily be implemented in constant worst-case time.

A Cartesian tree [Vui80] on a sequence of elements taken from a total order is a binary tree that satisfies two properties: (1) heap order on values, i.e. a node has an equal or lesser value than any of its descendants, and (2) an inorder traversal of the tree defines the original sequence order.
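To make the node interface described above concrete, the following is a minimal sketch of a pat tree node over an integer alphabet (an assumed layout, not necessarily the representation used in the implementation): a hash table keyed by the first character of each child edge provides constant expected time child lookup, while explicit parent, first-child and sibling pointers provide the remaining queries.

```cpp
#include <unordered_map>

// Sketch of a pat tree / suffix tree node over an integer alphabet.
// Edge labels are stored as (start, length) offsets into the input string.
struct TreeNode {
  int edgeStart = 0, edgeLength = 0;   // label of the edge into this node
  TreeNode* parent = nullptr;
  TreeNode* firstChild = nullptr;      // child with the smallest first character
  TreeNode* nextSibling = nullptr;     // next child of parent in character order
  TreeNode* prevSibling = nullptr;     // previous child of parent in character order

  // Maps the first character of a child's edge to that child, giving
  // constant expected time child lookup. For a constant-sized alphabet a
  // direct-indexed array would give constant worst-case time instead.
  std::unordered_map<int, TreeNode*> children;

  TreeNode* findChild(int firstChar) const {
    auto it = children.find(firstChar);
    return it == children.end() ? nullptr : it->second;
  }
};
```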