Optimal Parallel Algorithms for String Matching
INFORMATION AND CONTROL 67, 144-157 (1985)

Optimal Parallel Algorithms for String Matching

ZVI GALIL*

Tel Aviv University, Tel Aviv, Israel and Columbia University, New York

Let WRAM [PRAM] be a parallel computer with p processors (RAMs) which share a common memory and are allowed simultaneous reads and writes [only simultaneous reads]. The only type of simultaneous write allowed is a simultaneous AND: a subset of the processors may write 0 simultaneously into the same memory cell. Let t be the time bound of the computer. We design below families of parallel algorithms that solve the string matching problem with inputs of size n (n is the sum of the lengths of the pattern and the text) and have the following performance in terms of p, t, and n:

(1) For WRAM: pt = O(n) for p ≤ n/log n (i.e., t ≥ log n).
(2) For PRAM: pt = O(n) for p ≤ n/log² n (i.e., t ≥ log² n).
(3) For WRAM: t = constant for p = n^{1+ε} and any ε > 0.
(4) For WRAM: t = Θ(log n/log log n) for p = n.

Similar families are also obtained for the problem of finding all initial palindromes of a given string. © 1985 Academic Press, Inc.

1. INTRODUCTION

We design parallel algorithms in the following model: p synchronized processors (RAMs) share a common memory. Any subset of the processors can simultaneously read from the same memory location. We sometimes allow simultaneous writing in the weakest sense: any subset of processors can write the value 0 into the same memory location (i.e., turn off a switch). We denote by WRAM [PRAM] the model that allows [does not allow] such simultaneous writing. We also consider (but only briefly) other models of parallel computation. We actually design a family of algorithms because we have a parameter p. The performance of the family is measured in terms of three parameters: p, the number of processors; t, the time; and n, the size of the problem instance. It is well known that every parallel algorithm with p processors and time t can easily be converted to a sequential algorithm of time pt.
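The conversion from parallel to sequential computation can be sketched in a few lines: a single processor replays each of the t synchronous parallel steps by executing the p processors' operations one at a time, for pt operations in total. The sketch below is our illustration of this folklore argument, not part of the paper; the representation of a "step" as per-processor read/write thunks is a hypothetical encoding chosen for brevity.

```python
def run_sequentially(steps, memory):
    """Simulate a synchronous parallel machine on one processor.

    `steps` is a list of t parallel steps; each step is a list of p
    callables, one per processor, each mapping a snapshot of shared
    memory (the read phase) to a list of (cell, value) writes.
    Replaying them one by one costs p operations per step,
    i.e. p*t sequential time overall.
    """
    for step in steps:
        # Read phase: every processor sees the same memory snapshot,
        # mimicking synchronous operation.
        pending = [proc(dict(memory)) for proc in step]
        # Write phase: apply all writes only after all reads are done.
        for writes in pending:
            for cell, value in writes:
                memory[cell] = value
    return memory
```

For example, one parallel step in which two processors each double their own cell becomes two sequential operations under this simulation.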
Hence the analog of a linear-time algorithm in sequential computation is a family of parallel algorithms with pt = O(n). We therefore call such algorithms optimal.

*Research supported by National Science Foundation Grants MCS-8303139 and DCR-85-11713. All logarithms are to the base 2.

Surprisingly, while there are many problems for which linear-time algorithms are known, there are very few problems for which optimal parallel algorithms are known for a wide range of p; so few, that we list them here. Every associative function of n variables can be computed by a PRAM in pt = O(n) for p ≤ n/log n. (Use a binary tree; each leaf "treats" n/p inputs.) For a certain subset of these functions including the n-variable OR (and AND), Ω(log n) time is needed on the PRAM (Cook and Dwork, 1982), so for these functions pt = O(n) is unattainable for p > n/log n. Consequently, the only question left is with how few processors we can compute these functions in constant time on a WRAM. The answer depends on the specific function. The n-variable OR (or AND) function can be computed trivially by a WRAM in pt = n for p ≤ n (i.e., in time 1 with n processors). The n-variable MAXIMUM function can be computed in pt = O(n) for p ≤ n/log log n and in constant time with n^{1+ε} processors (for every ε > 0) (Valiant, 1975; Shiloach and Vishkin, 1981). Optimal parallel algorithms are known for merging two sorted arrays (for p ≤ n/log n on a PRAM); merging can be done in constant time even by a PRAM with n^{1+ε} processors (Shiloach and Vishkin, 1981) and in time log log n with n processors (Valiant, 1975; Shiloach and Vishkin, 1981). Recently, optimal parallel algorithms were designed for the problem of converting an expression to its parse tree (Bar-On and Vishkin, 1983) and for selection (Vishkin, 1983).
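The binary-tree scheme for associative functions can be made concrete. In the sketch below (our illustration under the stated processor bound, not code from the paper), each of the p simulated processors first folds its block of roughly n/p inputs sequentially (the "leaf" phase), and the p partial results are then combined in ⌈log₂ p⌉ tree rounds, so the simulated parallel time is O(n/p + log p) = O(n/p) whenever p ≤ n/log n.

```python
import math

def reduce_block(block, op):
    """Sequentially fold one processor's block of inputs."""
    acc = block[0]
    for v in block[1:]:
        acc = op(acc, v)
    return acc

def parallel_reduce(values, op, p):
    """Optimal-work reduction sketch for an associative op.

    Leaf phase: p processors fold ceil(n/p) inputs each.
    Tree phase: the p partial results are halved each round,
    as in a binary tree whose leaves "treat" n/p inputs.
    """
    n = len(values)
    chunk = math.ceil(n / p)
    partial = [reduce_block(values[i:i + chunk], op)
               for i in range(0, n, chunk)]
    while len(partial) > 1:
        nxt = [op(partial[j], partial[j + 1])
               for j in range(0, len(partial) - 1, 2)]
        if len(partial) % 2:
            nxt.append(partial[-1])   # odd element rides up a level
        partial = nxt
    return partial[0]
```

The same skeleton computes OR, AND, MAXIMUM, or any other associative function by swapping `op`.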
What is common to all these problems except selection is that for each one of them there is a trivial (sequential) underlying linear-time algorithm. In this paper we design optimal parallel algorithms for string matching. The linear-time algorithm for string matching is by now very well understood, but at one time it was quite a major discovery. Unlike the case of computing n-variable functions (where it is trivial) and merging (where it is quite simple), designing optimal parallel algorithms for string matching was not immediate. As for the problems mentioned above, we designed other parallel algorithms that perform string matching on a WRAM in constant time with only n^{1+ε} processors. As in the cases above, the time is proportional to 1/ε. If only n processors are available, the time needed is O(log n/log log n). The families of algorithms we design have several appealing features:

1. They are not derived from any of the variants of the linear-time sequential algorithms (Knuth, Morris, and Pratt, 1977; Boyer and Moore, 1977). The latter do not seem to be parallelizable, because they sequentially construct tables which are used sequentially. So even getting the tables for free does not seem to help much. Two known algorithms are parallelizable but do not yield optimal parallel algorithms: the O(n log n) algorithm in (Karp, Miller, and Rosenberg, 1972) yields pt = O(n log² n), and the probabilistic linear-time algorithm in (Karp and Rabin, 1980) yields a probabilistic family with pt = O(n log n).

2. The algorithms we design are all derived from one algorithm: an algorithm for WRAM with p = O(n) and t = O(log n) for the case that the text is twice as long as the pattern.

3. The algorithms make use of properties of periodicities in strings derived from the periodicity lemma, which states that two different periodicities cannot coexist too long (if they do, then there is a common refinement).
Similar properties were used in a different way to design a linear-time algorithm for string matching which uses only a constant number (five) of registers (Galil and Seiferas, 1983). Therefore, we may have here an example of a relationship between space complexity in sequential computation and parallel time at the lowest level.

4. As in the algorithm in Galil and Seiferas (1983), it is possible to write a very short program (for each processor), but a longer explanation is needed, mainly because the algorithm implicitly uses properties of periodicities several times.

5. The algorithms use what seems to be a novel method of communication among the various processors, as will be indicated below.

String matching is the following problem. The input consists of two strings, x (the pattern) and y (the text), over a given alphabet of a fixed size. The output is a Boolean array indicating all the occurrences of x in y. In Section 2 we prove several simple facts about periodicities of strings used by the algorithm. In Section 3 we sketch the main algorithm, which is not optimal (p = O(n), t = O(log n)) and only deals with a special case (|y| = 2|x|). In Section 4 we complete the details of the algorithm. In Section 5 we show how the four families of parallel algorithms mentioned in the abstract above are derived from the main algorithm. In Section 6 we briefly discuss other models of parallel computation and the problem of finding all initial palindromes of a given string.

2. PERIODICITY IN STRINGS

A string u is a period of a string w if w is a prefix of u^k for some k or, equivalently, if w is a prefix of uw. We call the shortest period of a string w the period of w. Thus a is the period of aaaaaaa, while aa, aaa, etc. are also periods of it. We say that w has period size P if the length of the period of w is P. If w is at least twice as long as its period, we say that w is periodic. We will consider prefixes of the pattern x of increasing length.
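These definitions are easy to check mechanically. A standard sequential way to find the period of a string (an aid to the definitions above, not the paper's parallel method) is via the Knuth-Morris-Pratt failure function: the period size of w equals |w| minus the length of the longest proper border of w.

```python
def period_size(w):
    """Length of the (shortest) period of w, via the KMP failure
    function: period size = |w| - (longest proper border of w)."""
    n = len(w)
    border = [0] * n   # border[i] = longest proper border of w[:i+1]
    k = 0
    for i in range(1, n):
        while k > 0 and w[i] != w[k]:
            k = border[k - 1]
        if w[i] == w[k]:
            k += 1
        border[i] = k
    return n - border[-1]

def is_periodic(w):
    """w is periodic iff it is at least twice as long as its period."""
    return len(w) >= 2 * period_size(w)
```

For instance, abcabcab has period size 3 (period abc) and is periodic, whereas abcd has period size 4 and is not.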
Assume we consider a prefix u and then a longer prefix v. In the case that u is periodic, we will say that the periodicity continues in v if the period of v is the same as the period of u (e.g., u = abcabcab, v = abcabcabcabcabca), and that the periodicity terminates otherwise (e.g., the same u, v = abcabcabcd...). We will need some simple facts about periodicities.

FACT 1 (The periodicity lemma; Lyndon and Schutzenberger, 1962). If w has two periods of size P and Q and |w| ≥ P + Q, then w has a period of size gcd(P, Q).

For a one-line proof see Galil and Seiferas (1983). In the rest of this section let z be a given fixed string. An occurrence of v at j will mean an occurrence of v at position j in z (i.e., starting with z_j).

FACT 2. If v occurs at j and at j + β, β ≤ |v|/2, then (1) v is periodic with a period of length β, and (2) v occurs at j + P, where P is the period size of v.

The first half of Fact 2 follows from the alternative definition of period (u is a period of w if w is a prefix of uw).
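The periodicity lemma can be sanity-checked by brute force. The sketch below (our illustration, not part of the paper) tests whether P is a period size of a string using the equivalent condition w[i] = w[i-P] for all i ≥ P, and exercises Fact 1 on an example with two coexisting periods:

```python
from math import gcd

def has_period(w, P):
    """True iff w has a period of size P, i.e. w is a prefix of
    u^k for u = w[:P]; equivalently w[i] == w[i-P] for all i >= P."""
    return 1 <= P <= len(w) and all(
        w[i] == w[i - P] for i in range(P, len(w)))

# Periodicity lemma (Fact 1): if w has periods of size P and Q
# and len(w) >= P + Q, then w has a period of size gcd(P, Q).
w = "abababababab"                 # periods of size 2, 4, 6, ...
assert has_period(w, 4) and has_period(w, 6)
assert len(w) >= 4 + 6
assert has_period(w, gcd(4, 6))    # gcd = 2, the period of w
```

The condition w[i] = w[i-P] is exactly the "common refinement" being tested: when two periodicities overlap long enough, the shifts compose and force the finer gcd-size period.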