External Inverse Pattern Matching

Leszek Gąsieniec*        Piotr Indyk†        Piotr Krysta‡

Abstract

We consider the external inverse pattern matching problem. Given a text T of length n over an ordered alphabet Σ, such that |Σ| = σ, and a number m ≤ n, the problem is to find a pattern P'_MAX ∈ Σ^m which is not a subword of T and which maximizes the sum of Hamming distances between P'_MAX and all subwords of T of length m. We present an optimal O(n log σ)-time algorithm for the external inverse pattern matching problem, which substantially improves the only known polynomial O(nm log σ)-time algorithm, due to Amir, Apostolico and Lewenstein. Moreover, we discuss a fast parallel implementation of our algorithm on the CREW PRAM model.

Topics: algorithms and data structures, string algorithms, parallel algorithms.

* Max-Planck-Institut für Informatik, Im Stadtwald, Saarbrücken, Germany. Email: leszek@mpi-sb.mpg.de.

† Computer Science Department, Stanford University, Gates Building, Stanford, CA, USA. Email: indyk@cs.stanford.edu.

‡ Institute of Computer Science, University of Wrocław, Przesmyckiego, Wrocław, Poland. Email: pkrysta@ii.uni.wroc.pl. Research of this author was partially supported by a KBN grant.

Introduction

Given a string T, later called a text, of length n over an alphabet Σ, the inverse pattern matching problem is to find a word P_MIN (or P_MAX) which minimizes (respectively, maximizes) the sum of Hamming distances between P_MIN (P_MAX) and all subwords of length m in the text T. One can also consider two variations of the problem, in which the optimal word is supposed to occur in the text T or, oppositely, its occurrence in T is forbidden. The two variations are called, respectively, the internal and the external inverse pattern matching problem. It is assumed that in the internal inverse pattern matching the desired internal pattern P_MIN must minimize the sum of distances, whereas in the external case the optimal external pattern maximizes the entire sum. As reported by Amir, Apostolico and Lewenstein, inverse pattern matching appears naturally and finds applications in several fields, like information retrieval, data compression, computer security and molecular biology. For example, the external inverse pattern matching can be used in the context of intrusion or plagiarism detection, or in the synthesis of molecular probes in genome sequencing by hybridization.

It was shown by Amir, Apostolico and Lewenstein that the inverse pattern matching problem can be solved in time O(n log σ) when no additional restriction on P_MAX or P_MIN is assumed. However, it turned out that the internal inverse pattern matching problem appears to be significantly harder. Amir et al. presented two algorithms for this problem. The first algorithm, which is reasonably simple, has running time O(nm log σ). The second one uses more sophisticated techniques, like convolutions for Hamming distance computation, and runs in time O(n √(m log m)). Amir et al. have also shown a reduction from the all mismatches problem to the internal inverse pattern matching. Any improvement for the all mismatches problem is a long-standing open problem, thus it looks quite unlikely that one can get a faster algorithm for the internal inverse pattern matching. However, Amir et al. show that, using approximate mismatch counting techniques, one can get a faster superlinear solution for the internal case when approximate answers are allowed.

The best known (to our knowledge) O(nm log σ)-time algorithm for the external inverse pattern matching was also given by Amir et al. They introduced the notion of m-stems for the text T, i.e., all possible words of length at most m which do not belong to T but whose all proper prefixes form subwords of T. It was shown that the optimal external pattern P'_MAX can be composed of some m-stem of T extended by a proper-size suffix of the maximal word P_MAX. Unfortunately, the straightforward application of the m-stem approach leads to an O(nm log σ)-time solution, because one has to look for m-stems by testing all text subwords of length at most m. In this paper we show how to perform the tests of text subwords more efficiently. We present a new and optimal O(n log σ)-time algorithm for the external inverse pattern matching problem, showing that the internal case is the bottleneck in inverse pattern matching. The optimality comes from the complexity of the element distinctness problem, to which the external inverse pattern matching can be reduced. The new efficient solution is a consequence of a deeper analysis of the relation between the maximal words P_MAX, P'_MAX and the text T. Our main algorithm uses efficient algorithmic techniques, like compact suffix trees, range minimum queries and lowest common ancestor queries in trees, supported by an online computation of symbol weights (defined later).

The rest of the paper is organized as follows. In the next section we introduce the notation and the basic techniques used in our algorithm. The section that follows contains the main algorithm, together with its complexity analysis and a proof of correctness; there we also discuss a parallel implementation of our algorithm. The last section contains final remarks and states some open problems in the related areas.

Preliminaries

Let Σ be an ordered alphabet containing σ symbols, i.e., |Σ| = σ. Any sequence of concatenated symbols from Σ is called a word or a string. We use the symbol · to denote the operation of concatenation, but the symbol is omitted in cases where the use of concatenation is natural. We write w[i] for the i-th symbol of the word w, and w[i..j] for the substring of w which starts at position i and ends at position j; w' stands for the string w without its first symbol, while the symbol ε stands for the empty string. For example, let w be a string of length n, i.e., |w| = n. Then w = w[1..n], w' = w[2..n], w[i..i] = w[i], and w[i..j] = ε when i > j. Any subword of w of the form w[1..i], for i ∈ {1, ..., n}, is called a prefix of w, and any subword of the form w[j..n], for j ∈ {1, ..., n}, is called a suffix of w. We write u ⊄ w when u is not a subword of the string w. In case u ⊄ w we say that the word u is an external string for w. A tree which stores a given set of words S, and whose edges are labeled by symbols drawn from the alphabet Σ, is called a trie for S. Any sequence v_1, ..., v_k of neighboring nodes (with respect to the parent-child relation) in a tree, such that all v_i's are pairwise distinct, is called a path. Each word w ∈ S is represented in the trie as a path from the root to some leaf. Recall that a trie is a prefix tree, i.e., two words share a common path from the root as long as they share the same prefix. A path v_1, ..., v_k in which every node but the last has degree one (i.e., exactly one child) is called a chain.

Problem Definition

Let T be a text over Σ such that |T| = n, and let P be a string over Σ such that |P| = m ≤ n. The Hamming distance between the word P and the subword of T of length |P| starting at position i is defined as follows:

    H(P, T[i..i+|P|-1]) = Σ_{j=1..|P|} h(P[j], T[i+j-1]),

where, for any two symbols a, b ∈ Σ,

    h(a, b) = 1 if a ≠ b,  and  h(a, b) = 0 if a = b.

In other words, the Hamming distance H(P, T[i..i+|P|-1]) gives the number of mismatches between the symbols of the aligned words T[i..i+|P|-1] and P. In this paper we are primarily interested in the total Hamming distance between the word P and all its alignments in the text T, which is defined as

    H(P, T) = Σ_{i=1..|T|-|P|+1} H(P, T[i..i+|P|-1]).
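As a small illustration of these definitions, the following sketch (our own naming, not part of the original algorithm) computes both quantities directly from the formulas above; it is the naive O(nm)-time reference computation, not the algorithm developed in this paper.

def hamming(P, W):
    # H(P, W): number of mismatching positions between the equal-length words P and W
    assert len(P) == len(W)
    return sum(1 for a, b in zip(P, W) if a != b)

def total_hamming(P, T):
    # H(P, T): sum of Hamming distances between P and all full alignments of P in T
    m = len(P)
    return sum(hamming(P, T[i:i + m]) for i in range(len(T) - m + 1))

For example, total_hamming("ab", "aabb") returns 2: the three alignments contribute distances 1, 0 and 1.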

Now we are ready to introduce the entire problem.

Problem (External Inverse Pattern Matching). Given a text string T ∈ Σ^n, where |Σ| = σ, and a positive integer m such that m ≤ n, the entire problem is to find a pattern P'_MAX ∈ Σ^m such that P'_MAX ⊄ T and H(P'_MAX, T) ≥ H(P, T) for all strings P ∈ Σ^m which do not belong to T.

Notice that if the text T contains all possible strings of length m, then the external inverse pattern matching has no solution.

According to the definition of the entire problem, the i-th symbol of the desired pattern P'_MAX can be aligned only with positions i to n−m+i in the text T, since we are interested only in full alignments of the pattern P'_MAX in T. The latter observation naturally defines m different ranges in the text T, such that the i-th range is associated with the i-th position in the pattern P'_MAX. We call these m consecutive ranges windows Win_1, ..., Win_m; simply, Win_i = T[i..n−m+i]. Let π_i be the frequency function defined in the i-th window Win_i, i.e., π_i(a) equals the number of all occurrences of the symbol a in Win_i, for all a ∈ Σ and i ∈ {1, ..., m}. The weight w_i of symbols in the i-th window is defined as follows:

    w_i(a) = (n − m + 1) − π_i(a),  for all i ∈ {1, ..., m} and a ∈ Σ.

In terms of the weights, the entire problem can be viewed as looking for a string P ⊄ T of length m which maximizes the sum

    Σ_{i=1..m} w_i(P[i]).

Notice that the sum is maximized when the i-th position of the pattern P is occupied by the least frequent (or, equivalently, the heaviest) symbol in the window Win_i, which corresponds to the definition of the maximal word P_MAX, the maximal solution of the general inverse pattern matching. However, in the external inverse pattern matching the optimal pattern P'_MAX maximizes the sum among all external strings for the text T.
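The weighted reformulation can be made concrete with the following sketch (our own naming and interface, for illustration only): it computes the window frequencies by sliding from Win_1 to Win_m and evaluates the sum above, which for any pattern P of length m equals the total Hamming distance H(P, T).

def window_weights(T, m):
    # weights[i][a] = (n-m+1) - (number of occurrences of a in Win_{i+1} = T[i..n-m+i]),
    # computed by sliding the window one position at a time (0-based Python indices).
    n = len(T)
    alignments = n - m + 1
    freq = {}
    for a in T[:alignments]:                     # Win_1
        freq[a] = freq.get(a, 0) + 1
    weights = []
    for i in range(m):
        weights.append({a: alignments - f for a, f in freq.items()})
        if i + 1 < m:                            # slide: T[i] leaves, T[alignments+i] enters
            freq[T[i]] -= 1
            freq[T[alignments + i]] = freq.get(T[alignments + i], 0) + 1
    return weights

def total_weight(P, T):
    # Evaluates sum_{i=1..m} w_i(P[i]); a symbol absent from Win_i has weight n-m+1.
    m = len(P)
    alignments = len(T) - m + 1
    weights = window_weights(T, m)
    return sum(weights[i].get(P[i], alignments) for i in range(m))

Note that the table of all window weights occupies Θ(mσ) space; the sequential algorithm below never builds it explicitly and instead maintains a single window online.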

Throughout the rest of the paper we use this weighted version of the external inverse pattern matching. Moreover, before the main algorithm starts, we transform the input string into one which consists of numbers from the range 1, ..., σ, such that every symbol from the alphabet Σ is substituted by a unique number. Thus from now on we assume that symbols can be treated as small numbers. Since the alphabet Σ is ordered, the transformation can be performed simply in time O(n log σ), which does not violate the complexity of the entire algorithm.

Basic Techniques

In the following subsections we recall some basic techniques used in our algorithm.

Compact suffix tree and compact trie

A suffix tree of a word w is a trie which represents all suffixes of w. Notice that in the worst case the size of the suffix tree can be quadratic in the size of the input string. However, since the suffix tree has exactly |w| = n leaves, corresponding to all the suffixes, it can be stored in linear space as follows. Every chain in the suffix tree is represented by a pair of integers (i, j) which refers to the subword w[i..j]. There are exactly n leaves in the suffix tree, thus the number of internal nodes of degree greater than one and the number of chains are both not greater than n. The linear representation of a suffix tree is called a compact suffix tree, and it is a known fact that for a word w with |w| = n it can be constructed in time O(n log σ). In this paper we also work with tries with a compact description of chains. Recall that a chain is a path v_1, ..., v_k whose all nodes but the last have degree one. For our purposes only subwords of a single text are stored in the trie, which means that all the chains in the trie represent substrings of the same text. All the chains are exchanged by edges labeled by pairs of indices describing a position of the corresponding subword in the text. It is important that our definition of a chain implies that each node of the trie of degree greater than one has all outgoing edges of length one, which means that these edges are labeled by single symbols.

Range minimum search

Given a vector V[1..n] of n numbers, a range query for a pair (i, j), where 1 ≤ i ≤ j ≤ n, asks for the minimum among all numbers in the range V[i..j]. The main goal in the range minimum search problem is to preprocess the vector V efficiently, such that the range queries can be answered as fast as possible. Gabow, Bentley and Tarjan gave a linear-time preprocessing algorithm for the range minima that results in constant-time query retrieval.
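For illustration, here is a simple sparse-table range minimum structure (our own code); it preprocesses in O(n log n) time rather than the linear time of the scheme cited above, but it answers queries in constant time, which is all the algorithm below relies on.

class RangeMin:
    # Sparse table: table[k][i] = min of V[i..i+2^k-1]; queries combine two blocks.
    def __init__(self, V):
        n = len(V)
        self.log = [0] * (n + 1)
        for i in range(2, n + 1):
            self.log[i] = self.log[i // 2] + 1
        self.table = [list(V)]
        k = 1
        while (1 << k) <= n:
            prev = self.table[k - 1]
            row = [min(prev[i], prev[i + (1 << (k - 1))]) for i in range(n - (1 << k) + 1)]
            self.table.append(row)
            k += 1

    def query(self, i, j):
        # minimum of V[i..j], 0-based and inclusive
        k = self.log[j - i + 1]
        return min(self.table[k][i], self.table[k][j - (1 << k) + 1])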

Lowest common ancestor in a tree

Let T be a rooted tree with a root r. For any node x ∈ T let branch(x) denote the path from the node x to the root r, and let depth(x) denote the distance (the length of the path) from the node x to the root r. Given two nodes v, w ∈ T, the node v is an ancestor of the node w iff v ∈ branch(w). For example, the root r is an ancestor of all nodes in T. A lowest common ancestor of any pair of nodes v, w ∈ T is a node u ∈ T with the greatest possible depth(u), such that u ∈ branch(v) and u ∈ branch(w). Harel and Tarjan showed that after a linear-time preprocessing of the tree T all lowest common ancestor queries can be answered in constant time. Moreover, Schieber and Vishkin introduced a simpler algorithm with the same sequential time bounds and its parallel counterpart with linear-work, O(log n)-time preprocessing and constant-time queries.
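The following binary-lifting sketch (ours) illustrates the branch/depth/ancestor notions above; it preprocesses in O(n log n) time and answers queries in O(log n) time, a simpler stand-in for the constant-time-query schemes of Harel-Tarjan and Schieber-Vishkin.

from collections import deque

class LCA:
    # Binary-lifting lowest common ancestor structure.
    def __init__(self, children, root):
        # children: dict mapping a node to the list of its children in a rooted tree
        self.depth = {root: 0}
        parent = {root: root}
        order = [root]
        dq = deque([root])
        while dq:                                  # BFS computes depths and parents
            v = dq.popleft()
            for c in children.get(v, []):
                self.depth[c] = self.depth[v] + 1
                parent[c] = v
                order.append(c)
                dq.append(c)
        self.LOG = max(1, max(self.depth.values()).bit_length())
        self.up = [parent]                         # up[k][v] = 2^k-th ancestor of v
        for k in range(1, self.LOG + 1):
            prev = self.up[k - 1]
            self.up.append({v: prev[prev[v]] for v in order})

    def query(self, v, w):
        # lift the deeper node, then lift both nodes until their branches meet
        if self.depth[v] < self.depth[w]:
            v, w = w, v
        diff = self.depth[v] - self.depth[w]
        for k in range(self.LOG + 1):
            if diff & (1 << k):
                v = self.up[k][v]
        if v == w:
            return v
        for k in range(self.LOG, -1, -1):
            if self.up[k][v] != self.up[k][w]:
                v, w = self.up[k][v], self.up[k][w]
        return self.up[0][v]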

External Inverse Pattern Matching Algorithm

We start this section by recalling known facts about the external inverse pattern matching introduced by Amir et al. The following notion of m-stems plays a crucial role in the approach of Amir et al. as well as in our algorithm.

Definition 1. Any string R = R[1..l] over the alphabet Σ, for l ∈ {1, ..., m}, is called an m-stem for the text T = T[1..n] iff the word R[1..l−1] is a subword of T[1..n−m+l−1] but R[1..l] is not a subword of T[1..n−m+l].

Assume that we have already computed the maximal word P_MAX using the techniques of Amir et al. Recall that the cell P_MAX[i] contains the heaviest symbol in the window Win_i. Roughly speaking, the construction of the word P_MAX can be done as follows. First one computes the frequencies of all symbols in the window Win_1, which can be done in linear time. Since the difference between the symbol frequencies in any two neighboring windows is small (only one symbol comes in and only one goes out when we change the window), we can compute the heaviest symbols in the consecutive windows Win_2, ..., Win_m spending constant time per window. Additionally, we compute two arrays F_1[1..m] and F_2[1..m], where F_1[i] contains the second heaviest symbol in the window Win_i and F_2[i] contains the difference w_i(P_MAX[i]) − w_i(F_1[i]). The arrays are called the tables of flips for the pattern P_MAX. A more detailed description of the data structure which provides the weights of symbols in the consecutive windows is given in the preprocessing phase below.
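A direct sketch of this construction is given below (our own naming; it assumes the alphabet has at least two symbols and covers every symbol of T). It re-ranks the symbols of every window by sorting, so it runs in O(m σ log σ) time; the online weight structure described in the preprocessing phase obtains the same tables with only constant extra time per window.

def maximal_word_and_flips(T, m, alphabet):
    # P_MAX[i] = heaviest symbol of window Win_{i+1} (0-based i),
    # F1[i]    = second heaviest symbol of Win_{i+1},
    # F2[i]    = w_{i+1}(P_MAX[i]) - w_{i+1}(F1[i])   (weight lost by a flip).
    n = len(T)
    alignments = n - m + 1
    freq = {a: 0 for a in alphabet}
    for a in T[:alignments]:                       # frequencies in Win_1
        freq[a] += 1
    p_max, F1, F2 = [], [], []
    for i in range(m):
        ranked = sorted(alphabet, key=lambda a: freq[a])   # least frequent = heaviest
        best, second = ranked[0], ranked[1]
        p_max.append(best)
        F1.append(second)
        F2.append(freq[second] - freq[best])       # equals w_i(best) - w_i(second)
        if i + 1 < m:                              # slide the window: T[i] out, T[alignments+i] in
            freq[T[i]] -= 1
            freq[T[alignments + i]] += 1
    return "".join(p_max), F1, F2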

If P_MAX ⊄ T, then we take P_MAX as the desired pattern P'_MAX and the external inverse pattern matching is solved. Otherwise, if P_MAX is a subword of T, then the following fact holds.

Fact 1. If P'_MAX is the solution of the external inverse pattern matching problem, then P'_MAX = α · P_MAX[|α|+1..m], where α is some m-stem for the text T with |α| ≤ m.

According to Fact 1, the search for the optimal pattern P'_MAX has been reduced to testing the weights of all possible words of the form α · P_MAX[|α|+1..m], one for each m-stem α. In fact, Amir et al. reduced the number of words to be tested to O(nm), skipping over non-reasonable solutions. More precisely, they build a trie for all the substrings of T of length m and traverse it node by node in BFS order, testing the maximal external string leaving the trie at the current node. Since the size of the trie is O(nm), their approach gives an algorithm with running time O(nm log σ). In this paper we show how to search the nodes of the trie more efficiently.

Let v be a node of the trie at depth k. Let C(v) = {c_1, ..., c_l} be the set of children of the node v, and let X(v) = {x_1, ..., x_l} be the set of symbols appearing at the first positions of the edges e_1, ..., e_l connecting the node v with its respective children. Let s be the string represented by the path from the root of the trie to v (see Figure 1). Moreover, let y be the heaviest symbol in the window Win_{k+1} which is not in X(v), i.e., w_{k+1}(y) ≥ w_{k+1}(z) for all z ∈ Σ \ X(v). The following lemma shows the advantage of the m-stem approach.

Lemma 1. The string s · y · P_MAX[k+2..m] is the heaviest possible external string leaving the trie at the node v.

Proof. The weight of the string s is fixed and the suffix P_MAX[k+2..m] is the heaviest possible extension (see the definition of P_MAX); thus the string s · y · P_MAX[k+2..m] is the heaviest possible external string leaving the trie at the node v. □

Let X'(v) = {x'_1, ..., x'_{l'}} ⊆ X(v) be the set of symbols appearing at the first positions of the edges e'_1, ..., e'_{l'} (where e'_p = e_q iff x'_p = x_q), such that w_{k+1}(x'_i) < w_{k+1}(y) for all i ∈ {1, ..., l'}.

Figure 1: The heaviest external string s · y · P_MAX[k+2..m] leaving the trie at the node v.

Lemma 2. The maximal external string s · y · P_MAX[k+2..m] leaving the trie at the node v is heavier than all strings of the form s · x'_i · P_MAX[k+2..m], for all i ∈ {1, ..., l'}, and there is no need to test external strings going out of the trie at and below the edges e'_1, ..., e'_{l'}.

Proof. Notice that all external strings passing through the node v have the same prefix s. Since the suffix y · P_MAX[k+2..m] is heavier than all possible suffixes starting with symbols of the set X'(v), the result follows. □

Lemma 2 has an interesting consequence when the maximal external symbol y equals P_MAX[k+1].

Corollary 1. If the maximal external symbol y = P_MAX[k+1], then the string s · P_MAX[k+1..m] is the heaviest possible external string passing through the node v, and there is no need to traverse the trie below the node v.

The advantage of Corollary 1 becomes clearer when it is used in the context of the nodes of some chain in the trie. The following lemma plays a crucial role in our efficient searching of chains in the trie of text subwords.

Lemma 3. Let ρ_u = u[1..r] be the string represented by the only chain going out from a node v (i.e., v has degree one). If u[1..r] ≠ P_MAX[k+1..k+r] then

(A) the string s · P_MAX[k+1..m] is the heaviest possible external string passing through the node v, and there is no need to search the trie below v; otherwise

(B) let j be the position in the word u[1..r] for which the corresponding difference in the flip table, F_2[k+j], is the smallest in the range [k+1..k+r]; then the word s · P_MAX[k+1..k+j−1] · F_1[k+j] · P_MAX[k+j+1..m] is the heaviest possible external string among all external strings leaving the trie at the nodes of the chain ρ_u. In this case the part of the trie below the chain ρ_u is a subject of further search.

Figure 2: The heaviest external string leaving the trie at the chain ρ_u; the flipped symbol is y = F_1[k+j].

Proof.

Ad (A): The string s · P_MAX[k+1..m] is an external string for T, and it is the heaviest possible external string which passes through the node v in the trie.

Ad (B): It is enough to change only one symbol in the word u to create an external string which leaves the trie at some node of the chain ρ_u (see Figure 2 and Fact 1). According to the definition of the tables of flips, the position j in u gives the minimal loss of weight among all possible swaps of one symbol in u. Since it is still possible that the maximal external string leaves the trie below the chain ρ_u, the part of the trie hanging below ρ_u is a subject of further search. □

Now we are ready to present our main algorithm

Algorithm

The algorithm consists of two stages. The first one, called the preprocessing, contains the construction and initialization of all data structures used later during the actual search. The second stage, called the searching phase, consists of the actual construction of the desired optimal external solution P'_MAX.

Preprocessing

First of all we find the maximal pattern P_MAX using the techniques of Amir et al. If P_MAX is an external string for the text T, which can be checked by any string matching algorithm (e.g. the Knuth-Morris-Pratt algorithm), then we are done. Otherwise, instead of the full trie of text subwords, we build a compact trie T_T. It is obtained from a compact suffix tree for the text T by cutting all deep paths from the root at depth m and skipping all shallow paths shorter than m. At every node v of T_T we keep information about the string s (a subword of the text T) which is represented by the path from the root of the trie to the node v. Additionally, we build a common suffix tree T' for the text T and the maximal word P_MAX (i.e., a suffix tree for the word T · P_MAX), and we preprocess it for LCA queries. The construction of both trees, as well as the preprocessing for LCA queries, can be done in time O(n log σ).

An online computation of the weights of symbols in the consecutive windows plays a crucial role both in the preprocessing and in the searching phase. According to the needs of the algorithm, the data structure which represents the weights of symbols must also maintain the current order between the weighted symbols. The data structure is represented by an array M, with one cell for every possible weight value, such that the i-th cell of the array contains a pointer to a doubly-linked (horizontal) list of all symbols having weight i. Moreover, all nonempty cells of the array are connected into a doubly-linked (vertical) list. The nonempty cell of the array M with the largest index, which contains the list of the heaviest symbols, is accessible directly via a variable max. Symbols in the lists are also accessible directly through a symbol index. The construction (initialization) of the data structure corresponding to Win_1 can be done simply in linear O(n) time, since we assumed that the symbols of the alphabet Σ have been substituted by unique numbers from the range 1, ..., σ. The data structure supports the following three operations.

The first operation returns the weight of any symbol a in the current window. The weight of the symbol a corresponds to the position (in the array M) of the horizontal list containing a. Since the symbols in the horizontal lists are accessible through the symbol index, this operation works in constant time.

The second operation is needed when the algorithm changes the window from Win_i to Win_{i+1}, for i ∈ {1, ..., m−1}. When the window is changed, only two symbols change their weights slightly: T[i] goes out of the window and the symbol T[n−m+i+1] comes into the window. We find the weight w_i(T[i]) using the symbol index, we exclude the symbol T[i] from the list linked at M[w_i(T[i])], and then the symbol T[i] is inserted at the beginning of the list linked at M[w_i(T[i]) + 1] (its frequency drops by one, so its weight grows by one). In the meantime, the pointers to the neighbors of M[w_i(T[i])] and M[w_i(T[i]) + 1] in the vertical list are modified, if necessary. Finally, the entry of T[i] in the symbol index is updated. A symmetric operation is performed for the symbol T[n−m+i+1], which moves one level down in the array M. Thus the whole step can be implemented in constant time.

The third operation is performed when we look for the heaviest symbol in a window Win_i which does not belong to a given set of symbols X(v) = {x_1, ..., x_l}. After a sequence of l deletions in the horizontal lists and at most l deletions in the vertical list, the desired symbol is accessible at M[max]. Finally, the current structure of symbol weights is restored by the reverse sequence of insertions into the horizontal lists and the vertical list. The whole step can be implemented in time O(l).
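The sketch below (our own code and naming) mimics this structure with weight buckets: weight_of plays the role of the symbol index, bucket plays the role of the horizontal lists, and max_w plays the role of the variable max. For brevity it scans weight levels directly where the paper's doubly-linked vertical list would give genuinely constant-time steps, so it illustrates the interface of the three operations rather than their exact time bounds; it also assumes that the given alphabet covers every symbol of T.

class WindowWeights:
    def __init__(self, T, m, alphabet):
        self.alignments = len(T) - m + 1
        freq = {a: 0 for a in alphabet}
        for a in T[:self.alignments]:              # frequencies of symbols in Win_1
            freq[a] += 1
        self.weight_of = {a: self.alignments - freq[a] for a in alphabet}
        self.bucket = {}
        for a, w in self.weight_of.items():
            self.bucket.setdefault(w, set()).add(a)
        self.max_w = max(self.bucket)

    def weight(self, a):
        # operation 1: the current weight of the symbol a
        return self.weight_of[a]

    def _move(self, a, delta):
        w = self.weight_of[a]
        self.bucket[w].discard(a)
        if not self.bucket[w]:
            del self.bucket[w]
        self.weight_of[a] = w + delta
        self.bucket.setdefault(w + delta, set()).add(a)
        self.max_w = max(self.bucket)              # the vertical list of the paper keeps this O(1)

    def shift(self, out_sym, in_sym):
        # operation 2: move from Win_i to Win_{i+1};
        # out_sym leaves the window (weight grows), in_sym enters it (weight shrinks)
        self._move(out_sym, +1)
        self._move(in_sym, -1)

    def heaviest_excluding(self, X):
        # operation 3: the heaviest symbol not in X; the paper achieves O(|X|) by
        # temporarily deleting the symbols of X, here we simply scan downwards
        for w in sorted(self.bucket, reverse=True):
            for a in self.bucket[w]:
                if a not in X:
                    return a
        return None

For a text T and a window length m, shifting from Win_1 to Win_2 corresponds to ws.shift(T[0], T[len(T)-m+1]) with 0-based string indices.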

Using the online computation of weights, we compute in time O(m) the flip tables: F_1[1..m], which contains the second heaviest symbols of the consecutive windows, and F_2[1..m], whose i-th cell contains the difference w_i(P_MAX[i]) − w_i(F_1[i]). The second table, F_2, is preprocessed in linear time for minimum range queries. Finally, we compute the weights of all suffixes of the maximal pattern P_MAX and store them in a table S[1..m], also in time O(m).

Searching phase

The searching phase consists of two rounds. During the first search of T_T we compute, for every node v, the heaviest external string (called a candidate) which leaves the trie at the node v or at a chain placed under the node v. Finally, if there is any candidate, the trie is searched again to find the maximal external pattern P'_MAX. Otherwise the entire problem has no solution.

During the first round the algorithm traverses the tree T_T in a BFS-like order, such that children are inserted into a waiting list according to their depth in the tree. Assume that the algorithm has just taken from the waiting list a node v of depth k in T_T. It is assumed, recursively, that the weight of the string s represented by the path from the root of T_T to the node v has already been computed, and that the online weight data structure is currently set to answer queries in the window Win_{k+1}.

If the node v has degree greater than one, then all edges going out of v are labeled by single symbols (by the definition of the compact trie, see the Preliminaries). We find the maximal external symbol y using the online weight data structure (the third operation). The weight of the word s · y · P_MAX[k+2..m] is composed of the weight of the string s (stored at the node v), the weight of the symbol y (given by the weight function in the current window), and the weight of the suffix of the pattern P_MAX (stored in the table S). The weight of the word s · y · P_MAX[k+2..m] is stored at the node v. According to Lemma 2, all light edges (and the corresponding subtrees hanging under them), i.e., edges with symbols lighter than y, can be ignored. For the remaining edges we update, at their ending nodes, the information about the weight of the string represented by the path coming from the root, in order to fulfill the recursive assumption; this weight is composed of the weight of the string s stored at v and the weight of the symbol placed on the edge, given by the weight function. All nodes below the heavy edges are inserted into the waiting list at level k + 1.

If the node v is the first node of a chain (its degree is one), we check whether the string ρ_u = u[1..r] represented by the chain symbols is a subword of the pattern P_MAX, i.e., whether u[1..r] = P_MAX[k+1..k+r]. This can be done by asking for the lowest common ancestor, in the common suffix tree T', of the suffix of T which extends the string s (i.e., the suffix of T beginning with ρ_u) and the pattern suffix P_MAX[k+1..m]. If the lowest common ancestor of both suffixes is placed at a depth smaller than r in T', then we know that u[1..r] ≠ P_MAX[k+1..k+r] and, according to part (A) of Lemma 3, we have only one candidate, s · P_MAX[k+1..m], and we do not search the trie below the chain ρ_u. Otherwise, when the equality holds, we recover the candidate from the flip tables F_1 and F_2. First we ask a minimum range query in F_2[k+1..k+r], obtaining the index of the position j whose change gives the smallest loss of weight, and getting the candidate s · P_MAX[k+1..k+j−1] · F_1[k+j] · P_MAX[k+j+1..m]. The information about the weight of the string represented by the path from the root to the node below the chain ρ_u is updated with the help of the table S.

In both cases the time spent at the node v is proportional to the degree of the node v, thus searching the whole trie T_T can be done in time proportional to the size of the trie, i.e., in time O(n). At last, the trie T_T is searched again to find the maximal weight, which is the weight of the maximal external pattern P'_MAX.

Theorem 1. The external inverse pattern matching problem can be solved in optimal time O(n log σ). □

Parallel Approach

In this section we briefly discuss a parallel implementation of our external inverse pattern matching algorithm on the CREW PRAM model. Most of the steps of our algorithm can be easily parallelized when superlinear work and space are allowed.

Theorem 2. The external inverse pattern matching algorithm can be implemented in time O(log n) and work O(n log n + mσ log σ) on the CREW PRAM.

Proof. Both trees T_T and T' can be computed in O(log n) time and O(n log n) work when subquadratic space is available (see Apostolico et al.). Moreover, the tree T' can be preprocessed for LCA queries in time O(log n) and linear work (see Schieber and Vishkin). Since we cannot use the online computation of weights in all windows at the same time, we have to compute the whole table of weights, which is of size O(mσ). The table of weights can easily be computed in time O(log m) and work O(mσ), but since we still need to keep the order between the weights of the symbols, we have to sort all m columns by parallel mergesort (Cole), which gives total work O(mσ log σ). When the table is ready, we compute the pattern P_MAX, the flip tables F_1, F_2, and the table S in time O(log m) and linear work. Then the table F_2 is preprocessed for minimum range queries in logarithmic time and work O(m log m), by computing the minimum in every block of size 2^i, for i = 1, ..., log m. When all data structures are ready, we start the searching algorithm. We assume that every node v of the trie T_T is assigned a number of processors linear in its degree. We compute in constant time the weights of all feasible edges, i.e., the edges of length one placed under nodes of degree greater than one (with the help of the table of weights) and the edges representing chains which are subwords of P_MAX (using LCA queries and the table S). If the path from the root of the trie to the node v is composed only of feasible edges, then the node v is called a feasible node. We use the Euler tour technique (see e.g. Gibbons and Rytter) to compute all feasible nodes and the weights of the strings represented by the paths from the root. The computation of the feasible nodes is done in time O(log n) and linear work. Now, if a feasible node v, placed at depth k and under a string s, is the first node of a chain, the weight of its candidate is composed of the weight of the string s, the table S and the flip tables. If the node v has degree l > 1, then the O(l) processors associated with the node find in logarithmic time the heaviest symbol y in column k+1 after the deletion of the l symbols placed on the edges going out of v; this can be done by testing only the l+1 heaviest symbols of column k+1. In this case the weight of the candidate is composed of the weight of the string s, the symbol y and the table S. When the weights of the candidates are ready, we apply any tree contraction algorithm (see e.g. Gibbons and Rytter) looking for the maximum in the tree, which represents the optimal pattern P'_MAX. □

Conclusion

We have presented a new and optimal O(n log σ)-time algorithm for the sequential external inverse pattern matching, showing that the internal case is the hardest part of inverse pattern matching. It is an interesting question whether there exists a faster algorithm solving the internal inverse pattern matching, but it seems that this question has no simple answer. Another interesting task for further research is to improve the bounds for the external inverse pattern matching in the parallel setting. Notice that if the product of m and σ is small, i.e., mσ = O(n), our parallel implementation is fast and efficient. But if we want to keep linear complexity for all feasible values of n, m and σ, one has to pass the bottleneck hidden in the computation of the weights of symbols in every window Win_i. Another interesting question appears when we ask about the complexity of sequential and parallel inverse pattern matching for other measures of distance, e.g., the edit distance.

References

K. Abrahamson. Generalized string matching. SIAM Journal on Computing.

B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts and J.D. Watson. Molecular Biology of the Cell. Garland Publishing, NY.

Amihood Amir, Alberto Apostolico and Moshe Lewenstein. Inverse pattern matching. Manuscript, to appear in Journal of Algorithms.

A. Apostolico, C. Iliopoulos, G.M. Landau, B. Schieber and U. Vishkin. Parallel construction of a suffix tree with applications. Algorithmica.

R. Cole. Parallel merge sort. SIAM Journal on Computing.

M.J. Fischer and M.S. Paterson. String matching and other products. In: Complexity of Computation (R.M. Karp, editor), SIAM-AMS Proceedings.

E. Fredkin. Trie memory. Communications of the ACM.

H.N. Gabow, J.L. Bentley and R.E. Tarjan. Scaling and related techniques for geometry problems. In Proceedings of the ACM Symposium on Theory of Computing (STOC).

A. Gibbons and W. Rytter. Efficient Parallel Algorithms. Cambridge University Press.

R.W. Hamming. Error detecting and error correcting codes. Bell System Technical Journal.

D. Harel and R.E. Tarjan. Fast algorithms for finding nearest common ancestors. SIAM Journal on Computing.

H. Karloff. Fast algorithms for approximately counting mismatches. Information Processing Letters.

D.E. Knuth, J.H. Morris and V.R. Pratt. Fast pattern matching in strings. SIAM Journal on Computing.

D. Russell and G.T. Gangemi Sr. Computer Security Basics. O'Reilly and Associates, Inc., Sebastopol, California.

B. Schieber and U. Vishkin. On finding lowest common ancestors: simplification and parallelization. In Proceedings of the 3rd Aegean Workshop on VLSI Algorithms and Architecture, LNCS.

P. Weiner. Linear pattern matching algorithms. In Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS).