
A Note on the Hardness of Tree Isomorphism



Birgit Jenner    Pierre McKenzie    Jacobo Torán

Tübingen and Ulm    Montréal and Tübingen    Ulm

December

Abstract

In this note we prove that the tree isomorphism problem, when trees are encoded as strings, is NC^1-hard under DLOGTIME reductions. NC^1-completeness thus follows from Buss's recent NC^1 upper bound. By contrast, we prove that testing isomorphism of two trees encoded as pointer lists is L-complete.

Introduction

GI, the graph isomorphism problem, is one of the most intensively studied problems in Theoretical Computer Science. Historically, the main source of interest in GI has been the evidence that GI is probably neither in P nor NP-complete, but other sources of interest include the sophistication of the tools developed to attack the problem and the connections between GI and structural complexity (see Köbler, Schöning and Torán).

Understandably, many restrictions of GI have been considered. For example, P upper bounds are known in the cases of planar graphs (Hopcroft and Tarjan) or graphs of bounded valence (Luks). However, none of these GI restrictions is known to be complete for a natural complexity class, and it seemed that the problem lacks the structure needed for a hardness result.

In the special case of trees, a linear sequential time algorithm for the problem was already known in 1974 to Aho, Hopcroft and Ullman. In 1991 an NC algorithm was developed by Miller and Reif. One year later, Lindell obtained an L upper bound. Finally, in 1997, a subtle algorithm able to test two trees for isomorphism in NC^1 was devised by Buss.

Buss asks whether tree isomorphism is NC^1-hard. Here we answer this question affirmatively, showing the hardness of the problem under DLOGTIME many-one reducibility. Hence tree isomorphism is NC^1-complete. Trees thus provide the first class of graphs for which the isomorphism problem captures a natural complexity class. Moreover, so far the problem of evaluating a Boolean formula and the problem of multiplying permutations on five points (and some of their variations) were the only two NC^1-complete problems known. Tree isomorphism is a third such problem.

Corresponding author: Pierre McKenzie, Wilhelm-Schickard-Institut für Informatik, Universität Tübingen, Sand, Tübingen, Germany. Email: mckenzie@informatik.uni-tuebingen.de

As noted by Buss, choosing a graph representation is critical when working at the level of NC^1. Buss uses Miller and Reif's string representation for trees, on the grounds that when the pointer representation is used, the deterministic transitive closure problem can be reduced to the descendant predicate, and the former is known to be complete for logspace.

We prove here that the tree isomorphism problem is L-hard under many-one NC^1 reducibility when the pointer representation is used. Hence tree isomorphism in the pointer representation is L-complete, by the upper bound of Lindell (or that of Buss). Tree isomorphism thus captures, in this way, another very natural complexity class.

Preliminaries

We assume familiarity with basic notions of complexity theory, such as can be found in the standard books in the area. In particular, we simply recall that

$$AC^0 \subseteq TC^0 \subseteq NC^1 \subseteq L,$$

where L is the set of languages accepted by deterministic Turing machines using logarithmic space, NC^1 is the class of languages recognized by DLOGTIME-uniform families of logarithmic depth, polynomial size Boolean circuits of bounded fan-in over the basis $\{\wedge, \vee, \neg\}$, and AC^0 (resp. TC^0) is the set of languages recognized by DLOGTIME-uniform families of constant depth, polynomial size Boolean circuits of unbounded fan-in over the basis $\{\wedge, \vee, \neg\}$ (resp. over the basis consisting solely of the MAJORITY function).

For simplicity, we will often not distinguish between a class of Boolean functions like NC^1 and the corresponding class of functions from $\{0,1\}^*$ to the natural numbers (sometimes called FNC^1).

Reducibilities

We use the following reducibilities.

DLOGTIME reducibility ($\leq^{DLT}$)

We use this reducibility for the NC^1-completeness results. We say that, for languages A and B, a function f many-one reducing A to B is a DLOGTIME reduction if f increases the length of strings only polynomially ($|f(x)| \leq p(|x|)$ for a polynomial p) and some DLOGTIME Turing machine can decide, given an input $\langle c, i, x \rangle$, whether the i-th symbol of f(x) is c. We write $A \leq^{DLT} B$ if there exists a DLOGTIME reduction reducing A to B. Recall that it is customary, in the case of Turing machines operating in sublinear time, to access the input via a special input index tape.

NC^1 many-one reducibility ($\leq^{NC^1}_m$)

We use this reducibility for the L-completeness and all other results. Here the many-one reduction f is required to be computable in FNC^1.

Trees and representations

All trees discussed in this paper are rooted (hence implicitly directed) and unordered, i.e., the ordering of the descendants of a node does not matter. In some of our constructions we use colored trees. A tree with n nodes is said to be colored if each node in the tree is labelled with a positive integer no greater than n. An isomorphism between two colored trees $T_1$ and $T_2$ is a bijection between their node sets which maps the root of $T_1$ to the root of $T_2$, preserves the colors, and preserves the edges. Two colored trees are isomorphic, denoted $T_1 \cong T_2$, iff an isomorphism exists between them. These notions apply mutatis mutandis to noncolored trees.

The main computational problem of interest in this paper is:

Colored Tree Isomorphism (CTI)
Given: Two colored trees $T_1$ and $T_2$.
Problem: Determine whether $T_1 \cong T_2$.

We consider two different representations for encoding trees: the string representation and the pointer representation. As we will see, in the small complexity classes we are dealing with, the representation used might change the complexity of the problem.

In the string representation, trees are represented over an alphabet containing opening and closing parentheses. A string representation of a tree T is defined recursively in the following way. The tree with a single node is represented by the string "()", and if T is a tree consisting of a root and subtrees $T_1, \dots, T_k$ (in any order) with string representations $\tau_1, \dots, \tau_k$, a representation of T is given by $(\tau_1 \cdots \tau_k)$. Observe that a tree might have different string representations, depending on the order of the descendants of any of its nodes. Colored trees can be encoded in the same way using colored parentheses. Let C be the set of colors. An opening parenthesis in a colored tree is represented by "(" followed by $\lceil \log |C| \rceil$ bits encoding the color. Note that we only need to color the opening parentheses.
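
The recursive definition of the string representation can be illustrated with a short Python sketch. This is not part of the original note: the nested-pair tree type, the function name `encode`, and the fixed-width binary color code are assumptions made only for the example.

```python
from math import ceil, log2

def encode(tree, num_colors=None):
    """Return one string representation of a rooted tree.

    A tree is a pair (color, children); colors are integers 1..num_colors.
    With num_colors=None the colors are ignored and plain parentheses are used.
    """
    color, children = tree
    bits = ""
    if num_colors is not None:
        # Assumed fixed-width code: the same number of bits for every color.
        width = max(1, ceil(log2(num_colors)))
        bits = format(color - 1, "0{}b".format(width))
    inner = "".join(encode(child, num_colors) for child in children)
    return "(" + bits + inner + ")"

leaf = (1, [])
# The same unordered tree, with the children of the root listed in two orders.
t_a = (1, [leaf, (1, [leaf])])
t_b = (1, [(1, [leaf]), leaf])
print(encode(t_a))  # (()(()))
print(encode(t_b))  # ((())())
```

The two printed strings differ although they describe the same unordered tree, which is why isomorphism of trees given as strings is not simply string equality.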

The pointer representation is a more standard way to encode graphs. In this case we consider the trees given by a list of pairs of nodes representing directed edges. As in the previous case, if we deal with colored trees, we include with the representation of a node $\lceil \log |C| \rceil$ bits to encode its color.

For many tree problems in NC^1 and L, completeness results seem to depend on the representation used. For example, the reachability problem on forests, which is L-complete in the pointer representation (Cook and McKenzie), can be solved in NC^1 (and even TC^0) in the string representation. Analogously, the Boolean formula value problem, which is complete for NC^1 in the string representation, becomes L-complete when described using trees given in the pointer representation. In fact, changing from pointer to string representation is FL-complete.

Another important observation is that it is possible to test whether a given input is a correct encoding of a tree within NC^1 for the string representation and within L for the pointer representation.

Tree Isomorphism for Trees Given by Strings

We show first that deciding whether two trees given in string representation are isomorphic is NC^1-complete. We first consider colored trees, for which the hardness proof is simpler.

Lemma. In the string representation, CTI is NC^1-hard under $\leq^{DLT}$.

Proof. Following a lemma of Barrington, Immerman and Straubing, it suffices to reduce to CTI the problem of evaluating a balanced DLOGTIME-uniform family of Boolean expressions made up of alternating layers of ANDs and ORs.

The core of the reduction is a simple construction, known from the graph isomorphism literature, for simulating ANDs and ORs using graph isomorphism questions. We adapt this construction as follows. Consider four colored trees $G_1$, $G_2$, $H_1$ and $H_2$. Then the colored trees

[Figure: the trees $G_\wedge$ and $H_\wedge$, built from $G_1$, $G_2$, $H_1$ and $H_2$]

have the property that $G_\wedge \cong H_\wedge$ iff $G_1 \cong G_2 \wedge H_1 \cong H_2$, and the colored trees

[Figure: the trees $G_\vee$ and $H_\vee$, each built from copies of $G_1$, $G_2$, $H_1$ and $H_2$]

have the property that $G_\vee \cong H_\vee$ iff $G_1 \cong G_2 \vee H_1 \cong H_2$. Observe that the OR-construction doubles the size of the initial trees for $(G_1, G_2)$ and $(H_1, H_2)$: from trees each having the same size s, the OR-construction produces trees each having size at least 4s, since each of them contains a copy of all four initial trees.

Now pick two single-node trees $T_1$ and $T_2$ and assign them different colors. Starting from the CTI instance $(T_1, T_1)$ to represent a TRUE input and the CTI instance $(T_1, T_2)$ to represent a FALSE input, it is trivial to construct from a depth-d Boolean expression f with no negation gates two colored trees G and H having $O(4^d)$ nodes (this is a rough upper bound) and verifying the property that $G \cong H$ iff f evaluates to TRUE.
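
The AND and OR constructions, and their use on a negation-free formula, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the exact trees of the figures: the marker colors `left`, `right` and `plain`, the helper `pair`, and the brute-force canonical form used to test isomorphism are all hypothetical choices made for the example (the sketch uses more colors than strictly necessary). The convention is that the two questions being combined are "is $G_1 \cong G_2$?" and "is $H_1 \cong H_2$?".

```python
def canon(tree):
    """Canonical form of a colored tree (color, children): sort children recursively."""
    color, kids = tree
    return (color, tuple(sorted(canon(k) for k in kids)))

def iso(a, b):
    """Colored rooted trees are isomorphic iff their canonical forms coincide."""
    return canon(a) == canon(b)

LEFT, RIGHT, PLAIN = "left", "right", "plain"  # hypothetical marker colors

def pair(x, y):
    """Ordered pair of trees: the marker colors prevent x and y from being swapped."""
    return (PLAIN, ((LEFT, (x,)), (RIGHT, (y,))))

def and_gadget(g1, g2, h1, h2):
    # G_and ~ H_and  iff  g1 ~ g2 and h1 ~ h2
    return pair(g1, h1), pair(g2, h2)

def or_gadget(g1, g2, h1, h2):
    # G_or ~ H_or  iff  g1 ~ g2 or h1 ~ h2; each side holds a copy of all four trees
    g_or = (PLAIN, (pair(g1, h1), pair(g2, h2)))
    h_or = (PLAIN, (pair(g1, h2), pair(g2, h1)))
    return g_or, h_or

T1, T2 = ("a", ()), ("b", ())  # two single-node trees with different colors

def build(f):
    """Map a formula (True, False, or (op, left, right)) to a CTI instance (G, H)."""
    if f is True:
        return T1, T1
    if f is False:
        return T1, T2
    op, left, right = f
    g1, g2 = build(left)
    h1, h2 = build(right)
    gadget = and_gadget if op == "and" else or_gadget
    return gadget(g1, g2, h1, h2)

def evaluate(f):
    """Direct evaluation of the formula, used only to check the construction."""
    if isinstance(f, bool):
        return f
    op, left, right = f
    if op == "and":
        return evaluate(left) and evaluate(right)
    return evaluate(left) or evaluate(right)
```

For instance, `iso(*build(("or", ("and", True, False), True)))` agrees with `evaluate` on the same formula. The sketch ignores the padding needed to keep the trees full and binary, so it says nothing about DLOGTIME computability.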

To see that this yields a DLOGTIME reduction, note that it is possible to add a few dummy nodes in order to modify the constructs above in such a way that if $G_1$, $G_2$, $H_1$ and $H_2$ are full binary colored trees, then so are $G_\wedge$, $H_\wedge$, $G_\vee$ and $H_\vee$, and moreover the color occurrences in these respective constructs are the same. Because the Boolean expression is balanced, its depth is $O(\log n)$. And because the expression is built from alternating levels of ANDs and ORs, the argument is completed as in the proof of the lemma of Barrington, Immerman and Straubing.

Remark. The complete simulation used in the proof of the preceding lemma in fact requires only two distinct colors. A similar construction could be devised in the absence of colors as well.

Lemma. In the string representation, CTI $\leq^{DLT}$ TI.

Proof. The obvious idea is to simulate the colors by attaching color-dependent gadgets at each node. Suppose that the trees in the CTI instance have n nodes. Then it suffices to attach at each c-colored node, $1 \leq c \leq n$, a new node which is root to a height-one subtree having $n + c$ leaves. In detail, at the string encoding level, the binary number c occurring after the opening bracket, which specifies the occurrence of the c-colored node, is simply replaced with the encoding of the c-th color gadget. To ensure DLOGTIME computability, the color gadgets are modified to contain an identical number of nodes; it is easy to implement the idea with n pairwise nonisomorphic gadgets of equal size.
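
The color gadgets can be sketched in Python as follows. This is an illustration under assumptions: the gadget attached to a c-colored node is taken to be a star with n + c leaves, the names `uncolor` and `cti_to_ti` are hypothetical, and the sketch omits the padding that makes all gadgets the same size.

```python
def size(tree):
    """Number of nodes of a colored tree given as (color, children)."""
    color, kids = tree
    return 1 + sum(size(k) for k in kids)

def uncolor(tree, n):
    """Replace the color c of every node by an extra child: a star with n + c leaves.

    The result is an uncolored tree, represented as a tuple of its child trees.
    """
    color, kids = tree
    star = tuple(() for _ in range(n + color))  # gadget root with n + color leaves
    return (star,) + tuple(uncolor(k, n) for k in kids)

def canon(t):
    """Canonical form of an uncolored tree: sort the canonical forms of the children."""
    return tuple(sorted(canon(k) for k in t))

def cti_to_ti(t1, t2):
    """Map a CTI instance to a TI instance, using the same n for both trees."""
    n = max(size(t1), size(t2))
    return uncolor(t1, n), uncolor(t2, n)
```

A gadget root has more than n children while an original node has at most n, and gadget leaves have no children while every original node now has at least one, so an isomorphism of the uncolored trees has to map gadgets to gadgets and therefore respects the original colors.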

Theorem. In the string representation, CTI and TI are NC^1-complete under $\leq^{DLT}$.

Proof. CTI is NC^1-hard (by the first lemma of this section) and CTI $\leq^{DLT}$ TI (by the second lemma). Buss shows in a delicate argument that TI $\in$ NC^1. Buss in fact points out that his NC^1 algorithm applies directly as well to the case of labelled trees. Hence CTI is NC^1-complete.

Since $\leq^{DLT}$ is in general not transitive, NC^1-hardness of TI needs to be argued by combining the reductions of the two lemmas, or by adapting the proof of the first lemma to the case of trees without colors. Hence TI is NC^1-complete.

Tree Isomorphism for Trees Given by Pointers

Recall the lemma of the previous section which states that, in the string representation, colored tree isomorphism reduces to tree isomorphism by the introduction of appropriate gadgets for the colors. These gadgets are clearly NC^1-computable in the pointer representation, proving the following.

Lemma. In the pointer representation, CTI $\leq^{NC^1}_m$ TI.

Theorem. In the pointer representation, TI is L-complete under many-one NC^1 reducibility.

Proof. The containment follows from Lindell, who shows that a canonical form c(T) of a rooted tree T can be computed in logspace. Hence, for two given trees, we determine isomorphism by computing and comparing their canonical forms symbol by symbol. Alternatively, the string representation of the trees could be computed in L, and then Buss's NC^1 algorithm could be used.
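
The idea of deciding isomorphism by comparing canonical forms can be illustrated with the following Python sketch for trees given as pointer lists. It is not Lindell's logspace algorithm: it uses recursion and sorting with no space restriction. The assumption that each pair is (parent, child), and the names `canon` and `isomorphic`, are choices made for the example.

```python
def canon(edges):
    """Canonical string of a rooted tree given as a list of (parent, child) pairs."""
    if not edges:
        raise ValueError("expected at least one edge")
    kids = {}
    for parent, child in edges:
        kids.setdefault(parent, []).append(child)
    # The root is the only node that never occurs as a child.
    roots = set(kids) - {child for _, child in edges}
    if len(roots) != 1:
        raise ValueError("input is not a rooted tree")
    root = roots.pop()

    def walk(v):
        # Sorting the children's strings makes the result independent of edge order.
        return "(" + "".join(sorted(walk(c) for c in kids.get(v, []))) + ")"

    return walk(root)

def isomorphic(edges1, edges2):
    """Compare the canonical forms of the two trees."""
    return canon(edges1) == canon(edges2)

e1 = [(1, 2), (1, 3), (3, 4)]
e2 = [("a", "b"), ("b", "c"), ("a", "d")]
print(isomorphic(e1, e2))  # True
```

The canonical form is itself a string representation of the tree, which matches the alternative route in the proof: convert to strings first, then decide isomorphism on the strings.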

We prove hardness of TI by reducing the L-complete problem ORD (Etessami) to Colored Tree Isomorphism and appealing to the preceding lemma.

Order between Vertices (ORD)
Given: A digraph $G = (V, E)$ that is a line, and two nodes $v_i, v_j \in V$.
Problem: Decide whether $v_i < v_j$ in the total order induced on V.

Let a line graph $G = (V = \{v_1, \dots, v_n\}, E)$ and two designated nodes $v_i$ and $v_j \in V$ be given. We assume without loss of generality that $v_n$ is the outdegree-zero node and that $v_i$, $v_j$ and $v_n$ are three different nodes. We first determine whether $v_i < v_j$ in the order induced by E with the help of oracle queries to CTI. Later we will show that the oracle queries can be merged to yield a many-one reduction.

We consider the instances $(T', T_{k,l})$ of the CTI problem. Here $T'$ is the colored tree that results from G by coloring the node $v_i$ with color 1, $v_j$ with color 2, $v_n$ with color 3, and the rest of the nodes with color 4.

For $1 \leq k < l < n$, the colored tree $T_{k,l}$ is defined as $(V, \{(v_m, v_{m+1}) \mid 1 \leq m < n\})$, with the nodes $v_k$, $v_l$ and $v_n$ colored with colors 1, 2 and 3 respectively, and the rest of the nodes colored by 4.

It is not hard to see that $v_i < v_j$ in the order induced on V by G if and only if there exist k, l with $1 \leq k < l < n$ such that $T' \cong T_{k,l}$. This therefore describes a disjunctive reduction to CTI. In order to transform it into a many-one reduction, we use the OR-operator from the hardness lemma for CTI in the string representation to combine all the $O(n^2)$ many queries to CTI into a single one. The OR-operator has to be applied carefully, since it doubles the size of the trees involved every time it is used, and we are considering the disjunction of a polynomial number of trees. However, the operator can be applied in a divide-and-conquer manner, so that the nesting depth is logarithmic and the resulting pair of trees has size polynomial in n. Putting the two parts of the reduction together, we have that ORD is NC^1 many-one reducible to CTI.
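
The disjunctive reduction from ORD can be sketched in Python as follows. This is only an illustration: the concrete color values 1 to 4 and the function names are assumptions, a colored path is represented by its sequence of colors, and the isomorphism test is simulated by walking the line graph, which the reduction itself never does (it only writes down the colored trees and leaves the comparison to the CTI oracle).

```python
def colors_of_t_prime(edges, vi, vj, vn):
    """Color sequence of T': the line graph with v_i, v_j, v_n colored 1, 2, 3.

    edges is a list of (node, successor) pairs forming a single directed path.
    """
    succ = dict(edges)
    start = (set(succ) - set(succ.values())).pop()  # the node with no predecessor
    colors, v = [], start
    while True:
        colors.append(1 if v == vi else 2 if v == vj else 3 if v == vn else 4)
        if v not in succ:
            return colors
        v = succ[v]

def colors_of_t_kl(n, k, l):
    """Color sequence of T_{k,l}: the path v_1 -> ... -> v_n with v_k, v_l, v_n marked."""
    return [1 if m == k else 2 if m == l else 3 if m == n else 4
            for m in range(1, n + 1)]

def ord_via_cti(edges, vi, vj, vn):
    """v_i < v_j iff T' is isomorphic to T_{k,l} for some 1 <= k < l < n."""
    n = len(edges) + 1
    t_prime = colors_of_t_prime(edges, vi, vj, vn)
    # Two colored directed paths are isomorphic iff their color sequences agree.
    return any(t_prime == colors_of_t_kl(n, k, l)
               for k in range(1, n) for l in range(k + 1, n))

# The line c -> a -> d -> b -> e, given as an unsorted edge list.
line = [("a", "d"), ("c", "a"), ("d", "b"), ("b", "e")]
print(ord_via_cti(line, "a", "b", "e"))  # True
print(ord_via_cti(line, "b", "a", "e"))  # False
```

The `any(...)` over all pairs k < l is the disjunction that the OR-operator folds into a single CTI query.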

Remark. The reduction used in the preceding theorem can be made much finer than $\leq^{NC^1}_m$, but obtaining the tightest possible reduction is not the focus of this note.

Concluding remarks

Complementing the harder results of Buss and Lindell, we have shown that tree isomorphism, depending on the tree representation, captures two robust complexity classes, namely NC^1 and L. These are the first hardness results for a natural restriction of the graph isomorphism problem, for which so far no hardness result was known. A modest L-hardness result for GI is implied by our L-hardness result for the tree case. An interesting open question is whether the isomorphism problem for general graphs is also hard for NL, or even for P.

The level of sophistication of Buss's NC^1 algorithm for TI is comparable to that of his simplified NC^1 algorithm for the Boolean expression value problem FVP. Are these two upper bounds independent? In other words, is there a reduction from TI to FVP, or vice versa, which is simpler than either of Buss's two upper bounds?

It is interesting to consider FVP $\leq^{NC^1}_m$ TI. Proving that FVP $\leq^{NC^1}_m$ TI has required three ingredients: the NC^1 upper bound for FVP, the characterization of NC^1 in terms of balanced Boolean expressions, and our simple hardness lemma for CTI. That lemma directly constructs trees from Boolean formulas, but the ensuing direct reduction is from Balanced-FVP to TI. How can the lemma be strengthened?

The bottleneck to a strengthening of the lemma is the handling of a Boolean OR. The lemma can only handle balanced Boolean expressions because the trees $G_\vee$ and $H_\vee$ depicted in its proof each require a copy of $G_1$, $G_2$, $H_1$ and $H_2$. Hence an open question is whether the lemma can be proved using simpler constructs $G_\vee$ and $H_\vee$, still simulating the Boolean OR, but only adding a small number of additional nodes. If so, the NC^1 upper bound for FVP is redundant, i.e., the NC^1 upper bound for FVP follows from the NC^1 upper bound for TI.

References

A. V. Aho, J. E. Hopcroft and J. D. Ullman. The Design and Analysis of Computer Algorithms. Addison-Wesley.

L. Babai. Moderately exponential bounds for graph isomorphism. In Fundamentals of Computation Theory, Lecture Notes in Computer Science, Springer-Verlag.

D. A. M. Barrington. Bounded-width polynomial-size branching programs recognize exactly those languages in NC^1. Journal of Computer and System Sciences.

D. A. M. Barrington, N. Immerman and H. Straubing. On uniformity within NC^1. Journal of Computer and System Sciences.

M. Beaudry and P. McKenzie. Circuits, matrices and nonassociative computation. Journal of Computer and System Sciences.

S. R. Buss. The Boolean formula value problem is in ALOGTIME. In Annual ACM Symposium on Theory of Computing.

S. R. Buss. Alogtime algorithms for tree isomorphism, comparison and canonization. In Computational Logic and Proof Theory, Kurt Gödel Colloquium, Lecture Notes in Computer Science, Springer-Verlag.

S. R. Buss, S. A. Cook, A. Gupta and V. Ramachandran. An optimal parallel algorithm for formula evaluation. SIAM Journal on Computing.

R. Chang and J. Kadin. On computing Boolean connectives of characteristic functions. Mathematical Systems Theory.

S. A. Cook and P. McKenzie. Problems complete for deterministic logarithmic space. Journal of Algorithms.

K. Etessami. Counting quantifiers, successor relations, and logarithmic space. In Proceedings of the Structure in Complexity Theory Conference, IEEE.

J. E. Hopcroft and R. E. Tarjan. A V^2 algorithm for determining isomorphism of planar graphs. Information Processing Letters.

B. Jenner. Between NC^1 and NC^2: Classification of Problems by Logspace Resources. Manuscript of Habilitation thesis.

J. Köbler, U. Schöning and J. Torán. The Graph Isomorphism Problem: Its Structural Complexity. Progress in Theoretical Computer Science, Birkhäuser, Boston.

S. Lindell. A logspace algorithm for tree canonization. In Proceedings of the STOC, ACM.

A. Lozano and J. Torán. On the nonuniform complexity of the Graph Isomorphism problem. In Complexity Theory: Current Research, edited by K. Ambos-Spies, S. Homer and U. Schöning. Also in Proceedings of the Structure in Complexity Theory Conference, June.

E. Luks. Isomorphism of bounded valence can be tested in polynomial time. Journal of Computer and System Sciences.

G. L. Miller and J. H. Reif. Parallel tree contraction, part 2: Further applications. SIAM Journal on Computing.