15-455 Complexity Theory
Space Complexity: Savitch's Theorem and PSPACE- Completeness MEASURING SPACE COMPLEXITY
FINITE STATE CONTROL
I N P U T …
1 2 3 4 5 6 7 8 9 10 … We measure space complexity by looking at the furthest tape cell reached during the computation Let M = deterministic TM that halts on all inputs.
Definition: The space complexity of M is the function s : N→ N, where s(n) is the furthest tape cell reached by M on any input of length n.
Let N be a non-deterministic TM that halts on all inputs in all of its possible branches.
Definition: The space complexity of N is the function s : N → N, where s(n) is the furthest tape cell reached by M, on any branch if its computation, on any input of length n. Definition: SPACE(s(n)) = { L | L is a language decided by a O(s(n)) space deterministic Turing Machine }
Definition: NSPACE(s(n)) = { L | L is a language decided by a O(s(n)) space non-deterministic Turing Machine } PSPACE = SPACE(nk) ∪k ∈ N NPSPACE = NSPACE(nk) ∪k ∈ N 3SAT ∈ SPACE(n) ⊂ PSPACE
( x ∨ ¬ y ∨ x ) ( y ∨ x ∨ y )
( x ∨ ¬ y ∨ x ) ( y ∨ x ∨ y ) # x y
( x ∨ ¬ y ∨ x ) ( y ∨ x ∨ y ) # x 0 y 0
( x ∨ ¬ y ∨ x ) ( y ∨ x ∨ y ) # x 0 y 1
( x ∨ ¬ y ∨ x ) ( y ∨ x ∨ y ) # x 1 y 0 Assume a deterministic Turing machine that halts on all inputs runs in space s(n) Question: What’s an upper bound on the number of time steps for this machine?
A configuration gives a head position, state, and tape contents. Number of configurations is at most:
s(n) |Q| |Γ|s(n) = 2O(s(n))
Recall: Q is the set of states; Γ, the set to tape symbols MORAL: Space S computations can be simulated in at most 2O(S) time steps
PSPACE ⊆ EXPTIME k EXPTIME = TIME(2 n ) ∪k ∈ N SAVITCH’S THEOREM
Is NTIME(t(n)) ⊆ TIME(t(n))?
Is NTIME(t(n)) ⊆ TIME(t(n)k) for some k > 1?
We don’t know in general!
If the answer is yes, then P = NP… What about the space-bounded setting?
NSPACE(s(n)) ⊆ SPACE(s(n)2) s(n) ≥ n SAVITCH’S THEOREM
Is NTIME(t(n)) ⊆ TIME(t(n))?
Is NTIME(t(n)) ⊆ TIME(t(n)k) for some k > 1?
We don’t know in general!
If the answer is yes, then P = NP… What about the space-bounded setting?
therefore NPSPACE ⊆ PSPACE
SAVITCH’S THEOREM
Is NTIME(t(n)) ⊆ TIME(t(n))?
Is NTIME(t(n)) ⊆ TIME(t(n)k) for some k > 1?
We don’t know in general!
If the answer is yes, then P = NP… What about the space-bounded setting?
therefore PSPACE = NPSPACE
SAVITCH’S THEOREM
Theorem: For functions s(n) where s(n) ≥ n NSPACE(s(n)) ⊆ SPACE(s(n)2)
Proof Try: Let N be a non-deterministic TM with space complexity s(n) Construct a deterministic machine M that tries every possible branch of N Since each branch of N uses space at most s(n), then M uses space at most s(n)…? SAVITCH’S THEOREM
Theorem: For functions s(n) where s(n) ≥ n NSPACE(s(n)) ⊆ SPACE(s(n)2)
Proof Try: Let N be a non-deterministic TM with space complexity s(n) Construct a deterministic machine M that tries every possible branch of N Since each branch of N uses space at most s(n), then M uses space at most s(n)…? There are 2^(O(2^O(s))) branches to keep track of! We need to simulate a non-deterministic computation and save as much space as possible IDEA: Given two configurations C1 and C2 of an s(n) space machine N, and a number t, determine if N can
get from C1 to C2 within t steps
Procedure CANYIELD(C1, C2, t):
If t = 0 then accept iff C1 = C2 If t = 1 then accept iff C1 yields C2 within one step. Use transition map of N to check [uses space O(s(n)) ] If t > 1, then Accept if and only if for some configuration Cm of size s(n), both CANYIELD(C1,Cm ,t/2) and CANYIELD(Cm,C2, t/2) accept
CANYIELD(C1, C2, t) has log(t) levels of recursion. Each level of recursion uses O(s(n)) additional space to store Cm. So CANYIELD(C1, C2, t) uses O(s(n) log(t)) space. IDEA: Given two configurations C1 and C2 of an s(n) space machine N, and a number t, determine if N can
get from C1 to C2 within t steps
Procedure CANYIELD(C1, C2, t):
If t = 0 then accept iff C1 = C2 If t = 1 then accept iff C1 yields C2 within one step. Use transition map of N to check [uses space O(s(n)) ] If t > 1, then Accept if and only if for some configuration Cm of size s(n), both CANYIELD(C1,Cm ,t/2) and CANYIELD(Cm,C2, t/2) accept M: On input w, ds(n) Output the result of CANYIELD(cstart, caccept, 2 ) Here d > 0 is chosen so that 2d s|w|) upper bounds the number of configurations of N(w)
IDEA: Given two configurations C1 and C2 of an s(n) space machine N, and a number t, determine if N can
get from C1 to C2 within t steps
Procedure CANYIELD(C1, C2, t):
If t = 0 then accept iff C1 = C2 If t = 1 then accept iff C1 yields C2 within one step. Use transition map of N to check [uses space O(s(n)) ] If t > 1, then Accept if and only if for some configuration Cm of size s(n), both CANYIELD(C1,Cm ,t/2) and CANYIELD(Cm,C2, t/2) accept M: On input w, ds(n) Output the result of CANYIELD(cstart, caccept, 2 ) ds(n) ds(n) CANYIELD(C1, C2, 2 ) uses O(s(n) log(2 )) space.
IDEA: Given two configurations C1 and C2 of an s(n) space machine N, and a number t, determine if N can
get from C1 to C2 within t steps
Procedure CANYIELD(C1, C2, t):
If t = 0 then accept iff C1 = C2 If t = 1 then accept iff C1 yields C2 within one step. Use transition map of N to check [uses space O(s(n)) ] If t > 1, then Accept if and only if for some configuration Cm of size s(n), both CANYIELD(C1,Cm ,t/2) and CANYIELD(Cm,C2, t/2) accept M: On input w, ds(n) Output the result of CANYIELD(cstart, caccept, 2 ) ds(n) 2 CANYIELD(cstart, caccept, , 2 ) uses O(s(n) )) space.
PSPACE = NPSPACE P NP PSPACE NPSPACE
EXPTIME P ⊆ NP ⊆ PSPACE ⊆ EXPTIME
P ≠ EXPTIME
TIME HIERARCHY THEOREM TIME HIERARCHY THEOREM
Intuition: If you have more TIME to work with, then you can solve strictly more problems!
Theorem: For functions f, g where g(n)/(f(n))2 → ∞ TIME(g(n)) ⊄ TIME(f(n))
So, for all k, since 2n/n2k → ∞, TIME(2n) ⊄ TIME(nk)
Therefore, TIME(2n) ⊄ P TIME HIERARCHY THEOREM
Intuition: If you have more TIME to work with, then you can solve strictly more problems!
Theorem: For functions f, g where g(n)/(f(n))2 → ∞ TIME(g(n)) ⊄ TIME(f(n)) Proof IDEA: Diagonalization Make a machine M that works in g(n) time and “does the opposite” of all f(n) time machines on at least one input
So L(M) is in TIME(g(n)) but not TIME(f(n)) TIME HIERARCHY THEOREM
Intuition: If you have more TIME to work with, then you can solve strictly more problems!
Theorem: For functions f, g where g(n)/(f(n))2 → ∞ TIME(g(n)) ⊄ TIME(f(n)) Proof IDEA: Diagonalization Need g(n) >> f(n)2 to ensure that you can simulate an arbitrary machine running in f(n) time with a single machine that runs in g(n) time.
So L(M) is in TIME(g(n)) but not TIME(f(n)) Definition: Language B is PSPACE-complete if:
1. B ∈ PSPACE 2. Every A in PSPACE is poly-time reducible to B (i.e. B is PSPACE-hard) QUANTIFIED BOOLEAN FORMULAS (in prenex normal form)
∃x∃y [ x ∨ ¬y ] ∀x [ x ∨ ¬x ] ∀x [ x ] ∀x∃y [ (x ∨ y) ∧ (¬x ∨ ¬y) ] ∀x∃y [ (x ∨ 0) ∧ (¬x ∨ ¬y) ] TQBF = { φ | φ is a true fully quantified Boolean formula in prenex normal form}
Theorem: TQBF is PSPACE-complete TQBF ∈ PSPACE T(φ): 1. If φ has no quantifiers, then it is an expression with only constants. Evaluate φ. Accept iff φ evaluates to 1. 2. If φ = ∃x ψ, recursively call T on ψ, first with x = 0 and then with x = 1. Accept iff either one of the calls accepts.
3. If φ = ∀x ψ, recursively call T on ψ, first with x = 0 and then with x = 1. Accept iff both of the calls accept. Claim: Every language A in PSPACE is polynomial time reducible to TQBF
We build a poly-time reduction from A to TQBF
The reduction turns a string w into a fully quantified Boolean formula φ that simulates the PSPACE machine for A on w
Will use ideas from Cook-Levin and Savitch
Let M be a deterministic TM that decides A in space nk A tableau for M on w is a table whose rows are the configurations of the computation of M on input w
# q0 w1 w2 … wn ¨ … ¨ # # #
O(nk) 2
# # nk We design φ to encode a simulation of M on w φ will be true if and only if M accepts w
Given two configurations c and d (with corresponding ordered collections of variables) and t > 0, we construct a formula φc,d,t If we assign c and d to actual configurations,
φc,d,t will be true if and only if M can go from c to d in t steps
We let φ = φ , where h = 2e s(n) for a c start , c accept , h constant e chosen so that M has less than 2e s(n) possible configurations on an input of length n Here s(n) = nk We design φ to encode a simulation of M on w φ will be true if and only if M accepts w
Given two configurations c and d (with corresponding ordered collections of variables) and t > 0, we construct a formula φc,d,t If we assign c and d to actual configurations,
φc,d,t will say: “there exists a configuration m such that
φc,m,t/2 is true and φm,d,t/2 is true” We let φ = φ , where h = 2e s(n) for a c start , c accept , h constant e chosen so that M has less than 2e s(n) possible configurations on an input of length n Here s(n) = nk HIGH-LEVEL IDEA: Encode the Algorithm of Savitch’s Theorem with a Quantified Boolean Formula
If M uses nk space, 2k then the QBF φ will have size O(n ) If we assign c and d to actual configurations,
φc,d,t will say: “there exists a configuration m such that
φc,m,t/2 is true and φm,d,t/2 is true” We let φ = φ , where h = 2e s(n) for a c start , c accept , h constant e chosen so that M has less than 2e s(n) possible configurations on an input of length n Here s(n) = nk To construct φc,d,t use ideas of Cook-Levin plus Savitch:
Each cell in a configuration is associated with variables representing possible tape symbols and states.
Each config has nk cells and so is encoded by by a formula with O(nk) variables. If t = 0 or 1, we can easily construct φc,d,t:
φc,d,t = “c equals d” OR “d follows from c in a single step of M”
How do we express “c equals d”? Write a Boolean formula saying that each of the variables in c is equal to the corresponding one in d
“d follows from c in a single step of M”? Use 2 x 3 windows as in the Cook-Levin theorem, and write a CNF formula If t > 1, we construct φc,d,t recursively:
φc,d,t = ∃m [φc,m,t/2 ∧ φm,d,t/2 ]
k ∃x1 ∃x2 …∃xL L= O(n ) But how long is this formula? Every level of the recursion cuts t in half but roughly doubles the size of the formula. So, we modify the formula to be:
φc,d,t = ∃m∀a,b[ [(a,b)=(c,m) ∨ (a,b)=(m,d)] => [ φa,b,t/2 ] ] This folds the 2 recursive sub-formulas into 1 φc,d,t = ∃m∀a,b[ [(a,b)=(c,m) ∨ (a,b)=(m,d)] =>[ φa,b,t/2 ] ]
Set = where h = 2e s(n) φ φc start , c accept , h
• Each recursive step adds a portion that is linear in the size of the configurations, so has size O(s(n)) • Number of levels of recursion is log h = O(s(n)) • Hence, the size of φ is O(s(n)2) PSPACE is often called the class of games
Formalizations of many popular games are PSPACE-Complete THE FORMULA GAME (FG) …is played between two players, E and A Given a fully quantified Boolean formula ∃y∀x [ (x ∨ y) ∧ (¬x ∨ ¬y) ]
E chooses values for variables quantified by ∃ A chooses values for variables quantified by ∀ Start at the leftmost quantifier E wins if the resulting formula is true A wins otherwise ∃x ∀y∃z [ (x ∨ y) ∧((y ∨ z) (¬y ∨ ¬z) ] ∃x ∀y∃z [ (x ∨ y) ∧((y ∨ z) ( y ∨ ¬z) ]
FG = { φ | Player E has a winning strategy in φ }
Theorem: FG is PSPACE-Complete Proof: FG = TQBF GEOGRAPHY Two players take turns naming cities from anywhere in the world Each city chosen must begin with the same letter that the previous city ended with Cities cannot be repeated
Austin → Nashua → Albany → York Whoever cannot name any more cities loses GENERALIZED GEOGRAPHY
d
b g a e i
c h
f GG = { (G, b) | Player 1 has a winning strategy for generalized geography played on graph G starting at node b }
Theorem: GG is PSPACE-Complete GG ∈ PSPACE WANT: Machine M that accepts (G,b) ó Player 1 has a winning strategy on (G, b) M(G, b): If b has no outgoing edges, reject. 1. Remove node b and all edges touching it to
get to a new graph G1
2. For each of the nodes b1, b2, …, bk that b originally pointed at, recursively call M(G1, bi)
3. If all of these accept, Player 2 has a winning strategy, so M rejects. Otherwise, accept. GG IS PSPACE-HARD
We show that FG ≤P GG We convert a formula φ into (G, b) such that: Player E has winning strategy in φ if and only if Player 1 has winning strategy in (G, b)
For simplicity we assume φ is of the form:
φ = ∃x1∀x2∃x3…∃xk [ψ] where ψ is in cnf. (Quantifiers alternate, and the last move is E’s) TRUE b FALSE ∃x1∀x2…∃xk(x1 ∨ x1 ∨ x2) ∧ (¬x ∨ ¬x ∨ ¬x ) x F 1 2 2 1 T ∧ …
x2 T F
c
xk T F TRUE b FALSE ∃x1∀x2…∃xk(x1 ∨ x1 ∨ x2) ∧ (¬x1 ∨ ¬x2 ∨ ¬x2) x T F x 1 1 ∧ …
c1 x1
x2 T F x2
c2 ¬x 1 c
¬x2
xk T F ¬x2 cn TRUE b FALSE ∃x1∀x2…∃xk(x1 ∨ x1 ∨ x2) ∧ (¬x1 ∨ ¬x2 ∨ ¬x2) x T F x 1 1 ∧ …
c1 x1
x2 T F x2
c2 ¬x 1 c
¬x2
xk T F ¬x2 cn ∃x1 [ (x1 ∨ x1 ∨ x1) ]
TRUE b FALSE x F 1 T x1
c1 x1
x1
c ∃x1 [ (x1 ∨ x1 ∨ x1) ]
TRUE b FALSE x F 1 T x1
c1 x1
x1
c Question: Is Chess a PSPACE complete problem?
No, because determining whether a player has a winning strategy takes CONSTANT time and space (OK, the constant is large…)
But n x n GO, Chess and Checkers can be shown to be PSPACE-hard