How many recursive calls of Eval are there? • { one per subexpression: at most m Maximum depth of recursion? • { bounded by total number of calls: at most m Maximum number of iterations of for loop? • { I n per recursion level | |  { at most nm iterations

Checking c , ..., c pI can be done in linear time w.r.t. n • h 1 ni2

Runtime in m nm n = m nm+1 · · ·

How to Measure Query Answering Complexity

Query answering as decision problem { consider Boolean queries

DATABASE THEORY Various notions of complexity: Combined complexity (complexity w.r.t. size of query and database instance) • Lecture 4: Complexity of FO Query Answering Data complexity (worst case complexity for any fixed query) • Query complexity (worst case complexity for any fixed database instance) David Carral •

Knowledge-Based Systems Various common complexity classes:

L NL P NP PSpace ExpTime ✓ ✓ ✓ ✓ ✓

TU Dresden, 23rd Apr 2019

David Carral, 23rd Apr 2019 Database Theory slide 2 of 11

An for Evaluating FO Queries FO Algorithm Worst-Case Runtime function Eval(', ) I Let m be the size of ', and let n = (total table sizes) |I| 01 switch (') { 02 case p(c , ..., c ): return c , ..., c pI 1 n h 1 ni2 03 case : return Eval( , ) ¬ ¬ I 04 case : return Eval( , ) Eval( , ) 1 ^ 2 1 I ^ 2 I 05 case x. : 9 06 for c 2 I { 07 if Eval( [x c], ) then return true 7! I 08 } 09 return false 10 }

David Carral, 23rd Apr 2019 Database Theory slide 3 of 11 David Carral, 23rd Apr 2019 Database Theory slide 4 of 11 FO Algorithm Worst-Case Runtime Time Complexity of FO Algorithm

Let m be the size of ', and let n = (total table sizes) |I| Let m be the size of ', and let n = (total table sizes) How many recursive calls of Eval are there? |I| • { one per subexpression: at most m Runtime in m nm+1 · Maximum depth of recursion? • { bounded by total number of calls: at most m Time complexity of FO query evaluation Maximum number of iterations of for loop? Combined complexity: in ExpTime • • { I n per recursion level | |  Data complexity (m is constant): in P { at most nm iterations • Query complexity (n is constant): in ExpTime • Checking c , ..., c pI can be done in linear time w.r.t. n • h 1 ni2

Runtime in m nm n = m nm+1 · · ·

David Carral, 23rd Apr 2019 Database Theory slide 4 of 11 David Carral, 23rd Apr 2019 Database Theory slide 5 of 11

FO Algorithm Worst-Case Memory Usage Space Complexity of FO Algorithm

We can get better complexity bounds by looking at memory Let m be the size of ', and let n = (total table sizes) |I| Let m be the size of ', and let n = (total table sizes) |I| Memory in m log m + (m + 1) log n

For each (recursive) call, store pointer to current subexpression of ': log m • Space complexity of FO query evaluation For each variable in ' (at most m), store current constant assignment (as a • Combined complexity: in PSpace pointer): m log n • · Data complexity (m is constant): in Checking c1, ..., cn pI can be done in logarithmic space w.r.t. n • • h i2 Query complexity (n is constant): in PSpace • Memory in m log m + m log n + log n = m log m + (m + 1) log n

David Carral, 23rd Apr 2019 Database Theory slide 6 of 11 David Carral, 23rd Apr 2019 Database Theory slide 7 of 11 { QBF satisfiability

Let Q X . Q X . Q X .'[X , ..., X ] be a QBF (with Q , ) 1 1 2 2 ··· n n 1 n i 2 {8 9} Database instance with I = 0, 1 • I { } One table with one row: true(1) • Transform input QBF into Boolean FO query • Q x . Q x . Q x .'[X true(x ), ..., X true(x )] 1 1 2 2 ··· n n 1 7! 1 n 7! n It is easy to check that this yields the required reduction. ⇤

Better approach: Consider QBF Q X . Q X . Q X .'[X , ..., X ] with ' in negation normal form: • 1 1 2 2 ··· n n 1 n negations only occur directly before variables Xi (still PSpace-complete: exercise)

Database instance with I = 0, 1 • I { } Two tables with one row each: true(1) and false(0) • Transform input QBF into Boolean FO query •

Q x . Q x . Q x .'0 1 1 2 2 ··· n n where ' is obtained by replacing each negated variable X with false(x ) and 0 ¬ i i each non-negated variable Xi with true(xi).

FO Combined Complexity FO Combined Complexity

The algorithm shows that FO query evaluation is in PSpace. The algorithm shows that FO query evaluation is in PSpace. Is this the best we can get? Is this the best we can get?

Hardness proof: reduce a known PSpace-hard problem to FO query evaluation Hardness proof: reduce a known PSpace-hard problem to FO query evaluation { QBF satisfiability

Let Q X . Q X . Q X .'[X , ..., X ] be a QBF (with Q , ) 1 1 2 2 ··· n n 1 n i 2 {8 9} Database instance with I = 0, 1 • I { } One table with one row: true(1) • Transform input QBF into Boolean FO query • Q x . Q x . Q x .'[X true(x ), ..., X true(x )] 1 1 2 2 ··· n n 1 7! 1 n 7! n It is easy to check that this yields the required reduction. ⇤

David Carral, 23rd Apr 2019 Database Theory slide 8 of 11 David Carral, 23rd Apr 2019 Database Theory slide 8 of 11

PSpace-hardness for DI Queries PSpace-hardness for DI Queries

The previous reduction from QBF may lead to a query that is not domain independent The previous reduction from QBF may lead to a query that is not domain independent

Example: QBF p. p leads to FO query x. true(x) Example: QBF p. p leads to FO query x. true(x) 9 ¬ 9 ¬ 9 ¬ 9 ¬ Better approach: Consider QBF Q X . Q X . Q X .'[X , ..., X ] with ' in negation normal form: • 1 1 2 2 ··· n n 1 n negations only occur directly before variables Xi (still PSpace-complete: exercise)

Database instance with I = 0, 1 • I { } Two tables with one row each: true(1) and false(0) • Transform input QBF into Boolean FO query •

Q x . Q x . Q x .'0 1 1 2 2 ··· n n where ' is obtained by replacing each negated variable X with false(x ) and 0 ¬ i i each non-negated variable Xi with true(xi).

David Carral, 23rd Apr 2019 Database Theory slide 9 of 11 David Carral, 23rd Apr 2019 Database Theory slide 9 of 11 We have actually shown something stronger:

Theorem 4.2: The evaluation of FO queries is PSpace-complete with respect to query complexity.

Combined Complexity of FO Query Answering Combined Complexity of FO Query Answering

Summing up, we obtain: Summing up, we obtain:

Theorem 4.1: The evaluation of FO queries is PSpace-complete with respect to Theorem 4.1: The evaluation of FO queries is PSpace-complete with respect to combined complexity. combined complexity.

We have actually shown something stronger:

Theorem 4.2: The evaluation of FO queries is PSpace-complete with respect to query complexity.

David Carral, 23rd Apr 2019 Database Theory slide 10 of 11 David Carral, 23rd Apr 2019 Database Theory slide 10 of 11

Summary and Outlook

The evaluation of FO queries is PSpace-complete for combined complexity • PSpace-complete for query complexity •

Open questions: What is the data complexity of FO queries? • Are there query languages with lower complexities? (next lecture) • Which other computing problems are interesting? •

David Carral, 23rd Apr 2019 Database Theory slide 11 of 11