Indexed Languages and why you care

Presented by Pieter Hooimeijer 2008-02-07

1 Think Way Back...

2 www.xkcd.com ...Far Enough:

... CS NCF DCF REG REG

3 Classes

... CS NCF DCF REG REG

PDA

4 Formal Language Classes

... CS NCF DCF REG REG

PDA DFA

5 Formal Language Classes

... CS NCF DCF REG REG

Machines PDA DFA

Formalism S → aSb | z a(ab)*b

6 Formal Language Classes

... CS NCF DCF REG REG

Pump Pump

7 The Context-Free Pumping Lemma

Suppose L = { anbncn | n = 1...} is context- free. By the pumping lemma, any string s with |s| ≥ p can be 'pumped.'

p

8 The Context-Free Pumping Lemma

What does 'can be pumped' mean? s = uvxyz 1)|vy| > 0 2)|vxy| ≤ p 3) uvixyiz is in L for all i ≥ 0

9 The Context-Free Pumping Lemma

Suppose L = { anbncn | n = 1...} is context-free.

Consider s = apbpcp; for any split s = uvxyz, we have: - if v contains a's, then y cannot contain c's - if y contains c's, then v cannot contain a's - a problem: uv0xy0z = uxz will never be in L

10 Moving Right Along

11 Motivation – Some Examples

• Can solve problems by phrasing them as 'language' problems: • Finding valid control flow graph paths • Solving set constraint problems • Static string analysis • Lexing, parsing • For fun...

12 Example – String Variables

www.xkcd.com 13 The Nugget

Some Code: x = 'z'; • We want a context free grammar to model x while(n < 5) { x = '(' . x . ')'; n ++; • Suppose we don't know } anything about n

14 The Nugget

Some Code: Grammar:

>x = 'z'; A -> z

while(n < 5) { x = '(' . x . ')'; n ++; }

15 The Nugget

Some Code: Grammar:

x = 'z'; A -> z [True]

while(n < 5) { B -> (A) [n < 5] > x = '(' . x . ')'; n ++; }

16 The Nugget

Some Code: Grammar:

x = 'z'; A -> z [True]

while(n < 5) { B -> (A) [n < 5] x = '(' . x . ')'; C -> A | B [n < 5] n ++; >>}

17 The Nugget

Some Code: Grammar:

x = 'z'; A -> z [True] B -> (C) [n < 5] while(n < 5) { C -> A | B [True] x = '(' . x . ')'; n ++; >>}

18 The Nugget

Some Code: Grammar:

x = 'z'; X -> C

while(n < 5) { A -> z [True] x = '(' . x . ')'; B -> (C) [n < 5] n ++; C -> A | B [True] >>}

19 Indexed Languages

... CS *I* NCF DCF REG REG

20 Definition:

• G = (N, T, F, P, S)

P : { N → ((NF*) ∪ T)* }

F : { { N → (N ∪ T)*}f }

21 Derivation Rule Example

S 1 S → Sf S → ABgC Sf

Sff

Aff Bgff Cff

22 Derivation Rule Example

S 1 S → Sf S → ABgC Sf

Sff 'normal' production: copy index stack Aff Bgff Cff

23 Derivation Rule Example

S 2 S → Af F = { { A → B}f } Af

B●

24 Derivation Rule Example

S 2 S → Af F = { { A → B}f } Af B

25 Derivation Rule Example

S 2 S → Af F = { { A → B}f } Af B 'index' production: pop leftmost index (for current nonterminal only)

26 An Example Grammar

S → Df g = {A → Aa f = {A → a

D → Dg | ABC B → Bb B → b C → Cc} C → c}

27 An Example Grammar

S → Df g = {A → Aa f = {A → a

D → Dg | ABC B → Bb B → b C → Cc} C → c} What is the language of this grammar?

28 An Example Grammar

S → Df S D → Dg | ABC g = {A → Aa B → Bb C → Cc} f = {A → a B → b C → c} 29 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa B → Bb C → Cc} f = {A → a B → b C → c} 30 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb C → Cc} f = {A → a B → b C → c} 31 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb

C → Cc} Agf Bgf Cgf f = {A → a B → b C → c} 32 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb

C → Cc} Agf Bgf Cgf f = {A → a B → b C → c} 33 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb

C → Cc} Agf Bgf Cgf f = {A → a Afa B → b C → c} 34 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb

C → Cc} Agf Bgf Cgf f = {A → a Afa a B → b C → c} aa 35 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb

C → Cc} Agf Bgf Cgf Afa f = {A → a Bfb a B → b b C → c} aa bb 36 An Example Grammar

S → Df S D → Dg | ABC Df g = {A → Aa Dgf B → Bb

C → Cc} Agf Bgf Cgf Afa Cfc f = {A → a Bfb a c B → b b C → c} aa bb cc 37 Where am I?

{ anbncn | n = 1...}

... CS *I* NCF DCF REG REG

38 ... CS *I* NCF DCF REG REG

Machines PDA DFA

Formalisms S → aSb | z a(ab)*b

Pumps 39 ... CS *I* NCF DCF REG REG

NSA PDA DFA

S → aSfb | z S → aSb | z a(ab)*b

? 40 Associated Automaton

41 Associated Automaton

42 (Actual NSA)

[Gilman, Shapiro 1998] 43 But will it pump?

• It's complicated – on derivation trees rather than strings [Hayashi 1973]

• Several corollaries exist [Gilman 1996] (less general; easier to use)

44 The Lemma

L : m : positive integer There is a constant k > 0 so that each w in L can be split (w = w1w2...wr) subject to: The Lemma

L : indexed language m : positive integer There is a constant k > 0 so that each w in L can be split (w = w1w2...wr) subject to: • m < r ≤ k

• each |wi| > 0 The Lemma

L : indexed language m : positive integer There is a constant k > 0 so that each w in L can be split (w = w1w2...wr) subject to: • m < r ≤ k

• each |wi| > 0

• any m-sized set of wi's is a subset of some w' in L; w' is a subproduct of w Lemma Example Lemma Example (Montage Time)

50 { (abn)n | n = 1...} { anbncn | n = 1...}

... CS *I* NCF DCF REG REG

NSA PDA DFA

S → aSfb | z S → aSb | z a(ab)*b

? 51 52 References

• Indexed Grammars – An Extension to Context-Free Grammars (Aho) • Nested Stack Automata (Aho) • Sequentially Indexed Grammars (van Eijck) • A Shrinking Lemma for Indexed Languages (Gilman) • On Groups Whose Word Problem is Solved by a (Gilman and Shapiro) • On Derivation Trees of Indexed Grammars (Hayashi)

53