Recursive Descent Parser

Recursive Descent Parser

MIT 6. 035 Top-Down Parsing Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology Orientation • Language specification • Lexical structure – regular expressions • Syntactic structure – grammar • This Lecture - recursive ddescentescent parsers • Code parser as set of mutually recursive procedures • Structure ooff pprogramrogram matches sstructuretructure of grammar Starting Point • Assume lexical analysis has produced a sequence of tokens • Each token has a type and value • Types ccorrespondorrespond ttoo tterminalserminals • Values to contents of token read in • Examples • Int 549 – integer token with value 549 read in • if - if keyword, no need for a value • AddOp + - add operator, value + Basic Approach • Start with Start symbol • Bu ild a le ftft most der iva tion • If leftmost symbol is nonterminal, choose a production and applyapply itit • If leftmost symbol is terminal, match against input • If all terminals match, have found a parse! •Key: find correct productions for nonterminals Graphical Illustration of Leftmost Derivation Sentential Form NT1 T1 T2 T3 NT2 NT3 Apply Production Not Here Here Grammar for Parsing Example Start → Expr • Set ooff ttokokensens iiss Expr → Expr + Term { +, -, *, /, Int }, where Expr → Expr - Term Int = [0-9][0-9]* Expr → Term • FFoo r ccononveeniencenience , may reprepreesentsent Term → Term * Int each Int n token by n Term → Term / Int Term → Int Parsing Example Parse Remaining Input Tree Start 2-2*2 Sentential Form Start Current Position in Parse Tree Parsing Example Parse Remaining Input Tree Start 2-2*2 Exprp Sentential Form Expr Apppplied Production Start → Expr Current Position in Parse Tree Parsing Example Parse Remaining Input Tree Start 2-2*2 Exprp Sentential Form Expr - Term Expr - Term Expr → Expr + Term Apppplied Production Expr → Expr - Term Expr → Term Expr → Expr - Term Parsing Example Parse Remaining Input Tree Start 2-2*2 Exppr Sentential Form Expr - Term Term - Term Term Exprp → Exprp + Teerm Apppplied Production Expr → Expr - Term Expr → Term Expr →→Te Termrm Parsing Example Parse Remaining Input Tree Start 2-2*2 Exppr Sentential Form Expr - Term Int - Term Term Apppplied Production Int Term → Int Parsing Example Parse Remaining Input Tree Start Match Input 2-2*2 Exppr Token! Sentential Form Expr - Term 2 - Term Term Int 2 Parsing Example Parse Remaining Input Tree Start Match Input -2*2 Exppr Token! Sentential Form Expr - Term 2 - Term Term Int 2 Parsing Example Parse Remaining Input Tree Start Match Input 2*2 Exppr Token! Sentential Form Expr - Term 2 - Term Term Int 2 Parsing Example Parse Remaining Input Tree Start 2*2 Exppr Sentential Form Expr - Term 2 - Term *Int Term Term * Int Apppplied Production Int 2 Term → Term * Int Parsing Example Parse Remaining Input Tree Start 2*2 Exppr Sentential Form Expr - Term 2 - Int * Int Term Term * Int Apppplied Production Int 2 Int Term → Int Parsing Example Parse Remaining Input Tree Start Match Input 2*2 Exppr Token! Sentential Form Expr - Term 2 - 2* Int Term Term * Int Int 2 Int 2 Parsing Example Parse Remaining Input Tree Start Match Input *2 Exppr Token! Sentential Form Expr - Term 2 - 2* Int Term Term * Int Int 2 Int 2 Parsing Example Parse Remaining Input Tree Start Match Input 2 Exppr Token! Sentential Form Expr - Term 2 - 2* Int Term Term * Int Int 2 Int 2 Parsing Example Parse Remaining Input Tree Start Parse Complete! 2 Exppr Sentential Form Expr - Term 2 - 2* 2 Term Term * Int 2 Int 2 Int 2 Summary • Three Actions (Mechanisms) • AlApply pro ductit ion to expand current nonterminal in parse tree • Match ccurrenturrent terminal (consuming input) • Accept the parse as correct • Parser generates ppreorderreorder ttraversalraversal of parse ttreeree • visit parents before children • visit ssiblingsiblings ffromrom llefteft ttoo rrightight Policy Problem • Which production to use for each nonterminal? • Classical Separation of Policy and Mechanism • One Approach: Backtracking • Treat it as a search problem • At each choice point, try next alternative • If it is clear that current try fails, go back to previous choice and trytry something ddifferentifferent • General technique for searching • Used a lot in classical AI and natural langgguage processing (parsing, speech recognition) Backtracking Example Parse Remaining Input Tree Start 2-2*2 Sentential Form Start Backtracking Example Parse Remaining Input Tree Start 2-2*2 Exp r Sentential Form Expr Applied Production Start → Expr Backtracking Example Parse Remaining Input Tree Start 2-2*2 Expr Sentential Form Expr + Term Expr + Term Applied Production Expr → Expr + Term Backtracking Example Parse Remaining Input Tree Start 2-2*2 Expr Sentential Form Expr + Term Term + Term Term Applied Production Expr → Term Backtracking Example Parse Remaining Input Tree Start Match Input 2-2*2 Expr Token! Sentential Form Expr + Term Int + Term Term Applied Production Int Term → Int Backtracking Example Parse Remaining Input Tree Start Can’ t Match -2*2 Expr Input Sentential Form Token! Expr + Term 2 - Term Term Applied Production Int 2 Term → Int Backtracking Example Parse Remaining Input Tree Start So Backtrack! 2-2*2 Exp r Sentential Form Expr Applied Production Start → Expr Backtracking Example Parse Remaining Input Tree Start 2-2*2 Expr Sentential Form Expr - Term Expr - Term Applied Production Expr → Expr - Term Backtracking Example Parse Remaining Input Tree Start 2-2*2 Expr Sentential Form Expr - Term Term - Term Term Applied Production Expr → Term Backtracking Example Parse Remaining Input Tree Start 2-2*2 Expr Sentential Form Expr - Term Int - Term Term Applied Production Int Term → Int Backtracking Example Parse Remaining Input Tree Start Match Input -2*2 Expr Token! Sentential Form Expr - Term 2 - Term Term Int 2 Backtracking Example Parse Remaining Input Tree Start Match Input 2*2 Expr Token! Sentential Form Expr - Term 2 - Term Term Int 2 Left Recursion + Top-Down Parsing = Infinite Loop • Example Production: Term → Term*Num • PtPoten ttiial pars ing steps: Term Term Term Num Term * Num Term * Term * Num General Search Issues • Three components • Search space (parse trees) • Search algorithm (parsing(parsing algorithm)algorithm) • Goal to find (parse tree for input program) • Would like to (but can’t always) ensure that • Find goal (hopefully quickly) if it exists • Search terminates if it does not • Handled inin variousvarious ways in various contextscontexts • Finite search space makes it easy • Exploration strategies for infinite search space • Sometimes oneone goal more important (model checking) • For parsing, hack grammar to remove left recursion Eliminating Left Recursion • Start with productions of form •A→ A α • A →β • α, β sequences of terminals and nonterminals that do not startstart with A • Repeated application of A →A α A builds parseparse tree like tthis:his: A α A α β α Eliminating Left Recursion • Replacement productions –A→ A α A →βR R is a new nonterminal –A→β R →αR – R → ε New ParseParse TTrreeee Old Parse Tree A A R β R A α α R β α α ε Hacked Grammar Original Grammar New Grammar Fragment Fragment Te rm → In t Te rm ’ Term → Term * Int Term’ → * Int Term’ Term → Term / Int Term’ → / Int Term’ Term → Int Term’ → ε Parse Tree Comparisons Original Grammar New GGrraammarmmar Term Term Int Term’ Term * Int Term’ Int * Int * Int * Int Term’ ε Eliminating Left Recursion • Changes search space exploration algorithm • EElilim ina tes didirec t iifnfiiinitte recursion • But grammar less intuitive • Sets things up for predictivepredictive pparsingarsing Predictive Parsing • Alternative to backtracking • UfUseful for programming lan guages, whihich can be designed to make parsing easier • Basic iideadea • Look ahead in input stream • Decide which pproductionroduction ttoo aapplypply bbasedased on next tokens in input stream • We will use one token of lookahead Predictive Parsing Example Grammar Start → Expr Te rm → In t Te rm ’ Expr → Term Expr’ Term’ → * In t Term’ Expr’ → + Expr’ Term’ → / In t Term’ Expr’ → - Expr’ Term’ → ε Expr’ → ε Choice Points • Assume Term’ is current position in parse tree • Have three possiblepossible productionsproductions to apply Term’ → * Int Term’ Teerm’ → / Int Teerm’ Term’ →ε • Use next token to decide • If next token is *, apply Term’ → * Int Term’ • If next token is /, apply Term’ → / Int Term’ • Otherwise , aapplypply Term ’ → ε Predictive Parsing + Hand Coding = Recursive DescentDescent Parser • One procedure per nonterminal NT • PdProducttiions NT → β1 , …, NT → βn • Procedure examines the current input symbol T to determine which production to apppply •If T∈First(βk) • Apply production k • Consume terminals in βk (h(check for correct terminal) • Recursively call procedures for nonterminals in βk • Current input symbol stored in global variable token • Procedures return • true if parse succeedssucceeds • false if parse fails Boolean Term() Example if (token = Int n) token = NextToken(); return(TermPrime()) else return(false) Boolean TTermPrime()ermPrime() if (token = *) token = NextToken(); if (token = Int n) token = NexttTToken(); return((TTermPrime()) else return(false) else if (token = /) token = NextToken(); if (token = Int n) token = NextToken(); return(TermPrime()) else return(false) else return(true) Term → Int Term’ Term’ → * Int Term’ Term’ → / Int Term’ Term’ →ε Multiple Productions With Same Prefix in RHS • Example Grammar NT → if then NT → if then else • Assume NT is current position in parse ttree ree, aandnd if is the next token • Unclear whichwhich production to apply • Multiple

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    73 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us