CS 61A/CS 98-52

Mehrdad Niknami
University of California, Berkeley

Motivation

How would you find a substring inside a string?

Something like this? (Is this good?)

    def find(string, pattern):
        n = len(string)
        m = len(pattern)
        for i in range(n - m + 1):
            is_match = True
            for j in range(m):
                if pattern[j] != string[i + j]:
                    is_match = False
                    break
            if is_match:
                return i

What if you were looking for a pattern? Like an email address?
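One way to search for a pattern rather than a fixed substring is a regular expression. The sketch below uses Python's standard `re` module; the pattern and the helper name `find_email` are assumptions made for illustration, and the pattern is deliberately simplified (real email address syntax allows far more than this).

```python
import re

# Deliberately simplified pattern (an assumption for illustration):
# some name characters, an "@", then a dotted domain name.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

def find_email(text):
    """Return (index, address) for the first email-like substring, or None."""
    match = EMAIL_PATTERN.search(text)
    if match is None:
        return None
    return match.start(), match.group()

print(find_email("contact me at alice@example.com today"))
```

Unlike `find` above, the caller describes the shape of what to look for instead of spelling out the exact characters.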

Background

Text processing has been at the heart of computer science since the 1950s

  • Regular languages: 1950s (Kleene)
  • Context-free languages (CFLs): 1950s (Chomsky)
  • Regular expressions (regexes) & automata: 1960s (Thompson)
  • LR parsing (left-to-right, rightmost-derivation): 1960s (Knuth)
  • Context-free parsers: 1960s (Earley)
  • String searching (Knuth-Morris-Pratt, Boyer-Moore, etc.): 1970s
  • Periods & critical factorizations: 1970s (Cesari-Vincent)
  • [...]
  • Critical factorizations in linear complexity: 2016 (Kosolobov)

Research is still ongoing ...apparently more in Europe?

Most of you will probably graduate without learning string processing. Instead, you’ll learn how to process images and Big Data™. Which makes me sad. :(

You should know how to solve solved problems!
Learn & use 100%-accurate algorithms before 85%-accurate ones!

O(mn)-time str.find(substring) is bad!
You can do much better: Good algorithms finish in O(m + n) time
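As a rough illustration of an O(m + n) search, here is a minimal sketch of Knuth-Morris-Pratt, one of the string-searching algorithms named in the Background list. The slides do not say which algorithm they have in mind at this point, so treat the choice as an assumption; the names `build_failure` and `kmp_find` are made up for this example.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter matching prefix
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def kmp_find(string, pattern):
    """Return the index of the first occurrence of pattern, or -1."""
    if not pattern:
        return 0
    fail = build_failure(pattern)
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(string):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]  # reuse what is already known; never re-read string
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1

print(kmp_find("abracadabra", "cad"))
```

Each character of the string is read once and the fallback steps are bounded by the number of earlier advances, which is where the O(m + n) bound comes from. Unlike the `find` in the Motivation section, this sketch returns -1 when there is no match, mirroring `str.find`.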
