Part-Of-Speech Tagging • Lit Review Part 2 • Written Review of 2 Articles, Due April 1

Part-of-Speech Tagging
CS 341: Natural Language Processing
Prof. Heather Pon-Barry
www.mtholyoke.edu/courses/ponbarry/cs341.html

Announcements

• Lit Review Part 2
  • Written review of 2 articles, due April 1
• Final Project Proposal
  • Due Monday April 6

Today

• POS Tagging

POS Tagging

• The process of assigning a part-of-speech marker to each word in a collection:
  She/pronoun found/verb herself/pronoun falling/verb ...

Penn Treebank Tagset

[Figure: the Penn Treebank tagset]

POS Tagging

• Words often have more than one POS: e.g., back
  • The back door = adjective (JJ)
  • On my back = noun (NN)
  • Win the voters back = adverb (RB)
  • Promised to back the bill = verb (VB)
• The POS tagging problem is to determine the POS tag for a particular instance of a word.

Applications

• Speech synthesis: “I object” vs. “This object...”
• Parsing
• Machine translation
• Named entity recognition
• Word sense disambiguation

POS Tagging Performance

• How many tags are correct? (Tag accuracy)
• State of the art: about 97%
• But the baseline is already 90%
• Baseline performance is:
  • Tag every word with its most frequent tag
  • Tag unknown words as nouns
• Partly easy because
  • Many words are unambiguous
  • You get points for them (the, a, etc.) and for punctuation marks!

How Difficult is POS Tagging?

• In the Brown corpus:
  • ~11% of the word types are ambiguous with regard to part of speech
  • ~40% of the word tokens are ambiguous
• But ambiguous words tend to be very common. E.g., that:
  • I know that he is honest = preposition (IN)
  • Yes, that play was nice = determiner (DT)
  • You can’t go that far = adverb (RB)

Automatic POS Tagging

• Symbolic
  • Rule-based
  • Transformation-based
• Probabilistic
  • Hidden Markov models
  • Log-linear models

Rule-based Tagging

• Start with a dictionary
• Assign all possible tags to words from the dictionary
• Write rules by hand to selectively remove tags
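The most-frequent-tag baseline described above is simple enough to sketch directly. The code below is a minimal illustration, not course material: the function names and the tiny tagged corpus are invented for the example, and unknown words default to NN as the slides suggest.

```python
from collections import Counter, defaultdict

def train_baseline(tagged_corpus):
    """Learn each word's most frequent tag from (word, tag) pairs."""
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word.lower()][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag_baseline(words, most_freq_tag):
    """Tag each word with its most frequent tag; unknown words become NN."""
    return [(w, most_freq_tag.get(w.lower(), "NN")) for w in words]

# Invented toy corpus, for illustration only.
corpus = [("the", "DT"), ("back", "NN"), ("back", "VB"), ("back", "NN"),
          ("bill", "NN"), ("promised", "VBD")]
model = train_baseline(corpus)
print(tag_baseline(["promised", "to", "back", "the", "bill"], model))
# "back" gets NN (its most frequent tag in the toy corpus) even where VB
# would be correct, and the unknown word "to" falls back to NN.
```

This is why the baseline plateaus around 90%: it can never tag an ambiguous word differently in different contexts.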
• Leaving the correct tag for each word

Rule-based Example

• All possible tags from the dictionary for “She promised to back the bill”:
  She/PRP promised/{VBN,VBD} to/TO back/{NN,RB,JJ,VB} the/DT bill/NN
• Example rule: eliminate VBN if VBD is an option when VBN|VBD follows “<start> PRP”
  • Result: promised/VBD

Transformation-based Tagging

• Combines rule-based and probabilistic tagging
  • Rule-based: rules are used to specify tags in a certain environment
  • Probabilistic: we use a tagged corpus to find the best performing rules (supervised learning)
• Input
  • Tagged corpus
  • Dictionary (with most frequent tags)
• Example: Brill tagger

HMM Part-of-Speech Tagging

• Transition probabilities: P(tag | previous tag) [figure: transition probability matrix]
• Observation likelihoods: P(word | tag) [figure: observation likelihood matrix]

Maxent P(tag|word)

• Can do surprisingly well just looking at a word by itself:
  • Word: the → DT
  • Prefixes: unfathomable → un- → JJ
  • Suffixes: importantly → -ly → RB

MEMMs

• Maximum Entropy Markov Model
• A sequence version of the maximum entropy classifier
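The HMM's transition and observation probabilities above are just relative frequencies read off a tagged corpus. A minimal sketch of maximum-likelihood estimation, assuming an invented two-sentence corpus (the sentences and all counts are illustrative, not from the lecture):

```python
from collections import Counter

# Hypothetical tagged sentences, for illustration only.
sentences = [[("she", "PRP"), ("promised", "VBD"), ("to", "TO"),
              ("back", "VB"), ("the", "DT"), ("bill", "NN")],
             [("the", "DT"), ("back", "NN"), ("door", "NN")]]

transitions = Counter()   # (prev_tag, tag) counts
emissions = Counter()     # (tag, word) counts
tag_counts = Counter()

for sent in sentences:
    prev = "<s>"
    for word, tag in sent:
        transitions[(prev, tag)] += 1
        emissions[(tag, word)] += 1
        tag_counts[tag] += 1
        prev = tag
    tag_counts["<s>"] += 1  # one start symbol per sentence

def p_transition(prev, tag):
    """Maximum-likelihood estimate of P(tag | prev)."""
    return transitions[(prev, tag)] / tag_counts[prev]

def p_emission(tag, word):
    """Maximum-likelihood estimate of P(word | tag)."""
    return emissions[(tag, word)] / tag_counts[tag]

print(p_transition("DT", "NN"))   # how often NN follows DT in the toy data
print(p_emission("NN", "bill"))   # how often NN is realized as "bill"
```

A real tagger would also smooth these estimates so unseen (tag, word) and (tag, tag) pairs do not get zero probability.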
• More word-level features:
  • Capitalization: Meridian → CAP → NNP
  • Word shapes: 35-year → d-x → JJ
• Then build a classifier to predict the tag
• Maxent P(tag|word): 93.7% overall / 82.6% unknown
Slide adapted from Dan Jurafsky

More Features

[Figure: MEMM tagging of “<s> Janet will back the bill”, partial tag sequence NNP MD VB; the current tag is predicted from features over the previous tags (t_{i-2}, t_{i-1}) and the surrounding words (w_{i-1}, w_i, w_{i+1}).]
Slide adapted from Dan Jurafsky

MEMM Decoding

• Simplest algorithm
  • Greedy: at each step in the sequence, select the tag that maximizes P(tag | nearby words, nearby tags)
• In practice
  • Viterbi algorithm
  • Beam search

POS Tagging Accuracies

• Rough accuracies:
  • Baseline (most frequent tag): ~90%
  • Trigram HMM: ~95%
  • Maxent P(t|w): 93.7%
  • MEMM tagger: 96.9%
  • Bidirectional MEMM: 97.2%
  • Upper bound: ~98% (human agreement)
Slide adapted from Dan Jurafsky

More Resources

• Stanford POS Tagger (cyclic dependency network, a bidirectional version of the MEMM)
  • http://nlp.stanford.edu/software/tagger.shtml
• CMU Twitter POS tagger
  • http://www.ark.cs.cmu.edu/TweetNLP/

References

• Log-linear models:
  • Ratnaparkhi, EMNLP 1996
  • Toutanova et al., NAACL 2003
• Excellent recent survey: “Part-of-speech tagging from 97% to 100%: is it time for some linguistics?” (Manning, 2011)

Summary

• Penn Treebank: standard tagset
• Approaches to POS tagging:
  • Symbolic: rule-based, transformation-based
  • Probabilistic: HMMs, MEMMs

Training a Tagger

• Input
  • Tagged corpus
  • Dictionary (with most frequent tags)
• These are available for English
• What about other languages?

Research in POS Tagging

• Low-resource languages
  • Learning a Part-of-Speech Tagger from Two Hours of Annotation (Garrette and Baldridge, 2013) [video]
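The greedy MEMM decoding described above can be sketched in a few lines. Note the stand-in: a real MEMM computes P(tag | word, previous tags) with a trained maxent classifier over many features, while the probability table here is hand-invented purely to make the example runnable.

```python
# Hypothetical conditional model P(tag | word, prev_tag), stored as a table.
# A real MEMM would compute these with a trained maxent classifier.
TABLE = {
    ("Janet", "<s>"):  {"NNP": 0.9, "NN": 0.1},
    ("will", "NNP"):   {"MD": 0.8, "NN": 0.2},
    ("back", "MD"):    {"VB": 0.7, "NN": 0.3},
}

def p_tag(tag, word, prev):
    """Look up P(tag | word, prev_tag) in the toy table (0.0 if unseen)."""
    return TABLE.get((word, prev), {}).get(tag, 0.0)

def greedy_decode(words, tagset, p_tag):
    """At each step, commit to the tag maximizing P(tag | word, prev tag)."""
    tags, prev = [], "<s>"
    for w in words:
        best = max(tagset, key=lambda t: p_tag(t, w, prev))
        tags.append(best)
        prev = best
    return tags

print(greedy_decode(["Janet", "will", "back"], ["NNP", "MD", "VB", "NN"], p_tag))
# -> ['NNP', 'MD', 'VB']
```

Greedy decoding can never revisit an early mistake, which is why Viterbi or beam search is used in practice: they keep multiple partial tag sequences alive before committing.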
