
Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation Statistical Machine Translation p Lecture 2 ¯ Components: Translation model, language model, decoder Theory and Praxis of Decoding foreign/English English Philipp Koehn parallel text text [email protected] statistical analysis statistical analysis School of Informatics University of Edinburgh Translation Language Model Model Decoding Algorithm – p.1 – p.2 Philipp Koehn, University of Edinburgh 2 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Phrase-Based Systems p Phrase-Based Translation p ¯ A number of research groups developed phrase-based Morgen fliege ich nach Kanada zur Konferenz systems ( RWTH Aachen, Univ. of Southern California/ISI, CMU, IBM, Johns Hopkins Univ., Cambridge Univ., Univ. of Catalunya, ITC-irst, Univ. Edinburgh, Univ. of Maryland...) Tomorrow I will fly to the conference in Canada ¯ Systems differ in – training methods ¯ Foreign input is segmented in phrases – model for phrase translation table – any sequence of words, not necessarily linguistically motivated – reordering models ¯ Each phrase is translated into English – additional feature functions ¯ Phrases are reordered ¯ Currently best method for SMT (MT?) – top systems in DARPA/NIST evaluation are phrase-based – best commercial system for Arabic-English is phrase-based – p.3 – p.4 Philipp Koehn, University of Edinburgh 3 Philipp Koehn, University of Edinburgh 4 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Phrase Translation Table p Decoding Process p Maria no dio una bofetada a la bruja verde ¯ Phrase Translations for “den Vorschlag”: j j English (e f) English (e f) the proposal 0.6227 the suggestions 0.0114 ’s proposal 0.1068 the proposed 0.0114 a proposal 0.0341 the motion 0.0091 ¯ Build translation left to right the idea 0.0250 the idea of 0.0091 – select foreign words to be translated this proposal 0.0227 the proposal , 0.0068 proposal 0.0205 its proposal 0.0068 of the proposal 0.0159 it 0.0068 the proposals 0.0159 ... ... – p.5 – p.6 Philipp Koehn, University of Edinburgh 5 Philipp Koehn, University of Edinburgh 6 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Decoding Process p Decoding Process p Maria no dio una bofetada a la bruja verde Maria no dio una bofetada a la bruja verde Mary Mary ¯ ¯ Build translation left to right Build translation left to right – select foreign words to be translated – select foreign words to be translated – find English phrase translation – find English phrase translation – add English phrase to end of partial translation – add English phrase to end of partial translation – mark foreign words as translated – p.7 – p.8 Philipp Koehn, University of Edinburgh 7 Philipp Koehn, University of Edinburgh 8 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Decoding Process p Decoding Process p Maria no dio una bofetada a la bruja verde Maria no dio una bofetada a la bruja verde Mary did not Mary did not slap ¯ ¯ One to many translation Many to one translation – p.9 – p.10 Philipp Koehn, University of Edinburgh 9 Philipp Koehn, University of Edinburgh 10 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Decoding Process p Decoding Process p Maria no dio una bofetada a la bruja verde Maria no dio una bofetada a la bruja verde Mary did not slap the Mary did not slap the green ¯ ¯ Many to one translation Reordering – p.11 – p.12 Philipp Koehn, University of Edinburgh 11 Philipp Koehn, University of Edinburgh 12 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Decoding Process p Translation Options p Maria no dio una bofetada a la bruja verde Maria no dio una bofetada a labruja verde Mary not give a slap to the witch green did not a slap by green witch no slap to the did not give to the Mary did not slap the green witch slap the witch ¯ Look up possible phrase translations ¯ Translation finished – many different ways to segment words into phrases – many different ways to translate each phrase – p.13 – p.14 Philipp Koehn, University of Edinburgh 13 Philipp Koehn, University of Edinburgh 14 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Hypothesis Expansion p Hypothesis Expansion p Maria no dio una bofetada a labruja verde Maria no dio una bofetada a labruja verde Mary not give a slap to the witch green Mary not give a slap to the witch green did not a slap by green witch did not a slap by green witch no slap to the no slap to the did not give to did not give to the the slap the witch slap the witch e: e: e: Mary f: --------- f: --------- f: *-------- p: 1 p: 1 p: .534 ¯ Start with empty hypothesis ¯ Pick translation option – e: no English words ¯ Create hypothesis – f: no foreign words covered – e: add English phrase Mary – p: probability 1 – f: first foreign word covered – p: probability 0.534 – p.15 – p.16 Philipp Koehn, University of Edinburgh 15 Philipp Koehn, University of Edinburgh 16 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p A Quick Word on Probabilities p Hypothesis Expansion p Maria no dio una bofetada a labruja verde ¯ Not going into detail here, but... Mary not give a slap to the witch green did not a slap by green witch ¯ Translation Model no slap to the did not give to – phrase translation probability p(MaryjMaria) the slap the witch – reordering costs e: witch f: -------*- – phrase/word count costs p: .182 e: e: Mary – ... f: --------- f: *-------- p: 1 p: .534 ¯ Language Model – uses trigrams: ¯ Add another hypothesis > j < > j < – p(Mary did not) = p(Maryj s ) * p(did Mary, s ) * p(not Mary did) – p.17 – p.18 Philipp Koehn, University of Edinburgh 17 Philipp Koehn, University of Edinburgh 18 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Hypothesis Expansion p Hypothesis Expansion p Maria no dio una bofetada a labruja verde Maria no dio una bofetada a la bruja verde Mary not give a slap to the witch green Mary not give a slap to the witch green did not a slap by green witch did not a slap by green witch no slap to the no slap to the did not give to did not give to the the slap the witch slap the witch e: witch e: ... slap e: witch e: slap f: -------*- f: *-***---- f: -------*- f: *-***---- p: .182 p: .043 p: .182 p: .043 e: e: Mary e: e: Mary e: did not e: slap e: the e:green witch f: --------- f: *-------- f: --------- f: *-------- f: **------- f: *****---- f: *******-- f: ********* p: 1 p: .534 p: 1 p: .534 p: .154 p: .015 p: .004283 p: .000271 ¯ ... until all foreign words covered ¯ Further hypothesis expansion – find best hypothesis that covers all foreign words – backtrack to read off translation – p.19 – p.20 Philipp Koehn, University of Edinburgh 19 Philipp Koehn, University of Edinburgh 20 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Hypothesis Expansion p Explosion of Search Space p Maria no dio una bofetada a labruja verde ¯ Number of hypotheses is exponential with respect to Mary not give a slap to the witch green did not a slap by green witch sentence length no slap to the did not give to the slap the witch µ Decoding is NP-complete [Knight, 1999] e: witch e: slap f: -------*- f: *-***---- µ Need to reduce search space p: .182 p: .043 e: e: Mary e: did not e: slap e: the e:green witch – risk free: hypothesis recombination f: --------- f: *-------- f: **------- f: *****---- f: *******-- f: ********* p: 1 p: .534 p: .154 p: .015 p: .004283 p: .000271 – risky: histogram/threshold pruning ¯ Adding more hypothesis µ Explosion of search space – p.21 – p.22 Philipp Koehn, University of Edinburgh 21 Philipp Koehn, University of Edinburgh 22 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Hypothesis Recombination p Hypothesis Recombination p p=0.092 p=0.092 p=1 p=0.534 p=0.092 p=1 p=0.534 p=0.092 Mary did not give Mary did not give did not did not give give p=0.164 p=0.044 p=0.164 ¯ ¯ Different paths to the same partial translation Different paths to the same partial translation µ Combine paths – drop weaker hypothesis – keep pointer from worse path – p.23 – p.24 Philipp Koehn, University of Edinburgh 23 Philipp Koehn, University of Edinburgh 24 Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Statistical Machine Translation — Lecture 2: Theory and Praxis of Decoding p Hypothesis Recombination p Hypothesis Recombination p p=0.092 p=0.017 p=0.092 did not give did not give Joe Joe p=1 p=0.534 p=0.092 p=1 p=0.534 p=0.092 Mary did not give Mary did not give did not did not give give p=0.164 p=0.164 ¯ ¯ Recombined hypotheses
Details
-
File Typepdf
-
Upload Time-
-
Content LanguagesEnglish
-
Upload UserAnonymous/Not logged-in
-
File Pages9 Page
-
File Size-