Discerning Intelligence from Text (at the UofA)

Denilson Barbosa

Web search is changing

• … from IR-style document retrieval
• … to question-answering over entities extracted from the web

which team did Lou Saban coach last?
which was Lou Saban’s last team?

• The answer is the Chowan Braves, and it can be found in Lou Saban’s Wikipedia page (ranked #1), obituary (ranked #4), and so on…

in good company…

An explicit answer

it is not all bad news…

Structured knowledge (harnessed from the Web)

Surface-level Relation Extraction

Documents: “After his departure from Buffalo, Saban returned to coach teams including Miami, Army, and UCF.”

Split Sentences → Recognize Entities → Resolve Coreferences → Find Relations

Triple store:
<“Lou Saban”, departed from, “Buffalo Bills”>
<“Lou Saban”, coach, “Miami Hurricanes”>
<“Lou Saban”, coach, “Army Black Knights football”>
<“Lou Saban”, coach, “University of Central Florida”>
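To make the pipeline above concrete, here is a minimal sketch (not the actual system) that produces surface triples from a sentence. It assumes spaCy with the en_core_web_sm model and simply pairs neighbouring entity mentions with the words connecting them:

```python
# Minimal sketch of a surface-level triple extractor; illustrative only.
import spacy

nlp = spacy.load("en_core_web_sm")

def extract_triples(text):
    """Yield (entity, connecting words, entity) surface triples per sentence."""
    doc = nlp(text)
    for sent in doc.sents:
        ents = list(sent.ents)
        for left, right in zip(ents, ents[1:]):      # neighbouring entity mentions
            context = doc[left.end:right.start].text.strip()
            if context:                               # keep pairs with connecting words
                yield (left.text, context, right.text)

sentence = ("After his departure from Buffalo, Saban returned to coach "
            "teams including Miami, Army, and UCF.")
for triple in extract_triples(sentence):
    print(triple)
```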

From triples to a KB…

• There is a very, very, very long way…
§ Predicate disambiguation into semantic “relations”…
§ Named entity disambiguation…
§ Assigning entities to classes…
§ Grouping classes into a hierarchy…
§ Ordering facts temporally…

• It would have been virtually impossible without Wikipedia

In this talk…

• Work on entity linking with random walks … [CIKM’2014]

• A bit of the work on open relation extraction – less on disambiguation
§ SONEX (clustering-based) [TWEB’2012]
§ EXEMPLAR (dependency-based) [EMNLP’2013]
§ With Tree Kernels [NAACL’2013]
§ EFFICIENS (cost-constrained)

• A bit of our work on understanding disputes in Wikipedia [Hypertext2012] [ACM TIST’2015]

In this talk…

• Work on entity linking … [CIKM’2014]

Entity Linking

The entity graph

• We perform disambiguation using a graph whose nodes are ids of entities in the KB, each with its respective context (i.e., text!)

The Entity Graph has text about the entities ≠ the KB, which has facts and assertions

[Figure: the example sentence “After his departure from Buffalo, Saban returned to coach college football teams including Miami, Army, and UCF.” with candidate entities per mention: Buffalo → {Buffalo Bills, Buffalo Bulls}; Saban → {Nick Saban, Lou Saban}; Army → {Army Black Knights football, US Army}; UCF → {University of Central Florida, UCF Knights football}]

The entity graph

• Typically, built from Wikipedia

• Nodes are Wikipedia articles
§ All known names
§ Context: whole article
§ Metadata: types, keyphrases, type compatibility…

• Edges: E1 – E2 iff:
§ There is a wikilink from E1 to E2, or
§ There is an article E3 that mentions E1 and E2 close to each other

• Alias dictionary:
§ Mapping from names to ids

Entity linking – main steps

• Candidate Selection: find a small set of good candidates for each mention → using the alias dictionary
• Mention disambiguation: assign each mention to one of its candidates
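A toy sketch of candidate selection with an alias dictionary; the dictionary entries below are made up for illustration, not the real alias dictionary:

```python
# Illustrative alias dictionary: surface name -> candidate entity ids
ALIAS_DICTIONARY = {
    "Saban": ["Nick Saban", "Lou Saban", "Saban Capital Group"],
    "Buffalo": ["Buffalo Bills", "Buffalo Bulls", "Buffalo, New York"],
    "UCF": ["University of Central Florida", "UCF Knights football"],
}

def candidates(mention, k=20):
    """Return a small set of candidate entities for a surface mention."""
    return ALIAS_DICTIONARY.get(mention.strip(), [])[:k]   # keep only top-k candidates

print(candidates("Saban"))   # ['Nick Saban', 'Lou Saban', 'Saban Capital Group']
```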

[Figure: the running example, now with candidates for Miami as well: Miami → {Miami Heat, Miami Hurricanes, Miami Dolphins}]

Candidate Selection

• On the KB: alias-dictionary expansion
§ Saban : {Nick Saban, Lou Saban, Saban Capital Group, …}
• On the document:
§ Lookups: alias dictionary / Wikipedia disambiguation pages
§ Co-reference resolution [Cucerzan’07]
§ Acronym expansion [Zhang et al.’10, Zhang et al.’11] (ABC → Australian Broadcasting Corporation)

Local mention disambiguation—e.g., [Cucerzan’2007]

• Disambiguate each mention in isolation:

ent(m) = argmax_{e ∈ candidates(m)} [ α · prior(m, e) + β · sim(m, e) ]

• prior(m, e): freq(e | m), indegree(e), length(context(e))

• sim(m, e): cosine / Dice / KL(context(m), context(e))
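A minimal sketch of this local scoring, assuming tf-idf/cosine for sim(m, e); the candidate texts, priors, and α/β weights are illustrative, not values from any real system:

```python
# Toy local (per-mention) disambiguation: combine a prior with cosine similarity
# between the mention context and each candidate's context.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def local_disambiguate(mention_context, candidates, alpha=0.5, beta=0.5):
    """candidates: dict of entity id -> (prior, entity context text)."""
    names = list(candidates)
    texts = [mention_context] + [candidates[e][1] for e in names]
    tfidf = TfidfVectorizer().fit_transform(texts)
    sims = cosine_similarity(tfidf[0], tfidf[1:])[0]
    scores = {e: alpha * candidates[e][0] + beta * sims[i] for i, e in enumerate(names)}
    return max(scores, key=scores.get)

mention_context = "After his departure from Buffalo, Saban returned to coach college teams"
candidates = {
    "Nick Saban": (0.55, "Nick Saban is the head football coach at Alabama"),
    "Lou Saban":  (0.30, "Lou Saban coached Buffalo, Miami, Army and UCF"),
}
# the higher prior may pull the answer toward the more popular Nick Saban --
# exactly the failure mode discussed on the next slide
print(local_disambiguate(mention_context, candidates))
```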

Local mention disambiguation

• Problematic assumption: mentions are independent of each other

Saban = Nick Saban ⇒ Miami = Miami Dolphins

Saban = Lou Saban ⇒ Miami = Miami Hurricanes

[Figure: the running example with candidate entities for each mention]

Global mention disambiguation—e.g., [Hoffart’2011]

• Disambiguate all mentions at once:

ent(m) = argmax_{e ∈ candidates(m)} [ α · prior(m, e) + β · sim(m, e) + γ · coherence(G_ent) ]

[Figure: the running example with candidate entities for each mention]

Global mention disambiguation

• Coherence captures the assumption that the input document has a single theme or topic
§ E.g., rock music, or the World Cup final match
§ NP-hard optimization in general

[Figure: the running example with candidate entities for each mention]

Global mention disambiguation

• [Hoffart et al. 2011] – dense sub-graph problem
• Greedy algorithm: remove non-taboo entities until a minimal subgraph with the highest weight is found
§ entity–entity weights: overlap of anchor words, overlap of links, type similarity
§ mention–entity weights: sim(m, e), keyphraseness(m, e)

post-processing

Global mention disambiguation

• REL-RW: Robust entity linking with Random Walks [CIKM’2014]
• Global notion of entity–entity similarity
• Greedy algorithm: iteratively disambiguate mentions, starting with the easiest ones

[Figure: the running example with candidate entities for each mention]

Random Walks as context representation

• Random walks capture indirect relatedness between nodes in the graph

[Figure: the entity graph for the running example — k candidates per mention, n nodes in total]

Random Walks as context representation

Relatedness between entities

Entity Semantic Signature

Document Semantic Signature

• One vector for each entity, and another for the whole document

• Similarity is measured using Zero-KL Divergence
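A sketch of the similarity computation; one common formulation of Zero-KL divergence replaces the undefined log-ratio with a fixed penalty γ wherever the second distribution has a zero (an assumption here — check the paper for the exact variant used):

```python
# Zero-KL divergence between two semantic signatures (lower = more similar;
# the linker would turn this into a similarity score).
import math

def zero_kl(p, q, gamma=20.0):
    """p, q: probability vectors of equal length (lists of floats)."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue                     # 0 * log(0/q) contributes nothing
        total += pi * (math.log(pi / qi) if qi > 0.0 else gamma)
    return total

doc_signature    = [0.5, 0.3, 0.2, 0.0]
entity_signature = [0.4, 0.4, 0.0, 0.2]
print(zero_kl(doc_signature, entity_signature))
```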

Semantic Signatures of Entities

• Restart from the entity with probability α (e.g., 0.15)
§ Until convergence
• Repeat for the candidate mentions only
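A minimal sketch of the random walk with restart (power iteration) used to build an entity’s semantic signature; the tiny adjacency matrix is made up for illustration:

```python
import numpy as np

def semantic_signature(adj, restart_node, alpha=0.15, tol=1e-8, max_iter=1000):
    """adj: (n, n) adjacency matrix; returns the stationary visit distribution."""
    n = adj.shape[0]
    out = adj.sum(axis=0)
    out[out == 0] = 1.0
    P = adj / out                          # column-stochastic transition matrix
    restart = np.zeros(n)
    restart[restart_node] = 1.0            # restart mass concentrated on the entity
    sig = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new = (1 - alpha) * P @ sig + alpha * restart
        if np.abs(new - sig).sum() < tol:  # run until convergence
            break
        sig = new
    return sig

adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
print(semantic_signature(adj, restart_node=0))   # signature of entity 0
```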

[Figure: semantic signatures computed over the entity graph for the candidate entities — Buffalo Bills, Buffalo Bulls, Nick Saban, Lou Saban, Miami Heat, Miami Hurricanes, Miami Dolphins, Army Black Knights football, US Army, University of Central Florida, UCF Knights football, …]

Semantic Signatures of Documents

• (From [Milne & Witten 2008]): if there are unambiguous mentions, use only their entities to find the signature of the document
• Otherwise, use all candidate entities

[Figure: the running example, highlighting the unambiguous mentions used to build the document signature]

Algorithm

• Find candidate entities for each mention
• Compute prior(m,e) and context(m,e)
• Sort mentions by ambiguity (i.e., number of candidates)
• Go through each mention m in ascending order:
  • SSd = semantic signature of the document
  • Assign to m the candidate e with the highest combined score prior(m,e) * context(m,e) + sim(SSe, SSd)
  • Update the set of entities for the document
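A self-contained toy version of this greedy loop; all numbers are made-up stand-ins for prior(m,e) * context(m,e), and coherence() stands in for sim(SSe, SSd):

```python
CANDS = {
    "UCF":   {"UCF Knights football": 0.9, "University of Central Florida": 0.7},
    "Saban": {"Lou Saban": 0.5, "Nick Saban": 0.4, "Saban Capital Group": 0.2},
    "Miami": {"Miami Hurricanes": 0.6, "Miami Dolphins": 0.5, "Miami Heat": 0.3},
}

def coherence(entity, chosen):
    # stand-in for sim(SS_e, SS_d): reward entities "related" to earlier choices
    related = {("Lou Saban", "UCF Knights football"), ("Miami Hurricanes", "Lou Saban")}
    return sum(1.0 for c in chosen if (entity, c) in related or (c, entity) in related)

def link(mentions):
    linked, chosen = {}, set()
    for m in sorted(mentions, key=lambda m: len(CANDS[m])):   # least ambiguous first
        best = max(CANDS[m], key=lambda e: CANDS[m][e] + coherence(e, chosen))
        linked[m] = best
        chosen.add(best)                                      # update document entities
    return linked

print(link(["Saban", "Miami", "UCF"]))
# {'UCF': 'UCF Knights football', 'Saban': 'Lou Saban', 'Miami': 'Miami Hurricanes'}
```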

Algorithm

Mention (ambiguity)   Candidates [PriorProb, CtxSim, SemSim]
UCF (33)              UCF Knights football [0.133, 0.18, 0.50]; University of Central Florida [0.167, 0.13, 0.52]; UCF Knights basketball [0.041, 0.13, 0.34]
Saban (45)            Lou Saban [0.009, 0.28, 0.41]; Nick Saban [0.009, 0.15, 0.54]; Saban Capital Group [0.545, 0.13, 0.20]
Buffalo (317)         Buffalo, New York [0.467, 0.07, 0.54]; Buffalo Bulls football [0.024, 0.11, 0.50]; Buffalo Bills [0.021, 0.09, 0.58]
Miami (343)           Miami [0.632, 0.07, 0.61]; Miami Hurricanes football [0.029, 0.12, 0.58]; Miami Dolphins [0.011, 0.10, 0.56]
Army (402)            Army Black Knights football [0.062, 0.09, 0.52]; Maryland Terrapins football [0.001, 0.07, 0.56]; Army [0.155, 0.04, 0.34]

(Initially no mention is disambiguated, so all candidates are used to compute SSd.)

Algorithm

Ed = {UCF Knights football}: the least ambiguous mention, UCF, is linked to UCF Knights football [0.133, 0.18, 0.50]; the semantic similarities of the remaining candidates are then recomputed against the updated document signature.

Algorithm

Ed = {UCF Knights football, Lou Saban}: Saban is linked to Lou Saban, whose semantic similarity rises (0.41 → 0.51) once UCF Knights football is in the document signature.

Algorithm

Ed = {UCF Knights football, Lou Saban, Buffalo Bills}: Buffalo is linked to Buffalo Bills, whose semantic similarity jumps (0.58 → 0.95) and now beats the much more popular Buffalo, New York.

Algorithm

Ed = {UCF Knights football, Lou Saban, Buffalo Bills, Miami Hurricanes football}: Miami is linked to Miami Hurricanes football (semantic similarity 0.58 → 0.93), not to the far more popular city of Miami.

Algorithm

Final assignment: UCF → UCF Knights football, Saban → Lou Saban, Buffalo → Buffalo Bills, Miami → Miami Hurricanes football, Army → Army Black Knights football (semantic similarity 0.52 → 0.74).

“Paul, John, Ringo, and George”

Mention (ambiguity)   Candidates [PriorProb, CtxSim, SemSim]
Ringo (35)            Ringo Starr [0.266, 0.08, 0.42]; Ringo (album) [0.297, 0.09, 0.30]; Ringo Rama [0.010, 0.14, 0.27]; Johnny Ringo [0.010, 0.18, 0.20]
Paul (379)            Paul the Apostle [0.354, 0.06, 0.42]; Paul McCartney [0.055, 0.06, 0.51]; Paul Field [0.001, 0.12, 0.25]; Paul I of Russia [0.026, 0.04, 0.25]
George (807)          George Scott [0.002, 0.10, 0.21]; George Moore [0.001, 0.10, 0.20]; George Costanza [0.07, 0.06, 0.24]; George Harrison [0.011, 0.03, 0.47]
John (1699)           John Lennon [0.007, 0.05, 0.53]; Gospel of John [0.154, 0.03, 0.33]; John the Apostle [0.038, 0.04, 0.33]; John, King of England [0.066, 0.03, 0.28]

“Paul, John, Ringo, and George”

Ed = {Ringo Starr}: the least ambiguous mention, Ringo, is linked to Ringo Starr [0.266, 0.08, 0.42].

“Paul, John, Ringo, and George”

Ed = {Ringo Starr, Paul McCartney}: Paul is linked to Paul McCartney, whose semantic similarity jumps well past Paul the Apostle once Ringo Starr is in the document signature.

“Paul, John, Ringo, and George”

Ed = {Ringo Starr, Paul McCartney, George Harrison}: George is linked to George Harrison, despite its very low prior, thanks to its semantic similarity to the other Beatles.

“Paul, John, Ringo, and George”

Final assignment: Ringo → Ringo Starr, Paul → Paul McCartney, George → George Harrison, John → John Lennon.

Entity Linking – Evaluation

• Public benchmarks

  Dataset    # of mentions   # of articles
  MSNBC      739             20
  AQUAINT    727             50
  ACE2004    306             57

• Synthetic Wikipedia dataset
§ Generated based on the popularity of entities; e.g., dataset 0.3-0.4 means the accuracy of the prior probability is between 0.3 and 0.4
§ 8 datasets: 0.3-0.4, 0.4-0.5, … 0.9-1.0, 1.0-1.1
§ 40 documents, each with 20-40 entities
• Evaluation Metrics
§ Accuracy
§ Micro F1: average F1 per mention
§ Macro F1: average F1 per document

Entity Linking – Evaluation

• Results on MSNBC, AQUAINT, ACE2004

              MSNBC                    AQUAINT                  ACE2004
  System      Acc.   F1@MI  F1@MA     Acc.   F1@MI  F1@MA     Acc.   F1@MI  F1@MA
  PriorProb   85.98  86.50  87.15     84.87  87.27  87.16     84.82  85.49  87.13
  Prior-Type  81.86  82.81  83.84     83.22  85.57  85.08     80.93  84.04  86.08
  Local       77.43  77.91  72.30     66.44  68.32  68.09     61.48  61.96  56.95
  Cucerzan    87.80  88.34  87.76     76.62  78.67  78.22     78.99  79.30  78.22
  M&W         68.45  78.43  80.37     79.92  85.13  84.84     75.54  81.29  84.25
  Han’11      87.65  88.46  87.93     77.16  79.46  78.80     72.76  73.48  66.80
  AIDA        76.83  78.81  76.26     52.54  56.47  56.46     77.04  80.49  84.13
  GLOW        65.55  75.37  77.33     75.65  83.14  82.97     75.49  81.91  83.18
  RI          88.57  90.22  90.87     85.01  87.72  87.74     82.35  86.60  87.13
  REL-RW      91.62  92.18  92.10     88.45  90.82  90.51     84.43  87.68  89.23

• Prior probability is a strong baseline
• Benchmarks are biased towards popular entities
• Representativeness? (e.g., long tails of the mentions in the Web)

Entity Linking – Evaluation

REL-RW, Han’11: graph-based measure
Cucerzan, AIDA: lexical measure
M&W, RI, GLOW: link-based measure

Entity Linking – Evaluation

• Different configurations
§ The iterative process performs best
§ Unambiguous mentions are more informative than candidates
§ Robust performance with different weighting schemes

Robust Entity Linking with Random Walks

• Intuition: less popular entities are more likely to be better linked than well described (i.e., to have a lot of text)
• Our semantic similarity has a natural interpretation and relies more on the graph than on the document content
• Mention disambiguation without a global notion of coherence
• Use a greedy iterative approach: disambiguate the ``easiest’’ mention, re-compute everything, repeat
• Robust against bad parameter choices
• No learning! Previous state of the art [Hoffart’2011, Milne&Witten’2008] learns similarity weights
• Future work: improving the first step of the algorithm
§ Maybe exploring a few alternatives

Shallow (SONEX): clustering-based [ACM TWEB’12]

Not-so-shallow (dependency parsing):
Reified networks / nested relations [ICWSM’11]
Rule-based (EXEMPLAR) [EMNLP’13]
Tree kernels on dependency parses [NAACL’13]

Improving ORE with text normalization [LREC’14]

Cost-constrained ORE (EFFICIENS)

OPEN RELATION EXTRACTION

ORE “one sentence at a time”

Frequency   Pattern              Example
38%         E1 Verb E2           X established Y
23%         E1 NP Prep E2        X settlement with Y
16%         E1 Verb Prep E2      X moved to Y
9%          E1 Verb to Verb E2   X plans to acquire Y

• Relation extraction as a sequence prediction task
§ TextRunner [Banko and Etzioni, 2008]: a small list of part-of-speech tag sequences that account for a large number of relations in a large corpus
§ ReVerb [Fader et al., 2011]: uses an even shorter list of patterns
• The ReVerb/TextRunner tools have extracted over one billion facts from the Web
• Efficient: no need to store the whole corpus
• Brittle: multiple synonyms of the same relation are extracted
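A sketch of this style of shallow extraction, using spaCy POS tags and a few regexes that roughly mirror the patterns in the table above (illustrative only; the real systems use more refined patterns and constraints):

```python
import re
import spacy

nlp = spacy.load("en_core_web_sm")

# POS-tag patterns (regexes over coarse tags) roughly matching the table above
PATTERNS = [
    re.compile(r"^VERB$"),              # E1 Verb E2
    re.compile(r"^NOUN ADP$"),          # E1 NP Prep E2
    re.compile(r"^VERB ADP$"),          # E1 Verb Prep E2
    re.compile(r"^VERB PART VERB$"),    # E1 Verb to Verb E2
]

def shallow_extract(text):
    doc = nlp(text)
    for sent in doc.sents:
        ents = list(sent.ents)
        for e1, e2 in zip(ents, ents[1:]):
            between = doc[e1.end:e2.start]
            tags = " ".join(tok.pos_ for tok in between if not tok.is_punct)
            if any(p.match(tags) for p in PATTERNS):
                yield (e1.text, between.text, e2.text)

for triple in shallow_extract("Google plans to acquire YouTube."):
    print(triple)
```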

ORE on “all sentences” at once

[Figure: multiple sentences connecting the same kinds of entity pairs, e.g., “Jim Larranaga is currently the head basketball coach at George Mason University …” and “Greg Francis … new Golden Bears basketball head coach at the University of Alberta … to replace retiring legend Don Horwood”]

• [Hasegawa et al., 2004] use hierarchical agglomerative clustering of all triples (E1, C, E2) in the corpus
§ E1, C, E2, where the context C derives from all sentences connecting the entities
§ The clustering is done on the context vectors (not the entities)
• All triples (and thus, entity pairs) in the same cluster belong to the same relation
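A minimal sketch of the idea: build one context vector per entity pair and cluster the vectors hierarchically; the pairs, contexts, and the distance threshold are made up for illustration:

```python
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.feature_extraction.text import TfidfVectorizer

pairs = [("Jim Larranaga", "George Mason University"),
         ("Greg Francis", "University of Alberta"),
         ("Google", "YouTube")]
contexts = ["is the head basketball coach at",
            "new basketball head coach at",
            "acquisition of"]

X = TfidfVectorizer().fit_transform(contexts).toarray()
Z = linkage(X, method="average", metric="cosine")   # HAC over context vectors
labels = fcluster(Z, t=0.5, criterion="distance")   # cut the dendrogram at a threshold

for pair, label in zip(pairs, labels):
    print(label, pair)   # pairs with the same label are taken to share a relation
```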

SONEX

• Offline (HAC clustering) – ACM TWEB 2012

• Online (buckshot): cluster a sample and classify the remaining sentences, one at a time
§ No discernible loss in accuracy, but much higher scalability
§ Also allows the same entity pair to belong to multiple relations

SONEX: Clustering features

• Clustering features derived from the context words between entities
§ Unigrams: stemmed words, excluding stop words [Allan, 1998]
§ Bigrams: sequences of two (unigram) words (e.g., Vice President)
§ Part-of-speech patterns: a small number of relation-independent linguistic patterns from TextRunner [Banko and Etzioni, 2008]

• Weights: term frequency (tf), inverse document frequency (idf), and domain frequency (df) [SIGIR’10]

df_i(t) = f_i(t) / Σ_{1 ≤ j ≤ n} f_j(t)

SONEX: clustering threshold

• Used ~400 unique entity pairs from Freebase, manually annotated with true relations
• Picked the clustering threshold and features with the best fit

SONEX: importance of domain

• DF works really well except when MISC types are involved
• Example: coach
§ LOC–PER domain: (England, Fabio Capello); (Croatia, Slaven Bilic)
§ MISC–PER domain: (Titans, Jeff Fisher); (Jets, Eric Mangini)

• Overall, df alone improved the f-measure by 12%

SONEX: From Clusters to Relations

• Clusters are sets of entity pairs with similar contexts
• We find relation names by looking for prominent terms in the context vectors
§ Most frequent feature
§ Centroid of cluster

  Relation             Cluster
  Campaign Chairman    McCain : Rick Davis; Obama : David Plouffe
  Strongman President  Zimbabwean : Robert Mugabe; Venezuelan : Hugo Chavez
  Chief Architect      Kia Behnia : BMC Software; Brendan Eich : Mozilla
  Military Dictator    Pakistan : Pervez Musharraf; Zimbabwe : Robert Mugabe
  Coach                Tennessee : Rick Neuheisel; Syracuse : Jim Boeheim

SONEX: from clusters to relations

• Evaluate relations by computing the agreement between the Freebase term and the chosen label
§ Scale: 1 (no agreement) to 5 (full agreement)

SONEX vs. ReVerb: qualitative comparison

Barack Obama married Michelle Obama
Barack Obama’s spouse is Michelle Obama

<“Barack Obama”, married, “Michelle Obama”>
<“Barack Obama”, spouseOf, “Michelle Obama”>

• Local methods like ReVerb do not understand synonyms, and return all variants of the same relation for the same pair
• Global methods like SONEX produce a single relation for each entity pair, derived from all sentences connecting the pair
• Trade-off: local methods are best when there are multiple relations, whereas global methods aim at the most representative relation
§ The biggest problem: we can’t tell which is the case just by looking at the corpus

SONEX vs ReVerb—clustering analysis

  System   Purity   Inv. Purity
  ReVerb   0.97     0.22
  SONEX    0.96     0.77

• Purity: homogeneity of clusters
§ Fraction of instances that belong together
• Inv. purity: specificity of clusters
§ Maximal intersection with the relations
§ Also known as overall f-score [Larsen and Aone, 1999]
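A small sketch of how these two measures can be computed from cluster assignments and gold relation labels (toy assignments, not SONEX/ReVerb output):

```python
from collections import Counter

def purity(clusters, gold):
    """clusters, gold: dicts mapping instance id -> cluster id / gold label."""
    by_cluster = {}
    for inst, c in clusters.items():
        by_cluster.setdefault(c, []).append(gold[inst])
    # fraction of instances agreeing with the majority label of their cluster
    agree = sum(Counter(labels).most_common(1)[0][1] for labels in by_cluster.values())
    return agree / len(clusters)

def inverse_purity(clusters, gold):
    # same computation in the other direction: how well each gold relation is
    # concentrated in a single cluster
    return purity(gold, clusters)

clusters = {1: "A", 2: "A", 3: "B", 4: "B"}
gold     = {1: "coach", 2: "coach", 3: "coach", 4: "spouse"}
print(purity(clusters, gold), inverse_purity(clusters, gold))
```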

Deep vs shallow NLP in ORE

• Adding NLP machinery increases cost, but brings better results
• What is the right trade-off?

EXEMPLAR [EMNLP’13]

• Rule-based method to identify the precise connection between the argument and the predicate via dependency parsing

Open Source! https://github.com/U-Alberta/exemplar/

EXEMPLAR

• Standard NLP pipeline to break the document into sentences and find named entities (or just noun phrases)
• Find triggers (nouns or verbs not tagged as part of an entity mention)
• Find candidate arguments (dependencies of triggers)

EXEMPLAR

• Apply rules to assign roles to the candidate arguments
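A rough sketch in the spirit of these steps, with a couple of illustrative dependency rules (not EXEMPLAR’s actual rule set), assuming spaCy:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def extract(sentence):
    doc = nlp(sentence)
    ent_tokens = {t.i for e in doc.ents for t in e}
    # triggers: verbs/nouns not tagged as part of an entity mention
    triggers = [t for t in doc if t.pos_ in ("VERB", "NOUN") and t.i not in ent_tokens]
    for trig in triggers:
        subj = obj = None
        for child in trig.children:                     # candidate arguments
            if child.dep_ in ("nsubj", "nsubjpass") and child.ent_type_:
                subj = child
            elif child.dep_ in ("dobj", "obj") and child.ent_type_:
                obj = child
            elif child.dep_ == "prep":                   # argument via a preposition
                for gc in child.children:
                    if gc.dep_ == "pobj" and gc.ent_type_:
                        obj = gc
        if subj is not None and obj is not None:
            yield (subj.text, trig.lemma_, obj.text)

for triple in extract("Google acquired YouTube in 2006."):
    print(triple)   # e.g. ('Google', 'acquire', 'YouTube')
```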

EXEMPLAR – Evaluation

• Binary mode

• Each sentence is annotated with:
§ Two named entities
§ A trigger
§ A window of allowed tokens

I’ve got a media call about [ORG Google] ->’s {acquisition} of<- [ORG YouTube] ->today<-

• An extraction is deemed correct if it contains the trigger and allowed tokens only

EXEMPLAR – Evaluation

[Plot: f-measure vs. seconds per sentence (log scale, 0.01–10 s) for EXEMPLAR[M], EXEMPLAR[S], SONEX, OLLIE, LUND, PATTY, REVERB, and SWIRL]

EXEMPLAR – Further evaluations

Text Normalization

• ORE systems, even with dependency parsing, make trivial mistakes as soon as sentences become slightly more involved

Mrs. Clinton, who won the Senate race in New York in 2000, received $91,000 from the Kushner family and partnerships, while Mr. Gore received $66,000 for his presidential campaign that year

• Problem: an entity (Mrs. Clinton) is too far from the trigger (received), so EXEMPLAR (or ReVerb, SONEX) will miss the extraction

• Solutions:
§ Modify the relation extraction system → BAD: fixing one issue often breaks other kinds of extraction
§ Modify the input text → GOOD

Text Normalization [LREC’14]

• Focused on relative clauses and participle phrases
• Method:
§ Apply chunking [Jurafsky and Martin ’08] to break sentences
§ Classify all pairs of chunks as:
• Connected: part of the same clause, and should be merged (usually to correct mistakes in chunking)
• Dependent: in different clauses, but one depends on the other
  Mrs. Clinton | won the Senate race in New York in 2000
  Mrs. Clinton + received $91,000 from the Kushner family and partnerships
• Disconnected: clauses that are ``left alone’’
  Mr. Gore received $66,000 for his presidential campaign that year

§ Apply the original ORE system

Text Normalization

• The ideal case is when we have a dependency parse of the sentence
§ But this adds cost

• Used a Naïve Bayes classifier that looks at the parts of speech of the 2 words at each “end” of the chunks, and decides the relationship between them
§ Training data: 37,015 parse trees of the Wall Street Journal section of OntoNotes
§ Accuracy (10-fold cross validation):
• 77% overall
• 85% for disconnected
• 75% for connected
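A toy sketch of this classifier: a Naïve Bayes model over the POS tags at the boundary of a chunk pair. The feature names and training rows below are made up; the real model was trained on the OntoNotes parse trees mentioned above:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# each example: POS tags of the boundary words -> connected/dependent/disconnected
train = [
    ({"left_end": "NNP", "right_start": "VBD"}, "connected"),
    ({"left_end": "NNP", "right_start": "WP"},  "dependent"),
    ({"left_end": ".",   "right_start": "NNP"}, "disconnected"),
]
X, y = zip(*train)

clf = make_pipeline(DictVectorizer(), MultinomialNB())
clf.fit(list(X), list(y))

print(clf.predict([{"left_end": "NNP", "right_start": "VBD"}]))
```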

Text Normalization Improves ORE

[Plot: f-measure vs. seconds per sentence after normalization — EXEMPLAR[S]+DEP-SR, EXEMPLAR[S]+NB-SR, EXEMPLAR[M], EXEMPLAR[S], SONEX, OLLIE, ReVerb+DEP-SR, ReVerb+NB-SR, LUND, PATTY, SWIRL, ReVerb]

EFFICIENS

• Towards budget-conscious ORE
§ There are many sentences where shallow methods are just fine, so there is no need to fire up more expensive machinery

• Goal: apply each method based on need:
§ First apply ORE based on POS tagging
§ Predict if more is needed; if so, apply ORE based on dependencies
§ If still more is needed, apply ORE using semantic role labeling
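A sketch of the intended cascade; the three extractors and the “need more?” predictor below are placeholders, not the actual EFFICIENS components:

```python
def pos_extract(sentence):
    """Cheap extractor based on POS patterns (placeholder)."""
    return []

def dep_extract(sentence):
    """More expensive extractor based on dependency parsing (placeholder)."""
    return []

def srl_extract(sentence):
    """Most expensive extractor, based on semantic role labeling (placeholder)."""
    return []

def needs_deeper(sentence, triples):
    """Placeholder predictor: escalate when a long sentence yielded nothing."""
    return not triples and len(sentence.split()) > 15

def cascade_extract(sentence):
    triples = pos_extract(sentence)
    if needs_deeper(sentence, triples):
        triples = dep_extract(sentence)
        if needs_deeper(sentence, triples):
            triples = srl_extract(sentence)
    return triples
```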

CONFLICTS IN WIKIPEDIA

Controversy

controversy |ˈkäntrəˌvərsē| noun (pl. controversies) disagreement, typically when prolonged, public, and heated

• Controversy leads to confusion for the reader and propagates misinformation, bias, prejudice, …

• Example: the article about abortion was the stage for a discussion around breast cancer

4K words on the initial revision of the article alone!

Quality control delegated to the crowd

• If a topic is important to a large enough group of editors, the collaborative editing process will (eventually) lead to a high-quality article

• Editors can tag whole articles or sections as controversial
§ Controversial tags are tags such as {controversial}, {dispute}, {disputed-section}
§ Less than 1% of the articles are tagged as controversial

• Readers can only determine whether an article is controversial based on the tags, inspection of the talk page, or the edit history

• Our Goal: automate the quality control process

Wikipedia takes care of the issue

• Sections of the controversial articles spawn new articles
• Example: Holiest sites in Islam

Controversy in Wikipedia is different

• Wikipedia’s neutral point of view fools sentiment analysis tools

The edit history

• The edit history contains the log of all actions performed in the making of the article
• The time-stamp of each commit
• The action performed in each commit
• An optional comment explaining the intent of each commit

Why is this a hard problem?

• Controversy has to do with the content of the article but also the social process leading to the article
• Stats don’t help
§ Controversial articles tend to be a little longer and have fewer editors, but the overlap in the distributions is too high!
§ Simple scores or classifiers are not likely to work
• NLP is hard
§ Revisions with millions of individual NPOV sentences
§ The talk page is not linked to specific revisions
§ Textual entailment?

Finding hot spots

• Mutually assign trust and credibility [Adler et al. 2008]

[Diagram: editors make edits; the longevity of an edit increases the editor’s credibility, and credibility predicts the likelihood that future edits are accepted]

Bipolarity

• Controversial articles tend to look like bipartite graphs more than non-controversial ones [Brandes et al. 2009]

• However… [WWW2011]

Discussions in Wikipedia: exploiting the edit history

• [Kittur et al., 2007] train a classifier to predict the number of controversial tags in an article
§ Features are based on metrics such as the number of authors, the number of versions, the number of anonymous edits, etc.
• [Vuong et al., 2008] build a model to assign a controversy score to articles, assuming a mutually reinforcing relationship between the controversy scores of articles and their editors
• [Druck et al., 2008] extract features from editor collaboration to establish a notion of editor trust
• Extracting affinity/polarity networks: vertices are editors, edges indicate agreement/disagreement among them
§ [Maniu et al., 2011] [Leskovec et al., 2010]
• Bipolarity [Brandes et al. 2009]: computing how much a network approximates a bipartite graph

Discussions in Wikipedia [Hypertext’12] [TIST’15]

• Main idea: model the social aspects of the editing of the articles

Finding the Attitudes of editors

• Look at all interactions between a pair of editors, and predict whether one would vote for the other
§ Use Wikipedia admin election data for training and testing
§ Consider all articles they worked on together within 34 revisions of each other

• 87% Accuracy
§ Tested on Wikipedia admin election data (~100K votes)
§ High bias toward positive votes
§ Negative impressions persist for a long time

Controversy in Wikipedia

• Once we know how the editors would vote for each other, we look for signs of ``fights’’ within the article’s history
• We classify each collaboration network

• Editing stats + light NLP features
§ Various stats capturing the editing behavior
§ Use of sentiment words (adapted to the domain)

• Social networking features
§ Several stats about the shape of the graph
§ Number and kind of triads
• “A friend of my friend is my friend”
• “An enemy of my friend is my enemy”

Controversy in Wikipedia

• Structure classifier on the collaboration networks
§ Logistic regression / WEKA
• Ground truth: 480 articles
§ 240 tagged by editors as controversial
§ 240 from other categories, that have never been tagged as controversial in their history
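A toy sketch tying these pieces together: count signed triads in an editor collaboration graph and feed them, with other stats, to a logistic-regression classifier. The graph, labels, and feature rows are made up for illustration:

```python
from itertools import combinations
import networkx as nx
from sklearn.linear_model import LogisticRegression

def triad_counts(g):
    """g: undirected graph with edge attribute 'sign' in {+1, -1}."""
    counts = {(1, 1, 1): 0, (1, 1, -1): 0, (1, -1, -1): 0, (-1, -1, -1): 0}
    for a, b, c in combinations(g.nodes, 3):
        if g.has_edge(a, b) and g.has_edge(b, c) and g.has_edge(a, c):
            signs = tuple(sorted((g[a][b]["sign"], g[b][c]["sign"], g[a][c]["sign"]),
                                 reverse=True))
            counts[signs] += 1
    return counts

g = nx.Graph()
g.add_edge("ed1", "ed2", sign=+1)
g.add_edge("ed2", "ed3", sign=-1)
g.add_edge("ed1", "ed3", sign=-1)
print(triad_counts(g))   # one (+, -, -) triad

# features per article: [# editors, # revisions, four triad counts]
X = [[12, 300, 5, 2, 9, 1], [4, 40, 3, 0, 0, 0]]
y = [1, 0]               # 1 = tagged controversial
clf = LogisticRegression().fit(X, y)
```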

Controversy in Wikipedia: feature ablation

[Chart: feature ablation; the highlighted features are the number of triads and the “an enemy of my enemy is my friend” triad]

Controversy in Wikipedia – Robustness

Finer-grained notions of controversy

• Often, the source of controversy in an article comes from a single section or even a sentence
§ In the article about abortion, the section which connects it to breast cancer is the main dispute

• Finding the text units that contribute the most to the controversy
§ A typical NP-hard optimization problem

§ Good news: if the controversy function is sub-modular and monotonic, there is a nice approximation algorithm

§ Bad news: the best classifiers can’t be made into monotonic and sub-modular functions
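For reference, a minimal sketch of that approximation algorithm: the standard greedy procedure for maximizing a monotone sub-modular objective under a size budget (where the (1 − 1/e) guarantee applies). The controversy score here is a toy coverage function, not an actual model:

```python
def controversy(selected_units):
    # toy monotone sub-modular score: number of distinct words covered
    return len(set(w for u in selected_units for w in u.split()))

def greedy_select(units, k):
    selected = []
    for _ in range(k):
        best = max((u for u in units if u not in selected),
                   key=lambda u: controversy(selected + [u]) - controversy(selected),
                   default=None)
        if best is None:
            break
        selected.append(best)            # add the unit with the largest marginal gain
    return selected

sections = ["abortion breast cancer link", "history of the procedure", "legal status"]
print(greedy_select(sections, k=1))
```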

Controversy in Wikipedia – NEXT

• Summary:
§ Detecting controversy in Wikipedia is challenging
§ Statistical features alone are not enough
§ Social networking theories help a lot in finding disagreement

• Use more NLP machinery
§ Link the discussion in the article history to the discussion/talk page
• Better understanding of the Wikipedia dynamics
• Impact on ORE tools
§ How hard is it to find contradictory information in subsequent versions?

Thank you

• Robust entity linking with Random Walks [CIKM’2014] — Zhaochen Guo, PhD (2015)

• Relation extraction [TWEB’2012], EXEMPLAR (get the code!)
Yuval Merhav (PhD 2012, Illinois Inst. of Technology), Jordan Schmidek (MSc 2014), Filipe Mesquita (PhD 2014)

• Understanding Conflict in Wikipedia
Aibek Makhazanov (MSc 2013), Hoda Sepehri-Rad (PhD 2014)