Text Mining – Fabio Stella Documents Organization: TOPIC MODELS – LDA DOCUMENTS ORGANIZATION

DOCUMENTS ORGANIZATION
TOPIC MODELS – LATENT DIRICHLET ALLOCATION

Fabio Stella
Associate Professor c/o Department of Informatics, Systems and Communication
University of Milano-Bicocca

Part of the material presented in this lecture is taken from the following tutorial:
David Blei (2012). Tutorial on Topic Models, International Conference on Machine Learning, ICML 2012, http://www.cs.princeton.edu/~blei/papers/icml-2012-tutorial.pdf
Transcription and interpretation errors are the responsibility of the lecturer.

The lecture introduces:
- LATENT DIRICHLET ALLOCATION (LDA)
- TOPIC-WORDS DISTRIBUTION
- DOCUMENT-TOPICS DISTRIBUTION
- CORPUS-TOPICS DISTRIBUTION

1. TOPIC MODELS: LATENT DIRICHLET ALLOCATION

The first and most widely applied topic model is LATENT DIRICHLET ALLOCATION (LDA).
The main assumption underlying LDA is that documents exhibit multiple topics:
- each topic is a distribution over words
- each document is a mixture of corpus-wide topics
- each word is drawn from one of those topics

2. TOPIC MODELS: TOPIC-WORDS DISTRIBUTION

A TOPIC is a PROBABILITY DISTRIBUTION OVER THE WORDS OF THE VOCABULARY $W_{1:N}$ (the vocabulary consists of the N unique words used to represent the document corpus).

The table below shows only a portion of each TOPIC-WORDS DISTRIBUTION $P(W_{1:N} \mid T_k)$: just the 10 most probable words are shown to summarize each topic.

TOPIC 1             TOPIC 2             TOPIC 3             TOPIC 4             TOPIC 5
people     0.0101   growth     0.0093   england    0.0101   people     0.0137   labour     0.0134
games      0.0097   economy    0.0085   game       0.0088   users      0.0093   government 0.0104
music      0.0080   economic   0.0072   wales      0.0060   net        0.0087   election   0.0103
digital    0.0078   sales      0.0071   players    0.0058   software   0.0078   party      0.0101
technology 0.0073   market     0.0068   ireland    0.0055   internet   0.0066   blair      0.0101
game       0.0071   china      0.0064   club       0.0054   mobile     0.0057   people     0.0092
mobile     0.0063   prices     0.0058   win        0.0052   security   0.0055   brown      0.0073
video      0.0059   world      0.0054   six        0.0049   service    0.0055   minister   0.0070
players    0.0045   bank       0.0051   cup        0.0046   technology 0.0054   howard     0.0056
apple      0.0043   rise       0.0049   time       0.0045   phone      0.0054   prime      0.0055

TOPIC 6             TOPIC 7             TOPIC 8             TOPIC 9             TOPIC 10
music      0.0145   company    0.0124   world      0.0109   film       0.0242   law        0.0070
awards     0.0089   firm       0.0099   win        0.0067   films      0.0071   government 0.0069
band       0.0072   deal       0.0079   set        0.0067   star       0.0047   people     0.0061
award      0.0064   shares     0.0061   champion   0.0063   series     0.0046   police     0.0058
album      0.0061   yukos      0.0057   final      0.0058   comedy     0.0044   court      0.0053
won        0.0055   market     0.0047   olympic    0.0057   movie      0.0042   lord       0.0051
song       0.0053   oil        0.0046   won        0.0055   director   0.0041   home       0.0049
film       0.0050   financial  0.0046   roddick    0.0046   actor      0.0041   told       0.0046
british    0.0046   bid        0.0044   time       0.0045   office     0.0040   rights     0.0043
top        0.0045   offer      0.0044   race       0.0044   festival   0.0039   bill       0.0041

TOPIC-WORDS DISTRIBUTION: $P(W_{1:N} \mid \text{TOPIC 1})$, $w \in W_{1:N}$

Consider a single word $w \in W_{1:N}$ of a document. If the word is about TOPIC 1, then the word w will be:
- "PEOPLE" with probability equal to 0.0101
- "GAMES" with probability equal to 0.0097
- "MUSIC" with probability equal to 0.0080

3. TOPIC MODELS: DOCUMENT-TOPICS DISTRIBUTION

Each document d (a news article) of a document corpus D (the BBC news articles corpus) is associated with a DOCUMENT-TOPICS DISTRIBUTION $P(T_{1:K} \mid d)$, $d \in D$.

For the news article below, $P(T_8 \mid d) = 0.68$ and $P(T_{10} \mid d) = 0.32$, that is:

$P(T_{1:10} \mid d) = (0, 0, 0, 0, 0, 0, 0, 0.68, 0, 0.32)$

TOPIC     1     2     3     4     5     6     7     8     9     10
P(T|d)    0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.68  0.00  0.32

NEWS ARTICLE

"Minister digs in over doping row.
The Belgian sports minister at the centre of the Svetlana Kuznetsova doping row says he will not apologise for making allegations against her. Claude Eerdekens claims the US Open champion tested positive for ephedrine at an exhibition event last month. Criticised for making the announcement, he said: "I will never apologise. This product is banned and it's up to her to explain why it's there." Kuznetsova says the stimulant may have been in a cold remedy she took. The Russian said she did nothing wrong by taking the medicine during the event. The Women's Tennis Association cleared Kuznetsova of any offence because the drug is not banned when taken out of competition. Eerdekens said he made the statement in order to protect the other three players that took part in the tournament: Belgian Justine Henin-Hardenne, Nathalie Dechy of France and Russia's Elena Dementieva.

But Dechy is fuming that she has been implicated in the row. "How can you be happy when you see your face on the cover page and talking about doping?" Dechy said. "I'm really upset about it and I think the Belgian government did a really bad job about this. I think we deserve an apology from the guy. You cannot say anything like this - you cannot say some stuff like this, saying it's one of these girls. This is terrible."

Dementieva is also angry and says that Dechy and herself are the real victims of the scandal. "You have no idea what I have been through all these days. It's been too hard on me," she said. "The WTA are trying to handle this problem by saying there are three victims but I see only two victims in this story - me and Nathalie Dechy, who really have nothing to do with this."
"To be honest with you, I don't feel like I want to talk to Sveta at all. I'm just very upset with the way everything has happened.""

4. TOPIC MODELS: CORPUS-TOPICS DISTRIBUTION

A document corpus D (the BBC news articles corpus) is associated with a CORPUS-TOPICS DISTRIBUTION $P(T_{1:K} \mid D)$.

$P(T_{1:10} \mid D) = (0.08, 0.12, 0.14, 0.09, 0.12, 0.09, 0.10, 0.09, 0.08, 0.09)$

For example, $P(T_3 \mid D) = 0.14$.

5. TOPIC MODELS: LATENT DIRICHLET ALLOCATION

We are now ready to provide more insight into learning a LATENT DIRICHLET ALLOCATION topic model.

Given a document corpus D, learning a LATENT DIRICHLET ALLOCATION (LDA) topic model gives the following:
- the TOPIC-WORDS DISTRIBUTIONS $P(W_{1:N} \mid T_{1:K})$, one distribution over the N = 6 283 vocabulary words for each of the K = 10 topics (the ten most probable words of each topic are those listed on the TOPIC-WORDS DISTRIBUTION slide)
- the DOCUMENT-TOPICS DISTRIBUTION $P(T_{1:K} \mid d)$ for each document $d \in D$

6. TOPIC MODELS: LATENT DIRICHLET ALLOCATION

Given a LATENT DIRICHLET ALLOCATION (LDA) topic model, that is $P(W_{1:N} \mid T_{1:K})$ and $P(T_{1:K} \mid d)$ for $d \in D$, a document d consisting of n (non-unique) words is obtained as follows:

FOR i=1:n   % for all words of document d
    Sample the i-th topic $T_k$ from the document-topics distribution $P(T_{1:K} \mid d)$
    Sample a word w from the topic-words distribution $P(W_{1:N} \mid T_k)$
    Add the word w to document d
END

Example of a generated document d with n = 17 words:
sales, service, growth, internet, sales, bank, rise, china, net, users, world, gold, bank, software, phone, oil, growth
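The sampling loop of slide 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the lecturer's code: the names `TOPIC_WORDS`, `DOC_TOPICS` and `generate_document` are hypothetical, and each topic is truncated to the 10 words listed in the lecture's table (the real distributions cover all N = 6 283 vocabulary words), so the listed probabilities are used only as relative weights. The document-topics distribution is the one shown for the news article (0.68 on TOPIC 8, 0.32 on TOPIC 10).

```python
import random

# Top-10 words of TOPIC 8 and TOPIC 10, copied from the lecture's table.
# Assumption: each topic is truncated to these 10 words; random.choices
# treats the values as relative weights, which renormalizes them implicitly.
TOPIC_WORDS = {
    8: {"world": 0.0109, "win": 0.0067, "set": 0.0067, "champion": 0.0063,
        "final": 0.0058, "olympic": 0.0057, "won": 0.0055, "roddick": 0.0046,
        "time": 0.0045, "race": 0.0044},
    10: {"law": 0.0070, "government": 0.0069, "people": 0.0061,
         "police": 0.0058, "court": 0.0053, "lord": 0.0051, "home": 0.0049,
         "told": 0.0046, "rights": 0.0043, "bill": 0.0041},
}

# Document-topics distribution P(T | d) of the news article; topics with
# probability 0 are simply omitted.
DOC_TOPICS = {8: 0.68, 10: 0.32}


def generate_document(doc_topics, topic_words, n, rng):
    """Return n (topic, word) pairs drawn with the slide's sampling loop."""
    topics = list(doc_topics)
    topic_weights = [doc_topics[t] for t in topics]
    document = []
    for _ in range(n):
        # Sample the topic of the i-th word from P(T | d).
        k = rng.choices(topics, weights=topic_weights, k=1)[0]
        # Sample a word from the topic-words distribution P(W | T_k).
        vocabulary = list(topic_words[k])
        word_weights = [topic_words[k][w] for w in vocabulary]
        word = rng.choices(vocabulary, weights=word_weights, k=1)[0]
        # Add the word (with the topic that produced it) to the document.
        document.append((k, word))
    return document


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only to make runs repeatable
    for topic, word in generate_document(DOC_TOPICS, TOPIC_WORDS, 17, rng):
        print(topic, word)
```

Because each position first draws a topic and then a word, the probability that a given position holds word w is the mixture $\sum_{k} P(w \mid T_k)\, P(T_k \mid d)$. Keeping the topic next to each sampled word is a choice made here only to make the two sampling steps visible; the slide's loop stores the word alone.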