Semantic Network Analysis

Semantic Network Analysis: Techniques for Extracting, Representing, and Querying Media Content

Wouter van Atteveldt

Reading committee:
prof.dr. Enrico Motta
dr. Gertjan van Noord
prof.dr. Guus Schreiber
prof.dr. Klaus Schönbach
prof.dr. Philip A. Schrodt

© Wouter van Atteveldt 2008

You are allowed to copy and distribute this book in whole or in part and to make a derived work under the terms of the Creative Commons Attribution-Noncommercial 3.0 Netherlands License (http://creativecommons.org/licenses/by-nc/3.0/nl/).

An electronic version of this book is available from http://vanatteveldt.com/dissertation. This book can be purchased from http://www.amazon.com/.

SIKS Dissertation Series No. 2008-30. The research reported in this thesis has been carried out under the auspices of SIKS, the Dutch Research School for Information and Knowledge Systems.

Published by BookSurge Publishers, Charleston SC
ISBN: 1-4392-1136-1

VRIJE UNIVERSITEIT

ACADEMIC DISSERTATION to obtain the degree of Doctor at the Vrije Universiteit Amsterdam, by authority of the rector magnificus prof.dr. L.M. Bouter, to be defended in public before the doctoral committee of the Faculty of Sciences on Friday 14 November 2008 at 13:45 in the auditorium of the university, De Boelelaan 1105, by Wouter Hendrik van Atteveldt, born in Texel.

Supervisors: prof.dr. F.A.H. van Harmelen, prof.dr. J. Kleinnijenhuis
Co-supervisor: dr. K.S. Schlobach

Preface

Undoubtedly, this thesis contains many inaccuracies and omissions. One of the most glaring is the single author name on the cover: although I doubt any Ph.D. thesis is really the sole work of its defender, an interdisciplinary work such as this thesis is especially the product of many hours of talking and collaborating.
I started this thesis with little knowledge of either Communication Science or Knowledge Representation, and I am very grateful for the patient explanations and pointers from my supervisors and other colleagues.

I think I’ve been very lucky with my supervisors: I will not quickly forget the 5 AM e-mails from Jan when I was in Turkey working on one of our papers last year; or the hours spent in front of the whiteboard with Stefan while working on modal logic; or the innocent-sounding questions Frank always asked to keep me on track both substantively and in terms of planning. Between the three of them, I think I’ve received incredible support both on the content of my work and on the procedure and planning needed to get me here. I especially appreciate the way it was always stressed that, after everybody had their say, it was my thesis, and my responsibility to decide what I wanted to study, how I wanted to write it down, and when I wanted to finish it. This gave me the confidence to write down and defend this thesis with my name on it, even though the work is neither finished, nor perfect, nor solely mine.

According to Frank, the only thing worse than having no desk is having two desks, but I am very happy that I was a member of both the Communication Science and Knowledge Representation groups. The KR group (both on the AI and BI side of the invisible line) was and is a dynamic group with a lot of room for discussion and learning. Apart from my supervisors, I especially appreciate the long talks, about work or otherwise, with Mark, Laura, Michel, and Willem. On the other side of the tram line, I was very lucky to start my PhD just after Rens and Lonneke started theirs, and I fondly remember the hours we spent looking at data and models together. Although Dirk wisely turned down my request to become co-promotor, he was always there to talk about work but especially about non-work.
He taught me to think about why I do the things I do, and to concentrate on doing the things that really matter. Anita showed a surprising side to her character when she was stuck in Amsterdam during a storm and we drank champagne for her birthday and played Catan until 2 AM. I am also very happy we stole back Janet from the UvA for the Contested Democracy project; we can always use more Klaverjas players to join us at the ICA conferences. I really look forward to continuing my collaboration and friendship with all these colleagues.

I would also like to use the opportunity to thank my professors from the University College Utrecht and University of Edinburgh, especially Maarten Prak, who showed me how university education should be done, Mark Steedman and Graeme Ritchie, who got me hooked on natural languages, and of course Miles Osborne, Claire Grover, Bonnie Webber and all the others.

The ACLU lawyer Clarence Darrow once said that the first half of our life is ruined by our parents, and the second half by our children. In contrast, I feel that my childhood was very pleasant and helped me to value thinking, knowledge, and discussion. I have always felt that my parents supported me in whatever I did, and I believe that I am very lucky to have had such a wonderful family. Bas, I sometimes miss the early days of 2AT, exploring the game called running a company, and I am glad that things are going so well with the company. I am also thrilled that we finally got the catamaran working, and I hope we will have a very windy summer next year. Nienke, I am looking forward to our first publication together, using text boxes and arrows between them to explain the human condition, and I look forward to more roof terrace parties if you still find Amsterdam liveable after your stay in New York.

If you compare the above list with my list of co-authors, one name is conspicuously lacking.
I would probably end up on the couch if I listed Nel among my colleagues and collaborators, even though we did spend a lot of time working, discussing, and writing together. However, that is completely insignificant compared to her contribution to my real life. Since meeting her I’ve learned invaluable lessons on people, emotions, and insecurity, and I feel that I’ve become a much better person over the last four years, or at least a better dressed person.

Contents

1 Introduction
  1.1 Introduction
  1.2 Research Question
  1.3 Domain and Data
  1.4 Contributions
  1.5 Thesis Outline

I Background

2 Content Analysis
  2.1 Introduction
  2.2 Content Analysis in Communication Science
  2.3 Semantic Network Analysis
  2.4 The NET method
  2.5 The 2006 Dutch parliamentary elections
  2.6 Computer Content Analysis
  2.7 Conclusion

3 Natural Language Processing
  3.1 Introduction
  3.2 The Preprocessing Pipeline
  3.3 Thesauri
  3.4 Evaluation metrics
  3.5 Conclusion

4 Knowledge Representation and the Semantic Web
  4.1 Introduction
  4.2 The Semantic Web
  4.3 The Semantic Web as a Knowledge Representation framework
  4.4 Conclusion

II Extracting Semantic Networks

5 Extracting Associative Frames using Co-occurrence
  5.1 Introduction
  5.2 Frames as Associations
  5.3 A Probabilistic Model of Associative Framing
  5.4 Use Case: Terrorism in the News
  5.5 Conclusion

6 Using Syntax to find Semantic Source, Subject, Object
  6.1 Introduction
  6.2 Determining Semantic Roles using Syntax Patterns
  6.3 Determining Validity
  6.4 Results
  6.5 Error Components Analysis
  6.6 Discussion / Conclusion

7 Determining valence using Sentiment Analysis
  7.1 Introduction
  7.2 Polarity in Political Communication
  7.3 Task: Classifying NET relations
  7.4 Sentiment Analysis
  7.5 Method
  7.6 Results
  7.7 Validation
  7.8 Conclusion

III Reasoning with Media Data

8 Using RDF to store Semantic Network Data
  8.1 Introduction
  8.2 Representing Media Data: Statements about Statements
  8.3 Representing Political Background Knowledge
  8.4 A political ontology
  8.5 Using OWL for a richer ontology
  8.6 Conclusions

9 Querying, Analysing, and Visualising Semantic Network Data
  9.1 Introduction
  9.2 Querying the Semantic Network
  9.3 Searching the News
  9.4 Using the system: Parties in the news
  9.5 Conclusion

IV System Description

10 The AmCAT Infrastructure
  10.1 Introduction
  10.2 The AmCAT Navigator and Database
  10.3 The iNet Coding Program
  10.4 Conclusion

11 Discussion and Conclusion

Bibliography

Samenvatting (Dutch Summary)

CHAPTER 1
Introduction

‘No-campaign Wilders is a circus
Recommended publications
  • Appendix 2 Dissertation Steven De Waal
    The Value(s) of Civil Leaders: A Study into the Influence of Governance Context on Public Value Orientation. Appendix 2: Portraits of Civil Leaders (13 leaders). Dissertation, University Utrecht 2014. Steven P.M. de Waal. 1. Paul Baan 2. Hans Becker 3. Leon Bobbe 4. Piet Boekhoud (& Els Lubbers) 5. Yolanda Eijgenstein 6. Hans Nieukerke 7. Camille Oostwegel 8. Tom Rodrigues 9. Arie Schagen (& Esseline Schieven) 10. Clara and Sjaak Sies 11. Hans Visser 12. Mei Li Vos 13. Sister Giuseppa Witlox. Paul Baan. A. Introduction. Who is Paul Baan? Paul Baan was born in 1951. After finishing his bachelor of engineering, he started his career in the construction industry and later finished his master in Economics at the University of Groningen. In 1981, he joined his brother Jan at the Baan Company, a highly successful software company, as president and vice-chairman. Jan and Paul Baan were successful and became very wealthy when the company was floated. Paul Baan left the company in 1996, a year after it went public and before it got into financial difficulties. His brother did the same sometime later. Through the Vanenburg Group, a venture capital company investing in IT companies, also founded by the Baan brothers, Jan and Paul Baan kept a stake in the Baan Company until the company was sold in 2000. According to Paul Baan, his passion for business and innovation stems from his time with Baan Group. In 2000, Baan started the Stichting Noaber Foundation (henceforth: Noaber Foundation). A ‘noaber’ (etymologically linked to the English ‘neighbor’) is a word in an eastern Dutch dialect denoting a fellow supportive citizen.
  • Probabilistic Topic Modelling with Semantic Graph
    Probabilistic Topic Modelling with Semantic Graph. Long Chen, Joemon M. Jose, Haitao Yu, Fajie Yuan, and Huaizhi Zhang. School of Computing Science, University of Glasgow, Sir Alwyns Building, Glasgow, UK. Abstract. In this paper we propose a novel framework, topic model with semantic graph (TMSG), which couples topic model with the rich knowledge from DBpedia. To begin with, we extract the disambiguated entities from the document collection using a document entity linking system, i.e., DBpedia Spotlight, from which two types of entity graphs are created from DBpedia to capture local and global contextual knowledge, respectively. Given the semantic graph representation of the documents, we propagate the inherent topic-document distribution with the disambiguated entities of the semantic graphs. Experiments conducted on two real-world datasets show that TMSG can significantly outperform the state-of-the-art techniques, namely, author-topic Model (ATM) and topic model with biased propagation (TMBP). Keywords: Topic model · Semantic graph · DBpedia. 1 Introduction. Topic models, such as Probabilistic Latent Semantic Analysis (PLSA) [7] and Latent Dirichlet Allocation (LDA) [2], have been remarkably successful in analyzing textual content. Specifically, each document in a document collection is represented as random mixtures over latent topics, where each topic is characterized by a distribution over words. Such a paradigm is widely applied in various areas of text mining. In view of the fact that the information used by these models is limited to the document collection itself, some recent progress has been made on incorporating external resources, such as time [8], geographic location [12], and authorship [15], into topic models.
  • Semantic Memory: a Review of Methods, Models, and Current Challenges
    Psychonomic Bulletin & Review. https://doi.org/10.3758/s13423-020-01792-x. Semantic memory: A review of methods, models, and current challenges. Abhilasha A. Kumar. © The Psychonomic Society, Inc. 2020. Abstract: Adult semantic memory has been traditionally conceptualized as a relatively static memory system that consists of knowledge about the world, concepts, and symbols. Considerable work in the past few decades has challenged this static view of semantic memory, and instead proposed a more fluid and flexible system that is sensitive to context, task demands, and perceptual and sensorimotor information from the environment. This paper (1) reviews traditional and modern computational models of semantic memory, within the umbrella of network (free association-based), feature (property generation norms-based), and distributional semantic (natural language corpora-based) models, (2) discusses the contribution of these models to important debates in the literature regarding knowledge representation (localist vs. distributed representations) and learning (error-free/Hebbian learning vs. error-driven/predictive learning), and (3) evaluates how modern computational models (neural network, retrieval-based, and topic models) are revisiting the traditional “static” conceptualization of semantic memory and tackling important challenges in semantic modeling such as addressing temporal, contextual, and attentional influences, as well as incorporating grounding and compositionality into semantic representations. The review also identifies new challenges
  • Knowledge Graphs on the Web – an Overview
    January 2020. Knowledge Graphs on the Web – an Overview. Nicolas HEIST, Sven HERTLING, Daniel RINGLER, and Heiko PAULHEIM. Data and Web Science Group, University of Mannheim, Germany. arXiv:2003.00719v3 [cs.AI] 12 Mar 2020. Abstract. Knowledge Graphs are an emerging form of knowledge representation. While Google coined the term Knowledge Graph first and promoted it as a means to improve their search results, they are used in many applications today. In a knowledge graph, entities in the real world and/or a business domain (e.g., people, places, or events) are represented as nodes, which are connected by edges representing the relations between those entities. While companies such as Google, Microsoft, and Facebook have their own, non-public knowledge graphs, there is also a larger body of publicly available knowledge graphs, such as DBpedia or Wikidata. In this chapter, we provide an overview and comparison of those publicly available knowledge graphs, and give insights into their contents, size, coverage, and overlap. Keywords. Knowledge Graph, Linked Data, Semantic Web, Profiling. 1. Introduction. Knowledge Graphs are increasingly used as means to represent knowledge. Due to their versatile means of representation, they can be used to integrate different heterogeneous data sources, both within as well as across organizations [8,9]. Besides such domain-specific knowledge graphs which are typically developed for specific domains and/or use cases, there are also public, cross-domain knowledge graphs encoding common knowledge, such as DBpedia, Wikidata, or YAGO [33]. Such knowledge graphs may be used, e.g., for automatically enriching data with background knowledge to be used in knowledge-intensive downstream applications.
  • Examining Different Strategies for the Card Game Sueca
    Universiteit Leiden, Opleiding Informatica & Economie (Computer Science & Economics programme). Examining different strategies for the card game Sueca. Name: Vanessa Lopes Carreiro. Date: July 13, 2016. 1st supervisor: Walter Kosters. 2nd supervisor: Rudy van Vliet. Bachelor thesis, Leiden Institute of Advanced Computer Science (LIACS), Leiden University, Niels Bohrweg 1, 2333 CA Leiden, The Netherlands. Abstract: Sueca is a point-trick card game with trumps popular in Portugal, Brazil and Angola. No research has yet been done into Sueca. In this thesis we will study the card game in detail and examine different playing strategies, one of them being basic Monte-Carlo Tree Search. The purpose is to see what strategies can be used to play the card game best. It turns out that the basic Monte-Carlo strategy plays best when both team members play that strategy. Acknowledgements: I would like to thank my supervisor Walter Kosters for brainstorming with me about the research and for his support, but also for the conversations about life. It was a pleasure working with him. I am also grateful to Rudy van Vliet for being my second reader and taking time to read this thesis and provide feedback. Contents: Abstract. Acknowledgements. 1 Introduction. 2 The game. 2.1 The deck and players. 2.2 The deal. 2.3 The play. 2.4 Scoring. 3 Similar card games. 3.1 Klaverjas. 3.2 Bridge. 4 How to win Sueca. 4.1 Leading the first trick
  • Large Semantic Network Manual Annotation
    Large Semantic Network Manual Annotation. Václav Novák, Institute of Formal and Applied Linguistics, Charles University, Prague. Abstract: This abstract describes a project aiming at manual annotation of the content of natural language utterances in a parallel text corpus. The formalism used in this project is MultiNet (Multilayered Extended Semantic Network). The annotation should be incorporated into Prague Dependency Treebank as a new annotation layer. 1 Introduction. A formal specification of the semantic content is the aim of numerous semantic approaches such as TIL [6], DRT [9], MultiNet [4], and others. As far as we can tell, there are no large “real life” text corpora manually annotated with such markup. The projects usually work only with automatically generated annotation, if any [1, 6, 3, 2]. We want to create a parallel Czech-English corpus of texts annotated with the corresponding semantic network. 1.1 Prague Dependency Treebank. From the linguistic viewpoint there are language resources such as Prague Dependency Treebank (PDT) which contain a deep manual analysis of texts [8]. PDT contains annotations of three layers, namely morphological, analytical (shallow dependency syntax) and tectogrammatical (deep dependency syntax). The units of each annotation level are linked with corresponding units on the preceding level. The morphological units are linked directly with the original text. The theoretical basis of the treebank lies in the Functional Generative Description of language system [7]. PDT 2.0 is based on the long-standing Praguian linguistic tradition, adapted for the current computational-linguistics research needs. The corpus itself is embedded into the latest annotation technology.
  • Universal Or Variation? Semantic Networks in English and Chinese
    Universal or variation? Semantic networks in English and Chinese. Understanding the structures of semantic networks can provide great insights into lexico-semantic knowledge representation. Previous work reveals small-world structure in English, the structure that has the following properties: short average path lengths between words and strong local clustering, with a scale-free distribution in which most nodes have few connections while a small number of nodes have many connections [1]. However, it is not clear whether such semantic network properties hold across human languages. In this study, we investigate the universal structures and cross-linguistic variations by comparing the semantic networks in English and Chinese. Network description: To construct the Chinese and the English semantic networks, we used Chinese Open Wordnet [2,3] and English WordNet [4]. The two wordnets have different word forms in Chinese and English but common word meanings. Word meanings are connected not only to word forms, but also to other word meanings if they form relations such as hypernyms and meronyms (Figure 1). 1. Cross-linguistic comparisons. Analysis: The large-scale structures of the Chinese and the English networks were measured with two key network metrics, small-worldness [5] and scale-free distribution [6]. Results: The two networks have similar size and both exhibit small-worldness (Table 1). However, the small-worldness is much greater in the Chinese network (σ = 213.35) than in the English network (σ = 83.15); this difference is primarily due to the higher average clustering coefficient (ACC) of the Chinese network. The scale-free distributions are similar across the two networks, as indicated by ANCOVA, F (1, 48) = 0.84, p = .37.
  • The Netherlands from National Identity to Plural Identifications
    The Netherlands: From National Identity to Plural Identifications. By Monique Kremer. Transatlantic Council on Migration. March 2013. Acknowledgments: This research was commissioned by the Transatlantic Council on Migration, an initiative of the Migration Policy Institute (MPI), for its seventh plenary meeting, held November 2011 in Berlin. The meeting’s theme was “National Identity, Immigration, and Social Cohesion: (Re)building Community in an Ever-Globalizing World” and this paper was one of the reports that informed the Council’s discussions. The Council, an MPI initiative undertaken in cooperation with its policy partner the Bertelsmann Stiftung, is a unique deliberative body that examines vital policy issues and informs migration policymaking processes in North America and Europe. The Council’s work is generously supported by the following foundations and governments: Carnegie Corporation of New York, Open Society Foundations, Bertelsmann Stiftung, the Barrow Cadbury Trust (UK Policy Partner), the Luso-American Development Foundation, the Calouste Gulbenkian Foundation, and the governments of Germany, the Netherlands, Norway, and Sweden. For more on the Transatlantic Council on Migration, please visit: www.migrationpolicy.org/transatlantic. © 2013 Migration Policy Institute. All Rights Reserved. Cover Design: Danielle Tinker, MPI. Typesetting: April Siruno and Rebecca Kilberg, MPI. No part of this publication may be reproduced or transmitted in any form by any means, electronic or mechanical, including photocopy, or any information storage and retrieval system, without permission from the Migration Policy Institute. A full-text PDF of this document is available for free download from: www.migrationpolicy.org. Information for reproducing excerpts from this report can be found at www.migrationpolicy.org/about/copy.php.
  • Structure at Every Scale: a Semantic Network Account of the Similarities Between Very Unrelated Concepts
    De Deyne, S., Navarro, D. J., Perfors, A. and Storms, G. (2016). Structure at every scale: A semantic network account of the similarities between very unrelated concepts. Journal of Experimental Psychology: General, 145, 1228-1254. https://doi.org/10.1037/xge0000192. Structure at every scale: A semantic network account of the similarities between unrelated concepts. Simon De Deyne, Danielle J. Navarro, Amy Perfors (University of Adelaide); Gert Storms (University of Leuven). Word count: 19586. Abstract: Similarity plays an important role in organizing the semantic system. However, given that similarity cannot be defined on purely logical grounds, it is important to understand how people perceive similarities between different entities. Despite this, the vast majority of studies focus on measuring similarity between very closely related items. When considering concepts that are very weakly related, little is known. In this paper we present four experiments showing that there are reliable and systematic patterns in how people evaluate the similarities between very dissimilar entities. We present a semantic network account of these similarities showing that a spreading activation mechanism defined over a word association network naturally makes correct predictions about weak similarities and the time taken to assess them, whereas, though simpler, models based on direct neighbors between word pairs derived using the same network cannot. Keywords: word associations, similarity, semantic networks, random walks. This work was supported by a research grant funded by the Research Foundation - Flanders (FWO), ARC grant DE140101749 awarded to the first author, and by the interdisciplinary research project IDO/07/002 awarded to Dirk Speelman, Dirk Geeraerts, and Gert Storms.
  • Evaluating Attacks on the Credibility of Politicians in Political Debates
    ISSA Proceedings 2006 – Evaluating Attacks On The Credibility Of Politicians In Political Debates. 1. Introduction. When analysing political argumentative discourse, we regularly come across attacks on the credibility of the participants in the discourse. By discrediting tactics, politicians intend to discourage their colleague politicians and the general public from supporting the standpoint of their opponent. In Dutch media it is suggested that the use of personal attacks in Dutch political debates has increased under the influence of international politics.[i] Although there is no empirical evidence for this claim, very recently there have been several examples in Dutch politics in which the credibility of politicians has been the subject of debate. On June 24, 2006, for instance, the Dutch progressive liberal party, Democrats 66 (D’66), organised elections in order to find a new party leader. The two most prominent candidates were the current Minister for Government Reform and Kingdom Relations, Alexander Pechtold, and the chair of the parliamentary party, Lousewies van der Laan. In one of the debates in the build-up to the elections, van der Laan stated that her opponent Pechtold had completely lost his credibility. First of all because Pechtold, when he was a minister, had agreed on the Uruzgan mission whereas, on an earlier occasion, he had said that under no circumstances would he agree on that mission. Secondly, because he characterised himself as an analytical person, whereas, according to van der Laan, this is not in keeping with the way in which he had profiled himself in an interview, claiming to be ‘a man who often shoots and some shots are successful’.
  • The Penguin Book of Card Games
    PENGUIN BOOKS. The Penguin Book of Card Games. A former language-teacher and technical journalist, David Parlett began freelancing in 1975 as a games inventor and author of books on games, a field in which he has built up an impressive international reputation. He is an accredited consultant on gaming terminology to the Oxford English Dictionary and regularly advises on the staging of card games in films and television productions. His many books include The Oxford History of Board Games, The Oxford History of Card Games, The Penguin Book of Word Games, The Penguin Book of Card Games and The Penguin Book of Patience. His board game Hare and Tortoise has been in print since 1974, was the first ever winner of the prestigious German Game of the Year Award in 1979, and has recently appeared in a new edition. His website at http://www.davpar.com is a rich source of information about games and other interests. David Parlett is a native of south London, where he still resides with his wife Barbara. Published by the Penguin Group: Penguin Books Ltd, 80 Strand, London WC2R 0RL, England; Penguin Group (USA) Inc., 375 Hudson Street, New York, New York 10014, USA; Penguin Group (Canada), 90 Eglinton Avenue East, Suite 700, Toronto, Ontario, Canada M4P 2Y3 (a division of Pearson Penguin Canada Inc.); Penguin Ireland, 25 St Stephen’s Green, Dublin 2, Ireland (a division of Penguin Books Ltd); Penguin Group (Australia) Ltd, 250 Camberwell Road, Camberwell, Victoria 3124, Australia
  • Computing and Predicting Winning Hands in the Trick-Taking Game of Klaverjas
    Computing and Predicting Winning Hands in the Trick-Taking Game of Klaverjas. Jan N. van Rijn (Columbia University, New York; Leiden University), Frank W. Takes (University of Amsterdam; Leiden University), and Jonathan K. Vis (Leiden University Medical Center; Leiden University). Abstract. This paper deals with the trick-taking game of Klaverjas, in which two teams of two players aim to gather as many high valued cards for their team as possible. We propose an efficient encoding to enumerate possible configurations of the game, such that subsequently αβ-search can be employed to effectively determine whether a given hand of cards is winning. To avoid having to apply the exact approach to all possible game configurations, we introduce a partitioning of hands into 981,541 equivalence classes. In addition, we devise a machine learning approach that, based on a combination of simple features, is able to predict with high accuracy whether a hand is winning. This approach essentially mimics humans, who typically decide whether or not to play a dealt hand based on various simple counts of high ranking cards in their hand. By comparing the results of the exact algorithm and the machine learning approach we are able to characterize precisely which instances are difficult to solve for an algorithm, but easy to decide for a human. Results on almost one million game instances show that the exact approach typically solves a game within minutes, whereas a relatively small number of instances require up to several days, traversing a space of several billion game states. Interestingly, it is precisely those instances that are always correctly classified by the machine learning approach.