Investigating Social Networks with Agent Based Simulation and Link Prediction Methods

Total Page:16

File Type:pdf, Size:1020Kb

Investigating Social Networks with Agent Based Simulation and Link Prediction Methods Investigating Social Networks with Agent Based Simulation and Link Prediction Methods Angelico Giovanni Fetta School of Mathematics A thesis presented for the degree of Doctor of Philosophy 2014 Summary Social networks are increasingly being investigated in the context of individual behaviours. Research suggests that friendship connections have the ability to influence individual ac- tions, change personal opinions and subsequently impact upon personal wellbeing. This thesis aims to investigate the effects of social networks, through the use of Agent Based Simulation (ABS) and Link Prediction (LP) methods. Three main investigations form this thesis, culminating in the development of a new simulation-based approach to Link Predic- tion (PageRank-Max) and a model of behavioural spread through a connected population (Behavioural PageRank-Max). The first project investigates the suitability of ABS to explore a connected social system. The Peter Principle is a theory of managerial incompetence, having the potential to cause detrimental effects to system efficiency. Through the investigation of a theoretical hierar- chy of workplace social contacts, it is observed that the structure of a social network has the ability to impact system efficiency, demonstrating the importance of social network structure in conjunction with individual behaviours. The second project aims to further understand the structure of social networks, through the exploration of adolescent offline friendship data, taken from ‘A Stop Smoking in Schools Trial’ (ASSIST). An initial analysis of the data suggests certain factors may be pertinent in the formation of school social networks, identifying the importance of centrality mea- sures. An ABS aiming to predict the evolution of the ASSIST social networks is created, developing an algorithm based upon the optimisation of an individual’s eigen-centrality - termed PageRank-Max. This new approach to Link Prediction is found to predict ASSIST social network evolution more accurately than four existing prominent LP algorithms. The final part of this thesis attempts to improve the PageRank-Max method, by placing par- ticular emphasis upon specific individual attributes. Two new methods are developed, the first restricting the search space of the algorithm (Behavioural Search), while the second al- ters its calculation process by applying specific attribute weights (Behavioural PageRank- Max). The results demonstrate the importance of individual attributes in adolescent friend- ship selection. Furthermore, the Behavioural PageRank-Max offers an approach to model the spread of behaviours in conjunction with social network structure, with the value of this being evaluated against alternative models. i ii Declaration This work has not been submitted in substance for any other degree or award at this or any other university or place of learning, nor is being submitted concurrently in candidature for any degree or other award. Signed .. Date . STATEMENT 1 This thesis is being submitted in partial fulfilment of the requirements for the degree of PhD. Signed .. Date . STATEMENT 2 This thesis is the result of my own investigations, except where otherwise stated. Other sources are acknowledged and explicit references given. A reference section is appended. Signed .. Date . STATEMENT 3 I hereby give consent for my thesis, if accepted, to be available for photocopying and for inter-library loan, and for the title and summary to be made available to outside organisa- tions. Signed .. Date . iii iv Acknowledgements First and foremost, I would like to thank my supervisors - Paul Harper, Janet Williams and Vincent Knight - for the time, energy and passion they have put into this research. Thank you for giving me the freedom to take us off into some (really) unexpected directions, involving jealous co-workers, smoking children and an array of specially selected snacks. I hope to be able to share a whiskey, a gin and tonic, and a cherry coke with all three of you again very soon. I would also like to thank Israel Vieira who was integral in the early stages of my work - giving me the knowledge necessary to become an autonomous agent. Thank you also to the EPSRC and LANCS initiative, who have kept me in food, clothes and copious amounts of olive oil for the last 3.5 years. Thank you to the DECIPHer group, namely Laurence Moore and Jo Holliday, for providing the ASSIST data. Also, a very special thank you to Jonathan Gillard, whose wise words and pastoral care have kept my head out of the shed on a number of occasions. Truth be told, doing a PhD is soul destroying, but I was fortunate to have people around who made coming in everyday worthwhile. From muscling my way into the “Fabulous 4” all those years ago, with the group growing to incorporate special applied mathematicians, the Warwick upper classes and all manner of wonderful characters. So many memories. So much Euphoria! Given that this research is about the effect of social networks, it is only fitting that I mention the impact of my own social network. My parents, who have always pushed me to succeed and supported me; my brothers, for keeping my feet on the ground with their pep-talks laden with sarcasm; my housemate Jo, for putting up with my lack of conversation and inability to deal with tupperware; and those special members of my family, who always welcome me with open arms and an open fridge whenever I decide to drop by. Of course, my social circle would not be complete if I didn’t mention Clare-o, always direct and always up for doing nothing - long may our ‘moments of poetry’ continue. Finally, to my number one ‘Cardiff girl’: Leanne Smith. My biggest discovery during my research wasn’t PageRank-Max, it was the curly haired little lady sitting across from me. Thank you for being my team mate, my sous chef and my reason for bringing two pieces of kitchen roll to the table. Xxxx. v vi Publications and Presentations Publications • Fetta, A. G., Harper, P. R., Knight, V. A., Vieira, I. T. and Williams J. E. (2012). On the Peter Principle: An agent based investigation into the consequential effects of social networks and behavioural factors. Physica A: Statistical Mechanics and its Applications, 391(9): 2898 - 2910. Awards • Early Career Researcher Poster Prize at ORAHS Conference 2012 - First Place. Conference Contributions & Presentations • Fetta, A. G., Harper, P. R., Knight, V. A., Vieira, I. T. and Williams J. E. (2010). Agent Based Simulation for Complex Health Systems Interventions; SWORDS Sem- inar, Cardiff. • Fetta, A. G., Jones, M., Knight, V. A., and Williams J. E. (2010). Workforce Plan- ning: Signing on to system dynamics? System Dynamic Special Interest Group Meeting (2011), Cardiff. • Fetta, A. G., Harper, P. R., Knight, V. A., Vieira, I. T. and Williams J. E. (2011). The Peter Principle, Agents and Beyond. SWORDS Seminar, Cardiff. • Fetta, A. G., Harper, P. R., Knight, V. A., Vieira, I. T. and Williams J. E. (2011), On the Peter Principle: An Agent Based Investigation into the Consequential Effects of Social Networks and Behavioural Factors. Winter Simulation Conference, Pheonix, Arizona. • Fetta, A. G., Harper, P. R., Knight, V. A., Vieira, I. T. and Williams J. E. (2012). On the Peter Principle: An Agent Based Investigation into the Consequential Effects of Social Networks and Behavioural Factors. SCOR, Nottingham. vii • Fetta, A. G., Harper, P. R., Knight, V. A., Vieira, I. T. and Williams J. E. (2012). On the Peter Principle: An Agent Based Investigation into the Consequential Effects of Social Networks and Behavioural Factors. EURO, Vilnuis, Lithuania. • Fetta, A. G., Harper, P. R., Knight, V. A. and Williams J. E. (2012). Modelling Adolescent Smoking Behaviours with Social Network Analysis. ORAHS, Twente, Holland. • Fetta, A. G., Harper, P. R., Knight, V. A. and Williams J. E. (2012). Modelling Adolescent Smoking Behaviours with Social Network Analysis. NHS Confederation Event: Change by design - systems modelling and simulation in healthcare, Exeter. • Fetta, A. G., Harper, P. R., Knight, V. A. and Williams J. E. (2013). Modelling Adolescent Smoking Behaviours with Social Network Analysis. INSNA Sunbelt Conference, Hamburg, Germany. • Fetta, A. G., Harper, P. R., Knight, V. A. and Williams J. E. (2013). Modelling Ado- lescent Smoking Behaviours with Social Network Analysis. EURO, Rome, Italy. viii List of Abbreviations ABS Agent Based Simulation API Application Programming Interface ASSIST AStop Smoking In Schools Trial BPRM Behavioural PageRank-Max CI Confidence Interval CS Common Sense DECIPHer Development and Evaluation of Complex Interventions for Public Health Improvement EOS Evolution of Organised Societies ESS Evolutionary Stable Strategy GUI Graphical User Interface LP Link Prediction MANTA Modelling an ANT hill Activity MAS Multi Agent System MCMC Markov Chain Monte Carlo MLE Maximum Likelihood Estimator MPI Mass Psychogenic Illness NBM Network Behavioural Model NE Network Efficiency OOP Object-Oriented Programming OR Operational Research OSS Open Source System OOP Object-Oriented Programming PP Peter Principle PS Proprietary System PSM Population Structure Model RAN RANdom Network SAB Stochastic Actor Based SF Scale Free Network SNS Social Network Simulation SW Small World Network WHO World Health Organisation ix x Contents 1 Introduction1 1.1 Motivation . .2 1.1.1 Social Connection . .2 1.1.2 Social Network Simulation . .4 1.1.3 Application . .5 1.2 Research Methods . .7 1.3 Research Aims . .8 1.4 Outline . .9 1.5 Chapter Review . 11 2 Simulation Literature Review 13 2.1 What is Simulation? . 14 2.1.1 Benefits of Simulation . 15 2.1.2 Limitations . 17 2.1.3 Building a Simulation . 17 2.2 Types of Simulation . 20 2.2.1 System Dynamics . 20 2.2.2 Discrete Event Simulation . 21 2.2.3 Agent Based Simulation . 22 2.2.4 Advances in Simulation . 31 2.3 Applications of ABS . 32 2.3.1 Common ABS Applications . 32 2.3.2 Social Theory . 34 2.3.3 Social Networks .
Recommended publications
  • The Long-Term Benefits of Positive Self-Presentation Via Profile
    fpsyg-08-01981 November 14, 2017 Time: 17:23 # 1 ORIGINAL RESEARCH published: 15 November 2017 doi: 10.3389/fpsyg.2017.01981 The Long-Term Benefits of Positive Self-Presentation via Profile Pictures, Number of Friends and the Initiation of Relationships on Facebook for Adolescents’ Self-Esteem and the Initiation of Offline Relationships Anna Metzler* and Herbert Scheithauer Developmental Science and Applied Developmental Psychology, Freie Universität Berlin, Berlin, Germany Social networking sites are a substantial part of adolescents’ daily lives. By using a longitudinal approach the current study examined the impact of (a) positive self- presentation, (b) number of friends, and (c) the initiation of online relationships on Facebook on adolescents’ self-esteem and their initiation of offline relationships, as Edited by: well as the mediating role of positive feedback. Questionnaire data were obtained Kai S. Cortina, from 217 adolescents (68% girls, mean age 16.7 years) in two waves. Adolescents’ University of Michigan, United States positive self-presentation and number of friends were found to be related to a higher Reviewed by: Burkhard Gniewosz, frequency of receiving positive feedback, which in turn was negatively associated with University of Salzburg, Austria self-esteem. However, the number of Facebook friends had a positive impact on self- Katherine Fiori, esteem, and the initiation of online relationships positively influenced the initiation of Adelphi University, United States offline relationships over time, demonstrating that Facebook
    [Show full text]
  • Biased Edge Dropout for Enhancing Fairness in Graph Representation
    PREPRINT SUBMITTED TO A JOURNAL. 1 Biased Edge Dropout for Enhancing Fairness in Graph Representation Learning Indro Spinelli, Graduate Student Member, IEEE, Simone Scardapane, Amir Hussain, Senior Member, IEEE and Aurelio Uncini, Member, IEEE Abstract—Graph representation learning has become a ubiqui- I. INTRODUCTION tous component in many scenarios, ranging from social network analysis to energy forecasting in smart grids. In several applica- RAPH structured data, ranging from friendships on tions, ensuring the fairness of the node (or graph) representations G social networks to physical links in energy grids, powers with respect to some protected attributes is crucial for their many algorithms governing our digital life. Social networks correct deployment. Yet, fairness in graph deep learning remains under-explored, with few solutions available. In particular, the topologies define the stream of information we will receive, tendency of similar nodes to cluster on several real-world often influencing our opinion [1][2][3][4]. Bad actors, some- graphs (i.e., homophily) can dramatically worsen the fairness times, define these topologies ad-hoc to spread false informa- of these procedures. In this paper, we propose a biased edge tion [5]. Similarly, recommender systems [6] suggest products dropout algorithm (FairDrop) to counter-act homophily and tailored to our own experiences and history of purchases. improve fairness in graph representation learning. FairDrop can be plugged in easily on many existing algorithms, is efficient, However, pursuing the highest accuracy as the only metric of adaptable, and can be combined with other fairness-inducing interest has let many of these algorithms discriminate against solutions. After describing the general algorithm, we demonstrate minorities in the past [7][8][9], despite the law prohibiting its application on two benchmark tasks, specifically, as a random unfair treatment based on sensitive traits such as race, religion, walk model for producing node embeddings, and to a graph and gender.
    [Show full text]
  • Combining Collective Classification and Link Prediction
    Combining Collective Classification and Link Prediction Mustafa Bilgic, Galileo Mark Namata, Lise Getoor Dept. of Computer Science Univ. of Maryland College Park, MD 20742 {mbilgic, namatag, getoor}@cs.umd.edu Abstract however, this is rarely the case. Real world collections usu- ally have a number of missing and incorrect labels and links. The problems of object classification (labeling the nodes Other approaches construct a complex joint probabilistic of a graph) and link prediction (predicting the links in model, capable of handling missing values, but in which a graph) have been largely studied independently. Com- inference is often intractable. monly, object classification is performed assuming a com- In this paper, we take the middle road. We propose a plete set of known links and link prediction is done assum- simple yet general approach for combining object classifi- ing a fully observed set of node attributes. In most real cation and link prediction. We propose an algorithm called world domains, however,attributes and links are often miss- Iterative Collective Classification and Link Prediction (IC- ing or incorrect. Object classification is not provided with CLP) that integrates collective object classification and link all the links relevant to correct classification and link pre- prediction by effectively passing up-to-date information be- diction is not provided all the labels needed for accurate tween the algorithms. We experimentally show on many link prediction. In this paper, we propose an approach that different network types that applying ICCLP improves per- addresses these two problems by interleaving object clas- formance over running collective classification and link pre- sification and link prediction in a collective algorithm.
    [Show full text]
  • Finding Experts by Link Prediction in Co-Authorship Networks
    Finding Experts by Link Prediction in Co-authorship Networks Milen Pavlov1,2,RyutaroIchise2 1 University of Waterloo, Waterloo ON N2L 3G1, Canada 2 National Institute of Informatics, Tokyo 101-8430, Japan Abstract. Research collaborations are always encouraged, as they of- ten yield good results. However, the researcher network contains massive amounts of experts in various disciplines and it is difficult for the indi- vidual researcher to decide which experts will match his own expertise best. As a result, collaboration outcomes are often uncertain and research teams are poorly organized. We propose a method for building link pre- dictors in networks, where nodes can represent researchers and links - collaborations. In this case, predictors might offer good suggestions for future collaborations. We test our method on a researcher co-authorship network and obtain link predictors of encouraging accuracy. This leads us to believe our method could be useful in building and maintaining strong research teams. It could also help with choosing vocabulary for expert de- scription, since link predictors contain implicit information about which structural attributes of the network are important with respect to the link prediction problem. 1 Introduction Collaborations between researchers often have a synergistic effect. The combined expertise of a group of researchers can often yield results far surpassing the sum of the individual researchers’ capabilities. However, creating and organizing such research teams is not a straightforward task. The individual researcher often has limited awareness of the existence of other researchers with which collaborations might prove fruitful. Further- more, even in the presence of such awareness, it is difficult to predict in advance which potential collaborations should be pursued.
    [Show full text]
  • Friendship Paradox in Growth Networks: Analytical and Empirical Analysis
    Sidorov et al. Appl Netw Sci (2021) 6:51 https://doi.org/10.1007/s41109-021-00391-6 Applied Network Science RESEARCH Open Access Friendship paradox in growth networks: analytical and empirical analysis Sergei P. Sidorov* , Sergei V. Mironov and Alexey A. Grigoriev *Correspondence: [email protected] Abstract Saratov State University, Many empirical studies have shown that in social, citation, collaboration, and other Saratov, Russia 410012 types of networks in real world, the degree of almost every node is less than the aver- age degree of its neighbors. This imbalance is well known in sociology as the friendship paradox and states that your friends are more popular than you on average. If we intro- duce a value equal to the ratio of the average degree of the neighbors for a certain node to the degree of this node (which is called the ‘friendship index’, FI), then the FI value of more than 1 for most nodes indicates the presence of the friendship paradox in the network. In this paper, we study the behavior of the FI over time for networks generated by growth network models. We will focus our analysis on two models based on the use of the preferential attachment mechanism: the Barabási–Albert model and the triadic closure model. Using the mean-feld approach, we obtain diferential equa- tions describing the dynamics of changes in the FI over time, and accordingly, after obtaining their solutions, we fnd the expected values of this index over iterations. The results show that the values of FI are decreasing over time for all nodes in both models.
    [Show full text]
  • Learning to Make Predictions on Graphs with Autoencoders
    Learning to Make Predictions on Graphs with Autoencoders Phi Vu Tran Strategic Innovation Group Booz j Allen j Hamilton San Diego, CA USA [email protected] Abstract—We examine two fundamental tasks associated networks [38], communication networks [11], cybersecurity with graph representation learning: link prediction and semi- [6], recommender systems [16], and knowledge bases such as supervised node classification. We present a novel autoencoder DBpedia and Wikidata [35]. architecture capable of learning a joint representation of both local graph structure and available node features for the multi- There are a number of technical challenges associated with task learning of link prediction and node classification. Our learning to make meaningful predictions on complex graphs: autoencoder architecture is efficiently trained end-to-end in a single learning stage to simultaneously perform link prediction • Extreme class imbalance: in link prediction, the number and node classification, whereas previous related methods re- of known present (positive) edges is often significantly quire multiple training steps that are difficult to optimize. We less than the number of known absent (negative) edges, provide a comprehensive empirical evaluation of our models making it difficult to reliably learn from rare examples; on nine benchmark graph-structured datasets and demonstrate • Learn from complex graph structures: edges may be significant improvement over related methods for graph rep- resentation learning. Reference code and data are available at directed or undirected, weighted or unweighted, highly https://github.com/vuptran/graph-representation-learning. sparse in occurrence, and/or consisting of multiple types. Index Terms—network embedding, link prediction, semi- A useful model should be versatile to address a variety supervised learning, multi-task learning of graph types, including bipartite graphs; • Incorporate side information: nodes (and maybe edges) I.
    [Show full text]
  • Predicting Disease Related Microrna Based on Similarity and Topology
    cells Article Predicting Disease Related microRNA Based on Similarity and Topology Zhihua Chen 1,†, Xinke Wang 2,†, Peng Gao 2,†, Hongju Liu 3,† and Bosheng Song 4,* 1 Institute of Computing Science and Technology, Guangzhou University, Guangzhou 510006, China; [email protected] 2 School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; [email protected] (X.W.); [email protected] (P.G.) 3 College of Information Technology and Computer Science, University of the Cordilleras, Baguio 2600, Philippines; [email protected] 4 School of Information Science and Engineering, Hunan University, Changsha 410082, China * Correspondence: [email protected]; Tel.: +86-731-88821907 † All authors contributed equally to this work. Received: 4 July 2019; Accepted: 5 November 2019; Published: 7 November 2019 Abstract: It is known that many diseases are caused by mutations or abnormalities in microRNA (miRNA). The usual method to predict miRNA disease relationships is to build a high-quality similarity network of diseases and miRNAs. All unobserved associations are ranked by their similarity scores, such that a higher score indicates a greater probability of a potential connection. However, this approach does not utilize information within the network. Therefore, in this study, we propose a machine learning method, called STIM, which uses network topology information to predict disease–miRNA associations. In contrast to the conventional approach, STIM constructs features according to information on similarity and topology in networks and then uses a machine learning model to predict potential associations. To verify the reliability and accuracy of our method, we compared STIM to other classical algorithms.
    [Show full text]
  • Representation Learning on Graphs
    Representation learning on graphs A/Prof Truyen Tran [email protected] @truyenoz Deakin University truyentran.github.io letdataspeak.blogspot.com goo.gl/3jJ1O0 HCMC, May 2019 11/05/2019 1 Graph representation Graph reasoning Why bother? Graph generation Embedding Graph dynamics Message passing 11/05/2019 2 Why learning of graph representation? Graphs are pervasive in many scientific disciplines. The sub-area of graph representation has reached a certain maturity, with multiple reviews, workshops and papers at top AI/ML venues. Deep learning needs to move beyond vector, fixed-size data. Learning representation as a powerful way to discover hidden patterns making learning, inference and planning easier. 11/05/2019 3 System medicine 11/05/2019 https://www.frontiersin.org/articles/10.3389/fphys.2015.00225/full 4 Biology & pharmacy Traditional techniques: Graph kernels (ML) Molecular fingerprints (Chemistry) Modern techniques Molecule as graph: atoms as nodes, chemical bonds as edges #REF: Penmatsa, Aravind, Kevin H. Wang, and Eric Gouaux. "X- ray structure of dopamine transporter elucidates antidepressant mechanism." Nature 503.7474 (2013): 85-90. 11/05/2019 5 Chemistry DFT = Density Functional Theory Gilmer, Justin, et al. "Neural message passing for quantum chemistry." arXiv preprint arXiv:1704.01212 (2017). • Molecular properties • Chemical-chemical interaction • Chemical reaction • Synthesis planning 11/05/2019 6 Materials science • Crystal properties • Exploring/generating solid structures • Inverse design Xie, Tian, and Jeffrey
    [Show full text]
  • Network Interventions Based on Inversity: Leveraging the Friendship Paradox in Unknown Network Structures
    Network Interventions Based on Inversity: Leveraging the Friendship Paradox in Unknown Network Structures Vineet Kumar,1∗ David Krackhardt,2 Scott Feld3 1 Yale School of Management, Yale University. 2Heinz School of Public Policy and Management, Carnegie Mellon University 3 Department of Sociology, College of Liberal Arts, Purdue University. ∗To whom correspondence should be addressed; E-mail: [email protected] Network intervention problems benefit from selecting a more connected node, which is more likely to result in stronger indirect effects. However, in many network contexts, the structure of the network is unknown. We derive and examine the mathematical properties of two distinct “informationally light” strategies, a global strategy and local strategy, that yield higher degree nodes in virtually any network structure. These strategies are based on the friend- ship paradox: “your friends have more friends that you do.” We further iden- tify a novel network property called Inversity, which connects the fundamental parameters of these two strategies. We prove that the sign of Inversity for any given network determines which of the two strategies will be most effective in that network. We assess the performance of these strategies across a wide range of generative network models and real networks, and we show how to leverage network structure through these strategies even when the network structure is unknown. 1 Network-based interventions are of crucial importance in any setting where an individual’s choice or action has a multiplier impact on others, with a wide range of applications. Consider the following network intervention problems: (a) A new infectious disease is spreading through a large population.
    [Show full text]
  • Khanamk2021m-1A.Pdf (2.656Mb)
    Lakehead University Knowledge Commons,http://knowledgecommons.lakeheadu.ca Electronic Theses and Dissertations Electronic Theses and Dissertations from 2009 2021 Using homophily to analyze and develop link prediction models with deep learning framework Khanam, Kazi Zainab https://knowledgecommons.lakeheadu.ca/handle/2453/4829 Downloaded from Lakehead University, KnowledgeCommons Using Homophily to Analyze and Develop Link Prediction Models with Deep Learning Framework by Kazi Zainab Khanam A thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Computer Science in the Faculty of Science and Environmental Studies of Lakehead University, Thunder Bay Committee in charge: Dr. Vijay Mago (Principal Supervisor) Dr. Rajesh Sharma (External Examiner) Dr. Yimin Yang (Internal Examiner) Winter 2021 The thesis of Kazi Zainab Khanam, titled Using Homophily to Analyze and Develop Link Prediction Models with Deep Learning Framework, is approved: Chair Date Date Date Lakehead University, Thunder Bay Using Homophily to Analyze and Develop Link Prediction Models with Deep Learning Framework Copyright 2021 by Kazi Zainab Khanam 1 Abstract USING HOMOPHILY TO ANALYZE AND DEVELOP LINK PREDICTION MODELS WITH DEEP LEARNING FRAMEWORK Twitter is a prominent social networking platform where users’ short messages or “tweets” are often used for analysis. However, there has not been much attention paid to mining the medical professions, such as detecting users’ occupations from their biographical content. Mining such information can be useful to build recommender systems for cost-effective ad- vertisements. Conventional classifiers can be used to predict medical occupations, but they tend to perform poorly as there are a variety of occupations. As a result, the main focus of the research is to use various deep learning techniques to examine the textual properties of Twitter users’ biographic contents, network properties, and the impact of homophily of Twitter users employed in medical professional fields.
    [Show full text]
  • Neural Variational Inference for Embedding Knowledge Graphs
    University College London MSc Data Science and Machine Learning Neural Variational Inference For Embedding Knowledge Graphs Author: Supervisor: Alexander Cowen-Rivers Prof. Sebastian Riedel Disclaimer This report is submitted as part requirement for the MSc Degree in Data Science and Machine Learning at University College London. It is substantially the result of my own work except where explicitly indicated in the text. The report may be freely copied and distributed provided the source is explicitly acknowledged.The report may be freely copied and distributed provided the source is explicitly acknowledged. Machine Reading Group October 23, 2018 i \Probability is expectation founded upon partial knowledge. A perfect acquaintance with all the circumstances affecting the occurrence of an event would change expectation into certainty, and leave nether room nor demand for a theory of probabilities." George Boole ii Acknowledgements I would like to thank my supervisors Prof Sebastian Riedel, as well as the other Machine Reading group members, particularly Dr Minervini, who helped through comments and discussions during the writing of this thesis. I would also like to thank my father who nurtured my passion for mathematics. Lastly, I would like to thank Thomas Kipf from the University of Amsterdam, who cleared up some queries I had regarding some of his recent work. iii Abstract Alexander Cowen-Rivers Neural Variational Inference For Embedding Knowledge Graphs Statistical relational learning investigates the development of tools to study graph-structured data. In this thesis, we provide an introduction on how models of the world are learnt using knowledge graphs of known facts of information, later applied to infer new facts about the world.
    [Show full text]
  • A Scalable and Distributed Actor-Based Version of the Node2vec Algorithm
    Workshop "From Objects to Agents" (WOA 2019) A Scalable and Distributed Actor-Based Version of the Node2Vec Algorithm Gianfranco Lombardo Agostino Poggi Department of Engineering and Architecture Department of Engineering and Architecture University of Parma University of Parma Parma, Italy Parma, Italy [email protected] [email protected] Abstract—The analysis of systems that can be modeled as attributed networks: an interaction network and a friendship networks of interacting entities is becoming often more important network. In [3] the authors analyze the same community in in different research fields. The application of machine learning order to extract the giant component without using topology algorithms, like prediction tasks over nodes and edges, requires a manually feature extraction or a learning task to extract them information with a bio-inspired algorithm. In [4] the authors automatically (embedding techniques). Several approaches have uses a temporal network to model the USA financial market in been proposed in the last years and the most promising one order to discover correlations among the dynamics of stocks’ is represented by the Node2Vec algorithm. However, common cluster and to predict economic crises. In [5] the authors limitations of graph embedding techniques are related to memory modeled the interaction between proteins as a network, with requirements and to the time complexity. In this paper, we propose a scalable and distributed version of this algorithm called the aim of automatic predicting a correct label for each protein ActorNode2vec, developed on an actor-based architecture that describing its functionalities. In light of this, this formulation allows to overcome these kind of constraints.
    [Show full text]