Some Directed Graph Algorithms and Their Application to Pointer Analysis

University of London
Imperial College of Science, Technology and Medicine
Department of Computing

David J. Pearce
February 2005

Submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Engineering of the University of London

Abstract

This thesis is focused on improving the execution time and precision of scalable pointer analysis. Such an analysis statically determines the targets of all pointer variables in a program. We formulate the analysis as a directed graph problem, where the solution can be obtained by a computation similar, in many ways, to transitive closure. As with transitive closure, identifying strongly connected components and transitive edges offers significant gains. However, our problem differs in that the computation can result in new edges being added to the graph and, hence, dynamic algorithms are needed to identify these structures efficiently. Thus, pointer analysis has often been likened to the dynamic transitive closure problem.

Two new algorithms for dynamically maintaining the topological order of a directed graph are presented. The first is a unit change algorithm, meaning the solution must be recomputed immediately following an edge insertion. While this has a marginally inferior worst-case time bound compared with a previous solution, it is far simpler to implement and has fewer restrictions. For these reasons, we find it to be faster in practice and provide an experimental study over random graphs to support this. The second is a batch algorithm, meaning the solution can be updated after several insertions, and it is the first truly dynamic solution to obtain an optimal time bound of O(v + e + b) over a batch b of edge insertions. Again, we provide an experimental study over random graphs comparing this against the standard approach to topological sort.
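To make the unit change setting concrete, the following is a minimal Python sketch of one way to keep a topological order valid as edges are inserted one at a time. It is an illustration under our own assumptions and is not the pseudo-code of the algorithms presented in this thesis: the class name `DynamicTopoOrder` and its methods are hypothetical, nodes are assumed to be the integers 0..n-1, and no claim is made about matching the time bounds quoted above. The idea shown is that inserting an edge x -> y only invalidates the order when y currently precedes x, and in that case only nodes positioned between y and x need to be searched and reordered.

```python
class DynamicTopoOrder:
    """Hypothetical sketch: maintain a topological order under edge insertions."""

    def __init__(self, n):
        self.succ = [set() for _ in range(n)]  # outgoing edges
        self.pred = [set() for _ in range(n)]  # incoming edges
        self.ord = list(range(n))              # ord[v] = position of node v

    def _reach(self, start, adj, in_range):
        # Iterative depth-first search restricted to nodes accepted by in_range.
        seen = {start}
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in seen and in_range(w):
                    seen.add(w)
                    stack.append(w)
        return seen

    def add_edge(self, x, y):
        """Insert edge x -> y, repairing the order; raise ValueError on a cycle."""
        if x == y:
            raise ValueError("self-loop would create a cycle")
        lb, ub = self.ord[y], self.ord[x]
        if lb < ub:
            # y currently precedes x, so the order is invalidated.
            # Forward search from y, limited to positions up to that of x.
            fwd = self._reach(y, self.succ, lambda w: self.ord[w] <= ub)
            if x in fwd:
                raise ValueError("edge would create a cycle")
            # Backward search from x, limited to positions after that of y.
            bwd = self._reach(x, self.pred, lambda w: self.ord[w] > lb)
            # Reuse the positions already held by the affected nodes, placing
            # everything that reaches x before everything reachable from y.
            by_pos = self.ord.__getitem__
            nodes = sorted(bwd, key=by_pos) + sorted(fwd, key=by_pos)
            slots = sorted(self.ord[v] for v in nodes)
            for v, slot in zip(nodes, slots):
                self.ord[v] = slot
        self.succ[x].add(y)
        self.pred[y].add(x)

    def order(self):
        """Return the nodes listed in their current topological order."""
        return sorted(range(len(self.ord)), key=self.ord.__getitem__)


if __name__ == "__main__":
    g = DynamicTopoOrder(5)
    g.add_edge(3, 1)   # violates the initial order, so nodes are shifted
    g.add_edge(1, 0)
    print(g.order())
```

An insertion that already respects the current order costs only the two set updates, which is why a search bounded by the positions of x and y can be attractive compared with re-running a full topological sort after every insertion.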
Furthermore, we demonstrate how both algorithms can be extended to the problem of dynamically detecting strongly connected components (i.e. cycles), thus achieving the first solutions which do not need to traverse the entire graph for half of all edge insertions. Several other new techniques for improving pointer analysis are also presented. These include difference propagation, which avoids redundant work by tracking changes in the points-to sets, and a novel approach to field-sensitive analysis of C. Finally, a detailed study of numerous solving algorithms, evaluating our techniques and algorithms against previous work, is contained herein. Our benchmark suite consists of many common C programs ranging in size from 15,000 to 200,000 lines of code.

Acknowledgements

I am grateful to my supervisor Paul Kelly for his guidance throughout this work and for having the courage to let me develop my own directions. He has always supported my work through helpful advice, astute criticism and stimulating conversation. He also encouraged me to undertake internships at Bell Labs and IBM Hursley. For these things, I thank him.

Many other people have been helpful to me throughout my time at Imperial College. My second supervisor, Chris Hankin, has provided many excellent comments and suggestions on my work. His depth of knowledge on program analysis has also been invaluable. I would also like to thank Oskar Mencer, who has always given an interesting and alternate viewpoint on life, and those members of the Software Performance Group, in particular Olav Beckmann and Kwok Cheung Yeung, for many interesting and delightful discussions.

To my parents I am, of course, indebted for giving me such an excellent start in life. They encouraged my interest in computers from an early age and have provided both moral and financial support throughout the years.
I must also thank the Engineering and Physical Sciences Research Council (EPSRC), without whose financial support I could not have done this work. I would also like to thank my examiners, Andy King and Mark Harman, for their excellent and helpful comments and their general appreciation of my work. Lastly, but by no means least, I must thank my partner Melika King for her love and patience throughout the final and most testing years of my work.

Contents

1 Introduction
  1.1 Applications
  1.2 Contributions
  1.3 Thesis Organisation
2 Constraint-Based Pointer Analysis
  2.1 Solving the Analysis
    2.1.1 Set Implementation
  2.2 Extending the Basic Model
    2.2.1 Context-Sensitivity
    2.2.2 Flow-Sensitivity
    2.2.3 Field-Sensitivity
    2.2.4 The Heap
    2.2.5 Arrays, Conditionals and Loops
    2.2.6 Metrics
    2.2.7 Concluding Remarks
  2.3 Alternative Approaches to Pointer Analysis
    2.3.1 Abstract Interpretation
    2.3.2 Unification
  2.4 Concluding Remarks
3 Dynamic Topological Order
  3.1 Background
    3.1.1 The Complexity Parameter δxy
    3.1.2 The MNR Algorithm
    3.1.3 The AHRSZ Algorithm
  3.2 Algorithm PTO1
  3.3 Algorithm PTO2
  3.4 Experimental Study
    3.4.1 Generating a Random DAG
    3.4.2 Experimental Procedure
    3.4.3 Single Insertion Experiments
    3.4.4 Experiment 2 - Batch Insertions
  3.5 Dynamic Strongly Connected Components
  3.6 Concluding Remarks
4 Efficient Pointer Analysis
  4.1 Worklist Solvers
    4.1.1 Background
    4.1.2 Algorithm PW1, a Simple Worklist Solver
    4.1.3 Algorithm PWD, a Difference Propagation Solver
    4.1.4 Experimental Study
  4.2 Beyond the Worklist
    4.2.1 Algorithm PW2
    4.2.2 The Heintze-Tardieu Algorithm
    4.2.3 Experimental Study
  4.3 Concluding Remarks
5 Field-Sensitive Pointer Analysis
  5.1 Indirect Function Calls
  5.2 Field-Sensitive Pointer Analysis
  5.3 Experimental Study
  5.4 Related Work
    5.4.1 Field-Based Pointer Analysis
  5.5 Concluding Remarks
6 Conclusions and Future Work
  6.1 Review of Contributions
  6.2 Future Work for the Dynamic Topological Order Problem
    6.2.1 Experiments on Real-World Graphs
    6.2.2 A Bounded Complexity Result for PTO2
    6.2.3 A Batch Variant of PTO1
    6.2.4 Improving PTO1
  6.3 Future Work on Pointer Analysis
    6.3.1 Eliminating Positive Weight Cycles
    6.3.2 Developing the Heintze-Tardieu Algorithm
    6.3.3 Transitive Edges
  6.4 Conclusions
A Relating to Heintze-Aiken Systems
  A.1 Inductive Form
B Strongly Connected Components

List of Figures

2.1 An inference system for flow- and context-insensitive pointer analysis
2.2 An illustration of how get/set methods affect field-sensitivity
2.3 An example of how a dynamic heap model can improve the precision of pointer analysis
2.4 An example showing a pointer analysis formulated using abstract interpretation
2.5 Pseudo-code for a simple worklist solver
2.6 An illustration of how unification avoids revisiting statements
3.1 Algorithm STO, a simple solution to the dynamic topological order problem
3.2 Pseudo-code for algorithm MNR, an existing solution for the (unit change) dynamic topological order problem
3.3 Pseudo-code for algorithm AHRSZ, an optimal solution for the (unit change) dynamic topological order problem
3.4 Pseudo-code for PTO1, a new algorithm for the unit change dynamic topological order problem
3.5 Pseudo-code for PTO2, a novel and unique solution to the batch dynamic topological order problem
3.6 Pseudo-code for our procedure measuring the Average Cost Per Insertion (ACPI) of algorithms for the dynamic topological order problem
3.7 Experimental data illustrating how the Average Cost Per Insertion (ACPI) and certain complexity metrics vary with density for three unit change solutions to the dynamic topological order problem
3.8 Experimental data illustrating how the Average Cost Per Insertion (ACPI) varies with batch size for all five solutions to the dynamic topological order problem
3.9 Pseudo-code demonstrating how the depth-first search component of MNR can be modified to back-propagate component information
3.10 An example showing MSCC, a dynamic algorithm for detecting strongly connected components, in use
3.11 The extended shift procedure for MSCC, a dynamic algorithm for detecting strongly connected components
4.1 Pseudo-code for a standard worklist solver
4.2 Pseudo-code for PW1, an extended worklist algorithm for solving pointer analysis
4.3 Pseudo-code for PWD, an extended worklist algorithm for solving pointer analysis which employs difference propagation
4.4 A chart of our experimental data investigating the effect of iteration strategy on the performance of PW1, a worklist algorithm for solving pointer analysis
4.5 A chart of our experimental data looking at visit count for PW1, a worklist algorithm for solving pointer analysis
4.6 A chart of our experimental data looking at the effect of dynamic cycle detection on the performance of PW1, a worklist algorithm for solving pointer analysis
4.7 A chart of our experimental data looking at the effect of dynamic cycle detection on visit count for PW1, a worklist algorithm for solving pointer analysis
4.8 A chart of our experimental data looking at the effect of difference propagation on the performance of PW1, a worklist algorithm for solving pointer analysis
4.9 A chart of our experimental data looking at the effect of difference propagation on average set size for PW1, a worklist algorithm for solving pointer analysis