1990-Learning Causal Trees from Dependence Information


Learning Causal Trees from Dependence Information*

TECHNICAL REPORT R-149
From: AAAI-90 Proceedings. Copyright ©1990, AAAI (www.aaai.org). All rights reserved.

Dan Geiger ([email protected]) — Northrop Research and Technology Center, One Research Park, Palos Verdes, CA 90274
Azaria Paz ([email protected]) — Computer Science Department, Israel Institute of Technology, Haifa, Israel 32000
Judea Pearl ([email protected]) — Cognitive Systems Lab., University of California, Los Angeles, CA 90024

Abstract

In constructing probabilistic networks from human judgments, we use causal relationships to convey useful patterns of dependencies. The converse task, that of inferring causal relationships from patterns of dependencies, is far less understood. This paper establishes conditions under which the directionality of some interactions can be determined from non-temporal probabilistic information - an essential prerequisite for attributing a causal interpretation to these interactions. An efficient algorithm is developed that, given data generated by an undisclosed causal polytree, recovers the structure of the underlying polytree, as well as the directionality of all its identifiable links.

1 Introduction

The study of causation, because of its pervasive usage in human communication and problem solving, is central to the understanding of human reasoning. All reasoning tasks dealing with changing environments rely heavily on the distinction between cause and effect. For example, a central task in applications such as diagnosis, qualitative physics, plan recognition and language understanding is that of abduction, i.e., finding a satisfactory explanation to a given set of observations, and the meaning of explanation is intimately related to the notion of causation.

Most AI works have given the term "cause" a procedural semantics, attempting to match the way people use it in inference tasks, but were not concerned with what makes people believe that "a causes b", as opposed to, say, "b causes a" or "c causes both a and b" [de Kleer & Brown 78, Simon 54]. An empirical semantics for causation is important for several reasons. First, by formulating the empirical components of causation we gain a better understanding of the meaning conveyed by causal utterances, such as "a explains b", "a suggests b", "a tends to cause b", and "a actually caused b". These utterances are the basic building blocks from which knowledge bases are assembled. Second, any autonomous learning system attempting to build a causal model of its environment cannot rely exclusively on procedural semantics but must be able to translate direct observations to cause and effect relationships.

Temporal precedence is normally assumed essential for defining causation. Suppes [Suppes 70], for example, introduces a probabilistic definition of causation with an explicit requirement that a cause precedes its effect in time. Shoham makes an identical assumption [Shoham 87]. In this paper we propose a non-temporal semantics, one that determines the directionality of causal influence without resorting to temporal information, in the spirit of [Simon 54] and [Glymour et al. 87]. Such semantics should be applicable, therefore, to the organization of concurrent events or events whose chronological precedence cannot be determined empirically. Such situations are common in the behavioral and medical sciences, where we say, for example, that old age explains a certain disability, not the other way around, even though the two occur together (in many cases it is the disability that precedes old age).

Another feature of our formulation is the appeal to probabilistic dependence, as opposed to functional or deterministic dependence. This is motivated by the observation that most causal connections found in natural discourse, for example "reckless driving causes accidents", are probabilistic in nature [Spohn 90]. Given that statistical analysis cannot distinguish causation from covariation, we must still identify the asymmetries that prompt people to perceive causal structures in empirical data, and we must find a computational model for such perception.

Our attack on the problem is as follows: first, we pretend that Nature possesses "true" cause and effect relationships and that these relationships can be represented by a causal network, namely, a directed acyclic graph where each node represents a variable in the domain and the parents of that node correspond to its direct causes, as designated by Nature. Next, we assume that Nature selects a joint distribution over the variables in such a way that direct causes of a variable render this variable conditionally independent of all other variables except its consequences. Nature permits scientists to observe the distribution and ask questions about its properties, but hides the underlying causal network. We investigate the feasibility of recovering the network's topology efficiently and uniquely from the joint distribution.

This formulation contains several simplifications of the actual task of scientific discovery. It assumes, for example, that scientists obtain the distribution, rather than events sampled from the distribution. This assumption is justified when a large sample is available, sufficient to reveal all the dependencies embedded in the distribution. Additionally, it assumes that all relevant variables are measurable, and this prevents us from distinguishing between spurious correlations [Simon 54] and genuine causes, a distinction that is impossible within the confines of a closed world assumption. Computationally, however, solving this simplified problem is an essential component in any attempt to deduce causal relationships from measurements, and that is the main concern of this paper.

It is not hard to see that if Nature were to assign totally arbitrary probabilities to the links, then some distributions would not enable us to uncover the structure of the network. However, by employing additional restrictions on the available distributions, expressing properties we normally attribute to causal relationships, some structure could be recovered. The basic requirement is that two independent causes should become dependent once their effect is known [Pearl 88]. For example, two independent inputs of an AND gate become dependent once the output is measured. This observation is phrased axiomatically by a property called Marginal Weak Transitivity (Eq. 9 below). This property tells us that if two variables x and y are mutually independent, and each is dependent on their effect c, then x and y are conditionally dependent for at least one instance of c. Two additional properties of independence, intersection and composition (Eqs. 7 and 8 below), are found useful. Intersection is guaranteed if the distributions are strictly positive and is justified by the assumption that, to some extent, all observations are corrupted by noise. Composition is a property enforced, for example, by multivariate normal distributions, stating that two sets of variables X and Y are independent iff every x ∈ X and y ∈ Y are independent. In common discourse this property is often associated with the notion of "independence", yet it is not enforced by all distributions.

The theory to be developed in the rest of the paper addresses the following problem: what properties of P allow the recovery of D? It is shown that intersection, composition, and marginal weak transitivity are sufficient properties to ensure that the dag is uniquely recoverable (up to isomorphism) in polynomial time. The recovery algorithm developed considerably generalizes the method of Rebane and Pearl for the same task, as it does not assume the distribution to be dag-isomorph [Pearl 88, Chapter 8]. The algorithm implies, for example, that the assumption of a multivariate normal distribution is sufficient for a complete recovery of singly-connected dags.

2 Probabilistic Dependence: Background and Definitions

Our model of an environment consists of a finite set of variables U and a distribution P over these variables. Variables in a medical domain, for example, represent entities such as "cold", "headache", "fever". Each variable has a domain, which is a set of permissible values. The sample space of the distribution is the Cartesian product of all domains of the variables in U. An environment can be represented graphically by an acyclic directed graph (dag) as follows: we select a linear order on all variables in U. Each variable is represented by a node. The parents of a node v correspond to a minimal set of variables that make v conditionally independent of all lesser variables in the selected order. Each ordering may produce a different graph; for example, one representation of the three variables above is the chain headache ← cold → fever, which is produced by the order cold, headache, fever (assuming fever and headache are independent symptoms of a cold). Another ordering of these variables, fever, headache, cold, would yield the dag headache → cold ← fever with an additional arc between fever and headache. Notice that the directionality of links may differ between alternative representations. In the first graph directionality matches our perception of cause-effect relationships, while in the second it does not, being merely a spurious by-product of the ordering chosen for the construction. We shall see that, despite the arbitrariness in choosing the construction ordering, some directions will be preferred to others, and these can be determined mechanically.

The basis for differentiating alternative representations is the dependence relationships encoded in the distribution describing the environment. We regard a distribution as a dependency model, capable of answering queries of the form "Are X and Y independent given Z?", and use these answers to select among possible representations. The following definitions and theorems provide the ground for a precise formulation of the problem.

*This work was supported in part by National Science Foundation Grant #IRI-8821444.

770 MACHINE LEARNING
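The orientation criterion at the heart of the paper — two marginally independent causes become dependent once their common effect is observed — can be checked mechanically from a joint distribution. Below is a minimal Python sketch of that check, not the paper's recovery algorithm: the AND-gate toy distribution and all names (`make_joint`, `prob`, `independent`) are illustrative assumptions. It answers dependency-model queries of the form "Are i and j independent given k?" by brute-force summation over an explicitly enumerated joint distribution.

```python
from itertools import product

def make_joint():
    """Joint P(a, b, c): a and b are independent fair coins, c = a AND b."""
    p = {}
    for a, b in product((0, 1), repeat=2):
        p[(a, b, a & b)] = p.get((a, b, a & b), 0.0) + 0.25
    return p

def prob(p, fixed):
    """Probability of the event {variable index: value}."""
    return sum(pr for outcome, pr in p.items()
               if all(outcome[i] == v for i, v in fixed.items()))

def independent(p, i, j, given=None, tol=1e-9):
    """Answer the query 'Are variables i and j independent (given `given`)?'."""
    if given is None:
        return all(abs(prob(p, {i: u, j: v}) - prob(p, {i: u}) * prob(p, {j: v})) <= tol
                   for u in (0, 1) for v in (0, 1))
    for g in (0, 1):                 # check every instance of the conditioning variable
        pg = prob(p, {given: g})
        if pg <= tol:
            continue
        for u in (0, 1):
            for v in (0, 1):
                lhs = prob(p, {i: u, j: v, given: g}) / pg
                rhs = (prob(p, {i: u, given: g}) / pg) * (prob(p, {j: v, given: g}) / pg)
                if abs(lhs - rhs) > tol:
                    return False     # dependent for at least one instance of `given`
    return True

p = make_joint()
A, B, C = 0, 1, 2                    # indices of the three variables in each outcome tuple
print(independent(p, A, B))          # True: the two causes are marginally independent
print(independent(p, A, B, given=C)) # False: measuring the AND output couples them
```

In this toy case the induced dependence shows up for the instance c = 0, matching the "at least one instance of c" form of marginal weak transitivity; orienting a → c ← b from exactly this pattern of query answers is the kind of mechanical step a recovery procedure can build on.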