Parameterized Complexity Results for Probabilistic Network Structure Learning

Stefan Szeider, Vienna University of Technology, Austria
Workshop on Applications of Parameterized Algorithms and Complexity (APAC 2012), Warwick, UK, July 8, 2012

Joint work with: Serge Gaspers (UNSW, Sydney), Mikko Koivisto (U Helsinki), Mathieu Liedloff (U d'Orléans), Sebastian Ordyniak (Masaryk U Brno)

Papers:
• Algorithms and Complexity Results for Exact Bayesian Structure Learning. Ordyniak and Szeider. Conference on Uncertainty in Artificial Intelligence (UAI 2010).
• An Improved Dynamic Programming Algorithm for Exact Bayesian Network Structure Learning. Ordyniak and Szeider. NIPS Workshop on Discrete Optimization in Machine Learning (DISCML 2011).
• On Finding Optimal Polytrees. Gaspers, Koivisto, Liedloff, Ordyniak, and Szeider. Conference on Artificial Intelligence (AAAI 2012).

Probabilistic Networks
• Bayesian Networks (BNs) were introduced by Judea Pearl in 1985 (2011 Turing Award winner).
• A BN is a DAG D=(V,A) plus probability tables associated with the nodes of the network.
• Besides Bayesian networks, other probabilistic networks have also been considered: Markov Random Fields, Factor Graphs, etc.

Example: A Bayesian Network
DAG: Rain → Sprinkler, Rain → Grass Wet, Sprinkler → Grass Wet, with the following tables (a worked joint-probability entry is sketched after this group of slides):

  Rain:                 T     F
                        0.2   0.8

  Sprinkler | Rain:     Rain  T     F
                        F     0.4   0.6
                        T     0.01  0.99

  Grass Wet | Sprinkler, Rain:
                        Sprinkler  Rain  T     F
                        F          F     0.0   1.0
                        F          T     0.8   0.2
                        T          F     0.9   0.1
                        T          T     0.99  0.01

Applications
• diagnosis
• computational biology
• document classification
• information retrieval
• image processing
• decision support, etc.
[Figure: a BN showing the main pathophysiological relationships in the diagnostic reasoning focused on a suspected pulmonary embolism event]

Computational Problems
• BN Reasoning: given a BN, compute the probability of a variable taking a specific value.
• BN Learning: given a set of sample data, find a BN that fits the data best.
• BN Structure Learning: given sample data, find the best DAG.
• BN Parameter Learning: given sample data and a DAG, find the best probability tables.

BN Learning and Local Scores
[Figure: a BN on nodes n1, ..., n6 shown next to a sample-data matrix of 0/1 entries, with * marking missing values]
Local Score Function: f(n,P) = score of node n with P ⊆ V as its in-neighbors (parents). (A toy implementation of one possible f is sketched below.)
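The example above encodes the joint distribution as the product of its local tables, following the standard BN semantics. As a worked entry, using the table values shown (the choice of entry is ours; the slide only shows the tables):

```latex
% Joint factorization encoded by the DAG with arcs
% Rain -> Sprinkler, Rain -> Grass Wet, Sprinkler -> Grass Wet:
%   P(G, S, R) = P(G | S, R) * P(S | R) * P(R)
\[
\Pr(G{=}T, S{=}T, R{=}F)
  = \Pr(G{=}T \mid S{=}T, R{=}F)\cdot\Pr(S{=}T \mid R{=}F)\cdot\Pr(R{=}F)
  = 0.9 \cdot 0.4 \cdot 0.8 = 0.288 .
\]
```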
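To make the local score concrete, here is a minimal sketch of one possible f computed from complete 0/1 sample data: a maximized log-likelihood term with a BIC-style penalty. The function name, the penalty, and the toy data are illustrative assumptions; the papers treat f as an abstract input.

```python
import math
from collections import Counter

def local_score(data, n, parents):
    """Illustrative BIC-style local score f(n, P) for binary data.

    data    -- list of dicts mapping node name to 0/1 (complete records only;
               the slide's data also has missing entries '*', ignored here)
    n       -- the node being scored
    parents -- tuple of parent node names (the set P)
    """
    # Count each (parent configuration, value of n) pair and each
    # parent configuration on its own.
    joint = Counter((tuple(r[p] for p in parents), r[n]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)

    # Maximized log-likelihood of n's column given its parents' columns.
    ll = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())

    # BIC-style penalty: one free parameter per parent configuration.
    penalty = 0.5 * math.log(len(data)) * (2 ** len(parents))
    return ll - penalty

# Toy rows in the spirit of the slide's sample-data matrix.
data = [{"n1": 0, "n2": 1, "n3": 0}, {"n1": 1, "n2": 1, "n3": 0},
        {"n1": 1, "n2": 0, "n3": 1}, {"n1": 0, "n2": 1, "n3": 1}]
print(local_score(data, "n3", ("n1", "n2")))  # f(n3, {n1, n2})
```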
Our Combinatorial Model
• BN Structure Learning:
  Input: a set N of nodes and a local score function f (given by an explicit listing of its nonzero tuples).
  Task: find a DAG D=(N,A) such that the sum of f(n, P_D(n)) over all nodes n is maximum, where P_D(n) is the set of parents of n in D.
  (An exact dynamic-programming sketch for this model follows the poster summary below.)

Poster: An Improved Dynamic Programming Algorithm for Exact Bayesian Network Structure Learning
Sebastian Ordyniak and Stefan Szeider
3rd Workshop on Discrete Optimization in Machine Learning (DISCML'11), Sierra Nevada, Spain, Dec 17, 2011

Exact Bayesian Network Structure Learning
Problem: Given a local score function f over a set of nodes N, find a Bayesian network (BN) on N whose score is maximum with respect to f.
Exact Bayesian Network Structure Learning is NP-hard.
Heuristic algorithms: first find an undirected graph, then greedily orient the edges, then improve the orientation by local search.

Example:
  n   P         f(n,P)
  a   {b,c,d}   0.5
  b   {a,f}     1
  c   {e}       1
  d   ∅         1
  e   {b}       1
  f   {d}       1
  g   {c,d}     1
[Figure: the local score function f next to an optimal BN for f on nodes a, ..., g]

Super-structure
Idea: Restrict the search to BNs whose skeleton is contained inside an undirected graph (the super-structure).
• A good super-structure contains an optimal solution.
• A super-structure can be efficiently obtained, e.g., using the tiling algorithm.

Dynamic Programming over a Tree Decomposition
Idea: Use a tree decomposition of a super-structure to find an optimal BN via dynamic programming (Ordyniak and Szeider, UAI 2010).
[Figure: a tree decomposition with bags {a,b,c,d}, {b,c,d}, {e,b,c}, {f,b,d}, {g,c,d}]
Method:
• Compactly represent locally optimal (partial) BNs via records.
• Compute the set of all records that represent locally optimal (partial) BNs for each node of the tree decomposition in a bottom-up manner.
Bottleneck: the large number of records!

Dynamic Lower Bound Approach
Idea: Ignore records that do not represent optimal BNs. (A code sketch of this rule follows below.)
Method:
• For every record R, compute a lower bound and an upper bound on the score of any BN represented by R.
• Ignore all records whose upper bound is below the minimum of the lower bounds over all records seen so far.
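Small instances of the combinatorial model above can be solved exactly by dynamic programming over node subsets: any DAG on a node set S has a sink s, so the best score on S is the best score on S minus {s} plus the best parent set for s drawn from S minus {s}. This is the classic "optimal sink" recurrence, standard for exact structure learning but not the tree-decomposition algorithm of the poster; the function name and the dict encoding of f are our choices.

```python
from itertools import combinations

def best_dag_score(nodes, f):
    """Exact optimum of sum_n f(n, parents(n)) over all DAGs on `nodes`.

    f is a dict {(n, frozenset(P)): score} listing the nonzero tuples;
    unlisted parent sets score 0, as in the combinatorial model above.
    """
    nodes = tuple(nodes)

    def best_f(n, allowed):
        # Best score of node n with parents drawn from `allowed`.
        return max([s for (m, P), s in f.items()
                    if m == n and P <= allowed] + [0.0])

    best = {frozenset(): 0.0}  # best[S] = optimal score of a DAG on S
    for size in range(1, len(nodes) + 1):
        for S in map(frozenset, combinations(nodes, size)):
            best[S] = max(best[S - {s}] + best_f(s, S - {s}) for s in S)
    return best[frozenset(nodes)]

# The poster's example scores as reconstructed above.  The optimum is 6.0:
# a wants parent b while b wants parent a, a 2-cycle, so one of the two
# listed scores (here a's 0.5) must be forgone.
f = {("a", frozenset("bcd")): 0.5, ("b", frozenset("af")): 1.0,
     ("c", frozenset("e")): 1.0,   ("d", frozenset()): 1.0,
     ("e", frozenset("b")): 1.0,   ("f", frozenset("d")): 1.0,
     ("g", frozenset("cd")): 1.0}
print(best_dag_score("abcdefg", f))  # 6.0
```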
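The dynamic lower bound rule itself is easy to state in code. A minimal sketch, assuming each record arrives with precomputed bounds as a (lower, upper, payload) triple; the real records of the UAI 2010 algorithm carry more structure than this:

```python
def prune_records(records):
    """Filter records by the poster's dynamic lower bound rule.

    Each record is a (lower_bound, upper_bound, payload) triple whose
    bounds bracket the score of any BN the record represents.  A record
    is ignored when its upper bound lies below the minimum of the lower
    bounds over all records seen (and kept) so far.
    """
    min_lb = float("inf")
    kept = []
    for lb, ub, payload in records:
        if ub < min_lb:
            # Every kept record is guaranteed a better score: discard.
            continue
        kept.append((lb, ub, payload))
        min_lb = min(min_lb, lb)
    return kept

# Hypothetical bounds: the third record's best case (4.0) is already
# beaten by the guaranteed scores of the records kept before it.
records = [(5.0, 9.0, "R1"), (6.0, 7.5, "R2"), (1.0, 4.0, "R3")]
print(prune_records(records))  # [(5.0, 9.0, 'R1'), (6.0, 7.5, 'R2')]
```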
Experimental Results
We compared the algorithm without the dynamic lower bound (no LB) to the algorithm with it (LB) on the Alarm network.

                    running time (s)     memory usage (MB)
  super-structure   no LB     LB         no LB     LB
  S_SKEL            14        7          337       180
  S_TIL(0.01)       20785     6393       15679     5346
  S_TIL(0.05)       46560     16712      38525     18786
  S_TIL(0.1)        44554     16928      38520     18918

• S_SKEL is the skeleton of the Alarm network.
• S_TIL(α) is a super-structure learned with the tiling algorithm for significance level α.

References
• S. Ordyniak, S. Szeider. Algorithms and complexity results for exact Bayesian structure learning. In: P. Grünwald, P. Spirtes (eds.), Proceedings of UAI 2010, The 26th Conference on Uncertainty in Artificial Intelligence, Catalina Island, California, USA, July 8–11, 2010. AUAI Press, Corvallis, Oregon, 2010.
• I. Tsamardinos, L. Brown, C. Aliferis. The max-min hill-climbing Bayesian network structure learning algorithm. Machine Learning 65 (2006) 31–78.
• E. Perrier, S. Imoto, S. Miyano. Finding optimal Bayesian network given a super-structure. Journal of Machine Learning Research 9 (2008) 2251–2286.
