Constructing Liberal and Conservative Supertrees and Exact Solutions for Reduced Consensus Problems Jianrong Dong Iowa State University

Iowa State University Capstones, Theses and Graduate Theses and Dissertations Dissertations 2012 Constructing liberal and conservative supertrees and exact solutions for reduced consensus problems Jianrong Dong Iowa State University Follow this and additional works at: https://lib.dr.iastate.edu/etd Part of the Applied Mathematics Commons, Bioinformatics Commons, and the Computer Sciences Commons Recommended Citation Dong, Jianrong, "Constructing liberal and conservative supertrees and exact solutions for reduced consensus problems" (2012). Graduate Theses and Dissertations. 12315. https://lib.dr.iastate.edu/etd/12315 This Dissertation is brought to you for free and open access by the Iowa State University Capstones, Theses and Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Graduate Theses and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, please contact [email protected]. Constructing liberal and conservative supertrees and exact solutions for reduced consensus problems by Jianrong Dong A dissertation submitted to the graduate faculty in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Major: Computer Science Program of Study Committee: David Fernández-Baca,Major Professor Oliver Eulenstein Giora Slutzki Soma Chaudhuri Stephen Willson Iowa State University Ames, Iowa 2012 Copyright c Jianrong Dong, 2012. All rights reserved. ii DEDICATION I would like to dedicate this dissertation to my wife, Liming, and to our sons, Richard, Steven and Jason, without whose support I would not have been able to complete this work. iii TABLE OF CONTENTS LIST OF TABLES . vii LIST OF FIGURES . viii ACKNOWLEDGEMENTS . ix ABSTRACT . x CHAPTER 1. INTRODUCTION . 1 CHAPTER 2. PRELIMINARIES . 7 CHAPTER 3. REVIEW OF THE LITERATURE . 10 3.1 Consensus Trees . 10 3.2 Supertree Methods Related to the BUILD Algorithm . 11 3.2.1 The BUILD Algorithm . 11 3.2.2 The SUPERB, OneTree, AllTrees, and AllMinTrees Algorithms . 13 3.2.3 MinCut (MC) Supertrees . 14 3.2.4 Modified-MinCut (MMC) Supertrees . 14 3.2.5 BUILD-WITH-DISTANCES . 15 3.3 Matrix Representation-based Methods . 16 3.3.1 Matrix Representation with Parsimony (MRP) . 16 3.3.2 Matrix Representation with Flipping (MRF) . 18 3.3.3 Matrix Representation with Compatibility (MRC) . 18 3.4 Median Supertrees . 19 3.4.1 Majority-rule Supertrees . 19 3.4.2 Followups of Majority-rule Supertrees . 20 iv 3.4.3 Robinson-Foulds Supertrees . 21 3.5 Supertrees with Conservative Properties . 22 3.5.1 PI and PC . 22 3.5.2 PhySIC . 22 3.5.3 PhySIC IST.................................. 23 3.5.4 Conservative Supertrees . 24 3.6 Rogue Taxa . 24 CHAPTER 4. MAJORITY-RULE (+) SUPERTREES . 26 4.1 Definition . 26 4.2 Constructing Optimal Candidate Supertrees . 27 4.3 ILP Formulation . 31 4.3.1 Fill-in Variables . 31 4.3.2 Objective Function Variables . 32 4.3.3 Auxiliary Variables and Constraints . 33 4.3.4 Building Majority-rule (+) Supertree . 37 4.4 A Data Reduction Heuristic . 37 CHAPTER 5. EXPERIMENTAL STUDY FOR MAJORITY-RULE (+) SUPERTREES . 40 5.1 Experiments with the Basic ILP Formulation . 40 5.2 Experiments with the Data Reduction Heuristic . 42 CHAPTER 6. CONSERVATIVE SUPERTREES . 49 6.1 Original Formulations . 49 6.2 Alternative Formulation of Strict Supertrees . 49 6.3 Alternative Formulation of Loose Supertrees . 52 CHAPTER 7. PROPERTY OF CONSERVATIVE SUPERTREES . 56 7.1 Compatibility, Support and No conflict . 56 7.2 Comparisons among Strict, Loose, Majority-rule and Majority-rule (+) Supertrees 57 7.3 NP-hardness Results . 58 v 7.4 Reducible Set and Pull-out Set . 58 7.5 Invariant Under Remeshing . 60 7.6 Lack of Population Invariance . 60 CHAPTER 8. ILP FORMULATION FOR CONSERVATIVE SUPERTREES, THE (+) WAY . 62 8.1 Restricted Spans and Spans of Restricted Supertree . 62 8.2 Characterization of Various Consensus Trees . 63 8.3 Strict and Loose Supertrees via Restricted Spans . 67 8.4 ILP Formulations . 69 8.4.1 Normal Profiles . 69 8.4.2 Profiles with Trivial Trees . 74 8.4.3 Profile with Splits to Avoid . 76 8.4.4 Size of Practically Solvable Models . 76 CHAPTER 9. ILP FORMULATION FOR GOOD SPLITS, The (-) WAY . 77 9.1 Binary Fill-in Vector . 77 9.2 Representations of Input and Induced Subtrees . 78 9.3 Column Identity . 78 9.4 Constraints for a Strict Good Split . 78 9.4.1 Normal Profile . 78 9.4.2 Profiles with Trivial Trees . 79 9.5 Constraints for a Loose Good Split . 80 9.6 Distance Computation . 81 9.7 Objective Function . 82 9.8 Artificial Clusters and Constraints . 82 9.8.1 Constraints after Finding Artificial Clusters . 83 9.8.2 Constraints to Suppress Artificial Clusters . 84 9.9 Number of Variables and Constraints . 85 vi CHAPTER 10. ALGORITHMS . 87 10.1 Divide and Conquer . 87 10.2 Verification . 87 CHAPTER 11. EXPERIMENTAL RESULTS . 93 CHAPTER 12. ILP FOR REDUCED STRICT CONSENSUS TREES . 99 12.1 Binary Matrices for Profile and Split-Tree Relationship . 99 12.2 Variables and Constraints . 100 12.3 Objective Function . 102 12.4 Number of Variables and Constraints . 102 CHAPTER 13. ILP FOR REDUCED LOOSE CONSENSUS TREES . 104 13.1 Binary Matrix for Profile . 104 13.2 Variables and Constraints . 104 13.3 Objective Function . 107 13.4 Number of Variables and Constraints . 107 CHAPTER 14. ILP FOR REDUCED MAJORITY-RULE CONSENSUS TREES 109 14.1 Binary Matrices for Profile and Split-Tree Relationship . 109 14.2 Objective Function . 112 14.3 Number of Variables and Constraints . 112 CHAPTER 15. EXPERIMENTAL RESULTS FOR REDUCED CONSEN- SUS TREES . 114 CHAPTER 16. CONCLUSIONS . 117 BIBLIOGRAPHY . 119 vii LIST OF TABLES Table 5.1 Summary of Experimental Results with the Basic ILP Method . 41 Table 5.2 Results of Drosophila C Analysis Using Data Reduction . 44 Table 5.3 Results of Seabirds Analysis Using Data Reduction . 46 Table 5.4 Support and Conflict for the Seabirds Data Set . 47 Table 11.1 Summary of Experimental Results . 94 Table 15.1 ILP and Heuristic Results for Strict Reduced Consensus Trees . 115 Table 15.2 ILP and Heuristic Results for Majority-rule Reduced Consensus Trees 115 viii LIST OF FIGURES Figure 5.1 Comparing the Seabirds MRP Strict Consensus with the Majority-rule (+) Supertree . 48 Figure 7.1 Pull-out Set . 59 Figure 11.1 Strict and Loose Supertrees for Primates Small Data Set . 96 Figure 11.2 Strict and Loose Supertrees for Primates Large Data Set . 97 Figure 11.3 Strict Consensus of MRP, Majority-rule (+), Strict and Loose Supertrees for Seabirds Data Set . 98 ix ACKNOWLEDGEMENTS I would like to take this opportunity to express my thanks to those who helped me with various aspects of conducting research and the writing of this thesis. First and foremost, Prof. David Fernández-Baca,my mentor, for his guidance, patience and support throughout all these years, not only in the research and the writing of this thesis, but also in the publication of various papers and my career design and development. His knowledge, insights, judgment, planning and word of encouragement have often inspired me to think differently, study more on these interesting problems and publish more. I would also like to thank my committee members for their efforts and contributions to this work: Prof. Oliver Eulenstein, Prof. Giora Slutzki, Prof. Soma Chaudhuri and Prof. Stephen Willson. I would additionally like to thank Prof. F. R. McMorris and Dr. Mark Wilkinson for their seminal works to include in their formulations such nice properties that enable us to tackle large problems. x ABSTRACT This thesis studies two different approaches to extracting information from collections of phylogenetic.

Constructing Liberal and Conservative Supertrees and Exact Solutions for Reduced Consensus Problems Jianrong Dong Iowa State University

An Introduction to Lemurs for Teachers and Educators

Ecosystem Profile Madagascar and Indian

Fossil Lemur from Northern Madagascar (Palaeopropithecidae/Primate Evolution/Postcranium) WILLIAM L

For Peer Review

Fossil Primates

Duke University Dissertation Template

New Extinct Lemur Species Discovered in Madagascar 26 May 2009

Harrison CV June 2021

Ancient DNA from Giant Extinct Lemurs Verifies Single Origin of Malagasy

Subfossil Lemurs of Madagascar

Understanding Primate Mandibular Symphyseal Fusion: Function, Integration, and Ontogeny

Reconstruction of an Extraordinary Extinct Primate from Madagascar