Geometric Combinatorics and RNA Folding Algorithms

Total Page:16

File Type:pdf, Size:1020Kb

Geometric Combinatorics and RNA Folding Algorithms Geometric combinatorics and RNA folding algorithms Qijun He (Clemson University) Joint work with: Christine Heitsch (Georgia Tech) Svetlana Poznanovikj∗ (Clemson) Andrew Gainer-Dewar∗ (UConn Health Center) Elizabeth Drellich (North Texas) Heather Harrington (Oxford) Mathematics Research Communities (MRC) 2014 Snowbird, Utah ACSB Conference University of Connecticut Health Center May 23, 2015 Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms Introduction I am a 4th-year PhD student at Clemson University, interested in combinatorics, computational algebra, and mathematical biology. Current research projects Applying geometric combinatorics (polytopes) and parametric inference to RNA folding. [What I'll talk about today] With MRC 2014 research group led by Christine Heitsch & Svetlana Poznanovikj. k-canalizing Boolean functions. With Matthew Macauley & Elena Dimitrova. [See my poster!] Data identification using Gr¨obnernormal forms. With Elena Dimitrova & Brandy Stigler. Advertisments I am co-organizing (with Raina Robeva & Andy Jenkins) a mini-symposium on discrete and algebraic biology at the 2015 Biomathematics and Ecology Education and Research (BEER) Symposium in October 2015. I am the organizing chair of the 12th Graduate Students Combinatorics Conference which will be at Clemson in March 2016! I will be on the job market next year. Know of anyone looking for a postdoc? Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms What is RNA? RNA( Ribonucleic acid) are biological molecules built from strings of nucleotides. adenine (A), guanine (G), cytosine (C), thymine (T), uracil (U) RNA strands consist of A, C, G, and U. Combinatorially, an RNA strand is a length-n sequence, over the alphabet fA; C; G; Ug. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms RNA sequences via base pairings Primary sequence −! Secondary structure −! 3D molecule RNA secondary structures balance energetically favorable helices (consecutive base pairs) against destabilizing loops (single-stranded bases). Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms Secondary structure & pseudoknots Here are two folds of the same RNA strand, and the corresponding arc diagrams. The first is a secondary structure and the second is a pseudoknot. 50-end G G G A C C U C C C C C C U A 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 A G G G G G 30-end G G G A C C U U C C C C C C A A G G G G G G G G G 50-end G G G A C C U U C C C C C C A 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 G A G G G G G G 30-end G G G A C C U U C C C C C C A A G G G G G G G Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms Secondary structure prediction The problem For a given sequence, there are many possible secondary structures into which it can fold. What is the most `likely' one? Minimal free energy (mfe) model The optimal secondary structure minimizes the free energy, ∆G. Example energy model Given an RNA sequence S = b1b2 ··· bn, let 8 −3 fbi ; bj g = fC; Gg and i ≤ j − 4 > <>−2 fb ; b g = fA; Ug and i ≤ j − 4 δg(i; j) = i j >−1 fbi ; bj g = fG; Ug and i ≤ j − 4 > :0 otherwise: be the free energy of the potential bond between bi and bj . Find the structure that minimizes ∆G, the sum of the energies of the base pairs. This can be done using dynamic programming (DP) to recurse on the substructures. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms MFE folding: toy example There are 4 ways to recurse on the substructure Si;j = bi bi+1 ··· bj−1bj . These correspond to the following 4 cases about how bases bi and bj can bond: ◦◦◦ ::: ◦◦◦ ◦◦:::◦◦◦ ::: ◦◦ ◦◦ ::: ◦◦::: ◦◦◦ ◦◦ ::: ◦◦:::◦::: ◦◦::: ◦◦ i j i i +1 k j−1 j i i +1 k j−1 j i i +1 ki k kj j−1 j Thus the optimal energy score ∆G(i; j) of subsequence Si;j is given by: 8 >∆G(i + 1; j − 1) + δg(i; j) > <>∆G(i + 1; j) ∆G(i; j) = min ∆G(i; j − 1) > > min ∆G(i; k) + ∆G(k + 1; j): :i<k<j Our final goal is to compute S1;n. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms A toy example: S = GGGACCUUCC j G G G A C C U U C C G G G A C C U U C C G 0 0 0 0 −3 G 0 0 0 0 −3−3−4−4−6−8 G 0 0 0 0 −3 G 0 0 0 0 −3−3−3−5−8 G 0 0 0 0 −1 G 0 0 0 0 −1−2−5−5 A 0 0 0 0 −2 A 0 0 0 0 −2−2−2 C 0 0 0 0 0 C 0 0 0 0 0 0 i C 0 0 0 0 0 C 0 0 0 0 0 U 0 0 0 0 U 0 0 0 0 U 0 0 0 U 0 0 0 C 0 0 C 0 0 C 0 C 0 Figure: Recording the optimal scores in a table during a DP routine. G G G A C C U U C C ∆G(S) = −8 Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms MFE folding: toy example We can use the language from OR to describe RNA folding folding problems. In our toy example of S = GGGACCUUCC, the problem can be rephrased as: min ∆G = −3k1 − 2k2 − k3; such that, there exist a `valid' structure T on S with k1 CG pairs, k2 AU pairs, k3 GU pairs. We know little about the feasible region of this optimization problem, but we know it is finite. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms NNTM and GTfold Nearest neighbor thermodynamic model: linear objective function with over 7,000 parameters (Turner99 database)! GTfold: DP algorithm based on NNTM (using Turner99 database) to minimize the free energy and provide optimal RNA folding structure. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms NNTM and GTfold DP solves discrete optimization efficiently, but quality of free energy approximation by the NNTM objective function varies widely. What might go wrong? Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms Multibranch loop parameters For computational efficiency, only3 parameters are used to govern multibranch loops. Even worse: they are almost purely `made up'. Multibranch loop parameters a: energy penalty for a multibranch loop. b: energy for each unpaired nucleotide in a multibranch loop. c: energy for each branching helix in a multibranch loop. Question How do multibranch loop parameters (a; b; c) affect the optimal structure? Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms Multibranch loop parameters We want to run parametric analysis on (a; b; c), but we don't know how `wrong' they are. Answer? We need to construct the convex hull (a polytope) of the feasible region. The optimal value should exist on an extreme point of the feasible region (a vertex of this polytope). The real problem The dimension of this polytope is over 7,000 and we know little about the feasible region. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms Dimension reduction For a given structure T , we can write its free energy as: ∆G(T ) = axT + byT + czT + wT ; where xT : number of multibranch loops in T , yT : number of unpaired nucleotides in multibranch loops in T , zT : number of branching helices in multibranch loops in T , wT : energy of the remaining structures using Turner99 parameters. The profile space is (xT ; yT ; zT ; wT ), and we introduce a dummy variable d: ∆G(T ) = axT + byT + czT + dwT : Question How do multibranch loop parameters (a; b; c; d) affect the optimal structure? The real problem How to compute the convex hull of an implicit finite feasible region? Answer! Let GTfold do it for us! Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method iB4e: given a solver for maximization of linear objective functions over an implicit finite feasible region, iB4e builds the convex hull of the feasible region incrementally, by systematically finding new vertices and facets. Step 1: find the affine hull of the feasible region. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 1: find the affine hull of the feasible region. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 1: find the affine hull of the feasible region. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 1: find the affine hull of the feasible region. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 2: build the polytope incrementally. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 2: build the polytope incrementally. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 2: build the polytope incrementally. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 2: build the polytope incrementally. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 2: build the polytope incrementally. Qijun He (Clemson) Geometric combinatorics and RNA folding algorithms iB4e and beneath-beyond method Step 2: build the polytope incrementally.
Recommended publications
  • A Logical Approach to Asymptotic Combinatorics I. First Order Properties
    View metadata, citation and similar papers at core.ac.uk brought to you by CORE provided by Elsevier - Publisher Connector ADVANCES IN MATI-EMATlCS 65, 65-96 (1987) A Logical Approach to Asymptotic Combinatorics I. First Order Properties KEVIN J. COMPTON ’ Wesleyan University, Middletown, Connecticut 06457 INTRODUCTION We shall present a general framework for dealing with an extensive set of problems from asymptotic combinatorics; this framework provides methods for determining the probability that a large, finite structure, ran- domly chosen from a given class, will have a given property. Our main concern is the asymptotic probability: the limiting value as the size of the structure increases. For example, a common problem in elementary probability texts is to show that the asymptotic probabity that a per- mutation will have no fixed point is l/e. We shall say nothing about the closely related problem of determining rates of convergence, although the methods presented here may extend to such problems. To develop a general approach we must fix a language for specifying properties of structures. Thus, our approach is logical; logic is the branch of mathematics that deals with problems of language. In this paper we con- sider properties expressible in the language of first order logic and speak of probabilities of first order sentences rather than properties. In the sequel to this paper we consider properties expressible in the more general language of monadic second order logic. Clearly, we must restrict the classes of structures we consider in order for questions about asymptotic probabilities to be meaningful and significant. Therefore, we choose to consider only classes closed under disjoint unions and components (see Sect.
    [Show full text]
  • Interactions of Computational Complexity Theory and Mathematics
    Interactions of Computational Complexity Theory and Mathematics Avi Wigderson October 22, 2017 Abstract [This paper is a (self contained) chapter in a new book on computational complexity theory, called Mathematics and Computation, whose draft is available at https://www.math.ias.edu/avi/book]. We survey some concrete interaction areas between computational complexity theory and different fields of mathematics. We hope to demonstrate here that hardly any area of modern mathematics is untouched by the computational connection (which in some cases is completely natural and in others may seem quite surprising). In my view, the breadth, depth, beauty and novelty of these connections is inspiring, and speaks to a great potential of future interactions (which indeed, are quickly expanding). We aim for variety. We give short, simple descriptions (without proofs or much technical detail) of ideas, motivations, results and connections; this will hopefully entice the reader to dig deeper. Each vignette focuses only on a single topic within a large mathematical filed. We cover the following: • Number Theory: Primality testing • Combinatorial Geometry: Point-line incidences • Operator Theory: The Kadison-Singer problem • Metric Geometry: Distortion of embeddings • Group Theory: Generation and random generation • Statistical Physics: Monte-Carlo Markov chains • Analysis and Probability: Noise stability • Lattice Theory: Short vectors • Invariant Theory: Actions on matrix tuples 1 1 introduction The Theory of Computation (ToC) lays out the mathematical foundations of computer science. I am often asked if ToC is a branch of Mathematics, or of Computer Science. The answer is easy: it is clearly both (and in fact, much more). Ever since Turing's 1936 definition of the Turing machine, we have had a formal mathematical model of computation that enables the rigorous mathematical study of computational tasks, algorithms to solve them, and the resources these require.
    [Show full text]
  • Middleware and Application Management Architecture
    Middleware and Application Management Architecture vorgelegt vom Diplom-Informatiker Sven van der Meer aus Berlin von der Fakultät IV – Elektrotechnik und Informatik der Technischen Universität Berlin zur Erlangung des akademischen Grades Doktor der Ingenieurwissenschaften - Dr.-Ing. - genehmigte Dissertation Promotionsausschuss: Vorsitzender: Prof. Gorlatch Berichter: Prof. Dr. Dr. h.c. Radu Popescu-Zeletin Berichter: Prof. Dr. Kurt Geihs Tag der wissenschaftlichen Aussprache: 25.09.2002 Berlin 2002 D 83 Middleware and Application Management Architecture Sven van der Meer Berlin 2002 Preface Abstract This thesis describes a new approach for the integrated management of distributed networks, services, and applications. The main objective of this approach is the realization of software, systems, and services that address composability, scalability, reliability, and robustness as well as autonomous self-adaptation. It focuses on middleware for management, control, and use of fully distributed resources. The term integra- tion refers to the task of unifying existing instrumentations for middleware and management, not their replacement. The rationale of this work stems from the fact, that the current situation of middleware and management systems can be described with the term interworking. The actually needed integration of management and middleware concepts is still an open issue. However, identified trends in communications and computing demand for integrated concepts rather than the definition of new interworking scenarios. Distributed applications need to be prepared to be used and operated in a stable, secure, and efficient way. Middleware and service platforms are employed to solve this task. To guarantee this objective for a long-time operation, the systems needs to be controlled, administered, and maintained in its entirety, supporting the general aim of the system, and for each individual component, to ensure that each part of the system functions perfectly.
    [Show full text]
  • An Introduction to Ramsey Theory Fast Functions, Infinity, and Metamathematics
    STUDENT MATHEMATICAL LIBRARY Volume 87 An Introduction to Ramsey Theory Fast Functions, Infinity, and Metamathematics Matthew Katz Jan Reimann Mathematics Advanced Study Semesters 10.1090/stml/087 An Introduction to Ramsey Theory STUDENT MATHEMATICAL LIBRARY Volume 87 An Introduction to Ramsey Theory Fast Functions, Infinity, and Metamathematics Matthew Katz Jan Reimann Mathematics Advanced Study Semesters Editorial Board Satyan L. Devadoss John Stillwell (Chair) Rosa Orellana Serge Tabachnikov 2010 Mathematics Subject Classification. Primary 05D10, 03-01, 03E10, 03B10, 03B25, 03D20, 03H15. Jan Reimann was partially supported by NSF Grant DMS-1201263. For additional information and updates on this book, visit www.ams.org/bookpages/stml-87 Library of Congress Cataloging-in-Publication Data Names: Katz, Matthew, 1986– author. | Reimann, Jan, 1971– author. | Pennsylvania State University. Mathematics Advanced Study Semesters. Title: An introduction to Ramsey theory: Fast functions, infinity, and metamathemat- ics / Matthew Katz, Jan Reimann. Description: Providence, Rhode Island: American Mathematical Society, [2018] | Series: Student mathematical library; 87 | “Mathematics Advanced Study Semesters.” | Includes bibliographical references and index. Identifiers: LCCN 2018024651 | ISBN 9781470442903 (alk. paper) Subjects: LCSH: Ramsey theory. | Combinatorial analysis. | AMS: Combinatorics – Extremal combinatorics – Ramsey theory. msc | Mathematical logic and foundations – Instructional exposition (textbooks, tutorial papers, etc.). msc | Mathematical
    [Show full text]
  • Applications and Combinatorics in Algebraic Geometry Frank Sottile Summary
    Applications and Combinatorics in Algebraic Geometry Frank Sottile Summary Algebraic Geometry is a deep and well-established field within pure mathematics that is increasingly finding applications outside of mathematics. These applications in turn are the source of new questions and challenges for the subject. Many applications flow from and contribute to the more combinatorial and computational parts of algebraic geometry, and this often involves real-number or positivity questions. The scientific development of this area devoted to applications of algebraic geometry is facilitated by the sociological development of administrative structures and meetings, and by the development of human resources through the training and education of younger researchers. One goal of this project is to deepen the dialog between algebraic geometry and its appli- cations. This will be accomplished by supporting the research of Sottile in applications of algebraic geometry and in its application-friendly areas of combinatorial and computational algebraic geometry. It will be accomplished in a completely different way by supporting Sottile’s activities as an officer within SIAM and as an organizer of scientific meetings. Yet a third way to accomplish this goal will be through Sottile’s training and mentoring of graduate students, postdocs, and junior collaborators. The intellectual merits of this project include the development of applications of al- gebraic geometry and of combinatorial and combinatorial aspects of algebraic geometry. Specifically, Sottile will work to develop the theory and properties of orbitopes from the perspective of convex algebraic geometry, continue to investigate linear precision in geo- metric modeling, and apply the quantum Schubert calculus to linear systems theory.
    [Show full text]
  • Combinatorics and Graph Theory I (Math 688)
    Combinatorics and Graph Theory I (Math 688). Problems and Solutions. May 17, 2006 PREFACE Most of the problems in this document are the problems suggested as home- work in a graduate course Combinatorics and Graph Theory I (Math 688) taught by me at the University of Delaware in Fall, 2000. Later I added several more problems and solutions. Most of the solutions were prepared by me, but some are based on the ones given by students from the class, and from subsequent classes. I tried to acknowledge all those, and I apologize in advance if I missed someone. The course focused on Enumeration and Matching Theory. Many problems related to enumeration are taken from a 1984-85 course given by Herbert S. Wilf at the University of Pennsylvania. I would like to thank all students who took the course. Their friendly criti- cism and questions motivated some of the problems. I am especially thankful to Frank Fiedler and David Kravitz. To Frank – for sharing with me LaTeX files of his solutions and allowing me to use them. I did it several times. To David – for the technical help with the preparation of this version. Hopefully all solutions included in this document are correct, but the whole set is by no means polished. I will appreciate all comments. Please send them to [email protected]. – Felix Lazebnik 1 Problem 1. In how many 4–digit numbers abcd (a, b, c, d are the digits, a 6= 0) (i) a < b < c < d? (ii) a > b > c > d? Solution. (i) Notice that there exists a bijection between the set of our numbers and the set of all 4–subsets of the set {1, 2,..., 9}.
    [Show full text]
  • Combinatorics: a First Encounter
    Combinatorics: a first encounter Darij Grinberg Thursday 10th January, 2019at 9:29pm (unfinished draft!) Contents 1. Preface1 1.1. Acknowledgments . .4 2. What is combinatorics?4 2.1. Notations and conventions . .4 1. Preface These notes (which are work in progress and will remain so for the foreseeable fu- ture) are meant as an introduction to combinatorics – the mathematical discipline that studies finite sets (roughly speaking). When finished, they will cover topics such as binomial coefficients, the principles of enumeration, permutations, parti- tions and graphs. The emphasis falls on enumerative combinatorics, meaning the art of computing sizes of finite sets (“counting”), and graph theory. I have tried to keep the presentation as self-contained and elementary as possible. The reader is assumed to be familiar with some basics such as induction proofs, equivalence relations and summation signs, as well as have some experience with mathematical proofs. One of the best places to catch up on these basics and to gain said experience is the MIT lecture notes [LeLeMe16] (particularly their first five chapters). Two other resources to familiarize oneself with proofs are [Hammac15] and [Day16]. Generally, most good books about “reading and writing mathemat- ics” or “introductions to abstract mathematics” should convey these skills, although the extent to which they actually do so may differ. These notes are accompanying two classes on combinatorics (Math 4707 and 4990) I am giving at the University of Minneapolis in Fall 2017. Here is a (subjective and somewhat random) list of recommended texts on the kinds of combinatorics that will be considered in these notes: • Enumerative combinatorics (aka counting): 1 Notes on graph theory (Thursday 10th January, 2019, 9:29pm) page 2 – The very basics of the subject can be found in [LeLeMe16, Chapters 14– 15].
    [Show full text]
  • Teaching Discrete Mathematics, Combinatorics, Geometry, Number Theory, (Or Anything) from Primary Historical Sources David Pengelley New Mexico State University
    Teaching Discrete Mathematics, Combinatorics, Geometry, Number Theory, (or Anything) from Primary Historical Sources David Pengelley New Mexico State University Course Overview Here I describe teaching four standard courses in the undergraduate mathematics curriculum either entirely or largely from primary historical sources: lower division discrete mathematics (introduction to proofs) and introductory upper division combinatorics, geometry, and number theory. They are simply illustrations of my larger dream that all mathematics be taught directly from primary sources [16]. While the content is always focused on the mathematics, the approach is entirely through interconnected primary historical sources, so students learn a great deal of history at the same time. Primary historical sources can have many learning advantages [5,6,13], one of which is often to raise as many questions as they answer, unlike most textbooks, which are frequently closed-ended, presenting a finished rather than open- ended picture. For some time I have aimed toward jettisoning course textbooks in favor of courses based entirely on primary historical sources. The goal in total or partial study of primary historical sources is to study the original expositions and proofs of results in the words of the discoverers, in order to understand the most authentic possible picture of the mathematics. It is exciting, challenging, and can be extremely illuminating to read original works in which great mathematical ideas were first revealed. We aim for lively discussion involving everyone. I began teaching with primary historical sources in specially created courses (see my two other contributions to this volume), which then stimulated designing student projects based on primary sources for several standard college courses [1,2,3,7], all collaborations with support from the National Science Foundation.
    [Show full text]
  • Combinatorics and Counting Per Alexandersson 2 P
    Combinatorics and counting Per Alexandersson 2 p. alexandersson Introduction Here is a collection of counting problems. Questions and suggestions Version: 15th February 2021,22:26 are welcome at [email protected]. Exclusive vs. independent choice. Recall that we add the counts for exclusive situations, and multiply the counts for independent situations. For example, the possible outcomes of a dice throw are exclusive: (Sides of a dice) = (Even sides) + (Odd sides) The different outcomes of selecting a playing card in a deck of cards can be seen as a combination of independent choices: (Different cards) = (Choice of color) · (Choice of value). Labeled vs. unlabeled sets is a common cause for confusion. Consider the following two problems: • Count the number of ways to choose 2 people among 4 people. • Count the number of ways to partition 4 people into sets of size 2. In the first example, it is understood that the set of chosen people is a special set | it is the chosen set. We choose two people, and the other two are not chosen. In the second example, there is no difference between the two couples. The answer to the first question is therefore 4 , counting the chosen subsets: {12, 13, 14, 23, 24, 34}. 2 The answer to the second question is 1 4 , counting the partitions: {12|34, 13|24, 14|23}. 2! 2 That is, the issue is that there is no way to distinguish the two sets in the partition. However, now consider the following two problems: • Count the number of ways to choose 2 people among 5 people.
    [Show full text]
  • Chapter 4 Sets, Combinatorics, and Probability Section
    Chapter 4 Sets, Combinatorics, and Probability Section 4.1: Sets CS 130 – Discrete Structures Set Theory • A set is a collection of elements or objects, and an element is a member of a set – Traditionally, sets are described by capital letters, and elements by lower case letters • The symbol means belongs to: – Used to represent the fact that an element belongs to a particular set. – aA means that element a belongs to set A – bA implies that b is not an element of A • Braces {} are used to indicate a set • Example: A = {2, 4, 6, 8, 10} – 3A and 2A CS 130 – Discrete Structures 2 Continue on Sets • Ordering is not imposed on the set elements and listing elements twice or more is redundant • Two sets are equal if and only if they contain the same elements – A = B means (x)[(xA xB) Λ (xB xA)] • Finite and infinite set: described by number of elements – Members of infinite sets cannot be listed but a pattern for listing elements could be indicated CS 130 – Discrete Structures 3 Various Ways To Describe A Set • The set S of all positive even integers: – List (or partially list) its elements: • S = {2, 4, 6, 8, …} – Use recursion to describe how to generate the set elements: • 2 S, if n S, then (n+2) S – Describe a property P that characterizes the set elements • in words: S = {x | x is a positive even integer} • Use predicates: S = {x P(x)} means (x)[(xS P(x)) Λ (P(x) xS)] where P is the unary predicate.
    [Show full text]
  • Integrating Glideinwms and Corral
    CorralWMS: Integrating glideinWMS and Corral A collaboration between Fermilab, UCSD, and USC ISI Krista Larson Fermi National Laboratory [email protected] Mats Rynge USC Information Sciences Institute [email protected] Condor Week 2010 Overview • Grid computing • Pilot based workload management systems • glideinWMS – On-demand glidein provisioner used by CMS (and others) • Corral – Static glidein provisioner for Pegasus workflows • CorralWMS CorralWMS: Integrating glideinWMS and Corral Condor Week 2010 Grid Computing •Combines distributed computing resources from multiple administrative domains •User has access to a large pool of resources, but Middleware has problems - managing jobs - Monitoring jobs is complicated Heterogeneous grid resources can - cause issues - Queueing and scheduling delays Software overheads and scheduling - policies CorralWMS: Integrating glideinWMS and Corral Condor Week 2010 Pilot Based Workload Management Systems • Pilot generator submits pilots to the grid sites • Pilots start running on the compute resources - Pilot can run several checks - Hides some diversity of grid resources - Overlays personal cluster on top of the grid • Pilots fetch user jobs from a scheduler and execute • Issues with scalability - Central queue can be resource intensive - Security handshake can be expensive CorralWMS: Integrating glideinWMS and Corral Condor Week 2010 glideinWMS • glideinWMS a thin layer on top of Condor • Uses glideins (i.e. pilot jobs) - a glidein is a Condor Startd submitted as a grid job • All network traffic authenticated
    [Show full text]
  • What Is Combinatorics? What Is Optimization? Department of Combinatorics and Optimization University of Waterloo, Ontario, Canada
    What is Combinatorics? What is Optimization? Department of Combinatorics and Optimization University of Waterloo, Ontario, Canada http://www.math.uwaterloo.ca 1. Introduction The department of C&O o¤ers programs in these two areas. These pages tells you a little bit about them. In industry, commerce, govern- ment, indeed in all walks of life, one frequently needs answers to ques- tions concerning operational e¢ciency. Thus an architect may need to know how to lay out a factory ‡oor so that the article being manufac- tured does not have to be moved about too much as it goes from one machine tool to another; the manager of a shipping company needs to plan the itineraries of his ships so as to increase the amount of goods handled, while avoiding costly waiting-around in docks. A telecom- munications engineer may want to know how best to transmit signals so as to minimize the possibility of error on reception. Further exam- ples of problems of this sort are provided by the planning of a railroad time-table to ensure that trains are available as and when needed, the synchronization of tra¢c lights, and many other real-life situations. Formerly such problems would usually be “solved” by imprecise methods giving results that were both unreliable and costly. Today, they are increasingly being subjected to rigid mathematical analysis, designed to provide methods for …nding exact solutions or highly reli- able estimates rather than vague approximations. Combinatorics and optimization provide many of the mathematical tools used for solving such problems. Combinatorics is the mathematics of discretely structured problems.
    [Show full text]