R Programming for Bioinformatics

Total Page:16

File Type:pdf, Size:1020Kb

R Programming for Bioinformatics R Programming for Bioinformatics C6367_FM.indd 1 6/11/08 3:08:07 PM Chapman & Hall/CRC Computer Science and Data Analysis Series The interface between the computer and statistical sciences is increasing, as each discipline seeks to harness the power and resources of the other. This series aims to foster the integration between the computer sciences and statistical, numerical, and probabilistic methods by publishing a broad range of reference works, textbooks, and handbooks. SERIES EDITORS David Blei, Princeton University David Madigan, Rutgers University Marina Meila, University of Washington Fionn Murtagh, Royal Holloway, University of London Proposals for the series should be sent directly to one of the series editors above, or submitted to: Chapman & Hall/CRC 4th Floor, Albert House 1-4 Singer Street London EC2A 4BQ UK Published Titles Bayesian Articial Intelligence Design and Modeling for Computer Kevin B. Korb and Ann E. Nicholson Experiments Computational Statistics Handbook with Kai-Tai Fang, Runze Li, and MATLAB®, Second Edition Agus Sudjianto Wendy L. Martinez and Angel R. Martinez Introduction to Machine Learning Pattern Recognition Algorithms for and Bioinformatics Data Mining Sushmita Mitra, Sujay Datta, Sankar K. Pal and Pabitra Mitra Theodore Perkins, and George Michailidis Exploratory Data Analysis with MATLAB® R Graphics Wendy L. Martinez and Angel R. Martinez Paul Murrell Clustering for Data Mining: A Data R Programming for Bioinformatics Recovery Approach Robert Gentleman Boris Mirkin Semisupervised Learning for Correspondence Analysis and Data Computational Linguistics Coding with Java and R Steven Abney Fionn Murtagh Statistical Computing with R Maria L. Rizzo C6367_FM.indd 2 6/11/08 3:08:07 PM R Programming for Bioinformatics Robert Gentleman Fred Hutchinson Cancer Research Center Seattle, Washington, U.S.A. C6367_FM.indd 3 6/11/08 3:08:07 PM Chapman & Hall/CRC Taylor & Francis Group 6000 Broken Sound Parkway NW, Suite 300 Boca Raton, FL 33487-2742 © 2009 by Taylor & Francis Group, LLC Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business No claim to original U.S. Government works Printed in the United States of America on acid-free paper 10 9 8 7 6 5 4 3 2 1 International Standard Book Number-13: 978-1-4200-6367-7 (Hardcover) This book contains information obtained from authentic and highly regarded sources Reason- able efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The Authors and Publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers. For permission to photocopy or use material electronically from this work, please access www. copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC) 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged. Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe. Library of Congress Cataloging-in-Publication Data Gentleman, Robert, 1959- R programming for bioinformatics / Robert Gentleman. p. cm. -- (Chapman & Hall/CRC computer science and data analysis series) Bibliographical references (p. ) and index. ISBN 978-1-4200-6367-7 1. Bioinformatics. 2. R (Computer program language) I. Title. II. Series. QH324.2.G46 2008 572.80285’5133--dc22 2008011352 Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com and the CRC Press Web site at http://www.crcpress.com C6367_FM.indd 4 6/11/08 3:08:07 PM To Tanja, Sophie and Katja Contents 1 Introducing R 1 1.1 Introduction . 1 1.2 Motivation . 2 1.3 A note on the text . 3 1.4 Acknowledgments . 4 2 R Language Fundamentals 5 2.1 Introduction . 5 2.1.1 A brief introduction to R . 5 2.1.2 Attributes . 6 2.1.3 A very brief introduction to OOP in R . 7 2.1.4 Some special values . 8 2.1.5 Types of objects . 9 2.1.6 Sequence generating and vector subsetting . 11 2.1.7 Types of functions . 12 2.2 Data structures . 12 2.2.1 Atomic vectors . 12 2.2.2 Numerical computing . 15 2.2.3 Factors . 16 2.2.4 Lists, environments and data frames . 18 2.3 Managing your R session . 22 2.3.1 Finding out more about an object . 24 2.4 Language basics . 25 2.4.1 Operators . 26 2.5 Subscripting and subsetting . 28 2.5.1 Vector and matrix subsetting . 29 2.6 Vectorized computations . 36 2.6.1 The recycling rule . 37 2.7 Replacement functions . 38 2.8 Functional programming . 39 2.9 Writing functions . 41 2.10 Flow control . 42 2.10.1 Conditionals . 44 2.11 Exception handling . 45 2.12 Evaluation . 50 2.12.1 Standard evaluation . 51 2.12.2 Non-standard evaluation . 52 vii viii 2.12.3 Function evaluation . 53 2.12.4 Indirect function invocation . 54 2.12.5 Evaluation on exit . 54 2.12.6 Other topics . 55 2.12.7 Name spaces . 57 2.13 Lexical scope . 59 2.13.1 Likelihoods . 61 2.13.2 Function optimization . 62 2.14 Graphics . 64 3 Object-Oriented Programming in R 67 3.1 Introduction . 67 3.2 The basics of OOP . 68 3.2.1 Inheritance . 69 3.2.2 Dispatch . 71 3.2.3 Abstract data types . 72 3.2.4 Self-describing data . 73 3.3 S3 OOP . 74 3.3.1 Implicit classes . 76 3.3.2 Expression data example . 77 3.3.3 S3 generic functions and methods . 78 3.3.4 Details of dispatch . 81 3.3.5 Group generics . 83 3.3.6 S3 replacement methods . 83 3.4 S4 OOP . 84 3.4.1 Classes . 85 3.4.2 Types of classes . 98 3.4.3 Attributes . 98 3.4.4 Class unions . 99 3.4.5 Accessor functions . 100 3.4.6 Using S3 classes with S4 classes . 100 3.4.7 S4 generic functions and methods . 101 3.4.8 The syntax of method declaration . 105 3.4.9 The semantics of method invocation . 106 3.4.10 Replacement methods . 107 3.4.11 Finding methods . 107 3.4.12 Advanced topics . 108 3.5 Using classes and methods in packages . 110 3.6 Documentation . 110 3.6.1 Finding documentation . 110 3.6.2 Writing documentation . 111 3.7 Debugging . 111 3.8 Managing S3 and S4 together . 112 3.8.1 Getting and setting the class attribute . 113 3.8.2 Mixing S3 and S4 methods . 114 ix 3.9 Navigating the class and method hierarchy . 115 4 Input and Output in R 119 4.1 Introduction . 119 4.2 Basic file handling . 120 4.2.1 Viewing files . 124 4.2.2 File manipulation . 125 4.2.3 Working with R’s binary format . 129 4.3 Connections . 130 4.3.1 Text connections . 131 4.3.2 Interprocess communications . 133 4.3.3 Seek . 136 4.4 File input and output . 137 4.4.1 Reading rectangular data . 138 4.4.2 Writing data . 139 4.4.3 Debian Control Format (DCF) . 140 4.4.4 FASTA Format . 141 4.5 Source and sink: capturing R output . 142 4.6 Tools for accessing files on the Internet . 143 5 Working with Character Data 145 5.1 Introduction . 145 5.2 Builtin capabilities . 146 5.2.1 Modifying text . 151 5.2.2 Sorting and comparing . 152 5.2.3 Matching a set of alternatives . 153 5.2.4 Formatting text and numbers . 155 5.2.5 Special characters and escaping . 155 5.2.6 Parsing and deparsing . 158 5.2.7 Plotting with text . 159 5.2.8 Locale and font encoding . 159 5.3 Regular expressions . 159 5.3.1 Regular expression basics . 160 5.3.2 Matching . 166 5.3.3 Using regular expressions . 167 5.3.4 Globbing and regular expressions . 169 5.4 Prefixes, suffixes and substrings . 169 5.5 Biological sequences . 171 5.5.1 Encoding genomes . 172 5.6 Matching patterns . 173 5.6.1 Matching single query sequences . 174 5.6.2 Matching many query sequences . 175 5.6.3 Palindromes and paired matches . 177 5.6.4 Alignments . ..
Recommended publications
  • Wireshark User's Guide
    Wireshark User’s Guide For Wireshark 2.1 Ulf Lamping <ulf.lamping[AT]web.de> Richard Sharpe, NS Computer Software and Services P/L <rsharpe[AT]ns.aus.com> Ed Warnicke <hagbard[AT]physics.rutgers.edu> Wireshark User’s Guide: For Wireshark 2.1 by Ulf Lamping, Richard Sharpe, and Ed Warnicke Copyright © 2004-2014 Ulf Lamping, Richard Sharpe, Ed Warnicke Permission is granted to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 or any later version published by the Free Software Foundation. All logos and trademarks in this document are property of their respective owner. Preface ...................................................................................................................... viii 1. Foreword ....................................................................................................... viii 2. Who should read this document? ....................................................................... viii 3. Acknowledgements .......................................................................................... viii 4. About this document ......................................................................................... ix 5. Where to get the latest copy of this document? ....................................................... ix 6. Providing feedback about this document ............................................................... ix 1. Introduction .............................................................................................................
    [Show full text]
  • Virtualization of the RIOT Operating System
    Computer Systems and Telematics — Distributed, Embedded Systems Diploma Thesis Virtualization of the RIOT Operating System Ludwig Ortmann Matr. 3914103 Supervisor: Dr. Emmanuel Baccelli Assisting Supervisor: Prof. Dr.-Ing. Jochen Schiller Institute of Computer Science, Freie Universität Berlin, Germany March 2, 2015 iii I hereby declare to have written this thesis on my own. I have used no other literature and resources than the ones referenced. All text passages that are literal or logical copies from other publications have been marked accordingly. All figures and pictures have been created by me or their sources are referenced accordingly. This thesis has not been submitted in the same or a similar version to any other examination board. Berlin, March 2, 2015 (Ludwig Ortmann) Abstract Abstract Software developers in the growing field of the Internet of Things face many hurdles which arise from the limitations of embedded systems and wireless networking. The employment of hardware and network virtualization promises to allow developers to test and debug hard- ware independent code without being affected by these limitations. This thesis presents RIOT native, a hardware and network emulation implementation for the RIOT operating system, which enables developers to compile and run RIOT as a process in their host operat- ing system. Running the operating system as a process allows for the use of debugging tools and techniques only available on desktop computers otherwise, the integration of common network analysis tools, and the emulation of arbitrary network topologies. By enabling the use of these tools and techniques for the development of software for distributed embedded systems, the hurdles they impose on the development process are significantly reduced.
    [Show full text]
  • NS-3 Advanced Tutorial: Visualization and Data Collection
    NS-3 Advanced Tutorial: Visualization and Data Collection Tom Henderson (University of Washington and Boeing Research & Technology) L. Felipe Perrone (Bucknell University) March 2013 NS-3 Consortium Meeting 1 March 2013 Outline Getting visualization and raw data from ns-3 • Tracing and packet traces • Gnuplot and Matplotlib • Flow Monitor • PyViz • NetAnim • Statistics • Data Collection Framework 2 NS-3 Consortium Meeting March 2013 Tracing requirements • Tracing is a structured form of simulation output • Example (from ns-2): + 1.84375 0 2 cbr 210 ------- 0 0.0 3.1 225 610 - 1.84375 0 2 cbr 210 ------- 0 0.0 3.1 225 610 r 1.84471 2 1 cbr 210 ------- 1 3.0 1.0 195 600 r 1.84566 2 0 ack 40 ------- 2 3.2 0.1 82 602 + 1.84566 0 2 tcp 1000 ------- 2 0.1 3.2 102 611 Problem: Tracing needs vary widely – would like to change tracing output without editing the core – would like to support multiple outputs 3 NS-3 Consortium Meeting March 2013 Tracing in ns-3 • ns-3 configures multiple 'TraceSource' objects (TracedValue, TracedCallback) • Multiple types of 'TraceSink' objects can be hooked to these sources • A special configuration namespace helps to manage access to trace sources TracedValue Config::Connect ("/path/to/traced/value", callback1); TraceSource Config::Connect ("/path/to/trace/source", callback2); TraceSource unattached NS-3 Consortium Meeting March 2013 NetDevice trace hooks • Example: CsmaNetDevice NetDevice:: CsmaNetDevice::Send () ReceiveCallback MacTx MacRx MacDrop queue Sniffer PromiscSniffer MacTxBackoff PhyTxBegin PhyRxEnd
    [Show full text]
  • Automated Multi-Platform OS Testing with External Reference Devices
    1 PHiLIP on the HiL: Automated Multi-platform OS Testing with External Reference Devices KEVIN WEISS, Hamburg University of Applied Sciences, Germany MICHEL ROTTLEUTHNER, Hamburg University of Applied Sciences, Germany THOMAS C. SCHMIDT, Hamburg University of Applied Sciences, Germany MATTHIAS WÄHLISCH, Freie Universität Berlin, Germany Developing an operating system (OS) for low-end embedded devices requires continuous adaptation to new hardware architectures and components, while serviceability of features needs to be assured for each individual platform under tight resource constraints. It is challenging to design a versatile and accurate heterogeneous test environment that is agile enough to cover a continuous evolution of the code base and platforms. This mission is even more challenging when organized in an agile open-source community process with many contributors such as for the RIOT OS. Hardware in the Loop (HiL) testing and Continuous Integration (CI) are automatable approaches to verify functionality, prevent regressions, and improve the overall quality at development speed in large community projects. In this paper, we present PHiLIP (Primitive Hardware in the Loop Integration Product), an open-source external reference device together with tools that validate the system software while it controls hardware and interprets physical signals. Instead of focusing on a specific test setting, PHiLIP takes the approach ofa tool-assisted agile HiL test process, designed for continuous evolution and deployment cycles. We explain its design, describe how it supports HiL tests, evaluate performance metrics, and report on practical experiences of employing PHiLIP in an automated CI test infrastructure. Our initial deployment comprises 22 unique platforms, each of which executes 98 peripheral tests every night.
    [Show full text]
  • Appendix a a Quick Introduction to Python
    Appendix A A Quick Introduction to Python A.1 General Information about the Language Python is an interpreted programming language which we can use both: interac- tively, by putting commands and getting immediate results, and in a batch mode, by processing the whole program stored in a text file. An open-source implementation of the Python programming language1 with its documentation can be downloaded from ironpython.net. It is common to use py as the file name extension for Python source files. To start an interactive session we type ipy in the command line console (please notice that so as to use some modules one is obligated to invoke IronPython with the -X:Frames parameter), whereas to run a program stored in a file name.py we type ipy name.py. Using Python, programmers are allowed to follow procedural as well as object- oriented programming language paradigms. The structure of a typical program is almost the same as in most programming languages: loading necessary modules (by means of the instruction import or from), the definition of procedures, functions, classes, and the main computational part. We will not present this in a formal way, because in Python we have broad discretion over the program framework; for exam- ple, instructions can be put between subroutines, and module import—in the body of a function. An interpreter executes the code of a program from the top of a file downward to the last line. It is also worth to emphasize the following features that distinguish Python in the realm of programming languages: • Dynamic type-checking: the type of a variable can not be explicitly shown.
    [Show full text]
  • A Simulation Framework for Modeling the Behaviour of Iot and Edge Computing Environments
    1 IoTSim-Edge: A Simulation Framework for Modeling the Behaviour of IoT and Edge Computing Environments Devki Nandan Jha∗, Khaled Alwasel∗, Areeb Alshoshan∗, Xianghua Huang∗, Ranesh Kumar Naha†, Sudheer Kumar Battula†, Saurabh Garg†, Deepak Puthal∗, Philip James∗, Albert Zomaya‡, § Schahram Dustdar , Rajiv Ranjan∗ ∗Newcastle University, United Kingdom. † University of Tasmania, Australia. ‡The University of Sydney, Australia. §TU Wien, Austria. Abstract— With the proliferation of Internet of Things (IoT) and edge computing paradigms, billions of IoT devices are being networked to support data-driven and real-time decision making across numerous application domains including smart homes, smart transport, and smart buildings. These ubiquitously distributed IoT devices send the raw data to their respective edge device (e.g. IoT gateways) or the cloud directly. The wide spectrum of possible application use cases make the design and networking of IoT and edge computing layers a very tedious process due to the: (i) complexity and heterogeneity of end-point networks (e.g. wifi, 4G, Bluetooth); (ii) heterogeneity of edge and IoT hardware resources and software stack; (iv) mobility of IoT devices; and (iii) the complex interplay between the IoT and edge layers. Unlike cloud computing, where researchers and developers seeking to test capacity planning, resource selection, network configuration, computation placement and security management strategies had access to public cloud infrastructure (e.g. Amazon and Azure), establishing an IoT and edge computing testbed which offers a high degree of verisimilitude is not only complex, costly and resource intensive but also time-intensive. Moreover, testing in real IoT and edge computing environments is not feasible due to the high cost and diverse domain knowledge required in order to reason about their diversity, scalability and usability.
    [Show full text]
  • A Unix Primer
    A UNIX PRIMER David A. Clarke Department of Astronomy and Physics Saint Mary’s University Halifax, NS, Canada September, 2002 Revised December, 2005 i TABLE OF CONTENTS I Overview . 1 II History . 4 III The Basics . 5 3.1 Your account . 5 3.2 Script files . 5 3.3 Directory trees . 6 3.4 Listing files (the ls command) and the Unix prompt . 7 3.5 The “Man” pages . 8 IV Common Unix Commands . 9 4.1 passwd ...............................9 4.2 mkdir <directory name> .......................9 4.3 rmdir <directory name> .......................9 4.4 ln -s <disc> <link> .........................9 4.5 rm <file or link> ..........................9 4.6 Creating text files . 10 4.7 File names . 10 4.8 pwd ................................ 10 4.9 cd <directory name> ........................ 10 4.10 mv <origin file> <target file> . 11 4.11 cp <origin file> <target file> . 11 4.12 lp -d <printer> <file name> ................... 11 4.13 enscript -d <printer> <file name> . 12 4.14 chmod ............................... 12 4.15 gnuplot .............................. 13 4.16 logout .............................. 14 V Unix Practise Session . 15 5.1 Accessing the A&P Unix environment . 15 5.2 The Common Desktop Environment (CDE) . 15 5.3 Practise session on Unix, including GNUPLOT . 16 5.4 Logging off, and a warning... ..................... 20 ii A UNIX PRIMER I OVERVIEW As an instructor teaching second and third-year physics students, teaching computing pro- gramming poses a unique challenge. While some of you may already be sophisticated com- puter programmers (and quite possibly much more sophisticated than your instructor!), others of you may be blown away by the idea that the keyboard can be used for something other than entering your ID and password, and composing text for e-mail and assignments.
    [Show full text]
  • Ns-3 Tutorial Release Ns-3.29
    ns-3 Tutorial Release ns-3.29 ns-3 project Jan 08, 2019 CONTENTS 1 Introduction 3 1.1 About ns-3................................................3 1.2 For ns-2 Users..............................................4 1.3 Contributing...............................................4 1.4 Tutorial Organization...........................................5 2 Resources 7 2.1 The Web.................................................7 2.2 Git....................................................7 2.3 Waf....................................................7 2.4 Development Environment........................................8 2.5 Socket Programming...........................................8 3 Getting Started 9 3.1 Overview.................................................9 3.2 Prerequisites...............................................9 3.3 Downloading ns-3 using Git....................................... 10 3.4 Building ns-3............................................... 13 3.5 Testing ns-3............................................... 21 3.6 Running a Script............................................. 22 4 Conceptual Overview 25 4.1 Key Abstractions............................................. 25 4.2 A First ns-3 Script............................................ 27 4.3 Ns-3 Source Code............................................ 36 5 Tweaking 39 5.1 Using the Logging Module........................................ 39 5.2 Using Command Line Arguments.................................... 44 5.3 Using the Tracing System.......................................
    [Show full text]
  • Deploying RIOT Operating System on a Reconfigurable Internet of Things End-Device
    Ângelo André Oliveira Ribeiro Deploying RIOT Operating System on a Reconfigurable Internet of Things End-device Guimarães, Outubro de 2017 Ângelo André Oliveira Ribeiro Deploying RIOT Operating System on a Reconfigurable Internet of Things End-device Dissertação de Mestrado Mestrado Integrado em Engenharia Eletrónica Industrial e Computadores Trabalho efetuado sob a orientação do Professor Doutor Tiago Gomes Guimarães, Outubro de 2017 Acknowledgments First of all, I express my gratitude to the best advisor Dr. Tiago Gomes, for his friendship, patience, priceless motivation and for the huge transmission of knowledge in such an adventure through IoT. I also wish to express my gratitude to the most challenging, motivational and inspiring professors Dr. Adriano Tavares and Dr. Jorge Cabral that were fundamental in my development of knowledge in the embedded systems field. To all the companions in this journey Hugo Araujo, Ailton Lopes, Sérgio Pereira, Nuno Silva, José Silva, Pedro Machado, Pedro Ribeiro, Francisco Petrucci and Ricardo Roriz I express my deepest gratitude for all the friendship, craziness and company on those long days and nights of work. I present also my gratitude to Cristiano Rodrigues and Ivo Marques for the collaboration and help in the development of the XIoT. A special thank you to Guilherme Raimundo, Barbara Rodrigues, João Torre and Tiago Ribeiro. To my family, specially to my mother, for all the support during these course, for never let me give-up, for making the impossible possible, and for sure, if I wright this words is thanks to you. Last but not least to my grandmother for such an example of perseverance and love.
    [Show full text]
  • Ns-3 Direct Code Execution (DCE) Manual Release 1.11
    ns-3 Direct Code Execution (DCE) Manual Release 1.11 Direct Code Execution project Sep 08, 2021 CONTENTS 1 Introduction 3 2 Quick Start 5 3 User’s Guide 11 4 Developer’s Guide 45 5 How It Works 61 6 Subprojects of DCE 69 7 About 71 i ii ns-3 Direct Code Execution (DCE) Manual, Release 1.11 This is the manual of Direct Code Execution (DCE). • Doxygen: Documentation of the public APIs of the DCE • Manual (this document) for the latest release and development tree This document is written in reStructuredText for Sphinx and is maintained in the doc/ directory of ns-3-dce’s source code. CONTENTS 1 ns-3 Direct Code Execution (DCE) Manual, Release 1.11 2 CONTENTS CHAPTER ONE INTRODUCTION 1.1 Overview Direct Code Execution (DCE) is a module for ns-3 that provides facilities to execute, within ns-3, existing implemen- tations of userspace and kernelspace network protocols or applications without source code changes. For example instead of using the pseudo application provided by ns-3 V4PingHelper you can use the true ping. If you wish to cite DCE in your academic work, please reference this paper. 1.2 Maintainers DCE is maintained by Hajime Tazaki, Matthieu Coudron, Tom Henderson, and Parth Pratim Chatterjee. 1.3 Manual Structure This document consists of the following parts: 0. Quick Start: The document describes a quick instruction of DCE. 1. User’s Guide: The document is for people who will use DCE to experiment. 1.4 DCE Outlook • To run an application using DCE, it is usually not necessary to change its sources.
    [Show full text]
  • Bsctools Hands-On 3 Virtual Institute – High Productivity Supercomputing
    VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING BSC Tools Extrae & Paraver Hands-On Germán Llort, Judit Giménez Barcelona Supercomputing Center VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING Getting a trace with Extrae 2 VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING Extrae features . Platforms . Intel, Cray, BlueGene, Intel MIC, ARM, Android, Fujitsu Sparc… . Parallel programming model . MPI, OpenMP, pthreads, OmpSs, CUDA, OpenCL, Java, Python... No need . Performance Counters . Using PAPI interface to . Link to source code recompile . Callstack at MPI routines / relink! . OpenMP outlined routines . Selected user functions . Periodic sampling . User events (Extrae API) BSCTOOLS HANDS-ON 3 VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING Extrae overheads Average values Stampede2 (KNL) Stampede2 (SKX) Event 150 – 200 ns 450 ns 110 ns Event + PAPI 750 – 1000 ns 6.3 us 1000 ns Event + callstack (1 level) 600 ns 8.7 us 2.3 us Event + callstack (6 levels) 1.9 us 20 us 5.6 us BSCTOOLS HANDS-ON 4 VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING How does Extrae work? . Symbol substitution through LD_PRELOAD . Specific libraries for each combination of runtimes . MPI Recommended . OpenMP . OpenMP+MPI . … . Dynamic instrumentation . Based on DynInst (developed by U.Wisconsin/U.Maryland) . Instrumentation in memory . Binary rewriting . Static link (i.e., PMPI, Extrae API) BSCTOOLS HANDS-ON 5 VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING Using Extrae in 3 steps 1. Adapt your job submission script 2. Configure what to trace . XML configuration file . Example configurations at $EXTRAE_HOME/share/example 3. Run it! . For further reference check the Extrae User Guide: . https://tools.bsc.es/doc/html/extrae/index.html . Also distributed with Extrae at $EXTRAE_HOME/share/doc BSCTOOLS HANDS-ON 6 VIRTUAL INSTITUTE – HIGH PRODUCTIVITY SUPERCOMPUTING Login to Stampede2 and copy the examples laptop> ssh –Y <USER>@stampede2.tacc.xsede.org stampede2> cp –r ~tg856590/tools-material $WORK stampede2> ls $WORK/tools-material ..
    [Show full text]
  • Quantecon-Python-Programming.Pdf
    Python Programming for Economics and Finance Thomas J. Sargent & John Stachurski Sep 17, 2021 CONTENTS I Introduction to Python 3 1 About Python 5 1.1 Overview ................................................. 5 1.2 What’s Python? .............................................. 5 1.3 Scientific Programming .......................................... 7 1.4 Learn More ................................................ 15 2 Setting up Your Python Environment 17 2.1 Overview ................................................. 17 2.2 Anaconda ................................................. 17 2.3 Jupyter Notebooks ............................................ 18 2.4 Installing Libraries ............................................ 30 2.5 Working with Python Files ........................................ 31 2.6 Exercises ................................................. 32 3 An Introductory Example 35 3.1 Overview ................................................. 35 3.2 The Task: Plotting a White Noise Process ................................ 35 3.3 Version 1 ................................................. 36 3.4 Alternative Implementations ....................................... 40 3.5 Another Application ........................................... 44 3.6 Exercises ................................................. 45 3.7 Solutions ................................................. 47 4 Functions 53 4.1 Overview ................................................. 53 4.2 Function Basics .............................................. 54 4.3 Defining
    [Show full text]