Bioinformatics Programming Using Python

Mitchell L Model

Beijing • Cambridge • Farnham • Köln • Sebastopol • Taipei • Tokyo

Bioinformatics Programming Using Python
by Mitchell L Model

Copyright © 2010 Mitchell L Model. All rights reserved.
Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://my.safaribooksonline.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Editor: Mike Loukides
Production Editor: Sarah Schneider
Copyeditor: Rachel Head
Proofreader: Sada Preisch
Indexer: Lucie Haskins
Cover Designer: Karen Montgomery
Interior Designer: David Futato
Illustrator: Robert Romano

Printing History:
December 2009: First Edition.

O’Reilly and the O’Reilly logo are registered trademarks of O’Reilly Media, Inc. Bioinformatics Programming Using Python, the image of a brown rat, and related trade dress are trademarks of O’Reilly Media, Inc.

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and O’Reilly Media, Inc. was aware of a trademark claim, the designations have been printed in caps or initial caps.

While every precaution has been taken in the preparation of this book, the publisher and author assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.

This book uses RepKover™, a durable and flexible lay-flat binding.

ISBN: 978-0-596-15450-9
[M]
1259959883

Table of Contents

Preface

1. Primitives
    Simple Values
        Booleans
        Integers
        Floats
        Strings
    Expressions
        Numeric Operators
        Logical Operations
        String Operations
        Calls
        Compound Expressions
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

2. Names, Functions, and Modules
    Assigning Names
    Defining Functions
        Function Parameters
        Comments and Documentation
        Assertions
        Default Parameter Values
    Using Modules
        Importing
        Python Files
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

3. Collections
    Sets
    Sequences
        Strings, Bytes, and Bytearrays
        Ranges
        Tuples
        Lists
    Mappings
        Dictionaries
    Streams
        Files
        Generators
    Collection-Related Expression Features
        Comprehensions
        Functional Parameters
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

4. Control Statements
    Conditionals
    Loops
        Simple Loop Examples
        Initialization of Loop Values
        Looping Forever
        Loops with Guard Conditions
    Iterations
        Iteration Statements
        Kinds of Iterations
    Exception Handlers
        Python Errors
        Exception Handling Statements
        Raising Exceptions
    Extended Examples
        Extracting Information from an HTML File
        The Grand Unified Bioinformatics File Parser
        Parsing GenBank Files
        Translating RNA Sequences
        Constructing a Table from a Text File
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

5. Classes
    Defining Classes
        Instance Attributes
        Class Attributes
    Class and Method Relationships
        Decomposition
        Inheritance
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

6. Utilities
    System Environment
        Dates and Times: datetime
        System Information
        Command-Line Utilities
        Communications
    The Filesystem
        Operating System Interface: os
        Manipulating Paths: os.path
        Filename Expansion: fnmatch and glob
        Shell Utilities: shutil
        Comparing Files and Directories
    Working with Text
        Formatting Blocks of Text: textwrap
        String Utilities: string
        Comma- and Tab-Separated Formats: csv
        String-Based Reading and Writing: io
    Persistent Storage
        Persistent Text: dbm
        Persistent Objects: pickle
        Keyed Persistent Object Storage: shelve
    Debugging Tools
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

7. Pattern Matching
    Fundamental Syntax
        Fixed-Length Matching
        Variable-Length Matching
        Greedy Versus Nongreedy Matching
        Grouping and Disjunction
    The Actions of the re Module
        Functions
        Flags
        Methods
    Results of re Functions and Methods
        Match Object Fields
        Match Object Methods
    Putting It All Together: Examples
        Some Quick Examples
        Extracting Descriptions from Sequence Files
        Extracting Entries From Sequence Files
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

8. Structured Text
    HTML
        Simple HTML Processing
        Structured HTML Processing
    XML
        The Nature of XML
        An XML File for a Complete Genome
        The ElementTree Module
        Event-Based Processing
        expat
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

9. Web Programming
    Manipulating URLs: urllib.parse
        Disassembling URLs
        Assembling URLs
    Opening Web Pages: webbrowser
        Module Functions
        Constructing and Submitting Queries
        Constructing and Viewing an HTML Page
    Web Clients
        Making the URLs in a Response Absolute
        Constructing an HTML Page of Extracted Links
        Downloading a Web Page’s Linked Files
    Web Servers
        Sockets and Servers
        CGI
        Simple Web Applications
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

10. Relational Databases
    Representation in Relational Databases
        Database Tables
        A Restriction Enzyme Database
    Using Relational Data
        SQL Basics
        SQL Queries
        Querying the Database from a Web Page
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

11. Structured Graphics
    Introduction to Graphics Programming
        Concepts
        GUI Toolkits
    Structured Graphics with tkinter
        tkinter Fundamentals
        Examples
    Structured Graphics with SVG
        SVG File Contents
        Examples
    Tips, Traps, and Tracebacks
        Tips
        Traps
        Tracebacks

A. Python Language Summary

B. Collection Type Summary

Index

Preface

This preface provides information I expect will be important for someone reading and using this book. The first part introduces the book itself. The second talks about Python. The third part contains other notes of various kinds.

Introduction

I would like to begin with some comments about this book, the field of bioinformatics, and the kinds of people I think will find it useful.

About This Book

The purpose of this book is to show the reader how to use the Python programming language to facilitate and automate the wide variety of data manipulation tasks encountered in life science research and development. It is designed to be accessible to readers with a range of interests and backgrounds, both scientific and technical. It emphasizes practical programming, using meaningful examples of useful code. In addition to meeting the needs of individual readers, it can also be used as a textbook for a one-semester upper-level undergraduate or graduate-level course.

The book differs from traditional introductory programming texts in a variety of ways. It does not attempt to detail every possible variation of the mechanisms it describes, emphasizing instead the most frequently used. It offers an introduction to Python programming that is more rapid and in some ways more superficial than what would be found in a text devoted solely to Python or introductory programming. At the same time, it includes some advanced features, techniques, and topics that are often omitted from entry-level Python books. These are included because of their wide applicability in bioinformatics programming, and they are used extensively in the book’s examples.

Python’s installation includes a large selection of optional components called “modules.” Python books usually cover a small selection of the most generally useful modules, and perhaps some others in less detail. Having bioinformatics programming as this book’s target had some interesting effects on the choice of which modules to discuss, and at what depth. The modules (or parts of modules) that are covered in this book are the ones that are most likely to be particularly valuable in bioinformatics programming. In some cases the discussions are more substantial than would be found in a generic Python book, and many of the modules covered here appear in few other books. Chapter 6, in particular, describes a large number of narrowly focused “utility” modules. The remaining chapters focus on particular