Bioinformatics Biocomputing and Perl

Total Page:16

File Type:pdf, Size:1020Kb

Bioinformatics Biocomputing and Perl Bioinformatics Biocomputing and Perl An Introduction to Bioinformatics Computing Skills and Practice Michael Moorhouse Post-Doctoral Worker from Erasmus MC, The Netherlands Paul Barry Department of Computing and Networking, Institute of Technology, Carlow, Ireland Copyright 2004 John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England Telephone (+44) 1243 779777 Email (for orders and customer service enquiries): [email protected] Visit our Home Page on www.wileyeurope.com or www.wiley.com All Rights Reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except under the terms of the Copyright, Designs and Patents Act 1988 or under the terms of a licence issued by the Copyright Licensing Agency Ltd, 90 Tottenham Court Road, London W1T 4LP, UK, without the permission in writing of the Publisher. Requests to the Publisher should be addressed to the Permissions Department, John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ, England, or emailed to [email protected], or faxed to (+44) 1243 770620. This publication is designed to provide accurate and authoritative information in regard to the subject matter covered. It is sold on the understanding that the Publisher is not engaged in rendering professional services. If professional advice or other expert assistance is required, the services of a competent professional should be sought. Other Wiley Editorial Offices John Wiley & Sons Inc., 111 River Street, Hoboken, NJ 07030, USA Jossey-Bass, 989 Market Street, San Francisco, CA 94103-1741, USA Wiley-VCH Verlag GmbH, Boschstr. 12, D-69469 Weinheim, Germany John Wiley & Sons Australia Ltd, 33 Park Road, Milton, Queensland 4064, Australia John Wiley & Sons (Asia) Pte Ltd, 2 Clementi Loop #02-01, Jin Xing Distripark, Singapore 129809 John Wiley & Sons Canada Ltd, 22 Worcester Road, Etobicoke, Ontario, Canada M9W 1L1 Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be available in electronic books. British Library Cataloguing in Publication Data A catalogue record for this book is available from the British Library ISBN 0-470-85331-X Typeset in 9.5/12.5pt Lucida Bright by Laserwords Private Limited, Chennai, India Printed and bound in Great Britain by Antony Rowe Ltd, Chippenham, Wiltshire This book is printed on acid-free paper responsibly manufactured from sustainable forestry in which at least two trees are planted for each one used for paper production. For my parents, who taught me the value of knowledge – MJM For three great kids: Joseph, Aaron and Aideen – PJB Contents Preface xv 1 Setting the Biological Scene 1 1.1 Introducing Biological Sequence Analysis 1 1.2 Protein and Polypeptides 4 1.3 Generalised Models and their Use 5 1.4 The Central Dogma of Molecular Biology 6 1.4.1 Transcription 6 1.4.2 Translation 7 1.5 Genome Sequencing 10 1.5.1 Sequence assembly 11 1.6 The Example DNA-gene-protein system we will use 12 Where to from Here 13 2 Setting the Technological Scene 15 2.1 The Layers of Technology 15 2.1.1 From passive user to active developer 16 2.2 Finding perl 17 2.2.1 Checking for perl 17 Where to from Here 18 I Working with Perl 19 3TheBasics 21 3.1 Let’s Get Started! 21 3.1.1 Running Perl programs 22 3.1.2 Syntax and semantics 23 3.1.3 Program: run thyself! 25 3.2 Iteration 26 3.2.1 Using the Perl while construct 26 3.3 More Iterations 30 3.3.1 Introducing variable containers 31 3.3.2 Variable containers and loops 32 viii Contents 3.4 Selection 34 3.4.1 Using the Perl if construct 35 3.5 There Really is MTOWTDI 36 3.6 Processing Data Files 41 3.6.1 Asking getlines to do more 43 3.7 Introducing Patterns 44 Where to from Here 46 The Maxims Repeated 46 4 Places to Put Things 49 4.1 Beyond Scalars 49 4.2 Arrays: Associating Data with Numbers 49 4.2.1 Working with array elements 51 4.2.2 How big is the array? 51 4.2.3 Adding elements to an array 52 4.2.4 Removing elements from an array 54 4.2.5 Slicing arrays 54 4.2.6 Pushing, popping, shifting and unshifting 56 4.2.7 Processing every element in an array 57 4.2.8 Making lists easier to work with 59 4.3 Hashes: Associating Data with Words 60 4.3.1 Working with hash entries 61 4.3.2 How big is the hash? 61 4.3.3 Adding entries to a hash 62 4.3.4 Removing entries from a hash 62 4.3.5 Slicing hashes 63 4.3.6 Working with hash entries: a complete example 64 4.3.7 Processing every entry in a hash 66 Where to from Here 68 The Maxims Repeated 68 5 Getting Organised 71 5.1 Named Blocks 71 5.2 Introducing Subroutines 73 5.2.1 Calling subroutines 73 5.3 Creating Subroutines 74 5.3.1 Processing parameters 76 5.3.2 Better processing of parameters 78 5.3.3 Even better processing of parameters 80 5.3.4 A more flexible drawline subroutine 83 5.3.5 Returning results 84 5.4 Visibility and Scope 85 5.4.1 Using private variables 86 5.4.2 Using global variables properly 88 5.4.3 The final version of drawline 89 5.5 In-built Subroutines 90 5.6 Grouping and Reusing Subroutines 92 5.6.1 Modules 93 5.7 The Standard Modules 96 5.8 CPAN: The Module Repository 96 5.8.1 Searching CPAN 97 5.8.2 Installing a CPAN module manually 98 Contents ix 5.8.3 Installing a CPAN module automatically 99 5.8.4 A final word on CPAN modules 99 Where to from Here 100 The Maxims Repeated 100 6 About Files 103 6.1 I/O: Input and Output 103 6.1.1 The standard streams: STDIN, STDOUT and STDERR 103 6.2 Reading Files 105 6.2.1 Determining the disk-file names 106 6.2.2 Opening the named disk-files 108 6.2.3 Reading a line from each of the disk-files 110 6.2.4 Putting it all together 110 6.2.5 Slurping 114 6.3 Writing Files 116 6.3.1 Redirecting output 117 6.3.2 Variable interpolation 117 6.4 Chopping and Chomping 118 Where to from Here 119 The Maxims Repeated 119 7 Patterns, Patterns and More Patterns 121 7.1 Pattern Basics 121 7.1.1 What is a regular expression? 122 7.1.2 What makes regular expressions so special? 122 7.2 Introducing the Pattern Metacharacters 124 7.2.1 The + repetition metacharacter 124 7.2.2 The | alternation metacharacter 126 7.2.3 Metacharacter shorthand and character classes 127 7.2.4 More metacharacter shorthand 128 7.2.5 More repetition 130 7.2.6 The ? and * optional metacharacters 130 7.2.7 The any character metacharacter 131 7.3 Anchors 132 7.3.1 The \b word boundary metacharacter 132 7.3.2 The ^ start-of-line metacharacter 133 7.3.3 The $ end-of-line metacharacter 133 7.4 The Binding Operators 134 7.5 Remembering What Was Matched 135 7.6 Greedy by Default 137 7.7 Alternative Pattern Delimiters 138 7.8 Another Useful Utility 139 7.9 Substitutions: Search and Replace 140 7.9.1 Substituting for whitespace 141 7.10 Finding a Sequence 142 Where to from Here 146 The Maxims Repeated 146 8 Perl Grabbag 147 8.1 Introduction 147 8.2 Strictness 147 x Contents 8.3 Perl One-liners 149 8.4 Running Other Programs from perl 152 8.5 Recovering from Errors 153 8.6 Sorting 155 8.7 HERE Documents 159 Where to from Here 160 The Maxims Repeated 161 II Working with Data 163 9 Downloading Datasets 165 9.1 Let’s Get Data 165 9.2 Downloading from the Web 165 9.2.1 Using wget to download PDB data-files 167 9.2.2 Mirroring a dataset 168 9.2.3 Smarter mirroring 168 9.2.4 Downloading a subset of a dataset 169 Where to from Here 171 The Maxims Repeated 171 10 The Protein Databank 173 10.1 Introduction 173 10.2 Determining Biomolecule Structures 174 10.2.1 X-Ray Crystallography 174 10.2.2 Nuclear magnetic resonance 176 10.2.3 Summary of protein structure methods 177 10.3 The Protein Databank 177 10.4 The PDB Data-file Formats 179 10.4.1 Example structures 180 10.4.2 Downloading PDB data-files 181 10.5 Accessing Data in PDB Entries 182 10.6 Accessing PDB Annotation Data 183 10.6.1 Free R and resolution 184 10.6.2 Database cross references 186 10.6.3 Coordinates section 188 10.6.4 Extracting 3D coordinate data 191 10.7 Contact Maps 192 10.8 STRIDE: Secondary Structure Assignment 196 10.8.1 Installation of STRIDE 197 10.9 Assigning Secondary Structures 197 10.9.1 Using STRIDE and parsing the output 200 10.9.2 Extracting amino acid sequences using STRIDE 204 10.10 Introducing the mmCIF Protein Format 205 10.10.1 Converting mmCIF to PDB 206 10.10.2 Converting mmCIFs to PDB with CIFTr 206 10.10.3 Problems with the CIFTr conversion 208 10.10.4 Some advice on using mmCIF 208 10.10.5 Automated conversion of mmCIF to PDB 208 Where to from Here 210 The Maxims Repeated 210 Contents xi 11 Non-redundant Datasets 211 11.1 Introducing Non-redundant Datasets 211 11.1.1 Reasons for redundancy 211 11.1.2 Reduction of redundancy 212 11.1.3 Non-redundancy and non-representative 212 11.2 Non-redundant Protein Structures 213 Where to from Here 217 The Maxims Repeated 217 12 Databases 219 12.1 Introducing Databases 219 12.1.1 Relating tables 220 12.1.2 The problem with single-table databases 222 12.1.3 Solving the one-table problem 222 12.1.4 Database system: a definition 224 12.2 Available Database Systems 224 12.2.1 Personal database systems 225 12.2.2 Enterprise database systems 225 12.2.3 Open source database systems 225 12.3 SQL: the Language of Databases 226 12.3.1 Defining data with SQL 226 12.3.2 Manipulating data with SQL 227 12.4 A Database Case Study: MER 227 12.4.1 The requirement for the MER database 231 12.4.2 Installing a database system 232 12.4.3 Creating the MER database 233 12.4.4 Adding tables to the MER database 235 12.4.5 Preparing SWISS-PROT data for importation
Recommended publications
  • X10 Language Specification
    X10 Language Specification Version 2.6.2 Vijay Saraswat, Bard Bloom, Igor Peshansky, Olivier Tardieu, and David Grove Please send comments to [email protected] January 4, 2019 This report provides a description of the programming language X10. X10 is a class- based object-oriented programming language designed for high-performance, high- productivity computing on high-end computers supporting ≈ 105 hardware threads and ≈ 1015 operations per second. X10 is based on state-of-the-art object-oriented programming languages and deviates from them only as necessary to support its design goals. The language is intended to have a simple and clear semantics and be readily accessible to mainstream OO pro- grammers. It is intended to support a wide variety of concurrent programming idioms. The X10 design team consists of David Grove, Ben Herta, Louis Mandel, Josh Milthorpe, Vijay Saraswat, Avraham Shinnar, Mikio Takeuchi, Olivier Tardieu. Past members include Shivali Agarwal, Bowen Alpern, David Bacon, Raj Barik, Ganesh Bikshandi, Bob Blainey, Bard Bloom, Philippe Charles, Perry Cheng, David Cun- ningham, Christopher Donawa, Julian Dolby, Kemal Ebcioglu,˘ Stephen Fink, Robert Fuhrer, Patrick Gallop, Christian Grothoff, Hiroshi Horii, Kiyokuni Kawachiya, Al- lan Kielstra, Sreedhar Kodali, Sriram Krishnamoorthy, Yan Li, Bruce Lucas, Yuki Makino, Nathaniel Nystrom, Igor Peshansky, Vivek Sarkar, Armando Solar-Lezama, S. Alexander Spoon, Toshio Suganuma, Sayantan Sur, Toyotaro Suzumura, Christoph von Praun, Leena Unnikrishnan, Pradeep Varma, Krishna Nandivada Venkata, Jan Vitek, Hai Chuan Wang, Tong Wen, Salikh Zakirov, and Yoav Zibin. For extended discussions and support we would like to thank: Gheorghe Almasi, Robert Blackmore, Rob O’Callahan, Calin Cascaval, Norman Cohen, Elmootaz El- nozahy, John Field, Kevin Gildea, Sara Salem Hamouda, Michihiro Horie, Arun Iyen- gar, Chulho Kim, Orren Krieger, Doug Lea, John McCalpin, Paul McKenney, Hiroki Murata, Andrew Myers, Filip Pizlo, Ram Rajamony, R.
    [Show full text]
  • Exploring Languages with Interpreters and Functional Programming Chapter 22
    Exploring Languages with Interpreters and Functional Programming Chapter 22 H. Conrad Cunningham 5 November 2018 Contents 22 Overloading and Type Classes 2 22.1 Chapter Introduction . .2 22.2 Polymorphism in Haskell . .2 22.3 Why Overloading? . .2 22.4 Defining an Equality Class and Its Instances . .4 22.5 Type Class Laws . .5 22.6 Another Example Class Visible ..................5 22.7 Class Extension (Inheritance) . .6 22.8 Multiple Constraints . .7 22.9 Built-In Haskell Classes . .8 22.10Comparison to Other Languages . .8 22.11What Next? . .9 22.12Exercises . 10 22.13Acknowledgements . 10 22.14References . 11 22.15Terms and Concepts . 11 Copyright (C) 2017, 2018, H. Conrad Cunningham Professor of Computer and Information Science University of Mississippi 211 Weir Hall P.O. Box 1848 University, MS 38677 (662) 915-5358 Browser Advisory: The HTML version of this textbook requires use of a browser that supports the display of MathML. A good choice as of November 2018 is a recent version of Firefox from Mozilla. 1 22 Overloading and Type Classes 22.1 Chapter Introduction Chapter 5 introduced the concept of overloading. Chapters 13 and 21 introduced the related concepts of type classes and instances. The goal of this chapter and the next chapter is to explore these concepts in more detail. The concept of type class was introduced into Haskell to handle the problem of comparisons, but it has had a broader and more profound impact upon the development of the language than its original purpose. This Haskell feature has also had a significant impact upon the design of subsequent languages (e.g.
    [Show full text]
  • On the Interaction of Object-Oriented Design Patterns and Programming
    On the Interaction of Object-Oriented Design Patterns and Programming Languages Gerald Baumgartner∗ Konstantin L¨aufer∗∗ Vincent F. Russo∗∗∗ ∗ Department of Computer and Information Science The Ohio State University 395 Dreese Lab., 2015 Neil Ave. Columbus, OH 43210–1277, USA [email protected] ∗∗ Department of Mathematical and Computer Sciences Loyola University Chicago 6525 N. Sheridan Rd. Chicago, IL 60626, USA [email protected] ∗∗∗ Lycos, Inc. 400–2 Totten Pond Rd. Waltham, MA 02154, USA [email protected] February 29, 1996 Abstract Design patterns are distilled from many real systems to catalog common programming practice. However, some object-oriented design patterns are distorted or overly complicated because of the lack of supporting programming language constructs or mechanisms. For this paper, we have analyzed several published design patterns looking for idiomatic ways of working around constraints of the implemen- tation language. From this analysis, we lay a groundwork of general-purpose language constructs and mechanisms that, if provided by a statically typed, object-oriented language, would better support the arXiv:1905.13674v1 [cs.PL] 31 May 2019 implementation of design patterns and, transitively, benefit the construction of many real systems. In particular, our catalog of language constructs includes subtyping separate from inheritance, lexically scoped closure objects independent of classes, and multimethod dispatch. The proposed constructs and mechanisms are not radically new, but rather are adopted from a variety of languages and programming language research and combined in a new, orthogonal manner. We argue that by describing design pat- terns in terms of the proposed constructs and mechanisms, pattern descriptions become simpler and, therefore, accessible to a larger number of language communities.
    [Show full text]
  • An Analysis of the Dynamic Behavior of Javascript Programs
    An Analysis of the Dynamic Behavior of JavaScript Programs Gregor Richards Sylvain Lebresne Brian Burg Jan Vitek S3 Lab, Department of Computer Science, Purdue University, West Lafayette, IN fgkrichar,slebresn,bburg,[email protected] Abstract becoming a general purpose computing platform with office appli- The JavaScript programming language is widely used for web cations, browsers and development environments [15] being devel- programming and, increasingly, for general purpose computing. oped in JavaScript. It has been dubbed the “assembly language” of the Internet and is targeted by code generators from the likes As such, improving the correctness, security and performance of 2;3 JavaScript applications has been the driving force for research in of Java and Scheme [20]. In response to this success, JavaScript type systems, static analysis and compiler techniques for this lan- has started to garner academic attention and respect. Researchers guage. Many of these techniques aim to reign in some of the most have focused on three main problems: security, correctness and dynamic features of the language, yet little seems to be known performance. Security is arguably JavaScript’s most pressing prob- about how programmers actually utilize the language or these fea- lem: a number of attacks have been discovered that exploit the lan- tures. In this paper we perform an empirical study of the dynamic guage’s dynamism (mostly the ability to access and modify shared behavior of a corpus of widely-used JavaScript programs, and an- objects and to inject code via eval). Researchers have proposed ap- alyze how and why the dynamic features are used.
    [Show full text]
  • Thinking in C++ Volume 2 Annotated Solution Guide, Available for a Small Fee From
    1 z 516 Note: This document requires the installation of the fonts Georgia, Verdana and Andale Mono (code font) for proper viewing. These can be found at: http://sourceforge.net/project/showfiles.php?group_id=34153&release_id=105355 Revision 19—(August 23, 2003) Finished Chapter 11, which is now going through review and copyediting. Modified a number of examples throughout the book so that they will compile with Linux g++ (basically fixing case- sensitive naming issues). Revision 18—(August 2, 2003) Chapter 5 is complete. Chapter 11 is updated and is near completion. Updated the front matter and index entries. Home stretch now. Revision 17—(July 8, 2003) Chapters 5 and 11 are 90% done! Revision 16—(June 25, 2003) Chapter 5 text is almost complete, but enough is added to justify a separate posting. The example programs for Chapter 11 are also fairly complete. Added a matrix multiplication example to the valarray material in chapter 7. Chapter 7 has been tech-edited. Many corrections due to comments from users have been integrated into the text (thanks!). Revision 15—(March 1 ,2003) Fixed an omission in C10:CuriousSingleton.cpp. Chapters 9 and 10 have been tech-edited. Revision 14—(January ,2003) Fixed a number of fuzzy explanations in response to reader feedback (thanks!). Chapter 9 has been copy-edited. Revision 13—(December 31, 2002) Updated the exercises for Chapter 7. Finished rewriting Chapter 9. Added a template variation of Singleton to chapter 10. Updated the build directives. Fixed lots of stuff. Chapters 5 and 11 still await rewrite. Revision 12—(December 23, 2002) Added material on Design Patterns as Chapter 10 (Concurrency will move to Chapter 11).
    [Show full text]
  • Ferrite: a Judgmental Embedding of Session Types in Rust
    Ferrite: A Judgmental Embedding of Session Types in Rust RUOFEI CHEN, Independent Researcher, Germany STEPHANIE BALZER, Carnegie Mellon University, USA This paper introduces Ferrite, a shallow embedding of session types in Rust. In contrast to existing session type libraries and embeddings for mainstream languages, Ferrite not only supports linear session types but also shared session types. Shared session types allow sharing (aliasing) of channels while preserving session fidelity (preservation) using type modalities for acquiring and releasing sessions. Ferrite adopts a propositions as types approach and encodes typing derivations as Rust functions, with the proof of successful type-checking manifesting as a Rust program. We provide an evaluation of Ferrite using Servo as a practical example, and demonstrate how safe communication can be achieved in the canvas component using Ferrite. CCS Concepts: • Theory of computation ! Linear logic; Type theory; • Software and its engineering ! Domain specific languages; Concurrent programming languages. Additional Key Words and Phrases: Session Types, Rust, DSL ACM Reference Format: Ruofei Chen and Stephanie Balzer. 2021. Ferrite: A Judgmental Embedding of Session Types in Rust. In Proceedings of International Conference on Functional Programming (ICFP 2021). ACM, New York, NY, USA, 36 pages. 1 INTRODUCTION Message-passing concurrency is a dominant concurrency paradigm, adopted by mainstream lan- guages such as Erlang, Scala, Go, and Rust, putting the slogan “to share memory by communicating rather than communicating by sharing memory”[Gerrand 2010; Klabnik and Nichols 2018] into practice. In this setting, messages are exchanged over channels, which can be shared among several senders and recipients. Figure 1 provides a simplified example in Rust.
    [Show full text]
  • Asynchronous Liquid Separation Types
    Asynchronous Liquid Separation Types Johannes Kloos, Rupak Majumdar, and Viktor Vafeiadis Max Planck Institute for Software Systems, Germany { jkloos, rupak, viktor }@mpi-sws.org Abstract We present a refinement type system for reasoning about asynchronous programs manipulating shared mutable state. Our type system guarantees the absence of races and the preservation of user-specified invariants using a combination of two ideas: refinement types and concurrent separation logic. Our type system allows precise reasoning about programs using two ingredients. First, our types are indexed by sets of resource names and the type system tracks the effect of program execution on individual heap locations and task handles. In particular, it allows making strong updates to the types of heap locations. Second, our types track ownership of shared state across concurrently posted tasks and allow reasoning about ownership transfer between tasks using permissions. We demonstrate through several examples that these two ingredients, on top of the framework of liquid types, are powerful enough to reason about correct behavior of practical, complex, asynchronous systems manipulating shared heap resources. We have implemented type inference for our type system and have used it to prove complex invariants of asynchronous OCaml programs. We also show how the type system detects subtle concurrency bugs in a file system implementation. 1998 ACM Subject Classification F.3.1 Specifying and Verifying and Reasoning about Pro- grams, D.2.4 Software/Program Verification Keywords and phrases Liquid Types, Asynchronous Parallelism, Separation Logic, Type Sys- tems 1 Introduction Asynchronous programming is a common programming idiom used to handle concurrent interactions. It is commonly used not only in low-level systems code, such as operating systems kernels and device drivers, but also in internet services, in programming models for mobile applications, in GUI event loops, and in embedded systems.
    [Show full text]
  • Thinking in C++ 2Nd Edition Volume 2
    Thinking in C++ 2nd edition Volume 2: Standard Libraries & Advanced Topics To be informed of future releases of this document and other information about object- oriented books, documents, seminars and CDs, subscribe to my free newsletter. Just send any email to: [email protected] ________________________________________________________________________ “This book is a tremendous achievement. You owe it to yourself to have a copy on your shelf. The chapter on iostreams is the most comprehensive and understandable treatment of that subject I’ve seen to date.” Al Stevens Contributing Editor, Doctor Dobbs Journal “Eckel’s book is the only one to so clearly explain how to rethink program construction for object orientation. That the book is also an excellent tutorial on the ins and outs of C++ is an added bonus.” Andrew Binstock Editor, Unix Review “Bruce continues to amaze me with his insight into C++, and Thinking in C++ is his best collection of ideas yet. If you want clear answers to difficult questions about C++, buy this outstanding book.” Gary Entsminger Author, The Tao of Objects “Thinking in C++ patiently and methodically explores the issues of when and how to use inlines, references, operator overloading, inheritance and dynamic objects, as well as advanced topics such as the proper use of templates, exceptions and multiple inheritance. The entire effort is woven in a fabric that includes Eckel’s own philosophy of object and program design. A must for every C++ developer’s bookshelf, Thinking in C++ is the one C++ book you must have if you’re doing serious development with C++.” Richard Hale Shaw Contributing Editor, PC Magazine Thinking In C++ 2nd Edition, Volume 2 Bruce Eckel President, MindView Inc.
    [Show full text]
  • C for Java Programmers
    C for Java Programmers George Ferguson Summer 2016 (Updated Summer 2021) 2 Contents 1 Introduction7 2 Overview of Java and C9 2.1 What’s The Same?.........................9 2.2 What’s Different?.......................... 10 3 Development and Execution 11 3.1 Development and Execution in Java and C............. 11 3.2 Setting Up Your Development Environment............ 14 3.3 Writing Your First C Program................... 16 3.4 Compiling Your First C Program.................. 17 4 Basic Expressions and Statements 21 4.1 Comments.............................. 21 4.2 Primitive Types........................... 22 4.3 Producing Output.......................... 23 4.4 Operators and Expressions..................... 24 4.5 Variables and Assigment...................... 26 4.6 Arrays................................ 27 4.7 Strings................................ 29 3 4 CONTENTS 5 Control Flow 31 5.1 Conditional Statements....................... 31 5.2 Iteration Statements......................... 32 5.3 Other Control Flow Statements................... 33 6 Functions 35 6.1 Function Parameters and Arguments................ 36 6.2 Function Declarations........................ 37 7 Structured Types 39 8 Memory Management 43 8.1 Variables, Addresses, and Pointers................. 43 8.2 Passing Arguments by Reference.................. 46 8.3 Memory Allocation......................... 48 8.4 Dynamic Memory Allocation in Java................ 49 8.5 Dynamic Memory Allocation in C................. 50 8.6 Dynamic Arrays........................... 54 8.7 Dynamic Data Structures...................... 56 8.8 Function Pointers.......................... 64 9 Defining New Types 69 10 Sharing Code: Files and Libraries 73 10.1 The C Preprocessor......................... 73 10.2 Separate Compilation, Libraries, and Linking........... 75 10.3 Standard System Libraries..................... 76 CONTENTS 5 10.4 Project Development........................ 77 10.5 Building Larger C Programs.................... 79 11 Debugging a C Program 83 11.1 Debuggers.............................
    [Show full text]
  • Refining Expression Evaluation Order for Idiomatic C++ (Revision 2)
    P0145R1 2016-02-12 Reply-To: [email protected] Refining Expression Evaluation Order for Idiomatic C++ (Revision 2) Gabriel Dos Reis Herb Sutter Jonathan Caves Abstract This paper proposes an order of evaluation of operands in expressions, directly supporting decades-old established and recommended C++ idioms. The result is the removal of embarrassing traps for novices and experts alike, increased confidence and safety of popular programming practices and facilities, hallmarks of modern C++. 1. INTRODUCTION Order of expression evaluation is a recurring discussion topic in the C++ community. In a nutshell, given an expression such as f(a, b, c), the order in which the sub-expressions f, a, b, c (which are of arbitrary shapes) are evaluated is left unspecified by the standard. If any two of these sub-expressions happen to modify the same object without intervening sequence points, the behavior of the program is undefined. For instance, the expression f(i++, i) where i is an integer variable leads to undefined behavior, as does v[i] = i++. Even when the behavior is not undefined, the result of evaluating an expression can still be anybody’s guess. Consider the following program fragment: #include <map> int main() { std::map<int, int> m; m[0] = m.size(); // #1 } What should the map object m look like after evaluation of the statement marked #1? {{0, 0}} or {{0, 1}}? 1.1. CHANGES FROM PREVIOUS VERSIONS a. The original version of this proposal (Dos Reis, et al., 2014) received unanimous support from the Evolution Working Group (EWG) at the Fall 2014 meeting in Urbana, IL, as approved direction, and also strong support for inclusion in C++17.
    [Show full text]
  • Long X T 2021.Pdf (989.3Kb)
    Understanding Common Scratch Programming Idioms and Their Impact on Project Remixing Xingyu Long Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements for the degree of Master of Science in Computer Science and Applications Eli Tilevich, Chair Osman Balci Dimitrios S. Nikolopoulos May 10, 2021 Blacksburg, Virginia Keywords: Programming idioms; Scratch; Block-based programming; Novice programmers; Project remixing; Copyright 2021, Xingyu Long Understanding Common Scratch Programming Idioms and Their Im- pact on Project Remixing Xingyu Long (ABSTRACT) As Scratch has become one of the most popular educational programming languages, under- standing its common programming idioms can benefit both computing educators and learn- ers. This understanding can fine-tune the curricular development to help learners master the fundamentals of writing idiomatic code in their programming pursuits. Unfortunately, the research community’s understanding of what constitutes idiomatic Scratch code has been limited. To help bridge this knowledge gap, we systematically identified idioms as based on canonical source code, presented in widely available educational materials. We implemented a tool that automatically detects these idioms to assess their prevalence within a large dataset of over 70K Scratch projects in different demographic and project categories. Since communal learning and the practice of remixing are one of the cornerstones of the Scratch programming community, we studied the relationship between common programming idioms and remixes. Having analyzed the original projects and their remixes, we observed that different idioms may associate with dissimilar types of code changes. Code changes in remixes are desirable, as they require a meaningful programming effort that spurs the learning process.
    [Show full text]
  • Types You Can Count on Like Types for Javascript
    Types you can count on Like types for JavaScript Abstract As the need for error reporting was felt in industry, develop- Scripting languages support exploratory development by eschew- ers adopted a “dynamic first” philosophy, which led to the design ing static type annotations but fall short when small scripts turn into of optional type systems for languages such as Smalltalk, PHP and large programs. Academic researchers have proposed type systems JavaScript [3, 4, 11, 16, 20]. Optional type systems have two rather that integrate dynamic and static types, providing robust error re- surprising properties: they do not affect the program’s execution ports when the contract at the boundary between the two worlds is and they are unsound. These are design choices motivated by the violated. Safety comes at the price of pervasive runtime checks. On need to be backwards compatible and to capture popular program- the other hand, practitioners working on massive code bases have ming idioms used in scripting languages. Rather than limiting ex- come up with type systems that are unsound and that are ignored pressiveness of the language to what can shown to be type safe, at runtime, but which have proven useful in practice. This paper this design instead limits the type system. For example, Hack [20], proposes a new design for the JavaScript language, combining dy- Facebook’s optionally typed PHP dialect, has an UNSAFE direc- namic types, concrete types and like types to let developers pick tive that turns off type checking for the remainder of a block. Un- the level of guarantee that is appropriate for their code.
    [Show full text]