A Type-Checking Preprocessor for Cilk 2, a Multithreaded C Language

Total Page:16

File Type:pdf, Size:1020Kb

A Type-Checking Preprocessor for Cilk 2, a Multithreaded C Language A Typechecking Prepro cessor for Cilk a Multithreaded C Language by Rob ert C Miller Submitted to the Department of Electrical Engineering and Computer Science in partial fulllment of the requirements for the degrees of Bachelor of Science in Computer Science and Engineering and Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology May Copyright Massachusetts Institute of Technology All rights reserved Author ::::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::: Department of Electrical Engineering and Computer Science May Certied by ::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::: Charles E Leiserson Thesis Sup ervisor Accepted by ::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::: F R Morgenthaler Chairman Department Committee on Graduate Theses A Typechecking Prepro cessor for Cilk a Multithreaded C Language by Rob ert C Miller Submitted to the Department of Electrical Engineering and Computer Science on May in partial fulllment of the requirements for the degrees of Bachelor of Science in Computer Science and Engineering and Master of Engineering in Electrical Engineering and Computer Science Abstract This thesis describ es the typechecking optimizing translator that translates Cilk a C ex tension language for multithreaded parallel programming into C The CilktoC translator is based on CtoC a to ol I developed that parses type checks analyzes and regenerates a C program With a translator based on CtoC developers and users of C extension languages can enjoy the b enets of a typechecking optimizing compiler without the at tendant development and p orting costs Like a compiler the CilktoC translator runs quickly do es static checking and generates ecient co de making it t for everyday use by ordinary programmers Unlike a compiler however the CilktoC translator is easy to develop extend and p ort to other platforms Ab out of the development eort was devoted to Cilk semantics translating Cilk into C which is where a language developer wants to fo cus Only a small amount of work was sp ent extending CtoCs parsing type checking and dataow analysis to recognize Cilk In return for a few weeks of fulltime eort I obtained a typechecking optimizing translator for Cilk that targets any platform with an ANSI C compiler With CtoC other C extension language developers can obtain the same b enets CtoC has a number of sp ecial features that make it a go o d framework for C extension language translators First its C output is humanreadable so that a language developer or programmer can read and debug it CtoC also provides op erational transparency repro ducing nonsyntactic source features like line numbering and indentation which contributes to the readability and p ortability of its C output and assists debuggers and prolers Cto C p erforms dataow analysis directly on the highlevel abstract syntax tree of the source program so that its C output is also highlevel not unreadable assembly language written in C Finally CtoC provides a generic dataow analysis abstraction in which any mono tonic dataow problem needed for optimization can b e sp ecied and solved automatically These features greatly simplify the task of writing a typechecking optimizing translator for an arbitrary C extension language Thesis Sup ervisor Charles E Leiserson Title Professor of Computer Science and Engineering Contents Introduction One alternative a macro prepro cessor Another alternative a full compiler Related work Outline The Cilk Programming Language An example of Cilk fib Type checking Typedirected translations Optimizations The CtoC Typechecking Prepro cessor The abstract syntax tree Phases Macro prepro cessing cpp Parsing Type checking Analysis Transform Unparsing Transparency in CtoC Pragma directives Line numbering and indentation Constant expressions Dataow Analysis in CtoC Representing control ow in the AST Dataow analysis frameworks The iterative algorithm Conclusions Future work on CtoC Getting CtoC Acknowledgements This research was supp orted in part by the Advanced Research Pro jects Agency under Grant N I am indebted to my advisor Charles E Leiserson Yuli Zhou provided advice and supp ort throughout the development of the Cilk prepro cessor Thanks to Laura Cassenti Charles Leiserson Anil Somaya ji and Yuli Zhou who were generous with their time in agreeing to pro ofread this thesis any glaring errors that remain are solely the resp onsibili ty of the author Thanks also to the entire Cilk team for supp ort and suggestions Bobby Blumofe Matteo Frigo Chris Jo erg Bradley Kuszmaul Irena Kveraga Charles Leiserson Howard Lu Phil Lisiecki Keith Randall Richard Tauriello Daricha Techopitayakul and Yuli Zhou I gratefully acknowledge the warm supp ort of Laura Cassenti and my parents Larry and Marian Miller Chapter Introduction Cilk pronounced silk is a parallel programming language under development at MIT Lab oratory for Computer Science Cilk provides syntax for expressing control paral lelism allowing a programmer to sp ecify that certain pro cedure calls should b e spawned or run in parallel with the current thread of execution Cilk is an example of a C extension lan guage a programming language that extends C with new keywords syntax or semantics C extension languages have lately b ecome p opular in parallel and distributed computing research b ecause of the wide p ortability of ANSI C the large p opulation of C programmers and the extensive base of C applications and library software Like many C extension languages Cilk is not translated directly into ob ject co de In stead Cilk is translated into C with extension language features mapp ed into calls to a runtime system Figure shows the translation pro cess schematically The translator that converts a C extension language into C is called a preprocessor b ecause its output is a highlevel language rather than machine language The C postsource pro duced by the prepro cessor is compiled into ob ject co de by the target machines C compiler also called the backend compiler This compilation system a prepro cessor pip elined with a compiler yields p ortability and ease of development b ecause the extension language developer need not build a full compiler for every platform on which the language will run In fact simple C extension languages can b e translated into C by lo cal syntactic trans formations using a macro prepro cessor Cilk the rst incarnation of the Cilk language was translated by a macro prepro cessor Though easy to write and debug macro prepro ces sors suer a serious aw which manifested itself in Cilk A macro prepro cessor relies on the backend C compilers type checking to detect and rep ort common programmer errors Unfortunately type errors in Cilk programs are often missed by the backend compiler b ecause the C p ostsource contains lowlevel co de that overrides type checking Even when type errors are detected the C compilers error messages refer to the p ostsource which is Cilk Cilk C C obj preprocessor compiler source postsource object Figure The mechanism for compiling a Cilk program Cilk is prepro cessed into C which is then compiled into ob ject co de unhelpful to a Cilk programmer trying to nd and x the error in the source In addition macro prepro cessors are insucient for more abstract C extension languages that require their translators to gather semantic information ab out the program Cilk the latest version of the Cilk language includes features whose translation dep ends on the types of the ob jects involved so the Cilk prepro cessor must determine those types An ordinary macro prepro cessor has no access to type information Finally macro prepro cessors oer no opp ortunity for global optimization The backend C compiler p erforms global optimization when it generates ob ject co de but the compilers optimizations are necessarily conservative missing p otential optimizations that the prepro cessor can p erform Again Cilk serves as an example Gathering dataow information enables the Cilk prepro cessor to emit faster co de than a macro prepro cessor could This thesis studies typechecking optimizing prepro cessors for C extension languages using the Cilk prepro cessor as an example In the course of building the Cilk prepro cessor I have developed a generic prepro cessor framework for C extension languages called CtoC CtoC parses a C program into an abstract syntax tree AST representation p erforms type checking dataow analysis and tree transformations directly on the AST then unparses the AST to recover the original C program To build a C extension language prepro cessor based on CtoC a developer mo dies the frontend to recognize the extension language
Recommended publications
  • C Programming Language Review
    C Programming Language Review Embedded Systems 1 C: A High-Level Language Gives symbolic names to values – don’t need to know which register or memory location Provides abstraction of underlying hardware – operations do not depend on instruction set – example: can write “a = b * c”, even if CPU doesn’t have a multiply instruction Provides expressiveness – use meaningful symbols that convey meaning – simple expressions for common control patterns (if-then-else) Enhances code readability Safeguards against bugs – can enforce rules or conditions at compile-time or run-time Embedded Systems 2 A C Code “Project” • You will use an “Integrated Development Environment” (IDE) to develop, compile, load, and debug your code. • Your entire code package is called a project. Often you create several files to spilt the functionality: – Several C files – Several include (.h) files – Maybe some assembly language (.a30) files – Maybe some assembly language include (.inc) files • A lab, like “Lab7”, will be your project. You may have three .c, three .h, one .a30, and one .inc files. • More will be discussed in a later set of notes. Embedded Systems 3 Compiling a C Program C Source and Entire mechanism is usually called Header Files the “compiler” Preprocessor – macro substitution C Preprocessor – conditional compilation – “source-level” transformations • output is still C Compiler Source Code Compiler Analysis – generates object file Symbol Table Target Code • machine instructions Synthesis Linker – combine object files Library (including libraries) Linker into executable image Object Files Executable Image Embedded Systems 4 Compiler Source Code Analysis – “front end” – parses programs to identify its pieces • variables, expressions, statements, functions, etc.
    [Show full text]
  • Section “Common Predefined Macros” in the C Preprocessor
    The C Preprocessor For gcc version 12.0.0 (pre-release) (GCC) Richard M. Stallman, Zachary Weinberg Copyright c 1987-2021 Free Software Foundation, Inc. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.3 or any later version published by the Free Software Foundation. A copy of the license is included in the section entitled \GNU Free Documentation License". This manual contains no Invariant Sections. The Front-Cover Texts are (a) (see below), and the Back-Cover Texts are (b) (see below). (a) The FSF's Front-Cover Text is: A GNU Manual (b) The FSF's Back-Cover Text is: You have freedom to copy and modify this GNU Manual, like GNU software. Copies published by the Free Software Foundation raise funds for GNU development. i Table of Contents 1 Overview :::::::::::::::::::::::::::::::::::::::: 1 1.1 Character sets:::::::::::::::::::::::::::::::::::::::::::::::::: 1 1.2 Initial processing ::::::::::::::::::::::::::::::::::::::::::::::: 2 1.3 Tokenization ::::::::::::::::::::::::::::::::::::::::::::::::::: 4 1.4 The preprocessing language :::::::::::::::::::::::::::::::::::: 6 2 Header Files::::::::::::::::::::::::::::::::::::: 7 2.1 Include Syntax ::::::::::::::::::::::::::::::::::::::::::::::::: 7 2.2 Include Operation :::::::::::::::::::::::::::::::::::::::::::::: 8 2.3 Search Path :::::::::::::::::::::::::::::::::::::::::::::::::::: 9 2.4 Once-Only Headers::::::::::::::::::::::::::::::::::::::::::::: 9 2.5 Alternatives to Wrapper #ifndef ::::::::::::::::::::::::::::::
    [Show full text]
  • User's Manual
    rBOX610 Linux Software User’s Manual Disclaimers This manual has been carefully checked and believed to contain accurate information. Axiomtek Co., Ltd. assumes no responsibility for any infringements of patents or any third party’s rights, and any liability arising from such use. Axiomtek does not warrant or assume any legal liability or responsibility for the accuracy, completeness or usefulness of any information in this document. Axiomtek does not make any commitment to update the information in this manual. Axiomtek reserves the right to change or revise this document and/or product at any time without notice. No part of this document may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of Axiomtek Co., Ltd. Trademarks Acknowledgments Axiomtek is a trademark of Axiomtek Co., Ltd. ® Windows is a trademark of Microsoft Corporation. Other brand names and trademarks are the properties and registered brands of their respective owners. Copyright 2014 Axiomtek Co., Ltd. All Rights Reserved February 2014, Version A2 Printed in Taiwan ii Table of Contents Disclaimers ..................................................................................................... ii Chapter 1 Introduction ............................................. 1 1.1 Specifications ...................................................................................... 2 Chapter 2 Getting Started ......................................
    [Show full text]
  • SIPB's IAP Programming in C
    SIPB’s IAP Programming in C C was developed at AT&T Bell Labs between 1971 and 1973, by Dennis Ritchie. It was derived from an experimental language called B, which itself was a stripped-down version of BCPL. All of these are derivatives of the ALGOL family of languages, dating from the 1950s. This work was done on a PDP-11, a machine with a 16 bit address bus. While a PDP-11 should be able to address up to 64K of memory, these machines only had 24K. The C compiler was written in C, and was able to compile itself on these machines, running under the nascent Unix operating system. #include <stdio.h> main() The classic { printf("hello, world\n"); } /* * %% - a percent sign * %s - a string * %d - 32-bit integer, base 10 * %lld - 64-bit integer, base 10 * %x - 32-bit integer, base 16 * %llx - 64-bit integer, base 16 * %f, %e, %g - double precision * floating point number */ /* * the following should produce: A word on printf() * * printf() test: * string: 'string test' * number: 22 * float: 18.19 */ printf("printf() test:\n" " string: '%s'\n" " number: %d\n" " float: %g\n", "string test", 22, 18.19); Language Structure char i_8; There are five different /* -128 to 127 */ kinds of integer: unsigned char ui_8; /* 0 to 255 */ - char short i_16; - short /* -32768 to 32767 */ unsigned short ui_16; - int /* 0 to 65536 */ int i_32; - long /* -2147483648 to 2147483647 */ unsigned int ui_32; - long long /* 0 to 4294967295U */ long i_arch; unsigned ui_arch; Each of these can be /* architecture either: * dependent */ long long i64; - signed (the default)
    [Show full text]
  • LTIB Quick Start: Targeting the Coldfire Mcf54418tower Board By: Soledad Godinez and Jaime Hueso Guadalajara Mexico
    Freescale Semiconductor Document Number: AN4426 Application Note Rev. 0, December 2011 LTIB Quick Start: Targeting the ColdFire MCF54418Tower Board by: Soledad Godinez and Jaime Hueso Guadalajara Mexico Contents 1 Introduction 1 Introduction................................................................1 The purpose of this document is to indicate step-by-step how 2 LTIB..........................................................................1 to accomplish these tasks: • Install the BSP on a host development system. 2.1 Introduction....................................................1 • Run Linux Target Image Builder (LTIB) to build target 2.2 Installing the BSP...........................................2 images. • Boot Linux on the ColdFire MCF54418 Tower board. 2.3 Running LTIB................................................7 3 Target deployment .................................................13 This document is a guide for people who want to properly set up the Freescale Linux Target Image Builder (LTIB) for the 3.1 The Tower kit ..............................................13 ColdFire MCF54418 Tower Board Support Package (BSP). 3.2 Constructing the Tower kit...........................15 3.3 Programming U-boot, kernel, and root file system.............................................16 2 LTIB 3.4 Configuring U-boot......................................24 3.5 Running.......................................................25 4 Conclusions.............................................................26 2.1 Introduction Freescale GNU/Linux
    [Show full text]
  • Tinyos Programming Philip Levis and David Gay Frontmatter More Information
    Cambridge University Press 978-0-521-89606-1 - TinyOS Programming Philip Levis and David Gay Frontmatter More information TinyOS Programming Do you need to know how to write systems, services, and applications using the TinyOS operating system? Learn how to write nesC code and efficient applications with this indispensable guide to TinyOS programming. Detailed examples show you how to write TinyOS code in full, from basic applications right up to new low-level systems and high-performance applications. Two leading figures in the development of TinyOS also explain the reasons behind many of the design decisions made and explain for the first time how nesC relates to and differs from other C dialects. Handy features such as a library of software design patterns, programming hints and tips, end-of-chapter exercises, and an appendix summarizing the basic application-level TinyOS APIs make this the ultimate guide to TinyOS for embedded systems programmers, developers, designers, and graduate students. Philip Levis is Assistant Professor of Computer Science and Electrical Engineering at Stanford University. A Fellow of the Microsoft Research Faculty, he is also Chair of the TinyOS Core Working Group and a Member of the TinyOS Network Protocol (net2), Simulation (sim), and Documentation (doc) Working Groups. David Gay joined Intel Research in Berkeley in 2001, where he has been a designer and the principal implementer of the nesC language, the C dialect used to implement the TinyOS sensor network operating system, and its applications. He has a diploma in Computer Science from the Swiss Federal Institute of Technology in Lausanne and a Ph.D.
    [Show full text]
  • Nesc 1.2 Language Reference Manual
    nesC 1.2 Language Reference Manual David Gay, Philip Levis, David Culler, Eric Brewer August 2005 1 Introduction nesC is an extension to C [3] designed to embody the structuring concepts and execution model of TinyOS [2]. TinyOS is an event-driven operating system designed for sensor network nodes that have very limited resources (e.g., 8K bytes of program memory, 512 bytes of RAM). TinyOS has been reimplemented in nesC. This manual describes v1.2 of nesC, changes from v1.0 and v1.1 are summarised in Section 2. The basic concepts behind nesC are: • Separation of construction and composition: programs are built out of components, which are assembled (“wired”) to form whole programs. Components define two scopes, one for their specification (containing the names of their interfaces) and one for their implementation. Components have internal concurrency in the form of tasks. Threads of control may pass into a component through its interfaces. These threads are rooted either in a task or a hardware interrupt. • Specification of component behaviour in terms of set of interfaces. Interfaces may be provided or used by the component. The provided interfaces are intended to represent the functionality that the component provides to its user, the used interfaces represent the functionality the component needs to perform its job. • Interfaces are bidirectional: they specify a set of functions to be implemented by the inter- face’s provider (commands) and a set to be implemented by the interface’s user (events). This allows a single interface to represent a complex interaction between components (e.g., registration of interest in some event, followed by a callback when that event happens).
    [Show full text]
  • N2429 V 1 Function Failure Annotation
    ISO/IEC JTC 1/SC 22/WG14 N2429 September 23, 2019 v 1 Function failure annotation Niall Douglas Jens Gustedt ned Productions Ltd, Ireland INRIA & ICube, Université de Strasbourg, France We have been seeing an evolution in proposals for the best syntax for describing how to mark how a function fails, such that compilers and linkers can much better optimise function calls than at present. These papers were spurred by [WG21 P0709] Zero-overhead deterministic exceptions, which proposes deterministic exception handling for C++, whose implementation could be very compatible with C code. P0709 resulted in [WG21 P1095] Zero overhead deterministic failure – A unified mecha- nism for C and C++ which proposed a C syntax and exception-like mechanism for supporting C++ deterministic exceptions. That was responded to by [WG14 N2361] Out-of-band bit for exceptional return and errno replacement which proposed an even wider reform, whereby the most popular idioms for function failure in C code would each gain direct calling convention support in the compiler. This paper synthesises all of the proposals above into common proposal, and does not duplicate some of the detail in the above papers for brevity and clarity. It hews closely to what [WG14 N2361] proposes, but with different syntax, and in addition it shows how the corresponding C++ features can be constructed on top of it, without loosing call compatibility with C. Contents 1 Introduction 2 1.1 Differences between WG14 N2361 and WG21 P1095 . .3 2 Proposed Design 4 2.1 errno-related function failure editions . .5 2.2 When to choose which edition .
    [Show full text]
  • C Programming
    C PROGRAMMING 6996 Columbia Gateway Drive Suite 100 Columbia, MD 21046 Tel: 443-692-6600 http://www.umbctraining.com C PROGRAMMING Course # TCPRG3000 Rev. 10/14/2016 ©2016 UMBC Training Centers 1 C PROGRAMMING This Page Intentionally Left Blank ©2016 UMBC Training Centers 2 C PROGRAMMING Course Objectives ● At the conclusion of this course, students will be able to: ⏵ Write non-trivial C programs. ⏵ Use data types appropriate to specific programming problems. ⏵ Utilize the modular features of the C languages. ⏵ Demonstrate efficiency and readability. ⏵ Use the various control flow constructs. ⏵ Create and traverse arrays. ⏵ Utilize pointers to efficiently solve problems. ⏵ Create and use structures to implement new data types. ⏵ Use functions from the C runtime library. ©2016 UMBC Training Centers 3 C PROGRAMMING This Page Intentionally Left Blank ©2016 UMBC Training Centers 4 C PROGRAMMING Table of Contents Chapter 1: Getting Started..............................................................................................9 What is C?..................................................................................................................10 Sample Program.........................................................................................................11 Components of a C Program......................................................................................13 Data Types.................................................................................................................14 Variables.....................................................................................................................16
    [Show full text]
  • The C-- Language Reference Manual
    The C-- Language Reference Manual Simon Peyton Jones Thomas Nordin Dino Oliva Pablo Nogueira Iglesias April 23, 1998 Contents 1 Introduction 3 2 Syntax definition 3 2.1 General ......................................... 3 2.2 Comments........................................ 6 2.3 Names.......................................... 6 2.4 Namescope....................................... 6 2.5 The import and export declarations ........................ 6 2.6 Constants ....................................... 7 2.6.1 Integer and floating point numbers . ...... 7 2.6.2 Characters and strings . ... 7 3 Fundamental concepts in C-- 7 3.1 Memory......................................... 7 3.2 Datasegment ..................................... 8 3.3 Codesegment..................................... 8 3.4 Types .......................................... 8 3.5 Local variables (or registers) . ......... 8 3.6 Addresses ....................................... 9 3.7 Names.......................................... 9 3.8 Foreign language interface . ....... 9 4 Data layout directives 9 4.1 Labels.......................................... 10 4.2 Initialisation.................................. ..... 10 4.3 Alignment....................................... 12 1 5 Procedures 12 5.1 Proceduredefinition..... ...... ..... ...... ...... .. ..... 12 5.2 Statements...................................... 13 5.2.1 skip; ..................................... 13 5.2.2 Declaration ................................... 13 5.2.3 Assignment................................... 13 5.2.4
    [Show full text]
  • Lesson 01 Programming with C
    Lesson 01 Programming with C Pijushkanti Panigrahi What is C ? • The C is a programming Language, developed by Dennis Ritchie for creating system applications that directly interact with the hardware devices such as drivers, kernels, etc. • C programming is considered as the base for other programming languages, that is why it is known as mother language. • C is famous for its Compactness. • C language is case sensitive. 2 Features It can be defined by the following ways: Mother language System programming language Procedure-oriented programming language Structured programming language Mid-level programming language 3 1) C as a mother language ? • C language is considered as the mother language of all the modern programming languages because most of the compilers, JVMs, Kernels, etc. are written in C language , and most of the programming languages follow C syntax, for example, C++, Java, C#, etc. • It provides the core concepts like the array , strings , functions , file handling , etc . that are being used in many languages like C++ , Java , C# , etc. 4 2) C as a system programming language • A system programming language is used to create system software. • C language is a system programming language because it can be used to do low-level programming (for example driver and kernel) . • It is generally used to create hardware devices, OS, drivers, kernels, etc. For example, Linux kernel is written in C. 5 3) C as a procedural language • A procedure is known as a function, method, routine, subroutine, etc. A procedural language specifies a series of steps for the program to solve the problem.
    [Show full text]
  • Tinyos Programming
    TinyOS Programming Philip Levis and David Gay July 16, 2009 ii Acknolwedgements We’d like to thank several people for their contributions to this book. First is Mike Horton, of Crossbow, Inc., who first proposed writing it. Second is Pablo Guerrero, who gave detailed comments and corrections. Third is Joe Polastre of Moteiv, who gave valable feedback on how to better introduce generic components. Fourth, we’d like to thank Phil’s father, who although he doesn’t program, read the entire thing! Fifth, John Regehr, Ben Greenstein and David Culler provided valuable feedback on this expanded edition. Last but not least, we would like to thank the TinyOS community and its developers. Many of the concepts in this book – power locks, tree routing, and interface type checking – are the work and ideas of others, which we merely present. Chapter 10 of this book is based on: Software design patterns for TinyOS, in ACM Transactions on Embedded Computing Systems (TECS), Volume 6, Issue 4 (September 2007), c ACM, 2007. http://doi.acm.org/10.1145/1274858.1274860 Contents I TinyOS and nesC 1 1 Introduction 3 1.1 Networked, Embedded Sensors . 3 1.1.1 Anatomy of a Sensor Node (Mote) . 4 1.2 TinyOS . 5 1.2.1 What TinyOS provides . 5 1.3 Example Application . 6 1.4 Compiling and Installing Applications . 7 1.5 The rest of this book . 7 2 Names and Program Structure 9 2.1 Hello World! . 9 2.2 Essential Differences: Components, Interfaces and Wiring . 12 2.3 Wiring and Callbacks . 13 2.4 Summary .
    [Show full text]