A Type-Checking Preprocessor for Cilk 2, a Multithreaded C Language
Total Page:16
File Type:pdf, Size:1020Kb
A Typechecking Prepro cessor for Cilk a Multithreaded C Language by Rob ert C Miller Submitted to the Department of Electrical Engineering and Computer Science in partial fulllment of the requirements for the degrees of Bachelor of Science in Computer Science and Engineering and Master of Engineering in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology May Copyright Massachusetts Institute of Technology All rights reserved Author ::::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::: Department of Electrical Engineering and Computer Science May Certied by ::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::: Charles E Leiserson Thesis Sup ervisor Accepted by ::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::: F R Morgenthaler Chairman Department Committee on Graduate Theses A Typechecking Prepro cessor for Cilk a Multithreaded C Language by Rob ert C Miller Submitted to the Department of Electrical Engineering and Computer Science on May in partial fulllment of the requirements for the degrees of Bachelor of Science in Computer Science and Engineering and Master of Engineering in Electrical Engineering and Computer Science Abstract This thesis describ es the typechecking optimizing translator that translates Cilk a C ex tension language for multithreaded parallel programming into C The CilktoC translator is based on CtoC a to ol I developed that parses type checks analyzes and regenerates a C program With a translator based on CtoC developers and users of C extension languages can enjoy the b enets of a typechecking optimizing compiler without the at tendant development and p orting costs Like a compiler the CilktoC translator runs quickly do es static checking and generates ecient co de making it t for everyday use by ordinary programmers Unlike a compiler however the CilktoC translator is easy to develop extend and p ort to other platforms Ab out of the development eort was devoted to Cilk semantics translating Cilk into C which is where a language developer wants to fo cus Only a small amount of work was sp ent extending CtoCs parsing type checking and dataow analysis to recognize Cilk In return for a few weeks of fulltime eort I obtained a typechecking optimizing translator for Cilk that targets any platform with an ANSI C compiler With CtoC other C extension language developers can obtain the same b enets CtoC has a number of sp ecial features that make it a go o d framework for C extension language translators First its C output is humanreadable so that a language developer or programmer can read and debug it CtoC also provides op erational transparency repro ducing nonsyntactic source features like line numbering and indentation which contributes to the readability and p ortability of its C output and assists debuggers and prolers Cto C p erforms dataow analysis directly on the highlevel abstract syntax tree of the source program so that its C output is also highlevel not unreadable assembly language written in C Finally CtoC provides a generic dataow analysis abstraction in which any mono tonic dataow problem needed for optimization can b e sp ecied and solved automatically These features greatly simplify the task of writing a typechecking optimizing translator for an arbitrary C extension language Thesis Sup ervisor Charles E Leiserson Title Professor of Computer Science and Engineering Contents Introduction One alternative a macro prepro cessor Another alternative a full compiler Related work Outline The Cilk Programming Language An example of Cilk fib Type checking Typedirected translations Optimizations The CtoC Typechecking Prepro cessor The abstract syntax tree Phases Macro prepro cessing cpp Parsing Type checking Analysis Transform Unparsing Transparency in CtoC Pragma directives Line numbering and indentation Constant expressions Dataow Analysis in CtoC Representing control ow in the AST Dataow analysis frameworks The iterative algorithm Conclusions Future work on CtoC Getting CtoC Acknowledgements This research was supp orted in part by the Advanced Research Pro jects Agency under Grant N I am indebted to my advisor Charles E Leiserson Yuli Zhou provided advice and supp ort throughout the development of the Cilk prepro cessor Thanks to Laura Cassenti Charles Leiserson Anil Somaya ji and Yuli Zhou who were generous with their time in agreeing to pro ofread this thesis any glaring errors that remain are solely the resp onsibili ty of the author Thanks also to the entire Cilk team for supp ort and suggestions Bobby Blumofe Matteo Frigo Chris Jo erg Bradley Kuszmaul Irena Kveraga Charles Leiserson Howard Lu Phil Lisiecki Keith Randall Richard Tauriello Daricha Techopitayakul and Yuli Zhou I gratefully acknowledge the warm supp ort of Laura Cassenti and my parents Larry and Marian Miller Chapter Introduction Cilk pronounced silk is a parallel programming language under development at MIT Lab oratory for Computer Science Cilk provides syntax for expressing control paral lelism allowing a programmer to sp ecify that certain pro cedure calls should b e spawned or run in parallel with the current thread of execution Cilk is an example of a C extension lan guage a programming language that extends C with new keywords syntax or semantics C extension languages have lately b ecome p opular in parallel and distributed computing research b ecause of the wide p ortability of ANSI C the large p opulation of C programmers and the extensive base of C applications and library software Like many C extension languages Cilk is not translated directly into ob ject co de In stead Cilk is translated into C with extension language features mapp ed into calls to a runtime system Figure shows the translation pro cess schematically The translator that converts a C extension language into C is called a preprocessor b ecause its output is a highlevel language rather than machine language The C postsource pro duced by the prepro cessor is compiled into ob ject co de by the target machines C compiler also called the backend compiler This compilation system a prepro cessor pip elined with a compiler yields p ortability and ease of development b ecause the extension language developer need not build a full compiler for every platform on which the language will run In fact simple C extension languages can b e translated into C by lo cal syntactic trans formations using a macro prepro cessor Cilk the rst incarnation of the Cilk language was translated by a macro prepro cessor Though easy to write and debug macro prepro ces sors suer a serious aw which manifested itself in Cilk A macro prepro cessor relies on the backend C compilers type checking to detect and rep ort common programmer errors Unfortunately type errors in Cilk programs are often missed by the backend compiler b ecause the C p ostsource contains lowlevel co de that overrides type checking Even when type errors are detected the C compilers error messages refer to the p ostsource which is Cilk Cilk C C obj preprocessor compiler source postsource object Figure The mechanism for compiling a Cilk program Cilk is prepro cessed into C which is then compiled into ob ject co de unhelpful to a Cilk programmer trying to nd and x the error in the source In addition macro prepro cessors are insucient for more abstract C extension languages that require their translators to gather semantic information ab out the program Cilk the latest version of the Cilk language includes features whose translation dep ends on the types of the ob jects involved so the Cilk prepro cessor must determine those types An ordinary macro prepro cessor has no access to type information Finally macro prepro cessors oer no opp ortunity for global optimization The backend C compiler p erforms global optimization when it generates ob ject co de but the compilers optimizations are necessarily conservative missing p otential optimizations that the prepro cessor can p erform Again Cilk serves as an example Gathering dataow information enables the Cilk prepro cessor to emit faster co de than a macro prepro cessor could This thesis studies typechecking optimizing prepro cessors for C extension languages using the Cilk prepro cessor as an example In the course of building the Cilk prepro cessor I have developed a generic prepro cessor framework for C extension languages called CtoC CtoC parses a C program into an abstract syntax tree AST representation p erforms type checking dataow analysis and tree transformations directly on the AST then unparses the AST to recover the original C program To build a C extension language prepro cessor based on CtoC a developer mo dies the frontend to recognize the extension language