Compiler Construction
PDF generated using the open source mwlib toolkit. See http://code.pediapress.com/ for more information.
PDF generated at: Sat, 10 Dec 2011 02:23:02 UTC

Contents

Articles

Introduction
  Compiler construction; Compiler; Interpreter; History of compiler writing

Lexical analysis
  Lexical analysis; Regular expression; Regular expression examples; Finite-state machine; Preprocessor

Syntactic analysis
  Parsing; Lookahead; Symbol table; Abstract syntax; Abstract syntax tree; Context-free grammar; Terminal and nonterminal symbols; Left recursion; Backus–Naur Form; Extended Backus–Naur Form; TBNF; Top-down parsing; Recursive descent parser; Tail recursive parser; Parsing expression grammar; LL parser; LR parser; Parsing table; Simple LR parser; Canonical LR parser; GLR parser; LALR parser; Recursive ascent parser; Parser combinator; Bottom-up parsing; Chomsky normal form; CYK algorithm; Simple precedence grammar; Simple precedence parser; Operator-precedence grammar; Operator-precedence parser; Shunting-yard algorithm; Chart parser; Earley parser; The lexer hack; Scannerless parsing

Semantic analysis
  Attribute grammar; L-attributed grammar; LR-attributed grammar; S-attributed grammar; ECLR-attributed grammar; Intermediate language; Control flow graph; Basic block; Call graph; Data-flow analysis; Use-define chain; Live variable analysis; Reaching definition; Three address code; Static single assignment form; Dominator; C3 linearization; Intrinsic function; Aliasing; Alias analysis; Array access analysis; Pointer analysis; Escape analysis; Shape analysis; Loop dependence analysis; Program slicing

Code optimization
  Compiler optimization; Peephole optimization; Copy propagation; Constant folding; Sparse conditional constant propagation; Common subexpression elimination; Partial redundancy elimination; Global value numbering; Strength reduction; Bounds-checking elimination; Inline expansion; Return value optimization; Dead code; Dead code elimination; Unreachable code; Redundant code; Jump threading; Superoptimization; Loop optimization; Induction variable; Loop fission; Loop fusion; Loop inversion; Loop interchange; Loop-invariant code motion; Loop nest optimization; Manifest expression; Polytope model; Loop unwinding; Loop splitting; Loop tiling; Loop unswitching; Interprocedural optimization; Whole program optimization; Adaptive optimization; Lazy evaluation; Partial evaluation; Profile-guided optimization; Automatic parallelization; Loop scheduling; Vectorization; Superword Level Parallelism

Code generation
  Code generation; Name mangling; Register allocation; Chaitin's algorithm; Rematerialization; Sethi-Ullman algorithm; Data structure alignment; Instruction selection; Instruction scheduling; Software pipelining; Trace scheduling; Just-in-time compilation; Bytecode; Dynamic compilation; Dynamic recompilation; Object file; Code segment; Data segment; .bss; Literal pool; Overhead code; Link time; Relocation; Library; Static build; Architecture Neutral Distribution Format

Development techniques
  Bootstrapping; Compiler correctness; Jensen's Device; Man or boy test; Cross compiler; Source-to-source compiler

Tools
  Compiler-compiler; PQCC; Compiler Description Language; Comparison of regular expression engines; Comparison of parser generators; Lex; flex lexical analyser; Quex; JLex; Ragel; yacc; Berkeley Yacc; ANTLR; GNU bison; Coco/R; GOLD; JavaCC; JetPAG; Lemon Parser Generator; ROSE compiler framework; SableCC; Scannerless Boolean Parser; Spirit Parser Framework; S/SL programming language; SYNTAX; Syntax Definition Formalism; TREE-META; Frameworks supporting the polyhedral model

Case studies
  GNU Compiler Collection; Java performance

Literature
  Compilers: Principles, Techniques, and Tools; Principles of Compiler Design; The Design of an Optimizing Compiler

References
  Article Sources and Contributors; Image Sources, Licenses and Contributors

Article Licenses
  License

Introduction

Compiler construction

Compiler construction is an area of computer science that deals with the theory and practice of developing programming languages and their associated compilers.

The theoretical portion is primarily concerned with syntax, grammar and semantics of programming languages. One could say that this gives this particular area of computer science a strong tie with linguistics. Some courses on compiler construction will include a simplified grammar of a spoken language that can be used to form a valid sentence for the purposes of providing students with an analogy to help them understand how grammar works for programming languages.

The practical portion covers actual implementation of compilers for languages. Students will typically end up writing the front end of a compiler for a simplistic teaching language, such as Micro.

Subfields
• Parsing
• Program analysis
• Program transformation
• Compiler or program optimization
• Code generation

Further reading
• Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools.
• Michael Wolfe. High-Performance Compilers for Parallel Computing. ISBN 978-0805327304

External links
• Let's Build a Compiler, by Jack Crenshaw [1], A tutorial on compiler construction.

References
[1] http://compilers.iecc.com/crenshaw/

Compiler

A compiler is a computer program (or set of programs) that transforms source code written in a programming language (the source language) into another computer language (the target language, often having a binary form known as object code). The most common reason for wanting to transform source code is to create an executable program.

[Figure: A diagram of the operation of a typical multi-language, multi-target compiler]

The name "compiler" is primarily used for programs that translate source code from a high-level programming language to a lower level language (e.g., assembly language or machine code). If the compiled program can run on a computer whose CPU or operating system is different from the one on which the compiler runs, the compiler is known as a cross-compiler. A program that translates from a low level language to a higher level one is a decompiler. A program that translates between high-level languages is usually called a language translator, source to source translator, or language converter. A language rewriter is usually a program that translates the form of expressions without a change of language.

A compiler is likely to perform many or all of the following operations: lexical analysis, preprocessing, parsing, semantic analysis (Syntax-directed translation), code generation, and code optimization.

Program faults caused by incorrect compiler behavior can be very difficult to track down and work around; therefore, compiler implementors invest a lot of time ensuring the correctness of their software.

The term compiler-compiler is sometimes used to refer to a parser generator, a tool often used to help create the lexer and parser.

History

Software for early computers was primarily written in assembly language for many years.
Higher-level programming languages were not invented until the benefits of being able to reuse software on different kinds of CPUs started to become significantly greater than the cost of writing a compiler. The very limited memory capacity of early computers also created many technical problems when implementing a compiler.

Towards the end of the 1950s, machine-independent programming languages were first proposed. Subsequently, several experimental compilers were developed. The first compiler was written by Grace Hopper, in 1952, for the A-0 programming language. The FORTRAN team led by John Backus at IBM is generally credited as having introduced the first complete compiler in 1957. COBOL was an early language to be compiled on multiple architectures, in 1960.[1]

In many application domains the idea of using a higher-level language quickly caught on. Because of the expanding functionality supported by newer programming languages and the increasing complexity of computer architectures, compilers have become more and more complex.

Early compilers were written in assembly language. The first self-hosting compiler, capable of compiling its own source code in a high-level language, was created for Lisp by Tim Hart and Mike Levin at MIT in 1962.[2] Since the 1970s it has become common practice to implement a compiler in the language it compiles, although both Pascal and C have been popular choices for implementation language. Building a self-hosting compiler is a bootstrapping problem: the first such compiler for a language must be compiled either by a compiler written in a different language, or (as in Hart and Levin's Lisp compiler) compiled by running the compiler in an interpreter.

Compilers in education

Compiler construction and compiler optimization are taught at universities and schools as part of the computer science curriculum. Such courses are usually supplemented with the implementation of a compiler for an educational