
Common Compiler Infrastructure: Compiler Writer’s Perspective

Eugene A. Zueff, Institute for Computer Systems, ETH Zürich [email protected]

June 26th, 2003

Talk Overview

• CCI Framework: What is it and what is it for?
• CCI Architecture, Major Components & Features: An Overview.
• CCI Intermediate Representation.
• IR Transformation: Visitors.
• CCI Compilation Model & Integration Issues.
• Example: A CCI-based Compiler.
• Current State & Conclusion.

CCI at a Glance

• CCI = Common Compiler Infrastructure.
• Technically, CCI is a set of resources (classes) providing support for implementing compilers and other language tools for the .NET platform:
  - Compiler implementation
  - Compiler integration
• Conceptually, CCI is a part of the .NET Framework SDK.

CCI at a Glance (2)

[Figure: VS.NET with its extensibility mechanisms Macros, Add-Ins, and VSIP; the second build of the slide adds BabelService and CCI.]

CCI: Possible Scenarios

• Integrate an existing (“non-CCI”) compiler into VS.NET.
• Develop a completely CCI-based compiler and integrate it into Visual Studio .NET.
• Extend existing .NET languages & compilers (C#, VB etc.).
• Develop post-compilation tools.
• Develop “toy” compilers (either command-line or VS-integrated) for educational purposes!

CCI: Three Major Concerns

• (Common Concern) Developing compilers is a challenging task; integration brings a lot of additional issues.
• (CCI Concern) CCI implements a radically different (non-conventional) view of the compilation process.
• (Technical Concern) CCI has a wide and non-trivial interface, with its own rules and contracts.

CCI Major Parts

• Intermediate Representation (IR): a rich hierarchy of C# classes representing the most common and typical notions of modern programming languages. (System.Compiler.dll)
• Transformers (“Visitors”): a set of classes performing consecutive transformations IR ⇒ MSIL. (System.Compiler.Framework.dll)
• Integration Service: a variety of classes and methods providing integration into the Visual Studio environment (additional functionality required for editing, debugging, background compilation etc.).

CCI Way of Use: Principles

Common principles of using CCI:

• CCI services are represented as classes. To make use of them, the compiler writer should define classes inherited from the CCI ones.
• Derived classes should implement some abstract methods declared in the base classes (these methods compose a “unified interface” with the environment).
• Derived classes may (and typically do) also implement some language-specific functionality.
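A minimal, self-contained sketch of these three principles. The classes LanguageServiceBase and ZLanguageService are hypothetical stand-ins invented for illustration; they are not CCI classes and only imitate the pattern of deriving from a framework class.

```csharp
using System;

// Hypothetical stand-in for a framework base class: NOT the real CCI API.
public abstract class LanguageServiceBase
{
    // Part of the "unified interface": the environment calls this method,
    // every language must implement it.
    public abstract string[] Tokenize(string source);

    // Functionality implemented once in the framework, on top of the
    // abstract method.
    public int CountTokens(string source)
    {
        return Tokenize(source).Length;
    }
}

// The compiler writer's class: inherits from the framework class.
public sealed class ZLanguageService : LanguageServiceBase
{
    // Implementation of the abstract method required by the base class.
    public override string[] Tokenize(string source)
    {
        return source.Split(new[] { ' ', '\t', '\n' },
                            StringSplitOptions.RemoveEmptyEntries);
    }

    // Language-specific functionality that the base class knows nothing about.
    public bool IsKeyword(string token)
    {
        return token == "module" || token == "begin" || token == "end";
    }
}
```

The environment would only ever see the base-class type, for example `LanguageServiceBase s = new ZLanguageService();`, and call `s.CountTokens(...)` without knowing which language sits behind it.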

CCI Way of Use: Parser Example

Prototype parser:

    using System.Compiler;

    namespace ZLanguageCompiler
    {
        // System.Compiler.Parser is an abstract class from CCI.
        public sealed class ZParser : System.Compiler.Parser
        {
            // Parser's "unified interface": implementation of the
            // interface between the Z compiler and the environment.
            public override ... ParseCompilationUnit(...)
            {
                . . .   // calls into the Z parser's own logic
            }

            // Z parser's own logic.
            private ... ParseZModule(...)
            {
                . . .
            }
        }
    }

CCI Intermediate Representation (1)

The Central Part of CCI: classes representing all language concepts supported by CLI.

Example: a C# class

    public class C
    {
        public int m1;
        public void f() { m1 = 0; }
    }

[Figure: IR tree for class C. The Class node has Name (Identifier), Flags, and Members. Its members are a Field node with Name (Identifier), Flags, and Type (Int32), and a Method node with Name (Identifier), Flags, Type (Void), and Body (a Block whose Statements contain an AssignmentStatement).]

CCI Intermediate Representation (2)
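A minimal sketch of how an IR tree like the one for class C could be assembled and inspected. All classes below are hypothetical, heavily simplified stand-ins that only imitate the shape of the tree; the real CCI node classes carry far more information, and their constructors and members are not shown in this talk.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature node classes: NOT the CCI classes.
public abstract class Node { }

public abstract class Member : Node
{
    public string Name;
}

public sealed class Field : Member
{
    public string Type;
}

public sealed class AssignmentStatement : Node
{
    public string Target;
    public string Source;
}

public sealed class Method : Member
{
    public string ReturnType;
    public List<Node> Body = new List<Node>();
}

public sealed class Class : Member
{
    public List<Member> Members = new List<Member>();
}

public static class IrDemo
{
    // Builds the tree for:
    //   public class C { public int m1; public void f() { m1 = 0; } }
    public static Class BuildC()
    {
        var f = new Method { Name = "f", ReturnType = "Void" };
        f.Body.Add(new AssignmentStatement { Target = "m1", Source = "0" });

        var c = new Class { Name = "C" };
        c.Members.Add(new Field { Name = "m1", Type = "Int32" });
        c.Members.Add(f);
        return c;
    }

    // Renders the tree as an indented outline, one node per line.
    public static string Dump(Class c)
    {
        var lines = new List<string>();
        lines.Add("Class " + c.Name);
        foreach (Member m in c.Members)
        {
            Field field = m as Field;
            if (field != null)
                lines.Add("  Field " + field.Name + " : " + field.Type);

            Method method = m as Method;
            if (method != null)
            {
                lines.Add("  Method " + method.Name + " : " + method.ReturnType);
                foreach (Node s in method.Body)
                {
                    AssignmentStatement a = s as AssignmentStatement;
                    if (a != null)
                        lines.Add("    Assign " + a.Target + " = " + a.Source);
                }
            }
        }
        return string.Join("\n", lines);
    }
}
```

Calling `IrDemo.Dump(IrDemo.BuildC())` yields a four-line outline: the class, its field, its method, and the assignment inside the method body.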

A part of the IR inheritance tree:

    Node
      Expression
        UnaryExpression
        BinaryExpression
        NaryExpression
          MethodCall
          Indexer
        AssignmentExpression
        Literal
        Parameter
        This
      Statement
        AssignmentStatement
        If
        For
        ForEach
        Continue
        ExpressionStatement
        VariableDeclaration
      Member
        TypeNode
          Class
          DelegateNode
          EnumNode
          Interface
          . . .
          TypeParameter
          Pointer
          Reference
        Event
        Method
          InstanceInitializer
          StaticInitializer
        Field
        Property
        Namespace
      CompilationUnit

CCI Intermediate Representation (3)

Example:

    public class If : Statement
    {
        Expression condition;
        Block falseBlock;
        Block trueBlock;
        . . .
    }

    public class Block : Statement
    {
        bool hasLocals;
        StatementList statements;
        . . .
    }

Some Features:

• Rather straightforward approach.
• Very similar to the C# concept hierarchy.
• Supports some non-C# features (assignment statements etc.).
• Supports some future C# features (generics).
• Suitable for representing a great number of other languages.

CCI Intermediate Representation (4)

How to use IR classes:

• “Zero cost” approach: just take (a subset of) the IR as it is.
  - For relatively simple languages which are fully CLI-compliant (and therefore completely “covered” by IR classes).
• “Standard” approach: extend (some of) the IR classes, adding your own functionality if necessary (in the usual OO manner).
  - The “golden mean”: for most cases.
• “Radical” approach: create your own hierarchy and provide means for converting its nodes to semantically equivalent IR nodes/sub-trees.
  - For complex languages and/or languages with completely different paradigms.

IR Transformation: Visitors (1)

StandardVisitor: every visitor walks an IR…

• Looker, Declarer: …replacing Identifier nodes with the members/locals they resolve to;
• Resolver: …resolving overloads and deducing expression result types;
• Checker: …checking for semantic errors and repairing the IR so that subsequent walks need not do error checking;
• Normalizer: …preparing an IR for serializing to IL+MD.

• It is possible to modify the standard visitors and/or…
• …write your own visitors replacing a standard one!

IR Transformation: Visitors (2)

Overall scheme for IR processing. Prototype “compiler”:

    // System.Compiler.Compiler is an abstract CCI class.
    public class ZCompiler : System.Compiler.Compiler, ...
    {
        . . .
        protected override void Compile(CompilationUnit cu,      // IR nodes
                                        Class globalScope,
                                        ErrorNodeList errors)
        {
            // The four objects created below are the visitors.

            // Walk IR looking up names
            (new Looker(globalScope)).VisitCompilationUnit(cu);
            // Walk IR inferring types and resolving overloads
            (new Resolver()).VisitCompilationUnit(cu);
            // Walk IR checking for semantic errors and repairing it
            (new Checker(errors)).VisitCompilationUnit(cu);
            // Walk IR reducing it to predefined mappings to MD+IL
            (new Normalizer()).VisitCompilationUnit(cu);
        }
        . . .
    }

CCI Compilation Model
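The visitor stage of this model, a sequence of passes that each walk the IR and may replace nodes, can be imitated in a few lines. This is only a sketch under stated assumptions: Expr, MiniVisitor, NameLookupPass, CheckPass and MiniCompiler are hypothetical names invented here, and the real Looker, Resolver, Checker and Normalizer do far more work.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature IR and visitors: NOT the CCI classes.
public abstract class Expr { }
public sealed class Identifier : Expr { public string Name; }
public sealed class Literal : Expr { public int Value; }
public sealed class FieldRef : Expr { public string Field; }   // a resolved name
public sealed class Assignment : Expr { public Expr Target; public Expr Source; }

// Base visitor: walks the tree and returns the (possibly replaced) node.
public class MiniVisitor
{
    public virtual Expr Visit(Expr e)
    {
        Assignment a = e as Assignment;
        if (a != null) return VisitAssignment(a);
        Identifier id = e as Identifier;
        if (id != null) return VisitIdentifier(id);
        return e;   // literals and resolved names are left untouched
    }

    public virtual Expr VisitAssignment(Assignment a)
    {
        a.Target = Visit(a.Target);
        a.Source = Visit(a.Source);
        return a;
    }

    public virtual Expr VisitIdentifier(Identifier id) { return id; }
}

// First pass: replaces Identifier nodes with the fields they resolve to.
public sealed class NameLookupPass : MiniVisitor
{
    private readonly HashSet<string> fields;

    public NameLookupPass(IEnumerable<string> knownFields)
    {
        fields = new HashSet<string>(knownFields);
    }

    public override Expr VisitIdentifier(Identifier id)
    {
        if (fields.Contains(id.Name)) return new FieldRef { Field = id.Name };
        return id;   // left unresolved for the checking pass
    }
}

// Second pass: reports identifiers that the first pass could not resolve.
public sealed class CheckPass : MiniVisitor
{
    public readonly List<string> Errors = new List<string>();

    public override Expr VisitIdentifier(Identifier id)
    {
        Errors.Add("unknown name: " + id.Name);
        return id;
    }
}

public static class MiniCompiler
{
    // Runs the passes in a fixed order and returns the collected errors.
    public static List<string> Compile(Expr root, IEnumerable<string> knownFields)
    {
        root = new NameLookupPass(knownFields).Visit(root);
        CheckPass checker = new CheckPass();
        checker.Visit(root);
        return checker.Errors;
    }
}
```

Because each pass overrides only the Visit methods it cares about, replacing one pass with a custom one does not affect the others, which is the point of the "write your own visitors" option.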

[Figure: CCI compilation model. Source → Scanner & Parser → IR (AST) → Visitors → MSIL+MD → IL/MD Writer → Output Assembly. Imported Assemblies → IL/MD Reader → IR. A legend marks each component as either language specific or common to all languages.]

Compiler Integration: Traditional Approach

[Figure: the traditional compiler as a “black box” program. Input at compiler start-up: source file name and compilation parameters. Inside the compiler environment: lexical analysis turns source code into a sequence of tokens, syntactic & semantic analysis builds a program tree, and code generation produces object code. Output at compiler end: a file with object code and diagnostic messages.]

What Does Integration Assume? (1)

Visual Studio components:

• Project Manager
• Text Editor
• Semantic Support (“Intellisense”)
• Debugger

Features that should be supported by a compiler:

• Language sources identification
• Syntax highlighting
• Automatic text formatting
• Smart text browsing ({ → })
• Error checking while typing
• Tooltip-like diagnostics & info
• Type member lists for classes and variables of class types
• Lists of overloaded methods
• Lists of method parameters
• Expression evaluation
• Conditional breakpoints

What Does Integration Assume? (2)

Example of “Intellisense” Feature

Compiler Integration: CCI Approach (1)

[Figure: the compiler as a collection of resources. The compiler environment contains lexical analysis, syntactic & semantic analysis, and code generation, together with the data they work on: document and source code, tokens with token attributes, source context and token context, the program tree, and object code (assembly).]

Compiler Integration: CCI Approach (2)

[Figure: the compiler as a set of objects. Lexical analysis, syntactic & semantic analysis, and code generation, with their data (document, source code, tokens with token attributes, source context and token context, program tree, object code/assembly), are accessed by the environment: Project Manager, Source Text Editor, “Intellisense”, and Debugger.]

Compiler Integration: CCI Approach (3)

Compilation phases and some of their methods:

• Lexical Analysis: Get Token; Get Token with Extra Attributes
• Syntactic & Semantic Analysis: Parse Program Unit; Parse Expression; Parse Statements; . . .
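The difference between a plain "get token" and a "get token with extra attributes" can be sketched as follows. TokenKind, TokenInfo and MiniScanner are hypothetical names used only for illustration, and the three keywords are arbitrary; the sketch assumes that the extra attributes an editor needs are the token text and its position in the source.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token model: names are illustrative, NOT the CCI API.
public enum TokenKind { Identifier, Keyword, Number, Symbol, EndOfFile }

public sealed class TokenInfo
{
    public TokenKind Kind;
    public string Text;
    public int StartPos;   // offset of the first character in the source
    public int EndPos;     // offset just past the last character
}

public sealed class MiniScanner
{
    private static readonly HashSet<string> Keywords =
        new HashSet<string> { "module", "begin", "end" };

    private readonly string source;
    private int pos;

    public MiniScanner(string source) { this.source = source; }

    // Plain "get token": all that a batch compiler needs.
    public TokenKind GetToken()
    {
        return GetTokenWithAttributes().Kind;
    }

    // "Get token with extra attributes": what an editor needs,
    // for example to colour exactly the right span of text.
    public TokenInfo GetTokenWithAttributes()
    {
        while (pos < source.Length && char.IsWhiteSpace(source[pos])) pos++;
        int start = pos;
        if (pos >= source.Length)
            return Make(TokenKind.EndOfFile, start);

        if (char.IsLetter(source[pos]))
        {
            while (pos < source.Length && char.IsLetterOrDigit(source[pos])) pos++;
            string text = source.Substring(start, pos - start);
            return Make(Keywords.Contains(text) ? TokenKind.Keyword
                                                : TokenKind.Identifier, start);
        }
        if (char.IsDigit(source[pos]))
        {
            while (pos < source.Length && char.IsDigit(source[pos])) pos++;
            return Make(TokenKind.Number, start);
        }
        pos++;   // any other single character is treated as a symbol
        return Make(TokenKind.Symbol, start);
    }

    // Packages the text between start and the current position as a token.
    private TokenInfo Make(TokenKind kind, int start)
    {
        return new TokenInfo
        {
            Kind = kind,
            Text = source.Substring(start, pos - start),
            StartPos = start,
            EndPos = pos
        };
    }
}
```

Both methods share one scanning routine, so the compiler and the editor cannot disagree about where a token starts or ends.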

A CCI-based Compiler: Zonnon (1)

• The Zonnon language is a successor of the Pascal, Modula-2 and Oberon line of languages.
• Zonnon preserves the spirit of Oberon, being a compact language with a small number of orthogonal basic concepts, including:
  - modularity;
  - a simple but powerful object model;
  - the active object concept.
• The first Zonnon implementation is for the .NET platform and is based on the CCI framework.
• The language is designed and implemented at ETH Zürich, Switzerland.

A CCI-based Compiler: Zonnon (2)

Zonnon compiler integrated into Visual Studio: a screenshot

A CCI-based Compiler: Zonnon (3)

Zonnon compiler integrated into Visual Studio: a screenshot

A CCI-based Compiler: Zonnon (4)

Zonnon compiler integrated into Visual Studio: a screenshot

A CCI-based Compiler: Zonnon (5)

Zonnon compiler integrated into Visual Studio: a screenshot

CCI: Current State

• (Almost) completely implemented; not documented.
• Since June 12th the CCI Toolkit has been included in the MSDN Academic Alliance: www.msdnaa.net/cci
• The Zonnon compiler is, perhaps, the first experience in using CCI outside Microsoft.

Conclusion

• The Common Compiler Infrastructure seems to be a powerful and practically useful framework for developing compilers and language tools for a wide range of languages.
• CCI provides a convenient high-level model for real integration of compilers into the Visual Studio environment.
• The CCI framework has been chosen as the platform for implementing the compiler for the new Zonnon language.
