Type Inference for Late Binding. The SmallEiffel Compiler. Suzanne Collin, Dominique Colnet, Olivier Zendra
To cite this version: Suzanne Collin, Dominique Colnet, Olivier Zendra. Type Inference for Late Binding. The SmallEiffel Compiler. Joint Modular Languages Conference (JMLC), 1997, Linz, Austria. pp.67–81. inria-00563353

HAL Id: inria-00563353
https://hal.inria.fr/inria-00563353
Submitted on 4 Feb 2011

Suzanne COLLIN, Dominique COLNET and Olivier ZENDRA
Campus Scientifique, Bâtiment LORIA, Boîte Postale 239, 54506 Vandœuvre-lès-Nancy Cedex, France
Tel. +33 03 83.59.20.93
Centre de Recherche en Informatique de Nancy

Abstract. The SmallEiffel compiler uses a simple type inference mechanism to translate Eiffel source code to C code. The most important aspect of our technique is that many occurrences of late binding are replaced by static binding. Moreover, when dynamic dispatch cannot be removed, inlining is still possible. The advantage of this approach is that it speeds up execution and considerably decreases the amount of generated code. The source code of the SmallEiffel compiler itself is a large-scale benchmark used to show the quality of our results.
Obviously, this efficient technique can also be used for class-based languages without dynamic class creation: for example, it is possible for C++ [10] or Java, and not possible for Smalltalk.

1 Introduction and Related Works

Object-oriented languages, by their very nature, pose new problems for the delivery of applications and the construction of time-efficient, type-safe, and compact binaries. The most important aspect is the compilation of inheritance and late binding. The distinctive feature of late binding is that it allows some freedom about the exact type of a variable. This flexibility of late binding is due to the fact that a variable's type may dynamically change at run time. We introduce an approach to compilation which is able to automatically and selectively replace many occurrences (more than 80%) of late binding by static binding, after considering the context in which a call is made. The principle we will detail in this article consists in computing each routine's code in its calling context, according to the concrete type of the target. The computation of a specific version of an Eiffel [7] routine is not done for all the target's possible types, but only for those which really exist at run time. As a consequence, it is first necessary to know which points of an Eiffel program may be reached at run time, and then to remove those that are unreachable.

Our compilation technique, which requires the attribution of a type to each object, deals with the domain of type inference. Ole Agesen's recent PhD thesis [2] contains a complete survey of related work. Reviewed systems range from purely theoretical ones [13] to systems in regular use by a large community [8], via partially implemented systems [11] [12] and systems implemented on small languages [5] [9].

Using Agesen's classification [2], the SmallEiffel compiler's algorithm can be qualified as polyvariant (which means each feature may be analyzed multiple times) and flow insensitive (no data flow analysis).
Our algorithm deals with both concrete types and abstract types. The Cartesian Product Algorithm (CPA) [1] is a more powerful one. However, the source language of CPA is very different from Eiffel: it is not statically typed, handles prototypes (no classes) and allows dynamic inheritance! The late binding compilation technique we describe for Eiffel may apply to any class-based language [6], with or without genericity, but without dynamic creation of classes. The generated target code is precisely described, as well as the representation of objects at run time.

Section 2 introduces our technique's terminology and basic principle. Section 3 explains in detail, with examples, the technique we use to suppress late binding points. Section 4 considers the problem of genericity. Section 5 shows the impact of late binding removal. Our examples are written in Eiffel, and we use the vocabulary which is generally dedicated to Eiffel [7].

2 Overall Principles and Terminology

2.1 Dead Code and Living Code

In Eiffel, the starting execution point of a program is specified by a pair composed of an initial class and one of its creation procedures. This pair is called the application's root. The first existing object is thus an instance of the root class, and the very first operation consists in launching the root procedure with that object.

First of all, our compilation process computes which parts of the Eiffel source code may or may not be reached from the system's root, without doing any data flow analysis. The result of this first computation is thus completely independent of the order of instructions in the compiled program. With regard to conditional instructions, we assume that all alternatives may a priori be executed. Starting from the system's root, our algorithm recursively computes the living code, that is, code which may be executed. Code that can never be executed is called dead code.
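As an illustration only, the computation of living code can be sketched as a flow-insensitive worklist traversal starting from the root. The sketch below is not the SmallEiffel implementation: the program model (the Routine and Liveness structures, the fixed-size arrays, and the function compute_living) is hypothetical, and every call is assumed to be already resolved to a single routine, which ignores the interaction between late binding and the set of living classes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified program model (an assumption of this sketch):
   each routine lists the routines it may call and the classes it may
   instantiate, in no particular order. The order of instructions is
   irrelevant because every alternative is assumed to be executable. */
enum { MAX_EDGES = 8, MAX_ROUTINES = 32, MAX_CLASSES = 32 };

typedef struct {
    int n_calls;
    int calls[MAX_EDGES];    /* indices of routines that may be called */
    int n_creates;
    int creates[MAX_EDGES];  /* indices of classes that may be instantiated */
} Routine;

typedef struct {
    bool living_routine[MAX_ROUTINES];
    bool living_class[MAX_CLASSES];
} Liveness;

/* Marks every routine and class reachable from the root pair
   (root class, root creation procedure). Anything left unmarked is dead. */
Liveness compute_living(const Routine *routines, int n_routines,
                        int root_class, int root_routine)
{
    Liveness live = {{false}, {false}};
    int worklist[MAX_ROUTINES];
    int top = 0;

    assert(n_routines <= MAX_ROUTINES);
    assert(root_routine >= 0 && root_routine < n_routines);
    assert(root_class >= 0 && root_class < MAX_CLASSES);

    live.living_class[root_class] = true;      /* first object: a root instance */
    live.living_routine[root_routine] = true;
    worklist[top++] = root_routine;

    while (top > 0) {
        const Routine *r = &routines[worklist[--top]];

        /* An instantiation instruction in living code makes a class living. */
        for (int i = 0; i < r->n_creates; i++)
            live.living_class[r->creates[i]] = true;

        /* A call in living code makes the callee living; each routine is
           pushed at most once, so the worklist cannot overflow. */
        for (int i = 0; i < r->n_calls; i++) {
            int callee = r->calls[i];
            assert(callee >= 0 && callee < n_routines);
            if (!live.living_routine[callee]) {
                live.living_routine[callee] = true;
                worklist[top++] = callee;
            }
        }
    }
    return live;
}
```

With a model in which the root procedure instantiates one class and calls two of its routines, this traversal marks the root class, that class and the three routines as living, and leaves every other routine and class dead.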
Living code computation is closely linked to the inventory of classes that may have instances at run time. For example, let's consider the following root:

    class ROOT
    creation root
    feature
       a: A;
       root is
          do
             io.put_string("Enter a number :");
             io.read_integer;
             !!a;                       -- (1)
             a.add(io.last_integer);    -- (2)
             a.sub(1);                  -- (3)
          end;
    end -- ROOT

Of course, the root procedure of class ROOT belongs to the living code. Since this procedure contains an instantiation instruction for class A at line (1), instances of class A may be created. The routines add and sub of class A may be called (lines (2) and (3)). Consequently, the source code of these two routines is living code, from which our algorithm will recursively continue. In the end, if we consider neither input-output instructions nor the source code of routines add and sub, only two classes may have instances at run time: class A and class ROOT.

By analogy with the terms dead and living code, a living class is a class for which there is at least one instantiation instruction in living code: A is a living class. Conversely, a dead class is a class for which no instance may be created in living code.

2.2 Goals and Basic Principles

Object-oriented programming (OOP) is largely based on the fact that an operation always applies to a given object corresponding to a precise dynamic type. In Eiffel, as in Smalltalk, no operation may be triggered without first supplying this object, called target or receiver. Named Current in Eiffel, the target object determines which operation is called.

[Figure: inheritance graph over the classes ANIMAL, CENTIPEDE, QUADRUPEDE, DOG and CAT]
Fig. 1. Simple inheritance graph used throughout the article.

Let's consider the simple inheritance example of figure 1, which will be used throughout this article. Let's also assume that each subclass of ANIMAL has its own routine cry, which displays its own specific cry. If the target is of type CAT, the routine cry of class CAT is used. If the target is of type DOG, it is the routine of class DOG. And so on.
The main goal of our compilation technique consists in statically computing either, at best, the only routine which corresponds to a target object, or the reduced set of routines that are potentially concerned.

2.3 Representation of Objects at Run Time

Only living types are represented at run time. Basic objects corresponding to Eiffel's types INTEGER, CHARACTER, REAL, DOUBLE and BOOLEAN are directly represented with simple types of the target language. For example, an Eiffel object of type INTEGER is directly implemented with a C int. All other objects are represented by a structure (C struct) whose first field is the number that identifies the corresponding living type. Each of the object's living attributes is implemented inside this structure as a field bearing the attribute's name. If the attribute is directly an object (expanded), the field's type corresponds to the living type considered. If the attribute is a reference to another object, the field's type is pointer to anything (void * in C). For example, for the previously described class ROOT, the C structure is:

    struct ROOT {int id; void *a;};

If attribute a of class ROOT were declared as expanded A, that is, without an intermediate pointer, the C structures would be:

    struct A {int id; ...; ...;};
    struct ROOT {int id; struct A a;};

3 Adapting Primitives

Considering that we know the set of living types, our compilation technique consists, for each of these types, in computing a specific version of primitives that can be applied to the type.
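To make the object representation and the idea of type-specific primitives concrete, here is a minimal C sketch for the routine cry of figure 1. It is an assumption-laden illustration, not the code generated by SmallEiffel: the identifier values, the function names cry_DOG, cry_CAT, cry_CENTIPEDE and cry_ANIMAL, the returned strings, the use of a switch for the remaining late binding point, and the hypothesis that only DOG, CAT and CENTIPEDE are living types are all ours.

```c
#include <assert.h>

/* Hypothetical identifiers of the living types (assumed: only DOG, CAT and
   CENTIPEDE have instances; ANIMAL and QUADRUPEDE are assumed dead). */
enum { ID_DOG = 1, ID_CAT = 2, ID_CENTIPEDE = 3 };

/* As in section 2.3, the first field identifies the living type. */
struct DOG       { int id; };
struct CAT       { int id; };
struct CENTIPEDE { int id; };

/* One specific version of cry per living type (names are illustrative).
   The strings are returned rather than printed to keep the sketch small. */
const char *cry_DOG(struct DOG *current)
{
    (void)current;
    return "Woof";
}

const char *cry_CAT(struct CAT *current)
{
    (void)current;
    return "Meow";
}

const char *cry_CENTIPEDE(struct CENTIPEDE *current)
{
    (void)current;
    return "Scritch";
}

/* A call whose target may be of several living types: the routine is
   selected from the id field. Reading the id through an int pointer is
   valid because id is the first field of every structure. */
const char *cry_ANIMAL(void *target)
{
    int id = *(int *)target;

    switch (id) {
    case ID_DOG:       return cry_DOG((struct DOG *)target);
    case ID_CAT:       return cry_CAT((struct CAT *)target);
    case ID_CENTIPEDE: return cry_CENTIPEDE((struct CENTIPEDE *)target);
    default:
        assert(0 && "identifier of a type that is not living");
        return "";
    }
}
```

When the analysis establishes that a target can only be a DOG, the call can be emitted directly as cry_DOG(target), which is static binding; cry_ANIMAL stands for the case in which several living types remain possible.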