Multi-Method Dispatch Using Single-Receiver Projections
Wade Holst, Duane Szafron, Candy Pang, Yuri Leontiev
{wade,duane,candy,yuri}@cs.ualberta.ca
Department of Computing Science
University of Alberta
Edmonton, Canada

Abstract

A new technique for multi-method dispatch, called Single-Receiver Projections and abbreviated SRP, is presented. It operates by performing projections onto single-receiver dispatch tables, and thus benefits from existing research into single-receiver dispatch techniques. It provides method dispatch in $O(k)$, where $k$ is the arity of the method. Unlike existing multi-method dispatch techniques, it also maintains the set of all applicable methods. This extra information allows SRP to support languages like CLOS that use next-method. SRP is also highly incremental in nature, so it is well-suited for languages that require separate compilation and languages that allow methods and classes to be added at run-time. SRP uses less data-structure memory than all existing multi-method techniques and has the second fastest dispatch time. Furthermore, if the other techniques are extended to support all applicable methods, SRP becomes the fastest dispatch technique.

RESEARCH PAPER
Subject Areas: language design and implementation, programming environments
Keywords: multi-method, dispatch, object-oriented, implementation, compiler
Word Count: 8722
Contact Author: Duane Szafron, Department of Computing Science, University of Alberta, Edmonton, Canada, T6G 2H1
Phone: (780) 492-5468
Fax: (780) 492-1071
Email: [email protected]

1 Introduction

In object-oriented languages, the method to invoke is determined not only by the method (selector) name, but also by the dynamic types of its arguments. Run-time determination of the method to invoke at a call-site is called method dispatch and is a critical component of execution performance.
Although optimizing compilers can avoid method dispatch at numerous call-sites, run-time method dispatch is always necessary in the general case.

Object-oriented languages can be separated into single-receiver languages and multi-method languages. Single-receiver languages use the dynamic type of a dedicated receiver object in conjunction with the method name to determine the method to execute at run-time. Multi-method languages use the dynamic types of two or more object arguments (footnote 1) in conjunction with the method name to determine the method to execute. In single-receiver languages, a call-site can be viewed as a message send to the receiver object. In multi-method languages, a call-site can be viewed as the execution of a method on a set of arguments. Note that languages like C++ and Java that allow methods with the same name but different static argument types are not performing multi-method dispatch, since static argument types are simply encoded within the method name at compile-time.

Multi-method languages are not yet in popular use, but there are numerous reasons to believe that they will become more predominant. First, such languages provide more expressive power, in the same way that single-receiver object-oriented languages are more expressive than procedural languages. Second, multi-method languages address certain problematic issues that arise in single-receiver languages. For example, implementing binary methods in a type-safe and efficient manner is problematic in single-receiver languages, but completely natural using multi-methods. Existing multi-method languages like CLOS ([BDG+88]), Cecil ([Cha92]) and Dylan ([App94]) show the viability of multi-methods, and there is already research to extend Java to multi-methods ([BC97]).

Method dispatch techniques can be separated into two groups. Cache-based techniques look in global or local caches at the time of dispatch to determine if the method for a particular call-site has been determined.
If it has been determined, that method is used. Otherwise, a cache-miss technique is used to compute the method, which is then cached for later executions. Table-based techniques pre-determine the method for every possible call-site, and record these methods in a table. At dispatch-time, the method name and dynamic argument types form an index into this table.

Footnote 1: In the rest of this paper, we will assume that dispatch occurs on all arguments.

The major research contribution of this paper is a new multi-method dispatch technique called Single-Receiver Projections (SRP). SRP has the following advantages:

1. SRP provides access to all applicable methods, not just the most specific applicable method.
2. SRP is inherently incremental so it is applicable to reflexive languages and languages that support separate compilation.
3. SRP uses less data-structure space than any other multi-method dispatch technique.
4. SRP has the second fastest dispatch time of all dispatch techniques.
5. SRP has the fastest dispatch time of all dispatch techniques if the others are extended to support all applicable methods.

The rest of this paper is organized as follows. Section 2 introduces notation for multi-methods. Section 3 presents a summary of existing single-receiver table-based techniques. Section 4 presents a summary of the existing multi-method dispatch techniques. Section 5 presents the new multi-method table-based technique. Section 6 presents time and space results for the technique and compares it to other techniques. Section 7 discusses future work and Section 8 presents our conclusions.

2 Terminology for Multi-Method Dispatch

2.1 Notation

Expression 1 shows the form of a $k$-arity multi-method call-site. Each argument, $o_i$, represents an object, and has an associated dynamic type, $T^i = type(o_i)$. Let $H$ represent a type hierarchy, and $|H|$ be the number of types in the hierarchy. In $H$, each type has a type number, $num(T)$.
A directed supertype edge exists between type $T_j$ and type $T_i$ if $T_j$ is a direct subtype of $T_i$, which we denote as $T_j \prec_1 T_i$. If $T_i$ can be reached from $T_j$ by following one or more supertype edges, $T_j$ is a subtype of $T_i$, denoted as $T_j \prec T_i$.

$\sigma(o_1, o_2, \ldots, o_k)$    (1)

Method dispatch is the run-time determination of a method to invoke at a call-site. When a method is defined, each argument has a specific static type, $T^i$. However, at a call-site, the dynamic type of each argument, $o_i$, can either be the static type, $T^i$, or any of its subtypes, $\{T \mid T \prec T^i\}$. For example, consider the type hierarchy and method definitions in Figure 1a, and the code in Figure 1b. The static type of anA is A, but the dynamic type of anA can be either A or C. In general, we do not know the dynamic type of an object at a call-site until run-time, so method dispatch is necessary.

[Figure 1: An example hierarchy and program segment requiring method dispatch. (a) Type Hierarchy: types A, B, C, D and E with type numbers 0, 1, 2, 3 and 4, and methods A::α, C::α, C::β and D::β. The subscript beside each type is the type number, $num(T)$. (b) Code Requiring Method Dispatch: a variable anA is declared with static type A, is assigned either new A() or new C() depending on a condition, and a method is then invoked on anA.]

Although multi-method languages might appear to break the conceptual model of sending a message to a receiver, we can maintain this idea by introducing the concept of a product-type. A $k$-arity product-type is an ordered list of $k$ types denoted by $P = T^1 \times T^2 \times \cdots \times T^k$. The induced $k$-degree product-type graph, $k \geq 1$, denoted $H^k$, is implicitly defined by the edges in $H$. Nodes in $H^k$ are $k$-arity product-types, where each type in the product-type is an element of $H$. Expression 2 describes when a directed edge exists from a child product-type $P_j = T_j^1 \times T_j^2 \times \cdots \times T_j^k$ to a parent product-type $P_i = T_i^1 \times T_i^2 \times \cdots \times T_i^k$, which is denoted $P_j \prec_1 P_i$.
$P_j \prec_1 P_i \Leftrightarrow \exists u,\ 1 \leq u \leq k : (T_j^u \prec_1 T_i^u) \wedge (\forall v \neq u,\ T_j^v = T_i^v)$    (2)

The notation $P_j \prec P_i$ indicates that $P_j$ is a sub-product-type of $P_i$, which implies that $P_i$ can be reached from $P_j$ by following edges in the product-type graph $H^k$. Figure 2 presents a sample inheritance hierarchy $H$ and one of four connected components of its induced 2-arity product-type graph, $H^2$. Three 2-arity methods ($\sigma_1$ to $\sigma_3$) for behavior $\sigma$ have been defined on $H^2$ and associated with the appropriate product-types (footnote 2). Note that for real inheritance hierarchies, the product-type hierarchies, $H^2$, $H^3$, ..., are too large to store explicitly. Therefore, it is essential to define all product-type relationships in terms of relations between the original types, as in Expression 2.

A behavior corresponds to a generic-function in CLOS and Cecil, to the set of methods that share the same signature in Java, and to the set of methods that share the same message selector in Smalltalk. Behaviors are denoted by $B_\sigma^k$, where $k$ is the arity and $\sigma$ is the name. The maximum arity across all behaviors in the system is denoted by $K$. Multiple methods can be defined for each behavior. A method for a behavior named $\sigma$ is denoted by $\sigma_j$. If the static type of the $i$-th argument of $\sigma_j$ is denoted by $T^i$, the list of argument types can be viewed as a product-type, $dom(\sigma_j) = T^1 \times T^2 \times \cdots \times T^k$. With multi-method dispatch, the dynamic types of all arguments are needed (footnote 3).
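
The requirement that product-type relationships be defined in terms of relations between the original types can be illustrated with a minimal sketch. It is not part of SRP itself. The hierarchy `DIRECT_SUPERTYPES` and all function names are hypothetical, chosen only for illustration, and the sketch assumes an acyclic hierarchy. The function `is_direct_sub_product` follows Expression 2 directly. The function `is_sub_product` tests $P_j \prec P_i$ without building $H^k$, relying on the observation that a path in $H^k$ changes one position at a time, so reachability reduces to a position-by-position subtype-or-equal test with at least one position differing.

```python
# Hypothetical type hierarchy (an assumption for illustration, not the
# paper's Figure 1 or Figure 2): each type maps to its direct supertypes.
DIRECT_SUPERTYPES = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "C"],
}


def is_direct_subtype(tj, ti):
    """True if tj is a direct subtype of ti (one supertype edge)."""
    return ti in DIRECT_SUPERTYPES[tj]


def is_subtype(tj, ti):
    """True if ti is reachable from tj via one or more supertype edges."""
    stack = list(DIRECT_SUPERTYPES[tj])
    seen = set()
    while stack:
        t = stack.pop()
        if t == ti:
            return True
        if t not in seen:
            seen.add(t)
            stack.extend(DIRECT_SUPERTYPES[t])
    return False


def is_direct_sub_product(pj, pi):
    """Expression 2: exactly one position follows a direct supertype
    edge, and every other position is identical."""
    if len(pj) != len(pi):
        return False
    differing = [u for u in range(len(pj)) if pj[u] != pi[u]]
    if len(differing) != 1:
        return False
    u = differing[0]
    return is_direct_subtype(pj[u], pi[u])


def is_sub_product(pj, pi):
    """Sub-product-type test using only relations on the original types:
    every position is equal or a subtype, and the product-types differ."""
    if len(pj) != len(pi) or tuple(pj) == tuple(pi):
        return False
    return all(a == b or is_subtype(a, b) for a, b in zip(pj, pi))
```

Under these assumptions, `is_sub_product(("B", "B"), ("A", "A"))` holds although no single edge of the product-type graph connects the two product-types, and the test uses only the edges of the original hierarchy rather than the $|H|^k$ nodes of $H^k$.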