Object-Oriented Programming across Native-Virtual Boundaries

Paul Werbicki
Computer Science Department, University of Calgary
2500 University Dr., NW, Calgary, Canada, T2N 1N4
1-403-2841718

Rob Kremer
Computer Science Department, University of Calgary
2500 University Dr., NW, Calgary, Canada, T2N 1N4
1-403-2205112

[email protected] [email protected] [email protected]

ABSTRACT

There exist many implementations of object-oriented programming languages that execute on virtual computers which abstract from native host environments. These languages provide developers with a highly mobile platform where applications may easily execute on multiple environments. However, there exist situations where it is not possible to abstract from the host environment, forcing the developer to program at least part of the application for a specific environment. Developing a single application using both native and virtual code allows the mobile portions to remain abstracted from the environment while at the same time providing the required native access.

This paper discusses using object-oriented programming to allow the use of C++ and Java in a single application. A class library, developed as part of the investigation, is used to enhance the native-virtual boundary interface and provide the developer with support for the use of objects. By examining how the library works it is possible to gauge the level of support currently available for this technique. An analysis of the available support highlights where virtual machines need to provide additional functionality to fully enable this method of programming.

Categories and Subject Descriptors

D.3 [Software]: Programming Languages; D.1.5 [Programming Techniques]: Object-oriented Programming; D.2.3 [Software Engineering]: Coding Tools and Techniques – Object-oriented Programming; D.3.4 [Programming Languages]: Processors – Interpreters

General Terms

Experimentation, Standardization, Languages.

Keywords

Native-virtual boundary, interoperability, object integration, C++, Java, Java Native Interface, design patterns.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
Virtual Execution Environments (Vee’05), June 11-12, 2005, Chicago, Illinois, USA. Copyright 2005 ACM 1-58113-000-0/00/0004…$5.00.

1. INTRODUCTION

One of the tenets of interpreted programming languages, Java in particular, is the ability to develop and compile software once and have it execute, without modification, on all environments supported by the virtual machine [4]. To achieve this, the software must generally be written entirely in the interpreted programming language. In some applications, however, it may be necessary to have portions of the software highly mobile, while other portions must take advantage of optimizations only provided on a single native host environment.

A practical example of this type of application exists in the area of Multi-Agent Systems (MAS). In these types of applications, all agent processes need to communicate regardless of the environment in which they execute, but some provide services and use resources only available on specific operating systems and hardware platforms. Some components of each entity, specifically the communications library, benefit from being mobile without having to be ported to each new environment. However, it may not be possible to port the entire agent.

Some virtual machines provide methods for integrating both virtual code and native code within the same program. The Java Virtual Machine (JVM) with its Java Native Interface (JNI) is a good example. Using the JNI it is possible to make calls into mobile components of the application without having to write the entire program in an interpreted programming language [5]. The JNI exposes a set of C functions that provide a low-level method of manipulating, controlling and executing code compiled to run on the virtual machine. The JVM executes in the same process space and interacts with native code across the native-virtual boundary defined by the JNI.

The procedural approach used by the JNI does not lend itself well to using mobile code written as an object-oriented library. Instead of accessing objects using procedural-style JNI functions, it is highly desirable from a software engineering standpoint to treat these interpreted objects as normal objects within the native portions of the application. To investigate this approach, JavaCOM [11] was developed, a class library written in C++ under the Microsoft Windows family of operating systems. The use of JavaCOM makes it somewhat possible to treat objects within an interpreted library as objects within a C++ application or as components within the Component Object Model (COM) [7] in languages such as Microsoft Visual Basic and C++.

This paper describes the challenges of trying to integrate interpreted objects and native objects within the same application. The goal of this research is to attempt to achieve complete re-use of an object-oriented communication library for Multi-Agent Systems written in Java within an agent application developed in C++. In doing so the hope is to discover ways to achieve object-oriented development using interpreted and native code within the same process. Where this is not possible, the goal is to propose requirements of virtual machines needed to support such applications.

1.1 Paper Overview

This paper is structured as follows. Section 2 specifies the problem further, providing reasons why it is desirable to mix native and interpreted code, and introduces an application that benefits from such a scenario. Section 3 outlines approaches used to contain the JVM in a native application. Section 4 describes the JavaCOM class library and how it abstracts from the JNI using design patterns to achieve a level of integration. Section 5 analyses how JavaCOM's use of proxy classes provides support for object-oriented programming between C++ and Java. Section 6 presents requirements for the JVM and virtual machines in general to fully support object-oriented programming. We conclude with the benefits this research has for software engineering in general and future work in this area.

2. PROBLEM SPECIFICATION

The ability to integrate libraries written in an interpreted programming language (such as Java) into an application written in a native programming language¹ presents some interesting opportunities in software development. Interoperability between programming languages impacts many aspects of an application’s development cycle, from initial design decisions to the cost of maintaining the completed application. From a software engineering viewpoint, some obvious advantages include:

1. Decreased need to choose a single programming language for development. One of the first and often most difficult decisions of any project is the choice of programming language. Developing an application with multiple programming languages reduces this to deciding which programming language will be the primary one, with the other languages being contained by the primary language.

2. Access to more supporting libraries and a larger market of third-party software. It is common for developers to prefer one language over another, often because of the supporting libraries which make the task of programming easier. Combining multiple programming languages into a single application provides the developer with access to the set of all supporting libraries and third-party software provided by all of the programming languages used in the application.

3. Single language for library/API development. Due to the increased access to supporting libraries, researchers and third-party software developers do not need to support different versions of their software, one for each programming language they need to support. Instead they can concentrate on creating a single, mobile language API and allow the end-user to integrate this into their application by integrating the programming language in which it was developed.

4. Code reuse without access to the original source code. In situations where applications are being ported from one language to another (such as in legacy systems) the original source code may not be available. In cases like this it is possible to contain the original code and use it directly, instead of re-writing it and potentially introducing errors.

5. Integration of specialized components, hardware platform components and legacy software. Accessing specialized components and integrating with legacy systems is common in Software Engineering. Combining native and virtual code makes it possible to use these existing pieces from newer programming languages.

With the popularity of new object-oriented programming languages, such as Java, that are free from any procedural heritage (unlike C++), more code is being developed and shared in languages that promote only object-oriented development. Integration of these libraries into a native application using a procedural interface, such as the JNI, is awkward. It breaks down the object-oriented paradigm [9], making it difficult for the developer to write a “good” program.

If a language is said to support object-oriented programming then it must provide facilities to make it easy to program using objects. Even though two programming languages both support object-oriented programming, it may be difficult for the flow of control to pass from the first programming language to the second. This interaction may take exceptional effort or skill on behalf of the developer and merely enable the use of multiple programming languages within a single application, but not support it. To fully support object-oriented programming using multiple programming languages there must be support for making it easy to integrate objects.

Support for object-oriented development between programming languages follows the same requirements as for object-oriented programming languages. The most significant features include:

1. Type Checking. Object-oriented programming languages are for the most part typed languages. The compiler rejects programs that are not well-typed based on the values that are expected during compilation.

2. Calling Mechanism. For a given object, there must be a method for calling a specific member function. The calling mechanism must respect inheritance, directing the call to the proper object in the inheritance hierarchy.

3. Encapsulation. Combining elements together to create a new entity is important for any object-oriented programming language. A class must be able to contain member functions and variables and protect or expose those elements as dictated by its purpose.

4. Inheritance. Class derivation (subclassing) is an important technique in object-oriented programming. It allows a general concept expressed in a base class to be specialized in a subclass without having to modify the base class.

Including these features in the interface between programming languages provides support for object-oriented programming using multiple programming languages. For example, it would be ideal to have a native language class (C++, for example) inherit directly from a class written in Java.

¹ Here, native programming languages are considered to be languages that generate machine code specific to a host environment (such as C++ or Visual Basic).

2.1 Multi-Agent Systems: A Practical Example

Multi-Agent Systems are a good example of applications that benefit from the mixing of interpreted and native code. Multi-Agent Systems consist of computational entities (referred to as agents) that interact with one another to achieve a common goal beyond the capabilities of an individual [8]. Communication is fundamental to these systems, providing the foundation upon which cooperation between agents takes place. Generally, communication between agents is fairly straightforward, using capabilities within the operating system (inter-process communication, networking, etc.) to transfer information between agents. The method of communication used by every agent in the system must be the same to ensure that they can communicate properly. In addition, the protocols (language, formatting, and turn-taking schemes) of communication must be common among the agents, and these can be quite complex. All this is appropriately handled by libraries written in mobile, interpreted languages.

The internals of an agent, however, vary greatly depending on the purpose of that agent. In the case of “expert” agents, their purpose may be to provide access to a specialized hardware resource, or to perform specialized computations for which they are highly optimized. In the former case it may only be possible to implement the agent using native code; in the latter case native code may be more efficient or obtain better performance. However, if this agent is able to communicate with other agents (using interpreted code) it is able to expose its expertise to the rest of the system, allowing others to take advantage of the specialized services it provides.

The problem arises that for each programming language used to develop an agent in the system, the communications portion of the agent must be ported to that programming language/operating system combination. This decreases code reuse among agents, and increases code duplication, maintenance costs and the potential for bugs to occur between the various versions of the library. However, if the communications portion of the agent was contained in an interpreted library that was used by the agent, irrespective of the programming language used to implement the agent’s internals, the developer could achieve the advantages stated earlier.

3. METHODS OF INTERFACING WITH THE JVM

One of the first implementations to provide support beyond the JNI for integrating Java code into native applications was the Microsoft JVM [10]. In order to have Java accepted at the time by developers as a Microsoft mainstream language, effort was made to make Java as powerful and as flexible as it could be. Extensions were added to the Microsoft JVM to allow it to interoperate with other Microsoft programming languages and operating systems.

Functionality such as Raw Native Interfaces (Microsoft’s version of the JNI) and J/Direct [6] allowed Java developers to call functions contained in dynamically-linked libraries directly from Java applications. Java Callable Wrappers and COM Callable Wrappers [10], concepts similar to a proxy, allowed Java developers to expose their code as COM components and in turn use COM components written in other programming languages as if they were regular Java objects.

These extensions were specific to the Microsoft JVM, which meant that any developer who employed these extensions was only able to run their programs on the Microsoft family of operating systems. This breached part of the license agreement between Microsoft and Sun Microsystems, and the court case that was eventually won by Sun forced Microsoft to abandon the Microsoft JVM [3].

After the Microsoft-Sun fiasco it became obvious that changes made directly to the JVM, to support easier interfacing between native and interpreted code, would not be possible. New approaches would have to sit outside of the JVM, treating it as a black box and using only the JNI as the interface mechanism. Many commercial and open-source replacements appeared to fill the void left by the departure of the Microsoft JVM. Some of these new approaches included enhancement libraries that provided higher-level interfaces to the JNI. These enhancement libraries wrapped the JNI function set into a class or template library, exposing a more intuitive object-oriented interface to the developer.

Proxy class generators were another popular approach. They took Java class files and produced C++ header and source files embedded with the necessary JNI function calls. These utilities could be integrated into a project’s build cycle so that as the Java classes changed, the proxy classes changed as well.

Other approaches went so far as to use Inter-Process Communication (IPC) and Remote Procedure Calls (RPC), relying on existing wire protocols to handle the calls across the native-virtual boundary. These latter approaches were more complicated, requiring multiple processes and greater installation requirements.

4. JAVACOM

Our JavaCOM class library started as a simple enhancement library to make the Java Native Interface easier to use. One of the drawbacks of the JNI is the procedural interface between the Java Virtual Machine and native programming languages: the extensive functionality provided by the JNI to allow for integration between Java and C/C++ increases the complexity of performing simple programming tasks [2]. For example, accessing a member variable in a Java class inside the JVM from C++ requires multiple operations. In many cases, JNI functions require the conversion of data into a form accessible by native programming languages before it can be used. Also, resources created by the JNI must be explicitly freed, a paradigm that is counter to the garbage collected nature of the Java programming language.

Figure 1. JavaCOM Class Diagram

The complete JavaCOM library (see Figure 1) was designed to take advantage of design patterns as defined by Gamma et al. [1]. Initially, the only class that existed was the CJavaVM singleton class, which is used to start and manage the JVM inside of the current process. Then the CJTypeInfo class was developed to simplify type checking, and the CJObject class to contain Java object references and streamline the calling mechanism. By using the library to create a façade design pattern, the JNI interface was made substantially easier to use.

Using an adapter class it was possible to provide an interface to CJObject that was compatible with other programming languages. Originally the goal of JavaCOM was to allow COM-based languages to use Java objects as if they were simply components of the operating system [10]. Through the adapter, programming languages such as Microsoft Visual Basic were enabled to use Java objects with the IDispatch interface, a late-bound calling mechanism in COM. This exposed functionality to Visual Basic applications that was only available in Java class libraries.

    Dim objJavaVM As JavaCOM.JVM
    Dim objDate As Object
    Dim strTest As String

    Set objJavaVM = CreateObject("JavaCOM.JVM")
    Call objJavaVM.Initialize("")

    Set objDate = objJavaVM.CreateObject("java.util.Date")
    MsgBox "Date= " & objDate.toString()

The preceding Visual Basic example shows how the JavaCOM library was able to adapt the java.util.Date class so that it could be used as a Visual Basic component. This is done using the IDispatch interface, where method calls in Visual Basic are expanded upon execution/compilation so that to the developer the method call uses natural programming language syntax. However, under the covers, at run-time the parameters to the call are being packaged up and a method invocation is being performed by method name.

When using the IDispatch interface in C++, however, the developer does not have the luxury of the compiler performing these steps on their behalf as it does in Visual Basic. The developer is forced to write code each time the CJObject class or IDispatch interface is used, packaging up the parameters and performing a Java member function call by name.

To provide an easy interface for C++ developers, where Java objects can be treated like C++ objects, the use of a proxy was introduced. A utility included with the JavaCOM library translates a given Java class’s interface into a C++ header file with inline expansion of all method calls. Using this header file the developer need only create this object and call a method using C++ syntax and types, and all of the work integrating with Java is performed by the JavaCOM library. Managing the JVM, containing the Java object and converting between C++ types and Java types is performed through the use of the CJavaVM, CJObject and CJTypeInfo classes.
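The shape of such a generated proxy can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the CJObject below is a hypothetical stand-in that simulates invocation by name with an in-memory table instead of calling the JVM through the JNI, the Define() hook and the integer-only argument list exist purely for the simulation, and example.Counter and the class name example_Counter are invented for the example rather than taken from JavaCOM.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for JavaCOM's CJObject. The real class contains a Java
// object reference and calls into the JVM; this one only simulates a
// late-bound call so the sketch runs without a JVM.
class CJObject {
public:
    using Args = std::vector<int>;
    using Method = std::function<int(const Args&)>;

    explicit CJObject(std::string javaClassName)
        : className_(std::move(javaClassName)) {}

    // Late-bound call: the method is looked up by name at run time.
    int Invoke(const std::string& methodName, const Args& args) {
        auto it = methods_.find(methodName);
        if (it == methods_.end())
            throw std::runtime_error(className_ + " has no method " + methodName);
        return it->second(args);
    }

protected:
    // Simulation hook only (not JavaCOM API): registers a pretend Java method.
    void Define(const std::string& methodName, Method body) {
        methods_[methodName] = std::move(body);
    }

private:
    std::string className_;
    std::map<std::string, Method> methods_;
};

// Roughly what a generated proxy header could contain for a hypothetical Java
// class example.Counter with methods add(int) and get().
class example_Counter : private CJObject {
public:
    example_Counter() : CJObject("example.Counter") {
        // Pretend Java-side bodies; with JavaCOM these would run inside the JVM.
        Define("add", [this](const Args& a) { total_ += a.at(0); return total_; });
        Define("get", [this](const Args&) { return total_; });
    }
    // The lambdas capture 'this', so copying is disabled in this sketch.
    example_Counter(const example_Counter&) = delete;
    example_Counter& operator=(const example_Counter&) = delete;

    // Inline wrappers: package the parameters and invoke by method name.
    int add(int amount) { return Invoke("add", {amount}); }
    int get() { return Invoke("get", {}); }

private:
    int total_ = 0;  // stands in for state held by the Java object
};

// Small self-check: the proxy is used with ordinary C++ syntax and types.
inline int runProxyDemo() {
    example_Counter counter;
    counter.add(2);
    counter.add(3);
    assert(counter.get() == 5);
    return counter.get();
}
```

Calling runProxyDemo() exercises the proxy without any name-based invocation code at the call site, which is the convenience the generated header is meant to provide; the by-name dispatch is confined to the inline wrappers.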

Figure 2. Flow of Control
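The flow of control in Figure 2 can be mimicked with a small, JVM-free simulation. It is a sketch under assumptions, not JavaCOM code: SimulatedJavaAgent plays the role of the Java superclass, NativeAgent plays the C++ class that wants to override handle(), and installCallback() is a hypothetical hook of the kind an extended JNI would need to offer. Until a callback is installed, a message that arrives on the Java side never reaches the native override.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Plays the role of the Java superclass behind the native-virtual boundary.
class SimulatedJavaAgent {
public:
    using Callback = std::function<std::string(const std::string&)>;

    // Java-side implementation of the method the native class overrides.
    std::string handle(const std::string& message) const {
        return "java:" + message;
    }

    // Entry point for control that starts on the Java side, for example a
    // message arriving from the network. The table is consulted first.
    std::string onMessage(const std::string& message) const {
        auto it = callbacks_.find("handle");
        if (it != callbacks_.end()) {
            return it->second(message);  // redirected across the boundary
        }
        return handle(message);          // default: stays on the Java side
    }

    // Hypothetical extension point: register a native override by name.
    void installCallback(const std::string& methodName, Callback callback) {
        callbacks_[methodName] = std::move(callback);
    }

private:
    std::map<std::string, Callback> callbacks_;
};

// Plays the role of the C++ class that inherits from the Java class.
class NativeAgent {
public:
    explicit NativeAgent(SimulatedJavaAgent& javaSide) : javaSide_(javaSide) {}

    // Native override of handle().
    std::string handle(const std::string& message) const {
        return "native:" + message;
    }

    // Tell the simulated virtual machine about the override. The callback
    // captures 'this', so the NativeAgent must outlive the registration.
    void registerOverride() {
        javaSide_.installCallback(
            "handle",
            [this](const std::string& message) { return handle(message); });
    }

private:
    SimulatedJavaAgent& javaSide_;
};

// Delivers the same message before and after the override is registered.
inline std::string deliverBeforeAndAfter() {
    SimulatedJavaAgent javaSide;
    NativeAgent nativeSide(javaSide);
    std::string before = javaSide.onMessage("ping");  // no table entry yet
    nativeSide.registerOverride();
    std::string after = javaSide.onMessage("ping");   // now redirected
    return before + "|" + after;
}
```

The first delivery is handled entirely by the simulated Java code because nothing tells it that a native override exists; only after the override is registered does the same call reach the native side.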

5. OBJECT-ORIENTED SUPPORT USING PROXIES

The JavaCOM library, along with the proxy generating utility, provides a compact solution for achieving a level of easy integration of C++ and Java. Generated proxy classes provide a compile time bridge between C++ native code and the JVM, allowing the developer to cross the native-virtual boundaries very easily. To examine the level of support provided for object-oriented programming across native-virtual boundaries using this method, we compare how the proxy works to implement the features described above.

Type checking is handled at compile time by the C++ compiler. Each proxy class is uniquely named by JavaCOM using a common naming scheme based on the fully qualified Java class name. Naming proxy classes uniquely allows each class type to be treated individually by the compiler as a separate type. Without the proxy class the developer would be forced to use a CJObject to represent every Java class, which provides no distinction between the various classes and no clue to the compiler that they represent different types.

The calling mechanism is implemented through inline member functions exposed by the proxy class. A proxy class inherits privately from the CJObject class, which provides the generalized calling mechanism used by the inline functions. Parameters from the stack are packaged up and, along with the method name, are passed to CJObject’s Invoke() method, which performs a method invocation by name on the Java object. Where method names are the same, parameter overloading is assumed and the method is matched based on the type signature of the parameters provided.

Encapsulation of member functions is possible through the proxy class using private, protected and public section qualifiers. However, the encapsulation of member variable access is more difficult. Access to member variables may not be done directly since there must exist some code in the proxy that forwards the request to the JVM to get/set the value. As such, all member variable access, including public member variables, must be performed through accessor functions. By placing the accessor function in the appropriate section of the class it is possible to obtain encapsulation of member variables.

Perhaps the most difficult feature to support is inheritance. Given that a proxy class is defined as a C++ class with inline methods, it is possible to inherit the proxy and override member functions, essentially inheriting and overriding the contained object at the same time. This works for the most part as long as one strict rule is followed: if a method in the Java superclass calls a function overridden in the C++ subclass, that function must itself be overridden and its implementation replaced in the subclass’s programming language. This rule is important due to the callback nature in which method overriding works.

However, the above rule breaks down when the flow of control starts on the interpreted side of the boundary. Looking at Figure 2 we see a C++ class that inherits from a Java class. Here, calls are able to originate from the native side to the virtual side, but not the other way around. When a message arrives asynchronously from the network, the flow of control starts from the interpreted side and the superclass in Java has no mechanism for calling the overridden function in C++. This is because there is no indication, from outside of the JVM, that there is a native class inheriting the interpreted class. In this instance, moving this method to the other side of the native-virtual boundary still does not provide the inheritance mechanism with the information it requires to direct the call to the proper native function.

Understanding these significant drawbacks allows a developer some level of object-oriented support for inheritance. The developer needs to be aware of the internal implementation of the superclass as well as any flow of control originating in the interpreted code. This goes against some of the software engineering advantages stated earlier.

Through trial and error it may be possible to determine if and when the above rule can be applied. However, every time the superclass changes the rule needs to be applied to all overridden methods in case a new dependency has been introduced. In the case of flow of control issues, significant redesign of the interpreted library, including the embedding of native methods in the Java class, may be necessary to accomplish the needed callback. This elaborate coding does not make it possible for a developer to easily integrate objects between two object-oriented programming languages.

6. VIRTUAL MACHINE SUPPORT

JavaCOM was designed to work independently of the virtual machine, containing and extending it while treating it as a black box. The JNI, which provides a rich set of functions for working with the JVM, provides the interface to this black box but only up to a certain point. In order to fully support object-oriented programming across the native-virtual boundary using this approach, an extension to the JNI is required.

Through experimentation using the JavaCOM class library and several sample applications, two possible approaches were discovered. The first is an event-based approach that monitors member function calls and re-directs them to the appropriate overridden native method. The second approach uses callbacks at the object level into native code, allowing the overriding of interpreted methods.

The event-based approach would place the burden of control outside of the JVM, providing a way to hook into the inner workings of the virtual machine. A library, such as JavaCOM, would request notification events for member function calls. Before a member function is invoked, an event would be raised by the virtual machine, provided with the object reference, the method in question and its type signature. It would then be up to the library to match the instance of the interpreted object with a native object inheriting it on the other side of the boundary. If the interpreted object is being inherited, the call would then be made into the object on the native side and the original call would be cancelled; otherwise the event would be released and the call would be made to the interpreted code as normal.

The object level callback approach is similar to how the vtable works in C++. Each object maintains a table with an entry for each member function in the class of the object. Through the JNI, external libraries like JavaCOM install callbacks in the table for each function being overridden. When a member function is being called by the virtual machine the callback table is first checked, and if there is an entry for the member function being called, the call is re-directed to the native side of the boundary; otherwise the call proceeds as normal. This approach places the burden on the virtual machine but also provides better integration, making the virtual machine aware of what is actually happening to the flow of control.

There already exists an event mechanism used by the Java Virtual Machine Tool Interface (JVMTI). With some minor changes this interface could be adapted to serve as the event notification mechanism. However, two problems come from such an approach: first, using the JVMTI adds an additional footprint to the JVM which may be unwanted overhead in many applications, and second, the JVMTI was designed as an interface for debugging and profiling, so using it as a way around JNI deficiencies means using it for something it was not designed to do. Ultimately the JNI is the proper location for these approaches to be implemented.

These two approaches demonstrate how virtual machine developers must be aware of objects on the native side that inherit from interpreted objects. There surely exist other approaches to enhancing the JVM and virtual machines in general, but this theme is common among all of them. With virtual machine developers aware of this approach to using their platforms, and with support from them in their interfaces, it is possible to fully support object-oriented programming across the native-virtual boundaries.

7. CONCLUSION

This paper describes the problem of how the interface between interpreted object-oriented languages and native object-oriented languages is a procedural interface, not an object-oriented interface, which violates many of the advantages of working with object-oriented languages. The problem is particularly interesting because our group has built an agent-based system with extensive protocols in Java. However, some agents have components which are difficult or impossible to implement in Java and must be implemented in native languages such as C++ and Visual Basic. The cost of re-implementing the protocol portions of the agents in various native languages is extremely high, and fraught with maintenance and compatibility problems. Therefore, there is great motivation to program in mixed languages. Unfortunately, the interface between Java and native languages is purely procedural, and our implementation requires native code to inherit from and extend various Java agent classes.

To address the problem, JavaCOM, a class library, has been implemented that seeks to provide a clean, object-oriented interface between Java, the Java Virtual Machine, and native languages. Using JavaCOM has been very successful, providing the ability to generate (from the Java code) native-language proxy classes which cleanly wrap the ugly details of marshaling calls to member functions in the Java-based superclass. However, the JavaCOM solution fails to perfectly follow the tenets of object-oriented programming: if a native subclass of a Java class overrides a method in the Java superclass and the Java superclass has another method that calls the overridden method, the program will invoke the Java superclass's method, not the native subclass's method as it should. The reason for this failure is that there exists no way, through the JNI, to inform the JVM that native code has overridden a method in a Java class. There needs to exist a facility within the JVM to receive messages when an object’s member function is called. This requirement may be extended for virtual machines in general. Since changing the JVM is not an option, two possible solution strategies are proposed that, unfortunately, involve extensions to the JNI. The hope is that one of these extensions will eventually be implemented.

8. REFERENCES

[1] Gamma, E., Helm, R., Johnson, R. and Vlissides, J. Design Patterns, Elements of Reusable Object-Oriented Software. Addison Wesley, Boston, MA, USA, 1995.

[2] Gabrilovich, E. and Finkelstein, L. JNI – C++ Integration Made Easy. C/C++ Users Journal. CMP Media LLC, Manhasset, NY, USA, January 2001.

[3] Gilbert, H. The Tragedy of Microsoft and Java. Technology and Planning. Yale University, New Haven, CT, USA, 2003.

[4] Gosling, J., and McGilton, H. The Java Language Environment. A White Paper. Sun Microsystems Inc., Santa Clara, CA, USA, 1996.

[5] Liang, S. The Java™ Native Interface – Programmer’s Guide and Specification. Addison-Wesley, Reading, MA, 1999.

[6] Microsoft Corporation. Writing Windows-Based Applications with J/Direct. Microsoft Developer Network Library. Microsoft Corporation, Redmond, WA, 2005.

[7] Microsoft Corporation and Digital Equipment Corporation. The Component Object Model Specification - Draft Version 0.9. Microsoft Corporation, Redmond, WA, 24 Oct 1995.

[8] Synder, R. D.; and Tomlinson, R. S. Robustness Infrastructure for Multi-Agent Systems. Proceedings of the Open Cougaar, New York, N.Y., U.S.A, 2004.

[9] Stroustrup, B. What is Object-Oriented Programming? Proceedings of the 1st European Software Festival, Munich, Germany, February 1991.

[10] Verbowski, C. Microsoft Virtual Machine. Integrating Java and COM – A Technology Overview. Microsoft Corporation, Redmond, WA, USA, Nov 1998.

[11] Werbicki, P. JavaCOM. Master’s Thesis. University of Calgary, Calgary, AB, 2004.