Advanced Programming
Total Page:16
File Type:pdf, Size:1020Kb
301AA - Advanced Programming Lecturer: Andrea Corradini [email protected] http://pages.di.unipi.it/corradini/ AP-10: Components: the Microsoft way Overview • The Microsoft approach to components • DDE, OLE, COM, ActiveX, … • The .NET framework • Common Language Runtime • .NET components • Composition by aggregation and containment • Communication by Events and Delegates Chapter 15, sections 15.1, 15.2, 15.4, 15.11, and 15.12 of Component Software: Beyond Object-Oriented Programming. C. Szyperski, D. Gruntz, S. Murer, Addison- Wesley, 2002. 2 Distributed Component Technologies The goal: - Integration of services for applications on various platforms - Interoperability: let disparate systems communicate and share data seamlessly Approaches: - Microsoft: DDE, COM, OLE, OCX, DCOM and ActiveX - Sun: JavaBeans, Enterprise JavaBeans, J2EE - CORBA (Common Object Request Broker Architecture) - Mozilla: XPCOM (Gecko functionality as components) - SOAP (using XML) 3 The Microsoft Approach • Continuous re-engineering of existing applications • Component technology introduced gradually taking advantage of previous success, like – Visual Basic controls – Object linking and embedding (OLE) – Active X, ASP • Solutions mainly adopted on MS platforms • Review from older approaches to .NET + CLR 4 COM: Component Object Model • Underlying most MS component technologies (before .NET) • Made available on other platforms, but with little success • COM does not prescribe language, structure or implementation of an application • COM only specifies an object model and programming requirements that enable COM components to interact • COM is a binary standard for interfaces • Only requirement: code is generated in a language that can create structures of pointers and, either explicitly or implicitly, call functions through pointers. • Immediate for some languages (C++, SmallTalk) but possible for many others (C, Java, VBscript,…) 5 8557 Chapter 15 p329-380 8/10/02 12:24 pm Page 331 COMThe first fundamental interfaces wiring model – COM and components331 Client Interface • A COM interface is a pointer to an variable node Op 1 interface node, which is a pointer Op 2 to a table of function pointers (also called vtable) • Note the double indirection Component Op n Figure 15.1 Binary representation• Invocation of a COM specification interface. : when an operation (method) of the interface is invoked, a pointer to the interface itself is passed as the calling convention,additional which is argument the specification (like self of whator this exactly) is passed when calling an operation– The from pointer an interface. can be used to access instance variables Methods of an• objectCOM have component one additionalmay implementparameter – anythe objectnumber they of interfaces. belong to. This parameter is sometimes called self or this. Its declaration is hidden in most object-oriented• The entire languages,implementation but a few, can including be a defined Component in a single class, but it Pascal, make it explicit.does The not point have is thatto be. the interface pointer is passed as a self-parameter to •anyA of component the interface’s can operations. contain This many allows objects operations of different in a classes that COM interface to exhibitcollectively true object provide characteristics. the implementation In particular, the ofinterface the interfaces provided node can be used to byrefer the internally component. to instance variables. It is even possible to attach instance variables directly to the interface node, but this is not normally done. It is, however, quite common to store pointers that simplify the lookup 6 of instance variables and the location of other interfaces. A COM component is free to contain implementations for any number of interfaces. The entire implementation can be a single class, but it does not have to be. A component can just as well contain many classes that are used to instantiate objects of just as many different kinds. These objects then collec- tively provide the implementation of the interfaces provided by the component. Figure 15.2 shows a component that provides three different interfaces and uses two different objects to implement these. In Figure 15.2, object 1 implements interfaces A and B, whereas object 2 implements interface C. The dashed pointers between the interface nodes are used internally as it must be possible to get from each node to every other node. The unusual layout of objects and vtables is just what COM prescribes if such an n-to-m relationship between objects and interfaces is desired. However, without proper language support, it is not likely that many compo- nents will take such a complex shape. What is important, though, is that there is no single object part that ever leaves the component and represents the entire COM object. A COM component is not necessarily a traditional class and a COM object is not necessarily a traditional single-bodied object. However, a COM object can be such a traditional object and all of its inter- faces can be implemented in a single class by using multiple inheritance (Rogerson, 1997). There are two important questions to be answered at this point. How does a client learn about other interfaces and how does a client compare the identity 8557 Chapter 15 p329-380 8/10/02 12:24 pm Page 332 A COM component with 3 interfaces 332 andThe Microsoft2 objects way: COM, OLE/ActiveX, COM+, and .NET CLR Client Interface • Object 1 implements variables node A Interfaces A and B, Op A1 • Object 2 implements Op A2 Interface C Interface Object 1 • Interfaces must be node B mutually reachable Op B1 • Possible according to Op B2 COM specification, Op B3 rare in practice Interface Object 2 node C Op C1 Op C2 Figure 15.2 A COM object with multiple interfaces. of COM objects? Surprisingly, these two questions are closely related. Every COM interface has a common first method named QueryInterface. Thus, the first slot of the function table of any COM interface points to a QueryInterface operation. There are two further methods shared by all interfaces. These are explained below. QueryInterface takes the name of an interface, checks if the current COM object supports it, and, if so, returns the corresponding interface reference. An error indication is returned if the interface queried for is not supported. On the level of QueryInterface, interfaces are named using interface identifiers (IIDs). An IID is a GUID (Chapter 12), which is a 128-bit number guaran- teed to be globally unique. COM uses GUIDs for other purposes also. As every interface has a QueryInterface operation, a client can get from any provided interface to any other. Once a client has a reference to at least one interface, it can obtain access to all others provided by the same COM object. Recall that interface nodes are separate and therefore cannot serve to identify a COM object uniquely. However, COM requires that a given COM object returns the same interface node pointer each time it is asked for the IUnknown interface. As all COM objects must have an IUnknown interface, the identity of the IUnknown interface node can serve to identify the entire COM object. To ensure that this identity is logically preserved by interface navigation, the QueryInterface contract requires that any successful query yields an interface that is on the same COM object – that is, establishes the same identify via queries for IUnknown. To enable sound reasoning, the set of interfaces explorable by queries must be an equivalence class. This means that the queries are reflexive in that if they ask for an interface by querying that same interface, they will succeed. They are also symmetrical in that if they ask for an interface, 8557 Chapter 15 p329-380 8/10/02 12:24 pm Page 334 334 The Microsoft way: COM, OLE/ActiveX, COM+, and .NET CLR IUnknown COM Interfaces IOleObject IDataObject • Identity determined by Globally unique identifiers IPersistStorage (GUID) (128 bits) or (non-unique) name IOleDocument • IUnknown: root of interface hierarchy, includes: – QueryInterface Figure 15.3 Depiction of a COM object. – AddRef and Release (for Garbage Collection via Reference Counting) although this particular node may have no remaining references. This is nor- • QueryInterface (GUID ->mally Interface acceptable, reference/error) and sharing allows of a single to know reference if count is the usual approach. an interface is implemented byIn the some component cases, interface nodes may be resource intensive, such as when they • “Invocations to QueryInterfacemaintainarg a largeument cache IUnknown structure.on A theseparate same reference count for such an inter- component must return the sameface nodeaddress” can then be used to release that node as early as possible. (This technique of creating and deleting interface nodes as required is sometimes • Thus IUnknown used to get thereferred “identity” to as “tear-off of a component interfaces.”) [ uuid(00000000-0000-0000-C000-000000000046)On creation ]of interface an object IUnknown or node,{ the reference count is initialized to 1 HRESULT QueryInterface ([in] constbefore IID iid ,handing [out, iid_is out(iid a )]first IUnknown reference.iid );Each time a copy of a reference is created, unsigned long AddRef (); the count must be incremented (AddRef). Each time a reference is given up, the unsigned long Release (); } count must be decremented (Release). As soon as a reference count reaches zero, the COM object has become unreachable and should therefore self- destruct. As part of its destruction, it has to 8release all references to other objects that it might still hold, calling their Release methods. This leads to a recursive destruction of all objects exclusively held by the object under destruc- tion. Finally, the destructed object returns the memory space it occupied. Reference counting is a form of cooperative garbage collection. As long as all involved components play by the rules and cooperate, memory will be safely deallocated. At least, objects will never be deallocated while references still exist. Reference counting has the well-known problem that it cannot deal with cyclic references.