Managing the Evolution of .NET Programs
Susan Eisenbach, Vladimir Jurisic and Chris Sadler

Department of Computing, Imperial College London, UK SW7 2BZ
[sue, vj98]@doc.ic.ac.uk

School of Computing Science, Middlesex University, London, UK N14 4YZ
[email protected]

Abstract

Programs in the .NET environment consist of collaborating software components, and the Common Language Runtime has been designed in such a way that linking occurs 'just-in-time'. This offers a good platform for transparent evolution, provided that compatibility can be maintained between service-providing components and their clients. One goal of program evolution is to use the most recent version of each component that will actually link. We propose a model of a cache of interdependent evolving components with operations defined to update this cache safely. A demonstration tool, Dejavue.NET, implements these ideas.

1 Introduction

The dynamic link library (DLL) was invented to allow applications running on a single system to share code. Sharing code in this way confers several distinct benefits. In the first place, applications are likely to be smaller, both when running in memory and when stored on disk, because duplicated code has been eliminated. In the second place, applications accessing common code share a common functionality, so they all do certain things in the same way. This is good for the users, who get a reliable and predictable user interface. It also makes the operating system design more straightforward, with provision of access to certain facilities through narrow and well-defined interfaces.

The main benefit, however, arises when the DLL needs to be corrected or enhanced – in other words, to evolve. Because it is linked at run-time, any client applications stand to benefit from the modifications immediately: the new version replaces the old one painlessly, in the sense that the user doesn't have to do anything apart from launch the application. This is good for application users and developers generally. However, there can be pain when a new application is installed on a user's system:

Upgrade: Some enhancements may make the new version of the DLL incompatible with the old one. The new application setup will overwrite the old DLL with the new version, and will execute perfectly, but the old applications may fail.

Accidental Downgrade: If an application built with a previous version of the DLL is installed, it is likely to re-install the old version of the DLL too, so the new applications may fail.

So, for any individual system that uses DLLs, as soon as something is changed (typically by installing a new – or upgraded – application) it is only a matter of time before the user descends into "DLL hell" [14, 13]. Many Microsoft support personnel report [1] DLL hell as the single most significant user problem that they are called upon to respond to, and Microsoft has struggled to design its way around the problem. We catalogue these attempts in section two.

The main contribution of this paper is made in section three, where we develop a model of dynamic linking intended to be a conservative extension of what actually happens. Section four describes the design and development of a utility, Dejavue.NET [9], based on the model and conceived to assist library developers in their task of developing DLLs that are trouble free. In section five we report on other recent work which has addressed this problem, and in section six we give some indications of where we want to take this work in the future.
2 DLL Hell

Microsoft's first idea was to introduce version numbers. By allowing several versions of a DLL to co-exist, this could have eased the upgrade problem, provided that each application 'knew' which version of the DLL it needed, and provided that the DLL developers only incremented the version number when they made incompatible changes. In practice, this was too much to ask of a heterogeneous group of developers, and Microsoft did not even try it. Instead, version numbers were used during installation (rather than when the application was first built) to ensure that a previous version did not overwrite a newer version [19]. This would have solved the downgrade problem if the people who wrote the installation scripts had consistently checked version numbers. However, not checking version numbers guaranteed that the application being installed would run even though others might fail (rather than vice versa), so there was little incentive to perform the check.

The next solution cropped up in Windows 2000, and the approach was on two fronts. Firstly, a few thousand DLLs were earmarked as protected DLLs, which the Windows File Protection system (WFP) guards. When a new application is being installed, if an attempt is made to install an earlier or later version than the currently installed version of a protected DLL, it will be stopped by the setup system. If there is no currently installed version of the protected DLL, the setup system will install the corresponding version from the home system's DLL cache, regardless of which version the application actually requires. Protected DLLs can only be updated by a service pack update. This means that the computer system user (via Microsoft) controls the DLLs that exist on his system. The developers thus lose one of the main advantages of DLLs, the ability to perform transparent evolution, and instead must take responsibility for delivering capability on whatever systems their customers have. Developers whose DLLs lie outside the chosen few thousand also lose the benefits of the 'common functionality'.

The second front opened by Windows 2000 loses even more of the benefits of sharing. So-called 'side-by-side' DLLs can be created by associating a specific instance of a DLL with an application. Other applications may use the same version, but will not be sharing the same instance; alternatively, they may use a different version, so the functionality will cease to be common. Either way, the DLL developer loses all the potential advantages of dynamic linking, since every side-by-side instance will need to be explicitly replaced.

These mechanisms help to solve the problems of DLL hell and make life more peaceful for Microsoft support personnel, but they do so at the expense of some of the sharing advantages that DLLs were originally invented to exploit.
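As a minimal illustration of the installation-time discipline discussed at the start of this section, a careful setup program would have to read the four-part file version of the DLL it ships and of any copy already on the target machine, and refuse to overwrite a newer copy with an older one. The following sketch shows such a check in C#; the shared.dll name and the paths are invented for the example.

    // Minimal sketch of an installation-time version check: only copy the new
    // DLL if its file version is strictly newer than whatever is installed.
    // File names and paths below are illustrative placeholders.
    using System;
    using System.Diagnostics;
    using System.IO;

    class InstallCheck
    {
        static bool ShouldCopy(string newDll, string installedDll)
        {
            if (!File.Exists(installedDll))
                return true;                               // nothing to protect yet

            FileVersionInfo incoming = FileVersionInfo.GetVersionInfo(newDll);
            FileVersionInfo current = FileVersionInfo.GetVersionInfo(installedDll);

            var vNew = new Version(incoming.FileMajorPart, incoming.FileMinorPart,
                                   incoming.FileBuildPart, incoming.FilePrivatePart);
            var vOld = new Version(current.FileMajorPart, current.FileMinorPart,
                                   current.FileBuildPart, current.FilePrivatePart);

            return vNew > vOld;                            // never downgrade
        }

        static void Main()
        {
            Console.WriteLine(
                ShouldCopy(@"setup\shared.dll", @"C:\Windows\System32\shared.dll"));
        }
    }

As the section explains, it was precisely this kind of check that installation scripts could not be relied upon to perform consistently.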
Microsoft's latest ideas are embodied in the Common Language Runtime (CLR) of the .NET environment. .NET aims to provide developers with a component-based execution model: any application will consist of a suite of co-operating software components amongst whose members control is passed during execution. The CLR needs to be 'language-neutral' (hence Common), which means it should be possible to write a component in any one of a variety of languages and have it executed by the CLR, and also that it should be possible to pass data between components written in different languages (using the Common Type System – CTS).

The usual way to do this is to introduce an intermediate language which the high-level languages compile to, and to constrain their data types to conform to those recognised by the intermediate language. .NET achieves this with IL (Intermediate Language). Whereas most systems with an intermediate language have run-time systems that interpret the intermediate code (usually for portability reasons), the CLR uses a 'just-in-time' (JIT) compiler to translate IL fragments (primarily method bodies) into native code at run-time.

Of course, the native code must run within some previously-loaded context, and when the JIT compiler encounters a reference¹ external to that context, it must invoke some process to locate and establish the appropriate context before it can resolve the reference. This process can be quite involved and is illustrated in Figure 1. When a reference is made to a not-yet-loaded entity (a type or type member) within the current, or another previously-loaded, component, a class loader is called to load the corresponding type definition. When a reference is made to an entity external to all loaded components, it will be necessary to load the new component (or assembly, in Microsoft-speak) in its entirety via the Library Loader. The CLR has to meet some serious dynamic link-loading demands in order to provide both internal and external data definitions and JIT compilation in the time-frame of native code execution.

Figure 1: Run-time resolution of references in the CLR, showing the Assembly Resolver (Fusion), the cache, the Library Loader, the Class Loader, the Verifier, the JIT Compiler and the Resolver, together with the reasons each is invoked (a manifest module not yet resolved to a loading path, type metadata not loaded, a class or interface not loaded, subtypes not checked, unresolved tokens, a method without a native stub being called) before native method code executes.

The design of the .NET assembly reveals some details about how this is accomplished. In the first place, each Microsoft assembly carries versioning information structured into four fields – Major, Minor, Revision and Build. This allows many versions of the same DLL to co-exist in the system, and gives us the means to associate or disassociate the different versions with one another. For example, versions with the same Major and Minor version numbers would probably be fairly compatible, whilst two versions with widely separated Major numbers would probably not. To make this system useful, two things are required. Firstly, we have to agree the semantics of what constitutes Major, Minor, etc., and this means we must understand what 'compatibility' means. Secondly, we have to be able to exercise some control over which version will be loaded by the Library Loader when a new component is needed at run-time. The Fusion utility in

¹ Microsoft distinguishes between references, which are in client code, and definitions, which are in server code.
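As an illustration of the four-field versioning scheme and of how the run-time binding decision can be observed from code, a client could ask the loader for a specific version of a shared assembly and then inspect what was actually bound. This is a sketch only: the SharedLib name, its PublicKeyToken and the rule that matching Major and Minor numbers indicate compatibility are invented for the example.

    // Illustrative sketch: request a specific assembly version by its full
    // display name, inspect the version actually bound, and apply the informal
    // rule from the text that matching Major and Minor numbers probably mean
    // compatibility. "SharedLib" and the PublicKeyToken are placeholders.
    using System;
    using System.Reflection;

    class VersionProbe
    {
        // Hypothetical policy: same Major and Minor => assume compatible.
        static bool ProbablyCompatible(Version wanted, Version loaded) =>
            wanted.Major == loaded.Major && wanted.Minor == loaded.Minor;

        static void Main()
        {
            Version wanted = new Version(2, 1, 0, 0);

            // Request version 2.1.0.0 explicitly; the loader may redirect the
            // request, for example via configuration or publisher policy.
            Assembly shared = Assembly.Load(
                "SharedLib, Version=2.1.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089");

            Version loaded = shared.GetName().Version;
            Console.WriteLine($"Loaded {loaded}; compatible: {ProbablyCompatible(wanted, loaded)}");
        }
    }

In the .NET Framework, the usual levers for the second requirement above are publisher policy assemblies and bindingRedirect entries in an application's configuration file, both of which Fusion consults before deciding which version to bind.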