Managing the Evolution of .NET Programs

Susan Eisenbach, Vladimir Jurisic and Chris Sadler

Department of Computing, Imperial College London, UK SW7 2BZ
[sue, vj98]@doc.ic.ac.uk

School of Computing Science, Middlesex University, London, UK N14 4YZ
[email protected]

Abstract. Programs in the .NET environment consist of collaborating software components, and the Common Language Runtime has been designed in such a way that linking occurs 'just-in-time'. This offers a good platform for transparent evolution, provided that compatibility can be maintained between service-providing components and their clients. One goal of program evolution is to use the most recent version of each component that will actually link. We propose a model of a cache of interdependent evolving components, with operations defined to update this cache safely. A demonstration tool, Dejavue.NET, implements these ideas.

1 Introduction

The dynamic link library (DLL) was invented to allow applications running on a single system to share code. Sharing code in this way confers several distinct benefits. In the first place, applications are likely to be smaller, both when running in memory and when stored on disk, because duplicated code has been eliminated. In the second place, applications accessing common code share a common functionality, so they all do certain things in the same way. This is good for the users, who get a reliable and predictable user interface. It also makes the design more straightforward, with provision of access to certain facilities through narrow and well-defined interfaces. The main benefit, however, arises when the DLL needs to be corrected or enhanced – in other words, to evolve. Because it is linked at run-time, any client applications stand to benefit from the modifications immediately – the new version replaces the old one painlessly, in the sense that the user doesn't have to do anything apart from launch the application. This is good for application users and developers generally. However, there can be pain when a new application is installed on a user's system:

Upgrade: Some enhancements may make the new version of the DLL incompatible with the old one. The new application setup will overwrite the old DLL with the new version, and will execute perfectly, but the old applications may fail.

Accidental Downgrade: If an application built with a previous version of the DLL is installed, this is likely to re-install the old version of the DLL too, so the newer applications may fail.

So, for any individual system that uses DLLs, as soon as something is changed (typically by installing a new – or upgraded – application) it is only a matter of time before the user will descend into "DLL hell" [14,13]. Many support personnel report [1] DLL hell as the single most significant user problem that they are called upon to respond to, and Microsoft has struggled to design its way around the problem. We catalogue this in section two.

The main contribution of this paper is made in section three, where we develop a model of dynamic linking intended to be a conservative extension of what actually happens. Section four describes the design and development of a utility, Dejavue.NET [9], based on the model and conceived to assist library developers in their task of developing DLLs that are trouble free. In section five we report on other recent work which has addressed this problem, and in section six we give some indications of where we want to take this work in the future.

2 DLL Hell

Microsoft's first idea was to introduce version numbers. By allowing several versions of a DLL to co-exist, this could have eased the upgrade problem, provided that each application 'knew' which version of the DLL it needed, and provided that the DLL developers only incremented the version number when they made incompatible changes. In practice, this is too much to ask of a heterogeneous group of developers, and Microsoft did not even try it. Instead, version numbers were used during installation (rather than when the application was first built) to ensure that a previous version did not overwrite a newer version [19]. This would have solved the downgrade problem if the people who wrote the installation scripts had consistently checked version numbers. However, not checking version numbers guaranteed that the current application would run although others might fail (rather than vice versa), so there was little incentive to do this.

The next solution cropped up in Windows 2000. The approach was on two fronts. Firstly, a few thousand DLLs were earmarked as protected DLLs, which the Windows File Protection (WFP) system will guard. When a new application is being installed, if an attempt is made to download an earlier or later version than the currently installed version of a protected DLL, it will be stopped by the setup system. If there is no currently installed version of the protected DLL, the setup system will install the corresponding version from the home system's DLL cache, regardless of which version the application actually requires. Protected DLLs can only be updated by a service pack update. This means that the computer system user (via Microsoft) controls the DLLs that exist on his system. The developers thus lose one of the main advantages of DLLs, the ability to perform transparent evolution, and instead must take responsibility for delivering capability on whatever systems their customers have. Developers whose DLLs lie outside the chosen few thousand also lose the benefits of the 'common functionality'.

The second front opened by Windows 2000 loses even more benefits of sharing. So-called 'side-by-side' DLLs can be created by associating a specific instance of a DLL with an application. Other applications may use the same version, but will not be sharing the same instance. The DLL developer loses all the potential advantages of

dynamic linking, since every side-by-side instance will need to be explicitly replaced. Alternatively, other applications may use a different version, so the functionality will cease to be common. These mechanisms help to solve the problems of DLL hell and make life more peaceful for Microsoft support personnel, but they do so at the expense of some of the sharing advantages that DLLs were originally invented to exploit.

Microsoft's latest ideas are embodied in the Common Language Runtime (CLR) of the .NET environment. .NET aims to provide developers with a component-based execution model. Any application will consist of a suite of co-operating software components amongst whose members control is passed during execution. The CLR needs to be 'language-neutral' (hence Common), which means it should be possible to write a component in any one of a variety of languages and have it executed by the CLR; and also that it should be possible to pass data between components written in different languages (using the Common Type System – CTS). The usual way to do this is to introduce an intermediate language which the high-level languages compile to, and to constrain their data types to conform to those recognised by the intermediate language. .NET achieves this with IL (Intermediate Language). Whereas most systems with an intermediate language have run-time systems that interpret the intermediate code (usually for portability reasons), the CLR uses a 'just-in-time' (JIT) compiler to translate IL fragments (primarily method bodies) into native code at run-time.

Of course, the native code must run within some previously-loaded context, and when the JIT compiler encounters a reference¹ external to the context, it must invoke some process to locate and establish the appropriate context before it can resolve the reference. This process can be quite involved and is illustrated in Figure 1. When a reference is made to a not yet loaded entity (type or type member) within the current, or another previously-loaded, component, a classloader is called to load the corresponding type definition. When a reference is made to an entity external to all loaded components, it will be necessary to load the new component (or assembly in Microsoft-speak) in its entirety via the Library Loader. The CLR has to meet some serious dynamic link-loading demands in order to provide both internal and external data definitions and JIT compilation in the time-frame of native code execution.

The design of the .NET assembly reveals some details about how this is accomplished. In the first place, each Microsoft assembly carries versioning information structured into four fields – Major, Minor, Revision and Build. This allows many versions of the same DLL to co-exist in the system, and gives us the means to associate or disassociate the different versions with one another. For example, versions with the same Major and Minor version numbers would probably be fairly compatible, whilst two versions with widely separated Major numbers would probably not. To make this system useful, two things are required. Firstly, we have to agree the semantics of what constitutes Major, Minor etc., and this means we must understand what 'compatibility' means. Secondly, we have to be able to exercise some control over which version will be loaded by the Library Loader when a new component is needed at run-time.
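To make the four-field scheme concrete, the following sketch (ours, purely illustrative – it is not a policy prescribed by .NET or proposed in this paper) shows how a compatibility heuristic over such version numbers might be expressed:

```python
from typing import NamedTuple

class Version(NamedTuple):
    """A four-field assembly version, using the field names given in the text."""
    major: int
    minor: int
    revision: int
    build: int

def probably_compatible(old: Version, new: Version) -> bool:
    """Heuristic only: treat versions that share Major and Minor as compatible,
    assuming Revision/Build changes are non-breaking. Nothing enforces this
    unless developers agree on the semantics of each field."""
    return (old.major, old.minor) == (new.major, new.minor)

assert probably_compatible(Version(1, 2, 0, 7), Version(1, 2, 3, 0))
assert not probably_compatible(Version(1, 2, 0, 0), Version(2, 0, 0, 0))
```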

¹ Microsoft distinguishes between references, which are in client code, and definitions, which are in server code.

[Figure 1 diagram: boxes for the Assembly Resolver (Fusion), Cache, Library Loader, Class Loader, Verifier and JIT Compiler, linked by calls whose reasons include 'manifest module is not resolved to a loading path', 'metadata definition of the type is not loaded', 'class or interface is not loaded', 'reference made to an unloaded type', 'subtypes not checked', 'unresolved tokens' and 'a method without a native stub is called', starting from the execution of native method code.]

Figure 1. The CLR dynamic linking phases

The Fusion utility in .NET is configured to identify, according to some policy, the location of an appropriate component and to pass its pathname to the Library Loader.

The purpose of each assembly is to contain at least one code module. If we restrict it to exactly one module, then Microsoft's definition of an assembly coincides with nearly everybody else's definition of a component, so we shall adopt that restriction. It is the module that is the basic unit for loading in the .NET run-time. Each module contains a piece of code (e.g. a class definition) translated into an IL binary. In addition, the module incorporates metadata structures that record the name and location of every type and member defined within the module. This metadata is referenced by the CLR whenever it needs to instantiate an object, access an attribute, invoke a method or request a previously unloaded type from the class loader.

In addition to the above (module) features, the assembly contains its own metadata, consisting of type metadata and a manifest. These features distinguish .NET from other current run-time environments. The type metadata records external type and member references indexed against a manifest entry. The manifest contains details (name, version number, public key, etc.) of each external component needed by the assembly. This information is what permits dynamic linking between the executing components and, in .NET, it must be explicitly provided by the programmer at compile-time. By contrast, the Java Virtual Machine has a simpler dynamic linking mechanism that implicitly resolves references along various 'classpaths'. The significance of this difference is that the JVM can only accommodate one version of each component at a time – normally, the first one encountered in the classpath, unless elaborate mechanisms are employed to involve custom classloaders [18]. By forcing explicit references .NET allows the programmer to bind the client code to particular versions of the referenced services. At run-time, those exact services will be accessed, provided that they still exist and that the manifest binding is not superseded by an alternative policy.

The assembly metadata is the key to the management of both dynamic linking and the sensible evolution of components. Some of this information can be accessed at run-time by developers via a Managed Reflection API, but it cannot be manipulated by this means. Instead, a set of 'unmanaged' COM interfaces allows full read/write access via C++ hacking. By this means we can see a possible way out of DLL Hell. Firstly, when a component evolves, we do not need to displace the old version with the new, because .NET lets us distinguish between them by means of version numbers. Secondly, by manipulating the manifests of client components we can ensure that suitable clients can benefit from the post-compile-time evolution of their service components, thus fulfilling the main benefit of evolving DLLs. By the same means we can leave other clients' bindings undisturbed and thus overcome DLL Hell. To do this, we need to:

1. establish the characteristics of a set of inter-related components – called, in .NET, the Global Assembly Cache (GAC);
2. investigate how these inter-relationships may be affected by component evolution, and identify any requirements this may impose;
3. reflect the requirements in new policies directed at Fusion.

Section three tackles points (1) and (2), whilst section four describes a tool which attempts to implement (3).


3 Modelling the Component Cache

Let $S$ be a set of services $\{s_1, \ldots, s_k\}$ and let $C$ be a set of components $\{c_1, \ldots, c_m\}$. A service is either a type (such as a class) or a member (field or method) represented by the underlying .NET Common Type System. Components consist of a name, a version number, a Boolean to test if they contain a main method, and three sets. These are the set of services that the component imports – $import$, the set of services that the component exports – $export$, and a set of components that provide the services that are listed in the imports – $require$. In a programming context, the services that a component imports are the references that need to be resolved so that the component can be linked.

Definition 1. A component $c$ is defined as:
$$c = \langle\, name : string,\ version : \mathbb{N},\ hasmain : bool,\ import : \mathcal{P}(S),\ export : \mathcal{P}(S),\ require : \mathcal{P}(C) \,\rangle$$

We can extract an element of a component by its name, e.g. $c.hasmain$. We assume the ordering of $version$ is temporal, that is, if two components have the same name but different versions, the one with the greater version was produced more recently. We use the $name$ and the $version$ to differentiate two components, so we can uniquely identify a component with the pair $(name, version)$. This means that in our model there cannot be two identical components, since there is no way to represent this possibility. Nor can we have two components with the same $name$ and $version$ but with different $hasmain$, $import$, $export$ or $require$.

Property 1 (Uniqueness).
$$\forall c_1, c_2 \in C.\ ((c_1.name = c_2.name \,\land\, c_1.version = c_2.version) \Rightarrow c_1 = c_2)$$

Figure 2 contains an example of a set of components that have already evolved over time. In this example, the component ("Test", 1) needs services "a", "b" and "c". When ("Test", 1) was first added to the set, its $require$ set was {("Y", 1), ("X", 1)}. ("Y", 1) provided "b" and "c" and ("X", 1) provided "a". The component ("X", 1) needed to import a "b" and so it has a $require$ set containing ("Y", 1). After some time the component ("X", 1) is updated and component ("X", 2) is added to the set. It is newer and still provides an "a", and so it replaces ("X", 1) in ("Test", 1)'s $require$ set. When "X" is updated again to ("X", 3), the new component cannot replace ("X", 2) in the $require$ set of ("Test", 1), because the change made removed "a" from the $export$ set.

It is possible that a service required by one of a set of components is not available. We define the predicate $coherent$ to test whether all services required by the components within a set are available in that set.

Definition 2.
$$coherent(C) \triangleq \forall c \in C.\ (\forall s \in c.import.\ (\exists c' \in C.\ s \in c'.export))$$

[Figure 2 diagram: the components ("Test", 1), ("X", 1), ("X", 2), ("X", 3) and ("Y", 1), each shown with its $hasmain$ flag and its $import$, $export$ and $require$ sets.]

Figure 2. A $coherent$ set of components containing three versions of one component and two other components
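The component record and the $coherent$ predicate of Definitions 1 and 2 can be transcribed almost directly into executable form. The following Python sketch is purely illustrative (the class and function names are ours, not part of the paper or of .NET); because Python reserves the word `import`, the fields are named `imports`, `exports` and `requires`:

```python
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

@dataclass(frozen=True)
class Component:
    """Definition 1: a component with a name, a version, a hasmain flag and
    its import, export and require sets (renamed to avoid Python keywords)."""
    name: str
    version: int
    hasmain: bool
    imports: FrozenSet[str]                                # services needed
    exports: FrozenSet[str]                                # services provided
    requires: FrozenSet[Tuple[str, int]] = frozenset()     # providers as (name, version)

def coherent(cs: Set[Component]) -> bool:
    """Definition 2: every imported service is exported by some component in the set."""
    provided = {s for c in cs for s in c.exports}
    return all(s in provided for c in cs for s in c.imports)

# The components of Figure 2 (version numbers abbreviated to a single integer).
Y1 = Component("Y", 1, False, frozenset(), frozenset({"b", "c"}))
X1 = Component("X", 1, False, frozenset({"b"}), frozenset({"a"}), frozenset({("Y", 1)}))
X2 = Component("X", 2, False, frozenset(), frozenset({"a"}))
X3 = Component("X", 3, False, frozenset(), frozenset({"d"}))
TEST1 = Component("Test", 1, True, frozenset({"a", "b", "c"}), frozenset(),
                  frozenset({("X", 2), ("Y", 1)}))

assert coherent({Y1, X1, X2, X3, TEST1})
```

The set containing all five components is coherent: every imported service ("a", "b", "c") is exported by at least one member of the set.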

A component can be added to a set of components only if it isn't already in the set and if adding it doesn't lead to unresolved references in the augmented set. We tend to use the prime ($c'$) to differentiate two components with the same $name$. We do not allow a component to be added to a set of components if it has the same $name$ and $version$ as a component already in the set, because it would break Property 1.

Definition 3.
$$add(C, c) = \begin{cases} C \cup \{c\} & \text{if } \neg\exists c' \in C.\ (c'.name = c.name \land c'.version = c.version) \ \land\ coherent(C \cup \{c\}) \\ C & \text{otherwise} \end{cases}$$

A component can be removed from a set of components only if the services it provided to other components are still available from the other components in the set.

Definition 4.
$$remove(C, c) = \begin{cases} C \setminus \{c\} & \text{if } coherent(C \setminus \{c\}) \\ C & \text{otherwise} \end{cases}$$

A set of components may have several components with the same $name$ but different $version$s. This can occur when a component has been changed over time, but earlier versions have not been removed. We may wish to have only the latest version of each component in the set, but unfortunately this may not always be permissible, because other components within the set may rely on an earlier version. Trying to remove a component from a set when that removal would result in the set losing its $coherent$ property results in nothing being removed. So $latest$ ensures that components which have in some sense become surplus to requirements are removed.

Definition 5.
$$latest(C) = C \setminus \{\, c \in C \mid \exists c' \in C.\ (c'.name = c.name \land c'.version > c.version) \ \land\ coherent(C \setminus \{c\}) \,\}$$
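Continuing the Python sketch above, Definitions 3–5 read as follows (again the function names are ours; note that, as in Definition 5, each superseded component is tested for removal independently):

```python
from typing import Set

def add(cs: Set[Component], c: Component) -> Set[Component]:
    """Definition 3: add c only if no component with the same (name, version)
    is present and the augmented set stays coherent; otherwise leave cs alone."""
    clash = any(x.name == c.name and x.version == c.version for x in cs)
    if not clash and coherent(cs | {c}):
        return cs | {c}
    return cs

def remove(cs: Set[Component], c: Component) -> Set[Component]:
    """Definition 4: remove c only if the remaining set is still coherent."""
    return cs - {c} if coherent(cs - {c}) else cs

def latest(cs: Set[Component]) -> Set[Component]:
    """Definition 5: drop every component that is superseded by a newer version
    of the same name and whose individual removal keeps the set coherent."""
    surplus = {c for c in cs
               if any(x.name == c.name and x.version > c.version for x in cs)
               and coherent(cs - {c})}
    return cs - surplus
```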

Two components may be related by one providing a service that the other requires.

Definition 6.
$$satisfies(c, c') \triangleq \exists s \in S.\ (s \in c.import \,\land\, s \in c'.export)$$

For any component $c$, the set of components that provide all its services is called $Satisfies$. The use of $latest$ ensures that the components used are up-to-date versions.

Definition 7.
$$Satisfies(c) = latest(\{\, c' \in C \mid satisfies(c, c') \,\})$$

One component $c$ may require the services of another component $c'$, which in turn may require the services of a component $c''$. So $c$ will not be able to be built into a program without $c''$. We call this (transitive) relation between $c$ and $c''$ $satisfies^*$.

Definition 8.
$$satisfies^*(c, c'') \triangleq satisfies(c, c'') \,\lor\, \exists c'.\ (satisfies(c, c') \land satisfies^*(c', c''))$$

Starting with any component $c$, the set of all the components that provide services to $c$ either directly or indirectly is called $Satisfies^*$. The use of $latest$ in the definition ensures that earlier versions of a component, if not needed by any of the components within the set, will not get used if there is a more recent one that provides the necessary services.

Definition 9.
$$Satisfies^*(c) = latest(\{\, c' \in C \mid satisfies^*(c, c') \,\})$$
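The transitive relation of Definitions 8 and 9 is, in effect, a reachability computation over the 'provides a service to' edges. A sketch continuing the Python model above (function names ours):

```python
from typing import Set

def satisfies(c: Component, provider: Component) -> bool:
    """Definition 6: provider exports at least one service that c imports."""
    return bool(c.imports & provider.exports)

def satisfies_star(c: Component, cs: Set[Component]) -> Set[Component]:
    """Definition 8: every component reachable from c along the satisfies
    relation, i.e. c's direct and indirect service providers in cs."""
    reached: Set[Component] = set()
    frontier = [c]
    while frontier:
        current = frontier.pop()
        for candidate in cs:
            if candidate not in reached and satisfies(current, candidate):
                reached.add(candidate)
                frontier.append(candidate)
    return reached

def Satisfies_star(c: Component, cs: Set[Component]) -> Set[Component]:
    """Definition 9: keep only the up-to-date providers, via latest."""
    return latest(satisfies_star(c, cs))
```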

We need to model the component cache $\mathcal{C}$. From the definitions above, all sets have the property that services required from any component within the set are always available within the set itself. The component cache $\mathcal{C}$ has an additional property: each component $c$ in the cache has as its $require$ set a set of components from which all its references can be resolved. So for each component all $import$s will be resolved, and we also know the components they are resolved from.

Definition 10. A global component cache $\mathcal{C}$ is a finite set of components s.t.
$$\forall c \in \mathcal{C}.\ (c.require = Satisfies(c))$$

Next we define a program $P$. Programs consist of a set of components of which at least one component $c$ has a true $hasmain$, and there are no unresolved references in any of the components. There exists exactly one component $c$ that is the starting point ($hasmain$ holds for this component) and all other components are those that can be reached from this component by the $Satisfies^*$ relation. Figure 3 shows a cache $\mathcal{C}$ containing sixteen components and three programs.

Figure 3. The Cache $\mathcal{C}$ containing three programs

Definition 11. A program $P(c)$ is a finite set of components including a component $c$ where $c.hasmain$, s.t.
$$P(c) = \{c\} \cup Satisfies^*(c)$$

Sometimes we need to know whether a subset of a cache has the latest versions of every component available in that cache. This we call $WellVersioned$. It is a property that we will need to have defined for programs.

Definition 12.
$$WellVersioned(C', C) \triangleq \forall c \in C'.\ \forall c'' \in C \setminus C'.\ ((c''.name = c.name) \Rightarrow ((c''.version < c.version) \,\lor\, \neg coherent((C' \setminus \{c\}) \cup \{c''\})))$$

Finally we need an algorithm for evolving the global cache. We only put into the global cache new components whose references can be completely satisfied by the components already in the global cache. After we put a component in, we need to do some housekeeping. Firstly, all components in the global cache (including the new one) have to be reconfigured, so that they now get their $import$ed services from the latest components available. This is done by overwriting, for which we use the symbol $\leftarrow$. Secondly, any components which are no longer needed are removed.

Definition 13.
$$reconfigure(C) \triangleq \forall c \in C.\ (c.require \leftarrow Satisfies(c))$$

Definition 14.
$$evolve(\mathcal{C}, c) = \begin{cases} latest(reconfigure(add(\mathcal{C}, c))) & \text{if } coherent(\mathcal{C} \cup \{c\}) \\ \text{give error message and terminate} & \text{otherwise} \end{cases}$$
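Putting the pieces together, here is a sketch of the housekeeping performed by $evolve$ (Definitions 13 and 14) on the Python model above. Because the Component values in the sketch are immutable, reconfiguration is modelled by rebuilding each component with a fresh requires set rather than by overwriting in place:

```python
from typing import Set

def Satisfies(c: Component, cs: Set[Component]) -> Set[Component]:
    """Definition 7: the latest components that directly provide c's imports."""
    return latest({x for x in cs if satisfies(c, x)})

def reconfigure(cs: Set[Component]) -> Set[Component]:
    """Definition 13: point every component's requires set at Satisfies(c)."""
    rebuilt = set()
    for c in cs:
        providers = Satisfies(c, cs)
        rebuilt.add(Component(c.name, c.version, c.hasmain, c.imports, c.exports,
                              frozenset((p.name, p.version) for p in providers)))
    return rebuilt

def evolve(cache: Set[Component], c: Component) -> Set[Component]:
    """Definition 14: admit c only if the cache stays coherent, then reconfigure
    all requires sets and discard superseded versions."""
    if not coherent(cache | {c}):
        raise ValueError(f"cannot evolve cache: {c.name} v{c.version} has unresolved references")
    return latest(reconfigure(add(cache, c)))
```

On the Figure 2 example, evolving the cache {("Y", 1), ("X", 1), ("Test", 1)} with ("X", 2) drops ("X", 1) and repoints ("Test", 1)'s requires set at ("X", 2); evolving again with ("X", 3) retains ("X", 2), because removing it would leave ("Test", 1)'s import of "a" unresolved.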

There are three properties that we need to ensure are maintained during the evolution of the cache. Firstly, we wish always to have a $coherent$ global cache $\mathcal{C}$. This property is true by construction. The only operations we have to alter a component cache are $add$, removal (using $latest$) and changes to the $require$ sets. The first two cannot take a $coherent$ set and produce one that isn't $coherent$. The changes to $require$ sets do not affect which components are in a set, nor the $import$ and $export$ sets, so these changes also cannot alter whether a set is $coherent$. So all changes that can be made to a cache leave the cache $coherent$.

Property 2.
$$coherent(\mathcal{C})$$

We wish to know that a program $P(c)$ built from components in $\mathcal{C}$ will also be $coherent$. All the required components are available in $\mathcal{C}$ since $coherent(\mathcal{C})$. New programs are $coherent$ and the changes during cache evolution are designed to maintain the $coherent$ property.

Property 3.
$$P(c) \subseteq \mathcal{C} \Rightarrow coherent(P(c))$$

Finally, in our programs we wish to use the latest version of any component that will not cause problems. New programs are $WellVersioned$ and the changes during cache evolution are designed to maintain $WellVersioned$-ness.

Property 4.
$$P(c) \subseteq \mathcal{C} \Rightarrow WellVersioned(P(c), \mathcal{C})$$

If the component cache is evolved according to our model then the programs built from the cache components should contain the latest versions of components that do not cause problems. We set out to build a versioning tool, Dejavue.NET, that could follow the precepts of our model to manage evolution in .NET.

4 Dejavue.NET

Before tackling the tool, a number of practical problems need to be solved. Our model envisages a single version number, with the convention that a higher number indicates a later version. Early Beta versions of .NET revealed a versioning policy (as observed in Fusion's behaviour) which would resolve references against an assembly with higher Revision and Build numbers than the compile-time version, provided the Major and Minor numbers were the same. No checking was done to determine whether or not the two versions were binary compatible. Presumably this approach did not solve the problems for the Beta testers, because in later Beta releases, and indeed in the .NET official release, this policy was dropped in favour of an exact match between the compile-time and the link-time versions – thus ruling out dynamic evolution altogether.

[Figure 4 diagram: the Cache Manager, with its Component Installation Procedure, sits between the Client Front End, the Component Repository and the Client Configuration File.]

Figure 4. High level architecture of the single-machine Dejavue.NET

Until there is some agreement amongst developers about what sort of evolution should cause a change in which parts of the version number, it will never be possible to implement sufficiently sensitive policies based on version numbers alone. For the tool, therefore, we stick to the idea that a higher number indicates a later version, and we use the $coherent$ and $WellVersioned$ properties to determine which later version, if any, maintains compatibility.

.NET provides the GAC (Global Assembly Cache), which is where Fusion looks for assemblies (it can also look across the network) to resolve references. Initially we planned to manipulate assembly metadata in situ in order to keep client components abreast of service component evolution. This idea was frustrated in two ways – firstly, .NET encrypts assemblies in the GAC, and this puts the metadata out of reach of unmanaged API manipulations. Secondly, in the .NET release the GAC assemblies were placed out of reach of even the managed API. This is another blow (like the inflexible versioning policy) to dynamic evolution.

However, before it looks in the GAC, Fusion consults a sequence of configuration files. Whoever can control these files has the capability to override, with more up-to-date possibilities, the assembly metadata relationships established at compile-time. This provides a mechanism for implementing a more evolutionary dynamic linking policy, utilising the operations and predicates identified within our theoretical model.

Instead of using the GAC, our tool mimics the GAC with its own Component Repository (CR), which is maintained by a CacheManager module (see Figure 4). This faithfully implements the installation regime defined in our model via a Component Installation Routine. A second function in the CacheManager, the Redirection Information Retrieval Routine, traverses the CR dependency tree to construct, for a given component, a configuration file containing the current (link-time) dependency information.
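To make the mechanism concrete, the redirection information that the Redirection Information Retrieval Routine derives from the CR would be expressed, for the client, as a standard .NET application configuration file whose assemblyBinding/bindingRedirect elements tell Fusion to resolve a manifest reference against a different version. The sketch below is ours, not the tool's actual code: it maps the model's single version number onto the Major field of a four-part assembly version, and a real configuration would also carry the publicKeyToken and culture of each strong-named assembly.

```python
from typing import Set

def client_configuration(client: Component, cache: Set[Component]) -> str:
    """Emit an application configuration file that redirects each of the client's
    compile-time bindings to the newest version of that component in the cache
    (the real tool would honour the coherent/WellVersioned checks when choosing)."""
    redirects = []
    for name, old_version in sorted(client.requires):          # compile-time binding
        newest = max((x for x in cache if x.name == name),
                     key=lambda x: x.version)                  # link-time choice
        redirects.append(
            f'      <dependentAssembly>\n'
            f'        <assemblyIdentity name="{name}" />\n'
            f'        <bindingRedirect oldVersion="{old_version}.0.0.0" '
            f'newVersion="{newest.version}.0.0.0" />\n'
            f'      </dependentAssembly>')
    header = ('<configuration>\n'
              '  <runtime>\n'
              '    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">\n')
    footer = ('    </assemblyBinding>\n'
              '  </runtime>\n'
              '</configuration>')
    return header + "\n".join(redirects) + "\n" + footer
```

A file of this shape, placed in the ClientFrontEnd's working directory, is what allows Fusion to locate components in the CR instead of the GAC, as described below.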

[Figure 5 diagram: a Distributed Server hosting the Component Repository, connected to several Distributed Front Ends, each with its own Client Configuration File.]

Figure 5. Distributed Dejavue.NET architecture

The final part of the tool is the ClientFrontEnd. This allows users to execute the most up-to-date version of an executable component e, in the following steps:

1. requests the CacheManager to interrogate the CR to produce a list of executable components;
2. allows the user to select the required component e;
3. passes the component name back to the CacheManager to construct 'on the fly' the appropriate configuration file, which is stored in the ClientFrontEnd's working directory;
4. invokes e.

Once this has been done, any references external to e will be resolved by invoking Fusion. Instead of looking in the GAC, however, Fusion will use the configuration file to locate components in the CR.

A tool of this sort can help an end-user to keep up-to-date, provided all updates are downloaded onto the local system and introduced to the Component Repository via the Cache Manager. This is not a very practical solution, especially from the point of view of component developers. Instead, a prototype distributed version of the tool has been developed (see Figure 5). In this scheme the Component Repository is hosted on a server which performs the functions of the Cache Manager. Each client then has a local copy of the Client Front End, which maintains local copies of the application configuration files.

This scheme could slow down execution times if all external references have to be resolved across the network, so efficiency considerations may dictate some local caching mechanism. In addition, where support is required for distributed component developers and maintainers, each one needs to be able to install components in the Component Repository. This is a complex procedure which alters the state of the entire cache, so it is necessary to lock the cache during each execution of the Component Installation Routine. This may make the Component Repository noticeably inaccessible to other developers and to all users, so there is some risk of degrading the service. This gives further motivation to the idea of local caching.

5 Related Work

The work described here is a natural extension of work on Java done in collaboration with Drossopoulou and Wragg [17,16]. We have examined the problems associated with the evolution of Java distributed libraries [5,6]. Drossopoulou then went on to model dynamic linking in [3] and we looked at the nature of dynamic linking in Java [4]. We have looked at the problems that arise with binary compatible code changes in [5] and have built tools, described in [6,18]. Other formal work on distributed versioning has been done by Sewell in [20].

Other groups have studied the problem of protecting clients from unfortunate library modifications. [12] identified four problems with 'parent class exchange'. One of these concerned the introduction of a new (abstract) method into an interface. The other issues all concern library methods which are overridden by client methods in circumstances where, under evolution, the application behaviour is adversely affected. To solve these problems, reuse contracts are proposed in order to document the library developer's design commitments. As the library evolves, the terms of the library's contract change, and the same is true of the corresponding client's contract. Comparison of these contracts can serve to identify potential problems.

Mezini [11] investigated the same problem (here termed horizontal evolution) and considered that conventional composition mechanisms were not sophisticated enough to propagate design properties to the client. She proposed a smart composition model wherein, amongst other things, information about the library calling structure is made available at the client's site. Successive versions of this information can be compared using reflection to determine how the client can be protected. These ideas have been implemented as an extension to Smalltalk.

[10,7,15] have done work on altering previously compiled code. Such systems enable library code to be mutated to behave in a manner that suits the client. Although work on altering binaries preceded Java [15], it came into its own with Java, since Java bytecode is high level and contains type information. In both the Binary Component Adaptation System [10] and the Java Object Instrumentation Environment [7] class files are altered and the new ones are loaded and run. One of the main purposes of this kind of work is the extension of classes for instrumentation purposes, but these systems could be used for other changes. We have not taken the approach of altering library developers' code because it makes the application developer responsible for the used library code. Responsibility without source or documentation is not a desirable situation. There is also the problem of integrating new library releases, from which the client may then not benefit.

Often configuration management is about configuration per se – technical issues about storage and distribution; or management per se – policy issues about what should be managed and how. Network-Unified Configuration Management (NUCM) [2] embraces both in an architecture incorporating a generic model of a distributed repository. The interface to this repository is sufficiently flexible to allow different policies to be manifested. World Wide Configuration Management (WWCM) [8] provides an API for a web-based client-server system. It is built around the Configuration Management Engine (CME) to implement what is effectively a distributed project. CME allows elements of a project to be arranged in a hierarchy and working sets to be checked in and out, and the project as a whole can be versioned, so several different versions may exist concurrently.

6 Conclusions

Perhaps DLL Hell is a special place reserved only for Microsoft people, but the Upgrade Problem has to be faced by everybody who writes or uses component-based software. Some part of the solution may lie in clever language design, and another may lie in employing strict software development and maintenance procedures – but a large part must lie in the versioning strategy of the linking mechanism deployed by the run-time system.

In this paper we have looked at this mechanism as deployed by .NET, one of the most up-to-date run-time systems. We have modelled it and analysed some of its architectural features. There is much to recommend in many of the design decisions made. For example, the multi-part version numbers give developers scope to classify the effects of modifications precisely, and the assembly metadata with the managed code API offers the prospect of reflective code compatibility analysis. However, some of the implementation trade-offs in the current release have frustratingly limited the scope of what is possible. For example, the GAC is not usable as a cache for evolving components. As a consequence we have developed our prototype tool, to work alongside and within .NET, and to demonstrate what could be attainable.

Acknowledgements

We would like to thank Sophia Drossopoulou, Matthew Smith and Michael Huth for reading earlier versions of this paper and for their helpful suggestions. Dejavue.NET was influenced by DEJaVU, a tool for Java evolution implemented by Shakil Shaikh and Miles Barr. We also thank Andrew Tseng and Dave Porter of UBS Warburg for providing motivation beyond idle academic curiosity. Finally, we would like to thank Sophia Drossopoulou and all the project students of our weekly meeting group for their helpful suggestions and insights while the ideas were being developed.

References

1. R. Anderson. The End of DLL Hell. msdn.microsoft.com/, January 2000.
2. A. van der Hoek, M. Heimbigner and A. L. Wolf. Versioned Software Architecture. In Proc. of the 3rd International Software Architecture Workshop, November 1998.
3. S. Drossopoulou. An Abstract Model of Java Dynamic Linking, Loading and Verification. In Types in Compilation, September 2001.
4. S. Eisenbach and S. Drossopoulou. Manifestations of the Dynamic Linking Process in Java. www-dse.doc.ic.ac.uk/projects/slurp/dynamic-link/linking.htm, June 2001.
5. S. Eisenbach and C. Sadler. Ephemeral Java Source Code. In IEEE Workshop on Future Trends in Distributed Systems, December 1999.
6. S. Eisenbach and C. Sadler. Changing Java Programs. In IEEE Conference in Software Maintenance, November 2001.
7. G. Cohen, J. Chase and D. Kaminsky. Automatic Program Transformation with JOIE. In USENIX Annual Technical Symposium, 1998.
8. J. J. Hunt, F. Lamers, J. Reuter and W. F. Tichy. Distributed Configuration Management via Java and the World Wide Web. In Proc. of the 7th Intl. Workshop on Software Configuration Management, 1997.
9. V. Jurisic. Deja-vu.NET: A Framework for Evolution of Component Based Systems. www.doc.ic.ac.uk/ajf/Teaching/Projects/DistProjects.html, June 2002.
10. R. Keller and U. Hölzle. Binary Component Adaptation. In Proc. of the European Conf. on Object-Oriented Programming. Springer-Verlag, July 1998.
11. M. Mezini and K. J. Lieberherr. Adaptive Plug-and-Play Components for Evolutionary Software Development. In Proc. of OOPSLA, pages 97–116, 1998.
12. P. Steyaert, C. Lucas, K. Mens and T. D'Hondt. Reuse Contracts: Managing the Evolution of Reusable Assets. In Proc. of OOPSLA, 1996.
13. M. Pietrek. Avoiding DLL Hell: Introducing Application Metadata in the Microsoft .NET Framework. MSDN Magazine, October 2000.
14. S. Pratschner. Simplifying Deployment and Solving DLL Hell with the .NET Framework. msdn.microsoft.com/, November 2001.
15. R. Wahbe, S. Lucco and S. Graham. Adaptable Binary Programs. Technical Report CMU-CS-94-137, 1994.
16. S. Drossopoulou, D. Wragg and S. Eisenbach. What is Java binary compatibility? In Proc. of OOPSLA, volume 33, pages 341–358, 1998.
17. S. Drossopoulou, S. Eisenbach and D. Wragg. A Fragment Calculus: Towards a Model of Separate Compilation, Linking and Binary Compatibility. In Logic in Computer Science, pages 147–156, 1999.
18. S. Eisenbach, C. Sadler and S. Shaikh. Evolution of Distributed Java Programs. In IEEE Working Conf. on Component Deployment, June 2002.
19. I. Salmre. Building, Versioning and Maintaining Components. msdn.microsoft.com/, February 1998.
20. P. Sewell. Modules, Abstract Types, and Distributed Versioning. In Proc. of Principles of Programming Languages. ACM Press, January 2001.
