IL Intermediate Language CHAPTER 1.3
Total Page:16
File Type:pdf, Size:1020Kb
05 153x ch01_3 8/31/01 1:55 PM Page 31 IL Intermediate Language CHAPTER 1.3 IN THIS CHAPTER • Language Inter-Op 33 • Hello IL 34 • Functions 35 • Classes 38 • ILDASM 40 • Metadata 42 • Reflection API 43 05 153x ch01_3 8/31/01 1:55 PM Page 32 The .NET Framework 32 PART I In the current day and age of software development, developers use high-level languages such as C++, Visual Basic, or Java to implement applications and components. I personally don’t know any working binary developers and few assembly language programmers. So why is IL, Intermediate Language, so important to .NET? To answer this question requires a brief tour of compiler theory 101. The basic task of a compiler is to transform readable source code into native executable code for a given platform. The compilation process includes several phases, which include but are not limited to lexical scanning, parsing, abstract syntax tree creation, intermediate code creation, code emission, assembler, and linking. Figure 1.3.1 shows a basic diagram of this process. Source code abstract syntax tree assembling Lexical scanning intermediate code linking parsing code emission executable code FIGURE 1.3.1 The basic compilation process. The intermediate code creation poses an interesting question: What if it’s possible to translate a given set of languages into some standard intermediate code? This is where IL fits into the scheme of things. The ability to translate high-level language constructs into a standardized format, such as IL, allows for a single underlying compiler or just-in-time (JIT) compiler. This is the approach taken by .NET. With the introduction of Visual Studio .NET, Microsoft is providing Managed C++, C#, Visual Basic, and ASP.NET, each of which produce IL from their respective source code. There are also several third-party language vendors providing .NET versions of their respective lan- guages. Figure 1.3.2 shows the basic .NET compilation process. It is important to note that IL is not interpreted. Rather, the .NET platform makes use of vari- ous types of just-in-time compilers to produce native code from the IL code. Native code runs directly on the hardware and does not require a virtual machine, as is the case with Java and early versions of Visual Basic. By compiling IL to native code, the runtime performance is increased compared to interpreted languages that run on top of a virtual machine. We will be covering the various types of just-in-time compilers later on. (From here on out I will refer to the just-in-time compiler(s) as the jitter). 05 153x ch01_3 8/31/01 1:55 PM Page 33 IL Intermediate Language 33 CHAPTER 1.3 Managed C++ C# VB.NET Language X… 1.3 IL I L NTERMEDIATE ANGUAGE IL JIT Compiler Native Code FIGURE 1.3.2 .NET compilation. IL is a full, stack-based language that you can use to implement .NET components. Although it is definitely not suggested for the sake of productivity, you can use IL to do things that you can’t do with C#. For example, in IL, you can create global functions and call them from within IL. Language Inter-Op Like the quest for the Holy Grail, the ability to seamlessly inter-op between languages has been on the forefront. On the various Windows platforms, the standard for cross language development has been the Component Object Model (COM). An entire industry was born whose sole purpose was developing COM components and ActiveX controls. Because COM is a binary standard, any language capable of consuming COM components is able to use them regardless with which language the COM component was created. Although COM allowed for reusable components, it was far from perfect. Certain languages, such as scripting languages, could only make use of the default interface provided by the COM object. Also, it was not possible to inherit from a COM component to extend its basic function- ality. To extend a binary object, developers were forced to wrap the COM component, not a very OO approach. With the advent of .NET and IL, such barriers no longer exist. Because every language tar- geted at the .NET platform understands IL and the metadata of a .NET component, a new level of interoperability is born. The metadata allows inspection of a .NET component to discover the classes, methods, fields, and interfaces that it contains. The topic of metadata will be cov- ered later in this section. 05 153x ch01_3 8/31/01 1:55 PM Page 34 The .NET Framework 34 PART I Hello IL Using IL to develop .NET applications and components is probably not something you’ll be doing a lot. However, to truly dig into the .NET framework, a basic understanding of the IL structure is important. The .NET SDK does not ship with the underlying source code. This does not mean you can’t dig into the implementation of the various components. However, it does require some knowledge of IL. The traditional, almost required, first step in learning a new language is to display the infa- mous “Hello World” message on the console. Accomplishing this task with only IL is not as painful as it might sound. IL is not a truly low-level, assembly-like language. As with most modern languages, there is a fair amount of abstraction built into the language. Listing 1.3.1 shows the now famous “Hello World” program written in MSIL. LISTING 1.3.1 Hello World 1: //File :Hello.cil 2: //Author :Richard L. Weeks 3: // 4: // 5: 6: //define some basic assembly information 7: .assembly hello 8: { 9: .ver 1:0:0:0 10: } 11: 12: 13: //create an entry point for the exe 14: .method public static void main( ) il managed 15: { 16: 17: .entrypoint 18: .maxstack 1 19: 20: 21: //load string to display 22: ldstr “Hello World IL style\n” 23: 24: //display to console 25: call void [mscorlib]System.Console::WriteLine( class System.String ) 26: 27: ret 28: } 29: 05 153x ch01_3 8/31/01 1:55 PM Page 35 IL Intermediate Language 35 CHAPTER 1.3 To compile Listing 1.3.1, issue the following command line: 1.3 ilasm hello.cil IL I L NTERMEDIATE ANGUAGE The IL assembler will produce a .NET hello.exe. This IL code displays the message “Hello World IL style” by using the static Write method of the System.Console object. The Console class is part of the .NET framework and, as its use implies, its purpose is to write to the standard output. Notice that there is no need to create a class. IL is not an object-oriented language, but it does provide the constructs necessary to describe objects and use them. IL was designed to accom- modate a variety of language constructs and programming paradigms. Such diversity and open- ness in its design will allow for many different languages to be targeted at the .NET runtime. There are a number of directives that have special meanings within IL. These directives are shown in Table 1.3.1. TABLE 1.3.1 IL Directives used in “Hello World” Directive Meaning .assembly Name of the resulting assembly to produce .entrypoint Start of execution for an .exe assembly .maxstack Number of stack slots to reserve for the current method or function Notice that the .entrypoint directive defines the start of execution for any .exe assembly. Although our function has the name main, it could have any name we want. Unlike C, C++, and other high-level languages, IL does not define a function to serve as the beginning of execution. Rather, IL makes use of the .entrypoint directive to serve this purpose. Another directive of interest is the .assembly directive. IL uses this directive to name the output of the assembly. The .assembly directive also contains the version number of the assembly. Functions The Hello World IL example encompasses only one method. If only all software development requirements were that simple! Because IL is a stack based language, function parameters are pushed onto the stack and the function is called. The called function is responsible for remov- ing any parameters from the stack and for pushing any return value back onto the stack for the caller of the function. 05 153x ch01_3 8/31/01 1:55 PM Page 36 The .NET Framework 36 PART I The process of pushing and popping values to and from the call stack can be visualized as shown in Figure 1.3.3. Add function Program Code call stack call stack pop 15 push 10 15 10 push 15 10 call Add pop 10 pop 25 local result push result 25 ret FIGURE 1.3.3 The call stack. Although Figure 1.3.3. depicts a very simplified view of the actual process, enough informa- tion is provide to gain an understanding of the overall steps involved. Translating Figure 1.3.3 into IL produces the source in Listing 1.3.2. LISTING 1.3.2 IL Function Calls 1: //File :function.cil 2: 3: .assembly function_call 4: { 5: .ver 1:0:0:0 6: } 7: 8: 9: //Add method 10: .method public static int32 ‘add’( int32 a, int32 b ) il managed 11: { 12: 13: .maxstack 2 14: .locals (int32 V_0, int32 V_1) 15: 16: //pop the parameters from the call stack 17: ldarg.0 18: ldarg.1 19: 20: add 21: 22: //push the result back on the call stack 05 153x ch01_3 8/31/01 1:55 PM Page 37 IL Intermediate Language 37 CHAPTER 1.3 LISTING 1.3.2 Continued 1.3 23: stloc.0 IL I L NTERMEDIATE 24: ldloc.0 ANGUAGE 25: 26: ret 27: } 28: 29: .method public static void main( ) il managed 30: { 31: .entrypoint 32: .maxstack 2 33: .locals ( int32 V_0 ) 34: 35: //push two parameters onto the call stack 36: ldc.i4.s 10 37: ldc.i4.s 5 38: 39: call int32 ‘add’(int32,int32) 40: 41: stloc.0 //pop the result in the local variable V_0 42: 43: //Display the result 44: ldstr “10 + 5 = {0}” 45: ldloc V_0 46: box [mscorlib]System.Int32 47: call void [mscorlib]System.Console::WriteLine(class System.String, ➥ class System.Object ) 48: 49: ret 50: } 51: The function.cil code in Listing 1.3.2 introduces a new directive: .locals.