Chapter 3 Win32 Implementations: A Comparative Look

IN THIS CHAPTER ^ ^ ^

+ Comparing and contrasting implementation of the Win32 API in Windows NT and + Aspects of implementation in both Windows NT and Windows 95

EACH provides set f serviceo s applicationn sa referred- s a o t programming interface (API) - to developers in some form or another. The develop- ers write software applications using this API. For example, DOS provides this in- terface in the form of the famous INT 21h interface. Microsoft's newer 32-bit operating systems, such as Windows 95 and Windows NT, provide the interface in . API 2 Win3 e th f o m for e th Presently, there are four Win32 API implementations available from Microsoft:

8 95/9 s Window + ^ ^ + Window T ^sN - ^ ^^^^ ^^^ ^^^ ^^ ^^^^ ^^ ^- ^^^^^^^^^^ + Win32s ^ ^^ ^ ^ ^^^ + Windows CE

Of these, Win32s is very limited due to bugs and the restrictions of the underly- ing operating system. Presently, Win32 API implementations on Windows 95/98 and Windows NT are very popular among developers. Windows CE is meant for palmtop computers. The Win32 API was first implemented on the Windows NT op- erating system. Later, the same API was made available in Windows 95. Ideally, an application written using the standard Win32 API should work on any operating system that supports the Win32 API implementation. (However, this is not neces- sarily true due to the differences between the implementations.) The Win32 API should hide all the details of the underlying implementations and provide a consis- tent view to the outside world.

23 24 Part 1: Essentials

In this chapter, we focus on the differences between the implementations of the e b d shoul u yo , developers s A . 95 s Window d an T N s Window r unde I AP 2 Win3 f o h bot n o n ru n ca t tha s application p develo u yo e whil s difference e thes f o e awar these operating systems.

Win32 API Implementation on Windows 95

The Win32 API is provided in the form of the famous trio of the KERNEL32, e thes , cases t mos n i , However DLLs). ( s librarie k lin c dynami 2 GDI3 d an , USER32 DLLs are just wrappers that use generic thunking to call the 16-bit functions.

Generic thunking is a way of calling 16-bit functions from a 32-bit applica- tion. (More on thunking later in this chapter.)

- in , Hence . compatibility d backwar s wa 5 9 s Window r fo l goa n desig r majo e Th stead of porting all the 16-bit functions to 32-bit, Microsoft decided to reuse the existing 16-bit code (from the Windows 3.x operating system) by wrapping it in 32- bit code. This 32-bit code would in turn call the 16-bit functions. This was a good approach because the tried — and — true 16-bit code was already running on millions of machines all over the world. In this Win32 API implementation, most of the o t down s thunk 2 USER3 , KRNL386 o t n dow k thun 2 KERNEL3 m fro s function . GDI.EXE o t n dow s thunk 2 GDI3 d an , USER.EXE

Win32 API implementation on Windows NT On Windows NT also, the Win32 API is provided in the form of the famous trio of e don s i n implementatio s thi , However . DLLs 2 GDI3 d an , USER32 , KERNEL32 e th completely from scratch without using any existing 16-bit code, so it is purely a 32-bit implementation of Win32 API. Even 16-bit applications end up calling this . this e achiev o t g thunkin l universa s use m subsyste t 16-bi s NT' s Window . API t 2-bi 3 Chapter 3: Win32 Implementations: A Comparative Look 25

- applica t 16-bi m fro s function t 32-bi g callin f o y wa a s i thunking Universal ) chapter. s thi n i r late g thunkin n o e (Mor . tions

- appli t 16-bi t suppor o t d use e ar h whic , GDI.EXE d an , USER.EXE , KRNL386.EXE s (Window W WO e th h throug 2 GDI3 d an , USER32 , KERNEL32 o t p u k thun , cations on Windows) layer. Most of the functions provided by KERNEL32.DLL call one or e ar s service m syste e nativ e Th . work l actua e th o d o t s service m syste e nativ e mor . NTDLL.DLL d calle L DL a h throug e availabl

. 6 r Chapte n i d discusse e ar s service m syste e thes l Al

As far as USER32 and GDI32 are concerned, the implementation differs in NT m subsyste e separat a , 3.51 T N s Window r Unde . versions r late d an 1 3.5 s version implements the USER32 and GDI32 calls. The DLLs USER32 and GDI32 contain stubs, which pass the function parameters to the Win32 subsystem - ap t clien e th between n communicatio e Th . back s result e th t ge d an ) (CSRSS.EXE - fa l cal e procedur l loca e th g usin y b d achieve s i m subsyste 2 Win3 e th d an n plicatio cility provided by the NT executive.

. mechanism ) (LPC l cal e procedur l loca e th f o s detail e th s cover 8 r Chapte

Under Windows NT 4.0 and Windows 2000, the USER32 GDI32 calls the system d an 2 USER3 . WIN32K.SYS d calle r drive e devic e kernel-mod a y b d provide s service , Hence . interrupt h 2E e th g usin s service m syste e thes l cal t tha s stub n contai 2 GDI3 most of the functionality of the Win32 Subsystem process (CSRSS.EXE) is taken over 0 4. T N n i s exist l stil s proces S CSRS e Th . (WIN32K.SYS) r drive e kernel-mod e th y b . I/O e Consol g supportin y mainl o t d limite s i e rol s it , however — 0 200 s Window d an 26 Part 1: Essentials

It is interesting to note that the Win32 API completely hides NTDLL.DLL from - ulti I AP 2 Win3 e th y b d provide s function e th f o t mos , Actually . developer e th mately call one or more system services. This system service layer is very powerful - func I AP 2 Win3 t equivalen e hav t no o d t tha s function s contain s time y man d an . implicitly L DL s thi o t k lin s utilitie t Ki e Resourc T N s Window e th f o t Mos . tions Win32 Implementation Differences Now we will consider a few aspects of the Win32 API implementation on Windows - so s thi g usin m progra s developer y wa e th t affec t migh t tha 5 9 s Window d an T N called standard Win32 API. Address Space Both Windows 95 and Windows NT deal with flat, 32-bit linear addresses that give e th s a o t d referre r (hereafte B 2G r uppe e th , this f O . space s addres l virtua f o B 4G shared address space) is reserved for operating system use, and the lower 2GB (hereafter referred to as the private address space) is used by the running process. e th h Althoug . process h eac r fo t differen s i s proces h eac f o e spac s addres e privat e Th virtual addresses in the private address space of all processes is the same, they may point to a different physical page. The addresses in the shared address space of all the processes point to the same physical page. Under Windows 95/98, the operating system DLLs, such as KERNEL32, USER32, s DLL e thes T N s Window n i s wherea , space s addres d share e th n i e resid , GDI32 d an are loaded in the process's private address space. Hence, under Windows 95/98, it is . application r anothe f o g workin e th h wit e interfer o t n applicatio e on r fo e possibl For example, one application can accidentally overwrite memory areas occupied by . processes r othe e th l al f o g workin e th t affec d an s DLL e thes

Although the shared address space is protected at the page table level, a kernel-mode component (for example, a VXD) is able to write at any location . space s addres B 4G n i

In addition, under Windows 95/98, it is possible to load a dynamic link library in the shared address space. These DLLs will have the same problem described pre- viously if the DLL is used by multiple applications in the system. Windows NT loads all the system DLLs, such as KERNEL32, USER32, and GDI32, in the private address space. As a result, it is never possible for one application to ______Chapter 3: Win32 Implementations: A Comparative Look____27 f I . so o d o t g intendin t withou m syste e th n i s application r othe e th h wit e interfer - applica t tha y onl t affec l wil t i , DLLs e thes s overwrite y accidentall n applicatio e on . problems y an t withou n ru o t e continu l wil s application r Othe . tion s Window r unde e spac s addres d share e th n i d loade e ar s file d Memory-mappe n I . NT s Window n i e spac s addres e privat e th n i d loade e ar y the s wherea , 95/98 Windows 95/98, it is possible for one application to create and map a memory- mapped file, pass its address to another application, and have the other application use this address to share memory. This is not possible under Windows NT. You have d an n applicatio e on n i e fil d memory-mappe d name a p ma d an e creat y i explicitl o t open and map the memory-mapped file in another application in order to share it. e Th . hooking I AP l globa n o s impact g stron e hav s difference e spac s addres e Th topic of global API hooking has been covered many times in different articles and books. There is still no common API hooking solution for both Windows NT and s Window r unde t tha s i g hookin I AP l globa h wit m proble c basi e Th . 95/98 s Window - re s DLL m syste e th l al , Also . memory d share n i L DL a d loa o t e possibl s i t i , 95/98 side in shared memory. Hooking an API call amounts to patching the few instruc- tions at the start of function and routing them to a function in a shared DLL using a simple JMP instruction. This does not work under Windows NT because if you patch the bytes at the start of the function, they will be patched only in your ad- . space s addres e privat e th n i s reside n functio e th s a e spac s dres e mak o t e hav u yo , NT s Window r unde g hookin I AP l globa f o d kin y an o d o T sure that the hooking is performed in each of the running processes. For this, you - hook e sam e th , addition n I . processes r othe f o e spac s addres e th h wit y pla o t d nee ing also needs to be done in newly started processes. Windows NT provides a way to automatically load a particular DLL in each process through the AppInit_DLL registry key. Process Startup There are several differences in the way the process is started under Windows 95/98 and Windows NT. Although the same CreateProcess API call is used in Windows 95/98 and Windows NT, the implementation is quite different. In this chapter, we are looking only at an example of a CreateProcess API call. Ideally, both of the . world e outsid e th o t w vie e sam e th e giv d shoul s implementation s CreateProces When somebody says that a particular API call is standard, this means that given a \ specific set of parameters to a function, the function should behave exactly the same on all the implementations of this API call. In addition, the function should . error f o e typ e th n o d base s code r erro e sam e th n retur - applica n a f o t star l successfu e th g detectin s a h suc m proble e simpl a r Conside , example r (fo m proble p startu e som s ha t tha m progra a n spaw o t y tr u yo f I . tion implicitly linked DLLs are missing), it should return an appropriate error code. The - STA s a h suc e cod r erro e appropriat n a s return n implementatio 8 95/9 s Window TUS_DLL_NOT_FOUND, whereas Windows NT does not return any error. Windows 28 Part 1: Essentials

NT's implementation will return an error only if the file spawned is not present at the expected location. This happens mainly because of the way the CreateProcess a n spaw u yo n Whe . 95/98 s Window d an T N s Window r unde d implemente s i l cal process in Windows 95/98, the complete loading and startup of the process is per- formed as part of the CreateProcess call itself. That is, when the CreateProcess call returns, the spawned process is already running. . call s CreateProces e th f o n implementatio s NT' s Window e se o t g interestin s i t I Windows NT's CreateProcess calls the native system service (NtCreateProcess) to create a process object. As part of this call, NTDLL.DLL is mapped in the process's address space. Then, the CreateProcess API calls the native system service to create - load L DL d linke y implicitl e Th . (NtCreateThread) s proces e th n i d threa y primar e th ing does not happen as part of the CreateProcess API call. Instead, the primary thread of the process starts at a function in NTDLL.DLL. This function in turn loads the implicitly loaded DLLs. As a result, there is no way for the caller to know , applications I GU r fo , course f O . not r o y properl d starte s ha s proces e th r whethe . process a f o p startu e th h wit e synchroniz o t e WaitForlnputldl e us n ca u yo However, for non-GUI applications, there is no standard way to achieve this. Toolhelp Functions Win32 implementation on Windows 95/98 provides some functions that enable you to enumerate the processes running in the system, module list, and so on. These 2 CreateToolHelp3 e ar s function e Th . KERNEL32.DLL y b d provide e ar s function - im t no e ar s function e Thes . others d an , Process32Next , Process32First , Snapshot plemented under Windows NT's implementation of KERNEL32. The programs that use these functions implicitly will not start at all under Windows NT. The Windows NT 4.0 SDK comes with a new DLL called PSAPI.DLL, which provides the equivalent s Window e th h wit d include o als s i H PSAPI. s thi r fo e fil r heade e Th . functionality NT 4.0 SDK. Windows 2000 has this toolhelp functionality built into KERNEL32 .DLL.

y b y directl n functio e th s call m progra e th f i linked implicitly s i n functio A name and includes the appropriate .LIB file in the project.That is, it does not use GetProcAddress to get the address of the function.

Multitasking . multitasking e preemptiv d slice-base e tim e us T N s Window d an 5 9 s Window h Bot y largel s depend I AP 2 WIN3 e th f o n implementatio 5 9 s Window e th e becaus , However on 16-bit code, it has a few inherent drawbacks. The major one is the Winl6Mutex. Because the existing 16-bit code is not well suited for multitasking, the easiest choice ______Chapter 3: Win32 Implementations: A Comparative Look____ 29 o T . tasks e multipl m fro d entere t no s i e cod t 16-bi e th t tha e ensur o t s wa t Microsof r fo achieve this, Microsoft came up with the Winl6Mutex solution. , Winl6Mutex e th s acquire m syste g operatin e th , code t 16-bi e th g enterin e Befor and it leaves the Winl6Mutex while returning from 16-bit code. The Winl6Mutex d reduce n i s result h whic , running s i n applicatio t 16-bi a n whe d acquire s alway s i s i e cod e entir e th e becaus m proble s thi e hav t no does T N s Window . multitasking 32-bit and is well suited for time slice-based preemptive multitasking. Also, the 16- . NT s Window f o e cas e th n i e cod t 32-bi o t p u s thunk e cod t bi Thunking e vic d an t environmen t 2-bi 3 a n i n ru o t s application t 16-bi s enable g Thunkin versa. It is a way of calling a function written in one bitness from the code running m progra n ca u yo d an , processor e th f o y propert a s i s Bitnes . bitness t differen a t a d decode e ar s instruction y wa e th s decide s Bitnes . bitness e th t adjus o t r processo e th by the processor. There are two different types of thunking available:

+ Universal thunking + Generic thunking

Universal thunking enables you to call a 32-bit function from 16-bit code, . code t 32-bi m fro n functio t 16-bi a l cal o t u yo s enable g thunkin c generi s wherea Windows 95/98 supports both generic and universal thunking, but Windows NT - thunk c generi , chapter s thi n i r earlie w sa u yo s A . thunking l universa y onl s support ing is used extensively in WIN32 API implementation of Windows 95/98. For ex- t 32-bi a d an , USER.EXE t 16-bi a m fro s function s call L USER32.DL t 32-bi a , ample GDI32.DLL calls functions from a 16-bit GDI.EXE. Various issues are involved in thunking, such as converting 16:16 far pointers in 16-bit code to flat 32-bit address s bitnes e on t a g runnin e cod m fro l cal r prope a g makin r fo k stac a g manipulatin d an to code running at a different bitness. Microsoft provides tools such as thunk com- pilers to automate most of these tasks. Many vendors who write code for Windows 95/98 use generic thunking to avoid a s ha r vendo r particula a y sa , example r Fo . applications r thei f o n redesig r majo a product for Windows 3.1 and would like to port it to Windows 95. Instead of rewriting the code for Windows 95, an easier solution is to use the majority of the existing 16-bit code and use generic thunking as a way of calling this code from s Window r fo n rewritte e b o t d nee s application e thes , However . applications t 32-bi . thunking c generi t suppor t no s doe T N s Window s a T N Device Drivers s acces l ful e hav t tha m syste g operatin e th f o s component d truste e ar s driver e Devic . do n ca s driver e devic t wha n o s restriction o n e ar e Ther . hardware e entir e th o t - sys e th o t s driver e devic w ne g addin f o y wa e som s provide m syste g operatin h Eac 30 Parti: Essentials

tern. The device drivers need to be written according to the semantics imposed by the operating system. The device drivers are called virtual device drivers (VXD) in . NT s Window n i drivers device kernel-mode s a d calle e ar y the d an , 95/98 s Window Windows 95 uses LE file format for virtual device drivers, whereas Windows NT uses the PE format. As a result, the applications that use VXDs cannot be run on . driver e devic ) (kernel-mode T N s Window a o t d porte e b o t d nee y The . NT s Window

Chapter 2 explains how to write device drivers.

Microsoft has come up with a Common Driver Model in and t tha s application e th l al t por o t d nee u yo , however , point s thi t A . 2000 s Window use VXDs to Windows NT by writing an equivalent kernel-mode driver. Security The major WIN32 API implementation difference between Windows 95/98 and Windows NT is security. Windows 95/98's implementation does not have any support for security. In all the Win32 API functions that have SECURITY ATTRIBUTES as one of the parameters, Windows 95/98's implementation just ignores these parameters. s a h suc s API y Registr . programs r develope a y wa e th n o t impac e som s ha s Thi r unde , However . 95/98 s Window r unde e fin k wor y RegRestoreKe d an y RegSaveKe Windows NT, you need to do a few things before you can use these functions. In Windows NT, there is a concept of privileges. There are different kinds of privileges, such as Shutdown, Backup, and Restore. Before using a function such as RegSaveKey, e acquir o t d nee u yo , RegRestoreKey e us o T . privilege p Backu e th e acquir o t d nee u yo the Restore privilege, and to use the InitiateSystemShutdown function, you need to . privilege n Shutdow e th e acquir Under Windows 95/98, anybody can install a VXD. To install a kernel-mode de- vice driver under Windows NT, you need administrator privilege for security pur- poses. As mentioned previously, device drivers are trusted components of the operating system and have access to the entire hardware. By requiring privileges to install a , Windows NT restricts the possibility that a guest account m syste e whol e th g brin y potentiall d coul h whic , driver e devic a l instal l wil r holde down to its knees. Newly Added API Calls 1 With each version of Windows NT, new APIs are being added to the WIN32 API set. Most of these APIs do not have an equivalent API under Windows 95/98. Also, there are a few APIs, such as CreateRemoteThread, that do not have the real imple- ______Chapter 3: Win32 Implementations: A Comparative Look____ 31

- ER s return n functio s thi , 95/98 s Window r Unde . 95/98 s Window r unde n mentatio ROR_CALL_NOT_IMPLEMENTED. As a result, there will always be a few API calls that are not available on Windows 95/98 or are not implemented on Windows 95/98. At this point, one can only hope that Microsoft will implement the API in Windows 95/98 when they add a new API to Windows NT unless the API is archi- tecture dependent. Summary This chapter covered the WIN32 API implementation on Windows 95/98 and Windows NT. We discussed the differences between these two implementations with respect to address space, process startup, toolhelp functions, multitasking, thunking, device drivers, security, and newly added API calls.