digital investigation 6 (2010) 147–167

available at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/diin

Windows Mobile advanced forensics☆

C. Klaver*

Netherlands Forensic Institute, Dept. Digital Technology and Biometrics, Digital Technology Group, Postbus 24044, 2490 AA Den Haag, The Netherlands

Article info

Article history:
Received 31 December 2009
Received in revised form 9 February 2010
Accepted 10 February 2010

Keywords:
Windows mobile
NAND flash
TFAT file system
Live forensics
Heap
CEDB/EDB database
Logical/physical acquisition

Abstract

Windows CE (at this moment sold as Windows Mobile) has been on the market for more than 10 years now. In the third quarter of 2009, Microsoft reached a market share of 8.8% of the more than 41 million mobile phones shipped worldwide in that quarter. This makes it a relevant subject for the forensic community. Most commercially available forensic tools supporting Windows CE deliver logical acquisition, yielding active data only. The possibilities for physical acquisition are increasing as some tool vendors are starting to implement forms of physical acquisition. This paper introduces the forensic application of freely available tools and describes how known methods of physical acquisition can be applied to Windows CE devices. Furthermore it introduces a method to investigate isolated Windows CE database volume files for both active and deleted data.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

With Windows CE on the market for more than 10 years now, Microsoft has a market share that makes it a relevant subject for the forensic community. The first versions of Windows CE were not very successful on the hand-held electronics market. However, with the release of Windows Mobile 6, based on Windows CE 5.2 (Herrera, 2009), Microsoft gained a market share of 13.6% of the nearly 40 million mobile phones shipped worldwide in the third quarter of 2008, although this share appears to be falling in 2009 (Canalys, 2009).

Currently most commercial forensic tools that support Windows CE (WCE) acquire data from the device through the standard Remote Application Programmers Interface (RAPI). This results in the acquisition of only the active data. The capturing of deleted data is not possible using just this method. In 2005, PDA Seizure was one of the first tools that supported logical acquisition of WCE devices. Nowadays, other tools like MSAB’s .XRY and Cellebrite’s UFED support logical acquisition of WCE devices. In Ayers et al. (2005), a comprehensive overview of forensic tools for mobile devices is given.

MSAB is implementing physical acquisition of WCE devices in its tool XACT (MSAB). Cellebrite is supporting physical acquisition for Windows CE devices in their Physical-Pro version of UFED (Cellebrite). Since 2003, Hengeveld (2009) has been publishing his open source XDA tools. With this toolset, among other things, an acquisition of RAM and flash memory inside WCE devices can be done. All these tools assume a WCE device that is not locked by a handset security code. Revealing or circumventing security codes is beyond the scope of this paper, but physical acquisition methods like chip extraction, or the use of JTAG or a boot loader, work around handset security codes. More advanced protection of a smart phone would encrypt user data, imposing a new challenge to forensic examination of such a mobile device. This is also beyond the scope of this paper.

☆ The Netherlands government is authorized to reproduce and distribute reprints of this paper for governmental purposes notwithstanding any copyright notation thereon.
* Tel.: +31 (0)70 888 6423; fax: +31 (0)70 888 6559.
E-mail address: c.klaver@nfi.minjus.nl
1742-2876/$ – see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.diin.2010.02.001

This paper takes forensic examination of WCE devices beyond logical acquisition with commercial, off-the-shelf forensic tools. In section 2 relevant aspects of the typical hardware of a WCE device are described and physical locations that can contain user data are identified. Section 3 describes software components in a WCE device that are involved in storing user data or can be used in a forensic acquisition. Section 4 describes the process of acquiring a forensic duplicate of data on a WCE based device. Section 5 covers methods for performing a physical acquisition of a WCE device. Section 6 presents tools and techniques for analyzing results of a physical acquisition. Section 7 discusses results and future work is identified in section 8.

2. Typical WCE hardware

This section describes hardware elements in a typical WCE device that can be relevant for a forensic examination of such a device. Only a general overview will be given of aspects of the processor, flash memory and RAM. A description of other, more specialized hardware components falls outside the scope of this paper.

2.1. Processor

With WCE Microsoft intends to deliver an Operating System (OS) that can run on a range of hardware platforms. Currently four families of processor cores are supported: ARM, MIPS, SH4 and x86 (Microsoft 1). Of these, ARM currently is most common in consumer electronics like smart phones, PDAs and navigation devices. This paper focuses on ARM based devices.

The ARM processors used in WCE devices come from various vendors. To name some that we have come across in recent years: the Intel PXA2x0/PXA30x XScale family of processors (Intel sold their activities in this field to Marvell (Intel, 2006)), the Texas Instruments OMAP series (Texas Instruments) and the Samsung S3Cxxxx range (Samsung).

One of the interesting aspects of this whole range of processors is that nearly all peripheral devices needed to build a smart phone are integrated into one chip. These processors are also referred to as System on Chip (SoC). This means that a lot of relevant information that might be needed for physical data extraction on a device powered by such a SoC must come from the datasheets of this SoC. To be able to make a copy of the NOR flash memory in the device by using the Boundary Scan technique, it might be necessary to know the memory layout of a smart phone, that is, which chip select lines are connected to which type of memory chip. Finding datasheets however is getting harder because many SoCs are designed specifically for the mobile handset builders and distribution of datasheets is strictly controlled. Where these datasheets do not become available, the necessary information can only be gained by reverse engineering an exemplar device of the same make and model as the evidence. The latest SoCs even contain the RAM and flash dies inside one package, which makes physical extraction of memory even more challenging. Besides that, SoCs might contain special secure memory for storing data like cryptographic keys.

2.2. Flash memory

Flash memory is widely used for non volatile storage of data. There are two main types of flash memory, NOR and NAND flash (Knijff). Flash has specific properties that have forensic relevance. For instance, as data cannot be updated in place in flash memory, first the data has to be copied from flash to RAM, changed and then copied back to a different, empty location in flash. The data before the change might be available after the change through physical acquisition for quite a while (Breeuwsma et al., 2007).

2.2.1. NOR flash

This type of flash memory has a RAM-like interface; it has a data bus, an address bus and control lines. NOR flash is mapped in the processor’s memory map and processor code can be executed directly from it (this is called ‘execute in place’; XIP). NOR flash can also be used as storage location for user data. Many older WCE devices have a single folder in the root directory that is mapped to a section in NOR flash. With a special driver, like Intel’s Persistent Storage Manager (Intel, 2005), the part of the NOR flash memory that is not used for code can be used for user data. In a forensic investigation, this folder should not be overlooked. This folder is for example very suitable for storing system backups and because it resides in flash, deleted data can persist. When a device with a completely drained battery makes a full system reset, this folder might still contain a recent backup of all user data.

2.2.2. NAND flash

NAND flash can be regarded as the solid state equivalent of a hard disk. It has an interface with an I/O bus and control lines connecting the memory chip to the processor. Over this I/O bus, commands, addresses and data are sent. As NAND flash memory is not mapped in the memory space of the processor, code stored in a NAND flash chip cannot be executed directly, but has to be loaded into RAM first, again much like a hard disk.

After reset, boot loader code is loaded into RAM through some mechanism that is dependent on the type of flash memory used. Some flash memory types are capable of presenting a boot block of flash memory through a NOR flash interface, allowing the processor to boot from this block. The code in this block will contain instructions to access the rest of the blocks, in order to load the OS into RAM. Typical behavior of WCE smart phones is that after the OS is loaded, it will detect whether it is a cold reset. In that case, it will install customization .cab files from the customization flash partition, often TFAT. After these files are installed, the device is rebooted and it is ready for use.

Flash memory must be erased to all 1s before reuse and flash can only be erased in fairly large blocks, typically between 128 and 512 kB. An erase block that mainly contains expired pages can be made fully expired by copying away the last few active pages, after which it can be erased. This means that inactive data will be wiped beyond recovery by the system itself, even in ‘quiescent’ state. Flash memory is worn out by erasing. To minimize this effect, flash manufacturers supply so called Wear Leveling algorithms with their hardware. Wear leveling is the process in a flash memory manager that takes care of spreading erase actions evenly across the whole flash memory range.
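The out-of-place update and block erase behaviour described in section 2.2 can be made concrete with a toy model. The following Python sketch is illustrative only: the class `TinyFlash`, its method names and the tiny geometry (2 erase blocks of 4 pages) are assumptions made for this example and do not correspond to any real flash controller or Flash Translation Layer. It shows that an updated page leaves its old content readable in a physical dump until the erase block holding it is reclaimed.

```python
ERASED = None  # stands in for an erased page (all 1s in real flash)


class TinyFlash:
    """Toy model of flash with out-of-place updates (hypothetical, for illustration)."""

    def __init__(self, blocks=2, pages_per_block=4):
        self.pages_per_block = pages_per_block
        self.pages = [ERASED] * (blocks * pages_per_block)  # physical page contents
        self.expired = set()   # physical pages that hold superseded data
        self.mapping = {}      # logical page number -> physical page number

    def _free_page(self, exclude_block=None):
        # Return the first erased physical page outside the excluded block.
        for phys, content in enumerate(self.pages):
            if content is ERASED and phys // self.pages_per_block != exclude_block:
                return phys
        raise RuntimeError("no erased page available")

    def write(self, logical, data, exclude_block=None):
        # Data is never updated in place: it goes to a fresh page and the
        # previous physical page is only marked as expired.
        phys = self._free_page(exclude_block)
        self.pages[phys] = data
        old = self.mapping.get(logical)
        if old is not None:
            self.expired.add(old)
        self.mapping[logical] = phys

    def read(self, logical):
        # Logical view: what the file system (and a logical acquisition) sees.
        return self.pages[self.mapping[logical]]

    def reclaim(self, block):
        # Copy away the remaining active pages, then erase the whole block.
        first = block * self.pages_per_block
        span = range(first, first + self.pages_per_block)
        for logical, phys in list(self.mapping.items()):
            if phys in span:
                self.write(logical, self.pages[phys], exclude_block=block)
        for phys in span:
            self.pages[phys] = ERASED
            self.expired.discard(phys)

    def physical_dump(self):
        # Physical view: what a chip-level acquisition would return.
        return list(self.pages)
```

In this model, writing the same logical page twice leaves both versions in `physical_dump()` while `read()` only returns the newest one. Calling `reclaim()` on the block that holds the expired page removes the old version for good, which mirrors why a running device can destroy deleted data without any user action.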

2.3. RAM

RAM in WCE devices can contain various types of data that can have forensic relevance. As modern devices can have hundreds of MB of RAM, it is essential to know where to look for relevant data. In WCE versions prior to 6.0 a process has a virtual¹ address space of 32 MB (Boling, 2002). Fig. 1 shows a simplified diagram of how various items are located within the 32 MB address space. From the bottom up, first the code of the process itself is loaded. From the highest address down, dlls needed by the process are loaded. In between are the stack and the heap for the process. The stack and the heap are the locations where variables are stored during the lifetime of the process. From WCE 6.0 on, memory management has changed quite drastically. For instance the addressable Virtual Memory space for a process is no longer restricted to 32 MB (Microsoft 2).

Fig. 1 – Simplified memory layout of a WCE process. [Diagram, from address 0x0000-0000 upward: code, heap, stack, free space, and RAM based dlls at the top of the address space (0x01FF-FFFF).]

2.4. Other

Besides NAND and RAM, a WCE device might have additional memory for special purposes. Inside the processor, for instance, special registers can hold cryptographic data. A register holding a unique number might function as a master key for encryption of data. For instance, the Texas Instruments (2009) OMAP35x processors have a 128 bit CONTROL_DIE_ID at address 0x4830A218.

3. Typical WCE software components

This section describes software components in a WCE device that are involved in storing user data or can be used in a forensic context. An important notion when looking at WCE devices is the difference between the kernel that the WCE device is based on, and the version name of the retail OS (Herrera). For instance, all Windows Mobile 6 versions are based on WCE 5.2.

3.1. Bootloader

In some WCE devices the bootloader can be used as a tool to get a physical image of memory in a WCE device. Some bootloaders already have capabilities for this, though sometimes barred by some security mechanism like a password or insertion of a special memory card. Other devices would need an adapted bootloader to provide the needed functionality for creating a physical image.

A reset on an ARM platform forces the processor to execute the code at the reset vector. The reset vector is the (physical) address from where the first instruction is fetched. For ARM this address generally is 0x0000:0000. At the reset vector there is normally an unconditional jump to the address where the actual bootloader code is located. Bootloaders sometimes have the functionality to copy various types of memory from the device to external media, but the functionality is not always accessible. Because it could facilitate SIM unlocking or other forms of hacking, handset manufacturers make it difficult to access this functionality.

If a bootloader has accessible functionality to read memory, then this is a very safe and fast way of obtaining memory copies. If the bootloader functionality to read memory is not accessible, one can replace the bootloader (risky), patch the bootloader (risky), or find out why the functionality is not accessible (for instance a password block) and find a way around this. All of these steps are time consuming and labor intensive.

When applying this method with the objective to search for deleted data, one must be sure to avoid booting into the OS. Booting might enable the OS to reorganize flash memory, erasing deleted data beyond recovery.

3.2. Heap

A heap is a portion of memory reserved for an application to use to allocate and free memory on a per-byte basis (Microsoft 3). The heap holds variables that are created with OS functions like ‘malloc()’². Functions like this return a pointer to the memory chunk offered by the OS, if the requested amount of memory is available.

Investigating the heap of a process can yield very interesting data. Often buffers for various purposes are located on the heap. Think for instance of a buffer for receiving data from other devices like NMEA data from an external GPS receiver, a buffer to hold text that is to be printed on the screen or a scratch pad for email composition.

Once a process no longer needs the memory it has allocated, it will (when well programmed) return the memory to the OS by calling an OS function like ‘free()’, with a pointer to the memory it wants to return to the OS.

On the heap itself however, the data is not changed just because of freeing its location to the OS. The only thing that happens is that the memory region is made available to the OS

again. In the heap the status of a memory block (active or free) can be detected.

The software managing the heap will try to keep the heap as clean as possible. When N bytes are requested through a call to ‘malloc(N)’, the heap manager will try to find a free block (or contiguous free blocks) capable of holding at least these N bytes. It might try to merge small free blocks into a bigger free block by rearranging the heap. This is comparable to the defragmentation process that can be applied to a hard disk. How well the heap manager succeeds in fitting requested blocks in free blocks and how well it defragments will influence the lifetime of data in ‘free’ blocks. In any case it is possible to find data on the heap, either active or deleted, that is otherwise not available to the user.

¹ See www.windowsfordevices.com/c/a/Windows-For-Devices-Articles/What-is-virtual-memory/ for a brief introduction of virtual memory.
² The ‘c’ version of this function is used. Other high level languages might use the heap in other ways. This example is only illustrative.

3.3. File systems

Most modern WCE devices are equipped with flash memory hosting (T)FAT partitions for user data or firmware extensions, and binary partitions with firmware and bootloader code (Rogers et al., 2005), see Fig. 2. File systems are usually not stored in NAND flash directly. The OS interfaces with a so called Flash Translation Layer (FTL), which takes care of storing File System blocks in NAND flash (Knijff, 2010, p. 390).

Fig. 2 – Flash memory in WCE devices. [Diagram: a TFAT user data file system and a TFAT custom file system on top of the Flash Translation Layer (FTL), next to the firmware and boot loader partitions.]

When analyzing the storage devices at file system level on a WCE device, both binary partitions and File System partitions can be found. Under normal use, the only partition interesting for forensic analysis is the partition containing the user file system. This partition usually contains a FAT or a TFAT file system. TFAT is a Transaction Safe variant of FAT (Microsoft 4). As TFAT is transaction safe, sudden power loss, or other interruptions of changes to the file system, will not lead to a corrupt file system.

When looking at a TFAT file system image with a forensic tool like EnCase or FTK, one can notice that the root directory the user sees on the WCE device itself is often not the root directory of the file system. WCE can create a directory called ‘__TFAT_HIDDEN_ROOT_DIR__’ and inside this directory all files and directories are stored that are seen by the user (Microsoft 5). This means that the call

    CreateFile("\temp\myfile.dat")

resolves to

    CreateFile("__TFAT_HIDDEN_ROOT_DIR__\temp\myfile.dat")

Another noticeable artifact is the presence of many (deleted) file entries called ‘DONT_DELnnn’, where nnn is a number starting at 000 and counting up. The reason for the presence of these files is not exactly clear at this moment. They might be there to make sure that flash erase blocks that contain parts of File Allocation Tables only contain FATs, and not parts of regular files. If regular files share erase blocks with FATs, changes to such a file will lead to copying that file and possible reallocation of the FATs to other erase blocks, possibly causing performance loss of the file system.

The user on a WCE device doesn’t see any of these files or directories, because the file system drivers hide them from the user. When analyzing a WCE TFAT file system image with forensic tools, one can safely ignore the ‘__TFAT_HIDDEN_ROOT_DIR__’ and the ‘DONT_DELnnn’ entries.

3.4. Databases

In WCE versions earlier than 4.0, all user data was stored in the so called ‘object store’. The object store is a database containing the file system, the databases and the registry. The object store lived in RAM; when power failed, all user data was lost. From WCE 4.0 on, the roles are reversed; the file system is now hosting the databases and the registry files. In devices where this file system is based on flash memory, user data is less dependent on battery life.

Flash based file systems also allow for easier imaging of the file system, compared to RAM based storage. After a flash based file system image has been created from the WCE device, the databases containing user data can be extracted from the image. This can be done with normal forensic tools supporting TFAT; as TFAT is compatible with FAT, most tools will load TFAT images without problems. Once loaded, the two most interesting databases are cemail.vol and pim.vol, both located in the root directory of the file system, as seen by the user.

4. WCE forensic investigation

When found during a criminal investigation, a WCE device has to be treated just like any other mobile phone. Mostly, the first goal is to avoid any further changes to the phone as much as possible. Phone data can be changed for instance by incoming calls, received text messages, connecting to WiFi/Bluetooth networks, recorded GPS data and depleted batteries. In order to avoid these changes, the phone should be isolated from the GSM and other networks and from reception of GPS signals, and be powered by an external power supply. A discussion on this subject can be found in Jansen et al. (2007), chapter 5.3 and 6.

Another cause of changes in phone data lies in the phone itself. While on, either in active or quiescent state, the phone’s OS is active. The OS might be trying to manage the various types of memory in the phone. The flash file system might rearrange flash pages and erase flash blocks that only contain expired pages. The heap manager might be trying to rearrange the heap structure to join small free items into bigger ones. To stop these processes, the phone has to be powered down, but this is not always wanted, for instance because it might activate the handset security code, hindering logical acquisition of the phone, or activate memory rearranging or garbage collection.

Once connection to networks is properly prevented, data on the WCE device can be acquired. As mentioned already,

two types of acquisition can be distinguished, physical acquisition and logical acquisition. Depending on the investigation, it has to be decided which of the two has to be done first, because each acquisition type has its own advantages and disadvantages.

Physical acquisition can be a destructive operation. True physical acquisition can mean physically removing memory from the device, using hardware techniques like JTAG to extract data from the device, or using an (adapted) bootloader to gain low level access to the device. Most of the physical data extraction methods hold some risk of destroying data, the device or both.

For WCE devices there are ways to do an acquisition that is somewhere in between a physical acquisition and a logical acquisition. A copy can be made of the flash file system over an ActiveSync connection. This requires a dedicated dll to be loaded into the system under investigation, thus overwriting RAM and possibly flash memory. The result however is an image at file system level and not at flash hardware level. Because of this, only unallocated clusters that reside in active flash pages are in the image. Expired flash blocks that are no longer part of the file system but still might contain data will not be copied. In this paper, it is referred to as pseudo physical acquisition.

Logical acquisition is generally safer for active data. It does not have the risks of losing all active data because of risks involved in physical extraction. However, setting up an ActiveSync connection to do a logical acquisition can change data related to the ActiveSync connection itself. Another downside is that during logical acquisition deleted data that still resides in the system might be erased beyond recovery. Because logical acquisition uses the system that is being investigated, the processes in the phone that are used during acquisition are using memory, RAM and possibly flash. Another cause of permanent loss of deleted data is active Wear Leveling and Garbage Collection in a working system. Garbage Collection is a phenomenon that occurs in RAM, where blocks of data that are no longer referred to by pointers are freed by the OS and made available for reuse.

Sometimes logical acquisition is not possible, for instance when the device is broken beyond repair, or when the device does not have a standard interface to do the logical acquisition over.

In cases where active data might be enough for the investigation, doing a logical acquisition and a pseudo physical acquisition on a WCE device before doing a physical acquisition is the safest way to go. The risk of changing or destroying some deleted data due to logical acquisition is then regarded less than the risk of losing all data in a physical acquisition. Table 1 shows relative risks for active and deleted data in physical and logical acquisition. When executed by an experienced investigator, and when a reference model is available, the absolute risks of physical extraction are acceptable.

Table 1 – Relative risks for data during logical and physical acquisition.

Risks          Physical acquisition                                                   Logical acquisition
               Chip extraction    JTAG                         Bootloader
               (damaged chip)     (damaged PCB, ‘bricking’)    (‘bricking’)
Active data    High               High                         High                   Low
Deleted data   High               High                         High                   –

Physical acquisition might be the only option in cases where there is an active phone lock or a non functioning phone. Physical acquisition methods generally work at a low level and are not hindered by the phone lock. The NAND flash of a broken phone might still be working, allowing physical chip extraction.

There might be other situations where it is necessary to first do a physical acquisition; when there is strong indication that the essential evidence is in deleted data, the risk of overwriting this evidence by switching on the phone and doing a logical acquisition might be considered higher than the risk of losing the evidence through a failed physical acquisition.

5. Physical acquisition

In this section, several methods for getting a forensic image from a WCE device are described. The methods are described in order of forensic soundness. One has to realize that the success of the described methods greatly depends on the experience the investigator has with applying these methods. Incorrectly applying these methods may destroy the WCE device, the data in it, or both.

5.1. Physical chip extraction

In a WCE device, the investigation of the file system residing in flash memory is best done by accessing the flash memory directly. This method ensures that the OS does not interfere with the data in memory. However, this type of acquisition might not be feasible due to lack of necessary equipment. Section III-C of Breeuwsma et al. (2007) describes how to remove a BGA memory chip from a PCB and subsequently read the content of the memory device. Desoldering the flash memory chip from a WCE device might be an option in the following cases:

- Every risk of losing deleted data has to be eliminated
- The device is not working anymore
- No (known) possibility for access through JTAG

This method has some downsides:

- TSOP/BGA rework equipment is required
- Memory reader equipment is required
- The memory reader tool might not support the target chip
- The datasheet of chip equipment might not be available

5.1.1. Case example

In an investigation Police seized a Fiat 500 equipped with a Blue&Me multi media set. Blue&Me is an “in-car

communication system”, based on WCE for Automotive (BusinessWeek, 2006; Microsoft 6). The investigation required the examination of the content of the device, as it could contain information on handsets paired to the B&M unit, SMS messages received with the handset or MP3 files played with it. As time was limited it was decided not to look at RAM data and only focus on flash memory. Three options for accessing the flash data were identified:

- Acquisition through the USB port on the board
- Acquisition through JTAG
- Desoldering the flash chip

From a Fiat dealer several scrap units were obtained as exemplar devices. On these it was established that the flash chips were of a well known type (Samsung K9F5608U0D-JIBO) and because of component placement, it was rather easy to do Physical Extraction as described in Breeuwsma et al. (2007). Furthermore, no information could be obtained in reasonable time on how to get access through USB, nor on the JTAG Test Access Points (TAPs) in the device. This led to the decision that desoldering and subsequently reading the memory chip with the NFI Memory Toolkit was the quickest and most sound way to obtain a copy of the flash chip.

This procedure was first tested on the exemplar. The file system of the exemplar was reconstructed from the memory copy and it appeared to contain:

- Bluetooth MAC addresses from devices connected to the B&M set
- Full pathnames of MP3 files played
- Contact lists from paired phones
- Call history

Then the exhibit was processed the same way. It appeared that none of the phones found earlier in the investigation had been paired to the Blue&Me kit, so no further investigation was necessary. As usual, new knowledge produces new questions: is the above list complete? Probably not. For example, the device is able to read out loud incoming SMS messages, which indicates that received SMS messages will probably be stored in the B&M unit.

5.2. JTAG

In Breeuwsma (2006), a method is described on how to find and use JTAG Test Access Points to obtain copies of memory in JTAG enabled digital devices. In that paper, a WCE device, the HP iPaq h1930, is investigated. It was shown that it is possible to access SDRAM and flash memory in this device.

5.2.1. Case example

In a case we received an HP Hx2790, of which we wanted to acquire an image of the internal flash memory, an M-Systems³ DiskOnChip (DoC) G3 type MD4331-d1 G-V3Q18X.⁴ After having identified the JTAG pins on the device, and the configuration of the JTAG chain, we searched for tools that would be able to read the DoC. We found a tool set from a Dutch company called JTAG Technologies. Their tool set provides a mechanism to let M-Systems’ own utility ‘dimage’, designed to make an image of a DoC hosted by a PC, communicate with a DoC in another device through the JTAG protocol, as if the DoC is on the same PC as the dimage utility itself (JTAG). In this setup a file system image of the DoC in a WCE device can be made. The Flash Translation Layer is in the tffs⁵ library. It is crucial that the right version of the tffs software is used (Fig. 3). Details of the flash translation apparently change even between minor versions. A 6.3 tffs dll is not capable of reading a 6.2 formatted DoC. As this method offers a file system level image, expired pages within the DoC are not found in the image, although these pages might contain relevant information.

Fig. 3 – Making a file system image of an M-Systems DoC through JTAG Technologies’ tools. [Diagram. Normal situation for ‘dimage’: ‘dimage’ and Tffs.dll access a DoC in the PC hardware. JTAG setup: ‘dimage’ and Tffs.dll on the PC hardware connect through JTAG to the DoC in the WCE hardware.]

The result is shown in Fig. 4. The first 0x30 bytes show data indicating this is a dump from an M-Systems device. We also recognize the TFFS version 6.2.20 in this part. Then at offset 0x10C0 a Master Boot Record can be recognized. At offset 0x12C0 the boot sector of a TFAT16 file system is recognized. Loading the image in EnCase is still problematic; this is currently being researched further.

5.3. Bootloader

An example of bootloaders that have been reverse engineered by people at xda-developers.com is the HTC Hermes bootloader (XDA Developer, 2008). Another example is a process where the bootloader is replaced by one with capabilities to copy the TFAT file system (XDA Developer, 2007). The author claims that the command ‘fat2sd 3’ will copy the internal NAND based file system to SD at file system level. Supposedly the flash translation is being executed by the bootloader, judging by output lines like ‘Nand2SDReorder start.’. The output presented on this site looks like a valid Windows CE TFAT root directory, including cemail.vol and pim.vol files.

More research is needed to explore the possibilities of these techniques on recent WCE devices.

³ M-Systems was acquired by SanDisk in 2006, see www.sandisk.com/about-sandisk/press-room/press-releases/2006/2006-11-19-sandisk-completes-acquisition-of-msystems.
⁴ M-Systems announced this chip End Of Life (EOL) in October 2005, see www.sandisk.com.tw/Assets/File/OEM/Manuals/eol/mdoc/EOL-DOC-0505.pdf, but the chip is still found in older devices.
⁵ TrueFFS (tffs) is the flash file system developed by M-Systems.

Fig. 4 – Three sections of the image of an M-Systems DiskOnChip from an HP HX2790.
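The layout read from Fig. 4 (a Master Boot Record at offset 0x10C0 and a boot sector at 0x12C0, which is 0x200 bytes or one 512-byte sector later) suggests a simple way to locate such structures in an image whose file system does not start at offset 0. The Python sketch below is a minimal illustration and not the method used in the case. The function names are hypothetical, and the checks rely only on generic MBR and FAT boot sector conventions: the 0x55 0xAA signature at byte 510, partition entries starting at byte 446, a jump opcode at byte 0 and the bytes-per-sector field at byte 11. TFAT-specific fields are not inspected.

```python
import struct

SECTOR = 512
SIGNATURE = b"\x55\xaa"  # boot signature stored at bytes 510-511 of the sector


def classify_sector(sector):
    """Return 'fat-boot-sector', 'mbr' or None for one 512-byte candidate."""
    if len(sector) != SECTOR or sector[510:512] != SIGNATURE:
        return None
    # A FAT boot sector starts with a jump opcode and stores the
    # bytes-per-sector value as a little-endian 16-bit field at offset 11.
    bytes_per_sector = struct.unpack_from("<H", sector, 11)[0]
    if sector[0] in (0xEB, 0xE9) and bytes_per_sector in (512, 1024, 2048, 4096):
        return "fat-boot-sector"
    # An MBR holds four 16-byte partition entries from offset 446; the first
    # byte of each entry is a status flag, the fifth byte a partition type.
    entries = [sector[446 + 16 * i:462 + 16 * i] for i in range(4)]
    statuses_ok = all(entry[0] in (0x00, 0x80) for entry in entries)
    has_partition = any(entry[4] != 0 for entry in entries)
    if statuses_ok and has_partition:
        return "mbr"
    return None


def find_boot_structures(image):
    """Scan an image at every byte offset and list (offset, kind) hits."""
    hits = []
    pos = image.find(SIGNATURE)
    while pos != -1:
        start = pos - 510  # the signature closes the candidate sector
        if start >= 0:
            kind = classify_sector(image[start:start + SECTOR])
            if kind:
                hits.append((start, kind))
        pos = image.find(SIGNATURE, pos + 1)
    return hits
```

The scan deliberately does not assume sector alignment, because the offsets in Fig. 4 are not multiples of 512 from the start of the dump. On real dumps false positives are to be expected, so every hit still needs manual verification.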

5.4. Pseudo physical acquisition

There are several tools available for doing pseudo physical acquisition. In this paper, the focus is on RAPI tools. XACT is well documented and not dealt with much deeper here. The RAPI tools are not specifically designed for forensic acquisition, so the use of these tools in a forensic context requires special care.

5.4.1. XACT

As of version 3.3 XACT supports the acquisition of the WCE file system, but for this it needs to load an ‘agent’ onto the device under investigation. The user has the option to store the agent on an external storage card, avoiding unnecessary changes to the device file system. Then the agent needs to be loaded into RAM to be able to be called by the ActiveSync server, thus overwriting unallocated RAM.

The result of an acquisition with XACT is a file system level copy of the device.

5.4.2. RAPI tools

Another set of tools that can be used to obtain images from a live WCE device are the so called RAPI tools, developed by Hengeveld (2009). This toolset is a collection of some 30 command line programs which can be executed on a PC and that operate on the WCE device over an ActiveSync connection. All commands communicate with the RAPI server which is running on the WCE device. Some tools only use the native API that the RAPI server provides, other tools need to have more advanced access and these use a helper library called ‘itsutils.dll’. This library will be copied onto the WCE device and loaded into memory by the RAPI server process. The tool can then access specialized functions in the helper library. Fig. 5 shows this; the RAPI server interacts with the WCE device directly through the API functions (dotted arrow), and through the helper dll (dashed arrow).

Fig. 5 – Software architecture of RAPI tools. [Diagram: on the PC, a command line shell and ActiveSync; on the Windows CE device, the RAPI server and itsutils.dll.]

In early versions of the RAPI tools, the dll was always copied into the directory \Windows. As of RAPI tools version 080731, the location on the WCE device where the helper library is copied to can be changed by adding a key to the PC’s registry:

    HKEY_CURRENT_USER\software\itsutils
    devicedllpath = "\Storage Card\itsutils.dll"

Also, itsutils.dll can write messages to a log file. By default, logging is off and when on, the log file is written to root. Adding another key will set the log file destination and switch logging on or off:

the user has to give permission on the screen. Furthermore, a WCE can be configured so that there are restrictions on the execution of code through RAPI calls. To change restricting policies, one value in a registry key has to be changed. For this several options are available. One option is to use the rapi tool ‘prapi’ with the command line option –p 4097 1. This will set the registry key 0x1001 (4097d) in [HKLM\Security\Policies\- Policies] to 1. (Hengeveld). Some devices do not allow this key to be set through the RAPI. Then using a registry editor on the WCE device itself could be used to manipulate this key. If one doesn’t want to install a full blown registry editor, a small command line program could be created that just opens the registry key ‘‘Security\Policies\ Policies" in HKLM, by calling ‘‘RegCreateKeyEx’’ and subsequently set registry value 0x1001–1 by calling ‘‘RegSetValueEx’’. This program could then be loaded from an SD memory card, minimizing changes to RAM usage. Whichever method is chosen, some data on the target device will be changed. This might be violating the rule that Fig. 6 – Screen capture of WCE 5.2 Task Manager on an HTC ‘‘No actions performed by investigators should change data Blackstone100. contained on digital devices or storage media that may subsequently be relied upon in court’’ (ACPO). But as there often is no feasible alternative, the evidentiary implications of the changes should be evaluated first (maybe data related to HKEY_CURRENT_USER\Software\itsutils the ActiveSync connection is not relied upon in court) and devicelogpath [ "\Storage Card\itsutils.log" only after accepting the implications, the method can be logtype [ dword:00000002 applied. Log type has the following meaning: 0:no logging, 1:ker- The following sections discuss some useful RAPI tools. nellog, 2:file. The above allows for copying ‘itsutils.dll’ and writing the 5.4.2.1. pps. 
With the pps tool, all processes in the WCE log file to a memory card instead of the internal flash memory device can be listed. This is particularly interesting because of the WCE device, thus avoid overwriting unallocated flash the native WCE Task manager does not show all processes. As pages in the WCE device. shown in Figs. 6 and 7, pps shows a complete list of all Any non signed executable on a WCE device will only run processes running on the WCE device, whereas Task Manager after permission by the user. So for the helper dll to be loaded, only shows a few.

Fig. 7 – pps executed on same device.

Fig. 8 – Making a copy of the working memory of the tmail.exe process on a WCE device.

5.4.2.2. pmemdump. With the exact names of the processes in the listing in Fig. 7, a dump of the working memory of a process can be made. The tool has several options for this. The most straightforward way is to make a complete dump. In WCE versions before 6.0, each process has a process space of 32 MB. Not all of this space is actually backed by physical memory, but one can make a 32 MB dump of each of the processes in the pps list. Fig. 8 shows how a copy is made of the RAM used by the process tmail.exe. The copy is 32 MB in size and stored in a file tmail.exe.bin. One of the interesting parts inside a dump like this is the heap. In Section 6.2, it will be shown how to find the heap inside a memory dump and how to analyze the heap.

5.4.2.3. pdocread. pdocread can be used to make a copy of partitions on storage devices inside a WCE device. Originally it was aimed at copying the M-Systems DiskOnChip managed NAND flash chips that were found in many smart phones. The program grew into a versatile tool to make copies of managed NAND flash of manufacturers like Samsung and Qualcomm.

The first step is to find out what partitions are present on the WCE device. This can be done by running ‘pdocread -l’. In Fig. 9 this command is given to an HTC S730.

In Fig. 9 one can see that there are 4 partitions in this particular device; these are the partitions pointed out in Section 3.3. We are mainly interested in the partition containing user data. The file system of the partition will be TFAT or FAT. As the file system type is stored in the boot sector of the partition, let’s look at the first 0x200 bytes of each partition. The easiest way is to use the handle references #0 through #3, listed in the four rows right below ‘STRG handles’.

In Fig. 10 we see three attempts to read the first page. The first attempt fails because in this case the tool tries to read a DiskOnChip memory, which apparently is not present on this WCE device. The second attempt also fails; with the ‘-w’ option, the tool now uses the generic Windows file system API and not the DoC API, but the block size still isn’t specified correctly. The third attempt succeeds; here the block size is specified as 0x800 bytes, which is the correct value here and a very common value in many WCE devices. In these first 0x200 bytes, a regular boot sector can be found. Notice that the file system type here is TFAT.

In the listing of the partitions in Fig. 9, we can see that the partition under handle #0 has a size of 133.00 MB, which is 0x8500000 bytes. This size will be used to make a full copy of this partition.

In Fig. 11 the next three partitions are checked. In none of these is a regular file system recognized, so they will be left alone.

In Fig. 12 finally a full image of the partition under handle #0 is made. The output is sent to the file htc_wing220_h0.bin. In Fig. 13 the dump is shown. Here we have a TFAT32 partition, with a sector size of 0x800 bytes.

6. Forensic analysis of the physical image

In this section we are going to analyze the images and dumps from various sources that we have found in the previous section.
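As a first check on a partition image such as the one made with pdocread in Section 5.4.2.3, the boot sector fields mentioned there (sector size and file system type) can be read with a few lines of Python. This is a minimal sketch, not part of the RAPI tools: the function name `describe_boot_sector` is hypothetical, and it assumes the standard FAT boot sector layout (bytes per sector at offset 0x0B, type label at 0x36 for FAT12/16 or 0x52 for FAT32), which TFAT is assumed to share.

```python
import struct

def describe_boot_sector(sector: bytes) -> dict:
    """Decode a few standard FAT boot sector fields from the first 0x200 bytes."""
    if len(sector) < 0x200:
        raise ValueError("need at least 0x200 bytes")
    # Bytes per sector is a little-endian 16-bit value at offset 0x0B.
    bytes_per_sector = struct.unpack_from("<H", sector, 0x0B)[0]
    # Assumption: the type label sits at 0x36 (FAT12/16) or 0x52 (FAT32).
    label16 = sector[0x36:0x3E].decode("ascii", "replace").strip("\x00 ")
    label32 = sector[0x52:0x5A].decode("ascii", "replace").strip("\x00 ")
    return {
        "signature_ok": sector[0x1FE:0x200] == b"\x55\xaa",
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sector[0x0D],
        "fs_type": label32 if "FAT" in label32 else label16,
    }
```

Applied to the first 0x200 bytes of an image file, a sector size of 0x800 and a label containing ‘TFAT’ would be consistent with what Figs. 10 and 13 show. The partition size quoted in the text can be checked the same way: 133 × 1024 × 1024 equals 0x8500000.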

Fig. 9 – Output of pdocread, listing all partitions in a WCE device.

Fig. 10 – Output of pdocread, trying to dump the first 0x200 bytes of partition #0.

Fig. 11 – Output of pdocread, dumping the first 0x200 bytes of partition #1 through #3.

Fig. 12 – Making a full image of partition #0.

6.1. Flash

6.1.1. Reconstructing the file system

Breeuwsma et al. (2007), section IV-A, describes how to reconstruct the file system from a physical acquisition of a NAND flash chip. This principle is also applicable to WCE devices. In the WCE devices that we have come across, the file system could be reconstructed rather easily. Data in NAND flash is organized in pages. We have come across page sizes of 0x210 and 0x840 bytes. When the page size is 0x210, usually the last 0x10 bytes form the spare area. In the spare area, bytes 0–3 indicate the Logical Block Number (LBN), and byte 6 indicates the state of the page. The state can be either free (0xff), busy (0xf9) or expired (0xf8). With the following pseudo code the pages can be reordered to form a valid TFAT image.

- While possible:
  o Get the file offset
  o Read 0x210 bytes
  o State = byte at 0x206
  o LBN = bytes at 0x200 through 0x203
  o If state is 0xf9, store (file offset, LBN) → active pages list
  o If state is 0xf8, store (file offset, LBN) → expired pages list
  o If state is 0xff, store (file offset, LBN) → free pages list
- Sort active pages list by LBN
- While pages in active pages list:
  o Goto offset of page
  o Read 0x200 bytes
  o Gap = LBN − previous LBN
  o If gap > 1:
      While gap > 1:
        Write empty page to output
        Decrease gap by 1
  o Write 0x200 bytes to output
- While pages in expired pages list:
  o Goto offset of page
  o Read 0x200 bytes
  o Write 0x200 bytes to output

When the file system is reconstructed this way, it can be loaded into tools like EnCase or FTK.

6.1.2. Where can active data be found

The typical WCE file system and registry are very similar to the Windows desktop equivalents. With tools like EnCase a file system image can be investigated. One of the challenges is that much less is known about forensically interesting artifacts of applications on WCE devices than about artifacts on desktop OSes. A lot of research has still to be done to fill this knowledge gap.

Besides file system and registry, there is generally a set of databases for Messaging and Personal Information Management (PIM) data on a WCE device. These databases are grouped into two volume files, cemail.vol for messaging related databases and pim.vol for PIM related data. The cemail.vol volume is in CEDB format; pim.vol is in the newer EDB format (Microsoft 7). Both formats are proprietary and little formal documentation can be found on the internals. WCE databases can be decoded through the following steps:

- Extract cemail.vol from a file system image, for instance with EnCase
- Use ‘cedb400.dll’ to open cemail.vol and read all items, see 6.1.2.1
- Use the EDB API on a Device Emulator to read pim.vol, see 6.1.2.2

6.1.2.1. cemail.vol. When a cemail.vol database is extracted from a WCE device image, the database can be read ‘in isolation’ by using a library that comes with the Windows CE development environment called Platform Builder (PB). With PB come a number of tools and utilities (Microsoft 8) that can be useful for forensic purposes. For instance, when PB is installed on a PC, the library ‘cedb400.dll’ can be found in the folder ‘WINCE520\PUBLIC\COMMON\OAK\BIN\I386’. This dll contains all functions needed to read a CEDB database. The API of ‘cedb400.dll’ is shown in Table 2.

With the following pseudo code, all data can be read from a CEDB volume file.

- CeMountDBVol(file name)
- CeFindFirstDatabaseEx
- While database = CeFindNextDatabaseEx
  o CeOidGetInfoEx(database)
  o CeOpenDatabaseEx(database) 6
  o While record = CeReadRecordPropsEx(database)
      While property in record
        Print property
        Update MD5
      Print MD5 of record

A tool called xpdumpcedb.exe was written following this algorithm. The tool reads a cemail.vol (or any other CEDB formatted volume file) and produces an XML file containing all active data in the volume file. Fig. 14 shows a part of the output when processing a sample cemail.vol file. The table shown is the Inbox from the SMS root folder.

In the appendix, Table 5 lists some of the databases found in the cemail.vol file. The meaning of some fields in cemail.vol can be found in header files in the Platform Builder directory; for instance, property IDs from 0x3000 to 0x3FFF are defined in WINCE500\PUBLIC\IE\SDK\INC\wabtags.h and WINCE500\PUBLIC\DATASYNC\SDK\INC\addrmapi.h. The meaning of some of the other fields in these databases has been identified by reverse engineering. There are still other fields of which the meaning has not yet been determined. Furthermore, unlike in a regular database, each record within a table can have a different number of fields, which makes it hard to determine when a database has been fully understood while reverse engineering this volume.

6.1.2.2. pim.vol. A PC targeted equivalent of ‘cedb400.dll’ for EDB formatted volume files has not been found. An alternative to reading an extracted EDB formatted volume file like pim.vol in isolation is running a tool like xpdumpcedb on a WCE device or, preferably, a WCE emulator. Microsoft provides a WCE emulator that is suitable for running a decoder program similar to that described in 6.1.2.1. The emulator can use a directory on the host PC as a shared folder. This shared folder can be used to store the EDB volume and the decoder tool. A tool called ‘wmdumpedb.exe’ was created to read an EDB file and produce an XML file containing all active data in the EDB volume.

6 This API function is obsolete and should be replaced by CeOpenDatabaseEx2 (database).

In the appendix, Table 6 lists some of the databases found in the pim.vol file. The meaning of some of the fields in pim.vol can be found in header files in the Platform Builder directory,

Fig. 13 – A dump of an HTC S730 made with pdocread.

for instance WINCE500\PUBLIC\SERVERS\SDK\SAMPLES\OBEX\SRVRMODS\VUTILS\pegmapi.h and WINCE500\PUBLIC\SERVERS\SDK\SAMPLES\OBEX\SRVRMODS\VUTILS\splustag.h. The meaning of some of the other fields has been identified by reverse engineering. There are still other fields of which the meaning has not yet been established. It has not been investigated whether the pim.vol volume (an EDB formatted database) shares the property of a CEDB database that not all records have the same fields all the time.

Table 2 – Functions exposed by cedb400.dll.

Function name            Address    Ordinal
CeCreateDatabaseEx       10003B5E   1
CeCreateDatabaseEx2      100058E2   2
CeFindFirstDatabaseEx    100044FE   3
CeFindNextDatabaseEx     100046CA   4
CeMountDBVol             10005C82   5
CeOidGetInfoEx           10003D72   6
CeOidGetInfoEx2          10003C57   7
CeOpenDatabaseEx         10003C1E   8
CeOpenDatabaseEx2        10004A41   9
CeReadRecordPropsEx      10007BDD   10
CeSeekDatabase           10003EAB   11
CeSeekDatabaseEx         10005103   12
CeUnmountDBVol           10005D60   13
CeWriteRecordProps       10007EE7   14
CloseDBFindHandle        100049A5   15
CloseDBHandle            10005025   16
NTCreateDatabaseEx       10003C05   17
NTReserveOID             10005E64   18
NTSetFlags               10005FA3   19
DllEntryPoint            10003EC6

6.1.3. Where deleted data can be found

As in any file system, deleted data can often be recovered from a TFAT image from a WCE device. Data from deleted files can be found in unallocated clusters and in file slack. As mentioned earlier, TFAT is compatible with FAT, so an image is easily loaded into various forensic toolkits.

On WCE devices, the databases storing message and PIM data can also be explored for deleted data, as databases can contain deleted information as well (Stahlberg et al., 2007). Data deleted from the database volume file might not be found in unallocated clusters at all. This section deals with finding deleted data in cemail.vol.

When analyzing an unknown embedded system, one can work ‘top down’ or ‘bottom up’. Working ‘top down’ means trying to understand the way data is stored in the system from coarse to fine, and finally understanding the way in which user data is stored in raw format. Somewhere in between, one will find the mechanism with which the system deals with deleting data and freeing up memory space occupied by deleted data. In this way one might find data that is still present in the system, but no longer available through the API and for logical acquisition. The Windows CE Object Store was examined with this method in Eide et al. (2006).

When working ‘bottom up’, one tries to identify the smallest entities containing the user data, and then tries to carve and decode all those entities. All data, both active and deleted, can be found in this way, but as the mechanism to distinguish between active and deleted data is (as yet) unknown, the data is at first not classified as active or deleted. By comparing the output of a logical acquisition, yielding active data, with that of the physical acquisition, yielding all data, one can determine the deleted data as the difference between the two sets.

Reverse engineering showed that it is rather straightforward to find the location of individual database records in the cemail.vol file. First one needs to know that there are 9 data types in a CEDB database, see Table 3.

A property of CEDB databases is that each record in a database can have its own set of fields. This requires that each record stores a list of the fields that are in the record. Another property of CEDB databases is that the data block of the record can be split into odd and even bytes. Each of these two separate byte streams might be compressed. Often Unicode text in a record appears as plain ASCII in a binary dump of cemail.vol. This happens when the even stream is stored uncompressed and the odd stream, which contains only zero values for Latin characters, is compressed. The general structure of a CEDB record is shown in Fig. 15.

The (partial) record header always starts at a four byte boundary. Next follows a list of field type indicators, each 4 bytes long (Table 4).

When analyzing the cemail.vol database volume file, potential single records can be found using the following pseudo code:

Fig. 14 – Partial output of xpdumpcedb.exe, showing SMS Inbox content.

1. On a four byte boundary, search for a DWORD 7 with the following structure: *, *, [0|1], [0x0B|0x41|0x05|0x40|0x02|0x03|0x1F|0x12|0x13]
2. Go back 8 bytes and read the header structure
3. In 1, if byte three == 1, this is the last type indicator, go to 6
4. The next DWORD should have the following structure: *, *, [0|1], [0x0B|0x41|0x05|0x40|0x02|0x03|0x1F|0x12|0x13]
5. In 4, if byte three == 0, this is not the last type indicator, go to 4
6. Ready, potential (partial) record header found

When a potential record header is found, checks can be done to eliminate false positives. For instance, before and after the property id list, there are fields with data on compression type and sizes (compressed and decompressed) of the record. With this data most false positives can be eliminated. Also, it is very unlikely that decompression will yield sensible data on random data found after a false hit.

A Python 8 script called cedbexplorer.py was written to find CEDB records as described above. Once the location of a database record is known, the records that are compressed need to be decompressed. For this, ‘cedb400.dll’ was reverse engineered to find out how decompression in CEDB records works. The decompression is implemented in the same Python script. The script also calculates the MD5 hash value over each field in a record and over the record as a whole. With this MD5 hash, the script can search the output of xpdumpcedb.exe for the corresponding MD5. If found, the record is identified and marked as ‘active’. If not found, the record is assumed ‘deleted’. The script also produces a bookmark file for Hex Workshop, 9 so that the records found can be checked manually in Hex Workshop.

A screenshot of a cemail.vol file opened with Hex Workshop, together with the corresponding bookmark file, is shown in Fig. 16. Below is an explanation by field:

1. The bytes 0xB380-0xB387 make up (a part of) the record header.
   a. The first 4 bytes are always 0x00.
   b. The next 2 bytes are the external size in bytes after decompression, here 0x17c. The two highest bits are used as a flag, here 0x40, to indicate compression of the record data.
   c. The next 2 bytes are often the internal size of the record body. In this field, there is again a flag in the two highest bits.
2. The bytes 0xB388-0xB3BF list 14 DWORDs; those are the type identifiers of the 14 fields in this record. With the values in Table 3 the data types of each individual field can be decoded. In the last type identifier, byte 3 is 0x01, indicating end of list.
3. In 0xB3C0-0xB3C1 the highest bit is a flag: 0x80 means not compressed, 0x00 means compressed. When uncompressed, the rest indicates the length of the record in double bytes.
4. When it is compressed, the next 6 bytes (0xB3C2-0xB3C7) are the length of the even bytes stream: the first 3 bytes for the uncompressed size, the next 3 bytes for the compressed size.
5. The next item is the compressed even byte stream. The text is somewhat readable (in Dutch), but it also contains other items, like time, that cannot be interpreted at all in this way, because they are spread over the even and odd streams.
6. The next 6 bytes (0xB479-0xB47E) indicate the length of the odd bytes stream: the first 3 bytes for the uncompressed size, the next 3 bytes for the compressed size. As a check, the uncompressed size for even and odd bytes should be equal, and equal to the length described in point 1.b multiplied by two.

7 4 bytes.
8 www.python.org.
9 www.hexworkshop.com.
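The six search steps for potential record headers can be sketched in Python as follows. This is an illustration under stated assumptions and not the code of cedbexplorer.py: the function names are hypothetical, and the sketch assumes that the four bytes of a type indicator appear in the file in the order written in the pattern (two free bytes, then the last-field flag, then the variable type). If the on-disk byte order turns out to be reversed, the two indices in `is_indicator` would have to be swapped accordingly.

```python
# Variable type codes from Table 3.
VALID_TYPES = {0x0B, 0x41, 0x05, 0x40, 0x02, 0x03, 0x1F, 0x12, 0x13}

def is_indicator(dword: bytes) -> bool:
    """True when 4 bytes match the pattern *, *, [0|1], [valid type]."""
    return len(dword) == 4 and dword[2] in (0, 1) and dword[3] in VALID_TYPES

def find_candidate_headers(data: bytes):
    """Return (header_offset, field_count) for each closed run of indicators."""
    hits = []
    pos = 8  # leave room for the 8-byte header in front of the list
    while pos + 4 <= len(data):
        cur, count, closed = pos, 0, False
        while is_indicator(data[cur:cur + 4]):
            count += 1
            closed = data[cur + 2] == 1  # last-field flag set
            cur += 4
            if closed:
                break
        if closed:
            hits.append((pos - 8, count))  # step 2: go back 8 bytes
            pos = cur
        else:
            pos = cur + 4  # skip past the DWORD that broke the run
    return hits
```

Each hit is only a candidate. As the text explains, the size and compression fields around the list, and a trial decompression, are still needed to discard false positives.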

Fig. 16 – cemail.vol opened with HexWorkshop and corresponding bookmark file.

Table 3 – Data types in CEDB databases.

Name            Type                                Numerical ID
CEVT_BOOL       Boolean value                       0x0B
CEVT_CEBLOB     Binary object                       0x41
CEVT_R8         8-byte floating-point value         0x05
CEVT_FILETIME   Time and date data                  0x40
CEVT_I2         2-byte signed integer               0x02
CEVT_I4         4-byte signed integer               0x03
CEVT_LPWSTR     Long pointer to a Unicode string    0x1F
CEVT_UI2        2-byte unsigned integer             0x12
CEVT_UI4        4-byte unsigned integer             0x13

Table 4 – CEDB field type indicators.

Byte   Indicator       Description
1–2    Data type       For instance “subject”, “receive date”, “message is opened”.
3      Last field      Always ‘0’, except in the last field it is ‘1’
4      Variable type   See Table 3

In Fig. 17 the output of cedbexplorer.py is shown, decoding this same record. Notice how the MD5 of this record matches the MD5 of record 6 in Fig. 14. This means that record number 43 is actually message 6 in the SMS inbox. When this message is deleted by the user, cedbexplorer.py will still find it in the volume file, as long as it is not actively erased by the messaging application.

Fig. 15 – Structure of a CEDB record body. (Record header, 8 bytes; list of record ‘field type indicators’, N × 4 bytes; record body, consisting of an even byte stream and an odd byte stream, each stored in optionally compressed data containers.)

6.2. RAM

In 5.4.2.2 it was shown how to make a dump of the working memory of a process. A first examination of such a memory dump can be done with knowledge found in the source code that comes with Platform Builder. In heap.h 10, constants and structures are defined that help to dissect a RAM copy of a process.

The first step is to find the start of the heap within the memory copy. The heap starts at a 64 kB boundary and has a marker ‘HeaP’. Starting at the ‘HeaP’ marker, a structure occupying 0x30 bytes is the heap header. Then follows a so called region. The regions contain the actual heap items. The region header also contains a field that can point to the next region. If there is no next region, this pointer is 0 (Fig. 18).

The first heap item starts at offset 0x58. Each heap item has a header of 8 bytes: four bytes indicate the length of the item (including the item header), and the other four hold a pointer to the heap itself. A positive length indicates that the item is in use; a negative length indicates a free item.

A Python script called heapdigger.py was written to dissect the heap of a WCE process. This script searches for the heap marker in an image and decodes the heap headers and subsequent heap items. With this script a first step can be made in analyzing the way a program stores and leaves data on the heap. While simulating an SMS message being written but cancelled before sending, acquisitions of the process memory of tmail.exe were made with pmemdump.exe. Fig. 19 shows an example of the output of the script. Heap item 823, at address 0x1E0820 and with a length of 0x118, is ‘free’ and available for reuse. In the raw data of this item one can read the cancelled email in HTML format.

10 See the shared source code that comes with Platform Builder. The file heap.h is in the directory WINCE500\PRIVATE\WINCEOS\COREOS\CORE\LMEM\.

7. Discussion

In section 2, NOR and NAND flash were identified as important sources of user data. In modern WCE devices, NAND flash is becoming more important than NOR flash. RAM can also hold information of forensic relevance. WCE devices may also have special memory locations, for instance in the processor itself. These locations might hold items like unique numbers that can be used for encryption.

Section 3 identified software components that are either relevant to storage of user data, or useful in making copies of that user data. A TFAT file system hosted on flash memory can contain user data like text documents, pictures and videos, but also database files for messaging and PIM data. The heap, located in RAM, can contain relevant information related to processes running on the WCE device. Examples are navigation software having an NMEA reception buffer or email clients maintaining a scratch pad.

In section 4 the order of acquisition of the components holding potential evidence was explored. Depending on what kind of evidence is looked for, the risks of different kinds of acquisition are compared against the likelihood of successful recovery of the evidence. For example, if there is reason to believe that the essential evidence will be deleted video, the risk of damaging the physical memory chip during chip extraction might be regarded as smaller than the risk of erasing deleted data by switching the phone on to do a pseudo physical acquisition with tools like pdocread or XACT. Likewise, when it is suspected that evidence might be in the RAM occupied by a navigation application which is still active, one might decide to make a copy of RAM using JTAG or, if that is not feasible, using pmemdump. Finding out how to apply JTAG on a specific device might be a very time consuming task, and applying JTAG also holds a risk of a system crash, making a reset necessary. This will obviously put deleted data in both RAM and flash memory at risk. On the other hand, when using pmemdump the OS is still active, so there is a risk that deleted data in flash and RAM will be erased beyond recovery.

Section 5 described methods for physical and pseudo physical acquisition. For two physical acquisition methods a case example was given; physical memory chip removal and JTAG have been shown to be feasible methods to make a physical acquisition of a WCE device. Furthermore the use of several RAPI tools was described; it was shown how to determine the list of running processes on a WCE device. With this list, the name of a running process could be determined and used to make a dump of the RAM occupied by this process. Next it was shown how to make a pseudo physical acquisition of a flash based file system in a WCE device.

When using the RAPI tools, or any other tool that does pseudo physical acquisition on a WCE device, one has to take into account that these tools make use of dedicated software that has to be loaded on the running machine. Running this software on the device will at least overwrite unused RAM that might hold evidence. If the software is transferred to the WCE device, it has to be stored on external media like an SD card first, before it is loaded into RAM. If the software is loaded onto the internal file system, it might cause expired flash blocks to be erased and reused, thus erasing potential evidence.

In section 6 tools were presented to analyze acquired data. An algorithm was presented to reconstruct a TFAT file system from a physical acquisition. This reconstruction yields a TFAT image file containing the flash pages belonging to the latest version of the file system. This image can be further investigated with COTS file system analysis tools like EnCase or FTK. It also produces a file containing flash pages that no longer belong to the latest version of the file system. This file can also be loaded into analysis tools as unallocated clusters.

A tool called xpdumpcedb was presented. This tool runs under Windows XP. With this tool, a CEDB database volume file (for instance exported from EnCase) can be read completely. All fields in all records of all databases in the volume are output in XML format. The meaning of fields within database records can sometimes be found in header files in the Platform Builder source code; sometimes, though, the meaning has to be established by reverse engineering. Furthermore, a tool called wmdumpedb was presented. This tool runs under WCE, for instance on a WCE Emulator on an XP machine. This tool reads an EDB formatted database volume file. It writes all fields in all records of all databases in the volume to a file in XML format. This tool has the disadvantage that it can only run on a WCE device or a WCE device emulator. No possibility has yet been found to make a similar tool that runs natively on a desktop OS.
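The reconstruction algorithm of Section 6.1.1, which produces the two files mentioned above, can be sketched in Python. This is a minimal illustration, not the tool used by the author. It assumes the 0x210 byte page layout described in the text (0x200 data bytes followed by a 0x10 byte spare area), and it adds three assumptions that the paper does not state: the LBN is stored little-endian, the numbering starts at LBN 0, and an ‘empty page’ is filled with 0xFF. The function name `reorder_pages` is hypothetical.

```python
import struct

BUSY, EXPIRED = 0xF9, 0xF8  # page states from the spare area (byte 6)

def reorder_pages(raw: bytes, page_size: int = 0x210, data_size: int = 0x200):
    """Return (image, expired) built from a raw dump that includes spare areas."""
    active, expired = [], []
    for off in range(0, len(raw) - page_size + 1, page_size):
        # Assumption: LBN is a little-endian 32-bit value in spare bytes 0-3.
        lbn = struct.unpack_from("<I", raw, off + data_size)[0]
        state = raw[off + data_size + 6]
        if state == BUSY:
            active.append((lbn, off))
        elif state == EXPIRED:
            expired.append((lbn, off))
        # Free pages (0xFF) carry no data and are skipped.
    active.sort()
    image = bytearray()
    previous = -1  # assumption: the image starts at LBN 0
    for lbn, off in active:
        gap = lbn - previous
        while gap > 1:
            image += b"\xff" * data_size  # assumption: empty page is 0xFF filled
            gap -= 1
        image += raw[off:off + data_size]
        previous = lbn
    old = b"".join(raw[off:off + data_size] for _, off in expired)
    return bytes(image), old
```

The first return value corresponds to the TFAT image of the latest file system version, the second to the file with pages that no longer belong to it. Like the pseudo code, the sketch does not deal with two busy pages carrying the same LBN.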

Fig. 17 – Output of cedbexplorer.py.
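The MD5 comparison used to separate active from deleted records can be illustrated with a short sketch. It is not the code of cedbexplorer.py: the function names are hypothetical, and the way field values are serialized before hashing (here simply the raw bytes of each field, in order) is an assumption, since the paper does not specify it.

```python
import hashlib

def record_md5(fields):
    """MD5 over the field values of one record, hashed in order (assumed scheme)."""
    digest = hashlib.md5()
    for value in fields:
        digest.update(value)
    return digest.hexdigest()

def classify(carved_records, active_md5s):
    """Mark a carved record 'active' if its MD5 occurs in the logical output."""
    result = []
    for fields in carved_records:
        md5 = record_md5(fields)
        status = "active" if md5 in active_md5s else "deleted"
        result.append((md5, status))
    return result
```

Here `active_md5s` stands for the set of record hashes taken from the xpdumpcedb output; any carved record whose hash is absent from that set is assumed to be deleted, which is why an incorrectly decompressed active record can be mislabeled.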

Fig. 18 – Structure of a WCE heap. (Heap header at 0x00–0x2F; region header with next-region pointer at 0x30–0x57; followed by a sequence of heap item headers, each with its heap item.)

Fig. 19 – Part of the output of heap analysis tool.

Furthermore, some aspects of the low level structures of a WCE CEDB database were explained. With knowledge of these structures, a Python script called cedbexplorer.py was developed. It was presented here and it was shown that the script can find and (if necessary) decompress individual CEDB records. Because the tool does not look at the status of the individual records, it also finds deleted records that are still present in a CEDB volume file. On the basis of MD5 hash values calculated over the record fields, the tool can determine whether a record is active (already found by xpdumpcedb) or deleted (not found by xpdumpcedb). Because the decompression algorithm is not yet fully understood, sometimes active records are not decompressed correctly, and thus not found in the xpdumpcedb results and subsequently falsely marked as deleted. This effect is noticed mainly in big records. The script carves the records out of the database file and uses an indirect way to distinguish between active and deleted records. This indirect method is not the most efficient way, but the advantage is that it can be used independently of the higher level structures of cemail.vol, so that, for instance, expired flash memory blocks and unallocated clusters can be carved as well. In these locations both active and deleted database records can be found (Stahlberg).

Finally a script called heapdigger.py was presented. With this script a dump of the RAM occupied by a process can be searched for the presence of a heap. If a heap is found, it will be analyzed. Sections in the heap are written to an XML file. In this way busy and free heap items can be studied. These items can contain relevant evidence, as processes can use the heap to allocate buffers. Examples are a receive buffer for GPS NMEA data, or a buffer to hold an email while it is written. These items can still be found on the heap although they are released for reuse. This tool is a proof of concept. Experiments are needed to find out what information is left on the heap by popular WCE applications. The actual value in a forensic investigation has to be established in this way.

8. Future work

This paper is mainly a starting point for further investigation of Windows CE based devices. Lots of questions came up during case driven research of WCE devices, but often the research was stopped because the required data was recovered or a case was closed. Because of this, there are still a lot of interesting issues to be studied.

Besides the heap, the stack is a place to look for data. Functions store local variables on the stack. Data stored by functions with a relatively long lifetime might be on the stack for a relatively long time. This has not been explored from a forensic point of view. Memory mapped files might also be interesting in a forensic context. First, ways of recovering those files have to be established. Then the forensic relevance can be determined.

Important aspects of the format of CEDB database records are clear, but the decompression for these records presented in this paper is not yet perfect and has to be improved. The format of the CEDB successor EDB, however, is not yet known. It is likely that EDB databases will also contain deleted records until the moment of a database clean-up.

Deleted data is found in CEDB databases. It would be interesting to establish to what extent deleted data will remain in the volume files and to compare CEDB databases to the databases investigated by Stahlberg.

Finally, the knowledge on forensically interesting artifacts of popular software on desktop Windows versions is not one-to-one transferable to WCE platforms. Research should be conducted to find out about similarities and differences in the way evidence is left by applications on both platforms.

Acknowledgement

The author would like to thank colleagues at the Netherlands Forensic Institute for support in investigating the Windows CE platform and writing this paper. Furthermore the author would like to thank the reviewers for very useful suggestions.

Appendix.

Table 5. Databases and field types found in cemail.vol.

Database name: pmailFolders

This database holds the folder structure of the messaging system. Generally there are several root folders for the various messaging methods: SMS, ActiveSync, Hotmail, and some POP/SMTP email accounts. Each of these root folders has subfolders: Inbox, Outbox, Sent items, Drafts and Deleted items.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 3001001F | String | PR_DISPLAY_NAME, name of folder. When c2 and c3 are equal, this is a ‘messaging method’ (SMS, ActiveSync, Hotmail or some SMTP/POP email account). When c2 and c3 are not equal, this is the ‘message box type’ (inbox, outbox, drafts, sent items or deleted) |
| 2 | 80010013 | Uint32 | Database for this folder. The name of the database is ‘fldrX’, where X is the hexadecimal value of this field. If for example this field has the value 31000026, messages in this folder can be found in the database with the name fldr31000026 |
| 3 | 80050013 | Uint32 | ‘Messaging method’ link. The ‘messaging methods’ and the ‘message box types’ are grouped together by having this number equal |
|  | 8119001F | String | Signature used when composing a message in this channel. Empty when c2 ≠ c3 |
| 4 | 8117001F | String | Protocol. ‘SMS’ and ‘SMTP’ are values found |
|  | 820F0040 | DateTime | Unknown |
| 5 | 82160040 | DateTime | Unknown. Maybe last time synchronized |

Database name: fldr3100026

This is an example of a message folder database, for instance an Inbox. For each messaging method there are at least five subfolders. The user can also create extra folders.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 0E060040 | DateTime | Receive date and time |
| 3 | 0C1A001F | String | ‘File as’ name |
| 4 | 0C1F001F | String | ‘From’ name |
| 5 | 003D001F | String | Subject prefix, like ‘Re:’ or ‘Fwd:’ |
| 7 | 0E1B0013 | Uint32 | If > 0 then there is an attachment to this message |
| 8 | 80050013 | Uint32 | Attachment id. In the attachment database, there is a field ‘81000013’. The attachment to this message can be found in the record where the value in ‘81000013’ is equal to the value in ‘80050013’ in this database |

Database name: pmailAttach

This database is used to link the file holding the attachment to the message the attachment is attached to.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 81000013 | Uint32 | Attachment ID. Links to field 80050013 in message folders named ‘fldrX’ |
| 2 | 370E001F | String | MIME type of attachment |
| 3 | 3704001F | String | Original name of file attached |
| 4 | 81000013 | Uint32 | First set of 8 digits of the name of the file in which the attachment is actually stored on the WCE device |
| 5 | 80010013 | Uint32 | Second set of 8 digits of the name of the file in which the attachment is stored, joined with c4 with a ‘-’. If c4 contains 13002345 and c5 contains 23450012, then the attachment is stored in: \Windows\Messaging\Attachments\13002345-23450012.att |
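The link between a message record in a ‘fldrX’ database and its attachment file, as described for pmailAttach, can be sketched as follows. This is a minimal illustration and not the tooling used in this paper: the function names are hypothetical, records are assumed to be already parsed into dictionaries keyed by the column labels (c1, c4, ...) of the tables, and the field values are assumed to be rendered as eight lowercase hexadecimal digits (the only example in the table contains digits 0 to 9, so case and padding are not confirmed).

```python
# Hypothetical helpers; records are assumed to be dicts keyed by the
# column labels (c1, c4, ...) used in the appendix tables.
ATTACH_DIR = r"\Windows\Messaging\Attachments"


def attachment_path(c4, c5):
    """Join pmailAttach fields c4 and c5 into the attachment file path.

    Assumption: both values are rendered as eight hexadecimal digits.
    """
    return "{}\\{:08x}-{:08x}.att".format(ATTACH_DIR, c4, c5)


def attachments_for(message, attach_records):
    """Return attachment paths for one record of a 'fldrX' database."""
    if message.get("c7", 0) <= 0:  # c7 > 0 signals an attachment
        return []
    return [
        attachment_path(rec["c4"], rec["c5"])
        for rec in attach_records
        if rec["c1"] == message["c8"]  # field 81000013 equals field 80050013
    ]


# Illustrative records; the attachment ids 0x42 and 0x43 are made up.
message = {"c7": 1, "c8": 0x42}
attach_records = [
    {"c1": 0x42, "c4": 0x13002345, "c5": 0x23450012},
    {"c1": 0x43, "c4": 0x13002346, "c5": 0x23450013},
]
print(attachments_for(message, attach_records))
```

Keying the records by column label rather than by type identifier is a deliberate choice here, because the table lists the identifier 81000013 for both c1 and c4 of pmailAttach.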

Database name: pmailMsgs

This database holds additional data on sent messages. Depending on the messaging method and the type of message (sent/received/draft or deleted), different data is stored.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 81000013 | Uint32 | Attachment ID. Links to the value in field 80050013 in message folders named ‘fldrX’ |
| 2 | 0E090013 | Uint32 | Originating folder. If this contains 31000026, the message is in ‘fldr31000026’ |
| 3 | 851F0040 | DateTime | Related to the message: received date/time, stored date/time, deleted date/time |
| 4 | 800F001F | String | Dependent on message type. Can be ‘File as’ name |
| 5 | 800C001F | String | Dependent on message type. Can be ‘From’ name |
| 6 | 800E0041 | Blob | Contains data on the message, often data already found in other fields. Seen to contain protocol type, ‘File as’ name, From, number, email address |
| 7 | 80010013 | Uint32 | Message number. For email messages: points to the file holding the message body, including the email header. Example: if this field holds 34000135, the email message is stored in a file \Windows\Messaging\35340001[postfix].mpb (the value is rotated right one byte). The [postfix] is seen to have one of the values 8242001e, 8241001f, 1013001e, 1000001f, 1000001e and 81030102, but this list is probably incomplete. These values seem to indicate the format of the email body: html without header, smtp header, empty. An email message does not necessarily have only one email body file. It can have more, for different storage format types. |

Database name: MessageThreadsDB

This database holds messages. Why messages are stored separately in this database has not yet been investigated.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 00010040 | DateTime | Date/time (exact meaning not looked at yet) |
| 2 | 0002001F | String | Email subject or SMS body text |
| 3 | 0004001F | String | ‘From’ name |
| 4 | 0005001F | String | Originating from (phone number or email address) |
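Two of the name derivations in Table 5 are simple enough to express in code: the folder database name ‘fldrX’ built from field c2 of pmailFolders, and the message body file name built from field c7 of pmailMsgs by rotating the value right by one byte. The sketch below illustrates them under stated assumptions: the function names are hypothetical, values are formatted as eight lowercase hexadecimal digits (case and zero padding are not confirmed by the examples, which contain only decimal digits), and the postfix must be supplied by the caller because the list of observed postfix values is probably incomplete.

```python
def folder_database_name(c2):
    """Name of the message database for a pmailFolders record (field c2)."""
    # Assumption: X in 'fldrX' is written as eight hexadecimal digits.
    return "fldr{:08x}".format(c2)


def rotate_right_one_byte(value):
    """Rotate a 32-bit value right by 8 bits."""
    value &= 0xFFFFFFFF
    return ((value >> 8) | (value << 24)) & 0xFFFFFFFF


def message_body_path(c7, postfix):
    """Path of a message body file for a pmailMsgs record (field c7).

    postfix is one of the observed values, for example "8242001e".
    """
    return "\\Windows\\Messaging\\{:08x}{}.mpb".format(
        rotate_right_one_byte(c7), postfix
    )


print(folder_database_name(0x31000026))
print(message_body_path(0x34000135, "8242001e"))
```

Since one message can have several body files, a practical tool would probably try every known postfix for a given message number and report the files that actually exist on the acquired file system.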

Table 6. Databases and field types found in pim.vol.

Database name: Appointments Database

Holds appointment items.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 10420040 | DateTime | Due date/time |
| 2 | 00520040 | DateTime | Some other date/time (exact meaning not looked at yet) |
| 3 | 0020001F | String | Subject |
| 4 | 0041001F | String | Location |
| 5 | 0051001F | String | Organizer |
| 6 | 0029001F | String | Type of appointment (exact meaning not looked at yet) |

Database name: Contacts database

Holds contact items.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 0080001F | String | Name 1 |
| 2 | 0082001F | String | Name 2 |
| 3 | 0096001F | String | Number |

Database name: Tasks database

Holds task items.

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 0020001F | String | Subject |
| 2 | 0029001F | String | Type of task (exact meaning not looked at yet) |
| 3 | 00620040 | DateTime | Date/time 1 (exact meaning not looked at yet) |
| 4 | 00630040 | DateTime | Date/time 2 (exact meaning not looked at yet) |
| 5 | 00640040 | DateTime | Date/time 3 (exact meaning not looked at yet) |
| 6 | 00660040 | DateTime | Date/time 4 (exact meaning not looked at yet) |

Database name: Clog

Holds the call log of this phone

| c | Type identifier | Data type | Function |
|---|---|---|---|
| 1 | 00020040 | DateTime | Date/time 1 (exact meaning not looked at yet) |
| 2 | 00030040 | DateTime | Date/time 2 (exact meaning not looked at yet) |
| 3 | 0006001F | String | ‘File as’ name |
| 4 | 0007001F | String | Tbd |
| 5 | 000A001F | String | Tbd |
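In Tables 5 and 6 the last four hexadecimal digits of every type identifier coincide with the listed data type: 0013 with Uint32, 001F with String, 0040 with DateTime and 0041 with Blob. The sketch below splits an identifier on that observation. It is an illustration only: reading the upper four digits as a property number is an assumption, the mapping covers only the four type codes seen in these tables, and the names DATA_TYPES and split_type_identifier are hypothetical.

```python
# Type codes as observed in Tables 5 and 6 (not a complete list).
DATA_TYPES = {
    0x0013: "Uint32",
    0x001F: "String",
    0x0040: "DateTime",
    0x0041: "Blob",
}


def split_type_identifier(identifier):
    """Split an eight-digit type identifier into (property number, data type).

    Assumption: the upper 16 bits identify the property and the lower
    16 bits identify the data type.
    """
    value = int(identifier, 16)
    prop = value >> 16
    type_code = value & 0xFFFF
    return prop, DATA_TYPES.get(type_code, "Unknown")


for ident in ("3001001F", "80010013", "0E060040", "800E0041"):
    prop, data_type = split_type_identifier(ident)
    print("{} -> property {:04X}, {}".format(ident, prop, data_type))
```

A lookup of this kind is mainly useful when carving records from unallocated space, where the data type has to be inferred from the identifier because the higher level database structures are not available.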

References

Association of Chief Police Officers (ACPO). Good practice guide for computer-based electronic evidence. Online, cryptome.org/acpo-guide.htm.

Ayers R, Jansen W, Cilleros N, Daniellou R. Cell Phone Forensic […] Standards and Technology; October 2005.

Boling D. Windows CE.NET advanced memory management. Online, msdn.microsoft.com/en-us/library/ms836325.aspx; August 2002.

Breeuwsma M, Jongh M de, Klaver C, Knijff R van der, Roeloffs M. Forensic data recovery from flash memory. Small Scale Digital Forensics Journal June 2007;1(1).

Breeuwsma M. Forensic imaging of embedded systems using JTAG (boundary-scan). Digital Investigation March 2006;3:32–42.

BusinessWeek, Fiat and Microsoft Launch Blue&Me. Online, www.businessweek.com/autos/content/feb2006/bw20060202_986426.htm; February 2006.

Canalys. Smart phone market shows modest growth in Q3. Online, www.canalys.com/pr/2009/r2009112.html; November 2009.

Cellebrite, UFED Physical Pro. Online, www.cellebrite.com/UFED-Physical-Pro.html.

Eide J, Skogheim Olsen JO. Forensic analysis of an unknown embedded device. Online, ntnu.diva-portal.org/smash/get/diva2:121991/FULLTEXT01; June 2006.

Hengeveld W. -policies. Online, www.xs4all.nl/~itsme/projects/xda/smartphone-policies.html.

Hengeveld W. xda tools. Online, www.xs4all.nl/~itsme/projects/xda/tools.html; November 2009.

Herrera C de. Windows CE/Windows Mobile Versions. Online, www.pocketpcfaq.com/wce/versions.htm; October 2009.

Intel persistent storage manager user’s guide. Online, www.developers.net/filestore2/download/2613; September 2005.

Intel. Marvell to purchase Intel’s communications and application processor business for $600 Million. Online, www.intel.com/pressroom/archive/releases/2006/20060627corp.htm; June 2006.

Jansen W, Ayers R. Guidelines on cell phone forensics. Online. Recommendations of the National Institute of Standards and Technology; May 2007.

[…] com/en/Support/Device_support/Flash.

Knijff R van der. Embedded systems analysis. In: Casey E, editor. Handbook of digital forensics and investigation; 2010.

Marvell, communications processors. Online, www.marvell.com/products/processors/communications/pxa_90/.

Microsoft 1, supported processors. Online, msdn.microsoft.com/en-us/windowsembedded/ce/aa714536.aspx#ARM.

Microsoft 2, virtual memory layout: Windows CE 5.0 vs. Windows Embedded CE 6.0. Online, msdn.microsoft.com/en-us/library/aa914933.aspx.

Microsoft 3, heaps. Online, msdn.microsoft.com/en-us/library/aa450550.aspx.

Microsoft 4, TFAT overview. Online, msdn.microsoft.com/en-us/library/aa915463.aspx.

Microsoft 5, TFAT File naming limitations. Online, msdn.microsoft.com/en-us/library/ms892402.aspx.

Microsoft 6, driving connectivity. Online, download.microsoft.com/download/6/5/0/6505FA0E-1F39-4A34-BDC9-A655A5D3D2DB/MicrosoftAutoOverview.pdf.

Microsoft 7, databases. Online, msdn.microsoft.com/en-us/library/ms885343.aspx.

Microsoft 8, Windows embedded CE 6.0 evaluation edition. Online, www.microsoft.com/downloads/details.aspx?familyid=7E286847-6E06-4A0C-8CAC-CA7D4C09CB56&displaylang=en.

MSAB, XACT datasheet. Online, www.msab.com/fileadmin/user_upload/media/Documents/Product_Sgeets/XACT.pdf.

Rogers A, Glaum J, Tonkelowitz M. Creating file systems within an image file in a storage technology-abstracted manner, www.freepatentsonline.com/EP1544732.pdf; June 2005.

Samsung, application processor. Online, www.samsung.com/global/business/semiconductor/products/mobilesoc/Products_ApplicationProcessor.html.

Stahlberg P, Miklau G, Levine B. Threats to privacy in the forensic analysis of database systems. In: Proc. ACM SIGMOD/PODS; June 2007.

Texas Instruments, wireless handset solutions: overview. Online, focus.ti.com/general/docs/wtbu/wtbugencontent.tsp?templateId=6123&navigationId=11988&contentId=4638.

Texas Instruments. OMAP35x applications processor technical reference manual. Online, focus.ti.com/lit/ug/spruf98d/spruf98d.pdf; October 2009.

Xda-developers, wings SSPL and HardSPL. Online, forum.xda-developers.com/showthread.php?t=356295; May 2007.

Xda-developers, hermes bootloader. Online, wiki.xda-developers.com/index.php?pagename=Hermes_BootLoader; November 2008.