<<

Embedded Devices Security and Reverse Engineering

BH13US Workshop

∗ † Jonas Zaddach Andrei Costin FIRMWARE.RE FIRMWARE.RE jonas@firmware.re andrei@firmware.re

ABSTRACT Keywords Embedded devices have become the usual presence in the embedded devices, firmware, security, reverse engineering, network of (m)any household(s), SOHO, enterprise or criti- exploitation, vulnerabilities, backdoors, static analysis, bi- cal infrastructure. nary analysis, firmware unpacking, firmware analysis, firmware The preached promises to gazillion- modification uple their number and heterogeneity in the next few years. However, embedded devices are becoming lately the usual 1. INTRODUCTION suspects in security breaches and security advisories and thus In the world of ever increasing interconnection of com- become the Achilles’ heel of one’s overall infrastructure se- puting, mobile and embedded devices, their security has be- curity. come critical. The security of embedded devices and their An important aspect is that embedded devices run on firmwares is the new differentiator in the embedded market. what’s commonly known as firmwares. To understand how The security requirements and expectations for to secure embedded devices, one needs to understand their devices are being constantly raised as the world moves to- firmware and how it works. wards the Internet of Things. This is especially true for em- This workshop aims at presenting a quick-start at how to bedded devices and their counterpart – firmwares inspect firmwares and a hands-on presentation with exercises – which is also their weakest point as shown below. on real firmwares from a security analysis standpoint. On another hand, embedded devices still have much less to offer in terms of firmware security at this point. We can see almost daily security advisories related to embedded devices, General Terms many of them related to critical or cyber-physical Compuster System Security, Network and Distributed Sys- systems. It’s not accidental that anecdotical evidence vehic- tem Security, Embedded Devices, Firmware, Security, Re- ulate the term Embedded and Firmware Security - Back verse Engineering to The 90s!. It both shows how easy it is to find vulnera- bilities in embedded firmware, as well as how bad is the state ∗ of affairs in the firmware world from a security view-point. PhD candidate on With this whitepaper and workshop, we aim at presenting ”Development of novel binary analysis techniques for security applications” at a quick-start at how to inspect and analyze firmwares, deliv- EURECOM, Sophia-Antipolis, Biot, France, ering hands-on presentation on real firmwares and compiling [email protected] exercises from a security analysis standpoint. This, on an- other hand, should help speed-up the responsible disclosure †PhD candidate on and fixing of those dormant vulnerabilities. ”Software security in embedded systems” at This paper is organized as follows: we start with presen- EURECOM, Sophia-Antipolis, Biot, France, tation of minimal required theory in Section 2; we continue [email protected] with survey on previous work and state of the art in Section 3; we present most commont firmware formats, their chal- lenges and how to handle their unpacking end-to-end in Sec- tion 4; we reinforce the presented knowledge with hands-on exercises and solutions in Section 5; we conclude in Section Permission to make digital or hard copies of all or part of this work for 6. personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies 1.1 Workshop Outline bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific The workshop which supports this whitepaper, is orga- permission and/or a fee. nized according to the following outline: BH13US ’13 Las Vegas, USA . • what are the embedded systems • what are the firmwares • Whiteware – Washing Machine, Fridge, Dryer

• what are the challenges with firmwares • Entertainment gear – TV, DVRs, Receiver, Stereo, Game Console, MP3 Player, Camera, , • how to overcome those challenges Toys

• typical firmware formats and contents • Other Devices - Hard Drives, Printers

• what’s firmware packing • Cars • how to tackle unpacking problems elegantly • Medical Devices • typical firmware analysis 2.3 Embedded devices hardware architectures • introduction to firmware analysis process automation This section highlights the different architecture elements of an , ranging from the processor archi- • introduction to firmware emulation tecture to memories, and connections between of • challenges and ideas to overcome those the embedded device as well as connections to other systems. • some use case studies on real-world vulnerability find- Processor Architectures. ings Contrary to the relative uniformity of the PC market, em- bedded device’s architectures are very diverse. In middle to • hands-on exercises upper-class market segments of processors offering features like memory and high clock rates, ARM pro- 2. LITTLE BIT OF THEORY cessors are wide-spread, and is trying to catch up with its ATOM line. MIPS processors can be found, too. In the 2.1 Definition of firmware lower-class market processor architectures like Atmel AVR, The term ”firmware” has been coined by Ascher Opler in Intel 8051 and Motorola 6800/68000 power in a 1967 Datamation article. His definition of firmware with small memories and lower clock frequencies. Apart as a glue layer between the CPU instruction set from that, more exotic architectures like Ambarella, Axis and the actual hardware has since been superseded, and the CRIS and others can be found in some devices. IEEE Standard Glossary of Software Engineering Terminol- ogy, Std 610.12-1990, defines firmware today as follows: On-board buses. The processor cores communicate with other design blocks The combination of a hardware device and com- or chips around them through a variety of interfaces: Most puter instructions and data that reside as read- commonly, SPI (Serial Interface), I2C (Inter-IC), only software on that device. Dallas 1-Wire and UART serial buses can be found, but Notes: (1) This term is sometimes used to refer more complex systems also use buses more common in PC only to the hardware device or only to the com- architectures like PCI and PCI Express. Finally, ARM cores puter instructions or data, but these meanings can be connected to peripheral IP blocks through the AMBA are deprecated. (Advanced Bus Architecture) interface. Notes: (2) The confusion surrounding this term has led some to suggest that it be avoided alto- Common communication lines. gether. The above-mentioned buses are mainly used as commu- For the sake of simplicity we will deviate from this defini- nication interface on the same board. For communications tion in that we call the set of all code running on the hard- with or other systems, additional interfaces might ware’s processor (machine code and virtual machine code) be used: the firmware of this device. • Ethernet - RJ45

2.2 Device Classes • RS485 Firmware-driven devices can be found virtually everywhere - Nowadays our cars are controlled by hundreds microcon- • CAN/FlexRay trollers, washing machines are programmable, and of course industrial control is automated and can be controlled from • Bluetooth a central console. Here is a non-exhaustive list of all the • WIFI domains that use firmware-driven devices: • Infrared • Networking – Routers, Switches, NAS, VoIP phones • Zigbee • Surveillance – Alarms, Cameras, CCTV, DVRs, NVRs • Other radios (ISM-Band, etc) • Industry Automation – PLCs, Power Plants, Industrial Process Monitoring and Automation • GPRS/UMTS

• Home Automation – Sensoring, Smart Homes, Z-Waves, • USB Philips Hue Memory. • VxWorks is a popular proprietary real-time operating Different types of memory can be mapped directly into system. the address space of the embedded system: • Cisco IOS • DRAM is a volatile memory that can be accessed read/write. Though it is quite fast, some processor cycles might be needed to access content, which is why caching is • Windows CE/NT employed in some architectures to speed up DRAM access. The DRAM controller needs to be set up with • L4 the memory’s timings before this type of memory can actually be used, which normally happens at very early • eCos stages in the .

• SRAM is a volatile memory that can be accessed read/write. • DOS It is very fast and can be read or written without or much less delay than DRAM, but it is quite expensive, • which is why you fill find only small quantities of this memory (typically < 1Mb) in a device. • JunOS • ROM is a non-volatile memory that can only be read. • Ambarella Typically it is programmed in factory and contains startup code that is absolutely needed, for example a mask ROM bootloader. • etc.

• Memory-Mapped NOR Flash is a non-volatile memory that can be accessed read/write. Contrary to previous Common . memories, reads can happen for any offset, but writes The bootloader is the first piece of software that is exe- need to take place for a whole block, which is why they cuted after a possible mask ROM bootloader. Its purpose is are mainly used to store bootcode. to load parts of the into memory and bring the system in a defined state for the kernel (though this re- quirement is fluid, the kernel takes over some of the Common Storage. former duties of a bootloader, like setting up the Pin Mux). While the above-mentioned memories, with the exception It can be organized in one or two stages. In a two-stage set- of NOR flash, serve only as volatile storage, other options ting, the first stage only knows how to load the second stage, exist for permanent data storage: while the second stage provides support for file systems etc. • NAND Flash is typically connected through a bus like • U-Boot is probably the most popular bootloader SPI to the CPU and behaves similar to NOR flash: Any byte offset can be read, but writes and deletes need to happen for a whole erase block. Since each • RedBoot block may only be written so often before it breaks, special file systems exist that try to balance the wear • between cells. • Ubicom bootloader • SD Card or any other common storage card (MMC, ...) can directly be used as a block device in Linux. The connection of the controller to the main system Common Libraries and Dev Envs. varies (USB, integrated, SPI, ...). Today, there are several prepackaged toolchains available. • Hard Drive can be connected via SATA, SCSI or PATA These consist of build tools for the specific processor (com- to the system. Like for SD Cards, the actual connec- piler, assembler, etc). In most cases you will also get the tion of the controller to the system can be realized over standard compiled for your target, and for some even other buses. a wide range of open-source packages, like ’s toolchain with its recipes.

Common Operating Systems. • + uClibc is probably the most used combina- Embedded systems are powered by firmwares of varying tion. complexities. More complex ones usually use a full-blown operating system like Linux or Windows NT. Less complex • devices use operating systems like VxWorks or Windows CE, and lots of special purpose operating systems can be found, • openembedded too. Here is a non-exhaustive list of operating systems that can be encountered in firmware analysis: • crosstool • Linux is by far the most popular operating system for more complex embedded devices. • crossdev 3. RELATED WORK AND STATE OF THE which allows arbitrary injection of into the print- ART er´sfirmware. While [15] demonstrated the proof-of-concept sending arbitrary or custom command to any printer via In [12] the assessment of the security of current embedded standard printed documents, including MS Office Word. Adobe management interfaces was conducted. Vulnerabilities were PostScript and Java Applets-generated, [17] used the same found in all 21 devices from 16 different brands, representing attack vector to deliver a modified firmware. [10] examines 8 different categories, including network switches, cameras, the vulnerability of PLCs to intentional firmware modifica- photo frames, and lights-out management modules. Along tions in order to obtain a better understanding of the threats these, a new class of vulnerabilities was discovered, namely posed by PLC firmware modification attacks and the feasi- cross-channel scripting (XCS) [13]. XCS vulnerabilities are bility of these attacks. A general firmware analysis method- not particular to embedded devices, however it is indicated ology is presented, and a proof-of-concept experiment is used that embedded devices is the most affected population. to demonstrate how legitimate firmware can be updated and Results from [12] were subsequently used in [23]. Re- uploaded to an Allen-Bradley ControlLogix L61 PLC. searchers address the challenge of building secure embedded On the deffensive side, however, there is slightly less pre- web interfaces by proposing WebDroid, the first framework vious work available. specifically dedicated to this purpose. To that end, they In [24] addresses the important challenge of verifying the demonstrate and evaluate the efficiency of their framework integrity of peripherals’ firmware. Authors propose software- in terms of performance and security. only attestation protocols to verify the integrity of periph- In [14] RevNIC is presented. RevNIC is a tool for re- erals’ firmware, and show that they can detect all known verse engineering network drivers. The work presents a tech- software-based attacks. Authors also implement their scheme nique that helps automate the reverse engineering of device using a Netgear GA620 network adapter in an x86 PC, and drivers. It takes a closed-source binary driver, automatically evaluate theirs system with known attacks. reverse engineers the driver’s logic, and synthesizes new de- [25] presents a tool developed specifically for the SCADA vice driver code that implements the exact same hardware environment to verify PLC firmware. The tool does not re- protocol as the original driver. This code can be targeted quire any modifications to the SCADA system and can be at the same or a different OS. No vendor documentation or implemented on a variety of systems and platforms. The tool source code is required. captures serial data during firmware uploads and then veri- Continuing in direction of [14], the works of [20–22] present fies them against a known good firmware baseline. Attempts on multiple aspects of firmware reversing and backdooring to inject modified and/or malicious firmware are identified on the network cards. by the tool. [26] presents a time-of-check-to-time-of-use (TOCTTOU) attack via externally attached mass-storage devices. The at- 3.1 Community Efforts and Tools tack is based on emulating a mass-storage device to observe There are many community efforts dedicated to reverse and alter file access from the consumer device. The TOCT- engineering of firmwares and embedded devices. Each of TOU attack was executed by providing different file content these efforts have a specific goal and thus the tools produces to the check and installation code of the target device, re- by those efforts are influenced by their main goals. spectively. The presented attack shown to be effective to We try to summarize in a comprehensive list the most bypasses the file content inspection, resulting in the execu- used and visible efforts to date. tion of rogue code on the device. [18] presents the results of the study of a vulnerability as- • binwalk – Binwalk is a firmware analysis tool designed ˘ ´ sessment of embedded network devices within the worldˆaAZs to assist in the analysis, extraction, and reverse en- largest ISPs and civilian networks, spanning North America, gineering of firmware images and other binary blobs. Europe and Asia. The observed data confirmed the intuition It is simple to use, fully scriptable, and can be easily that these devices are indeed vulnerable to trivial attacks extended via custom signatures, extraction rules, and and that such devices can be found throughout the world in plugin modules. large numbers. This study was subsequently extended with works of [19] with a quantitative lower bound estimation on • firmware-mod-kit – This kit is a collection of scripts the number of vulnerable embedded device on a global scale. and utilities to extract and rebuild linux based firmware Work of [16] presented the reverse-engineering of firmware images. This kit allows for easy deconstruction and re- images for multiple Xerox devices. This allowed discovery construction of firmware images for various embedded of lower-level from the PostScript high-level document devices. language. The attacks were delivered to the printers via standard printed documents, as previously demonstrated • FRAK: Firmware Reverse Analysis Konsole – Unfortu- in [15]. Multiple attacks were presented, including mem- nately, after an year since BH12US and notes of FOSS ory dumping/scraping leading to password theft and pas- license in [17], FRAK tool and it’s source code re- sive network topology discovery, as well as outbound socket mained unreleased as the site welcomes with the same sending arbitrary data. message for an year: SVN Repository: Coming soon! There were recent advances in firmware modification at- Please subscribe to mailing list for release date. tacks by [10, 11, 17]. [11] addressed the network card based As of time of this writing, it was not possible to evalu- on Broadcom BCM4325 & BCM4329 chipsets and demon- ate the tool, hence it was not possible to conclude over strated how to put these cards in monitor mode. [17] pre- the state of this project. sented a case study of the HP-RFU (Remote Firmware Up- date) LaserJet printer firmware modification vulnerability, • ERESI framework – The ERESI Reverse Engineering Software Interface is a multi-architecture binary analy- sis framework with a domain-specific language tailored complexity, you will find different levels of packing and dif- to reverse engineering and program manipulation. ferent objects inside the archives. We classify the firmware complexity according to this categories: • signsrch – Tool for searching signatures inside files, ex- tremely useful as help in reversing jobs like figuring or • Full-blown (full-OS/kernel + bootloader + libs + apps) having an initial idea of what encryption/compression - This is typically a Linux or Windows firmware that algorithm is used for a proprietary protocol or file. It carries a complete file system. The driving applica- can recognize tons of compression, multimedia and en- tion will most likely run in User mode, though custom cryption algorithms and many other things like known kernel modules/drivers might be used. strings and anti- code which can be also man- ually added since it’s all based on a text signature file • Integrated (apps + OS-as-a-lib) - This is firmware with read at runtime and easy to modify. a small proprietary operating system or none at all - the application will typically run with the same privi- • hash-identifier – Software to identify the different types leges as the kernel. of hashes used to encrypt data and especially pass- words. Over few dozens of hash supported. Encryp- • Partial updates (apps or libs or resources or support) tion algorithms that can not be differentiated unless The firmware image will not contain all files that form they have been decrypted, so the efficiency of the soft- the complete system, but just an update for concerned ware also depends on the user’s criteria. files. • offzip – A very useful tool to unpack the zip (zlib, gzip, In a firmware of the first category, you will typicallyl find deflate, etc.) data contained in any type of file in- the following objects while you unpack the firmware: cluded raw files, packets, zip archives, and • Bootloader (1st/2nd stage) everything else. It’s needed only to specify the offset where the zip data starts or using the useful -S search • Kernel options able to find any possible zip block contained in the provided file. There are also other options for • File-system images extracting all the zip blocks which have been found or • User-land binaries dumping them as in their original compressed form. • Resources and support files • TrID – TrID is an utility designed to identify file types from their binary signatures. While there are similar • Web-server/web-interface utilities with hard coded logic, TrID has no fixed rules. Instead, it’s extensible and can be trained to recognize Those objects can be grouped and packed in any of the new formats in a fast and automatic way. following archives or filesystem images (non-exhaustive list): • gpltool/bat – BAT, previously GPLtool, makes it eas- • Pure archives (CPIO/Ar/Tar/GZip/BZip/LZxxx/RPM) ier and cheaper to look inside binary code, find com- • Pure filesystems (YAFFS, JFFS2, extNfs) pliance issues, and reduce uncertainty when deploying Free and Open Source Software. • Pure binary formats (SREC, iHEX, ELF) • PFS – PFS archive file format (file system?) is used in • Hybrids (any breed of above) images of routers like Benq ESG 103, NDC NWH8018, and probably many others. In this following paragraph, we list unpacking tools for each archive format: • CNU fpu – CNU fpu is a pack/unpack utility for Cisco IP Phones firmware files (7941, 7961, 7911-12 and oth- Firmware Formats – Flavors. ers based on CNU File Archive 3.0 format) Written by kbdfck at virtualab.ru • Ar - The ar tool is part of all Linux/FreeBSD distri- butions. Use ”ar x <file>” to extract. • ardrone-tool – Aims to develop tools for A.R. Drone, for example to create and flash custom linux kernels. • YAFFS2 - There are tools in the yaffs2utils project to extract this filesystem [9]. • UnYAFFS – Unyaffs is a program to extract files from a yaffs file system image. Now it can only extract • JFFS2 - BAT uses a python wrapper around the jffs2dump images created by mkyaffs2image. utility that is part of mtdtools. • squash-tools – SquashFS • SquashFS - You can find unpacking tools at [7] • UbiFS – UbiFS • CramFS - The firmware-mod-kit provides a tool called ”uncramfs” to extract files [3]. 4. FIRMWARE FORMATS AND UNPACK- • ROMFS - Harald Welte has developed a tool called ING EXPLAINED END-TO-END ”romfschk” that can extract files [5]. In this section, we are going to look at the various archive • UbiFS - Unfortunately no easy way to extract - see [8]. and filesystem image formats that can be encountered when inspecting a packed firmware image. Depending on the firmware • xFAT - Mount as loopback device in Linux. • NTFS - Mount as loopback device in Linux. README.TXT: ASCII English text, with CRLF line terminators • ext2fs/ext3fs/ext4fs - Mount as loopback device in Linux seaenum.exe: MS-DOS , COFF for MS-DOS, DJGPP go32 DOS extender, UPX compressed • iHEX - Convert to elf or binary by doing $ less flash.bat set exe=fdl464.exe objcopy -I ihex -O elf32-little set family=Moose objcopy -I ihex-O binary set model1=MAXTOR STM3750330AS set model2=MAXTOR STM31000340AS • SREC/S19 - Convert to elf or binary by doing rem set model3= rem set firmware=MX1A4d.lodd objcopy -I srec -O elf32-little set cfgfile=6_8hmx1a.txs objcopy -I srec-O binary set options=-s -x -b -v -a 20 ... • PJL :SEAFLASH1 %exe% -m %family% %options% -h %cfgfile% • CPIO/Ar/Tar/GZip/BZip/LZxxx/RPM - Your favorite if errorlevel 2 goto WRONGMODEL1 should provide tools to handle these if errorlevel 1 goto ERROR archive formats. goto DONE

5. EXERCISES AND SOLUTIONS Unpacking the firmware (Summary).

5.1 Reversing a Seagate HDD’s firmware file • We have unpacked the various wrappers, layers, archives format and filesystems of the firmware In this excercise, we want to inspect a firmware for an – ISO → DOS IMG → ZIP → LOD embedded system that does not have a known operating system, nor a known firmware file format. • The firmware is flashed on the HDD in a DOS envi- The first step is to obtain the firmware for the MooseDT ronment (FreeDOS) MX1A-3D4D-DMax22 from the Seagate website [6]. Then, the obtained file needs to be unpacked until the actual firmware • The update is run by executing a DOS batch file (flash.bat) file is found within. • There are Unpacking the firmware. – a firmware flash tool (fdl464.exe) A quite stupid and boring mechanic task: – a configuration for that tool (6 8hmx1a.txs, en- crypted or obfuscated/encoded) $ 7z x MooseDT-MX1A-3D4D-DMax22.iso -oimage $ cd image – the actual firmware (MX1A4d.lod) $ ls [BOOT] DriveDetect.exe FreeDOS README.txt • The firmware file is not in a binary format known to $ cd \[BOOT\]/ file and magic tools $ ls Bootable_1.44M.img → Let’s have a look at the firmware file! $ file Bootable_1.44M.img Bootable_1.44M.img: DOS floppy 1440k, x86 hard disk boot sector Inspecting the firmware file: hexdump. $ mount -o loop Bootable_1.44M.img /mnt $ hexdump - MX1A4d.lod $ mkdir disk 00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 07 00 |...... | $ cp -r /mnt/ disk/ 00000010 80 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |...... | * 00000020 00 00 00 00 00 22 00 00 00 00 00 00 00 00 00 00 |....."...... | $ cd disk 00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 79 dc |...... y.| $ ls 00000040 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |...... | * AUTOEXEC.BAT COMMAND.COM CONFIG.SYS HIMEM.EXE 000001c0 0e 10 14 13 02 00 03 10 00 00 00 00 ff 10 41 00 |...... A.| KERNEL.SYS MX1A3D4D.ZIP RDISK.EXE TDSK.EXE 000001d0 00 20 00 00 ad 03 2d 00 13 11 15 16 11 13 07 20 |. ....-...... | 000001e0 00 00 00 00 40 20 00 00 00 00 00 00 00 00 00 00 |....@ ...... | unzip.exe 000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3f 1d |...... ?.| $ mkdir archive 00000200 00 c0 49 00 00 00 2d 00 10 b5 27 48 40 68 41 42 |..I...-...’H@hAB| 00000210 26 48 00 f0 78 ee 10 bd 10 b5 04 1c ff f7 f4 ff |&H..x...... | $ cd archive 00000220 a0 42 03 d2 22 49 40 18 00 1b 10 bd 00 1b 10 bd |.B.."I@...... | $ unzip ../MX1A3D4D.ZIP 00000230 1d 48 40 68 40 42 70 47 10 b5 01 1c ff f7 f8 ff |.H@h@BpG...... | 00000240 41 1a 0f 20 00 f0 5e ee 10 bd 7c b5 04 1c 20 1c |A.. ..^...|... .| $ ls 00000250 00 21 00 90 17 a0 01 91 0c c8 00 98 00 f0 f2 ed |.!...... | 6_8hmx1a.txs CHOICE.EXE FDAPM.COM fdl464.exe 00000260 01 da 00 f0 ed ff ff f7 cf ff 05 1c 28 1c ff f7 |...... (...| 00000270 d3 ff a0 42 fa d3 7c bd 7c b5 04 1c 20 01 00 1b |...B..|.|...... | flash.bat LIST.COM MX1A4d.lod README.TXT 00000280 00 21 00 90 0b a0 01 91 0c c8 00 98 00 f0 da ed |.!...... | seaenum.exe ... $ file * 6_8hmx1a.txs: ASCII text, with CRLF line terminators → The header did not look familiar to me :( CHOICE.EXE: MS-DOS executable, MZ for MS-DOS FDAPM.COM: FREE-DOS executable (COM), UPX compressedInspecting the firmware file: strings. fdl464.exe: MS-DOS executable, COFF for MS-DOS, DJGPP go32 DOS extender, UPX compressed $ strings MX1A4d.lod flash.bat: DOS batch file, ASCII text, with CRLF ... XlatePhySec, h[Sec],[NumSecs] line terminators XlatePhySec, p[Sec],[NumSecs] LIST.COM: DOS executable (COM) XlatePlpChs, d[Cyl],[Hd],[Sec],[NumSecs] XlatePlpChw, f[Cyl],[Hd],[Wdg],[NumWdgs] MX1A4d.lod: data XlateSfi, D[PhyCyl],[Hd],[Sfi],[NumSfis] that there are some clearly separate sections in the file, for example the shorter section in the beginning, separated by a sequence of 0xFF-bytes (white) from the next huge block, then another short separation and another block that shows a more regular pattern at its end. Finally there are three sections of different sizes in the end of the file, separated by 0x00-bytes (black).

Identifying the CPU instruction set.

• ARM: Look out for bytes in the form of 0xeX that occur every 4th byte. The highest nibble of the in- struction word in ARM is the condition field, whose value 0xe means AL, execute this instruction uncondi- tionally. The instruction space is populated sparsely, so a disassembly will quickly end in an invalid instruc- tion or lots of conditional instructions.

Figure 1: Bin2bmp output for the firmware file • Thumb: Look out for words with the pattern 0xF000F000 (bl/blx), 0xB500BD00 (”pop XXX, pc” followed by ”push XXX, lr”), 0x4770 (bx lr). The Thumb instruc- XlateWedge, t[Wdg],[NumWdgs] ChannelTemperatureAdj, U[TweakTemperature],[Partition],[Hd],[Zone],[Opts] tion set is much denser than the ARM instruction set, WrChs, W[Sec],[NumSecs],,[PhyOpt],[Opts] EnableDisableWrFault, u[Op] so a disassembly will go for a long time before hitting WrLba, W[Lba],[NumLbas],,[Opts] an invalid instruction. WrLongOrSystemChs, w[LongSec],[LongSecsOrSysSec],[SysSecs],[LongPhySecOpt],,[SysOpts] RwPowerAsicReg, V[RegAddr],[RegValue],[WrOpt] WrPeripheralReg, s[OpType],[RegAddr],[RegValue],[RegMask],[RegPagAddr] WrPeripheralReg, t[OpType],[RegAddr],[RegValue],[RegMask],[RegPagAddr] In general, you should either know the processor already ... from the reconnaissance phase, or you try to disassemble → Strings are visible, meaning the program is neither en- parts of the file with a disassembler for the processor you crypted nor compressed suspect the code was compiled for. In the visual represen- → We actually know these strings ... they are from the tation, executable code should be mostly colorful (dense in- diagnostic menu’s help! struction sets) or display patterns (sparse instruction sets). In our firmware, searching for ”e?” in the hexdump leads Inspecting the firmware file: binwalk. us to: 00002420 04 e0 4e e2 00 40 2d e9 00 e0 4f e1 00 50 2d e9 |[email protected].| $ binwalk MX1A4d.lod 00002430 db f0 21 e3 8f 5f 2d e9 18 10 9f e5 00 00 91 e5 |..!.._-...... | 00002440 30 ff 2f e1 8f 5f bd e8 d1 f0 21 e3 00 50 bd e8 |0./.._....!..P..| DECIMAL HEX DESCRIPTION 00002450 0e f0 69 e1 00 80 fd e8 44 00 00 00 08 20 fe 01 |..i.....D...... | ------00002460 94 00 00 00 00 30 a0 e1 0c ce 9f e5 01 00 a0 e1 |.....0...... | 499792 0x7A050 Zip archive data, compressed size: 48028, 00002470 10 40 2d e9 14 10 93 e5 be c3 dc e1 d0 10 d1 e1 |.@-...... | uncompressed size: 785886, name: "" 00002480 08 e0 93 e5 02 20 8c e0 92 01 01 e0 20 c0 e0 e3 |...... | 00002490 81 22 61 e0 01 25 62 e0 42 29 a0 e1 82 0c 62 e1 |."a..%b.B)....b.| $ dd if=MX1A4d.lod of=/tmp/bla.bin bs=1 skip=499792 000024a0 d8 cd 9f e5 82 11 81 e0 c6 20 51 e2 42 20 81 42 |...... Q.B .B| $ unzip -l /tmp/bla.bin 000024b0 81 10 8c e0 f0 10 d1 e1 82 20 8c e0 04 c0 93 e5 |...... | Archive: /tmp/bla.bin 000024c0 f0 20 d2 e1 ac 01 2c e1 8e c2 2c e1 00 c0 83 e5 |. ....,...,.....| End-of-central-directory signature not found. Either this file is not 000024d0 ac cd 9f e5 fc c9 dc e1 00 00 5c e3 10 40 bd a8 |...... \..@..| a zipfile, or it constitutes one disk of a multi-part archive. In the 000024e0 8e 1a 04 aa 10 80 bd e8 f0 41 2d e9 94 7d 9f e5 |...... A-..}..| latter case the central directory and zipfile comment will be found on 000024f0 80 40 a0 e1 07 00 54 e3 00 50 a0 e1 f7 6f 47 e2 |[email protected].| the last disk(s) of this archive. unzip: cannot find zipfile directory in one of /tmp/bla.bin or /tmp/bla.bin.zip, and cannot find /tmp/bla.bin.ZIP, period. Let’s verify that this is indeed ARM code ...

$ dd if=MX1A4d.lod bs=1 skip=$(( 0x2420 )) > /tmp/bla.bin → binwalk does not know this firmware, the contained archive $ arm-none-eabi-objdump -b binary -m arm -D /tmp/bla.bin was apparently a false positive. /tmp/bla.bin: file format binary

Inspecting the firmware file: Visualization. Disassembly of section .data: To spot different sections in a binary file, a visual repre- 00000000 <.data>: 0: e24ee004 sub lr, lr, #4 sentation can be helpful. 4: e92d4000 stmfd sp!, lr 8: e14fe000 mrs lr, SPSR c: e92d5000 push ip, lr • HexWorkshop is a commercial program for Windows. 10: e321f0db msr CPSR_c, #219 ; 0xdb Most complete featureset (Hex editor, visualisation, 14: e92d5f8f push r0, r1, r2, r3, r7, r8, r9, sl, fp, ip, lr 18: e59f1018 ldr r1, [pc, #24] ; 0x38 ...) [4] 1c: e5910000 ldr r0, [r1] 20: e12fff30 blx r0 24: e8bd5f8f pop r0, r1, r2, r3, r7, r8, r9, sl, fp, ip, lr • Binvis is a project on google code for different binary 28: e321f0d1 msr CPSR_c, #209 ; 0xd1 2c: e8bd5000 pop ip, lr visualisation methods. Visualisation is ok, but the pro- 30: e169f00e msr SPSR_fc, lr 34: e8fd8000 ldm sp!, pc^ gram seems unfinished. [2]. 38: 00000044 andeq r0, r0, r4, asr #32 3c: 01fe2008 mvnseq r2, r8 • Bin2bmp is a very simple python script that computes 40: 00000094 muleq r0, r4, r0 44: e1a03000 mov r3, r0 a bitmap from your binary [1]. 48: e59fce0c ldr ip, [pc, #3596] ; 0xe5c You can see the output of bin2bmp in figure 5.1. The out- put of the other tools is very similar to this plot. You can see → Looks good! Navigating the firmware. Reconnaissance. In this paragraph, we look at several starting points for Web-interface of 51110.2.1800.96.bin a more in-depth analysis of the firmware contained in the firmware file. The first method is to look for the stack setup • first, quick-explore the web-interface that has to happen for each ARM processor mode before a system can actually call functions. This typically happens • lighttpd-based in a sequence of ”msr CPSR c, XXX” instructions, which switch the CPU mode, and assignments to the stack pointer. – sudo apt-get install lighttpd php5-cgi The msr instruction exists only in ARM mode (not true for – sudo lighty-enable-mod fastcgi Thumb2 any more ... :( ) Very close you should also find some coprocessor initializations (mrc/mcr). – sudo lighty-enable-mod fastcgi-php

18a2c: e3a000d7 mov r0, #215 ; 0xd7 – sudo service lighttpd force-reload 18a30: e121f000 msr CPSR_c, r0 18a34: e59fd0cc ldr sp, [pc, #204] ; 0x18b08 18a38: e3a000d3 mov r0, #211 ; 0xd3 18a3c: e121f000 msr CPSR_c, r0 • then, we want to emulate the web-interface on a PC 18a40: e59fd0c4 ldr sp, [pc, #196] ; 0x18b0c 18a44: ee071f9a mcr 15, 0, r1, cr7, cr10, 4 18a48: e3a00806 mov r0, #393216 ; 0x60000 – requires tweaking $VICON_JFFS2/etc/lighttpd/lighttpd.conf 18a4c: ee3f1f11 mrc 15, 1, r1, cr15, cr1, 0 18a50: e1801001 orr r1, r0, r1 18a54: ee2f1f11 mcr 15, 1, r1, cr15, cr1, 0 – requires some minor development and fixes

A second method is to find the exception handler table that contains the branches to the actual exception handlers. Tweaking. This piece of firmware is important as it reveals informa- Tweaking $VICON JFFS2/etc/lighttpd/lighttpd.conf tion about the address spaces from where code is run. In the ARMv5 architecture, exceptions are handled by ARM • correct document-root instructions in a table at address 0. Normally these have the form ”ldr pc, XXX” and load the program counter with • replace /mnt/www.nf with $VICON_JFFS2/mnt/www.nf a value stored relative to the current program counter (i.e. in a table from address 0x20 on). • set port to 1337 → The exception vectors give an idea of which addresses are used by the firmware. • set errorlog and accesslog arm-none-eabi-objdump -b binary -m arm -D MX1A4d.lod \ • create plain -auth password file | grep -E ’ldr\s+pc’ | less

→ We get the following output from arm-none-eabi-objdump • set auth.backend.plain.userfile

220e4: e59ff018 ldr pc, [pc, #24] ; 0x22104 220e8: e59ff018 ldr pc, [pc, #24] ; 0x22108 • replace all .fcgi files with a generic action.bottle.fcgi.py 220ec: e59ff018 ldr pc, [pc, #24] ; 0x2210c 220f0: e59ff018 ldr pc, [pc, #24] ; 0x22110 220f4: e59ff018 ldr pc, [pc, #24] ; 0x22114 • enable .py as FastCGI in $VICON_JFFS2/etc/lighttpd/lighttpd.conf 220f8: e1a00000 nop ; (mov r0, r0) 220fc: e59ff018 ldr pc, [pc, #24] ; 0x2211c 22100: e59ff018 ldr pc, [pc, #24] ; 0x22120 22104: 0000a824 andeq sl, r0, r4, lsr #16 22108: 0000a8a4 andeq sl, r0, r4, lsr #17 Developing. 2210c: 0000a828 andeq sl, r0, r8, lsr #16 22110: 0000a7ec andeq sl, r0, ip, ror #15 Writing a stub action.bottle.fcgi.py 22114: 0000a44c andeq sl, r0, ip, asr #8 22118: 00000000 andeq r0, r0, r0 2211c: 0000a6ac andeq sl, r0, ip, lsr #13 • sudo apt-get install python-pip python-setuptools 22120: 00000058 andeq r0, r0, r8, asr r0 5.2 Exploring the Firmware of Vicon IPCAM • sudo pip install bottle 960 series Fuzz/pentest/debug. Downloading and Unpacking. Running and debugging web-interface of 51110.2.1800.96.bin

• Getting 51110.2.1800.96.bin • iterative-fixing approach

• Unpacking 51110.2.1800.96.bin • sudo lighttpd -D -f $VICON_JFFS2/etc/lighttpd/lighttpd.conf

– $VICON_JFFS2 is the unpacked JFFS2 image in- • check lighttpd logs for startup errors side 51110.2.1800.96.bin • check Firefox web-developer console for client/server • Exploring 51110.2.1800.96.bin web-interface errors

– $VICON_JFFS2/etc/lighttpd/lighttpd.conf – console shows we need to define INFO_SWVER in- side info.js – $VICON_JFFS2/mnt/www.nf – start from above by restarting lighttpd 6. CONCLUSIONS [16] A. Costin. Postscript(um): You’ve been hacked, 2012. We have presented on embedded devices and their under- http://andreicostin.com/papers/. lying firmware from a security point of view. We surveyed [17] A. Cui, M. Costello, and S. J. Stolfo. When firmware the related work and existing state of the art. Along the way, modifications attack: A case study of embedded we introduced most commong firmware formats, challenges exploitation. In Proceedings of the Symposium on they bring to the game, as well as the tools and techniques Network and Distributed System Security (NDSS), to unpack and analyze them. We also introduce on the topic 2013. of firmware emulation for easier and faster vulnerability dis- [18] A. Cui, Y. Song, P. V. Prabhu, and S. J. Stolfo. Brave covery. We finish the presentation with hands-on workshop new world: Pervasive insecurity of embedded network exercises and solutions to help the audience to better under- devices. In Recent Advances in Intrusion Detection, stand the presented material. pages 378–380. Springer, 2009. We conclude with the following: [19] A. Cui and S. J. Stolfo. A quantitative analysis of the insecurity of embedded network devices: results of a • though firmware reverse engineering is becoming eas- wide-area scan. In Proceedings of the 26th Annual ier, there are still many challenges related to their un- Applications Conference, pages packing and full-blown analysis 97–106. ACM, 2010. • firmware images rely on security by obscurity, rather [20] G. Delugr´e. Closer to metal: reverse-engineering the than security in-depth broadcom netextreme’s firmware. Hack. lu, pages 27–29, 2010. • many trivial and non-trivial security vulnerabilites can [21] L. Duflot, Y.-A. Perez, and B. Morin. Run-time be found just by using existing tools and frameworks firmware integrity verification: what if you can’t trust your network card, 2011. 7. REFERENCES [22] L. Duflot, Y.-A. Perez, and B. Morin. What if you [1] Bin2bmp on sourceforge. can’t trust your network card? In Recent Advances in . http://sourceforge.net/projects/bin2bmp/ Intrusion Detection, pages 378–397. Springer, 2011. [2] Binvis on google project. [23] B. Gourdin, C. Soman, H. Bojinov, and E. Bursztein. http://code.google.com/p/binvis/. Towards secure embedded web interfaces. In Usenix [3] firmware-mod-kit on github. Security(Usenix Security), August 2011. https://github.com/mirror/firmware-mod-kit/ [24] Y. Li, J. M. McCune, and A. Perrig. Viper: verifying tree/master/src/uncramfs. the integrity of peripherals’ firmware. In Proceedings [4] Hexworkshop website. of the 18th ACM conference on Computer and http://www.hexworkshop.com/. Communications Security, pages 3–16. ACM, 2011. [5] Reveng tools. http: [25] L. R. McMinn. External verification of scada system //svn.gnumonks.org/trunk/reveng-tools/romfsck/. embedded controller firmware. Technical report, DTIC [6] Seagate moosedt mx1a-3d4d dmax22 firmware Document, 2012. download. http://www.seagate.com/staticfiles/ [26] C. Mulliner and B. Mich´ele. Read it twice! a support/downloads/firmware/ mass-storage-based tocttou attack. In Proceedings of MooseDT-MX1A-3D4D-DMax22.iso. the 6th USENIX conference on Offensive Technologies, [7] Squashfs website. http://www.squashfs-lzma.org/. pages 11–11. USENIX Association, 2012. [8] Ubifs faq. http://www.linux-mtd.infradead.org/ faq/ubifs.html#L_ubifs_extract. [9] yaffs2utils on google code. https://code.google.com/p/yaffs2utils/. [10] Z. Basnight, J. Butts, J. Lopez Jr, and T. Dube. Firmware modification attacks on programmable logic controllers. International Journal of Critical Infrastructure Protection, 2013. [11] A. Blanco and M. Eissler. One firmware to monitor’em all. 2012. [12] H. Bojinov and E. Bursztein. Embedded management interfaces emerging massive insecurity. In BlackHat USA 2009(BlackHat USA 09), July 2009. [13] H. Bojinov, E. Bursztein, and D. Boneh. Xcs cross channel scripting and its impact on web applications. In Computer and Communications Security (CCS), November 2009. [14] V. Chipounov and G. Candea. Reverse engineering of binary device drivers with revnic. In Proceedings of the 5th European conference on Computer systems, pages 167–180. ACM, 2010. [15] A. Costin. Hacking printers for fun and profit, 2010 – 2011. http://andreicostin.com/papers/.