
In Lieu of Swap: Analyzing Compressed RAM in Mac OS X and Linux

Golden G. Richard III

University of New Orleans and Arcane Alloy, LLC


Andrew Case

The Volatility Project


Who?

Professor of Computer Science and University Research Professor, Director, GNOCIA, University of New Orleans. http://www.cs.uno.edu/~golden Digital forensics, OS internals, reverse engineering, offensive computing, pushing students to the brink of destruction, et al.

Founder, Arcane Alloy, LLC. http://www.arcanealloy.com Digital forensics, reverse engineering, malware analysis, security research, tool development, training.

Co-Founder, Partner / Photographer, High ISO Music, LLC. http://www.highisomusic.com Rock stars. Heavy Metal. Earplugs.

© 2014 by Golden G. Richard III and Andrew Case

Forensics, Features, Privacy

• New OS features may negatively impact forensics
• e.g.:
– Native whole disk encryption
– Encrypted swap
– Secure Recycle Bin / Trash facilities
– Secure erasure during disk formatting
• Today’s topic: Compressed RAM
• Claim: positively impacts forensics
• First some background

Where’s the Evidence?

“Traditional” storage forensics:
• Files and deleted files
• Filesystem metadata
• Application metadata
• Windows registry
• Print spool files
• Hibernation files
• Temp files
• Log files
• Browser caches
• Network traces
• Slack space
• Swap files

Memory analysis:
• RAM: OS and app structures

Memory Analysis on RAM Dump

• Processes (dead/alive), e.g., discover unauthorized programs
• files, e.g., detect keystroke loggers, hidden processes
• Network connections, e.g., find backdoors, connections to contraband sites
• Volatile registry contents
• Volatile application data, e.g., plaintext for encrypted material, chat messages, email fragments
• Other OS structures, e.g., analysis of memory allocators to retrieve interesting structures, filesystem, etc.
• Encryption keys, e.g., keys for whole disk encryption schemes

What’s missing?

Memory Analysis: Swap Files

“Traditional” forensics: capture contents of storage devices, including the swap file on disk.

Swap file contains RAM “overflow”: if you don’t have this data, you probably don’t have a complete memory dump.

Live forensics / memory analysis: capture RAM contents.

If you acquire both RAM and disk contents, memory smearing is likely and the swap file may be (very) out of sync with the memory dump.

[Figure: swap file on disk alongside a 2GB RAM image (offsets 0, 4096, …, 2GB).]

Forensically, Swap Files are a Mess

[Figure: a RAM dump next to a swap file. Data is trapped in unsanitized blocks allocated to the swap file (e.g., 554-88-2345, an email address, “murder”, struct { ... int ip; } netstat;).]

Encrypted swap:

    void swap_crypt_ctx_initialize(void)
    {
        unsigned int i;

        if (swap_crypt_ctx_initialized == FALSE) {
            /* fill the key with random words */
            for (i = 0;
                 i < (sizeof(swap_crypt_key) / sizeof(swap_crypt_key[0]));
                 i++) {
                swap_crypt_key[i] = random();
            }
            aes_encrypt_key((const unsigned char *) swap_crypt_key,
                            SWAP_CRYPT_AES_KEY_SIZE,
                            &swap_crypt_ctx.encrypt);
            swap_crypt_ctx_initialized = TRUE;
        }
        …
    }

Background

• Need to allocate RAM to:
– Operating system
– Individual applications
– All applications
• Possibly fairly, or some other policy
• OSes need to manage memory efficiently to feed RAM-hungry apps
• Ever-increasing “need” for more memory


• Provides contiguous, linear address space for processes
• Address space is divided into fixed-size pages
– Physical memory is divided into pages of same size
– On , these are normally 4K
• OS keeps track of free and allocated pages
• Processes get only as many pages as they need

Logical vs. Physical Addresses

• Virtual address: Address whose scope is a particular process or OS kernel
• Physical address: Location in physical RAM

[Figure: processes P1, P2, and P3 each see their own virtual address space (0, 4096, …, 4GB), while physical RAM runs from 0 to 2GB.]

Paging: (Very) Simplified View

• Page tables (per process) are used to translate logical to physical addresses
• Pages for a particular process are generally not in contiguous order in RAM

[Figure: process P1 (0 to 4GB), P1's page table, and RAM (0 to 2GB); consecutive pages of P1 map to scattered locations in RAM.]

64-bit Mode: 4-level Page Tables: 4K Pages
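As a rough illustration of the slide title, the sketch below splits a 48-bit x86-64 virtual address into the four 9-bit table indices and the 12-bit page offset used with 4K pages. The function name and the sample address are made up for this example; it only does the bit arithmetic and does not walk real page tables.

```python
PAGE_SHIFT = 12                      # 4K pages: low 12 bits are the page offset
INDEX_BITS = 9                       # each table level has 512 entries
INDEX_MASK = (1 << INDEX_BITS) - 1


def split_virtual_address(vaddr):
    """Return (pml4, pdpt, pd, pt, offset) for a 48-bit virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    pt = (vaddr >> PAGE_SHIFT) & INDEX_MASK                        # bits 12-20
    pd = (vaddr >> (PAGE_SHIFT + INDEX_BITS)) & INDEX_MASK         # bits 21-29
    pdpt = (vaddr >> (PAGE_SHIFT + 2 * INDEX_BITS)) & INDEX_MASK   # bits 30-38
    pml4 = (vaddr >> (PAGE_SHIFT + 3 * INDEX_BITS)) & INDEX_MASK   # bits 39-47
    return pml4, pdpt, pd, pt, offset


# Hypothetical address built from known indices so the split is easy to check.
sample = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
indices = split_virtual_address(sample)
```

Each index selects one entry at its level; the final entry supplies the physical page, and the offset selects the byte within it.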

Stretching RAM

• Want to allocate more memory to processes than total amount of RAM
• Extend RAM by swapping pages to disk and retrieving as necessary
• Allows more (and bigger) processes to execute
• Workable, because:
– Not all executing processes are active at once
– Processes typically have “working sets” of pages, e.g., memory they’re actively using “now”

Virtual Memory with Swapping

For good performance, can’t overdo it: the pipe between RAM and the swap file on disk is small (100MB/sec on disk).

Otherwise: THRASHING

[Figure: process P1 (0 to 4GB), P1's page table, and RAM (0 to 2GB); some page table entries are marked “Not present”, with a swap file on disk.]

Virtual Memory with Swapping

SSDs substantially increase size of pipe: 500MB/sec or so to a swap file on SSD.

Virtual Memory with Swapping

Up to 1.2GB/sec+ to a swap file on PCIe flash storage (e.g., new Mac Pro, newest iMacs).

Avoiding Swapping by Compressing RAM

• Optimization:
– “Hoard” some RAM, not available to processes
– When RAM is running low, compress pages in memory instead of swapping to disk
• Why bother?
– SSDs are fast
– PCIe solid state storage is “crazy” fast
– Memory is ever cheaper
• So…why?

It’s Worth It

• Ultraportable laptops still RAM limited
– e.g., current MacBook Air: Base 4GB, Max 8GB RAM
• Virtualization, even on laptops
• Malware analysts, VMware Workstation, Fusion, etc.
• Swapping can cause serious performance issues
• New Mac Pro memory bandwidth: 60GB/sec
• Max swap bandwidth: 1.2GB/sec
• Modern CPUs compress and decompress very efficiently
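To put the two bandwidth figures above side by side, here is a back-of-the-envelope calculation. It assumes 1GB = 10^9 bytes and counts transfer time only, ignoring device latency and the CPU cost of compression, so it is an idealized comparison rather than a benchmark.

```python
PAGE_SIZE = 4096        # bytes in one 4K page
MEMORY_BW = 60e9        # bytes/sec, memory bandwidth figure from the slide
SWAP_BW = 1.2e9         # bytes/sec, max swap bandwidth figure from the slide


def transfer_time_us(nbytes, bandwidth):
    """Idealized time in microseconds to move nbytes at the given bandwidth."""
    return nbytes / bandwidth * 1e6


ratio = MEMORY_BW / SWAP_BW                           # how much wider the RAM pipe is
page_to_swap_us = transfer_time_us(PAGE_SIZE, SWAP_BW)
page_in_ram_us = transfer_time_us(PAGE_SIZE, MEMORY_BW)
```

Under these assumptions RAM is about 50 times wider than the fastest swap path, which is the headroom a compressor can spend on compress/decompress work.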

Virtual Memory with Compressed RAM

In Mac OS X Mavericks and Linux, set aside some RAM and compress pages that would have been swapped (to the swap file on SSD) due to memory pressure.

[Figure: process P1, P1's page table, and RAM (4GB); some page table entries are marked “Not present”, and part of RAM holds compressed pages.]


It’s Back

• RAM Doubler -- ~20 years ago
• Historically RAM compression not very effective
– Processors too slow to offset compress/decompress costs
– Poor integration with OS
• Old academic paper (2003) on compression schemes for RAM:
– M. Simpson, R. Barua, S. Biswas, Analysis of Compression Algorithms for Program Data, Tech. rep., U. of Maryland, ECE department, 2003.
• Compression scheme in Mavericks (WKdm) was invented in 1999:
– P. R. Wilson, S. F. Kaplan, and Y. Smaragdakis, The Case for Compressed Caching in Virtual Memory Systems, Proceedings of the USENIX Annual Technical Conference, 1999.
• Now, super fast CPUs, lots of cores, tight OS integration

Compressed RAM: Forensic Impact

Currently, “simultaneous” capture results in lots of smearing.

With compressed RAM, may be little or no data swapped out: capturing RAM captures (some or all) “swap”!

No support in current tools: compressed data opaque or ignored.

Our tools enable analysis of compressed RAM.

[Figure: physical RAM (0 to 2GB) and the swap file on disk.]

Mavericks VM: Too Much Detail

Mac OS X Mavericks Implementation

• Implemented as a pager
• Fits into the standard VM architecture
• Compressor pager hoards some memory
• Page in / out requests all go through the pager
• Pages compressed / decompressed on demand
• When compressor gets full, pushes compressed pages to swap file
• On page fault, page is either:
– Compressed and in RAM: just decompress
– On disk: read from disk, decompress
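The bullets above can be sketched as a toy model. This is not the XNU code: the class and method names are hypothetical, zlib stands in for WKdm, the pool limit is counted in compressed bytes, and the "swap file" is just a second dictionary.

```python
import zlib

PAGE_SIZE = 4096


class ToyCompressorPager:
    """Toy model: compressed pages stay in RAM until the pool is full,
    then the oldest compressed pages are pushed to a simulated swap file."""

    def __init__(self, pool_limit):
        self.pool_limit = pool_limit   # bytes of RAM hoarded for compressed pages
        self.pool = {}                 # page id -> compressed bytes (in RAM)
        self.swap = {}                 # page id -> compressed bytes ("on disk")

    def pool_used(self):
        return sum(len(blob) for blob in self.pool.values())

    def page_out(self, page_id, data):
        assert len(data) == PAGE_SIZE
        self.swap.pop(page_id, None)               # drop any stale copy
        self.pool[page_id] = zlib.compress(data)   # zlib is a stand-in for WKdm
        # Compressor full: push the oldest compressed pages to the swap file.
        while self.pool_used() > self.pool_limit and len(self.pool) > 1:
            oldest = next(iter(self.pool))         # dicts keep insertion order
            self.swap[oldest] = self.pool.pop(oldest)

    def page_in(self, page_id):
        if page_id in self.pool:                   # compressed and in RAM
            return zlib.decompress(self.pool.pop(page_id))
        # Otherwise read from the simulated swap file, then decompress.
        return zlib.decompress(self.swap.pop(page_id))


pages = {i: bytes([i]) * PAGE_SIZE for i in range(3)}
pager = ToyCompressorPager(pool_limit=1)   # tiny limit forces pushes to "swap"
for page_id, data in pages.items():
    pager.page_out(page_id, data)
```

With the tiny limit used here only the most recent page stays in the pool; a forensic tool that reads only RAM would recover that page but not the ones pushed to the swap file.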

Mavericks Implementation (2)

• Implementation in C
• Like most of Mach and BSD
• …except for optimized version of WKdm
• Pages are compressed and decompressed using WKdm
• Mavericks: Hand-optimized 64-bit assembler version of Kaplan’s original C code

Mach VM: C + asm

xnu-2422.1.72/ osfmk/: ~73K lines of C

xnu-2422.1.72/osfmk/x86_64/WKdm*.s: ~1000 lines of 64-bit assembler for WKdm_compress_new() / WKdm_decompress_new()

Just vm_compressor / vm_compressor_pager: ~4K lines of C + the assembler, above

Compressor / Compressor Pager Internals

[Figure: compressor pager internals. compressor_pager (per vm_map_entry object): cpgr_slots, the virtual address to slot mapping. compressor_object (global). Compressor: c_segments, c_slots (c_size, c_offset, virtual address of page), and c_buffer (compressed pages).]

Linux

• zram / zswap (circa 3.11+ kernel release) in Linux
• zram is just a memory-based compressed swap device
• zswap uses new FRONTSWAP facility in Linux
• http://lxr.free-electrons.com/source/Documentation/vm/frontswap.txt
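The frontswap idea (a backend may accept a page, or decline so that regular swapping proceeds) can be modeled in a few lines. This is a toy Python model, not the kernel API: the class, the function names, and the byte-based capacity are assumptions, and zlib stands in for whatever compressor a real backend uses. It borrows only the convention that a store or load returns 0 on success.

```python
import zlib


class ToyBackend:
    """Hypothetical in-memory compressed store: store() and load()
    return 0 on success and -1 otherwise."""

    def __init__(self, capacity):
        self.capacity = capacity   # max total compressed bytes kept in RAM
        self.used = 0
        self.pages = {}

    def store(self, key, data):
        old = self.pages.pop(key, None)        # replace any earlier copy
        if old is not None:
            self.used -= len(old)
        blob = zlib.compress(data)
        if self.used + len(blob) > self.capacity:
            return -1                          # "say no": caller uses the disk
        self.pages[key] = blob
        self.used += len(blob)
        return 0

    def load(self, key, out):
        blob = self.pages.get(key)
        if blob is None:
            return -1
        out[:] = zlib.decompress(blob)
        return 0


def swap_write(backend, disk, key, data):
    """Try the compressed backend first; fall back to the simulated disk."""
    if backend.store(key, data) == 0:
        return "ram"
    disk[key] = bytes(data)
    return "disk"


def swap_read(backend, disk, key):
    out = bytearray()
    if backend.load(key, out) == 0:
        return bytes(out)
    return disk[key]


page = b"A" * 4096
disk = {}
roomy = ToyBackend(capacity=1 << 20)
full = ToyBackend(capacity=0)              # always declines
where_roomy = swap_write(roomy, disk, 1, page)
where_full = swap_write(full, disk, 2, page)
```

From a forensic point of view, the first page never leaves RAM in this model, while the second exists only on the simulated disk.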

    /* /mm/page_io.c */
    int swap_writepage(struct page *page, … )
    {
        int ret = 0;

        if (try_to_free_swap(page)) {
            unlock_page(page);
            goto out;
        }
        /* e.g., zswap: can accept the page if it fits, or say no and
           regular swapping will occur */
        if (frontswap_store(page) == 0) {
            set_page_writeback(page);
            unlock_page(page);
            end_page_writeback(page);
            goto out;
        }
        /* e.g., zram, way down in there: page eventually written to disk */
        ret = __swap_writepage(page, wbc, end_swap_bio_write);
    out:
        return ret;
    }

    /* /mm/page_io.c */
    int swap_readpage(struct page *page)
    {
        …
        …
        /* e.g., zswap */
        if (frontswap_load(page) == 0) {
            SetPageUptodate(page);
            unlock_page(page);
            goto out;
        }
        …
        /* e.g., zram: it’s just a stranger block device;
           page eventually read from disk */
        // lots of pain here
        // lots of pain here
    out:
        return ret;
    }

Current Reality + Goal

Physical memory dump:

[Figure: virtual address spaces of processes P1 and P2 (0, 4096, …, 4GB each), shown twice: current reality and goal.]

Volatility

• Most popular memory analysis framework
• Open source
• Portable, written in Python
• Supports analysis of Windows, Linux, Mac
• Plugins add new functionality
• Our contribution: New plugins to support compressed RAM analysis

New Plugins for Volatility

• mac_compressed_swap / linux_compressed_swap
– Find, decompress, and dump all compressed pages
– Emits compressor stats:
    • Compressor memory used : 19167408 bytes
    • Available uncompressed memory : 466462 pages
    • Available memory : 471251 pages
    • …
• Required Python implementation of decompression algorithms
• Required detailed analysis of new compressor pager
• But not entire page fault → decompression path

New / Modified Plugins for Volatility

• mac_dump_maps / linux_dump_maps
– Dump address space for all (or individual) processes
– Previous implementation used Volatility’s standard mechanism for emitting 4K pages in address space
– Simply skipped swapped pages
– Now, decompressed if possible and made available
• Involves deeper analysis of OS VM systems
• Much more complex (not your problem, we’re not complaining)
• We ♥ OS internals
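The change in behavior described above (skip swapped pages before, decompress when possible now) might look roughly like this. It is a self-contained sketch, not the plugin code or the Volatility API: the page map layout, the function name, and the use of zlib in place of the real decompressors are all assumptions.

```python
import zlib

PAGE_SIZE = 4096


def dump_address_space(page_map, ram, compressed_store):
    """Yield (virtual address, page bytes) for every recoverable page.

    page_map maps a virtual address to one of (hypothetical layout):
      ("ram", physical offset), ("compressed", slot), ("swapped", None)
    """
    for vaddr in sorted(page_map):
        kind, where = page_map[vaddr]
        if kind == "ram":
            # Resident page: copy it straight out of the RAM image.
            yield vaddr, ram[where:where + PAGE_SIZE]
        elif kind == "compressed" and where in compressed_store:
            # Compressed in RAM: decompress instead of skipping it.
            yield vaddr, zlib.decompress(compressed_store[where])
        # Pages that went to the swap file on disk are still skipped.


ram = b"\x01" * PAGE_SIZE + b"\x02" * PAGE_SIZE
compressed_store = {5: zlib.compress(b"\x07" * PAGE_SIZE)}
page_map = {
    0x0000: ("ram", PAGE_SIZE),
    0x1000: ("compressed", 5),
    0x2000: ("swapped", None),
}
recovered = list(dump_address_space(page_map, ram, compressed_store))
```

In this example two of the three pages are recovered; the older approach would have emitted only the resident one.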

Representative Distribution of Compressed Pages

[Figure: histogram with x-axis # of compressed pages and y-axis # of processes (0 to 90). Mac OS X 10.9, 2GB RAM, 124 processes, moderate memory pressure, 300MB compressed.]