In Lieu of Swap: Analyzing Compressed RAM in Mac OS X and Linux
Golden G. Richard III
University of New Orleans and Arcane Alloy, LLC
Andrew Case
The Volatility Project
2 Who?
Professor of Computer Science and University Research Professor, Director, GNOCIA, University of New Orleans h p://www.cs.uno.edu/~golden Digital forensics, OS internals, reverse engineering, offensive compu ng, pushing students to the brink of destruc on, et al.
Founder, Arcane Alloy, LLC. h p://www.arcanealloy.com Digital forensics, reverse engineering, malware analysis, security research, tool development, training.
Co-Founder, Partner / Photographer, High ISO Music, LLC. h p://www.highisomusic.com Rock stars. Heavy Metal. Earplugs.
© 2014 by Golden G. Richard III and Andrew Case 3 Forensics, Features, Privacy
• New OS features may nega vely forensics • e.g.: – Na ve whole disk encryp on – Encrypted swap – Secure Recycle Bin / Trash facili es – Secure erasure during disk forma ng • Today’s topic: Compressed RAM • Claim: posi vely impacts forensics • First some background
© 2014 by Golden G. Richard III and Andrew Case 4 Where’s the Evidence?
Files and Filesystem Applica on Windows Deleted Files metadata metadata registry
Print spool Hiberna on Temp files Log files files files
Browser Network Slack space Swap files caches traces
RAM: OS and app data Memory analysis “Tradi onal” structures storage forensics © 2014 by Golden G. Richard III and Andrew Case 5 Memory Analysis on RAM Dump
• Processes (dead/alive) e.g., discover unauthorized programs • Open files e.g., detect keystroke loggers, hidden processes
• Network connections e.g., find backdoors, connections to contraband sites • Volatile registry contents • Volatile application data e.g., plaintext for encrypted material, chat messages, email fragments • Other OS structures e.g., analysis of memory allocators to retrieve interesting structures, filesystem cache, etc. • Encryption keys e.g., keys for whole disk encryption schemes
What’s missing? © 2014 by Golden G. Richard III and Andrew Case 6 Memory Analysis: Swap Files
“Tradi onal” forensics: capture contents of storage devices, including the swap file swap file on disk Swap file contains RAM “overflow”—if you don’t have this data, you probably don’t have a complete memory dump
Live forensics / memory analysis: capture RAM contents RAM 00000000000000000000000000000000 0
11110010010011010010010101010101 4096 If you acquire both RAM and disk contents, 01000001010010010010001000100010 … … memory smearing is likely and the swap file 111111111111111111111111111111111 may be (very) out of sync with the memory . . dump 010101010101010101010101010101011 2GB
© 2014 by Golden G. Richard III and Andrew Case 7 Forensically, Swap Files are a Mess
Data trapped order in unsani zed RAM dump Swap file blocks allocated to swap file
encrypted swap void swap_crypt_ctx_initialize ( void ) { unsigned int i; 554-88-2345 if ( swap_crypt_ctx_initialized == FALSE ) {
for (i = 0; i < ( sizeof ( swap_crypt_key ) / [email protected] sizeof ( swap_crypt_key [0])); i++) { swap_crypt_key [i] = random(); } aes_encrypt_key ( murder (const unsigned char *) swap_crypt_key , SWAP_CRYPT_AES_KEY_SIZE , &swap_crypt_ctx.encrypt); struct { ... swap_crypt_ctx_initialized = TRUE ; int ip; } … } } netstat; © 2014 by Golden G. Richard III and Andrew Case 8 Background: Memory Management 2GB • Need to allocate RAM to: – Opera ng system – Individual applica ons – All applica ons • Possibly fairly, or some other policy • OS’s need to manage memory efficiently to feed RAM-hungry apps • Ever-increasing “need” for more memory
© 2014 by Golden G. Richard III and Andrew Case 9 Virtual Memory
• Provides con guous, linear address space for processes • Virtual address space is divided into fixed-size pages – Physical memory is divided into pages of same size – On x86 Intel, these are normally 4K • OS keeps track of free and allocated pages • Processes get only as many pages as they need © 2014 by Golden G. Richard III and Andrew Case 10 Logical vs. Physical Addresses
• Virtual address: Address whose scope is a par cular process or OS kernel • Physical address: Loca on in physical RAM
Process P2 0 Process P1 RAM 4096 Process P3 0 0 … 4096 4096 0 … 4096 … … … … … . . … . . 4GB . . . 2GB 4GB . 4GB
© 2014 by Golden G. Richard III and Andrew Case 11 Paging: (Very) Simplified View
• Page tables (per process) are used to translate logical to physical addresses • Pages for a par cular process are generally not in con guous order in RAM
Process P1 P1 page table RAM 0 01000001010010010010001000100010 2 0 4096 11110010010011010010010101010101 1 11110010010011010010010101010101 4096 … 01000001010010010010001000100010 … … …
...... 2GB 4GB
© 2014 by Golden G. Richard III and Andrew Case 12 64-bit Mode: 4-level Page Tables: 4K Pages
© 2014 by Golden G. Richard III and Andrew Case 13 Stretching RAM
• Want to allocate more memory to processes than total amount of RAM • Extend RAM by swapping pages to disk and retrieving as necessary • Allows more (and bigger) processes to execute • Workable, because: – Not all execu ng processes are ac ve at once – Processes typically have “working sets” of pages, e.g., memory they’re ac vely using “now”
© 2014 by Golden G. Richard III and Andrew Case 14 Virtual Memory with Swapping For good performance, can’t overdo it--the pipe swap file between RAM and disk is 100MB/sec on disk small
Otherwise: THRASHING
Process P1 P1 page table RAM 0 01000001010010010010001000100010 2 00000000000000000000000000000000 0 4096 11110010010011010010010101010101 1 11110010010011010010010101010101 4096 … 01001001010100111111111111111000 Not present 01000001010010010010001000100010 … … 11111111111111111111000011111111 Not present … 111111111111111111111111111111111
...... 010101010101010101010101010101011 2GB 4GB
© 2014 by Golden G. Richard III and Andrew Case 15 Virtual Memory with Swapping
SSDs substan ally increase size of pipe swap file 500GB/sec or so on SSD
Process P1 P1 page table RAM 0 01000001010010010010001000100010 2 00000000000000000000000000000000 0 4096 11110010010011010010010101010101 1 11110010010011010010010101010101 4096 … 01001001010100111111111111111000 Not present 01000001010010010010001000100010 … … 11111111111111111111000011111111 Not present … 111111111111111111111111111111111
...... 010101010101010101010101010101011 2GB 4GB
© 2014 by Golden G. Richard III and Andrew Case 16 Virtual Memory with Swapping up to 1.2GB/sec+ swap file on PCIe (e.g., new Mac Pro, flash storage newest iMacs)
Process P1 P1 page table RAM 0 01000001010010010010001000100010 2 00000000000000000000000000000000 0 4096 11110010010011010010010101010101 1 11110010010011010010010101010101 4096 … 01001001010100111111111111111000 Not present 01000001010010010010001000100010 … … 11111111111111111111000011111111 Not present … 111111111111111111111111111111111
...... 010101010101010101010101010101011 2GB 4GB
© 2014 by Golden G. Richard III and Andrew Case 17 Avoiding Swapping by Compressing RAM
• Op miza on: – “Hoard” some RAM, not available to processes – When RAM is running low, compress pages in memory instead of swapping to disk • Why bother? – SSDs are fast – PCIe solid state storage is “crazy” fast – Memory is ever cheaper • So…why?
© 2014 by Golden G. Richard III and Andrew Case 18 It’s Worth It
• Ultraportable laptops s ll RAM limited – e.g., current Macbook Air: Base 4GB, Max 8GB RAM • Virtualiza on, even on laptops • Malware analysts, VMWare Worksta on, Fusion, etc. • Swapping can cause serious performance issues • New Mac Pro memory bandwidth: 60GB/sec • Max swap bandwidth: 1.2GB/sec • Modern CPUs compress and decompress very efficiently
© 2014 by Golden G. Richard III and Andrew Case 19 Virtual Memory with Compressed RAM
In Mac OS X Mavericks and Linux, set aside swap file some RAM and compress pages that would on SSD have been swapped due to memory pressure
Process P1 P1 page table RAM 0 01000001010010010010001000100010 2 00000000000000000000000000000000 4096 11110010010011010010010101010101 1 11110010010011010010010101010101 … 01001001010100111111111111111000 Not present 01000001010010010010001000100010 … 11111111111111111111000011111111 Not present 111111111111111111111111111111111 0100100101 . . 0100100101 compressed . . 0100100101
0100100101 pages 4GB
© 2014 by Golden G. Richard III and Andrew Case 20
© 2014 by Golden G. Richard III and Andrew Case 21 It’s Back
• RAM Doubler -- ~20 years ago • Historically RAM compression not very effec ve – Processors too slow to offset compress/decompress costs – Poor integra on with OS • Old academic paper (2003) on compression schemes for RAM: – M. Simpson, R. Barua, S. Biswas, Analysis of Compression Algorithms for Program Data, Tech. rep., U. of Maryland, ECE department, 2003. • Compression scheme in Mavericks (WKdm) was invented in 1999: – P. R. Wilson, S. F. Kaplan, and Y. Smaragdakis, The Case for Compressed Caching in Virtual Memory Systems, Proceedings of the USENIX Annual Technical Conference, 1999. • Now, super fast CPUs, lots of cores, ght OS integra on
© 2014 by Golden G. Richard III and Andrew Case 22 Compressed RAM: Forensic Impact
Currently, “simultaneous” capture results in lots of smearing swap file on disk With compressed RAM, may be li le or no data swapped out—capturing RAM captures (some or all) “swap”!
PHYSICAL RAM
No support in current tools—compressed 00000000000000000000000000000000 0 data opaque or ignored 11110010010011010010010101010101 4096 01000001010010010010001000100010 … 111111111111111111111111111111111 …
Our tools enable analysis of . . compressed RAM 010101010101010101010101010101011 2GB © 2014 by Golden G. Richard III and Andrew Case 23 Mavericks VM: Too Much Detail
© 2014 by Golden G. Richard III and Andrew Case 24 Mac OS X Mavericks Implementa on
• Implemented as a pager • Fits into the standard Mach VM architecture • Compressor pager hoards some memory • Page in / out requests all go through the pager • Pages compressed / decompressed on demand • When compressor gets full, pushes compressed pages to swap file • On page fault, page is either: – Compressed and in RAM: just decompress – On disk: read from disk, decompress
© 2014 by Golden G. Richard III and Andrew Case 25 Mavericks Implementa on (2)
• Implementa on in C • Like most of Mach and BSD • …except for op mized version of WKdm • Pages are compressed and decompressed using WKdm • Mavericks: Hand-op mized 64-bit assembler version of Kaplan’s original C code
© 2014 by Golden G. Richard III and Andrew Case 26 Mach VM: C + asm
xnu-2422.1.72/ osfmk/vm: ~73K lines of C
xnu-2422.1.72/osfmk/x86_64/WKdm*.s: ~1000 lines of 64-bit assembler for WKdm_compress_new() / WKdm_decompress_new()
Just vm_compressor / vm_compressor_pager: ~4K lines of C + the assembler, above
© 2014 by Golden G. Richard III and Andrew Case 27 Compressor / Compressor Pager Internals
cpgr_slots! virtual(address( compressor_pager (per"vm_map_entry"object)"" to(slot(mapping(
compressor_object (global)"" c_size c_offset!
virtual(address(of(page( c_segments""
…"
c_slots!
compressor(
compressed"pages" c_buffer! © 2014 by Golden G. Richard III and Andrew Case 28 Linux
• zram/ zswap (circa 3.11+ kernel release) in Linux • zram is just a memory-based compressed swap device • zswap uses new FRONTSWAP facility in Linux • h p://lxr.free-electrons.com/source/ Documenta on/vm/frontswap.txt
© 2014 by Golden G. Richard III and Andrew Case int swap_writepage(struct page *page, … ) /mm/page_io.c 29 { int ret = 0; if (try_to_free_swap(page)) { unlock_page(page); e.g., zswap
goto out; Can accept the page if it } fits, or say no and if (frontswap_store(page) == 0) { regular swapping will occur set_page_writeback(page); unlock_page(page); end_page_writeback(page); goto out; e.g., zram , } way down in there ret = __swap_writepage(page, wbc, end_swap_bio_write); out: Page eventually wri en to disk return ret;
© 2014 by Golden G. Richard III and Andrew Case int swap_readpage(struct page *page) /mm/page_io.c 30 { … e.g., zswap … if (frontswap_load(page) == 0) { SetPageUptodate(page); unlock_page(page); goto out; } … e.g., zram, // lots of pain here it’s just a stranger block device // lots of pain here out: Page eventually read from disk return ret; } © 2014 by Golden G. Richard III and Andrew Case 31 Current Reality + Goal
Physical memory dump:
Process P2 Process P2 0 0 Process P1 Process P1 4096 4096 0 0 … … 4096 4096 … … … …
… . … . . .
. 4GB . 4GB . . 4GB 4GB © 2014 by Golden G. Richard III and Andrew Case 32 Vola lity
• Most popular memory analysis framework • Open source • Portable, wri en in Python K • Supports analysis of Windows, Linux, Mac • Plugins add new func onality • Our contribu on: New plugins to support compressed RAM analysis
© 2014 by Golden G. Richard III and Andrew Case 33 New Plugins for Vola lity
• mac_compressed_swap / linux_compressed_swap – Find, decompress, and dump all compressed pages – Emits compressor stats: • Compressor memory used : 19167408 bytes • Available uncompressed memory : 466462 pages • Available memory : 471251 pages • … • Required Python implementa on of decompression algorithms • Required detailed analysis of new compressor pager • But not en re page fault à decompression path
© 2014 by Golden G. Richard III and Andrew Case 34 New / Modified Plugins for Vola lity
• mac_dump_maps / linux_dump_maps – Dump address space for all (or individual) processes – Previous implementa on used Vola lity’s standard mechanism for emi ng 4K pages in address space – Simply skipped swapped pages – Now, decompressed if possible and made available • Involves deeper analysis of OS VM systems • Much more complex (not your problem, we’re not complaining) • We ♥ OS internals
© 2014 by Golden G. Richard III and Andrew Case 35 Representa ve Distribu on of Compressed Pages
90 85 80 75 70 65 60 55 50 45 40 35 30 # of processes 25 20 15 10 5 0
# of compressed pages Mac OS X 10.9, 2GB RAM, 124 processes, moderate memory pressure, 300MB compressed © 2014 by Golden G. Richard III and Andrew Case
36