Measuring PROOF Lite performance in (non)virtualized environments

Ioannis Charalampidis, Aristotle University of Thessaloniki, Summer Student 2010

Overview
• Introduction
• Benchmarks: Overall execution time
• Benchmarks: In-depth analysis
• Conclusion

What am I looking for?
• There is a known overhead caused by the virtualization process
  ▫ How big is it?
  ▫ Where is it located?
  ▫ How can we minimize it?
  ▫ Which hypervisor has the best performance?
• I am using CernVM as the guest

What is CernVM?
• It is a baseline Virtual Software Appliance for use by the LHC experiments
• It is available for many hypervisors:
  ▫ Hyper-V
  ▫ KVM / QEMU
  ▫ VMware
  ▫ XEN
  ▫ VirtualBox

How am I going to find the answers?
• Use a standard data analysis application (ROOT + PROOF Lite) as the benchmark
• Test it on different hypervisors
• And on a varying number of workers/CPUs
• Compare the performance (physical vs. virtualized)

Problem
• The benchmark application requires too much time to complete (2 min ~ 15 min)
  ▫ At least 3 runs are required for reliable results
  ▫ The in-depth analysis overhead is about 40%
  ▫ It is not efficient to perform a detailed analysis for every CPU / hypervisor configuration
• Approach: create the overall execution time benchmarks first, then find the best configuration to run the traces on

Benchmarks performed
• Overall time
  ▫ Using the time utility and automated batch scripts
• In-depth analysis
  ▫ Tracing system calls using STrace and SystemTAP
  ▫ Analyzing the trace files using applications I wrote:
    BASST (Batch analyzer based on STrace)
    KARBON (general purpose application profiler based on trace files)

Process description and results

Benchmark Configuration
• Base machine
  ▫ Scientific Linux CERN 5
• Guests
  ▫ CernVM 2.1
• Software packages from SLC repositories
  ▫ Linux kernel 2.6.18-194.8.1.el5
  ▫ XEN 3.1.2 + 2.6.18-194.8.1.el5
  ▫ KVM 83-194.8.1.el5
  ▫ Python 2.5.4p2 (from AFS)
  ▫ ROOT 5.26.00b (from AFS)
• Base machine hardware
  ▫ 24 x Intel Xeon X7460 2.66 GHz with VT-x support (64-bit)
  ▫ No VT-d and no Extended Page Tables (EPT) hardware support
  ▫ 32 GB RAM

Benchmark Configuration
• Virtual machine configuration
  ▫ 1 CPU, then 2 to 16 CPUs in steps of 2
  ▫ <CPU#> + 1 GB RAM for the physical disk and network tests
  ▫ <CPU#> + 17 GB RAM for the RAM disk tests
  ▫ Disk image for the OS
  ▫ Physical disk for the data + software
• Important background services running
  ▫ NSCD (name service caching daemon)

Benchmark Configuration
• Caches were cleared before every test
  ▫ Page cache, dentries and inodes
  ▫ Using the /proc/sys/vm/drop_caches flag
• No swap memory was used
  ▫ Verified by periodically monitoring the free memory

Automated batch scripts
• The VM batch script runs on the host machine and drives a controlling script inside the VM, which in turn runs the benchmarks
  ▫ [Slide diagram: server (host + hypervisor) on one side, client script and benchmark processes inside the VM on the other]
• It repeats the following procedure (a sketch of this loop is shown after the list):
  ▫ Create a new virtual machine on the hypervisor
  ▫ Wait for the machine to finish booting
  ▫ Connect to the controlling script inside the VM
  ▫ Drop caches both on the host and the guest
  ▫ Start the job
  ▫ Receive and archive the results
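The batch scripts themselves are not included in the slides, so the following is only a minimal sketch of the loop described above. The start_vm.sh / stop_vm.sh wrappers, the run_benchmark.sh controlling script and the SSH/scp control channel are hypothetical stand-ins; only the CPU stepping and the /proc/sys/vm/drop_caches mechanism come directly from the slides.

```python
#!/usr/bin/env python3
"""Minimal sketch of the benchmark driver loop (not the original CERN script).

Assumptions: the VM is started/stopped through hypothetical wrapper scripts
(start_vm.sh / stop_vm.sh), the guest is reachable over SSH, and the results
are copied back with scp. Only the cache-dropping mechanism
(/proc/sys/vm/drop_caches) and the CPU stepping are taken from the slides.
"""
import subprocess
import time

GUEST = "root@cernvm-guest"               # hypothetical guest address
CPU_STEPS = [1] + list(range(2, 17, 2))   # 1 CPU, then 2..16 in steps of 2

def drop_caches_host():
    # Flush dirty pages, then drop page cache, dentries and inodes on the host.
    subprocess.run(["sync"], check=True)
    with open("/proc/sys/vm/drop_caches", "w") as f:
        f.write("3\n")

def drop_caches_guest():
    # Same thing inside the guest, over the (assumed) SSH control channel.
    subprocess.run(["ssh", GUEST, "sync; echo 3 > /proc/sys/vm/drop_caches"],
                   check=True)

def run_one(cpus):
    subprocess.run(["./start_vm.sh", str(cpus)], check=True)  # hypothetical helper
    time.sleep(120)            # crude "wait for boot"; the real script polled the VM
    drop_caches_host()
    drop_caches_guest()
    # Start the PROOF Lite job via the (hypothetical) controlling script in the VM.
    subprocess.run(["ssh", GUEST, "./run_benchmark.sh"], check=True)
    # Receive and archive the results.
    subprocess.run(["scp", f"{GUEST}:results.tar.gz", f"results-{cpus}cpu.tar.gz"],
                   check=True)
    subprocess.run(["./stop_vm.sh"], check=True)               # hypothetical helper

if __name__ == "__main__":
    for n in CPU_STEPS:
        run_one(n)
```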
Problem
• There was a bug in PROOF Lite: during the startup of each worker it looked up a non-existing hostname
  ▫ Example: 0.2-plitehp24.cern.ch-1281241251-1271
• Discovered by detailed system call tracing
  ▫ The hostname could not be cached
  ▫ The application had to wait for the DNS timeout
  ▫ The startup time was therefore delayed randomly
  ▫ Call tracing applications made this delay even bigger, virtually hanging the application

Problem
• The problem was resolved with:
  ▫ A minimal DNS proxy, developed to fake the existence of the buggy hostname
  ▫ It was later fixed in the PROOF source
• [Slide diagram: the application sends its DNS queries to the fake DNS proxy; a real name such as cernvm.cern.ch is forwarded to the DNS server and resolves normally (137.138.234.20), while the bogus worker names (x.x-xxxxxx-xxx-xxx) are answered immediately with 127.0.0.1]
• A minimal sketch of such a proxy is shown below
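This is not the proxy that was actually deployed; it is a minimal sketch of the idea under stated assumptions. The worker-name pattern (FAKE_PATTERN), the upstream resolver address (UPSTREAM_DNS) and the listen address are hypothetical; the only facts taken from the slides are that the bogus names should resolve instantly to 127.0.0.1 and that everything else should still reach the real DNS server.

```python
#!/usr/bin/env python3
"""Minimal sketch of a fake-DNS proxy (not the original CERN tool).

Assumptions: queries arrive over UDP, only A records matter, and the buggy
PROOF Lite hostnames match a simple pattern such as
'0.2-plitehp24.cern.ch-1281241251-1271'. Everything else is forwarded to a
real resolver (UPSTREAM_DNS is a placeholder address).
"""
import re
import socket
import struct

LISTEN_ADDR = ("127.0.0.1", 53)          # needs root to bind; clients point here
UPSTREAM_DNS = ("192.0.2.53", 53)        # assumed upstream resolver (placeholder)
FAKE_PATTERN = re.compile(r".*-\d+-\d+$")  # assumed worker-hostname pattern
FAKE_IP = "127.0.0.1"

def qname(packet):
    """Extract the queried name from a DNS request packet."""
    name, pos = [], 12                   # question section starts after the header
    while packet[pos]:
        length = packet[pos]
        name.append(packet[pos + 1:pos + 1 + length].decode())
        pos += length + 1
    return ".".join(name), pos + 1       # pos + 1 skips the terminating zero byte

def fake_answer(packet, qend):
    """Build a response that resolves the query to FAKE_IP."""
    # Copy the request ID, set "standard response, recursion available" flags,
    # one question and one answer record.
    header = packet[:2] + struct.pack(">HHHHH", 0x8180, 1, 1, 0, 0)
    question = packet[12:qend + 4]       # original QNAME + QTYPE + QCLASS
    # Answer: pointer to the name at offset 12, type A, class IN, TTL 60, 4-byte IP.
    answer = struct.pack(">HHHIH", 0xC00C, 1, 1, 60, 4) + socket.inet_aton(FAKE_IP)
    return header + question + answer

def serve():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(LISTEN_ADDR)
    while True:
        packet, client = sock.recvfrom(512)
        name, qend = qname(packet)
        if FAKE_PATTERN.match(name):
            sock.sendto(fake_answer(packet, qend), client)   # short-circuit the bug
        else:
            # Relay anything legitimate (e.g. cernvm.cern.ch) to the real resolver.
            relay = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
            relay.settimeout(3.0)
            relay.sendto(packet, UPSTREAM_DNS)
            try:
                sock.sendto(relay.recv(512), client)
            except socket.timeout:
                pass                     # upstream did not answer; drop the query
            finally:
                relay.close()

if __name__ == "__main__":
    serve()
```

One way to wire such a proxy in would be to point the guest's resolver (e.g. /etc/resolv.conf) at the address it listens on; how the original setup intercepted the queries is not stated in the slides.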
Problem example
• Events/sec for different CPU settings, as reported by the buggy benchmark
• [Charts, before and after the DNS fix: events/sec (0-18000) vs. workers (0-30) for RAM disk and physical disk, on XEN and on the host, with and without the fixed DNS]

Results – Physical Disk
• [Chart: events/sec vs. workers (= CPUs, 1 to 16) for bare metal, XEN and KVM]

Results – Network (XROOTD)
• [Chart: events/sec vs. workers (= CPUs, 1 to 16) for bare metal, XEN and KVM]

Results – RAM Disk
• [Chart: events/sec vs. workers (= CPUs, 1 to 16) for bare metal, XEN and KVM]

Results – Relative values
• [Charts: VM / bare-metal ratio vs. workers for the RAM disk, network (XROOTD) and physical disk tests, comparing XEN and KVM against bare metal]

Results – Absolute values
• [Charts: events/sec vs. workers for the RAM disk, network (XROOTD) and physical disk tests, for bare metal, XEN and KVM]

Results – Comparison chart
• [Chart: events/sec vs. workers (= CPUs) for all nine combinations of storage backend (physical disk, Xrootd, RAM disk) and platform (bare metal, XEN, KVM)]

Procedure, problems and results

In-depth analysis
• In order to get more details, the program execution was monitored and all the system calls were traced and logged
• Afterwards, the analyzer extracted useful information from the trace files, such as
  ▫ the time spent in each system call
  ▫ the filesystem / network activity
• The process of tracing adds some overhead, but it is cancelled out of the overall performance measurement

System call tracing utilities
• STrace
  ▫ Traces the system calls of a single application, from user space
  ▫ Attaches to the traced process using the ptrace() system call and monitors its activity
• Advantages
  ▫ Traces the application's system calls in real time
  ▫ Has very verbose output
• Disadvantages
  ▫ Creates a big overhead

System call tracing utilities
• SystemTAP
  ▫ Traces system-wide kernel activity, asynchronously
  ▫ Runs as a kernel module
• Advantages
  ▫ Can trace virtually everything on a running kernel
  ▫ Supports scriptable kernel probes
• Disadvantages
  ▫ It is not simple to extract detailed information
  ▫ System calls can be lost under high CPU activity

System call tracing utilities
• Sample STrace output:

  5266 1282662179.860933 arch_prctl(ARCH_SET_FS, 0x2b5f2bcc27d0) = 0 <0.000005>
  5266 1282662179.860960 mprotect(0x34ca54d000, 16384, PROT_READ) = 0 <0.000007>
  5266 1282662179.860985 mprotect(0x34ca01b000, 4096, PROT_READ) = 0 <0.000006>
  5266 1282662179.861009 munmap(0x2b5f2bc92000, 189020) = 0 <0.000011>
  5266 1282662179.861082 open("/usr/lib/locale/locale-archive", O_RDONLY) = 4 <0.000008>
  5266 1282662179.861113 fstat(4, {st_mode=S_IFREG|0644, st_size=56442560, ...}) = 0 <0.000005>
  5266 1282662179.861166 mmap(NULL, 56442560, PROT_READ, MAP_PRIVATE, 4, 0) = 0x2b5f2bcc3000 <0.000007>
  5266 1282662179.861192 close(4) = 0 <0.000005>
  5266 1282662179.861269 brk(0) = 0x1ad1f000 <0.000005>
  5266 1282662179.861290 brk(0x1ad40000) = 0x1ad40000 <0.000006>
  5266 1282662179.861444 open("/usr/share/locale/locale.alias", O_RDONLY) = 4 <0.000009>
  5266 1282662179.861483 fstat(4, {st_mode=S_IFREG|0644, st_size=2528, ...}) = 0 <0.000005>
  5266 1282662179.861944 read(4, "", 4096) = 0 <0.000006>
  5266 1282662179.861968 close(4) = 0 <0.000005>
  5266 1282662179.861989 munmap(0x2b5f2f297000, 4096) = 0 <0.000009>
  5264 1282662179.863063 wait4(-1, 0x7fff8d813064, WNOHANG, NULL) = -1 ECHILD (No child processes)
  ...

KARBON – A trace file analyzer
• A general purpose application profiler based on system call trace files
• It tracks file descriptors and reports detailed I/O statistics for files, network sockets and FIFO pipes
• It analyzes the child processes and creates process graphs and process trees
• It can detect the "hot spots" of an application
• Custom analysis tools can be created on demand using the development API

KARBON – Application block diagram
• [Slide diagram: a source (trace file or TCP stream) feeds a tokenizer and router; preprocessing, filter and analyzer stages then hand their output to presenter tools]

Results
• Time utilization of the traced application
  ▫ [Charts: time spent (ms) and its percentage breakdown across file I/O, network I/O, UNIX sockets, TCP sockets and miscellaneous calls, for the physical disk, network (Xrootd) and RAM disk runs on bare metal, XEN and KVM]

Results
• Overall system call time for filesystem I/O [ms]:

               Reading       Writing      Seeking      Total
  Bare metal   490,861.354     2,054.354   21,594.583  524,872.823
  KVM           38,391.715    36,422.440  122,769.518  244,406.512
  XEN           38,111.980    20,930.382  102,769.901  210,247.468

• Reminder: kernel buffers were dropped before every test

(A minimal sketch of how per-call totals like these can be extracted from the raw STrace logs is shown below.)
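BASST and KARBON themselves are not included here, so the following is only a minimal sketch of the kind of aggregation behind a table like the one above. It assumes trace lines in the format of the sample STrace output shown earlier (PID, epoch timestamp, call, result, and the time spent in angle brackets, as produced by strace with the -f, -ttt and -T options) and simply sums the per-call durations; the real tools go much further (descriptor tracking, process trees, hot-spot detection).

```python
#!/usr/bin/env python3
"""Minimal sketch of STrace log aggregation (not the actual BASST/KARBON code).

Assumes one trace line per system call in the format shown above, e.g.
  5266 1282662179.861082 open("/usr/lib/...", O_RDONLY) = 4 <0.000008>
and sums the time spent (the value in angle brackets) per system call name.
"""
import re
import sys
from collections import defaultdict

# PID, timestamp, syscall name, (arguments/result ignored), duration in seconds
LINE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\w+)\(.*<(\d+\.\d+)>\s*$")

def aggregate(path):
    totals = defaultdict(float)   # seconds spent inside each system call
    calls = defaultdict(int)      # number of invocations of each system call
    with open(path) as f:
        for line in f:
            m = LINE.match(line)
            if not m:             # unfinished/resumed calls, signals, etc. are skipped
                continue
            _pid, _ts, name, spent = m.groups()
            totals[name] += float(spent)
            calls[name] += 1
    return totals, calls

if __name__ == "__main__":
    totals, calls = aggregate(sys.argv[1])
    for name in sorted(totals, key=totals.get, reverse=True):
        # Report milliseconds, as in the filesystem I/O table above.
        print(f"{name:<12} {calls[name]:>8} calls {totals[name] * 1000.0:>12.3f} ms")
```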