A Personal Virtual Computer Recorder

Oren Laadan

Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Graduate School of Arts and Sciences

COLUMBIA UNIVERSITY

2011

© 2011 Oren Laadan
All Rights Reserved

ABSTRACT

A Personal Virtual Computer Recorder

Oren Laadan

Continuing advances in hardware technology have enabled the proliferation of faster, cheaper, and more capable personal computers. Users of all backgrounds rely on their computers to handle ever-expanding information, communication, and computation needs. As users spend more time interacting with their computers, it is becoming increasingly important to archive and later search the knowledge, ideas, and information that they have viewed through their computers. However, existing state-of-the-art web and desktop search tools fail to provide a suitable solution, as they focus on static, accessible documents in isolation. Thus, finding the information one has viewed among the ever-increasing and chaotic sea of data available from a computer remains a challenge.

This dissertation introduces DejaView, a personal virtual computer recorder that enhances personal computers with the ability to process display-centric content to help users with all the information they see through their computers. DejaView continuously records a user's session to provide a complete WYSIWYS (What You Search Is What You've Seen) record of a desktop computing experience, enabling users to playback, browse, search, and revive records, making it easier to retrieve and interact with information they have seen before. DejaView records visual output, checkpoints corresponding application and file system states, and captures onscreen text with contextual information to index the record. A user can then browse and search the record for any visual information that has been previously displayed on the desktop, and revive and interact with the desktop computing state corresponding to any point in the record. DejaView introduces new, transparent display and file system virtualization techniques and novel semantic display-centric information recording, and combines them to provide its functionality without any modifications to applications, window systems, or operating system kernels. Our results demonstrate that DejaView can provide continuous low-overhead recording without any user-noticeable performance degradation, and allows users to playback, browse, search, and time-travel back to records fast enough for interactive use.

This dissertation also demonstrates how DejaView's execution virtualization and recording extend beyond the desktop recorder context. We introduce a coordinated, parallel checkpoint-restart mechanism for distributed applications that minimizes synchronization overhead and uniquely supports complete checkpoint and restart of network state in a transport protocol independent manner, for both reliable and unreliable protocols. We introduce a scalable system that enables significant energy savings by migrating network state and applications off of idle hosts, allowing the hosts to enter a low-power suspend state while preserving their network presence. Finally, we show how our techniques can be integrated into a commodity operating system, mainline Linux, thereby allowing the entire operating systems community to benefit from mature checkpoint-restart that is transparent, secure, reliable, efficient, and integral to the Linux kernel.

Contents

List of Figures

List of Tables

List of Algorithms

1 Introduction
  1.1 Display Recording
  1.2 Content Recording
  1.3 Execution Recording
  1.4 Contributions
  1.5 Organization of this Dissertation

2 System Overview
  2.1 Usage Model
  2.2 Example Scenarios
    2.2.1 Parking Ticket Proof
    2.2.2 Software Re-install
    2.2.3 Objects Not Indexable
    2.2.4 Searching for Generic Terms
    2.2.5 Sysadmin Troubleshooting
  2.3 Architecture Overview

3 Display Recording and Search
  3.1 Display Recording
    3.1.1 Record
    3.1.2 Playback
  3.2 Content Recording and Search
    3.2.1 Text Capture and Indexing
    3.2.2 Search with Database
    3.2.3 Search with Text-shots
  3.3 Summary

4 Display-Centric Content Recording
  4.1 Display-Centric Text Recording
  4.2 The Accessibility Framework
  4.3 Architecture
    4.3.1 Overview
    4.3.2 Mirror Tree
    4.3.3 Event Handler
    4.3.4 Output Modules
    4.3.5 Limitations
  4.4 Evaluation
    4.4.1 Performance Overhead
    4.4.2 Single Application Text Coverage
    4.4.3 Multiple-Application Text Coverage
    4.4.4 Tree Characteristics
  4.5 Lessons Learned
  4.6 Summary

5 Virtual Execution Environment
  5.1 Operating System Virtualization
  5.2 Interposition Architecture
  5.3 Virtualization Challenges
    5.3.1 Race Conditions
      5.3.1.1 Process ID Races
      5.3.1.2 PID Initialization Races
      5.3.1.3 SysV IPC Races
      5.3.1.4 Pseudo Terminals Races
    5.3.2 File System Virtualization
    5.3.3 Pseudo File Systems
  5.4 Evaluation
    5.4.1 Micro-benchmarks
    5.4.2 Application Benchmarks
  5.5 Summary

6 Live Execution Recording
  6.1 Application Checkpoint-Restart
    6.1.1 Virtualization Support
    6.1.2 Key Design Choices
  6.2 Record
    6.2.1 Consistent Checkpoints
    6.2.2 Optimize for Interactive Performance
    6.2.3 Checkpoint Policy
  6.3 Revive
    6.3.1 File System Restore
    6.3.2 Network Connectivity
  6.4 Quiescing Processes
  6.5 Process Forest
    6.5.1 DumpForest Algorithm
      6.5.1.1 Basic Algorithm
      6.5.1.2 Examples
      6.5.1.3 Linux Parent Inheritance
    6.5.2 MakeForest Algorithm
  6.6 Shared Resources
    6.6.1 Nested Shared Objects
    6.6.2 Compound Shared Objects
    6.6.3 Memory Sharing
  6.7 Evaluation
  6.8 Summary

7 Whole System Evaluation
  7.1 System Overhead
  7.2 Access To Data
  7.3 Summary

8 Distributed Checkpoint-Restart
  8.1 Architecture Overview
  8.2 Distributed Checkpoint-Restart
  8.3 Network State Checkpoint-Restart
  8.4 Evaluation
    8.4.1 Virtualization Measurements
    8.4.2 Checkpoint-Restart Measurements
  8.5 Summary

9 Desktop Power Management
  9.1 Architecture Overview
  9.2 Application Containers
  9.3 Application Migration
    9.3.1 Checkpoint and Restart Overview
    9.3.2 Base Connection State
    9.3.3 Dynamic Connection State
  9.4 Evaluation
  9.5 Summary

10 Checkpoint-Restart in Linux
  10.1 Usage
    10.1.1 Userspace Tools
    10.1.2 System Calls
  10.2 Architecture
    10.2.1 Kernel vs. Userspace
    10.2.2 Checkpoint and Restart
    10.2.3 The Checkpoint Image
    10.2.4 Shared Resources
    10.2.5 Leak Detection
    10.2.6 Error Handling
    10.2.7 Security Considerations
  10.3 Kernel Internal API
  10.4 Experimental Results
  10.5 Summary

11 Related Work
  11.1 DejaView
  11.2 Content Recording
  11.3 Operating System Virtualization
  11.4 Application Checkpoint-Restart
  11.5 Distributed Checkpoint-Restart
  11.6 Desktop Power Management

12 Conclusions and Future Work
  12.1 Future Work

Bibliography

List of Figures

2.1 DejaView screenshot
2.2 DejaView architecture

4.1 Desktop accessibility framework

4.2 GUI windows and accessibility tree of …
4.3 Capture architecture
4.4 Capture coverage versus think-time
4.5 Orca coverage versus think-time
4.6 Mirror tree size
4.7 Mirror tree memory
4.8 Query time vs. node size
4.9 Query time vs. node count

5.1 Anatomy of virtualization wrappers
5.2 PID deletion race
5.3 PID initialization race
5.4 IPC reuse race
5.5 Virtualization cost on UP - micro-benchmarks
5.6 Virtualization cost on SMP - micro-benchmarks
5.7 Virtualization cost for LMbench
5.8 Virtualization cost for macro-benchmarks
5.9 Virtualization scalability

6.1 Simple process forest
6.2 Process forest with deletions
6.3 Average checkpoint size
6.4 Average number of processes
6.5 Average checkpoint time
6.6 COW and buffering impact
6.7 Checkpoint time breakdown
6.8 Average restart time

7.1 Recording runtime overhead
7.2 Total checkpoint latency
7.3 Recording storage growth
7.4 Latency of text search and display browsing
7.5 Playback speedup
7.6 Revive latency

8.1 Coordinated checkpoint timeline
8.2 Non-overlapping and overlapping data queues
8.3 Application completion times on vanilla Linux and ZapC
8.4 Distributed application checkpoint times
8.5 Distributed application restart times
8.6 Distributed application checkpoint sizes

9.1 NetCont architecture overview
9.2 Restore established connections
9.3 Restore pending connections
9.4 Average container migration sizes
9.5 Average container migration times

10.1 A simple checkpoint-restart example

List of Tables

4.1 A subset of GNOME accessibility roles
4.2 A subset of GNOME accessibility properties
4.3 A subset of GNOME accessibility events
4.4 Content recording: application scenarios
4.5 Summary of benchmark results with think-time of 1000 ms
4.6 Multiple-application text coverage

5.1 Kernel subsystems and related resources
5.2 Summary of virtualization methods

6.1 Possible flags in the status field
6.2 Live execution recording: application scenarios
6.3 Checkpoint-restart performance for OpenVZ
6.4 Checkpoint-restart performance for Xen VMs

7.1 Whole system evaluation: application scenarios

9.1 Network connection states
9.2 Distributed checkpoint: application scenarios
9.3 NetCont power consumption

10.1 System call flags (checkpoint and restart)
10.2 Kernel API by groups
10.3 Checkpoint-restart performance
10.4 Checkpoint times and memory sizes

List of Algorithms

5.1 System call wrapper (getpid)
6.1 DumpForest
6.2 FindCreator
6.3 AddPlaceHolder
6.4 MakeForest
6.5 ForkChildren
6.6 ForkChild
8.1 Coordinated Checkpoint: Manager
8.2 Coordinated Checkpoint: Agent
8.3 Coordinated Restart: Manager
8.4 Coordinated Restart: Agent
9.1 Network Checkpoint
9.2 Network Restart
9.3 RestoreEstablishedConnection
9.4 RestorePendingConnection

Acknowledgements

This work has seen support from many individuals, and I'd like to use this opportunity to name a few. To begin with, I'd like to thank my advisor, Jason Nieh, to whom I'm deeply indebted. I have enjoyed his knowledge, inspiration, guidance, mentorship, as well as friendship and kind words in times of need. I truly admire Jason's ability to ask the sharpest of questions that would send me time and again back to the drawing board, often to re-surface with even better ideas; to be able to ask “Jason-grade” questions myself is a source of pride. Working with him has been challenging and rewarding, and having endured his uncompromising demand for quality and relentless attention to detail has equipped me with useful skills for the years to come. I'm grateful to have had such a great researcher and person for an advisor.

I wish to also thank Luis Gravano, with whom I worked closely on a major section of my thesis. Angelos Keromytis provided feedback and guidance during my time here. This work has additionally benefited from feedback and advice from the rest of my PhD committee members, Gail Kaiser, Dilma da Silva, and Michael Stumm, whom I'd like to thank for their helpful input, comments, and patience.

This dissertation has benefited from many discussions with and contributions from others. In particular, Ricardo Baratto's research on display virtualization and Shaya Potter's ideas on file system unioning were instrumental to this work. Dan Phung made significant contributions to distributed and incremental checkpointing. Ricardo, Dan, and Shaya have provided many insights on this work, implemented parts of the prototype, and participated in its sisyphean evaluation. Stelios Sidiroglou-Douskos provided excellent advice on research matters and life in general, as well as the occasional supply of superb seafood. Eli Brosh and Christoffer Dall provided invaluable feedback on earlier drafts of this document. These truly exceptional individuals whom I have met throughout my journey at Columbia have not only made this dream possible, but have also been a significant part of my graduate experience. Above everything else, I cherish their friendship. I'd also like to mention Andrew Shu, Nicolas Viennot, Adrian Frei, and Yuly Finkelberg, who contributed ideas, code, comments, and valuable feedback.

Many thanks to the administrative staff in the Computer Science Department for masking out most of Columbia's red tape from me. That includes Alice Cueba, Twinkle Edwards, Patricia Hervey, Remiko Moss, Elias Tesfaye, and Cindy Walters. A special thanks to the CRF staff, including Daisy Nguyen, Paul Blaer, Shlomo Hershkop, Quy O, and Hi Tae Shin, who worked relentlessly to address my systems needs.

Finally, I wish to thank Franci for her unconditional support and encouragement during this almost-infinite stretch of time, and her incredible ability to keep me a safe distance from insanity, despite my grumpiest moments. Last but not least, I'm grateful to my parents and my sisters for their distant yet unequivocal support.

(This work was supported in part by a DOE Early Career Award, AMD, and an IBM SUR Award, NSF grants CNS-0717544, CNS-0914845, CNS-0905246, and ANI-0240525, and AFOSR MURI grant FA9550-07-1-0527.)

To Mom and Dad...


Chapter 1

Introduction

As users spend more time interacting with the world and their peers through their computers, it is becoming important to archive and later search the knowledge, ideas and information that they have seen through their computers. However, finding the information one has seen among the ever-increasing and chaotic sea of data available from a computer remains a challenge. Meanwhile, computers are getting faster at generating, distributing, and storing vast amounts of data, yet humans are not getting any faster at processing it. Exponential improvements in processing, networking, and storage technologies are not making this problem easier. Vannevar Bush's Memex vision [25] was to build a device that could store all of a user's documents and general information so that it could be quickly referenced. Building on that vision, this dissertation introduces a new approach for keeping track of the massive amount of data seen through the desktop, and for being able to playback, browse, search, access, and interact with this data, making it easier for users to retrieve information seen before.

Some systems and tools have been developed to address aspects of the problem of finding information seen on the desktop, including web search engines and desktop file search tools. However, web search engines [176, 177, 178] focus only on static information available on the web. They do not help with a user's personal repository of data, dynamically generated and changing content created at the moment a user has viewed a web page, or hidden databases a user may have seen but that are not available through web search engines [69]. Nor do they help with the wealth of additional information that users produce and consume through their personal computers. Desktop file search tools [43, 44, 45] go a step further and provide mechanisms to search over various forms of individual user documents such as user files, email messages, web pages, and chat sessions. However, while desktop search tools search within current files that may be of interest, they do not return results from files that are no longer available, or from information seen by the user but never actually saved to files.

More importantly, focusing on such individual, relatively static documents in isolation is often insufficient. For a number of important search scenarios, the history and patterns of access to all information on a desktop, static or otherwise, are themselves valuable and, in fact, critical to answer certain queries. For example, suppose a user encounters a problem with her computer. She pieces together a solution through extensive investigation that involves a number of different activities, including manual inspection of files, web search, email search, and IM sessions with a few friends. Some time later, the same problem recurs with a similar set of applications running, producing a similar output. Unfortunately, existing search tools provide no way to simply search the user's past computing experience to identify the user's previous solution to the problem. All the user can do is painstakingly attempt to redo the solution through the retrieval of individual documents.

To avoid repeating this time-consuming investigation, what is really needed is a tool that can perform a display-centric search. This tool must not only search across multiple individual resources, but also leverage the rich history of previous interactions with all the information displayed on a desktop. The rationale for this is that the display is the primary means of output of computers: the rich visual display output is a good match for the human visual system, and paramount for the human-computer interaction experience. Displays are becoming bigger and better in their ability to show visual contents to users, and more and more of this information is designed only for human visual consumption and is not available in other ways. Enhancing computers in their ability to process display-centric content will help users find information they have seen through their computers.

The concept of desktop-centric search offers several advantages. First, it enables search for data even if it is not explicitly saved (either as a file, bookmark, or note). Second, it provides useful context, such as the provenance of output, which applications were running, and what GUI actions occurred. This context can supplement the user's query in seeking specific records. Third, it reveals information about the persistence of certain output on the display. Persistence data can benefit users interested in when onscreen content appeared or disappeared, and for how long it was displayed. Fourth, desktop-centric searches can leverage information across multiple windows, to either aggregate data from a single application over multiple windows, or correlate data from different windows and different applications. This empowers users to use the rich context they recall having seen onscreen to formulate richer search queries that yield more accurate results.

This dissertation presents DejaView, a personal virtual computer recorder that provides a complete WYSIWYS (What You Search Is What You've Seen) record of a desktop computing experience, enabling users to playback, browse, search, and revive records, making it easier to retrieve information they have seen before, even when not explicitly saved. Leveraging continued exponential improvements in storage capacity [127], DejaView records what a user has seen as it was originally displayed, with the same personal context and layout. All viewed information is recorded, be it an email, web page, document, program debugger output, or instant messaging session. DejaView enables a user to playback and browse records for information using functions similar to personal video recorders (PVR) such as pause, rewind, fast forward, and play. DejaView enables a user to search records for specific information to generate a set of matching screenshots, which act as portals for the user to gain full access to recorded information. DejaView enables a user to select a given point in time in the record from which to revive a live computing session that corresponds to the desktop state at that time. The user can time travel back and forth through what she has seen, and manipulate the information in the record using the original applications and computing environment. DejaView's ability to browse and search display content and revive live execution provides a unique blend of functionality and performance.
By browsing and searching the display record, the user is able to access content as it was originally seen, and quickly find information at much faster rates than if the information had to be generated by replaying execution. By reviving the execution environment, the user can go beyond a static display of content to fully manipulating and processing information using the same application tools available when the information was first displayed.

To support its usage model, DejaView provides tools that continuously record both the display and the execution of the user's desktop computing environment such that the record can be searched, played, and manipulated at a later time. DejaView records three sets of desktop state at all times. First, it records all visual output generated by the desktop to allow users to quickly browse and playback recorded content. Second, it records all onscreen text as well as associated contextual information, and indexes that data to allow users to quickly search and locate recorded content. Third, it records the live execution state, including application and file system state of the desktop, to allow users to revive their desktop from a previous time. In providing this functionality, DejaView needs to address a number of challenges along three dimensions, namely usability, performance, and transparency:

• Usability. In recording a user's computing experience, DejaView should ensure that the resulting record is maximally usable by the user. For desktop display, DejaView should preserve the display fidelity, to provide a full-fidelity visual experience for playback and browsing. For onscreen contents, DejaView should provide accurate coverage of onscreen data seen by the user, in both foreground and background windows, so that users can search for any content seen before. For live execution, DejaView should save the application state of the multiple processes in a session in a manner that is globally consistent and preserves process dependencies, so that past sessions can be correctly revived. Furthermore, DejaView should allow multiple revived sessions from different points in time to run concurrently without conflicts among each other or with the present session, so that users can interact with multiple sessions at a time.

• Performance. While collecting all the data to record, DejaView should be mindful of any overhead it creates on the interactive performance of the desktop, to sustain seamless operation and maintain a pleasant user experience. Furthermore, user actions such as browse, search, and session revive should provide acceptable responsiveness for interactive use, allowing DejaView to operate within the attention thresholds of end users. Finally, DejaView's recording should be space efficient, to allow longer periods of recording while consuming less disk storage a user may wish to use for other purposes, and to potentially reduce access time to records as less data will need to be scanned.

• Transparency. For DejaView to be useful in practice, it is crucial that it transparently support the large existing installed base of desktop applications and commodity operating systems, without requiring any modifications to applications, standard libraries, desktop environments, window systems, or operating system kernels.

DejaView transparently provides its functionality, including display, text, and live execution state recording, by introducing lightweight virtualization mechanisms and utilizing available accessibility interfaces. DejaView's virtualization architecture consists of three main components. For display recording, DejaView virtualizes the display to capture and log low-level display commands, enabling them to be replayed at full fidelity at a later time. For onscreen content recording, DejaView utilizes accessibility interfaces to simultaneously capture displayed text and contextual information to automatically index the display record so it can be searched. For live execution recording, DejaView combines display and operating system virtualization to decouple window system and application state from the underlying system, allowing them to be continuously checkpointed and later revived.

1.1 Display Recording

DejaView records the display output to produce a continuous log of all visual information users have had access to through their computers, so that it can be played back and arbitrarily browsed. Screencasting tools could take visual screenshots of the display frequently enough to provide such recording; however, they incur high recording overhead and storage requirements. Using lossy MPEG or JPEG encoding could be attempted to reduce the storage needs, but this would further increase recording overhead and decrease the quality of the display record.

Instead, DejaView uses a virtual display architecture that decouples the display state from the underlying hardware and enables the display output to be redirected anywhere, making it easy to manipulate and record. DejaView's virtual display architecture leverages the standard video driver interface to introduce a virtual display driver that intercepts drawing commands, which makes recording and playback simple. DejaView takes advantage of this mechanism to record and display simultaneously. As visual output is generated, the virtual display driver multiplexes the output into commands for the display, and commands for logging to persistent storage.

DejaView's virtual display architecture has three key benefits in terms of transparency, usability, and performance. By using a standard interface, DejaView can operate seamlessly with existing unmodified applications, window systems, and operating systems. Because it intercepts and captures all low-level display commands, DejaView produces a complete, lossless display recording that can be replayed later with full fidelity. Lastly, working at the level of display commands allows DejaView to record only display updates rather than full screenshots, minimizing the storage requirements.
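To make the multiplexing concrete, the following minimal C sketch shows an interposition hook of this kind: each drawing command is both rendered and appended, with a timestamp, to a log. The command layout and all names are hypothetical illustrations, not DejaView's actual driver code.

    /* Minimal sketch (hypothetical names, not DejaView's driver code):
     * multiplex each drawing command to the display and to a log. */
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    #include <time.h>

    struct draw_cmd {                /* one low-level display command */
        uint32_t opcode;             /* e.g., fill, copy, put-image */
        uint32_t x, y, w, h;         /* affected screen region */
        uint32_t len;                /* payload bytes actually used */
        uint8_t  payload[256];
    };

    static FILE *display_log;        /* persistent record of display updates */

    /* Stand-in for handing the command to the real display back end. */
    static void send_to_display(const struct draw_cmd *cmd)
    {
        (void)cmd;                   /* rendering omitted in this sketch */
    }

    /* Interposition point: render and log each command with a timestamp,
     * so playback can later replay updates at full fidelity. */
    static void dispatch_draw_cmd(const struct draw_cmd *cmd)
    {
        uint64_t ts = (uint64_t)time(NULL);

        send_to_display(cmd);
        fwrite(&ts, sizeof ts, 1, display_log);
        fwrite(cmd, sizeof *cmd, 1, display_log);
    }

    int main(void)
    {
        struct draw_cmd cmd = { .opcode = 1, .w = 64, .h = 16, .len = 5 };

        memcpy(cmd.payload, "hello", 5);
        display_log = fopen("display.rec", "ab");
        if (!display_log)
            return 1;
        dispatch_draw_cmd(&cmd);     /* rendered and recorded in one step */
        fclose(display_log);
        return 0;
    }

Because only commands describing display updates are logged, rather than full screen contents, the record grows with display activity instead of screen size.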

1.2 Content Recording

In addition to visual output, DejaView records onscreen text and contextual information to automatically index the display record so it can be searched. Contextual information includes data such as the window from which the text came, the duration for which it was on the screen, etc. Capturing textual information from display commands is often not possible due to the variety of application-specific mechanisms for rendering text. Screen capture tools can take a visual snapshot of the display, but the resulting snapshot is simply a set of pixels with no semantic information. While optical character recognition (OCR) could be used to extract text from screenshots, this is at best a very slow and inaccurate process that cannot support real-time capture of onscreen content.

Instead, DejaView leverages the ubiquitous accessibility infrastructure available on modern operating systems and widely used by screen readers to provide desktop access for visually-impaired users [1]. This infrastructure is typically incorporated into standard GUI toolkits, making it easy for applications to provide basic accessibility functionality. DejaView uses this infrastructure to obtain both the text displayed on the screen and useful context, including the name and type of the application that generated the text, window focus, and special properties about the text (e.g., if it is a menu item or a hyperlink).

Unlike screen readers, which are window-centric and limited to watching changes to a foreground window only, DejaView is desktop-centric and must track multiple applications to record the onscreen contents of the desktop as a whole. This involves extracting significantly more data than traditional screen readers. The accessibility infrastructure, however, relies on message passing for desktop applications to expose their onscreen contents, and is sensitive to communication latencies between processes. Screen readers typically tolerate query latencies because they target only the window in focus, issue few queries, and need not perform faster than at interactive rates. Conversely, retrieving the full accessibility data of a whole desktop comprising multiple applications could generate a multitude of queries, which would become prohibitively expensive. A typical scan would not only generate an excessive runtime load on the system, but also produce inaccurate and outdated results, because the accessibility data may change in the interim, and would impair the system's responsiveness during query bursts.

To address this difficulty, we introduce a novel display-centric text recorder that facilitates real-time access to both foreground and background onscreen text with low overhead. The recorder provides an intelligent caching architecture that integrates with the accessibility infrastructure to reduce the need for accessibility queries. Onscreen data, as well as the structure of the onscreen data, is efficiently cached, making available not just the results of one query but a complete view of all onscreen data at a given time. DejaView leverages this caching architecture to continuously log the onscreen contents into structured text documents whose contents reflect the text and associated contextual information at particular moments. These documents carry valuable data about the display structure that can be used to deliver better search results to the user.
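The caching idea can be sketched as follows: change notifications from the accessibility infrastructure merely mark nodes of a cached mirror of the desktop's accessibility data as dirty, and a later pass refreshes each dirty node with a single query, however many notifications arrived in the interim. This is a minimal C illustration with hypothetical names, not the real accessibility API or DejaView's recorder code.

    /* Minimal sketch (hypothetical API): coalesce accessibility change
     * notifications into a mirror tree, one query per dirty node. */
    #include <stdio.h>

    struct mirror_node {
        char text[128];                  /* cached onscreen text */
        int dirty;                       /* marked by notifications */
        struct mirror_node *child, *sibling;
    };

    /* Stand-in for an expensive cross-process accessibility query. */
    static void ax_query_text(struct mirror_node *n)
    {
        snprintf(n->text, sizeof n->text, "(fresh text)");
    }

    /* Notification handler: cheap; just mark the node, query later. */
    static void on_text_changed(struct mirror_node *n)
    {
        n->dirty = 1;
    }

    /* Periodic refresh: a single query per dirty node, no matter how
     * many notifications arrived since the previous pass. */
    static void refresh(struct mirror_node *n)
    {
        if (!n)
            return;
        if (n->dirty) {
            ax_query_text(n);
            n->dirty = 0;
        }
        refresh(n->child);
        refresh(n->sibling);
    }

    int main(void)
    {
        struct mirror_node root = {0}, label = {0};

        root.child = &label;
        on_text_changed(&label);         /* a burst of updates ... */
        on_text_changed(&label);
        on_text_changed(&label);
        refresh(&root);                  /* ... serviced by one query */
        printf("cached: %s\n", label.text);
        return 0;
    }

The mirror tree thus always holds a complete, queryable copy of the onscreen data, which output modules can log or index without querying the applications again.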
DejaView's content recording architecture has three key benefits in terms of transparency, usability, and performance. By using a mechanism natively supported by applications, DejaView can operate transparently with existing unmodified applications, window systems, and desktop environments. By leveraging the accessibility mechanisms, DejaView has maximum access to all display-centric content, including data and metadata associated with both foreground and background windows. Lastly, the caching architecture allows DejaView to mitigate the high performance costs of using the accessibility infrastructure while retaining its full functionality for accessing onscreen data, maintaining an accurate mirror of this data available for use in real time.

1.3 Execution Recording

For cases where visual information alone, such as screenshots and text snippets, is not enough, DejaView continuously records an entire live user session such that it can be revived later, allowing users to interact with their desktop as it was at any point in the past. Virtual machine monitors (VMMs) could be used to transparently checkpoint and later roll back an entire operating system environment, consisting of both the operating system and the applications. However, because they operate on entire operating system instances, they incur visible runtime overhead and prohibitive checkpoint-restart overheads. VMMs could also be used to log entire operating system instances and their applications and later replay their execution. However, implementing deterministic replay without incurring prohibitive overhead remains a difficult problem, especially for multiprocessor systems. Moreover, it would require re-executing everything for playback, which is unsuitable for interactive use.

Instead, DejaView leverages operating system virtualization mechanisms to introduce a lightweight virtual execution environment that decouples the user's desktop computing session from the underlying operating system instance, enabling it to continuously checkpoint an entire live user session and later revive the session in a consistent state from any checkpoint. This lightweight virtualization imposes negligible overhead, as it operates above the operating system instance to encapsulate only the user's desktop computing session, as opposed to an entire machine instance.

The virtual execution environment transparently encapsulates the user's desktop computing session in a private virtual namespace. By providing a virtual namespace, revived sessions can appear to access the same operating system resources as before, even if they are remapped to different underlying operating system resources upon revival. By providing a private namespace for each session, revived sessions from different points in time can run concurrently and appear to use the same operating system resources inside their respective namespaces without any resource conflicts across sessions.
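A minimal sketch of the namespace idea follows, modeled in userspace C for brevity (DejaView's actual interposition lives in a kernel module; all names here are hypothetical). Each session keeps a private table mapping virtual identifiers to real ones, so revived sessions can each see the identifiers they saw when recorded, without colliding.

    /* Minimal sketch (userspace model, hypothetical names): per-session
     * virtual PID namespaces that map virtual PIDs to real PIDs. */
    #include <stdio.h>

    #define MAX_PIDS 16

    struct session {                     /* one private PID namespace */
        int real_pid[MAX_PIDS];          /* indexed by virtual PID */
        int next_vpid;
    };

    /* Bind the next virtual PID in this session to a real PID. */
    static int vpid_alloc(struct session *s, int real)
    {
        int v = s->next_vpid++;
        s->real_pid[v] = real;
        return v;
    }

    /* What an interposed getpid() would return inside the session. */
    static int virt_getpid(const struct session *s, int real)
    {
        for (int v = 1; v < s->next_vpid; v++)
            if (s->real_pid[v] == real)
                return v;
        return -1;                       /* not part of this session */
    }

    int main(void)
    {
        struct session a = { .next_vpid = 1 };   /* vpid 0 unused */
        struct session b = { .next_vpid = 1 };

        vpid_alloc(&a, 4242);            /* different real processes ... */
        vpid_alloc(&b, 5151);

        /* ... both legitimately appear as vpid 1 in their own session. */
        printf("session a: real 4242 -> vpid %d\n", virt_getpid(&a, 4242));
        printf("session b: real 5151 -> vpid %d\n", virt_getpid(&b, 5151));
        return 0;
    }

On revival, the same virtual PIDs are simply rebound to whatever real PIDs the restarted processes receive, which is what lets multiple revived sessions coexist with the present one.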
Building on this virtualization, DejaView records the user's desktop session by frequently checkpointing all the operating system state associated with all the processes in the session. Since checkpoints record not just a single process, but an entire session consisting of multiple processes and threads, the application state saved and restored must be globally consistent. We introduce a novel algorithm for accounting for process relationships that correctly saves and restores all process state in a globally consistent manner. This algorithm is crucial for enabling transparent checkpoint-restart of interactive graphical applications. We also introduce an efficient algorithm for identifying and accounting for shared resources and correctly saving and restoring such shared state across cooperating processes. DejaView combines logging and unioning file system mechanisms to capture the file system state at each checkpoint. This ensures that applications revived from a checkpoint are given a consistent file system view corresponding to the time at which the checkpoint was taken.

Unlike hardware virtualization, where each virtual machine has a full operating system instance, DejaView only saves user desktop state, not the entire operating system instance. Checkpointing at this finer granularity is crucial to reduce the amount of state to be saved and the extent of each checkpoint, as well as the response time to revive a past session. To further reduce the application downtime incurred by checkpoints, DejaView employs various optimizations, including shifting expensive I/O operations out of the critical path, and using fast incremental and copy-on-write techniques. Checkpoint policies that decide how frequently to checkpoint based on user and display activity further reduce both the noticeable impact and the storage requirements.
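As an illustration of such a policy, the C sketch below rate-limits checkpoints and skips them entirely when the display is idle. The structure and parameters are illustrative assumptions, not DejaView's actual policy code; Chapter 6 presents the real policies.

    /* Minimal sketch (illustrative parameters): checkpoint only when
     * display activity occurred, at most once per minimum interval. */
    #include <stdio.h>

    struct ckpt_policy {
        double min_interval;     /* seconds between checkpoints */
        double last_ckpt;        /* time of the last checkpoint */
        int    display_dirty;    /* display updates since last_ckpt? */
    };

    static void on_display_update(struct ckpt_policy *p)
    {
        p->display_dirty = 1;
    }

    /* Called periodically; returns 1 when a checkpoint should be taken. */
    static int should_checkpoint(struct ckpt_policy *p, double now)
    {
        if (!p->display_dirty)                   /* idle screen: skip */
            return 0;
        if (now - p->last_ckpt < p->min_interval)
            return 0;                            /* too soon: rate limit */
        p->last_ckpt = now;
        p->display_dirty = 0;
        return 1;
    }

    int main(void)
    {
        struct ckpt_policy p = { .min_interval = 1.0 };

        on_display_update(&p);
        printf("t=0.5: %s\n", should_checkpoint(&p, 0.5) ? "ckpt" : "skip");
        printf("t=1.5: %s\n", should_checkpoint(&p, 1.5) ? "ckpt" : "skip");
        printf("t=2.0: %s\n", should_checkpoint(&p, 2.0) ? "ckpt" : "skip");
        return 0;
    }

Skipping checkpoints when nothing new is on the screen saves both downtime and storage, since a user reviving that interval would see an identical desktop anyway.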

DejaView's execution recording architecture has three key benefits in terms of transparency, performance, and usability. By encapsulating a user's desktop in a virtual execution environment based on a kernel module, DejaView can operate seamlessly with existing unmodified applications and operating system kernels. This encapsulation also allows DejaView to revive past sessions so that a user can interact with them, and enables multiple revived sessions to run concurrently without conflicts among each other or with the present session. Lastly, checkpointing at a fine granularity and optimizing to reduce the frequency and the duration of checkpoints allow DejaView to minimize any impact on interactive desktop application performance and reduce the storage requirements.

1.4 Contributions

This dissertation presents DejaView, a new personal virtual computer recorder model for the desktop that enables What You Search Is What You've Seen (WYSIWYS) functionality to help users find, access, and manipulate information they have previously seen. DejaView combines display, operating system, and file system virtualization to provide its functionality transparently, without any modifications to applications, window systems, or operating system kernels. More specifically, the novel contributions of this dissertation include:

1. We introduce a lightweight virtual execution environment architecture [96] that decouples applications from the underlying operating system. It transparently encapsulates processes in isolated containers using a lightweight virtualization layer between the applications and the operating system, without any underlying operating system kernel changes.

2. We discuss key implementation issues and challenges in providing operating system virtualization in a commodity operating system. We compare methods for implementing virtualization at user level vs. in-kernel, discuss performance costs for methods to store virtualization state, and analyze subtle race conditions that can arise. The experiences from this approach are instrumental in demonstrating how operating system virtualization can be incorporated into commodity operating systems with minimal changes and low overhead.

3. We implement an operating system virtualization prototype entirely in a loadable kernel module that works across multiple Linux kernel versions, demonstrating the portability of our approach. We present qualitative results showing that our minimally invasive approach can be done with very low overhead.

4. We introduce a transparent checkpoint-restart mechanism for commodity operating systems [95] that can checkpoint multiple processes in a consistent state and later restart them. The approach combines a kernel-level checkpoint mechanism with a hybrid user-level and kernel-level restart mechanism to leverage existing operating system interfaces and functionality as much as possible.

5. We introduce a novel coordinated checkpoint and file system mechanism that combines log-structured and unioning file systems in a unique way to enable fast file system snapshots consistent with checkpoints, allowing checkpoints to be later revived for simultaneous read-write usage.

6. We introduce a novel algorithm to account for process relationships that correctly saves and restores all process state in a globally consistent manner, and an efficient algorithm to identify and account for shared resources, to correctly save and restore such shared state across cooperating processes.

7. We introduce several optimizations to reduce application downtime during checkpoints. Some optimizations shift the latency of expensive I/O operations to before and after the application downtime. Others reduce the amount of state that needs to be saved during application downtime. These are crucial to allow frequent checkpoints without any noticeable performance degradation.

8. We implement an application checkpoint-restart mechanism as a loadable kernel module and userspace utilities in Linux. We demonstrate its ability to provide transparent checkpoint-restart functionality on real-world applications without modifying existing system components.

9. We introduce Capture, a display-centric text recorder that facilitates real-time access to onscreen contents and their structure and contextual information, including data associated with both foreground and background windows. The recorder makes novel use of the accessibility infrastructure available on modern operating systems to continuously track onscreen text and metadata, without any changes to applications or window systems. Recorded data can benefit a variety of problem domains, including assistive technologies, desktop search, auditing, and predictive graphical user interfaces.

10. We introduce an intelligent caching architecture that reduces the need to query the accessibility infrastructure for onscreen data using a pull model in which multiple screen update notifications from the accessibility infrastructure are coalesced to be handled by a single query. The architecture mitigates the high performance costs of using the accessibility infrastructure while retaining its full functionality for accessing onscreen data, and makes available not just the results of one query, but a complete view of all onscreen data at a given time.

11. We implement a Capture prototype without any changes to applications or window systems. We demonstrate the accuracy and efficacy of the prototype with a wide range of desktop applications that generate updates to onscreen data at a high frequency. Compared to a standard screen reader, it records a substantially higher percentage of onscreen text for both foreground and background windows with textual content, and it records many common application workloads not handled by the screen reader.

12. We introduce DejaView [93], a personal virtual computer recorder architecture that records a user's session efficiently, without user-perceived degradation of application performance. We combine display, operating system, and file system virtualization to provide this functionality transparently without any modifications to applications, window systems, or operating system kernels.

13. We introduce a policy for throttling checkpoints to minimize both the runtime overhead due to checkpoints and the storage requirements. The policy reduces runtime overhead by limiting the checkpoint rate. It reduces storage requirements by employing optimizations that skip checkpoints in the absence of display updates or when display activity is low.

14. We implement a DejaView prototype and evaluate its performance on common desktop application workloads and with real desktop usage. We show that recording adds negligible overhead, capturing the display and execution state of interactive applications with only a few milliseconds of interruption. We show that playback can enable users to quickly view display records much faster than real-time, and that browsing and searching display information is fast enough for interactive use.

The individual components that make up DejaView provide key technologies useful in a broader context beyond the desktop recorder framework. This dissertation also demonstrates their applicability to providing application mobility:

15. We introduce ZapC [97], a system for transparent coordinated checkpoint-restart of distributed networked applications on commodity clusters. ZapC can checkpoint an entire distributed application across all nodes in a coordinated manner such that the application can be restarted from the checkpoint on a different set of cluster nodes at a later time. Checkpoint and restart operations execute in parallel across different nodes. Network state, including socket and protocol state for both TCP and UDP, is saved in a transport protocol independent manner.

16. We introduce NetCont, a system that enables energy savings by allowing idle hosts to transition to a low-power state while preserving their network presence even when they sleep. NetCont seamlessly migrates network applications and their existing connections from idle hosts preparing to sleep to a dedicated server, where they continue to run unmodified, and relies on the applications themselves to maintain their network presence. Migrating individual applications provides good scalability to consolidate applications from hundreds of hosts into a single server, and allows fast migration times that do not impact the user experience.

17. We present Linux-CR [94], an in-kernel implementation of transparent application checkpoint-restart aiming for the Linux mainline kernel. Building on the experience garnered through DejaView and on recent support for virtualization available in mainline Linux, Linux-CR's checkpoint-restart is transparent, secure, reliable, efficient, and well integrated with the Linux kernel.

Finally, these technologies further benefit additional research both directly and indirectly. For example, the namespace virtualization ideas of our virtual execution environment are now part of the Linux kernel [19]. Explicitly building on our virtualization, Scribe [98] employs lightweight operating system mechanisms to provide deterministic application execution-replay on commodity multiprocessor operating systems. MediaPod [133] and GamePod [134] are portable systems that enable mobile users to maintain the same persistent, personalized environments, for multimedia and gaming respectively, using a mobile storage device that contains complete application-specific environments. They build on our virtualization and checkpoint-restart mechanisms to decouple a desktop environment and applications from the host, enabling a user's session to be suspended to the device, carried around, and resumed on another computer. ASSURE [152, 153] is a system for automatic self-healing of software systems to enhance security and robustness; it takes advantage of a variant of our checkpoint-restart mechanism that provides ultra-fast in-memory checkpoint and rollback capabilities to reach good performance levels. Details on these systems are beyond the scope of this dissertation.

1.5 Organization of this Dissertation

This dissertation is organized as follows. Chapter 2 provides a general overview of DejaView, the usage model and scenarios, and the overall architecture. Chapter 3 presents DejaView's display recording architecture and the mechanisms available to access the recorded information. Chapter 4 presents DejaView's display-centric content recording architecture and its use of the accessibility infrastructure. Chapter 5 describes DejaView's virtual execution environment, and Chapter 6 describes DejaView's live execution recording architecture and the mechanisms available to revive a past session from recorded data. Chapter 7 combines the three recording mechanisms together to provide a whole system evaluation of DejaView. Chapters 8–10 describe three systems that leverage DejaView's architectural building blocks: distributed checkpoint-restart, desktop power management, and checkpoint-restart in Linux, respectively. Chapter 11 discusses related work. Finally, we present some conclusions and directions for future work in Chapter 12.

Chapter 2

System Overview

Before describing the technology behind DejaView, in this chapter we begin with a general overview of the system and how it is used. First, we present a description of the usage model of DejaView to provide an understanding of how users interact with the system. Next, we discuss a number of concrete examples that illustrate the usefulness of DejaView’s unique functionality in real-life scenarios. We then present a high-level overview of the system’s architecture and how it supports all of its functionality.

2.1 Usage Model

DejaView operates transparently within a user's desktop, recording its state and indexing all text as the user interacts with the computer. The user can later view the recorded session by playing it back, and can interact with any previous session state by reviving it. DejaView consists of a server that runs a user's desktop environment, including the window system and all applications, and a viewer application. The viewer acts as a portal to access the desktop, sending mouse and keyboard events to the server, which passes them to the applications. Similarly, screen updates are sent from the server to the viewer, which displays them to the user. This functional separation allows the viewer and server to run on the same or different computers.

Figure 2.1 – A DejaView screenshot showing widgets for playback and search inside a live desktop session. At the top right, the Search (1) button brings up a dialog to perform searches. At the bottom, the slider (2) allows the user to browse through the recording, and the Take me back (3) button revives the session at that point in time.

The viewer provides three GUI widgets to access DejaView's recording functionality, shown in Figure 2.1. A search button opens a dialog box to search for recorded information, with results displayed as a gallery of screenshots. A slider provides PVR-like functionality, allowing the user to rewind or fast-forward to different points in the record, or pause the display during live execution to view an item of interest. Finally, a Take me back button revives the desktop session at the point in time currently displayed.

DejaView users can choose to trade off record quality against storage consumption to meet their particular environment and needs. By default, display data is recorded at the original fidelity, but users can change the resolution and the frequency at which

display updates are recorded. Application execution state is recorded according to a configurable policy that adjusts the rate of checkpointing based on display output and user input.

DejaView captures displayed text and associates it with visual output to index the display record for searching. Users can create additional annotations by simply typing anywhere on the screen, resulting in the automatic indexing of that text. Furthermore, DejaView allows the user to tag the current display state by typing text, selecting it with the mouse, and pressing a combination key, to explicitly index the selected text with a special annotation attribute.

When the user revives a past session, an additional viewer window is used to access the revived session, using a model similar to the tabs commonplace in today's web browsers. A revived session operates as a normal desktop session; its new execution can diverge from the sequence of events that occurred in the original recording. The ability to revive a past session is analogous to how a modern laptop can resume operation after a period of hibernation to disk. DejaView extends this concept by allowing simultaneous revival of multiple past sessions, which can run side by side independently of each other and of the current session. The user can copy and paste content amongst her active sessions.

Recording a user's computer activity raises valid privacy and security concerns [27], as this information could be exploited to infringe upon the user's civil liberties or for criminal purposes. To mitigate some of the security concerns, user input is not directly recorded; only the changes it effects on the display are kept. This prevents the recording of passwords entered by the user. Standard encryption techniques can also be used to provide an additional layer of protection.

From a privacy perspective, DejaView's default usage model is solipsistic, i.e., hoarding information about oneself for one's own purposes, thereby rendering privacy only a minor concern. With the growth of cloud-based services, one can envision storing DejaView's digital record in the cloud, or even providing its functionality for standard remote desktop services. In this model, the user's data is no longer in her possession, and becomes more vulnerable to breaches of security or trust. For example, governments have the power to insist that information that exists is made available to them. Nevertheless, the additional risk is comparable to the privacy risk associated with standard web services such as email, password management, and backup tools, which manage sensitive personal data and have already become ubiquitous.

Recording a user's desktop experience can be viewed as a form of lifelogging, the undiscriminating collection of information concerning one's life and behavior [126]. However, while early lifelogging experiments focused on private use, this is no longer the case today. The rapid rise of social networking practices, where users voluntarily generate and share information with others through personal blogs and web services such as YouTube, Flickr, and Facebook, continuously blurs the boundary between what is private and what is not. Social networking, however, differs from lifelogging in that users actively choose the content to share and the audience to share it with, revealing as much or as little information about themselves as they care to post. Similarly, DejaView could be enhanced with an interface to allow better control over its recording. For instance, it could allow users to stop and resume recording, or discard records, similarly to how journalists choose to go “off the record”. It could also allow users to select what part of the recorded information may become public.

In an information-intensive age where the surrender of digital identity is commonplace, for purposes such as commerce, marketing, social networking, or receipt of services, personal knowledge management is an issue for anyone who uses digital technologies. Addressing the larger privacy and security ramifications of DejaView's computing model is beyond the scope of this dissertation.

2.2 Example Scenarios

DejaView's usage model goes beyond traditional desktop and web search tools in several facets. First, it enables search for data even if it is not explicitly saved. Second, it provides useful context that can supplement the user's query in seeking specific records. Third, it provides data about the persistence of certain output on the display. Fourth, it can correlate information across multiple windows. DejaView also goes beyond screencasting in that it provides the ability to revive and interact with a session instead of only viewing a playback of the display record.

These capabilities create a qualitative advantage for DejaView over existing tools. For example, desktop search tools can examine information in individual files (and therefore per application), but cannot tell which parts of a file's contents appeared onscreen and at what times. Nor do they integrate information from multiple sources. A multi-term query where each term is found exclusively in a single file would fail to provide relevant matches with these tools, but would succeed under DejaView. The following scenarios illustrate these concepts using concrete examples.

2.2.1 Parking Ticket Proof

Consider a person who is paying a parking ticket online through a Department of Parking Violations website. The user pays the parking ticket by credit card right before it is due, completes the process, and does not bother to save or print the confirmation page that appears at the end of the transaction. A few weeks later, the user receives a notice from the Department of Parking Violations indicating that the ticket was not paid and a penalty fee has been assessed. The user is certain she paid the parking ticket, but realizes she has no written confirmation to show this because she did not save the transaction confirmation page, which is no longer accessible. She cannot search her desktop files since no file exists corresponding to the transaction. She checks her credit card statement expecting to provide proof using the statement, but the transaction does not appear on her credit card. It appears that an error occurred at some point during the transaction such that it was not properly accounted for by either the Department of Parking Violations or the credit card company, with the former being most likely.

A file-based desktop search tool allows users to search persistent files, email messages, etc., but not the transient information that would be required to resolve the ticket dispute discussed above. In contrast, a search using DejaView incorporates the recorded screen data, and would satisfy the information requirement.

2.2.2 Software Re-install

Users occasionally uninstall and reinstall complex software components for various reasons, e.g., for upgrades or due to misconfigurations. Consider a user who installed and configured a web server to host his personal website. Oftentimes such an install requires significant time and effort, particularly if done for the first time or infrequently. A couple of months later, the web server is hacked. The user has no choice but to reinstall the entire server from scratch in order to ensure that the website remains untainted. During this second install, he faces an error message that he has seen before. More specifically, he remembers that the solution involved reading the web server manual, browsing several online resources, and editing certain configuration files. Unfortunately, repeating the same steps in search of the solution is unlikely to be faster than before; a web search would yield the same results as previously, and a file-based search tool is of little help. The user can instead use DejaView to search for the error message, e.g., by searching for when the error message disappeared from the display, to get to the point in time when he had dealt with the error before, browse the display record to right after the issue had been fixed, and then revive the session to gain access to a version of the configuration files that is known to work.

2.2.3 Objects Not Indexable

Certain classes of desktop objects carry information whose encoding is not textual, and are therefore not easily (or not at all) indexable. Common examples include most forms of multimedia, such as image, audio, and video files. To search for such objects, traditional search techniques generally rely on annotations. In the absence of such annotations, the only way to retrieve such data objects is by using the context from the time they appeared onscreen. Consider a student who is a fan of the comic strip “PhD Comics,” and frequently exchanges links to favorite strips with her fellow PhD students. In preparation for her thesis defense presentation, she wants to retrieve a specific strip. If the strip was not saved under an explicitly meaningful name, and the strip’s URL is obfuscated (as is often the case), then a file-based search is unlikely to produce useful results. However, with DejaView the student can combine multiple contextual hints that she may remember from that time, such as who mentioned the strip to her and what she was working on at the time, into a single query that exploits onscreen context from multiple windows. Furthermore, once found, the old session may be revived to recover a pristine version of that comic strip.

2.2.4 Searching for Generic Terms

Searching for information using generic search terms is challenging, as it requires digging up a few relevant items in a sea of results from a broad range of domains. Adding contextual information to such queries can make them more selective and narrow their scope to the desired breadth. Consider a professor who frequently tries out new restaurants with his wife, particularly on special occasions such as family birthdays and anniversaries. Every year he works hard to select the best restaurant for their anniversary, and usually starts searching for restaurants four weeks before the anniversary. This year, he would like to surpass the expectations set by last year’s selection. He remembers that last year he had researched extensively on the web, and sought recommendations from several friends. A fantastic restaurant had finally been suggested to him on a foodie forum, but he chose not to go there after checking its location in Google Maps. It is getting close to the anniversary, and he would like to give that restaurant a chance, so he wants to retrieve its name. Unfortunately, searching for the term “restaurant” is likely to generate a slew of results even if confined to only his email records, IM records, or browser history. Instead, adding context by searching for multiple terms, such as “restaurant,” “forum,” and “google maps,” and limiting the search to the two-week period before last year’s anniversary is likely to produce a more manageable set of results.

2.2.5 Sysadmin Troubleshooting

To complete certain tasks, users routinely perform complex actions on their desktops that involve multiple applications and windows, often based on ad-hoc decisions. When faced again with a task that they have completed before, users might need to recreate their past actions from scratch, relying on their memory and on poor (or non-existent) documentation. Thus, users are often forced to waste an unnecessarily long time rediscovering their previous actions, even in the presence of state-of-the-art desktop search tools. Consider a graduate student who is in charge of upgrading the Linux installation on one of her research group’s computers. During this upgrade, the student runs into an unusual, hard-to-solve problem. The student spends several hours piecing together a solution through extensive investigation that involves a number of different resources, including manual inspection of files, use of automatic utilities, web search, email search, and IM sessions with colleagues. A few weeks later, the student is asked to upgrade the Linux installation on another machine, and encounters the same problem as before. She has not kept proper documentation of its complex solution, so she will have to spend, once again, several hours retracing her earlier steps. A file-based desktop search tool allows the student to search through individual files, email messages, etc., in isolation, but this is insufficient for the student to rediscover the sequence of multi-window, multi-application actions that eventually led to her solving the problem.

2.3 Architecture Overview

To support its personal virtual computer recorder usage model, DejaView needs to record both the display and the execution of a user’s desktop computing environment such that the record can be played and manipulated at a later time. DejaView must provide this functionality in a manner that is transparent, has minimal impact on interactive performance, can preserve visual display fidelity, and is space efficient. DejaView achieves this by using a virtualization architecture that consists of three main components: a virtual display, a display-centric onscreen contents recorder, and a virtual execution environment. These components leverage existing system interfaces to provide transparent operation without modifying, recompiling, or relinking applications, window systems, or operating system kernels.

DejaView’s virtual display decouples the display state from the underlying hardware and enables the display output to be redirected anywhere, making it easy to manipulate and record. DejaView operates as a client-server architecture and transparently provides a virtual display by leveraging the standard video driver interface, a well-defined, low-level, device-dependent layer that exposes the video hardware to the display system. Instead of providing a real driver for particular display hardware, DejaView introduces a virtual display driver that intercepts drawing commands, records them, and redirects them to the DejaView client for display. All persistent display state is maintained by the display server; clients are simple and stateless. By allowing display output to be redirected anywhere, this approach also enables the desktop to be accessed both locally and remotely, which can be done using a wide range of devices given the client’s simplicity.

DejaView’s display-centric text recorder facilitates real-time access to full onscreen contents, enabling the entire contents seen through the display to be continuously recorded, indexed, and later searched. DejaView leverages the ubiquitous accessibility infrastructure (used by screen readers to provide desktop access for visually impaired users) to provide an intelligent caching architecture that continuously tracks onscreen contents. The cache is updated using a pull model in which multiple screen update notifications from the accessibility infrastructure are coalesced and handled by a single query back to the accessibility infrastructure. This caching architecture is essential for recording with low overhead, without interfering with the user’s interactive experience, while simultaneously achieving accurate coverage of all onscreen data presented to the user. It makes available a complete view of all onscreen data at a given time, including data and metadata associated with both foreground and background windows. Furthermore, by using a mechanism natively supported by applications, DejaView has maximum access to onscreen textual information without requiring any application or desktop environment modifications.

DejaView’s virtual execution environment decouples the user’s desktop computing environment from the underlying operating system, enabling an entire live desktop session to be continuously checkpointed and later revived from any checkpoint. DejaView leverages the standard interface between applications and the operating system to transparently encapsulate a user’s desktop computing session in a private virtual namespace.
This namespace is essential to support DejaView’s ability to revive checkpointed sessions. By providing a virtual namespace, revived sessions can use the same operating system resource names as they did before being checkpointed, even if those names are mapped to different underlying operating system resources upon revival. By providing a private namespace, revived sessions from different points in time can run concurrently and use the same operating system resource names inside their respective namespaces without conflicting with each other. This lightweight virtualization mechanism imposes low overhead because it operates above the operating system instance to encapsulate only the user’s desktop computing session, as opposed to an entire machine instance. By using a virtual display and running its virtual display server inside its virtual execution environment, DejaView ensures that all display state is encapsulated in the virtual execution environment so that it is correctly saved at each checkpoint. Furthermore, revived sessions can then operate concurrently without any conflict over display resources, since each has its own independent display state.

Building upon its core virtualization architecture, DejaView provides recording tools to save the display and execution state of the desktop, and playback tools to view, manipulate, and interact with this recorded state.

Figure 2.2 – An overview of DejaView architecture.

Three sets of desktop state are recorded at all times. The first consists of all visual output generated by the desktop, which allows users to quickly browse and playback recorded content. The second consists of all onscreen text and associated contextual information, including data from both foreground and background windows, which allows users to quickly search and locate recorded content. The third consists of all the application and file system state of the desktop, which allows users to revive their desktop as it was at any point in the past. Revived sessions behave just like the main desktop session, and users are free to continue to interact with them and possibly diverge from the path taken in the original recording. Multiple sessions can coexist since sessions are completely isolated from each other.

Figure 2.2 summarizes the main components of DejaView. The figure depicts an overview of the system architecture, showing the user’s desktop session decoupled from the operating system through a thin virtualization layer. The session consists of regular desktop applications and a display server. The display, the onscreen contents, and the execution state are continuously recorded to permanent storage, and the text is indexed. Revived sessions can coexist with the present session, each isolated in a separate virtual execution environment. The user interacts with any desktop session through a viewer that connects to the display server. The viewer can also access the recorded data to perform actions such as search, browse, and playback.

Recording of visual output generated on the desktop is crucial to DejaView’s operation. Although execution recording can re-generate any past state, including the display state, it cannot obviate the explicit recording of visual output, for two reasons. First, it is only possible to revive sessions from the discrete points in time at which checkpoints were taken. Therefore, one cannot guarantee that the state in between two consecutive checkpoints will be accurately regenerated. (Note, however, that this can be rectified by adding deterministic record and replay capabilities [98].) More importantly, revived sessions cannot produce the necessary state for operations such as browse and playback fast enough for interactive use.

Chapter 3

Display Recording and Search

In this chapter, we present DejaView’s display-centric recording architecture as well as the mechanisms available to access the recorded information. We give an overview of how visual output is recorded using display virtualization that decouples the display state from the underlying hardware, making it easy to manipulate and record. The generated visual record can be browsed and replayed at full fidelity at a later time. Then, we describe how onscreen contents are recorded using a display-centric text recorder that leverages the accessibility infrastructure available on modern operating systems to obtain onscreen text and associated contextual data. DejaView continuously logs the data in a database as records whose contents reflect the onscreen text and contextual information at particular moments. The database is enhanced with text search capabilities, making it easy to index and search the data. We also describe an alternative approach that instead stores the data in structured text documents and then leverages mature desktop search engine technologies to index and later answer search queries about the data. Using such text-shots not only provides more flexibility than databases, which require fixed schemas, but also preserves valuable contextual information about the saved text that can be used for better information retrieval.

3.1 Display Recording

To record the display, DejaView virtualizes the display to capture and log low-level display commands, enabling them to be replayed at full fidelity at a later time. DejaView leverages previous work on the THINC [14, 15] virtual display architecture to display and record visual output simultaneously. In particular, generated visual output is duplicated into a stream for display by the viewer, and a stream for logging to persistent storage. Both streams use the same set of commands (specifically the THINC display protocol commands), enabling both efficient storage and quick playback. Since display records are just collections of display commands, the display record can be easily replayed either locally or over the network using a simple application similar to the normal viewer.

DejaView can easily adjust the recording quality in terms of both the resolution and frequency of display updates without affecting the output to the user. Using THINC’s screen scaling ability, the display can be resized to accommodate a wide range of resolutions. For example, the display can be resized to fit the screen of a PDA even though the original resolution is that of a full desktop screen. The recorded commands are resized independently, so a user can have the recorder save display output at full screen resolution even if she is currently viewing it at a reduced resolution to accommodate a smaller access device. The user can then go back and view the display record at full resolution to see detailed information that may not have been visible when viewed on the smaller device. Similarly, the user can reduce the resolution of the display commands being recorded to reduce its storage requirements. The user can also limit the frequency at which updates are recorded by taking advantage of THINC’s ability to queue and merge display commands so that only the result of the last update is logged.

3.1.1 Record

DejaView records display output as an append-only log of THINC commands, where recorded commands specify a particular operation to be performed on the current contents of the screen. DejaView also periodically saves full screenshots of the display, for two reasons. First, it needs a screenshot to provide the initial state of the display that subsequent recorded commands modify. Second, if a user wants to display a particular point in the timeline, DejaView can start with the closest prior screenshot and only replay a limited number of commands, thereby enabling desktop session browsing at real-time speeds. DejaView records display output in a manner similar to an MPEG movie, where screenshots represent self-contained independent frames from which playback can start, and commands in the log represent dependent frames that encode a change relative to the current state of the display. Since screenshots consume significantly more space, and they are only required as a starting point for playback, DejaView only takes screenshots at long intervals (e.g., every 10 minutes) and only if the screen has changed enough since the previous one.

By using display protocol commands for recording, DejaView ensures that only those parts of the screen that change are recorded, so that the amount of display state saved scales only with the amount of display activity. If the screen does not change, no display commands are generated and nothing is recorded. The virtual display driver knows not only which parts change, but also how they change. For example, if the desktop background is filled with a solid color, DejaView can efficiently represent this in the record as a simple solid fill command. In contrast, regularly taking snapshots of the full screen would waste significant processing and storage resources, as even the smallest of changes, such as the clock moving to the next second, would trigger a new screenshot.

It could be argued that the screenshots could be compressed on the fly using a standard video codec, which could convert a sequence of screenshots into a series of smaller differential changes. However, this additional computation significantly increases the overhead of the system and may not provide a desirable tradeoff between storage and display quality for the synthetic content of desktop screens. In contrast, DejaView’s approach knows precisely what changes, what needs to be saved, and the best representation to use when saving it.
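For illustration, the screenshot policy described above combines a time interval with a change test. In the following sketch the 10-minute interval comes from the text, while the change threshold and all names are hypothetical assumptions, not DejaView’s actual parameters:

    def should_snapshot(now, last_snapshot_time, changed_fraction,
                        interval=600.0, threshold=0.05):
        """Take a full screenshot only if enough time has passed (the text
        suggests e.g. 10 minutes) AND enough of the screen has changed since
        the previous screenshot. The 5% change threshold is an assumption."""
        return (now - last_snapshot_time >= interval
                and changed_fraction >= threshold)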

DejaView indexes recorded command and screenshot data using a special timeline file that is used to quickly locate the screenshot associated with a given time. This file consists of chronologically ordered, fixed-size entries of the time at which a screenshot was taken, the file location in which its data was stored, and the file location of the first display command that follows that screenshot. This organization allows for fast playback over the recorded data as described in § 3.1.2.

DejaView uses three types of files to store the recorded display output: timeline, screenshot, and command. All three types of files are written to in an append-only manner, ensuring that the records are always ordered by time. This organization speeds up both recording and playback. While recording, DejaView does not incur any seeking overhead. During playback, binary search can be used on the index file to quickly locate the records of interest.

A timeline file contains all the metadata required for playback. This file is a collection of tuples of the form [time, screenshot, command], where each tuple represents a point in the timeline where a screenshot was taken, and can be used to start playback. The command component represents the next command that was recorded after the screenshot was taken. Both screenshot and command are tuples of the form [filename, file position], and represent pointers to where the actual data for the screenshot and command is stored: the filename of the appropriate file, and the offset within that file where the information is stored.

Screenshot files hold the actual screenshot data. They are organized as a contiguous set of records, where each record is a tuple of the form [type, time, size, dimensions, data]. type specifies whether the record is a screenshot or a reference to the next screenshot file. time specifies the time at which the screenshot was recorded. size specifies the data size of the screenshot. dimensions specifies the dimensions of the screenshot, allowing changes of the display’s geometry to be appropriately recorded. data is the actual screenshot data.

Command files contain the stream of display commands. In the same manner as screenshot files, each command is stored as a serialized record of the form [type, time, size, data]. type specifies the type of THINC display command. time specifies the time at which the command was recorded. size specifies the data size of the command. data is the actual command data.

DejaView allows for multiple screenshot and command files to be used if needed or desired, for example for systems with maximum file sizes that could be exceeded by long-running desktop recording sessions. A simple mechanism is used to concatenate these files into a continuous logical stream. At the end of each file, a special record is appended that points to the next file in the stream. The record has the same format as other records. It uses the type field to mark itself as an end-of-file/next-file marker, and the data component to store the next filename. As playback occurs, this record is read just like any other record, but causes the playback program to start reading from the next file and continue its operation.
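The dissertation fixes the logical layout of these records but not their on-disk encoding. The following Python sketch assumes one concrete encoding for timeline entries, purely for illustration; the field widths, byte order, and names are assumptions rather than DejaView’s actual format:

    import struct

    # Hypothetical fixed-size encoding of one timeline entry:
    # [time, (screenshot file, offset), (command file, offset)].
    # 64-bit little-endian integers and 32-byte filename slots are assumed.
    TIMELINE_ENTRY = struct.Struct("<q32sq32sq")

    def read_timeline(path):
        """Return the chronologically ordered timeline entries of a recording."""
        entries = []
        with open(path, "rb") as f:
            while True:
                raw = f.read(TIMELINE_ENTRY.size)
                if len(raw) < TIMELINE_ENTRY.size:
                    break  # end of the append-only file
                t, ss_file, ss_off, cmd_file, cmd_off = TIMELINE_ENTRY.unpack(raw)
                entries.append((t, (ss_file.rstrip(b"\0"), ss_off),
                                   (cmd_file.rstrip(b"\0"), cmd_off)))
        return entries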

3.1.2 Playback

Visual playback and search are performed by the DejaView client. Various time-shifting operations are supported, such as skipping to a particular time in the display record, and fast forwarding or rewinding from one point to another. To skip to any time T in the record, DejaView uses binary search over the timeline index file to find the entry with the maximum time less than or equal to T. Once the desired entry is found, DejaView uses the entry’s screenshot information to access the screenshot data and use it as the starting point for playback. Subsequently, it uses the entry’s command information to locate the command that immediately follows the recovered screenshot. Starting with that command, DejaView processes the list of commands up to the first command with time greater than T. DejaView builds a list of commands that are pertinent to the contents of the screen by discarding those that are overwritten by newer ones, thus minimizing the time spent in the playback operation. The list is ordered chronologically to guarantee correct display output. After the list has been pared of irrelevant commands, each command on the list is retrieved from the corresponding files and displayed.

To play the display record from the current display until time T, DejaView simply plays the commands in the command file until it reaches a command with time greater than T. DejaView keeps track of the time of each command and sleeps between commands as needed to provide playback at the same rate at which the session was originally recorded. DejaView can also play back faster or slower by scaling the time interval between display commands. For example, it can provide playback at twice the normal rate by allowing only half as much time as specified to elapse between commands. To play back at the fastest rate possible, DejaView ignores the command times and processes them as quickly as it can. Except for the accounting of time, the playback application functions similarly to the DejaView viewer in processing and displaying the output of commands.
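As a minimal sketch, the skip operation’s index lookup can be expressed over the entries returned by the hypothetical read_timeline above; bisect locates the rightmost screenshot taken at or before T:

    import bisect

    def find_start(entries, t):
        """Locate the last timeline entry with time <= t; playback starts from
        its screenshot and replays commands up to time t."""
        times = [e[0] for e in entries]        # append-only, so already sorted
        i = bisect.bisect_right(times, t) - 1
        return entries[i] if i >= 0 else None  # None: t precedes the record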

To fast forward from the current display to time T, DejaView reads the timeline index file and plays each screenshot in turn until it reaches a screenshot with time greater than T. It then finds the tuple in the timeline file with the maximum time less than or equal to T, which corresponds to the last played screenshot, and uses the tuple to find the corresponding next display command in the command file. Starting with that command, DejaView plays all subsequent commands until it reaches a command with time greater than T. Rewind is done in a similar manner, except going backwards in time through the screenshots.

3.2 Content Recording and Search

In addition to visual output, DejaView records onscreen text and associated contextual information by capturing all text that is displayed on the screen and using it as an index to the display record. Contextual information includes data such as the onscreen text, the window that it came from, the duration in which the text appeared on the screen, and so on.

Various screen capture tools have been developed to capture and process display-centric content. They can take a visual snapshot or movie of the display or portions of the display, which can be saved to storage for later viewing. However, the resulting snapshot is simply a set of pixels with no semantic information. For example, any notion of text that originally appeared on the screen is lost, making it impossible to cut, paste, search, or otherwise manipulate textual content that can provide semantic meaning. Furthermore, because there is a wide array of application-specific mechanisms used for rendering text, capturing textual information from display commands is often not possible. Optical character recognition (OCR) could be attempted to extract such information from the saved screenshots, but this is at best a very slow and inaccurate process that cannot support real-time capture of onscreen content.

Instead, DejaView leverages ubiquitous accessibility mechanisms provided by most modern desktop environments and widely used by screen readers to provide desktop access for visually impaired users [1]. These mechanisms are typically incorporated into standard GUI toolkits, making it easy for applications to provide basic accessibility functionality. DejaView uses this infrastructure to obtain both the text displayed on the screen and useful context, including the name and type of the application that generated the text, window focus, and special properties about the text (e.g., if it is a menu item or an HTML link). By using a mechanism natively supported by applications, DejaView has maximum access to textual information without requiring any application or desktop environment modifications.

3.2.1 Text Capture and Indexing

DejaView uses a daemon to collect the text on the desktop and index it in a database that is augmented with a text search engine. At the most basic level, the daemon behaves very similarly to a screen reader, as both programs share the functional requirement of capturing textual content off of the display. However, while screen readers interact directly with the accessibility infrastructure, DejaView collects the textual content through Capture, a display-centric text recorder that facilitates real-time access to all onscreen text and associated contextual data. We discuss Capture in detail in Chapter 4.

At startup time, the daemon registers with Capture and asks it to deliver notifications when new text is displayed or existing text on the screen changes. As notifications are received in response to changes in the displayed contents and state, the daemon wakes up, collects the new text and state, and inserts a full snapshot of all the onscreen text into the database. Each row in the database records time-stamped text from a single application and its contextual data, such that a sequence of rows with a common timestamp composes the entire snapshot. In this manner, DejaView can efficiently search for text that appeared within specific applications, as well as determine the origin of text occurrences in search results.

By storing and indexing the full state of the desktop’s text over time, rather than only new or modified text as changes occur, DejaView is able to access the temporal relationships and state transitions of all displayed text through database queries. Consider, for example, a user who is looking for the time when she started reading a paper, but all she recalls is that a particular web page was open at the same time. If text were stored and indexed only when it first appeared on the screen, but not thereafter when additional text appears, the temporal relationship between the web page and the paper would never have been recorded (each would be indexed separately with distinct timestamps), and the user would be unable to find the content of interest. DejaView’s indexing strategy also allows it to infer text persistence information that can be used as a valuable ranking tool. For example, a user could be less interested in those parts of the record when certain text was always visible, and more interested in the records where the text appeared only briefly.

A limitation of this approach is that not every application provides an accessibility interface. For example, while DejaView can capture text information from PDF documents that are opened using the current version of Adobe Acrobat Reader, other PDF viewers used in Linux do not yet provide an accessibility interface. However, our experience has been that most applications do not suffer from this problem, and there is an enormous impetus to get accessibility interfaces into all desktop applications to provide universal access. The needs of visually impaired users will continue to be a driving force in ensuring that applications increasingly provide accessibility interfaces, enabling DejaView to extract textual information from them.
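For concreteness, the following sketch shows this storage step. The dissertation does not name the database engine or schema, so SQLite with full-text search and all column names here are illustrative assumptions:

    import sqlite3
    import time

    # A minimal sketch of the indexing daemon's storage step; the engine,
    # table, and column names are assumptions made for illustration.
    db = sqlite3.connect("dejaview.db")
    db.execute("""CREATE VIRTUAL TABLE IF NOT EXISTS snapshots
                  USING fts4(ts, app, title, focused, text)""")

    def record_snapshot(windows):
        """Insert one full desktop snapshot: one row per application window,
        all rows sharing a common timestamp, so a query can reassemble the
        entire onscreen state at that moment."""
        ts = str(int(time.time()))
        db.executemany(
            "INSERT INTO snapshots (ts, app, title, focused, text) "
            "VALUES (?, ?, ?, ?, ?)",
            [(ts, w["app"], w["title"], str(w["focused"]), w["text"])
             for w in windows])
        db.commit()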

Additionally, DejaView could be enhanced to apply mature OCR technology to the display recording to derive text from the (increasingly rare) applications for which accessibility interfaces are either nonexistent or unreliable. Using OCR can also benefit display objects whose natural encoding is graphical rather than textual. In this way, DejaView could extract textual information from display objects, such as images in formats like JPEG, PNG or EXIF, that may represent photos of street signs, cartoons with word balloons, or scanned documents. The actual OCR processing can be done in the background, or deferred offline to idle periods such as overnight, to minimize its performance impact on user activity.

3.2.2 Search with Database

In addition to standard PVR-like functionality, DejaView provides a mechanism that allows users to quickly search recorded display output. Unlike state-of-the-art desktop search tools (e.g., [9, 43, 44, 45]), DejaView does not retrieve individual files such as user files, email messages, web pages, or chat sessions. Instead, a DejaView keyword search returns full desktop snapshots, each reflecting the state of the desktop at some point in time and including all application windows and their associated text and other relevant features.

DejaView’s search uses the database and index built from captured text and contextual information to find and return relevant results. In the simplest case, DejaView allows users to perform simple boolean keyword searches in the database, which will locate the times in the display record in which the query is satisfied. More advanced queries can be performed by specifying extra contextual information. A useful facility users have at their disposal is the ability to tie keywords to applications they have used or to the whole desktop. For example, a user may look for a particular set of words limited to just those times when they were displayed inside a Firefox window, and further narrow the search by adding the constraint that a different set of words be visible somewhere else on the desktop or in another application. Users can also limit their searches to specific ranges of time or to particular actions. For example, a user may search for results only on a given day and only for text in applications that had the window focus. A full study of how desktop contextual information can be used for search is beyond the scope of this dissertation.

Another search mechanism is provided through annotations. At the most basic level, annotations can be created by the user simply by typing text in some visible part of the screen, since the indexing daemon will automatically add it to the record stream. However, the user may have to provide some unique text that will allow the annotation to stand out from the rest of the recorded text. To help users in this case, DejaView provides an additional mechanism that takes further advantage of the accessibility infrastructure. To explicitly create an annotation, the user can write the text, select it with the mouse, and press a key combination that signals the indexing daemon to associate the selected text with an annotation attribute in the database. The indexing daemon is able to provide this functionality transparently since both text selection and keystroke events can be delivered by the accessibility infrastructure.

Search results are presented to the user in the form of a series of text snippets and screenshots, ordered according to several user-defined criteria. These include chronological ordering, persistence information (i.e., how long the text was on the screen), number of times the words appear, and so on. The search is conducted by first passing a query to the database, which results in a series of timestamps where the query is satisfied. These timestamps are then used as indices into the display stream to generate screenshots of the user’s desktop. Screenshot generation is very similar to the visual playback described in § 3.1.2, with the difference that it is done completely offscreen, which helps speed up the operation. DejaView also caches screenshots for search results using an LRU scheme, where the cache size is tunable. This provides significant speedup in cases where the user has to repeatedly go back to specific points in time. Each screenshot generated is a portal through which users can either quickly glance at the information they were looking for or, by simply pressing a button, revive their desktop session as it was at that particular point in time. In addition, when the query is satisfied over a contiguous period of time, the result is displayed in the form of a first-last screenshot, which, borrowing a term from Lifestreams [60], represents a substream in the display record. Substreams behave like a typical recording, where all the PVR functionality is available, but restricted to that portion of time.
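Continuing the illustrative SQLite schema sketched in § 3.2.1, a contextual query that ties keywords to an application and a time range could look as follows; the schema and names remain assumptions rather than DejaView’s actual implementation:

    def search(keywords, app=None, t0=None, t1=None):
        """Boolean keyword search over the snapshot table, optionally scoped
        to one application and a time range; returns the distinct snapshot
        timestamps at which the query is satisfied."""
        sql = "SELECT DISTINCT ts FROM snapshots WHERE text MATCH ?"
        args = [keywords]            # e.g. "restaurant forum" (implicit AND)
        if app is not None:
            sql += " AND app = ?"
            args.append(app)
        if t0 is not None:
            sql += " AND ts >= ?"
            args.append(str(t0))
        if t1 is not None:
            sql += " AND ts <= ?"
            args.append(str(t1))
        return [row[0] for row in db.execute(sql + " ORDER BY ts", args)]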

3.2.3 Search with Text-shots

Using a database augmented with text search capabilities is a first step toward efficient storage and indexing of the displayed text. However, the approach has two shortcomings. First, the database schema defines the set of contextual properties stored with text records, and it must be chosen in advance; but contextual data is diverse and quite often application-specific, making it difficult to formulate a schema comprehensive enough to cover all the possibilities. Moreover, because the schema is inflexible, it is impossible to extend the contextual data it embodies in response to new features that appear as accessibility interfaces continue to improve.

More importantly, application text typically comes from the multiple GUI components that compose the application windows, including menu items, window titles, document headings, and text paragraphs, each component associated with its own contextual data. Grouping the entire displayed text of an application into a single text record in the database discards much of the contextual information associated with that text. Only “global” application data is preserved, e.g., window location and dimensions and whether it has the focus, while valuable information such as the role of text components and whether some text was selected (highlighted) is completely lost.

To address these limitations and support DejaView’s search model of returning full desktop snapshots in an efficient manner, DejaView can be enhanced to leverage mature desktop search engine technologies such as Indri [78] and Xapian [184]. For example, we can consider each snapshot as a single text document for indexing, by simply “gluing together” the text extracted from all the application windows associated with the snapshot. We can then apply any state-of-the-art information retrieval strategy [110] to answer keyword queries over these “documents.” Unfortunately, this scheme is likely to produce suboptimal query results in many cases, because it ignores the display structure in which the extracted text occurred. In particular, this method of text aggregation effectively strips out the rich contextual information obtained via the accessibility interfaces about the text. For example, this scheme will not detect whether query keywords co-occurred in a single application window or were scattered across the display, which may be valuable cues to inform the search ranking strategy.

Instead, we can aggregate the onscreen text into the documents in a manner that preserves the structure of the contents. Specifically, the text documents associated with the desktop snapshots can be formatted using XML, so that the desktop context and structure can be captured flexibly and in turn used for varying ranking strategies. The use of XML offers two key benefits. First, it is flexible and can evolve to accommodate new forms of contextual data exposed by the accessibility interfaces.

Second, it provides natural tagging of the data that is encapsulated, which can be used for indexing and ranking strategies. For example, tags could indicate whether a given text came from a menu item or from an input box, and the text from different tags can be ranked differently, according to their expected relevance. In this way, DejaView would log the onscreen contents into time-stamped text-shots, structured text documents whose contents reflect the onscreen text and associated contextual information at particular moments. By keeping the structure, information is recorded as it was displayed, with the same personal context and display layout. It can then be indexed based on displayed text and contextual information captured in the same context as the recorded display data. Therefore, text-shots carry valuable data about the display structure that can be used to deliver better search results to the user. Further exploration of these ideas is beyond the scope of this dissertation and is an interesting topic for future work.
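As a sketch of what such a text-shot might look like, the following fragment serializes one desktop snapshot into XML. The element and attribute vocabulary is purely illustrative, since the dissertation deliberately leaves the exact schema open:

    import time
    import xml.etree.ElementTree as ET

    def make_textshot(windows):
        """Serialize one desktop snapshot as a structured text-shot; the
        textshot/window/text vocabulary is an assumption for illustration."""
        root = ET.Element("textshot", time=str(int(time.time())))
        for w in windows:
            win = ET.SubElement(root, "window", app=w["app"],
                                title=w["title"], focused=str(w["focused"]))
            # one element per GUI component, so per-component context survives
            for comp in w["components"]:  # e.g. menu items, headings, paragraphs
                node = ET.SubElement(win, "text", role=comp["role"])
                node.text = comp["text"]
        return ET.tostring(root, encoding="unicode")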

3.3 Summary

In this chapter, we presented DejaView’s display-centric recording, which records all viewed output with the same personal context and display layout so that it can be browsed and replayed, and indexes this record based on displayed text and contextual data captured in the same context, so that it can be searched effectively. Visual display recording utilizes a virtual display architecture to duplicate the visual output into a stream for display and a stream for logging to persistent storage, and provides transparent recording with very low storage requirements and performance impact. Onscreen content recording utilizes existing accessibility interfaces to simultaneously capture displayed text and contextual information, which we discuss in depth in the next chapter. We presented two approaches for indexing and searching onscreen information. The first approach uses a database enhanced with text search capabilities, making it easy to index and search the data. The second approach stores data in structured text documents called text-shots and leverages mature desktop search engine technology to search these documents. Both approaches track the full state of onscreen content over time, rather than only new or modified text, to provide access to temporal relationships and state transitions of all textual content.

Chapter 4

Display-Centric Content Recording

In this chapter, we introduce DejaView’s display-centric content recording architecture and how it uses the accessibility framework to obtain onscreen textual and contextual information, making it easy to index the visual display record so that it can be effectively searched later. Screen readers, too, use accessibility tools to access onscreen data, but are limited to the foreground window only and would produce intolerable overhead were they to continuously track the entire onscreen contents. We present a novel display-centric content recorder that facilitates real-time access to all onscreen data based on an intelligent caching architecture that integrates with the standard accessibility framework. This enables fast, semantic information recording without any modifications to existing system components. We have implemented a prototype of the content recorder and present quantitative results on real desktop applications in terms of runtime overhead and accuracy of capturing onscreen data. We show that our prototype imposes only modest performance overhead on a wide range of desktop applications, while being able to effectively record all onscreen textual data even when applications are generating updates to onscreen data at a high frequency.

4.1 Display-Centric Text Recording

Computer displays are the primary means of output for users. The rich visual display output of computers, which has only become bigger and better over time in its ability to show visual content to users, is a good match for the human visual system and has vastly improved the human-computer interaction experience. However, this form of output is, unfortunately, not a good match for enabling computers to process the information content. The ability of computers to meaningfully do so has lagged far behind the rate at which such information is being generated. This mismatch is a growing problem, as more and more information is designed only for human visual consumption and not available in other ways, and display-centric information is being generated at such an increasing rate that users themselves cannot keep up with the growing flow of information. Even if our eyes can momentarily capture the display output, our brains are often not able to keep up with all of it. To address this problem, computers must be enhanced in their own ability to process display-centric content, to help users with all the information they see through their computers.

Screen readers have been developed to process display-centric content, especially for visually impaired users. They leverage the accessibility framework available on modern operating systems to provide some access to onscreen data. However, they are generally limited to watching changes to the foreground window only, or else they require the user to actively define hot spots, physical regions of the screen to scan for changes. Requiring blind and partially sighted users to specify these physical screen locations is at best cumbersome, and in many scenarios it is difficult to select a cleanly differentiated region of the screen to watch. The end result is that only significantly reduced display-centric content is available to visually impaired users, and this gap in available information content only grows further as greater and richer display information is available onscreen but not via screen readers.

We introduce Capture, a novel display-centric text recorder that facilitates real-time access to both foreground and background onscreen text with low overhead. Capture provides an intelligent caching architecture that integrates with the accessibility framework available on modern operating systems. The caching architecture reduces the need to query the accessibility framework for onscreen data, which is important as these queries can impose high runtime overhead. Query results are cached so that additional requests for the same onscreen data are serviced by the cache instead of requiring additional queries. The cache is updated using a pull model in which multiple screen update notifications from the accessibility framework are coalesced so that they can be handled by a single query to the accessibility framework instead of multiple queries. This results in two key benefits. First, the caching architecture mitigates the high performance costs of using the accessibility framework while retaining its full functionality for accessing onscreen data. Onscreen data, as well as the structure of the onscreen data, is efficiently cached. Second, the caching architecture makes available not just the results of one query but a complete view of all onscreen data at a given time.
All display-centric content is available for use in real time, including text and contextual data associated with both foreground and background windows.

Capture’s unique combination of performance and functionality enables new ways of using onscreen content. The recorded data can benefit a variety of problem domains beyond that of DejaView, including assistive technologies, desktop search, auditing, and predictive graphical user interfaces. For example, screen readers can leverage Capture to provide access to display-centric content previously unavailable to visually impaired users. Capture does not just provide access to the latest changes to a foreground window; it can provide visually impaired users access to all of the onscreen content. The information recorded by Capture could be used as a virtual hot spot, notifying the user of changes to specific desktop entities in a manner much easier than identifying physical screen locations. For example, users can be notified when specific applications and windows change, when GUI components change, such as when a button appears onscreen, when a status bar has reached 100% or has been halted partway for a period of time, or when a particular phrase appears anywhere on the desktop. Such a notification mechanism can enable blind or partially sighted users who rely on audio translation to more easily do out-of-order inspection of a document and avoid the need for periodic manual queries of physical screen locations. Capture can furthermore help such users more easily get an overview of an unfamiliar desktop, enabling them to better integrate into a workplace where computers are shared with colleagues.

4.2 The Accessibility Framework

Assistive technologies help people with disabilities interact with computers by adapting the computer to the user’s abilities. For instance, screen readers convert onscreen displayed contents to speech, an essential aid for visually impaired users. Assistive technologies act as an intermediary between the application and the user. They provide special purpose forms of input and output, and monitor and control the program being run. To do so, assistive technologies must be able to access information from the applications. The accessibility framework is the infrastructure that connects assistive technologies and applications.

The relationship between assistive technologies, the accessibility framework, and desktop applications is illustrated in Figure 4.1. The accessibility framework acts as middleware between assistive technologies and applications. It provides a standard, consistent mechanism for applications and observer programs to exchange information, and allows applications to expose rich information about the state of their GUI. Observer programs navigate and interrogate the collection of accessibility objects exposed by the applications. Observer programs also register for events, used to signal application state changes, in order to learn about and respond to GUI changes. For example, screen readers can be exposed to the type, name, location and current state of all GUI objects of an application, and be notified of any desktop event that leads to a user interface change.

Figure 4.1 – Desktop accessibility framework

The accessibility framework operates as a distributed system, in which applications and observer programs are independent entities that communicate through message passing. Observer programs and applications interact in a client-server manner. Applications play the role of the server, responding to requests for user interface elements, and sending event notifications to subscribed clients. Observer programs, such as accessibility aids, are the clients, requesting information from the applications to traverse the applications’ GUI components and extract their state.

In the accessibility framework, user interface elements on the screen are represented as a tree of accessibility objects, which usually closely mirrors the GUI tree structure.

Role           Description
Application    An application instance
Frame          A top level window with a title bar, menu, etc.
List           A list of user-selectable objects
List item      An object that is an element of a list
Menu           An object found inside a menu bar
Menu item      A selectable action in a menu
Page tab list  A series of panels where one is selectable
Page tab       A selectable panel from a series
Terminal       An object that emulates a teletype or terminal
Text           A generic object that displays text

Table 4.1 – A subset of GNOME accessibility roles

In this tree structure, the desktop is the root, application windows are immediate children, and elements within applications, including window tabs, frames, menus, buttons and others, are further descendants. Navigation between elements is mainly hierarchical: from parents to children and between siblings. Navigation between siblings is often logical in the context of ordered lists, such as menu items and window tabs. Contextual navigation is also possible, depending on the application, such as between the description and the URL of a hyperlink.

Every accessibility object has a role attribute that describes the corresponding element in the user interface. Table 4.1 lists a subset of the roles defined in GNOME [1]. An object’s role provides context for the information exposed by the object. For instance, a tab list represents a series of page tabs, and its children are tab objects. Interaction with an object depends on its role. For example, a list object would report selected items, while a text object would simply report its contents. Some roles may be static, e.g., menu items that contain the same text indefinitely, while others, e.g., terminals, may update their state frequently.

Accessible objects additionally keep a set of properties and states. These provide specific information about the object, such as its name, screen position, text contents and text selections, and display state such as whether it is visible, in focus, or iconified. Table 4.2 lists a subset of accessibility properties in GNOME.

Property     Description
Name         A string representing the object or its contents
Visible      Whether the object is visible
Showing      Whether the object and its ancestors are visible
Focused      Whether the object owns the focus
Iconified    Whether the frame has been shrunk or iconified
Selected     Whether the object is selected by the user
Text caret   Position of the text caret in the text
Attributes   A set of text attributes that apply to the object
Text         The text string of the object
Extents      The coordinates and geometry of the object

Table 4.2 – A subset of GNOME accessibility properties

The exact meaning of these properties may vary depending on the role of an object. Changes to objects are communicated via event notification. Events are triggered for a variety of reasons. Examples include creation and removal of user interface elements, changes to text, cursor or mouse movement, text selection, state toggles between visible and invisible, focus shifts between window tabs, and loading of a new document. Table 4.3 lists a subset of accessibility events in GNOME. When the user interacts with the desktop, objects routinely join and leave the tree, while during idle times the tree remains mostly intact, save for occasional background activities. By and large, changes to a specific object do not modify the tree structure. For instance, typing text into an input box changes the state of the corresponding object only.

Figure 4.2 illustrates this representation for the GNOME text editor. It shows a screenshot of both the GUI window and a part of the corresponding accessibility tree (which was extracted with the Accerciser program in GNOME). In the latter, the left column gives the objects’ names, indentation indicates parent-child relationships, the middle column indicates the role of each object, and the right column gives the number of children that an object has.

Event                              Description
document:load-complete             Document has finished loading
document:reload                    Document has finished reloading
document:attributes-changed        Document’s (text) attributes have changed
object:children-changed:add        A child object was added
object:children-changed:remove     A child object was removed
object:visible-data-changed        Data visible to the user has changed
object:text-changed:insert         One or more characters were inserted
object:text-changed:delete         One or more characters were deleted
object:text-selection-changed      Selected or highlighted text has changed
object:text-caret-moved            Position of the text cursor has changed
object:state-changed:defunct       Object was destroyed
object:state-changed:focused       Object has gained focus
object:state-changed:iconified     Object was minimized to the taskbar
state-changed:selected             Object was selected by the user
window:minimize                    Frame was minimized to the taskbar
window:maximize                    Frame was maximized to fill the screen
window:restore                     Frame was restored from its minimized state

Table 4.3 – A subset of GNOME accessibility events

The accessibility framework is an integral part of modern desktops and is integrated into standard GUI toolkits [2, 114]. Standard widgets have default implementations provided by accessibility services in these toolkits. Custom widgets require that accessibility be specifically supported by the application. Applications without a GUI can also expose their state or text-based UI to make them accessible. Some applications, such as Firefox and OpenOffice, use their own accessibility implementations, which in addition to the user interface may also expose the structure of the documents browsed and displayed.

(a) A screenshot of GEdit’s GUI window

(b) A screenshot of part of GEdit’s accessibility tree

Figure 4.2 – GUI window of GEdit and part of the matching accessibility tree
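For readers unfamiliar with the framework, the following sketch uses the pyatspi bindings to GNOME’s AT-SPI to walk the accessibility tree (much as Accerciser does for Figure 4.2) and to subscribe to one of the events in Table 4.3. It merely illustrates the framework’s API and is not DejaView’s implementation:

    import pyatspi  # GNOME AT-SPI bindings; this is only an API illustration

    def dump_tree(node, depth=0):
        """Recursively walk an accessibility subtree, printing each object's
        name and role, essentially what Accerciser shows in Figure 4.2."""
        print("  " * depth + (node.name or "?") + " [" + node.getRoleName() + "]")
        for i in range(node.childCount):
            child = node.getChildAtIndex(i)
            if child is not None:
                dump_tree(child, depth + 1)

    def on_text_change(event):
        # event.source is the accessible object whose text changed
        print("text changed in:", event.source.name)

    desktop = pyatspi.Registry.getDesktop(0)   # root of the accessibility tree
    for app in desktop:                        # applications are its children
        dump_tree(app)

    # subscribe to one of the events of Table 4.3, then dispatch notifications
    pyatspi.Registry.registerEventListener(on_text_change, "object:text-changed")
    pyatspi.Registry.start()                   # blocks, delivering events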

4.3 Architecture

The primary aim of Capture is to provide display-centric text recording that facilitates real-time access to onscreen textual content. It must accomplish this with low overhead so as not to interfere with the user’s interactive experience, while simultaneously achieving accurate coverage of all onscreen data presented to the user. Capture should also provide this functionality transparently, without any modifications to applications, window systems, or operating system kernels. To achieve these goals, Capture builds on the standard accessibility framework available on modern operating systems, which provides the ability to retrieve textual and contextual information from the display.

The desktop accessibility framework is paramount to the transparency of Capture, but suffers from an important limitation. Because it is a distributed architecture based on message passing, the accessibility framework is sensitive to communication latencies between processes. A distributed model is particularly useful since it lets applications control what GUI state they expose, and allows any program to query any application in an efficient manner. However, it also means that intense querying of multiple applications through the accessibility framework, as is needed for display-centric text capturing, can become a bottleneck due to query latencies.

While enabling many programs to query applications arbitrarily is efficient in a distributed model, in reality this is an uncommon use case: only rarely is more than a single observer program running on a desktop. Rather, a typical scenario includes one or a few observer programs that repeatedly query one or more applications, as do screen readers and automatic GUI testing tools [70]. Capture takes this a step further by gathering display-centric information across all applications. The naive approach of using queries in Capture in a similar manner as traditional screen readers is problematic because of significant differences in the scope and type of queries required.

In terms of scope, screen readers are window-centric. They are only interested in, and post queries to, the current application in focus. Even within a single application, more often than not, screen readers narrow their attention to the specific window and frame that correspond to the GUI element currently in focus. In contrast, Capture is desktop-centric, and must track multiple applications as it seeks to record information about the desktop as a whole. Additionally, it considers all GUI components in each application rather than only the specific one in focus.

In terms of query types, a single query is limited in its capabilities: it may only retrieve a single piece of information from a single GUI object. Screen readers pursue textual information, and therefore typically use a single query to retrieve the text from the current GUI object in focus. Screen readers also rely on event payloads to learn about changes to that text. In contrast, Capture is also concerned with contextual information to support a broader range of applications, which requires multiple queries to retrieve in full.

Capturing the onscreen text of an entire desktop involves extracting significantly more information from the accessibility framework than traditional screen readers do. Screen readers typically tolerate query latencies because they target only the GUI object in focus and issue few queries.
They need not perform faster than interactive rates, and are usually bounded by the ability of visually impaired users to absorb audible data, which rarely exceeds 600 words per minute [77]. Conversely, retrieving the full accessibility data of a whole desktop comprising multiple applications, each with hundreds to thousands of accessibility nodes, could generate a multitude of queries. With typical latencies on the order of a few milliseconds per query, this quickly becomes prohibitively expensive: to illustrate, a desktop of ten applications with 1,000 nodes each, at three queries per node and 2 ms per query, would take on the order of a minute to scan in full. A typical scan not only exhibits unacceptable response times, but also generates an excessive load on the system and produces inaccurate and outdated results, as accessibility data may change in the interim. Furthermore, because queries are synchronous, issuing a large volume of queries can impair interactivity, while issuing too few queries may result in missed data if the screen is changing quickly.

4.3.1 Overview

In addressing this challenge, it is important to understand that the current distributed model of the accessibility framework is ill-suited for this kind of usage pattern, suggesting the use of a different model. Building on this observation, Capture transforms the existing message passing model into a centralized model, through which observer programs can efficiently and continuously access the entire onscreen contents at any time without excessive querying. Capture integrates with the accessibility framework, and does not alter how applications interact with it. In this way, Capture preserves the benefits of the distributed model, namely applications' independence in exposing their GUI state, while providing an effective means to access the information.

To accomplish this, Capture introduces a novel caching architecture that integrates with the accessibility framework to reduce the need to query it for onscreen data. Query results are cached so that additional requests for the same onscreen data are serviced by the cache instead of requiring additional queries. The cache is updated using a pull model in which multiple event notifications from the accessibility framework are coalesced, so that they can be handled by a single query to the accessibility framework instead of multiple queries. The caching architecture has two key benefits. First, it mitigates the high cost of using the accessibility framework while retaining its full functionality for accessing onscreen data; both the onscreen data and its structure are efficiently cached. Second, it makes available not just the results of one query, but a complete view of all onscreen data at a given time. All display-centric content is available for use in real time, including data and metadata associated with both foreground and background windows.

Figure 4.3 – Capture architecture

Figure 4.3 depicts Capture's caching architecture and its integration with the accessibility framework. At the core of the architecture are two components: a mirror accessibility tree and an event handler. The mirror tree (§ 4.3.2) serves as the cache that mirrors the onscreen textual and contextual information; it is updated in response to changes to onscreen data by the event handler. The event handler (§ 4.3.3) continuously and selectively listens to accessibility event notifications, and responds by querying about display changes and updating the local mirror tree. Together, the two components provide an efficient cache that accurately reflects the entire onscreen state at any time. The front end of Capture consists of one or more output modules (§ 4.3.4) that can efficiently access the onscreen information available in the cache to perform custom tasks, such as content logging.

4.3.2 Mirror Tree

To initialize the mirror tree at startup, Capture populates it with data scanned from the initial desktop accessibility tree. Textual and contextual information is the primary data cached in the mirror tree, but nodes also store information beyond the queried accessibility data that is useful for navigating and managing the tree. Examples include a pointer in each node to quickly locate the parent application of the corresponding GUI element, and timestamps that indicate when data was last updated.

Nodes are added to the mirror tree when new elements appear onscreen. New nodes are discovered in two ways: first, explicitly, by an event that refers to a previously unknown node; second, implicitly, by scanning the children of an element that corresponds to a new node, or by rescanning those of an existing one in response to a state update. To attach the new node at the proper position in the mirror tree, Capture first queries for the parent element in the accessibility tree. If the corresponding parent node is not found in the mirror, the parent node itself is new, and Capture must add that parent before it can proceed with the current node. This may continue up the ancestry chain until a known node is eventually found (at least one such node always exists, namely, the root of the desktop GUI hierarchy). Once the new node is attached, its data is retrieved via accessibility queries. If the element has descendants in the GUI tree, Capture iterates through its children and recursively adds them to the tree, as illustrated in the sketch below.

Nodes are removed from the mirror tree when the corresponding elements are gone from the screen. Deleted nodes are discovered in three ways: first, explicitly, by events that indicate deletion of nodes, such as "child-changed:deleted" for node removal from a parent and "object:state-changed:defunct" when a node becomes defunct; second, implicitly, when a defunct state is found for a node during a standard query on it; third, indirectly, when the node belongs to a subtree that has been removed. Because freeing a deleted node incurs overhead on the accessibility framework, node removals are preferably deferred to when the system is otherwise idle.
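The following sketch illustrates this bookkeeping and the attach procedure. It is illustrative only: the node layout and the helpers (mirror_lookup, mirror_insert, acc_parent, acc_query_text) are hypothetical stand-ins for the cache index and for the actual accessibility framework queries.

    #include <stdlib.h>
    #include <time.h>

    struct mirror_node {
        void *acc_obj;               /* handle of the mirrored accessibility object */
        struct mirror_node *parent;  /* position in the mirror tree */
        struct mirror_node *app;     /* shortcut to the owning application's node */
        char *text;                  /* cached textual content */
        time_t updated;              /* when the cached data was last refreshed */
    };

    /* Assumed cache index and accessibility queries (not real APIs). */
    struct mirror_node *mirror_lookup(void *acc_obj);
    void mirror_insert(struct mirror_node *n);
    void *acc_parent(void *acc_obj);      /* one accessibility query */
    char *acc_query_text(void *acc_obj);  /* one accessibility query */

    /* Attach a newly discovered object, adding unknown ancestors first.
     * The recursion terminates because the desktop root is always cached.
     * (Iterating over the element's children to add descendants is omitted.) */
    struct mirror_node *mirror_attach(void *acc_obj)
    {
        struct mirror_node *n = mirror_lookup(acc_obj);
        if (n)
            return n;                /* already mirrored */
        n = calloc(1, sizeof(*n));
        n->acc_obj = acc_obj;
        n->parent = mirror_attach(acc_parent(acc_obj));
        n->app = n->parent->app;
        n->text = acc_query_text(acc_obj);
        n->updated = time(NULL);
        mirror_insert(n);
        return n;
    }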

4.3.3 Event Handler

The rate of events and the processing needed to handle them both vary considerably depending on the user's behavior. First, desktop usage is bursty in nature: user activity may trigger a barrage of accessibility events followed by idle times. Second, which events are generated depends on the user's activity and the applications in use. For example, typing input produces a "text-changed" event for each keystroke, while loading a web page in a browser produces a series of "child-changed" events as the accessibility tree for the page is constructed, followed by a "load-complete" event. Third, the amount of work needed to process an event depends on its type. For example, for a "text-changed" event, a single query about the object's new text suffices to update the data in the mirror tree. However, events indicative of changes to the accessibility tree structure, such as "child-changed" after adding an object, typically involve multiple queries about the new object and recursively exploring its descendants (if any).

Because of query latencies, the event handler must not only listen to events and act on them, but also do so intelligently, reducing runtime overhead by skipping unnecessary queries without sacrificing accuracy or coverage of the onscreen content. To accomplish this, the event handler employs five mechanisms to speed up event processing: enqueuing and dequeuing, grouping, deferring, reordering, and filtering and dropping of events. These mechanisms aim to avoid the handling of redundant events, which improves performance, and to prioritize high-value information for better coverage.

First, since events usually arrive faster than they can be processed, Capture maintains a queue of backlogged events. Incoming events that cannot be tended to promptly are appended to the queue for later processing. The event handler continuously dequeues events from the head of the queue and processes them.

Second, Capture groups multiple events that share a common source object into a single entry in the queue. To do so, event entries in the queue are indexed by their source object and include a bitmap of event types, in which each event type has a predefined bit position. When an event arrives from an object, Capture looks up an entry for that object in the event queue. If no entry is found, a new entry is enqueued and the bit corresponding to the event type is marked. If an entry is found, as happens with subsequent events from the same object, Capture simply sets the bit corresponding to the type of the new event in the existing entry.

Grouping together events from the same source has two benefits. First, multiple instances of each event type are folded together, so that Capture processes only one event of each type per object and discards the rest. This is useful when a GUI element fires multiple events of the same type. For instance, consider a status bar for messages, where new messages replace older ones, producing "text-changed" events. It is unnecessary to issue a query for each deferred event, because every query would return the data related to the last event, which is the current onscreen data. Similarly, a terminal application also produces "text-changed" events with keystrokes, but there the data accumulates. Here, too, one query that returns the most current data suffices for all deferred events; this data, unlike before, is an aggregate of the intermediate states.
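The following minimal sketch illustrates the queue and the bitmap coalescing just described. It assumes a simple linked-list queue for brevity, whereas a real implementation would index entries by source object; the deferring, reordering, and filtering mechanisms described below would build on this structure.

    #include <stdint.h>
    #include <stdlib.h>

    enum ev_type { EV_TEXT_CHANGED, EV_CHILD_CHANGED, EV_LOAD_COMPLETE, EV_MAX };

    struct ev_entry {
        void *source;          /* accessibility object the events refer to */
        uint32_t pending;      /* bit i set: one or more events of type i await */
        struct ev_entry *next;
    };

    static struct ev_entry *queue_head, *queue_tail;

    /* Coalesce an incoming event: subsequent events from the same source
     * only set a bit in the existing entry instead of adding a new one. */
    void enqueue_event(void *source, enum ev_type type)
    {
        struct ev_entry *e;
        for (e = queue_head; e; e = e->next)
            if (e->source == source)
                break;
        if (!e) {
            e = calloc(1, sizeof(*e));
            e->source = source;
            if (queue_tail)
                queue_tail->next = e;
            else
                queue_head = e;
            queue_tail = e;
        }
        e->pending |= 1u << type;  /* duplicates fold into one bit */
    }

    /* Dequeue one entry and handle each pending event type exactly once,
     * no matter how many events of that type were folded into it. */
    void drain_one(void (*handle)(void *source, enum ev_type type))
    {
        struct ev_entry *e = queue_head;
        int t;
        if (!e)
            return;
        queue_head = e->next;
        if (!queue_head)
            queue_tail = NULL;
        for (t = 0; t < EV_MAX; t++)
            if (e->pending & (1u << t))
                handle(e->source, (enum ev_type)t);
        free(e);
    }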

Another benefit of grouping events is that it can exploit overlaps in the semantics of events of different types to eliminate redundant work. For example, consider an object that produces a "child-changed:deleted" event followed by a "child-changed:added" event. Processing each event involves iterating over all of the object's children using accessibility queries; if the two events are merged, both can be processed in a single iteration. Similarly, a user editing text produces multiple instances of both "text-changed:inserted" and "text-changed:deleted" events, and a single query about the object's new text suffices for both.

Events sometimes carry useful information in their payload, such as which character was typed or what text was inserted. This information is particularly important if, by the time an event is finally processed, the corresponding state has been lost because the object's content changed further. For this reason, Capture keeps the payloads of individual events after merging them together.

A mechanism related to grouping is the deferring of events, used during bursty periods of activity that generate a surge of events all related to a single GUI "transaction". For example, when downloading a page, Firefox exposes the HTML components in its accessibility tree, producing a spree of events. These should be viewed as part of a single atomic transformation of the user interface rather than as individual changes. When Capture identifies a burst of events due to such a transaction, it defers the handling of some (or all) of the events until the transaction completes, both to avoid slowing the application down while the transaction is in progress and to avoid reporting intermediate state that was never meant to be displayed.

A third mechanism to improve event handling is the reordering of events in the queue. Reordering aims to achieve better accuracy and coverage by classifying events as either high or low priority, and giving faster service to events of higher priority.

Priorities are assigned in accordance with the importance of the event. For example, while web page loading involves many "child-changed" events, it is the final "load-complete" event that marks when the page is ready. It is therefore desirable to handle the latter as soon as it arrives, before other, potentially already queued, events. Reordering provides the means to execute this policy.

Fourth, to further avoid unnecessary processing, Capture filters uninteresting events, and selectively discards stale events that no longer convey useful information. Filtering of uninteresting events is accomplished by simply not registering for them in the first place. Examples include events that do not reflect changes to the onscreen content, such as "text-caret-moved" and "mouse:button:...". Unlike the filtering of uninteresting events, selecting which events to discard is more challenging, and requires a way to detect when a particular event becomes stale. In our context, an event becomes stale if the object to which it refers has been updated after the arrival of the event.

To efficiently identify events that have become stale, Capture keeps track of the relative order of event arrivals and of updates to nodes in the mirror tree. Specifically, it maintains timestamps not only on mirror tree nodes, but also on events queued for processing. Timestamps of tree nodes are updated when their data is queried, and timestamps of queued events are updated when new events (for the corresponding object) arrive. The event handler compares the timestamp of a dequeued event against that of the corresponding node; events with older timestamps are aged (marked stale) and skipped, as the sketch below illustrates.

The main benefit of this approach is the ability to ignore events that do not carry useful information but are not subject to grouping. The two mechanisms, grouping and aging, complement each other: grouping applies where events share a common source, while aging applies where an event from the same source arrives while a previous one is being processed, or where events from different sources affect the same node. Continuing the example of loading a web page, processing the "load-complete" event will assign a newer timestamp to the entire subtree that describes the web page, and render most of the preceding "child-changed" events stale.
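A minimal sketch of the aging test follows, assuming a logical clock shared by node updates and event arrivals; the types and names are illustrative, not the actual implementation.

    #include <stdbool.h>
    #include <stdint.h>

    static uint64_t clock_tick;                        /* logical clock */
    static uint64_t now(void) { return ++clock_tick; }

    struct node  { uint64_t updated; };                /* mirror tree node stamp */
    struct event { uint64_t arrived; struct node *target; };

    void node_refreshed(struct node *n) { n->updated = now(); }  /* on query   */
    void event_arrived(struct event *e) { e->arrived = now(); }  /* on enqueue */

    /* A dequeued event is stale, and can be skipped, if the node it
     * refers to was refreshed after the event was queued. */
    bool event_stale(const struct event *e)
    {
        return e->arrived <= e->target->updated;
    }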

4.3.4 Output Modules

One or more output modules can be plugged into Capture to use the collected information. They enjoy efficient access to the onscreen information available in the cache, and can register to be notified when the state of the cache changes due to modifications of onscreen content. Output modules can perform a range of useful tasks: continuous or periodic logging of the cache state, advanced assistive technology services for visually impaired users, and GUI extensions that react to certain text patterns onscreen are just a few examples.

Output modules execute in a separate process to isolate them from the main Capture application. They interact with Capture using two communication channels: a data channel and a control channel. The data channel provides direct access to the cache through shared memory; in particular, Capture allows output modules to map a read-only copy of the cache data structures. Using shared memory has two performance benefits. First, it is the fastest form of data transfer. Second, it enables direct and asynchronous random access to the mirror tree, so that output modules can traverse the data independently of Capture. The control channel is used for interaction between output modules and Capture in a client-server style. Output modules use the channel to register for notifications about changes to the cached data, and to request a lock on their read-only copy of the cache while they traverse the data, to protect its integrity. Capture uses the channel to deliver event notifications. A sketch of how an output module attaches to the cache follows.
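The sketch below shows how an output module might attach to the data channel, assuming the cache were exported as a POSIX shared memory object; the object name "/capture-cache" and the surrounding protocol are hypothetical, not the actual interface.

    /* Build with: cc module.c -o module -lrt
     * (-lrt is needed for shm_open on older systems) */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        /* Data channel: map a read-only view of the mirror tree. */
        int fd = shm_open("/capture-cache", O_RDONLY, 0);
        if (fd < 0) {
            perror("shm_open");
            return 1;
        }
        struct stat st;
        if (fstat(fd, &st) < 0 || st.st_size == 0)
            return 1;
        const void *cache = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (cache == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        /* Control channel (omitted): a socket over which the module
         * registers for change notifications and requests a read lock
         * before traversing the mapped tree. */
        printf("mapped %lld bytes of cache read-only\n", (long long)st.st_size);
        munmap((void *)cache, st.st_size);
        close(fd);
        return 0;
    }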

4.3.5 Limitations

Capture has two main limitations as a generic display-centric text recorder. First, in relying on the accessibility framework, it depends on the correctness and completeness of applications' accessibility support. Capture cannot derive text from the (increasingly rare) applications for which the accessibility interfaces are either non-existent or unreliable. However, our experience has been that most applications do not suffer from this problem, and there is an enormous impetus to get accessibility interfaces into all desktop applications to provide universal access. The needs of visually impaired users will continue to be a driving force in ensuring that applications increasingly provide accessibility interfaces, enabling Capture to extract textual information from them.

Second, due to the accessibility framework's event-driven operation, Capture's coverage accuracy depends heavily on the response time to notifications. In the lag between when a change in an application's GUI occurs and when Capture queries the application about it, the state may evolve further, to the point that the original change is gone. In some cases the changes are additive, e.g., when a user types text in a word processor, such that no data is lost due to delayed queries.

Three factors affect the overall response time. First, the application logic may decide to aggregate multiple changes before triggering an event; for instance, a word processor may report the outcome of multiple keystrokes as a single accessibility update. Second, the processing Capture needs to handle events depends on the number of pending events and their types; for example, high priority events delay the handling of lower priority events. Third, time is needed to deliver messages between the application and Capture (and, if the system is loaded, until each is scheduled to run). Although a fast rate of changes can adversely affect Capture's coverage, our experimental results presented in the following section demonstrate that Capture is fast enough to accurately capture display-centric data on a range of applications during normal interactive use.

4.4 Evaluation

We have implemented Capture for Linux and the GNOME desktop with the GNOME Accessibility Toolkit. Using this unoptimized prototype, we measured the effectiveness of Capture in recording onscreen data for real desktop applications, in comparison to the widely used Orca screen reader [123] for Linux. To provide a conservative comparison with Orca, we disabled text-to-speech translation to minimize its recording overhead, and we only measured recording of text, as opposed to the additional metadata that Capture records but Orca does not. We measure effectiveness both in terms of text coverage and runtime performance overhead. For text coverage, we measure the percentage of onscreen data recorded, in terms of the number of correct text characters recorded versus the number of onscreen text characters generated by each application workload. For runtime overhead, we measure the normalized runtime of the application workloads compared to running without any recording or use of the accessibility framework.

All experiments were run using the same hardware and software. For hardware, we used a Dell Vostro 200 desktop with an Intel Pentium E2140 dual-core 1.60 GHz CPU, 4.0 GB RAM, and a 500 GB local disk. For software, we used Ubuntu 10.04 LTS with a Linux 2.6.32 kernel. To interface with the accessibility framework, we used the AT-SPI accessibility libraries libatk and libatspi version 1.30, libbonobo version 2.24.32, and liborbit2 version 2.14.18.

Application                 Benchmark description                              Exposed text     Textual nodes
Adobe Reader 9.3.2          Open a 1-page, accessibility-tagged 220 KB         entire document  multiple
(PDF reader)                PDF file, then close it
Firefox 3.6.3               Download 56 web pages in sequence from a web       entire document  multiple
(web browser)               server, 228 KB visible text in total
GEdit 2.30.2                Type a total of 500 characters, split into         entire document  single
(text editor)               80 characters per line
GNOME-Terminal 2.29.6       more page-by-page display of 4769 lines,           visible text     single
(terminal emulator)         1534 pages (2.6 MB in total)                       only
OpenOffice Impress 3.2.0    Page through 10 slides, each containing            visible text     multiple
(presentation software)     1024 characters of content                         only
OpenOffice Writer 3.2.0     Open and scroll through a 10-page text             visible text     multiple
(word processor)            document page by page                              only
Pidgin 2.6.6                Send 10 1024-character messages using the          entire document  single
(IM client)                 AIM protocol
Thunderbird 3.0.4           Read 10 emails, each 800 characters and            entire document  single
(email client)              entirely visible onscreen

Table 4.4 – Application scenarios

We used the application workloads listed in Table 4.4, which represent a variety of common desktop usage cases. Except for Firefox, which used the iBench 1.5 web page benchmark, all workloads used documents whose pages fit entirely visible onscreen. Our measure of text coverage is based only on the text visible onscreen; it does not consider whether the system recorded data that was not visible onscreen.

Since applications decide themselves how to expose data through the accessibility framework, they deliver different information regarding their textual content. Applications differ in what information they deliver about documents that do not fit entirely in the displayed window, such as a multi-page PDF document or a long web page: some applications provide information on the entire document at once, including pages not yet visible onscreen, while others provide information only for the page visible onscreen at a given time. Applications also differ in whether they expose all of the textual information in a single tree node in the accessibility framework, or use multiple tree nodes. These two properties are indicated for each workload by the "Exposed text" and "Textual nodes" columns of Table 4.4.

Since real interactive desktop usage typically involves some amount of user think-time after receiving display output and before sending additional input, each application workload includes such think-time by delaying the start of the next input some time after the end of the previous output. We used a range of think-times between 50 ms and 1000 ms for each workload. The shorter times are conservative values, much faster than typical human usage, intended to stress test the system, while the longer ones provide a more realistic scenario that matches common behavior. Think-times were integrated into the workloads as follows: for the Adobe Reader PDF viewer, before closing the document; for the Firefox web browser, before displaying the next web page; for the GEdit text editor, before typing the next character; for the GNOME Terminal terminal emulator, before displaying the next page; for the OpenOffice.org Impress presentation software, before displaying the next slide; for the OpenOffice.org Writer word processor, before scrolling down to the next page; for the Pidgin instant messaging client, before sending the next message; and for the Thunderbird mail client, before reading the next email message.

To ensure the repeatability of our experiments, we used the Linux Desktop Testing Project (LDTP) [103] scripting environment to simulate and automate virtual user interaction and to control think-time. We used LDTP to inject input in the form of keystrokes and mouse clicks to applications at a preset rate, and then measured how much of the expected output was actually captured. To simplify the execution of repeatable experiments, we used VNC version 4.1.1 to run each desktop session from scratch, with a display resolution of 1024x768. The VNC client was disconnected during performance measurements to avoid any variability from differing display speeds due to network interference.

                         Capture                        Orca
Benchmark     Native     Norm.   Coverage   CPU        Norm.   Coverage   CPU
              time       time               load       time               load
acroread      58 s       1.32    100%       93%        1.41    0%         95%
firefox       71 s       1.03    100%       81%        1.01    70%        86%
gedit         80 s       1.01    100%       59%        1.01    100%       85%
terminal      60 s       1.00    100%       55%        1.01    95%        56%
ooimpress     52 s       1.00    100%       60%        1.03    0%         69%
oowriter      59 s       1.00    100%       61%        1.00    51%        59%
pidgin        41 s       1.03    100%       72%        1.00    100%       78%
thunderbird   62 s       1.00    100%       62%        1.00    10%        65%

Table 4.5 – Summary of benchmark results with think-time of 1000 ms

4.4.1 Performance Overhead

Table 4.5 shows the runtime overhead of recording the application workloads using Capture versus Orca, with a think-time of 1000 ms, except for GEdit, for which the think-time was 100 ms between characters. The "Native time" column shows the completion time of each workload for native execution without recording. The "Norm. time" columns show the respective normalized runtime overhead when using Capture and Orca. The "CPU load" columns show the average processor load (as reported by the top utility) during the execution of the workloads.

In general, the execution overhead was quite low for both Capture and Orca, under 3% for all workloads except Adobe Reader. Capture's performance was comparable to Orca's in terms of execution overhead, and somewhat better in terms of overall processor load. Our results show that Capture incurs only modest performance overhead even for display-intensive workloads, and in general does not incur more overhead than screen readers such as Orca, despite delivering significantly more display content.

4.4.2 Single Application Text Coverage

Table 4.5 also shows the text coverage achieved when recording the application workloads using Capture versus Orca; 100% coverage means that all onscreen text displayed throughout the workload was recorded successfully. The results show that Capture achieved 100% coverage for all of the application workloads. In contrast, Orca was only able to achieve 100% coverage for two of the workloads, and delivered fairly low coverage for the majority of them, even though these experiments provide the most favorable comparison for Orca, since only a single foreground application workload is being run. Orca's coverage is under 60% for half of the workloads, and for some applications, such as Adobe Reader and OpenOffice Impress, Orca fails to record any of the onscreen text. These results show the benefit of Capture's caching architecture in combining complete recording of onscreen text with minimal recording overhead.

Figure 4.4 and Figure 4.5 show the coverage of Capture and Orca, respectively, with varying think-times of 50 ms, 200 ms, 400 ms, 700 ms, and 1000 ms. These results show that the gap in coverage between Capture and Orca widens as the think-time shortens. While both systems display lower coverage as think-time decreases, the negative impact is significantly stronger on Orca, which lacks the ability to cope efficiently with the intense flux of accessibility events. Because these workloads were designed to stress onscreen text capture performance, the coverage results of Capture are much worse than what would be perceived by real users. For example, the 100 ms delay used for GEdit would translate to a typist who could enter 1200 words per minute, an order of magnitude faster than what is possible for a very fast human typist.

Figure 4.4 – Capture coverage vs. think-time (coverage in percent, per application, for think-times of 50, 200, 400, 700, and 1000 ms)

Figure 4.5 – Orca coverage vs. think-time (coverage in percent, per application, for think-times of 50, 200, 400, 700, and 1000 ms)

Benchmark         Capture coverage   Orca coverage
Adobe Reader      53%                0%
Firefox           79%                1%
GEdit             93%                33%
GNOME-Terminal    95%                31%
OO-Impress        93%                0%
OO-Writer         80%                17%
Pidgin            82%                33%
Thunderbird       90%                3%

Table 4.6 – Multiple-application text coverage

4.4.3 Multiple-Application Text Coverage

Table 4.6 shows the text coverage achieved when recording the application workloads in the foreground window while also running additional application workloads in the background. The background workloads used in all cases were additional instances of the Firefox and GNOME Terminal workloads. We measure the text coverage across all of the running applications, including both foreground and background windows, using Capture versus Orca.

The results show that Capture achieves high coverage for most of the application workloads. In contrast, Orca's coverage is well under 50% for all application workloads, and close to 10% for most of them, because Orca only tracked the text from the foreground application and ignored the background applications. Orca achieved higher coverage for Impress than before because it did record some text from the Firefox instance that ran in the background. These results show that Orca's ability to record onscreen text is largely limited to foreground applications, and is further impaired in the presence of accessibility events due to background applications. The results highlight the benefit of Capture's caching architecture in being able to record onscreen text completely and efficiently.

4.4.4 Tree Characteristics

We collected statistics about the distribution of accessibility component types in different applications, and about their memory usage. This data is useful to better understand how applications interact with the accessibility framework. For instance, applications tend to use more nodes that hold text than nodes that do not, meaning that Capture is likely to spend substantial time querying the text of a new node's descendants.

Figure 4.6 and Figure 4.7 show the count of accessibility nodes in the mirror tree and their memory usage, respectively. The data is broken down into nodes containing no text, menu or button widgets with text, and nodes containing general text (other than menu and button text). We distinguish between menus and buttons, which are typically static, and general text, which tends to change more often. To examine how the distribution changes with input data, we repeated the measurements twice: a base case without data, followed by a doc case with the input data or document.

The count of nodes reflects differences in how applications expose their GUI state through the accessibility framework. For Adobe Reader, for instance, the difference in node counts between the base and doc cases indicates that Adobe Reader splits the document into multiple accessibility nodes: for this particular 169 KB document, the tree grew by nearly 800 nodes, an increase of 200% over the base case. The difference for other applications is smaller, often because they expose the entire content of each document in a single node, as do GEdit and GNOME Terminal. For instance, GEdit's tree maintains the same size of approximately 450 nodes even after opening a 106 KB document.

The OpenOffice.org Impress scenario is an anomaly in that a regular open document reveals fewer accessibility nodes than a blank document. This discrepancy can be attributed to the presentation template selection dialog that appears beside a blank presentation; the dialog reveals its contents via a number of accessibility nodes.

Figure 4.6 – Mirror tree size (number of nodes per application scenario, broken down into empty, widget, and general text nodes, for the base and doc cases)

Figure 4.7 – Mirror tree memory (memory usage in KB per application scenario, with the same breakdown)

To better understand how Capture's performance depends on the structure (or lack of structure) of documents, we investigated two sets of web pages in Firefox: documents in one set had a single node of increasing size, and documents in the other had an increasing number of nodes. We measured the time that Capture takes to scan the entire mirror tree for each of the documents.

Figure 4.8 – Query time vs. node size (log-log: scan time in seconds vs. node size in bytes)

Figure 4.9 – Query time vs. node count (log-log: scan time in seconds vs. number of nodes)

Figure 4.8 and Figure 4.9 show the results for the series varying the node size and the number of nodes, respectively. The time to capture an HTML document was insensitive to the amount of text contained in a single node, whereas increasing the number of nodes on the page caused an increase in runtime. The time to build the tree ranged from under 4 ms per node on average when there was one node of document text, to 15 ms per node on average when the document had 10,000 nodes. These results demonstrate a superlinear increase in the query time per node: the growing number of nodes implies more queries in a small span of time, which causes congestion in the application (in this example, Firefox) as it tries to service the many requests. In contrast, the results demonstrate a relatively small change in runtime when increasing the size of text contained in a single node, suggesting that data bandwidth is not a significant performance issue for accessibility queries.

4.5 Lessons Learned

In designing Capture, we have addressed a number of challenges related to the accessibility framework. One key difficulty emanates from the performance limitation of a client-server model, namely, the long turnaround time of queries to applications. Our results show that the dominant factor affecting performance is the number of queries rather than the amount of data transferred in each query. In this section we offer design principles to help avoid the kinds of pitfalls that we encountered. Specifically, we present four design principles, distilled from our experience, that aim to extend the accessibility framework interfaces for efficient and effective use in the context of our (and future) work.

• Bulk Data Transfer. Empower callers to request information about multiple accessible objects in a single query. Using a single bulk transfer request to replace otherwise numerous individual requests would dramatically reduce the aggregate cost of accessibility requests, and mitigate the performance bottleneck and the problems related to it. Bulk transfer can come in two flavors. One is a "vector query" that combines multiple regular queries into a single request, e.g., to collect the text of multiple objects at once; a hypothetical interface is sketched after this list. The other is a "recursive query" that extends to the entire subtree of a given accessibility object, e.g., for fast acquisition of an entire application's subtree.

• Views and Filters. Allow callers to obtain a focused view of the data efficiently. For instance, a caller may request a list of all accessibility objects corresponding to user interface elements that have non-trivial contents, or that are currently visible in a given window. Such an interface would be a perfect match for applications that deal with huge accessibility hierarchies, like OoCalc.

• Burst Event Delivery. Allow an application to raise a horde of events efficiently, and allow callers to collect information about a batch of events using a single accessibility request. This can be further refined to allow "hierarchical events" that allude to an entire hierarchy, such as a single event to indicate the deletion of an entire subtree instead of a series of events deleting each and every node in it.

• Transaction Boundaries. Allow an application to disclose the start and end of a series of related events that can be viewed as a coherent transaction that transforms the user interface. A caller may choose to defer the handling of some (or all) events until the transaction completes, and thereby avoid slowing the application down while the transaction is in progress. A precise indication of when a transformation completes is also useful for associating a precise timestamp with mirror tree nodes as they are updated.
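To make the Bulk Data Transfer principle concrete, the following is a minimal sketch of what such an interface might look like. These declarations are purely hypothetical; no accessibility framework we used provides them.

    struct acc_text_result {
        void *obj;    /* the accessibility object queried */
        char *text;   /* its textual content; caller frees */
    };

    /* "Vector query": fetch the text of count objects in one round trip
     * instead of count synchronous queries. Returns the number filled. */
    int acc_get_text_vector(void *const objs[], struct acc_text_result out[],
                            int count);

    /* "Recursive query": fetch the text of the entire subtree rooted at
     * root in a single request, up to max results. */
    int acc_get_text_subtree(void *root, struct acc_text_result out[], int max);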

In our experience, different applications vary widely in their interpretation of the accessibility framework, and it is very difficult to design a generic, application-agnostic approach that fits all scenarios and performs well. Rather, application-aware logic is needed to accurately infer important features of the accessibility tree. This discrepancy stems mostly from loose definitions of the accessibility interfaces; we believe that better accessibility guidelines, combined with better awareness among application developers, would significantly reduce the extent of this issue.

4.6 Summary

In this chapter we presented DejaView's display-centric content recording for extracting text from the display so that displayed content can be indexed and later searched. We introduced Capture, a novel display-centric text recorder that facilitates real-time access to onscreen textual and contextual information from both foreground and background windows. Capture provides an intelligent caching architecture that integrates with the standard accessibility framework available on modern operating systems to continuously track onscreen text and metadata, without modifying existing system components. Our experimental results show that Capture imposes only modest performance overhead on a wide range of desktop applications, while accurately recording all onscreen textual data even when applications update onscreen data at high frequency. Compared to a screen reader, Capture records a substantially higher percentage of onscreen text for both foreground and background applications, and is able to record common application workloads that the screen reader does not record at all.

Chapter 5

Virtual Execution Environment

In this chapter, we present DejaView's virtual execution environment, which plays a crucial role in supporting the ability to record and later revive live execution state. This virtual execution environment decouples the user's desktop computing environment from the underlying operating system, so that DejaView can continuously checkpoint and later revive entire live desktop sessions. Furthermore, it confines the user's desktop session in a private environment, so that multiple live desktop sessions can run concurrently, in isolation from one another. DejaView's virtual execution environment transparently encapsulates a user's desktop computing session in a private virtual namespace. Because the namespace is virtual, revived sessions can use the same operating system resource names as they used before being checkpointed, even if those names map to different underlying operating system resources upon revival. Because the namespace is private, sessions revived from different points in time can run concurrently and use the same operating system resource names inside their respective namespaces without conflicting with each other.

DejaView leverages the standard interface between applications and the operating system (i.e., operating system virtualization) to provide its virtual execution environment in a manner that is transparent to both applications and the operating system kernel. To the applications running within, the virtual execution environment looks like a regular system: it provides the same interfaces, and thereby does not require applications to be adapted. Because it interposes between applications and the underlying operating system kernel, the kernel does not need to be modified either. Furthermore, operating system virtualization is a lightweight mechanism that imposes low overhead, as it operates above the operating system instance and encapsulates only the user's desktop computing session, as opposed to an entire machine instance.

We present a detailed discussion of key implementation issues and challenges in providing operating system virtualization in a commodity operating system. We compare alternatives for implementing operating system virtualization at user level versus kernel level, discuss the performance costs of methods for storing virtualization state, and examine subtle race conditions that can arise in implementing operating system virtualization. Some operating systems are gradually incorporating virtualization support by making pervasive changes to the operating system kernel [106]. We describe an approach to implementing operating system virtualization in a minimally invasive manner, treating the operating system kernel as an unmodified black box, and demonstrating how commodity operating systems can embrace operating system virtualization with minimal changes. Using this approach, we have implemented a Linux operating system virtualization prototype entirely in a loadable kernel module, and we present quantitative results demonstrating that such a minimally invasive approach can be achieved with very low overhead.

5.1 Operating System Virtualization

Virtualization essentially introduces a level of indirection into a system to decouple applications from the underlying host system. This decoupling can be leveraged to provide important properties such as isolation and mobility, yielding a myriad of useful benefits: supporting server consolidation by isolating applications from one another while sharing the same machine; improving system security by isolating vulnerable applications from other mission-critical applications running on the same machine; providing fault resilience by migrating applications off of faulty hosts; enabling dynamic load balancing by migrating applications to less loaded hosts; and improving service availability and administration by migrating applications before host maintenance so that they can continue to run with minimal downtime.

While virtualization can be performed at a number of different levels of abstraction, providing virtualization at the correct level to transparently support unmodified applications is crucial in practice to enable deployment and widespread use. The two main approaches for providing application-transparent virtualization are hardware virtualization and operating system virtualization. Hardware virtualization techniques [47, 147, 169, 180] virtualize the underlying hardware architecture using a virtual machine monitor to decouple the operating system from the hardware, so that an entire operating system environment and its associated applications can be executed in a virtualized environment. Operating system virtualization techniques [21, 122, 124, 138, 148, 173, 181] virtualize the operating system to decouple applications from the operating system, so that individual applications can be executed in virtualized environments. Hardware virtualization and operating system virtualization each provide their own benefits and can provide complementary functionality.

Operating system virtualization provides a fine granularity of control, at the level of individual processes or applications, which offers several benefits over the hardware virtualization abstraction that works with entire operating system instances. For example, operating system virtualization can enable transparent migration of individual applications, not just migration of entire operating system instances. This finer-granularity migration provides greater flexibility and results in lower overhead [95, 124]. Furthermore, if the operating system requires maintenance, operating system virtualization can be used to migrate the critical applications to another running operating system instance. By decoupling applications from the operating system instance, operating system virtualization enables the underlying operating system to be patched and updated in a timely manner with minimal impact on the availability of application services [135]. Hardware virtualization alone cannot provide this functionality, since it ties applications to an operating system instance, and commodity operating systems inevitably incur downtime due to necessary maintenance and security updates.

Operating system virtualization isolates processes within a virtual execution environment by monitoring their interaction with the underlying operating system instance. Similar to hardware virtualization [131], applications that run within the virtual environment should exhibit behavior identical to what they would exhibit running on the unvirtualized system. In addition, a statistically dominant subset of the interactions of applications with system resources should proceed directly, to minimize overhead.

We classify operating system virtualization approaches along two dimensions: host-independence and completeness.

processes, while host-independent virtualization provides a private virtual namespace for the applications’ referenced operating system resources. The former does not sup- port transparent application migration since the lack of resource translation tables mandates that the resource identifiers of an application remain static across hosts for a migrating process, which can lead to identifier conflicts when migrating be- tween hosts. Examples of host-dependent virtualization include Linux VServers [173] and Solaris Zones [138]. Host-independent virtualization encapsulates processes in a private namespace that translates resource identifiers from any host to the private identifiers expected by the migrating application. Examples of this approach include Zap [95, 124] and Capsules [148]. We refer to this virtual private namespace as a pod, based on the terminology used in Zap. In terms of completeness, partial virtualization virtualizes only a subset of oper- ating system resources. The most common example of this is virtual memory, which provides each process with its own private memory namespace but doesn’t virtualize any other operating system resources. As another example, the FreeBSD Jail [87] abstraction provides partial virtualization by restricting access to the file system, network, and processes outside of the jail, but does not regulate SysV interprocess communication (IPC) mechanisms. While partial virtualization has been used to sup- port tighter models of security by limiting the scope of faulty or malicious processes, it can be unsafe if direct or indirect paths exist through which processes inside the environment can access resources outside or even break out of the environment. The chroot environment in Unix is a notorious example of a file system partial virtual- ization mechanism that has serious security shortcomings [33]. Complete virtualization virtualizes all operating system resources. While com- modity operating systems provide virtualization for some resources, complete virtu- alization requires virtualization for many resources that are already not virtualized, Chapter 5. Virtual Execution Environment 85

Subsystem            Description
Process ID           PID and related IDs: thread group, process group, session
SysV IPC             ID and KEY of message queues, semaphores, shared memory
Unix IPC             Unix domain sockets, pipes, named pipes
File system          File system root (chroot)
Network              Internet domain sockets
Devices              Device-specific resources
Pseudo-terminals     PTS IDs and the devpts pseudo file system
Pseudo-file systems  E.g., procfs, devpts, shmfs
Miscellaneous        Hostname, user/group ID, system name

Table 5.1 – Kernel subsystems and related resources

Within this taxonomy of virtualization approaches, complete, host-independent virtualization provides the broadest range of functionality, including the necessary support for both isolation and migration of applications. An additional distinction between the taxonomies lies in the scope available to an application: virtualization approaches that are host-dependent and/or partial provide benefits only on a single host, while complete, host-independent virtualization allows applications to migrate among, and thereby exploit, all of the systems accessible to an organization. The remainder of this chapter focuses on the demands of supporting this more general form of virtualization in the context of commodity operating systems.

5.2 Interposition Architecture

To support private virtual namespaces, mechanisms must be provided to translate between the pod's resource identifiers and the operating system's resource identifiers. For every resource accessed by a process in a pod, the virtualization layer associates a virtual name with an appropriate physical name in the operating system. When an operating system resource is created for a process in a pod, the physical name returned by the system is caught by the virtualization layer, and a corresponding private virtual name is created and returned to the process. Similarly, any time a process passes a virtual name to the operating system, the virtualization layer catches it and replaces it with the corresponding physical name. To enable this translation, a mechanism must redirect the normal control flow of the system so that the private virtual namespaces are employed rather than the default physical namespace.

Interposition is the key mechanism that provides the redirection needed for virtualizing namespaces. In our context, interposition captures events at the interface between applications and the operating system, and performs some processing on those events before passing them down to the operating system or up to the applications. The interposition needed to implement operating system virtualization requires some preprocessing before the native kernel functionality is executed, and some post-processing after it. The interposition itself is accomplished by wrapping the existing system calls with functions belonging to the virtualization layer and translating between virtual names and physical names before and after the original system call is invoked, as in the sketch below.
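As a minimal illustration, the following sketch shows wrappers for two PID-related system calls, one requiring a preamble and one an epilogue. The native entry points and the translation helpers (pod_virt_to_phys, pod_phys_to_virt) are assumptions standing in for the mechanisms discussed in the rest of this chapter, not the actual implementation.

    #include <sys/types.h>

    /* Saved native entry points and translation helpers (assumed). */
    extern long (*native_kill)(pid_t pid, int sig);
    extern long (*native_getppid)(void);
    extern pid_t pod_virt_to_phys(pid_t vpid);  /* per-pod hash table */
    extern pid_t pod_phys_to_virt(pid_t ppid);  /* system-wide hash table */

    /* Preamble: translate the virtual PID argument to its physical name
     * before invoking the native system call. */
    long wrapped_kill(pid_t vpid, int sig)
    {
        pid_t ppid = pod_virt_to_phys(vpid);
        if (ppid < 0)
            return -3;  /* i.e., -ESRCH: no such process visible in this pod */
        return native_kill(ppid, sig);
    }

    /* Epilogue: translate the physical PID returned by the native call
     * back into the caller's virtual namespace. */
    long wrapped_getppid(void)
    {
        return pod_phys_to_virt((pid_t)native_getppid());
    }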

System call interposition can be implemented at different layers of the system. We advocate using the loadable kernel module technology that is now available in all major commodity operating systems. A kernel module can provide application-transparent virtualization without base kernel changes and without sacrificing scalability and performance. In addition, by operating in privileged mode, the virtualization layer can provide the security necessary to ensure correct isolation. By working at the level of kernel modules, the virtualization module can utilize the set of exported kernel subroutines, which is a well-defined interface. Using the kernel API also affords a certain level of portability and stability in the implementation, since changes to the kernel API are infrequent: hundreds of existing kernel modules depend on the same API, so the virtualization layer is protected from kernel changes to a large extent, in the same way that legacy applications are protected.

There are other approaches to implementing system call interposition. One approach is to implement it as a user-level library [84, 92], such that the interposition code executes in the context of the process running the system call. This is relatively easy to implement, potentially yields more portable code, and utilizes the clear boundary between user level and kernel level. Unfortunately, it does not provide effective isolation of applications and can be easily subverted at any time, since the mechanism runs at user level in the context of the processes; it requires their cooperation, and does not work for statically linked applications or directly executed system calls.

Another approach is to use a kernel process-tracing facility such as ptrace [111], which allows a user-level process to monitor another process [174]. The monitoring process is notified at each entry to and exit from a system call and on receipt of signals, and can read and write the memory of the controlled process. By using available kernel functionality, this process-tracing approach can enforce an operating system virtualization abstraction more effectively than strictly user-level approaches. However, ptrace has many limitations in terms of performance and security [174], and the semantics of ptrace are highly system-specific, which makes the method non-portable. A minimal sketch of such a monitor follows.
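The sketch below, for illustration, stops a child at every system call boundary, where a monitor could inspect registers and rewrite identifiers; error handling is elided.

    #include <stdio.h>
    #include <sys/ptrace.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t child = fork();
        if (child == 0) {
            ptrace(PTRACE_TRACEME, 0, NULL, NULL);
            execlp("true", "true", (char *)NULL);  /* the traced workload */
            return 1;
        }
        int status, stops = 0;
        waitpid(child, &status, 0);  /* child stopped at exec */
        while (!WIFEXITED(status)) {
            /* Resume until the next system call entry or exit; each stop
             * costs two context switches, a key performance limitation. */
            ptrace(PTRACE_SYSCALL, child, NULL, NULL);
            waitpid(child, &status, 0);
            if (WIFSTOPPED(status))
                stops++;
        }
        printf("child stopped %d times at syscall boundaries\n", stops);
        return 0;
    }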

A third approach is to modify the kernel directly to implement interposition. This offers maximum flexibility with the lowest interposition overhead. However, writing code directly in the kernel is more complicated and cumbersome than at user level, harder to debug, and the result is most likely to be non-portable: tying the implementation to kernel internals requires tracking, in detail, all subsequent kernel updates. Furthermore, imposing a kernel patch, recompilation, and reboot process is a serious practical barrier to deployment and ease of use.

Given the limitations of the other approaches, we have implemented operating system virtualization as a loadable kernel module that works with major Linux kernel versions, including both the Linux 2.4 and 2.6 kernels. Our implementation avoids modifications to the operating system kernel, and builds strictly on its exported interface as much as possible. It supports the pod abstraction, but also allows other processes to run outside of virtualized environments, which eases deployment on systems that require such legacy functionality. A conceptual sketch of how such a module installs and removes a system call wrapper follows.
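This sketch is conceptual only: how the dispatch table is located is version-specific and deliberately left abstract, and the names and the example slot number are illustrative, not the actual implementation.

    #include <sys/types.h>

    /* Assumed: a pointer to the system call dispatch table, and the
     * wrapper from the earlier sketch. */
    extern void **syscall_table;
    extern long wrapped_kill(pid_t pid, int sig);

    enum { NR_kill = 62 };          /* example slot number (x86-64) */
    static void *native_kill_entry;

    int init_module(void)           /* on module load: install wrapper */
    {
        native_kill_entry = syscall_table[NR_kill];
        syscall_table[NR_kill] = (void *)wrapped_kill;
        return 0;
    }

    void cleanup_module(void)       /* on module unload: restore native */
    {
        syscall_table[NR_kill] = native_kill_entry;
    }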

5.3 Virtualization Challenges

Given this kernel module interposition architecture, we now discuss key implementation challenges in supporting virtualized system calls. Virtualization requires that some state be maintained by the virtualization module. The basic state that must be maintained consists of the pod's resource names, the underlying system's physical resource names, and the mapping between virtual and physical names. Throughout this discussion we emphasize that performance is a primary concern, and many of our approaches are engineered to achieve low performance overhead. Table 5.2 summarizes the methods and data structures used to maintain virtualization state efficiently.

Method                  Description
System-wide hash table  Convert physical host identifiers to virtual pod identifiers
Per-pod hash table      Convert virtual pod identifiers to physical host identifiers
Direct reference        Per-process fast reference to augmented virtualization state
PID reference count     Protect PIDs of processes inside or outside pods from reuse
in-pod flag             Indicate that a process is running inside a pod
init-pending flag       Indicate that a process in a pod is pending initialization
Outside-pod table       Track identifiers used by processes running outside pods
Restricted-ID table     Track identifiers without a reference count that are in use
init-complete flag      Indicate that the virtualization state of a resource is ready
File system stacking    Virtualize the per-pod view of pseudo file systems

Table 5.2 – Summary of virtualization methods

A first approximation approach employs two types of hash tables that can be quickly indexed to perform the necessary translation. One is a system-wide hash table, indexed by physical identifiers on the host operating system, that returns the corresponding pod and virtual identifier. The other is a per-pod hash table, indexed by virtual identifiers specific to a pod, that returns the corresponding physical identifiers. A separate pair of hash tables would be used for each operating system resource that needs to be virtualized, including PIDs, SysV IPC, and pseudo terminals. For multiprocessor and multi-threaded systems, proper hash table maintenance requires locking mechanisms to ensure state consistency. Handling these locks to avoid deadlock and to lower performance overhead is a non-trivial matter, and is discussed in § 5.3.1 on race conditions.

The use of these hash tables alone can result in suboptimal performance. While hash tables provide a constant-time lookup operation, there is a non-negligible performance overhead due to added lock contention, the extra computation required to do the lookup, and some resulting cache pollution. The system-wide hash table is used on each resource access to determine the pod associated with the running process. The frequent use of this hash table can cause lock contention and impair scalability.
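A minimal sketch of these two tables appears below, assuming 2.6-era list primitives; the structure names and layout are illustrative rather than the module's actual definitions. Each mapping entry is linked into both the system-wide physical-to-virtual table and its pod's virtual-to-physical table, so a single allocation serves both directions.

    /* Illustrative data structures for the two translation tables. */
    #include <linux/list.h>
    #include <linux/spinlock.h>

    #define POD_HASH_SIZE 256

    struct id_map {
        int vid;                       /* virtual identifier (pod namespace) */
        int pid;                       /* physical identifier (host namespace) */
        struct pod *pod;               /* owning pod */
        struct hlist_node p_link;      /* link in the system-wide table */
        struct hlist_node v_link;      /* link in the per-pod table */
    };

    /* System-wide: physical identifier -> (pod, virtual identifier);
     * phys_lock is initialized with spin_lock_init() at module load. */
    static struct hlist_head phys_table[POD_HASH_SIZE];
    static spinlock_t phys_lock;

    /* Per-pod: virtual identifier -> physical identifier, with its own
     * lock so lookups in one pod do not contend with other pods. */
    struct pod {
        spinlock_t lock;
        struct hlist_head virt_table[POD_HASH_SIZE];
    };

    /* Lookup in the system-wide table; caller holds phys_lock. */
    static struct id_map *phys_lookup(int pid)
    {
        struct hlist_node *n;

        for (n = phys_table[pid % POD_HASH_SIZE].first; n; n = n->next) {
            struct id_map *m = hlist_entry(n, struct id_map, p_link);
            if (m->pid == pid)
                return m;              /* caller reads m->pod and m->vid */
        }
        return NULL;
    }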

To minimize the cost of translating between pod namespaces and the underlying operating system namespace, we associate with each native process data structure a direct reference to the process's augmented virtualization state and to the process's pod. These direct references act as a cache optimization that eliminates the need to use the table to access the virtualization data of a process, which in turn reduces the hash table lookup rate. While this direct association only requires two references, it is unlikely that the native kernel process data structure has two unused fields available for this purpose. Instead, an effective solution that avoids changes to the underlying operating system kernel is to extend the area occupied on the process's kernel stack by two pointers that reference the relevant data structures. In this manner, once a kernel process data structure is obtained, there is no need to refer back to any hash tables to translate from physical to virtual identifiers. Because this operation is so common, this reduces the virtualization overhead of the system across a broad range of virtualized system calls and eliminates a major source of lock contention.

For most of the subsystems referenced in Table 5.2, virtualization consists of managing the pod's state tables and handling the race conditions discussed in § 5.3.1. This includes the following subsystems: PIDs, SysV IPC, network identifiers, and pseudo terminals, though the latter case of pseudo terminals requires further support through the file system. The file system, devices, and most of the miscellaneous subsystems are accessed through file system operations, whose virtualization specifics are discussed in § 5.3.2. Unix IPC semantics are controlled using file descriptors and process groups, so our handling of fork in § 5.3.1.2 is sufficient to handle Unix IPC virtualization. The remaining subsystem is pseudo file systems, which are discussed in § 5.3.3.
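One way to realize this direct reference without touching kernel structures is sketched below, assuming a 2.6-era kernel where struct thread_info resides at the base of each process's kernel stack. The reserved slots and the helper are our illustration of "extending the area occupied on the kernel stack by two pointers", not the actual implementation.

    /* Hypothetical sketch of the direct-reference optimization. */
    struct pod_stack_refs {
        struct pod *pod;             /* the process's pod, or NULL */
        struct pod_proc *vstate;     /* augmented per-process virt. state */
    };

    static inline struct pod_stack_refs *pod_refs(struct thread_info *ti)
    {
        /* the slots live immediately after thread_info at the stack base */
        return (struct pod_stack_refs *)(ti + 1);
    }

    /* Typical use in a wrapper: one dereference instead of a hash lookup:
     *     struct pod *pod = pod_refs(current_thread_info())->pod;        */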

[Figure 5.1 depicts the anatomy of a virtualization wrapper: a system call issued in user space first passes through a virtualization preamble, then invokes the underlying kernel system call, and finally passes through a virtualization epilogue before returning to user space.]

Figure 5.1 – Anatomy of virtualization wrappers

5.3.1 Race Conditions

Race conditions occur due to the non-atomic transactions carried out by the system call wrapper subroutines. This is inherent to a virtualization approach that does not modify the kernel and treats kernel subroutines as black boxes from the virtualization module's point of view. Race conditions, which make it difficult to maintain consistent virtual state, are more common in multiprocessor and multicore systems, as well as preemptive kernels, but are also present in non-preemptive uniprocessor systems.

Figure 5.1 illustrates the anatomy of a typical system call wrapper. Race conditions can be classified along two axes: where they occur, in the preamble or epilogue of the system call wrapper; and why they occur, due to identifier initialization, deletion, or reuse.

Preamble races can occur after a resource identifier has been translated from the virtual to the corresponding physical identifier, but before the underlying kernel code is invoked. The race occurs if, in this time frame, the resource is released in the kernel and its physical identifier is subsequently reused, so that it ends up pointing to a distinct resource, possibly in another pod. Race conditions due to identifier reuse are generally rare, given the very brief vulnerability window in which an unusual sequence of time-consuming events must occur, and because of the large size of the namespace from which offending identifiers are drawn. Allocation algorithms typically attempt to avoid reusing a recently reclaimed identifier. Nevertheless, factors such as heavy workload, the presence of swapping activity, having large portions of the namespace already in use, and enabling concurrency can all contribute to the risk of a race event.

Epilogue races can occur after the kernel returns a physical identifier of a resource but before the virtualization wrapper converts the physical identifier to a virtual identifier. Epilogue deletion races occur if the resource is freed during the period between kernel invocation and post-processing, which results in its removal from the virtualization state and causes the pending conversion to fail. Epilogue initialization races can appear between the time that a resource is allocated, which is usually after the completion of the underlying kernel code, and the time it is appropriately registered within the virtualization subsystem. Since these two operations are not executed within a single atomic section, the resource instance can be visible and modifiable to processes that do not belong to the same pod, or the resource instance can be exposed prematurely to processes in the same pod. In the following sections, we detail distinct problems and solutions and discuss the applicability of the patterns to other system resources.
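To make the two race windows concrete, the following annotated sketch of a getpgid wrapper marks where each class of race can strike; the translation helpers are illustrative names rather than our actual interfaces.

    /* The getpgid wrapper annotated with its race windows
     * (cf. Figure 5.1 and Figure 5.2); helper names are illustrative. */
    static long virt_getpgid(pid_t vpid)
    {
        pid_t pid;
        long pgid;

        /* preamble: virtual -> physical */
        pid = virt_to_phys_pid(current_pod(), vpid);
        if (pid < 0)
            return -ESRCH;
        /*
         * preamble race window: the target may exit here, and its PID
         * may be reused, possibly by a process in another pod
         */
        pgid = kern_getpgid(pid);
        /*
         * epilogue race window: the owner of pgid may exit before the
         * reverse translation (deletion race), or a newly created
         * resource may not yet be registered (initialization race)
         */
        return pgid < 0 ? pgid : phys_to_virt_pid(current_pod(), pgid);
    }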

5.3.1.1 Process ID Races

PID races can occur if a PID is referenced and changes during the execution of a virtualized system call, such that stale data ends up being used. A change occurs when the PID is released and reclaimed by the kernel after a process terminates, and the PID may end up being reassigned by the kernel to a newborn process, as seen in Figure 5.2.

Process A (pid=100, vpid=400)    Process B (pid=110, vpid=420)    Process C

1: SYS_GETPGID(420)
2: virt_to_phys(420) ⇒ 110
3: kern_getpgid(110) ⇒ 110
4:                               SYS_EXIT(0) ⇒ EXITED
5:                                                                CREATED ⇒ pid=110, vpid=755
6: phys_to_virt(110) ⇒ 755 ≠ 420

Figure 5.2 – PID deletion race: (1) process A queries the PGID of process B, (2) we convert from virtual to physical, and (3) call the actual kernel system call. If (4) process B now exits, and (5) a new process C gains the same PID, then (6) we convert back from physical to virtual incorrectly.

Both getppid and getpgid are examples of system calls vulnerable to these races. getppid takes no arguments, so it begins with a trivial preamble, followed by the invocation of the kernel's system call, after which the epilogue translates the result from the physical value to the virtual one. An epilogue deletion race exists if the mapping of the parent process's PID changes between the invocation and the translation, which can occur if the parent process terminates at exactly that point. One of three effects can occur as a result. First, if the PID is not reused, the kernel system call will return an error. Second, if the PID is reused and assigned to a new process in the same pod, the wrapper will return an erroneous value. Third, if the PID is reused by a new process in another pod, the wrapper will return a meaningless value. Similarly, getpgid begins by translating its PID argument from virtual to physical, then calls the kernel's system call, and wraps up by translating the returned PID back from physical to virtual. Thus, getpgid is exposed to the same epilogue deletion race as getppid, in addition to a preamble race.

Preamble races are potentially more harmful, especially for system calls that modify process state. For getpgid, a race between the preamble and the system call invocation can arise if the target process terminates in exactly that window. One of four effects can occur as a result. First, if the PID is not reused, the kernel system call will return an error. Second, if the PID is reused and assigned to a new process in the same pod, the returned value will be the process group ID of another process in the same pod. Both of these cases are harmless, as a similar race is inherent to Unix and may legally occur during its normal non-virtualized operation. Third, if the PID is reused and assigned to a process not in a pod, the physical process group ID returned by the kernel will not have a corresponding virtual group ID, and the wrapper subroutine will fail. Fourth, if the PID is reused and assigned to a process in some other pod, the process group ID from another pod will be returned. This results in information leakage and violates isolation between pods. It also causes inconsistency, as two successive system calls will return different results. Moreover, other system calls that tamper with the system state can result in worse behavior. For instance, a race in the case of kill could end up delivering a signal from a process in one pod to a process in some other pod, and setpgid could attempt to modify the process group ID of a process in another pod, to a possibly undefined value there.

To prevent these races, we ensure that a reference count is taken on the object in question, to guarantee that neither a PID nor the corresponding task structure is freed and reclaimed prematurely. This effectively protects the referenced object in its current state for the duration of the transaction. As long as that reference is held, the kernel will not reclaim the resource, even if the process that owns it has exited and has subsequently been collected. To implement this, we use the kernel's own reference count primitives for these objects and piggyback on them by calling the corresponding kernel subroutines to modify the reference count.

We minimize the interaction with the kernel by modifying the kernel's reference count only twice during the entire lifetime of a process. It is incremented when the process is associated with a pod, either by entering a pod or as a result of fork, and is decremented after the process exits. We combine this with a reference count that is maintained as part of the per-process virtualization state. This count is initialized to one when the process joins a pod and modified twice in every transaction that is vulnerable to a PID race: it is incremented at the beginning of the transaction and decremented when the reference is no longer needed. The separation between the kernel's reference count on the original object and the module's reference count on the virtualized object reduces lock contention by preferring per-pod locks over the kernel's global lock. It also improves portability by reducing the dependency on the kernel without additional code complexity, since the reference count for virtualized objects is required for other reasons as well.

Similar to processes in a pod, regular processes not running in a pod are vulnerable to a symmetric race, in which a regular process examines another regular process and the latter either enters a pod or dies before the former completes the transaction.
If not addressed, this can result in an interaction between a regular process and a process in a pod, which should otherwise be forbidden. For example, consider a regular process that attempts to send a signal to another regular process. If the PID of the latter joins the scope of some pod, either because its owner enters the pod or because the PID is reused after the owner exits, after the sender has already completed the PID translation but before it has invoked the underlying native kill, the sender would end up delivering a signal to a process otherwise invisible to it.

We resolve this race by keeping a reference count for PIDs accessed outside of pods. Regular processes are associated with a special pseudo-pod. Referenced PIDs are then guaranteed not to be reclaimed prematurely. At the same time, processes are prohibited from entering a pod while a positive reference count exists for one of their PIDs. To complete the solution, we added the constraint that a process may only enter a pod while the process executes in its own context. The rationale for this is similar to that of fork and other system calls: a fork cannot be imposed on a process, but rather must be executed by the process itself. This guarantees certain properties of the process state, such as being in a well-defined state at a specific entry point, which eliminates numerous races and other subtleties a priori.
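A sketch of the resulting race-free lookup used by wrapper preambles is shown below. find_task_by_pid() and tasklist_lock are genuine 2.6-era kernel interfaces, while the pod helpers and the per-object count are illustrative stand-ins for the module's bookkeeping.

    /* Sketch of the reference-counted PID lookup for wrapper preambles. */
    static struct task_struct *pod_get_task(struct pod *pod, pid_t vpid)
    {
        struct task_struct *t;
        pid_t pid = virt_to_phys_pid(pod, vpid);

        read_lock(&tasklist_lock);
        t = find_task_by_pid(pid);
        /* reject tasks that are gone, reused, or belong to another pod */
        if (t && pod_proc_of(t) && pod_proc_of(t)->pod == pod)
            atomic_inc(&pod_proc_of(t)->count);   /* per-transaction ref */
        else
            t = NULL;
        read_unlock(&tasklist_lock);
        return t;     /* caller drops the count when the wrapper is done */
    }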

5.3.1.2 PID Initialization Races

A key initialization race that must be addressed is the correct initialization of the virtualization state of a process, particularly on process creation as a result of the fork system call, as seen in Figure 5.3. When fork is called, the virtualized system call performs some preliminary internal management and accounting, then invokes the native system call, which returns twice: once in the context of the parent, and once in the context of the child. When executed in the context of the child process, the call returns immediately to user level and does not return control to the kernel, so no post-processing can be done as part of the virtualized system call execution by the child process. In particular, the child process may return and begin executing before the parent process returns from the native system call. As a consequence, there is a brief period of time in which the child process can resume execution without informing the virtualization module of its existence. Since the virtualization module is not aware that the child process has already been created, it is not able to initialize any necessary fields in its hash tables for that process, including any mappings between virtual and physical names for that child.

Parent (pid=100, vpid=400)       Child (pid=110, vpid=420)

1: SYS_FORK()
2: kern_fork() ⇒ pid=110
                                 3: CREATED
                                 4: SYS_GETPID()
                                 5: kern_getpid() ⇒ 110
                                 6: phys_to_virt(110) ⇒ UNDEFINED

Figure 5.3 – PID initialization race: (1,2) the parent forks and (3) a child is created. The child executes before the parent completes the fork and (4) queries its PID. We (5) call the kernel system call, but (6) cannot convert back from physical to virtual because the virtual PID is uninitialized.

Although the parent process can attempt to initialize the appropriate data structures on behalf of the child, it would only do so after its execution of the system call reaches the post-processing part. There is no guarantee that this occurs before the child process resumes execution at user level and potentially even issues other system calls. The problem is inherent to a system call interposition approach that treats fork as a black box. As a result, it becomes difficult to determine whether the new child process belongs to a pod and must be virtualized. If the problem is not fixed, the process will not be isolated within a pod, and may freely interact with the underlying system and other processes.

In constructing an efficient method to ensure that a child process's state is properly initialized, a key observation is that an uninitialized process may execute freely as long as no interaction occurs with its virtualized state. As soon as such an interaction takes place, the process must first be initialized before it is allowed to continue execution. Assuming a method exists to detect that a process is not initialized, there are three cases to address. First, when a parent process returns from the native fork system call, it tests whether the child process has already been initialized. If not, it initializes the virtualization state of the child, including storing the mapping of virtual and physical resource names in the appropriate virtualization data structure. Second, the nature of host-independent complete virtualization guarantees that the child process will not access any of its virtualization state until it calls a virtualized system call. As a result, the child process can wait until it calls a virtualized system call to have its virtualization state initialized. Each virtualized system call has a preprocessing step that tests whether the calling process is in a pod and whether its virtualization state has been initialized. When an uninitialized child process executes a virtualized system call, the system notices the uninitialized state and initializes the virtualization state at that time. Third, if some other process attempts to access the uninitialized child process via a virtualized system call, the child process is identified as being uninitialized, which causes the system to initialize the virtualization state at that time. Conceptually, the solution is to ensure that all direct and indirect access to the resource is virtualized; hence, the first time the resource is accessed, it is also initialized.

To provide correct operation with low overhead, we augment the use of hash tables by storing some per-process virtualization state as part of the in-kernel process data structure. The data structure used to represent a process in the kernel typically contains a set of flags used to note various process states. In general, the fields in the kernel process structure used to store such information are not completely populated, so that unused parts remain. We use two bits of these unused parts to piggyback on the native fork system call to implicitly initialize a minimal virtualization state that identifies a process as being in a pod and uninitialized.

These bits serve as two helper flags for virtualization. The first is the in-pod flag, which indicates whether a process is in a pod. A crucial advantage of using this field is that it is inherited across child process creation, given the semantics of fork. A child of a process that is already in a pod atomically inherits the flags and is therefore immediately identified as also being in a pod. Thus, processes in pods can be readily and efficiently filtered and made invisible to regular processes, even when uninitialized, since the flag is inherited. The second flag is the init-pending flag, which indicates that a process in a pod is pending initialization. A parent process sets its init-pending flag when processing the virtualized fork system call, as part of the preprocessing that occurs before executing the native system call. The flag is inherited atomically by the child process, so that both the parent and the child process appear to be uninitialized. The presence of the flag on the parent is meaningless and ignored. The flag is cleared on both once the child process has been initialized.

The combined approach of employing the helper flags in addition to the hash tables provides a performance benefit as well. Hash table lookup is part of the critical path of four common tasks: first, testing whether a process belongs to a pod when it issues a system call, to decide whether to virtualize or not; second, testing whether a target process (e.g., in a pod) should be masked out from another process (e.g., not in a pod); third, accessing the virtualization data of a process; and fourth, performing a physical-to-virtual translation or vice versa. Despite their fast lookup times, hash tables may incur non-negligible overhead when used very frequently. Testing for the flag on a process trivializes the first two tasks and eliminates the need to perform a hash table lookup, thereby improving scalability and performance by avoiding the need to serialize access to the table, and by eliminating both the extra cycles of the lookup and the associated cache pollution. This is particularly beneficial for regular processes not running in pods, which would otherwise suffer a performance degradation due to such a lookup in each virtualized system call.

Algorithm 5.1: syscall SYS_GETPID
 1  if not in-pod then
 2      pid ← kern_getpid()
 3      return pid
 4  end
 5  if init-pending then
 6      initialize_self()
 7  end
 8  pod ← lookup_pod()
 9  pid ← kern_getpid()
10  vpid ← phys_to_virt(pid)
11  return vpid

For example, in Algorithm 5.1, a negative result for the in-pod test at the beginning of the wrapper short-circuits the lookup_pod() call. Since PID-related functions are a main part of the code path, the overhead for the system as a whole is lowered. Isolation of processes in a pod from processes outside of the pod becomes easy as well: if a process not running in any pod attempts to access a process in a pod, the in-pod flag of the process in the pod will already be set, and hence it is straightforward to deny access.

5.3.1.3 SysV IPC Races

SysV IPC [159] primitives consist of message queues, shared memory, and semaphores. For simplicity, we focus our discussion on message queues, but the same principles apply to the other two primitives. IPC consists of two inter-related resources, namely identifiers and keys. Unlike PIDs, both are global and not associated with specific processes that own a reference to them. Keys identify a context and are persistent, while identifiers are created when such contexts are instantiated to allocate an IPC object, hence representing a specific instance. Once a key has been instantiated, future attempts to instantiate it will resort to the existing instance, until that instance is explicitly deleted. For example, the first call to msgget with some key value will allocate a new message queue and assign a unique identifier that represents that key. Subsequent calls will detect the active queue that is associated with the specified key and will return the same identifier, until finally the queue is removed, and so on. IPC identifiers and keys are global resources that must be virtualized using virtual-to-physical and physical-to-virtual hash tables. Like PIDs, they are also mutable during system calls, potentially resulting in initialization, preamble, and epilogue deletion races.

Initialization races can occur when a new identifier is created, as it may become visible to the system prior to the initialization of its virtualization data. For instance, an IPC identifier allocated inside a pod whose virtualization state has not yet been initialized cannot be determined to belong to the pod, and therefore may potentially be accessed by a process outside the pod. This issue is aggravated since most IPC primitives alter the system state, possibly before the resource is ready. As in the solution to the process initialization race condition, we must be assured that the resource cannot be accessed without going through the virtualization layer. Unlike processes, however, the internal data structure that represents IPC resources is not extensible, making it impossible to associate either a pointer or a flag with it.

To prevent misuse of IPC identifiers before their virtualization data is initialized, we introduce a third hash table, called the outside-pod table, to indicate whether a given instance has been initialized. The outside-pod table stores all identifiers in use by processes not in a pod. As with PIDs, this may be thought of as treating the namespace that does not belong to any pod as a pseudo-pod where virtual and physical identifiers are mapped one-to-one. Regular processes must consult the outside-pod table to access an identifier, analogous to testing the in-pod flag for PIDs. They will be blocked from accessing uninitialized identifiers, since they will fail to find them in the outside-pod table.

The outside-pod table must be correctly populated to account for IPC resources that may already exist when the virtualization module is loaded. When the module is loaded, it must scan the kernel data structures for instances of IPC objects and place the identifiers in the outside-pod table. Special care must be taken not to overlook an instance that is being created at the time of the scan or afterwards by a process that started a native (non-virtualized) system call prior to the scan. Otherwise, that identifier will not be accounted for and will consequently become invisible to all processes, including the process that created it. A performance issue with this scheme is the added overhead to IPC-related system calls for processes that do not belong to a pod, as every operation on an identifier by a process not in a pod implies a lookup in the outside-pod table.

Preamble races can occur due to identifier reuse, as seen in Figure 5.4. For example, consider a process in a pod that holds a valid message queue identifier and calls msgsnd to send a message. Suppose that after the translation from the virtual namespace to the physical namespace by the wrapper subroutine, another process in the same pod deletes that message queue from the system, and subsequently that identifier is reused for a new message queue in another pod. When the first process invokes the native system call, it will end up violating the isolation semantics between pods. Since the semantics of IPC allow instances to be removed at any time regardless of how many processes may be using them, the kernel does not keep a usage count on them, thus hindering piggybacking on a native reference count to handle preamble races as with PIDs.

Pod 1 (ipc id=10, ipc vid=55)                          Pod 2
Process A                  Process B                   Process C

1: SYS_MSGSND(55, ...)
2: virt_to_phys(55) ⇒ 10
                           3: SYS_MSGCTL(55, IPC_RMID, ...)
                           4: virt_to_phys(55) ⇒ 10
                           5: kern_ipcrm(10) ⇒ deleted
                                                       6: SYS_MSGGET(...) ⇒ 10
7: kern_msgsnd(10, ...) ⇒ ILLEGAL

Figure 5.4 – IPC reuse race: (1) process A sends to the queue with virtual ID 55, and (2) we convert the ID from virtual to physical. (3,4,5) Process B deletes the queue in the same pod, and (6) process C allocates a new queue with the same ID in another pod. When (7) process A calls the actual kernel system call, it sends the message illegally across the pod boundary.

To prevent improper reuse of identifiers, we introduce a fourth hash table, named the restricted-ID table, whose purpose is to track all instances that are referenced at any time by the virtualization state, together with their reference counts. Identifiers are inserted into the table, or their existing entries referenced, during the preamble. They are taken out if the reference count drops to zero in the epilogue. The virtualization code of msgget inhibits reuse of an identifier as long as it appears in the table, by having the epilogue inspect new identifiers to ensure that they are not restricted. If they are, the epilogue deallocates that instance and a new allocation is attempted.

However, this scheme is still incomplete, as illustrated by the following subtlety: suppose a process calls msgsnd with some identifier and is preempted between the preamble and the invocation of the actual system call. Suppose also that another process (in the same pod) now removes that instance, and a third process in another pod allocates a new message queue with the same identifier, but is preempted before testing it against the restricted-ID table. If the original process now resumes, it will eventually access the new message queue rather than the intended one. The outcome clearly undermines isolation between pods.

A simple solution is to have the epilogue of msgget mark the virtualization state of identifiers that were deallocated, so that the epilogue of msgsnd can detect this condition. When this occurs, msgsnd responds by retrying the operation. This is sufficient to ensure that the pod boundaries are respected, since the offending (newly allocated) identifier never gets a chance to be used in the other pod. Once interaction is confined to the original pod, any side effects are legal, since similar circumstances can occur on the native operating system.

A performance issue with the above is the added overhead to IPC-related system calls, even for processes that run outside any pod. This overhead comes directly from the extra bookkeeping by the virtualization logic. First, every creation of an instance, either inside or outside a pod, involves a lookup in the restricted-ID table. Second, every operation on an identifier requires that the identifier be inserted in the restricted-ID table for the duration of the system call. Finally, every operation on an identifier by a process not in a pod implies a lookup in the outside-pod table.
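The sketch below illustrates how a msgsnd wrapper might combine the restricted-ID table with this retry logic; all helper names are illustrative stand-ins for the module's bookkeeping, not the actual implementation.

    /* Sketch of restricted-ID bookkeeping in the msgsnd wrapper. The
     * identifier stays in the restricted-ID table for the duration of
     * the call, and the operation is retried if the instance was
     * deleted underneath it. */
    static long virt_msgsnd(int vid, void *msgp, size_t msgsz, int msgflg)
    {
        int id;
        long ret;

    retry:
        id = virt_to_phys_ipc(current_pod(), vid);
        if (id < 0)
            return -EINVAL;
        restricted_id_get(id);         /* inhibit reuse of this identifier */
        ret = kern_msgsnd(id, msgp, msgsz, msgflg);
        if (ipc_instance_deleted(current_pod(), vid)) {
            /* flagged by the msgget epilogue: translation may now differ */
            restricted_id_put(id);
            goto retry;
        }
        restricted_id_put(id);         /* leaves the table at count zero */
        return ret;
    }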

Recall that IPC keys identify a context and are persistent, while identifiers represent specific instances of IPC objects. IPC keys are unusual in that the user can select the value of a key when allocating a new message queue. With all other kernel resources, physical names are assigned solely by the kernel. Since the kernel always selects unique identifiers, it is not possible for two distinct resources to have the same physical name. In contrast, values of keys are set forth by the application and may potentially coincide across two distinct pods. If the same key is used in two separate pods and passed to the operating system for allocating new message queues, the operating system would not create a message queue in each pod. Instead, it would incorrectly create a single queue and provide the same queue identifier in both pods.

To address this problem, we leverage a special key value, IPC_PRIVATE, designed to allocate private message queues that are not associated with any specific key, and whose identifier cannot be obtained by a subsequent call to msgget. The virtualization wrapper of msgget first searches for the given virtual key in the corresponding hash table and returns the corresponding virtual identifier if found. Otherwise, it invokes the original system call, substituting the original key argument with the special IPC_PRIVATE, causing the operating system to generate a private queue. By creating private message queues, we ensure the uniqueness of each queue within the system. After the system call returns a new identifier, the epilogue allocates a corresponding virtual identifier, populates the table with a new mapping, and associates the virtual key with the virtual ID.

The special behavior of IPC allocation and its virtualization leads to a unique type of preamble race condition. Consider two processes in the same pod trying to allocate a message queue with the same key. Under normal circumstances, the one that is scheduled first will receive the identifier of a new queue, and the other will receive the same identifier. However, if both processes complete their preamble before either of them invokes the real system call, the preamble will have replaced the original key with IPC_PRIVATE, and they will now each obtain a distinct, private queue. The kernel already serializes certain types of IPC calls, such as creation, deletion, and manipulation of message queues (but not their actual use), with semaphores that ensure mutually exclusive modifications. Since the offending system calls are already serialized by design, we can use a matching semaphore to protect the virtualization wrapper and make the entire virtualized operation atomic without compromising scalability.

This solution also eliminates other race conditions, such as epilogue deletion races. A deletion race can only occur when a physical-to-virtual translation of IPC identifiers takes place. In the IPC context, this translation only happens during allocation. However, deletion and allocation are mutually exclusive by use of the semaphore and are therefore protected from this race.
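A sketch of key virtualization in the msgget wrapper appears below. It serializes with a semaphore matching the one the kernel uses for queue creation and deletion; all names here are illustrative rather than the module's actual interfaces.

    /* Sketch of key virtualization in the msgget wrapper. */
    static long virt_msgget(key_t vkey, int msgflg)
    {
        struct pod *pod = current_pod();
        long vid;
        int id;

        down(&pod_ipc_sem);               /* matches kernel serialization */
        vid = key_to_virt_id(pod, vkey);  /* existing instance for key? */
        if (vid < 0) {
            /* substitute IPC_PRIVATE so keys never collide across pods */
            id = kern_msgget(IPC_PRIVATE, msgflg);
            vid = (id < 0) ? id : register_ipc_instance(pod, vkey, id);
        }
        up(&pod_ipc_sem);
        return vid;
    }

Because the wrapper holds the semaphore across both the key lookup and the allocation, two processes in the same pod racing on the same key serialize correctly: the second one finds the mapping created by the first rather than allocating a second private queue.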

5.3.1.4 Pseudo Terminal Races

Pseudo terminals [159] (PTS) are pairs of master/slave devices whose input and output streams are cross-linked. The slave end emulates the behavior of a line terminal for the process using it. When the master PTS multiplexer device is opened, a corresponding inode for a slave device is created in the devpts pseudo file system and named /dev/pts/N, where N is the device minor number. The inode is destroyed when the device is released. In addition to virtualizing the pseudo terminal name (the device minor number), it is essential to virtualize the entries in /dev/pts to export adequate views in the contexts of different pods. This also prevents races arising from having indirect paths to the resource, and is discussed further in § 5.3.3.

The only conceivable system call involving pseudo terminals is for a process to query the identifier of a terminal attached to a file descriptor. This is always race-free, since the initialization of the pseudo terminal must have already been completed when the matching open system call terminated, and the deletion of the pseudo terminal cannot have occurred, since that would require the terminal to be closed, yet the file descriptor in question is still open. While pseudo terminals are not subject to preamble and deletion races, they are subject to initialization races.

The initialization pitfall is similar to the problem with IPC, where a process not in a pod may be able to access a recently created slave device in a pod prior to the completion of its setup. A plausible approach is to use a solution identical to the outside-pod table used with IPC, to completely mask out pseudo terminals while they are not yet initialized. Instead, we employ a lower-overhead approach that takes advantage of the ability to borrow one bit from the in-kernel data structure of pseudo terminals. We use the bit to store a per-device init-complete flag that marks the device as initialized when it is set. Since the bit is unused, the flag is not set upon creation of a pseudo terminal. A newly created pseudo terminal without the flag set is noted as uninitialized and is made invisible to any process trying to access it, be it inside or outside a pod. The flag is set once the initialization is successfully completed and the appropriate associations in the translation hash tables have been made, making the pseudo terminal finally accessible within its pod, or among regular processes if it was originally created outside of any pod. This approach is similar to the init-pending flag used for dealing with process initialization races.

As with the outside-pod table for SysV IPC, it is possible that access to a pseudo terminal not belonging to any pod is denied for a brief period to processes not in a pod that should otherwise be able to access it. This can happen in the period between the creation of the new inode and the final completion of its initialization. Since such behavior can be expected in traditional Unix, we regard this as a non-virtualization issue. When the virtualization module is loaded, it is necessary to scan the kernel for already open pseudo terminals and set their initialization flags. This must be done carefully, to account for rare races that can occur during the scan itself.

5.3.2 File System Virtualization

To provide modular support for multiple file systems, many commodity operating systems provide a virtual file system framework that supports a form of interposition known as file system stacking [186]. We leverage this support, along with the chroot utility, to simplify file system virtualization. File system virtualization is accomplished by creating a special directory per pod that serves as a staging area for the pod's private file system hierarchy. Storage requirements are minimized by sharing read-only portions of the file system among pods, where applicable, through loopback mounting or networked file systems. The chroot system call is used to confine processes that belong to the pod within their private subtree. To ensure that the root of that file system is never traversed, we use a simple form of file system stacking that overloads the underlying file system's permission function to implement a barrier directory that enforces the chroot-ed environment and ensures that the subtree is only accessible to processes within the owning pod. This use of file system stacking leverages existing kernel functionality and avoids the need to replicate that functionality as part of the virtualization implementation.
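The stacked permission operation at the heart of the barrier directory could look roughly as follows, using the 2.6-era inode operation signature; lower_inode() and the pod helpers are illustrative names for the stacking glue, not our actual code.

    /* Sketch of the barrier directory's stacked permission operation. */
    static int barrier_permission(struct inode *inode, int mask,
                                  struct nameidata *nd)
    {
        struct inode *lower = lower_inode(inode);   /* the stacked-on inode */

        /* deny traversal to any process outside the owning pod */
        if (!task_in_pod(current, inode_pod(inode)))
            return -EACCES;
        if (lower->i_op && lower->i_op->permission)
            return lower->i_op->permission(lower, mask, nd);
        return generic_permission(lower, mask, NULL);
    }

Because the check runs in the kernel on every path lookup that crosses the barrier, even a process that escapes its chroot via a stray file descriptor cannot traverse into another pod's subtree.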

5.3.3 Pseudo File Systems

Pseudo file systems are memory-based file systems that provide the user with an interface to kernel resources and facilities. Pseudo file systems share three key properties. First, they provide a public, indirect path to a view of global resources. Second, creation and deletion of resource instances are reflected dynamically in this view, because the actual behavior of the resources is tracked using dedicated callback subroutines. Third, they may generate a process-specific view that is context dependent and differs among processes (e.g., the symbolic link /proc/self). To ensure proper virtualization semantics, we must virtualize these views to provide context-dependent views corresponding to the respective pod being used. We briefly discuss how this is done cleanly and simply for two important pseudo file systems, devpts and procfs.

The devpts file system provides an interface to pseudo terminals. Similar to the file system virtualization described earlier, we use a file system stacking approach to virtualize devpts. Given that each pod uses a dedicated subtree of the file system as its root file system, we provide a pod devpts by stacking an instance of a virtualization file system on the /dev/pts directory in each pod. This is a very lightweight stackable file system whose sole purpose is to virtualize the underlying file system in the context of the specific pod in which it resides. In addition, the required logic is completely independent of the specific pseudo file system, which significantly reduces complexity while leveraging the generality and portability of file system stacking.

The procfs file system maintains a view of the processes running in the system and of an array of system properties. A key feature is that it provides an exported interface that loadable kernel modules can use to dynamically extend its layout. We harness this dynamic extensibility to provide each pod with the requisite context-dependent view. For each pod, the virtualization layer automatically creates a private subtree within the procfs hierarchy by mirroring the original file system structure. To keep the overhead low, we do not replicate code or create additional inodes, but instead use hardlinks to refer to existing inodes. This subtree is loopback-mounted at the appropriate point (/proc) within the pod's root subtree. This approach is appealing due to its simplicity and lower virtualization overhead compared to other approaches such as file system stacking.
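A recursive sketch of such mirroring is shown below. proc_mkdir() is part of the exported procfs interface, while the traversal over proc_dir_entry reflects 2.6-era internals, and pod_masks_entry()/pod_link_entry() are illustrative helpers for filtering and for the hardlink-style reuse of inodes.

    /* Sketch of mirroring procfs into a pod's private subtree. */
    static void pod_mirror_proc(struct pod *pod,
                                struct proc_dir_entry *src,
                                struct proc_dir_entry *dst)
    {
        struct proc_dir_entry *de;

        for (de = src->subdir; de; de = de->next) {
            if (pod_masks_entry(pod, de))   /* hide other pods' entries */
                continue;
            if (S_ISDIR(de->mode)) {
                struct proc_dir_entry *sub = proc_mkdir(de->name, dst);
                if (sub)
                    pod_mirror_proc(pod, de, sub);  /* recurse */
            } else {
                /* hardlink-style: reuse the existing inode, do not copy */
                pod_link_entry(de, dst);
            }
        }
    }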

5.4 Evaluation

To demonstrate the effectiveness of our operating system virtualization implementation, we measured its performance on both micro-benchmarks and real application workloads, running on both uniprocessor (UP) and multiprocessor (SMP) systems. We quantify the overhead of operating system virtualization, measure the additional cost of correctly handling various race conditions, and also present measurements using hardware virtualization to provide a basis of comparison between virtualization approaches.

We performed our experiments on four configurations:

Base: a vanilla Linux system, providing a baseline measure of system performance.
With: Base with our virtualization module loaded but no pods instantiated, providing a measure of the overhead incurred by regular processes running outside of a pod.
Pod: Base with our virtualization module loaded and benchmarks executed inside of a pod, providing a measure of the overhead incurred when running inside of a pod.
Xen: a hardware virtualization system with benchmarks executed inside of a virtual machine running a Linux operating system, providing a measure of the overhead of hardware virtualization.

Xen was used given its claims of superior performance versus other hardware virtualization systems [47]. The Xen VM was configured with 512 MB RAM unless otherwise indicated. We verified that this RAM allocation did not adversely affect the performance of the Xen benchmarks by providing the Xen domain with increasingly more RAM, which resulted in no appreciable performance gains.

We ran our experiments on both Linux 2.4 and 2.6 kernels, which represent the two major versions of Linux still in use today, but only present data for Linux 2.6.11 for brevity. The operating system configuration used was Debian Stable, and we used the latest version of Xen available with Debian at the time of our experiments, version 2.0.6. For Base, With, and Pod, we present results for Linux with SMP disabled and enabled; Xen 2.0 does not support multiprocessor VMs.

We conducted the measurements on an IBM HS20 eServer BladeCenter. Each blade had dual 3.06 GHz Intel Xeon CPUs, 2.5 GB RAM, a 40 GB local disk, and was connected to a Gigabit Ethernet switch. Hyperthreading was enabled for the SMP kernel measurements. To provide a fair comparison, we ensured that memory size was not a limitation for any of our experiments.

5.4.1 Micro-benchmarks

To measure the basic cost of our operating system virtualization prototype, we used micro-benchmarks to measure the performance of different system calls with different types of virtualization overhead. Seven micro-benchmarks were used; each ran a system call in a loop and measured its average execution time: getpid, getsid, getpgid, fork, execve, shmget, and shmat. In particular, the fork benchmark forked and waited for a child that exited immediately, the execve benchmark forked and waited for a child that executed a program that exits immediately, the shmget benchmark created and removed IPC shared memory segments, and the shmat benchmark attached a segment of shared memory twice, modified both copies, and detached it.

Figure 5.5 shows the execution time of each benchmark on each configuration using a UP kernel, normalized to Base. Each bar shows the actual execution time reported by the benchmark. Comparing With and Pod with Base, the measurements show that operating system virtualization overhead is quite small in all cases, generally 1-2% and less than 15% in the worst case.

[Figure 5.5 plots the normalized completion time of each micro-benchmark (getpid, getsid, getpgid, fork, execve, shmget, shmat) for the UP Base, UP With, UP Pod, and Xen configurations; each bar is labeled with the actual execution time.]

Figure 5.5 – Virtualization cost on UP - micro-benchmarks

Comparing Xen with Base, the measurements show that hardware virtualization overhead is much larger, especially for more complex system calls: almost five times larger for fork and shmat, and more than two times larger for execve and shmget. The only system calls on which Xen did not incur substantial overhead were simple PID-related system calls. These measurements show that operating system virtualization can provide much less overhead than hardware virtualization.

We performed a simple comparison on the UP kernel to quantify the cost of using ptrace for system call interposition in lieu of our kernel module implementation. We executed a version of the getpid micro-benchmark that consists of two processes: a tracer and a tracee. The tracer process is notified about every entry and exit of system calls of the tracee. It then peeks into the tracee's memory, emulating the work of the wrapper's preamble and epilogue. The tracee executes getpid repeatedly, and we measured the average execution time for the system call. The average execution time was 5.5 µs for tracing without peeking and 7.7 µs for tracing and peeking. getpid thus degrades by a factor of 13 just for monitoring, and by a factor of 20 if the tracer also peeks into the tracee's memory. This overhead does not even take into account the added cost of basic virtualization functionality. In comparison, our kernel module virtualization overhead is only 2% for the same system call.

We also compared our measurements with results reported for in-kernel interposition mechanisms. SLIC [65] reports 10% overhead due to the basic dispatcher code, roughly 35% for the interception of getpid, and somewhat lower for more involved system calls. However, their base system was a slower UltraSparc, and we expect their overhead would be much less on a more modern system. Systrace [139] reports 30% overhead for the same case, which includes their security policy checks, on hardware similar to ours. Our measurements suggest that a loadable kernel module implementation is not outperformed by an implementation that modifies the kernel directly.

The actual With and Pod execution times for the first three benchmarks shown in Figure 5.5 provide a quantitative measure of the basic cost of operating system virtualization functionality. getpid only adds a test of the in-pod and init-pending flags to determine whether the process is in a pod, and if so, whether it is pending initialization. The overhead is 8 ns and represents the minimum overhead of a virtualized system call. With and Pod execution times are the same for getpid because the cost of obtaining the virtual identifier is negligible, since it is stored as part of the per-process virtualization state, which is directly referenced; no hash table translation is required. In addition to testing the in-pod and init-pending flags, getsid requires using the hash table to translate the return value from a physical to a virtual identifier if a process is in a pod. This translation overhead is 14 ns, as indicated by the difference between the With and Pod getsid execution times. In addition to testing the in-pod and init-pending flags, getpgid needs to do a hash table lookup and modify the kernel's process reference count for the With case, as discussed in § 5.3.1.1.

the kernel’s process reference count for the With case, as discussed in § 5.3.1.1. This added 13 ns versus the Base case due to the lookup, since the reference count adds negligible overhead. For the Pod case, getpgid also needs to modify the virtualization module’s process reference count and perform an additional hash table translation, adding 15 ns versus the With case. While more complex system calls require more operating system virtualization logic, the overhead of the additional logic is amortized by the additional overhead of the native system call. For example, even though the virtualization cost of fork is almost a microsecond as shown in Figure 5.5, the virtualization overhead as a percentage of the system call execution time is only 1%. The overhead is due to allocating and preparing the virtualization data structure for the child process, linked list maintenance and ensuring correct initialization. All of the more complex system calls have small overhead except for shmget, which has 15% overhead for With and Pod compared to Base. The higher overhead here is largely due to the use of an additional semaphore as discussed in § 5.3.1.3, which does not compromise scalability, but it does increase execution time. Figure 5.6 shows the execution time of each benchmark on each configuration using a SMP kernel and normalized to Base. Each bar shows the actual execution time reported by the benchmark. No Xen results are shown due to its lack of SMP support in the version that we used. Comparing With and Pod with Base, the measurements show that operating system virtualization overhead is quite small in all cases, generally 2-4% and less than 35% in the worst case. Compared to Figure 5.5, the measurements show that most of the micro-benchmarks take longer to run when using the Base SMP kernel versus the Base UP kernel. This is due to the kernel’s locking mechanisms, which are trivialized in the UP kernel. Operating system virtualization overhead is small for most system calls in the SMP Chapter 5. Virtual Execution Environment 115

[Figure 5.6 plots the normalized completion time of each micro-benchmark (getpid, getsid, getpgid, fork, execve, shmget, shmat) for the SMP Base, With, and Pod configurations; each bar is labeled with the actual execution time.]

Figure 5.6 – Virtualization cost on SMP - micro-benchmarks

Operating system virtualization overhead is small for most system calls in the SMP case, but is noticeably higher for three of the micro-benchmarks: getsid, getpgid, and shmget. In those cases, the extra locking mechanisms required by the virtualization module result in more overhead for processes running in a pod. The cost of these mechanisms is more prominent for the simple getsid and getpgid calls, but is amortized for the more complex system calls. However, the cost of synchronization is noticeable for shmget, because it requires the use of a semaphore, which is much more expensive than simple spin locks. Except for getsid, getpgid, and shmget, operating system virtualization overhead was roughly 3% or less in all other cases.

We also measured virtualization overhead using LMbench [112] to provide an additional comparison of operating system and hardware virtualization costs. Figure 5.7 shows LMbench local communication and VM system latency measurements for the UP kernel, normalized to Base; SMP results are not presented given Xen's lack of SMP support. Each bar also shows the actual benchmark execution time. Pod measurements show negligible overhead in almost all cases compared to Base.

[Figure 5.7 plots normalized performance for the LMbench latency benchmarks (page fault, pipe latency, tcp latency, mmap latency, context switch, and socket latency) on the UP Base, UP Pod, and Xen configurations; each bar is labeled with the actual execution time.]

Figure 5.7 – Virtualization cost for LMbench

In contrast, Xen measurements show significant overhead in all cases, ranging from 57% to 370% extra overhead compared to Base. These benchmarks exercise operating system performance but do not necessarily involve system calls. As a result, operating system virtualization overhead is negligible compared to hardware virtualization overhead.

5.4.2 Application Benchmarks

To provide a more realistic measure of the virtualization cost expected in actual use, we now turn our attention to real application workloads. While virtualization is pivotal to DejaView's functionality, the ability to checkpoint and revive applications, and to encapsulate applications in isolated environments, in a manner that is transparent and lightweight, is beneficial in a broader context. We evaluate both desktop and server applications to show the overhead both for the DejaView use case and for other potential use cases that involve servers, such as supporting server consolidation and improving system security. Because server applications are generally more demanding than interactive desktop applications, which often allow idle periods during user think times, results for server applications provide a conservative performance measure in the context of DejaView.

We measured the performance of different virtualization approaches using five different application workloads:

make: complete build of the Linux kernel using gcc 3.3 (make -j 10).
hackbench: a scheduler performance scalability benchmark that creates many processes in groups of readers and writers sending small messages [76] (32 groups).
mysql: Super Smack 1.3 database benchmark using MySQL 4.1.9.
volano: VolanoMark 2.5.0.9 chat server benchmark using Blackdown Java 2 Runtime Environment 1.4.2-02.
httperf: httperf 0.8 web server performance benchmark using the Apache 2.0.53 web server.

Figure 5.8 shows the execution time of the make, hackbench, mysql, and volano benchmarks on Base and Pod using both UP and SMP kernels, and on Xen using a UP kernel, with all measurements normalized to UP Base. (We omit the With case, since the micro-benchmark results already show that the overhead for regular processes running outside a pod is negligible.) Each bar shows the actual execution time reported by the benchmark. The mysql benchmark reports queries per second, but is converted to query service time to be consistent with the other benchmarks.

The measurements show that Xen has the worst performance on all of the applications, incurring more than 40% overhead in the worst case. In contrast, the measurements show that Pod provides comparable performance to Base across both UP and SMP kernels. Operating system virtualization overhead, as shown by the Pod case, is 2% or less for all applications. Furthermore, the SMP measurements show that operating system virtualization directly benefits from SMP support in the underlying kernel and can take advantage of available SMP hardware to improve the performance of multi-process and multi-threaded applications. The SMP results for operating system virtualization show a 1.5 to 2.2 speedup versus using the UP kernel.

[Figure 5.8 plots the normalized completion time of the make, hackbench, mysql, and volano benchmarks for the Base, Pod, Xen, SMP Base, and SMP Pod configurations; each bar is labeled with the actual execution time.]

Figure 5.8 – Virtualization cost for macro-benchmarks

To provide a measure of performance scalability, we measured the performance of httperf as we scaled the number of instances of the benchmark running at the same time. For Base, we ran multiple instances of Apache, with each instance listening on a different port. For Pod, we ran an Apache instance in each pod. We scaled the number of httperf and Apache instances from 1 to 128 and measured the average request rate across all instances. We also executed this benchmark inside a Xen virtual machine running a Linux operating system, to provide a measure of the overhead of hardware virtualization. Xen was used given its claims of superior performance versus other hardware virtualization systems [47]. For Xen, we ran an Apache instance in each Xen VM. To enable Xen to scale to a larger number of Apache instances, we configured each Xen VM with 128 MB RAM.

[Figure 5.9 plots the average request rate (requests/s) per instance as the number of Apache instances scales from 1 to 128, for the Base, Pod, and Xen configurations.]

Figure 5.9 – Virtualization scalability

Figure 5.9 shows the results of this experiment. As expected, the average httperf performance per instance decreases as the number of instances increases, due to competition for a fixed set of hardware resources. httperf performed similarly on all systems when only one instance was executed. However, as the number of instances increased, httperf performance on Xen fell off substantially compared to its performance on Base or Pod. Xen scalability was further limited by its inability to run more than 16 application instances at the same time, because it could not allocate any additional VM instances given the 128 MB RAM per VM and the 2 GB RAM available in the machine used. In contrast, both Base and Pod continue to provide comparable, scalable performance up to 128 instances. It is worth noting that in addition to performance scalability, Xen is also limited by storage scalability, since each VM requires a separate operating system image. Since operating system virtualization does not require separate operating system images per virtual execution environment, it does not suffer from this storage limitation.

5.5 Summary

In this chapter, we presented DejaView's virtual execution environment, which relies on operating system virtualization to provide private virtual namespaces for computing sessions, a prerequisite for the live execution recording described in the next chapter. The construction of such a framework presents a myriad of traps and pitfalls, even for the most cautious developer, that, if overlooked, may result in weak, incomplete virtualization. We explored in depth key implementation challenges in supporting operating system virtualization in the context of commodity operating systems, including system call interposition, virtualization state management, and race conditions. Using an operating system virtualization prototype that works across two major versions of Linux (namely, the Linux 2.4 and 2.6 series kernels), we demonstrated the benefits of a loadable kernel module implementation and showed that the overhead of this approach is substantially less than that of other approaches. Our experimental results on both 2.4 and 2.6 kernels, on both uniprocessor and multiprocessor systems, demonstrate that our approach can provide fine-grain virtualization with very low overhead.

Chapter 6

Live Execution Recording

In this chapter, we present DejaView’s live execution recording approach and the mechanisms available to revive a past session from recorded data. DejaView leverages the virtual execution environment presented in Chapter5 to transparently encapsu- late the entire user’s desktop session, including its virtual display server, in a private virtual namespace. Decoupling applications and display state from the underlying operating system allows them to be continuously checkpointed and later revived in a consistent manner. Operating above the operating system instance to encapsulate the desktop session only, not an entire machine instance, provides fast operation by only saving applications state, not the entire operating system instance. DejaView’s recording allows users to revive their desktop as it was at any point in the past, for cases where “static” visual information is not enough. Revived sessions behave just like the main desktop session; the user is free to continue to interact with them and possibly diverge from the path taken in the original recording. Fur- thermore, the sessions peacefully coexist with each other, as the virtual execution environment surrounding each session guarantees that underlying resources will be safely multiplexed and the sessions will be completely isolated from each other. Chapter 6. Live Execution Recording 122

DejaView combines a coordinated checkpoint with logging and unioning file system mechanisms to capture the file system state at each checkpoint. This ensures that applications revived from a checkpoint are given a consistent file system view, supporting simultaneous read-write usage, that corresponds to the time at which the checkpoint was taken. DejaView's recording benefits from several optimizations aimed at reducing application downtime during checkpoint, and from a checkpoint policy that throttles the checkpoint rate to reduce both the runtime impact of checkpoints on interactive applications and the storage requirements.

6.1 Application Checkpoint-Restart

Application checkpoint-restart is the ability to save the state of a running application to secondary storage so that it can later resume its execution from the time at which it was checkpointed. Checkpoint-restart can provide many potential benefits, including fault recovery by rolling back an application to a previous checkpoint, better application response time by restarting applications from checkpoints instead of from scratch, and improved system utilization by stopping long-running computationally intensive jobs during execution and restarting them when load decreases. An application can be migrated by checkpointing it on one machine and restarting it on another, which provides further benefits, including fault resilience by migrating applications off of faulty hosts, dynamic load balancing by migrating applications to less loaded hosts, and improved service availability and administration by migrating applications before host maintenance so that applications can continue to run with minimal downtime.

Many important applications consist of multiple cooperating processes. In order to checkpoint-restart these applications, not only must application state associated with each process be saved and restored, but the state saved and restored must be globally consistent and preserve process dependencies. Operating system process state, including shared resources and the various identifiers that define process relationships, such as group and session identifiers, must be saved and restored correctly. Furthermore, for checkpoint-restart to be useful in practice, it is crucial that it transparently support the large existing installed base of applications on commodity operating systems.

Zap [124] is a system that provides transparent checkpoint-restart of unmodified applications that may be composed of multiple processes on commodity operating systems. The key idea is to introduce a thin virtualization layer on top of the operating system that encapsulates a group of processes in a virtualized execution environment and decouples them from the operating system. This layer presents a host-independent virtualized view of the system so that applications can make full use of operating system services and still be checkpointed and then restarted at a later time, on the same machine or a different one. While previous work presents key aspects of Zap's design, it did not describe a number of important engineering issues in building a robust checkpoint-restart system. In particular, a key issue that was not discussed is how to transparently checkpoint multiple processes such that they can be restarted in a globally consistent state.

Building on Zap, we address this consistency issue and discuss in detail key design issues in building a transparent checkpoint-restart mechanism for multiple processes on commodity operating systems. We combine a kernel-level checkpoint mechanism with a hybrid user-level and kernel-level restart mechanism to leverage existing operating system interfaces and functionality as much as possible for checkpoint-restart. We introduce a novel algorithm for accounting for process relationships that correctly saves and restores all process state in a globally consistent manner. This algorithm is crucial for enabling transparent checkpoint-restart of interactive graphical applications and correct job control.
We introduce an efficient algorithm for identifying and accounting for shared resources and correctly saving and restoring such shared state across cooperating processes. To reduce application downtime during checkpoints, we also provide a copy-on-write mechanism that captures a consistent checkpoint state and correctly handles shared resources across multiple processes.

6.1.1 Virtualization Support

To enable checkpoint-restart, we leverage our virtual execution environment to decouple applications from the underlying host. Checkpoint-restart operates on the pod abstraction, which is used to encapsulate a set of processes in a self-contained unit that can be isolated from the system, checkpointed to secondary storage, and transparently restarted later. This is made possible because each pod has its own virtual private namespace, which provides the only means for processes to access the underlying operating system.

Operating system resource identifiers, such as process IDs, must remain constant throughout the life of a process to ensure its correct operation. However, when a process is checkpointed and later restarted, possibly on a different operating system instance, there is no guarantee that the system will provide the same identifiers to the restarted process; those identifiers may in fact be in use by other processes in the system. The pod namespace addresses these issues by providing consistent, virtual resource names. Names within a pod are trivially assigned in a unique manner, in the same way that traditional operating systems assign names, but such names are localized to the pod. These virtual private names are not changed when a pod is restarted, to ensure correct operation. Instead, pod virtual resources are transparently remapped to real operating system resources.

Since the namespace is virtual, there is no need for it to change when the pod is migrated, ensuring that identifiers remain constant throughout the life of the process, as required by applications that use such identifiers. Since the namespace is private to a given pod, processes within the pod can be checkpointed and restarted as a group, while avoiding resource naming conflicts among processes in different pods.

In addition to providing a private virtual namespace for processes in a pod, our virtualization approach provides three key properties so that it can be used as a platform for checkpoint-restart. First, it provides mechanisms to synchronize the virtualization of state across multiple processes consistently with the occurrence of a checkpoint, and upon restart. Second, it allows the system to select predetermined virtual identifiers upon the allocation of resources when restarting a set of processes, so that those processes can reclaim the same set of virtual resources they had used prior to the checkpoint. Third, it provides virtualization interfaces that can be used by checkpoint and restart mechanisms to translate between virtual identifiers and real operating system resource identifiers. During normal process execution, translating between virtual and real identifiers is private to the virtualization layer. However, during checkpoint-restart, the checkpoint and restart functions may also need to request such translations to match virtual and real namespaces.

We combine pod virtualization with a mechanism for checkpointing and restarting multiple cooperating processes in a pod. For simplicity, we describe the checkpoint-restart mechanism assuming a shared storage infrastructure across participating machines. In this case, file system state is not generally saved and restored as part of the pod checkpoint image, to reduce checkpoint image size. Available file system snapshot functionality [24, 53, 74, 117] can be used to also provide a checkpointed file system image. We focus only on checkpointing process state; details on how to checkpoint file system, network, and device state are beyond the scope of this dissertation.
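As an illustration of the second and third properties, consider the sketch below: a per-pod table mapping virtual PIDs to real kernel PIDs. This is a minimal user-level sketch in C, not DejaView's actual kernel interface; the names pod_pidmap, virt_to_real, and restore_vpid are hypothetical. virt_to_real is the kind of translation performed during system call interposition, while restore_vpid shows how a predetermined virtual identifier taken from a checkpoint image can be bound to a freshly created real process at restart.

    #include <sys/types.h>

    #define POD_MAX_TASKS 4096

    struct pod_pidmap {
        pid_t virt[POD_MAX_TASKS];   /* virtual PID visible inside the pod */
        pid_t real[POD_MAX_TASKS];   /* real PID assigned by the kernel */
        int   count;
    };

    /* Translate a virtual PID to the real one (e.g., when interposing
     * on a system call); returns -1 if the identifier is unknown. */
    pid_t virt_to_real(const struct pod_pidmap *m, pid_t vpid)
    {
        for (int i = 0; i < m->count; i++)
            if (m->virt[i] == vpid)
                return m->real[i];
        return -1;
    }

    /* At restart, bind a predetermined virtual PID (read from the
     * checkpoint image) to the freshly created real process. */
    int restore_vpid(struct pod_pidmap *m, pid_t vpid, pid_t realpid)
    {
        if (m->count >= POD_MAX_TASKS)
            return -1;
        m->virt[m->count] = vpid;
        m->real[m->count] = realpid;
        m->count++;
        return 0;
    }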

6.1.2 Key Design Choices

Before describing the checkpoint-restart mechanisms in further detail, we first discuss three key aspects of their overall structure: first, whether the mechanism is implemented at kernel level or user level; second, whether it is performed within the context of each process or by an auxiliary process; and third, the ordering of operations to allow streaming of the checkpoint data.

Kernel-level vs. user-level: Checkpoint-restart is performed primarily at kernel level, not at user level. This provides application transparency and allows applications to make use of the full range of operating system services. The kernel-level functionality is explicitly designed so that it can be implemented as a loadable module, without modifying, recompiling, or relinking the operating system kernel. However, to simplify process creation, a necessary step for restarting from a checkpoint, we leverage existing operating system services to perform the process creation phase of the restart algorithm at user level. The standard process creation system call, fork, is used to reconstruct the process forest. Combining a kernel-level checkpoint with a hybrid user-level and kernel-level restart mechanism leverages existing operating system interfaces and functionality as much as possible.

In context vs. auxiliary: Processes are checkpointed from outside of their context and from outside of the pod using a separate auxiliary process, but processes are restarted from inside the pod within the respective context of each process.

Checkpoint is executed by an auxiliary process that runs outside of the pod for two reasons. First, since all processes within the pod are checkpointed, this simplifies the implementation by avoiding the need to special-case the auxiliary process to prevent it from being checkpointed. Second, the auxiliary process needs to make use of unvirtualized operating system functions to perform parts of its operation. Since processes in a pod are isolated from processes outside of the pod when using the standard system call interface [124], the auxiliary process uses a special interface for accessing the processes inside of the pod to perform the checkpoint.

Furthermore, checkpoint is done not within the context of each process, for four reasons. First, using an auxiliary process makes it easier to provide a globally consistent checkpoint across multiple processes by simply quiescing all processes, then taking the checkpoint; there is no need to have each process run to checkpoint itself, nor to attempt to synchronize their checkpoint execution. Second, a set of processes is allowed to be checkpointed at any time, and not all of the processes may be runnable. If a process cannot run, for example if it is stopped at a breakpoint as a result of being traced by another process, it cannot perform its own checkpoint. Third, to have checkpoint code run in the process context, the process must invoke this code involuntarily, since we do not require process collaboration to checkpoint. While this can be addressed by introducing a new specific signal to the kernel [102] and arranging for this signal to be served within the kernel, it requires kernel modifications and cannot be implemented by a kernel module. Fourth, it allows for using multiple auxiliary processes concurrently (with simple synchronization) to accelerate the checkpoint operation.

Unlike checkpoint, restart is done within the context of the process that is restarted, for two reasons. First, operating within process context allows us to leverage the vast majority of available kernel functionality that can only be invoked from within that context, including almost all system calls. Although checkpoint only requires saving process state, restart is more complicated, as it must create the necessary resources and reinstate their desired state. Being able to run in process context and leverage available kernel functionality to perform these operations during restart significantly simplifies the restart mechanism. Second, because the restart mechanism creates a new set of processes that it completely controls on restart, it is simple to cause those processes to run, invoke the restart code, and synchronize their operations as necessary. As a result, the complications with running in process context during checkpoint do not arise during restart.

More specifically, restart is done by starting a supervisor process which creates and configures the pod, then injects itself into the pod. Once it is in the pod, the supervisor forks the processes that constitute the roots of the process forest. The root processes then create their children, which recursively create their descendants.
Once the process forest has been constructed, all processes switch to operating at kernel level to complete the restart. The supervisor process first restores globally shared resources, then each process executes concurrently to restore its own process context from the checkpoint. When all processes have been restored, the restart completes and the processes are allowed to resume normal execution.
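The recursive creation of the forest can be sketched at user level with fork alone. The sketch below is illustrative only: the table format is hypothetical, restore_process_state stands in for the handoff to the kernel-level restart code (which would restore the process's saved state and synchronize on a barrier), and error handling is omitted.

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    struct restart_entry {
        pid_t vpid;          /* virtual PID to reclaim on restart */
        int   first_child;   /* table index of first child, or -1 */
        int   next_sibling;  /* table index of next sibling, or -1 */
    };

    /* Stub: in DejaView this enters the kernel module, restores the
     * process's saved state, and quiesces until all are restored. */
    static void restore_process_state(pid_t vpid)
    {
        printf("real pid %d restored as virtual pid %d\n",
               (int)getpid(), (int)vpid);
        _exit(0);
    }

    /* Fork each child, which recursively creates its own descendants,
     * then restore this process's own state (does not return). */
    static void make_forest(const struct restart_entry *t, int self)
    {
        for (int c = t[self].first_child; c != -1; c = t[c].next_sibling)
            if (fork() == 0)
                make_forest(t, c);    /* child: never returns */
        restore_process_state(t[self].vpid);
    }

    int main(void)
    {
        /* A three-process forest: 100 -> {101, 102} */
        struct restart_entry table[] = {
            { 100,  1, -1 },   /* root */
            { 101, -1,  2 },   /* first child of 100 */
            { 102, -1, -1 },   /* second child of 100 */
        };
        if (fork() == 0)       /* the supervisor forks the forest root */
            make_forest(table, 0);
        while (wait(NULL) > 0) /* reap the root */
            ;
        return 0;
    }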

Data streaming: The steps in the checkpoint-restart algorithm are ordered and designed for streaming, to support their execution using a sequential access device. Process state is saved during checkpoint in the order in which it needs to be used during restart. For example, the checkpoint can be directly streamed from one machine to another across the network and then restarted. Using a streaming model provides the ability to pass checkpoint data through filters, resulting in an extremely flexible and extensible architecture. Example filters include encryption, signature/validation, compression, and conversion between operating system versions.
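Because the checkpoint is an ordered sequential stream, filters compose with ordinary pipes. The sketch below is illustrative rather than DejaView's actual tooling: write_checkpoint_stream is a hypothetical writer that emits process state in restart order, and the stream is compressed on the fly by piping it through gzip.

    #include <stdio.h>

    int main(void)
    {
        /* The filter could equally be an encryption or conversion
         * program, or the stream could feed a network socket. */
        FILE *out = popen("gzip -c > checkpoint.img.gz", "w");
        if (!out)
            return 1;
        /* write_checkpoint_stream(out); -- hypothetical writer that
         * emits state in the order the restart will consume it */
        fputs("checkpoint data placeholder\n", out);
        return pclose(out);
    }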

6.2 Record

DejaView records the desktop session by frequently checkpointing all of the processes associated with the session, which consist of all processes encapsulated in the virtual execution environment. DejaView’s checkpoints are time stamped, enabling a user to select a point in time from the display record to revive the corresponding checkpoint. DejaView checkpointing must satisfy two key requirements. First, it must provide a coordinated and consistent checkpoint of the execution environment and the many processes and threads that constitute a desktop environment and its applications; this is quite different from just checkpointing a single process. Second, it must have minimal impact on interactive desktop performance. To address these requirements, DejaView takes a globally consistent checkpoint [29] across all processes in the user’s desktop session while all processes are stopped so that nothing can change, but then minimizes the type and cost of operations that need to occur while everything is stopped.

6.2.1 Consistent Checkpoints

DejaView runs a privileged process outside of the user's virtual execution environment to perform a consistent checkpoint of the session as follows. First, the session is quiesced and all its processes are forced into a stopped state, to ensure that the saved state is globally consistent across all processes in the session. Second, the execution state of the virtual execution environment and all processes is saved. Third, a file system snapshot is taken to provide a version of the file system consistent with the checkpointed process state. Fourth, the session is resumed.

Using a separate process makes it easier to provide a globally consistent checkpoint across multiple processes in a user's session by simply quiescing all processes, then taking the checkpoint; this avoids the complexity of having to synchronize the checkpoint execution of multiple processes, should they checkpoint themselves independently. Furthermore, if a process cannot run, for example if it is stopped waiting for the completion of the vfork system call, it cannot perform its own checkpoint. This design allows checkpointing at any time, even when not all processes are runnable. More specifically, a checkpoint is performed in the following steps:

1. Quiesce pod: To ensure that a globally consistent checkpoint is taken of all the processes in the pod, the processes are quiesced. This forces the processes to transfer from their current state to a controlled standby state to freeze them for the duration of the checkpoint.

2. Record pod properties: Record pod configuration information, in particular file system configuration information indicating where directories private to the pod are stored on the underlying shared storage infrastructure.

3. Dump process forest: Record process inheritance relationships, including parent-child, sibling, and process session relationships.

4. Record globally shared resources: Record state associated with shared resources not tied to any specific process, such as System V IPC state, pod’s network address, hostname, system time and so forth.

5. Record process associated state: Record state associated with individual processes and shared state attached to processes, including process run state, program name, scheduling parameters, credentials, blocking and pending signals, CPU registers, FPU state, ptrace state, file system namespace, open files, signal handling information, and virtual memory.

6. Snapshot file system: Record a snapshot of the file system state to ensure a consistent file system view corresponding to the checkpoint.

7. Continue pod: Resume the processes in the pod once the checkpoint state has been recorded to allow the processes to continue executing, or terminate the processes and the pod if, for instance, the checkpoint is being used to migrate the pod to another machine.

8. Commit data: Write out buffered recorded data (if any) to storage (or to the network) and optionally force flush of the data to disk.

DejaView needs to capture a snapshot of the file system state at every checkpoint, since the process execution state depends on the file system state. For example, if a process in a user's session is using the file /tmp/foo and is checkpointed at time T, it would be impossible to revive the user's session from time T if the file was later deleted and could not be restored to its state at time T. Furthermore, DejaView needs to be able to save the file system state quickly, without interrupting the user's interaction with the system.

Approaches such as rsync [166], LVM [105], or logging file system related system calls could be considered for saving the file system state, but these have various performance or functionality limitations. DejaView takes a simpler and more efficient approach by leveraging file systems that provide native snapshot functionality, in which operations never overwrite the state of an existing snapshot. Specifically, DejaView uses a log structured file system [91], in which all file system modifications append data to the disk, be it metadata updates, directory changes, or syncing of data blocks. Thus, every modifying transaction results in a file system snapshot point.

checkpoint image’s meta data and the file system’s logs. To restore the file system, DejaView simply selects the snapshot identified by the counter found in the checkpoint image and creates an independent writable view of that snapshot.

6.2.2 Optimize for Interactive Performance

DejaView checkpoints interactive processes without impacting the user's perception of the system by minimizing downtime due to processes being stopped in two ways. First, it shifts expensive I/O operations outside of the window of time when processes are stopped, so that they can be done without blocking user interactivity. Second, it employs various optimizations to minimize the cost of operations that do occur while processes are stopped.

DejaView employs three optimizations to shift the latency of expensive I/O operations before and after the window of time when processes are stopped. First, DejaView performs a file system synchronization before the session is quiesced. While file system activity can occur between this pre-snapshot and the actual file system snapshot, it greatly reduces, and many times eliminates, the amount of data that must be written while the processes are unresponsive.

Second, before DejaView quiesces the session by sending all the processes a stop signal, DejaView attempts to ensure that the processes are able to handle the signal promptly, which we call pre-quiescing. If a process is blocked in an uninterruptible state, such as when performing disk I/O, it will not handle the signal until the blocking operation is complete. Meanwhile, the rest of the processes will have already been stopped, which may be noticeable to the user. Therefore, DejaView waits to quiesce the session until either all the processes are ready to receive signals or a configurable time has elapsed.

Third, since disk throughput is limited, DejaView defers writing the persistent checkpoint image to disk until after the session has been resumed. Instead, the checkpoint is first held in memory buffers that DejaView preallocates. DejaView estimates the size of the buffer based on the average amount of buffer space actually used for recent checkpoints. Furthermore, to avoid the cost of swapping, the memory pages of the designated buffer are pinned to main memory.

DejaView employs three optimizations to reduce downtime while processes are stopped. First, to reduce downtime due to copying memory blocks, as well as the amount of memory required for the checkpoint, DejaView leverages copy-on-write (COW) techniques to enable it to defer the memory copy until after the session has resumed. Instead of creating an explicit copy of the memory while the session is quiesced, DejaView marks the memory pages as COW. Since each memory page is automatically copied when it is modified, DejaView is able to get a consistent checkpoint image, even after the session has been resumed.

Second, to avoid the overhead of saving the contents of unlinked files that are still in use, DejaView relinks such files within the same file system before the file system snapshot is performed. Since deleted files are removed from their parent directory, their contents are not readily accessible on revive for DejaView to open. However, their contents remain intact for as long as the files remain in use. Relinking ensures that these contents remain accessible without explicitly saving them to the checkpoint image. To avoid namespace conflicts, the files are relinked within a special directory that is not normally accessible within the virtual execution environment. When the session is revived, DejaView temporarily enables the files to be accessible within the user's session, opens the files, and immediately unlinks them, restoring the state to what it was at the time of the checkpoint.

Third, since the memory state of the processes dominates the checkpoint image, DejaView provides an incremental checkpoint [56, 129] mechanism that reduces the amount of memory saved by only storing the parts of memory that have been modified since the last checkpoint. This optimization reduces processing overhead, since fewer pages need to be scanned and saved to memory, and reduces storage requirements, since fewer pages need to be written to disk. For DejaView to operate transparently and efficiently, we leverage standard memory protection mechanisms available on all modern operating systems.

The basic mechanism used by DejaView is to mark saved regions of memory as read-only and then intercept and process the signals generated when those regions are modified. During a full checkpoint, all the process's writable memory regions are made read-only. DejaView marks these regions with a special flag to distinguish them from regular read-only regions. After the process is resumed, any attempt to modify such a region will cause a signal to be sent to the process. DejaView intercepts this signal and inspects the region's properties. If the region is read-only and marked with the special flag, then DejaView removes the flag, makes the region writable again, and resumes the process without delivering the signal. If the flag is not present, the signal is allowed to proceed down the normal handling path. During the next incremental checkpoint, only the subset of memory regions that have been modified is saved.

DejaView is careful to handle exceptions that occur when writing a marked region during system call execution, to ensure that the system call does not fail in this case. This case cannot be handled by user-level checkpointing techniques [129, 140, 164] since, instead of a signal being passed to the process, an error is returned to the caller of the system call function. However, using user-level interposition to monitor such system calls is non-atomic and thus subject to race conditions [61].
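The write-protect-and-intercept technique can be approximated at user level with mprotect and a SIGSEGV handler, as in the sketch below. This is only an analogue of DejaView's kernel-level mechanism (which intercepts the fault before any signal is delivered and tracks per-region flags); the region size and variable names are illustrative.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/mman.h>

    #define REGION_SIZE (16 * 4096)

    static char *region;        /* the memory being tracked */
    static volatile int dirty;  /* set when the region is written */

    static void segv_handler(int sig, siginfo_t *info, void *ctx)
    {
        char *addr = (char *)info->si_addr;
        if (addr >= region && addr < region + REGION_SIZE) {
            /* Our protected region: unprotect, mark dirty, retry. */
            mprotect(region, REGION_SIZE, PROT_READ | PROT_WRITE);
            dirty = 1;
            return;
        }
        signal(SIGSEGV, SIG_DFL);   /* not ours: a real fault */
        raise(SIGSEGV);
    }

    int main(void)
    {
        struct sigaction sa = { 0 };
        sa.sa_sigaction = segv_handler;
        sa.sa_flags = SA_SIGINFO;
        sigaction(SIGSEGV, &sa, NULL);

        region = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        /* "Full checkpoint": save the region, then write-protect it. */
        mprotect(region, REGION_SIZE, PROT_READ);
        dirty = 0;

        region[100] = 42;   /* faults once; the handler marks it dirty */

        /* "Incremental checkpoint": save only if modified since. */
        printf("region dirty: %d\n", dirty);
        return 0;
    }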

DejaView's incremental checkpoint implementation does not restrict applications from independently invoking system calls that affect the memory protection and memory layout of a process (such as mprotect, mmap, munmap, and mremap). DejaView intercepts those calls to account for the changes they impose on the layout. For example, if the application unmaps or remaps a region, that region is removed or adjusted in the incremental state. Likewise, if it changes the protection of a region from read-write to read-only, then that region is unmarked to ensure that future exceptions will be propagated to the application.

By using the standard API for memory protection, DejaView's incremental checkpoint mechanism is relatively straightforward and simple to implement. An alternative approach would be to interpose on the operating system's page-fault handler, but providing a similar mechanism at the page level is extremely involved and requires intimate knowledge of the operating system internals, which can be subject to frequent changes as newer releases become available. It would also incur significant performance overhead, since it places additional conditional code in a common critical path of the entire memory subsystem.

Checkpoints are incremental by default to conserve space and reduce overhead, but full checkpoints are taken periodically when the system is otherwise idle. This is for redundancy and to reduce the number of incremental checkpoint files needed to revive a session. For example, if full checkpoints are on average ten times larger than incremental checkpoints, a full checkpoint every thousand incremental ones only incurs an additional 1% storage overhead.

6.2.3 Checkpoint Policy

DejaView needs to record enough state to enable a user to revive any session that can be accessed from the display record, while maintaining low overhead. Given the bursty nature of desktops, where user input may trigger a barrage of changes followed by long idle times, the naive approach of taking checkpoints at regular intervals is suboptimal. It would miss important updates that occurred in the interval, while wastefully recording during periods of inactivity.

Instead, DejaView checkpoints in response to actual display updates. Since checkpointing is more expensive than recording the display, DejaView minimizes overhead in two ways. First, DejaView reduces runtime overhead by limiting the checkpoint rate to at most once per second by default. The rate can be limited because display activity consists of many individual display updates, but the user only notices their aggregate effect.

Second, to reduce storage requirements, DejaView uses a default checkpoint policy that employs three optimizations. First, it disables checkpointing in the absence of user input when certain applications are active in full screen mode. For instance, DejaView skips checkpoints when the screensaver is active or when video is played in full screen mode, since such checkpoints are either unlikely to be of interest or do not add any useful information beyond the display record. Second, even if the display is modified, DejaView skips checkpoints if display activity remains below a user-defined threshold, for example, if only a small portion of the display is affected (by default, at most 5% of the screen). This is useful to disregard trivial display updates that are not of much interest to the user, such as the blinking cursor, mouse movements, or clock updates. Third, even when display activity may be low, DejaView still enables checkpoints in the presence of keyboard input (for example, during text editing), to allow users to return to points in time at which they generated their data. In this case, the policy reduces the rate to once every ten seconds, to match the expected amount of data generated by the user, which is limited by her typing speed. For an average person who types 40 words per minute [125], this checkpoint rate translates to a checkpoint roughly every 7 words. This is more than sufficient to capture most document word processing of interest.

Note that the checkpoint policy is flexible in that the user may tune any of the parameters. The policy is also extensible and can include additional rules. For example, a user may add a control that would disable checkpoints when the load of the computer rises above a certain level. Alternatively, a user may reduce the checkpoint frequency when display activity decreases but user input is present.
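The default policy can be summarized as a small decision function, as in the sketch below. The thresholds mirror the defaults described above; the structure and names are illustrative, not DejaView's actual configuration interface.

    #include <stdbool.h>

    struct policy_input {
        double time_since_ckpt;   /* seconds since the last checkpoint */
        bool   fullscreen_idle;   /* screensaver or full-screen video,
                                     with no user input */
        double display_fraction;  /* fraction of the screen updated */
        bool   keyboard_input;    /* keystrokes since last checkpoint */
    };

    bool should_checkpoint(const struct policy_input *in)
    {
        /* Skip screensaver or full-screen video without user input. */
        if (in->fullscreen_idle)
            return false;
        /* Keyboard input: checkpoint at most once every 10 seconds. */
        if (in->keyboard_input)
            return in->time_since_ckpt >= 10.0;
        /* Ignore trivial display updates (under 5% of the screen). */
        if (in->display_fraction < 0.05)
            return false;
        /* Otherwise limit the rate to at most once per second. */
        return in->time_since_ckpt >= 1.0;
    }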

6.3 Revive

DejaView allows a user to browse and search the display record and revive the desktop session that corresponds to a given point in time. To revive a specific point in time, DejaView searches for the last checkpoint that occurred before that point. Since the desktop session is revived at a slightly earlier time than the selected display record, it is possible that some differences exist between the live display of the revived desktop and the static display record. While one could log all events during execution to support deterministic replay of the desktop, DejaView does not do this because of the extra complexity and overhead. More importantly, such visual differences are not noticeable by the user, since the checkpoint rate can reach once per second if necessary. When the session is revived, if the display was changing slowly, there will be minimal visual differences. On the other hand, if the display was changing rapidly, the display will continue to change, making any initial differences inconsequential.

Reviving a checkpointed desktop session consists of the following steps. First, a new virtual execution environment is created. Second, the file system state is restored as described below. Third, a forest of processes is created to match the set of processes in the user's session when it was checkpointed, and the processes then execute to restore their state from the checkpoint image. This state includes process run state, program name, scheduling parameters, credentials, pending and blocked signals, CPU registers, FPU state, ptrace information, file system namespace, list of open files, signal handling information, and virtual memory. Once all processes have been restored, they are resumed and allowed to continue execution. DejaView then signals the viewer application to create a new connection to the revived session, which is displayed in a new window in the viewer. More specifically, complementary to the checkpoint, a restart is performed in the following steps:

1. Restore pod properties: Create a new pod, read pod properties from the checkpoint image, and configure the pod, including restoring its file system configuration.

2. Restore file system state: Provide a read-write file system view that corresponds to the snapshot taken at the time of the checkpoint.

3. Restore process forest: Read process forest information, create processes at the roots of the forest, then have root processes recursively create their children.

4. Restore globally shared resources: Create globally shared resources, not tied to any specific process, including creating the necessary virtual identifiers for those resources.

5. Restore process associated state: Each created process in the forest restores its own state then quiesces itself until all other processes have been restored.

6. Continue: Once all processes in the pod are restored, resume them so they can continue execution.

For incremental checkpoints, reviving the user's session requires accessing a set of incremental checkpoint files instead of a single checkpoint image file. To revive the session, DejaView starts by reading in data from the current (time selected) checkpoint image. Each incremental checkpoint contains all the state related to the process forest and shared resources, but only a partial view of the memory state of processes. When the restoration process encounters a memory region that is contained in another file, as marked by its list of saved memory regions, it opens the appropriate file and retrieves the necessary pages to fill in that portion of memory. The restoration then continues reading from the current checkpoint image, reiterating this sequence as necessary, until the complete state of the desktop session has been reinstated.
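The chained restore can be sketched as follows; the on-disk region descriptor is hypothetical and error handling is minimal. A region either carries its pages inline in the current image or names an earlier incremental image that holds them.

    #include <stdint.h>
    #include <stdio.h>

    struct region_desc {
        uint64_t addr;         /* start address of the region */
        uint64_t len;          /* length in bytes */
        char     backing[64];  /* "" = inline; else earlier image file */
        uint64_t offset;       /* offset of the pages in that file */
    };

    /* Fill one region, following the reference to an earlier
     * incremental image when the pages are not inline. */
    int restore_region(FILE *current, const struct region_desc *r,
                       void *dest)
    {
        FILE *src = current;
        if (r->backing[0] != '\0') {
            src = fopen(r->backing, "r");
            if (!src)
                return -1;
            fseek(src, (long)r->offset, SEEK_SET);
        }
        size_t got = fread(dest, 1, r->len, src);
        if (src != current)
            fclose(src);
        return got == r->len ? 0 : -1;
    }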

6.3.1 File System Restore

Standard snapshotting file systems only provide read-only snapshots, which may be useful for backup purposes but are ill-suited for supporting a revived session that requires read-write semantics for its normal operation. To provide a read-write file system view, DejaView leverages unioning file systems [183] to join the read-only snapshot with a writable file system by stacking the latter on top of the former. This creates a unioned view of the two: file system objects, namely files and directories, from the writable layer are always visible, while objects from the read-only layer are only visible if no corresponding object exists in the other layer.

While operations on objects from the writable layer are handled directly, operations on objects from the read-only layer are handled according to their specific nature. If the operation does not modify the object, it is passed to the read-only layer. Otherwise, DejaView first creates a copy of the object in the writable layer, then handles the operation there. While copying an entire file can degrade file system performance when done often with large files, desktop applications typically do not modify large files; more commonly, they overwrite files completely, which obviates the need to copy the file between the layers.

For example, if both layers contain a file bar, only the topmost layer's version of the file is visible. To provide consistent semantics, if a file is deleted, a "white out" mark is also created on the topmost layer to ensure that files existing on a lower layer are not revealed. To continue the example, if the file bar were deleted, the white out would prevent the bar on the lower layer from being revealed.

DejaView's combination of unioning and file system snapshots provides a branchable file system that enables DejaView to create multiple revived sessions from a single checkpoint. Since each revived session is encapsulated in its own virtual execution environment and has its own writable file system layer, multiple revived sessions can execute concurrently. This enables the user to start with the same information, but to process it in separate revived sessions in different directions. Furthermore, by using the same log structured file system for the writable layer, the revived session retains DejaView's ability to continuously checkpoint session state and later revive it.
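The layered lookup rule, in which the writable layer wins, whiteouts hide deleted files, and everything else falls through to the snapshot, can be sketched as below. The layer paths and the ".wh." whiteout naming convention are illustrative assumptions, not DejaView's on-disk format.

    #include <stdio.h>
    #include <unistd.h>

    #define TOP "/revived/rw"    /* writable layer */
    #define BOT "/revived/snap"  /* read-only snapshot layer */

    /* Resolve name to a full path in the correct layer; returns NULL
     * if the file does not exist or has been whited out. */
    const char *union_lookup(const char *name, char *buf, size_t len)
    {
        snprintf(buf, len, TOP "/%s", name);
        if (access(buf, F_OK) == 0)
            return buf;                  /* top (writable) layer wins */
        snprintf(buf, len, TOP "/.wh.%s", name);
        if (access(buf, F_OK) == 0)
            return NULL;                 /* whiteout: file was deleted */
        snprintf(buf, len, BOT "/%s", name);
        if (access(buf, F_OK) == 0)
            return buf;                  /* fall through to snapshot */
        return NULL;
    }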

6.3.2 Network Connectivity

Analogous to resuming a hibernated laptop, the user does not expect external network connections to remain valid after DejaView revives a session, since the state of the peers cannot be guaranteed. Thus, when reviving a session, DejaView drops all external connections of stateful protocols, such as TCP, by resetting the state of their respective sockets; internal connections that are fully contained within the user's session, e.g., to localhost, remain intact. To the application, this appears as a dropped network connection or a disconnect initiated by the peer, both of which are scenarios that applications can handle gracefully. For instance, a web browser that had an open TCP connection to some web server would detect that the connection was dropped and attempt to initiate a new connection. The browser will be able to load new pages as the user clicks on hyperlinks, in a manner that is transparent to the user. Similarly, a revived secure shell (ssh) will detect the loss of connection and report an error to the user. On the other hand, sockets that correspond to stateless protocols, such as UDP, are always restored precisely, since the underlying operating system does not maintain any protocol-specific state that makes assumptions about, or requires the cooperation of, a remote peer.

By default, network access is initially disabled in a revived session to prevent applications from automatically reconnecting to the network and unexpectedly losing data as a result of synchronizing their state with outside servers. For example, a user who revived a desktop session to read or respond to an old email that had been deleted on the outside mail server would not want her email client to synchronize with that server and lose the old email. However, the user can re-enable network access at any time, either for the entire session or on a per-application basis. Alternatively, the user can configure a policy that describes the desired network access behavior per application, or select a preset one. For new applications that the user launches within the revived session, network access is enabled by default.

6.4 Quiescing Processes

Quiescing a pod is the first step of the checkpoint, and is also the last step of the restart, as a means to synchronize all the restarting processes and ensure they are all completely restored before they resume execution. Quiescing processes at checkpoint time prevents them from modifying system state, and thus prevents inconsistencies from occurring during the checkpoint. Quiescing also puts processes in a known state from which they can easily be restarted. Without quiescing, checkpoints would have to capture potentially arbitrary restart points deep in the kernel, wherever a process might block.

Processes are quiesced by sending them a SIGSTOP signal to force them into the stopped state. A process is normally either running in userspace or executing a system call in the kernel, in which case it may be blocked. Unless we allow intrusive changes to the kernel code, signaling a process is the only method to force a process from userspace into the kernel or to interrupt a blocking system call. The SIGSTOP signal is guaranteed to be delivered and not ignored or blocked by the process. Using signals simplifies quiescing as signal delivery already handles potential race conditions, particularly in the presence of threads.
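A minimal user-level sketch of this quiesce step is shown below: it sends SIGSTOP to every process in a (hypothetical) pid list and polls /proc until each one reports the stopped state. DejaView performs the equivalent through its virtualization layer rather than over a static list, and would back off between polls rather than spin.

    #include <signal.h>
    #include <stdio.h>
    #include <sys/types.h>

    /* Read the state letter from /proc/<pid>/stat ("T" = stopped).
     * Parsing assumes the command name contains no spaces. */
    static int is_stopped(pid_t pid)
    {
        char path[64], state = 0;
        snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
        FILE *f = fopen(path, "r");
        if (!f)
            return 1;               /* process gone: treat as quiesced */
        fscanf(f, "%*d %*s %c", &state);
        fclose(f);
        return state == 'T';
    }

    void quiesce(const pid_t *pids, int n)
    {
        for (int i = 0; i < n; i++)
            kill(pids[i], SIGSTOP); /* cannot be caught, blocked, or
                                       ignored by the target */
        int pending;
        do {                        /* poll until every process stops */
            pending = 0;
            for (int i = 0; i < n; i++)
                if (!is_stopped(pids[i]))
                    pending = 1;
        } while (pending);
    }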

Using SIGSTOP to force processes into the stopped state has additional benefits for processes that are running or blocked in the kernel, which will handle the SIGSTOP immediately prior to returning to user mode. If a process is in a non-interruptible system call or handling an interrupt or trap, it will be quiesced after the kernel processing of the respective event. The processing time for these events is generally small. If a process is in an interruptible system call, it will immediately return and handle the signal. The effect of the signal is transparent, as the system call will in most cases be automatically restarted, or in some cases return a suitable error code that the caller should be prepared to handle. The scheme is elegant in that it builds nicely on the existing semantics of Unix/Linux, and ideal in that it forces processes to a state with only a trivial kernel stack to save and restore on checkpoint-restart.

In quiescing a pod, we must be careful to also handle potential side effects [159] that can occur when a signal is sent to a process. For example, the parent of a process is always notified by a signal when either SIGSTOP or SIGCONT is handled by the process, and a process that is traced always notifies the tracer process about every signal received. While these signals can normally occur on a Unix system, they may have undesirable side effects in the context of checkpoint-restart. For example, a debugger will make such signals visible to the user if the debugging session is quiesced. We address this issue by ensuring that the virtualization layer masks out signals that are generated as a side effect of the quiesce and restore operations.

The use of SIGSTOP to quiesce processes is sufficiently generic to handle every execution scenario, with the exception of three cases in which a process may already be in a state similar to the stopped state. First, a process that is already stopped does not need to be quiesced, but instead needs to be marked, so that the restart correctly leaves it in the stopped state instead of sending it a SIGCONT to resume execution. Second, a process executing the sigsuspend system call is put in a deliberate suspended state until it receives a signal from a given set of signals. If a process is blocked in that system call and then checkpointed, it must be accounted for on restart by having the restarting process call sigsuspend as the last step of the restart, instead of stopping itself. Otherwise, it would resume to user mode without really having received a valid signal.

Third, a process that is traced via the ptrace mechanism [111] will be stopped for tracing at any location where a trace event may be generated, such as entry and exit of system calls, receipt of signals, and events like fork, vfork, and exec. Each such trace point generates a notification to the controlling process. The ptrace mechanism raises two issues. First, a SIGSTOP that is sent to quiesce a pod will itself produce a trace event for traced processes, which, while possible in Unix, is undesirable from a look-and-feel point of view (imagine your debugger reporting spurious signals received by the program). This is solved by making traced processes exempt from quiesce (as they already are stopped) and from continue (as they should remain stopped).

Second, like sigsuspend, the system must record at which point the process was traced, and use this data upon restart. The action to be taken at restart varies with the specific trace event. For instance, for system call entry, the restart code will not stop the process, but will instead cause it to enter a ptrace-like state in which it will block until told to continue. Only then will it invoke the system call directly, thus avoiding an improper trigger of the system call entry event.

6.5 Process Forest

To checkpoint multiple cooperating processes, it is crucial to capture a globally consistent state across all processes and preserve process dependencies. Process dependencies consist of the process hierarchy, such as parent-child relationships; identifiers that define process relationships, such as Unix process group and session identifiers (PGIDs and SIDs, respectively) [159]; and shared resources, such as common file descriptors. The first two are particularly important for interactive applications and other activities that involve job control. All of these dependencies must be checkpointed and restarted correctly. The term process forest encapsulates these three components: hierarchy, relationships, and resource sharing. On restart, the restored process forest must satisfy all of the constraints imposed by process dependencies. Otherwise, applications may not work correctly. For instance, incorrect settings of SIDs will cause incorrect handling of signals related to terminals (including xterm), and will confuse job control, since PGIDs will not be restored correctly either.

A useful property of our checkpoint-restart algorithm is that the restart phase can recreate the process forest using standard system calls, simplifying the restart process. However, system calls do not allow process relationships and identifiers to be changed arbitrarily after a process has already been created. A key observation in devising suitable algorithms for saving and restoring the process forest is determining what subset of dependencies require a priori resolution, then leaving the others to be set up retroactively.

There are two primary process relationships that must be established as part of process creation to correctly construct a process forest. The key challenge is preserving Unix session relationships. (The term session here is unrelated to DejaView's desktop session; Unix sessions group together process groups, primarily to implement login sessions in textual user interfaces. We refer to this meaning in the remainder of this section.) Sessions must be inherited by correctly ordering process creation, because the operating system interface only allows a process to change its own session, to change it just once, and to change it to a new session and become the leader. The second issue is preserving thread group relationships, which arises in Linux because of its threading model, which treats threads as special processes; this issue does not arise in operating system implementations that do not treat threads as processes. Hereinafter we assume the threading model of Linux 2.6, in which threads are grouped into thread groups with a single thread group leader, which is always the first thread in the group. A thread must be created by its thread group leader, because the operating system provides no other way to set the thread group. Given the correct handling of session identifiers and thread groups, other relationships and shared resources can be manipulated after process creation using the operating system interface, and are hence simply assigned once all processes have been created.

Since these two process relationships must be established at process creation time to correctly construct a process forest, the order in which processes are created is crucial. Simply reusing the parent-child relationships maintained by the kernel to create a matching process forest is not sufficient, since the forest depends on more than the process hierarchy at the time of checkpoint. For example, it is important to know the original parent of a process to ensure that it inherits its correct SID; however, since orphaned children are promptly re-parented to init, the information about their original parent is lost. While one could log all process creations and deletions to later determine the original parent, this adds unnecessary runtime overhead and complexity.
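The session-ordering constraint can be seen concretely with a few system calls. In the demonstration below, a child forked before setsid keeps the old SID, while one forked after receives the new one; the outer fork exists only because setsid fails for a process that is already a process group leader.

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        if (fork() != 0) {      /* outer fork: setsid would fail for a
                                   process group leader, so the demo
                                   runs in a child instead */
            wait(NULL);
            return 0;
        }
        printf("old sid: %d\n", (int)getsid(0));
        if (fork() == 0) {      /* forked BEFORE setsid */
            sleep(1);
            printf("early child sid: %d (old session)\n",
                   (int)getsid(0));
            _exit(0);
        }
        setsid();               /* become leader of a new session */
        if (fork() == 0) {      /* forked AFTER setsid */
            printf("late child sid: %d (new session)\n",
                   (int)getsid(0));
            _exit(0);
        }
        wait(NULL);
        wait(NULL);
        return 0;
    }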

We introduce two algorithms, DumpForest and MakeForest, that use existing operating system interfaces to efficiently save and restore the process forest, respectively. The algorithms restore, at restart, a process forest that is identical to the original process forest at checkpoint. However, they do not require any state other than what is available at checkpoint time, because they do not necessarily recreate the matching process forest in the same way as the original forest was created.

6.5.1 DumpForest Algorithm

The DumpForest algorithm captures the state of the process forest in two passes. It runs in time linear in the number of process identifiers in use in a pod. A process identifier is in use not only while a process exists, but also after the process has terminated, for as long as the identifier is still being used, for example as the SID of some session group. The first pass scans the list of process identifiers within the pod and fills in a table of entries; the table is not sorted. Each entry in the table represents a PID in the forest. The second pass records the process relationships by filling in information in each table entry. A primary goal of this pass is to determine the creating parent (creator) of each process, including which processes have init as their parent. At restart, those processes will be created first to serve as the roots of the forest, and will recursively create the remaining processes as instructed by the table.

Flag      Property of Table Entry
Dead      Corresponds to a terminated process
Session   Inherits ancestor SID before parent changes its own
Thread    A thread but not a thread group leader
Sibling   Created by sibling via parent inheritance

Table 6.1 – Possible flags in the status field

Each entry in the table consists of the following set of fields: status, PID, SID, thread group identifier, and three pointers to the table locations of the entry’s creator, next sibling, and first child processes to be used by MakeForest. Note that these processes may not necessarily correspond to the parent, next sibling, and first child processes of a process at the time of checkpoint. Table 6.1 lists the possible flags for the status field. In particular, Linux allows a process to be created by its sibling, thereby inheriting the same parent, which differs from traditional parent-child only fork creation semantics; a Sibling flag is necessary to note this case.
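Put together, a table entry might look like the following C sketch; the flag and field names follow the text, but the concrete layout is illustrative.

    #include <sys/types.h>

    enum entry_flags {
        DEAD    = 1 << 0,  /* corresponds to a terminated process */
        SESSION = 1 << 1,  /* inherits ancestor SID before parent
                              changes its own */
        THREAD  = 1 << 2,  /* a thread, not a thread group leader */
        SIBLING = 1 << 3,  /* created by sibling (parent inheritance) */
    };

    struct forest_entry {
        int   status;        /* combination of entry_flags */
        pid_t pid;           /* process identifier (pod-local) */
        pid_t sid;           /* session identifier */
        pid_t tgid;          /* thread group identifier */
        int   creator;       /* table index of the creating process */
        int   next_sibling;  /* table index of next sibling, or -1 */
        int   first_child;   /* table index of first child, or -1 */
    };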

6.5.1.1 Basic Algorithm

For simplicity, we first assume no parent inheritance in describing the DumpForest algorithm. We divide it into three procedures, namely DumpForest, FindCreator, and AddPlaceHolder, shown in Algorithms 6.1–6.3, respectively. The first pass of the algorithm initializes the PID and SID fields of each entry according to the process it represents, and all remaining fields to be empty. As shown in Algorithm 6.1, the second pass calls FindCreator on each table entry to populate the empty fields and alter the status field as necessary. The algorithm can be thought of as determining under what circumstances the current parent of a process at the time of checkpoint cannot be used to create the process at restart. The algorithm looks at each table entry and determines what to do based on the properties of the entry. If the entry is a thread and not the thread group leader, we

Algorithm 6.1: DumpForest

    Procedure DumpForest
    begin
        foreach process proc in the forest do
            Add new entry ent to table
            if proc is dead then
                ent.status |= DEAD
            end if
        end for
        foreach entry ent in the table do
            if ent.creator is NIL then
                FindCreator(ent)            /* see Algorithm 6.2 */
            end if
        end for
    end

mark its creator as the thread group leader and add Thread to its status field, so that it must be created as a thread on restart. The thread group leader can be handled as a regular process, and hence is treated as part of the other cases.

Otherwise, if the entry is a session leader, this is an entry that at some point called setsid. It does not need to inherit its session from anyone, so its creator can just be set to its parent. If a pod had only one session group, the session leader would be at the root of the forest and its parent would be init.

Otherwise, if the entry corresponds to a dead process (no current process exists with the given PID), the only constraint that must be satisfied is that it inherit the correct session group from its parent. Its creator is just set to be its session leader. The correct session group must be maintained for a process that has already terminated, because it may be necessary to have the process create other processes before terminating itself, to ensure that those other processes have their session groups set correctly.

Algorithm 6.2: FindCreator

    Procedure FindCreator(ent)
    begin
        leader ← session leader entry
        if ent is a dead process then
            parent ← init
            ent.status |= DEAD
        else
            parent ← parent process entry
        end if
        if ent is a thread (but not thread group leader) then
            ent.creator ← thread group leader
            ent.status |= THREAD
        else if ent == leader then
            ent.creator ← parent
        else if ent.status & DEAD then
            ent.creator ← leader
        else if parent == init then
            AddPlaceHolder(ent, leader)     /* see Algorithm 6.3 */
        else if ent.sid == parent.sid then
            ent.creator ← parent
        else
            ent.creator ← parent
            sid ← ent.sid
            repeat
                ent.status |= SESSION
                if ent.creator == init then
                    AddPlaceHolder(ent, leader)   /* see Algorithm 6.3 */
                end if
                ent ← ent.creator
                if ent.creator is NIL then
                    FindCreator(ent)
                end if
            until ent.sid == sid or ent.status & SESSION
        end if
    end

Note: when the creator field is set, the matching child and sibling fields are adjusted accordingly; details are omitted for brevity.

Algorithm 6.3: AddPlaceHolder

    Procedure AddPlaceHolder(ent, leader)
    begin
        Add new entry new to table
        new.creator ← leader
        new.status |= DEAD
        ent.creator ← new
    end

Note: when the creator field is set, the matching child and sibling fields are adjusted accordingly; details are omitted for brevity.

Otherwise, if the entry corresponds to an orphan process, it cannot inherit the correct session group from init. Therefore, we add a place-holder process in the table whose function on restart is to inherit the session group from the entry's session leader, create the process, then terminate so that the process will be orphaned. The place-holder is assigned an arbitrary PID that is not already in the table, and the SID identifying the session. To remember to terminate the place-holder process, the place-holder entry's status field is marked Dead.

Otherwise, if the entry's SID is equal to its parent's, the only constraint that must be satisfied is that it inherit the correct session group from its parent. This is simply done by setting its creator to be its parent.

If none of the previous cases apply, then the entry corresponds to a process which is not a session leader, does not have a session group the same as its parent, and therefore whose session group must be inherited from an ancestor further up the process forest. This case arises because the process was forked by its parent before the parent changed its own SID. Its creator is set to be its parent, but we also mark its status field Session, indicating that at restart the parent will need to fork before (potentially) creating a new session. When an entry is marked with Session, it is necessary to propagate this attribute up its ancestry hierarchy until an entry with that session group is located. In the worst case, this would proceed all the way to the session leader. This is required for the SID to correctly descend via inheritance to the current entry. Note that going up the tree does not increase the runtime complexity of the algorithm because traversal does not proceed beyond entries that already possess the Session attribute.

If the traversal fails to find an entry with the same SID, it will stop at an entry that corresponds to a leader of another session. This entry must have formerly been a descendant of the original session leader. Its creator will have already been set to init. Because we now know that it needs to pass the original SID to its own descendants, we re-parent the entry to become a descendant of the original session leader. This is done using a place-holder in a manner similar to how we handle orphans that are not session leaders.
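To make the data structure concrete, the following C sketch shows one plausible layout for the process-forest table entry these procedures manipulate; the field and flag names are illustrative assumptions, not the prototype's actual definitions.

```c
/* Hypothetical layout of a process-forest table entry; the names are
 * illustrative, not taken from the actual prototype. */
#include <sys/types.h>

#define FOREST_DEAD    0x01  /* terminated process or place-holder entry */
#define FOREST_THREAD  0x02  /* create as a thread of the group leader */
#define FOREST_SESSION 0x04  /* fork before the creator calls setsid */
#define FOREST_SIBLING 0x08  /* create with Linux parent inheritance */

struct forest_entry {
    pid_t pid;                      /* virtual PID of the process */
    pid_t sid;                      /* session identifier */
    unsigned int status;            /* bitwise OR of FOREST_* flags */
    struct forest_entry *creator;   /* entry that forks this one on restart */
    struct forest_entry *child;     /* first child in the creator tree */
    struct forest_entry *sibling;   /* next sibling in the creator tree */
};
```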

6.5.1.2 Examples

Figure 6.1 illustrates the output of the algorithm on a simple process forest. Figure 6.1a shows the process forest at checkpoint time. Figure 6.1b shows the table generated by DumpForest. The algorithm first creates a table of seven entries corresponding to the seven processes, then proceeds to determine the creator of each entry. Processes 502, 505, and 506 have their Session attributes set, since they must be forked off before their parents' session identifiers are changed. Note that process 505 received this flag by propagating it up from its child process 506.

Figure 6.1 – Simple process forest

Figure 6.2 illustrates the output of the algorithm on a process forest with a missing process, 501, which exited before the checkpoint. Figure 6.2a shows the process forest at checkpoint time. Figure 6.2b shows the table generated by DumpForest. While the algorithm starts with six entries in the table, the resulting table has nine entries, since three place-holder processes, 997, 998, and 999, were added to maintain proper process relationships. Observe that process 503 initially has its creator set to init, but is re-parented to the place-holder 998 as part of propagating its child's Session attribute up the tree.

6.5.1.3 Linux Parent Inheritance

We now discuss modifications to the basic DumpForest algorithm for handling the unique parent inheritance feature in Linux, which allows a process to create a sibling. To inherit session relationships correctly, parent inheritance must be accounted for in determining the creators of processes that are not session leaders. If its session leader is alive, we can determine that a process was created by its sibling if its parent is the creator of the session leader. If the session leader is dead, this check will not work, since its parent is now init and there is no longer any information about its original parent. After a process dies, there is no easy way to determine its original parent in the presence of parent inheritance.

Figure 6.2 – Process forest with deletions

To provide the necessary information, we instead record the session a process inherits when it is created, if and only if the process is created with parent inheritance and it is a sibling of the session group leader. This saved-SID is stored as part of the process's virtualization data structure so that it can be used later if the process remains alive when the forest needs to be saved. A process created with parent inheritance is a sibling of the session group leader if either its creator is the session group leader, or its creator has the same saved-SID recorded, since the sibling relationship is transitive.

To support parent inheritance, we modify Algorithm 6.2 by inserting a new conditional after the check for whether an entry's SID is equal to its parent's and before the final else clause in FindCreator. The conditional examines whether an entry's saved-SID has been set. If it has been set and there exists another entry in the process forest table whose PID is equal to this saved-SID, the entry's status field is marked Sibling so that it will be created with parent inheritance on restart. The entry's creator is set to the entry that owns that PID, which is the leader of the session identified by the saved-SID. Finally, the creator of this session leader is set to be the parent of the current process, possibly re-parenting the entry if its creator had already been set previously.
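A minimal sketch of this inserted conditional, building on the hypothetical forest_entry layout above extended with a saved_sid field; find_entry_by_pid() is an assumed table-lookup helper.

```c
/* Sketch of the parent-inheritance conditional added to FindCreator.
 * Returns 1 when the entry was handled as a sibling, 0 to fall through
 * to the final else clause. All names here are illustrative. */
static struct forest_entry *find_entry_by_pid(pid_t pid);

static int try_parent_inheritance(struct forest_entry *ent,
                                  struct forest_entry *parent)
{
    struct forest_entry *leader;

    if (ent->saved_sid == 0)
        return 0;                    /* not created with parent inheritance */

    leader = find_entry_by_pid(ent->saved_sid);
    if (leader == NULL)
        return 0;                    /* no such entry: use the default case */

    ent->status |= FOREST_SIBLING;   /* recreate with parent inheritance */
    ent->creator = leader;           /* forked off the saved session's leader */
    leader->creator = parent;        /* the leader is forked by ent's parent,
                                        possibly re-parenting it */
    return 1;
}
```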

6.5.2 MakeForest Algorithm

Given the process forest data structure, the MakeForest algorithm is straightforward, as shown in Algorithms 6.4–6.6. It reconstructs the process hierarchy and relationships by executing completely in user mode, using standard system calls, minimizing dependencies on any particular kernel internal implementation details. The algorithm runs in linear time with the number of entries in the forest. It works in a recursive manner by following the instructions set forth by the process forest data structure.

MakeForest begins with a single process that will be used in place of init to fork the processes that have init set as their creator. Each process then creates its own children. The bulk of the algorithm loops through the list of children of the current process three times, during which the children are forked or cleaned up. Each child that is forked executes the same algorithm recursively until all processes have been created.

Algorithm 6.4: MakeForest
Procedure MakeForest
begin
    foreach entry ent in the table do
        if ent.creator is init then
            ForkChildren(ent);           /* See Algorithm 6.5 */
        end if
    end for
end

Algorithm 6.5: ForkChildren
Procedure ForkChildren(ent)
begin
    foreach child cld of ent do
        if cld.status & SESSION then
            ForkChild(cld);              /* See Algorithm 6.6 */
        end if
    end for
    if ent.sid == ent.pid then
        setsid();
    end if
    foreach child cld of ent do
        if not (cld.status & SESSION) then
            ForkChild(cld);              /* See Algorithm 6.6 */
        end if
    end for
    foreach child cld of ent do
        if cld.status & DEAD then
            waitpid(cld.pid);
        end if
    end for
    if ent.status & DEAD then
        exit(ent.exit);
    else
        RestoreProcessState(ent);
    end if
end

Algorithm 6.6: ForkChild
Procedure ForkChild(cld)
begin
    if cld.status & THREAD then
        pid ← fork_thread();
    else if cld.status & SIBLING then
        pid ← fork_sibling();
    else
        pid ← fork();
    end if
    if pid is 0 then
        ForkChildren(cld);               /* See Algorithm 6.5 */
    end if
end

In the first pass through the list of children, the current process spawns children that are marked Session and thereby need to be forked before the current session group is changed. The process then changes its session group if needed. In the second pass, the process forks the remainder of the children. In both passes, a child that is marked Thread is created as a thread and a child that is marked Sibling is created with parent inheritance. In the third pass, terminated processes and temporary place-holders are cleaned up. Finally, the process either terminates if it is marked Dead or calls RestoreProcessState(), which does not return. RestoreProcessState() restores the state of the process to the way it was at the time of checkpoint.
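The three creation modes of ForkChild map naturally onto the Linux clone(2) system call: CLONE_THREAD (with the CLONE_VM and CLONE_SIGHAND flags it requires) creates a thread, and CLONE_PARENT creates a sibling. The following user-level sketch shows one plausible encoding; child_main(), the stack handling, and the entry layout are assumptions for illustration, not the prototype's actual code.

```c
/* Illustrative user-level ForkChild using clone(2). With the
 * function-pointer style of clone(), the child runs child_main()
 * directly instead of testing for a zero return value as in the
 * fork-style pseudocode. */
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>

#define CHILD_STACK_SIZE (64 * 1024)

static int child_main(void *arg);   /* would run ForkChildren() recursively */

static pid_t fork_child(struct forest_entry *cld)
{
    char *stack = malloc(CHILD_STACK_SIZE);
    int flags = SIGCHLD;             /* plain fork-like child by default */

    if (cld->status & FOREST_THREAD)
        /* threads must share the VM and signal handlers of their group */
        flags = CLONE_THREAD | CLONE_SIGHAND | CLONE_VM;
    else if (cld->status & FOREST_SIBLING)
        /* CLONE_PARENT makes the new process a sibling of the caller */
        flags |= CLONE_PARENT;

    /* the stack grows down on most architectures, so pass its top */
    return clone(child_main, stack + CHILD_STACK_SIZE, flags, cld);
}
```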

6.6 Shared Resources

After the process hierarchy and relationships have been saved or restored, we process operating system resources that may be shared among multiple processes. They are either globally shared at the pod level, such as IPC identifiers and pseudo terminals, or locally shared among a subset of processes, such as virtual memory, file descriptors, signal handlers, and so forth. As discussed in § 6.1.1, globally shared resources are processed first, then locally shared resources are processed.

Shared resources may be referenced by more than one process, yet their state need only be saved once. We need to be able to uniquely identify each resource, and to do so in a manner independent of the operating system instance to be able to restart on another instance. Every shared resource is represented by a matching kernel object whose kernel address provides a unique identifier of that instance within the kernel. We represent each resource by a tuple of the form ⟨address, tag⟩, where address is the kernel address of the object, and tag is a serial number that reflects the order in which the resources were encountered (counting from 1). Since tags are mapped one-to-one to unique kernel addresses, they are unique logical identifiers for resources. The tuples allow the same resource representation to be used for both checkpoint and restart mechanisms, simplifying the overall implementation.

During checkpoint and restart, the tuples are stored in an associative memory in the kernel, enabling fast translation between physical and logical resource identifiers. Tuples are registered into the memory as new resources are discovered, and discarded once the entire checkpoint (or restart) is completed. This memory is used to decide whether a given resource (physical or logical, for checkpoint or restart respectively) is a new instance or merely a reference to one already registered. Both globally and locally shared resources are stored using the same associative memory.

During checkpoint, as the processes within the pod are scanned one by one, the resources associated with them are examined by looking up their kernel addresses in the associative memory. If the entry is not found (that is, a new resource is detected), we allocate a new tag, register the new tuple, and record the state of that resource. The tag is included as part of that state. If the entry is found, it means that the resource is shared and has already been dealt with earlier. Hence it suffices to record its tag for later reference. Note that the order of the scan is insignificant.

During restart, the algorithm restores the state of the processes and the resources they use. The data is read in the same order as it was originally written, ensuring that the first occurrence of each resource is accompanied by its actual recorded state. For each resource identifier, we examine whether the tag is already registered; if not, we create a new instance of the required resource, restore its state from the checkpoint data, and register an appropriate tuple, with the address field set to the kernel address that corresponds to the new instance. If a tuple with the specified tag is found, we locate the corresponding resource using its kernel address as taken from the tuple.
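The following user-level C sketch illustrates the checkpoint-side behavior of this associative memory, substituting a simple linear scan for whatever data structure the kernel implementation actually uses; all names are illustrative.

```c
/* Illustrative tuple registry mapping kernel object addresses to
 * logical tags; a linear scan stands in for the kernel's associative
 * memory, and overflow handling is omitted for brevity. */
#include <stddef.h>

struct obj_tuple {
    void *addr;           /* kernel address of the shared object */
    unsigned int tag;     /* logical identifier, counting from 1 */
};

static struct obj_tuple registry[4096];
static unsigned int nr_tuples;

/* Returns the resource's tag; sets *is_new when first encountered,
 * i.e., when the full state must be recorded rather than referenced. */
static unsigned int lookup_or_register(void *addr, int *is_new)
{
    unsigned int i;

    for (i = 0; i < nr_tuples; i++) {
        if (registry[i].addr == addr) {
            *is_new = 0;              /* shared: record only the tag */
            return registry[i].tag;
        }
    }
    registry[nr_tuples].addr = addr;  /* new resource: register a tuple */
    registry[nr_tuples].tag = nr_tuples + 1;
    *is_new = 1;
    return registry[nr_tuples++].tag;
}
```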

6.6.1 Nested Shared Objects

Nested sharing occurs in the kernel when a common resource is referenced by multiple distinct resources rather than by processes. One example is the objects that represent a FIFO in the file system: a FIFO is represented by a single inode, which is in turn pointed to by the file descriptors of the reader and writer ends. Another example is a single backing file that is mapped multiple times within distinct address spaces (or even within the same address space). In both examples, shared objects (file descriptors and address spaces, respectively) refer to a shared object, yet may themselves be shared by multiple processes.

Nested sharing is handled similarly to simple sharing. To ensure consistency we enforce an additional rule, namely that a nested object is always recorded prior to the objects that point to it. For instance, when saving the state of a file descriptor that points to a FIFO, we first record the state of the FIFO. This ensures that the tuples for the nested resource exist in time for the creation of the referring object.
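Reusing the hypothetical lookup_or_register() helper from the previous sketch, the ordering rule might look as follows; the state types and save routines are assumptions for illustration.

```c
/* Sketch of the nested-object ordering rule: the FIFO inode is saved
 * (and registered) before the descriptor that refers to it. */
struct inode_state { void *kaddr; /* ... saved inode fields ... */ };
struct file_state  { void *kaddr; /* ... saved descriptor fields ... */ };

static void save_inode_state(struct inode_state *fifo, unsigned int tag);
static void save_file_state(struct file_state *file, unsigned int fifo_tag);

static void save_fifo_descriptor(struct file_state *file,
                                 struct inode_state *fifo)
{
    int is_new;
    unsigned int tag = lookup_or_register(fifo->kaddr, &is_new);

    if (is_new)
        save_inode_state(fifo, tag);  /* nested object recorded first */
    save_file_state(file, tag);       /* referrer records only the tag */
}
```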

6.6.2 Compound Shared Objects

Many instances of nested objects involve a pair of coupled resources. For example, a single pipe is represented in the kernel by two distinct inodes that are coupled in a special form, and Unix domain sockets can embody up to three disparate inodes for the listening, accepting, and connecting sockets. We call such objects compound objects. Unlike unrelated resources, compound objects have two or more internal elements that are created and interlinked by the invocation of the appropriate kernel subroutine(s) such that their lifespans are correlated, e.g., the two inodes that constitute a pipe.

We consistently track a compound object by capturing the state of the entire resource, including all components at once, at the time it is first detected, regardless of the component through which it was referenced. On restart, the compound object will be encountered for the first time through some component, and will be reconstructed in its entirety, including all other components. Then only the triggering component (the one that was encountered) will need to be attached to the process that owns it. The remaining components will linger unattached until they are claimed by their respective owners at a later time.

The internal ordering of the elements that compose a compound object may depend on the type of the object. If the object is symmetric, such as a socketpair, its contents may be saved in an arbitrary order. Otherwise, the contents are saved in an order specifically designed to facilitate the reconstruction of the object during restart. For example, the order for pipes is the read side followed by the write side. The order for Unix domain sockets begins with the listening socket (if it exists), followed by the connecting socket and finally the accepting socket. This order reflects the sequence of actions required to rebuild such socket trios: first create a listening socket, then a socket that connects to it, and finally the third socket by accepting the connection.
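The following self-contained user-level sketch rebuilds such a socket trio in exactly this order using the standard socket interface; error handling and the restoration of queued data and socket options are omitted, and the rendezvous path is an illustrative assumption.

```c
/* Rebuild a Unix domain socket trio: listener, connector, acceptor. */
#include <sys/socket.h>
#include <sys/un.h>
#include <string.h>
#include <unistd.h>

static void rebuild_socket_trio(const char *path,
                                int *listenfd, int *connfd, int *acceptfd)
{
    struct sockaddr_un addr;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);                    /* clear any stale socket file */

    /* 1: the listening socket must exist before anyone can connect */
    *listenfd = socket(AF_UNIX, SOCK_STREAM, 0);
    bind(*listenfd, (struct sockaddr *)&addr, sizeof(addr));
    listen(*listenfd, 1);

    /* 2: the connecting socket */
    *connfd = socket(AF_UNIX, SOCK_STREAM, 0);
    connect(*connfd, (struct sockaddr *)&addr, sizeof(addr));

    /* 3: accepting the connection completes the trio */
    *acceptfd = accept(*listenfd, NULL, NULL);
}
```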

6.6.3 Memory Sharing

Since memory footprint is typically the dominant factor in determining the size of the checkpoint image, we now further discuss how recording shared resources is done in the case of memory. A memory region in the address space of a process can be classified along two dimensions: one is whether it is mapped to a backing file or anonymous, and the other is whether it is private to some address space or shared among multiple ones. For example, text segments such as program code and shared libraries are mapped and shared, IPC shared memory is anonymous and shared, the data section is mapped and private, and the heap and stack are anonymous and private. Memory sharing can occur in any of these four cases.

Handling regions that are shared is straightforward. If a region is mapped and shared, it does not need to be saved since its contents are already on the backing file, and any buffered data is written as part of the file system snapshot. If a region is anonymous and shared, it is treated as a normal shared object so that its contents are only saved once.

Handling regions that are private is more subtle. While it appears contradictory to have memory sharing with private memory regions, sharing occurs due to the kernel's COW optimization. When a process forks, the kernel defers the creation of a separate copy of the pages for the newly created process until one of the processes sharing the common memory attempts to modify it. During checkpoint, each page that has been previously modified and belongs to a private region that is marked COW is treated as a nested shared object so that its contents are only saved once. During restart, the COW sharing is restored. Modified pages in either anonymous and private regions or mapped and private regions are treated in this manner.
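The two-dimensional classification can be observed from user level by parsing /proc/&lt;pid&gt;/maps, where the fourth permission character distinguishes shared ('s') from private/COW ('p') regions and the pathname column indicates file backing. This is only a rough illustration, since the kernel implementation inspects the memory regions directly.

```c
/* Rough user-level classification of memory regions along the
 * mapped/anonymous and shared/private dimensions via /proc/self/maps. */
#include <stdio.h>

static void classify_regions(void)
{
    char line[512], perms[8], path[256];
    unsigned long start, end;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL)
        return;
    while (fgets(line, sizeof(line), f)) {
        path[0] = '\0';   /* anonymous regions have no pathname column */
        sscanf(line, "%lx-%lx %7s %*s %*s %*s %255s",
               &start, &end, perms, path);

        int shared = (perms[3] == 's');   /* 'p' means private (COW) */
        int mapped = (path[0] == '/');    /* heuristic: backed by a file */

        printf("%lx-%lx %s, %s\n", start, end,
               shared ? "shared" : "private",
               mapped ? "mapped" : "anonymous");
    }
    fclose(f);
}
```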

6.7 Evaluation

To demonstrate the effectiveness of our approach, we have implemented a checkpoint-restart prototype as a Linux kernel module and associated user-level tools and evaluated its performance on a wide range of real applications. We also quantitatively compared our prototype with two other commercial virtualization systems, OpenVZ and Xen. OpenVZ provides another operating system virtualization approach for comparison, while Xen provides a hardware virtualization approach for comparison. We used the latest versions available at the time of our experiments for both.

The measurements were conducted on an IBM HS20 eServer BladeCenter, each blade with dual 3.06 GHz Intel Xeon CPUs, 2.5 GB RAM, a 40 GB local disk, and Q-Logic Fibre Channel 2312 host bus adapters. The blades were interconnected with a Gigabit Ethernet switch and linked through Fibre Channel to an IBM FastT500 SAN controller with an Exp500 storage unit with ten 70 GB IBM Fibre Channel hard drives. Each blade used the GFS cluster file system [155] to access a shared SAN. Unless otherwise indicated, the blades were running the Debian 3.1 distribution and the Linux 2.6.11.12 kernel.

Table 6.2 lists the nine application scenarios used for our experiments. The scenarios were running an Apache web server, a kernel compile, a MySQL database server, a volano chat server, an entire operating system at user-level using UML (User Mode Linux [46]), and four desktop application scenarios run using a full Gnome X desktop environment with an XFree86 4.3.0.1 server and THINC [15] to provide remote display access to the desktop.

Name              Benchmark description
apache            Apache 2.0.55 with 50 threads (default configuration),
                  loaded using httperf 0.8 (rate=1500, num-calls=20)
make              Compilation (make -j 5) of the Linux kernel tree
mysql             MySQL 4.2.21 loaded with the standard sql-bench
volano            VolanoMark 2.5 with Blackdown Java 1.4.2
UML               User Mode Linux with 128 MB and Debian 3.0
gnome-base        Gnome 2.8 session with THINC server
gnome-firefox     gnome-base and Firefox 1.04 with 2 browser windows and
                  3 open tabs in each
gnome-mplayer     gnome-base and MPlayer 1.0pre7-3.3.5 playing an MPEG1 video
microsoft-office  gnome-base and CrossOver Office 5.0 running Microsoft Office
                  XP with 2 Word documents and 1 PowerPoint slide
                  presentation open

Table 6.2 – Application scenarios

The four desktop scenarios were running a baseline environment without additional applications, a web browser, a video player, and a Microsoft Office suite using CrossOver Office. The UML scenario shows the ability to checkpoint and restart an entire operating system instance. The Microsoft Office scenario shows the ability to checkpoint and restart Windows applications using CrossOver Office on Linux.

We measured checkpoint-restart performance by running each of the application scenarios and taking a series of ten full (i.e., non-incremental) checkpoints during their execution. We omitted file system snapshots to focus on the performance of saving and restoring only the execution state. We measured the checkpoint image sizes, the number of processes that were checkpointed, checkpoint times, and restart times, then averaged the measurements across the ten checkpoints for each application scenario. Figures 6.3 to 6.8 show results for our checkpoint-restart prototype.

Figure 6.3 shows the average total checkpoint image size, as well as a breakdown showing the amount of data in the checkpoint image attributable to the process forest.


Figure 6.3 – Average checkpoint size

The total amount of state that is saved is modest in each case and varies according to the applications executed, ranging from a few MBs for most applications to tens of MBs for graphical desktop sessions. The results show that the total memory in use within the pod is the most prominent component of the checkpoint image size, accounting for over 99% of the image size.

An interesting case is UML, which uses memory mapping to store guest main memory using an unlinked backing file. This file is separate from memory and amounts to 129 MB. By using the optimization for unlinked files as discussed in § 6.1.1 and storing the unlinked files separately on the file system, the UML state stored in the checkpoint image can be reduced to roughly 1 MB. The same occurs for CrossOver Office, which also maps an additional 16 MB of memory to an unlinked backing file.

Figure 6.4 shows the average number of processes running within the pod at checkpoints for each application scenario. On average, the process forest tracks 35 processes in most scenarios, except for apache and volano with 169 and 839 processes each, most of which are threads. As Figure 6.3 shows, the process forest always occupies a small fraction of the checkpoint, even for volano.


Figure 6.4 – Average number of processes


Figure 6.5 – Average checkpoint time

Figure 6.5 shows the average total checkpoint times for each application scenario, measured from when the pod is quiesced until the complete checkpoint image is written out to disk. We also show two other measures. Checkpoint downtime is the time from when the pod is quiesced until the pod can be resumed; it is the time to record the checkpoint data without committing it to disk. Sync checkpoint time is the total checkpoint time plus the time to force flushing of the data to disk.

Average total checkpoint times are under 600 ms for all application scenarios and as small as 40 ms, which is the case for UML. Comparing with Figure 6.3, the results show that both the total checkpoint times and the sync times are strongly correlated with the checkpoint sizes. Writing the file system, particularly with forced flushing of the data to disk, is largely limited by the disk I/O rate. For example, gnome-base has an average checkpoint size of 39 MB and an average sync checkpoint time of just under 3 s. This correlates directly with the sustained write rate for GFS, which was roughly 15 MB/s in our measurements.

Perhaps more importantly, the checkpoint downtimes in Figure 6.5 show that the average time to actually perform the checkpoint without incurring storage I/O costs is small, ranging from 12 ms for a kernel make to at most 90 ms for a full-fledged desktop running Microsoft Office. Though an application is unresponsive while it is quiesced and being checkpointed, even the largest average checkpoint downtimes are less than 100 ms. Furthermore, the average checkpoint downtimes were less than 50 ms for all application scenarios except Microsoft Office.

Figure 6.6 compares the checkpoint downtime for each application scenario with and without the memory buffering and COW mechanisms that we employ. Without these optimizations, checkpoint data must be written out to disk before the pod can be resumed, resulting in checkpoint downtimes that are close to the total checkpoint times shown in Figure 6.5.


Figure 6.6 – COW and buffering impact

The memory buffering and COW checkpoint optimizations reduce downtime from hundreds of milliseconds to almost always under 50 ms, in some cases by as much as an order of magnitude.

Figure 6.7 shows the breakdown of the total checkpoint time (excluding sync) for each application scenario, as the percentage of the total time attributable to different steps: quiesce, the time to quiesce the pod; record, the time to record the checkpoint data; and commit, the time to commit the data by writing it out to storage. The commit step amounts to 80–95% of the total time in almost all application scenarios, except for UML where it amounts to only 15% due to a much smaller checkpoint size.

Quiescing the processes took less than 700 µs for all application scenarios except apache and volano, which took roughly 1.5 ms and 5 ms, respectively. The longer quiesce times are due to the large number of processes being executed in apache and volano. The time to generate and record the process forest was even smaller, less than 10 µs for all applications except apache and volano, which took 30 µs and 336 µs, respectively. The longer times were mainly due to allocation of extra memory. This could be improved by estimating the number of entries in the structure from the actual number of processes in the pod, preallocating sufficient pages to hold that many entries. The time to record globally shared resources was also quite small, under 10 µs in all cases.


Figure 6.7 – Checkpoint time breakdown

Figure 6.8 presents the average total restart times for each application scenario. The restart times were measured for two distinct configurations: warm cache, where restart was done with a warm file system cache immediately after the checkpoint was taken, and cold cache, where restart was done with a cold file system cache after the system was rebooted, forcing the system to read the image from disk. Warm cache restart times were less than 0.5 s in all cases, ranging from 24 ms for apache to 386 ms for a complete Gnome desktop running Microsoft Office. Cold cache restart times were longer, as restart becomes limited by the disk I/O rate. Cold cache restart times were less than 2 s in all cases, ranging from 65 ms for UML to 1.9 s for Microsoft Office.


Figure 6.8 – Average restart time

The cold restart from a checkpoint image is still noticeably faster than the sync checkpoint to the file system because GFS read performance is much faster than its write performance.

To provide a comparison with another operating system virtualization approach, we also performed our experiments with OpenVZ. We used version 2.6.18.028stab on the same Linux installation. Because of its lack of GFS support, we copied the installation to the local disk to conduct the experiments. Since this configuration is different from what we used with our prototype, the measurements are not directly comparable. However, they provide some useful comparisons between the two approaches. We report OpenVZ results for apache, make, mysql, and volano; OpenVZ was unable to checkpoint the other scenarios. Table 6.3 presents the average total checkpoint times, warm cache restart times, and checkpoint image sizes for these applications. We ignore sync checkpoint times and cold cache restart times to reduce the impact of the different disk configurations used.

Scenario   Checkpoint [s]   Restart [s]   Size [MB]
apache          0.730           1.321         7.7
make            2.230           1.376        53
mysql           1.793           1.288        22
volano          2.036           1.300        25

Table 6.3 – Checkpoint-restart performance for applications that worked on OpenVZ

The results show that OpenVZ checkpoint and restart times are significantly worse than those of our system. OpenVZ checkpoint times were 5.2, 5.6, 12.4, and 3.0 times slower for apache, make, mysql, and volano, respectively. OpenVZ restart times were 55.0, 6.6, 29.9, and 5.0 times slower for apache, make, mysql, and volano, respectively. OpenVZ checkpoint sizes were 0.48, 1.3, 1.2, and 0.46 times the sizes of our system's. The difference in checkpoint sizes was relatively small and does not account for the large difference in checkpoint-restart times, even though different file system configurations were used due to OpenVZ's lack of support for GFS. OpenVZ restart times did not vary much among application scenarios, suggesting that container setup time may constitute a major component of the latency.

To provide a comparison with a hardware virtualization approach, we performed our experiments with Xen. We used Xen 3.0.3 with its default Linux 2.6.16.29. We were unable to find a GFS version that matched this configuration, so we used the local disk to conduct the experiments. We also used Xen 2.0 with Linux 2.6.11 because this configuration worked with GFS. In both cases, we used the same kernel for both “dom0” and “domU”. We used three VM configurations with 128 MB, 256 MB, and 512 MB of memory. We report results for apache, make, mysql, UML, and volano; Xen was unable to run the other scenarios due to lack of support for virtual consoles. Table 6.4 presents the average total checkpoint times, warm cache restart times, and checkpoint image sizes for these applications.

Xen        Checkpoint [s]     Restart [s]      Image
Config.    Xen 3   Xen 2     Xen 3   Xen 2    Size [MB]
128 MB      3.5     5.5       1.6     0.8       129
256 MB     10.3    12        13.4     6.6       257
512 MB     25.9    19        27.3    12         513

Table 6.4 – Checkpoint-restart performance for Xen VMs

We report a single number for each configuration instead of per application, since the Xen results were directly correlated with the VM memory configuration and did not depend on the application scenario. Checkpoint image size was determined by the amount of RAM configured. Checkpoint and restart times were directly correlated with the size of the checkpoint images.

The results show that Xen checkpoint and restart times are significantly worse than those of our system. Xen 3 checkpoint times were 5.2 (volano on 128 MB) to 563 (UML on 512 MB) times slower. Xen 3 restart times were 6.2 (volano on 128 MB) to 1137 (apache on 512 MB) times slower. The Xen results are also worse than OpenVZ's; both operating system virtualization approaches performed better. Restart times for the 256 MB and 512 MB VM configurations were much worse than for the 128 MB VM because the Xen checkpoint images ended up being too large to be effectively cached in the kernel, severely degrading warm cache restart performance to roughly the same level as with a cold cache. Note that although precopying can reduce application downtime for Xen migration [35], it will not reduce total checkpoint-restart times, which can be two orders of magnitude slower.

6.8 Summary

In this chapter, we presented DejaView’s architecture for recording and later reviv- ing live execution. We introduced a transparent mechanism for checkpoint-restart in Chapter 6. Live Execution Recording 171 commodity operating systems that can checkpoint multiple processes in a consistent manner so that they can be restarted correctly at a later time. Our approach com- bines a kernel-level checkpoint mechanism with a hybrid user-level and kernel-level restart mechanism to leverage existing operating system interfaces and functional- ity. We introduced novel algorithms for saving and restoring process relationships and for efficient handling of shared state across cooperating processes. We presented checkpoint optimizations for recording application execution state without affecting interactive desktop performance. We also presented a coordinated checkpoint and file system mechanism that combines log structured and unioning file systems to en- able fast file system snapshots consistent with checkpoints, allowing checkpoints to be later revived for simultaneous read-write usage. Our experimental results with a checkpoint-restart prototype show that DejaView’s live execution recording generates modest checkpoint image sizes and provides fast checkpoint and restart times without modifying existing system components. Comparisons with two commercial systems demonstrate that our prototype provides much faster checkpoint-restart performance and more robust checkpoint-restart functionality than these other approaches. Chapter 7. Whole System Evaluation 172

Chapter 7

Whole System Evaluation

In the previous chapters, Chapters 4, 5, and 6, we discussed and evaluated three components used as building blocks for DejaView, namely DejaView's display-centric text recorder, virtual execution environment, and live execution checkpoint-restart mechanism. Through those evaluations we quantified the individual components, demonstrating that they meet the desired functionality and performance requirements. In this chapter, we combine them all together to provide a whole system evaluation of DejaView.

We have implemented a DejaView prototype for Linux desktop environments. The server prototype consists of a virtual display driver for the X Window System that provides display recording, a set of userspace utilities and loadable kernel modules for off-the-shelf Linux 2.6 kernels that provide the virtual execution environment and the ability to checkpoint and revive user sessions, and a snapshotable and branchable file system based on NILFS [91] and UnionFS [183] that guarantees consistency between checkpoints and file system state. For capturing text information, DejaView uses the accessibility infrastructure of the GNOME desktop environment [1]. Indexing and searching text is performed using the Tsearch extension [167] for the PostgreSQL

database system. A simple client viewer is used to access the DejaView desktop locally or remotely and provides display browse and search functions.

We present experimental data that quantifies the performance of DejaView when running a variety of common desktop applications using this prototype. We present results for both application benchmarks and real user desktop usage. We focus on evaluating the performance of using DejaView along two dimensions, namely overall system overhead and usability in terms of latency to access recorded data. We quantify the recording runtime overhead, the impact on system interactivity, and the storage requirements of using DejaView in terms of the cost of continuously recording display and execution. We quantify the text search latency, the display browse latency and playback speed, and the session revive latency, to show DejaView's usability in terms of providing efficient access to recorded content.

For the application benchmark experiments, we did full fidelity display recording and checkpointed once per second to provide a conservative measure of performance. For the real user desktop usage experiments, we did full fidelity display recording and checkpointed according to the policy described in § 6.2.3 to provide a corresponding real world measure of performance. We also measured the overhead of our virtual display mechanism and virtual execution environment and found it to be quite small. We omit virtualization overhead results since we have already shown in § 5.4 that our virtual execution environments impose low overhead, and previous work has shown that our basic virtual display mechanism provides good display performance [15].

We used the desktop application scenarios listed in Table 7.1. We considered several individual application scenarios running in a full desktop environment, including scenarios that created lots of display data (web, video, untar, make, cat) as well as those that did not and were more compute intensive (gzip, octave).

Name      Description
web       Firefox 2.0.0.1 running iBench web browsing benchmark to
          download 54 web pages
video     MPlayer 1.0rc1-4.1.2 playing Life of David Gale MPEG2 movie
          trailer at full-screen resolution
untar     Verbose untar of 2.6.16.3 Linux kernel source tree
gzip      Compress a 1.8 GB Apache access log file
make      Build the 2.6.16.3 Linux kernel
octave    Octave 2.1.73 (MATLAB 4 clone) running Octave 2 numerical
          benchmark
cat       cat a 17 MB system log file
desktop   16 hr of desktop usage by multiple users, including GAIM 1.5,
          Firefox 2.0.0.1, OpenOffice 2.0.1, Adobe Acrobat Reader 7.0, etc.

Table 7.1 – Application scenarios

These scenarios measure performance only during periods of busy application activity, providing a conservative measure of DejaView performance, since real interactive desktop usage typically consists of many periods during which the computer is not fully utilized. For example, our web scenario downloads a series of web pages in rapid-fire succession instead of having delays between web page downloads for user think time, to stress DejaView and measure its worst-case performance. To provide a more representative measure of performance, we measured real user desktop usage (labeled as desktop in the graphs) by aggregating data from multiple graduate students using our prototype for all their computer work over many hours.

For all our experiments, the DejaView viewer and server ran on a Dell Dimension 5150C with a 3.20 GHz Intel Pentium D CPU, 4 GB RAM, and a 500 GB SATA hard drive, connected to a public switched Fast Ethernet network. The machine ran the Debian Linux distribution with kernel version 2.6.11.12, using X.org 7.1 as the window system and GNOME 2.14 as the desktop environment. The display resolution was 1024x768 for the application benchmarks and 1280x1024 for the real desktop usage measurements. For our web application scenario, we also used an IBM Netfinity 4500R server with dual 933 MHz Pentium III CPUs and 512 MB RAM as the web server, running Linux kernel version 2.6.10 and Apache 1.3.34.

7.1 System Overhead

Figure 7.1 shows the performance overhead of running DejaView for each application scenario. We ran each scenario without recording, with each of the individual recording components only, and with full recording, including display, text indexing, and checkpoints once per second. Performance is shown normalized to the execution time without any recording. The results show that there is some overhead for recording, but in practice there were no visible interruptions in the interactive desktop experience and real-time interaction was not affected.

Full recording overhead is small in almost all scenarios, including those that are quite display intensive such as cat and full-screen video playback. In all cases other than web browsing, the overhead was less than 20%. For video, the most time-critical application scenario, the overhead of full recording is less than 1% and does not cause any of the video frames to be dropped during display. For web browsing, the overhead was about 115% because the average download latency per web page was slightly more than half a second with full recording, while it was 0.28 seconds without recording. We discuss the reasons for this overhead below. However, real users do not download web pages in rapid-fire succession as the benchmark does, and the page download latencies with full recording are well below the typical one second threshold needed for users to have an uninterrupted browsing experience [119]. The web performance of DejaView with full recording is fast enough in practice for interactive web browsing. We did not measure the performance overhead of the desktop scenario given the lack of precise repeatability with real usage.

Figure 7.1 also shows how the DejaView recording components individually affect performance.


Figure 7.1 – Recording runtime overhead

Both display and checkpoint recording overheads are small in all scenarios, including those that are quite display intensive such as cat and full-screen video playback. The largest display recording overhead is 9% for the rapid-fire web page download, which changes almost all of the screen continuously and causes the web browser and the DejaView server and viewer to compete for CPU and I/O resources.

The display overhead for all other cases is less than 2%. As expected, gzip and octave have essentially zero display recording overhead since they produce little visual output. Interestingly, video has one of the smallest display recording overheads, essentially zero. Even though it changes the entire display for each video frame, it requires only one command per video frame, resulting in 24 commands per second, a relatively modest rate of processing. For checkpoint recording, the largest overhead is for make, which is 13%. For the other applications, the checkpoint overhead is less than 5%. In practice, the overhead is not typically noticeable to the user. Note that these checkpoint overheads were for once per second checkpointing and represent a conservative measure; the use of the checkpoint policy in practice would reduce checkpoint overhead even further.


Figure 7.2 – Total checkpoint latency

Figure 7.1 additionally shows the content recording overhead, which is small in all scenarios except for the web benchmark. The overhead is less than 4% for all cases except for the web benchmark, for which the content recording overhead is 99%, accounting for almost all of the overhead of full recording. Unlike the other applications, the Firefox web browser creates its accessibility information on demand, rather than simply updating existing information. This dynamic generation of accessibility information, coupled with a weakness in the current Firefox accessibility implementation, results in much higher overhead when DejaView indexing records text information. We expect that this overhead will decrease over time as the accessibility features of Firefox improve [57].

Figure 7.2 shows the average checkpoint times for each of the application scenarios. The times are broken down into five parts: pre-checkpoint, which includes pre-snapshot and pre-quiesce time; quiesce; capture, which is the time it takes to perform a copy-on-write capture of all memory and state; file system snapshot; and write-back, which is the time to write the data out to disk. Downtime due to checkpointing is the sum of the quiesce, capture, and file system snapshot times. Overall, the results show that application downtime due to checkpoints is small enough that DejaView can perform full recording of live execution state without a noticeable degradation in interactive application performance.

Figure 7.2 shows that application downtime for DejaView checkpointing is minimal, less than 10 ms for all application benchmarks and roughly 20 ms on average for real desktop usage. Average downtime is higher for the real usage cases because the users often ran multiple applications at once, and the DejaView checkpoint policy results in fewer checkpoints, so each checkpoint can take longer due to an increased amount of changed state. Though an application is unresponsive while it is stopped, these results show that even the largest application downtimes are less than the typical system response time threshold of 150 ms needed for supporting most human-computer interaction tasks without noticeable delay [150]. For instance, for video the application downtime was only 5 ms, which is small enough to avoid interfering with the time-critical display of video frames.

Application downtime is primarily due to the copy-on-write capture of memory state, though file system snapshot time can account for up to half of the downtime, as in the case of untar, which is more file system intensive. The downtime is minimized due to the incremental and COW checkpointing mechanisms, the pre-checkpoint operations, and deferring the writing of the checkpoint image to disk until after the session has been resumed. For comparison purposes, we attempted the same experiments without these optimizations for minimizing downtime, but could not run them. The unoptimized mechanism was too slow to checkpoint at the once a second rate DejaView uses; it took too long to even write the checkpoint data to disk.


Figure 7.3 – Recording storage growth

Figure 7.2 shows that pre-checkpoint and write-back account for most of the total average checkpoint time, which is under 100 ms in most cases but is as high as 180 ms for the more complex user desktop. Since pre-checkpoint and write-back overlap with application execution, they do not result in downtime that would interfere with interactive performance. The large majority of pre-checkpoint time is consumed by the file system pre-snapshot. Pre-quiesce is on average very small, but is essential because it has high variability and can be as large as 100 ms.

Figure 7.3 shows the storage space growth rate DejaView experiences for each of the application scenarios. We decompose the storage requirements into the amount of increased storage DejaView imposes for display state, display indexing, process checkpoint state, and file system snapshot state. For display, indexing, and process checkpoint state, we measure the size of the files created to store them. However, for file system snapshot state we report the difference between the entire snapshot file system usage and what is visible to the user at the end of the scenario, as the visible size is independent of DejaView. We approximate the visible size by creating an uncompressed tar archive of the visible state, resulting in a slight overestimate of the file system storage growth rate. Since process checkpoint state is easily compressible, we show the storage growth rate for both uncompressed and compressed checkpoints, overlaying the latter on the former in the figure.

For all of the application scenarios except video and untar, DejaView storage usage is dominated by checkpoint sizes. Figure 7.3 shows that the storage growth rate for the scenarios ranges from 2.5 MB/s for gzip to 20 MB/s for octave, assuming uncompressed checkpoints. Using gzip to compress the checkpoints, the storage growth rate for octave drops to just over 4 MB/s. With compressed checkpoints, the storage growth rate of all the applications except video and untar drops below 6 MB/s.

For video, display recording accounts for most of the storage growth, at 4 MB/s. Video requires more extensive display storage since each event changes the entire display, even though it does not create a high rate of display events. Video also has a relatively high ratio of display state to checkpoint state because it is primarily a single-process application that does not create much new process state during its execution. For untar, file system storage accounts for most of the storage growth, at 9 MB/s. It requires more extensive file system storage due to the extraction of a tar archive containing the Linux kernel source tree, which contains lots of small files. Since DejaView's log structured file system needs to be able to recreate any point in the checkpoint history, it includes more overhead for file creation. This can be contrasted with gzip, where, despite having its large file continually snapshotted, the file system usage is small.

More importantly, typical usage does not have as high a growth rate, resulting in much lower storage requirements in practice. As shown in Figure 7.3, the storage space growth rate for real user desktop usage is much more modest, at only 2.5 MB/s with uncompressed checkpoints and 0.6 MB/s with compressed checkpoints. In comparison, HDTV PVRs require roughly 9 GB of storage per hour of recording, or 2.5 MB/s. While DejaView storage requirements are greater than those of HDTV PVRs during periods of intense application activity, the desktop scenario results indicate that in practice they are comparable. Also, as disk storage densities continue to double each year and multi-terabyte drives become commonplace in PCs [127], the storage requirements of DejaView will become increasingly practical for many users.

The storage space growth rate of DejaView is low primarily because of the checkpoint policy. To quantify its effectiveness, we examined the checkpoint logs recorded during the desktop usage. We found that DejaView skipped the majority of the checkpoints, taking checkpoints on average only 20% of the time. In the remaining time, the policy deferred checkpointing 13% of the time due to lack of display activity, 69% due to low display activity, and 18% due to a reduced checkpoint rate during periods of text editing. We estimate that with no policy, the storage growth rate would exceed 3 MB/s for the compressed case. If we also account for idle time (during which the screensaver is running and DejaView skips checkpoints), the storage rate would exceed 6 MB/s.
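As a quick check of the PVR figure quoted above (assuming 1 GB = 1024 MB):

\[
\frac{9\,\mathrm{GB/hour} \times 1024\,\mathrm{MB/GB}}{3600\,\mathrm{s/hour}}
  = \frac{9216\,\mathrm{MB}}{3600\,\mathrm{s}} \approx 2.56\,\mathrm{MB/s},
\]

which matches the roughly 2.5 MB/s cited.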

7.2 Access To Data

We also conducted experiments that show DejaView's effectiveness at providing access to recorded content by measuring its search, browse, and revive performance. We measured DejaView search performance by first indexing all displayed text for our application tests and desktop usage, each in its own respective database, then issuing various queries. For each application benchmark, we report the average query time for five single-word queries of text selected randomly from the respective database.


Figure 7.4 – Latency of text search and display browsing

For real desktop usage, we report the average query time for ten multi-word queries, with a subset limited to specific applications and time ranges, to mimic the expected behavior of a DejaView user. Figure 7.4 shows that on average, DejaView is able to return search results in no more than 10 ms for the application benchmarks and in roughly 20 ms for real desktop usage. These results demonstrate that the query times are fast enough to support interactive search. Another important measure of search performance is the relevance of the query results, which we expect to measure through a user study; this is beyond the scope of this dissertation.

We measured browsing performance (the time to access and display specific contents from the display recording) by using the display content recorded during our application benchmarks and accessing it at regular intervals. However, we were careful not to skew results in DejaView's favor, by eliminating points in the recording where fewer than 100 display commands were issued since the previous point.


Figure 7.5 – Playback speedup

Eliminating these points makes sense since they belong to periods in which the system was not actively used, and hence are unlikely to be of interest to the user. Figure 7.4 shows that on average, DejaView can access, generate, and display the contents of the stream at interactive rates, ranging from 40 ms browsing times for video to 130 ms for web. For real desktop usage, browsing times are roughly 200 ms. These results demonstrate that DejaView provides fast access to any point in the recorded display stream, allowing users to efficiently browse their content.

To demonstrate how quickly a user can visually search the record, we measured playback performance for all the application scenarios, measuring how long it would take to play the entire visual record. Figure 7.5 demonstrates that DejaView is able to play back an entire record at many times the rate at which it was originally generated. For instance, Figure 7.5 shows that DejaView is able to play back regular user desktops at over 200 times the speed at which they were recorded. While some benchmarks, in particular ibench, do not show as much of a speedup, we attribute this to the fact that they are constantly changing data at the rate of display updates.


Figure 7.6 – Revive latency

DejaView is able to display the visual record at over 10 times the speed at which it was recorded. These results demonstrate that DejaView can browse through display records at interactive rates. For each of the application scenarios, Figure 7.6 shows the time it takes to revive the user’s desktop session from when a user clicks on “Take Me Back” to when the desktop session is ready for use. Results are shown for using checkpoint files that are not cached as well as for cached. For the uncached case, revive times are all several seconds and are dominated by I/O latencies. For the cached case, revive times are all well under a second and commonly around half a second. These times provide a more direct measure of the actual processing time required to revive a session. Reviving using checkpoint files that have been cached due to recent file access can occur when users revive a session at a time relatively close to the current time. Perhaps a more common scenario of reusing cached checkpoint files is when a user conducts a specific search and needs to revive her session from multiple points in time that are near each Chapter 7. Whole System Evaluation 185 other. While these points in time represent distinct session states, they are likely to reference a common set of incremental checkpoints. We show the time to revive the user’s session from five different points in time evenly spaced throughout the application’s execution. For each application, the bars in the graph are ordered chronologically from left to right. The revive times from uncached checkpoint data show an increase over time, while those from cached check- point data are relatively constant across each application benchmark. Since incremen- tal checkpointing is used, the revive times from checkpoints later in the application executions involve accessing more checkpoint files. However, the cost of accessing multiple files is not the reason for the increase in revive times here; reviving from non-incremental checkpoints would show a similar increase. The increase is instead largely due to increased memory usage by the applications as they execute. For re- viving from uncached checkpoint files, the first revive time is often significantly faster than the others because the applications are not yet fully loaded. Subsequent un- cached revive times reflect moderate growth for most applications because memory usage tends to increase over time, resulting in more saved memory state that needs to be read in from disk to revive the session. The web benchmark shows a substantial increase in revive times, growing by more than a factor of two from the second to the last revive. The reason for this is that the Firefox web browser is an application whose memory usage grows more dramatically during the benchmark, by more than a factor of two over its entire course. The uncached performance could be improved by demand paging; the current revive implementation requires reading in all necessary checkpoint data into memory before reviving. Reviving near the end of the appli- cation’s execution is sometimes faster (e.g.,untar) because the application is doing more work in the middle of its execution and using more memory than near the end. Overall, our results show that the cost of accessing multiple incremental checkpoint Chapter 7. Whole System Evaluation 186

files while reviving a session is not prohibitive, and is outweighed by the ability of incremental checkpointing to reduce the more frequent and performance-critical checkpoint times.

7.3 Summary

In this chapter, we presented a whole-system experimental evaluation of a prototype implementation of DejaView on common desktop application workloads and with real desktop usage. The prototype is implemented as a set of loadable modules for Linux and the X Window System and provides transparent operation without modifying existing system components. Our results show that DejaView's recording of display and execution state adds negligible overhead of only a few milliseconds of interruption to interactive applications, which is typically not noticeable to end users even for more time-sensitive applications such as movie playback. DejaView's playback can enable users to quickly view display records at up to 270 times faster than real-time, and browsing and searching display information is fast enough to be done at interactive rates. DejaView's storage requirements at the highest quality are comparable to PVRs recording HDTV-resolution media programming, enabling high quality recording for everyday use as terabyte storage capacities become commonplace. These results demonstrate the architecture's effectiveness in providing continuous recording with unnoticeable overhead.

Chapter 8

Distributed Checkpoint-Restart

Many of the benefits of DejaView's checkpoint-restart mechanism from Chapter 6 become crucial in cluster computing environments, including fault resilience and recovery, improved resource utilization and load balancing, and improved maintenance and administration. These benefits are particularly salient for long-running jobs, where the ability to restart a job in the middle of its execution, instead of needing to start over from the beginning, is important. Recognizing the importance of providing these capabilities for clusters, in this chapter we extend DejaView's checkpoint-restart mechanism to distributed applications that run on multiple nodes in a cluster.

For distributed applications, a checkpoint-restart mechanism needs to not only save and restore the application state associated with each cluster node, but also ensure that the state saved and restored across all participating nodes is globally consistent. Checkpoint and restart must be coordinated across all participating nodes to ensure that application processes running on each node are synchronized correctly. In particular, the network state of communication links among application processes on different nodes must be checkpointed and restarted such that nodes properly agree on the state of messages being delivered. If a node's state reflects a message receipt, then the state of the corresponding sender should reflect having sent that message [29]. Although coordinated checkpoint-restart of distributed applications provides substantial potential benefits, existing approaches [7, 23, 28, 30, 54, 140, 146, 163] have been unable to provide this functionality transparently on clusters running commodity operating systems and hardware.

We present ZapC, an extension of DejaView's checkpoint-restart mechanism that provides transparent, coordinated checkpoint-restart of distributed network applications on commodity clusters. ZapC can checkpoint an entire distributed application across all nodes in a coordinated manner, so that it can later be restarted from the checkpoint on a different set of cluster nodes. In checkpointing and restarting a distributed application, ZapC separates the processing of network state from per-node application state. It only requires synchronized operation in capturing the network state, which represents a small fraction of the overall checkpoint time. Per-node checkpoint-restart operations on (non-network) application state proceed in parallel with minimal synchronization requirements among nodes, resulting in faster checkpoint and restart times. ZapC can also directly stream checkpoint data from one set of nodes to another, enabling direct migration of a distributed application to a new set of nodes without saving and restoring state from secondary storage.

ZapC uniquely supports complete checkpoint-restart of network state in a transport protocol independent manner, without application or library support. It leverages the socket abstraction and correctly saves and restores all socket state, including socket parameters, socket data queues, and minimal protocol specific state. ZapC accomplishes this in a portable manner using the standard socket interface, without detailed knowledge of the underlying network protocol data structures. ZapC accounts for network state in a protocol independent manner for reliable and unreliable network protocols, including TCP, UDP, and raw IP.

8.1 Architecture Overview

ZapC is designed to checkpoint-restart an entire distributed network application running on a set of cluster nodes. It can be thought of in terms of three logical components: a standalone checkpoint-restart mechanism based on Zap that saves and restores non-network per-node application state; a manager that coordinates a set of agents, each using the standalone checkpoint-restart mechanism, to save and restore a distributed application across a set of cluster nodes in a consistent manner; and a network checkpoint-restart mechanism that saves and restores all the necessary network state to enable the application processes running on different nodes to communicate. For simplicity, we describe these ZapC components assuming a commodity cluster in which the cluster nodes run independent commodity operating system instances and all have access to a shared storage infrastructure. For example, a common configuration would be a set of blade servers or rack-mounted 1U servers running standard Linux and connected to a common SAN or NAS storage infrastructure.

To execute a distributed application across a set of cluster nodes, ZapC encapsulates the application processes running on each node in a pod to decouple those processes from the underlying host. Recall that pods provide a self-contained unit that can be isolated from the system, checkpointed to secondary storage, migrated to another machine, and transparently restarted. As a pod migrates from one node to another, virtual resources are remapped to real operating system resources. In particular, ZapC only allows applications in pods to see virtual network addresses, which are transparently remapped to underlying real network addresses as a pod migrates among different machines. This enables ZapC to migrate distributed applications to any cluster regardless of its IP subnet or addresses.

With ZapC, a distributed application is executed in a manner analogous to a regular cluster, ideally placing each application endpoint in a separate pod. For example, on multiprocessor nodes that run multiple application endpoints, each endpoint can be encapsulated in a separate pod. To leverage mobility, it is advantageous to divide the application into many independent pods, since the pod is the minimal unit of migration. This allows for maximum flexibility when migrating the application. ZapC can migrate a distributed application running on N cluster nodes to run on M cluster nodes, where generally N ≠ M. For instance, a dual-CPU node may host two application endpoints encapsulated in two separate pods. Each pod can thereafter be relocated to a distinct node; they do not need to be migrated together to the same node.

8.2 Distributed Checkpoint-Restart

To checkpoint-restart a distributed network application, ZapC provides a coordinated checkpoint-restart algorithm that uses the pod checkpoint-restart mechanism and a novel network state checkpoint-restart mechanism described in § 8.3. We assume that all the network connections are internal among the participating nodes that compose the distributed application; connections going outside of the cluster are beyond the scope of this work. Although ZapC allows multiple pods to execute concurrently on the same node, for simplicity we describe ZapC operation below assuming one pod per node.

Our coordinated checkpointing scheme consists of a Manager client that orchestrates the operation and a set of Agents, one on each node. The Manager is the front-end client invoked by the user and can be run from anywhere, inside or outside the cluster. It accepts a user's checkpoint or restart request and translates it into a set of commands to the Agents.

The Agents receive these commands and carry them out on their local nodes. The Manager maintains reliable network connections with the Agents throughout the entire operation; therefore, an Agent failure is readily detected by the Manager as soon as the connection breaks. Similarly, a failure of the Manager itself will be noted by the Agents. In both cases, the operation is gracefully aborted and the application resumes its execution.

A checkpoint is initiated by invoking the Manager with a list of tuples of the form ⟨node, pod, URI⟩.

This list specifies the nodes and the pods that compose the distributed application, as well as the destination for the checkpointed data (URI). The destination can be either a file name or the network address of a receiving Agent. The latter facilitates direct migration of an application from one set of nodes to another, without requiring that the checkpoint data first be written to some intermediary storage.

The Manager and the Agents execute the checkpoint algorithms given in Algorithms 8.1 and 8.2, respectively. Given a request for a checkpoint, the Manager

begins by broadcasting a checkpoint command to all participating nodes. Upon receiving the command, each Agent initiates the local checkpoint procedure, which is divided into four steps: suspending the designated pod, invoking the network-state checkpoint, proceeding with the standalone pod checkpoint, and finalizing the checkpoint.

Algorithm 8.1: Coordinated Checkpoint (Manager)
 1  forall agents do
 2      SendCommand(agent, 'checkpoint')
 3  forall agents do
 4      RecvData(agent, 'meta-data')
 5  forall agents do
 6      SendCommand(agent, 'continue')
 7  forall agents do
 8      RecvData(agent, 'status')

Algorithm 8.2: Coordinated Checkpoint (Agent)
 1  BlockNetwork(pod)
 2  Checkpoint(pod, network)              /* network checkpoint */
 3  SendData(manager, 'meta-data')
 4  if 'continue' arrived then            /* ok to unblock network? */
 5      RecvCommand(manager, 'continue')
 6      UnblockNetwork(pod)
 7  end
 8  Checkpoint(pod, standalone)           /* standalone checkpoint */
 9  if 'continue' not arrived then        /* wait for 'continue'? */
10      RecvCommand(manager, 'continue')
11      UnblockNetwork(pod)
12  end
13  SendData(manager, 'status')

The Agent also performs three companion steps, lines 3, 4–7, and 9–12 in Algorithm 8.2, which are not directly related to the local checkpoint procedure, but rather to its interaction with the Manager. Lines 4 and 9 both test the same condition, ensuring that the Agent only finishes after having satisfied two conditions: it has reported its status to the Manager, and it has received the continue message from the Manager.

Each Agent first suspends its respective pod by sending a SIGSTOP signal to all the processes in the pod, to prevent those processes from being altered during the checkpoint. To prevent the network state from changing, the Agent disables all network activity going to and from the pod. This is done by leveraging a standard network filtering service to block the pod's connections; Netfilter [118] comes standard with Linux and provides this functionality.
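As a concrete illustration of this blocking step, the following is a minimal sketch, not DejaView's actual code, of a Netfilter hook in the style of the Linux 2.6-era API (hook signatures and hook-point names vary across kernel versions); pod_ip stands for the pod's virtual address and is an assumption of the sketch:

    #include <linux/module.h>
    #include <linux/netfilter.h>
    #include <linux/netfilter_ipv4.h>
    #include <linux/ip.h>

    static __be32 pod_ip;  /* set when the pod's network is frozen */

    static unsigned int
    block_pod(unsigned int hooknum, struct sk_buff *skb,
              const struct net_device *in, const struct net_device *out,
              int (*okfn)(struct sk_buff *))
    {
        struct iphdr *iph = ip_hdr(skb);

        /* Drop any packet whose source or destination is the pod. */
        if (iph->saddr == pod_ip || iph->daddr == pod_ip)
            return NF_DROP;
        return NF_ACCEPT;
    }

    static struct nf_hook_ops block_ops[] = {
        { .hook = block_pod, .pf = PF_INET,
          .hooknum = NF_IP_PRE_ROUTING,  .priority = NF_IP_PRI_FIRST },
        { .hook = block_pod, .pf = PF_INET,
          .hooknum = NF_IP_POST_ROUTING, .priority = NF_IP_PRI_FIRST },
    };
    /* Register the hooks when the checkpoint starts and unregister them
     * when the pod's network is unblocked. */

Hooking both the pre-routing and post-routing points catches traffic in both directions at the edge of the network stack, so neither incoming nor outgoing packets can alter the pod's network state during the checkpoint.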

The Agent then obtains the network meta-data of the node, a table of ⟨state, source, target⟩ tuples describing all network connections of the pod. This is the first information saved by the Agent as part of the checkpoint and is used by the restart procedure to correctly reconstruct the network state. The source and target fields describe the connection endpoint IP addresses and port numbers. The state field reflects the state of the connection, which may be full-duplex, half-duplex, closed (in which case there may still be unread data), or connecting. The first three states are for established connections, while the last is a transient state for a not yet fully established connection.

Once the pod's network is frozen, the Agent checkpoints the network state. When finished, the Agent notifies the Manager that it has concluded its network-state checkpoint, and reports its meta-data. It then proceeds to perform the standalone pod checkpoint. The Agent cannot complete the standalone pod checkpoint until the Manager has received the meta-data from all participating Agents, at which point the Manager tells the Agents they can continue. ZapC checkpoints the network state before the other pod state to enable a more concurrent checkpoint operation, overlapping the standalone pod checkpoint with the time it takes for the Manager to receive the meta-data from all participating Agents and indicate that they can continue.

In the last step, the action taken by the Agent depends on the context of the checkpoint. If the application should continue to run on the same node after the checkpoint (i.e., taking a snapshot), the pod is allowed to resume execution by sending a SIGCONT to all its processes. However, should the application processes migrate to another location, the Agent will destroy the pod locally and create a new one at the destination site. In both cases, a file system snapshot (if desired) may be taken immediately prior to reactivating the pod.

To provide a better understanding of the checkpoint timing requirements, Figure 8.1 illustrates a typical checkpoint timeline. The timeline is labeled with numbers that correspond to the steps of the checkpoint algorithm described above. It shows that the entire checkpoint procedure executes concurrently and asynchronously on all participating nodes for nearly its entire duration.

[Timeline: the Manager and Nodes 1 through N proceed concurrently through steps 1–4; each node's network is blocked only between steps 1 and 3, with a single "sync" barrier at the Manager. Key — 1. checkpoint: stop pod, block network; 2. send meta-data: barrier; 3. continue: unblock network; 4. report done.]

Figure 8.1 – Coordinated checkpoint timeline

Figure 8.1 shows that the only synchronization point is the "sync" point at the Manager after step 2 and during step 3. This single synchronization is necessary and sufficient for the checkpoint procedure to be coherent and correct. It is necessary for the Agents to synchronize at the Manager before completing their standalone pod checkpoints and unblocking their networks. Otherwise it would be possible for one node to resume operation, re-engage in network activity, and deliver data to another node that had not yet begun its checkpoint. This would result in an inconsistent global state, as the state of the latter node would contain data that is not marked as sent in the already-saved state of the former.

The single synchronization is sufficient since every pod ensures consistency by blocking its connections independently of other pods. Once a pod has blocked its connections, there is no interaction with any other pod, even if the network of other pods is not yet blocked. The pod is already isolated and does not need to wait for all other pods to block their connections. By not having to wait for other pods initially, network activity is blocked only for the minimal required time.

A restart is initiated by invoking the Manager with a list of tuples of the form ⟨node, pod, URI⟩.

This list describes the mapping of the application to nodes and pods, where URI indicates the location of the checkpoint data. A key requirement of the restart is to restore the network connections of the distributed application. A naive approach would be to manually create the internal kernel data structures and populate them with the relevant data, but this is difficult and requires intimate knowledge of the protocol implementation, tight cooperation between the peers, and careful adjustment of protocol-dependent parameters. Since ZapC is restarting the entire distributed application, it controls both ends of each network connection. This makes it straightforward to reconstruct the communicating sockets on both sides of each connection using a pair of connect and accept system calls. This leverages the standard socket interface for creating network connections and results in a robust, easy to implement, and highly portable approach. Using this approach, the Manager and the Agents execute the restart algorithms given in Algorithms 8.3 and 8.4, which are similar to their checkpoint counterparts.

Given a restart request, the Manager begins by sending a restart command to all the Agents, accompanied by a modified version of the meta-data. The meta-data is used to derive a new network connectivity map by substituting the destination network addresses in place of the original addresses. This outlines the desired mapping of the application to node/pod pairs.

Algorithm 8.3: Coordinated Restart (Manager)
 1  forall agents do
 2      SendCommand(agent, 'restart')
 3      SendData(agent, 'meta-data')
 4  forall agents do
 5      RecvData(agent, 'status')

Algorithm 8.4: Coordinated Restart (Agent)
 1  CreatePod(pod)
 2  RecvData(manager, 'meta-data')
 3  Restart(pod, network)                 /* restore network state */
 4  Restart(pod, standalone)              /* restore standalone state */
 5  SendData(manager, 'status')

In the case of a restart on the same set of nodes (e.g., recovering from a crash), the mapping is likely to remain unmodified. In the case of migration, the mapping will reflect the settings of the alternate execution environment, particularly the network addresses at the target cluster. As part of the modified meta-data, the Manager provides a schedule that indicates for each connection which peer will initiate and which peer will accept. This is done by tagging each entry as either a connect or an accept type. The choice is normally arbitrary, except when multiple connections share the same source port number. Source port numbers can be set by the application if not already taken, or assigned automatically by the kernel; specifically, when a TCP connection is accepted, it inherits the source port number from the "listening" socket. To correctly preserve a source port number shared by multiple connections, these connections must be created in a manner that resembles their original creation, as determined by the above schedule.

The Agents respond to the Manager's commands by creating an empty pod into which the application will be restored. Each Agent then engages the local restart procedure, which consists of three steps: recovering the network connectivity, restoring the network state, and executing the application standalone restart.

For example, a closed connection would have the shutdown system call executed after the rest of its state has been recovered. Generally, these connections cannot be executed in any arbitrary order, or a dead- lock may occur. Consider for instance an application connected in a ring topology (each node has two connections - one at each side): a deadlock occurs if every node first attempts to accept a connection from the next node. To prevent such deadlocks, rather than using sophisticated methods to create a deadlock-free schedule, we simply divide the work between two threads of execution. One thread handles requests for incoming connections, and the other establishes connections to remote pods. Hence, there is no specific order at which connections requests should arrive at the Agent. The result is a simple and efficient connectivity recovery scheme, which is trivial to implement in a portable way. Once the network connectivity has been re-established, the Agent initiates the restart of the network-state. This ensures that we reinstate the exact previous state of all network connections, namely connection status, receive queue, send queue and protocol specific state. Similarly to the distributed checkpoint, the motivation for this order of actions is to avoid forced synchronization points between the nodes at later stages. In turn, this prevents unnecessary idle time, and increases concurrency Chapter 8. Distributed Checkpoint-Restart 198 by hiding associated latencies. With this framework, the only synchronization that is required is indirect and is induced by the creation of network connections. As demonstrated in § 8.4, the standalone restore time greatly dominates the total restore time, and fluctuates considerably. Positioning it as the first to execute may lead to imbalances and wasted idle time due to the synchronization that follows. Instead, our scheme manages to both minimize the loss by doing it early, and enable the pods to continue their execution as soon as they conclude their standalone restart. A key observation about our restart scheme is that it does not require that the network be disabled for any intermediate period. Recall that with checkpoint, the network was shut off to ensure a consistent state. The challenge was to capture the state of live connections that already carry data in the queues, and are likely to be transient. Conversely the re-established network connections are entirely controlled by out restart code. It is guaranteed that no data, but that which we choose to explicitly send, will be transmitted through the connection, until the application resumes execution (which will only occur at the end of the restart). The final notch of the procedure is the standalone restart, invoked locally by each Agent after the network state has been successfully restored. To conclude the entire operation, each Agent sends a summary message to the Manager, specifying the completion status (failure or success) and the name of the new pod that has been created. The Manager collects this data from all the Agents and reports it back to the user.

8.3 Network State Checkpoint-Restart

The network state of an application is defined by the collection of the network states of its communication endpoints. From the application's standpoint, the primary abstraction of a communication endpoint is a socket. A socket is associated with a network protocol upon creation. The application can bind a socket to an address, connect to an address, accept a connection, and exchange data. The operating system, in turn, keeps a certain amount of state for each socket. The network-state checkpoint-restart is responsible for capturing and restoring this state.

The state of a socket has three components: socket parameters, socket data queues, and protocol specific state. The socket parameters describe socket properties related to its state, e.g., connected or not, and to its behavior, e.g., blocking or non-blocking I/O. The data queues, specifically the send and receive queues, hold outgoing and incoming data, respectively, which is handled by the network layer. Protocol specific data describes internal state held by the protocol itself; for instance, TCP connection state and TCP timers are part of its state.

Saving the state of the socket parameters is fairly straightforward. Recall that while taking a network-state checkpoint, the processes in the pod are suspended and cannot alter the socket state. Also, the network is blocked, and is only restarted later, after all the applications involved in the checkpoint have completed their local network-state checkpoints. Given that, the socket parameters can be safely extracted at this point. Furthermore, these properties are user-accessible via a standard interface provided by the operating system, namely the getsockopt and setsockopt system calls. We build on this interface to save the socket parameters during checkpoint and restore them during restart. For correctness, the entire set of parameters is included in the saved state (for a comprehensive list of such options, refer to [160]).

The socket's receive and send queues are stored in the kernel. They hold intermediate data that has been received by the network layer but not yet delivered to (read by) the application, as well as data issued by the application that has not yet been transmitted over the network.
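As a minimal illustration of the parameter save and restore described above, the sketch below iterates over a table of options through the standard get/setsockopt interface; the table shown is a small illustrative subset, not the complete set the system saves:

    #include <stddef.h>
    #include <sys/socket.h>

    struct saved_opt { int level; int name; int value; };

    static struct saved_opt opts[] = {
        { SOL_SOCKET, SO_REUSEADDR, 0 },
        { SOL_SOCKET, SO_KEEPALIVE, 0 },
        { SOL_SOCKET, SO_OOBINLINE, 0 },
        { SOL_SOCKET, SO_SNDBUF,    0 },
        { SOL_SOCKET, SO_RCVBUF,    0 },
    };

    /* Extract each option value during checkpoint. */
    void save_sock_params(int fd)
    {
        for (size_t i = 0; i < sizeof(opts) / sizeof(opts[0]); i++) {
            socklen_t len = sizeof(opts[i].value);
            getsockopt(fd, opts[i].level, opts[i].name,
                       &opts[i].value, &len);
        }
    }

    /* Reapply each option value during restart. */
    void restore_sock_params(int fd)
    {
        for (size_t i = 0; i < sizeof(opts) / sizeof(opts[0]); i++)
            setsockopt(fd, opts[i].level, opts[i].name,
                       &opts[i].value, sizeof(opts[i].value));
    }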

With unreliable protocols, it is normally not necessary to save the state of the queues. Packet loss is an expected behavior and must be accounted for by the application: a specific segment of data that is not restored can be interpreted as a legitimate packet loss. One exception, however, is if the application has already "peeked" at (that is, examined but not consumed) the receive queue. This is a standard feature in most operating systems and is regularly used. To preserve the expected semantics, the data in the queue must be restored upon restart, since its existence is already part of the application's state.

With reliable protocols, on the other hand, the queues are clearly an integral part of the socket state and cannot be dispensed with. Consequently, we chose to have our scheme always save the data in the queues, regardless of the protocol in question. The advantage is that this avoids introducing artificial packet loss that would otherwise slow down the application shortly after its restart, for the amount of time it takes to detect the loss and recover by retransmission.

In both cases (reliable and unreliable protocols), in-flight data can be safely ignored. Such data will either be dropped (for incoming packets) or blocked (for outgoing packets) by the network layer, since the pod's network is blocked for the duration of the checkpoint. With unreliable protocols this is obviously expected behavior. Reliable protocols will eventually detect the loss of the data and retransmit it.

Saving the state of the receive queue and the send queue necessitates a method to obtain their contents. It is critical that the method be transparent and not entail any side effects that may alter the contents of the queues. Should a queue be altered, it would be impossible to perform error recovery if the checkpoint operation has to be rolled back due to an error at a later stage. Moreover, if the intent is simply to take a snapshot of the system, a destructive method is entirely inadequate, as it would adversely affect the application's execution after the snapshot is taken.

One method to obtain the contents of the receive queue is to use the read system call in the same way as applications do, leveraging the native kernel interface to read directly from a socket. To avoid draining the data off the queue, this can be done in "peek" mode, which examines the data but does not consume it. Unfortunately, this technique is incomplete and will fail to capture all of the data in the network queues with TCP, including crucial out-of-band, urgent, and backlog queue data.

Another approach is to examine the socket directly and read the relevant data by traversing the socket buffers at a low level. However, the receive queue is of an asynchronous nature and is tightly integrated with the implementation of the TCP/IP protocol stack. Reading the chain of buffers requires a deep understanding of the relevant kernel mechanisms as well as interpretation of protocol specific information. The result is a prohibitively intricate and non-portable approach.

To get around this, we adopt the approach described below, which handles the restore of a socket's receive queue. In particular, we read the data off the socket using the standard read system call, while at the same time injecting it back into the socket. The data ends up attached to the socket as if it had just been restored. Effectively this means that even though the receive queue was modified, the application is still guaranteed to read this data prior to any new data arriving from the network, as with other restored data.

The kernel does not provide interfaces to insert data into the receive queue, and doing so requires intimate knowledge of the underlying network protocol. This difficulty is overcome by observing that it is sufficient that the application consume the restart data before any newer data that arrives on the socket. We therefore allocate an alternate receive queue in which this data is deposited. We then interpose on the socket interface calls to ensure that future application requests will be satisfied

with this data first, before access is made to the main receive queue. Clearly, the checkpoint procedure must save the state of the alternate queue, if applicable (e.g., if a second checkpoint is taken before the application reads its pending data).

Technically, interposition is realized by altering the socket's dispatch vector. The dispatch vector determines which kernel function is called for each application

interface invocation (e.g., open, write, read, and so on). Specifically, we interpose on the three methods that may involve the data in the receive queue: recvmsg, poll, and release. Interposition only persists as long as the alternate queue contains data; when the data is depleted, the original methods are reinstalled to avoid incurring overhead for regular socket operation.
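The following sketch suggests how such interposition might look against a Linux 2.6-era struct proto_ops. It is illustrative rather than DejaView's actual code: alt_queue_has_data, alt_queue_recvmsg, restored_poll, and restored_release are hypothetical helpers, and the single global copy of the ops table (rather than one per socket) is a simplification:

    #include <linux/net.h>

    extern int alt_queue_has_data(struct socket *sock);
    extern int alt_queue_recvmsg(struct socket *sock, struct msghdr *msg,
                                 size_t len, int flags);
    extern unsigned int restored_poll(struct file *file, struct socket *sock,
                                      struct poll_table_struct *wait);
    extern int restored_release(struct socket *sock);

    static const struct proto_ops *orig_ops;  /* simplified: one socket */
    static struct proto_ops interposed_ops;

    static int restored_recvmsg(struct kiocb *iocb, struct socket *sock,
                                struct msghdr *msg, size_t len, int flags)
    {
        /* Serve restart data from the alternate queue first. */
        if (alt_queue_has_data(sock))
            return alt_queue_recvmsg(sock, msg, len, flags);

        /* Depleted: reinstall the original methods, then fall through. */
        sock->ops = orig_ops;
        return orig_ops->recvmsg(iocb, sock, msg, len, flags);
    }

    void interpose_socket(struct socket *sock)
    {
        orig_ops = sock->ops;
        interposed_ops = *orig_ops;          /* copy the dispatch vector */
        interposed_ops.recvmsg = restored_recvmsg;
        interposed_ops.poll    = restored_poll;     /* analogous wrappers */
        interposed_ops.release = restored_release;
        sock->ops = &interposed_ops;         /* install interposed vector */
    }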

Interposing on recvmsg is required in order to use the alternate queue, rather than the original queue, as the source of the data. The poll method is included since it provides asynchronous notification and probing functionality by examining the receive queue. Finally, the release method is important to properly handle cleanup (in case the data has not been entirely consumed before the process terminates).

Extracting the contents of the send queue is more involved than the receive queue, as there is no standard interface we can leverage that provides access to that data. Instead, the data is accessed by inspecting the socket's send queue using the standard in-kernel interface to the socket layer (which is normally used by protocol code and device drivers). This is accomplished without altering the state of the send queue itself. While the receive queue is tightly coupled to the protocol specifics and roughly reflects the random manner in which packets arrived, the send queue is better organized, following the sequence of data send operations issued by the application. For this reason, unlike with the receive queue, reading the contents of the send queue directly from the socket buffers remains a relatively simple and portable operation.
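A sketch of this extraction using the in-kernel skb queue interface might look as follows (Linux 2.6-era names, simplified, with locking and fragmented-buffer details elided):

    #include <linux/skbuff.h>
    #include <net/sock.h>

    /* Copy the send-queue contents into buf without unlinking any buffer,
     * leaving the queue itself untouched. Returns the number of bytes saved. */
    size_t save_send_queue(struct sock *sk, char *buf, size_t max)
    {
        struct sk_buff *skb;
        size_t off = 0;

        skb_queue_walk(&sk->sk_write_queue, skb) {
            if (off + skb->len > max)
                break;
            skb_copy_bits(skb, 0, buf + off, skb->len);
            off += skb->len;
        }
        return off;
    }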

Finally, restoring the state of the send queue is almost trivial: given the re-established connection, the data is re-sent by means of the standard write system call. The underlying network layer will take care of delivering the data safely to the peer socket. In the case of migration, a clever optimization is to redirect the contents of the send queue to the receiving pod and merge it with (or append it to) the peer's stream of checkpoint data. Later, during restart, the data is concatenated to the alternate receive queue (of course, only after the latter has been restored). This eliminates the need to transmit the data twice over the network: once when migrating the original pod, and again when the send queue is processed after the pod resumes execution. Instead, both are merged into a single transfer from the source pod to the destination pod.

We now discuss how the protocol specific state is saved and restored. The portion of this state that records the protocol properties is exported to the socket layer and can be accessed by applications. TCP options that activate and deactivate keep-alive timers (TCP_KEEPALIVE) and the semantics of urgent data interpretation (TCP_STDURG) are two such examples. The saved state includes the entire set of these options, and they are handled in a similar way to the socket options discussed before ([160] contains a representative list of such options).

The remainder of the protocol specific state is internal, and holds dynamic operational data. Unlike accessing the socket layer, which is common practice in kernel modules, access to the protocol's state requires intimate knowledge of its internals. Restoring such state entails a carefully handcrafted imitation of the protocol's behavior. If a portable solution is sought, it is highly desirable to identify the minimal state that must be extracted and restored. As discussed before, the minimal state for unreliable protocols is nil, owing to their nature. We now discuss reliable protocols.

With reliable protocols, the internal state typically keeps track of the dynamics of the connection to guarantee delivery of messages. Data elements are tagged with sequence numbers, and the protocol records the sequence number of data that has been transmitted but not yet acknowledged. Timers are deployed to trigger retransmission of unacknowledged data (on the presumption that it has been lost), and to detect broken connections. Each peer in a connection tracks three sequence numbers: last data sent (sent), last data received (recv), and last data acknowledged by the other peer (acked).

[Diagram: two panels, each showing a receive queue (spanning from recv begin to recv) above the peer's send queue (spanning from acked to sent) along a common byte-sequence axis. (A) Non-overlapping queues: the send queue starts where the receive queue ends. (B) Overlapping queues: the head of the send queue overlaps the tail of the receive queue.]

Figure 8.2 – Non-overlapping and overlapping data queues

An important property of reliable protocols is the following invariant: recv1 ≥ acked2 (where the subscripts 1 and 2 designate the two peers of the connection). The reason is that upon receiving data, a peer updates its recv value and sends an acknowledgment. Unless the acknowledgment is lost, it will arrive after some small delay, and then the other peer will update its acked value. It follows that a send queue always holds the data between acked and sent—that is, the unacknowledged data.

The receive queue holds data from some point back in time until recv. If recv1 > acked2, there will be some overlap between the two queues. This setting is depicted in Figure 8.2. The overlap must be fixed during the restart operation, before the application consumes duplicate data. This can be done by discarding the extraneous data from either queue; it is more advantageous to discard that of the send queue, to avoid transferring it over the network.

We claim that a necessary and sufficient condition to ensure correct restart of a connection is to capture the recv and acked values on both peers. This data, along with additional protocol specific information, is located in a protocol control block (PCB) data structure associated with every TCP socket. While the PCB concept is ubiquitous in TCP stacks, the details of its layout differ between distinct implementations. It follows that the need to access these fields does not impair the portability of our scheme, but merely requires a trivial adjustment per implementation.

We now show that the minimal state we need to extract sums up to the aforementioned sequence numbers. Given the discussion above, these values are necessary in order to calculate the extent of the redundant data to be discarded. They are sufficient since the remainder of the data is in the socket queues and is already handled as described above. It follows that our approach results in a network state checkpoint-restart solution that is almost entirely independent of the transport layer protocol. It is optimal in the sense that it requires no state from unreliable protocols, and only minimal state from reliable transport protocols—the portion of the state that reflects the overlap between a send queue and the corresponding receive queue.
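A worked sketch of this overlap computation follows; it is illustrative only, and the modular comparison reflects the fact that real TCP sequence numbers wrap around:

    #include <stdint.h>
    #include <stddef.h>

    /* Bytes to discard from the head of the local send queue: data the
     * peer has already received (recv_peer) beyond what we saw acked
     * (acked_local). If recv_peer <= acked_local, there is no overlap. */
    size_t send_queue_discard(uint32_t recv_peer, uint32_t acked_local)
    {
        int32_t delta = (int32_t)(recv_peer - acked_local);
        return delta > 0 ? (size_t)delta : 0;
    }

For example, if the sender's saved acked value is 35 while the receiver's saved recv value is 41, the first 6 bytes of the saved send queue duplicate data already present in the peer's receive queue and are discarded before the send queue is re-sent.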

Some applications employ a timeout mechanism on top of the native protocol, as a common technique to detect soft faults and deadlocks, or to expire idle connections. It is also used to implement reliable protocols on top of unreliable ones (e.g., over UDP). The application typically maintains a timestamp for each connection, updating its value whenever there is activity on the connection. Timestamps are inspected periodically, and the appropriate action is triggered if the value is older than a predefined threshold. It follows that if there is sufficient delay between the checkpoint and the restart, certain applications may experience undesired effects if a timer value exceeds the threshold and expires.

We resolve this by virtualizing the system calls that report time. During restart, we compute the delta between the current time and the time recorded during checkpoint. Responses to subsequent time inquiries are then biased by that delta. Standard operating system timers owned by the application are also virtualized: at restart, their expiry time is adjusted in a similar manner, by the delta between the original clock and the current one. We note that this sort of virtualization is optional, and can be turned on or off per application as necessary (so that applications that strictly require knowledge of the absolute time can operate normally).

ZapC can transparently checkpoint and restart the network state of the TCP, UDP, and raw IP protocols, and therefore any distributed application that leverages these widely used protocols. However, some high performance clusters employ MPI implementations based on specialized high-speed networks, where it is typical for the MPI libraries to bypass the operating system kernel and directly access the device using a dedicated communication library; Myrinet combined with the GM library [115] is one such example. The ZapC approach can be extended to work in such environments if two key requirements are met. First, the library must be decoupled from the device driver instance, by virtualizing the relevant interface (e.g., interposing on the ioctl system call and device-dependent memory mappings). Second, there must be some method to extract the state kept by the device driver, as well as to reinstate this state on another such device driver.
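Returning briefly to the time virtualization described above, a minimal user-space-flavored sketch of the biasing is shown below; it is illustrative only, as the prototype interposes on time queries at a lower level:

    #include <sys/time.h>

    static struct timeval time_bias;   /* restart_time - checkpoint_time */

    /* Compute the delta at restart, given the time saved at checkpoint. */
    void compute_bias(const struct timeval *checkpoint_time)
    {
        struct timeval now;
        gettimeofday(&now, NULL);
        timersub(&now, checkpoint_time, &time_bias);
    }

    /* Wrapper applied to time queries from restarted processes: the
     * suspended interval is hidden by subtracting the bias. */
    void virtualized_gettimeofday(struct timeval *tv)
    {
        gettimeofday(tv, NULL);
        timersub(tv, &time_bias, tv);
    }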

8.4 Evaluation

We have extended DejaView's checkpoint-restart component to implement a ZapC prototype as a Linux kernel module and associated user-level tools. Our prototype runs on multiple Linux operating system versions, including the Linux 2.4 and 2.6 series kernels. We present experimental results on various size clusters, with both uniprocessor and multiprocessor nodes, running real applications to quantify ZapC's virtualization overhead, checkpoint and restart latencies, and the resulting checkpoint image sizes.

To measure ZapC performance, we used four distributed applications that use MPI (version MPICH-2 [71]) and PVM (version 3.4), representing a range of communication and computational requirements typical of scientific applications. One of the applications used PVM while the rest used MPI. Each pod is seen as an individual node, so each pod runs one of the respective daemons (mpd or pvmd). The only configuration needed to run more than one pod on a physical machine is to specify a default ssh port for each pod in .ssh/config, which the daemon uses to initiate contact with other pods (a sketch of such a configuration appears after the list below). The applications tested were:

1. CPI — a parallel calculation of Pi provided with the MPICH-2 library; it uses basic MPI primitives and is mostly computationally bound.

2. BT/NAS [11] — the Block-Tridiagonal systems (BT) benchmark from the NAS parallel benchmarks, which involves substantial network communication alongside the computation.

3. PETSc [12] — a scalable package of PDE solvers commonly used by large-scale applications; in particular, the Bratu (SFI – solid fuel ignition) example, which uses distributed arrays to partition the problem grid, with a moderate level of communication.

4. POV-Ray [137] (PVM version) — a CPU-intensive ray-tracing application that fully exploits cluster parallelism to render three-dimensional graphics.
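For illustration, such an ssh configuration might look as follows; the host names and port numbers are hypothetical:

    # ~/.ssh/config: one entry per pod co-located on the same machine
    Host pod1
        HostName node1.cluster
        Port 2201
    Host pod2
        HostName node1.cluster
        Port 2202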

The measurements were conducted on an IBM HS20 eServer BladeCenter, a modest-sized cluster with ten blades available. Each blade had dual 3.06 GHz Intel Xeon CPUs, 2.5 GB RAM, a 40 GB local disk, and a Q-Logic Fibre Channel 2312 host bus adapter. The blades were interconnected with a Gigabit Ethernet switch and connected through Fibre Channel to an IBM FastT500 SAN controller with an Exp500 storage unit holding ten 70 GB IBM Fibre hard drives. Each blade used the GFS cluster file system [155] to access the shared SAN.

We measured the applications running across a range of cluster configurations. We ran the ZapC Manager on one of the blades and used the remaining nine blades as cluster nodes for running the applications. We configured each cluster node as a uniprocessor node and ran each application except BT/NAS on 1, 2, 4, and 8 nodes. We ran BT/NAS on 1, 4, 9, and 16 nodes because it requires a square number of nodes to execute. We also configured each cluster node as a dual-processor node and ran each of the applications on eight of the nodes; since each processor was effectively treated as a separate node, we refer to this as the sixteen node configuration. Results are presented with each blade running Linux 2.6, specifically Debian Linux with a 2.6.8.1 kernel. Linux 2.4 results were similar and are omitted for brevity.

8.4.1 Virtualization Measurements

We measured ZapC virtualization overhead by comparing the completion times of the applications running in two configurations, which we refer to as Base and ZapC. Base ran each application using vanilla Linux without ZapC, thereby measuring baseline system performance; ZapC ran each application inside ZapC pods. Each execution was repeated five times and the results were averaged over these runs.

Figure 8.3 shows the average completion times of the different benchmarks for different numbers of nodes. The results show that the completion times using ZapC are almost indistinguishable from those using vanilla Linux. Results for larger cluster systems were not available due to the limited hardware available at the time of the experiments. However, the results on a modest cluster size show that ZapC does not impact the performance scalability of the applications, as the relative speedup of the applications running on larger clusters is essentially the same for both vanilla Linux and ZapC. The virtualization overhead was in all cases much smaller than even the variation in completion times for each configuration across different runs. This further validates that DejaView's thin virtualization layer imposes negligible runtime overhead on real applications. The standard deviation in the completion times for each configuration was generally small, but increased with the number of nodes, up to roughly 5%. The variations were largely due to differences in how much work the applications allocated to each node from one execution to another.

8.4.2 Checkpoint-Restart Measurements

We measured ZapC checkpoint-restart performance by running each of the four distributed applications and taking ten checkpoints evenly distributed during each application execution.

[Four panels — (a) CPI, (b) PETSc, (c) POV-Ray, (d) BT/NAS — plotting completion time in seconds against number of nodes (up to 16) for the Base and ZapC configurations.]

Figure 8.3 – Application completion times on vanilla Linux and ZapC

[Plot: checkpoint time in milliseconds (0–500) against number of nodes (up to 16) for CPI, PETSc, POV-Ray, and BT/NAS.]

Figure 8.4 – Distributed application checkpoint times

Each checkpoint was a full checkpoint (i.e., non-incremental) with deferred write. We omitted file system snapshots, pre-quiesce, and pre-snapshots to focus on the performance of saving and restoring only the execution state. We measured checkpoint and restart times for each application, as well as the checkpoint image sizes. Due to the limited number of nodes available, restarts were done using the same set of blades on which the checkpoints were performed.

Figure 8.4 shows the average checkpoint times across all ten checkpoints for each application. This is the time from the invocation of the Manager by the user until all pods have reported "done", as measured by the Manager. The time includes the time to write the checkpoint image of each pod to memory and represents the time that the application needs to be stopped for the checkpoint to occur, which provides an indication of how frequently checkpoints can be done with minimal impact on application completion time. It does not include the time to flush the checkpoint image to disk, which can be done after the application resumes execution and is largely dependent on the bandwidth available to secondary storage.

[Plot: restart time in milliseconds (0–1000) against number of nodes (up to 16) for CPI, PETSc, POV-Ray, and BT/NAS.]

Figure 8.5 – Distributed application restart times

Figure 8.4 shows that checkpoint times are all sub-second, ranging between 100 ms and 300 ms across all four applications. For a given application and a given cluster size, the standard deviation in the checkpoint times ranged from 10% to 60% of the average checkpoint time. For all applications, the maximum checkpoint time was within 200 ms of the respective average checkpoint time. For all checkpoints, the time due to checkpointing the network state, as seen by the respective Agent, was less than 10 ms, which was only 3% to 10% of the average checkpoint time for any application. The small network-state checkpoint time supports the motivation for saving the network state as the first step of the distributed checkpoint, as discussed in § 8.2.

Figure 8.5 shows the restart times for each application, measured as the time it took to restart the respective application from its checkpoint image. This is the time from the invocation of the Manager by the user until all pods have reported "done", as measured by the Manager. The time assumes that the checkpoint image has already been preloaded into memory and does not include the time to read the image directly from secondary storage.

[Plot: checkpoint size in megabytes (0–400) against number of nodes (up to 16) for CPI, PETSc, POV-Ray, and BT/NAS.]

Figure 8.6 – Distributed application checkpoint sizes

To provide a conservative measure of restart time, we restarted from a checkpoint image taken in the middle of the respective application's execution, during which the most extensive application processing takes place. Figure 8.5 shows that restart times are all sub-second, ranging between 200 ms and 700 ms across all four applications. The restart times are longer than the checkpoint times, particularly for POV-Ray, which takes the longest time to restart. Restart times are longer than checkpoint times in part because additional work is required to reconstruct the network connections of the participating pods. The network-state restart time in most cases ranges between 10 ms and 200 ms.

Figure 8.6 shows the size of the checkpoint data for each application running with different numbers of cluster nodes. For a given application and cluster configuration, the checkpoint image size shown is the average over all ten checkpoints of the largest image size among all participating pods. Since the checkpoints of each pod largely proceed in parallel, the checkpoint size of the largest pod provides a more useful measure than the total checkpoint size across all pods. In most cases, the checkpoint size of the largest pod decreases as the number of nodes increases, since the workload assigned to each node also decreases. The checkpoint size for CPI goes from 16 MB on 1 node to 7 MB on 16 nodes, the checkpoint size for PETSc goes from 145 MB on 1 node to 24 MB on 16 nodes, and the checkpoint size for BT/NAS goes from 340 MB to 35 MB on 16 nodes, an order of magnitude decrease. Only POV-Ray has a relatively constant checkpoint size, of roughly 10 MB. These results suggest that the maximum pod checkpoint image size scales down effectively as the size of the cluster increases. This provides good performance scalability for larger clusters, since checkpoint size can be an important factor in the time it takes to read and write the checkpoint image to disk.

For the applications we measured, the checkpoint size of the largest pod was much larger than the portion of the checkpoint size due to network-state data. The size of the network-state data was only a few kilobytes for all of the applications. For instance, in the case of CPI, the network-state data saved as part of the checkpoint ranged from 216 bytes to 2 KB. Most parallel applications are designed to spend significantly more time computing than communicating, given that communication costs are usually much higher than computation costs. Therefore, at any particular point in time, an application most likely has no pending data in its network queues, as pending data is likely to have already been delivered. It follows that the application data largely dominates the total checkpoint data size. Our results show that the size of application data in a checkpoint image can be many orders of magnitude larger than the size of network data.

8.5 Summary

In this chapter, we presented ZapC, a system for transparent coordinated checkpoint-restart of distributed applications on commodity clusters. ZapC provides three key mechanisms. First, it leverages DejaView's checkpoint-restart mechanism to migrate applications across different clusters while utilizing available commodity operating system services. Second, it introduces a coordinated, parallel checkpoint-restart mechanism that minimizes synchronization requirements among different cluster nodes to efficiently perform checkpoint and restart operations. Third, it integrates a network state checkpoint-restart mechanism that leverages standard operating system interfaces as much as possible to uniquely support complete checkpoint-restart of network state in a transport protocol independent manner, for both reliable and unreliable protocols. Our experimental results on a range of distributed scientific applications demonstrate that ZapC incurs negligible overhead and does not impact application scalability. Furthermore, they show that ZapC can provide fast, sub-second checkpoint and restart times for distributed applications. The results also show that network-state checkpoint and restart accounts for a very small part of the overall time, suggesting that our approach can scale to many nodes.

Chapter 9

Desktop Power Management

As IT infrastructures worldwide continue to grow, conserving energy has become increasingly important for corporations, from both the environmental and economic perspectives. Although modern hardware offers power-saving sleep states, in practice many desktops and servers are always kept powered on, even during periods of idle time [6]. Suspending a desktop disrupts ongoing connections and makes it inaccessible over the network; users and administrators are therefore reluctant to suspend machines, because their applications must maintain network connections [5], or because many idle periods are short and scattered [41]. To overcome this barrier, computers need a way to maintain network connectivity while in a standby state. In this chapter, we leverage DejaView's checkpoint-restart mechanism to enable mobility of individual applications, allowing significant energy savings in desktops by migrating networked applications off of sleeping hosts to ensure continuous network presence for them.

Several approaches have been proposed to address this issue, including network proxies, specialized hardware, and rewritten software, but they all have limitations that prevent them from being widely deployed. A network proxy operates on behalf of a sleeping computer by filtering and responding to network protocols and

emulating application behavior. However, the proxy functionality comes at the cost of greater complexity, as it requires developing often complex application-specific stubs. Specialized hardware is not readily available and deployable across an enterprise, and rewriting all network applications is prohibitively expensive, rendering both approaches impractical as well.

Building on DejaView's virtualization and checkpoint-restart tools, we present

NetCont, a system that allows networked applications to maintain continuous network presence even while the computer on which they run is in a standby state. In NetCont, a dedicated server is used to host network state and network applications, allowing idle or unattended computers to enter a low-power suspended state while keeping their connections active and their networked applications available. When a user's computer prepares to suspend, all network state and connections are migrated to a server where they continue to run and provide network presence, until they are migrated back when the user's computer resumes from suspend.

NetCont leverages application virtualization to encapsulate a user's applications within containers, making it possible to migrate individual applications by taking a checkpoint on one computer and restarting it on another. Unlike approaches that employ specialized proxy servers with hand-tailored logic to match specific applications, NetCont operates transparently and relies on the applications themselves to maintain their network presence, by migrating them and allowing them to execute without interruption. Unlike approaches that use virtual machines, NetCont migrates only the state of the networked applications, not the entire operating system instance or the entire desktop.

We have implemented a NetCont prototype in Linux and demonstrate its effectiveness in transparently providing network presence for a range of unmodified real applications. Our results show that NetCont imposes negligible runtime overhead and provides fast migration times without degrading the user's experience, and without any changes to the network infrastructure, computer hardware, applications, libraries, or operating system kernel. NetCont's lightweight operation scales well, and its fast migration times allow it to exploit even short idle times.

9.1 Architecture Overview

NetCont allows idle computers to enter a low-power suspend state while keeping their network-facing applications active and their network services available. To accomplish this, NetCont introduces a dedicated Network Connection Server (NCS) that maintains network presence for sleeping hosts, enabling them to be automatically resumed when their services are required. In addition to maintaining a passive network presence, NetCont combines application virtualization and mobility to quickly migrate active networked applications temporarily to the NCS, where they continue to run while their origin host is asleep. When the host resumes normal operation, the applications quickly migrate back to it. The operation of NetCont is summarized in Figure 9.1.


Figure 9.1 – NetCont operation: (a) All four desktops are in use, running a mix of regular and networked applications (in ellipse and circle containers, respectively), and the NCS is idle. (b) Three desktops are in standby state, and the NCS owns their respective IPs, network state, and networked applications.

A key property of NetCont's support for active networked applications is that it operates at the granularity of individual applications, rather than the entire desktop and operating system instance. Moreover, migration targets only networked applications that have active connections. The execution footprint of individual applications in terms of processor, memory, and network usage is significantly lower than that of an entire machine. This allows NetCont to pack many application instances from multiple sleeping hosts into a single NCS instance. By moving the entire running application, we avoid the need for per-application logic within the NCS, as the unmodified application will continue to execute normally after migration. By skipping non-networked applications, as well as the underlying operating system, we increase the NCS capacity because we reduce both the amount of state stored on it and its load. The combined effect is increased energy savings.

This architecture has four additional useful properties. First, an NCS is easy to install: it runs on commodity hardware and uses a standard operating system. Second, it is efficient and lightweight: a mid-size NCS can typically cater to applications from hundreds of sleeping hosts. Third, it is scalable and can be deployed incrementally by adding more NCS instances as needed. Fourth, it is transparent, and does not require any changes to the user's desktop, its operating system, the applications, or the network infrastructure. NetCont also does not change the user's experience, and is completely transparent to remote network peers.

When a user's computer prepares to transition to low-power state, NetCont performs a number of steps. First, it arranges to transfer its network state to the NCS. The network state includes generic state such as the IP address, ARP table entries, DHCP lease state, and the associated SSID for wireless networks. Second, it registers the set of ports the NCS should monitor for traffic that should cause the NCS to wake the computer from its low-power state. Common examples include ports that are used for remote access to the machine (SSH, RDP, VNC) or for remote management (anti-virus, system scans). Third, it migrates all the active networked applications with their established network connections to the NCS, where they continue their execution seamlessly. Finally, when network migration completes, the user's computer concludes its transition to sleep state.

The NCS impersonates the host to transparently maintain network reachability to the host while the host is asleep. Because the NCS owns the network IP, it leverages the standard network stack to automatically respond to generic traffic such as ARP requests, DHCP lease renewals, and ICMP packets. Because the NCS owns the active network connections, the standard network stack also handles transport layer traffic such as TCP keep-alives and packet retransmissions. Because the NCS hosts the networked applications, which run unmodified, application layer traffic is handled natively by the applications themselves.

The NCS maintains the sleeping computer's network presence in two ways. First, it responds to incoming network traffic as necessary, whether automatically through the network stack, or by letting the corresponding application receive and process the data. For example, remote users will still be able to ping the machine. Second, if incoming traffic is detected on one of the registered ports, it wakes up the sleeping computer through the use of the Wake-on-LAN standard. The NCS ignores all other network packets destined for the sleeping computer by silently dropping them.
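To make the wake-up step concrete, the following user-space sketch (our own illustration, not NetCont code) shows how a Wake-on-LAN "magic packet" can be sent: six 0xFF bytes followed by the sleeping host's MAC address repeated sixteen times, broadcast as a UDP datagram. The use of UDP port 9 (discard) is a common convention.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static int send_wol(const unsigned char mac[6])
    {
        unsigned char pkt[102];
        struct sockaddr_in dst;
        int i, s, on = 1;

        memset(pkt, 0xFF, 6);                 /* synchronization stream */
        for (i = 0; i < 16; i++)              /* MAC repeated 16 times  */
            memcpy(pkt + 6 + i * 6, mac, 6);

        s = socket(AF_INET, SOCK_DGRAM, 0);
        if (s < 0)
            return -1;
        setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));

        memset(&dst, 0, sizeof(dst));
        dst.sin_family = AF_INET;
        dst.sin_port = htons(9);              /* discard port, by convention */
        dst.sin_addr.s_addr = htonl(INADDR_BROADCAST);

        if (sendto(s, pkt, sizeof(pkt), 0,
                   (struct sockaddr *)&dst, sizeof(dst)) < 0) {
            close(s);
            return -1;
        }
        close(s);
        return 0;
    }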

By performing these functions, NetCont ensures that the transition of the computer into sleep state is transparent to remote hosts on the network. The NCS carries out these duties until the host wakes up, e.g., because the user returns to work on it, or because an incoming packet requires its attention. When the host wakes up, NetCont migrates the network state and the active networked applications back to the user's computer.

In addition to waking the machine when traffic is detected on registered ports, NetCont will wake up the sleeping host when networked programs running on the NCS require the use of programs that have remained on the sleeping host, as will be discussed in § 9.2. Additionally, if the host is Internet facing, common attacks such as brute-force SSH password guessing can prevent the computer from sleeping. In these situations, NetCont can migrate the entire SSH service to the NCS, instead of just registering the SSH port, and only wake up the host computer when an SSH login completes successfully.

9.2 Application Containers

At the heart of NetCont's operation is the ability to migrate only the networked applications. However, migration of individual applications belonging to a coherent desktop framework raises two key challenges. First, applications are generally interrelated through process dependencies such as parent, child, and sibling relations, inter-process communication mechanisms such as pipes, and shared resources such as inherited file descriptors. These make it impossible to transparently migrate an individual application in isolation. Second, desktop applications are integrated with the desktop through common services such as the task-bar, clipboard, and the system logger. Furthermore, applications can launch other applications in response to user actions, e.g., a mouse click on a hyperlink in an email client will open the corresponding page in a web browser.

NetCont leverages previous work on Apiary [132, 136] to decompose a user's desktop into multiple application containers: one for applications that will never be migrated, and individual containers for each independently migratable application. Each application container is an independent appliance that provides all the system services that the applications need to execute. Using containers to run the applications decouples the applications from the underlying operating system instance, enabling them to be migrated between the desktop computer and the NCS. By isolating each migratable application into its own container, NetCont provides a fine-grained approach to the migration of the networked applications.

To provide traditional desktop semantics, NetCont integrates these containers at the display and file system levels, while also enabling applications in separate containers to interact with each other through normal inter-process communication mechanisms. NetCont also maintains a traditional desktop experience. Users launch applications from a menu or from within each other, switch among available applications using a task-bar, interact with their applications using standard input devices, and have a single display with integrated window system and clipboard for all the applications. For example, an IM client in a migratable container and a word processor in the non-migrating container look and feel like part of the same consistent desktop, with all normal windowing functions and cut-and-paste operations working seamlessly across all applications.

To accomplish this, NetCont leverages DejaView's virtualization architecture to provide a virtual execution environment, a virtual display system consisting of a virtual display server and a viewer, and a shared network file system that provides a common file system for the containers and can be shared between the desktop host and the NCS. Additionally, NetCont runs one manager daemon on the host, outside of the containers, and one connector daemon within each container to help integrate the containers together correctly.

The use of a virtual display is essential for enabling the migration of only a subset of the desktop's applications. Without it, all applications would share a single display, and migration of any single container without the others would be impossible. The virtual display decouples the display state of each application from the underlying hardware and redirects the display output to the NetCont viewer. The display server in each container maintains all persistent display state. The viewer is simple and stateless.

To the user, NetCont presents a coherent desktop experience with the normal usage metaphors in four dimensions, namely the display, the application launch menu, the task-bar, and inter-application interaction. For the display, NetCont integrates the views from each container into a single coherent view. More specifically, each display provides an alpha channel color for its desktop background and the viewer stacks them in a series of layers where objects higher in the stack obscure elements of objects below. The views are ordered based on the currently used applications. This order changes as the user switches between applications. All the virtual displays use the same resolution and they completely overlap such that windows may appear anywhere on the composited desktop display.

For the launch menu, NetCont integrates the lists of applications that users are able to launch from each container. The viewer collects this information by querying the connector process in each container. The connector process constructs this list within its container but does not draw the menu on screen. Instead, it delegates the data to the viewer, which integrates this information into its own menu. When a user selects a program from the viewer's menu, the viewer instructs the connector to execute the program within its container. To implement a coherent task-bar for switching between applications, the connector also constructs an enumeration of all the applications running inside the container and reports this information to the viewer. The viewer integrates this information into a single task-bar that provides buttons corresponding to application windows. When the task-bar is used to change which window is focused, the viewer instructs the corresponding connector to bring the respective window to the foreground.

Running the applications within separate containers can prevent them from integrating together in a normal manner. By default, programs within separate containers can interact with each other only through standard network communication mechanisms, which could prevent users from using their desktop applications effectively. To solve this problem, NetCont integrates the containers at both the application execution and inter-process communication levels.

NetCont enables an application in one container to launch another application in another container. For example, a user can click on a link appearing in her IM client, located in one migratable container, and have it launch the web browser located in another migratable container. This is implemented by including wrapper programs in each container for programs provided by other containers that are made available for execution. For example, a wrapper for /usr/bin/firefox in each container can, when invoked, delegate the options passed to it to the manager process. The manager is then able to execute the requested command in the appropriate container for that program, causing it to execute in a manner transparent to the user.

The wrapper mechanism also enables networked applications that have been migrated to the NCS to still make use of services provided by applications remaining on the sleeping host computer. For instance, a networked program that is in the process of downloading a file may want to virus-check it when the download completes. When a networked application that has been migrated to the NCS executes the wrapper program, NetCont wakes the host computer and migrates the networked application containers temporarily back to the host so that the program can communicate with the local service. When the program terminates, NetCont migrates the networked application containers back to the NCS and allows the host to suspend itself.
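As a rough illustration of the wrapper mechanism, the sketch below shows what a stand-in for /usr/bin/firefox might look like. The manager socket path and the message format are hypothetical, invented here for the example; NetCont's actual wrapper interface may differ.

    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    #define MANAGER_SOCK "/var/run/netcont-manager.sock"  /* hypothetical */

    int main(int argc, char *argv[])
    {
        struct sockaddr_un addr;
        char msg[4096];
        size_t len = 0;
        int i, s;

        /* Encode the program name and its arguments, NUL-separated. */
        len += snprintf(msg + len, sizeof(msg) - len, "firefox") + 1;
        for (i = 1; i < argc && len < sizeof(msg); i++)
            len += snprintf(msg + len, sizeof(msg) - len, "%s", argv[i]) + 1;

        s = socket(AF_UNIX, SOCK_STREAM, 0);
        memset(&addr, 0, sizeof(addr));
        addr.sun_family = AF_UNIX;
        strncpy(addr.sun_path, MANAGER_SOCK, sizeof(addr.sun_path) - 1);

        if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            perror("connect to manager");
            return 1;
        }
        /* The manager looks up the container that owns "firefox" and
         * launches it there; the wrapper just hands over the request. */
        write(s, msg, len);
        close(s);
        return 0;
    }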

In terms of inter-process communication, NetCont hides the segregation of applications into isolated containers by using the manager process as a proxy for services such as SysV IPC and D-Bus. This enables global desktop state to be shared between containers. For example, to handle the clipboard, each container's connector monitors the local clipboard and delegates updates to that container's clipboard to the manager process. The manager process propagates updates to all the other containers. Similarly, when an application writes to the system logger, the connector intercepts the message inside the container and hands it over to the manager process. The manager process passes it to the main container that hosts the system logger.

9.3 Application Migration

NetCont migrates networked applications by capturing the state of an entire networked container on one host and restoring that state on another host. NetCont's migration can be thought of in terms of two logical components: a stand-alone checkpoint-restart mechanism that can save and restore the non-network application state, and a network checkpoint-restart mechanism that can save and restore the network state, including the existing network connections.

The network state consists of two components: host network state and network connections state. Host network state includes link layer and network layer states, such as the host IP address, ARP table entries, DHCP lease state, and routing information. This state is global to the entire host and independent of the applications. Network connections state includes transport layer state, i.e., the state related to established network connections between applications and their peers. The state of the session, presentation, and application layers need not be explicitly saved and restored, since it is part of the application logic, which is included in the stand-alone state.

Because the network state includes the computer's IP address, the operation of NetCont is constrained to occur only within a single routing domain of the network, i.e., a single IP subnet. For simplicity, we assume a shared storage infrastructure between the computers and the NCS. In this case, file system state is generally not saved and restored as part of the container checkpoint image, which reduces the checkpoint image size.

There are two key challenges when migrating networked applications with established connections between different hosts. First, it is necessary to replicate the network and protocol stack states on the destination host. In particular, the state of all established connections must be explicitly recorded during checkpoint and restored during restart. This is quite different from the migration of virtual machines, in which the entire operating system instance is moved as a whole, implicitly including the network state. Saving the network state explicitly is hard because this state is tightly coupled to the underlying operating system and the specifics of the kernel's networking subsystem.

A naive approach to restoring the network state is to manually craft the internal kernel data structures, populate them with suitable data, and carefully adjust the protocol stack state. However, this approach is complex and requires intimate knowledge of, integration with, and adjustments to the networking subsystem and protocol stack implementation [82, 121]. Instead, NetCont takes a simpler approach that decouples the reconstruction of the connection from the details of the underlying implementation by reusing existing standard operating system interfaces as much as possible.

To understand how this works, we first observe that the state of every network connection can be divided into two pieces: a base connection state that includes the permanent attributes of the connection, such as the protocol type, source and destination addresses, and the send and receive buffer sizes; and a dynamic connection state that reflects the transient attributes of the connection, including the data stored in the send and receive buffers and protocol-specific state such as TCP sequence numbers, timestamps, and transmit window size, to name a few.

Building on this observation, NetCont divides the network restore work into two steps. For each saved connection, NetCont first restores the base state by creating a new connection with the desired static attributes. This is done in user-space using standard operating system interfaces. Next, for each connection, NetCont restores the dynamic state by reinstating the respective transient attributes of the connection. This is performed in the kernel, since it may require direct changes to some in-kernel protocol-specific state.

A second challenge when migrating networked applications with established connections is that any communicating peers connected to the applications do not cooperate with the migration and must remain unaware of it. Thus, the migration must occur in a manner that is both atomic and transparent to the peers. Atomicity ensures that the peers observe a consistent network state on a single host, rather than state split between the origin and the destination host. Transparency ensures that peers may continue to run normally without requiring any special action post-migration.

To satisfy the requirement for atomicity, NetCont blocks the network traffic to and from the origin and destination hosts for the duration of the migration (but permits direct communication between them). Blocking outgoing packets prevents state changes or partial state from becoming visible to the peers prematurely. Dropping incoming packets prevents the migrating network state from changing while it is being saved or restored.

To satisfy the requirement for transparency, NetCont relies on the ARP protocol, which maps IP addresses to link layer hardware addresses. Similar to other live migration approaches [35, 82, 147], NetCont generates an unsolicited ARP reply from the destination host once the migration completes successfully, to update the ARP cache of the peers (or the nearest intermediate switch or router), informing them of the new mapping.

9.3.1 Checkpoint and Restart Overview

We now describe the algorithms for checkpoint and restart used by NetCont to migrate networked applications between the user's desktop and the NCS. Both algorithms handle the network state before the container (and applications) state. Doing so enables more concurrent operation by overlapping the dominant component of the operation, namely the stand-alone restart time, with the time it may take the network layer to recover from the temporary blockage. Re-enabling network traffic at the earliest point, as soon as the network state is restored, lets the network retransmit pending data in the queues and acknowledge incoming packets concurrently with the stand-alone restart. This not only reduces the possibility of undesired protocol timeouts, but also enables the protocol's flow control mechanism to get back up to speed.

Checkpoint: To checkpoint the networked applications on the origin host, NetCont executes Algorithm 9.1. The algorithm suspends the container to prevent the applications' states from being altered during the checkpoint. It also suspends the network activity to and from the host to prevent the network state from changing, by leveraging standard network packet filtering frameworks such as netfilter [118] to block the host's network address(es) entirely. Once the container and the network are suspended, NetCont saves the host's network state. It first saves the base connection state of each established connection on the host, followed by the dynamic connection state of each connection. After saving the network state, NetCont performs the stand-alone checkpoint, and thereafter the container is shut down and destroyed.

Algorithm 9.1: Checkpoint
 1  BlockNetwork(IP)                /* suspend network */
 2  SuspendContainer()
 3  foreach connection conn do      /* network state */
 4      SaveConnectionBase(conn)
 5      SaveConnectionDynamic(conn)
 6  end
 7  SaveContainer(cont)             /* standalone state */
 8  DeleteContainer(cont)

Algorithm 9.2: Restart
 1  BlockNetwork(IP)                /* suspend network */
 2  foreach connection conn do      /* network state */
 3      RestoreConnectionBase(conn)
 4      RestoreConnectionDynamic(conn)
 5  end
 6  UnBlockNetwork(IP)              /* resume network */
 7  CreateContainer(cont)           /* standalone state */
 8  RestoreContainer(cont)

Restart: To restart the applications on the destination host, NetCont executes Algorithm 9.2, which follows a flow similar to the checkpoint algorithm. The algorithm blocks the host's network to protect against packets that may arrive prior to completely restoring the corresponding connection. If such packets arrive before the connection exists, they will be rejected, leading the peer to drop the connection; if the connection exists but is not yet fully set up, they may trigger an error and fail the connection locally. The network remains disabled until the entire network state is successfully restored. NetCont now restores the host's network state, first the base state followed by the dynamic state of all connections, and re-enables the network. It then creates a new container for the migrated applications, and performs the stand-alone restart to restore the applications into this container. The final step of the procedure is to generate an unsolicited ARP packet, advertising that the network address has moved to a new location and is now bound to a different hardware address.

While blocking the network during a migration means that incoming packets are silently dropped, it is a safe practice because reliable transport protocols such as TCP are designed to retransmit lost packets. However, it may adversely impact performance, since it changes the peer's view of the connection. In TCP, timeouts due to unacknowledged data progressively increase the retransmission timeout and reduce the transmission window size. After a migration, a connection is likely to stall for a considerable period as the peer awaits the next timeout expiry before it resends data. Sending new (or duplicate) data to the peer can alleviate the problem, but the connection will still suffer TCP slow-start behavior before the data transfer rate climbs back up. In this case, not even TCP's fast-retransmit mechanism can help, because it does not apply to timed-out packets.

To avoid this behavior, NetCont exploits TCP's existing flow control mechanisms to inform the peers, prior to migration, that they should not send any more data. To do so, NetCont forces each connection to advertise a window of size zero to the respective peers, ensuring that the peers will not attempt to send more data and, in turn, will not face timeouts due to unacknowledged data. Peers will continue to send probes and may eventually abort the connection, but typical timeouts are orders of magnitude longer than the actual migration time. (This also holds for application-level timeouts, which are likewise much longer than the migration time.) When the migration completes, NetCont re-advertises the original window saved from before the migration, so that the connection can resume under the same conditions as it was left.
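The BlockNetwork and UnBlockNetwork steps of Algorithms 9.1 and 9.2 can be realized with any standard packet filtering interface. The sketch below (a simplified illustration, not NetCont's implementation) shells out to the iptables command to drop all traffic to and from the migrating address, while first whitelisting the migration peer so that checkpoint data can still flow between the host and the NCS; ncs_ip is an assumed parameter.

    #include <stdio.h>
    #include <stdlib.h>

    static void rule(const char *fmt, const char *addr)
    {
        char cmd[256];

        snprintf(cmd, sizeof(cmd), fmt, addr);
        if (system(cmd) != 0)
            fprintf(stderr, "iptables failed: %s\n", cmd);
    }

    void block_network(const char *ip, const char *ncs_ip)
    {
        /* Permit host <-> NCS traffic first (rules are inserted at the
         * head of the chain), then drop everything else for this IP. */
        rule("iptables -I INPUT -s %s -j ACCEPT", ncs_ip);
        rule("iptables -I OUTPUT -d %s -j ACCEPT", ncs_ip);
        rule("iptables -A INPUT -d %s -j DROP", ip);
        rule("iptables -A OUTPUT -s %s -j DROP", ip);
    }

    void unblock_network(const char *ip, const char *ncs_ip)
    {
        /* Remove the rules installed by block_network(). */
        rule("iptables -D INPUT -s %s -j ACCEPT", ncs_ip);
        rule("iptables -D OUTPUT -d %s -j ACCEPT", ncs_ip);
        rule("iptables -D INPUT -d %s -j DROP", ip);
        rule("iptables -D OUTPUT -s %s -j DROP", ip);
    }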

9.3.2 Base Connection State

In describing how NetCont migrates network connection state, it is convenient to refer to sockets, which are the primary abstraction of communication endpoints from the applications' point of view. A socket is associated with a network protocol upon creation. Applications can bind a socket to an address, connect to an address, listen for incoming connections, and accept new connections. Once a socket is connected, the application can exchange data with the peer through it.

Checkpoint: To checkpoint the base connection state, NetCont iterates over all the sockets in the container and records their static attributes in an array. Entries in the array are tuples of the form ⟨source, target, state, type⟩. The array holds one entry for each socket, including partly established connections, such as TCP connections that have not completed their three-way handshake with their peers. The source and target fields hold the corresponding source and destination network addresses and port numbers. The state field holds static attributes from the socket layer, such as the send and receive buffer sizes and the socket options, and from the protocol layer, such as protocol options and static protocol parameters negotiated with the peer. In this way, the array records the base connection state in a manner that provides enough information to correctly reconstruct all the sockets (i.e., network connections) during restart.

The type field indicates the type of the connection. The possible values are listed in Table 9.1. Sockets of listening type are set to accept incoming connections, so they have only their source field filled. The pending type indicates a connection that was made from a peer to a listening socket, but that has not yet been claimed by an application (e.g., through the accept system call). Finally, the types outgoing and incoming indicate connections that are still in a transient, incomplete state.

State        Description
listening    end-point dedicated to accepting connections
established  established connection
pending      unclaimed established connection
outgoing     transient outgoing connection
incoming     transient incoming connection

Table 9.1 – Network connection states
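For concreteness, a per-socket record matching the tuple described above might look as follows; the field names are our own illustration, not NetCont's actual definitions.

    #include <netinet/in.h>

    enum conn_type {
        CONN_LISTENING,    /* end-point accepting connections      */
        CONN_ESTABLISHED,  /* fully established connection         */
        CONN_PENDING,      /* established but not yet accept()ed   */
        CONN_OUTGOING,     /* transient outgoing (handshake) state */
        CONN_INCOMING,     /* transient incoming (handshake) state */
    };

    struct conn_base_state {
        struct sockaddr_in source;   /* local address and port         */
        struct sockaddr_in target;   /* peer address and port          */
        enum conn_type     type;     /* one of the types in Table 9.1  */

        /* Static socket-layer attributes. */
        int sndbuf_size;             /* send buffer size               */
        int rcvbuf_size;             /* receive buffer size            */
        int sock_options;            /* packed socket options          */

        /* Static protocol-layer attributes negotiated with the peer,
         * e.g., TCP options such as window scaling or SACK. */
        int proto_options;
    };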

Restart: NetCont aims to restore the base connection state of all sockets from user-space. This is accomplished using a novel approach that combines standard socket system calls with a ubiquitous packet filtering framework [118] to reconstruct network connections without the cooperation of the original peers. Instead of relying on the real peer, NetCont introduces a fake peer, co-located on the same host as the socket being restored. It then exploits packet filtering to redirect the network traffic so that outgoing packets to the real peer will reach the fake peer, and incoming packets from the local fake peer will appear to have arrived from the real peer. In this manner, we are able to create sockets whose state indicates that they are connected to the real peers, while in practice they communicate with the local fake socket.

To restore the source address of a socket, NetCont uses the standard bind() socket system call, which associates a socket with a given local address. However, network semantics forbid binding multiple sockets to a single local address; thus, established sockets must be restored before listening sockets. As we describe below, listening sockets are necessary to successfully restore pending sockets. Therefore, NetCont orders the sockets such that it first restores established sockets, followed by listening sockets, and finally pending sockets. The steps to restore the state of established sockets are given in Algorithm 9.3.

First, NetCont creates a temporary socket to listen for incoming connections. The role of this socket is to accept the connections that, while originally destined for the original peers, arrive back at the host after having been redirected.

Algorithm 9.3: RestoreEstablishedConnection
 1  listen ← temporary listening socket
 2  foreach tuple of established connection do
 3      sock ← new socket
 4      AddDNAT(tuple.dst → listen)
 5      bind(sock, tuple.src)
 6      connect(sock, tuple.dst)
 7      dst ← accept(listen)
 8      DelDNAT(tuple.dst → listen)
 9      close(dst)
10  end
11  close(listen)


Next, NetCont iterates over all the tuples that correspond to established sockets. For each tuple, it first installs a suitable destination network address translation (DNAT) packet filtering rule to divert all network traffic destined for the target address to instead reach the listening socket. The network filter will intercept network packets from source to target and rewrite their destination address. NetCont then uses standard socket system calls to create a new socket, bind the new socket to the local source address, and connect it to the target address.

Because of the DNAT rule, the connection request will be delivered to the locally listening socket instead of the original peer, and NetCont can accept this connection locally, thereby creating a temporary socket. Because the DNAT rewrite is transparent to the transport layer, the base connection state of the socket being restored is correctly set, reflecting a connection to the original peer. The DNAT rule and the temporary (accepted) socket are then discarded, as they are no longer needed. Once all the tuples have been processed, the listening socket is discarded, too. The operation of the algorithm is illustrated in Figure 9.2.


Figure 9.2 – Restoring an established connection: socket SRC is to be reconnected to the peer DST. Network traffic is blocked, and a DNAT rule is in place. As we connect the socket to the peer (a), the request is redirected back to the LISTEN socket (b). Once the connection is accepted (c), the DNAT rule is removed and TMP is discarded. Our socket is connected to the peer as desired (d).
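The following user-space sketch shows one iteration of Algorithm 9.3 for a TCP socket. The AddDNAT and DelDNAT steps are realized by shelling out to iptables; error handling is elided, and the helper names and the redirection to a loopback listener are illustrative assumptions rather than NetCont's actual code.

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Add ("-A") or delete ("-D") a rule diverting locally generated
     * traffic for the original peer to a temporary local listener. */
    static void dnat_rule(const char *op, struct sockaddr_in *dst, int lport)
    {
        char cmd[256];

        snprintf(cmd, sizeof(cmd),
                 "iptables -t nat %s OUTPUT -p tcp -d %s --dport %d "
                 "-j DNAT --to-destination 127.0.0.1:%d",
                 op, inet_ntoa(dst->sin_addr), ntohs(dst->sin_port), lport);
        system(cmd);
    }

    /* Recreate one established socket: returns a socket bound to src and
     * "connected" to dst, although the handshake ran against listen_fd. */
    int restore_established(struct sockaddr_in *src, struct sockaddr_in *dst,
                            int listen_fd, int lport)
    {
        int sock, tmp;

        dnat_rule("-A", dst, lport);                 /* AddDNAT */

        sock = socket(AF_INET, SOCK_STREAM, 0);
        bind(sock, (struct sockaddr *)src, sizeof(*src));
        connect(sock, (struct sockaddr *)dst, sizeof(*dst));

        tmp = accept(listen_fd, NULL, NULL);         /* local fake peer */

        dnat_rule("-D", dst, lport);                 /* DelDNAT */
        close(tmp);                                  /* discard fake peer */
        return sock;
    }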

To restore the state of pending connections, NetCont needs to create sockets that correspond to incoming connections appearing to have originated from the real peer. We again use packet filtering to fake suitable incoming connections using temporary sockets and traffic redirection. In this case, packets are rewritten to appear to arrive from the real peer using source network address translation (SNAT), which modifies the source network address of the filtered packets. The SNAT rewrite, much like the DNAT, is transparent to the transport layer, allowing the resulting socket to be set up as desired.

The steps to restore the state of pending connections are given in Algorithm 9.4. (The algorithm is executed after all the listening sockets have been restored and stored in the listeners[] array.)

Algorithm 9.4: RestorePendingConnection
 1  foreach tuple of pending connection do
 2      AddSNAT(tuple.src → tuple.tmp)
 3      listen ← listeners[tuple.src]
 4      dst ← new socket
 5      connect(dst, listen)
 6      DelSNAT(tuple.src → tuple.tmp)
 7      close(dst)
 8  end

NetCont iterates over all the tuples that correspond to pending sockets. For each tuple, NetCont installs a suitable SNAT rule to substitute the source address of packets destined for that listening socket with the real peer address. It then uses standard socket system calls to create a temporary socket and connect it to the listening socket (which was created a priori) that matches the source address indicated by the tuple. Because of the SNAT rule, the connection request will arrive at the listening socket appearing to have originated from the real peer. Note that the connection is intentionally not explicitly accepted, so that it remains unclaimed by the application. The SNAT rule and the temporary socket are discarded as they are no longer needed. The operation is illustrated in Figure 9.3.

9.3.3 Dynamic Connection State

The dynamic state of a connection, which reflects its transient attributes, consists of the data stored in the send and receive buffers, as well as protocol stack parameters. Not all protocol stack parameters need to be included in the saved state. Only parameters that are either visible to the peer or that came from the peer, such as sequence numbers and timestamps of the host and the peer, respectively, are important. In contrast, parameters private to the host can be safely ignored.


Figure 9.3 – Restoring a pending connection: a pending connection from SRC is to be restored. Network traffic is blocked, and an SNAT rule is in place. As we connect a temporary socket (a) to a listening socket, SNAT redirects the request to appear as if it came from SRC (b). Once the connection is established (c) but not yet accepted, the SNAT rule is removed and both TMP and LISTEN are discarded (d).

Examples include parameters that are constantly recalculated, such as TCP's RTT value, TCP timer states, and any internal bookkeeping.

To save and restore dynamic connection state, it is necessary to have access to the respective kernel data structures. Because this state is generally only accessible within the kernel, NetCont relies on a kernel module for such access, similarly to other approaches [82, 97, 113]. NetCont employs techniques similar to theirs for the data stored in the send and receive buffers, peeking at the buffers to save the data and populating them to restore the data. However, while this method can also restore the protocol parameters, it does not work for state that is not specific to a particular socket, such as the global kernel time used to generate TCP timestamps, since such state is also used elsewhere and may not be arbitrarily modified.

To address this challenge in a generic way, NetCont takes a different approach that does not overwrite the internal protocol state of sockets; at its core is the observation that it is the information visible to and from the peer that matters, rather than how it is locally stored. For example, the peer will expect (and generate) TCP timestamps post-migration that are consistent with those from before the migration, regardless of the actual kernel time at the new host. Building on this understanding, we leverage network packet filtering to manipulate packets on the fly to ensure that the protocol state they carry is correct, independently of the underlying dynamic connection state in the kernel.

The mechanism works as follows. Once the base connection state of a socket is restored, NetCont calculates the differences between the saved values and the (arbitrary) new values of the socket's dynamic state. Then, a packet mangling filter is set up for the socket that intercepts all of the socket's traffic. The filter adjusts each intercepted packet by adding the computed deltas to the respective values in the protocol header of that packet. Revisiting the previous example, NetCont computes the delta between the saved kernel time (from before the migration) and the current kernel time, and the filter adds that delta to the timestamp field of every outgoing packet. The method works for other protocol parameters in a similar manner.

In principle, it may also be necessary to mangle incoming packets to adjust state originating from the peer, such as the sequence numbers managed by the peer. However, this can be avoided through packet mangling of incoming packets during the TCP handshake phase only (with the local, fake peer) to set the initial values of such parameters as desired. For instance, the sequence number from the peer can be adjusted in the SYN or SYN-ACK packets so that the value picked by the restored socket is correct and requires no further adjustments. The end result is a robust method for restoring the dynamic connection state of sockets that is more portable and significantly reduces the need for intimate knowledge of the networking subsystem internals.
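To illustrate the mangling filter, the condensed kernel-module sketch below adjusts the TCP sequence number of outgoing packets by a precomputed delta and patches the checksum accordingly. It is written against a 2.6-era netfilter API; the per-connection match, module init/exit boilerplate, and the handling of other fields (e.g., the timestamp option) are omitted, and the names are our own, not NetCont's.

    #include <linux/ip.h>
    #include <linux/netfilter.h>
    #include <linux/netfilter_ipv4.h>
    #include <linux/skbuff.h>
    #include <linux/tcp.h>
    #include <net/checksum.h>

    static u32 seq_delta;   /* saved_seq - new_seq, computed at restart */

    static unsigned int mangle_out(unsigned int hooknum, struct sk_buff *skb,
                                   const struct net_device *in,
                                   const struct net_device *out,
                                   int (*okfn)(struct sk_buff *))
    {
        struct iphdr *iph = ip_hdr(skb);
        struct tcphdr *tcph;
        __be32 old, new;

        if (iph->protocol != IPPROTO_TCP)
            return NF_ACCEPT;
        /* ... match skb against the migrated connection, else accept ... */

        tcph = (struct tcphdr *)((u8 *)iph + iph->ihl * 4);
        old = tcph->seq;
        new = htonl(ntohl(old) + seq_delta);
        tcph->seq = new;

        /* Patch the TCP checksum to account for the rewritten field. */
        inet_proto_csum_replace4(&tcph->check, skb, old, new, 0);
        return NF_ACCEPT;
    }

    static struct nf_hook_ops mangle_ops = {
        .hook     = mangle_out,
        .pf       = PF_INET,
        .hooknum  = NF_INET_LOCAL_OUT,
        .priority = NF_IP_PRI_MANGLE,
    };
    /* installed at restart time with nf_register_hook(&mangle_ops) */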

9.4 Evaluation

We have implemented a NetCont prototype for Linux desktop environments. Using this prototype, we present experimental data that quantifies the performance of NetCont when running a range of desktop applications that use persistent network connections. We focus on quantifying the power consumption, memory requirements, and migration time. We already showed in Chapter 5 that the overhead of the virtual execution environment is quite small.

The prototype consists of a loadable kernel module and a set of user-space utilities for off-the-shelf Linux 2.6 kernels that provide the virtual execution environment and the ability to migrate networked containers between desktops and the NCS. For packet filtering and mangling, we use Linux's netfilter [118], which instruments the IP protocol stack at well-defined points during the traversal of the stack by a packet, and provides hooks to execute desired code at these points.

Name   Application         Description
vnc    Xvnc4viewer 4.1.1   VNC remote display client
irc    BitchX 1.1-4        IRC client
ftp    FTP 0.17-18         FTP client
ssh    SSH 1.5.1p1         Secure shell
vlc    VLC 0.8.6           VLC media player client
web    Firefox 3.5.3       Firefox web browser
mail   Thunderbird 3.0.1   Thunderbird mail client

Table 9.2 – Application scenarios

The measurements were conducted on an IBM HS20 eServer BladeCenter. Each blade had dual 3.06 GHz Intel Xeon CPUs, 2.5 GB RAM, a 40 GB local disk, and Q-Logic Fibre Channel 2312 host bus adapters. The blades were interconnected with a Gigabit Ethernet switch and connected through Fibre Channel to an IBM FastT500 SAN controller with an Exp500 storage unit with ten 70 GB IBM Fibre hard drives. Each blade used the GFS cluster file system [155] to access the shared SAN.

We used the applications listed in Table 9.2. We ran each application alone within a user's standard desktop environment. To do so, we populated a container with a TightVNC server that executed the application. For applications that are non-GUI programs, such as ftp, we first ran an xterm window and then launched the program in it. In all the experiments we used three hosts to represent the user's desktop, the NCS, and the remote service. We refer to them as host, NCS, and remote, respectively. Both the host and the NCS resided in the same physical subnet with access to a shared file system. The remote host resided on a different subnet.

We first quantify the migration of networked containers in terms of migration time and the amount of state transferred. To measure these properties we migrated each application from the host to the NCS and back, and averaged the results over ten iterations. Each individual migration consisted of taking a checkpoint of the applications on the origin host, streaming the checkpoint image to the destination host, and then restarting the application there. Each checkpoint was a full checkpoint (i.e., non-incremental) with deferred write. File system snapshots (and pre-snapshots) were unnecessary, as the shared file system available on both hosts remained untouched during the migration. Pre-quiesce was also omitted, as the goal was to minimize the total checkpoint time rather than the application downtime.

Figure 9.4 shows the average amount of data in megabytes transferred between the host and NCS for all the application scenarios. Nearly the entire migrated state accounted for the container and applications stand-alone state. The amount of network data was negligible in all cases and could not be plotted on the graph. The amount of application state is modest, and varies from several megabytes to under 50 MB. As expected, the amount of state highly depends on the application that was migrated: heavier, GUI-based applications such as Firefox and Thunderbird require more live state to be transferred. The vnc case serves as a base case, since every networked container must contain at least one instance of a VNC server for the display virtualization. It provides an estimate of the memory footprint of an empty container. Subtracting this number from the amount of state transferred for any application provides an estimate of the memory footprint of that particular application.

Figure 9.5 shows the average migration times in milliseconds between the host and the NCS for all the application scenarios. The results demonstrate that NetCont can successfully migrate the network state and persistent connections of networked containers. Moreover, the total migration time for all application scenarios is below 600 ms, well under the time it takes the host to suspend or resume, which is about 2 seconds each on our test machines. In other words, NetCont can evacuate the host fast enough to avoid slowing down the host's suspend operation.


Figure 9.4 – Average container migration sizes

Figure 9.5 also provides a breakdown of the migration time into three components: the prepare time is the time to contact the destination computer and create a container there with a set of processes for the incoming applications; the network time indicates how long it takes to reconstruct the network state (including both the host network state and the network connections state) at the destination; lastly, the transfer time indicates the time to transfer the entire stand-alone container and applications state. The total migration times correlate well with the total amount of state transferred, and are limited by the network bandwidth. The time to prepare for migration is similar for all application scenarios. The time to restore the network connections is negligible compared to the total migration time, which is dominated by the transfer time of the applications state. The origin of the migration becomes free as soon as the data transfer completes, which takes under 400 ms in all cases, lower than the total migration time. Observe that when migrating out, the host is liberated earlier than when bringing the applications back, as the latter must also include the time spent restoring the applications state after the data transfer completes.


Figure 9.5 – Average container migration times


To evaluate the effectiveness of NetCont in yielding meaningful energy savings, we measured its power consumption in several configurations: Regular Host for a regular desktop not running NetCont, NetCont Host for a desktop with NetCont, NCS Idle for an idle NCS (i.e., without containers), NCS Busy for an NCS loaded with twenty containers (we also ensured that its entire memory was allocated and in use), and Suspended Host for a desktop in sleep state. For the measurements we used a Kill-A-Watt P4400 power meter, and recorded the consumption in idle state and in busy state. The results of the power consumption measurements for all configurations are given in Table 9.3. The results show that the power usage during idle times is 95–96 W in all cases, except, of course, the suspended case; the power consumption during busy times was around 110–130 W most of the time, but could reach 150 W.

Name            Description                      Power
Regular Host    Desktop, X session               95 W
NetCont Host    Desktop, X session, a container  96 W
NCS (empty)     NCS (no containers)              95 W
NCS (in use)    NCS with 20 containers           96 W
Suspended Host  Desktop in suspend state         2 W

Table 9.3 – NetCont power consumption

The results demonstrate that the use of virtualization for the applications (with containers), the display, or the file system does not impact the hosts' power consumption. Nor does the number of containers staged on the NCS or its memory usage appear to impact the power drain. These findings suggest that the scalability of the system is not determined by the power consumption of idle containers, but rather by the CPU, network, and particularly memory resources available on the NCS.

Based on the session memory measurements in Figure 9.4, we project that the power load per session on a high-end server being used as an NCS will be minimal. We conservatively estimate that each migrated desktop session will, on average, fit within a 64 MB footprint, as this is significantly more than the totals for the applications listed in Figure 9.4. This means that a server can accommodate at least 16 sessions per GB of RAM installed on it. Therefore a dual Intel quad-core 5570 with 32 GB, which consumes only a little over 300 W under load [67], can conveniently hold networked applications from at least 512 desktops. This means that each session's amortized power consumption will be less than 1 W. This is equivalent to the host's power drain when suspended, and is not only significantly less than the idle power usage, but also comparable to the consumption of the specialized embedded hardware used by Somniloquy [5].

9.5 Summary

In this chapter, we presented NetCont, a system that reduces power usage by enabling mobility of individual desktop applications. Instead of letting hosts sit idle when they are not being actively used, thereby wasting considerable power, active networked applications can be moved to a Network Connection Server (NCS) that lets unmodified applications continue to run exactly as if they had remained on the original host. By consolidating applications from many individual idle machines onto a single NCS, NetCont can enable significant energy savings across the enterprise. The key innovation that makes this possible is the use of transparent checkpoint-restart to migrate the applications and then leverage their built-in know-how, instead of having to rewrite applications and figure out how to separate out their network functionality.

Our experimental results demonstrate that NetCont can significantly reduce the power usage of idle desktops by enabling them to go into a deep sleep state that cuts their power usage by 98%. Migration of networked applications to and from the NCS occurs in sub-second times, significantly less than the time it takes a machine to suspend and resume, and therefore does not impact the availability of services on the machine.

Chapter 10

Checkpoint-Restart in Linux

In Chapter 6 we argued in favor of placing checkpoint-restart functionality in the operating system kernel to transparently support unmodified applications, which is crucial in practice to enable deployment and widespread use. Given its benefits, the kernel-level approach was also adopted by several other projects that aim to provide application checkpoint-restart for Linux [48, 52, 95, 121, 124, 188]. However, none of them is integrated into mainline Linux; instead, they are all implemented as kernel modules or sets of kernel patches. As such, they burden users, because they are cumbersome to install, and developers, because maintaining them on top of quickly changing upstream kernels is a Sisyphean task and development quickly falls behind. In this chapter, we build on the experience garnered with DejaView to introduce an alternative kernel checkpoint-restart implementation aimed at inclusion in the mainline Linux kernel.

We present Linux-CR, an in-kernel implementation of application checkpoint-restart that is transparent, secure, reliable, and efficient, and does not adversely impact the performance or code quality of the rest of the Linux kernel. Linux-CR benefits from several supporting features needed for checkpoint and restart that are already available in the mainline Linux kernel, including the ability to isolate applications inside containers and to selectively freeze applications. Because Linux-CR is purposed for a mainstream operating system, we focus on several important practical aspects. We describe Linux-CR's usage model from a user's point of view and the userspace tools available to users. We introduce a novel mechanism to test whether a checkpointed container is correctly isolated, providing a means to ensure the restartability of checkpoints, which is critical for real-world adoption. We discuss how Linux-CR handles errors while providing sufficient details about them for callers to determine the root cause, and how Linux-CR addresses the security concerns associated with complex and powerful tools such as checkpoint and restart. Finally, we present Linux-CR's abstractions in the kernel and the corresponding kernel API from a developer's point of view.

10.1 Usage

The granularity of checkpoint-restart in Linux-CR is a process hierarchy. A checkpoint begins at a task that is the root of the hierarchy, and proceeds recursively to include all the descendant processes. A checkpoint generates an image that represents the state of the process hierarchy. A restart takes a checkpoint image as input to create an equivalent process hierarchy and restore the state of the processes accordingly. Before a checkpoint begins, and for the duration of the entire checkpoint, all processes in the hierarchy must be frozen. As in DejaView, this is necessary to prevent them from modifying system state while a checkpoint is underway, and thus to avoid inconsistencies in the checkpoint image. Freezing the processes also puts them in a known state (just before returning to userspace), which is useful because it is a state with only a trivial kernel stack to save and restore.

Linux-CR supports two main forms of checkpoint: container checkpoint and subtree checkpoint. The distinction between them depends on whether the checkpointed hierarchy is "self contained" and "isolated". The term "self contained" refers to a hierarchy that includes all the processes that are referenced in it. In particular, it must include all parent, child, and sibling processes, and also orphan processes that were re-parented. The term "isolated" refers to a hierarchy whose resources are only referenced by processes that belong to the hierarchy. For example, open file handles held by processes in the hierarchy may not be shared by processes outside the hierarchy. A key property of hierarchies that satisfy both conditions is that their checkpoints are consistent and reliable, and therefore also restartable.

Container checkpoint operates on process hierarchies that are both isolated and self contained. In Linux, isolated and self-contained hierarchies can be created using namespaces [19], which facilitate the provision of private sets of resources for groups of processes. In particular, the PID namespace can be used to generate a sub-hierarchy that is self contained. A useful management tool for this is Linux Containers [109], which leverages namespaces to encapsulate applications inside virtual execution environments to give them the illusion of running privately.

Checkpointing an entire container ensures that processes inside the container do not depend on processes on the outside. To also ensure that the checkpoint is consistent, the state of shared resources in use by processes in the container must remain unmodified for the duration of the checkpoint. Because the processes in the container are frozen, they may not alter the state. Thus it suffices to require that, at checkpoint time, none of the resources are referenced by processes from the outside. Combined, these properties guarantee that a future restart from a container checkpoint will always succeed. Note that to leverage container checkpoint, users must launch the applications that they wish to checkpoint inside containers.

Subtree checkpoint operates on arbitrary hierarchies. Subtree checkpoints are especially handy since they do not require users to launch their applications in a specific way. For example, casual users that execute long-running computations can simply checkpoint their jobs periodically. However, subtree checkpoints cannot provide the same guarantees as container checkpoints. For instance, because checkpoint iterates over the hierarchy from the top down, it will not reach orphan processes unless it begins with the init(1) process, so orphan processes will not be recreated and restored at restart. Instead, it is the user's responsibility to ensure that dependencies on processes outside the hierarchy and resource sharing from outside the hierarchy either do not exist or can be safely ignored. For example, a parent process may remain outside the hierarchy if we know that the child process will never attempt to access it. In addition, if outside processes share state with the hierarchy, they must not modify that state while the checkpoint takes place.

Linux-CR supports a third form of checkpoint: self-checkpoint. Self-checkpoint allows a running process to checkpoint itself and save its own state, so that it can restart from that state at a later time. It does not capture the relationships of the process with other processes, or any sharing of resources. It is most useful for standalone processes wishing to be able to save and restore their state. Self-checkpoint occurs via an explicit system call and always takes place in the context of the calling process. The process need not be frozen for the duration of the checkpoint. This form of checkpoint is analogous to the fork system call in that the system call may return in two different contexts: one when the checkpoint completes and another following a successful restart. The process can use the return value of the system call to distinguish between the two cases: a successful checkpoint operation returns a "checkpoint ID", which is a non-zero positive integer, while a successful restart operation always returns zero.

10.1.1 Userspace Tools

The userspace tools consist of three programs: checkpoint to take a checkpoint of a process hierarchy, restart to restart a process hierarchy from a checkpoint image, and ckptinfo to provide information about the contents of a checkpoint image. A fourth utility, nsexec, allows users to execute applications inside a container by creating an isolated namespace environment for them. These tools provide the most basic userspace layer on top of the bare kernel functionality. Higher-level abstractions for container management and related functionality are provided by packages like lxc [108] (of Linux Containers) and libvirt [107].

Figure 10.1 provides a practical example that illustrates the steps involved to launch an application inside a container, then checkpoint it, and finally restart it in another container. In the example, the user launches a session with an sshd daemon and a screen server inside a new container. Lines 1–3 create a freezer control group to encapsulate the process hierarchy to be checkpointed. Lines 5–13 create a script to start the processes, and the script is executed in line 15. In line 7 the script joins the freezer control group. Descendant processes will also belong there by inheritance, so that later it will be possible to freeze the entire process hierarchy. Lines 8–10 close the standard input, output, and error streams to decouple the new container from the current environment and ensure that it does not depend on tty devices. In line 15, nsexec launches the script inside a new private set of namespaces. Lines 18–21 show how the application is checkpointed. First, in line 18 we freeze the processes in the container using the freezer control group. We use the checkpoint utility in line 19 to checkpoint the container, using the script's PID to indicate the root of the target process hierarchy. In the example, we kill the application once the checkpoint is completed, and thaw the (now empty) control group.

     1  $ mkdir -p /cgroup
     2  $ mount -t cgroup -o freezer cgroup /cgroup
     3  $ mkdir /cgroup/1
     4
     5  $ cat > myscript.sh << EOF
     6  #!/bin/sh
     7  echo $$ > /cgroup/1/tasks
     8  exec 0>&-
     9  exec 1>&-
    10  exec 2>&-
    11  /usr/sbin/sshd -p 999
    12  screen -A -d -m -S mysession somejob.sh
    13  EOF
    14
    15  $ nohup nsexec -tgcmpiUP pid myscript.sh &
    16
    17  $ PID=`cat pid`
    18  $ echo FROZEN > /cgroup/1/freezer.state
    19  $ checkpoint $PID -l clog -o image.out
    20  $ kill -9 $PID
    21  $ echo THAWED > /cgroup/1/freezer.state
    22
    23  $ restart -l rlog -i image.out
    24
    25  $ ckptinfo -ve < image.out

Figure 10.1 – A simple checkpoint-restart example

checkpoint is completed, and then thaw the (now empty) control group (lines 20–21). We restart the application in line 23. Finally, in line 25 we show how to examine the contents of the checkpoint image using the ckptinfo utility.

The log files provided to both the checkpoint and restart commands are used for status and error reports. Should a failure occur during either operation, the log file will contain error messages from the kernel that carry more detailed information about the nature of the error. Further debugging information can be gained from the checkpoint image itself with the ckptinfo command, and from the restart program by using the “-vd” switches.

Operation    Flag                  Flag description
Checkpoint   CHECKPOINT_SUBTREE    perform a subtree checkpoint
             CHECKPOINT_NETNS      include network namespace state
Restart      RESTART_TASKSELF      perform a self-restart
             RESTART_FROZEN        freeze all the restored tasks after restart
             RESTART_GHOST         indicate a process that is a place-holder
             RESTART_KEEP_LSM      restore saved MAC labels (if permitted)
             RESTART_CONN_RESET    force open sockets to a closed state

Table 10.1 – System call flags (checkpoint and restart)

10.1.2 System Calls

The userspace API consists of two new system calls for checkpoint and restart:

• long checkpoint(pid, fd, flags, logfd)

This system call serves to request a checkpoint of a process hierarchy whose root task is identified by @pid, and pass the output to the open file indicated by the file descriptor @fd. If @logfd is not -1, it indicates an open file to which error and debug messages are written. Finally, @flags determines how the checkpoint is taken, and may hold one or more of the values listed in Table 10.1. On success the system call returns a positive checkpoint identifier. In the future, a checkpoint image may optionally be briefly preserved in kernel memory; the identifier would serve to reference that image. On failure the system call returns a suitable negative error value. In self-checkpoint, where a process checkpoints itself, it is necessary to distinguish between the first return from a successful checkpoint, and a subsequent return from the same system call following a successful restart.

This distinction is achieved using the return value, similarly to fork. Specifically, the system call returns 0 if it is returning from a successful restart. In either case, when the operation fails the return value is a suitable negative error value.

• long restart(pid, fd, flags, logfd)

This system call serves to restore a process hierarchy from a checkpoint image stored in the open file indicated by the file descriptor @fd. It is intended to be called by all the restarting tasks in the hierarchy, and by a special process that coordinates the restart operation. When called by the coordinator, @pid indicates the root task of the hierarchy (as seen in the coordinator's PID-namespace). The root task must be a child of the coordinator. When not called by the coordinator, @pid must remain 0. If @logfd is not -1, it indicates an open file to which error and debug messages are written. Finally, @flags determines how the restart is performed, and may hold one or more of the values listed in Table 10.1.

On success the system call returns in the context of the process as it was saved at the time of the checkpoint. The exact behavior depends on how and when the checkpoint was taken. If the process was executing in userspace prior to the checkpoint, then the restart will arrange for it to resume execution in userspace exactly where it was interrupted. If the process was executing a system call, then the return value will be set to the return value of that system call, whether it completed or was interrupted. For a self-checkpoint, the restart will arrange for the process to resume execution at the first instruction after the original call to checkpoint, with the system call's return value set to 0 to indicate that this is the result of a restart. On failure the return value is a suitable negative error value.
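As an illustration, the sketch below shows a minimal coordinator for a single-process hierarchy. It is not the actual restart utility: the syscall number __NR_restart is hypothetical, image_fd and log_fd are assumed to be open descriptors, and in practice the full process hierarchy and namespaces are recreated first.

    #include <stdio.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define __NR_restart 334   /* hypothetical syscall number */

    int coordinate_restart(int image_fd, int log_fd)
    {
            pid_t pid = fork();

            if (pid == 0) {
                    /* Restarting task: @pid must be 0. On success this call
                     * "returns" into the context saved at checkpoint. */
                    syscall(__NR_restart, 0, image_fd, 0, log_fd);
                    _exit(1);   /* reached only if the restart failed */
            }

            /* Coordinator: @pid names the root task of the hierarchy,
             * which must be a child of the coordinator. */
            if (syscall(__NR_restart, pid, image_fd, 0, log_fd) < 0) {
                    perror("restart");
                    return -1;
            }
            waitpid(pid, NULL, 0);
            return 0;
    }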

10.2 Architecture

The crux of checkpoint-restart is a mechanism to serialize the execution state of a process hierarchy, and to restore the process hierarchy and its state from the saved state. For checkpoint-restart of multi-process applications, not only must the state associated with each process be saved and restored, but the state saved and restored must be globally consistent and preserve process dependencies. Furthermore, for checkpoint-restart to be useful in practice, it is crucial that it transparently support existing applications.

To guarantee that the state saved is globally consistent among all processes in a hierarchy, we must satisfy two requirements. First, the processes must be frozen for the duration of the checkpoint. Second, the resources that they use must not be modified by processes not in the hierarchy. These requirements ensure that the state will not be modified by processes in or outside the hierarchy while the execution state is being saved.

To guarantee that the operation is transparent to applications, we must satisfy two more requirements. First, the state must include all the resources in use by processes. Second, resource identifiers in use at the time of the checkpoint must be available when the state is restored. These requirements ensure not only that all the necessary state exists when restart completes, but also that it is visible to the application as it was before the checkpoint, so that the application remains unaware.

Linux-CR builds on the freezer subsystem to achieve quiescence of processes. This subsystem was created to allow the kernel to freeze all userspace processes in preparation for a full system suspend to disk. To accommodate checkpoint-restart, the freezer subsystem was recently re-purposed to enable freezing of groups of processes, with the introduction of the freezer control group. Freezing processes with the freezer control group is a simpler and more maintainable analog to DejaView's notion of quiescing processes by overloading the semantics of SIGSTOP.

Linux-CR leverages namespaces [19] to encapsulate processes in a self-contained unit that isolates them from other processes in the system and decouples them from the underlying host. Namespaces provide virtual private resource names: resource identifiers within a namespace are localized to the namespace. Not only are they invisible to processes outside that namespace, but they do not collide with resource identifiers in other namespaces. Thus, resource identifiers, such as PIDs, can remain constant throughout the life of a process even if the process is checkpointed and later restarted, possibly on a different machine. Without namespaces, those identifiers might already be in use by other processes in the system. To accommodate checkpoint-restart, namespaces were extended to allow restarting processes to select predetermined identifiers upon the allocation of their resources, so that those processes can reclaim the same set of identifiers they had used prior to the checkpoint.

For simplicity, we describe the checkpoint-restart mechanism assuming container checkpoint. We also assume shared storage (across participating machines), and that the file system remains unmodified between checkpoint and restart. In this case, the file system state is generally not saved and restored as part of the checkpoint image, to reduce checkpoint image size.
As before, available file system snapshot functionality [24, 53, 117] can also be used to provide a checkpointed file system image. We focus only on checkpointing process state; details on how to checkpoint file system, network, and device state are beyond the scope of this chapter.
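For context, the sketch below shows one conventional way userspace can place a child in a fresh set of namespaces using clone(2); this is roughly what the nsexec utility does, although its exact flags and mechanics are not shown here, and myscript.sh is just the placeholder from Figure 10.1.

    #define _GNU_SOURCE
    #include <sched.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static char stack[64 * 1024];   /* child stack for clone(2) */

    static int child_main(void *arg)
    {
            /* Runs as PID 1 of the new PID namespace, with private
             * mount, UTS, IPC, and network namespaces. */
            execl("/bin/sh", "sh", "myscript.sh", (char *)NULL);
            return 1;   /* exec failed */
    }

    int launch_in_container(void)
    {
            int flags = CLONE_NEWPID | CLONE_NEWNS | CLONE_NEWUTS |
                        CLONE_NEWIPC | CLONE_NEWNET | SIGCHLD;
            pid_t pid = clone(child_main, stack + sizeof(stack), flags, NULL);

            if (pid < 0)
                    perror("clone");
            return pid;
    }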

10.2.1 Kernel vs. Userspace

Previous approaches to checkpoint-restart have run the gamut from fully in-kernel to hybrid to fully userspace implementations. Many properties of processes and resources can be recorded and restored in userspace [34, 40]. Ckpt [34] modifies tasks to let them checkpoint themselves. The checkpoint images are in the form of executable files which, when executed, restart the original process. CryoPID [40] also uses an

executable file for the checkpoint image, but relies on /proc information to checkpoint a task. However, neither is able to capture or restore some parts of a process's system state, and both are limited in which applications they support. Some state exists that cannot be recorded or restarted from userspace.

To provide application transparency and allow applications to use the full range of operating system services, we chose to implement checkpoint-restart in the kernel. In addition, an in-kernel implementation is not limited to user-visible APIs such as system calls; rather, it can use the full range of kernel APIs. This not only simplifies the implementation, but also allows use of native locking mechanisms to ensure atomicity at the desired granularity.

Checkpoint is performed entirely in the kernel. Restart is also done in the kernel; however, for simplicity and flexibility, the creation of the process hierarchy is done in userspace. Moving some portion of the restart to userspace is an exception, which is permitted under two conditions: first, it must be straightforward and leverage existing userspace APIs (i.e., not introduce specialized APIs); second, doing so in userspace should bring significant added value, such as improved flexibility. Also, all userspace work must occur before entering the kernel, to avoid transitions in and out of the kernel. For instance, the incentive to do process creation in userspace is that it is simple to use the clone system call to do so, and that it allows great flexibility for restarting processes to do useful work after the process hierarchy is created and before the rest of the restart takes place. Indeed, the entire hierarchy is created before in-kernel restart is performed. Likewise, it is desirable to restore network namespaces in userspace (although, as of the writing of this dissertation, this is still undecided). Doing so will allow reuse of existing userspace network setup tools


that are well understood instead of replicating their high-level functionality inside the kernel. Moreover, it will allow users to easily adjust network settings at restart time, e.g., change the network device or its setup, or add a firewall to the configuration.

10.2.2 Checkpoint and Restart

A checkpoint is performed in the following steps (steps 2–4 are done by the checkpoint system call in the kernel):

1. Freeze the process hierarchy to ensure that the checkpoint is globally consistent.

2. Record global data, including configuration and state that are global to the container.

3. Record the process hierarchy as a list of all checkpointed processes, their PIDs, and relationships.

4. Record the state of individual processes, including credentials, blocked and pending signals, CPU registers, open files, virtual memory, etc.

5. Thaw the processes to allow them to continue executing, or terminate the processes in case of migration. (If a file system snapshot is desired, it is taken prior to this step.)

Checkpoint is done by an auxiliary process, and does not require the collaboration of the processes being checkpointed. This is important since processes that are not runnable, e.g., stopped or traced, would not be able to perform their own checkpoint. Moreover, this can be extended in the future to multiple auxiliary processes for faster checkpoint times of large process hierarchies.
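A rough outline of how the in-kernel portion (steps 2–4) might be organized is sketched below; the function names are illustrative rather than actual Linux-CR symbols, and error paths are collapsed.

    /* Illustrative outline of the checkpoint system call's core,
     * mirroring steps 2-4 above; names are invented. */
    static long do_checkpoint(struct ckpt_ctx *ctx, pid_t pid)
    {
            long ret;

            ret = checkpoint_write_header(ctx);    /* image header */
            if (ret < 0)
                    return ret;
            ret = checkpoint_container(ctx);       /* step 2: global data */
            if (ret < 0)
                    return ret;
            ret = checkpoint_tree(ctx, pid);       /* step 3: hierarchy, PIDs */
            if (ret < 0)
                    return ret;
            ret = checkpoint_all_tasks(ctx);       /* step 4: per-task state */
            if (ret < 0)
                    return ret;
            return checkpoint_write_trailer(ctx);  /* sanity-check trailer */
    }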

Much effort was put into making checkpoint robust, in the sense that if a checkpoint succeeds then, given a suitable environment, restart will succeed too. The implementation goes to great lengths to detect whether checkpointed processes are “non-restartable”. This can happen, for example, when a process uses a resource that is unsupported for checkpoint-restart, in which case that resource will not be saved at all. Even if a resource is supported, it may be temporarily in an unsupported state. For example, a socket that is in the process of establishing a connection is currently unsupported.

A restart is performed in the following steps (step 3 is done by the restart system call in the kernel):

1. Create a new container for the process hierarchy, and restore its configuration and state.

2. Create the process hierarchy as prescribed in the checkpoint image.

3. Restore the state of individual processes in the same order as they were checkpointed.

4. Allow the processes to continue execution. (It is also possible to freeze the restarted processes, which is useful for debugging, for example.)

Restart is managed by a special coordinator process, which supervises the operation but is not a part of the restarted process hierarchy. The coordinator process creates and configures a new container, and then generates the new process hierarchy in it. Once the hierarchy is ready, all the processes execute a system call to complete the restart of each process in-kernel. To produce the process hierarchy, it is necessary to preserve process dependencies, such as parent-child relationships, threads, process groups, and sessions. The restored hierarchy must satisfy the same constraints imposed by process dependencies

at checkpoint. Because the process hierarchy is constructed in userspace, these dependencies must be established at process creation time (to leverage the existing system call semantics). The order in which processes are created is important, because some dependencies are not reflected directly in the hierarchical structure. For instance, an orphan process must be recreated by a process that belongs to the correct session group to correctly inherit that group. Linux-CR leverages the DumpForest and MakeForest algorithms presented in § 6.5 to reconstruct a process hierarchy that is equivalent to the original one at the time of checkpoint.

In rebuilding the process hierarchy, there are two special cases of PIDs referring to terminated processes that require additional attention. One case is when the PID of a dead process is used as the PGID of another process. In this case, the restart algorithm creates a “ghost” process that serves as a place-holder: it lives long enough for its PID to be used as the PGID of another process, but terminates once the restart completes (and before the hierarchy may resume its execution, to avoid races).

Another case is when a PID represents a zombie process that has exited but whose state has not yet been cleaned up. In this case, the restart algorithm creates a process, restores only minimal state such as its exit code, and finally the process exits to become a zombie. Because the process hierarchy is created in userspace, the restarting processes have the flexibility to do useful work before eventually proceeding with the in-kernel restart. For instance, they might wish to create a new custom networking route or filtering rule, create a virtual device that existed on the host at the time of checkpoint, or massage the mounts tree to mask changes since checkpoint.

Once the process hierarchy is created, all the processes invoke the restart system call and the remainder of the restart takes place in the kernel. Restart is done in the same order that processes were checkpointed. The restarting processes now wait for their turn to restore their own state, while the coordinator orchestrates the restart. Restart is done within the context of the process that is restarted. Doing so allows us to leverage the available kernel functionality that can only be invoked from within that context. Unlike checkpoint, which requires observing process state, restart is more complicated, as it must create the necessary resources and reinstate their desired state. Being able to run in process context and leverage available kernel functionality to perform these operations during restart significantly simplifies the restart mechanism.

In the kernel, the behavior of the restart system call depends on the caller. The coordinator first creates a common restart context data structure to share with all the restarting processes, and waits for them to become properly initialized. It then notifies the first process to begin the restart, and waits for all the restarting tasks to finish. Finally, the coordinator notifies the restarting tasks to resume normal execution, and then returns from the system call.

Correspondingly, restarting processes first wait for a notification from the coordinator that indicates that the restart context is ready, and then initialize their state. Then, each process waits for its turn to run, restores its state from the checkpoint image, notifies the next restarting process to run, and waits for another signal from the coordinator indicating that it may resume normal execution. Thus, processes may only resume execution after all the processes have successfully restored their state (or fail if an error has occurred), to prevent processes from returning to userspace prematurely before the entire restart completes.
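The turn-taking just described can be pictured with the following pseudo-C sketch; the synchronization primitives and field names are illustrative stand-ins, not the actual Linux-CR implementation.

    /* Illustrative flow of one restarting task inside the restart
     * syscall; wait_for_event/wait_for_turn/wake_next_task are
     * stand-ins for the real synchronization primitives. */
    static long restart_task(struct ckpt_ctx *ctx)
    {
            wait_for_event(&ctx->ready);     /* coordinator set up the context */
            init_restart_state(ctx);

            wait_for_turn(ctx);              /* previous task has finished */
            restore_task_state(ctx);         /* read own records from the image */
            wake_next_task(ctx);             /* pass the baton */

            wait_for_event(&ctx->complete);  /* did all tasks restore OK? */
            return ctx->error;               /* 0, or the first error reported */
    }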

10.2.3 The Checkpoint Image

The checkpoint image is an opaque blob (Binary Large OBject) of data, which is generated by the checkpoint system call and consumed by the restart system call. The blob contains data that describes the state of select portions of kernel structures, as well as process execution state such as CPU registers and memory contents. The image format is expected to evolve over time as more features are supported, or as existing features change in the kernel and require their representation to be adjusted. Any changes in the blob's format between kernel revisions will be addressed by userspace conversion tools, rather than by attempting to maintain backward compatibility inside the restart system call.

Internally, the blob consists of a sequence of records that correspond to relevant kernel data structures and represent their state. For example, there are records for process data, memory layout, open files, and pending signals, to name a few. Each record in the image consists of a header that describes the type and the length of the record, followed by a payload that depends on the record type. This format allows userspace tools to easily parse and skim through the image without requiring intimate knowledge of the data. Keeping the data in self-contained records will also be suitable for parallel checkpointing in the future, where multiple threads may interleave data from multiple processes into a single stream.

Records do not simply duplicate the native format of the respective kernel data structures. Instead, they provide a representation of the state by copying those individual elements that are important. One justification is that during restart, one already needs to inspect, validate, and restore individual input elements before copying them into kernel data structures. However, the approach offers three additional benefits. First, it improves image format compatibility across kernel revisions, being agnostic to data structure changes such as reordering of elements, addition or deletion of elements that are unimportant for checkpoint-restart, or even moving elements to other data structures. Second, it reduces the total amount of state saved, since many elements may be safely ignored. Per-process variables that keep scheduler state are one such example. Third, it allows a unified format for architectures that support both 32-bit and 64-bit execution, which simplifies process migration between them.

The checkpoint image is organized in five sections: a header, followed by global data, the process hierarchy, the state of individual processes, and finally a trailer. The header includes a magic number (to identify the blob as a checkpoint image), an architecture identifier in little-endian format, a version number, and some information about the kernel configuration. It also saves the time of the checkpoint and the flags given to the system call. It is followed by an architecture-dependent header that describes hardware-specific capabilities and configuration. The global data section describes configuration and state that are global to the container being checkpointed. Examples include Linux Security Modules (LSM) and network devices and filters. In the future, container-wide mounts may also go here. The process hierarchy section that follows provides the list of all checkpointed processes, their PIDs, and their relationships, e.g., parent-child, siblings, threads, and zombies.
These two sections are strategically placed early in the image for two reasons: first, it allows restart to create a suitable environment for the rest of the restart early on, and second, it allows this to be done in userspace. The remainder of the checkpoint image contains the state of all of the tasks and the shared resources, in the order in which they were reached by the process hierarchy traversal. For each task, this includes state like the task structure, namespaces, open files, memory layout, memory contents, CPU state, signals and signal handlers, etc. Finally, the trailer that concludes the entire image serves as a sanity check.
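As a concrete illustration, a record header along the lines described above could look like the following sketch; the exact field layout and type names in Linux-CR may differ, and the signal record shown is hypothetical.

    #include <linux/types.h>

    /* Sketch of a generic record header: every record starts with its
     * type and total length, followed by a type-specific payload. */
    struct ckpt_hdr {
            __u32 type;   /* record type, e.g. a task, vma, or file record */
            __u32 len;    /* record length in bytes, including this header */
    };

    /* Example payload: a (hypothetical) record for one pending signal. */
    struct ckpt_hdr_signal {
            struct ckpt_hdr h;   /* h.type identifies this record kind */
            __u32 signo;
            __u32 flags;
    };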

The checkpoint-restart logic is designed for streaming to support operation using a sequential access device. Process state is saved during checkpoint in the order in which it needs to be used during restart. An important benefit of this design is that the checkpoint image can be directly streamed from one machine to another across the network and then restarted, to accomplish process migration. Using a streaming model provides the ability to pass checkpoint data through filters, resulting in a flexible and extensible architecture. Example filters include encryption, signature and validation, compression, and conversion between formats of different kernel versions.

10.2.4 Shared Resources

Shared resources may be referenced multiple times, e.g., by multiple processes or even by other resources. Examples of resources that may be shared include file descriptors, memory address spaces, signal handlers, and namespaces. During checkpoint, shared resources will be encountered several times as the process hierarchy is traversed, but their state need only be saved once. To ensure that shared resources are saved exactly once, we need to be able to uniquely identify each resource, and keep track of resources that have already been saved. More specifically, when a resource is first discovered, it is assigned a unique identifier (tag) and registered in a hash-table using its kernel address (at checkpoint) or its tag (at restart) as a key. The hash-table is consulted to decide whether a given resource is a new instance or merely a reference to one already registered. Note that the hash-table itself is not saved as part of the checkpoint image; instead, it is rebuilt dynamically during both checkpoint and restart, and discarded when they complete.

During checkpoint, shared resources are examined by looking up their kernel addresses in the hash-table. If an entry is not found, then it is a new resource: we assign a new tag, add it to the hash-table, and then record its state. Otherwise, the resource has been saved before, so it suffices to save only the tag for later reference. During restart the state is restored in the same order as it was saved originally, ensuring that the first appearance of each resource is accompanied by its actual recorded state. As with checkpoint, saved resource tags are examined by looking them up in the hash-table, and if not found, we create a new instance of the required resource, restore its state from the checkpoint image, and add it to the hash-table. If an entry is found, it points to the corresponding (already restored) resource instance, which is reused instead of creating a new one.
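The save-once logic might be sketched as follows; the objhash_* helpers and field names are invented for illustration.

    /* Illustrative checkpoint-side handling of one shared object:
     * save full state on first sight, only the tag on later sightings. */
    static int ckpt_obj_checkpoint(struct ckpt_ctx *ctx, void *ptr, int type)
    {
            struct ckpt_obj *obj = objhash_lookup(ctx, ptr);

            if (!obj) {
                    obj = objhash_add(ctx, ptr, type);  /* assign a new tag */
                    if (!obj)
                            return -ENOMEM;
                    if (write_obj_state(ctx, obj) < 0)  /* full state, once */
                            return -EIO;
            }
            return write_obj_ref(ctx, obj->tag);        /* later refs: tag only */
    }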

10.2.5 Leak Detection

In order to guarantee that a container checkpoint is consistent and restartable, we must ensure that the container is isolated and that its resources are not referenced by processes outside the container. When container shared resources are referenced from outside, we say that they leak; the ability to detect leaks is a prerequisite for container checkpoint to succeed. Because the shared objects hash-table already tracks shared resources, it also plays a crucial role in detecting resource leaks that may obstruct a future restart. The key idea behind leak detection is to explicitly count the total number of references to each shared object inside the container, and compare them to the global reference counts maintained by the kernel for these objects. If the two counts differ for a certain resource, it must be because of an external (outside the container) reference.

Leak detection begins in a pre-pass that takes place prior to the actual checkpoint, to ensure that there are no external references to container shared objects. In this pass we traverse the process hierarchy as in the actual checkpoint, but do not save the state. Instead, we collect the shared resources into the hash-table and maintain their reference counts. When this phase ends, the reference count of each object in the hash-table reflects the number of in-container references to it. We then iterate through all the objects and compare that count to the one maintained by the kernel. The two counts match if and only if there are no external references. (To be precise, the hash-table count should be one less, because it does not count the reference that the hash-table itself takes.)

Note that this procedure is non-atomic in the sense that the reference counts from the hash-table and the kernel are compared only after all the resources have been collected. This is racy because processes outside the container may modify the state of shared resources, create new ones, or destroy existing ones before the procedure concludes. For instance, consider two processes, one inside a container and the other not, that share a single file descriptor table. Suppose that after the pre-pass collects the file table and the files in it, the outside process opens a new file, closes another (existing) file, and then terminates. At this point, the new file is left out of the hash-table, while the other (closed) file remains there unnecessarily. Moreover, when the pre-pass concludes it will not detect a file table leak, because the outside process exited and the file table is now only referenced inside the container. It will not detect a file leak either, even though the new file may be referenced outside, because the new file has not even been tracked.

To address these races we employ additional logic for leak detection during the actual checkpoint. (Leak detection was first proposed by OpenVZ [121], but their logic does not address these races, thereby allowing untracked and deleted resources to remain undetected.) This logic can detect in-container resources that are not tracked by the hash-table, or that are tracked but are no longer referenced in the container. Untracked resources are easy to detect, because their lookup in the hash-table will fail. To detect deleted resources, we mark every resource in the hash-table that we save during the checkpoint, and then, at the end of the checkpoint, verify that all the tracked resources are marked. A tracked but unmarked resource must have been added to the hash-table and then deleted before being reached by the actual checkpoint. In either case, the checkpoint is aborted.
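The count comparison at the end of the pre-pass might look like the sketch below; for_each_tracked_obj, obj->users, and kernel_refcount are hypothetical names.

    /* Illustrative post-collection leak test: every tracked object's
     * kernel reference count must equal the in-container count plus
     * the one reference the hash-table itself holds. */
    static int detect_leaks(struct ckpt_ctx *ctx)
    {
            struct ckpt_obj *obj;

            for_each_tracked_obj(ctx, obj) {
                    if (obj->users + 1 != kernel_refcount(obj->ptr))
                            return -EBUSY;  /* referenced outside the container */
            }
            return 0;
    }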

10.2.6 Error Handling

Both checkpoint and restart operations may fail for a variety of reasons. When a failure does occur, they must provide proper cleanup. For checkpoint this is simple, because the checkpoint operation is non-intrusive: the process hierarchy whose state is saved remains unaffected. For restart, cleanup is performed by the coordinator, which already keeps track of all processes in the restored hierarchy. More specifically, in case of failure the coordinator will send a fatal signal to terminate all the processes before the system call returns. Because the cleanup is part of the restart system call's exit path, there is no risk that cleanup will be skipped should the coordinator itself crash.

When a checkpoint or restart fails, it is desirable to communicate enough details about the failure to the caller to determine the root cause. A simple, single return value from the system call is insufficient to report the reason for a failure. For instance, a process that is not frozen, a process that is traced, an outstanding asynchronous I/O transfer, and leakage of a shared resource are just a few failure modes that all result in the error -EBUSY. To address the need to report detailed information about a failure, both the checkpoint and restart system calls accept an additional argument: a file descriptor to which the kernel can write diagnostic and debugging information. Both the checkpoint and

restart userspace utilities have options to specify a filename to store this log. In addition, upon failure checkpoint stores an informative status report in the checkpoint image, in the form of one or more error objects. An error object consists of

a mandatory pre-header followed by a null character ('\0'), and then a string that describes the error. By default, if an error occurs, this will be the last object written to the checkpoint image. When a failure occurs, the caller can examine the image and extract the detailed error message. The leading '\0' is useful if one wants to seek back from the end of the checkpoint image, instead of parsing the entire image.
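For instance, a userspace tool could locate the trailing error string by scanning backwards for the separating '\0', along the lines of the sketch below (the real ckptinfo may work differently, and the string is assumed to occupy the final bytes of the image).

    #include <stddef.h>

    /* Given the tail of a checkpoint image in buf[0..len), return a
     * pointer to the error string of the last error object, or NULL
     * if no separating '\0' is found. */
    static const char *find_error_string(const char *buf, size_t len)
    {
            size_t i = len;

            while (i > 0 && buf[i - 1] != '\0')
                    i--;                    /* walk back to the separator */
            return i > 0 ? buf + i : NULL;
    }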

10.2.7 Security Considerations

The security implications of in-kernel checkpoint-restart require careful attention. A key concern is whether the system calls should require privileged or unprivileged operation. Originally our implementation required users to have the CAP_SYS_ADMIN capability, while we optimistically asserted our intent to eventually remove the need for privilege and allow all users to safely use checkpoint and restart. However, it was pointed out that letting unprivileged users use these system calls is not only beneficial to users, but also has the useful side effect of forcing the checkpoint-restart developers to be more careful with respect to security throughout the design and development process. In fact, we believe this approach has succeeded in keeping us on our toes and catching ways that users otherwise would have been able to escalate privileges

through carefully manipulated checkpoint images, for instance bypassing CAP_KILL requirements by specifying arbitrary userids and signals for file owners.

At checkpoint, the main security concern is whether the process that takes a checkpoint of other processes in some hierarchy has sufficient privileges to access that state. We address this by drawing an analogy between checkpointing and debugging processes: in both, it is necessary for an auxiliary process to gain access to the internal state of some target process(es). Therefore, for checkpoint we require that the caller of the system call be privileged enough to trace and debug (using ptrace) all of the processes in the hierarchy.

For restart, the main concern is that we may allow an unprivileged user to feed the kernel with random data. To this end, the restart works in a way that does not skip the usual security checks. Process credentials, i.e., UID, EUID, and the security context (security contexts are part of LSM), currently come from the caller, not the checkpoint image. To restore credentials to the values indicated in the checkpoint image, restarting processes use the standard kernel interface. Thus, the ability to modify one's credentials is limited by one's privilege level when beginning the restart.

Keeping the restart procedure within the limits of the caller's credentials means that scenarios involving privileged applications that reduce their privilege level cannot be supported. For instance, a “setuid” program that opened a protected log file and then dropped privileges will fail to restart, because the user will not have enough credentials to reopen the file. The only way to securely allow unprivileged users to restart such applications is to make the checkpoint image tamper-proof.

There are a few ways to ensure that a checkpoint image is authentic. One method is to make the userspace utilities privileged using “setuid” and use cryptographic signatures to validate checkpoint images. In particular, checkpoint will sign the image and restart will first verify the signature before restoring from it. For instance, TPM [72] can be used to sign the checkpoint image and produce a keyed hash using a sealed private key, and to refuse restart in the absence of the correct hash. Another method is to create an assured pipeline for the checkpoint image between the checkpoint and restart system calls. Assured pipelines are precisely a target feature of SELinux, and could be implemented by using specialized domains for checkpoint and restart. Note, however, that even with a tamper-proof checkpoint image, a concern remains that the checkpoint image amounts to a persistent privileged token, which a clever user could find ways to exploit in new and interesting ways.

10.3 Kernel Internal API

The kernel API consists of a set of functions for use in kernel subsystems and modules to provide support for checkpoint and restart of the state that they manage. All the kernel API calls accept a pointer to a checkpoint context (ctx) that identifies the operation in progress. The kernel API can be divided by purpose into several groups: functions to handle data records; functions to read and write checkpoint images; functions to output debugging or error information; and functions to handle shared kernel objects and the hash-table. Table 10.2 lists the API groups and their naming conventions. Additional APIs exist to abstract away the details of checkpointing and restoring instances of some object types, including memory objects, open files, and LSM (security) annotations.

Group           Description
ckpt_hdr_...    record handling (alloc/dealloc)
ckpt_write_...  write data/objects to image
ckpt_read_...   read data/objects from image
ckpt_msg        output to the log file
ckpt_err        report an error condition
ckpt_obj_...    manage objects and hash-table

Table 10.2 – Kernel API by groups

The ckpt_hdr_... group provides convenient helper functions to allocate and deallocate buffers used as intermediate storage for the state data. During checkpoint they store the saved state before it is written out. During restart they store data read from the image before it is consumed to restore the corresponding kernel object.

The ckpt_write_... and ckpt_read_... groups provide helper functions to write data to and read data from the checkpoint image, respectively. These are wrappers that simplify the handling, for example by adding and removing record headers, and by providing shortcuts for common data such as strings and buffers.
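A checkpoint routine for a hypothetical object type might use these groups as in the sketch below; the helper names follow the groups above, but their exact signatures, and the myobj record itself, are illustrative.

    /* Hypothetical record for "myobj", following the generic header
     * convention described in Section 10.2.3. */
    struct ckpt_hdr_myobj {
            struct ckpt_hdr h;
            __u32 value;
    };

    /* Illustrative checkpoint routine: allocate a record buffer, copy
     * only the fields that matter, write it to the image, release it. */
    static int checkpoint_myobj(struct ckpt_ctx *ctx, struct myobj *p)
    {
            struct ckpt_hdr_myobj *h;
            int ret;

            h = ckpt_hdr_get_type(ctx, sizeof(*h), CKPT_HDR_MYOBJ);
            if (!h)
                    return -ENOMEM;
            h->value = p->value;             /* select individual elements */
            ret = ckpt_write_obj(ctx, &h->h);
            ckpt_hdr_put(ctx, h);
            return ret;
    }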

The ckpt_msg function writes an error message to the log file (if provided by the user) and, when debugging is enabled, also to the system log. The ckpt_err function is used when checkpoint or restart cannot succeed. It accepts the error code to be returned to the user, and a formatted error message which is written to the user-provided log and the system log. If multiple errors occur, e.g., during restart, only the first error value will be reported, but all messages will be printed.

The ckpt_obj_... group includes helper functions to handle shared kernel objects. They simplify the hash-table management by hiding details such as locking and memory management. They include functions to add objects to the hash-table, to find objects by their kernel address at checkpoint or by their tag at restart, and to mark objects that have been saved (for leak detection). Dealing with shared kernel objects aims to abstract the details of how to checkpoint and restart different object types, and to push the code to do so near the native code for those objects. For example, the code to checkpoint and restart open files

and memory layouts appears in the fs/ and mm/ subdirectories, respectively. The motivation for this is twofold. First, placing the checkpoint-restart code there improves maintainability because it makes the code more visible to maintainers. Higher awareness of maintainers increases the chances that they will adjust the checkpoint-restart code when they introduce changes to other parts of the kernel code base. Second, it is more friendly to kernel objects that are implemented in kernel modules, because it means that the code to checkpoint-restart such objects is also part of the module.

To abstract the handling of shared kernel objects, we associate a set of operations with each object type (similar to the operations for files, sockets, etc.). These include methods to checkpoint and restore the state of an object, methods to take or drop a reference to an object so that it can be referenced while in the hash-table, and a method to read the reference count (in the hash-table) of an object. The function register_checkpoint_obj is used to register an operations set for an object type. It is typically called from kernel initialization code for the corresponding object, or from module initialization code as part of loading a new module. Each of the ckpt_obj_... functions takes the object type as one of its arguments, which indicates the object-specific set of operations to use.

During checkpoint, for each shared kernel object the function checkpoint_obj is called. It first looks up the object in the hash-table and, if not found, invokes the

->checkpoint method from the corresponding operations set to create a record for the object in the image, and adds the object to the hash-table. During restart, records of shared kernel objects in the input stream are passed to the function restore_obj, which invokes the ->restore method from the corresponding operations set to create an instance of the object according to the saved state, and adds the newly created object to the hash-table.

Several objects in the Linux kernel are already abstracted using operations sets, which contain methods describing how to handle different versions of objects. For instance, open files have file_operations, and seeking in the ext3 file system is performed using a different method than in the nfs file system. Likewise, virtual memory areas have

vm_operations_struct, and the method to handle page faults is different in an area that corresponds to private anonymous memory than in one that corresponds to shared mapped memory. For such objects, we extend the operations to also provide methods for checkpoint, restore, and object collection (for leak detection), as follows.

To checkpoint a virtual memory area in the memory address space of a task, the struct vm_operations_struct needs to provide the method for the ->checkpoint operation:

    int checkpoint(ctx, vma);

and at restart, a matching callback to restore the state of a new virtual memory area object:

    int restore(ctx, mm, vma_hdr);

Note that the function to restore cannot be part of the operations set, because it needs to be known before the object instance even exists.

To checkpoint an open file, the struct file_operations needs to provide the methods for the ->checkpoint and ->collect operations:

    int checkpoint(ctx, file);
    int collect(ctx, file);

and at restart, a matching callback to restore the state of an opened file:

    int restore(ctx, file, file_hdr);

Here, too, the restore function cannot be part of the operations set. For most file systems, the generic functions generic_file_checkpoint and generic_file_restore are sufficient.

To checkpoint a socket, the struct proto_ops needs to provide the methods for the ->checkpoint, ->collect and ->restore operations:

    int checkpoint(ctx, sock);
    int collect(ctx, sock);
    int restore(ctx, sock, sock_hdr);

10.4 Experimental Results

We evaluated the current version of Linux-CR to answer three questions: (a) how long checkpoint and restart take; (b) how much storage they require; and (c) how they scale with the number of processes and their memory size. The measurements were conducted on a machine with two dual-core 2 GHz AMD Opteron processors with 1 GB L2 cache, 2 GB RAM, and a 73.4 GB 10,025 RPM local disk. We used a Fedora 10 distribution with all optional system daemons turned off, and a Linux 2.6.34 kernel with the Linux-CR v21 patch-set, with most debugging disabled and SELinux disabled. For the measurements we used the makeprocs test from the checkpoint-restart test-suite [104]. This test program allows a caller to specify the number of child processes, a memory size for each task to map, or a memory size for each task to map and then dirty. We repeated the tests thirty times, and report average values.

Because Linux-CR aims for inclusion in the mainline Linux kernel, the current implementation has focused on correctness, cleanness, and coverage (i.e., the range of supported kernel resources), in that order. Advanced features such as performance optimizations were intentionally deferred to future work to reduce the code complexity in view of the public code review by the kernel developers community. In particular, the evaluated version does not include any of the optimizations mentioned in Chapter 6.

Table 10.3 presents the results in terms of checkpoint image size, checkpoint time, and restart time, for measurements with 10, 100, and 1000 processes. Checkpoint times were measured once with the output saved to a local file, and once with the output discarded (technically, redirected to /dev/null). Restart times were measured from when the coordinator begins until it returns from the kernel after a successful

Number of    Image     Checkpoint       Checkpoint       Restart
processes    size      time (to file)   time (no I/O)    time
10           0.87 MB   8 ms             3 ms             10 ms
100          8.0 MB    72 ms            24 ms            72 ms
1000         79.6 MB   834 ms           237 ms           793 ms

Table 10.3 – Checkpoint-restart performance

operation. Restart times were measured reading the checkpoint images from a warm cache. The results show that the checkpoint image size and the time for checkpoint and restart scale linearly with the number of processes in the process hierarchy. The average total checkpoint time is about 0.8 ms per process when writing the data to the file system, and drops to well under 0.3 ms per process when file system access is skipped. Writing the data to a file more than triples the checkpoint time. This suggests that buffering the checkpoint output until the end of a checkpoint is a good candidate for a future optimization to reduce application downtime at checkpoint. Average total restart time from a warm cache is about 0.9 ms per process. The total amount of state that is saved per process is also modest, though it depends highly on the applications being checkpointed.

To better understand how much of the checkpoint and restart times is spent on different resources, we instrumented the respective system calls to measure a breakdown of the total time. For both checkpoint and restart, saving and restoring the memory contents of processes amounted to over 80% of the total time. The total memory in use within a process hierarchy is also the most prominent component of the checkpoint image size. To look more closely at the impact of process size, we measured the checkpoint time for ten processes, each with memory sizes increasing from 1 MB to 1 GB, allocated using the mmap system call.

Task size   Checkpoint      Checkpoint
            time (clean)    time (dirty)
1 MB        0.5 ms          1.3 ms
10 MB       0.7 ms          6.4 ms
100 MB      1.7 ms          58 ms
1 GB        10.6 ms         337 ms

Table 10.4 – Checkpoint times and memory sizes

We repeated the test twice. In one instance the processes only allocate memory but do not touch it. In the second instance the processes also dirty all the allocated memory. In both cases, the output was redirected to avoid expensive I/O. The results are given in Table 10.4. The results show a strong correlation between the memory footprint of processes and checkpoint times. Even when memory is untouched, the cost associated with scanning the process's page tables is significant. Checkpoint times increase substantially with dirty memory, as it requires the contents to actually be stored in the checkpoint image.

10.5 Summary

In this chapter, we presented in detail Linux-CR, an implementation of checkpoint-restart which aims for inclusion in the mainline Linux kernel. For several years the Linux kernel has been gaining the necessary groundwork for such functionality, and recently support for kernel-based transparent checkpoint-restart has also been maturing. The Linux-CR project emerged as a unifying framework to provide checkpoint-restart in the Linux kernel. It provides transparent, reliable, flexible, and efficient application checkpoint-restart. Aiming at a wide audience ranging from users to kernel developers, we also explained in detail the usage model and described the user interfaces and select key kernel interfaces. Previous work on application checkpoint and restart for Linux did not address the crucial issue of integration with the mainstream Linux kernel. A key ingredient of a successful upstream implementation is the kernel community's understanding of the usefulness of checkpoint-restart. The extensive feedback we have received shows that this usefulness is recognized by the community. We have high hopes that, with the community's help, the project will succeed in providing checkpoint-restart functionality in the upstream kernel.

Chapter 11

Related Work

This dissertation introduced a new personal virtual computer recorder model to the desktop. Its contributions touch upon several fields in systems, particularly regarding virtualization and checkpoint-restart. In this chapter we survey the prior art in these fields in detail and discuss their relationship to DejaView.

11.1 DejaView

DejaView is created in the spirit of Vannevar Bush's Memex [25] vision to build a device that could store all of a user's documents and general information so that they could be quickly referenced. Inspired by the Memex vision, MyLifeBits [64] is centered around digitally capturing a lifetime of Gordon Bell's information, with a focus on indexing and annotating individual documents. Lifestreams [60] was designed to minimize the time a user spends managing data by creating a time-ordered stream of documents in one's life as a replacement for the current desktop metaphor. All of these projects are complementary to DejaView. None of these approaches provides DejaView's ability to record and index a user's computing experience such that it can be revived to consult the information it contains using its original native applications.

Desktop search tools, such as those from Google [43], Microsoft [44], Yahoo [45], and Apple [9], enable a user to search one's desktop files. Connections [157] improves a user's ability to search by extracting information that links which files were used with which programs and at which times, enabling a user to perform a more contextualized search. The closest of these systems to DejaView is Stuff I've Seen (SIS) [49], which also uses information about what a user has seen in the context of search. While DejaView leverages the accessibility framework already built into GUI toolkits and applications, SIS requires writing gatherers and filters to extract contextual information for each application and data type that is used. Other projects, such as Microsoft's WinFS [182], are attempts at replacing the traditional file system with a specialized document store that can be searched using natural language search mechanisms. All of these approaches focus only on searching files, and typically only work for files in certain formats. They are largely orthogonal to DejaView.

Apple's Time Machine [10] enables a user to peruse and recover previous states in the file system. While Apple's Time Machine focuses solely on storage state, DejaView's wider scope includes display recording and playback, and allowing the user to search for state that has been seen but not committed to disk. Moreover, DejaView enables a user to revive and interact with a complete desktop session, not just manipulate old file data.

Cohen et al. [36] present a method for automatically extracting from a running system an indexable signature that distills the essential characteristics of a system state. The signatures can then be subjected to automated clustering and similarity-based retrieval to identify when an observed system state is similar to a previously observed state. While this method is related to DejaView at a high level, as they are both able to index past system state, Cohen et al.'s project is focused on identifying when software acts in an unexpected way. They provide tools for understanding and documenting the causes of the unexpected behavior and associating it with previous system behavior to help resolve the problem. While this approach can work well for services that one provides a customer, it is not easily applied to a user's dynamic desktop, where things are constantly changing. Users are interested in finding and retrieving their documents and computing state, not just correlating a desktop state to a previous desktop state.
Screencasting provides a recording of a desktop's screen that can be played back at a later time [26, 170, 185]. Screencasting works by screenscraping, taking screenshots of the display many times a second. There have also been VNC-based approaches to recording desktop sessions [100, 172], which incur less recording overhead than taking frequent screenshots, but most of them are tailored towards improving remote group collaboration. All of these require higher overhead and more storage and bandwidth than DejaView's display recording, and the common approach of also using lossy JPEG or MPEG encoding to compensate further increases recording overhead and decreases display quality. DejaView goes beyond screencasting by not only recording the desktop state, but also extracting contextual information from it to enable display search. More recently, OpenMemex [120] is another system, being independently developed, that extends VNC-based screencasting to provide display search by using offline OCR to extract text from the recorded data. However, using the VNC protocol may miss intermediate data between sampled points. DejaView's use of available accessibility tools provides further contextual information, such as the application that generated the text, which is not available through OCR. Furthermore, DejaView provides the ability to revive and interact with a session instead of only viewing the display.

11.2 Content Recording

Screen readers are in common use today. As assistive technologies, they require access to GUI components through a standardized interface, which accessibility libraries provide. Modern solutions [83, 123] provide access to onscreen data, but are limited to watching changes in the foreground window only. Users can define hot spots, physical regions of the screen to scan for changes, but current screen readers lack the capability to detect more complicated changes, such as specific text strings appearing in arbitrary locations on the desktop. Also, requiring blind and partially sighted users to specify these physical locations can be cumbersome, and in many scenarios it is difficult to select a cleanly differentiated region of the screen to watch, such as when the text is spread around the desktop. Our results show that Capture provides much more complete recording of onscreen text than screen readers such as Orca that are commonly used on commodity desktops.

Screencasting can be used to record the desktop's display. However, all screencasting approaches produce the same essential result after converting to a standard image or video format. Unlike these approaches, Capture records the structure and semantics of the display-centric text content, including extracting GUI and contextual information from it. This information can then be processed in meaningful ways, such as for supporting visually impaired users, that cannot be achieved by just taking pictures of the screen without preserving any textual semantics, as is effectively done by screencasting. More recently, OpenMemex [120] is a proposed system that extends VNC-based screencasting by using offline OCR to extract text from the recorded data. However, OCR algorithms are too slow to be performed online, and therefore do not facilitate real-time applications, including assistive technologies. Additionally, OCR is inaccurate with common screen fonts and when acting on compressed images. OCR

is not event-driven, and so must either operate periodically, potentially missing important frames, or waste resources operating on every frame with any screen changes, even if the text is exactly the same, such as when the mouse cursor has moved.

Various lecture recording systems have recorded lectures on desktop computers. Ziewer [189] recorded screen framebuffers and used offline post-processing to detect and annotate certain features of the presentations, including slide text and slide transitions, to associate text with images. However, the approach relies on OCR to identify text features, which is imprecise, and it also requires more storage space for the higher-quality screenshots containing text. Stolzenberg and Pforte [161] proposed a hybrid system of symbolic text and metadata recording and screen image recording. It relied on key events recorded by customized presentation software, namely slide changes and input by the lecturer. Unlike Capture, none of these lecture recording systems records onscreen text content from unmodified applications and non-lecture applications.

Gyllstrom [75] also attempts to record a user's desktop usage history like DejaView, but takes an approach similar to that of screen readers in how it uses the accessibility frameworks provided by modern operating systems. It shares screen readers' limitation of observing only the user's single point of interaction with the system, rather than monitoring changes across the entire screen. Because it does not record all onscreen text, Gyllstrom generates text traces based on the onscreen information and links the traces to static document files to access the remainder of the documents that were displayed on the screen. In Capture, there are no implied disk-based files, and all data comes from the screen.

11.3 Operating System Virtualization

Given its various benefits, a number of operating system virtualization systems have been implemented over the past decade. Capsules [148] was an early effort in Sun's Solaris operating system. This research inspired Zap [124], the earliest system to provide complete, host-independent virtualization in a kernel module without base kernel changes. Our work builds on previous work on Zap, but discusses for the first time various key implementation issues that arise in practice in efficiently supporting complete, host-independent operating system virtualization while preserving scalable operating system performance. Some virtualization systems that do not provide complete, host-independent virtualization have been implemented in Windows [42, 181] without kernel changes, in part by interposing at the DLL level, with the associated disadvantages of user-level interposition described in § 5.2. Other recent virtualization systems have been implemented by making extensive kernel changes to restructure the underlying operating system [122, 138, 173].

Building on ideas from Capsules, Zap, and other early virtualization systems, some commodity operating systems are gradually incorporating virtualization support by making pervasive changes to the operating system kernel. The implementation of per-process namespaces in the Linux kernel [19, 106] began in 2005 and is still in progress. This effort has affected nearly every corner of the Linux kernel. In contrast, our approach is minimally invasive, since it treats the operating system kernel as an unmodified black box. This makes it easier to backport to commodity operating system kernels, and the use of a kernel module simplifies deployment. While the precise performance impact of implementing virtualization in the operating system kernel is impossible to measure [19], the demonstrated low overhead of our approach on real applications provides comparable performance.

Interposition has been the subject of extensive research, and is repeatedly relied on by a host of approaches for security and other general operating system extensions, spanning all three approaches: user-level methods [3, 13, 68, 81, 84, 92] and [174] (the latter presenting an excellent analysis of ptrace), kernel modifications [18, 39, 80, 139, 173], and loadable kernel modules [59, 61, 65], as well as hybrid methods [62]. Some of these have pointed out severe limitations in the security of user-level interposition, mainly due to vulnerability to races as well as bypassability. Our work reiterates these deficiencies in the context of virtualization. We have not found any work that deals with the issue of referencing unexported kernel functionality or data structures.

Three of these studies offer valuable insight and guidelines regarding the design of interposition facilities. SLIC [65] raises the need to access kernel state data not explicitly exported (e.g., the PID of the current process). This can help reduce the non-portable component of the code by providing a cleaner interface. Janus [174] comments about interposing on internal state, namely being able to extend some internal data structures, in accordance with our analysis of placing shadow data. Garfinkel [61] focuses on process isolation in the context of security, whereas our work pertains to operating system virtualization, which not only isolates but also decouples the process from the operating system instance. Virtualization addresses a superset of the race conditions associated with isolation, introduces new classes of races such as initialization and deletion races, and is required to keep references on operating system resources. Our approach is also concerned with performance considerations in addressing these races, and in particular scalability.

Virtualization enables many more benefits than interposition alone, so its implementation must deal with a broader variety of practical issues. For instance, interposition rarely needs to retain state. Thus, as in the discussion on "Incorrectly Mirroring Operating System State," Garfinkel suggests avoiding the retention of state in the monitor. In the virtualization module, we must always be aware of state, either in terms of the virtual state maintained by our module or in the context of the operating system state which might contain references to objects that we track. We are unaware of any work that discusses the implementation details of virtualization in practice.
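To make the costs and races of user-level interposition concrete, here is a minimal, illustrative ptrace-based system call tracer for Linux/x86-64; it is a sketch, not code from any of the cited systems. Every syscall entry and exit stops the tracee and switches into the tracer, and any argument check the tracer performs can race with another thread rewriting the traced process's memory, the classic weakness noted above:

/* Sketch of user-level syscall interposition with ptrace, the
 * mechanism analyzed in [174]. Linux/x86-64 only. */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <sys/ptrace.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s cmd [args]\n", argv[0]);
        return 1;
    }
    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);
        execvp(argv[1], &argv[1]);
        perror("execvp");
        exit(1);
    }
    int status;
    waitpid(child, &status, 0);            /* stopped at exec */
    while (1) {
        /* Resume until the next syscall entry or exit; this loop
         * therefore fires twice per syscall. */
        if (ptrace(PTRACE_SYSCALL, child, NULL, NULL) < 0)
            break;
        if (waitpid(child, &status, 0) < 0 || WIFEXITED(status))
            break;
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, child, NULL, &regs);
        /* orig_rax holds the syscall number; a security monitor would
         * vet or rewrite arguments here -- but the child's memory can
         * change underneath it, the TOCTTOU race discussed above. */
        fprintf(stderr, "syscall %lld\n", (long long)regs.orig_rax);
    }
    return 0;
}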

11.4 Application Checkpoint-Restart

Many application checkpoint-restart mechanisms have been proposed [128, 143, 145]. Application-level mechanisms are directly incorporated into the applications, often with the help of languages, libraries, and preprocessors [17, 175]. These approaches are generally the most efficient, but they are non-transparent, place a major burden on the application developer, may require the use of nonstandard programming languages [86], and cannot be used for unmodified or binary-only applications whose source code is not available.

Library checkpoint-restart mechanisms [129, 164] reduce the burden on the application developer by only requiring that applications be compiled or relinked against special libraries. However, such approaches do not capture important parts of the system state, such as interprocess communication and process dependencies through the operating system. As a result, these approaches are limited to applications that severely restrict their use of operating system services.

Operating system checkpoint-restart mechanisms utilize kernel-level support to provide greater application transparency. They do not require changes to the application source code nor relinking of the application object code, but they do typically require new operating systems [89, 99] or invasive kernel modifications [121, 142, 148, 158] in commodity operating systems. None of these approaches checkpoints multiple processes consistently on unmodified commodity operating systems.

Some recent approaches such as BLCR [48], and CRAK [188] and its successor Zap [124], have been designed as loadable kernel modules that can be used with unmodified commodity operating systems, but these systems have relied on incomplete process state information, do not fully support multiprocess and multi-threaded applications, and require significant application downtimes during checkpoints. Our work builds on previous virtualization architecture [16, 124] and copy-on-write ideas [99] to uniquely provide fast and transparent checkpoint-restart of multiple processes on commodity operating systems.

Operating system virtualization approaches like Capsules [148], Zap [95, 124], and OpenVZ [121] decouple processes from the underlying operating system instance so that applications can be transparently checkpointed and restarted from secondary storage. DejaView differs from these approaches in reducing application downtime from checkpointing by up to two orders of magnitude, making continuous checkpointing of interactive desktops possible without degrading interactive performance. DejaView also differs in enabling desktops to be revived from any previous checkpoint, not just the most recent one. These benefits are achieved by correctly and completely supporting incremental and COW checkpoints of multi-threaded and multiprocess cooperating desktop applications, moving potentially expensive file system, quiescing, and write-back operations out of the critical path to minimize application downtime, and providing file system snapshotting consistent with checkpointing to enable isolated execution from multiple checkpoints.

Previous incremental process checkpointing approaches [66, 85, 90, 130] rely on the hardware's page protection mechanism exported by the operating system to determine when a page has been written; a user-level sketch of this technique appears at the end of this section. DejaView uses the same approach but handles operating system interactions not covered by other approaches, and is therefore more complete. Previous process migration mechanisms introduced pre-copying [165] and lazy-copying [187] for reducing downtime due to copying virtual memory. DejaView builds on these ideas but applies them to checkpoint-restart and introduces them for reducing downtime due to saving non-memory operating system resources and quiescing operating system resources for checkpoint consistency.

Hardware virtualization approaches such as Xen [47] and VMware [169] use virtual machine monitors (VMMs) [131] that can enable an entire operating system environment, consisting of both the operating system and the applications, to be checkpointed and restarted. VMMs can support transparent checkpoint-restart of both Linux and Windows operating systems. However, because they operate on entire operating system instances, they cannot provide finer-granularity checkpoint-restart of individual applications decoupled from operating system instances, resulting in higher checkpoint-restart overheads and differences in how these mechanisms can be applied. For example, unlike DejaView's checkpoint mechanism, VMMs cannot be used to checkpoint and migrate applications from an operating system that needs to be brought down for maintenance to a machine that can remain up and running, to minimize the impact of operating system upgrades on the availability of application services [135].

VMs have been used to log or snapshot entire operating system instances and their applications to roll back execution to some earlier time [22, 50, 88, 171, 179].
Some of these approaches simply recreate the execution state at a previous point in time. Others deterministically replay execution exactly in uniprocessor environments and simpler VM models to enable debugging or intrusion analysis. However, implementing deterministic replay in practice without incurring prohibitive overhead remains a difficult problem, especially for multiprocessor systems [168]. Moreover, this form of replay requires re-executing everything for playback, which may be desirable for debugging purposes but is expensive for PVR-style information retrieval.

In contrast, DejaView's display recording and playback provides deterministic replay of only display events for fast browsing and playback, while its revive functionality allows recreation of execution state at specific points in time. Unlike VM approaches, DejaView does not incur the higher overhead of logging or checkpointing entire machine instances, which is crucial to avoid degrading interactive desktop performance.
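The page-protection-based incremental checkpointing discussed above can be illustrated with a minimal user-level sketch; this is an illustrative example under simplifying assumptions, not DejaView's in-kernel implementation. The idea is to write-protect the tracked memory, catch the first write fault to each page, and record that page as dirty, so only dirty pages need saving at the next checkpoint:

/* Minimal user-level sketch of incremental checkpointing via page
 * protection, in the spirit of [66, 85, 90, 130]. Error handling is
 * omitted for brevity. */
#define _GNU_SOURCE
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define NPAGES 16
static char *region;
static size_t pagesz;
static int dirty[NPAGES];

static void on_segv(int sig, siginfo_t *si, void *ctx)
{
    /* First write to a protected page since the last checkpoint:
     * mark it dirty and re-enable writes so the store can retry. */
    uintptr_t addr = (uintptr_t)si->si_addr;
    size_t page = (addr - (uintptr_t)region) / pagesz;
    dirty[page] = 1;
    mprotect(region + page * pagesz, pagesz, PROT_READ | PROT_WRITE);
}

static void begin_checkpoint_interval(void)
{
    /* A real checkpointer would save the dirty pages here, then
     * re-arm tracking for the next interval. */
    memset(dirty, 0, sizeof(dirty));
    mprotect(region, NPAGES * pagesz, PROT_READ);
}

int main(void)
{
    pagesz = sysconf(_SC_PAGESIZE);
    region = mmap(NULL, NPAGES * pagesz, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct sigaction sa = { .sa_sigaction = on_segv,
                            .sa_flags = SA_SIGINFO };
    sigaction(SIGSEGV, &sa, NULL);

    begin_checkpoint_interval();
    region[0] = 1;               /* faults once: page 0 marked dirty */
    region[5 * pagesz] = 1;      /* faults once: page 5 marked dirty */
    for (int i = 0; i < NPAGES; i++)
        if (dirty[i])
            printf("page %d dirty, would be saved incrementally\n", i);
    return 0;
}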

11.5 Distributed Checkpoint-Restart

Library-level distributed checkpoint-restart mechanisms resort to substituting standard message-passing middleware, such as PVM [63] and MPI [154], with specialized checkpoint-aware middleware versions. A good survey of these is found in [145]. To perform a checkpoint, they flush all the data communication channels to prevent loss of in-flight messages, as sketched below. Upon restart, they reconstruct the network connectivity among the processes and remap location information according to how the network addresses have changed. Examples include MPVM (MIST) [28], CoCheck [140], LAM-MPI [146], FT-MPI [54], C3 [23], Starfish [7], PM2 [163], and CLIP [30]. These library-level approaches require that applications be well-behaved. They cannot use common operating system services because system identifiers such as process identifiers cannot be preserved after a restart. As a result, these approaches can be used only for a narrow range of applications.

While operating system checkpoint-restart mechanisms provide greater application transparency and typically allow a checkpoint at any time, most of these systems provide checkpoint-restart only for applications running on a single node. ZapC builds on DejaView's checkpoint-restart mechanism to provide general coordinated checkpoint-restart of distributed network applications running on commodity multiprocessor clusters.
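The channel-flush step used by the library-level systems above can be sketched as follows; this is an illustration, not ZapC's or any cited system's actual code, and it uses a reserved marker byte where a real system would frame messages instead:

/* Before checkpointing, each process sends a MARKER on every outgoing
 * channel and drains every incoming channel until the peer's MARKER
 * arrives; the drained bytes are the in-flight data that must be
 * saved with the checkpoint and re-injected at restart. Assumes the
 * application itself is quiesced during the flush. */
#include <stdlib.h>
#include <sys/socket.h>

#define MARKER 0xA5   /* simplification: real systems frame messages */

struct drained { unsigned char *data; size_t len; };

static struct drained drain_channel(int in_sock)
{
    struct drained d = { malloc(4096), 0 };
    size_t cap = 4096;
    unsigned char byte;
    while (recv(in_sock, &byte, 1, 0) == 1 && byte != MARKER) {
        if (d.len == cap)
            d.data = realloc(d.data, cap *= 2);
        d.data[d.len++] = byte;
    }
    return d;
}

void flush_channels(const int *out, int nout,
                    const int *in, int nin, struct drained *saved)
{
    unsigned char m = MARKER;
    for (int i = 0; i < nout; i++)      /* step 1: mark all outputs */
        send(out[i], &m, 1, 0);
    for (int i = 0; i < nin; i++)       /* step 2: drain all inputs */
        saved[i] = drain_channel(in[i]);
    /* All channels are now empty and the processes can be
     * checkpointed consistently. */
}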

Unlike other operating system checkpoint-restart mechanisms, BLCR [48] and Cruz [82] provide support for coordinated checkpoint-restart of distributed applications. However, BLCR does not support checkpoint-restart of socket state and relies on modifying applications to cooperate with checkpoint-aware message passing middleware [48]. BLCR also cannot restart successfully if a resource identifier required for the restart, such as a process identifier, is already in use.

Cruz [82] also builds on Zap, but is based on an outdated and incomplete Linux 2.4 implementation. It restricts applications to being moved within the same subnet and requires unique, constant network addresses for each application endpoint. It cannot support different network transport protocols, but instead uses low-level details of the Linux TCP implementation to attempt to save and restore network state to provide checkpoint-restart of distributed network applications in a transparent manner. This is done in part by peeking at the data in the receive queue, as sketched below. This technique is incomplete and will fail to capture all of the data in the network queues with TCP, including crucial out-of-band, urgent, and backlog queue data. No quantitative results have been reported to show that Cruz can restart distributed applications correctly.

MOVE [162] is a mobile communication architecture that allows transparent migration of live network connections. It virtualizes connections perceived by transport protocols to decouple connections from hosts so that they can be remapped into physical connections using network address translation. While MOVE must be installed on both communication end-points, neither ZapC nor NetCont relies on such non-standard infrastructure. ZapC checkpoints all communicating peers together and can reconstruct all saved connections during restart independently of the original network.
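The receive-queue peeking mentioned above amounts to the following kind of user-level sketch (an illustration, not Cruz's actual code). As the text notes, it copies only ordinarily queued bytes and misses out-of-band, urgent, and backlog data, which is why ZapC saves network state through standard transport-independent interfaces instead:

/* At checkpoint time, copy a TCP socket's pending receive-queue bytes
 * without consuming them, so they can be saved and re-injected at
 * restart while the application still receives them normally. */
#include <stdlib.h>
#include <sys/ioctl.h>
#include <sys/socket.h>

/* Returns a malloc'ed copy of the pending bytes; *len gets the count.
 * Caller frees. */
char *peek_recv_queue(int sock, int *len)
{
    int pending = 0;
    *len = 0;
    if (ioctl(sock, FIONREAD, &pending) < 0 || pending == 0)
        return NULL;
    char *buf = malloc(pending);
    /* MSG_PEEK copies the data out but leaves it queued. */
    *len = (int)recv(sock, buf, pending, MSG_PEEK | MSG_DONTWAIT);
    return buf;
}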

NetCont migrates networked applications within a single subnet along with their IP address, avoiding the need to virtualize network addresses altogether. NetCont could leverage MOVE's mechanism to enable migrations across distinct subnets.

11.6 Desktop Power Management

In Chapter 9 we presented a system to reduce the energy consumption of desktop PCs by putting them in standby sleep state, while allowing their networked applications to maintain continuous network presence even when the computer sleeps. We now discuss other related work in this area.

Desktop PCs and laptops in active state can save energy via reduced power draw at the hardware level [58, 79] and extended idle times at the operating system and application levels [151]. Despite these efforts, typical desktops consume 80-110 W when active and 60-80 W when idle [5, 73]. The sleep states in the ACPI standard [4] allow significantly larger energy savings while allowing quick resume. The typical power consumption drops to 2-3 W in S3 state (suspend-to-RAM), which is often preferred to the S4 state (suspend-to-disk) due to the much faster resume time. Many approaches attempt to exploit computer idle periods to put the computer into the power-saving S3 sleep state and reduce power consumption by 95% or more. For these, the biggest challenge is to maintain network presence and connectivity even when the computer is asleep. Without network presence, ongoing connections may be broken and the computer becomes unreachable to new incoming connections. Previous work suggests that users do not voluntarily put their machines to sleep in order to avoid such disruptions [5].

Wake-on-LAN (WoL) [101] and Wake-on-WLAN (WoWLAN) [149] use NICs that monitor incoming packets and wake up the host when they detect an incoming "magic" packet. Despite their availability, they have not been widely used due to three technical barriers. First, the remote peer needs to know that the host is sleeping. Second, the peer needs to learn the MAC address of the host, and be able to send it a packet through firewalls and NAT setups. Third, WoWLAN does not work when a laptop changes its subnet due to mobility. In contrast, NetCont is transparent to remote hosts: since the NCS takes over the IP of the sleeping computer, there is no need for additional network setup. Mobility is not an issue because the networked applications themselves remain stationary.

The common approach to support continuous network presence for sleeping hosts is to use a network proxy to maintain network presence and connectivity in lieu of the host [5, 6, 31, 32, 73, 116, 141, 144, 156]. The network connection proxy monitors incoming packets for the host, and uses WoL to wake up the host when it needs to handle a packet. Each received packet results in one of four actions: directly respond to it, wake up the host, discard the packet, or queue the packet for later processing when the host is awake. The proxy can be an independent system on the same subnet [73], in the switch or the router [31], or in the host's NIC [144].
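The WoL "magic" packet used by these proxies has a simple, well-known format: six 0xFF bytes followed by the target's MAC address repeated sixteen times, typically sent as a UDP broadcast. The following minimal sketch constructs and sends one; the destination port choice and error handling are simplifications:

/* Send a Wake-on-LAN magic packet for the given MAC address. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int send_wol(const unsigned char mac[6])
{
    unsigned char pkt[102];                 /* 6 + 16 * 6 bytes */
    memset(pkt, 0xFF, 6);
    for (int i = 0; i < 16; i++)
        memcpy(pkt + 6 + i * 6, mac, 6);

    int s = socket(AF_INET, SOCK_DGRAM, 0);
    int on = 1;
    setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof(on));

    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port = htons(9),  /* discard port */
                               .sin_addr.s_addr = htonl(INADDR_BROADCAST) };
    ssize_t rc = sendto(s, pkt, sizeof(pkt), 0,
                        (struct sockaddr *)&dst, sizeof(dst));
    close(s);
    return rc == (ssize_t)sizeof(pkt) ? 0 : -1;
}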

In comparison, NetCont builds on standard tools, provides a superset of proxy functionality, and does not need to wake up the host for packets destined to active networked applications that run on the NCS.

Selective Connectivity [8] proposes to allow a host to choose the degree to which it maintains a network presence, rather than just "connected" or "disconnected". It relies on assistants to help a host while it sleeps. Assistants perform simple tasks that the host normally handles, such as responding to keep-alives. NetCont avoids the need for such stubs, as migration is extremely quick, preventing any significant downtime for the networked applications.

In Somniloquy [5], a secondary low-power processor covers for the main processor of the PC. It goes beyond a regular proxy to support applications such as BitTorrent and IM via application stubs. It depends on stubs and host callbacks that need to be written per application and require intimate knowledge of the application and the protocol. While NetCont's NCS can be viewed as a secondary processor to the sleeping host, NetCont does not require any specialized hardware or modifications to existing applications. In addition, by enabling multiple hosts to aggregate their networked applications on a single external processor, it minimizes the amount of additional hardware needed.

SleepServer [6] was developed based on the experience gained with Somniloquy and provides many of the same benefits, while enabling the use of commodity hardware.

Like NetCont, it relies on a set of centralized servers, which it calls SleepServers, that one's applications can migrate to. However, like Somniloquy and unlike NetCont, SleepServer requires small stub applications to be written that handle the network activity of applications, so that only those stubs are migrated and not the entire operating system. This imposes a much larger software engineering burden on developers, as they have to redesign their applications into multiple components. In contrast, NetCont works with unmodified applications and therefore can be used today with minimal deployment effort.

Delay-Tolerant Networking (DTN) [55] is an alternative approach to handling disconnection of links and hosts, e.g., as a result of power management. DTN proposes a network architecture and application interface that uses optionally-reliable asynchronous message forwarding that operates as an overlay above the transport layers of the networks it interconnects. However, DTN's services come at the cost of major reworking both of network infrastructure and of how applications use the network.

More recently, virtual machines were proposed for enterprise environments to migrate an entire idle desktop from the user's host to a dedicated consolidation server [20, 41].

Unlike NetCont, these approaches require running the entire desktop within a virtual machine and moving the entire desktop to the consolidation server, which takes a long time because of the substantial amount of state that needs to be moved when using virtual machines. Furthermore, these approaches are forced to employ expensive workarounds because today's hypervisors do not provide ACPI support. Specifically, to enable the desktop to sleep after the virtual machine is migrated to the consolidation server, the desktop must be rebooted with the hypervisor disabled before it can sleep, imposing even longer delays and wasting additional power.

In contrast, NetCont does not migrate entire desktops, only network applications that require continued execution due to network activity, significantly reducing migration times. NetCont also leverages native ACPI support in existing commodity operating systems, avoiding the need for a reboot. The result is substantially faster sleep times and greater power savings.

Chapter 12

Conclusions and Future Work

It is not uncommon for desktop users to realize the importance of something they saw earlier as part of their computer use, yet be unable to find it or gain access to it. This dissertation presented DejaView, a new personal virtual computer recorder architecture for the desktop that enables What You Search Is What You've Seen (WYSIWYS) functionality, to help users find, access, and manipulate information they have previously seen. In building DejaView, we demonstrated a new approach for keeping track of the massive amount of data seen through the desktop, and for accessing this data later even when it was not explicitly saved.

DejaView provides a unique blend of functionality for display recording, playback, browsing, search, and reviving live desktop execution from any point in time. To provide this unique functionality, DejaView combines three key components: a display-centric recording architecture, a display-centric text recorder, and a live execution recording architecture. All of these components rely on different forms of virtualization to enable powerful new functionality in a transparent manner and without sacrificing performance. DejaView abstracts the computer's display, onscreen contents, and live execution state by virtualizing the interfaces through which applications interact with their environment and the operating system. Leveraging existing window system, accessibility framework, and operating system functionality ensures transparent operation without any modifications to any component in the complex software stack of modern commodity systems.

DejaView's display-centric recording architecture uses display virtualization to decouple display state from the underlying hardware and enables the display output to be redirected anywhere, making it easy to manipulate and record. We show that this provides new functionality to record the viewed output as it was displayed and with the same personal context and display layout, transparently, with high display fidelity, and with minimal overhead. The record can be used to enable PVR functionality of the display, including playback, rewind, and fast forward.

DejaView's display-centric content recorder, Capture, introduces a form of virtualization that interposes on the accessibility framework available on modern operating systems to cache onscreen text and the respective contextual information. This caching allows efficient access to the entire onscreen contents, mitigating the high performance costs of the accessibility framework while retaining its full functionality. We show that this provides new functionality to facilitate real-time access to onscreen contents from both foreground and background windows, transparently, with high accuracy, and with low overhead. The contents can be recorded and used as an index to the display record based on the displayed text and contextual information captured in the same context as the recorded display. The recorded data can also benefit a variety of problem domains beyond desktop search, including assistive technologies, auditing, and predictive graphical user interfaces.

DejaView's live execution recording architecture uses lightweight operating system virtualization to create a virtual execution environment that decouples the user's desktop computing environment from the underlying host. Encapsulating the entire
user's session in a private virtual namespace is crucial for enabling a live desktop session to be continuously checkpointed and later revived from any checkpoint. This namespace also hosts the virtual display server to ensure that all display state is correctly saved and that multiple revived sessions can co-exist with their independent display states. We show that this provides new functionality to continuously record live execution state, transparently, in a globally consistent manner, and with minimal overhead and modest storage requirements. The record can be used to revive past sessions, allowing users to manipulate and interact with the recorded execution state. Checkpointing at a fine granularity, shifting expensive I/O operations out of the critical path, and using various optimizations such as fast incremental and copy-on-write techniques minimize any impact on interactive desktop performance. Combining log-structured and unioning file systems allows fast file system snapshots consistent with checkpoints, enabling revived sessions with simultaneous read-write usage.

We implemented a DejaView prototype that integrates the three components together into a personal virtual computer recorder. Our implementation illustrates the architecture's transparency in running with an unmodified software stack that includes the Linux kernel, the Debian Linux distribution, the X Window System, and the GNOME desktop environment. Through an experimental evaluation on a wide range of real-world desktop applications and with real desktop usage, we show the architecture's ability to provide WYSIWYS functionality fast enough for interactive use and without user-noticeable performance degradation.

Going beyond DejaView, this dissertation has also shown how the system's individual components provide useful building blocks for the broader context of application mobility. We presented ZapC, a system for transparent coordinated checkpoint-restart of distributed applications on commodity clusters. ZapC can checkpoint an entire distributed application across all nodes in a coordinated manner such that it can be restarted on a different set of cluster nodes. Checkpoint-restart operations execute in parallel across different nodes to minimize synchronization overhead. Network state is saved in a transport protocol independent manner, for both reliable and unreliable protocols, using standard operating system interfaces. Experimental results demonstrate fast, sub-second checkpoint and restart times of real distributed applications, and modest network state sizes, validating the scalability of our approach.

We then presented NetCont, a system that enables a host and its applications to maintain continuous network presence even when the host sleeps, allowing immense energy savings by putting idle hosts in low-power sleep state without compromising their connectivity. Using checkpoint-restart, NetCont transparently migrates networked applications off of sleeping hosts to a dedicated server where they continue to run unmodified, relying on the applications themselves to correctly maintain network presence. Applications from hundreds of idle hosts can be consolidated into a single server, reducing energy consumption by orders of magnitude. Experimental results demonstrate fast, sub-second migration times, much lower than host suspend and resume times, and minimal impact on the availability of hosts' network services.

We also presented Linux-CR, an implementation of application checkpoint-restart that aims for the Linux mainline kernel, and discussed the usage model, the user interfaces, and some key kernel interfaces for the benefit of the entire operating systems community. Building on the experience garnered through this dissertation and on recent support for virtualization available in mainline Linux, Linux-CR's checkpoint-restart is transparent, secure, reliable, efficient, and well integrated with the Linux kernel. In this, Linux-CR provides for the first time a mature checkpoint-restart platform on a mainstream operating system that opens up new possibilities for researchers, developers, and users in the broader community.

These systems demonstrate that our checkpoint-restart architectures, enabled by the virtual execution environment, can provide application mobility and other useful new functionality with low overhead while delivering high performance, and can operate transparently with existing operating systems and unmodified applications.

12.1 Future Work

The work developed in this dissertation suggests a number of improvements and challenging questions to consider for future research directions.

Our most immediate goal is to continue the active development and implementation of Linux-CR. A key ingredient to a successful upstream implementation is the understanding by the kernel community of the usefulness of checkpoint-restart. The contributions, code review, and advice we have received, as well as media coverage [37, 38, 51], show that the usefulness of Linux-CR is recognized. We have high hopes that, with the community's help, the project will succeed in providing checkpoint-restart functionality in the upstream kernel.

Although DejaView's execution recording can re-generate state from any past checkpoint, it is not possible to revive sessions from points in time between two successive checkpoints. Therefore, one cannot guarantee that the state in between two consecutive checkpoints will be accurately regenerated. This limitation can be addressed by adding transparent, lightweight, deterministic execution record and replay capabilities such as Scribe [98], to enhance DejaView's PVR functionality with application execution, including rewind and fast forward of execution state.

On today's desktops, visual data is being generated increasingly faster than users themselves can consume it. Even if captured by our eyes, our brain is often unable to keep up with all of the information flow, and the ability of computers to process the

semantics of visual content lags far behind the rate at which output is generated. Providing access to all of the onscreen contents will enhance the computer's ability to process display-centric data. Capture's unique combination of performance and functionality enables new ways of using onscreen content beyond desktop search.

One example is assistive technologies, where screen readers can leverage Capture to provide access to display-centric content previously unavailable to visually impaired users. Recorded information could be used as a virtual hot spot, to notify users of changes to specific desktop entities without explicitly identifying physical screen locations to watch. Users can be notified of events such as changes to specific applications, a status bar that reaches 100% or halts part way for a period of time, or the appearance of a particular phrase onscreen. This will enable blind or partially sighted users who rely on audio translation to more easily perform out-of-order inspection of a document.

Other examples include auditing of an employee's desktop session to comply with regulations; enhancing security by employing anti-phishing or intrusion detection tools that monitor the display contents and react to particular patterns (as opposed to relying on network traffic); automated evaluation of user interfaces, such as on websites, by inspecting their visual output; and predictive GUIs that can learn users' behaviors and speculate on expected next actions based on screen content hints.

Our work on desktop recording introduces a new approach for information collection, storage, and retrieval. While this dissertation touched upon aspects such as indexing and searching the collected data, information retrieval is a very broad topic with many open issues that warrants further research. In particular, our work opens up new directions to explore in further depth questions such as how to efficiently index the data, what search and ranking strategies to employ, and how to present search results to the user in a meaningful manner.

One challenge is determining the indexing granularity, because there is an inherent tradeoff between runtime efficiency and query result quality. Extracting data at a high frequency may raise the recording overhead and inflate the index size, possibly without real gain in result quality, yet doing so infrequently may result in missed data and poor result quality. DejaView continuously generates text-shots, up to multiple times per second, in response to display updates, such that series of text-shots reflect the evolving onscreen content. Because onscreen contents tend to evolve gradually and incrementally, many text-shots are duplicates or near-duplicates. For instance, while each character typed when using a word processor may create a new text-shot, the last text-shot in a typing sequence is often a superset of the previous text-shots. In this case, the "superset" text-shot can faithfully represent the preceding series for indexing purposes.

Therefore, another challenge is developing methods to perform duplicate elimination and aggregation of "similar" data, without compromising the usefulness of the data from the user's perspective. We envision two forms of near-duplicate elimination: offline and online. The offline version will be applied to the stored series of text-shots and replace a contiguous series of text-shots with a corresponding representative text-shot. This will shrink the index size and allow for more efficient search. The online version will be applied to text-shots returned by search queries and merge similar results into fewer "meta-results". This will reduce the number of redundant results and improve the overall user experience. The similarity criterion may depend on the specific query or even the user intention, e.g., differences in text located away from the queried text may or may not be important.

Effective search and ranking strategies are crucial to make DejaView useful to users in practice. More work is needed on quantifying and improving the relevance and presentation of search results by exploring the use of desktop contextual information such as time, persistence, or the relationships among desktop objects. Designing a novel ranking strategy for DejaView that goes beyond relying on time of creation and last access used by other approaches presents many novel challenges. The ranking strategy will have to incorporate the display structure reflected in the text-shots by considering the effect of structure and display layout. In particular, text can be weighted according to its role, e.g., to indicate higher relevance of a document title than a menu button, and according to its onscreen properties, e.g., to prefer results from dominant windows or windows in focus. The ranking strategy will also have to account for noise in recorded information, incorporate forms of information aggregation of "similar" data, and define suitable notions of keyword and window "importance" and "proximity" across the desktop display and across time.

Another open problem is how to integrate DejaView's search with existing desktop search tools that rank and retrieve individual documents. Desktop search can be useful to augment DejaView's search results or to illuminate blind spots, e.g., find data saved in a file that has never been viewed onscreen. Similarly, DejaView's search may also be integrated with Internet search engines.
For instance, we anticipate that users will prefer using DejaView to find information from a past complex web search over retracing their original sequence of queries. A web search on the user's query, or even on relevant parts of DejaView's results, may bring new information that may not have been available during the original search. Furthermore, these search paradigms should integrate seamlessly so that users will not have to choose between search systems.

We envision conducting user studies for several purposes: to reveal usage patterns to better understand how this functionality will be exploited by users over extended periods of time; to explore how the user interface can be enhanced to better fit daily usage needs; to validate DejaView's ranking strategies and tune its weights; and to empirically quantify DejaView's search performance. User studies could be structured, in which users perform predefined tasks and then execute queries for information seen during these tasks, or unstructured, to observe real usage and measure the overall utility of the proposed system and how it might save time and enhance user productivity.

Finally, recording a user's computer activity raises valid privacy and security concerns, as this information could be exploited to infringe upon the user's civil liberties or for criminal purposes. Addressing the privacy and security ramifications of this emerging computing model is another area that calls for further investigation.

Bibliography

[1] GNOME Accessibility Project. http://projects.gnome.org/accessibility, accessed January 2010.

[2] GNOME Accessibility Toolkit. http://library.gnome.org/devel/atk, accessed January 2010.

[3] Anurag Acharya and Mandar Raje. MAPbox: Using Parameterized Behavior Classes to Confine Applications. In Proceedings of the 2000 USENIX Security Symposium, Denver, CO, August 2000.

[4] ACPI. Advanced Configuration and Power Interface Specification, Revision 4.0a. http://www.acpi.info/, April 2010.

[5] Yuvraj Agarwal, Steve Hodges, Ranveer Chandra, James Scott, Paramvir Bahl, and Rajesh Gupta. Somniloquy: Augmenting Network Interfaces to Reduce PC Energy Usage. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Boston, MA, April 2009.

[6] Yuvraj Agarwal, Stefan Savage, and Rajesh Gupta. SleepServer: A Software-Only Approach for Reducing the Energy Consumption of PCs within Enterprise Environments. In Proceedings of the 2010 USENIX Annual Technical Conference, Boston, MA, June 2010.

[7] Adnan Agbaria and Roy Friedman. Starfish: Fault-Tolerant Dynamic MPI Programs on Clusters of Workstations. In Proceedings of the 8th IEEE International Symposium on High Performance Distributed Computing, Redondo Beach, CA, August 1999.

[8] Mark Allman, Ken Christensen, Bruce Nordman, and Vern Paxson. Enabling an Energy-Efficient Future Internet Through Selectively Connected End Systems. In Proceedings of the 6th ACM Workshop on Hot Topics in Networks (HotNets-VI), Atlanta, GA, November 2007.

[9] Apple - Mac OS X Leopard - Features - Spotlight. http://www.apple.com/macosx/leopard/features/300.html#spotlight, accessed January 2010.

[10] Apple - Mac OS X Leopard - Features - Time Machine. http://www.apple.com/macosx/leopard/features/timemachine.html, accessed January 2010.

[11] David H. Bailey, Eric Barszcz, John T. Barton, David S. Browning, Robert L. Carter, Leonardo Dagum, Rod A. Fatoohi, Paul O. Frederickson, Tom A. Lasinski, Robert S. Schreiber, Horst D. Simon, V. Venkatakrishnan, and Sisira K. Weeratunga. The NAS Parallel Benchmarks. The International Journal of Supercomputer Applications, 5(3):63–73, Fall 1991.

[12] Satish Balay, William D. Gropp, Lois C. McInnes, and Barry F. Smith. PETSc 2.0 User’s Manual. Technical Report ANL-95/11 - Revision 2.0.22, Argonne National Laboratory, 1998.

[13] Robert M. Balzer and Neil M. Goldman. Mediating Connectors: A Non-Bypassable Process Wrapping Technology. In Proceedings of the 19th IEEE International Conference on Distributed Computing Systems (ICDCS), Austin, TX, June 1999.

[14] Ricardo Baratto. THINC: A Virtual and Remote Display Architecture for Desktop Computing. PhD thesis, Computer Science Department, Columbia University, April 2011.

[15] Ricardo Baratto, Leonard Kim, and Jason Nieh. THINC: A Virtual Display Architecture for Thin-Client Computing. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP), Brighton, UK, October 2005.

[16] Ricardo Baratto, Shaya Potter, Gong Su, and Jason Nieh. MobiDesk: Mobile Virtual Desktop Computing. In Proceedings of the 10th Annual International Conference on Mobile Computing and Networking (MobiCom), Philadelphia, PA, October 2004.

[17] Adam Beguelin, Erik Seligman, and Peter Stephan. Application Level Fault Tolerance in Heterogeneous Networks of Workstations. Journal of Parallel and Distributed Computing, 43(2):147–155, June 1997.

[18] Andrew Berman, Virgil Bourassa, and Erik Selberg. TRON: Process-Specific File Protection for the UNIX Operating System. In Proceedings of the 1995 USENIX Winter Technical Conference, New Orleans, LA, January 1995.

[19] Sukadev Bhattiprolu, Eric W. Biederman, Serge Hallyn, and Daniel Lezcano. Virtual Servers and Checkpoint/Restart in Mainstream Linux. SIGOPS Operating Systems Review, 42(5), July 2008.

[20] Nilton Bila, Eyal de Lara, Matti Hiltunen, Kaustubh Joshi, H. Andrés Lagar-Cavilla, and Mahadev Satyanarayanan. The Case for Energy-Oriented Partial Desktop Migration. In Proceedings of the 2nd USENIX Conference on Hot Topics in Cloud Computing (HotCloud), Boston, MA, June 2010.

[21] Tom Boyd and Partha Dasgupta. Process Migration: A Generalized Approach Using a Virtualizing Operating System. In Proceedings of the 22nd International Conference on Distributed Computing Systems (ICDCS), Vienna, Austria, July 2002.

[22] Thomas C. Bressoud and Fred B. Schneider. Hypervisor-Based Fault Tolerance. In Proceedings of the 15th ACM Symposium on Operating Systems Principles (SOSP), Copper Mountain, CO, December 1995.

[23] Greg Bronevetsky, Daniel Marques, Keshav Pingali, and Paul Stodghill. C3: A System for Automating Application-Level Checkpointing of MPI Programs. In Proceedings of the 16th International Workshop on Languages and Compilers for Parallel Computers (LCPC), San Diego, CA, October 2003.

[24] BTRFS. http://oss.oracle.com/projects/btrfs.

[25] Vannevar Bush. As We May Think. The Atlantic Monthly, pages 101–108, July 1945.

[26] CamStudio. http://camstudio.org/, accessed January 2010.

[27] Scott Carlson. On The Record, All the Time: Researchers Digitally Capture the Daily Flow of Life. Should They? The Chronicle of Higher Education, 53(23):A30, February 2007.

[28] Jeremy Casas, Dan L. Clark, Rabi Konuru, Steve W. Otto, Robert M. Prouty, and Jonathan Walpole. MPVM: A Migration Transparent Version of PVM. Computing Systems, 8(2):171–216, 1995.

[29] K. Mani Chandy and Leslie Lamport. Distributed Snapshots: Determining Global States of Distributed Systems. ACM Transactions on Computer Systems, 3(1):63–75, February 1985.

[30] Yuqun Chen, James S. Plank, and Kai Li. CLIP - A Checkpointing Tool for Message-Passing Parallel Programs. In Proceedings of the 1997 ACM/IEEE Conference on Supercomputing, San Jose, CA, November 1997.

[31] Kenneth J. Christensen and Franklin ‘Bo’ Gulledge. Enabling Power Management for Network-Attached Computers. International Journal of Network Management, 8(2):120–130, 1998.

[32] Kenneth J. Christensen, Chamara Gunaratne, Bruce Nordman, and Alan D. George. The Next Frontier for Communications Networks: Power Management. Computer Communications, 27(18):1758–1770, 2004.

[33] How to Break Out of a Chroot Jail. http://www.bpfh.net/simes/computing/ chroot-break.html, accessed January 2010.

[34] CKPT. http://pages.cs.wisc.edu/~zandy/ckpt/.

[35] Christopher Clark, Keir Fraser, Steven Hand, Jacob Gorm Hansen, Eric Jul, Christian Limpach, Ian Pratt, and Andrew Warfield. Live Migration of Virtual Machines. In Proceedings of the 2nd Symposium on Networked Systems Design and Implementation (NSDI), Boston, MA, May 2005.

[36] Ira Cohen, Steve Zhang, Moises Goldszmidt, Julie Symons, Terence Kelly, and Armando Fox. Capturing, Indexing, Clustering, and Retrieving System History. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP), Brighton, UK, October 2005.

[37] Jonathan Corbet. Kernel-Based Checkpoint and Restart. Linux Weekly News, August 2008.

[38] Jonathan Corbet. A Checkpoint/Restart Update. http://lwn.net/Articles/ 375855/, February 2010.

[39] Crispin Cowan, Steve Beattie, Greg Kroah-Hartman, Calton Pu, Perry Wagle, and Virgil Gligor. SubDomain: Parsimonious Server Security. In Proceedings of the 14th USENIX Systems Administration Conference (LISA), New Orleans, LA, March 2000.

[40] CryoPID - A Process Freezer for Linux. http://cryopid.berlios.de/.

[41] Tathagata Das, Pradeep Padala, Venkata N. Padmanabhan, Ramachandran Ramjee, and Kang G. Shin. LiteGreen: Saving Energy in Networked Desktops Using Virtualization. In Proceedings of the 2010 USENIX Annual Technical Conference, Boston, MA, June 2010.

[42] Partha Dasgupta and Tom Boyd. Virtualizing Operating System for Seamless Distributed Environments. In Proceedings of the 12th IASTED International Conference on Parallel and Distributed Computing and Systems (PDCS), Las Vegas, NV, August 2000.

[43] Google Desktop Search. http://desktop.google.com/.

[44] Windows Desktop Search. http://desktop.msn.com/.

[45] Yahoo! Desktop Search. http://desktop.yahoo.com/.

[46] Jeff Dike. User Mode Linux. Bruce Perens' Open Source Series. Prentice Hall, 2006.

[47] Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Ian Pratt, Andrew Warfield, Paul Barham, and Rolf Neugebauer. Xen and the Art of Virtualization. In Proceedings of the 19th ACM Symposium on Operating Systems Principles (SOSP), Bolton Landing, NY, October 2003.

[48] Jason Duell, Paul H. Hargrove, and Eric Roman. The Design and Implementation of Berkeley Lab's Linux Checkpoint/Restart. Technical report, Berkeley Lab, 2002. Publication LBNL-54941.

[49] Susan Dumais, Edward Cutrell, JJ Cadiz, Gavin Jancke, Raman Sarin, and Daniel C. Robbins. Stuff I've Seen: A System for Personal Information Retrieval and Re-use. In Proceedings of the 26th Annual International ACM SIGIR Conference, Toronto, Canada, July-August 2003.

[50] George W. Dunlap, Samuel T. King, Sukru Cinar, Murtaza A. Basrai, and Peter M. Chen. ReVirt: Enabling Intrusion Analysis Through Virtual-Machine Logging and Replay. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), Boston, MA, December 2002.

[51] Jake Edge. Checkpoint/Restart Tries to Head Towards the Mainline. Linux Weekly News, February 2009.

[52] EPCKPT. http://www.research.rutgers.edu/~edpin/epckpt/.

[53] EXT4. ext4.wiki.kernel.org.

[54] Graham E. Fagg and Jack Dongarra. FT-MPI: Fault Tolerant MPI, Supporting Dynamic Applications in a Dynamic World. In LNCS: Proceedings of the 7th European PVM/MPI User's Group Meeting, volume 1908, pages 346–353. Springer-Verlag, 2000.

[55] Kevin Fall. A Delay-Tolerant Network Architecture for Challenged Internets. In Proceedings of the 2003 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (SIGCOMM), Karlsruhe, Germany, August 2003.

[56] Stuart I. Feldman and Channing B. Brown. IGOR: A System for Program Debugging via Reversible Execution. In Proceedings of the 1988 ACM Workshop on Parallel and Distributed Debugging, Madison, WI, May 1988.

[57] Firefox Bug 372201. https://bugzilla.mozilla.org/show_bug.cgi?id=372201.

[58] Krisztián Flautner, Steve Reinhardt, and Trevor Mudge. Automatic Performance Setting for Dynamic Voltage Scaling. In Proceedings of the 6th Annual International Conference on Mobile Computing and Networking (MobiCom), Rome, Italy, July 2001.

[59] Timothy Fraser, Lee Badger, and Mark Feldman. Hardening COTS Software with Generic Software Wrappers. In Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, May 1999.

[60] Eric T. Freeman. The Lifestreams Software Architecture. PhD thesis, Computer Science Department, Yale University, May 1997.

[61] Tal Garfinkel. Traps and Pitfalls: Practical Problems in System Call Interposition Based Security Tools. In Proceedings of the Network and Distributed Systems Security Symposium, San Diego, CA, February 2003.

[62] Tal Garfinkel, Ben Pfaff, and Mendel Rosenblum. Ostia: A Delegating Architecture for Secure System Call Interposition. In Proceedings of the Network and Distributed Systems Security Symposium, San Diego, CA, February 2004.

[63] Al Geist, Adam Beguelin, Jack Dongarra, Weicheng Jiang, Robert Manchek, and Vaidy Sunderam. PVM: Parallel Virtual Machine. MIT Press, 2002.

[64] Jim Gemmell, Gordon Bell, and Roger Lueder. MyLifeBits: A Personal Database for Everything. Communications of the ACM, 49(1):88–95, January 2006.

[65] Douglas P. Ghormley, David Petrou, Steven H. Rodriguez, and Thomas E. Anderson. SLIC: An Extensibility System for Commodity Operating Systems. In Proceedings of the 1998 USENIX Annual Technical Conference, New Orleans, LA, June 1998.

[66] Roberto Gioiosa, Jose Carlos Sancho, Song Jiang, and Fabrizio Petrini. Transparent, Incremental Checkpointing At Kernel Level: a Foundation for Fault Tolerance for Parallel Computers. In Proceedings of the 2005 ACM/IEEE Conference on Supercomputing, Seattle, WA, November 2005.

[67] Terry Gliedt. Measuring Power Consumption. http://www.hps.com/~tpg/notebook/power.php, accessed July 2010.

[68] Ian Goldberg, David Wagner, Randi Thomas, and Eric A. Brewer. A Secure Environment for Untrusted Helper Applications: Confining the Wily Hacker. In Proceedings of the 1996 USENIX Annual Technical Conference, San Jose, CA, July 1996.

[69] Luis Gravano, Panagiotis G. Ipeirotis, and Mehran Sahami. QProber: A System for Automatic Classification of Hidden-Web Databases. ACM Transactions on Information Systems, 21(1), January 2003.

[70] Mark Grechanik, Qing Xie, and Chen Fu. Creating GUI Testing Tools Using Accessibility Technologies. In Proceedings of the IEEE International Conference on Software Testing, Verification, and Validation Workshops (ICSTW), Washington, DC, USA, 2009.

[71] William Gropp. MPICH2: A New Start for MPI Implementations. In LNCS: Proceedings of the 9th European PVM/MPI Users’ Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface, volume 2474, London, UK, 2002. Springer-Verlag.

[72] Trusted Computing Group. TCG TPM Specification Version 1.2 - Part 1 Design Principles. http://www.trustedcomputinggroup.org/, 2005.

[73] Chamara Gunaratne, Ken Christensen, and Bruce Nordman. Managing Energy Consumption Costs in Desktop PCs and LAN Switches with Proxying, Split TCP Connections, and Scaling of Link Speed. International Journal of Network Management, 15(5):297–310, 2005.

[74] Puja Gupta, Harikesavan Krishnan, Charles P. Wright, Mohammad N. Zubair, Jay Dave, and Erez Zadok. Versatility and Unix Semantics in a Fan-Out Unification File System. Technical Report FSL-04-1, Department of Computer Science, Stony Brook University, January 2004.

[75] Karl Gyllstrom. Passages Through Time: Chronicling Users' Information Interaction History by Recording When and What They Read. In Proceedings of

the 13th International Conference on Intelligent User Interfaces (IUI), Sanibel Island, FL, February 2009.

[76] Hackbench. http://developer.osdl.org/craiger/hackbench.

[77] Shelley Haven. New Assistive Technology Software on Public Clusters. Speaking of Computers: an e-newsletter for the Stanford academic community, 5(69), 2005. Available online at: http://speaking.stanford.edu/Back_Issues/SOC69/issue-computing.html.

[78] INDRI: Language Modeling Meets Inference Networks. http://www.lemurproject.org/indri/.

[79] Intel. Enhanced Intel SpeedStep® Technology for the Intel® Pentium® M Processor. Technical Report 301170-001, Intel, March 2004.

[80] Sotiris Ioannidis, Steven M. Bellovin, and Jonathan M. Smith. Sub-Operating Systems: A New Approach to Application Security. In Proceedings of the 10th ACM SIGOPS European Workshop, Saint-Emillon, France, July 2002.

[81] Kapil Jain and R. Sekar. User-Level Infrastructure for System Call Interposition: A Platform for Intrusion Detection and Confinement. In Proceedings of the Network and Distributed System Security Symposium, San Diego, CA, February 2000.

[82] John Janakiraman, Jose R. Santos, Dinesh Subhraveti, and Yoshio Turner. Cruz: Application-Transparent Distributed Checkpoint-Restart on Standard Operating Systems. In Proceedings of the International Conference on Dependable Systems and Networks (DSN), Yokohama, Japan, June 2005.

[83] JAWS for Windows Headquarters. http://www.freedomscientific.com/jaws-hq.asp, accessed January 2010.

[84] Michael B. Jones. Interposition Agents: Transparently Interposing User Code at the System Interface. In Proceedings of the 14th ACM Symposium on Operating Systems Principles (SOSP), Asheville, NC, December 1993.

[85] Ashok Joshi, William Bridge, Juan Loaiza, and Tirthankar Lahiri. Checkpointing in Oracle. In Proceedings of 24th International Conference on Very Large Data Bases (VLDB), New York, NY, August 1998.

[86] Laxmikant V. Kale and Sanjeev Krishnan. CHARM++: A Portable Concurrent Object Oriented System Based on C++. In Proceedings of the 8th Annual Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA), Washington, DC, September 1993.

[87] Poul-Henning Kamp and Robert N. M. Watson. Jails: Confining the Omnipotent Root. In Proceedings of the 2nd International SANE Conference, Maastricht, The Netherlands, May 2000.

[88] Samuel T. King, George W. Dunlap, and Peter M. Chen. Debugging Operating Systems with Time-Traveling Virtual Machines. In Proceedings of the 2005 USENIX Annual Technical Conference, Anaheim, CA, April 2005.

[89] B. A. Kingsbury and J. T. Kline. Job and Process Recovery in a UNIX-based Operating System. In Proceedings of the 1989 USENIX Winter Technical Conference, San Diego, CA, January 1989.

[90] Angkul Kongmunvattana, Santipong Tanchatchawal, and Nian-Feng Tzeng. Coherence-based Coordinated Checkpointing for Software Distributed Shared

Memory Systems. In Proceedings of the 20th International Conference on Distributed Computing Systems (ICDCS), Taipei, Taiwan, April 2000.

[91] Ryusuke Konishi, Yoshiji Amagai, Koji Sato, Hisashi Hifumi, Seiji Kihara, and Satoshi Moriai. The Linux Implementation of a Log-structured File System. ACM SIGOPS Operating Systems Review, 40(3):102–107, 2006.

[92] Eduardo Krell and Balachander Krishnamurthy. COLA: Customized Overlaying. In Proceedings of the 1992 USENIX Winter Technical Conference, San Francisco, CA, January 1992.

[93] Oren Laadan, Ricardo A. Baratto, Dan Phung, Shaya Potter, and Jason Nieh. DejaView: A Personal Virtual Computer Recorder. In Proceedings of the 21st ACM Symposium on Operating Systems Principles (SOSP), Stevenson, WA, October 2007.

[94] Oren Laadan and Serge E. Hallyn. Linux-CR: Transparent Application Checkpoint-Restart in Linux. In Proceedings of the 12th Annual Ottawa Linux Symposium (OLS), Ottawa, Canada, July 2010.

[95] Oren Laadan and Jason Nieh. Transparent Checkpoint-Restart of Multiple Processes on Commodity Operating Systems. In Proceedings of the 2007 USENIX Annual Technical Conference, Santa Clara, CA, June 2007.

[96] Oren Laadan and Jason Nieh. Operating System Virtualization: Practice and Experience. In Proceedings of the 3rd Annual Haifa Experimental Systems Conference (SYSTOR), Haifa, Israel, May 2010.

[97] Oren Laadan, Dan Phung, and Jason Nieh. Transparent Checkpoint/Restart of Distributed Applications on Commodity Clusters. In Proceedings of the

IEEE International Conference on Cluster Computing (Cluster), Boston, MA, September 2005.

[98] Oren Laadan, Nicolas Viennot, and Jason Nieh. Transparent, Lightweight Application Execution Replay on Commodity Multiprocessor Operating Systems. In Proceedings of the ACM SIGMETRICS 2010 Conference on Measurement and Modeling of Computer Systems, New York, NY, June 2010.

[99] Charles R. Landau. The Checkpoint Mechanism in KeyKOS. In Proceedings of the 2nd International Workshop on Object Orientation in Operating Systems, Dourdan, France, September 1992.

[100] Sheng Feng Li, Quentin Stafford-Fraser, and Andy Hopper. Integrating Synchronous and Asynchronous Collaboration with Virtual Network Computing. IEEE Internet Computing, 4(3):26–33, May-June 2000.

[101] Philip Lieberman. Wake-on-LAN Technology. http://www.liebsoft.com/pdfs/Wake_On_LAN.pdf, accessed July 2010.

[102] Linux Software Suspend. http://www.suspend2.net/.

[103] GNU/Linux Desktop Testing Project. http://ldtp.freedesktop.org/, accessed January 2010.

[104] Linux Checkpoint-Restart Test Suite. http://www.linux-cr.org/git/?p=tests-cr.git.

[105] LVM2 Resource Page. http://sources.redhat.com/lvm2/.

[106] PID Namespaces in the 2.6.24 Kernel. http://lwn.net/Articles/259217/.

[107] Libvirt LXC Container Driver. http://www.libvir.org/drvlxc.html.

[108] Linux Containers. http://lxc.sf.net/.

[109] LXC: Linux Container Tools. http://www.ibm.com/developerworks/linux/library/l-lxc-containers.

[110] Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schuetze. Introduction to Information Retrieval. Cambridge University Press, 2008.

[111] Marshall K. McKusick, Keith Bostic, Michael J. Karels, and John S. Quarterman. The Design and Implementation of the 4.4BSD Operating System. Addison-Wesley, 1996.

[112] Larry W. McVoy and Carl Staelin. LMbench: Portable Tools for Performance Analysis. In Proceedings of the 1996 USENIX Annual Technical Conference, San Jose, CA, July 1996.

[113] Jeffrey Mogul, Lawrence Brakmo, David E. Lowell, Dinesh Subhraveti, and Justin Moore. Unveiling the Transport. ACM SIGCOMM Computer Communications Review, 34(1), 2004.

[114] Microsoft Active Accessibility. http://msdn.microsoft.com/en-us/library/ms971350.aspx, accessed January 2010.

[115] Myrinet Index Page. http://www.myri.com/myrinet.

[116] Sergiu Nedevschi, Jaideep Chandrashekar, Junda Liu, Bruce Nordman, Sylvia Ratnasamy, and Nina Taft. Skilled in the Art of Being Idle: Reducing Energy Waste in Networked Systems. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI), Boston, MA, April 2009.

[117] Network Appliance, Inc. http://www.netapp.com/.

[118] Netfilter Web Page. http://www.netfilter.org/.

[119] Jakob Nielsen. Designing Web Usability. New Riders Publishing, 2000.

[120] OpenMemex. http://openmemex.devjavu.com/.

[121] OpenVZ. http://www.openvz.org/.

[122] Parallels Virtuozzo Containers. http://www.parallels.com/products/virtuozzo.

[123] Orca Screen Reader. http://live.gnome.org/Orca, accessed January 2010.

[124] Steven Osman, Dinesh Subhraveti, Gong Su, and Jason Nieh. The Design and Implementation of Zap: A System for Migrating Computing Environments. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), Boston, MA, December 2002.

[125] Teresia R. Ostrach. Typing Speed: How Fast is Average. http://www.readi.info/TypingSpeed.pdf, 1997.

[126] Kieron O’Hara, Mischa Tuffield, and Nigel Shadbolt. Lifelogging: Privacy and Empowerment with Memories for Life. Identity in the Information Society, 1:155–172, 2008.

[127] David Patterson and Jim Gray. A Conversation with Jim Gray. ACM Queue, 1(3), June 2003.

[128] James S. Plank. An Overview of Checkpointing in Uniprocessor and Distributed Systems, Focusing on Implementation and Performance. Technical Report UT-CS-97-372, Department of Computer Science, University of Tennessee, July 1997.

[129] James S. Plank, Micah Beck, Gerry Kingsley, and Kai Li. Libckpt: Transparent Checkpointing under Unix. In Proceedings of the 1995 USENIX Winter Technical Conference, New Orleans, LA, January 1995.

[130] James S. Plank, Jian Xu, and Robert H. B. Netzer. Compressed Differences: An Algorithm for Fast Incremental Checkpointing. Technical Report CS-95-302, University of Tennessee, August 1995.

[131] Gerald J. Popek and Robert P. Goldberg. Formal Requirements for Virtualizable Third Generation Architectures. Communications of the ACM, 17(7):412–421, July 1974.

[132] Shaya Potter. Virtualization Mechanisms for Mobility, Security and System Ad- ministration. PhD thesis, Computer Science Department, Columbia University, March 2010.

[133] Shaya Potter, Ricardo Baratto, Oren Laadan, Leonard Kim, and Jason Nieh. MediaPod: A Personalized Multimedia Desktop In Your Pocket. In Proceedings of the 11th IEEE International Symposium on Multimedia (ISM), San Diego, CA, December 2009.

[134] Shaya Potter, Ricardo Baratto, Oren Laadan, and Jason Nieh. GamePod: Persistent Gaming Sessions on Pocketable Storage Devices. In Proceedings of the 3rd International Conference on Mobile Ubiquitous Computing, Systems, Services, and Technologies (UBICOMM), Sliema, Malta, October 2009.

[135] Shaya Potter and Jason Nieh. AutoPod: Unscheduled System Updates with Zero Data Loss. In Proceedings of the 2nd IEEE International Conference on Autonomic Computing (ICAC), Seattle, WA, June 2005. (Abstract).

[136] Shaya Potter and Jason Nieh. Apiary: Easy-to-Use Desktop Application Fault Containment on Commodity Operating Systems. In Proceedings of the 2010 USENIX Annual Technical Conference, Boston, MA, June 2010.

[137] POV-Ray Web Page. http://www.povray.org/.

[138] Daniel Price and Andrew Tucker. Solaris Zones: Operating System Support for Consolidating Commercial Workloads. In Proceedings of the 18th Large Installation System Administration Conference (LISA), Atlanta, GA, November 2004.

[139] Niels Provos. Improving Host Security with System Call Policies. In Proceedings of the 12th USENIX Security Symposium, Washington, DC, August 2003.

[140] Jim Pruyne and Miron Livny. Managing Checkpoints for Parallel Programs. In Proceedings of the 2nd Workshop on Job Scheduling Strategies for Parallel Processing (IPPS), Honolulu, Hawaii, April 1996.

[141] Pradeep Purushothaman, Mukund Navada, Rajagopal Subramaniyan, Casey Reardon, and Alan D. George. Power-Proxying on the NIC: A Case Study with the Gnutella File-Sharing Protocol. In Proceedings of the 31st IEEE Conference on Local Computer Networks, Tampa, FL, November 2006.

[142] Feng Qin, Joseph Tucek, Jagadeesan Sundaresan, and Yuanyuan Zhou. Rx: Treating Bugs as Allergies—a Safe Method to Survive Software Failures. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP), Brighton, UK, October 2005.

[143] Eric Roman. A Survey of Checkpoint/Restart Implementations. Technical Report LBNL-54942, Lawrence Berkeley National Laboratory, July 2002.

[144] Karthikeyan Sabhanatarajan, Ann Gordon-Ross, Mark Oden, Mukund Navada, and Alan George. Smart-NICs: Power Proxying for Reduced Power Consumption in Network Edge Devices. In Proceedings of the IEEE Annual Symposium on VLSI (ISVLSI), Montpellier, France, April 2008.

[145] Jose C. Sancho, Fabrizio Petrini, Kei Davis, Roberto Gioiosa, and Song Jiang. Current Practice and a Direction Forward in Checkpoint/Restart Implementations for Fault Tolerance. In Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS) - Workshop 18, Washington, DC, April 2005.

[146] Sriram Sankaran, Jeffrey M. Squyres, Brian Barrett, Andrew Lumsdaine, Jason Duell, Paul Hargrove, and Eric Roman. The LAM/MPI Checkpoint/Restart Framework: System-Initiated Checkpointing. International Journal of High Performance Computing Applications, 19(4):479–493, Winter 2005.

[147] Constantine P. Sapuntzakis, Ramesh Chandra, Ben Pfaff, Jim Chow, Monica S. Lam, and Mendel Rosenblum. Optimizing the Migration of Virtual Computers. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), Boston, MA, December 2002.

[148] Brian K. Schmidt. Supporting Ubiquitous Computing with Stateless Consoles and Computation Caches. PhD thesis, Computer Science Department, Stanford University, August 2000.

[149] Eugene Shih, Paramvir Bahl, and Michael J. Sinclair. Wake on Wireless: an Event Driven Energy Saving Strategy for Battery Operated Devices. In Proceedings of the 8th Annual International Conference on Mobile Computing and Networking (MobiCom), Atlanta, GA, September 2002.

[150] Ben Shneiderman. Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley, 2nd edition, 1992.

[151] Suresh Siddha, Venkatesh Pallipadi, and Arjan Van De Ven. Getting Maximum Mileage out of Tickless. In Proceedings of the 9th Annual Ottawa Linux Symposium (OLS), Ottawa, Canada, June 2007.

[152] Stelios Sidiroglou, Oren Laadan, Angelos D. Keromytis, and Jason Nieh. Using Rescue Points to Navigate Software Recovery. In Proceedings of the 2007 IEEE Symposium on Security and Privacy (SP), Oakland, CA, May 2007.

[153] Stelios Sidiroglou, Oren Laadan, Nicolas Viennot, Carlos-René Pérez, Angelos D. Keromytis, and Jason Nieh. ASSURE: Automatic Software Self-healing Using Rescue Points. In Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), Washington, DC, March 2009.

[154] Marc Snir, Steve Otto, Steven Huss-Lederman, David Walker, and Jack Dongarra. MPI: The Complete Reference (Vol. 1) - 2nd Edition. MIT Press, 1998.

[155] Steven R. Soltis, Thomas M. Ruwart, and Matthew T. O’Keefe. The Global File System. In Proceedings of the 5th NASA Goddard Conference on Mass Storage Systems, College Park, MD, September 1996.

[156] Jacob Sorber, Nilanjan Banerjee, Mark D. Corner, and Sami Rollins. Turducken: Hierarchical Power Management for Mobile Devices. In Proceedings of the 3rd International Conference on Mobile Systems, Applications, and Services (MobiSys), Seattle, WA, June 2005.

[157] Craig Soules and Greg Ganger. Connections: Using Context to Enhance File Search. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP), Brighton, UK, October 2005.

[158] Sudarshan M. Srinivasan, Srikanth Kandula, Christopher R. Andrews, and Yuanyuan Zhou. Flashback: A Lightweight Extension for Rollback and Deterministic Replay for Software Debugging. In Proceedings of the 2004 USENIX Annual Technical Conference, Boston, MA, June 2004.

[159] W. Richard Stevens. Advanced Programming in the UNIX Environment. Professional Computing Series. Addison-Wesley, 1993.

[160] W. Richard Stevens. Unix Network Programming, volume 1. Prentice Hall, NJ, USA, 2nd edition, 1998.

[161] Daniel Stolzenberg and Stefan Pforte. Lecture Recording: Structural and Symbolic Information vs. Flexibility of Presentation. The Electronic Journal of e-Learning, 5(3):219–226, 2007. Available online at http://www.ejel.org/.

[162] Gong Su. MOVE: Mobility with Persistent Network Connections. PhD thesis, Computer Science Department, Columbia University, October 2004.

[163] Toshiyuki Takahashi, Shinji Sumimoto, Atsushi Hori, and Yutaka Ishikawa. PM2: High Performance Communication Middleware for Heterogeneous Network Environments. In Proceedings of the 2000 IEEE/ACM Conference on Supercomputing, Dallas, TX, November 2000.

[164] Todd Tannenbaum and Mike Litzkow. The Condor Distributed Processing System. Dr. Dobb’s Journal, 20(227):40–48, February 1995.

[165] Marvin M. Theimer, Keith A. Lantz, and David R. Cheriton. Preemptible Remote Execution Facilities for the V-system. In Proceedings of the 10th ACM Symposium on Operating Systems Principles (SOSP), Orcas Island, WA, December 1985.

[166] Andrew Tridgell. Efficient Algorithms for Sorting and Synchronization. PhD thesis, Computer Science Department, The Australian National University, February 1999.

[167] Tsearch2 - Full Text Extension for PostgreSQL. http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/.

[168] Andy Tucker. Andy Tucker’s Blog: What Was That Again? http://atucker.typepad.com/blog/2006/11/what_was_that_a.html.

[169] VMware, Inc. http://www.vmware.com/.

[170] VMware Movie Capture. http://www.vmware.com/support/ws5/doc/ws_running_capture.html.

[171] VMware Workstation 6 Beta. http://www.vmware.com/products/beta/ws/.

[172] VNCrec—Simple VNC Session Recorder and Player. http://www.sodan.org/~penny/vncrec/, accessed January 2010.

[173] Linux VServer Project. http://www.linux-vserver.org/.

[174] David Wagner. Janus: An Approach for Confinement of Untrusted Applications. Master’s thesis, Computer Science Department, University of California, Berkeley, August 1999.

[175] Yi-Min Wang, Chandra Kintala, and Yennun Huang. Software Tools and Libraries for Fault Tolerance. IEEE Bulletin of the Technical Committee on Operating System and Application Environments, 7(4):5–9, Winter 1995.

[176] Bing Web Search. http://www.bing.com/.

[177] Google Web Search. http://www.google.com/.

[178] Yahoo! Web Search. http://www.yahoo.com/.

[179] Andrew Whitaker, Richard S. Cox, and Steven D. Gribble. Configuration Debugging as Search: Finding the Needle in the Haystack. In Proceedings of the 6th Symposium on Operating Systems Design and Implementation (OSDI), San Francisco, CA, December 2004.

[180] Andrew Whitaker, Marianne Shaw, and Steven D. Gribble. Scale and Performance in the Denali Isolation Kernel. In Proceedings of the 5th Symposium on Operating Systems Design and Implementation (OSDI), Boston, MA, December 2002.

[181] Microsoft Application Virtualization. http://www.microsoft.com/systemcenter/appv/default.mspx.

[182] WinFS. http://msdn.microsoft.com/data/WinFS/.

[183] Charles P. Wright, Jay Dave, Puja Gupta, Harikesavan Krishnan, David P. Quigley, Erez Zadok, and Mohammad N. Zubair. Versatility and Unix Semantics in Namespace Unification. ACM Transactions on Storage, 2(1), March 2006.

[184] Xapian. http://www.xapian.org/.

[185] xvidcap. http://xvidcap.sourceforge.net/.

[186] Erez Zadok. FiST: A System for Stackable File System Code Generation. PhD thesis, Computer Science Department, Columbia University, May 2001.

[187] Edward R. Zayas. Attacking the Process Migration Bottleneck. In Proceedings of the 11th ACM Symposium on Operating Systems Principles (SOSP), Austin, TX, November 1987.

[188] Hua Zhong and Jason Nieh. CRAK: Linux Checkpoint/Restart as a Kernel Module. Technical Report CUCS-014-01, Department of Computer Science, Columbia University, New York, November 2001.

[189] Peter Ziewer. Navigational Indices and Full-Text Search by Automated Analyses of Screen Recorded Data. In Proceedings of World Conference on E-Learning in Corporate, Government, Healthcare, and Higher Education (ELEARN), Washington, DC, November 2004.