Performance and Protection in the Zofs User-Space NVM File System Mingkai Dong, Heng Bu, Jifei Yi, Benchao Dong, Haibo Chen
Total Page:16
File Type:pdf, Size:1020Kb
Performance and Protection in the ZoFS User-space NVM File System Mingkai Dong, Heng Bu, Jifei Yi, Benchao Dong, Haibo Chen Institute of Parallel and Distributed Systems (IPADS), Shanghai Jiao Tong University Oct. 2019 @SOSP ’19 Non-volatile memory (NVM) is coming with attractive features § Fast – Near-DRAM performance § Persistent – Durable data storage § Byte-addressable – CPU load/store access 2 File systems are designed for NVM § NVM File systems in kernel § User-space NVM file systems[1] – BPFS [SOSP ’09] – Aerie [EuroSys ’14] – PMFS [EuroSys ’14] – Strata [SOSP ’17] – NOVA [FAST ’16, SOSP ’17] – SoupFS [USENIX ATC ’17] [1] These file systems also require kernel part supports. 3 User-space NVM file systems have benefits § User-space NVM file systems[1] – Aerie [EuroSys ’14] – Strata [SOSP ’17] üEasier to develop, port, and maintain[2] üFleXible[3] üHigh-performance due to kernel bypass[3,4] [1] These file systems also require kernel part supports. [2] To FUSE or Not to FUSE: Performance of User-Space File Systems, FAST ’17 [3] Aerie: Flexible File-System Interfaces to Storage-Class Memory, EuroSys ’14 [4] Strata: A Cross Media File System, SOSP ’17 4 Metadata is indirectly updated in user space § Updates to metadata are performed by trusted components – Trusted FS Service in Aerie – Kernel FS in Strata Update data Update metadata Aerie direct write via IPCs Indirect updates! append a log in user space, Strata digest in kernel 5 Indirect updates are important but limit performance Create empty files in a shared directory § Indirect updates protect metadata 300 283.97 Strata – File system integrity 250 – Access control 200 × § Indirect updates limit performance! (us) 67 150 Latency 100 50 4.19 0 1 Process 2 Processes 6 Goal: fully exploit NVM performance § Problem: Indirect updates protect metadata but limit performance § Our approach: Directly manage both data and metadata in user- space libraries while retaining protection – Coffer: separate NVM protection from management – The kernel part protects coffers via paging – User-space libraries manage in-coffer structures (file data and metadata) § Results: Outperform existing systems by up to 82% and exploit full NVM bandwidth in some scenarios 7 Outline § Coffer § Protection and isolation § Evaluation 8 Files are stored with the same permissions that rarely change § Survey on database and webserver data files directory with rwx------ directory with rwxr-xr-x PostgreSQL 2% DokuWiki 5% 1835 files 20976 files Uid: postgres Uid: http Gid: postgres Gid: http regular file with rw------- regular file with rw-r--r-- 98% 95% – Most files have the same ownership & permission 9 Files are stored with the same permissions that rarely change § Survey on database and webserver data files – Most files have the same ownership & permission – Ownership & permissions are seldom changed 1. Group files with the same ownership & permission 2. Map their data and metadata to user space 3. Let user-space libraries manage these data and metadata directly 10 A new abstraction: Coffer / data home var Carol Alice Bob papers music coffee games work mine others work 11 A new abstraction: Coffer Group files with the same ownership & permission Coffer / data home var Carol Alice Bob papers music coffee games work mine others work 12 A new abstraction: Coffer Group files with the same ownership & permission Coffer / data home var Coffer Carol Alice Bob papers music coffee games work mine others work 13 A new abstraction: Coffer Group files with the same ownership & permission Coffer / data home var Coffer Coffer Carol Bob Coffer Alice papers music coffee games mine others Coffer Coffer work work 14 Coffer internals A collection of NVM pages that share the same Coffer / ownership & permission data home var § Files are organized by Metadata: Path,UID,GID,Perm.,... Coffer Coffer user-space FS libraries Coffer-local Free Pages Bob Root Page § Local space management ... Carol Coffer Alice § A root page with metadata papers music coffee – Path games mine others – Owner and permission Coffer Coffer work work 15 Coffer separates NVM protection from management Coffer / data home var Metadata: Path,UID,GID,Perm.,... Coffer Coffer Coffer-local Free Pages Bob Root Page ... Carol Coffer Alice papers music coffee games mine others Coffer Coffer work work 16 Coffer separates NVM protection from management KernFS § Protect coffer metadata § Mange global free space Super Coffer/ block datahome var Page Coffer Coffer Bob Allocation Root Page Table Coffer Carol Alice games KernFS Path- papersmusiccoffee coffer mineothers Cofferwork mappings workCoffer 17 Coffer separates NVM protection from management FSLibs / Process Coffer FSLibs / Process KernFS Coffer Coffer-local Free Pages / Coffer Root Page / ... data home var § Protect coffer metadata Carol data home var papers music coffee § Mange global free space mine others µFS µFS1 µFS2 FSLibs Coffer-API Memory Mappings § User-space FS libraries (µFSs) Super Coffer/ block datahome var § Manage in-coffer structures Page Coffer Coffer Bob (data, metadata and free space) Allocation Root Page Table Coffer Carol Alice games KernFS Coffer-APIs Path- papersmusiccoffee coffer mineothers workCoffer Cofferwork § create, map, split, ... mappings 18 Outline § Coffer § Protection and isolation § Evaluation 19 Protection and isolation Hardware paging § For each coffer map request – KernFS checks the ownership and permission of the coffer – Map the coffer to the process page table read-only/read-write accordingly § Applications can access a coffer only if they have the permission 20 Protection and isolation Hardware paging § For each coffer map request– KernFSMap thechecks coffer the to ownershipthe process and page permission table read of- theonly/read coffer-write accordingly § Applications can access a coffer only if they have the permission Memory protection keys A hardware feature that supplements paging Process Region 1 Region 2 Region 1 Region 3 Region 0 Region 1 Region 0 VM space Region 0: rw PKRU Region 1: r- § MPK permission violations Region 2: -- ... segmentation faults Region 15: -- 21 Protection and isolation Hardware paging § For each coffer map request– KernFSMap thechecks coffer the to ownerthe process and permission page table of read the- cofferonly/read-write accordingly § Applications can access a coffer only if they have the permission Memory protection keys § KernFS separates different coffers to different memory protection regions for each process § Application threads can control its access to each coffer efficiently 22 Protection and isolation Hardware paging § For each coffer map request– KernFSMap thechecks coffer the to ownerthe process and permission page table of read the- cofferonly/read-write accordingly § Applications can access a coffer only if they have the permission Memory protection keys § KernFS separates different coffers to different memory protection regions for each process § Application threads can control its access to each coffer efficiently Challenges 1. Stray writes 2. Malicious manipulations 3. Fate sharing 23 Challenge 1: stray writes Coffer / data home var Coffer Coffer Carol Bob Coffer Alice papers music coffee games mine others Coffer Coffer work work Challenge 1: stray writes Problem: Stray writes corrupt metadata in mapped coffers Coffer Approach: write windows[1] / § MPK regions are initialized as non-accessible data home § When a µFS modifies a coffer PKRU Coffer 1. Enable coffer access Carol 2. Modify coffer Region R W 3. Disable coffer access Coffer Blue R- W- papers music coffee ... - - mine others pop Result: Stray writes in application code cause segmentation faults due to MPK Coffer work [1] System Software for Persistent Memory, EuroSys ’14 Challenge 2: malicious manipulations Problem: An attacker manipulates a shared coffer to attack others! Case: manipulate a pointer to point to another coffer Coffer / data home var Coffer Carol Coffer papers music coffee games mine others Coffer Challenge 2: malicious manipulations Problem: An attacker manipulates a shared coffer to attack others! Case: manipulate a pointer to point to another coffer RW Attacker Shared Coffer/ RW data home var Victim RW Private Coffer Carol Coffer papers music coffee games mine others Coffer Challenge 2: malicious manipulations Problem: An attacker manipulates a shared coffer to attack others! Case: manipulate a pointer to point to another coffer RW The victim accesses the private coffer by mistake Attacker Shared Coffer/ RW data home var Victim RW Private Coffer Carol Coffer papers music coffee games mine others Coffer Challenge 2: malicious manipulations Problem: An attacker manipulates a shared coffer to attack others! Case: manipulate a pointer to point to another coffer RW The victim accesses the private coffer by mistake Attacker Shared Coffer/ RW Victim data home var Approach: At most one coffer is accessible RW at any time for each thread Private Coffer Carol Coffer Result: Following manipulated pointers papers music coffee triggers segmentation faults! games mine others Coffer Challenge 3: fate sharing Problem: An error in FS libraries can terminate the whole process! Approach: SIGSEGV Handler 1. Setjmp before user-space FS operations ... check_segv_reason(...); 2. Hook the SIGSEGV handler longjmp(&ctx, errcode); FS Lib ... 3. Jump back and return error code if ((err = setjmp(&ctx))) { return err; } void call_fs_mkdir(...) call_fs_mkdir(...); { Result: Segmentation faults are ... ... ... // seg fault here reported to the application ... } as an