Quick viewing(Text Mode)

NOVA: the Fastest File System for Nvdimms

NOVA: the Fastest File System for Nvdimms

NOVA: The Fastest for NVDIMMs

Steven Swanson, UC San Diego

XFS F2FS NILFS

EXT4

© 2017 SNIA Persistent Memory Summit. All Rights Reserved.

Disk-based file systems are inadequate for NVMM

Disk-based file systems cannot 1-Sector 1- N-Block 1-Sector 1-Block N-Block Atomicity overwrit overwrit overwrit append append append exploit NVMM performance e e e ✓ ✗ ✗ ✗ ✗ ✗ wb Ext4 ✓ ✓ ✗ ✓ ✗ ✓ Performance optimization Order Ext4 ✓ ✓ ✓ ✓ ✗ ✓ compromises consistency on Dataj system failure [1] Btrfs ✓ ✓ ✓ ✓ ✗ ✓ ✓ ✓ ✗ ✓ ✗ ✓ Reiserfs ✓ ✓ ✗ ✓ ✗ ✓

[1] Pillai et al, All File Systems Are Not Created Equal: On the Complexity of Crafting Crash-Consistent Applications, OSDI '14.

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. BPFS SCMFS PMFS Aerie EXT4-DAX XFS-DAX NOVA M1FS

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. Previous Prototype NVMM file systems are not strongly consistent

DAX does not provide data ATomic Atomicity Metadata Data Snapshot atomicity guarantee Mmap [1] So programming is more BPFS ✓ ✓ [2] ✗ ✗ difficult PMFS ✓ ✗ ✗ ✗ Ext4-DAX ✓ ✗ ✗ ✗

Xfs-DAX ✓ ✗ ✗ ✗

SCMFS ✗ ✗ ✗ ✗

Aerie ✓ ✗ ✗ ✗

© 2017 SNIA Persistent Memory Summit. All Rights Reserved.

Ext4-DAX and xfs-DAX shortcomings

No data atomicity support Single journal shared by all the transactions (JBD2- based) Poor performance Development teams are (rightly) “disk first”.

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA provides strong atomicity guarantees

1-Sector 1-Sector 1-Block 1-Block N-Block N-Block Atomicity Atomicity Metadata Data Mmap overwrite append overwrite append overwrite append Ext4 ✓ ✗ ✗ ✗ ✗ ✗ BPFS ✓ ✓ ✗ wb Ext4 ✓ ✓ ✗ ✓ ✗ ✓ PMFS ✓ ✗ ✗ Order Ext4 ✓ ✓ ✓ ✓ ✗ ✓ Ext4-DAX ✓ ✗ ✗ Dataj Btrfs ✓ ✓ ✓ ✓ ✗ ✓ Xfs-DAX ✓ ✗ ✗

xfs ✓ ✓ ✗ ✓ ✗ ✓ SCMFS ✗ ✗ ✗

Reiserfs ✓ ✓ ✗ ✓ ✗ ✓ Aerie ✓ ✗ ✗

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA provides strong atomicity guarantees

1-Sector 1-Sector 1-Block 1-Block N-Block N-Block Atomicity Atomicity Metadata Data Mmap overwrite append overwrite append overwrite append Ext4 ✓ ✗ ✗ ✗ ✗ ✗ BPFS ✓ ✓ ✗ wb Ext4 ✓ ✓ ✗ ✓ ✗ ✓ PMFS ✓ ✗ ✗ Order Ext4 ✓ ✓ ✓ ✓ ✗ ✓ Ext4-DAX ✓ ✗ ✗ Dataj Btrfs ✓ ✓ ✓ ✓ ✗ ✓ Xfs-DAX ✓ ✗ ✗

xfs ✓ ✓ ✗ ✓ ✗ ✓ SCMFS ✗ ✗ ✗

Reiserfs ✓ ✓ ✗ ✓ ✗ ✓ Aerie ✓ ✗ ✗

NOVA ✓ ✓ ✓ ✓ ✓ ✓ NOVA ✓ ✓ ✓

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA: A File System for NVDIMMs

Dispenses with disk-centric baggage Strong consistency guarantees Lean and fast Freely optimize for NVDIMMs No need to worry about future disk-based features or optimizations. E.g., Userspace f/msync() is no problem. Focus on latency and bandwidth advantages of NVDIMMs Freely available (GPL) 10 © 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA

Log Structure + copy-on- + Journals

• One log per • Non-contiguous Tail Tail • Fast, Simple atomic File log updates • Meta-data only

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA

Log Structure + copy-on-write + Journals

• Multi- atomic update Tail Tail • Fast allocation File log • Instant data GC

Data 0 Data 1 Data 1 Data 2

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA

Log Structure + copy-on-write + Journals

TailTail Directory log • Small, fixed sized Tail Tail journals File log • For complex ops.

Dir tail Journal File tail

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. NOVA Reliability Features

NVDIMM file systems are subject to multiple failure modes Media errors – “hard” and “soft” Local system failures Large scale failures NOVA includes reliability mechanisms (April release) Replicated metadata CRC for metadata Erasure codes for file data -only checkpoints for NOVA includes “unsafe” modes that improve performance but sacrifice consistency (April release)

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. Evaluation: Latency

Operation latency 25

PM Emulation Platform 20 – Emulates different NVM characteristics 15 – Emulates clwb/PCOMMIT 10 latency • NOVA provides low latency 5

Latency (microsecond) atomicity 0 Create Append (4KB) Delete Ext4-datajournal Ext4-DAX m1fs NOVA

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. Filebench throughput

Filebench throughput 450

400 350 300 250 • NOVA achieves high 200 150 performance with 100

Ops per second (x1000) second per Ops 50 strong data 0 consistency Ext4-datajournal Ext4-DAX m1fs NOVA © 2017 SNIA Persistent Memory Summit. All Rights Reserved. Conclusion

Existing file systems do not meet the requirements of applications on NVMM file systems NOVA’s multi-log design achieves high concurrency, efficient garbage collection and fast recovery NOVA outperforms existing file systems while providing stronger consistency and atomicity guarantees NOVA protects data against media errors and system failures.

© 2017 SNIA Persistent Memory Summit. All Rights Reserved. Thank you!

Tr y NOVA! https://github.com/NVSL/NOVA

© 2017 SNIA Persistent Memory Summit. All Rights Reserved.