Optimizing Space Amplification in RocksDB

Siying Dong¹, Mark Callaghan¹, Leonidas Galanis¹, Dhruba Borthakur¹, Tony Savor¹ and Michael Stumm²

¹Facebook, 1 Hacker Way, Menlo Park, CA USA 94025
{siying.d, mcallaghan, lgalanis, dhruba, tsavor}@fb.com
²Dept. Electrical and Computer Engineering, University of Toronto, Canada M8X 2A6
[email protected]

ABSTRACT

RocksDB is an embedded, high-performance, persistent key-value storage engine developed at Facebook. Much of our current focus in developing and configuring RocksDB is to give priority to resource efficiency instead of giving priority to the more standard performance metrics, such as response time latency and throughput, as long as the latter remain acceptable. In particular, we optimize space efficiency while ensuring read and write latencies meet service-level requirements for the intended workloads. This choice is motivated by the fact that storage space is most often the primary bottleneck when using Flash SSDs under typical production workloads at Facebook. RocksDB uses log-structured merge

Facebook has one of the largest MySQL installations in the world, storing many 10s of petabytes of online data. The underlying storage engine for Facebook's MySQL instances is increasingly being switched over from InnoDB to MyRocks, which in turn is based on Facebook's RocksDB. The switchover is primarily motivated by the fact that MyRocks uses half the storage InnoDB needs, and has higher average transaction throughput, yet has only marginally worse read latencies.

RocksDB is an embedded, high-performance, persistent key-value storage system [1] that was developed by Facebook after forking the code from Google's LevelDB [2, 3].¹ RocksDB was open-sourced in 2013 [5].