Proceedings on Privacy Enhancing Technologies; 2019 (1):153–171

Chen Chen*, Anrin Chakraborti, and Radu Sion

PD-DM: An efficient locality-preserving block device mapper with plausible deniability

Abstract: Encryption protects sensitive data from unauthorized access, yet is not sufficient when users are forced to surrender keys under duress. In contrast, plausible deniability enables users to not only encrypt data but also deny its existence when challenged. Most existing plausible deniability work (e.g., the successful and unfortunately now-defunct TrueCrypt) tackles "single snapshot" adversaries, and cannot handle the more realistic scenario of adversaries gaining access to a device at multiple time points. Such "multi-snapshot" adversaries can simply observe modifications between snapshots and detect the existence of hidden data. Existing ideas handling "multi-snapshot" scenarios feature prohibitive overheads when deployed on practically-sized disks. This is mostly due to a lack of data locality inherent in certain standard access-randomization mechanisms, one of the building blocks used to ensure plausible deniability. In this work, we show that such randomization is not necessary for strong plausible deniability. Instead, it can be replaced by a canonical form that permits most writes to be done sequentially. This has two key advantages: 1) it reduces the impact of seeks due to random accesses; 2) it reduces the overall number of physical blocks that need to be written for each logical write. As a result, PD-DM increases I/O throughput by orders of magnitude (10–100× in typical setups) over existing work while maintaining strong plausible deniability against multi-snapshot adversaries. Notably, PD-DM is the first plausible-deniable system getting within reach of the performance of standard encrypted volumes (dm-crypt) for random I/O.

Keywords: Plausible deniability, Storage security

DOI 10.2478/popets-2019-0009
Received 2018-05-31; revised 2018-09-15; accepted 2018-09-16.

*Corresponding Author: Chen Chen: Stony Brook University, E-mail: [email protected]
Anrin Chakraborti: Stony Brook University, E-mail: [email protected]
Radu Sion: Stony Brook University, E-mail: [email protected]

1 Introduction

Although encryption is widely used to protect sensitive data, it alone is usually not enough to serve users' intentions. At the very least, the existence of encryption can draw the attention of adversaries that may have the power to coerce users to reveal keys.

Plausible deniability (PD) aims to tackle this problem. It defines a security property making it possible to claim that "some information is not in possession [of the user] or some transactions have not taken place" [23]. In the context of storage devices, as addressed in this paper, PD refers to the ability of a user to plausibly deny the existence of certain stored data even when an adversary has access to the storage medium.

In practice, PD storage is often essential for survival and sometimes a matter of life and death. In a notable example, the human rights group Network for Human Rights Documentation-Burma (ND-Burma) carried data proving hundreds of thousands of human rights violations out of the country on mobile devices, risking exposure at checkpoints and border crossings [27]. Videographers brought evidence of human rights violations out of Syria by hiding a micro-SD card in body wounds [24], again at high risk of life.

Yet, unfortunately, existing PD storage solutions are either slow or insecure against "multi-snapshot" adversaries that can gain access to the storage multiple times over a longer period. For example, steganographic filesystems [5, 23, 26] (also known as "deniable filesystems") resist only adversaries who can access storage mediums once ("single-snapshot" adversaries). And although some attempts [26] have been made to handle multi-snapshot adversaries, they are insecure.
A strong adversary can straightforwardly infer what data is hidden with a relatively high probability, better than random guessing. On the other hand, while block-level PD solutions [6, 8] handling multi-snapshot adversaries exist, their performance is often simply unacceptable and many orders of magnitude below the performance of non-PD storage.

Nevertheless, most realistic PD adversaries are ultimately multi-snapshot. Crossing a border twice, or having an oppressive government collude with a hotel maid and subsequently a border guard, provides easy and cheap multi-snapshot capabilities to any adversary.

It is obvious that the security of a PD system should not break down completely (under reasonable user behavior) and should be resilient to such realistic externalities (hotel maids, border guards, airline checked luggage etc). Thus, undeniably, a practical and sound plausible-deniable storage solution crucially needs to handle multi-snapshot accesses.

In virtually all PD designs, handling multi-snapshot adversaries requires hiding the "access patterns" to hidden data. Otherwise, adversaries could trivially observe implausible modifications between snapshots and infer the existence of hidden data.

Some of the first solutions [8, 31] hide access patterns by rendering them indistinguishable from random. Unfortunately, existing randomization-based mechanisms completely break data locality and naturally impose large numbers of physical writes to successfully complete one logical write. This results in unacceptable performance, especially for high-latency storage (e.g., rotational disks). For example, HIVE [8] performs 3 orders of magnitude slower than a non-PD disk baseline.

We observe that randomization is not the only way to hide access patterns. Canonical forms that depend only on non-hidden public data (e.g., sequential structuring) suffice and can yield dramatic throughput increases. This is mainly due to two reasons. First, it is well known that sequential structuring can vastly speed up I/O by reducing the impact of random seeks [14]. This has been previously well explored in log-structured filesystems [13, 28]. Second, the predictable nature of such sequential access leads to more efficient searches for free block locations (to write new data) with lower numbers of I/O operations, when compared with existing (mostly randomized access) solutions.

Finally, we note that, by definition, a well-defined multi-snapshot adversary cannot observe the device (including its RAM, etc) between snapshots. This enables additional optimizations for increased throughput.

The result is PD-DM, a new, efficient block-level storage solution that is strongly plausibly deniable against multi-snapshot adversaries, preserves locality and dramatically increases performance over existing work. In typical setups PD-DM is orders of magnitude (10–100×) faster than existing approaches.

2 Storage-centric PD Model

PD can be provided at different layers in a storage system. However, properly designing multi-snapshot PD at the filesystem level is not trivial. As filesystems involve many different types of data and associated metadata, resolving all data and metadata modifications and associated chain reactions in a plausibly deniable manner requires significant and careful consideration. We are not aware of any successful, fully secure design. Moreover, data loss (overwriting of data at a high deniability level by writes to data at a lower deniability level) is likely an unavoidable drawback of filesystem PD ideas.

Instead, the focus of this work is on storage-centric PD for block devices – allowing a user to store files of different confidentiality levels in different logical volumes stored on the same underlying physical device. These volumes are accessed independently by upper-layer applications such as filesystems and databases.

It is important to note that no existing PD solution hides the fact that a PD system is in place. So far (and in this paper), the role of a sound PD system has been to hide whether a user stores any hidden files (for filesystems) or uses any hidden volumes (in block level schemes). In other words, it is assumed that adversaries won't punish users for merely using a PD-enabled system, at least as long as the system also allows non-PD purposes, e.g., storing in a public volume.

2.1 PD Framework at Block Level

A PD system at the block level intermediates between a raw storage device and upper logical volumes (Figure 1). It structures accesses to the underlying device while providing upper-level abstractions for multiple independent logical volumes. Logical public volumes Vp^i can be revealed to adversaries, whereas the existence and usage of hidden volumes Vh^j should be hidden from adversaries. The physical device contains Nd blocks of B bytes each, while each logical volume has Np^i/Nh^j logical data blocks, respectively. The volumes are encrypted with different encryption keys Kp^i and Kh^j (e.g., derived from passwords Pp^i and Ph^j). If coerced, users can provide the public passwords and deny the existence or use of hidden volumes even to multi-snapshot adversaries.

Fig. 1. Overview of a PD system with one public volume and one hidden volume as an example. PD-DM provides virtual volume abstractions. The choice of whether the hidden volume is used will be hidden.

Adversary. We consider a strong computationally bounded "multi-snapshot" adversary able to get complete representations of the device's static state at multiple times of her choosing. Importantly, the adversary cannot see the running state (memory, caches etc) of the PD layer occurring before she takes possession of the device. For example, a border guard in an oppressive government can check a user's portable hard drives every time he crosses the border, but the guard cannot install malware or otherwise spy on the device.

Access Pattern. An access can be either a read or a write to a logical volume. Pub_read(lp^i) and Pub_write(lp^i, dp^i) denote accesses to public volume Vp^i, while Hid_read(lh^j) and Hid_write(lh^j, dh^j) denote accesses to hidden volume Vh^j. lp^i and lh^j are logical block IDs while dp^i and dh^j are data items. An access pattern O refers to an ordered sequence of accesses being submitted to logical volumes, e.g., by upper-layer filesystems.

Write Trace. A write trace W(O) refers to a series of actual modifications to the physical block device resulting from a logical access pattern O. Let Phy_write(α, d) denote an atomic physical device block write operation of data d at location α (physical block ID). Then a write trace W(O) can be represented as an ordered sequence of physical block writes Phy_write(α, d).

A multi-snapshot adversary may get information about a write trace by observing differences in the "before" and "after" device states. A PD system is designed to prevent the adversary from inferring that the write trace corresponds to hidden volume accesses.
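To make the access-pattern and write-trace notation concrete, consider the following minimal Python sketch. It is purely illustrative – the class and helper names are ours, not part of PD-DM – and only models what a multi-snapshot adversary can learn by diffing two snapshots:

from dataclasses import dataclass
from typing import List, Union

@dataclass
class PubWrite:      # Pub_write(lp, dp): write data dp to public logical block lp
    lp: int
    dp: bytes

@dataclass
class HidWrite:      # Hid_write(lh, dh): write data dh to hidden logical block lh
    lh: int
    dh: bytes

Access = Union[PubWrite, HidWrite]        # one entry of an access pattern O

@dataclass
class PhyWrite:      # Phy_write(alpha, d): atomic write of data d at physical block alpha
    alpha: int
    d: bytes

WriteTrace = List[PhyWrite]               # W(O)

def changed_blocks(before: List[bytes], after: List[bytes]) -> List[int]:
    """Physical block IDs that differ between two snapshots -- everything a
    multi-snapshot adversary learns about the write trace in between."""
    return [a for a, (x, y) in enumerate(zip(before, after)) if x != y]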
Fig. 2. Consider two access patterns O0 and O1 that transform the state of the disk from S1 to S2_0 and S2_1 respectively. Adversaries should not be able to distinguish S2_0 from S2_1 based on the snapshots. Effectively, the adversary should not be able to distinguish W(O0) from W(O1).

Not unlike existing work, PD-DM achieves this by providing plausible explanations for accesses to hidden volumes using accesses to public volumes. Figure 2 illustrates. Consider two access patterns O0 and O1. O1 contains hidden accesses while O0 does not. The two corresponding write traces bring the disk from input state S1 to output states S2_0 and S2_1 according to which of O0 and O1 has been executed. The idea then is to empower a user to plausibly deny the execution of hidden accesses by arguing that any observed device state changes between snapshots would have been indistinguishably induced by O1 or O0. Effectively, the corresponding write traces W(O0) and W(O1) should be indistinguishable.

One key observation is that only write accesses are of concern here since read accesses can be designed to have no effect on write traces. This also illustrates why it is important to guarantee PD at block level – the above may not necessarily hold for filesystems which may update "last-accessed" and similar fields even on reads.

2.2 Security Game

An intuitive explanation of PD at block level is illustrated in Figure 2. Towards more formal insights, we turn to existing work and observe that security requirements are sometimes quite divergent across different works. For example, Blass et al. [8] introduced a hidden volume encryption (HIVE) game. It outlines two main concerns: frequency of snapshots and "degree of hiding". "Restricted", "opportunistic" and "plausible" hiding are defined along with three kinds of adversaries, "arbitrary", "on-event" and "one-time". Chakraborti et al. [6] proposed the PD-CPA game (Appendix A.1) targeting a more powerful "arbitrary" adversary, further excluding reliance on any specific device property.

We discover that both models feature a shared structural backbone coupled with certain adversary-specific features. The games are defined between an adversary A and a challenger C on a device D with a "meta game" structure.

Before presenting the meta game, we first introduce IsExecutable(O) – a system-specific predicate that returns True if and only if the access pattern O can be "fully executed". It is hard to have a universal definition of IsExecutable(O) for all PD solutions. For example, in HIVE, IsExecutable(O) models whether there exists one corresponding read after each access to a hidden volume. In PD-CPA, IsExecutable(O) models whether the ratio between the number of hidden writes and the number of public writes is upper bounded by a parameter φ. Then, the meta game is defined as follows:

1. Let np and nh denote the number of public and hidden volumes respectively.
2. Challenger C chooses encryption keys {Kp^i}_{i∈[1,np]} and {Kh^j}_{j∈[1,nh]} using security parameter s.
3. C creates logical volumes {Vp^i}_{i∈[1,np]} and {Vh^j}_{j∈[1,nh]} in D, encrypted with keys Kp^i and Kh^j respectively.
4. C selects a random bit b.
5. C returns {Kp^i}_{i∈[1,np]} to A.
6. The adversary A and the challenger C then engage in a polynomial number of rounds in which:
   (a) A selects a set of access patterns {Op^i}_{i∈[1,np]} and {Oh^j}_{j∈[1,nh]} with the following restrictions:
       i. Op^i includes arbitrary writes to Vp^i.
       ii. Oh^j includes writes to Vh^j, subject to restriction 6(a)iv below.
       iii. O0 includes all the accesses in {Op^i}_{i∈[1,np]}.
       iv. There exists O1, a sequence composed (e.g., by adversary A) of all accesses in O0 and {Oh^j}_{j∈[1,nh]}, such that IsExecutable(O1) = True.
   (b) C executes Ob on D.
   (c) C sends a snapshot of the resulting device to A.
7. A outputs b′, A's guess for b.

Definition 1. A storage mechanism is plausibly-deniable if it ensures A cannot win the security game with a non-negligible advantage over random guessing. More formally, there exists a function ε(s), negligible in the security parameter s, such that Pr[b′ = b] ≤ 1/2 + ε(s).
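The round structure of the game can be summarized in a few lines of toy Python; execute() stands in for the storage mechanism under test and is hypothetical, not PD-DM code:

import secrets

def meta_game_rounds(execute, O0, O1, rounds=8):
    """Toy rendition of steps 4-6 for a single challenger/adversary pair.
    execute(pattern, device) applies an access pattern to a device state;
    O0 contains only public accesses, O1 additionally interleaves the hidden
    ones and must satisfy IsExecutable(O1) = True."""
    b = secrets.randbits(1)                 # challenger's secret bit
    device = {}                             # simplistic device state: block id -> bytes
    snapshots = []
    for _ in range(rounds):
        execute(O1 if b else O0, device)    # C executes O_b on D
        snapshots.append(dict(device))      # A receives a full snapshot
    return b, snapshots                     # A must guess b from the snapshots alone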

The restrictions on access pattern choices (in step 6a) are mandatory in the game to prevent "trivial" impossibility. In the absence of access pattern restrictions, it may be impossible to find a solution. E.g., an adversary may choose different public accesses for O0 and O1, so that she can distinguish access patterns by public accesses only and trivially win the game. The restrictions 6(a)i - 6(a)ii ensure that the public writes chosen to cover the hidden writes are not a function of the hidden writes, lest information about the existence of the hidden volume is leaked by the public write choices (which are unprotected from adversaries).

The IsExecutable(O1) in restriction 6(a)iv relates to a system's ability to combine public and hidden accesses to ensure PD. It prevents adversaries from winning the game by trivially choosing access patterns with hidden accesses that cannot be hidden by a given system. In existing work, IsExecutable(O1) depends only on the ratio of hidden to public operations (either public reads or public writes). This yields actual practical solutions, but also limits the solution space to ORAM-based mechanisms. Yet, this is not necessary if we consider the fact that adversaries in reality have much less freedom in choosing access patterns.

Specifically, it is quite reasonable to require adversaries to include however many public accesses are needed to cover the hidden accesses in the access pattern. The number of public accesses chosen could be related not only to the number of hidden accesses but also to the current state of hidden data on disk. In reality, this corresponds to a multi-snapshot adversary who gains access to storage devices at different points in time, but does not monitor the running system. As a result, the adversary has limited knowledge about what has happened in the meantime in the hidden volumes (e.g., hard disk snooping [18, 22]).

In this case, IsExecutable() can still be easily fulfilled by access patterns coming from many realistic upper-layer workloads. For example, if PD-DM runs underneath log-structured filesystems on both volumes, IsExecutable() becomes exactly the same as that in PD-CPA. Further, once the public and hidden volume workloads sufficiently correlate, the PD-DM IsExecutable() is True with high probability (see Section 4.2).

Timing. Although a system log may record the public writes which cover the hidden accesses in PD-DM, it is important to note that timing information is assumed to not provide adversaries any significant advantage.

3 System Design

The key idea in PD-DM is to ensure a canonical/sequential physical device write trace, notwithstanding the logical layer workload. All writes to the physical device are located at sequentially increasing physical addresses, similar to append operations in log-structured filesystems.

This canonical nature eliminates the ability of an adversary to infer logical layer access patterns, and the possible existence of a hidden volume stored in apparently "free" space – e.g., by noticing that this "free" space is being written to in between observed snapshots.

More importantly, it brings two significant performance benefits. Firstly, it reduces the impact of disk seeks. This is very important as the seek time dominates the disk access time [20]. For example, the average seek time for modern drives today may range from 8 to 10 ms [21], while the average access time is around 14 ms [20]. Secondly, as we will see, a block that is chosen to be written in PD-DM is more likely to contain invalid data compared to a randomly-selected block. This reduces the average number of physical blocks that need to be accessed to serve one logical write request.

Fig. 3. PD-DM layout. The left-most dotted rectangle (blue) highlights the data structures used in mapping logical volume blocks to physical locations: PPM, PBM and HM_ROOT. The right-most dotted rectangle (red) highlights the data storage segment including "Phead" and a set of consecutive blocks on the disk. This segment stores all data from both public and hidden volumes, as well as parts of the hidden map.

System layout. Figure 3 illustrates the disk layout of PD-DM and how logical volumes are mapped to the underlying physical device. For simplicity, we consider one public volume and one hidden volume in PD-DM.¹ As shown later, it is easy to add more volumes. In general, the disk is divided into two segments: a data segment and a mapping segment. Both public and hidden data is mapped to the data segment, more specifically, the Data area (in Figure 3). Several data structures (PPM, PBM, HM_ROOT in Figure 3) are stored in the mapping segment, to maintain the logical volume to physical device mapping obliviously and provide physical block status information.
As will be detailed later, the PPM and PBM are associated with the public volume, and HM_ROOT with the hidden volume. For better performance, recently used mapping-related blocks are cached in memory using assigned memory pools.²

Log-structure. The Data area is written sequentially as a log and Phead records its head. An atomic write to the Data area writes not only public data but also either hidden data (as well as its related mapping blocks) or some dummy data to several successive blocks pointed to by Phead.³ The pointer Phead indicates the block after the last written physical block and wraps around to point again to the first physical block once the last block of the Data area is written.

Underlying Atomic Operation: PD_write. As mentioned, PD-DM writes to the Data area sequentially with an atomic operation. Thus, we define the atomic operation as PD_write (see Section 3.3 for details). The content written by a PD_write operation can be adjusted by users. We assume that one block of public data and one block of hidden data can be included in one PD_write operation without loss of generality.

PD-DM receives upper-layer I/O requests and translates them into physical I/O to underlying devices. For the sake of simplicity, we first discuss write requests and later read requests. The remaining part of this section is structured as follows: Section 3.1 describes how public and hidden data is arranged in the Data area under the assumption that mappings are in memory. Section 3.2 details the disk layout of mapping data and its associated plausibly deniable management. Finally, Section 3.3 outlines the entire end-to-end design of PD-DM.

3.1 Basic Design

In this section, we simplify the discussion by considering certain parts of the system as black boxes. Specifically, to handle logical write requests (Pub_write(lp, dp) and Hid_write(lh, dh)), PD-DM needs to arrange the received data dp/dh on the physical device and record the corresponding mappings from logical volumes to the physical device for future reference. For now, we will consider the maps as black boxes managing the logical-physical mappings, as well as the status (free, valid, invalid, etc.) of each physical block on the device. The map blackbox internals will be provided in Section 3.2.

The black boxes support the following operations, where α is the ID of the physical location storing the logical block (lp or lh) and S denotes the "status" of a physical block (free, valid, or invalid):
– α = Get_Pub_map(lp), Set_Pub_map(lp, α)
– α = Get_Hid_map(lh), Set_Hid_map(lh, α)
– S = Get_status(α), Set_status(α, S)

¹ Vp and Vh are used for the two volumes instead of Vp^i and Vh^j, and the index i or j is also omitted in all the other notations defined in Section 2 for simplicity.
² Not to be confused with chip-level caches (PLB) used in some ORAM-based designs [15].
³ This is a key point and reduces the impact of disk seeks by an order of magnitude.
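For intuition, a minimal in-memory stand-in for these black boxes might look as follows. This is an illustrative Python sketch only; the real maps are on-disk structures described in Section 3.2:

FREE, VALID, INVALID = "free", "valid", "invalid"

class MapBlackBox:
    """In-memory stand-in for the logical-to-physical maps and the per-block
    status oracle used by Algorithms 1 and 2 (illustrative only)."""
    def __init__(self, num_physical_blocks):
        self.pub_map = {}                              # lp -> alpha
        self.hid_map = {}                              # lh -> alpha
        self.status = {a: FREE for a in range(num_physical_blocks)}

    def get_pub_map(self, lp): return self.pub_map.get(lp)
    def set_pub_map(self, lp, alpha):
        old = self.pub_map.get(lp)
        if old is not None:
            self.status[old] = INVALID                 # the old version becomes stale
        self.pub_map[lp] = alpha
        self.status[alpha] = VALID

    def get_hid_map(self, lh): return self.hid_map.get(lh)
    def set_hid_map(self, lh, alpha):
        old = self.hid_map.get(lh)
        if old is not None:
            self.status[old] = INVALID
        self.hid_map[lh] = alpha
        self.status[alpha] = VALID

    def get_status(self, alpha): return self.status[alpha]
    def set_status(self, alpha, s): self.status[alpha] = s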

With these auxiliary functions, incoming pairs of Pub_write(lp, dp) and Hid_write(lh, dh) requests are handled straightforwardly as described in Algorithm 1. As we assumed that one block of public data and one block of hidden data are written together in this paper, two blocks in the Data area are written. Specifically, public data dp and hidden data dh are encrypted with Kp and Kh, respectively, and written to the two successive blocks at physical locations Phead and Phead + 1. Initially, Phead is set to 0 and PD-DM is ready to receive logical write requests. In a public-volume-only setting, random data is substituted for the encrypted hidden data. In Algorithm 1, dh = Null signifies the lack of hidden write requests from the upper layers.

Algorithm 1 Pub_write + Hid_write – Simple version
Input: Pub_write(lp, dp), Hid_write(lh, dh), Phead
1: if dh is not Null then
2:   Phy_write(Phead, dp)
3:   Phy_write(Phead + 1, dh)
4:   Set_Pub_map(lp, Phead)
5:   Set_Hid_map(lh, Phead + 1)
6: else    ▷ a public volume only, no hidden write request
7:   Randomize data dummy
8:   Phy_write(Phead, dp)
9:   Phy_write(Phead + 1, dummy)
10:  Set_Pub_map(lp, Phead)
11: end if
12: Phead ← Phead + 2

Wrap around. As described, PD-DM always aims to write to the "oldest" (least recently written) block in the Data area. To achieve this, it needs to mitigate the possibility that, after a wrap-around, valid data is in fact stored in that location and cannot be overwritten. For simplicity, consider the size of the Data area being a multiple of 2 and aligned by public-hidden data pairs. We denote the existing public-hidden data pair pointed to by Phead as (dp^existing, dh^existing). Their status can be either valid or invalid. Invalid data may be either the result of a previous dummy block write or the case of the data being an "old" version of a logical block.

Now, the key to realize here is that preserving any of the valid existing data can be done by including it in the current input of the ongoing round of writing. Table 1 illustrates how the function Get_new_data_pair(dp, dh, Phead) generates the data pair (dp^new, dh^new) written in the ongoing round in PD-DM. The data pair is generated based on the status of the existing data and dh. In brief, the priority for constructing the new pair of data is dp^existing/dh^existing > dp/dh > dummy. Then, logical requests Pub_write and Hid_write are handled with Algorithm 2.

dh       | dp^existing | dh^existing | dp^new      | dh^new
NULL     | invalid     | invalid     | dp          | dummy
NULL     | invalid     | valid       | dp          | dh^existing
NULL     | valid       | invalid     | dp^existing | dummy
NULL     | valid       | valid       | dp^existing | dh^existing
not NULL | invalid     | invalid     | dp          | dh
not NULL | invalid     | valid       | dp          | dh^existing
not NULL | valid       | invalid     | dp^existing | dh
not NULL | valid       | valid       | dp^existing | dh^existing

Table 1. PD-DM constructs the data pair (dp^new, dh^new) that can be written by a PD_write operation according to the status of existing data (dp^existing, dh^existing) and the availability of hidden write requests (dh).

Algorithm 2 Pub_write + Hid_write – Considering wrap-around
Input: Pub_write(lp, dp), Hid_write(lh, dh), Phead
1: while dp is not Null do
     ▷ The new pair of data is constructed based on Table 1
2:   (dp^new, dh^new) = Get_new_data_pair(dp, dh, Phead)
3:   Phy_write(Phead, dp^new)
4:   Phy_write(Phead + 1, dh^new)
5:   if dp^new == dp then
6:     Set_Pub_map(lp, Phead), dp = Null
7:   end if
8:   if dh^new == dh then
9:     Set_Hid_map(lh, Phead + 1), dh = Null
10:  end if
11:  Phead ← Phead + 2
12: end while

Note that the while loop in this algorithm may end without setting dh to Null, i.e., the hidden write request is not completely satisfied within one public write. In this case, more public writes are required to complete the same hidden write. After a sufficient number – a constant, as will be shown later – of rounds, the hidden write request can be guaranteed to go through.

Buffering and seek-optimized writes. Notwithstanding the logical data layout, PD-DM always writes physical data blocks sequentially starting from the current value of Phead until enough blocks with invalid data have been found to store the current payload (Algorithm 2). Depending on the underlying stored data, written contents are either re-encrypted valid existing data or newly written payload data (Table 1). For example, consider the following case where the pattern of existing valid (V) and invalid (I) data is [VIIVVI···]. Instead of seeking to position 2 to write new data there, PD-DM writes both positions 1 and 2 with re-encrypted/new data sequentially.
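A direct transcription of Table 1 into Python reads as follows. It is an illustrative sketch with a simplified signature: instead of taking Phead and re-reading the disk, it takes the already-read pair at Phead and its validity bits; dummy() stands for freshly randomized filler.

import os

def dummy(block_size=4096):
    """Freshly randomized filler block (indistinguishable from ciphertext)."""
    return os.urandom(block_size)

def get_new_data_pair(dp, dh, dp_existing, dh_existing, pub_valid, hid_valid):
    """Construct (dp_new, dh_new) for the next write round, following Table 1.
    Priority: existing valid data > new payload (dp/dh) > dummy."""
    # Public slot: never overwrite valid public data.
    dp_new = dp_existing if pub_valid else dp
    # Hidden slot: never overwrite valid hidden data; otherwise place the
    # pending hidden payload if there is one, else random filler.
    if hid_valid:
        dh_new = dh_existing
    elif dh is not None:
        dh_new = dh
    else:
        dh_new = dummy()
    return dp_new, dh_new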

As a result, by construction, all physical data writes to the underlying device happen sequentially. The pattern is in fact composed of reads and writes to the same block, followed by the next block, and so on. For rotational media this means two things: (i) disk heads do not need to switch cylinders to write, which results in significantly lower (mostly rotational-only) latencies, and (ii) any predictive read/write-ahead buffering (at the various layers within the drive and the OS) can work extremely well, especially when compared with existing work that randomizes physical access patterns.

Fig. 4. Average number of physical writes required for one public write request under different types of logical write workload (spare factor ρ from 0 to 0.4; one curve per workload (r, f) = (100%, 100%), (90%, 5%) and (80%, 20%)).

Reduced number of writes. The sequential nature of the physical write trace results in a smaller average number of blocks written per logical write compared to existing work [11]. In a PD solution, a few blocks will be accessed following certain rules to find a physical block for the logical data, lest the write trace leak the access pattern. It is more likely that the block selected in PD-DM contains no valid data as it is the least recently written physical block. To set the stage for a detailed analysis, we first introduce two related components: the "spare factor" and the "traffic model".

"Spare factor" ρ. In order to store all the logical blocks of each logical volume, the Data area in PD-DM should contain at least 2N physical blocks (N = max(Np, Nh)). In addition to these blocks, similar to existing work [11], "spare" space is reserved so that it is easy to find available blocks to write. We denote the ratio of this additional space to the total size of the Data area as the "spare factor" ρ. Then the size of the Data area can be represented as 2N/(1 − ρ). Note that the number of required blocks will change from 2N to (k + 1)N once we store maps on the disk (detailed in Section 3.2.2); with the same spare factor, the Data area size then becomes (k + 1)N/(1 − ρ).

Traffic model. We model the logical write traffic for each volume with Rosenblum's hot/cold data model [28]. It captures the spatial locality in real storage workloads [12] and can be applied to PD-DM by "separating" the logical address space of each volume into "hot" and "cold" groups, respectively. Two parameters r and f are introduced in the model. Specifically, a fraction f of the address space is "hot", and hit by a fraction r of the overall write requests. The remaining 1 − f fraction is "cold" and receives a fraction 1 − r of all write requests. The write traffic inside each group is uniformly distributed. When r = f = 100%, this becomes a special case where the entire traffic is uniformly distributed.

The average number of blocks written per logical write is related to the probability that the selected block is available for writing. Given a spare factor ρ, this probability is actually ρ for a randomly selected block; the average number of blocks needed to be accessed for each logical write is then 1/ρ. The sequential write nature in PD-DM ensures that no fewer than 2N · ρ/(1 − ρ) logical writes can be performed during the period in which the entire Data area is written once from beginning to end. This also implies that the average number of blocks needed to be accessed for each logical write is at most 1/ρ = O(1). Moreover, as PD-DM always writes to the "least recently written" physical block, that block is more likely to have been "invalidated" before being overwritten. The probability that the written block contains invalid data depends on both ρ and the incoming workload/traffic pattern. For example, if the traffic is uniformly distributed, the probability that the currently written block contains invalid data can be represented in closed form as Equation 1 [12]. Here W(x) denotes Lambert's W function [10], which is the inverse of xe^x (i.e., W(y) = x such that xe^x = y).

    p = 1 + (1 − ρ) · W( (−1/(1 − ρ)) · e^(−1/(1 − ρ)) )        (1)

In general cases, the probability that the written block in PD-DM contains invalid data cannot be represented with a closed-form equation, but it can be calculated numerically [12]. Figure 4 shows how the average number of physical writes for one logical write changes when the logical traffic and spare factor ρ vary. The average number of physical writes decreases as ρ increases. Moreover, the more focused the write traffic (on a smaller logical address space), the smaller the average number of required physical write operations. To sum up, given a relatively small spare factor, the average number of physical writes per logical write in PD-DM is much smaller compared to other randomization-based solutions.
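Equation 1 can be evaluated numerically, e.g., with SciPy's Lambert W implementation. The sketch below is ours (under the uniform-traffic assumption); 1/p then approximates the number of physical writes needed per logical write:

import math
from scipy.special import lambertw

def invalid_block_probability(rho):
    """Equation 1: probability that the block at Phead holds invalid data,
    assuming uniformly distributed logical write traffic (r = f = 100%)."""
    x = -1.0 / (1.0 - rho)
    return 1.0 + (1.0 - rho) * lambertw(x * math.exp(x)).real

for rho in (0.1, 0.2, 0.3):
    p = invalid_block_probability(rho)
    print(f"rho={rho:.1f}: p={p:.3f}, ~{1/p:.1f} physical writes per logical write")

With a spare factor around 20%, this evaluates to roughly 2–3 physical writes per logical write, consistent with the constants quoted in Section 4.3.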

3.2 Mapping & Block Occupancy Status

So far we have considered the mapping structures as black boxes. We now discuss their design, implementa- tion and associated complexities.

3.2.1 Mapping-related Data Structures

Two different data structures are used to store logical- physical mappings for the two volumes, respectively: the public position map (PPM) for the public volume, and the hidden map B+ tree (HM) with its root node Fig. 5. Hidden logical blocks are mapped to the physical disk with a B+ tree map (e.g., of height 3 in this figure). A data (HM_ROOT) for the hidden volume. block and the related HM blocks construct a “hidden tuple”,

The PPM maps public logical block IDs to corre- denoted as T (dh). The HM blocks are the tree nodes in the path sponding physical blocks on disk. It’s an array stored from the root to the relevant leaf containing the corresponding mapping entry for the data block. In this figure, a “hidden tuple” at a fixed location starting at physical block ID αPPM ; subsequent offset logical block IDs are used as indices and the corresponding path of HM tree are colored gray. in the array. Its content requires no protection since it is related only to public data and leaks no information The HM_ROOT block is encrypted with the hid- about the existence of hidden data. den key Kh. Adversaries cannot read the map in the Unlike in the case of the PPM for the public vol- absence of Kh. Additionally, the HM_ROOT block will ume, we cannot store an individual position map for be accessed following a specific rule independent of the the hidden volume at a fixed location since accesses to existence of a hidden volume. This ensures that accesses this map may expose the existence of and access pattern to HM_ROOT do not leak hidden information. More to hidden data. Instead, we deploy an HM tree which details in Section 3.3. is stored distributed across the entire disk, leaving only the root node (denoted by HM_ROOT) in a fixed lo- 3.2.2 Get/Set_map cation (physical block ID = αHM_Root) of the mapping segment. To facilitate easy access and reconstruction of Managing the PPM is relatively simple. Logical block the HM, each node contains the location of all of its ID lp is used to retrieve the physical block where the children nodes. corresponding PPM entry resides (Algorithms 6 and 7 More specifically, HM indexes physical locations by in Appendix A.2). logical block IDs. The leaf nodes are similar to the PPM Managing the HM tree, however, requires much which contains the mapping entries, and they are or- more care (Algorithms 8 and 9 in Appendix A.2). The dered from left to right (from leaf_1 to leaf_n in Fig- access pattern to the HM tree should not be accessi- ure 5). An example of a HM tree of height 3 is shown ble to a curious multi-snapshot adversary. Fortunately, in Figure 5. A logical hidden block is mapped to its as the map entries are stored in the leaf nodes, access- corresponding physical block by tracing the tree branch ing one map entry always requires traversing one of the from the root to the corresponding leaf node contain- tree paths from the root node to the corresponding leaf. ing the relevant mapping entry. Suppose b is the size of Thus, each path (from the root to a leaf) is linked with physical block IDs in bytes, then the height of the HM and embedded within a given corresponding data block tree is O(log (N )). The last entry of each HM tree bB/bc h as a “hidden tuple” in PD-DM. And a “hidden tuple” leaf node is used for a special purpose discussed in Sec- is written to successive blocks in the Data area. tion 3.2.3. Thus, only bB/bc − 1 mapping entries can be Figure 5 illustrates the “hidden tuple” concept for stored in one leaf node. Further, each internal node can an example HM tree of height 3. The hidden tuples have bB/bc child nodes. Finally, the HM mapping entry here consist of the data block itself and the nodes in of the i-th logical block is stored in the ( i )-th leaf bB/bc−1 the path from the root of HM to the relevant leaf node node at offset i%(bB/bc−1). 
For a block size of 4KB and block IDs of 32 bits, the height of the map tree remains k = 3 for any volume size between 4GB and 4TB. PD-DM 161 containing the mapping entry for the data block.4 Let Instead, PD-DM deploys a “reverse map” mecha- hmi denote the HM tree node at level i in the path nism (similar to HIVE [8]) to map physical to logical from the root to the leaf. Then the hidden tuple corre- blocks. The corresponding logical ID of each block con- sponding to hidden data dh is denoted as T (dh) =< taining hidden data is stored directly in the correspond- hm1, hm2, ··· , hmk−1, dh > where k is the height of ing hidden tuple, or more specifically, in the last entry the HM tree. In order to accomplish a write request of the leaf node in the hidden tuple (see Figure 5).

Note that accessing one HM entry does not affect any other branches of the tree unrelated to it. Thus, importantly, no cascading effects ensue. To understand this better, consider that as leaves are ordered on logical addresses, changing the mapping entry in one leaf node only requires updating the corresponding index in its ancestor nodes, which indicates the physical location of its corresponding child in the path. Other indexes remain the same and the rest of the tree is unchanged.

3.2.3 Get/Set_Status

PD-DM requires one more key structure providing the occupancy status of individual blocks. So far this was assumed to be a black box (Section 3.1). We now detail.

Keeping track of the status of individual blocks is done using a public bit map (PBM). It contains one bit and one IV for each physical block in the Data area. A bit is set once public data is written to the corresponding physical block. The bitmap ensures that we are able to check the state of a physical block before writing to it, to avoid public data loss, e.g., by overwriting. The PBM is public and not of concern with respect to adversaries. It is worth noting that the blocks occupied by hidden data are shown as unused in the PBM.

To protect the hidden data from overwriting, we deploy a number of additional mechanisms. We cannot use a public bitmap since it would simply leak without additional precautions, which necessarily introduce inefficiencies. Examples of such inefficient bitmaps include the in-memory bitmap used by Peters [27] and the fuzzy bitmap used by Pang [26], which introduced "dummy blocks" to explain blocks which were marked as in-use but unreadable by the filesystem.

Instead, PD-DM deploys a "reverse map" mechanism (similar to HIVE [8]) to map physical to logical blocks. The corresponding logical ID of each block containing hidden data is stored directly in the corresponding hidden tuple, or more specifically, in the last entry of the leaf node in the hidden tuple (see Figure 5).

This makes it easy to identify whether a hidden tuple (and its associated block) is still valid, i.e., by checking whether the logical block is still mapped to where the hidden tuple currently resides. For example, suppose the hidden tuple is stored in the physical blocks β + 1 to β + k; then the hidden data is in block β + k and the logical ID of this hidden tuple is in block β + 1. Thus, the state of the hidden tuple is obtained by first reading block β + 1 for the logical ID l and then reading the current map entry for index l from the HM tree. If HM[l] = β + k is true, then the hidden data there is still up to date. Otherwise, the hidden tuple is invalid (and can be overwritten).

As an optimization, in this process, one does not always need to read all k nodes in the path of the HM tree for the map entry. Any node read indicating that a node on the path is located between β + 1 and β + k establishes HM[l] = β + k as true.

Importantly, the above mechanisms do not leak anything to multi-snapshot adversaries since all enabling information is encrypted (e.g., the logical ID is part of the hidden tuple) and no data expansion occurs (which would have complicated block mapping significantly). Status management (Set_status(α, S) and Get_status(α)) can be done through reading/writing the hidden tuples. Thus, no additional Phy_write operations are needed.

Algorithm 3 Set_Hid_map_full(lh, α) – update a hidden map entry or fake map accesses
Input: lh, α, αHM_Root, k
1: if lh = 0 then    ▷ Randomized dummy data is written
2:   mapblock := Phy_read(αHM_Root)
3:   Phy_write(αHM_Root, mapblock)
4:   for i ∈ {2, ..., k} do
5:     Fake map block mapdummy
6:     Phy_write(α − k + i − 1, mapdummy)
7:   end for
8: else    ▷ real hidden data is written
9:   Set_Hid_map(lh, α)
10: end if
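To make the reverse-map validity check above concrete, here is a minimal sketch (illustrative; phy_read and hm_lookup are hypothetical helpers standing in for a decrypted block read and an HM tree lookup):

def hidden_tuple_is_valid(beta, k, phy_read, hm_lookup):
    """Reverse-map validity check from Section 3.2.3 (illustrative sketch).
    The hidden tuple occupies physical blocks beta+1 .. beta+k, with the
    hidden data block at beta+k and the tuple's logical ID readable from
    block beta+1."""
    l = phy_read(beta + 1).logical_id    # reverse-map entry stored in the tuple
    # Valid only if the hidden volume still maps logical block l to this tuple.
    return hm_lookup(l) == beta + k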

3.3 Full PD-DM With Maps On Disk

Once all mapping structures are on disk – in addition to replacing the Get/Set_map black boxes with Algorithms 6 – 9, a few more things need to be taken care of to ensure end-to-end PD.

Most importantly, accesses to the maps should not expose the existence of hidden data and should behave identically regardless of whether users store data in hidden volumes or not. It is easy to see that no Set_map functions are called whenever the written data is either "existing" or "dummy" in Algorithm 2, which skews the access distribution and may arouse suspicion. To tackle this, additional fake accesses to the mapping-related data structures are performed as necessary. The Set_Hid_map_full function either sets a real hidden map entry or fakes those accesses (Algorithm 3), where lh = 0 means faking accesses to related data structures.

As a result, the same number of physical write operations are performed regardless of whether the pair of public-hidden data written contains new data, existing data or dummy data in Algorithm 2.

An atomic operation PD_write can be defined as Algorithm 4 to describe all those physical writes. PD_write(Phead, lp, dp, lh, dh) arranges a public-hidden data pair (dp, dh) together in the Data area and updates HM (real or fake) using a number of sequential Phy_write operations.

Then the resulting complete set of steps for handling a logical write request can be illustrated with Algorithm 5. lh^new is set to 0 when dh^new is randomized dummy data, and lp^new is 0 when dp^new is existing public data. Logical read requests are straightforwardly handled by accessing the data and existing mapping structures as illustrated in Algorithms 10 and 11 (Appendix A.2).

Algorithm 4 PD_write(Phead, lp, dp, lh, dh)
Input: Phead, lp, dp, lh, dh
1: Phy_write(Phead, dp)    ▷ Write the public data
2: Set_Hid_map_full(lh, Phead + k)    ▷ Write the hidden map
3: Phy_write(Phead + k, dh)    ▷ Write the hidden data

Algorithm 5 Public write + Hidden write – Full PD-DM with maps on the disk
Input: Pub_write(lp, dp), Hid_write(lh, dh), Phead
1: while dp is not Null do
2:   (dp^new, dh^new) = Get_new_data_pair(dp, dh, Phead)
3:   PD_write(Phead, lp^new, dp^new, lh^new, dh^new)
4:   if dp^new == dp then
5:     Set_Pub_map(lp, Phead), dp = Null
6:   end if
7:   if dh^new == dh then
8:     dh = Null
9:   end if
10:  Phead ← Phead + (k + 1)
11: end while

With maps on disk, PD_write operations write more than just data blocks to the physical device. But those physical writes still happen sequentially. However, accessing the map-related data structures (PPM, PBM, ...) will bring in a few seeks. In-memory caches are maintained in PD-DM for those data structures, which can significantly reduce the number of seeks. Consider caching PPM blocks as an example: as ⌊B/b⌋ mapping entries are stored in one mapping block, caching it allows the reading of map entries for ⌊B/b⌋ logical addresses (in a sequential access) without seeking. Thus, compared with previous works which seek before writing every randomly chosen block, PD-DM has a lower seek-related expense.

Finally, it is easy to generalize PD-DM to have multiple volumes. A PD_write operation can write hidden data from any hidden volume.

3.4 Encryption

All blocks in the Data area are encrypted using AES-CTR with one-time-use, random per-block IVs. Those IVs are stored as part of the PBM. In fact, other similar schemes of equivalent security level can be envisioned and the deployed cipher mode can be any acceptable mode providing IND-CPA security.
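As an illustration of such a per-block scheme, the following sketch uses the pyca/cryptography package with a random 16-byte counter block as the IV. It is an example of the general approach, not PD-DM's actual code:

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

BLOCK_SIZE = 4096

def encrypt_block(key: bytes, plaintext: bytes):
    """Encrypt one 4KB block under AES-CTR with a one-time random IV.
    The IV would be stored alongside the block's bit in the PBM."""
    iv = os.urandom(16)                       # fresh counter block per write
    enc = Cipher(algorithms.AES(key), modes.CTR(iv)).encryptor()
    return iv, enc.update(plaintext) + enc.finalize()

def decrypt_block(key: bytes, iv: bytes, ciphertext: bytes):
    dec = Cipher(algorithms.AES(key), modes.CTR(iv)).decryptor()
    return dec.update(ciphertext) + dec.finalize()

# Example usage with a hypothetical 256-bit public volume key:
# kp = os.urandom(32)
# iv, ct = encrypt_block(kp, b"\x00" * BLOCK_SIZE)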

4 Analysis

4.1 Security

We now prove that adversaries cannot distinguish the two write traces W(O0) and W(O1) and thus can win the game of Definition 1 with at most negligible advantage.

Theorem 1. PD-DM is plausibly deniable.

Proof (sketch): Each access pattern results in a write trace transcript available to the adversary through different device snapshots. The transcript contains exactly these entries:
– Write k + 1 blocks to the Data area and update the HM_ROOT with PD_write operations.
– Update the PPM accordingly.
– Update the PBM accordingly.
– Update Phead.
In this transcript, Lemmas 4, 5, 6 and 7 (in Appendix A.3) show that the locations of all writes depend only on public information (public writes). Further, the contents of every write are either fully randomized or depend on public data only (public writes). This holds also for the write traces of O0 and O1 in Definition 1. As a result, adversaries cannot win the game of Definition 1 better than random guessing.

4.2 I/O Complexity

As discussed above, to hide the existence of the hidden data, more work is being done in addition to just reading and writing the data itself. Here we discuss the associated I/O complexities and constants.

Public Read. A Pub_read request requires reading a PPM block for the map entry plus reading the corresponding data block. Thus, compared with the raw disk, Pub_read roughly reads twice as often and halves the throughput. This may be reduced using a smart memory caching strategy for the PPM blocks since consecutive PPM entries are stored in one PPM block.

Hidden Read. Hid_read, on the other hand, requires reading one HM tree path for the map entry first. Thus, k + 1 blocks (k HM tree path blocks + the data block) are read in total, i.e., B · (k + 1) bytes, which results in O(log N) complexity, where k = log N is the height of the HM tree.

Underlying Atomic Write Operation. PD_write
checks if the next k + 1 blocks are free, fills those blocks with the newly formed pair of public data and hidden tuple, and updates the HM_ROOT block. Checking the state of the k + 1 candidate blocks requires reading one PBM block and at most k HM tree nodes (Section 3.2.3). This first step costs B · (k + 2) bytes of reads, i.e., O(log N) since k = log N. Further, the step of writing the k + 1 blocks in order costs B · (k + 1) bytes, resulting in O(log N) as well. Thus, the I/O complexity of the overall PD_write operation is O(log N).

Public/Hidden Write. As described in Algorithm 5, only a few additional I/Os (a constant number) are needed besides a PD_write for each iteration of the loop. As the I/O complexity of the PD_write operation is O(log N), the complexity of each iteration is also O(log N). The analysis in Section 3.1 shows that either a Pub_write or a Hid_write can be completed within O(1) iterations (O(1) PD_writes). Thus, the overall I/O complexity is O(log N) for public/hidden write.

Throughput. Another fact worth noting is that, after an initial period in which sufficient writes have been seen for each volume, the device usually ends up with N valid blocks for each volume, regardless of the workload. At that point, throughput stabilizes and is not a function of uptime anymore.

4.3 Throughput Comparison

PD-DM accesses at most O(log N) blocks for each logical access with relatively small constants (e.g., 2-3 for judiciously chosen spare factors). Moreover, physical accesses are mostly sequential, thus reducing latency components and improving performance.

By comparison, existing work such as HIVE features an I/O complexity of O(log² N) [8] with constants of either 64 (without a stash) or 3 (with a stash). Furthermore, access patterns are random, making seeks impactful and caching/buffering difficult. As a result, on rotational media, PD-DM outperforms existing work by up to two orders of magnitude.

5 Evaluation

PD-DM is implemented as a block device mapper. Encryption uses AES-CTR with 256 bit keys and random IVs. Blocks are 4KB each. Two virtual logical devices (one public and one hidden) are created on one physical disk partition.

Experiments were run on a machine with dual-core i5-3210M CPUs, 2GB DRAM, and Linux kernel 3.13.6. The underlying block devices were a WD10SPCX-00KHST0 5400 rpm HDD and a Samsung-850 EVO SSD. The average seek time of the HDD is 8.9ms [4]. Two volumes of 40GB each were created on a 180GB disk partition (ρ ≈ 0.1). Ext4 FSes were installed on each volume in ordered mode. Additionally, an ext4 FS mounted on top of a dm-crypt (encryption only) volume was benchmarked on the same disk. HIVE [2] was compiled and run on the same platform.

5.1 HDD Benchmarks

Two benchmarks (Bonnie++ [1], IoZone [25]) were used to evaluate performance under sequential and random workloads, respectively. Results are in Tables 2 and 3. To further evaluate the impact of caching parts of mapping-related data structures, the IoZone experiment was run multiple times for different cache sizes. Results are illustrated in Figure 8.

Operation    | dm-crypt | PD-DM           | HIVE [8]
Public Read  | 97.3     | 17.452 ± 0.367  | 0.095
Public Write | 88.0     | 1.347 ± 0.093   | 0.013
Hidden Read  | n/a      | 12.030 ± 0.088  | 0.113
Hidden Write | n/a      | 1.456 ± 0.086   | 0.016

Table 2. Sequential workload throughput (MB/s) comparison with the Bonnie++ benchmark. PD-DM is 100-150 times faster than HIVE for sequential read and write. The PD-DM throughputs reported here are the average value over 40 runs with the standard error.

5.1.1 Throughput

Sequential I/O. For consistency in comparing with HIVE, Bonnie++ was first used to benchmark the throughput of sequential read and write for both volumes. The mapping-related structures cache was limited to 1MB. Results averaged over 40 runs are shown in Table 2. Overall, PD-DM performs about two orders of magnitude faster for both public and hidden volumes. For example, the basic Bonnie++ benchmark required about one full week to complete on existing work [8] in the same setup. The same benchmark completed in under 4 hours on PD-DM.

Figure 6 shows the result of every run as a sample point as well as the mean value for the hidden volume I/O throughput. Although the throughput varies around the mean value for both read and write, it does not decrease as more runs are completed. Note that Bonnie++ writes more than 8GB of data to the disk in each run – as a result the disk has been written over and over during the test. Moreover, write throughput nears the mean with increasing run number, i.e., the disk is converging to a stable state. The results for the public volume are similar and not presented here.

Fig. 6. PD-DM hidden I/O throughput results of multiple runs: (a) sequential read, (b) sequential write (throughput in MB/s over 40 runs, with the mean marked).

As expected, PD-DM and all existing PD systems are slower when compared to simple encryption (dm-crypt) without PD. However, notably, PD-DM is the first PD system protecting against a multi-snapshot adversary with performance in the MB/s range on hard disks, and 15MB/s+ for public reads.

Random I/O. IoZone was run to evaluate system behavior under random I/O workloads. IoZone reads files of different sizes using random intra-file accesses. Direct I/O is set for all file operations so that I/O caching by the OS is bypassed. Further, the RAM allocated for PD-DM mapping-related data structures (PPM, PBM, etc) is limited to 1MB. Results of random I/O throughputs for file sizes of 1GB are shown in Table 3.

It can be seen that the PD-DM sequential write mechanism really makes an impact and performs 3–30× faster than HIVE. Notably, even standard dm-crypt is only 2–3× faster.

Overall, PD-DM performs orders of magnitude faster than existing solutions for both sequential and random workloads. Moreover, a great potential advantage of PD-DM for random workloads can be seen by analyzing the results in Figure 7, depicting the performance for sequential and random I/O of dm-crypt and the two volumes in PD-DM respectively. Note that both read and write throughputs drop significantly faster for dm-crypt when the I/O pattern changes from sequential to random. In contrast, throughputs of the two PD-DM volumes are significantly less affected, especially for writes. This is because the underlying write pattern is always sequential in PD-DM, regardless of logical I/O request locality. It is important to note however that accessing mapping structures may disturb the sequentiality of resulting write patterns. Fortunately, the design allows for even a small amount of cache to have a great impact.

Fig. 7. PD-DM throughput comparison with dm-crypt (sequential and random read/write throughput, in MB/s, for dm-crypt and the PD-DM public and hidden volumes). Random I/O results in a throughput drop. PD-DM is less affected because the underlying write pattern is always sequential regardless of logical I/O request locality.

5.1.2 Cache Size

To evaluate the influence of cache sizes on performance under sequential or random access with files of different sizes, IoZone is run for different cache sizes. Results are in Figure 8.

[Figure 8 consists of eight throughput-vs-file-size panels (one curve per cache size of 4KB, 400KB, 1MB and 4MB, with dm-crypt as reference): (a) Hidden volume of PD-DM vs. dm-crypt, random write; (b) Hidden volume of PD-DM vs. dm-crypt, random read; (c) Public volume of PD-DM vs. dm-crypt, random read; (d) PD-DM, hidden volume, sequential write; (e) dm-crypt, sequential write; (f) PD-DM, hidden volume, sequential read; (g) PD-DM, public volume, sequential read; (h) dm-crypt, sequential read.]

Fig. 8. IoZone throughput comparison for different workloads, cache sizes, and file sizes between PD-DM and dm-crypt. Figure 8(a) shows how PD-DM random-write throughput is affected by cache size. Write throughput exceeds that of dm-crypt as the cache size increases, since caching the mapping data structures reduces extraneous seeks and increases locality/sequentiality in the underlying writes. Similarly, for random read (Figures 8(b) and 8(c)), increased caching improves performance for both public and hidden volumes, especially when the file size is relatively large. But the throughput cannot be better than dm-crypt; cache size determines the turning point. Sequential I/O is illustrated in Figures 8(d)–8(h). Throughputs of dm-crypt are shown in Figures 8(e) and 8(h). PD-DM write throughput (Figure 8(d)) varies in a small range except for one special case (cache size = 4KB), since one hidden map entry is related to 2 map blocks (8KB). Read throughput (Figures 8(f) and 8(g)) is also less affected by caching as long as the number of cache blocks is larger than the number of mapping blocks read for one mapping entry (4KB for the public volume and 8KB for the hidden volume). Contrastingly, the sequential-write throughput of dm-crypt increases with increasing file size at first, then remains in a stable range.

Operation        | dm-crypt | PD-DM          | HIVE [8]
Public Read      | 0.526    | 0.309 ± 0.005  | 0.012
Public Write     | 1.094    | 0.276 ± 0.017  | 0.010
Hidden Read      | n/a      | 0.309 ± 0.011  | 0.097
Hidden Write     | n/a      | 0.282 ± 0.024  | 0.014
Seeks Per Second | 1227     | 336/89         | 10.3/5.7

Table 3. Performance comparison for random workloads. The PD-DM throughputs reported here are the average value over 40 runs with the standard error. The first four rows show throughput (MB/s) of random read and write for the different volumes respectively as measured by IoZone. The last row represents the number of seeks per second (the numbers for the public and hidden volume are separated by a slash) reported by Bonnie++.

Write throughput. Figure 8(a) illustrates random-write throughput of the hidden volume with different cache sizes. Random-write throughput for dm-crypt decreases gradually (from 3MB/s to 1MB/s) with increasing file sizes, mainly because of the higher number of seeks for larger files. However, PD-DM random-write throughput behaves differently. For caches above 400KB, it remains stable for relatively small files, and then drops for file sizes beyond a threshold. As expected, larger caches result in larger thresholds. For caches over 4MB, write throughput stabilizes for files smaller than 1GB. This means that map blocks are mostly read from/written to the caches without disk seeks. As a result, PD-DM random-write throughput exceeds that of dm-crypt.

It is worth noting that the same cache mechanism does not work for dm-crypt as its logical-to-physical block mapping is fixed and the larger number of seeks derives from the logical-layer access randomness. In contrast, in PD-DM, additional seeks occur only when accessing mapping blocks where multiple entries are stored together. This effectively maximizes the overall cache-hit rate for the mapping data structures.

Sequential-write throughput for the hidden volume (Figure 8(d)) is compared with the dm-crypt curve (Figure 8(e)). As seen, the cache size has a decreasing effect above caches larger than 4KB (1 block). Moreover, the throughput is not greatly affected by file size (unlike in the case of dm-crypt), although a larger file still results in a lower throughput.

Read throughput. Random-read throughput for both volumes (Figures 8(b) and 8(c)) follows a trend similar to the random-write throughput (Figure 8(a)): throughput decreases significantly for file sizes beyond a certain threshold, and larger caches result in larger thresholds. For a given cache size, the threshold is the same for both volumes.

Sequential-read throughput for both volumes is less affected by either file or cache size (Figures 8(f)–8(h)). An interesting special case occurs for caches of only 4KB (1 block): the sequential-read throughput of the hidden volume drops by half (Figure 8(f)), similar to what happens in Figure 8(d). The reason is that reading a hidden mapping entry requires reading more than one block when the height of the hidden map tree is larger than 2, which results in continuous cache misses. The same does not happen for the public volume, mainly because a single physical block can contain multiple adjacent public mapping entries, so even a one-block cache has an impact.

The variation in sequential-read throughput is also mildly affected by the existing layout of file blocks on disk; after all, a contiguous file may be fragmented by existing data on disk.

5.2 SSD Benchmarks

PD-DM is expected to perform most of its magic on HDDs, where reducing the number of disk seeks yields significant performance benefits. However, PD-DM also performs significantly better than existing work when run on SSDs.

A throughput comparison including PD-DM, HIVE, and dm-crypt is shown in Table 4. Only a minimal 40KB of cache is used – since seeks are cheap on SSDs, we expected (and found) that cache size has little effect on performance.

Operation      dm-crypt   PD-DM              HIVE [8]
Public Read    225.56     99.866 ± 0.015     0.771
Public Write   210.10     2.705 ± 0.023      0.546
Hidden Read    n/a        97.373 ± 0.009     4.534
Hidden Write   n/a        3.176 ± 0.017      0.609

Table 4. Throughput (MB/s) comparison on SSD for a sequential workload using Bonnie++. PD-DM is 30–180× faster than HIVE for sequential reads, and 5× faster for sequential writes.

Overall, although the performance advantage of reducing the number of seeks decreases on SSDs, PD-DM still outperforms existing work due to two important facts: (i) reads are unlinked from writes, and (ii) PD-DM write complexity is only O(log N), compared to O(log² N) in HIVE.

6 Discussion

Advantages. PD-DM has a number of other advantages over existing schemes aside from performance. First, PD-DM does not require secure stash queues as long as enough public writes are ongoing. It is worth noting, however, that caches and filesystem-related buffers should be protected from adversaries since they contain information about hidden volumes. Second, PD-DM preserves the data locality of logical accesses: logical data written together will, with high probability, be arranged together on the device. Third, PD-DM breaks the bond between reads and writes while ensuring security. Adversaries may have prior knowledge of the relationship between physical reads and writes, since reads and writes are usually generated together by a filesystem. HIVE masks hidden writes using public reads, which may break this FS-implied relationship and arouse suspicion. For example, a hidden write masqueraded as a read to a public file data block may raise suspicion if real public reads are usually coupled with metadata or journaling writes, which may be absent in this case.

Physical Space Utilization. Considering only the data ORAM in HIVE with two logical volumes, HIVE can use at most 50% of the disk for public and hidden data, since half the disk is required to be free. Thus, assuming that the two logical volumes are of the same size, each would be 25% of the total device. In PD-DM, ρ and k decide the volume sizes: both the public and the hidden volume are a fraction (1 − ρ) · 1/(k + 1) of the entire device (considering only the Data area). Assuming k = 3 (which works for volume sizes from 4GB to 4TB), an empirically chosen sweet spot balancing space and performance can be found at ρ = 20%, which results in an overall volume size of around 20% of the device.
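As a quick sanity check of these numbers, plugging the quoted parameters into the formula above gives

\[ \frac{1-\rho}{k+1} \;=\; \frac{1-0.2}{3+1} \;=\; 0.2, \]

i.e., each of the public and hidden volumes can address roughly 20% of the Data area, compared with 25% per volume for HIVE's two-volume configuration.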

Optimization Knobs. Performance is impacted by a number of tuning knobs that we discuss here.

Increasing ρ results in increased throughput at the cost of decreased space utilization. The intuition is that the probability that the k + 1 target blocks for PD_writes contain overwritable (invalid) data increases with increasing ρ. Thus ρ enables fine-tuning of the trade-off between throughput and space utilization.

Further, changing the ratio between the number of public data blocks and the number of hidden data blocks written together can help balance the public write throughput and the waiting time of hidden writes. In this paper, we assume that one public data block is always written together with one hidden data block. For workloads where the user accesses public data more often than hidden data, the ratio can be changed so that more than one public data block is written with one hidden data block (e.g., 2 public data blocks paired with 1 hidden data block). This results in higher throughput for public writes at the cost of delayed writes for hidden data and a requirement for more memory to queue up outstanding hidden data writes.

The physical space utilization is also affected when this ratio changes: the size of the public volume increases while the size of the hidden volume decreases. For example, if 2 public data blocks are paired with 1 hidden data block, then k + 2 instead of k + 1 blocks are written in the Data area per PD_write. For an unchanged ρ, the size of the public volume becomes (1 − ρ) · 1/(k/2 + 1) > (1 − ρ) · 1/(k + 1) of the size of the Data area, while the size of the hidden volume becomes (1 − ρ) · 1/(2(k/2 + 1)) = (1 − ρ) · 1/(k + 2) < (1 − ρ) · 1/(k + 1) of the size of the Data area.

7 Related Work

Plausibly-deniable encryption (PDE) is related to PD and was first introduced by Canetti et al. [9]. PDE allows a given ciphertext to be decrypted to multiple plaintexts by using different keys. The user reveals the decoy key to the adversary when coerced and plausibly hides the actual content of the message. Examples of PDE-enabled systems include [7, 19]. Unfortunately, most PDE schemes only work for very short messages and are not suitable for larger data storage devices.

Steganographic Filesystems. Anderson et al. [5] explored “steganographic filesystems” and proposed two ideas for hiding data. The first idea is to use a set of cover files and their linear combinations to reconstruct hidden files. This solution is not practical since the associated performance penalties are significant. The second idea is to store files at locations determined by a crypto hash of the filename; it requires storing multiple copies of the same file at different locations to prevent data loss. McDonald and Kuhn [23] implemented a steganographic filesystem for Linux on the basis of this idea. Pang et al. [26] improved on the previous constructions by avoiding hash collisions and providing more efficient storage.

These steganographic filesystem ideas defend only against a single-snapshot adversary. Han et al. [17] designed a “Dummy-Relocatable Steganographic” (DRSteg) filesystem that allows multiple users to share the same hidden file. The idea is to relocate data at runtime to confuse a multi-snapshot adversary and provide a degree of deniability. However, such ideas do not scale well to practical scenarios as they require assumptions of multiple users with joint ownership and plausible activities on the same devices. Gasti et al. [16] proposed a PD solution (DenFS) specifically for cloud storage. Its security depends on processing data temporarily on a client machine, and it is not straightforward to deploy DenFS for local storage.

DEFY [27] is a plausibly-deniable file system for flash devices targeting multi-snapshot adversaries. It is based on WhisperYAFFS [30], a log-structured filesystem which provides full disk encryption for flash devices. It is important to note that, due to the inherent log-structured nature derived from WhisperYAFFS, DEFY – like PD-DM – writes data sequentially to the underlying flash device, with the intention of maintaining reasonable flash wear-leveling behavior and thus not significantly reducing device lifespan. However, unlike PD-DM, PD in DEFY is unrelated to the sequential access pattern and is instead achieved through encryption-based secure deletion. Additionally, DEFY takes certain performance-enhancing shortcuts which weaken its PD assurances and associated security properties as follows.

Firstly, DEFY does not guarantee that a block it writes is independent of the existence of hidden levels. Specifically, all writes in DEFY bypass the blocks occupied by hidden data unless the filesystem is always mounted at the public level. This is a significant potential vulnerability that undermines PD and requires careful analysis.

Secondly, DEFY writes checkpoint blocks for hidden levels after all data has been written to the device. Thus, these checkpoint blocks can no longer be explained as deleted public data, which leaks the existence of hidden levels.

Thirdly, DEFY requires all filesystem-related metadata to be stored in memory. This does not scale for memory-constrained systems with large external storage; the use of an in-memory free-page bitmap aggravates this problem. In contrast, PD-DM stores mapping-related data structures securely on disk and allows the option of caching portions thereof.

Block Devices. At the block device level, disk encryption tools such as TrueCrypt [3] and Rubberhose [19] provide PD against single-snapshot adversaries. Mobiflage [29] provides PD for mobile devices but cannot protect against multi-snapshot adversaries either. Blass et al. [8] (HIVE) are the first to deal with PD against a multi-snapshot adversary at the device level. They use a write-only ORAM for mapping data from logical volumes to an underlying storage device and hide access patterns for hidden data within reads to non-hidden (public) data. Chakraborti et al. [6] use writes to public data to hide accesses to hidden data based on another write-only ORAM that does not need recursive maps.

8 Conclusion

PD-DM is a new efficient block device with plausible deniability, resistant against multi-snapshot adversaries. Its sequential physical write trace design reduces the important seek components of random access latencies on rotational media. Moreover, it also reduces the number of physical accesses required to serve a logical write. Overall, this results in orders-of-magnitude (10–100×) speedups over existing work. Notably, PD-DM is the first plausibly-deniable system with throughput finally getting within reach of the performance of standard encrypted volumes (dm-crypt) for random I/O. Further, under identical, fine-tuned caching, PD-DM outperforms dm-crypt for random I/O.

Acknowledgments

This work was supported by the National Science Foundation through awards 1161541, 1318572, 1526102, and 1526707.

References

[1] Bonnie++. "http://www.coker.com.au/bonnie++".
[2] Hive. "http://www.onarlioglu.com/hive".
[3] TrueCrypt. "http://truecrypt.sourceforge.net/".
[4] Western Digital Blue WD10SPCX. "https://www.goharddrive.com/Western-Digital-Blue-WD10SPCX-1TB-5400RPM-2-5-HDD-p/g02-0319.htm".
[5] R. Anderson, R. Needham, and A. Shamir. The steganographic file system. In Information Hiding, pages 73–82. Springer, 1998.
[6] A. Chakraborti, C. Chen, and R. Sion. DataLair: Efficient block storage with plausible deniability against multi-snapshot adversaries. Proceedings on Privacy Enhancing Technologies, 2017(3), 2017.
[7] D. Beaver. Plug and play encryption. In Advances in Cryptology – CRYPTO ’97, pages 75–89. Springer, 1997.
[8] E.-O. Blass, T. Mayberry, G. Noubir, and K. Onarlioglu. Toward robust hidden volumes using write-only oblivious RAM. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, pages 203–214. ACM, 2014.
[9] R. Canetti, C. Dwork, M. Naor, and R. Ostrovsky. Deniable encryption. In Advances in Cryptology – CRYPTO ’97, pages 90–104. Springer, 1997.
[10] R. M. Corless, G. H. Gonnet, D. E. Hare, D. J. Jeffrey, and D. E. Knuth. On the Lambert W function. Advances in Computational Mathematics, 5(1):329–359, 1996.
[11] P. Desnoyers. Analytic models of SSD write performance. Trans. Storage, 10(2):8:1–8:25, Mar. 2014.
[12] P. Desnoyers. Analytic models of SSD write performance. ACM Transactions on Storage (TOS), 10(2):8, 2014.
[13] F. Douglis and J. Ousterhout. Log-structured file systems. In COMPCON Spring ’89. Thirty-Fourth IEEE Computer Society International Conference: Intellectual Leverage, Digest of Papers, pages 124–129. IEEE, 1989.
[14] J. Dursi. On random vs. streaming I/O performance; or seek(), and you shall find – eventually, 2015. "http://simpsonlab.github.io/2015/05/19/io-performance/".
[15] C. W. Fletcher, L. Ren, A. Kwon, M. van Dijk, and S. Devadas. Freecursive ORAM: [nearly] free recursion and integrity verification for position-based oblivious RAM. In ACM SIGPLAN Notices, volume 50, pages 103–116. ACM, 2015.
[16] P. Gasti, G. Ateniese, and M. Blanton. Deniable cloud storage: sharing files via public-key deniability. In Proceedings of the 9th Annual ACM Workshop on Privacy in the Electronic Society, pages 31–42. ACM, 2010.
[17] J. Han, M. Pan, D. Gao, and H. Pang. A multi-user steganographic file system on untrusted shared storage. In Proceedings of the 26th Annual Computer Security Applications Conference, pages 317–326. ACM, 2010.
[18] J. Hinks. NSA copycat spyware could be snooping on your hard drive, 2015. "https://goo.gl/Ktesvl".
[19] J. Assange, R.-P. Weinmann, and S. Dreyfus. Rubberhose: cryptographically deniable transparent disk encryption system, 1997. "http://marutukku.org".
[20] C. M. Kozierok. Access time, 2001. "http://pcguide.com/ref/hdd/perf/perf/spec/pos_Access.htm".
[21] C. M. Kozierok. Seek time, 2001. "http://pcguide.com/ref/hdd/perf/perf/spec/posSeek-c.html".
[22] L. Mathews. Adobe software may be snooping through your hard drive right now, 2014. "https://goo.gl/lXzPS3".
[23] A. D. McDonald and M. G. Kuhn. StegFS: A steganographic file system for Linux. In Information Hiding, pages 463–477. Springer, 1999.
[24] J. Mull. How a Syrian refugee risked his life to bear witness to atrocities. Toronto Star Online, posted 14-March-2012, 2012. "https://goo.gl/QsivgB".
[25] W. Norcott and D. Capps. IOzone filesystem benchmark, 2016. "http://www.iozone.org/".
[26] H. Pang, K.-L. Tan, and X. Zhou. StegFS: A steganographic file system. In Data Engineering, 2003. Proceedings. 19th International Conference on, pages 657–667. IEEE, 2003.
[27] T. Peters, M. Gondree, and Z. N. J. Peterson. DEFY: A deniable, encrypted file system for log-structured storage. In 22nd Annual Network and Distributed System Security Symposium, NDSS 2015, San Diego, California, USA, February 8–11, 2015, 2015.
[28] M. Rosenblum and J. K. Ousterhout. The design and implementation of a log-structured file system. ACM Transactions on Computer Systems (TOCS), 10(1):26–52, 1992.
[29] A. Skillen and M. Mannan. On implementing deniable storage encryption for mobile devices. 2013.
[30] WhisperSystems. GitHub: WhisperSystems/WhisperYAFFS: Wiki, 2012. "https://github.com/WhisperSystems/WhisperYAFFS/wiki".
[31] X. Zhou, H. Pang, and K.-L. Tan. Hiding data accesses in steganographic file system. In Data Engineering, 2004. Proceedings. 20th International Conference on, pages 572–583. IEEE.

A Appendix

A.1 PD-CPA Game

1. Adversary A provides a storage device D (the adversary can fully decide its state) to challenger C.
2. C chooses two encryption keys Kpub and Khid using security parameter λ and creates two logical volumes, Vpub and Vhid, both stored in D. Writes to Vpub and Vhid are encrypted with keys Kpub and Khid respectively. C also fairly selects a random bit b.
3. C returns Kpub to A.
4. The adversary A and the challenger C then engage in a polynomial number of rounds in which:
   (a) A selects access patterns O0 and O1 with the following restrictions:
       i. O1 and O0 include the same writes to Vpub.
       ii. Both O1 and O0 may include writes to Vhid.
       iii. O0 and O1 should not include more writes to Vhid than φ times the number of operations to Vpub in that sequence.
   (b) C executes Ob on D and sends a snapshot of the device to A.
   (c) A outputs b′.
   (d) A is said to have “won” the round iff b′ = b.
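For intuition, the following is a minimal, illustrative Python sketch of the game structure above. The Challenger/Adversary objects, their execute(), snapshot(), choose_access_patterns(), and guess() methods, and the advantage computation are hypothetical stand-ins used only to show the shape of the definition; they are not part of PD-DM.

    # Illustrative sketch of the PD-CPA game loop; all classes/functions are
    # hypothetical stand-ins used only to show the structure of the definition.
    import secrets

    def pd_cpa_round(challenger, adversary, device):
        b = secrets.randbits(1)                       # challenger's secret bit
        o0, o1 = adversary.choose_access_patterns()   # same public writes in both
        challenger.execute(device, o1 if b else o0)   # run O_b on the device
        guess = adversary.guess(challenger.snapshot(device))
        return guess == b                             # adversary "wins" iff b' = b

    def advantage(challenger, adversary, device, rounds=1000):
        wins = sum(pd_cpa_round(challenger, adversary, device) for _ in range(rounds))
        # Security requires the win rate to stay negligibly close to 1/2.
        return abs(wins / rounds - 0.5)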

A.2 Algorithms

Algorithms 6 to 9 define how the logical-to-physical maps for the public and hidden volumes are maintained in PD-DM.

Algorithm 6 Set_Pub_map(lp, α)
Input: lp, α, αPPM
 // Find the physical block containing the map entry
 1: αM := αPPM + ⌊lp / ⌊B/b⌋⌋
 2: mapblock := Phy_read(αM)
 // Set the map entry and write back
 3: mapblock[lp mod ⌊B/b⌋] ← α
 4: Phy_write(αM, mapblock)

Algorithm 7 α = Get_Pub_map(lp)
Input: lp, αPPM (αPPM is the starting block ID of the PPM in the physical device)
Output: α
 // Find the physical block containing the map entry
 1: αM := αPPM + ⌊lp / ⌊B/b⌋⌋
 2: mapblock := Phy_read(αM)
 // Get the map entry
 3: α := mapblock[lp mod ⌊B/b⌋]

Algorithm 8 Set_Hid_map(lh, α)
Input: lh, α, αHM_Root, k
 // Update the HM_ROOT
 1: αM ← αHM_Root, X ← (⌊B/b⌋ − 1) · ⌊B/b⌋^(k−2)
 2: mapblock := Phy_read(αM)
 3: αM := mapblock[lh/X]
 4: mapblock[lh/X] := α − k + 1
 5: Phy_write(αHM_Root, mapblock)
 6: lh ← lh mod X
 7: for i ∈ {2, . . . , k} do
 // Update each layer of the HM tree by writing sequentially
 8:   X ← (⌊B/b⌋ − 1) · ⌊B/b⌋^(k−i−1)
 9:   mapblock := Phy_read(αM)
10:   αM := mapblock[lh/X]
11:   mapblock[lh/X] := α − k + i
12:   Phy_write(α − k + i − 1, mapblock)
13:   lh ← lh mod X
14: end for

Algorithm 9 α = Get_Hid_map(lh)
Input: lh, αHM_Root, k (αHM_Root is the block ID of HM_ROOT in the physical device; k is the height of the HM tree)
Output: α
 1: αM ← αHM_Root
 2: for i ∈ {1, . . . , k − 1} do
 // Traverse one path in the HM tree from root to the corresponding leaf node
 3:   X ← (⌊B/b⌋ − 1) · ⌊B/b⌋^(k−i−1)
 4:   mapblock := Phy_read(αM)
 5:   αM := mapblock[lh/X]
 6:   lh ← lh mod X
 7: end for
 8: mapblock := Phy_read(αM)
 // Get the map entry
 9: α := mapblock[lh]

Algorithms 10 and 11 illustrate how logical read requests are served in PD-DM for the public and hidden volume respectively.

Algorithm 10 dp = Pub_read(lp)
Input: lp
Output: dp
 1: α := Get_Pub_map(lp)
 2: dp := Phy_read(α)

Algorithm 11 dh = Hid_read(lh)
Input: lh
Output: dh
 1: α := Get_Hid_map(lh)
 2: dh := Phy_read(α)
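To make the map traversal concrete, here is a small, illustrative Python sketch in the spirit of Algorithms 7 and 9. It assumes a simplified uniform-fanout tree and an invented phy_read/disk interface; it is not the exact PD-DM on-disk layout.

    # Illustrative sketch only: simplified logical-to-physical lookups in the
    # spirit of Algorithms 7 and 9. ENTRIES_PER_BLOCK plays the role of floor(B/b);
    # 'disk' is a dict {physical_block_id: list_of_entries} standing in for Phy_read.

    ENTRIES_PER_BLOCK = 4          # assumed entries per map block (floor(B/b))

    def phy_read(disk, block_id):
        return disk[block_id]

    def get_pub_map(disk, ppm_start, lp):
        # Public map: a flat array of map blocks starting at ppm_start (Algorithm 7).
        map_block = phy_read(disk, ppm_start + lp // ENTRIES_PER_BLOCK)
        return map_block[lp % ENTRIES_PER_BLOCK]

    def get_hid_map(disk, hm_root, k, lh):
        # Hidden map: walk one root-to-leaf path of a k-level tree (Algorithm 9),
        # assuming every node simply has ENTRIES_PER_BLOCK children (simplified).
        node = hm_root
        span = ENTRIES_PER_BLOCK ** (k - 1)      # logical addresses under each child
        for _ in range(k - 1):
            map_block = phy_read(disk, node)
            node = map_block[lh // span]
            lh %= span
            span //= ENTRIES_PER_BLOCK
        return phy_read(disk, node)[lh]

    # Tiny usage example with a hand-built two-level hidden map (k = 2):
    disk = {
        100: [200, 201, 202, 203],               # HM_ROOT points to 4 leaf map blocks
        200: [7, 8, 9, 10], 201: [11, 12, 13, 14],
        202: [15, 16, 17, 18], 203: [19, 20, 21, 22],
    }
    assert get_hid_map(disk, hm_root=100, k=2, lh=5) == 12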

A.3 Security Proofs

Lemma 1. Write traces of a PD_write without a hidden write are indistinguishable from write traces of a PD_write with a hidden write request.

Proof (sketch): As shown in Algorithm 4, any PD_write writes a pair consisting of a public data block dp and a hidden tuple T(dh) or T(dummy) to the next k + 1 blocks of the Data area and updates the HM_ROOT, regardless of whether hidden write requests exist or not. Further, the written public data block and hidden tuple are encrypted with a semantically secure cipher, which prevents adversaries from distinguishing between tuples with new hidden data, existing hidden data, or dummy data. Overall, both the location and the content of the write trace are fixed and unrelated to whether or not a Hid_write(lh, dh) request was associated with this underlying PD_write.

Lemma 2. The number of PD_write operations executed for a Pub_write request is not related to whether Hid_write requests are performed together with it.

Proof (sketch): As illustrated in Algorithm 2, the while loop does not stop until the public data dp has been written by a PD_write and then set to Null. This happens only when the obtained public status sp is invalid, which relates only to the state of the public data on the disk, and is thus independent of whether any Hid_writes are performed.

Lemma 3. The number of underlying PD_write operations triggered by any access pattern is not related to whether it contains any hidden writes.

Proof (sketch): This follows straightforwardly since PD_write operations can only be triggered by public write requests. Note that hidden writes can certainly be completed within those PD_writes under restriction 6(a)iv. Accordingly (Lemma 2), the number of PD_write operations needed to execute any access pattern depends only on the public writes and is independent of the hidden writes.

Lemma 4. Write traces to the Data area are indistinguishable for any access patterns that contain the same public writes, notwithstanding their contained hidden writes.

Proof (sketch): For access patterns that contain the same public writes, Lemma 1 and Lemma 3 ensure that the corresponding numbers of PD_write operations are the same and that their actual write traces are indistinguishable. As PD-DM always writes to the Data area with PD_write operations, the write traces to the Data area are indistinguishable.

Lemma 5. Write traces to the PPM are identical for any access patterns that contain the same public writes, notwithstanding their contained hidden writes.

Proof (sketch): The PPM is written only in the case of a Pub_write. For access patterns that contain the same Pub_write operations, the corresponding PPM write traces are the same.

Lemma 6. Write traces to the PBM are indistinguishable for any access patterns that contain the same public writes, notwithstanding their contained hidden writes.

Proof (sketch): Each Pub_write(lp, dp) marks the corresponding bits in the PBM bitmap for the written physical blocks as well as for the “recycled” physical block where the “old” version of lp resided (Section 3.2). Since no other operations modify the bitmap, any modifications thereto relate only to the Pub_write operations in an access pattern.

Lemma 7. Write traces to Phead are indistinguishable for any access patterns that contain the same public writes, notwithstanding their contained hidden writes.

Proof (sketch): Phead is incremented by k + 1 for each PD_write. As a result, the associated write trace is related only to the number of PD_write operations, and this number is determined by the public writes in an access pattern alone (Lemma 3). Thus, the resulting write trace is indistinguishable for any access patterns that contain the same public writes.
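The core of Lemmas 1 and 7 is that the physical locations touched by a PD_write are a fixed function of the head position alone. A minimal, illustrative Python sketch of that observation follows (the trace format and the pd_write_trace helper are invented for illustration, not PD-DM's actual I/O path):

    # Illustrative sketch: the physical locations touched by one PD_write depend
    # only on the current head position and k, never on whether a real hidden
    # write is being serviced. Names are hypothetical.

    HM_ROOT = 0          # assumed fixed physical location of HM_ROOT
    K = 3                # k, the height of the hidden map tree

    def pd_write_trace(head, hidden_payload=None):
        # The k + 1 sequential Data-area blocks plus the HM_ROOT update;
        # 'hidden_payload' (real data or a dummy) never influences the locations.
        return [head + i for i in range(K + 1)] + [HM_ROOT]

    # Same trace whether or not a hidden write is present:
    assert pd_write_trace(1000, hidden_payload=b"secret") == pd_write_trace(1000, hidden_payload=None)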