
WELES: Policy-driven Runtime Integrity Enforcement of Virtual Machines

Wojciech Ozga, Do Le Quoc, Christof Fetzer
TU Dresden, Germany; IBM Research Europe

May 3, 2021

Abstract

Trust is of paramount concern for tenants to deploy their security-sensitive services in the cloud. The integrity of virtual machines (VMs) in which these services are deployed needs to be ensured even in the presence of powerful adversaries with administrative access to the cloud. Traditional approaches for solving this challenge leverage trusted computing techniques, e.g., vTPM, or hardware CPU extensions, e.g., AMD SEV. But, they are vulnerable to powerful adversaries, or they provide only load-time (not runtime) integrity measurements of VMs.

We propose WELES, a protocol allowing tenants to establish and maintain trust in VM runtime integrity of software and its configuration. WELES is transparent to the VM configuration and setup. It performs an implicit attestation of VMs during a secure login and binds the VM integrity state with the secure connection. Our prototype's evaluation shows that WELES is practical and incurs low performance overhead (≤ 6%).

1 Introduction

The cloud computing paradigm shifts the responsibility of computing resources management from application owners to cloud providers, allowing application owners (tenants) to focus on their business use cases instead of on hardware management and administration. However, trust is of paramount concern for tenants operating security-sensitive systems because the software managing computing resources, as well as its configuration and administration, remains out of their control. Tenants have to trust that the cloud provider, its employees, and the infrastructure protect the tenant's intellectual property as well as the confidentiality and the integrity of the tenant's data. A malicious employee [8], or an adversary who gets into possession of employee credentials [74, 50], might leverage administrator privileges to read confidential data by introspecting virtual machine (VM) memory [85], to tamper with computation by subverting the hypervisor [54], or to redirect the tenant to an arbitrary VM under her control by altering a network configuration [91]. We tackle the problem of how to establish trust in a VM executed in the cloud. Specifically, we focus on the integrity of legacy systems executed in a VM.

The existing attestation protocols focus on leveraging trusted hardware to report measurements of the execution environment. In trusted computing [34], the trusted platform module attestation [11] and the integrity measurement architecture (IMA) [78] provide a means to enforce and monitor the integrity of the software that has been executed since the platform bootstrap [7]. The virtual TPM (vTPM) [19] design extends this concept by introducing a software-based trusted platform module (TPM) that, together with the hardware TPM, provides integrity measurements of the entire software stack, from the firmware and the hypervisor up to the VM. However, this technique cannot be applied to the cloud because an adversary can tamper with the communication between the vTPM and the VM. For example, by reconfiguring the network, she can mount a man-in-the-middle attack to perform a TPM reset attack [53], compromising the vTPM security guarantees.

A complementary technology to trusted computing,

the trusted execution environment (TEE) [46], uses hardware extensions to exclude the administrator and privileged software, i.e., the operating system and the hypervisor, from the trusted computing base (TCB). The Intel software guard extensions (SGX) [29] come with an attestation protocol that permits remotely verifying an application's integrity and the genuineness of the underlying hardware. However, it is available only to applications executing inside an SGX enclave. Legacy applications executed inside an enclave suffer from performance limitations due to a small amount of protected memory [15]. The SGX adoption in the virtualized environment is further limited because the protected memory is shared among all tenants.

Alternative technologies isolating VMs from the untrusted hypervisor, e.g., AMD SEV [52, 51] or IBM PEF [44], do not have memory limitations. They support running the entire operating system in isolation from the hypervisor while incurring minimal performance overhead [41]. However, their attestation protocol only provides information about the VM integrity at the VM initialization time. It is not sufficient because the loaded operating system might get compromised later, at runtime, through operating system vulnerabilities or misconfiguration [88]. Thus, to verify the runtime (post-initialization) integrity of the guest operating system, one would still need to rely on the vTPM design. But, as already mentioned, it is not enough in the cloud environment.

Importantly, the security models of these hardware technologies isolating the VM from the hypervisor assume threats caused by running tenants' OSes in a shared execution environment, i.e., attacks performed by rogue operators, a compromised hypervisor, or malicious co-tenants. These technologies do not address the fact that a typical tenant's OS is a complex mixture of software and configuration with a large attack vector. I.e., the protected application is not, like in the SGX, a single process, but the kernel, userspace services, and applications, which might be compromised while running inside the TEE and thus expose the tenant's computation and data to threats. These technologies assume it is the tenant's responsibility to protect the OS, but they lack primitives to enable runtime integrity verification and enforcement of guest OSes. This work proposes means to enable such primitives, which are neither provided by the technologies mentioned above nor by the existing cloud offerings.

We present WELES, a VM remote attestation protocol that provides integrity guarantees to legacy systems executed in the cloud. WELES has noteworthy advantages. First, it supports legacy systems with zero-code changes by running them inside VMs on the integrity-enforced execution environment. To do so, it leverages trusted computing to enforce and attest to the hypervisor's and VM's integrity. Second, WELES limits the system administrator activities in the host OS using integrity-enforcement mechanisms, while relying on the TEE to protect its own integrity from tampering. Third, it supports tenants connecting from machines not equipped with trusted hardware. Specifically, WELES integrates with the secure shell (SSH) protocol [89]. Login to the VM implicitly performs an attestation of the VM.

Our contributions are as follows:
• We designed a protocol, WELES, attesting to the VM's runtime integrity (§4).
• We implemented the WELES prototype using state-of-the-art technologies commonly used in the cloud (§5).
• We evaluated it on real-world applications (§6).

2 Threat model

We require that the cloud node is built from software whose source code is certified by a trusted third party [3] or can be reviewed by tenants, e.g., open-source software [13] or software accessible under a non-disclosure agreement. Specifically, such software is typically considered safe and trusted when (i) it originates from trusted places like the official git repository; (ii) it passes security analysis like fuzzing [90]; (iii) it is implemented using memory-safe languages, like Rust; (iv) it has been formally proven, like seL4 [56] or EverCrypt [72]; (v) it was compiled with memory corruption mitigations, e.g., position-independent executables with stack-smashing protection.

Our goal is to provide tenants with a runtime integrity attestation protocol that ensures that the cloud node (i.e., host OS, hypervisor) and the VM (guest OS, tenant's legacy application) run only expected software in the expected configuration. We distinguish between an internal and an external adversary, both without capabilities of mounting physical and hardware attacks (e.g., [87]). This is a reasonable assumption since cloud providers control and limit physical access to their data centers.

An internal adversary, such as a malicious administrator or an adversary who successfully extracted an administrator's credentials [74, 50], aims to tamper with a hypervisor configuration or with a VM deployment to compromise the integrity of the tenant's legacy application. She has remote administrative access to the host machine that allows her to configure, install, and execute software. The internal adversary controls the network, which allows her to insert, alter, and drop network packets.

An external adversary resides outside the cloud. Her goal is to compromise the security-sensitive application's integrity. She can exploit a guest OS misconfiguration or use social engineering to connect to the tenant's VM remotely. Then, she runs dedicated software, e.g., a software debugger or a custom kernel, to modify the legacy application's behavior.

We consider the TPM, the CPU, and their hardware features trusted. We rely on the soundness of cryptographic primitives used by software and hardware components. We treat software-based side-channel attacks (e.g., [58]) as orthogonal to this work because of (i) the existence of countermeasures (e.g., [67]) whose presence is verifiable as part of the WELES protocol, and (ii) the possibility of provisioning a dedicated (not shared) machine in the cloud.

3 Background and Problem Statement

Load-time integrity enforcement. A cloud node is a computer where multiple tenants run their VMs in parallel on top of the same computing resources. VMs are managed by a hypervisor, a privileged layer of software providing access to physical resources and isolating VMs from each other. Since the VM's security depends on the hypervisor, it is essential to ensure that the correct hypervisor controls the VM.

Trusted computing [34] provides hardware and software technologies to verify the hypervisor's integrity. In particular, the dynamic root of trust for measurements (DRTM) [81] is a mechanism available in modern CPUs that establishes a trusted environment in which a hypervisor is securely measured and loaded. The hypervisor's integrity measurements are stored inside the hardware TPM chip in dedicated memory regions called platform configuration registers (PCRs) [11]. PCRs are tamper-resistant. They cannot be written directly but only extended with a new value using a cryptographic hash function: PCR_extend = hash(PCR_old_value || data_to_extend).
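
To make the extend semantics concrete, the computation can be reproduced in a few lines of Rust (a minimal sketch using the sha2 crate for a SHA-256 PCR bank; illustrative only, not the TPM-internal implementation):

    use sha2::{Digest, Sha256};

    /// Extends a PCR: the new register value is the hash of the old
    /// value concatenated with the measurement being recorded.
    fn pcr_extend(pcr_old: &[u8; 32], data_to_extend: &[u8]) -> [u8; 32] {
        let mut hasher = Sha256::new();
        hasher.update(pcr_old);
        hasher.update(data_to_extend);
        hasher.finalize().into()
    }

Because the hash function is one-way, a given PCR value can be reproduced only by replaying the exact sequence of extends; omitting or altering a single measurement yields a different final value.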

The TPM attestation protocol [40] defines how to read a report certifying the PCRs' values. The report is signed using a cryptographic key derived from the endorsement key, which is an asymmetric key embedded in the TPM chip at manufacturing time. The TPM also stores a certificate, signed by the manufacturer, containing the endorsement key's public part. Consequently, it is possible to check that a genuine TPM chip produced the report because the report's signature is verifiable using the public key read from the certificate.

Runtime integrity enforcement. The administrator has privileged access to the machine with complete control over the network configuration, with permissions to install, start, and stop applications. These privileges permit him to trick the DRTM attestation process because the hypervisor's integrity is measured just once, when the hypervisor is loaded to the memory. The TPM report certifies this state until the next DRTM launch, i.e., the next boot. Hence, after the hypervisor has been measured, an adversary can compromise it by installing an arbitrary hypervisor [75] or downgrading it to a vulnerable version without being detected.

The integrity measurement architecture (IMA) [7, 78, 43] allows for mitigation of the threat mentioned above. Being part of the measured kernel, IMA implements an integrity-enforcement mechanism [43], allowing for loading only digitally signed software and configuration. Consequently, signing only the software required to manage VMs allows for limiting activities carried out by an administrator on the host machine. The load of a legitimate kernel with enabled IMA and input-output memory management unit is ensured by DRTM, and it is attestable via the TPM attestation protocol.

Integrity auditing. IMA provides information on what software has been installed or launched since the kernel loading, what the system configuration is, and whether the system configuration has changed. IMA provides a tamper-proof history of collected measurements in a dedicated file called the IMA log. The tamper-proof property is maintained because IMA extends to the PCR the measurements of every executable, configuration file, script, and dynamic library before such a file or program is read or executed. Consequently, an adversary who tampers with the IMA log cannot hide her malicious behavior because she cannot modify the PCR value. She cannot overwrite the PCR, and she cannot reset it to the initial value without rebooting the platform.

Figure 1: An adversary with root access to the hypervisor can violate the security guarantees promised by the vTPM [19] design: (a) she modifies the routing of network packets, (b) she hijacks the vTPM communication via a legitimate VM, (c) she places a proxy in front of a vTPM.

Problems with virtualized TPMs. The TPM chip cannot be effectively shared with many VMs due to a limited amount of PCRs. The vTPM [19] design addresses this problem by running multiple software-based TPMs exposed to VMs by the hypervisor. This design requires verifying the hypervisor's integrity before establishing trust with a software-based TPM. We argue that verifying the hypervisor's integrity alone is not enough because an administrator can break the software-based TPM security guarantees by mounting a man-in-the-middle (MitM) attack using legitimate software, as we describe next. Consequently, the vTPM cannot be used directly to provide runtime integrity of VMs.

In the vTPM design, the hypervisor prepends a 4-byte vTPM identifier that allows routing the communication to the correct vTPM instance. However, the link between the vTPM and the VM is unprotected [31], and it is routed through an untrusted network. Consequently, an adversary can mount a masquerading attack to redirect the VM communication to an arbitrary vTPM (Figure 1a) by replacing the vTPM identifier inside the network packet. To mitigate the attack, we propose to use the transport layer security (TLS) protocol [9] to protect the communication's integrity.

Although the TLS helps protect the communication's integrity, the lack of authentication between the vTPM and the hypervisor still enables an adversary to fully control the communication by placing a proxy in front of the vTPM. In more detail, an adversary can configure the hypervisor in a way that it communicates with the vTPM via intermediary software, which intercepts the communication (Figure 1b). She can then drop arbitrary measurements or perform the TPM reset attack [53], thus compromising the vTPM security guarantees.

To mitigate the attack, the vTPM must ensure the remote peer's integrity (is it the correct hypervisor?) and its locality (is the hypervisor running on the same platform?). Although the TEE local attestation gives information about software integrity and locality, we cannot use it here because the hypervisor cannot run inside the TEE. However, suppose we find a way to satisfy the locality condition. In that case, we can leverage integrity measurements (IMA) to verify the hypervisor's integrity because among the trusted software running on the platform there can be only one program that connects to the vTPM: the hypervisor. To satisfy the locality condition, we make the following observation: only software running on the same platform has direct access to the same hardware TPM. We propose to share a secret between the vTPM and the hypervisor using the hardware TPM (§4.4). The vTPM then authenticates the hypervisor by verifying that the hypervisor presents the secret in the pre-shared key TLS authentication.

Finally, an adversary who compromises the guest OS can mount the cuckoo attack [69] to impersonate the legitimate VM. In more detail, an adversary can modify the TPM driver inside a guest OS to redirect the TPM communication to a remote TPM (Figure 1c). A verifier running inside a compromised VM cannot recognize whether he communicates with the vTPM attached to his VM or with a remote vTPM attached to another VM. The verifier is helpless because he cannot establish a secure channel to the vTPM that would guarantee communication with the local vTPM. To mitigate the attack, we propose leveraging the TEE attestation protocol to establish a secure communication channel between the verifier and the vTPM and to use it to exchange a secret allowing the verifier to identify the vTPM instance uniquely (§4.5).

Figure 2: The high-level overview of WELES. The VM's SSH key is bound to the VM's integrity state defined in the policy.

Figure 3: Multiple tenants interacting with WELES concurrently. WELES generates a dedicated SSH key for each deployed security policy and allows using it only if the VM's integrity conforms to the security policy.

4 WELES design

Our objective is to provide an architecture that: 1. protects legacy applications running inside a VM from threats defined in §2, 2. requires zero-code changes to legacy applications and the VM setup, 3. permits tenants to remotely attest to the platform runtime integrity without possessing any vendor-specific hardware.

4.1 High-level Overview

Figure 2 shows an overview of the cloud node running WELES. It consists of the following four components: (A) the VM, (B) the hypervisor managing the VM, providing it with access to physical resources and isolating it from other VMs, (C) trusted computing components enabling the hypervisor's runtime integrity enforcement and attestation, and (D) WELES, software executed inside a TEE that allows tenants to attest and enforce the VMs' integrity.

The configuration, the execution, and the operation of the above components are subject to attestation. First, the cloud operator bootstraps the cloud node and starts WELES. At the tenant's request, the cloud provider spawns a VM (①). Next, the tenant establishes trust with WELES (§4.5), which becomes the first trusted component on a remote platform. The tenant requests WELES to check if the hypervisor conforms to the policy (②), which contains tenant-specific trust constraints, such as integrity measurements (§4.2). WELES uses IMA and the TPM to verify that the platform's runtime integrity conforms to the policy and then generates a VM's public/private key pair. The public key is returned to the tenant (③). WELES protects access to the private key, i.e., it permits the VM to use the private key only if the platform matches the state defined inside the policy. Finally, the tenant establishes trust with the VM during the SSH handshake. He verifies that the VM can use the private key corresponding to the previously obtained public key (④). The tenant authenticates himself in a standard way, using his own SSH private key. His SSH public key is embedded inside the VM's image or provisioned during the VM deployment.

4.2 Tenant Isolation and Security Policy

Multiple applications with different security requirements might coexist on the same physical machine. WELES allows ensuring that applications run in isolation from each other and match their security requirements. Figure 3 shows how WELES assigns each VM a pair of a public and a private key. The keys are bound with the application's policy and the VM's integrity. Each tenant uses the public key to verify that he interacts with his VM controlled by the integrity-enforced hypervisor.

Listing 1 shows an example of a security policy. The policy is a text file containing a whitelist of the hardware TPM manufacturer's certificate chain (line 4), DRTM integrity measurements of the host kernel (lines 6-9), integrity measurements of the guest kernel (line 16), and legal runtime integrity measurements of the guest OS (lines 18-20, 24). The certificate chain is used to establish trust in the underlying hardware TPM. WELES compares DRTM integrity measurements with PCR values certified by the TPM to ensure that the correct hypervisor with the enabled integrity-enforcement mechanism was loaded. WELES uses runtime integrity measurements to verify that only expected files and software have been loaded to the guest OS memory. A dedicated certificate (line 24) makes the integrity enforcement practical because it permits more files to be loaded to the memory without redeploying the policy. Specifically, it is enough to sign software allowed to execute with the corresponding private key to let the software pass through the integrity-enforcement mechanism. Additional certificates (lines 12, 27) allow for OS updates [68].
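
For illustration, a policy of this shape could be represented and parsed in Rust roughly as follows (a hypothetical sketch assuming the serde and serde_yaml crates; the types mirror Listing 1 below, not the actual WELES sources):

    use serde::Deserialize;

    #[derive(Deserialize)]
    struct Policy {
        host: HostPolicy,
        guest: GuestPolicy,
    }

    #[derive(Deserialize)]
    struct HostPolicy {
        tpm: String,               // TPM manufacturer certificate chain (PEM)
        drtm: Vec<PcrDigest>,      // expected DRTM measurements of the host kernel
        certificate: String,       // software update certificate (PEM)
    }

    #[derive(Deserialize)]
    struct GuestPolicy {
        enforcement: bool,         // reject measurements that do not conform?
        pcrs: Vec<PcrDigest>,      // expected guest boot measurements
        measurements: Vec<String>, // whitelisted runtime integrity digests
        certificate: Vec<String>,  // signer certificates, e.g., for OS updates (PEM)
    }

    #[derive(Deserialize)]
    struct PcrDigest {
        id: u8,
        sha256: String,
    }

    fn parse_policy(yaml: &str) -> Result<Policy, serde_yaml::Error> {
        serde_yaml::from_str(yaml)
    }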

Listing 1: Example of the WELES security policy

 1 host:
 2   tpm: |-
 3     -----BEGIN CERTIFICATE-----
 4     # ManufacturerCA certificate
 5     -----END CERTIFICATE-----
 6   drtm: # measurements provided by the DRTM
 7     - {id: 17, sha256: f9ad0...cb}
 8     - {id: 18, sha256: c2c1a...c1}
 9     - {id: 19, sha256: a18e7...00}
10   certificate: |-
11     -----BEGIN CERTIFICATE-----
12     # software update certificate
13     -----END CERTIFICATE-----
14 guest:
15   enforcement: true
16   pcrs: # boot measurements (e.g., secure boot)
17     - {id: 0, sha256: a1a1f...dd}
18   measurements: # legal integrity measurement digests
19     - "e0a11...2a" # SHA digest over a startup script
20     - "3a10b...bb" # SHA digest over a library
21   certificate:
22     - |-
23       -----BEGIN CERTIFICATE-----
24       # certificate of the signer, e.g., OS updates
25       -----END CERTIFICATE-----
26       -----BEGIN CERTIFICATE-----
27       # software update certificate
28       -----END CERTIFICATE-----

4.3 Platform Bootstrap

The cloud provider is responsible for the proper machine initialization. She must turn on support for hardware technologies (i.e., TPM, DRTM, TEE), launch the hypervisor, and start WELES. Tenants detect if the platform was correctly initialized when they establish trust in the platform (§4.5).

First, WELES establishes a connection with the hardware TPM using the TPM attestation; it reads the TPM certificate and generates an attestation key following the activation of credential procedure ([16] p. 109-111). WELES ensures it communicates with the local TPM using the approach of [84], but other approaches might be used as well [33, 69]. Eventually, WELES reads the TPM quote, which certifies the DRTM launch and the measurements of the hypervisor's integrity.

4.4 VM Launch

The cloud provider requests the hypervisor to spawn a new VM. The hypervisor allocates the required resources and starts the VM, providing it with TPM access. At the end of the process, the cloud provider shares the connection details with the tenant, allowing the tenant to connect to the VM.

WELES emulates multiple TPMs inside the TEE because many VMs cannot share a single hardware TPM [19]. When requested by the hypervisor, WELES spawns a new TPM instance accessible on a unique TCP port. The hypervisor connects to the emulated TPM and exposes it to the VM as a standard character device. We further use the term emulated TPM to describe a TEE-based TPM and to distinguish it from the software-based TPM running inside the hypervisor, as proposed by the vTPM design.

The communication between the hypervisor and the emulated TPM is susceptible to MitM attacks. Unlike WELES, the hypervisor does not execute inside the TEE, preventing WELES from using the TEE attestation to verify the hypervisor identity. However, WELES confirms the hypervisor identity by requesting it to present a secret when establishing a connection. WELES generates a secret inside the TEE and seals it to the hardware TPM via an encrypted channel ([10] §19.6.7). Only software running on the same OS as WELES can unseal the secret. Thus, it is enough to check that only trusted software executes on the platform to verify that it is the legitimate hypervisor who presents the secret.

Figure 4 shows the procedure of attaching an emulated TPM to a VM. Before the hypervisor spawns a VM, it commands WELES to emulate a new software-based TPM (①). WELES creates a new emulated TPM, generates a secret, and seals the secret with the hardware TPM (②). WELES returns the TCP port and the sealed secret to the hypervisor. The hypervisor unseals the secret from the hardware TPM (③) and establishes a TLS connection to the emulated TPM, authenticating itself with the secret (④). At this point, the hypervisor spawns a VM. The VM boots up, and the firmware and IMA send integrity measurements to the emulated TPM (⑤). To protect against the rollback attack, each integrity measurement causes the emulated TPM to increment the hardware-based monotonic counter (MC) and store the current MC value inside the emulated TPM memory (⑥).
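
The core of the attachment procedure can be condensed as follows (a sketch; the seal/unseal operations stand in for a real TPM 2.0 stack, e.g., the tss-esapi crate, and the TLS pre-shared key handshake is abstracted into a plain comparison):

    use rand_core::{OsRng, RngCore};
    use subtle::ConstantTimeEq;

    /// Placeholder for the hardware-TPM seal/unseal operations ([10] §19.6.7).
    trait Seal {
        fn seal(&self, data: &[u8]) -> Vec<u8>;
        fn unseal(&self, blob: &[u8]) -> Option<Vec<u8>>; // fails on another platform
    }

    /// WELES side (steps 1-2 in Figure 4): generate the secret inside the
    /// TEE and return it together with the blob sealed to the hardware TPM;
    /// the hypervisor receives only the sealed blob and the TCP port.
    fn new_attachment_secret(hw_tpm: &dyn Seal) -> ([u8; 32], Vec<u8>) {
        let mut secret = [0u8; 32];
        OsRng.fill_bytes(&mut secret);
        let sealed = hw_tpm.seal(&secret);
        (secret, sealed)
    }

    /// Emulated-TPM side (steps 3-4 in Figure 4): accept the connecting peer
    /// as the legitimate hypervisor only if it presents the unsealed secret.
    fn is_legitimate_hypervisor(expected: &[u8], presented: &[u8]) -> bool {
        expected.ct_eq(presented).into() // constant-time comparison
    }

Only software running on the integrity-enforced host OS can unseal the blob, so a successful pre-shared key handshake simultaneously proves the peer's locality and its integrity.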

Figure 4: Attachment of an emulated TPM to a VM. WELES emulates TPMs inside the TEE. Each emulated TPM is accessible via a TLS connection. To prevent the MitM attack, WELES authenticates the connecting hypervisor by sharing with him a secret via the hardware TPM. To mitigate the rollback attack, the emulated TPM increments the monotonic counter value on each non-idempotent command.

To prevent the attachment of multiple VMs to the same emulated TPM, WELES permits a single client connection and does not permit reconnections. An attack in which an adversary redirects the hypervisor to a fake emulated TPM exporting a false secret is detected when establishing trust with the VM (§4.5).

4.5 Establish Trust

The tenant establishes trust with the VM in three steps. First, he verifies that WELES executes inside the TEE and runs on genuine hardware (a CPU providing the TEE functionality). He then extends trust to the hypervisor and VM by leveraging WELES to verify and enforce the host and guest OSes' runtime integrity. Finally, he connects to the VM, ensuring it is the VM provisioned and controlled by WELES.

Since the WELES design does not require tenants to possess any vendor-specific hardware and the existing TEE attestation protocols are not standardized, we propose to add an extra level of indirection. Following existing solutions [39], we rely on a trusted certificate authority (CA) that performs the TEE-specific attestation before signing an X.509 certificate confirming WELES's integrity and the underlying hardware genuineness. The tenant establishes trust with WELES during the TLS handshake, verifying that the presented certificate was issued to WELES by the CA.

Although the tenant remotely ensures that WELES is trusted, he has no guarantee that he connects to his VM controlled by WELES because the adversary can spoof the network [91] to redirect the tenant's connection to an arbitrary VM. To mitigate the threat, WELES generates a secret and shares it with the tenant and the VM. When the tenant establishes a connection, he uses the secret to authenticate the VM. Only the VM whose integrity conforms to the policy has access to this secret.

Figure 5: The high-level view of the attestation protocol. WELES generates an SSH public/private key pair inside the TEE. The tenant receives the public key as a result of the policy deployment. To mitigate MitM attacks, the tenant challenges the VM to prove it has access to the private key. WELES signs the challenge on behalf of the VM if and only if the platform integrity conforms with the policy.

Figure 5 shows a high-level view of the protocol. First, the tenant establishes a TLS connection with WELES to deploy the policy (①). WELES verifies the platform integrity against the policy (②), and once succeeded, it generates the SSH key pair (③). The public key is returned to the tenant (④) while the private key remains inside the TEE. WELES enforces that only a guest OS whose runtime integrity conforms to the policy can use the private key for signing. Second, the tenant initializes an SSH connection to the VM, expecting the VM to prove the possession of the SSH private key. The SSH client requests the SSH server running inside the VM to sign a challenge (⑤). The SSH server delegates the signing operation to WELES (⑥). WELES signs the challenge using the private key (⑦) if and only if the hypervisor's and VM's integrity match the policy. The SSH private key never leaves WELES; only a signature is returned to the SSH server (⑧). The SSH client verifies the signature using the SSH public key obtained by the tenant from WELES (⑨). The SSH server also authenticates the tenant, who proves his identity using his own private SSH key. The SSH server is configured to trust his SSH public key. The tenant has established trust in the remote platform as soon as the SSH handshake succeeds.
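
The signing gate (steps ⑥ and ⑦) reduces to a single conditional. The sketch below reuses the Policy type from the sketch in §4.2 and the ed25519-dalek crate as a stand-in for the actual SSH key type; the two conformance checks are placeholders for the DRTM and IMA verification described in §4.6:

    use ed25519_dalek::{Signature, Signer, SigningKey};

    struct VmContext {
        policy: Policy,       // deployed by the tenant (§4.2)
        ssh_key: SigningKey,  // generated inside the TEE, never exported
    }

    /// Signs the SSH challenge on behalf of the VM if and only if the
    /// host's and guest's current measurements conform to the policy.
    fn sign_challenge(ctx: &VmContext, challenge: &[u8]) -> Option<Signature> {
        if host_conforms(&ctx.policy) && guest_conforms(&ctx.policy) {
            Some(ctx.ssh_key.sign(challenge)) // only the signature leaves the TEE
        } else {
            None // the SSH handshake fails and the tenant aborts
        }
    }

    fn host_conforms(_p: &Policy) -> bool { unimplemented!() }  // DRTM PCRs vs. policy
    fn guest_conforms(_p: &Policy) -> bool { unimplemented!() } // IMA log vs. whitelist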

4.6 Policy Enforcement

The WELES policy enforcement mechanism guarantees that the VM runtime integrity conforms to the policy. At the host OS, WELES relies on the IMA integrity-enforcement [43] to prevent the host kernel from loading to the memory files that are not digitally signed. Specifically, each file in the filesystem has a digital signature stored inside its extended attribute. IMA verifies the signature issued by the cloud provider before the kernel loads the file to the memory. The certificate required for signature verification is loaded from the initramfs (measured by the DRTM) to the kernel's keyring. At the guest OS, IMA inside the guest kernel requires the WELES approval to load a file to the memory. The emulated TPM, controlled by WELES, returns a failure when IMA tries to extend it with a measurement not conforming to the policy. The failure instructs IMA not to load the file.

5 Implementation

5.1 Technology Stack

We decided to base the prototype implementation on the Linux kernel because it is an open-source project supporting a wide range of hardware and software technologies. It is commonly used in the cloud and, as such, can demonstrate the practicality of the proposed design. QEMU [17] and the kernel-based virtual machine (KVM) [55] permit using it as a hypervisor. We rely on Linux IMA [78] as an integrity enforcement and auditing mechanism built into the Linux kernel.

We chose Alpine Linux because it is designed for security and simplicity, in contrast to other Linux distributions. It consists of a minimum amount of software required to provide a fully-functional OS, which permits keeping the trusted computing base (TCB) low. All userspace binaries are compiled as position-independent executables with stack-smashing protection and relocation read-only memory corruption mitigation techniques. Those techniques help mitigate the consequences of, for example, buffer overflow attacks that might lead to privilege escalation or arbitrary code execution. To restrict the host from accessing guest memory and state, we follow existing security-oriented commercial solutions [28] that disable certain hypervisor features, such as hypervisor-initiated memory dump, huge memory pages on the host, memory swapping, memory ballooning through a virtio-balloon device, and crash dumps. For production implementations, we propose to rely on microkernels like the formally proved seL4 [56].

We rely on SGX as the TEE technology. The SGX remote attestation [48] allows us to verify if the application executes inside an enclave on a genuine Intel CPU. Other TEEs might be supported (§7). We implemented WELES in Rust [63], which preserves memory-safety and type-safety. To run WELES inside an SGX enclave, we use the SCONE framework [15] and its Rust cross-compiler. We also exploit the Intel trusted execution technology (TXT) [38] as a DRTM technology because it is widely available on Intel CPUs. We use the open-source software tboot [12] as a pre-kernel bootloader that establishes the DRTM with TXT to provide the measured boot of the Linux kernel.

Figure 6: The overview of the WELES prototype implementation.

5.2 Prototype Architecture

The WELES prototype architecture consists of three components executing in SGX enclaves: the monitoring service, the emulated TPM, and the monotonic counter service.

The monitoring service is the component that leverages Linux IMA and the hardware TPM to collect integrity measurements of the host OS. There is only one monitoring service running per host OS. It is available on a well-known port on which it exposes a TLS-protected REST API used by tenants to deploy the policy. We based this part of the implementation on [84], which provides a mechanism to detect the TPM locality. The monitoring service spawns emulated TPMs and intermediates in the secret exchange between QEMU and the emulated TPM. Specifically, it generates and seals to the hardware TPM the secret required to authenticate the QEMU process, and passes this secret to an emulated TPM.

The emulated TPM is a software-based TPM based on the libtpms library [18]. It exposes a TLS-based API allowing QEMU to connect. The connection is authenticated using the secret generated inside an SGX enclave and known only to processes that gained access to the hardware TPM. We extracted the emulated TPM into a separate component because of the libtpms implementation, which requires running each emulator in a separate process.

The monotonic counter service (MCS) provides access to a hardware monotonic counter (MC). Emulated TPMs use it to protect against rollback attacks. We designed the MCS as a separate module because we anticipate that, due to hardware MC limitations (i.e., high latency, the limited number of memory overwrites [82]), a distributed version, e.g., ROTE [62], might be required. Notably, the MCS might also be deployed locally using the SGX MC [22] accessible on the same platform where the monitoring service and emulated TPM run.

5.3 Monotonic Counter Service

We implemented the monotonic counter service (MCS) as a service executed inside an SGX enclave. It leverages TPM 2.0 high-endurance indices [10] to provide the MC functionality. The MCS relies on the TPM attestation to establish trust with the TPM chip offering the hardware MC, and on the encrypted and authenticated communication channel ([10] §19.6.7) to protect the integrity and confidentiality of the communication between the enclave and the TPM chip. The MCS exposes a REST API over TLS (§5.4), allowing other enclaves to increment and read hardware monotonic counters remotely.

The emulated TPM establishes trust with the MCS via the TLS-based SGX attestation (§5.4) and keeps the TLS connection open until the emulated TPM is shut down. We implemented the emulated TPM to increase the MC before executing any non-idempotent TPM command, e.g., extending PCRs, generating keys, writing to non-volatile memory. The MC value and the TLS credentials are persisted in the emulated TPM state, which is protected by the SGX during runtime and at rest via sealing. When the emulated TPM starts, it reads the MC value from the MCS and then checks the emulated TPM state freshness by verifying that its MC value equals the value read from the MCS.
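
The rollback protection described above boils down to two operations (a sketch; the MonotonicCounter trait is a placeholder for the MCS REST client, not an actual WELES interface):

    /// Placeholder client for the MCS REST API.
    trait MonotonicCounter {
        fn read(&self) -> u64;
        fn increment(&self) -> u64; // returns the new counter value
    }

    struct EmulatedTpmState {
        counter: u64, // persisted (sealed) together with the rest of the TPM state
        // ... PCR banks, NV memory, TLS credentials ...
    }

    /// At startup: reject a stale (rolled-back) state snapshot.
    fn check_freshness(
        state: &EmulatedTpmState,
        mcs: &dyn MonotonicCounter,
    ) -> Result<(), &'static str> {
        if state.counter == mcs.read() {
            Ok(())
        } else {
            Err("stale emulated-TPM state: possible rollback attack")
        }
    }

    /// Before any non-idempotent TPM command: bump the MC first, then
    /// apply the command and persist the new state.
    fn on_mutating_command(state: &mut EmulatedTpmState, mcs: &dyn MonotonicCounter) {
        state.counter = mcs.increment();
    }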

5.4 TLS-based SGX Attestation

We use the SCONE key management system (CAS) [5] to perform remote attestation of WELES components, verify SGX quotes using the Intel attestation service (IAS) [14], generate TLS credentials, and distribute the credentials and the CAS CA certificate to each component during initialization. WELES components are configured to establish mutual authentication over TLS, where both peers present a certificate, signed by the same CAS CA, containing an enclave integrity measurement. Tenants do not perform the SGX remote attestation to verify the monitoring service identity and integrity. Instead, they verify the certificate exposed by the remote peer during the policy deployment when establishing a TLS connection to the monitoring service. The production implementation might use Intel SGX-RA [57] to achieve similar functionality without relying on an external key management system.

5.5 VM Integrity Enforcement

The current Linux IMA implementation extends the integrity digest of the IMA log entry to all active TPM PCR banks. For example, when there are two active PCR banks (e.g., SHA-1 and SHA-256), both are extended with the same value. We made a minor modification of the Linux kernel. It permitted us to share with the emulated TPM not only the integrity digest but also the file's measurement and the file's signature. We modified the content of the PCR_Extend command sent by the Linux IMA in a way that it uses the SHA-1 bank to transfer the integrity digest, the SHA-256 bank to transfer the file's measurement digest, and the SHA-512 bank to transfer the file's signature. In the emulated TPM, we intercept the PCR_Extend command to extract the file's measurement and the file's digest. We use the obtained information to enforce the policy; if the file is not permitted to be executed, the emulated TPM process closes the TLS connection, causing the QEMU process to shut down the VM.
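
On the emulated-TPM side, the interception logic sketches as follows (hypothetical types, not the libtpms interface; the GuestPolicy type follows the sketch in §4.2, the signature check is a placeholder, and the exact combination of whitelist and certificate checks is an assumption about the policy semantics):

    /// Digests carried by one modified PCR_Extend command (§5.5).
    struct ExtendBanks<'a> {
        sha1_integrity: &'a [u8; 20],     // original IMA log-entry digest
        sha256_measurement: &'a [u8; 32], // digest of the file's content
        sha512_signature: &'a [u8; 64],   // file signature from the extended attribute
    }

    enum Verdict {
        Extend,          // measurement conforms: extend the PCR banks as usual
        CloseConnection, // violation: drop the TLS link, QEMU shuts the VM down
    }

    fn enforce(policy: &GuestPolicy, banks: &ExtendBanks) -> Verdict {
        let digest = hex::encode(banks.sha256_measurement);
        let whitelisted = policy.measurements.iter().any(|m| *m == digest);
        let signed = signature_valid(banks.sha512_signature, &policy.certificate);
        if !policy.enforcement || whitelisted || signed {
            Verdict::Extend
        } else {
            Verdict::CloseConnection
        }
    }

    fn signature_valid(_sig: &[u8; 64], _certs: &[String]) -> bool { unimplemented!() }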
Experiments that requests and measuring the mean latency of the MCS re- use an integrated TPM (iTPM) run on Intel NUC7i7BNH sponse. machine, which has the Intel platform trusted technol- ogy (PTT) [25] running on Intel ME 11.8.50.3425 pow- Table 1: The latency of main operations in the TPM-based ered by Intel Core i7-7567U CPU and 8 GiB of RAM. MCS. σ states for standard deviation. All machines have a 10 Gb Ethernet network interface Read Increase card (NIC) connected to a 20 Gb/s switched network. The SGX, TXT, TPM 2.0, Intel virtualization technology for discrete TPM 42 ms (σ = 2 ms) 40 ms (σ = 2 ms) directed I/O (VT-d) [24], and single root input/output vir- integrated TPM 25 ms (σ = 2 ms) 32 ms (σ = 1 ms)

Table 1 shows that the MCS using the iTPM performs from 1.25× to 1.68× faster than its version using the dTPM. The read operation on the iTPM is faster than the increment operation (25 ms versus 32 ms, respectively). Differently, on the dTPM both operations take a similar amount of time (about 40 ms).

What is the cost of running the TPM emulator inside the TEE and with the rollback protection mechanism? Is it slower than a hardware TPM used by the host OS? As a reference point to evaluate the emulated TPM's performance, we measured the latency of various TPM commands executed in different implementations of TPMs. The TPM quotes were generated with the elliptic curve digital signature algorithm (ECDSA) using the P-256 curve and a SHA-256 key. PCRs were extended using the SHA-256 algorithm. Figure 7 shows that, except for the PCR extend operation, the SGX-based TPM with rollback protection is from 1.2× to 69× faster than hardware TPMs and up to 6× slower than the unprotected software-based swTPM. Except for the create primary command, which derives a new key from the TPM seed, we did not observe performance degradation when running the TPM emulator inside an enclave. However, when running with the rollback protection, the TPM slows down the processing of non-idempotent commands (e.g., PCR_Extend) due to the additional time required to increase the MC.

Figure 7: TPM operations latency (create primary, create, quote, PCR read, PCR extend) depending on the TPM: the swTPM, the emulated TPM without rollback protection (WELES TPM), the emulated TPM with rollback protection (WELES TPM with MC), the discrete TPM chip (dTPM), and the integrated TPM (iTPM).

How much does IMA impact file opening times? Before the kernel executes software, it verifies if the executable, related configuration files, and required dynamic libraries can be loaded to the memory. The IMA calculates a cryptographic hash over each file (its entire content) and sends the hash to the TPM. We measure how much this process impacts the opening time of files depending on their size. Figure 8 shows that the IMA inside the guest OS incurs a higher overhead than the IMA inside the host OS. It is primarily caused by i) the higher latency of the TPM extend command (∼43 ms), which is dominated by a slow network-based monotonic counter, ii) the IMA mechanism itself, which has to calculate the cryptographic hash over the entire file even if only a small part of the file is actually read, and iii) the less efficient data storage used by the VM (virtualized storage, QCOW format). In both systems, the IMA takes less than 70 ms when loading files smaller than 1 MB (99% of files in the deployed prototype are smaller than 1 MB). Importantly, IMA measures the file only once unless it changes. Figure 8 shows that subsequent file reads take less than 40 µs regardless of the file size.

Figure 8: File opening times with and without Linux IMA (host OS without IMA, host OS with IMA, guest OS with IMA; first versus subsequent file opens, as a function of file size).

6.2 Macro-benchmarks

We run macro-benchmarks to measure the performance degradation when protecting popular applications, i.e., the nginx web server [4] and the memcached cache system [2], with WELES. For each application, we compare the performance of four variants: running on the host operating system (OS) (native), inside a SCONE-protected Docker container on the host OS (SCONE), inside a guest OS (VM), and inside a WELES-protected guest OS with rollback protection turned on (WELES). Notably, WELES operates under a weaker threat model than SCONE. We compare them to show the tradeoff between security and performance.

How much does WELES influence the throughput of a web server, e.g., nginx? We configured nginx to run a single worker thread with gzip compression and logging turned off, according to the available SCONE benchmark settings. Then, we used wrk2 [86], running on a different physical host, to simulate 16 clients (4 per physical core) concurrently fetching a pre-generated 10 KiB binary uncompressed file for 45 s.¹ We were increasing the frequency of the fetching until the response times started to degrade. Except for the reference measurement (native) run on the bare metal, nginx ran inside a VM with access to all available cores and 4 GB of memory.

¹The limited network bandwidth dictated the file size. For larger sizes we saturated the NIC bandwidth.

Figure 9 shows that WELES achieved 0.94× of the native throughput, reaching 70k requests. The SCONE variant reached about 31k requests, which is 0.45× of the WELES throughput. We observed low performance degradation incurred by the virtualization (less than 2%). The WELES overhead is caused mostly by the IMA.

Figure 9: Throughput/latency of the nginx web server.

Does WELES influence the throughput of systems that extensively use in-memory storage, i.e., memcached? We used memtier [59] to generate load by sending GET/SET requests (1:1 ratio) of 500 bytes of random data to a memcached instance running on a different physical host. We calculated the memcached performance by computing the mean throughput achieved by the experiment before the throughput started to degrade (latency < 2 ms). Except for the reference measurement (native) run on the bare metal, memcached ran in a VM with access to all available cores and 4 GB of memory. Figure 10 shows the WELES influence on the throughput-latency ratio of memcached. We observed low performance degradation when running memcached in a VM. WELES achieved 0.98× of the native throughput. It is a result of the IMA implementation, which measures the integrity of the memcached executables and configuration files during the memcached launch, but does not measure data directly written to the memory during runtime. The WELES throughput was 1.23× higher than that of memcached run inside SCONE.

Figure 10: Throughput/latency of memcached.

How does the measured boot increase the VM boot time? Table 2 shows how WELES impacts VM boot times. As a reference, we measure the boot time of a VM without any TPM attached. Then, we run experiments in which a VM has access to different implementations of software-based TPMs. Except for the reference measurement, the Linux IMA is always turned on. Each VM has access to all available cores and 4 GB of memory. As the guest OS, we run Ubuntu 18.10, a Linux distribution with a pre-installed tool (systemd-analyze) to calculate system boot times.

Table 2: The VM boot time when using different TPM implementations: no TPM, swTPM [6], emulated TPM inside an SGX enclave without rollback protection (WELES, no MC), and emulated TPM inside an SGX enclave with rollback protection (WELES, with MC). σ denotes the standard deviation; the MC, TPM, and IMA columns indicate which mechanisms are active.

                      MC    TPM   IMA   Boot time
    no TPM            no    no    no    9.7 s (σ = 0.1 s)
    swTPM             no    yes   yes   14.0 s (σ = 0.2 s)
    WELES, no MC      no    yes   yes   14.1 s (σ = 0.3 s)
    WELES, with MC    yes   yes   yes   50.8 s (σ = 0.4 s)
    WELES, fast MC    yes   yes   yes   15.8 s (estimate)

The measured boot increases the VM load time. It is caused by the IMA module that measures files required to initialize the OS. We did not observe any difference in boot time between the setup with the swTPM [6] and the WELES emulated TPM (WELES, no MC). However, when running the emulated TPM with the rollback protection (WELES, with MC), the VM boot time is 5.2× and 3.6× higher when compared to the reference and the swTPM setting, respectively. Alternative implementations of the MCS, such as ROTE [62], offer much faster MC increments (1-2 ms) than the presented TPM-based prototype. We estimated that using WELES with a fast MC would slow down the VM boot time only by 1.13×.

Does WELES incur performance degradation when multiple VMs are spawned? We ran memcached concurrently in several VMs with and without WELES to examine the WELES scalability. We then calculated the performance degradation between the variants with and without WELES. We do not compare the performance degradation between different numbers of VMs because it depends on the limited amount of shared network bandwidth. Each VM ran with one physical core and 1 GB of RAM. Figure 11 shows that when multiple VMs are concurrently running on the host OS, WELES achieves 0.96×-0.97× of the native throughput.

Figure 11: WELES scalability (throughput/latency of memcached running concurrently in 2, 3, and 4 VMs; native versus WELES).

6.3 Trusted Computing Base

As part of this evaluation, we estimate the TCB of software components by measuring the source code size (using source lines of code (SLOC) as a metric) used to build the software stack.

How much does WELES increase the TCB? We estimate the software stack TCB by calculating source code size metrics using CLOC [32], lib.rs [1], and the Linux stat utility. For the Linux kernel and QEMU, we calculated the code that is compiled with our custom configuration, i.e., the Linux kernel with support for IMA and KVM, and QEMU with support for gnutls, TPM, and KVM. To estimate the TCB of nginx and memcached executed with WELES, we added their source code sizes (including musl libc) to the source code size of the software stack required to run them (92.49 MB). Table 3 shows that the evaluated setup increases the TCB 13× and 32× for nginx and memcached, respectively. When compared to SCONE, the WELES prototype increases the TCB from 8× to 12×. SCONE offers not only stronger security guarantees (confidentiality of the tenant's code and data) but also requires trusting a smaller amount of software code. However, this comes at the cost of performance degradation (§6.2) and the necessity of reimplementing or at least recompiling a legacy application to link with the SCONE runtime.

Table 3: The TCB (source code size in MB) of nginx and memcached when running in SCONE and WELES. All variants include the musl libc library.

                 +musl libc   +SCONE        +WELES
    Nginx        7.3          12.3 (1.6×)   97.69 (13.4×)
    Memcached    2.9          7.9 (2.7×)    93.29 (32.2×)

7 Discussion

7.1 Alternative TEEs

The WELES design (§4) requires a TEE that offers a remote attestation protocol and provides confidentiality and integrity guarantees to the WELES components executing in the host OS. Therefore, the SGX used to build the WELES prototype (§5) might be replaced with other TEEs. In particular, the WELES implementation might leverage Sanctum [30], Keystone [60], Flicker [65], or L4Re [73] as an SGX replacement. WELES might also leverage ARM TrustZone by running WELES components in the secure world and exploiting the TPM attestation to prove its integrity.

7.2 Hardware-enforced VM Isolation

Hardware CPU extensions, such as AMD secure encrypted virtualization (SEV) [45], Intel multi-key total memory encryption (MKTME) [26], and Intel trust domain extensions (TDX) [27], are largely complementary to the WELES design. They might enrich the WELES design by providing confidentiality of the code and data against rogue operators with physical access to the machine, a compromised hypervisor, or malicious co-tenants. They also consider an untrusted hypervisor, excluding it from the WELES TCB. On the other hand, WELES complements these technologies by offering means to verify and enforce the runtime integrity of guest OSes, a functionality easily available for bare-metal machines (via a hardware TPM) but not for virtual machines.

13 available for bare-metal machines (via a hardware TPM) 8 Related work but not for virtual machines. VM attestation is a long-standing research objective. The existing approaches vary from VMs monitoring systems 7.3 Trusted Computing Base focusing on system behavior verification [42, 76, 70], in- trusion detection systems [49, 77], or verifying the in- The prototype builds on top of software commonly used tegrity of the executing software [35, 20]. WELES fo- in the cloud, which has a large TCB because it supports cuses on the VM runtime integrity attestation. Following different processor architectures and hardware. WELES Terra [35] architecture, WELES leverages VMs to provide might be combined with other TEE and hardware exten- isolated execution environments constrained with differ- sions, resulting in a lower TCB and stronger security guar- ent security requirements defined in a policy. Like Scal- antees. Specifically, WELES could be implemented on able Attestation [20], WELES uses software-based TPM to top of a microkernel architecture, such as formally ver- collect VM integrity measurements. WELES extends the ified seL4 [56, 71], that provides stronger isolation be- software-based TPM functionality by enforcing the policy tween processes and a much lower code base (less than and binding the attestation result with the VM connection, 10k SLOC [56]), when compared to the Linux kernel. as proposed by IVP [80]. Unlike the idea of linking the Comparing to the prototype, QEMU might be replaced remote attestation quote to the TLS certificate [37], WE- with Firecracker [13], a virtual machine monitor writ- LES relies on the TEE to restrict access to the private key ten in a type-safe programming language that consists of based on the attestation result. Following TrustVisor [64], 46k SLOC (0.16× of QEMU source code size) and is used WELES exposes trusted computing methods to legacy ap- in production by Amazon AWS cloud. The TCB of the plications by providing them with dedicated TPM func- prototype implementation might be reduced by removing tionalities emulated inside the hypervisor. Unlike oth- superfluous code and dependencies. For example, most of ers, WELES addresses the TPM cuckoo attack at the VM the TPM emulator functionalities could be removed fol- level by combining integrity enforcement with key man- lowing the approach of µTPM [64]. WELES API could agement and with the TEE-based remote attestation. Al- be built on top of the socket layer, allowing removal of ternative approaches to TPM virtualization exist [83, 36]. HTTP dependencies that constitute 41% of the prototype However, the cuckoo attack remains the main problem. implementation code. WELES enhances the vTPM design [19] mostly because of the simplicity; no need for hardware [83] or the TPM 7.4 Integrity Measurements Management specification [36] changes. Hardware solutions, such as SEV [45], IBM PEF [44], The policy composed of digests is sensitive to software TDX [27], emerged to isolate VMs from the untrusted hy- updates because newer software versions result in differ- pervisor and the cloud administrator. However, they lack ent measurement digests. Consequently, any software up- the VM runtime integrity attestation, a key feature pro- date of an integrity-enforced system would require a pol- vided by WELES.WELES is complementary to them. icy update, which is impractical. 
7.4 Integrity Measurements Management

The policy composed of digests is sensitive to software updates because newer software versions result in different measurement digests. Consequently, any software update of an integrity-enforced system would require a policy update, which is impractical. Instead, WELES supports dedicated update mirrors serving updates that contain digitally signed integrity measurements [68, 21]. Other measurements defined in the policy can be obtained from the national software reference library [3] or directly from the IMA-log read from a machine executed in a trusted environment, e.g., a development environment running on the tenant's premises. The number of runtime IMA measurements can be further reduced by taking process interactions into account to exclude some mutable files from the measurement [47, 79].
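For illustration, the following sketch derives policy entries from the IMA log exposed by a Linux machine booted with IMA enabled. The ima-ng line format (PCR, template hash, template name, file digest, path) is standard; the emitted policy format is our assumption.

```rust
// Sketch: derive policy digests from the IMA log of a machine running in a
// trusted environment (e.g., the tenant's development machine). Each `ima-ng`
// line has the form: <pcr> <template-hash> ima-ng <algo>:<file-digest> <path>.
// Reading the securityfs file requires root privileges.
use std::fs;

fn main() -> std::io::Result<()> {
    let log = fs::read_to_string(
        "/sys/kernel/security/ima/ascii_runtime_measurements")?;
    for line in log.lines() {
        let fields: Vec<&str> = line.split_whitespace().collect();
        // Only ima-ng entries carry a per-file digest usable as a policy entry.
        if fields.len() >= 5 && fields[2] == "ima-ng" {
            if let Some((algo, digest)) = fields[3].split_once(':') {
                // Emit "<path> <algo> <digest>" as one policy entry (our format).
                println!("{} {} {}", fields[4], algo, digest);
            }
        }
    }
    Ok(())
}
```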

14 tion. WELES provides transparent support for legacy ap- https://www.techspot.com/news/ plications, requires no changes in the VM configuration, 40280-google-fired-employees-for-breaching-user-privacy. and permits tenants to remotely attest to the platform html, accessed on March 2020. runtime integrity without possessing any vendor-specific hardware by binding the VM state to the SSH connec- [9] The Transport Layer Security Protocol Version https://tools.ietf.org/html/rfc5246 tion. WELES complements hardware-based trusted execu- 1.2. , tion environment (TEE), such as AMD SEV, IBM PEF, In- accessed on March 2020. tel TDX, by providing runtime integrity attestation of the [10] TPM Library Part 1: Architecture, Fam- guest OS. Finally, WELES incurs low performance over- ily "2.0", Level 00, Revision 01.38. ≤ head ( 6%) but has much larger trusted computing base http://www.trustedcomputinggroup.org/ (TCB) compared to pure TEE approaches, such as Intel resources/tpm_library_specification, software guard extensions (SGX). accessed on March 2020.

References

[1] Lib.rs: Find quality Rust libraries. https://lib.rs, accessed on March 2020.
[2] memcached. https://memcached.org, accessed on March 2020.
[3] National Software Reference Library (NSRL). https://www.nist.gov/itl/ssd/software-quality-group/national-software-reference-library-nsrl/about-nsrl/nsrl-introduction, accessed on March 2020.
[4] NGINX. https://www.nginx.com, accessed on March 2020.
[5] SCONE Configuration and Attestation Service. https://sconedocs.github.io/CASOverview/, accessed on March 2020.
[6] SWTPM - Software TPM Emulator. https://github.com/stefanberger/swtpm, accessed on March 2020.
[7] TCG Infrastructure Working Group Architecture Part II - Integrity Management, Specification Version 1.0, Revision 1.0. https://trustedcomputinggroup.org/wp-content/uploads/IWG_ArchitecturePartII_v1.0.pdf, accessed on March 2020.
[8] TechSpot News. Google fired employees for breaching user privacy. https://www.techspot.com/news/40280-google-fired-employees-for-breaching-user-privacy.html, accessed on March 2020.
[9] The Transport Layer Security (TLS) Protocol Version 1.2. https://tools.ietf.org/html/rfc5246, accessed on March 2020.
[10] TPM Library Part 1: Architecture, Family "2.0", Level 00, Revision 01.38. http://www.trustedcomputinggroup.org/resources/tpm_library_specification, accessed on March 2020.
[11] TPM Library Specification, Family "2.0", Level 00, Revision 01.38. http://www.trustedcomputinggroup.org/resources/tpm_library_specification, accessed on March 2020.
[12] Trusted Boot (tboot). https://sourceforge.net/projects/tboot/, accessed on March 2020.
[13] Amazon Web Services, Inc. Firecracker: secure and fast microVMs for serverless computing. http://firecracker-microvm.github.io, accessed on March 2020.
[14] Anati, I., Gueron, S., Johnson, S., and Scarlata, V. Innovative Technology for CPU Based Attestation and Sealing. In Proceedings of the 2nd International Workshop on Hardware and Architectural Support for Security and Privacy (2013), vol. 13 of HASP '13, ACM.
[15] Arnautov, S., Trach, B., Gregor, F., Knauth, T., Martin, A., Priebe, C., Lind, J., Muthukumaran, D., O'Keeffe, D., Stillwell, M., Goltzsche, D., Eyers, D., Kapitza, R., Pietzuch, P., and Fetzer, C. SCONE: Secure Linux Containers with Intel SGX. In Proceedings of OSDI (2016).
[16] Arthur, W., and Challener, D. A Practical Guide to TPM 2.0: Using the Trusted Platform Module in the New Age of Security. Apress, 2015.
[17] Bellard, F. QEMU, a Fast and Portable Dynamic Translator. In Proceedings of the Annual Conference on USENIX Annual Technical Conference (2005), ATEC '05.
[18] Berger, S. Libtpms: software emulation of a Trusted Platform Module. https://github.com/stefanberger/libtpms, accessed on March 2020.
[19] Berger, S., Caceres, R., Goldman, K. A., Perez, R., Sailer, R., and van Doorn, L. vTPM: virtualizing the trusted platform module. In Proc. 15th Conf. on USENIX Security Symposium (2006).
[20] Berger, S., Goldman, K., Pendarakis, D., Safford, D., Valdez, E., and Zohar, M. Scalable Attestation: A Step Toward Secure and Trusted Clouds. In 2015 IEEE International Conference on Cloud Engineering (2015), IEEE.
[21] Berger, S., Kayaalp, M., Pendarakis, D., and Zohar, M. File Signatures Needed! Linux Plumbers Conference (2016).
[22] Cen, S., and Zhang, B. Trusted Time and Monotonic Counters with Intel Software Guard Extensions Platform Services. Intel white paper, 2017.
[23] Intel Corporation. An Introduction to SR-IOV Technology, revision 2.5. In PCI-SIG SR-IOV Primer (2011).
[24] Intel Corporation. Intel Virtualization Technology for Directed I/O. In Architecture Specification - Rev 1.3 (2011).
[25] Intel Corporation. Strengthening Security with Intel Platform Trust Technology. Intel white paper, 2014.
[26] Intel Corporation. Memory Encryption Technologies Specification Rev: 1.2. Intel Architecture, 2019.
[27] Intel Corporation. Intel Trust Domain Extensions. Intel white paper, 2020.
[28] IBM Corporation. Introducing IBM Secure Execution for Linux. https://www.ibm.com/support/knowledgecenter/linuxonibm/com.ibm.linux.z.lxse/lxse_t_secureexecution.html, accessed on March 2020.
[29] Costan, V., and Devadas, S. Intel SGX Explained. IACR Cryptology ePrint Archive (2016).
[30] Costan, V., Lebedev, I., and Devadas, S. Sanctum: Minimal Hardware Extensions for Strong Software Isolation. In Proceedings of the 25th USENIX Conference on Security Symposium (2016), SEC'16.
[31] Cucurull, J., and Guasch, S. Virtual TPM for a secure cloud: fallacy or reality? RECSI 2014 (2014).
[32] Danial, A. cloc: Count Lines of Code. https://github.com/AlDanial/cloc, accessed on March 2020.
[33] Fink, R. A., Sherman, A. T., Mitchell, A. O., and Challener, D. C. Catching the Cuckoo: Verifying TPM Proximity Using a Quote Timing Side-Channel. In Trust and Trustworthy Computing. 2011.
[34] Gallery, E., and Mitchell, C. J. Trusted Computing: Security and Applications. Cryptologia 33, 3 (2009), 217-245.
[35] Garfinkel, T., Pfaff, B., Chow, J., Rosenblum, M., and Boneh, D. Terra: A Virtual Machine-based Platform for Trusted Computing. In Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (2003), SOSP '03.
[36] Goldman, K., and Berger, S. TPM Main Part 3 - IBM Commands. https://researcher.watson.ibm.com/researcher/files/us-kgoldman/mainP3IBMCommandsrev36.pdf, accessed on March 2020.
[37] Goldman, K., Perez, R., and Sailer, R. Linking Remote Attestation to Secure Tunnel Endpoints. In Proceedings of the First ACM Workshop on Scalable Trusted Computing (2006), STC '06.
[38] Greene, J. Intel Trusted Execution Technology: Hardware-based Technology for Enhancing Server Platform Security. Intel white paper, 2012.
[39] Gregor, F., Ozga, W., Vaucher, S., Pires, R., Le Quoc, D., Arnautov, S., Martin, A., Schiavoni, V., Felber, P., and Fetzer, C. Trust Management as a Service: Enabling Trusted Execution in the Face of Byzantine Stakeholders. In Proceedings of the 50th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN 2020) (2020).
[40] Trusted Computing Group. TCG Trusted Attestation Protocol (TAP) Information Model for TPM Families 1.2 and 2.0 and DICE Family 1.0. Version 1.0, Revision 0.36. https://trustedcomputinggroup.org/resource/tcg-tap-information-model/, accessed on March 2020.
[41] Göttel, C., Pires, R., Rocha, I., Vaucher, S., Felber, P., Pasin, M., and Schiavoni, V. Security, Performance and Energy Trade-offs of Hardware-assisted Memory Protection Mechanisms. 2018 IEEE 37th Symposium on Reliable Distributed Systems (SRDS) (2018).
[42] Haldar, V., Chandra, D., and Franz, M. Semantic Remote Attestation - A Virtual Machine directed approach to Trusted Computing. 3rd Virtual Machine Research and Technology Symposium, USENIX, 2004.
[43] Hallyn, S., Kasatkin, D., Safford, D., Sailer, R., and Zohar, M. Linux Integrity Measurement Architecture (IMA). https://sourceforge.net/p/linux-ima/wiki/Home/, accessed on March 2020.
[44] Hunt, G. D., Pai, R., Le, M. V., Jamjoom, H., Bhattiprolu, S., Boivie, R., Dufour, L., Frey, B., Kapur, M., Goldman, K. A., et al. Confidential computing for OpenPOWER. In Proceedings of the Sixteenth European Conference on Computer Systems (2021), pp. 294-310.
[45] AMD, Inc. AMD Secure Encrypted Virtualization API Version 0.22. Technical Preview (2019).
[46] GlobalPlatform, Inc. The trusted execution environment: Delivering enhanced security at a lower cost to the mobile market. GlobalPlatform white paper, 2011.
[47] Jaeger, T., Sailer, R., and Shankar, U. PRIMA: Policy-Reduced Integrity Measurement Architecture. In Proceedings of the Eleventh ACM Symposium on Access Control Models and Technologies (2006), SACMAT '06.
[48] Johnson, S., Scarlata, V., Rozas, C., Brickell, E., and McKeen, F. Intel Software Guard Extensions: EPID Provisioning and Attestation Services. Intel white paper, 2016.
[49] Joshi, A., King, S. T., Dunlap, G. W., and Chen, P. M. Detecting Past and Present Intrusions Through Vulnerability-specific Predicates. In Proceedings of the Twentieth ACM Symposium on Operating Systems Principles (2005), SOSP '05.
[50] Joyce, R. Disrupting Nation State Hackers. USENIX Enigma'16 (2016).
[51] Kaplan, D. Protecting VM Register State with SEV-ES. AMD white paper, 2017.
[52] Kaplan, D., Powell, J., and Woller, T. AMD Memory Encryption. AMD white paper, 2016.
[53] Kauer, B. OSLO: Improving the Security of Trusted Computing. In Proceedings of the 16th USENIX Security Symposium (2007), SS'07.
[54] King, S., and Chen, P. SubVirt: implementing malware with virtual machines. In 2006 IEEE Symposium on Security and Privacy (S&P'06) (2006).
[55] Kivity, A., Kamay, Y., and Laor, D. kvm: the Linux Virtual Machine Monitor. In Proceedings of the Linux Symposium, Volume One (2007).
[56] Klein, G., Norrish, M., Sewell, T., Tuch, H., Winwood, S., Elphinstone, K., Heiser, G., Andronick, J., Cock, D., Derrin, P., Elkaduwe, D., Engelhardt, K., and Kolanski, R. seL4: formal verification of an OS kernel. In Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles - SOSP '09 (Big Sky, Montana, USA, 2009).
[57] Knauth, T., Steiner, M., Chakrabarti, S., Lei, L., Xing, C., and Vij, M. Integrating Remote Attestation with Transport Layer Security. Intel white paper, 2018.
[58] Kocher, P., Horn, J., Fogh, A., Genkin, D., Gruss, D., Haas, W., Hamburg, M., Lipp, M., Mangard, S., Prescher, T., Schwarz, M., and Yarom, Y. Spectre Attacks: Exploiting Speculative Execution. In 40th IEEE Symposium on Security and Privacy (S&P'19) (2019).
[59] Redis Labs. NoSQL Redis and Memcache traffic generation and benchmarking tool. https://github.com/RedisLabs/memtier_benchmark, accessed on March 2020.
[60] Lee, D., Kohlbrenner, D., Shinde, S., Asanović, K., and Song, D. Keystone: An Open Framework for Architecting Trusted Execution Environments. In Proceedings of the Fifteenth European Conference on Computer Systems (2020), EuroSys '20.
[61] Arm Limited. Building a Secure System using TrustZone Technology. Arm white paper, accessed on March 2020.
[62] Matetic, S., Sommer, D., Ahmed, M., Gervais, A., Kostiainen, K., Dhar, A., Juels, A., and Capkun, S. ROTE: Rollback Protection for Trusted Execution. 26th USENIX Security Symposium (USENIX Security '17) (2017).
[63] Matsakis, N. D., and Klock, II, F. S. The Rust Language. In Proceedings of HILT (2014).
[64] McCune, J. M., Li, Y., Qu, N., Zhou, Z., Datta, A., Gligor, V., and Perrig, A. TrustVisor: Efficient TCB Reduction and Attestation. In 2010 IEEE Symposium on Security and Privacy (Oakland, CA, USA, 2010).
[65] McCune, J. M., Parno, B. J., Perrig, A., Reiter, M. K., and Isozaki, H. Flicker: An Execution Infrastructure for TCB Minimization. ACM Press (2008).
[66] OASIS. PKCS#11 specification. http://docs.oasis-open.org/pkcs11/pkcs11-base/v2.40/os/pkcs11-base-v2.40-os.html, accessed on March 2020.
[67] Oleksenko, O., Trach, B., Krahn, R., Martin, A., Fetzer, C., and Silberstein, M. Varys: Protecting SGX enclaves from practical side-channel attacks. In USENIX ATC (2018).
[68] Ozga, W., Le Quoc, D., and Fetzer, C. A practical approach for updating an integrity-enforced operating system. In 21st International Middleware Conference (Middleware '20), December 7-11, 2020, Delft, Netherlands. ACM, New York, NY, USA (2020).
[69] Parno, B. Bootstrapping Trust in a "Trusted" Platform. In Proceedings of the 3rd Conference on Hot Topics in Security (2008).
[70] Poritz, J., Schunter, M., Herreweghen, E. V., and Waidner, M. Property attestation - Scalable and privacy-friendly security assessment of peer computers. In IBM Research Report, Computer Science RZ3548 (2004).
[71] Potts, D., Bourquin, R., Andresen, L., Klein, G., and Heiser, G. Mathematically Verified Software Kernels: Raising the Bar for High Assurance Implementations.
[72] Protzenko, J., Parno, B., Fromherz, A., Hawblitzel, C., Polubelova, M., Bhargavan, K., Beurdouche, B., Choi, J., Delignat-Lavaud, A., Fournet, C., et al. EverCrypt: A fast, verified, cross-platform cryptographic provider. In 2020 IEEE Symposium on Security and Privacy (SP) (2020), IEEE, pp. 983-1002.
[73] Reitz, M. Isolating Program Execution on L4Re/Fiasco.OC. PhD thesis, TU Dresden, 2019.
[74] IBM X-Force Incident Response and Intelligence Services (IRIS). X-Force Threat Intelligence Index. In IBM Security report (2020).
[75] Rutkowska, J. Introducing Blue Pill. SyScan'06 (2006).
[76] Sadeghi, A.-R., and Stüble, C. Property-Based Attestation for Computing Platforms: Caring about Properties, Not Mechanisms. In Proceedings of the 2004 Workshop on New Security Paradigms (2004).
[77] Sailer, R., Valdez, E., Jaeger, T., Perez, R., van Doorn, L., Griffin, J. L., and Berger, S. sHype: Secure Hypervisor Approach to Trusted Virtualized Systems. In IBM Research Report RC23511 (2005).
[78] Sailer, R., Zhang, X., Jaeger, T., and van Doorn, L. Design and Implementation of a TCG-based Integrity Measurement Architecture. In Proceedings of the 13th USENIX Security Symposium. USENIX Association (2004).
[79] Sassu, R. Infoflow LSM. In Linux Security Summit '19 (2019).
[80] Schiffman, J., Vijayakumar, H., and Jaeger, T. Verifying System Integrity by Proxy. In Trust and Trustworthy Computing. Springer Berlin Heidelberg, 2012.
[81] Shin, J., Jacobs, B., Scott-Nash, M., Hammersley, J., Wiseman, M., Spiger, R., Wilkins, D., Findeisen, R., Challener, D., Desselle, D., Goodman, S., Simpson, G., Brannock, K., Nelson, A., Piwonka, M., Dailey, C., and Springfield, R. TCG D-RTM Architecture, Document Version 1.0.0. Trusted Computing Group (2013).
[82] Strackx, R., and Piessens, F. Ariadne: A Minimal Approach to State Continuity. 25th USENIX Security Symposium (USENIX Security '16) (2016).
[83] Stumpf, F., and Eckert, C. Enhancing Trusted Platform Modules with Hardware-Based Virtualization Techniques. In 2008 Second International Conference on Emerging Security Information, Systems and Technologies (2008), IEEE.
[84] Anonymous submission. Details omitted for double-blind reviewing.
[85] Suneja, S., Isci, C., de Lara, E., and Bala, V. Exploring VM Introspection: Techniques and Trade-offs. In Proceedings of the 11th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments - VEE '15 (Istanbul, Turkey, 2015).
[86] Tene, G., et al. wrk2 HTTP benchmarking tool. https://github.com/giltene/wrk2, accessed on March 2020.
[87] Winter, J., and Dietrich, K. A hijacker's guide to communication interfaces of the trusted platform module. Computers & Mathematics with Applications (2013).
[88] Yin, Z., Ma, X., Zheng, J., Zhou, Y., Bairavasundaram, L. N., and Pasupathy, S. An Empirical Study on Configuration Errors in Commercial and Open Source Systems. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (2011), SOSP '11.
[89] Ylonen, T., and Lonvick, C. The Secure Shell (SSH) Transport Layer Protocol. RFC 4253 (Proposed Standard), updated by RFC 6668. https://tools.ietf.org/html/rfc6668, 2006.
[90] Zeller, A., Gopinath, R., Boehme, M., Fraser, G., and Holler, C. Generating Software Tests, 2019.
[91] Zheng, O., Poon, J., and Beznosov, K. Application-based TCP hijacking. In Proceedings of the Second European Workshop on System Security - EUROSEC '09 (2009), ACM Press.
