Logical Partitions on Many-core Processors

Ramya Masti, Claudio Marforio, Kari Kostiainen, Claudio Soriente, Srdjan Capkun
ETH Zurich

ACSAC 2015

Infrastructure as a Service (IaaS)

[Figure: personal platforms, each an App/OS/Hardware stack, vs. a shared cloud platform (IaaS) running many VMs on common hardware]

Shared cloud platform (IaaS):
- Economies of scale
- Shared resources
- Security problems

Resource Partitioning

[Figure: two VMs, each with its own app and OS, running on dedicated partitions of the hardware]

Cloud platform with dedicated resources:
- Guaranteed performance
- Secure against shared memory/CPU attacks
- Example: IBM cloud

Resource Partitioning

Logical partition: a subset of a system's resources (e.g., memory regions and processor cores C0-C3) that can run an OS/VM.

Every logical partition needs dedicated resources

Many logical partitions need platforms that have lots of resources!

Many-core Processors for Logical Partitioning

[Figure: a many-core processor (many cores and caches), with a question mark over whether it can host many VMs]

- Many simple cores: many logical partitions
- But designed for workloads that share data (no isolation!)

Can many-core architectures support scalable logical partitioning?

Design Alternatives

Traditional: the VMs run on top of a single hypervisor (H) that spans all cores C0..Cn.
Distributed: every core runs its own hypervisor instance underneath its VM.
Centralized: the hypervisor runs on a few dedicated cores, while the VMs run directly on the remaining cores.

Run-time VM-hypervisor interaction:
- Increases hypervisor's attack surface
- Performance overhead

Centralized Today

Every core supports:
- Hypervisor and VM modes (higher / lower privilege)
- Privileged instructions and memory addressing
These enable memory confinement.

But every core runs in only one mode (either H or a VM), leaving unused virtualization functionality in every core.

This motivates solutions without processor virtualization.

Intel Single-chip Cloud Computer (SCC)

[Fig. 1. Block diagram and tile architecture: tiles, each with cores, caches, a messaging buffer and a network interface, connected by a network on chip (NoC) to the memory controllers]

Sources: Howard et al., IEEE JSSC, Jan. 2011; The Future of Many-core Computing, Tim Mattson, Intel Labs

[Fig. 2. Full-chip and tile micrograph and characteristics (from Howard et al.)]

Intel SCC Architecture

[Figure: a tile with its cores and a network interface holding the Look Up Tables, attached to the Network on Chip]

- Asymmetric processor
- Each core can run an independent OS
- Simple cores: no virtualization support
- Look Up Tables (LUTs): determine the resources available to a core
- Resources: other tiles, RAM and I/O

All system resources are memory mapped.

Intel SCC Address Translation

Translation happens in two stages (sketched in code below):

1. The core's Memory Management Unit maps a virtual address (e.g., 0x87256199) to a physical address (e.g., 0x72008127).
2. The Look Up Tables (LUTs) in the tile's network interface map the physical address to a system-wide address.

The system-wide address can target on-tile resources (LUT registers, on-tile memory) or off-tile resources (I/O, DRAM, memory at another tile).
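To make the second stage concrete, here is a minimal C sketch. The 256-entry LUT, the 16 MB segment size, and the field names are assumptions for illustration, not the exact SCC register layout.

```c
/* Minimal sketch of the LUT translation stage (illustrative only;
 * entry count, segment size and field names are assumptions). */
#include <stdint.h>

#define LUT_ENTRIES  256   /* assumed: top 8 bits of the 32-bit physical address */
#define SEGMENT_BITS 24    /* assumed: each LUT entry covers a 16 MB segment */

typedef struct {
    uint8_t  dest_tile;    /* target tile on the NoC (or memory controller / I/O) */
    uint8_t  subdest;      /* on-tile target: messaging buffer, LUT registers, ... */
    uint64_t base;         /* base of the segment in the system-wide address space */
} lut_entry_t;

typedef struct {
    uint8_t  dest_tile;
    uint8_t  subdest;
    uint64_t addr;
} system_address_t;

/* Stage 2: physical address (already produced by the core's MMU)
 * -> system-wide address, via the tile's LUT. */
static system_address_t lut_translate(const lut_entry_t lut[LUT_ENTRIES],
                                      uint32_t phys)
{
    const lut_entry_t *e = &lut[phys >> SEGMENT_BITS];
    uint32_t offset = phys & ((1u << SEGMENT_BITS) - 1);
    return (system_address_t){ e->dest_tile, e->subdest, e->base + offset };
}
```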

Problem: Lack of Isolation

[Figure: a tile whose cores reach off-tile memory (DRAM) over the Network on Chip through the Look Up Tables in the network interface]

Every core can change its own memory access configuration (illustrated below)!
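As a hedged illustration of why this matters (the addresses and register layout are assumed, not the real SCC map): because a core's own LUT entries are memory mapped, a malicious guest could overwrite one of them so that a segment of its address space points at another partition's DRAM.

```c
/* Illustrative attack sketch (assumed names and layout, not the exact
 * SCC register map): a core rewrites one of its own memory-mapped LUT
 * entries so that a segment now reaches a victim partition's memory. */
#include <stdint.h>

#define OWN_LUT_BASE        0xF9000000u  /* assumed: where this core's LUT is mapped */
#define VICTIM_DRAM_MAPPING 0x00000842u  /* assumed: encodes the victim's DRAM region */

static void remap_segment_to_victim(unsigned entry)
{
    volatile uint32_t *lut = (volatile uint32_t *)OWN_LUT_BASE;
    /* Subsequent accesses through this segment reach the victim's memory. */
    lut[entry] = VICTIM_DRAM_MAPPING;
}
```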

Solution Intuition

[Figure: two tiles on the Network on Chip; LUT reconfiguration attempts by ordinary cores are blocked (X), only the master core's are allowed]

The master core configures memory confinement for all cores (sketched below); ordinary cores can no longer change their own LUTs.
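A minimal sketch of that intuition, assuming the NoC (or its software emulation) can see which core issued a configuration write; the request layout and the master-core ID are illustrative assumptions, not the paper's exact interface.

```c
/* Sketch: LUT configuration writes are accepted only when they
 * originate from the master core. */
#include <stdbool.h>
#include <stdint.h>

#define MASTER_CORE_ID 0           /* assumed: core 0 acts as the master/hypervisor core */

typedef struct {
    uint32_t src_core;             /* core that issued the NoC write request */
    uint32_t lut_index;            /* LUT entry being (re)configured */
    uint64_t new_mapping;          /* requested mapping value */
} lut_write_req_t;

/* Returns true if the write may proceed; a real NoC filter would instead
 * drop the packet or raise a fault toward the offending core. */
static bool lut_write_allowed(const lut_write_req_t *req)
{
    return req->src_core == MASTER_CORE_ID;
}
```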

Contributions

1. Hardware change to Intel SCC that enables logical partitioning
   - Isolation on the Network on Chip
   - Emulated in software
2. Custom hypervisor
   - Small TCB and attack surface
3. Cloud architecture (IaaS)
   - Implementation and evaluation

Cloud Architecture (IaaS)

Computing node: Intel SCC running the hypervisor (H) and the VMs, a crypto engine (HSM), and HW-virtualized peripherals.

Management service: user management, storage, and VM management.

Operation example: VM startup

Participants: helper core, hypervisor (master core) and VM core on the Intel SCC computing node; storage service; crypto engine. The VM is encrypted and installed at the storage service beforehand; steps 3-6 are sketched in code below.

1. Boot
2. Start
3. Fetch VM
4. Decrypt VM
5. Assign core and configure LUTs
6. Start
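A hedged sketch of how these steps might look on the master core. All types and function names (fetch_encrypted_vm, crypto_engine_decrypt, configure_luts_for_partition, load_image, start_core) are hypothetical placeholders for the components on this slide, not the paper's actual interface.

```c
#include <stdint.h>

typedef struct { uint8_t *data; uint32_t len; } blob_t;

/* Hypothetical interfaces to the components on the slide. */
int  fetch_encrypted_vm(uint32_t vm_id, blob_t *out);             /* storage service */
int  crypto_engine_decrypt(const blob_t *in, blob_t *out);        /* crypto engine (HSM) */
void configure_luts_for_partition(uint32_t core, const blob_t *img);
void load_image(uint32_t core, const blob_t *img);
void start_core(uint32_t core);

/* Steps 3-6 of the VM-startup sequence, as they might run on the master core. */
int start_vm(uint32_t vm_id, uint32_t target_core)
{
    blob_t enc_vm, vm_image;

    if (fetch_encrypted_vm(vm_id, &enc_vm) != 0)            /* 3. Fetch VM */
        return -1;
    if (crypto_engine_decrypt(&enc_vm, &vm_image) != 0)     /* 4. Decrypt VM */
        return -1;

    configure_luts_for_partition(target_core, &vm_image);   /* 5. Assign core, configure LUTs */
    load_image(target_core, &vm_image);
    start_core(target_core);                                /* 6. Start */
    return 0;
}
```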

Recap of Main Properties

No processor virtualization
◦ Beneficial for high number of cores

Small TCB
◦ Hypervisor implementation size 3.4K LOC
◦ Tolerance to compromise of other cloud components

Reduced VM interaction
◦ Small attack surface
◦ No virtualization overhead

Comparison

Properties: No Processor Virtualization / Small TCB / Reduced VM Interaction

- Xen (>100K LOC): ✓ / ✗ / ✗
- HypeBIOS (4K LOC): ✗ / ✓ / ✗   }
- NoHype (>100K LOC): ✗ / ✗ / ✓  } Merge?
- Our solution (3.4K LOC): ✓ / ✓ / ✓

Summary

Can a many-core processor host a hypervisor (H) and many VMs as logical partitions? Yes! (with minor modifications)

Intel SCC case study:
- Isolation on the Network on Chip (NoC)

Thank you!

Ramya Masti, Claudio Marforio, Kari Kostiainen, Claudio Soriente, Srdjan Capkun
