November 2019

RESEARCH QUARTERLY, VOLUME 1:3
Bringing great research ideas into open source communities

Rolling your own processor: Ahmed Sanaullah builds an open source toolchain for an FPGA

Keylime: Securing the edge, one slice at a time
The Isolation of Time and Space: Partitioning Hypervisors
The post general-purpose CPU world is upon us


Table of Contents

From the director 02
NEWS
The post general-purpose CPU world is upon us 04
INTERVIEW
Interview with Leslie Hawthorn 07
FEATURE ARTICLES
Roll your own processor: Building an open source toolchain for an FPGA 10
Women in Tech: Plugging the Leak in the Pipeline 16
Keylime: Securing The Edge, One Slice at a Time 19
RIG Leader Perspective: UMass Lowell 23
The Isolation of Time and Space: Using a Partitioning Hypervisor to Host Virtual Machines 25
Research Project Updates 31

FROM THE DIRECTOR

I began paying attention to computers just around the same time that Intel's x86 architecture was starting the incredible journey predicted by "Moore's Law" and made possible by the power density principle known as Dennard scaling. Of course, the inaptly named "Law" is nothing of the sort, but the annual doubling of compute power it promised was all too real, or at least it was until fairly recently. We all learned to expect that the next chip generation would completely eclipse the current one in terms of capability and speed, and that it wouldn't be long in coming. In hindsight, what a strange thing to expect from engineering!

One of the unfortunate consequences of this predictable growth in performance was that it made systems architecture more or less uninteresting to an entire generation of engineers. What was the point of striving for incremental improvements in system performance when the next chip generation was going to come along in a year's time and render them irrelevant? Now, however, as Dennard scaling approaches its theoretical limit, things like FPGAs, custom ASICs, and other specialized accelerators are suddenly on everyone's mind, not to mention the systems architecture development, compiler innovations, and work required to enable them. In this issue, Ahmed Sanaullah's piece on the open source FPGA toolchain he has been involved in developing shows just how exciting this area is. Of course, when we have a working toolchain, we will need a more advanced way of partitioning systems to take advantage of it. Craig Einstein's thesis work on a very lightweight partitioning hypervisor, also in this issue, may provide just that.

About the Director: Hugh Brock is the Research Director for Red Hat, coordinating research and collaboration with universities, governments, and industry worldwide. A Red Hatter since 2002, Hugh brings intimate knowledge of the complex relationship between upstream projects and shippable products to the task of finding research to bring into the open source world.


Finding students who are interested in open source and helping them build that interest is an important part of our mission at Red Hat Research. I'm particularly happy therefore to be able to present our involvement with Tech Together, described in Sarah Coghlan's piece on women in tech. Through our sponsorship, we aim to help Tech Together, and events like it, make a real dent in gender imbalance in tech, and, particularly, in open source tech. Our discussion with community guru Leslie Hawthorn illustrates another path to the same end. By connecting open source with important real world use cases, we can attract a whole different group of people to open source than those who are just interested in tech for its own sake.

Finally, I'm really pleased to be able to share the graduation of a Red Hat Research project to a full-fledged Red Hat engineering effort. The Keylime project focuses on allowing system users to ensure for themselves that the systems they are using are running the software they claim to be. This is referred to as "attestation." Before now, the process of checking key parts of the software stack, like the BIOS and firmware, the bootloader, and the operating system kernel, was manual and tedious. With Keylime, a cloud user can attest the entire stack their code is running on, trusting only the TPM chip in the machine to correctly report cryptographic hashes of that stack's components. Keylime came out of an MIT Lincoln Labs development effort at the Mass Open Cloud, was adopted by a couple of folks in the Red Hat security group, and has now made the transition to a full product development effort. It is the kind of success story I hope will be commonplace as Red Hat Research grows, and we move more and more good ideas into open source.
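The trust chain behind this kind of attestation can be illustrated with a toy measured-boot model. This is only a sketch of the general TPM "extend" idea, not Keylime's actual interface, and the boot-chain contents below are placeholders:

```python
import hashlib

def extend(register: bytes, measurement: bytes) -> bytes:
    # TPM-style "extend": new value = H(old value || measurement),
    # so the register encodes the whole ordered history of measurements.
    return hashlib.sha256(register + measurement).digest()

def measure_stack(components: list[bytes]) -> bytes:
    # Replay the boot chain: hash each component, then fold it into
    # the running register, starting from an all-zero value.
    register = b"\x00" * 32
    for blob in components:
        register = extend(register, hashlib.sha256(blob).digest())
    return register

# Hypothetical boot chain (placeholders, not real firmware images).
stack = [b"firmware-v1", b"bootloader-v2", b"kernel-5.x"]
golden = measure_stack(stack)  # verifier's known-good value

# Attestation succeeds only if the reported register matches the
# golden value; changing any component changes the final digest.
assert measure_stack(stack) == golden
assert measure_stack([b"firmware-EVIL", b"bootloader-v2", b"kernel-5.x"]) != golden
```

Because each extend folds the previous register value into the next hash, a verifier that trusts only the final reported value can detect a change to any component anywhere in the chain.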

Red Hat Research Quarterly delivered to your digital or physical mailbox?

Yes! Subscribe at research.redhat.com/quarterly.


THE POST GENERAL-PURPOSE CPU WORLD IS UPON US Hardware and microarchitecture innovations for complex, data-intensive, distributed computing systems were some of the themes for the inaugural Red Hat Research Day in May. Another topic, as covered in the Red Hat Research Quarterly Volume 2, was data sovereignty in all its aspects, including privacy, governance, and security.

WHY SO LITTLE FOCUS ON HARDWARE?

For the past decade, the tech industry has mostly focused on software innovations rather than taking system hardware in new directions. To be sure, a lot of semiconductor and electrical engineering design work has gone into building out the massive scale-out x86 server farms that power the software. But any notable innovation on the hardware side pales by comparison to the variety of system and processor architectures of just a couple of decades ago.

What happened was that x86 mostly won. And, furthermore, it won in the form of largely standardized dual-processor socket rackmount servers. (Along with desktop and laptop clients.) A variety of intertwined economic and technical factors led to this outcome.

As covered by Boston University (BU) PhD candidate Han Dong in his Research Day talk, two of the most important forces that got us to where we are today are Moore's Law and Dennard scaling. Probably everyone in the tech industry has heard of Moore's Law, the observation that the number of minimum cost components in an integrated circuit doubles about every two years. This means that you could get twice the transistors for the same cost, which, historically, roughly translated into also getting about twice the performance.

Moore's Law persisted for a long time, in part because of another observation termed Dennard scaling. Dennard scaling states, more or less, that as transistors get smaller, their power density stays constant, which also means they won't run any hotter.

One important consequence of Moore's Law and Dennard scaling was that, if you needed a faster computer, you could just wait a couple of years and buy a new one that was twice as fast for no more money and no difference in size or power consumption. Oh, and it could run the same software.

To say that this state of affairs made life difficult for any entrepreneur thinking to launch a new exotic system design is a significant understatement. Issues


of software compatibility, time-to-market, or even just perceived risk made competing with the entrenched x86 architecture challenging.

WHAT COMES NEXT?

However, both Moore's Law and Dennard scaling are petering out. The industry must now consider alternative ways of gaining performance at the hardware level, the operating system level, or even a combination of the two.

We see early examples of this trend in the widespread use of graphics processing units (GPU) and specialty processors, such as Google's tensor processing units (TPU), to accelerate machine learning and other CPU-hungry workloads.

However, there's also a great deal of work in this vein that is still at a relatively early stage of research, which is one reason why work in this area is a great fit for the Red Hat Collaboratory with Boston University (BU). BU's Orran Krieger describes how, on the one hand, research systems are often considered "toy systems," and his vision with the Collaboratory is to "create projects where we can do things together to build innovative systems that can be used directly." From Red Hat's perspective, Uli Drepper, a Red Hat Distinguished Engineer who investigates future compute architectures, sees this as "a force multiplier. It's an opportunity to work on projects which might not be [in products] in the near future."

RESEARCH DAY TOPICS

Field Programmable Gate Arrays (FPGA) were one major topic at the Research Day on the hardware side. FPGAs are semiconductor devices that connect configurable logic blocks (CLBs) via programmable interconnects. FPGAs can usually be reprogrammed for different purposes after manufacturing. They are a more flexible alternative to custom-designed Application Specific Integrated Circuits (ASICs) and, with increasing performance, are now being used for an expanding set of workloads.

As part of the Research Day talks, BU's Martin Herbordt and Red Hatter Ahmed Sanaullah presented "FPGAs Everywhere in Large Scale Computer Systems." One use case in particular that Herbordt highlighted was lossy compression, which is important in certain high performance computing applications that can generate a petabyte of data per minute. The problem is that compressing this minute of data with general purpose architectures can take half a day. With FPGAs, Herbordt pointed out, you "can do it at streaming rate."

Other talks focused on various aspects of software. For instance, Red Hat's Larry Woodman and BU PhD candidate Ali Raza discussed Linux unikernels. Unikernels are single address space library operating systems. With a unikernel, a developer selects, from a modular stack, the minimal set of libraries which correspond to the OS constructs required for their application to run. An application compiled into a unikernel has only the required functionality of the kernel and nothing else. One longer-term goal for the use of unikernels is to provide a viable alternative to packages that, fully or in part, bypass or avoid the kernel and do most of the processing in user space.

Other presentations covered ongoing work in areas such as latency-sensitive workloads, single-threaded performance acceleration, and novel ways to allocate memory bandwidth.

THE NEED FOR NEW TYPES OF SPEED

When he was at DARPA, Robert Colwell pointed out that from 1980 to 2010, clocks improved 3500X and micro-architectural and other improvements contributed about another 50X performance boost. The process shrink marvel expressed by Moore's Law overshadowed just about everything else. That performance knob has largely run its course, meaning that we can no longer take the broad outlines


of today's hardware and software landscape as a given.

The bad news is that Moore's Law played a huge part in enabling today's technology landscape. Achieving a comparable pace of improvement in different ways won't be easy and may not even be possible. But the flip side is that many potential innovations that weren't of broad interest in an industry dominated by Moore's Law are now in play. There exists an enormous opportunity to research and develop these innovations, which will require different disciplines and different types of organizations to work together.

To see highlights from Research Day at Red Hat Summit 2019, go to https://www.youtube.com/watch?v=h6YOA9agi5U

Chris Wright's talk on hardware innovation: https://www.youtube.com/watch?v=9sZCC73PfSo

Orran Krieger and Uli Drepper on the relationship between operating systems and hardware innovation: https://www.youtube.com/watch?v=vqybu_Q_ebA

AUTHOR
— Gordon Haff, Technology evangelist, Red Hat



THE HUMAN FACTOR IN OPEN SOURCE

LESLIE HAWTHORN
Senior Principal Technical Program Manager, Open Source Programs Office, Office of the CTO, Red Hat

Red Hat Research Quarterly had a chance to interview Leslie Hawthorn, one of the founding members of Grace Hopper Open Source Day, about motivation and experiences. Open source software had been literally all around her–all she had to do was ask and discover.

As an internationally known developer relations strategist and community management expert, Leslie Hawthorn has spent the past decade creating, cultivating, and enabling open source communities across universities, enterprises, and non-profits. She's best known for creating Google Code-in, the first global initiative to involve pre-university students in open source software development.

RHRQ: You followed a non-engineering path to open source, first discovering it when playing mp3 files from a GNOME desktop and then discovering the usefulness of the Mozilla Firefox browser from engineers at Google. It's fair to say that the gateway to open source software was a business productivity tool, but what was the motivation behind your lifelong engagement?

Leslie Hawthorn: I've had the opportunity to not only do work focused on open source from a business perspective, but also had the opportunity to focus on open source from more of a public good perspective, just because of the privileges that my work allowed me.

Part of the work on Google Summer of Code was to do outreach to universities, to help them understand that this was an opportunity provided by Google to help their students become competent at open source software development. Unlike typical programming experiences, it gave people all sorts of skills that were required for the job market: the ability to work well remotely, to communicate effectively in writing, to work in a distributed team etc., in addition to learning open source version control tools and resources, like GitHub.

At Google, I was able to act as an advisor to the Humanitarian FOSS Project, which was started by several colleges on the east coast of the U.S. The goal of the Humanitarian FOSS Project was to bring more computer science students into the world of free and open source software development, but doing so with a humanitarian focus. Significant amounts of research had shown that women, or other underrepresented groups in the technology space, were much more driven to working in STEM disciplines and, in particular in computer science, when their work was made relevant to them on a social basis. A person who doesn't necessarily see themselves as a computer scientist becomes much more interested in computer science as a discipline when they realize this gives them the opportunity, for example, to develop applications used for healthcare in the developing world, which is where their true passion lies. They want to help society and computer science becomes the vehicle to do so.


RHRQ: It's interesting to see how your entry into open source was through human connections. Now you're suggesting that combining tech with social good is a really good way to bring more people into the computer science fold. What do you make of that parallel?

Leslie Hawthorn: When I think about my journey to becoming a technologist, I think it's a little bit strange. I was very much a fan of the humanities and very excited about understanding the processes by which human beings communicate effectively with one another. I applied that knowledge to my work in the tech industry and how folks work together in open source software communities. And, while I get really excited about new software, it's less about features than what it's bringing from a social change perspective, like ad-blocking software for example. I think about things like smart cities, or what does the world look like when we have IoT and edge applications that are really secure, to protect consumer and business data?

I, along with some other folks who are very passionate about the idea of the importance of open source software being part of that push to have women be more involved in technical disciplines, created something called Grace Hopper Open Source Day. It started as a three-hour long Code-athon for Humanity in 2011, and it was a group of us just getting together to help women get started working on open source projects. Now there's an entire conference track focused on open source software.

One of the outcomes has been to get more academics talking about their work in the open source software space, and really shining a light on how their use of open source software is able to meet all kinds of requirements around data reusability, code reusability, replicability of experiments. All of this is extremely important for projects that receive grants, e.g. from the National Science Foundation (NSF). It turns out doing things the open source way just happens to satisfy the NSF project architecture requirements. That way, you set your project up in an open and transparent way from the get-go.

I've worked with various professors on introducing open source into their curricula. For example, the Rochester Institute of Technology became the first university in the United States to offer a minor in open source software. That's a huge way to move the needle forward.

RHRQ: What are some of the barriers to the adoption of open source software and methods by universities?

Leslie Hawthorn: I think one of the places where there is sometimes a barrier to adoption of open source


software is in universities where open source software and its value is not necessarily well understood. If you look at Red Hat's research relationship with universities, Red Hat does not require assignment of intellectual property when it funds research. That is left to the schools themselves. And they are able to productize the research that comes out of those grants in whatever way they wish to, be that through the university's office of technology transfer, or companies founded, etc. But if you look at the way that most universities are incentivized to spend grant funding, if it's coming in from a private enterprise, there are hopes of developing intellectual property that the enterprise can then commercialize.

For organizations that have a very strong technology transfer office, those technology transfer offices are looking for opportunities to be able to produce a patent, or some other type of constrained intellectual property rights, that can then be monetized, and continue to drive revenue for the university after the grant funding has concluded. This is, of course, necessary since the research needs to continue even after the grant has run out.

RHRQ: Do you see these economic pressures as negatively impacting the growth of open source software?

Leslie Hawthorn: We have a generation lacking in opportunities for economic mobility. I think what we're going to see is that fewer people are motivated to participate in open source software from pure delight over the cool experiment that they're doing and more of them are going to be doing it from an economic motivation. But they still want career fulfillment and they want to do something that they think contributes to making the world a better place as a whole.

Maybe it's working on medical record systems for the developing world or maybe it's making sure that communities doing subsistence agriculture are able to get better economic value from their agricultural transactions, because they have access to open source software applications.

There are all kinds of ways in which our work can contribute to the bottom line and social good. I think that we will see more people invested in working in technology and with open source software when we return to our roots, where it's not just this code is available, and everyone can do with it as they wish, but it's also that there was a sense of social responsibility from the creator to the audience, and then the audience to everyone else as well.

–RH


ROLL YOUR OWN PROCESSOR: BUILDING AN OPEN SOURCE TOOLCHAIN FOR AN FPGA

Field Programmable Gate Arrays (FPGAs) are rapidly becoming first-class citizens in the datacenter, instead of niche components. This is because FPGAs i) can achieve high throughput and low latency by implementing a specialized architecture that eliminates a number of bottlenecks and overheads of general purpose computing, ii) consume little power and, by extension, have a high power-performance ratio, iii) have high-speed interconnects and can tightly couple computation with communication to mask the latency of data movement, and iv) can be configured so that each design is tuned for individual use cases.

Figure 1 illustrates the different configurations in which FPGAs are being deployed in a datacenter. Bump-in-the-Wire (BitW) FPGAs process all traffic between a server and a switch to perform application and system function acceleration. Coprocessor FPGAs provide a traditional accelerator configuration, like GPUs, with an optional back-end secondary network for direct connectivity between accelerators. Storage-attached FPGAs process data locally on storage servers to avoid memory copies to compute servers. Stand-alone FPGAs provide a pool of reconfigurable accelerators that can be programmed and interfaced with directly over the network. Smart network interface controllers (NICs) contain embedded FPGAs which perform custom packet processing alongside a NIC ASIC (application-specific integrated circuit). Finally, network switches can also contain embedded FPGAs that process data as the data moves through the datacenter network (e.g., in collective operations such as broadcast and all-reduce).

Figure 1: Different configurations for deploying FPGAs in Data Centers

RESEARCH PROBLEM

FPGAs have traditionally lacked the clean, coherent, compatible, and consistent support for code generation and deployment that is typically available for traditional central processing units (CPUs). For the most part, previous efforts to address this have been ad hoc and limited in scope. Those who try to address this always do something special due to poor tooling, and the tooling that does exist is insufficient, especially for datacenter and high-performance computing (HPC) applications.

This is because of the heavy reliance on proprietary, vendor-specific tools for core operations. These tools can change frequently and


significantly, which means that even if we wanted to be ad hoc, for the most part we couldn't be without investing a lot of work which could be wiped out with a new generation of FPGAs. Moreover, these tools are not necessarily aimed at providing the most efficient solution, since they limit the "flexibility" offered to users. For example, these tools i) do not allow modifications to the algorithms for core operations (such as logic optimization and place & route), ii) hide details of (and access to) the underlying device architecture, which prevents implementation of important functions (such as logic relocation without recompilation), and iii) are designed to be generic, with limited opportunities for customization (such as vendor IP blocks).

This is not a problem unique to FPGAs, however. Similar issues already exist for software. Therefore, similar to free software in the software world, we must be able to code and deploy custom architectures using transparent, open, end-to-end frameworks that are i) not tied to any vendor, that is, do not use IP blocks or tools that are only compatible with the FPGA boards of a particular vendor, ii) provide opportunities for customization across all levels of the development stack, and iii) can be upstreamed, that is, can be easily and reliably integrated into downstream projects in order to build more complex and intricate systems.

HARDWARE AS A RECONFIGURABLE, ELASTIC, AND SPECIALIZED SERVICE

We refer to our framework for providing upstream support for datacenter FPGAs as "Hardware as a RecoNfigurable, Elastic and Specialized Service" (HaaRNESS). HaaRNESS is built as a high-level synthesis (HLS) tool, which creates and deploys high-quality hardware from algorithms expressed in high-level languages (HLL) such as OpenMP or OpenCL. Developers only specify the algorithm, with minimal use of pragmas and low-level constructs, and hence require virtually no prior expertise in hardware development; this prior expertise is both in terms of hardware-specific languages (e.g. HDL) and the code structures (in both HDL and HLL) needed to effectively map design patterns to hardware. A preprocessor transforms this simple HLL code into an FPGA-centric HLL code (HLL*) which removes hardware optimization blockers and helps infer opportunities for parallelism. Then, an HLS compiler converts this HLL* code into HDL (hardware assembly language). The resulting HDL is run through a system generator which can perform one of two operations: i) cycle accurate simulation, or ii) deployment of application logic on the physical FPGA system. In the case of the latter, a bitstream compiler maps the HDL code onto the FPGA fabric using Synthesis and Place & Route. Then, a software runtime is used to program the application onto the board and interface with it. Finally, similar to the OS on CPUs, a hardware operating system (OS) is provisioned on the FPGA in order to share the FPGA fabric amongst multiple independent entities.

HLS CODE PREPROCESSOR

Current HLS tools can require developers to explicitly identify opportunities (and constraints) for parallelism, as well as manually implement a number of important design features such as caches, loop coalescing, function inlining, floating point accumulators, and data hazard elimination. This substantially increases the complexity of the HLS code that developers need to provide. Our HLS code preprocessor reduces this complexity by automatically identifying optimization blockers in an HLS compiler through compiler instrumentation, and then addressing them using a series of code transformations. Optimization blockers occur when the compiler is not allowed to infer an optimization. An optimizing transform may be blocked if


it: i) modifies code functionality, instead of structure only, ii) can result in a failure to compile, iii) is based on information available at run-time, iv) requires a global view of the computation, and/or v) is based on implicit code behavior that may be visible to the developer, but cannot be reliably extracted by the compiler.

Figure 2 illustrates our approach. To identify optimization blockers, we first built a logical model for FPGAs by identifying a set of non-trivial core design patterns that an HLS compiler should be able to infer and implement efficiently in order to achieve high quality code generation. Examples of these design patterns include single instruction, multiple data (SIMD), pipelining, caching, logic inlining, and loop structures.

Figure 2: Framework for automatically removing optimization blockers using compiler instrumentation

Then, we instrumented the HLS compiler (OpenCL in our proof of concept) to determine what it has inferred given an input code. This requires analyzing the IR at compile time (static profiler) after all optimizer passes have been run (i.e. on the output of the front-end HLS compiler).

We then built a set of probes which contain individual design patterns in relative isolation, so that we can determine compiler effectiveness for each. By running these probes through the compiler and looking at the instrumentation report, we can tell which optimizations are blocked. This process is done once for a given version of an HLS compiler. Along with the set of probes, we also provide a set of HLL-to-HLL code transforms that can remove the optimization blocker for each probe. Examples of these transforms include loop unrolling for SIMD and generating register caches for read-after-write hazards on on-chip memories. These transforms are only applied if an optimization is blocked. Finally, this set of code transforms and the probe report is fed into the preprocessor.

ADVANCING HLS COMPILERS

Current HLS compilers have two major drawbacks. First, since existing HLS tools map code fragments to vendor IP blocks in order to generate HDL from HLL code, a large library of such blocks is typically needed. Such libraries consume a large amount of CPU memory, have high-overhead, non-trivial lookup operations, and provide limited opportunities for optimization since they are proprietary. Only a limited set of parameters can be modified, and that, too, within predefined bounds.

Second, it is also likely that a significant fraction of the code that is translated to IP blocks is not needed for the HDL to execute. Application logic can have a spectrum of performance requirements for different components, not all of which require execution on custom hardware, e.g. control plane


versus data plane.

Our goal is to advance HLS compilers by addressing the above two drawbacks. The first enhancement is to reduce the size of code sequences being translated to hardware, and to perform this translation using only basic, vendor-agnostic, and transparent hardware building blocks like registers and gates. This enables faster compilation times and allows the design to be tuned for each individual application.

The second, perhaps more critical, improvement is to identify, at compile time, the best approach for implementing the algorithm. For the code generation itself, we have three different pieces: i) the part which must always be executed on the host and cannot be on the FPGA, e.g. due to I/O, ii) the part which is translated into softcores on the FPGA, and iii) the rest of the code, which is translated into HDL. These parts can be either inferred automatically by the compiler (directly or through profiling executables), or marked up using OpenMP primitives. The split of (ii) and (iii), in particular, is important, since functions implemented using softcores consume negligible resources (logic/memory/DSP blocks) and can achieve asynchronous operation with respect to HDL. Moreover, for part (ii), we eliminate Place & Route (an hours- or days-long process) and achieve CPU-only, software-like turnaround times, because the HDL does not change. If the HDL itself is as small as possible for computation kernels and/or is relocatable to other parts of the FPGA fabric, the need to run Place & Route is further reduced.

SYSTEM GENERATOR: CYCLE ACCURATE SIMULATION

Another major component of our research is building a cycle accurate simulation framework. The framework can estimate performance directly from HDL code without compiling to actual hardware; such rapid and reliable design space exploration substantially reduces turnaround times for building high quality hardware. While this feature, called RTL simulation, is certainly not novel, we provide significantly more control over what can be evaluated and how.

Using our framework, developers have the flexibility of testing both the application logic and its interaction with the world around it. The latter involves testing application logic after connecting it to intra-FPGA wrappers, operating systems, external devices, etc. This is important since testing the application logic only, without modelling the environment around it, can result in developers converging on a design that gives worse performance than naive code when actually implemented on an FPGA. Worse, a design is likely to fail to execute altogether if deadlocks were not


properly identified beforehand.

SYSTEM GENERATOR: DEPLOYMENT

Bitstream compiler

The mapping of HDL to the FPGA fabric is typically done exclusively by FPGA vendors. This is because it requires knowledge of low-level details of the underlying FPGA hardware, which vendors typically do not disclose publicly in order to protect intellectual property. Lack of these low-level hardware details means that we cannot determine how designs map to FPGAs, and thus guarantees of security and performance cannot be reliably provided.

Our research focuses on inferring low-level hardware by reverse engineering FPGA bitstreams. The goal here is to obtain key insights into the compilation processes, which we can then use to build an open bitstream compiler. This allows us to both reduce the limitations of proprietary bitstream compilers, as well as implement important features that are currently not supported, e.g. FPGA fabric attestation.

Software runtime

With regards to software runtime, our research is primarily focused on building vendor-agnostic tools, such as drivers and runtime libraries. Similar to how the Linux kernel is built, our goal is to separate the software stack for FPGA tools into architecture/configuration dependent and independent components. This will enable us to maintain a uniform and reusable structure for software runtime across all types of FPGA configurations and boards in the datacenter. It will also reduce the complexity of adding and removing features, since well-defined APIs will ensure that changes are compatible with existing code. Having these APIs map well to a broad set of vendors is possible since FPGAs talk to host machines over standard buses, such as PCIe (peripheral component interconnect express). These standard buses, and the associated protocols, constrain the behaviour of both the software (drivers) and hardware (PCIe logic blocks) built around the buses in a similar manner for all FPGAs. Any subtle differences, such as vendor or device IDs, can be supplied as compile/load/run-time values. As a result, it is possible to implement reliable and effective uniformity, at least for the lower levels of the hardware and software stacks, and expose a consistent interface to applications on the host and device.

Hardware OS

Hardware operating systems are effectively any logic on the FPGA that is not part of the application. They are responsible for partitioning the device fabric between multiple entities, data flow management and interfaces, and hardware modifications. They also manage the flow of data between different components in the FPGA by defining a number of specifications such as APIs, protocols, bus widths, clock domains, FIFO depths, etc. Figure 3 shows our hardware operating system for Bump-in-the-Wire FPGAs, called Morpheus. Morpheus supports the sharing of the FPGA fabric between developers and the system administrator. Administrator functionality offloads are particularly useful for accelerating a large number of critical workloads such as encryption, SDN, Key Value Store operations, etc.

Since APIs are well defined, Morpheus can be easily modified to support other deployment configurations of FPGAs in the datacenter. This is critical to ensuring compatibility across the stack, enabling portability across FPGAs, and reducing developer effort in integrating their designs into the hardware OS.

CONCLUDING REMARK

We expect to have a toolchain, Morpheus, and compiler extensions to target the FPGA in the near future. If you are interested in collaborating, please feel free to contact us and we would be happy to discuss the research with you: [email protected].

AHMED SANAULLAH is a software engineer for the Red Hat Office of the CTO, working on all things FPGA. This includes a number of projects across the hardware and software stacks, such as compilers, drivers and hardware operating systems. He has a PhD in Computer Engineering from Boston University, an MSc in Electrical and Electronic Engineering from The University of Nottingham, and a BS in Electrical Engineering from Lahore University of Management Sciences.

AUTHOR
— Ahmed Sanaullah, PhD
Software engineer, Office of the CTO

Figure 3: Design of our prototype Hardware OS for Bump-in-the-Wire FPGAs
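The three-way host/softcore/HDL split described in the compiler section can be illustrated with a toy partitioner. This is a minimal sketch only: the function attributes and the softcore size threshold are invented for illustration and are not the project's actual heuristics.

```python
# Toy sketch of the three-way code split: host / softcore / HDL.
# The attributes and the 64-op softcore threshold are illustrative
# assumptions, not the real compiler's analysis.

def classify(func):
    """Assign a function to 'host', 'softcore', or 'hdl'."""
    if func["does_io"]:          # e.g. file or network access must stay on the host
        return "host"
    if func["op_count"] <= 64:   # small helpers -> softcores, avoiding Place & Route reruns
        return "softcore"
    return "hdl"                 # large compute kernels -> custom hardware

def partition(funcs):
    """Group a list of annotated functions into the three targets."""
    plan = {"host": [], "softcore": [], "hdl": []}
    for f in funcs:
        plan[classify(f)].append(f["name"])
    return plan

if __name__ == "__main__":
    funcs = [
        {"name": "read_frames", "does_io": True,  "op_count": 30},
        {"name": "checksum",    "does_io": False, "op_count": 12},
        {"name": "convolve",    "does_io": False, "op_count": 4096},
    ]
    print(partition(funcs))
```

In the real toolchain this decision is made by the compiler (directly or via profiling) or by OpenMP-style markup, but the shape of the output, a plan assigning each function to one of the three targets, is the same.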


WOMEN IN TECH: PLUGGING A LEAK IN THE PIPELINE

It's hardly news that there is an enduring gender gap in tech. According to the United States National Center for Women & Information Technology, only 26 percent of the U.S. computing workforce is female. Women are chronically underrepresented in the U.S. tech sector, and this lack of diversity is not a recent phenomenon.

Although the problem is complex and finding solutions is daunting, it's entirely possible to bring more girls into tech and support them at all points along the pipeline. Women-led organizations like Girls Who Code and littleBits were founded with this mission in mind: to get more girls interested in technology in the first place, and then support them throughout their journey so they stay interested, persist, and succeed. Industry can play a powerful role in supporting women even further as they enter the workforce.

This year, Red Hat is a top-tier sponsor for TechTogether Boston, Boston's largest all-female, femme, and non-binary hackathon, to be hosted at Boston University (BU). TechTogether Boston's mission is to help gender-marginalized groups thrive and be more successful, confident, and prepared for careers in the tech field. Red Hat attended last year and hired 10 strong interns from the program. This year, we're aiming to double the number of hires drawn from the talent pool participating in this event.

"In high school and college, women fall out of the tech pipeline because there aren't the resources there to support them in their tech-related classes, clubs, internships, and even hackathons," says Grace Yeung, Director of Marketing at TechTogether Boston. "By being a top sponsor for TechTogether Boston, Red Hat is directly helping to plug a leak in the tech pipeline."

In 2018, only 20% of hackathon participants identified as women. TechTogether Boston is directly addressing this by creating an environment where traditionally gender-marginalized people who aspire to work in technical fields can work on projects, explore their interests on a deeper level, connect with other women and non-binary people, and build lifelong connections.

"With the support of sponsors like Red Hat, we are able to host our event free of charge to all our hackers, thereby eliminating the financial barrier that is present in trying to break into tech," Yeung says. "With their on-the-ground presence at our event, Red Hat allows women and non-binary individuals to see that technology companies are willing to invest in them and hire them as well."


Red Hat is planning to provide the best experience for these hackers by offering participants the opportunity to network, attend a variety of workshops, interview for internships, enjoy fun activities, and make lasting friendships along the way. TechTogether not only supports the development of new talent, but also celebrates strong female role models in the local technology industry. Last year, Red Hatters Oindrilla Chatterjee and Hema Veeradhi each led workshops at the event. Data scientist Chatterjee presented "Understanding Text and the Underlying Sentiment," while software engineer Veeradhi covered "Machine Learning Flow on OpenShift." Chatterjee and Veeradhi are both recent graduates of Boston University.

Efforts to reduce the gender gap reach beyond Boston. This past summer, Red Hat continued its sponsorship for the University of Massachusetts (UMass) Lowell's RAMP program. RAMP, which stands for Research, Academics and Mentoring Pathways, is a six-week program for twenty first-year female engineering students. RAMP is designed by the faculty based on previous experiences mentoring female students. The guiding principle behind this program is that when women entering engineering don't stay the course, other young women then feel isolated and switch to other majors.

"Oftentimes, subjects like math and science breed unwelcoming environments from day one," says Kate Carcia, associate manager in quality engineering at Red Hat and a graduate of UMass Lowell. "We rob people of the opportunity to learn if we shut them out from the beginning."

Without other women to look up to, many young women self-select out of a technical career path before they have even given it a chance. Leaning on her own experiences, Carcia mentored students from the RAMP program and spoke on


the program's industry panel with other female engineers from Red Hat.

"I would not have stuck around if it weren't for my mentors," Carcia says. "The feeling of not belonging was enough to start pushing me out the door on numerous occasions. I'm lucky to have people who push me right back in."

There is a huge opportunity to shift the trajectory of women and girls entering the industry and make tech an exciting and welcoming career opportunity for all. Changing how people envision computer scientists is an important step in the effort to encourage more women to pursue careers in technology. Girls need to see that computer scientists come in all shapes and all sizes. Adding more women to Red Hat will attract more women to Red Hat - and we are doing just that.

SARAH COGHLAN is the University Program Manager for Red Hat Research, working out of the Boston office. She oversees the PnT Intern and Co-op Programs and is the lead organizer for TechTogether Boston.

AUTHOR
— Sarah Coghlan

ARTICLE LINKS:
26 percent of the U.S. computing workforce is female: http://www.techrepublic.com/article/the-state-of-women-in-technology-15-data-points-you-should-know/
Girls Who Code: http://www.girlswhocode.com
littleBits: https://littlebits.com/
TechTogether Boston: https://boston.techtogether.io/
20% of hackathon participants identified as women: http://ladyproblemshackathon.com/our-impact/

Red Hat Research Quarterly delivered to your digital or physical mailbox?

Yes! Subscribe at research.redhat.com/quarterly.


KEYLIME: SECURING THE EDGE, ONE SLICE AT A TIME As the number of workloads in cloud, IoT, and edge deployments continues to grow, how do we verify that a remote, shared, and/or physically unsecured computing system has not been tampered with? Is there a good way to use a standardized cryptographic module to establish a hardware root of trust, perform a remote trusted measured boot, manage remote integrity verification, and execute remotely delivered encrypted payloads?

Keylime is an open source community-based project that enables the establishment and maintenance of trusted compute in distributed deployments. It uses embedded Trusted Platform Module (TPM) hardware (version 2 and later) and the Linux kernel's Integrity Measurement Architecture (IMA) subsystem to do so. Keylime was originally created by an MIT Lincoln Laboratory research team. It has now grown to include a small and dedicated open source community behind it.

[Figure: first page of "Bootstrapping and Maintaining Trust in the Cloud" (Schear et al., ACSAC '16), the original Keylime paper from MIT Lincoln Laboratory]

WHAT PROBLEMS DOES KEYLIME HELP SOLVE?

Today's clouds rely upon complete trust in the provider to secure applications and data. Cloud providers do not offer the ability to create hardware-rooted cryptographic identities for cloud resources, nor do they supply sufficient information to verify the integrity of systems. Trusted computing protocols, as well as trusted hardware like TPM chips, promised a solution to this problem. Unfortunately, their complex implementation, low performance, and lack of compatibility with virtualized environments have limited their adoption.

Keylime's design allows the remote attestation and IMA monitoring of thousands of nodes. Work is underway in collaboration with Boston University (BU) on a virtual TPM (vTPM) quote. Keylime can also scale to monitor thousands of virtual machines running on a single host by reducing the performance penalty of directly calling the hardware TPM of the cloud node to cryptographically sign data.

KEYLIME'S BENEFITS

Keylime enables:

1. Trusted measured boot functionality and secrets provisioning using encrypted payloads.
2. Runtime integrity checks and verification.

It also supports the following distribution scenarios:

• Single site - single node (multi-user)


• Single site - multi-node (Datacenter, IoT)
• Multi-site - multi-node (Distributed Datacenter, Network Edge equipment, IoT)
• Multi-tenant (Cloud)
• Baremetal
• Virtual machines (VM)

Figure 1 and Figure 2 illustrate the difference between a traditional remote trusted secure boot and continuous remote integrity verification.

THE TPM CORE

The TPM chip provides a root of trust facility introduced by the Trusted Computing Group (TCG), standardized in 2009, and updated in 2015 to version 2.0. The TPM standard includes general system trust facilities such as random number generation, secure key generation, data encryption, and remote attestation. Version 2.0 is not backward compatible with previous TPM versions. Keylime targets version 2.0 or later of the standard.
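The "extend" operation behind measured boot can be sketched in a few lines. A TPM Platform Configuration Register (PCR) is never written directly; each new measurement is hashed together with the register's current value, so the final value depends on every component measured and on the order in which they were measured. This is a simplified software model of the operation, not an interface to a real TPM.

```python
import hashlib

def extend(pcr, measurement):
    """Model of a TPM PCR extend: new value = SHA-256(old value || measurement)."""
    return hashlib.sha256(pcr + measurement).digest()

pcr0 = b"\x00" * 32  # PCRs start zeroed at reset
for component in [b"firmware", b"bootloader", b"kernel"]:
    # Each boot stage is measured (hashed) and folded into the PCR.
    pcr0 = extend(pcr0, hashlib.sha256(component).digest())

# Any change to a component, or to the order of measurement, yields a
# different final PCR value, which is what remote attestation checks.
print(pcr0.hex())
```

Because the chain is one-way, a verifier holding the expected final value can detect a modified boot component without trusting the node to report honestly.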

Figure 1

Figure 2


KEYLIME’S COMPONENTS The verifier is responsible for bootstrapping a new node into the Using the below components, Keylime system and continuously requesting the attests system integrity during node quotes from each agent component in provisioning as part of a trusted boot the system. The verifier then performs workflow as well as continuously the attestation on the quotes returned attesting the trustability of the runtime to determine if there have been any environment while it is operational. The unauthorized changes to the remote only external component Keylime needs systems. to operate is a functional TPM provided by each infrastructure node where The registrar is responsible for attestation is desired. maintaining the set of known secure (public) key values used during The three main components to the attestation processing. The agent Keylime system are the agent, the on each node registers itself with registrar, and the verifier. All three the registrar upon boot up, locking in components were initially developed in the initial state of the node for later Python. Components that have greater comparison. The registrar’s secure performance and security needs, like key set also includes the public keys the agent, are being ported to the Rust for the hardware manufacturer of language for its performant nature as each node in the system. These a low-level systems language and for manufacturer keys are used to verify the strict security model of ownership that the hardware TPM is valid and enforced by the compiler. can be used as the root of trust for The agent is required to be installed on the respective node. each node in the infrastructure where Also included in the Keylime tooling is attestation is desired. It is responsible for a tenant command line interface (CLI) interacting with the TPM of the system it utility (keylime_tenant). 
The tenant resides on, including TPM 2.0 functions utility uses Keylime’s RESTful interfaces such as requesting cryptographic to communicate with the Keylime quotes. The agent is then responsible components. The user can either employ for communicating the collected the tenant utility, the Keylime web user information back to other system interface (UI), or integrate a management components to enable the processing of system with Keylime by integrating the trust chain. with the Keylime REST application programming interface (API) directly.
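The verifier's periodic check can be sketched as follows. This is a simplified model of the flow described above, a fresh nonce, a quote from the agent, and a comparison against the state registered at boot; it is not Keylime's actual REST API or wire format, and a real TPM quote is a signed structure rather than a bare hash.

```python
import hashlib
import os

def make_quote(node_state, nonce):
    """Stand-in for a TPM quote: binds the node's state to a nonce.
    (A real quote is signed by a key rooted in the TPM.)"""
    return hashlib.sha256(node_state + nonce).hexdigest()

def attest(agent_state, registered_state):
    """One round of the verifier loop."""
    nonce = os.urandom(16)                          # freshness: prevents replayed quotes
    quote = make_quote(agent_state, nonce)          # what the agent would return
    expected = make_quote(registered_state, nonce)  # computed from the registrar's record
    return quote == expected

golden = b"kernel+ima-policy-v1"
assert attest(golden, golden)                 # untampered node passes
assert not attest(b"kernel+rootkit", golden)  # modified node fails attestation
```

The nonce is the important detail: because every round uses a new random value, an attacker cannot satisfy the verifier by replaying an old, valid quote.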


Keylime includes a simple Certificate Authority (CA) manageable by the tenant utility or its dedicated Keylime CA utility (keylime_ca). The CA is an integral part of initially establishing trust during the bootstrapping phase of node provisioning and in enforcing the trust relationship of the node thereafter. The CA is initially responsible for signing all boot keys sent to the nodes being provisioned, establishing the initial trust the system relies on. If the verifier detects a breach of the established trust via broken attestations, the CA is notified and expected to revoke the trust by invalidating the keys associated with the compromised node.

WHAT'S NEXT FOR KEYLIME

The Keylime community is currently working on packaging the project for different platforms and hardening the system by porting subsystems from Python to Rust. Rust will provide performance improvements and brings extra security benefits due to its type ownership model enforced by the compiler.

To learn more about Keylime

Interested in learning more or trying Keylime out for yourself? Come check out the community at https://keylime.dev and explore the "Get Started With Keylime" guide.

LUKE HINDS is a software engineer in Red Hat's CTO office with a focus on security trust systems for Cloud, Edge and IoT.

ANDREW TOTH is a software engineering manager in Red Hat's CTO office focused on Telecommunication Service Provider solutions.

AUTHORS
— Luke Hinds and Andrew Toth


RIG LEADER’S PERSPECTIVE: ONBOARDING A NEW UNIVERSITY A Red Hat Research Interest Group (RIG) is a self-organizing team that drives support for and participation in local technical research projects at academic institutions, government agencies, or industrial groups.

Red Hat Research Interest Groups are, in part, driven by the passion of their members. Often, these passions revolve around technology areas or social concerns and, sometimes, a combination of both. But, even with passion, pulling together partnerships with local universities is not without its challenges.

The MetroWest RIG covers Boston, Massachusetts, a city with a high concentration of universities, as well as Boston's suburbs, including the New England technology belt, where Red Hat's second largest engineering office is located in Westford. While the Westford office has many research projects and internships to offer, it has fewer candidates in the immediate area. We did not have to look too far away, however, to find excellent prospects in the nearby University of Massachusetts at Lowell (UMass Lowell), where a lot of us studied and where Red Hat software engineer Jeff Brown served as an adjunct professor.

First problem solved--we had insight and connections. We also believed we had something of value to offer to the university and to the students themselves. But how could we motivate them to collaborate with us? We hoped that, although Red Hat is not as big a fish as Google or Amazon, to the people at UMass Lowell we would still be seen as a medium-sized fish. In the end, what we found the real motivator to be was the kind of partnership we put together for them.

The next challenge was more difficult to address--finding the right channels to formalize the activities we planned to do in regards to Women in Engineering and Women in Science seminars and teaching classes as a partnership with commitment from both sides. Our entry point was the UMass Lowell Co-op Program office. Through their Professional Co-op program, students can now receive university credits and earn a salary for their work experience at Red Hat. Students who participate spend six months embedded in an engineering team with the ability to make significant contributions to open source projects. We now have 20 openings for UMass Lowell Co-op students.

One of the projects we sponsored was the University of Massachusetts Lowell Cyber Range, which emphasized the importance of security in open source development. The cyber range supports research, teaching, and workforce development, providing a live, sandboxed environment where students can build and investigate security systems from the ground up, safely exploring security attacks and defense techniques.

After determining the channel to connect with the students, we were


ready to tackle the final challenge--building up collaborative research projects. We talked to the heads of the computer science and electrical and computer engineering departments about the kinds of opportunities we could offer. Ideas included a semester's project, a senior project for undergrads, a master's thesis, and even a multi-year PhD level project. Red Hatters came up with the ideas for the projects and dedicated at least half a day each week to mentor students on campus or at the Westford office.

Although we offer some stipends for the larger projects, the relationship between UMass Lowell and Red Hat is more about the exchange of knowledge and creation of opportunities for collaboration. The relationship is mutually beneficial. For example, Fedora, the open source Linux project upstream from Red Hat Enterprise Linux, is being experimented with and tweaked for a range of small IoT devices and facial recognition technologies by students and, at the same time, they're working on something they're calling "Fedora for Academia."

The openness of Red Hat's approach is very appealing for the university. When they have a huge program with a conventional technology company, it is difficult to get papers out or advertise the research. Someone working on a radar system for a private company might find it impossible to publish a paper. But with Red Hat, it's not a problem. We are open and all the work is open, so we don't mind when they publish their work.

UMass Lowell has a special appeal to Red Hat, as well. Besides the expertise in advanced networking and 5G that their students bring, there's a "vibe" that's reminiscent of the early days of Red Hat. As a public university located in a historical industrial center, it is often overlooked and has to work hard to be noticed or make an impact. Nobody believed in Red Hat at first, or in its commitment to open source software, and we had to work hard to prove ourselves. That's a big reason why we feel that the UMass Lowell and Red Hat partnership is going to thrive.

AUTHOR
— Rashid Kahn, Sr. Director of software engineering, Red Hat


THE ISOLATION OF TIME AND SPACE: USING A PARTITIONING HYPERVISOR TO HOST VIRTUAL MACHINES For certain classes of systems and applications, the latency, nondeterminism, and resource sharing found in traditional hypervisor-based virtual systems are unacceptable. Examples of this can be seen in machine control, applications guaranteeing a strict quality of service, real-time systems, and power-constrained platforms. Each of these examples would encounter dynamics in the traditional hypervisor approach that could cause the system or application to not operate as intended. For these cases, making use of a virtualized system may still be desirable if different components of a platform, or different applications run on that platform, need to make use of different operating systems. However, it might be inefficient, or too power intensive, to use a different computational platform for each operating system. Thus, another approach to virtual systems must be used to mitigate these unacceptable dynamics. One such approach is the use of a partitioning hypervisor.

Virtualization is a powerful tool that enables the hosting of multiple operating systems, known as virtual machines (VMs) or guests, on the same physical platform. This allows the functionality and services of several, potentially different, operating systems to coexist and make use of the same set of hardware resources. These hardware resources include processors, memory, and I/O devices such as network cards, USB ports, and cameras. The use cases of such a system configuration are plentiful, spanning several domains such as servers, robotics, vehicles, general development, and many more. Virtualization provides interesting dynamics that can exploit the hardware features of a physical platform. Depending on how virtualization is implemented, though, this exploitation has a limit.

Traditionally, virtual systems make use of a hypervisor to manage the multiple operating systems' usage of hardware. This hypervisor, otherwise known as a virtual machine manager, or VMM, is a sort of resource manager. It ensures that each guest on a platform can make use of the hardware that it should be allowed to use, that each guest gets a turn to use shared resources, and that data is returned to the correct guest. A diagram of an example system running three VMs that makes use of a hypervisor can be seen in Figure 1.

Figure 1. This figure shows an example of a virtual system making use of a traditional hypervisor. The regions dedicated to each guest are delineated by the dashed red lines.

In certain cases, a guest can be made aware of the hypervisor and that it is running in a virtual


environment, but in other cases a guest operating system will remain oblivious to the hypervisor.

In the first case, a mechanism known as paravirtualization is used to make the guests aware of the hypervisor's existence. Paravirtualization, in essence, is when an operating system communicates with the hypervisor directly to access the hardware resources of the platform on which it is running. A guest's device drivers are altered to make calls into the hypervisor to perform tasks on its behalf, instead of the guest trying to directly access devices.

In the second case, when each guest is unaware of the hypervisor, a guest attempts to make direct use of the hardware. When this happens, the instructions that the guest attempts to execute are trapped (redirected) into the hypervisor. The hypervisor then emulates the guest's usage of a device and returns any data back to the appropriate guest. While these two dynamics are useful in allowing each guest to make use of the hardware devices of the platform, uses of the hypervisor are complicated, expensive, and nondeterministic.

When an operating system running in a virtual environment makes a request to use a device, it is unaware that other operating systems might also be making use of that device. Thus, the OS might have to wait an unexpected amount of time in order to make use of that device and to receive the expected data. Additionally, each round-trip into the hypervisor (performing a VM exit and a VM enter) can cause a TLB flush.

The TLB (translation look-aside buffer) is used to store address translations, so that a guest operating system does not always need to re-translate virtual addresses into a physical address. Once populated, the TLB reduces memory access times and improves the performance of the system. Flushing the TLB negates this performance boost, and thus with every call into the hypervisor, guest performance suffers. Aside from the nondeterminism and high round-trip time, hypervisors need to employ additional control structures to manage multiple guests' usage of the hardware.

All of these factors have an impact on both the spatial and temporal performance of the virtual systems. On modern hardware, these effects might not be immediately noticeable; modern platforms have suitable processing speeds and memory to handle multiple guest requests without significantly degrading performance (besides in extreme cases). However, these systems cannot guarantee dynamics such as hard real-time performance. In this context, real-time means that a task or event will finish within a predetermined deadline and can happen periodically, with each occurrence finishing within the expected time-bound. Because of the non-deterministic nature of the hypervisor, and because of the competition for shared resources, a typical guest operating system cannot


safely make the claim that a certain task will finish in a specifically predetermined amount of time.

Hypervisors also have limited device power management. Because the hypervisor manages the hardware resources for a set of guest operating systems, if a certain device is exposed to multiple guest operating systems, any one of those operating systems could make use of the device at essentially any time. Thus, the hypervisor cannot safely give a guest OS the capability to power off the device, as another guest might need to make use of it (or could be in the middle of using it!). The hypervisor is given the ability to power down idle devices, but it cannot be certain that one of the guests will not need to make use of one in the near future.

What, then, can be done to preserve the advantages of virtualization while still obtaining the guarantees a real-time system needs to operate correctly? A partitioning hypervisor is one answer: it can create a virtual system that mitigates the described latency, nondeterminism, and shared resource dynamics found in traditional hypervisors, while preserving the flexibility and low friction of VMs. A partitioning hypervisor is a form of virtual system that partitions the hardware resources of a physical platform. It then individually assigns these partitioned resources to a set of guest operating systems running in the virtual system. Each guest can then make exclusive use of the set of resources to which it is assigned. Using this method, each guest receives a dedicated set of processors, a dedicated region of memory, and a dedicated set of I/O devices. This partitioning arrangement enables the guest operating systems to be both temporally and spatially isolated from one another. A diagram of an example system running three VMs that makes use of a partitioning hypervisor can be seen in Figure 2.

Figure 2. This figure shows an example of a virtual system making use of a partitioning hypervisor. The regions dedicated to each guest are delineated by the dashed red lines.

Each guest operating system in a partitioning-hypervisor-based virtual system is set up by a monitor. During this set-up, the monitor establishes the guest's memory region, ensures that the guest only has access to the set of devices to which it is assigned (processors and I/O devices), and can handle certain faults of the guest. After this set-up, the monitor largely removes itself from the guest's runtime operations. This prevents costly VM exits and enters and allows each guest to make use of the hardware assigned to it without any hypervisor intervention.

The monitor can also be used to establish shared memory communication channels. These channels


allow guest operating systems to communicate directly with each other through shared memory, instead of using some I/O device as if they were running on separate physical platforms, which is the traditional approach. This allows information generated in one guest to be quickly disseminated to, and consumed by, another guest operating system. This shared memory communication enables interesting inter-guest dynamics to be established.

As each guest can interact with the hardware directly and is the exclusive user of that hardware, it has the ability to manage the power of the hardware to which it is assigned. This allows for per-guest power management of devices, letting a guest suspend its devices in periods of low activity and thus consume less power. In this way, less power is consumed by the platform as a whole.

The isolation between guests allows systems of different criticalities to be run on the same platform. Criticality indicates a system's sensitivity to a certain domain, such as safety, timing, or security. Thus, a system with a high timing criticality might be running tasks that are very sensitive to variations in time, while a system with a high security criticality might store sensitive information that should be shielded from external readers.

Certain guests might have a high sensitivity to multiple domains, with an example of such a system being an autonomous vehicle. A vehicle has stringent timing dynamics, as delay in the sending of a command to a wheel or propeller could cause disastrous consequences. Security-wise, it would be undesirable for an external attacker to gain control over the vehicle, as unwanted commands could then be sent and processed by the vehicle. Safety-wise, it is important for the vehicle to ensure that it is fault-tolerant and will not perform any unwanted behaviour that might harm the individuals within or outside the vehicle. Using a partitioning hypervisor, this vehicle might host a guest to provide the necessary safety, security, and timing required to control the vehicle, and also host a separate, isolated guest to communicate with external sources or to perform autonomous computation that should not interfere with, but inform, the critical vehicle control.

Aside from vehicles and other IoT devices, the isolation that partitioning hypervisors provide is useful in a server and data-center context. Allowing each guest to manage its own resources enables each one to optimize its utilization of the hardware resources to


which it is assigned. This optimization, while beneficial for power consumption, is also beneficial economically, as data centers would require less overall power.

In data centers, having mixed criticality systems executing on the same physical medium allows for the creation of new and interesting services. For example, users of the data center might be able to specify which I/O devices they would like to make use of, how many processors they require, and how much memory they should be allocated. Then, the data center would find a physical platform that can guarantee dedicated access to those resources and give the user exclusive control of them. The user's processes might be executing on a machine with several other guest operating systems operating on it, but through the isolation mechanisms of a partitioning hypervisor, the guest would not encounter any interference from them. This dynamic would be very valuable to edge devices, or to applications that require strict quality-of-service guarantees.

Currently, our research is exploring how to make more efficient usage of, and build applications and services for, the environments that a partitioning hypervisor provides. We are also investigating how to better equip Linux with these partitioning features. This will give Linux the ability to be used more optimally in a wider variety of domains and will enable partitioning hypervisors to become more accessible. This accessibility provides an opportunity for a wider audience to explore the potential of partitioning hypervisors.

Linux is a well-known and widely used system, so building these partitioning features into the Linux kernel will not only make them easy to adopt, but will also allow individuals to use Linux to explore their hardware in interesting ways. For example, there are several SBCs (single board computers), such as the UP Squared, that have the hardware one would expect on commodity platforms, but also include GPIOs (general purpose I/O pins). In some cases, such as Intel's Aero Board, the platforms also include gyroscopes and IMUs. Using partitioning techniques, a core, some memory, and the GPIOs could be isolated and used to emulate something like an Arduino, which can be hosted on the same physical platform as the one where the Arduino development is occurring. Shared memory channels could then be used to communicate data from the guest where development is occurring to the Arduino-like guest, instead of through a serial link or some other


communication mechanism. This would increase the bandwidth and flow rate of the data and reduce both the hardware and software complexity of the communication. An example of a potential configuration for such a system can be seen in Figure 3.

With these partitioning features, we can investigate the limits and possibilities of hardware, and discover and develop configurations and protocols to aid in endeavours such as resource and power management, mixed criticality system interactions, and autonomous vehicle dynamics. These features would also provide the flexibility to explore how to allow artificial devices to interact more efficiently with the physical world.

These endeavours and explorations are immediately useful and will become increasingly prevalent, especially as bigger and harder challenges are tackled, such as creating fully autonomous vehicles, increasing the efficiency and capabilities of our global computing infrastructures, and exploring the universe.

I would like to thank my advisor, Professor Richard West, and Bandan Das for their guidance, mentorship, and insights.

AUTHOR — Craig Einstein, Ph.D. Student

CRAIG EINSTEIN is a computer science Ph.D. student at Boston University under the advisement of Professor Richard West. Prior to starting his Ph.D. he received his B.A. in Geophysics and Planetary Sciences in 2016 from Boston University. He researches computer systems with an emphasis on real-time and mixed criticality systems and autonomous control. Upon the completion of his doctorate, he would like to work in the space industry developing systems to make space travel more efficient and accessible.

Figure 3. This figure shows an overview of the hardware and software configuration of an example use-case of a partitioning hypervisor. The regions dedicated to each guest are delineated by the dashed red line.


RESEARCH PROJECTS UPDATE

Faculty, PhD students, and Red Hat associates in the Northeast U.S. are collaborating actively on the following research projects. This quarter we highlight collaborative projects at Boston University (BU), Northeastern University, Harvard University, and the University of Massachusetts. We will highlight research collaborations from other parts of the world in future editions of the Research Quarterly. Contact [email protected] for more information on any project.

Project title: Unikernel Linux
Academic investigators: Ali Raza (araza)
Red Hat investigators: Ulrich Drepper, Larry Woodman, Richard Jones

Unikernels are small, lightweight, single address space operating systems with the kernel included as a library within the application. Because unikernels run a single application, there is no sharing or competition for resources among different applications, improving performance and security. Unikernels have thus far seen limited production deployment. This project aims to turn the Linux kernel into a unikernel that: 1) is easily compiled for any application, 2) uses battle-tested, production Linux and glibc code, 3) allows the entire upstream Linux developer community to maintain and develop the code, and 4) allows applications normally running on vanilla Linux to benefit from unikernel performance and security advantages. The paper "Unikernels: The Next Stage of Linux's Dominance" was presented at HotOS XVII, the 17th Workshop on Hot Topics in Operating Systems, 2019. Video presentation:

https://www.youtube.com/watch?v=ltvXeolVnVE&feature=youtu.be&t=3h57m40s

Project title: An optimizing operating system: Accelerating execution with speculation
Academic investigators: Tommy Unger, Han Dong, Yara Awad, Prof. Jonathan Appavoo, Prof. Amos Waterland, Prof. Orran Kreiger
Red Hat investigators: Ulrich Drepper

To optimize performance, Automatically Scalable Computation (ASC), a Harvard/BU collaboration, attempts to auto-parallelize single-threaded workloads, reducing any new effort required from programmers to achieve wall clock speedup. SEUSS takes a different approach by splicing a custom operating system into the backend of a high-throughput distributed serverless platform, Apache OpenWhisk. SEUSS uses an alternative isolation mechanism to containers, called Library Operating Systems (LibOSs). LibOSs enable a lightweight snapshotting technique. Snapshotting LibOSs enables two counterintuitive results: 1) although LibOSs inherently replicate system state, SEUSS can cache multiplicatively more functions on a node; 2) although LibOSs can suffer bad "first run" performance, SEUSS is able to reduce cold start times by orders of magnitude. By increasing sharing and decreasing deterministic bringup, SEUSS radically reduces the amount of hardware and cycles required to run a FaaS platform. Video presentation:

https://www.youtube.com/watch?v=h1rpSeaTecQ

Project title: Removing memory as a noise factor
Academic investigators: Parul Sohal, Prof. Renato Mancuso, Prof. Orran Kreiger
Red Hat investigators: Ulrich Drepper


Memory bandwidth is increasingly the bottleneck in modern systems and a resource that, until today, we could not schedule. This means that, depending on what else is running on a server, performance may be highly unpredictable, impacting the 99% tail latency, which is increasingly important in modern distributed systems. Moreover, the increasing importance of high-performance computing applications, such as machine learning and real-time systems, demands more deterministic performance, even in shared environments. Alternatively, many environments resist running more than one workload on a server, reducing system utilization. Recent processors have started introducing the first mechanisms to monitor and control memory bandwidth. Can we use these mechanisms to enable machines to be fully used while ensuring that primary workloads have deterministic performance? This project presents early results from using Intel's Resource Director Technology and some insight into this new hardware support. The project also examines an algorithm using these tools to provide deterministic performance on different workloads.

Video presentation: https://www.youtube.com/watch?v=i8JnQ7VKkEM

Project title: Performance management for serverless computing
Academic investigators: Ali Raza (alraza), Prof. Orran Kreiger

Serverless computing gives developers the freedom to build and deploy applications without worrying about infrastructure. The resources (memory, CPU, location) specified for a function affect both the performance and the cost of a serverless platform, so configuring these resources properly is critical. COSE uses a statistical learning approach to dynamically adapt the configurations of serverless functions while meeting QoS/SLA metrics and lowering the cost of cloud usage. This project evaluates COSE on a commercial serverless platform (AWS Lambda) as well as in multiple simulated scenarios, proving its efficacy. Video presentation:

https://www.youtube.com/watch?v=7CgBJnNebrQ

Project title: FLOCX: First Layer of the Open Cloud eXchange
Academic investigators: Sahil Tikale, Ali Raza (alraza), Leo McGann, Danni Shi, Filip Vukelic, Jacob Daitzman, Prof. Orran Kreiger
Red Hat investigators: Langdon White, Lars Kellog-Stedman, Tzu-Mainn Chen, Gagan Kumar

FLOCX provides a marketplace for trading physical servers among co-located pools of hardware, where each pool is owned and managed by an independent organization. Using FLOCX, organizations can rent nodes from their co-located neighbors in times of high demand and offer their own resources at a suitable price when others experience high demand. An implementation of FLOCX (https://cci-moc.github.io/flocx/) is running in the Massachusetts Open Cloud environment (https://massopen.cloud). Video presentation:

https://youtu.be/goDpCRLhCao


Project title: Automatic Configuration of Complex Hardware
Academic investigators: Han Dong, James Cadden, Yara Awad, Prof. Orran Kreiger, Prof. Jonathan Appavoo
Red Hat investigators: Sanjay Arora

A modern network interface card (NIC), such as the Intel X520 10 GbE, is complex, with hardware registers that control every aspect of the NIC's operation, from device initialization to dynamic runtime configuration. The Intel X520 datasheet documents over 5600 registers, yet only about 1890 are initialized by a modern Linux kernel. It is thus unclear what the performance impact of tuning these registers on a per-application basis will be. In this project, we pursue three goals towards this understanding: 1) identify, via a set of microbenchmarks, application characteristics that will illuminate mappings between hardware register values and their corresponding microbenchmark performance impact, 2) use these mappings to frame NIC configuration as a set of learning problems such that an automated system can recommend hardware settings corresponding to each network application, and 3) introduce new dynamic or application-instrumented policy in order to better attune dynamic hardware configuration to application runtime behavior. Video presentation:

https://www.youtube.com/watch?v=8UQTlNQTKtQ

Project title: D3N: A multi-layer cache for data centers
Academic investigators: Emine Ugur Kaynar, Mania Abdi, Prof. Peter Desnoyers, Prof. Orran Kreiger
Red Hat investigators: Matt Benjamin, Brett Niver, Ali Maredia, Mark Kogan

Current caching methods for improving the performance of big-data jobs assume abundant (e.g., full bi-section) bandwidth to cache nodes. However, many enterprise data centers and co-location facilities exhibit significant network imbalances due to over-subscription and incremental network upgrades. This project designs and develops D3N, a novel multi-layer cooperative caching architecture that mitigates network imbalances by caching data on the access side of each layer of a hierarchical network topology. A prototype implementation, which incorporates a two-layer cache, is highly performant (it can read cached data at 5 GB/s, the maximum speed of our SSDs) and significantly improves the performance of big-data jobs. To fully utilize bandwidth within each layer under dynamic conditions, we present an algorithm that adaptively adjusts the cache sizes of each layer based on observed workload patterns and network congestion. Video presentation:

https://www.youtube.com/watch?v=troLFFM6btc

Project title: FPGAs in large-scale computer systems
Academic investigators: Ahmed Sanaullah, Prof. Martin Herbordt, Prof. Orran Kreiger
Red Hat investigators: Ulrich Drepper

Secure Multiparty Computation (MPC) is a cryptographic primitive that allows several parties to jointly and privately compute desired functions over secret data. This project developed and deployed JIFF: an extensible general-purpose MPC framework capable of running on web and mobile stacks, showing how developments in distributed systems, web development, and the SMDI paradigm can inform MPC constructs and implementation. JIFF includes a JavaScript library for building applications that rely on secure MPC, with the ability to be run in the browser, on mobile phones, or via Node.js. JIFF is designed so that developers need not be familiar with MPC techniques or know the details of cryptographic protocols in order to build secure applications. This project used JIFF to implement several MPC applications, including a successfully deployed real-world study on economic opportunity for minority-owned businesses in the Boston area and a service for efficient privacy-preserving route recommendation. Video presentation:

https://www.youtube.com/watch?v=wwwtyIWrTZ0


Project title: A partitioning hypervisor for latency-sensitive workloads
Academic investigators: Craig Einstein, Prof. Richard West
Red Hat investigators: Bandan Das

Quest-V is a separation kernel that partitions services of different criticality levels across separate virtual machines, or sandboxes. Each sandbox encapsulates a subset of machine physical resources that it manages without requiring intervention from a hypervisor. In Quest-V, a hypervisor is only needed to bootstrap the system, recover from certain faults, and establish communication channels between sandboxes. The machine physical resources given to each sandbox include one or more processing cores, a region of machine physical memory, and a subset of I/O devices. Current Quest-V research is exploring how to manage hardware resources to allow for a power- and latency-aware system. The partitioning of virtual machines (VMs) onto separate machine resources offers an opportunity for per-sandbox power management. Thus, in idle periods, a sandbox may place its hardware into a suspend state, reducing the power utilization of the sandbox. Depending on the latency and power demands of the sandbox, sandboxes can be suspended to RAM or to disk, and can resume normal power consumption when appropriate. Sandboxes can also be migrated across hosts to balance system resources and reduce power consumption, allowing entire machines to be placed into low-power states once all sandboxes have been migrated away from them. Quest-V is unlike a normal hypervisor in that it allows VMs to suspend and resume individual hardware resources without interfering with the operation of other VMs on the same physical platform. This allows for the creation of systems that are both power and latency aware.

Project title: Code2Vec: Learning code representations
Academic investigators: Xiaojing Zhu
Red Hat investigators: Sanjay Arora

Code2Vec is a neural model for representing snippets of code as fixed-length continuous vectors (code embeddings) that encode semantic similarities, which enables the application of neural techniques to a wide range of programming-language tasks. Embeddings can be applied to performance measurement of program execution on a CPU, smarter code completion, and finding similar functions in analyzed code. This project analyzed semantic similarities of learned code embeddings parsed from open source Python libraries such as numpy, pandas, and sklearn. Still in progress is another analysis that learns code embeddings in a supervised manner from a C++ codebase for performance measurement of program execution on a CPU with performance counters (e.g., LLC misses to L1 requests, cycles per instruction).


Red Hat Research Quarterly delivered to your digital or physical mailbox?

Yes! Subscribe at research.redhat.com/quarterly.

ABOUT RED HAT

Red Hat is the world’s leading provider of open source software solutions, using a community- powered approach to provide reliable and high- performing cloud, Linux, middleware, storage, and virtualization technologies. Red Hat also offers award-winning support, training, and consulting services. As a connective hub in a global network of enterprises, partners, and open source communities, Red Hat helps create relevant, innovative technologies that liberate resources for growth and prepare customers for the future of IT.

NORTH AMERICA 1 888 REDHAT1

EUROPE, MIDDLE EAST, AND AFRICA 00800 7334 2835 [email protected]

ASIA PACIFIC +65 6490 4200 [email protected]

LATIN AMERICA +54 11 4329 7300 [email protected]

Feedback, comments, or ideas? Is there something you'd like to read about? Drop us a line: [email protected]

facebook.com/redhatinc
@redhatnews
linkedin.com/company/red-hat


COMING IN FEBRUARY: • Real-time complex event processing from streaming data • Project Vega: Adding rotation to stellar computational models • Interview with Red Hat’s Mark Little on Quarkus and all things Java

Bringing great research ideas into open source communities

November 2019 – Volume 1: Issue 3