Robust Architectural Support for Transactional Memory in the Power Architecture

Total Page:16

File Type:pdf, Size:1020Kb

Robust Architectural Support for Transactional Memory in the Power Architecture Robust Architectural Support for Transactional Memory in the Power Architecture Harold W. Cain∗ Brad Frey Derek Williams IBM Research IBM STG IBM STG Yorktown Heights, NY, USA Austin, TX, USA Austin, TX, USA [email protected] [email protected] [email protected] Maged M. Michael Cathy May Hung Le IBM Research IBM Research (retired) IBM STG Yorktown Heights, NY, USA Yorktown Heights, NY, USA Austin, TX, USA [email protected] [email protected] [email protected] ABSTRACT in current p795 systems, with 8 TB of DRAM), as well as On the twentieth anniversary of the original publication [10], strengths in RAS that differentiate it in the market, adding following ten years of intense activity in the research lit- TM must not compromise any of these virtues. A robust erature, hardware support for transactional memory (TM) system is one that is sturdy in construction, a trait that has finally become a commercial reality, with HTM-enabled does not usually come to mind in respect to HTM systems. chips currently or soon-to-be available from many hardware We structured TM to work in harmony with features that vendors. In this paper we describe architectural support for support the architecture's scalability. Our goal has been to TM provide a comprehensive programming environment includ- TM added to a future version of the Power ISA . Two im- ing support for simple system calls and debug aids, while peratives drove the development: the desire to complement providing a robust (in the sense of "no surprises") execu- our weakly-consistent memory model with a more friendly tion environment with reasonably consistent performance interface to simplify the development and porting of multi- and without unexpected transaction failures.2 TM must be threaded applications, and the need for robustness beyond usable throughout the system stack: in hypervisors, oper- that of some early implementations. In the process of com- ating systems, libraries, compilers, run-time systems, and mercializing the feature, we had to resolve some previously applications. Further, TM must be easily exploited in exist- unexplored interactions between TM and existing features of ing code, as well as newly written software, driving the need the ISA, for example translation shootdown, interrupt han- to co-exist with nearly the full scope of the ISA's functional- dling, atomic read-modify-write primitives, and our weakly ity. The transactional extensions start with a set of instruc- consistent memory model. We describe these interactions, tions for implementing strongly-isolated[17]3 transactions as the overall architecture, and discuss the motivation and ra- well as a general mechanism for checkpointing and rollback tionale for our choices of architectural semantics, beyond of architectural state. It discourages unpredictable transac- what is typically found in reference manuals. tion behavior (e.g. aborting transactions on some but not all function calls [19]), while also providing support for identi- 1. INTRODUCTION fying the source of transaction failures when the contents of This work describes the first architectural support for TM the transaction are inconsistent with transactional seman- in the Power ISA. (The TM feature implemented in Blue tics (e.g. code that performs a cache flush operation). One Gene R /Q 1 contains no instruction set support.) Since the feature supporting these goals is transactional suspension, Power ISA is the basis for IBM R pSeries servers, whose which implicitly occurs when handling interrupts, or can strengths are robust and scalable performance in large sys- be explicitly invoked via a new suspend/resume instruction. tem configurations (e.g. up to 256 cores / 1024 threads This support is described in Section 3. In the near term, the value of the TM feature is expected ∗The author is now employed by Qualcomm Research- to come from lock elision[24] and to a lesser extent from Raleigh speculative compilation techniques that are enabled by the 1 IBM, Power, Blue Gene, AIX, Power Architecture, and checkpointing/rollback mechanism[23, 22, 28]. Over time, z/Architecture are registered trademarks of International the ease of programming relative to the weakly consistent Business Machines. Power ISA memory model will seed the development and porting of multithreaded applications. (For example if mul- tiple memory accesses can be wrapped in a transaction, there is no need to reason about the necessary memory barriers be- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are tween those particular accesses.) The architecture has been not made or distributed for profit or commercial advantage and that copies 2 bear this notice and the full citation on the first page. To copy otherwise, to We use the term failure to refer to any unsuccessful trans- republish, to post on servers or to redistribute to lists, requires prior specific action, and reserve the term abort to refer to failures that are explicitly requested via transaction abort instructions. permission and/or a fee. 3 ISCA ’13 Tel-Aviv, Israel While Blundell et. al[17] use the term strongly atomic, we Copyright 2013 ACM 978-1-4503-2079-5/13/06 ...$15.00. prefer and use the term strongly isolated. 225 tailored for these uses, while providing a base to support to be competitive with those primitives. From this start- emerging language standards and eventually be extended ing point, we also imposed the following implementation as- for lock-free programming. sumptions: Another aspect of the inclusion of TM is its intersection with the Power ISA weak memory model. In addition to the • We would not require support for maintaining more interaction of TM with non-transactional weakly ordered than one read or write set per transaction (i.e. no guar- code, since lock elision enables the transactionalization of anteed support for intermediate rollback of nested trans- arbitrary code, we must define correct and predictable se- actions or multiple speculative versions of a line). mantics for the inclusion of memory barriers and atomic syn- • There would be no guarantee of success for any trans- chronization primitives (i.e. load-reserve/store conditional) action; all transactions must specify a failure handler al- within a transaction. We describe the intersection of TM lowing for non-transactional alternatives. and the Power ISA memory model in Section 4. • Despite a desire for graceful handling of many rare Although this paper summarizes the architectural support events, there would be no requirement of support for for HTM in one commercial architecture, and the design transaction survival across context switches or paging of challenges and trade-offs involved, some of the features dis- transactionally accessed data. cussed have not been previously described in the literature, which we highlight here as novel contributions of this work: • When possible, the primitives themselves must be largely consistent with existing components of the Power • We define a suspended transactional mode of execu- ISA. Significant deviations were only permissible if abso- tion with deferred failure handling semantics, allowing lutely necessary. for a sequential programming model that can run existing • The resulting architecture would allow for decoupled software (e.g. interrupt handlers) without fear of failure- implementations of TM and existing larx/stcx atomic syn- induced control redirection. This mode is entered as the chronization primitives. result of an interrupt or via a new suspend instruction, and is an important ingredient in the architecture's ro- • We placed no architectural limits on the transaction bustness. It is described in Section 3. size an implementation must support, but encourage the possible support of large transactions by eliminating many • We describe the potential for a loss of transactional causes of transaction failure. Implementations must have atomicity in systems that implement hardware-based the flexibility to choose granule size for tracking transac- translation shootdowns (i.e. do not rely on interproces- tional conflicts. sor interrupts), motivating support for conflict detection mechanisms with translation shootdown. Support for TM was primarily motivated as an enabler • We describe various mechanisms, most importantly an of lock elision and speculative code generation techniques. integrated cumulative barrier effect on transaction com- Lock elision is a technique that can be implemented purely mittal, that are necessary to allow transactional and non- in hardware[24], purely in software [21], or using a combina- transactional accesses to interoperate in a sensible fash- tion of the two through transactional lock elision (TLE) [7], ion. The interaction of TM and the Power ISA memory which is the approach we adopt. Lock elision requires the model is described in Section 4. ability to execute transaction unaware code, as opposed to • We define a variant of transactions called rollback-only small carefully orchestrated transactions that may be con- transactions that support the checkpointing and rollback structed by a programmer by hand, for example when build- of architectural state without the atomic features of trans- ing a lock-free data structure. Since lock-based critical sec- actions, to be used for software speculation. tions can make procedure calls that may end up distantly removed from the original critical section (potentially even We begin with a summary of the requirements that guided in system
Recommended publications
  • Terms and Conditions for Vmware on IBM Cloud Services
    Terms and Conditions for VMware on IBM Cloud Services These Terms and Conditions for VMware on IBM Cloud Services (the “Terms”) are a legal agreement between you, the customer (“Customer” or “you”), and Dell. For purposes of these Terms, “Dell” means Dell Marketing L.P., on behalf of itself and its suppliers and licensors, or the Dell entity identified on your Order Form (if applicable) and includes any Dell affiliate with which you place an order for the Services. Your purchase of the Services is solely for your internal business use and may not be resold. 1. Your use of the Services reflected on your VMware on IBM Cloud with Dell account and any changes or additions to the VMware on IBM Cloud Services that you purchase from Dell are subject to these Terms, including the Customer Support terms attached hereto as Exhibit A, as well as the Dell Cloud Solutions Agreement located at http://www.dell.com/cloudterms (the “Cloud Agreement”), which is incorporated by reference in its entirety herein. 2. You are purchasing a Subscription to the Services which may include VMware vCenter Server on IBM Cloud or VMware Cloud Foundation on IBM Cloud (individually or collectively, the “Services”). “Subscription” means an order for a quantity of Services for a defined Term. The “Term” is one (1) year from the start date of your Services, and will thereafter automatically renew for successive month to month terms for the duration of your Subscription. 3. The Services will be billed on a monthly basis, with payment due and payable in accordance with the payment terms set forth in the Cloud Agreement.
    [Show full text]
  • New Dell Models Added to the IBM Integrated Multivendor Support (IMS)
    New Dell models added to the IBM Integrated Multivendor Support (IMS) On March 22nd, IBM announced the addition of Dell X86 Generation 14 server models to help your clients optimize the performance of their Dell x86 servers for better return on investment. Dell x86 servers are central components of an IT infrastructure. Keeping them running at peak efficiency is vital to meeting your client's availability requirements. Robust technical support from an experienced vendor with extensive resources can optimize the performance of the servers--regardless of their age--while better controlling support costs. This new offering covers the following: Warranty pass-thru for machine type 1510--this allows clients to be able to use a single source to handle all their service needs and IBM will take care of the call placement to Dell "NEW" - Clients now can request IBM Hardware maintenance during the Dell warranty period using machine type 7069 for select Gen 14 models. IBM will provide hardware maintenance support for those clients who wish to have a this coverage model during the warranty period. IBM Hardware maintenance for machine type 0138--this is post warranty coverage, so the client can be sure to have coverage post warranty Battery maintenance is available on all these models. Your clients expect coverage for Battery maintenance, please be sure to add to your Dell model quotes. What are the benefits of IMS to your client? Receive pricing immediately, as these products are all in ISAT IBM SSR provides the labor Does not require machine
    [Show full text]
  • The Case for Hardware Transactional Memory
    Lecture 20: Transactional Memory CMU 15-418: Parallel Computer Architecture and Programming (Spring 2012) Giving credit ▪ Many of the slides in today’s talk are the work of Professor Christos Kozyrakis (Stanford University) (CMU 15-418, Spring 2012) Raising level of abstraction for synchronization ▪ Machine-level synchronization prims: - fetch-and-op, test-and-set, compare-and-swap ▪ We used these primitives to construct higher level, but still quite basic, synchronization prims: - lock, unlock, barrier ▪ Today: - transactional memory: higher level synchronization (CMU 15-418, Spring 2012) What you should know ▪ What a transaction is ▪ The difference between the atomic construct and locks ▪ Design space of transactional memory implementations - data versioning policy - con"ict detection policy - granularity of detection ▪ Understand HW implementation of transaction memory (consider how it relates to coherence protocol implementations we’ve discussed in the past) (CMU 15-418, Spring 2012) Example void deposit(account, amount) { lock(account); int t = bank.get(account); t = t + amount; bank.put(account, t); unlock(account); } ▪ Deposit is a read-modify-write operation: want “deposit” to be atomic with respect to other bank operations on this account. ▪ Lock/unlock pair is one mechanism to ensure atomicity (ensures mutual exclusion on the account) (CMU 15-418, Spring 2012) Programming with transactional memory void deposit(account, amount){ void deposit(account, amount){ lock(account); atomic { int t = bank.get(account); int t = bank.get(account); t = t + amount; t = t + amount; bank.put(account, t); bank.put(account, t); unlock(account); } } } nDeclarative synchronization nProgrammers says what but not how nNo explicit declaration or management of locks nSystem implements synchronization nTypically with optimistic concurrency nSlow down only on true con"icts (R-W or W-W) (CMU 15-418, Spring 2012) Declarative vs.
    [Show full text]
  • The Evolution of Ibm Research Looking Back at 50 Years of Scientific Achievements and Innovations
    FEATURES THE EVOLUTION OF IBM RESEARCH LOOKING BACK AT 50 YEARS OF SCIENTIFIC ACHIEVEMENTS AND INNOVATIONS l Chris Sciacca and Christophe Rossel – IBM Research – Zurich, Switzerland – DOI: 10.1051/epn/2014201 By the mid-1950s IBM had established laboratories in New York City and in San Jose, California, with San Jose being the first one apart from headquarters. This provided considerable freedom to the scientists and with its success IBM executives gained the confidence they needed to look beyond the United States for a third lab. The choice wasn’t easy, but Switzerland was eventually selected based on the same blend of talent, skills and academia that IBM uses today — most recently for its decision to open new labs in Ireland, Brazil and Australia. 16 EPN 45/2 Article available at http://www.europhysicsnews.org or http://dx.doi.org/10.1051/epn/2014201 THE evolution OF IBM RESEARCH FEATURES he Computing-Tabulating-Recording Com- sorting and disseminating information was going to pany (C-T-R), the precursor to IBM, was be a big business, requiring investment in research founded on 16 June 1911. It was initially a and development. Tmerger of three manufacturing businesses, He began hiring the country’s top engineers, led which were eventually molded into the $100 billion in- by one of world’s most prolific inventors at the time: novator in technology, science, management and culture James Wares Bryce. Bryce was given the task to in- known as IBM. vent and build the best tabulating, sorting and key- With the success of C-T-R after World War I came punch machines.
    [Show full text]
  • Apple at Work Compatibility
    Apple at Work Compatibility Compatible with your existing systems. Apple devices work with most enterprise systems and apps that your company already uses—mail and messaging, network connectivity, file sharing, collaboration, and more—giving your employees access to everything they need to do their jobs. Connect to your infrastructure iPhone, iPad, and Mac support WPA2 Enterprise to provide secure access to your enterprise Wi-Fi network. With the integration of iOS, macOS, and the latest technology from Cisco, businesses everywhere can seamlessly connect to networks, optimize the performance of business-critical apps, and collaborate using voice and video—all with the security that businesses need. Secure access to private corporate networks is available in iOS, iPadOS, and macOS using established industry-standard virtual private network protocols. Out of the box, iOS, iPadOS, and macOS support IKEv2, Cisco IPSec, and L2TP over IPSec. Work with your existing enterprise systems Apple devices work with key corporate services including Microsoft Exchange, giving your employees full access to their business email, calendar, and contacts, across all their Apple devices. Your employees can use the built-in Apple apps including Mail, Calendar, Contacts, Reminders, and Notes to connect, and use Microsoft Outlook on Mac for working with Microsoft Exchange. iPhone, iPad, and Mac devices support a wide range of connectivity options including standards-based systems like IMAP and CalDAV. Popular productivity and collaboration tools like Microsoft Office, Google G Suite, Slack, Cisco Webex, and Skype are all available on the App Store, and deliver the functionality you know and expect. Access all your documents and files The Files app in iOS and iPadOS lets you access your Box, DropBox, OneDrive, and Google Drive files all from one place.
    [Show full text]
  • Leveraging Modern Multi-Core Processor Features to Efficiently
    LEVERAGING MODERN MULTI-CORE PROCESSORS FEATURES TO EFFICIENTLY DEAL WITH SILENT ERRORS, AUGUST 2017 1 Leveraging Modern Multi-core Processor Features to Efficiently Deal with Silent Errors Diego Perez´ ∗, Thomas Roparsz, Esteban Meneses∗y ∗School of Computing, Costa Rica Institute of Technology yAdvanced Computing Laboratory, Costa Rica National High Technology Center zUniv. Grenoble Alpes, CNRS, Grenoble INP ' , LIG, F-38000 Grenoble France Abstract—Since current multi-core processors are more com- about 200% of resource overhead [3]. Recent contributions [1] plex systems on a chip than previous generations, some transient [4] tend to show that checkpointing techniques provide the errors may happen, go undetected by the hardware and can best compromise between transparency for the developer and potentially corrupt the result of an expensive calculation. Because of that, techniques such as Instruction Level Redundancy or efficient resource usage. In theses cases only one extra copy checkpointing are utilized to detect and correct these soft errors; is executed, which is enough to detect the error. But, still the however these mechanisms are highly expensive, adding a lot technique remains expensive with about 100% of overhead. of resource overhead. Hardware Transactional Memory (HTM) To reduce this overhead, some features provided by recent exposes a very convenient and efficient way to revert the state of processors, such as hardware-managed transactions, Hyper- a core’s cache, which can be utilized as a recovery technique. An experimental prototype has been created that uses such feature Threading and memory protection extensions, could be great to recover the previous state of the calculation when a soft error opportunities to implement efficient error detection and recov- has been found.
    [Show full text]
  • DONALD J. ROSENBERG Executive Vice President, General Counsel and Corporate Secretary Qualcomm Incorporated
    Qualcomm Incorporated 5775 Morehouse Drive (858) 587-1121 San Diego, CA 92121-1714 www.qualcomm.com DONALD J. ROSENBERG Executive Vice President, General Counsel and Corporate Secretary Qualcomm Incorporated Donald J. Rosenberg is executive vice president, general counsel and corporate secretary of Qualcomm Incorporated. Mr. Rosenberg reports directly to CEO Steve Mollenkopf and is a member of the company's Executive Committee. In his role as chief legal officer, he is responsible for overseeing Qualcomm's worldwide legal affairs including litigation, intellectual property and corporate matters. Qualcomm's Government Affairs, Strategic Intellectual Property, Internal Audit and Compliance organizations also report to him. Prior to joining Qualcomm, Mr. Rosenberg served as senior vice president, general counsel and corporate secretary of Apple Inc. Prior to that, he was senior vice president and general counsel of IBM Corporation where he had also held numerous positions including vice president and assistant general counsel for litigation and counsel to IBM's mainframe division. Mr. Rosenberg has had extensive experience in corporate governance, compliance, law department management, litigation, securities regulation, intellectual property and competition issues. He has served as an adjunct professor of law at New York's Pace University School of Law, where he taught courses in intellectual property and antitrust law. Mr. Rosenberg is Co-Chair of the Lawyers’ Committee for Civil Rights under Law and a board member of the Corporate Directors Forum as well as the La Jolla Playhouse. Mr. Rosenberg received a Bachelor of Science degree in mathematics from the State University of New York at Stony Brook and his juris doctor from St.
    [Show full text]
  • Lecture 21: Transactional Memory
    Lecture 21: Transactional Memory • Topics: Hardware TM basics, different implementations 1 Transactions • New paradigm to simplify programming . instead of lock-unlock, use transaction begin-end . locks are blocking, transactions execute speculatively in the hope that there will be no conflicts • Can yield better performance; Eliminates deadlocks • Programmer can freely encapsulate code sections within transactions and not worry about the impact on performance and correctness (for the most part) • Programmer specifies the code sections they’d like to see execute atomically – the hardware takes care of the rest (provides illusion of atomicity) 2 Transactions • Transactional semantics: . when a transaction executes, it is as if the rest of the system is suspended and the transaction is in isolation . the reads and writes of a transaction happen as if they are all a single atomic operation . if the above conditions are not met, the transaction fails to commit (abort) and tries again transaction begin read shared variables arithmetic write shared variables transaction end 3 Example Producer-consumer relationships – producers place tasks at the tail of a work-queue and consumers pull tasks out of the head Enqueue Dequeue transaction begin transaction begin if (tail == NULL) if (head->next == NULL) update head and tail update head and tail else else update tail update head transaction end transaction end With locks, neither thread can proceed in parallel since head/tail may be updated – with transactions, enqueue and dequeue can proceed in parallel
    [Show full text]
  • IBM Aspera for Microsoft Sharepoint Increasing the Value of Sharepoint with Fast, Secure Transfers of Very Large Files and Data Sets
    IBM Software Data sheet IBM Aspera for Microsoft SharePoint Increasing the value of SharePoint with fast, secure transfers of very large files and data sets High-speed transfer for SharePoint Key benefits Microsoft SharePoint has been adopted by companies worldwide for knowledge capture, collaboration, and document storage. Many of • Fast uploads and downloads for files and these organizations have been unable to take full advantage of data sets of any type and size into SharePoint regardless of distance or SharePoint because of its file size and repository size limitations, and network conditions poor performance transferring large files and data sets over wide area networks. • Eliminates SharePoint 2GB file size limit and 4TB content repository limit IBM® Aspera® for Microsoft SharePoint is for organizations who need • Reliable and predictable transfers with to quickly, predictably, and securely store and access high volumes of resume in case of interruption large files in SharePoint. • Secure transfers with encryption over the wire and at rest This solution is designed to seamlessly integrate the patented FASP® • Flexible deployment, supports storing file transfer technology into SharePoint document upload and download data on-premises and in the cloud workflows. Using Aspera, customers can not only overcome the file size and repository size limitations of SharePoint, they also can transfer files into and out of SharePoint at much higher speeds with auditable and predictable results. Expanding the value of SharePoint Aspera for Microsoft SharePoint enables users to take advantage of SharePoint document management capabilities for files that may be distributed on multiple file systems. That means users can upload large video files, imagery files, laser scan data, and other large files leveraging SharePoint document library structures and metadata for organization and search.
    [Show full text]
  • IBM Managed Services for Azure Using Integrated Managed Infrastructure (IMI) for Azure
    IBM Managed Services for Azure Using Integrated Managed Infrastructure (IMI) for Azure Many clients that are ready to utilize Microsoft Azure for their workloads lack the technical expertise to manage this environment or simply want to focus on their core business IMI Services include: and leave the management of the Azure platform to Plan & Design someone else. Build & Migrate IBM’s Integrated Managed Infrastructure (IMI) Services Lifecycle Management for Azure provides a portfolio of modular solutions to Standards Compliance design, build, migrate, and run/manage Azure Cost Optimization infrastructure services across enterprise Hybrid IT. Plus, IBM can rapidly scale the supporting infrastructure up or down, based on your requirements. IBM Complementary Services Resiliency – D i s a s t e r R e c o v e r y Cloud Migration Services Integrated Managed Infrastructure for Azure at-a-glance Managed Security Services IMI is a cost-effective, modular and catalog-based managed service for Azure. Cost & Asset Management Whether you need assistance with a few specific tasks or want more robust outsourcing solutions, you can use IBM’s rich catalog of modular services to select and pay for only Provisioning and what you need. Orchestration IBM Services Platform with ❑ Consumption based pricing - Managed ❑ IBM Services Platform (Watson) - Platform Watson for Cognitive Insi g h t Services pricing that aligns to cloud pay driven approach powered by Watson for as you go consumption models cognitive analytics and machine learning ❑ Pre-integrated tooling
    [Show full text]
  • Practical Enclave Malware with Intel SGX
    Practical Enclave Malware with Intel SGX Michael Schwarz, Samuel Weiser, Daniel Gruss Graz University of Technology Abstract. Modern CPU architectures offer strong isolation guarantees towards user applications in the form of enclaves. However, Intel's threat model for SGX assumes fully trusted enclaves and there doubt about how realistic this is. In particular, it is unclear to what extent enclave malware could harm a system. In this work, we practically demonstrate the first enclave malware which fully and stealthily impersonates its host application. Together with poorly-deployed application isolation on per- sonal computers, such malware can not only steal or encrypt documents for extortion but also act on the user's behalf, e.g., send phishing emails or mount denial-of-service attacks. Our SGX-ROP attack uses new TSX- based memory-disclosure primitive and a write-anything-anywhere prim- itive to construct a code-reuse attack from within an enclave which is then inadvertently executed by the host application. With SGX-ROP, we bypass ASLR, stack canaries, and address sanitizer. We demonstrate that instead of protecting users from harm, SGX currently poses a se- curity threat, facilitating so-called super-malware with ready-to-hit ex- ploits. With our results, we demystify the enclave malware threat and lay ground for future research on defenses against enclave malware. Keywords: Intel SGX, Trusted Execution Environments, Malware 1 Introduction Software isolation is a long-standing challenge in system security, especially if parts of the system are considered vulnerable, compromised, or malicious [24]. Recent isolated-execution technology such as Intel SGX [23] can shield software modules via hardware protected enclaves even from privileged kernel malware.
    [Show full text]
  • Taming Transactions: Towards Hardware-Assisted Control Flow Integrity Using Transactional Memory
    Taming Transactions: Towards Hardware-Assisted Control Flow Integrity Using Transactional Memory B Marius Muench1( ), Fabio Pagani1, Yan Shoshitaishvili2, Christopher Kruegel2, Giovanni Vigna2, and Davide Balzarotti1 1 Eurecom, Sophia Antipolis, France {marius.muench,fabio.pagani,davide.balzarotti}@eurecom.fr 2 University of California, Santa Barbara, USA {yans,chris,vigna}@cs.ucsb.edu Abstract. Control Flow Integrity (CFI) is a promising defense tech- nique against code-reuse attacks. While proposals to use hardware fea- tures to support CFI already exist, there is still a growing demand for an architectural CFI support on commodity hardware. To tackle this problem, in this paper we demonstrate that the Transactional Synchro- nization Extensions (TSX) recently introduced by Intel in the x86-64 instruction set can be used to support CFI. The main idea of our approach is to map control flow transitions into transactions. This way, violations of the intended control flow graphs would then trigger transactional aborts, which constitutes the core of our TSX-based CFI solution. To prove the feasibility of our technique, we designed and implemented two coarse-grained CFI proof-of-concept implementations using the new TSX features. In particular, we show how hardware-supported transactions can be used to enforce both loose CFI (which does not need to extract the control flow graph in advance) and strict CFI (which requires pre-computed labels to achieve a better pre- cision). All solutions are based on a compile-time instrumentation. We evaluate the effectiveness and overhead of our implementations to demonstrate that a TSX-based implementation contains useful concepts for architectural control flow integrity support.
    [Show full text]