<<

USO09514285B2

(12) United States Patent (10) Patent No.: US 9,514,285 B2 Durham et al. (45) Date of Patent: Dec. 6, 2016

(54) CREATING STACK POSITION DEPENDENT (56) References Cited CRYPTOGRAPHC RETURN ADDRESS TO MTGATE RETURN ORIENTED U.S. PATENT DOCUMENTS PROGRAMMING ATTACKS 7.853,803 B2 * 12/2010 Milliken ...... GO6F 21/52 713, 176 (71) Applicant: Intel Corporation, Santa Clara, CA 2014/0173293 A1* 6/2014 Kaplan ...... G06F 21.54 (US) T13, 190 (72) Inventors: David M. Durham, Beaverton, OR OTHER PUBLICATIONS (US); Baiju V. Patel, Portland, OR “Disk theory,” available at http://en.wikipedia.org/wiki/ (US) XEX-TCB-CTS, accessed Dec. 29, 2014, 6 pages. “The Heartbleed bug, available at http://heartbleed.com/, accessed (73) Assignee: Intel Corporation, Santa Clara, CA Dec. 29, 2014, 7 pages. (US) “OpenSSL.” available at http://en.wikipedia.org/wiki/OpenSSL, accessed Dec. 29, 2014, 13 pages. (*) Notice: Subject to any disclaimer, the term of this “Return-oriented programming,' available at http://en.wikipedia. patent is extended or adjusted under 35 org/wiki/Return-oriented programming, accessed Dec. 29, 2014, 4 pageS. U.S.C. 154(b) by 0 days. “Buffer overflow,” available at http://en.wikipedia.org/wiki/Buf (21) Appl. No.: 14/498,521 fer overflow, accessed Dec. 29, 2014, 12 pages. (22) Filed: Sep. 26, 2014 * cited by examiner Primary Examiner — Jacob Lipman (65) Prior Publication Data (74) Attorney, Agent, or Firm — Barnes & Thornburg US 2016/0094.552 A1 Mar. 31, 2016 LLP (51) Int. C. (57) ABSTRACT G6F 2/54 (2013.01) A computing device includes technologies for securing G06F2L/00 (2013.01) return addresses that are used by a processor to control the (52) U.S. C. flow of execution of a program. The computing device uses CPC ...... G06F 21/00 (2013.01) a cryptographic algorithm to provide security for a return (58) Field of Classification Search address in a manner that binds the return address to a CPC ...... H04L 63/0876; G06F 21/54 location in a stack. USPC ...... 713/171, 190 See application file for complete search history. 23 Claims, 7 Drawing Sheets

COMPUTING DEWICE 100

USER SPACE APPLICATION 134

SUBROUTINE rTURNRCM RETURN FROM CALL 202 SUBROUTINE 210 SUROUTINE RFAULT216 prwileged system CoMomeNT 142 RETURN ADDRESS FAULTHANDLER MODULE 48

SECURE CONTROL TRANSFER NON-CANONICAL NSTRUCTIONS 104 FAULTING ARSS 24 SECURE CALL SECURE RETURN LOGIC 108 LOGIC 108 RTURN RETURN MODIFIED MODIFIED ADDRESS ADDRESS RETURN RTURN CESCRAMBLER 24 ADDRESS208 ARESS 206 LOGIC 110

MODIFIED MODIFIED RETURN AddRSS2O6 RTUN RTURN ARSS 206 ARESS 204 CALL STACK218

RETURNADRESS SECURITY LOGC 138

STACKPOSITIONIDENTIFIER(S) 114 scrTKY 16 U.S. Patent Dec. 6, 2016 Sheet 1 of 7 US 9,514,285 B2

COMPUTING DEVICE 100

PROCESSOR 102

SECURE CONTROL TRANSFER INSTRUCTIONS 104

SECURE CALL LOGIC 106 RETURN ADDRESS SECURITY LOGIC 136

SECURE RETURN LOGIC 108 CRYPTOGRAPHC LOGIC 138

RETURN ADDRESS ADDRESSPECATION DESCRAMBLINGLOGIC 110

REGISTERS 112

STACKPOSITION IDENTIFIER 114 SECRET 1 16

fC) SUBSYSTEM 124 MEMORY 122

PRIWILEGED SYSTEM COMPONENT 142 DATA STORAGE DEVICE 126 MEMORY MANAGER DISPLAY DEVICE 128 MODULE 144 KEY CREATIONMODULE 146 USUBSYSTEM 130 RETURN ADDRESS FAULT COMM SUBSYSTEM 132 HANDLERMODULE 148 INTERRUPT HANDLER USER SPACE APPLICATION 134 MODULE 150

FIG. 1 U.S. Patent Dec. 6, 2016 Sheet 2 of 7 US 9,514,285 B2

200

COMPUTING DEVICE 100

USER SPACE APPLICATION 134

SUBROUTINE CALL 202 RETURN FROM RETURN FROM SUBROUTINE 210 SUBROUTINE ORFAULT216

PRIVILEGED SYSTEM COMPONENT 142 RETURN ADDRESS FAULTHANDLER MODULE 148

SECURE CONTROL TRANSFER NON-CANONICAL INSTRUCTIONS 104 FAULTING ADDRESS 214

SECURE CALL SECURE RETURN LOGIC 106 LOGIC 108 RETURN RETURN MODIFIED MODIFIED ADDRESS ADDRESS RETURN RETURN DESCRAMBLER 204 ADDRESS 206 ADDRESS 206 LOGIC 110

MODIFIED MODIFIED RETURN RETURN ADDRESS 206 ADDRESS 206 RETURN ADDRESS 204 CALL STACK218

RETURN ADDRESS SECURITY LOGIC 136

STACK POSITION IDENTIFIER(S) 114 SECRET KEY 1 16

FIG. 2 U.S. Patent Dec. 6, 2016 Sheet 3 of 7 US 9,514,285 B2

300

FFFFFFFF t FFFFFFFF USED/CANONICAL ADDRESSESl (SYSTEM SPACE) 310 FFFF 80000 OOOOOOOO

MODIFIED RETURN ADDRESS 206

UNUSED/NON-CANONICAL ADDRESSES 312

0000 7FFFF FFFFFFFF USED/CANONICAL ADDRESSESl (USER SPACE) 314 OOOOOOOO OOOOOOOO

FIG. 3 U.S. Patent Dec. 6, 2016 Sheet 4 of 7 US 9,514,285 B2

M 400

410 CONTROL TRANSFER INSTRW RTN ADDRESS

PRE-PROCESS THE RETURN ADDRESS 412 AS NEEDED BY ENCRYPTIONALGORITHM

414 ENCRYPT THE RETURN ADDRESS USING SECRET KEY AND STACK POSITION IDENTIFIER AS A TWEAK

416 MANIPULATE THE ENCRYPTED RETURN ADDRESS TO FALL WITHIN AN UNUSED NON-CANONICAL ADDRESS RANGE

418 STORE THE MODIFIED RETURN ADDRESS ON THE CALL STACK

420 READ THE NEXT INSTRUCTION

FIG. 4 U.S. Patent Dec. 6, 2016 Sheet S of 7 US 9,514,285 B2

M 500

- -TRANSFER INSTR THAT USES RTN ADDRESSP

512 OBTAIN MODIFIED RETURN ADDRESS FROM CALL STACK

514 DOES RTN ADDRESS FALL WITHIN AN UNUSED NON-CANONICAL TO FIG. 6

RANGEP

YES 516 DECRYPT THE RETURN ADDRESS USING STACK POSITION IDENTIFIER AS A TWEAK

518 RESTORE THE DECRYPTED RETURN ADDRESS TO ORIGINAL FORM

520 JUMP TO RETURN ADDRESS

FIG. 5 U.S. Patent Dec. 6, 2016 Sheet 6 of 7 US 9,514,285 B2

M 600

614

SFAULTING ADDRESS NON-CANONICAL TREAT ASFAULT

YES 616 DECRYPT THE RETURN ADDRESS USING STACK POSITION IDENTIFIER AS A TWEAK

618 VERIFY DECRYPTED RETURN ADDRESS AND/OR CALLING CODE

620 TRANSFER CONTROL TO RETURN ADDRESS

F.G. 6 U.S. Patent Dec. 6, 2016 Sheet 7 Of 7 US 9,514,285 B2

710 ------4------LOOP - COUNTER MODE 712

716 |

PARALLEL EXECUTION

718 - 4- - - - - LOOP - NORMAL

720 PROCESSOR PIPELINE CONTROL TRANSFER INSTRW RTN ADDRESS

YES

722 OBTAIN CURRENT ENCRYPTED STACK POSITION IDENTIFIER

724 COMBINE CURRENT ENCRYPTED STACK POSITION IDNETFER WITH RETURN ADDRESS TO CREATE MODIFIED RETURN ADDRESS

726 STORE THE MODIFIED RETURN ADDRESS ON THE CALL STACK US 9,514,285 B2 1. 2 CREATING STACK POSITION DEPENDENT figures. For simplicity and clarity of illustration, elements CRYPTOGRAPHC RETURN ADDRESS TO illustrated in the figures are not necessarily drawn to scale. MTIGATE RETURN ORIENTED Where considered appropriate, reference labels have been PROGRAMMING ATTACKS repeated among the figures to indicate corresponding or analogous elements. BACKGROUND FIG. 1 is a simplified block diagram of at least one When a software application runs on a computing device, embodiment of a computing device configured with secure a processor executes machine-level instructions into which control transfer instructions as disclosed herein; high level programming code of the application has been FIG. 2 is a simplified environment diagram of the com translated (e.g., by a compiler and/or an interpreter, etc.). 10 puting device of FIG. 1, illustrating an application of the The pre-defined set of machine-level instructions that a secure control transfer instructions of FIG. 1; particular processor can execute is the processors instruc FIG. 3 is a simplified schematic diagram of a range of tion set. The processor typically fetches from memory, memory addresses including used and unused addresses; sequentially, the machine-level instructions corresponding FIG. 4 is a simplified flow diagram of at least one to the functionality of a Software application, and then 15 embodiment of a method for providing security for a control executes the instructions sequentially. transfer instruction as disclosed herein, which may be However, the processor may encounter an instruction that executed by the computing device of FIG. 1; transfers control of the program execution to another FIG. 5 is a simplified flow diagram of at least one instruction or set of instructions stored in another memory embodiment of a method for providing security for another location. For example, a call instruction causes the processor control transfer instruction as disclosed herein, which may to transfer control of the program execution to a Subroutine, be executed by the computing device of FIG. 1; Such as a function or a procedure. In this case, the called FIG. 6 is a simplified flow diagram of at least one Subroutine will have among its instructions at least one embodiment of a method for verifying a previously secured return (“ret') instruction. The return instruction causes the control transfer instruction as disclosed herein, which may processor to return control back to the instruction that 25 immediately follows the call instruction in the normal be executed by the computing device of FIG. 1; and sequence of instructions. The memory location of the FIG. 7 is a simplified flow diagram of at least one instruction to which control should return after a call instruc embodiment of another method for providing security for a tion may be referred to as the return address. Some examples control transfer instruction as disclosed herein, which may of control transfer instructions include call, jump (imp”), be executed by the computing device of FIG. 1. interrupt (“int”), and return from interrupt (“iret'). 30 The processor uses a call stack data structure to keep track DETAILED DESCRIPTION OF THE DRAWINGS of the program control flow when a call instruction is encountered. To do this, the call instruction pushes the return While the concepts of the disclosure are suscep address onto the top of the stack. A stack pointer is used to tible to various modifications and alternative forms, specific keep track of the memory location of the top of the stack. 35 embodiments thereof have been shown by way of example The stack pointer is stored in a processor register. The return in the drawings and will be described herein in detail. It instruction uses the stack pointer to retrieve the return should be understood, however, that there is no intent to address from the top of the stack. The return instruction limit the concepts of the present disclosure to the particular transfers control to the instruction found at the memory forms disclosed, but on the contrary, the intention is to cover location of the return address. 40 all modifications, equivalents, and alternatives consistent Return-oriented programming (ROP) refers to a category with the present disclosure and the appended claims. of computer security exploits that allow an attacker to take References in the specification to “one embodiment,” “an control of the call stack and redirect the program flow to embodiment,” “an illustrative embodiment,' etc., indicate malicious code. A ROP attack may be enabled by a buffer that the embodiment described may include a particular overflow. A buffer overflow may occur, for example, as a 45 feature, structure, or characteristic, but every embodiment result of a bug in an application program. When the buffer may or may not necessarily include that particular feature, overflow occurs, the return address on the call stack may be structure, or characteristic. Moreover, such phrases are not overwritten with a different memory address. When this necessarily referring to the same embodiment. Further, when occurs, program flow can be diverted to an address at which a particular feature, structure, or characteristic is described an attacker's code (which may be referred to as a "gadget) 50 in connection with an embodiment, it is submitted that it is is located. Gadgets can be chained together to perform a within the knowledge of one skilled in the art to effect such series of functions that provide useful information to the feature, structure, or characteristic in connection with other attacker. Another example of a ROP attack is the stack embodiments whether or not explicitly described. Addition pivoting exploit, which exposed Vulnerabilities in some ally, it should be appreciated that items included in a list in versions of the ADOBE ACROBAT and ADOBE READER 55 the form of “at least one A, B, and C can mean (A); (B): Software applications that allowed the stack pointer register (C); (A and B); (B and C); (A and C); or (A, B, and C). to be manipulated by an adversary. This allowed the adver Similarly, items listed in the form of “at least one of A, B, sary to redirect the Stack location in memory to a portion of or C can mean (A); (B); (C); (A and B); (B and C); (A and the memory heap controlled by the adversary, effectively C); or (A, B, and C). creating an alternative malicious stack. This attack was 60 The disclosed embodiments may be implemented, in designed to trick users into clicking on a malicious PDF file Some cases, in hardware, firmware, Software, or any com delivered in an email message. bination thereof. The disclosed embodiments may also be implemented as instructions carried by or stored on a BRIEF DESCRIPTION OF THE DRAWINGS transitory or non-transitory machine-readable (e.g., com 65 puter-readable) storage medium, which may be read and The concepts described herein are illustrated by way of executed by one or more processors. A machine-readable example and not by way of limitation in the accompanying storage medium may be embodied as any storage device, US 9,514,285 B2 3 4 mechanism, or other physical structure for storing or trans location, Such as in firmware, in a secure portion of the data mitting information in a form readable by a machine (e.g., storage device 126 or another data storage device, or another a volatile or non-volatile memory, a media disc, or other form of memory suitable for performing the functions media device). described herein. In some embodiments, the secret key 116 In the drawings, some structural or method features may may be transmitted across a secure communications channel be shown in specific arrangements and/or orderings. How and restored by an executive (such as an operating system or ever, it should be appreciated that such specific arrange a virtual machine monitor, e.g., the privileged system com ments and/or orderings may not be required. Rather, in some ponent 142 described below). In virtualized environments in embodiments, such features may be arranged in a different which virtual machines are migrated from one machine to manner and/or order than shown in the illustrative figures. 10 another, and/or in cases in which a virtual machine, process Additionally, the inclusion of a structural or method feature or program running on the computing device 100 begins a in a particular figure is not meant to imply that Such feature sleeping/hibernating mode after a return address is secured is required in all embodiments and, in some embodiments, using the secret key 116, and then later resumes, the secret may not be included or may be combined with other key 116 will need to be recovered and restored. In these features. 15 cases, the secret key 116 can be stored or possibly trans Referring now to FIG. 1, an embodiment of a computing mitted across a (secure) communications channel prior to a device 100 includes a processor 102 and a set of secure sleeping/hibernating mode, and then retrieved/restored by an control transfer instructions 104. The illustrative secure executive (such as an operating system or a virtual machine control transfer instructions 104 are embodied as, for monitor, e.g., the privileged system component 142). example, processor instructions (e.g., in hardware, as part of The privileged system component 142 may be embodied the processor instruction set architecture), or microcode as, for example, an operating system kernel or a privileged (e.g., instructions that are stored in read-only memory and component of a virtual machine monitor (e.g., an “execu executed directly by the processor 102). In other embodi tive”). If the secure control transfer instructions 104 are not ments, portions of the secure control transfer instructions able to properly verify a return address, a fault is raised. A 104 may be embodied as hardware, firmware, software, or a 25 return address fault handler module 148 of the privileged combination thereof (e.g., as programming code executed by system component 142 handles the fault by, for example, a privileged system component 142 of the computing device causing execution of the program to stop or taking some 100). The secure control transfer instructions 104 are execut other action to prevent an attacker's code from exploiting a able by the computing device 100 to provide security for security Vulnerability in the running application. In this way, return addresses "inline, e.g., during execution of a pro 30 the secure control transfer instructions 104 enable the com gram (such as a user space Software application) by the puting device 100 to provide return address security against computing device 100. As used herein, "return address” may return oriented programming attacks and similar exploits. In refer to, among other things, a location in memory (e.g., a comparison to other approaches, embodiments of the dis memory address) of an instruction following a “call’ or a closed return address security technologies can operate similar control transfer instruction. The return address indi 35 without adding any additional instructions, memory protec cates the location of the instruction to which program tions, or memory requirements (e.g., no shadow stack or execution is returned by a “return” or similar control transfer canary values are needed), and without adding any addi instruction. The return address may be referred to by other tional memory reads or writes. Additionally, some embodi terminology, such as “return vector, in some embodiments. ments of the disclosed technologies utilize aspects of the As used herein, “control transfer instruction” may refer to, 40 address modification logic 140 and the return address fault among other things, a call, return, jump, interrupt, return handler module 148 to Support legacy code compatibility, as from interrupt, or similar instruction that causes the proces described below. As used herein, “legacy code' may refer to Sor 102 to direct program execution to another instruction or a version of computer code that was designed to work on an group of instructions stored in another portion of memory earlier, or now-obsolete, or no-longer-supported computer (e.g., a Subroutine, procedure, function, method or program, 45 architecture. For example, legacy code may include software or more generally, a “callable unit”). that was originally developed for a 32-bit processor, but The secure control transfer instructions 104 include which is now running on a 64-bit processor. secure call logic 106, secure return logic 108, and return Referring now in more detail to FIG. 1, the computing address descrambling logic 110, each of which utilizes a device 100 may be embodied as any type of electronic stack position identifier 114, a secret key 116, and return 50 device for performing the functions described herein. For address security logic 136 (including cryptographic logic example, the computing device 100 may be embodied as, 138 and address modification logic 140), to secure and without limitation, a Smart phone, a tablet computer, a verify return addresses at the control transfer instruction wearable computing device, a laptop computer, a notebook level, as described in more detail below. While shown as a computer, a mobile computing device, a cellular telephone, separate block in FIG. 1, it should be understood that 55 a handset, a messaging device, a vehicle telematics device, portions of the return address security logic 136 may be a server computer, a workstation, a distributed computing embodied directly in the secure call logic 106, the secure system, a multiprocessor System, a consumer electronic return logic 108, and the return address descrambling logic device, and/or any other computing device configured to 110. perform the functions described herein. As shown in FIG. 1, The illustrative secret key 116 is generated by a key 60 the illustrative computing device 100 includes at least one creation module 146 of a privileged system component 142 processor 102 embodied with the secure control transfer and stored in a memory location that is readable by the instructions 104. processor 102 (e.g., a special purpose register or machine As described in more detail below, the illustrative secure specific register (MSR)). In some embodiments, the secret call logic 106 performs security computations for return key 116 may be stored in a location that is readable only by 65 addresses before storing the return addresses on the call the processor. In other embodiments, the secret key 116 used stack 218. To do this, the secure call logic 106 executes a to secure return addresses can be stored in another memory portion of cryptographic logic 138 using at least the stack US 9,514,285 B2 5 6 position identifier 114 and the secret key 116 as inputs. The encrypted portion of the return address. If the return address cryptographic logic 138 executes any cryptographic algo is not decrypted Successfully, a random memory address rithm capable of performing the functions described herein. results, which value is uncontrollable by the adversary, and For example, the cryptographic logic 138 may be imple likely raising a fault or causing a program crash. If the return mented using a message authentication algorithm, an exclu address is successfully decrypted, the secure return logic 108 sive-OR (XOR) operation, a hash algorithm, or an encryp returns program execution to the instruction located at the tion algorithm, including a lightweight such as return address and program execution continues normally. If , or , or a strong encryption the secure call logic 106 further modifies the return address algorithm such as AES. Some embodiments of the secure by manipulating the return address to fall within an unused call logic 106 also use the return address itself as an input to 10 range of memory, the secure return logic 108 manipulates the cryptographic logic 138, in addition to the stack position the decrypted return address to restore the return address to identifier 114 and the secret key 116. For example, some its original form and then returns program execution to the embodiments of the secure call logic 106 encrypt the return instruction located at the return address, and program execu address using the stack position identifier 114 as a tweak for tion continues normally. The return address descrambling a tweakable block cipher, such as XOR-encrypt-XOR 15 logic 110 handles the legacy code situation in which a fault (XEX), Liskov, Rivest, and Wagner (LRW), or XEX-based is raised because legacy code has tried to jump to a return tweaked-codebook mode with stealing (XTS). address that has been cryptographically modified (e.g., as a Other embodiments of the secure call logic 106 encrypt the result of legacy code using a jump instruction instead of a stack position identifier 114 and then combine the encrypted return). When this situation occurs, the jump instruction stack position identifier 114 with the return address (using, naturally causes a fault, because the return address is invalid e.g., XOR or other methods as used with counter mode). (e.g., the return address falls within a faulting memory range Some embodiments of the secure call logic 106 further or a non-canonical address range). As a result, the return modify the return address by executing the address modifi address descrambling logic 110 checks to see if the return cation logic 140. The address modification logic 140 address that caused the fault falls within the designated manipulates at least a portion of the return address (or the 25 unused or non-canonical range of memory, and treats the cryptographically-processed return address, as the case may fault as a “real' fault if the return address that caused the be) so that the return address (or the cryptographically fault does not fall within the unused/non-canonical range. In processed return address, as the case may be) falls within an cases in which legacy code uses a return vector to access unused or non-canonical memory region of virtual memory. memory (e.g., data accesses such as with a MOV instruc Manipulating the return address in this way acts as a “signal” 30 tion), the legacy code obtains a return address from the to other instructions (such as the secure return logic 108 and stack. The legacy code therefore knows the relative location the return address descrambling logic 110) that the return of data in the compiled program (relative to a call instruc address has been secured, rather than corrupted or otherwise tion). Accordingly, the legacy code may use the position of compromised, e.g., by an attacker's code. Depending on the the data in the compiled program (relative to the call cryptographic method used by the secure call logic 106 to 35 instruction) as a relative offset to the data, e.g., by adding or secure the return address, the secure call logic 106 outputs subtracting a known offset to the return vector address. In either an integrity check value (such as a message authen this way, legacy code can use the return address for both tication code or a secure hash) or a modified version of the control flow changes and data accesses. However, in some return address. However, regardless of which cryptographic cases, if the stack pointer is not current, a fault handler may technique is used, the output of the secure call logic 106 40 have to scan the stack searching for the modified return produces a return address that is bound to a specific location address value to determine the stack location, if the address (e.g., a specific address or range of memory addresses) in the value, or portion thereof, needs to be properly decrypted. call stack 218. As a result, an adversary's attempt to move Referring still to FIG. 1, the computing device 100 also the return address to another location in the linear address includes memory 122, an input/output Subsystem 124, a data space (e.g., by a stack pivot or maliciously constructed 45 storage device 126, a display device 128, a user interface alternative stack) will not be successful. (UI) subsystem 130, a communication subsystem 132, at The secure return logic 108 obtains return addresses from least one user space application 134, and the privileged the call stack 218 and verifies them before returning pro system component 142 (which, illustratively, includes a gram execution to the instruction stored at the return memory manager module 144, the key creation module 146. address. The implementation of the secure return logic 108 50 the return address fault handler module 148, and an interrupt depends on the method used by the secure call logic to handler module 150). The computing device 100 may secure the return address. For example, if the secure call include other or additional components, such as those com logic 106 secures the return address by storing an integrity monly found in a mobile and/or stationary computers (e.g., check value in an unused memory location, the secure return various sensors and input/output devices), in other embodi logic 108 verifies the return address obtained from the call 55 ments. Additionally, in Some embodiments, one or more of stack 218 by recomputing the integrity check value and the illustrative components may be incorporated in, or comparing the current integrity check value with the integ otherwise form a portion of another component. Each of the rity check value previously computed by the secure call components of the computing device 100 may be embodied logic 106. If the two integrity check values do not match, the as Software, firmware, hardware, or a combination of Soft return address is not successfully verified by the secure 60 ware and hardware. return logic 108 and a fault is raised. If the integrity check The processor 102 may be embodied as any type of values match, the secure return logic 108 returns program processor capable of performing the functions described execution to the instruction located at the return address and herein. For example, the processor 102 may be embodied as program execution continues normally. a multi-core processor or other multiple-CPU processor or If the secure call logic 106 secures the return address by 65 processing/controlling circuit. The processor 102 has a num encrypting at least a portion of the return address, the secure ber of registers 112, which include general purpose registers return logic 108 verifies the return address by decrypting the and special purpose registers. The stack position identifier US 9,514,285 B2 7 8 114 and the secret key 116 are stored in registers 112 (e.g., computer application (e.g., Software, firmware, hardware, or special purpose registers). The memory 122 of the comput a combination thereof) that interacts directly or indirectly ing device 100 may be embodied as any type of volatile or with an end user via, for example, the display device 128 or non-volatile memory or data storage capable of performing the UI subsystem 130. Some examples of user space appli the functions described herein. In operation, the memory cations 134 include word processing programs, document 122 may store various data and software used during opera viewers/readers, web browsers, electronic mail programs, tion of the computing device 100, including a call stack 218 messaging Services, computer games, camera and Video as described below, as well as operating systems, applica applications, etc. Among other things, the privileged system tions, programs, libraries, and drivers. component 142 facilitates the communication between the The memory 122 is communicatively coupled to the 10 user space applications 134 and the hardware components of processor 102, e.g., via the I/O subsystem 124. The I/O the computing device 100. Portions of the privileged system Subsystem 124 may be embodied as circuitry and/or com component 142 may be embodied as any operating system ponents to facilitate input/output operations with the pro capable of performing the functions described herein, Such cessor 102, the memory 122, and other components of the as a version of WINDOWS by Microsoft Corporation, computing device 100. For example, the I/O subsystem 124 15 ANDROID by Google, Inc., and/or others. Alternatively or may be embodied as, or otherwise include, memory con in addition, portion of the privileged system component 142 troller hubs, input/output control hubs, firmware devices, may be embodied as any type of virtual machine monitor communication links (i.e., point-to-point links, bus links, capable of performing the functions described herein (e.g., wires, cables, light guides, printed circuit board traces, etc.) a type I or type II hypervisor). and/or other components and Subsystems to facilitate the The illustrative privileged system component 142 input/output operations. In some embodiments, the I/O includes a number of computer program components, such Subsystem 124 may form a portion of a system-on-a-chip as the memory manager module 144, the key creation (SoC) and be incorporated, along with the processor 102, the module 146, the return address fault handler module 148, memory 122, and/or other components of the computing and an interrupt handler module 150. Each of the compo device 100, on a single integrated circuit chip. 25 nents of the privileged system component 142 may be The data storage device 126 may be embodied as any type embodied as Software, firmware, hardware, or a combination of physical device or devices configured for short-term or of software and hardware. For example, the components of long-term storage of data such as, for example, memory the privileged system component 142 may be embodied as devices and circuits, memory cards, hard disk drives, Solid modules of an operating system kernel, a virtual machine state drives, flash memory or other read-only memory, 30 monitor, or a hypervisor. memory devices that are combinations of read-only memory The memory manager module 144 allocates portions of and random access memory, or other data storage devices. memory 122 to the various processes running on the com The display device 128 may be embodied as any type of puting device 100 (e.g., as ranges of virtual memory display capable of displaying digital information Such as a addresses). As part of the return address security technolo liquid crystal display (LCD), a light emitting diode (LED), 35 gies disclosed herein, some embodiments of the memory a plasma display, a cathode ray tube (CRT), or other type of manager module 144 execute directives to declare portions display device. In some embodiments, the display device of linear memory as “unused (e.g., so no executable code 128 may be coupled to a touch screen or other human can be stored in that linear range). In some embodiments, computer interface device to allow user interaction with the Such as processors having greater than 32-bit register sizes, computing device 100. The display device 128 may be part 40 unused memory ranges are referred to as “non-canonical of the user interface (UI) subsystem 130. The user interface ranges of memory, while “used’ portions are referred to as subsystem 130 may include a number of additional devices “canonical ranges. The secure control transfer instructions to facilitate user interaction with the computing device 100, 104 can utilize the unused/non-canonical ranges of memory including physical or virtual control buttons or keys, a to, for example, store and retrieve integrity check values microphone, a speaker, a unidirectional orbidirectional still 45 (such as message authentication codes or hash values) or to and/or video camera, and/or others. The user interface store and retrieve modified versions of return addresses Subsystem 130 may also include devices, such as motion pushed onto a call stack (e.g., call Stack 218, shown in FIG. sensors, proximity sensors, and eye tracking devices, which 2). As described in more detail below, modified versions of may be configured to detect, capture, and process various return addresses include encrypted or otherwise obfuscated other forms of human interactions involving the computing 50 versions of the return addresses. device 100. The key creation module 146 creates the secret key 116 The computing device 100 further includes a communi and writes it to a register to which the processor 102 has read cation subsystem 132, which may be embodied as any access (e.g., a special purpose register). To create the secret communication circuit, device, or collection thereof, capable key 116, the key creation module 146 may execute, for of enabling communications between the computing device 55 example, a random number generator or another algorithm 100 and other electronic devices. The communication sub capable of generating a secret key that can perform the system 132 may be configured to use any one or more functions described herein. The return address fault handler communication technology (e.g., wireless or wired commu module 148 is used in some embodiments to handle faults nications) and associated protocols (e.g., Ethernet, Blu that occur as a result of a failed verification of the return etooth R., Wi-Fi R, WiMAX, 3G/LTE, etc.) to effect such 60 address by the secure return logic 108. The illustrative return communication. The communication Subsystem 132 may be address fault handler module 148 includes logic that allows embodied as a network adapter, including a wireless net the secure control transfer instructions 104 to accommodate work adapter. legacy code, as described below. An example of a legacy The illustrative computing device 100 also includes a code situation that can be handled by the return address fault number of computer program components, such as the user 65 handler module 148 is where the legacy code issues a call (or space application 134 and the privileged system component similar) instruction, but then uses a jump (or similar) instruc 142. The user space application 134 may be embodied as any tion instead of a return instruction. In this situation, the US 9,514,285 B2 10 secure return logic 108 is bypassed by the jump instruction the address 214 is within the unused/non-canonical range, and thus, unable to decode the modified return address. As the return address descrambling logic 110 verifies the modi a result, a page fault may occur. In response to the page fault, fied return address 206 and causes the processor 102 to the return address fault handler module 148 executes return return from the subroutine and proceed to the next instruc address descrambling logic 110 to verify that the address to tion (if the verification is successful), or generate a fault (if which the legacy code is attempting to jump is valid. The the verification is not successful) (216). interrupt handler module 150 may be embodied as, for As discussed above, there are a number of different example, a standard interrupt handler. The interrupt handler techniques that the secure control transfer instructions 104 module 150 handles faults when the return address fault can use to secure the return address 204 and verify the handler module 148 is unable to verify the return address 10 modified return address 206. In general, these various tech causing the fault (e.g., the fault is not the result of a legacy niques generate security data using one or more stack code implementation issue). position identifiers 114 and a secret key 116 as inputs to a FIG. 3 illustrates a range of memory addresses 300 that cryptographic algorithm (e.g., cryptographic logic 138), and may be allocated by the memory manager module 144, with store the security data in a memory location that is readable a modified return address 206 stored in an unused or 15 by the processor 102. FIGS. 4 and 5, described below, relate “non-canonical range 312 within the address range 300. to embodiments in which the return address is encrypted by The illustrative address range 300 also includes used or a tweakable block cipher using the stack position identifier “canonical ranges 310 and 314, where the range 310 is 114 as the tweak. FIG. 6, described below, relates to the declared used by system space programs (such as an oper disclosed technology for providing legacy code compatibil ating system or virtual machine monitor), and the range 314 ity. FIG. 7, described below, relates to “counter mode' is declared used by user space programs (such as interactive embodiments in which the return address is cryptographi Software applications). cally combined with an encrypted Stack position identifier Referring now to FIG. 2, in some embodiments, the 114. The counter mode embodiments can encrypt the stack computing device 100 establishes an environment 200 dur position identifier in parallel with the processor's instruction ing operation (e.g., native and/or virtual runtime or “execu 25 processing pipeline, e.g., to avoid adding latency to the tion” environments). The various modules depicted in the instruction pipeline due to the cryptographic operations. environment 200 may be embodied as hardware, firmware, Referring now to FIG. 4, an example of a method 400 for software, or a combination thereof. In the environment 200, securing a return address is shown. Portions of the method the user space application 134 may, from time to time, 400 may be executed by hardware, firmware, and/or soft during the operation of the computing device 100, issue a 30 ware of the computing device 100 (e.g., by the processor 102 subroutine call 202. The subroutine call 202 may be trans executing the secure call logic 106 and portions of the return lated (e.g., compiled or interpreted), as needed, by the address security logic 136). At block 410, the computing privileged system component 142 before being passed on to device 100 determines whether a control transfer instruction the processor 102. In the processor 102, the secure call logic containing a return address (e.g., as an argument). Such as a 106 of the secure control transfer instructions 104 is 35 Subroutine call, has occurred during execution of a program executed in response to the Subroutine call 202 (e.g., in place by the computing device 100. If no control transfer instruc of conventional “call logic). Whereas conventional call tion with a return address is encountered, the computing logic simply places a return address 204 specified by the device 100 remains in block 410 and evaluates the next subroutine call 202 onto a call stack 218, the secure call instruction. If a control transfer instruction with a return logic 106 executes a portion of the return address security 40 address is encountered, the computing device 100 pre logic 136 to modify, obfuscate or otherwise secure the return processes the return address as needed by the encryption address 204. As shown in FIG. 2, some embodiments of the algorithm being used to encrypt the return address, in block secure call logic 106 create a modified return address 206, 412. The encryption algorithm may be selected according to and then store the modified return address 206 onto the call the requirements of a particular design of the computing stack 218. The illustrative call stack 218 is embodied as a 45 device 100; in the illustrated example, the encryption algo stack (e.g., first-in, first-out or FIFO) data structure in rithm is a tweakable block cipher, such as XTS (XEX-based memory 122. tweaked-codebook mode with ), LRW During execution of the subroutine called by the subrou (Liskov, Rivest, and Wagner), XEX (Xor Encrypt Xor), or tine call 202, the processor 102 may encounter a return the Intel Cascade Cipher in CTR-ECB mode. To pre-process instruction. In response to the return instruction, the proces 50 the return address, the computing device 100 conforms the sor 102 executes the secure return logic 108. Whereas a return address to the block size of the encryption algorithm. conventional return instruction would simply obtain the For example, the computing device 100 may normalize the return address from the call stack 218, some embodiments of return address so that all of the canonical bits of the return the secure return logic 108 utilize a portion of the return address fit within the block size of the tweakable cipher. In address security logic 136 to verify the return address 55 block 414, the computing device encrypts the return address obtained from the call stack 218 (e.g., the modified return using the secret key 116 and using the stack position address 206), before jumping to the instruction located at the identifier 114 as a tweak for the tweakable cipher. As used return address 204. If the secure return logic 108 is able to herein, a "tweak” may refer to, among other things, a second successfully verify the return address 204, the processor 102 input to a block cipher, in addition to the usual plaintext or returns from the subroutine as normal (210). If the secure 60 ciphertext input. The tweak, along with the secret key 116, return logic 108 is unable to verify the return address 204, selects the permutation that is computed by the cipher. In the processor 102 raises a fault 212. In some embodiments, some embodiments, the stack position identifier 114 that is the fault 212 is handled by the return address fault handler used by the encryption algorithm is the stack pointer (which module 148. To handle the fault 212, the return address fault points to the top of the call stack 218). Other embodiments handler module 148 may determine that the address 214 of 65 may use counters that track the number of calls made minus the instruction causing the fault is within the unused or the number of returns in a running program. Other embodi non-canonical memory range (e.g., range 312 of FIG. 3). If ments may use a combination of these values. Portions of the US 9,514,285 B2 11 12 return address that are not modified by encryption by a block the fault is within the unused/non-canonical range of cipher may also be used, possibly in combination with the memory (where the unused/non-canonical range may be previous examples, as a tweak to the cipher. In block 416, defined by, for example, the memory manager module 144). the computing device manipulates the encrypted return If the faulting address is not in the unused/non-canonical address output by block 414 so that the encrypted return 5 range, the computing device 100 treats the event as a “real address falls within an unused or non-canonical range. To do fault (e.g., page fault or general protection fault, as the case this, the computing device 100 sets the top bit of the return may be), in block 614 (which may cause the executing address to a value (e.g., 1) that will cause the address to fall program to crash). If the faulting address is in the unused/ within the unused/non-canonical range). In block 418, the non-canonical range, the computing device 100 decrypts the computing device 100 stores the modified return address 10 return address, using the decryption algorithm counterpart of (e.g., the output of block 416) on the call stack (e.g., the call the encryption algorithm used previously to encrypt the stack 218). In block 420, the computing device 100 reads the address, and using the stack position identifier as a tweak, in next instruction in the program sequence (e.g., the first block 616. In block 618, the computing device 100 verifies instruction of the calling function), and returns to block 410. that the decrypted return address is correct (e.g., the Referring now to FIG. 5, an example of a method 500 for 15 decrypted return address points to the next instruction after verifying a return address is shown. Portions of the method the call instruction), and/or verifies that the calling code is 500 may be executed by hardware, firmware, and/or soft legitimate. If the computing device 100 successfully verifies ware of the computing device 100 (e.g., by the processor 102 the decrypted return address and/or the calling code in block executing the secure return logic 108 and the portions of the 618, then in block 620, the computing device 100 transfers return address security logic 136). In block 510, the com control to the instruction located at the verified return puting device 100 determines whether a control transfer address. To do this, the computing device may issue an "iret instruction that uses a return address (e.g., obtains the return or interrupt return instruction. To perform the decrypting of address from the call Stack 218). Such as a return instruction, the return address, the computing device 100 may obtain the has occurred during execution of a program by the comput secret key 116 from a memory location that is readable by ing device 100. If no control transfer instruction that uses a 25 the processor 102 or from a privileged handler, such as the return address obtained from the call stack 218 is encoun privileged system component 142. tered, the computing device 100 remains in block 510 and While not specifically shown in the drawings, it should be evaluates the next instruction. If a control transfer instruc understood that the “iret' instruction and other similar tion that uses a return address obtained from the call stack instructions may also be modified to accept an encrypted 218 is encountered, the computing device 100 proceeds to 30 return address from the stack and perform similar crypto block 512, where the computing device 100 obtains the graphic operations as the return “ret' instruction described return address from the top of the call stack 218. If the return herein. In the case of iret instructions, interrupts and other address has previously been encrypted by, e.g., secure call asynchronous events causing program control flow that save logic 106 and return address security logic 136, the return the resumption state to the Stack, including a return address address obtained by the computing device in block 512 is the 35 for the “iret' instruction, will instead place on the stack the modified return address (e.g., modified return address 206). cryptographically modified version of the return address In block 514, the computing device 100 determines following logic similar to the “call’ instruction described whether the return address obtained in block 512 falls within herein. an unused or non-canonical range of memory addresses Referring now to FIG. 7, an example of another method (where the unused/non-canonical range may be defined by, 40 700 for securing a return address is shown. Portions of the for example, the memory manager module 144). If the return method 700 may be executed by hardware, firmware, and/or address does not fall within an unused/non-canonical range, software of the computing device 100 (e.g., by the processor the computing device 100 proceeds to the top of the method 102 executing the secure call logic 106 and the return 600 shown in FIG. 6, described below. If the return address address security logic 136). The method 700 is an illustra falls within an unused/non-canonical range, the computing 45 tion of a “counter mode' embodiment of the secure call logic device decrypts the return address using the decryption 106 and return address security logic 136. In the method algorithm counterpart of the encryption algorithm used in 700, stack position identifiers 114 are encrypted in loop 710, block 414 of FIG. 4, and using the stack position identifier while in loop 718, return addresses are encrypted or other 114 as a tweak. In block 518, the computing device restores wise cryptographically combined with the encrypted Stack the decrypted return address to its original, e.g., canonical, 50 position identifiers that are generated by the loop 710. The form. To do this, the computing device “reverses' the loops 710 and 718 can be executed in parallel by the operation conducted in block 416 of FIG. 4. In block 520, processor 102 and a number of the resulting ciphertext can the computing device 100 jumps to the instruction located at be stored in local registers, buffers or near memory. In the the return address of that has been decrypted and “restored loop 710, one or more stack position identifiers 114 for the in blocks 516 and 518, respectively. In block 520, the 55 return addresses are obtained, in block 712. Illustratively, the computing device 100 jumps to the return address output by stack position identifier 114 is the stack pointer (which block 518, loads the instruction located at that address, and points to the top of the call stack 218), such that the returns to block 510. encryption of the stack position identifier 114 can be rep Referring now to FIG. 6, an example of a method 600 for resented by: k2=E (SP), where k1 is the secret key 116, SP handling return address faults is shown. Portions of the 60 is the stack pointer, E is the encryption algorithm, and k2 is method 600 may be executed by hardware, firmware, and/or a secret key that is derived from k1 using the stack pointer. software of the computing device 100 (e.g., by the privileged In the illustrated embodiment, E is a strong cryptographic system component 142 executing the return address fault algorithm Such as AES. Other data or programming objects handler module 148). In block 610, the computing device that can be used as the stack position identifiers 114 include 100 determines whether a fault has occurred (e.g., a general 65 the instruction pointer (also referred to as the program protection fault or a page fault). In block 612, the computing counter), a page identifier (e.g., a linear address of a page or device 100 determines whether the return address causing a sub page of a fixed size) or stack frame identifier (which US 9,514,285 B2 13 14 identifies a page, or a region of memory, referred to by the able to reverse the program’s modifications and re-apply stack pointer), page size, and a tracking counter (which them to the real address yielding the relative offset to the keeps track of the number of call instructions and/or the intended data. number of return instructions, during program execution). In some embodiments, in block 714, the computing device 100 5 EXAMPLES predicts ahead or behind the current stack position identifier, so that these stack position identifiers can be calculated Illustrative examples of the technologies disclosed herein ahead of time and stored for low latency access by the are provided below. An embodiment of the technologies may processor. include any one or more, and any combination of the In the loop 718, the computing device 100 determines 10 examples described below. whether a control transfer instruction including a return Example 1 includes a computing device to secure return address (e.g., as an argument) is encountered. If a control addresses to mitigate return oriented programming attacks. transfer instruction including a return address is not encoun The computing device includes: a processor comprising call tered, the computing device remains in block 720 and reads the next instruction. If a control transfer instruction includ 15 logic, wherein the call logic is to, prior to storage of a return ing a return address is encountered, in block 722 the address on a call stack, cause the processor to: read a secret computing device obtains the current encrypted Stack posi key from a memory location of the computing device that is tion identifier, generated in block 716 of the loop 710. In readable by the processor, determine a stack position iden block 724, the encrypted stack position identifier obtained in tifier, the stack position identifier usable to determine a block 722 is combined with the return address to create a location on the call stack at which the return address is to be modified return address (e.g., the modified return address stored; generate security data by execution of a crypto 206). To do this, the computing device 100 may use an XOR graphic algorithm with a plurality of inputs including: (i) the operation or a reduced round AES (Advanced Encryption secret key and (ii) the stack position identifier, the output of Standard) or other appropriate block size cipher. In some the cryptographic algorithm being the security data; and embodiments, the computing device may use k2 to encrypt 25 store the security data in a memory location that is readable the return address (or a portion thereof) using, for example, by the processor. a full or reduced round variant of a cipher such as SIMON, Example 2 includes the subject matter of Example 1, SPECK, or SERPENT. It should be noted that in any of the wherein the cryptographic algorithm includes an encryption encryption techniques described herein, the entire return algorithm, the plurality of inputs to the cryptographic algo address does not need to be encrypted. For example, in some 30 rithm further includes (iii) at least a portion of the return embodiments, only the top (most significant portion) of the address, and the call logic is to execute the cryptographic address needs to be encrypted. Additionally, in embodiments algorithm with the stack position identifier as a tweak, such in which the XOR is used to combine the return address with that the Security data is a modified return address, and the k2, not all address bits of the return address need to be call logic is to store the modified return address on the call modified (e.g., only the top 8 or 16 bits are XOR'd). Also, 35 stack. in some embodiments in which the AES algorithm is used to Example 3 includes the subject matter of Example 1 or encrypt the return address, a portion of the plaintext return Example 2, wherein the cryptographic algorithm includes address and the stack pointer cipher text can be used to one or more of an exclusive-or (XOR) operation, a reduced permute the AES block. Further, in some embodiments, the round cipher, and a full round cipher. AES permutations may be performed before the return 40 Example 4 includes the subject matter of Example 2, address is combined (e.g., XOR'd) with k2. Still further, wherein the encryption algorithm includes a tweakable AES permutations can be selected that can be achieved block cipher. within one clock cycle. Even further, in some embodiments, Example 5 includes the subject matter of Example 2, multiple different stack position identifiers 114 can be used. wherein the call logic is to further modify the modified For example, a stack page identifier and a call count (e.g., a 45 return address by manipulation of at least a portion of the count of the number of calls, which is incremented on a call modified return address such that the further-modified return instruction and decremented on a return instruction) can address falls within an unused range of memory addresses each be encrypted using, e.g., AES, and then combined with and store the further-modified return address on the call the return address (or a portion thereof) using, e.g., an XOR stack. operation. The plaintext portion of the return address (e.g. 50 Example 6 includes the subject matter of Example 2, the lower address bits) can be used as a selector to determine wherein the processor comprises return logic to, prior to which bits of the encrypted counter output are selected to be return of program execution control to a return address on combined with the upper return vector bits, making the the call Stack, cause the processor to: obtain the modified selection dependent on the lower address bits. In this return address from the call stack; decrypt the encrypted example, the XOR combination may be performed as part of 55 portion of the return address by execution of a decryption a call instruction. Whatever the embodiment of return algorithm with the stack position identifier as the tweak; and address security is used, the computing device performs the transfer program execution control to the address of the reverse operations (e.g., decrypts the return address) using decrypted return address. the same technique, via the secure return logic 108. More Example 7 includes the subject matter of Example 6, over, it should be noted that not encrypting the lower address 60 wherein the call logic is to further modify the modified bits allows legacy programs to manipulate those bits. For return address by manipulation of the modified return example, a program may add an offset to a return address address to fall within an unused range of memory addresses found on the stack to find relative program data (relative to and store the further-modified return address on the call the location of the call). By leaving this portion of the stack, and the return logic is to obtain the further-modified address (e.g., the lower address bits) unencrypted, a fault 65 return address from the call stack, and manipulate the handler (triggered as a result of the upper bits corresponding further-modified return address to restore the modified to a non-canonical address as described herein) would be return address to its original format. US 9,514,285 B2 15 16 Example 8 includes the subject matter of Example 2, identifier in parallel with the combination of the return wherein the processor comprises return address descram address with the encrypted stack position identifier. bling logic to cause the processor to: determine if the Example 20 includes a method for securing an address modified return address has been modified to fall within an used by a processor of a computing device to control the unused range of memory addresses; and if the modified flow of execution of a program, the method including: prior return address has been modified to fall within an unused to storing an address on a stack: reading a secret key from range of memory addresses, decrypt the encrypted portion of a memory location of the computing device that is readable the return address by execution of a decryption algorithm by the processor, determining a stack position identifier, the with the stack position identifier as the tweak. stack position identifier usable to determine a location on the 10 stack at which the address is to be stored; generating security Example 9 includes the subject matter of Example 1 or data by executing a cryptographic algorithm with a plurality Example 2, wherein the call logic is to cause the processor of inputs including: (i) the secret key and (ii) the stack to: compute the Security data by execution of a message position identifier, the output of the cryptographic algorithm authentication code algorithm, Such that the security data being the security data; and storing the security data in a includes a message authentication code; and store the Secu 15 memory location that is readable by the processor. rity data in address bits that are within an unused range of Example 21 includes the subject matter of Example 20, memory addresses. wherein the cryptographic algorithm includes an encryption Example 10 includes the subject matter of Example 9, algorithm, the plurality of inputs to the cryptographic algo wherein the message authentication code algorithm includes rithm further includes (iii) at least a portion of the address, a secure hash algorithm. and the method includes executing the cryptographic algo Example 11 includes the subject matter of Example 9, rithm using the stack position identifier as a tweak, Such that wherein the processor comprises return logic to, prior to the security data is a modified address, and storing the return of program execution control to a return address on modified address on the stack. the call stack: obtain the message authentication code from Example 22 includes the subject matter of Example 20, the unusable memory location; obtain a current return 25 wherein the cryptographic algorithm includes one or more address from the top of the call stack; compute second of an exclusive-or (XOR) operation, a reduced round security data by execution of the message authentication cipher, and a full round cipher. code algorithm with (i) the secret key and (ii) the stack Example 23 includes the subject matter of Example 21, position identifier, as inputs; compare the second security wherein the encryption algorithm includes a tweakable data to the security data; and if the second security data and 30 block cipher. the security data match, transfer program execution control Example 24 includes the subject matter of Example 21, to the address of the current return address. including further modifying the modified address by Example 12 includes the subject matter of Example 1 or manipulating at least a portion of the modified address Such Example 2, wherein the top of the call stack is indicated by that the further-modified address falls within an unused a stack pointer, and the stack position identifier includes the 35 range of memory addresses, and storing the further-modified stack pointer. address on the stack. Example 13 includes the subject matter of Example 1 or Example 25 includes the subject matter of Example 21, Example 2, wherein a memory address of a currently execut including, prior to directing program execution control to an ing instruction is indicated by an instruction pointer, and the address on the Stack: obtaining the modified address from stack position identifier includes the instruction pointer. 40 the stack; decrypting the encrypted portion of the address by Example 14 includes the subject matter of Example 1 or executing a decryption algorithm using the stack position Example 2, wherein the stack position identifier includes a identifier as the tweak; and transferring program execution page identifier representative of a range of memory control to the address of the decrypted address. addresses including the return address. Example 26 includes the subject matter of Example 25, Example 15 includes the subject matter of Example 1 or 45 including further modifying the modified address by Example 2, wherein the stack position identifier includes a manipulating the modified address to fall within an unused tracking counter indicative of one or more of a number of range of memory addresses, storing the further-modified calls executed by the call logic and a number of returns address on the stack, obtaining the further-modified address executed by the return logic. from the Stack, and manipulating the further-modified Example 16 includes the subject matter of Example 1 or 50 address to restore the modified address to its original format. Example 2, wherein the call logic is to encrypt at least a Example 27 includes the subject matter of Example 21, portion of the stack position identifier by execution of an including: determining if the modified address has been encryption algorithm with the security key. modified to fall within an unused range of memory Example 17 includes the subject matter of Example 16, addresses; and if the modified address has been modified to wherein the call logic is to modify the return address by 55 fall within an unused range of memory addresses, decrypt combination of at least a portion of the return address with ing the encrypted portion of the address by executing a the encrypted Stack position identifier. decryption algorithm using the stack position identifier as Example 18 includes the subject matter of Example 17, the tweak. wherein the call logic is to combine the return address with Example 28 includes the subject matter of Example 20, the encrypted Stack position identifier by execution of an 60 including: computing the security data by executing a mes exclusive-or (XOR) operation, a reduced round cipher, or a sage authentication code algorithm, such that the security full round cipher. data includes a message authentication code; and storing the Example 19 includes the subject matter of Example 18, security data in an memory location that is within an unused wherein the call logic is to: compute a predicted Stack range of memory addresses. position identifier representing a range of memory addresses 65 Example 29 includes the subject matter of Example 28, ahead or behind the range of memory addresses of the stack wherein the message authentication code algorithm includes position identifier, and encrypt the predicted Stack position a secure hash algorithm. US 9,514,285 B2 17 18 Example 30 includes the subject matter of Example 28, store the security data in a memory location that is including, prior to directing program execution control to an readable by the processor. address on the stack: obtaining the message authentication 2. The computing device of claim 1, wherein the stack code from the unusable memory location; obtaining a cur position identifier comprises one of a stack pointer indica rent address from the top of the stack; computing second 5 tive of the top of the stack, an instruction pointer indicative security data by executing the message authentication code of a memory address of a currently executing instruction, a algorithm with (i) the secret key and (ii) the stack position page identifier representative of a range of memory identifier, as inputs; comparing the second security data to addresses including the address, and a tracking counter the security data; and if the second security data and the security data match, transferring program execution control 10 indicative of one or more of a number of calls executed by to the address of the current address. the call logic and a number of returns executed by the return Example 31 includes the subject matter of Example 20, logic. wherein the top of the stack is indicated by a stack pointer, 3. The computing device of claim 1, wherein the proces and the stack position identifier includes the stack pointer. sor is further to: Example 32 includes the subject matter of Example 20, 15 determine whether the computing device is to enter a wherein a memory address of a currently executing instruc sleep mode; tion is indicated by an instruction pointer, and the stack transmit, in response to the determination that the com position identifier includes the instruction pointer. puting device is to enter a sleep mode, the secret key to Example 33 includes the subject matter of Example 20, a remote device; and wherein the stack position identifier includes a page identi- 20 restore the secret key from the remote device after the fier representing a range of memory addresses including the sleep mode has ended. address. 4. The computing device of claim 1, wherein the call logic Example 34 includes the subject matter of Example 20, is to: including encrypting at least a portion of the stack position compute the security data by execution of a message identifier by executing an encryption algorithm with the 25 authentication code algorithm, Such that the security security key. data comprises a message authentication code; and Example 35 includes the subject matter of Example 34, store the security data in address bits that are within an including modifying the address by combining at least a unused range of memory addresses. portion of the address with the encrypted Stack position 5. The computing device of claim 4, wherein the proces identifier. 30 Sor comprises return logic and wherein, prior to return of Example 36 includes the subject matter of Example 35, program execution control to a return address on the call including combining the address with the encrypted stack stack, the return logic is to: position identifier using an exclusive-or (XOR) operation, a obtain the message authentication code from the unusable reduced round cipher, or a full round cipher. memory location; Example 37 includes the subject matter of Example 36, 35 obtain a current return address from the top of the call including computing a predicted Stack position identifier stack; representing a range of memory addresses ahead or behind compute second security data by execution of the message the range of memory addresses of the stack position iden authentication code algorithm with (i) the secret key tifier, and encrypting the predicted Stack position identifier and (ii) the Stack position identifier, as inputs; in parallel with the combining of the address with the 40 compare the second security data to the security data; and encrypted Stack position identifier. if the second security data and the security data match, An Example 38 includes one or more non-transitory transfer program execution control to the address of the machine readable storage media including a plurality of current return address. instructions stored thereon that in response to being 6. The computing device of claim 1, wherein the call logic executed result in a computing device performing the 45 is to encrypt at least a portion of the Stack position identifier method of any of Examples 20-37. by execution of an encryption algorithm with the security An Example 39 includes a computing device including key. means for executing the method of any of Examples 20-37. 7. The computing device of claim 6, wherein the call logic The invention claimed is: is to modify the return address by combination of at least a 1. A computing device to secure return addresses to 50 portion of the return address with the encrypted stack mitigate return oriented programming attacks, the comput position identifier. ing device comprising: 8. The computing device of claim 7, wherein the call logic a processor comprising call logic, is to combine the return address with the encrypted stack wherein, prior to storage of a return address on a call position identifier by execution of an exclusive-or (XOR) stack, the call logic is to: 55 operation, a reduced round cipher, or a full round cipher. read a secret key from a memory location of the 9. The computing device of claim 8, wherein the call logic computing device that is readable by the processor; is to: compute a predicted Stack position identifier represen determine a stack position identifier, the Stack position tative of a range of memory addresses ahead or behind the identifier usable to determine a location on the call range of memory addresses of the Stack position identifier, stack at which the return address is to be stored; 60 and encrypt the predicted Stack position identifier in parallel generate security data indicative of the return address with the combination of the return address with the by execution of a cryptographic algorithm Such that encrypted Stack position identifier. the security data is based on both of: (i) the secret 10. The computing device of claim 1, wherein the cryp key and (ii) the Stack position identifier, tographic algorithm comprises an encryption algorithm, the modify the return address indicated in the security data 65 plurality of inputs to the cryptographic algorithm further to reference a non-canonical location in memory; includes (iii) at least a portion of the return address, and the and call logic is to execute the cryptographic algorithm with the US 9,514,285 B2 19 20 Stack position identifier as a tweak, and the call logic is to device securing an address used by a processor of a com store the modified return address on the call stack. puting device to control the flow of execution of a program, 11. The computing device of claim 10, wherein the bv: processor comprises return address descrambling logic to: prior to storing an address on a stack: determine if the modified return address has been modi reading a secret key from a memory location of the fied to fall within an unused range of memory computing device that is readable by the processor; addresses; and determining a stack position identifier, the stack posi if the modified return address has been modified to fall tion identifier usable to determine a location on the within an unused range of memory addresses, decrypt stack at which the address is to be stored; the encrypted portion of the return address by execution 10 generating security data indicative of the return address of a decryption algorithm with the stack position iden by executing a cryptographic algorithm such that the tifier as the tweak. security data is based on both of: (i) the secret key 12. The computing device of claim 10, wherein the and (ii) the stack position identifier; encryption algorithm comprises a tweakable block cipher. modifying the return address indicated in the security 13. The computing device of claim 10, wherein the 15 data to reference a non-canonical location in processor comprises return logic and wherein, prior to return memory; and of program execution control to a return address on the call storing the security data in a memory location that is stack, the return logic is to: readable by the processor. obtain the modified return address from the call stack; 19. The one or more non-transitory machine readable decrypt the encrypted portion of the return address by storage media of claim 18, wherein the cryptographic algo execution of a decryption algorithm with the stack rithm comprises an encryption algorithm, the plurality of position identifier as the tweak; and inputs to the cryptographic algorithm further includes (iii) at transfer program execution control to the address of the least a portion of the address, and the instructions result in decrypted return address. the computing device executing the cryptographic algorithm 14. The computing device of claim 13, wherein the return 25 using the stack position identifier as a tweak, and storing the logic is to obtain the modified return address from the call modified address on the stack. Stack, and manipulate the return address to restore the 20. The one or more non-transitory machine readable modified return address to its original format. storage media of claim 18, wherein the instructions result in 15. A method for securing an address used by a processor the computing device: of a computing device to control the flow of execution of a 30 computing the security data by executing a message program, the method comprising: authentication code algorithm, such that the security prior to storing an address on a stack: data comprises a message authentication code; and reading a secret key from a memory location of the storing the security data in a memory location that is computing device that is readable by the processor; within an unused range of memory addresses. determining a stack position identifier, the stack posi 35 21. The one or more non-transitory machine readable tion identifier usable to determine a location on the storage media of claim 18, wherein the instructions result in stack at which the address is to be stored; the computing device computing a predicted stack position generating security data indicative of the address by identifier representing a range of memory addresses ahead or executing a cryptographic algorithm such that the behind a range of memory addresses of the stack position security data is based on both of: (i) the secret key 40 identifier, and encrypting the predicted stack position iden and (ii) the stack position identifier; tifier in parallel with the combining of the address with the modifying the address indicated in the security data to encrypted stack position identifier. reference a non-canonical location in memory; and 22. The one or more non-transitory machine readable storing the security data in a memory location that is storage media of claim 18, wherein the instructions result in readable by the processor. 45 the computing device, prior to directing program execution 16. The method of claim 15, wherein the cryptographic control to an address on the stack: algorithm comprises an encryption algorithm, the plurality obtaining the modified address from the stack; of inputs to the cryptographic algorithm further includes (iii) manipulating the modified address to restore the modified at least a portion of the address, and the method comprises address to its original format. executing the cryptographic algorithm using the stack posi 50 23. The one or more non-transitory machine readable tion identifier as a tweak, and storing the modified address storage media of claim 22, wherein the instructions result in on the stack. the computing device: 17. The method of claim 15, comprising, prior to directing determining if the modified address has been modified to program execution control to an address on the stack: fall within an unused range of memory addresses; and obtaining the modified address from the stack; 55 if the modified address has been modified to fall within an manipulating the modified address to restore the modified unused range of memory addresses, decrypting the address to its original format. encrypted portion of the address by executing a decryp 18. One or more non-transitory machine readable storage tion algorithm using the stack position identifier as the media comprising a plurality of instructions stored thereon tweak. that in response to being executed result in a computing ck ck ci: ck ck