Ep 2182433 A1

(19) & (11) EP 2 182 433 A1 (12) EUROPEAN PATENT APPLICATION published in accordance with Art. 153(4) EPC (43) Date of publication: (51) Int Cl.: 05.05.2010 Bulletin 2010/18 G06F 9/38 (2006.01) (21) Application number: 07768016.3 (86) International application number: PCT/JP2007/063241 (22) Date of filing: 02.07.2007 (87) International publication number: WO 2009/004709 (08.01.2009 Gazette 2009/02) (84) Designated Contracting States: • AOKI, Takashi AT BE BG CH CY CZ DE DK EE ES FI FR GB GR Kawasaki-shi HU IE IS IT LI LT LU LV MC MT NL PL PT RO SE Kanagawa 211-8588 (JP) SI SK TR Designated Extension States: (74) Representative: Ward, James Norman AL BA HR MK RS Haseltine Lake LLP Lincoln House, 5th Floor (71) Applicant: Fujitsu Limited 300 High Holborn Kawasaki-shi, Kanagawa 211-8588 (JP) London WC1V 7JH (GB) (72) Inventors: • TOYOSHIMA, Takashi Kawasaki-shi Kanagawa 211-8588 (JP) (54) INDIRECT BRANCHING PROGRAM, AND INDIRECT BRANCHING METHOD (57) A processor 100 reads an interpreter program the indirect branch instruction in a link register 120c (the 200a to start an interpreter. The interpreter makes branch processor 100 internally stacks the branch destination prediction by executing, in place of an indirect branch addresses also in a return address stack 130) in inverse instruction that is necessary for execution of a source order and reading the addresses from the return address program 200b, storing branch destination addresses in stack 130 in one at a time manner. EP 2 182 433 A1 Printed by Jouve, 75001 PARIS (FR) 1 EP 2 182 433 A1 2 Description vised and brought into practical use. A principle under- lying the branch prediction is based on an idea of record- TECHNICAL FIELD ing branch results of a program executed in the past and predicting an outcome of branch as an extension of the [0001] The present invention is relates to an indirect 5 results (see, for example, Non-patent document 1). branch processing program and an indirect branch [0007] Branch instructions include not only ordinal processing method of causing a computer to execute branch instructions but also indirect branch instructions pseudo indirect branch instruction in place of indirect for taking branch based on an address stored in a regis- branch instruction. ter. Indirect branch instruction is frequently used for im- 10 plementation of the above-described interpreter and the BACKGROUND ART like. However, because indirect branch instruction is a high-cost instruction for a superscalar computer, predic- [0002] Conventionally, computers (CPU; Central tion methods focused on indirect branch instruction have Processing Unit, and the like) execute source code pro- also been studied (see, for example, Non-patent docu- grams written by person with compilation or interpretation 15 ment 2 and Non-patent document 3). mechanisms. Compilation is a technique of converting a [0008] A technique for applying a code written for a source code program into a computer- executable binary processor that handles only small address space to a format by using a conversion program, called a compiler, processor that handles large address space without us- and thereafter executing the code converted into the bi- ing indirect call is described in Patent document 1. nary format. Interpretation is a technique of executing a 20 [0009] Non-patent document 1: John L. Hennessy and special binary code, called an interpreter, on a computer David A. Patterson. "Computer Architecture-A Quantita- so that the interpreter translates a source code program tive Approach 3rd Eedition," MORGAN KAUFMANN in one at a time manner and performing operations ac- PUBLISHERS, ISBN 1-55860-724-2. cording to content as required. [0003] In recent years, we design computers based on 25 Non-patent document 2: P.-Y. Chang, E. Hao, and a technique called superscalar to increase processing Y. N. Patt. "Target Prediction for Indirect Jumps", In speed of them. This technique premises that processing Proceedings of 24th International Symposium on is performed with pipelining. In pipelining, processing Computer Architecture, pp. 274∼283, 1997. pertaining to each instruction is divided into a plurality of Non-patent document 3: K. Driesen and U. HAolzle, stages, which are independently processed by discrete 30 "Accurate Indirect Branch Prediction", In Proceed- units. Therefore, if the next instruction to be executed is ings of 25th International Symposium on Computer determined, execution of a first stage of the next instruc- Architecture, pp. 167∼178, 1998. tion can be performed simultaneously with execution of Patent document 1: Japanese Laid-open Patent a second stage of the preceding instruction, which allows Publication No. 2000-284965 high-speed execution by virtue of parallel processing with 35 parallelism corresponding to the number of divided stag- DISCLOSURE OF INVENTION es. [0004] Meanwhile, conditions for causing pipelining to PROBLEM TO BE SOLVED BY THE INVENTION function successfully include a condition that the next instruction to be executed is determined. More specifi- 40 [0010] The above-described conventional techniques cally, an instruction 2 to be executed following an instruc- are, however, disadvantageous in that even when the tion 1 (in a case where instructions are to be executed next instruction to be executed subsequent to an indirect in an order of the instruction 1, the instruction 2, an in- branch instruction written in a program is predicted based struction 3, ...) is not determined before a first stage of on branch history records, the probability that the pre- the instruction 1 completes, the next instruction 2 cannot 45 dicted instruction is the same as an instruction that is be executed in a parallel manner (the same goes for in- actually executed next is low. structions following the instruction 2). [0011] For such indirect branch instruction that has [0005] Such a circumstance typically occurs, for ex- branch destinations differing from execution to execu- ample, with branch instruction. In regard to branch in- tion, prediction cannot be made with high accuracy even struction, until a result of an immediately-preceding in- 50 with utilization of the techniques of Non- patent document struction for making conditional branch decision (for ex- 2 and Non-patent document 3. Indirect branch instruc- ample, compare instruction) is obtained, the next instructions remain to be high-cost instructions for computers tion to be executed remains undetermined, which ob- that process programs by superscalar technique. structs pipeline operation. Therefore, processing speed [0012] The present invention has been conceived in of an entire computer is undesirably substantially re-55 view of the above circumstances and aims at providing duced. an indirect branch processing program and an indirect [0006] To prevent performance loss of pipeline oper- branch processing method by which highly accurate ation, a technique called branch prediction has been de- branch prediction can be made. 2 3 EP 2 182 433 A1 4 MEANS FOR SOLVING PROBLEM causes execution to transfer to a branch destination address in a general purpose register, an instruction that [0013] To solve the disadvantage and achieve the ob- causes the branch destination address to be stored in ject, an interpreter on a computer is caused to execute the call stack and execution to transfer to the address is a reading step of reading the source program stored in 5 used as a substitute. Therefore, processing performance a storage device, a step of translating the source program of the computer that executes the source program may and selecting a processing code according to content, be increased. and a step of calling the processing code selected by the [0019] According to the present invention, the branch code selecting step. To take a branch to the selected destination addresses in the indirect branch instruction code, the interpreter stores addresses of selected codes 10 are stored in a predetermined stack or a link register im- in a call stack in reverse order of processed order and plemented in the computer in inverse order to thereby performs call operations by using a pseudo indirect indirectly update a return- address prediction table inside branch instruction in place of utilizing the indirect branch the computer so that branch prediction is made success- instruction in the code selecting and call steps. fully when pseudo indirect branch is processed. This may [0014] According to the present invention, in the above 15 improve in accuracy in branch prediction of the computer invention, in the processing-code call step, a pseudo in- and increase processing performance of the computer direct branch code that includes, in place of an indirect that executes the source program. branch instruction that causes execution to transfer to a [0020] According to the present invention, a control branch destination address stored in a general purpose statement that involves branching is written in the source register, an instruction that causes the branch destination 20 program, and the pseudo indirect branch code in the in- address to be stored in the call stack and execution to terpreter includes an instruction that causes the branch transfer to the address is used as a substitute. destination addresses in the indirect branch instruction [0015] According to the present invention, in the above to be stored in the call stack in inverse order in each invention, the branch destination addresses in the indi- section before a condition determining portion of the con- rect branch instruction are stored in a predetermined25 trol statement is written. This allows to perform pseudo stack or a link register implemented in the computer to indirect branch more efficiently and at higher speed. thereby indirectly update a return-address prediction table inside the computer so that branch prediction is made BRIEF DESCRIPTION OF DRAWINGS successfully when pseudo indirect branch is processed.

Ep 2182433 A1

2. Instruction Set Architecture

Exploiting Branch Target Injection Jann Horn, Google Project Zero

Simty: Generalized SIMT Execution on RISC-V Caroline Collange

Digital Design & Computer Arch. Lecture 17: Branch Prediction II

Leveraging Indirect Branch Locality in Dynamic Binary Translators

Intel Itanium 2 Processor Reference Manual

Understanding and Predicting Indirect Branch Behavior

Design of Digital Circuits Lecture 19: Branch Prediction II, VLIW, Fine-Grained Multithreading

Instruction Set Define Computer

AMD Architecture Guidelines for Indirect Branch Control

Predicated Code

SOFTWARE TECHNIQUES for MANAGING SPECULATION on AMD PROCESSORS 2 to Mitigate the Above Described Variants, There Are a Variety of Possible Techniques Software Can Use