Method and System for Supporting Speculative Execution of Instructions

European Patent Office (Europäisches Patentamt, Office européen des brevets)
EUROPEAN PATENT APPLICATION
Publication number: 0 605 872 A1
Application number: 93120940.7
Int. Cl.5: G06F 9/38
Date of filing: 27.12.93
Priority: 08.01.93 US 2445
Date of publication of application: 13.07.94 Bulletin 94/28
Designated Contracting States: DE FR GB
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION, Old Orchard Road, Armonk, N.Y. 10504 (US)
Inventor: Levitan, David S., 9031 Marthas Drive, Austin, Texas 78717 (US)
Representative: Lettieri, Fabrizio, IBM SEMEA S.p.A., Direzione Brevetti, MI SEG 024, P.O. Box 137, I-20090 Segrate (Milano) (IT)

Method and system for supporting speculative execution of instructions.

A data processing system executing speculative instructions includes a memory for storing instructions at addresses, count registers (42, 44, 46) for storing an update value, a dispatch version value and a completion version value. A fetcher connected to a branch unit fetches instructions from memory based upon addresses calculated by the branch unit, which handles processing of conditional branch instructions. Further included are means (60) responsive to completion of initialization for copying the update value as the completion version value and means (52) responsive to dispatch of a conditional branch instruction. Means (62) responsive to completion of the branch provide for decrementing contents of a completion version register. Finally, means (58) responsive to occurrence of an interrupt prior to completion of the branch provide for replacing the dispatch version value with the completion version value to restore the system to a state prior to the speculative execution of instructions.

[Front-page figure; legible labels: MOVE_TO_COUNT STARTS, TO BRANCH UNIT]

The invention relates to data processing systems and in particular to a method and system for supporting speculative execution of program instructions. Still more particularly, the invention relates to preservation of non-conditional state information for recovery after speculative execution fails.

Designers of data processing systems are continually attempting to enhance the performance of such systems. One technique for enhancing data processing system efficiency is the achievement of short cycle times and a low Cycles-Per-Instruction (CPI) ratio in the system processor. An example of the application of these techniques to a data processing system is the International Business Machines Corporation RISC System/6000 (RS/6000) computer. The RS/6000 system is designed to perform well in numerically intensive engineering and scientific applications as well as in multi-user, commercial environments. The RS/6000 processor employs a superscalar implementation, which means that multiple instructions are issued and executed concurrently.

Processor architecture relates to the combination of registers, arithmetic units and control logic to build the computational elements of a computer. An important consideration during building of a processor is the instruction set it will provide. An instruction is a statement which specifies an operation and the values or locations of its operands. An instruction set is the collection of all such valid statements for a particular machine.

As originally conceived, RISC machines would execute one instruction per machine cycle. To this end all instructions were of one length and fit a scheme compatible with a pipeline implementation. Simplicity in the instruction set was the design objective. This allowed further reduction in the cycle time compared with so-called complex instruction set computers (CISC). However, some of the benefits of RISC were offset by increases in traffic between the processor and the main memory for a computer. This occurred because a RISC machine requires more instruction instances to do a task than a CISC machine with its more powerful instruction set.

Concurrence in issuance and execution of multiple instructions requires independent functional units that can execute with a high instruction bandwidth. The RS/6000 system achieves this by utilizing separate branch, fixed point and floating point processing units which are pipelined in nature. The branch processing unit handles conditional branch instructions. In common with other RISC designs, complex decoding logic no longer required to decode instructions has been utilized to provide an instruction cache on the processor chip. This reduces traffic between the processor and memory, and makes fetches of instructions extremely fast.

An instruction subset of great interest is that relating to conditional branches. Conditional branch instructions are instructions which dictate the taking of a specified conditional branch within an application in response to a selected outcome of the processing of one or more other instructions. A practical example is a Fortran do-loop. Conditional branch instructions have long been a source of difficulty for pipeline computers (including RISC systems). By the time a conditional branch instruction propagates through a pipeline queue to an execution position within the queue, it will have been necessary to load instructions corresponding to one branch into the queue behind the conditional branch instruction prior to resolving the conditional branch, in order to avoid run-time delays. This requires that a choice be made as to which instruction will follow the conditional branch without knowing the outcome of processing the related instructions. The choice can prove wrong.

The execution of instructions prior to the final possible definition of all conditions affecting execution is called speculative execution. To wait for the outcome of conditional branches, or the arrival of all possible interrupts, would make full concurrent processing impossible. Thus, some scheme for processor recovery from speculative execution of instructions must be provided if full use of concurrent execution of instructions is to be made. Upon determination that execution is proceeding down an incorrect branch, an interrupt may be generated to change the course of execution. In responding to an interrupt, the processor is returned to the last non-speculative execution step.

Experience has demonstrated that use of some complex operations in RISC machines can improve performance. This in part stems from the nature of currently preferred technology for implementation of processors, i.e. very large scale integration (VLSI). Minimization of area used on a chip is now more important than minimizing the number of devices used to implement the processor. Hence, some complex instructions have begun infiltrating into RISC-based designs. The criterion for inclusion is minimum utilization of space.

One instruction in the RS/6000 instruction set allows execution of a branch on count loop. The branch on count instruction is a one step instruction replacing what was formerly done in three instructions. Substitution of a single instruction for three instructions was enabled by providing a dedicated count register. However, this arrangement does not in itself support speculative execution. Implementation of the count register could be done by a mechanism provided in RS/6000 machines for register rename, but the value for the count register would not be known during the dispatch cycle, resulting in some loss of machine cycles.
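The saving from a branch on count instruction can be illustrated with a small sketch. It assumes, purely for illustration, that the three instructions being replaced are a decrement, a compare and a conditional branch; the text does not name them. The function names and the operation counting are hypothetical and do not model any actual RS/6000 encoding.

```python
def loop_control_three_step(iterations):
    """Count loop-control operations when every pass through the loop
    needs a separate decrement, compare and conditional branch.
    ASSUMPTION: these are the three replaced instructions."""
    count = iterations  # loop count held in an ordinary register
    control_ops = 0
    while True:
        count -= 1               # step 1: decrement
        control_ops += 1
        is_zero = (count == 0)   # step 2: compare with zero
        control_ops += 1
        control_ops += 1         # step 3: conditional branch on the compare
        if is_zero:
            break
    return control_ops


def loop_control_branch_on_count(iterations):
    """Count loop-control operations when one branch on count step
    decrements a dedicated count register and branches while it is nonzero."""
    count_register = iterations  # loaded once before the loop
    control_ops = 0
    while True:
        count_register -= 1      # single step: decrement and test together
        control_ops += 1
        if count_register == 0:
            break
    return control_ops


if __name__ == "__main__":
    print(loop_control_three_step(4), loop_control_branch_on_count(4))
```

Both functions expect at least one iteration. Under the stated assumption, the loop-control work per pass drops from three operations to one, which is the substitution the dedicated count register makes possible.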
Desirable is a hardware implementation of the branch on count loop which uses a minimum amount of area on a processor chip.

It is therefore one object of the invention to provide an improved method and system for supporting speculative execution of program instructions.

It is another object of the invention to provide preservation of conditional state information for recovery after speculative execution fails.

The foregoing objects are achieved by the invention as claimed. The invention provides a data processing system for speculatively executing instructions. The data processing system includes a memory for storing instructions at addresses which can be generated by a branch unit in a processor. The processor also has a count register for storing an update value, a dispatch version value and a completion version value of a branch control count. A fetcher connected to the branch unit fetches instructions from memory based upon addresses calculated by the branch unit. The branch unit handles processing of conditional branch instructions. To do so, means for initializing the update value and the dispatch version value for branch control are provided. Further included are means responsive to completion of initialization for

Figure 3 is a schematic illustration of a branch on count register architecture in accordance with a preferred embodiment of the invention; and

Figure 4 is a schematic illustration of a branch on count register architecture in accordance with a second preferred embodiment of the invention.

With reference now to the figures and in particular with reference to Figure 1, there is depicted a high level block diagram of a superscalar computer system 10 which may be utilized to implement the method and system of the present invention. As illustrated, computer system 10 preferably includes a memory 18 which is utilized to store data, instructions and the like. Data or instructions stored within memory 18 are preferably accessed utilizing cache/memory interface 20 in a method well known to those having skill in the art. The sizing and utilization of cache memory systems is a well known subspecialty within the data processing art and is not addressed within the present application. However, those skilled in the art will appreciate that by utilizing modern associative cache techniques a large percentage of memory accesses may be achieved utilizing data temporarily stored within cache/memory interface 20.

Instructions from cache/memory interface 20 are typically loaded into instruction queue 22 which preferably includes a plurality of queue positions.
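The count register scheme outlined in the abstract and summary (an update value, a dispatch version and a completion version) can be sketched roughly as follows. This is a toy model, not the claimed hardware. The class and method names are hypothetical, and the action taken on dispatch of a branch is an assumption: the text names means responsive to dispatch but does not state what they do, so the sketch decrements the dispatch version to mirror the decrement of the completion version on completion.

```python
class CountRegister:
    """Toy model of a count register holding three copies of the loop count."""

    def __init__(self):
        self.update = 0      # value written when the count is initialized
        self.dispatch = 0    # speculative copy, used as branches are dispatched
        self.completion = 0  # non-speculative copy, advanced as branches complete

    def move_to_count(self, value):
        # Initialization sets the update value and the dispatch version.
        self.update = value
        self.dispatch = value

    def complete_initialization(self):
        # On completion of initialization, the update value is copied
        # as the completion version value.
        self.completion = self.update

    def dispatch_branch(self):
        # ASSUMPTION: dispatch of a branch on count decrements the
        # dispatch version; the branch is taken while it is nonzero.
        self.dispatch -= 1
        return self.dispatch != 0

    def complete_branch(self):
        # Completion of the branch decrements the completion version.
        self.completion -= 1

    def interrupt(self):
        # An interrupt before completion discards speculative progress:
        # the dispatch version is replaced by the completion version.
        self.dispatch = self.completion


def demo():
    reg = CountRegister()
    reg.move_to_count(5)
    reg.complete_initialization()
    for _ in range(3):       # three branches dispatched speculatively
        reg.dispatch_branch()
    reg.complete_branch()    # only the first one has completed
    before = (reg.dispatch, reg.completion)
    reg.interrupt()
    after = (reg.dispatch, reg.completion)
    return before, after


if __name__ == "__main__":
    print(demo())
```

In this model the dispatch version runs ahead of the completion version by the number of branches dispatched but not yet completed, and an interrupt rolls the dispatch version back to the last non-speculative count.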
