Dynamic Multiple Instruction Stream Multiple Data Multiple Pipeline Floatingpoint Unit

Europaisches Patentamt 0 328 721 £ European Patent Office 0\) Publication number: A2 Office europeen des brevets EUROPEAN PATENT APPLICATION © Application number: 88109842.0 (Si) int. Ci.4: G06F 9/38 © Date of filing: 21.06.88 The title of the invention has been amended © Applicant: International Business Machines (Guidelines for Examination in the EPO, A— III. Corporation 7.3). Old Orchard Road Armonk, N.Y. 10504(US) ® Priority: 30.09.87 US 102985 © Inventor: Schwarz, Eric Mark 5A Jane Lacey Drive @ Date of publication of application: Endicott, NY 13760(US) 23.08.89 Bulletin 89/34 Inventor: Vassiliadis, Stamatis 717 Vestal Road © Designated Contracting States: Vestal, NY 13850(US) CH DE FR GB IT LI NL © Representative: Jost, Ottokarl, Dipl.-lng. IBM Deutschland GmbH Patentwesen und Urheberrecht Schonaicher Strasse 220 D-7030 Boblingen(DE) Dynamic multiple instruction stream multiple data multiple pipeline floatingpoint unit. © A dynamic multiple instruction stream, multiple processor should complete, execution of the oldest data, multiple pipeline (MIMD) apparatus simulta- instruction in such processor completes, leaving neously executes more than one instruction asso- room for insertion of the particular instruction therein ciated with a multiple number of instruction streams for execution. When the particular instruction is utilizing multiple data associated with the multiple transmitted to its associated pipeline processor, in- number of instruction streams in a multiple number formation including the pipe number is stored in the of pipeline processors. Since instructions associated dynamic history table for future reference. with a multiple number of instruction streams are ,in( Faint Unit 10 being executed simultaneously by a multiple number of pipeline processors, a tracking mechanism is needed for keeping track of the pipe in which each înstruction is executing. As a result, a dynamic his- ^tory table maintains a record of the pipeline proces- r"sor number in which each incoming instruction is êxecuting, and other characteristics of the instruc- ^tion. When a particular instruction is received, it is 00 decoded and its type is determined. Each pipeline certain of instructions; ^processor handles a category c me ifc ! r i ic it the particular instruction is transmitted to the pipeline Ho iri!!i i i io Ii fi i; tM.o 1i Fi [. IB;o IFii O processor having its corresponding category. How- 0 1 1 I (0 I I I [0 1 1 I 10 It SLI 1 1I IS!L 1[ 1 ISIL 1I I1 IS|L 1 Qêver, before transmission, the pipeline processor is HI checked for completion of its oldest instruction by consulting the dynamic history table. If the table indicates that the oldest instruction in the pipeline Xerox Copy Centre EP 0 328 721 A2 DYNAMIC MULTIPLE INSTRUCTION STREAM MULTIPLE DATA MULTIPLE PIPELINE APPARATUS FOR FLOATING-POINT SINGLE INSTRUCTION STREAM SINGLE DATA ARCHITECTURES The subject matter of this invention relates to termed a "dynamic MIMD pipeline". computing systems, and more particularly, to a It is another object of the present invention to multiple instruction stream, multiple data pipeline introduce the dynamic MIMD pipeline which is not for use in a functional unit of such computing limited by the "one instruction stream at a time" system, such as a floating point unit, which is 5 execution philosophy. designed to operate in conjunction with a single It is another object of the present invention to instruction stream, single data architecture. introduce the dynamic MIMD pipeline capable of Most computer processors utilize some form of simultaneously executing a multiple number of in- pipelining. In a pipelined computer processor, more struction streams in a multiple number of pipelines than one instruction of an instruction stream is 10 thereby increasing substantially the performance of being executed at the same time. However, each of the functional unit embodying the dynamic MIMD the instructions being executed are disposed within pipeline. different stages of the pipe. The performance of a In accordance with these and other objects of pipelined processor is necessarily better than the the present invention, a plurality of pipes are ca- performance of a non-pipelined processor. There 75 pable of piping, for execution thereof, a further are different types of pipelining. One type is plurality of instructions. Each pipe is capable of termed "single instruction stream single data simultaneously storing, for execution, a plurality of (SISD)" pipelining. In the SISD type of pipelining, instructions. Thus, the plurality of pipes are ca- individual instructions are pipelined with at most a pable of simultaneously storing, for execution, the single data operation. However, using the SISD 20 further plurality of instructions. The further plurality pipelining approach, many "hazards" were encoun- of instructions are chosen from a plurality of in- tered. Hazards are encountered upon entering the struction streams which are executing simulta- pipeline at a maximum possible new data rate. The neously in the plurality of pipes. Since the instruc- "hazards" can be divided in two categories, name- tions in a particular pipe may be in various stages ly, structural hazards and data dependent hazards. 25 of completion of execution, in order to keep an A structural hazard occurs when two pieces of data .accurate record of the execution disposition of attempt to use the same hardware and thus colli- each instruction in the pipe, a dynamic history sions occur. Data dependent hazards may occur table stores information associated with each in- when the events transpiring in one stage of a struction disposed in each of the plurality of pipes, pipeline determines whether or not data may pass 30 the information for each instruction including the through another stage of the pipeline. For example, pipe number in which the instruction is temporarily in a pipeline having two stages, each stage requir- stored, and the status of completion of execution of ing use of a single memory, when one stage is the particular instruction. A handshakes and global using the memory, the other stage must remain hazards circuit determines the busy status of the idle until the first stage is no longer using the 35 functional unit, in which the dynamic MIMD pipe is memory. Another type of pipeline approach is embodied, and responds to other functional units in termed "multiple instruction stream, multiple data the computer system, such as the central process- (MIMD)" pipelining. When the MIMD type of ing unit (CPU). It also determines if any hazards pipelining is being used, rather than pipe individual exist. If the functional unit is not busy and no instructions, as in the SISD pipeline approach, in- 40 hazards exist, the next instruction from one of the struction "streams" are piped. The MIMD pipeline plurality of instruction streams enters the next avail- approach did not encounter the hazards problem. able pipe. An MIMD/SISD switch circuit determines However, although instruction streams are being if an incoming instruction is greater than "X" bits piped in the MIMD approach, a first instruction long (e.g. - 64), and if so, the switch switches the stream must complete execution before a second 45 dynamic MIMD pipeline of the present invention to instruction stream could commence execution. the standard SISD mode and executes the incom- Thus, although the performance of the MIMD pipe- ing instruction in the "one instruction stream at a line was better than the performance of the SISD time" execution philosophy mode. SISD is also pipeline, the performance of the MIMD pipeline invoked for "difficult" instructions which are consid- was limited, by the "one instruction stream at a 50 ered to be divides and square roots. time" execution philosophy. Further scope of applicability of the present Accordingly, it is a primary object of the invention will become apparent from the detailed present invention to introduce a novel type of pipe- description presented hereinafter. It should be un- line for computer functional units, hereinafter derstood, however, that the detailed description EP 0 328 721 A2 and the specific examples, while representing a such as the cache, via a bus called the CBUS. The preferred embodiment of the invention, are given CBUS is the only means by which instructions are by way of illustration only, since various changes communicated between the CPU and the FPU. The and modifications within the spirit and scope of the CBUS conducts handshake control signals and in- invention will become obvious to one skilled in the 5 struction opcodes. When the CPU transmits these art from a reading of the following detailed descrip- requests, the functional units to which these re- tion. quests are sent are called Processor Bus Units A full understanding of the present invention (PBU). The FPU comprises one of the PBUs. When will be obtained from the detailed description of the the CPU encounters an instruction which it cannot preferred embodiment presented hereinbelow, and m execute, and a PBU should execute the instruction, the accompanying drawings, which are given by the CPU transmits a Processor Bus Operation way of illustration only and are not intended to be (PBO) signal to the appropriate PBU. For instance, limitative of the present invention, and wherein: if the CPU decoded an instruction to be a multiply Fig. 1 illustrates a block diagram of a prior floating point long, since it is much easier for the art standard MIMD pipeline; 75 FPU to perform this operation, the CPU transmits Fig. 2 illustrates the dynamic MIMD pipeline the PBO signal to the FPU requiring the FPU to of the present invention; perform the multiply floating point long instruction. Fig. 3 illustrates the instruction stack of Fig. The FPU comprises two main parts: a first 2; section in which data actually flows, and a second Fig. 4 illustrates the dynamic history table of 20 section into which instructions are introduced and Fig.

Dynamic Multiple Instruction Stream Multiple Data Multiple Pipeline Floatingpoint Unit

PIPELINING and ASSOCIATED TIMING ISSUES Introduction: While

2.5 Classification of Parallel Computers

Microprocessor Architecture

Instruction Pipelining (1 of 7)

Review of Computer Architecture

CS152: Computer Systems Architecture Pipelining

Threading SIMD and MIMD in the Multicore Context the Ultrasparc T2

Pipelined Computations

Trends in Processor Architecture

Reverse Engineering X86 Processor Microcode

Chap. 9 Pipeline and Vector Processing

Chapter 4 Data-Level Parallelism in Vector, SIMD, and GPU Architectures