
(19) United States
(12) Patent Application Publication (10) Pub. No.: US 2003/0172256 A1
Soltis, Jr. et al. (43) Pub. Date: Sep. 11, 2003

(54) USE SENSE URGENCY TO CONTINUE WITH OTHER HEURISTICS TO DETERMINE SWITCH EVENTS IN A TEMPORAL MULTITHREADED CPU

(76) Inventors: Donald C. Soltis, Jr., Fort Collins, CO (US); Rohit Bhatia, Fort Collins, CO (US)

Correspondence Address: HEWLETT-PACKARD COMPANY, Intellectual Property Administration, P.O. Box 272400, Fort Collins, CO 80527-2400 (US)

(21) Appl. No.: 10/092,670

(22) Filed: Mar. 6, 2002

Publication Classification

(51) Int. Cl.7 .......... G06F 9/00
(52) U.S. Cl. .......... 712/228

(57) ABSTRACT

A processing unit of the invention has multiple instruction pipelines for processing multi-threaded instructions. Each thread may have an urgency associated with its program instructions. The processing unit has a thread switch controller to monitor processing of instructions through the various pipelines. The thread controller also controls switch events to move from one thread to another within the pipelines. The controller may modify the urgency of any thread, such as by issuing an additional instruction. The thread controller preferably utilizes certain heuristics in making switch event decisions. A time slice expiration unit may also monitor expiration of threads for a given time slice.

[Front-page drawing: the flow chart of FIG. 3, reproduced on Sheet 2 of 2.]

[Sheet 1 of 2. FIG. 1: block diagram of the processing unit, showing a time slice expiration unit with bypass logic, a cache, fetch and issue units, pipelines 12(1) through 12(N), a thread controller, and register files 16(1) through 16(N). FIG. 2: timeline of an exemplary switch event 70 between a first and a second thread within a pipeline.]

[Sheet 2 of 2. FIG. 3: flow chart 100. Monitor program thread progression as instructions are processed in multiple pipelines (102). If a time slice expiration occurs (103), either set the active thread's urgency high, or deactivate the active thread and activate the inactive thread (104, 105). If a switch event occurs for the active thread (106), modify the active urgency and assess it against the inactive urgency to determine whether a switch should occur (110), switching by deactivating the active thread and activating the inactive thread (112). If a switch event occurs for the inactive thread (108), modify the inactive urgency and assess it against the active urgency to determine whether a switch should occur (114).]

USE SENSE URGENCY TO CONTINUE WITH OTHER HEURISTICS TO DETERMINE SWITCH EVENTS IN A TEMPORAL MULTITHREADED CPU

BACKGROUND OF THE INVENTION

[0001] Temporal multithreading is known in the art as a technique that uses one set of execution resources to execute multiple "programs," or "threads." These execution resources often include an array of pipeline execution units. Instructions for a program thread are processed through the pipeline until it stalls; in a "switch" event, those stalled instructions are then removed and instructions from another thread are injected into the same pipeline so as to efficiently utilize execution resources.

[0002] Temporal multithreading thus gives the appearance of multiple central processing units ("CPUs"). Each thread processes through the execution units as if the program had entire control of the execution units; activation and deactivation of various threads occurs in hardware control logic based on multiple switching events in an attempt to maximally utilize the execution units.

[0003] There is a penalty associated with the above-mentioned switch events. Accordingly, the prior art has developed certain objective criteria for a switch event. In one example, a cache miss triggers a switch event because the processor needs to acquire data from main memory. In another example, a time-out counter counts the cycles of a thread's execution and promotes an automatic switch for an out-of-bounds thread execution duration.

[0004] There is a need to further reduce the negative effects of switching events in high-performing processors. By reducing or improving processing of switch events, a processor will have increased performance through improved instruction processing efficiency across multiple threads. One feature of the invention is therefore to provide a processor with intelligent logic for efficiently processing and switching multi-threaded programs through the processor. Several objects and other features of the invention are apparent within the description that follows.

SUMMARY OF THE INVENTION

[0005] The following patents provide useful background to the invention and are incorporated herein by reference: U.S. Pat. No. 6,188,633; U.S. Pat. No. 6,105,123; U.S. Pat. No. 5,857,104; U.S. Pat. No. 5,809,275; U.S. Pat. No. 5,778,219; U.S. Pat. No. 5,761,490; U.S. Pat. No. 5,721,865; and U.S. Pat. No. 5,513,363.

[0006] In one aspect, a processing unit of the invention has multiple instruction pipelines for processing instructions from multiple threads. Though not required, each thread may include an urgency identifier within its program instructions. The processing unit has a thread switch controller to monitor processing of instructions through the various pipelines. The thread controller also controls switch events to move from one thread to another within the pipelines. The controller may modify the urgency of any thread, such as through modification of the urgency identifier.

[0007] "Urgency," as used herein, generally means an abstraction of how well a thread is progressing (or will be progressing) within a pipeline; it is also an abstraction as to how urgent the program or processor logic believes the thread should be. By way of example, a thread's urgency may be "low," "medium" or "high," or further quantized with a 3-bit identifier, thereby providing different switch event solutions to treatment of the thread in stalled pipelines. If, for example, a thread instruction misses the cache, the controller can lower the thread's urgency, e.g., from high to medium, or from 5 to 4. Additional cache misses can lower the thread's urgency further, such as from medium to low, or from 4 to 3. The thread controller preferably utilizes certain heuristics in making switch event decisions. Other heuristics may be used by the controller to prescribe the switch events according to certain guidelines. By way of example, processor interrupts may also modify the switch event heuristics.
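To make the urgency abstraction of [0007] concrete, the sketch below encodes the named levels and the 3-bit quantization in one small integer and demotes a thread one step per cache miss. It is a minimal illustration under assumed names (thread_state, on_cache_miss), not the patent's hardware:

```c
#include <stdint.h>

/* Urgency quantized to a 3-bit identifier, 0 (lowest) through 7 (highest).
 * The named levels map onto the same scale; the numeric example in [0007]
 * (high -> medium is 5 -> 4, medium -> low is 4 -> 3) suggests this placement. */
enum { URGENCY_LOW = 3, URGENCY_MEDIUM = 4, URGENCY_HIGH = 5, URGENCY_MAX = 7 };

typedef struct {
    uint8_t id;      /* thread identifier */
    uint8_t urgency; /* a 3-bit field in hardware; a byte suffices here */
} thread_state;

/* A cache miss lowers the thread's urgency one step; repeated misses
 * lower it further, as the example in [0007] describes. */
static void on_cache_miss(thread_state *t) {
    if (t->urgency > 0)
        t->urgency--;
}
```

With this encoding, two consecutive misses take a thread from high (5) to low (3), matching the 5 to 4 to 3 progression in the text.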
[0008] The controller also preferably monitors the timing of a thread, such as to incorporate time-out features with the switch event heuristics.

[0009] The invention thus provides certain advantages. Unlike the prior art, switch events are no longer "black and white" decisions that may cause negative processing effects within the pipeline. By way of example, in accord with the invention, a switch event may not automatically occur after a certain number of cycles associated with time slice expiration. That is, an assessment is also made of a thread's execution relative to time slice expiration: if that thread is making good forward progress, it is not stalled or switched out; rather, its urgency may actually be increased to amplify the thread's activity. If, however, there is a stall, the next inactive thread might be activated. If, on the other hand, the active thread has a time slice expiration but has a higher urgency than the inactive thread, the inactive thread may remain dormant while the pipeline waits to process the active, more-urgent thread. Those skilled in the art should also appreciate that switching too late may also create processing penalties. The invention provides advantages over the prior art by adding urgency to the thread in order to encourage, or not, a switch event under appropriate heuristics.
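Continuing the same sketch, the graded time-slice decision of [0009] can be expressed as a single predicate. Here, making_forward_progress is an assumed stand-in for whatever progress measure the hardware exposes; the patent does not specify one:

```c
#include <stdbool.h>

/* Assumed query backed by pipeline progress counters. */
bool making_forward_progress(const thread_state *t);

/* Decide what to do when the active thread's time slice expires ([0009]).
 * Returns true if the controller should switch to the inactive thread. */
static bool handle_time_slice_expiration(thread_state *active,
                                         const thread_state *inactive) {
    if (making_forward_progress(active)) {
        /* Good forward progress: no switch; raise urgency to amplify
         * the thread's activity. */
        if (active->urgency < URGENCY_MAX)
            active->urgency++;
        return false;
    }
    if (active->urgency > inactive->urgency)
        return false; /* inactive thread stays dormant; pipeline waits */
    return true;      /* stalled and not more urgent: switch threads */
}
```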
[0010] Moreover, too many switch events may underutilize the pipeline. Once again, the invention has advantages over the prior art by reducing the underutilization of execution units, due to switch events, by making appropriate switch decisions according to preferred instruction processing policies; and these policies may be changed, dynamically or otherwise, to further reduce the underutilization. By way of example, a program thread with "low" urgency may switch immediately in the event of a stall (predicted or actual).

[0011] The invention is next described further in connection with preferred embodiments, and it will become apparent that various additions, subtractions, and modifications can be made by those skilled in the art without departing from the scope of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] A more complete understanding of the invention may be obtained by reference to the drawings, in which:

[0013] FIG. 1 schematically illustrates a processing unit of the invention for processing instructions through pipeline execution units;

[0014] FIG. 2 shows an exemplary switch event within a pipeline of the invention; and

… the first thread out, at switch event 70, and activates a second thread, illustratively with a "high" urgency.
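The decision flow recoverable from FIG. 3 ties these pieces into a controller loop. This is again a hedged sketch building on the fragments above; poll_pipelines, modify_urgency, and do_switch are assumed plumbing, not taken from the patent, and the numbered comments refer to the flow chart's steps:

```c
typedef enum {
    EV_NONE,
    EV_TIME_SLICE_EXPIRED, /* step 103 */
    EV_ACTIVE_EVENT,       /* step 106: switch event for the active thread */
    EV_INACTIVE_EVENT      /* step 108: switch event for the inactive thread */
} event_kind;

/* Assumed hardware plumbing. */
event_kind poll_pipelines(void);                     /* step 102: monitor progression */
void modify_urgency(thread_state *t, event_kind ev); /* raise or lower per [0007] */
void do_switch(thread_state *active, thread_state *inactive); /* steps 105/112 */

static void thread_controller_loop(thread_state *active, thread_state *inactive) {
    for (;;) {
        event_kind ev = poll_pipelines();
        switch (ev) {
        case EV_TIME_SLICE_EXPIRED:
            if (handle_time_slice_expiration(active, inactive))
                do_switch(active, inactive);
            break;
        case EV_ACTIVE_EVENT:   /* step 110: modify, assess against inactive */
            modify_urgency(active, ev);
            if (inactive->urgency > active->urgency)
                do_switch(active, inactive);
            break;
        case EV_INACTIVE_EVENT: /* step 114: modify, assess against active */
            modify_urgency(inactive, ev);
            if (inactive->urgency > active->urgency)
                do_switch(active, inactive);
            break;
        default:
            break;
        }
    }
}
```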