Runahead Execution: An Alternative to Very Large Instruction Windows for Out-of-order Processors Onur Mutlu § Jared Stark † Chris Wilkerson ‡ Yale N. Patt § §ECE Department †Microprocessor Research ‡Desktop Platforms Group The University ofTexas at Austin Intel Labs Intel Corporation {onur,patt}@ece.utexas.edu
[email protected] [email protected] Abstract the processor buffers the operations in an instruction win- dow, the size ofwhich determines the amount oflatency the Today’s high performance processors tolerate long la- out-of-order engine can tolerate. tency operations by means of out-of-order execution. How- Today’s processors are facing increasingly larger laten- ever, as latencies increase, the size of the instruction win- cies. With the growing disparity between processor and dow must increase even faster if we are to continue to tol- memory speeds, operations that cause cache misses out to erate these latencies. We have already reached the point main memory take hundreds ofprocessor cycles to com- where the size of an instruction window that can handle plete execution [25]. Tolerating these latencies solely with these latencies is prohibitively large, in terms of both de- out-of-order execution has become difficult, as it requires sign complexity and power consumption. And, the problem ever-larger instruction windows, which increases design is getting worse. This paper proposes runahead execution complexity and power consumption. For this reason, com- as an effective way to increase memory latency tolerance puter architects developed software and hardware prefetch- in an out-of-order processor, without requiring an unrea- ing methods to tolerate these long memory latencies.