Efficient and Secure Context Switching and Migrations for Heterogeneous Multiprocessors

Total Pages: 16

File Type: PDF, Size: 1020 KB

University of Southampton Research Repository

Copyright © and Moral Rights for this thesis and, where applicable, any accompanying data are retained by the author and/or other copyright owners. A copy can be downloaded for personal non-commercial research or study, without prior permission or charge. This thesis and the accompanying data cannot be reproduced or quoted extensively from without first obtaining permission in writing from the copyright holder/s. The content of the thesis and accompanying research data (where applicable) must not be changed in any way or sold commercially in any format or medium without the formal permission of the copyright holder/s.

When referring to this thesis and any accompanying data, full bibliographic details must be given, e.g.

Thesis: Author (Year of Submission) "Full thesis title", University of Southampton, name of the University Faculty or School or Department, PhD Thesis, pagination.
Data: Author (Year) Title. URI [dataset]

UNIVERSITY OF SOUTHAMPTON
FACULTY OF ENGINEERING AND PHYSICAL SCIENCES
Electronics and Computer Science

Efficient and Secure Context Switching and Migrations for Heterogeneous Multiprocessors

by Ilias Vougioukas

Thesis for the degree of Doctor of Philosophy

June 3, 2019

ABSTRACT

Modern mobile processors are constrained by their limited energy resources and by demanding applications that require fast execution. Single-core designs use voltage/frequency throttling techniques that allow the system to switch between performant and efficient configurations to address this problem. Heterogeneous multicores also need to decide on which core to run, rather than just adjust voltage and frequency. Consequently, they remain an open subject in terms of near-optimal design and operation. This thesis investigates the performance and energy trade-off when migrating between heterogeneous cores and presents designs that enable low-overhead transitions between cores through a series of contributions.

The first contribution is based on a novel methodology that can directly compare the execution of heterogeneous cores. With it, an in-depth investigation of the effects on the memory system when migrating between such cores is conducted. The analysis reveals that heterogeneous multiprocessor system slowdown is asymmetrical: In-Order core performance relies on the memory subsystem, while Out-of-Order execution depends on accurate speculation. A proposed design minimises migration overheads in In-Order cores without making the design prohibitively complex to implement. This is achieved by sharing only the larger caches and the translation state.

The second contribution is a branch predictor design that transfers state when a migration occurs between heterogeneous cores. The design eliminates the warm-up for Out-of-Order cores by transferring only minimal state. This improves post-migration accuracy, potentially enabling better performance and energy efficiency.

Finally, security has become a major concern for shared or transferable components in multicore systems, namely the branch predictor. The third contribution in this thesis investigates mitigation techniques for recently discovered side-channel attacks. The proposed design flushes all but the most useful branch predictor state, ensuring isolation with minimal performance loss.

Contents

Declaration of Authorship xiii
Acknowledgements xv
Acronyms xix
Nomenclature xxi

1 Introduction 1
  1.1 Research Motivation 5
  1.2 Research Questions 7
  1.3 Contributions 8
  1.4 Outline 9

2 Heterogeneous Multiprocessors 11
  2.1 Heterogeneous Multiprocessor Types 11
    2.1.1 ISA Based Classification 12
    2.1.2 Commercial and Conventional HMPs 13
    2.1.3 Reasons for HMP Usage 14
  2.2 HMP Architecture and Microarchitecture 15
    2.2.1 HMP Space Exploration 15
    2.2.2 Architectural Optimisation in Conventional Designs 17
    2.2.3 Merging HMP Cores 18
    2.2.4 Unconventional Approaches to HMP Design 23
  2.3 HMP Management 29
    2.3.1 Estimation and Modelling Techniques 30
    2.3.2 Load Balancing and Work Stealing 31
    2.3.3 Management Using Learning 31
  2.4 Discussion 32

3 Heterogeneous Multiprocessor Memory Subsystem Optimisation 33
  3.1 Simulation vs Hardware Experimentation 34
  3.2 Exploration in Heterogeneous Architectures 35
    3.2.1 Exhaustive Exploration 35
    3.2.2 State Space Reduction 36
    3.2.3 Simulation Methodology for First-Order Migrations 38
  3.3 Heterogeneous Multiprocessor Organisation 41
    3.3.1 HMP Resource Sharing 42
    3.3.2 Internal Core Components 46
    3.3.3 Comparing Parallel Simulations 46
  3.4 Experimental Setup 48
    3.4.1 Infrastructure 48
  3.5 Migration Cost 51
    3.5.1 Scenario A: No Sharing 52
    3.5.2 Scenario B: Share Caches 53
    3.5.3 Scenario C: Share Caches and TLB 54
    3.5.4 Scenario D: Share the L2 Cache and TLB 55
    3.5.5 Comparison of Memory Sharing Schemes 57
  3.6 Efficient Migrations 57
  3.7 Discussion 59

4 Branch Predictor State Transfer for Heterogeneous Cores 61
  4.1 Branch Predictor Overview 63
    4.1.1 The Bimodal Predictor 64
    4.1.2 The TAGE Predictor 65
    4.1.3 Perceptron Predictors 67
  4.2 Branch Predictor State Transfer Mechanism 69
  4.3 Impact of the Branch Predictor State 71
    4.3.1 Calculating Potential Performance Gains 71
    4.3.2 Measuring Accuracy Improvements 73
    4.3.3 Calculating the Transfer Overheads 76
  4.4 Summary 77

5 Security Considerations for Shared Microarchitecture 79
  5.1 Speculative Side-Channels 81
    5.1.1 Branch Predictor Side-Channels 81
    5.1.2 Mitigation Techniques 82
    5.1.3 Threat Model 83
  5.2 Predictor Flexibility 83
    5.2.1 Steady-State and Transient Accuracy 84
    5.2.2 The Branch Retention Buffer 84
    5.2.3 Perceptron Amplified / Reinforced TAGE 87
  5.3 Experimental Setup 88
  5.4 Results 89
    5.4.1 Quantifying Transient Accuracy 89
    5.4.2 ParTAGE 91
    5.4.3 ParTAGE Results 94
  5.5 Discussion 96

6 Conclusions 99
  6.1 Summary of Conclusions 100
  6.2 Research Questions Evaluation 102
  6.3 Future Work 103
    6.3.1 Research into HMP Scheduling 103
    6.3.2 Investigating Memory Sharing for HMPs with Emerging Accelerators 103
    6.3.3 Extending Branch Predictor State Transfer to Other Designs 104
    6.3.4 Securing Other Shared Resources 104

A Supplementary Material 105
  A.1 Migration Overhead Study 105
  A.2 State Space Reduction to Linear Proof 107

B Performance and Energy Efficiency 111
  B.1 Performance 111
    B.1.1 Parallel Processor Performance 112
    B.1.2 Benchmarking 115
  B.2 Power and Energy Efficiency 116
    B.2.1 Processor Power Dissipation 117
    B.2.2 Methods of Power Loss Mitigation 119
      B.2.2.1 DVFS 119
      B.2.2.2 Core Migrations 119
  B.3 Performance-Energy Trade-offs 119
    B.3.1 Pareto Optimality 120
    B.3.2 The Pareto Frontier 121

C Selected Publications 123

Bibliography 158

List of Figures

1.1 Moore's Law depiction of exponential increase in transistor count. 2
1.2 A visual depiction of Dennard's scaling breakdown. 3
1.3 The gradual decrease of power per single transistor in processors. 4
2.1 A depiction of clumpiness in HMPs. 15
2.2 Approximation of the Pareto frontier of an HMP using PaTH. 17
2.3 Improved scheduling for HMPs using a Predictive Controller. 19
2.4 Performance comparison between Oracle, Reactive and Predictive Controllers. 20
2.5 Layout and wiring of two merged OoO cores. 22
2.6 Code reusability based on locality. 23
2.7 CoreFusion architecture fuses independent cores together. 24
2.8 Comparison of OoO and InO performance for a range of SMT configurations. 25
2.9 The MorphCore microarchitecture. 26
2.10 Core and memory delay and frequency as functions of voltage. 27
2.11 The TRIPS architecture.
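The flush described at the end of the abstract (keeping only the most useful branch predictor state across a context switch, developed in Chapter 5 as the Branch Retention Buffer) can be pictured with a toy sketch. The bimodal predictor below, its table size, and its policy of retaining a handful of saturated counters are illustrative assumptions for this sketch, not the mechanism evaluated in the thesis.

```cpp
// Toy 2-bit bimodal predictor with a "keep only the most useful state" flush.
// Hypothetical sketch: table size, indexing and the retention rule are
// assumptions made for illustration, not the thesis's Branch Retention Buffer.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

class BimodalPredictor {
public:
    explicit BimodalPredictor(std::size_t entries)
        : counters_(entries, 1) {}                    // start weakly not-taken

    bool predict(uint64_t pc) const {
        return counters_[index(pc)] >= 2;             // predict taken for 2 or 3
    }

    void update(uint64_t pc, bool taken) {
        uint8_t& c = counters_[index(pc)];
        if (taken && c < 3) ++c;
        if (!taken && c > 0) --c;
    }

    // On a context switch, keep up to maxRetained fully saturated counters
    // (a crude proxy for "most useful") and reset everything else, so the
    // next context cannot observe the discarded state.
    void flushExceptRetained(std::size_t maxRetained) {
        std::size_t retained = 0;
        for (uint8_t& c : counters_) {
            bool saturated = (c == 0 || c == 3);
            if (saturated && retained < maxRetained) {
                ++retained;                           // keep this entry
            } else {
                c = 1;                                // reset to weakly not-taken
            }
        }
    }

private:
    std::size_t index(uint64_t pc) const { return (pc >> 2) % counters_.size(); }
    std::vector<uint8_t> counters_;
};

int main() {
    BimodalPredictor bp(1024);
    for (int i = 0; i < 100; ++i) bp.update(0x400080, true);       // train one branch
    std::cout << "before flush: " << bp.predict(0x400080) << '\n'; // 1 (taken)
    bp.flushExceptRetained(16);
    std::cout << "after flush:  " << bp.predict(0x400080) << '\n';
    return 0;
}
```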
Recommended publications
  • Microprocessors History of Computing Nouf Assaid
    MICROPROCESSORS: HISTORY OF COMPUTING. NOUF ASSAID.

    Table of Contents: Introduction (2); Brief History (2); Microprocessors (7); Instruction Set Architectures (8); Von Neumann Machine (9); Microprocessor Design (12); Superscalar (13); RISC (16); CISC (20); VLIW (23); Multiprocessor (24); Future Trends in Microprocessor Design (25).

    Introduction. If we take a look around us, we would be sure to find a device that uses a microprocessor in some form or the other. Microprocessors have become a part of our daily lives, and it would be difficult to imagine life without them today. From digital wrist watches to pocket calculators, from microwaves to cars, toys, security systems, navigation and credit cards, microprocessors are ubiquitous. All this has been made possible by remarkable developments in semiconductor technology over the last 30 years, enabling the implementation of ideas that were previously beyond the average computer architect's grasp. In this paper, we discuss the various microprocessor technologies, starting with a brief history of computing. This is followed by an in-depth look at processor architecture, design philosophies, current design trends, RISC processors and CISC processors. Finally, we discuss trends and directions in microprocessor design.

    Brief Historical Overview. Mechanical Computers: A French engineer by the name of Blaise Pascal built the first working mechanical computer. This device was made completely from gears and was operated using hand cranks. This machine was capable of simple addition and subtraction, but a few years later a German mathematician by the name of Leibniz made a similar machine that could multiply and divide as well. After about 150 years, a mathematician at Cambridge, Charles Babbage, made his Difference Engine.
  • Webcore: Architectural Support for Mobile Web Browsing
    WebCore: Architectural Support for Mobile Web Browsing. Yuhao Zhu, Vijay Janapa Reddi. Department of Electrical and Computer Engineering, The University of Texas at Austin. [email protected], [email protected]

    Abstract. The Web browser is undoubtedly the single most important application in the mobile ecosystem. An average user spends 72 minutes each day using the mobile Web browser. Web browser internal engines (e.g., WebKit) are also growing in importance because they provide a common substrate for developing various mobile Web applications. In a user-driven, interactive, and latency-sensitive environment, the browser's performance is crucial. However, the battery-constrained nature of mobile devices limits the performance that we can deliver for mobile Web browsing. As traditional general-purpose techniques to improve performance and energy efficiency fall short, we must employ domain-specific knowledge while still maintaining general-purpose flexibility. In this paper, we first perform design-space exploration to identify appropriate general-purpose architectures that uniquely fit the characteristics of a popular Web browsing engine.

    [Fig. 1: Mobile Web browser share study conducted by our industry research partner on their employees' devices [2]: (a) time distribution of window focus; (b) time distribution of CPU processing. Similar observations were reported by NVIDIA on Tegra-based mobile handsets [3, 4].]

    ... network limited. However, this trend is changing. With about 10X improvement in round-trip time from 3G to LTE, network latency is no longer the only performance bottleneck [51]. Prior work has shown that over the past decade, network ...
  • Historical Perspective and Further Reading 162.E1
    2.21 Historical Perspective and Further Reading

    This section surveys the history of instruction set architectures over time, and we give a short history of programming languages and compilers. ISAs include accumulator architectures, general-purpose register architectures, stack architectures, and a brief history of ARMv7 and the x86. We also review the controversial subjects of high-level-language computer architectures and reduced instruction set computer architectures. The history of programming languages includes Fortran, Lisp, Algol, C, Cobol, Pascal, Simula, Smalltalk, C++, and Java, and the history of compilers includes the key milestones and the pioneers who achieved them.

    Accumulator Architectures. Hardware was precious in the earliest stored-program computers. Consequently, computer pioneers could not afford the number of registers found in today's architectures. In fact, these architectures had a single register for arithmetic instructions. Since all operations would accumulate in one register, it was called the accumulator, and this style of instruction set is given the same name. For example, EDSAC in 1949 had a single accumulator. (Accumulator: archaic term for register. On-line use of it as a synonym for "register" is a fairly reliable indication that the user has been around quite a while.) The three-operand format of RISC-V suggests that a single register is at least two registers shy of our needs. Having the accumulator as both a source operand and the destination of the operation fills part of the shortfall, but it still leaves us one operand short. That final operand is found in memory.
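The accumulator discussion above (one implicit register that is both source and destination, with the remaining operand in memory) can be contrasted with the three-operand RISC-V format in a small sketch. The opcode set, encoding and memory size below are hypothetical, chosen only to illustrate the style.

```cpp
// Minimal accumulator-machine sketch: the accumulator is the implicit source
// and destination, so each arithmetic instruction names only one memory
// operand.  The instruction set and memory size are illustrative assumptions.
#include <array>
#include <cstdint>
#include <iostream>
#include <vector>

enum class Op { Load, Add, Sub, Store, Halt };

struct Insn {
    Op op;
    uint16_t addr;      // the single explicit operand: a memory address
};

int32_t run(const std::vector<Insn>& program, std::array<int32_t, 256>& mem) {
    int32_t acc = 0;    // the lone architectural register
    for (const Insn& i : program) {
        switch (i.op) {
            case Op::Load:  acc = mem[i.addr];   break;
            case Op::Add:   acc += mem[i.addr];  break;   // acc <- acc + mem
            case Op::Sub:   acc -= mem[i.addr];  break;
            case Op::Store: mem[i.addr] = acc;   break;
            case Op::Halt:  return acc;
        }
    }
    return acc;
}

int main() {
    std::array<int32_t, 256> mem{};
    mem[10] = 5; mem[11] = 7;
    // c = a + b takes three instructions here; a three-operand RISC-V
    // "add x3, x1, x2" expresses the same computation in one.
    std::vector<Insn> prog = {
        {Op::Load, 10}, {Op::Add, 11}, {Op::Store, 12}, {Op::Halt, 0}};
    run(prog, mem);
    std::cout << "mem[12] = " << mem[12] << '\n';   // prints 12
    return 0;
}
```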
  • Performance and Energy Efficient Network-On-Chip Architectures
    Linköping Studies in Science and Technology, Dissertation No. 1130
    Performance and Energy Efficient Network-on-Chip Architectures
    Sriram R. Vangal
    Electronic Devices, Department of Electrical Engineering, Linköping University, SE-581 83 Linköping, Sweden. Linköping 2007.
    ISBN 978-91-85895-91-5, ISSN 0345-7524. Copyright Sriram R. Vangal, 2007. Author email: [email protected]

    Cover image: a chip microphotograph of the industry's first programmable 80-tile teraFLOPS processor, which is implemented in a 65-nm eight-metal CMOS technology. Printed by LiU-Tryck, Linköping University, Linköping, Sweden, 2007.

    Abstract. The scaling of MOS transistors into the nanometer regime opens the possibility for creating large Network-on-Chip (NoC) architectures containing hundreds of integrated processing elements with on-chip communication. NoC architectures, with structured on-chip networks, are emerging as a scalable and modular solution to global communications within large systems-on-chip. NoCs mitigate the emerging wire-delay problem and address the need for substantial interconnect bandwidth by replacing today's shared buses with packet-switched router networks. With on-chip communication consuming a significant portion of the chip power and area budgets, there is a compelling need for compact, low-power routers. While applications dictate the choice of the compute core, the advent of multimedia applications, such as three-dimensional (3D) graphics and signal processing, places stronger demands for self-contained, low-latency floating-point processors with increased throughput.
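The abstract's point that NoCs replace shared buses with packet-switched router networks can be illustrated with one common routing policy. The sketch below implements dimension-ordered (XY) routing on a 2D mesh; this is a generic textbook scheme used here purely as an illustration, not necessarily the routing used in this dissertation.

```cpp
// Dimension-ordered (XY) routing on a 2D mesh: a packet is routed fully in
// the X dimension first, then in Y.  Generic illustrative sketch only.
#include <iostream>
#include <string>
#include <vector>

struct Tile { int x, y; };

std::vector<std::string> routeXY(Tile src, Tile dst) {
    std::vector<std::string> hops;
    while (src.x != dst.x) {                 // resolve the X dimension first
        if (dst.x > src.x) { ++src.x; hops.push_back("E"); }
        else               { --src.x; hops.push_back("W"); }
    }
    while (src.y != dst.y) {                 // then the Y dimension
        if (dst.y > src.y) { ++src.y; hops.push_back("N"); }
        else               { --src.y; hops.push_back("S"); }
    }
    return hops;
}

int main() {
    auto path = routeXY({0, 0}, {3, 2});     // e.g. tile (0,0) to tile (3,2)
    for (const auto& hop : path) std::cout << hop << ' ';
    std::cout << "(" << path.size() << " hops)\n";   // E E E N N (5 hops)
    return 0;
}
```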
  • Computer Architecture: Dataflow (Part I)
    Computer Architecture: Dataflow (Part I). Prof. Onur Mutlu, Carnegie Mellon University.

    A Note on This Lecture: these slides are from 18-742 Fall 2012, Parallel Computer Architecture, Lecture 22: Dataflow I. Video: http://www.youtube.com/watch?v=D2uue7izU2c&list=PL5PHm2jkkXmh4cDkC3s1VBB7-njlgiG5d&index=19

    Some Required Dataflow Readings. Dataflow at the ISA level: Dennis and Misunas, "A Preliminary Architecture for a Basic Data Flow Processor," ISCA 1974; Arvind and Nikhil, "Executing a Program on the MIT Tagged-Token Dataflow Architecture," IEEE TC 1990. Restricted dataflow: Patt et al., "HPS, a New Microarchitecture: Rationale and Introduction," MICRO 1985; Patt et al., "Critical Issues Regarding HPS, a High Performance Microarchitecture," MICRO 1985.

    Other Related Recommended Readings. Dataflow: Gurd et al., "The Manchester Prototype Dataflow Computer," CACM 1985; Lee and Hurson, "Dataflow Architectures and Multithreading," IEEE Computer 1994. Restricted dataflow: Sankaralingam et al., "Exploiting ILP, TLP and DLP with the Polymorphous TRIPS Architecture," ISCA 2003; Burger et al., "Scaling to the End of Silicon with EDGE Architectures," IEEE Computer 2004.

    Today: Start Dataflow. Data Flow Readings: Data Flow (I): Dennis and Misunas, "A Preliminary Architecture for a Basic Data Flow Processor," ISCA 1974; Treleaven et al., "Data-Driven and Demand-Driven Computer Architecture," ACM Computing Surveys 1982; Veen, "Dataflow Machine Architecture," ACM Computing Surveys 1986; Gurd et al., "The Manchester Prototype Dataflow Computer," CACM 1985; Arvind and Nikhil, "Executing a Program on the MIT Tagged-Token Dataflow Architecture," IEEE TC 1990.
  • EVA: an Efficient Vision Architecture for Mobile Systems
    EVA: An Efficient Vision Architecture for Mobile Systems. Jason Clemons, Andrea Pellegrini, Silvio Savarese, and Todd Austin. Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan 48109. jclemons, apellegrini, silvio, [email protected]

    Abstract. The capabilities of mobile devices have been increasing at a momentous rate. As better processors have merged with capable cameras in mobile systems, the number of computer vision applications has grown rapidly. However, the computational and energy constraints of mobile devices have forced computer vision application developers to sacrifice accuracy for the sake of meeting timing demands. To increase the computational performance of mobile systems we present EVA. EVA is an application-specific heterogeneous multicore having a mix of computationally powerful cores with energy-efficient cores. Each core of EVA has computation and memory architectural enhancements tailored to the application traits of vision codes. Using a computer vision benchmarking suite, we evaluate the efficiency and performance of a wide range of EVA designs. We show that EVA can provide speedups of over 9x that of an embedded processor while reducing energy demands by as much as 3x. Categories and Subject Descriptors: C.1.4 [Parallel Architec- ...

    [Figure 1: Computer Vision Example. The figure shows a sock monkey where a computer vision application has recognized its face. The algorithm would utilize features such as corners and use their geometric relationship to accomplish this.]

    ... Watts over 250 mm2 of silicon, typical mobile processors are limited to a few Watts with typically 5 mm2 of silicon [4] [22]. To meet the limited computation capability of mobile processors, computer vision application developers reluctantly sacrifice image resolution, computational precision or application capabilities for lower quality versions of vision algorithms.
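A heterogeneous vision multicore of the kind the abstract describes pairs computationally powerful cores with energy-efficient ones, which implies some policy for deciding where each pipeline stage runs. The dispatch sketch below (stage names, per-stage cost estimates and the threshold) is entirely hypothetical and is not EVA's actual mechanism; it only illustrates the big/little split.

```cpp
// Hypothetical dispatch sketch for a heterogeneous vision multicore: stages
// with a high estimated compute intensity go to a "big" core, the rest to a
// "little" core.  Stage names, cost model and threshold are invented here.
#include <iostream>
#include <string>
#include <vector>

struct Stage {
    std::string name;
    double opsPerPixel;                      // rough compute-intensity estimate
};

int main() {
    const double bigCoreThreshold = 50.0;    // assumed cut-off, ops per pixel
    const std::vector<Stage> pipeline = {
        {"colour convert",     5.0},
        {"corner detection",  80.0},
        {"descriptor match", 120.0},
        {"overlay draw",       3.0},
    };
    for (const Stage& s : pipeline) {
        const char* core = (s.opsPerPixel >= bigCoreThreshold) ? "big" : "little";
        std::cout << s.name << " -> " << core << " core\n";
    }
    return 0;
}
```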
  • The Tablet That Can Replace Your Laptop What Makes Surface 3 the Tablet That Can Replace Your Laptop?
    The tablet that can replace your laptop. What makes Surface 3 the tablet that can replace your laptop? The best of a tablet, works like a laptop, runs Windows & Office, and a great pen experience:

    Ultra-thin, ultra-light, ultra-durable. Up to 10 hours of video playback. Watch videos hands-free with a 3-position Kickstand. Intel processor so you can run all of your favorite Windows desktop software. Be productive anywhere with a removable keyboard and 3-position Kickstand. Install millions of desktop apps. Works with old and new devices – like smartphones, cameras, and printers – more than any other OS. Connect to your devices with powerful ports including USB 3.0. For a limited time*, one year of Microsoft Office 365 Personal** comes included – a $69.99 value – with full versions of Excel, Word, PowerPoint, OneNote, and Outlook, plus OneDrive cloud storage. The optional Surface Pen feels and works just like a real pen, and now comes in 2 vibrant colors. One click opens OneNote, even when Surface 3 is locked or asleep. Palm block technology lets you rest your hand on the screen, and you can apply natural pressure for thicker ink. Use it however you want – with the Surface Pen, with touch, or with a mouse and keyboard.

    How does Surface 3 beat iPad Air 2? Unlike iPad Air 2, Surface 3 works with all of your favorite devices like printers, smartphones, and cameras. iPad Air 2 is a tablet; Surface 3 gives you all the ... You can install millions of desktop apps and hundreds of thousands of touch-first apps on Surface 3.
  • 6th Generation Intel® Core™ Processors Based on the Mobile U-Processor for IoT Solutions (Intel® Core™ i7-6600U, i5-6300U, and i3-6100U Processors)
    PLATFORM BRIEF: 6th Generation Intel® Core™ Mobile Processor Family, Internet of Things. 6th Generation Intel® Core™ Processors Based on the Mobile U-Processor for IoT Solutions (Intel® Core™ i7-6600U, i5-6300U, and i3-6100U Processors). Harness the Performance, Features, and Edge-to-Cloud Scalability to Build Tomorrow's IoT Solutions Today.

    Product Overview. Intel is proud to announce its 6th generation Intel® Core™ processor family featuring ultra-low-power, 64-bit, multicore processors built on the latest 14 nm technology. Designed for small form-factor applications, this multichip package (MCP) integrates a low-power CPU and platform controller hub (PCH) onto a common package substrate. The 6th generation Intel Core processor family offers dramatically higher CPU and graphics performance, a broad range of power and features scaling the entire Intel product line, and new, advanced features that boost edge-to-cloud Internet of Things (IoT) designs in a wide variety of markets. These processors run at 15W thermal design power (TDP) and are ideal for small, energy-efficient form-factor designs, including digital signage, point-of-sale ...

    Stunning Visual Performance. The 6th generation Intel Core processors utilize the new Gen9 graphics engine, which improves graphics performance by up to 34 percent.1 The improvements are demonstrated through faster 3-D graphics performance and rendering applications at low power. Video playback is also faster and smoother thanks to the new multiplane overlay capability. The new generation offers up to three independent audio streams and displays, Ultra HD 4K support, and workload consolidation for lower BOM costs and energy output. Users will also enjoy enhanced high-density streaming applications and optimized 4K videoconferencing with accelerated 4K hardware media codecs HEVC (8-bit), VP8, VP9, and VDENC encoding, decoding, and transcoding.
  • PART I ITEM 1. BUSINESS Industry We Are
    PART I ITEM 1. BUSINESS Industry We are the world’s largest semiconductor chip maker, based on revenue. We develop advanced integrated digital technology products, primarily integrated circuits, for industries such as computing and communications. Integrated circuits are semiconductor chips etched with interconnected electronic switches. We also develop platforms, which we define as integrated suites of digital computing technologies that are designed and configured to work together to provide an optimized user computing solution compared to ingredients that are used separately. Our goal is to be the preeminent provider of semiconductor chips and platforms for the worldwide digital economy. We offer products at various levels of integration, allowing our customers flexibility to create advanced computing and communications systems and products. We were incorporated in California in 1968 and reincorporated in Delaware in 1989. Our Internet address is www.intel.com. On this web site, we publish voluntary reports, which we update annually, outlining our performance with respect to corporate responsibility, including environmental, health, and safety compliance. On our Investor Relations web site, located at www.intc.com, we post the following filings as soon as reasonably practicable after they are electronically filed with, or furnished to, the U.S. Securities and Exchange Commission (SEC): our annual, quarterly, and current reports on Forms 10-K, 10-Q, and 8-K; our proxy statements; and any amendments to those reports or statements. All such filings are available on our Investor Relations web site free of charge. The SEC also maintains a web site (www.sec.gov) that contains reports, proxy and information statements, and other information regarding issuers that file electronically with the SEC.
  • Appendix A: Microprocessor Data Sheets
    Appendix A: Microprocessor Data Sheets

    Covered devices: Intel 8085; Zilog Z80; MOS Technology 6502; Motorola 6809; microcontrollers (single-chip microcomputers); Intel 8086 (and 80186 and 80286); Zilog Z8000; Motorola 68000; 32-bit microprocessors; Inmos Transputer.

    Intel 8085. Followed on from the 8080, which was a two-chip equivalent of the 8085. Not used in any home computers, but was extremely popular in early (late 1970s) industrial control systems. The same register set is used as in the 8080 (A, B, C, D, E, H, L, SP, PC, flags); the data bus and the lower half of the address bus are multiplexed onto AD7-AD0, so an 8212 latch driven by ALE is required to split the data and address buses. Start addresses of interrupt service routines: RESET 0000, TRAP 0024, RST5.5 002C, RST6.5 0034, RST7.5 003C; INTR is vectored by the interrupting device. Peripheral I/O devices: 8155 (3 ports, 256 bytes RAM), 8255 (3 ports), 8355 (2 ports, 2K ROM), 8755 (2 ports, 2K EPROM). Other support devices: 8251 (USART), 8202 (dynamic RAM controller), 8253 (CTC, 3 counters), 8257 (DMA controller), 8271 (FDC), 8257 (CRT controller). [Diagram: a CRT display system showing DMA control, de-multiplexed CPU buses, a character stage and the video signal to the CRT.]

    Zilog Z80. Probably the most popular 8-bit microprocessor. Used in home computers (Spectrum, Amstrad, Tandy), office computers and industrial controllers. Registers: A, F, B, C, D, E, H, L plus the alternate set A', F', B', C', D', E', H', L'; an 8-line data bus; interrupt vector register I and memory refresh register R (to refresh dynamic RAM); index registers IX and IY; stack pointer. Based on the Intel 8085, but possesses a second set of registers.
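The 8085 interrupt vectors listed in this excerpt follow a simple rule: a restart RST n transfers control to address 8 * n, with TRAP behaving like RST 4.5. A short sketch reproducing the table above:

```cpp
// The 8085 vectored interrupts jump to fixed start addresses: 8 * n bytes
// from address 0, where n is the restart number (TRAP acts as RST 4.5).
#include <cstdio>

struct RestartVector { const char* name; double rstNumber; };

int main() {
    const RestartVector vectors[] = {
        {"RESET",  0.0},   // 0x0000
        {"TRAP",   4.5},   // 0x0024
        {"RST5.5", 5.5},   // 0x002C
        {"RST6.5", 6.5},   // 0x0034
        {"RST7.5", 7.5},   // 0x003C
    };
    for (const RestartVector& v : vectors) {
        unsigned addr = static_cast<unsigned>(8 * v.rstNumber);
        std::printf("%-7s -> 0x%04X\n", v.name, addr);
    }
    return 0;
}
```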
  • The iPad Comparison Chart: Compare All Models of the iPad
    The iPad Comparison Chart: Compare All Models of the iPad. By Daniel Nations, iPad Expert (About.com, About Tech, iPad Hardware and Competition).

    The iPad has evolved since it was originally announced in January 2010. The iPad 2 added dual-facing cameras along with a faster processor and improved graphics, but the biggest jump was with the iPad 3, which increased the resolution of the display to 2,048 x 1,536 and added Siri for voice recognition. The iPad 4 was a super-charged iPad 3, with around twice the processing power, and the iPad Mini, released alongside the iPad 4, was Apple's first 7.9-inch iPad. Two years ago, the iPad Air became the first iPad to use a 64-bit chip, ushering the iPad into a new era.

    [Image: the entire iPad family: Pro, Air and Mini. Image © Apple, Inc.]

    The latest in Apple's lineup include the iPad Pro, which super-sizes the screen to 12.9 inches and is compatible with a new ...
  • NVIDIA Tegra 4 Family CPU Architecture 4-PLUS-1 Quad Core
    Whitepaper: NVIDIA Tegra 4 Family CPU Architecture, 4-PLUS-1 Quad Core.

    Table of Contents: Introduction (3); NVIDIA Tegra 4 Family of Mobile Processors (3); Benchmarking CPU Performance (4); Tegra 4 Family CPUs Architected for High Performance and Power Efficiency (6); Wider Issue Execution Units for Higher Throughput (6); Better Memory Level Parallelism from a Larger Instruction Window for Out-of-Order Execution (7); Fast Load-To-Use Logic Allows Larger L1 Data Cache (8); Enhanced Branch Prediction for Higher Efficiency (10); Advanced Prefetcher for Higher MLP and Lower Latency (10); Large Unified L2 Cache.