Accurate Throughput Prediction of Basic Blocks on Recent Intel Microarchitectures

Andreas Abel and Jan Reineke
Saarland University, Saarland Informatics Campus, Saarbrücken, Germany
abel, [email protected]

arXiv:2107.14210v1 [cs.PF] 29 Jul 2021

ABSTRACT

Tools to predict the throughput of basic blocks on a specific microarchitecture are useful to optimize software performance and to build optimizing compilers. In recent work, several such tools have been proposed. However, the accuracy of their predictions has been shown to be relatively low.

In this paper, we identify the most important factors for these inaccuracies. To a significant degree these inaccuracies are due to elements and parameters of the pipelines of recent CPUs that are not taken into account by previous tools. A primary reason for this is that the necessary details are often undocumented. In this paper, we build more precise models of relevant components by reverse engineering using microbenchmarks. Based on these models, we develop a simulator for predicting the throughput of basic blocks. In addition to predicting the throughput, our simulator also provides insights into how the code is executed.

Our tool supports all Intel Core microarchitecture generations released in the last decade. We evaluate it on an improved version of the BHive benchmark suite. On many recent microarchitectures, its predictions are more accurate than the predictions of state-of-the-art tools by more than an order of magnitude.

1. INTRODUCTION

Techniques to predict the performance of software are, for example, required to build compilers, to optimize code for specific platforms, and to analyze the worst-case execution time of software in safety-critical systems. Such techniques are also useful in the area of security, where performance differences may be used for side-channel attacks.

A specific problem in this area that has received a lot of attention in recent work is to predict the performance of basic blocks, under the assumption that all memory accesses lead to cache hits. A popular performance predictor from this class is Intel's IACA [1]. It is, for example, used by the Kerncraft tool [2] to construct Roofline [3] and ECM [4] models. It has also been used by Bhattacharyya et al. [5] to build a speculative execution attack.

However, IACA is closed source, does not support recent microarchitectures, and was discontinued by Intel in 2019. A number of alternatives were proposed in recent work, such as OSACA [6, 7], llvm-mca [8], CQA [9], Ithemal [10], and DiffTune [11].

The accuracy of the throughput predictions of previous tools compared to measurements on the actual hardware has been shown to be relatively low; Chen et al. [12] found that the average error of previous tools compared to measurements is between 9% and 36%.

We have identified two main reasons that contribute to these discrepancies between measurements and predictions: (1) the microarchitectural models of previous tools are not detailed enough; (2) evaluations were partly based on unsuitable benchmarks, and biased and inaccurate throughput measurements. In this paper, we address both of these issues.

In the first part, we develop a pipeline model that is significantly more detailed than those of previous tools. It is applicable to all Intel Core microarchitectures released in the last decade; the differences between these microarchitectures can be captured by a small number of parameters. A challenge in building such a model is that many of the relevant properties are undocumented. We have developed microbenchmark-based techniques to reverse-engineer these properties.

Based on this model, we then implement a simulator that can compute throughput predictions for basic blocks. In addition, it also provides further insights into how basic blocks are executed, such as the port usage for each instruction.

To compare basic block throughput predictors, Chen et al. [12] proposed the BHive benchmark suite, along with a tool for measuring throughputs. We discovered that this benchmark suite contains a significant number of benchmarks with ambiguous throughputs that can give an unfair advantage to some of the tools. We therefore discard the corresponding benchmarks. Moreover, we discovered that the measurement tool often provides inaccurate measurements, and it is based on a different definition of throughput than most previous throughput predictors. We significantly improve the accuracy of the measurement tool, and we extend it to support both throughput definitions.

Using this improved benchmark suite and measurement methodology, we compare our throughput predictor with a number of previous tools on nine different Intel Core microarchitectures. In most cases, the average error of our tool is around or below 1%, which is significantly lower than the error of previous tools.

To summarize, the main contributions of our paper are:

• A parametric pipeline model that is applicable to all Intel Core microarchitectures released in the last ten years (Section 2).
• A throughput predictor (Section 3) that is more accurate than previous work by more than an order of magnitude on several recent microarchitectures (Section 5).
• An improved set of benchmarks for validating throughput predictors, and a more accurate measurement methodology (Section 4).

2. PIPELINE MODEL

Figure 1 shows the general structure of the pipelines of recent Intel Core CPUs. At such a high level, all these CPUs are very similar. For developing an accurate performance predictor, however, a more detailed model is necessary. Unfortunately, the official documentation only provides limited information on these details. Unofficial sources are not always correct, and also do not provide all of the necessary information at the required level of detail.

Figure 1: Pipeline of Intel Core CPUs. [Block diagram. Front end: Instruction Cache, Predecoder, Instruction Queue (IQ), Decoder, DSB, MS, Instruction Decode Queue (IDQ). Execution engine (back end): Renamer / Allocator, Reorder Buffer, Scheduler, Ports 0 to 5. Memory: L1 Data Cache, L2 Cache.]

In this section, we describe the pipeline model that we have developed for accurate throughput prediction. Thus, the focus of the presentation is on those aspects that we have found to be relevant for predicting the performance of basic blocks. We describe the general structure and fill in the missing details. We obtained information on undocumented details through reverse engineering via microbenchmarks using hardware performance counters. For evaluating these microbenchmarks, we use nanoBench [13] and an extension to nanoBench that provides cycle-by-cycle performance data, similar to Brandon Falk's "Sushi Roll" approach [14].

The model presented in this section is significantly more detailed than any previous models described in the literature. It is applicable to all Intel Core microarchitectures released in the last ten years; maybe surprisingly, the differences between [...]

2.1 Predecoding

[...] mainly responsible for detecting where each instruction begins. This is not completely straightforward, as an instruction can be between 1 and 15 bytes long, and detecting its length can require inspecting several bytes of the instruction.

The predecoded instructions are inserted into the instruction queue (IQ). The length of this queue is a parameter of our model; it is between 20 and 25 on recent microarchitectures.

According to our experiments, the predecoder can predecode at most five instructions per cycle on all of the microarchitectures we consider. If there are, e.g., six instructions in a 16-byte block, in the first cycle, five instructions would be predecoded, and in the next cycle, only one instruction would be predecoded. Several sources claim that the predecoding limit is six instructions per cycle [15, 16, 17]. This does not agree with our experiments. The source for the claims in [15, 16, 17] might be a section in Intel's optimization manual [18] that mentions a limit of six instructions per cycle; however, this section is in a part of the manual that describes predecoding on old Intel Core 2 CPUs (which did not have a µop cache, and thus decoding was a more significant bottleneck). The manual does not describe the predecoding limits for more recent microarchitectures.

Instructions with a so-called length-changing prefix (LCP) are a special case. For such instructions, the predecoder has to use a slower length-decoding algorithm. This results in a penalty of three cycles for each such instruction.

For building an accurate simulator, it is important to know how instructions that cross a 16-byte boundary are predecoded; however, this is undocumented. Our experiments show that instructions are predecoded with the 16-byte block in which they end; they count toward the limit of five instructions in the corresponding cycle. We found that there is a one-cycle penalty if five instructions were predecoded in the current cycle, the next instruction crosses a 16-byte boundary, but the primary opcode is contained in the current 16-byte block; if there are only prefixes or escape opcodes of the next instruction in the current block, there is no penalty.

Besides the difference in the length of the IQ, the predecoders appear to be identical on all CPUs that we consider.

2.2 Decoding

The decoding unit fetches up to four instructions from the IQ. It decodes the instructions into a sequence of µops and sends them to the Instruction Decode Queue (IDQ, sometimes also called µop queue). The IDQ has between 28 and 70 entries on the microarchitectures we consider. This is a parameter of our model.

The decoding unit consists of one complex decoder and three simple decoders. The simple decoders can only decode instructions with one µop. The complex decoder always decodes the first of the fetched instructions, and it can generate up to four µops. Instructions with more than four µops are (at least partially) handled by the Microcode Sequencer (see Section 2.5).
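
The decoder rules stated above (one complex decoder that takes the first fetched instruction and emits up to four µops, plus three simple decoders for one-µop instructions) can be illustrated with a minimal Python sketch. It is not the authors' simulator: the function name `decode_groups` and the constants are hypothetical, and the sketch assumes that the IQ always holds enough instructions, that the IDQ never fills up, and that there is no interaction between the decoders beyond what this section states. Instructions with more than four µops are rejected, because their handling by the Microcode Sequencer is not described here.

```python
from typing import List

# Parameters taken from the text; the names are illustrative, not from the paper.
COMPLEX_MAX_UOPS = 4   # the complex decoder can generate up to four µops
NUM_SIMPLE = 3         # three simple decoders, each limited to one-µop instructions


def decode_groups(uops_per_instr: List[int]) -> List[List[int]]:
    """Group instruction indices by the cycle in which they would be decoded.

    uops_per_instr[i] is the µop count of instruction i, in program order.
    Simplifying assumption: the IQ is never empty and the IDQ is never full.
    """
    for n in uops_per_instr:
        if not 1 <= n <= COMPLEX_MAX_UOPS:
            # More than four µops involves the Microcode Sequencer (not modeled).
            raise ValueError(f"unsupported µop count: {n}")

    groups: List[List[int]] = []
    i = 0
    while i < len(uops_per_instr):
        # The complex decoder always takes the first fetched instruction.
        group = [i]
        i += 1
        # The simple decoders take following instructions while each has one µop.
        while (i < len(uops_per_instr)
               and len(group) < 1 + NUM_SIMPLE
               and uops_per_instr[i] == 1):
            group.append(i)
            i += 1
        groups.append(group)
    return groups


if __name__ == "__main__":
    # A two-µop instruction in a non-first position ends the current group.
    print(decode_groups([2, 1, 3, 1]))
```

In this sketch a multi-µop instruction that is not first in its group has to wait for the next cycle, since only the complex decoder can handle it. That is why instruction order alone can change the number of decode cycles under these assumptions.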