Valpont.Com, a Technology Content Platform

Total Pages: 16

File Type: pdf, Size: 1020Kb

Software Development for Embedded Multi-core Systems
A Practical Guide Using Embedded Intel® Architecture
Max Domeika

AMSTERDAM • BOSTON • HEIDELBERG • LONDON • NEW YORK • OXFORD • PARIS • SAN DIEGO • SAN FRANCISCO • SINGAPORE • SYDNEY • TOKYO
Newnes is an imprint of Elsevier

Cover image by iStockphoto

Newnes is an imprint of Elsevier
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
Linacre House, Jordan Hill, Oxford OX2 8DP, UK

Copyright © 2008, Elsevier Inc. All rights reserved.

Intel® and Pentium® are registered trademarks of Intel Corporation. *Other names and brands may be the property of others. The author is not speaking for Intel Corporation. This book represents the opinions of author. Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, visit Intel Performance Benchmark Limitations.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

Permissions may be sought directly from Elsevier's Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail: http://[email protected]. You may also complete your request online via the Elsevier homepage (http://elsevier.com), by selecting "Support & Contact" then "Copyright and Permission" and then "Obtaining Permissions."

Library of Congress Cataloging-in-Publication Data
Domeika, Max.
Software development for embedded multi-core systems : a practical guide using embedded Intel architecture / Max Domeika.
p. cm.
ISBN 978-0-7506-8539-9
1. Multiprocessors. 2. Embedded computer systems. 3. Electronic data processing--Distributed processing. 4. Computer software--Development. I. Title.
QA76.5.D638 2008
004.35--dc22 2008006618

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

For information on all Newnes publications visit our Web site at www.books.elsevier.com

ISBN: 978-0-7506-8539-9

Typeset by Charon Tec Ltd (A Macmillan Company), Chennai, India, www.charontec.com
Printed in the United States of America
08 09 10 11 10 9 8 7 6 5 4 3 2 1

Contents

Preface
Acknowledgments

Chapter 1: Introduction
  1.1 Motivation
  1.2 The Advent of Multi-core Processors
  1.3 Multiprocessor Systems Are Not New
  1.4 Applications Will Need to be Multi-threaded
  1.5 Software Burden or Opportunity
  1.6 What is Embedded?
  1.7 What is Unique About Embedded?
  Chapter Summary

Chapter 2: Basic System and Processor Architecture
  Key Points
  2.1 Performance
  2.2 Brief History of Embedded Intel® Architecture Processors
  2.3 Embedded Trends and Near Term Processor Impact
  2.4 Tutorial on x86 Assembly Language
  Chapter Summary
  Related Reading

Chapter 3: Multi-core Processors and Embedded
  Key Points
  3.1 Motivation for Multi-core Processors
  3.2 Multi-core Processor Architecture
  3.3 Benefits of Multi-core Processors in Embedded
  3.4 Embedded Market Segments and Multi-core Processors
  3.5 Evaluating Performance of Multi-core Processors
  Chapter Summary
  Related Reading

Chapter 4: Moving to Multi-core Intel Architecture
  Key Points
  4.1 Migrating to Intel Architecture
  4.2 Enabling an SMP OS
  4.3 Tools for Multi-Core Processor Development
  Chapter Summary
  Related Reading

Chapter 5: Scalar Optimization and Usability
  Key Points
  5.1 Compiler Optimizations
  5.2 Optimization Process
  5.3 Usability
  Chapter Summary
  Related Reading

Chapter 6: Parallel Optimization Using Threads
  Key Points
  6.1 Parallelism Primer
  6.2 Threading Development Cycle
  Chapter Summary
  Related Reading

Chapter 7: Case Study: Data Decomposition
  Key Points
  7.1 A Medical Imaging Data Examiner
  Chapter Summary

Chapter 8: Case Study: Functional Decomposition
  Key Points
Recommended publications
  • QATS001 Agenda
    Overview of Intel® QuickAssist Technology. Dr. Nash Palaniswamy, Intel Corporation, and Guest Speaker Richard Kaufmann, HP. QATS001.
    Agenda: • Intel® QuickAssist Technology Strategy • Overview of Solutions – Intel® Embedded Processor for 2008 (Tolapai), FSB-FPGA, AAL, etc. • A viewpoint from HP* on Accelerators • Summary
    Intel® QuickAssist Technology – Comprehensive Approach to Acceleration: • Multiple accelerator and attach options with software and ecosystem support • Performance and scalability based on customer needs and priorities. Comprehensive Initiative to optimize the use and deployment of accelerators on Intel® architecture platforms.
    Design Alternatives as Diverse as Application Requirements. [Figure labels: Accelerator Design, Accelerator Attach, Software Models, Customized Solutions, Power Efficiency, Software Scalability, Maximum Performance]
    Intel® Embedded Processor for 2008 (Tolapai). Single Die integrates: - IA 32 based Core @ 600, 1066 and 1200MHz - DDR2 memory controller (MCH) - PCI Express* Technology - Standard IA PC peripherals (ICH) - 3x Gigabit Ethernet MACs - 3x TDM high-speed serial interfaces for 12 T1/E1 or SLIC/CODEC connections - Intel® QuickAssist Integrated Accelerator. Vital Statistics: - 148 million transistors - 1,088-ball FCBGA w/1.092 mm pitch - 37.5 mm x 37.5 mm package. Intel's first integrated x86 processor, chipset and memory controller since 1994's 80386EX.
    Intel® QuickAssist® FPGA Acceleration Platform (Source: Pat Gelsinger Keynote, Fall IDF 2006). 1. Open Ubiquitous Standards Based Approach: PCI Express* Gen1, PCI Express* Gen2, Gen 3 in the future. 2. Enable third party FSB-FPGA Modules – targeted for Financial Services Industry, Oil and Gas, Life Sciences, Digital Health, etc. FSB-FPGA Modules Targeted 1H08. 3. Intel® QuickAssist Technology Accelerator Abstraction Layer (AAL) that seamlessly allows the SW to access acceleration across various technologies. [Figure labels: Geneseo – PCI Express*; Source: Intel Internal]
  • Real-Time Visualization of Aerospace Simulations Using Computational Steering and Beowulf Clusters
    The Pennsylvania State University, The Graduate School, Department of Computer Science and Engineering. REAL-TIME VISUALIZATION OF AEROSPACE SIMULATIONS USING COMPUTATIONAL STEERING AND BEOWULF CLUSTERS. A Thesis in Computer Science and Engineering by Anirudh Modi. Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy, August 2002.
    We approve the thesis of Anirudh Modi: Paul E. Plassmann, Associate Professor of Computer Science and Engineering, Thesis Co-Advisor, Co-Chair of Committee; Lyle N. Long, Professor of Aerospace Engineering, Professor of Computer Science and Engineering, Thesis Co-Advisor, Co-Chair of Committee; Rajeev Sharma, Associate Professor of Computer Science and Engineering; Padma Raghavan, Associate Professor of Computer Science and Engineering; Mark D. Maughmer, Professor of Aerospace Engineering; Raj Acharya, Professor of Computer Science and Engineering, Head of the Department of Computer Science and Engineering.
    ABSTRACT: In this thesis, a new, general-purpose software system for computational steering has been developed to carry out simulations on parallel computers and visualize them remotely in real-time. The steering system is extremely lightweight, portable, robust and easy to use. As a demonstration of the capabilities of this system, two applications have been developed. A parallel wake-vortex simulation code has been written and integrated with a Virtual Reality (VR) system via a separate graphics client. The coupling of computational steering of parallel wake-vortex simulation with VR setup provides us with near real-time visualization of the wake-vortex data in stereoscopic mode. It opens a new way for the future Air-Traffic Control systems to help reduce the capacity constraint and safety problems resulting from the wake-vortex hazard that are plaguing the airports today.
  • Roccat Ryos Mk Pro Gigabyte Force K7
    On the cover: WESTERN DIGITAL BLACK2: SSD and HDD together in one chassis! PG. 82. DO-IT-YOURSELF STEAM BOX: How to get SteamOS running on your PC. PG. 66. GIGABYTE BRIX PRO: Full-on desktop power you can hold in your hand! PG. 53. MINIMUM BS • MARCH 2014 • www.maximumpc.com. THE CHEAPSKATE'S GUIDE TO POWER COMPUTING: • Tips for saving on hardware • Pointers to the best deal sites • A guide to free and cheap digital content • Instructions for building a $600 PC • And so much more! GAMING KEYBOARDS: We review six high-performance planks. PG. 40.
    Table of contents (where we put stuff), inside MARCH 2014. On the Cover: Illustration by Georg Zumbulev. QUICKSTART. THE NEWS: Hardware vendors commit to SteamOS; Windows XP death watch; Gigabit Internet over phone lines? THE LIST: The 10 coolest things we saw at CES. HEAD TO HEAD: Nvidia GeForce Experience vs. AMD Gaming Evolved beta. FEATURES. R&D. Razer Project Christine. HOW TO: What Windows could learn from smartphones; fine-tune your SSD; edit photos with Gimp.
  • Carrier Grade Virtualization
    Carrier Grade Virtualization Leveraging virtualization in Carrier Grade Systems Abstract Network Equipment Providers (NEPs) have been building networking infrastructure equipment able to deliver “carrier grade” services, typically mission-critical services such as voice telephony. In decades past, NEPs have achieved high degrees of availability through purpose-built hardware and software implementations. Today they increasingly build on COTS (Commercial Off The Shelf) hardware and Open Source Software (OSS), freeing their engineering resources to focus on core telephony competencies. The move to COTS and OSS requires that these hardware and software components be available from an ecosystem of suppliers, and that they interoperate seamlessly. Bodies such as The Linux Foundation (LF), the Service Availability Forum (SA Forum) and PICMG have defined standards and specifications such as carrier grade OSes (CGLinux), Service Availability Forum APIs and AdvancedTCA hardware to target carrier grade applications. Recent advances have made virtualization appealing for carrier class equipment by permitting significant cost reduction through consolidation of workloads and of physical hardware. Virtualization also transparently lets NEPs and other OEMs (Original Equipment Manufacturers) leverage multi-core processors to run legacy software designed for uniprocessor hardware. However, virtualization needs to meet specific requirements to enable network equipment deploying this technology to meet industry expectations for carrier grade systems. This
  • Programming Languages, Database Language SQL, Graphics, GOSIP
    NISTIR 4951 (Supersedes NISTIR 4871). VALIDATED PRODUCTS LIST 1992 No. 4: PROGRAMMING LANGUAGES, DATABASE LANGUAGE SQL, GRAPHICS, GOSIP, POSIX, COMPUTER SECURITY. Judy B. Kailey, Editor. U.S. DEPARTMENT OF COMMERCE, Technology Administration, National Institute of Standards and Technology, Computer Systems Laboratory, Software Standards Validation Group, Gaithersburg, MD 20899. October 1992 (Supersedes July 1992 issue). U.S. DEPARTMENT OF COMMERCE: Barbara Hackman Franklin, Secretary. TECHNOLOGY ADMINISTRATION: Robert M. White, Under Secretary for Technology. NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY: John W. Lyons, Director.
    FOREWORD: The Validated Products List is a collection of registers describing implementations of Federal Information Processing Standards (FIPS) that have been validated for conformance to FIPS. The Validated Products List also contains information about the organizations, test methods and procedures that support the validation programs for the FIPS identified in this document. The Validated Products List is updated quarterly. TABLE OF CONTENTS 1.
  • Open Communication Platforms « Streamline Your Broadband, Cable and Mobile Infrastructure System Designs
    Product portfolio 2012. » Open Communication Platforms « Streamline your broadband, cable and mobile infrastructure system designs. If it's embedded, it's Kontron. www.kontron.com/OCP
    » Achieving a lower Total Cost of Ownership (TCO) « Kontron integrated, standardized platforms 'future-proof' TEM's network applications. In the rush to deploy new network solutions to the carrier market, Telecom Equipment Manufacturers (TEMs) cannot afford to waste valuable resources on non-core competencies. From initial product conception to market deployment, Kontron works with TEMs to alleviate and manage all the challenges of designing and integrating the complex hardware and software components – right up to the application layer. This enables TEMs to remain focused on developing their value-add application innovations, and to deploy them on the most appropriate carrier grade, integrated off-the-shelf platform. Kontron's comprehensive lifecycle management and various services to assist clients to seamlessly migrate to next generation platform and bladed technologies ensures the "future proofing" of TEM applications for a much lower overall cost of ownership. Open Communication Platforms (OCP) deliver highly dense and extreme design flexibility, plus all the required mission-critical RAS features (reliability, availability, serviceability) for wireless, broadband and carrier data center networks.
    Five steps to a lower TCO with Kontron OCP: 1. Design with a common network hardware platform: Multi-Core, 10G/40G Fabric, I/O, Storage. 2. Ensure scalability: Keep pace with higher bandwidth demands. 3. Reduce development costs and timelines: Achieve a faster time to revenue with new network infrastructure. 4. Reduce system-level footprint: Modular design enables multi-functional bladed applications on single platform. 5.
  • System Design for Telecommunication Gateways
    SYSTEM DESIGN FOR TELECOMMUNICATION GATEWAYS. Alexander Bachmutsky, Nokia Siemens Networks, USA. A John Wiley and Sons, Ltd., Publication. This edition first published 2011. © 2011 John Wiley & Sons, Ltd. Registered office: John Wiley & Sons Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom. For details of our global editorial offices, for customer services and for information about how to apply for permission to reuse the copyright material in this book please see our website at www.wiley.com. The right of the author to be identified as the author of this work has been asserted in accordance with the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
  • Comparison of Platform Virtual Machines - Wikipedia
    Comparison of platform virtual machines. From Wikipedia, the free encyclopedia (http://en.wikipedia.org/wiki/Comparison_of_platform...). The table below compares basic information about platform virtual machine (VM) packages.
    Contents: 1 General Information; 2 More details; 3 Features; 4 Other emulators; 5 See also; 6 References; 7 External links
    General Information
    Name | Creator | Host CPU | Guest CPU
    Bochs | Kevin Lawton | any | x86, AMD64
    CHARON-AXP | Stromasys | x86 (64 bit) | DEC Alphaserver
    CHARON-VAX | Stromasys | x86, IA-64 | VAX
    Containers (also 'Zones') | Sun Microsystems | x86, x86-64, SPARC (portable: not tied to hardware) | (Same as host)
    Cooperative Linux | Dan Aloni helped by other developers (1) | x86[1] | (Same as parent)
    Denali | University of Washington | x86 | x86
    DOSBox | Peter Veenstra and Sjoerd with community help | any | x86
    DOSEMU | Community Project | x86, AMD64 | x86
    FreeVPS | PSoft (http://www.FreeVPS.com) | x86, AMD64 | compatible
    GXemul | Anders Gavare | any | ARM, MIPS, M88K, PowerPC, SuperH
    Hercules | Written by Roger Bowler, currently maintained by Jay Maynard | any | z/Architecture
    Hyper-V | Microsoft | x64 + hardware-assisted virtualization (Intel VT or AMD-V) | x64, x86
    Imperas OVP Tools | OVP Imperas [1] (http://www.imperas.com) [2] (http://www.ovpworld | x86 | OR1K, MIPS32, ARC600/ARC700, A (can use all OVP compliant models, u can write own to pu OVP APIs)
    iCore Virtual Accounts | iCore Software
  • Optimizing Authenticated Encryption Algorithms
    Masaryk University, Faculty of Informatics. Optimizing authenticated encryption algorithms. Master's Thesis. Ondrej Mosnáček. Brno, Fall 2017.
    This is where a copy of the official signed thesis assignment and a copy of the Statement of an Author is located in the printed version of the document.
    Declaration: Hereby I declare that this paper is my original authorial work, which I have worked out on my own. All sources, references, and literature used or excerpted during elaboration of this work are properly cited and listed in complete reference to the due source. Ondrej Mosnáček. Advisor: Ing. Milan Brož.
    Acknowledgement: I would like to thank my advisor, Milan Brož, for his guidance, patience, and helpful feedback and advice. Also, I would like to thank my girlfriend Ludmila, my family, and my friends for their support and kind words of encouragement. "If I had more time, I would have written a shorter letter." (Blaise Pascal)
    Abstract: In this thesis, we look at authenticated encryption with associated data (AEAD), which is a cryptographic scheme that provides both confidentiality and integrity of messages within a single operation. We look at various existing and proposed AEAD algorithms and compare them both in terms of security and performance. We take a closer look at three selected candidate families of algorithms from the CAESAR competition. Then we discuss common facilities provided by the two most common CPU architectures – x86 and ARM – that can be used to implement cryptographic algorithms efficiently.
  • The 2007 European ICT Prize the 3 European ICT Grand Prize Winners
    CeBIT - Hanover, 16 March 2007. The 2007 European ICT Prize. www.ict-prize.org
    The 3 European ICT Grand Prize Winners:
    Telepo (SE) for Telepo Business Communication solution: Telepo's fixed-mobile convergence solution enables efficient business communication anytime, anywhere. [email protected] - www.telepo.com
    Transitive Corporation (UK) for QuickTransit®: Virtualization product that eliminates the hardware-software dependency. [email protected] - www.transitive.com
    Treventus Mechatronics (AT) for ScanRobot: High-speed (up to 2,400 pages/hour) book scanner with integrated fully automatic page-turning for bound documents. [email protected] - www.treventus.com
    The 2007 European ICT Prize Winners:
    A3M (DE) for A3M Tsunami Alarm System: Global Tsunami Warning System for mobile phones, protecting human lives and health. www.tsunami-alarm-system.com
    San Disk (IL) for mToken: Combines PKI-based two factor authentication, secure storage and smartcard-based applications in one USB device. www.m-systems.com/mtoken
    Byometric Systems (DE) for Large scale identification Solution based on iris-recognition: Biometric access system based on iris-recognition identification in banking environment. www.byometric.com
    T-VIPS (NO) for T-VIPS TVG Video Gateways: Provides professional video market with innovative IP transport solutions. www.t-vips.com
    Telepo (SE) for Telepo Business Communication solution: Telepo's fixed-mobile convergence solution enables efficient business communication anytime, anywhere. www.telepo.com
    DIGIMIND (FR) for DIGIMIND FINDER: New vertical meta-search engine revolutionizing the professional web search experience. www.digimind.com
    TEMIS (FR) for Luxid®: Innovative information discovery solution serving the information intelligence needs of business/corporations.
    g.tec Guger Technologies (AT) for Brain-Computer Interface: Interface for cursor control and writing by
  • Ruefenachtm 2021.Pdf (2.599Mb)
    Towards Larger Scale Collective Operations in the Message Passing Interface. Martin Rüfenacht. Doctor of Philosophy, Institute of Computing Systems Architecture, School of Informatics, University of Edinburgh, 2020.
    Abstract: Supercomputers continue to expand both in size and complexity as we reach the beginning of the exascale era. Networks have evolved, from simple mechanisms which transport data to subsystems of computers which fulfil a significant fraction of the workload that computers are tasked with. Inevitably with this change, assumptions which were made at the beginning of the last major shift in computing are becoming outdated. We introduce a new latency-bandwidth model which captures the characteristics of sending multiple small messages in quick succession on modern networks. Contrary to other models representing the same effects, the pipelining latency-bandwidth model is simple and physically based. In addition, we develop a discrete-event simulation, Fennel, to capture non-analytical effects of communication within models. AllReduce operations with small messages are common throughout supercomputing, particularly for iterative methods. The performance of network operations are crucial to the overall time-to-solution of an application as a whole. The Message Passing Interface standard was introduced to abstract complex communications from application level development. The underlying algorithms used for the implementation to achieve the specified behaviour, such as the recursive doubling algorithm for AllReduce, have to evolve with the computers on which they are used. We introduce the recursive multiplying algorithm as a generalisation of recursive doubling.
  • Intel's Core 2 Family
    Intel's Core 2 family - TOCK lines. References. Dezső Sima. Vers. 1.0, January 2019.
    Contents: 1. Introduction; 2. The Core 2 line; 3. The Nehalem line; 4. The Sandy Bridge line; 5. The Haswell line; 6. The Skylake line; 7. The Kaby Lake line; 8. The Kaby Lake Refresh line; 9. The Coffee Lake line; 10. The Coffee Lake line Refresh; 11. The Cannon Lake line (outlook); 12. Sunny Cove; 13. References
    13. References
    [1] Singhal R., "Next Generation Intel Microarchitecture (Nehalem) Family: Architecture Insight and Power Management," IDF Taipei, Oct. 2008, http://intel.wingateweb.com/taiwan08/published/sessions/TPTS001/FA08%20IDFTaipei_TPTS001_100.pdf
    [2] Bryant D., "Intel Hitting on All Cylinders," UBS Conf., Nov. 2007, http://files.shareholder.com/downloads/INTC/0x0x191011/e2b3bcc5-0a37-4d06-aa5a-0c46e8a1a76d/UBSConfNov2007Bryant.pdf
    [3] Fisher S., "Technical Overview of the 45 nm Next Generation Intel Core Microarchitecture (Penryn)," IDF 2007, ITPS001, http://isdlibrary.intel-dispatch.com/isd/89/45nm.pdf
    [4] Pabst T., "The New Athlon Processor: AMD Is Finally Overtaking Intel," Tom's Hardware, August 9, 1999, http://www.tomshardware.com/reviews/athlon-processor,121-2.html
    [5] Carmean D., "Inside the Pentium 4 Processor Micro-architecture," Aug. 2000, http://people.virginia.edu/~zl4j/CS854/pda_s01_cd.pdf
    [6] Shimpi A. L. & Clark J., "AMD Opteron 248 vs. Intel Xeon 2.8: 2-way Web Servers go Head to Head," AnandTech, Dec. 17 2003, http://www.anandtech.com/showdoc.aspx?i=1935&p=1
    [7] Völkel F., "Duel of the Titans: Opteron vs. Xeon : Hammer Time: AMD On The Attack," Tom's Hardware, Apr.