Heterogeneous Processors


COMP 635: Seminar on Heterogeneous Processors
www.cs.rice.edu/~vsarkar/comp635
Vivek Sarkar, Department of Computer Science, Rice University
[email protected]
August 27, 2007

Course Goals
• Gain familiarity with heterogeneous processor systems by studying a few sample design points in the spectrum
• Study and critique current software environments for these designs (programming models, compilers, tools, runtimes)
• Discuss research challenges in advancing the state of the art of software for heterogeneous processors
• Target audience: software, hardware, and application researchers interested in building or using heterogeneous processor systems, or in understanding the strengths and weaknesses of heterogeneous processors w.r.t. their research areas

Course Organization
• Class dates (12 lectures)
  — 8/27, 9/10, 9/20 (Thurs), 9/24, 10/1, 10/8, 10/22, 10/29, 11/5, 11/19, 11/26, 12/3
  — No classes on 9/3 (Labor Day), 10/15 (Midterm Recess), 11/12 (Supercomputing 2007 conference week)
  — No class on 9/17 (Mon); we will meet on 9/20 (Thurs) instead that week
• Time & Place
  — Default: Mondays, 3:30pm - 4:30pm, DH 2014
  — Exception: time & place for the 9/20 (Thurs) lecture TBD
  — 30 minutes reserved after lecture for discussion (optional)
• Office Hours (DH 3131)
  — 11am - 12noon, Fridays from 8/31/07 to 12/7/07
• OWL-Space repository: COMP 635 F07
• Grading
  — Satisfactory/unsatisfactory grade for students taking the seminar for credit
    – Others should register officially as auditors, if possible
  — For a satisfactory grade, you need to
    1. Attend at least 50% of lectures
    2. Submit a 4-page project/study report by 12/7/07 (the report can be prepared in a group; just plan on 4 pages/person in that case)
  — Optional in-class presentation of project/study report on 12/3/07

Course Content
• Introduction to Heterogeneous Processors and their Programming Models (1 lecture)
• Cell Processor and Cell SDK (2 lectures)
• Nvidia GPU and CUDA programming environment (2 lectures)
• DRC FPGA Coprocessor Module and Celoxica Programming Environment (1 lecture)
• Clearspeed Accelerator and SDK (1 lecture)
• Imagine Stream Processor (1 lecture)
• Microsoft Accelerator Library (1 lecture)
• Vector and SIMD processors -- a historical perspective (1 lecture)
• Programming Model and Runtime Desiderata for future Heterogeneous Processors (1 lecture)
• Student presentations (1 lecture)

COMP 635 Lecture 1: Introduction to Heterogeneous Processors and their Programming Models

Acknowledgments
• Georgia Tech ECE 6100, Module 14
  — Vince Mooney, Krishna Palem, Sudhakar Yalamanchili
  — http://www.ece.gatech.edu/academic/courses/fall2006/ece6100/Class/index.html
• MIT 6.189 IAP 2007, Lecture 2
  — "Introduction to the Cell Processor", Michael Perrone
  — http://cag.csail.mit.edu/ps3/lectures/6.189-lecture2-cell.pdf
• UIUC ECE 497, Lecture 16
  — courses.ece.uiuc.edu/ece412/lectures/lecture16.ppt
• UIUC ECE 498 AL1, Programming Massively Parallel Processors
  — David Kirk, Wen-mei Hwu
  — http://courses.ece.uiuc.edu/ece498/al1/Syllabus.html

Heterogeneous Processors
[Block diagram: a general-purpose processor (GPP) connected to main memory, a memory-transfer module (MTM), and several accelerators (ACC), each with its own local memory.]
• The general-purpose processor orchestrates activity; the memory-transfer module schedules system-wide bulk data movement
• Accelerators can use scheduled, streaming communication, or can operate on locally-buffered data pushed to them in advance
• Accelerated activities and their associated private data are localized for bandwidth, power, and efficiency
• Motivation:
  1. Different parts of programs have different requirements
    — Control-intensive portions need good branch predictors, speculation, and big caches to achieve good performance
    — Data-processing portions need lots of ALUs and have simpler control flows
  2. Power consumption
    — Features like branch prediction and out-of-order execution tend to have very high power/performance ratios
    — Applications often have time-varying performance requirements

Sample Application Domains for Heterogeneous Processors
• Cell Processor
  — Medical imaging, Drug discovery, Reservoir modeling, Seismic analysis, …
• GPU (e.g., Nvidia)
  — Computer-aided design (CAD), Digital content creation (DCC), emerging HPC applications, …
• FPGA (e.g., Xilinx DRC)
  — HPC, Petroleum, Financial, …
• HPC accelerators (e.g., Clearspeed)
  — HPC, Network processing, Graphics, …
• Stream Processors (e.g., Imagine)
  — Image processing, Signal processing, Video, Graphics, …
• Others
  — TCP/IP offload, Crypto, …

Programming Models for Heterogeneous Processors
• Data Parallelism
• Single Program Multiple Data (SPMD)
• Pipelining
• Work Queue
• Fork Join
• Message Passing
• Storage Models: Shared vs. Local vs. Partitioned Memories
• Hybrid combinations of the above
Only a limited subset of these models is in production use today ==> programming model implementations for heterogeneous processors will have to grow to accommodate new application domains and new classes of programmers.
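To make the host-orchestrated pattern of the last two slides concrete, here is a minimal sketch of the fork-join, partitioned-memory offload style: the general-purpose processor pushes data into the accelerator's local memory in advance, launches the accelerated work, and joins before reading results back. CUDA is used only as a convenient concrete notation; the kernel, sizes, and names are illustrative and not taken from the course materials.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Illustrative accelerated portion: a simple element-wise scale.
__global__ void scale(float *data, float factor, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int n = 1 << 20;
    float *host = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) host[i] = 1.0f;

    // Accelerator-local buffer (the "local memory" attached to the ACC).
    float *dev;
    cudaMalloc(&dev, n * sizeof(float));

    // Host (GPP) orchestrates: push data to the accelerator in advance,
    // then launch the accelerated work asynchronously on a stream.
    // (For genuinely overlapped transfers the host buffer would be pinned
    // via cudaMallocHost; plain malloc keeps the sketch short.)
    cudaStream_t stream;
    cudaStreamCreate(&stream);
    cudaMemcpyAsync(dev, host, n * sizeof(float), cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(dev, 2.0f, n);   // fork
    cudaMemcpyAsync(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost, stream);

    cudaStreamSynchronize(stream);                              // join
    printf("host[0] = %f\n", host[0]);

    cudaStreamDestroy(stream);
    cudaFree(dev);
    free(host);
    return 0;
}
```

The same copy/launch/synchronize shape recurs in the Cell and FPGA environments covered later, expressed through their own SDK primitives.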
Heterogeneous Processor Spectrum
• Dimension 1: Distance of accelerator from main processor
• Dimension 2: Hardware customization in accelerator
[Figure: design points, including heterogeneous multicore, arranged along these two dimensions; a second version of this slide marks the regions along both dimensions that are the focus of this course.]

Spectrum of Programmers for Heterogeneous Processors
• Application-level Users
  — Plug & play experience by using ISV frameworks such as MATLAB and Mathematica
• Library-level Programmers
  — Portable library interface that works across homogeneous and heterogeneous processors
• Language-level Programmers
  — Portable programming language that works across homogeneous and heterogeneous processors
  — Conspicuous lack of new languages for heterogeneous processors, especially languages with managed runtimes!
• SDK-level Programmers
  — C-based compilers and tools that are specific to a given heterogeneous processor
[A second version of this slide highlights the focus of this course.]

Cell Broadband Engine (BE)
[Figure-only slide.]

Cell Performance
[Figure-only slide.]

Cell Temperature Distribution
• Power and heat are key constraints

Code Partitioning for Cell
[Figure: program flow graph and call graph, with a key distinguishing flow-graph and call-graph nodes and edges and marking which regions are compiled for the PPE vs. the SPE after outlining and cloning.]
• Outlining: extract a parallel loop into a separate procedure
• Cloning: make separate copies for the PPE and the SPE, including clones of all procedures called from the loop
• Coordination: insert operations on signal registers and mailbox queues in the PPE and SPE codes
• Reference: "Using advanced compiler technology to exploit the performance of the Cell Broadband Engine architecture", A. Eichenberger et al., IBM Systems Journal, Vol. 45, No. 1, 2006
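The slides do not show source code for this transformation, so the following is a rough sketch of its shape only, assuming a simple array-scaling loop; a CUDA kernel stands in for the accelerator clone to stay in one notation, whereas the Cell compiler would emit an SPE procedure and coordinate with it through signal registers and mailbox queues rather than a kernel launch.

```cuda
#include <cuda_runtime.h>

// Before the transformation, the loop lives inline in the main program:
//
//   for (int i = 0; i < n; ++i)
//       a[i] = b[i] * s;
//
// Outlining extracts that parallel loop into a separate procedure, and
// cloning produces one copy for the host (PPE in Cell terms) and one for
// the accelerator (SPE in Cell terms; a CUDA kernel stands in for it here).

// Accelerator clone of the outlined loop.
__global__ void scale_loop_acc(float *a, const float *b, float s, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) a[i] = b[i] * s;
}

// Host clone of the outlined loop, kept as a fallback.
static void scale_loop_host(float *a, const float *b, float s, int n) {
    for (int i = 0; i < n; ++i) a[i] = b[i] * s;
}

// Coordination: the host decides where the outlined loop runs and
// synchronizes with the accelerator. On Cell this step would use signal
// registers and mailbox queues instead of a launch/synchronize pair.
void scale_loop(float *a_dev, float *a_host,
                const float *b_dev, const float *b_host,
                float s, int n, bool use_accelerator) {
    if (use_accelerator) {
        scale_loop_acc<<<(n + 255) / 256, 256>>>(a_dev, b_dev, s, n);
        cudaDeviceSynchronize();
    } else {
        scale_loop_host(a_host, b_host, s, n);
    }
}
```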
Why GPUs?
• A quiet revolution and potential build-up
  — Calculation: 367 GFLOPS vs. 32 GFLOPS
  — Memory Bandwidth: 86.4 GB/s vs. 8.4 GB/s
  — Until last year, programmed through graphics API
  — GPU in every PC and workstation – massive volume and potential impact

Sample GPU Applications
Application | Description | Source | Kernel | % time
H.264  | SPEC '06 version, change in guess vector | 34,811 | 194 | 35%
LBM    | SPEC '06 version, change to single precision and print fewer reports | 1,481 | 285 | >99%
RC5-72 | Distributed.net RC5-72 challenge client code | 1,979 | 218 | >99%
FEM    | Finite element modeling, simulation of 3D graded materials | 1,874 | 146 | 99%
RPES   | Rye Polynomial Equation Solver, quantum chem, 2-electron repulsion | 1,104 | 281 | 99%
PNS    | Petri Net simulation of a distributed system | 322 | 160 | >99%
SAXPY  | Single-precision implementation of saxpy, used in Linpack's Gaussian elim. routine | 952 | 31 | >99%
TRACF  | Two Point Angular Correlation Function | 536 | 98 | 96%
FDTD   | Finite-Difference Time Domain analysis of 2D electromagnetic wave propagation | 1,365 | 93 | 16%
MRI-Q  | Computing a matrix Q, a scanner's configuration in MRI reconstruction | 490 | 33 | >99%

Performance of Sample Kernels and Applications
• GeForce 8800 GTX vs. 2.2 GHz Opteron 248
• 10× speedup in a kernel is typical, as long as the kernel can occupy enough parallel threads
• 25× to 400× speedup if the function's data requirements and control flow suit the GPU and the application is optimized
• Keep in mind that the speedup also reflects how suitable the CPU is for executing the kernel
• Source: Slide 21, Lecture 1, UIUC ECE 498, David Kirk & Wen-mei Hwu, http://courses.ece.uiuc.edu/ece498/al1/lectures/lecture1%20intro%20fall%202007.ppt
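As a concrete reference point for the SAXPY row in the table and the thread-occupancy caveat above, a minimal CUDA SAXPY kernel and launch configuration are sketched below; the grid-sizing arithmetic shows how a 1M-element vector is spread across enough threads to keep the GPU busy. This is an illustrative sketch, not code from the benchmark suite cited in the slides.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// SAXPY: y = a*x + y, one element per thread.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard the tail block
        y[i] = a * x[i] + y[i];
}

int main() {
    const int n = 1 << 20;                 // 1M elements
    const size_t bytes = n * sizeof(float);

    float *x, *y;
    cudaMalloc(&x, bytes);
    cudaMalloc(&y, bytes);
    cudaMemset(x, 0, bytes);               // zero-initialize for the sketch;
    cudaMemset(y, 0, bytes);               // real code would copy data in

    // 256 threads per block; enough blocks to cover all n elements.
    // 1M elements -> 4096 blocks of 256 threads: plenty of parallel work
    // to occupy the GPU, which is what the 10x-typical speedup relies on.
    int threadsPerBlock = 256;
    int blocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    saxpy<<<blocks, threadsPerBlock>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();

    printf("launched %d blocks of %d threads\n", blocks, threadsPerBlock);
    cudaFree(x);
    cudaFree(y);
    return 0;
}
```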
FPGAs: Basics of FPGA Offload
Source: "Compiling Software Code to FPGA-based Accelerator Processors for HPC