RISC-V • 1980S: RISC • Case for Open Isas • 1990S: VLIW • Tour of RISC-V ISA • 2000S: NUMA Vs

Total Page:16

File Type:pdf, Size:1020Kb

RISC-V • 1980S: RISC • Case for Open Isas • 1990S: VLIW • Tour of RISC-V ISA • 2000S: NUMA Vs Past and Future Trends in Architecture and Hardware David Paerson [email protected] SOSP History Day October 3, 2015 1 Outline Part I - Past Part II – Future 50 years of Computer HW Technology Architecture History: • End of Moore’s Law • 1960s: • Flash vs. Disks Computer Families / • Fast DRAM Microprogramming • Crosspoint NVRAM • 1970s: CISC Open ISA & RISC-V • 1980s: RISC • Case for Open ISAs • 1990s: VLIW • Tour of RISC-V ISA • 2000s: NUMA vs. • RISC-V Software Stack Clusters • RISC-V Chips 2 IBM Compatibility Problem in early 1960s By early 1960’s, IBM had 4 incompatible lines of computers! 701 → 7094 650 → 7074 702 → 7080 1401 → 7010 Each system had its own • Instruction set • I/O system and Secondary Storage: magnetic tapes, drums and disks • Assemblers, compilers, libraries,... • Market niche: business, scientific, real time, ... ⇒ IBM System/360 – one ISA to rule them all 3 IBM 360: A Computer Family Model 30 . Model 70 Storage 8K - 64 KB 256K - 512 KB Datapath 8-bit 64-bit Circuit Delay 30 nsec/level 5 nsec/level Registers Main Store Transistor Registers The IBM 360 is why bytes are 8-bits long today! IBM 360 instrucon set architecture (ISA) completely hid the underlying technological differences between various models. Milestone: The first true ISA designed as portable hardware- soKware interface! With minor modifica>ons it s>ll survives today! 4 IBM System/360 Reference Card (“Green card”) OR (cl OC m' FLOA~NEP~INTFEATURE 'INSTRUCTIONS' Sy~t8m/~~ PaGk PAW F2 I%- Dinn4b#) RDD 86 AddN~allted,ExtenddJc.x) . , AXR 36 , RR I R~hrmcaData Add Normdized, Long (c] IBM SM-m MsDk (n) SPM 04 ADR 2A RR Srorlge Key 4e.d Fdd Normdid, LO~BIc) , AD 8A RX - SSK 08 I @ at %s@~nW fpi SSM 80 Add Norrndized, Short (01 AER 3A RR ' MACHINE IWTRUCTIONhS 1 Shift Left Dwble (c) SLDA 8F add Nonnalizad, Short Id AE 7A RX shift bft~oubfe ~owical SLDL 8D Add Unnormalized, Long (c) AWR 2E RR IUm MlWDlllG *;hm~%!#efc) SLA 88 Add Unnormaltzad, Long (c) AW 6E RX Add Unnomralf*ed, Short (c) AUR 3E RR Apkl (0) AR SMLM Single Logical SLL 89 Wbc3 A RWt Dmlbk bl SRDA 8E Add Unnomalited, WrtICY AU 7E KX Add Deeimd (e,d) AP Shift RwtDoubh Lagical SRDL 8C Compare, Low (d CDR 29 RR Akl Halfvvarel (cl AH ah R@ht $Me Id SRA 8A Cmpm Long (c) CD 69 RX Compare, Short (c) CER 39 RR Add Logical (d ALR Wcl Rwt Qry& Logical SRL 88 IfQ Y)Qd 510 9C Compare, Short Ic) CE 76 RX add CDBical (d AL DiviL, Long DDR 2D RR AND (c) NR Sore Sf 50 Dilde, Long DD 60 RX ' AND (c) N store C~Q~~=QC STC 42 AND (cl Ni St- hl&wd STH So Divide, Short DER 30 RR DM&, short DE 7D RX AND (4 Nc m 90 Bmmh and Link S£mMuSrir Oontrol (eip) STMC 60 Wve. Long HDR 24 RR BALR Halve, Shon HER 34 RR Branch md Link BAL fbbmlmU-- SR 16 Load and Teat, Long (c) LTDR RR Branch adStare (el WR SdNmft td S 68 a Load md Test, Short (c) LTER 32 RR Branch Md Srwe (el BAS 6ubfm&99dm@ lc,d SP FB Load Conplment, Long (c) LCDR 23 RR I BrMCGonCondftion rnR QbtFW Wdfword (6) SH 48 Branah on C;ondEth BIC 8k4wW t&!&d k) StR IF Load Complement Short (cl LCER RR BfZmhahCeunt BCTA wm@Loeieal (el St SF L-4 Long LDR 28 RR Bmnch on count 867 slwbuk%~m SVC OA Load, Long LD 68 RX on Idem H5gh 8XH mMdwl(c) TS 93 Load NNegative. Long (cl LNDR 21 RR BnncSronlhdexL~arEqusl BXLE 'fMIW4%~) TCH SF Load Negative, SfwR (c) LNER 31 RR CR %Hi€l& h1 TI0 9D Load Positive, Long (c) LPDR 20 RR Conrpors (4 Load Padtiva, Shart (c) LPER RR Campare C ~~.tr~!Wlvsk (cr) ' TM 91 30 rc) lQad Rwnded, Extended to Long (x) LRDR 26 RR fawxm Dschapd bJ) bP TR DC cMns1*rr,HsWwdnd(cr) ct.E nambbddT~ TPlT OD Load Rounded, Long to Short Lx) LRER 3S RR Laad, Short LER 38 RR Compnn hmtat CCR m UNPK F3 CL WRD 84 Load, Short LE 78 RX eongrr?oLe?iwW w@Em&? ~~~ Extended 1x1 MXR 26 RR '* (GRf) ZAP F8 Multiply. Wnmv bqhall I4 CLC ZmQ Multiply, tong MDR 2C RR ' CL4 . -mm Laded Id Multiply, Long , MD 6C RX CV& NQTW @OWPAWLS 1-3 Co-t(P~ code is load& Multiply, LongIExtended. (XI MXDR RR cwwM.ta kwmal CVD r ~mrmttafemm d. Decimel feamte n & Waa$oanW fentwm e. Model67 Privileged instruction MulrOpIy, LonaIExtended (x) mu 87 RX mwm- I(d Multiply, Short DhFkk m o. km&tifmctoFla it set n. New condition Extended pwcision * MER 3C RR Mi D floating point feature Multiply, Short ME 7C RX Mvlrs *al Idl w Store, Long STD 60 RX ~ditjt?,d)\ ED #i&CMl#E FORMATS ST€ 70 RX Rl,D2lX2,B2) EWC Editarralnkk 0 H&@WQFID 1 SECOND HRLFWORD 2 THIRD HALFWORD 3 E&qh 951 tc) XR 1 1 SDR 28 RR R1.W Em#wiv# OR dc) X SD 68 RX Rl.M(X2.62~ Rl.R2 Ewhsb OR Ce) ' SER 38 RR xi SE 78 RX Rl,D2tX2,02) I EfdwbmQA le) XC E&-Y EX llQ) tc&J HI0 fnaort Ghlmer IC KgY (w) 1SK Lwd LR --LBlu$ L LQadMdw8, LA eapdaml Twrt (4 LTR Lod lzawment (c) LCR Lordxdiliword LH mu^ I& Lod IrduWe Coharol (@a) LMC ' ~~w(C) LNR Loai aorith 4W LPR Losd PSW lFl(h,p) LPmY Losd Reid Ackkm (c,e$l LRA MVI MVC - WN !Eswmda ~uswithOfkDIst AII"VQ MamzoMa MWZ MulOb MR value of addms MMl*lv M Mu* Decbd tdb w AfiutalplVHsitmxd MH aR OR OR fc) 0 OR (d 01 5 Control versus Datapath ▪ Processor designs can be split between datapath, where numbers are stored and arithmetic operations computed, and control, which sequences operations on datapath ▪ Biggest challenge for early computer designers was getting control circuitry correct Control ! Maurice Wilkes invented the InstrucUon Control Lines Condion? idea of microprogramming to design the control unit of a processor, 1958 - Logic expensive compared to PC ALU ROM or RAM Datapath Registers Inst. Reg. - ROM cheaper than RAM - ROM much faster than RAM Busy? Address Data Main Memory 6 Microprogramming in IBM 360 Model M30 M40 M50 M65 Datapath width 8 bits 16 bits 32 bits 64 bits Microcode size 4k x 50 4k x 52 2.75k x 85 2.75k x 87 Clock cycle Ume (ROM) 750 ns 625 ns 500 ns 200 ns Main memory cycle Ume 1500 ns 2500 ns 2000 ns 750 ns Annual rental fee (1964 $) $48,000 $54,000 $115,000 $270,000 Annual rental fee (2015 $) $570,000 $650,000 $1,400,000 $3,200,000 7 IC technology, Microcode, and CISC ▪ Logic, RAM, ROM all implemented using MOS transistors ▪ Semiconductor RAM ≈ same speed as ROM ▪ With Moore’s Law, memory for control store could grow ▪ Allowed more complicated instruction sets (CISC) ▪ Minicomputer (TTL server) Example: Digital Equipment VAX ISA in 1978 8 Microprocessor Evolution ▪ Rapid progress in 1970s, fueled by advances in MOS technology, imitated minicomputers and mainframe ISAs ▪ Intel i432 - Most ambitious 1970s micro - started in 1975 - released 1981 - 32-bit capability-based object-oriented architecture - Instructions variable number of bits long - Heavily microcoded - Severe performance, complexity, and usability problems ▪ Intel 8086 (1978, 8MHz, 29,000 transistors) - “Stopgap” 16-bit processor, architected in 10 weeks - Extended accumulator architecture - Assembly-compatible with 8080 - 20-bit addressing through segmented addressing scheme ▪ IBM PC uses Intel 8088 for 8-bit bus (and Motorola 68000 was late) - Estimated sales of 250,000; 100,000,000s sold 9 Analyzing Microcoded Machines 1980s ▪ John Cocke and group at IBM - Working on a simple pipelined processor, 801 minicomputer (ECL server), and advanced compilers inside IBM - Ported experimental PL.8 compiler to IBM 370, only used simple register-register and load/store instructions similar to 801 - Code ran faster than other existing compilers that used all 370 instructions! - Up to 6 MIPS whereas 2 MIPS considered good before ▪ Emer and Clark at DEC - Found 20% of VAX instructions responsible for 60% of microcode, but only account for 0.2% of execution time! ▪ Patterson 1979 sabbatical at DEC - VAX microcode bugs ⇒ field repair, but field-repairable chips don’t make sense 10 From CISC to RISC ▪ Use fast RAM to build fast instruction cache of user- visible instructions, not fixed hardware microroutines - Contents of fast instruction memory change to fit what application needs right now ▪ Simple ISA => hardwired pipelined implementation - Compiled code only used a few CISC instructions - Simpler encoding allowed pipelined implementations ▪ Further benefit with integration - In early ‘80s, could finally fit 32-bit datapath + small caches on a single chip - No chip crossings in common case allows faster operation 11 Berkeley RISC Chips RISC-I (1982) Contains 44,420 transistors, fabbed in 5 µm NMOS, with a die area of 77 mm2, ran at 1 MHz. RISC-II (1983) contains 40,760 transistors, was fabbed in 3 µm NMOS, ran at 3 MHz, and the size is 60 mm2 Stanford built some too… 12 13 CISC vs. RISC Today ● PC Era ● PostPC Era: Client/Cloud ● Hardware translates ● IP in SoC vs. MPU x86 instructions into ● Value die area, energy as internal RISC much as performance instructions ● > 16B / year in 2014! ● Then use any RISC ● 98% RISC Processors: technique inside MPU 12.0B ARM (Advanced ● > 350M / year ! RISC Machine) ● x86 ISA eventually 2.0B Tensilica dominates servers as 1.5B ARC (Argonaut well as desktops RISC Core) 0.8B MIPS 14 VLIW: Very Long Instrucon Word Int Op 1 Int Op 2 Mem Op 1 Mem Op 2 FP Op 1 FP Op 2 Two Integer Units, Single Cycle Latency Two Load/Store Units, Three Cycle Latency Two Floa>ng-Point Units, Four Cycle Latency ▪ MulUple operaons packed into one instrucUon ▪ Each operaon slot is for a fixed funcUon ▪ Constant operaon latencies are specified ▪ Architecture requires guarantee of: - Parallelism within an instrucUon => no cross-operaon RAW check - No data use before data ready => no data interlocks 15 VLIW Compiler ResponsibiliQes ▪ Schedule operaons to maximize parallel execuUon ▪ Guarantees intra-instrucUon parallelism ▪ Schedule to avoid data hazards (no interlocks) - Typically separates operaons with explicit NOPs
Recommended publications
  • Configurable RISC-V Softcore Processor for FPGA Implementation
    1 Configurable RISC-V softcore processor for FPGA implementation Joao˜ Filipe Monteiro Rodrigues, Instituto Superior Tecnico,´ Universidade de Lisboa Abstract—Over the past years, the processor market has and development of several programming tools. The RISC-V been dominated by proprietary architectures that implement Foundation controls the RISC-V evolution, and its members instruction sets that require licensing and the payment of fees to are responsible for promoting the adoption of RISC-V and receive permission so they can be used. ARM is an example of one of those companies that sell its microarchitectures to participating in the development of the new ISA. In the list of the manufactures so they can implement them into their own members are big companies like Google, NVIDIA, Western products, and it does not allow the use of its instruction set Digital, Samsung, or Qualcomm. (ISA) in other implementations without licensing. The RISC-V The main goal of this work is the development of a RISC- instruction set appeared proposing the hardware and software V softcore processor to be implemented in an FPGA, using development without costs, through the creation of an open- source ISA. This way, it is possible that any project that im- a non-RISC-V core as the base of this architecture. The plements the RISC-V ISA can be made available open-source or proposed solution is focused on solving the problems and even implemented in commercial products. However, the RISC- limitations identified in the other RISC-V cores that were V solutions that have been developed do not present the needed analyzed in this thesis, especially in terms of the adaptability requirements so they can be included in projects, especially the and flexibility, allowing future modifications according to the research projects, because they offer poor documentation, and their performances are not suitable.
    [Show full text]
  • A 2000 Frames / S Programmable Binary Image Processor Chip for Real Time Machine Vision Applications
    A 2000 frames / s programmable binary image processor chip for real time machine vision applications A. Loos, D. Fey Institute of Computer Science, Friedrich-Schiller-University Jena Ernst-Abbe-Platz 2, D-07743 Jena, Germany {loos,fey}@cs.uni-jena.de Abstract the inflexible and fixed instruction set. To meet that we present a so called ASIP (application specific instruction Industrial manufacturing today requires both an efficient set processor) which combines the flexibility of a GPP production process and an appropriate quality standard of (General Purpose Processor) with the speed of an ASIC. each produced unit. The number of industrial vision appli- cations, where real time vision systems are utilized, is con- tinuously rising due to the increasing automation. Assem- Embedded Vision System bly lines, where component parts are manipulated by robot 1. real scene image sensing, AD- CMOS-Imager grippers, require a fast and fault tolerant visual detection conversion and read out of objects. Standard computation hardware like PC-based 2. grey scale image image segmentation platforms with frame grabber boards are often not appro- representation priate for such hard real time vision tasks in embedded sys- 3. raw binary image image enhancement, tems. This is because they meet their limits at frame rates of representation removal of disturbance a few hundreds images per second and show comparatively ASIP FPGA long latency times of a few milliseconds. This is the result 4. improved binary image calculation of of the largely serial working and time consuming process- projections ing chain of these systems. In contrast to that we designed 5.
    [Show full text]
  • 18-447 Computer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures
    18-447 Computer Architecture Lecture 6: Multi-Cycle and Microprogrammed Microarchitectures Prof. Onur Mutlu Carnegie Mellon University Spring 2015, 1/28/2015 Agenda for Today & Next Few Lectures n Single-cycle Microarchitectures n Multi-cycle and Microprogrammed Microarchitectures n Pipelining n Issues in Pipelining: Control & Data Dependence Handling, State Maintenance and Recovery, … n Out-of-Order Execution n Issues in OoO Execution: Load-Store Handling, … 2 Reminder on Assignments n Lab 2 due next Friday (Feb 6) q Start early! n HW 1 due today n HW 2 out n Remember that all is for your benefit q Homeworks, especially so q All assignments can take time, but the goal is for you to learn very well 3 Lab 1 Grades 25 20 15 10 5 Number of Students 0 30 40 50 60 70 80 90 100 n Mean: 88.0 n Median: 96.0 n Standard Deviation: 16.9 4 Extra Credit for Lab Assignment 2 n Complete your normal (single-cycle) implementation first, and get it checked off in lab. n Then, implement the MIPS core using a microcoded approach similar to what we will discuss in class. n We are not specifying any particular details of the microcode format or the microarchitecture; you can be creative. n For the extra credit, the microcoded implementation should execute the same programs that your ordinary implementation does, and you should demo it by the normal lab deadline. n You will get maximum 4% of course grade n Document what you have done and demonstrate well 5 Readings for Today n P&P, Revised Appendix C q Microarchitecture of the LC-3b q Appendix A (LC-3b ISA) will be useful in following this n P&H, Appendix D q Mapping Control to Hardware n Optional q Maurice Wilkes, “The Best Way to Design an Automatic Calculating Machine,” Manchester Univ.
    [Show full text]
  • United Health Group Capacity Analysis
    Advanced Technical Skills (ATS) North America zPCR Capacity Sizing Lab SHARE Sessions 7774 and 7785 August 4, 2010 John Burg Brad Snyder Materials created by John Fitch and Jim Shaw IBM 1 © 2010 IBM Corporation Advanced Technical Skills Trademarks The following are trademarks of the International Business Machines Corporation in the United States and/or other countries. AlphaBlox* GDPS* RACF* Tivoli* APPN* HiperSockets Redbooks* Tivoli Storage Manager CICS* HyperSwap Resource Link TotalStorage* CICS/VSE* IBM* RETAIN* VSE/ESA Cool Blue IBM eServer REXX VTAM* DB2* IBM logo* RMF WebSphere* DFSMS IMS S/390* xSeries* DFSMShsm Language Environment* Scalable Architecture for Financial Reporting z9* DFSMSrmm Lotus* Sysplex Timer* z10 DirMaint Large System Performance Reference™ (LSPR™) Systems Director Active Energy Manager z10 BC DRDA* Multiprise* System/370 z10 EC DS6000 MVS System p* z/Architecture* DS8000 OMEGAMON* System Storage zEnterprise ECKD Parallel Sysplex* System x* z/OS* ESCON* Performance Toolkit for VM System z z/VM* FICON* PowerPC* System z9* z/VSE FlashCopy* PR/SM System z10 zSeries* * Registered trademarks of IBM Corporation Processor Resource/Systems Manager The following are trademarks or registered trademarks of other companies. Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries. Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom. Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.
    [Show full text]
  • Computer Architecture Research with RISC-‐V
    Computer Architecture Research with RISC-V Krste Asanovic UC Berkeley, RISC-V Foundaon, & SiFive Inc. [email protected] www.riscv.org CARRV, Boston, MA October 14, 2017 Only Two Big Mistakes Possible when Picking Research ISA § Design your own § Use someone else’s Promise of using commercially popular ISAs for research § Ported applicaons/workloads to study § Standard soRware stacks (compilers, OS) § Real commercial hardware to experiment with § Real commercial hardware to validate models with § ExisAng implementaons to study / modify § Industry is more interested in your results 3 Types of projects and standard ISAs used by me or my group in last 30 years § Experiments on real hardware plaorms: - Transputer arrays, SPARC workstaons, MIPS workstaons, POWER workstaons, ARMv7 handhelds, x86 desktops/ servers § Research chips built around modified MIPS ISA: - T0, IRAM, STC1, Scale, Maven § FPGA prototypes/simulaons using various ISAs: - RAMP Blue (modified Microblaze), RAMP Gold/ DIABLO (SPARC v8) § Experiments using soRware architectural simulators: - SimpleScalar (PISA), SMTsim (Alpha), Simics (SPARC,x86), Bochs (x86), MARSS (x86), Gem5(SPARC), PIN (Itanium, x86), … § And of course, other groups used some others too. RealiMes of using standard ISAs § Everything only works if you don’t change anything - Stock binary applicaons - Stock libraries - Stock compiler - Stock OS - Stock hardware implementaon § Add a new instrucAon, get a new non-standard ISA! - Need source code for the apps and recompile - Impossible for most real interesAng applicaons
    [Show full text]
  • Pwny Documentation Release 0.9.0
    pwny Documentation Release 0.9.0 Author Nov 19, 2017 Contents 1 pwny package 3 2 pwnypack package 5 2.1 asm – (Dis)assembler..........................................5 2.2 bytecode – Python bytecode manipulation..............................7 2.3 codec – Data transformation...................................... 11 2.4 elf – ELF file parsing.......................................... 16 2.5 flow – Communication......................................... 36 2.6 fmtstring – Format strings...................................... 41 2.7 marshal – Python marshal loader................................... 42 2.8 oracle – Padding oracle attacks.................................... 43 2.9 packing – Data (un)packing...................................... 44 2.10 php – PHP related functions....................................... 46 2.11 pickle – Pickle tools.......................................... 47 2.12 py_internals – Python internals.................................. 49 2.13 rop – ROP gadgets........................................... 50 2.14 shellcode – Shellcode generator................................... 50 2.15 target – Target definition....................................... 79 2.16 util – Utility functions......................................... 80 3 Indices and tables 83 Python Module Index 85 i ii pwny Documentation, Release 0.9.0 pwnypack is the official CTF toolkit of Certified Edible Dinosaurs. It aims to provide a set of command line utilities and a python library that are useful when playing hacking CTFs. The core functionality of pwnypack
    [Show full text]
  • Escuela Politécnica Superior De Jaén
    UNIVERSIDAD DE JAÉN ESCUELA POLITÉCNICA SUPERIOR DE JAÉN Trabajo Fin de Grado DATALOGGER CON ENLACE BLUETOOTH A SISTEMA ANDROID Alumno: Juan Miguel Bejarano Bueno Tutor: Prof. D. Luis Miguel Nieto Nieto Dpto: Ingeniería Electrónica y Automática Escuela Politécnica Superior Jaén de Area: Tecnología Electrónica Septiembre, 2015 Juan Miguel Bejarano Bueno Datalogger con enlace Bluetooth a aplicación Android. Universidad de Jaén Escuela Politécnica Superior de Jaén Departamento de Electrónica y Automática Don LUIS MIGUEL NIETO NIETO, tutor del Trabajo Fin de Grado titulado: DATALOGGER CON ENLACE BLUETOOTH A ANDROID, que presenta JUAN MIGUEL BEJARANO BUENO, autoriza su presentación para defensa y evaluación en la Escuela Politécnica Superior de Jaén. Jaén, SEPTIEMBRE de 2015 El alumno: El tutor: JUAN MIGUEL BEJARANO BUENO LUIS MIGUEL NIETO NIETO 1 Escuela Politécnica Superior de Jaén Juan Miguel Bejarano Bueno Datalogger con enlace Bluetooth a aplicación Android. RESUMEN Este Trabajo de Fin de Grado consiste en el desarrollo de dos sistemas. En primer lugar, se dispondrá de un sistema de registro de datos por eventos o datalogger. El sistema está basado en un microcontrolador Microchip y dispondrá de una interfaz de comunicaciones Bluetooth. Este dispositivo será capaz de realizar mediciones con una duración mínima configurable. En segundo lugar y de manera simultánea, se ha diseñado una aplicación Android capaz de gestionar la comunicación con el dispositivo de una manera gráfica. El sistema se ha diseñado siempre teniendo en cuenta el ahorro económico, dispondrá de una alimentación autónoma y el mantenimiento será prácticamente nulo. Se pretende así obtener un datalogger funcional, económico e intuitivo para el manejo del mismo por cualquier tipo de usuario.
    [Show full text]
  • A 4.7 Million-Transistor CISC Microprocessor
    Auriga2: A 4.7 Million-Transistor CISC Microprocessor J.P. Tual, M. Thill, C. Bernard, H.N. Nguyen F. Mottini, M. Moreau, P. Vallet Hardware Development Paris & Angers BULL S.A. 78340 Les Clayes-sous-Bois, FRANCE Tel: (+33)-1-30-80-7304 Fax: (+33)-1-30-80-7163 Mail: [email protected] Abstract- With the introduction of the high range version of parallel multi-processor architecture. It is used in a family of the DPS7000 mainframe family, Bull is providing a processor systems able to handle up to 24 such microprocessors, which integrates the DPS7000 CPU and first level of cache on capable to support 10 000 simultaneously connected users. one VLSI chip containing 4.7M transistors and using a 0.5 For the development of this complex circuit, a system level µm, 3Mlayers CMOS technology. This enhanced CPU has design methodology has been put in place, putting high been designed to provide a high integration, high performance emphasis on high-level verification issues. A lot of home- and low cost systems. Up to 24 such processors can be made CAD tools were developed, to meet the stringent integrated in a single system, enabling performance levels in performance/area constraints. In particular, an integrated the range of 850 TPC-A (Oracle) with about 12 000 Logic Synthesis and Formal Verification environment tool simultaneously active connections. The design methodology has been developed, to deal with complex circuitry issues involved massive use of formal verification and symbolic and to enable the designer to shorten the iteration loop layout techniques, enabling to reach first pass right silicon on between logical design and physical implementation of the several foundries.
    [Show full text]
  • A Developer's Guide to the POWER Architecture
    http://www.ibm.com/developerworks/linux/library/l-powarch/ 7/26/2011 10:53 AM English Sign in (or register) Technical topics Evaluation software Community Events A developer's guide to the POWER architecture POWER programming by the book Brett Olsson , Processor architect, IBM Anthony Marsala , Software engineer, IBM Summary: POWER® processors are found in everything from supercomputers to game consoles and from servers to cell phones -- and they all share a common architecture. This introduction to the PowerPC application-level programming model will give you an overview of the instruction set, important registers, and other details necessary for developing reliable, high performing POWER applications and maintaining code compatibility among processors. Date: 30 Mar 2004 Level: Intermediate Also available in: Japanese Activity: 22383 views Comments: The POWER architecture and the application-level programming model are common across all branches of the POWER architecture family tree. For detailed information, see the product user's manuals available in the IBM® POWER Web site technical library (see Resources for a link). The POWER architecture is a Reduced Instruction Set Computer (RISC) architecture, with over two hundred defined instructions. POWER is RISC in that most instructions execute in a single cycle and typically perform a single operation (such as loading storage to a register, or storing a register to memory). The POWER architecture is broken up into three levels, or "books." By segmenting the architecture in this way, code compatibility can be maintained across implementations while leaving room for implementations to choose levels of complexity for price/performances trade-offs. The levels are: Book I.
    [Show full text]
  • Ece585 Lec2.Pdf
    ECE 485/585 Microprocessor System Design Lecture 2: Memory Addressing 8086 Basics and Bus Timing Asynchronous I/O Signaling Zeshan Chishti Electrical and Computer Engineering Dept Maseeh College of Engineering and Computer Science Source: Lecture based on materials provided by Mark F. Basic I/O – Part I ECE 485/585 Outline for next few lectures Simple model of computation Memory Addressing (Alignment, Byte Order) 8088/8086 Bus Asynchronous I/O Signaling Review of Basic I/O How is I/O performed Dedicated/Isolated /Direct I/O Ports Memory Mapped I/O How do we tell when I/O device is ready or command complete? Polling Interrupts How do we transfer data? Programmed I/O DMA ECE 485/585 Simplified Model of a Computer Control Control Data, Address, Memory Data Path Microprocessor Keyboard Mouse [Fetch] Video display [Decode] Printer [Execute] I/O Device Hard disk drive Audio card Ethernet WiFi CD R/W DVD ECE 485/585 Memory Addressing Size of operands Bytes, words, long/double words, quadwords 16-bit half word (Intel: word) 32-bit word (Intel: doubleword, dword) 0x107 64-bit double word (Intel: quadword, qword) 0x106 Note: names are non-standard 0x105 SUN Sparc word is 32-bits, double is 64-bits 0x104 0x103 Alignment 0x102 Can multi-byte operands begin at any byte address? 0x101 Yes: non-aligned 0x100 No: aligned. Low order address bit(s) will be zero ECE 485/585 Memory Operand Alignment …Intel IA speak (i.e. word = 16-bits = 2 bytes) 0x107 0x106 0x105 0x104 0x103 0x102 0x101 0x100 Aligned Unaligned Aligned Unaligned Aligned Unaligned word word Double Double Quad Quad address address word word word word -----0 address address address address -----00 ----000 ECE 485/585 Memory Operand Alignment Why do we care? Unaligned memory references Can cause multiple memory bus cycles for a single operand May also span cache lines Requiring multiple evictions, multiple cache line fills Complicates memory system and cache controller design Some architectures restrict addresses to be aligned Even in architectures without alignment restrictions (e.g.
    [Show full text]
  • Programming Languages, Database Language SQL, Graphics, GOSIP
    b fl ^ b 2 5 I AH1Q3 NISTIR 4951 (Supersedes NISTIR 4871) VALIDATED PRODUCTS LIST 1992 No. 4 PROGRAMMING LANGUAGES DATABASE LANGUAGE SQL GRAPHICS Judy B. Kailey GOSIP Editor POSIX COMPUTER SECURITY U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 100 . U56 4951 1992 NIST (Supersedes NISTIR 4871) VALIDATED PRODUCTS LIST 1992 No. 4 PROGRAMMING LANGUAGES DATABASE LANGUAGE SQL GRAPHICS Judy B. Kailey GOSIP Editor POSIX COMPUTER SECURITY U.S. DEPARTMENT OF COMMERCE Technology Administration National Institute of Standards and Technology Computer Systems Laboratory Software Standards Validation Group Gaithersburg, MD 20899 October 1992 (Supersedes July 1992 issue) U.S. DEPARTMENT OF COMMERCE Barbara Hackman Franklin, Secretary TECHNOLOGY ADMINISTRATION Robert M. White, Under Secretary for Technology NATIONAL INSTITUTE OF STANDARDS AND TECHNOLOGY John W. Lyons, Director - ;,’; '^'i -; _ ^ '’>.£. ; '':k ' ' • ; <tr-f'' "i>: •v'k' I m''M - i*i^ a,)»# ' :,• 4 ie®®;'’’,' ;SJ' v: . I 'i^’i i 'OS -.! FOREWORD The Validated Products List is a collection of registers describing implementations of Federal Information Processing Standards (FTPS) that have been validated for conformance to FTPS. The Validated Products List also contains information about the organizations, test methods and procedures that support the validation programs for the FTPS identified in this document. The Validated Products List is updated quarterly. iii ' ;r,<R^v a;-' i-'r^ . /' ^'^uffoo'*^ ''vCJIt<*bjteV sdT : Jr /' i^iL'.JO 'j,-/5l ':. ;urj ->i: • ' *?> ^r:nT^^'Ad JlSid Uawfoof^ fa«Di)itbiI»V ,, ‘ isbt^u ri il .r^^iytsrH n 'V TABLE OF CONTENTS 1.
    [Show full text]
  • Downloaded to the Test Chip Verified
    - -------------------------------------------------------------------------------- - -------------------------------------------------------------------------------- SECURITIES AND EXCHANGE COMMISSION WASHINGTON, D.C. 20549 ------------------------ FORM 10-K ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(D) OF THE SECURITIES EXCHANGE ACT OF 1934 FOR THE FISCAL YEAR ENDED DECEMBER 31, 1997 COMMISSION FILE NUMBER 0-23006 ------------------------ DSP GROUP, INC. (Exact name of registrant as specified in its charter) DELAWARE 94-2683643 (State or other jurisdiction of (I.R.S. Employer Identification No.) incorporation and organization) 3120 SCOTT BOULEVARD, SANTA CLARA, CA 95054 (Address of principal executive offices, including zip code) (408) 986-4300 (Registrant's telephone number) Securities registered pursuant to Section 12(b) of the Act: NONE Securities registered pursuant to Section 12(g) of the Act: COMMON STOCK, $.001 PER SHARE (Title of class) Indicate by check mark whether the Registrant: (1) has filed all reports required to be filed by Section 13 or 15(d) of the Securities Exchange Act of 1934 during the preceding 12 months (or for such shorter period that the Registrant was required to file such reports), and (2) has been subject to such filing requirements for the past 90 days. Yes /X/ No / / Indicate by check mark if disclosure of delinquent filers pursuant to Item 405 of Regulation S-K is not contained herein, and will not be contained, to the best of Registrant's knowledge, in definitive proxy or information statements incorporated by reference in Part III of this Form 10-K or any amendment to this Form 10-K. [ ] The aggregate market value of the voting stock held by non-affiliates of the Registrant, based on the closing price of the Common Stock on March 2, 1998, as reported on the Nasdaq National Market, was approximately $231,228,988.
    [Show full text]