RAMP: Research Accelerator for Multiple Processors


John Wawrzynek and David Patterson, University of California, Berkeley
Mark Oskin, University of Washington
Shih-Lien Lu, Intel
Christoforos Kozyrakis, Stanford University
James C. Hoe, Carnegie Mellon University
Derek Chiou, University of Texas at Austin
Krste Asanović, Massachusetts Institute of Technology

The RAMP project's goal is to enable the intensive, multidisciplinary innovation that the computing industry will need to tackle the problems of parallel processing. RAMP itself is an open-source, community-developed, FPGA-based emulator of parallel architectures. Its design framework lets a large, collaborative community develop and contribute reusable, composable design modules. Three complete designs—for transactional memory, distributed systems, and distributed-shared memory—demonstrate the platform's potential.

In 2005, the computer hardware industry took a historic change of direction: the major microprocessor companies all announced that their future products would be single-chip multiprocessors, and that future performance improvements would rely on software-specified parallelism rather than additional software-transparent parallelism extracted automatically by the microarchitecture. Several of us discussed this milestone at the International Symposium on Computer Architecture (ISCA) in June 2005. We were struck that a multibillion-dollar industry would bet its future on solving the general-purpose parallel computing problem, when so many have previously attempted but failed to provide a satisfactory approach.

To tackle the parallel processing problem, our industry urgently needs innovative solutions, which in turn require extensive codevelopment of hardware and software. However, this type of innovation currently gets bogged down in the traditional development cycle:

  • Prototyping a new architecture in hardware takes approximately four years and many millions of dollars, even at only research quality.
  • Software engineers are ineffective until the new hardware actually shows up, because simulators are too slow to support serious software development activities. Software engineers tend to innovate only after hardware arrives.
  • Feedback from software engineers on the current production hardware cannot help the immediate next generation because of overlapped hardware development cycles. Instead, the feedback loop can take several hardware generations to close fully.

Hence, we conspired on how to create an inexpensive, reconfigurable, highly parallel platform that would attract researchers from many disciplines—architectures, compilers, operating systems, applications, and others—to work together on perhaps the greatest challenge facing computing in the past 50 years.
Because our industry desperately needs solutions, our goal is to develop a platform that would allow far more rapid evolution than traditional approaches.

RAMP vision

Our hallway conversations led us to the idea of using field-programmable gate arrays (FPGAs) to emulate highly parallel architectures at hardware speeds. FPGAs enable very rapid turnaround for new hardware: you can tape out an FPGA design every day and have a new system fabricated overnight. Another key advantage of FPGAs is that they easily exploit Moore's law. As the number of cores per microprocessor die grows, FPGA density will grow at about the same rate. Today we can map about 16 simple processors onto a single FPGA, which means we can construct a 1,000-processor system in just 64 FPGAs. Such a system is cheaper and consumes less power than a custom multiprocessor, at about $100 and 1 W per processor.
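The scaling claim in the preceding paragraph is easy to sanity-check, and the result lines up with the cost and power rows quoted for RAMP in Table 1 later in the article. A minimal Python sketch using only the numbers stated above (the variable names are ours):

```python
import math

# Figures quoted in the article.
cores_per_fpga = 16        # simple processors mapped onto one FPGA today
target_cores = 1000        # size of the emulated parallel machine
cost_per_core_usd = 100    # approximate cost per emulated processor
power_per_core_w = 1.0     # approximate power per emulated processor

fpgas_needed = math.ceil(target_cores / cores_per_fpga)   # 63; the article rounds up to 64 FPGAs (1,024 cores)
total_cost_usd = target_cores * cost_per_core_usd         # ~$100K, i.e. the ~$0.1M quoted in Table 1
total_power_kw = target_cores * power_per_core_w / 1000   # ~1 kW, in line with Table 1's 1.5 kW system figure

print(f"FPGAs needed: {fpgas_needed}")
print(f"System cost:  ${total_cost_usd:,} (~${total_cost_usd / 1e6:.1f}M)")
print(f"Power draw:   {total_power_kw:.1f} kW")
```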
Because our goal is to ramp up the rate of innovation in hardware and software multiprocessor research, we named this project RAMP (Research Accelerator for Multiple Processors). RAMP is an open-source project to develop and share the hardware and software necessary to create parallel architectures. RAMP is not just a hardware architecture project. Perhaps our most important goal is to support the software community as it struggles to take advantage of the potential capabilities of parallel microprocessors, by providing a malleable platform through which the software community can collaborate with the hardware community.

Unlike commercial multiprocessor hardware, RAMP is designed as a research platform. We plan to include research features that are impossible to include in real hardware systems owing to speed, cost, or practicality issues. For example, the FPGA design can incorporate additional hardware to monitor any event in the system. Being able to add arbitrary event probes, including arbitrary computation on those events, provides visibility formerly available only in software simulators, but without the inevitable slowdown software simulators face when introducing such visibility.

A second example of how RAMP differs from real hardware is reproducibility. Using the RAMP Description Language (RDL) framework, different researchers can construct the same deterministic parallel computing system that will perform exactly the same way every time, clock cycle for clock cycle. By using processor designs donated by industry, RAMP users will start with familiar architectures and operating systems, which will provide far more credibility than software simulations that model idealized processors or that ignore operating-system effects. RDL is designed to make constructing a full computer out of RDL-compatible modules easy.

Our target speeds of 100 to 200 MHz are slower than real hardware but fast enough to run standard operating systems and large-scale applications, and they are orders of magnitude faster than software simulators. Finally, because of the similarities in the design flow of logic for FPGAs and custom hardware, we believe RAMP is realistic enough to convince software developers to start aggressive development on innovative architectures and programming models, and to convince hardware and software companies that RAMP results are relevant.

This combination of cost, power, speed, flexibility, observability, reproducibility, and credibility will make the platform attractive to software and hardware researchers interested in the parallel challenge. In particular, it allows the research community to revive the 1980s culture of building experimental hardware and software systems, which today has been almost entirely lost because of the higher cost and difficulty of building hardware.

Table 1 compares alternatives for pursuing parallel-systems research in academia. The five options are a conventional shared-memory multiprocessor (SMP), a cluster, a simulator, a custom-built chip and system, and RAMP. The rows are the features of interest, with a grade for each alternative and quantification where appropriate.

Table 1. Relative comparison of five options for parallel research. From the architect's perspective, the most surprising aspect of this table is that not only is performance not the top concern, it is at the bottom of the list. The platform just needs to be fast enough to run the entire software stack.

Feature                  | SMP         | Cluster     | Simulator     | Custom      | RAMP
Scalability (1,000 CPUs) | C           | A           | A             | A           | A
Cost (1,000 CPUs)        | F ($40M)    | C ($2M-$3M) | A+ ($0M)      | F ($20M)    | A ($0.1M-$0.2M)
Cost of ownership        | A           | D           | A             | D           | A
Power/space (kW, racks)  | D (120, 12) | D (120, 12) | A+ (0.1, 0.1) | A           | B (1.5, 0.3)
Development community    | D           | A           | A             | F           | B
Observability            | D           | C           | A+            | A           | B+
Reproducibility          | B           | D           | A+            | A           | A+
Reconfigurability        | D           | C           | A+            | C           | A+
Credibility of result    | A+          | A+          | D             | A+          | B+/A
Performance (clock)      | A (2 GHz)   | A (3 GHz)   | F (0 GHz)     | B (0.4 GHz) | C (0.1 GHz)
Modeling flexibility     | D           | D           | A             | B           | A
Overall grade            | C           | C+          | B             | B-          | A

Cost rules out a large SMP for most academics. The costs of both purchase and ownership make a large cluster too expensive for most academics as well. Our only alternative thus far has been software simulation, and indeed that has been the vehicle for most architecture research in the past decade. As mentioned, software developers rarely use software simulators, because they run too slowly, and results might not be credible. In particular, it's unclear how credible results will be to industry when they are based on simulations of 1,000 processors running small snippets of applications. Simulation speed, measured in total instructions per second, drops as more cores are simulated and as operating-system effects are included, and the amount of memory required for each node in the host compute cluster rises rapidly.

RAMP is obviously attractive to a broad set of hardware and software researchers in parallel computing. The RAMP option is ...
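To ground two of the claims above, cycle-for-cycle reproducibility and observability without slowdown, here is a deliberately tiny Python sketch in the spirit of a units-and-channels model. It is not RDL, and every class and name in it is invented for illustration: two units exchange tokens over a fixed-latency channel, the whole system advances one clock cycle at a time, and a passive probe counts events without changing the modeled system's state or timing.

```python
from collections import deque

class Channel:
    """Point-to-point link: a token sent in cycle c becomes visible in cycle c + latency."""
    def __init__(self, latency):
        self.latency = latency
        self.in_flight = deque()          # (delivery_cycle, token) pairs, oldest first

    def send(self, cycle, token):
        self.in_flight.append((cycle + self.latency, token))

    def recv(self, cycle):
        if self.in_flight and self.in_flight[0][0] <= cycle:
            return self.in_flight.popleft()[1]
        return None

class Producer:
    """Unit that emits one incrementing token per clock cycle."""
    def __init__(self, out):
        self.out, self.n = out, 0

    def tick(self, cycle):
        self.out.send(cycle, self.n)
        self.n += 1

class Probe:
    """Passive monitor, analogous to extra FPGA logic watching events in the design."""
    def __init__(self):
        self.events = 0

    def observe(self, cycle, token):
        self.events += 1              # record only; never touches the units' state

class Consumer:
    """Unit that accumulates the tokens it receives."""
    def __init__(self, inp, probe):
        self.inp, self.probe, self.total = inp, probe, 0

    def tick(self, cycle):
        token = self.inp.recv(cycle)
        if token is not None:
            self.total += token
            self.probe.observe(cycle, token)

def run(cycles=10):
    probe = Probe()
    link = Channel(latency=2)
    units = [Producer(link), Consumer(link, probe)]
    for cycle in range(cycles):
        for unit in units:            # every unit advances exactly one clock cycle
            unit.tick(cycle)
    return units[1].total, probe.events

# The same description gives identical, cycle-for-cycle results on every run.
assert run() == run()
print(run())                          # (28, 8): tokens 0..7 delivered, 8 events observed
```

Because the per-cycle schedule and the channel latency are part of the description itself, observing the system and rerunning it cost nothing in fidelity; in RAMP, the analogous monitoring logic is simply additional hardware incorporated into the FPGA design alongside the machine under study.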
Recommended publications
  • FAx86: An Open-Source FPGA-Accelerated x86 Full-System Emulator
  • Computer Architectures: An Overview
  • Selected Project Reports, Spring 2005 Advanced OS & Distributed Systems
  • First Steps Simics/GEMS on Ubuntu 7.10
  • Automotive MRD Overview
  • MC-Sim: An Efficient Simulation Tool for MPSoC Designs
  • Simics Brief Introduction, presented by Chunrong Lai, Software Engineer, Simics Technology Center (STC), Software and Services Group
  • Embedded Systems: Embedded Computer Architecture
  • HPC Virtualization with Xen on Itanium
  • Evaluation of CPU Architecture by Simulation Technologies and Benchmark Computer Systems
  • Simics-On-Simics
  • Proceedings of the BSDCon 2002 Conference