Teaching HPC Systems and Parallel Programming with Small Scale Clusters of Embedded Socs
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
A 1024-Core 70GFLOPS/W Floating Point Manycore Microprocessor
A 1024-core 70GFLOPS/W Floating Point Manycore Microprocessor Andreas Olofsson, Roman Trogan, Oleg Raikhman Adapteva, Lexington, MA The Past, Present, & Future of Computing SIMD MIMD PE PE PE PE MINI MINI MINI CPU CPU CPU PE PE PE PE MINI MINI MINI CPU CPU CPU PE PE PE PE MINI MINI MINI CPU CPU CPU MINI MINI MINI BIG BIG CPU CPU CPU CPU CPU BIG BIG BIG BIG CPU CPU CPU CPU PAST PRESENT FUTURE 2 Adapteva’s Manycore Architecture C/C++ Programmable Incredibly Scalable 70 GFLOPS/W 3 Routing Architecture 4 E64G400 Specifications (Jan-2012) • 64-Core Microprocessor • 100 GFLOPS performance • 800 MHz Operation • 8GB/sec IO bandwidth • 1.6 TB/sec on chip memory BW • 0.8 TB/sec network on chip BW • 64 Billion Messages/sec IO Pads Core • 2 Watt total chip power • 2MB on chip memory Link Logic • 10 mm2 total silicon area • 324 ball 15x15mm plastic BGA 5 Lab Measurements 80 Energy Efficiency 70 60 50 GFLOPS/W 40 30 20 10 0 0 200 400 600 800 1000 1200 MHz ENERGY EFFICIENCY ENERGY EFFICIENCY (28nm) 6 Epiphany Performance Scaling 16,384 G 4,096 F 1,024 L 256 O 64 4096 P 1024 S 16 256 64 4 16 1 # Cores On‐demand scaling from 0.25W to 64 Watt 7 Hold on...the title said 1024 cores! • We can build it any time! • Waiting for customer • LEGO approach to design • No global timinga paths • Guaranteed by design • Generate any array in 1 day • ~130 mm2 silicon area 1024 Cores 1Core 8 What about 64-bit Floating Point? Single Precision Double Precision 2 FLOPS/CYCLE 2 FLOPS/CYCLE 64KB SRAM 64KB SRAM 0.215mm^2 0.237mm^2 700MHz 600MHz 9 Epiphany Latency Specifications -
Comparison of 116 Open Spec, Hacker Friendly Single Board Computers -- June 2018
Comparison of 116 Open Spec, Hacker Friendly Single Board Computers -- June 2018 Click on the product names to get more product information. In most cases these links go to LinuxGizmos.com articles with detailed product descriptions plus market analysis. HDMI or DP- USB Product Price ($) Vendor Processor Cores 3D GPU MCU RAM Storage LAN Wireless out ports Expansion OSes 86Duino Zero / Zero Plus 39, 54 DMP Vortex86EX 1x x86 @ 300MHz no no2 128MB no3 Fast no4 no5 1 headers Linux Opt. 4GB eMMC; A20-OLinuXino-Lime2 53 or 65 Olimex Allwinner A20 2x A7 @ 1GHz Mali-400 no 1GB Fast no yes 3 other Linux, Android SATA A20-OLinuXino-Micro 65 or 77 Olimex Allwinner A20 2x A7 @ 1GHz Mali-400 no 1GB opt. 4GB NAND Fast no yes 3 other Linux, Android Debian Linux A33-OLinuXino 42 or 52 Olimex Allwinner A33 4x A7 @ 1.2GHz Mali-400 no 1GB opt. 4GB NAND no no no 1 dual 40-pin 3.4.39, Android 4.4 4GB (opt. 16GB A64-OLinuXino 47 to 88 Olimex Allwinner A64 4x A53 @ 1.2GHz Mali-400 MP2 no 1GB GbE WiFi, BT yes 1 40-pin custom Linux eMMC) Banana Pi BPI-M2 Berry 36 SinoVoip Allwinner V40 4x A7 Mali-400 MP2 no 1GB SATA GbE WiFi, BT yes 4 Pi 40 Linux, Android 8GB eMMC (opt. up Banana Pi BPI-M2 Magic 21 SinoVoip Allwinner A33 4x A7 Mali-400 MP2 no 512MB no Wifi, BT no 2 Pi 40 Linux, Android to 64GB) 8GB to 64GB eMMC; Banana Pi BPI-M2 Ultra 56 SinoVoip Allwinner R40 4x A7 Mali-400 MP2 no 2GB GbE WiFi, BT yes 4 Pi 40 Linux, Android SATA Banana Pi BPI-M2 Zero 21 SinoVoip Allwinner H2+ 4x A7 @ 1.2GHz Mali-400 MP2 no 512MB no no WiFi, BT yes 1 Pi 40 Linux, Android Banana -
What We Know About Testing Embedded Software
What we know about testing embedded software Vahid Garousi, Hacettepe University and University of Luxembourg Michael Felderer, University of Innsbruck Çağrı Murat Karapıçak, KUASOFT A.Ş. Uğur Yılmaz, ASELSAN A.Ş. Abstract. Embedded systems have overwhelming penetration around the world. Innovations are increasingly triggered by software embedded in automotive, transportation, medical-equipment, communication, energy, and many other types of systems. To test embedded software in a cost effective manner, a large number of test techniques, approaches, tools and frameworks have been proposed by both practitioners and researchers in the last several decades. However, reviewing and getting an overview of the entire state-of-the- art and the –practice in this area is challenging for a practitioner or a (new) researcher. Also unfortunately, we often see that some companies reinvent the wheel (by designing a test approach new to them, but existing in the domain) due to not having an adequate overview of what already exists in this area. To address the above need, we conducted a systematic literature review (SLR) in the form of a systematic mapping (classification) in this area. After compiling an initial pool of 560 papers, a systematic voting was conducted among the authors, and our final pool included 272 technical papers. The review covers the types of testing topics studied, types of testing activity, types of test artifacts generated (e.g., test inputs or test code), and the types of industries in which studies have focused on, e.g., automotive and home appliances. Our article aims to benefit the readers (both practitioners and researchers) by serving as an “index” to the vast body of knowledge in this important and fast-growing area. -
Survey and Benchmarking of Machine Learning Accelerators
1 Survey and Benchmarking of Machine Learning Accelerators Albert Reuther, Peter Michaleas, Michael Jones, Vijay Gadepally, Siddharth Samsi, and Jeremy Kepner MIT Lincoln Laboratory Supercomputing Center Lexington, MA, USA freuther,pmichaleas,michael.jones,vijayg,sid,[email protected] Abstract—Advances in multicore processors and accelerators components play a major role in the success or failure of an have opened the flood gates to greater exploration and application AI system. of machine learning techniques to a variety of applications. These advances, along with breakdowns of several trends including Moore’s Law, have prompted an explosion of processors and accelerators that promise even greater computational and ma- chine learning capabilities. These processors and accelerators are coming in many forms, from CPUs and GPUs to ASICs, FPGAs, and dataflow accelerators. This paper surveys the current state of these processors and accelerators that have been publicly announced with performance and power consumption numbers. The performance and power values are plotted on a scatter graph and a number of dimensions and observations from the trends on this plot are discussed and analyzed. For instance, there are interesting trends in the plot regarding power consumption, numerical precision, and inference versus training. We then select and benchmark two commercially- available low size, weight, and power (SWaP) accelerators as these processors are the most interesting for embedded and Fig. 1. Canonical AI architecture consists of sensors, data conditioning, mobile machine learning inference applications that are most algorithms, modern computing, robust AI, human-machine teaming, and users (missions). Each step is critical in developing end-to-end AI applications and applicable to the DoD and other SWaP constrained users. -
Embedded Sensors System Applied to Wearable Motion Analysis in Sports
Embedded Sensors System Applied to Wearable Motion Analysis in Sports Aurélien Valade1, Antony Costes2, Anthony Bouillod1,3, Morgane Mangin4, P. Acco1, Georges Soto-Romero1,4, Jean-Yves Fourniols1 and Frederic Grappe3 1LAAS-CNRS, N2IS, 7, Av. du Colonel Roche 31077, Toulouse, France 2University of Toulouse, UPS, PRiSSMH, Toulouse, France 3EA4660, C3S - Université de Franche Comté, 25000 Besançon, France 4 ISIFC – Génie Biomédical - Université de Franche Comté, 23 Rue Alain Savary, 25000 Besançon, France Keywords: IMU, FPGA, Motion Analysis, Sports, Wearable. Abstract: This paper presents two different wearable motion capture systems for motion analysis in sports, based on inertial measurement units (IMU). One system, called centralized processing, is based on FPGA + microcontroller architecture while the other, called distributed processing, is based on multiple microcontrollers + wireless communication architecture. These architectures are designed to target multi- sports capabilities, beginning with tri-athlete equipment and thus have to be non-invasive and integrated in sportswear, be waterproofed and autonomous in energy. To characterize them, the systems are compared to lab quality references. 1 INTRODUCTION electronics (smartphones, game controllers...), in an autonomous embedded system to monitor the Electronics in sports monitoring has been a growing sportive activity, even in field conditions. field of studies for the last decade. From the heart rate Our IMU based monitoring system allows monitors to the power meters, sportsmen -
SPORK: a Summarization Pipeline for Online Repositories of Knowledge
SPORK: A SUMMARIZATION PIPELINE FOR ONLINE REPOSITORIES OF KNOWLEDGE A Thesis presented to the Faculty of California Polytechnic State University San Luis Obispo In Partial Fulfillment of the Requirements for the Degree Master of Science in Computer Science by Steffen Lyngbaek June 2013 c 2013 Steffen Lyngbaek ALL RIGHTS RESERVED ii COMMITTEE MEMBERSHIP TITLE: SPORK: A Summarization Pipeline for Online Repositories of Knowledge AUTHOR: Steffen Lyngbaek DATE SUBMITTED: June 2013 COMMITTEE CHAIR: Professor Alexander Dekhtyar, Ph.D., De- parment of Computer Science COMMITTEE MEMBER: Professor Franz Kurfess, Ph.D., Depar- ment of Computer Science COMMITTEE MEMBER: Professor Foaad Khosmood, Ph.D., Depar- ment of Computer Science iii Abstract SPORK: A Summarization Pipeline for Online Repositories of Knowledge Steffen Lyngbaek The web 2.0 era has ushered an unprecedented amount of interactivity on the Internet resulting in a flood of user-generated content. This content is of- ten unstructured and comes in the form of blog posts and comment discussions. Users can no longer keep up with the amount of content available, which causes developers to start relying on natural language techniques to help mitigate the problem. Although many natural language processing techniques have been em- ployed for years, automatic text summarization, in particular, has recently gained traction. This research proposes a graph-based, extractive text summarization system called SPORK (Summarization Pipeline for Online Repositories of Knowl- edge). The goal of SPORK is to be able to identify important key topics presented in multi-document texts, such as online comment threads. While most other automatic summarization systems simply focus on finding the top sentences rep- resented in the text, SPORK separates the text into clusters, and identifies dif- ferent topics and opinions presented in the text. -
Master's Thesis: Adaptive Core Assignment for Adapteva Epiphany
Adaptive core assignment for Adapteva Epiphany Master of Science Thesis in Embedded Electronic System Design Erik Alveflo Chalmers University of Technology Department of Computer Science and Engineering G¨oteborg, Sweden 2015 The Author grants to Chalmers University of Technology and University of Gothenburg the non-exclusive right to publish the Work electronically and in a non-commercial pur- pose make it accessible on the Internet. The Author warrants that he/she is the author to the Work, and warrants that the Work does not contain text, pictures or other mate- rial that violates copyright law. The Author shall, when transferring the rights of the Work to a third party (for example a publisher or a company), acknowledge the third party about this agreement. If the Author has signed a copyright agreement with a third party regarding the Work, the Author warrants hereby that he/she has obtained any necessary permission from this third party to let Chalmers University of Technology and University of Gothenburg store the Work electronically and make it accessible on the Internet. Adaptive core assignment for Adapteva Epiphany Erik Alveflo, c Erik Alveflo, 2015. Examiner: Per Larsson-Edefors Chalmers University of Technology Department of Computer Science and Engineering SE-412 96 G¨oteborg Sweden Telephone + 46 (0)31-772 1000 Department of Computer Science and Engineering G¨oteborg, Sweden 2015 Adaptive core assignment for Adapteva Epiphany Erik Alveflo Department of Computer Science and Engineering Chalmers University of Technology Abstract The number of cores in many-core processors is ever increasing, and so is the number of defects due to manufacturing variations and wear-out mechanisms. -
Embedded Systems Supporting by Different Operating Systems
A Survey: Embedded Systems Supporting By Different Operating Systems Qamar Jabeen, Fazlullah Khan, Muhammad Tahir, Shahzad Khan, Syed Roohullah Jan Department of Computer Science, Abdul Wali Khan University Mardan [email protected], [email protected] -------------------------------------------------------------------------------------------------------------------------------------- Abstract: In these days embedded systems used in industrial, commercial system have an important role in different areas. e.g Mobile Phones and different Fields and applications like Network type of Network Bridges are embedded embedded system , Real-time embedded used by telecommunication systems for systems which supports the mission- giving better requirements to their users. critical domains, mostly having the time We use digital cameras, MP3 players, DVD constraints, Stand-alone systems which players are the example of embedded includes the network router etc. A great consumer electronics. In our daily life its deployment in the processors made for provided us efficiency and flexibility and completing the demanding needs of the many features which includes microwave users. There is also a large-scale oven, washing machines dishwashers. deployment occurs in sensor networks for Embedded system are also used in providing the advance facilities, for medical, transportation and also used in handled such type of embedded systems wireless sensor network area respectively a specific operating system must provide. medical imaging, vital signs, automobile This paper presents some software electric vehicles and Wi-Fi modules. infrastructures that have the ability of supporting such types of embedded systems. 1. Introduction: Embedded system are computer systems designed for specific purpose, to increase functionality and reliability for achieving a specific task, like general Figure 1: Taxonomy of Embedded Software’s purpose computer system it does not use for multiple tasks. -
New Approach for Testing and Providing Security Mechanism for Embedded Systems
Available online at www.sciencedirect.com ScienceDirect Procedia Computer Science 78 ( 2016 ) 851 – 858 International Conference on Information Security & Privacy (ICISP2015), 11-12 December 2015, Nagpur, INDIA New Approach for Testing and providing Security Mechanism for Embedded Systems Swapnili P. Karmorea, Anjali R. Mahajanb Research Scholar, Department of Computer Science and enginnering, G. H. Raisoni College of Engineering, Nagpur, India. Head, Department of Information Technology, Government Polytechnic, Nagpur, India. Abstract Assuring the quality of embedded system is posing a big challenge for software testers around the globe. Testing of one embedded system widely differs from another. The approach presented in the paper is used for the testing of safety critical feature of embedded system. The input and outputs are trained validated and tested via ANN. Security is provided via skipping invalid and critical classes and by embedding secret key in the Ram of embedded device. The work contributes simulating environment where cost and time required for testing of embedded systems will be minimized, which removes drawback of traditional approaches. Keywords: Embedded system testing; Black box testing; Safety critical embedded systems; Artificial neural network. 1. Introduction Testing is the most common process used to determine the quality and providing security for embedded systems. In the embedded world, testing is an immense challenge. In the test plan, the distinguished characteristics of embedded systems must be reflected as these are application specific. They give embedded system exclusive and distinct flavor. Real-time systems have to meet the challenge of assuring the correct implementation of an application not only dependent upon its logical accuracy, but also its ability to meet the constraints of timing. -
Program & Exhibits Guide
FROM CHIPS TO SYSTEMS – LEARN TODAY, CREATE TOMORROW CONFERENCE PROGRAM & EXHIBITS GUIDE JUNE 24-28, 2018 | SAN FRANCISCO, CA | MOSCONE CENTER WEST Mark You Calendar! DAC IS IN LAS VEGAS IN 2019! MACHINE IP LEARNING ESS & AUTO DESIGN SECURITY EDA IoT FROM CHIPS TO SYSTEMS – LEARN TODAY, CREATE TOMORROW JUNE 2-6, 2019 LAS VEGAS CONVENTION CENTER LAS VEGAS, NV DAC.COM DAC.COM #55DAC GET THE DAC APP! Fusion Technology Transforms DOWNLOAD FOR FREE! the RTL-to-GDSII Flow GET THE LATEST INFORMATION • Fusion of Best-in-Class Optimization and Industry-golden Signoff Tools RIGHT WHEN YOU NEED IT. • Unique Fusion Data Model for Both Logical and Physical Representation DAC.COM • Best Full-flow Quality-of-Results and Fastest Time-to-Results MONDAY SPECIAL EVENT: RTL-to-GDSII Fusion Technology • Search the Lunch at the Marriott Technical Program • Find Exhibitors www.synopsys.com/fusion • Create Your Personalized Schedule Visit DAC.com for more details and to download the FREE app! GENERAL CHAIR’S WELCOME Dear Colleagues, be able to visit over 175 exhibitors and our popular DAC Welcome to the 55th Design Automation Pavilion. #55DAC’s exhibition halls bring attendees several Conference! new areas/activities: It is great to have you join us in San • Design Infrastructure Alley is for professionals Francisco, one of the most beautiful who manage the HW and SW products and services cities in the world and now an information required by design teams. It houses a dedicated technology capital (it’s also the city that Design-on-Cloud Pavilion featuring presentations my son is named after). -
What Is Embedded Computing?
EMBEDDED COMPUTING more stringent cost and power con- straints. PDA design requires careful What Is Embedded attention to both hardware and soft- ware. In the next decade, some micro- processors, largely invisible to users, Computing? will be used for signal processing and control—for example, to enable home Wayne Wolf, Princeton University networking across noisy, low-quality media such as power lines. Others will be used to create advanced user inter- or those who think a lot about embedded computing, as well as Evolving from a craft to an the uninitiated, it’s important to engineering discipline over F define exactly what the term means. In brief, an embedded the past decade, embedded system is any computer that is a com- ponent in a larger system and that relies computing continues to on its own microprocessor. mature. But is embedded computing a field or just a fad? The purpose of this new bimonthly column is to give researchers difficult to implement in mechanical faces—for example, for the entire clus- as well as practitioners an opportunity controllers. ter of home entertainment devices. to demonstrate that embedded com- Laser and inkjet printers also Microprocessors have also enabled puting is an established engineering dis- emerged in the 1980s. Print engines new categories of portable devices cipline with principles and knowledge require computational support for that will assume roles and perform at its core. both typesetting and real-time control. functions yet to be determined. The First, users generate characters and cell phone and PDA combinations A LONG HISTORY lines that a computation must convert that hit the market in 2001 are a People have been building embedded into pixels. -
Embedded Hardware
Lecture #2 Embedded Hardware 18-348 Embedded System Engineering Philip Koopman Friday, 15-Jan-2016 Electrical& Computer ENGINEERING © Copyright 2006-2016, Philip Koopman, All Rights Reserved Announcements Many posted materials are accessible only from a CMU IP Address • Look for this on course web page: If you can't access a file due to access restrictions, you need to get a campus IP address for your web browsing requests. Use Cisco VPN Anyconnect... Course web page has schedules, assignments, other important info • http://www.ece.cmu.ecu/~ece348 • Blackboard will have grades, announcements, sample tests • Look at blackboard announcements before sending e-mail to course staff Lab board handouts in progress • See Blackboard/admin page for TA office hours • OK to go to any scheduled lab section (but, give priority to scheduled students) • For Friday prelab give a good faith attempt to get things working by the deadline – If you hit a showstopper get it fixed on Tuesday so you can do Prelab 2 on time. 2 Design Example: Rack-Mount Power Supply Power supply for server • AC to DC conversion (750-1000W) • Coordinates 2 redundant supplies to maximize uptime • Safeguard against power problems (under/over-voltage; over-current; over-temp) Typical approach: • General microcontroller for AC, alarms, housekeeping • DSP runs control loop on DC side at > 10 KHz to provide stable DC http://accessories.us.dell.com Dell 870W Power Supply for PowerEdge R710 Server power Key requirement: “Doesn’t emit smoke” 3 Apple USB Power Adapter Teardown iPhone