Architectural Frameworks for Automated Design and Optimization of Hardware Accelerators

ARCHITECTURAL FRAMEWORKS FOR AUTOMATED DESIGN AND OPTIMIZATION OF HARDWARE ACCELERATORS

A Dissertation Presented to the Faculty of the Graduate School of Cornell University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by Tao Chen
May 2018

© 2018 Tao Chen
ALL RIGHTS RESERVED

Tao Chen, Ph.D.
Cornell University 2018

As technology scaling slows down and only provides diminishing improvements in general-purpose processor performance, computing systems are increasingly relying on customized accelerators to meet the performance and energy efficiency requirements of emerging applications. For example, today’s mobile SoCs rely on accelerators to perform compute-intensive tasks, and datacenters are starting to deploy accelerators for applications such as web search and machine learning. This trend is expected to continue and future systems will contain more specialized accelerators. However, the traditional hardware-oriented accelerator design methodology is costly and inefficient because it requires significant manual effort in the design process. This development model is unsustainable in the future where a wide variety of accelerators are expected to be designed for a large number of applications. To solve this problem, the development cost of accelerators must be drastically reduced, which calls for more productive design methodologies that can create high-quality accelerators with low manual effort.

This thesis addresses the above challenge with architectural frameworks that combine novel accelerator architectures with automated design and optimization frameworks to enable designing high-performance and energy-efficient accelerators with minimal manual effort.
Specifically, the first part of the thesis proposes a framework for automatically generating accelerators that can effectively tolerate long, variable memory latencies, which improves performance and reduces design effort by removing the need to manually create data preloading logic. The framework leverages architecture mechanisms such as memory prefetching and access/execute decoupling, as well as automated compiler analysis to generate accelerators that can intelligently preload data needed in the future from the main memory.

The second part of the thesis proposes a framework for building parallel accelerators that leverage concepts from task-based parallel programming, which enables software programmers to quickly create high-performance accelerators using familiar parallel programming paradigms, without requiring low-level hardware design knowledge. The framework uses a computation model that supports dynamic parallelism in addition to static parallelism, and includes a flexible architecture that supports dynamic scheduling to enable mapping a wide range of parallel applications to hardware accelerators and achieve good performance. In addition, we designed a unified language that can be mapped to both software and hardware, enabling programmers to create parallel software and parallel accelerators in a unified framework.

The third part of the thesis proposes a framework that enables accelerators to perform intelligent dynamic voltage and frequency scaling (DVFS) to achieve good energy efficiency for interactive and real-time applications. The framework combines program analysis and machine learning to train predictors that can accurately predict the computation time needed for each job, and adjust the DVFS levels to reduce the energy consumption.

BIOGRAPHICAL SKETCH

Tao Chen attended Fudan University from the year 2008 to 2012, where he received his Bachelor of Science degree (with distinction) in Microelectronics.
After graduation from Fudan University, he began pursuing his Ph.D. degree in the School of Electrical and Computer Engineering at Cornell University, where he worked with his advisor, Professor G. Edward Suh, on topics in the field of computer architecture, with a focus on hardware accelerators.

This dissertation is dedicated to my parents.

ACKNOWLEDGEMENTS

Six years ago, I arrived at Cornell to pursue my Ph.D. degree. At that time, I was a young student who was nervous about the challenges ahead, and was uncertain if I could make it to the end. Six years later, I have successfully completed this dissertation and become a doctor. I am extremely grateful to have so many people help me along this exciting and rewarding journey.

First and foremost, I would like to express my sincerest gratitude to my advisor, Professor G. Edward Suh. Throughout my Ph.D. study, Ed has supported me without reservations and provided valuable guidance, advice, encouragement, and help whenever I needed them. Ed gave me the freedom to pursue research directions that I am passionate about, while at the same time providing the necessary guidance so that I could stay on the right path. Ed is always ready to offer his generous help, whether it is about brainstorming ideas, revising a paper, or perfecting a conference talk. Ed is also always encouraging when I face difficulties, which helped me stay optimistic and motivated through the challenging journey of working towards a Ph.D. degree. I am deeply grateful to him.

I would like to thank my committee members, Professor David H. Albonesi and Professor Zhiru Zhang. Dave is a role model to me as a great computer architect who is passionate about research and teaching. Dave’s course on memory systems is one of the most exciting classes that I took, and inspired me to pursue the research on memory optimizations for accelerators.
Zhiru’s vision and his pioneering work on high-level synthesis are a major source of inspiration for my research. He also provided many helpful suggestions and comments that greatly improved my work.

I would like to thank Professor Christopher Batten for his guidance and support, and for generously sharing the research infrastructure that his group developed. Chris also mentored me on the parallel accelerator project and provided many insightful suggestions and advice. I am sincerely grateful to him.

Special thanks to my friends and colleagues at CSL who helped me tremendously both with my research and with navigating graduate school. I would like to thank members of the Suh Research Group. I want to thank Daniel Lo for providing many helpful comments and insights that greatly helped my research. I would like to thank Ruirui (Raymond) Huang and Wing-kei (KK) Yu for sharing their experiences as senior Ph.D. students. Special thanks to Yao Wang for providing great suggestions and directions throughout my Ph.D. journey. I would also like to thank Andrew Ferraiuolo, Mohamed Ismail, Benjamin Wu, Weizhe (Will) Hua, and Mulong Luo for their support and friendship, which made my life as a Ph.D. student a lot more enjoyable. Special thanks to Shreesha Srinath for being both a mentor and a good friend. I enjoyed discussing and debating research ideas with him, and also benefited from his suggestions and guidance as a senior student. I would also like to thank Xiaodong Wang, Gai Liu, Steve Dai, and all other CSL students, whom I am fortunate to be friends with. I am proud to be a member of this brilliant community.

I would like to thank my girlfriend Lin, for being caring and supportive of my life and research. Her encouragement helped me push forward in times of difficulties, and her warmth made me feel delighted every day.

Finally, I would like to express my deepest gratitude to my parents, Xin Chen and Meihua Liu, for their unconditional love and support.
They taught me to be persistent and optimistic, and that hard work pays off, which got me this far in my academic endeavor. They encouraged me to think independently, and supported me no matter what decisions I have made in my life. I am proud to have them as my parents, and I hope I have made them proud of me too.

TABLE OF CONTENTS

Biographical Sketch
Dedication
Acknowledgements
Table of Contents
List of Tables
List of Figures

1 Introduction
  1.1 Background
  1.2 Design Complexity of Accelerators
  1.3 Thesis Contributions and Organization

2 Memory Optimization Framework for Efficient Data Supply
  2.1 Introduction
  2.2 Overview
    2.2.1 System Architecture
    2.2.2 High-Level Synthesis
    2.2.3 Impact of Memory Accesses on Accelerator Performance
    2.2.4 Data Preloading Framework
  2.3 Prefetching
  2.4 Decoupled Access/Execute
    2.4.1 Access Unit
    2.4.2 Memory Units
    2.4.3 Execute Unit
    2.4.4 Deadlock Avoidance
    2.4.5 Customization of Memory Units
    2.4.6 Automated DAE Accelerator Generation
  2.5 Evaluation
    2.5.1 Methodology
    2.5.2 Experimental Setup
    2.5.3 Baseline Validation
    2.5.4 Performance Results
    2.5.5 Area, Power, and Energy Results
    2.5.6 Design Space Exploration: Queue Size

3 Parallel Accelerator Framework
  3.1 Introduction
  3.2 Computation Model for Dynamic Parallelism
    3.2.1 Primitives
    3.2.2 Continuation Passing
    3.2.3 Scheduling the Computation
    3.2.4 Function Calls
  3.3 Accelerator Architecture
    3.3.1 FlexArch Tile and PE Architecture
    3.3.2 LiteArch Tile and PE Architecture
    3.3.3 Networks
    3.3.4 Memory Hierarchy
    3.3.5 CPU-Accelerator Interface
  3.4 Design Methodology and Framework
    3.4.1 Architectural Template
    3.4.2 Algorithm Description Format
    3.4.3 Accelerator RTL Generation
  3.5 Unified Framework for Parallel Accelerators and Software
    3.5.1 CPPWD-TBB Library
    3.5.2 Programmability
  3.6 Evaluation
    3.6.1 Benchmarks
    3.6.2 Design Effort Comparison